rdma-core-56.1 source tree (git commit 45b7e6deffadbd81e6c95bb783926c051a5c0fa8)

==> rdma-core-56.1/.clang-format <==
# SPDX-License-Identifier: GPL-2.0
#
# clang-format configuration file. Intended for clang-format >= 4.
#
# For more information, see:
#
#   Documentation/process/clang-format.rst
#   https://clang.llvm.org/docs/ClangFormat.html
#   https://clang.llvm.org/docs/ClangFormatStyleOptions.html
#
---
AccessModifierOffset: -4
AlignAfterOpenBracket: Align
AlignConsecutiveAssignments: false
AlignConsecutiveDeclarations: false
#AlignEscapedNewlines: Left # Unknown to clang-format-4.0
AlignOperands: true
AlignTrailingComments: false
AllowAllParametersOfDeclarationOnNextLine: false
AllowShortBlocksOnASingleLine: false
AllowShortCaseLabelsOnASingleLine: false
AllowShortFunctionsOnASingleLine: None
AllowShortIfStatementsOnASingleLine: false
AllowShortLoopsOnASingleLine: false
AlwaysBreakAfterDefinitionReturnType: None
AlwaysBreakAfterReturnType: None
AlwaysBreakBeforeMultilineStrings: false
AlwaysBreakTemplateDeclarations: false
BinPackArguments: true
BinPackParameters: true
BraceWrapping:
  AfterClass: false
  AfterControlStatement: false
  AfterEnum: false
  AfterFunction: true
  AfterNamespace: true
  AfterObjCDeclaration: false
  AfterStruct: false
  AfterUnion: false
  #AfterExternBlock: false # Unknown to clang-format-5.0
  BeforeCatch: false
  BeforeElse: false
  IndentBraces: false
  #SplitEmptyFunction: true # Unknown to clang-format-4.0
  #SplitEmptyRecord: true # Unknown to clang-format-4.0
  #SplitEmptyNamespace: true # Unknown to clang-format-4.0
BreakBeforeBinaryOperators: None
BreakBeforeBraces: Custom
#BreakBeforeInheritanceComma: false # Unknown to clang-format-4.0
BreakBeforeTernaryOperators: false
BreakConstructorInitializersBeforeComma: false
#BreakConstructorInitializers: BeforeComma # Unknown to clang-format-4.0
BreakAfterJavaFieldAnnotations: false
BreakStringLiterals: false
ColumnLimit: 80
CommentPragmas: '^ IWYU pragma:'
#CompactNamespaces: false # Unknown to clang-format-4.0
ConstructorInitializerAllOnOneLineOrOnePerLine: false
ConstructorInitializerIndentWidth: 8
ContinuationIndentWidth: 8
Cpp11BracedListStyle: false
DerivePointerAlignment: false
DisableFormat: false
ExperimentalAutoDetectBinPacking: false
#FixNamespaceComments: false # Unknown to clang-format-4.0

# Taken from:
#   grep -Rh '^#define [^[:space:]]*for_each[^[:space:]]*(' build/include/ \
#   | sed "s,^#define \([^[:space:]]*for_each[^[:space:]]*\)(.*$,  - '\1'," \
#   | sort | uniq
ForEachMacros:
  - 'list_for_each'
  - 'list_for_each_off'
  - 'list_for_each_off_dir_'
  - 'list_for_each_rev'
  - 'list_for_each_rev_off'
  - 'list_for_each_rev_safe'
  - 'list_for_each_rev_safe_off'
  - 'list_for_each_safe'
  - 'list_for_each_safe_off'
  - 'list_for_each_safe_off_dir_'

#IncludeBlocks: Preserve # Unknown to clang-format-5.0
IncludeCategories:
  - Regex: '.*'
    Priority: 1
IncludeIsMainRegex: '(Test)?$'
IndentCaseLabels: false
#IndentPPDirectives: None # Unknown to clang-format-5.0
IndentWidth: 8
IndentWrappedFunctionNames: false
JavaScriptQuotes: Leave
JavaScriptWrapImports: true
KeepEmptyLinesAtTheStartOfBlocks: false
MacroBlockBegin: ''
MacroBlockEnd: ''
MaxEmptyLinesToKeep: 1
NamespaceIndentation: Inner
#ObjCBinPackProtocolList: Auto # Unknown to clang-format-5.0
ObjCBlockIndentWidth: 8
ObjCSpaceAfterProperty: true
ObjCSpaceBeforeProtocolList: true

# Taken from git's rules
#PenaltyBreakAssignment: 10 # Unknown to clang-format-4.0
PenaltyBreakBeforeFirstCallParameter: 30
PenaltyBreakComment: 10
PenaltyBreakFirstLessLess: 0
PenaltyBreakString: 10
PenaltyExcessCharacter: 100
PenaltyReturnTypeOnItsOwnLine: 60

PointerAlignment: Right
ReflowComments: false
SortIncludes: false
#SortUsingDeclarations: false # Unknown to clang-format-4.0
SpaceAfterCStyleCast: false
SpaceAfterTemplateKeyword: true
SpaceBeforeAssignmentOperators: true
#SpaceBeforeCtorInitializerColon: true # Unknown to clang-format-5.0
#SpaceBeforeInheritanceColon: true # Unknown to clang-format-5.0
SpaceBeforeParens: ControlStatements
#SpaceBeforeRangeBasedForLoopColon: true # Unknown to clang-format-5.0
SpaceInEmptyParentheses: false
SpacesBeforeTrailingComments: 1
SpacesInAngles: false
SpacesInContainerLiterals: false
SpacesInCStyleCastParentheses: false
SpacesInParentheses: false
SpacesInSquareBrackets: false
Standard: Cpp03
TabWidth: 8
UseTab: Always
...

==> rdma-core-56.1/.gitignore <==
# -*- mode: gitignore; -*-

# CMake
cmake_install.cmake
CMakeFiles
CMakeCache.txt
lib*.a
/bin/**
/lib/**
/include/**
/.ninja*
*.ninja
Makefile

# Tags
TAGS
.TAGS
!TAGS/
tags
.tags
!tags/
gtags.files
GTAGS
GRTAGS
GPATH

# clangd
.cache/

# cscope
cscope.files
cscope.out
cscope.in.out
cscope.po.out

# Emacs
*~
\#*\#
/.emacs.desktop
/.emacs.desktop.lock
*.elc
auto-save-list
tramp
.\#*

# Org-mode
.org-id-locations
*_archive

# flymake-mode
*_flymake.*

# eshell files
/eshell/history
/eshell/lastdir

# elpa packages
/elpa/

# reftex files
*.rel

# AUCTeX auto folder
/auto/

# cask packages
.cask/

# vim
[._]*.s[a-w][a-z]
[._]s[a-w][a-z]
*.un~
Session.vim
.netrwhist
*~

# python
*.pyc

==> rdma-core-56.1/.mailmap <==
#
# This list is used by git-shortlog to fix a few botched name translations
# in the git archive, either because the author's full name was messed up
# and/or not always written the same way, making contributions from the
# same person appearing not to be so or badly displayed.
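#
# Each line maps an identity as recorded in a commit to the canonical
# one; git documents the full form of an entry as:
#
#   Proper Name <proper@email> Commit Name <commit@email>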
#

Jason Gunthorpe Jason Gunthorpe
Leon Romanovsky Leon Romanovsky
Steve Wise Steve Wise

==> rdma-core-56.1/ABI/.gitignore <==

==> rdma-core-56.1/ABI/efa.dump <==
$VAR1 = { 'ABI_DUMPER_VERSION' => '1.2', 'ABI_DUMP_VERSION' => '3.5', 'Arch' => 'x86_64', 'GccVersion' => '12.3.0', 'Headers' => {}, 'Language' => 'C', 'LibraryName' => 'libefa.so.1.3.56.0', 'LibraryVersion' => 'efa', 'MissedOffsets' => '1', 'MissedRegs' => '1', 'NameSpaces' => {}, 'Needed' => { 'libc.so.6' => 1, 'libibverbs.so.1' => 1 }, 'Sources' => {}, 'SymbolInfo' => { '51938' => { 'Header' => undef, 'Line' => '2485', 'Param' => { '0' => { 'name' => 'ibvah', 'type' => '10273' }, '1' => { 'name' => 'attr', 'type' => '52442' }, '2' => { 'name' => 'inlen', 'type' => '2222' } }, 'Return' => '152', 'ShortName' => 'efadv_query_ah' }, '67000' => { 'Header' => undef, 'Line' => '1630', 'Param' => { '0' => { 'name' => 'ibvctx', 'type' => '3883' }, '1' => { 'name' => 'attr_ex', 'type' => '16328' }, '2' => { 'name' => 'efa_attr', 'type' => '67792' }, '3' => { 'name' => 'inlen', 'type' => '2222' } }, 'Return' => '6238', 'ShortName' => 'efadv_create_qp_ex' }, '67818' => { 'Header' => undef, 'Line' => '1597', 'Param' => { '0' => { 'name' => 'ibvpd', 'type' => '7613' }, '1' => { 'name' => 'attr', 'type' => '19201' }, '2' => { 'name' => 'driver_qp_type', 'type' => '2222' } }, 'Return' => '6238', 'ShortName' => 'efadv_create_driver_qp' }, '75400' => { 'Header' => undef, 'Line' => '1049', 'Param' => { '0' => { 'name' => 'ibvcqx', 'type' => '12599' } }, 'Return' => '48586', 'ShortName' => 'efadv_cq_from_ibv_cq_ex' }, '75468' => { 'Header' => undef, 'Line' => '1012', 'Param' => { '0' => { 'name' => 'ibvctx', 'type' => '3883' }, '1' => { 'name' => 'attr_ex', 'type' => '16157' }, '2' => { 'name' => 'efa_attr', 'type' => '76240' }, '3' => { 'name' => 'inlen', 'type' => '2222' } }, 'Return' => '12599', 'ShortName' => 'efadv_create_cq' }, '83866' => { 'Header' => undef, 'Line' => '285', 'Param' => { '0' => { 'name' => 'ibvmr', 'type' => '7428' }, '1' => { 'name' => 'attr', 'type' => '85750' }, '2' => { 'name' => 'inlen', 'type' => '2222' } }, 'Return' => '152', 'ShortName' => 'efadv_query_mr' }, '87250' => { 'Header' => undef, 'Line' => '145', 'Param' => { '0' => { 'name' => 'ibvctx', 'type' => '3883' }, '1' => { 'name' => 'attr', 'type' => '87774' }, '2' => { 'name' => 'inlen', 'type' => '2222' } }, 'Return' => '152', 'ShortName' => 'efadv_query_device' } }, 'SymbolVersion' => { 'efadv_cq_from_ibv_cq_ex' => 'efadv_cq_from_ibv_cq_ex@@EFA_1.2', 'efadv_create_cq' => 'efadv_create_cq@@EFA_1.2', 'efadv_create_driver_qp' => 'efadv_create_driver_qp@@EFA_1.0', 'efadv_create_qp_ex' => 'efadv_create_qp_ex@@EFA_1.1', 'efadv_query_ah' => 'efadv_query_ah@@EFA_1.1', 'efadv_query_device' => 'efadv_query_device@@EFA_1.1', 'efadv_query_mr' => 'efadv_query_mr@@EFA_1.3' }, 'Symbols' => { 'libefa.so.1.3.56.0' => { 'efadv_cq_from_ibv_cq_ex@@EFA_1.2' => 1, 'efadv_create_cq@@EFA_1.2' => 1, 'efadv_create_driver_qp@@EFA_1.0' => 1, 'efadv_create_qp_ex@@EFA_1.1' => 1, 'efadv_query_ah@@EFA_1.1' => 1, 'efadv_query_device@@EFA_1.1' => 1, 'efadv_query_mr@@EFA_1.3' => 1 } }, 'Target' => 'unix', 'TypeInfo' => { '1' => { 'Name' => 'void', 'Type' => 'Intrinsic' }, '10032' => { 'Header' => undef, 'Line' => '1161', 'Memb' => { '0' => { 'name' =>
'imm_data', 'offset' => '0', 'type' => '2378' }, '1' => { 'name' => 'invalidate_rkey', 'offset' => '0', 'type' => '2222' } }, 'Size' => '4', 'Type' => 'Union' }, '10065' => { 'Header' => undef, 'Line' => '1166', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '2222' } }, 'Size' => '16', 'Type' => 'Struct' }, '10103' => { 'Header' => undef, 'Line' => '1170', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'compare_add', 'offset' => '8', 'type' => '2234' }, '2' => { 'name' => 'swap', 'offset' => '22', 'type' => '2234' }, '3' => { 'name' => 'rkey', 'offset' => '36', 'type' => '2222' } }, 'Size' => '32', 'Type' => 'Struct' }, '10169' => { 'Header' => undef, 'Line' => '1176', 'Memb' => { '0' => { 'name' => 'ah', 'offset' => '0', 'type' => '10273' }, '1' => { 'name' => 'remote_qpn', 'offset' => '8', 'type' => '2222' }, '2' => { 'name' => 'remote_qkey', 'offset' => '18', 'type' => '2222' } }, 'Size' => '16', 'Type' => 'Struct' }, '10219' => { 'Header' => undef, 'Line' => '1695', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3883' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '7613' }, '2' => { 'name' => 'handle', 'offset' => '22', 'type' => '2222' } }, 'Name' => 'struct ibv_ah', 'Size' => '24', 'Type' => 'Struct' }, '10273' => { 'BaseType' => '10219', 'Name' => 'struct ibv_ah*', 'Size' => '8', 'Type' => 'Pointer' }, '10278' => { 'Header' => undef, 'Line' => '1165', 'Memb' => { '0' => { 'name' => 'rdma', 'offset' => '0', 'type' => '10065' }, '1' => { 'name' => 'atomic', 'offset' => '0', 'type' => '10103' }, '2' => { 'name' => 'ud', 'offset' => '0', 'type' => '10169' } }, 'Size' => '32', 'Type' => 'Union' }, '10322' => { 'Header' => undef, 'Line' => '1183', 'Memb' => { '0' => { 'name' => 'remote_srqn', 'offset' => '0', 'type' => '2222' } }, 'Size' => '4', 'Type' => 'Struct' }, '10346' => { 'Header' => undef, 'Line' => '1182', 'Memb' => { '0' => { 'name' => 'xrc', 'offset' => '0', 'type' => '10322' } }, 'Size' => '4', 'Type' => 'Union' }, '10367' => { 'Header' => undef, 'Line' => '1188', 'Memb' => { '0' => { 'name' => 'mw', 'offset' => '0', 'type' => '10417' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '2222' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '7245' } }, 'Size' => '48', 'Type' => 'Struct' }, '10417' => { 'BaseType' => '7647', 'Name' => 'struct ibv_mw*', 'Size' => '8', 'Type' => 'Pointer' }, '10422' => { 'Header' => undef, 'Line' => '1193', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '78' }, '1' => { 'name' => 'hdr_sz', 'offset' => '8', 'type' => '2210' }, '2' => { 'name' => 'mss', 'offset' => '16', 'type' => '2210' } }, 'Size' => '16', 'Type' => 'Struct' }, '10472' => { 'Header' => undef, 'Line' => '1187', 'Memb' => { '0' => { 'name' => 'bind_mw', 'offset' => '0', 'type' => '10367' }, '1' => { 'name' => 'tso', 'offset' => '0', 'type' => '10422' } }, 'Size' => '48', 'Type' => 'Union' }, '10505' => { 'Header' => undef, 'Line' => '1151', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '10641' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '10646' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '152' }, '4' => { 'name' => 'opcode', 'offset' => '40', 'type' => '9823' }, '5' => { 'name' => 'send_flags', 'offset' => '50', 'type' => '66' }, '6' => { 'name' => 
'unnamed0', 'offset' => '54', 'type' => '10032' }, '7' => { 'name' => 'wr', 'offset' => '64', 'type' => '10278' }, '8' => { 'name' => 'qp_type', 'offset' => '114', 'type' => '10346' }, '9' => { 'name' => 'unnamed1', 'offset' => '128', 'type' => '10472' } }, 'Name' => 'struct ibv_send_wr', 'Size' => '128', 'Type' => 'Struct' }, '10641' => { 'BaseType' => '10505', 'Name' => 'struct ibv_send_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '10646' => { 'BaseType' => '9971', 'Name' => 'struct ibv_sge*', 'Size' => '8', 'Type' => 'Pointer' }, '10651' => { 'Header' => undef, 'Line' => '1201', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '10721' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '10646' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '152' } }, 'Name' => 'struct ibv_recv_wr', 'Size' => '32', 'Type' => 'Struct' }, '10721' => { 'BaseType' => '10651', 'Name' => 'struct ibv_recv_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '10978' => { 'Header' => undef, 'Line' => '1237', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'send_flags', 'offset' => '8', 'type' => '66' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '7245' } }, 'Name' => 'struct ibv_mw_bind', 'Size' => '48', 'Type' => 'Struct' }, '11059' => { 'BaseType' => '10721', 'Name' => 'struct ibv_recv_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '116' => { 'BaseType' => '80', 'Header' => undef, 'Line' => '38', 'Name' => '__uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '12013' => { 'Header' => undef, 'Line' => '1502', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3883' }, '1' => { 'name' => 'fd', 'offset' => '8', 'type' => '152' }, '2' => { 'name' => 'refcnt', 'offset' => '18', 'type' => '152' } }, 'Name' => 'struct ibv_comp_channel', 'Size' => '16', 'Type' => 'Struct' }, '12067' => { 'BaseType' => '12013', 'Name' => 'struct ibv_comp_channel*', 'Size' => '8', 'Type' => 'Pointer' }, '12072' => { 'Header' => undef, 'Line' => '1521', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2222' } }, 'Name' => 'struct ibv_poll_cq_attr', 'Size' => '4', 'Type' => 'Struct' }, '12100' => { 'Header' => undef, 'Line' => '1525', 'Memb' => { '0' => { 'name' => 'tag', 'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'priv', 'offset' => '8', 'type' => '2222' } }, 'Name' => 'struct ibv_wc_tm_info', 'Size' => '16', 'Type' => 'Struct' }, '12141' => { 'Header' => undef, 'Line' => '1530', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3883' }, '1' => { 'name' => 'channel', 'offset' => '8', 'type' => '12067' }, '10' => { 'name' => 'status', 'offset' => '306', 'type' => '6597' }, '11' => { 'name' => 'wr_id', 'offset' => '310', 'type' => '2234' }, '12' => { 'name' => 'start_poll', 'offset' => '324', 'type' => '12609' }, '13' => { 'name' => 'next_poll', 'offset' => '338', 'type' => '12629' }, '14' => { 'name' => 'end_poll', 'offset' => '352', 'type' => '12645' }, '15' => { 'name' => 'read_opcode', 'offset' => '360', 'type' => '12665' }, '16' => { 'name' => 'read_vendor_err', 'offset' => '374', 'type' => '12685' }, '17' => { 'name' => 'read_byte_len', 'offset' => '388', 'type' => '12685' }, '18' => { 'name' => 'read_imm_data', 'offset' => '402', 'type' => '12705' }, '19' => { 'name' => 'read_qp_num', 'offset' => '512', 'type' => '12685' }, '2' => { 'name' => 'cq_context', 'offset' => '22', 'type' => '78' }, '20' => { 'name' 
=> 'read_src_qp', 'offset' => '520', 'type' => '12685' }, '21' => { 'name' => 'read_wc_flags', 'offset' => '534', 'type' => '12725' }, '22' => { 'name' => 'read_slid', 'offset' => '548', 'type' => '12685' }, '23' => { 'name' => 'read_sl', 'offset' => '562', 'type' => '12745' }, '24' => { 'name' => 'read_dlid_path_bits', 'offset' => '576', 'type' => '12745' }, '25' => { 'name' => 'read_completion_ts', 'offset' => '584', 'type' => '12765' }, '26' => { 'name' => 'read_cvlan', 'offset' => '598', 'type' => '12785' }, '27' => { 'name' => 'read_flow_tag', 'offset' => '612', 'type' => '12685' }, '28' => { 'name' => 'read_tm_info', 'offset' => '626', 'type' => '12811' }, '29' => { 'name' => 'read_completion_wallclock_ns', 'offset' => '640', 'type' => '12765' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '2222' }, '4' => { 'name' => 'cqe', 'offset' => '40', 'type' => '152' }, '5' => { 'name' => 'mutex', 'offset' => '50', 'type' => '774' }, '6' => { 'name' => 'cond', 'offset' => '114', 'type' => '848' }, '7' => { 'name' => 'comp_events_completed', 'offset' => '288', 'type' => '2222' }, '8' => { 'name' => 'async_events_completed', 'offset' => '292', 'type' => '2222' }, '9' => { 'name' => 'comp_mask', 'offset' => '296', 'type' => '2222' } }, 'Name' => 'struct ibv_cq_ex', 'Size' => '288', 'Type' => 'Struct' }, '12599' => { 'BaseType' => '12141', 'Name' => 'struct ibv_cq_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '12604' => { 'BaseType' => '12072', 'Name' => 'struct ibv_poll_cq_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '12609' => { 'Name' => 'int(*)(struct ibv_cq_ex*, struct ibv_poll_cq_attr*)', 'Param' => { '0' => { 'type' => '12599' }, '1' => { 'type' => '12604' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '12629' => { 'Name' => 'int(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '12599' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '12645' => { 'Name' => 'void(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '12599' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '12665' => { 'Name' => 'enum ibv_wc_opcode(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '12599' } }, 'Return' => '6758', 'Size' => '8', 'Type' => 'FuncPtr' }, '12685' => { 'Name' => 'uint32_t(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '12599' } }, 'Return' => '2222', 'Size' => '8', 'Type' => 'FuncPtr' }, '12705' => { 'Name' => '__be32(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '12599' } }, 'Return' => '2378', 'Size' => '8', 'Type' => 'FuncPtr' }, '12725' => { 'Name' => 'unsigned int(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '12599' } }, 'Return' => '66', 'Size' => '8', 'Type' => 'FuncPtr' }, '12745' => { 'Name' => 'uint8_t(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '12599' } }, 'Return' => '2198', 'Size' => '8', 'Type' => 'FuncPtr' }, '12765' => { 'Name' => 'uint64_t(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '12599' } }, 'Return' => '2234', 'Size' => '8', 'Type' => 'FuncPtr' }, '12785' => { 'Name' => 'uint16_t(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '12599' } }, 'Return' => '2210', 'Size' => '8', 'Type' => 'FuncPtr' }, '12806' => { 'BaseType' => '12100', 'Name' => 'struct ibv_wc_tm_info*', 'Size' => '8', 'Type' => 'Pointer' }, '12811' => { 'Name' => 'void(*)(struct ibv_cq_ex*, struct ibv_wc_tm_info*)', 'Param' => { '0' => { 'type' => '12599' }, '1' => { 'type' => '12806' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '13328' => { 'Header' => undef, 'Line' => 
'1969', 'Memb' => { '0' => { 'name' => '_dummy1', 'offset' => '0', 'type' => '13514' }, '1' => { 'name' => '_dummy2', 'offset' => '8', 'type' => '13530' } }, 'Name' => 'struct _ibv_device_ops', 'Size' => '16', 'Type' => 'Struct' }, '13390' => { 'BaseType' => '13395', 'Name' => 'struct ibv_device*', 'Size' => '8', 'Type' => 'Pointer' }, '13395' => { 'Header' => undef, 'Line' => '1979', 'Memb' => { '0' => { 'name' => '_ops', 'offset' => '0', 'type' => '13328' }, '1' => { 'name' => 'node_type', 'offset' => '22', 'type' => '3483' }, '2' => { 'name' => 'transport_type', 'offset' => '32', 'type' => '3547' }, '3' => { 'name' => 'name', 'offset' => '36', 'type' => '4497' }, '4' => { 'name' => 'dev_name', 'offset' => '136', 'type' => '4497' }, '5' => { 'name' => 'dev_path', 'offset' => '338', 'type' => '13561' }, '6' => { 'name' => 'ibdev_path', 'offset' => '1032', 'type' => '13561' } }, 'Name' => 'struct ibv_device', 'Size' => '664', 'Type' => 'Struct' }, '13514' => { 'Name' => 'struct ibv_context*(*)(struct ibv_device*, int)', 'Param' => { '0' => { 'type' => '13390' }, '1' => { 'type' => '152' } }, 'Return' => '3883', 'Size' => '8', 'Type' => 'FuncPtr' }, '13530' => { 'Name' => 'void(*)(struct ibv_context*)', 'Param' => { '0' => { 'type' => '3883' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '13561' => { 'BaseType' => '241', 'Name' => 'char[256]', 'Size' => '256', 'Type' => 'Array' }, '13577' => { 'Header' => undef, 'Line' => '1994', 'Memb' => { '0' => { 'name' => '_compat_query_device', 'offset' => '0', 'type' => '14065' }, '1' => { 'name' => '_compat_query_port', 'offset' => '8', 'type' => '14105' }, '10' => { 'name' => '_compat_create_cq', 'offset' => '128', 'type' => '14115' }, '11' => { 'name' => 'poll_cq', 'offset' => '136', 'type' => '14230' }, '12' => { 'name' => 'req_notify_cq', 'offset' => '150', 'type' => '14255' }, '13' => { 'name' => '_compat_cq_event', 'offset' => '260', 'type' => '14115' }, '14' => { 'name' => '_compat_resize_cq', 'offset' => '274', 'type' => '14115' }, '15' => { 'name' => '_compat_destroy_cq', 'offset' => '288', 'type' => '14115' }, '16' => { 'name' => '_compat_create_srq', 'offset' => '296', 'type' => '14115' }, '17' => { 'name' => '_compat_modify_srq', 'offset' => '310', 'type' => '14115' }, '18' => { 'name' => '_compat_query_srq', 'offset' => '324', 'type' => '14115' }, '19' => { 'name' => '_compat_destroy_srq', 'offset' => '338', 'type' => '14115' }, '2' => { 'name' => '_compat_alloc_pd', 'offset' => '22', 'type' => '14115' }, '20' => { 'name' => 'post_srq_recv', 'offset' => '352', 'type' => '14285' }, '21' => { 'name' => '_compat_create_qp', 'offset' => '360', 'type' => '14115' }, '22' => { 'name' => '_compat_query_qp', 'offset' => '374', 'type' => '14115' }, '23' => { 'name' => '_compat_modify_qp', 'offset' => '388', 'type' => '14115' }, '24' => { 'name' => '_compat_destroy_qp', 'offset' => '402', 'type' => '14115' }, '25' => { 'name' => 'post_send', 'offset' => '512', 'type' => '14320' }, '26' => { 'name' => 'post_recv', 'offset' => '520', 'type' => '14350' }, '27' => { 'name' => '_compat_create_ah', 'offset' => '534', 'type' => '14115' }, '28' => { 'name' => '_compat_destroy_ah', 'offset' => '548', 'type' => '14115' }, '29' => { 'name' => '_compat_attach_mcast', 'offset' => '562', 'type' => '14115' }, '3' => { 'name' => '_compat_dealloc_pd', 'offset' => '36', 'type' => '14115' }, '30' => { 'name' => '_compat_detach_mcast', 'offset' => '576', 'type' => '14115' }, '31' => { 'name' => '_compat_async_event', 'offset' => '584', 'type' => '14115' 
}, '4' => { 'name' => '_compat_reg_mr', 'offset' => '50', 'type' => '14115' }, '5' => { 'name' => '_compat_rereg_mr', 'offset' => '64', 'type' => '14115' }, '6' => { 'name' => '_compat_dereg_mr', 'offset' => '72', 'type' => '14115' }, '7' => { 'name' => 'alloc_mw', 'offset' => '86', 'type' => '14140' }, '8' => { 'name' => 'bind_mw', 'offset' => '100', 'type' => '14175' }, '9' => { 'name' => 'dealloc_mw', 'offset' => '114', 'type' => '14195' } }, 'Name' => 'struct ibv_context_ops', 'Size' => '256', 'Type' => 'Struct' }, '140' => { 'BaseType' => '92', 'Header' => undef, 'Line' => '40', 'Name' => '__uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '14060' => { 'BaseType' => '3963', 'Name' => 'struct ibv_device_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '14065' => { 'Name' => 'int(*)(struct ibv_context*, struct ibv_device_attr*)', 'Param' => { '0' => { 'type' => '3883' }, '1' => { 'type' => '14060' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14095' => { 'BaseType' => '14100', 'Name' => 'struct _compat_ibv_port_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '14100' => { 'Name' => 'struct _compat_ibv_port_attr', 'Type' => 'Struct' }, '14105' => { 'Name' => 'int(*)(struct ibv_context*, uint8_t, struct _compat_ibv_port_attr*)', 'Param' => { '0' => { 'type' => '3883' }, '1' => { 'type' => '2198' }, '2' => { 'type' => '14095' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14115' => { 'Name' => 'void*(*)()', 'Return' => '78', 'Size' => '8', 'Type' => 'FuncPtr' }, '14140' => { 'Name' => 'struct ibv_mw*(*)(struct ibv_pd*, enum ibv_mw_type)', 'Param' => { '0' => { 'type' => '7613' }, '1' => { 'type' => '7618' } }, 'Return' => '10417', 'Size' => '8', 'Type' => 'FuncPtr' }, '14170' => { 'BaseType' => '10978', 'Name' => 'struct ibv_mw_bind*', 'Size' => '8', 'Type' => 'Pointer' }, '14175' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_mw*, struct ibv_mw_bind*)', 'Param' => { '0' => { 'type' => '6238' }, '1' => { 'type' => '10417' }, '2' => { 'type' => '14170' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14195' => { 'Name' => 'int(*)(struct ibv_mw*)', 'Param' => { '0' => { 'type' => '10417' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14225' => { 'BaseType' => '7059', 'Name' => 'struct ibv_wc*', 'Size' => '8', 'Type' => 'Pointer' }, '14230' => { 'Name' => 'int(*)(struct ibv_cq*, int, struct ibv_wc*)', 'Param' => { '0' => { 'type' => '6040' }, '1' => { 'type' => '152' }, '2' => { 'type' => '14225' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14255' => { 'Name' => 'int(*)(struct ibv_cq*, int)', 'Param' => { '0' => { 'type' => '6040' }, '1' => { 'type' => '152' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14285' => { 'Name' => 'int(*)(struct ibv_srq*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '6353' }, '1' => { 'type' => '10721' }, '2' => { 'type' => '11059' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14315' => { 'BaseType' => '10641', 'Name' => 'struct ibv_send_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '14320' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_send_wr*, struct ibv_send_wr**)', 'Param' => { '0' => { 'type' => '6238' }, '1' => { 'type' => '10641' }, '2' => { 'type' => '14315' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14350' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '6238' }, '1' => { 'type' => '10721' }, '2' => { 'type' => '11059' } }, 
'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14355' => { 'Header' => undef, 'Line' => '2057', 'Memb' => { '0' => { 'name' => 'cqe', 'offset' => '0', 'type' => '2222' }, '1' => { 'name' => 'cq_context', 'offset' => '8', 'type' => '78' }, '2' => { 'name' => 'channel', 'offset' => '22', 'type' => '12067' }, '3' => { 'name' => 'comp_vector', 'offset' => '36', 'type' => '2222' }, '4' => { 'name' => 'wc_flags', 'offset' => '50', 'type' => '2234' }, '5' => { 'name' => 'comp_mask', 'offset' => '64', 'type' => '2222' }, '6' => { 'name' => 'flags', 'offset' => '68', 'type' => '2222' }, '7' => { 'name' => 'parent_domain', 'offset' => '72', 'type' => '7613' } }, 'Name' => 'struct ibv_cq_init_attr_ex', 'Size' => '56', 'Type' => 'Struct' }, '152' => { 'Name' => 'int', 'Size' => '4', 'Type' => 'Intrinsic' }, '16157' => { 'BaseType' => '14355', 'Name' => 'struct ibv_cq_init_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '16328' => { 'BaseType' => '8953', 'Name' => 'struct ibv_qp_init_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '169' => { 'BaseType' => '66', 'Header' => undef, 'Line' => '42', 'Name' => '__uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '19201' => { 'BaseType' => '8768', 'Name' => 'struct ibv_qp_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '193' => { 'BaseType' => '54', 'Header' => undef, 'Line' => '45', 'Name' => '__uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '2198' => { 'BaseType' => '116', 'Header' => undef, 'Line' => '24', 'Name' => 'uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '2210' => { 'BaseType' => '140', 'Header' => undef, 'Line' => '25', 'Name' => 'uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '2222' => { 'BaseType' => '169', 'Header' => undef, 'Line' => '26', 'Name' => 'uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '2234' => { 'BaseType' => '193', 'Header' => undef, 'Line' => '27', 'Name' => 'uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '2278' => { 'Name' => '_Bool', 'Size' => '1', 'Type' => 'Intrinsic' }, '2354' => { 'BaseType' => '66', 'Header' => undef, 'Line' => '27', 'Name' => '__u32', 'Size' => '4', 'Type' => 'Typedef' }, '2366' => { 'BaseType' => '392', 'Header' => undef, 'Line' => '31', 'Name' => '__u64', 'Size' => '8', 'Type' => 'Typedef' }, '2378' => { 'BaseType' => '2354', 'Header' => undef, 'Line' => '27', 'Name' => '__be32', 'Size' => '4', 'Type' => 'Typedef' }, '2390' => { 'BaseType' => '2366', 'Header' => undef, 'Line' => '29', 'Name' => '__be64', 'Size' => '8', 'Type' => 'Typedef' }, '241' => { 'Name' => 'char', 'Size' => '1', 'Type' => 'Intrinsic' }, '3388' => { 'Header' => undef, 'Line' => '67', 'Memb' => { '0' => { 'name' => 'subnet_prefix', 'offset' => '0', 'type' => '2390' }, '1' => { 'name' => 'interface_id', 'offset' => '8', 'type' => '2390' } }, 'Size' => '16', 'Type' => 'Struct' }, '3424' => { 'Header' => undef, 'Line' => '65', 'Memb' => { '0' => { 'name' => 'raw', 'offset' => '0', 'type' => '3467' }, '1' => { 'name' => 'global', 'offset' => '0', 'type' => '3388' } }, 'Name' => 'union ibv_gid', 'Size' => '16', 'Type' => 'Union' }, '3467' => { 'BaseType' => '2198', 'Name' => 'uint8_t[16]', 'Size' => '16', 'Type' => 'Array' }, '3483' => { 'Header' => undef, 'Line' => '95', 'Memb' => { '0' => { 'name' => 'IBV_NODE_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_NODE_CA', 'value' => '1' }, '2' => { 'name' => 'IBV_NODE_SWITCH', 'value' => '2' }, '3' => { 'name' => 'IBV_NODE_ROUTER', 'value' => '3' }, '4' => { 'name' => 'IBV_NODE_RNIC', 'value' => '4' }, '5' => { 'name' => 
'IBV_NODE_USNIC', 'value' => '5' }, '6' => { 'name' => 'IBV_NODE_USNIC_UDP', 'value' => '6' }, '7' => { 'name' => 'IBV_NODE_UNSPECIFIED', 'value' => '7' } }, 'Name' => 'enum ibv_node_type', 'Size' => '4', 'Type' => 'Enum' }, '3547' => { 'Header' => undef, 'Line' => '106', 'Memb' => { '0' => { 'name' => 'IBV_TRANSPORT_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_TRANSPORT_IB', 'value' => '0' }, '2' => { 'name' => 'IBV_TRANSPORT_IWARP', 'value' => '1' }, '3' => { 'name' => 'IBV_TRANSPORT_USNIC', 'value' => '2' }, '4' => { 'name' => 'IBV_TRANSPORT_USNIC_UDP', 'value' => '3' }, '5' => { 'name' => 'IBV_TRANSPORT_UNSPECIFIED', 'value' => '4' } }, 'Name' => 'enum ibv_transport_type', 'Size' => '4', 'Type' => 'Enum' }, '3599' => { 'Header' => undef, 'Line' => '155', 'Memb' => { '0' => { 'name' => 'IBV_ATOMIC_NONE', 'value' => '0' }, '1' => { 'name' => 'IBV_ATOMIC_HCA', 'value' => '1' }, '2' => { 'name' => 'IBV_ATOMIC_GLOB', 'value' => '2' } }, 'Name' => 'enum ibv_atomic_cap', 'Size' => '4', 'Type' => 'Enum' }, '3766' => { 'Header' => undef, 'Line' => '2037', 'Memb' => { '0' => { 'name' => 'device', 'offset' => '0', 'type' => '13390' }, '1' => { 'name' => 'ops', 'offset' => '8', 'type' => '13577' }, '2' => { 'name' => 'cmd_fd', 'offset' => '612', 'type' => '152' }, '3' => { 'name' => 'async_fd', 'offset' => '616', 'type' => '152' }, '4' => { 'name' => 'num_comp_vectors', 'offset' => '626', 'type' => '152' }, '5' => { 'name' => 'mutex', 'offset' => '640', 'type' => '774' }, '6' => { 'name' => 'abi_compat', 'offset' => '800', 'type' => '78' } }, 'Name' => 'struct ibv_context', 'Size' => '328', 'Type' => 'Struct' }, '38335' => { 'BaseType' => '2198', 'Name' => 'uint8_t[6]', 'Size' => '6', 'Type' => 'Array' }, '3883' => { 'BaseType' => '3766', 'Name' => 'struct ibv_context*', 'Size' => '8', 'Type' => 'Pointer' }, '392' => { 'Name' => 'unsigned long long', 'Size' => '8', 'Type' => 'Intrinsic' }, '3963' => { 'Header' => undef, 'Line' => '182', 'Memb' => { '0' => { 'name' => 'fw_ver', 'offset' => '0', 'type' => '4497' }, '1' => { 'name' => 'node_guid', 'offset' => '100', 'type' => '2390' }, '10' => { 'name' => 'device_cap_flags', 'offset' => '278', 'type' => '66' }, '11' => { 'name' => 'max_sge', 'offset' => '288', 'type' => '152' }, '12' => { 'name' => 'max_sge_rd', 'offset' => '292', 'type' => '152' }, '13' => { 'name' => 'max_cq', 'offset' => '296', 'type' => '152' }, '14' => { 'name' => 'max_cqe', 'offset' => '306', 'type' => '152' }, '15' => { 'name' => 'max_mr', 'offset' => '310', 'type' => '152' }, '16' => { 'name' => 'max_pd', 'offset' => '320', 'type' => '152' }, '17' => { 'name' => 'max_qp_rd_atom', 'offset' => '324', 'type' => '152' }, '18' => { 'name' => 'max_ee_rd_atom', 'offset' => '328', 'type' => '152' }, '19' => { 'name' => 'max_res_rd_atom', 'offset' => '338', 'type' => '152' }, '2' => { 'name' => 'sys_image_guid', 'offset' => '114', 'type' => '2390' }, '20' => { 'name' => 'max_qp_init_rd_atom', 'offset' => '342', 'type' => '152' }, '21' => { 'name' => 'max_ee_init_rd_atom', 'offset' => '352', 'type' => '152' }, '22' => { 'name' => 'atomic_cap', 'offset' => '356', 'type' => '3599' }, '23' => { 'name' => 'max_ee', 'offset' => '360', 'type' => '152' }, '24' => { 'name' => 'max_rdd', 'offset' => '370', 'type' => '152' }, '25' => { 'name' => 'max_mw', 'offset' => '374', 'type' => '152' }, '26' => { 'name' => 'max_raw_ipv6_qp', 'offset' => '384', 'type' => '152' }, '27' => { 'name' => 'max_raw_ethy_qp', 'offset' => '388', 'type' => '152' }, '28' => { 'name' => 
'max_mcast_grp', 'offset' => '392', 'type' => '152' }, '29' => { 'name' => 'max_mcast_qp_attach', 'offset' => '402', 'type' => '152' }, '3' => { 'name' => 'max_mr_size', 'offset' => '128', 'type' => '2234' }, '30' => { 'name' => 'max_total_mcast_qp_attach', 'offset' => '406', 'type' => '152' }, '31' => { 'name' => 'max_ah', 'offset' => '512', 'type' => '152' }, '32' => { 'name' => 'max_fmr', 'offset' => '516', 'type' => '152' }, '33' => { 'name' => 'max_map_per_fmr', 'offset' => '520', 'type' => '152' }, '34' => { 'name' => 'max_srq', 'offset' => '530', 'type' => '152' }, '35' => { 'name' => 'max_srq_wr', 'offset' => '534', 'type' => '152' }, '36' => { 'name' => 'max_srq_sge', 'offset' => '544', 'type' => '152' }, '37' => { 'name' => 'max_pkeys', 'offset' => '548', 'type' => '2210' }, '38' => { 'name' => 'local_ca_ack_delay', 'offset' => '550', 'type' => '2198' }, '39' => { 'name' => 'phys_port_cnt', 'offset' => '551', 'type' => '2198' }, '4' => { 'name' => 'page_size_cap', 'offset' => '136', 'type' => '2234' }, '5' => { 'name' => 'vendor_id', 'offset' => '150', 'type' => '2222' }, '6' => { 'name' => 'vendor_part_id', 'offset' => '256', 'type' => '2222' }, '7' => { 'name' => 'hw_ver', 'offset' => '260', 'type' => '2222' }, '8' => { 'name' => 'max_qp', 'offset' => '264', 'type' => '152' }, '9' => { 'name' => 'max_qp_wr', 'offset' => '274', 'type' => '152' } }, 'Name' => 'struct ibv_device_attr', 'Size' => '232', 'Type' => 'Struct' }, '42' => { 'BaseType' => '54', 'Header' => undef, 'Line' => '214', 'Name' => 'size_t', 'Size' => '8', 'Type' => 'Typedef' }, '4497' => { 'BaseType' => '241', 'Name' => 'char[64]', 'Size' => '64', 'Type' => 'Array' }, '48176' => { 'Header' => undef, 'Line' => '32', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'driver_qp_type', 'offset' => '8', 'type' => '2222' }, '2' => { 'name' => 'flags', 'offset' => '18', 'type' => '2210' }, '3' => { 'name' => 'sl', 'offset' => '20', 'type' => '2198' }, '4' => { 'name' => 'reserved', 'offset' => '21', 'type' => '48254' } }, 'Name' => 'struct efadv_qp_init_attr', 'Size' => '16', 'Type' => 'Struct' }, '48254' => { 'BaseType' => '2198', 'Name' => 'uint8_t[1]', 'Size' => '1', 'Type' => 'Array' }, '48313' => { 'Header' => undef, 'Line' => '53', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'max_sq_wr', 'offset' => '8', 'type' => '2222' }, '2' => { 'name' => 'max_rq_wr', 'offset' => '18', 'type' => '2222' }, '3' => { 'name' => 'max_sq_sge', 'offset' => '22', 'type' => '2210' }, '4' => { 'name' => 'max_rq_sge', 'offset' => '24', 'type' => '2210' }, '5' => { 'name' => 'inline_buf_size', 'offset' => '32', 'type' => '2210' }, '6' => { 'name' => 'reserved', 'offset' => '34', 'type' => '48444' }, '7' => { 'name' => 'device_caps', 'offset' => '36', 'type' => '2222' }, '8' => { 'name' => 'max_rdma_size', 'offset' => '40', 'type' => '2222' } }, 'Name' => 'struct efadv_device_attr', 'Size' => '32', 'Type' => 'Struct' }, '48444' => { 'BaseType' => '2198', 'Name' => 'uint8_t[2]', 'Size' => '2', 'Type' => 'Array' }, '48460' => { 'Header' => undef, 'Line' => '69', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'ahn', 'offset' => '8', 'type' => '2210' }, '2' => { 'name' => 'reserved', 'offset' => '16', 'type' => '38335' } }, 'Name' => 'struct efadv_ah_attr', 'Size' => '16', 'Type' => 'Struct' }, '48513' => { 'Header' => undef, 'Line' => '78', 'Memb' => { '0' => { 'name' => 'comp_mask', 
'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'wc_read_sgid', 'offset' => '8', 'type' => '48596' }, '2' => { 'name' => 'wc_is_unsolicited', 'offset' => '22', 'type' => '48616' } }, 'Name' => 'struct efadv_cq', 'Size' => '24', 'Type' => 'Struct' }, '48586' => { 'BaseType' => '48513', 'Name' => 'struct efadv_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '48591' => { 'BaseType' => '3424', 'Name' => 'union ibv_gid*', 'Size' => '8', 'Type' => 'Pointer' }, '48596' => { 'Name' => 'int(*)(struct efadv_cq*, union ibv_gid*)', 'Param' => { '0' => { 'type' => '48586' }, '1' => { 'type' => '48591' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '48616' => { 'Name' => '_Bool(*)(struct efadv_cq*)', 'Param' => { '0' => { 'type' => '48586' } }, 'Return' => '2278', 'Size' => '8', 'Type' => 'FuncPtr' }, '48646' => { 'Header' => undef, 'Line' => '89', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'wc_flags', 'offset' => '8', 'type' => '2234' } }, 'Name' => 'struct efadv_cq_init_attr', 'Size' => '16', 'Type' => 'Struct' }, '48717' => { 'Header' => undef, 'Line' => '118', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'ic_id_validity', 'offset' => '8', 'type' => '2210' }, '2' => { 'name' => 'recv_ic_id', 'offset' => '16', 'type' => '2210' }, '3' => { 'name' => 'rdma_read_ic_id', 'offset' => '18', 'type' => '2210' }, '4' => { 'name' => 'rdma_recv_ic_id', 'offset' => '20', 'type' => '2210' } }, 'Name' => 'struct efadv_mr_attr', 'Size' => '16', 'Type' => 'Struct' }, '52442' => { 'BaseType' => '48460', 'Name' => 'struct efadv_ah_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '54' => { 'Name' => 'unsigned long', 'Size' => '8', 'Type' => 'Intrinsic' }, '5901' => { 'Header' => undef, 'Line' => '1508', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3883' }, '1' => { 'name' => 'channel', 'offset' => '8', 'type' => '12067' }, '2' => { 'name' => 'cq_context', 'offset' => '22', 'type' => '78' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '2222' }, '4' => { 'name' => 'cqe', 'offset' => '40', 'type' => '152' }, '5' => { 'name' => 'mutex', 'offset' => '50', 'type' => '774' }, '6' => { 'name' => 'cond', 'offset' => '114', 'type' => '848' }, '7' => { 'name' => 'comp_events_completed', 'offset' => '288', 'type' => '2222' }, '8' => { 'name' => 'async_events_completed', 'offset' => '292', 'type' => '2222' } }, 'Name' => 'struct ibv_cq', 'Size' => '128', 'Type' => 'Struct' }, '6040' => { 'BaseType' => '5901', 'Name' => 'struct ibv_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '6045' => { 'Header' => undef, 'Line' => '1283', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3883' }, '1' => { 'name' => 'qp_context', 'offset' => '8', 'type' => '78' }, '10' => { 'name' => 'mutex', 'offset' => '100', 'type' => '774' }, '11' => { 'name' => 'cond', 'offset' => '260', 'type' => '848' }, '12' => { 'name' => 'events_completed', 'offset' => '338', 'type' => '2222' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '7613' }, '3' => { 'name' => 'send_cq', 'offset' => '36', 'type' => '6040' }, '4' => { 'name' => 'recv_cq', 'offset' => '50', 'type' => '6040' }, '5' => { 'name' => 'srq', 'offset' => '64', 'type' => '6353' }, '6' => { 'name' => 'handle', 'offset' => '72', 'type' => '2222' }, '7' => { 'name' => 'qp_num', 'offset' => '82', 'type' => '2222' }, '8' => { 'name' => 'state', 'offset' => '86', 'type' => '9276' }, '9' => { 'name' => 'qp_type', 'offset' => 
'96', 'type' => '8625' } }, 'Name' => 'struct ibv_qp', 'Size' => '160', 'Type' => 'Struct' }, '6238' => { 'BaseType' => '6045', 'Name' => 'struct ibv_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '6243' => { 'Header' => undef, 'Line' => '1243', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3883' }, '1' => { 'name' => 'srq_context', 'offset' => '8', 'type' => '78' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '7613' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '2222' }, '4' => { 'name' => 'mutex', 'offset' => '50', 'type' => '774' }, '5' => { 'name' => 'cond', 'offset' => '114', 'type' => '848' }, '6' => { 'name' => 'events_completed', 'offset' => '288', 'type' => '2222' } }, 'Name' => 'struct ibv_srq', 'Size' => '128', 'Type' => 'Struct' }, '6353' => { 'BaseType' => '6243', 'Name' => 'struct ibv_srq*', 'Size' => '8', 'Type' => 'Pointer' }, '6597' => { 'Header' => undef, 'Line' => '485', 'Memb' => { '0' => { 'name' => 'IBV_WC_SUCCESS', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_LOC_LEN_ERR', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_REM_ACCESS_ERR', 'value' => '10' }, '11' => { 'name' => 'IBV_WC_REM_OP_ERR', 'value' => '11' }, '12' => { 'name' => 'IBV_WC_RETRY_EXC_ERR', 'value' => '12' }, '13' => { 'name' => 'IBV_WC_RNR_RETRY_EXC_ERR', 'value' => '13' }, '14' => { 'name' => 'IBV_WC_LOC_RDD_VIOL_ERR', 'value' => '14' }, '15' => { 'name' => 'IBV_WC_REM_INV_RD_REQ_ERR', 'value' => '15' }, '16' => { 'name' => 'IBV_WC_REM_ABORT_ERR', 'value' => '16' }, '17' => { 'name' => 'IBV_WC_INV_EECN_ERR', 'value' => '17' }, '18' => { 'name' => 'IBV_WC_INV_EEC_STATE_ERR', 'value' => '18' }, '19' => { 'name' => 'IBV_WC_FATAL_ERR', 'value' => '19' }, '2' => { 'name' => 'IBV_WC_LOC_QP_OP_ERR', 'value' => '2' }, '20' => { 'name' => 'IBV_WC_RESP_TIMEOUT_ERR', 'value' => '20' }, '21' => { 'name' => 'IBV_WC_GENERAL_ERR', 'value' => '21' }, '22' => { 'name' => 'IBV_WC_TM_ERR', 'value' => '22' }, '23' => { 'name' => 'IBV_WC_TM_RNDV_INCOMPLETE', 'value' => '23' }, '3' => { 'name' => 'IBV_WC_LOC_EEC_OP_ERR', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_LOC_PROT_ERR', 'value' => '4' }, '5' => { 'name' => 'IBV_WC_WR_FLUSH_ERR', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_MW_BIND_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_BAD_RESP_ERR', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_LOC_ACCESS_ERR', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_REM_INV_REQ_ERR', 'value' => '9' } }, 'Name' => 'enum ibv_wc_status', 'Size' => '4', 'Type' => 'Enum' }, '66' => { 'Name' => 'unsigned int', 'Size' => '4', 'Type' => 'Intrinsic' }, '6758' => { 'Header' => undef, 'Line' => '513', 'Memb' => { '0' => { 'name' => 'IBV_WC_SEND', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_RDMA_WRITE', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_RECV', 'value' => '128' }, '11' => { 'name' => 'IBV_WC_RECV_RDMA_WITH_IMM', 'value' => '129' }, '12' => { 'name' => 'IBV_WC_TM_ADD', 'value' => '130' }, '13' => { 'name' => 'IBV_WC_TM_DEL', 'value' => '131' }, '14' => { 'name' => 'IBV_WC_TM_SYNC', 'value' => '132' }, '15' => { 'name' => 'IBV_WC_TM_RECV', 'value' => '133' }, '16' => { 'name' => 'IBV_WC_TM_NO_TAG', 'value' => '134' }, '17' => { 'name' => 'IBV_WC_DRIVER1', 'value' => '135' }, '18' => { 'name' => 'IBV_WC_DRIVER2', 'value' => '136' }, '19' => { 'name' => 'IBV_WC_DRIVER3', 'value' => '137' }, '2' => { 'name' => 'IBV_WC_RDMA_READ', 'value' => '2' }, '3' => { 'name' => 'IBV_WC_COMP_SWAP', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_FETCH_ADD', 'value' => '4' }, '5' => { 'name' => 
'IBV_WC_BIND_MW', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_LOCAL_INV', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_TSO', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_FLUSH', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_ATOMIC_WRITE', 'value' => '9' } }, 'Name' => 'enum ibv_wc_opcode', 'Size' => '4', 'Type' => 'Enum' }, '67792' => { 'BaseType' => '48176', 'Name' => 'struct efadv_qp_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '7026' => { 'Header' => undef, 'Line' => '598', 'Memb' => { '0' => { 'name' => 'imm_data', 'offset' => '0', 'type' => '2378' }, '1' => { 'name' => 'invalidated_rkey', 'offset' => '0', 'type' => '2222' } }, 'Size' => '4', 'Type' => 'Union' }, '7059' => { 'Header' => undef, 'Line' => '589', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'status', 'offset' => '8', 'type' => '6597' }, '10' => { 'name' => 'slid', 'offset' => '66', 'type' => '2210' }, '11' => { 'name' => 'sl', 'offset' => '68', 'type' => '2198' }, '12' => { 'name' => 'dlid_path_bits', 'offset' => '69', 'type' => '2198' }, '2' => { 'name' => 'opcode', 'offset' => '18', 'type' => '6758' }, '3' => { 'name' => 'vendor_err', 'offset' => '22', 'type' => '2222' }, '4' => { 'name' => 'byte_len', 'offset' => '32', 'type' => '2222' }, '5' => { 'name' => 'unnamed0', 'offset' => '36', 'type' => '7026' }, '6' => { 'name' => 'qp_num', 'offset' => '40', 'type' => '2222' }, '7' => { 'name' => 'src_qp', 'offset' => '50', 'type' => '2222' }, '8' => { 'name' => 'wc_flags', 'offset' => '54', 'type' => '66' }, '9' => { 'name' => 'pkey_index', 'offset' => '64', 'type' => '2210' } }, 'Name' => 'struct ibv_wc', 'Size' => '48', 'Type' => 'Struct' }, '7245' => { 'Header' => undef, 'Line' => '625', 'Memb' => { '0' => { 'name' => 'mr', 'offset' => '0', 'type' => '7428' }, '1' => { 'name' => 'addr', 'offset' => '8', 'type' => '2234' }, '2' => { 'name' => 'length', 'offset' => '22', 'type' => '2234' }, '3' => { 'name' => 'mw_access_flags', 'offset' => '36', 'type' => '66' } }, 'Name' => 'struct ibv_mw_bind_info', 'Size' => '32', 'Type' => 'Struct' }, '7318' => { 'Header' => undef, 'Line' => '668', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3883' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '7613' }, '2' => { 'name' => 'addr', 'offset' => '22', 'type' => '78' }, '3' => { 'name' => 'length', 'offset' => '36', 'type' => '42' }, '4' => { 'name' => 'handle', 'offset' => '50', 'type' => '2222' }, '5' => { 'name' => 'lkey', 'offset' => '54', 'type' => '2222' }, '6' => { 'name' => 'rkey', 'offset' => '64', 'type' => '2222' } }, 'Name' => 'struct ibv_mr', 'Size' => '48', 'Type' => 'Struct' }, '7428' => { 'BaseType' => '7318', 'Name' => 'struct ibv_mr*', 'Size' => '8', 'Type' => 'Pointer' }, '7433' => { 'Header' => undef, 'Line' => '632', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3883' }, '1' => { 'name' => 'handle', 'offset' => '8', 'type' => '2222' } }, 'Name' => 'struct ibv_pd', 'Size' => '16', 'Type' => 'Struct' }, '7585' => { 'Header' => undef, 'Line' => '657', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3883' } }, 'Name' => 'struct ibv_xrcd', 'Size' => '8', 'Type' => 'Struct' }, '7613' => { 'BaseType' => '7433', 'Name' => 'struct ibv_pd*', 'Size' => '8', 'Type' => 'Pointer' }, '7618' => { 'Header' => undef, 'Line' => '678', 'Memb' => { '0' => { 'name' => 'IBV_MW_TYPE_1', 'value' => '1' }, '1' => { 'name' => 'IBV_MW_TYPE_2', 'value' => '2' } }, 'Name' => 'enum ibv_mw_type', 'Size' => '4', 
'Type' => 'Enum' }, '76240' => { 'BaseType' => '48646', 'Name' => 'struct efadv_cq_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '7647' => { 'Header' => undef, 'Line' => '683', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3883' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '7613' }, '2' => { 'name' => 'rkey', 'offset' => '22', 'type' => '2222' }, '3' => { 'name' => 'handle', 'offset' => '32', 'type' => '2222' }, '4' => { 'name' => 'type', 'offset' => '36', 'type' => '7618' } }, 'Name' => 'struct ibv_mw', 'Size' => '32', 'Type' => 'Struct' }, '78' => { 'BaseType' => '1', 'Name' => 'void*', 'Size' => '8', 'Type' => 'Pointer' }, '80' => { 'Name' => 'unsigned char', 'Size' => '1', 'Type' => 'Intrinsic' }, '8219' => { 'BaseType' => '7585', 'Name' => 'struct ibv_xrcd*', 'Size' => '8', 'Type' => 'Pointer' }, '8494' => { 'Header' => undef, 'Line' => '880', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3883' }, '1' => { 'name' => 'ind_tbl_handle', 'offset' => '8', 'type' => '152' }, '2' => { 'name' => 'ind_tbl_num', 'offset' => '18', 'type' => '152' }, '3' => { 'name' => 'comp_mask', 'offset' => '22', 'type' => '2222' } }, 'Name' => 'struct ibv_rwq_ind_table', 'Size' => '24', 'Type' => 'Struct' }, '85750' => { 'BaseType' => '48717', 'Name' => 'struct efadv_mr_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '8625' => { 'Header' => undef, 'Line' => '901', 'Memb' => { '0' => { 'name' => 'IBV_QPT_RC', 'value' => '2' }, '1' => { 'name' => 'IBV_QPT_UC', 'value' => '3' }, '2' => { 'name' => 'IBV_QPT_UD', 'value' => '4' }, '3' => { 'name' => 'IBV_QPT_RAW_PACKET', 'value' => '8' }, '4' => { 'name' => 'IBV_QPT_XRC_SEND', 'value' => '9' }, '5' => { 'name' => 'IBV_QPT_XRC_RECV', 'value' => '10' }, '6' => { 'name' => 'IBV_QPT_DRIVER', 'value' => '255' } }, 'Name' => 'enum ibv_qp_type', 'Size' => '4', 'Type' => 'Enum' }, '8684' => { 'Header' => undef, 'Line' => '911', 'Memb' => { '0' => { 'name' => 'max_send_wr', 'offset' => '0', 'type' => '2222' }, '1' => { 'name' => 'max_recv_wr', 'offset' => '4', 'type' => '2222' }, '2' => { 'name' => 'max_send_sge', 'offset' => '8', 'type' => '2222' }, '3' => { 'name' => 'max_recv_sge', 'offset' => '18', 'type' => '2222' }, '4' => { 'name' => 'max_inline_data', 'offset' => '22', 'type' => '2222' } }, 'Name' => 'struct ibv_qp_cap', 'Size' => '20', 'Type' => 'Struct' }, '8768' => { 'Header' => undef, 'Line' => '919', 'Memb' => { '0' => { 'name' => 'qp_context', 'offset' => '0', 'type' => '78' }, '1' => { 'name' => 'send_cq', 'offset' => '8', 'type' => '6040' }, '2' => { 'name' => 'recv_cq', 'offset' => '22', 'type' => '6040' }, '3' => { 'name' => 'srq', 'offset' => '36', 'type' => '6353' }, '4' => { 'name' => 'cap', 'offset' => '50', 'type' => '8684' }, '5' => { 'name' => 'qp_type', 'offset' => '82', 'type' => '8625' }, '6' => { 'name' => 'sq_sig_all', 'offset' => '86', 'type' => '152' } }, 'Name' => 'struct ibv_qp_init_attr', 'Size' => '64', 'Type' => 'Struct' }, '87774' => { 'BaseType' => '48313', 'Name' => 'struct efadv_device_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '8878' => { 'Header' => undef, 'Line' => '963', 'Memb' => { '0' => { 'name' => 'rx_hash_function', 'offset' => '0', 'type' => '2198' }, '1' => { 'name' => 'rx_hash_key_len', 'offset' => '1', 'type' => '2198' }, '2' => { 'name' => 'rx_hash_key', 'offset' => '8', 'type' => '8948' }, '3' => { 'name' => 'rx_hash_fields_mask', 'offset' => '22', 'type' => '2234' } }, 'Name' => 'struct ibv_rx_hash_conf', 'Size' => '24', 'Type' => 'Struct' }, '8948' => { 
'BaseType' => '2198', 'Name' => 'uint8_t*', 'Size' => '8', 'Type' => 'Pointer' }, '8953' => { 'Header' => undef, 'Line' => '972', 'Memb' => { '0' => { 'name' => 'qp_context', 'offset' => '0', 'type' => '78' }, '1' => { 'name' => 'send_cq', 'offset' => '8', 'type' => '6040' }, '10' => { 'name' => 'create_flags', 'offset' => '128', 'type' => '2222' }, '11' => { 'name' => 'max_tso_header', 'offset' => '132', 'type' => '2210' }, '12' => { 'name' => 'rwq_ind_tbl', 'offset' => '136', 'type' => '9187' }, '13' => { 'name' => 'rx_hash_conf', 'offset' => '150', 'type' => '8878' }, '14' => { 'name' => 'source_qpn', 'offset' => '288', 'type' => '2222' }, '15' => { 'name' => 'send_ops_flags', 'offset' => '296', 'type' => '2234' }, '2' => { 'name' => 'recv_cq', 'offset' => '22', 'type' => '6040' }, '3' => { 'name' => 'srq', 'offset' => '36', 'type' => '6353' }, '4' => { 'name' => 'cap', 'offset' => '50', 'type' => '8684' }, '5' => { 'name' => 'qp_type', 'offset' => '82', 'type' => '8625' }, '6' => { 'name' => 'sq_sig_all', 'offset' => '86', 'type' => '152' }, '7' => { 'name' => 'comp_mask', 'offset' => '96', 'type' => '2222' }, '8' => { 'name' => 'pd', 'offset' => '100', 'type' => '7613' }, '9' => { 'name' => 'xrcd', 'offset' => '114', 'type' => '8219' } }, 'Name' => 'struct ibv_qp_init_attr_ex', 'Size' => '136', 'Type' => 'Struct' }, '9187' => { 'BaseType' => '8494', 'Name' => 'struct ibv_rwq_ind_table*', 'Size' => '8', 'Type' => 'Pointer' }, '92' => { 'Name' => 'unsigned short', 'Size' => '2', 'Type' => 'Intrinsic' }, '9276' => { 'Header' => undef, 'Line' => '1050', 'Memb' => { '0' => { 'name' => 'IBV_QPS_RESET', 'value' => '0' }, '1' => { 'name' => 'IBV_QPS_INIT', 'value' => '1' }, '2' => { 'name' => 'IBV_QPS_RTR', 'value' => '2' }, '3' => { 'name' => 'IBV_QPS_RTS', 'value' => '3' }, '4' => { 'name' => 'IBV_QPS_SQD', 'value' => '4' }, '5' => { 'name' => 'IBV_QPS_SQE', 'value' => '5' }, '6' => { 'name' => 'IBV_QPS_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_QPS_UNKNOWN', 'value' => '7' } }, 'Name' => 'enum ibv_qp_state', 'Size' => '4', 'Type' => 'Enum' }, '9823' => { 'Header' => undef, 'Line' => '1103', 'Memb' => { '0' => { 'name' => 'IBV_WR_RDMA_WRITE', 'value' => '0' }, '1' => { 'name' => 'IBV_WR_RDMA_WRITE_WITH_IMM', 'value' => '1' }, '10' => { 'name' => 'IBV_WR_TSO', 'value' => '10' }, '11' => { 'name' => 'IBV_WR_DRIVER1', 'value' => '11' }, '12' => { 'name' => 'IBV_WR_FLUSH', 'value' => '14' }, '13' => { 'name' => 'IBV_WR_ATOMIC_WRITE', 'value' => '15' }, '2' => { 'name' => 'IBV_WR_SEND', 'value' => '2' }, '3' => { 'name' => 'IBV_WR_SEND_WITH_IMM', 'value' => '3' }, '4' => { 'name' => 'IBV_WR_RDMA_READ', 'value' => '4' }, '5' => { 'name' => 'IBV_WR_ATOMIC_CMP_AND_SWP', 'value' => '5' }, '6' => { 'name' => 'IBV_WR_ATOMIC_FETCH_AND_ADD', 'value' => '6' }, '7' => { 'name' => 'IBV_WR_LOCAL_INV', 'value' => '7' }, '8' => { 'name' => 'IBV_WR_BIND_MW', 'value' => '8' }, '9' => { 'name' => 'IBV_WR_SEND_WITH_INV', 'value' => '9' } }, 'Name' => 'enum ibv_wr_opcode', 'Size' => '4', 'Type' => 'Enum' }, '9971' => { 'Header' => undef, 'Line' => '1145', 'Memb' => { '0' => { 'name' => 'addr', 'offset' => '0', 'type' => '2234' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '2222' }, '2' => { 'name' => 'lkey', 'offset' => '18', 'type' => '2222' } }, 'Name' => 'struct ibv_sge', 'Size' => '16', 'Type' => 'Struct' } }, 'UndefinedSymbols' => { 'libefa.so.1.3.56.0' => { '_ITM_deregisterTMCloneTable' => 0, '_ITM_registerTMCloneTable' => 0, '__cxa_finalize@GLIBC_2.2.5' => 0, '__errno_location@GLIBC_2.2.5' 
=> 0, '__gmon_start__' => 0, '__snprintf_chk@GLIBC_2.3.4' => 0, '__stack_chk_fail@GLIBC_2.4' => 0, '__verbs_log@IBVERBS_PRIVATE_34' => 0, '_verbs_init_and_alloc_context@IBVERBS_PRIVATE_34' => 0, 'calloc@GLIBC_2.2.5' => 0, 'execute_ioctl@IBVERBS_PRIVATE_34' => 0, 'free@GLIBC_2.2.5' => 0, 'ibv_cmd_alloc_pd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_ah@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_cq_ex@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_qp_ex@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_dealloc_pd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_dereg_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_ah@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_cq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_get_context@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_device_any@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_port@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_reg_dmabuf_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_reg_mr@IBVERBS_PRIVATE_34' => 0, 'malloc@GLIBC_2.2.5' => 0, 'memcpy@GLIBC_2.14' => 0, 'memset@GLIBC_2.2.5' => 0, 'mmap@GLIBC_2.2.5' => 0, 'munmap@GLIBC_2.2.5' => 0, 'pthread_spin_destroy@GLIBC_2.34' => 0, 'pthread_spin_init@GLIBC_2.34' => 0, 'pthread_spin_lock@GLIBC_2.34' => 0, 'pthread_spin_unlock@GLIBC_2.34' => 0, 'sysconf@GLIBC_2.2.5' => 0, 'verbs_register_driver_34@IBVERBS_PRIVATE_34' => 0, 'verbs_set_ops@IBVERBS_PRIVATE_34' => 0, 'verbs_uninit_context@IBVERBS_PRIVATE_34' => 0 } }, 'WordSize' => '8' };

==> rdma-core-56.1/ABI/hns.dump <==
$VAR1 = { 'ABI_DUMPER_VERSION' => '1.2', 'ABI_DUMP_VERSION' => '3.5', 'Arch' => 'x86_64', 'GccVersion' => '12.3.0', 'Headers' => {}, 'Language' => 'C', 'LibraryName' => 'libhns.so.1.0.56.0', 'LibraryVersion' => 'hns', 'MissedOffsets' => '1', 'MissedRegs' => '1', 'NameSpaces' => {}, 'Needed' => { 'libc.so.6' => 1, 'libibverbs.so.1' => 1 }, 'Sources' => {}, 'SymbolInfo' => { '144809' => { 'Header' => undef, 'Line' => '1682', 'Param' => { '0' => { 'name' => 'context', 'type' => '3963' }, '1' => { 'name' => 'attrs_out', 'type' => '145132' } }, 'Return' => '152', 'ShortName' => 'hnsdv_query_device' }, '145158' => { 'Header' => undef, 'Line' => '1665', 'Param' => { '0' => { 'name' => 'context', 'type' => '3963' }, '1' => { 'name' => 'qp_attr', 'type' => '16449' }, '2' => { 'name' => 'hns_attr', 'type' => '145331' } }, 'Return' => '6323', 'ShortName' => 'hnsdv_create_qp' }, '24058' => { 'Header' => undef, 'Line' => '287', 'Param' => { '0' => { 'name' => 'device', 'type' => '13508' } }, 'Return' => '2309', 'ShortName' => 'hnsdv_is_supported' } }, 'SymbolVersion' => { 'hnsdv_create_qp' => 'hnsdv_create_qp@@HNS_1.0', 'hnsdv_is_supported' => 'hnsdv_is_supported@@HNS_1.0', 'hnsdv_query_device' => 'hnsdv_query_device@@HNS_1.0' }, 'Symbols' => { 'libhns.so.1.0.56.0' => { 'hnsdv_create_qp@@HNS_1.0' => 1, 'hnsdv_is_supported@@HNS_1.0' => 1, 'hnsdv_query_device@@HNS_1.0' => 1 } }, 'Target' => 'unix', 'TypeInfo' => { '1' => { 'Name' => 'void', 'Type' => 'Intrinsic' }, '10076' => { 'Header' => undef, 'Line' => '1145', 'Memb' => { '0' => { 'name' => 'addr', 'offset' => '0', 'type' => '2397' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '2385' }, '2' => { 'name' => 'lkey', 'offset' => '18', 'type' => '2385' } }, 'Name' => 'struct ibv_sge', 'Size' => '16', 'Type' => 'Struct' }, '10137' => { 'Header' => undef, 'Line' => '1161', 'Memb' => { '0' => { 'name' => 'imm_data', 'offset' => '0', 'type' => '2457' }, '1' => {
'name' => 'invalidate_rkey', 'offset' => '0', 'type' => '2385' } }, 'Size' => '4', 'Type' => 'Union' }, '10170' => { 'Header' => undef, 'Line' => '1166', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '2397' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '2385' } }, 'Size' => '16', 'Type' => 'Struct' }, '10208' => { 'Header' => undef, 'Line' => '1170', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '2397' }, '1' => { 'name' => 'compare_add', 'offset' => '8', 'type' => '2397' }, '2' => { 'name' => 'swap', 'offset' => '22', 'type' => '2397' }, '3' => { 'name' => 'rkey', 'offset' => '36', 'type' => '2385' } }, 'Size' => '32', 'Type' => 'Struct' }, '10274' => { 'Header' => undef, 'Line' => '1176', 'Memb' => { '0' => { 'name' => 'ah', 'offset' => '0', 'type' => '10380' }, '1' => { 'name' => 'remote_qpn', 'offset' => '8', 'type' => '2385' }, '2' => { 'name' => 'remote_qkey', 'offset' => '18', 'type' => '2385' } }, 'Size' => '16', 'Type' => 'Struct' }, '10325' => { 'Header' => undef, 'Line' => '1695', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3963' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '7705' }, '2' => { 'name' => 'handle', 'offset' => '22', 'type' => '2385' } }, 'Name' => 'struct ibv_ah', 'Size' => '24', 'Type' => 'Struct' }, '10380' => { 'BaseType' => '10325', 'Name' => 'struct ibv_ah*', 'Size' => '8', 'Type' => 'Pointer' }, '10385' => { 'Header' => undef, 'Line' => '1165', 'Memb' => { '0' => { 'name' => 'rdma', 'offset' => '0', 'type' => '10170' }, '1' => { 'name' => 'atomic', 'offset' => '0', 'type' => '10208' }, '2' => { 'name' => 'ud', 'offset' => '0', 'type' => '10274' } }, 'Size' => '32', 'Type' => 'Union' }, '10429' => { 'Header' => undef, 'Line' => '1183', 'Memb' => { '0' => { 'name' => 'remote_srqn', 'offset' => '0', 'type' => '2385' } }, 'Size' => '4', 'Type' => 'Struct' }, '10453' => { 'Header' => undef, 'Line' => '1182', 'Memb' => { '0' => { 'name' => 'xrc', 'offset' => '0', 'type' => '10429' } }, 'Size' => '4', 'Type' => 'Union' }, '10474' => { 'Header' => undef, 'Line' => '1188', 'Memb' => { '0' => { 'name' => 'mw', 'offset' => '0', 'type' => '10525' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '2385' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '7334' } }, 'Size' => '48', 'Type' => 'Struct' }, '10525' => { 'BaseType' => '7739', 'Name' => 'struct ibv_mw*', 'Size' => '8', 'Type' => 'Pointer' }, '10530' => { 'Header' => undef, 'Line' => '1193', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '78' }, '1' => { 'name' => 'hdr_sz', 'offset' => '8', 'type' => '2373' }, '2' => { 'name' => 'mss', 'offset' => '16', 'type' => '2373' } }, 'Size' => '16', 'Type' => 'Struct' }, '10582' => { 'Header' => undef, 'Line' => '1187', 'Memb' => { '0' => { 'name' => 'bind_mw', 'offset' => '0', 'type' => '10474' }, '1' => { 'name' => 'tso', 'offset' => '0', 'type' => '10530' } }, 'Size' => '48', 'Type' => 'Union' }, '10615' => { 'Header' => undef, 'Line' => '1151', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '2397' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '10752' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '10757' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '152' }, '4' => { 'name' => 'opcode', 'offset' => '40', 'type' => '9928' }, '5' => { 'name' => 'send_flags', 'offset' => '50', 'type' => '66' }, '6' => { 'name' => 'unnamed0', 'offset' => '54', 'type' => '10137' }, '7' => { 
'name' => 'wr', 'offset' => '64', 'type' => '10385' }, '8' => { 'name' => 'qp_type', 'offset' => '114', 'type' => '10453' }, '9' => { 'name' => 'unnamed1', 'offset' => '128', 'type' => '10582' } }, 'Name' => 'struct ibv_send_wr', 'Size' => '128', 'Type' => 'Struct' }, '10752' => { 'BaseType' => '10615', 'Name' => 'struct ibv_send_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '10757' => { 'BaseType' => '10076', 'Name' => 'struct ibv_sge*', 'Size' => '8', 'Type' => 'Pointer' }, '10762' => { 'Header' => undef, 'Line' => '1201', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '2397' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '10832' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '10757' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '152' } }, 'Name' => 'struct ibv_recv_wr', 'Size' => '32', 'Type' => 'Struct' }, '10832' => { 'BaseType' => '10762', 'Name' => 'struct ibv_recv_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '11092' => { 'Header' => undef, 'Line' => '1237', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '2397' }, '1' => { 'name' => 'send_flags', 'offset' => '8', 'type' => '66' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '7334' } }, 'Name' => 'struct ibv_mw_bind', 'Size' => '48', 'Type' => 'Struct' }, '11173' => { 'BaseType' => '10832', 'Name' => 'struct ibv_recv_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '116' => { 'BaseType' => '80', 'Header' => undef, 'Line' => '38', 'Name' => '__uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '12127' => { 'Header' => undef, 'Line' => '1502', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3963' }, '1' => { 'name' => 'fd', 'offset' => '8', 'type' => '152' }, '2' => { 'name' => 'refcnt', 'offset' => '18', 'type' => '152' } }, 'Name' => 'struct ibv_comp_channel', 'Size' => '16', 'Type' => 'Struct' }, '12182' => { 'BaseType' => '12127', 'Name' => 'struct ibv_comp_channel*', 'Size' => '8', 'Type' => 'Pointer' }, '13446' => { 'Header' => undef, 'Line' => '1969', 'Memb' => { '0' => { 'name' => '_dummy1', 'offset' => '0', 'type' => '13632' }, '1' => { 'name' => '_dummy2', 'offset' => '8', 'type' => '13648' } }, 'Name' => 'struct _ibv_device_ops', 'Size' => '16', 'Type' => 'Struct' }, '13508' => { 'BaseType' => '13513', 'Name' => 'struct ibv_device*', 'Size' => '8', 'Type' => 'Pointer' }, '13513' => { 'Header' => undef, 'Line' => '1979', 'Memb' => { '0' => { 'name' => '_ops', 'offset' => '0', 'type' => '13446' }, '1' => { 'name' => 'node_type', 'offset' => '22', 'type' => '3562' }, '2' => { 'name' => 'transport_type', 'offset' => '32', 'type' => '3626' }, '3' => { 'name' => 'name', 'offset' => '36', 'type' => '4577' }, '4' => { 'name' => 'dev_name', 'offset' => '136', 'type' => '4577' }, '5' => { 'name' => 'dev_path', 'offset' => '338', 'type' => '13679' }, '6' => { 'name' => 'ibdev_path', 'offset' => '1032', 'type' => '13679' } }, 'Name' => 'struct ibv_device', 'Size' => '664', 'Type' => 'Struct' }, '13632' => { 'Name' => 'struct ibv_context*(*)(struct ibv_device*, int)', 'Param' => { '0' => { 'type' => '13508' }, '1' => { 'type' => '152' } }, 'Return' => '3963', 'Size' => '8', 'Type' => 'FuncPtr' }, '13648' => { 'Name' => 'void(*)(struct ibv_context*)', 'Param' => { '0' => { 'type' => '3963' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '13679' => { 'BaseType' => '253', 'Name' => 'char[256]', 'Size' => '256', 'Type' => 'Array' }, '13695' => { 'Header' => undef, 'Line' => '1994', 'Memb' => { '0' => { 'name' => 
'_compat_query_device', 'offset' => '0', 'type' => '14183' }, '1' => { 'name' => '_compat_query_port', 'offset' => '8', 'type' => '14223' }, '10' => { 'name' => '_compat_create_cq', 'offset' => '128', 'type' => '14233' }, '11' => { 'name' => 'poll_cq', 'offset' => '136', 'type' => '14348' }, '12' => { 'name' => 'req_notify_cq', 'offset' => '150', 'type' => '14373' }, '13' => { 'name' => '_compat_cq_event', 'offset' => '260', 'type' => '14233' }, '14' => { 'name' => '_compat_resize_cq', 'offset' => '274', 'type' => '14233' }, '15' => { 'name' => '_compat_destroy_cq', 'offset' => '288', 'type' => '14233' }, '16' => { 'name' => '_compat_create_srq', 'offset' => '296', 'type' => '14233' }, '17' => { 'name' => '_compat_modify_srq', 'offset' => '310', 'type' => '14233' }, '18' => { 'name' => '_compat_query_srq', 'offset' => '324', 'type' => '14233' }, '19' => { 'name' => '_compat_destroy_srq', 'offset' => '338', 'type' => '14233' }, '2' => { 'name' => '_compat_alloc_pd', 'offset' => '22', 'type' => '14233' }, '20' => { 'name' => 'post_srq_recv', 'offset' => '352', 'type' => '14403' }, '21' => { 'name' => '_compat_create_qp', 'offset' => '360', 'type' => '14233' }, '22' => { 'name' => '_compat_query_qp', 'offset' => '374', 'type' => '14233' }, '23' => { 'name' => '_compat_modify_qp', 'offset' => '388', 'type' => '14233' }, '24' => { 'name' => '_compat_destroy_qp', 'offset' => '402', 'type' => '14233' }, '25' => { 'name' => 'post_send', 'offset' => '512', 'type' => '14438' }, '26' => { 'name' => 'post_recv', 'offset' => '520', 'type' => '14468' }, '27' => { 'name' => '_compat_create_ah', 'offset' => '534', 'type' => '14233' }, '28' => { 'name' => '_compat_destroy_ah', 'offset' => '548', 'type' => '14233' }, '29' => { 'name' => '_compat_attach_mcast', 'offset' => '562', 'type' => '14233' }, '3' => { 'name' => '_compat_dealloc_pd', 'offset' => '36', 'type' => '14233' }, '30' => { 'name' => '_compat_detach_mcast', 'offset' => '576', 'type' => '14233' }, '31' => { 'name' => '_compat_async_event', 'offset' => '584', 'type' => '14233' }, '4' => { 'name' => '_compat_reg_mr', 'offset' => '50', 'type' => '14233' }, '5' => { 'name' => '_compat_rereg_mr', 'offset' => '64', 'type' => '14233' }, '6' => { 'name' => '_compat_dereg_mr', 'offset' => '72', 'type' => '14233' }, '7' => { 'name' => 'alloc_mw', 'offset' => '86', 'type' => '14258' }, '8' => { 'name' => 'bind_mw', 'offset' => '100', 'type' => '14293' }, '9' => { 'name' => 'dealloc_mw', 'offset' => '114', 'type' => '14313' } }, 'Name' => 'struct ibv_context_ops', 'Size' => '256', 'Type' => 'Struct' }, '137534' => { 'Header' => undef, 'Line' => '29', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2397' }, '1' => { 'name' => 'create_flags', 'offset' => '8', 'type' => '2385' }, '2' => { 'name' => 'congest_type', 'offset' => '18', 'type' => '2361' }, '3' => { 'name' => 'reserved', 'offset' => '19', 'type' => '137600' } }, 'Name' => 'struct hnsdv_qp_init_attr', 'Size' => '16', 'Type' => 'Struct' }, '137600' => { 'BaseType' => '2361', 'Name' => 'uint8_t[3]', 'Size' => '3', 'Type' => 'Array' }, '137639' => { 'Header' => undef, 'Line' => '40', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2397' }, '1' => { 'name' => 'flags', 'offset' => '8', 'type' => '2397' }, '2' => { 'name' => 'congest_type', 'offset' => '22', 'type' => '2361' }, '3' => { 'name' => 'reserved', 'offset' => '23', 'type' => '137705' } }, 'Name' => 'struct hnsdv_context', 'Size' => '24', 'Type' => 'Struct' }, '137705' => { 'BaseType' => '2361', 
'Name' => 'uint8_t[7]', 'Size' => '7', 'Type' => 'Array' }, '140' => { 'BaseType' => '92', 'Header' => undef, 'Line' => '40', 'Name' => '__uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '14178' => { 'BaseType' => '4043', 'Name' => 'struct ibv_device_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '14183' => { 'Name' => 'int(*)(struct ibv_context*, struct ibv_device_attr*)', 'Param' => { '0' => { 'type' => '3963' }, '1' => { 'type' => '14178' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14213' => { 'BaseType' => '14218', 'Name' => 'struct _compat_ibv_port_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '14218' => { 'Name' => 'struct _compat_ibv_port_attr', 'Type' => 'Struct' }, '14223' => { 'Name' => 'int(*)(struct ibv_context*, uint8_t, struct _compat_ibv_port_attr*)', 'Param' => { '0' => { 'type' => '3963' }, '1' => { 'type' => '2361' }, '2' => { 'type' => '14213' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14233' => { 'Name' => 'void*(*)()', 'Return' => '78', 'Size' => '8', 'Type' => 'FuncPtr' }, '14258' => { 'Name' => 'struct ibv_mw*(*)(struct ibv_pd*, enum ibv_mw_type)', 'Param' => { '0' => { 'type' => '7705' }, '1' => { 'type' => '7710' } }, 'Return' => '10525', 'Size' => '8', 'Type' => 'FuncPtr' }, '14288' => { 'BaseType' => '11092', 'Name' => 'struct ibv_mw_bind*', 'Size' => '8', 'Type' => 'Pointer' }, '14293' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_mw*, struct ibv_mw_bind*)', 'Param' => { '0' => { 'type' => '6323' }, '1' => { 'type' => '10525' }, '2' => { 'type' => '14288' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14313' => { 'Name' => 'int(*)(struct ibv_mw*)', 'Param' => { '0' => { 'type' => '10525' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14343' => { 'BaseType' => '7147', 'Name' => 'struct ibv_wc*', 'Size' => '8', 'Type' => 'Pointer' }, '14348' => { 'Name' => 'int(*)(struct ibv_cq*, int, struct ibv_wc*)', 'Param' => { '0' => { 'type' => '6123' }, '1' => { 'type' => '152' }, '2' => { 'type' => '14343' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14373' => { 'Name' => 'int(*)(struct ibv_cq*, int)', 'Param' => { '0' => { 'type' => '6123' }, '1' => { 'type' => '152' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14403' => { 'Name' => 'int(*)(struct ibv_srq*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '6439' }, '1' => { 'type' => '10832' }, '2' => { 'type' => '11173' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14433' => { 'BaseType' => '10752', 'Name' => 'struct ibv_send_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '14438' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_send_wr*, struct ibv_send_wr**)', 'Param' => { '0' => { 'type' => '6323' }, '1' => { 'type' => '10752' }, '2' => { 'type' => '14433' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '14468' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '6323' }, '1' => { 'type' => '10832' }, '2' => { 'type' => '11173' } }, 'Return' => '152', 'Size' => '8', 'Type' => 'FuncPtr' }, '145132' => { 'BaseType' => '137639', 'Name' => 'struct hnsdv_context*', 'Size' => '8', 'Type' => 'Pointer' }, '145331' => { 'BaseType' => '137534', 'Name' => 'struct hnsdv_qp_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '152' => { 'Name' => 'int', 'Size' => '4', 'Type' => 'Intrinsic' }, '16449' => { 'BaseType' => '9054', 'Name' => 'struct ibv_qp_init_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, 
'169' => { 'BaseType' => '66', 'Header' => undef, 'Line' => '42', 'Name' => '__uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '193' => { 'BaseType' => '54', 'Header' => undef, 'Line' => '45', 'Name' => '__uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '2309' => { 'Name' => '_Bool', 'Size' => '1', 'Type' => 'Intrinsic' }, '2361' => { 'BaseType' => '116', 'Header' => undef, 'Line' => '24', 'Name' => 'uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '2373' => { 'BaseType' => '140', 'Header' => undef, 'Line' => '25', 'Name' => 'uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '2385' => { 'BaseType' => '169', 'Header' => undef, 'Line' => '26', 'Name' => 'uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '2397' => { 'BaseType' => '193', 'Header' => undef, 'Line' => '27', 'Name' => 'uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '2433' => { 'BaseType' => '66', 'Header' => undef, 'Line' => '27', 'Name' => '__u32', 'Size' => '4', 'Type' => 'Typedef' }, '2445' => { 'BaseType' => '420', 'Header' => undef, 'Line' => '31', 'Name' => '__u64', 'Size' => '8', 'Type' => 'Typedef' }, '2457' => { 'BaseType' => '2433', 'Header' => undef, 'Line' => '27', 'Name' => '__be32', 'Size' => '4', 'Type' => 'Typedef' }, '2469' => { 'BaseType' => '2445', 'Header' => undef, 'Line' => '29', 'Name' => '__be64', 'Size' => '8', 'Type' => 'Typedef' }, '253' => { 'Name' => 'char', 'Size' => '1', 'Type' => 'Intrinsic' }, '3562' => { 'Header' => undef, 'Line' => '95', 'Memb' => { '0' => { 'name' => 'IBV_NODE_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_NODE_CA', 'value' => '1' }, '2' => { 'name' => 'IBV_NODE_SWITCH', 'value' => '2' }, '3' => { 'name' => 'IBV_NODE_ROUTER', 'value' => '3' }, '4' => { 'name' => 'IBV_NODE_RNIC', 'value' => '4' }, '5' => { 'name' => 'IBV_NODE_USNIC', 'value' => '5' }, '6' => { 'name' => 'IBV_NODE_USNIC_UDP', 'value' => '6' }, '7' => { 'name' => 'IBV_NODE_UNSPECIFIED', 'value' => '7' } }, 'Name' => 'enum ibv_node_type', 'Size' => '4', 'Type' => 'Enum' }, '3626' => { 'Header' => undef, 'Line' => '106', 'Memb' => { '0' => { 'name' => 'IBV_TRANSPORT_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_TRANSPORT_IB', 'value' => '0' }, '2' => { 'name' => 'IBV_TRANSPORT_IWARP', 'value' => '1' }, '3' => { 'name' => 'IBV_TRANSPORT_USNIC', 'value' => '2' }, '4' => { 'name' => 'IBV_TRANSPORT_USNIC_UDP', 'value' => '3' }, '5' => { 'name' => 'IBV_TRANSPORT_UNSPECIFIED', 'value' => '4' } }, 'Name' => 'enum ibv_transport_type', 'Size' => '4', 'Type' => 'Enum' }, '3678' => { 'Header' => undef, 'Line' => '155', 'Memb' => { '0' => { 'name' => 'IBV_ATOMIC_NONE', 'value' => '0' }, '1' => { 'name' => 'IBV_ATOMIC_HCA', 'value' => '1' }, '2' => { 'name' => 'IBV_ATOMIC_GLOB', 'value' => '2' } }, 'Name' => 'enum ibv_atomic_cap', 'Size' => '4', 'Type' => 'Enum' }, '3845' => { 'Header' => undef, 'Line' => '2037', 'Memb' => { '0' => { 'name' => 'device', 'offset' => '0', 'type' => '13508' }, '1' => { 'name' => 'ops', 'offset' => '8', 'type' => '13695' }, '2' => { 'name' => 'cmd_fd', 'offset' => '612', 'type' => '152' }, '3' => { 'name' => 'async_fd', 'offset' => '616', 'type' => '152' }, '4' => { 'name' => 'num_comp_vectors', 'offset' => '626', 'type' => '152' }, '5' => { 'name' => 'mutex', 'offset' => '640', 'type' => '853' }, '6' => { 'name' => 'abi_compat', 'offset' => '800', 'type' => '78' } }, 'Name' => 'struct ibv_context', 'Size' => '328', 'Type' => 'Struct' }, '3963' => { 'BaseType' => '3845', 'Name' => 'struct ibv_context*', 'Size' => '8', 'Type' => 'Pointer' }, 
'4043' => { 'Header' => undef, 'Line' => '182', 'Memb' => { '0' => { 'name' => 'fw_ver', 'offset' => '0', 'type' => '4577' }, '1' => { 'name' => 'node_guid', 'offset' => '100', 'type' => '2469' }, '10' => { 'name' => 'device_cap_flags', 'offset' => '278', 'type' => '66' }, '11' => { 'name' => 'max_sge', 'offset' => '288', 'type' => '152' }, '12' => { 'name' => 'max_sge_rd', 'offset' => '292', 'type' => '152' }, '13' => { 'name' => 'max_cq', 'offset' => '296', 'type' => '152' }, '14' => { 'name' => 'max_cqe', 'offset' => '306', 'type' => '152' }, '15' => { 'name' => 'max_mr', 'offset' => '310', 'type' => '152' }, '16' => { 'name' => 'max_pd', 'offset' => '320', 'type' => '152' }, '17' => { 'name' => 'max_qp_rd_atom', 'offset' => '324', 'type' => '152' }, '18' => { 'name' => 'max_ee_rd_atom', 'offset' => '328', 'type' => '152' }, '19' => { 'name' => 'max_res_rd_atom', 'offset' => '338', 'type' => '152' }, '2' => { 'name' => 'sys_image_guid', 'offset' => '114', 'type' => '2469' }, '20' => { 'name' => 'max_qp_init_rd_atom', 'offset' => '342', 'type' => '152' }, '21' => { 'name' => 'max_ee_init_rd_atom', 'offset' => '352', 'type' => '152' }, '22' => { 'name' => 'atomic_cap', 'offset' => '356', 'type' => '3678' }, '23' => { 'name' => 'max_ee', 'offset' => '360', 'type' => '152' }, '24' => { 'name' => 'max_rdd', 'offset' => '370', 'type' => '152' }, '25' => { 'name' => 'max_mw', 'offset' => '374', 'type' => '152' }, '26' => { 'name' => 'max_raw_ipv6_qp', 'offset' => '384', 'type' => '152' }, '27' => { 'name' => 'max_raw_ethy_qp', 'offset' => '388', 'type' => '152' }, '28' => { 'name' => 'max_mcast_grp', 'offset' => '392', 'type' => '152' }, '29' => { 'name' => 'max_mcast_qp_attach', 'offset' => '402', 'type' => '152' }, '3' => { 'name' => 'max_mr_size', 'offset' => '128', 'type' => '2397' }, '30' => { 'name' => 'max_total_mcast_qp_attach', 'offset' => '406', 'type' => '152' }, '31' => { 'name' => 'max_ah', 'offset' => '512', 'type' => '152' }, '32' => { 'name' => 'max_fmr', 'offset' => '516', 'type' => '152' }, '33' => { 'name' => 'max_map_per_fmr', 'offset' => '520', 'type' => '152' }, '34' => { 'name' => 'max_srq', 'offset' => '530', 'type' => '152' }, '35' => { 'name' => 'max_srq_wr', 'offset' => '534', 'type' => '152' }, '36' => { 'name' => 'max_srq_sge', 'offset' => '544', 'type' => '152' }, '37' => { 'name' => 'max_pkeys', 'offset' => '548', 'type' => '2373' }, '38' => { 'name' => 'local_ca_ack_delay', 'offset' => '550', 'type' => '2361' }, '39' => { 'name' => 'phys_port_cnt', 'offset' => '551', 'type' => '2361' }, '4' => { 'name' => 'page_size_cap', 'offset' => '136', 'type' => '2397' }, '5' => { 'name' => 'vendor_id', 'offset' => '150', 'type' => '2385' }, '6' => { 'name' => 'vendor_part_id', 'offset' => '256', 'type' => '2385' }, '7' => { 'name' => 'hw_ver', 'offset' => '260', 'type' => '2385' }, '8' => { 'name' => 'max_qp', 'offset' => '264', 'type' => '152' }, '9' => { 'name' => 'max_qp_wr', 'offset' => '274', 'type' => '152' } }, 'Name' => 'struct ibv_device_attr', 'Size' => '232', 'Type' => 'Struct' }, '42' => { 'BaseType' => '54', 'Header' => undef, 'Line' => '214', 'Name' => 'size_t', 'Size' => '8', 'Type' => 'Typedef' }, '420' => { 'Name' => 'unsigned long long', 'Size' => '8', 'Type' => 'Intrinsic' }, '4577' => { 'BaseType' => '253', 'Name' => 'char[64]', 'Size' => '64', 'Type' => 'Array' }, '54' => { 'Name' => 'unsigned long', 'Size' => '8', 'Type' => 'Intrinsic' }, '5983' => { 'Header' => undef, 'Line' => '1508', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' 
=> '3963' }, '1' => { 'name' => 'channel', 'offset' => '8', 'type' => '12182' }, '2' => { 'name' => 'cq_context', 'offset' => '22', 'type' => '78' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '2385' }, '4' => { 'name' => 'cqe', 'offset' => '40', 'type' => '152' }, '5' => { 'name' => 'mutex', 'offset' => '50', 'type' => '853' }, '6' => { 'name' => 'cond', 'offset' => '114', 'type' => '927' }, '7' => { 'name' => 'comp_events_completed', 'offset' => '288', 'type' => '2385' }, '8' => { 'name' => 'async_events_completed', 'offset' => '292', 'type' => '2385' } }, 'Name' => 'struct ibv_cq', 'Size' => '128', 'Type' => 'Struct' }, '6123' => { 'BaseType' => '5983', 'Name' => 'struct ibv_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '6128' => { 'Header' => undef, 'Line' => '1283', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3963' }, '1' => { 'name' => 'qp_context', 'offset' => '8', 'type' => '78' }, '10' => { 'name' => 'mutex', 'offset' => '100', 'type' => '853' }, '11' => { 'name' => 'cond', 'offset' => '260', 'type' => '927' }, '12' => { 'name' => 'events_completed', 'offset' => '338', 'type' => '2385' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '7705' }, '3' => { 'name' => 'send_cq', 'offset' => '36', 'type' => '6123' }, '4' => { 'name' => 'recv_cq', 'offset' => '50', 'type' => '6123' }, '5' => { 'name' => 'srq', 'offset' => '64', 'type' => '6439' }, '6' => { 'name' => 'handle', 'offset' => '72', 'type' => '2385' }, '7' => { 'name' => 'qp_num', 'offset' => '82', 'type' => '2385' }, '8' => { 'name' => 'state', 'offset' => '86', 'type' => '9380' }, '9' => { 'name' => 'qp_type', 'offset' => '96', 'type' => '8724' } }, 'Name' => 'struct ibv_qp', 'Size' => '160', 'Type' => 'Struct' }, '6323' => { 'BaseType' => '6128', 'Name' => 'struct ibv_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '6328' => { 'Header' => undef, 'Line' => '1243', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3963' }, '1' => { 'name' => 'srq_context', 'offset' => '8', 'type' => '78' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '7705' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '2385' }, '4' => { 'name' => 'mutex', 'offset' => '50', 'type' => '853' }, '5' => { 'name' => 'cond', 'offset' => '114', 'type' => '927' }, '6' => { 'name' => 'events_completed', 'offset' => '288', 'type' => '2385' } }, 'Name' => 'struct ibv_srq', 'Size' => '128', 'Type' => 'Struct' }, '6439' => { 'BaseType' => '6328', 'Name' => 'struct ibv_srq*', 'Size' => '8', 'Type' => 'Pointer' }, '66' => { 'Name' => 'unsigned int', 'Size' => '4', 'Type' => 'Intrinsic' }, '6685' => { 'Header' => undef, 'Line' => '485', 'Memb' => { '0' => { 'name' => 'IBV_WC_SUCCESS', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_LOC_LEN_ERR', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_REM_ACCESS_ERR', 'value' => '10' }, '11' => { 'name' => 'IBV_WC_REM_OP_ERR', 'value' => '11' }, '12' => { 'name' => 'IBV_WC_RETRY_EXC_ERR', 'value' => '12' }, '13' => { 'name' => 'IBV_WC_RNR_RETRY_EXC_ERR', 'value' => '13' }, '14' => { 'name' => 'IBV_WC_LOC_RDD_VIOL_ERR', 'value' => '14' }, '15' => { 'name' => 'IBV_WC_REM_INV_RD_REQ_ERR', 'value' => '15' }, '16' => { 'name' => 'IBV_WC_REM_ABORT_ERR', 'value' => '16' }, '17' => { 'name' => 'IBV_WC_INV_EECN_ERR', 'value' => '17' }, '18' => { 'name' => 'IBV_WC_INV_EEC_STATE_ERR', 'value' => '18' }, '19' => { 'name' => 'IBV_WC_FATAL_ERR', 'value' => '19' }, '2' => { 'name' => 'IBV_WC_LOC_QP_OP_ERR', 'value' => '2' }, '20' => { 'name' => 'IBV_WC_RESP_TIMEOUT_ERR', 
'value' => '20' }, '21' => { 'name' => 'IBV_WC_GENERAL_ERR', 'value' => '21' }, '22' => { 'name' => 'IBV_WC_TM_ERR', 'value' => '22' }, '23' => { 'name' => 'IBV_WC_TM_RNDV_INCOMPLETE', 'value' => '23' }, '3' => { 'name' => 'IBV_WC_LOC_EEC_OP_ERR', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_LOC_PROT_ERR', 'value' => '4' }, '5' => { 'name' => 'IBV_WC_WR_FLUSH_ERR', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_MW_BIND_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_BAD_RESP_ERR', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_LOC_ACCESS_ERR', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_REM_INV_REQ_ERR', 'value' => '9' } }, 'Name' => 'enum ibv_wc_status', 'Size' => '4', 'Type' => 'Enum' }, '6846' => { 'Header' => undef, 'Line' => '513', 'Memb' => { '0' => { 'name' => 'IBV_WC_SEND', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_RDMA_WRITE', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_RECV', 'value' => '128' }, '11' => { 'name' => 'IBV_WC_RECV_RDMA_WITH_IMM', 'value' => '129' }, '12' => { 'name' => 'IBV_WC_TM_ADD', 'value' => '130' }, '13' => { 'name' => 'IBV_WC_TM_DEL', 'value' => '131' }, '14' => { 'name' => 'IBV_WC_TM_SYNC', 'value' => '132' }, '15' => { 'name' => 'IBV_WC_TM_RECV', 'value' => '133' }, '16' => { 'name' => 'IBV_WC_TM_NO_TAG', 'value' => '134' }, '17' => { 'name' => 'IBV_WC_DRIVER1', 'value' => '135' }, '18' => { 'name' => 'IBV_WC_DRIVER2', 'value' => '136' }, '19' => { 'name' => 'IBV_WC_DRIVER3', 'value' => '137' }, '2' => { 'name' => 'IBV_WC_RDMA_READ', 'value' => '2' }, '3' => { 'name' => 'IBV_WC_COMP_SWAP', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_FETCH_ADD', 'value' => '4' }, '5' => { 'name' => 'IBV_WC_BIND_MW', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_LOCAL_INV', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_TSO', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_FLUSH', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_ATOMIC_WRITE', 'value' => '9' } }, 'Name' => 'enum ibv_wc_opcode', 'Size' => '4', 'Type' => 'Enum' }, '7114' => { 'Header' => undef, 'Line' => '598', 'Memb' => { '0' => { 'name' => 'imm_data', 'offset' => '0', 'type' => '2457' }, '1' => { 'name' => 'invalidated_rkey', 'offset' => '0', 'type' => '2385' } }, 'Size' => '4', 'Type' => 'Union' }, '7147' => { 'Header' => undef, 'Line' => '589', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '2397' }, '1' => { 'name' => 'status', 'offset' => '8', 'type' => '6685' }, '10' => { 'name' => 'slid', 'offset' => '66', 'type' => '2373' }, '11' => { 'name' => 'sl', 'offset' => '68', 'type' => '2361' }, '12' => { 'name' => 'dlid_path_bits', 'offset' => '69', 'type' => '2361' }, '2' => { 'name' => 'opcode', 'offset' => '18', 'type' => '6846' }, '3' => { 'name' => 'vendor_err', 'offset' => '22', 'type' => '2385' }, '4' => { 'name' => 'byte_len', 'offset' => '32', 'type' => '2385' }, '5' => { 'name' => 'unnamed0', 'offset' => '36', 'type' => '7114' }, '6' => { 'name' => 'qp_num', 'offset' => '40', 'type' => '2385' }, '7' => { 'name' => 'src_qp', 'offset' => '50', 'type' => '2385' }, '8' => { 'name' => 'wc_flags', 'offset' => '54', 'type' => '66' }, '9' => { 'name' => 'pkey_index', 'offset' => '64', 'type' => '2373' } }, 'Name' => 'struct ibv_wc', 'Size' => '48', 'Type' => 'Struct' }, '7334' => { 'Header' => undef, 'Line' => '625', 'Memb' => { '0' => { 'name' => 'mr', 'offset' => '0', 'type' => '7519' }, '1' => { 'name' => 'addr', 'offset' => '8', 'type' => '2397' }, '2' => { 'name' => 'length', 'offset' => '22', 'type' => '2397' }, '3' => { 'name' => 'mw_access_flags', 'offset' => '36', 'type' => '66' } 
}, 'Name' => 'struct ibv_mw_bind_info', 'Size' => '32', 'Type' => 'Struct' }, '7408' => { 'Header' => undef, 'Line' => '668', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3963' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '7705' }, '2' => { 'name' => 'addr', 'offset' => '22', 'type' => '78' }, '3' => { 'name' => 'length', 'offset' => '36', 'type' => '42' }, '4' => { 'name' => 'handle', 'offset' => '50', 'type' => '2385' }, '5' => { 'name' => 'lkey', 'offset' => '54', 'type' => '2385' }, '6' => { 'name' => 'rkey', 'offset' => '64', 'type' => '2385' } }, 'Name' => 'struct ibv_mr', 'Size' => '48', 'Type' => 'Struct' }, '7519' => { 'BaseType' => '7408', 'Name' => 'struct ibv_mr*', 'Size' => '8', 'Type' => 'Pointer' }, '7524' => { 'Header' => undef, 'Line' => '632', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3963' }, '1' => { 'name' => 'handle', 'offset' => '8', 'type' => '2385' } }, 'Name' => 'struct ibv_pd', 'Size' => '16', 'Type' => 'Struct' }, '7677' => { 'Header' => undef, 'Line' => '657', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3963' } }, 'Name' => 'struct ibv_xrcd', 'Size' => '8', 'Type' => 'Struct' }, '7705' => { 'BaseType' => '7524', 'Name' => 'struct ibv_pd*', 'Size' => '8', 'Type' => 'Pointer' }, '7710' => { 'Header' => undef, 'Line' => '678', 'Memb' => { '0' => { 'name' => 'IBV_MW_TYPE_1', 'value' => '1' }, '1' => { 'name' => 'IBV_MW_TYPE_2', 'value' => '2' } }, 'Name' => 'enum ibv_mw_type', 'Size' => '4', 'Type' => 'Enum' }, '7739' => { 'Header' => undef, 'Line' => '683', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3963' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '7705' }, '2' => { 'name' => 'rkey', 'offset' => '22', 'type' => '2385' }, '3' => { 'name' => 'handle', 'offset' => '32', 'type' => '2385' }, '4' => { 'name' => 'type', 'offset' => '36', 'type' => '7710' } }, 'Name' => 'struct ibv_mw', 'Size' => '32', 'Type' => 'Struct' }, '78' => { 'BaseType' => '1', 'Name' => 'void*', 'Size' => '8', 'Type' => 'Pointer' }, '80' => { 'Name' => 'unsigned char', 'Size' => '1', 'Type' => 'Intrinsic' }, '8316' => { 'BaseType' => '7677', 'Name' => 'struct ibv_xrcd*', 'Size' => '8', 'Type' => 'Pointer' }, '8593' => { 'Header' => undef, 'Line' => '880', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '3963' }, '1' => { 'name' => 'ind_tbl_handle', 'offset' => '8', 'type' => '152' }, '2' => { 'name' => 'ind_tbl_num', 'offset' => '18', 'type' => '152' }, '3' => { 'name' => 'comp_mask', 'offset' => '22', 'type' => '2385' } }, 'Name' => 'struct ibv_rwq_ind_table', 'Size' => '24', 'Type' => 'Struct' }, '8724' => { 'Header' => undef, 'Line' => '901', 'Memb' => { '0' => { 'name' => 'IBV_QPT_RC', 'value' => '2' }, '1' => { 'name' => 'IBV_QPT_UC', 'value' => '3' }, '2' => { 'name' => 'IBV_QPT_UD', 'value' => '4' }, '3' => { 'name' => 'IBV_QPT_RAW_PACKET', 'value' => '8' }, '4' => { 'name' => 'IBV_QPT_XRC_SEND', 'value' => '9' }, '5' => { 'name' => 'IBV_QPT_XRC_RECV', 'value' => '10' }, '6' => { 'name' => 'IBV_QPT_DRIVER', 'value' => '255' } }, 'Name' => 'enum ibv_qp_type', 'Size' => '4', 'Type' => 'Enum' }, '8783' => { 'Header' => undef, 'Line' => '911', 'Memb' => { '0' => { 'name' => 'max_send_wr', 'offset' => '0', 'type' => '2385' }, '1' => { 'name' => 'max_recv_wr', 'offset' => '4', 'type' => '2385' }, '2' => { 'name' => 'max_send_sge', 'offset' => '8', 'type' => '2385' }, '3' => { 'name' => 'max_recv_sge', 'offset' => '18', 'type' => '2385' }, '4' => { 
'name' => 'max_inline_data', 'offset' => '22', 'type' => '2385' } }, 'Name' => 'struct ibv_qp_cap', 'Size' => '20', 'Type' => 'Struct' }, '8979' => { 'Header' => undef, 'Line' => '963', 'Memb' => { '0' => { 'name' => 'rx_hash_function', 'offset' => '0', 'type' => '2361' }, '1' => { 'name' => 'rx_hash_key_len', 'offset' => '1', 'type' => '2361' }, '2' => { 'name' => 'rx_hash_key', 'offset' => '8', 'type' => '9049' }, '3' => { 'name' => 'rx_hash_fields_mask', 'offset' => '22', 'type' => '2397' } }, 'Name' => 'struct ibv_rx_hash_conf', 'Size' => '24', 'Type' => 'Struct' }, '9049' => { 'BaseType' => '2361', 'Name' => 'uint8_t*', 'Size' => '8', 'Type' => 'Pointer' }, '9054' => { 'Header' => undef, 'Line' => '972', 'Memb' => { '0' => { 'name' => 'qp_context', 'offset' => '0', 'type' => '78' }, '1' => { 'name' => 'send_cq', 'offset' => '8', 'type' => '6123' }, '10' => { 'name' => 'create_flags', 'offset' => '128', 'type' => '2385' }, '11' => { 'name' => 'max_tso_header', 'offset' => '132', 'type' => '2373' }, '12' => { 'name' => 'rwq_ind_tbl', 'offset' => '136', 'type' => '9291' }, '13' => { 'name' => 'rx_hash_conf', 'offset' => '150', 'type' => '8979' }, '14' => { 'name' => 'source_qpn', 'offset' => '288', 'type' => '2385' }, '15' => { 'name' => 'send_ops_flags', 'offset' => '296', 'type' => '2397' }, '2' => { 'name' => 'recv_cq', 'offset' => '22', 'type' => '6123' }, '3' => { 'name' => 'srq', 'offset' => '36', 'type' => '6439' }, '4' => { 'name' => 'cap', 'offset' => '50', 'type' => '8783' }, '5' => { 'name' => 'qp_type', 'offset' => '82', 'type' => '8724' }, '6' => { 'name' => 'sq_sig_all', 'offset' => '86', 'type' => '152' }, '7' => { 'name' => 'comp_mask', 'offset' => '96', 'type' => '2385' }, '8' => { 'name' => 'pd', 'offset' => '100', 'type' => '7705' }, '9' => { 'name' => 'xrcd', 'offset' => '114', 'type' => '8316' } }, 'Name' => 'struct ibv_qp_init_attr_ex', 'Size' => '136', 'Type' => 'Struct' }, '92' => { 'Name' => 'unsigned short', 'Size' => '2', 'Type' => 'Intrinsic' }, '9291' => { 'BaseType' => '8593', 'Name' => 'struct ibv_rwq_ind_table*', 'Size' => '8', 'Type' => 'Pointer' }, '9380' => { 'Header' => undef, 'Line' => '1050', 'Memb' => { '0' => { 'name' => 'IBV_QPS_RESET', 'value' => '0' }, '1' => { 'name' => 'IBV_QPS_INIT', 'value' => '1' }, '2' => { 'name' => 'IBV_QPS_RTR', 'value' => '2' }, '3' => { 'name' => 'IBV_QPS_RTS', 'value' => '3' }, '4' => { 'name' => 'IBV_QPS_SQD', 'value' => '4' }, '5' => { 'name' => 'IBV_QPS_SQE', 'value' => '5' }, '6' => { 'name' => 'IBV_QPS_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_QPS_UNKNOWN', 'value' => '7' } }, 'Name' => 'enum ibv_qp_state', 'Size' => '4', 'Type' => 'Enum' }, '9928' => { 'Header' => undef, 'Line' => '1103', 'Memb' => { '0' => { 'name' => 'IBV_WR_RDMA_WRITE', 'value' => '0' }, '1' => { 'name' => 'IBV_WR_RDMA_WRITE_WITH_IMM', 'value' => '1' }, '10' => { 'name' => 'IBV_WR_TSO', 'value' => '10' }, '11' => { 'name' => 'IBV_WR_DRIVER1', 'value' => '11' }, '12' => { 'name' => 'IBV_WR_FLUSH', 'value' => '14' }, '13' => { 'name' => 'IBV_WR_ATOMIC_WRITE', 'value' => '15' }, '2' => { 'name' => 'IBV_WR_SEND', 'value' => '2' }, '3' => { 'name' => 'IBV_WR_SEND_WITH_IMM', 'value' => '3' }, '4' => { 'name' => 'IBV_WR_RDMA_READ', 'value' => '4' }, '5' => { 'name' => 'IBV_WR_ATOMIC_CMP_AND_SWP', 'value' => '5' }, '6' => { 'name' => 'IBV_WR_ATOMIC_FETCH_AND_ADD', 'value' => '6' }, '7' => { 'name' => 'IBV_WR_LOCAL_INV', 'value' => '7' }, '8' => { 'name' => 'IBV_WR_BIND_MW', 'value' => '8' }, '9' => { 'name' => 'IBV_WR_SEND_WITH_INV', 'value' 
=> '9' } }, 'Name' => 'enum ibv_wr_opcode', 'Size' => '4', 'Type' => 'Enum' } }, 'UndefinedSymbols' => { 'libhns.so.1.0.56.0' => { '_ITM_deregisterTMCloneTable' => 0, '_ITM_registerTMCloneTable' => 0, '__cxa_finalize@GLIBC_2.2.5' => 0, '__errno_location@GLIBC_2.2.5' => 0, '__gmon_start__' => 0, '__snprintf_chk@GLIBC_2.3.4' => 0, '__stack_chk_fail@GLIBC_2.4' => 0, '__verbs_log@IBVERBS_PRIVATE_34' => 0, '_verbs_init_and_alloc_context@IBVERBS_PRIVATE_34' => 0, 'calloc@GLIBC_2.2.5' => 0, 'fcntl@GLIBC_2.2.5' => 0, 'free@GLIBC_2.2.5' => 0, 'getenv@GLIBC_2.2.5' => 0, 'getrandom@GLIBC_2.25' => 0, 'ibv_cmd_alloc_mw@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_alloc_pd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_close_xrcd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_ah@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_cq_ex@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_qp_ex2@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_srq_ex@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_dealloc_mw@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_dealloc_pd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_dereg_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_ah@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_cq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_srq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_get_context@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_cq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_qp_ex@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_srq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_open_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_open_xrcd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_device_any@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_port@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_srq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_reg_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_rereg_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_dofork_range@IBVERBS_1.1' => 0, 'ibv_dontfork_range@IBVERBS_1.1' => 0, 'ibv_query_gid_type@IBVERBS_PRIVATE_34' => 0, 'ibv_resolve_eth_l2_from_gid@IBVERBS_1.1' => 0, 'malloc@GLIBC_2.2.5' => 0, 'memcpy@GLIBC_2.14' => 0, 'memset@GLIBC_2.2.5' => 0, 'mmap@GLIBC_2.2.5' => 0, 'munmap@GLIBC_2.2.5' => 0, 'pthread_mutex_destroy@GLIBC_2.2.5' => 0, 'pthread_mutex_init@GLIBC_2.2.5' => 0, 'pthread_mutex_lock@GLIBC_2.2.5' => 0, 'pthread_mutex_unlock@GLIBC_2.2.5' => 0, 'pthread_spin_destroy@GLIBC_2.34' => 0, 'pthread_spin_init@GLIBC_2.34' => 0, 'pthread_spin_lock@GLIBC_2.34' => 0, 'pthread_spin_unlock@GLIBC_2.34' => 0, 'rand_r@GLIBC_2.2.5' => 0, 'sysconf@GLIBC_2.2.5' => 0, 'time@GLIBC_2.2.5' => 0, 'verbs_register_driver_34@IBVERBS_PRIVATE_34' => 0, 'verbs_set_ops@IBVERBS_PRIVATE_34' => 0, 'verbs_uninit_context@IBVERBS_PRIVATE_34' => 0 } }, 'WordSize' => '8' }; rdma-core-56.1/ABI/ibmad.dump000066400000000000000000020174231477342711600157010ustar00rootroot00000000000000$VAR1 = { 'ABI_DUMPER_VERSION' => '1.2', 'ABI_DUMP_VERSION' => '3.5', 'Arch' => 'x86_64', 'GccVersion' => '12.3.0', 'Headers' => {}, 'Language' => 'C', 'LibraryName' => 'libibmad.so.5.5.56.0', 'LibraryVersion' => 'ibmad', 'MissedOffsets' => '1', 'MissedRegs' => '1', 'NameSpaces' => {}, 'Needed' => { 'libc.so.6' => 1, 'libibumad.so.3' => 1 }, 'Sources' => {}, 'SymbolInfo' => { '100041' => { 'Header' => undef, 'Line' => '373', 'Param' => { '0' => { 'name' => 'dev_name', 'type' => '227' }, '1' => { 'name' => 'dev_port', 'type' => '72' }, '2' => { 'name' => 'mgmt_classes', 'type' => '5588' }, '3' => { 'name' => 'num_classes', 'type' => '72' } }, 'Return' => '1731', 'ShortName' => 'mad_rpc_open_port' }, '101179' => { 'Header' => undef, 'Line' => '345', 'Param' => { '0' => { 'name' => 
'dev_name', 'type' => '227' }, '1' => { 'name' => 'dev_port', 'type' => '72' }, '2' => { 'name' => 'mgmt_classes', 'type' => '5588' }, '3' => { 'name' => 'num_classes', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'madrpc_init' }, '102098' => { 'Header' => undef, 'Line' => '338', 'Param' => { '0' => { 'name' => 'rpc', 'type' => '96955' }, '1' => { 'name' => 'dport', 'type' => '1829' }, '2' => { 'name' => 'rmpp', 'type' => '96960' }, '3' => { 'name' => 'data', 'type' => '220' } }, 'Return' => '220', 'ShortName' => 'madrpc_rmpp' }, '102255' => { 'Header' => undef, 'Line' => '333', 'Param' => { '0' => { 'name' => 'rpc', 'type' => '96955' }, '1' => { 'name' => 'dport', 'type' => '1829' }, '2' => { 'name' => 'payload', 'type' => '220' }, '3' => { 'name' => 'rcvdata', 'type' => '220' } }, 'Return' => '220', 'ShortName' => 'madrpc' }, '102412' => { 'Header' => undef, 'Line' => '1509', 'Param' => { '0' => { 'name' => 'port', 'type' => '1881' }, '1' => { 'name' => 'rpc', 'type' => '96955' }, '2' => { 'name' => 'dport', 'type' => '1829' }, '3' => { 'name' => 'rmpp', 'type' => '96960' }, '4' => { 'name' => 'data', 'type' => '220' } }, 'Return' => '220', 'ShortName' => 'mad_rpc_rmpp' }, '107228' => { 'Header' => undef, 'Line' => '112', 'Param' => { '0' => { 'name' => 'port', 'type' => '1731' }, '1' => { 'name' => 'class', 'type' => '72' } }, 'Return' => '72', 'ShortName' => 'mad_rpc_class_agent' }, '107335' => { 'Header' => undef, 'Line' => '102', 'Return' => '72', 'ShortName' => 'madrpc_portid' }, '107365' => { 'Header' => undef, 'Line' => '97', 'Param' => { '0' => { 'name' => 'port', 'type' => '1731' }, '1' => { 'name' => 'timeout', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_rpc_set_timeout' }, '107420' => { 'Header' => undef, 'Line' => '92', 'Param' => { '0' => { 'name' => 'port', 'type' => '1731' }, '1' => { 'name' => 'retries', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_rpc_set_retries' }, '107475' => { 'Header' => undef, 'Line' => '86', 'Param' => { '0' => { 'name' => 'timeout', 'type' => '72' } }, 'Return' => '72', 'ShortName' => 'madrpc_set_timeout' }, '107522' => { 'Header' => undef, 'Line' => '79', 'Param' => { '0' => { 'name' => 'retries', 'type' => '72' } }, 'Return' => '72', 'ShortName' => 'madrpc_set_retries' }, '107569' => { 'Header' => undef, 'Line' => '73', 'Param' => { '0' => { 'name' => 'madbuf', 'type' => '220' }, '1' => { 'name' => 'len', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'madrpc_save_mad' }, '107624' => { 'Header' => undef, 'Line' => '68', 'Param' => { '0' => { 'name' => 'set', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'madrpc_show_errors' }, '114840' => { 'Header' => undef, 'Line' => '164', 'Param' => { '0' => { 'name' => 'srcport', 'type' => '1881' }, '1' => { 'name' => 'guid', 'type' => '268' }, '2' => { 'name' => 'sm_id', 'type' => '1829' }, '3' => { 'name' => 'buf', 'type' => '220' } }, 'Return' => '72', 'ShortName' => 'ib_node_query_via' }, '115368' => { 'Header' => undef, 'Line' => '139', 'Param' => { '0' => { 'name' => 'srcgid', 'type' => '2912' }, '1' => { 'name' => 'destgid', 'type' => '2912' }, '2' => { 'name' => 'sm_id', 'type' => '1829' }, '3' => { 'name' => 'buf', 'type' => '220' } }, 'Return' => '72', 'ShortName' => 'ib_path_query' }, '116127' => { 'Header' => undef, 'Line' => '79', 'Param' => { '0' => { 'name' => 'rcvbuf', 'type' => '220' }, '1' => { 'name' => 'portid', 'type' => '1829' }, '2' => { 'name' => 'sa', 'type' => '116278' }, '3' => { 'name' => 'timeout', 'type' => '108' } }, 'Return' => 
'2912', 'ShortName' => 'sa_call' }, '116283' => { 'Header' => undef, 'Line' => '44', 'Param' => { '0' => { 'name' => 'ibmad_port', 'type' => '1881' }, '1' => { 'name' => 'rcvbuf', 'type' => '220' }, '2' => { 'name' => 'portid', 'type' => '1829' }, '3' => { 'name' => 'sa', 'type' => '116278' }, '4' => { 'name' => 'timeout', 'type' => '108' } }, 'Return' => '2912', 'ShortName' => 'sa_rpc_call' }, '11763' => { 'Header' => undef, 'Line' => '1266', 'Param' => { '0' => { 'name' => 'field', 'type' => '6858' }, '1' => { 'name' => 'buf', 'type' => '227' }, '2' => { 'name' => 'bufsz', 'type' => '72' }, '3' => { 'name' => 'val', 'type' => '220' } }, 'Return' => '227', 'ShortName' => 'mad_dump_field' }, '11801' => { 'Header' => undef, 'Line' => '1477', 'Param' => { '0' => { 'name' => 'buf', 'type' => '2912' }, '1' => { 'name' => 'field', 'type' => '6858' }, '2' => { 'name' => 'val', 'type' => '220' } }, 'Return' => '1', 'ShortName' => 'mad_decode_field' }, '11909' => { 'Header' => undef, 'Line' => '1735', 'Param' => { '0' => { 'name' => 'file', 'type' => '788' }, '1' => { 'name' => 'msg', 'type' => '79' }, '2' => { 'name' => 'p', 'type' => '220' }, '3' => { 'name' => 'size', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'xdump' }, '12260' => { 'Header' => undef, 'Line' => '1247', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_portinfo_ext' }, '124259' => { 'Header' => undef, 'Line' => '193', 'Param' => { '0' => { 'name' => 'umad', 'type' => '220' } }, 'Return' => '1', 'ShortName' => 'mad_free' }, '124377' => { 'Header' => undef, 'Line' => '188', 'Return' => '220', 'ShortName' => 'mad_alloc' }, '124487' => { 'Header' => undef, 'Line' => '171', 'Param' => { '0' => { 'name' => 'umad', 'type' => '220' }, '1' => { 'name' => 'timeout', 'type' => '72' }, '2' => { 'name' => 'srcport', 'type' => '1731' } }, 'Return' => '220', 'ShortName' => 'mad_receive_via' }, '125091' => { 'Header' => undef, 'Line' => '166', 'Param' => { '0' => { 'name' => 'umad', 'type' => '220' }, '1' => { 'name' => 'timeout', 'type' => '72' } }, 'Return' => '220', 'ShortName' => 'mad_receive' }, '125191' => { 'Header' => undef, 'Line' => '87', 'Param' => { '0' => { 'name' => 'umad', 'type' => '220' }, '1' => { 'name' => 'portid', 'type' => '1829' }, '2' => { 'name' => 'rstatus', 'type' => '256' }, '3' => { 'name' => 'srcport', 'type' => '1731' } }, 'Return' => '72', 'ShortName' => 'mad_respond_via' }, '12555' => { 'Header' => undef, 'Line' => '1241', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_classportinfo' }, '126529' => { 'Header' => undef, 'Line' => '82', 'Param' => { '0' => { 'name' => 'umad', 'type' => '220' }, '1' => { 'name' => 'portid', 'type' => '1829' }, '2' => { 'name' => 'rstatus', 'type' => '256' } }, 'Return' => '72', 'ShortName' => 'mad_respond' }, '126982' => { 'Header' => undef, 'Line' => '47', 'Param' => { '0' => { 'name' => 'rpc', 'type' => '124224' }, '1' => { 'name' => 'dport', 'type' => '1829' }, '2' => { 'name' => 'rmpp', 'type' => '124229' }, '3' => { 'name' => 'data', 'type' => '220' } }, 'Return' => '72', 'ShortName' => 'mad_send' }, '12712' => { 'Header' => undef, 'Line' => '1235', 'Param' => { '0' => { 'name' => 'buf', 
'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_cc_timestamp' }, '12871' => { 'Header' => undef, 'Line' => '1229', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_cc_congestioncontroltableentry' }, '129117' => { 'Header' => undef, 'Line' => '141', 'Param' => { '0' => { 'name' => 'rcvbuf', 'type' => '220' }, '1' => { 'name' => 'portid', 'type' => '1829' }, '2' => { 'name' => 'attrid', 'type' => '108' }, '3' => { 'name' => 'mod', 'type' => '108' }, '4' => { 'name' => 'timeout', 'type' => '108' } }, 'Return' => '2912', 'ShortName' => 'smp_query' }, '129500' => { 'Header' => undef, 'Line' => '101', 'Param' => { '0' => { 'name' => 'rcvbuf', 'type' => '220' }, '1' => { 'name' => 'portid', 'type' => '1829' }, '2' => { 'name' => 'attrid', 'type' => '108' }, '3' => { 'name' => 'mod', 'type' => '108' }, '4' => { 'name' => 'timeout', 'type' => '108' }, '5' => { 'name' => 'rstatus', 'type' => '5588' }, '6' => { 'name' => 'srcport', 'type' => '1881' } }, 'Return' => '2912', 'ShortName' => 'smp_query_status_via' }, '129932' => { 'Header' => undef, 'Line' => '95', 'Param' => { '0' => { 'name' => 'data', 'type' => '220' }, '1' => { 'name' => 'portid', 'type' => '1829' }, '2' => { 'name' => 'attrid', 'type' => '108' }, '3' => { 'name' => 'mod', 'type' => '108' }, '4' => { 'name' => 'timeout', 'type' => '108' } }, 'Return' => '2912', 'ShortName' => 'smp_set' }, '130109' => { 'Header' => undef, 'Line' => '87', 'Param' => { '0' => { 'name' => 'data', 'type' => '220' }, '1' => { 'name' => 'portid', 'type' => '1829' }, '2' => { 'name' => 'attrid', 'type' => '108' }, '3' => { 'name' => 'mod', 'type' => '108' }, '4' => { 'name' => 'timeout', 'type' => '108' }, '5' => { 'name' => 'srcport', 'type' => '1881' } }, 'Return' => '2912', 'ShortName' => 'smp_set_via' }, '13030' => { 'Header' => undef, 'Line' => '1223', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_cc_congestioncontroltable' }, '130310' => { 'Header' => undef, 'Line' => '55', 'Param' => { '0' => { 'name' => 'data', 'type' => '220' }, '1' => { 'name' => 'portid', 'type' => '1829' }, '2' => { 'name' => 'attrid', 'type' => '108' }, '3' => { 'name' => 'mod', 'type' => '108' }, '4' => { 'name' => 'timeout', 'type' => '108' }, '5' => { 'name' => 'rstatus', 'type' => '5588' }, '6' => { 'name' => 'srcport', 'type' => '1881' } }, 'Return' => '2912', 'ShortName' => 'smp_set_status_via' }, '130737' => { 'Header' => undef, 'Line' => '50', 'Param' => { '0' => { 'name' => 'srcport', 'type' => '1881' } }, 'Return' => '268', 'ShortName' => 'smp_mkey_get' }, '130783' => { 'Header' => undef, 'Line' => '45', 'Param' => { '0' => { 'name' => 'srcport', 'type' => '1731' }, '1' => { 'name' => 'mkey', 'type' => '268' } }, 'Return' => '1', 'ShortName' => 'smp_mkey_set' }, '13189' => { 'Header' => undef, 'Line' => '1217', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 
'mad_dump_cc_cacongestionentry' }, '133038' => { 'Header' => undef, 'Line' => '58', 'Param' => { '0' => { 'name' => 'data', 'type' => '220' }, '1' => { 'name' => 'portid', 'type' => '1829' }, '2' => { 'name' => 'call', 'type' => '133777' }, '3' => { 'name' => 'srcport', 'type' => '1731' } }, 'Return' => '2912', 'ShortName' => 'ib_vendor_call_via' }, '13348' => { 'Header' => undef, 'Line' => '1211', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_cc_cacongestionsetting' }, '133803' => { 'Header' => undef, 'Line' => '52', 'Param' => { '0' => { 'name' => 'data', 'type' => '220' }, '1' => { 'name' => 'portid', 'type' => '1829' }, '2' => { 'name' => 'call', 'type' => '133777' } }, 'Return' => '2912', 'ShortName' => 'ib_vendor_call' }, '13507' => { 'Header' => undef, 'Line' => '1205', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_cc_switchportcongestionsettingelement' }, '13666' => { 'Header' => undef, 'Line' => '1199', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_cc_switchcongestionsetting' }, '13825' => { 'Header' => undef, 'Line' => '1193', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_cc_congestionlogentryca' }, '13984' => { 'Header' => undef, 'Line' => '1187', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_cc_congestionlogca' }, '14143' => { 'Header' => undef, 'Line' => '1181', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_cc_congestionlogentryswitch' }, '14302' => { 'Header' => undef, 'Line' => '1175', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_cc_congestionlogswitch' }, '14461' => { 'Header' => undef, 'Line' => '1169', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_cc_congestionlog' }, '14620' => { 'Header' => undef, 'Line' => '1163', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_cc_congestionkeyinfo' }, '14779' => { 'Header' => undef, 'Line' => '1157', 'Param' => { '0' => { 'name' => 
'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_cc_congestioninfo' }, '14938' => { 'Header' => undef, 'Line' => '1151', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_mlnx_ext_port_info' }, '15097' => { 'Header' => undef, 'Line' => '1138', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_vl_xmit_time_cong' }, '15310' => { 'Header' => undef, 'Line' => '1125', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_xmit_con_ctrl' }, '15523' => { 'Header' => undef, 'Line' => '1112', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_sl_rcv_becn' }, '15736' => { 'Header' => undef, 'Line' => '1099', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_sl_rcv_fecn' }, '15949' => { 'Header' => undef, 'Line' => '1085', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_rcv_con_ctrl' }, '16162' => { 'Header' => undef, 'Line' => '1072', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_sw_port_vl_congestion' }, '16375' => { 'Header' => undef, 'Line' => '1059', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_port_vl_xmit_wait_counters' }, '16588' => { 'Header' => undef, 'Line' => '1046', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_port_vl_xmit_flow_ctl_update_errors' }, '16801' => { 'Header' => undef, 'Line' => '1033', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_port_vl_op_data' }, '17014' => { 'Header' => undef, 'Line' => '1020', 'Param' => { '0' => { 'name' => 'buf', 
'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_port_vl_op_packet' }, '17227' => { 'Header' => undef, 'Line' => '1007', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_port_flow_ctl_counters' }, '1736' => { 'Data' => 1, 'Header' => undef, 'Line' => '1697', 'Return' => '72', 'ShortName' => 'ibdebug' }, '17440' => { 'Header' => undef, 'Line' => '994', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_port_op_rcv_counters' }, '17653' => { 'Header' => undef, 'Line' => '989', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_port_ext_speeds_counters' }, '17812' => { 'Header' => undef, 'Line' => '982', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_port_ext_speeds_counters_rsfec_active' }, '1782' => { 'Header' => undef, 'Line' => '1541', 'Param' => { '0' => { 'name' => 'rpc', 'type' => '124224' }, '1' => { 'name' => 'dport', 'type' => '1829' }, '2' => { 'name' => 'rmpp', 'type' => '124229' }, '3' => { 'name' => 'data', 'type' => '220' }, '4' => { 'name' => 'srcport', 'type' => '1731' } }, 'Return' => '72', 'ShortName' => 'mad_send_via' }, '17971' => { 'Header' => undef, 'Line' => '977', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_portsamples_result' }, '18130' => { 'Header' => undef, 'Line' => '972', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_portsamples_control' }, '18289' => { 'Header' => undef, 'Line' => '959', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_rcv_err' }, '1839' => { 'Header' => undef, 'Line' => '1506', 'Param' => { '0' => { 'name' => 'port', 'type' => '1881' }, '1' => { 'name' => 'rpc', 'type' => '96955' }, '2' => { 'name' => 'dport', 'type' => '1829' }, '3' => { 'name' => 'payload', 'type' => '220' }, '4' => { 'name' => 'rcvdata', 'type' => '220' } }, 'Return' => '220', 'ShortName' => 'mad_rpc' }, '18502' => { 'Header' => undef, 'Line' => '946', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 
'ShortName' => 'mad_dump_perfcounters_xmt_disc' }, '18715' => { 'Header' => undef, 'Line' => '933', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_rcv_sl' }, '18927' => { 'Header' => undef, 'Line' => '920', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_xmt_sl' }, '1899' => { 'Header' => undef, 'Line' => '1452', 'Param' => { '0' => { 'name' => 'portid', 'type' => '1829' } }, 'Return' => '227', 'ShortName' => 'portid2str' }, '19138' => { 'Header' => undef, 'Line' => '908', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters_ext' }, '1921' => { 'Header' => undef, 'Line' => '47', 'Param' => { '0' => { 'name' => 'data', 'type' => '220' }, '1' => { 'name' => 'portid', 'type' => '1829' }, '2' => { 'name' => 'call', 'type' => '2917' }, '3' => { 'name' => 'srcport', 'type' => '1731' } }, 'Return' => '2912', 'ShortName' => 'bm_call_via' }, '19351' => { 'Header' => undef, 'Line' => '890', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_perfcounters' }, '19635' => { 'Header' => undef, 'Line' => '885', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_switchinfo' }, '19792' => { 'Header' => undef, 'Line' => '880', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_portstates' }, '19949' => { 'Header' => undef, 'Line' => '868', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_portinfo' }, '20161' => { 'Header' => undef, 'Line' => '863', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_nodeinfo' }, '20318' => { 'Header' => undef, 'Line' => '855', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_nodedesc' }, '20530' => { 'Header' => undef, 'Line' => '849', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' }, '4' => { 'name' => 'start', 'type' => '72' 
}, '5' => { 'name' => 'end', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_fields' }, '21071' => { 'Header' => undef, 'Line' => '797', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'num', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_vlarbitration' }, '21949' => { 'Header' => undef, 'Line' => '782', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_sltovl' }, '22613' => { 'Header' => undef, 'Line' => '1665', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_node_type' }, '23210' => { 'Header' => undef, 'Line' => '1659', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_string' }, '23471' => { 'Header' => undef, 'Line' => '1659', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_array' }, '23721' => { 'Header' => undef, 'Line' => '711', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_bitfield' }, '23954' => { 'Header' => undef, 'Line' => '1664', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_portcapmask2' }, '24784' => { 'Header' => undef, 'Line' => '1664', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_portcapmask' }, '26621' => { 'Header' => undef, 'Line' => '1665', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_opervls' }, '27538' => { 'Header' => undef, 'Line' => '1665', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_vlcap' }, '28344' => { 'Header' => undef, 'Line' => '1665', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_mtu' }, '29150' => { 'Header' => undef, 'Line' => '1664', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 
'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_physportstate' }, '30289' => { 'Header' => undef, 'Line' => '1660', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_linkdowndefstate' }, '30886' => { 'Header' => undef, 'Line' => '1663', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_portstate' }, '31692' => { 'Header' => undef, 'Line' => '1692', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_linkspeedexten2' }, '31763' => { 'Header' => undef, 'Line' => '1692', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_linkspeedextsup2' }, '31895' => { 'Header' => undef, 'Line' => '1692', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_linkspeedext2' }, '32381' => { 'Header' => undef, 'Line' => '1663', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_linkspeedexten' }, '32603' => { 'Header' => undef, 'Line' => '1662', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_linkspeedextsup' }, '33542' => { 'Header' => undef, 'Line' => '1662', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_linkspeedext' }, '34348' => { 'Header' => undef, 'Line' => '1661', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_linkspeeden' }, '34419' => { 'Header' => undef, 'Line' => '1661', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_linkspeedsup' }, '35232' => { 'Header' => undef, 'Line' => '1661', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_linkspeed' }, '35934' 
=> { 'Header' => undef, 'Line' => '1660', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_linkwidthen' }, '36085' => { 'Header' => undef, 'Line' => '1660', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_linkwidthsup' }, '37169' => { 'Header' => undef, 'Line' => '1659', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_linkwidth' }, '37977' => { 'Header' => undef, 'Line' => '1658', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_rhex' }, '38822' => { 'Header' => undef, 'Line' => '1658', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_hex' }, '39667' => { 'Header' => undef, 'Line' => '1658', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_uint' }, '40280' => { 'Header' => undef, 'Line' => '43', 'Param' => { '0' => { 'name' => 'buf', 'type' => '227' }, '1' => { 'name' => 'bufsz', 'type' => '72' }, '2' => { 'name' => 'val', 'type' => '220' }, '3' => { 'name' => 'valsz', 'type' => '72' } }, 'Return' => '1', 'ShortName' => 'mad_dump_int' }, '49122' => { 'Header' => undef, 'Line' => '1280', 'Param' => { '0' => { 'name' => 'field', 'type' => '6858' } }, 'Return' => '79', 'ShortName' => 'mad_field_name' }, '49171' => { 'Header' => undef, 'Line' => '1273', 'Param' => { '0' => { 'name' => 'field', 'type' => '6858' }, '1' => { 'name' => 'buf', 'type' => '227' }, '2' => { 'name' => 'bufsz', 'type' => '72' }, '3' => { 'name' => 'val', 'type' => '220' } }, 'Return' => '227', 'ShortName' => 'mad_dump_val' }, '49571' => { 'Header' => undef, 'Line' => '1259', 'Param' => { '0' => { 'name' => 'field', 'type' => '6858' }, '1' => { 'name' => 'name', 'type' => '79' }, '2' => { 'name' => 'val', 'type' => '220' } }, 'Return' => '72', 'ShortName' => 'mad_print_field' }, '50805' => { 'Header' => undef, 'Line' => '1478', 'Param' => { '0' => { 'name' => 'buf', 'type' => '2912' }, '1' => { 'name' => 'field', 'type' => '6858' }, '2' => { 'name' => 'val', 'type' => '220' } }, 'Return' => '1', 'ShortName' => 'mad_encode_field' }, '5154' => { 'Header' => undef, 'Line' => '79', 'Param' => { '0' => { 'name' => 'payload', 'type' => '220' }, '1' => { 'name' => 'rcvbuf', 'type' => '220' }, '2' => { 'name' => 'portid', 'type' => '1829' }, '3' => { 'name' => 'attrid', 'type' => '108' }, '4' => { 'name' => 'mod', 'type' => '108' }, '5' => { 'name' => 'timeout', 'type' => '108' }, '6' => { 'name' => 'rstatus', 'type' => '5588' }, '7' => { 'name' => 'srcport', 'type' => '1881' }, '8' => { 'name' => 'cckey', 
'type' => '268' } }, 'Return' => '220', 'ShortName' => 'cc_config_status_via' }, '51869' => { 'Header' => undef, 'Line' => '1161', 'Param' => { '0' => { 'name' => 'buf', 'type' => '220' }, '1' => { 'name' => 'base_offs', 'type' => '72' }, '2' => { 'name' => 'field', 'type' => '6858' }, '3' => { 'name' => 'val', 'type' => '220' } }, 'Return' => '1', 'ShortName' => 'mad_get_array' }, '52160' => { 'Header' => undef, 'Line' => '1475', 'Param' => { '0' => { 'name' => 'buf', 'type' => '220' }, '1' => { 'name' => 'base_offs', 'type' => '72' }, '2' => { 'name' => 'field', 'type' => '6858' }, '3' => { 'name' => 'val', 'type' => '220' } }, 'Return' => '1', 'ShortName' => 'mad_set_array' }, '52451' => { 'Header' => undef, 'Line' => '1473', 'Param' => { '0' => { 'name' => 'buf', 'type' => '220' }, '1' => { 'name' => 'base_offs', 'type' => '72' }, '2' => { 'name' => 'field', 'type' => '6858' }, '3' => { 'name' => 'val', 'type' => '268' } }, 'Return' => '1', 'ShortName' => 'mad_set_field64' }, '52732' => { 'Header' => undef, 'Line' => '1472', 'Param' => { '0' => { 'name' => 'buf', 'type' => '220' }, '1' => { 'name' => 'base_offs', 'type' => '72' }, '2' => { 'name' => 'field', 'type' => '6858' } }, 'Return' => '268', 'ShortName' => 'mad_get_field64' }, '52993' => { 'Header' => undef, 'Line' => '1469', 'Param' => { '0' => { 'name' => 'buf', 'type' => '220' }, '1' => { 'name' => 'base_offs', 'type' => '72' }, '2' => { 'name' => 'field', 'type' => '6858' }, '3' => { 'name' => 'val', 'type' => '256' } }, 'Return' => '1', 'ShortName' => 'mad_set_field' }, '53165' => { 'Header' => undef, 'Line' => '1468', 'Param' => { '0' => { 'name' => 'buf', 'type' => '220' }, '1' => { 'name' => 'base_offs', 'type' => '72' }, '2' => { 'name' => 'field', 'type' => '6858' } }, 'Return' => '256', 'ShortName' => 'mad_get_field' }, '5614' => { 'Header' => undef, 'Line' => '44', 'Param' => { '0' => { 'name' => 'rcvbuf', 'type' => '220' }, '1' => { 'name' => 'portid', 'type' => '1829' }, '2' => { 'name' => 'attrid', 'type' => '108' }, '3' => { 'name' => 'mod', 'type' => '108' }, '4' => { 'name' => 'timeout', 'type' => '108' }, '5' => { 'name' => 'rstatus', 'type' => '5588' }, '6' => { 'name' => 'srcport', 'type' => '1881' }, '7' => { 'name' => 'cckey', 'type' => '268' } }, 'Return' => '220', 'ShortName' => 'cc_query_status_via' }, '61028' => { 'Header' => undef, 'Line' => '83', 'Param' => { '0' => { 'name' => 'rcvbuf', 'type' => '220' }, '1' => { 'name' => 'dest', 'type' => '1829' }, '2' => { 'name' => 'port', 'type' => '72' }, '3' => { 'name' => 'mask', 'type' => '108' }, '4' => { 'name' => 'timeout', 'type' => '108' }, '5' => { 'name' => 'id', 'type' => '108' }, '6' => { 'name' => 'srcport', 'type' => '1881' } }, 'Return' => '2912', 'ShortName' => 'performance_reset_via' }, '61826' => { 'Header' => undef, 'Line' => '46', 'Param' => { '0' => { 'name' => 'rcvbuf', 'type' => '220' }, '1' => { 'name' => 'dest', 'type' => '1829' }, '2' => { 'name' => 'port', 'type' => '72' }, '3' => { 'name' => 'timeout', 'type' => '108' }, '4' => { 'name' => 'id', 'type' => '108' }, '5' => { 'name' => 'srcport', 'type' => '1881' } }, 'Return' => '2912', 'ShortName' => 'pma_query_via' }, '69963' => { 'Header' => undef, 'Line' => '1487', 'Param' => { '0' => { 'name' => 'umad', 'type' => '220' }, '1' => { 'name' => 'rpc', 'type' => '70928' }, '2' => { 'name' => 'dport', 'type' => '1829' }, '3' => { 'name' => 'rmpp', 'type' => '70938' }, '4' => { 'name' => 'data', 'type' => '220' } }, 'Return' => '72', 'ShortName' => 'mad_build_pkt' }, '70948' => { 
'Header' => undef, 'Line' => '82', 'Param' => { '0' => { 'name' => 'buf', 'type' => '220' }, '1' => { 'name' => 'rpc', 'type' => '70928' }, '2' => { 'name' => 'drpath', 'type' => '72333' }, '3' => { 'name' => 'data', 'type' => '220' } }, 'Return' => '220', 'ShortName' => 'mad_encode' }, '72364' => { 'Header' => undef, 'Line' => '1517', 'Param' => { '0' => { 'name' => 'srcport', 'type' => '1881' } }, 'Return' => '72', 'ShortName' => 'mad_get_retries' }, '72416' => { 'Header' => undef, 'Line' => '1516', 'Param' => { '0' => { 'name' => 'srcport', 'type' => '1881' }, '1' => { 'name' => 'override_ms', 'type' => '72' } }, 'Return' => '72', 'ShortName' => 'mad_get_timeout' }, '72476' => { 'Header' => undef, 'Line' => '1486', 'Return' => '268', 'ShortName' => 'mad_trid' }, '74034' => { 'Header' => undef, 'Line' => '110', 'Param' => { '0' => { 'name' => 'path', 'type' => '74403' }, '1' => { 'name' => 'dstr', 'type' => '227' }, '2' => { 'name' => 'dstr_size', 'type' => '46' } }, 'Return' => '227', 'ShortName' => 'drpath2str' }, '74408' => { 'Header' => undef, 'Line' => '1454', 'Param' => { '0' => { 'name' => 'path', 'type' => '74403' }, '1' => { 'name' => 'routepath', 'type' => '227' }, '2' => { 'name' => 'drslid', 'type' => '72' }, '3' => { 'name' => 'drdlid', 'type' => '72' } }, 'Return' => '72', 'ShortName' => 'str2drpath' }, '75439' => { 'Header' => undef, 'Line' => '44', 'Param' => { '0' => { 'name' => 'portid', 'type' => '1829' } }, 'Return' => '72', 'ShortName' => 'portid2portnum' }, '76837' => { 'Header' => undef, 'Line' => '107', 'Param' => { '0' => { 'name' => 'srcport', 'type' => '1731' } }, 'Return' => '72', 'ShortName' => 'mad_rpc_portid' }, '76860' => { 'Header' => undef, 'Line' => '126', 'Param' => { '0' => { 'name' => 'mgmt', 'type' => '72' }, '1' => { 'name' => 'rmpp_version', 'type' => '232' }, '2' => { 'name' => 'method_mask', 'type' => '76744' }, '3' => { 'name' => 'class_oui', 'type' => '256' }, '4' => { 'name' => 'srcport', 'type' => '1731' } }, 'Return' => '72', 'ShortName' => 'mad_register_server_via' }, '77749' => { 'Header' => undef, 'Line' => '119', 'Param' => { '0' => { 'name' => 'mgmt', 'type' => '72' }, '1' => { 'name' => 'rmpp_version', 'type' => '232' }, '2' => { 'name' => 'method_mask', 'type' => '76744' }, '3' => { 'name' => 'class_oui', 'type' => '256' } }, 'Return' => '72', 'ShortName' => 'mad_register_server' }, '77903' => { 'Header' => undef, 'Line' => '1526', 'Param' => { '0' => { 'name' => 'mgmt', 'type' => '72' }, '1' => { 'name' => 'rmpp_version', 'type' => '232' }, '2' => { 'name' => 'srcport', 'type' => '1731' } }, 'Return' => '72', 'ShortName' => 'mad_register_client_via' }, '78417' => { 'Header' => undef, 'Line' => '97', 'Param' => { '0' => { 'name' => 'mgmt', 'type' => '72' }, '1' => { 'name' => 'rmpp_version', 'type' => '232' } }, 'Return' => '72', 'ShortName' => 'mad_register_client' }, '78635' => { 'Header' => undef, 'Line' => '74', 'Param' => { '0' => { 'name' => 'mgmt', 'type' => '72' } }, 'Return' => '72', 'ShortName' => 'mad_class_agent' }, '85178' => { 'Header' => undef, 'Line' => '111', 'Param' => { '0' => { 'name' => 'srcport', 'type' => '1881' }, '1' => { 'name' => 'srcgid', 'type' => '2912' }, '2' => { 'name' => 'destgid', 'type' => '2912' }, '3' => { 'name' => 'sm_id', 'type' => '1829' }, '4' => { 'name' => 'buf', 'type' => '220' } }, 'Return' => '72', 'ShortName' => 'ib_path_query_via' }, '85274' => { 'Header' => undef, 'Line' => '133', 'Param' => { '0' => { 'name' => 'rcvbuf', 'type' => '220' }, '1' => { 'name' => 'portid', 'type' => 
'1829' }, '2' => { 'name' => 'attrid', 'type' => '108' }, '3' => { 'name' => 'mod', 'type' => '108' }, '4' => { 'name' => 'timeout', 'type' => '108' }, '5' => { 'name' => 'srcport', 'type' => '1881' } }, 'Return' => '2912', 'ShortName' => 'smp_query_via' }, '85321' => { 'Header' => undef, 'Line' => '241', 'Param' => { '0' => { 'name' => 'portid', 'type' => '1829' }, '1' => { 'name' => 'portnum', 'type' => '5588' }, '2' => { 'name' => 'gid', 'type' => '85446' } }, 'Return' => '72', 'ShortName' => 'ib_resolve_self' }, '85451' => { 'Header' => undef, 'Line' => '213', 'Param' => { '0' => { 'name' => 'portid', 'type' => '1829' }, '1' => { 'name' => 'portnum', 'type' => '5588' }, '2' => { 'name' => 'gid', 'type' => '85446' }, '3' => { 'name' => 'srcport', 'type' => '1881' } }, 'Return' => '72', 'ShortName' => 'ib_resolve_self_via' }, '86003' => { 'Header' => undef, 'Line' => '206', 'Param' => { '0' => { 'name' => 'portid', 'type' => '1829' }, '1' => { 'name' => 'addr_str', 'type' => '227' }, '2' => { 'name' => 'dest_type', 'type' => '84856' }, '3' => { 'name' => 'sm_id', 'type' => '1829' } }, 'Return' => '72', 'ShortName' => 'ib_resolve_portid_str' }, '86154' => { 'Header' => undef, 'Line' => '137', 'Param' => { '0' => { 'name' => 'portid', 'type' => '1829' }, '1' => { 'name' => 'addr_str', 'type' => '227' }, '2' => { 'name' => 'dest_type', 'type' => '84856' }, '3' => { 'name' => 'sm_id', 'type' => '1829' }, '4' => { 'name' => 'srcport', 'type' => '1881' } }, 'Return' => '72', 'ShortName' => 'ib_resolve_portid_str_via' }, '87137' => { 'Header' => undef, 'Line' => '97', 'Param' => { '0' => { 'name' => 'portid', 'type' => '1829' }, '1' => { 'name' => 'guid', 'type' => '87806' }, '2' => { 'name' => 'sm_id', 'type' => '1829' }, '3' => { 'name' => 'timeout', 'type' => '72' }, '4' => { 'name' => 'srcport', 'type' => '1881' } }, 'Return' => '72', 'ShortName' => 'ib_resolve_guid_via' }, '87827' => { 'Header' => undef, 'Line' => '75', 'Param' => { '0' => { 'name' => 'portid', 'type' => '1829' }, '1' => { 'name' => 'gid', 'type' => '2912' }, '2' => { 'name' => 'sm_id', 'type' => '1829' }, '3' => { 'name' => 'timeout', 'type' => '72' }, '4' => { 'name' => 'srcport', 'type' => '1881' } }, 'Return' => '72', 'ShortName' => 'ib_resolve_gid_via' }, '88093' => { 'Header' => undef, 'Line' => '70', 'Param' => { '0' => { 'name' => 'sm_id', 'type' => '1829' }, '1' => { 'name' => 'timeout', 'type' => '72' } }, 'Return' => '72', 'ShortName' => 'ib_resolve_smlid' }, '88192' => { 'Header' => undef, 'Line' => '48', 'Param' => { '0' => { 'name' => 'sm_id', 'type' => '1829' }, '1' => { 'name' => 'timeout', 'type' => '72' }, '2' => { 'name' => 'srcport', 'type' => '1881' } }, 'Return' => '72', 'ShortName' => 'ib_resolve_smlid_via' }, '96977' => { 'Header' => undef, 'Line' => '580', 'Param' => { '0' => { 'name' => 'srcport', 'type' => '97101' } }, 'Return' => '1', 'ShortName' => 'mad_rpc_close_port2' }, '97106' => { 'Header' => undef, 'Line' => '497', 'Param' => { '0' => { 'name' => 'dev_name', 'type' => '227' }, '1' => { 'name' => 'dev_port', 'type' => '72' }, '2' => { 'name' => 'mgmt_classes', 'type' => '5588' }, '3' => { 'name' => 'num_classes', 'type' => '72' }, '4' => { 'name' => 'enforce_smi', 'type' => '108' } }, 'Return' => '97101', 'ShortName' => 'mad_rpc_open_port2' }, '99956' => { 'Header' => undef, 'Line' => '434', 'Param' => { '0' => { 'name' => 'port', 'type' => '1731' } }, 'Return' => '1', 'ShortName' => 'mad_rpc_close_port' } }, 'SymbolVersion' => { 'bm_call_via' => 'bm_call_via@@IBMAD_1.3', 
'cc_config_status_via' => 'cc_config_status_via@@IBMAD_1.3', 'cc_query_status_via' => 'cc_query_status_via@@IBMAD_1.3', 'drpath2str' => 'drpath2str@@IBMAD_1.3', 'ib_node_query_via' => 'ib_node_query_via@@IBMAD_1.3', 'ib_path_query' => 'ib_path_query@@IBMAD_1.3', 'ib_path_query_via' => 'ib_path_query_via@@IBMAD_1.3', 'ib_resolve_gid_via' => 'ib_resolve_gid_via@@IBMAD_1.3', 'ib_resolve_guid_via' => 'ib_resolve_guid_via@@IBMAD_1.3', 'ib_resolve_portid_str' => 'ib_resolve_portid_str@@IBMAD_1.3', 'ib_resolve_portid_str_via' => 'ib_resolve_portid_str_via@@IBMAD_1.3', 'ib_resolve_self' => 'ib_resolve_self@@IBMAD_1.3', 'ib_resolve_self_via' => 'ib_resolve_self_via@@IBMAD_1.3', 'ib_resolve_smlid' => 'ib_resolve_smlid@@IBMAD_1.3', 'ib_resolve_smlid_via' => 'ib_resolve_smlid_via@@IBMAD_1.3', 'ib_vendor_call' => 'ib_vendor_call@@IBMAD_1.3', 'ib_vendor_call_via' => 'ib_vendor_call_via@@IBMAD_1.3', 'ibdebug' => 'ibdebug@@IBMAD_1.3', 'mad_alloc' => 'mad_alloc@@IBMAD_1.3', 'mad_build_pkt' => 'mad_build_pkt@@IBMAD_1.3', 'mad_class_agent' => 'mad_class_agent@@IBMAD_1.3', 'mad_decode_field' => 'mad_decode_field@@IBMAD_1.3', 'mad_dump_array' => 'mad_dump_array@@IBMAD_1.3', 'mad_dump_bitfield' => 'mad_dump_bitfield@@IBMAD_1.3', 'mad_dump_cc_cacongestionentry' => 'mad_dump_cc_cacongestionentry@@IBMAD_1.3', 'mad_dump_cc_cacongestionsetting' => 'mad_dump_cc_cacongestionsetting@@IBMAD_1.3', 'mad_dump_cc_congestioncontroltable' => 'mad_dump_cc_congestioncontroltable@@IBMAD_1.3', 'mad_dump_cc_congestioncontroltableentry' => 'mad_dump_cc_congestioncontroltableentry@@IBMAD_1.3', 'mad_dump_cc_congestioninfo' => 'mad_dump_cc_congestioninfo@@IBMAD_1.3', 'mad_dump_cc_congestionkeyinfo' => 'mad_dump_cc_congestionkeyinfo@@IBMAD_1.3', 'mad_dump_cc_congestionlog' => 'mad_dump_cc_congestionlog@@IBMAD_1.3', 'mad_dump_cc_congestionlogca' => 'mad_dump_cc_congestionlogca@@IBMAD_1.3', 'mad_dump_cc_congestionlogentryca' => 'mad_dump_cc_congestionlogentryca@@IBMAD_1.3', 'mad_dump_cc_congestionlogentryswitch' => 'mad_dump_cc_congestionlogentryswitch@@IBMAD_1.3', 'mad_dump_cc_congestionlogswitch' => 'mad_dump_cc_congestionlogswitch@@IBMAD_1.3', 'mad_dump_cc_switchcongestionsetting' => 'mad_dump_cc_switchcongestionsetting@@IBMAD_1.3', 'mad_dump_cc_switchportcongestionsettingelement' => 'mad_dump_cc_switchportcongestionsettingelement@@IBMAD_1.3', 'mad_dump_cc_timestamp' => 'mad_dump_cc_timestamp@@IBMAD_1.3', 'mad_dump_classportinfo' => 'mad_dump_classportinfo@@IBMAD_1.3', 'mad_dump_field' => 'mad_dump_field@@IBMAD_1.3', 'mad_dump_fields' => 'mad_dump_fields@@IBMAD_1.3', 'mad_dump_hex' => 'mad_dump_hex@@IBMAD_1.3', 'mad_dump_int' => 'mad_dump_int@@IBMAD_1.3', 'mad_dump_linkdowndefstate' => 'mad_dump_linkdowndefstate@@IBMAD_1.3', 'mad_dump_linkspeed' => 'mad_dump_linkspeed@@IBMAD_1.3', 'mad_dump_linkspeeden' => 'mad_dump_linkspeeden@@IBMAD_1.3', 'mad_dump_linkspeedext' => 'mad_dump_linkspeedext@@IBMAD_1.3', 'mad_dump_linkspeedext2' => 'mad_dump_linkspeedext2@@IBMAD_1.4', 'mad_dump_linkspeedexten' => 'mad_dump_linkspeedexten@@IBMAD_1.3', 'mad_dump_linkspeedexten2' => 'mad_dump_linkspeedexten2@@IBMAD_1.4', 'mad_dump_linkspeedextsup' => 'mad_dump_linkspeedextsup@@IBMAD_1.3', 'mad_dump_linkspeedextsup2' => 'mad_dump_linkspeedextsup2@@IBMAD_1.4', 'mad_dump_linkspeedsup' => 'mad_dump_linkspeedsup@@IBMAD_1.3', 'mad_dump_linkwidth' => 'mad_dump_linkwidth@@IBMAD_1.3', 'mad_dump_linkwidthen' => 'mad_dump_linkwidthen@@IBMAD_1.3', 'mad_dump_linkwidthsup' => 'mad_dump_linkwidthsup@@IBMAD_1.3', 'mad_dump_mlnx_ext_port_info' => 
'mad_dump_mlnx_ext_port_info@@IBMAD_1.3', 'mad_dump_mtu' => 'mad_dump_mtu@@IBMAD_1.3', 'mad_dump_node_type' => 'mad_dump_node_type@@IBMAD_1.3', 'mad_dump_nodedesc' => 'mad_dump_nodedesc@@IBMAD_1.3', 'mad_dump_nodeinfo' => 'mad_dump_nodeinfo@@IBMAD_1.3', 'mad_dump_opervls' => 'mad_dump_opervls@@IBMAD_1.3', 'mad_dump_perfcounters' => 'mad_dump_perfcounters@@IBMAD_1.3', 'mad_dump_perfcounters_ext' => 'mad_dump_perfcounters_ext@@IBMAD_1.3', 'mad_dump_perfcounters_port_flow_ctl_counters' => 'mad_dump_perfcounters_port_flow_ctl_counters@@IBMAD_1.3', 'mad_dump_perfcounters_port_op_rcv_counters' => 'mad_dump_perfcounters_port_op_rcv_counters@@IBMAD_1.3', 'mad_dump_perfcounters_port_vl_op_data' => 'mad_dump_perfcounters_port_vl_op_data@@IBMAD_1.3', 'mad_dump_perfcounters_port_vl_op_packet' => 'mad_dump_perfcounters_port_vl_op_packet@@IBMAD_1.3', 'mad_dump_perfcounters_port_vl_xmit_flow_ctl_update_errors' => 'mad_dump_perfcounters_port_vl_xmit_flow_ctl_update_errors@@IBMAD_1.3', 'mad_dump_perfcounters_port_vl_xmit_wait_counters' => 'mad_dump_perfcounters_port_vl_xmit_wait_counters@@IBMAD_1.3', 'mad_dump_perfcounters_rcv_con_ctrl' => 'mad_dump_perfcounters_rcv_con_ctrl@@IBMAD_1.3', 'mad_dump_perfcounters_rcv_err' => 'mad_dump_perfcounters_rcv_err@@IBMAD_1.3', 'mad_dump_perfcounters_rcv_sl' => 'mad_dump_perfcounters_rcv_sl@@IBMAD_1.3', 'mad_dump_perfcounters_sl_rcv_becn' => 'mad_dump_perfcounters_sl_rcv_becn@@IBMAD_1.3', 'mad_dump_perfcounters_sl_rcv_fecn' => 'mad_dump_perfcounters_sl_rcv_fecn@@IBMAD_1.3', 'mad_dump_perfcounters_sw_port_vl_congestion' => 'mad_dump_perfcounters_sw_port_vl_congestion@@IBMAD_1.3', 'mad_dump_perfcounters_vl_xmit_time_cong' => 'mad_dump_perfcounters_vl_xmit_time_cong@@IBMAD_1.3', 'mad_dump_perfcounters_xmit_con_ctrl' => 'mad_dump_perfcounters_xmit_con_ctrl@@IBMAD_1.3', 'mad_dump_perfcounters_xmt_disc' => 'mad_dump_perfcounters_xmt_disc@@IBMAD_1.3', 'mad_dump_perfcounters_xmt_sl' => 'mad_dump_perfcounters_xmt_sl@@IBMAD_1.3', 'mad_dump_physportstate' => 'mad_dump_physportstate@@IBMAD_1.3', 'mad_dump_port_ext_speeds_counters' => 'mad_dump_port_ext_speeds_counters@@IBMAD_1.3', 'mad_dump_port_ext_speeds_counters_rsfec_active' => 'mad_dump_port_ext_speeds_counters_rsfec_active@@IBMAD_1.3', 'mad_dump_portcapmask' => 'mad_dump_portcapmask@@IBMAD_1.3', 'mad_dump_portcapmask2' => 'mad_dump_portcapmask2@@IBMAD_1.3', 'mad_dump_portinfo' => 'mad_dump_portinfo@@IBMAD_1.3', 'mad_dump_portinfo_ext' => 'mad_dump_portinfo_ext@@IBMAD_1.3', 'mad_dump_portsamples_control' => 'mad_dump_portsamples_control@@IBMAD_1.3', 'mad_dump_portsamples_result' => 'mad_dump_portsamples_result@@IBMAD_1.3', 'mad_dump_portstate' => 'mad_dump_portstate@@IBMAD_1.3', 'mad_dump_portstates' => 'mad_dump_portstates@@IBMAD_1.3', 'mad_dump_rhex' => 'mad_dump_rhex@@IBMAD_1.3', 'mad_dump_sltovl' => 'mad_dump_sltovl@@IBMAD_1.3', 'mad_dump_string' => 'mad_dump_string@@IBMAD_1.3', 'mad_dump_switchinfo' => 'mad_dump_switchinfo@@IBMAD_1.3', 'mad_dump_uint' => 'mad_dump_uint@@IBMAD_1.3', 'mad_dump_val' => 'mad_dump_val@@IBMAD_1.3', 'mad_dump_vlarbitration' => 'mad_dump_vlarbitration@@IBMAD_1.3', 'mad_dump_vlcap' => 'mad_dump_vlcap@@IBMAD_1.3', 'mad_encode' => 'mad_encode@@IBMAD_1.3', 'mad_encode_field' => 'mad_encode_field@@IBMAD_1.3', 'mad_field_name' => 'mad_field_name@@IBMAD_1.3', 'mad_free' => 'mad_free@@IBMAD_1.3', 'mad_get_array' => 'mad_get_array@@IBMAD_1.3', 'mad_get_field' => 'mad_get_field@@IBMAD_1.3', 'mad_get_field64' => 'mad_get_field64@@IBMAD_1.3', 'mad_get_retries' => 'mad_get_retries@@IBMAD_1.3', 
'mad_get_timeout' => 'mad_get_timeout@@IBMAD_1.3', 'mad_print_field' => 'mad_print_field@@IBMAD_1.3', 'mad_receive' => 'mad_receive@@IBMAD_1.3', 'mad_receive_via' => 'mad_receive_via@@IBMAD_1.3', 'mad_register_client' => 'mad_register_client@@IBMAD_1.3', 'mad_register_client_via' => 'mad_register_client_via@@IBMAD_1.3', 'mad_register_server' => 'mad_register_server@@IBMAD_1.3', 'mad_register_server_via' => 'mad_register_server_via@@IBMAD_1.3', 'mad_respond' => 'mad_respond@@IBMAD_1.3', 'mad_respond_via' => 'mad_respond_via@@IBMAD_1.3', 'mad_rpc' => 'mad_rpc@@IBMAD_1.3', 'mad_rpc_class_agent' => 'mad_rpc_class_agent@@IBMAD_1.3', 'mad_rpc_close_port' => 'mad_rpc_close_port@@IBMAD_1.3', 'mad_rpc_close_port2' => 'mad_rpc_close_port2@@IBMAD_1.5', 'mad_rpc_open_port' => 'mad_rpc_open_port@@IBMAD_1.3', 'mad_rpc_open_port2' => 'mad_rpc_open_port2@@IBMAD_1.5', 'mad_rpc_portid' => 'mad_rpc_portid@@IBMAD_1.3', 'mad_rpc_rmpp' => 'mad_rpc_rmpp@@IBMAD_1.3', 'mad_rpc_set_retries' => 'mad_rpc_set_retries@@IBMAD_1.3', 'mad_rpc_set_timeout' => 'mad_rpc_set_timeout@@IBMAD_1.3', 'mad_send' => 'mad_send@@IBMAD_1.3', 'mad_send_via' => 'mad_send_via@@IBMAD_1.3', 'mad_set_array' => 'mad_set_array@@IBMAD_1.3', 'mad_set_field' => 'mad_set_field@@IBMAD_1.3', 'mad_set_field64' => 'mad_set_field64@@IBMAD_1.3', 'mad_trid' => 'mad_trid@@IBMAD_1.3', 'madrpc' => 'madrpc@@IBMAD_1.3', 'madrpc_init' => 'madrpc_init@@IBMAD_1.3', 'madrpc_portid' => 'madrpc_portid@@IBMAD_1.3', 'madrpc_rmpp' => 'madrpc_rmpp@@IBMAD_1.3', 'madrpc_save_mad' => 'madrpc_save_mad@@IBMAD_1.3', 'madrpc_set_retries' => 'madrpc_set_retries@@IBMAD_1.3', 'madrpc_set_timeout' => 'madrpc_set_timeout@@IBMAD_1.3', 'madrpc_show_errors' => 'madrpc_show_errors@@IBMAD_1.3', 'performance_reset_via' => 'performance_reset_via@@IBMAD_1.3', 'pma_query_via' => 'pma_query_via@@IBMAD_1.3', 'portid2portnum' => 'portid2portnum@@IBMAD_1.3', 'portid2str' => 'portid2str@@IBMAD_1.3', 'sa_call' => 'sa_call@@IBMAD_1.3', 'sa_rpc_call' => 'sa_rpc_call@@IBMAD_1.3', 'smp_mkey_get' => 'smp_mkey_get@@IBMAD_1.3', 'smp_mkey_set' => 'smp_mkey_set@@IBMAD_1.3', 'smp_query' => 'smp_query@@IBMAD_1.3', 'smp_query_status_via' => 'smp_query_status_via@@IBMAD_1.3', 'smp_query_via' => 'smp_query_via@@IBMAD_1.3', 'smp_set' => 'smp_set@@IBMAD_1.3', 'smp_set_status_via' => 'smp_set_status_via@@IBMAD_1.3', 'smp_set_via' => 'smp_set_via@@IBMAD_1.3', 'str2drpath' => 'str2drpath@@IBMAD_1.3', 'xdump' => 'xdump@@IBMAD_1.3' }, 'Symbols' => { 'libibmad.so.5.5.56.0' => { 'bm_call_via@@IBMAD_1.3' => 1, 'cc_config_status_via@@IBMAD_1.3' => 1, 'cc_query_status_via@@IBMAD_1.3' => 1, 'drpath2str@@IBMAD_1.3' => 1, 'ib_node_query_via@@IBMAD_1.3' => 1, 'ib_path_query@@IBMAD_1.3' => 1, 'ib_path_query_via@@IBMAD_1.3' => 1, 'ib_resolve_gid_via@@IBMAD_1.3' => 1, 'ib_resolve_guid_via@@IBMAD_1.3' => 1, 'ib_resolve_portid_str@@IBMAD_1.3' => 1, 'ib_resolve_portid_str_via@@IBMAD_1.3' => 1, 'ib_resolve_self@@IBMAD_1.3' => 1, 'ib_resolve_self_via@@IBMAD_1.3' => 1, 'ib_resolve_smlid@@IBMAD_1.3' => 1, 'ib_resolve_smlid_via@@IBMAD_1.3' => 1, 'ib_vendor_call@@IBMAD_1.3' => 1, 'ib_vendor_call_via@@IBMAD_1.3' => 1, 'ibdebug@@IBMAD_1.3' => -4, 'mad_alloc@@IBMAD_1.3' => 1, 'mad_build_pkt@@IBMAD_1.3' => 1, 'mad_class_agent@@IBMAD_1.3' => 1, 'mad_decode_field@@IBMAD_1.3' => 1, 'mad_dump_array@@IBMAD_1.3' => 1, 'mad_dump_bitfield@@IBMAD_1.3' => 1, 'mad_dump_cc_cacongestionentry@@IBMAD_1.3' => 1, 'mad_dump_cc_cacongestionsetting@@IBMAD_1.3' => 1, 'mad_dump_cc_congestioncontroltable@@IBMAD_1.3' => 1, 
'mad_dump_cc_congestioncontroltableentry@@IBMAD_1.3' => 1, 'mad_dump_cc_congestioninfo@@IBMAD_1.3' => 1, 'mad_dump_cc_congestionkeyinfo@@IBMAD_1.3' => 1, 'mad_dump_cc_congestionlog@@IBMAD_1.3' => 1, 'mad_dump_cc_congestionlogca@@IBMAD_1.3' => 1, 'mad_dump_cc_congestionlogentryca@@IBMAD_1.3' => 1, 'mad_dump_cc_congestionlogentryswitch@@IBMAD_1.3' => 1, 'mad_dump_cc_congestionlogswitch@@IBMAD_1.3' => 1, 'mad_dump_cc_switchcongestionsetting@@IBMAD_1.3' => 1, 'mad_dump_cc_switchportcongestionsettingelement@@IBMAD_1.3' => 1, 'mad_dump_cc_timestamp@@IBMAD_1.3' => 1, 'mad_dump_classportinfo@@IBMAD_1.3' => 1, 'mad_dump_field@@IBMAD_1.3' => 1, 'mad_dump_fields@@IBMAD_1.3' => 1, 'mad_dump_hex@@IBMAD_1.3' => 1, 'mad_dump_int@@IBMAD_1.3' => 1, 'mad_dump_linkdowndefstate@@IBMAD_1.3' => 1, 'mad_dump_linkspeed@@IBMAD_1.3' => 1, 'mad_dump_linkspeeden@@IBMAD_1.3' => 1, 'mad_dump_linkspeedext2@@IBMAD_1.4' => 1, 'mad_dump_linkspeedext@@IBMAD_1.3' => 1, 'mad_dump_linkspeedexten2@@IBMAD_1.4' => 1, 'mad_dump_linkspeedexten@@IBMAD_1.3' => 1, 'mad_dump_linkspeedextsup2@@IBMAD_1.4' => 1, 'mad_dump_linkspeedextsup@@IBMAD_1.3' => 1, 'mad_dump_linkspeedsup@@IBMAD_1.3' => 1, 'mad_dump_linkwidth@@IBMAD_1.3' => 1, 'mad_dump_linkwidthen@@IBMAD_1.3' => 1, 'mad_dump_linkwidthsup@@IBMAD_1.3' => 1, 'mad_dump_mlnx_ext_port_info@@IBMAD_1.3' => 1, 'mad_dump_mtu@@IBMAD_1.3' => 1, 'mad_dump_node_type@@IBMAD_1.3' => 1, 'mad_dump_nodedesc@@IBMAD_1.3' => 1, 'mad_dump_nodeinfo@@IBMAD_1.3' => 1, 'mad_dump_opervls@@IBMAD_1.3' => 1, 'mad_dump_perfcounters@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_ext@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_port_flow_ctl_counters@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_port_op_rcv_counters@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_port_vl_op_data@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_port_vl_op_packet@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_port_vl_xmit_flow_ctl_update_errors@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_port_vl_xmit_wait_counters@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_rcv_con_ctrl@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_rcv_err@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_rcv_sl@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_sl_rcv_becn@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_sl_rcv_fecn@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_sw_port_vl_congestion@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_vl_xmit_time_cong@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_xmit_con_ctrl@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_xmt_disc@@IBMAD_1.3' => 1, 'mad_dump_perfcounters_xmt_sl@@IBMAD_1.3' => 1, 'mad_dump_physportstate@@IBMAD_1.3' => 1, 'mad_dump_port_ext_speeds_counters@@IBMAD_1.3' => 1, 'mad_dump_port_ext_speeds_counters_rsfec_active@@IBMAD_1.3' => 1, 'mad_dump_portcapmask2@@IBMAD_1.3' => 1, 'mad_dump_portcapmask@@IBMAD_1.3' => 1, 'mad_dump_portinfo@@IBMAD_1.3' => 1, 'mad_dump_portinfo_ext@@IBMAD_1.3' => 1, 'mad_dump_portsamples_control@@IBMAD_1.3' => 1, 'mad_dump_portsamples_result@@IBMAD_1.3' => 1, 'mad_dump_portstate@@IBMAD_1.3' => 1, 'mad_dump_portstates@@IBMAD_1.3' => 1, 'mad_dump_rhex@@IBMAD_1.3' => 1, 'mad_dump_sltovl@@IBMAD_1.3' => 1, 'mad_dump_string@@IBMAD_1.3' => 1, 'mad_dump_switchinfo@@IBMAD_1.3' => 1, 'mad_dump_uint@@IBMAD_1.3' => 1, 'mad_dump_val@@IBMAD_1.3' => 1, 'mad_dump_vlarbitration@@IBMAD_1.3' => 1, 'mad_dump_vlcap@@IBMAD_1.3' => 1, 'mad_encode@@IBMAD_1.3' => 1, 'mad_encode_field@@IBMAD_1.3' => 1, 'mad_field_name@@IBMAD_1.3' => 1, 'mad_free@@IBMAD_1.3' => 1, 'mad_get_array@@IBMAD_1.3' => 1, 'mad_get_field64@@IBMAD_1.3' => 1, 'mad_get_field@@IBMAD_1.3' => 1, 'mad_get_retries@@IBMAD_1.3' => 1, 
'mad_get_timeout@@IBMAD_1.3' => 1, 'mad_print_field@@IBMAD_1.3' => 1, 'mad_receive@@IBMAD_1.3' => 1, 'mad_receive_via@@IBMAD_1.3' => 1, 'mad_register_client@@IBMAD_1.3' => 1, 'mad_register_client_via@@IBMAD_1.3' => 1, 'mad_register_server@@IBMAD_1.3' => 1, 'mad_register_server_via@@IBMAD_1.3' => 1, 'mad_respond@@IBMAD_1.3' => 1, 'mad_respond_via@@IBMAD_1.3' => 1, 'mad_rpc@@IBMAD_1.3' => 1, 'mad_rpc_class_agent@@IBMAD_1.3' => 1, 'mad_rpc_close_port2@@IBMAD_1.5' => 1, 'mad_rpc_close_port@@IBMAD_1.3' => 1, 'mad_rpc_open_port2@@IBMAD_1.5' => 1, 'mad_rpc_open_port@@IBMAD_1.3' => 1, 'mad_rpc_portid@@IBMAD_1.3' => 1, 'mad_rpc_rmpp@@IBMAD_1.3' => 1, 'mad_rpc_set_retries@@IBMAD_1.3' => 1, 'mad_rpc_set_timeout@@IBMAD_1.3' => 1, 'mad_send@@IBMAD_1.3' => 1, 'mad_send_via@@IBMAD_1.3' => 1, 'mad_set_array@@IBMAD_1.3' => 1, 'mad_set_field64@@IBMAD_1.3' => 1, 'mad_set_field@@IBMAD_1.3' => 1, 'mad_trid@@IBMAD_1.3' => 1, 'madrpc@@IBMAD_1.3' => 1, 'madrpc_init@@IBMAD_1.3' => 1, 'madrpc_portid@@IBMAD_1.3' => 1, 'madrpc_rmpp@@IBMAD_1.3' => 1, 'madrpc_save_mad@@IBMAD_1.3' => 1, 'madrpc_set_retries@@IBMAD_1.3' => 1, 'madrpc_set_timeout@@IBMAD_1.3' => 1, 'madrpc_show_errors@@IBMAD_1.3' => 1, 'performance_reset_via@@IBMAD_1.3' => 1, 'pma_query_via@@IBMAD_1.3' => 1, 'portid2portnum@@IBMAD_1.3' => 1, 'portid2str@@IBMAD_1.3' => 1, 'sa_call@@IBMAD_1.3' => 1, 'sa_rpc_call@@IBMAD_1.3' => 1, 'smp_mkey_get@@IBMAD_1.3' => 1, 'smp_mkey_set@@IBMAD_1.3' => 1, 'smp_query@@IBMAD_1.3' => 1, 'smp_query_status_via@@IBMAD_1.3' => 1, 'smp_query_via@@IBMAD_1.3' => 1, 'smp_set@@IBMAD_1.3' => 1, 'smp_set_status_via@@IBMAD_1.3' => 1, 'smp_set_via@@IBMAD_1.3' => 1, 'str2drpath@@IBMAD_1.3' => 1, 'xdump@@IBMAD_1.3' => 1 } }, 'Target' => 'unix', 'TypeInfo' => { '1' => { 'Name' => 'void', 'Type' => 'Intrinsic' }, '101' => { 'Name' => 'unsigned char', 'Size' => '1', 'Type' => 'Intrinsic' }, '1036' => { 'BaseType' => '810', 'Header' => undef, 'Line' => '243', 'Name' => 'ibmad_gid_t', 'Size' => '16', 'Type' => 'Typedef' }, '108' => { 'Name' => 'unsigned int', 'Size' => '4', 'Type' => 'Intrinsic' }, '114394' => { 'Header' => undef, 'Line' => '1382', 'Memb' => { '0' => { 'name' => 'attrid', 'offset' => '0', 'type' => '108' }, '1' => { 'name' => 'mod', 'offset' => '4', 'type' => '108' }, '2' => { 'name' => 'mask', 'offset' => '8', 'type' => '268' }, '3' => { 'name' => 'method', 'offset' => '22', 'type' => '108' }, '4' => { 'name' => 'trid', 'offset' => '36', 'type' => '268' }, '5' => { 'name' => 'recsz', 'offset' => '50', 'type' => '108' }, '6' => { 'name' => 'rmpp', 'offset' => '54', 'type' => '114383' } }, 'Name' => 'struct ib_sa_call', 'Size' => '56', 'Type' => 'Struct' }, '114498' => { 'BaseType' => '114394', 'Header' => undef, 'Line' => '1391', 'Name' => 'ib_sa_call_t', 'Size' => '56', 'Type' => 'Typedef' }, '115' => { 'Name' => 'signed char', 'Size' => '1', 'Type' => 'Intrinsic' }, '116278' => { 'BaseType' => '114498', 'Name' => 'ib_sa_call_t*', 'Size' => '8', 'Type' => 'Pointer' }, '122' => { 'BaseType' => '101', 'Header' => undef, 'Line' => '38', 'Name' => '__uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '132603' => { 'Header' => undef, 'Line' => '1393', 'Memb' => { '0' => { 'name' => 'method', 'offset' => '0', 'type' => '108' }, '1' => { 'name' => 'mgmt_class', 'offset' => '4', 'type' => '108' }, '2' => { 'name' => 'attrid', 'offset' => '8', 'type' => '108' }, '3' => { 'name' => 'mod', 'offset' => '18', 'type' => '108' }, '4' => { 'name' => 'oui', 'offset' => '22', 'type' => '256' }, '5' => { 'name' => 'timeout', 'offset' => '32', 
'type' => '108' }, '6' => { 'name' => 'rmpp', 'offset' => '36', 'type' => '132592' } }, 'Name' => 'struct ib_vendor_call', 'Size' => '44', 'Type' => 'Struct' }, '132707' => { 'BaseType' => '132603', 'Header' => undef, 'Line' => '1401', 'Name' => 'ib_vendor_call_t', 'Size' => '44', 'Type' => 'Typedef' }, '133777' => { 'BaseType' => '132707', 'Name' => 'ib_vendor_call_t*', 'Size' => '8', 'Type' => 'Pointer' }, '1356' => { 'Header' => undef, 'Line' => '308', 'Memb' => { '0' => { 'name' => 'lid', 'offset' => '0', 'type' => '72' }, '1' => { 'name' => 'drpath', 'offset' => '4', 'type' => '1122' }, '2' => { 'name' => 'grh_present', 'offset' => '118', 'type' => '72' }, '3' => { 'name' => 'gid', 'offset' => '128', 'type' => '1036' }, '4' => { 'name' => 'qp', 'offset' => '150', 'type' => '256' }, '5' => { 'name' => 'qkey', 'offset' => '256', 'type' => '256' }, '6' => { 'name' => 'sl', 'offset' => '260', 'type' => '232' }, '7' => { 'name' => 'pkey_idx', 'offset' => '264', 'type' => '108' } }, 'Name' => 'struct portid', 'Size' => '112', 'Type' => 'Struct' }, '1471' => { 'BaseType' => '1356', 'Header' => undef, 'Line' => '317', 'Name' => 'ib_portid_t', 'Size' => '112', 'Type' => 'Typedef' }, '153' => { 'BaseType' => '108', 'Header' => undef, 'Line' => '42', 'Name' => '__uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '1632' => { 'Header' => undef, 'Line' => '1403', 'Memb' => { '0' => { 'name' => 'method', 'offset' => '0', 'type' => '108' }, '1' => { 'name' => 'attrid', 'offset' => '4', 'type' => '108' }, '2' => { 'name' => 'mod', 'offset' => '8', 'type' => '108' }, '3' => { 'name' => 'timeout', 'offset' => '18', 'type' => '108' }, '4' => { 'name' => 'bkey', 'offset' => '22', 'type' => '268' } }, 'Name' => 'struct ib_bm_call', 'Size' => '24', 'Type' => 'Struct' }, '165' => { 'Name' => 'long', 'Size' => '8', 'Type' => 'Intrinsic' }, '1710' => { 'BaseType' => '1632', 'Header' => undef, 'Line' => '1409', 'Name' => 'ib_bm_call_t', 'Size' => '24', 'Type' => 'Typedef' }, '172' => { 'BaseType' => '58', 'Header' => undef, 'Line' => '45', 'Name' => '__uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '1721' => { 'Header' => undef, 'Line' => '39', 'Memb' => { '0' => { 'name' => 'port_id', 'offset' => '0', 'type' => '72' }, '1' => { 'name' => 'class_agents', 'offset' => '4', 'type' => '5015' }, '2' => { 'name' => 'timeout', 'offset' => '4136', 'type' => '72' }, '3' => { 'name' => 'retries', 'offset' => '4146', 'type' => '72' }, '4' => { 'name' => 'smp_mkey', 'offset' => '4160', 'type' => '268' } }, 'Name' => 'struct ibmad_port', 'Size' => '1048', 'Type' => 'Struct' }, '1726' => { 'BaseType' => '1721', 'Name' => 'struct ibmad_port const', 'Type' => 'Const' }, '1731' => { 'BaseType' => '1721', 'Name' => 'struct ibmad_port*', 'Size' => '8', 'Type' => 'Pointer' }, '1829' => { 'BaseType' => '1471', 'Name' => 'ib_portid_t*', 'Size' => '8', 'Type' => 'Pointer' }, '184' => { 'BaseType' => '165', 'Header' => undef, 'Line' => '152', 'Name' => '__off_t', 'Size' => '8', 'Type' => 'Typedef' }, '1881' => { 'BaseType' => '1726', 'Name' => 'struct ibmad_port const*', 'Size' => '8', 'Type' => 'Pointer' }, '196' => { 'BaseType' => '165', 'Header' => undef, 'Line' => '153', 'Name' => '__off64_t', 'Size' => '8', 'Type' => 'Typedef' }, '220' => { 'BaseType' => '1', 'Name' => 'void*', 'Size' => '8', 'Type' => 'Pointer' }, '227' => { 'BaseType' => '89', 'Name' => 'char*', 'Size' => '8', 'Type' => 'Pointer' }, '232' => { 'BaseType' => '122', 'Header' => undef, 'Line' => '24', 'Name' => 'uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, 
'256' => { 'BaseType' => '153', 'Header' => undef, 'Line' => '26', 'Name' => 'uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '268' => { 'BaseType' => '172', 'Header' => undef, 'Line' => '27', 'Name' => 'uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '2912' => { 'BaseType' => '232', 'Name' => 'uint8_t*', 'Size' => '8', 'Type' => 'Pointer' }, '2917' => { 'BaseType' => '1710', 'Name' => 'ib_bm_call_t*', 'Size' => '8', 'Type' => 'Pointer' }, '305' => { 'Header' => undef, 'Line' => '49', 'Memb' => { '0' => { 'name' => '_flags', 'offset' => '0', 'type' => '72' }, '1' => { 'name' => '_IO_read_ptr', 'offset' => '8', 'type' => '227' }, '10' => { 'name' => '_IO_backup_base', 'offset' => '128', 'type' => '227' }, '11' => { 'name' => '_IO_save_end', 'offset' => '136', 'type' => '227' }, '12' => { 'name' => '_markers', 'offset' => '150', 'type' => '721' }, '13' => { 'name' => '_chain', 'offset' => '260', 'type' => '726' }, '14' => { 'name' => '_fileno', 'offset' => '274', 'type' => '72' }, '15' => { 'name' => '_flags2', 'offset' => '278', 'type' => '72' }, '16' => { 'name' => '_old_offset', 'offset' => '288', 'type' => '184' }, '17' => { 'name' => '_cur_column', 'offset' => '296', 'type' => '65' }, '18' => { 'name' => '_vtable_offset', 'offset' => '304', 'type' => '115' }, '19' => { 'name' => '_shortbuf', 'offset' => '305', 'type' => '731' }, '2' => { 'name' => '_IO_read_end', 'offset' => '22', 'type' => '227' }, '20' => { 'name' => '_lock', 'offset' => '310', 'type' => '747' }, '21' => { 'name' => '_offset', 'offset' => '324', 'type' => '196' }, '22' => { 'name' => '_codecvt', 'offset' => '338', 'type' => '757' }, '23' => { 'name' => '_wide_data', 'offset' => '352', 'type' => '767' }, '24' => { 'name' => '_freeres_list', 'offset' => '360', 'type' => '726' }, '25' => { 'name' => '_freeres_buf', 'offset' => '374', 'type' => '220' }, '26' => { 'name' => '__pad5', 'offset' => '388', 'type' => '46' }, '27' => { 'name' => '_mode', 'offset' => '402', 'type' => '72' }, '28' => { 'name' => '_unused2', 'offset' => '406', 'type' => '772' }, '3' => { 'name' => '_IO_read_base', 'offset' => '36', 'type' => '227' }, '4' => { 'name' => '_IO_write_base', 'offset' => '50', 'type' => '227' }, '5' => { 'name' => '_IO_write_ptr', 'offset' => '64', 'type' => '227' }, '6' => { 'name' => '_IO_write_end', 'offset' => '72', 'type' => '227' }, '7' => { 'name' => '_IO_buf_base', 'offset' => '86', 'type' => '227' }, '8' => { 'name' => '_IO_buf_end', 'offset' => '100', 'type' => '227' }, '9' => { 'name' => '_IO_save_base', 'offset' => '114', 'type' => '227' } }, 'Name' => 'struct _IO_FILE', 'Size' => '216', 'Type' => 'Struct' }, '46' => { 'BaseType' => '58', 'Header' => undef, 'Line' => '214', 'Name' => 'size_t', 'Size' => '8', 'Type' => 'Typedef' }, '5015' => { 'BaseType' => '72', 'Name' => 'int[256]', 'Size' => '1024', 'Type' => 'Array' }, '5588' => { 'BaseType' => '72', 'Name' => 'int*', 'Size' => '8', 'Type' => 'Pointer' }, '58' => { 'Name' => 'unsigned long', 'Size' => '8', 'Type' => 'Intrinsic' }, '65' => { 'Name' => 'unsigned short', 'Size' => '2', 'Type' => 'Intrinsic' }, '6858' => { 'Header' => undef, 'Line' => '330', 'Memb' => { '0' => { 'name' => 'IB_NO_FIELD', 'value' => '0' }, '1' => { 'name' => 'IB_GID_PREFIX_F', 'value' => '1' }, '10' => { 'name' => 'IB_DRSMP_HOPPTR_F', 'value' => '10' }, '100' => { 'name' => 'IB_SW_OPT_SLTOVL_MAPPING_F', 'value' => '97' }, '101' => { 'name' => 'IB_SW_LIDS_PER_PORT_F', 'value' => '98' }, '102' => { 'name' => 'IB_SW_PARTITION_ENFORCE_CAP_F', 'value' => '99' }, '103' => { 'name' 
=> 'IB_SW_PARTITION_ENF_INB_F', 'value' => '100' }, '104' => { 'name' => 'IB_SW_PARTITION_ENF_OUTB_F', 'value' => '101' }, '105' => { 'name' => 'IB_SW_FILTER_RAW_INB_F', 'value' => '102' }, '106' => { 'name' => 'IB_SW_FILTER_RAW_OUTB_F', 'value' => '103' }, '107' => { 'name' => 'IB_SW_ENHANCED_PORT0_F', 'value' => '104' }, '108' => { 'name' => 'IB_SW_MCAST_FDB_TOP_F', 'value' => '105' }, '109' => { 'name' => 'IB_SW_LAST_F', 'value' => '106' }, '11' => { 'name' => 'IB_DRSMP_STATUS_F', 'value' => '11' }, '110' => { 'name' => 'IB_LINEAR_FORW_TBL_F', 'value' => '107' }, '111' => { 'name' => 'IB_MULTICAST_FORW_TBL_F', 'value' => '108' }, '112' => { 'name' => 'IB_NODE_DESC_F', 'value' => '109' }, '113' => { 'name' => 'IB_NOTICE_IS_GENERIC_F', 'value' => '110' }, '114' => { 'name' => 'IB_NOTICE_TYPE_F', 'value' => '111' }, '115' => { 'name' => 'IB_NOTICE_PRODUCER_F', 'value' => '112' }, '116' => { 'name' => 'IB_NOTICE_TRAP_NUMBER_F', 'value' => '113' }, '117' => { 'name' => 'IB_NOTICE_ISSUER_LID_F', 'value' => '114' }, '118' => { 'name' => 'IB_NOTICE_TOGGLE_F', 'value' => '115' }, '119' => { 'name' => 'IB_NOTICE_COUNT_F', 'value' => '116' }, '12' => { 'name' => 'IB_DRSMP_DIRECTION_F', 'value' => '12' }, '120' => { 'name' => 'IB_NOTICE_DATA_DETAILS_F', 'value' => '117' }, '121' => { 'name' => 'IB_NOTICE_DATA_LID_F', 'value' => '118' }, '122' => { 'name' => 'IB_NOTICE_DATA_144_LID_F', 'value' => '119' }, '123' => { 'name' => 'IB_NOTICE_DATA_144_CAPMASK_F', 'value' => '120' }, '124' => { 'name' => 'IB_PC_FIRST_F', 'value' => '121' }, '125' => { 'name' => 'IB_PC_PORT_SELECT_F', 'value' => '121' }, '126' => { 'name' => 'IB_PC_COUNTER_SELECT_F', 'value' => '122' }, '127' => { 'name' => 'IB_PC_ERR_SYM_F', 'value' => '123' }, '128' => { 'name' => 'IB_PC_LINK_RECOVERS_F', 'value' => '124' }, '129' => { 'name' => 'IB_PC_LINK_DOWNED_F', 'value' => '125' }, '13' => { 'name' => 'IB_MAD_TRID_F', 'value' => '13' }, '130' => { 'name' => 'IB_PC_ERR_RCV_F', 'value' => '126' }, '131' => { 'name' => 'IB_PC_ERR_PHYSRCV_F', 'value' => '127' }, '132' => { 'name' => 'IB_PC_ERR_SWITCH_REL_F', 'value' => '128' }, '133' => { 'name' => 'IB_PC_XMT_DISCARDS_F', 'value' => '129' }, '134' => { 'name' => 'IB_PC_ERR_XMTCONSTR_F', 'value' => '130' }, '135' => { 'name' => 'IB_PC_ERR_RCVCONSTR_F', 'value' => '131' }, '136' => { 'name' => 'IB_PC_COUNTER_SELECT2_F', 'value' => '132' }, '137' => { 'name' => 'IB_PC_ERR_LOCALINTEG_F', 'value' => '133' }, '138' => { 'name' => 'IB_PC_ERR_EXCESS_OVR_F', 'value' => '134' }, '139' => { 'name' => 'IB_PC_VL15_DROPPED_F', 'value' => '135' }, '14' => { 'name' => 'IB_MAD_ATTRID_F', 'value' => '14' }, '140' => { 'name' => 'IB_PC_XMT_BYTES_F', 'value' => '136' }, '141' => { 'name' => 'IB_PC_RCV_BYTES_F', 'value' => '137' }, '142' => { 'name' => 'IB_PC_XMT_PKTS_F', 'value' => '138' }, '143' => { 'name' => 'IB_PC_RCV_PKTS_F', 'value' => '139' }, '144' => { 'name' => 'IB_PC_XMT_WAIT_F', 'value' => '140' }, '145' => { 'name' => 'IB_PC_LAST_F', 'value' => '141' }, '146' => { 'name' => 'IB_SMINFO_GUID_F', 'value' => '142' }, '147' => { 'name' => 'IB_SMINFO_KEY_F', 'value' => '143' }, '148' => { 'name' => 'IB_SMINFO_ACT_F', 'value' => '144' }, '149' => { 'name' => 'IB_SMINFO_PRIO_F', 'value' => '145' }, '15' => { 'name' => 'IB_MAD_ATTRMOD_F', 'value' => '15' }, '150' => { 'name' => 'IB_SMINFO_STATE_F', 'value' => '146' }, '151' => { 'name' => 'IB_SA_RMPP_VERS_F', 'value' => '147' }, '152' => { 'name' => 'IB_SA_RMPP_TYPE_F', 'value' => '148' }, '153' => { 'name' => 'IB_SA_RMPP_RESP_F', 'value' => '149' }, 
'154' => { 'name' => 'IB_SA_RMPP_FLAGS_F', 'value' => '150' }, '155' => { 'name' => 'IB_SA_RMPP_STATUS_F', 'value' => '151' }, '156' => { 'name' => 'IB_SA_RMPP_D1_F', 'value' => '152' }, '157' => { 'name' => 'IB_SA_RMPP_SEGNUM_F', 'value' => '153' }, '158' => { 'name' => 'IB_SA_RMPP_D2_F', 'value' => '154' }, '159' => { 'name' => 'IB_SA_RMPP_LEN_F', 'value' => '155' }, '16' => { 'name' => 'IB_MAD_MKEY_F', 'value' => '16' }, '160' => { 'name' => 'IB_SA_RMPP_NEWWIN_F', 'value' => '156' }, '161' => { 'name' => 'IB_SA_MP_NPATH_F', 'value' => '157' }, '162' => { 'name' => 'IB_SA_MP_NSRC_F', 'value' => '158' }, '163' => { 'name' => 'IB_SA_MP_NDEST_F', 'value' => '159' }, '164' => { 'name' => 'IB_SA_MP_GID0_F', 'value' => '160' }, '165' => { 'name' => 'IB_SA_PR_DGID_F', 'value' => '161' }, '166' => { 'name' => 'IB_SA_PR_SGID_F', 'value' => '162' }, '167' => { 'name' => 'IB_SA_PR_DLID_F', 'value' => '163' }, '168' => { 'name' => 'IB_SA_PR_SLID_F', 'value' => '164' }, '169' => { 'name' => 'IB_SA_PR_NPATH_F', 'value' => '165' }, '17' => { 'name' => 'IB_DRSMP_DRDLID_F', 'value' => '17' }, '170' => { 'name' => 'IB_SA_PR_SL_F', 'value' => '166' }, '171' => { 'name' => 'IB_SA_MCM_MGID_F', 'value' => '167' }, '172' => { 'name' => 'IB_SA_MCM_PORTGID_F', 'value' => '168' }, '173' => { 'name' => 'IB_SA_MCM_QKEY_F', 'value' => '169' }, '174' => { 'name' => 'IB_SA_MCM_MLID_F', 'value' => '170' }, '175' => { 'name' => 'IB_SA_MCM_SL_F', 'value' => '171' }, '176' => { 'name' => 'IB_SA_MCM_MTU_F', 'value' => '172' }, '177' => { 'name' => 'IB_SA_MCM_RATE_F', 'value' => '173' }, '178' => { 'name' => 'IB_SA_MCM_TCLASS_F', 'value' => '174' }, '179' => { 'name' => 'IB_SA_MCM_PKEY_F', 'value' => '175' }, '18' => { 'name' => 'IB_DRSMP_DRSLID_F', 'value' => '18' }, '180' => { 'name' => 'IB_SA_MCM_FLOW_LABEL_F', 'value' => '176' }, '181' => { 'name' => 'IB_SA_MCM_JOIN_STATE_F', 'value' => '177' }, '182' => { 'name' => 'IB_SA_MCM_PROXY_JOIN_F', 'value' => '178' }, '183' => { 'name' => 'IB_SA_SR_ID_F', 'value' => '179' }, '184' => { 'name' => 'IB_SA_SR_GID_F', 'value' => '180' }, '185' => { 'name' => 'IB_SA_SR_PKEY_F', 'value' => '181' }, '186' => { 'name' => 'IB_SA_SR_LEASE_F', 'value' => '182' }, '187' => { 'name' => 'IB_SA_SR_KEY_F', 'value' => '183' }, '188' => { 'name' => 'IB_SA_SR_NAME_F', 'value' => '184' }, '189' => { 'name' => 'IB_SA_SR_DATA_F', 'value' => '185' }, '19' => { 'name' => 'IB_SA_MKEY_F', 'value' => '19' }, '190' => { 'name' => 'IB_ATS_SM_NODE_ADDR_F', 'value' => '186' }, '191' => { 'name' => 'IB_ATS_SM_MAGIC_KEY_F', 'value' => '187' }, '192' => { 'name' => 'IB_ATS_SM_NODE_TYPE_F', 'value' => '188' }, '193' => { 'name' => 'IB_ATS_SM_NODE_NAME_F', 'value' => '189' }, '194' => { 'name' => 'IB_SLTOVL_MAPPING_TABLE_F', 'value' => '190' }, '195' => { 'name' => 'IB_VL_ARBITRATION_TABLE_F', 'value' => '191' }, '196' => { 'name' => 'IB_VEND2_OUI_F', 'value' => '192' }, '197' => { 'name' => 'IB_VEND2_DATA_F', 'value' => '193' }, '198' => { 'name' => 'IB_PC_EXT_FIRST_F', 'value' => '194' }, '199' => { 'name' => 'IB_PC_EXT_PORT_SELECT_F', 'value' => '194' }, '2' => { 'name' => 'IB_GID_GUID_F', 'value' => '2' }, '20' => { 'name' => 'IB_SA_ATTROFFS_F', 'value' => '20' }, '200' => { 'name' => 'IB_PC_EXT_COUNTER_SELECT_F', 'value' => '195' }, '201' => { 'name' => 'IB_PC_EXT_XMT_BYTES_F', 'value' => '196' }, '202' => { 'name' => 'IB_PC_EXT_RCV_BYTES_F', 'value' => '197' }, '203' => { 'name' => 'IB_PC_EXT_XMT_PKTS_F', 'value' => '198' }, '204' => { 'name' => 'IB_PC_EXT_RCV_PKTS_F', 'value' => '199' }, '205' => { 'name' 
=> 'IB_PC_EXT_XMT_UPKTS_F', 'value' => '200' }, '206' => { 'name' => 'IB_PC_EXT_RCV_UPKTS_F', 'value' => '201' }, '207' => { 'name' => 'IB_PC_EXT_XMT_MPKTS_F', 'value' => '202' }, '208' => { 'name' => 'IB_PC_EXT_RCV_MPKTS_F', 'value' => '203' }, '209' => { 'name' => 'IB_PC_EXT_LAST_F', 'value' => '204' }, '21' => { 'name' => 'IB_SA_COMPMASK_F', 'value' => '21' }, '210' => { 'name' => 'IB_GUID_GUID0_F', 'value' => '205' }, '211' => { 'name' => 'IB_CPI_BASEVER_F', 'value' => '206' }, '212' => { 'name' => 'IB_CPI_CLASSVER_F', 'value' => '207' }, '213' => { 'name' => 'IB_CPI_CAPMASK_F', 'value' => '208' }, '214' => { 'name' => 'IB_CPI_CAPMASK2_F', 'value' => '209' }, '215' => { 'name' => 'IB_CPI_RESP_TIME_VALUE_F', 'value' => '210' }, '216' => { 'name' => 'IB_CPI_REDIRECT_GID_F', 'value' => '211' }, '217' => { 'name' => 'IB_CPI_REDIRECT_TC_F', 'value' => '212' }, '218' => { 'name' => 'IB_CPI_REDIRECT_SL_F', 'value' => '213' }, '219' => { 'name' => 'IB_CPI_REDIRECT_FL_F', 'value' => '214' }, '22' => { 'name' => 'IB_SA_DATA_F', 'value' => '22' }, '220' => { 'name' => 'IB_CPI_REDIRECT_LID_F', 'value' => '215' }, '221' => { 'name' => 'IB_CPI_REDIRECT_PKEY_F', 'value' => '216' }, '222' => { 'name' => 'IB_CPI_REDIRECT_QP_F', 'value' => '217' }, '223' => { 'name' => 'IB_CPI_REDIRECT_QKEY_F', 'value' => '218' }, '224' => { 'name' => 'IB_CPI_TRAP_GID_F', 'value' => '219' }, '225' => { 'name' => 'IB_CPI_TRAP_TC_F', 'value' => '220' }, '226' => { 'name' => 'IB_CPI_TRAP_SL_F', 'value' => '221' }, '227' => { 'name' => 'IB_CPI_TRAP_FL_F', 'value' => '222' }, '228' => { 'name' => 'IB_CPI_TRAP_LID_F', 'value' => '223' }, '229' => { 'name' => 'IB_CPI_TRAP_PKEY_F', 'value' => '224' }, '23' => { 'name' => 'IB_SM_DATA_F', 'value' => '23' }, '230' => { 'name' => 'IB_CPI_TRAP_HL_F', 'value' => '225' }, '231' => { 'name' => 'IB_CPI_TRAP_QP_F', 'value' => '226' }, '232' => { 'name' => 'IB_CPI_TRAP_QKEY_F', 'value' => '227' }, '233' => { 'name' => 'IB_PC_XMT_DATA_SL_FIRST_F', 'value' => '228' }, '234' => { 'name' => 'IB_PC_XMT_DATA_SL0_F', 'value' => '228' }, '235' => { 'name' => 'IB_PC_XMT_DATA_SL1_F', 'value' => '229' }, '236' => { 'name' => 'IB_PC_XMT_DATA_SL2_F', 'value' => '230' }, '237' => { 'name' => 'IB_PC_XMT_DATA_SL3_F', 'value' => '231' }, '238' => { 'name' => 'IB_PC_XMT_DATA_SL4_F', 'value' => '232' }, '239' => { 'name' => 'IB_PC_XMT_DATA_SL5_F', 'value' => '233' }, '24' => { 'name' => 'IB_GS_DATA_F', 'value' => '24' }, '240' => { 'name' => 'IB_PC_XMT_DATA_SL6_F', 'value' => '234' }, '241' => { 'name' => 'IB_PC_XMT_DATA_SL7_F', 'value' => '235' }, '242' => { 'name' => 'IB_PC_XMT_DATA_SL8_F', 'value' => '236' }, '243' => { 'name' => 'IB_PC_XMT_DATA_SL9_F', 'value' => '237' }, '244' => { 'name' => 'IB_PC_XMT_DATA_SL10_F', 'value' => '238' }, '245' => { 'name' => 'IB_PC_XMT_DATA_SL11_F', 'value' => '239' }, '246' => { 'name' => 'IB_PC_XMT_DATA_SL12_F', 'value' => '240' }, '247' => { 'name' => 'IB_PC_XMT_DATA_SL13_F', 'value' => '241' }, '248' => { 'name' => 'IB_PC_XMT_DATA_SL14_F', 'value' => '242' }, '249' => { 'name' => 'IB_PC_XMT_DATA_SL15_F', 'value' => '243' }, '25' => { 'name' => 'IB_DRSMP_PATH_F', 'value' => '25' }, '250' => { 'name' => 'IB_PC_XMT_DATA_SL_LAST_F', 'value' => '244' }, '251' => { 'name' => 'IB_PC_RCV_DATA_SL_FIRST_F', 'value' => '245' }, '252' => { 'name' => 'IB_PC_RCV_DATA_SL0_F', 'value' => '245' }, '253' => { 'name' => 'IB_PC_RCV_DATA_SL1_F', 'value' => '246' }, '254' => { 'name' => 'IB_PC_RCV_DATA_SL2_F', 'value' => '247' }, '255' => { 'name' => 'IB_PC_RCV_DATA_SL3_F', 'value' => 
'248' }, '256' => { 'name' => 'IB_PC_RCV_DATA_SL4_F', 'value' => '249' }, '257' => { 'name' => 'IB_PC_RCV_DATA_SL5_F', 'value' => '250' }, '258' => { 'name' => 'IB_PC_RCV_DATA_SL6_F', 'value' => '251' }, '259' => { 'name' => 'IB_PC_RCV_DATA_SL7_F', 'value' => '252' }, '26' => { 'name' => 'IB_DRSMP_RPATH_F', 'value' => '26' }, '260' => { 'name' => 'IB_PC_RCV_DATA_SL8_F', 'value' => '253' }, '261' => { 'name' => 'IB_PC_RCV_DATA_SL9_F', 'value' => '254' }, '262' => { 'name' => 'IB_PC_RCV_DATA_SL10_F', 'value' => '255' }, '263' => { 'name' => 'IB_PC_RCV_DATA_SL11_F', 'value' => '256' }, '264' => { 'name' => 'IB_PC_RCV_DATA_SL12_F', 'value' => '257' }, '265' => { 'name' => 'IB_PC_RCV_DATA_SL13_F', 'value' => '258' }, '266' => { 'name' => 'IB_PC_RCV_DATA_SL14_F', 'value' => '259' }, '267' => { 'name' => 'IB_PC_RCV_DATA_SL15_F', 'value' => '260' }, '268' => { 'name' => 'IB_PC_RCV_DATA_SL_LAST_F', 'value' => '261' }, '269' => { 'name' => 'IB_PC_XMT_INACT_DISC_F', 'value' => '262' }, '27' => { 'name' => 'IB_PORT_FIRST_F', 'value' => '27' }, '270' => { 'name' => 'IB_PC_XMT_NEIGH_MTU_DISC_F', 'value' => '263' }, '271' => { 'name' => 'IB_PC_XMT_SW_LIFE_DISC_F', 'value' => '264' }, '272' => { 'name' => 'IB_PC_XMT_SW_HOL_DISC_F', 'value' => '265' }, '273' => { 'name' => 'IB_PC_XMT_DISC_LAST_F', 'value' => '266' }, '274' => { 'name' => 'IB_PC_RCV_LOCAL_PHY_ERR_F', 'value' => '267' }, '275' => { 'name' => 'IB_PC_RCV_MALFORMED_PKT_ERR_F', 'value' => '268' }, '276' => { 'name' => 'IB_PC_RCV_BUF_OVR_ERR_F', 'value' => '269' }, '277' => { 'name' => 'IB_PC_RCV_DLID_MAP_ERR_F', 'value' => '270' }, '278' => { 'name' => 'IB_PC_RCV_VL_MAP_ERR_F', 'value' => '271' }, '279' => { 'name' => 'IB_PC_RCV_LOOPING_ERR_F', 'value' => '272' }, '28' => { 'name' => 'IB_PORT_MKEY_F', 'value' => '27' }, '280' => { 'name' => 'IB_PC_RCV_ERR_LAST_F', 'value' => '273' }, '281' => { 'name' => 'IB_PSC_OPCODE_F', 'value' => '274' }, '282' => { 'name' => 'IB_PSC_PORT_SELECT_F', 'value' => '275' }, '283' => { 'name' => 'IB_PSC_TICK_F', 'value' => '276' }, '284' => { 'name' => 'IB_PSC_COUNTER_WIDTH_F', 'value' => '277' }, '285' => { 'name' => 'IB_PSC_COUNTER_MASK0_F', 'value' => '278' }, '286' => { 'name' => 'IB_PSC_COUNTER_MASKS1TO9_F', 'value' => '279' }, '287' => { 'name' => 'IB_PSC_COUNTER_MASKS10TO14_F', 'value' => '280' }, '288' => { 'name' => 'IB_PSC_SAMPLE_MECHS_F', 'value' => '281' }, '289' => { 'name' => 'IB_PSC_SAMPLE_STATUS_F', 'value' => '282' }, '29' => { 'name' => 'IB_PORT_GID_PREFIX_F', 'value' => '28' }, '290' => { 'name' => 'IB_PSC_OPTION_MASK_F', 'value' => '283' }, '291' => { 'name' => 'IB_PSC_VENDOR_MASK_F', 'value' => '284' }, '292' => { 'name' => 'IB_PSC_SAMPLE_START_F', 'value' => '285' }, '293' => { 'name' => 'IB_PSC_SAMPLE_INTVL_F', 'value' => '286' }, '294' => { 'name' => 'IB_PSC_TAG_F', 'value' => '287' }, '295' => { 'name' => 'IB_PSC_COUNTER_SEL0_F', 'value' => '288' }, '296' => { 'name' => 'IB_PSC_COUNTER_SEL1_F', 'value' => '289' }, '297' => { 'name' => 'IB_PSC_COUNTER_SEL2_F', 'value' => '290' }, '298' => { 'name' => 'IB_PSC_COUNTER_SEL3_F', 'value' => '291' }, '299' => { 'name' => 'IB_PSC_COUNTER_SEL4_F', 'value' => '292' }, '3' => { 'name' => 'IB_MAD_METHOD_F', 'value' => '3' }, '30' => { 'name' => 'IB_PORT_LID_F', 'value' => '29' }, '300' => { 'name' => 'IB_PSC_COUNTER_SEL5_F', 'value' => '293' }, '301' => { 'name' => 'IB_PSC_COUNTER_SEL6_F', 'value' => '294' }, '302' => { 'name' => 'IB_PSC_COUNTER_SEL7_F', 'value' => '295' }, '303' => { 'name' => 'IB_PSC_COUNTER_SEL8_F', 'value' => '296' }, '304' => { 
'name' => 'IB_PSC_COUNTER_SEL9_F', 'value' => '297' }, '305' => { 'name' => 'IB_PSC_COUNTER_SEL10_F', 'value' => '298' }, '306' => { 'name' => 'IB_PSC_COUNTER_SEL11_F', 'value' => '299' }, '307' => { 'name' => 'IB_PSC_COUNTER_SEL12_F', 'value' => '300' }, '308' => { 'name' => 'IB_PSC_COUNTER_SEL13_F', 'value' => '301' }, '309' => { 'name' => 'IB_PSC_COUNTER_SEL14_F', 'value' => '302' }, '31' => { 'name' => 'IB_PORT_SMLID_F', 'value' => '30' }, '310' => { 'name' => 'IB_PSC_SAMPLES_ONLY_OPT_MASK_F', 'value' => '303' }, '311' => { 'name' => 'IB_PSC_LAST_F', 'value' => '304' }, '312' => { 'name' => 'IB_GI_GUID0_F', 'value' => '305' }, '313' => { 'name' => 'IB_GI_GUID1_F', 'value' => '306' }, '314' => { 'name' => 'IB_GI_GUID2_F', 'value' => '307' }, '315' => { 'name' => 'IB_GI_GUID3_F', 'value' => '308' }, '316' => { 'name' => 'IB_GI_GUID4_F', 'value' => '309' }, '317' => { 'name' => 'IB_GI_GUID5_F', 'value' => '310' }, '318' => { 'name' => 'IB_GI_GUID6_F', 'value' => '311' }, '319' => { 'name' => 'IB_GI_GUID7_F', 'value' => '312' }, '32' => { 'name' => 'IB_PORT_CAPMASK_F', 'value' => '31' }, '320' => { 'name' => 'IB_SA_GIR_LID_F', 'value' => '313' }, '321' => { 'name' => 'IB_SA_GIR_BLOCKNUM_F', 'value' => '314' }, '322' => { 'name' => 'IB_SA_GIR_GUID0_F', 'value' => '315' }, '323' => { 'name' => 'IB_SA_GIR_GUID1_F', 'value' => '316' }, '324' => { 'name' => 'IB_SA_GIR_GUID2_F', 'value' => '317' }, '325' => { 'name' => 'IB_SA_GIR_GUID3_F', 'value' => '318' }, '326' => { 'name' => 'IB_SA_GIR_GUID4_F', 'value' => '319' }, '327' => { 'name' => 'IB_SA_GIR_GUID5_F', 'value' => '320' }, '328' => { 'name' => 'IB_SA_GIR_GUID6_F', 'value' => '321' }, '329' => { 'name' => 'IB_SA_GIR_GUID7_F', 'value' => '322' }, '33' => { 'name' => 'IB_PORT_DIAG_F', 'value' => '32' }, '330' => { 'name' => 'IB_PORT_CAPMASK2_F', 'value' => '323' }, '331' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_ACTIVE_F', 'value' => '324' }, '332' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_SUPPORTED_F', 'value' => '325' }, '333' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_ENABLED_F', 'value' => '326' }, '334' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_LAST_F', 'value' => '327' }, '335' => { 'name' => 'IB_PESC_PORT_SELECT_F', 'value' => '328' }, '336' => { 'name' => 'IB_PESC_COUNTER_SELECT_F', 'value' => '329' }, '337' => { 'name' => 'IB_PESC_SYNC_HDR_ERR_CTR_F', 'value' => '330' }, '338' => { 'name' => 'IB_PESC_UNK_BLOCK_CTR_F', 'value' => '331' }, '339' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE0_F', 'value' => '332' }, '34' => { 'name' => 'IB_PORT_MKEY_LEASE_F', 'value' => '33' }, '340' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE1_F', 'value' => '333' }, '341' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE2_F', 'value' => '334' }, '342' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE3_F', 'value' => '335' }, '343' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE4_F', 'value' => '336' }, '344' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE5_F', 'value' => '337' }, '345' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE6_F', 'value' => '338' }, '346' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE7_F', 'value' => '339' }, '347' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE8_F', 'value' => '340' }, '348' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE9_F', 'value' => '341' }, '349' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE10_F', 'value' => '342' }, '35' => { 'name' => 'IB_PORT_LOCAL_PORT_F', 'value' => '34' }, '350' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE11_F', 'value' => '343' }, '351' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE0_F', 'value' => '344' }, '352' => { 'name' => 
'IB_PESC_FEC_CORR_BLOCK_CTR_LANE1_F', 'value' => '345' }, '353' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE2_F', 'value' => '346' }, '354' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE3_F', 'value' => '347' }, '355' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE4_F', 'value' => '348' }, '356' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE5_F', 'value' => '349' }, '357' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE6_F', 'value' => '350' }, '358' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE7_F', 'value' => '351' }, '359' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE8_F', 'value' => '352' }, '36' => { 'name' => 'IB_PORT_LINK_WIDTH_ENABLED_F', 'value' => '35' }, '360' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE9_F', 'value' => '353' }, '361' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE10_F', 'value' => '354' }, '362' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE11_F', 'value' => '355' }, '363' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE0_F', 'value' => '356' }, '364' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE1_F', 'value' => '357' }, '365' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE2_F', 'value' => '358' }, '366' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE3_F', 'value' => '359' }, '367' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE4_F', 'value' => '360' }, '368' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE5_F', 'value' => '361' }, '369' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE6_F', 'value' => '362' }, '37' => { 'name' => 'IB_PORT_LINK_WIDTH_SUPPORTED_F', 'value' => '36' }, '370' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE7_F', 'value' => '363' }, '371' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE8_F', 'value' => '364' }, '372' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE9_F', 'value' => '365' }, '373' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE10_F', 'value' => '366' }, '374' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE11_F', 'value' => '367' }, '375' => { 'name' => 'IB_PESC_LAST_F', 'value' => '368' }, '376' => { 'name' => 'IB_PC_PORT_OP_RCV_COUNTERS_FIRST_F', 'value' => '369' }, '377' => { 'name' => 'IB_PC_PORT_OP_RCV_PKTS_F', 'value' => '369' }, '378' => { 'name' => 'IB_PC_PORT_OP_RCV_DATA_F', 'value' => '370' }, '379' => { 'name' => 'IB_PC_PORT_OP_RCV_COUNTERS_LAST_F', 'value' => '371' }, '38' => { 'name' => 'IB_PORT_LINK_WIDTH_ACTIVE_F', 'value' => '37' }, '380' => { 'name' => 'IB_PC_PORT_FLOW_CTL_COUNTERS_FIRST_F', 'value' => '372' }, '381' => { 'name' => 'IB_PC_PORT_XMIT_FLOW_PKTS_F', 'value' => '372' }, '382' => { 'name' => 'IB_PC_PORT_RCV_FLOW_PKTS_F', 'value' => '373' }, '383' => { 'name' => 'IB_PC_PORT_FLOW_CTL_COUNTERS_LAST_F', 'value' => '374' }, '384' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS_FIRST_F', 'value' => '375' }, '385' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS0_F', 'value' => '375' }, '386' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS1_F', 'value' => '376' }, '387' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS2_F', 'value' => '377' }, '388' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS3_F', 'value' => '378' }, '389' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS4_F', 'value' => '379' }, '39' => { 'name' => 'IB_PORT_LINK_SPEED_SUPPORTED_F', 'value' => '38' }, '390' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS5_F', 'value' => '380' }, '391' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS6_F', 'value' => '381' }, '392' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS7_F', 'value' => '382' }, '393' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS8_F', 'value' => '383' }, '394' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS9_F', 'value' 
=> '384' }, '395' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS10_F', 'value' => '385' }, '396' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS11_F', 'value' => '386' }, '397' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS12_F', 'value' => '387' }, '398' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS13_F', 'value' => '388' }, '399' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS14_F', 'value' => '389' }, '4' => { 'name' => 'IB_MAD_RESPONSE_F', 'value' => '4' }, '40' => { 'name' => 'IB_PORT_STATE_F', 'value' => '39' }, '400' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS15_F', 'value' => '390' }, '401' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS_LAST_F', 'value' => '391' }, '402' => { 'name' => 'IB_PC_PORT_VL_OP_DATA_FIRST_F', 'value' => '392' }, '403' => { 'name' => 'IB_PC_PORT_VL_OP_DATA0_F', 'value' => '392' }, '404' => { 'name' => 'IB_PC_PORT_VL_OP_DATA1_F', 'value' => '393' }, '405' => { 'name' => 'IB_PC_PORT_VL_OP_DATA2_F', 'value' => '394' }, '406' => { 'name' => 'IB_PC_PORT_VL_OP_DATA3_F', 'value' => '395' }, '407' => { 'name' => 'IB_PC_PORT_VL_OP_DATA4_F', 'value' => '396' }, '408' => { 'name' => 'IB_PC_PORT_VL_OP_DATA5_F', 'value' => '397' }, '409' => { 'name' => 'IB_PC_PORT_VL_OP_DATA6_F', 'value' => '398' }, '41' => { 'name' => 'IB_PORT_PHYS_STATE_F', 'value' => '40' }, '410' => { 'name' => 'IB_PC_PORT_VL_OP_DATA7_F', 'value' => '399' }, '411' => { 'name' => 'IB_PC_PORT_VL_OP_DATA8_F', 'value' => '400' }, '412' => { 'name' => 'IB_PC_PORT_VL_OP_DATA9_F', 'value' => '401' }, '413' => { 'name' => 'IB_PC_PORT_VL_OP_DATA10_F', 'value' => '402' }, '414' => { 'name' => 'IB_PC_PORT_VL_OP_DATA11_F', 'value' => '403' }, '415' => { 'name' => 'IB_PC_PORT_VL_OP_DATA12_F', 'value' => '404' }, '416' => { 'name' => 'IB_PC_PORT_VL_OP_DATA13_F', 'value' => '405' }, '417' => { 'name' => 'IB_PC_PORT_VL_OP_DATA14_F', 'value' => '406' }, '418' => { 'name' => 'IB_PC_PORT_VL_OP_DATA15_F', 'value' => '407' }, '419' => { 'name' => 'IB_PC_PORT_VL_OP_DATA_LAST_F', 'value' => '408' }, '42' => { 'name' => 'IB_PORT_LINK_DOWN_DEF_F', 'value' => '41' }, '420' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS_FIRST_F', 'value' => '409' }, '421' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS0_F', 'value' => '409' }, '422' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS1_F', 'value' => '410' }, '423' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS2_F', 'value' => '411' }, '424' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS3_F', 'value' => '412' }, '425' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS4_F', 'value' => '413' }, '426' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS5_F', 'value' => '414' }, '427' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS6_F', 'value' => '415' }, '428' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS7_F', 'value' => '416' }, '429' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS8_F', 'value' => '417' }, '43' => { 'name' => 'IB_PORT_MKEY_PROT_BITS_F', 'value' => '42' }, '430' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS9_F', 'value' => '418' }, '431' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS10_F', 'value' => '419' }, '432' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS11_F', 'value' => '420' }, '433' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS12_F', 'value' => '421' }, '434' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS13_F', 'value' => '422' }, '435' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS14_F', 'value' => '423' }, '436' => { 'name' => 
'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS15_F', 'value' => '424' }, '437' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS_LAST_F', 'value' => '425' }, '438' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT_COUNTERS_FIRST_F', 'value' => '426' }, '439' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT0_F', 'value' => '426' }, '44' => { 'name' => 'IB_PORT_LMC_F', 'value' => '43' }, '440' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT1_F', 'value' => '427' }, '441' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT2_F', 'value' => '428' }, '442' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT3_F', 'value' => '429' }, '443' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT4_F', 'value' => '430' }, '444' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT5_F', 'value' => '431' }, '445' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT6_F', 'value' => '432' }, '446' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT7_F', 'value' => '433' }, '447' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT8_F', 'value' => '434' }, '448' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT9_F', 'value' => '435' }, '449' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT10_F', 'value' => '436' }, '45' => { 'name' => 'IB_PORT_LINK_SPEED_ACTIVE_F', 'value' => '44' }, '450' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT11_F', 'value' => '437' }, '451' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT12_F', 'value' => '438' }, '452' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT13_F', 'value' => '439' }, '453' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT14_F', 'value' => '440' }, '454' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT15_F', 'value' => '441' }, '455' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT_COUNTERS_LAST_F', 'value' => '442' }, '456' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION_FIRST_F', 'value' => '443' }, '457' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION0_F', 'value' => '443' }, '458' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION1_F', 'value' => '444' }, '459' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION2_F', 'value' => '445' }, '46' => { 'name' => 'IB_PORT_LINK_SPEED_ENABLED_F', 'value' => '45' }, '460' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION3_F', 'value' => '446' }, '461' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION4_F', 'value' => '447' }, '462' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION5_F', 'value' => '448' }, '463' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION6_F', 'value' => '449' }, '464' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION7_F', 'value' => '450' }, '465' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION8_F', 'value' => '451' }, '466' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION9_F', 'value' => '452' }, '467' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION10_F', 'value' => '453' }, '468' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION11_F', 'value' => '454' }, '469' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION12_F', 'value' => '455' }, '47' => { 'name' => 'IB_PORT_NEIGHBOR_MTU_F', 'value' => '46' }, '470' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION13_F', 'value' => '456' }, '471' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION14_F', 'value' => '457' }, '472' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION15_F', 'value' => '458' }, '473' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION_LAST_F', 'value' => '459' }, '474' => { 'name' => 'IB_PC_RCV_CON_CTRL_FIRST_F', 'value' => '460' }, '475' => { 'name' => 'IB_PC_RCV_CON_CTRL_PKT_RCV_FECN_F', 'value' => '460' }, '476' => { 'name' => 'IB_PC_RCV_CON_CTRL_PKT_RCV_BECN_F', 'value' => '461' }, '477' => { 'name' => 'IB_PC_RCV_CON_CTRL_LAST_F', 'value' => '462' }, '478' => { 'name' => 'IB_PC_SL_RCV_FECN_FIRST_F', 'value' => '463' }, '479' => { 'name' => 'IB_PC_SL_RCV_FECN0_F', 'value' => '463' }, '48' => { 'name' => 
'IB_PORT_SMSL_F', 'value' => '47' }, '480' => { 'name' => 'IB_PC_SL_RCV_FECN1_F', 'value' => '464' }, '481' => { 'name' => 'IB_PC_SL_RCV_FECN2_F', 'value' => '465' }, '482' => { 'name' => 'IB_PC_SL_RCV_FECN3_F', 'value' => '466' }, '483' => { 'name' => 'IB_PC_SL_RCV_FECN4_F', 'value' => '467' }, '484' => { 'name' => 'IB_PC_SL_RCV_FECN5_F', 'value' => '468' }, '485' => { 'name' => 'IB_PC_SL_RCV_FECN6_F', 'value' => '469' }, '486' => { 'name' => 'IB_PC_SL_RCV_FECN7_F', 'value' => '470' }, '487' => { 'name' => 'IB_PC_SL_RCV_FECN8_F', 'value' => '471' }, '488' => { 'name' => 'IB_PC_SL_RCV_FECN9_F', 'value' => '472' }, '489' => { 'name' => 'IB_PC_SL_RCV_FECN10_F', 'value' => '473' }, '49' => { 'name' => 'IB_PORT_VL_CAP_F', 'value' => '48' }, '490' => { 'name' => 'IB_PC_SL_RCV_FECN11_F', 'value' => '474' }, '491' => { 'name' => 'IB_PC_SL_RCV_FECN12_F', 'value' => '475' }, '492' => { 'name' => 'IB_PC_SL_RCV_FECN13_F', 'value' => '476' }, '493' => { 'name' => 'IB_PC_SL_RCV_FECN14_F', 'value' => '477' }, '494' => { 'name' => 'IB_PC_SL_RCV_FECN15_F', 'value' => '478' }, '495' => { 'name' => 'IB_PC_SL_RCV_FECN_LAST_F', 'value' => '479' }, '496' => { 'name' => 'IB_PC_SL_RCV_BECN_FIRST_F', 'value' => '480' }, '497' => { 'name' => 'IB_PC_SL_RCV_BECN0_F', 'value' => '480' }, '498' => { 'name' => 'IB_PC_SL_RCV_BECN1_F', 'value' => '481' }, '499' => { 'name' => 'IB_PC_SL_RCV_BECN2_F', 'value' => '482' }, '5' => { 'name' => 'IB_MAD_CLASSVER_F', 'value' => '5' }, '50' => { 'name' => 'IB_PORT_INIT_TYPE_F', 'value' => '49' }, '500' => { 'name' => 'IB_PC_SL_RCV_BECN3_F', 'value' => '483' }, '501' => { 'name' => 'IB_PC_SL_RCV_BECN4_F', 'value' => '484' }, '502' => { 'name' => 'IB_PC_SL_RCV_BECN5_F', 'value' => '485' }, '503' => { 'name' => 'IB_PC_SL_RCV_BECN6_F', 'value' => '486' }, '504' => { 'name' => 'IB_PC_SL_RCV_BECN7_F', 'value' => '487' }, '505' => { 'name' => 'IB_PC_SL_RCV_BECN8_F', 'value' => '488' }, '506' => { 'name' => 'IB_PC_SL_RCV_BECN9_F', 'value' => '489' }, '507' => { 'name' => 'IB_PC_SL_RCV_BECN10_F', 'value' => '490' }, '508' => { 'name' => 'IB_PC_SL_RCV_BECN11_F', 'value' => '491' }, '509' => { 'name' => 'IB_PC_SL_RCV_BECN12_F', 'value' => '492' }, '51' => { 'name' => 'IB_PORT_VL_HIGH_LIMIT_F', 'value' => '50' }, '510' => { 'name' => 'IB_PC_SL_RCV_BECN13_F', 'value' => '493' }, '511' => { 'name' => 'IB_PC_SL_RCV_BECN14_F', 'value' => '494' }, '512' => { 'name' => 'IB_PC_SL_RCV_BECN15_F', 'value' => '495' }, '513' => { 'name' => 'IB_PC_SL_RCV_BECN_LAST_F', 'value' => '496' }, '514' => { 'name' => 'IB_PC_XMIT_CON_CTRL_FIRST_F', 'value' => '497' }, '515' => { 'name' => 'IB_PC_XMIT_CON_CTRL_TIME_CONG_F', 'value' => '497' }, '516' => { 'name' => 'IB_PC_XMIT_CON_CTRL_LAST_F', 'value' => '498' }, '517' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG_FIRST_F', 'value' => '499' }, '518' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG0_F', 'value' => '499' }, '519' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG1_F', 'value' => '500' }, '52' => { 'name' => 'IB_PORT_VL_ARBITRATION_HIGH_CAP_F', 'value' => '51' }, '520' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG2_F', 'value' => '501' }, '521' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG3_F', 'value' => '502' }, '522' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG4_F', 'value' => '503' }, '523' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG5_F', 'value' => '504' }, '524' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG6_F', 'value' => '505' }, '525' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG7_F', 'value' => '506' }, '526' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG8_F', 'value' => '507' }, '527' => { 'name' => 
'IB_PC_VL_XMIT_TIME_CONG9_F', 'value' => '508' }, '528' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG10_F', 'value' => '509' }, '529' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG11_F', 'value' => '510' }, '53' => { 'name' => 'IB_PORT_VL_ARBITRATION_LOW_CAP_F', 'value' => '52' }, '530' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG12_F', 'value' => '511' }, '531' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG13_F', 'value' => '512' }, '532' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG14_F', 'value' => '513' }, '533' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG_LAST_F', 'value' => '514' }, '534' => { 'name' => 'IB_MLNX_EXT_PORT_STATE_CHG_ENABLE_F', 'value' => '515' }, '535' => { 'name' => 'IB_MLNX_EXT_PORT_LINK_SPEED_SUPPORTED_F', 'value' => '516' }, '536' => { 'name' => 'IB_MLNX_EXT_PORT_LINK_SPEED_ENABLED_F', 'value' => '517' }, '537' => { 'name' => 'IB_MLNX_EXT_PORT_LINK_SPEED_ACTIVE_F', 'value' => '518' }, '538' => { 'name' => 'IB_MLNX_EXT_PORT_LAST_F', 'value' => '519' }, '539' => { 'name' => 'IB_CC_CCKEY_F', 'value' => '520' }, '54' => { 'name' => 'IB_PORT_INIT_TYPE_REPLY_F', 'value' => '53' }, '540' => { 'name' => 'IB_CC_CONGESTION_INFO_FIRST_F', 'value' => '521' }, '541' => { 'name' => 'IB_CC_CONGESTION_INFO_F', 'value' => '521' }, '542' => { 'name' => 'IB_CC_CONGESTION_INFO_CONTROL_TABLE_CAP_F', 'value' => '522' }, '543' => { 'name' => 'IB_CC_CONGESTION_INFO_LAST_F', 'value' => '523' }, '544' => { 'name' => 'IB_CC_CONGESTION_KEY_INFO_FIRST_F', 'value' => '524' }, '545' => { 'name' => 'IB_CC_CONGESTION_KEY_INFO_CC_KEY_F', 'value' => '524' }, '546' => { 'name' => 'IB_CC_CONGESTION_KEY_INFO_CC_KEY_PROTECT_BIT_F', 'value' => '525' }, '547' => { 'name' => 'IB_CC_CONGESTION_KEY_INFO_CC_KEY_LEASE_PERIOD_F', 'value' => '526' }, '548' => { 'name' => 'IB_CC_CONGESTION_KEY_INFO_CC_KEY_VIOLATIONS_F', 'value' => '527' }, '549' => { 'name' => 'IB_CC_CONGESTION_KEY_INFO_LAST_F', 'value' => '528' }, '55' => { 'name' => 'IB_PORT_MTU_CAP_F', 'value' => '54' }, '550' => { 'name' => 'IB_CC_CONGESTION_LOG_FIRST_F', 'value' => '529' }, '551' => { 'name' => 'IB_CC_CONGESTION_LOG_LOGTYPE_F', 'value' => '529' }, '552' => { 'name' => 'IB_CC_CONGESTION_LOG_CONGESTION_FLAGS_F', 'value' => '530' }, '553' => { 'name' => 'IB_CC_CONGESTION_LOG_LAST_F', 'value' => '531' }, '554' => { 'name' => 'IB_CC_CONGESTION_LOG_SWITCH_FIRST_F', 'value' => '532' }, '555' => { 'name' => 'IB_CC_CONGESTION_LOG_SWITCH_LOG_EVENTS_COUNTER_F', 'value' => '532' }, '556' => { 'name' => 'IB_CC_CONGESTION_LOG_SWITCH_CURRENT_TIME_STAMP_F', 'value' => '533' }, '557' => { 'name' => 'IB_CC_CONGESTION_LOG_SWITCH_PORTMAP_F', 'value' => '534' }, '558' => { 'name' => 'IB_CC_CONGESTION_LOG_SWITCH_LAST_F', 'value' => '535' }, '559' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_SWITCH_FIRST_F', 'value' => '536' }, '56' => { 'name' => 'IB_PORT_VL_STALL_COUNT_F', 'value' => '55' }, '560' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_SWITCH_SLID_F', 'value' => '536' }, '561' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_SWITCH_DLID_F', 'value' => '537' }, '562' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_SWITCH_SL_F', 'value' => '538' }, '563' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_SWITCH_TIMESTAMP_F', 'value' => '539' }, '564' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_SWITCH_LAST_F', 'value' => '540' }, '565' => { 'name' => 'IB_CC_CONGESTION_LOG_CA_FIRST_F', 'value' => '541' }, '566' => { 'name' => 'IB_CC_CONGESTION_LOG_CA_THRESHOLD_EVENT_COUNTER_F', 'value' => '541' }, '567' => { 'name' => 'IB_CC_CONGESTION_LOG_CA_THRESHOLD_CONGESTION_EVENT_MAP_F', 'value' => '542' }, '568' => { 'name' => 
'IB_CC_CONGESTION_LOG_CA_CURRENT_TIMESTAMP_F', 'value' => '543' }, '569' => { 'name' => 'IB_CC_CONGESTION_LOG_CA_LAST_F', 'value' => '544' }, '57' => { 'name' => 'IB_PORT_HOQ_LIFE_F', 'value' => '56' }, '570' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_FIRST_F', 'value' => '545' }, '571' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_LOCAL_QP_CN_ENTRY_F', 'value' => '545' }, '572' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_SL_CN_ENTRY_F', 'value' => '546' }, '573' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_SERVICE_TYPE_CN_ENTRY_F', 'value' => '547' }, '574' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_REMOTE_QP_NUMBER_CN_ENTRY_F', 'value' => '548' }, '575' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_LOCAL_LID_CN_F', 'value' => '549' }, '576' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_REMOTE_LID_CN_ENTRY_F', 'value' => '550' }, '577' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_TIMESTAMP_CN_ENTRY_F', 'value' => '551' }, '578' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_LAST_F', 'value' => '552' }, '579' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_FIRST_F', 'value' => '553' }, '58' => { 'name' => 'IB_PORT_OPER_VLS_F', 'value' => '57' }, '580' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_CONTROL_MAP_F', 'value' => '553' }, '581' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_VICTIM_MASK_F', 'value' => '554' }, '582' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_CREDIT_MASK_F', 'value' => '555' }, '583' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_THRESHOLD_F', 'value' => '556' }, '584' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_PACKET_SIZE_F', 'value' => '557' }, '585' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_CS_THRESHOLD_F', 'value' => '558' }, '586' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_CS_RETURN_DELAY_F', 'value' => '559' }, '587' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_MARKING_RATE_F', 'value' => '560' }, '588' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_LAST_F', 'value' => '561' }, '589' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_FIRST_F', 'value' => '562' }, '59' => { 'name' => 'IB_PORT_PART_EN_INB_F', 'value' => '58' }, '590' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_VALID_F', 'value' => '562' }, '591' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_CONTROL_TYPE_F', 'value' => '563' }, '592' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_THRESHOLD_F', 'value' => '564' }, '593' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_PACKET_SIZE_F', 'value' => '565' }, '594' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_CONG_PARM_MARKING_RATE_F', 'value' => '566' }, '595' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_LAST_F', 'value' => '567' }, '596' => { 'name' => 'IB_CC_CA_CONGESTION_SETTING_FIRST_F', 'value' => '568' }, '597' => { 'name' => 'IB_CC_CA_CONGESTION_SETTING_PORT_CONTROL_F', 'value' => '568' }, '598' => { 'name' => 'IB_CC_CA_CONGESTION_SETTING_CONTROL_MAP_F', 'value' => '569' }, '599' => { 'name' => 'IB_CC_CA_CONGESTION_SETTING_LAST_F', 'value' => '570' }, '6' => { 'name' => 'IB_MAD_MGMTCLASS_F', 'value' => '6' }, '60' => { 'name' => 'IB_PORT_PART_EN_OUTB_F', 'value' => '59' }, '600' => { 'name' => 'IB_CC_CA_CONGESTION_ENTRY_FIRST_F', 'value' => '571' }, '601' => { 'name' => 'IB_CC_CA_CONGESTION_ENTRY_CCTI_TIMER_F', 'value' => '571' }, '602' => { 'name' => 'IB_CC_CA_CONGESTION_ENTRY_CCTI_INCREASE_F', 'value' => '572' }, '603' => { 'name' => 'IB_CC_CA_CONGESTION_ENTRY_TRIGGER_THRESHOLD_F', 'value' => '573' }, '604' 
=> { 'name' => 'IB_CC_CA_CONGESTION_ENTRY_CCTI_MIN_F', 'value' => '574' }, '605' => { 'name' => 'IB_CC_CA_CONGESTION_ENTRY_LAST_F', 'value' => '575' }, '606' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_FIRST_F', 'value' => '576' }, '607' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_CCTI_LIMIT_F', 'value' => '576' }, '608' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_LAST_F', 'value' => '577' }, '609' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_FIRST_F', 'value' => '578' }, '61' => { 'name' => 'IB_PORT_FILTER_RAW_INB_F', 'value' => '60' }, '610' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_CCT_SHIFT_F', 'value' => '578' }, '611' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_CCT_MULTIPLIER_F', 'value' => '579' }, '612' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_LAST_F', 'value' => '580' }, '613' => { 'name' => 'IB_CC_TIMESTAMP_FIRST_F', 'value' => '581' }, '614' => { 'name' => 'IB_CC_TIMESTAMP_F', 'value' => '581' }, '615' => { 'name' => 'IB_CC_TIMESTAMP_LAST_F', 'value' => '582' }, '616' => { 'name' => 'IB_SA_NR_FIRST_F', 'value' => '583' }, '617' => { 'name' => 'IB_SA_NR_LID_F', 'value' => '583' }, '618' => { 'name' => 'IB_SA_NR_BASEVER_F', 'value' => '584' }, '619' => { 'name' => 'IB_SA_NR_CLASSVER_F', 'value' => '585' }, '62' => { 'name' => 'IB_PORT_FILTER_RAW_OUTB_F', 'value' => '61' }, '620' => { 'name' => 'IB_SA_NR_TYPE_F', 'value' => '586' }, '621' => { 'name' => 'IB_SA_NR_NPORTS_F', 'value' => '587' }, '622' => { 'name' => 'IB_SA_NR_SYSTEM_GUID_F', 'value' => '588' }, '623' => { 'name' => 'IB_SA_NR_GUID_F', 'value' => '589' }, '624' => { 'name' => 'IB_SA_NR_PORT_GUID_F', 'value' => '590' }, '625' => { 'name' => 'IB_SA_NR_PARTITION_CAP_F', 'value' => '591' }, '626' => { 'name' => 'IB_SA_NR_DEVID_F', 'value' => '592' }, '627' => { 'name' => 'IB_SA_NR_REVISION_F', 'value' => '593' }, '628' => { 'name' => 'IB_SA_NR_LOCAL_PORT_F', 'value' => '594' }, '629' => { 'name' => 'IB_SA_NR_VENDORID_F', 'value' => '595' }, '63' => { 'name' => 'IB_PORT_MKEY_VIOL_F', 'value' => '62' }, '630' => { 'name' => 'IB_SA_NR_NODEDESC_F', 'value' => '596' }, '631' => { 'name' => 'IB_SA_NR_LAST_F', 'value' => '597' }, '632' => { 'name' => 'IB_PSR_TAG_F', 'value' => '598' }, '633' => { 'name' => 'IB_PSR_SAMPLE_STATUS_F', 'value' => '599' }, '634' => { 'name' => 'IB_PSR_COUNTER0_F', 'value' => '600' }, '635' => { 'name' => 'IB_PSR_COUNTER1_F', 'value' => '601' }, '636' => { 'name' => 'IB_PSR_COUNTER2_F', 'value' => '602' }, '637' => { 'name' => 'IB_PSR_COUNTER3_F', 'value' => '603' }, '638' => { 'name' => 'IB_PSR_COUNTER4_F', 'value' => '604' }, '639' => { 'name' => 'IB_PSR_COUNTER5_F', 'value' => '605' }, '64' => { 'name' => 'IB_PORT_PKEY_VIOL_F', 'value' => '63' }, '640' => { 'name' => 'IB_PSR_COUNTER6_F', 'value' => '606' }, '641' => { 'name' => 'IB_PSR_COUNTER7_F', 'value' => '607' }, '642' => { 'name' => 'IB_PSR_COUNTER8_F', 'value' => '608' }, '643' => { 'name' => 'IB_PSR_COUNTER9_F', 'value' => '609' }, '644' => { 'name' => 'IB_PSR_COUNTER10_F', 'value' => '610' }, '645' => { 'name' => 'IB_PSR_COUNTER11_F', 'value' => '611' }, '646' => { 'name' => 'IB_PSR_COUNTER12_F', 'value' => '612' }, '647' => { 'name' => 'IB_PSR_COUNTER13_F', 'value' => '613' }, '648' => { 'name' => 'IB_PSR_COUNTER14_F', 'value' => '614' }, '649' => { 'name' => 'IB_PSR_LAST_F', 'value' => '615' }, '65' => { 'name' => 'IB_PORT_QKEY_VIOL_F', 'value' => '64' }, '650' => { 'name' => 'IB_PORT_EXT_FIRST_F', 'value' => '616' }, '651' => { 'name' => 'IB_PORT_EXT_CAPMASK_F', 'value' => '616' }, '652' 
=> { 'name' => 'IB_PORT_EXT_FEC_MODE_ACTIVE_F', 'value' => '617' }, '653' => { 'name' => 'IB_PORT_EXT_FDR_FEC_MODE_SUPPORTED_F', 'value' => '618' }, '654' => { 'name' => 'IB_PORT_EXT_FDR_FEC_MODE_ENABLED_F', 'value' => '619' }, '655' => { 'name' => 'IB_PORT_EXT_EDR_FEC_MODE_SUPPORTED_F', 'value' => '620' }, '656' => { 'name' => 'IB_PORT_EXT_EDR_FEC_MODE_ENABLED_F', 'value' => '621' }, '657' => { 'name' => 'IB_PORT_EXT_LAST_F', 'value' => '622' }, '658' => { 'name' => 'IB_PESC_RSFEC_FIRST_F', 'value' => '623' }, '659' => { 'name' => 'IB_PESC_RSFEC_PORT_SELECT_F', 'value' => '623' }, '66' => { 'name' => 'IB_PORT_GUID_CAP_F', 'value' => '65' }, '660' => { 'name' => 'IB_PESC_RSFEC_COUNTER_SELECT_F', 'value' => '624' }, '661' => { 'name' => 'IB_PESC_RSFEC_SYNC_HDR_ERR_CTR_F', 'value' => '625' }, '662' => { 'name' => 'IB_PESC_RSFEC_UNK_BLOCK_CTR_F', 'value' => '626' }, '663' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE0_F', 'value' => '627' }, '664' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE1_F', 'value' => '628' }, '665' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE2_F', 'value' => '629' }, '666' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE3_F', 'value' => '630' }, '667' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE4_F', 'value' => '631' }, '668' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE5_F', 'value' => '632' }, '669' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE6_F', 'value' => '633' }, '67' => { 'name' => 'IB_PORT_CLIENT_REREG_F', 'value' => '66' }, '670' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE7_F', 'value' => '634' }, '671' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE8_F', 'value' => '635' }, '672' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE9_F', 'value' => '636' }, '673' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE10_F', 'value' => '637' }, '674' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE11_F', 'value' => '638' }, '675' => { 'name' => 'IB_PESC_PORT_FEC_CORR_BLOCK_CTR_F', 'value' => '639' }, '676' => { 'name' => 'IB_PESC_PORT_FEC_UNCORR_BLOCK_CTR_F', 'value' => '640' }, '677' => { 'name' => 'IB_PESC_PORT_FEC_CORR_SYMBOL_CTR_F', 'value' => '641' }, '678' => { 'name' => 'IB_PESC_RSFEC_LAST_F', 'value' => '642' }, '679' => { 'name' => 'IB_PC_EXT_COUNTER_SELECT2_F', 'value' => '643' }, '68' => { 'name' => 'IB_PORT_MCAST_PKEY_SUPR_ENAB_F', 'value' => '67' }, '680' => { 'name' => 'IB_PC_EXT_ERR_SYM_F', 'value' => '644' }, '681' => { 'name' => 'IB_PC_EXT_LINK_RECOVERS_F', 'value' => '645' }, '682' => { 'name' => 'IB_PC_EXT_LINK_DOWNED_F', 'value' => '646' }, '683' => { 'name' => 'IB_PC_EXT_ERR_RCV_F', 'value' => '647' }, '684' => { 'name' => 'IB_PC_EXT_ERR_PHYSRCV_F', 'value' => '648' }, '685' => { 'name' => 'IB_PC_EXT_ERR_SWITCH_REL_F', 'value' => '649' }, '686' => { 'name' => 'IB_PC_EXT_XMT_DISCARDS_F', 'value' => '650' }, '687' => { 'name' => 'IB_PC_EXT_ERR_XMTCONSTR_F', 'value' => '651' }, '688' => { 'name' => 'IB_PC_EXT_ERR_RCVCONSTR_F', 'value' => '652' }, '689' => { 'name' => 'IB_PC_EXT_ERR_LOCALINTEG_F', 'value' => '653' }, '69' => { 'name' => 'IB_PORT_SUBN_TIMEOUT_F', 'value' => '68' }, '690' => { 'name' => 'IB_PC_EXT_ERR_EXCESS_OVR_F', 'value' => '654' }, '691' => { 'name' => 'IB_PC_EXT_VL15_DROPPED_F', 'value' => '655' }, '692' => { 'name' => 'IB_PC_EXT_XMT_WAIT_F', 'value' => '656' }, '693' => { 'name' => 'IB_PC_EXT_QP1_DROP_F', 'value' => '657' }, '694' => { 'name' => 'IB_PC_EXT_ERR_LAST_F', 'value' => '658' }, '695' => { 'name' => 'IB_PC_QP1_DROP_F', 
'value' => '659' }, '696' => { 'name' => 'IB_PORT_EXT_HDR_FEC_MODE_SUPPORTED_F', 'value' => '660' }, '697' => { 'name' => 'IB_PORT_EXT_HDR_FEC_MODE_ENABLED_F', 'value' => '661' }, '698' => { 'name' => 'IB_PORT_EXT_HDR_FEC_MODE_LAST_F', 'value' => '662' }, '699' => { 'name' => 'IB_PORT_EXT_NDR_FEC_MODE_SUPPORTED_F', 'value' => '663' }, '7' => { 'name' => 'IB_MAD_BASEVER_F', 'value' => '7' }, '70' => { 'name' => 'IB_PORT_RESP_TIME_VAL_F', 'value' => '69' }, '700' => { 'name' => 'IB_PORT_EXT_NDR_FEC_MODE_ENABLED_F', 'value' => '664' }, '701' => { 'name' => 'IB_PORT_EXT_NDR_FEC_MODE_LAST_F', 'value' => '665' }, '702' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_ACTIVE_2_F', 'value' => '666' }, '703' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_SUPPORTED_2_F', 'value' => '667' }, '704' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_ENABLED_2_F', 'value' => '668' }, '705' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_2_LAST_F', 'value' => '669' }, '706' => { 'name' => 'IB_FIELD_LAST_', 'value' => '670' }, '71' => { 'name' => 'IB_PORT_LOCAL_PHYS_ERR_F', 'value' => '70' }, '72' => { 'name' => 'IB_PORT_OVERRUN_ERR_F', 'value' => '71' }, '73' => { 'name' => 'IB_PORT_MAX_CREDIT_HINT_F', 'value' => '72' }, '74' => { 'name' => 'IB_PORT_LINK_ROUND_TRIP_F', 'value' => '73' }, '75' => { 'name' => 'IB_PORT_LAST_F', 'value' => '74' }, '76' => { 'name' => 'IB_NODE_FIRST_F', 'value' => '75' }, '77' => { 'name' => 'IB_NODE_BASE_VERS_F', 'value' => '75' }, '78' => { 'name' => 'IB_NODE_CLASS_VERS_F', 'value' => '76' }, '79' => { 'name' => 'IB_NODE_TYPE_F', 'value' => '77' }, '8' => { 'name' => 'IB_MAD_STATUS_F', 'value' => '8' }, '80' => { 'name' => 'IB_NODE_NPORTS_F', 'value' => '78' }, '81' => { 'name' => 'IB_NODE_SYSTEM_GUID_F', 'value' => '79' }, '82' => { 'name' => 'IB_NODE_GUID_F', 'value' => '80' }, '83' => { 'name' => 'IB_NODE_PORT_GUID_F', 'value' => '81' }, '84' => { 'name' => 'IB_NODE_PARTITION_CAP_F', 'value' => '82' }, '85' => { 'name' => 'IB_NODE_DEVID_F', 'value' => '83' }, '86' => { 'name' => 'IB_NODE_REVISION_F', 'value' => '84' }, '87' => { 'name' => 'IB_NODE_LOCAL_PORT_F', 'value' => '85' }, '88' => { 'name' => 'IB_NODE_VENDORID_F', 'value' => '86' }, '89' => { 'name' => 'IB_NODE_LAST_F', 'value' => '87' }, '9' => { 'name' => 'IB_DRSMP_HOPCNT_F', 'value' => '9' }, '90' => { 'name' => 'IB_SW_FIRST_F', 'value' => '88' }, '91' => { 'name' => 'IB_SW_LINEAR_FDB_CAP_F', 'value' => '88' }, '92' => { 'name' => 'IB_SW_RANDOM_FDB_CAP_F', 'value' => '89' }, '93' => { 'name' => 'IB_SW_MCAST_FDB_CAP_F', 'value' => '90' }, '94' => { 'name' => 'IB_SW_LINEAR_FDB_TOP_F', 'value' => '91' }, '95' => { 'name' => 'IB_SW_DEF_PORT_F', 'value' => '92' }, '96' => { 'name' => 'IB_SW_DEF_MCAST_PRIM_F', 'value' => '93' }, '97' => { 'name' => 'IB_SW_DEF_MCAST_NOT_PRIM_F', 'value' => '94' }, '98' => { 'name' => 'IB_SW_LIFE_TIME_F', 'value' => '95' }, '99' => { 'name' => 'IB_SW_STATE_CHANGE_F', 'value' => '96' } }, 'Name' => 'enum MAD_FIELDS', 'Size' => '4', 'Type' => 'Enum' }, '696' => { 'BaseType' => '305', 'Header' => undef, 'Line' => '7', 'Name' => 'FILE', 'Size' => '216', 'Type' => 'Typedef' }, '708' => { 'BaseType' => '1', 'Header' => undef, 'Line' => '43', 'Name' => '_IO_lock_t', 'Type' => 'Typedef' }, '716' => { 'Name' => 'struct _IO_marker', 'Type' => 'Struct' }, '72' => { 'Name' => 'int', 'Size' => '4', 'Type' => 'Intrinsic' }, '721' => { 'BaseType' => '716', 'Name' => 'struct _IO_marker*', 'Size' => '8', 'Type' => 'Pointer' }, '726' => { 'BaseType' => '305', 'Name' => 'struct _IO_FILE*', 'Size' => '8', 'Type' => 'Pointer' }, '731' => { 
'BaseType' => '89', 'Name' => 'char[1]', 'Size' => '1', 'Type' => 'Array' }, '747' => { 'BaseType' => '708', 'Name' => '_IO_lock_t*', 'Size' => '8', 'Type' => 'Pointer' }, '752' => { 'Name' => 'struct _IO_codecvt', 'Type' => 'Struct' }, '757' => { 'BaseType' => '752', 'Name' => 'struct _IO_codecvt*', 'Size' => '8', 'Type' => 'Pointer' }, '762' => { 'Name' => 'struct _IO_wide_data', 'Type' => 'Struct' }, '767' => { 'BaseType' => '762', 'Name' => 'struct _IO_wide_data*', 'Size' => '8', 'Type' => 'Pointer' }, '76744' => { 'BaseType' => '165', 'Name' => 'long*', 'Size' => '8', 'Type' => 'Pointer' }, '772' => { 'BaseType' => '89', 'Name' => 'char[20]', 'Size' => '20', 'Type' => 'Array' }, '788' => { 'BaseType' => '696', 'Name' => 'FILE*', 'Size' => '8', 'Type' => 'Pointer' }, '79' => { 'BaseType' => '96', 'Name' => 'char const*', 'Size' => '8', 'Type' => 'Pointer' }, '810' => { 'BaseType' => '232', 'Name' => 'uint8_t[16]', 'Size' => '16', 'Type' => 'Array' }, '84856' => { 'Header' => undef, 'Line' => '1432', 'Memb' => { '0' => { 'name' => 'IB_DEST_LID', 'value' => '0' }, '1' => { 'name' => 'IB_DEST_DRPATH', 'value' => '1' }, '2' => { 'name' => 'IB_DEST_GUID', 'value' => '2' }, '3' => { 'name' => 'IB_DEST_DRSLID', 'value' => '3' }, '4' => { 'name' => 'IB_DEST_GID', 'value' => '4' } }, 'Name' => 'enum MAD_DEST', 'Size' => '4', 'Type' => 'Enum' }, '85446' => { 'BaseType' => '1036', 'Name' => 'ibmad_gid_t*', 'Size' => '8', 'Type' => 'Pointer' }, '87806' => { 'BaseType' => '268', 'Name' => 'uint64_t*', 'Size' => '8', 'Type' => 'Pointer' }, '89' => { 'Name' => 'char', 'Size' => '1', 'Type' => 'Intrinsic' }, '95819' => { 'Header' => undef, 'Line' => '1411', 'Memb' => { '0' => { 'name' => 'port', 'offset' => '0', 'type' => '1731' }, '1' => { 'name' => 'ca_name', 'offset' => '8', 'type' => '772' } }, 'Name' => 'struct ibmad_ports_item', 'Size' => '32', 'Type' => 'Struct' }, '95949' => { 'BaseType' => '95819', 'Header' => undef, 'Line' => '1414', 'Name' => 'ibmad_ports_item_t', 'Size' => '32', 'Type' => 'Typedef' }, '95960' => { 'Header' => undef, 'Line' => '1416', 'Memb' => { '0' => { 'name' => 'smi', 'offset' => '0', 'type' => '95949' }, '1' => { 'name' => 'gsi', 'offset' => '50', 'type' => '95949' } }, 'Name' => 'struct ibmad_ports_pair', 'Size' => '64', 'Type' => 'Struct' }, '96' => { 'BaseType' => '89', 'Name' => 'char const', 'Size' => '1', 'Type' => 'Const' }, '97101' => { 'BaseType' => '95960', 'Name' => 'struct ibmad_ports_pair*', 'Size' => '8', 'Type' => 'Pointer' } }, 'UndefinedSymbols' => { 'libibmad.so.5.5.56.0' => { '_ITM_deregisterTMCloneTable' => 0, '_ITM_registerTMCloneTable' => 0, '__cxa_finalize@GLIBC_2.2.5' => 0, '__errno_location@GLIBC_2.2.5' => 0, '__fprintf_chk@GLIBC_2.3.4' => 0, '__gmon_start__' => 0, '__memset_chk@GLIBC_2.3.4' => 0, '__printf_chk@GLIBC_2.3.4' => 0, '__snprintf_chk@GLIBC_2.3.4' => 0, '__sprintf_chk@GLIBC_2.3.4' => 0, '__stack_chk_fail@GLIBC_2.4' => 0, 'calloc@GLIBC_2.2.5' => 0, 'exit@GLIBC_2.2.5' => 0, 'fputc@GLIBC_2.2.5' => 0, 'fputs@GLIBC_2.2.5' => 0, 'free@GLIBC_2.2.5' => 0, 'getenv@GLIBC_2.2.5' => 0, 'getpid@GLIBC_2.2.5' => 0, 'inet_ntop@GLIBC_2.2.5' => 0, 'inet_pton@GLIBC_2.2.5' => 0, 'malloc@GLIBC_2.2.5' => 0, 'memcpy@GLIBC_2.14' => 0, 'random@GLIBC_2.2.5' => 0, 'snprintf@GLIBC_2.2.5' => 0, 'srandom@GLIBC_2.2.5' => 0, 'stderr@GLIBC_2.2.5' => 0, 'strchr@GLIBC_2.2.5' => 0, 'strdup@GLIBC_2.2.5' => 0, 'strerror@GLIBC_2.2.5' => 0, 'strlen@GLIBC_2.2.5' => 0, 'strncpy@GLIBC_2.2.5' => 0, 'strtol@GLIBC_2.2.5' => 0, 'strtoull@GLIBC_2.2.5' => 0, 'time@GLIBC_2.2.5' => 
0, 'umad_addr_dump@IBUMAD_1.0' => 0, 'umad_close_port@IBUMAD_1.0' => 0, 'umad_get_mad@IBUMAD_1.0' => 0, 'umad_get_mad_addr@IBUMAD_1.0' => 0, 'umad_get_smi_gsi_pair_by_ca_name@IBUMAD_1.4' => 0, 'umad_init@IBUMAD_1.0' => 0, 'umad_open_port@IBUMAD_1.0' => 0, 'umad_recv@IBUMAD_1.0' => 0, 'umad_register@IBUMAD_1.0' => 0, 'umad_register_oui@IBUMAD_1.0' => 0, 'umad_send@IBUMAD_1.0' => 0, 'umad_set_addr@IBUMAD_1.0' => 0, 'umad_set_grh@IBUMAD_1.0' => 0, 'umad_set_pkey@IBUMAD_1.0' => 0, 'umad_size@IBUMAD_1.0' => 0, 'umad_status@IBUMAD_1.0' => 0 } }, 'WordSize' => '8' }; rdma-core-56.1/ABI/ibnetdisc.dump000066400000000000000000011051131477342711600165620ustar00rootroot00000000000000$VAR1 = { 'ABI_DUMPER_VERSION' => '1.2', 'ABI_DUMP_VERSION' => '3.5', 'Arch' => 'x86_64', 'GccVersion' => '12.3.0', 'Headers' => {}, 'Language' => 'C', 'LibraryName' => 'libibnetdisc.so.5.1.56.0', 'LibraryVersion' => 'ibnetdisc', 'MissedOffsets' => '1', 'MissedRegs' => '1', 'NameSpaces' => {}, 'Needed' => { 'libc.so.6' => 1, 'libibmad.so.5' => 1, 'libibumad.so.3' => 1 }, 'Sources' => {}, 'SymbolInfo' => { '17834' => { 'Header' => undef, 'Line' => '249', 'Param' => { '0' => { 'name' => 'fabric', 'type' => '13740' }, '1' => { 'name' => 'chassisnum', 'type' => '86' } }, 'Return' => '276', 'ShortName' => 'ibnd_get_chassis_guid' }, '18756' => { 'Header' => undef, 'Line' => '164', 'Param' => { '0' => { 'name' => 'guid', 'type' => '276' } }, 'Return' => '65', 'ShortName' => 'ibnd_is_xsigo_tca' }, '18809' => { 'Header' => undef, 'Line' => '155', 'Param' => { '0' => { 'name' => 'guid', 'type' => '276' } }, 'Return' => '65', 'ShortName' => 'ibnd_is_xsigo_hca' }, '18890' => { 'Header' => undef, 'Line' => '139', 'Param' => { '0' => { 'name' => 'guid', 'type' => '276' } }, 'Return' => '65', 'ShortName' => 'ibnd_is_xsigo_guid' }, '19022' => { 'Header' => undef, 'Line' => '95', 'Param' => { '0' => { 'name' => 'node', 'type' => '6420' }, '1' => { 'name' => 'str', 'type' => '200' }, '2' => { 'name' => 'size', 'type' => '46' } }, 'Return' => '200', 'ShortName' => 'ibnd_get_chassis_slot_str' }, '19382' => { 'Header' => undef, 'Line' => '59', 'Param' => { '0' => { 'name' => 'node', 'type' => '6420' } }, 'Return' => '288', 'ShortName' => 'ibnd_get_chassis_type' }, '30465' => { 'Header' => undef, 'Line' => '1152', 'Param' => { '0' => { 'name' => 'buf', 'type' => '200' }, '1' => { 'name' => 'bufsz', 'type' => '65' }, '2' => { 'name' => 'speed', 'type' => '65' } }, 'Return' => '200', 'ShortName' => 'ibnd_dump_agg_linkspeedextsup' }, '30521' => { 'Header' => undef, 'Line' => '1147', 'Param' => { '0' => { 'name' => 'buf', 'type' => '200' }, '1' => { 'name' => 'bufsz', 'type' => '65' }, '2' => { 'name' => 'speed', 'type' => '65' } }, 'Return' => '200', 'ShortName' => 'ibnd_dump_agg_linkspeedexten' }, '30578' => { 'Header' => undef, 'Line' => '1114', 'Param' => { '0' => { 'name' => 'buf', 'type' => '200' }, '1' => { 'name' => 'bufsz', 'type' => '65' }, '2' => { 'name' => 'speed', 'type' => '65' } }, 'Return' => '200', 'ShortName' => 'ibnd_dump_agg_linkspeedext_bits' }, '31525' => { 'Header' => undef, 'Line' => '1084', 'Param' => { '0' => { 'name' => 'buf', 'type' => '200' }, '1' => { 'name' => 'bufsz', 'type' => '65' }, '2' => { 'name' => 'speed', 'type' => '65' } }, 'Return' => '200', 'ShortName' => 'ibnd_dump_agg_linkspeedext' }, '32405' => { 'Header' => undef, 'Line' => '1077', 'Param' => { '0' => { 'name' => 'cap_info', 'type' => '193' }, '1' => { 'name' => 'info', 'type' => '193' } }, 'Return' => '65', 'ShortName' => 
'ibnd_get_agg_linkspeedextsup' }, '32522' => { 'Header' => undef, 'Line' => '1070', 'Param' => { '0' => { 'name' => 'cap_info', 'type' => '193' }, '1' => { 'name' => 'info', 'type' => '193' } }, 'Return' => '65', 'ShortName' => 'ibnd_get_agg_linkspeedexten' }, '32639' => { 'Header' => undef, 'Line' => '1063', 'Param' => { '0' => { 'name' => 'cap_info', 'type' => '193' }, '1' => { 'name' => 'info', 'type' => '193' } }, 'Return' => '65', 'ShortName' => 'ibnd_get_agg_linkspeedext' }, '32756' => { 'Header' => undef, 'Line' => '1035', 'Param' => { '0' => { 'name' => 'cap_info', 'type' => '193' }, '1' => { 'name' => 'info', 'type' => '193' }, '2' => { 'name' => 'efield', 'type' => '1064' }, '3' => { 'name' => 'e2field', 'type' => '1064' } }, 'Return' => '65', 'ShortName' => 'ibnd_get_agg_linkspeedext_field' }, '33158' => { 'Header' => undef, 'Line' => '1014', 'Param' => { '0' => { 'name' => 'fabric', 'type' => '13740' }, '1' => { 'name' => 'func', 'type' => '28716' }, '2' => { 'name' => 'user_data', 'type' => '193' } }, 'Return' => '1', 'ShortName' => 'ibnd_iter_ports' }, '33512' => { 'Header' => undef, 'Line' => '974', 'Param' => { '0' => { 'name' => 'fabric', 'type' => '13740' }, '1' => { 'name' => 'dr_str', 'type' => '200' } }, 'Return' => '6674', 'ShortName' => 'ibnd_find_port_dr' }, '33965' => { 'Header' => undef, 'Line' => '957', 'Param' => { '0' => { 'name' => 'fabric', 'type' => '13740' }, '1' => { 'name' => 'guid', 'type' => '276' } }, 'Return' => '6674', 'ShortName' => 'ibnd_find_port_guid' }, '34168' => { 'Header' => undef, 'Line' => '942', 'Param' => { '0' => { 'name' => 'fabric', 'type' => '13740' }, '1' => { 'name' => 'lid', 'type' => '252' } }, 'Return' => '6674', 'ShortName' => 'ibnd_find_port_lid' }, '34312' => { 'Header' => undef, 'Line' => '907', 'Param' => { '0' => { 'name' => 'fabric', 'type' => '13740' }, '1' => { 'name' => 'func', 'type' => '28683' }, '2' => { 'name' => 'node_type', 'type' => '65' }, '3' => { 'name' => 'user_data', 'type' => '193' } }, 'Return' => '1', 'ShortName' => 'ibnd_iter_nodes_type' }, '34782' => { 'Header' => undef, 'Line' => '888', 'Param' => { '0' => { 'name' => 'fabric', 'type' => '13740' }, '1' => { 'name' => 'func', 'type' => '28683' }, '2' => { 'name' => 'user_data', 'type' => '193' } }, 'Return' => '1', 'ShortName' => 'ibnd_iter_nodes' }, '35113' => { 'Header' => undef, 'Line' => '198', 'Param' => { '0' => { 'name' => 'fabric', 'type' => '13740' } }, 'Return' => '1', 'ShortName' => 'ibnd_destroy_fabric' }, '35428' => { 'Header' => undef, 'Line' => '763', 'Param' => { '0' => { 'name' => 'ca_name', 'type' => '200' }, '1' => { 'name' => 'ca_port', 'type' => '65' }, '2' => { 'name' => 'from', 'type' => '29876' }, '3' => { 'name' => 'cfg', 'type' => '29205' } }, 'Return' => '13740', 'ShortName' => 'ibnd_discover_fabric' }, '38282' => { 'Header' => undef, 'Line' => '627', 'Param' => { '0' => { 'name' => 'fabric', 'type' => '13740' }, '1' => { 'name' => 'dr_str', 'type' => '200' } }, 'Return' => '6420', 'ShortName' => 'ibnd_find_node_dr' }, '38404' => { 'Header' => undef, 'Line' => '610', 'Param' => { '0' => { 'name' => 'fabric', 'type' => '13740' }, '1' => { 'name' => 'guid', 'type' => '276' } }, 'Return' => '6420', 'ShortName' => 'ibnd_find_node_guid' }, '51656' => { 'Header' => undef, 'Line' => '878', 'Param' => { '0' => { 'name' => 'fabric', 'type' => '13740' }, '1' => { 'name' => 'file', 'type' => '288' }, '2' => { 'name' => 'flags', 'type' => '100' } }, 'Return' => '65', 'ShortName' => 'ibnd_cache_fabric' }, '56061' => { 'Header' => undef, 
'Line' => '620', 'Param' => { '0' => { 'name' => 'file', 'type' => '288' }, '1' => { 'name' => 'flags', 'type' => '100' } }, 'Return' => '13740', 'ShortName' => 'ibnd_load_fabric' } }, 'SymbolVersion' => { 'ibnd_cache_fabric' => 'ibnd_cache_fabric@@IBNETDISC_1.0', 'ibnd_destroy_fabric' => 'ibnd_destroy_fabric@@IBNETDISC_1.0', 'ibnd_discover_fabric' => 'ibnd_discover_fabric@@IBNETDISC_1.0', 'ibnd_dump_agg_linkspeedext' => 'ibnd_dump_agg_linkspeedext@@IBNETDISC_1.1', 'ibnd_dump_agg_linkspeedext_bits' => 'ibnd_dump_agg_linkspeedext_bits@@IBNETDISC_1.1', 'ibnd_dump_agg_linkspeedexten' => 'ibnd_dump_agg_linkspeedexten@@IBNETDISC_1.1', 'ibnd_dump_agg_linkspeedextsup' => 'ibnd_dump_agg_linkspeedextsup@@IBNETDISC_1.1', 'ibnd_find_node_dr' => 'ibnd_find_node_dr@@IBNETDISC_1.0', 'ibnd_find_node_guid' => 'ibnd_find_node_guid@@IBNETDISC_1.0', 'ibnd_find_port_dr' => 'ibnd_find_port_dr@@IBNETDISC_1.0', 'ibnd_find_port_guid' => 'ibnd_find_port_guid@@IBNETDISC_1.0', 'ibnd_find_port_lid' => 'ibnd_find_port_lid@@IBNETDISC_1.0', 'ibnd_get_agg_linkspeedext' => 'ibnd_get_agg_linkspeedext@@IBNETDISC_1.1', 'ibnd_get_agg_linkspeedext_field' => 'ibnd_get_agg_linkspeedext_field@@IBNETDISC_1.1', 'ibnd_get_agg_linkspeedexten' => 'ibnd_get_agg_linkspeedexten@@IBNETDISC_1.1', 'ibnd_get_agg_linkspeedextsup' => 'ibnd_get_agg_linkspeedextsup@@IBNETDISC_1.1', 'ibnd_get_chassis_guid' => 'ibnd_get_chassis_guid@@IBNETDISC_1.0', 'ibnd_get_chassis_slot_str' => 'ibnd_get_chassis_slot_str@@IBNETDISC_1.0', 'ibnd_get_chassis_type' => 'ibnd_get_chassis_type@@IBNETDISC_1.0', 'ibnd_is_xsigo_guid' => 'ibnd_is_xsigo_guid@@IBNETDISC_1.0', 'ibnd_is_xsigo_hca' => 'ibnd_is_xsigo_hca@@IBNETDISC_1.0', 'ibnd_is_xsigo_tca' => 'ibnd_is_xsigo_tca@@IBNETDISC_1.0', 'ibnd_iter_nodes' => 'ibnd_iter_nodes@@IBNETDISC_1.0', 'ibnd_iter_nodes_type' => 'ibnd_iter_nodes_type@@IBNETDISC_1.0', 'ibnd_iter_ports' => 'ibnd_iter_ports@@IBNETDISC_1.0', 'ibnd_load_fabric' => 'ibnd_load_fabric@@IBNETDISC_1.0' }, 'Symbols' => { 'libibnetdisc.so.5.1.56.0' => { 'ibnd_cache_fabric@@IBNETDISC_1.0' => 1, 'ibnd_destroy_fabric@@IBNETDISC_1.0' => 1, 'ibnd_discover_fabric@@IBNETDISC_1.0' => 1, 'ibnd_dump_agg_linkspeedext@@IBNETDISC_1.1' => 1, 'ibnd_dump_agg_linkspeedext_bits@@IBNETDISC_1.1' => 1, 'ibnd_dump_agg_linkspeedexten@@IBNETDISC_1.1' => 1, 'ibnd_dump_agg_linkspeedextsup@@IBNETDISC_1.1' => 1, 'ibnd_find_node_dr@@IBNETDISC_1.0' => 1, 'ibnd_find_node_guid@@IBNETDISC_1.0' => 1, 'ibnd_find_port_dr@@IBNETDISC_1.0' => 1, 'ibnd_find_port_guid@@IBNETDISC_1.0' => 1, 'ibnd_find_port_lid@@IBNETDISC_1.0' => 1, 'ibnd_get_agg_linkspeedext@@IBNETDISC_1.1' => 1, 'ibnd_get_agg_linkspeedext_field@@IBNETDISC_1.1' => 1, 'ibnd_get_agg_linkspeedexten@@IBNETDISC_1.1' => 1, 'ibnd_get_agg_linkspeedextsup@@IBNETDISC_1.1' => 1, 'ibnd_get_chassis_guid@@IBNETDISC_1.0' => 1, 'ibnd_get_chassis_slot_str@@IBNETDISC_1.0' => 1, 'ibnd_get_chassis_type@@IBNETDISC_1.0' => 1, 'ibnd_is_xsigo_guid@@IBNETDISC_1.0' => 1, 'ibnd_is_xsigo_hca@@IBNETDISC_1.0' => 1, 'ibnd_is_xsigo_tca@@IBNETDISC_1.0' => 1, 'ibnd_iter_nodes@@IBNETDISC_1.0' => 1, 'ibnd_iter_nodes_type@@IBNETDISC_1.0' => 1, 'ibnd_iter_ports@@IBNETDISC_1.0' => 1, 'ibnd_load_fabric@@IBNETDISC_1.0' => 1 } }, 'Target' => 'unix', 'TypeInfo' => { '1' => { 'Name' => 'void', 'Type' => 'Intrinsic' }, '100' => { 'Name' => 'unsigned int', 'Size' => '4', 'Type' => 'Intrinsic' }, '1051' => { 'BaseType' => '934', 'Header' => undef, 'Line' => '317', 'Name' => 'ib_portid_t', 'Size' => '112', 'Type' => 'Typedef' }, '1064' => { 'Header' => undef, 'Line' => '330', 'Memb' 
=> { '0' => { 'name' => 'IB_NO_FIELD', 'value' => '0' }, '1' => { 'name' => 'IB_GID_PREFIX_F', 'value' => '1' }, '10' => { 'name' => 'IB_DRSMP_HOPPTR_F', 'value' => '10' }, '100' => { 'name' => 'IB_SW_OPT_SLTOVL_MAPPING_F', 'value' => '97' }, '101' => { 'name' => 'IB_SW_LIDS_PER_PORT_F', 'value' => '98' }, '102' => { 'name' => 'IB_SW_PARTITION_ENFORCE_CAP_F', 'value' => '99' }, '103' => { 'name' => 'IB_SW_PARTITION_ENF_INB_F', 'value' => '100' }, '104' => { 'name' => 'IB_SW_PARTITION_ENF_OUTB_F', 'value' => '101' }, '105' => { 'name' => 'IB_SW_FILTER_RAW_INB_F', 'value' => '102' }, '106' => { 'name' => 'IB_SW_FILTER_RAW_OUTB_F', 'value' => '103' }, '107' => { 'name' => 'IB_SW_ENHANCED_PORT0_F', 'value' => '104' }, '108' => { 'name' => 'IB_SW_MCAST_FDB_TOP_F', 'value' => '105' }, '109' => { 'name' => 'IB_SW_LAST_F', 'value' => '106' }, '11' => { 'name' => 'IB_DRSMP_STATUS_F', 'value' => '11' }, '110' => { 'name' => 'IB_LINEAR_FORW_TBL_F', 'value' => '107' }, '111' => { 'name' => 'IB_MULTICAST_FORW_TBL_F', 'value' => '108' }, '112' => { 'name' => 'IB_NODE_DESC_F', 'value' => '109' }, '113' => { 'name' => 'IB_NOTICE_IS_GENERIC_F', 'value' => '110' }, '114' => { 'name' => 'IB_NOTICE_TYPE_F', 'value' => '111' }, '115' => { 'name' => 'IB_NOTICE_PRODUCER_F', 'value' => '112' }, '116' => { 'name' => 'IB_NOTICE_TRAP_NUMBER_F', 'value' => '113' }, '117' => { 'name' => 'IB_NOTICE_ISSUER_LID_F', 'value' => '114' }, '118' => { 'name' => 'IB_NOTICE_TOGGLE_F', 'value' => '115' }, '119' => { 'name' => 'IB_NOTICE_COUNT_F', 'value' => '116' }, '12' => { 'name' => 'IB_DRSMP_DIRECTION_F', 'value' => '12' }, '120' => { 'name' => 'IB_NOTICE_DATA_DETAILS_F', 'value' => '117' }, '121' => { 'name' => 'IB_NOTICE_DATA_LID_F', 'value' => '118' }, '122' => { 'name' => 'IB_NOTICE_DATA_144_LID_F', 'value' => '119' }, '123' => { 'name' => 'IB_NOTICE_DATA_144_CAPMASK_F', 'value' => '120' }, '124' => { 'name' => 'IB_PC_FIRST_F', 'value' => '121' }, '125' => { 'name' => 'IB_PC_PORT_SELECT_F', 'value' => '121' }, '126' => { 'name' => 'IB_PC_COUNTER_SELECT_F', 'value' => '122' }, '127' => { 'name' => 'IB_PC_ERR_SYM_F', 'value' => '123' }, '128' => { 'name' => 'IB_PC_LINK_RECOVERS_F', 'value' => '124' }, '129' => { 'name' => 'IB_PC_LINK_DOWNED_F', 'value' => '125' }, '13' => { 'name' => 'IB_MAD_TRID_F', 'value' => '13' }, '130' => { 'name' => 'IB_PC_ERR_RCV_F', 'value' => '126' }, '131' => { 'name' => 'IB_PC_ERR_PHYSRCV_F', 'value' => '127' }, '132' => { 'name' => 'IB_PC_ERR_SWITCH_REL_F', 'value' => '128' }, '133' => { 'name' => 'IB_PC_XMT_DISCARDS_F', 'value' => '129' }, '134' => { 'name' => 'IB_PC_ERR_XMTCONSTR_F', 'value' => '130' }, '135' => { 'name' => 'IB_PC_ERR_RCVCONSTR_F', 'value' => '131' }, '136' => { 'name' => 'IB_PC_COUNTER_SELECT2_F', 'value' => '132' }, '137' => { 'name' => 'IB_PC_ERR_LOCALINTEG_F', 'value' => '133' }, '138' => { 'name' => 'IB_PC_ERR_EXCESS_OVR_F', 'value' => '134' }, '139' => { 'name' => 'IB_PC_VL15_DROPPED_F', 'value' => '135' }, '14' => { 'name' => 'IB_MAD_ATTRID_F', 'value' => '14' }, '140' => { 'name' => 'IB_PC_XMT_BYTES_F', 'value' => '136' }, '141' => { 'name' => 'IB_PC_RCV_BYTES_F', 'value' => '137' }, '142' => { 'name' => 'IB_PC_XMT_PKTS_F', 'value' => '138' }, '143' => { 'name' => 'IB_PC_RCV_PKTS_F', 'value' => '139' }, '144' => { 'name' => 'IB_PC_XMT_WAIT_F', 'value' => '140' }, '145' => { 'name' => 'IB_PC_LAST_F', 'value' => '141' }, '146' => { 'name' => 'IB_SMINFO_GUID_F', 'value' => '142' }, '147' => { 'name' => 'IB_SMINFO_KEY_F', 'value' => '143' }, '148' => { 'name' => 
'IB_SMINFO_ACT_F', 'value' => '144' }, '149' => { 'name' => 'IB_SMINFO_PRIO_F', 'value' => '145' }, '15' => { 'name' => 'IB_MAD_ATTRMOD_F', 'value' => '15' }, '150' => { 'name' => 'IB_SMINFO_STATE_F', 'value' => '146' }, '151' => { 'name' => 'IB_SA_RMPP_VERS_F', 'value' => '147' }, '152' => { 'name' => 'IB_SA_RMPP_TYPE_F', 'value' => '148' }, '153' => { 'name' => 'IB_SA_RMPP_RESP_F', 'value' => '149' }, '154' => { 'name' => 'IB_SA_RMPP_FLAGS_F', 'value' => '150' }, '155' => { 'name' => 'IB_SA_RMPP_STATUS_F', 'value' => '151' }, '156' => { 'name' => 'IB_SA_RMPP_D1_F', 'value' => '152' }, '157' => { 'name' => 'IB_SA_RMPP_SEGNUM_F', 'value' => '153' }, '158' => { 'name' => 'IB_SA_RMPP_D2_F', 'value' => '154' }, '159' => { 'name' => 'IB_SA_RMPP_LEN_F', 'value' => '155' }, '16' => { 'name' => 'IB_MAD_MKEY_F', 'value' => '16' }, '160' => { 'name' => 'IB_SA_RMPP_NEWWIN_F', 'value' => '156' }, '161' => { 'name' => 'IB_SA_MP_NPATH_F', 'value' => '157' }, '162' => { 'name' => 'IB_SA_MP_NSRC_F', 'value' => '158' }, '163' => { 'name' => 'IB_SA_MP_NDEST_F', 'value' => '159' }, '164' => { 'name' => 'IB_SA_MP_GID0_F', 'value' => '160' }, '165' => { 'name' => 'IB_SA_PR_DGID_F', 'value' => '161' }, '166' => { 'name' => 'IB_SA_PR_SGID_F', 'value' => '162' }, '167' => { 'name' => 'IB_SA_PR_DLID_F', 'value' => '163' }, '168' => { 'name' => 'IB_SA_PR_SLID_F', 'value' => '164' }, '169' => { 'name' => 'IB_SA_PR_NPATH_F', 'value' => '165' }, '17' => { 'name' => 'IB_DRSMP_DRDLID_F', 'value' => '17' }, '170' => { 'name' => 'IB_SA_PR_SL_F', 'value' => '166' }, '171' => { 'name' => 'IB_SA_MCM_MGID_F', 'value' => '167' }, '172' => { 'name' => 'IB_SA_MCM_PORTGID_F', 'value' => '168' }, '173' => { 'name' => 'IB_SA_MCM_QKEY_F', 'value' => '169' }, '174' => { 'name' => 'IB_SA_MCM_MLID_F', 'value' => '170' }, '175' => { 'name' => 'IB_SA_MCM_SL_F', 'value' => '171' }, '176' => { 'name' => 'IB_SA_MCM_MTU_F', 'value' => '172' }, '177' => { 'name' => 'IB_SA_MCM_RATE_F', 'value' => '173' }, '178' => { 'name' => 'IB_SA_MCM_TCLASS_F', 'value' => '174' }, '179' => { 'name' => 'IB_SA_MCM_PKEY_F', 'value' => '175' }, '18' => { 'name' => 'IB_DRSMP_DRSLID_F', 'value' => '18' }, '180' => { 'name' => 'IB_SA_MCM_FLOW_LABEL_F', 'value' => '176' }, '181' => { 'name' => 'IB_SA_MCM_JOIN_STATE_F', 'value' => '177' }, '182' => { 'name' => 'IB_SA_MCM_PROXY_JOIN_F', 'value' => '178' }, '183' => { 'name' => 'IB_SA_SR_ID_F', 'value' => '179' }, '184' => { 'name' => 'IB_SA_SR_GID_F', 'value' => '180' }, '185' => { 'name' => 'IB_SA_SR_PKEY_F', 'value' => '181' }, '186' => { 'name' => 'IB_SA_SR_LEASE_F', 'value' => '182' }, '187' => { 'name' => 'IB_SA_SR_KEY_F', 'value' => '183' }, '188' => { 'name' => 'IB_SA_SR_NAME_F', 'value' => '184' }, '189' => { 'name' => 'IB_SA_SR_DATA_F', 'value' => '185' }, '19' => { 'name' => 'IB_SA_MKEY_F', 'value' => '19' }, '190' => { 'name' => 'IB_ATS_SM_NODE_ADDR_F', 'value' => '186' }, '191' => { 'name' => 'IB_ATS_SM_MAGIC_KEY_F', 'value' => '187' }, '192' => { 'name' => 'IB_ATS_SM_NODE_TYPE_F', 'value' => '188' }, '193' => { 'name' => 'IB_ATS_SM_NODE_NAME_F', 'value' => '189' }, '194' => { 'name' => 'IB_SLTOVL_MAPPING_TABLE_F', 'value' => '190' }, '195' => { 'name' => 'IB_VL_ARBITRATION_TABLE_F', 'value' => '191' }, '196' => { 'name' => 'IB_VEND2_OUI_F', 'value' => '192' }, '197' => { 'name' => 'IB_VEND2_DATA_F', 'value' => '193' }, '198' => { 'name' => 'IB_PC_EXT_FIRST_F', 'value' => '194' }, '199' => { 'name' => 'IB_PC_EXT_PORT_SELECT_F', 'value' => '194' }, '2' => { 'name' => 'IB_GID_GUID_F', 'value' => '2' }, 
'20' => { 'name' => 'IB_SA_ATTROFFS_F', 'value' => '20' }, '200' => { 'name' => 'IB_PC_EXT_COUNTER_SELECT_F', 'value' => '195' }, '201' => { 'name' => 'IB_PC_EXT_XMT_BYTES_F', 'value' => '196' }, '202' => { 'name' => 'IB_PC_EXT_RCV_BYTES_F', 'value' => '197' }, '203' => { 'name' => 'IB_PC_EXT_XMT_PKTS_F', 'value' => '198' }, '204' => { 'name' => 'IB_PC_EXT_RCV_PKTS_F', 'value' => '199' }, '205' => { 'name' => 'IB_PC_EXT_XMT_UPKTS_F', 'value' => '200' }, '206' => { 'name' => 'IB_PC_EXT_RCV_UPKTS_F', 'value' => '201' }, '207' => { 'name' => 'IB_PC_EXT_XMT_MPKTS_F', 'value' => '202' }, '208' => { 'name' => 'IB_PC_EXT_RCV_MPKTS_F', 'value' => '203' }, '209' => { 'name' => 'IB_PC_EXT_LAST_F', 'value' => '204' }, '21' => { 'name' => 'IB_SA_COMPMASK_F', 'value' => '21' }, '210' => { 'name' => 'IB_GUID_GUID0_F', 'value' => '205' }, '211' => { 'name' => 'IB_CPI_BASEVER_F', 'value' => '206' }, '212' => { 'name' => 'IB_CPI_CLASSVER_F', 'value' => '207' }, '213' => { 'name' => 'IB_CPI_CAPMASK_F', 'value' => '208' }, '214' => { 'name' => 'IB_CPI_CAPMASK2_F', 'value' => '209' }, '215' => { 'name' => 'IB_CPI_RESP_TIME_VALUE_F', 'value' => '210' }, '216' => { 'name' => 'IB_CPI_REDIRECT_GID_F', 'value' => '211' }, '217' => { 'name' => 'IB_CPI_REDIRECT_TC_F', 'value' => '212' }, '218' => { 'name' => 'IB_CPI_REDIRECT_SL_F', 'value' => '213' }, '219' => { 'name' => 'IB_CPI_REDIRECT_FL_F', 'value' => '214' }, '22' => { 'name' => 'IB_SA_DATA_F', 'value' => '22' }, '220' => { 'name' => 'IB_CPI_REDIRECT_LID_F', 'value' => '215' }, '221' => { 'name' => 'IB_CPI_REDIRECT_PKEY_F', 'value' => '216' }, '222' => { 'name' => 'IB_CPI_REDIRECT_QP_F', 'value' => '217' }, '223' => { 'name' => 'IB_CPI_REDIRECT_QKEY_F', 'value' => '218' }, '224' => { 'name' => 'IB_CPI_TRAP_GID_F', 'value' => '219' }, '225' => { 'name' => 'IB_CPI_TRAP_TC_F', 'value' => '220' }, '226' => { 'name' => 'IB_CPI_TRAP_SL_F', 'value' => '221' }, '227' => { 'name' => 'IB_CPI_TRAP_FL_F', 'value' => '222' }, '228' => { 'name' => 'IB_CPI_TRAP_LID_F', 'value' => '223' }, '229' => { 'name' => 'IB_CPI_TRAP_PKEY_F', 'value' => '224' }, '23' => { 'name' => 'IB_SM_DATA_F', 'value' => '23' }, '230' => { 'name' => 'IB_CPI_TRAP_HL_F', 'value' => '225' }, '231' => { 'name' => 'IB_CPI_TRAP_QP_F', 'value' => '226' }, '232' => { 'name' => 'IB_CPI_TRAP_QKEY_F', 'value' => '227' }, '233' => { 'name' => 'IB_PC_XMT_DATA_SL_FIRST_F', 'value' => '228' }, '234' => { 'name' => 'IB_PC_XMT_DATA_SL0_F', 'value' => '228' }, '235' => { 'name' => 'IB_PC_XMT_DATA_SL1_F', 'value' => '229' }, '236' => { 'name' => 'IB_PC_XMT_DATA_SL2_F', 'value' => '230' }, '237' => { 'name' => 'IB_PC_XMT_DATA_SL3_F', 'value' => '231' }, '238' => { 'name' => 'IB_PC_XMT_DATA_SL4_F', 'value' => '232' }, '239' => { 'name' => 'IB_PC_XMT_DATA_SL5_F', 'value' => '233' }, '24' => { 'name' => 'IB_GS_DATA_F', 'value' => '24' }, '240' => { 'name' => 'IB_PC_XMT_DATA_SL6_F', 'value' => '234' }, '241' => { 'name' => 'IB_PC_XMT_DATA_SL7_F', 'value' => '235' }, '242' => { 'name' => 'IB_PC_XMT_DATA_SL8_F', 'value' => '236' }, '243' => { 'name' => 'IB_PC_XMT_DATA_SL9_F', 'value' => '237' }, '244' => { 'name' => 'IB_PC_XMT_DATA_SL10_F', 'value' => '238' }, '245' => { 'name' => 'IB_PC_XMT_DATA_SL11_F', 'value' => '239' }, '246' => { 'name' => 'IB_PC_XMT_DATA_SL12_F', 'value' => '240' }, '247' => { 'name' => 'IB_PC_XMT_DATA_SL13_F', 'value' => '241' }, '248' => { 'name' => 'IB_PC_XMT_DATA_SL14_F', 'value' => '242' }, '249' => { 'name' => 'IB_PC_XMT_DATA_SL15_F', 'value' => '243' }, '25' => { 'name' => 'IB_DRSMP_PATH_F', 
'value' => '25' }, '250' => { 'name' => 'IB_PC_XMT_DATA_SL_LAST_F', 'value' => '244' }, '251' => { 'name' => 'IB_PC_RCV_DATA_SL_FIRST_F', 'value' => '245' }, '252' => { 'name' => 'IB_PC_RCV_DATA_SL0_F', 'value' => '245' }, '253' => { 'name' => 'IB_PC_RCV_DATA_SL1_F', 'value' => '246' }, '254' => { 'name' => 'IB_PC_RCV_DATA_SL2_F', 'value' => '247' }, '255' => { 'name' => 'IB_PC_RCV_DATA_SL3_F', 'value' => '248' }, '256' => { 'name' => 'IB_PC_RCV_DATA_SL4_F', 'value' => '249' }, '257' => { 'name' => 'IB_PC_RCV_DATA_SL5_F', 'value' => '250' }, '258' => { 'name' => 'IB_PC_RCV_DATA_SL6_F', 'value' => '251' }, '259' => { 'name' => 'IB_PC_RCV_DATA_SL7_F', 'value' => '252' }, '26' => { 'name' => 'IB_DRSMP_RPATH_F', 'value' => '26' }, '260' => { 'name' => 'IB_PC_RCV_DATA_SL8_F', 'value' => '253' }, '261' => { 'name' => 'IB_PC_RCV_DATA_SL9_F', 'value' => '254' }, '262' => { 'name' => 'IB_PC_RCV_DATA_SL10_F', 'value' => '255' }, '263' => { 'name' => 'IB_PC_RCV_DATA_SL11_F', 'value' => '256' }, '264' => { 'name' => 'IB_PC_RCV_DATA_SL12_F', 'value' => '257' }, '265' => { 'name' => 'IB_PC_RCV_DATA_SL13_F', 'value' => '258' }, '266' => { 'name' => 'IB_PC_RCV_DATA_SL14_F', 'value' => '259' }, '267' => { 'name' => 'IB_PC_RCV_DATA_SL15_F', 'value' => '260' }, '268' => { 'name' => 'IB_PC_RCV_DATA_SL_LAST_F', 'value' => '261' }, '269' => { 'name' => 'IB_PC_XMT_INACT_DISC_F', 'value' => '262' }, '27' => { 'name' => 'IB_PORT_FIRST_F', 'value' => '27' }, '270' => { 'name' => 'IB_PC_XMT_NEIGH_MTU_DISC_F', 'value' => '263' }, '271' => { 'name' => 'IB_PC_XMT_SW_LIFE_DISC_F', 'value' => '264' }, '272' => { 'name' => 'IB_PC_XMT_SW_HOL_DISC_F', 'value' => '265' }, '273' => { 'name' => 'IB_PC_XMT_DISC_LAST_F', 'value' => '266' }, '274' => { 'name' => 'IB_PC_RCV_LOCAL_PHY_ERR_F', 'value' => '267' }, '275' => { 'name' => 'IB_PC_RCV_MALFORMED_PKT_ERR_F', 'value' => '268' }, '276' => { 'name' => 'IB_PC_RCV_BUF_OVR_ERR_F', 'value' => '269' }, '277' => { 'name' => 'IB_PC_RCV_DLID_MAP_ERR_F', 'value' => '270' }, '278' => { 'name' => 'IB_PC_RCV_VL_MAP_ERR_F', 'value' => '271' }, '279' => { 'name' => 'IB_PC_RCV_LOOPING_ERR_F', 'value' => '272' }, '28' => { 'name' => 'IB_PORT_MKEY_F', 'value' => '27' }, '280' => { 'name' => 'IB_PC_RCV_ERR_LAST_F', 'value' => '273' }, '281' => { 'name' => 'IB_PSC_OPCODE_F', 'value' => '274' }, '282' => { 'name' => 'IB_PSC_PORT_SELECT_F', 'value' => '275' }, '283' => { 'name' => 'IB_PSC_TICK_F', 'value' => '276' }, '284' => { 'name' => 'IB_PSC_COUNTER_WIDTH_F', 'value' => '277' }, '285' => { 'name' => 'IB_PSC_COUNTER_MASK0_F', 'value' => '278' }, '286' => { 'name' => 'IB_PSC_COUNTER_MASKS1TO9_F', 'value' => '279' }, '287' => { 'name' => 'IB_PSC_COUNTER_MASKS10TO14_F', 'value' => '280' }, '288' => { 'name' => 'IB_PSC_SAMPLE_MECHS_F', 'value' => '281' }, '289' => { 'name' => 'IB_PSC_SAMPLE_STATUS_F', 'value' => '282' }, '29' => { 'name' => 'IB_PORT_GID_PREFIX_F', 'value' => '28' }, '290' => { 'name' => 'IB_PSC_OPTION_MASK_F', 'value' => '283' }, '291' => { 'name' => 'IB_PSC_VENDOR_MASK_F', 'value' => '284' }, '292' => { 'name' => 'IB_PSC_SAMPLE_START_F', 'value' => '285' }, '293' => { 'name' => 'IB_PSC_SAMPLE_INTVL_F', 'value' => '286' }, '294' => { 'name' => 'IB_PSC_TAG_F', 'value' => '287' }, '295' => { 'name' => 'IB_PSC_COUNTER_SEL0_F', 'value' => '288' }, '296' => { 'name' => 'IB_PSC_COUNTER_SEL1_F', 'value' => '289' }, '297' => { 'name' => 'IB_PSC_COUNTER_SEL2_F', 'value' => '290' }, '298' => { 'name' => 'IB_PSC_COUNTER_SEL3_F', 'value' => '291' }, '299' => { 'name' => 'IB_PSC_COUNTER_SEL4_F', 
'value' => '292' }, '3' => { 'name' => 'IB_MAD_METHOD_F', 'value' => '3' }, '30' => { 'name' => 'IB_PORT_LID_F', 'value' => '29' }, '300' => { 'name' => 'IB_PSC_COUNTER_SEL5_F', 'value' => '293' }, '301' => { 'name' => 'IB_PSC_COUNTER_SEL6_F', 'value' => '294' }, '302' => { 'name' => 'IB_PSC_COUNTER_SEL7_F', 'value' => '295' }, '303' => { 'name' => 'IB_PSC_COUNTER_SEL8_F', 'value' => '296' }, '304' => { 'name' => 'IB_PSC_COUNTER_SEL9_F', 'value' => '297' }, '305' => { 'name' => 'IB_PSC_COUNTER_SEL10_F', 'value' => '298' }, '306' => { 'name' => 'IB_PSC_COUNTER_SEL11_F', 'value' => '299' }, '307' => { 'name' => 'IB_PSC_COUNTER_SEL12_F', 'value' => '300' }, '308' => { 'name' => 'IB_PSC_COUNTER_SEL13_F', 'value' => '301' }, '309' => { 'name' => 'IB_PSC_COUNTER_SEL14_F', 'value' => '302' }, '31' => { 'name' => 'IB_PORT_SMLID_F', 'value' => '30' }, '310' => { 'name' => 'IB_PSC_SAMPLES_ONLY_OPT_MASK_F', 'value' => '303' }, '311' => { 'name' => 'IB_PSC_LAST_F', 'value' => '304' }, '312' => { 'name' => 'IB_GI_GUID0_F', 'value' => '305' }, '313' => { 'name' => 'IB_GI_GUID1_F', 'value' => '306' }, '314' => { 'name' => 'IB_GI_GUID2_F', 'value' => '307' }, '315' => { 'name' => 'IB_GI_GUID3_F', 'value' => '308' }, '316' => { 'name' => 'IB_GI_GUID4_F', 'value' => '309' }, '317' => { 'name' => 'IB_GI_GUID5_F', 'value' => '310' }, '318' => { 'name' => 'IB_GI_GUID6_F', 'value' => '311' }, '319' => { 'name' => 'IB_GI_GUID7_F', 'value' => '312' }, '32' => { 'name' => 'IB_PORT_CAPMASK_F', 'value' => '31' }, '320' => { 'name' => 'IB_SA_GIR_LID_F', 'value' => '313' }, '321' => { 'name' => 'IB_SA_GIR_BLOCKNUM_F', 'value' => '314' }, '322' => { 'name' => 'IB_SA_GIR_GUID0_F', 'value' => '315' }, '323' => { 'name' => 'IB_SA_GIR_GUID1_F', 'value' => '316' }, '324' => { 'name' => 'IB_SA_GIR_GUID2_F', 'value' => '317' }, '325' => { 'name' => 'IB_SA_GIR_GUID3_F', 'value' => '318' }, '326' => { 'name' => 'IB_SA_GIR_GUID4_F', 'value' => '319' }, '327' => { 'name' => 'IB_SA_GIR_GUID5_F', 'value' => '320' }, '328' => { 'name' => 'IB_SA_GIR_GUID6_F', 'value' => '321' }, '329' => { 'name' => 'IB_SA_GIR_GUID7_F', 'value' => '322' }, '33' => { 'name' => 'IB_PORT_DIAG_F', 'value' => '32' }, '330' => { 'name' => 'IB_PORT_CAPMASK2_F', 'value' => '323' }, '331' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_ACTIVE_F', 'value' => '324' }, '332' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_SUPPORTED_F', 'value' => '325' }, '333' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_ENABLED_F', 'value' => '326' }, '334' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_LAST_F', 'value' => '327' }, '335' => { 'name' => 'IB_PESC_PORT_SELECT_F', 'value' => '328' }, '336' => { 'name' => 'IB_PESC_COUNTER_SELECT_F', 'value' => '329' }, '337' => { 'name' => 'IB_PESC_SYNC_HDR_ERR_CTR_F', 'value' => '330' }, '338' => { 'name' => 'IB_PESC_UNK_BLOCK_CTR_F', 'value' => '331' }, '339' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE0_F', 'value' => '332' }, '34' => { 'name' => 'IB_PORT_MKEY_LEASE_F', 'value' => '33' }, '340' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE1_F', 'value' => '333' }, '341' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE2_F', 'value' => '334' }, '342' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE3_F', 'value' => '335' }, '343' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE4_F', 'value' => '336' }, '344' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE5_F', 'value' => '337' }, '345' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE6_F', 'value' => '338' }, '346' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE7_F', 'value' => '339' }, '347' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE8_F', 'value' => '340' }, '348' => { 
'name' => 'IB_PESC_ERR_DET_CTR_LANE9_F', 'value' => '341' }, '349' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE10_F', 'value' => '342' }, '35' => { 'name' => 'IB_PORT_LOCAL_PORT_F', 'value' => '34' }, '350' => { 'name' => 'IB_PESC_ERR_DET_CTR_LANE11_F', 'value' => '343' }, '351' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE0_F', 'value' => '344' }, '352' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE1_F', 'value' => '345' }, '353' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE2_F', 'value' => '346' }, '354' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE3_F', 'value' => '347' }, '355' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE4_F', 'value' => '348' }, '356' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE5_F', 'value' => '349' }, '357' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE6_F', 'value' => '350' }, '358' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE7_F', 'value' => '351' }, '359' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE8_F', 'value' => '352' }, '36' => { 'name' => 'IB_PORT_LINK_WIDTH_ENABLED_F', 'value' => '35' }, '360' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE9_F', 'value' => '353' }, '361' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE10_F', 'value' => '354' }, '362' => { 'name' => 'IB_PESC_FEC_CORR_BLOCK_CTR_LANE11_F', 'value' => '355' }, '363' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE0_F', 'value' => '356' }, '364' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE1_F', 'value' => '357' }, '365' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE2_F', 'value' => '358' }, '366' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE3_F', 'value' => '359' }, '367' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE4_F', 'value' => '360' }, '368' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE5_F', 'value' => '361' }, '369' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE6_F', 'value' => '362' }, '37' => { 'name' => 'IB_PORT_LINK_WIDTH_SUPPORTED_F', 'value' => '36' }, '370' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE7_F', 'value' => '363' }, '371' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE8_F', 'value' => '364' }, '372' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE9_F', 'value' => '365' }, '373' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE10_F', 'value' => '366' }, '374' => { 'name' => 'IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE11_F', 'value' => '367' }, '375' => { 'name' => 'IB_PESC_LAST_F', 'value' => '368' }, '376' => { 'name' => 'IB_PC_PORT_OP_RCV_COUNTERS_FIRST_F', 'value' => '369' }, '377' => { 'name' => 'IB_PC_PORT_OP_RCV_PKTS_F', 'value' => '369' }, '378' => { 'name' => 'IB_PC_PORT_OP_RCV_DATA_F', 'value' => '370' }, '379' => { 'name' => 'IB_PC_PORT_OP_RCV_COUNTERS_LAST_F', 'value' => '371' }, '38' => { 'name' => 'IB_PORT_LINK_WIDTH_ACTIVE_F', 'value' => '37' }, '380' => { 'name' => 'IB_PC_PORT_FLOW_CTL_COUNTERS_FIRST_F', 'value' => '372' }, '381' => { 'name' => 'IB_PC_PORT_XMIT_FLOW_PKTS_F', 'value' => '372' }, '382' => { 'name' => 'IB_PC_PORT_RCV_FLOW_PKTS_F', 'value' => '373' }, '383' => { 'name' => 'IB_PC_PORT_FLOW_CTL_COUNTERS_LAST_F', 'value' => '374' }, '384' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS_FIRST_F', 'value' => '375' }, '385' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS0_F', 'value' => '375' }, '386' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS1_F', 'value' => '376' }, '387' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS2_F', 'value' => '377' }, '388' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS3_F', 'value' => '378' }, '389' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS4_F', 'value' => '379' }, '39' => { 'name' => 
'IB_PORT_LINK_SPEED_SUPPORTED_F', 'value' => '38' }, '390' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS5_F', 'value' => '380' }, '391' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS6_F', 'value' => '381' }, '392' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS7_F', 'value' => '382' }, '393' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS8_F', 'value' => '383' }, '394' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS9_F', 'value' => '384' }, '395' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS10_F', 'value' => '385' }, '396' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS11_F', 'value' => '386' }, '397' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS12_F', 'value' => '387' }, '398' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS13_F', 'value' => '388' }, '399' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS14_F', 'value' => '389' }, '4' => { 'name' => 'IB_MAD_RESPONSE_F', 'value' => '4' }, '40' => { 'name' => 'IB_PORT_STATE_F', 'value' => '39' }, '400' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS15_F', 'value' => '390' }, '401' => { 'name' => 'IB_PC_PORT_VL_OP_PACKETS_LAST_F', 'value' => '391' }, '402' => { 'name' => 'IB_PC_PORT_VL_OP_DATA_FIRST_F', 'value' => '392' }, '403' => { 'name' => 'IB_PC_PORT_VL_OP_DATA0_F', 'value' => '392' }, '404' => { 'name' => 'IB_PC_PORT_VL_OP_DATA1_F', 'value' => '393' }, '405' => { 'name' => 'IB_PC_PORT_VL_OP_DATA2_F', 'value' => '394' }, '406' => { 'name' => 'IB_PC_PORT_VL_OP_DATA3_F', 'value' => '395' }, '407' => { 'name' => 'IB_PC_PORT_VL_OP_DATA4_F', 'value' => '396' }, '408' => { 'name' => 'IB_PC_PORT_VL_OP_DATA5_F', 'value' => '397' }, '409' => { 'name' => 'IB_PC_PORT_VL_OP_DATA6_F', 'value' => '398' }, '41' => { 'name' => 'IB_PORT_PHYS_STATE_F', 'value' => '40' }, '410' => { 'name' => 'IB_PC_PORT_VL_OP_DATA7_F', 'value' => '399' }, '411' => { 'name' => 'IB_PC_PORT_VL_OP_DATA8_F', 'value' => '400' }, '412' => { 'name' => 'IB_PC_PORT_VL_OP_DATA9_F', 'value' => '401' }, '413' => { 'name' => 'IB_PC_PORT_VL_OP_DATA10_F', 'value' => '402' }, '414' => { 'name' => 'IB_PC_PORT_VL_OP_DATA11_F', 'value' => '403' }, '415' => { 'name' => 'IB_PC_PORT_VL_OP_DATA12_F', 'value' => '404' }, '416' => { 'name' => 'IB_PC_PORT_VL_OP_DATA13_F', 'value' => '405' }, '417' => { 'name' => 'IB_PC_PORT_VL_OP_DATA14_F', 'value' => '406' }, '418' => { 'name' => 'IB_PC_PORT_VL_OP_DATA15_F', 'value' => '407' }, '419' => { 'name' => 'IB_PC_PORT_VL_OP_DATA_LAST_F', 'value' => '408' }, '42' => { 'name' => 'IB_PORT_LINK_DOWN_DEF_F', 'value' => '41' }, '420' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS_FIRST_F', 'value' => '409' }, '421' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS0_F', 'value' => '409' }, '422' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS1_F', 'value' => '410' }, '423' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS2_F', 'value' => '411' }, '424' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS3_F', 'value' => '412' }, '425' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS4_F', 'value' => '413' }, '426' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS5_F', 'value' => '414' }, '427' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS6_F', 'value' => '415' }, '428' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS7_F', 'value' => '416' }, '429' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS8_F', 'value' => '417' }, '43' => { 'name' => 'IB_PORT_MKEY_PROT_BITS_F', 'value' => '42' }, '430' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS9_F', 'value' => '418' }, '431' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS10_F', 
'value' => '419' }, '432' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS11_F', 'value' => '420' }, '433' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS12_F', 'value' => '421' }, '434' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS13_F', 'value' => '422' }, '435' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS14_F', 'value' => '423' }, '436' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS15_F', 'value' => '424' }, '437' => { 'name' => 'IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS_LAST_F', 'value' => '425' }, '438' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT_COUNTERS_FIRST_F', 'value' => '426' }, '439' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT0_F', 'value' => '426' }, '44' => { 'name' => 'IB_PORT_LMC_F', 'value' => '43' }, '440' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT1_F', 'value' => '427' }, '441' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT2_F', 'value' => '428' }, '442' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT3_F', 'value' => '429' }, '443' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT4_F', 'value' => '430' }, '444' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT5_F', 'value' => '431' }, '445' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT6_F', 'value' => '432' }, '446' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT7_F', 'value' => '433' }, '447' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT8_F', 'value' => '434' }, '448' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT9_F', 'value' => '435' }, '449' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT10_F', 'value' => '436' }, '45' => { 'name' => 'IB_PORT_LINK_SPEED_ACTIVE_F', 'value' => '44' }, '450' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT11_F', 'value' => '437' }, '451' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT12_F', 'value' => '438' }, '452' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT13_F', 'value' => '439' }, '453' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT14_F', 'value' => '440' }, '454' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT15_F', 'value' => '441' }, '455' => { 'name' => 'IB_PC_PORT_VL_XMIT_WAIT_COUNTERS_LAST_F', 'value' => '442' }, '456' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION_FIRST_F', 'value' => '443' }, '457' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION0_F', 'value' => '443' }, '458' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION1_F', 'value' => '444' }, '459' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION2_F', 'value' => '445' }, '46' => { 'name' => 'IB_PORT_LINK_SPEED_ENABLED_F', 'value' => '45' }, '460' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION3_F', 'value' => '446' }, '461' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION4_F', 'value' => '447' }, '462' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION5_F', 'value' => '448' }, '463' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION6_F', 'value' => '449' }, '464' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION7_F', 'value' => '450' }, '465' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION8_F', 'value' => '451' }, '466' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION9_F', 'value' => '452' }, '467' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION10_F', 'value' => '453' }, '468' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION11_F', 'value' => '454' }, '469' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION12_F', 'value' => '455' }, '47' => { 'name' => 'IB_PORT_NEIGHBOR_MTU_F', 'value' => '46' }, '470' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION13_F', 'value' => '456' }, '471' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION14_F', 'value' => '457' }, '472' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION15_F', 'value' => '458' }, '473' => { 'name' => 'IB_PC_SW_PORT_VL_CONGESTION_LAST_F', 'value' => '459' }, '474' => { 'name' => 'IB_PC_RCV_CON_CTRL_FIRST_F', 
'value' => '460' }, '475' => { 'name' => 'IB_PC_RCV_CON_CTRL_PKT_RCV_FECN_F', 'value' => '460' }, '476' => { 'name' => 'IB_PC_RCV_CON_CTRL_PKT_RCV_BECN_F', 'value' => '461' }, '477' => { 'name' => 'IB_PC_RCV_CON_CTRL_LAST_F', 'value' => '462' }, '478' => { 'name' => 'IB_PC_SL_RCV_FECN_FIRST_F', 'value' => '463' }, '479' => { 'name' => 'IB_PC_SL_RCV_FECN0_F', 'value' => '463' }, '48' => { 'name' => 'IB_PORT_SMSL_F', 'value' => '47' }, '480' => { 'name' => 'IB_PC_SL_RCV_FECN1_F', 'value' => '464' }, '481' => { 'name' => 'IB_PC_SL_RCV_FECN2_F', 'value' => '465' }, '482' => { 'name' => 'IB_PC_SL_RCV_FECN3_F', 'value' => '466' }, '483' => { 'name' => 'IB_PC_SL_RCV_FECN4_F', 'value' => '467' }, '484' => { 'name' => 'IB_PC_SL_RCV_FECN5_F', 'value' => '468' }, '485' => { 'name' => 'IB_PC_SL_RCV_FECN6_F', 'value' => '469' }, '486' => { 'name' => 'IB_PC_SL_RCV_FECN7_F', 'value' => '470' }, '487' => { 'name' => 'IB_PC_SL_RCV_FECN8_F', 'value' => '471' }, '488' => { 'name' => 'IB_PC_SL_RCV_FECN9_F', 'value' => '472' }, '489' => { 'name' => 'IB_PC_SL_RCV_FECN10_F', 'value' => '473' }, '49' => { 'name' => 'IB_PORT_VL_CAP_F', 'value' => '48' }, '490' => { 'name' => 'IB_PC_SL_RCV_FECN11_F', 'value' => '474' }, '491' => { 'name' => 'IB_PC_SL_RCV_FECN12_F', 'value' => '475' }, '492' => { 'name' => 'IB_PC_SL_RCV_FECN13_F', 'value' => '476' }, '493' => { 'name' => 'IB_PC_SL_RCV_FECN14_F', 'value' => '477' }, '494' => { 'name' => 'IB_PC_SL_RCV_FECN15_F', 'value' => '478' }, '495' => { 'name' => 'IB_PC_SL_RCV_FECN_LAST_F', 'value' => '479' }, '496' => { 'name' => 'IB_PC_SL_RCV_BECN_FIRST_F', 'value' => '480' }, '497' => { 'name' => 'IB_PC_SL_RCV_BECN0_F', 'value' => '480' }, '498' => { 'name' => 'IB_PC_SL_RCV_BECN1_F', 'value' => '481' }, '499' => { 'name' => 'IB_PC_SL_RCV_BECN2_F', 'value' => '482' }, '5' => { 'name' => 'IB_MAD_CLASSVER_F', 'value' => '5' }, '50' => { 'name' => 'IB_PORT_INIT_TYPE_F', 'value' => '49' }, '500' => { 'name' => 'IB_PC_SL_RCV_BECN3_F', 'value' => '483' }, '501' => { 'name' => 'IB_PC_SL_RCV_BECN4_F', 'value' => '484' }, '502' => { 'name' => 'IB_PC_SL_RCV_BECN5_F', 'value' => '485' }, '503' => { 'name' => 'IB_PC_SL_RCV_BECN6_F', 'value' => '486' }, '504' => { 'name' => 'IB_PC_SL_RCV_BECN7_F', 'value' => '487' }, '505' => { 'name' => 'IB_PC_SL_RCV_BECN8_F', 'value' => '488' }, '506' => { 'name' => 'IB_PC_SL_RCV_BECN9_F', 'value' => '489' }, '507' => { 'name' => 'IB_PC_SL_RCV_BECN10_F', 'value' => '490' }, '508' => { 'name' => 'IB_PC_SL_RCV_BECN11_F', 'value' => '491' }, '509' => { 'name' => 'IB_PC_SL_RCV_BECN12_F', 'value' => '492' }, '51' => { 'name' => 'IB_PORT_VL_HIGH_LIMIT_F', 'value' => '50' }, '510' => { 'name' => 'IB_PC_SL_RCV_BECN13_F', 'value' => '493' }, '511' => { 'name' => 'IB_PC_SL_RCV_BECN14_F', 'value' => '494' }, '512' => { 'name' => 'IB_PC_SL_RCV_BECN15_F', 'value' => '495' }, '513' => { 'name' => 'IB_PC_SL_RCV_BECN_LAST_F', 'value' => '496' }, '514' => { 'name' => 'IB_PC_XMIT_CON_CTRL_FIRST_F', 'value' => '497' }, '515' => { 'name' => 'IB_PC_XMIT_CON_CTRL_TIME_CONG_F', 'value' => '497' }, '516' => { 'name' => 'IB_PC_XMIT_CON_CTRL_LAST_F', 'value' => '498' }, '517' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG_FIRST_F', 'value' => '499' }, '518' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG0_F', 'value' => '499' }, '519' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG1_F', 'value' => '500' }, '52' => { 'name' => 'IB_PORT_VL_ARBITRATION_HIGH_CAP_F', 'value' => '51' }, '520' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG2_F', 'value' => '501' }, '521' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG3_F', 
'value' => '502' }, '522' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG4_F', 'value' => '503' }, '523' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG5_F', 'value' => '504' }, '524' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG6_F', 'value' => '505' }, '525' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG7_F', 'value' => '506' }, '526' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG8_F', 'value' => '507' }, '527' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG9_F', 'value' => '508' }, '528' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG10_F', 'value' => '509' }, '529' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG11_F', 'value' => '510' }, '53' => { 'name' => 'IB_PORT_VL_ARBITRATION_LOW_CAP_F', 'value' => '52' }, '530' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG12_F', 'value' => '511' }, '531' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG13_F', 'value' => '512' }, '532' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG14_F', 'value' => '513' }, '533' => { 'name' => 'IB_PC_VL_XMIT_TIME_CONG_LAST_F', 'value' => '514' }, '534' => { 'name' => 'IB_MLNX_EXT_PORT_STATE_CHG_ENABLE_F', 'value' => '515' }, '535' => { 'name' => 'IB_MLNX_EXT_PORT_LINK_SPEED_SUPPORTED_F', 'value' => '516' }, '536' => { 'name' => 'IB_MLNX_EXT_PORT_LINK_SPEED_ENABLED_F', 'value' => '517' }, '537' => { 'name' => 'IB_MLNX_EXT_PORT_LINK_SPEED_ACTIVE_F', 'value' => '518' }, '538' => { 'name' => 'IB_MLNX_EXT_PORT_LAST_F', 'value' => '519' }, '539' => { 'name' => 'IB_CC_CCKEY_F', 'value' => '520' }, '54' => { 'name' => 'IB_PORT_INIT_TYPE_REPLY_F', 'value' => '53' }, '540' => { 'name' => 'IB_CC_CONGESTION_INFO_FIRST_F', 'value' => '521' }, '541' => { 'name' => 'IB_CC_CONGESTION_INFO_F', 'value' => '521' }, '542' => { 'name' => 'IB_CC_CONGESTION_INFO_CONTROL_TABLE_CAP_F', 'value' => '522' }, '543' => { 'name' => 'IB_CC_CONGESTION_INFO_LAST_F', 'value' => '523' }, '544' => { 'name' => 'IB_CC_CONGESTION_KEY_INFO_FIRST_F', 'value' => '524' }, '545' => { 'name' => 'IB_CC_CONGESTION_KEY_INFO_CC_KEY_F', 'value' => '524' }, '546' => { 'name' => 'IB_CC_CONGESTION_KEY_INFO_CC_KEY_PROTECT_BIT_F', 'value' => '525' }, '547' => { 'name' => 'IB_CC_CONGESTION_KEY_INFO_CC_KEY_LEASE_PERIOD_F', 'value' => '526' }, '548' => { 'name' => 'IB_CC_CONGESTION_KEY_INFO_CC_KEY_VIOLATIONS_F', 'value' => '527' }, '549' => { 'name' => 'IB_CC_CONGESTION_KEY_INFO_LAST_F', 'value' => '528' }, '55' => { 'name' => 'IB_PORT_MTU_CAP_F', 'value' => '54' }, '550' => { 'name' => 'IB_CC_CONGESTION_LOG_FIRST_F', 'value' => '529' }, '551' => { 'name' => 'IB_CC_CONGESTION_LOG_LOGTYPE_F', 'value' => '529' }, '552' => { 'name' => 'IB_CC_CONGESTION_LOG_CONGESTION_FLAGS_F', 'value' => '530' }, '553' => { 'name' => 'IB_CC_CONGESTION_LOG_LAST_F', 'value' => '531' }, '554' => { 'name' => 'IB_CC_CONGESTION_LOG_SWITCH_FIRST_F', 'value' => '532' }, '555' => { 'name' => 'IB_CC_CONGESTION_LOG_SWITCH_LOG_EVENTS_COUNTER_F', 'value' => '532' }, '556' => { 'name' => 'IB_CC_CONGESTION_LOG_SWITCH_CURRENT_TIME_STAMP_F', 'value' => '533' }, '557' => { 'name' => 'IB_CC_CONGESTION_LOG_SWITCH_PORTMAP_F', 'value' => '534' }, '558' => { 'name' => 'IB_CC_CONGESTION_LOG_SWITCH_LAST_F', 'value' => '535' }, '559' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_SWITCH_FIRST_F', 'value' => '536' }, '56' => { 'name' => 'IB_PORT_VL_STALL_COUNT_F', 'value' => '55' }, '560' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_SWITCH_SLID_F', 'value' => '536' }, '561' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_SWITCH_DLID_F', 'value' => '537' }, '562' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_SWITCH_SL_F', 'value' => '538' }, '563' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_SWITCH_TIMESTAMP_F', 
'value' => '539' }, '564' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_SWITCH_LAST_F', 'value' => '540' }, '565' => { 'name' => 'IB_CC_CONGESTION_LOG_CA_FIRST_F', 'value' => '541' }, '566' => { 'name' => 'IB_CC_CONGESTION_LOG_CA_THRESHOLD_EVENT_COUNTER_F', 'value' => '541' }, '567' => { 'name' => 'IB_CC_CONGESTION_LOG_CA_THRESHOLD_CONGESTION_EVENT_MAP_F', 'value' => '542' }, '568' => { 'name' => 'IB_CC_CONGESTION_LOG_CA_CURRENT_TIMESTAMP_F', 'value' => '543' }, '569' => { 'name' => 'IB_CC_CONGESTION_LOG_CA_LAST_F', 'value' => '544' }, '57' => { 'name' => 'IB_PORT_HOQ_LIFE_F', 'value' => '56' }, '570' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_FIRST_F', 'value' => '545' }, '571' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_LOCAL_QP_CN_ENTRY_F', 'value' => '545' }, '572' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_SL_CN_ENTRY_F', 'value' => '546' }, '573' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_SERVICE_TYPE_CN_ENTRY_F', 'value' => '547' }, '574' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_REMOTE_QP_NUMBER_CN_ENTRY_F', 'value' => '548' }, '575' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_LOCAL_LID_CN_F', 'value' => '549' }, '576' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_REMOTE_LID_CN_ENTRY_F', 'value' => '550' }, '577' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_TIMESTAMP_CN_ENTRY_F', 'value' => '551' }, '578' => { 'name' => 'IB_CC_CONGESTION_LOG_ENTRY_CA_LAST_F', 'value' => '552' }, '579' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_FIRST_F', 'value' => '553' }, '58' => { 'name' => 'IB_PORT_OPER_VLS_F', 'value' => '57' }, '580' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_CONTROL_MAP_F', 'value' => '553' }, '581' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_VICTIM_MASK_F', 'value' => '554' }, '582' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_CREDIT_MASK_F', 'value' => '555' }, '583' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_THRESHOLD_F', 'value' => '556' }, '584' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_PACKET_SIZE_F', 'value' => '557' }, '585' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_CS_THRESHOLD_F', 'value' => '558' }, '586' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_CS_RETURN_DELAY_F', 'value' => '559' }, '587' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_MARKING_RATE_F', 'value' => '560' }, '588' => { 'name' => 'IB_CC_SWITCH_CONGESTION_SETTING_LAST_F', 'value' => '561' }, '589' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_FIRST_F', 'value' => '562' }, '59' => { 'name' => 'IB_PORT_PART_EN_INB_F', 'value' => '58' }, '590' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_VALID_F', 'value' => '562' }, '591' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_CONTROL_TYPE_F', 'value' => '563' }, '592' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_THRESHOLD_F', 'value' => '564' }, '593' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_PACKET_SIZE_F', 'value' => '565' }, '594' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_CONG_PARM_MARKING_RATE_F', 'value' => '566' }, '595' => { 'name' => 'IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_LAST_F', 'value' => '567' }, '596' => { 'name' => 'IB_CC_CA_CONGESTION_SETTING_FIRST_F', 'value' => '568' }, '597' => { 'name' => 'IB_CC_CA_CONGESTION_SETTING_PORT_CONTROL_F', 'value' => '568' }, '598' => { 'name' => 'IB_CC_CA_CONGESTION_SETTING_CONTROL_MAP_F', 'value' => '569' }, '599' => { 'name' => 'IB_CC_CA_CONGESTION_SETTING_LAST_F', 'value' => '570' }, '6' => { 'name' => 'IB_MAD_MGMTCLASS_F', 'value' => '6' }, '60' => { 
'name' => 'IB_PORT_PART_EN_OUTB_F', 'value' => '59' }, '600' => { 'name' => 'IB_CC_CA_CONGESTION_ENTRY_FIRST_F', 'value' => '571' }, '601' => { 'name' => 'IB_CC_CA_CONGESTION_ENTRY_CCTI_TIMER_F', 'value' => '571' }, '602' => { 'name' => 'IB_CC_CA_CONGESTION_ENTRY_CCTI_INCREASE_F', 'value' => '572' }, '603' => { 'name' => 'IB_CC_CA_CONGESTION_ENTRY_TRIGGER_THRESHOLD_F', 'value' => '573' }, '604' => { 'name' => 'IB_CC_CA_CONGESTION_ENTRY_CCTI_MIN_F', 'value' => '574' }, '605' => { 'name' => 'IB_CC_CA_CONGESTION_ENTRY_LAST_F', 'value' => '575' }, '606' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_FIRST_F', 'value' => '576' }, '607' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_CCTI_LIMIT_F', 'value' => '576' }, '608' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_LAST_F', 'value' => '577' }, '609' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_FIRST_F', 'value' => '578' }, '61' => { 'name' => 'IB_PORT_FILTER_RAW_INB_F', 'value' => '60' }, '610' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_CCT_SHIFT_F', 'value' => '578' }, '611' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_CCT_MULTIPLIER_F', 'value' => '579' }, '612' => { 'name' => 'IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_LAST_F', 'value' => '580' }, '613' => { 'name' => 'IB_CC_TIMESTAMP_FIRST_F', 'value' => '581' }, '614' => { 'name' => 'IB_CC_TIMESTAMP_F', 'value' => '581' }, '615' => { 'name' => 'IB_CC_TIMESTAMP_LAST_F', 'value' => '582' }, '616' => { 'name' => 'IB_SA_NR_FIRST_F', 'value' => '583' }, '617' => { 'name' => 'IB_SA_NR_LID_F', 'value' => '583' }, '618' => { 'name' => 'IB_SA_NR_BASEVER_F', 'value' => '584' }, '619' => { 'name' => 'IB_SA_NR_CLASSVER_F', 'value' => '585' }, '62' => { 'name' => 'IB_PORT_FILTER_RAW_OUTB_F', 'value' => '61' }, '620' => { 'name' => 'IB_SA_NR_TYPE_F', 'value' => '586' }, '621' => { 'name' => 'IB_SA_NR_NPORTS_F', 'value' => '587' }, '622' => { 'name' => 'IB_SA_NR_SYSTEM_GUID_F', 'value' => '588' }, '623' => { 'name' => 'IB_SA_NR_GUID_F', 'value' => '589' }, '624' => { 'name' => 'IB_SA_NR_PORT_GUID_F', 'value' => '590' }, '625' => { 'name' => 'IB_SA_NR_PARTITION_CAP_F', 'value' => '591' }, '626' => { 'name' => 'IB_SA_NR_DEVID_F', 'value' => '592' }, '627' => { 'name' => 'IB_SA_NR_REVISION_F', 'value' => '593' }, '628' => { 'name' => 'IB_SA_NR_LOCAL_PORT_F', 'value' => '594' }, '629' => { 'name' => 'IB_SA_NR_VENDORID_F', 'value' => '595' }, '63' => { 'name' => 'IB_PORT_MKEY_VIOL_F', 'value' => '62' }, '630' => { 'name' => 'IB_SA_NR_NODEDESC_F', 'value' => '596' }, '631' => { 'name' => 'IB_SA_NR_LAST_F', 'value' => '597' }, '632' => { 'name' => 'IB_PSR_TAG_F', 'value' => '598' }, '633' => { 'name' => 'IB_PSR_SAMPLE_STATUS_F', 'value' => '599' }, '634' => { 'name' => 'IB_PSR_COUNTER0_F', 'value' => '600' }, '635' => { 'name' => 'IB_PSR_COUNTER1_F', 'value' => '601' }, '636' => { 'name' => 'IB_PSR_COUNTER2_F', 'value' => '602' }, '637' => { 'name' => 'IB_PSR_COUNTER3_F', 'value' => '603' }, '638' => { 'name' => 'IB_PSR_COUNTER4_F', 'value' => '604' }, '639' => { 'name' => 'IB_PSR_COUNTER5_F', 'value' => '605' }, '64' => { 'name' => 'IB_PORT_PKEY_VIOL_F', 'value' => '63' }, '640' => { 'name' => 'IB_PSR_COUNTER6_F', 'value' => '606' }, '641' => { 'name' => 'IB_PSR_COUNTER7_F', 'value' => '607' }, '642' => { 'name' => 'IB_PSR_COUNTER8_F', 'value' => '608' }, '643' => { 'name' => 'IB_PSR_COUNTER9_F', 'value' => '609' }, '644' => { 'name' => 'IB_PSR_COUNTER10_F', 'value' => '610' }, '645' => { 'name' => 'IB_PSR_COUNTER11_F', 'value' => '611' }, '646' => { 'name' => 'IB_PSR_COUNTER12_F', 
'value' => '612' }, '647' => { 'name' => 'IB_PSR_COUNTER13_F', 'value' => '613' }, '648' => { 'name' => 'IB_PSR_COUNTER14_F', 'value' => '614' }, '649' => { 'name' => 'IB_PSR_LAST_F', 'value' => '615' }, '65' => { 'name' => 'IB_PORT_QKEY_VIOL_F', 'value' => '64' }, '650' => { 'name' => 'IB_PORT_EXT_FIRST_F', 'value' => '616' }, '651' => { 'name' => 'IB_PORT_EXT_CAPMASK_F', 'value' => '616' }, '652' => { 'name' => 'IB_PORT_EXT_FEC_MODE_ACTIVE_F', 'value' => '617' }, '653' => { 'name' => 'IB_PORT_EXT_FDR_FEC_MODE_SUPPORTED_F', 'value' => '618' }, '654' => { 'name' => 'IB_PORT_EXT_FDR_FEC_MODE_ENABLED_F', 'value' => '619' }, '655' => { 'name' => 'IB_PORT_EXT_EDR_FEC_MODE_SUPPORTED_F', 'value' => '620' }, '656' => { 'name' => 'IB_PORT_EXT_EDR_FEC_MODE_ENABLED_F', 'value' => '621' }, '657' => { 'name' => 'IB_PORT_EXT_LAST_F', 'value' => '622' }, '658' => { 'name' => 'IB_PESC_RSFEC_FIRST_F', 'value' => '623' }, '659' => { 'name' => 'IB_PESC_RSFEC_PORT_SELECT_F', 'value' => '623' }, '66' => { 'name' => 'IB_PORT_GUID_CAP_F', 'value' => '65' }, '660' => { 'name' => 'IB_PESC_RSFEC_COUNTER_SELECT_F', 'value' => '624' }, '661' => { 'name' => 'IB_PESC_RSFEC_SYNC_HDR_ERR_CTR_F', 'value' => '625' }, '662' => { 'name' => 'IB_PESC_RSFEC_UNK_BLOCK_CTR_F', 'value' => '626' }, '663' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE0_F', 'value' => '627' }, '664' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE1_F', 'value' => '628' }, '665' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE2_F', 'value' => '629' }, '666' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE3_F', 'value' => '630' }, '667' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE4_F', 'value' => '631' }, '668' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE5_F', 'value' => '632' }, '669' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE6_F', 'value' => '633' }, '67' => { 'name' => 'IB_PORT_CLIENT_REREG_F', 'value' => '66' }, '670' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE7_F', 'value' => '634' }, '671' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE8_F', 'value' => '635' }, '672' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE9_F', 'value' => '636' }, '673' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE10_F', 'value' => '637' }, '674' => { 'name' => 'IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE11_F', 'value' => '638' }, '675' => { 'name' => 'IB_PESC_PORT_FEC_CORR_BLOCK_CTR_F', 'value' => '639' }, '676' => { 'name' => 'IB_PESC_PORT_FEC_UNCORR_BLOCK_CTR_F', 'value' => '640' }, '677' => { 'name' => 'IB_PESC_PORT_FEC_CORR_SYMBOL_CTR_F', 'value' => '641' }, '678' => { 'name' => 'IB_PESC_RSFEC_LAST_F', 'value' => '642' }, '679' => { 'name' => 'IB_PC_EXT_COUNTER_SELECT2_F', 'value' => '643' }, '68' => { 'name' => 'IB_PORT_MCAST_PKEY_SUPR_ENAB_F', 'value' => '67' }, '680' => { 'name' => 'IB_PC_EXT_ERR_SYM_F', 'value' => '644' }, '681' => { 'name' => 'IB_PC_EXT_LINK_RECOVERS_F', 'value' => '645' }, '682' => { 'name' => 'IB_PC_EXT_LINK_DOWNED_F', 'value' => '646' }, '683' => { 'name' => 'IB_PC_EXT_ERR_RCV_F', 'value' => '647' }, '684' => { 'name' => 'IB_PC_EXT_ERR_PHYSRCV_F', 'value' => '648' }, '685' => { 'name' => 'IB_PC_EXT_ERR_SWITCH_REL_F', 'value' => '649' }, '686' => { 'name' => 'IB_PC_EXT_XMT_DISCARDS_F', 'value' => '650' }, '687' => { 'name' => 'IB_PC_EXT_ERR_XMTCONSTR_F', 'value' => '651' }, '688' => { 'name' => 'IB_PC_EXT_ERR_RCVCONSTR_F', 'value' => '652' }, '689' => { 'name' => 'IB_PC_EXT_ERR_LOCALINTEG_F', 'value' => '653' }, '69' => { 'name' => 'IB_PORT_SUBN_TIMEOUT_F', 
'value' => '68' }, '690' => { 'name' => 'IB_PC_EXT_ERR_EXCESS_OVR_F', 'value' => '654' }, '691' => { 'name' => 'IB_PC_EXT_VL15_DROPPED_F', 'value' => '655' }, '692' => { 'name' => 'IB_PC_EXT_XMT_WAIT_F', 'value' => '656' }, '693' => { 'name' => 'IB_PC_EXT_QP1_DROP_F', 'value' => '657' }, '694' => { 'name' => 'IB_PC_EXT_ERR_LAST_F', 'value' => '658' }, '695' => { 'name' => 'IB_PC_QP1_DROP_F', 'value' => '659' }, '696' => { 'name' => 'IB_PORT_EXT_HDR_FEC_MODE_SUPPORTED_F', 'value' => '660' }, '697' => { 'name' => 'IB_PORT_EXT_HDR_FEC_MODE_ENABLED_F', 'value' => '661' }, '698' => { 'name' => 'IB_PORT_EXT_HDR_FEC_MODE_LAST_F', 'value' => '662' }, '699' => { 'name' => 'IB_PORT_EXT_NDR_FEC_MODE_SUPPORTED_F', 'value' => '663' }, '7' => { 'name' => 'IB_MAD_BASEVER_F', 'value' => '7' }, '70' => { 'name' => 'IB_PORT_RESP_TIME_VAL_F', 'value' => '69' }, '700' => { 'name' => 'IB_PORT_EXT_NDR_FEC_MODE_ENABLED_F', 'value' => '664' }, '701' => { 'name' => 'IB_PORT_EXT_NDR_FEC_MODE_LAST_F', 'value' => '665' }, '702' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_ACTIVE_2_F', 'value' => '666' }, '703' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_SUPPORTED_2_F', 'value' => '667' }, '704' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_ENABLED_2_F', 'value' => '668' }, '705' => { 'name' => 'IB_PORT_LINK_SPEED_EXT_2_LAST_F', 'value' => '669' }, '706' => { 'name' => 'IB_FIELD_LAST_', 'value' => '670' }, '71' => { 'name' => 'IB_PORT_LOCAL_PHYS_ERR_F', 'value' => '70' }, '72' => { 'name' => 'IB_PORT_OVERRUN_ERR_F', 'value' => '71' }, '73' => { 'name' => 'IB_PORT_MAX_CREDIT_HINT_F', 'value' => '72' }, '74' => { 'name' => 'IB_PORT_LINK_ROUND_TRIP_F', 'value' => '73' }, '75' => { 'name' => 'IB_PORT_LAST_F', 'value' => '74' }, '76' => { 'name' => 'IB_NODE_FIRST_F', 'value' => '75' }, '77' => { 'name' => 'IB_NODE_BASE_VERS_F', 'value' => '75' }, '78' => { 'name' => 'IB_NODE_CLASS_VERS_F', 'value' => '76' }, '79' => { 'name' => 'IB_NODE_TYPE_F', 'value' => '77' }, '8' => { 'name' => 'IB_MAD_STATUS_F', 'value' => '8' }, '80' => { 'name' => 'IB_NODE_NPORTS_F', 'value' => '78' }, '81' => { 'name' => 'IB_NODE_SYSTEM_GUID_F', 'value' => '79' }, '82' => { 'name' => 'IB_NODE_GUID_F', 'value' => '80' }, '83' => { 'name' => 'IB_NODE_PORT_GUID_F', 'value' => '81' }, '84' => { 'name' => 'IB_NODE_PARTITION_CAP_F', 'value' => '82' }, '85' => { 'name' => 'IB_NODE_DEVID_F', 'value' => '83' }, '86' => { 'name' => 'IB_NODE_REVISION_F', 'value' => '84' }, '87' => { 'name' => 'IB_NODE_LOCAL_PORT_F', 'value' => '85' }, '88' => { 'name' => 'IB_NODE_VENDORID_F', 'value' => '86' }, '89' => { 'name' => 'IB_NODE_LAST_F', 'value' => '87' }, '9' => { 'name' => 'IB_DRSMP_HOPCNT_F', 'value' => '9' }, '90' => { 'name' => 'IB_SW_FIRST_F', 'value' => '88' }, '91' => { 'name' => 'IB_SW_LINEAR_FDB_CAP_F', 'value' => '88' }, '92' => { 'name' => 'IB_SW_RANDOM_FDB_CAP_F', 'value' => '89' }, '93' => { 'name' => 'IB_SW_MCAST_FDB_CAP_F', 'value' => '90' }, '94' => { 'name' => 'IB_SW_LINEAR_FDB_TOP_F', 'value' => '91' }, '95' => { 'name' => 'IB_SW_DEF_PORT_F', 'value' => '92' }, '96' => { 'name' => 'IB_SW_DEF_MCAST_PRIM_F', 'value' => '93' }, '97' => { 'name' => 'IB_SW_DEF_MCAST_NOT_PRIM_F', 'value' => '94' }, '98' => { 'name' => 'IB_SW_LIFE_TIME_F', 'value' => '95' }, '99' => { 'name' => 'IB_SW_STATE_CHANGE_F', 'value' => '96' } }, 'Name' => 'enum MAD_FIELDS', 'Size' => '4', 'Type' => 'Enum' }, '114' => { 'BaseType' => '86', 'Header' => undef, 'Line' => '38', 'Name' => '__uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '133' => { 'BaseType' => '93', 'Header' => undef, 'Line' => 
'40', 'Name' => '__uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '13740' => { 'BaseType' => '6679', 'Name' => 'ibnd_fabric_t*', 'Size' => '8', 'Type' => 'Pointer' }, '145' => { 'BaseType' => '100', 'Header' => undef, 'Line' => '42', 'Name' => '__uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '157' => { 'BaseType' => '58', 'Header' => undef, 'Line' => '45', 'Name' => '__uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '193' => { 'BaseType' => '1', 'Name' => 'void*', 'Size' => '8', 'Type' => 'Pointer' }, '200' => { 'BaseType' => '210', 'Name' => 'char*', 'Size' => '8', 'Type' => 'Pointer' }, '210' => { 'Name' => 'char', 'Size' => '1', 'Type' => 'Intrinsic' }, '217' => { 'BaseType' => '210', 'Name' => 'char const', 'Size' => '1', 'Type' => 'Const' }, '240' => { 'BaseType' => '114', 'Header' => undef, 'Line' => '24', 'Name' => 'uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '252' => { 'BaseType' => '133', 'Header' => undef, 'Line' => '25', 'Name' => 'uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '264' => { 'BaseType' => '145', 'Header' => undef, 'Line' => '26', 'Name' => 'uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '276' => { 'BaseType' => '157', 'Header' => undef, 'Line' => '27', 'Name' => 'uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '28308' => { 'Header' => undef, 'Line' => '145', 'Memb' => { '0' => { 'name' => 'max_smps', 'offset' => '0', 'type' => '100' }, '1' => { 'name' => 'show_progress', 'offset' => '4', 'type' => '100' }, '2' => { 'name' => 'max_hops', 'offset' => '8', 'type' => '100' }, '3' => { 'name' => 'debug', 'offset' => '18', 'type' => '100' }, '4' => { 'name' => 'timeout_ms', 'offset' => '22', 'type' => '100' }, '5' => { 'name' => 'retries', 'offset' => '32', 'type' => '100' }, '6' => { 'name' => 'flags', 'offset' => '36', 'type' => '264' }, '7' => { 'name' => 'mkey', 'offset' => '50', 'type' => '276' }, '8' => { 'name' => 'pad', 'offset' => '64', 'type' => '28439' } }, 'Name' => 'struct ibnd_config', 'Size' => '88', 'Type' => 'Struct' }, '28439' => { 'BaseType' => '240', 'Name' => 'uint8_t[44]', 'Size' => '44', 'Type' => 'Array' }, '28683' => { 'BaseType' => '28695', 'Header' => undef, 'Line' => '214', 'Name' => 'ibnd_iter_node_func_t', 'Size' => '8', 'Type' => 'Typedef' }, '28695' => { 'Name' => 'void(*)(ibnd_node_t*, void*)', 'Param' => { '0' => { 'type' => '6420' }, '1' => { 'type' => '193' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '28716' => { 'BaseType' => '28728', 'Header' => undef, 'Line' => '227', 'Name' => 'ibnd_iter_port_func_t', 'Size' => '8', 'Type' => 'Typedef' }, '28728' => { 'Name' => 'void(*)(ibnd_port_t*, void*)', 'Param' => { '0' => { 'type' => '6674' }, '1' => { 'type' => '193' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '288' => { 'BaseType' => '217', 'Name' => 'char const*', 'Size' => '8', 'Type' => 'Pointer' }, '29205' => { 'BaseType' => '28308', 'Name' => 'struct ibnd_config*', 'Size' => '8', 'Type' => 'Pointer' }, '29876' => { 'BaseType' => '1051', 'Name' => 'ib_portid_t*', 'Size' => '8', 'Type' => 'Pointer' }, '46' => { 'BaseType' => '58', 'Header' => undef, 'Line' => '214', 'Name' => 'size_t', 'Size' => '8', 'Type' => 'Typedef' }, '58' => { 'Name' => 'unsigned long', 'Size' => '8', 'Type' => 'Intrinsic' }, '5825' => { 'Header' => undef, 'Line' => '54', 'Memb' => { '0' => { 'name' => 'next', 'offset' => '0', 'type' => '6124' }, '1' => { 'name' => 'path_portid', 'offset' => '8', 'type' => '1051' }, '10' => { 'name' => 'nodedesc', 'offset' => '626', 'type' => '6129' }, '11' => { 'name' => 'ports', 
'offset' => '836', 'type' => '6289' }, '12' => { 'name' => 'next_chassis_node', 'offset' => '850', 'type' => '6124' }, '13' => { 'name' => 'chassis', 'offset' => '864', 'type' => '6403' }, '14' => { 'name' => 'ch_type', 'offset' => '872', 'type' => '86' }, '15' => { 'name' => 'ch_type_str', 'offset' => '873', 'type' => '770' }, '16' => { 'name' => 'ch_anafanum', 'offset' => '905', 'type' => '86' }, '17' => { 'name' => 'ch_slotnum', 'offset' => '912', 'type' => '86' }, '18' => { 'name' => 'ch_slot', 'offset' => '913', 'type' => '86' }, '19' => { 'name' => 'ch_found', 'offset' => '914', 'type' => '86' }, '2' => { 'name' => 'smalid', 'offset' => '288', 'type' => '252' }, '20' => { 'name' => 'htnext', 'offset' => '1024', 'type' => '6124' }, '21' => { 'name' => 'type_next', 'offset' => '1032', 'type' => '6124' }, '3' => { 'name' => 'smalmc', 'offset' => '290', 'type' => '240' }, '4' => { 'name' => 'smaenhsp0', 'offset' => '292', 'type' => '65' }, '5' => { 'name' => 'switchinfo', 'offset' => '296', 'type' => '906' }, '6' => { 'name' => 'guid', 'offset' => '402', 'type' => '276' }, '7' => { 'name' => 'type', 'offset' => '512', 'type' => '65' }, '8' => { 'name' => 'numports', 'offset' => '516', 'type' => '65' }, '9' => { 'name' => 'info', 'offset' => '520', 'type' => '906' } }, 'Name' => 'struct ibnd_node', 'Size' => '416', 'Type' => 'Struct' }, '6124' => { 'BaseType' => '5825', 'Name' => 'struct ibnd_node*', 'Size' => '8', 'Type' => 'Pointer' }, '6129' => { 'BaseType' => '210', 'Name' => 'char[65]', 'Size' => '65', 'Type' => 'Array' }, '6145' => { 'Header' => undef, 'Line' => '104', 'Memb' => { '0' => { 'name' => 'guid', 'offset' => '0', 'type' => '276' }, '1' => { 'name' => 'portnum', 'offset' => '8', 'type' => '65' }, '2' => { 'name' => 'ext_portnum', 'offset' => '18', 'type' => '65' }, '3' => { 'name' => 'node', 'offset' => '22', 'type' => '6420' }, '4' => { 'name' => 'remoteport', 'offset' => '36', 'type' => '6294' }, '5' => { 'name' => 'base_lid', 'offset' => '50', 'type' => '252' }, '6' => { 'name' => 'lmc', 'offset' => '52', 'type' => '240' }, '7' => { 'name' => 'info', 'offset' => '53', 'type' => '906' }, '8' => { 'name' => 'ext_info', 'offset' => '153', 'type' => '906' }, '9' => { 'name' => 'htnext', 'offset' => '360', 'type' => '6294' } }, 'Name' => 'struct ibnd_port', 'Size' => '176', 'Type' => 'Struct' }, '6289' => { 'BaseType' => '6294', 'Name' => 'struct ibnd_port**', 'Size' => '8', 'Type' => 'Pointer' }, '6294' => { 'BaseType' => '6145', 'Name' => 'struct ibnd_port*', 'Size' => '8', 'Type' => 'Pointer' }, '6299' => { 'Header' => undef, 'Line' => '124', 'Memb' => { '0' => { 'name' => 'next', 'offset' => '0', 'type' => '6403' }, '1' => { 'name' => 'chassisguid', 'offset' => '8', 'type' => '276' }, '2' => { 'name' => 'chassisnum', 'offset' => '22', 'type' => '86' }, '3' => { 'name' => 'nodecount', 'offset' => '23', 'type' => '86' }, '4' => { 'name' => 'nodes', 'offset' => '36', 'type' => '6420' }, '5' => { 'name' => 'spinenode', 'offset' => '50', 'type' => '6437' }, '6' => { 'name' => 'linenode', 'offset' => '388', 'type' => '6453' } }, 'Name' => 'struct ibnd_chassis', 'Size' => '480', 'Type' => 'Struct' }, '6403' => { 'BaseType' => '6299', 'Name' => 'struct ibnd_chassis*', 'Size' => '8', 'Type' => 'Pointer' }, '6408' => { 'BaseType' => '5825', 'Header' => undef, 'Line' => '99', 'Name' => 'ibnd_node_t', 'Size' => '416', 'Type' => 'Typedef' }, '6420' => { 'BaseType' => '6408', 'Name' => 'ibnd_node_t*', 'Size' => '8', 'Type' => 'Pointer' }, '6425' => { 'BaseType' => '6145', 'Header' => 
undef, 'Line' => '119', 'Name' => 'ibnd_port_t', 'Size' => '176', 'Type' => 'Typedef' }, '6437' => { 'BaseType' => '6420', 'Name' => 'ibnd_node_t*[19]', 'Size' => '152', 'Type' => 'Array' }, '6453' => { 'BaseType' => '6420', 'Name' => 'ibnd_node_t*[37]', 'Size' => '296', 'Type' => 'Array' }, '6469' => { 'BaseType' => '6299', 'Header' => undef, 'Line' => '138', 'Name' => 'ibnd_chassis_t', 'Size' => '480', 'Type' => 'Typedef' }, '6481' => { 'Header' => undef, 'Line' => '161', 'Memb' => { '0' => { 'name' => 'from_node', 'offset' => '0', 'type' => '6420' }, '1' => { 'name' => 'from_portnum', 'offset' => '8', 'type' => '65' }, '10' => { 'name' => 'routers', 'offset' => '8776', 'type' => '6420' }, '2' => { 'name' => 'nodes', 'offset' => '22', 'type' => '6420' }, '3' => { 'name' => 'chassis', 'offset' => '36', 'type' => '6637' }, '4' => { 'name' => 'maxhops_discovered', 'offset' => '50', 'type' => '100' }, '5' => { 'name' => 'total_mads_used', 'offset' => '54', 'type' => '100' }, '6' => { 'name' => 'nodestbl', 'offset' => '64', 'type' => '6642' }, '7' => { 'name' => 'portstbl', 'offset' => '4406', 'type' => '6658' }, '8' => { 'name' => 'switches', 'offset' => '8754', 'type' => '6420' }, '9' => { 'name' => 'ch_adapters', 'offset' => '8768', 'type' => '6420' } }, 'Name' => 'struct ibnd_fabric', 'Size' => '2256', 'Type' => 'Struct' }, '65' => { 'Name' => 'int', 'Size' => '4', 'Type' => 'Intrinsic' }, '6637' => { 'BaseType' => '6469', 'Name' => 'ibnd_chassis_t*', 'Size' => '8', 'Type' => 'Pointer' }, '6642' => { 'BaseType' => '6420', 'Name' => 'ibnd_node_t*[137]', 'Size' => '1096', 'Type' => 'Array' }, '6658' => { 'BaseType' => '6674', 'Name' => 'ibnd_port_t*[137]', 'Size' => '1096', 'Type' => 'Array' }, '6674' => { 'BaseType' => '6425', 'Name' => 'ibnd_port_t*', 'Size' => '8', 'Type' => 'Pointer' }, '6679' => { 'BaseType' => '6481', 'Header' => undef, 'Line' => '182', 'Name' => 'ibnd_fabric_t', 'Size' => '2256', 'Type' => 'Typedef' }, '770' => { 'BaseType' => '210', 'Name' => 'char[20]', 'Size' => '20', 'Type' => 'Array' }, '818' => { 'BaseType' => '240', 'Name' => 'uint8_t[16]', 'Size' => '16', 'Type' => 'Array' }, '834' => { 'BaseType' => '818', 'Header' => undef, 'Line' => '243', 'Name' => 'ibmad_gid_t', 'Size' => '16', 'Type' => 'Typedef' }, '86' => { 'Name' => 'unsigned char', 'Size' => '1', 'Type' => 'Intrinsic' }, '906' => { 'BaseType' => '240', 'Name' => 'uint8_t[64]', 'Size' => '64', 'Type' => 'Array' }, '93' => { 'Name' => 'unsigned short', 'Size' => '2', 'Type' => 'Intrinsic' }, '934' => { 'Header' => undef, 'Line' => '308', 'Memb' => { '0' => { 'name' => 'lid', 'offset' => '0', 'type' => '65' }, '1' => { 'name' => 'drpath', 'offset' => '4', 'type' => '922' }, '2' => { 'name' => 'grh_present', 'offset' => '118', 'type' => '65' }, '3' => { 'name' => 'gid', 'offset' => '128', 'type' => '834' }, '4' => { 'name' => 'qp', 'offset' => '150', 'type' => '264' }, '5' => { 'name' => 'qkey', 'offset' => '256', 'type' => '264' }, '6' => { 'name' => 'sl', 'offset' => '260', 'type' => '240' }, '7' => { 'name' => 'pkey_idx', 'offset' => '264', 'type' => '100' } }, 'Name' => 'struct portid', 'Size' => '112', 'Type' => 'Struct' } }, 'UndefinedSymbols' => { 'libibnetdisc.so.5.1.56.0' => { '_ITM_deregisterTMCloneTable' => 0, '_ITM_registerTMCloneTable' => 0, '__cxa_finalize@GLIBC_2.2.5' => 0, '__errno_location@GLIBC_2.2.5' => 0, '__fprintf_chk@GLIBC_2.3.4' => 0, '__gmon_start__' => 0, '__memset_chk@GLIBC_2.3.4' => 0, '__printf_chk@GLIBC_2.3.4' => 0, '__snprintf_chk@GLIBC_2.3.4' => 0, 
'__stack_chk_fail@GLIBC_2.4' => 0, 'calloc@GLIBC_2.2.5' => 0, 'close@GLIBC_2.2.5' => 0, 'free@GLIBC_2.2.5' => 0, 'ib_resolve_self_via@IBMAD_1.3' => 0, 'ibdebug@IBMAD_1.3' => 0, 'lseek@GLIBC_2.2.5' => 0, 'mad_build_pkt@IBMAD_1.3' => 0, 'mad_decode_field@IBMAD_1.3' => 0, 'mad_dump_node_type@IBMAD_1.3' => 0, 'mad_dump_val@IBMAD_1.3' => 0, 'mad_get_field64@IBMAD_1.3' => 0, 'mad_get_field@IBMAD_1.3' => 0, 'mad_rpc_close_port2@IBMAD_1.5' => 0, 'mad_rpc_open_port2@IBMAD_1.5' => 0, 'mad_rpc_set_retries@IBMAD_1.3' => 0, 'mad_rpc_set_timeout@IBMAD_1.3' => 0, 'mad_trid@IBMAD_1.3' => 0, 'malloc@GLIBC_2.2.5' => 0, 'open@GLIBC_2.2.5' => 0, 'portid2str@IBMAD_1.3' => 0, 'read@GLIBC_2.2.5' => 0, 'smp_mkey_set@IBMAD_1.3' => 0, 'snprintf@GLIBC_2.2.5' => 0, 'stat@GLIBC_2.33' => 0, 'stderr@GLIBC_2.2.5' => 0, 'str2drpath@IBMAD_1.3' => 0, 'strerror@GLIBC_2.2.5' => 0, 'strncpy@GLIBC_2.2.5' => 0, 'strtol@GLIBC_2.2.5' => 0, 'umad_close_port@IBUMAD_1.0' => 0, 'umad_get_mad@IBUMAD_1.0' => 0, 'umad_init@IBUMAD_1.0' => 0, 'umad_open_port@IBUMAD_1.0' => 0, 'umad_recv@IBUMAD_1.0' => 0, 'umad_register@IBUMAD_1.0' => 0, 'umad_send@IBUMAD_1.0' => 0, 'umad_size@IBUMAD_1.0' => 0, 'umad_status@IBUMAD_1.0' => 0, 'unlink@GLIBC_2.2.5' => 0, 'write@GLIBC_2.2.5' => 0 } }, 'WordSize' => '8' }; rdma-core-56.1/ABI/ibumad.dump000066400000000000000000002736031477342711600160700ustar00rootroot00000000000000$VAR1 = { 'ABI_DUMPER_VERSION' => '1.2', 'ABI_DUMP_VERSION' => '3.5', 'Arch' => 'x86_64', 'GccVersion' => '12.3.0', 'Headers' => {}, 'Language' => 'C', 'LibraryName' => 'libibumad.so.3.4.56.0', 'LibraryVersion' => 'ibumad', 'MissedOffsets' => '1', 'MissedRegs' => '1', 'NameSpaces' => {}, 'Needed' => { 'libc.so.6' => 1 }, 'Sources' => {}, 'SymbolInfo' => { '10725' => { 'Header' => undef, 'Line' => '1358', 'Param' => { '0' => { 'name' => 'head', 'type' => '5221' } }, 'Return' => '1', 'ShortName' => 'umad_free_ca_device_list' }, '10828' => { 'Header' => undef, 'Line' => '1301', 'Return' => '5221', 'ShortName' => 'umad_get_ca_device_list' }, '11536' => { 'Header' => undef, 'Line' => '1254', 'Param' => { '0' => { 'name' => 'head', 'type' => '11791' }, '1' => { 'name' => 'size', 'type' => '198' } }, 'Return' => '112', 'ShortName' => 'umad_sort_ca_device_list' }, '11796' => { 'Header' => undef, 'Line' => '1245', 'Param' => { '0' => { 'name' => 'umad', 'type' => '138' } }, 'Return' => '1', 'ShortName' => 'umad_dump' }, '12050' => { 'Header' => undef, 'Line' => '1225', 'Param' => { '0' => { 'name' => 'addr', 'type' => '12399' } }, 'Return' => '1', 'ShortName' => 'umad_addr_dump' }, '12441' => { 'Header' => undef, 'Line' => '1218', 'Param' => { '0' => { 'name' => 'level', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_debug' }, '12490' => { 'Header' => undef, 'Line' => '1211', 'Param' => { '0' => { 'name' => 'umad', 'type' => '138' } }, 'Return' => '12399', 'ShortName' => 'umad_get_mad_addr' }, '12559' => { 'Header' => undef, 'Line' => '1204', 'Param' => { '0' => { 'name' => 'umad', 'type' => '138' } }, 'Return' => '112', 'ShortName' => 'umad_status' }, '12628' => { 'Header' => undef, 'Line' => '1198', 'Param' => { '0' => { 'name' => 'fd', 'type' => '112' }, '1' => { 'name' => 'agentid', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_unregister' }, '12900' => { 'Header' => undef, 'Line' => '1115', 'Param' => { '0' => { 'name' => 'port_fd', 'type' => '112' }, '1' => { 'name' => 'attr', 'type' => '14286' }, '2' => { 'name' => 'agent_id', 'type' => '8434' } }, 'Return' => '112', 'ShortName' => 'umad_register2' }, '14312' 
=> { 'Header' => undef, 'Line' => '1080', 'Param' => { '0' => { 'name' => 'fd', 'type' => '112' }, '1' => { 'name' => 'mgmt_class', 'type' => '112' }, '2' => { 'name' => 'mgmt_version', 'type' => '112' }, '3' => { 'name' => 'rmpp_version', 'type' => '174' }, '4' => { 'name' => 'method_mask', 'type' => '15174' } }, 'Return' => '112', 'ShortName' => 'umad_register' }, '15200' => { 'Header' => undef, 'Line' => '1041', 'Param' => { '0' => { 'name' => 'fd', 'type' => '112' }, '1' => { 'name' => 'mgmt_class', 'type' => '112' }, '2' => { 'name' => 'rmpp_version', 'type' => '174' }, '3' => { 'name' => 'oui', 'type' => '16147' }, '4' => { 'name' => 'method_mask', 'type' => '15174' } }, 'Return' => '112', 'ShortName' => 'umad_register_oui' }, '16173' => { 'Header' => undef, 'Line' => '1035', 'Param' => { '0' => { 'name' => 'fd', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_get_fd' }, '16383' => { 'Header' => undef, 'Line' => '1029', 'Param' => { '0' => { 'name' => 'fd', 'type' => '112' }, '1' => { 'name' => 'timeout_ms', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_poll' }, '16782' => { 'Header' => undef, 'Line' => '982', 'Param' => { '0' => { 'name' => 'fd', 'type' => '112' }, '1' => { 'name' => 'umad', 'type' => '138' }, '2' => { 'name' => 'length', 'type' => '773' }, '3' => { 'name' => 'timeout_ms', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_recv' }, '17800' => { 'Header' => undef, 'Line' => '937', 'Param' => { '0' => { 'name' => 'fd', 'type' => '112' }, '1' => { 'name' => 'agentid', 'type' => '112' }, '2' => { 'name' => 'umad', 'type' => '138' }, '3' => { 'name' => 'length', 'type' => '112' }, '4' => { 'name' => 'timeout_ms', 'type' => '112' }, '5' => { 'name' => 'retries', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_send' }, '18365' => { 'Header' => undef, 'Line' => '923', 'Param' => { '0' => { 'name' => 'umad', 'type' => '138' }, '1' => { 'name' => 'dlid', 'type' => '296' }, '2' => { 'name' => 'dqp', 'type' => '4238' }, '3' => { 'name' => 'sl', 'type' => '112' }, '4' => { 'name' => 'qkey', 'type' => '4238' } }, 'Return' => '112', 'ShortName' => 'umad_set_addr_net' }, '18798' => { 'Header' => undef, 'Line' => '909', 'Param' => { '0' => { 'name' => 'umad', 'type' => '138' }, '1' => { 'name' => 'dlid', 'type' => '112' }, '2' => { 'name' => 'dqp', 'type' => '112' }, '3' => { 'name' => 'sl', 'type' => '112' }, '4' => { 'name' => 'qkey', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_set_addr' }, '19199' => { 'Header' => undef, 'Line' => '899', 'Param' => { '0' => { 'name' => 'umad', 'type' => '138' } }, 'Return' => '112', 'ShortName' => 'umad_get_pkey' }, '19268' => { 'Header' => undef, 'Line' => '889', 'Param' => { '0' => { 'name' => 'umad', 'type' => '138' }, '1' => { 'name' => 'pkey_index', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_set_pkey' }, '19351' => { 'Header' => undef, 'Line' => '871', 'Param' => { '0' => { 'name' => 'umad', 'type' => '138' }, '1' => { 'name' => 'mad_addr', 'type' => '138' } }, 'Return' => '112', 'ShortName' => 'umad_set_grh' }, '19494' => { 'Header' => undef, 'Line' => '865', 'Return' => '198', 'ShortName' => 'umad_size' }, '19525' => { 'Header' => undef, 'Line' => '859', 'Param' => { '0' => { 'name' => 'umad', 'type' => '138' } }, 'Return' => '138', 'ShortName' => 'umad_get_mad' }, '19574' => { 'Header' => undef, 'Line' => '852', 'Param' => { '0' => { 'name' => 'fd', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_close_port' }, '19787' => { 'Header' => 
undef, 'Line' => '837', 'Param' => { '0' => { 'name' => 'port', 'type' => '5093' } }, 'Return' => '112', 'ShortName' => 'umad_release_port' }, '20176' => { 'Header' => undef, 'Line' => '814', 'Param' => { '0' => { 'name' => 'ca_name', 'type' => '210' }, '1' => { 'name' => 'portnum', 'type' => '112' }, '2' => { 'name' => 'port', 'type' => '5093' } }, 'Return' => '112', 'ShortName' => 'umad_get_port' }, '20731' => { 'Header' => undef, 'Line' => '799', 'Param' => { '0' => { 'name' => 'ca', 'type' => '21167' } }, 'Return' => '112', 'ShortName' => 'umad_release_ca' }, '21172' => { 'Header' => undef, 'Line' => '774', 'Param' => { '0' => { 'name' => 'ca_name', 'type' => '210' }, '1' => { 'name' => 'ca', 'type' => '21167' } }, 'Return' => '112', 'ShortName' => 'umad_get_ca' }, '21652' => { 'Header' => undef, 'Line' => '769', 'Param' => { '0' => { 'name' => 'ca_name', 'type' => '210' }, '1' => { 'name' => 'portnum', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_open_smi_port' }, '21760' => { 'Header' => undef, 'Line' => '764', 'Param' => { '0' => { 'name' => 'ca_name', 'type' => '210' }, '1' => { 'name' => 'portnum', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_open_port' }, '23156' => { 'Header' => undef, 'Line' => '687', 'Param' => { '0' => { 'name' => 'ca_name', 'type' => '210' }, '1' => { 'name' => 'portnum', 'type' => '112' }, '2' => { 'name' => 'path', 'type' => '152' }, '3' => { 'name' => 'max', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_get_issm_path' }, '23735' => { 'Header' => undef, 'Line' => '648', 'Param' => { '0' => { 'name' => 'ca_name', 'type' => '210' }, '1' => { 'name' => 'portguids', 'type' => '1946' }, '2' => { 'name' => 'max', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_get_ca_portguids' }, '24353' => { 'Header' => undef, 'Line' => '618', 'Param' => { '0' => { 'name' => 'cas', 'type' => '10426' }, '1' => { 'name' => 'max', 'type' => '112' } }, 'Return' => '112', 'ShortName' => 'umad_get_cas_names' }, '25404' => { 'Header' => undef, 'Line' => '598', 'Return' => '112', 'ShortName' => 'umad_done' }, '25568' => { 'Header' => undef, 'Line' => '592', 'Return' => '112', 'ShortName' => 'umad_init' }, '33795' => { 'Header' => undef, 'Line' => '339', 'Param' => { '0' => { 'name' => 'mgmt_class', 'type' => '174' }, '1' => { 'name' => 'attr_id', 'type' => '296' } }, 'Return' => '210', 'ShortName' => 'umad_attribute_str' }, '34275' => { 'Header' => undef, 'Line' => '165', 'Param' => { '0' => { 'name' => '_status', 'type' => '296' } }, 'Return' => '210', 'ShortName' => 'umad_sa_mad_status_str' }, '34331' => { 'Header' => undef, 'Line' => '142', 'Param' => { '0' => { 'name' => '_status', 'type' => '296' } }, 'Return' => '210', 'ShortName' => 'umad_common_mad_status_str' }, '34393' => { 'Header' => undef, 'Line' => '134', 'Param' => { '0' => { 'name' => 'mgmt_class', 'type' => '174' }, '1' => { 'name' => 'method', 'type' => '174' } }, 'Return' => '210', 'ShortName' => 'umad_method_str' }, '34620' => { 'Header' => undef, 'Line' => '45', 'Param' => { '0' => { 'name' => 'mgmt_class', 'type' => '174' } }, 'Return' => '210', 'ShortName' => 'umad_class_str' }, '7003' => { 'Header' => undef, 'Line' => '1598', 'Param' => { '0' => { 'name' => 'name', 'type' => '210' }, '1' => { 'name' => 'portnum', 'type' => '174' }, '2' => { 'name' => 'ca_pair', 'type' => '5839' }, '3' => { 'name' => 'enforce_smi', 'type' => '60' } }, 'Return' => '112', 'ShortName' => 'umad_get_smi_gsi_pair_by_ca_name' }, '8568' => { 'Header' => undef, 'Line' => '1476', 
'Param' => { '0' => { 'name' => 'cas', 'type' => '5839' }, '1' => { 'name' => 'max', 'type' => '198' } }, 'Return' => '112', 'ShortName' => 'umad_get_smi_gsi_pairs' } }, 'SymbolVersion' => { 'umad_addr_dump' => 'umad_addr_dump@@IBUMAD_1.0', 'umad_attribute_str' => 'umad_attribute_str@@IBUMAD_1.0', 'umad_class_str' => 'umad_class_str@@IBUMAD_1.0', 'umad_close_port' => 'umad_close_port@@IBUMAD_1.0', 'umad_common_mad_status_str' => 'umad_common_mad_status_str@@IBUMAD_1.0', 'umad_debug' => 'umad_debug@@IBUMAD_1.0', 'umad_done' => 'umad_done@@IBUMAD_1.0', 'umad_dump' => 'umad_dump@@IBUMAD_1.0', 'umad_free_ca_device_list' => 'umad_free_ca_device_list@@IBUMAD_1.1', 'umad_get_ca' => 'umad_get_ca@@IBUMAD_1.0', 'umad_get_ca_device_list' => 'umad_get_ca_device_list@@IBUMAD_1.1', 'umad_get_ca_portguids' => 'umad_get_ca_portguids@@IBUMAD_1.0', 'umad_get_cas_names' => 'umad_get_cas_names@@IBUMAD_1.0', 'umad_get_fd' => 'umad_get_fd@@IBUMAD_1.0', 'umad_get_issm_path' => 'umad_get_issm_path@@IBUMAD_1.0', 'umad_get_mad' => 'umad_get_mad@@IBUMAD_1.0', 'umad_get_mad_addr' => 'umad_get_mad_addr@@IBUMAD_1.0', 'umad_get_pkey' => 'umad_get_pkey@@IBUMAD_1.0', 'umad_get_port' => 'umad_get_port@@IBUMAD_1.0', 'umad_get_smi_gsi_pair_by_ca_name' => 'umad_get_smi_gsi_pair_by_ca_name@@IBUMAD_1.4', 'umad_get_smi_gsi_pairs' => 'umad_get_smi_gsi_pairs@@IBUMAD_1.4', 'umad_init' => 'umad_init@@IBUMAD_1.0', 'umad_method_str' => 'umad_method_str@@IBUMAD_1.0', 'umad_open_port' => 'umad_open_port@@IBUMAD_1.0', 'umad_open_smi_port' => 'umad_open_smi_port@@IBUMAD_1.3', 'umad_poll' => 'umad_poll@@IBUMAD_1.0', 'umad_recv' => 'umad_recv@@IBUMAD_1.0', 'umad_register' => 'umad_register@@IBUMAD_1.0', 'umad_register2' => 'umad_register2@@IBUMAD_1.0', 'umad_register_oui' => 'umad_register_oui@@IBUMAD_1.0', 'umad_release_ca' => 'umad_release_ca@@IBUMAD_1.0', 'umad_release_port' => 'umad_release_port@@IBUMAD_1.0', 'umad_sa_mad_status_str' => 'umad_sa_mad_status_str@@IBUMAD_1.0', 'umad_send' => 'umad_send@@IBUMAD_1.0', 'umad_set_addr' => 'umad_set_addr@@IBUMAD_1.0', 'umad_set_addr_net' => 'umad_set_addr_net@@IBUMAD_1.0', 'umad_set_grh' => 'umad_set_grh@@IBUMAD_1.0', 'umad_set_pkey' => 'umad_set_pkey@@IBUMAD_1.0', 'umad_size' => 'umad_size@@IBUMAD_1.0', 'umad_sort_ca_device_list' => 'umad_sort_ca_device_list@@IBUMAD_1.2', 'umad_status' => 'umad_status@@IBUMAD_1.0', 'umad_unregister' => 'umad_unregister@@IBUMAD_1.0' }, 'Symbols' => { 'libibumad.so.3.4.56.0' => { 'umad_addr_dump@@IBUMAD_1.0' => 1, 'umad_attribute_str@@IBUMAD_1.0' => 1, 'umad_class_str@@IBUMAD_1.0' => 1, 'umad_close_port@@IBUMAD_1.0' => 1, 'umad_common_mad_status_str@@IBUMAD_1.0' => 1, 'umad_debug@@IBUMAD_1.0' => 1, 'umad_done@@IBUMAD_1.0' => 1, 'umad_dump@@IBUMAD_1.0' => 1, 'umad_free_ca_device_list@@IBUMAD_1.1' => 1, 'umad_get_ca@@IBUMAD_1.0' => 1, 'umad_get_ca_device_list@@IBUMAD_1.1' => 1, 'umad_get_ca_portguids@@IBUMAD_1.0' => 1, 'umad_get_cas_names@@IBUMAD_1.0' => 1, 'umad_get_fd@@IBUMAD_1.0' => 1, 'umad_get_issm_path@@IBUMAD_1.0' => 1, 'umad_get_mad@@IBUMAD_1.0' => 1, 'umad_get_mad_addr@@IBUMAD_1.0' => 1, 'umad_get_pkey@@IBUMAD_1.0' => 1, 'umad_get_port@@IBUMAD_1.0' => 1, 'umad_get_smi_gsi_pair_by_ca_name@@IBUMAD_1.4' => 1, 'umad_get_smi_gsi_pairs@@IBUMAD_1.4' => 1, 'umad_init@@IBUMAD_1.0' => 1, 'umad_method_str@@IBUMAD_1.0' => 1, 'umad_open_port@@IBUMAD_1.0' => 1, 'umad_open_smi_port@@IBUMAD_1.3' => 1, 'umad_poll@@IBUMAD_1.0' => 1, 'umad_recv@@IBUMAD_1.0' => 1, 'umad_register2@@IBUMAD_1.0' => 1, 'umad_register@@IBUMAD_1.0' => 1, 'umad_register_oui@@IBUMAD_1.0' => 1, 
'umad_release_ca@@IBUMAD_1.0' => 1, 'umad_release_port@@IBUMAD_1.0' => 1, 'umad_sa_mad_status_str@@IBUMAD_1.0' => 1, 'umad_send@@IBUMAD_1.0' => 1, 'umad_set_addr@@IBUMAD_1.0' => 1, 'umad_set_addr_net@@IBUMAD_1.0' => 1, 'umad_set_grh@@IBUMAD_1.0' => 1, 'umad_set_pkey@@IBUMAD_1.0' => 1, 'umad_size@@IBUMAD_1.0' => 1, 'umad_sort_ca_device_list@@IBUMAD_1.2' => 1, 'umad_status@@IBUMAD_1.0' => 1, 'umad_unregister@@IBUMAD_1.0' => 1 } }, 'Target' => 'unix', 'TypeInfo' => { '1' => { 'Name' => 'void', 'Type' => 'Intrinsic' }, '100' => { 'BaseType' => '53', 'Header' => undef, 'Line' => '40', 'Name' => '__uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '10426' => { 'BaseType' => '3782', 'Name' => 'char[20]*', 'Size' => '8', 'Type' => 'Pointer' }, '112' => { 'Name' => 'int', 'Size' => '4', 'Type' => 'Intrinsic' }, '11791' => { 'BaseType' => '5221', 'Name' => 'struct umad_device_node**', 'Size' => '8', 'Type' => 'Pointer' }, '119' => { 'Name' => 'long', 'Size' => '8', 'Type' => 'Intrinsic' }, '12399' => { 'BaseType' => '4587', 'Name' => 'ib_mad_addr_t*', 'Size' => '8', 'Type' => 'Pointer' }, '126' => { 'BaseType' => '67', 'Header' => undef, 'Line' => '45', 'Name' => '__uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '138' => { 'BaseType' => '1', 'Name' => 'void*', 'Size' => '8', 'Type' => 'Pointer' }, '14286' => { 'BaseType' => '5226', 'Name' => 'struct umad_reg_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '15174' => { 'BaseType' => '119', 'Name' => 'long*', 'Size' => '8', 'Type' => 'Pointer' }, '152' => { 'BaseType' => '162', 'Name' => 'char*', 'Size' => '8', 'Type' => 'Pointer' }, '16147' => { 'BaseType' => '174', 'Name' => 'uint8_t*', 'Size' => '8', 'Type' => 'Pointer' }, '162' => { 'Name' => 'char', 'Size' => '1', 'Type' => 'Intrinsic' }, '169' => { 'BaseType' => '162', 'Name' => 'char const', 'Size' => '1', 'Type' => 'Const' }, '174' => { 'BaseType' => '81', 'Header' => undef, 'Line' => '24', 'Name' => 'uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '186' => { 'BaseType' => '126', 'Header' => undef, 'Line' => '27', 'Name' => 'uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '1946' => { 'BaseType' => '308', 'Name' => '__be64*', 'Size' => '8', 'Type' => 'Pointer' }, '198' => { 'BaseType' => '67', 'Header' => undef, 'Line' => '214', 'Name' => 'size_t', 'Size' => '8', 'Type' => 'Typedef' }, '210' => { 'BaseType' => '169', 'Name' => 'char const*', 'Size' => '8', 'Type' => 'Pointer' }, '21167' => { 'BaseType' => '5098', 'Name' => 'umad_ca_t*', 'Size' => '8', 'Type' => 'Pointer' }, '239' => { 'Name' => 'unsigned long long', 'Size' => '8', 'Type' => 'Intrinsic' }, '272' => { 'BaseType' => '53', 'Header' => undef, 'Line' => '24', 'Name' => '__u16', 'Size' => '2', 'Type' => 'Typedef' }, '284' => { 'BaseType' => '239', 'Header' => undef, 'Line' => '31', 'Name' => '__u64', 'Size' => '8', 'Type' => 'Typedef' }, '296' => { 'BaseType' => '272', 'Header' => undef, 'Line' => '25', 'Name' => '__be16', 'Size' => '2', 'Type' => 'Typedef' }, '308' => { 'BaseType' => '284', 'Header' => undef, 'Line' => '29', 'Name' => '__be64', 'Size' => '8', 'Type' => 'Typedef' }, '3146' => { 'BaseType' => '60', 'Header' => undef, 'Line' => '42', 'Name' => '__uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '320' => { 'BaseType' => '174', 'Name' => 'uint8_t[16]', 'Size' => '16', 'Type' => 'Array' }, '336' => { 'Header' => undef, 'Line' => '59', 'Memb' => { '0' => { 'name' => 'subnet_prefix', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'interface_id', 'offset' => '8', 'type' => '308' } }, 'Size' => '16', 'Type' => 'Struct' 
}, '368' => { 'Header' => undef, 'Line' => '56', 'Memb' => { '0' => { 'name' => 'raw', 'offset' => '0', 'type' => '320' }, '1' => { 'name' => 'raw_be16', 'offset' => '0', 'type' => '417' }, '2' => { 'name' => 'global', 'offset' => '0', 'type' => '336' } }, 'Name' => 'union umad_gid', 'Size' => '16', 'Type' => 'Union' }, '3782' => { 'BaseType' => '162', 'Name' => 'char[20]', 'Size' => '20', 'Type' => 'Array' }, '3827' => { 'BaseType' => '162', 'Name' => 'char[40]', 'Size' => '40', 'Type' => 'Array' }, '4073' => { 'BaseType' => '100', 'Header' => undef, 'Line' => '25', 'Name' => 'uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '4085' => { 'BaseType' => '3146', 'Header' => undef, 'Line' => '26', 'Name' => 'uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '417' => { 'BaseType' => '296', 'Name' => '__be16[8]', 'Size' => '16', 'Type' => 'Array' }, '4202' => { 'BaseType' => '60', 'Header' => undef, 'Line' => '27', 'Name' => '__u32', 'Size' => '4', 'Type' => 'Typedef' }, '4238' => { 'BaseType' => '4202', 'Header' => undef, 'Line' => '27', 'Name' => '__be32', 'Size' => '4', 'Type' => 'Typedef' }, '4362' => { 'Header' => undef, 'Line' => '77', 'Memb' => { '0' => { 'name' => 'gid', 'offset' => '0', 'type' => '320' }, '1' => { 'name' => 'ib_gid', 'offset' => '0', 'type' => '368' } }, 'Size' => '16', 'Type' => 'Union' }, '4397' => { 'Header' => undef, 'Line' => '67', 'Memb' => { '0' => { 'name' => 'qpn', 'offset' => '0', 'type' => '4238' }, '1' => { 'name' => 'qkey', 'offset' => '4', 'type' => '4238' }, '10' => { 'name' => 'flow_label', 'offset' => '50', 'type' => '4238' }, '11' => { 'name' => 'pkey_index', 'offset' => '54', 'type' => '4073' }, '12' => { 'name' => 'reserved', 'offset' => '56', 'type' => '4571' }, '2' => { 'name' => 'lid', 'offset' => '8', 'type' => '296' }, '3' => { 'name' => 'sl', 'offset' => '16', 'type' => '174' }, '4' => { 'name' => 'path_bits', 'offset' => '17', 'type' => '174' }, '5' => { 'name' => 'grh_present', 'offset' => '18', 'type' => '174' }, '6' => { 'name' => 'gid_index', 'offset' => '19', 'type' => '174' }, '7' => { 'name' => 'hop_limit', 'offset' => '20', 'type' => '174' }, '8' => { 'name' => 'traffic_class', 'offset' => '21', 'type' => '174' }, '9' => { 'name' => 'unnamed0', 'offset' => '22', 'type' => '4362' } }, 'Name' => 'struct ib_mad_addr', 'Size' => '44', 'Type' => 'Struct' }, '4571' => { 'BaseType' => '174', 'Name' => 'uint8_t[6]', 'Size' => '6', 'Type' => 'Array' }, '4587' => { 'BaseType' => '4397', 'Header' => undef, 'Line' => '84', 'Name' => 'ib_mad_addr_t', 'Size' => '44', 'Type' => 'Typedef' }, '46' => { 'Name' => 'unsigned char', 'Size' => '1', 'Type' => 'Intrinsic' }, '4720' => { 'Header' => undef, 'Line' => '142', 'Memb' => { '0' => { 'name' => 'ca_name', 'offset' => '0', 'type' => '3782' }, '1' => { 'name' => 'portnum', 'offset' => '32', 'type' => '112' }, '10' => { 'name' => 'gid_prefix', 'offset' => '86', 'type' => '308' }, '11' => { 'name' => 'port_guid', 'offset' => '100', 'type' => '308' }, '12' => { 'name' => 'pkeys_size', 'offset' => '114', 'type' => '60' }, '13' => { 'name' => 'pkeys', 'offset' => '128', 'type' => '4929' }, '14' => { 'name' => 'link_layer', 'offset' => '136', 'type' => '3782' }, '2' => { 'name' => 'base_lid', 'offset' => '36', 'type' => '60' }, '3' => { 'name' => 'lmc', 'offset' => '40', 'type' => '60' }, '4' => { 'name' => 'sm_lid', 'offset' => '50', 'type' => '60' }, '5' => { 'name' => 'sm_sl', 'offset' => '54', 'type' => '60' }, '6' => { 'name' => 'state', 'offset' => '64', 'type' => '60' }, '7' => { 'name' => 'phys_state', 
'offset' => '68', 'type' => '60' }, '8' => { 'name' => 'rate', 'offset' => '72', 'type' => '60' }, '9' => { 'name' => 'capmask', 'offset' => '82', 'type' => '4238' } }, 'Name' => 'struct umad_port', 'Size' => '112', 'Type' => 'Struct' }, '4929' => { 'BaseType' => '4073', 'Name' => 'uint16_t*', 'Size' => '8', 'Type' => 'Pointer' }, '4934' => { 'BaseType' => '4720', 'Header' => undef, 'Line' => '158', 'Name' => 'umad_port_t', 'Size' => '112', 'Type' => 'Typedef' }, '4946' => { 'Header' => undef, 'Line' => '160', 'Memb' => { '0' => { 'name' => 'ca_name', 'offset' => '0', 'type' => '3782' }, '1' => { 'name' => 'node_type', 'offset' => '32', 'type' => '60' }, '2' => { 'name' => 'numports', 'offset' => '36', 'type' => '112' }, '3' => { 'name' => 'fw_ver', 'offset' => '40', 'type' => '3782' }, '4' => { 'name' => 'ca_type', 'offset' => '72', 'type' => '3827' }, '5' => { 'name' => 'hw_ver', 'offset' => '136', 'type' => '3782' }, '6' => { 'name' => 'node_guid', 'offset' => '274', 'type' => '308' }, '7' => { 'name' => 'system_guid', 'offset' => '288', 'type' => '308' }, '8' => { 'name' => 'ports', 'offset' => '296', 'type' => '5077' } }, 'Name' => 'struct umad_ca', 'Size' => '208', 'Type' => 'Struct' }, '5077' => { 'BaseType' => '5093', 'Name' => 'umad_port_t*[10]', 'Size' => '80', 'Type' => 'Array' }, '5093' => { 'BaseType' => '4934', 'Name' => 'umad_port_t*', 'Size' => '8', 'Type' => 'Pointer' }, '5098' => { 'BaseType' => '4946', 'Header' => undef, 'Line' => '170', 'Name' => 'umad_ca_t', 'Size' => '208', 'Type' => 'Typedef' }, '5115' => { 'Header' => undef, 'Line' => '172', 'Memb' => { '0' => { 'name' => 'smi_name', 'offset' => '0', 'type' => '3782' }, '1' => { 'name' => 'smi_preferred_port', 'offset' => '32', 'type' => '4085' }, '2' => { 'name' => 'gsi_name', 'offset' => '36', 'type' => '3782' }, '3' => { 'name' => 'gsi_preferred_port', 'offset' => '68', 'type' => '4085' } }, 'Name' => 'struct umad_ca_pair', 'Size' => '48', 'Type' => 'Struct' }, '5181' => { 'Header' => undef, 'Line' => '180', 'Memb' => { '0' => { 'name' => 'next', 'offset' => '0', 'type' => '5221' }, '1' => { 'name' => 'ca_name', 'offset' => '8', 'type' => '210' } }, 'Name' => 'struct umad_device_node', 'Size' => '16', 'Type' => 'Struct' }, '5221' => { 'BaseType' => '5181', 'Name' => 'struct umad_device_node*', 'Size' => '8', 'Type' => 'Pointer' }, '5226' => { 'Header' => undef, 'Line' => '233', 'Memb' => { '0' => { 'name' => 'mgmt_class', 'offset' => '0', 'type' => '174' }, '1' => { 'name' => 'mgmt_class_version', 'offset' => '1', 'type' => '174' }, '2' => { 'name' => 'flags', 'offset' => '4', 'type' => '4085' }, '3' => { 'name' => 'method_mask', 'offset' => '8', 'type' => '5318' }, '4' => { 'name' => 'oui', 'offset' => '36', 'type' => '4085' }, '5' => { 'name' => 'rmpp_version', 'offset' => '40', 'type' => '174' } }, 'Name' => 'struct umad_reg_attr', 'Size' => '32', 'Type' => 'Struct' }, '53' => { 'Name' => 'unsigned short', 'Size' => '2', 'Type' => 'Intrinsic' }, '5318' => { 'BaseType' => '186', 'Name' => 'uint64_t[2]', 'Size' => '16', 'Type' => 'Array' }, '5839' => { 'BaseType' => '5115', 'Name' => 'struct umad_ca_pair*', 'Size' => '8', 'Type' => 'Pointer' }, '60' => { 'Name' => 'unsigned int', 'Size' => '4', 'Type' => 'Intrinsic' }, '67' => { 'Name' => 'unsigned long', 'Size' => '8', 'Type' => 'Intrinsic' }, '773' => { 'BaseType' => '112', 'Name' => 'int*', 'Size' => '8', 'Type' => 'Pointer' }, '81' => { 'BaseType' => '46', 'Header' => undef, 'Line' => '38', 'Name' => '__uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '8434' 
=> { 'BaseType' => '4085', 'Name' => 'uint32_t*', 'Size' => '8', 'Type' => 'Pointer' } }, 'UndefinedSymbols' => { 'libibumad.so.3.4.56.0' => { '_ITM_deregisterTMCloneTable' => 0, '_ITM_registerTMCloneTable' => 0, '__ctype_b_loc@GLIBC_2.3' => 0, '__cxa_finalize@GLIBC_2.2.5' => 0, '__errno_location@GLIBC_2.2.5' => 0, '__fprintf_chk@GLIBC_2.3.4' => 0, '__gmon_start__' => 0, '__snprintf_chk@GLIBC_2.3.4' => 0, '__stack_chk_fail@GLIBC_2.4' => 0, 'alphasort@GLIBC_2.2.5' => 0, 'calloc@GLIBC_2.2.5' => 0, 'close@GLIBC_2.2.5' => 0, 'closedir@GLIBC_2.2.5' => 0, 'free@GLIBC_2.2.5' => 0, 'getpid@GLIBC_2.2.5' => 0, 'ioctl@GLIBC_2.2.5' => 0, 'memset@GLIBC_2.2.5' => 0, 'open@GLIBC_2.2.5' => 0, 'opendir@GLIBC_2.2.5' => 0, 'poll@GLIBC_2.2.5' => 0, 'qsort@GLIBC_2.2.5' => 0, 'read@GLIBC_2.2.5' => 0, 'readdir@GLIBC_2.2.5' => 0, 'scandir@GLIBC_2.2.5' => 0, 'snprintf@GLIBC_2.2.5' => 0, 'stderr@GLIBC_2.2.5' => 0, 'strcmp@GLIBC_2.2.5' => 0, 'strcpy@GLIBC_2.2.5' => 0, 'strdup@GLIBC_2.2.5' => 0, 'strerror@GLIBC_2.2.5' => 0, 'strlen@GLIBC_2.2.5' => 0, 'strncmp@GLIBC_2.2.5' => 0, 'strncpy@GLIBC_2.2.5' => 0, 'strrchr@GLIBC_2.2.5' => 0, 'strsep@GLIBC_2.2.5' => 0, 'strtol@GLIBC_2.2.5' => 0, 'strtoul@GLIBC_2.2.5' => 0, 'strtoull@GLIBC_2.2.5' => 0, 'write@GLIBC_2.2.5' => 0 } }, 'WordSize' => '8' }; rdma-core-56.1/ABI/ibverbs.dump000066400000000000000000046055651477342711600162750ustar00rootroot00000000000000$VAR1 = { 'ABI_DUMPER_VERSION' => '1.2', 'ABI_DUMP_VERSION' => '3.5', 'Arch' => 'x86_64', 'GccVersion' => '12.3.0', 'Headers' => {}, 'Language' => 'C', 'LibraryName' => 'libibverbs.so.1.14.56.0', 'LibraryVersion' => 'ibverbs', 'MissedOffsets' => '1', 'MissedRegs' => '1', 'NameSpaces' => {}, 'Needed' => { 'libc.so.6' => 1, 'libnl-3.so.200' => 1, 'libnl-route-3.so.200' => 1 }, 'Sources' => {}, 'SymbolInfo' => { '101890' => { 'Header' => undef, 'Line' => '68', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'port_num', 'type' => '929' }, '2' => { 'name' => 'port_attr', 'type' => '63789' }, '3' => { 'name' => 'cmd', 'type' => '102963' }, '4' => { 'name' => 'cmd_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_query_port' }, '110351' => { 'Header' => undef, 'Line' => '78', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'dm', 'type' => '112333' }, '2' => { 'name' => 'offset', 'type' => '965' }, '3' => { 'name' => 'length', 'type' => '53' }, '4' => { 'name' => 'access', 'type' => '70' }, '5' => { 'name' => 'vmr', 'type' => '23204' }, '6' => { 'name' => 'link', 'type' => '40016' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_reg_dm_mr' }, '112357' => { 'Header' => undef, 'Line' => '63', 'Param' => { '0' => { 'name' => 'dm', 'type' => '112333' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_free_dm' }, '112809' => { 'Header' => undef, 'Line' => '35', 'Param' => { '0' => { 'name' => 'ctx', 'type' => '8991' }, '1' => { 'name' => 'dm_attr', 'type' => '113808' }, '2' => { 'name' => 'dm', 'type' => '112333' }, '3' => { 'name' => 'link', 'type' => '40016' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_alloc_dm' }, '142670' => { 'Header' => undef, 'Line' => '35', 'Param' => { '0' => { 'name' => 'flow_id', 'type' => '18335' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_destroy_flow' }, '149845' => { 'Header' => undef, 'Line' => '120', 'Param' => { '0' => { 'name' => 'action', 'type' => '150282' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_destroy_flow_action' }, '150304' => { 'Header' => undef, 'Line' => '101', 'Param' => { '0' => { 'name' => 
'flow_action', 'type' => '150282' }, '1' => { 'name' => 'attr', 'type' => '64104' }, '2' => { 'name' => 'driver', 'type' => '40016' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_modify_flow_action_esp' }, '150827' => { 'Header' => undef, 'Line' => '72', 'Param' => { '0' => { 'name' => 'ctx', 'type' => '8991' }, '1' => { 'name' => 'attr', 'type' => '64104' }, '2' => { 'name' => 'flow_action', 'type' => '150282' }, '3' => { 'name' => 'driver', 'type' => '40016' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_create_flow_action_esp' }, '178627' => { 'Header' => undef, 'Line' => '120', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'offset', 'type' => '965' }, '2' => { 'name' => 'length', 'type' => '53' }, '3' => { 'name' => 'iova', 'type' => '965' }, '4' => { 'name' => 'fd', 'type' => '161' }, '5' => { 'name' => 'access', 'type' => '161' }, '6' => { 'name' => 'vmr', 'type' => '23204' }, '7' => { 'name' => 'driver', 'type' => '40016' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_reg_dmabuf_mr' }, '180839' => { 'Header' => undef, 'Line' => '90', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'vmr', 'type' => '23204' }, '2' => { 'name' => 'mr_handle', 'type' => '953' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_query_mr' }, '181937' => { 'Header' => undef, 'Line' => '58', 'Param' => { '0' => { 'name' => 'vmr', 'type' => '23204' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_dereg_mr' }, '182465' => { 'Header' => undef, 'Line' => '39', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'advice', 'type' => '52608' }, '2' => { 'name' => 'flags', 'type' => '953' }, '3' => { 'name' => 'sg_list', 'type' => '14062' }, '4' => { 'name' => 'num_sge', 'type' => '953' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_advise_mr' }, '190819' => { 'Header' => undef, 'Line' => '35', 'Param' => { '0' => { 'name' => 'mw', 'type' => '13825' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_dealloc_mw' }, '197998' => { 'Header' => undef, 'Line' => '35', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_dealloc_pd' }, '219982' => { 'Header' => undef, 'Line' => '449', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_destroy_qp' }, '220822' => { 'Header' => undef, 'Line' => '422', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'qp', 'type' => '30110' }, '2' => { 'name' => 'attr_ex', 'type' => '64760' }, '3' => { 'name' => 'cmd', 'type' => '221262' }, '4' => { 'name' => 'cmd_size', 'type' => '53' }, '5' => { 'name' => 'resp', 'type' => '221267' }, '6' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_create_qp_ex2' }, '221289' => { 'Header' => undef, 'Line' => '401', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'qp', 'type' => '30110' }, '2' => { 'name' => 'attr_ex', 'type' => '64760' }, '3' => { 'name' => 'cmd', 'type' => '221789' }, '4' => { 'name' => 'cmd_size', 'type' => '53' }, '5' => { 'name' => 'resp', 'type' => '30120' }, '6' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_create_qp_ex' }, '221816' => { 'Header' => undef, 'Line' => '373', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'qp', 'type' => '9935' }, '2' => { 'name' => 'attr', 'type' => '23199' }, '3' => { 'name' => 'cmd', 'type' => '221789' }, '4' => { 'name' => 
'cmd_size', 'type' => '53' }, '5' => { 'name' => 'resp', 'type' => '30120' }, '6' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_create_qp' }, '23272' => { 'Data' => 1, 'Header' => undef, 'Line' => '328', 'Return' => '18370', 'ShortName' => 'verbs_allow_disassociate_destroy' }, '234652' => { 'Header' => undef, 'Line' => '35', 'Param' => { '0' => { 'name' => 'rwq_ind_table', 'type' => '12422' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_destroy_rwq_ind_table' }, '23611' => { 'Header' => undef, 'Line' => '1205', 'Param' => { '0' => { 'name' => 'cq', 'type' => '9734' }, '1' => { 'name' => 'attr', 'type' => '18340' }, '2' => { 'name' => 'cmd', 'type' => '23769' }, '3' => { 'name' => 'cmd_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_modify_cq' }, '23774' => { 'Header' => undef, 'Line' => '1160', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'init_attr', 'type' => '18345' }, '2' => { 'name' => 'rwq_ind_table', 'type' => '12422' }, '3' => { 'name' => 'resp', 'type' => '24173' }, '4' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_create_rwq_ind_table' }, '24183' => { 'Header' => undef, 'Line' => '1128', 'Param' => { '0' => { 'name' => 'wq', 'type' => '10252' }, '1' => { 'name' => 'attr', 'type' => '18350' }, '2' => { 'name' => 'cmd', 'type' => '24420' }, '3' => { 'name' => 'cmd_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_modify_wq' }, '24425' => { 'Header' => undef, 'Line' => '1069', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' }, '1' => { 'name' => 'flow_id', 'type' => '18335' }, '2' => { 'name' => 'flow_attr', 'type' => '18355' }, '3' => { 'name' => 'ucmd', 'type' => '82' }, '4' => { 'name' => 'ucmd_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_create_flow' }, '254044' => { 'Header' => undef, 'Line' => '245', 'Param' => { '0' => { 'name' => 'srq', 'type' => '10052' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_destroy_srq' }, '254873' => { 'Header' => undef, 'Line' => '222', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'srq', 'type' => '255302' }, '2' => { 'name' => 'attr_ex', 'type' => '64820' }, '3' => { 'name' => 'cmd', 'type' => '255307' }, '4' => { 'name' => 'cmd_size', 'type' => '53' }, '5' => { 'name' => 'resp', 'type' => '255312' }, '6' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_create_srq_ex' }, '255334' => { 'Header' => undef, 'Line' => '200', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'srq', 'type' => '10052' }, '2' => { 'name' => 'attr', 'type' => '67152' }, '3' => { 'name' => 'cmd', 'type' => '255842' }, '4' => { 'name' => 'cmd_size', 'type' => '53' }, '5' => { 'name' => 'resp', 'type' => '255312' }, '6' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_create_srq' }, '26469' => { 'Header' => undef, 'Line' => '858', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' }, '1' => { 'name' => 'gid', 'type' => '23189' }, '2' => { 'name' => 'lid', 'type' => '941' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_detach_mcast' }, '26766' => { 'Header' => undef, 'Line' => '845', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' }, '1' => { 'name' => 'gid', 'type' => '23189' }, '2' => { 'name' => 'lid', 'type' => '941' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_attach_mcast' }, '26991' => { 'Header' => undef, 
'Line' => '809', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'ah', 'type' => '13672' }, '2' => { 'name' => 'attr', 'type' => '23194' }, '3' => { 'name' => 'resp', 'type' => '27330' }, '4' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_create_ah' }, '27335' => { 'Header' => undef, 'Line' => '750', 'Param' => { '0' => { 'name' => 'srq', 'type' => '10052' }, '1' => { 'name' => 'wr', 'type' => '14138' }, '2' => { 'name' => 'bad_wr', 'type' => '14225' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_post_srq_recv' }, '27809' => { 'Header' => undef, 'Line' => '691', 'Param' => { '0' => { 'name' => 'ibqp', 'type' => '9935' }, '1' => { 'name' => 'wr', 'type' => '14138' }, '2' => { 'name' => 'bad_wr', 'type' => '14225' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_post_recv' }, '278780' => { 'Header' => undef, 'Line' => '141', 'Param' => { '0' => { 'name' => 'wq', 'type' => '10252' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_destroy_wq' }, '279625' => { 'Header' => undef, 'Line' => '121', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'wq_init_attr', 'type' => '64454' }, '2' => { 'name' => 'wq', 'type' => '10252' }, '3' => { 'name' => 'cmd', 'type' => '282986' }, '4' => { 'name' => 'cmd_size', 'type' => '53' }, '5' => { 'name' => 'resp', 'type' => '282991' }, '6' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_create_wq' }, '28277' => { 'Header' => undef, 'Line' => '603', 'Param' => { '0' => { 'name' => 'ibqp', 'type' => '9935' }, '1' => { 'name' => 'wr', 'type' => '14057' }, '2' => { 'name' => 'bad_wr', 'type' => '18295' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_post_send' }, '28752' => { 'Header' => undef, 'Line' => '583', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' }, '1' => { 'name' => 'attr', 'type' => '23209' }, '2' => { 'name' => 'attr_mask', 'type' => '161' }, '3' => { 'name' => 'cmd', 'type' => '28996' }, '4' => { 'name' => 'cmd_size', 'type' => '53' }, '5' => { 'name' => 'resp', 'type' => '29001' }, '6' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_modify_qp_ex' }, '29006' => { 'Header' => undef, 'Line' => '566', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' }, '1' => { 'name' => 'attr', 'type' => '23209' }, '2' => { 'name' => 'attr_mask', 'type' => '161' }, '3' => { 'name' => 'cmd', 'type' => '29220' }, '4' => { 'name' => 'cmd_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_modify_qp' }, '290583' => { 'Header' => undef, 'Line' => '35', 'Param' => { '0' => { 'name' => 'xrcd', 'type' => '22845' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_close_xrcd' }, '29294' => { 'Header' => undef, 'Line' => '393', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' }, '1' => { 'name' => 'attr', 'type' => '23209' }, '2' => { 'name' => 'attr_mask', 'type' => '161' }, '3' => { 'name' => 'init_attr', 'type' => '23199' }, '4' => { 'name' => 'cmd', 'type' => '29717' }, '5' => { 'name' => 'cmd_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_query_qp' }, '29722' => { 'Header' => undef, 'Line' => '343', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'qp', 'type' => '30110' }, '2' => { 'name' => 'vqp_sz', 'type' => '161' }, '3' => { 'name' => 'attr', 'type' => '18360' }, '4' => { 'name' => 'cmd', 'type' => '30115' }, '5' => { 'name' => 'cmd_size', 'type' => '53' }, '6' => { 'name' => 'resp', 'type' => 
'30120' }, '7' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_open_qp' }, '30125' => { 'Header' => undef, 'Line' => '314', 'Param' => { '0' => { 'name' => 'srq', 'type' => '10052' }, '1' => { 'name' => 'srq_attr', 'type' => '23214' }, '2' => { 'name' => 'cmd', 'type' => '30376' }, '3' => { 'name' => 'cmd_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_query_srq' }, '30381' => { 'Header' => undef, 'Line' => '296', 'Param' => { '0' => { 'name' => 'srq', 'type' => '10052' }, '1' => { 'name' => 'srq_attr', 'type' => '23214' }, '2' => { 'name' => 'srq_attr_mask', 'type' => '161' }, '3' => { 'name' => 'cmd', 'type' => '30584' }, '4' => { 'name' => 'cmd_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_modify_srq' }, '30915' => { 'Header' => undef, 'Line' => '253', 'Param' => { '0' => { 'name' => 'cq', 'type' => '9734' }, '1' => { 'name' => 'cqe', 'type' => '161' }, '2' => { 'name' => 'cmd', 'type' => '31177' }, '3' => { 'name' => 'cmd_size', 'type' => '53' }, '4' => { 'name' => 'resp', 'type' => '31182' }, '5' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_resize_cq' }, '309262' => { 'Header' => undef, 'Line' => '310', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'addr', 'type' => '82' }, '2' => { 'name' => 'length', 'type' => '53' }, '3' => { 'name' => 'iova', 'type' => '965' }, '4' => { 'name' => 'access', 'type' => '70' } }, 'Return' => '11186', 'ShortName' => 'ibv_reg_mr_iova2' }, '309792' => { 'Alias' => '__ibv_register_driver_1_1', 'Header' => undef, 'Line' => '979', 'Param' => { '0' => { 'name' => 'name', 'type' => '74950' }, '1' => { 'name' => 'init_func', 'type' => '308690' } }, 'Return' => '1', 'ShortName' => 'ibv_register_driver' }, '309844' => { 'Alias' => '__ibv_detach_mcast_1_0', 'Header' => undef, 'Line' => '972', 'Param' => { '0' => { 'name' => 'qp', 'type' => '308605' }, '1' => { 'name' => 'gid', 'type' => '98836' }, '2' => { 'name' => 'lid', 'type' => '941' } }, 'Return' => '161', 'ShortName' => 'ibv_detach_mcast' }, '309969' => { 'Alias' => '__ibv_attach_mcast_1_0', 'Header' => undef, 'Line' => '965', 'Param' => { '0' => { 'name' => 'qp', 'type' => '308605' }, '1' => { 'name' => 'gid', 'type' => '98836' }, '2' => { 'name' => 'lid', 'type' => '941' } }, 'Return' => '161', 'ShortName' => 'ibv_attach_mcast' }, '310094' => { 'Alias' => '__ibv_destroy_ah_1_0', 'Header' => undef, 'Line' => '951', 'Param' => { '0' => { 'name' => 'ah', 'type' => '307566' } }, 'Return' => '161', 'ShortName' => 'ibv_destroy_ah' }, '310200' => { 'Alias' => '__ibv_create_ah_1_0', 'Header' => undef, 'Line' => '927', 'Param' => { '0' => { 'name' => 'pd', 'type' => '306987' }, '1' => { 'name' => 'attr', 'type' => '23194' } }, 'Return' => '307566', 'ShortName' => 'ibv_create_ah' }, '310380' => { 'Alias' => '__ibv_destroy_qp_1_0', 'Header' => undef, 'Line' => '913', 'Param' => { '0' => { 'name' => 'qp', 'type' => '308605' } }, 'Return' => '161', 'ShortName' => 'ibv_destroy_qp' }, '310486' => { 'Alias' => '__ibv_modify_qp_1_0', 'Header' => undef, 'Line' => '904', 'Param' => { '0' => { 'name' => 'qp', 'type' => '308605' }, '1' => { 'name' => 'attr', 'type' => '23209' }, '2' => { 'name' => 'attr_mask', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'ibv_modify_qp' }, '310607' => { 'Alias' => '__ibv_query_qp_1_0', 'Header' => undef, 'Line' => '881', 'Param' => { '0' => { 'name' => 'qp', 'type' => '308605' }, '1' => { 'name' => 'attr', 'type' => 
'23209' }, '2' => { 'name' => 'attr_mask', 'type' => '161' }, '3' => { 'name' => 'init_attr', 'type' => '310808' } }, 'Return' => '161', 'ShortName' => 'ibv_query_qp' }, '310813' => { 'Alias' => '__ibv_create_qp_1_0', 'Header' => undef, 'Line' => '836', 'Param' => { '0' => { 'name' => 'pd', 'type' => '306987' }, '1' => { 'name' => 'qp_init_attr', 'type' => '310808' } }, 'Return' => '308605', 'ShortName' => 'ibv_create_qp' }, '311027' => { 'Alias' => '__ibv_destroy_srq_1_0', 'Header' => undef, 'Line' => '822', 'Param' => { '0' => { 'name' => 'srq', 'type' => '307350' } }, 'Return' => '161', 'ShortName' => 'ibv_destroy_srq' }, '311134' => { 'Alias' => '__ibv_query_srq_1_0', 'Header' => undef, 'Line' => '814', 'Param' => { '0' => { 'name' => 'srq', 'type' => '307350' }, '1' => { 'name' => 'srq_attr', 'type' => '23214' } }, 'Return' => '161', 'ShortName' => 'ibv_query_srq' }, '311229' => { 'Alias' => '__ibv_modify_srq_1_0', 'Header' => undef, 'Line' => '805', 'Param' => { '0' => { 'name' => 'srq', 'type' => '307350' }, '1' => { 'name' => 'srq_attr', 'type' => '23214' }, '2' => { 'name' => 'srq_attr_mask', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'ibv_modify_srq' }, '311351' => { 'Alias' => '__ibv_create_srq_1_0', 'Header' => undef, 'Line' => '777', 'Param' => { '0' => { 'name' => 'pd', 'type' => '306987' }, '1' => { 'name' => 'srq_init_attr', 'type' => '67152' } }, 'Return' => '307350', 'ShortName' => 'ibv_create_srq' }, '311532' => { 'Alias' => '__ibv_ack_cq_events_1_0', 'Header' => undef, 'Line' => '769', 'Param' => { '0' => { 'name' => 'cq', 'type' => '307345' }, '1' => { 'name' => 'nevents', 'type' => '70' } }, 'Return' => '1', 'ShortName' => 'ibv_ack_cq_events' }, '311622' => { 'Alias' => '__ibv_get_cq_event_1_0', 'Header' => undef, 'Line' => '749', 'Param' => { '0' => { 'name' => 'channel', 'type' => '15165' }, '1' => { 'name' => 'cq', 'type' => '311815' }, '2' => { 'name' => 'cq_context', 'type' => '153588' } }, 'Return' => '161', 'ShortName' => 'ibv_get_cq_event' }, '311820' => { 'Alias' => '__ibv_destroy_cq_1_0', 'Header' => undef, 'Line' => '735', 'Param' => { '0' => { 'name' => 'cq', 'type' => '307345' } }, 'Return' => '161', 'ShortName' => 'ibv_destroy_cq' }, '31187' => { 'Header' => undef, 'Line' => '240', 'Param' => { '0' => { 'name' => 'ibcq', 'type' => '9734' }, '1' => { 'name' => 'solicited_only', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_req_notify_cq' }, '311926' => { 'Alias' => '__ibv_resize_cq_1_0', 'Header' => undef, 'Line' => '728', 'Param' => { '0' => { 'name' => 'cq', 'type' => '307345' }, '1' => { 'name' => 'cqe', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'ibv_resize_cq' }, '312020' => { 'Alias' => '__ibv_create_cq_1_0', 'Header' => undef, 'Line' => '699', 'Param' => { '0' => { 'name' => 'context', 'type' => '306891' }, '1' => { 'name' => 'cqe', 'type' => '161' }, '2' => { 'name' => 'cq_context', 'type' => '82' }, '3' => { 'name' => 'channel', 'type' => '15165' }, '4' => { 'name' => 'comp_vector', 'type' => '161' } }, 'Return' => '307345', 'ShortName' => 'ibv_create_cq' }, '312279' => { 'Alias' => '__ibv_dereg_mr_1_0', 'Header' => undef, 'Line' => '685', 'Param' => { '0' => { 'name' => 'mr', 'type' => '312385' } }, 'Return' => '161', 'ShortName' => 'ibv_dereg_mr' }, '312390' => { 'Alias' => '__ibv_reg_mr_1_0', 'Header' => undef, 'Line' => '658', 'Param' => { '0' => { 'name' => 'pd', 'type' => '306987' }, '1' => { 'name' => 'addr', 'type' => '82' }, '2' => { 'name' => 'length', 'type' => '53' }, '3' => { 'name' => 
'access', 'type' => '161' } }, 'Return' => '312385', 'ShortName' => 'ibv_reg_mr' }, '312716' => { 'Alias' => '__ibv_dealloc_pd_1_0', 'Header' => undef, 'Line' => '644', 'Param' => { '0' => { 'name' => 'pd', 'type' => '306987' } }, 'Return' => '161', 'ShortName' => 'ibv_dealloc_pd' }, '312822' => { 'Alias' => '__ibv_alloc_pd_1_0', 'Header' => undef, 'Line' => '621', 'Param' => { '0' => { 'name' => 'context', 'type' => '306891' } }, 'Return' => '306987', 'ShortName' => 'ibv_alloc_pd' }, '312971' => { 'Alias' => '__ibv_query_pkey_1_0', 'Header' => undef, 'Line' => '612', 'Param' => { '0' => { 'name' => 'context', 'type' => '306891' }, '1' => { 'name' => 'port_num', 'type' => '929' }, '2' => { 'name' => 'index', 'type' => '161' }, '3' => { 'name' => 'pkey', 'type' => '309427' } }, 'Return' => '161', 'ShortName' => 'ibv_query_pkey' }, '313123' => { 'Alias' => '__ibv_query_gid_1_0', 'Header' => undef, 'Line' => '603', 'Param' => { '0' => { 'name' => 'context', 'type' => '306891' }, '1' => { 'name' => 'port_num', 'type' => '929' }, '2' => { 'name' => 'index', 'type' => '161' }, '3' => { 'name' => 'gid', 'type' => '98836' } }, 'Return' => '161', 'ShortName' => 'ibv_query_gid' }, '313275' => { 'Alias' => '__ibv_query_port_1_0', 'Header' => undef, 'Line' => '594', 'Param' => { '0' => { 'name' => 'context', 'type' => '306891' }, '1' => { 'name' => 'port_num', 'type' => '929' }, '2' => { 'name' => 'port_attr', 'type' => '63789' } }, 'Return' => '161', 'ShortName' => 'ibv_query_port' }, '31328' => { 'Header' => undef, 'Line' => '194', 'Param' => { '0' => { 'name' => 'ibcq', 'type' => '9734' }, '1' => { 'name' => 'ne', 'type' => '161' }, '2' => { 'name' => 'wc', 'type' => '18205' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_poll_cq' }, '313664' => { 'Alias' => '__ibv_query_device_1_0', 'Header' => undef, 'Line' => '586', 'Param' => { '0' => { 'name' => 'context', 'type' => '306891' }, '1' => { 'name' => 'device_attr', 'type' => '18040' } }, 'Return' => '161', 'ShortName' => 'ibv_query_device' }, '313759' => { 'Alias' => '__ibv_ack_async_event_1_0', 'Header' => undef, 'Line' => '549', 'Param' => { '0' => { 'name' => 'event', 'type' => '66976' } }, 'Return' => '1', 'ShortName' => 'ibv_ack_async_event' }, '313861' => { 'Header' => undef, 'Line' => '510', 'Param' => { '0' => { 'name' => 'context', 'type' => '306891' }, '1' => { 'name' => 'event', 'type' => '66976' } }, 'Return' => '161', 'ShortName' => '__ibv_get_async_event_1_0' }, '313919' => { 'Alias' => '__ibv_close_device_1_0', 'Header' => undef, 'Line' => '496', 'Param' => { '0' => { 'name' => 'context', 'type' => '306891' } }, 'Return' => '161', 'ShortName' => 'ibv_close_device' }, '314026' => { 'Alias' => '__ibv_open_device_1_0', 'Header' => undef, 'Line' => '467', 'Param' => { '0' => { 'name' => 'device', 'type' => '308685' } }, 'Return' => '306891', 'ShortName' => 'ibv_open_device' }, '315211' => { 'Alias' => '__ibv_get_device_guid_1_0', 'Header' => undef, 'Line' => '294', 'Param' => { '0' => { 'name' => 'device', 'type' => '308685' } }, 'Return' => '1061', 'ShortName' => 'ibv_get_device_guid' }, '315278' => { 'Alias' => '__ibv_get_device_name_1_0', 'Header' => undef, 'Line' => '287', 'Param' => { '0' => { 'name' => 'device', 'type' => '308685' } }, 'Return' => '74950', 'ShortName' => 'ibv_get_device_name' }, '315345' => { 'Alias' => '__ibv_free_device_list_1_0', 'Header' => undef, 'Line' => '272', 'Param' => { '0' => { 'name' => 'list', 'type' => '315462' } }, 'Return' => '1', 'ShortName' => 'ibv_free_device_list' }, '315467' => { 'Alias' => 
'__ibv_get_device_list_1_0', 'Header' => undef, 'Line' => '231', 'Param' => { '0' => { 'name' => 'num', 'type' => '23549' } }, 'Return' => '315462', 'ShortName' => 'ibv_get_device_list' }, '31682' => { 'Header' => undef, 'Line' => '169', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'type', 'type' => '11400' }, '2' => { 'name' => 'mw', 'type' => '13825' }, '3' => { 'name' => 'cmd', 'type' => '32019' }, '4' => { 'name' => 'cmd_size', 'type' => '53' }, '5' => { 'name' => 'resp', 'type' => '32024' }, '6' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_alloc_mw' }, '32029' => { 'Header' => undef, 'Line' => '140', 'Param' => { '0' => { 'name' => 'vmr', 'type' => '23204' }, '1' => { 'name' => 'flags', 'type' => '953' }, '10' => { 'name' => 'resp_sz', 'type' => '53' }, '2' => { 'name' => 'addr', 'type' => '82' }, '3' => { 'name' => 'length', 'type' => '53' }, '4' => { 'name' => 'hca_va', 'type' => '965' }, '5' => { 'name' => 'access', 'type' => '161' }, '6' => { 'name' => 'pd', 'type' => '11395' }, '7' => { 'name' => 'cmd', 'type' => '32418' }, '8' => { 'name' => 'cmd_sz', 'type' => '53' }, '9' => { 'name' => 'resp', 'type' => '32423' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_rereg_mr' }, '32428' => { 'Header' => undef, 'Line' => '99', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'addr', 'type' => '82' }, '2' => { 'name' => 'length', 'type' => '53' }, '3' => { 'name' => 'hca_va', 'type' => '965' }, '4' => { 'name' => 'access', 'type' => '161' }, '5' => { 'name' => 'vmr', 'type' => '23204' }, '6' => { 'name' => 'cmd', 'type' => '32762' }, '7' => { 'name' => 'cmd_size', 'type' => '53' }, '8' => { 'name' => 'resp', 'type' => '32767' }, '9' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_reg_mr' }, '32772' => { 'Header' => undef, 'Line' => '67', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'xrcd', 'type' => '22845' }, '2' => { 'name' => 'vxrcd_size', 'type' => '161' }, '3' => { 'name' => 'attr', 'type' => '18365' }, '4' => { 'name' => 'cmd', 'type' => '33068' }, '5' => { 'name' => 'cmd_size', 'type' => '53' }, '6' => { 'name' => 'resp', 'type' => '33073' }, '7' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_open_xrcd' }, '33078' => { 'Header' => undef, 'Line' => '50', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'pd', 'type' => '11395' }, '2' => { 'name' => 'cmd', 'type' => '33343' }, '3' => { 'name' => 'cmd_size', 'type' => '53' }, '4' => { 'name' => 'resp', 'type' => '33348' }, '5' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_alloc_pd' }, '334991' => { 'Header' => undef, 'Line' => '589', 'Param' => { '0' => { 'name' => 'vctx', 'type' => '91187' }, '1' => { 'name' => 'ops', 'type' => '335015' } }, 'Return' => '1', 'ShortName' => 'verbs_set_ops' }, '335367' => { 'Alias' => '__ibv_ack_async_event_1_1', 'Header' => undef, 'Line' => '498', 'Param' => { '0' => { 'name' => 'event', 'type' => '66976' } }, 'Return' => '1', 'ShortName' => 'ibv_ack_async_event' }, '335673' => { 'Alias' => '__ibv_get_async_event_1_1', 'Header' => undef, 'Line' => '452', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'event', 'type' => '66976' } }, 'Return' => '161', 'ShortName' => 'ibv_get_async_event' }, '335978' => { 'Alias' => '__ibv_close_device_1_1', 'Header' => undef, 
'Line' => '442', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' } }, 'Return' => '161', 'ShortName' => 'ibv_close_device' }, '336165' => { 'Header' => undef, 'Line' => '432', 'Param' => { '0' => { 'name' => 'context_ex', 'type' => '91187' } }, 'Return' => '1', 'ShortName' => 'verbs_uninit_context' }, '336268' => { 'Header' => undef, 'Line' => '370', 'Param' => { '0' => { 'name' => 'cmd_fd', 'type' => '161' } }, 'Return' => '8991', 'ShortName' => 'ibv_import_device' }, '336712' => { 'Alias' => '__ibv_open_device_1_1', 'Header' => undef, 'Line' => '363', 'Param' => { '0' => { 'name' => 'device', 'type' => '17378' } }, 'Return' => '8991', 'ShortName' => 'ibv_open_device' }, '336793' => { 'Header' => undef, 'Line' => '323', 'Param' => { '0' => { 'name' => 'device', 'type' => '17378' }, '1' => { 'name' => 'private_data', 'type' => '82' } }, 'Return' => '8991', 'ShortName' => 'verbs_open_device' }, '337091' => { 'Header' => undef, 'Line' => '265', 'Param' => { '0' => { 'name' => 'device', 'type' => '17378' }, '1' => { 'name' => 'cmd_fd', 'type' => '161' }, '2' => { 'name' => 'alloc_size', 'type' => '53' }, '3' => { 'name' => 'context_offset', 'type' => '91187' }, '4' => { 'name' => 'driver_id', 'type' => '953' } }, 'Return' => '82', 'ShortName' => '_verbs_init_and_alloc_context' }, '338219' => { 'Header' => undef, 'Line' => '502', 'Param' => { '0' => { 'name' => 'cq', 'type' => '9734' }, '1' => { 'name' => 'context', 'type' => '8991' }, '2' => { 'name' => 'channel', 'type' => '15165' }, '3' => { 'name' => 'cq_context', 'type' => '82' } }, 'Return' => '1', 'ShortName' => 'verbs_init_cq' }, '338429' => { 'Header' => undef, 'Line' => '153', 'Param' => { '0' => { 'name' => 'device', 'type' => '17378' } }, 'Return' => '161', 'ShortName' => 'ibv_get_device_index' }, '338495' => { 'Alias' => '__ibv_get_device_guid_1_1', 'Header' => undef, 'Line' => '116', 'Param' => { '0' => { 'name' => 'device', 'type' => '17378' } }, 'Return' => '1061', 'ShortName' => 'ibv_get_device_guid' }, '339010' => { 'Alias' => '__ibv_get_device_name_1_1', 'Header' => undef, 'Line' => '109', 'Param' => { '0' => { 'name' => 'device', 'type' => '17378' } }, 'Return' => '74950', 'ShortName' => 'ibv_get_device_name' }, '339057' => { 'Alias' => '__ibv_free_device_list_1_1', 'Header' => undef, 'Line' => '98', 'Param' => { '0' => { 'name' => 'list', 'type' => '309712' } }, 'Return' => '1', 'ShortName' => 'ibv_free_device_list' }, '339157' => { 'Alias' => '__ibv_get_device_list_1_1', 'Header' => undef, 'Line' => '54', 'Param' => { '0' => { 'name' => 'num', 'type' => '23549' } }, 'Return' => '309712', 'ShortName' => 'ibv_get_device_list' }, '367839' => { 'Header' => undef, 'Line' => '136', 'Param' => { '0' => { 'name' => 'opcode', 'type' => '13213' } }, 'Return' => '74950', 'ShortName' => 'ibv_wr_opcode_str' }, '367932' => { 'Header' => undef, 'Line' => '101', 'Param' => { '0' => { 'name' => 'status', 'type' => '10257' } }, 'Return' => '74950', 'ShortName' => 'ibv_wc_status_str' }, '368025' => { 'Header' => undef, 'Line' => '70', 'Param' => { '0' => { 'name' => 'event', 'type' => '54967' } }, 'Return' => '74950', 'ShortName' => 'ibv_event_type_str' }, '368118' => { 'Header' => undef, 'Line' => '53', 'Param' => { '0' => { 'name' => 'port_state', 'type' => '54578' } }, 'Return' => '74950', 'ShortName' => 'ibv_port_state_str' }, '368211' => { 'Header' => undef, 'Line' => '35', 'Param' => { '0' => { 'name' => 'node_type', 'type' => '8728' } }, 'Return' => '74950', 'ShortName' => 'ibv_node_type_str' }, '370963' => { 'Header' => 
undef, 'Line' => '48', 'Return' => '74950', 'ShortName' => 'ibv_get_sysfs_path' }, '393791' => { 'Header' => undef, 'Line' => '125', 'Return' => '161', 'ShortName' => 'ibv_fork_init' }, '393917' => { 'Header' => undef, 'Line' => '108', 'Param' => { '0' => { 'name' => 'dir', 'type' => '74950' }, '1' => { 'name' => 'file', 'type' => '74950' }, '2' => { 'name' => 'buf', 'type' => '221' }, '3' => { 'name' => 'size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_read_sysfs_file' }, '399863' => { 'Header' => undef, 'Line' => '240', 'Param' => { '0' => { 'name' => 'ops', 'type' => '91132' } }, 'Return' => '1', 'ShortName' => 'verbs_register_driver_34' }, '402685' => { 'Header' => undef, 'Line' => '66', 'Param' => { '0' => { 'name' => 'ctx', 'type' => '91187' }, '1' => { 'name' => 'level', 'type' => '953' }, '2' => { 'name' => 'fmt', 'type' => '74950' }, '3' => { 'type' => '-1' } }, 'Return' => '1', 'ShortName' => '__verbs_log' }, '40495' => { 'Header' => undef, 'Line' => '35', 'Param' => { '0' => { 'name' => 'ah', 'type' => '13672' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_destroy_ah' }, '406663' => { 'Header' => undef, 'Line' => '117', 'Param' => { '0' => { 'name' => 'dst', 'type' => '406864' }, '1' => { 'name' => 'src', 'type' => '406869' } }, 'Return' => '1', 'ShortName' => 'ibv_copy_path_rec_to_kern' }, '406874' => { 'Header' => undef, 'Line' => '92', 'Param' => { '0' => { 'name' => 'dst', 'type' => '406869' }, '1' => { 'name' => 'src', 'type' => '406864' } }, 'Return' => '1', 'ShortName' => 'ibv_copy_path_rec_from_kern' }, '407069' => { 'Header' => undef, 'Line' => '56', 'Param' => { '0' => { 'name' => 'dst', 'type' => '23209' }, '1' => { 'name' => 'src', 'type' => '407199' } }, 'Return' => '1', 'ShortName' => 'ibv_copy_qp_attr_from_kern' }, '407204' => { 'Header' => undef, 'Line' => '39', 'Param' => { '0' => { 'name' => 'dst', 'type' => '23194' }, '1' => { 'name' => 'src', 'type' => '407335' } }, 'Return' => '1', 'ShortName' => 'ibv_copy_ah_attr_from_kern' }, '411820' => { 'Header' => undef, 'Line' => '700', 'Param' => { '0' => { 'name' => 'base', 'type' => '82' }, '1' => { 'name' => 'size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_dofork_range' }, '412066' => { 'Header' => undef, 'Line' => '699', 'Param' => { '0' => { 'name' => 'base', 'type' => '82' }, '1' => { 'name' => 'size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_dontfork_range' }, '414065' => { 'Header' => undef, 'Line' => '181', 'Return' => '410188', 'ShortName' => 'ibv_is_fork_initialized' }, '457416' => { 'Header' => undef, 'Line' => '1125', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' }, '1' => { 'name' => 'ece', 'type' => '67452' } }, 'Return' => '161', 'ShortName' => 'ibv_query_ece' }, '457609' => { 'Header' => undef, 'Line' => '1115', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' }, '1' => { 'name' => 'ece', 'type' => '67452' } }, 'Return' => '161', 'ShortName' => 'ibv_set_ece' }, '457819' => { 'Header' => undef, 'Line' => '1031', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'attr', 'type' => '23194' }, '2' => { 'name' => 'eth_mac', 'type' => '58193' }, '3' => { 'name' => 'vid', 'type' => '458719' } }, 'Return' => '161', 'ShortName' => 'ibv_resolve_eth_l2_from_gid' }, '458834' => { 'Alias' => '__ibv_detach_mcast_1_1', 'Header' => undef, 'Line' => '990', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' }, '1' => { 'name' => 'gid', 'type' => '23189' }, '2' => { 'name' => 'lid', 'type' => '941' } }, 'Return' => 
'161', 'ShortName' => 'ibv_detach_mcast' }, '459058' => { 'Alias' => '__ibv_attach_mcast_1_1', 'Header' => undef, 'Line' => '983', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' }, '1' => { 'name' => 'gid', 'type' => '23189' }, '2' => { 'name' => 'lid', 'type' => '941' } }, 'Return' => '161', 'ShortName' => 'ibv_attach_mcast' }, '459282' => { 'Alias' => '__ibv_destroy_ah_1_1', 'Header' => undef, 'Line' => '976', 'Param' => { '0' => { 'name' => 'ah', 'type' => '13672' } }, 'Return' => '161', 'ShortName' => 'ibv_destroy_ah' }, '459448' => { 'Header' => undef, 'Line' => '963', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'wc', 'type' => '18205' }, '2' => { 'name' => 'grh', 'type' => '459688' }, '3' => { 'name' => 'port_num', 'type' => '929' } }, 'Return' => '13672', 'ShortName' => 'ibv_create_ah_from_wc' }, '459693' => { 'Header' => undef, 'Line' => '935', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'port_num', 'type' => '929' }, '2' => { 'name' => 'wc', 'type' => '18205' }, '3' => { 'name' => 'grh', 'type' => '459688' }, '4' => { 'name' => 'ah_attr', 'type' => '23194' } }, 'Return' => '161', 'ShortName' => 'ibv_init_ah_from_wc' }, '461580' => { 'Header' => undef, 'Line' => '755', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'port_num', 'type' => '929' }, '2' => { 'name' => 'index', 'type' => '70' }, '3' => { 'name' => 'type', 'type' => '98705' } }, 'Return' => '161', 'ShortName' => 'ibv_query_gid_type' }, '461807' => { 'Alias' => '__ibv_create_ah_1_1', 'Header' => undef, 'Line' => '741', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'attr', 'type' => '23194' } }, 'Return' => '13672', 'ShortName' => 'ibv_create_ah' }, '462018' => { 'Alias' => '__ibv_destroy_qp_1_1', 'Header' => undef, 'Line' => '734', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' } }, 'Return' => '161', 'ShortName' => 'ibv_destroy_qp' }, '462184' => { 'Alias' => '__ibv_modify_qp_1_1', 'Header' => undef, 'Line' => '717', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' }, '1' => { 'name' => 'attr', 'type' => '23209' }, '2' => { 'name' => 'attr_mask', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'ibv_modify_qp' }, '462421' => { 'Header' => undef, 'Line' => '695', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' }, '1' => { 'name' => 'op', 'type' => '13213' }, '2' => { 'name' => 'flags', 'type' => '953' } }, 'Return' => '161', 'ShortName' => 'ibv_query_qp_data_in_order' }, '462635' => { 'Alias' => '__ibv_query_qp_1_1', 'Header' => undef, 'Line' => '677', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' }, '1' => { 'name' => 'attr', 'type' => '23209' }, '2' => { 'name' => 'attr_mask', 'type' => '161' }, '3' => { 'name' => 'init_attr', 'type' => '23199' } }, 'Return' => '161', 'ShortName' => 'ibv_query_qp' }, '462899' => { 'Header' => undef, 'Line' => '668', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9935' } }, 'Return' => '14644', 'ShortName' => 'ibv_qp_to_qp_ex' }, '462973' => { 'Alias' => '__ibv_create_qp_1_1', 'Header' => undef, 'Line' => '658', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'qp_init_attr', 'type' => '23199' } }, 'Return' => '9935', 'ShortName' => 'ibv_create_qp' }, '463177' => { 'Alias' => '__ibv_destroy_srq_1_1', 'Header' => undef, 'Line' => '651', 'Param' => { '0' => { 'name' => 'srq', 'type' => '10052' } }, 'Return' => '161', 'ShortName' => 'ibv_destroy_srq' }, '463344' => { 'Alias' => 
'__ibv_query_srq_1_1', 'Header' => undef, 'Line' => '644', 'Param' => { '0' => { 'name' => 'srq', 'type' => '10052' }, '1' => { 'name' => 'srq_attr', 'type' => '23214' } }, 'Return' => '161', 'ShortName' => 'ibv_query_srq' }, '463538' => { 'Alias' => '__ibv_modify_srq_1_1', 'Header' => undef, 'Line' => '635', 'Param' => { '0' => { 'name' => 'srq', 'type' => '10052' }, '1' => { 'name' => 'srq_attr', 'type' => '23214' }, '2' => { 'name' => 'srq_attr_mask', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'ibv_modify_srq' }, '463759' => { 'Alias' => '__ibv_create_srq_1_1', 'Header' => undef, 'Line' => '615', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'srq_init_attr', 'type' => '67152' } }, 'Return' => '10052', 'ShortName' => 'ibv_create_srq' }, '464005' => { 'Alias' => '__ibv_ack_cq_events_1_1', 'Header' => undef, 'Line' => '605', 'Param' => { '0' => { 'name' => 'cq', 'type' => '9734' }, '1' => { 'name' => 'nevents', 'type' => '70' } }, 'Return' => '1', 'ShortName' => 'ibv_ack_cq_events' }, '464147' => { 'Alias' => '__ibv_get_cq_event_1_1', 'Header' => undef, 'Line' => '587', 'Param' => { '0' => { 'name' => 'channel', 'type' => '15165' }, '1' => { 'name' => 'cq', 'type' => '309135' }, '2' => { 'name' => 'cq_context', 'type' => '153588' } }, 'Return' => '161', 'ShortName' => 'ibv_get_cq_event' }, '464456' => { 'Alias' => '__ibv_destroy_cq_1_1', 'Header' => undef, 'Line' => '567', 'Param' => { '0' => { 'name' => 'cq', 'type' => '9734' } }, 'Return' => '161', 'ShortName' => 'ibv_destroy_cq' }, '464668' => { 'Alias' => '__ibv_resize_cq_1_1', 'Header' => undef, 'Line' => '560', 'Param' => { '0' => { 'name' => 'cq', 'type' => '9734' }, '1' => { 'name' => 'cqe', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'ibv_resize_cq' }, '464861' => { 'Alias' => '__ibv_create_cq_1_1', 'Header' => undef, 'Line' => '545', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'cqe', 'type' => '161' }, '2' => { 'name' => 'cq_context', 'type' => '82' }, '3' => { 'name' => 'channel', 'type' => '15165' }, '4' => { 'name' => 'comp_vector', 'type' => '161' } }, 'Return' => '9734', 'ShortName' => 'ibv_create_cq' }, '465188' => { 'Header' => undef, 'Line' => '522', 'Param' => { '0' => { 'name' => 'channel', 'type' => '15165' } }, 'Return' => '161', 'ShortName' => 'ibv_destroy_comp_channel' }, '465405' => { 'Header' => undef, 'Line' => '498', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' } }, 'Return' => '15165', 'ShortName' => 'ibv_create_comp_channel' }, '465662' => { 'Alias' => '__ibv_dereg_mr_1_1', 'Header' => undef, 'Line' => '481', 'Param' => { '0' => { 'name' => 'mr', 'type' => '11186' } }, 'Return' => '161', 'ShortName' => 'ibv_dereg_mr' }, '465928' => { 'Alias' => '__ibv_rereg_mr_1_1', 'Header' => undef, 'Line' => '416', 'Param' => { '0' => { 'name' => 'mr', 'type' => '11186' }, '1' => { 'name' => 'flags', 'type' => '161' }, '2' => { 'name' => 'pd', 'type' => '11395' }, '3' => { 'name' => 'addr', 'type' => '82' }, '4' => { 'name' => 'length', 'type' => '53' }, '5' => { 'name' => 'access', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'ibv_rereg_mr' }, '466421' => { 'Header' => undef, 'Line' => '398', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'offset', 'type' => '965' }, '2' => { 'name' => 'length', 'type' => '53' }, '3' => { 'name' => 'iova', 'type' => '965' }, '4' => { 'name' => 'fd', 'type' => '161' }, '5' => { 'name' => 'access', 'type' => '161' } }, 'Return' => '11186', 
'ShortName' => 'ibv_reg_dmabuf_mr' }, '466737' => { 'Header' => undef, 'Line' => '393', 'Param' => { '0' => { 'name' => 'dm', 'type' => '53174' } }, 'Return' => '1', 'ShortName' => 'ibv_unimport_dm' }, '466899' => { 'Header' => undef, 'Line' => '385', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'dm_handle', 'type' => '953' } }, 'Return' => '53174', 'ShortName' => 'ibv_import_dm' }, '467093' => { 'Header' => undef, 'Line' => '377', 'Param' => { '0' => { 'name' => 'mr', 'type' => '11186' } }, 'Return' => '1', 'ShortName' => 'ibv_unimport_mr' }, '467255' => { 'Header' => undef, 'Line' => '369', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'mr_handle', 'type' => '953' } }, 'Return' => '11186', 'ShortName' => 'ibv_import_mr' }, '467448' => { 'Header' => undef, 'Line' => '360', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' } }, 'Return' => '1', 'ShortName' => 'ibv_unimport_pd' }, '467610' => { 'Header' => undef, 'Line' => '353', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'pd_handle', 'type' => '953' } }, 'Return' => '11395', 'ShortName' => 'ibv_import_pd' }, '467804' => { 'Header' => undef, 'Line' => '347', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'addr', 'type' => '82' }, '2' => { 'name' => 'length', 'type' => '53' }, '3' => { 'name' => 'iova', 'type' => '965' }, '4' => { 'name' => 'access', 'type' => '161' } }, 'Return' => '11186', 'ShortName' => 'ibv_reg_mr_iova' }, '467987' => { 'Alias' => '__ibv_reg_mr_1_1', 'Header' => undef, 'Line' => '338', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' }, '1' => { 'name' => 'addr', 'type' => '82' }, '2' => { 'name' => 'length', 'type' => '53' }, '3' => { 'name' => 'access', 'type' => '161' } }, 'Return' => '11186', 'ShortName' => 'ibv_reg_mr' }, '468554' => { 'Alias' => '__ibv_dealloc_pd_1_1', 'Header' => undef, 'Line' => '303', 'Param' => { '0' => { 'name' => 'pd', 'type' => '11395' } }, 'Return' => '161', 'ShortName' => 'ibv_dealloc_pd' }, '468720' => { 'Alias' => '__ibv_alloc_pd_1_1', 'Header' => undef, 'Line' => '290', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' } }, 'Return' => '11395', 'ShortName' => 'ibv_alloc_pd' }, '468905' => { 'Alias' => '__ibv_get_pkey_index_1_5', 'Header' => undef, 'Line' => '274', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'port_num', 'type' => '929' }, '2' => { 'name' => 'pkey', 'type' => '1037' } }, 'Return' => '161', 'ShortName' => 'ibv_get_pkey_index' }, '469108' => { 'Alias' => '__ibv_query_pkey_1_1', 'Header' => undef, 'Line' => '254', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'port_num', 'type' => '929' }, '2' => { 'name' => 'index', 'type' => '161' }, '3' => { 'name' => 'pkey', 'type' => '309427' } }, 'Return' => '161', 'ShortName' => 'ibv_query_pkey' }, '469414' => { 'Alias' => '__ibv_query_gid_1_1', 'Header' => undef, 'Line' => '231', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'port_num', 'type' => '929' }, '2' => { 'name' => 'index', 'type' => '161' }, '3' => { 'name' => 'gid', 'type' => '98836' } }, 'Return' => '161', 'ShortName' => 'ibv_query_gid' }, '469784' => { 'Alias' => '__ibv_query_port_1_1', 'Header' => undef, 'Line' => '221', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'port_num', 'type' => '929' }, '2' => { 'name' => 'port_attr', 'type' => '18075' } }, 'Return' => 
'161', 'ShortName' => 'ibv_query_port' }, '470010' => { 'Alias' => '__ibv_query_device_1_1', 'Header' => undef, 'Line' => '163', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'device_attr', 'type' => '18040' } }, 'Return' => '161', 'ShortName' => 'ibv_query_device' }, '470211' => { 'Header' => undef, 'Line' => '133', 'Param' => { '0' => { 'name' => 'mbps', 'type' => '161' } }, 'Return' => '443458', 'ShortName' => 'mbps_to_ibv_rate' }, '470260' => { 'Header' => undef, 'Line' => '103', 'Param' => { '0' => { 'name' => 'rate', 'type' => '443458' } }, 'Return' => '161', 'ShortName' => 'ibv_rate_to_mbps' }, '470313' => { 'Header' => undef, 'Line' => '81', 'Param' => { '0' => { 'name' => 'mult', 'type' => '161' } }, 'Return' => '443458', 'ShortName' => 'mult_to_ibv_rate' }, '470366' => { 'Header' => undef, 'Line' => '59', 'Param' => { '0' => { 'name' => 'rate', 'type' => '443458' } }, 'Return' => '161', 'ShortName' => 'ibv_rate_to_mult' }, '47303' => { 'Header' => undef, 'Line' => '193', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'cmd', 'type' => '40016' } }, 'Return' => '161', 'ShortName' => 'execute_ioctl' }, '47329' => { 'Header' => undef, 'Line' => '125', 'Param' => { '0' => { 'name' => 'num_attrs', 'type' => '70' }, '1' => { 'name' => 'link', 'type' => '40016' } }, 'Return' => '70', 'ShortName' => '__ioctl_final_num_attrs' }, '47355' => { 'Header' => undef, 'Line' => '79', 'Param' => { '0' => { 'name' => 'vcounters', 'type' => '48314' }, '1' => { 'name' => 'counters_value', 'type' => '46784' }, '2' => { 'name' => 'ncounters', 'type' => '953' }, '3' => { 'name' => 'flags', 'type' => '953' }, '4' => { 'name' => 'link', 'type' => '40016' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_read_counters' }, '48338' => { 'Header' => undef, 'Line' => '64', 'Param' => { '0' => { 'name' => 'vcounters', 'type' => '48314' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_destroy_counters' }, '48793' => { 'Header' => undef, 'Line' => '38', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'init_attr', 'type' => '46789' }, '2' => { 'name' => 'vcounters', 'type' => '48314' }, '3' => { 'name' => 'link', 'type' => '40016' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_create_counters' }, '68539' => { 'Header' => undef, 'Line' => '203', 'Param' => { '0' => { 'name' => 'cq', 'type' => '9734' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_destroy_cq' }, '69361' => { 'Header' => undef, 'Line' => '186', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'cq_attr', 'type' => '69870' }, '2' => { 'name' => 'cq', 'type' => '69875' }, '3' => { 'name' => 'cmd', 'type' => '69880' }, '4' => { 'name' => 'cmd_size', 'type' => '53' }, '5' => { 'name' => 'resp', 'type' => '69885' }, '6' => { 'name' => 'resp_size', 'type' => '53' }, '7' => { 'name' => 'cmd_flags', 'type' => '953' }, '8' => { 'name' => 'driver', 'type' => '40016' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_create_cq_ex2' }, '69909' => { 'Header' => undef, 'Line' => '170', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'cq_attr', 'type' => '69870' }, '2' => { 'name' => 'cq', 'type' => '69875' }, '3' => { 'name' => 'cmd', 'type' => '69880' }, '4' => { 'name' => 'cmd_size', 'type' => '53' }, '5' => { 'name' => 'resp', 'type' => '69885' }, '6' => { 'name' => 'resp_size', 'type' => '53' }, '7' => { 'name' => 'cmd_flags', 'type' => '953' } }, 'Return' => '161', 'ShortName' => 
'ibv_cmd_create_cq_ex' }, '70351' => { 'Header' => undef, 'Line' => '156', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'cqe', 'type' => '161' }, '2' => { 'name' => 'channel', 'type' => '15165' }, '3' => { 'name' => 'comp_vector', 'type' => '161' }, '4' => { 'name' => 'cq', 'type' => '9734' }, '5' => { 'name' => 'cmd', 'type' => '70793' }, '6' => { 'name' => 'cmd_size', 'type' => '53' }, '7' => { 'name' => 'resp', 'type' => '70798' }, '8' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_create_cq' }, '92633' => { 'Header' => undef, 'Line' => '724', 'Param' => { '0' => { 'name' => 'buf', 'type' => '221' }, '1' => { 'name' => 'size', 'type' => '53' }, '2' => { 'name' => 'sysfs_dev', 'type' => '91152' }, '3' => { 'name' => 'fnfmt', 'type' => '74950' }, '4' => { 'type' => '-1' } }, 'Return' => '161', 'ShortName' => 'ibv_read_ibdev_sysfs_file' }, '92941' => { 'Header' => undef, 'Line' => '522', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'input', 'type' => '64639' }, '2' => { 'name' => 'attr', 'type' => '64644' }, '3' => { 'name' => 'attr_size', 'type' => '53' }, '4' => { 'name' => 'resp', 'type' => '93526' }, '5' => { 'name' => 'resp_size', 'type' => '93531' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_query_device_any' }, '93536' => { 'Header' => undef, 'Line' => '485', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'entries', 'type' => '94784' }, '2' => { 'name' => 'max_entries', 'type' => '53' }, '3' => { 'name' => 'flags', 'type' => '953' }, '4' => { 'name' => 'entry_size', 'type' => '53' } }, 'Return' => '254', 'ShortName' => '_ibv_query_gid_table' }, '94806' => { 'Header' => undef, 'Line' => '474', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'port_num', 'type' => '953' }, '2' => { 'name' => 'gid_index', 'type' => '953' }, '3' => { 'name' => 'entry', 'type' => '94784' }, '4' => { 'name' => 'flags', 'type' => '953' }, '5' => { 'name' => 'entry_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => '_ibv_query_gid_ex' }, '98989' => { 'Header' => undef, 'Line' => '187', 'Param' => { '0' => { 'name' => 'context', 'type' => '8991' }, '1' => { 'name' => 'driver', 'type' => '40016' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_query_context' }, '99778' => { 'Header' => undef, 'Line' => '175', 'Param' => { '0' => { 'name' => 'context_ex', 'type' => '91187' }, '1' => { 'name' => 'cmd', 'type' => '101119' }, '2' => { 'name' => 'cmd_size', 'type' => '53' }, '3' => { 'name' => 'resp', 'type' => '101124' }, '4' => { 'name' => 'resp_size', 'type' => '53' } }, 'Return' => '161', 'ShortName' => 'ibv_cmd_get_context' } }, 'SymbolVersion' => { '__ibv_ack_async_event_1_0' => 'ibv_ack_async_event@IBVERBS_1.0', '__ibv_ack_async_event_1_1' => 'ibv_ack_async_event@@IBVERBS_1.1', '__ibv_ack_cq_events_1_0' => 'ibv_ack_cq_events@IBVERBS_1.0', '__ibv_ack_cq_events_1_1' => 'ibv_ack_cq_events@@IBVERBS_1.1', '__ibv_alloc_pd_1_0' => 'ibv_alloc_pd@IBVERBS_1.0', '__ibv_alloc_pd_1_1' => 'ibv_alloc_pd@@IBVERBS_1.1', '__ibv_attach_mcast_1_0' => 'ibv_attach_mcast@IBVERBS_1.0', '__ibv_attach_mcast_1_1' => 'ibv_attach_mcast@@IBVERBS_1.1', '__ibv_close_device_1_0' => 'ibv_close_device@IBVERBS_1.0', '__ibv_close_device_1_1' => 'ibv_close_device@@IBVERBS_1.1', '__ibv_create_ah_1_0' => 'ibv_create_ah@IBVERBS_1.0', '__ibv_create_ah_1_1' => 'ibv_create_ah@@IBVERBS_1.1', '__ibv_create_cq_1_0' => 'ibv_create_cq@IBVERBS_1.0', 
'__ibv_create_cq_1_1' => 'ibv_create_cq@@IBVERBS_1.1', '__ibv_create_qp_1_0' => 'ibv_create_qp@IBVERBS_1.0', '__ibv_create_qp_1_1' => 'ibv_create_qp@@IBVERBS_1.1', '__ibv_create_srq_1_0' => 'ibv_create_srq@IBVERBS_1.0', '__ibv_create_srq_1_1' => 'ibv_create_srq@@IBVERBS_1.1', '__ibv_dealloc_pd_1_0' => 'ibv_dealloc_pd@IBVERBS_1.0', '__ibv_dealloc_pd_1_1' => 'ibv_dealloc_pd@@IBVERBS_1.1', '__ibv_dereg_mr_1_0' => 'ibv_dereg_mr@IBVERBS_1.0', '__ibv_dereg_mr_1_1' => 'ibv_dereg_mr@@IBVERBS_1.1', '__ibv_destroy_ah_1_0' => 'ibv_destroy_ah@IBVERBS_1.0', '__ibv_destroy_ah_1_1' => 'ibv_destroy_ah@@IBVERBS_1.1', '__ibv_destroy_cq_1_0' => 'ibv_destroy_cq@IBVERBS_1.0', '__ibv_destroy_cq_1_1' => 'ibv_destroy_cq@@IBVERBS_1.1', '__ibv_destroy_qp_1_0' => 'ibv_destroy_qp@IBVERBS_1.0', '__ibv_destroy_qp_1_1' => 'ibv_destroy_qp@@IBVERBS_1.1', '__ibv_destroy_srq_1_0' => 'ibv_destroy_srq@IBVERBS_1.0', '__ibv_destroy_srq_1_1' => 'ibv_destroy_srq@@IBVERBS_1.1', '__ibv_detach_mcast_1_0' => 'ibv_detach_mcast@IBVERBS_1.0', '__ibv_detach_mcast_1_1' => 'ibv_detach_mcast@@IBVERBS_1.1', '__ibv_free_device_list_1_0' => 'ibv_free_device_list@IBVERBS_1.0', '__ibv_free_device_list_1_1' => 'ibv_free_device_list@@IBVERBS_1.1', '__ibv_get_async_event_1_0' => 'ibv_get_async_event@IBVERBS_1.0', '__ibv_get_async_event_1_1' => 'ibv_get_async_event@@IBVERBS_1.1', '__ibv_get_cq_event_1_0' => 'ibv_get_cq_event@IBVERBS_1.0', '__ibv_get_cq_event_1_1' => 'ibv_get_cq_event@@IBVERBS_1.1', '__ibv_get_device_guid_1_0' => 'ibv_get_device_guid@IBVERBS_1.0', '__ibv_get_device_guid_1_1' => 'ibv_get_device_guid@@IBVERBS_1.1', '__ibv_get_device_list_1_0' => 'ibv_get_device_list@IBVERBS_1.0', '__ibv_get_device_list_1_1' => 'ibv_get_device_list@@IBVERBS_1.1', '__ibv_get_device_name_1_0' => 'ibv_get_device_name@IBVERBS_1.0', '__ibv_get_device_name_1_1' => 'ibv_get_device_name@@IBVERBS_1.1', '__ibv_get_pkey_index_1_5' => 'ibv_get_pkey_index@@IBVERBS_1.5', '__ibv_modify_qp_1_0' => 'ibv_modify_qp@IBVERBS_1.0', '__ibv_modify_qp_1_1' => 'ibv_modify_qp@@IBVERBS_1.1', '__ibv_modify_srq_1_0' => 'ibv_modify_srq@IBVERBS_1.0', '__ibv_modify_srq_1_1' => 'ibv_modify_srq@@IBVERBS_1.1', '__ibv_open_device_1_0' => 'ibv_open_device@IBVERBS_1.0', '__ibv_open_device_1_1' => 'ibv_open_device@@IBVERBS_1.1', '__ibv_query_device_1_0' => 'ibv_query_device@IBVERBS_1.0', '__ibv_query_device_1_1' => 'ibv_query_device@@IBVERBS_1.1', '__ibv_query_gid_1_0' => 'ibv_query_gid@IBVERBS_1.0', '__ibv_query_gid_1_1' => 'ibv_query_gid@@IBVERBS_1.1', '__ibv_query_pkey_1_0' => 'ibv_query_pkey@IBVERBS_1.0', '__ibv_query_pkey_1_1' => 'ibv_query_pkey@@IBVERBS_1.1', '__ibv_query_port_1_0' => 'ibv_query_port@IBVERBS_1.0', '__ibv_query_port_1_1' => 'ibv_query_port@@IBVERBS_1.1', '__ibv_query_qp_1_0' => 'ibv_query_qp@IBVERBS_1.0', '__ibv_query_qp_1_1' => 'ibv_query_qp@@IBVERBS_1.1', '__ibv_query_srq_1_0' => 'ibv_query_srq@IBVERBS_1.0', '__ibv_query_srq_1_1' => 'ibv_query_srq@@IBVERBS_1.1', '__ibv_reg_mr_1_0' => 'ibv_reg_mr@IBVERBS_1.0', '__ibv_reg_mr_1_1' => 'ibv_reg_mr@@IBVERBS_1.1', '__ibv_register_driver_1_1' => 'ibv_register_driver@IBVERBS_1.1', '__ibv_rereg_mr_1_1' => 'ibv_rereg_mr@@IBVERBS_1.1', '__ibv_resize_cq_1_0' => 'ibv_resize_cq@IBVERBS_1.0', '__ibv_resize_cq_1_1' => 'ibv_resize_cq@@IBVERBS_1.1', '__ioctl_final_num_attrs' => '__ioctl_final_num_attrs@@IBVERBS_PRIVATE_34', '__verbs_log' => '__verbs_log@@IBVERBS_PRIVATE_34', '_ibv_query_gid_ex' => '_ibv_query_gid_ex@@IBVERBS_1.11', '_ibv_query_gid_table' => '_ibv_query_gid_table@@IBVERBS_1.11', '_verbs_init_and_alloc_context' => 
'_verbs_init_and_alloc_context@@IBVERBS_PRIVATE_34', 'execute_ioctl' => 'execute_ioctl@@IBVERBS_PRIVATE_34', 'ibv_cmd_advise_mr' => 'ibv_cmd_advise_mr@@IBVERBS_PRIVATE_34', 'ibv_cmd_alloc_dm' => 'ibv_cmd_alloc_dm@@IBVERBS_PRIVATE_34', 'ibv_cmd_alloc_mw' => 'ibv_cmd_alloc_mw@@IBVERBS_PRIVATE_34', 'ibv_cmd_alloc_pd' => 'ibv_cmd_alloc_pd@@IBVERBS_PRIVATE_34', 'ibv_cmd_attach_mcast' => 'ibv_cmd_attach_mcast@@IBVERBS_PRIVATE_34', 'ibv_cmd_close_xrcd' => 'ibv_cmd_close_xrcd@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_ah' => 'ibv_cmd_create_ah@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_counters' => 'ibv_cmd_create_counters@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_cq' => 'ibv_cmd_create_cq@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_cq_ex' => 'ibv_cmd_create_cq_ex@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_cq_ex2' => 'ibv_cmd_create_cq_ex2@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_flow' => 'ibv_cmd_create_flow@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_flow_action_esp' => 'ibv_cmd_create_flow_action_esp@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_qp' => 'ibv_cmd_create_qp@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_qp_ex' => 'ibv_cmd_create_qp_ex@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_qp_ex2' => 'ibv_cmd_create_qp_ex2@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_rwq_ind_table' => 'ibv_cmd_create_rwq_ind_table@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_srq' => 'ibv_cmd_create_srq@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_srq_ex' => 'ibv_cmd_create_srq_ex@@IBVERBS_PRIVATE_34', 'ibv_cmd_create_wq' => 'ibv_cmd_create_wq@@IBVERBS_PRIVATE_34', 'ibv_cmd_dealloc_mw' => 'ibv_cmd_dealloc_mw@@IBVERBS_PRIVATE_34', 'ibv_cmd_dealloc_pd' => 'ibv_cmd_dealloc_pd@@IBVERBS_PRIVATE_34', 'ibv_cmd_dereg_mr' => 'ibv_cmd_dereg_mr@@IBVERBS_PRIVATE_34', 'ibv_cmd_destroy_ah' => 'ibv_cmd_destroy_ah@@IBVERBS_PRIVATE_34', 'ibv_cmd_destroy_counters' => 'ibv_cmd_destroy_counters@@IBVERBS_PRIVATE_34', 'ibv_cmd_destroy_cq' => 'ibv_cmd_destroy_cq@@IBVERBS_PRIVATE_34', 'ibv_cmd_destroy_flow' => 'ibv_cmd_destroy_flow@@IBVERBS_PRIVATE_34', 'ibv_cmd_destroy_flow_action' => 'ibv_cmd_destroy_flow_action@@IBVERBS_PRIVATE_34', 'ibv_cmd_destroy_qp' => 'ibv_cmd_destroy_qp@@IBVERBS_PRIVATE_34', 'ibv_cmd_destroy_rwq_ind_table' => 'ibv_cmd_destroy_rwq_ind_table@@IBVERBS_PRIVATE_34', 'ibv_cmd_destroy_srq' => 'ibv_cmd_destroy_srq@@IBVERBS_PRIVATE_34', 'ibv_cmd_destroy_wq' => 'ibv_cmd_destroy_wq@@IBVERBS_PRIVATE_34', 'ibv_cmd_detach_mcast' => 'ibv_cmd_detach_mcast@@IBVERBS_PRIVATE_34', 'ibv_cmd_free_dm' => 'ibv_cmd_free_dm@@IBVERBS_PRIVATE_34', 'ibv_cmd_get_context' => 'ibv_cmd_get_context@@IBVERBS_PRIVATE_34', 'ibv_cmd_modify_cq' => 'ibv_cmd_modify_cq@@IBVERBS_PRIVATE_34', 'ibv_cmd_modify_flow_action_esp' => 'ibv_cmd_modify_flow_action_esp@@IBVERBS_PRIVATE_34', 'ibv_cmd_modify_qp' => 'ibv_cmd_modify_qp@@IBVERBS_PRIVATE_34', 'ibv_cmd_modify_qp_ex' => 'ibv_cmd_modify_qp_ex@@IBVERBS_PRIVATE_34', 'ibv_cmd_modify_srq' => 'ibv_cmd_modify_srq@@IBVERBS_PRIVATE_34', 'ibv_cmd_modify_wq' => 'ibv_cmd_modify_wq@@IBVERBS_PRIVATE_34', 'ibv_cmd_open_qp' => 'ibv_cmd_open_qp@@IBVERBS_PRIVATE_34', 'ibv_cmd_open_xrcd' => 'ibv_cmd_open_xrcd@@IBVERBS_PRIVATE_34', 'ibv_cmd_poll_cq' => 'ibv_cmd_poll_cq@@IBVERBS_PRIVATE_34', 'ibv_cmd_post_recv' => 'ibv_cmd_post_recv@@IBVERBS_PRIVATE_34', 'ibv_cmd_post_send' => 'ibv_cmd_post_send@@IBVERBS_PRIVATE_34', 'ibv_cmd_post_srq_recv' => 'ibv_cmd_post_srq_recv@@IBVERBS_PRIVATE_34', 'ibv_cmd_query_context' => 'ibv_cmd_query_context@@IBVERBS_PRIVATE_34', 'ibv_cmd_query_device_any' => 'ibv_cmd_query_device_any@@IBVERBS_PRIVATE_34', 'ibv_cmd_query_mr' => 'ibv_cmd_query_mr@@IBVERBS_PRIVATE_34', 
'ibv_cmd_query_port' => 'ibv_cmd_query_port@@IBVERBS_PRIVATE_34', 'ibv_cmd_query_qp' => 'ibv_cmd_query_qp@@IBVERBS_PRIVATE_34', 'ibv_cmd_query_srq' => 'ibv_cmd_query_srq@@IBVERBS_PRIVATE_34', 'ibv_cmd_read_counters' => 'ibv_cmd_read_counters@@IBVERBS_PRIVATE_34', 'ibv_cmd_reg_dm_mr' => 'ibv_cmd_reg_dm_mr@@IBVERBS_PRIVATE_34', 'ibv_cmd_reg_dmabuf_mr' => 'ibv_cmd_reg_dmabuf_mr@@IBVERBS_PRIVATE_34', 'ibv_cmd_reg_mr' => 'ibv_cmd_reg_mr@@IBVERBS_PRIVATE_34', 'ibv_cmd_req_notify_cq' => 'ibv_cmd_req_notify_cq@@IBVERBS_PRIVATE_34', 'ibv_cmd_rereg_mr' => 'ibv_cmd_rereg_mr@@IBVERBS_PRIVATE_34', 'ibv_cmd_resize_cq' => 'ibv_cmd_resize_cq@@IBVERBS_PRIVATE_34', 'ibv_copy_ah_attr_from_kern' => 'ibv_copy_ah_attr_from_kern@@IBVERBS_1.1', 'ibv_copy_path_rec_from_kern' => 'ibv_copy_path_rec_from_kern@@IBVERBS_1.0', 'ibv_copy_path_rec_to_kern' => 'ibv_copy_path_rec_to_kern@@IBVERBS_1.0', 'ibv_copy_qp_attr_from_kern' => 'ibv_copy_qp_attr_from_kern@@IBVERBS_1.0', 'ibv_create_ah_from_wc' => 'ibv_create_ah_from_wc@@IBVERBS_1.1', 'ibv_create_comp_channel' => 'ibv_create_comp_channel@@IBVERBS_1.0', 'ibv_destroy_comp_channel' => 'ibv_destroy_comp_channel@@IBVERBS_1.0', 'ibv_dofork_range' => 'ibv_dofork_range@@IBVERBS_1.1', 'ibv_dontfork_range' => 'ibv_dontfork_range@@IBVERBS_1.1', 'ibv_event_type_str' => 'ibv_event_type_str@@IBVERBS_1.1', 'ibv_fork_init' => 'ibv_fork_init@@IBVERBS_1.1', 'ibv_get_device_index' => 'ibv_get_device_index@@IBVERBS_1.9', 'ibv_get_sysfs_path' => 'ibv_get_sysfs_path@@IBVERBS_1.0', 'ibv_import_device' => 'ibv_import_device@@IBVERBS_1.10', 'ibv_import_dm' => 'ibv_import_dm@@IBVERBS_1.13', 'ibv_import_mr' => 'ibv_import_mr@@IBVERBS_1.10', 'ibv_import_pd' => 'ibv_import_pd@@IBVERBS_1.10', 'ibv_init_ah_from_wc' => 'ibv_init_ah_from_wc@@IBVERBS_1.1', 'ibv_is_fork_initialized' => 'ibv_is_fork_initialized@@IBVERBS_1.13', 'ibv_node_type_str' => 'ibv_node_type_str@@IBVERBS_1.1', 'ibv_port_state_str' => 'ibv_port_state_str@@IBVERBS_1.1', 'ibv_qp_to_qp_ex' => 'ibv_qp_to_qp_ex@@IBVERBS_1.6', 'ibv_query_ece' => 'ibv_query_ece@@IBVERBS_1.10', 'ibv_query_gid_type' => 'ibv_query_gid_type@@IBVERBS_PRIVATE_34', 'ibv_query_qp_data_in_order' => 'ibv_query_qp_data_in_order@@IBVERBS_1.14', 'ibv_rate_to_mbps' => 'ibv_rate_to_mbps@@IBVERBS_1.1', 'ibv_rate_to_mult' => 'ibv_rate_to_mult@@IBVERBS_1.0', 'ibv_read_ibdev_sysfs_file' => 'ibv_read_ibdev_sysfs_file@@IBVERBS_PRIVATE_34', 'ibv_read_sysfs_file' => 'ibv_read_sysfs_file@@IBVERBS_1.0', 'ibv_reg_dmabuf_mr' => 'ibv_reg_dmabuf_mr@@IBVERBS_1.12', 'ibv_reg_mr_iova' => 'ibv_reg_mr_iova@@IBVERBS_1.7', 'ibv_reg_mr_iova2' => 'ibv_reg_mr_iova2@@IBVERBS_1.8', 'ibv_resolve_eth_l2_from_gid' => 'ibv_resolve_eth_l2_from_gid@@IBVERBS_1.1', 'ibv_set_ece' => 'ibv_set_ece@@IBVERBS_1.10', 'ibv_unimport_dm' => 'ibv_unimport_dm@@IBVERBS_1.13', 'ibv_unimport_mr' => 'ibv_unimport_mr@@IBVERBS_1.10', 'ibv_unimport_pd' => 'ibv_unimport_pd@@IBVERBS_1.10', 'ibv_wc_status_str' => 'ibv_wc_status_str@@IBVERBS_1.1', 'ibv_wr_opcode_str' => 'ibv_wr_opcode_str@@IBVERBS_PRIVATE_34', 'mbps_to_ibv_rate' => 'mbps_to_ibv_rate@@IBVERBS_1.1', 'mult_to_ibv_rate' => 'mult_to_ibv_rate@@IBVERBS_1.0', 'verbs_allow_disassociate_destroy' => 'verbs_allow_disassociate_destroy@@IBVERBS_PRIVATE_34', 'verbs_init_cq' => 'verbs_init_cq@@IBVERBS_PRIVATE_34', 'verbs_open_device' => 'verbs_open_device@@IBVERBS_PRIVATE_34', 'verbs_register_driver_34' => 'verbs_register_driver_34@@IBVERBS_PRIVATE_34', 'verbs_set_ops' => 'verbs_set_ops@@IBVERBS_PRIVATE_34', 'verbs_uninit_context' => 'verbs_uninit_context@@IBVERBS_PRIVATE_34' 
}, 'Symbols' => { 'libibverbs.so.1.14.56.0' => { '__ioctl_final_num_attrs@@IBVERBS_PRIVATE_34' => 1, '__verbs_log@@IBVERBS_PRIVATE_34' => 1, '_ibv_query_gid_ex@@IBVERBS_1.11' => 1, '_ibv_query_gid_table@@IBVERBS_1.11' => 1, '_verbs_init_and_alloc_context@@IBVERBS_PRIVATE_34' => 1, 'execute_ioctl@@IBVERBS_PRIVATE_34' => 1, 'ibv_ack_async_event@@IBVERBS_1.1' => 1, 'ibv_ack_async_event@IBVERBS_1.0' => 1, 'ibv_ack_cq_events@@IBVERBS_1.1' => 1, 'ibv_ack_cq_events@IBVERBS_1.0' => 1, 'ibv_alloc_pd@@IBVERBS_1.1' => 1, 'ibv_alloc_pd@IBVERBS_1.0' => 1, 'ibv_attach_mcast@@IBVERBS_1.1' => 1, 'ibv_attach_mcast@IBVERBS_1.0' => 1, 'ibv_close_device@@IBVERBS_1.1' => 1, 'ibv_close_device@IBVERBS_1.0' => 1, 'ibv_cmd_advise_mr@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_alloc_dm@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_alloc_mw@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_alloc_pd@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_attach_mcast@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_close_xrcd@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_ah@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_counters@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_cq@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_cq_ex2@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_cq_ex@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_flow@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_flow_action_esp@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_qp@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_qp_ex2@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_qp_ex@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_rwq_ind_table@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_srq@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_srq_ex@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_create_wq@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_dealloc_mw@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_dealloc_pd@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_dereg_mr@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_destroy_ah@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_destroy_counters@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_destroy_cq@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_destroy_flow@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_destroy_flow_action@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_destroy_qp@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_destroy_rwq_ind_table@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_destroy_srq@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_destroy_wq@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_detach_mcast@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_free_dm@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_get_context@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_modify_cq@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_modify_flow_action_esp@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_modify_qp@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_modify_qp_ex@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_modify_srq@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_modify_wq@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_open_qp@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_open_xrcd@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_poll_cq@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_post_recv@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_post_send@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_post_srq_recv@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_query_context@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_query_device_any@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_query_mr@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_query_port@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_query_qp@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_query_srq@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_read_counters@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_reg_dm_mr@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_reg_dmabuf_mr@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_reg_mr@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_req_notify_cq@@IBVERBS_PRIVATE_34' => 1, 'ibv_cmd_rereg_mr@@IBVERBS_PRIVATE_34' => 1, 
'ibv_cmd_resize_cq@@IBVERBS_PRIVATE_34' => 1, 'ibv_copy_ah_attr_from_kern@@IBVERBS_1.1' => 1, 'ibv_copy_path_rec_from_kern@@IBVERBS_1.0' => 1, 'ibv_copy_path_rec_to_kern@@IBVERBS_1.0' => 1, 'ibv_copy_qp_attr_from_kern@@IBVERBS_1.0' => 1, 'ibv_create_ah@@IBVERBS_1.1' => 1, 'ibv_create_ah@IBVERBS_1.0' => 1, 'ibv_create_ah_from_wc@@IBVERBS_1.1' => 1, 'ibv_create_comp_channel@@IBVERBS_1.0' => 1, 'ibv_create_cq@@IBVERBS_1.1' => 1, 'ibv_create_cq@IBVERBS_1.0' => 1, 'ibv_create_qp@@IBVERBS_1.1' => 1, 'ibv_create_qp@IBVERBS_1.0' => 1, 'ibv_create_srq@@IBVERBS_1.1' => 1, 'ibv_create_srq@IBVERBS_1.0' => 1, 'ibv_dealloc_pd@@IBVERBS_1.1' => 1, 'ibv_dealloc_pd@IBVERBS_1.0' => 1, 'ibv_dereg_mr@@IBVERBS_1.1' => 1, 'ibv_dereg_mr@IBVERBS_1.0' => 1, 'ibv_destroy_ah@@IBVERBS_1.1' => 1, 'ibv_destroy_ah@IBVERBS_1.0' => 1, 'ibv_destroy_comp_channel@@IBVERBS_1.0' => 1, 'ibv_destroy_cq@@IBVERBS_1.1' => 1, 'ibv_destroy_cq@IBVERBS_1.0' => 1, 'ibv_destroy_qp@@IBVERBS_1.1' => 1, 'ibv_destroy_qp@IBVERBS_1.0' => 1, 'ibv_destroy_srq@@IBVERBS_1.1' => 1, 'ibv_destroy_srq@IBVERBS_1.0' => 1, 'ibv_detach_mcast@@IBVERBS_1.1' => 1, 'ibv_detach_mcast@IBVERBS_1.0' => 1, 'ibv_dofork_range@@IBVERBS_1.1' => 1, 'ibv_dontfork_range@@IBVERBS_1.1' => 1, 'ibv_event_type_str@@IBVERBS_1.1' => 1, 'ibv_fork_init@@IBVERBS_1.1' => 1, 'ibv_free_device_list@@IBVERBS_1.1' => 1, 'ibv_free_device_list@IBVERBS_1.0' => 1, 'ibv_get_async_event@@IBVERBS_1.1' => 1, 'ibv_get_async_event@IBVERBS_1.0' => 1, 'ibv_get_cq_event@@IBVERBS_1.1' => 1, 'ibv_get_cq_event@IBVERBS_1.0' => 1, 'ibv_get_device_guid@@IBVERBS_1.1' => 1, 'ibv_get_device_guid@IBVERBS_1.0' => 1, 'ibv_get_device_index@@IBVERBS_1.9' => 1, 'ibv_get_device_list@@IBVERBS_1.1' => 1, 'ibv_get_device_list@IBVERBS_1.0' => 1, 'ibv_get_device_name@@IBVERBS_1.1' => 1, 'ibv_get_device_name@IBVERBS_1.0' => 1, 'ibv_get_pkey_index@@IBVERBS_1.5' => 1, 'ibv_get_sysfs_path@@IBVERBS_1.0' => 1, 'ibv_import_device@@IBVERBS_1.10' => 1, 'ibv_import_dm@@IBVERBS_1.13' => 1, 'ibv_import_mr@@IBVERBS_1.10' => 1, 'ibv_import_pd@@IBVERBS_1.10' => 1, 'ibv_init_ah_from_wc@@IBVERBS_1.1' => 1, 'ibv_is_fork_initialized@@IBVERBS_1.13' => 1, 'ibv_modify_qp@@IBVERBS_1.1' => 1, 'ibv_modify_qp@IBVERBS_1.0' => 1, 'ibv_modify_srq@@IBVERBS_1.1' => 1, 'ibv_modify_srq@IBVERBS_1.0' => 1, 'ibv_node_type_str@@IBVERBS_1.1' => 1, 'ibv_open_device@@IBVERBS_1.1' => 1, 'ibv_open_device@IBVERBS_1.0' => 1, 'ibv_port_state_str@@IBVERBS_1.1' => 1, 'ibv_qp_to_qp_ex@@IBVERBS_1.6' => 1, 'ibv_query_device@@IBVERBS_1.1' => 1, 'ibv_query_device@IBVERBS_1.0' => 1, 'ibv_query_ece@@IBVERBS_1.10' => 1, 'ibv_query_gid@@IBVERBS_1.1' => 1, 'ibv_query_gid@IBVERBS_1.0' => 1, 'ibv_query_gid_type@@IBVERBS_PRIVATE_34' => 1, 'ibv_query_pkey@@IBVERBS_1.1' => 1, 'ibv_query_pkey@IBVERBS_1.0' => 1, 'ibv_query_port@@IBVERBS_1.1' => 1, 'ibv_query_port@IBVERBS_1.0' => 1, 'ibv_query_qp@@IBVERBS_1.1' => 1, 'ibv_query_qp@IBVERBS_1.0' => 1, 'ibv_query_qp_data_in_order@@IBVERBS_1.14' => 1, 'ibv_query_srq@@IBVERBS_1.1' => 1, 'ibv_query_srq@IBVERBS_1.0' => 1, 'ibv_rate_to_mbps@@IBVERBS_1.1' => 1, 'ibv_rate_to_mult@@IBVERBS_1.0' => 1, 'ibv_read_ibdev_sysfs_file@@IBVERBS_PRIVATE_34' => 1, 'ibv_read_sysfs_file@@IBVERBS_1.0' => 1, 'ibv_reg_dmabuf_mr@@IBVERBS_1.12' => 1, 'ibv_reg_mr@@IBVERBS_1.1' => 1, 'ibv_reg_mr@IBVERBS_1.0' => 1, 'ibv_reg_mr_iova2@@IBVERBS_1.8' => 1, 'ibv_reg_mr_iova@@IBVERBS_1.7' => 1, 'ibv_register_driver@IBVERBS_1.1' => 1, 'ibv_rereg_mr@@IBVERBS_1.1' => 1, 'ibv_resize_cq@@IBVERBS_1.1' => 1, 'ibv_resize_cq@IBVERBS_1.0' => 1, 
'ibv_resolve_eth_l2_from_gid@@IBVERBS_1.1' => 1, 'ibv_set_ece@@IBVERBS_1.10' => 1, 'ibv_unimport_dm@@IBVERBS_1.13' => 1, 'ibv_unimport_mr@@IBVERBS_1.10' => 1, 'ibv_unimport_pd@@IBVERBS_1.10' => 1, 'ibv_wc_status_str@@IBVERBS_1.1' => 1, 'ibv_wr_opcode_str@@IBVERBS_PRIVATE_34' => 1, 'mbps_to_ibv_rate@@IBVERBS_1.1' => 1, 'mult_to_ibv_rate@@IBVERBS_1.0' => 1, 'verbs_allow_disassociate_destroy@@IBVERBS_PRIVATE_34' => -1, 'verbs_init_cq@@IBVERBS_PRIVATE_34' => 1, 'verbs_open_device@@IBVERBS_PRIVATE_34' => 1, 'verbs_register_driver_34@@IBVERBS_PRIVATE_34' => 1, 'verbs_set_ops@@IBVERBS_PRIVATE_34' => 1, 'verbs_uninit_context@@IBVERBS_PRIVATE_34' => 1 } }, 'Target' => 'unix', 'TypeInfo' => { '-1' => { 'Name' => '...', 'Type' => 'Intrinsic' }, '1' => { 'Name' => 'void', 'Type' => 'Intrinsic' }, '1001' => { 'BaseType' => '101', 'Header' => undef, 'Line' => '24', 'Name' => '__u16', 'Size' => '2', 'Type' => 'Typedef' }, '10052' => { 'BaseType' => '9940', 'Name' => 'struct ibv_srq*', 'Size' => '8', 'Type' => 'Pointer' }, '10057' => { 'Header' => undef, 'Line' => '1265', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' }, '1' => { 'name' => 'wq_context', 'offset' => '8', 'type' => '82' }, '10' => { 'name' => 'cond', 'offset' => '150', 'type' => '906' }, '11' => { 'name' => 'events_completed', 'offset' => '324', 'type' => '953' }, '12' => { 'name' => 'comp_mask', 'offset' => '328', 'type' => '953' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '11395' }, '3' => { 'name' => 'cq', 'offset' => '36', 'type' => '9734' }, '4' => { 'name' => 'wq_num', 'offset' => '50', 'type' => '953' }, '5' => { 'name' => 'handle', 'offset' => '54', 'type' => '953' }, '6' => { 'name' => 'state', 'offset' => '64', 'type' => '11842' }, '7' => { 'name' => 'wq_type', 'offset' => '68', 'type' => '11772' }, '8' => { 'name' => 'post_recv', 'offset' => '72', 'type' => '14230' }, '9' => { 'name' => 'mutex', 'offset' => '86', 'type' => '832' } }, 'Name' => 'struct ibv_wq', 'Size' => '152', 'Type' => 'Struct' }, '101' => { 'Name' => 'unsigned short', 'Size' => '2', 'Type' => 'Intrinsic' }, '101119' => { 'BaseType' => '90045', 'Name' => 'struct ibv_get_context*', 'Size' => '8', 'Type' => 'Pointer' }, '101124' => { 'BaseType' => '76676', 'Name' => 'struct ib_uverbs_get_context_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '1013' => { 'BaseType' => '70', 'Header' => undef, 'Line' => '27', 'Name' => '__u32', 'Size' => '4', 'Type' => 'Typedef' }, '1025' => { 'BaseType' => '348', 'Header' => undef, 'Line' => '31', 'Name' => '__u64', 'Size' => '8', 'Type' => 'Typedef' }, '10252' => { 'BaseType' => '10057', 'Name' => 'struct ibv_wq*', 'Size' => '8', 'Type' => 'Pointer' }, '10257' => { 'Header' => undef, 'Line' => '485', 'Memb' => { '0' => { 'name' => 'IBV_WC_SUCCESS', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_LOC_LEN_ERR', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_REM_ACCESS_ERR', 'value' => '10' }, '11' => { 'name' => 'IBV_WC_REM_OP_ERR', 'value' => '11' }, '12' => { 'name' => 'IBV_WC_RETRY_EXC_ERR', 'value' => '12' }, '13' => { 'name' => 'IBV_WC_RNR_RETRY_EXC_ERR', 'value' => '13' }, '14' => { 'name' => 'IBV_WC_LOC_RDD_VIOL_ERR', 'value' => '14' }, '15' => { 'name' => 'IBV_WC_REM_INV_RD_REQ_ERR', 'value' => '15' }, '16' => { 'name' => 'IBV_WC_REM_ABORT_ERR', 'value' => '16' }, '17' => { 'name' => 'IBV_WC_INV_EECN_ERR', 'value' => '17' }, '18' => { 'name' => 'IBV_WC_INV_EEC_STATE_ERR', 'value' => '18' }, '19' => { 'name' => 'IBV_WC_FATAL_ERR', 'value' => '19' }, '2' => { 'name' => 
'IBV_WC_LOC_QP_OP_ERR', 'value' => '2' }, '20' => { 'name' => 'IBV_WC_RESP_TIMEOUT_ERR', 'value' => '20' }, '21' => { 'name' => 'IBV_WC_GENERAL_ERR', 'value' => '21' }, '22' => { 'name' => 'IBV_WC_TM_ERR', 'value' => '22' }, '23' => { 'name' => 'IBV_WC_TM_RNDV_INCOMPLETE', 'value' => '23' }, '3' => { 'name' => 'IBV_WC_LOC_EEC_OP_ERR', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_LOC_PROT_ERR', 'value' => '4' }, '5' => { 'name' => 'IBV_WC_WR_FLUSH_ERR', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_MW_BIND_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_BAD_RESP_ERR', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_LOC_ACCESS_ERR', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_REM_INV_REQ_ERR', 'value' => '9' } }, 'Name' => 'enum ibv_wc_status', 'Size' => '4', 'Type' => 'Enum' }, '102963' => { 'BaseType' => '90309', 'Name' => 'struct ibv_query_port*', 'Size' => '8', 'Type' => 'Pointer' }, '1037' => { 'BaseType' => '1001', 'Header' => undef, 'Line' => '25', 'Name' => '__be16', 'Size' => '2', 'Type' => 'Typedef' }, '10418' => { 'Header' => undef, 'Line' => '513', 'Memb' => { '0' => { 'name' => 'IBV_WC_SEND', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_RDMA_WRITE', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_RECV', 'value' => '128' }, '11' => { 'name' => 'IBV_WC_RECV_RDMA_WITH_IMM', 'value' => '129' }, '12' => { 'name' => 'IBV_WC_TM_ADD', 'value' => '130' }, '13' => { 'name' => 'IBV_WC_TM_DEL', 'value' => '131' }, '14' => { 'name' => 'IBV_WC_TM_SYNC', 'value' => '132' }, '15' => { 'name' => 'IBV_WC_TM_RECV', 'value' => '133' }, '16' => { 'name' => 'IBV_WC_TM_NO_TAG', 'value' => '134' }, '17' => { 'name' => 'IBV_WC_DRIVER1', 'value' => '135' }, '18' => { 'name' => 'IBV_WC_DRIVER2', 'value' => '136' }, '19' => { 'name' => 'IBV_WC_DRIVER3', 'value' => '137' }, '2' => { 'name' => 'IBV_WC_RDMA_READ', 'value' => '2' }, '3' => { 'name' => 'IBV_WC_COMP_SWAP', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_FETCH_ADD', 'value' => '4' }, '5' => { 'name' => 'IBV_WC_BIND_MW', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_LOCAL_INV', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_TSO', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_FLUSH', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_ATOMIC_WRITE', 'value' => '9' } }, 'Name' => 'enum ibv_wc_opcode', 'Size' => '4', 'Type' => 'Enum' }, '1049' => { 'BaseType' => '1013', 'Header' => undef, 'Line' => '27', 'Name' => '__be32', 'Size' => '4', 'Type' => 'Typedef' }, '105400' => { 'BaseType' => '52889', 'Name' => 'struct ibv_alloc_dm_attr const', 'Size' => '16', 'Type' => 'Const' }, '1061' => { 'BaseType' => '1025', 'Header' => undef, 'Line' => '29', 'Name' => '__be64', 'Size' => '8', 'Type' => 'Typedef' }, '10686' => { 'Header' => undef, 'Line' => '598', 'Memb' => { '0' => { 'name' => 'imm_data', 'offset' => '0', 'type' => '1049' }, '1' => { 'name' => 'invalidated_rkey', 'offset' => '0', 'type' => '953' } }, 'Size' => '4', 'Type' => 'Union' }, '10723' => { 'Header' => undef, 'Line' => '589', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '965' }, '1' => { 'name' => 'status', 'offset' => '8', 'type' => '10257' }, '10' => { 'name' => 'slid', 'offset' => '66', 'type' => '941' }, '11' => { 'name' => 'sl', 'offset' => '68', 'type' => '929' }, '12' => { 'name' => 'dlid_path_bits', 'offset' => '69', 'type' => '929' }, '2' => { 'name' => 'opcode', 'offset' => '18', 'type' => '10418' }, '3' => { 'name' => 'vendor_err', 'offset' => '22', 'type' => '953' }, '4' => { 'name' => 'byte_len', 'offset' => '32', 'type' => '953' }, '5' => { 'name' => 'unnamed0', 
'offset' => '36', 'type' => '10686' }, '6' => { 'name' => 'qp_num', 'offset' => '40', 'type' => '953' }, '7' => { 'name' => 'src_qp', 'offset' => '50', 'type' => '953' }, '8' => { 'name' => 'wc_flags', 'offset' => '54', 'type' => '70' }, '9' => { 'name' => 'pkey_index', 'offset' => '64', 'type' => '941' } }, 'Name' => 'struct ibv_wc', 'Size' => '48', 'Type' => 'Struct' }, '109907' => { 'Header' => undef, 'Line' => '187', 'Memb' => { '0' => { 'name' => 'dm', 'offset' => '0', 'type' => '52942' }, '1' => { 'name' => 'handle', 'offset' => '50', 'type' => '953' } }, 'Name' => 'struct verbs_dm', 'Size' => '40', 'Type' => 'Struct' }, '10999' => { 'Header' => undef, 'Line' => '625', 'Memb' => { '0' => { 'name' => 'mr', 'offset' => '0', 'type' => '11186' }, '1' => { 'name' => 'addr', 'offset' => '8', 'type' => '965' }, '2' => { 'name' => 'length', 'offset' => '22', 'type' => '965' }, '3' => { 'name' => 'mw_access_flags', 'offset' => '36', 'type' => '70' } }, 'Name' => 'struct ibv_mw_bind_info', 'Size' => '32', 'Type' => 'Struct' }, '11069' => { 'BaseType' => '10999', 'Name' => 'struct ibv_mw_bind_info const', 'Size' => '32', 'Type' => 'Const' }, '11074' => { 'Header' => undef, 'Line' => '668', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '11395' }, '2' => { 'name' => 'addr', 'offset' => '22', 'type' => '82' }, '3' => { 'name' => 'length', 'offset' => '36', 'type' => '53' }, '4' => { 'name' => 'handle', 'offset' => '50', 'type' => '953' }, '5' => { 'name' => 'lkey', 'offset' => '54', 'type' => '953' }, '6' => { 'name' => 'rkey', 'offset' => '64', 'type' => '953' } }, 'Name' => 'struct ibv_mr', 'Size' => '48', 'Type' => 'Struct' }, '11186' => { 'BaseType' => '11074', 'Name' => 'struct ibv_mr*', 'Size' => '8', 'Type' => 'Pointer' }, '11191' => { 'Header' => undef, 'Line' => '632', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' }, '1' => { 'name' => 'handle', 'offset' => '8', 'type' => '953' } }, 'Name' => 'struct ibv_pd', 'Size' => '16', 'Type' => 'Struct' }, '112333' => { 'BaseType' => '109907', 'Name' => 'struct verbs_dm*', 'Size' => '8', 'Type' => 'Pointer' }, '11269' => { 'Header' => undef, 'Line' => '651', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'fd', 'offset' => '4', 'type' => '161' }, '2' => { 'name' => 'oflags', 'offset' => '8', 'type' => '161' } }, 'Name' => 'struct ibv_xrcd_init_attr', 'Size' => '12', 'Type' => 'Struct' }, '11325' => { 'Header' => undef, 'Line' => '657', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' } }, 'Name' => 'struct ibv_xrcd', 'Size' => '8', 'Type' => 'Struct' }, '113808' => { 'BaseType' => '105400', 'Name' => 'struct ibv_alloc_dm_attr const*', 'Size' => '8', 'Type' => 'Pointer' }, '11395' => { 'BaseType' => '11191', 'Name' => 'struct ibv_pd*', 'Size' => '8', 'Type' => 'Pointer' }, '11400' => { 'Header' => undef, 'Line' => '678', 'Memb' => { '0' => { 'name' => 'IBV_MW_TYPE_1', 'value' => '1' }, '1' => { 'name' => 'IBV_MW_TYPE_2', 'value' => '2' } }, 'Name' => 'enum ibv_mw_type', 'Size' => '4', 'Type' => 'Enum' }, '11429' => { 'Header' => undef, 'Line' => '683', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '11395' }, '2' => { 'name' => 'rkey', 'offset' => '22', 'type' => '953' }, '3' => { 'name' => 'handle', 'offset' => '32', 'type' => '953' }, '4' => { 'name' => 'type', 'offset' => 
'36', 'type' => '11400' } }, 'Name' => 'struct ibv_mw', 'Size' => '32', 'Type' => 'Struct' }, '11513' => { 'Header' => undef, 'Line' => '691', 'Memb' => { '0' => { 'name' => 'dgid', 'offset' => '0', 'type' => '8669' }, '1' => { 'name' => 'flow_label', 'offset' => '22', 'type' => '953' }, '2' => { 'name' => 'sgid_index', 'offset' => '32', 'type' => '929' }, '3' => { 'name' => 'hop_limit', 'offset' => '33', 'type' => '929' }, '4' => { 'name' => 'traffic_class', 'offset' => '34', 'type' => '929' } }, 'Name' => 'struct ibv_global_route', 'Size' => '24', 'Type' => 'Struct' }, '11598' => { 'Header' => undef, 'Line' => '762', 'Memb' => { '0' => { 'name' => 'grh', 'offset' => '0', 'type' => '11513' }, '1' => { 'name' => 'dlid', 'offset' => '36', 'type' => '941' }, '2' => { 'name' => 'sl', 'offset' => '38', 'type' => '929' }, '3' => { 'name' => 'src_path_bits', 'offset' => '39', 'type' => '929' }, '4' => { 'name' => 'static_rate', 'offset' => '40', 'type' => '929' }, '5' => { 'name' => 'is_global', 'offset' => '41', 'type' => '929' }, '6' => { 'name' => 'port_num', 'offset' => '48', 'type' => '929' } }, 'Name' => 'struct ibv_ah_attr', 'Size' => '32', 'Type' => 'Struct' }, '11710' => { 'Header' => undef, 'Line' => '777', 'Memb' => { '0' => { 'name' => 'max_wr', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'max_sge', 'offset' => '4', 'type' => '953' }, '2' => { 'name' => 'srq_limit', 'offset' => '8', 'type' => '953' } }, 'Name' => 'struct ibv_srq_attr', 'Size' => '12', 'Type' => 'Struct' }, '11767' => { 'BaseType' => '11325', 'Name' => 'struct ibv_xrcd*', 'Size' => '8', 'Type' => 'Pointer' }, '11772' => { 'Header' => undef, 'Line' => '820', 'Memb' => { '0' => { 'name' => 'IBV_WQT_RQ', 'value' => '0' } }, 'Name' => 'enum ibv_wq_type', 'Size' => '4', 'Type' => 'Enum' }, '11842' => { 'Header' => undef, 'Line' => '848', 'Memb' => { '0' => { 'name' => 'IBV_WQS_RESET', 'value' => '0' }, '1' => { 'name' => 'IBV_WQS_RDY', 'value' => '1' }, '2' => { 'name' => 'IBV_WQS_ERR', 'value' => '2' }, '3' => { 'name' => 'IBV_WQS_UNKNOWN', 'value' => '3' } }, 'Name' => 'enum ibv_wq_state', 'Size' => '4', 'Type' => 'Enum' }, '11924' => { 'Header' => undef, 'Line' => '862', 'Memb' => { '0' => { 'name' => 'attr_mask', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'wq_state', 'offset' => '4', 'type' => '11842' }, '2' => { 'name' => 'curr_wq_state', 'offset' => '8', 'type' => '11842' }, '3' => { 'name' => 'flags', 'offset' => '18', 'type' => '953' }, '4' => { 'name' => 'flags_mask', 'offset' => '22', 'type' => '953' } }, 'Name' => 'struct ibv_wq_attr', 'Size' => '20', 'Type' => 'Struct' }, '12009' => { 'Header' => undef, 'Line' => '880', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' }, '1' => { 'name' => 'ind_tbl_handle', 'offset' => '8', 'type' => '161' }, '2' => { 'name' => 'ind_tbl_num', 'offset' => '18', 'type' => '161' }, '3' => { 'name' => 'comp_mask', 'offset' => '22', 'type' => '953' } }, 'Name' => 'struct ibv_rwq_ind_table', 'Size' => '24', 'Type' => 'Struct' }, '12103' => { 'Header' => undef, 'Line' => '894', 'Memb' => { '0' => { 'name' => 'log_ind_tbl_size', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'ind_tbl', 'offset' => '8', 'type' => '12160' }, '2' => { 'name' => 'comp_mask', 'offset' => '22', 'type' => '953' } }, 'Name' => 'struct ibv_rwq_ind_table_init_attr', 'Size' => '24', 'Type' => 'Struct' }, '12160' => { 'BaseType' => '10252', 'Name' => 'struct ibv_wq**', 'Size' => '8', 'Type' => 'Pointer' }, '12165' => { 'Header' => undef, 'Line' => '901', 
'Memb' => { '0' => { 'name' => 'IBV_QPT_RC', 'value' => '2' }, '1' => { 'name' => 'IBV_QPT_UC', 'value' => '3' }, '2' => { 'name' => 'IBV_QPT_UD', 'value' => '4' }, '3' => { 'name' => 'IBV_QPT_RAW_PACKET', 'value' => '8' }, '4' => { 'name' => 'IBV_QPT_XRC_SEND', 'value' => '9' }, '5' => { 'name' => 'IBV_QPT_XRC_RECV', 'value' => '10' }, '6' => { 'name' => 'IBV_QPT_DRIVER', 'value' => '255' } }, 'Name' => 'enum ibv_qp_type', 'Size' => '4', 'Type' => 'Enum' }, '12224' => { 'Header' => undef, 'Line' => '911', 'Memb' => { '0' => { 'name' => 'max_send_wr', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'max_recv_wr', 'offset' => '4', 'type' => '953' }, '2' => { 'name' => 'max_send_sge', 'offset' => '8', 'type' => '953' }, '3' => { 'name' => 'max_recv_sge', 'offset' => '18', 'type' => '953' }, '4' => { 'name' => 'max_inline_data', 'offset' => '22', 'type' => '953' } }, 'Name' => 'struct ibv_qp_cap', 'Size' => '20', 'Type' => 'Struct' }, '12309' => { 'Header' => undef, 'Line' => '919', 'Memb' => { '0' => { 'name' => 'qp_context', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'send_cq', 'offset' => '8', 'type' => '9734' }, '2' => { 'name' => 'recv_cq', 'offset' => '22', 'type' => '9734' }, '3' => { 'name' => 'srq', 'offset' => '36', 'type' => '10052' }, '4' => { 'name' => 'cap', 'offset' => '50', 'type' => '12224' }, '5' => { 'name' => 'qp_type', 'offset' => '82', 'type' => '12165' }, '6' => { 'name' => 'sq_sig_all', 'offset' => '86', 'type' => '161' } }, 'Name' => 'struct ibv_qp_init_attr', 'Size' => '64', 'Type' => 'Struct' }, '12422' => { 'BaseType' => '12009', 'Name' => 'struct ibv_rwq_ind_table*', 'Size' => '8', 'Type' => 'Pointer' }, '12474' => { 'Header' => undef, 'Line' => '1001', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'qp_num', 'offset' => '4', 'type' => '953' }, '2' => { 'name' => 'xrcd', 'offset' => '8', 'type' => '11767' }, '3' => { 'name' => 'qp_context', 'offset' => '22', 'type' => '82' }, '4' => { 'name' => 'qp_type', 'offset' => '36', 'type' => '12165' } }, 'Name' => 'struct ibv_qp_open_attr', 'Size' => '32', 'Type' => 'Struct' }, '125' => { 'BaseType' => '89', 'Header' => undef, 'Line' => '38', 'Name' => '__uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '12734' => { 'Header' => undef, 'Line' => '1050', 'Memb' => { '0' => { 'name' => 'IBV_QPS_RESET', 'value' => '0' }, '1' => { 'name' => 'IBV_QPS_INIT', 'value' => '1' }, '2' => { 'name' => 'IBV_QPS_RTR', 'value' => '2' }, '3' => { 'name' => 'IBV_QPS_RTS', 'value' => '3' }, '4' => { 'name' => 'IBV_QPS_SQD', 'value' => '4' }, '5' => { 'name' => 'IBV_QPS_SQE', 'value' => '5' }, '6' => { 'name' => 'IBV_QPS_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_QPS_UNKNOWN', 'value' => '7' } }, 'Name' => 'enum ibv_qp_state', 'Size' => '4', 'Type' => 'Enum' }, '12799' => { 'Header' => undef, 'Line' => '1061', 'Memb' => { '0' => { 'name' => 'IBV_MIG_MIGRATED', 'value' => '0' }, '1' => { 'name' => 'IBV_MIG_REARM', 'value' => '1' }, '2' => { 'name' => 'IBV_MIG_ARMED', 'value' => '2' } }, 'Name' => 'enum ibv_mig_state', 'Size' => '4', 'Type' => 'Enum' }, '12834' => { 'Header' => undef, 'Line' => '1067', 'Memb' => { '0' => { 'name' => 'qp_state', 'offset' => '0', 'type' => '12734' }, '1' => { 'name' => 'cur_qp_state', 'offset' => '4', 'type' => '12734' }, '10' => { 'name' => 'ah_attr', 'offset' => '86', 'type' => '11598' }, '11' => { 'name' => 'alt_ah_attr', 'offset' => '136', 'type' => '11598' }, '12' => { 'name' => 'pkey_index', 'offset' => '288', 'type' => '941' }, '13' 
=> { 'name' => 'alt_pkey_index', 'offset' => '290', 'type' => '941' }, '14' => { 'name' => 'en_sqd_async_notify', 'offset' => '292', 'type' => '929' }, '15' => { 'name' => 'sq_draining', 'offset' => '293', 'type' => '929' }, '16' => { 'name' => 'max_rd_atomic', 'offset' => '294', 'type' => '929' }, '17' => { 'name' => 'max_dest_rd_atomic', 'offset' => '295', 'type' => '929' }, '18' => { 'name' => 'min_rnr_timer', 'offset' => '296', 'type' => '929' }, '19' => { 'name' => 'port_num', 'offset' => '297', 'type' => '929' }, '2' => { 'name' => 'path_mtu', 'offset' => '8', 'type' => '9546' }, '20' => { 'name' => 'timeout', 'offset' => '304', 'type' => '929' }, '21' => { 'name' => 'retry_cnt', 'offset' => '305', 'type' => '929' }, '22' => { 'name' => 'rnr_retry', 'offset' => '306', 'type' => '929' }, '23' => { 'name' => 'alt_port_num', 'offset' => '307', 'type' => '929' }, '24' => { 'name' => 'alt_timeout', 'offset' => '308', 'type' => '929' }, '25' => { 'name' => 'rate_limit', 'offset' => '310', 'type' => '953' }, '3' => { 'name' => 'path_mig_state', 'offset' => '18', 'type' => '12799' }, '4' => { 'name' => 'qkey', 'offset' => '22', 'type' => '953' }, '5' => { 'name' => 'rq_psn', 'offset' => '32', 'type' => '953' }, '6' => { 'name' => 'sq_psn', 'offset' => '36', 'type' => '953' }, '7' => { 'name' => 'dest_qp_num', 'offset' => '40', 'type' => '953' }, '8' => { 'name' => 'qp_access_flags', 'offset' => '50', 'type' => '70' }, '9' => { 'name' => 'cap', 'offset' => '54', 'type' => '12224' } }, 'Name' => 'struct ibv_qp_attr', 'Size' => '144', 'Type' => 'Struct' }, '13213' => { 'Header' => undef, 'Line' => '1103', 'Memb' => { '0' => { 'name' => 'IBV_WR_RDMA_WRITE', 'value' => '0' }, '1' => { 'name' => 'IBV_WR_RDMA_WRITE_WITH_IMM', 'value' => '1' }, '10' => { 'name' => 'IBV_WR_TSO', 'value' => '10' }, '11' => { 'name' => 'IBV_WR_DRIVER1', 'value' => '11' }, '12' => { 'name' => 'IBV_WR_FLUSH', 'value' => '14' }, '13' => { 'name' => 'IBV_WR_ATOMIC_WRITE', 'value' => '15' }, '2' => { 'name' => 'IBV_WR_SEND', 'value' => '2' }, '3' => { 'name' => 'IBV_WR_SEND_WITH_IMM', 'value' => '3' }, '4' => { 'name' => 'IBV_WR_RDMA_READ', 'value' => '4' }, '5' => { 'name' => 'IBV_WR_ATOMIC_CMP_AND_SWP', 'value' => '5' }, '6' => { 'name' => 'IBV_WR_ATOMIC_FETCH_AND_ADD', 'value' => '6' }, '7' => { 'name' => 'IBV_WR_LOCAL_INV', 'value' => '7' }, '8' => { 'name' => 'IBV_WR_BIND_MW', 'value' => '8' }, '9' => { 'name' => 'IBV_WR_SEND_WITH_INV', 'value' => '9' } }, 'Name' => 'enum ibv_wr_opcode', 'Size' => '4', 'Type' => 'Enum' }, '13314' => { 'Header' => undef, 'Line' => '1140', 'Memb' => { '0' => { 'name' => 'addr', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '53' } }, 'Name' => 'struct ibv_data_buf', 'Size' => '16', 'Type' => 'Struct' }, '13357' => { 'BaseType' => '13314', 'Name' => 'struct ibv_data_buf const', 'Size' => '16', 'Type' => 'Const' }, '13362' => { 'Header' => undef, 'Line' => '1145', 'Memb' => { '0' => { 'name' => 'addr', 'offset' => '0', 'type' => '965' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '953' }, '2' => { 'name' => 'lkey', 'offset' => '18', 'type' => '953' } }, 'Name' => 'struct ibv_sge', 'Size' => '16', 'Type' => 'Struct' }, '13419' => { 'BaseType' => '13362', 'Name' => 'struct ibv_sge const', 'Size' => '16', 'Type' => 'Const' }, '13424' => { 'Header' => undef, 'Line' => '1161', 'Memb' => { '0' => { 'name' => 'imm_data', 'offset' => '0', 'type' => '1049' }, '1' => { 'name' => 'invalidate_rkey', 'offset' => '0', 'type' => '953' } }, 'Size' 
=> '4', 'Type' => 'Union' }, '13461' => { 'Header' => undef, 'Line' => '1166', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '965' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '953' } }, 'Size' => '16', 'Type' => 'Struct' }, '13499' => { 'Header' => undef, 'Line' => '1170', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '965' }, '1' => { 'name' => 'compare_add', 'offset' => '8', 'type' => '965' }, '2' => { 'name' => 'swap', 'offset' => '22', 'type' => '965' }, '3' => { 'name' => 'rkey', 'offset' => '36', 'type' => '953' } }, 'Size' => '32', 'Type' => 'Struct' }, '13565' => { 'Header' => undef, 'Line' => '1176', 'Memb' => { '0' => { 'name' => 'ah', 'offset' => '0', 'type' => '13672' }, '1' => { 'name' => 'remote_qpn', 'offset' => '8', 'type' => '953' }, '2' => { 'name' => 'remote_qkey', 'offset' => '18', 'type' => '953' } }, 'Size' => '16', 'Type' => 'Struct' }, '13616' => { 'Header' => undef, 'Line' => '1695', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '11395' }, '2' => { 'name' => 'handle', 'offset' => '22', 'type' => '953' } }, 'Name' => 'struct ibv_ah', 'Size' => '24', 'Type' => 'Struct' }, '13672' => { 'BaseType' => '13616', 'Name' => 'struct ibv_ah*', 'Size' => '8', 'Type' => 'Pointer' }, '13677' => { 'Header' => undef, 'Line' => '1165', 'Memb' => { '0' => { 'name' => 'rdma', 'offset' => '0', 'type' => '13461' }, '1' => { 'name' => 'atomic', 'offset' => '0', 'type' => '13499' }, '2' => { 'name' => 'ud', 'offset' => '0', 'type' => '13565' } }, 'Size' => '32', 'Type' => 'Union' }, '13726' => { 'Header' => undef, 'Line' => '1183', 'Memb' => { '0' => { 'name' => 'remote_srqn', 'offset' => '0', 'type' => '953' } }, 'Size' => '4', 'Type' => 'Struct' }, '13750' => { 'Header' => undef, 'Line' => '1182', 'Memb' => { '0' => { 'name' => 'xrc', 'offset' => '0', 'type' => '13726' } }, 'Size' => '4', 'Type' => 'Union' }, '13774' => { 'Header' => undef, 'Line' => '1188', 'Memb' => { '0' => { 'name' => 'mw', 'offset' => '0', 'type' => '13825' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '953' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '10999' } }, 'Size' => '48', 'Type' => 'Struct' }, '13825' => { 'BaseType' => '11429', 'Name' => 'struct ibv_mw*', 'Size' => '8', 'Type' => 'Pointer' }, '13830' => { 'Header' => undef, 'Line' => '1193', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'hdr_sz', 'offset' => '8', 'type' => '941' }, '2' => { 'name' => 'mss', 'offset' => '16', 'type' => '941' } }, 'Size' => '16', 'Type' => 'Struct' }, '13882' => { 'Header' => undef, 'Line' => '1187', 'Memb' => { '0' => { 'name' => 'bind_mw', 'offset' => '0', 'type' => '13774' }, '1' => { 'name' => 'tso', 'offset' => '0', 'type' => '13830' } }, 'Size' => '48', 'Type' => 'Union' }, '13919' => { 'Header' => undef, 'Line' => '1151', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '965' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '14057' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '14062' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '161' }, '4' => { 'name' => 'opcode', 'offset' => '40', 'type' => '13213' }, '5' => { 'name' => 'send_flags', 'offset' => '50', 'type' => '70' }, '6' => { 'name' => 'unnamed0', 'offset' => '54', 'type' => '13424' }, '7' => { 'name' => 'wr', 'offset' => '64', 'type' => '13677' }, '8' => { 'name' => 'qp_type', 
'offset' => '114', 'type' => '13750' }, '9' => { 'name' => 'unnamed1', 'offset' => '128', 'type' => '13882' } }, 'Name' => 'struct ibv_send_wr', 'Size' => '128', 'Type' => 'Struct' }, '14057' => { 'BaseType' => '13919', 'Name' => 'struct ibv_send_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '14062' => { 'BaseType' => '13362', 'Name' => 'struct ibv_sge*', 'Size' => '8', 'Type' => 'Pointer' }, '14067' => { 'Header' => undef, 'Line' => '1201', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '965' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '14138' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '14062' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '161' } }, 'Name' => 'struct ibv_recv_wr', 'Size' => '32', 'Type' => 'Struct' }, '14138' => { 'BaseType' => '14067', 'Name' => 'struct ibv_recv_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '14143' => { 'Header' => undef, 'Line' => '1237', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '965' }, '1' => { 'name' => 'send_flags', 'offset' => '8', 'type' => '70' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '10999' } }, 'Name' => 'struct ibv_mw_bind', 'Size' => '48', 'Type' => 'Struct' }, '14225' => { 'BaseType' => '14138', 'Name' => 'struct ibv_recv_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '14230' => { 'Name' => 'int(*)(struct ibv_wq*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '10252' }, '1' => { 'type' => '14138' }, '2' => { 'type' => '14225' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '14235' => { 'Header' => undef, 'Line' => '1300', 'Memb' => { '0' => { 'name' => 'qp_base', 'offset' => '0', 'type' => '9739' }, '1' => { 'name' => 'comp_mask', 'offset' => '352', 'type' => '965' }, '10' => { 'name' => 'wr_rdma_write_imm', 'offset' => '562', 'type' => '14794' }, '11' => { 'name' => 'wr_send', 'offset' => '576', 'type' => '14810' }, '12' => { 'name' => 'wr_send_imm', 'offset' => '584', 'type' => '14831' }, '13' => { 'name' => 'wr_send_inv', 'offset' => '598', 'type' => '14737' }, '14' => { 'name' => 'wr_send_tso', 'offset' => '612', 'type' => '14862' }, '15' => { 'name' => 'wr_set_ud_addr', 'offset' => '626', 'type' => '14893' }, '16' => { 'name' => 'wr_set_xrc_srqn', 'offset' => '640', 'type' => '14737' }, '17' => { 'name' => 'wr_set_inline_data', 'offset' => '648', 'type' => '14919' }, '18' => { 'name' => 'wr_set_inline_data_list', 'offset' => '662', 'type' => '14950' }, '19' => { 'name' => 'wr_set_sge', 'offset' => '772', 'type' => '14981' }, '2' => { 'name' => 'wr_id', 'offset' => '360', 'type' => '965' }, '20' => { 'name' => 'wr_set_sge_list', 'offset' => '786', 'type' => '15012' }, '21' => { 'name' => 'wr_start', 'offset' => '800', 'type' => '14810' }, '22' => { 'name' => 'wr_complete', 'offset' => '808', 'type' => '15032' }, '23' => { 'name' => 'wr_abort', 'offset' => '822', 'type' => '14810' }, '24' => { 'name' => 'wr_atomic_write', 'offset' => '836', 'type' => '15063' }, '25' => { 'name' => 'wr_flush', 'offset' => '850', 'type' => '15104' }, '3' => { 'name' => 'wr_flags', 'offset' => '374', 'type' => '70' }, '4' => { 'name' => 'wr_atomic_cmp_swp', 'offset' => '388', 'type' => '14649' }, '5' => { 'name' => 'wr_atomic_fetch_add', 'offset' => '402', 'type' => '14680' }, '6' => { 'name' => 'wr_bind_mw', 'offset' => '512', 'type' => '14716' }, '7' => { 'name' => 'wr_local_inv', 'offset' => '520', 'type' => '14737' }, '8' => { 'name' => 'wr_rdma_read', 'offset' => '534', 'type' => '14763' }, '9' => { 
'name' => 'wr_rdma_write', 'offset' => '548', 'type' => '14763' } }, 'Name' => 'struct ibv_qp_ex', 'Size' => '360', 'Type' => 'Struct' }, '1430' => { 'Header' => undef, 'Line' => '158', 'Memb' => { '0' => { 'name' => 'command', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'in_words', 'offset' => '4', 'type' => '1001' }, '2' => { 'name' => 'out_words', 'offset' => '6', 'type' => '1001' } }, 'Name' => 'struct ib_uverbs_cmd_hdr', 'Size' => '8', 'Type' => 'Struct' }, '14644' => { 'BaseType' => '14235', 'Name' => 'struct ibv_qp_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '14649' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t, uint64_t, uint64_t)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '953' }, '2' => { 'type' => '965' }, '3' => { 'type' => '965' }, '4' => { 'type' => '965' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '14680' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t, uint64_t)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '953' }, '2' => { 'type' => '965' }, '3' => { 'type' => '965' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '14711' => { 'BaseType' => '11069', 'Name' => 'struct ibv_mw_bind_info const*', 'Size' => '8', 'Type' => 'Pointer' }, '14716' => { 'Name' => 'void(*)(struct ibv_qp_ex*, struct ibv_mw*, uint32_t, struct ibv_mw_bind_info const*)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '13825' }, '2' => { 'type' => '953' }, '3' => { 'type' => '14711' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '14737' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '953' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '14763' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '953' }, '2' => { 'type' => '965' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '14794' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t, __be32)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '953' }, '2' => { 'type' => '965' }, '3' => { 'type' => '1049' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '14810' => { 'Name' => 'void(*)(struct ibv_qp_ex*)', 'Param' => { '0' => { 'type' => '14644' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '1483' => { 'Header' => undef, 'Line' => '164', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'provider_in_words', 'offset' => '8', 'type' => '1001' }, '2' => { 'name' => 'provider_out_words', 'offset' => '16', 'type' => '1001' }, '3' => { 'name' => 'cmd_hdr_reserved', 'offset' => '18', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_ex_cmd_hdr', 'Size' => '16', 'Type' => 'Struct' }, '14831' => { 'Name' => 'void(*)(struct ibv_qp_ex*, __be32)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '1049' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '14862' => { 'Name' => 'void(*)(struct ibv_qp_ex*, void*, uint16_t, uint16_t)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '82' }, '2' => { 'type' => '941' }, '3' => { 'type' => '941' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '14893' => { 'Name' => 'void(*)(struct ibv_qp_ex*, struct ibv_ah*, uint32_t, uint32_t)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '13672' }, '2' => { 'type' => '953' }, '3' => { 'type' => '953' } }, 'Return' => '1', 'Size' => 
'8', 'Type' => 'FuncPtr' }, '149' => { 'BaseType' => '101', 'Header' => undef, 'Line' => '40', 'Name' => '__uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '14919' => { 'Name' => 'void(*)(struct ibv_qp_ex*, void*, size_t)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '82' }, '2' => { 'type' => '53' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '14945' => { 'BaseType' => '13357', 'Name' => 'struct ibv_data_buf const*', 'Size' => '8', 'Type' => 'Pointer' }, '14950' => { 'Name' => 'void(*)(struct ibv_qp_ex*, size_t, struct ibv_data_buf const*)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '53' }, '2' => { 'type' => '14945' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '14981' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t, uint32_t)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '953' }, '2' => { 'type' => '965' }, '3' => { 'type' => '953' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '15007' => { 'BaseType' => '13419', 'Name' => 'struct ibv_sge const*', 'Size' => '8', 'Type' => 'Pointer' }, '15012' => { 'Name' => 'void(*)(struct ibv_qp_ex*, size_t, struct ibv_sge const*)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '53' }, '2' => { 'type' => '15007' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '150282' => { 'BaseType' => '23083', 'Name' => 'struct verbs_flow_action*', 'Size' => '8', 'Type' => 'Pointer' }, '15032' => { 'Name' => 'int(*)(struct ibv_qp_ex*)', 'Param' => { '0' => { 'type' => '14644' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '15063' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t, void const*)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '953' }, '2' => { 'type' => '965' }, '3' => { 'type' => '918' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '15104' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t, size_t, uint8_t, uint8_t)', 'Param' => { '0' => { 'type' => '14644' }, '1' => { 'type' => '953' }, '2' => { 'type' => '965' }, '3' => { 'type' => '53' }, '4' => { 'type' => '929' }, '5' => { 'type' => '929' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '15109' => { 'Header' => undef, 'Line' => '1502', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' }, '1' => { 'name' => 'fd', 'offset' => '8', 'type' => '161' }, '2' => { 'name' => 'refcnt', 'offset' => '18', 'type' => '161' } }, 'Name' => 'struct ibv_comp_channel', 'Size' => '16', 'Type' => 'Struct' }, '15165' => { 'BaseType' => '15109', 'Name' => 'struct ibv_comp_channel*', 'Size' => '8', 'Type' => 'Pointer' }, '15199' => { 'Header' => undef, 'Line' => '1577', 'Memb' => { '0' => { 'name' => 'cq_count', 'offset' => '0', 'type' => '941' }, '1' => { 'name' => 'cq_period', 'offset' => '2', 'type' => '941' } }, 'Name' => 'struct ibv_moderate_cq', 'Size' => '4', 'Type' => 'Struct' }, '15242' => { 'Header' => undef, 'Line' => '1582', 'Memb' => { '0' => { 'name' => 'attr_mask', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'moderate', 'offset' => '4', 'type' => '15199' } }, 'Name' => 'struct ibv_modify_cq_attr', 'Size' => '8', 'Type' => 'Struct' }, '15285' => { 'Header' => undef, 'Line' => '1707', 'Memb' => { '0' => { 'name' => 'IBV_FLOW_ATTR_NORMAL', 'value' => '0' }, '1' => { 'name' => 'IBV_FLOW_ATTR_ALL_DEFAULT', 'value' => '1' }, '2' => { 'name' => 'IBV_FLOW_ATTR_MC_DEFAULT', 'value' => '2' }, '3' => { 'name' => 'IBV_FLOW_ATTR_SNIFFER', 'value' => '3' } }, 'Name' => 
'enum ibv_flow_attr_type', 'Size' => '4', 'Type' => 'Enum' }, '153588' => { 'BaseType' => '82', 'Name' => 'void**', 'Size' => '8', 'Type' => 'Pointer' }, '1549' => { 'BaseType' => '1025', 'Name' => '__u64[]', 'Size' => '8', 'Type' => 'Array' }, '1564' => { 'BaseType' => '989', 'Name' => '__u8[7]', 'Size' => '7', 'Type' => 'Array' }, '1580' => { 'Header' => undef, 'Line' => '321', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'driver_data', 'offset' => '8', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_alloc_pd', 'Size' => '8', 'Type' => 'Struct' }, '161' => { 'Name' => 'int', 'Size' => '4', 'Type' => 'Intrinsic' }, '1620' => { 'Header' => undef, 'Line' => '326', 'Memb' => { '0' => { 'name' => 'pd_handle', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'driver_data', 'offset' => '4', 'type' => '1663' } }, 'Name' => 'struct ib_uverbs_alloc_pd_resp', 'Size' => '4', 'Type' => 'Struct' }, '1663' => { 'BaseType' => '1013', 'Name' => '__u32[]', 'Size' => '8', 'Type' => 'Array' }, '16763' => { 'Header' => undef, 'Line' => '1940', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' } }, 'Name' => 'struct ibv_flow_action', 'Size' => '8', 'Type' => 'Struct' }, '1678' => { 'Header' => undef, 'Line' => '335', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'fd', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'oflags', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_open_xrcd', 'Size' => '16', 'Type' => 'Struct' }, '16859' => { 'Header' => undef, 'Line' => '2105', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' } }, 'Name' => 'struct ibv_counters', 'Size' => '8', 'Type' => 'Struct' }, '16888' => { 'BaseType' => '16859', 'Name' => 'struct ibv_counters*', 'Size' => '8', 'Type' => 'Pointer' }, '17145' => { 'Header' => undef, 'Line' => '1920', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'type', 'offset' => '4', 'type' => '15285' }, '2' => { 'name' => 'size', 'offset' => '8', 'type' => '941' }, '3' => { 'name' => 'priority', 'offset' => '16', 'type' => '941' }, '4' => { 'name' => 'num_of_specs', 'offset' => '18', 'type' => '929' }, '5' => { 'name' => 'port', 'offset' => '19', 'type' => '929' }, '6' => { 'name' => 'flags', 'offset' => '22', 'type' => '953' } }, 'Name' => 'struct ibv_flow_attr', 'Size' => '20', 'Type' => 'Struct' }, '17258' => { 'Header' => undef, 'Line' => '1934', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'context', 'offset' => '8', 'type' => '8991' }, '2' => { 'name' => 'handle', 'offset' => '22', 'type' => '953' } }, 'Name' => 'struct ibv_flow', 'Size' => '24', 'Type' => 'Struct' }, '173' => { 'BaseType' => '70', 'Header' => undef, 'Line' => '42', 'Name' => '__uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '17315' => { 'Header' => undef, 'Line' => '1969', 'Memb' => { '0' => { 'name' => '_dummy1', 'offset' => '0', 'type' => '17495' }, '1' => { 'name' => '_dummy2', 'offset' => '8', 'type' => '17511' } }, 'Name' => 'struct _ibv_device_ops', 'Size' => '16', 'Type' => 'Struct' }, '17378' => { 'BaseType' => '17383', 'Name' => 'struct ibv_device*', 'Size' => '8', 'Type' => 'Pointer' }, '17383' => { 'Header' => undef, 'Line' => '1979', 'Memb' => { '0' => { 'name' => '_ops', 'offset' => '0', 'type' => '17315' }, '1' => { 
'name' => 'node_type', 'offset' => '22', 'type' => '8728' }, '2' => { 'name' => 'transport_type', 'offset' => '32', 'type' => '8792' }, '3' => { 'name' => 'name', 'offset' => '36', 'type' => '9530' }, '4' => { 'name' => 'dev_name', 'offset' => '136', 'type' => '9530' }, '5' => { 'name' => 'dev_path', 'offset' => '338', 'type' => '17542' }, '6' => { 'name' => 'ibdev_path', 'offset' => '1032', 'type' => '17542' } }, 'Name' => 'struct ibv_device', 'Size' => '664', 'Type' => 'Struct' }, '1745' => { 'Header' => undef, 'Line' => '342', 'Memb' => { '0' => { 'name' => 'xrcd_handle', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'driver_data', 'offset' => '4', 'type' => '1663' } }, 'Name' => 'struct ib_uverbs_open_xrcd_resp', 'Size' => '4', 'Type' => 'Struct' }, '17495' => { 'Name' => 'struct ibv_context*(*)(struct ibv_device*, int)', 'Param' => { '0' => { 'type' => '17378' }, '1' => { 'type' => '161' } }, 'Return' => '8991', 'Size' => '8', 'Type' => 'FuncPtr' }, '17511' => { 'Name' => 'void(*)(struct ibv_context*)', 'Param' => { '0' => { 'type' => '8991' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '17542' => { 'BaseType' => '226', 'Name' => 'char[256]', 'Size' => '256', 'Type' => 'Array' }, '17558' => { 'Header' => undef, 'Line' => '1994', 'Memb' => { '0' => { 'name' => '_compat_query_device', 'offset' => '0', 'type' => '18045' }, '1' => { 'name' => '_compat_query_port', 'offset' => '8', 'type' => '18085' }, '10' => { 'name' => '_compat_create_cq', 'offset' => '128', 'type' => '18095' }, '11' => { 'name' => 'poll_cq', 'offset' => '136', 'type' => '18210' }, '12' => { 'name' => 'req_notify_cq', 'offset' => '150', 'type' => '18235' }, '13' => { 'name' => '_compat_cq_event', 'offset' => '260', 'type' => '18095' }, '14' => { 'name' => '_compat_resize_cq', 'offset' => '274', 'type' => '18095' }, '15' => { 'name' => '_compat_destroy_cq', 'offset' => '288', 'type' => '18095' }, '16' => { 'name' => '_compat_create_srq', 'offset' => '296', 'type' => '18095' }, '17' => { 'name' => '_compat_modify_srq', 'offset' => '310', 'type' => '18095' }, '18' => { 'name' => '_compat_query_srq', 'offset' => '324', 'type' => '18095' }, '19' => { 'name' => '_compat_destroy_srq', 'offset' => '338', 'type' => '18095' }, '2' => { 'name' => '_compat_alloc_pd', 'offset' => '22', 'type' => '18095' }, '20' => { 'name' => 'post_srq_recv', 'offset' => '352', 'type' => '18265' }, '21' => { 'name' => '_compat_create_qp', 'offset' => '360', 'type' => '18095' }, '22' => { 'name' => '_compat_query_qp', 'offset' => '374', 'type' => '18095' }, '23' => { 'name' => '_compat_modify_qp', 'offset' => '388', 'type' => '18095' }, '24' => { 'name' => '_compat_destroy_qp', 'offset' => '402', 'type' => '18095' }, '25' => { 'name' => 'post_send', 'offset' => '512', 'type' => '18300' }, '26' => { 'name' => 'post_recv', 'offset' => '520', 'type' => '18330' }, '27' => { 'name' => '_compat_create_ah', 'offset' => '534', 'type' => '18095' }, '28' => { 'name' => '_compat_destroy_ah', 'offset' => '548', 'type' => '18095' }, '29' => { 'name' => '_compat_attach_mcast', 'offset' => '562', 'type' => '18095' }, '3' => { 'name' => '_compat_dealloc_pd', 'offset' => '36', 'type' => '18095' }, '30' => { 'name' => '_compat_detach_mcast', 'offset' => '576', 'type' => '18095' }, '31' => { 'name' => '_compat_async_event', 'offset' => '584', 'type' => '18095' }, '4' => { 'name' => '_compat_reg_mr', 'offset' => '50', 'type' => '18095' }, '5' => { 'name' => '_compat_rereg_mr', 'offset' => '64', 'type' => '18095' }, '6' => { 'name' => 
'_compat_dereg_mr', 'offset' => '72', 'type' => '18095' }, '7' => { 'name' => 'alloc_mw', 'offset' => '86', 'type' => '18120' }, '8' => { 'name' => 'bind_mw', 'offset' => '100', 'type' => '18155' }, '9' => { 'name' => 'dealloc_mw', 'offset' => '114', 'type' => '18175' } }, 'Name' => 'struct ibv_context_ops', 'Size' => '256', 'Type' => 'Struct' }, '1788' => { 'Header' => undef, 'Line' => '351', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'start', 'offset' => '8', 'type' => '1025' }, '2' => { 'name' => 'length', 'offset' => '22', 'type' => '1025' }, '3' => { 'name' => 'hca_va', 'offset' => '36', 'type' => '1025' }, '4' => { 'name' => 'pd_handle', 'offset' => '50', 'type' => '1013' }, '5' => { 'name' => 'access_flags', 'offset' => '54', 'type' => '1013' }, '6' => { 'name' => 'driver_data', 'offset' => '64', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_reg_mr', 'Size' => '40', 'Type' => 'Struct' }, '18040' => { 'BaseType' => '8996', 'Name' => 'struct ibv_device_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '18045' => { 'Name' => 'int(*)(struct ibv_context*, struct ibv_device_attr*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '18040' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '18075' => { 'BaseType' => '18080', 'Name' => 'struct _compat_ibv_port_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '18080' => { 'Header' => undef, 'Line' => '197', 'Memb' => { '0' => { 'name' => 'state', 'offset' => '0', 'type' => '54578' }, '1' => { 'name' => 'max_mtu', 'offset' => '4', 'type' => '9546' }, '10' => { 'name' => 'sm_lid', 'offset' => '54', 'type' => '941' }, '11' => { 'name' => 'lmc', 'offset' => '56', 'type' => '929' }, '12' => { 'name' => 'max_vl_num', 'offset' => '57', 'type' => '929' }, '13' => { 'name' => 'sm_sl', 'offset' => '64', 'type' => '929' }, '14' => { 'name' => 'subnet_timeout', 'offset' => '65', 'type' => '929' }, '15' => { 'name' => 'init_type_reply', 'offset' => '66', 'type' => '929' }, '16' => { 'name' => 'active_width', 'offset' => '67', 'type' => '929' }, '17' => { 'name' => 'active_speed', 'offset' => '68', 'type' => '929' }, '18' => { 'name' => 'phys_state', 'offset' => '69', 'type' => '929' }, '19' => { 'name' => 'link_layer', 'offset' => '70', 'type' => '929' }, '2' => { 'name' => 'active_mtu', 'offset' => '8', 'type' => '9546' }, '20' => { 'name' => 'flags', 'offset' => '71', 'type' => '929' }, '3' => { 'name' => 'gid_tbl_len', 'offset' => '18', 'type' => '161' }, '4' => { 'name' => 'port_cap_flags', 'offset' => '22', 'type' => '953' }, '5' => { 'name' => 'max_msg_sz', 'offset' => '32', 'type' => '953' }, '6' => { 'name' => 'bad_pkey_cntr', 'offset' => '36', 'type' => '953' }, '7' => { 'name' => 'qkey_viol_cntr', 'offset' => '40', 'type' => '953' }, '8' => { 'name' => 'pkey_tbl_len', 'offset' => '50', 'type' => '941' }, '9' => { 'name' => 'lid', 'offset' => '52', 'type' => '941' } }, 'Name' => 'struct _compat_ibv_port_attr', 'Size' => '48', 'Type' => 'Struct' }, '18085' => { 'Name' => 'int(*)(struct ibv_context*, uint8_t, struct _compat_ibv_port_attr*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '929' }, '2' => { 'type' => '18075' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '18095' => { 'Name' => 'void*(*)()', 'Return' => '82', 'Size' => '8', 'Type' => 'FuncPtr' }, '18120' => { 'Name' => 'struct ibv_mw*(*)(struct ibv_pd*, enum ibv_mw_type)', 'Param' => { '0' => { 'type' => '11395' }, '1' => { 'type' => '11400' } }, 'Return' => '13825', 'Size' => 
'8', 'Type' => 'FuncPtr' }, '18150' => { 'BaseType' => '14143', 'Name' => 'struct ibv_mw_bind*', 'Size' => '8', 'Type' => 'Pointer' }, '18155' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_mw*, struct ibv_mw_bind*)', 'Param' => { '0' => { 'type' => '9935' }, '1' => { 'type' => '13825' }, '2' => { 'type' => '18150' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '18175' => { 'Name' => 'int(*)(struct ibv_mw*)', 'Param' => { '0' => { 'type' => '13825' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '18205' => { 'BaseType' => '10723', 'Name' => 'struct ibv_wc*', 'Size' => '8', 'Type' => 'Pointer' }, '18210' => { 'Name' => 'int(*)(struct ibv_cq*, int, struct ibv_wc*)', 'Param' => { '0' => { 'type' => '9734' }, '1' => { 'type' => '161' }, '2' => { 'type' => '18205' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '18235' => { 'Name' => 'int(*)(struct ibv_cq*, int)', 'Param' => { '0' => { 'type' => '9734' }, '1' => { 'type' => '161' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '18265' => { 'Name' => 'int(*)(struct ibv_srq*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '10052' }, '1' => { 'type' => '14138' }, '2' => { 'type' => '14225' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '18295' => { 'BaseType' => '14057', 'Name' => 'struct ibv_send_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '18300' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_send_wr*, struct ibv_send_wr**)', 'Param' => { '0' => { 'type' => '9935' }, '1' => { 'type' => '14057' }, '2' => { 'type' => '18295' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '18330' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '9935' }, '1' => { 'type' => '14138' }, '2' => { 'type' => '14225' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '18335' => { 'BaseType' => '17258', 'Name' => 'struct ibv_flow*', 'Size' => '8', 'Type' => 'Pointer' }, '18340' => { 'BaseType' => '15242', 'Name' => 'struct ibv_modify_cq_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '18345' => { 'BaseType' => '12103', 'Name' => 'struct ibv_rwq_ind_table_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '18350' => { 'BaseType' => '11924', 'Name' => 'struct ibv_wq_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '18355' => { 'BaseType' => '17145', 'Name' => 'struct ibv_flow_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '18360' => { 'BaseType' => '12474', 'Name' => 'struct ibv_qp_open_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '18365' => { 'BaseType' => '11269', 'Name' => 'struct ibv_xrcd_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '18370' => { 'Name' => '_Bool', 'Size' => '1', 'Type' => 'Intrinsic' }, '18410' => { 'Header' => undef, 'Line' => '51', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'ex_hdr', 'offset' => '8', 'type' => '1483' } }, 'Name' => 'struct ex_hdr', 'Size' => '24', 'Type' => 'Struct' }, '18449' => { 'Header' => undef, 'Line' => '175', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'pd_handle', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'mw_type', 'offset' => '18', 'type' => '989' }, '3' => { 'name' => 'reserved', 'offset' => '19', 'type' => '2238' }, '4' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Size' => '16', 'Type' => 'Struct' }, '185' => { 'Name' => 'long', 'Size' => '8', 'Type' => 'Intrinsic' }, '18522' => { 
'Header' => undef, 'Line' => '175', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '18449' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '2156' } }, 'Size' => '16', 'Type' => 'Union' }, '18545' => { 'Header' => undef, 'Line' => '175', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '18522' } }, 'Name' => 'struct ibv_alloc_mw', 'Size' => '24', 'Type' => 'Struct' }, '18597' => { 'Header' => undef, 'Line' => '176', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'driver_data', 'offset' => '8', 'type' => '1549' } }, 'Size' => '8', 'Type' => 'Struct' }, '18631' => { 'Header' => undef, 'Line' => '176', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '18597' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '1580' } }, 'Size' => '8', 'Type' => 'Union' }, '18654' => { 'Header' => undef, 'Line' => '176', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '18631' } }, 'Name' => 'struct ibv_alloc_pd', 'Size' => '16', 'Type' => 'Struct' }, '1895' => { 'Header' => undef, 'Line' => '361', 'Memb' => { '0' => { 'name' => 'mr_handle', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'lkey', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'rkey', 'offset' => '8', 'type' => '1013' }, '3' => { 'name' => 'driver_data', 'offset' => '18', 'type' => '1663' } }, 'Name' => 'struct ib_uverbs_reg_mr_resp', 'Size' => '12', 'Type' => 'Struct' }, '19145' => { 'Header' => undef, 'Line' => '194', 'Memb' => { '0' => { 'name' => 'dest', 'offset' => '0', 'type' => '3401' }, '1' => { 'name' => 'alt_dest', 'offset' => '50', 'type' => '3401' }, '10' => { 'name' => 'alt_pkey_index', 'offset' => '148', 'type' => '1001' }, '11' => { 'name' => 'qp_state', 'offset' => '150', 'type' => '989' }, '12' => { 'name' => 'cur_qp_state', 'offset' => '151', 'type' => '989' }, '13' => { 'name' => 'path_mtu', 'offset' => '152', 'type' => '989' }, '14' => { 'name' => 'path_mig_state', 'offset' => '153', 'type' => '989' }, '15' => { 'name' => 'en_sqd_async_notify', 'offset' => '256', 'type' => '989' }, '16' => { 'name' => 'max_rd_atomic', 'offset' => '257', 'type' => '989' }, '17' => { 'name' => 'max_dest_rd_atomic', 'offset' => '258', 'type' => '989' }, '18' => { 'name' => 'min_rnr_timer', 'offset' => '259', 'type' => '989' }, '19' => { 'name' => 'port_num', 'offset' => '260', 'type' => '989' }, '2' => { 'name' => 'qp_handle', 'offset' => '100', 'type' => '1013' }, '20' => { 'name' => 'timeout', 'offset' => '261', 'type' => '989' }, '21' => { 'name' => 'retry_cnt', 'offset' => '262', 'type' => '989' }, '22' => { 'name' => 'rnr_retry', 'offset' => '263', 'type' => '989' }, '23' => { 'name' => 'alt_port_num', 'offset' => '264', 'type' => '989' }, '24' => { 'name' => 'alt_timeout', 'offset' => '265', 'type' => '989' }, '25' => { 'name' => 'reserved', 'offset' => '272', 'type' => '4489' }, '26' => { 'name' => 'driver_data', 'offset' => '274', 'type' => '1549' }, '3' => { 'name' => 'attr_mask', 'offset' => '104', 'type' => '1013' }, '4' => { 'name' => 'qkey', 'offset' => '114', 'type' => '1013' }, '5' => { 'name' => 'rq_psn', 'offset' => '118', 'type' => '1013' }, '6' => { 'name' => 'sq_psn', 'offset' => '128', 'type' => '1013' }, '7' => { 'name' => 'dest_qp_num', 'offset' => '132', 'type' => '1013' }, '8' => { 'name' => 'qp_access_flags', 
'offset' => '136', 'type' => '1013' }, '9' => { 'name' => 'pkey_index', 'offset' => '146', 'type' => '1001' } }, 'Size' => '112', 'Type' => 'Struct' }, '19519' => { 'Header' => undef, 'Line' => '194', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '19145' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '4098' } }, 'Size' => '112', 'Type' => 'Union' }, '19542' => { 'Header' => undef, 'Line' => '194', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '19519' } }, 'Name' => 'struct ibv_modify_qp', 'Size' => '120', 'Type' => 'Struct' }, '19584' => { 'Header' => undef, 'Line' => '195', 'Memb' => { '0' => { 'name' => 'srq_handle', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'attr_mask', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'max_wr', 'offset' => '8', 'type' => '1013' }, '3' => { 'name' => 'srq_limit', 'offset' => '18', 'type' => '1013' }, '4' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Size' => '16', 'Type' => 'Struct' }, '19657' => { 'Header' => undef, 'Line' => '195', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '19584' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '8070' } }, 'Size' => '16', 'Type' => 'Union' }, '1966' => { 'Header' => undef, 'Line' => '368', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'mr_handle', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'flags', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'start', 'offset' => '22', 'type' => '1025' }, '4' => { 'name' => 'length', 'offset' => '36', 'type' => '1025' }, '5' => { 'name' => 'hca_va', 'offset' => '50', 'type' => '1025' }, '6' => { 'name' => 'pd_handle', 'offset' => '64', 'type' => '1013' }, '7' => { 'name' => 'access_flags', 'offset' => '68', 'type' => '1013' }, '8' => { 'name' => 'driver_data', 'offset' => '72', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_rereg_mr', 'Size' => '48', 'Type' => 'Struct' }, '19680' => { 'Header' => undef, 'Line' => '195', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '19657' } }, 'Name' => 'struct ibv_modify_srq', 'Size' => '24', 'Type' => 'Struct' }, '197' => { 'BaseType' => '46', 'Header' => undef, 'Line' => '45', 'Name' => '__uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '19722' => { 'Header' => undef, 'Line' => '196', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'user_handle', 'offset' => '8', 'type' => '1025' }, '2' => { 'name' => 'pd_handle', 'offset' => '22', 'type' => '1013' }, '3' => { 'name' => 'qpn', 'offset' => '32', 'type' => '1013' }, '4' => { 'name' => 'qp_type', 'offset' => '36', 'type' => '989' }, '5' => { 'name' => 'reserved', 'offset' => '37', 'type' => '1564' }, '6' => { 'name' => 'driver_data', 'offset' => '50', 'type' => '1549' } }, 'Size' => '32', 'Type' => 'Struct' }, '19820' => { 'Header' => undef, 'Line' => '196', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '19722' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '3136' } }, 'Size' => '32', 'Type' => 'Union' }, '19843' => { 'Header' => undef, 'Line' => '196', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '19820' } }, 'Name' => 'struct ibv_open_qp', 
'Size' => '40', 'Type' => 'Struct' }, '19897' => { 'Header' => undef, 'Line' => '197', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'fd', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'oflags', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Size' => '16', 'Type' => 'Struct' }, '19955' => { 'Header' => undef, 'Line' => '197', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '19897' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '1678' } }, 'Size' => '16', 'Type' => 'Union' }, '19978' => { 'Header' => undef, 'Line' => '197', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '19955' } }, 'Name' => 'struct ibv_open_xrcd', 'Size' => '24', 'Type' => 'Struct' }, '200673' => { 'Header' => undef, 'Line' => '586', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'user_handle', 'offset' => '8', 'type' => '1025' }, '10' => { 'name' => 'max_inline_data', 'offset' => '72', 'type' => '1013' }, '11' => { 'name' => 'sq_sig_all', 'offset' => '82', 'type' => '989' }, '12' => { 'name' => 'qp_type', 'offset' => '83', 'type' => '989' }, '13' => { 'name' => 'is_srq', 'offset' => '84', 'type' => '989' }, '14' => { 'name' => 'reserved', 'offset' => '85', 'type' => '989' }, '15' => { 'name' => 'driver_data', 'offset' => '86', 'type' => '1549' }, '2' => { 'name' => 'pd_handle', 'offset' => '22', 'type' => '1013' }, '3' => { 'name' => 'send_cq_handle', 'offset' => '32', 'type' => '1013' }, '4' => { 'name' => 'recv_cq_handle', 'offset' => '36', 'type' => '1013' }, '5' => { 'name' => 'srq_handle', 'offset' => '40', 'type' => '1013' }, '6' => { 'name' => 'max_send_wr', 'offset' => '50', 'type' => '1013' }, '7' => { 'name' => 'max_recv_wr', 'offset' => '54', 'type' => '1013' }, '8' => { 'name' => 'max_send_sge', 'offset' => '64', 'type' => '1013' }, '9' => { 'name' => 'max_recv_sge', 'offset' => '68', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_create_qp', 'Size' => '56', 'Type' => 'Struct' }, '200935' => { 'Header' => undef, 'Line' => '613', 'Memb' => { '0' => { 'name' => 'user_handle', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'pd_handle', 'offset' => '8', 'type' => '1013' }, '10' => { 'name' => 'sq_sig_all', 'offset' => '68', 'type' => '989' }, '11' => { 'name' => 'qp_type', 'offset' => '69', 'type' => '989' }, '12' => { 'name' => 'is_srq', 'offset' => '70', 'type' => '989' }, '13' => { 'name' => 'reserved', 'offset' => '71', 'type' => '989' }, '14' => { 'name' => 'comp_mask', 'offset' => '72', 'type' => '1013' }, '15' => { 'name' => 'create_flags', 'offset' => '82', 'type' => '1013' }, '16' => { 'name' => 'rwq_ind_tbl_handle', 'offset' => '86', 'type' => '1013' }, '17' => { 'name' => 'source_qpn', 'offset' => '96', 'type' => '1013' }, '2' => { 'name' => 'send_cq_handle', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'recv_cq_handle', 'offset' => '22', 'type' => '1013' }, '4' => { 'name' => 'srq_handle', 'offset' => '32', 'type' => '1013' }, '5' => { 'name' => 'max_send_wr', 'offset' => '36', 'type' => '1013' }, '6' => { 'name' => 'max_recv_wr', 'offset' => '40', 'type' => '1013' }, '7' => { 'name' => 'max_send_sge', 'offset' => '50', 'type' => '1013' }, '8' => { 'name' => 'max_recv_sge', 'offset' => '54', 'type' => '1013' }, '9' => { 'name' => 'max_inline_data', 'offset' => '64', 'type' => 
'1013' } }, 'Name' => 'struct ib_uverbs_ex_create_qp', 'Size' => '64', 'Type' => 'Struct' }, '201358' => { 'Header' => undef, 'Line' => '657', 'Memb' => { '0' => { 'name' => 'base', 'offset' => '0', 'type' => '3245' }, '1' => { 'name' => 'comp_mask', 'offset' => '50', 'type' => '1013' }, '2' => { 'name' => 'response_length', 'offset' => '54', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_ex_create_qp_resp', 'Size' => '40', 'Type' => 'Struct' }, '20641' => { 'Header' => undef, 'Line' => '204', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'qp_handle', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'attr_mask', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Size' => '16', 'Type' => 'Struct' }, '20701' => { 'Header' => undef, 'Line' => '204', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '20641' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '3583' } }, 'Size' => '16', 'Type' => 'Union' }, '20724' => { 'Header' => undef, 'Line' => '204', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '20701' } }, 'Name' => 'struct ibv_query_qp', 'Size' => '24', 'Type' => 'Struct' }, '20776' => { 'Header' => undef, 'Line' => '205', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'srq_handle', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'reserved', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Size' => '16', 'Type' => 'Struct' }, '20836' => { 'Header' => undef, 'Line' => '205', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '20776' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '8153' } }, 'Size' => '16', 'Type' => 'Union' }, '20859' => { 'Header' => undef, 'Line' => '205', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '20836' } }, 'Name' => 'struct ibv_query_srq', 'Size' => '24', 'Type' => 'Struct' }, '209' => { 'BaseType' => '185', 'Header' => undef, 'Line' => '194', 'Name' => '__ssize_t', 'Size' => '8', 'Type' => 'Typedef' }, '20913' => { 'Header' => undef, 'Line' => '206', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'start', 'offset' => '8', 'type' => '1025' }, '2' => { 'name' => 'length', 'offset' => '22', 'type' => '1025' }, '3' => { 'name' => 'hca_va', 'offset' => '36', 'type' => '1025' }, '4' => { 'name' => 'pd_handle', 'offset' => '50', 'type' => '1013' }, '5' => { 'name' => 'access_flags', 'offset' => '54', 'type' => '1013' }, '6' => { 'name' => 'driver_data', 'offset' => '64', 'type' => '1549' } }, 'Size' => '40', 'Type' => 'Struct' }, '2101' => { 'Header' => undef, 'Line' => '380', 'Memb' => { '0' => { 'name' => 'lkey', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'rkey', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'driver_data', 'offset' => '8', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_rereg_mr_resp', 'Size' => '8', 'Type' => 'Struct' }, '21012' => { 'Header' => undef, 'Line' => '206', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '20913' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '1788' } }, 'Size' => '40', 'Type' => 'Union' }, '21035' => { 'Header' => undef, 
'Line' => '206', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '21012' } }, 'Name' => 'struct ibv_reg_mr', 'Size' => '48', 'Type' => 'Struct' }, '21196' => { 'Header' => undef, 'Line' => '208', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'mr_handle', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'flags', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'start', 'offset' => '22', 'type' => '1025' }, '4' => { 'name' => 'length', 'offset' => '36', 'type' => '1025' }, '5' => { 'name' => 'hca_va', 'offset' => '50', 'type' => '1025' }, '6' => { 'name' => 'pd_handle', 'offset' => '64', 'type' => '1013' }, '7' => { 'name' => 'access_flags', 'offset' => '68', 'type' => '1013' }, '8' => { 'name' => 'driver_data', 'offset' => '72', 'type' => '1549' } }, 'Size' => '48', 'Type' => 'Struct' }, '21321' => { 'Header' => undef, 'Line' => '208', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '21196' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '1966' } }, 'Size' => '48', 'Type' => 'Union' }, '21344' => { 'Header' => undef, 'Line' => '208', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '21321' } }, 'Name' => 'struct ibv_rereg_mr', 'Size' => '56', 'Type' => 'Struct' }, '21396' => { 'Header' => undef, 'Line' => '209', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'cq_handle', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'cqe', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Size' => '16', 'Type' => 'Struct' }, '21455' => { 'Header' => undef, 'Line' => '209', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '21396' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '2309' } }, 'Size' => '16', 'Type' => 'Union' }, '21478' => { 'Header' => undef, 'Line' => '209', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '21455' } }, 'Name' => 'struct ibv_resize_cq', 'Size' => '24', 'Type' => 'Struct' }, '215236' => { 'Header' => undef, 'Line' => '182', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'user_handle', 'offset' => '8', 'type' => '1025' }, '10' => { 'name' => 'max_inline_data', 'offset' => '72', 'type' => '1013' }, '11' => { 'name' => 'sq_sig_all', 'offset' => '82', 'type' => '989' }, '12' => { 'name' => 'qp_type', 'offset' => '83', 'type' => '989' }, '13' => { 'name' => 'is_srq', 'offset' => '84', 'type' => '989' }, '14' => { 'name' => 'reserved', 'offset' => '85', 'type' => '989' }, '15' => { 'name' => 'driver_data', 'offset' => '86', 'type' => '1549' }, '2' => { 'name' => 'pd_handle', 'offset' => '22', 'type' => '1013' }, '3' => { 'name' => 'send_cq_handle', 'offset' => '32', 'type' => '1013' }, '4' => { 'name' => 'recv_cq_handle', 'offset' => '36', 'type' => '1013' }, '5' => { 'name' => 'srq_handle', 'offset' => '40', 'type' => '1013' }, '6' => { 'name' => 'max_send_wr', 'offset' => '50', 'type' => '1013' }, '7' => { 'name' => 'max_recv_wr', 'offset' => '54', 'type' => '1013' }, '8' => { 'name' => 'max_send_sge', 'offset' => '64', 'type' => '1013' }, '9' => { 'name' => 'max_recv_sge', 'offset' => '68', 'type' => '1013' } }, 
'Size' => '56', 'Type' => 'Struct' }, '215452' => { 'Header' => undef, 'Line' => '182', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '215236' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '200673' } }, 'Size' => '56', 'Type' => 'Union' }, '215479' => { 'Header' => undef, 'Line' => '182', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '215452' } }, 'Name' => 'struct ibv_create_qp', 'Size' => '64', 'Type' => 'Struct' }, '2156' => { 'Header' => undef, 'Line' => '390', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'pd_handle', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'mw_type', 'offset' => '18', 'type' => '989' }, '3' => { 'name' => 'reserved', 'offset' => '19', 'type' => '2238' }, '4' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_alloc_mw', 'Size' => '16', 'Type' => 'Struct' }, '215673' => { 'Header' => undef, 'Line' => '213', 'Memb' => { '0' => { 'name' => 'user_handle', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'pd_handle', 'offset' => '8', 'type' => '1013' }, '10' => { 'name' => 'sq_sig_all', 'offset' => '68', 'type' => '989' }, '11' => { 'name' => 'qp_type', 'offset' => '69', 'type' => '989' }, '12' => { 'name' => 'is_srq', 'offset' => '70', 'type' => '989' }, '13' => { 'name' => 'reserved', 'offset' => '71', 'type' => '989' }, '14' => { 'name' => 'comp_mask', 'offset' => '72', 'type' => '1013' }, '15' => { 'name' => 'create_flags', 'offset' => '82', 'type' => '1013' }, '16' => { 'name' => 'rwq_ind_tbl_handle', 'offset' => '86', 'type' => '1013' }, '17' => { 'name' => 'source_qpn', 'offset' => '96', 'type' => '1013' }, '2' => { 'name' => 'send_cq_handle', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'recv_cq_handle', 'offset' => '22', 'type' => '1013' }, '4' => { 'name' => 'srq_handle', 'offset' => '32', 'type' => '1013' }, '5' => { 'name' => 'max_send_wr', 'offset' => '36', 'type' => '1013' }, '6' => { 'name' => 'max_recv_wr', 'offset' => '40', 'type' => '1013' }, '7' => { 'name' => 'max_send_sge', 'offset' => '50', 'type' => '1013' }, '8' => { 'name' => 'max_recv_sge', 'offset' => '54', 'type' => '1013' }, '9' => { 'name' => 'max_inline_data', 'offset' => '64', 'type' => '1013' } }, 'Size' => '64', 'Type' => 'Struct' }, '215915' => { 'Header' => undef, 'Line' => '213', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '215673' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '200935' } }, 'Size' => '64', 'Type' => 'Union' }, '215942' => { 'Header' => undef, 'Line' => '213', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '18410' }, '1' => { 'name' => 'unnamed0', 'offset' => '36', 'type' => '215915' } }, 'Name' => 'struct ibv_create_qp_ex', 'Size' => '88', 'Type' => 'Struct' }, '21780' => { 'Header' => undef, 'Line' => '219', 'Memb' => { '0' => { 'name' => 'cq_handle', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'attr_mask', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'attr', 'offset' => '8', 'type' => '8519' }, '3' => { 'name' => 'reserved', 'offset' => '18', 'type' => '1013' } }, 'Size' => '16', 'Type' => 'Struct' }, '21842' => { 'Header' => undef, 'Line' => '219', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '21780' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '8562' } }, 'Size' => '16', 'Type' 
=> 'Union' }, '21869' => { 'Header' => undef, 'Line' => '219', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '18410' }, '1' => { 'name' => 'unnamed0', 'offset' => '36', 'type' => '21842' } }, 'Name' => 'struct ibv_modify_cq', 'Size' => '40', 'Type' => 'Struct' }, '21909' => { 'Header' => undef, 'Line' => '220', 'Memb' => { '0' => { 'name' => 'base', 'offset' => '0', 'type' => '4098' }, '1' => { 'name' => 'rate_limit', 'offset' => '274', 'type' => '1013' }, '2' => { 'name' => 'reserved', 'offset' => '278', 'type' => '1013' } }, 'Size' => '120', 'Type' => 'Struct' }, '21956' => { 'Header' => undef, 'Line' => '220', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '21909' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '4520' } }, 'Size' => '120', 'Type' => 'Union' }, '21979' => { 'Header' => undef, 'Line' => '220', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '18410' }, '1' => { 'name' => 'unnamed0', 'offset' => '36', 'type' => '21956' } }, 'Name' => 'struct ibv_modify_qp_ex', 'Size' => '144', 'Type' => 'Struct' }, '22031' => { 'Header' => undef, 'Line' => '221', 'Memb' => { '0' => { 'name' => 'attr_mask', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'wq_handle', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'wq_state', 'offset' => '8', 'type' => '1013' }, '3' => { 'name' => 'curr_wq_state', 'offset' => '18', 'type' => '1013' }, '4' => { 'name' => 'flags', 'offset' => '22', 'type' => '1013' }, '5' => { 'name' => 'flags_mask', 'offset' => '32', 'type' => '1013' } }, 'Size' => '24', 'Type' => 'Struct' }, '221' => { 'BaseType' => '226', 'Name' => 'char*', 'Size' => '8', 'Type' => 'Pointer' }, '22119' => { 'Header' => undef, 'Line' => '221', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '22031' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '8292' } }, 'Size' => '24', 'Type' => 'Union' }, '221262' => { 'BaseType' => '215942', 'Name' => 'struct ibv_create_qp_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '221267' => { 'BaseType' => '201358', 'Name' => 'struct ib_uverbs_ex_create_qp_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '22146' => { 'Header' => undef, 'Line' => '221', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '18410' }, '1' => { 'name' => 'unnamed0', 'offset' => '36', 'type' => '22119' } }, 'Name' => 'struct ibv_modify_wq', 'Size' => '48', 'Type' => 'Struct' }, '221789' => { 'BaseType' => '215479', 'Name' => 'struct ibv_create_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '2238' => { 'BaseType' => '989', 'Name' => '__u8[3]', 'Size' => '3', 'Type' => 'Array' }, '2254' => { 'Header' => undef, 'Line' => '398', 'Memb' => { '0' => { 'name' => 'mw_handle', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'rkey', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'driver_data', 'offset' => '8', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_alloc_mw_resp', 'Size' => '8', 'Type' => 'Struct' }, '226' => { 'Name' => 'char', 'Size' => '1', 'Type' => 'Intrinsic' }, '22792' => { 'Header' => undef, 'Line' => '111', 'Memb' => { '0' => { 'name' => 'xrcd', 'offset' => '0', 'type' => '11325' }, '1' => { 'name' => 'comp_mask', 'offset' => '8', 'type' => '953' }, '2' => { 'name' => 'handle', 'offset' => '18', 'type' => '953' } }, 'Name' => 'struct verbs_xrcd', 'Size' => '16', 'Type' => 'Struct' }, '22845' => { 'BaseType' => '22792', 'Name' => 'struct verbs_xrcd*', 'Size' => '8', 'Type' => 'Pointer' }, '22879' => { 'Header' => undef, 'Line' 
=> '141', 'Memb' => { '0' => { 'name' => 'IBV_MR_TYPE_MR', 'value' => '0' }, '1' => { 'name' => 'IBV_MR_TYPE_NULL_MR', 'value' => '1' }, '2' => { 'name' => 'IBV_MR_TYPE_IMPORTED_MR', 'value' => '2' }, '3' => { 'name' => 'IBV_MR_TYPE_DMABUF_MR', 'value' => '3' } }, 'Name' => 'enum ibv_mr_type', 'Size' => '4', 'Type' => 'Enum' }, '22920' => { 'Header' => undef, 'Line' => '148', 'Memb' => { '0' => { 'name' => 'ibv_mr', 'offset' => '0', 'type' => '11074' }, '1' => { 'name' => 'mr_type', 'offset' => '72', 'type' => '22879' }, '2' => { 'name' => 'access', 'offset' => '82', 'type' => '161' } }, 'Name' => 'struct verbs_mr', 'Size' => '56', 'Type' => 'Struct' }, '22973' => { 'Header' => undef, 'Line' => '160', 'Memb' => { '0' => { 'name' => 'qp', 'offset' => '0', 'type' => '9739' }, '1' => { 'name' => 'qp_ex', 'offset' => '0', 'type' => '14235' } }, 'Size' => '360', 'Type' => 'Union' }, '23007' => { 'Header' => undef, 'Line' => '159', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '22973' }, '1' => { 'name' => 'comp_mask', 'offset' => '864', 'type' => '953' }, '2' => { 'name' => 'xrcd', 'offset' => '872', 'type' => '22845' } }, 'Name' => 'struct verbs_qp', 'Size' => '376', 'Type' => 'Struct' }, '23054' => { 'Header' => undef, 'Line' => '176', 'Memb' => { '0' => { 'name' => 'IBV_FLOW_ACTION_UNSPECIFIED', 'value' => '0' }, '1' => { 'name' => 'IBV_FLOW_ACTION_ESP', 'value' => '1' } }, 'Name' => 'enum ibv_flow_action_type', 'Size' => '4', 'Type' => 'Enum' }, '23083' => { 'Header' => undef, 'Line' => '181', 'Memb' => { '0' => { 'name' => 'action', 'offset' => '0', 'type' => '16763' }, '1' => { 'name' => 'handle', 'offset' => '8', 'type' => '953' }, '2' => { 'name' => 'type', 'offset' => '18', 'type' => '23054' } }, 'Name' => 'struct verbs_flow_action', 'Size' => '16', 'Type' => 'Struct' }, '2309' => { 'Header' => undef, 'Line' => '453', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'cq_handle', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'cqe', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_resize_cq', 'Size' => '16', 'Type' => 'Struct' }, '23141' => { 'Header' => undef, 'Line' => '299', 'Memb' => { '0' => { 'name' => 'counters', 'offset' => '0', 'type' => '16859' }, '1' => { 'name' => 'handle', 'offset' => '8', 'type' => '953' } }, 'Name' => 'struct verbs_counters', 'Size' => '16', 'Type' => 'Struct' }, '23189' => { 'BaseType' => '8707', 'Name' => 'union ibv_gid const*', 'Size' => '8', 'Type' => 'Pointer' }, '23194' => { 'BaseType' => '11598', 'Name' => 'struct ibv_ah_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '23199' => { 'BaseType' => '12309', 'Name' => 'struct ibv_qp_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '23204' => { 'BaseType' => '22920', 'Name' => 'struct verbs_mr*', 'Size' => '8', 'Type' => 'Pointer' }, '23209' => { 'BaseType' => '12834', 'Name' => 'struct ibv_qp_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '23214' => { 'BaseType' => '11710', 'Name' => 'struct ibv_srq_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '23549' => { 'BaseType' => '161', 'Name' => 'int*', 'Size' => '8', 'Type' => 'Pointer' }, '237196' => { 'Header' => undef, 'Line' => '1173', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'user_handle', 'offset' => '8', 'type' => '1025' }, '2' => { 'name' => 'pd_handle', 'offset' => '22', 'type' => '1013' }, '3' => { 'name' => 'max_wr', 
'offset' => '32', 'type' => '1013' }, '4' => { 'name' => 'max_sge', 'offset' => '36', 'type' => '1013' }, '5' => { 'name' => 'srq_limit', 'offset' => '40', 'type' => '1013' }, '6' => { 'name' => 'driver_data', 'offset' => '50', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_create_srq', 'Size' => '32', 'Type' => 'Struct' }, '237309' => { 'Header' => undef, 'Line' => '1183', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'user_handle', 'offset' => '8', 'type' => '1025' }, '10' => { 'name' => 'driver_data', 'offset' => '72', 'type' => '1549' }, '2' => { 'name' => 'srq_type', 'offset' => '22', 'type' => '1013' }, '3' => { 'name' => 'pd_handle', 'offset' => '32', 'type' => '1013' }, '4' => { 'name' => 'max_wr', 'offset' => '36', 'type' => '1013' }, '5' => { 'name' => 'max_sge', 'offset' => '40', 'type' => '1013' }, '6' => { 'name' => 'srq_limit', 'offset' => '50', 'type' => '1013' }, '7' => { 'name' => 'max_num_tags', 'offset' => '54', 'type' => '1013' }, '8' => { 'name' => 'xrcd_handle', 'offset' => '64', 'type' => '1013' }, '9' => { 'name' => 'cq_handle', 'offset' => '68', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_create_xsrq', 'Size' => '48', 'Type' => 'Struct' }, '237478' => { 'Header' => undef, 'Line' => '1197', 'Memb' => { '0' => { 'name' => 'srq_handle', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'max_wr', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'max_sge', 'offset' => '8', 'type' => '1013' }, '3' => { 'name' => 'srqn', 'offset' => '18', 'type' => '1013' }, '4' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1663' } }, 'Name' => 'struct ib_uverbs_create_srq_resp', 'Size' => '16', 'Type' => 'Struct' }, '23769' => { 'BaseType' => '21869', 'Name' => 'struct ibv_modify_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '2377' => { 'Header' => undef, 'Line' => '460', 'Memb' => { '0' => { 'name' => 'cqe', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'reserved', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'driver_data', 'offset' => '8', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_resize_cq_resp', 'Size' => '8', 'Type' => 'Struct' }, '24173' => { 'BaseType' => '8448', 'Name' => 'struct ib_uverbs_ex_create_rwq_ind_table_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '24420' => { 'BaseType' => '22146', 'Name' => 'struct ibv_modify_wq*', 'Size' => '8', 'Type' => 'Pointer' }, '250276' => { 'Header' => undef, 'Line' => '183', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'user_handle', 'offset' => '8', 'type' => '1025' }, '2' => { 'name' => 'pd_handle', 'offset' => '22', 'type' => '1013' }, '3' => { 'name' => 'max_wr', 'offset' => '32', 'type' => '1013' }, '4' => { 'name' => 'max_sge', 'offset' => '36', 'type' => '1013' }, '5' => { 'name' => 'srq_limit', 'offset' => '40', 'type' => '1013' }, '6' => { 'name' => 'driver_data', 'offset' => '50', 'type' => '1549' } }, 'Size' => '32', 'Type' => 'Struct' }, '250375' => { 'Header' => undef, 'Line' => '183', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '250276' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '237196' } }, 'Size' => '32', 'Type' => 'Union' }, '250402' => { 'Header' => undef, 'Line' => '183', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '250375' } }, 'Name' => 'struct ibv_create_srq', 'Size' => '40', 'Type' => 'Struct' }, '250467' => { 
'Header' => undef, 'Line' => '184', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'user_handle', 'offset' => '8', 'type' => '1025' }, '10' => { 'name' => 'driver_data', 'offset' => '72', 'type' => '1549' }, '2' => { 'name' => 'srq_type', 'offset' => '22', 'type' => '1013' }, '3' => { 'name' => 'pd_handle', 'offset' => '32', 'type' => '1013' }, '4' => { 'name' => 'max_wr', 'offset' => '36', 'type' => '1013' }, '5' => { 'name' => 'max_sge', 'offset' => '40', 'type' => '1013' }, '6' => { 'name' => 'srq_limit', 'offset' => '50', 'type' => '1013' }, '7' => { 'name' => 'max_num_tags', 'offset' => '54', 'type' => '1013' }, '8' => { 'name' => 'xrcd_handle', 'offset' => '64', 'type' => '1013' }, '9' => { 'name' => 'cq_handle', 'offset' => '68', 'type' => '1013' } }, 'Size' => '48', 'Type' => 'Struct' }, '250618' => { 'Header' => undef, 'Line' => '184', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '250467' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '237309' } }, 'Size' => '48', 'Type' => 'Union' }, '250645' => { 'Header' => undef, 'Line' => '184', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '250618' } }, 'Name' => 'struct ibv_create_xsrq', 'Size' => '56', 'Type' => 'Struct' }, '251149' => { 'Header' => undef, 'Line' => '117', 'Memb' => { '0' => { 'name' => 'srq', 'offset' => '0', 'type' => '9940' }, '1' => { 'name' => 'srq_type', 'offset' => '296', 'type' => '57259' }, '2' => { 'name' => 'xrcd', 'offset' => '310', 'type' => '22845' }, '3' => { 'name' => 'cq', 'offset' => '324', 'type' => '9734' }, '4' => { 'name' => 'srq_num', 'offset' => '338', 'type' => '953' } }, 'Name' => 'struct verbs_srq', 'Size' => '160', 'Type' => 'Struct' }, '254' => { 'BaseType' => '209', 'Header' => undef, 'Line' => '77', 'Name' => 'ssize_t', 'Size' => '8', 'Type' => 'Typedef' }, '255302' => { 'BaseType' => '251149', 'Name' => 'struct verbs_srq*', 'Size' => '8', 'Type' => 'Pointer' }, '255307' => { 'BaseType' => '250645', 'Name' => 'struct ibv_create_xsrq*', 'Size' => '8', 'Type' => 'Pointer' }, '255312' => { 'BaseType' => '237478', 'Name' => 'struct ib_uverbs_create_srq_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '255842' => { 'BaseType' => '250402', 'Name' => 'struct ibv_create_srq*', 'Size' => '8', 'Type' => 'Pointer' }, '262677' => { 'Header' => undef, 'Line' => '1237', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'wq_type', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'user_handle', 'offset' => '8', 'type' => '1025' }, '3' => { 'name' => 'pd_handle', 'offset' => '22', 'type' => '1013' }, '4' => { 'name' => 'cq_handle', 'offset' => '32', 'type' => '1013' }, '5' => { 'name' => 'max_wr', 'offset' => '36', 'type' => '1013' }, '6' => { 'name' => 'max_sge', 'offset' => '40', 'type' => '1013' }, '7' => { 'name' => 'create_flags', 'offset' => '50', 'type' => '1013' }, '8' => { 'name' => 'reserved', 'offset' => '54', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_ex_create_wq', 'Size' => '40', 'Type' => 'Struct' }, '262820' => { 'Header' => undef, 'Line' => '1249', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'response_length', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'wq_handle', 'offset' => '8', 'type' => '1013' }, '3' => { 'name' => 'max_wr', 'offset' => '18', 'type' => '1013' }, '4' => { 'name' => 
'max_sge', 'offset' => '22', 'type' => '1013' }, '5' => { 'name' => 'wqn', 'offset' => '32', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_ex_create_wq_resp', 'Size' => '24', 'Type' => 'Struct' }, '266' => { 'Name' => 'long long', 'Size' => '8', 'Type' => 'Intrinsic' }, '27330' => { 'BaseType' => '5543', 'Name' => 'struct ib_uverbs_create_ah_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '275720' => { 'Header' => undef, 'Line' => '215', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'wq_type', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'user_handle', 'offset' => '8', 'type' => '1025' }, '3' => { 'name' => 'pd_handle', 'offset' => '22', 'type' => '1013' }, '4' => { 'name' => 'cq_handle', 'offset' => '32', 'type' => '1013' }, '5' => { 'name' => 'max_wr', 'offset' => '36', 'type' => '1013' }, '6' => { 'name' => 'max_sge', 'offset' => '40', 'type' => '1013' }, '7' => { 'name' => 'create_flags', 'offset' => '50', 'type' => '1013' }, '8' => { 'name' => 'reserved', 'offset' => '54', 'type' => '1013' } }, 'Size' => '40', 'Type' => 'Struct' }, '275848' => { 'Header' => undef, 'Line' => '215', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '275720' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '262677' } }, 'Size' => '40', 'Type' => 'Union' }, '275876' => { 'Header' => undef, 'Line' => '215', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '18410' }, '1' => { 'name' => 'unnamed0', 'offset' => '36', 'type' => '275848' } }, 'Name' => 'struct ibv_create_wq', 'Size' => '64', 'Type' => 'Struct' }, '282986' => { 'BaseType' => '275876', 'Name' => 'struct ibv_create_wq*', 'Size' => '8', 'Type' => 'Pointer' }, '282991' => { 'BaseType' => '262820', 'Name' => 'struct ib_uverbs_ex_create_wq_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '2856' => { 'Header' => undef, 'Line' => '528', 'Memb' => { '0' => { 'name' => 'dgid', 'offset' => '0', 'type' => '2955' }, '1' => { 'name' => 'flow_label', 'offset' => '22', 'type' => '1013' }, '2' => { 'name' => 'sgid_index', 'offset' => '32', 'type' => '989' }, '3' => { 'name' => 'hop_limit', 'offset' => '33', 'type' => '989' }, '4' => { 'name' => 'traffic_class', 'offset' => '34', 'type' => '989' }, '5' => { 'name' => 'reserved', 'offset' => '35', 'type' => '989' } }, 'Name' => 'struct ib_uverbs_global_route', 'Size' => '24', 'Type' => 'Struct' }, '28996' => { 'BaseType' => '21979', 'Name' => 'struct ibv_modify_qp_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '29001' => { 'BaseType' => '4575', 'Name' => 'struct ib_uverbs_ex_modify_qp_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '29220' => { 'BaseType' => '19542', 'Name' => 'struct ibv_modify_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '2955' => { 'BaseType' => '989', 'Name' => '__u8[16]', 'Size' => '16', 'Type' => 'Array' }, '2971' => { 'Header' => undef, 'Line' => '537', 'Memb' => { '0' => { 'name' => 'grh', 'offset' => '0', 'type' => '2856' }, '1' => { 'name' => 'dlid', 'offset' => '36', 'type' => '1001' }, '2' => { 'name' => 'sl', 'offset' => '38', 'type' => '989' }, '3' => { 'name' => 'src_path_bits', 'offset' => '39', 'type' => '989' }, '4' => { 'name' => 'static_rate', 'offset' => '40', 'type' => '989' }, '5' => { 'name' => 'is_global', 'offset' => '41', 'type' => '989' }, '6' => { 'name' => 'port_num', 'offset' => '48', 'type' => '989' }, '7' => { 'name' => 'reserved', 'offset' => '49', 'type' => '989' } }, 'Name' => 'struct ib_uverbs_ah_attr', 'Size' => '32', 'Type' => 'Struct' }, '29717' => { 
'BaseType' => '20724', 'Name' => 'struct ibv_query_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '30110' => { 'BaseType' => '23007', 'Name' => 'struct verbs_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '30115' => { 'BaseType' => '19843', 'Name' => 'struct ibv_open_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '30120' => { 'BaseType' => '3245', 'Name' => 'struct ib_uverbs_create_qp_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '30376' => { 'BaseType' => '20859', 'Name' => 'struct ibv_query_srq*', 'Size' => '8', 'Type' => 'Pointer' }, '30584' => { 'BaseType' => '19680', 'Name' => 'struct ibv_modify_srq*', 'Size' => '8', 'Type' => 'Pointer' }, '306746' => { 'Header' => undef, 'Line' => '44', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '306891' }, '1' => { 'name' => 'handle', 'offset' => '8', 'type' => '953' }, '2' => { 'name' => 'real_pd', 'offset' => '22', 'type' => '11395' } }, 'Name' => 'struct ibv_pd_1_0', 'Size' => '24', 'Type' => 'Struct' }, '306799' => { 'Header' => undef, 'Line' => '218', 'Memb' => { '0' => { 'name' => 'device', 'offset' => '0', 'type' => '308685' }, '1' => { 'name' => 'ops', 'offset' => '8', 'type' => '308070' }, '2' => { 'name' => 'cmd_fd', 'offset' => '548', 'type' => '161' }, '3' => { 'name' => 'async_fd', 'offset' => '552', 'type' => '161' }, '4' => { 'name' => 'num_comp_vectors', 'offset' => '562', 'type' => '161' }, '5' => { 'name' => 'real_context', 'offset' => '576', 'type' => '8991' } }, 'Name' => 'struct ibv_context_1_0', 'Size' => '248', 'Type' => 'Struct' }, '306891' => { 'BaseType' => '306799', 'Name' => 'struct ibv_context_1_0*', 'Size' => '8', 'Type' => 'Pointer' }, '306896' => { 'Header' => undef, 'Line' => '51', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '306891' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '306987' }, '2' => { 'name' => 'handle', 'offset' => '22', 'type' => '953' }, '3' => { 'name' => 'lkey', 'offset' => '32', 'type' => '953' }, '4' => { 'name' => 'rkey', 'offset' => '36', 'type' => '953' }, '5' => { 'name' => 'real_mr', 'offset' => '50', 'type' => '11186' } }, 'Name' => 'struct ibv_mr_1_0', 'Size' => '40', 'Type' => 'Struct' }, '306987' => { 'BaseType' => '306746', 'Name' => 'struct ibv_pd_1_0*', 'Size' => '8', 'Type' => 'Pointer' }, '306992' => { 'Header' => undef, 'Line' => '61', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '306891' }, '1' => { 'name' => 'srq_context', 'offset' => '8', 'type' => '82' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '306987' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '953' }, '4' => { 'name' => 'mutex', 'offset' => '50', 'type' => '292098' }, '5' => { 'name' => 'cond', 'offset' => '114', 'type' => '292172' }, '6' => { 'name' => 'events_completed', 'offset' => '288', 'type' => '953' }, '7' => { 'name' => 'real_srq', 'offset' => '296', 'type' => '10052' } }, 'Name' => 'struct ibv_srq_1_0', 'Size' => '136', 'Type' => 'Struct' }, '307109' => { 'Header' => undef, 'Line' => '74', 'Memb' => { '0' => { 'name' => 'qp_context', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'send_cq', 'offset' => '8', 'type' => '307345' }, '2' => { 'name' => 'recv_cq', 'offset' => '22', 'type' => '307345' }, '3' => { 'name' => 'srq', 'offset' => '36', 'type' => '307350' }, '4' => { 'name' => 'cap', 'offset' => '50', 'type' => '12224' }, '5' => { 'name' => 'qp_type', 'offset' => '82', 'type' => '12165' }, '6' => { 'name' => 'sq_sig_all', 'offset' => '86', 'type' => '161' } }, 'Name' => 'struct ibv_qp_init_attr_1_0', 
'Size' => '64', 'Type' => 'Struct' }, '307214' => { 'Header' => undef, 'Line' => '137', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '306891' }, '1' => { 'name' => 'cq_context', 'offset' => '8', 'type' => '82' }, '2' => { 'name' => 'handle', 'offset' => '22', 'type' => '953' }, '3' => { 'name' => 'cqe', 'offset' => '32', 'type' => '161' }, '4' => { 'name' => 'mutex', 'offset' => '36', 'type' => '292098' }, '5' => { 'name' => 'cond', 'offset' => '100', 'type' => '292172' }, '6' => { 'name' => 'comp_events_completed', 'offset' => '274', 'type' => '953' }, '7' => { 'name' => 'async_events_completed', 'offset' => '278', 'type' => '953' }, '8' => { 'name' => 'real_cq', 'offset' => '288', 'type' => '9734' } }, 'Name' => 'struct ibv_cq_1_0', 'Size' => '128', 'Type' => 'Struct' }, '307345' => { 'BaseType' => '307214', 'Name' => 'struct ibv_cq_1_0*', 'Size' => '8', 'Type' => 'Pointer' }, '307350' => { 'BaseType' => '306992', 'Name' => 'struct ibv_srq_1_0*', 'Size' => '8', 'Type' => 'Pointer' }, '307355' => { 'Header' => undef, 'Line' => '93', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '965' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '953' } }, 'Size' => '16', 'Type' => 'Struct' }, '307391' => { 'Header' => undef, 'Line' => '97', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '965' }, '1' => { 'name' => 'compare_add', 'offset' => '8', 'type' => '965' }, '2' => { 'name' => 'swap', 'offset' => '22', 'type' => '965' }, '3' => { 'name' => 'rkey', 'offset' => '36', 'type' => '953' } }, 'Size' => '32', 'Type' => 'Struct' }, '307453' => { 'Header' => undef, 'Line' => '103', 'Memb' => { '0' => { 'name' => 'ah', 'offset' => '0', 'type' => '307566' }, '1' => { 'name' => 'remote_qpn', 'offset' => '8', 'type' => '953' }, '2' => { 'name' => 'remote_qkey', 'offset' => '18', 'type' => '953' } }, 'Size' => '16', 'Type' => 'Struct' }, '307501' => { 'Header' => undef, 'Line' => '151', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '306891' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '306987' }, '2' => { 'name' => 'handle', 'offset' => '22', 'type' => '953' }, '3' => { 'name' => 'real_ah', 'offset' => '36', 'type' => '13672' } }, 'Name' => 'struct ibv_ah_1_0', 'Size' => '32', 'Type' => 'Struct' }, '307566' => { 'BaseType' => '307501', 'Name' => 'struct ibv_ah_1_0*', 'Size' => '8', 'Type' => 'Pointer' }, '307571' => { 'Header' => undef, 'Line' => '92', 'Memb' => { '0' => { 'name' => 'rdma', 'offset' => '0', 'type' => '307355' }, '1' => { 'name' => 'atomic', 'offset' => '0', 'type' => '307391' }, '2' => { 'name' => 'ud', 'offset' => '0', 'type' => '307453' } }, 'Size' => '32', 'Type' => 'Union' }, '307616' => { 'Header' => undef, 'Line' => '84', 'Memb' => { '0' => { 'name' => 'next', 'offset' => '0', 'type' => '307733' }, '1' => { 'name' => 'wr_id', 'offset' => '8', 'type' => '965' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '14062' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '161' }, '4' => { 'name' => 'opcode', 'offset' => '40', 'type' => '13213' }, '5' => { 'name' => 'send_flags', 'offset' => '50', 'type' => '161' }, '6' => { 'name' => 'imm_data', 'offset' => '54', 'type' => '1049' }, '7' => { 'name' => 'wr', 'offset' => '64', 'type' => '307571' } }, 'Name' => 'struct ibv_send_wr_1_0', 'Size' => '72', 'Type' => 'Struct' }, '307733' => { 'BaseType' => '307616', 'Name' => 'struct ibv_send_wr_1_0*', 'Size' => '8', 'Type' => 'Pointer' }, '307738' => { 'Header' => 
undef, 'Line' => '111', 'Memb' => { '0' => { 'name' => 'next', 'offset' => '0', 'type' => '307804' }, '1' => { 'name' => 'wr_id', 'offset' => '8', 'type' => '965' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '14062' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '161' } }, 'Name' => 'struct ibv_recv_wr_1_0', 'Size' => '32', 'Type' => 'Struct' }, '307804' => { 'BaseType' => '307738', 'Name' => 'struct ibv_recv_wr_1_0*', 'Size' => '8', 'Type' => 'Pointer' }, '307809' => { 'Header' => undef, 'Line' => '118', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '306891' }, '1' => { 'name' => 'qp_context', 'offset' => '8', 'type' => '82' }, '10' => { 'name' => 'mutex', 'offset' => '100', 'type' => '292098' }, '11' => { 'name' => 'cond', 'offset' => '260', 'type' => '292172' }, '12' => { 'name' => 'events_completed', 'offset' => '338', 'type' => '953' }, '13' => { 'name' => 'real_qp', 'offset' => '352', 'type' => '9935' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '306987' }, '3' => { 'name' => 'send_cq', 'offset' => '36', 'type' => '307345' }, '4' => { 'name' => 'recv_cq', 'offset' => '50', 'type' => '307345' }, '5' => { 'name' => 'srq', 'offset' => '64', 'type' => '307350' }, '6' => { 'name' => 'handle', 'offset' => '72', 'type' => '953' }, '7' => { 'name' => 'qp_num', 'offset' => '82', 'type' => '953' }, '8' => { 'name' => 'state', 'offset' => '86', 'type' => '12734' }, '9' => { 'name' => 'qp_type', 'offset' => '96', 'type' => '12165' } }, 'Name' => 'struct ibv_qp_1_0', 'Size' => '168', 'Type' => 'Struct' }, '308004' => { 'Header' => undef, 'Line' => '159', 'Memb' => { '0' => { 'name' => 'obsolete_sysfs_dev', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'obsolete_sysfs_ibdev', 'offset' => '8', 'type' => '82' }, '2' => { 'name' => 'real_device', 'offset' => '22', 'type' => '17378' }, '3' => { 'name' => '_ops', 'offset' => '36', 'type' => '17315' } }, 'Name' => 'struct ibv_device_1_0', 'Size' => '40', 'Type' => 'Struct' }, '308070' => { 'Header' => undef, 'Line' => '166', 'Memb' => { '0' => { 'name' => 'query_device', 'offset' => '0', 'type' => '18045' }, '1' => { 'name' => 'query_port', 'offset' => '8', 'type' => '67487' }, '10' => { 'name' => 'resize_cq', 'offset' => '128', 'type' => '18235' }, '11' => { 'name' => 'destroy_cq', 'offset' => '136', 'type' => '67242' }, '12' => { 'name' => 'create_srq', 'offset' => '150', 'type' => '67157' }, '13' => { 'name' => 'modify_srq', 'offset' => '260', 'type' => '67427' }, '14' => { 'name' => 'query_srq', 'offset' => '274', 'type' => '67577' }, '15' => { 'name' => 'destroy_srq', 'offset' => '288', 'type' => '67282' }, '16' => { 'name' => 'post_srq_recv', 'offset' => '296', 'type' => '308575' }, '17' => { 'name' => 'create_qp', 'offset' => '310', 'type' => '67127' }, '18' => { 'name' => 'query_qp', 'offset' => '324', 'type' => '67522' }, '19' => { 'name' => 'modify_qp', 'offset' => '338', 'type' => '67392' }, '2' => { 'name' => 'alloc_pd', 'offset' => '22', 'type' => '66955' }, '20' => { 'name' => 'destroy_qp', 'offset' => '352', 'type' => '67262' }, '21' => { 'name' => 'post_send', 'offset' => '360', 'type' => '308615' }, '22' => { 'name' => 'post_recv', 'offset' => '374', 'type' => '308645' }, '23' => { 'name' => 'create_ah', 'offset' => '388', 'type' => '67062' }, '24' => { 'name' => 'destroy_ah', 'offset' => '402', 'type' => '67222' }, '25' => { 'name' => 'attach_mcast', 'offset' => '512', 'type' => '308680' }, '26' => { 'name' => 'detach_mcast', 'offset' => '520', 'type' => '308680' 
}, '3' => { 'name' => 'dealloc_pd', 'offset' => '36', 'type' => '67177' }, '4' => { 'name' => 'reg_mr', 'offset' => '50', 'type' => '308465' }, '5' => { 'name' => 'dereg_mr', 'offset' => '64', 'type' => '308485' }, '6' => { 'name' => 'create_cq', 'offset' => '72', 'type' => '67097' }, '7' => { 'name' => 'poll_cq', 'offset' => '86', 'type' => '308515' }, '8' => { 'name' => 'req_notify_cq', 'offset' => '100', 'type' => '308540' }, '9' => { 'name' => 'cq_event', 'offset' => '114', 'type' => '67032' } }, 'Name' => 'struct ibv_context_ops_1_0', 'Size' => '216', 'Type' => 'Struct' }, '308465' => { 'Name' => 'struct ibv_mr*(*)(struct ibv_pd*, void*, size_t, int)', 'Param' => { '0' => { 'type' => '11395' }, '1' => { 'type' => '82' }, '2' => { 'type' => '53' }, '3' => { 'type' => '161' } }, 'Return' => '11186', 'Size' => '8', 'Type' => 'FuncPtr' }, '308485' => { 'Name' => 'int(*)(struct ibv_mr*)', 'Param' => { '0' => { 'type' => '11186' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '308515' => { 'Name' => 'int(*)(struct ibv_cq_1_0*, int, struct ibv_wc*)', 'Param' => { '0' => { 'type' => '307345' }, '1' => { 'type' => '161' }, '2' => { 'type' => '18205' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '308540' => { 'Name' => 'int(*)(struct ibv_cq_1_0*, int)', 'Param' => { '0' => { 'type' => '307345' }, '1' => { 'type' => '161' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '308570' => { 'BaseType' => '307804', 'Name' => 'struct ibv_recv_wr_1_0**', 'Size' => '8', 'Type' => 'Pointer' }, '308575' => { 'Name' => 'int(*)(struct ibv_srq_1_0*, struct ibv_recv_wr_1_0*, struct ibv_recv_wr_1_0**)', 'Param' => { '0' => { 'type' => '307350' }, '1' => { 'type' => '307804' }, '2' => { 'type' => '308570' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '308605' => { 'BaseType' => '307809', 'Name' => 'struct ibv_qp_1_0*', 'Size' => '8', 'Type' => 'Pointer' }, '308610' => { 'BaseType' => '307733', 'Name' => 'struct ibv_send_wr_1_0**', 'Size' => '8', 'Type' => 'Pointer' }, '308615' => { 'Name' => 'int(*)(struct ibv_qp_1_0*, struct ibv_send_wr_1_0*, struct ibv_send_wr_1_0**)', 'Param' => { '0' => { 'type' => '308605' }, '1' => { 'type' => '307733' }, '2' => { 'type' => '308610' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '308645' => { 'Name' => 'int(*)(struct ibv_qp_1_0*, struct ibv_recv_wr_1_0*, struct ibv_recv_wr_1_0**)', 'Param' => { '0' => { 'type' => '308605' }, '1' => { 'type' => '307804' }, '2' => { 'type' => '308570' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '308680' => { 'Name' => 'int(*)(struct ibv_qp*, union ibv_gid*, uint16_t)', 'Param' => { '0' => { 'type' => '9935' }, '1' => { 'type' => '98836' }, '2' => { 'type' => '941' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '308685' => { 'BaseType' => '308004', 'Name' => 'struct ibv_device_1_0*', 'Size' => '8', 'Type' => 'Pointer' }, '308690' => { 'BaseType' => '308702', 'Header' => undef, 'Line' => '228', 'Name' => 'ibv_driver_init_func_1_1', 'Size' => '8', 'Type' => 'Typedef' }, '308702' => { 'Name' => 'struct ibv_device*(*)(char const*, int)', 'Param' => { '0' => { 'type' => '74950' }, '1' => { 'type' => '161' } }, 'Return' => '17378', 'Size' => '8', 'Type' => 'FuncPtr' }, '309135' => { 'BaseType' => '9734', 'Name' => 'struct ibv_cq**', 'Size' => '8', 'Type' => 'Pointer' }, '309427' => { 'BaseType' => '1037', 'Name' => '__be16*', 'Size' => '8', 'Type' => 'Pointer' }, '3097' => { 'BaseType' => '989', 'Name' => '__u8[5]', 'Size' => '5', 'Type' => 
'Array' }, '309712' => { 'BaseType' => '17378', 'Name' => 'struct ibv_device**', 'Size' => '8', 'Type' => 'Pointer' }, '310808' => { 'BaseType' => '307109', 'Name' => 'struct ibv_qp_init_attr_1_0*', 'Size' => '8', 'Type' => 'Pointer' }, '31177' => { 'BaseType' => '21478', 'Name' => 'struct ibv_resize_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '311815' => { 'BaseType' => '307345', 'Name' => 'struct ibv_cq_1_0**', 'Size' => '8', 'Type' => 'Pointer' }, '31182' => { 'BaseType' => '2377', 'Name' => 'struct ib_uverbs_resize_cq_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '312385' => { 'BaseType' => '306896', 'Name' => 'struct ibv_mr_1_0*', 'Size' => '8', 'Type' => 'Pointer' }, '3136' => { 'Header' => undef, 'Line' => '634', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'user_handle', 'offset' => '8', 'type' => '1025' }, '2' => { 'name' => 'pd_handle', 'offset' => '22', 'type' => '1013' }, '3' => { 'name' => 'qpn', 'offset' => '32', 'type' => '1013' }, '4' => { 'name' => 'qp_type', 'offset' => '36', 'type' => '989' }, '5' => { 'name' => 'reserved', 'offset' => '37', 'type' => '1564' }, '6' => { 'name' => 'driver_data', 'offset' => '50', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_open_qp', 'Size' => '32', 'Type' => 'Struct' }, '315462' => { 'BaseType' => '308685', 'Name' => 'struct ibv_device_1_0**', 'Size' => '8', 'Type' => 'Pointer' }, '32019' => { 'BaseType' => '18545', 'Name' => 'struct ibv_alloc_mw*', 'Size' => '8', 'Type' => 'Pointer' }, '32024' => { 'BaseType' => '2254', 'Name' => 'struct ib_uverbs_alloc_mw_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '32418' => { 'BaseType' => '21344', 'Name' => 'struct ibv_rereg_mr*', 'Size' => '8', 'Type' => 'Pointer' }, '32423' => { 'BaseType' => '2101', 'Name' => 'struct ib_uverbs_rereg_mr_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '3245' => { 'Header' => undef, 'Line' => '645', 'Memb' => { '0' => { 'name' => 'qp_handle', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'qpn', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'max_send_wr', 'offset' => '8', 'type' => '1013' }, '3' => { 'name' => 'max_recv_wr', 'offset' => '18', 'type' => '1013' }, '4' => { 'name' => 'max_send_sge', 'offset' => '22', 'type' => '1013' }, '5' => { 'name' => 'max_recv_sge', 'offset' => '32', 'type' => '1013' }, '6' => { 'name' => 'max_inline_data', 'offset' => '36', 'type' => '1013' }, '7' => { 'name' => 'reserved', 'offset' => '40', 'type' => '1013' }, '8' => { 'name' => 'driver_data', 'offset' => '50', 'type' => '1663' } }, 'Name' => 'struct ib_uverbs_create_qp_resp', 'Size' => '32', 'Type' => 'Struct' }, '32762' => { 'BaseType' => '21035', 'Name' => 'struct ibv_reg_mr*', 'Size' => '8', 'Type' => 'Pointer' }, '32767' => { 'BaseType' => '1895', 'Name' => 'struct ib_uverbs_reg_mr_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '33068' => { 'BaseType' => '19978', 'Name' => 'struct ibv_open_xrcd*', 'Size' => '8', 'Type' => 'Pointer' }, '33073' => { 'BaseType' => '1745', 'Name' => 'struct ib_uverbs_open_xrcd_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '33343' => { 'BaseType' => '18654', 'Name' => 'struct ibv_alloc_pd*', 'Size' => '8', 'Type' => 'Pointer' }, '33348' => { 'BaseType' => '1620', 'Name' => 'struct ib_uverbs_alloc_pd_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '333671' => { 'BaseType' => '65817', 'Name' => 'struct verbs_context_ops const', 'Size' => '608', 'Type' => 'Const' }, '335015' => { 'BaseType' => '333671', 'Name' => 'struct verbs_context_ops const*', 'Size' => '8', 'Type' => 
'Pointer' }, '3401' => { 'Header' => undef, 'Line' => '667', 'Memb' => { '0' => { 'name' => 'dgid', 'offset' => '0', 'type' => '2955' }, '1' => { 'name' => 'flow_label', 'offset' => '22', 'type' => '1013' }, '10' => { 'name' => 'is_global', 'offset' => '48', 'type' => '989' }, '11' => { 'name' => 'port_num', 'offset' => '49', 'type' => '989' }, '2' => { 'name' => 'dlid', 'offset' => '32', 'type' => '1001' }, '3' => { 'name' => 'reserved', 'offset' => '34', 'type' => '1001' }, '4' => { 'name' => 'sgid_index', 'offset' => '36', 'type' => '989' }, '5' => { 'name' => 'hop_limit', 'offset' => '37', 'type' => '989' }, '6' => { 'name' => 'traffic_class', 'offset' => '38', 'type' => '989' }, '7' => { 'name' => 'sl', 'offset' => '39', 'type' => '989' }, '8' => { 'name' => 'src_path_bits', 'offset' => '40', 'type' => '989' }, '9' => { 'name' => 'static_rate', 'offset' => '41', 'type' => '989' } }, 'Name' => 'struct ib_uverbs_qp_dest', 'Size' => '32', 'Type' => 'Struct' }, '34361' => { 'BaseType' => '266', 'Header' => undef, 'Line' => '30', 'Name' => '__s64', 'Size' => '8', 'Type' => 'Typedef' }, '34457' => { 'Header' => undef, 'Line' => '59', 'Memb' => { '0' => { 'name' => 'elem_id', 'offset' => '0', 'type' => '989' }, '1' => { 'name' => 'reserved', 'offset' => '1', 'type' => '989' } }, 'Size' => '2', 'Type' => 'Struct' }, '34493' => { 'Header' => undef, 'Line' => '58', 'Memb' => { '0' => { 'name' => 'enum_data', 'offset' => '0', 'type' => '34457' }, '1' => { 'name' => 'reserved', 'offset' => '0', 'type' => '1001' } }, 'Size' => '2', 'Type' => 'Union' }, '34527' => { 'Header' => undef, 'Line' => '65', 'Memb' => { '0' => { 'name' => 'data', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'data_s64', 'offset' => '0', 'type' => '34361' } }, 'Size' => '8', 'Type' => 'Union' }, '34563' => { 'Header' => undef, 'Line' => '54', 'Memb' => { '0' => { 'name' => 'attr_id', 'offset' => '0', 'type' => '1001' }, '1' => { 'name' => 'len', 'offset' => '2', 'type' => '1001' }, '2' => { 'name' => 'flags', 'offset' => '4', 'type' => '1001' }, '3' => { 'name' => 'attr_data', 'offset' => '6', 'type' => '34493' }, '4' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '34527' } }, 'Name' => 'struct ib_uverbs_attr', 'Size' => '16', 'Type' => 'Struct' }, '34635' => { 'Header' => undef, 'Line' => '76', 'Memb' => { '0' => { 'name' => 'length', 'offset' => '0', 'type' => '1001' }, '1' => { 'name' => 'object_id', 'offset' => '2', 'type' => '1001' }, '2' => { 'name' => 'method_id', 'offset' => '4', 'type' => '1001' }, '3' => { 'name' => 'num_attrs', 'offset' => '6', 'type' => '1001' }, '4' => { 'name' => 'reserved1', 'offset' => '8', 'type' => '1025' }, '5' => { 'name' => 'driver_id', 'offset' => '22', 'type' => '1013' }, '6' => { 'name' => 'reserved2', 'offset' => '32', 'type' => '1013' }, '7' => { 'name' => 'attrs', 'offset' => '36', 'type' => '34750' } }, 'Name' => 'struct ib_uverbs_ioctl_hdr', 'Size' => '24', 'Type' => 'Struct' }, '34750' => { 'BaseType' => '34563', 'Name' => 'struct ib_uverbs_attr[]', 'Size' => '8', 'Type' => 'Array' }, '348' => { 'Name' => 'unsigned long long', 'Size' => '8', 'Type' => 'Intrinsic' }, '3583' => { 'Header' => undef, 'Line' => '682', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'qp_handle', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'attr_mask', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_query_qp', 'Size' => '16', 'Type' 
=> 'Struct' }, '39865' => { 'Header' => undef, 'Line' => '85', 'Memb' => { '0' => { 'name' => 'next', 'offset' => '0', 'type' => '40016' }, '1' => { 'name' => 'next_attr', 'offset' => '8', 'type' => '40021' }, '2' => { 'name' => 'last_attr', 'offset' => '22', 'type' => '40021' }, '3' => { 'name' => 'uhw_in_idx', 'offset' => '36', 'type' => '929' }, '4' => { 'name' => 'uhw_out_idx', 'offset' => '37', 'type' => '929' }, '5' => { 'name' => 'uhw_in_headroom_dwords', 'offset' => '38', 'type' => '929' }, '6' => { 'name' => 'uhw_out_headroom_dwords', 'offset' => '39', 'type' => '929' }, '7' => { 'name' => 'hdr', 'offset' => '50', 'type' => '34635' } }, 'Name' => 'struct ibv_command_buffer', 'Size' => '56', 'Type' => 'Struct' }, '40016' => { 'BaseType' => '39865', 'Name' => 'struct ibv_command_buffer*', 'Size' => '8', 'Type' => 'Pointer' }, '40021' => { 'BaseType' => '34563', 'Name' => 'struct ib_uverbs_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '404779' => { 'Header' => undef, 'Line' => '548', 'Memb' => { '0' => { 'name' => 'qp_attr_mask', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'qp_state', 'offset' => '4', 'type' => '1013' }, '10' => { 'name' => 'ah_attr', 'offset' => '64', 'type' => '2971' }, '11' => { 'name' => 'alt_ah_attr', 'offset' => '114', 'type' => '2971' }, '12' => { 'name' => 'max_send_wr', 'offset' => '260', 'type' => '1013' }, '13' => { 'name' => 'max_recv_wr', 'offset' => '264', 'type' => '1013' }, '14' => { 'name' => 'max_send_sge', 'offset' => '274', 'type' => '1013' }, '15' => { 'name' => 'max_recv_sge', 'offset' => '278', 'type' => '1013' }, '16' => { 'name' => 'max_inline_data', 'offset' => '288', 'type' => '1013' }, '17' => { 'name' => 'pkey_index', 'offset' => '292', 'type' => '1001' }, '18' => { 'name' => 'alt_pkey_index', 'offset' => '294', 'type' => '1001' }, '19' => { 'name' => 'en_sqd_async_notify', 'offset' => '296', 'type' => '989' }, '2' => { 'name' => 'cur_qp_state', 'offset' => '8', 'type' => '1013' }, '20' => { 'name' => 'sq_draining', 'offset' => '297', 'type' => '989' }, '21' => { 'name' => 'max_rd_atomic', 'offset' => '304', 'type' => '989' }, '22' => { 'name' => 'max_dest_rd_atomic', 'offset' => '305', 'type' => '989' }, '23' => { 'name' => 'min_rnr_timer', 'offset' => '306', 'type' => '989' }, '24' => { 'name' => 'port_num', 'offset' => '307', 'type' => '989' }, '25' => { 'name' => 'timeout', 'offset' => '308', 'type' => '989' }, '26' => { 'name' => 'retry_cnt', 'offset' => '309', 'type' => '989' }, '27' => { 'name' => 'rnr_retry', 'offset' => '310', 'type' => '989' }, '28' => { 'name' => 'alt_port_num', 'offset' => '311', 'type' => '989' }, '29' => { 'name' => 'alt_timeout', 'offset' => '312', 'type' => '989' }, '3' => { 'name' => 'path_mtu', 'offset' => '18', 'type' => '1013' }, '30' => { 'name' => 'reserved', 'offset' => '313', 'type' => '3097' }, '4' => { 'name' => 'path_mig_state', 'offset' => '22', 'type' => '1013' }, '5' => { 'name' => 'qkey', 'offset' => '32', 'type' => '1013' }, '6' => { 'name' => 'rq_psn', 'offset' => '36', 'type' => '1013' }, '7' => { 'name' => 'sq_psn', 'offset' => '40', 'type' => '1013' }, '8' => { 'name' => 'dest_qp_num', 'offset' => '50', 'type' => '1013' }, '9' => { 'name' => 'qp_access_flags', 'offset' => '54', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_qp_attr', 'Size' => '144', 'Type' => 'Struct' }, '406145' => { 'Header' => undef, 'Line' => '40', 'Memb' => { '0' => { 'name' => 'dgid', 'offset' => '0', 'type' => '8669' }, '1' => { 'name' => 'sgid', 'offset' => '22', 'type' => '8669' }, '10' => { 
'name' => 'pkey', 'offset' => '84', 'type' => '1037' }, '11' => { 'name' => 'sl', 'offset' => '86', 'type' => '929' }, '12' => { 'name' => 'mtu_selector', 'offset' => '87', 'type' => '929' }, '13' => { 'name' => 'mtu', 'offset' => '88', 'type' => '929' }, '14' => { 'name' => 'rate_selector', 'offset' => '89', 'type' => '929' }, '15' => { 'name' => 'rate', 'offset' => '96', 'type' => '929' }, '16' => { 'name' => 'packet_life_time_selector', 'offset' => '97', 'type' => '929' }, '17' => { 'name' => 'packet_life_time', 'offset' => '98', 'type' => '929' }, '18' => { 'name' => 'preference', 'offset' => '99', 'type' => '929' }, '2' => { 'name' => 'dlid', 'offset' => '50', 'type' => '1037' }, '3' => { 'name' => 'slid', 'offset' => '52', 'type' => '1037' }, '4' => { 'name' => 'raw_traffic', 'offset' => '54', 'type' => '161' }, '5' => { 'name' => 'flow_label', 'offset' => '64', 'type' => '1049' }, '6' => { 'name' => 'hop_limit', 'offset' => '68', 'type' => '929' }, '7' => { 'name' => 'traffic_class', 'offset' => '69', 'type' => '929' }, '8' => { 'name' => 'reversible', 'offset' => '72', 'type' => '161' }, '9' => { 'name' => 'numb_path', 'offset' => '82', 'type' => '929' } }, 'Name' => 'struct ibv_sa_path_rec', 'Size' => '64', 'Type' => 'Struct' }, '406404' => { 'Header' => undef, 'Line' => '55', 'Memb' => { '0' => { 'name' => 'dgid', 'offset' => '0', 'type' => '2955' }, '1' => { 'name' => 'sgid', 'offset' => '22', 'type' => '2955' }, '10' => { 'name' => 'traffic_class', 'offset' => '85', 'type' => '989' }, '11' => { 'name' => 'numb_path', 'offset' => '86', 'type' => '989' }, '12' => { 'name' => 'sl', 'offset' => '87', 'type' => '989' }, '13' => { 'name' => 'mtu_selector', 'offset' => '88', 'type' => '989' }, '14' => { 'name' => 'rate_selector', 'offset' => '89', 'type' => '989' }, '15' => { 'name' => 'rate', 'offset' => '96', 'type' => '989' }, '16' => { 'name' => 'packet_life_time_selector', 'offset' => '97', 'type' => '989' }, '17' => { 'name' => 'packet_life_time', 'offset' => '98', 'type' => '989' }, '18' => { 'name' => 'preference', 'offset' => '99', 'type' => '989' }, '2' => { 'name' => 'dlid', 'offset' => '50', 'type' => '1037' }, '3' => { 'name' => 'slid', 'offset' => '52', 'type' => '1037' }, '4' => { 'name' => 'raw_traffic', 'offset' => '54', 'type' => '1013' }, '5' => { 'name' => 'flow_label', 'offset' => '64', 'type' => '1049' }, '6' => { 'name' => 'reversible', 'offset' => '68', 'type' => '1013' }, '7' => { 'name' => 'mtu', 'offset' => '72', 'type' => '1013' }, '8' => { 'name' => 'pkey', 'offset' => '82', 'type' => '1037' }, '9' => { 'name' => 'hop_limit', 'offset' => '84', 'type' => '989' } }, 'Name' => 'struct ib_user_path_rec', 'Size' => '64', 'Type' => 'Struct' }, '406864' => { 'BaseType' => '406404', 'Name' => 'struct ib_user_path_rec*', 'Size' => '8', 'Type' => 'Pointer' }, '406869' => { 'BaseType' => '406145', 'Name' => 'struct ibv_sa_path_rec*', 'Size' => '8', 'Type' => 'Pointer' }, '407199' => { 'BaseType' => '404779', 'Name' => 'struct ib_uverbs_qp_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '407335' => { 'BaseType' => '2971', 'Name' => 'struct ib_uverbs_ah_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '4098' => { 'Header' => undef, 'Line' => '723', 'Memb' => { '0' => { 'name' => 'dest', 'offset' => '0', 'type' => '3401' }, '1' => { 'name' => 'alt_dest', 'offset' => '50', 'type' => '3401' }, '10' => { 'name' => 'alt_pkey_index', 'offset' => '148', 'type' => '1001' }, '11' => { 'name' => 'qp_state', 'offset' => '150', 'type' => '989' }, '12' => { 'name' => 'cur_qp_state', 
'offset' => '151', 'type' => '989' }, '13' => { 'name' => 'path_mtu', 'offset' => '152', 'type' => '989' }, '14' => { 'name' => 'path_mig_state', 'offset' => '153', 'type' => '989' }, '15' => { 'name' => 'en_sqd_async_notify', 'offset' => '256', 'type' => '989' }, '16' => { 'name' => 'max_rd_atomic', 'offset' => '257', 'type' => '989' }, '17' => { 'name' => 'max_dest_rd_atomic', 'offset' => '258', 'type' => '989' }, '18' => { 'name' => 'min_rnr_timer', 'offset' => '259', 'type' => '989' }, '19' => { 'name' => 'port_num', 'offset' => '260', 'type' => '989' }, '2' => { 'name' => 'qp_handle', 'offset' => '100', 'type' => '1013' }, '20' => { 'name' => 'timeout', 'offset' => '261', 'type' => '989' }, '21' => { 'name' => 'retry_cnt', 'offset' => '262', 'type' => '989' }, '22' => { 'name' => 'rnr_retry', 'offset' => '263', 'type' => '989' }, '23' => { 'name' => 'alt_port_num', 'offset' => '264', 'type' => '989' }, '24' => { 'name' => 'alt_timeout', 'offset' => '265', 'type' => '989' }, '25' => { 'name' => 'reserved', 'offset' => '272', 'type' => '4489' }, '26' => { 'name' => 'driver_data', 'offset' => '274', 'type' => '1549' }, '3' => { 'name' => 'attr_mask', 'offset' => '104', 'type' => '1013' }, '4' => { 'name' => 'qkey', 'offset' => '114', 'type' => '1013' }, '5' => { 'name' => 'rq_psn', 'offset' => '118', 'type' => '1013' }, '6' => { 'name' => 'sq_psn', 'offset' => '128', 'type' => '1013' }, '7' => { 'name' => 'dest_qp_num', 'offset' => '132', 'type' => '1013' }, '8' => { 'name' => 'qp_access_flags', 'offset' => '136', 'type' => '1013' }, '9' => { 'name' => 'pkey_index', 'offset' => '146', 'type' => '1001' } }, 'Name' => 'struct ib_uverbs_modify_qp', 'Size' => '112', 'Type' => 'Struct' }, '410188' => { 'Header' => undef, 'Line' => '142', 'Memb' => { '0' => { 'name' => 'IBV_FORK_DISABLED', 'value' => '0' }, '1' => { 'name' => 'IBV_FORK_ENABLED', 'value' => '1' }, '2' => { 'name' => 'IBV_FORK_UNNEEDED', 'value' => '2' } }, 'Name' => 'enum ibv_fork_status', 'Size' => '4', 'Type' => 'Enum' }, '443360' => { 'Header' => undef, 'Line' => '699', 'Memb' => { '0' => { 'name' => 'version_tclass_flow', 'offset' => '0', 'type' => '1049' }, '1' => { 'name' => 'paylen', 'offset' => '4', 'type' => '1037' }, '2' => { 'name' => 'next_hdr', 'offset' => '6', 'type' => '929' }, '3' => { 'name' => 'hop_limit', 'offset' => '7', 'type' => '929' }, '4' => { 'name' => 'sgid', 'offset' => '8', 'type' => '8669' }, '5' => { 'name' => 'dgid', 'offset' => '36', 'type' => '8669' } }, 'Name' => 'struct ibv_grh', 'Size' => '40', 'Type' => 'Struct' }, '443458' => { 'Header' => undef, 'Line' => '708', 'Memb' => { '0' => { 'name' => 'IBV_RATE_MAX', 'value' => '0' }, '1' => { 'name' => 'IBV_RATE_2_5_GBPS', 'value' => '2' }, '10' => { 'name' => 'IBV_RATE_14_GBPS', 'value' => '11' }, '11' => { 'name' => 'IBV_RATE_56_GBPS', 'value' => '12' }, '12' => { 'name' => 'IBV_RATE_112_GBPS', 'value' => '13' }, '13' => { 'name' => 'IBV_RATE_168_GBPS', 'value' => '14' }, '14' => { 'name' => 'IBV_RATE_25_GBPS', 'value' => '15' }, '15' => { 'name' => 'IBV_RATE_100_GBPS', 'value' => '16' }, '16' => { 'name' => 'IBV_RATE_200_GBPS', 'value' => '17' }, '17' => { 'name' => 'IBV_RATE_300_GBPS', 'value' => '18' }, '18' => { 'name' => 'IBV_RATE_28_GBPS', 'value' => '19' }, '19' => { 'name' => 'IBV_RATE_50_GBPS', 'value' => '20' }, '2' => { 'name' => 'IBV_RATE_5_GBPS', 'value' => '5' }, '20' => { 'name' => 'IBV_RATE_400_GBPS', 'value' => '21' }, '21' => { 'name' => 'IBV_RATE_600_GBPS', 'value' => '22' }, '22' => { 'name' => 'IBV_RATE_800_GBPS', 'value' 
=> '23' }, '23' => { 'name' => 'IBV_RATE_1200_GBPS', 'value' => '24' }, '3' => { 'name' => 'IBV_RATE_10_GBPS', 'value' => '3' }, '4' => { 'name' => 'IBV_RATE_20_GBPS', 'value' => '6' }, '5' => { 'name' => 'IBV_RATE_30_GBPS', 'value' => '4' }, '6' => { 'name' => 'IBV_RATE_40_GBPS', 'value' => '7' }, '7' => { 'name' => 'IBV_RATE_60_GBPS', 'value' => '8' }, '8' => { 'name' => 'IBV_RATE_80_GBPS', 'value' => '9' }, '9' => { 'name' => 'IBV_RATE_120_GBPS', 'value' => '10' } }, 'Name' => 'enum ibv_rate', 'Size' => '4', 'Type' => 'Enum' }, '4489' => { 'BaseType' => '989', 'Name' => '__u8[2]', 'Size' => '2', 'Type' => 'Array' }, '4520' => { 'Header' => undef, 'Line' => '753', 'Memb' => { '0' => { 'name' => 'base', 'offset' => '0', 'type' => '4098' }, '1' => { 'name' => 'rate_limit', 'offset' => '274', 'type' => '1013' }, '2' => { 'name' => 'reserved', 'offset' => '278', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_ex_modify_qp', 'Size' => '120', 'Type' => 'Struct' }, '4575' => { 'Header' => undef, 'Line' => '759', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'response_length', 'offset' => '4', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_ex_modify_qp_resp', 'Size' => '8', 'Type' => 'Struct' }, '458719' => { 'BaseType' => '941', 'Name' => 'uint16_t*', 'Size' => '8', 'Type' => 'Pointer' }, '459688' => { 'BaseType' => '443360', 'Name' => 'struct ibv_grh*', 'Size' => '8', 'Type' => 'Pointer' }, '46' => { 'Name' => 'unsigned long', 'Size' => '8', 'Type' => 'Intrinsic' }, '46756' => { 'Header' => undef, 'Line' => '2101', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '953' } }, 'Name' => 'struct ibv_counters_init_attr', 'Size' => '4', 'Type' => 'Struct' }, '46784' => { 'BaseType' => '965', 'Name' => 'uint64_t*', 'Size' => '8', 'Type' => 'Pointer' }, '46789' => { 'BaseType' => '46756', 'Name' => 'struct ibv_counters_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '48314' => { 'BaseType' => '23141', 'Name' => 'struct verbs_counters*', 'Size' => '8', 'Type' => 'Pointer' }, '50263' => { 'BaseType' => '185', 'Header' => undef, 'Line' => '160', 'Name' => '__time_t', 'Size' => '8', 'Type' => 'Typedef' }, '50282' => { 'BaseType' => '185', 'Header' => undef, 'Line' => '197', 'Name' => '__syscall_slong_t', 'Size' => '8', 'Type' => 'Typedef' }, '50390' => { 'BaseType' => '161', 'Header' => undef, 'Line' => '26', 'Name' => '__s32', 'Size' => '4', 'Type' => 'Typedef' }, '50831' => { 'Header' => undef, 'Line' => '11', 'Memb' => { '0' => { 'name' => 'tv_sec', 'offset' => '0', 'type' => '50263' }, '1' => { 'name' => 'tv_nsec', 'offset' => '8', 'type' => '50282' } }, 'Name' => 'struct timespec', 'Size' => '16', 'Type' => 'Struct' }, '51885' => { 'Header' => undef, 'Line' => '416', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'user_handle', 'offset' => '8', 'type' => '1025' }, '2' => { 'name' => 'cqe', 'offset' => '22', 'type' => '1013' }, '3' => { 'name' => 'comp_vector', 'offset' => '32', 'type' => '1013' }, '4' => { 'name' => 'comp_channel', 'offset' => '36', 'type' => '50390' }, '5' => { 'name' => 'reserved', 'offset' => '40', 'type' => '1013' }, '6' => { 'name' => 'driver_data', 'offset' => '50', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_create_cq', 'Size' => '32', 'Type' => 'Struct' }, '52022' => { 'Header' => undef, 'Line' => '431', 'Memb' => { '0' => { 'name' => 'user_handle', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'cqe', 'offset' => '8', 'type' 
=> '1013' }, '2' => { 'name' => 'comp_vector', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'comp_channel', 'offset' => '22', 'type' => '50390' }, '4' => { 'name' => 'comp_mask', 'offset' => '32', 'type' => '1013' }, '5' => { 'name' => 'flags', 'offset' => '36', 'type' => '1013' }, '6' => { 'name' => 'reserved', 'offset' => '40', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_ex_create_cq', 'Size' => '32', 'Type' => 'Struct' }, '52132' => { 'Header' => undef, 'Line' => '441', 'Memb' => { '0' => { 'name' => 'cq_handle', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'cqe', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'driver_data', 'offset' => '8', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_create_cq_resp', 'Size' => '8', 'Type' => 'Struct' }, '52202' => { 'Header' => undef, 'Line' => '447', 'Memb' => { '0' => { 'name' => 'base', 'offset' => '0', 'type' => '52132' }, '1' => { 'name' => 'comp_mask', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'response_length', 'offset' => '18', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_ex_create_cq_resp', 'Size' => '16', 'Type' => 'Struct' }, '52352' => { 'Header' => undef, 'Line' => '146', 'Memb' => { '0' => { 'name' => 'IB_UVERBS_FLOW_ACTION_ESP_KEYMAT_AES_GCM', 'value' => '0' } }, 'Name' => 'enum ib_uverbs_flow_action_esp_keymat', 'Size' => '4', 'Type' => 'Enum' }, '52375' => { 'Header' => undef, 'Line' => '165', 'Memb' => { '0' => { 'name' => 'IB_UVERBS_FLOW_ACTION_ESP_REPLAY_NONE', 'value' => '0' }, '1' => { 'name' => 'IB_UVERBS_FLOW_ACTION_ESP_REPLAY_BMP', 'value' => '1' } }, 'Name' => 'enum ib_uverbs_flow_action_esp_replay', 'Size' => '4', 'Type' => 'Enum' }, '52404' => { 'Header' => undef, 'Line' => '191', 'Memb' => { '0' => { 'name' => 'val_ptr', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'val_ptr_data_u64', 'offset' => '0', 'type' => '1025' } }, 'Size' => '8', 'Type' => 'Union' }, '52438' => { 'Header' => undef, 'Line' => '192', 'Memb' => { '0' => { 'name' => 'next_ptr', 'offset' => '0', 'type' => '52524' }, '1' => { 'name' => 'next_ptr_data_u64', 'offset' => '0', 'type' => '1025' } }, 'Size' => '8', 'Type' => 'Union' }, '52472' => { 'Header' => undef, 'Line' => '187', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '52404' }, '1' => { 'name' => 'unnamed1', 'offset' => '8', 'type' => '52438' }, '2' => { 'name' => 'len', 'offset' => '22', 'type' => '1001' }, '3' => { 'name' => 'type', 'offset' => '24', 'type' => '1001' } }, 'Name' => 'struct ib_uverbs_flow_action_esp_encap', 'Size' => '24', 'Type' => 'Struct' }, '52524' => { 'BaseType' => '52472', 'Name' => 'struct ib_uverbs_flow_action_esp_encap*', 'Size' => '8', 'Type' => 'Pointer' }, '52529' => { 'Header' => undef, 'Line' => '197', 'Memb' => { '0' => { 'name' => 'spi', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'seq', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'tfc_pad', 'offset' => '8', 'type' => '1013' }, '3' => { 'name' => 'flags', 'offset' => '18', 'type' => '1013' }, '4' => { 'name' => 'hard_limit_pkts', 'offset' => '22', 'type' => '1025' } }, 'Name' => 'struct ib_uverbs_flow_action_esp', 'Size' => '24', 'Type' => 'Struct' }, '52608' => { 'Header' => undef, 'Line' => '210', 'Memb' => { '0' => { 'name' => 'IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH', 'value' => '0' }, '1' => { 'name' => 'IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_WRITE', 'value' => '1' }, '2' => { 'name' => 'IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT', 'value' => '2' } }, 'Name' => 'enum ib_uverbs_advise_mr_advice', 
'Size' => '4', 'Type' => 'Enum' }, '52889' => { 'Header' => undef, 'Line' => '161', 'Memb' => { '0' => { 'name' => 'length', 'offset' => '0', 'type' => '53' }, '1' => { 'name' => 'log_align_req', 'offset' => '8', 'type' => '953' }, '2' => { 'name' => 'comp_mask', 'offset' => '18', 'type' => '953' } }, 'Name' => 'struct ibv_alloc_dm_attr', 'Size' => '16', 'Type' => 'Struct' }, '52942' => { 'Header' => undef, 'Line' => '171', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' }, '1' => { 'name' => 'memcpy_to_dm', 'offset' => '8', 'type' => '53190' }, '2' => { 'name' => 'memcpy_from_dm', 'offset' => '22', 'type' => '53225' }, '3' => { 'name' => 'comp_mask', 'offset' => '36', 'type' => '953' }, '4' => { 'name' => 'handle', 'offset' => '40', 'type' => '953' } }, 'Name' => 'struct ibv_dm', 'Size' => '32', 'Type' => 'Struct' }, '53' => { 'BaseType' => '46', 'Header' => undef, 'Line' => '214', 'Name' => 'size_t', 'Size' => '8', 'Type' => 'Typedef' }, '53174' => { 'BaseType' => '52942', 'Name' => 'struct ibv_dm*', 'Size' => '8', 'Type' => 'Pointer' }, '53190' => { 'Name' => 'int(*)(struct ibv_dm*, uint64_t, void const*, size_t)', 'Param' => { '0' => { 'type' => '53174' }, '1' => { 'type' => '965' }, '2' => { 'type' => '918' }, '3' => { 'type' => '53' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '53225' => { 'Name' => 'int(*)(void*, struct ibv_dm*, uint64_t, size_t)', 'Param' => { '0' => { 'type' => '82' }, '1' => { 'type' => '53174' }, '2' => { 'type' => '965' }, '3' => { 'type' => '53' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '53780' => { 'Header' => undef, 'Line' => '227', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '953' } }, 'Name' => 'struct ibv_query_device_ex_input', 'Size' => '4', 'Type' => 'Struct' }, '53807' => { 'BaseType' => '53780', 'Name' => 'struct ibv_query_device_ex_input const', 'Size' => '4', 'Type' => 'Const' }, '53812' => { 'Header' => undef, 'Line' => '242', 'Memb' => { '0' => { 'name' => 'rc_odp_caps', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'uc_odp_caps', 'offset' => '4', 'type' => '953' }, '2' => { 'name' => 'ud_odp_caps', 'offset' => '8', 'type' => '953' } }, 'Size' => '12', 'Type' => 'Struct' }, '53861' => { 'Header' => undef, 'Line' => '240', 'Memb' => { '0' => { 'name' => 'general_caps', 'offset' => '0', 'type' => '965' }, '1' => { 'name' => 'per_transport_caps', 'offset' => '8', 'type' => '53812' } }, 'Name' => 'struct ibv_odp_caps', 'Size' => '24', 'Type' => 'Struct' }, '53901' => { 'Header' => undef, 'Line' => '254', 'Memb' => { '0' => { 'name' => 'max_tso', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'supported_qpts', 'offset' => '4', 'type' => '953' } }, 'Name' => 'struct ibv_tso_caps', 'Size' => '8', 'Type' => 'Struct' }, '53942' => { 'Header' => undef, 'Line' => '285', 'Memb' => { '0' => { 'name' => 'supported_qpts', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'max_rwq_indirection_tables', 'offset' => '4', 'type' => '953' }, '2' => { 'name' => 'max_rwq_indirection_table_size', 'offset' => '8', 'type' => '953' }, '3' => { 'name' => 'rx_hash_fields_mask', 'offset' => '22', 'type' => '965' }, '4' => { 'name' => 'rx_hash_function', 'offset' => '36', 'type' => '929' } }, 'Name' => 'struct ibv_rss_caps', 'Size' => '32', 'Type' => 'Struct' }, '54026' => { 'Header' => undef, 'Line' => '293', 'Memb' => { '0' => { 'name' => 'qp_rate_limit_min', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'qp_rate_limit_max', 'offset' => '4', 'type' => 
'953' }, '2' => { 'name' => 'supported_qpts', 'offset' => '8', 'type' => '953' } }, 'Name' => 'struct ibv_packet_pacing_caps', 'Size' => '12', 'Type' => 'Struct' }, '54082' => { 'Header' => undef, 'Line' => '310', 'Memb' => { '0' => { 'name' => 'max_rndv_hdr_size', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'max_num_tags', 'offset' => '4', 'type' => '953' }, '2' => { 'name' => 'flags', 'offset' => '8', 'type' => '953' }, '3' => { 'name' => 'max_ops', 'offset' => '18', 'type' => '953' }, '4' => { 'name' => 'max_sge', 'offset' => '22', 'type' => '953' } }, 'Name' => 'struct ibv_tm_caps', 'Size' => '20', 'Type' => 'Struct' }, '54166' => { 'Header' => undef, 'Line' => '323', 'Memb' => { '0' => { 'name' => 'max_cq_count', 'offset' => '0', 'type' => '941' }, '1' => { 'name' => 'max_cq_period', 'offset' => '2', 'type' => '941' } }, 'Name' => 'struct ibv_cq_moderation_caps', 'Size' => '4', 'Type' => 'Struct' }, '54208' => { 'Header' => undef, 'Line' => '338', 'Memb' => { '0' => { 'name' => 'fetch_add', 'offset' => '0', 'type' => '941' }, '1' => { 'name' => 'swap', 'offset' => '2', 'type' => '941' }, '2' => { 'name' => 'compare_swap', 'offset' => '4', 'type' => '941' } }, 'Name' => 'struct ibv_pci_atomic_caps', 'Size' => '6', 'Type' => 'Struct' }, '54264' => { 'Header' => undef, 'Line' => '344', 'Memb' => { '0' => { 'name' => 'orig_attr', 'offset' => '0', 'type' => '8996' }, '1' => { 'name' => 'comp_mask', 'offset' => '562', 'type' => '953' }, '10' => { 'name' => 'raw_packet_caps', 'offset' => '836', 'type' => '953' }, '11' => { 'name' => 'tm_caps', 'offset' => '840', 'type' => '54082' }, '12' => { 'name' => 'cq_mod_caps', 'offset' => '872', 'type' => '54166' }, '13' => { 'name' => 'max_dm_size', 'offset' => '886', 'type' => '965' }, '14' => { 'name' => 'pci_atomic_caps', 'offset' => '900', 'type' => '54208' }, '15' => { 'name' => 'xrc_odp_caps', 'offset' => '914', 'type' => '953' }, '16' => { 'name' => 'phys_port_cnt_ex', 'offset' => '918', 'type' => '953' }, '2' => { 'name' => 'odp_caps', 'offset' => '576', 'type' => '53861' }, '3' => { 'name' => 'completion_timestamp_mask', 'offset' => '612', 'type' => '965' }, '4' => { 'name' => 'hca_core_clock', 'offset' => '626', 'type' => '965' }, '5' => { 'name' => 'device_cap_flags_ex', 'offset' => '640', 'type' => '965' }, '6' => { 'name' => 'tso_caps', 'offset' => '648', 'type' => '53901' }, '7' => { 'name' => 'rss_caps', 'offset' => '662', 'type' => '53942' }, '8' => { 'name' => 'max_wq_type_rq', 'offset' => '808', 'type' => '953' }, '9' => { 'name' => 'packet_pacing_caps', 'offset' => '818', 'type' => '54026' } }, 'Name' => 'struct ibv_device_attr_ex', 'Size' => '400', 'Type' => 'Struct' }, '54578' => { 'Header' => undef, 'Line' => '372', 'Memb' => { '0' => { 'name' => 'IBV_PORT_NOP', 'value' => '0' }, '1' => { 'name' => 'IBV_PORT_DOWN', 'value' => '1' }, '2' => { 'name' => 'IBV_PORT_INIT', 'value' => '2' }, '3' => { 'name' => 'IBV_PORT_ARMED', 'value' => '3' }, '4' => { 'name' => 'IBV_PORT_ACTIVE', 'value' => '4' }, '5' => { 'name' => 'IBV_PORT_ACTIVE_DEFER', 'value' => '5' } }, 'Name' => 'enum ibv_port_state', 'Size' => '4', 'Type' => 'Enum' }, '54631' => { 'Header' => undef, 'Line' => '425', 'Memb' => { '0' => { 'name' => 'state', 'offset' => '0', 'type' => '54578' }, '1' => { 'name' => 'max_mtu', 'offset' => '4', 'type' => '9546' }, '10' => { 'name' => 'sm_lid', 'offset' => '54', 'type' => '941' }, '11' => { 'name' => 'lmc', 'offset' => '56', 'type' => '929' }, '12' => { 'name' => 'max_vl_num', 'offset' => '57', 'type' => '929' }, '13' 
=> { 'name' => 'sm_sl', 'offset' => '64', 'type' => '929' }, '14' => { 'name' => 'subnet_timeout', 'offset' => '65', 'type' => '929' }, '15' => { 'name' => 'init_type_reply', 'offset' => '66', 'type' => '929' }, '16' => { 'name' => 'active_width', 'offset' => '67', 'type' => '929' }, '17' => { 'name' => 'active_speed', 'offset' => '68', 'type' => '929' }, '18' => { 'name' => 'phys_state', 'offset' => '69', 'type' => '929' }, '19' => { 'name' => 'link_layer', 'offset' => '70', 'type' => '929' }, '2' => { 'name' => 'active_mtu', 'offset' => '8', 'type' => '9546' }, '20' => { 'name' => 'flags', 'offset' => '71', 'type' => '929' }, '21' => { 'name' => 'port_cap_flags2', 'offset' => '72', 'type' => '941' }, '22' => { 'name' => 'active_speed_ex', 'offset' => '82', 'type' => '953' }, '3' => { 'name' => 'gid_tbl_len', 'offset' => '18', 'type' => '161' }, '4' => { 'name' => 'port_cap_flags', 'offset' => '22', 'type' => '953' }, '5' => { 'name' => 'max_msg_sz', 'offset' => '32', 'type' => '953' }, '6' => { 'name' => 'bad_pkey_cntr', 'offset' => '36', 'type' => '953' }, '7' => { 'name' => 'qkey_viol_cntr', 'offset' => '40', 'type' => '953' }, '8' => { 'name' => 'pkey_tbl_len', 'offset' => '50', 'type' => '941' }, '9' => { 'name' => 'lid', 'offset' => '52', 'type' => '941' } }, 'Name' => 'struct ibv_port_attr', 'Size' => '56', 'Type' => 'Struct' }, '54967' => { 'Header' => undef, 'Line' => '451', 'Memb' => { '0' => { 'name' => 'IBV_EVENT_CQ_ERR', 'value' => '0' }, '1' => { 'name' => 'IBV_EVENT_QP_FATAL', 'value' => '1' }, '10' => { 'name' => 'IBV_EVENT_PORT_ERR', 'value' => '10' }, '11' => { 'name' => 'IBV_EVENT_LID_CHANGE', 'value' => '11' }, '12' => { 'name' => 'IBV_EVENT_PKEY_CHANGE', 'value' => '12' }, '13' => { 'name' => 'IBV_EVENT_SM_CHANGE', 'value' => '13' }, '14' => { 'name' => 'IBV_EVENT_SRQ_ERR', 'value' => '14' }, '15' => { 'name' => 'IBV_EVENT_SRQ_LIMIT_REACHED', 'value' => '15' }, '16' => { 'name' => 'IBV_EVENT_QP_LAST_WQE_REACHED', 'value' => '16' }, '17' => { 'name' => 'IBV_EVENT_CLIENT_REREGISTER', 'value' => '17' }, '18' => { 'name' => 'IBV_EVENT_GID_CHANGE', 'value' => '18' }, '19' => { 'name' => 'IBV_EVENT_WQ_FATAL', 'value' => '19' }, '2' => { 'name' => 'IBV_EVENT_QP_REQ_ERR', 'value' => '2' }, '3' => { 'name' => 'IBV_EVENT_QP_ACCESS_ERR', 'value' => '3' }, '4' => { 'name' => 'IBV_EVENT_COMM_EST', 'value' => '4' }, '5' => { 'name' => 'IBV_EVENT_SQ_DRAINED', 'value' => '5' }, '6' => { 'name' => 'IBV_EVENT_PATH_MIG', 'value' => '6' }, '7' => { 'name' => 'IBV_EVENT_PATH_MIG_ERR', 'value' => '7' }, '8' => { 'name' => 'IBV_EVENT_DEVICE_FATAL', 'value' => '8' }, '9' => { 'name' => 'IBV_EVENT_PORT_ACTIVE', 'value' => '9' } }, 'Name' => 'enum ibv_event_type', 'Size' => '4', 'Type' => 'Enum' }, '55104' => { 'Header' => undef, 'Line' => '475', 'Memb' => { '0' => { 'name' => 'cq', 'offset' => '0', 'type' => '9734' }, '1' => { 'name' => 'qp', 'offset' => '0', 'type' => '9935' }, '2' => { 'name' => 'srq', 'offset' => '0', 'type' => '10052' }, '3' => { 'name' => 'wq', 'offset' => '0', 'type' => '10252' }, '4' => { 'name' => 'port_num', 'offset' => '0', 'type' => '161' } }, 'Size' => '8', 'Type' => 'Union' }, '5543' => { 'Header' => undef, 'Line' => '891', 'Memb' => { '0' => { 'name' => 'ah_handle', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'driver_data', 'offset' => '4', 'type' => '1663' } }, 'Name' => 'struct ib_uverbs_create_ah_resp', 'Size' => '4', 'Type' => 'Struct' }, '55830' => { 'Header' => undef, 'Line' => '474', 'Memb' => { '0' => { 'name' => 'element', 'offset' => '0', 
'type' => '55104' }, '1' => { 'name' => 'event_type', 'offset' => '8', 'type' => '54967' } }, 'Name' => 'struct ibv_async_event', 'Size' => '16', 'Type' => 'Struct' }, '56710' => { 'Header' => undef, 'Line' => '637', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '953' } }, 'Name' => 'struct ibv_td_init_attr', 'Size' => '4', 'Type' => 'Struct' }, '56738' => { 'Header' => undef, 'Line' => '641', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' } }, 'Name' => 'struct ibv_td', 'Size' => '8', 'Type' => 'Struct' }, '57217' => { 'Header' => undef, 'Line' => '783', 'Memb' => { '0' => { 'name' => 'srq_context', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'attr', 'offset' => '8', 'type' => '11710' } }, 'Name' => 'struct ibv_srq_init_attr', 'Size' => '24', 'Type' => 'Struct' }, '57259' => { 'Header' => undef, 'Line' => '788', 'Memb' => { '0' => { 'name' => 'IBV_SRQT_BASIC', 'value' => '0' }, '1' => { 'name' => 'IBV_SRQT_XRC', 'value' => '1' }, '2' => { 'name' => 'IBV_SRQT_TM', 'value' => '2' } }, 'Name' => 'enum ibv_srq_type', 'Size' => '4', 'Type' => 'Enum' }, '57294' => { 'Header' => undef, 'Line' => '803', 'Memb' => { '0' => { 'name' => 'max_num_tags', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'max_ops', 'offset' => '4', 'type' => '953' } }, 'Name' => 'struct ibv_tm_cap', 'Size' => '8', 'Type' => 'Struct' }, '57336' => { 'Header' => undef, 'Line' => '808', 'Memb' => { '0' => { 'name' => 'srq_context', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'attr', 'offset' => '8', 'type' => '11710' }, '2' => { 'name' => 'comp_mask', 'offset' => '32', 'type' => '953' }, '3' => { 'name' => 'srq_type', 'offset' => '36', 'type' => '57259' }, '4' => { 'name' => 'pd', 'offset' => '50', 'type' => '11395' }, '5' => { 'name' => 'xrcd', 'offset' => '64', 'type' => '11767' }, '6' => { 'name' => 'cq', 'offset' => '72', 'type' => '9734' }, '7' => { 'name' => 'tm_cap', 'offset' => '86', 'type' => '57294' } }, 'Name' => 'struct ibv_srq_init_attr_ex', 'Size' => '64', 'Type' => 'Struct' }, '57488' => { 'Header' => undef, 'Line' => '837', 'Memb' => { '0' => { 'name' => 'wq_context', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'wq_type', 'offset' => '8', 'type' => '11772' }, '2' => { 'name' => 'max_wr', 'offset' => '18', 'type' => '953' }, '3' => { 'name' => 'max_sge', 'offset' => '22', 'type' => '953' }, '4' => { 'name' => 'pd', 'offset' => '36', 'type' => '11395' }, '5' => { 'name' => 'cq', 'offset' => '50', 'type' => '9734' }, '6' => { 'name' => 'comp_mask', 'offset' => '64', 'type' => '953' }, '7' => { 'name' => 'create_flags', 'offset' => '68', 'type' => '953' } }, 'Name' => 'struct ibv_wq_init_attr', 'Size' => '48', 'Type' => 'Struct' }, '58123' => { 'Header' => undef, 'Line' => '963', 'Memb' => { '0' => { 'name' => 'rx_hash_function', 'offset' => '0', 'type' => '929' }, '1' => { 'name' => 'rx_hash_key_len', 'offset' => '1', 'type' => '929' }, '2' => { 'name' => 'rx_hash_key', 'offset' => '8', 'type' => '58193' }, '3' => { 'name' => 'rx_hash_fields_mask', 'offset' => '22', 'type' => '965' } }, 'Name' => 'struct ibv_rx_hash_conf', 'Size' => '24', 'Type' => 'Struct' }, '58193' => { 'BaseType' => '929', 'Name' => 'uint8_t*', 'Size' => '8', 'Type' => 'Pointer' }, '58198' => { 'Header' => undef, 'Line' => '972', 'Memb' => { '0' => { 'name' => 'qp_context', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'send_cq', 'offset' => '8', 'type' => '9734' }, '10' => { 'name' => 'create_flags', 'offset' => '128', 'type' => '953' }, '11' 
=> { 'name' => 'max_tso_header', 'offset' => '132', 'type' => '941' }, '12' => { 'name' => 'rwq_ind_tbl', 'offset' => '136', 'type' => '12422' }, '13' => { 'name' => 'rx_hash_conf', 'offset' => '150', 'type' => '58123' }, '14' => { 'name' => 'source_qpn', 'offset' => '288', 'type' => '953' }, '15' => { 'name' => 'send_ops_flags', 'offset' => '296', 'type' => '965' }, '2' => { 'name' => 'recv_cq', 'offset' => '22', 'type' => '9734' }, '3' => { 'name' => 'srq', 'offset' => '36', 'type' => '10052' }, '4' => { 'name' => 'cap', 'offset' => '50', 'type' => '12224' }, '5' => { 'name' => 'qp_type', 'offset' => '82', 'type' => '12165' }, '6' => { 'name' => 'sq_sig_all', 'offset' => '86', 'type' => '161' }, '7' => { 'name' => 'comp_mask', 'offset' => '96', 'type' => '953' }, '8' => { 'name' => 'pd', 'offset' => '100', 'type' => '11395' }, '9' => { 'name' => 'xrcd', 'offset' => '114', 'type' => '11767' } }, 'Name' => 'struct ibv_qp_init_attr_ex', 'Size' => '136', 'Type' => 'Struct' }, '59002' => { 'Header' => undef, 'Line' => '1096', 'Memb' => { '0' => { 'name' => 'rate_limit', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'max_burst_sz', 'offset' => '4', 'type' => '953' }, '2' => { 'name' => 'typical_pkt_sz', 'offset' => '8', 'type' => '941' }, '3' => { 'name' => 'comp_mask', 'offset' => '18', 'type' => '953' } }, 'Name' => 'struct ibv_qp_rate_limit_attr', 'Size' => '16', 'Type' => 'Struct' }, '59929' => { 'Header' => undef, 'Line' => '1208', 'Memb' => { '0' => { 'name' => 'IBV_WR_TAG_ADD', 'value' => '0' }, '1' => { 'name' => 'IBV_WR_TAG_DEL', 'value' => '1' }, '2' => { 'name' => 'IBV_WR_TAG_SYNC', 'value' => '2' } }, 'Name' => 'enum ibv_ops_wr_opcode', 'Size' => '4', 'Type' => 'Enum' }, '59964' => { 'Header' => undef, 'Line' => '1227', 'Memb' => { '0' => { 'name' => 'recv_wr_id', 'offset' => '0', 'type' => '965' }, '1' => { 'name' => 'sg_list', 'offset' => '8', 'type' => '14062' }, '2' => { 'name' => 'num_sge', 'offset' => '22', 'type' => '161' }, '3' => { 'name' => 'tag', 'offset' => '36', 'type' => '965' }, '4' => { 'name' => 'mask', 'offset' => '50', 'type' => '965' } }, 'Size' => '40', 'Type' => 'Struct' }, '60044' => { 'Header' => undef, 'Line' => '1224', 'Memb' => { '0' => { 'name' => 'unexpected_cnt', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'handle', 'offset' => '4', 'type' => '953' }, '2' => { 'name' => 'add', 'offset' => '8', 'type' => '59964' } }, 'Size' => '48', 'Type' => 'Struct' }, '60096' => { 'Header' => undef, 'Line' => '1219', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '965' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '60179' }, '2' => { 'name' => 'opcode', 'offset' => '22', 'type' => '59929' }, '3' => { 'name' => 'flags', 'offset' => '32', 'type' => '161' }, '4' => { 'name' => 'tm', 'offset' => '36', 'type' => '60044' } }, 'Name' => 'struct ibv_ops_wr', 'Size' => '72', 'Type' => 'Struct' }, '60179' => { 'BaseType' => '60096', 'Name' => 'struct ibv_ops_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '60275' => { 'Header' => undef, 'Line' => '1487', 'Memb' => { '0' => { 'name' => 'vendor_id', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'options', 'offset' => '4', 'type' => '953' }, '2' => { 'name' => 'comp_mask', 'offset' => '8', 'type' => '953' } }, 'Name' => 'struct ibv_ece', 'Size' => '12', 'Type' => 'Struct' }, '60391' => { 'Header' => undef, 'Line' => '1521', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '953' } }, 'Name' => 'struct ibv_poll_cq_attr', 'Size' => '4', 'Type' => 'Struct' 
}, '60419' => { 'Header' => undef, 'Line' => '1525', 'Memb' => { '0' => { 'name' => 'tag', 'offset' => '0', 'type' => '965' }, '1' => { 'name' => 'priv', 'offset' => '8', 'type' => '953' } }, 'Name' => 'struct ibv_wc_tm_info', 'Size' => '16', 'Type' => 'Struct' }, '60461' => { 'Header' => undef, 'Line' => '1530', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' }, '1' => { 'name' => 'channel', 'offset' => '8', 'type' => '15165' }, '10' => { 'name' => 'status', 'offset' => '306', 'type' => '10257' }, '11' => { 'name' => 'wr_id', 'offset' => '310', 'type' => '965' }, '12' => { 'name' => 'start_poll', 'offset' => '324', 'type' => '60930' }, '13' => { 'name' => 'next_poll', 'offset' => '338', 'type' => '60950' }, '14' => { 'name' => 'end_poll', 'offset' => '352', 'type' => '60966' }, '15' => { 'name' => 'read_opcode', 'offset' => '360', 'type' => '60986' }, '16' => { 'name' => 'read_vendor_err', 'offset' => '374', 'type' => '61006' }, '17' => { 'name' => 'read_byte_len', 'offset' => '388', 'type' => '61006' }, '18' => { 'name' => 'read_imm_data', 'offset' => '402', 'type' => '61026' }, '19' => { 'name' => 'read_qp_num', 'offset' => '512', 'type' => '61006' }, '2' => { 'name' => 'cq_context', 'offset' => '22', 'type' => '82' }, '20' => { 'name' => 'read_src_qp', 'offset' => '520', 'type' => '61006' }, '21' => { 'name' => 'read_wc_flags', 'offset' => '534', 'type' => '61046' }, '22' => { 'name' => 'read_slid', 'offset' => '548', 'type' => '61006' }, '23' => { 'name' => 'read_sl', 'offset' => '562', 'type' => '61066' }, '24' => { 'name' => 'read_dlid_path_bits', 'offset' => '576', 'type' => '61066' }, '25' => { 'name' => 'read_completion_ts', 'offset' => '584', 'type' => '61086' }, '26' => { 'name' => 'read_cvlan', 'offset' => '598', 'type' => '61106' }, '27' => { 'name' => 'read_flow_tag', 'offset' => '612', 'type' => '61006' }, '28' => { 'name' => 'read_tm_info', 'offset' => '626', 'type' => '61132' }, '29' => { 'name' => 'read_completion_wallclock_ns', 'offset' => '640', 'type' => '61086' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '953' }, '4' => { 'name' => 'cqe', 'offset' => '40', 'type' => '161' }, '5' => { 'name' => 'mutex', 'offset' => '50', 'type' => '51311' }, '6' => { 'name' => 'cond', 'offset' => '114', 'type' => '51385' }, '7' => { 'name' => 'comp_events_completed', 'offset' => '288', 'type' => '953' }, '8' => { 'name' => 'async_events_completed', 'offset' => '292', 'type' => '953' }, '9' => { 'name' => 'comp_mask', 'offset' => '296', 'type' => '953' } }, 'Name' => 'struct ibv_cq_ex', 'Size' => '288', 'Type' => 'Struct' }, '60920' => { 'BaseType' => '60461', 'Name' => 'struct ibv_cq_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '60925' => { 'BaseType' => '60391', 'Name' => 'struct ibv_poll_cq_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '60930' => { 'Name' => 'int(*)(struct ibv_cq_ex*, struct ibv_poll_cq_attr*)', 'Param' => { '0' => { 'type' => '60920' }, '1' => { 'type' => '60925' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '60950' => { 'Name' => 'int(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '60920' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '60966' => { 'Name' => 'void(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '60920' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '60986' => { 'Name' => 'enum ibv_wc_opcode(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '60920' } }, 'Return' => '10418', 'Size' => '8', 'Type' => 'FuncPtr' }, '61006' => { 'Name' => 
'uint32_t(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '60920' } }, 'Return' => '953', 'Size' => '8', 'Type' => 'FuncPtr' }, '61026' => { 'Name' => '__be32(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '60920' } }, 'Return' => '1049', 'Size' => '8', 'Type' => 'FuncPtr' }, '61046' => { 'Name' => 'unsigned int(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '60920' } }, 'Return' => '70', 'Size' => '8', 'Type' => 'FuncPtr' }, '61066' => { 'Name' => 'uint8_t(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '60920' } }, 'Return' => '929', 'Size' => '8', 'Type' => 'FuncPtr' }, '61086' => { 'Name' => 'uint64_t(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '60920' } }, 'Return' => '965', 'Size' => '8', 'Type' => 'FuncPtr' }, '61106' => { 'Name' => 'uint16_t(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '60920' } }, 'Return' => '941', 'Size' => '8', 'Type' => 'FuncPtr' }, '61127' => { 'BaseType' => '60419', 'Name' => 'struct ibv_wc_tm_info*', 'Size' => '8', 'Type' => 'Pointer' }, '61132' => { 'Name' => 'void(*)(struct ibv_cq_ex*, struct ibv_wc_tm_info*)', 'Param' => { '0' => { 'type' => '60920' }, '1' => { 'type' => '61127' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '61491' => { 'Header' => undef, 'Line' => '1948', 'Memb' => { '0' => { 'name' => 'esp_attr', 'offset' => '0', 'type' => '61645' }, '1' => { 'name' => 'keymat_proto', 'offset' => '8', 'type' => '52352' }, '2' => { 'name' => 'keymat_len', 'offset' => '18', 'type' => '941' }, '3' => { 'name' => 'keymat_ptr', 'offset' => '22', 'type' => '82' }, '4' => { 'name' => 'replay_proto', 'offset' => '36', 'type' => '52375' }, '5' => { 'name' => 'replay_len', 'offset' => '40', 'type' => '941' }, '6' => { 'name' => 'replay_ptr', 'offset' => '50', 'type' => '82' }, '7' => { 'name' => 'esp_encap', 'offset' => '64', 'type' => '52524' }, '8' => { 'name' => 'comp_mask', 'offset' => '72', 'type' => '953' }, '9' => { 'name' => 'esn', 'offset' => '82', 'type' => '953' } }, 'Name' => 'struct ibv_flow_action_esp_attr', 'Size' => '56', 'Type' => 'Struct' }, '61645' => { 'BaseType' => '52529', 'Name' => 'struct ib_uverbs_flow_action_esp*', 'Size' => '8', 'Type' => 'Pointer' }, '62704' => { 'Header' => undef, 'Line' => '2057', 'Memb' => { '0' => { 'name' => 'cqe', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'cq_context', 'offset' => '8', 'type' => '82' }, '2' => { 'name' => 'channel', 'offset' => '22', 'type' => '15165' }, '3' => { 'name' => 'comp_vector', 'offset' => '36', 'type' => '953' }, '4' => { 'name' => 'wc_flags', 'offset' => '50', 'type' => '965' }, '5' => { 'name' => 'comp_mask', 'offset' => '64', 'type' => '953' }, '6' => { 'name' => 'flags', 'offset' => '68', 'type' => '953' }, '7' => { 'name' => 'parent_domain', 'offset' => '72', 'type' => '11395' } }, 'Name' => 'struct ibv_cq_init_attr_ex', 'Size' => '56', 'Type' => 'Struct' }, '62830' => { 'BaseType' => '62704', 'Name' => 'struct ibv_cq_init_attr_ex const', 'Size' => '56', 'Type' => 'Const' }, '62835' => { 'Header' => undef, 'Line' => '2090', 'Memb' => { '0' => { 'name' => 'pd', 'offset' => '0', 'type' => '11395' }, '1' => { 'name' => 'td', 'offset' => '8', 'type' => '62931' }, '2' => { 'name' => 'comp_mask', 'offset' => '22', 'type' => '953' }, '3' => { 'name' => 'alloc', 'offset' => '36', 'type' => '62971' }, '4' => { 'name' => 'free', 'offset' => '50', 'type' => '63002' }, '5' => { 'name' => 'pd_context', 'offset' => '64', 'type' => '82' } }, 'Name' => 'struct ibv_parent_domain_init_attr', 'Size' => '48', 'Type' 
=> 'Struct' }, '62931' => { 'BaseType' => '56738', 'Name' => 'struct ibv_td*', 'Size' => '8', 'Type' => 'Pointer' }, '62971' => { 'Name' => 'void*(*)(struct ibv_pd*, void*, size_t, size_t, uint64_t)', 'Param' => { '0' => { 'type' => '11395' }, '1' => { 'type' => '82' }, '2' => { 'type' => '53' }, '3' => { 'type' => '53' }, '4' => { 'type' => '965' } }, 'Return' => '82', 'Size' => '8', 'Type' => 'FuncPtr' }, '63002' => { 'Name' => 'void(*)(struct ibv_pd*, void*, void*, uint64_t)', 'Param' => { '0' => { 'type' => '11395' }, '1' => { 'type' => '82' }, '2' => { 'type' => '82' }, '3' => { 'type' => '965' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '63035' => { 'Header' => undef, 'Line' => '2109', 'Memb' => { '0' => { 'name' => 'IBV_COUNTER_PACKETS', 'value' => '0' }, '1' => { 'name' => 'IBV_COUNTER_BYTES', 'value' => '1' } }, 'Name' => 'enum ibv_counter_description', 'Size' => '4', 'Type' => 'Enum' }, '63064' => { 'Header' => undef, 'Line' => '2114', 'Memb' => { '0' => { 'name' => 'counter_desc', 'offset' => '0', 'type' => '63035' }, '1' => { 'name' => 'index', 'offset' => '4', 'type' => '953' }, '2' => { 'name' => 'comp_mask', 'offset' => '8', 'type' => '953' } }, 'Name' => 'struct ibv_counter_attach_attr', 'Size' => '12', 'Type' => 'Struct' }, '63120' => { 'Header' => undef, 'Line' => '2129', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '953' }, '1' => { 'name' => 'raw_clock', 'offset' => '8', 'type' => '50831' } }, 'Name' => 'struct ibv_values_ex', 'Size' => '24', 'Type' => 'Struct' }, '63162' => { 'Header' => undef, 'Line' => '2134', 'Memb' => { '0' => { 'name' => 'query_port', 'offset' => '0', 'type' => '63794' }, '1' => { 'name' => 'advise_mr', 'offset' => '8', 'type' => '63834' }, '10' => { 'name' => 'modify_flow_action_esp', 'offset' => '128', 'type' => '64109' }, '11' => { 'name' => 'destroy_flow_action', 'offset' => '136', 'type' => '64129' }, '12' => { 'name' => 'create_flow_action_esp', 'offset' => '150', 'type' => '64154' }, '13' => { 'name' => 'modify_qp_rate_limit', 'offset' => '260', 'type' => '64184' }, '14' => { 'name' => 'alloc_parent_domain', 'offset' => '274', 'type' => '64214' }, '15' => { 'name' => 'dealloc_td', 'offset' => '288', 'type' => '64234' }, '16' => { 'name' => 'alloc_td', 'offset' => '296', 'type' => '64264' }, '17' => { 'name' => 'modify_cq', 'offset' => '310', 'type' => '64294' }, '18' => { 'name' => 'post_srq_ops', 'offset' => '324', 'type' => '64329' }, '19' => { 'name' => 'destroy_rwq_ind_table', 'offset' => '338', 'type' => '64349' }, '2' => { 'name' => 'alloc_null_mr', 'offset' => '22', 'type' => '63854' }, '20' => { 'name' => 'create_rwq_ind_table', 'offset' => '352', 'type' => '64379' }, '21' => { 'name' => 'destroy_wq', 'offset' => '360', 'type' => '64399' }, '22' => { 'name' => 'modify_wq', 'offset' => '374', 'type' => '64429' }, '23' => { 'name' => 'create_wq', 'offset' => '388', 'type' => '64459' }, '24' => { 'name' => 'query_rt_values', 'offset' => '402', 'type' => '64489' }, '25' => { 'name' => 'create_cq_ex', 'offset' => '512', 'type' => '64519' }, '26' => { 'name' => 'priv', 'offset' => '520', 'type' => '64604' }, '27' => { 'name' => 'query_device_ex', 'offset' => '534', 'type' => '64649' }, '28' => { 'name' => 'ibv_destroy_flow', 'offset' => '548', 'type' => '64669' }, '29' => { 'name' => 'ABI_placeholder2', 'offset' => '562', 'type' => '64675' }, '3' => { 'name' => 'read_counters', 'offset' => '36', 'type' => '63894' }, '30' => { 'name' => 'ibv_create_flow', 'offset' => '576', 'type' => '64705' }, 
'31' => { 'name' => 'ABI_placeholder1', 'offset' => '584', 'type' => '64675' }, '32' => { 'name' => 'open_qp', 'offset' => '598', 'type' => '64735' }, '33' => { 'name' => 'create_qp_ex', 'offset' => '612', 'type' => '64765' }, '34' => { 'name' => 'get_srq_num', 'offset' => '626', 'type' => '64795' }, '35' => { 'name' => 'create_srq_ex', 'offset' => '640', 'type' => '64825' }, '36' => { 'name' => 'open_xrcd', 'offset' => '648', 'type' => '64855' }, '37' => { 'name' => 'close_xrcd', 'offset' => '662', 'type' => '64875' }, '38' => { 'name' => '_ABI_placeholder3', 'offset' => '772', 'type' => '965' }, '39' => { 'name' => 'sz', 'offset' => '786', 'type' => '53' }, '4' => { 'name' => 'attach_counters_point_flow', 'offset' => '50', 'type' => '63934' }, '40' => { 'name' => 'context', 'offset' => '800', 'type' => '8879' }, '5' => { 'name' => 'create_counters', 'offset' => '64', 'type' => '63964' }, '6' => { 'name' => 'destroy_counters', 'offset' => '72', 'type' => '63984' }, '7' => { 'name' => 'reg_dm_mr', 'offset' => '86', 'type' => '64024' }, '8' => { 'name' => 'alloc_dm', 'offset' => '100', 'type' => '64054' }, '9' => { 'name' => 'free_dm', 'offset' => '114', 'type' => '64074' } }, 'Name' => 'struct verbs_context', 'Size' => '648', 'Type' => 'Struct' }, '63789' => { 'BaseType' => '54631', 'Name' => 'struct ibv_port_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '63794' => { 'Name' => 'int(*)(struct ibv_context*, uint8_t, struct ibv_port_attr*, size_t)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '929' }, '2' => { 'type' => '63789' }, '3' => { 'type' => '53' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '63834' => { 'Name' => 'int(*)(struct ibv_pd*, enum ib_uverbs_advise_mr_advice, uint32_t, struct ibv_sge*, uint32_t)', 'Param' => { '0' => { 'type' => '11395' }, '1' => { 'type' => '52608' }, '2' => { 'type' => '953' }, '3' => { 'type' => '14062' }, '4' => { 'type' => '953' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '63854' => { 'Name' => 'struct ibv_mr*(*)(struct ibv_pd*)', 'Param' => { '0' => { 'type' => '11395' } }, 'Return' => '11186', 'Size' => '8', 'Type' => 'FuncPtr' }, '63894' => { 'Name' => 'int(*)(struct ibv_counters*, uint64_t*, uint32_t, uint32_t)', 'Param' => { '0' => { 'type' => '16888' }, '1' => { 'type' => '46784' }, '2' => { 'type' => '953' }, '3' => { 'type' => '953' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '63924' => { 'BaseType' => '63064', 'Name' => 'struct ibv_counter_attach_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '63934' => { 'Name' => 'int(*)(struct ibv_counters*, struct ibv_counter_attach_attr*, struct ibv_flow*)', 'Param' => { '0' => { 'type' => '16888' }, '1' => { 'type' => '63924' }, '2' => { 'type' => '18335' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '63964' => { 'Name' => 'struct ibv_counters*(*)(struct ibv_context*, struct ibv_counters_init_attr*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '46789' } }, 'Return' => '16888', 'Size' => '8', 'Type' => 'FuncPtr' }, '63984' => { 'Name' => 'int(*)(struct ibv_counters*)', 'Param' => { '0' => { 'type' => '16888' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64024' => { 'Name' => 'struct ibv_mr*(*)(struct ibv_pd*, struct ibv_dm*, uint64_t, size_t, unsigned int)', 'Param' => { '0' => { 'type' => '11395' }, '1' => { 'type' => '53174' }, '2' => { 'type' => '965' }, '3' => { 'type' => '53' }, '4' => { 'type' => '70' } }, 'Return' => '11186', 'Size' => '8', 'Type' => 'FuncPtr' }, '64049' => { 
'BaseType' => '52889', 'Name' => 'struct ibv_alloc_dm_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '64054' => { 'Name' => 'struct ibv_dm*(*)(struct ibv_context*, struct ibv_alloc_dm_attr*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '64049' } }, 'Return' => '53174', 'Size' => '8', 'Type' => 'FuncPtr' }, '64074' => { 'Name' => 'int(*)(struct ibv_dm*)', 'Param' => { '0' => { 'type' => '53174' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64099' => { 'BaseType' => '16763', 'Name' => 'struct ibv_flow_action*', 'Size' => '8', 'Type' => 'Pointer' }, '64104' => { 'BaseType' => '61491', 'Name' => 'struct ibv_flow_action_esp_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '64109' => { 'Name' => 'int(*)(struct ibv_flow_action*, struct ibv_flow_action_esp_attr*)', 'Param' => { '0' => { 'type' => '64099' }, '1' => { 'type' => '64104' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64129' => { 'Name' => 'int(*)(struct ibv_flow_action*)', 'Param' => { '0' => { 'type' => '64099' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64154' => { 'Name' => 'struct ibv_flow_action*(*)(struct ibv_context*, struct ibv_flow_action_esp_attr*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '64104' } }, 'Return' => '64099', 'Size' => '8', 'Type' => 'FuncPtr' }, '64179' => { 'BaseType' => '59002', 'Name' => 'struct ibv_qp_rate_limit_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '64184' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_qp_rate_limit_attr*)', 'Param' => { '0' => { 'type' => '9935' }, '1' => { 'type' => '64179' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64209' => { 'BaseType' => '62835', 'Name' => 'struct ibv_parent_domain_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '64214' => { 'Name' => 'struct ibv_pd*(*)(struct ibv_context*, struct ibv_parent_domain_init_attr*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '64209' } }, 'Return' => '11395', 'Size' => '8', 'Type' => 'FuncPtr' }, '64234' => { 'Name' => 'int(*)(struct ibv_td*)', 'Param' => { '0' => { 'type' => '62931' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64259' => { 'BaseType' => '56710', 'Name' => 'struct ibv_td_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '64264' => { 'Name' => 'struct ibv_td*(*)(struct ibv_context*, struct ibv_td_init_attr*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '64259' } }, 'Return' => '62931', 'Size' => '8', 'Type' => 'FuncPtr' }, '64294' => { 'Name' => 'int(*)(struct ibv_cq*, struct ibv_modify_cq_attr*)', 'Param' => { '0' => { 'type' => '9734' }, '1' => { 'type' => '18340' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64324' => { 'BaseType' => '60179', 'Name' => 'struct ibv_ops_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '64329' => { 'Name' => 'int(*)(struct ibv_srq*, struct ibv_ops_wr*, struct ibv_ops_wr**)', 'Param' => { '0' => { 'type' => '10052' }, '1' => { 'type' => '60179' }, '2' => { 'type' => '64324' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64349' => { 'Name' => 'int(*)(struct ibv_rwq_ind_table*)', 'Param' => { '0' => { 'type' => '12422' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64379' => { 'Name' => 'struct ibv_rwq_ind_table*(*)(struct ibv_context*, struct ibv_rwq_ind_table_init_attr*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '18345' } }, 'Return' => '12422', 'Size' => '8', 'Type' => 'FuncPtr' }, '64399' => { 'Name' => 'int(*)(struct ibv_wq*)', 'Param' => { '0' => { 
'type' => '10252' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64429' => { 'Name' => 'int(*)(struct ibv_wq*, struct ibv_wq_attr*)', 'Param' => { '0' => { 'type' => '10252' }, '1' => { 'type' => '18350' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64454' => { 'BaseType' => '57488', 'Name' => 'struct ibv_wq_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '64459' => { 'Name' => 'struct ibv_wq*(*)(struct ibv_context*, struct ibv_wq_init_attr*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '64454' } }, 'Return' => '10252', 'Size' => '8', 'Type' => 'FuncPtr' }, '64484' => { 'BaseType' => '63120', 'Name' => 'struct ibv_values_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '64489' => { 'Name' => 'int(*)(struct ibv_context*, struct ibv_values_ex*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '64484' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64514' => { 'BaseType' => '62704', 'Name' => 'struct ibv_cq_init_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '64519' => { 'Name' => 'struct ibv_cq_ex*(*)(struct ibv_context*, struct ibv_cq_init_attr_ex*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '64514' } }, 'Return' => '60920', 'Size' => '8', 'Type' => 'FuncPtr' }, '64524' => { 'Header' => undef, 'Line' => '72', 'Memb' => { '0' => { 'name' => 'unsupported_ioctls', 'offset' => '0', 'type' => '68056' }, '1' => { 'name' => 'driver_id', 'offset' => '22', 'type' => '953' }, '2' => { 'name' => 'use_ioctl_write', 'offset' => '32', 'type' => '18370' }, '3' => { 'name' => 'ops', 'offset' => '36', 'type' => '65817' }, '4' => { 'name' => 'imported', 'offset' => '1586', 'type' => '18370' } }, 'Name' => 'struct verbs_ex_private', 'Size' => '640', 'Type' => 'Struct' }, '64604' => { 'BaseType' => '64524', 'Name' => 'struct verbs_ex_private*', 'Size' => '8', 'Type' => 'Pointer' }, '64639' => { 'BaseType' => '53807', 'Name' => 'struct ibv_query_device_ex_input const*', 'Size' => '8', 'Type' => 'Pointer' }, '64644' => { 'BaseType' => '54264', 'Name' => 'struct ibv_device_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '64649' => { 'Name' => 'int(*)(struct ibv_context*, struct ibv_query_device_ex_input const*, struct ibv_device_attr_ex*, size_t)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '64639' }, '2' => { 'type' => '64644' }, '3' => { 'type' => '53' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64669' => { 'Name' => 'int(*)(struct ibv_flow*)', 'Param' => { '0' => { 'type' => '18335' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64675' => { 'Name' => 'void(*)()', 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '64705' => { 'Name' => 'struct ibv_flow*(*)(struct ibv_qp*, struct ibv_flow_attr*)', 'Param' => { '0' => { 'type' => '9935' }, '1' => { 'type' => '18355' } }, 'Return' => '18335', 'Size' => '8', 'Type' => 'FuncPtr' }, '64735' => { 'Name' => 'struct ibv_qp*(*)(struct ibv_context*, struct ibv_qp_open_attr*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '18360' } }, 'Return' => '9935', 'Size' => '8', 'Type' => 'FuncPtr' }, '64760' => { 'BaseType' => '58198', 'Name' => 'struct ibv_qp_init_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '64765' => { 'Name' => 'struct ibv_qp*(*)(struct ibv_context*, struct ibv_qp_init_attr_ex*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '64760' } }, 'Return' => '9935', 'Size' => '8', 'Type' => 'FuncPtr' }, '64790' => { 'BaseType' => '953', 'Name' => 'uint32_t*', 'Size' => '8', 
'Type' => 'Pointer' }, '64795' => { 'Name' => 'int(*)(struct ibv_srq*, uint32_t*)', 'Param' => { '0' => { 'type' => '10052' }, '1' => { 'type' => '64790' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '64820' => { 'BaseType' => '57336', 'Name' => 'struct ibv_srq_init_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '64825' => { 'Name' => 'struct ibv_srq*(*)(struct ibv_context*, struct ibv_srq_init_attr_ex*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '64820' } }, 'Return' => '10052', 'Size' => '8', 'Type' => 'FuncPtr' }, '64855' => { 'Name' => 'struct ibv_xrcd*(*)(struct ibv_context*, struct ibv_xrcd_init_attr*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '18365' } }, 'Return' => '11767', 'Size' => '8', 'Type' => 'FuncPtr' }, '64875' => { 'Name' => 'int(*)(struct ibv_xrcd*)', 'Param' => { '0' => { 'type' => '11767' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '65139' => { 'Header' => undef, 'Line' => '181', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'user_handle', 'offset' => '8', 'type' => '1025' }, '2' => { 'name' => 'cqe', 'offset' => '22', 'type' => '1013' }, '3' => { 'name' => 'comp_vector', 'offset' => '32', 'type' => '1013' }, '4' => { 'name' => 'comp_channel', 'offset' => '36', 'type' => '50390' }, '5' => { 'name' => 'reserved', 'offset' => '40', 'type' => '1013' }, '6' => { 'name' => 'driver_data', 'offset' => '50', 'type' => '1549' } }, 'Size' => '32', 'Type' => 'Struct' }, '65238' => { 'Header' => undef, 'Line' => '181', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '65139' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '51885' } }, 'Size' => '32', 'Type' => 'Union' }, '65265' => { 'Header' => undef, 'Line' => '181', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '65238' } }, 'Name' => 'struct ibv_create_cq', 'Size' => '40', 'Type' => 'Struct' }, '65457' => { 'Header' => undef, 'Line' => '211', 'Memb' => { '0' => { 'name' => 'user_handle', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'cqe', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'comp_vector', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'comp_channel', 'offset' => '22', 'type' => '50390' }, '4' => { 'name' => 'comp_mask', 'offset' => '32', 'type' => '1013' }, '5' => { 'name' => 'flags', 'offset' => '36', 'type' => '1013' }, '6' => { 'name' => 'reserved', 'offset' => '40', 'type' => '1013' } }, 'Size' => '32', 'Type' => 'Struct' }, '65556' => { 'Header' => undef, 'Line' => '211', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '65457' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '52022' } }, 'Size' => '32', 'Type' => 'Union' }, '65583' => { 'Header' => undef, 'Line' => '211', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '18410' }, '1' => { 'name' => 'unnamed0', 'offset' => '36', 'type' => '65556' } }, 'Name' => 'struct ibv_create_cq_ex', 'Size' => '56', 'Type' => 'Struct' }, '65763' => { 'Header' => undef, 'Line' => '170', 'Memb' => { '0' => { 'name' => 'cq', 'offset' => '0', 'type' => '9593' }, '1' => { 'name' => 'cq_ex', 'offset' => '0', 'type' => '60461' } }, 'Size' => '288', 'Type' => 'Union' }, '65797' => { 'Header' => undef, 'Line' => '169', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '65763' } }, 'Name' => 'struct verbs_cq', 'Size' => '288', 
'Type' => 'Struct' }, '65817' => { 'Header' => undef, 'Line' => '311', 'Memb' => { '0' => { 'name' => 'advise_mr', 'offset' => '0', 'type' => '63834' }, '1' => { 'name' => 'alloc_dm', 'offset' => '8', 'type' => '64054' }, '10' => { 'name' => 'bind_mw', 'offset' => '128', 'type' => '18155' }, '11' => { 'name' => 'close_xrcd', 'offset' => '136', 'type' => '64875' }, '12' => { 'name' => 'cq_event', 'offset' => '150', 'type' => '67032' }, '13' => { 'name' => 'create_ah', 'offset' => '260', 'type' => '67062' }, '14' => { 'name' => 'create_counters', 'offset' => '274', 'type' => '63964' }, '15' => { 'name' => 'create_cq', 'offset' => '288', 'type' => '67097' }, '16' => { 'name' => 'create_cq_ex', 'offset' => '296', 'type' => '64519' }, '17' => { 'name' => 'create_flow', 'offset' => '310', 'type' => '64705' }, '18' => { 'name' => 'create_flow_action_esp', 'offset' => '324', 'type' => '64154' }, '19' => { 'name' => 'create_qp', 'offset' => '338', 'type' => '67127' }, '2' => { 'name' => 'alloc_mw', 'offset' => '22', 'type' => '18120' }, '20' => { 'name' => 'create_qp_ex', 'offset' => '352', 'type' => '64765' }, '21' => { 'name' => 'create_rwq_ind_table', 'offset' => '360', 'type' => '64379' }, '22' => { 'name' => 'create_srq', 'offset' => '374', 'type' => '67157' }, '23' => { 'name' => 'create_srq_ex', 'offset' => '388', 'type' => '64825' }, '24' => { 'name' => 'create_wq', 'offset' => '402', 'type' => '64459' }, '25' => { 'name' => 'dealloc_mw', 'offset' => '512', 'type' => '18175' }, '26' => { 'name' => 'dealloc_pd', 'offset' => '520', 'type' => '67177' }, '27' => { 'name' => 'dealloc_td', 'offset' => '534', 'type' => '64234' }, '28' => { 'name' => 'dereg_mr', 'offset' => '548', 'type' => '67202' }, '29' => { 'name' => 'destroy_ah', 'offset' => '562', 'type' => '67222' }, '3' => { 'name' => 'alloc_null_mr', 'offset' => '36', 'type' => '63854' }, '30' => { 'name' => 'destroy_counters', 'offset' => '576', 'type' => '63984' }, '31' => { 'name' => 'destroy_cq', 'offset' => '584', 'type' => '67242' }, '32' => { 'name' => 'destroy_flow', 'offset' => '598', 'type' => '64669' }, '33' => { 'name' => 'destroy_flow_action', 'offset' => '612', 'type' => '64129' }, '34' => { 'name' => 'destroy_qp', 'offset' => '626', 'type' => '67262' }, '35' => { 'name' => 'destroy_rwq_ind_table', 'offset' => '640', 'type' => '64349' }, '36' => { 'name' => 'destroy_srq', 'offset' => '648', 'type' => '67282' }, '37' => { 'name' => 'destroy_wq', 'offset' => '662', 'type' => '64399' }, '38' => { 'name' => 'detach_mcast', 'offset' => '772', 'type' => '67016' }, '39' => { 'name' => 'free_context', 'offset' => '786', 'type' => '17511' }, '4' => { 'name' => 'alloc_parent_domain', 'offset' => '50', 'type' => '64214' }, '40' => { 'name' => 'free_dm', 'offset' => '800', 'type' => '64074' }, '41' => { 'name' => 'get_srq_num', 'offset' => '808', 'type' => '64795' }, '42' => { 'name' => 'import_dm', 'offset' => '822', 'type' => '67307' }, '43' => { 'name' => 'import_mr', 'offset' => '836', 'type' => '67332' }, '44' => { 'name' => 'import_pd', 'offset' => '850', 'type' => '67357' }, '45' => { 'name' => 'modify_cq', 'offset' => '864', 'type' => '64294' }, '46' => { 'name' => 'modify_flow_action_esp', 'offset' => '872', 'type' => '64109' }, '47' => { 'name' => 'modify_qp', 'offset' => '886', 'type' => '67392' }, '48' => { 'name' => 'modify_qp_rate_limit', 'offset' => '900', 'type' => '64184' }, '49' => { 'name' => 'modify_srq', 'offset' => '914', 'type' => '67427' }, '5' => { 'name' => 'alloc_pd', 'offset' => '64', 'type' => '66955' }, '50' 
=> { 'name' => 'modify_wq', 'offset' => '1024', 'type' => '64429' }, '51' => { 'name' => 'open_qp', 'offset' => '1032', 'type' => '64735' }, '52' => { 'name' => 'open_xrcd', 'offset' => '1046', 'type' => '64855' }, '53' => { 'name' => 'poll_cq', 'offset' => '1060', 'type' => '18210' }, '54' => { 'name' => 'post_recv', 'offset' => '1074', 'type' => '18330' }, '55' => { 'name' => 'post_send', 'offset' => '1088', 'type' => '18300' }, '56' => { 'name' => 'post_srq_ops', 'offset' => '1096', 'type' => '64329' }, '57' => { 'name' => 'post_srq_recv', 'offset' => '1110', 'type' => '18265' }, '58' => { 'name' => 'query_device_ex', 'offset' => '1124', 'type' => '64649' }, '59' => { 'name' => 'query_ece', 'offset' => '1138', 'type' => '67457' }, '6' => { 'name' => 'alloc_td', 'offset' => '72', 'type' => '64264' }, '60' => { 'name' => 'query_port', 'offset' => '1152', 'type' => '67487' }, '61' => { 'name' => 'query_qp', 'offset' => '1160', 'type' => '67522' }, '62' => { 'name' => 'query_qp_data_in_order', 'offset' => '1174', 'type' => '67552' }, '63' => { 'name' => 'query_rt_values', 'offset' => '1284', 'type' => '64489' }, '64' => { 'name' => 'query_srq', 'offset' => '1298', 'type' => '67577' }, '65' => { 'name' => 'read_counters', 'offset' => '1312', 'type' => '63894' }, '66' => { 'name' => 'reg_dm_mr', 'offset' => '1320', 'type' => '64024' }, '67' => { 'name' => 'reg_dmabuf_mr', 'offset' => '1334', 'type' => '67622' }, '68' => { 'name' => 'reg_mr', 'offset' => '1348', 'type' => '67662' }, '69' => { 'name' => 'req_notify_cq', 'offset' => '1362', 'type' => '18235' }, '7' => { 'name' => 'async_event', 'offset' => '86', 'type' => '66981' }, '70' => { 'name' => 'rereg_mr', 'offset' => '1376', 'type' => '67707' }, '71' => { 'name' => 'resize_cq', 'offset' => '1384', 'type' => '18235' }, '72' => { 'name' => 'set_ece', 'offset' => '1398', 'type' => '67457' }, '73' => { 'name' => 'unimport_dm', 'offset' => '1412', 'type' => '67723' }, '74' => { 'name' => 'unimport_mr', 'offset' => '1426', 'type' => '67739' }, '75' => { 'name' => 'unimport_pd', 'offset' => '1536', 'type' => '67755' }, '8' => { 'name' => 'attach_counters_point_flow', 'offset' => '100', 'type' => '63934' }, '9' => { 'name' => 'attach_mcast', 'offset' => '114', 'type' => '67016' } }, 'Name' => 'struct verbs_context_ops', 'Size' => '608', 'Type' => 'Struct' }, '66955' => { 'Name' => 'struct ibv_pd*(*)(struct ibv_context*)', 'Param' => { '0' => { 'type' => '8991' } }, 'Return' => '11395', 'Size' => '8', 'Type' => 'FuncPtr' }, '66976' => { 'BaseType' => '55830', 'Name' => 'struct ibv_async_event*', 'Size' => '8', 'Type' => 'Pointer' }, '66981' => { 'Name' => 'void(*)(struct ibv_context*, struct ibv_async_event*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '66976' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '67016' => { 'Name' => 'int(*)(struct ibv_qp*, union ibv_gid const*, uint16_t)', 'Param' => { '0' => { 'type' => '9935' }, '1' => { 'type' => '23189' }, '2' => { 'type' => '941' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67032' => { 'Name' => 'void(*)(struct ibv_cq*)', 'Param' => { '0' => { 'type' => '9734' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '67062' => { 'Name' => 'struct ibv_ah*(*)(struct ibv_pd*, struct ibv_ah_attr*)', 'Param' => { '0' => { 'type' => '11395' }, '1' => { 'type' => '23194' } }, 'Return' => '13672', 'Size' => '8', 'Type' => 'FuncPtr' }, '67097' => { 'Name' => 'struct ibv_cq*(*)(struct ibv_context*, int, struct ibv_comp_channel*, int)', 'Param' => { '0' 
=> { 'type' => '8991' }, '1' => { 'type' => '161' }, '2' => { 'type' => '15165' }, '3' => { 'type' => '161' } }, 'Return' => '9734', 'Size' => '8', 'Type' => 'FuncPtr' }, '67127' => { 'Name' => 'struct ibv_qp*(*)(struct ibv_pd*, struct ibv_qp_init_attr*)', 'Param' => { '0' => { 'type' => '11395' }, '1' => { 'type' => '23199' } }, 'Return' => '9935', 'Size' => '8', 'Type' => 'FuncPtr' }, '67152' => { 'BaseType' => '57217', 'Name' => 'struct ibv_srq_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '67157' => { 'Name' => 'struct ibv_srq*(*)(struct ibv_pd*, struct ibv_srq_init_attr*)', 'Param' => { '0' => { 'type' => '11395' }, '1' => { 'type' => '67152' } }, 'Return' => '10052', 'Size' => '8', 'Type' => 'FuncPtr' }, '67177' => { 'Name' => 'int(*)(struct ibv_pd*)', 'Param' => { '0' => { 'type' => '11395' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67202' => { 'Name' => 'int(*)(struct verbs_mr*)', 'Param' => { '0' => { 'type' => '23204' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67222' => { 'Name' => 'int(*)(struct ibv_ah*)', 'Param' => { '0' => { 'type' => '13672' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67242' => { 'Name' => 'int(*)(struct ibv_cq*)', 'Param' => { '0' => { 'type' => '9734' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67262' => { 'Name' => 'int(*)(struct ibv_qp*)', 'Param' => { '0' => { 'type' => '9935' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67282' => { 'Name' => 'int(*)(struct ibv_srq*)', 'Param' => { '0' => { 'type' => '10052' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67307' => { 'Name' => 'struct ibv_dm*(*)(struct ibv_context*, uint32_t)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '953' } }, 'Return' => '53174', 'Size' => '8', 'Type' => 'FuncPtr' }, '67332' => { 'Name' => 'struct ibv_mr*(*)(struct ibv_pd*, uint32_t)', 'Param' => { '0' => { 'type' => '11395' }, '1' => { 'type' => '953' } }, 'Return' => '11186', 'Size' => '8', 'Type' => 'FuncPtr' }, '67357' => { 'Name' => 'struct ibv_pd*(*)(struct ibv_context*, uint32_t)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '953' } }, 'Return' => '11395', 'Size' => '8', 'Type' => 'FuncPtr' }, '67392' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_qp_attr*, int)', 'Param' => { '0' => { 'type' => '9935' }, '1' => { 'type' => '23209' }, '2' => { 'type' => '161' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67427' => { 'Name' => 'int(*)(struct ibv_srq*, struct ibv_srq_attr*, int)', 'Param' => { '0' => { 'type' => '10052' }, '1' => { 'type' => '23214' }, '2' => { 'type' => '161' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67452' => { 'BaseType' => '60275', 'Name' => 'struct ibv_ece*', 'Size' => '8', 'Type' => 'Pointer' }, '67457' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_ece*)', 'Param' => { '0' => { 'type' => '9935' }, '1' => { 'type' => '67452' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67487' => { 'Name' => 'int(*)(struct ibv_context*, uint8_t, struct ibv_port_attr*)', 'Param' => { '0' => { 'type' => '8991' }, '1' => { 'type' => '929' }, '2' => { 'type' => '63789' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67522' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_qp_attr*, int, struct ibv_qp_init_attr*)', 'Param' => { '0' => { 'type' => '9935' }, '1' => { 'type' => '23209' }, '2' => { 'type' => '161' }, '3' => { 'type' => '23199' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, 
'67552' => { 'Name' => 'int(*)(struct ibv_qp*, enum ibv_wr_opcode, uint32_t)', 'Param' => { '0' => { 'type' => '9935' }, '1' => { 'type' => '13213' }, '2' => { 'type' => '953' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67577' => { 'Name' => 'int(*)(struct ibv_srq*, struct ibv_srq_attr*)', 'Param' => { '0' => { 'type' => '10052' }, '1' => { 'type' => '23214' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67622' => { 'Name' => 'struct ibv_mr*(*)(struct ibv_pd*, uint64_t, size_t, uint64_t, int, int)', 'Param' => { '0' => { 'type' => '11395' }, '1' => { 'type' => '965' }, '2' => { 'type' => '53' }, '3' => { 'type' => '965' }, '4' => { 'type' => '161' }, '5' => { 'type' => '161' } }, 'Return' => '11186', 'Size' => '8', 'Type' => 'FuncPtr' }, '67662' => { 'Name' => 'struct ibv_mr*(*)(struct ibv_pd*, void*, size_t, uint64_t, int)', 'Param' => { '0' => { 'type' => '11395' }, '1' => { 'type' => '82' }, '2' => { 'type' => '53' }, '3' => { 'type' => '965' }, '4' => { 'type' => '161' } }, 'Return' => '11186', 'Size' => '8', 'Type' => 'FuncPtr' }, '67707' => { 'Name' => 'int(*)(struct verbs_mr*, int, struct ibv_pd*, void*, size_t, int)', 'Param' => { '0' => { 'type' => '23204' }, '1' => { 'type' => '161' }, '2' => { 'type' => '11395' }, '3' => { 'type' => '82' }, '4' => { 'type' => '53' }, '5' => { 'type' => '161' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '67723' => { 'Name' => 'void(*)(struct ibv_dm*)', 'Param' => { '0' => { 'type' => '53174' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '67739' => { 'Name' => 'void(*)(struct ibv_mr*)', 'Param' => { '0' => { 'type' => '11186' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '67755' => { 'Name' => 'void(*)(struct ibv_pd*)', 'Param' => { '0' => { 'type' => '11395' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '68056' => { 'BaseType' => '46', 'Name' => 'unsigned long[2]', 'Size' => '16', 'Type' => 'Array' }, '69870' => { 'BaseType' => '62830', 'Name' => 'struct ibv_cq_init_attr_ex const*', 'Size' => '8', 'Type' => 'Pointer' }, '69875' => { 'BaseType' => '65797', 'Name' => 'struct verbs_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '69880' => { 'BaseType' => '65583', 'Name' => 'struct ibv_create_cq_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '69885' => { 'BaseType' => '52202', 'Name' => 'struct ib_uverbs_ex_create_cq_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '70' => { 'Name' => 'unsigned int', 'Size' => '4', 'Type' => 'Intrinsic' }, '70793' => { 'BaseType' => '65265', 'Name' => 'struct ibv_create_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '70798' => { 'BaseType' => '52132', 'Name' => 'struct ib_uverbs_create_cq_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '74880' => { 'BaseType' => '46', 'Header' => undef, 'Line' => '145', 'Name' => '__dev_t', 'Size' => '8', 'Type' => 'Typedef' }, '74940' => { 'BaseType' => '226', 'Name' => 'char const', 'Size' => '1', 'Type' => 'Const' }, '74950' => { 'BaseType' => '74940', 'Name' => 'char const*', 'Size' => '8', 'Type' => 'Pointer' }, '74984' => { 'BaseType' => '74880', 'Header' => undef, 'Line' => '59', 'Name' => 'dev_t', 'Size' => '8', 'Type' => 'Typedef' }, '76449' => { 'Header' => undef, 'Line' => '141', 'Memb' => { '0' => { 'name' => 'max_cq_moderation_count', 'offset' => '0', 'type' => '1001' }, '1' => { 'name' => 'max_cq_moderation_period', 'offset' => '2', 'type' => '1001' }, '2' => { 'name' => 'reserved', 'offset' => '4', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_cq_moderation_caps', 'Size' => '8', 'Type' 
=> 'Struct' }, '76621' => { 'Header' => undef, 'Line' => '171', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'driver_data', 'offset' => '8', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_get_context', 'Size' => '8', 'Type' => 'Struct' }, '76676' => { 'Header' => undef, 'Line' => '176', 'Memb' => { '0' => { 'name' => 'async_fd', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'num_comp_vectors', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'driver_data', 'offset' => '8', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_get_context_resp', 'Size' => '8', 'Type' => 'Struct' }, '76769' => { 'Header' => undef, 'Line' => '187', 'Memb' => { '0' => { 'name' => 'fw_ver', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'node_guid', 'offset' => '8', 'type' => '1061' }, '10' => { 'name' => 'device_cap_flags', 'offset' => '96', 'type' => '1013' }, '11' => { 'name' => 'max_sge', 'offset' => '100', 'type' => '1013' }, '12' => { 'name' => 'max_sge_rd', 'offset' => '104', 'type' => '1013' }, '13' => { 'name' => 'max_cq', 'offset' => '114', 'type' => '1013' }, '14' => { 'name' => 'max_cqe', 'offset' => '118', 'type' => '1013' }, '15' => { 'name' => 'max_mr', 'offset' => '128', 'type' => '1013' }, '16' => { 'name' => 'max_pd', 'offset' => '132', 'type' => '1013' }, '17' => { 'name' => 'max_qp_rd_atom', 'offset' => '136', 'type' => '1013' }, '18' => { 'name' => 'max_ee_rd_atom', 'offset' => '146', 'type' => '1013' }, '19' => { 'name' => 'max_res_rd_atom', 'offset' => '150', 'type' => '1013' }, '2' => { 'name' => 'sys_image_guid', 'offset' => '22', 'type' => '1061' }, '20' => { 'name' => 'max_qp_init_rd_atom', 'offset' => '256', 'type' => '1013' }, '21' => { 'name' => 'max_ee_init_rd_atom', 'offset' => '260', 'type' => '1013' }, '22' => { 'name' => 'atomic_cap', 'offset' => '264', 'type' => '1013' }, '23' => { 'name' => 'max_ee', 'offset' => '274', 'type' => '1013' }, '24' => { 'name' => 'max_rdd', 'offset' => '278', 'type' => '1013' }, '25' => { 'name' => 'max_mw', 'offset' => '288', 'type' => '1013' }, '26' => { 'name' => 'max_raw_ipv6_qp', 'offset' => '292', 'type' => '1013' }, '27' => { 'name' => 'max_raw_ethy_qp', 'offset' => '296', 'type' => '1013' }, '28' => { 'name' => 'max_mcast_grp', 'offset' => '306', 'type' => '1013' }, '29' => { 'name' => 'max_mcast_qp_attach', 'offset' => '310', 'type' => '1013' }, '3' => { 'name' => 'max_mr_size', 'offset' => '36', 'type' => '1025' }, '30' => { 'name' => 'max_total_mcast_qp_attach', 'offset' => '320', 'type' => '1013' }, '31' => { 'name' => 'max_ah', 'offset' => '324', 'type' => '1013' }, '32' => { 'name' => 'max_fmr', 'offset' => '328', 'type' => '1013' }, '33' => { 'name' => 'max_map_per_fmr', 'offset' => '338', 'type' => '1013' }, '34' => { 'name' => 'max_srq', 'offset' => '342', 'type' => '1013' }, '35' => { 'name' => 'max_srq_wr', 'offset' => '352', 'type' => '1013' }, '36' => { 'name' => 'max_srq_sge', 'offset' => '356', 'type' => '1013' }, '37' => { 'name' => 'max_pkeys', 'offset' => '360', 'type' => '1001' }, '38' => { 'name' => 'local_ca_ack_delay', 'offset' => '368', 'type' => '989' }, '39' => { 'name' => 'phys_port_cnt', 'offset' => '369', 'type' => '989' }, '4' => { 'name' => 'page_size_cap', 'offset' => '50', 'type' => '1025' }, '40' => { 'name' => 'reserved', 'offset' => '370', 'type' => '77316' }, '5' => { 'name' => 'vendor_id', 'offset' => '64', 'type' => '1013' }, '6' => { 'name' => 'vendor_part_id', 'offset' => '68', 'type' => '1013' }, '7' => { 'name' => 
'hw_ver', 'offset' => '72', 'type' => '1013' }, '8' => { 'name' => 'max_qp', 'offset' => '82', 'type' => '1013' }, '9' => { 'name' => 'max_qp_wr', 'offset' => '86', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_query_device_resp', 'Size' => '176', 'Type' => 'Struct' }, '77316' => { 'BaseType' => '989', 'Name' => '__u8[4]', 'Size' => '4', 'Type' => 'Array' }, '77372' => { 'Header' => undef, 'Line' => '238', 'Memb' => { '0' => { 'name' => 'rc_odp_caps', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'uc_odp_caps', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'ud_odp_caps', 'offset' => '8', 'type' => '1013' } }, 'Size' => '12', 'Type' => 'Struct' }, '77421' => { 'Header' => undef, 'Line' => '236', 'Memb' => { '0' => { 'name' => 'general_caps', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'per_transport_caps', 'offset' => '8', 'type' => '77372' }, '2' => { 'name' => 'reserved', 'offset' => '32', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_odp_caps', 'Size' => '24', 'Type' => 'Struct' }, '77474' => { 'Header' => undef, 'Line' => '246', 'Memb' => { '0' => { 'name' => 'supported_qpts', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'max_rwq_indirection_tables', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'max_rwq_indirection_table_size', 'offset' => '8', 'type' => '1013' }, '3' => { 'name' => 'reserved', 'offset' => '18', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_rss_caps', 'Size' => '16', 'Type' => 'Struct' }, '77540' => { 'Header' => undef, 'Line' => '257', 'Memb' => { '0' => { 'name' => 'max_rndv_hdr_size', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'max_num_tags', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'flags', 'offset' => '8', 'type' => '1013' }, '3' => { 'name' => 'max_ops', 'offset' => '18', 'type' => '1013' }, '4' => { 'name' => 'max_sge', 'offset' => '22', 'type' => '1013' }, '5' => { 'name' => 'reserved', 'offset' => '32', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_tm_caps', 'Size' => '24', 'Type' => 'Struct' }, '77638' => { 'Header' => undef, 'Line' => '271', 'Memb' => { '0' => { 'name' => 'base', 'offset' => '0', 'type' => '76769' }, '1' => { 'name' => 'comp_mask', 'offset' => '374', 'type' => '1013' }, '10' => { 'name' => 'tm_caps', 'offset' => '598', 'type' => '77540' }, '11' => { 'name' => 'cq_moderation_caps', 'offset' => '640', 'type' => '76449' }, '12' => { 'name' => 'max_dm_size', 'offset' => '648', 'type' => '1025' }, '13' => { 'name' => 'xrc_odp_caps', 'offset' => '662', 'type' => '1013' }, '14' => { 'name' => 'reserved', 'offset' => '768', 'type' => '1013' }, '2' => { 'name' => 'response_length', 'offset' => '384', 'type' => '1013' }, '3' => { 'name' => 'odp_caps', 'offset' => '388', 'type' => '77421' }, '4' => { 'name' => 'timestamp_mask', 'offset' => '520', 'type' => '1025' }, '5' => { 'name' => 'hca_core_clock', 'offset' => '534', 'type' => '1025' }, '6' => { 'name' => 'device_cap_flags_ex', 'offset' => '548', 'type' => '1025' }, '7' => { 'name' => 'rss_caps', 'offset' => '562', 'type' => '77474' }, '8' => { 'name' => 'max_wq_type_rq', 'offset' => '584', 'type' => '1013' }, '9' => { 'name' => 'raw_packet_caps', 'offset' => '594', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_ex_query_device_resp', 'Size' => '304', 'Type' => 'Struct' }, '77866' => { 'Header' => undef, 'Line' => '289', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'port_num', 'offset' => '8', 'type' => '989' }, '2' => { 'name' => 'reserved', 
'offset' => '9', 'type' => '1564' }, '3' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_query_port', 'Size' => '16', 'Type' => 'Struct' }, '78777' => { 'Header' => undef, 'Line' => '79', 'Memb' => { '0' => { 'name' => 'gid', 'offset' => '0', 'type' => '8669' }, '1' => { 'name' => 'gid_index', 'offset' => '22', 'type' => '953' }, '2' => { 'name' => 'port_num', 'offset' => '32', 'type' => '953' }, '3' => { 'name' => 'gid_type', 'offset' => '36', 'type' => '953' }, '4' => { 'name' => 'ndev_ifindex', 'offset' => '40', 'type' => '953' } }, 'Name' => 'struct ibv_gid_entry', 'Size' => '32', 'Type' => 'Struct' }, '8070' => { 'Header' => undef, 'Line' => '1205', 'Memb' => { '0' => { 'name' => 'srq_handle', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'attr_mask', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'max_wr', 'offset' => '8', 'type' => '1013' }, '3' => { 'name' => 'srq_limit', 'offset' => '18', 'type' => '1013' }, '4' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_modify_srq', 'Size' => '16', 'Type' => 'Struct' }, '8153' => { 'Header' => undef, 'Line' => '1213', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'srq_handle', 'offset' => '8', 'type' => '1013' }, '2' => { 'name' => 'reserved', 'offset' => '18', 'type' => '1013' }, '3' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Name' => 'struct ib_uverbs_query_srq', 'Size' => '16', 'Type' => 'Struct' }, '82' => { 'BaseType' => '1', 'Name' => 'void*', 'Size' => '8', 'Type' => 'Pointer' }, '8292' => { 'Header' => undef, 'Line' => '1270', 'Memb' => { '0' => { 'name' => 'attr_mask', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'wq_handle', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'wq_state', 'offset' => '8', 'type' => '1013' }, '3' => { 'name' => 'curr_wq_state', 'offset' => '18', 'type' => '1013' }, '4' => { 'name' => 'flags', 'offset' => '22', 'type' => '1013' }, '5' => { 'name' => 'flags_mask', 'offset' => '32', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_ex_modify_wq', 'Size' => '24', 'Type' => 'Struct' }, '8448' => { 'Header' => undef, 'Line' => '1291', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'response_length', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'ind_tbl_handle', 'offset' => '8', 'type' => '1013' }, '3' => { 'name' => 'ind_tbl_num', 'offset' => '18', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_ex_create_rwq_ind_table_resp', 'Size' => '16', 'Type' => 'Struct' }, '8519' => { 'Header' => undef, 'Line' => '1303', 'Memb' => { '0' => { 'name' => 'cq_count', 'offset' => '0', 'type' => '1001' }, '1' => { 'name' => 'cq_period', 'offset' => '2', 'type' => '1001' } }, 'Name' => 'struct ib_uverbs_cq_moderation', 'Size' => '4', 'Type' => 'Struct' }, '8562' => { 'Header' => undef, 'Line' => '1308', 'Memb' => { '0' => { 'name' => 'cq_handle', 'offset' => '0', 'type' => '1013' }, '1' => { 'name' => 'attr_mask', 'offset' => '4', 'type' => '1013' }, '2' => { 'name' => 'attr', 'offset' => '8', 'type' => '8519' }, '3' => { 'name' => 'reserved', 'offset' => '18', 'type' => '1013' } }, 'Name' => 'struct ib_uverbs_ex_modify_cq', 'Size' => '16', 'Type' => 'Struct' }, '8633' => { 'Header' => undef, 'Line' => '67', 'Memb' => { '0' => { 'name' => 'subnet_prefix', 'offset' => '0', 'type' => '1061' }, '1' => { 'name' => 'interface_id', 'offset' => '8', 
'type' => '1061' } }, 'Size' => '16', 'Type' => 'Struct' }, '8669' => { 'Header' => undef, 'Line' => '65', 'Memb' => { '0' => { 'name' => 'raw', 'offset' => '0', 'type' => '8712' }, '1' => { 'name' => 'global', 'offset' => '0', 'type' => '8633' } }, 'Name' => 'union ibv_gid', 'Size' => '16', 'Type' => 'Union' }, '8707' => { 'BaseType' => '8669', 'Name' => 'union ibv_gid const', 'Size' => '16', 'Type' => 'Const' }, '8712' => { 'BaseType' => '929', 'Name' => 'uint8_t[16]', 'Size' => '16', 'Type' => 'Array' }, '8728' => { 'Header' => undef, 'Line' => '95', 'Memb' => { '0' => { 'name' => 'IBV_NODE_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_NODE_CA', 'value' => '1' }, '2' => { 'name' => 'IBV_NODE_SWITCH', 'value' => '2' }, '3' => { 'name' => 'IBV_NODE_ROUTER', 'value' => '3' }, '4' => { 'name' => 'IBV_NODE_RNIC', 'value' => '4' }, '5' => { 'name' => 'IBV_NODE_USNIC', 'value' => '5' }, '6' => { 'name' => 'IBV_NODE_USNIC_UDP', 'value' => '6' }, '7' => { 'name' => 'IBV_NODE_UNSPECIFIED', 'value' => '7' } }, 'Name' => 'enum ibv_node_type', 'Size' => '4', 'Type' => 'Enum' }, '8792' => { 'Header' => undef, 'Line' => '106', 'Memb' => { '0' => { 'name' => 'IBV_TRANSPORT_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_TRANSPORT_IB', 'value' => '0' }, '2' => { 'name' => 'IBV_TRANSPORT_IWARP', 'value' => '1' }, '3' => { 'name' => 'IBV_TRANSPORT_USNIC', 'value' => '2' }, '4' => { 'name' => 'IBV_TRANSPORT_USNIC_UDP', 'value' => '3' }, '5' => { 'name' => 'IBV_TRANSPORT_UNSPECIFIED', 'value' => '4' } }, 'Name' => 'enum ibv_transport_type', 'Size' => '4', 'Type' => 'Enum' }, '8844' => { 'Header' => undef, 'Line' => '155', 'Memb' => { '0' => { 'name' => 'IBV_ATOMIC_NONE', 'value' => '0' }, '1' => { 'name' => 'IBV_ATOMIC_HCA', 'value' => '1' }, '2' => { 'name' => 'IBV_ATOMIC_GLOB', 'value' => '2' } }, 'Name' => 'enum ibv_atomic_cap', 'Size' => '4', 'Type' => 'Enum' }, '8879' => { 'Header' => undef, 'Line' => '2037', 'Memb' => { '0' => { 'name' => 'device', 'offset' => '0', 'type' => '17378' }, '1' => { 'name' => 'ops', 'offset' => '8', 'type' => '17558' }, '2' => { 'name' => 'cmd_fd', 'offset' => '612', 'type' => '161' }, '3' => { 'name' => 'async_fd', 'offset' => '616', 'type' => '161' }, '4' => { 'name' => 'num_comp_vectors', 'offset' => '626', 'type' => '161' }, '5' => { 'name' => 'mutex', 'offset' => '640', 'type' => '832' }, '6' => { 'name' => 'abi_compat', 'offset' => '800', 'type' => '82' } }, 'Name' => 'struct ibv_context', 'Size' => '328', 'Type' => 'Struct' }, '89' => { 'Name' => 'unsigned char', 'Size' => '1', 'Type' => 'Intrinsic' }, '89904' => { 'BaseType' => '74839', 'Header' => undef, 'Line' => '46', 'Name' => 'atomic_int', 'Type' => 'Typedef' }, '8991' => { 'BaseType' => '8879', 'Name' => 'struct ibv_context*', 'Size' => '8', 'Type' => 'Pointer' }, '8996' => { 'Header' => undef, 'Line' => '182', 'Memb' => { '0' => { 'name' => 'fw_ver', 'offset' => '0', 'type' => '9530' }, '1' => { 'name' => 'node_guid', 'offset' => '100', 'type' => '1061' }, '10' => { 'name' => 'device_cap_flags', 'offset' => '278', 'type' => '70' }, '11' => { 'name' => 'max_sge', 'offset' => '288', 'type' => '161' }, '12' => { 'name' => 'max_sge_rd', 'offset' => '292', 'type' => '161' }, '13' => { 'name' => 'max_cq', 'offset' => '296', 'type' => '161' }, '14' => { 'name' => 'max_cqe', 'offset' => '306', 'type' => '161' }, '15' => { 'name' => 'max_mr', 'offset' => '310', 'type' => '161' }, '16' => { 'name' => 'max_pd', 'offset' => '320', 'type' => '161' }, '17' => { 
'name' => 'max_qp_rd_atom', 'offset' => '324', 'type' => '161' }, '18' => { 'name' => 'max_ee_rd_atom', 'offset' => '328', 'type' => '161' }, '19' => { 'name' => 'max_res_rd_atom', 'offset' => '338', 'type' => '161' }, '2' => { 'name' => 'sys_image_guid', 'offset' => '114', 'type' => '1061' }, '20' => { 'name' => 'max_qp_init_rd_atom', 'offset' => '342', 'type' => '161' }, '21' => { 'name' => 'max_ee_init_rd_atom', 'offset' => '352', 'type' => '161' }, '22' => { 'name' => 'atomic_cap', 'offset' => '356', 'type' => '8844' }, '23' => { 'name' => 'max_ee', 'offset' => '360', 'type' => '161' }, '24' => { 'name' => 'max_rdd', 'offset' => '370', 'type' => '161' }, '25' => { 'name' => 'max_mw', 'offset' => '374', 'type' => '161' }, '26' => { 'name' => 'max_raw_ipv6_qp', 'offset' => '384', 'type' => '161' }, '27' => { 'name' => 'max_raw_ethy_qp', 'offset' => '388', 'type' => '161' }, '28' => { 'name' => 'max_mcast_grp', 'offset' => '392', 'type' => '161' }, '29' => { 'name' => 'max_mcast_qp_attach', 'offset' => '402', 'type' => '161' }, '3' => { 'name' => 'max_mr_size', 'offset' => '128', 'type' => '965' }, '30' => { 'name' => 'max_total_mcast_qp_attach', 'offset' => '406', 'type' => '161' }, '31' => { 'name' => 'max_ah', 'offset' => '512', 'type' => '161' }, '32' => { 'name' => 'max_fmr', 'offset' => '516', 'type' => '161' }, '33' => { 'name' => 'max_map_per_fmr', 'offset' => '520', 'type' => '161' }, '34' => { 'name' => 'max_srq', 'offset' => '530', 'type' => '161' }, '35' => { 'name' => 'max_srq_wr', 'offset' => '534', 'type' => '161' }, '36' => { 'name' => 'max_srq_sge', 'offset' => '544', 'type' => '161' }, '37' => { 'name' => 'max_pkeys', 'offset' => '548', 'type' => '941' }, '38' => { 'name' => 'local_ca_ack_delay', 'offset' => '550', 'type' => '929' }, '39' => { 'name' => 'phys_port_cnt', 'offset' => '551', 'type' => '929' }, '4' => { 'name' => 'page_size_cap', 'offset' => '136', 'type' => '965' }, '5' => { 'name' => 'vendor_id', 'offset' => '150', 'type' => '953' }, '6' => { 'name' => 'vendor_part_id', 'offset' => '256', 'type' => '953' }, '7' => { 'name' => 'hw_ver', 'offset' => '260', 'type' => '953' }, '8' => { 'name' => 'max_qp', 'offset' => '264', 'type' => '161' }, '9' => { 'name' => 'max_qp_wr', 'offset' => '274', 'type' => '161' } }, 'Name' => 'struct ibv_device_attr', 'Size' => '232', 'Type' => 'Struct' }, '89984' => { 'Header' => undef, 'Line' => '193', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'driver_data', 'offset' => '8', 'type' => '1549' } }, 'Size' => '8', 'Type' => 'Struct' }, '90018' => { 'Header' => undef, 'Line' => '193', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '89984' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '76621' } }, 'Size' => '8', 'Type' => 'Union' }, '90045' => { 'Header' => undef, 'Line' => '193', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '90018' } }, 'Name' => 'struct ibv_get_context', 'Size' => '16', 'Type' => 'Struct' }, '90222' => { 'Header' => undef, 'Line' => '203', 'Memb' => { '0' => { 'name' => 'response', 'offset' => '0', 'type' => '1025' }, '1' => { 'name' => 'port_num', 'offset' => '8', 'type' => '989' }, '2' => { 'name' => 'reserved', 'offset' => '9', 'type' => '1564' }, '3' => { 'name' => 'driver_data', 'offset' => '22', 'type' => '1549' } }, 'Size' => '16', 'Type' => 'Struct' }, '90282' => { 'Header' => undef, 'Line' => '203', 'Memb' => { '0' => { 
'name' => 'unnamed0', 'offset' => '0', 'type' => '90222' }, '1' => { 'name' => 'core_payload', 'offset' => '0', 'type' => '77866' } }, 'Size' => '16', 'Type' => 'Union' }, '90309' => { 'Header' => undef, 'Line' => '203', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '1430' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '90282' } }, 'Name' => 'struct ibv_query_port', 'Size' => '24', 'Type' => 'Struct' }, '90480' => { 'Header' => undef, 'Line' => '24', 'Memb' => { '0' => { 'name' => 'next', 'offset' => '0', 'type' => '90520' }, '1' => { 'name' => 'prev', 'offset' => '8', 'type' => '90520' } }, 'Name' => 'struct list_node', 'Size' => '16', 'Type' => 'Struct' }, '90520' => { 'BaseType' => '90480', 'Name' => 'struct list_node*', 'Size' => '8', 'Type' => 'Pointer' }, '90525' => { 'Header' => undef, 'Line' => '130', 'Memb' => { '0' => { 'name' => 'IBV_GID_TYPE_SYSFS_IB_ROCE_V1', 'value' => '0' }, '1' => { 'name' => 'IBV_GID_TYPE_SYSFS_ROCE_V2', 'value' => '1' } }, 'Name' => 'enum ibv_gid_type_sysfs', 'Size' => '4', 'Type' => 'Enum' }, '90589' => { 'Header' => undef, 'Line' => '201', 'Memb' => { '0' => { 'name' => 'modalias', 'offset' => '0', 'type' => '74950' }, '1' => { 'name' => 'driver_id', 'offset' => '0', 'type' => '965' } }, 'Size' => '8', 'Type' => 'Union' }, '90623' => { 'Header' => undef, 'Line' => '199', 'Memb' => { '0' => { 'name' => 'driver_data', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'u', 'offset' => '8', 'type' => '90589' }, '2' => { 'name' => 'vendor', 'offset' => '22', 'type' => '941' }, '3' => { 'name' => 'device', 'offset' => '24', 'type' => '941' }, '4' => { 'name' => 'kind', 'offset' => '32', 'type' => '929' } }, 'Name' => 'struct verbs_match_ent', 'Size' => '24', 'Type' => 'Struct' }, '90700' => { 'BaseType' => '90623', 'Name' => 'struct verbs_match_ent const', 'Size' => '24', 'Type' => 'Const' }, '90705' => { 'Header' => undef, 'Line' => '249', 'Memb' => { '0' => { 'name' => 'entry', 'offset' => '0', 'type' => '90480' }, '1' => { 'name' => 'provider_data', 'offset' => '22', 'type' => '82' }, '10' => { 'name' => 'driver_id', 'offset' => '2386', 'type' => '953' }, '11' => { 'name' => 'node_type', 'offset' => '2390', 'type' => '8728' }, '12' => { 'name' => 'ibdev_idx', 'offset' => '2400', 'type' => '161' }, '13' => { 'name' => 'num_ports', 'offset' => '2404', 'type' => '953' }, '14' => { 'name' => 'abi_ver', 'offset' => '2408', 'type' => '953' }, '15' => { 'name' => 'time_created', 'offset' => '2422', 'type' => '50831' }, '2' => { 'name' => 'match', 'offset' => '36', 'type' => '90946' }, '3' => { 'name' => 'flags', 'offset' => '50', 'type' => '70' }, '4' => { 'name' => 'sysfs_name', 'offset' => '54', 'type' => '9530' }, '5' => { 'name' => 'sysfs_cdev', 'offset' => '260', 'type' => '74984' }, '6' => { 'name' => 'ibdev_name', 'offset' => '274', 'type' => '9530' }, '7' => { 'name' => 'ibdev_path', 'offset' => '374', 'type' => '17542' }, '8' => { 'name' => 'modalias', 'offset' => '1074', 'type' => '90951' }, '9' => { 'name' => 'node_guid', 'offset' => '2372', 'type' => '965' } }, 'Name' => 'struct verbs_sysfs_dev', 'Size' => '992', 'Type' => 'Struct' }, '90946' => { 'BaseType' => '90700', 'Name' => 'struct verbs_match_ent const*', 'Size' => '8', 'Type' => 'Pointer' }, '90951' => { 'BaseType' => '226', 'Name' => 'char[512]', 'Size' => '512', 'Type' => 'Array' }, '90968' => { 'Header' => undef, 'Line' => '269', 'Memb' => { '0' => { 'name' => 'name', 'offset' => '0', 'type' => '74950' }, '1' => { 'name' => 'match_min_abi_version', 
'offset' => '8', 'type' => '953' }, '2' => { 'name' => 'match_max_abi_version', 'offset' => '18', 'type' => '953' }, '3' => { 'name' => 'match_table', 'offset' => '22', 'type' => '90946' }, '4' => { 'name' => 'static_providers', 'offset' => '36', 'type' => '91127' }, '5' => { 'name' => 'match_device', 'offset' => '50', 'type' => '91157' }, '6' => { 'name' => 'alloc_context', 'offset' => '64', 'type' => '91192' }, '7' => { 'name' => 'import_context', 'offset' => '72', 'type' => '91217' }, '8' => { 'name' => 'alloc_device', 'offset' => '86', 'type' => '91346' }, '9' => { 'name' => 'uninit_device', 'offset' => '100', 'type' => '91362' } }, 'Name' => 'struct verbs_device_ops', 'Size' => '72', 'Type' => 'Struct' }, '91122' => { 'BaseType' => '90968', 'Name' => 'struct verbs_device_ops const', 'Size' => '72', 'Type' => 'Const' }, '91127' => { 'BaseType' => '91132', 'Name' => 'struct verbs_device_ops const**', 'Size' => '8', 'Type' => 'Pointer' }, '91132' => { 'BaseType' => '91122', 'Name' => 'struct verbs_device_ops const*', 'Size' => '8', 'Type' => 'Pointer' }, '91152' => { 'BaseType' => '90705', 'Name' => 'struct verbs_sysfs_dev*', 'Size' => '8', 'Type' => 'Pointer' }, '91157' => { 'Name' => '_Bool(*)(struct verbs_sysfs_dev*)', 'Param' => { '0' => { 'type' => '91152' } }, 'Return' => '18370', 'Size' => '8', 'Type' => 'FuncPtr' }, '91187' => { 'BaseType' => '63162', 'Name' => 'struct verbs_context*', 'Size' => '8', 'Type' => 'Pointer' }, '91192' => { 'Name' => 'struct verbs_context*(*)(struct ibv_device*, int, void*)', 'Param' => { '0' => { 'type' => '17378' }, '1' => { 'type' => '161' }, '2' => { 'type' => '82' } }, 'Return' => '91187', 'Size' => '8', 'Type' => 'FuncPtr' }, '91217' => { 'Name' => 'struct verbs_context*(*)(struct ibv_device*, int)', 'Param' => { '0' => { 'type' => '17378' }, '1' => { 'type' => '161' } }, 'Return' => '91187', 'Size' => '8', 'Type' => 'FuncPtr' }, '91222' => { 'Header' => undef, 'Line' => '290', 'Memb' => { '0' => { 'name' => 'device', 'offset' => '0', 'type' => '17383' }, '1' => { 'name' => 'ops', 'offset' => '1636', 'type' => '91132' }, '2' => { 'name' => 'refcount', 'offset' => '1650', 'type' => '89904' }, '3' => { 'name' => 'entry', 'offset' => '1664', 'type' => '90480' }, '4' => { 'name' => 'sysfs', 'offset' => '1686', 'type' => '91152' }, '5' => { 'name' => 'core_support', 'offset' => '1796', 'type' => '965' } }, 'Name' => 'struct verbs_device', 'Size' => '712', 'Type' => 'Struct' }, '91341' => { 'BaseType' => '91222', 'Name' => 'struct verbs_device*', 'Size' => '8', 'Type' => 'Pointer' }, '91346' => { 'Name' => 'struct verbs_device*(*)(struct verbs_sysfs_dev*)', 'Param' => { '0' => { 'type' => '91152' } }, 'Return' => '91341', 'Size' => '8', 'Type' => 'FuncPtr' }, '91362' => { 'Name' => 'void(*)(struct verbs_device*)', 'Param' => { '0' => { 'type' => '91341' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '918' => { 'BaseType' => '928', 'Name' => 'void const*', 'Size' => '8', 'Type' => 'Pointer' }, '928' => { 'BaseType' => '1', 'Name' => 'void const', 'Type' => 'Const' }, '929' => { 'BaseType' => '125', 'Header' => undef, 'Line' => '24', 'Name' => 'uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '93526' => { 'BaseType' => '77638', 'Name' => 'struct ib_uverbs_ex_query_device_resp*', 'Size' => '8', 'Type' => 'Pointer' }, '93531' => { 'BaseType' => '53', 'Name' => 'size_t*', 'Size' => '8', 'Type' => 'Pointer' }, '941' => { 'BaseType' => '149', 'Header' => undef, 'Line' => '25', 'Name' => 'uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '94784' => 
{ 'BaseType' => '78777', 'Name' => 'struct ibv_gid_entry*', 'Size' => '8', 'Type' => 'Pointer' }, '953' => { 'BaseType' => '173', 'Header' => undef, 'Line' => '26', 'Name' => 'uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '9530' => { 'BaseType' => '226', 'Name' => 'char[64]', 'Size' => '64', 'Type' => 'Array' }, '9546' => { 'Header' => undef, 'Line' => '364', 'Memb' => { '0' => { 'name' => 'IBV_MTU_256', 'value' => '1' }, '1' => { 'name' => 'IBV_MTU_512', 'value' => '2' }, '2' => { 'name' => 'IBV_MTU_1024', 'value' => '3' }, '3' => { 'name' => 'IBV_MTU_2048', 'value' => '4' }, '4' => { 'name' => 'IBV_MTU_4096', 'value' => '5' } }, 'Name' => 'enum ibv_mtu', 'Size' => '4', 'Type' => 'Enum' }, '9593' => { 'Header' => undef, 'Line' => '1508', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' }, '1' => { 'name' => 'channel', 'offset' => '8', 'type' => '15165' }, '2' => { 'name' => 'cq_context', 'offset' => '22', 'type' => '82' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '953' }, '4' => { 'name' => 'cqe', 'offset' => '40', 'type' => '161' }, '5' => { 'name' => 'mutex', 'offset' => '50', 'type' => '832' }, '6' => { 'name' => 'cond', 'offset' => '114', 'type' => '906' }, '7' => { 'name' => 'comp_events_completed', 'offset' => '288', 'type' => '953' }, '8' => { 'name' => 'async_events_completed', 'offset' => '292', 'type' => '953' } }, 'Name' => 'struct ibv_cq', 'Size' => '128', 'Type' => 'Struct' }, '965' => { 'BaseType' => '197', 'Header' => undef, 'Line' => '27', 'Name' => 'uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '9734' => { 'BaseType' => '9593', 'Name' => 'struct ibv_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '9739' => { 'Header' => undef, 'Line' => '1283', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' }, '1' => { 'name' => 'qp_context', 'offset' => '8', 'type' => '82' }, '10' => { 'name' => 'mutex', 'offset' => '100', 'type' => '832' }, '11' => { 'name' => 'cond', 'offset' => '260', 'type' => '906' }, '12' => { 'name' => 'events_completed', 'offset' => '338', 'type' => '953' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '11395' }, '3' => { 'name' => 'send_cq', 'offset' => '36', 'type' => '9734' }, '4' => { 'name' => 'recv_cq', 'offset' => '50', 'type' => '9734' }, '5' => { 'name' => 'srq', 'offset' => '64', 'type' => '10052' }, '6' => { 'name' => 'handle', 'offset' => '72', 'type' => '953' }, '7' => { 'name' => 'qp_num', 'offset' => '82', 'type' => '953' }, '8' => { 'name' => 'state', 'offset' => '86', 'type' => '12734' }, '9' => { 'name' => 'qp_type', 'offset' => '96', 'type' => '12165' } }, 'Name' => 'struct ibv_qp', 'Size' => '160', 'Type' => 'Struct' }, '98705' => { 'BaseType' => '90525', 'Name' => 'enum ibv_gid_type_sysfs*', 'Size' => '8', 'Type' => 'Pointer' }, '98836' => { 'BaseType' => '8669', 'Name' => 'union ibv_gid*', 'Size' => '8', 'Type' => 'Pointer' }, '989' => { 'BaseType' => '89', 'Header' => undef, 'Line' => '21', 'Name' => '__u8', 'Size' => '1', 'Type' => 'Typedef' }, '9935' => { 'BaseType' => '9739', 'Name' => 'struct ibv_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '9940' => { 'Header' => undef, 'Line' => '1243', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '8991' }, '1' => { 'name' => 'srq_context', 'offset' => '8', 'type' => '82' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '11395' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '953' }, '4' => { 'name' => 'mutex', 'offset' => '50', 'type' => '832' }, '5' => { 'name' => 'cond', 
'offset' => '114', 'type' => '906' }, '6' => { 'name' => 'events_completed', 'offset' => '288', 'type' => '953' } }, 'Name' => 'struct ibv_srq', 'Size' => '128', 'Type' => 'Struct' } }, 'UndefinedSymbols' => { 'libibverbs.so.1.14.56.0' => { '_ITM_deregisterTMCloneTable' => 0, '_ITM_registerTMCloneTable' => 0, '__asprintf_chk@GLIBC_2.8' => 0, '__cxa_finalize@GLIBC_2.2.5' => 0, '__errno_location@GLIBC_2.2.5' => 0, '__fdelt_chk@GLIBC_2.15' => 0, '__fprintf_chk@GLIBC_2.3.4' => 0, '__getdelim@GLIBC_2.2.5' => 0, '__gmon_start__' => 0, '__isoc99_sscanf@GLIBC_2.7' => 0, '__snprintf_chk@GLIBC_2.3.4' => 0, '__stack_chk_fail@GLIBC_2.4' => 0, '__strcpy_chk@GLIBC_2.3.4' => 0, '__vasprintf_chk@GLIBC_2.8' => 0, '__vfprintf_chk@GLIBC_2.3.4' => 0, 'bind@GLIBC_2.2.5' => 0, 'calloc@GLIBC_2.2.5' => 0, 'close@GLIBC_2.2.5' => 0, 'closedir@GLIBC_2.2.5' => 0, 'dirfd@GLIBC_2.2.5' => 0, 'dlerror@GLIBC_2.34' => 0, 'dlopen@GLIBC_2.34' => 0, 'fclose@GLIBC_2.2.5' => 0, 'fcntl@GLIBC_2.2.5' => 0, 'fgets@GLIBC_2.2.5' => 0, 'fnmatch@GLIBC_2.2.5' => 0, 'fopen@GLIBC_2.2.5' => 0, 'free@GLIBC_2.2.5' => 0, 'freeaddrinfo@GLIBC_2.2.5' => 0, 'freeifaddrs@GLIBC_2.3' => 0, 'fstat@GLIBC_2.33' => 0, 'fwrite@GLIBC_2.2.5' => 0, 'getenv@GLIBC_2.2.5' => 0, 'geteuid@GLIBC_2.2.5' => 0, 'getifaddrs@GLIBC_2.3' => 0, 'getpid@GLIBC_2.2.5' => 0, 'getrandom@GLIBC_2.25' => 0, 'getrlimit@GLIBC_2.2.5' => 0, 'getuid@GLIBC_2.2.5' => 0, 'if_nametoindex@GLIBC_2.2.5' => 0, 'inotify_add_watch@GLIBC_2.4' => 0, 'inotify_init1@GLIBC_2.9' => 0, 'ioctl@GLIBC_2.2.5' => 0, 'madvise@GLIBC_2.2.5' => 0, 'malloc@GLIBC_2.2.5' => 0, 'memcmp@GLIBC_2.2.5' => 0, 'memcpy@GLIBC_2.14' => 0, 'memset@GLIBC_2.2.5' => 0, 'nl_addr_build' => 0, 'nl_addr_clone' => 0, 'nl_addr_fill_sockaddr' => 0, 'nl_addr_get_binary_addr' => 0, 'nl_addr_get_family' => 0, 'nl_addr_get_len' => 0, 'nl_addr_get_prefixlen' => 0, 'nl_addr_info' => 0, 'nl_addr_put' => 0, 'nl_addr_set_prefixlen' => 0, 'nl_cache_free' => 0, 'nl_cache_mngt_provide' => 0, 'nl_cache_mngt_unprovide' => 0, 'nl_cache_refill' => 0, 'nl_connect' => 0, 'nl_msg_parse' => 0, 'nl_object_match_filter' => 0, 'nl_recvmsgs_default' => 0, 'nl_send_auto' => 0, 'nl_send_simple' => 0, 'nl_socket_add_membership' => 0, 'nl_socket_alloc' => 0, 'nl_socket_disable_auto_ack' => 0, 'nl_socket_disable_msg_peek' => 0, 'nl_socket_disable_seq_check' => 0, 'nl_socket_free' => 0, 'nl_socket_get_fd' => 0, 'nl_socket_modify_cb' => 0, 'nl_socket_modify_err_cb' => 0, 'nla_get_string' => 0, 'nla_get_u32' => 0, 'nla_get_u64' => 0, 'nla_get_u8' => 0, 'nla_put' => 0, 'nlmsg_alloc_simple' => 0, 'nlmsg_append' => 0, 'nlmsg_free' => 0, 'nlmsg_hdr' => 0, 'nlmsg_parse' => 0, 'open@GLIBC_2.2.5' => 0, 'openat@GLIBC_2.4' => 0, 'opendir@GLIBC_2.2.5' => 0, 'poll@GLIBC_2.2.5' => 0, 'posix_memalign@GLIBC_2.2.5' => 0, 'pthread_cond_init@GLIBC_2.3.2' => 0, 'pthread_cond_signal@GLIBC_2.3.2' => 0, 'pthread_cond_wait@GLIBC_2.3.2' => 0, 'pthread_mutex_init@GLIBC_2.2.5' => 0, 'pthread_mutex_lock@GLIBC_2.2.5' => 0, 'pthread_mutex_unlock@GLIBC_2.2.5' => 0, 'rand_r@GLIBC_2.2.5' => 0, 'read@GLIBC_2.2.5' => 0, 'readdir@GLIBC_2.2.5' => 0, 'rtnl_link_alloc_cache' => 0, 'rtnl_link_get' => 0, 'rtnl_link_get_addr' => 0, 'rtnl_link_is_vlan' => 0, 'rtnl_link_put' => 0, 'rtnl_link_vlan_get_id' => 0, 'rtnl_neigh_alloc' => 0, 'rtnl_neigh_alloc_cache' => 0, 'rtnl_neigh_get' => 0, 'rtnl_neigh_get_lladdr' => 0, 'rtnl_neigh_put' => 0, 'rtnl_neigh_set_dst' => 0, 'rtnl_neigh_set_ifindex' => 0, 'rtnl_route_alloc_cache' => 0, 'rtnl_route_get_pref_src' => 0, 'rtnl_route_get_type' => 0, 
'rtnl_route_nexthop_n' => 0, 'rtnl_route_nh_get_gateway' => 0, 'rtnl_route_nh_get_ifindex' => 0, 'select@GLIBC_2.2.5' => 0, 'sendto@GLIBC_2.2.5' => 0, 'snprintf@GLIBC_2.2.5' => 0, 'socket@GLIBC_2.2.5' => 0, 'stat@GLIBC_2.33' => 0, 'stderr@GLIBC_2.2.5' => 0, 'strcmp@GLIBC_2.2.5' => 0, 'strcpy@GLIBC_2.2.5' => 0, 'strdup@GLIBC_2.2.5' => 0, 'strlen@GLIBC_2.2.5' => 0, 'strndup@GLIBC_2.2.5' => 0, 'strsep@GLIBC_2.2.5' => 0, 'strspn@GLIBC_2.2.5' => 0, 'strstr@GLIBC_2.2.5' => 0, 'strtol@GLIBC_2.2.5' => 0, 'strtoul@GLIBC_2.2.5' => 0, 'sysconf@GLIBC_2.2.5' => 0, 'time@GLIBC_2.2.5' => 0, 'timerfd_create@GLIBC_2.8' => 0, 'timerfd_settime@GLIBC_2.8' => 0, 'write@GLIBC_2.2.5' => 0 } }, 'WordSize' => '8' }; rdma-core-56.1/ABI/mana.dump000066400000000000000000005172641477342711600155470ustar00rootroot00000000000000$VAR1 = { 'ABI_DUMPER_VERSION' => '1.2', 'ABI_DUMP_VERSION' => '3.5', 'Arch' => 'x86_64', 'GccVersion' => '12.3.0', 'Headers' => {}, 'Language' => 'C', 'LibraryName' => 'libmana.so.1.0.56.0', 'LibraryVersion' => 'mana', 'MissedOffsets' => '1', 'MissedRegs' => '1', 'NameSpaces' => {}, 'Needed' => { 'libc.so.6' => 1, 'libibverbs.so.1' => 1 }, 'Sources' => {}, 'SymbolInfo' => { '41079' => { 'Header' => undef, 'Line' => '43', 'Param' => { '0' => { 'name' => 'obj', 'type' => '41416' }, '1' => { 'name' => 'obj_type', 'type' => '1052' } }, 'Return' => '78', 'ShortName' => 'manadv_init_obj' }, '41431' => { 'Header' => undef, 'Line' => '23', 'Param' => { '0' => { 'name' => 'ibv_ctx', 'type' => '4073' }, '1' => { 'name' => 'type', 'type' => '39530' }, '2' => { 'name' => 'attr', 'type' => '126' } }, 'Return' => '78', 'ShortName' => 'manadv_set_context_attr' } }, 'SymbolVersion' => { 'manadv_init_obj' => 'manadv_init_obj@@MANA_1.0', 'manadv_set_context_attr' => 'manadv_set_context_attr@@MANA_1.0' }, 'Symbols' => { 'libmana.so.1.0.56.0' => { 'manadv_init_obj@@MANA_1.0' => 1, 'manadv_set_context_attr@@MANA_1.0' => 1 } }, 'Target' => 'unix', 'TypeInfo' => { '1' => { 'Name' => 'void', 'Type' => 'Intrinsic' }, '10013' => { 'Header' => undef, 'Line' => '1103', 'Memb' => { '0' => { 'name' => 'IBV_WR_RDMA_WRITE', 'value' => '0' }, '1' => { 'name' => 'IBV_WR_RDMA_WRITE_WITH_IMM', 'value' => '1' }, '10' => { 'name' => 'IBV_WR_TSO', 'value' => '10' }, '11' => { 'name' => 'IBV_WR_DRIVER1', 'value' => '11' }, '12' => { 'name' => 'IBV_WR_FLUSH', 'value' => '14' }, '13' => { 'name' => 'IBV_WR_ATOMIC_WRITE', 'value' => '15' }, '2' => { 'name' => 'IBV_WR_SEND', 'value' => '2' }, '3' => { 'name' => 'IBV_WR_SEND_WITH_IMM', 'value' => '3' }, '4' => { 'name' => 'IBV_WR_RDMA_READ', 'value' => '4' }, '5' => { 'name' => 'IBV_WR_ATOMIC_CMP_AND_SWP', 'value' => '5' }, '6' => { 'name' => 'IBV_WR_ATOMIC_FETCH_AND_ADD', 'value' => '6' }, '7' => { 'name' => 'IBV_WR_LOCAL_INV', 'value' => '7' }, '8' => { 'name' => 'IBV_WR_BIND_MW', 'value' => '8' }, '9' => { 'name' => 'IBV_WR_SEND_WITH_INV', 'value' => '9' } }, 'Name' => 'enum ibv_wr_opcode', 'Size' => '4', 'Type' => 'Enum' }, '1016' => { 'BaseType' => '164', 'Header' => undef, 'Line' => '24', 'Name' => 'uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '10161' => { 'Header' => undef, 'Line' => '1145', 'Memb' => { '0' => { 'name' => 'addr', 'offset' => '0', 'type' => '1052' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '1040' }, '2' => { 'name' => 'lkey', 'offset' => '18', 'type' => '1040' } }, 'Name' => 'struct ibv_sge', 'Size' => '16', 'Type' => 'Struct' }, '10222' => { 'Header' => undef, 'Line' => '1161', 'Memb' => { '0' => { 'name' => 'imm_data', 'offset' => '0', 'type' => 
'1117' }, '1' => { 'name' => 'invalidate_rkey', 'offset' => '0', 'type' => '1040' } }, 'Size' => '4', 'Type' => 'Union' }, '10255' => { 'Header' => undef, 'Line' => '1166', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '1052' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '1040' } }, 'Size' => '16', 'Type' => 'Struct' }, '1028' => { 'BaseType' => '188', 'Header' => undef, 'Line' => '25', 'Name' => 'uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '10293' => { 'Header' => undef, 'Line' => '1170', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '1052' }, '1' => { 'name' => 'compare_add', 'offset' => '8', 'type' => '1052' }, '2' => { 'name' => 'swap', 'offset' => '22', 'type' => '1052' }, '3' => { 'name' => 'rkey', 'offset' => '36', 'type' => '1040' } }, 'Size' => '32', 'Type' => 'Struct' }, '10359' => { 'Header' => undef, 'Line' => '1176', 'Memb' => { '0' => { 'name' => 'ah', 'offset' => '0', 'type' => '10463' }, '1' => { 'name' => 'remote_qpn', 'offset' => '8', 'type' => '1040' }, '2' => { 'name' => 'remote_qkey', 'offset' => '18', 'type' => '1040' } }, 'Size' => '16', 'Type' => 'Struct' }, '1040' => { 'BaseType' => '200', 'Header' => undef, 'Line' => '26', 'Name' => 'uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '10409' => { 'Header' => undef, 'Line' => '1695', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '4073' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '7803' }, '2' => { 'name' => 'handle', 'offset' => '22', 'type' => '1040' } }, 'Name' => 'struct ibv_ah', 'Size' => '24', 'Type' => 'Struct' }, '10463' => { 'BaseType' => '10409', 'Name' => 'struct ibv_ah*', 'Size' => '8', 'Type' => 'Pointer' }, '10468' => { 'Header' => undef, 'Line' => '1165', 'Memb' => { '0' => { 'name' => 'rdma', 'offset' => '0', 'type' => '10255' }, '1' => { 'name' => 'atomic', 'offset' => '0', 'type' => '10293' }, '2' => { 'name' => 'ud', 'offset' => '0', 'type' => '10359' } }, 'Size' => '32', 'Type' => 'Union' }, '10512' => { 'Header' => undef, 'Line' => '1183', 'Memb' => { '0' => { 'name' => 'remote_srqn', 'offset' => '0', 'type' => '1040' } }, 'Size' => '4', 'Type' => 'Struct' }, '1052' => { 'BaseType' => '212', 'Header' => undef, 'Line' => '27', 'Name' => 'uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '10536' => { 'Header' => undef, 'Line' => '1182', 'Memb' => { '0' => { 'name' => 'xrc', 'offset' => '0', 'type' => '10512' } }, 'Size' => '4', 'Type' => 'Union' }, '10557' => { 'Header' => undef, 'Line' => '1188', 'Memb' => { '0' => { 'name' => 'mw', 'offset' => '0', 'type' => '10607' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '1040' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '7435' } }, 'Size' => '48', 'Type' => 'Struct' }, '10607' => { 'BaseType' => '7837', 'Name' => 'struct ibv_mw*', 'Size' => '8', 'Type' => 'Pointer' }, '10612' => { 'Header' => undef, 'Line' => '1193', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '126' }, '1' => { 'name' => 'hdr_sz', 'offset' => '8', 'type' => '1028' }, '2' => { 'name' => 'mss', 'offset' => '16', 'type' => '1028' } }, 'Size' => '16', 'Type' => 'Struct' }, '10662' => { 'Header' => undef, 'Line' => '1187', 'Memb' => { '0' => { 'name' => 'bind_mw', 'offset' => '0', 'type' => '10557' }, '1' => { 'name' => 'tso', 'offset' => '0', 'type' => '10612' } }, 'Size' => '48', 'Type' => 'Union' }, '10695' => { 'Header' => undef, 'Line' => '1151', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '1052' }, '1' => { 'name' 
=> 'next', 'offset' => '8', 'type' => '10831' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '10836' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '78' }, '4' => { 'name' => 'opcode', 'offset' => '40', 'type' => '10013' }, '5' => { 'name' => 'send_flags', 'offset' => '50', 'type' => '114' }, '6' => { 'name' => 'unnamed0', 'offset' => '54', 'type' => '10222' }, '7' => { 'name' => 'wr', 'offset' => '64', 'type' => '10468' }, '8' => { 'name' => 'qp_type', 'offset' => '114', 'type' => '10536' }, '9' => { 'name' => 'unnamed1', 'offset' => '128', 'type' => '10662' } }, 'Name' => 'struct ibv_send_wr', 'Size' => '128', 'Type' => 'Struct' }, '10831' => { 'BaseType' => '10695', 'Name' => 'struct ibv_send_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '10836' => { 'BaseType' => '10161', 'Name' => 'struct ibv_sge*', 'Size' => '8', 'Type' => 'Pointer' }, '10841' => { 'Header' => undef, 'Line' => '1201', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '1052' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '10911' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '10836' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '78' } }, 'Name' => 'struct ibv_recv_wr', 'Size' => '32', 'Type' => 'Struct' }, '10911' => { 'BaseType' => '10841', 'Name' => 'struct ibv_recv_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '1093' => { 'BaseType' => '114', 'Header' => undef, 'Line' => '27', 'Name' => '__u32', 'Size' => '4', 'Type' => 'Typedef' }, '1105' => { 'BaseType' => '427', 'Header' => undef, 'Line' => '31', 'Name' => '__u64', 'Size' => '8', 'Type' => 'Typedef' }, '11168' => { 'Header' => undef, 'Line' => '1237', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '1052' }, '1' => { 'name' => 'send_flags', 'offset' => '8', 'type' => '114' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '7435' } }, 'Name' => 'struct ibv_mw_bind', 'Size' => '48', 'Type' => 'Struct' }, '1117' => { 'BaseType' => '1093', 'Header' => undef, 'Line' => '27', 'Name' => '__be32', 'Size' => '4', 'Type' => 'Typedef' }, '11249' => { 'BaseType' => '10911', 'Name' => 'struct ibv_recv_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '11254' => { 'Name' => 'int(*)(struct ibv_wq*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '6740' }, '1' => { 'type' => '10911' }, '2' => { 'type' => '11249' } }, 'Return' => '78', 'Size' => '8', 'Type' => 'FuncPtr' }, '1129' => { 'BaseType' => '1105', 'Header' => undef, 'Line' => '29', 'Name' => '__be64', 'Size' => '8', 'Type' => 'Typedef' }, '114' => { 'Name' => 'unsigned int', 'Size' => '4', 'Type' => 'Intrinsic' }, '12203' => { 'Header' => undef, 'Line' => '1502', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '4073' }, '1' => { 'name' => 'fd', 'offset' => '8', 'type' => '78' }, '2' => { 'name' => 'refcnt', 'offset' => '18', 'type' => '78' } }, 'Name' => 'struct ibv_comp_channel', 'Size' => '16', 'Type' => 'Struct' }, '12257' => { 'BaseType' => '12203', 'Name' => 'struct ibv_comp_channel*', 'Size' => '8', 'Type' => 'Pointer' }, '126' => { 'BaseType' => '1', 'Name' => 'void*', 'Size' => '8', 'Type' => 'Pointer' }, '128' => { 'Name' => 'unsigned char', 'Size' => '1', 'Type' => 'Intrinsic' }, '13518' => { 'Header' => undef, 'Line' => '1969', 'Memb' => { '0' => { 'name' => '_dummy1', 'offset' => '0', 'type' => '13699' }, '1' => { 'name' => '_dummy2', 'offset' => '8', 'type' => '13715' } }, 'Name' => 'struct _ibv_device_ops', 'Size' => '16', 'Type' => 'Struct' }, '13580' 
=> { 'BaseType' => '13585', 'Name' => 'struct ibv_device*', 'Size' => '8', 'Type' => 'Pointer' }, '13585' => { 'Header' => undef, 'Line' => '1979', 'Memb' => { '0' => { 'name' => '_ops', 'offset' => '0', 'type' => '13518' }, '1' => { 'name' => 'node_type', 'offset' => '22', 'type' => '3673' }, '2' => { 'name' => 'transport_type', 'offset' => '32', 'type' => '3737' }, '3' => { 'name' => 'name', 'offset' => '36', 'type' => '4687' }, '4' => { 'name' => 'dev_name', 'offset' => '136', 'type' => '4687' }, '5' => { 'name' => 'dev_path', 'offset' => '338', 'type' => '13746' }, '6' => { 'name' => 'ibdev_path', 'offset' => '1032', 'type' => '13746' } }, 'Name' => 'struct ibv_device', 'Size' => '664', 'Type' => 'Struct' }, '13699' => { 'Name' => 'struct ibv_context*(*)(struct ibv_device*, int)', 'Param' => { '0' => { 'type' => '13580' }, '1' => { 'type' => '78' } }, 'Return' => '4073', 'Size' => '8', 'Type' => 'FuncPtr' }, '13715' => { 'Name' => 'void(*)(struct ibv_context*)', 'Param' => { '0' => { 'type' => '4073' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '13746' => { 'BaseType' => '272', 'Name' => 'char[256]', 'Size' => '256', 'Type' => 'Array' }, '13762' => { 'Header' => undef, 'Line' => '1994', 'Memb' => { '0' => { 'name' => '_compat_query_device', 'offset' => '0', 'type' => '14250' }, '1' => { 'name' => '_compat_query_port', 'offset' => '8', 'type' => '14290' }, '10' => { 'name' => '_compat_create_cq', 'offset' => '128', 'type' => '14300' }, '11' => { 'name' => 'poll_cq', 'offset' => '136', 'type' => '14415' }, '12' => { 'name' => 'req_notify_cq', 'offset' => '150', 'type' => '14440' }, '13' => { 'name' => '_compat_cq_event', 'offset' => '260', 'type' => '14300' }, '14' => { 'name' => '_compat_resize_cq', 'offset' => '274', 'type' => '14300' }, '15' => { 'name' => '_compat_destroy_cq', 'offset' => '288', 'type' => '14300' }, '16' => { 'name' => '_compat_create_srq', 'offset' => '296', 'type' => '14300' }, '17' => { 'name' => '_compat_modify_srq', 'offset' => '310', 'type' => '14300' }, '18' => { 'name' => '_compat_query_srq', 'offset' => '324', 'type' => '14300' }, '19' => { 'name' => '_compat_destroy_srq', 'offset' => '338', 'type' => '14300' }, '2' => { 'name' => '_compat_alloc_pd', 'offset' => '22', 'type' => '14300' }, '20' => { 'name' => 'post_srq_recv', 'offset' => '352', 'type' => '14470' }, '21' => { 'name' => '_compat_create_qp', 'offset' => '360', 'type' => '14300' }, '22' => { 'name' => '_compat_query_qp', 'offset' => '374', 'type' => '14300' }, '23' => { 'name' => '_compat_modify_qp', 'offset' => '388', 'type' => '14300' }, '24' => { 'name' => '_compat_destroy_qp', 'offset' => '402', 'type' => '14300' }, '25' => { 'name' => 'post_send', 'offset' => '512', 'type' => '14505' }, '26' => { 'name' => 'post_recv', 'offset' => '520', 'type' => '14535' }, '27' => { 'name' => '_compat_create_ah', 'offset' => '534', 'type' => '14300' }, '28' => { 'name' => '_compat_destroy_ah', 'offset' => '548', 'type' => '14300' }, '29' => { 'name' => '_compat_attach_mcast', 'offset' => '562', 'type' => '14300' }, '3' => { 'name' => '_compat_dealloc_pd', 'offset' => '36', 'type' => '14300' }, '30' => { 'name' => '_compat_detach_mcast', 'offset' => '576', 'type' => '14300' }, '31' => { 'name' => '_compat_async_event', 'offset' => '584', 'type' => '14300' }, '4' => { 'name' => '_compat_reg_mr', 'offset' => '50', 'type' => '14300' }, '5' => { 'name' => '_compat_rereg_mr', 'offset' => '64', 'type' => '14300' }, '6' => { 'name' => '_compat_dereg_mr', 'offset' => '72', 'type' => '14300' }, '7' => 
{ 'name' => 'alloc_mw', 'offset' => '86', 'type' => '14325' }, '8' => { 'name' => 'bind_mw', 'offset' => '100', 'type' => '14360' }, '9' => { 'name' => 'dealloc_mw', 'offset' => '114', 'type' => '14380' } }, 'Name' => 'struct ibv_context_ops', 'Size' => '256', 'Type' => 'Struct' }, '140' => { 'Name' => 'unsigned short', 'Size' => '2', 'Type' => 'Intrinsic' }, '14245' => { 'BaseType' => '4153', 'Name' => 'struct ibv_device_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '14250' => { 'Name' => 'int(*)(struct ibv_context*, struct ibv_device_attr*)', 'Param' => { '0' => { 'type' => '4073' }, '1' => { 'type' => '14245' } }, 'Return' => '78', 'Size' => '8', 'Type' => 'FuncPtr' }, '14280' => { 'BaseType' => '14285', 'Name' => 'struct _compat_ibv_port_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '14285' => { 'Name' => 'struct _compat_ibv_port_attr', 'Type' => 'Struct' }, '14290' => { 'Name' => 'int(*)(struct ibv_context*, uint8_t, struct _compat_ibv_port_attr*)', 'Param' => { '0' => { 'type' => '4073' }, '1' => { 'type' => '1016' }, '2' => { 'type' => '14280' } }, 'Return' => '78', 'Size' => '8', 'Type' => 'FuncPtr' }, '14300' => { 'Name' => 'void*(*)()', 'Return' => '126', 'Size' => '8', 'Type' => 'FuncPtr' }, '14325' => { 'Name' => 'struct ibv_mw*(*)(struct ibv_pd*, enum ibv_mw_type)', 'Param' => { '0' => { 'type' => '7803' }, '1' => { 'type' => '7808' } }, 'Return' => '10607', 'Size' => '8', 'Type' => 'FuncPtr' }, '14355' => { 'BaseType' => '11168', 'Name' => 'struct ibv_mw_bind*', 'Size' => '8', 'Type' => 'Pointer' }, '14360' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_mw*, struct ibv_mw_bind*)', 'Param' => { '0' => { 'type' => '6428' }, '1' => { 'type' => '10607' }, '2' => { 'type' => '14355' } }, 'Return' => '78', 'Size' => '8', 'Type' => 'FuncPtr' }, '14380' => { 'Name' => 'int(*)(struct ibv_mw*)', 'Param' => { '0' => { 'type' => '10607' } }, 'Return' => '78', 'Size' => '8', 'Type' => 'FuncPtr' }, '14410' => { 'BaseType' => '7249', 'Name' => 'struct ibv_wc*', 'Size' => '8', 'Type' => 'Pointer' }, '14415' => { 'Name' => 'int(*)(struct ibv_cq*, int, struct ibv_wc*)', 'Param' => { '0' => { 'type' => '6230' }, '1' => { 'type' => '78' }, '2' => { 'type' => '14410' } }, 'Return' => '78', 'Size' => '8', 'Type' => 'FuncPtr' }, '14440' => { 'Name' => 'int(*)(struct ibv_cq*, int)', 'Param' => { '0' => { 'type' => '6230' }, '1' => { 'type' => '78' } }, 'Return' => '78', 'Size' => '8', 'Type' => 'FuncPtr' }, '14470' => { 'Name' => 'int(*)(struct ibv_srq*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '6543' }, '1' => { 'type' => '10911' }, '2' => { 'type' => '11249' } }, 'Return' => '78', 'Size' => '8', 'Type' => 'FuncPtr' }, '14500' => { 'BaseType' => '10831', 'Name' => 'struct ibv_send_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '14505' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_send_wr*, struct ibv_send_wr**)', 'Param' => { '0' => { 'type' => '6428' }, '1' => { 'type' => '10831' }, '2' => { 'type' => '14500' } }, 'Return' => '78', 'Size' => '8', 'Type' => 'FuncPtr' }, '14535' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '6428' }, '1' => { 'type' => '10911' }, '2' => { 'type' => '11249' } }, 'Return' => '78', 'Size' => '8', 'Type' => 'FuncPtr' }, '164' => { 'BaseType' => '128', 'Header' => undef, 'Line' => '38', 'Name' => '__uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '188' => { 'BaseType' => '140', 'Header' => undef, 'Line' => '40', 'Name' => '__uint16_t', 'Size' => '2', 'Type' => 
'Typedef' }, '200' => { 'BaseType' => '114', 'Header' => undef, 'Line' => '42', 'Name' => '__uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '212' => { 'BaseType' => '66', 'Header' => undef, 'Line' => '45', 'Name' => '__uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '272' => { 'Name' => 'char', 'Size' => '1', 'Type' => 'Intrinsic' }, '3673' => { 'Header' => undef, 'Line' => '95', 'Memb' => { '0' => { 'name' => 'IBV_NODE_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_NODE_CA', 'value' => '1' }, '2' => { 'name' => 'IBV_NODE_SWITCH', 'value' => '2' }, '3' => { 'name' => 'IBV_NODE_ROUTER', 'value' => '3' }, '4' => { 'name' => 'IBV_NODE_RNIC', 'value' => '4' }, '5' => { 'name' => 'IBV_NODE_USNIC', 'value' => '5' }, '6' => { 'name' => 'IBV_NODE_USNIC_UDP', 'value' => '6' }, '7' => { 'name' => 'IBV_NODE_UNSPECIFIED', 'value' => '7' } }, 'Name' => 'enum ibv_node_type', 'Size' => '4', 'Type' => 'Enum' }, '3737' => { 'Header' => undef, 'Line' => '106', 'Memb' => { '0' => { 'name' => 'IBV_TRANSPORT_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_TRANSPORT_IB', 'value' => '0' }, '2' => { 'name' => 'IBV_TRANSPORT_IWARP', 'value' => '1' }, '3' => { 'name' => 'IBV_TRANSPORT_USNIC', 'value' => '2' }, '4' => { 'name' => 'IBV_TRANSPORT_USNIC_UDP', 'value' => '3' }, '5' => { 'name' => 'IBV_TRANSPORT_UNSPECIFIED', 'value' => '4' } }, 'Name' => 'enum ibv_transport_type', 'Size' => '4', 'Type' => 'Enum' }, '3789' => { 'Header' => undef, 'Line' => '155', 'Memb' => { '0' => { 'name' => 'IBV_ATOMIC_NONE', 'value' => '0' }, '1' => { 'name' => 'IBV_ATOMIC_HCA', 'value' => '1' }, '2' => { 'name' => 'IBV_ATOMIC_GLOB', 'value' => '2' } }, 'Name' => 'enum ibv_atomic_cap', 'Size' => '4', 'Type' => 'Enum' }, '39530' => { 'Header' => undef, 'Line' => '18', 'Memb' => { '0' => { 'name' => 'MANADV_CTX_ATTR_BUF_ALLOCATORS', 'value' => '0' } }, 'Name' => 'enum manadv_set_ctx_attr_type', 'Size' => '4', 'Type' => 'Enum' }, '3956' => { 'Header' => undef, 'Line' => '2037', 'Memb' => { '0' => { 'name' => 'device', 'offset' => '0', 'type' => '13580' }, '1' => { 'name' => 'ops', 'offset' => '8', 'type' => '13762' }, '2' => { 'name' => 'cmd_fd', 'offset' => '612', 'type' => '78' }, '3' => { 'name' => 'async_fd', 'offset' => '616', 'type' => '78' }, '4' => { 'name' => 'num_comp_vectors', 'offset' => '626', 'type' => '78' }, '5' => { 'name' => 'mutex', 'offset' => '640', 'type' => '860' }, '6' => { 'name' => 'abi_compat', 'offset' => '800', 'type' => '126' } }, 'Name' => 'struct ibv_context', 'Size' => '328', 'Type' => 'Struct' }, '39652' => { 'Header' => undef, 'Line' => '32', 'Memb' => { '0' => { 'name' => 'sq_buf', 'offset' => '0', 'type' => '126' }, '1' => { 'name' => 'sq_count', 'offset' => '8', 'type' => '1040' }, '2' => { 'name' => 'sq_size', 'offset' => '18', 'type' => '1040' }, '3' => { 'name' => 'sq_id', 'offset' => '22', 'type' => '1040' }, '4' => { 'name' => 'tx_vp_offset', 'offset' => '32', 'type' => '1040' }, '5' => { 'name' => 'db_page', 'offset' => '36', 'type' => '126' } }, 'Name' => 'struct manadv_qp', 'Size' => '32', 'Type' => 'Struct' }, '39744' => { 'Header' => undef, 'Line' => '41', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '126' }, '1' => { 'name' => 'count', 'offset' => '8', 'type' => '1040' }, '2' => { 'name' => 'cq_id', 'offset' => '18', 'type' => '1040' } }, 'Name' => 'struct manadv_cq', 'Size' => '16', 'Type' => 'Struct' }, '39797' => { 'Header' => undef, 'Line' => '47', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 
'type' => '126' }, '1' => { 'name' => 'count', 'offset' => '8', 'type' => '1040' }, '2' => { 'name' => 'size', 'offset' => '18', 'type' => '1040' }, '3' => { 'name' => 'wq_id', 'offset' => '22', 'type' => '1040' }, '4' => { 'name' => 'db_page', 'offset' => '36', 'type' => '126' } }, 'Name' => 'struct manadv_rwq', 'Size' => '32', 'Type' => 'Struct' }, '39876' => { 'Header' => undef, 'Line' => '56', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '6428' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '39911' } }, 'Size' => '16', 'Type' => 'Struct' }, '39911' => { 'BaseType' => '39652', 'Name' => 'struct manadv_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '39916' => { 'Header' => undef, 'Line' => '61', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '6230' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '39951' } }, 'Size' => '16', 'Type' => 'Struct' }, '39951' => { 'BaseType' => '39744', 'Name' => 'struct manadv_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '39956' => { 'Header' => undef, 'Line' => '66', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '6740' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '39991' } }, 'Size' => '16', 'Type' => 'Struct' }, '39991' => { 'BaseType' => '39797', 'Name' => 'struct manadv_rwq*', 'Size' => '8', 'Type' => 'Pointer' }, '39996' => { 'Header' => undef, 'Line' => '55', 'Memb' => { '0' => { 'name' => 'qp', 'offset' => '0', 'type' => '39876' }, '1' => { 'name' => 'cq', 'offset' => '22', 'type' => '39916' }, '2' => { 'name' => 'rwq', 'offset' => '50', 'type' => '39956' } }, 'Name' => 'struct manadv_obj', 'Size' => '48', 'Type' => 'Struct' }, '4073' => { 'BaseType' => '3956', 'Name' => 'struct ibv_context*', 'Size' => '8', 'Type' => 'Pointer' }, '41416' => { 'BaseType' => '39996', 'Name' => 'struct manadv_obj*', 'Size' => '8', 'Type' => 'Pointer' }, '4153' => { 'Header' => undef, 'Line' => '182', 'Memb' => { '0' => { 'name' => 'fw_ver', 'offset' => '0', 'type' => '4687' }, '1' => { 'name' => 'node_guid', 'offset' => '100', 'type' => '1129' }, '10' => { 'name' => 'device_cap_flags', 'offset' => '278', 'type' => '114' }, '11' => { 'name' => 'max_sge', 'offset' => '288', 'type' => '78' }, '12' => { 'name' => 'max_sge_rd', 'offset' => '292', 'type' => '78' }, '13' => { 'name' => 'max_cq', 'offset' => '296', 'type' => '78' }, '14' => { 'name' => 'max_cqe', 'offset' => '306', 'type' => '78' }, '15' => { 'name' => 'max_mr', 'offset' => '310', 'type' => '78' }, '16' => { 'name' => 'max_pd', 'offset' => '320', 'type' => '78' }, '17' => { 'name' => 'max_qp_rd_atom', 'offset' => '324', 'type' => '78' }, '18' => { 'name' => 'max_ee_rd_atom', 'offset' => '328', 'type' => '78' }, '19' => { 'name' => 'max_res_rd_atom', 'offset' => '338', 'type' => '78' }, '2' => { 'name' => 'sys_image_guid', 'offset' => '114', 'type' => '1129' }, '20' => { 'name' => 'max_qp_init_rd_atom', 'offset' => '342', 'type' => '78' }, '21' => { 'name' => 'max_ee_init_rd_atom', 'offset' => '352', 'type' => '78' }, '22' => { 'name' => 'atomic_cap', 'offset' => '356', 'type' => '3789' }, '23' => { 'name' => 'max_ee', 'offset' => '360', 'type' => '78' }, '24' => { 'name' => 'max_rdd', 'offset' => '370', 'type' => '78' }, '25' => { 'name' => 'max_mw', 'offset' => '374', 'type' => '78' }, '26' => { 'name' => 'max_raw_ipv6_qp', 'offset' => '384', 'type' => '78' }, '27' => { 'name' => 'max_raw_ethy_qp', 'offset' => '388', 'type' => '78' }, '28' => { 'name' => 'max_mcast_grp', 'offset' => '392', 'type' => '78' }, '29' => { 'name' => 
'max_mcast_qp_attach', 'offset' => '402', 'type' => '78' }, '3' => { 'name' => 'max_mr_size', 'offset' => '128', 'type' => '1052' }, '30' => { 'name' => 'max_total_mcast_qp_attach', 'offset' => '406', 'type' => '78' }, '31' => { 'name' => 'max_ah', 'offset' => '512', 'type' => '78' }, '32' => { 'name' => 'max_fmr', 'offset' => '516', 'type' => '78' }, '33' => { 'name' => 'max_map_per_fmr', 'offset' => '520', 'type' => '78' }, '34' => { 'name' => 'max_srq', 'offset' => '530', 'type' => '78' }, '35' => { 'name' => 'max_srq_wr', 'offset' => '534', 'type' => '78' }, '36' => { 'name' => 'max_srq_sge', 'offset' => '544', 'type' => '78' }, '37' => { 'name' => 'max_pkeys', 'offset' => '548', 'type' => '1028' }, '38' => { 'name' => 'local_ca_ack_delay', 'offset' => '550', 'type' => '1016' }, '39' => { 'name' => 'phys_port_cnt', 'offset' => '551', 'type' => '1016' }, '4' => { 'name' => 'page_size_cap', 'offset' => '136', 'type' => '1052' }, '5' => { 'name' => 'vendor_id', 'offset' => '150', 'type' => '1040' }, '6' => { 'name' => 'vendor_part_id', 'offset' => '256', 'type' => '1040' }, '7' => { 'name' => 'hw_ver', 'offset' => '260', 'type' => '1040' }, '8' => { 'name' => 'max_qp', 'offset' => '264', 'type' => '78' }, '9' => { 'name' => 'max_qp_wr', 'offset' => '274', 'type' => '78' } }, 'Name' => 'struct ibv_device_attr', 'Size' => '232', 'Type' => 'Struct' }, '427' => { 'Name' => 'unsigned long long', 'Size' => '8', 'Type' => 'Intrinsic' }, '4687' => { 'BaseType' => '272', 'Name' => 'char[64]', 'Size' => '64', 'Type' => 'Array' }, '54' => { 'BaseType' => '66', 'Header' => undef, 'Line' => '214', 'Name' => 'size_t', 'Size' => '8', 'Type' => 'Typedef' }, '6091' => { 'Header' => undef, 'Line' => '1508', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '4073' }, '1' => { 'name' => 'channel', 'offset' => '8', 'type' => '12257' }, '2' => { 'name' => 'cq_context', 'offset' => '22', 'type' => '126' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '1040' }, '4' => { 'name' => 'cqe', 'offset' => '40', 'type' => '78' }, '5' => { 'name' => 'mutex', 'offset' => '50', 'type' => '860' }, '6' => { 'name' => 'cond', 'offset' => '114', 'type' => '934' }, '7' => { 'name' => 'comp_events_completed', 'offset' => '288', 'type' => '1040' }, '8' => { 'name' => 'async_events_completed', 'offset' => '292', 'type' => '1040' } }, 'Name' => 'struct ibv_cq', 'Size' => '128', 'Type' => 'Struct' }, '6230' => { 'BaseType' => '6091', 'Name' => 'struct ibv_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '6235' => { 'Header' => undef, 'Line' => '1283', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '4073' }, '1' => { 'name' => 'qp_context', 'offset' => '8', 'type' => '126' }, '10' => { 'name' => 'mutex', 'offset' => '100', 'type' => '860' }, '11' => { 'name' => 'cond', 'offset' => '260', 'type' => '934' }, '12' => { 'name' => 'events_completed', 'offset' => '338', 'type' => '1040' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '7803' }, '3' => { 'name' => 'send_cq', 'offset' => '36', 'type' => '6230' }, '4' => { 'name' => 'recv_cq', 'offset' => '50', 'type' => '6230' }, '5' => { 'name' => 'srq', 'offset' => '64', 'type' => '6543' }, '6' => { 'name' => 'handle', 'offset' => '72', 'type' => '1040' }, '7' => { 'name' => 'qp_num', 'offset' => '82', 'type' => '1040' }, '8' => { 'name' => 'state', 'offset' => '86', 'type' => '9466' }, '9' => { 'name' => 'qp_type', 'offset' => '96', 'type' => '8815' } }, 'Name' => 'struct ibv_qp', 'Size' => '160', 'Type' => 'Struct' }, '6428' => { 
'BaseType' => '6235', 'Name' => 'struct ibv_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '6433' => { 'Header' => undef, 'Line' => '1243', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '4073' }, '1' => { 'name' => 'srq_context', 'offset' => '8', 'type' => '126' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '7803' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '1040' }, '4' => { 'name' => 'mutex', 'offset' => '50', 'type' => '860' }, '5' => { 'name' => 'cond', 'offset' => '114', 'type' => '934' }, '6' => { 'name' => 'events_completed', 'offset' => '288', 'type' => '1040' } }, 'Name' => 'struct ibv_srq', 'Size' => '128', 'Type' => 'Struct' }, '6543' => { 'BaseType' => '6433', 'Name' => 'struct ibv_srq*', 'Size' => '8', 'Type' => 'Pointer' }, '6548' => { 'Header' => undef, 'Line' => '1265', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '4073' }, '1' => { 'name' => 'wq_context', 'offset' => '8', 'type' => '126' }, '10' => { 'name' => 'cond', 'offset' => '150', 'type' => '934' }, '11' => { 'name' => 'events_completed', 'offset' => '324', 'type' => '1040' }, '12' => { 'name' => 'comp_mask', 'offset' => '328', 'type' => '1040' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '7803' }, '3' => { 'name' => 'cq', 'offset' => '36', 'type' => '6230' }, '4' => { 'name' => 'wq_num', 'offset' => '50', 'type' => '1040' }, '5' => { 'name' => 'handle', 'offset' => '54', 'type' => '1040' }, '6' => { 'name' => 'state', 'offset' => '64', 'type' => '8559' }, '7' => { 'name' => 'wq_type', 'offset' => '68', 'type' => '8414' }, '8' => { 'name' => 'post_recv', 'offset' => '72', 'type' => '11254' }, '9' => { 'name' => 'mutex', 'offset' => '86', 'type' => '860' } }, 'Name' => 'struct ibv_wq', 'Size' => '152', 'Type' => 'Struct' }, '66' => { 'Name' => 'unsigned long', 'Size' => '8', 'Type' => 'Intrinsic' }, '6740' => { 'BaseType' => '6548', 'Name' => 'struct ibv_wq*', 'Size' => '8', 'Type' => 'Pointer' }, '6787' => { 'Header' => undef, 'Line' => '485', 'Memb' => { '0' => { 'name' => 'IBV_WC_SUCCESS', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_LOC_LEN_ERR', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_REM_ACCESS_ERR', 'value' => '10' }, '11' => { 'name' => 'IBV_WC_REM_OP_ERR', 'value' => '11' }, '12' => { 'name' => 'IBV_WC_RETRY_EXC_ERR', 'value' => '12' }, '13' => { 'name' => 'IBV_WC_RNR_RETRY_EXC_ERR', 'value' => '13' }, '14' => { 'name' => 'IBV_WC_LOC_RDD_VIOL_ERR', 'value' => '14' }, '15' => { 'name' => 'IBV_WC_REM_INV_RD_REQ_ERR', 'value' => '15' }, '16' => { 'name' => 'IBV_WC_REM_ABORT_ERR', 'value' => '16' }, '17' => { 'name' => 'IBV_WC_INV_EECN_ERR', 'value' => '17' }, '18' => { 'name' => 'IBV_WC_INV_EEC_STATE_ERR', 'value' => '18' }, '19' => { 'name' => 'IBV_WC_FATAL_ERR', 'value' => '19' }, '2' => { 'name' => 'IBV_WC_LOC_QP_OP_ERR', 'value' => '2' }, '20' => { 'name' => 'IBV_WC_RESP_TIMEOUT_ERR', 'value' => '20' }, '21' => { 'name' => 'IBV_WC_GENERAL_ERR', 'value' => '21' }, '22' => { 'name' => 'IBV_WC_TM_ERR', 'value' => '22' }, '23' => { 'name' => 'IBV_WC_TM_RNDV_INCOMPLETE', 'value' => '23' }, '3' => { 'name' => 'IBV_WC_LOC_EEC_OP_ERR', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_LOC_PROT_ERR', 'value' => '4' }, '5' => { 'name' => 'IBV_WC_WR_FLUSH_ERR', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_MW_BIND_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_BAD_RESP_ERR', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_LOC_ACCESS_ERR', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_REM_INV_REQ_ERR', 'value' => '9' } }, 'Name' => 'enum 
ibv_wc_status', 'Size' => '4', 'Type' => 'Enum' }, '6948' => { 'Header' => undef, 'Line' => '513', 'Memb' => { '0' => { 'name' => 'IBV_WC_SEND', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_RDMA_WRITE', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_RECV', 'value' => '128' }, '11' => { 'name' => 'IBV_WC_RECV_RDMA_WITH_IMM', 'value' => '129' }, '12' => { 'name' => 'IBV_WC_TM_ADD', 'value' => '130' }, '13' => { 'name' => 'IBV_WC_TM_DEL', 'value' => '131' }, '14' => { 'name' => 'IBV_WC_TM_SYNC', 'value' => '132' }, '15' => { 'name' => 'IBV_WC_TM_RECV', 'value' => '133' }, '16' => { 'name' => 'IBV_WC_TM_NO_TAG', 'value' => '134' }, '17' => { 'name' => 'IBV_WC_DRIVER1', 'value' => '135' }, '18' => { 'name' => 'IBV_WC_DRIVER2', 'value' => '136' }, '19' => { 'name' => 'IBV_WC_DRIVER3', 'value' => '137' }, '2' => { 'name' => 'IBV_WC_RDMA_READ', 'value' => '2' }, '3' => { 'name' => 'IBV_WC_COMP_SWAP', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_FETCH_ADD', 'value' => '4' }, '5' => { 'name' => 'IBV_WC_BIND_MW', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_LOCAL_INV', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_TSO', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_FLUSH', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_ATOMIC_WRITE', 'value' => '9' } }, 'Name' => 'enum ibv_wc_opcode', 'Size' => '4', 'Type' => 'Enum' }, '7216' => { 'Header' => undef, 'Line' => '598', 'Memb' => { '0' => { 'name' => 'imm_data', 'offset' => '0', 'type' => '1117' }, '1' => { 'name' => 'invalidated_rkey', 'offset' => '0', 'type' => '1040' } }, 'Size' => '4', 'Type' => 'Union' }, '7249' => { 'Header' => undef, 'Line' => '589', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '1052' }, '1' => { 'name' => 'status', 'offset' => '8', 'type' => '6787' }, '10' => { 'name' => 'slid', 'offset' => '66', 'type' => '1028' }, '11' => { 'name' => 'sl', 'offset' => '68', 'type' => '1016' }, '12' => { 'name' => 'dlid_path_bits', 'offset' => '69', 'type' => '1016' }, '2' => { 'name' => 'opcode', 'offset' => '18', 'type' => '6948' }, '3' => { 'name' => 'vendor_err', 'offset' => '22', 'type' => '1040' }, '4' => { 'name' => 'byte_len', 'offset' => '32', 'type' => '1040' }, '5' => { 'name' => 'unnamed0', 'offset' => '36', 'type' => '7216' }, '6' => { 'name' => 'qp_num', 'offset' => '40', 'type' => '1040' }, '7' => { 'name' => 'src_qp', 'offset' => '50', 'type' => '1040' }, '8' => { 'name' => 'wc_flags', 'offset' => '54', 'type' => '114' }, '9' => { 'name' => 'pkey_index', 'offset' => '64', 'type' => '1028' } }, 'Name' => 'struct ibv_wc', 'Size' => '48', 'Type' => 'Struct' }, '7435' => { 'Header' => undef, 'Line' => '625', 'Memb' => { '0' => { 'name' => 'mr', 'offset' => '0', 'type' => '7618' }, '1' => { 'name' => 'addr', 'offset' => '8', 'type' => '1052' }, '2' => { 'name' => 'length', 'offset' => '22', 'type' => '1052' }, '3' => { 'name' => 'mw_access_flags', 'offset' => '36', 'type' => '114' } }, 'Name' => 'struct ibv_mw_bind_info', 'Size' => '32', 'Type' => 'Struct' }, '7508' => { 'Header' => undef, 'Line' => '668', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '4073' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '7803' }, '2' => { 'name' => 'addr', 'offset' => '22', 'type' => '126' }, '3' => { 'name' => 'length', 'offset' => '36', 'type' => '54' }, '4' => { 'name' => 'handle', 'offset' => '50', 'type' => '1040' }, '5' => { 'name' => 'lkey', 'offset' => '54', 'type' => '1040' }, '6' => { 'name' => 'rkey', 'offset' => '64', 'type' => '1040' } }, 'Name' => 'struct ibv_mr', 'Size' => '48', 'Type' => 
'Struct' }, '7618' => { 'BaseType' => '7508', 'Name' => 'struct ibv_mr*', 'Size' => '8', 'Type' => 'Pointer' }, '7623' => { 'Header' => undef, 'Line' => '632', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '4073' }, '1' => { 'name' => 'handle', 'offset' => '8', 'type' => '1040' } }, 'Name' => 'struct ibv_pd', 'Size' => '16', 'Type' => 'Struct' }, '78' => { 'Name' => 'int', 'Size' => '4', 'Type' => 'Intrinsic' }, '7803' => { 'BaseType' => '7623', 'Name' => 'struct ibv_pd*', 'Size' => '8', 'Type' => 'Pointer' }, '7808' => { 'Header' => undef, 'Line' => '678', 'Memb' => { '0' => { 'name' => 'IBV_MW_TYPE_1', 'value' => '1' }, '1' => { 'name' => 'IBV_MW_TYPE_2', 'value' => '2' } }, 'Name' => 'enum ibv_mw_type', 'Size' => '4', 'Type' => 'Enum' }, '7837' => { 'Header' => undef, 'Line' => '683', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '4073' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '7803' }, '2' => { 'name' => 'rkey', 'offset' => '22', 'type' => '1040' }, '3' => { 'name' => 'handle', 'offset' => '32', 'type' => '1040' }, '4' => { 'name' => 'type', 'offset' => '36', 'type' => '7808' } }, 'Name' => 'struct ibv_mw', 'Size' => '32', 'Type' => 'Struct' }, '8414' => { 'Header' => undef, 'Line' => '820', 'Memb' => { '0' => { 'name' => 'IBV_WQT_RQ', 'value' => '0' } }, 'Name' => 'enum ibv_wq_type', 'Size' => '4', 'Type' => 'Enum' }, '8559' => { 'Header' => undef, 'Line' => '848', 'Memb' => { '0' => { 'name' => 'IBV_WQS_RESET', 'value' => '0' }, '1' => { 'name' => 'IBV_WQS_RDY', 'value' => '1' }, '2' => { 'name' => 'IBV_WQS_ERR', 'value' => '2' }, '3' => { 'name' => 'IBV_WQS_UNKNOWN', 'value' => '3' } }, 'Name' => 'enum ibv_wq_state', 'Size' => '4', 'Type' => 'Enum' }, '8815' => { 'Header' => undef, 'Line' => '901', 'Memb' => { '0' => { 'name' => 'IBV_QPT_RC', 'value' => '2' }, '1' => { 'name' => 'IBV_QPT_UC', 'value' => '3' }, '2' => { 'name' => 'IBV_QPT_UD', 'value' => '4' }, '3' => { 'name' => 'IBV_QPT_RAW_PACKET', 'value' => '8' }, '4' => { 'name' => 'IBV_QPT_XRC_SEND', 'value' => '9' }, '5' => { 'name' => 'IBV_QPT_XRC_RECV', 'value' => '10' }, '6' => { 'name' => 'IBV_QPT_DRIVER', 'value' => '255' } }, 'Name' => 'enum ibv_qp_type', 'Size' => '4', 'Type' => 'Enum' }, '9466' => { 'Header' => undef, 'Line' => '1050', 'Memb' => { '0' => { 'name' => 'IBV_QPS_RESET', 'value' => '0' }, '1' => { 'name' => 'IBV_QPS_INIT', 'value' => '1' }, '2' => { 'name' => 'IBV_QPS_RTR', 'value' => '2' }, '3' => { 'name' => 'IBV_QPS_RTS', 'value' => '3' }, '4' => { 'name' => 'IBV_QPS_SQD', 'value' => '4' }, '5' => { 'name' => 'IBV_QPS_SQE', 'value' => '5' }, '6' => { 'name' => 'IBV_QPS_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_QPS_UNKNOWN', 'value' => '7' } }, 'Name' => 'enum ibv_qp_state', 'Size' => '4', 'Type' => 'Enum' } }, 'UndefinedSymbols' => { 'libmana.so.1.0.56.0' => { '_ITM_deregisterTMCloneTable' => 0, '_ITM_registerTMCloneTable' => 0, '__cxa_finalize@GLIBC_2.2.5' => 0, '__errno_location@GLIBC_2.2.5' => 0, '__gmon_start__' => 0, '__stack_chk_fail@GLIBC_2.4' => 0, '__verbs_log@IBVERBS_PRIVATE_34' => 0, '_verbs_init_and_alloc_context@IBVERBS_PRIVATE_34' => 0, 'calloc@GLIBC_2.2.5' => 0, 'free@GLIBC_2.2.5' => 0, 'ibv_cmd_alloc_pd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_cq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_qp_ex2@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_rwq_ind_table@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_wq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_dealloc_pd@IBVERBS_PRIVATE_34' => 0, 
'ibv_cmd_dereg_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_cq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_rwq_ind_table@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_wq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_get_context@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_device_any@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_port@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_reg_mr@IBVERBS_PRIVATE_34' => 0, 'malloc@GLIBC_2.2.5' => 0, 'memset@GLIBC_2.2.5' => 0, 'mmap@GLIBC_2.2.5' => 0, 'munmap@GLIBC_2.2.5' => 0, 'pthread_mutex_destroy@GLIBC_2.2.5' => 0, 'pthread_mutex_init@GLIBC_2.2.5' => 0, 'pthread_mutex_lock@GLIBC_2.2.5' => 0, 'pthread_mutex_unlock@GLIBC_2.2.5' => 0, 'pthread_spin_destroy@GLIBC_2.34' => 0, 'pthread_spin_init@GLIBC_2.34' => 0, 'pthread_spin_lock@GLIBC_2.34' => 0, 'pthread_spin_unlock@GLIBC_2.34' => 0, 'verbs_register_driver_34@IBVERBS_PRIVATE_34' => 0, 'verbs_set_ops@IBVERBS_PRIVATE_34' => 0, 'verbs_uninit_context@IBVERBS_PRIVATE_34' => 0 } }, 'WordSize' => '8' }; rdma-core-56.1/ABI/mlx4.dump000066400000000000000000006364451477342711600155220ustar00rootroot00000000000000$VAR1 = { 'ABI_DUMPER_VERSION' => '1.2', 'ABI_DUMP_VERSION' => '3.5', 'Arch' => 'x86_64', 'GccVersion' => '12.3.0', 'Headers' => {}, 'Language' => 'C', 'LibraryName' => 'libmlx4.so.1.0.56.0', 'LibraryVersion' => 'mlx4', 'MissedOffsets' => '1', 'MissedRegs' => '1', 'NameSpaces' => {}, 'Needed' => { 'libc.so.6' => 1, 'libibverbs.so.1' => 1 }, 'Sources' => {}, 'SymbolInfo' => { '178063' => { 'Header' => undef, 'Line' => '1051', 'Param' => { '0' => { 'name' => 'context', 'type' => '1699' }, '1' => { 'name' => 'attr', 'type' => '13060' }, '2' => { 'name' => 'mlx4_qp_attr', 'type' => '109709' } }, 'Return' => '4033', 'ShortName' => 'mlx4dv_create_qp' }, '87428' => { 'Header' => undef, 'Line' => '401', 'Param' => { '0' => { 'name' => 'context', 'type' => '1699' }, '1' => { 'name' => 'attr_type', 'type' => '83281' }, '2' => { 'name' => 'attr', 'type' => '243' } }, 'Return' => '70', 'ShortName' => 'mlx4dv_set_context_attr' }, '87524' => { 'Header' => undef, 'Line' => '388', 'Param' => { '0' => { 'name' => 'ctx_in', 'type' => '1699' }, '1' => { 'name' => 'attrs_out', 'type' => '87607' } }, 'Return' => '70', 'ShortName' => 'mlx4dv_query_device' }, '87612' => { 'Header' => undef, 'Line' => '372', 'Param' => { '0' => { 'name' => 'obj', 'type' => '88056' }, '1' => { 'name' => 'obj_type', 'type' => '957' } }, 'Return' => '70', 'ShortName' => 'mlx4dv_init_obj' } }, 'SymbolVersion' => { 'mlx4dv_create_qp' => 'mlx4dv_create_qp@@MLX4_1.0', 'mlx4dv_init_obj' => 'mlx4dv_init_obj@@MLX4_1.0', 'mlx4dv_query_device' => 'mlx4dv_query_device@@MLX4_1.0', 'mlx4dv_set_context_attr' => 'mlx4dv_set_context_attr@@MLX4_1.0' }, 'Symbols' => { 'libmlx4.so.1.0.56.0' => { 'mlx4dv_create_qp@@MLX4_1.0' => 1, 'mlx4dv_init_obj@@MLX4_1.0' => 1, 'mlx4dv_query_device@@MLX4_1.0' => 1, 'mlx4dv_set_context_attr@@MLX4_1.0' => 1 } }, 'Target' => 'unix', 'TypeInfo' => { '1' => { 'Name' => 'void', 'Type' => 'Intrinsic' }, '1005' => { 'BaseType' => '981', 'Header' => undef, 'Line' => '27', 'Name' => '__be32', 'Size' => '4', 'Type' => 'Typedef' }, '10104' => { 'Header' => undef, 'Line' => '1969', 'Memb' => { '0' => { 'name' => '_dummy1', 'offset' => '0', 'type' => '10283' }, '1' => { 'name' => '_dummy2', 'offset' => '8', 'type' => '10299' } }, 'Name' => 'struct _ibv_device_ops', 'Size' => '16', 'Type' => 'Struct' }, '10166' => { 'BaseType' => '10171', 'Name' => 'struct ibv_device*', 'Size' => '8', 
'Type' => 'Pointer' }, '1017' => { 'BaseType' => '993', 'Header' => undef, 'Line' => '29', 'Name' => '__be64', 'Size' => '8', 'Type' => 'Typedef' }, '10171' => { 'Header' => undef, 'Line' => '1979', 'Memb' => { '0' => { 'name' => '_ops', 'offset' => '0', 'type' => '10104' }, '1' => { 'name' => 'node_type', 'offset' => '22', 'type' => '1305' }, '2' => { 'name' => 'transport_type', 'offset' => '32', 'type' => '1369' }, '3' => { 'name' => 'name', 'offset' => '36', 'type' => '2313' }, '4' => { 'name' => 'dev_name', 'offset' => '136', 'type' => '2313' }, '5' => { 'name' => 'dev_path', 'offset' => '338', 'type' => '10304' }, '6' => { 'name' => 'ibdev_path', 'offset' => '1032', 'type' => '10304' } }, 'Name' => 'struct ibv_device', 'Size' => '664', 'Type' => 'Struct' }, '10283' => { 'Name' => 'struct ibv_context*(*)(struct ibv_device*, int)', 'Param' => { '0' => { 'type' => '10166' }, '1' => { 'type' => '70' } }, 'Return' => '1699', 'Size' => '8', 'Type' => 'FuncPtr' }, '10299' => { 'Name' => 'void(*)(struct ibv_context*)', 'Param' => { '0' => { 'type' => '1699' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '10304' => { 'BaseType' => '257', 'Name' => 'char[256]', 'Size' => '256', 'Type' => 'Array' }, '10320' => { 'Header' => undef, 'Line' => '1994', 'Memb' => { '0' => { 'name' => '_compat_query_device', 'offset' => '0', 'type' => '10807' }, '1' => { 'name' => '_compat_query_port', 'offset' => '8', 'type' => '10847' }, '10' => { 'name' => '_compat_create_cq', 'offset' => '128', 'type' => '10857' }, '11' => { 'name' => 'poll_cq', 'offset' => '136', 'type' => '10972' }, '12' => { 'name' => 'req_notify_cq', 'offset' => '150', 'type' => '10997' }, '13' => { 'name' => '_compat_cq_event', 'offset' => '260', 'type' => '10857' }, '14' => { 'name' => '_compat_resize_cq', 'offset' => '274', 'type' => '10857' }, '15' => { 'name' => '_compat_destroy_cq', 'offset' => '288', 'type' => '10857' }, '16' => { 'name' => '_compat_create_srq', 'offset' => '296', 'type' => '10857' }, '17' => { 'name' => '_compat_modify_srq', 'offset' => '310', 'type' => '10857' }, '18' => { 'name' => '_compat_query_srq', 'offset' => '324', 'type' => '10857' }, '19' => { 'name' => '_compat_destroy_srq', 'offset' => '338', 'type' => '10857' }, '2' => { 'name' => '_compat_alloc_pd', 'offset' => '22', 'type' => '10857' }, '20' => { 'name' => 'post_srq_recv', 'offset' => '352', 'type' => '11027' }, '21' => { 'name' => '_compat_create_qp', 'offset' => '360', 'type' => '10857' }, '22' => { 'name' => '_compat_query_qp', 'offset' => '374', 'type' => '10857' }, '23' => { 'name' => '_compat_modify_qp', 'offset' => '388', 'type' => '10857' }, '24' => { 'name' => '_compat_destroy_qp', 'offset' => '402', 'type' => '10857' }, '25' => { 'name' => 'post_send', 'offset' => '512', 'type' => '11062' }, '26' => { 'name' => 'post_recv', 'offset' => '520', 'type' => '11092' }, '27' => { 'name' => '_compat_create_ah', 'offset' => '534', 'type' => '10857' }, '28' => { 'name' => '_compat_destroy_ah', 'offset' => '548', 'type' => '10857' }, '29' => { 'name' => '_compat_attach_mcast', 'offset' => '562', 'type' => '10857' }, '3' => { 'name' => '_compat_dealloc_pd', 'offset' => '36', 'type' => '10857' }, '30' => { 'name' => '_compat_detach_mcast', 'offset' => '576', 'type' => '10857' }, '31' => { 'name' => '_compat_async_event', 'offset' => '584', 'type' => '10857' }, '4' => { 'name' => '_compat_reg_mr', 'offset' => '50', 'type' => '10857' }, '5' => { 'name' => '_compat_rereg_mr', 'offset' => '64', 'type' => '10857' }, '6' => { 'name' => 
'_compat_dereg_mr', 'offset' => '72', 'type' => '10857' }, '7' => { 'name' => 'alloc_mw', 'offset' => '86', 'type' => '10882' }, '8' => { 'name' => 'bind_mw', 'offset' => '100', 'type' => '10917' }, '9' => { 'name' => 'dealloc_mw', 'offset' => '114', 'type' => '10937' } }, 'Name' => 'struct ibv_context_ops', 'Size' => '256', 'Type' => 'Struct' }, '106679' => { 'Header' => undef, 'Line' => '425', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '957' }, '1' => { 'name' => 'inl_recv_sz', 'offset' => '8', 'type' => '945' } }, 'Name' => 'struct mlx4dv_qp_init_attr', 'Size' => '16', 'Type' => 'Struct' }, '10802' => { 'BaseType' => '1779', 'Name' => 'struct ibv_device_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '10807' => { 'Name' => 'int(*)(struct ibv_context*, struct ibv_device_attr*)', 'Param' => { '0' => { 'type' => '1699' }, '1' => { 'type' => '10802' } }, 'Return' => '70', 'Size' => '8', 'Type' => 'FuncPtr' }, '10837' => { 'BaseType' => '10842', 'Name' => 'struct _compat_ibv_port_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '10842' => { 'Name' => 'struct _compat_ibv_port_attr', 'Type' => 'Struct' }, '10847' => { 'Name' => 'int(*)(struct ibv_context*, uint8_t, struct _compat_ibv_port_attr*)', 'Param' => { '0' => { 'type' => '1699' }, '1' => { 'type' => '921' }, '2' => { 'type' => '10837' } }, 'Return' => '70', 'Size' => '8', 'Type' => 'FuncPtr' }, '10857' => { 'Name' => 'void*(*)()', 'Return' => '243', 'Size' => '8', 'Type' => 'FuncPtr' }, '10882' => { 'Name' => 'struct ibv_mw*(*)(struct ibv_pd*, enum ibv_mw_type)', 'Param' => { '0' => { 'type' => '5233' }, '1' => { 'type' => '5238' } }, 'Return' => '7271', 'Size' => '8', 'Type' => 'FuncPtr' }, '10912' => { 'BaseType' => '7830', 'Name' => 'struct ibv_mw_bind*', 'Size' => '8', 'Type' => 'Pointer' }, '10917' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_mw*, struct ibv_mw_bind*)', 'Param' => { '0' => { 'type' => '4033' }, '1' => { 'type' => '7271' }, '2' => { 'type' => '10912' } }, 'Return' => '70', 'Size' => '8', 'Type' => 'FuncPtr' }, '10937' => { 'Name' => 'int(*)(struct ibv_mw*)', 'Param' => { '0' => { 'type' => '7271' } }, 'Return' => '70', 'Size' => '8', 'Type' => 'FuncPtr' }, '10967' => { 'BaseType' => '4679', 'Name' => 'struct ibv_wc*', 'Size' => '8', 'Type' => 'Pointer' }, '109709' => { 'BaseType' => '106679', 'Name' => 'struct mlx4dv_qp_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '10972' => { 'Name' => 'int(*)(struct ibv_cq*, int, struct ibv_wc*)', 'Param' => { '0' => { 'type' => '3835' }, '1' => { 'type' => '70' }, '2' => { 'type' => '10967' } }, 'Return' => '70', 'Size' => '8', 'Type' => 'FuncPtr' }, '10997' => { 'Name' => 'int(*)(struct ibv_cq*, int)', 'Param' => { '0' => { 'type' => '3835' }, '1' => { 'type' => '70' } }, 'Return' => '70', 'Size' => '8', 'Type' => 'FuncPtr' }, '11027' => { 'Name' => 'int(*)(struct ibv_srq*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '4148' }, '1' => { 'type' => '7574' }, '2' => { 'type' => '7911' } }, 'Return' => '70', 'Size' => '8', 'Type' => 'FuncPtr' }, '11057' => { 'BaseType' => '7494', 'Name' => 'struct ibv_send_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '11062' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_send_wr*, struct ibv_send_wr**)', 'Param' => { '0' => { 'type' => '4033' }, '1' => { 'type' => '7494' }, '2' => { 'type' => '11057' } }, 'Return' => '70', 'Size' => '8', 'Type' => 'FuncPtr' }, '11092' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' 
=> '4033' }, '1' => { 'type' => '7574' }, '2' => { 'type' => '7911' } }, 'Return' => '70', 'Size' => '8', 'Type' => 'FuncPtr' }, '111' => { 'Name' => 'unsigned char', 'Size' => '1', 'Type' => 'Intrinsic' }, '123' => { 'Name' => 'unsigned short', 'Size' => '2', 'Type' => 'Intrinsic' }, '1305' => { 'Header' => undef, 'Line' => '95', 'Memb' => { '0' => { 'name' => 'IBV_NODE_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_NODE_CA', 'value' => '1' }, '2' => { 'name' => 'IBV_NODE_SWITCH', 'value' => '2' }, '3' => { 'name' => 'IBV_NODE_ROUTER', 'value' => '3' }, '4' => { 'name' => 'IBV_NODE_RNIC', 'value' => '4' }, '5' => { 'name' => 'IBV_NODE_USNIC', 'value' => '5' }, '6' => { 'name' => 'IBV_NODE_USNIC_UDP', 'value' => '6' }, '7' => { 'name' => 'IBV_NODE_UNSPECIFIED', 'value' => '7' } }, 'Name' => 'enum ibv_node_type', 'Size' => '4', 'Type' => 'Enum' }, '13060' => { 'BaseType' => '6223', 'Name' => 'struct ibv_qp_init_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '13090' => { 'BaseType' => '945', 'Name' => 'uint32_t*', 'Size' => '8', 'Type' => 'Pointer' }, '13397' => { 'BaseType' => '1005', 'Name' => '__be32*', 'Size' => '8', 'Type' => 'Pointer' }, '135' => { 'Name' => 'unsigned int', 'Size' => '4', 'Type' => 'Intrinsic' }, '1369' => { 'Header' => undef, 'Line' => '106', 'Memb' => { '0' => { 'name' => 'IBV_TRANSPORT_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_TRANSPORT_IB', 'value' => '0' }, '2' => { 'name' => 'IBV_TRANSPORT_IWARP', 'value' => '1' }, '3' => { 'name' => 'IBV_TRANSPORT_USNIC', 'value' => '2' }, '4' => { 'name' => 'IBV_TRANSPORT_USNIC_UDP', 'value' => '3' }, '5' => { 'name' => 'IBV_TRANSPORT_UNSPECIFIED', 'value' => '4' } }, 'Name' => 'enum ibv_transport_type', 'Size' => '4', 'Type' => 'Enum' }, '1421' => { 'Header' => undef, 'Line' => '155', 'Memb' => { '0' => { 'name' => 'IBV_ATOMIC_NONE', 'value' => '0' }, '1' => { 'name' => 'IBV_ATOMIC_HCA', 'value' => '1' }, '2' => { 'name' => 'IBV_ATOMIC_GLOB', 'value' => '2' } }, 'Name' => 'enum ibv_atomic_cap', 'Size' => '4', 'Type' => 'Enum' }, '1588' => { 'Header' => undef, 'Line' => '2037', 'Memb' => { '0' => { 'name' => 'device', 'offset' => '0', 'type' => '10166' }, '1' => { 'name' => 'ops', 'offset' => '8', 'type' => '10320' }, '2' => { 'name' => 'cmd_fd', 'offset' => '612', 'type' => '70' }, '3' => { 'name' => 'async_fd', 'offset' => '616', 'type' => '70' }, '4' => { 'name' => 'num_comp_vectors', 'offset' => '626', 'type' => '70' }, '5' => { 'name' => 'mutex', 'offset' => '640', 'type' => '771' }, '6' => { 'name' => 'abi_compat', 'offset' => '800', 'type' => '243' } }, 'Name' => 'struct ibv_context', 'Size' => '328', 'Type' => 'Struct' }, '159' => { 'BaseType' => '111', 'Header' => undef, 'Line' => '38', 'Name' => '__uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '1699' => { 'BaseType' => '1588', 'Name' => 'struct ibv_context*', 'Size' => '8', 'Type' => 'Pointer' }, '1779' => { 'Header' => undef, 'Line' => '182', 'Memb' => { '0' => { 'name' => 'fw_ver', 'offset' => '0', 'type' => '2313' }, '1' => { 'name' => 'node_guid', 'offset' => '100', 'type' => '1017' }, '10' => { 'name' => 'device_cap_flags', 'offset' => '278', 'type' => '135' }, '11' => { 'name' => 'max_sge', 'offset' => '288', 'type' => '70' }, '12' => { 'name' => 'max_sge_rd', 'offset' => '292', 'type' => '70' }, '13' => { 'name' => 'max_cq', 'offset' => '296', 'type' => '70' }, '14' => { 'name' => 'max_cqe', 'offset' => '306', 'type' => '70' }, '15' => { 'name' => 'max_mr', 'offset' => '310', 'type' => '70' }, 
'16' => { 'name' => 'max_pd', 'offset' => '320', 'type' => '70' }, '17' => { 'name' => 'max_qp_rd_atom', 'offset' => '324', 'type' => '70' }, '18' => { 'name' => 'max_ee_rd_atom', 'offset' => '328', 'type' => '70' }, '19' => { 'name' => 'max_res_rd_atom', 'offset' => '338', 'type' => '70' }, '2' => { 'name' => 'sys_image_guid', 'offset' => '114', 'type' => '1017' }, '20' => { 'name' => 'max_qp_init_rd_atom', 'offset' => '342', 'type' => '70' }, '21' => { 'name' => 'max_ee_init_rd_atom', 'offset' => '352', 'type' => '70' }, '22' => { 'name' => 'atomic_cap', 'offset' => '356', 'type' => '1421' }, '23' => { 'name' => 'max_ee', 'offset' => '360', 'type' => '70' }, '24' => { 'name' => 'max_rdd', 'offset' => '370', 'type' => '70' }, '25' => { 'name' => 'max_mw', 'offset' => '374', 'type' => '70' }, '26' => { 'name' => 'max_raw_ipv6_qp', 'offset' => '384', 'type' => '70' }, '27' => { 'name' => 'max_raw_ethy_qp', 'offset' => '388', 'type' => '70' }, '28' => { 'name' => 'max_mcast_grp', 'offset' => '392', 'type' => '70' }, '29' => { 'name' => 'max_mcast_qp_attach', 'offset' => '402', 'type' => '70' }, '3' => { 'name' => 'max_mr_size', 'offset' => '128', 'type' => '957' }, '30' => { 'name' => 'max_total_mcast_qp_attach', 'offset' => '406', 'type' => '70' }, '31' => { 'name' => 'max_ah', 'offset' => '512', 'type' => '70' }, '32' => { 'name' => 'max_fmr', 'offset' => '516', 'type' => '70' }, '33' => { 'name' => 'max_map_per_fmr', 'offset' => '520', 'type' => '70' }, '34' => { 'name' => 'max_srq', 'offset' => '530', 'type' => '70' }, '35' => { 'name' => 'max_srq_wr', 'offset' => '534', 'type' => '70' }, '36' => { 'name' => 'max_srq_sge', 'offset' => '544', 'type' => '70' }, '37' => { 'name' => 'max_pkeys', 'offset' => '548', 'type' => '933' }, '38' => { 'name' => 'local_ca_ack_delay', 'offset' => '550', 'type' => '921' }, '39' => { 'name' => 'phys_port_cnt', 'offset' => '551', 'type' => '921' }, '4' => { 'name' => 'page_size_cap', 'offset' => '136', 'type' => '957' }, '5' => { 'name' => 'vendor_id', 'offset' => '150', 'type' => '945' }, '6' => { 'name' => 'vendor_part_id', 'offset' => '256', 'type' => '945' }, '7' => { 'name' => 'hw_ver', 'offset' => '260', 'type' => '945' }, '8' => { 'name' => 'max_qp', 'offset' => '264', 'type' => '70' }, '9' => { 'name' => 'max_qp_wr', 'offset' => '274', 'type' => '70' } }, 'Name' => 'struct ibv_device_attr', 'Size' => '232', 'Type' => 'Struct' }, '183' => { 'BaseType' => '123', 'Header' => undef, 'Line' => '40', 'Name' => '__uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '195' => { 'BaseType' => '135', 'Header' => undef, 'Line' => '42', 'Name' => '__uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '207' => { 'BaseType' => '58', 'Header' => undef, 'Line' => '45', 'Name' => '__uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '219' => { 'BaseType' => '87', 'Header' => undef, 'Line' => '152', 'Name' => '__off_t', 'Size' => '8', 'Type' => 'Typedef' }, '2313' => { 'BaseType' => '257', 'Name' => 'char[64]', 'Size' => '64', 'Type' => 'Array' }, '243' => { 'BaseType' => '1', 'Name' => 'void*', 'Size' => '8', 'Type' => 'Pointer' }, '257' => { 'Name' => 'char', 'Size' => '1', 'Type' => 'Intrinsic' }, '269' => { 'BaseType' => '219', 'Header' => undef, 'Line' => '85', 'Name' => 'off_t', 'Size' => '8', 'Type' => 'Typedef' }, '3696' => { 'Header' => undef, 'Line' => '1508', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '1699' }, '1' => { 'name' => 'channel', 'offset' => '8', 'type' => '8849' }, '2' => { 'name' => 'cq_context', 'offset' => '22', 'type' => 
'243' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '945' }, '4' => { 'name' => 'cqe', 'offset' => '40', 'type' => '70' }, '5' => { 'name' => 'mutex', 'offset' => '50', 'type' => '771' }, '6' => { 'name' => 'cond', 'offset' => '114', 'type' => '844' }, '7' => { 'name' => 'comp_events_completed', 'offset' => '288', 'type' => '945' }, '8' => { 'name' => 'async_events_completed', 'offset' => '292', 'type' => '945' } }, 'Name' => 'struct ibv_cq', 'Size' => '128', 'Type' => 'Struct' }, '3835' => { 'BaseType' => '3696', 'Name' => 'struct ibv_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '3840' => { 'Header' => undef, 'Line' => '1283', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '1699' }, '1' => { 'name' => 'qp_context', 'offset' => '8', 'type' => '243' }, '10' => { 'name' => 'mutex', 'offset' => '100', 'type' => '771' }, '11' => { 'name' => 'cond', 'offset' => '260', 'type' => '844' }, '12' => { 'name' => 'events_completed', 'offset' => '338', 'type' => '945' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '5233' }, '3' => { 'name' => 'send_cq', 'offset' => '36', 'type' => '3835' }, '4' => { 'name' => 'recv_cq', 'offset' => '50', 'type' => '3835' }, '5' => { 'name' => 'srq', 'offset' => '64', 'type' => '4148' }, '6' => { 'name' => 'handle', 'offset' => '72', 'type' => '945' }, '7' => { 'name' => 'qp_num', 'offset' => '82', 'type' => '945' }, '8' => { 'name' => 'state', 'offset' => '86', 'type' => '6546' }, '9' => { 'name' => 'qp_type', 'offset' => '96', 'type' => '6006' } }, 'Name' => 'struct ibv_qp', 'Size' => '160', 'Type' => 'Struct' }, '390' => { 'Name' => 'unsigned long long', 'Size' => '8', 'Type' => 'Intrinsic' }, '4033' => { 'BaseType' => '3840', 'Name' => 'struct ibv_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '4038' => { 'Header' => undef, 'Line' => '1243', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '1699' }, '1' => { 'name' => 'srq_context', 'offset' => '8', 'type' => '243' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '5233' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '945' }, '4' => { 'name' => 'mutex', 'offset' => '50', 'type' => '771' }, '5' => { 'name' => 'cond', 'offset' => '114', 'type' => '844' }, '6' => { 'name' => 'events_completed', 'offset' => '288', 'type' => '945' } }, 'Name' => 'struct ibv_srq', 'Size' => '128', 'Type' => 'Struct' }, '4148' => { 'BaseType' => '4038', 'Name' => 'struct ibv_srq*', 'Size' => '8', 'Type' => 'Pointer' }, '4153' => { 'Header' => undef, 'Line' => '1265', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '1699' }, '1' => { 'name' => 'wq_context', 'offset' => '8', 'type' => '243' }, '10' => { 'name' => 'cond', 'offset' => '150', 'type' => '844' }, '11' => { 'name' => 'events_completed', 'offset' => '324', 'type' => '945' }, '12' => { 'name' => 'comp_mask', 'offset' => '328', 'type' => '945' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '5233' }, '3' => { 'name' => 'cq', 'offset' => '36', 'type' => '3835' }, '4' => { 'name' => 'wq_num', 'offset' => '50', 'type' => '945' }, '5' => { 'name' => 'handle', 'offset' => '54', 'type' => '945' }, '6' => { 'name' => 'state', 'offset' => '64', 'type' => '5751' }, '7' => { 'name' => 'wq_type', 'offset' => '68', 'type' => '5607' }, '8' => { 'name' => 'post_recv', 'offset' => '72', 'type' => '7916' }, '9' => { 'name' => 'mutex', 'offset' => '86', 'type' => '771' } }, 'Name' => 'struct ibv_wq', 'Size' => '152', 'Type' => 'Struct' }, '4345' => { 'BaseType' => '4153', 'Name' => 'struct 
ibv_wq*', 'Size' => '8', 'Type' => 'Pointer' }, '4350' => { 'Header' => undef, 'Line' => '485', 'Memb' => { '0' => { 'name' => 'IBV_WC_SUCCESS', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_LOC_LEN_ERR', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_REM_ACCESS_ERR', 'value' => '10' }, '11' => { 'name' => 'IBV_WC_REM_OP_ERR', 'value' => '11' }, '12' => { 'name' => 'IBV_WC_RETRY_EXC_ERR', 'value' => '12' }, '13' => { 'name' => 'IBV_WC_RNR_RETRY_EXC_ERR', 'value' => '13' }, '14' => { 'name' => 'IBV_WC_LOC_RDD_VIOL_ERR', 'value' => '14' }, '15' => { 'name' => 'IBV_WC_REM_INV_RD_REQ_ERR', 'value' => '15' }, '16' => { 'name' => 'IBV_WC_REM_ABORT_ERR', 'value' => '16' }, '17' => { 'name' => 'IBV_WC_INV_EECN_ERR', 'value' => '17' }, '18' => { 'name' => 'IBV_WC_INV_EEC_STATE_ERR', 'value' => '18' }, '19' => { 'name' => 'IBV_WC_FATAL_ERR', 'value' => '19' }, '2' => { 'name' => 'IBV_WC_LOC_QP_OP_ERR', 'value' => '2' }, '20' => { 'name' => 'IBV_WC_RESP_TIMEOUT_ERR', 'value' => '20' }, '21' => { 'name' => 'IBV_WC_GENERAL_ERR', 'value' => '21' }, '22' => { 'name' => 'IBV_WC_TM_ERR', 'value' => '22' }, '23' => { 'name' => 'IBV_WC_TM_RNDV_INCOMPLETE', 'value' => '23' }, '3' => { 'name' => 'IBV_WC_LOC_EEC_OP_ERR', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_LOC_PROT_ERR', 'value' => '4' }, '5' => { 'name' => 'IBV_WC_WR_FLUSH_ERR', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_MW_BIND_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_BAD_RESP_ERR', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_LOC_ACCESS_ERR', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_REM_INV_REQ_ERR', 'value' => '9' } }, 'Name' => 'enum ibv_wc_status', 'Size' => '4', 'Type' => 'Enum' }, '4510' => { 'Header' => undef, 'Line' => '513', 'Memb' => { '0' => { 'name' => 'IBV_WC_SEND', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_RDMA_WRITE', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_RECV', 'value' => '128' }, '11' => { 'name' => 'IBV_WC_RECV_RDMA_WITH_IMM', 'value' => '129' }, '12' => { 'name' => 'IBV_WC_TM_ADD', 'value' => '130' }, '13' => { 'name' => 'IBV_WC_TM_DEL', 'value' => '131' }, '14' => { 'name' => 'IBV_WC_TM_SYNC', 'value' => '132' }, '15' => { 'name' => 'IBV_WC_TM_RECV', 'value' => '133' }, '16' => { 'name' => 'IBV_WC_TM_NO_TAG', 'value' => '134' }, '17' => { 'name' => 'IBV_WC_DRIVER1', 'value' => '135' }, '18' => { 'name' => 'IBV_WC_DRIVER2', 'value' => '136' }, '19' => { 'name' => 'IBV_WC_DRIVER3', 'value' => '137' }, '2' => { 'name' => 'IBV_WC_RDMA_READ', 'value' => '2' }, '3' => { 'name' => 'IBV_WC_COMP_SWAP', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_FETCH_ADD', 'value' => '4' }, '5' => { 'name' => 'IBV_WC_BIND_MW', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_LOCAL_INV', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_TSO', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_FLUSH', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_ATOMIC_WRITE', 'value' => '9' } }, 'Name' => 'enum ibv_wc_opcode', 'Size' => '4', 'Type' => 'Enum' }, '46' => { 'BaseType' => '58', 'Header' => undef, 'Line' => '214', 'Name' => 'size_t', 'Size' => '8', 'Type' => 'Typedef' }, '4646' => { 'Header' => undef, 'Line' => '598', 'Memb' => { '0' => { 'name' => 'imm_data', 'offset' => '0', 'type' => '1005' }, '1' => { 'name' => 'invalidated_rkey', 'offset' => '0', 'type' => '945' } }, 'Size' => '4', 'Type' => 'Union' }, '4679' => { 'Header' => undef, 'Line' => '589', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '957' }, '1' => { 'name' => 'status', 'offset' => '8', 'type' => '4350' }, '10' => { 'name' => 'slid', 'offset' => '66', 'type' => 
'933' }, '11' => { 'name' => 'sl', 'offset' => '68', 'type' => '921' }, '12' => { 'name' => 'dlid_path_bits', 'offset' => '69', 'type' => '921' }, '2' => { 'name' => 'opcode', 'offset' => '18', 'type' => '4510' }, '3' => { 'name' => 'vendor_err', 'offset' => '22', 'type' => '945' }, '4' => { 'name' => 'byte_len', 'offset' => '32', 'type' => '945' }, '5' => { 'name' => 'unnamed0', 'offset' => '36', 'type' => '4646' }, '6' => { 'name' => 'qp_num', 'offset' => '40', 'type' => '945' }, '7' => { 'name' => 'src_qp', 'offset' => '50', 'type' => '945' }, '8' => { 'name' => 'wc_flags', 'offset' => '54', 'type' => '135' }, '9' => { 'name' => 'pkey_index', 'offset' => '64', 'type' => '933' } }, 'Name' => 'struct ibv_wc', 'Size' => '48', 'Type' => 'Struct' }, '4865' => { 'Header' => undef, 'Line' => '625', 'Memb' => { '0' => { 'name' => 'mr', 'offset' => '0', 'type' => '5048' }, '1' => { 'name' => 'addr', 'offset' => '8', 'type' => '957' }, '2' => { 'name' => 'length', 'offset' => '22', 'type' => '957' }, '3' => { 'name' => 'mw_access_flags', 'offset' => '36', 'type' => '135' } }, 'Name' => 'struct ibv_mw_bind_info', 'Size' => '32', 'Type' => 'Struct' }, '4938' => { 'Header' => undef, 'Line' => '668', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '1699' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '5233' }, '2' => { 'name' => 'addr', 'offset' => '22', 'type' => '243' }, '3' => { 'name' => 'length', 'offset' => '36', 'type' => '46' }, '4' => { 'name' => 'handle', 'offset' => '50', 'type' => '945' }, '5' => { 'name' => 'lkey', 'offset' => '54', 'type' => '945' }, '6' => { 'name' => 'rkey', 'offset' => '64', 'type' => '945' } }, 'Name' => 'struct ibv_mr', 'Size' => '48', 'Type' => 'Struct' }, '5048' => { 'BaseType' => '4938', 'Name' => 'struct ibv_mr*', 'Size' => '8', 'Type' => 'Pointer' }, '5053' => { 'Header' => undef, 'Line' => '632', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '1699' }, '1' => { 'name' => 'handle', 'offset' => '8', 'type' => '945' } }, 'Name' => 'struct ibv_pd', 'Size' => '16', 'Type' => 'Struct' }, '5205' => { 'Header' => undef, 'Line' => '657', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '1699' } }, 'Name' => 'struct ibv_xrcd', 'Size' => '8', 'Type' => 'Struct' }, '5233' => { 'BaseType' => '5053', 'Name' => 'struct ibv_pd*', 'Size' => '8', 'Type' => 'Pointer' }, '5238' => { 'Header' => undef, 'Line' => '678', 'Memb' => { '0' => { 'name' => 'IBV_MW_TYPE_1', 'value' => '1' }, '1' => { 'name' => 'IBV_MW_TYPE_2', 'value' => '2' } }, 'Name' => 'enum ibv_mw_type', 'Size' => '4', 'Type' => 'Enum' }, '5266' => { 'Header' => undef, 'Line' => '683', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '1699' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '5233' }, '2' => { 'name' => 'rkey', 'offset' => '22', 'type' => '945' }, '3' => { 'name' => 'handle', 'offset' => '32', 'type' => '945' }, '4' => { 'name' => 'type', 'offset' => '36', 'type' => '5238' } }, 'Name' => 'struct ibv_mw', 'Size' => '32', 'Type' => 'Struct' }, '5602' => { 'BaseType' => '5205', 'Name' => 'struct ibv_xrcd*', 'Size' => '8', 'Type' => 'Pointer' }, '5607' => { 'Header' => undef, 'Line' => '820', 'Memb' => { '0' => { 'name' => 'IBV_WQT_RQ', 'value' => '0' } }, 'Name' => 'enum ibv_wq_type', 'Size' => '4', 'Type' => 'Enum' }, '5751' => { 'Header' => undef, 'Line' => '848', 'Memb' => { '0' => { 'name' => 'IBV_WQS_RESET', 'value' => '0' }, '1' => { 'name' => 'IBV_WQS_RDY', 'value' => '1' }, '2' => { 'name' => 
'IBV_WQS_ERR', 'value' => '2' }, '3' => { 'name' => 'IBV_WQS_UNKNOWN', 'value' => '3' } }, 'Name' => 'enum ibv_wq_state', 'Size' => '4', 'Type' => 'Enum' }, '58' => { 'Name' => 'unsigned long', 'Size' => '8', 'Type' => 'Intrinsic' }, '5875' => { 'Header' => undef, 'Line' => '880', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '1699' }, '1' => { 'name' => 'ind_tbl_handle', 'offset' => '8', 'type' => '70' }, '2' => { 'name' => 'ind_tbl_num', 'offset' => '18', 'type' => '70' }, '3' => { 'name' => 'comp_mask', 'offset' => '22', 'type' => '945' } }, 'Name' => 'struct ibv_rwq_ind_table', 'Size' => '24', 'Type' => 'Struct' }, '6006' => { 'Header' => undef, 'Line' => '901', 'Memb' => { '0' => { 'name' => 'IBV_QPT_RC', 'value' => '2' }, '1' => { 'name' => 'IBV_QPT_UC', 'value' => '3' }, '2' => { 'name' => 'IBV_QPT_UD', 'value' => '4' }, '3' => { 'name' => 'IBV_QPT_RAW_PACKET', 'value' => '8' }, '4' => { 'name' => 'IBV_QPT_XRC_SEND', 'value' => '9' }, '5' => { 'name' => 'IBV_QPT_XRC_RECV', 'value' => '10' }, '6' => { 'name' => 'IBV_QPT_DRIVER', 'value' => '255' } }, 'Name' => 'enum ibv_qp_type', 'Size' => '4', 'Type' => 'Enum' }, '6064' => { 'Header' => undef, 'Line' => '911', 'Memb' => { '0' => { 'name' => 'max_send_wr', 'offset' => '0', 'type' => '945' }, '1' => { 'name' => 'max_recv_wr', 'offset' => '4', 'type' => '945' }, '2' => { 'name' => 'max_send_sge', 'offset' => '8', 'type' => '945' }, '3' => { 'name' => 'max_recv_sge', 'offset' => '18', 'type' => '945' }, '4' => { 'name' => 'max_inline_data', 'offset' => '22', 'type' => '945' } }, 'Name' => 'struct ibv_qp_cap', 'Size' => '20', 'Type' => 'Struct' }, '6148' => { 'Header' => undef, 'Line' => '963', 'Memb' => { '0' => { 'name' => 'rx_hash_function', 'offset' => '0', 'type' => '921' }, '1' => { 'name' => 'rx_hash_key_len', 'offset' => '1', 'type' => '921' }, '2' => { 'name' => 'rx_hash_key', 'offset' => '8', 'type' => '6218' }, '3' => { 'name' => 'rx_hash_fields_mask', 'offset' => '22', 'type' => '957' } }, 'Name' => 'struct ibv_rx_hash_conf', 'Size' => '24', 'Type' => 'Struct' }, '6218' => { 'BaseType' => '921', 'Name' => 'uint8_t*', 'Size' => '8', 'Type' => 'Pointer' }, '6223' => { 'Header' => undef, 'Line' => '972', 'Memb' => { '0' => { 'name' => 'qp_context', 'offset' => '0', 'type' => '243' }, '1' => { 'name' => 'send_cq', 'offset' => '8', 'type' => '3835' }, '10' => { 'name' => 'create_flags', 'offset' => '128', 'type' => '945' }, '11' => { 'name' => 'max_tso_header', 'offset' => '132', 'type' => '933' }, '12' => { 'name' => 'rwq_ind_tbl', 'offset' => '136', 'type' => '6457' }, '13' => { 'name' => 'rx_hash_conf', 'offset' => '150', 'type' => '6148' }, '14' => { 'name' => 'source_qpn', 'offset' => '288', 'type' => '945' }, '15' => { 'name' => 'send_ops_flags', 'offset' => '296', 'type' => '957' }, '2' => { 'name' => 'recv_cq', 'offset' => '22', 'type' => '3835' }, '3' => { 'name' => 'srq', 'offset' => '36', 'type' => '4148' }, '4' => { 'name' => 'cap', 'offset' => '50', 'type' => '6064' }, '5' => { 'name' => 'qp_type', 'offset' => '82', 'type' => '6006' }, '6' => { 'name' => 'sq_sig_all', 'offset' => '86', 'type' => '70' }, '7' => { 'name' => 'comp_mask', 'offset' => '96', 'type' => '945' }, '8' => { 'name' => 'pd', 'offset' => '100', 'type' => '5233' }, '9' => { 'name' => 'xrcd', 'offset' => '114', 'type' => '5602' } }, 'Name' => 'struct ibv_qp_init_attr_ex', 'Size' => '136', 'Type' => 'Struct' }, '6457' => { 'BaseType' => '5875', 'Name' => 'struct ibv_rwq_ind_table*', 'Size' => '8', 'Type' => 'Pointer' }, '6546' 
=> { 'Header' => undef, 'Line' => '1050', 'Memb' => { '0' => { 'name' => 'IBV_QPS_RESET', 'value' => '0' }, '1' => { 'name' => 'IBV_QPS_INIT', 'value' => '1' }, '2' => { 'name' => 'IBV_QPS_RTR', 'value' => '2' }, '3' => { 'name' => 'IBV_QPS_RTS', 'value' => '3' }, '4' => { 'name' => 'IBV_QPS_SQD', 'value' => '4' }, '5' => { 'name' => 'IBV_QPS_SQE', 'value' => '5' }, '6' => { 'name' => 'IBV_QPS_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_QPS_UNKNOWN', 'value' => '7' } }, 'Name' => 'enum ibv_qp_state', 'Size' => '4', 'Type' => 'Enum' }, '6680' => { 'Header' => undef, 'Line' => '1103', 'Memb' => { '0' => { 'name' => 'IBV_WR_RDMA_WRITE', 'value' => '0' }, '1' => { 'name' => 'IBV_WR_RDMA_WRITE_WITH_IMM', 'value' => '1' }, '10' => { 'name' => 'IBV_WR_TSO', 'value' => '10' }, '11' => { 'name' => 'IBV_WR_DRIVER1', 'value' => '11' }, '12' => { 'name' => 'IBV_WR_FLUSH', 'value' => '14' }, '13' => { 'name' => 'IBV_WR_ATOMIC_WRITE', 'value' => '15' }, '2' => { 'name' => 'IBV_WR_SEND', 'value' => '2' }, '3' => { 'name' => 'IBV_WR_SEND_WITH_IMM', 'value' => '3' }, '4' => { 'name' => 'IBV_WR_RDMA_READ', 'value' => '4' }, '5' => { 'name' => 'IBV_WR_ATOMIC_CMP_AND_SWP', 'value' => '5' }, '6' => { 'name' => 'IBV_WR_ATOMIC_FETCH_AND_ADD', 'value' => '6' }, '7' => { 'name' => 'IBV_WR_LOCAL_INV', 'value' => '7' }, '8' => { 'name' => 'IBV_WR_BIND_MW', 'value' => '8' }, '9' => { 'name' => 'IBV_WR_SEND_WITH_INV', 'value' => '9' } }, 'Name' => 'enum ibv_wr_opcode', 'Size' => '4', 'Type' => 'Enum' }, '6827' => { 'Header' => undef, 'Line' => '1145', 'Memb' => { '0' => { 'name' => 'addr', 'offset' => '0', 'type' => '957' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '945' }, '2' => { 'name' => 'lkey', 'offset' => '18', 'type' => '945' } }, 'Name' => 'struct ibv_sge', 'Size' => '16', 'Type' => 'Struct' }, '6888' => { 'Header' => undef, 'Line' => '1161', 'Memb' => { '0' => { 'name' => 'imm_data', 'offset' => '0', 'type' => '1005' }, '1' => { 'name' => 'invalidate_rkey', 'offset' => '0', 'type' => '945' } }, 'Size' => '4', 'Type' => 'Union' }, '6921' => { 'Header' => undef, 'Line' => '1166', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '957' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '945' } }, 'Size' => '16', 'Type' => 'Struct' }, '6959' => { 'Header' => undef, 'Line' => '1170', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '957' }, '1' => { 'name' => 'compare_add', 'offset' => '8', 'type' => '957' }, '2' => { 'name' => 'swap', 'offset' => '22', 'type' => '957' }, '3' => { 'name' => 'rkey', 'offset' => '36', 'type' => '945' } }, 'Size' => '32', 'Type' => 'Struct' }, '70' => { 'Name' => 'int', 'Size' => '4', 'Type' => 'Intrinsic' }, '7025' => { 'Header' => undef, 'Line' => '1176', 'Memb' => { '0' => { 'name' => 'ah', 'offset' => '0', 'type' => '7129' }, '1' => { 'name' => 'remote_qpn', 'offset' => '8', 'type' => '945' }, '2' => { 'name' => 'remote_qkey', 'offset' => '18', 'type' => '945' } }, 'Size' => '16', 'Type' => 'Struct' }, '7075' => { 'Header' => undef, 'Line' => '1695', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '1699' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '5233' }, '2' => { 'name' => 'handle', 'offset' => '22', 'type' => '945' } }, 'Name' => 'struct ibv_ah', 'Size' => '24', 'Type' => 'Struct' }, '7129' => { 'BaseType' => '7075', 'Name' => 'struct ibv_ah*', 'Size' => '8', 'Type' => 'Pointer' }, '7134' => { 'Header' => undef, 'Line' => '1165', 'Memb' => { '0' => { 'name' => 'rdma', 
'offset' => '0', 'type' => '6921' }, '1' => { 'name' => 'atomic', 'offset' => '0', 'type' => '6959' }, '2' => { 'name' => 'ud', 'offset' => '0', 'type' => '7025' } }, 'Size' => '32', 'Type' => 'Union' }, '7177' => { 'Header' => undef, 'Line' => '1183', 'Memb' => { '0' => { 'name' => 'remote_srqn', 'offset' => '0', 'type' => '945' } }, 'Size' => '4', 'Type' => 'Struct' }, '7201' => { 'Header' => undef, 'Line' => '1182', 'Memb' => { '0' => { 'name' => 'xrc', 'offset' => '0', 'type' => '7177' } }, 'Size' => '4', 'Type' => 'Union' }, '7221' => { 'Header' => undef, 'Line' => '1188', 'Memb' => { '0' => { 'name' => 'mw', 'offset' => '0', 'type' => '7271' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '945' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '4865' } }, 'Size' => '48', 'Type' => 'Struct' }, '7271' => { 'BaseType' => '5266', 'Name' => 'struct ibv_mw*', 'Size' => '8', 'Type' => 'Pointer' }, '7276' => { 'Header' => undef, 'Line' => '1193', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '243' }, '1' => { 'name' => 'hdr_sz', 'offset' => '8', 'type' => '933' }, '2' => { 'name' => 'mss', 'offset' => '16', 'type' => '933' } }, 'Size' => '16', 'Type' => 'Struct' }, '7326' => { 'Header' => undef, 'Line' => '1187', 'Memb' => { '0' => { 'name' => 'bind_mw', 'offset' => '0', 'type' => '7221' }, '1' => { 'name' => 'tso', 'offset' => '0', 'type' => '7276' } }, 'Size' => '48', 'Type' => 'Union' }, '7358' => { 'Header' => undef, 'Line' => '1151', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '957' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '7494' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '7499' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '70' }, '4' => { 'name' => 'opcode', 'offset' => '40', 'type' => '6680' }, '5' => { 'name' => 'send_flags', 'offset' => '50', 'type' => '135' }, '6' => { 'name' => 'unnamed0', 'offset' => '54', 'type' => '6888' }, '7' => { 'name' => 'wr', 'offset' => '64', 'type' => '7134' }, '8' => { 'name' => 'qp_type', 'offset' => '114', 'type' => '7201' }, '9' => { 'name' => 'unnamed1', 'offset' => '128', 'type' => '7326' } }, 'Name' => 'struct ibv_send_wr', 'Size' => '128', 'Type' => 'Struct' }, '7494' => { 'BaseType' => '7358', 'Name' => 'struct ibv_send_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '7499' => { 'BaseType' => '6827', 'Name' => 'struct ibv_sge*', 'Size' => '8', 'Type' => 'Pointer' }, '7504' => { 'Header' => undef, 'Line' => '1201', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '957' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '7574' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '7499' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '70' } }, 'Name' => 'struct ibv_recv_wr', 'Size' => '32', 'Type' => 'Struct' }, '7574' => { 'BaseType' => '7504', 'Name' => 'struct ibv_recv_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '7830' => { 'Header' => undef, 'Line' => '1237', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '957' }, '1' => { 'name' => 'send_flags', 'offset' => '8', 'type' => '135' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '4865' } }, 'Name' => 'struct ibv_mw_bind', 'Size' => '48', 'Type' => 'Struct' }, '7911' => { 'BaseType' => '7574', 'Name' => 'struct ibv_recv_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '7916' => { 'Name' => 'int(*)(struct ibv_wq*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '4345' }, '1' => { 'type' 
=> '7574' }, '2' => { 'type' => '7911' } }, 'Return' => '70', 'Size' => '8', 'Type' => 'FuncPtr' }, '82219' => { 'Header' => undef, 'Line' => '161', 'Memb' => { '0' => { 'name' => 'wqe_cnt', 'offset' => '0', 'type' => '945' }, '1' => { 'name' => 'wqe_shift', 'offset' => '4', 'type' => '70' }, '2' => { 'name' => 'offset', 'offset' => '8', 'type' => '70' } }, 'Size' => '12', 'Type' => 'Struct' }, '82268' => { 'Header' => undef, 'Line' => '166', 'Memb' => { '0' => { 'name' => 'wqe_cnt', 'offset' => '0', 'type' => '945' }, '1' => { 'name' => 'wqe_shift', 'offset' => '4', 'type' => '70' }, '2' => { 'name' => 'offset', 'offset' => '8', 'type' => '70' } }, 'Size' => '12', 'Type' => 'Struct' }, '82317' => { 'Header' => undef, 'Line' => '171', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '243' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '46' } }, 'Size' => '16', 'Type' => 'Struct' }, '82353' => { 'Header' => undef, 'Line' => '157', 'Memb' => { '0' => { 'name' => 'rdb', 'offset' => '0', 'type' => '13397' }, '1' => { 'name' => 'sdb', 'offset' => '8', 'type' => '13090' }, '2' => { 'name' => 'doorbell_qpn', 'offset' => '22', 'type' => '1005' }, '3' => { 'name' => 'sq', 'offset' => '32', 'type' => '82219' }, '4' => { 'name' => 'rq', 'offset' => '50', 'type' => '82268' }, '5' => { 'name' => 'buf', 'offset' => '72', 'type' => '82317' }, '6' => { 'name' => 'comp_mask', 'offset' => '100', 'type' => '957' }, '7' => { 'name' => 'uar_mmap_offset', 'offset' => '114', 'type' => '269' } }, 'Name' => 'struct mlx4dv_qp', 'Size' => '80', 'Type' => 'Struct' }, '82497' => { 'Header' => undef, 'Line' => '184', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '243' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '46' } }, 'Size' => '16', 'Type' => 'Struct' }, '82533' => { 'Header' => undef, 'Line' => '183', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '82497' }, '1' => { 'name' => 'cqe_cnt', 'offset' => '22', 'type' => '945' }, '2' => { 'name' => 'cqn', 'offset' => '32', 'type' => '945' }, '3' => { 'name' => 'set_ci_db', 'offset' => '36', 'type' => '13397' }, '4' => { 'name' => 'arm_db', 'offset' => '50', 'type' => '13397' }, '5' => { 'name' => 'arm_sn', 'offset' => '64', 'type' => '70' }, '6' => { 'name' => 'cqe_size', 'offset' => '68', 'type' => '70' }, '7' => { 'name' => 'comp_mask', 'offset' => '72', 'type' => '957' }, '8' => { 'name' => 'cq_uar', 'offset' => '86', 'type' => '243' } }, 'Name' => 'struct mlx4dv_cq', 'Size' => '64', 'Type' => 'Struct' }, '82664' => { 'Header' => undef, 'Line' => '199', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '243' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '46' } }, 'Size' => '16', 'Type' => 'Struct' }, '82700' => { 'Header' => undef, 'Line' => '198', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '82664' }, '1' => { 'name' => 'wqe_shift', 'offset' => '22', 'type' => '70' }, '2' => { 'name' => 'head', 'offset' => '32', 'type' => '70' }, '3' => { 'name' => 'tail', 'offset' => '36', 'type' => '70' }, '4' => { 'name' => 'db', 'offset' => '50', 'type' => '13397' }, '5' => { 'name' => 'comp_mask', 'offset' => '64', 'type' => '957' } }, 'Name' => 'struct mlx4dv_srq', 'Size' => '48', 'Type' => 'Struct' }, '82791' => { 'Header' => undef, 'Line' => '212', 'Memb' => { '0' => { 'name' => 'wqe_cnt', 'offset' => '0', 'type' => '945' }, '1' => { 'name' => 'wqe_shift', 'offset' => '4', 'type' => '70' }, '2' => { 'name' => 'offset', 'offset' => '8', 'type' => 
'70' } }, 'Size' => '12', 'Type' => 'Struct' }, '82840' => { 'Header' => undef, 'Line' => '217', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '243' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '46' } }, 'Size' => '16', 'Type' => 'Struct' }, '82876' => { 'Header' => undef, 'Line' => '210', 'Memb' => { '0' => { 'name' => 'rdb', 'offset' => '0', 'type' => '13397' }, '1' => { 'name' => 'rq', 'offset' => '8', 'type' => '82791' }, '2' => { 'name' => 'buf', 'offset' => '36', 'type' => '82840' }, '3' => { 'name' => 'comp_mask', 'offset' => '64', 'type' => '957' } }, 'Name' => 'struct mlx4dv_rwq', 'Size' => '48', 'Type' => 'Struct' }, '82941' => { 'Header' => undef, 'Line' => '225', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '4033' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '82976' } }, 'Size' => '16', 'Type' => 'Struct' }, '82976' => { 'BaseType' => '82353', 'Name' => 'struct mlx4dv_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '82981' => { 'Header' => undef, 'Line' => '229', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '3835' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '83016' } }, 'Size' => '16', 'Type' => 'Struct' }, '83016' => { 'BaseType' => '82533', 'Name' => 'struct mlx4dv_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '83021' => { 'Header' => undef, 'Line' => '233', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '4148' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '83056' } }, 'Size' => '16', 'Type' => 'Struct' }, '83056' => { 'BaseType' => '82700', 'Name' => 'struct mlx4dv_srq*', 'Size' => '8', 'Type' => 'Pointer' }, '83061' => { 'Header' => undef, 'Line' => '237', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '4345' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '83096' } }, 'Size' => '16', 'Type' => 'Struct' }, '83096' => { 'BaseType' => '82876', 'Name' => 'struct mlx4dv_rwq*', 'Size' => '8', 'Type' => 'Pointer' }, '83101' => { 'Header' => undef, 'Line' => '224', 'Memb' => { '0' => { 'name' => 'qp', 'offset' => '0', 'type' => '82941' }, '1' => { 'name' => 'cq', 'offset' => '22', 'type' => '82981' }, '2' => { 'name' => 'srq', 'offset' => '50', 'type' => '83021' }, '3' => { 'name' => 'rwq', 'offset' => '72', 'type' => '83061' } }, 'Name' => 'struct mlx4dv_obj', 'Size' => '64', 'Type' => 'Struct' }, '83225' => { 'Header' => undef, 'Line' => '437', 'Memb' => { '0' => { 'name' => 'version', 'offset' => '0', 'type' => '921' }, '1' => { 'name' => 'max_inl_recv_sz', 'offset' => '4', 'type' => '945' }, '2' => { 'name' => 'comp_mask', 'offset' => '8', 'type' => '957' } }, 'Name' => 'struct mlx4dv_context', 'Size' => '16', 'Type' => 'Struct' }, '83281' => { 'Header' => undef, 'Line' => '539', 'Memb' => { '0' => { 'name' => 'MLX4DV_SET_CTX_ATTR_LOG_WQS_RANGE_SZ', 'value' => '0' }, '1' => { 'name' => 'MLX4DV_SET_CTX_ATTR_BUF_ALLOCATORS', 'value' => '1' } }, 'Name' => 'enum mlx4dv_set_ctx_attr_type', 'Size' => '4', 'Type' => 'Enum' }, '87' => { 'Name' => 'long', 'Size' => '8', 'Type' => 'Intrinsic' }, '87607' => { 'BaseType' => '83225', 'Name' => 'struct mlx4dv_context*', 'Size' => '8', 'Type' => 'Pointer' }, '8795' => { 'Header' => undef, 'Line' => '1502', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '1699' }, '1' => { 'name' => 'fd', 'offset' => '8', 'type' => '70' }, '2' => { 'name' => 'refcnt', 'offset' => '18', 'type' => '70' } }, 'Name' => 'struct ibv_comp_channel', 'Size' => '16', 'Type' => 'Struct' }, '88056' => { 
'BaseType' => '83101', 'Name' => 'struct mlx4dv_obj*', 'Size' => '8', 'Type' => 'Pointer' }, '8849' => { 'BaseType' => '8795', 'Name' => 'struct ibv_comp_channel*', 'Size' => '8', 'Type' => 'Pointer' }, '921' => { 'BaseType' => '159', 'Header' => undef, 'Line' => '24', 'Name' => 'uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '933' => { 'BaseType' => '183', 'Header' => undef, 'Line' => '25', 'Name' => 'uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '945' => { 'BaseType' => '195', 'Header' => undef, 'Line' => '26', 'Name' => 'uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '957' => { 'BaseType' => '207', 'Header' => undef, 'Line' => '27', 'Name' => 'uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '981' => { 'BaseType' => '135', 'Header' => undef, 'Line' => '27', 'Name' => '__u32', 'Size' => '4', 'Type' => 'Typedef' }, '993' => { 'BaseType' => '390', 'Header' => undef, 'Line' => '31', 'Name' => '__u64', 'Size' => '8', 'Type' => 'Typedef' } }, 'UndefinedSymbols' => { 'libmlx4.so.1.0.56.0' => { '_ITM_deregisterTMCloneTable' => 0, '_ITM_registerTMCloneTable' => 0, '__cxa_finalize@GLIBC_2.2.5' => 0, '__errno_location@GLIBC_2.2.5' => 0, '__gmon_start__' => 0, '__printf_chk@GLIBC_2.3.4' => 0, '__snprintf_chk@GLIBC_2.3.4' => 0, '__stack_chk_fail@GLIBC_2.4' => 0, '_verbs_init_and_alloc_context@IBVERBS_PRIVATE_34' => 0, 'calloc@GLIBC_2.2.5' => 0, 'free@GLIBC_2.2.5' => 0, 'fwrite@GLIBC_2.2.5' => 0, 'ibv_cmd_alloc_mw@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_alloc_pd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_attach_mcast@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_close_xrcd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_cq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_cq_ex@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_flow@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_qp_ex2@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_qp_ex@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_rwq_ind_table@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_srq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_srq_ex@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_wq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_dealloc_mw@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_dealloc_pd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_dereg_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_cq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_flow@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_rwq_ind_table@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_srq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_wq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_detach_mcast@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_get_context@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_cq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_srq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_wq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_open_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_open_xrcd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_device_any@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_port@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_srq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_reg_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_rereg_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_resize_cq@IBVERBS_PRIVATE_34' => 0, 'ibv_dofork_range@IBVERBS_1.1' => 0, 'ibv_dontfork_range@IBVERBS_1.1' => 0, 'ibv_query_device@IBVERBS_1.1' => 0, 'ibv_query_gid@IBVERBS_1.1' => 0, 'ibv_query_port@IBVERBS_1.1' => 0, 'ibv_resolve_eth_l2_from_gid@IBVERBS_1.1' => 0, 'malloc@GLIBC_2.2.5' => 0, 'memcpy@GLIBC_2.14' => 0, 'memset@GLIBC_2.2.5' => 0, 'mmap@GLIBC_2.2.5' => 0, 'munmap@GLIBC_2.2.5' => 0, 'pthread_mutex_init@GLIBC_2.2.5' => 0, 
'pthread_mutex_lock@GLIBC_2.2.5' => 0, 'pthread_mutex_unlock@GLIBC_2.2.5' => 0, 'pthread_spin_init@GLIBC_2.34' => 0, 'pthread_spin_lock@GLIBC_2.34' => 0, 'pthread_spin_unlock@GLIBC_2.34' => 0, 'stderr@GLIBC_2.2.5' => 0, 'sysconf@GLIBC_2.2.5' => 0, 'verbs_register_driver_34@IBVERBS_PRIVATE_34' => 0, 'verbs_set_ops@IBVERBS_PRIVATE_34' => 0, 'verbs_uninit_context@IBVERBS_PRIVATE_34' => 0 } }, 'WordSize' => '8' }; rdma-core-56.1/ABI/mlx5.dump000066400000000000000000040614511477342711600155140ustar00rootroot00000000000000$VAR1 = { 'ABI_DUMPER_VERSION' => '1.2', 'ABI_DUMP_VERSION' => '3.5', 'Arch' => 'x86_64', 'GccVersion' => '12.3.0', 'Headers' => {}, 'Language' => 'C', 'LibraryName' => 'libmlx5.so.1.25.56.0', 'LibraryVersion' => 'mlx5', 'MissedOffsets' => '1', 'MissedRegs' => '1', 'NameSpaces' => {}, 'Needed' => { 'libc.so.6' => 1, 'libibverbs.so.1' => 1 }, 'Sources' => {}, 'SymbolInfo' => { '1184406' => { 'Header' => undef, 'Line' => '5863', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'addr', 'type' => '308' }, '2' => { 'name' => 'size', 'type' => '419' }, '3' => { 'name' => 'access', 'type' => '2001' } }, 'Return' => '29785', 'ShortName' => 'mlx5dv_devx_umem_reg' }, '1184609' => { 'Header' => undef, 'Line' => '5896', 'Param' => { '0' => { 'name' => 'dv_devx_umem', 'type' => '29785' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_umem_dereg' }, '1208576' => { 'Header' => undef, 'Line' => '6141', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'port_num', 'type' => '2001' }, '2' => { 'name' => 'info', 'type' => '31157' }, '3' => { 'name' => 'info_len', 'type' => '419' } }, 'Return' => '159', 'ShortName' => '_mlx5dv_query_port' }, '1362446' => { 'Header' => undef, 'Line' => '6361', 'Param' => { '0' => { 'name' => 'qp', 'type' => '5101' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_qp_query' }, '1362574' => { 'Header' => undef, 'Line' => '6409', 'Param' => { '0' => { 'name' => 'qp', 'type' => '5101' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_qp_modify' }, '1371020' => { 'Header' => undef, 'Line' => '2334', 'Param' => { '0' => { 'name' => 'device', 'type' => '11334' }, '1' => { 'name' => 'attr', 'type' => '1365741' } }, 'Return' => '2944', 'ShortName' => 'mlx5dv_open_device' }, '1371205' => { 'Header' => undef, 'Line' => '2328', 'Param' => { '0' => { 'name' => 'device', 'type' => '11334' } }, 'Return' => '2091', 'ShortName' => 'mlx5dv_is_supported' }, '1371695' => { 'Header' => undef, 'Line' => '2228', 'Param' => { '0' => { 'name' => 'ibv_ctx', 'type' => '2944' }, '1' => { 'name' => 'type', 'type' => '21136' }, '2' => { 'name' => 'attr', 'type' => '308' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_set_context_attr' }, '1372637' => { 'Header' => undef, 'Line' => '2142', 'Param' => { '0' => { 'name' => 'obj', 'type' => '30252' }, '1' => { 'name' => 'obj_type', 'type' => '2023' } }, 'Return' => '159', 'ShortName' => '__mlx5dv_init_obj_1_0' }, '1372695' => { 'Alias' => '__mlx5dv_init_obj_1_2', 'Header' => undef, 'Line' => '2123', 'Param' => { '0' => { 'name' => 'obj', 'type' => '30252' }, '1' => { 'name' => 'obj_type', 'type' => '2023' } }, 
'Return' => '159', 'ShortName' => 'mlx5dv_init_obj' }, '1373748' => { 'Header' => undef, 'Line' => '2066', 'Param' => { '0' => { 'name' => 'ctx', 'type' => '2944' }, '1' => { 'name' => 'qpn', 'type' => '2001' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_reserved_qpn_dealloc' }, '1374740' => { 'Header' => undef, 'Line' => '2013', 'Param' => { '0' => { 'name' => 'ctx', 'type' => '2944' }, '1' => { 'name' => 'qpn', 'type' => '14268' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_reserved_qpn_alloc' }, '1377038' => { 'Header' => undef, 'Line' => '1871', 'Param' => { '0' => { 'name' => 'qp', 'type' => '5101' }, '1' => { 'name' => 'requestor', 'type' => '31007' }, '2' => { 'name' => 'responder', 'type' => '31007' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_modify_qp_sched_elem' }, '1380231' => { 'Header' => undef, 'Line' => '1744', 'Param' => { '0' => { 'name' => 'leaf', 'type' => '30882' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_sched_leaf_destroy' }, '1380444' => { 'Header' => undef, 'Line' => '1722', 'Param' => { '0' => { 'name' => 'node', 'type' => '21852' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_sched_node_destroy' }, '1380657' => { 'Header' => undef, 'Line' => '1699', 'Param' => { '0' => { 'name' => 'leaf', 'type' => '30882' }, '1' => { 'name' => 'attr', 'type' => '30852' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_sched_leaf_modify' }, '1380834' => { 'Header' => undef, 'Line' => '1671', 'Param' => { '0' => { 'name' => 'node', 'type' => '21852' }, '1' => { 'name' => 'attr', 'type' => '30852' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_sched_node_modify' }, '1381011' => { 'Header' => undef, 'Line' => '1641', 'Param' => { '0' => { 'name' => 'ctx', 'type' => '2944' }, '1' => { 'name' => 'attr', 'type' => '30852' } }, 'Return' => '30882', 'ShortName' => 'mlx5dv_sched_leaf_create' }, '1381619' => { 'Header' => undef, 'Line' => '1590', 'Param' => { '0' => { 'name' => 'ctx', 'type' => '2944' }, '1' => { 'name' => 'attr', 'type' => '30852' } }, 'Return' => '21852', 'ShortName' => 'mlx5dv_sched_node_create' }, '1385006' => { 'Header' => undef, 'Line' => '1398', 'Param' => { '0' => { 'name' => 'qp', 'type' => '5101' }, '1' => { 'name' => 'stream_id', 'type' => '1989' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dci_stream_id_reset' }, '1385986' => { 'Header' => undef, 'Line' => '1388', 'Param' => { '0' => { 'name' => 'qp', 'type' => '5101' }, '1' => { 'name' => 'udp_sport', 'type' => '1989' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_modify_qp_udp_sport' }, '1386865' => { 'Header' => undef, 'Line' => '1349', 'Param' => { '0' => { 'name' => 'qp', 'type' => '5101' }, '1' => { 'name' => 'port_num', 'type' => '1977' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_modify_qp_lag_port' }, '1387991' => { 'Header' => undef, 'Line' => '1271', 'Param' => { '0' => { 'name' => 'qp', 'type' => '5101' }, '1' => { 'name' => 'port_num', 'type' => '7308' }, '2' => { 'name' => 'active_port_num', 'type' => '7308' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_query_qp_lag_port' }, '1390858' => { 'Header' => undef, 'Line' => '994', 'Param' => { '0' => { 'name' => 'ctx_in', 'type' => '2944' }, '1' => { 'name' => 'attrs_out', 'type' => '30742' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_query_device' }, '1463307' => { 'Header' => undef, 'Line' => '3566', 'Param' => { '0' => { 'name' => 'attr', 'type' => '1463960' } }, 'Return' => '1463955', 'ShortName' => 'mlx5dv_get_vfio_device_list' }, '1463975' => { 'Header' => undef, 'Line' => '3546', 'Param' => { '0' => { 'name' => 'ibctx', 'type' => 
'2944' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_vfio_process_events' }, '1467635' => { 'Header' => undef, 'Line' => '3539', 'Param' => { '0' => { 'name' => 'ibctx', 'type' => '2944' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_vfio_get_events_fd' }, '1586540' => { 'Header' => undef, 'Line' => '4240', 'Param' => { '0' => { 'name' => 'dv_qp', 'type' => '17851' }, '1' => { 'name' => 'wr_id', 'type' => '2023' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_qp_cancel_posted_send_wrs' }, '1773942' => { 'Header' => undef, 'Line' => '7979', 'Param' => { '0' => { 'name' => 'dveq', 'type' => '31272' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_destroy_eq' }, '1774071' => { 'Header' => undef, 'Line' => '7966', 'Param' => { '0' => { 'name' => 'ibctx', 'type' => '2944' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '31272', 'ShortName' => 'mlx5dv_devx_create_eq' }, '1774311' => { 'Header' => undef, 'Line' => '7953', 'Param' => { '0' => { 'name' => 'dvmsi', 'type' => '31207' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_free_msi_vector' }, '1774441' => { 'Header' => undef, 'Line' => '7941', 'Param' => { '0' => { 'name' => 'ibctx', 'type' => '2944' } }, 'Return' => '31207', 'ShortName' => 'mlx5dv_devx_alloc_msi_vector' }, '1774574' => { 'Header' => undef, 'Line' => '7929', 'Param' => { '0' => { 'name' => 'dv_pp', 'type' => '30206' } }, 'Return' => '1', 'ShortName' => 'mlx5dv_pp_free' }, '1775142' => { 'Header' => undef, 'Line' => '7896', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'pp_context_sz', 'type' => '419' }, '2' => { 'name' => 'pp_context', 'type' => '1961' }, '3' => { 'name' => 'flags', 'type' => '2001' } }, 'Return' => '30206', 'ShortName' => 'mlx5dv_pp_alloc' }, '1776791' => { 'Header' => undef, 'Line' => '7837', 'Param' => { '0' => { 'name' => 'dv_var', 'type' => '30150' } }, 'Return' => '1', 'ShortName' => 'mlx5dv_free_var' }, '1777359' => { 'Header' => undef, 'Line' => '7808', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'flags', 'type' => '2001' } }, 'Return' => '30150', 'ShortName' => 'mlx5dv_alloc_var' }, '1778868' => { 'Header' => undef, 'Line' => '7751', 'Param' => { '0' => { 'name' => 'dek', 'type' => '17582' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dek_destroy' }, '1779081' => { 'Header' => undef, 'Line' => '7727', 'Param' => { '0' => { 'name' => 'dek', 'type' => '17582' }, '1' => { 'name' => 'dek_attr', 'type' => '30100' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dek_query' }, '1779938' => { 'Header' => undef, 'Line' => '7674', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'init_attr', 'type' => '30070' } }, 'Return' => '17582', 'ShortName' => 'mlx5dv_dek_create' }, '1781493' => { 'Header' => undef, 'Line' => '7541', 'Param' => { '0' => { 'name' => 'crypto_login', 'type' => '18698' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_crypto_login_destroy' }, '1781706' => { 'Header' => undef, 'Line' => '7516', 'Param' => { '0' => { 'name' => 'crypto_login', 'type' => '18698' }, '1' => { 'name' => 'query_attr', 'type' => '30020' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_crypto_login_query' }, '1781883' => { 'Header' => undef, 'Line' => '7492', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'login_attr', 'type' => '29990' } }, 'Return' => '18698', 'ShortName' => 
'mlx5dv_crypto_login_create' }, '1782255' => { 'Header' => undef, 'Line' => '7452', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_crypto_logout' }, '1782562' => { 'Header' => undef, 'Line' => '7419', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'state', 'type' => '29940' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_crypto_login_query_state' }, '1782928' => { 'Header' => undef, 'Line' => '7382', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'login_attr', 'type' => '29910' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_crypto_login' }, '1784600' => { 'Header' => undef, 'Line' => '7196', 'Param' => { '0' => { 'name' => 'dv_mkey', 'type' => '17897' }, '1' => { 'name' => 'err_info', 'type' => '1784870' }, '2' => { 'name' => 'err_info_size', 'type' => '419' } }, 'Return' => '159', 'ShortName' => '_mlx5dv_mkey_check' }, '1784956' => { 'Header' => undef, 'Line' => '7135', 'Param' => { '0' => { 'name' => 'dv_mkey', 'type' => '17897' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_destroy_mkey' }, '1785478' => { 'Header' => undef, 'Line' => '7096', 'Param' => { '0' => { 'name' => 'mkey_init_attr', 'type' => '29860' } }, 'Return' => '17897', 'ShortName' => 'mlx5dv_create_mkey' }, '1789036' => { 'Header' => undef, 'Line' => '6971', 'Param' => { '0' => { 'name' => 'event_channel', 'type' => '29488' }, '1' => { 'name' => 'event_data', 'type' => '29699' }, '2' => { 'name' => 'event_resp_len', 'type' => '419' } }, 'Return' => '1915', 'ShortName' => 'mlx5dv_devx_get_event' }, '1789636' => { 'Header' => undef, 'Line' => '6857', 'Param' => { '0' => { 'name' => 'cmd_comp', 'type' => '29442' }, '1' => { 'name' => 'cmd_resp', 'type' => '29664' }, '2' => { 'name' => 'cmd_resp_len', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_get_async_cmd_comp' }, '1789992' => { 'Header' => undef, 'Line' => '6827', 'Param' => { '0' => { 'name' => 'obj', 'type' => '19309' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'outlen', 'type' => '419' }, '4' => { 'name' => 'wr_id', 'type' => '2023' }, '5' => { 'name' => 'cmd_comp', 'type' => '29442' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_obj_query_async' }, '1791551' => { 'Header' => undef, 'Line' => '6791', 'Param' => { '0' => { 'name' => 'dv_event_channel', 'type' => '29488' }, '1' => { 'name' => 'fd', 'type' => '159' }, '2' => { 'name' => 'obj', 'type' => '19309' }, '3' => { 'name' => 'event_num', 'type' => '1989' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_subscribe_devx_event_fd' }, '1792838' => { 'Header' => undef, 'Line' => '6749', 'Param' => { '0' => { 'name' => 'dv_event_channel', 'type' => '29488' }, '1' => { 'name' => 'obj', 'type' => '19309' }, '2' => { 'name' => 'events_sz', 'type' => '1989' }, '3' => { 'name' => 'events_num', 'type' => '29549' }, '4' => { 'name' => 'cookie', 'type' => '2023' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_subscribe_devx_event' }, '1794211' => { 'Header' => undef, 'Line' => '6710', 'Param' => { '0' => { 'name' => 'dv_event_channel', 'type' => '29488' } }, 'Return' => '1', 'ShortName' => 'mlx5dv_devx_destroy_event_channel' }, '1794437' => { 'Header' => undef, 'Line' => '6686', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'flags', 'type' => '15148' } }, 'Return' => '29488', 'ShortName' => 'mlx5dv_devx_create_event_channel' }, '1795434' => { 'Header' => undef, 
'Line' => '6642', 'Param' => { '0' => { 'name' => 'cmd_comp', 'type' => '29442' } }, 'Return' => '1', 'ShortName' => 'mlx5dv_devx_destroy_cmd_comp' }, '1795583' => { 'Header' => undef, 'Line' => '6623', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' } }, 'Return' => '29442', 'ShortName' => 'mlx5dv_devx_create_cmd_comp' }, '1796320' => { 'Header' => undef, 'Line' => '6577', 'Param' => { '0' => { 'name' => 'ind_tbl', 'type' => '7550' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_ind_tbl_modify' }, '1797576' => { 'Header' => undef, 'Line' => '6550', 'Param' => { '0' => { 'name' => 'ind_tbl', 'type' => '7550' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_ind_tbl_query' }, '1798832' => { 'Header' => undef, 'Line' => '6522', 'Param' => { '0' => { 'name' => 'wq', 'type' => '5416' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_wq_modify' }, '1800086' => { 'Header' => undef, 'Line' => '6496', 'Param' => { '0' => { 'name' => 'wq', 'type' => '5416' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_wq_query' }, '1801340' => { 'Header' => undef, 'Line' => '6470', 'Param' => { '0' => { 'name' => 'srq', 'type' => '5217' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_srq_modify' }, '1802596' => { 'Header' => undef, 'Line' => '6444', 'Param' => { '0' => { 'name' => 'srq', 'type' => '5217' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_srq_query' }, '1806810' => { 'Header' => undef, 'Line' => '6335', 'Param' => { '0' => { 'name' => 'cq', 'type' => '4901' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_cq_modify' }, '1808064' => { 'Header' => undef, 'Line' => '6309', 'Param' => { '0' => { 'name' => 'cq', 'type' => '4901' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_cq_query' }, '1809318' => { 'Header' => undef, 'Line' => '6283', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'vector', 'type' => '2001' }, '2' => { 'name' => 'eqn', 'type' => '14268' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_query_eqn' }, '1822367' => { 'Header' => undef, 'Line' => '5850', 'Param' 
=> { '0' => { 'name' => 'ctx', 'type' => '2944' }, '1' => { 'name' => 'umem_in', 'type' => '29815' } }, 'Return' => '29785', 'ShortName' => 'mlx5dv_devx_umem_reg_ex' }, '1826617' => { 'Header' => undef, 'Line' => '5655', 'Param' => { '0' => { 'name' => 'flow_matcher', 'type' => '30577' }, '1' => { 'name' => 'match_value', 'type' => '18970' }, '2' => { 'name' => 'num_actions', 'type' => '419' }, '3' => { 'name' => 'actions_attr', 'type' => '30647' } }, 'Return' => '13488', 'ShortName' => 'mlx5dv_create_flow' }, '1838558' => { 'Header' => undef, 'Line' => '5072', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'buf', 'type' => '346' }, '2' => { 'name' => 'buf_len', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_get_data_direct_sysfs_path' }, '1839230' => { 'Header' => undef, 'Line' => '5044', 'Param' => { '0' => { 'name' => 'pd', 'type' => '6313' }, '1' => { 'name' => 'offset', 'type' => '2023' }, '2' => { 'name' => 'length', 'type' => '419' }, '3' => { 'name' => 'iova', 'type' => '2023' }, '4' => { 'name' => 'fd', 'type' => '159' }, '5' => { 'name' => 'access', 'type' => '159' }, '6' => { 'name' => 'mlx5_access', 'type' => '159' } }, 'Return' => '6127', 'ShortName' => 'mlx5dv_reg_dmabuf_mr' }, '1840215' => { 'Header' => undef, 'Line' => '5005', 'Param' => { '0' => { 'name' => 'dm', 'type' => '2979' }, '1' => { 'name' => 'op', 'type' => '1977' } }, 'Return' => '308', 'ShortName' => 'mlx5dv_dm_map_op_addr' }, '1845967' => { 'Header' => undef, 'Line' => '4746', 'Param' => { '0' => { 'name' => 'ctx', 'type' => '2944' }, '1' => { 'name' => 'esp', 'type' => '13663' }, '2' => { 'name' => 'mlx5_attr', 'type' => '30472' } }, 'Return' => '13658', 'ShortName' => 'mlx5dv_create_flow_action_esp' }, '1850468' => { 'Header' => undef, 'Line' => '4427', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'attr', 'type' => '14013' }, '2' => { 'name' => 'mlx5_wq_attr', 'type' => '30377' } }, 'Return' => '5416', 'ShortName' => 'mlx5dv_create_wq' }, '1861530' => { 'Header' => undef, 'Line' => '3476', 'Param' => { '0' => { 'name' => 'qp', 'type' => '9472' } }, 'Return' => '17851', 'ShortName' => 'mlx5dv_qp_ex_from_ibv_qp_ex' }, '1861578' => { 'Header' => undef, 'Line' => '3462', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'qp_attr', 'type' => '14238' }, '2' => { 'name' => 'mlx5_qp_attr', 'type' => '30322' } }, 'Return' => '5101', 'ShortName' => 'mlx5dv_create_qp' }, '1862269' => { 'Header' => undef, 'Line' => '3429', 'Param' => { '0' => { 'name' => 'ah', 'type' => '8232' }, '1' => { 'name' => 'qp_num', 'type' => '2001' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_map_ah_to_qp' }, '1885002' => { 'Header' => undef, 'Line' => '1219', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'cq_attr', 'type' => '14073' }, '2' => { 'name' => 'mlx5_cq_attr', 'type' => '30287' } }, 'Return' => '10526', 'ShortName' => 'mlx5dv_create_cq' }, '294020' => { 'Header' => undef, 'Line' => '1631', 'Param' => { '0' => { 'name' => 'rule', 'type' => '290921' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dr_rule_destroy' }, '294109' => { 'Header' => undef, 'Line' => '222', 'Param' => { '0' => { 'name' => 'tbl', 'type' => '269706' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dr_table_destroy' }, '294132' => { 'Header' => undef, 'Line' => '1579', 'Param' => { '0' => { 'name' => 'matcher', 'type' => '290840' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dr_matcher_destroy' }, '294155' => 
{ 'Header' => undef, 'Line' => '1611', 'Param' => { '0' => { 'name' => 'matcher', 'type' => '290840' }, '1' => { 'name' => 'value', 'type' => '18970' }, '2' => { 'name' => 'num_actions', 'type' => '419' }, '3' => { 'name' => 'actions', 'type' => '269843' } }, 'Return' => '290921', 'ShortName' => 'mlx5dv_dr_rule_create' }, '294193' => { 'Header' => undef, 'Line' => '1466', 'Param' => { '0' => { 'name' => 'tbl', 'type' => '269706' }, '1' => { 'name' => 'priority', 'type' => '1989' }, '2' => { 'name' => 'match_criteria_enable', 'type' => '1977' }, '3' => { 'name' => 'mask', 'type' => '18970' } }, 'Return' => '290840', 'ShortName' => 'mlx5dv_dr_matcher_create' }, '294231' => { 'Header' => undef, 'Line' => '168', 'Param' => { '0' => { 'name' => 'dmn', 'type' => '284828' }, '1' => { 'name' => 'level', 'type' => '2001' } }, 'Return' => '269706', 'ShortName' => 'mlx5dv_dr_table_create' }, '294366' => { 'Header' => undef, 'Line' => '6091', 'Param' => { '0' => { 'name' => 'obj', 'type' => '19309' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_obj_destroy' }, '294663' => { 'Header' => undef, 'Line' => '4812', 'Param' => { '0' => { 'name' => 'ctx', 'type' => '2944' }, '1' => { 'name' => 'actions_sz', 'type' => '419' }, '2' => { 'name' => 'actions', 'type' => '13448' }, '3' => { 'name' => 'ft_type', 'type' => '14957' } }, 'Return' => '13658', 'ShortName' => 'mlx5dv_create_flow_action_modify_header' }, '294833' => { 'Header' => undef, 'Line' => '4880', 'Param' => { '0' => { 'name' => 'ctx', 'type' => '2944' }, '1' => { 'name' => 'data_sz', 'type' => '419' }, '2' => { 'name' => 'data', 'type' => '308' }, '3' => { 'name' => 'reformat_type', 'type' => '15005' }, '4' => { 'name' => 'ft_type', 'type' => '14957' } }, 'Return' => '13658', 'ShortName' => 'mlx5dv_create_flow_action_packet_reformat' }, '295009' => { 'Header' => undef, 'Line' => '5749', 'Param' => { '0' => { 'name' => 'sa', 'type' => '30682' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_destroy_steering_anchor' }, '295051' => { 'Header' => undef, 'Line' => '5736', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'attr', 'type' => '30687' } }, 'Return' => '30682', 'ShortName' => 'mlx5dv_create_steering_anchor' }, '295291' => { 'Header' => undef, 'Line' => '3085', 'Param' => { '0' => { 'name' => 'action', 'type' => '269848' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dr_action_destroy' }, '295902' => { 'Header' => undef, 'Line' => '3053', 'Param' => { '0' => { 'name' => 'dmn', 'type' => '284828' }, '1' => { 'name' => 'num_dest', 'type' => '419' }, '2' => { 'name' => 'dests', 'type' => '297349' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_dest_array' }, '297999' => { 'Header' => undef, 'Line' => '2843', 'Param' => { '0' => { 'name' => 'attr', 'type' => '300076' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_flow_sampler' }, '301852' => { 'Header' => undef, 'Line' => '2443', 'Param' => { '0' => { 'name' => 'dmn', 'type' => '284828' }, '1' => { 'name' => 'ib_port', 'type' => '2001' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_dest_ib_port' }, '302232' => { 'Header' => undef, 'Line' => '2407', 'Param' => { '0' => { 'name' => 'dmn', 'type' => '284828' }, '1' => { 'name' => 'vport', 'type' => '2001' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_dest_vport' }, '302612' => { 'Header' => undef, 'Line' => '2360', 'Param' => { '0' => { 'name' => 'attr', 'type' => '294450' } }, 'Return' => '269848', 'ShortName' => 
'mlx5dv_dr_action_create_flow_meter' }, '303149' => { 'Header' => undef, 'Line' => '2343', 'Param' => { '0' => { 'name' => 'action', 'type' => '269848' }, '1' => { 'name' => 'attr', 'type' => '294450' }, '2' => { 'name' => 'modify_field_select', 'type' => '2215' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dr_action_modify_flow_meter' }, '303283' => { 'Header' => undef, 'Line' => '2280', 'Param' => { '0' => { 'name' => 'dmn', 'type' => '284828' }, '1' => { 'name' => 'flags', 'type' => '2001' }, '2' => { 'name' => 'actions_sz', 'type' => '419' }, '3' => { 'name' => 'actions', 'type' => '294537' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_modify_header' }, '307906' => { 'Header' => undef, 'Line' => '1773', 'Param' => { '0' => { 'name' => 'dmn', 'type' => '284828' }, '1' => { 'name' => 'vlan_hdr', 'type' => '2203' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_push_vlan' }, '308302' => { 'Header' => undef, 'Line' => '1768', 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_pop_vlan' }, '308540' => { 'Header' => undef, 'Line' => '1703', 'Param' => { '0' => { 'name' => 'dmn', 'type' => '284828' }, '1' => { 'name' => 'flags', 'type' => '2001' }, '2' => { 'name' => 'reformat_type', 'type' => '15005' }, '3' => { 'name' => 'data_sz', 'type' => '419' }, '4' => { 'name' => 'data', 'type' => '308' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_packet_reformat' }, '310729' => { 'Header' => undef, 'Line' => '1494', 'Param' => { '0' => { 'name' => 'tag_value', 'type' => '2001' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_tag' }, '311007' => { 'Header' => undef, 'Line' => '1475', 'Param' => { '0' => { 'name' => 'action', 'type' => '269848' }, '1' => { 'name' => 'offset', 'type' => '2001' }, '2' => { 'name' => 'flags', 'type' => '2001' }, '3' => { 'name' => 'return_reg_c', 'type' => '1977' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dr_action_modify_aso' }, '311424' => { 'Header' => undef, 'Line' => '1384', 'Param' => { '0' => { 'name' => 'dmn', 'type' => '284828' }, '1' => { 'name' => 'devx_obj', 'type' => '19309' }, '2' => { 'name' => 'offset', 'type' => '2001' }, '3' => { 'name' => 'flags', 'type' => '2001' }, '4' => { 'name' => 'return_reg_c', 'type' => '1977' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_aso' }, '312710' => { 'Header' => undef, 'Line' => '1251', 'Param' => { '0' => { 'name' => 'devx_obj', 'type' => '19309' }, '1' => { 'name' => 'offset', 'type' => '2001' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_flow_counter' }, '313025' => { 'Header' => undef, 'Line' => '1198', 'Param' => { '0' => { 'name' => 'tbl', 'type' => '269706' }, '1' => { 'name' => 'priority', 'type' => '1989' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_dest_root_table' }, '313914' => { 'Header' => undef, 'Line' => '1119', 'Param' => { '0' => { 'name' => 'tbl', 'type' => '269706' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_dest_table' }, '314276' => { 'Header' => undef, 'Line' => '1101', 'Param' => { '0' => { 'name' => 'devx_obj', 'type' => '19309' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_dest_devx_tir' }, '314571' => { 'Header' => undef, 'Line' => '1081', 'Param' => { '0' => { 'name' => 'ibqp', 'type' => '5101' } }, 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_dest_ibv_qp' }, '314866' => { 'Header' => undef, 'Line' => '1075', 'Return' => '269848', 'ShortName' => 
'mlx5dv_dr_action_create_default_miss' }, '315104' => { 'Header' => undef, 'Line' => '1070', 'Return' => '269848', 'ShortName' => 'mlx5dv_dr_action_create_drop' }, '354718' => { 'Header' => undef, 'Line' => '878', 'Param' => { '0' => { 'name' => 'fout', 'type' => '1927' }, '1' => { 'name' => 'rule', 'type' => '290921' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dump_dr_rule' }, '355357' => { 'Header' => undef, 'Line' => '853', 'Param' => { '0' => { 'name' => 'fout', 'type' => '1927' }, '1' => { 'name' => 'matcher', 'type' => '290840' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dump_dr_matcher' }, '355972' => { 'Header' => undef, 'Line' => '832', 'Param' => { '0' => { 'name' => 'fout', 'type' => '1927' }, '1' => { 'name' => 'tbl', 'type' => '269706' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dump_dr_table' }, '356563' => { 'Header' => undef, 'Line' => '814', 'Param' => { '0' => { 'name' => 'fout', 'type' => '1927' }, '1' => { 'name' => 'dmn', 'type' => '284828' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dump_dr_domain' }, '427350' => { 'Header' => undef, 'Line' => '6063', 'Param' => { '0' => { 'name' => 'obj', 'type' => '19309' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_obj_modify' }, '427416' => { 'Header' => undef, 'Line' => '6037', 'Param' => { '0' => { 'name' => 'obj', 'type' => '19309' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_obj_query' }, '427459' => { 'Header' => undef, 'Line' => '6008', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '19309', 'ShortName' => 'mlx5dv_devx_obj_create' }, '427594' => { 'Header' => undef, 'Line' => '6115', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'in', 'type' => '1961' }, '2' => { 'name' => 'inlen', 'type' => '419' }, '3' => { 'name' => 'out', 'type' => '308' }, '4' => { 'name' => 'outlen', 'type' => '419' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_devx_general_cmd' }, '495978' => { 'Header' => undef, 'Line' => '5254', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'dm_attr', 'type' => '13608' }, '2' => { 'name' => 'mlx5_dm_attr', 'type' => '30412' } }, 'Return' => '2979', 'ShortName' => 'mlx5dv_alloc_dm' }, '543555' => { 'Header' => undef, 'Line' => '5498', 'Param' => { '0' => { 'name' => 'flow_matcher', 'type' => '30577' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_destroy_flow_matcher' }, '545569' => { 'Header' => undef, 'Line' => '5467', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'attr', 'type' => '30582' } }, 'Return' => '30577', 'ShortName' => 'mlx5dv_create_flow_matcher' }, '549924' => { 'Header' => undef, 'Line' => '1416', 'Param' => { '0' => { 'name' => 'matcher', 'type' => '290840' }, '1' => { 'name' => 'matcher_layout', 'type' => '550203' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dr_matcher_set_layout' }, '602218' => { 'Header' => undef, 'Line' => '6257', 'Param' => { '0' => { 'name' => 'dv_devx_uar', 'type' => 
'29729' } }, 'Return' => '1', 'ShortName' => 'mlx5dv_devx_free_uar' }, '602391' => { 'Header' => undef, 'Line' => '6234', 'Param' => { '0' => { 'name' => 'context', 'type' => '2944' }, '1' => { 'name' => 'flags', 'type' => '2001' } }, 'Return' => '29729', 'ShortName' => 'mlx5dv_devx_alloc_uar' }, '602965' => { 'Header' => undef, 'Line' => '646', 'Param' => { '0' => { 'name' => 'dmn', 'type' => '284828' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dr_domain_destroy' }, '603722' => { 'Header' => undef, 'Line' => '635', 'Param' => { '0' => { 'name' => 'dmn', 'type' => '284828' }, '1' => { 'name' => 'allow', 'type' => '2091' } }, 'Return' => '1', 'ShortName' => 'mlx5dv_dr_domain_allow_duplicate_rules' }, '604136' => { 'Header' => undef, 'Line' => '624', 'Param' => { '0' => { 'name' => 'dmn', 'type' => '284828' }, '1' => { 'name' => 'enable', 'type' => '2091' } }, 'Return' => '1', 'ShortName' => 'mlx5dv_dr_domain_set_reclaim_device_memory' }, '604550' => { 'Header' => undef, 'Line' => '1993', 'Param' => { '0' => { 'name' => 'dmn', 'type' => '284828' }, '1' => { 'name' => 'flags', 'type' => '2001' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dr_domain_sync' }, '604736' => { 'Header' => undef, 'Line' => '500', 'Param' => { '0' => { 'name' => 'ctx', 'type' => '2944' }, '1' => { 'name' => 'type', 'type' => '269449' } }, 'Return' => '284828', 'ShortName' => 'mlx5dv_dr_domain_create' }, '719631' => { 'Header' => undef, 'Line' => '1797', 'Param' => { '0' => { 'name' => 'devx_obj', 'type' => '19309' }, '1' => { 'name' => 'dmn', 'type' => '284828' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dr_aso_other_domain_unlink' }, '719760' => { 'Header' => undef, 'Line' => '1776', 'Param' => { '0' => { 'name' => 'devx_obj', 'type' => '19309' }, '1' => { 'name' => 'peer_dmn', 'type' => '284828' }, '2' => { 'name' => 'dmn', 'type' => '284828' }, '3' => { 'name' => 'flags', 'type' => '2001' }, '4' => { 'name' => 'return_reg_c', 'type' => '1977' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_dr_aso_other_domain_link' }, '76668' => { 'Header' => undef, 'Line' => '2278', 'Param' => { '0' => { 'name' => 'ctx_in', 'type' => '2944' }, '1' => { 'name' => 'clock_info', 'type' => '31117' } }, 'Return' => '159', 'ShortName' => 'mlx5dv_get_clock_info' } }, 'SymbolVersion' => { '__mlx5dv_init_obj_1_0' => 'mlx5dv_init_obj@MLX5_1.0', '__mlx5dv_init_obj_1_2' => 'mlx5dv_init_obj@@MLX5_1.2', '_mlx5dv_mkey_check' => '_mlx5dv_mkey_check@@MLX5_1.20', '_mlx5dv_query_port' => '_mlx5dv_query_port@@MLX5_1.19', 'mlx5dv_alloc_dm' => 'mlx5dv_alloc_dm@@MLX5_1.10', 'mlx5dv_alloc_var' => 'mlx5dv_alloc_var@@MLX5_1.12', 'mlx5dv_create_cq' => 'mlx5dv_create_cq@@MLX5_1.1', 'mlx5dv_create_flow' => 'mlx5dv_create_flow@@MLX5_1.6', 'mlx5dv_create_flow_action_esp' => 'mlx5dv_create_flow_action_esp@@MLX5_1.5', 'mlx5dv_create_flow_action_modify_header' => 'mlx5dv_create_flow_action_modify_header@@MLX5_1.7', 'mlx5dv_create_flow_action_packet_reformat' => 'mlx5dv_create_flow_action_packet_reformat@@MLX5_1.7', 'mlx5dv_create_flow_matcher' => 'mlx5dv_create_flow_matcher@@MLX5_1.6', 'mlx5dv_create_mkey' => 'mlx5dv_create_mkey@@MLX5_1.10', 'mlx5dv_create_qp' => 'mlx5dv_create_qp@@MLX5_1.3', 'mlx5dv_create_steering_anchor' => 'mlx5dv_create_steering_anchor@@MLX5_1.24', 'mlx5dv_create_wq' => 'mlx5dv_create_wq@@MLX5_1.3', 'mlx5dv_crypto_login' => 'mlx5dv_crypto_login@@MLX5_1.21', 'mlx5dv_crypto_login_create' => 'mlx5dv_crypto_login_create@@MLX5_1.24', 'mlx5dv_crypto_login_destroy' => 'mlx5dv_crypto_login_destroy@@MLX5_1.24', 'mlx5dv_crypto_login_query' => 
'mlx5dv_crypto_login_query@@MLX5_1.24', 'mlx5dv_crypto_login_query_state' => 'mlx5dv_crypto_login_query_state@@MLX5_1.21', 'mlx5dv_crypto_logout' => 'mlx5dv_crypto_logout@@MLX5_1.21', 'mlx5dv_dci_stream_id_reset' => 'mlx5dv_dci_stream_id_reset@@MLX5_1.21', 'mlx5dv_dek_create' => 'mlx5dv_dek_create@@MLX5_1.21', 'mlx5dv_dek_destroy' => 'mlx5dv_dek_destroy@@MLX5_1.21', 'mlx5dv_dek_query' => 'mlx5dv_dek_query@@MLX5_1.21', 'mlx5dv_destroy_flow_matcher' => 'mlx5dv_destroy_flow_matcher@@MLX5_1.6', 'mlx5dv_destroy_mkey' => 'mlx5dv_destroy_mkey@@MLX5_1.10', 'mlx5dv_destroy_steering_anchor' => 'mlx5dv_destroy_steering_anchor@@MLX5_1.24', 'mlx5dv_devx_alloc_msi_vector' => 'mlx5dv_devx_alloc_msi_vector@@MLX5_1.23', 'mlx5dv_devx_alloc_uar' => 'mlx5dv_devx_alloc_uar@@MLX5_1.7', 'mlx5dv_devx_cq_modify' => 'mlx5dv_devx_cq_modify@@MLX5_1.8', 'mlx5dv_devx_cq_query' => 'mlx5dv_devx_cq_query@@MLX5_1.8', 'mlx5dv_devx_create_cmd_comp' => 'mlx5dv_devx_create_cmd_comp@@MLX5_1.9', 'mlx5dv_devx_create_eq' => 'mlx5dv_devx_create_eq@@MLX5_1.23', 'mlx5dv_devx_create_event_channel' => 'mlx5dv_devx_create_event_channel@@MLX5_1.11', 'mlx5dv_devx_destroy_cmd_comp' => 'mlx5dv_devx_destroy_cmd_comp@@MLX5_1.9', 'mlx5dv_devx_destroy_eq' => 'mlx5dv_devx_destroy_eq@@MLX5_1.23', 'mlx5dv_devx_destroy_event_channel' => 'mlx5dv_devx_destroy_event_channel@@MLX5_1.11', 'mlx5dv_devx_free_msi_vector' => 'mlx5dv_devx_free_msi_vector@@MLX5_1.23', 'mlx5dv_devx_free_uar' => 'mlx5dv_devx_free_uar@@MLX5_1.7', 'mlx5dv_devx_general_cmd' => 'mlx5dv_devx_general_cmd@@MLX5_1.7', 'mlx5dv_devx_get_async_cmd_comp' => 'mlx5dv_devx_get_async_cmd_comp@@MLX5_1.9', 'mlx5dv_devx_get_event' => 'mlx5dv_devx_get_event@@MLX5_1.11', 'mlx5dv_devx_ind_tbl_modify' => 'mlx5dv_devx_ind_tbl_modify@@MLX5_1.8', 'mlx5dv_devx_ind_tbl_query' => 'mlx5dv_devx_ind_tbl_query@@MLX5_1.8', 'mlx5dv_devx_obj_create' => 'mlx5dv_devx_obj_create@@MLX5_1.7', 'mlx5dv_devx_obj_destroy' => 'mlx5dv_devx_obj_destroy@@MLX5_1.7', 'mlx5dv_devx_obj_modify' => 'mlx5dv_devx_obj_modify@@MLX5_1.7', 'mlx5dv_devx_obj_query' => 'mlx5dv_devx_obj_query@@MLX5_1.7', 'mlx5dv_devx_obj_query_async' => 'mlx5dv_devx_obj_query_async@@MLX5_1.9', 'mlx5dv_devx_qp_modify' => 'mlx5dv_devx_qp_modify@@MLX5_1.8', 'mlx5dv_devx_qp_query' => 'mlx5dv_devx_qp_query@@MLX5_1.8', 'mlx5dv_devx_query_eqn' => 'mlx5dv_devx_query_eqn@@MLX5_1.7', 'mlx5dv_devx_srq_modify' => 'mlx5dv_devx_srq_modify@@MLX5_1.8', 'mlx5dv_devx_srq_query' => 'mlx5dv_devx_srq_query@@MLX5_1.8', 'mlx5dv_devx_subscribe_devx_event' => 'mlx5dv_devx_subscribe_devx_event@@MLX5_1.11', 'mlx5dv_devx_subscribe_devx_event_fd' => 'mlx5dv_devx_subscribe_devx_event_fd@@MLX5_1.11', 'mlx5dv_devx_umem_dereg' => 'mlx5dv_devx_umem_dereg@@MLX5_1.7', 'mlx5dv_devx_umem_reg' => 'mlx5dv_devx_umem_reg@@MLX5_1.7', 'mlx5dv_devx_umem_reg_ex' => 'mlx5dv_devx_umem_reg_ex@@MLX5_1.19', 'mlx5dv_devx_wq_modify' => 'mlx5dv_devx_wq_modify@@MLX5_1.8', 'mlx5dv_devx_wq_query' => 'mlx5dv_devx_wq_query@@MLX5_1.8', 'mlx5dv_dm_map_op_addr' => 'mlx5dv_dm_map_op_addr@@MLX5_1.19', 'mlx5dv_dr_action_create_aso' => 'mlx5dv_dr_action_create_aso@@MLX5_1.17', 'mlx5dv_dr_action_create_default_miss' => 'mlx5dv_dr_action_create_default_miss@@MLX5_1.14', 'mlx5dv_dr_action_create_dest_array' => 'mlx5dv_dr_action_create_dest_array@@MLX5_1.16', 'mlx5dv_dr_action_create_dest_devx_tir' => 'mlx5dv_dr_action_create_dest_devx_tir@@MLX5_1.15', 'mlx5dv_dr_action_create_dest_ib_port' => 'mlx5dv_dr_action_create_dest_ib_port@@MLX5_1.21', 'mlx5dv_dr_action_create_dest_ibv_qp' => 
'mlx5dv_dr_action_create_dest_ibv_qp@@MLX5_1.10', 'mlx5dv_dr_action_create_dest_root_table' => 'mlx5dv_dr_action_create_dest_root_table@@MLX5_1.24', 'mlx5dv_dr_action_create_dest_table' => 'mlx5dv_dr_action_create_dest_table@@MLX5_1.10', 'mlx5dv_dr_action_create_dest_vport' => 'mlx5dv_dr_action_create_dest_vport@@MLX5_1.10', 'mlx5dv_dr_action_create_drop' => 'mlx5dv_dr_action_create_drop@@MLX5_1.10', 'mlx5dv_dr_action_create_flow_counter' => 'mlx5dv_dr_action_create_flow_counter@@MLX5_1.10', 'mlx5dv_dr_action_create_flow_meter' => 'mlx5dv_dr_action_create_flow_meter@@MLX5_1.12', 'mlx5dv_dr_action_create_flow_sampler' => 'mlx5dv_dr_action_create_flow_sampler@@MLX5_1.16', 'mlx5dv_dr_action_create_modify_header' => 'mlx5dv_dr_action_create_modify_header@@MLX5_1.10', 'mlx5dv_dr_action_create_packet_reformat' => 'mlx5dv_dr_action_create_packet_reformat@@MLX5_1.10', 'mlx5dv_dr_action_create_pop_vlan' => 'mlx5dv_dr_action_create_pop_vlan@@MLX5_1.17', 'mlx5dv_dr_action_create_push_vlan' => 'mlx5dv_dr_action_create_push_vlan@@MLX5_1.17', 'mlx5dv_dr_action_create_tag' => 'mlx5dv_dr_action_create_tag@@MLX5_1.10', 'mlx5dv_dr_action_destroy' => 'mlx5dv_dr_action_destroy@@MLX5_1.10', 'mlx5dv_dr_action_modify_aso' => 'mlx5dv_dr_action_modify_aso@@MLX5_1.17', 'mlx5dv_dr_action_modify_flow_meter' => 'mlx5dv_dr_action_modify_flow_meter@@MLX5_1.12', 'mlx5dv_dr_aso_other_domain_link' => 'mlx5dv_dr_aso_other_domain_link@@MLX5_1.22', 'mlx5dv_dr_aso_other_domain_unlink' => 'mlx5dv_dr_aso_other_domain_unlink@@MLX5_1.22', 'mlx5dv_dr_domain_allow_duplicate_rules' => 'mlx5dv_dr_domain_allow_duplicate_rules@@MLX5_1.20', 'mlx5dv_dr_domain_create' => 'mlx5dv_dr_domain_create@@MLX5_1.10', 'mlx5dv_dr_domain_destroy' => 'mlx5dv_dr_domain_destroy@@MLX5_1.10', 'mlx5dv_dr_domain_set_reclaim_device_memory' => 'mlx5dv_dr_domain_set_reclaim_device_memory@@MLX5_1.14', 'mlx5dv_dr_domain_sync' => 'mlx5dv_dr_domain_sync@@MLX5_1.10', 'mlx5dv_dr_matcher_create' => 'mlx5dv_dr_matcher_create@@MLX5_1.10', 'mlx5dv_dr_matcher_destroy' => 'mlx5dv_dr_matcher_destroy@@MLX5_1.10', 'mlx5dv_dr_matcher_set_layout' => 'mlx5dv_dr_matcher_set_layout@@MLX5_1.21', 'mlx5dv_dr_rule_create' => 'mlx5dv_dr_rule_create@@MLX5_1.10', 'mlx5dv_dr_rule_destroy' => 'mlx5dv_dr_rule_destroy@@MLX5_1.10', 'mlx5dv_dr_table_create' => 'mlx5dv_dr_table_create@@MLX5_1.10', 'mlx5dv_dr_table_destroy' => 'mlx5dv_dr_table_destroy@@MLX5_1.10', 'mlx5dv_dump_dr_domain' => 'mlx5dv_dump_dr_domain@@MLX5_1.12', 'mlx5dv_dump_dr_matcher' => 'mlx5dv_dump_dr_matcher@@MLX5_1.12', 'mlx5dv_dump_dr_rule' => 'mlx5dv_dump_dr_rule@@MLX5_1.12', 'mlx5dv_dump_dr_table' => 'mlx5dv_dump_dr_table@@MLX5_1.12', 'mlx5dv_free_var' => 'mlx5dv_free_var@@MLX5_1.12', 'mlx5dv_get_clock_info' => 'mlx5dv_get_clock_info@@MLX5_1.4', 'mlx5dv_get_data_direct_sysfs_path' => 'mlx5dv_get_data_direct_sysfs_path@@MLX5_1.25', 'mlx5dv_get_vfio_device_list' => 'mlx5dv_get_vfio_device_list@@MLX5_1.21', 'mlx5dv_is_supported' => 'mlx5dv_is_supported@@MLX5_1.8', 'mlx5dv_map_ah_to_qp' => 'mlx5dv_map_ah_to_qp@@MLX5_1.20', 'mlx5dv_modify_qp_lag_port' => 'mlx5dv_modify_qp_lag_port@@MLX5_1.14', 'mlx5dv_modify_qp_sched_elem' => 'mlx5dv_modify_qp_sched_elem@@MLX5_1.17', 'mlx5dv_modify_qp_udp_sport' => 'mlx5dv_modify_qp_udp_sport@@MLX5_1.17', 'mlx5dv_open_device' => 'mlx5dv_open_device@@MLX5_1.7', 'mlx5dv_pp_alloc' => 'mlx5dv_pp_alloc@@MLX5_1.13', 'mlx5dv_pp_free' => 'mlx5dv_pp_free@@MLX5_1.13', 'mlx5dv_qp_cancel_posted_send_wrs' => 'mlx5dv_qp_cancel_posted_send_wrs@@MLX5_1.20', 'mlx5dv_qp_ex_from_ibv_qp_ex' => 
'mlx5dv_qp_ex_from_ibv_qp_ex@@MLX5_1.10', 'mlx5dv_query_device' => 'mlx5dv_query_device@@MLX5_1.0', 'mlx5dv_query_qp_lag_port' => 'mlx5dv_query_qp_lag_port@@MLX5_1.14', 'mlx5dv_reg_dmabuf_mr' => 'mlx5dv_reg_dmabuf_mr@@MLX5_1.25', 'mlx5dv_reserved_qpn_alloc' => 'mlx5dv_reserved_qpn_alloc@@MLX5_1.18', 'mlx5dv_reserved_qpn_dealloc' => 'mlx5dv_reserved_qpn_dealloc@@MLX5_1.18', 'mlx5dv_sched_leaf_create' => 'mlx5dv_sched_leaf_create@@MLX5_1.17', 'mlx5dv_sched_leaf_destroy' => 'mlx5dv_sched_leaf_destroy@@MLX5_1.17', 'mlx5dv_sched_leaf_modify' => 'mlx5dv_sched_leaf_modify@@MLX5_1.17', 'mlx5dv_sched_node_create' => 'mlx5dv_sched_node_create@@MLX5_1.17', 'mlx5dv_sched_node_destroy' => 'mlx5dv_sched_node_destroy@@MLX5_1.17', 'mlx5dv_sched_node_modify' => 'mlx5dv_sched_node_modify@@MLX5_1.17', 'mlx5dv_set_context_attr' => 'mlx5dv_set_context_attr@@MLX5_1.2', 'mlx5dv_vfio_get_events_fd' => 'mlx5dv_vfio_get_events_fd@@MLX5_1.21', 'mlx5dv_vfio_process_events' => 'mlx5dv_vfio_process_events@@MLX5_1.21' }, 'Symbols' => { 'libmlx5.so.1.25.56.0' => { '_mlx5dv_mkey_check@@MLX5_1.20' => 1, '_mlx5dv_query_port@@MLX5_1.19' => 1, 'mlx5dv_alloc_dm@@MLX5_1.10' => 1, 'mlx5dv_alloc_var@@MLX5_1.12' => 1, 'mlx5dv_create_cq@@MLX5_1.1' => 1, 'mlx5dv_create_flow@@MLX5_1.6' => 1, 'mlx5dv_create_flow_action_esp@@MLX5_1.5' => 1, 'mlx5dv_create_flow_action_modify_header@@MLX5_1.7' => 1, 'mlx5dv_create_flow_action_packet_reformat@@MLX5_1.7' => 1, 'mlx5dv_create_flow_matcher@@MLX5_1.6' => 1, 'mlx5dv_create_mkey@@MLX5_1.10' => 1, 'mlx5dv_create_qp@@MLX5_1.3' => 1, 'mlx5dv_create_steering_anchor@@MLX5_1.24' => 1, 'mlx5dv_create_wq@@MLX5_1.3' => 1, 'mlx5dv_crypto_login@@MLX5_1.21' => 1, 'mlx5dv_crypto_login_create@@MLX5_1.24' => 1, 'mlx5dv_crypto_login_destroy@@MLX5_1.24' => 1, 'mlx5dv_crypto_login_query@@MLX5_1.24' => 1, 'mlx5dv_crypto_login_query_state@@MLX5_1.21' => 1, 'mlx5dv_crypto_logout@@MLX5_1.21' => 1, 'mlx5dv_dci_stream_id_reset@@MLX5_1.21' => 1, 'mlx5dv_dek_create@@MLX5_1.21' => 1, 'mlx5dv_dek_destroy@@MLX5_1.21' => 1, 'mlx5dv_dek_query@@MLX5_1.21' => 1, 'mlx5dv_destroy_flow_matcher@@MLX5_1.6' => 1, 'mlx5dv_destroy_mkey@@MLX5_1.10' => 1, 'mlx5dv_destroy_steering_anchor@@MLX5_1.24' => 1, 'mlx5dv_devx_alloc_msi_vector@@MLX5_1.23' => 1, 'mlx5dv_devx_alloc_uar@@MLX5_1.7' => 1, 'mlx5dv_devx_cq_modify@@MLX5_1.8' => 1, 'mlx5dv_devx_cq_query@@MLX5_1.8' => 1, 'mlx5dv_devx_create_cmd_comp@@MLX5_1.9' => 1, 'mlx5dv_devx_create_eq@@MLX5_1.23' => 1, 'mlx5dv_devx_create_event_channel@@MLX5_1.11' => 1, 'mlx5dv_devx_destroy_cmd_comp@@MLX5_1.9' => 1, 'mlx5dv_devx_destroy_eq@@MLX5_1.23' => 1, 'mlx5dv_devx_destroy_event_channel@@MLX5_1.11' => 1, 'mlx5dv_devx_free_msi_vector@@MLX5_1.23' => 1, 'mlx5dv_devx_free_uar@@MLX5_1.7' => 1, 'mlx5dv_devx_general_cmd@@MLX5_1.7' => 1, 'mlx5dv_devx_get_async_cmd_comp@@MLX5_1.9' => 1, 'mlx5dv_devx_get_event@@MLX5_1.11' => 1, 'mlx5dv_devx_ind_tbl_modify@@MLX5_1.8' => 1, 'mlx5dv_devx_ind_tbl_query@@MLX5_1.8' => 1, 'mlx5dv_devx_obj_create@@MLX5_1.7' => 1, 'mlx5dv_devx_obj_destroy@@MLX5_1.7' => 1, 'mlx5dv_devx_obj_modify@@MLX5_1.7' => 1, 'mlx5dv_devx_obj_query@@MLX5_1.7' => 1, 'mlx5dv_devx_obj_query_async@@MLX5_1.9' => 1, 'mlx5dv_devx_qp_modify@@MLX5_1.8' => 1, 'mlx5dv_devx_qp_query@@MLX5_1.8' => 1, 'mlx5dv_devx_query_eqn@@MLX5_1.7' => 1, 'mlx5dv_devx_srq_modify@@MLX5_1.8' => 1, 'mlx5dv_devx_srq_query@@MLX5_1.8' => 1, 'mlx5dv_devx_subscribe_devx_event@@MLX5_1.11' => 1, 'mlx5dv_devx_subscribe_devx_event_fd@@MLX5_1.11' => 1, 'mlx5dv_devx_umem_dereg@@MLX5_1.7' => 1, 'mlx5dv_devx_umem_reg@@MLX5_1.7' => 1, 
'mlx5dv_devx_umem_reg_ex@@MLX5_1.19' => 1, 'mlx5dv_devx_wq_modify@@MLX5_1.8' => 1, 'mlx5dv_devx_wq_query@@MLX5_1.8' => 1, 'mlx5dv_dm_map_op_addr@@MLX5_1.19' => 1, 'mlx5dv_dr_action_create_aso@@MLX5_1.17' => 1, 'mlx5dv_dr_action_create_default_miss@@MLX5_1.14' => 1, 'mlx5dv_dr_action_create_dest_array@@MLX5_1.16' => 1, 'mlx5dv_dr_action_create_dest_devx_tir@@MLX5_1.15' => 1, 'mlx5dv_dr_action_create_dest_ib_port@@MLX5_1.21' => 1, 'mlx5dv_dr_action_create_dest_ibv_qp@@MLX5_1.10' => 1, 'mlx5dv_dr_action_create_dest_root_table@@MLX5_1.24' => 1, 'mlx5dv_dr_action_create_dest_table@@MLX5_1.10' => 1, 'mlx5dv_dr_action_create_dest_vport@@MLX5_1.10' => 1, 'mlx5dv_dr_action_create_drop@@MLX5_1.10' => 1, 'mlx5dv_dr_action_create_flow_counter@@MLX5_1.10' => 1, 'mlx5dv_dr_action_create_flow_meter@@MLX5_1.12' => 1, 'mlx5dv_dr_action_create_flow_sampler@@MLX5_1.16' => 1, 'mlx5dv_dr_action_create_modify_header@@MLX5_1.10' => 1, 'mlx5dv_dr_action_create_packet_reformat@@MLX5_1.10' => 1, 'mlx5dv_dr_action_create_pop_vlan@@MLX5_1.17' => 1, 'mlx5dv_dr_action_create_push_vlan@@MLX5_1.17' => 1, 'mlx5dv_dr_action_create_tag@@MLX5_1.10' => 1, 'mlx5dv_dr_action_destroy@@MLX5_1.10' => 1, 'mlx5dv_dr_action_modify_aso@@MLX5_1.17' => 1, 'mlx5dv_dr_action_modify_flow_meter@@MLX5_1.12' => 1, 'mlx5dv_dr_aso_other_domain_link@@MLX5_1.22' => 1, 'mlx5dv_dr_aso_other_domain_unlink@@MLX5_1.22' => 1, 'mlx5dv_dr_domain_allow_duplicate_rules@@MLX5_1.20' => 1, 'mlx5dv_dr_domain_create@@MLX5_1.10' => 1, 'mlx5dv_dr_domain_destroy@@MLX5_1.10' => 1, 'mlx5dv_dr_domain_set_reclaim_device_memory@@MLX5_1.14' => 1, 'mlx5dv_dr_domain_sync@@MLX5_1.10' => 1, 'mlx5dv_dr_matcher_create@@MLX5_1.10' => 1, 'mlx5dv_dr_matcher_destroy@@MLX5_1.10' => 1, 'mlx5dv_dr_matcher_set_layout@@MLX5_1.21' => 1, 'mlx5dv_dr_rule_create@@MLX5_1.10' => 1, 'mlx5dv_dr_rule_destroy@@MLX5_1.10' => 1, 'mlx5dv_dr_table_create@@MLX5_1.10' => 1, 'mlx5dv_dr_table_destroy@@MLX5_1.10' => 1, 'mlx5dv_dump_dr_domain@@MLX5_1.12' => 1, 'mlx5dv_dump_dr_matcher@@MLX5_1.12' => 1, 'mlx5dv_dump_dr_rule@@MLX5_1.12' => 1, 'mlx5dv_dump_dr_table@@MLX5_1.12' => 1, 'mlx5dv_free_var@@MLX5_1.12' => 1, 'mlx5dv_get_clock_info@@MLX5_1.4' => 1, 'mlx5dv_get_data_direct_sysfs_path@@MLX5_1.25' => 1, 'mlx5dv_get_vfio_device_list@@MLX5_1.21' => 1, 'mlx5dv_init_obj@@MLX5_1.2' => 1, 'mlx5dv_init_obj@MLX5_1.0' => 1, 'mlx5dv_is_supported@@MLX5_1.8' => 1, 'mlx5dv_map_ah_to_qp@@MLX5_1.20' => 1, 'mlx5dv_modify_qp_lag_port@@MLX5_1.14' => 1, 'mlx5dv_modify_qp_sched_elem@@MLX5_1.17' => 1, 'mlx5dv_modify_qp_udp_sport@@MLX5_1.17' => 1, 'mlx5dv_open_device@@MLX5_1.7' => 1, 'mlx5dv_pp_alloc@@MLX5_1.13' => 1, 'mlx5dv_pp_free@@MLX5_1.13' => 1, 'mlx5dv_qp_cancel_posted_send_wrs@@MLX5_1.20' => 1, 'mlx5dv_qp_ex_from_ibv_qp_ex@@MLX5_1.10' => 1, 'mlx5dv_query_device@@MLX5_1.0' => 1, 'mlx5dv_query_qp_lag_port@@MLX5_1.14' => 1, 'mlx5dv_reg_dmabuf_mr@@MLX5_1.25' => 1, 'mlx5dv_reserved_qpn_alloc@@MLX5_1.18' => 1, 'mlx5dv_reserved_qpn_dealloc@@MLX5_1.18' => 1, 'mlx5dv_sched_leaf_create@@MLX5_1.17' => 1, 'mlx5dv_sched_leaf_destroy@@MLX5_1.17' => 1, 'mlx5dv_sched_leaf_modify@@MLX5_1.17' => 1, 'mlx5dv_sched_node_create@@MLX5_1.17' => 1, 'mlx5dv_sched_node_destroy@@MLX5_1.17' => 1, 'mlx5dv_sched_node_modify@@MLX5_1.17' => 1, 'mlx5dv_set_context_attr@@MLX5_1.2' => 1, 'mlx5dv_vfio_get_events_fd@@MLX5_1.21' => 1, 'mlx5dv_vfio_process_events@@MLX5_1.21' => 1 } }, 'Target' => 'unix', 'TypeInfo' => { '1' => { 'Name' => 'void', 'Type' => 'Intrinsic' }, '10025' => { 'Header' => undef, 'Line' => '1525', 'Memb' => { '0' => { 'name' => 
'tag', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'priv', 'offset' => '8', 'type' => '2001' } }, 'Name' => 'struct ibv_wc_tm_info', 'Size' => '16', 'Type' => 'Struct' }, '10067' => { 'Header' => undef, 'Line' => '1530', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'channel', 'offset' => '8', 'type' => '9992' }, '10' => { 'name' => 'status', 'offset' => '306', 'type' => '5421' }, '11' => { 'name' => 'wr_id', 'offset' => '310', 'type' => '2023' }, '12' => { 'name' => 'start_poll', 'offset' => '324', 'type' => '10536' }, '13' => { 'name' => 'next_poll', 'offset' => '338', 'type' => '10556' }, '14' => { 'name' => 'end_poll', 'offset' => '352', 'type' => '10572' }, '15' => { 'name' => 'read_opcode', 'offset' => '360', 'type' => '10592' }, '16' => { 'name' => 'read_vendor_err', 'offset' => '374', 'type' => '10612' }, '17' => { 'name' => 'read_byte_len', 'offset' => '388', 'type' => '10612' }, '18' => { 'name' => 'read_imm_data', 'offset' => '402', 'type' => '10632' }, '19' => { 'name' => 'read_qp_num', 'offset' => '512', 'type' => '10612' }, '2' => { 'name' => 'cq_context', 'offset' => '22', 'type' => '308' }, '20' => { 'name' => 'read_src_qp', 'offset' => '520', 'type' => '10612' }, '21' => { 'name' => 'read_wc_flags', 'offset' => '534', 'type' => '10652' }, '22' => { 'name' => 'read_slid', 'offset' => '548', 'type' => '10612' }, '23' => { 'name' => 'read_sl', 'offset' => '562', 'type' => '10672' }, '24' => { 'name' => 'read_dlid_path_bits', 'offset' => '576', 'type' => '10672' }, '25' => { 'name' => 'read_completion_ts', 'offset' => '584', 'type' => '10692' }, '26' => { 'name' => 'read_cvlan', 'offset' => '598', 'type' => '10712' }, '27' => { 'name' => 'read_flow_tag', 'offset' => '612', 'type' => '10612' }, '28' => { 'name' => 'read_tm_info', 'offset' => '626', 'type' => '10738' }, '29' => { 'name' => 'read_completion_wallclock_ns', 'offset' => '640', 'type' => '10692' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '2001' }, '4' => { 'name' => 'cqe', 'offset' => '40', 'type' => '159' }, '5' => { 'name' => 'mutex', 'offset' => '50', 'type' => '893' }, '6' => { 'name' => 'cond', 'offset' => '114', 'type' => '966' }, '7' => { 'name' => 'comp_events_completed', 'offset' => '288', 'type' => '2001' }, '8' => { 'name' => 'async_events_completed', 'offset' => '292', 'type' => '2001' }, '9' => { 'name' => 'comp_mask', 'offset' => '296', 'type' => '2001' } }, 'Name' => 'struct ibv_cq_ex', 'Size' => '288', 'Type' => 'Struct' }, '10526' => { 'BaseType' => '10067', 'Name' => 'struct ibv_cq_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '10531' => { 'BaseType' => '9997', 'Name' => 'struct ibv_poll_cq_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '10536' => { 'Name' => 'int(*)(struct ibv_cq_ex*, struct ibv_poll_cq_attr*)', 'Param' => { '0' => { 'type' => '10526' }, '1' => { 'type' => '10531' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '10556' => { 'Name' => 'int(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '10526' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '10572' => { 'Name' => 'void(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '10526' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '10592' => { 'Name' => 'enum ibv_wc_opcode(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '10526' } }, 'Return' => '5582', 'Size' => '8', 'Type' => 'FuncPtr' }, '10612' => { 'Name' => 'uint32_t(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '10526' } }, 'Return' => 
'2001', 'Size' => '8', 'Type' => 'FuncPtr' }, '10632' => { 'Name' => '__be32(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '10526' } }, 'Return' => '2203', 'Size' => '8', 'Type' => 'FuncPtr' }, '10652' => { 'Name' => 'unsigned int(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '10526' } }, 'Return' => '70', 'Size' => '8', 'Type' => 'FuncPtr' }, '10672' => { 'Name' => 'uint8_t(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '10526' } }, 'Return' => '1977', 'Size' => '8', 'Type' => 'FuncPtr' }, '10692' => { 'Name' => 'uint64_t(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '10526' } }, 'Return' => '2023', 'Size' => '8', 'Type' => 'FuncPtr' }, '10712' => { 'Name' => 'uint16_t(*)(struct ibv_cq_ex*)', 'Param' => { '0' => { 'type' => '10526' } }, 'Return' => '1989', 'Size' => '8', 'Type' => 'FuncPtr' }, '10733' => { 'BaseType' => '10025', 'Name' => 'struct ibv_wc_tm_info*', 'Size' => '8', 'Type' => 'Pointer' }, '10738' => { 'Name' => 'void(*)(struct ibv_cq_ex*, struct ibv_wc_tm_info*)', 'Param' => { '0' => { 'type' => '10526' }, '1' => { 'type' => '10733' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '10827' => { 'Header' => undef, 'Line' => '1707', 'Memb' => { '0' => { 'name' => 'IBV_FLOW_ATTR_NORMAL', 'value' => '0' }, '1' => { 'name' => 'IBV_FLOW_ATTR_ALL_DEFAULT', 'value' => '1' }, '2' => { 'name' => 'IBV_FLOW_ATTR_MC_DEFAULT', 'value' => '2' }, '3' => { 'name' => 'IBV_FLOW_ATTR_SNIFFER', 'value' => '3' } }, 'Name' => 'enum ibv_flow_attr_type', 'Size' => '4', 'Type' => 'Enum' }, '10868' => { 'BaseType' => '1977', 'Name' => 'uint8_t[6]', 'Size' => '6', 'Type' => 'Array' }, '10884' => { 'Header' => undef, 'Line' => '1940', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' } }, 'Name' => 'struct ibv_flow_action', 'Size' => '8', 'Type' => 'Struct' }, '10912' => { 'Header' => undef, 'Line' => '2105', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' } }, 'Name' => 'struct ibv_counters', 'Size' => '8', 'Type' => 'Struct' }, '10940' => { 'BaseType' => '10912', 'Name' => 'struct ibv_counters*', 'Size' => '8', 'Type' => 'Pointer' }, '11057' => { 'Header' => undef, 'Line' => '1934', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'context', 'offset' => '8', 'type' => '2944' }, '2' => { 'name' => 'handle', 'offset' => '22', 'type' => '2001' } }, 'Name' => 'struct ibv_flow', 'Size' => '24', 'Type' => 'Struct' }, '111' => { 'Name' => 'signed char', 'Size' => '1', 'Type' => 'Intrinsic' }, '11113' => { 'Header' => undef, 'Line' => '1948', 'Memb' => { '0' => { 'name' => 'esp_attr', 'offset' => '0', 'type' => '11267' }, '1' => { 'name' => 'keymat_proto', 'offset' => '8', 'type' => '2243' }, '2' => { 'name' => 'keymat_len', 'offset' => '18', 'type' => '1989' }, '3' => { 'name' => 'keymat_ptr', 'offset' => '22', 'type' => '308' }, '4' => { 'name' => 'replay_proto', 'offset' => '36', 'type' => '2267' }, '5' => { 'name' => 'replay_len', 'offset' => '40', 'type' => '1989' }, '6' => { 'name' => 'replay_ptr', 'offset' => '50', 'type' => '308' }, '7' => { 'name' => 'esp_encap', 'offset' => '64', 'type' => '2406' }, '8' => { 'name' => 'comp_mask', 'offset' => '72', 'type' => '2001' }, '9' => { 'name' => 'esn', 'offset' => '82', 'type' => '2001' } }, 'Name' => 'struct ibv_flow_action_esp_attr', 'Size' => '56', 'Type' => 'Struct' }, '11267' => { 'BaseType' => '2411', 'Name' => 'struct ib_uverbs_flow_action_esp*', 'Size' => '8', 'Type' => 'Pointer' }, '11272' => { 
'Header' => undef, 'Line' => '1969', 'Memb' => { '0' => { 'name' => '_dummy1', 'offset' => '0', 'type' => '11453' }, '1' => { 'name' => '_dummy2', 'offset' => '8', 'type' => '11469' } }, 'Name' => 'struct _ibv_device_ops', 'Size' => '16', 'Type' => 'Struct' }, '11334' => { 'BaseType' => '11339', 'Name' => 'struct ibv_device*', 'Size' => '8', 'Type' => 'Pointer' }, '11339' => { 'Header' => undef, 'Line' => '1979', 'Memb' => { '0' => { 'name' => '_ops', 'offset' => '0', 'type' => '11272' }, '1' => { 'name' => 'node_type', 'offset' => '22', 'type' => '2540' }, '2' => { 'name' => 'transport_type', 'offset' => '32', 'type' => '2605' }, '3' => { 'name' => 'name', 'offset' => '36', 'type' => '3558' }, '4' => { 'name' => 'dev_name', 'offset' => '136', 'type' => '3558' }, '5' => { 'name' => 'dev_path', 'offset' => '338', 'type' => '11474' }, '6' => { 'name' => 'ibdev_path', 'offset' => '1032', 'type' => '11474' } }, 'Name' => 'struct ibv_device', 'Size' => '664', 'Type' => 'Struct' }, '11453' => { 'Name' => 'struct ibv_context*(*)(struct ibv_device*, int)', 'Param' => { '0' => { 'type' => '11334' }, '1' => { 'type' => '159' } }, 'Return' => '2944', 'Size' => '8', 'Type' => 'FuncPtr' }, '11469' => { 'Name' => 'void(*)(struct ibv_context*)', 'Param' => { '0' => { 'type' => '2944' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '11474' => { 'BaseType' => '356', 'Name' => 'char[256]', 'Size' => '256', 'Type' => 'Array' }, '11490' => { 'Header' => undef, 'Line' => '1994', 'Memb' => { '0' => { 'name' => '_compat_query_device', 'offset' => '0', 'type' => '11978' }, '1' => { 'name' => '_compat_query_port', 'offset' => '8', 'type' => '12018' }, '10' => { 'name' => '_compat_create_cq', 'offset' => '128', 'type' => '12028' }, '11' => { 'name' => 'poll_cq', 'offset' => '136', 'type' => '12143' }, '12' => { 'name' => 'req_notify_cq', 'offset' => '150', 'type' => '12168' }, '13' => { 'name' => '_compat_cq_event', 'offset' => '260', 'type' => '12028' }, '14' => { 'name' => '_compat_resize_cq', 'offset' => '274', 'type' => '12028' }, '15' => { 'name' => '_compat_destroy_cq', 'offset' => '288', 'type' => '12028' }, '16' => { 'name' => '_compat_create_srq', 'offset' => '296', 'type' => '12028' }, '17' => { 'name' => '_compat_modify_srq', 'offset' => '310', 'type' => '12028' }, '18' => { 'name' => '_compat_query_srq', 'offset' => '324', 'type' => '12028' }, '19' => { 'name' => '_compat_destroy_srq', 'offset' => '338', 'type' => '12028' }, '2' => { 'name' => '_compat_alloc_pd', 'offset' => '22', 'type' => '12028' }, '20' => { 'name' => 'post_srq_recv', 'offset' => '352', 'type' => '12198' }, '21' => { 'name' => '_compat_create_qp', 'offset' => '360', 'type' => '12028' }, '22' => { 'name' => '_compat_query_qp', 'offset' => '374', 'type' => '12028' }, '23' => { 'name' => '_compat_modify_qp', 'offset' => '388', 'type' => '12028' }, '24' => { 'name' => '_compat_destroy_qp', 'offset' => '402', 'type' => '12028' }, '25' => { 'name' => 'post_send', 'offset' => '512', 'type' => '12233' }, '26' => { 'name' => 'post_recv', 'offset' => '520', 'type' => '12263' }, '27' => { 'name' => '_compat_create_ah', 'offset' => '534', 'type' => '12028' }, '28' => { 'name' => '_compat_destroy_ah', 'offset' => '548', 'type' => '12028' }, '29' => { 'name' => '_compat_attach_mcast', 'offset' => '562', 'type' => '12028' }, '3' => { 'name' => '_compat_dealloc_pd', 'offset' => '36', 'type' => '12028' }, '30' => { 'name' => '_compat_detach_mcast', 'offset' => '576', 'type' => '12028' }, '31' => { 'name' => '_compat_async_event', 'offset' 
=> '584', 'type' => '12028' }, '4' => { 'name' => '_compat_reg_mr', 'offset' => '50', 'type' => '12028' }, '5' => { 'name' => '_compat_rereg_mr', 'offset' => '64', 'type' => '12028' }, '6' => { 'name' => '_compat_dereg_mr', 'offset' => '72', 'type' => '12028' }, '7' => { 'name' => 'alloc_mw', 'offset' => '86', 'type' => '12053' }, '8' => { 'name' => 'bind_mw', 'offset' => '100', 'type' => '12088' }, '9' => { 'name' => 'dealloc_mw', 'offset' => '114', 'type' => '12108' } }, 'Name' => 'struct ibv_context_ops', 'Size' => '256', 'Type' => 'Struct' }, '11973' => { 'BaseType' => '3024', 'Name' => 'struct ibv_device_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '11978' => { 'Name' => 'int(*)(struct ibv_context*, struct ibv_device_attr*)', 'Param' => { '0' => { 'type' => '2944' }, '1' => { 'type' => '11973' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '12008' => { 'BaseType' => '12013', 'Name' => 'struct _compat_ibv_port_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '12013' => { 'Name' => 'struct _compat_ibv_port_attr', 'Type' => 'Struct' }, '12018' => { 'Name' => 'int(*)(struct ibv_context*, uint8_t, struct _compat_ibv_port_attr*)', 'Param' => { '0' => { 'type' => '2944' }, '1' => { 'type' => '1977' }, '2' => { 'type' => '12008' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '12028' => { 'Name' => 'void*(*)()', 'Return' => '308', 'Size' => '8', 'Type' => 'FuncPtr' }, '12053' => { 'Name' => 'struct ibv_mw*(*)(struct ibv_pd*, enum ibv_mw_type)', 'Param' => { '0' => { 'type' => '6313' }, '1' => { 'type' => '6318' } }, 'Return' => '8385', 'Size' => '8', 'Type' => 'FuncPtr' }, '12083' => { 'BaseType' => '8958', 'Name' => 'struct ibv_mw_bind*', 'Size' => '8', 'Type' => 'Pointer' }, '12088' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_mw*, struct ibv_mw_bind*)', 'Param' => { '0' => { 'type' => '5101' }, '1' => { 'type' => '8385' }, '2' => { 'type' => '12083' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '12108' => { 'Name' => 'int(*)(struct ibv_mw*)', 'Param' => { '0' => { 'type' => '8385' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '12138' => { 'BaseType' => '5755', 'Name' => 'struct ibv_wc*', 'Size' => '8', 'Type' => 'Pointer' }, '12143' => { 'Name' => 'int(*)(struct ibv_cq*, int, struct ibv_wc*)', 'Param' => { '0' => { 'type' => '4901' }, '1' => { 'type' => '159' }, '2' => { 'type' => '12138' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '12168' => { 'Name' => 'int(*)(struct ibv_cq*, int)', 'Param' => { '0' => { 'type' => '4901' }, '1' => { 'type' => '159' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '12198' => { 'Name' => 'int(*)(struct ibv_srq*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '5217' }, '1' => { 'type' => '8696' }, '2' => { 'type' => '9039' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '12228' => { 'BaseType' => '8616', 'Name' => 'struct ibv_send_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '12233' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_send_wr*, struct ibv_send_wr**)', 'Param' => { '0' => { 'type' => '5101' }, '1' => { 'type' => '8616' }, '2' => { 'type' => '12228' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '12263' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '5101' }, '1' => { 'type' => '8696' }, '2' => { 'type' => '9039' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '12268' => { 'Header' => undef, 'Line' => '2057', 
'Memb' => { '0' => { 'name' => 'cqe', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'cq_context', 'offset' => '8', 'type' => '308' }, '2' => { 'name' => 'channel', 'offset' => '22', 'type' => '9992' }, '3' => { 'name' => 'comp_vector', 'offset' => '36', 'type' => '2001' }, '4' => { 'name' => 'wc_flags', 'offset' => '50', 'type' => '2023' }, '5' => { 'name' => 'comp_mask', 'offset' => '64', 'type' => '2001' }, '6' => { 'name' => 'flags', 'offset' => '68', 'type' => '2001' }, '7' => { 'name' => 'parent_domain', 'offset' => '72', 'type' => '6313' } }, 'Name' => 'struct ibv_cq_init_attr_ex', 'Size' => '56', 'Type' => 'Struct' }, '123' => { 'BaseType' => '46', 'Header' => undef, 'Line' => '38', 'Name' => '__uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '12530' => { 'Name' => 'void*(*)(struct ibv_pd*, void*, size_t, size_t, uint64_t)', 'Param' => { '0' => { 'type' => '6313' }, '1' => { 'type' => '308' }, '2' => { 'type' => '419' }, '3' => { 'type' => '419' }, '4' => { 'type' => '2023' } }, 'Return' => '308', 'Size' => '8', 'Type' => 'FuncPtr' }, '12561' => { 'Name' => 'void(*)(struct ibv_pd*, void*, void*, uint64_t)', 'Param' => { '0' => { 'type' => '6313' }, '1' => { 'type' => '308' }, '2' => { 'type' => '308' }, '3' => { 'type' => '2023' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '1292243' => { 'Header' => undef, 'Line' => '9', 'Memb' => { '0' => { 'name' => 'DR_ARG_CHUNK_SIZE_1', 'value' => '0' }, '1' => { 'name' => 'DR_ARG_CHUNK_SIZE_MIN', 'value' => '0' }, '2' => { 'name' => 'DR_ARG_CHUNK_SIZE_2', 'value' => '1' }, '3' => { 'name' => 'DR_ARG_CHUNK_SIZE_3', 'value' => '2' }, '4' => { 'name' => 'DR_ARG_CHUNK_SIZE_4', 'value' => '3' }, '5' => { 'name' => 'DR_ARG_CHUNK_SIZE_MAX', 'value' => '4' } }, 'Name' => 'enum dr_arg_chunk_size', 'Size' => '4', 'Type' => 'Enum' }, '1292297' => { 'Header' => undef, 'Line' => '19', 'Memb' => { '0' => { 'name' => 'log_chunk_size', 'offset' => '0', 'type' => '1292243' }, '1' => { 'name' => 'dmn', 'offset' => '8', 'type' => '284828' }, '2' => { 'name' => 'free_list', 'offset' => '22', 'type' => '14403' }, '3' => { 'name' => 'mutex', 'offset' => '50', 'type' => '1259932' } }, 'Name' => 'struct dr_arg_pool', 'Size' => '72', 'Type' => 'Struct' }, '1292363' => { 'BaseType' => '1292379', 'Name' => 'struct dr_arg_pool*[4]', 'Size' => '32', 'Type' => 'Array' }, '1292379' => { 'BaseType' => '1292297', 'Name' => 'struct dr_arg_pool*', 'Size' => '8', 'Type' => 'Pointer' }, '1327410' => { 'Header' => undef, 'Line' => '1716', 'Memb' => { '0' => { 'name' => 'flags', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'comp_mask', 'offset' => '8', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_context_attr', 'Size' => '16', 'Type' => 'Struct' }, '13448' => { 'BaseType' => '2023', 'Name' => 'uint64_t*', 'Size' => '8', 'Type' => 'Pointer' }, '13488' => { 'BaseType' => '11057', 'Name' => 'struct ibv_flow*', 'Size' => '8', 'Type' => 'Pointer' }, '13608' => { 'BaseType' => '2694', 'Name' => 'struct ibv_alloc_dm_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '1365741' => { 'BaseType' => '1327410', 'Name' => 'struct mlx5dv_context_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '13658' => { 'BaseType' => '10884', 'Name' => 'struct ibv_flow_action*', 'Size' => '8', 'Type' => 'Pointer' }, '13663' => { 'BaseType' => '11113', 'Name' => 'struct ibv_flow_action_esp_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '14013' => { 'BaseType' => '6715', 'Name' => 'struct ibv_wq_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '14073' => { 'BaseType' => '12268', 
'Name' => 'struct ibv_cq_init_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '1408' => { 'Header' => undef, 'Line' => '49', 'Memb' => { '0' => { 'name' => '_flags', 'offset' => '0', 'type' => '159' }, '1' => { 'name' => '_IO_read_ptr', 'offset' => '8', 'type' => '346' }, '10' => { 'name' => '_IO_backup_base', 'offset' => '128', 'type' => '346' }, '11' => { 'name' => '_IO_save_end', 'offset' => '136', 'type' => '346' }, '12' => { 'name' => '_markers', 'offset' => '150', 'type' => '1824' }, '13' => { 'name' => '_chain', 'offset' => '260', 'type' => '1829' }, '14' => { 'name' => '_fileno', 'offset' => '274', 'type' => '159' }, '15' => { 'name' => '_flags2', 'offset' => '278', 'type' => '159' }, '16' => { 'name' => '_old_offset', 'offset' => '288', 'type' => '248' }, '17' => { 'name' => '_cur_column', 'offset' => '296', 'type' => '58' }, '18' => { 'name' => '_vtable_offset', 'offset' => '304', 'type' => '111' }, '19' => { 'name' => '_shortbuf', 'offset' => '305', 'type' => '1834' }, '2' => { 'name' => '_IO_read_end', 'offset' => '22', 'type' => '346' }, '20' => { 'name' => '_lock', 'offset' => '310', 'type' => '1850' }, '21' => { 'name' => '_offset', 'offset' => '324', 'type' => '260' }, '22' => { 'name' => '_codecvt', 'offset' => '338', 'type' => '1860' }, '23' => { 'name' => '_wide_data', 'offset' => '352', 'type' => '1870' }, '24' => { 'name' => '_freeres_list', 'offset' => '360', 'type' => '1829' }, '25' => { 'name' => '_freeres_buf', 'offset' => '374', 'type' => '308' }, '26' => { 'name' => '__pad5', 'offset' => '388', 'type' => '419' }, '27' => { 'name' => '_mode', 'offset' => '402', 'type' => '159' }, '28' => { 'name' => '_unused2', 'offset' => '406', 'type' => '1875' }, '3' => { 'name' => '_IO_read_base', 'offset' => '36', 'type' => '346' }, '4' => { 'name' => '_IO_write_base', 'offset' => '50', 'type' => '346' }, '5' => { 'name' => '_IO_write_ptr', 'offset' => '64', 'type' => '346' }, '6' => { 'name' => '_IO_write_end', 'offset' => '72', 'type' => '346' }, '7' => { 'name' => '_IO_buf_base', 'offset' => '86', 'type' => '346' }, '8' => { 'name' => '_IO_buf_end', 'offset' => '100', 'type' => '346' }, '9' => { 'name' => '_IO_save_base', 'offset' => '114', 'type' => '346' } }, 'Name' => 'struct _IO_FILE', 'Size' => '216', 'Type' => 'Struct' }, '14238' => { 'BaseType' => '7313', 'Name' => 'struct ibv_qp_init_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '1425752' => { 'Header' => undef, 'Line' => '1727', 'Memb' => { '0' => { 'name' => 'pci_name', 'offset' => '0', 'type' => '1967' }, '1' => { 'name' => 'flags', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'comp_mask', 'offset' => '22', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_vfio_context_attr', 'Size' => '24', 'Type' => 'Struct' }, '14268' => { 'BaseType' => '2001', 'Name' => 'uint32_t*', 'Size' => '8', 'Type' => 'Pointer' }, '14358' => { 'Header' => undef, 'Line' => '24', 'Memb' => { '0' => { 'name' => 'next', 'offset' => '0', 'type' => '14398' }, '1' => { 'name' => 'prev', 'offset' => '8', 'type' => '14398' } }, 'Name' => 'struct list_node', 'Size' => '16', 'Type' => 'Struct' }, '14398' => { 'BaseType' => '14358', 'Name' => 'struct list_node*', 'Size' => '8', 'Type' => 'Pointer' }, '14403' => { 'Header' => undef, 'Line' => '41', 'Memb' => { '0' => { 'name' => 'n', 'offset' => '0', 'type' => '14358' } }, 'Name' => 'struct list_head', 'Size' => '16', 'Type' => 'Struct' }, '1463955' => { 'BaseType' => '11334', 'Name' => 'struct ibv_device**', 'Size' => '8', 'Type' => 'Pointer' }, '1463960' => { 'BaseType' => '1425752', 
'Name' => 'struct mlx5dv_vfio_context_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '147' => { 'BaseType' => '58', 'Header' => undef, 'Line' => '40', 'Name' => '__uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '14957' => { 'Header' => undef, 'Line' => '42', 'Memb' => { '0' => { 'name' => 'MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_RX', 'value' => '0' }, '1' => { 'name' => 'MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_TX', 'value' => '1' }, '2' => { 'name' => 'MLX5_IB_UAPI_FLOW_TABLE_TYPE_FDB', 'value' => '2' }, '3' => { 'name' => 'MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_RX', 'value' => '3' }, '4' => { 'name' => 'MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TX', 'value' => '4' } }, 'Name' => 'enum mlx5_ib_uapi_flow_table_type', 'Size' => '4', 'Type' => 'Enum' }, '15005' => { 'Header' => undef, 'Line' => '50', 'Memb' => { '0' => { 'name' => 'MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2', 'value' => '0' }, '1' => { 'name' => 'MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL', 'value' => '1' }, '2' => { 'name' => 'MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L3_TUNNEL_TO_L2', 'value' => '2' }, '3' => { 'name' => 'MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL', 'value' => '3' } }, 'Name' => 'enum mlx5_ib_uapi_flow_action_packet_reformat_type', 'Size' => '4', 'Type' => 'Enum' }, '15047' => { 'Header' => undef, 'Line' => '61', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '2179' }, '1' => { 'name' => 'out_data', 'offset' => '8', 'type' => '15085' } }, 'Name' => 'struct mlx5_ib_uapi_devx_async_cmd_hdr', 'Size' => '8', 'Type' => 'Struct' }, '15085' => { 'BaseType' => '2143', 'Name' => '__u8[]', 'Size' => '8', 'Type' => 'Array' }, '15100' => { 'Header' => undef, 'Line' => '66', 'Memb' => { '0' => { 'name' => 'MLX5_IB_UAPI_DM_TYPE_MEMIC', 'value' => '0' }, '1' => { 'name' => 'MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM', 'value' => '1' }, '2' => { 'name' => 'MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM', 'value' => '2' }, '3' => { 'name' => 'MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_PATTERN_SW_ICM', 'value' => '3' }, '4' => { 'name' => 'MLX5_IB_UAPI_DM_TYPE_ENCAP_SW_ICM', 'value' => '4' } }, 'Name' => 'enum mlx5_ib_uapi_dm_type', 'Size' => '4', 'Type' => 'Enum' }, '15148' => { 'Header' => undef, 'Line' => '74', 'Memb' => { '0' => { 'name' => 'MLX5_IB_UAPI_DEVX_CR_EV_CH_FLAGS_OMIT_DATA', 'value' => '1' } }, 'Name' => 'enum mlx5_ib_uapi_devx_create_event_channel_flags', 'Size' => '4', 'Type' => 'Enum' }, '15172' => { 'Header' => undef, 'Line' => '78', 'Memb' => { '0' => { 'name' => 'cookie', 'offset' => '0', 'type' => '2179' }, '1' => { 'name' => 'out_data', 'offset' => '8', 'type' => '15085' } }, 'Name' => 'struct mlx5_ib_uapi_devx_async_event_hdr', 'Size' => '8', 'Type' => 'Struct' }, '15210' => { 'Header' => undef, 'Line' => '101', 'Memb' => { '0' => { 'name' => 'value', 'offset' => '0', 'type' => '2167' }, '1' => { 'name' => 'mask', 'offset' => '4', 'type' => '2167' } }, 'Name' => 'struct mlx5_ib_uapi_reg', 'Size' => '8', 'Type' => 'Struct' }, '15250' => { 'Header' => undef, 'Line' => '106', 'Memb' => { '0' => { 'name' => 'flags', 'offset' => '0', 'type' => '2179' }, '1' => { 'name' => 'vport', 'offset' => '8', 'type' => '2155' }, '2' => { 'name' => 'vport_vhca_id', 'offset' => '16', 'type' => '2155' }, '3' => { 'name' => 'esw_owner_vhca_id', 'offset' => '18', 'type' => '2155' }, '4' => { 'name' => 'rsvd0', 'offset' => '20', 'type' => '2155' }, '5' => { 'name' => 'vport_steering_icm_rx', 'offset' => '22', 'type' => '2179' }, '6' => { 'name' => 'vport_steering_icm_tx', 'offset' => 
'36', 'type' => '2179' }, '7' => { 'name' => 'reg_c0', 'offset' => '50', 'type' => '15210' } }, 'Name' => 'struct mlx5_ib_uapi_query_port', 'Size' => '40', 'Type' => 'Struct' }, '15684' => { 'Header' => undef, 'Line' => '94', 'Memb' => { '0' => { 'name' => 'max_num', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'supported_format', 'offset' => '4', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_cqe_comp_caps', 'Size' => '8', 'Type' => 'Struct' }, '15724' => { 'Header' => undef, 'Line' => '99', 'Memb' => { '0' => { 'name' => 'sw_parsing_offloads', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'supported_qpts', 'offset' => '4', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_sw_parsing_caps', 'Size' => '8', 'Type' => 'Struct' }, '15764' => { 'Header' => undef, 'Line' => '104', 'Memb' => { '0' => { 'name' => 'min_single_stride_log_num_of_bytes', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'max_single_stride_log_num_of_bytes', 'offset' => '4', 'type' => '2001' }, '2' => { 'name' => 'min_single_wqe_log_num_of_strides', 'offset' => '8', 'type' => '2001' }, '3' => { 'name' => 'max_single_wqe_log_num_of_strides', 'offset' => '18', 'type' => '2001' }, '4' => { 'name' => 'supported_qpts', 'offset' => '22', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_striding_rq_caps', 'Size' => '20', 'Type' => 'Struct' }, '15843' => { 'Header' => undef, 'Line' => '112', 'Memb' => { '0' => { 'name' => 'max_log_num_concurent', 'offset' => '0', 'type' => '1977' }, '1' => { 'name' => 'max_log_num_errored', 'offset' => '1', 'type' => '1977' } }, 'Name' => 'struct mlx5dv_dci_streams_caps', 'Size' => '2', 'Type' => 'Struct' }, '15883' => { 'Header' => undef, 'Line' => '133', 'Memb' => { '0' => { 'name' => 'MLX5DV_SIG_TYPE_T10DIF', 'value' => '0' }, '1' => { 'name' => 'MLX5DV_SIG_TYPE_CRC', 'value' => '1' } }, 'Name' => 'enum mlx5dv_sig_type', 'Size' => '4', 'Type' => 'Enum' }, '159' => { 'Name' => 'int', 'Size' => '4', 'Type' => 'Intrinsic' }, '15913' => { 'Header' => undef, 'Line' => '143', 'Memb' => { '0' => { 'name' => 'MLX5DV_SIG_T10DIF_CRC', 'value' => '0' }, '1' => { 'name' => 'MLX5DV_SIG_T10DIF_CSUM', 'value' => '1' } }, 'Name' => 'enum mlx5dv_sig_t10dif_bg_type', 'Size' => '4', 'Type' => 'Enum' }, '15943' => { 'Header' => undef, 'Line' => '153', 'Memb' => { '0' => { 'name' => 'MLX5DV_SIG_CRC_TYPE_CRC32', 'value' => '0' }, '1' => { 'name' => 'MLX5DV_SIG_CRC_TYPE_CRC32C', 'value' => '1' }, '2' => { 'name' => 'MLX5DV_SIG_CRC_TYPE_CRC64_XP10', 'value' => '2' } }, 'Name' => 'enum mlx5dv_sig_crc_type', 'Size' => '4', 'Type' => 'Enum' }, '15979' => { 'Header' => undef, 'Line' => '165', 'Memb' => { '0' => { 'name' => 'MLX5DV_BLOCK_SIZE_512', 'value' => '0' }, '1' => { 'name' => 'MLX5DV_BLOCK_SIZE_520', 'value' => '1' }, '2' => { 'name' => 'MLX5DV_BLOCK_SIZE_4048', 'value' => '2' }, '3' => { 'name' => 'MLX5DV_BLOCK_SIZE_4096', 'value' => '3' }, '4' => { 'name' => 'MLX5DV_BLOCK_SIZE_4160', 'value' => '4' } }, 'Name' => 'enum mlx5dv_block_size', 'Size' => '4', 'Type' => 'Enum' }, '16027' => { 'Header' => undef, 'Line' => '181', 'Memb' => { '0' => { 'name' => 'block_size', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'block_prot', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 't10dif_bg', 'offset' => '18', 'type' => '1989' }, '3' => { 'name' => 'crc_type', 'offset' => '20', 'type' => '1989' } }, 'Name' => 'struct mlx5dv_sig_caps', 'Size' => '16', 'Type' => 'Struct' }, '16093' => { 'Header' => undef, 'Line' => '204', 'Memb' => { '0' => { 'name' => 'failed_selftests', 
'offset' => '0', 'type' => '1989' }, '1' => { 'name' => 'crypto_engines', 'offset' => '2', 'type' => '1977' }, '2' => { 'name' => 'wrapped_import_method', 'offset' => '3', 'type' => '1977' }, '3' => { 'name' => 'log_max_num_deks', 'offset' => '4', 'type' => '1977' }, '4' => { 'name' => 'flags', 'offset' => '8', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_crypto_caps', 'Size' => '12', 'Type' => 'Struct' }, '16172' => { 'Header' => undef, 'Line' => '217', 'Memb' => { '0' => { 'name' => 'max_rc', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'max_xrc', 'offset' => '4', 'type' => '2001' }, '2' => { 'name' => 'max_dct', 'offset' => '8', 'type' => '2001' }, '3' => { 'name' => 'max_ud', 'offset' => '18', 'type' => '2001' }, '4' => { 'name' => 'max_uc', 'offset' => '22', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_ooo_recv_wrs_caps', 'Size' => '20', 'Type' => 'Struct' }, '16251' => { 'Header' => undef, 'Line' => '228', 'Memb' => { '0' => { 'name' => 'version', 'offset' => '0', 'type' => '1977' }, '1' => { 'name' => 'flags', 'offset' => '8', 'type' => '2023' }, '10' => { 'name' => 'dc_odp_caps', 'offset' => '132', 'type' => '2001' }, '11' => { 'name' => 'hca_core_clock', 'offset' => '136', 'type' => '308' }, '12' => { 'name' => 'num_lag_ports', 'offset' => '150', 'type' => '1977' }, '13' => { 'name' => 'sig_caps', 'offset' => '260', 'type' => '16027' }, '14' => { 'name' => 'dci_streams_caps', 'offset' => '288', 'type' => '15843' }, '15' => { 'name' => 'max_wr_memcpy_length', 'offset' => '296', 'type' => '419' }, '16' => { 'name' => 'crypto_caps', 'offset' => '310', 'type' => '16093' }, '17' => { 'name' => 'max_dc_rd_atom', 'offset' => '338', 'type' => '2023' }, '18' => { 'name' => 'max_dc_init_rd_atom', 'offset' => '352', 'type' => '2023' }, '19' => { 'name' => 'reg_c0', 'offset' => '360', 'type' => '15210' }, '2' => { 'name' => 'comp_mask', 'offset' => '22', 'type' => '2023' }, '20' => { 'name' => 'ooo_recv_wrs_caps', 'offset' => '374', 'type' => '16172' }, '3' => { 'name' => 'cqe_comp_caps', 'offset' => '36', 'type' => '15684' }, '4' => { 'name' => 'sw_parsing_caps', 'offset' => '50', 'type' => '15724' }, '5' => { 'name' => 'striding_rq_caps', 'offset' => '64', 'type' => '15764' }, '6' => { 'name' => 'tunnel_offloads_caps', 'offset' => '96', 'type' => '2001' }, '7' => { 'name' => 'max_dynamic_bfregs', 'offset' => '100', 'type' => '2001' }, '8' => { 'name' => 'max_clock_info_update_nsec', 'offset' => '114', 'type' => '2023' }, '9' => { 'name' => 'flow_action_flags', 'offset' => '128', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_context', 'Size' => '200', 'Type' => 'Struct' }, '16538' => { 'Header' => undef, 'Line' => '277', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'cqe_comp_res_format', 'offset' => '8', 'type' => '1977' }, '2' => { 'name' => 'flags', 'offset' => '18', 'type' => '2001' }, '3' => { 'name' => 'cqe_size', 'offset' => '22', 'type' => '1989' } }, 'Name' => 'struct mlx5dv_cq_init_attr', 'Size' => '24', 'Type' => 'Struct' }, '166' => { 'BaseType' => '159', 'Name' => 'int volatile', 'Size' => '4', 'Type' => 'Volatile' }, '16608' => { 'Header' => undef, 'Line' => '307', 'Memb' => { '0' => { 'name' => 'pd', 'offset' => '0', 'type' => '6313' }, '1' => { 'name' => 'create_flags', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'max_entries', 'offset' => '18', 'type' => '1989' } }, 'Name' => 'struct mlx5dv_mkey_init_attr', 'Size' => '16', 'Type' => 'Struct' }, '16663' => { 'Header' => undef, 'Line' => '313', 
'Memb' => { '0' => { 'name' => 'lkey', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'rkey', 'offset' => '4', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_mkey', 'Size' => '8', 'Type' => 'Struct' }, '16705' => { 'Header' => undef, 'Line' => '328', 'Memb' => { '0' => { 'name' => 'MLX5DV_DCTYPE_DCT', 'value' => '1' }, '1' => { 'name' => 'MLX5DV_DCTYPE_DCI', 'value' => '2' } }, 'Name' => 'enum mlx5dv_dc_type', 'Size' => '4', 'Type' => 'Enum' }, '16734' => { 'Header' => undef, 'Line' => '333', 'Memb' => { '0' => { 'name' => 'log_num_concurent', 'offset' => '0', 'type' => '1977' }, '1' => { 'name' => 'log_num_errored', 'offset' => '1', 'type' => '1977' } }, 'Name' => 'struct mlx5dv_dci_streams', 'Size' => '2', 'Type' => 'Struct' }, '16776' => { 'Header' => undef, 'Line' => '340', 'Memb' => { '0' => { 'name' => 'dct_access_key', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'dci_streams', 'offset' => '0', 'type' => '16734' } }, 'Size' => '8', 'Type' => 'Union' }, '16812' => { 'Header' => undef, 'Line' => '338', 'Memb' => { '0' => { 'name' => 'dc_type', 'offset' => '0', 'type' => '16705' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '16776' } }, 'Name' => 'struct mlx5dv_dc_init_attr', 'Size' => '16', 'Type' => 'Struct' }, '16846' => { 'Header' => undef, 'Line' => '354', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'create_flags', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'dc_init_attr', 'offset' => '22', 'type' => '16812' }, '3' => { 'name' => 'send_ops_flags', 'offset' => '50', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_qp_init_attr', 'Size' => '40', 'Type' => 'Struct' }, '16916' => { 'Header' => undef, 'Line' => '365', 'Memb' => { '0' => { 'name' => 'addr', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'bytes_count', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'bytes_skip', 'offset' => '18', 'type' => '2001' }, '3' => { 'name' => 'lkey', 'offset' => '22', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_mr_interleaved', 'Size' => '24', 'Type' => 'Struct' }, '16986' => { 'BaseType' => '16916', 'Name' => 'struct mlx5dv_mr_interleaved const', 'Size' => '24', 'Type' => 'Const' }, '16991' => { 'Header' => undef, 'Line' => '378', 'Memb' => { '0' => { 'name' => 'bg_type', 'offset' => '0', 'type' => '15913' }, '1' => { 'name' => 'bg', 'offset' => '4', 'type' => '1989' }, '2' => { 'name' => 'app_tag', 'offset' => '6', 'type' => '1989' }, '3' => { 'name' => 'ref_tag', 'offset' => '8', 'type' => '2001' }, '4' => { 'name' => 'flags', 'offset' => '18', 'type' => '1989' } }, 'Name' => 'struct mlx5dv_sig_t10dif', 'Size' => '16', 'Type' => 'Struct' }, '17074' => { 'BaseType' => '16991', 'Name' => 'struct mlx5dv_sig_t10dif const', 'Size' => '16', 'Type' => 'Const' }, '17079' => { 'Header' => undef, 'Line' => '386', 'Memb' => { '0' => { 'name' => 'type', 'offset' => '0', 'type' => '15943' }, '1' => { 'name' => 'seed', 'offset' => '8', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_sig_crc', 'Size' => '16', 'Type' => 'Struct' }, '17121' => { 'BaseType' => '17079', 'Name' => 'struct mlx5dv_sig_crc const', 'Size' => '16', 'Type' => 'Const' }, '17126' => { 'Header' => undef, 'Line' => '393', 'Memb' => { '0' => { 'name' => 'dif', 'offset' => '0', 'type' => '17162' }, '1' => { 'name' => 'crc', 'offset' => '0', 'type' => '17167' } }, 'Size' => '8', 'Type' => 'Union' }, '17162' => { 'BaseType' => '17074', 'Name' => 'struct mlx5dv_sig_t10dif const*', 'Size' => '8', 'Type' => 'Pointer' }, 
'17167' => { 'BaseType' => '17121', 'Name' => 'struct mlx5dv_sig_crc const*', 'Size' => '8', 'Type' => 'Pointer' }, '17172' => { 'Header' => undef, 'Line' => '391', 'Memb' => { '0' => { 'name' => 'sig_type', 'offset' => '0', 'type' => '15883' }, '1' => { 'name' => 'sig', 'offset' => '8', 'type' => '17126' }, '2' => { 'name' => 'block_size', 'offset' => '22', 'type' => '15979' }, '3' => { 'name' => 'comp_mask', 'offset' => '36', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_sig_block_domain', 'Size' => '32', 'Type' => 'Struct' }, '17242' => { 'BaseType' => '17172', 'Name' => 'struct mlx5dv_sig_block_domain const', 'Size' => '32', 'Type' => 'Const' }, '17247' => { 'Header' => undef, 'Line' => '414', 'Memb' => { '0' => { 'name' => 'mem', 'offset' => '0', 'type' => '17350' }, '1' => { 'name' => 'wire', 'offset' => '8', 'type' => '17350' }, '2' => { 'name' => 'flags', 'offset' => '22', 'type' => '2001' }, '3' => { 'name' => 'check_mask', 'offset' => '32', 'type' => '1977' }, '4' => { 'name' => 'copy_mask', 'offset' => '33', 'type' => '1977' }, '5' => { 'name' => 'comp_mask', 'offset' => '36', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_sig_block_attr', 'Size' => '32', 'Type' => 'Struct' }, '1731624' => { 'Header' => undef, 'Line' => '598', 'Memb' => { '0' => { 'name' => 'MLX5DV_MKEY_NO_ERR', 'value' => '0' }, '1' => { 'name' => 'MLX5DV_MKEY_SIG_BLOCK_BAD_GUARD', 'value' => '1' }, '2' => { 'name' => 'MLX5DV_MKEY_SIG_BLOCK_BAD_REFTAG', 'value' => '2' }, '3' => { 'name' => 'MLX5DV_MKEY_SIG_BLOCK_BAD_APPTAG', 'value' => '3' } }, 'Name' => 'enum mlx5dv_mkey_err_type', 'Size' => '4', 'Type' => 'Enum' }, '1731666' => { 'Header' => undef, 'Line' => '605', 'Memb' => { '0' => { 'name' => 'actual_value', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'expected_value', 'offset' => '8', 'type' => '2023' }, '2' => { 'name' => 'offset', 'offset' => '22', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_sig_err', 'Size' => '24', 'Type' => 'Struct' }, '1731722' => { 'Header' => undef, 'Line' => '613', 'Memb' => { '0' => { 'name' => 'sig', 'offset' => '0', 'type' => '1731666' } }, 'Size' => '24', 'Type' => 'Union' }, '1731745' => { 'Header' => undef, 'Line' => '611', 'Memb' => { '0' => { 'name' => 'err_type', 'offset' => '0', 'type' => '1731624' }, '1' => { 'name' => 'err', 'offset' => '8', 'type' => '1731722' } }, 'Name' => 'struct mlx5dv_mkey_err', 'Size' => '32', 'Type' => 'Struct' }, '17345' => { 'BaseType' => '17247', 'Name' => 'struct mlx5dv_sig_block_attr const', 'Size' => '32', 'Type' => 'Const' }, '17350' => { 'BaseType' => '17242', 'Name' => 'struct mlx5dv_sig_block_domain const*', 'Size' => '8', 'Type' => 'Pointer' }, '17355' => { 'Header' => undef, 'Line' => '423', 'Memb' => { '0' => { 'name' => 'MLX5DV_CRYPTO_STANDARD_AES_XTS', 'value' => '0' } }, 'Name' => 'enum mlx5dv_crypto_standard', 'Size' => '4', 'Type' => 'Enum' }, '17378' => { 'Header' => undef, 'Line' => '427', 'Memb' => { '0' => { 'name' => 'MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_AFTER_CRYPTO_ON_TX', 'value' => '0' }, '1' => { 'name' => 'MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_BEFORE_CRYPTO_ON_TX', 'value' => '1' } }, 'Name' => 'enum mlx5dv_signature_crypto_order', 'Size' => '4', 'Type' => 'Enum' }, '17407' => { 'Header' => undef, 'Line' => '432', 'Memb' => { '0' => { 'name' => 'crypto_standard', 'offset' => '0', 'type' => '17355' }, '1' => { 'name' => 'encrypt_on_tx', 'offset' => '4', 'type' => '2091' }, '2' => { 'name' => 'signature_crypto_order', 'offset' => '8', 'type' => '17378' }, '3' => { 'name' => 'data_unit_size', 
'offset' => '18', 'type' => '15979' }, '4' => { 'name' => 'initial_tweak', 'offset' => '22', 'type' => '17538' }, '5' => { 'name' => 'dek', 'offset' => '50', 'type' => '17582' }, '6' => { 'name' => 'keytag', 'offset' => '64', 'type' => '978' }, '7' => { 'name' => 'comp_mask', 'offset' => '72', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_crypto_attr', 'Size' => '56', 'Type' => 'Struct' }, '17533' => { 'BaseType' => '17407', 'Name' => 'struct mlx5dv_crypto_attr const', 'Size' => '56', 'Type' => 'Const' }, '17538' => { 'BaseType' => '356', 'Name' => 'char[16]', 'Size' => '16', 'Type' => 'Array' }, '17554' => { 'Header' => undef, 'Line' => '919', 'Memb' => { '0' => { 'name' => 'devx_obj', 'offset' => '0', 'type' => '19309' } }, 'Name' => 'struct mlx5dv_dek', 'Size' => '8', 'Type' => 'Struct' }, '17582' => { 'BaseType' => '17554', 'Name' => 'struct mlx5dv_dek*', 'Size' => '8', 'Type' => 'Pointer' }, '17587' => { 'Header' => undef, 'Line' => '447', 'Memb' => { '0' => { 'name' => 'conf_flags', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'comp_mask', 'offset' => '8', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_mkey_conf_attr', 'Size' => '16', 'Type' => 'Struct' }, '176' => { 'BaseType' => '70', 'Header' => undef, 'Line' => '42', 'Name' => '__uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '17629' => { 'Header' => undef, 'Line' => '458', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'wr_set_dc_addr', 'offset' => '8', 'type' => '17856' }, '10' => { 'name' => 'wr_set_dc_addr_stream', 'offset' => '128', 'type' => '18145' }, '11' => { 'name' => 'wr_memcpy', 'offset' => '136', 'type' => '18186' }, '12' => { 'name' => 'wr_set_mkey_crypto', 'offset' => '150', 'type' => '18212' }, '2' => { 'name' => 'wr_mr_interleaved', 'offset' => '22', 'type' => '17907' }, '3' => { 'name' => 'wr_mr_list', 'offset' => '36', 'type' => '17943' }, '4' => { 'name' => 'wr_mkey_configure', 'offset' => '50', 'type' => '17979' }, '5' => { 'name' => 'wr_set_mkey_access_flags', 'offset' => '64', 'type' => '18000' }, '6' => { 'name' => 'wr_set_mkey_layout_list', 'offset' => '72', 'type' => '18026' }, '7' => { 'name' => 'wr_set_mkey_layout_interleaved', 'offset' => '86', 'type' => '18062' }, '8' => { 'name' => 'wr_set_mkey_sig_block', 'offset' => '100', 'type' => '18088' }, '9' => { 'name' => 'wr_raw_wqe', 'offset' => '114', 'type' => '18109' } }, 'Name' => 'struct mlx5dv_qp_ex', 'Size' => '104', 'Type' => 'Struct' }, '1784870' => { 'BaseType' => '1731745', 'Name' => 'struct mlx5dv_mkey_err*', 'Size' => '8', 'Type' => 'Pointer' }, '17851' => { 'BaseType' => '17629', 'Name' => 'struct mlx5dv_qp_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '17856' => { 'Name' => 'void(*)(struct mlx5dv_qp_ex*, struct ibv_ah*, uint32_t, uint64_t)', 'Param' => { '0' => { 'type' => '17851' }, '1' => { 'type' => '8232' }, '2' => { 'type' => '2001' }, '3' => { 'type' => '2023' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '17897' => { 'BaseType' => '16663', 'Name' => 'struct mlx5dv_mkey*', 'Size' => '8', 'Type' => 'Pointer' }, '17902' => { 'BaseType' => '16916', 'Name' => 'struct mlx5dv_mr_interleaved*', 'Size' => '8', 'Type' => 'Pointer' }, '17907' => { 'Name' => 'void(*)(struct mlx5dv_qp_ex*, struct mlx5dv_mkey*, uint32_t, uint32_t, uint16_t, struct mlx5dv_mr_interleaved*)', 'Param' => { '0' => { 'type' => '17851' }, '1' => { 'type' => '17897' }, '2' => { 'type' => '2001' }, '3' => { 'type' => '2001' }, '4' => { 'type' => '1989' }, '5' => { 'type' => '17902' } }, 'Return' 
=> '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '17943' => { 'Name' => 'void(*)(struct mlx5dv_qp_ex*, struct mlx5dv_mkey*, uint32_t, uint16_t, struct ibv_sge*)', 'Param' => { '0' => { 'type' => '17851' }, '1' => { 'type' => '17897' }, '2' => { 'type' => '2001' }, '3' => { 'type' => '1989' }, '4' => { 'type' => '8621' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '17974' => { 'BaseType' => '17587', 'Name' => 'struct mlx5dv_mkey_conf_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '17979' => { 'Name' => 'void(*)(struct mlx5dv_qp_ex*, struct mlx5dv_mkey*, uint8_t, struct mlx5dv_mkey_conf_attr*)', 'Param' => { '0' => { 'type' => '17851' }, '1' => { 'type' => '17897' }, '2' => { 'type' => '1977' }, '3' => { 'type' => '17974' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '1799' => { 'BaseType' => '1408', 'Header' => undef, 'Line' => '7', 'Name' => 'FILE', 'Size' => '216', 'Type' => 'Typedef' }, '18000' => { 'Name' => 'void(*)(struct mlx5dv_qp_ex*, uint32_t)', 'Param' => { '0' => { 'type' => '17851' }, '1' => { 'type' => '2001' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '18026' => { 'Name' => 'void(*)(struct mlx5dv_qp_ex*, uint16_t, struct ibv_sge const*)', 'Param' => { '0' => { 'type' => '17851' }, '1' => { 'type' => '1989' }, '2' => { 'type' => '9835' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '18057' => { 'BaseType' => '16986', 'Name' => 'struct mlx5dv_mr_interleaved const*', 'Size' => '8', 'Type' => 'Pointer' }, '18062' => { 'Name' => 'void(*)(struct mlx5dv_qp_ex*, uint32_t, uint16_t, struct mlx5dv_mr_interleaved const*)', 'Param' => { '0' => { 'type' => '17851' }, '1' => { 'type' => '2001' }, '2' => { 'type' => '1989' }, '3' => { 'type' => '18057' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '18083' => { 'BaseType' => '17345', 'Name' => 'struct mlx5dv_sig_block_attr const*', 'Size' => '8', 'Type' => 'Pointer' }, '18088' => { 'Name' => 'void(*)(struct mlx5dv_qp_ex*, struct mlx5dv_sig_block_attr const*)', 'Param' => { '0' => { 'type' => '17851' }, '1' => { 'type' => '18083' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '18109' => { 'Name' => 'void(*)(struct mlx5dv_qp_ex*, void const*)', 'Param' => { '0' => { 'type' => '17851' }, '1' => { 'type' => '1961' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '1811' => { 'BaseType' => '1', 'Header' => undef, 'Line' => '43', 'Name' => '_IO_lock_t', 'Type' => 'Typedef' }, '18145' => { 'Name' => 'void(*)(struct mlx5dv_qp_ex*, struct ibv_ah*, uint32_t, uint64_t, uint16_t)', 'Param' => { '0' => { 'type' => '17851' }, '1' => { 'type' => '8232' }, '2' => { 'type' => '2001' }, '3' => { 'type' => '2023' }, '4' => { 'type' => '1989' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '18186' => { 'Name' => 'void(*)(struct mlx5dv_qp_ex*, uint32_t, uint64_t, uint32_t, uint64_t, size_t)', 'Param' => { '0' => { 'type' => '17851' }, '1' => { 'type' => '2001' }, '2' => { 'type' => '2023' }, '3' => { 'type' => '2001' }, '4' => { 'type' => '2023' }, '5' => { 'type' => '419' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '1819' => { 'Name' => 'struct _IO_marker', 'Type' => 'Struct' }, '18207' => { 'BaseType' => '17533', 'Name' => 'struct mlx5dv_crypto_attr const*', 'Size' => '8', 'Type' => 'Pointer' }, '18212' => { 'Name' => 'void(*)(struct mlx5dv_qp_ex*, struct mlx5dv_crypto_attr const*)', 'Param' => { '0' => { 'type' => '17851' }, '1' => { 'type' => '18207' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '18217' => { 'Header' => undef, 
'Line' => '637', 'Memb' => { '0' => { 'name' => 'credential_id', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'import_kek_id', 'offset' => '4', 'type' => '2001' }, '2' => { 'name' => 'credential', 'offset' => '8', 'type' => '950' }, '3' => { 'name' => 'comp_mask', 'offset' => '86', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_crypto_login_attr', 'Size' => '64', 'Type' => 'Struct' }, '1824' => { 'BaseType' => '1819', 'Name' => 'struct _IO_marker*', 'Size' => '8', 'Type' => 'Pointer' }, '18287' => { 'Header' => undef, 'Line' => '644', 'Memb' => { '0' => { 'name' => 'credential_id', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'import_kek_id', 'offset' => '4', 'type' => '2001' }, '2' => { 'name' => 'credential', 'offset' => '8', 'type' => '1961' }, '3' => { 'name' => 'credential_len', 'offset' => '22', 'type' => '419' }, '4' => { 'name' => 'comp_mask', 'offset' => '36', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_crypto_login_attr_ex', 'Size' => '32', 'Type' => 'Struct' }, '1829' => { 'BaseType' => '1408', 'Name' => 'struct _IO_FILE*', 'Size' => '8', 'Type' => 'Pointer' }, '1834' => { 'BaseType' => '356', 'Name' => 'char[1]', 'Size' => '1', 'Type' => 'Array' }, '18371' => { 'Header' => undef, 'Line' => '651', 'Memb' => { '0' => { 'name' => 'MLX5DV_CRYPTO_LOGIN_STATE_VALID', 'value' => '0' }, '1' => { 'name' => 'MLX5DV_CRYPTO_LOGIN_STATE_NO_LOGIN', 'value' => '1' }, '2' => { 'name' => 'MLX5DV_CRYPTO_LOGIN_STATE_INVALID', 'value' => '2' } }, 'Name' => 'enum mlx5dv_crypto_login_state', 'Size' => '4', 'Type' => 'Enum' }, '18406' => { 'Header' => undef, 'Line' => '657', 'Memb' => { '0' => { 'name' => 'state', 'offset' => '0', 'type' => '18371' }, '1' => { 'name' => 'comp_mask', 'offset' => '8', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_crypto_login_query_attr', 'Size' => '16', 'Type' => 'Struct' }, '18448' => { 'Header' => undef, 'Line' => '679', 'Memb' => { '0' => { 'name' => 'MLX5DV_CRYPTO_KEY_SIZE_128', 'value' => '0' }, '1' => { 'name' => 'MLX5DV_CRYPTO_KEY_SIZE_256', 'value' => '1' } }, 'Name' => 'enum mlx5dv_crypto_key_size', 'Size' => '4', 'Type' => 'Enum' }, '18477' => { 'Header' => undef, 'Line' => '684', 'Memb' => { '0' => { 'name' => 'MLX5DV_CRYPTO_KEY_PURPOSE_AES_XTS', 'value' => '0' } }, 'Name' => 'enum mlx5dv_crypto_key_purpose', 'Size' => '4', 'Type' => 'Enum' }, '1850' => { 'BaseType' => '1811', 'Name' => '_IO_lock_t*', 'Size' => '8', 'Type' => 'Pointer' }, '18500' => { 'Header' => undef, 'Line' => '688', 'Memb' => { '0' => { 'name' => 'MLX5DV_DEK_STATE_READY', 'value' => '0' }, '1' => { 'name' => 'MLX5DV_DEK_STATE_ERROR', 'value' => '1' } }, 'Name' => 'enum mlx5dv_dek_state', 'Size' => '4', 'Type' => 'Enum' }, '18529' => { 'Header' => undef, 'Line' => '697', 'Memb' => { '0' => { 'name' => 'key_size', 'offset' => '0', 'type' => '18448' }, '1' => { 'name' => 'has_keytag', 'offset' => '4', 'type' => '2091' }, '2' => { 'name' => 'key_purpose', 'offset' => '8', 'type' => '18477' }, '3' => { 'name' => 'pd', 'offset' => '22', 'type' => '6313' }, '4' => { 'name' => 'opaque', 'offset' => '36', 'type' => '978' }, '5' => { 'name' => 'key', 'offset' => '50', 'type' => '18654' }, '6' => { 'name' => 'comp_mask', 'offset' => '352', 'type' => '2023' }, '7' => { 'name' => 'crypto_login', 'offset' => '360', 'type' => '18698' } }, 'Name' => 'struct mlx5dv_dek_init_attr', 'Size' => '176', 'Type' => 'Struct' }, '1855' => { 'Name' => 'struct _IO_codecvt', 'Type' => 'Struct' }, '1860' => { 'BaseType' => '1855', 'Name' => 'struct _IO_codecvt*', 'Size' => '8', 'Type' => 
'Pointer' }, '1865' => { 'Name' => 'struct _IO_wide_data', 'Type' => 'Struct' }, '18654' => { 'BaseType' => '356', 'Name' => 'char[128]', 'Size' => '128', 'Type' => 'Array' }, '18670' => { 'Header' => undef, 'Line' => '915', 'Memb' => { '0' => { 'name' => 'devx_obj', 'offset' => '0', 'type' => '19309' } }, 'Name' => 'struct mlx5dv_crypto_login_obj', 'Size' => '8', 'Type' => 'Struct' }, '18698' => { 'BaseType' => '18670', 'Name' => 'struct mlx5dv_crypto_login_obj*', 'Size' => '8', 'Type' => 'Pointer' }, '1870' => { 'BaseType' => '1865', 'Name' => 'struct _IO_wide_data*', 'Size' => '8', 'Type' => 'Pointer' }, '18703' => { 'Header' => undef, 'Line' => '708', 'Memb' => { '0' => { 'name' => 'state', 'offset' => '0', 'type' => '18500' }, '1' => { 'name' => 'opaque', 'offset' => '4', 'type' => '978' }, '2' => { 'name' => 'comp_mask', 'offset' => '22', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_dek_attr', 'Size' => '24', 'Type' => 'Struct' }, '1875' => { 'BaseType' => '356', 'Name' => 'char[20]', 'Size' => '20', 'Type' => 'Array' }, '18759' => { 'Header' => undef, 'Line' => '727', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'action_flags', 'offset' => '8', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_flow_action_esp', 'Size' => '16', 'Type' => 'Struct' }, '188' => { 'Name' => 'long', 'Size' => '8', 'Type' => 'Intrinsic' }, '18801' => { 'Header' => undef, 'Line' => '732', 'Memb' => { '0' => { 'name' => 'match_sz', 'offset' => '0', 'type' => '419' }, '1' => { 'name' => 'match_buf', 'offset' => '8', 'type' => '18843' } }, 'Name' => 'struct mlx5dv_flow_match_parameters', 'Size' => '8', 'Type' => 'Struct' }, '18843' => { 'BaseType' => '2023', 'Name' => 'uint64_t[]', 'Size' => '8', 'Type' => 'Array' }, '18858' => { 'Header' => undef, 'Line' => '741', 'Memb' => { '0' => { 'name' => 'type', 'offset' => '0', 'type' => '10827' }, '1' => { 'name' => 'flags', 'offset' => '4', 'type' => '2001' }, '2' => { 'name' => 'priority', 'offset' => '8', 'type' => '1989' }, '3' => { 'name' => 'match_criteria_enable', 'offset' => '16', 'type' => '1977' }, '4' => { 'name' => 'match_mask', 'offset' => '22', 'type' => '18970' }, '5' => { 'name' => 'comp_mask', 'offset' => '36', 'type' => '2023' }, '6' => { 'name' => 'ft_type', 'offset' => '50', 'type' => '14957' } }, 'Name' => 'struct mlx5dv_flow_matcher_attr', 'Size' => '40', 'Type' => 'Struct' }, '18970' => { 'BaseType' => '18801', 'Name' => 'struct mlx5dv_flow_match_parameters*', 'Size' => '8', 'Type' => 'Pointer' }, '18975' => { 'Header' => undef, 'Line' => '759', 'Memb' => { '0' => { 'name' => 'ft_type', 'offset' => '0', 'type' => '14957' }, '1' => { 'name' => 'priority', 'offset' => '4', 'type' => '1989' }, '2' => { 'name' => 'comp_mask', 'offset' => '8', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_steering_anchor_attr', 'Size' => '16', 'Type' => 'Struct' }, '1903' => { 'BaseType' => '248', 'Header' => undef, 'Line' => '63', 'Name' => 'off_t', 'Size' => '8', 'Type' => 'Typedef' }, '19031' => { 'Header' => undef, 'Line' => '765', 'Memb' => { '0' => { 'name' => 'id', 'offset' => '0', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_steering_anchor', 'Size' => '4', 'Type' => 'Struct' }, '19058' => { 'Header' => undef, 'Line' => '774', 'Memb' => { '0' => { 'name' => 'MLX5DV_FLOW_ACTION_DEST_IBV_QP', 'value' => '0' }, '1' => { 'name' => 'MLX5DV_FLOW_ACTION_DROP', 'value' => '1' }, '2' => { 'name' => 'MLX5DV_FLOW_ACTION_IBV_COUNTER', 'value' => '2' }, '3' => { 'name' => 'MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION', 
'value' => '3' }, '4' => { 'name' => 'MLX5DV_FLOW_ACTION_TAG', 'value' => '4' }, '5' => { 'name' => 'MLX5DV_FLOW_ACTION_DEST_DEVX', 'value' => '5' }, '6' => { 'name' => 'MLX5DV_FLOW_ACTION_COUNTERS_DEVX', 'value' => '6' }, '7' => { 'name' => 'MLX5DV_FLOW_ACTION_DEFAULT_MISS', 'value' => '7' } }, 'Name' => 'enum mlx5dv_flow_action_type', 'Size' => '4', 'Type' => 'Enum' }, '19123' => { 'Header' => undef, 'Line' => '787', 'Memb' => { '0' => { 'name' => 'qp', 'offset' => '0', 'type' => '5101' }, '1' => { 'name' => 'counter', 'offset' => '0', 'type' => '10940' }, '2' => { 'name' => 'action', 'offset' => '0', 'type' => '13658' }, '3' => { 'name' => 'tag_value', 'offset' => '0', 'type' => '2001' }, '4' => { 'name' => 'obj', 'offset' => '0', 'type' => '19309' } }, 'Size' => '8', 'Type' => 'Union' }, '1915' => { 'BaseType' => '310', 'Header' => undef, 'Line' => '77', 'Name' => 'ssize_t', 'Size' => '8', 'Type' => 'Typedef' }, '19197' => { 'Header' => undef, 'Line' => '794', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'handle', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'type', 'offset' => '18', 'type' => '28169' }, '3' => { 'name' => 'object_id', 'offset' => '22', 'type' => '2001' }, '4' => { 'name' => 'rx_icm_addr', 'offset' => '36', 'type' => '2023' }, '5' => { 'name' => 'log_obj_range', 'offset' => '50', 'type' => '1977' }, '6' => { 'name' => 'priv', 'offset' => '64', 'type' => '308' } }, 'Name' => 'struct mlx5dv_devx_obj', 'Size' => '48', 'Type' => 'Struct' }, '1927' => { 'BaseType' => '1799', 'Name' => 'FILE*', 'Size' => '8', 'Type' => 'Pointer' }, '19309' => { 'BaseType' => '19197', 'Name' => 'struct mlx5dv_devx_obj*', 'Size' => '8', 'Type' => 'Pointer' }, '19314' => { 'Header' => undef, 'Line' => '785', 'Memb' => { '0' => { 'name' => 'type', 'offset' => '0', 'type' => '19058' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '19123' } }, 'Name' => 'struct mlx5dv_flow_action_attr', 'Size' => '16', 'Type' => 'Struct' }, '19348' => { 'Header' => undef, 'Line' => '856', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'wqe_cnt', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'stride', 'offset' => '18', 'type' => '2001' } }, 'Size' => '16', 'Type' => 'Struct' }, '19401' => { 'Header' => undef, 'Line' => '861', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'wqe_cnt', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'stride', 'offset' => '18', 'type' => '2001' } }, 'Size' => '16', 'Type' => 'Struct' }, '19454' => { 'Header' => undef, 'Line' => '866', 'Memb' => { '0' => { 'name' => 'reg', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'size', 'offset' => '8', 'type' => '2001' } }, 'Size' => '16', 'Type' => 'Struct' }, '19493' => { 'Header' => undef, 'Line' => '854', 'Memb' => { '0' => { 'name' => 'dbrec', 'offset' => '0', 'type' => '19658' }, '1' => { 'name' => 'sq', 'offset' => '8', 'type' => '19348' }, '10' => { 'name' => 'tir_icm_addr', 'offset' => '136', 'type' => '2023' }, '2' => { 'name' => 'rq', 'offset' => '36', 'type' => '19401' }, '3' => { 'name' => 'bf', 'offset' => '64', 'type' => '19454' }, '4' => { 'name' => 'comp_mask', 'offset' => '86', 'type' => '2023' }, '5' => { 'name' => 'uar_mmap_offset', 'offset' => '100', 'type' => '1903' }, '6' => { 'name' => 'tirn', 'offset' => '114', 'type' => '2001' }, '7' => { 'name' => 'tisn', 'offset' => '118', 'type' => '2001' }, '8' => { 'name' => 'rqn', 'offset' 
=> '128', 'type' => '2001' }, '9' => { 'name' => 'sqn', 'offset' => '132', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_qp', 'Size' => '96', 'Type' => 'Struct' }, '1961' => { 'BaseType' => '1966', 'Name' => 'void const*', 'Size' => '8', 'Type' => 'Pointer' }, '19658' => { 'BaseType' => '2203', 'Name' => '__be32*', 'Size' => '8', 'Type' => 'Pointer' }, '1966' => { 'BaseType' => '1', 'Name' => 'void const', 'Type' => 'Const' }, '19663' => { 'Header' => undef, 'Line' => '879', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'dbrec', 'offset' => '8', 'type' => '19658' }, '2' => { 'name' => 'cqe_cnt', 'offset' => '22', 'type' => '2001' }, '3' => { 'name' => 'cqe_size', 'offset' => '32', 'type' => '2001' }, '4' => { 'name' => 'cq_uar', 'offset' => '36', 'type' => '308' }, '5' => { 'name' => 'cqn', 'offset' => '50', 'type' => '2001' }, '6' => { 'name' => 'comp_mask', 'offset' => '64', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_cq', 'Size' => '48', 'Type' => 'Struct' }, '1967' => { 'BaseType' => '363', 'Name' => 'char const*', 'Size' => '8', 'Type' => 'Pointer' }, '1977' => { 'BaseType' => '123', 'Header' => undef, 'Line' => '24', 'Name' => 'uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '19775' => { 'Header' => undef, 'Line' => '893', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'dbrec', 'offset' => '8', 'type' => '19658' }, '2' => { 'name' => 'stride', 'offset' => '22', 'type' => '2001' }, '3' => { 'name' => 'head', 'offset' => '32', 'type' => '2001' }, '4' => { 'name' => 'tail', 'offset' => '36', 'type' => '2001' }, '5' => { 'name' => 'comp_mask', 'offset' => '50', 'type' => '2023' }, '6' => { 'name' => 'srqn', 'offset' => '64', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_srq', 'Size' => '48', 'Type' => 'Struct' }, '19887' => { 'Header' => undef, 'Line' => '903', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'dbrec', 'offset' => '8', 'type' => '19658' }, '2' => { 'name' => 'wqe_cnt', 'offset' => '22', 'type' => '2001' }, '3' => { 'name' => 'stride', 'offset' => '32', 'type' => '2001' }, '4' => { 'name' => 'comp_mask', 'offset' => '36', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_rwq', 'Size' => '32', 'Type' => 'Struct' }, '1989' => { 'BaseType' => '147', 'Header' => undef, 'Line' => '25', 'Name' => 'uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '19971' => { 'Header' => undef, 'Line' => '911', 'Memb' => { '0' => { 'name' => 'type', 'offset' => '0', 'type' => '15100' }, '1' => { 'name' => 'comp_mask', 'offset' => '8', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_alloc_dm_attr', 'Size' => '16', 'Type' => 'Struct' }, '200' => { 'BaseType' => '82', 'Header' => undef, 'Line' => '45', 'Name' => '__uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '2001' => { 'BaseType' => '176', 'Header' => undef, 'Line' => '26', 'Name' => 'uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '20013' => { 'Header' => undef, 'Line' => '920', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '2023' }, '2' => { 'name' => 'comp_mask', 'offset' => '22', 'type' => '2023' }, '3' => { 'name' => 'remote_va', 'offset' => '36', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_dm', 'Size' => '32', 'Type' => 'Struct' }, '20083' => { 'Header' => undef, 'Line' => '941', 'Memb' => { '0' => { 'name' => 'av', 'offset' => '0', 'type' => '20292' }, '1' => { 'name' => 'comp_mask', 'offset' => '8', 'type' => 
'2023' } }, 'Name' => 'struct mlx5dv_ah', 'Size' => '16', 'Type' => 'Struct' }, '20124' => { 'Header' => undef, 'Line' => '1295', 'Memb' => { '0' => { 'name' => 'key', 'offset' => '0', 'type' => '21100' }, '1' => { 'name' => 'dqp_dct', 'offset' => '8', 'type' => '2203' }, '10' => { 'name' => 'rgid', 'offset' => '50', 'type' => '2524' }, '2' => { 'name' => 'stat_rate_sl', 'offset' => '18', 'type' => '1977' }, '3' => { 'name' => 'fl_mlid', 'offset' => '19', 'type' => '1977' }, '4' => { 'name' => 'rlid', 'offset' => '20', 'type' => '2191' }, '5' => { 'name' => 'reserved0', 'offset' => '22', 'type' => '20946' }, '6' => { 'name' => 'rmac', 'offset' => '32', 'type' => '10868' }, '7' => { 'name' => 'tclass', 'offset' => '38', 'type' => '1977' }, '8' => { 'name' => 'hop_limit', 'offset' => '39', 'type' => '1977' }, '9' => { 'name' => 'grh_gid_fl', 'offset' => '40', 'type' => '2203' } }, 'Name' => 'struct mlx5_wqe_av', 'Size' => '48', 'Type' => 'Struct' }, '2023' => { 'BaseType' => '200', 'Header' => undef, 'Line' => '27', 'Name' => 'uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '20292' => { 'BaseType' => '20124', 'Name' => 'struct mlx5_wqe_av*', 'Size' => '8', 'Type' => 'Pointer' }, '20297' => { 'Header' => undef, 'Line' => '946', 'Memb' => { '0' => { 'name' => 'pdn', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'comp_mask', 'offset' => '8', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_pd', 'Size' => '16', 'Type' => 'Struct' }, '20339' => { 'Header' => undef, 'Line' => '951', 'Memb' => { '0' => { 'name' => 'handle', 'offset' => '0', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_devx', 'Size' => '4', 'Type' => 'Struct' }, '20367' => { 'Header' => undef, 'Line' => '956', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '5101' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '20405' } }, 'Size' => '16', 'Type' => 'Struct' }, '20405' => { 'BaseType' => '19493', 'Name' => 'struct mlx5dv_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '20410' => { 'Header' => undef, 'Line' => '960', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '4901' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '20448' } }, 'Size' => '16', 'Type' => 'Struct' }, '20448' => { 'BaseType' => '19663', 'Name' => 'struct mlx5dv_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '20453' => { 'Header' => undef, 'Line' => '964', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '5217' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '20491' } }, 'Size' => '16', 'Type' => 'Struct' }, '20491' => { 'BaseType' => '19775', 'Name' => 'struct mlx5dv_srq*', 'Size' => '8', 'Type' => 'Pointer' }, '20496' => { 'Header' => undef, 'Line' => '968', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '5416' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '20534' } }, 'Size' => '16', 'Type' => 'Struct' }, '20534' => { 'BaseType' => '19887', 'Name' => 'struct mlx5dv_rwq*', 'Size' => '8', 'Type' => 'Pointer' }, '20539' => { 'Header' => undef, 'Line' => '972', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '2979' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '20577' } }, 'Size' => '16', 'Type' => 'Struct' }, '20577' => { 'BaseType' => '20013', 'Name' => 'struct mlx5dv_dm*', 'Size' => '8', 'Type' => 'Pointer' }, '20582' => { 'Header' => undef, 'Line' => '976', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '8232' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '20620' } }, 'Size' => '16', 'Type' => 'Struct' }, '20620' 
=> { 'BaseType' => '20083', 'Name' => 'struct mlx5dv_ah*', 'Size' => '8', 'Type' => 'Pointer' }, '20625' => { 'Header' => undef, 'Line' => '980', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '6313' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '20663' } }, 'Size' => '16', 'Type' => 'Struct' }, '20663' => { 'BaseType' => '20297', 'Name' => 'struct mlx5dv_pd*', 'Size' => '8', 'Type' => 'Pointer' }, '20668' => { 'Header' => undef, 'Line' => '984', 'Memb' => { '0' => { 'name' => 'in', 'offset' => '0', 'type' => '19309' }, '1' => { 'name' => 'out', 'offset' => '8', 'type' => '20706' } }, 'Size' => '16', 'Type' => 'Struct' }, '20706' => { 'BaseType' => '20339', 'Name' => 'struct mlx5dv_devx*', 'Size' => '8', 'Type' => 'Pointer' }, '20711' => { 'Header' => undef, 'Line' => '955', 'Memb' => { '0' => { 'name' => 'qp', 'offset' => '0', 'type' => '20367' }, '1' => { 'name' => 'cq', 'offset' => '22', 'type' => '20410' }, '2' => { 'name' => 'srq', 'offset' => '50', 'type' => '20453' }, '3' => { 'name' => 'rwq', 'offset' => '72', 'type' => '20496' }, '4' => { 'name' => 'dm', 'offset' => '100', 'type' => '20539' }, '5' => { 'name' => 'ah', 'offset' => '128', 'type' => '20582' }, '6' => { 'name' => 'pd', 'offset' => '150', 'type' => '20625' }, '7' => { 'name' => 'devx', 'offset' => '274', 'type' => '20668' } }, 'Name' => 'struct mlx5dv_obj', 'Size' => '128', 'Type' => 'Struct' }, '20832' => { 'Header' => undef, 'Line' => '1005', 'Memb' => { '0' => { 'name' => 'single_stride_log_num_of_bytes', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'single_wqe_log_num_of_strides', 'offset' => '4', 'type' => '2001' }, '2' => { 'name' => 'two_byte_shift_en', 'offset' => '8', 'type' => '1977' } }, 'Name' => 'struct mlx5dv_striding_rq_init_attr', 'Size' => '12', 'Type' => 'Struct' }, '20888' => { 'Header' => undef, 'Line' => '1011', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'striding_rq_attrs', 'offset' => '8', 'type' => '20832' } }, 'Name' => 'struct mlx5dv_wq_init_attr', 'Size' => '24', 'Type' => 'Struct' }, '2091' => { 'Name' => '_Bool', 'Size' => '1', 'Type' => 'Intrinsic' }, '20946' => { 'BaseType' => '1977', 'Name' => 'uint8_t[4]', 'Size' => '4', 'Type' => 'Array' }, '2103' => { 'BaseType' => '171', 'Header' => undef, 'Line' => '46', 'Name' => 'atomic_int', 'Type' => 'Typedef' }, '21061' => { 'Header' => undef, 'Line' => '1297', 'Memb' => { '0' => { 'name' => 'qkey', 'offset' => '0', 'type' => '2203' }, '1' => { 'name' => 'reserved', 'offset' => '4', 'type' => '2203' } }, 'Size' => '8', 'Type' => 'Struct' }, '21100' => { 'Header' => undef, 'Line' => '1296', 'Memb' => { '0' => { 'name' => 'qkey', 'offset' => '0', 'type' => '21061' }, '1' => { 'name' => 'dc_key', 'offset' => '0', 'type' => '2215' } }, 'Size' => '8', 'Type' => 'Union' }, '21136' => { 'Header' => undef, 'Line' => '1626', 'Memb' => { '0' => { 'name' => 'MLX5DV_CTX_ATTR_BUF_ALLOCATORS', 'value' => '1' } }, 'Name' => 'enum mlx5dv_set_ctx_attr_type', 'Size' => '4', 'Type' => 'Enum' }, '21261' => { 'Header' => undef, 'Line' => '1650', 'Memb' => { '0' => { 'name' => 'nsec', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'last_cycles', 'offset' => '8', 'type' => '2023' }, '2' => { 'name' => 'frac', 'offset' => '22', 'type' => '2023' }, '3' => { 'name' => 'mult', 'offset' => '36', 'type' => '2001' }, '4' => { 'name' => 'shift', 'offset' => '40', 'type' => '2001' }, '5' => { 'name' => 'mask', 'offset' => '50', 'type' => '2023' } }, 'Name' => 'struct 
mlx5dv_clock_info', 'Size' => '40', 'Type' => 'Struct' }, '21359' => { 'Header' => undef, 'Line' => '1772', 'Memb' => { '0' => { 'name' => 'umem_id', 'offset' => '0', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_devx_umem', 'Size' => '4', 'Type' => 'Struct' }, '21387' => { 'Header' => undef, 'Line' => '1783', 'Memb' => { '0' => { 'name' => 'addr', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'size', 'offset' => '8', 'type' => '419' }, '2' => { 'name' => 'access', 'offset' => '22', 'type' => '2001' }, '3' => { 'name' => 'pgsz_bitmap', 'offset' => '36', 'type' => '2023' }, '4' => { 'name' => 'comp_mask', 'offset' => '50', 'type' => '2023' }, '5' => { 'name' => 'dmabuf_fd', 'offset' => '64', 'type' => '159' } }, 'Name' => 'struct mlx5dv_devx_umem_in', 'Size' => '48', 'Type' => 'Struct' }, '2143' => { 'BaseType' => '46', 'Header' => undef, 'Line' => '21', 'Name' => '__u8', 'Size' => '1', 'Type' => 'Typedef' }, '21485' => { 'Header' => undef, 'Line' => '1797', 'Memb' => { '0' => { 'name' => 'reg_addr', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'base_addr', 'offset' => '8', 'type' => '308' }, '2' => { 'name' => 'page_id', 'offset' => '22', 'type' => '2001' }, '3' => { 'name' => 'mmap_off', 'offset' => '36', 'type' => '1903' }, '4' => { 'name' => 'comp_mask', 'offset' => '50', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_devx_uar', 'Size' => '40', 'Type' => 'Struct' }, '2155' => { 'BaseType' => '58', 'Header' => undef, 'Line' => '24', 'Name' => '__u16', 'Size' => '2', 'Type' => 'Typedef' }, '21569' => { 'Header' => undef, 'Line' => '1810', 'Memb' => { '0' => { 'name' => 'page_id', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'length', 'offset' => '4', 'type' => '2001' }, '2' => { 'name' => 'mmap_off', 'offset' => '8', 'type' => '1903' }, '3' => { 'name' => 'comp_mask', 'offset' => '22', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_var', 'Size' => '24', 'Type' => 'Struct' }, '21639' => { 'Header' => undef, 'Line' => '1847', 'Memb' => { '0' => { 'name' => 'fd', 'offset' => '0', 'type' => '159' } }, 'Name' => 'struct mlx5dv_devx_cmd_comp', 'Size' => '4', 'Type' => 'Struct' }, '21666' => { 'Header' => undef, 'Line' => '1863', 'Memb' => { '0' => { 'name' => 'fd', 'offset' => '0', 'type' => '159' } }, 'Name' => 'struct mlx5dv_devx_event_channel', 'Size' => '4', 'Type' => 'Struct' }, '2167' => { 'BaseType' => '70', 'Header' => undef, 'Line' => '27', 'Name' => '__u32', 'Size' => '4', 'Type' => 'Typedef' }, '21693' => { 'Header' => undef, 'Line' => '2157', 'Memb' => { '0' => { 'name' => 'index', 'offset' => '0', 'type' => '1989' } }, 'Name' => 'struct mlx5dv_pp', 'Size' => '2', 'Type' => 'Struct' }, '21721' => { 'Header' => undef, 'Line' => '2183', 'Memb' => { '0' => { 'name' => 'parent', 'offset' => '0', 'type' => '21852' }, '1' => { 'name' => 'flags', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'bw_share', 'offset' => '18', 'type' => '2001' }, '3' => { 'name' => 'max_avg_bw', 'offset' => '22', 'type' => '2001' }, '4' => { 'name' => 'comp_mask', 'offset' => '36', 'type' => '2023' } }, 'Name' => 'struct mlx5dv_sched_attr', 'Size' => '32', 'Type' => 'Struct' }, '2179' => { 'BaseType' => '443', 'Header' => undef, 'Line' => '31', 'Name' => '__u64', 'Size' => '8', 'Type' => 'Typedef' }, '21805' => { 'BaseType' => '21721', 'Name' => 'struct mlx5dv_sched_attr const', 'Size' => '32', 'Type' => 'Const' }, '21810' => { 'Header' => undef, 'Line' => '937', 'Memb' => { '0' => { 'name' => 'parent', 'offset' => '0', 'type' => '21852' }, '1' => { 'name' => 'obj', 
'offset' => '8', 'type' => '19309' } }, 'Name' => 'struct mlx5dv_sched_node', 'Size' => '16', 'Type' => 'Struct' }, '21852' => { 'BaseType' => '21810', 'Name' => 'struct mlx5dv_sched_node*', 'Size' => '8', 'Type' => 'Pointer' }, '21857' => { 'Header' => undef, 'Line' => '2226', 'Memb' => { '0' => { 'name' => 'vector', 'offset' => '0', 'type' => '159' }, '1' => { 'name' => 'fd', 'offset' => '4', 'type' => '159' } }, 'Name' => 'struct mlx5dv_devx_msi_vector', 'Size' => '8', 'Type' => 'Struct' }, '21898' => { 'Header' => undef, 'Line' => '2236', 'Memb' => { '0' => { 'name' => 'vaddr', 'offset' => '0', 'type' => '308' } }, 'Name' => 'struct mlx5dv_devx_eq', 'Size' => '8', 'Type' => 'Struct' }, '2191' => { 'BaseType' => '2155', 'Header' => undef, 'Line' => '25', 'Name' => '__be16', 'Size' => '2', 'Type' => 'Typedef' }, '22006' => { 'Header' => undef, 'Line' => '197', 'Memb' => { '0' => { 'name' => 'MLX5_ALLOC_TYPE_ANON', 'value' => '0' }, '1' => { 'name' => 'MLX5_ALLOC_TYPE_HUGE', 'value' => '1' }, '2' => { 'name' => 'MLX5_ALLOC_TYPE_CONTIG', 'value' => '2' }, '3' => { 'name' => 'MLX5_ALLOC_TYPE_PREFER_HUGE', 'value' => '3' }, '4' => { 'name' => 'MLX5_ALLOC_TYPE_PREFER_CONTIG', 'value' => '4' }, '5' => { 'name' => 'MLX5_ALLOC_TYPE_EXTERNAL', 'value' => '5' }, '6' => { 'name' => 'MLX5_ALLOC_TYPE_CUSTOM', 'value' => '6' }, '7' => { 'name' => 'MLX5_ALLOC_TYPE_ALL', 'value' => '7' } }, 'Name' => 'enum mlx5_alloc_type', 'Size' => '4', 'Type' => 'Enum' }, '2203' => { 'BaseType' => '2167', 'Header' => undef, 'Line' => '27', 'Name' => '__be32', 'Size' => '4', 'Type' => 'Typedef' }, '2215' => { 'BaseType' => '2179', 'Header' => undef, 'Line' => '29', 'Name' => '__be64', 'Size' => '8', 'Type' => 'Typedef' }, '22160' => { 'Header' => undef, 'Line' => '244', 'Memb' => { '0' => { 'name' => 'lock', 'offset' => '0', 'type' => '994' }, '1' => { 'name' => 'in_use', 'offset' => '4', 'type' => '159' }, '2' => { 'name' => 'need_lock', 'offset' => '8', 'type' => '159' } }, 'Name' => 'struct mlx5_spinlock', 'Size' => '12', 'Type' => 'Struct' }, '2243' => { 'Header' => undef, 'Line' => '146', 'Memb' => { '0' => { 'name' => 'IB_UVERBS_FLOW_ACTION_ESP_KEYMAT_AES_GCM', 'value' => '0' } }, 'Name' => 'enum ib_uverbs_flow_action_esp_keymat', 'Size' => '4', 'Type' => 'Enum' }, '22488' => { 'BaseType' => '82', 'Name' => 'unsigned long*', 'Size' => '8', 'Type' => 'Pointer' }, '2267' => { 'Header' => undef, 'Line' => '165', 'Memb' => { '0' => { 'name' => 'IB_UVERBS_FLOW_ACTION_ESP_REPLAY_NONE', 'value' => '0' }, '1' => { 'name' => 'IB_UVERBS_FLOW_ACTION_ESP_REPLAY_BMP', 'value' => '1' } }, 'Name' => 'enum ib_uverbs_flow_action_esp_replay', 'Size' => '4', 'Type' => 'Enum' }, '2297' => { 'Header' => undef, 'Line' => '191', 'Memb' => { '0' => { 'name' => 'val_ptr', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'val_ptr_data_u64', 'offset' => '0', 'type' => '2179' } }, 'Size' => '8', 'Type' => 'Union' }, '2326' => { 'Header' => undef, 'Line' => '192', 'Memb' => { '0' => { 'name' => 'next_ptr', 'offset' => '0', 'type' => '2406' }, '1' => { 'name' => 'next_ptr_data_u64', 'offset' => '0', 'type' => '2179' } }, 'Size' => '8', 'Type' => 'Union' }, '2355' => { 'Header' => undef, 'Line' => '187', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '2297' }, '1' => { 'name' => 'unnamed1', 'offset' => '8', 'type' => '2326' }, '2' => { 'name' => 'len', 'offset' => '22', 'type' => '2155' }, '3' => { 'name' => 'type', 'offset' => '24', 'type' => '2155' } }, 'Name' => 'struct ib_uverbs_flow_action_esp_encap', 'Size' 
=> '24', 'Type' => 'Struct' }, '2406' => { 'BaseType' => '2355', 'Name' => 'struct ib_uverbs_flow_action_esp_encap*', 'Size' => '8', 'Type' => 'Pointer' }, '2411' => { 'Header' => undef, 'Line' => '197', 'Memb' => { '0' => { 'name' => 'spi', 'offset' => '0', 'type' => '2167' }, '1' => { 'name' => 'seq', 'offset' => '4', 'type' => '2167' }, '2' => { 'name' => 'tfc_pad', 'offset' => '8', 'type' => '2167' }, '3' => { 'name' => 'flags', 'offset' => '18', 'type' => '2167' }, '4' => { 'name' => 'hard_limit_pkts', 'offset' => '22', 'type' => '2179' } }, 'Name' => 'struct ib_uverbs_flow_action_esp', 'Size' => '24', 'Type' => 'Struct' }, '248' => { 'BaseType' => '188', 'Header' => undef, 'Line' => '152', 'Name' => '__off_t', 'Size' => '8', 'Type' => 'Typedef' }, '249572' => { 'BaseType' => '2001', 'Name' => 'uint32_t[4]', 'Size' => '16', 'Type' => 'Array' }, '2524' => { 'BaseType' => '1977', 'Name' => 'uint8_t[16]', 'Size' => '16', 'Type' => 'Array' }, '2540' => { 'Header' => undef, 'Line' => '95', 'Memb' => { '0' => { 'name' => 'IBV_NODE_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_NODE_CA', 'value' => '1' }, '2' => { 'name' => 'IBV_NODE_SWITCH', 'value' => '2' }, '3' => { 'name' => 'IBV_NODE_ROUTER', 'value' => '3' }, '4' => { 'name' => 'IBV_NODE_RNIC', 'value' => '4' }, '5' => { 'name' => 'IBV_NODE_USNIC', 'value' => '5' }, '6' => { 'name' => 'IBV_NODE_USNIC_UDP', 'value' => '6' }, '7' => { 'name' => 'IBV_NODE_UNSPECIFIED', 'value' => '7' } }, 'Name' => 'enum ibv_node_type', 'Size' => '4', 'Type' => 'Enum' }, '25598' => { 'Header' => undef, 'Line' => '615', 'Memb' => { '0' => { 'name' => 'reg', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'need_lock', 'offset' => '8', 'type' => '159' }, '10' => { 'name' => 'count', 'offset' => '278', 'type' => '2001' }, '11' => { 'name' => 'uar_entry', 'offset' => '288', 'type' => '14358' }, '12' => { 'name' => 'uar_handle', 'offset' => '310', 'type' => '2001' }, '13' => { 'name' => 'length', 'offset' => '320', 'type' => '2001' }, '14' => { 'name' => 'page_id', 'offset' => '324', 'type' => '2001' }, '2' => { 'name' => 'lock', 'offset' => '18', 'type' => '22160' }, '3' => { 'name' => 'offset', 'offset' => '36', 'type' => '70' }, '4' => { 'name' => 'buf_size', 'offset' => '40', 'type' => '70' }, '5' => { 'name' => 'uuarn', 'offset' => '50', 'type' => '70' }, '6' => { 'name' => 'uar_mmap_offset', 'offset' => '64', 'type' => '1903' }, '7' => { 'name' => 'uar', 'offset' => '72', 'type' => '308' }, '8' => { 'name' => 'bfreg_dyn_index', 'offset' => '86', 'type' => '2001' }, '9' => { 'name' => 'devx_uar', 'offset' => '100', 'type' => '28025' } }, 'Name' => 'struct mlx5_bf', 'Size' => '152', 'Type' => 'Struct' }, '25900' => { 'BaseType' => '25598', 'Name' => 'struct mlx5_bf*', 'Size' => '8', 'Type' => 'Pointer' }, '260' => { 'BaseType' => '188', 'Header' => undef, 'Line' => '153', 'Name' => '__off64_t', 'Size' => '8', 'Type' => 'Typedef' }, '2605' => { 'Header' => undef, 'Line' => '106', 'Memb' => { '0' => { 'name' => 'IBV_TRANSPORT_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_TRANSPORT_IB', 'value' => '0' }, '2' => { 'name' => 'IBV_TRANSPORT_IWARP', 'value' => '1' }, '3' => { 'name' => 'IBV_TRANSPORT_USNIC', 'value' => '2' }, '4' => { 'name' => 'IBV_TRANSPORT_USNIC_UDP', 'value' => '3' }, '5' => { 'name' => 'IBV_TRANSPORT_UNSPECIFIED', 'value' => '4' } }, 'Name' => 'enum ibv_transport_type', 'Size' => '4', 'Type' => 'Enum' }, '2658' => { 'Header' => undef, 'Line' => '155', 'Memb' => { '0' => { 
'name' => 'IBV_ATOMIC_NONE', 'value' => '0' }, '1' => { 'name' => 'IBV_ATOMIC_HCA', 'value' => '1' }, '2' => { 'name' => 'IBV_ATOMIC_GLOB', 'value' => '2' } }, 'Name' => 'enum ibv_atomic_cap', 'Size' => '4', 'Type' => 'Enum' }, '2694' => { 'Header' => undef, 'Line' => '161', 'Memb' => { '0' => { 'name' => 'length', 'offset' => '0', 'type' => '419' }, '1' => { 'name' => 'log_align_req', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'comp_mask', 'offset' => '18', 'type' => '2001' } }, 'Name' => 'struct ibv_alloc_dm_attr', 'Size' => '16', 'Type' => 'Struct' }, '269449' => { 'Header' => undef, 'Line' => '1959', 'Memb' => { '0' => { 'name' => 'MLX5DV_DR_DOMAIN_TYPE_NIC_RX', 'value' => '0' }, '1' => { 'name' => 'MLX5DV_DR_DOMAIN_TYPE_NIC_TX', 'value' => '1' }, '2' => { 'name' => 'MLX5DV_DR_DOMAIN_TYPE_FDB', 'value' => '2' } }, 'Name' => 'enum mlx5dv_dr_domain_type', 'Size' => '4', 'Type' => 'Enum' }, '269484' => { 'Header' => undef, 'Line' => '1971', 'Memb' => { '0' => { 'name' => 'next_table', 'offset' => '0', 'type' => '269706' }, '1' => { 'name' => 'active', 'offset' => '8', 'type' => '1977' }, '2' => { 'name' => 'reg_c_index', 'offset' => '9', 'type' => '1977' }, '3' => { 'name' => 'flow_meter_parameter_sz', 'offset' => '22', 'type' => '419' }, '4' => { 'name' => 'flow_meter_parameter', 'offset' => '36', 'type' => '308' } }, 'Name' => 'struct mlx5dv_dr_flow_meter_attr', 'Size' => '32', 'Type' => 'Struct' }, '269568' => { 'Header' => undef, 'Line' => '1165', 'Memb' => { '0' => { 'name' => 'dmn', 'offset' => '0', 'type' => '284828' }, '1' => { 'name' => 'rx', 'offset' => '8', 'type' => '290176' }, '2' => { 'name' => 'tx', 'offset' => '36', 'type' => '290176' }, '3' => { 'name' => 'level', 'offset' => '64', 'type' => '2001' }, '4' => { 'name' => 'table_type', 'offset' => '68', 'type' => '2001' }, '5' => { 'name' => 'matcher_list', 'offset' => '72', 'type' => '14403' }, '6' => { 'name' => 'devx_obj', 'offset' => '100', 'type' => '19309' }, '7' => { 'name' => 'refcount', 'offset' => '114', 'type' => '2103' }, '8' => { 'name' => 'tbl_list', 'offset' => '128', 'type' => '14358' } }, 'Name' => 'struct mlx5dv_dr_table', 'Size' => '96', 'Type' => 'Struct' }, '269706' => { 'BaseType' => '269568', 'Name' => 'struct mlx5dv_dr_table*', 'Size' => '8', 'Type' => 'Pointer' }, '269711' => { 'Header' => undef, 'Line' => '1979', 'Memb' => { '0' => { 'name' => 'sample_ratio', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'default_next_table', 'offset' => '8', 'type' => '269706' }, '2' => { 'name' => 'num_sample_actions', 'offset' => '22', 'type' => '2001' }, '3' => { 'name' => 'sample_actions', 'offset' => '36', 'type' => '269843' }, '4' => { 'name' => 'action', 'offset' => '50', 'type' => '2215' } }, 'Name' => 'struct mlx5dv_dr_flow_sampler_attr', 'Size' => '40', 'Type' => 'Struct' }, '269795' => { 'Header' => undef, 'Line' => '1257', 'Memb' => { '0' => { 'name' => 'action_type', 'offset' => '0', 'type' => '282518' }, '1' => { 'name' => 'refcount', 'offset' => '4', 'type' => '2103' }, '2' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '292052' } }, 'Name' => 'struct mlx5dv_dr_action', 'Size' => '80', 'Type' => 'Struct' }, '269843' => { 'BaseType' => '269848', 'Name' => 'struct mlx5dv_dr_action**', 'Size' => '8', 'Type' => 'Pointer' }, '269848' => { 'BaseType' => '269795', 'Name' => 'struct mlx5dv_dr_action*', 'Size' => '8', 'Type' => 'Pointer' }, '269876' => { 'Header' => undef, 'Line' => '2056', 'Memb' => { '0' => { 'name' => 'MLX5DV_DR_ACTION_DEST', 'value' => '0' }, '1' => { 'name' 
=> 'MLX5DV_DR_ACTION_DEST_REFORMAT', 'value' => '1' } }, 'Name' => 'enum mlx5dv_dr_action_dest_type', 'Size' => '4', 'Type' => 'Enum' }, '269905' => { 'Header' => undef, 'Line' => '2061', 'Memb' => { '0' => { 'name' => 'reformat', 'offset' => '0', 'type' => '269848' }, '1' => { 'name' => 'dest', 'offset' => '8', 'type' => '269848' } }, 'Name' => 'struct mlx5dv_dr_action_dest_reformat', 'Size' => '16', 'Type' => 'Struct' }, '269947' => { 'Header' => undef, 'Line' => '2068', 'Memb' => { '0' => { 'name' => 'dest', 'offset' => '0', 'type' => '269848' }, '1' => { 'name' => 'dest_reformat', 'offset' => '0', 'type' => '269984' } }, 'Size' => '8', 'Type' => 'Union' }, '269984' => { 'BaseType' => '269905', 'Name' => 'struct mlx5dv_dr_action_dest_reformat*', 'Size' => '8', 'Type' => 'Pointer' }, '269989' => { 'Header' => undef, 'Line' => '2066', 'Memb' => { '0' => { 'name' => 'type', 'offset' => '0', 'type' => '269876' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '269947' } }, 'Name' => 'struct mlx5dv_dr_action_dest_attr', 'Size' => '16', 'Type' => 'Struct' }, '27143' => { 'Header' => undef, 'Line' => '430', 'Memb' => { '0' => { 'name' => 'shmid', 'offset' => '0', 'type' => '159' }, '1' => { 'name' => 'shmaddr', 'offset' => '8', 'type' => '308' }, '2' => { 'name' => 'bitmap', 'offset' => '22', 'type' => '22488' }, '3' => { 'name' => 'bmp_size', 'offset' => '36', 'type' => '82' }, '4' => { 'name' => 'entry', 'offset' => '50', 'type' => '14358' } }, 'Name' => 'struct mlx5_hugetlb_mem', 'Size' => '48', 'Type' => 'Struct' }, '27227' => { 'Header' => undef, 'Line' => '438', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '419' }, '2' => { 'name' => 'base', 'offset' => '22', 'type' => '159' }, '3' => { 'name' => 'hmem', 'offset' => '36', 'type' => '27353' }, '4' => { 'name' => 'type', 'offset' => '50', 'type' => '22006' }, '5' => { 'name' => 'resource_type', 'offset' => '64', 'type' => '2023' }, '6' => { 'name' => 'req_alignment', 'offset' => '72', 'type' => '419' }, '7' => { 'name' => 'mparent_domain', 'offset' => '86', 'type' => '27442' } }, 'Name' => 'struct mlx5_buf', 'Size' => '64', 'Type' => 'Struct' }, '27353' => { 'BaseType' => '27143', 'Name' => 'struct mlx5_hugetlb_mem*', 'Size' => '8', 'Type' => 'Pointer' }, '27358' => { 'Header' => undef, 'Line' => '467', 'Memb' => { '0' => { 'name' => 'mpd', 'offset' => '0', 'type' => '27555' }, '1' => { 'name' => 'mtd', 'offset' => '136', 'type' => '27636' }, '2' => { 'name' => 'alloc', 'offset' => '150', 'type' => '12530' }, '3' => { 'name' => 'free', 'offset' => '260', 'type' => '12561' }, '4' => { 'name' => 'pd_context', 'offset' => '274', 'type' => '308' } }, 'Name' => 'struct mlx5_parent_domain', 'Size' => '120', 'Type' => 'Struct' }, '27442' => { 'BaseType' => '27358', 'Name' => 'struct mlx5_parent_domain*', 'Size' => '8', 'Type' => 'Pointer' }, '27447' => { 'Header' => undef, 'Line' => '449', 'Memb' => { '0' => { 'name' => 'ibv_td', 'offset' => '0', 'type' => '6202' }, '1' => { 'name' => 'bf', 'offset' => '8', 'type' => '25900' }, '2' => { 'name' => 'refcount', 'offset' => '22', 'type' => '2103' } }, 'Name' => 'struct mlx5_td', 'Size' => '24', 'Type' => 'Struct' }, '2747' => { 'Header' => undef, 'Line' => '171', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'memcpy_to_dm', 'offset' => '8', 'type' => '2984' }, '2' => { 'name' => 'memcpy_from_dm', 'offset' => '22', 'type' => '3019' }, '3' => { 'name' => 
'comp_mask', 'offset' => '36', 'type' => '2001' }, '4' => { 'name' => 'handle', 'offset' => '40', 'type' => '2001' } }, 'Name' => 'struct ibv_dm', 'Size' => '32', 'Type' => 'Struct' }, '27502' => { 'Header' => undef, 'Line' => '460', 'Memb' => { '0' => { 'name' => 'opaque_buf', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'opaque_mr', 'offset' => '8', 'type' => '6127' }, '2' => { 'name' => 'opaque_mr_mutex', 'offset' => '22', 'type' => '893' } }, 'Size' => '56', 'Type' => 'Struct' }, '27555' => { 'Header' => undef, 'Line' => '455', 'Memb' => { '0' => { 'name' => 'ibv_pd', 'offset' => '0', 'type' => '6132' }, '1' => { 'name' => 'pdn', 'offset' => '22', 'type' => '2001' }, '2' => { 'name' => 'refcount', 'offset' => '32', 'type' => '2103' }, '3' => { 'name' => 'mprotection_domain', 'offset' => '36', 'type' => '27631' }, '4' => { 'name' => 'unnamed0', 'offset' => '50', 'type' => '27502' } }, 'Name' => 'struct mlx5_pd', 'Size' => '88', 'Type' => 'Struct' }, '27631' => { 'BaseType' => '27555', 'Name' => 'struct mlx5_pd*', 'Size' => '8', 'Type' => 'Pointer' }, '27636' => { 'BaseType' => '27447', 'Name' => 'struct mlx5_td*', 'Size' => '8', 'Type' => 'Pointer' }, '28020' => { 'BaseType' => '70', 'Name' => 'unsigned int*', 'Size' => '8', 'Type' => 'Pointer' }, '28025' => { 'Header' => undef, 'Line' => '610', 'Memb' => { '0' => { 'name' => 'dv_devx_uar', 'offset' => '0', 'type' => '21485' }, '1' => { 'name' => 'context', 'offset' => '64', 'type' => '2944' } }, 'Name' => 'struct mlx5_devx_uar', 'Size' => '48', 'Type' => 'Struct' }, '28072' => { 'Header' => undef, 'Line' => '768', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'handle', 'offset' => '8', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_flow_matcher', 'Size' => '16', 'Type' => 'Struct' }, '28169' => { 'Header' => undef, 'Line' => '779', 'Memb' => { '0' => { 'name' => 'MLX5_DEVX_FLOW_TABLE', 'value' => '1' }, '1' => { 'name' => 'MLX5_DEVX_FLOW_COUNTER', 'value' => '2' }, '10' => { 'name' => 'MLX5_DEVX_ASO_FLOW_METER', 'value' => '11' }, '11' => { 'name' => 'MLX5_DEVX_ASO_CT', 'value' => '12' }, '2' => { 'name' => 'MLX5_DEVX_FLOW_METER', 'value' => '3' }, '3' => { 'name' => 'MLX5_DEVX_QP', 'value' => '4' }, '4' => { 'name' => 'MLX5_DEVX_PKT_REFORMAT_CTX', 'value' => '5' }, '5' => { 'name' => 'MLX5_DEVX_TIR', 'value' => '6' }, '6' => { 'name' => 'MLX5_DEVX_FLOW_GROUP', 'value' => '7' }, '7' => { 'name' => 'MLX5_DEVX_FLOW_TABLE_ENTRY', 'value' => '8' }, '8' => { 'name' => 'MLX5_DEVX_FLOW_SAMPLER', 'value' => '9' }, '9' => { 'name' => 'MLX5_DEVX_ASO_FIRST_HIT', 'value' => '10' } }, 'Name' => 'enum mlx5_devx_obj_type', 'Size' => '4', 'Type' => 'Enum' }, '282156' => { 'Header' => undef, 'Line' => '67', 'Memb' => { '0' => { 'name' => 'DR_CHUNK_SIZE_1', 'value' => '0' }, '1' => { 'name' => 'DR_CHUNK_SIZE_MIN', 'value' => '0' }, '10' => { 'name' => 'DR_CHUNK_SIZE_512', 'value' => '9' }, '11' => { 'name' => 'DR_CHUNK_SIZE_1K', 'value' => '10' }, '12' => { 'name' => 'DR_CHUNK_SIZE_2K', 'value' => '11' }, '13' => { 'name' => 'DR_CHUNK_SIZE_4K', 'value' => '12' }, '14' => { 'name' => 'DR_CHUNK_SIZE_8K', 'value' => '13' }, '15' => { 'name' => 'DR_CHUNK_SIZE_16K', 'value' => '14' }, '16' => { 'name' => 'DR_CHUNK_SIZE_32K', 'value' => '15' }, '17' => { 'name' => 'DR_CHUNK_SIZE_64K', 'value' => '16' }, '18' => { 'name' => 'DR_CHUNK_SIZE_128K', 'value' => '17' }, '19' => { 'name' => 'DR_CHUNK_SIZE_256K', 'value' => '18' }, '2' => { 'name' => 'DR_CHUNK_SIZE_2', 'value' => '1' }, '20' => { 'name' => 
'DR_CHUNK_SIZE_512K', 'value' => '19' }, '21' => { 'name' => 'DR_CHUNK_SIZE_1024K', 'value' => '20' }, '22' => { 'name' => 'DR_CHUNK_SIZE_2048K', 'value' => '21' }, '23' => { 'name' => 'DR_CHUNK_SIZE_4096K', 'value' => '22' }, '24' => { 'name' => 'DR_CHUNK_SIZE_8192K', 'value' => '23' }, '25' => { 'name' => 'DR_CHUNK_SIZE_16384K', 'value' => '24' }, '26' => { 'name' => 'DR_CHUNK_SIZE_MAX', 'value' => '25' }, '3' => { 'name' => 'DR_CHUNK_SIZE_4', 'value' => '2' }, '4' => { 'name' => 'DR_CHUNK_SIZE_8', 'value' => '3' }, '5' => { 'name' => 'DR_CHUNK_SIZE_16', 'value' => '4' }, '6' => { 'name' => 'DR_CHUNK_SIZE_32', 'value' => '5' }, '7' => { 'name' => 'DR_CHUNK_SIZE_64', 'value' => '6' }, '8' => { 'name' => 'DR_CHUNK_SIZE_128', 'value' => '7' }, '9' => { 'name' => 'DR_CHUNK_SIZE_256', 'value' => '8' } }, 'Name' => 'enum dr_icm_chunk_size', 'Size' => '4', 'Type' => 'Enum' }, '282336' => { 'Header' => undef, 'Line' => '97', 'Memb' => { '0' => { 'name' => 'DR_ICM_TYPE_STE', 'value' => '0' }, '1' => { 'name' => 'DR_ICM_TYPE_MODIFY_ACTION', 'value' => '1' }, '2' => { 'name' => 'DR_ICM_TYPE_MODIFY_HDR_PTRN', 'value' => '2' }, '3' => { 'name' => 'DR_ICM_TYPE_ENCAP', 'value' => '3' }, '4' => { 'name' => 'DR_ICM_TYPE_MAX', 'value' => '4' } }, 'Name' => 'enum dr_icm_type', 'Size' => '4', 'Type' => 'Enum' }, '282518' => { 'Header' => undef, 'Line' => '173', 'Memb' => { '0' => { 'name' => 'DR_ACTION_TYP_TNL_L2_TO_L2', 'value' => '0' }, '1' => { 'name' => 'DR_ACTION_TYP_L2_TO_TNL_L2', 'value' => '1' }, '10' => { 'name' => 'DR_ACTION_TYP_VPORT', 'value' => '10' }, '11' => { 'name' => 'DR_ACTION_TYP_METER', 'value' => '11' }, '12' => { 'name' => 'DR_ACTION_TYP_MISS', 'value' => '12' }, '13' => { 'name' => 'DR_ACTION_TYP_SAMPLER', 'value' => '13' }, '14' => { 'name' => 'DR_ACTION_TYP_DEST_ARRAY', 'value' => '14' }, '15' => { 'name' => 'DR_ACTION_TYP_POP_VLAN', 'value' => '15' }, '16' => { 'name' => 'DR_ACTION_TYP_PUSH_VLAN', 'value' => '16' }, '17' => { 'name' => 'DR_ACTION_TYP_ASO_FIRST_HIT', 'value' => '17' }, '18' => { 'name' => 'DR_ACTION_TYP_ASO_FLOW_METER', 'value' => '18' }, '19' => { 'name' => 'DR_ACTION_TYP_ASO_CT', 'value' => '19' }, '2' => { 'name' => 'DR_ACTION_TYP_TNL_L3_TO_L2', 'value' => '2' }, '20' => { 'name' => 'DR_ACTION_TYP_ROOT_FT', 'value' => '20' }, '21' => { 'name' => 'DR_ACTION_TYP_MAX', 'value' => '21' }, '3' => { 'name' => 'DR_ACTION_TYP_L2_TO_TNL_L3', 'value' => '3' }, '4' => { 'name' => 'DR_ACTION_TYP_DROP', 'value' => '4' }, '5' => { 'name' => 'DR_ACTION_TYP_QP', 'value' => '5' }, '6' => { 'name' => 'DR_ACTION_TYP_FT', 'value' => '6' }, '7' => { 'name' => 'DR_ACTION_TYP_CTR', 'value' => '7' }, '8' => { 'name' => 'DR_ACTION_TYP_TAG', 'value' => '8' }, '9' => { 'name' => 'DR_ACTION_TYP_MODIFY_HDR', 'value' => '9' } }, 'Name' => 'enum dr_action_type', 'Size' => '4', 'Type' => 'Enum' }, '2826' => { 'Header' => undef, 'Line' => '2037', 'Memb' => { '0' => { 'name' => 'device', 'offset' => '0', 'type' => '11334' }, '1' => { 'name' => 'ops', 'offset' => '8', 'type' => '11490' }, '2' => { 'name' => 'cmd_fd', 'offset' => '612', 'type' => '159' }, '3' => { 'name' => 'async_fd', 'offset' => '616', 'type' => '159' }, '4' => { 'name' => 'num_comp_vectors', 'offset' => '626', 'type' => '159' }, '5' => { 'name' => 'mutex', 'offset' => '640', 'type' => '893' }, '6' => { 'name' => 'abi_compat', 'offset' => '800', 'type' => '308' } }, 'Name' => 'struct ibv_context', 'Size' => '328', 'Type' => 'Struct' }, '282668' => { 'Header' => undef, 'Line' => '232', 'Memb' => { '0' => { 'name' => 'hw_ste', 
'offset' => '0', 'type' => '7308' }, '1' => { 'name' => 'refcount', 'offset' => '8', 'type' => '2103' }, '2' => { 'name' => 'miss_list_node', 'offset' => '22', 'type' => '14358' }, '3' => { 'name' => 'htbl', 'offset' => '50', 'type' => '282954' }, '4' => { 'name' => 'next_htbl', 'offset' => '64', 'type' => '282954' }, '5' => { 'name' => 'rule_rx_tx', 'offset' => '72', 'type' => '283015' }, '6' => { 'name' => 'ste_chain_location', 'offset' => '86', 'type' => '1977' }, '7' => { 'name' => 'size', 'offset' => '87', 'type' => '1977' } }, 'Name' => 'struct dr_ste', 'Size' => '64', 'Type' => 'Struct' }, '282786' => { 'Header' => undef, 'Line' => '268', 'Memb' => { '0' => { 'name' => 'type', 'offset' => '0', 'type' => '283062' }, '1' => { 'name' => 'lu_type', 'offset' => '4', 'type' => '1989' }, '10' => { 'name' => 'ctrl', 'offset' => '100', 'type' => '283020' }, '2' => { 'name' => 'byte_mask', 'offset' => '6', 'type' => '1989' }, '3' => { 'name' => 'refcount', 'offset' => '8', 'type' => '2103' }, '4' => { 'name' => 'chunk', 'offset' => '22', 'type' => '283217' }, '5' => { 'name' => 'ste_arr', 'offset' => '36', 'type' => '283222' }, '6' => { 'name' => 'hw_ste_arr', 'offset' => '50', 'type' => '7308' }, '7' => { 'name' => 'miss_list', 'offset' => '64', 'type' => '39716' }, '8' => { 'name' => 'chunk_size', 'offset' => '72', 'type' => '282156' }, '9' => { 'name' => 'pointing_ste', 'offset' => '86', 'type' => '283222' } }, 'Name' => 'struct dr_ste_htbl', 'Size' => '72', 'Type' => 'Struct' }, '282954' => { 'BaseType' => '282786', 'Name' => 'struct dr_ste_htbl*', 'Size' => '8', 'Type' => 'Pointer' }, '282959' => { 'Header' => undef, 'Line' => '1363', 'Memb' => { '0' => { 'name' => 'nic_matcher', 'offset' => '0', 'type' => '292287' }, '1' => { 'name' => 'last_rule_ste', 'offset' => '8', 'type' => '283222' }, '2' => { 'name' => 'lock_index', 'offset' => '22', 'type' => '1977' } }, 'Name' => 'struct dr_rule_rx_tx', 'Size' => '24', 'Type' => 'Struct' }, '283015' => { 'BaseType' => '282959', 'Name' => 'struct dr_rule_rx_tx*', 'Size' => '8', 'Type' => 'Pointer' }, '283020' => { 'Header' => undef, 'Line' => '253', 'Memb' => { '0' => { 'name' => 'num_of_valid_entries', 'offset' => '0', 'type' => '159' }, '1' => { 'name' => 'num_of_collisions', 'offset' => '4', 'type' => '159' } }, 'Name' => 'struct dr_ste_htbl_ctrl', 'Size' => '8', 'Type' => 'Struct' }, '283062' => { 'Header' => undef, 'Line' => '263', 'Memb' => { '0' => { 'name' => 'DR_STE_HTBL_TYPE_LEGACY', 'value' => '0' }, '1' => { 'name' => 'DR_STE_HTBL_TYPE_MATCH', 'value' => '1' } }, 'Name' => 'enum dr_ste_htbl_type', 'Size' => '4', 'Type' => 'Enum' }, '283091' => { 'Header' => undef, 'Line' => '1426', 'Memb' => { '0' => { 'name' => 'buddy_mem', 'offset' => '0', 'type' => '292568' }, '1' => { 'name' => 'chunk_list', 'offset' => '8', 'type' => '14358' }, '2' => { 'name' => 'num_of_entries', 'offset' => '36', 'type' => '2001' }, '3' => { 'name' => 'byte_size', 'offset' => '40', 'type' => '2001' }, '4' => { 'name' => 'seg', 'offset' => '50', 'type' => '2001' }, '5' => { 'name' => 'ste_arr', 'offset' => '64', 'type' => '283222' }, '6' => { 'name' => 'hw_ste_arr', 'offset' => '72', 'type' => '7308' }, '7' => { 'name' => 'miss_list', 'offset' => '86', 'type' => '39716' } }, 'Name' => 'struct dr_icm_chunk', 'Size' => '64', 'Type' => 'Struct' }, '283217' => { 'BaseType' => '283091', 'Name' => 'struct dr_icm_chunk*', 'Size' => '8', 'Type' => 'Pointer' }, '283222' => { 'BaseType' => '282668', 'Name' => 'struct dr_ste*', 'Size' => '8', 'Type' => 'Pointer' }, 
'283232' => { 'Header' => undef, 'Line' => '307', 'Memb' => { '0' => { 'name' => 'byte_mask', 'offset' => '0', 'type' => '1989' }, '1' => { 'name' => 'bit_mask', 'offset' => '2', 'type' => '2524' } }, 'Size' => '18', 'Type' => 'Struct' }, '283271' => { 'Header' => undef, 'Line' => '311', 'Memb' => { '0' => { 'name' => 'format_id', 'offset' => '0', 'type' => '1989' }, '1' => { 'name' => 'match', 'offset' => '2', 'type' => '63702' }, '2' => { 'name' => 'definer_obj', 'offset' => '64', 'type' => '19309' } }, 'Size' => '48', 'Type' => 'Struct' }, '283324' => { 'Header' => undef, 'Line' => '306', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '283232' }, '1' => { 'name' => 'unnamed1', 'offset' => '0', 'type' => '283271' } }, 'Size' => '48', 'Type' => 'Union' }, '283345' => { 'Header' => undef, 'Line' => '300', 'Memb' => { '0' => { 'name' => 'inner', 'offset' => '0', 'type' => '2091' }, '1' => { 'name' => 'rx', 'offset' => '1', 'type' => '2091' }, '2' => { 'name' => 'caps', 'offset' => '8', 'type' => '284150' }, '3' => { 'name' => 'lu_type', 'offset' => '22', 'type' => '1989' }, '4' => { 'name' => 'htbl_type', 'offset' => '32', 'type' => '283062' }, '5' => { 'name' => 'unnamed0', 'offset' => '36', 'type' => '283324' }, '6' => { 'name' => 'ste_build_tag_func', 'offset' => '114', 'type' => '284306' } }, 'Name' => 'struct dr_ste_build', 'Size' => '80', 'Type' => 'Struct' }, '283448' => { 'Header' => undef, 'Line' => '940', 'Memb' => { '0' => { 'name' => 'dmn', 'offset' => '0', 'type' => '284828' }, '1' => { 'name' => 'gvmi', 'offset' => '8', 'type' => '1989' }, '10' => { 'name' => 'log_modify_pattern_icm_size', 'offset' => '114', 'type' => '2001' }, '11' => { 'name' => 'hdr_modify_pattern_icm_addr', 'offset' => '128', 'type' => '2023' }, '12' => { 'name' => 'indirect_encap_icm_base', 'offset' => '136', 'type' => '2023' }, '13' => { 'name' => 'log_sw_encap_icm_size', 'offset' => '150', 'type' => '2001' }, '14' => { 'name' => 'max_encap_size', 'offset' => '256', 'type' => '1989' }, '15' => { 'name' => 'flex_protocols', 'offset' => '260', 'type' => '2001' }, '16' => { 'name' => 'flex_parser_header_modify', 'offset' => '264', 'type' => '1977' }, '17' => { 'name' => 'flex_parser_id_icmp_dw0', 'offset' => '265', 'type' => '1977' }, '18' => { 'name' => 'flex_parser_id_icmp_dw1', 'offset' => '272', 'type' => '1977' }, '19' => { 'name' => 'flex_parser_id_icmpv6_dw0', 'offset' => '273', 'type' => '1977' }, '2' => { 'name' => 'nic_rx_drop_address', 'offset' => '22', 'type' => '2023' }, '20' => { 'name' => 'flex_parser_id_icmpv6_dw1', 'offset' => '274', 'type' => '1977' }, '21' => { 'name' => 'flex_parser_id_geneve_opt_0', 'offset' => '275', 'type' => '1977' }, '22' => { 'name' => 'flex_parser_id_mpls_over_gre', 'offset' => '276', 'type' => '1977' }, '23' => { 'name' => 'flex_parser_id_mpls_over_udp', 'offset' => '277', 'type' => '1977' }, '24' => { 'name' => 'flex_parser_id_gtpu_dw_0', 'offset' => '278', 'type' => '1977' }, '25' => { 'name' => 'flex_parser_id_gtpu_teid', 'offset' => '279', 'type' => '1977' }, '26' => { 'name' => 'flex_parser_id_gtpu_dw_2', 'offset' => '280', 'type' => '1977' }, '27' => { 'name' => 'flex_parser_id_gtpu_first_ext_dw_0', 'offset' => '281', 'type' => '1977' }, '28' => { 'name' => 'flex_parser_ok_bits_supp', 'offset' => '288', 'type' => '1977' }, '29' => { 'name' => 'definer_supp_checksum', 'offset' => '289', 'type' => '1977' }, '3' => { 'name' => 'nic_tx_drop_address', 'offset' => '36', 'type' => '2023' }, '30' => { 'name' => 'max_ft_level', 'offset' => 
'290', 'type' => '1977' }, '31' => { 'name' => 'sw_format_ver', 'offset' => '291', 'type' => '1977' }, '32' => { 'name' => 'isolate_vl_tc', 'offset' => '292', 'type' => '2091' }, '33' => { 'name' => 'eswitch_manager', 'offset' => '293', 'type' => '2091' }, '34' => { 'name' => 'rx_sw_owner', 'offset' => '294', 'type' => '2091' }, '35' => { 'name' => 'tx_sw_owner', 'offset' => '295', 'type' => '2091' }, '36' => { 'name' => 'fdb_sw_owner', 'offset' => '296', 'type' => '2091' }, '37' => { 'name' => 'rx_sw_owner_v2', 'offset' => '297', 'type' => '2091' }, '38' => { 'name' => 'tx_sw_owner_v2', 'offset' => '304', 'type' => '2091' }, '39' => { 'name' => 'fdb_sw_owner_v2', 'offset' => '305', 'type' => '2091' }, '4' => { 'name' => 'nic_tx_allow_address', 'offset' => '50', 'type' => '2023' }, '40' => { 'name' => 'roce_caps', 'offset' => '306', 'type' => '287823' }, '41' => { 'name' => 'definer_format_sup', 'offset' => '310', 'type' => '2023' }, '42' => { 'name' => 'log_header_modify_argument_granularity', 'offset' => '324', 'type' => '1989' }, '43' => { 'name' => 'log_header_modify_argument_max_alloc', 'offset' => '326', 'type' => '1989' }, '44' => { 'name' => 'support_modify_argument', 'offset' => '328', 'type' => '2091' }, '45' => { 'name' => 'prio_tag_required', 'offset' => '329', 'type' => '2091' }, '46' => { 'name' => 'is_ecpf', 'offset' => '336', 'type' => '2091' }, '47' => { 'name' => 'vports', 'offset' => '338', 'type' => '287938' }, '48' => { 'name' => 'support_full_tnl_hdr', 'offset' => '626', 'type' => '2091' }, '5' => { 'name' => 'esw_rx_drop_address', 'offset' => '64', 'type' => '2023' }, '6' => { 'name' => 'esw_tx_drop_address', 'offset' => '72', 'type' => '2023' }, '7' => { 'name' => 'log_icm_size', 'offset' => '86', 'type' => '2001' }, '8' => { 'name' => 'log_modify_hdr_icm_size', 'offset' => '96', 'type' => '1977' }, '9' => { 'name' => 'hdr_modify_icm_addr', 'offset' => '100', 'type' => '2023' } }, 'Name' => 'struct dr_devx_caps', 'Size' => '280', 'Type' => 'Struct' }, '284150' => { 'BaseType' => '283448', 'Name' => 'struct dr_devx_caps*', 'Size' => '8', 'Type' => 'Pointer' }, '284180' => { 'BaseType' => '284185', 'Name' => 'struct dr_match_param*', 'Size' => '8', 'Type' => 'Pointer' }, '284185' => { 'Header' => undef, 'Line' => '879', 'Memb' => { '0' => { 'name' => 'outer', 'offset' => '0', 'type' => '285243' }, '1' => { 'name' => 'misc', 'offset' => '100', 'type' => '285772' }, '2' => { 'name' => 'inner', 'offset' => '296', 'type' => '285243' }, '3' => { 'name' => 'misc2', 'offset' => '402', 'type' => '286433' }, '4' => { 'name' => 'misc3', 'offset' => '598', 'type' => '286855' }, '5' => { 'name' => 'misc4', 'offset' => '800', 'type' => '287216' }, '6' => { 'name' => 'misc5', 'offset' => '900', 'type' => '287454' } }, 'Name' => 'struct dr_match_param', 'Size' => '448', 'Type' => 'Struct' }, '284301' => { 'BaseType' => '283345', 'Name' => 'struct dr_ste_build*', 'Size' => '8', 'Type' => 'Pointer' }, '284306' => { 'Name' => 'int(*)(struct dr_match_param*, struct dr_ste_build*, uint8_t*)', 'Param' => { '0' => { 'type' => '284180' }, '1' => { 'type' => '284301' }, '2' => { 'type' => '7308' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '284358' => { 'Header' => undef, 'Line' => '379', 'Memb' => { '0' => { 'name' => 'set', 'offset' => '0', 'type' => '2091' } }, 'Size' => '1', 'Type' => 'Struct' }, '284383' => { 'Header' => undef, 'Line' => '382', 'Memb' => { '0' => { 'name' => 'initial_color', 'offset' => '0', 'type' => '1977' } }, 'Size' => '1', 'Type' => 'Struct' }, 
'284408' => { 'Header' => undef, 'Line' => '385', 'Memb' => { '0' => { 'name' => 'direction', 'offset' => '0', 'type' => '2091' } }, 'Size' => '1', 'Type' => 'Struct' }, '284433' => { 'Header' => undef, 'Line' => '378', 'Memb' => { '0' => { 'name' => 'first_hit', 'offset' => '0', 'type' => '284358' }, '1' => { 'name' => 'flow_meter', 'offset' => '0', 'type' => '284383' }, '2' => { 'name' => 'ct', 'offset' => '0', 'type' => '284408' } }, 'Size' => '1', 'Type' => 'Union' }, '284482' => { 'Header' => undef, 'Line' => '373', 'Memb' => { '0' => { 'name' => 'dmn', 'offset' => '0', 'type' => '284828' }, '1' => { 'name' => 'devx_obj', 'offset' => '8', 'type' => '19309' }, '2' => { 'name' => 'offset', 'offset' => '22', 'type' => '2001' }, '3' => { 'name' => 'dest_reg_id', 'offset' => '32', 'type' => '1977' }, '4' => { 'name' => 'unnamed0', 'offset' => '33', 'type' => '284433' } }, 'Name' => 'struct dr_action_aso', 'Size' => '24', 'Type' => 'Struct' }, '284558' => { 'Header' => undef, 'Line' => '1080', 'Memb' => { '0' => { 'name' => 'ctx', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'ste_ctx', 'offset' => '8', 'type' => '289941' }, '10' => { 'name' => 'modify_header_arg_mngr', 'offset' => '114', 'type' => '289971' }, '11' => { 'name' => 'encap_icm_pool', 'offset' => '128', 'type' => '289951' }, '12' => { 'name' => 'send_ring', 'offset' => '136', 'type' => '289976' }, '13' => { 'name' => 'info', 'offset' => '512', 'type' => '288758' }, '14' => { 'name' => 'tbl_list', 'offset' => '4224', 'type' => '14403' }, '15' => { 'name' => 'flags', 'offset' => '4246', 'type' => '2001' }, '16' => { 'name' => 'debug_lock', 'offset' => '4352', 'type' => '994' }, '17' => { 'name' => 'num_buddies', 'offset' => '4356', 'type' => '249572' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '6313' }, '3' => { 'name' => 'pd_num', 'offset' => '36', 'type' => '159' }, '4' => { 'name' => 'uar', 'offset' => '50', 'type' => '29729' }, '5' => { 'name' => 'type', 'offset' => '64', 'type' => '269449' }, '6' => { 'name' => 'refcount', 'offset' => '68', 'type' => '2103' }, '7' => { 'name' => 'ste_icm_pool', 'offset' => '72', 'type' => '289951' }, '8' => { 'name' => 'action_icm_pool', 'offset' => '86', 'type' => '289951' }, '9' => { 'name' => 'modify_header_ptrn_mngr', 'offset' => '100', 'type' => '289961' } }, 'Name' => 'struct mlx5dv_dr_domain', 'Size' => '1120', 'Type' => 'Struct' }, '284828' => { 'BaseType' => '284558', 'Name' => 'struct mlx5dv_dr_domain*', 'Size' => '8', 'Type' => 'Pointer' }, '284833' => { 'Header' => undef, 'Line' => '410', 'Memb' => { '0' => { 'name' => 'count_pop', 'offset' => '0', 'type' => '159' }, '1' => { 'name' => 'count_push', 'offset' => '4', 'type' => '159' }, '2' => { 'name' => 'headers', 'offset' => '8', 'type' => '284886' } }, 'Size' => '16', 'Type' => 'Struct' }, '284886' => { 'BaseType' => '2001', 'Name' => 'uint32_t[2]', 'Size' => '8', 'Type' => 'Array' }, '284902' => { 'Header' => undef, 'Line' => '393', 'Memb' => { '0' => { 'name' => 'modify_index', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'modify_pat_idx', 'offset' => '4', 'type' => '2001' }, '10' => { 'name' => 'ctr_id', 'offset' => '82', 'type' => '2001' }, '11' => { 'name' => 'gvmi', 'offset' => '86', 'type' => '1989' }, '12' => { 'name' => 'hit_gvmi', 'offset' => '88', 'type' => '1989' }, '13' => { 'name' => 'reformat_id', 'offset' => '96', 'type' => '2001' }, '14' => { 'name' => 'reformat_size', 'offset' => '100', 'type' => '2001' }, '15' => { 'name' => 'prio_tag_required', 'offset' => '104', 'type' => 
'2091' }, '16' => { 'name' => 'vlans', 'offset' => '114', 'type' => '284833' }, '17' => { 'name' => 'aso', 'offset' => '136', 'type' => '285196' }, '18' => { 'name' => 'aso_ste_loc', 'offset' => '150', 'type' => '2001' }, '19' => { 'name' => 'dmn', 'offset' => '260', 'type' => '284828' }, '2' => { 'name' => 'modify_actions', 'offset' => '8', 'type' => '1989' }, '3' => { 'name' => 'single_modify_action', 'offset' => '22', 'type' => '7308' }, '4' => { 'name' => 'decap_index', 'offset' => '36', 'type' => '2001' }, '5' => { 'name' => 'decap_pat_idx', 'offset' => '40', 'type' => '2001' }, '6' => { 'name' => 'decap_actions', 'offset' => '50', 'type' => '1989' }, '7' => { 'name' => 'decap_with_vlan', 'offset' => '52', 'type' => '2091' }, '8' => { 'name' => 'final_icm_addr', 'offset' => '64', 'type' => '2023' }, '9' => { 'name' => 'flow_tag', 'offset' => '72', 'type' => '2001' } }, 'Name' => 'struct dr_ste_actions_attr', 'Size' => '112', 'Type' => 'Struct' }, '285196' => { 'BaseType' => '284482', 'Name' => 'struct dr_action_aso*', 'Size' => '8', 'Type' => 'Pointer' }, '285243' => { 'Header' => undef, 'Line' => '699', 'Memb' => { '0' => { 'name' => 'smac_47_16', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'dmac_47_16', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'src_ip_127_96', 'offset' => '50', 'type' => '2001' }, '3' => { 'name' => 'src_ip_95_64', 'offset' => '54', 'type' => '2001' }, '4' => { 'name' => 'src_ip_63_32', 'offset' => '64', 'type' => '2001' }, '5' => { 'name' => 'src_ip_31_0', 'offset' => '68', 'type' => '2001' }, '6' => { 'name' => 'dst_ip_127_96', 'offset' => '72', 'type' => '2001' }, '7' => { 'name' => 'dst_ip_95_64', 'offset' => '82', 'type' => '2001' }, '8' => { 'name' => 'dst_ip_63_32', 'offset' => '86', 'type' => '2001' }, '9' => { 'name' => 'dst_ip_31_0', 'offset' => '96', 'type' => '2001' } }, 'Name' => 'struct dr_match_spec', 'Size' => '64', 'Type' => 'Struct' }, '285772' => { 'Header' => undef, 'Line' => '737', 'Memb' => { '0' => { 'name' => 'inner_esp_spi', 'offset' => '68', 'type' => '2001' }, '1' => { 'name' => 'outer_esp_spi', 'offset' => '72', 'type' => '2001' }, '2' => { 'name' => 'reserved_at_1a0', 'offset' => '82', 'type' => '2001' }, '3' => { 'name' => 'reserved_at_1c0', 'offset' => '86', 'type' => '2001' }, '4' => { 'name' => 'reserved_at_1e0', 'offset' => '96', 'type' => '2001' } }, 'Name' => 'struct dr_match_misc', 'Size' => '64', 'Type' => 'Struct' }, '286433' => { 'Header' => undef, 'Line' => '783', 'Memb' => { '0' => { 'name' => 'metadata_reg_c_7', 'offset' => '22', 'type' => '2001' }, '1' => { 'name' => 'metadata_reg_c_6', 'offset' => '32', 'type' => '2001' }, '10' => { 'name' => 'reserved_at_1c0', 'offset' => '86', 'type' => '2001' }, '11' => { 'name' => 'reserved_at_1e0', 'offset' => '96', 'type' => '2001' }, '2' => { 'name' => 'metadata_reg_c_5', 'offset' => '36', 'type' => '2001' }, '3' => { 'name' => 'metadata_reg_c_4', 'offset' => '40', 'type' => '2001' }, '4' => { 'name' => 'metadata_reg_c_3', 'offset' => '50', 'type' => '2001' }, '5' => { 'name' => 'metadata_reg_c_2', 'offset' => '54', 'type' => '2001' }, '6' => { 'name' => 'metadata_reg_c_1', 'offset' => '64', 'type' => '2001' }, '7' => { 'name' => 'metadata_reg_c_0', 'offset' => '68', 'type' => '2001' }, '8' => { 'name' => 'metadata_reg_a', 'offset' => '72', 'type' => '2001' }, '9' => { 'name' => 'reserved_at_1a0', 'offset' => '82', 'type' => '2001' } }, 'Name' => 'struct dr_match_misc2', 'Size' => '64', 'Type' => 'Struct' }, '286855' => { 'Header' => undef, 'Line' => 
'814', 'Memb' => { '0' => { 'name' => 'inner_tcp_seq_num', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'outer_tcp_seq_num', 'offset' => '4', 'type' => '2001' }, '10' => { 'name' => 'geneve_tlv_option_0_data', 'offset' => '54', 'type' => '2001' }, '11' => { 'name' => 'gtpu_teid', 'offset' => '64', 'type' => '2001' }, '12' => { 'name' => 'gtpu_dw_2', 'offset' => '72', 'type' => '2001' }, '13' => { 'name' => 'gtpu_first_ext_dw_0', 'offset' => '82', 'type' => '2001' }, '14' => { 'name' => 'gtpu_dw_0', 'offset' => '86', 'type' => '2001' }, '15' => { 'name' => 'reserved_at_1e0', 'offset' => '96', 'type' => '2001' }, '2' => { 'name' => 'inner_tcp_ack_num', 'offset' => '8', 'type' => '2001' }, '3' => { 'name' => 'outer_tcp_ack_num', 'offset' => '18', 'type' => '2001' }, '4' => { 'name' => 'icmpv4_header_data', 'offset' => '36', 'type' => '2001' }, '5' => { 'name' => 'icmpv6_header_data', 'offset' => '40', 'type' => '2001' }, '6' => { 'name' => 'icmpv4_type', 'offset' => '50', 'type' => '1977' }, '7' => { 'name' => 'icmpv4_code', 'offset' => '51', 'type' => '1977' }, '8' => { 'name' => 'icmpv6_type', 'offset' => '52', 'type' => '1977' }, '9' => { 'name' => 'icmpv6_code', 'offset' => '53', 'type' => '1977' } }, 'Name' => 'struct dr_match_misc3', 'Size' => '64', 'Type' => 'Struct' }, '287216' => { 'Header' => undef, 'Line' => '841', 'Memb' => { '0' => { 'name' => 'prog_sample_field_value_0', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'prog_sample_field_id_0', 'offset' => '4', 'type' => '2001' }, '10' => { 'name' => 'prog_sample_field_value_5', 'offset' => '64', 'type' => '2001' }, '11' => { 'name' => 'prog_sample_field_id_5', 'offset' => '68', 'type' => '2001' }, '12' => { 'name' => 'prog_sample_field_value_6', 'offset' => '72', 'type' => '2001' }, '13' => { 'name' => 'prog_sample_field_id_6', 'offset' => '82', 'type' => '2001' }, '14' => { 'name' => 'prog_sample_field_value_7', 'offset' => '86', 'type' => '2001' }, '15' => { 'name' => 'prog_sample_field_id_7', 'offset' => '96', 'type' => '2001' }, '2' => { 'name' => 'prog_sample_field_value_1', 'offset' => '8', 'type' => '2001' }, '3' => { 'name' => 'prog_sample_field_id_1', 'offset' => '18', 'type' => '2001' }, '4' => { 'name' => 'prog_sample_field_value_2', 'offset' => '22', 'type' => '2001' }, '5' => { 'name' => 'prog_sample_field_id_2', 'offset' => '32', 'type' => '2001' }, '6' => { 'name' => 'prog_sample_field_value_3', 'offset' => '36', 'type' => '2001' }, '7' => { 'name' => 'prog_sample_field_id_3', 'offset' => '40', 'type' => '2001' }, '8' => { 'name' => 'prog_sample_field_value_4', 'offset' => '50', 'type' => '2001' }, '9' => { 'name' => 'prog_sample_field_id_4', 'offset' => '54', 'type' => '2001' } }, 'Name' => 'struct dr_match_misc4', 'Size' => '64', 'Type' => 'Struct' }, '287454' => { 'Header' => undef, 'Line' => '860', 'Memb' => { '0' => { 'name' => 'macsec_tag_0', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'macsec_tag_1', 'offset' => '4', 'type' => '2001' }, '10' => { 'name' => 'reserved_at_140', 'offset' => '64', 'type' => '2001' }, '11' => { 'name' => 'reserved_at_160', 'offset' => '68', 'type' => '2001' }, '12' => { 'name' => 'reserved_at_180', 'offset' => '72', 'type' => '2001' }, '13' => { 'name' => 'reserved_at_1a0', 'offset' => '82', 'type' => '2001' }, '14' => { 'name' => 'reserved_at_1c0', 'offset' => '86', 'type' => '2001' }, '15' => { 'name' => 'reserved_at_1e0', 'offset' => '96', 'type' => '2001' }, '2' => { 'name' => 'macsec_tag_2', 'offset' => '8', 'type' => '2001' }, '3' => { 'name' 
=> 'macsec_tag_3', 'offset' => '18', 'type' => '2001' }, '4' => { 'name' => 'tunnel_header_0', 'offset' => '22', 'type' => '2001' }, '5' => { 'name' => 'tunnel_header_1', 'offset' => '32', 'type' => '2001' }, '6' => { 'name' => 'tunnel_header_2', 'offset' => '36', 'type' => '2001' }, '7' => { 'name' => 'tunnel_header_3', 'offset' => '40', 'type' => '2001' }, '8' => { 'name' => 'reserved_at_100', 'offset' => '50', 'type' => '2001' }, '9' => { 'name' => 'reserved_at_120', 'offset' => '54', 'type' => '2001' } }, 'Name' => 'struct dr_match_misc5', 'Size' => '64', 'Type' => 'Struct' }, '287692' => { 'Header' => undef, 'Line' => '902', 'Memb' => { '0' => { 'name' => 'vport_gvmi', 'offset' => '0', 'type' => '1989' }, '1' => { 'name' => 'vhca_gvmi', 'offset' => '2', 'type' => '1989' }, '2' => { 'name' => 'icm_address_rx', 'offset' => '8', 'type' => '2023' }, '3' => { 'name' => 'icm_address_tx', 'offset' => '22', 'type' => '2023' }, '4' => { 'name' => 'num', 'offset' => '36', 'type' => '1989' }, '5' => { 'name' => 'metadata_c', 'offset' => '40', 'type' => '2001' }, '6' => { 'name' => 'metadata_c_mask', 'offset' => '50', 'type' => '2001' }, '7' => { 'name' => 'next', 'offset' => '64', 'type' => '287818' } }, 'Name' => 'struct dr_devx_vport_cap', 'Size' => '48', 'Type' => 'Struct' }, '287818' => { 'BaseType' => '287692', 'Name' => 'struct dr_devx_vport_cap*', 'Size' => '8', 'Type' => 'Pointer' }, '287823' => { 'Header' => undef, 'Line' => '914', 'Memb' => { '0' => { 'name' => 'roce_en', 'offset' => '0', 'type' => '2091' }, '1' => { 'name' => 'fl_rc_qp_when_roce_disabled', 'offset' => '1', 'type' => '2091' }, '2' => { 'name' => 'fl_rc_qp_when_roce_enabled', 'offset' => '2', 'type' => '2091' }, '3' => { 'name' => 'qp_ts_format', 'offset' => '3', 'type' => '1977' } }, 'Name' => 'struct dr_devx_roce_cap', 'Size' => '4', 'Type' => 'Struct' }, '287893' => { 'Header' => undef, 'Line' => '921', 'Memb' => { '0' => { 'name' => 'buckets', 'offset' => '0', 'type' => '287922' } }, 'Name' => 'struct dr_vports_table', 'Size' => '2048', 'Type' => 'Struct' }, '287922' => { 'BaseType' => '287818', 'Name' => 'struct dr_devx_vport_cap*[256]', 'Size' => '2048', 'Type' => 'Array' }, '287938' => { 'Header' => undef, 'Line' => '925', 'Memb' => { '0' => { 'name' => 'esw_mngr', 'offset' => '0', 'type' => '287692' }, '1' => { 'name' => 'wire', 'offset' => '72', 'type' => '287692' }, '2' => { 'name' => 'vports', 'offset' => '150', 'type' => '288036' }, '3' => { 'name' => 'ib_ports', 'offset' => '260', 'type' => '288041' }, '4' => { 'name' => 'num_ports', 'offset' => '274', 'type' => '2001' }, '5' => { 'name' => 'lock', 'offset' => '278', 'type' => '994' } }, 'Name' => 'struct dr_devx_vports', 'Size' => '120', 'Type' => 'Struct' }, '288036' => { 'BaseType' => '287893', 'Name' => 'struct dr_vports_table*', 'Size' => '8', 'Type' => 'Pointer' }, '288041' => { 'BaseType' => '287818', 'Name' => 'struct dr_devx_vport_cap**', 'Size' => '8', 'Type' => 'Pointer' }, '288456' => { 'Header' => undef, 'Line' => '1030', 'Memb' => { '0' => { 'name' => 'type', 'offset' => '0', 'type' => '1977' }, '1' => { 'name' => 'level', 'offset' => '1', 'type' => '1977' }, '2' => { 'name' => 'ft_dvo', 'offset' => '8', 'type' => '19309' }, '3' => { 'name' => 'fg_dvo', 'offset' => '22', 'type' => '19309' }, '4' => { 'name' => 'fte_dvo', 'offset' => '36', 'type' => '19309' } }, 'Name' => 'struct dr_devx_tbl', 'Size' => '32', 'Type' => 'Struct' }, '288638' => { 'Header' => undef, 'Line' => '1047', 'Memb' => { '0' => { 'name' => 'DR_DOMAIN_NIC_TYPE_RX', 'value' 
=> '0' }, '1' => { 'name' => 'DR_DOMAIN_NIC_TYPE_TX', 'value' => '1' } }, 'Name' => 'enum dr_domain_nic_type', 'Size' => '4', 'Type' => 'Enum' }, '288667' => { 'Header' => undef, 'Line' => '1052', 'Memb' => { '0' => { 'name' => 'drop_icm_addr', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'default_icm_addr', 'offset' => '8', 'type' => '2023' }, '2' => { 'name' => 'type', 'offset' => '22', 'type' => '288638' }, '3' => { 'name' => 'locks', 'offset' => '32', 'type' => '288753' } }, 'Name' => 'struct dr_domain_rx_tx', 'Size' => '80', 'Type' => 'Struct' }, '288737' => { 'BaseType' => '994', 'Name' => 'pthread_spinlock_t[14]', 'Size' => '56', 'Type' => 'Array' }, '288753' => { 'BaseType' => '288737', 'Name' => 'pthread_spinlock_t[14] volatile', 'Size' => '56', 'Type' => 'Volatile' }, '288758' => { 'Header' => undef, 'Line' => '1060', 'Memb' => { '0' => { 'name' => 'supp_sw_steering', 'offset' => '0', 'type' => '2091' }, '1' => { 'name' => 'max_log_sw_icm_sz', 'offset' => '4', 'type' => '2001' }, '10' => { 'name' => 'caps', 'offset' => '1426', 'type' => '283448' }, '11' => { 'name' => 'use_mqs', 'offset' => '2162', 'type' => '2091' }, '2' => { 'name' => 'max_log_action_icm_sz', 'offset' => '8', 'type' => '2001' }, '3' => { 'name' => 'max_log_modify_hdr_pattern_icm_sz', 'offset' => '18', 'type' => '2001' }, '4' => { 'name' => 'max_log_sw_icm_rehash_sz', 'offset' => '22', 'type' => '2001' }, '5' => { 'name' => 'max_log_sw_encap_icm_sz', 'offset' => '32', 'type' => '2001' }, '6' => { 'name' => 'max_send_size', 'offset' => '36', 'type' => '2001' }, '7' => { 'name' => 'rx', 'offset' => '50', 'type' => '288667' }, '8' => { 'name' => 'tx', 'offset' => '274', 'type' => '288667' }, '9' => { 'name' => 'attr', 'offset' => '402', 'type' => '4058' } }, 'Name' => 'struct dr_domain_info', 'Size' => '880', 'Type' => 'Struct' }, '288941' => { 'Header' => undef, 'Line' => '155', 'Memb' => { '0' => { 'name' => 'build_eth_l2_src_dst_init', 'offset' => '0', 'type' => '293052' }, '1' => { 'name' => 'build_eth_l3_ipv6_src_init', 'offset' => '8', 'type' => '293052' }, '10' => { 'name' => 'build_tnl_gre_init', 'offset' => '128', 'type' => '293052' }, '11' => { 'name' => 'build_tnl_mpls_over_gre_init', 'offset' => '136', 'type' => '293052' }, '12' => { 'name' => 'build_tnl_mpls_over_udp_init', 'offset' => '150', 'type' => '293052' }, '13' => { 'name' => 'build_icmp_init', 'offset' => '260', 'type' => '293052' }, '14' => { 'name' => 'build_general_purpose_init', 'offset' => '274', 'type' => '293052' }, '15' => { 'name' => 'build_eth_l4_misc_init', 'offset' => '288', 'type' => '293052' }, '16' => { 'name' => 'build_tnl_vxlan_gpe_init', 'offset' => '296', 'type' => '293052' }, '17' => { 'name' => 'build_tnl_geneve_init', 'offset' => '310', 'type' => '293052' }, '18' => { 'name' => 'build_tnl_geneve_tlv_opt_init', 'offset' => '324', 'type' => '293052' }, '19' => { 'name' => 'build_tnl_geneve_tlv_opt_exist_init', 'offset' => '338', 'type' => '293052' }, '2' => { 'name' => 'build_eth_l3_ipv6_dst_init', 'offset' => '22', 'type' => '293052' }, '20' => { 'name' => 'build_tnl_gtpu_init', 'offset' => '352', 'type' => '293052' }, '21' => { 'name' => 'build_tnl_gtpu_flex_parser_0', 'offset' => '360', 'type' => '293052' }, '22' => { 'name' => 'build_tnl_gtpu_flex_parser_1', 'offset' => '374', 'type' => '293052' }, '23' => { 'name' => 'build_register_0_init', 'offset' => '388', 'type' => '293052' }, '24' => { 'name' => 'build_register_1_init', 'offset' => '402', 'type' => '293052' }, '25' => { 'name' => 
'build_src_gvmi_qpn_init', 'offset' => '512', 'type' => '293052' }, '26' => { 'name' => 'build_flex_parser_0_init', 'offset' => '520', 'type' => '293052' }, '27' => { 'name' => 'build_flex_parser_1_init', 'offset' => '534', 'type' => '293052' }, '28' => { 'name' => 'build_tunnel_header_init', 'offset' => '548', 'type' => '293052' }, '29' => { 'name' => 'build_ib_l4_init', 'offset' => '562', 'type' => '293052' }, '3' => { 'name' => 'build_eth_l3_ipv4_5_tuple_init', 'offset' => '36', 'type' => '293052' }, '30' => { 'name' => 'build_def0_init', 'offset' => '576', 'type' => '293052' }, '31' => { 'name' => 'build_def2_init', 'offset' => '584', 'type' => '293052' }, '32' => { 'name' => 'build_def6_init', 'offset' => '598', 'type' => '293052' }, '33' => { 'name' => 'build_def16_init', 'offset' => '612', 'type' => '293052' }, '34' => { 'name' => 'build_def22_init', 'offset' => '626', 'type' => '293052' }, '35' => { 'name' => 'build_def24_init', 'offset' => '640', 'type' => '293052' }, '36' => { 'name' => 'build_def25_init', 'offset' => '648', 'type' => '293052' }, '37' => { 'name' => 'build_def26_init', 'offset' => '662', 'type' => '293052' }, '38' => { 'name' => 'build_def28_init', 'offset' => '772', 'type' => '293052' }, '39' => { 'name' => 'build_def33_init', 'offset' => '786', 'type' => '293052' }, '4' => { 'name' => 'build_eth_l2_src_init', 'offset' => '50', 'type' => '293052' }, '40' => { 'name' => 'aso_other_domain_link', 'offset' => '800', 'type' => '293120' }, '41' => { 'name' => 'aso_other_domain_unlink', 'offset' => '808', 'type' => '29192' }, '42' => { 'name' => 'ste_init', 'offset' => '822', 'type' => '293151' }, '43' => { 'name' => 'set_next_lu_type', 'offset' => '836', 'type' => '293172' }, '44' => { 'name' => 'get_next_lu_type', 'offset' => '850', 'type' => '293192' }, '45' => { 'name' => 'set_miss_addr', 'offset' => '864', 'type' => '293213' }, '46' => { 'name' => 'get_miss_addr', 'offset' => '872', 'type' => '293233' }, '47' => { 'name' => 'set_hit_addr', 'offset' => '886', 'type' => '293259' }, '48' => { 'name' => 'set_byte_mask', 'offset' => '900', 'type' => '293172' }, '49' => { 'name' => 'get_byte_mask', 'offset' => '914', 'type' => '293192' }, '5' => { 'name' => 'build_eth_l2_dst_init', 'offset' => '64', 'type' => '293052' }, '50' => { 'name' => 'set_ctrl_always_hit_htbl', 'offset' => '1024', 'type' => '293300' }, '51' => { 'name' => 'set_ctrl_always_miss', 'offset' => '1032', 'type' => '293326' }, '52' => { 'name' => 'set_hit_gvmi', 'offset' => '1046', 'type' => '293172' }, '53' => { 'name' => 'actions_caps', 'offset' => '1060', 'type' => '2001' }, '54' => { 'name' => 'action_modify_field_arr', 'offset' => '1074', 'type' => '293331' }, '55' => { 'name' => 'action_modify_field_arr_size', 'offset' => '1088', 'type' => '419' }, '56' => { 'name' => 'set_actions_rx', 'offset' => '1096', 'type' => '293377' }, '57' => { 'name' => 'set_actions_tx', 'offset' => '1110', 'type' => '293377' }, '58' => { 'name' => 'set_action_set', 'offset' => '1124', 'type' => '293413' }, '59' => { 'name' => 'set_action_add', 'offset' => '1138', 'type' => '293413' }, '6' => { 'name' => 'build_eth_l2_tnl_init', 'offset' => '72', 'type' => '293052' }, '60' => { 'name' => 'set_action_copy', 'offset' => '1152', 'type' => '293454' }, '61' => { 'name' => 'get_action_hw_field', 'offset' => '1160', 'type' => '293484' }, '62' => { 'name' => 'set_action_decap_l3_list', 'offset' => '1174', 'type' => '293524' }, '63' => { 'name' => 'set_aso_ct_cross_dmn', 'offset' => '1284', 'type' => '293560' }, '64' => { 'name' 
=> 'alloc_modify_hdr_chunk', 'offset' => '1298', 'type' => '293585' }, '65' => { 'name' => 'dealloc_modify_hdr_chunk', 'offset' => '1312', 'type' => '293601' }, '66' => { 'name' => 'set_encap', 'offset' => '1320', 'type' => '293632' }, '67' => { 'name' => 'set_push_vlan', 'offset' => '1334', 'type' => '293658' }, '68' => { 'name' => 'set_pop_vlan', 'offset' => '1348', 'type' => '293684' }, '69' => { 'name' => 'set_rx_decap', 'offset' => '1362', 'type' => '293705' }, '7' => { 'name' => 'build_eth_l3_ipv4_misc_init', 'offset' => '86', 'type' => '293052' }, '70' => { 'name' => 'set_encap_l3', 'offset' => '1376', 'type' => '293741' }, '71' => { 'name' => 'prepare_for_postsend', 'offset' => '1384', 'type' => '293762' }, '8' => { 'name' => 'build_eth_ipv6_l3_l4_init', 'offset' => '100', 'type' => '293052' }, '9' => { 'name' => 'build_mpls_init', 'offset' => '114', 'type' => '293052' } }, 'Name' => 'struct dr_ste_ctx', 'Size' => '576', 'Type' => 'Struct' }, '289941' => { 'BaseType' => '288941', 'Name' => 'struct dr_ste_ctx*', 'Size' => '8', 'Type' => 'Pointer' }, '289946' => { 'Header' => undef, 'Line' => '37', 'Memb' => { '0' => { 'name' => 'icm_type', 'offset' => '0', 'type' => '282336' }, '1' => { 'name' => 'dmn', 'offset' => '8', 'type' => '284828' }, '2' => { 'name' => 'max_log_chunk_sz', 'offset' => '22', 'type' => '282156' }, '3' => { 'name' => 'lock', 'offset' => '32', 'type' => '994' }, '4' => { 'name' => 'buddy_mem_list', 'offset' => '36', 'type' => '14403' }, '5' => { 'name' => 'hot_memory_size', 'offset' => '64', 'type' => '2023' }, '6' => { 'name' => 'syncing', 'offset' => '72', 'type' => '2091' }, '7' => { 'name' => 'th', 'offset' => '86', 'type' => '419' } }, 'Name' => 'struct dr_icm_pool', 'Size' => '64', 'Type' => 'Struct' }, '289951' => { 'BaseType' => '289946', 'Name' => 'struct dr_icm_pool*', 'Size' => '8', 'Type' => 'Pointer' }, '289956' => { 'Header' => undef, 'Line' => '15', 'Memb' => { '0' => { 'name' => 'dmn', 'offset' => '0', 'type' => '284828' }, '1' => { 'name' => 'ptrn_icm_pool', 'offset' => '8', 'type' => '289951' }, '2' => { 'name' => 'ptrn_list', 'offset' => '22', 'type' => '14403' }, '3' => { 'name' => 'modify_hdr_mutex', 'offset' => '50', 'type' => '1212654' } }, 'Name' => 'struct dr_ptrn_mngr', 'Size' => '72', 'Type' => 'Struct' }, '289961' => { 'BaseType' => '289956', 'Name' => 'struct dr_ptrn_mngr*', 'Size' => '8', 'Type' => 'Pointer' }, '289966' => { 'Header' => undef, 'Line' => '26', 'Memb' => { '0' => { 'name' => 'dmn', 'offset' => '0', 'type' => '284828' }, '1' => { 'name' => 'pools', 'offset' => '8', 'type' => '1292363' } }, 'Name' => 'struct dr_arg_mngr', 'Size' => '40', 'Type' => 'Struct' }, '289971' => { 'BaseType' => '289966', 'Name' => 'struct dr_arg_mngr*', 'Size' => '8', 'Type' => 'Pointer' }, '289976' => { 'BaseType' => '289992', 'Name' => 'struct dr_send_ring*[14]', 'Size' => '112', 'Type' => 'Array' }, '289992' => { 'BaseType' => '289997', 'Name' => 'struct dr_send_ring*', 'Size' => '8', 'Type' => 'Pointer' }, '289997' => { 'Header' => undef, 'Line' => '1687', 'Memb' => { '0' => { 'name' => 'cq', 'offset' => '0', 'type' => '292908' }, '1' => { 'name' => 'qp', 'offset' => '72', 'type' => '293032' }, '10' => { 'name' => 'sync_buff', 'offset' => '260', 'type' => '308' }, '11' => { 'name' => 'sync_mr', 'offset' => '274', 'type' => '6127' }, '2' => { 'name' => 'mr', 'offset' => '86', 'type' => '6127' }, '3' => { 'name' => 'pending_wqe', 'offset' => '100', 'type' => '2001' }, '4' => { 'name' => 'signal_th', 'offset' => '104', 'type' => '1989' }, '5' 
=> { 'name' => 'max_inline_size', 'offset' => '114', 'type' => '2001' }, '6' => { 'name' => 'tx_head', 'offset' => '118', 'type' => '2001' }, '7' => { 'name' => 'lock', 'offset' => '128', 'type' => '994' }, '8' => { 'name' => 'buf', 'offset' => '136', 'type' => '308' }, '9' => { 'name' => 'buf_size', 'offset' => '150', 'type' => '2001' } }, 'Name' => 'struct dr_send_ring', 'Size' => '120', 'Type' => 'Struct' }, '29010' => { 'Header' => undef, 'Line' => '942', 'Memb' => { '0' => { 'name' => 'parent', 'offset' => '0', 'type' => '21852' }, '1' => { 'name' => 'obj', 'offset' => '8', 'type' => '19309' } }, 'Name' => 'struct mlx5dv_sched_leaf', 'Size' => '16', 'Type' => 'Struct' }, '290176' => { 'Header' => undef, 'Line' => '1160', 'Memb' => { '0' => { 'name' => 's_anchor', 'offset' => '0', 'type' => '282954' }, '1' => { 'name' => 'nic_dmn', 'offset' => '8', 'type' => '290218' } }, 'Name' => 'struct dr_table_rx_tx', 'Size' => '16', 'Type' => 'Struct' }, '290218' => { 'BaseType' => '288667', 'Name' => 'struct dr_domain_rx_tx*', 'Size' => '8', 'Type' => 'Pointer' }, '290223' => { 'Header' => undef, 'Line' => '1177', 'Memb' => { '0' => { 'name' => 's_htbl', 'offset' => '0', 'type' => '282954' }, '1' => { 'name' => 'e_anchor', 'offset' => '8', 'type' => '282954' }, '2' => { 'name' => 'ste_builder', 'offset' => '22', 'type' => '290340' }, '3' => { 'name' => 'num_of_builders', 'offset' => '5654', 'type' => '1977' }, '4' => { 'name' => 'default_icm_addr', 'offset' => '5668', 'type' => '2023' }, '5' => { 'name' => 'nic_tbl', 'offset' => '5682', 'type' => '290356' }, '6' => { 'name' => 'fixed_size', 'offset' => '5696', 'type' => '2091' } }, 'Name' => 'struct dr_matcher_rx_tx', 'Size' => '1648', 'Type' => 'Struct' }, '290340' => { 'BaseType' => '283345', 'Name' => 'struct dr_ste_build[20]', 'Size' => '1600', 'Type' => 'Array' }, '290356' => { 'BaseType' => '290176', 'Name' => 'struct dr_table_rx_tx*', 'Size' => '8', 'Type' => 'Pointer' }, '290361' => { 'Header' => undef, 'Line' => '1187', 'Memb' => { '0' => { 'name' => 'tbl', 'offset' => '0', 'type' => '269706' }, '1' => { 'name' => 'rx', 'offset' => '8', 'type' => '290223' }, '2' => { 'name' => 'tx', 'offset' => '5718', 'type' => '290223' }, '3' => { 'name' => 'matcher_list', 'offset' => '13060', 'type' => '14358' }, '4' => { 'name' => 'prio', 'offset' => '13088', 'type' => '1989' }, '5' => { 'name' => 'mask', 'offset' => '13092', 'type' => '284185' }, '6' => { 'name' => 'match_criteria', 'offset' => '14194', 'type' => '1977' }, '7' => { 'name' => 'refcount', 'offset' => '14198', 'type' => '2103' }, '8' => { 'name' => 'dv_matcher', 'offset' => '14212', 'type' => '30577' }, '9' => { 'name' => 'rule_list', 'offset' => '14226', 'type' => '14403' } }, 'Name' => 'struct mlx5dv_dr_matcher', 'Size' => '3808', 'Type' => 'Struct' }, '29052' => { 'BaseType' => '29010', 'Name' => 'struct mlx5dv_sched_leaf const', 'Size' => '16', 'Type' => 'Const' }, '290522' => { 'Header' => undef, 'Line' => '1200', 'Memb' => { '0' => { 'name' => 'hw_field', 'offset' => '0', 'type' => '1989' }, '1' => { 'name' => 'start', 'offset' => '2', 'type' => '1977' }, '2' => { 'name' => 'end', 'offset' => '3', 'type' => '1977' }, '3' => { 'name' => 'l3_type', 'offset' => '4', 'type' => '1977' }, '4' => { 'name' => 'l4_type', 'offset' => '5', 'type' => '1977' }, '5' => { 'name' => 'flags', 'offset' => '8', 'type' => '2001' } }, 'Name' => 'struct dr_ste_action_modify_field', 'Size' => '12', 'Type' => 'Struct' }, '290620' => { 'BaseType' => '290522', 'Name' => 'struct dr_ste_action_modify_field 
const', 'Size' => '12', 'Type' => 'Const' }, '290625' => { 'Header' => undef, 'Line' => '1209', 'Memb' => { '0' => { 'name' => 'ref_actions_num', 'offset' => '0', 'type' => '1989' }, '1' => { 'name' => 'ref_actions', 'offset' => '8', 'type' => '269843' }, '2' => { 'name' => 'devx_tbl', 'offset' => '22', 'type' => '290681' } }, 'Name' => 'struct dr_devx_tbl_with_refs', 'Size' => '24', 'Type' => 'Struct' }, '290681' => { 'BaseType' => '288456', 'Name' => 'struct dr_devx_tbl*', 'Size' => '8', 'Type' => 'Pointer' }, '290686' => { 'Header' => undef, 'Line' => '1215', 'Memb' => { '0' => { 'name' => 'devx_obj', 'offset' => '0', 'type' => '19309' }, '1' => { 'name' => 'rx_icm_addr', 'offset' => '8', 'type' => '2023' }, '2' => { 'name' => 'tx_icm_addr', 'offset' => '22', 'type' => '2023' }, '3' => { 'name' => 'next_ft', 'offset' => '36', 'type' => '269706' } }, 'Name' => 'struct dr_flow_sampler', 'Size' => '32', 'Type' => 'Struct' }, '290756' => { 'Header' => undef, 'Line' => '1222', 'Memb' => { '0' => { 'name' => 'tbl', 'offset' => '0', 'type' => '269706' }, '1' => { 'name' => 'matcher', 'offset' => '8', 'type' => '290840' }, '2' => { 'name' => 'rule', 'offset' => '22', 'type' => '290921' }, '3' => { 'name' => 'actions', 'offset' => '36', 'type' => '269843' }, '4' => { 'name' => 'num_of_actions', 'offset' => '50', 'type' => '1989' } }, 'Name' => 'struct dr_flow_sampler_restore_tbl', 'Size' => '40', 'Type' => 'Struct' }, '290840' => { 'BaseType' => '290361', 'Name' => 'struct mlx5dv_dr_matcher*', 'Size' => '8', 'Type' => 'Pointer' }, '290845' => { 'Header' => undef, 'Line' => '1369', 'Memb' => { '0' => { 'name' => 'matcher', 'offset' => '0', 'type' => '290840' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '292329' }, '2' => { 'name' => 'rule_list', 'offset' => '86', 'type' => '14358' }, '3' => { 'name' => 'actions', 'offset' => '114', 'type' => '269843' }, '4' => { 'name' => 'num_actions', 'offset' => '128', 'type' => '1989' } }, 'Name' => 'struct mlx5dv_dr_rule', 'Size' => '88', 'Type' => 'Struct' }, '290921' => { 'BaseType' => '290845', 'Name' => 'struct mlx5dv_dr_rule*', 'Size' => '8', 'Type' => 'Pointer' }, '290926' => { 'Header' => undef, 'Line' => '1230', 'Memb' => { '0' => { 'name' => 'chunk', 'offset' => '0', 'type' => '283217' }, '1' => { 'name' => 'data', 'offset' => '8', 'type' => '7308' }, '2' => { 'name' => 'data_size', 'offset' => '22', 'type' => '2001' }, '3' => { 'name' => 'num_of_actions', 'offset' => '32', 'type' => '1989' }, '4' => { 'name' => 'index', 'offset' => '36', 'type' => '2001' } }, 'Name' => 'struct dr_rewrite_param', 'Size' => '32', 'Type' => 'Struct' }, '291010' => { 'Header' => undef, 'Line' => '1238', 'Memb' => { '0' => { 'name' => 'DR_PTRN_TYP_MODIFY_HDR', 'value' => '9' }, '1' => { 'name' => 'DR_PTRN_TYP_TNL_L3_TO_L2', 'value' => '2' } }, 'Name' => 'enum dr_ptrn_type', 'Size' => '4', 'Type' => 'Enum' }, '291039' => { 'Header' => undef, 'Line' => '1243', 'Memb' => { '0' => { 'name' => 'rewrite_param', 'offset' => '0', 'type' => '290926' }, '1' => { 'name' => 'refcount', 'offset' => '50', 'type' => '2103' }, '2' => { 'name' => 'list', 'offset' => '64', 'type' => '14358' }, '3' => { 'name' => 'type', 'offset' => '86', 'type' => '291010' } }, 'Name' => 'struct dr_ptrn_obj', 'Size' => '64', 'Type' => 'Struct' }, '291109' => { 'Header' => undef, 'Line' => '1250', 'Memb' => { '0' => { 'name' => 'obj', 'offset' => '0', 'type' => '19309' }, '1' => { 'name' => 'obj_offset', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'list_node', 'offset' => '22', 
'type' => '14358' }, '3' => { 'name' => 'log_chunk_size', 'offset' => '50', 'type' => '2001' } }, 'Name' => 'struct dr_arg_obj', 'Size' => '40', 'Type' => 'Struct' }, '291179' => { 'Header' => undef, 'Line' => '1272', 'Memb' => { '0' => { 'name' => 'ptrn', 'offset' => '0', 'type' => '291218' }, '1' => { 'name' => 'arg', 'offset' => '8', 'type' => '291223' } }, 'Size' => '16', 'Type' => 'Struct' }, '291218' => { 'BaseType' => '291039', 'Name' => 'struct dr_ptrn_obj*', 'Size' => '8', 'Type' => 'Pointer' }, '291223' => { 'BaseType' => '291109', 'Name' => 'struct dr_arg_obj*', 'Size' => '8', 'Type' => 'Pointer' }, '291228' => { 'Header' => undef, 'Line' => '1267', 'Memb' => { '0' => { 'name' => 'param', 'offset' => '0', 'type' => '290926' }, '1' => { 'name' => 'ptrn_arg', 'offset' => '64', 'type' => '291179' } }, 'Size' => '56', 'Type' => 'Struct' }, '291315' => { 'Header' => undef, 'Line' => '1265', 'Memb' => { '0' => { 'name' => 'flow_action', 'offset' => '0', 'type' => '13658' }, '1' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '291228' } }, 'Size' => '56', 'Type' => 'Union' }, '291344' => { 'Header' => undef, 'Line' => '1261', 'Memb' => { '0' => { 'name' => 'dmn', 'offset' => '0', 'type' => '284828' }, '1' => { 'name' => 'is_root_level', 'offset' => '8', 'type' => '2091' }, '2' => { 'name' => 'args_send_qp', 'offset' => '18', 'type' => '2001' }, '3' => { 'name' => 'unnamed0', 'offset' => '22', 'type' => '291315' } }, 'Size' => '72', 'Type' => 'Struct' }, '291403' => { 'Header' => undef, 'Line' => '1284', 'Memb' => { '0' => { 'name' => 'dvo', 'offset' => '0', 'type' => '19309' }, '1' => { 'name' => 'data', 'offset' => '8', 'type' => '7308' }, '2' => { 'name' => 'index', 'offset' => '22', 'type' => '2001' }, '3' => { 'name' => 'chunk', 'offset' => '36', 'type' => '283217' }, '4' => { 'name' => 'reformat_size', 'offset' => '50', 'type' => '2001' } }, 'Size' => '40', 'Type' => 'Struct' }, '291484' => { 'Header' => undef, 'Line' => '1282', 'Memb' => { '0' => { 'name' => 'flow_action', 'offset' => '0', 'type' => '13658' }, '1' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '291403' } }, 'Size' => '40', 'Type' => 'Union' }, '291513' => { 'Header' => undef, 'Line' => '1279', 'Memb' => { '0' => { 'name' => 'dmn', 'offset' => '0', 'type' => '284828' }, '1' => { 'name' => 'is_root_level', 'offset' => '8', 'type' => '2091' }, '2' => { 'name' => 'unnamed0', 'offset' => '22', 'type' => '291484' } }, 'Size' => '56', 'Type' => 'Struct' }, '291558' => { 'Header' => undef, 'Line' => '1293', 'Memb' => { '0' => { 'name' => 'next_ft', 'offset' => '0', 'type' => '269706' }, '1' => { 'name' => 'devx_obj', 'offset' => '8', 'type' => '19309' }, '2' => { 'name' => 'rx_icm_addr', 'offset' => '22', 'type' => '2023' }, '3' => { 'name' => 'tx_icm_addr', 'offset' => '36', 'type' => '2023' } }, 'Size' => '32', 'Type' => 'Struct' }, '291625' => { 'Header' => undef, 'Line' => '1299', 'Memb' => { '0' => { 'name' => 'dmn', 'offset' => '0', 'type' => '284828' }, '1' => { 'name' => 'term_tbl', 'offset' => '8', 'type' => '291706' }, '2' => { 'name' => 'sampler_default', 'offset' => '22', 'type' => '291711' }, '3' => { 'name' => 'restore_tbl', 'offset' => '36', 'type' => '291716' }, '4' => { 'name' => 'sampler_restore', 'offset' => '50', 'type' => '291711' } }, 'Size' => '40', 'Type' => 'Struct' }, '291706' => { 'BaseType' => '290625', 'Name' => 'struct dr_devx_tbl_with_refs*', 'Size' => '8', 'Type' => 'Pointer' }, '291711' => { 'BaseType' => '290686', 'Name' => 'struct dr_flow_sampler*', 'Size' => '8', 'Type' => 
'Pointer' }, '291716' => { 'BaseType' => '290756', 'Name' => 'struct dr_flow_sampler_restore_tbl*', 'Size' => '8', 'Type' => 'Pointer' }, '291721' => { 'Header' => undef, 'Line' => '1307', 'Memb' => { '0' => { 'name' => 'dmn', 'offset' => '0', 'type' => '284828' }, '1' => { 'name' => 'actions_list', 'offset' => '8', 'type' => '14403' }, '2' => { 'name' => 'devx_tbl', 'offset' => '36', 'type' => '290681' }, '3' => { 'name' => 'rx_icm_addr', 'offset' => '50', 'type' => '2023' }, '4' => { 'name' => 'tx_icm_addr', 'offset' => '64', 'type' => '2023' } }, 'Size' => '48', 'Type' => 'Struct' }, '291802' => { 'Header' => undef, 'Line' => '1314', 'Memb' => { '0' => { 'name' => 'devx_obj', 'offset' => '0', 'type' => '19309' }, '1' => { 'name' => 'offset', 'offset' => '8', 'type' => '2001' } }, 'Size' => '16', 'Type' => 'Struct' }, '291841' => { 'Header' => undef, 'Line' => '1318', 'Memb' => { '0' => { 'name' => 'dmn', 'offset' => '0', 'type' => '284828' }, '1' => { 'name' => 'caps', 'offset' => '8', 'type' => '287818' } }, 'Size' => '16', 'Type' => 'Struct' }, '291880' => { 'Header' => undef, 'Line' => '1322', 'Memb' => { '0' => { 'name' => 'vlan_hdr', 'offset' => '0', 'type' => '2001' } }, 'Size' => '4', 'Type' => 'Struct' }, '291905' => { 'Header' => undef, 'Line' => '1327', 'Memb' => { '0' => { 'name' => 'devx_tir', 'offset' => '0', 'type' => '19309' }, '1' => { 'name' => 'qp', 'offset' => '0', 'type' => '5101' } }, 'Size' => '8', 'Type' => 'Union' }, '29192' => { 'Name' => 'int(*)(struct mlx5dv_devx_obj*)', 'Param' => { '0' => { 'type' => '19309' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '291941' => { 'Header' => undef, 'Line' => '1325', 'Memb' => { '0' => { 'name' => 'is_qp', 'offset' => '0', 'type' => '2091' }, '1' => { 'name' => 'unnamed0', 'offset' => '8', 'type' => '291905' } }, 'Size' => '16', 'Type' => 'Struct' }, '291972' => { 'Header' => undef, 'Line' => '1332', 'Memb' => { '0' => { 'name' => 'tbl', 'offset' => '0', 'type' => '269706' }, '1' => { 'name' => 'devx_tbl', 'offset' => '8', 'type' => '290681' }, '2' => { 'name' => 'sa', 'offset' => '22', 'type' => '30682' }, '3' => { 'name' => 'rx_icm_addr', 'offset' => '36', 'type' => '2023' }, '4' => { 'name' => 'tx_icm_addr', 'offset' => '50', 'type' => '2023' } }, 'Size' => '40', 'Type' => 'Struct' }, '292052' => { 'Header' => undef, 'Line' => '1260', 'Memb' => { '0' => { 'name' => 'rewrite', 'offset' => '0', 'type' => '291344' }, '1' => { 'name' => 'reformat', 'offset' => '0', 'type' => '291513' }, '10' => { 'name' => 'root_tbl', 'offset' => '0', 'type' => '291972' }, '11' => { 'name' => 'aso', 'offset' => '0', 'type' => '284482' }, '12' => { 'name' => 'devx_obj', 'offset' => '0', 'type' => '19309' }, '13' => { 'name' => 'flow_tag', 'offset' => '0', 'type' => '2001' }, '2' => { 'name' => 'meter', 'offset' => '0', 'type' => '291558' }, '3' => { 'name' => 'sampler', 'offset' => '0', 'type' => '291625' }, '4' => { 'name' => 'dest_tbl', 'offset' => '0', 'type' => '269706' }, '5' => { 'name' => 'dest_array', 'offset' => '0', 'type' => '291721' }, '6' => { 'name' => 'ctr', 'offset' => '0', 'type' => '291802' }, '7' => { 'name' => 'vport', 'offset' => '0', 'type' => '291841' }, '8' => { 'name' => 'push_vlan', 'offset' => '0', 'type' => '291880' }, '9' => { 'name' => 'dest_qp', 'offset' => '0', 'type' => '291941' } }, 'Size' => '72', 'Type' => 'Union' }, '292287' => { 'BaseType' => '290223', 'Name' => 'struct dr_matcher_rx_tx*', 'Size' => '8', 'Type' => 'Pointer' }, '292292' => { 'Header' => undef, 'Line' => '1372', 'Memb' => 
{ '0' => { 'name' => 'rx', 'offset' => '0', 'type' => '282959' }, '1' => { 'name' => 'tx', 'offset' => '36', 'type' => '282959' } }, 'Size' => '48', 'Type' => 'Struct' }, '292329' => { 'Header' => undef, 'Line' => '1371', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '292292' }, '1' => { 'name' => 'flow', 'offset' => '0', 'type' => '13488' } }, 'Size' => '48', 'Type' => 'Union' }, '292358' => { 'Header' => undef, 'Line' => '1734', 'Memb' => { '0' => { 'name' => 'bits', 'offset' => '0', 'type' => '293037' }, '1' => { 'name' => 'num_free', 'offset' => '8', 'type' => '28020' }, '10' => { 'name' => 'ste_arr', 'offset' => '260', 'type' => '283222' }, '11' => { 'name' => 'miss_list', 'offset' => '274', 'type' => '39716' }, '12' => { 'name' => 'hw_ste_arr', 'offset' => '288', 'type' => '7308' }, '13' => { 'name' => 'hw_ste_sz', 'offset' => '296', 'type' => '1977' }, '2' => { 'name' => 'set_bit', 'offset' => '22', 'type' => '293037' }, '3' => { 'name' => 'max_order', 'offset' => '36', 'type' => '2001' }, '4' => { 'name' => 'list_node', 'offset' => '50', 'type' => '14358' }, '5' => { 'name' => 'icm_mr', 'offset' => '72', 'type' => '293047' }, '6' => { 'name' => 'pool', 'offset' => '86', 'type' => '289951' }, '7' => { 'name' => 'used_list', 'offset' => '100', 'type' => '14403' }, '8' => { 'name' => 'used_memory', 'offset' => '128', 'type' => '419' }, '9' => { 'name' => 'hot_list', 'offset' => '136', 'type' => '14403' } }, 'Name' => 'struct dr_icm_buddy_mem', 'Size' => '136', 'Type' => 'Struct' }, '292568' => { 'BaseType' => '292358', 'Name' => 'struct dr_icm_buddy_mem*', 'Size' => '8', 'Type' => 'Pointer' }, '292573' => { 'Header' => undef, 'Line' => '1646', 'Memb' => { '0' => { 'name' => 'wqe_head', 'offset' => '0', 'type' => '28020' }, '1' => { 'name' => 'wqe_cnt', 'offset' => '8', 'type' => '70' }, '2' => { 'name' => 'max_post', 'offset' => '18', 'type' => '70' }, '3' => { 'name' => 'head', 'offset' => '22', 'type' => '70' }, '4' => { 'name' => 'tail', 'offset' => '32', 'type' => '70' }, '5' => { 'name' => 'cur_post', 'offset' => '36', 'type' => '70' }, '6' => { 'name' => 'max_gs', 'offset' => '40', 'type' => '159' }, '7' => { 'name' => 'wqe_shift', 'offset' => '50', 'type' => '159' }, '8' => { 'name' => 'offset', 'offset' => '54', 'type' => '159' }, '9' => { 'name' => 'qend', 'offset' => '64', 'type' => '308' } }, 'Name' => 'struct dr_wq', 'Size' => '48', 'Type' => 'Struct' }, '292727' => { 'Header' => undef, 'Line' => '1659', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '27227' }, '1' => { 'name' => 'sq', 'offset' => '100', 'type' => '292573' }, '10' => { 'name' => 'db_umem', 'offset' => '534', 'type' => '29785' }, '2' => { 'name' => 'rq', 'offset' => '274', 'type' => '292573' }, '3' => { 'name' => 'sq_size', 'offset' => '352', 'type' => '159' }, '4' => { 'name' => 'sq_start', 'offset' => '360', 'type' => '308' }, '5' => { 'name' => 'max_inline_data', 'offset' => '374', 'type' => '159' }, '6' => { 'name' => 'db', 'offset' => '388', 'type' => '19658' }, '7' => { 'name' => 'obj', 'offset' => '402', 'type' => '19309' }, '8' => { 'name' => 'uar', 'offset' => '512', 'type' => '29729' }, '9' => { 'name' => 'buf_umem', 'offset' => '520', 'type' => '29785' } }, 'Name' => 'struct dr_qp', 'Size' => '232', 'Type' => 'Struct' }, '292908' => { 'Header' => undef, 'Line' => '1674', 'Memb' => { '0' => { 'name' => 'buf', 'offset' => '0', 'type' => '7308' }, '1' => { 'name' => 'cons_index', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'ncqe', 'offset' => 
'18', 'type' => '159' }, '3' => { 'name' => 'qp', 'offset' => '22', 'type' => '293032' }, '4' => { 'name' => 'db', 'offset' => '36', 'type' => '19658' }, '5' => { 'name' => 'ibv_cq', 'offset' => '50', 'type' => '4901' }, '6' => { 'name' => 'cqn', 'offset' => '64', 'type' => '2001' }, '7' => { 'name' => 'cqe_sz', 'offset' => '68', 'type' => '2001' } }, 'Name' => 'struct dr_cq', 'Size' => '48', 'Type' => 'Struct' }, '293032' => { 'BaseType' => '292727', 'Name' => 'struct dr_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '293037' => { 'BaseType' => '22488', 'Name' => 'unsigned long**', 'Size' => '8', 'Type' => 'Pointer' }, '293042' => { 'Header' => undef, 'Line' => '49', 'Memb' => { '0' => { 'name' => 'mr', 'offset' => '0', 'type' => '6127' }, '1' => { 'name' => 'dm', 'offset' => '8', 'type' => '2979' }, '2' => { 'name' => 'icm_start_addr', 'offset' => '22', 'type' => '2023' } }, 'Name' => 'struct dr_icm_mr', 'Size' => '24', 'Type' => 'Struct' }, '293047' => { 'BaseType' => '293042', 'Name' => 'struct dr_icm_mr*', 'Size' => '8', 'Type' => 'Pointer' }, '293052' => { 'BaseType' => '293064', 'Header' => undef, 'Line' => '152', 'Name' => 'dr_ste_builder_void_init', 'Size' => '8', 'Type' => 'Typedef' }, '293064' => { 'Name' => 'void(*)(struct dr_ste_build*, struct dr_match_param*)', 'Param' => { '0' => { 'type' => '284301' }, '1' => { 'type' => '284180' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293120' => { 'Name' => 'int(*)(struct mlx5dv_devx_obj*, struct mlx5dv_dr_domain*, struct mlx5dv_dr_domain*, uint32_t, uint8_t)', 'Param' => { '0' => { 'type' => '19309' }, '1' => { 'type' => '284828' }, '2' => { 'type' => '284828' }, '3' => { 'type' => '2001' }, '4' => { 'type' => '1977' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '293151' => { 'Name' => 'void(*)(uint8_t*, uint16_t, _Bool, uint16_t)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '1989' }, '2' => { 'type' => '2091' }, '3' => { 'type' => '1989' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293172' => { 'Name' => 'void(*)(uint8_t*, uint16_t)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '1989' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293192' => { 'Name' => 'uint16_t(*)(uint8_t*)', 'Param' => { '0' => { 'type' => '7308' } }, 'Return' => '1989', 'Size' => '8', 'Type' => 'FuncPtr' }, '293213' => { 'Name' => 'void(*)(uint8_t*, uint64_t)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '2023' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293233' => { 'Name' => 'uint64_t(*)(uint8_t*)', 'Param' => { '0' => { 'type' => '7308' } }, 'Return' => '2023', 'Size' => '8', 'Type' => 'FuncPtr' }, '293259' => { 'Name' => 'void(*)(uint8_t*, uint64_t, uint32_t)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '2023' }, '2' => { 'type' => '2001' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293300' => { 'Name' => 'void(*)(uint8_t*, uint16_t, uint16_t, uint64_t, uint32_t, uint16_t)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '1989' }, '2' => { 'type' => '1989' }, '3' => { 'type' => '2023' }, '4' => { 'type' => '2001' }, '5' => { 'type' => '1989' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293326' => { 'Name' => 'void(*)(uint8_t*, uint64_t, uint16_t)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '2023' }, '2' => { 'type' => '1989' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293331' => { 'BaseType' => '290620', 'Name' => 'struct 
dr_ste_action_modify_field const*', 'Size' => '8', 'Type' => 'Pointer' }, '293372' => { 'BaseType' => '284902', 'Name' => 'struct dr_ste_actions_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '293377' => { 'Name' => 'void(*)(struct dr_ste_ctx*, uint8_t*, uint32_t, uint8_t*, struct dr_ste_actions_attr*, uint32_t*)', 'Param' => { '0' => { 'type' => '289941' }, '1' => { 'type' => '7308' }, '2' => { 'type' => '2001' }, '3' => { 'type' => '7308' }, '4' => { 'type' => '293372' }, '5' => { 'type' => '14268' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293413' => { 'Name' => 'void(*)(uint8_t*, uint8_t, uint8_t, uint8_t, uint32_t)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '1977' }, '2' => { 'type' => '1977' }, '3' => { 'type' => '1977' }, '4' => { 'type' => '2001' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293454' => { 'Name' => 'void(*)(uint8_t*, uint8_t, uint8_t, uint8_t, uint8_t, uint8_t)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '1977' }, '2' => { 'type' => '1977' }, '3' => { 'type' => '1977' }, '4' => { 'type' => '1977' }, '5' => { 'type' => '1977' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293484' => { 'Name' => 'struct dr_ste_action_modify_field const*(*)(struct dr_ste_ctx*, uint16_t, struct dr_devx_caps*)', 'Param' => { '0' => { 'type' => '289941' }, '1' => { 'type' => '1989' }, '2' => { 'type' => '284150' } }, 'Return' => '293331', 'Size' => '8', 'Type' => 'FuncPtr' }, '293524' => { 'Name' => 'int(*)(void*, uint32_t, uint8_t*, uint32_t, uint16_t*)', 'Param' => { '0' => { 'type' => '308' }, '1' => { 'type' => '2001' }, '2' => { 'type' => '7308' }, '3' => { 'type' => '2001' }, '4' => { 'type' => '29549' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '293560' => { 'Name' => 'void(*)(uint8_t*, uint32_t, uint32_t, uint8_t, _Bool)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '2001' }, '2' => { 'type' => '2001' }, '3' => { 'type' => '1977' }, '4' => { 'type' => '2091' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293585' => { 'Name' => 'int(*)(struct mlx5dv_dr_action*, uint32_t)', 'Param' => { '0' => { 'type' => '269848' }, '1' => { 'type' => '2001' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '293601' => { 'Name' => 'void(*)(struct mlx5dv_dr_action*)', 'Param' => { '0' => { 'type' => '269848' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293632' => { 'Name' => 'void(*)(uint8_t*, uint8_t*, uint32_t, int)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '7308' }, '2' => { 'type' => '2001' }, '3' => { 'type' => '159' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293658' => { 'Name' => 'void(*)(uint8_t*, uint8_t*, uint32_t)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '7308' }, '2' => { 'type' => '2001' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293684' => { 'Name' => 'void(*)(uint8_t*, uint8_t*, uint8_t)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '7308' }, '2' => { 'type' => '1977' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293705' => { 'Name' => 'void(*)(uint8_t*, uint8_t*)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '7308' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293741' => { 'Name' => 'void(*)(uint8_t*, uint8_t*, uint8_t*, uint32_t, int)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '7308' }, '2' => { 'type' => '7308' }, '3' => { 'type' => '2001' }, '4' => { 'type' => 
'159' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '293762' => { 'Name' => 'void(*)(uint8_t*, uint32_t)', 'Param' => { '0' => { 'type' => '7308' }, '1' => { 'type' => '2001' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '2944' => { 'BaseType' => '2826', 'Name' => 'struct ibv_context*', 'Size' => '8', 'Type' => 'Pointer' }, '29442' => { 'BaseType' => '21639', 'Name' => 'struct mlx5dv_devx_cmd_comp*', 'Size' => '8', 'Type' => 'Pointer' }, '294450' => { 'BaseType' => '269484', 'Name' => 'struct mlx5dv_dr_flow_meter_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '294537' => { 'BaseType' => '2215', 'Name' => '__be64*', 'Size' => '8', 'Type' => 'Pointer' }, '29488' => { 'BaseType' => '21666', 'Name' => 'struct mlx5dv_devx_event_channel*', 'Size' => '8', 'Type' => 'Pointer' }, '29549' => { 'BaseType' => '1989', 'Name' => 'uint16_t*', 'Size' => '8', 'Type' => 'Pointer' }, '29664' => { 'BaseType' => '15047', 'Name' => 'struct mlx5_ib_uapi_devx_async_cmd_hdr*', 'Size' => '8', 'Type' => 'Pointer' }, '29699' => { 'BaseType' => '15172', 'Name' => 'struct mlx5_ib_uapi_devx_async_event_hdr*', 'Size' => '8', 'Type' => 'Pointer' }, '29729' => { 'BaseType' => '21485', 'Name' => 'struct mlx5dv_devx_uar*', 'Size' => '8', 'Type' => 'Pointer' }, '297349' => { 'BaseType' => '297354', 'Name' => 'struct mlx5dv_dr_action_dest_attr**', 'Size' => '8', 'Type' => 'Pointer' }, '297354' => { 'BaseType' => '269989', 'Name' => 'struct mlx5dv_dr_action_dest_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '29785' => { 'BaseType' => '21359', 'Name' => 'struct mlx5dv_devx_umem*', 'Size' => '8', 'Type' => 'Pointer' }, '2979' => { 'BaseType' => '2747', 'Name' => 'struct ibv_dm*', 'Size' => '8', 'Type' => 'Pointer' }, '29815' => { 'BaseType' => '21387', 'Name' => 'struct mlx5dv_devx_umem_in*', 'Size' => '8', 'Type' => 'Pointer' }, '2984' => { 'Name' => 'int(*)(struct ibv_dm*, uint64_t, void const*, size_t)', 'Param' => { '0' => { 'type' => '2979' }, '1' => { 'type' => '2023' }, '2' => { 'type' => '1961' }, '3' => { 'type' => '419' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '29860' => { 'BaseType' => '16608', 'Name' => 'struct mlx5dv_mkey_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '29910' => { 'BaseType' => '18217', 'Name' => 'struct mlx5dv_crypto_login_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '29940' => { 'BaseType' => '18371', 'Name' => 'enum mlx5dv_crypto_login_state*', 'Size' => '8', 'Type' => 'Pointer' }, '29990' => { 'BaseType' => '18287', 'Name' => 'struct mlx5dv_crypto_login_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '300076' => { 'BaseType' => '269711', 'Name' => 'struct mlx5dv_dr_flow_sampler_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '30020' => { 'BaseType' => '18406', 'Name' => 'struct mlx5dv_crypto_login_query_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '30070' => { 'BaseType' => '18529', 'Name' => 'struct mlx5dv_dek_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '30100' => { 'BaseType' => '18703', 'Name' => 'struct mlx5dv_dek_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '30150' => { 'BaseType' => '21569', 'Name' => 'struct mlx5dv_var*', 'Size' => '8', 'Type' => 'Pointer' }, '3019' => { 'Name' => 'int(*)(void*, struct ibv_dm*, uint64_t, size_t)', 'Param' => { '0' => { 'type' => '308' }, '1' => { 'type' => '2979' }, '2' => { 'type' => '2023' }, '3' => { 'type' => '419' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '30206' => { 'BaseType' => '21693', 'Name' => 'struct mlx5dv_pp*', 'Size' => '8', 'Type' => 'Pointer' }, '3024' => { 
'Header' => undef, 'Line' => '182', 'Memb' => { '0' => { 'name' => 'fw_ver', 'offset' => '0', 'type' => '3558' }, '1' => { 'name' => 'node_guid', 'offset' => '100', 'type' => '2215' }, '10' => { 'name' => 'device_cap_flags', 'offset' => '278', 'type' => '70' }, '11' => { 'name' => 'max_sge', 'offset' => '288', 'type' => '159' }, '12' => { 'name' => 'max_sge_rd', 'offset' => '292', 'type' => '159' }, '13' => { 'name' => 'max_cq', 'offset' => '296', 'type' => '159' }, '14' => { 'name' => 'max_cqe', 'offset' => '306', 'type' => '159' }, '15' => { 'name' => 'max_mr', 'offset' => '310', 'type' => '159' }, '16' => { 'name' => 'max_pd', 'offset' => '320', 'type' => '159' }, '17' => { 'name' => 'max_qp_rd_atom', 'offset' => '324', 'type' => '159' }, '18' => { 'name' => 'max_ee_rd_atom', 'offset' => '328', 'type' => '159' }, '19' => { 'name' => 'max_res_rd_atom', 'offset' => '338', 'type' => '159' }, '2' => { 'name' => 'sys_image_guid', 'offset' => '114', 'type' => '2215' }, '20' => { 'name' => 'max_qp_init_rd_atom', 'offset' => '342', 'type' => '159' }, '21' => { 'name' => 'max_ee_init_rd_atom', 'offset' => '352', 'type' => '159' }, '22' => { 'name' => 'atomic_cap', 'offset' => '356', 'type' => '2658' }, '23' => { 'name' => 'max_ee', 'offset' => '360', 'type' => '159' }, '24' => { 'name' => 'max_rdd', 'offset' => '370', 'type' => '159' }, '25' => { 'name' => 'max_mw', 'offset' => '374', 'type' => '159' }, '26' => { 'name' => 'max_raw_ipv6_qp', 'offset' => '384', 'type' => '159' }, '27' => { 'name' => 'max_raw_ethy_qp', 'offset' => '388', 'type' => '159' }, '28' => { 'name' => 'max_mcast_grp', 'offset' => '392', 'type' => '159' }, '29' => { 'name' => 'max_mcast_qp_attach', 'offset' => '402', 'type' => '159' }, '3' => { 'name' => 'max_mr_size', 'offset' => '128', 'type' => '2023' }, '30' => { 'name' => 'max_total_mcast_qp_attach', 'offset' => '406', 'type' => '159' }, '31' => { 'name' => 'max_ah', 'offset' => '512', 'type' => '159' }, '32' => { 'name' => 'max_fmr', 'offset' => '516', 'type' => '159' }, '33' => { 'name' => 'max_map_per_fmr', 'offset' => '520', 'type' => '159' }, '34' => { 'name' => 'max_srq', 'offset' => '530', 'type' => '159' }, '35' => { 'name' => 'max_srq_wr', 'offset' => '534', 'type' => '159' }, '36' => { 'name' => 'max_srq_sge', 'offset' => '544', 'type' => '159' }, '37' => { 'name' => 'max_pkeys', 'offset' => '548', 'type' => '1989' }, '38' => { 'name' => 'local_ca_ack_delay', 'offset' => '550', 'type' => '1977' }, '39' => { 'name' => 'phys_port_cnt', 'offset' => '551', 'type' => '1977' }, '4' => { 'name' => 'page_size_cap', 'offset' => '136', 'type' => '2023' }, '5' => { 'name' => 'vendor_id', 'offset' => '150', 'type' => '2001' }, '6' => { 'name' => 'vendor_part_id', 'offset' => '256', 'type' => '2001' }, '7' => { 'name' => 'hw_ver', 'offset' => '260', 'type' => '2001' }, '8' => { 'name' => 'max_qp', 'offset' => '264', 'type' => '159' }, '9' => { 'name' => 'max_qp_wr', 'offset' => '274', 'type' => '159' } }, 'Name' => 'struct ibv_device_attr', 'Size' => '232', 'Type' => 'Struct' }, '30252' => { 'BaseType' => '20711', 'Name' => 'struct mlx5dv_obj*', 'Size' => '8', 'Type' => 'Pointer' }, '30287' => { 'BaseType' => '16538', 'Name' => 'struct mlx5dv_cq_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '30322' => { 'BaseType' => '16846', 'Name' => 'struct mlx5dv_qp_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '30377' => { 'BaseType' => '20888', 'Name' => 'struct mlx5dv_wq_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '30412' => { 'BaseType' => '19971', 'Name' => 
'struct mlx5dv_alloc_dm_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '30472' => { 'BaseType' => '18759', 'Name' => 'struct mlx5dv_flow_action_esp*', 'Size' => '8', 'Type' => 'Pointer' }, '30577' => { 'BaseType' => '28072', 'Name' => 'struct mlx5dv_flow_matcher*', 'Size' => '8', 'Type' => 'Pointer' }, '30582' => { 'BaseType' => '18858', 'Name' => 'struct mlx5dv_flow_matcher_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '30647' => { 'BaseType' => '19314', 'Name' => 'struct mlx5dv_flow_action_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '30682' => { 'BaseType' => '19031', 'Name' => 'struct mlx5dv_steering_anchor*', 'Size' => '8', 'Type' => 'Pointer' }, '30687' => { 'BaseType' => '18975', 'Name' => 'struct mlx5dv_steering_anchor_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '30742' => { 'BaseType' => '16251', 'Name' => 'struct mlx5dv_context*', 'Size' => '8', 'Type' => 'Pointer' }, '308' => { 'BaseType' => '1', 'Name' => 'void*', 'Size' => '8', 'Type' => 'Pointer' }, '30852' => { 'BaseType' => '21805', 'Name' => 'struct mlx5dv_sched_attr const*', 'Size' => '8', 'Type' => 'Pointer' }, '30882' => { 'BaseType' => '29010', 'Name' => 'struct mlx5dv_sched_leaf*', 'Size' => '8', 'Type' => 'Pointer' }, '310' => { 'BaseType' => '188', 'Header' => undef, 'Line' => '194', 'Name' => '__ssize_t', 'Size' => '8', 'Type' => 'Typedef' }, '31007' => { 'BaseType' => '29052', 'Name' => 'struct mlx5dv_sched_leaf const*', 'Size' => '8', 'Type' => 'Pointer' }, '31117' => { 'BaseType' => '21261', 'Name' => 'struct mlx5dv_clock_info*', 'Size' => '8', 'Type' => 'Pointer' }, '31157' => { 'BaseType' => '15250', 'Name' => 'struct mlx5_ib_uapi_query_port*', 'Size' => '8', 'Type' => 'Pointer' }, '31207' => { 'BaseType' => '21857', 'Name' => 'struct mlx5dv_devx_msi_vector*', 'Size' => '8', 'Type' => 'Pointer' }, '31272' => { 'BaseType' => '21898', 'Name' => 'struct mlx5dv_devx_eq*', 'Size' => '8', 'Type' => 'Pointer' }, '346' => { 'BaseType' => '356', 'Name' => 'char*', 'Size' => '8', 'Type' => 'Pointer' }, '3558' => { 'BaseType' => '356', 'Name' => 'char[64]', 'Size' => '64', 'Type' => 'Array' }, '356' => { 'Name' => 'char', 'Size' => '1', 'Type' => 'Intrinsic' }, '3606' => { 'Header' => undef, 'Line' => '242', 'Memb' => { '0' => { 'name' => 'rc_odp_caps', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'uc_odp_caps', 'offset' => '4', 'type' => '2001' }, '2' => { 'name' => 'ud_odp_caps', 'offset' => '8', 'type' => '2001' } }, 'Size' => '12', 'Type' => 'Struct' }, '363' => { 'BaseType' => '356', 'Name' => 'char const', 'Size' => '1', 'Type' => 'Const' }, '3655' => { 'Header' => undef, 'Line' => '240', 'Memb' => { '0' => { 'name' => 'general_caps', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'per_transport_caps', 'offset' => '8', 'type' => '3606' } }, 'Name' => 'struct ibv_odp_caps', 'Size' => '24', 'Type' => 'Struct' }, '3695' => { 'Header' => undef, 'Line' => '254', 'Memb' => { '0' => { 'name' => 'max_tso', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'supported_qpts', 'offset' => '4', 'type' => '2001' } }, 'Name' => 'struct ibv_tso_caps', 'Size' => '8', 'Type' => 'Struct' }, '3736' => { 'Header' => undef, 'Line' => '285', 'Memb' => { '0' => { 'name' => 'supported_qpts', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'max_rwq_indirection_tables', 'offset' => '4', 'type' => '2001' }, '2' => { 'name' => 'max_rwq_indirection_table_size', 'offset' => '8', 'type' => '2001' }, '3' => { 'name' => 'rx_hash_fields_mask', 'offset' => '22', 'type' => '2023' }, '4' => { 'name' => 
'rx_hash_function', 'offset' => '36', 'type' => '1977' } }, 'Name' => 'struct ibv_rss_caps', 'Size' => '32', 'Type' => 'Struct' }, '3820' => { 'Header' => undef, 'Line' => '293', 'Memb' => { '0' => { 'name' => 'qp_rate_limit_min', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'qp_rate_limit_max', 'offset' => '4', 'type' => '2001' }, '2' => { 'name' => 'supported_qpts', 'offset' => '8', 'type' => '2001' } }, 'Name' => 'struct ibv_packet_pacing_caps', 'Size' => '12', 'Type' => 'Struct' }, '3876' => { 'Header' => undef, 'Line' => '310', 'Memb' => { '0' => { 'name' => 'max_rndv_hdr_size', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'max_num_tags', 'offset' => '4', 'type' => '2001' }, '2' => { 'name' => 'flags', 'offset' => '8', 'type' => '2001' }, '3' => { 'name' => 'max_ops', 'offset' => '18', 'type' => '2001' }, '4' => { 'name' => 'max_sge', 'offset' => '22', 'type' => '2001' } }, 'Name' => 'struct ibv_tm_caps', 'Size' => '20', 'Type' => 'Struct' }, '3960' => { 'Header' => undef, 'Line' => '323', 'Memb' => { '0' => { 'name' => 'max_cq_count', 'offset' => '0', 'type' => '1989' }, '1' => { 'name' => 'max_cq_period', 'offset' => '2', 'type' => '1989' } }, 'Name' => 'struct ibv_cq_moderation_caps', 'Size' => '4', 'Type' => 'Struct' }, '39716' => { 'BaseType' => '14403', 'Name' => 'struct list_head*', 'Size' => '8', 'Type' => 'Pointer' }, '4002' => { 'Header' => undef, 'Line' => '338', 'Memb' => { '0' => { 'name' => 'fetch_add', 'offset' => '0', 'type' => '1989' }, '1' => { 'name' => 'swap', 'offset' => '2', 'type' => '1989' }, '2' => { 'name' => 'compare_swap', 'offset' => '4', 'type' => '1989' } }, 'Name' => 'struct ibv_pci_atomic_caps', 'Size' => '6', 'Type' => 'Struct' }, '4058' => { 'Header' => undef, 'Line' => '344', 'Memb' => { '0' => { 'name' => 'orig_attr', 'offset' => '0', 'type' => '3024' }, '1' => { 'name' => 'comp_mask', 'offset' => '562', 'type' => '2001' }, '10' => { 'name' => 'raw_packet_caps', 'offset' => '836', 'type' => '2001' }, '11' => { 'name' => 'tm_caps', 'offset' => '840', 'type' => '3876' }, '12' => { 'name' => 'cq_mod_caps', 'offset' => '872', 'type' => '3960' }, '13' => { 'name' => 'max_dm_size', 'offset' => '886', 'type' => '2023' }, '14' => { 'name' => 'pci_atomic_caps', 'offset' => '900', 'type' => '4002' }, '15' => { 'name' => 'xrc_odp_caps', 'offset' => '914', 'type' => '2001' }, '16' => { 'name' => 'phys_port_cnt_ex', 'offset' => '918', 'type' => '2001' }, '2' => { 'name' => 'odp_caps', 'offset' => '576', 'type' => '3655' }, '3' => { 'name' => 'completion_timestamp_mask', 'offset' => '612', 'type' => '2023' }, '4' => { 'name' => 'hca_core_clock', 'offset' => '626', 'type' => '2023' }, '5' => { 'name' => 'device_cap_flags_ex', 'offset' => '640', 'type' => '2023' }, '6' => { 'name' => 'tso_caps', 'offset' => '648', 'type' => '3695' }, '7' => { 'name' => 'rss_caps', 'offset' => '662', 'type' => '3736' }, '8' => { 'name' => 'max_wq_type_rq', 'offset' => '808', 'type' => '2001' }, '9' => { 'name' => 'packet_pacing_caps', 'offset' => '818', 'type' => '3820' } }, 'Name' => 'struct ibv_device_attr_ex', 'Size' => '400', 'Type' => 'Struct' }, '419' => { 'BaseType' => '82', 'Header' => undef, 'Line' => '214', 'Name' => 'size_t', 'Size' => '8', 'Type' => 'Typedef' }, '443' => { 'Name' => 'unsigned long long', 'Size' => '8', 'Type' => 'Intrinsic' }, '46' => { 'Name' => 'unsigned char', 'Size' => '1', 'Type' => 'Intrinsic' }, '4761' => { 'Header' => undef, 'Line' => '1508', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' 
=> { 'name' => 'channel', 'offset' => '8', 'type' => '9992' }, '2' => { 'name' => 'cq_context', 'offset' => '22', 'type' => '308' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '2001' }, '4' => { 'name' => 'cqe', 'offset' => '40', 'type' => '159' }, '5' => { 'name' => 'mutex', 'offset' => '50', 'type' => '893' }, '6' => { 'name' => 'cond', 'offset' => '114', 'type' => '966' }, '7' => { 'name' => 'comp_events_completed', 'offset' => '288', 'type' => '2001' }, '8' => { 'name' => 'async_events_completed', 'offset' => '292', 'type' => '2001' } }, 'Name' => 'struct ibv_cq', 'Size' => '128', 'Type' => 'Struct' }, '4901' => { 'BaseType' => '4761', 'Name' => 'struct ibv_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '4906' => { 'Header' => undef, 'Line' => '1283', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'qp_context', 'offset' => '8', 'type' => '308' }, '10' => { 'name' => 'mutex', 'offset' => '100', 'type' => '893' }, '11' => { 'name' => 'cond', 'offset' => '260', 'type' => '966' }, '12' => { 'name' => 'events_completed', 'offset' => '338', 'type' => '2001' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '6313' }, '3' => { 'name' => 'send_cq', 'offset' => '36', 'type' => '4901' }, '4' => { 'name' => 'recv_cq', 'offset' => '50', 'type' => '4901' }, '5' => { 'name' => 'srq', 'offset' => '64', 'type' => '5217' }, '6' => { 'name' => 'handle', 'offset' => '72', 'type' => '2001' }, '7' => { 'name' => 'qp_num', 'offset' => '82', 'type' => '2001' }, '8' => { 'name' => 'state', 'offset' => '86', 'type' => '7639' }, '9' => { 'name' => 'qp_type', 'offset' => '96', 'type' => '7095' } }, 'Name' => 'struct ibv_qp', 'Size' => '160', 'Type' => 'Struct' }, '5101' => { 'BaseType' => '4906', 'Name' => 'struct ibv_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '5106' => { 'Header' => undef, 'Line' => '1243', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'srq_context', 'offset' => '8', 'type' => '308' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '6313' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '2001' }, '4' => { 'name' => 'mutex', 'offset' => '50', 'type' => '893' }, '5' => { 'name' => 'cond', 'offset' => '114', 'type' => '966' }, '6' => { 'name' => 'events_completed', 'offset' => '288', 'type' => '2001' } }, 'Name' => 'struct ibv_srq', 'Size' => '128', 'Type' => 'Struct' }, '5217' => { 'BaseType' => '5106', 'Name' => 'struct ibv_srq*', 'Size' => '8', 'Type' => 'Pointer' }, '5222' => { 'Header' => undef, 'Line' => '1265', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'wq_context', 'offset' => '8', 'type' => '308' }, '10' => { 'name' => 'cond', 'offset' => '150', 'type' => '966' }, '11' => { 'name' => 'events_completed', 'offset' => '324', 'type' => '2001' }, '12' => { 'name' => 'comp_mask', 'offset' => '328', 'type' => '2001' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '6313' }, '3' => { 'name' => 'cq', 'offset' => '36', 'type' => '4901' }, '4' => { 'name' => 'wq_num', 'offset' => '50', 'type' => '2001' }, '5' => { 'name' => 'handle', 'offset' => '54', 'type' => '2001' }, '6' => { 'name' => 'state', 'offset' => '64', 'type' => '6839' }, '7' => { 'name' => 'wq_type', 'offset' => '68', 'type' => '6692' }, '8' => { 'name' => 'post_recv', 'offset' => '72', 'type' => '9044' }, '9' => { 'name' => 'mutex', 'offset' => '86', 'type' => '893' } }, 'Name' => 'struct ibv_wq', 'Size' => '152', 'Type' => 'Struct' }, 
'526093' => { 'Header' => undef, 'Line' => '2019', 'Memb' => { '0' => { 'name' => 'flags', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'log_num_of_rules_hint', 'offset' => '4', 'type' => '2001' } }, 'Name' => 'struct mlx5dv_dr_matcher_layout', 'Size' => '8', 'Type' => 'Struct' }, '5416' => { 'BaseType' => '5222', 'Name' => 'struct ibv_wq*', 'Size' => '8', 'Type' => 'Pointer' }, '5421' => { 'Header' => undef, 'Line' => '485', 'Memb' => { '0' => { 'name' => 'IBV_WC_SUCCESS', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_LOC_LEN_ERR', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_REM_ACCESS_ERR', 'value' => '10' }, '11' => { 'name' => 'IBV_WC_REM_OP_ERR', 'value' => '11' }, '12' => { 'name' => 'IBV_WC_RETRY_EXC_ERR', 'value' => '12' }, '13' => { 'name' => 'IBV_WC_RNR_RETRY_EXC_ERR', 'value' => '13' }, '14' => { 'name' => 'IBV_WC_LOC_RDD_VIOL_ERR', 'value' => '14' }, '15' => { 'name' => 'IBV_WC_REM_INV_RD_REQ_ERR', 'value' => '15' }, '16' => { 'name' => 'IBV_WC_REM_ABORT_ERR', 'value' => '16' }, '17' => { 'name' => 'IBV_WC_INV_EECN_ERR', 'value' => '17' }, '18' => { 'name' => 'IBV_WC_INV_EEC_STATE_ERR', 'value' => '18' }, '19' => { 'name' => 'IBV_WC_FATAL_ERR', 'value' => '19' }, '2' => { 'name' => 'IBV_WC_LOC_QP_OP_ERR', 'value' => '2' }, '20' => { 'name' => 'IBV_WC_RESP_TIMEOUT_ERR', 'value' => '20' }, '21' => { 'name' => 'IBV_WC_GENERAL_ERR', 'value' => '21' }, '22' => { 'name' => 'IBV_WC_TM_ERR', 'value' => '22' }, '23' => { 'name' => 'IBV_WC_TM_RNDV_INCOMPLETE', 'value' => '23' }, '3' => { 'name' => 'IBV_WC_LOC_EEC_OP_ERR', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_LOC_PROT_ERR', 'value' => '4' }, '5' => { 'name' => 'IBV_WC_WR_FLUSH_ERR', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_MW_BIND_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_BAD_RESP_ERR', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_LOC_ACCESS_ERR', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_REM_INV_REQ_ERR', 'value' => '9' } }, 'Name' => 'enum ibv_wc_status', 'Size' => '4', 'Type' => 'Enum' }, '550203' => { 'BaseType' => '526093', 'Name' => 'struct mlx5dv_dr_matcher_layout*', 'Size' => '8', 'Type' => 'Pointer' }, '5582' => { 'Header' => undef, 'Line' => '513', 'Memb' => { '0' => { 'name' => 'IBV_WC_SEND', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_RDMA_WRITE', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_RECV', 'value' => '128' }, '11' => { 'name' => 'IBV_WC_RECV_RDMA_WITH_IMM', 'value' => '129' }, '12' => { 'name' => 'IBV_WC_TM_ADD', 'value' => '130' }, '13' => { 'name' => 'IBV_WC_TM_DEL', 'value' => '131' }, '14' => { 'name' => 'IBV_WC_TM_SYNC', 'value' => '132' }, '15' => { 'name' => 'IBV_WC_TM_RECV', 'value' => '133' }, '16' => { 'name' => 'IBV_WC_TM_NO_TAG', 'value' => '134' }, '17' => { 'name' => 'IBV_WC_DRIVER1', 'value' => '135' }, '18' => { 'name' => 'IBV_WC_DRIVER2', 'value' => '136' }, '19' => { 'name' => 'IBV_WC_DRIVER3', 'value' => '137' }, '2' => { 'name' => 'IBV_WC_RDMA_READ', 'value' => '2' }, '3' => { 'name' => 'IBV_WC_COMP_SWAP', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_FETCH_ADD', 'value' => '4' }, '5' => { 'name' => 'IBV_WC_BIND_MW', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_LOCAL_INV', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_TSO', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_FLUSH', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_ATOMIC_WRITE', 'value' => '9' } }, 'Name' => 'enum ibv_wc_opcode', 'Size' => '4', 'Type' => 'Enum' }, '5719' => { 'Header' => undef, 'Line' => '598', 'Memb' => { '0' => { 'name' => 'imm_data', 'offset' => '0', 'type' => '2203' }, '1' => { 'name' 
=> 'invalidated_rkey', 'offset' => '0', 'type' => '2001' } }, 'Size' => '4', 'Type' => 'Union' }, '5755' => { 'Header' => undef, 'Line' => '589', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'status', 'offset' => '8', 'type' => '5421' }, '10' => { 'name' => 'slid', 'offset' => '66', 'type' => '1989' }, '11' => { 'name' => 'sl', 'offset' => '68', 'type' => '1977' }, '12' => { 'name' => 'dlid_path_bits', 'offset' => '69', 'type' => '1977' }, '2' => { 'name' => 'opcode', 'offset' => '18', 'type' => '5582' }, '3' => { 'name' => 'vendor_err', 'offset' => '22', 'type' => '2001' }, '4' => { 'name' => 'byte_len', 'offset' => '32', 'type' => '2001' }, '5' => { 'name' => 'unnamed0', 'offset' => '36', 'type' => '5719' }, '6' => { 'name' => 'qp_num', 'offset' => '40', 'type' => '2001' }, '7' => { 'name' => 'src_qp', 'offset' => '50', 'type' => '2001' }, '8' => { 'name' => 'wc_flags', 'offset' => '54', 'type' => '70' }, '9' => { 'name' => 'pkey_index', 'offset' => '64', 'type' => '1989' } }, 'Name' => 'struct ibv_wc', 'Size' => '48', 'Type' => 'Struct' }, '58' => { 'Name' => 'unsigned short', 'Size' => '2', 'Type' => 'Intrinsic' }, '5942' => { 'Header' => undef, 'Line' => '625', 'Memb' => { '0' => { 'name' => 'mr', 'offset' => '0', 'type' => '6127' }, '1' => { 'name' => 'addr', 'offset' => '8', 'type' => '2023' }, '2' => { 'name' => 'length', 'offset' => '22', 'type' => '2023' }, '3' => { 'name' => 'mw_access_flags', 'offset' => '36', 'type' => '70' } }, 'Name' => 'struct ibv_mw_bind_info', 'Size' => '32', 'Type' => 'Struct' }, '6011' => { 'BaseType' => '5942', 'Name' => 'struct ibv_mw_bind_info const', 'Size' => '32', 'Type' => 'Const' }, '6016' => { 'Header' => undef, 'Line' => '668', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '6313' }, '2' => { 'name' => 'addr', 'offset' => '22', 'type' => '308' }, '3' => { 'name' => 'length', 'offset' => '36', 'type' => '419' }, '4' => { 'name' => 'handle', 'offset' => '50', 'type' => '2001' }, '5' => { 'name' => 'lkey', 'offset' => '54', 'type' => '2001' }, '6' => { 'name' => 'rkey', 'offset' => '64', 'type' => '2001' } }, 'Name' => 'struct ibv_mr', 'Size' => '48', 'Type' => 'Struct' }, '6127' => { 'BaseType' => '6016', 'Name' => 'struct ibv_mr*', 'Size' => '8', 'Type' => 'Pointer' }, '6132' => { 'Header' => undef, 'Line' => '632', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'handle', 'offset' => '8', 'type' => '2001' } }, 'Name' => 'struct ibv_pd', 'Size' => '16', 'Type' => 'Struct' }, '6202' => { 'Header' => undef, 'Line' => '641', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' } }, 'Name' => 'struct ibv_td', 'Size' => '8', 'Type' => 'Struct' }, '6285' => { 'Header' => undef, 'Line' => '657', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' } }, 'Name' => 'struct ibv_xrcd', 'Size' => '8', 'Type' => 'Struct' }, '6313' => { 'BaseType' => '6132', 'Name' => 'struct ibv_pd*', 'Size' => '8', 'Type' => 'Pointer' }, '6318' => { 'Header' => undef, 'Line' => '678', 'Memb' => { '0' => { 'name' => 'IBV_MW_TYPE_1', 'value' => '1' }, '1' => { 'name' => 'IBV_MW_TYPE_2', 'value' => '2' } }, 'Name' => 'enum ibv_mw_type', 'Size' => '4', 'Type' => 'Enum' }, '6347' => { 'Header' => undef, 'Line' => '683', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => 
'6313' }, '2' => { 'name' => 'rkey', 'offset' => '22', 'type' => '2001' }, '3' => { 'name' => 'handle', 'offset' => '32', 'type' => '2001' }, '4' => { 'name' => 'type', 'offset' => '36', 'type' => '6318' } }, 'Name' => 'struct ibv_mw', 'Size' => '32', 'Type' => 'Struct' }, '63702' => { 'BaseType' => '1977', 'Name' => 'uint8_t[32]', 'Size' => '32', 'Type' => 'Array' }, '6687' => { 'BaseType' => '6285', 'Name' => 'struct ibv_xrcd*', 'Size' => '8', 'Type' => 'Pointer' }, '6692' => { 'Header' => undef, 'Line' => '820', 'Memb' => { '0' => { 'name' => 'IBV_WQT_RQ', 'value' => '0' } }, 'Name' => 'enum ibv_wq_type', 'Size' => '4', 'Type' => 'Enum' }, '6715' => { 'Header' => undef, 'Line' => '837', 'Memb' => { '0' => { 'name' => 'wq_context', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'wq_type', 'offset' => '8', 'type' => '6692' }, '2' => { 'name' => 'max_wr', 'offset' => '18', 'type' => '2001' }, '3' => { 'name' => 'max_sge', 'offset' => '22', 'type' => '2001' }, '4' => { 'name' => 'pd', 'offset' => '36', 'type' => '6313' }, '5' => { 'name' => 'cq', 'offset' => '50', 'type' => '4901' }, '6' => { 'name' => 'comp_mask', 'offset' => '64', 'type' => '2001' }, '7' => { 'name' => 'create_flags', 'offset' => '68', 'type' => '2001' } }, 'Name' => 'struct ibv_wq_init_attr', 'Size' => '48', 'Type' => 'Struct' }, '6839' => { 'Header' => undef, 'Line' => '848', 'Memb' => { '0' => { 'name' => 'IBV_WQS_RESET', 'value' => '0' }, '1' => { 'name' => 'IBV_WQS_RDY', 'value' => '1' }, '2' => { 'name' => 'IBV_WQS_ERR', 'value' => '2' }, '3' => { 'name' => 'IBV_WQS_UNKNOWN', 'value' => '3' } }, 'Name' => 'enum ibv_wq_state', 'Size' => '4', 'Type' => 'Enum' }, '6964' => { 'Header' => undef, 'Line' => '880', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'ind_tbl_handle', 'offset' => '8', 'type' => '159' }, '2' => { 'name' => 'ind_tbl_num', 'offset' => '18', 'type' => '159' }, '3' => { 'name' => 'comp_mask', 'offset' => '22', 'type' => '2001' } }, 'Name' => 'struct ibv_rwq_ind_table', 'Size' => '24', 'Type' => 'Struct' }, '70' => { 'Name' => 'unsigned int', 'Size' => '4', 'Type' => 'Intrinsic' }, '7095' => { 'Header' => undef, 'Line' => '901', 'Memb' => { '0' => { 'name' => 'IBV_QPT_RC', 'value' => '2' }, '1' => { 'name' => 'IBV_QPT_UC', 'value' => '3' }, '2' => { 'name' => 'IBV_QPT_UD', 'value' => '4' }, '3' => { 'name' => 'IBV_QPT_RAW_PACKET', 'value' => '8' }, '4' => { 'name' => 'IBV_QPT_XRC_SEND', 'value' => '9' }, '5' => { 'name' => 'IBV_QPT_XRC_RECV', 'value' => '10' }, '6' => { 'name' => 'IBV_QPT_DRIVER', 'value' => '255' } }, 'Name' => 'enum ibv_qp_type', 'Size' => '4', 'Type' => 'Enum' }, '7154' => { 'Header' => undef, 'Line' => '911', 'Memb' => { '0' => { 'name' => 'max_send_wr', 'offset' => '0', 'type' => '2001' }, '1' => { 'name' => 'max_recv_wr', 'offset' => '4', 'type' => '2001' }, '2' => { 'name' => 'max_send_sge', 'offset' => '8', 'type' => '2001' }, '3' => { 'name' => 'max_recv_sge', 'offset' => '18', 'type' => '2001' }, '4' => { 'name' => 'max_inline_data', 'offset' => '22', 'type' => '2001' } }, 'Name' => 'struct ibv_qp_cap', 'Size' => '20', 'Type' => 'Struct' }, '7238' => { 'Header' => undef, 'Line' => '963', 'Memb' => { '0' => { 'name' => 'rx_hash_function', 'offset' => '0', 'type' => '1977' }, '1' => { 'name' => 'rx_hash_key_len', 'offset' => '1', 'type' => '1977' }, '2' => { 'name' => 'rx_hash_key', 'offset' => '8', 'type' => '7308' }, '3' => { 'name' => 'rx_hash_fields_mask', 'offset' => '22', 'type' => '2023' } }, 'Name' => 'struct 
ibv_rx_hash_conf', 'Size' => '24', 'Type' => 'Struct' }, '7308' => { 'BaseType' => '1977', 'Name' => 'uint8_t*', 'Size' => '8', 'Type' => 'Pointer' }, '7313' => { 'Header' => undef, 'Line' => '972', 'Memb' => { '0' => { 'name' => 'qp_context', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'send_cq', 'offset' => '8', 'type' => '4901' }, '10' => { 'name' => 'create_flags', 'offset' => '128', 'type' => '2001' }, '11' => { 'name' => 'max_tso_header', 'offset' => '132', 'type' => '1989' }, '12' => { 'name' => 'rwq_ind_tbl', 'offset' => '136', 'type' => '7550' }, '13' => { 'name' => 'rx_hash_conf', 'offset' => '150', 'type' => '7238' }, '14' => { 'name' => 'source_qpn', 'offset' => '288', 'type' => '2001' }, '15' => { 'name' => 'send_ops_flags', 'offset' => '296', 'type' => '2023' }, '2' => { 'name' => 'recv_cq', 'offset' => '22', 'type' => '4901' }, '3' => { 'name' => 'srq', 'offset' => '36', 'type' => '5217' }, '4' => { 'name' => 'cap', 'offset' => '50', 'type' => '7154' }, '5' => { 'name' => 'qp_type', 'offset' => '82', 'type' => '7095' }, '6' => { 'name' => 'sq_sig_all', 'offset' => '86', 'type' => '159' }, '7' => { 'name' => 'comp_mask', 'offset' => '96', 'type' => '2001' }, '8' => { 'name' => 'pd', 'offset' => '100', 'type' => '6313' }, '9' => { 'name' => 'xrcd', 'offset' => '114', 'type' => '6687' } }, 'Name' => 'struct ibv_qp_init_attr_ex', 'Size' => '136', 'Type' => 'Struct' }, '7550' => { 'BaseType' => '6964', 'Name' => 'struct ibv_rwq_ind_table*', 'Size' => '8', 'Type' => 'Pointer' }, '7639' => { 'Header' => undef, 'Line' => '1050', 'Memb' => { '0' => { 'name' => 'IBV_QPS_RESET', 'value' => '0' }, '1' => { 'name' => 'IBV_QPS_INIT', 'value' => '1' }, '2' => { 'name' => 'IBV_QPS_RTR', 'value' => '2' }, '3' => { 'name' => 'IBV_QPS_RTS', 'value' => '3' }, '4' => { 'name' => 'IBV_QPS_SQD', 'value' => '4' }, '5' => { 'name' => 'IBV_QPS_SQE', 'value' => '5' }, '6' => { 'name' => 'IBV_QPS_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_QPS_UNKNOWN', 'value' => '7' } }, 'Name' => 'enum ibv_qp_state', 'Size' => '4', 'Type' => 'Enum' }, '7774' => { 'Header' => undef, 'Line' => '1103', 'Memb' => { '0' => { 'name' => 'IBV_WR_RDMA_WRITE', 'value' => '0' }, '1' => { 'name' => 'IBV_WR_RDMA_WRITE_WITH_IMM', 'value' => '1' }, '10' => { 'name' => 'IBV_WR_TSO', 'value' => '10' }, '11' => { 'name' => 'IBV_WR_DRIVER1', 'value' => '11' }, '12' => { 'name' => 'IBV_WR_FLUSH', 'value' => '14' }, '13' => { 'name' => 'IBV_WR_ATOMIC_WRITE', 'value' => '15' }, '2' => { 'name' => 'IBV_WR_SEND', 'value' => '2' }, '3' => { 'name' => 'IBV_WR_SEND_WITH_IMM', 'value' => '3' }, '4' => { 'name' => 'IBV_WR_RDMA_READ', 'value' => '4' }, '5' => { 'name' => 'IBV_WR_ATOMIC_CMP_AND_SWP', 'value' => '5' }, '6' => { 'name' => 'IBV_WR_ATOMIC_FETCH_AND_ADD', 'value' => '6' }, '7' => { 'name' => 'IBV_WR_LOCAL_INV', 'value' => '7' }, '8' => { 'name' => 'IBV_WR_BIND_MW', 'value' => '8' }, '9' => { 'name' => 'IBV_WR_SEND_WITH_INV', 'value' => '9' } }, 'Name' => 'enum ibv_wr_opcode', 'Size' => '4', 'Type' => 'Enum' }, '7875' => { 'Header' => undef, 'Line' => '1140', 'Memb' => { '0' => { 'name' => 'addr', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '419' } }, 'Name' => 'struct ibv_data_buf', 'Size' => '16', 'Type' => 'Struct' }, '7917' => { 'BaseType' => '7875', 'Name' => 'struct ibv_data_buf const', 'Size' => '16', 'Type' => 'Const' }, '7922' => { 'Header' => undef, 'Line' => '1145', 'Memb' => { '0' => { 'name' => 'addr', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 
'length', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'lkey', 'offset' => '18', 'type' => '2001' } }, 'Name' => 'struct ibv_sge', 'Size' => '16', 'Type' => 'Struct' }, '7978' => { 'BaseType' => '7922', 'Name' => 'struct ibv_sge const', 'Size' => '16', 'Type' => 'Const' }, '7983' => { 'Header' => undef, 'Line' => '1161', 'Memb' => { '0' => { 'name' => 'imm_data', 'offset' => '0', 'type' => '2203' }, '1' => { 'name' => 'invalidate_rkey', 'offset' => '0', 'type' => '2001' } }, 'Size' => '4', 'Type' => 'Union' }, '8019' => { 'Header' => undef, 'Line' => '1166', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '2001' } }, 'Size' => '16', 'Type' => 'Struct' }, '8058' => { 'Header' => undef, 'Line' => '1170', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'compare_add', 'offset' => '8', 'type' => '2023' }, '2' => { 'name' => 'swap', 'offset' => '22', 'type' => '2023' }, '3' => { 'name' => 'rkey', 'offset' => '36', 'type' => '2001' } }, 'Size' => '32', 'Type' => 'Struct' }, '8125' => { 'Header' => undef, 'Line' => '1176', 'Memb' => { '0' => { 'name' => 'ah', 'offset' => '0', 'type' => '8232' }, '1' => { 'name' => 'remote_qpn', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'remote_qkey', 'offset' => '18', 'type' => '2001' } }, 'Size' => '16', 'Type' => 'Struct' }, '8177' => { 'Header' => undef, 'Line' => '1695', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '6313' }, '2' => { 'name' => 'handle', 'offset' => '22', 'type' => '2001' } }, 'Name' => 'struct ibv_ah', 'Size' => '24', 'Type' => 'Struct' }, '82' => { 'Name' => 'unsigned long', 'Size' => '8', 'Type' => 'Intrinsic' }, '8232' => { 'BaseType' => '8177', 'Name' => 'struct ibv_ah*', 'Size' => '8', 'Type' => 'Pointer' }, '8237' => { 'Header' => undef, 'Line' => '1165', 'Memb' => { '0' => { 'name' => 'rdma', 'offset' => '0', 'type' => '8019' }, '1' => { 'name' => 'atomic', 'offset' => '0', 'type' => '8058' }, '2' => { 'name' => 'ud', 'offset' => '0', 'type' => '8125' } }, 'Size' => '32', 'Type' => 'Union' }, '8285' => { 'Header' => undef, 'Line' => '1183', 'Memb' => { '0' => { 'name' => 'remote_srqn', 'offset' => '0', 'type' => '2001' } }, 'Size' => '4', 'Type' => 'Struct' }, '8310' => { 'Header' => undef, 'Line' => '1182', 'Memb' => { '0' => { 'name' => 'xrc', 'offset' => '0', 'type' => '8285' } }, 'Size' => '4', 'Type' => 'Union' }, '8333' => { 'Header' => undef, 'Line' => '1188', 'Memb' => { '0' => { 'name' => 'mw', 'offset' => '0', 'type' => '8385' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '2001' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '5942' } }, 'Size' => '48', 'Type' => 'Struct' }, '8385' => { 'BaseType' => '6347', 'Name' => 'struct ibv_mw*', 'Size' => '8', 'Type' => 'Pointer' }, '8390' => { 'Header' => undef, 'Line' => '1193', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '308' }, '1' => { 'name' => 'hdr_sz', 'offset' => '8', 'type' => '1989' }, '2' => { 'name' => 'mss', 'offset' => '16', 'type' => '1989' } }, 'Size' => '16', 'Type' => 'Struct' }, '8443' => { 'Header' => undef, 'Line' => '1187', 'Memb' => { '0' => { 'name' => 'bind_mw', 'offset' => '0', 'type' => '8333' }, '1' => { 'name' => 'tso', 'offset' => '0', 'type' => '8390' } }, 'Size' => '48', 'Type' => 'Union' }, '8479' => { 'Header' => undef, 'Line' => '1151', 'Memb' => { '0' => 
{ 'name' => 'wr_id', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '8616' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '8621' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '159' }, '4' => { 'name' => 'opcode', 'offset' => '40', 'type' => '7774' }, '5' => { 'name' => 'send_flags', 'offset' => '50', 'type' => '70' }, '6' => { 'name' => 'unnamed0', 'offset' => '54', 'type' => '7983' }, '7' => { 'name' => 'wr', 'offset' => '64', 'type' => '8237' }, '8' => { 'name' => 'qp_type', 'offset' => '114', 'type' => '8310' }, '9' => { 'name' => 'unnamed1', 'offset' => '128', 'type' => '8443' } }, 'Name' => 'struct ibv_send_wr', 'Size' => '128', 'Type' => 'Struct' }, '8616' => { 'BaseType' => '8479', 'Name' => 'struct ibv_send_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '8621' => { 'BaseType' => '7922', 'Name' => 'struct ibv_sge*', 'Size' => '8', 'Type' => 'Pointer' }, '8626' => { 'Header' => undef, 'Line' => '1201', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '8696' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '8621' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '159' } }, 'Name' => 'struct ibv_recv_wr', 'Size' => '32', 'Type' => 'Struct' }, '8696' => { 'BaseType' => '8626', 'Name' => 'struct ibv_recv_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '8958' => { 'Header' => undef, 'Line' => '1237', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '2023' }, '1' => { 'name' => 'send_flags', 'offset' => '8', 'type' => '70' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '5942' } }, 'Name' => 'struct ibv_mw_bind', 'Size' => '48', 'Type' => 'Struct' }, '9039' => { 'BaseType' => '8696', 'Name' => 'struct ibv_recv_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '9044' => { 'Name' => 'int(*)(struct ibv_wq*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '5416' }, '1' => { 'type' => '8696' }, '2' => { 'type' => '9039' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '9049' => { 'Header' => undef, 'Line' => '1300', 'Memb' => { '0' => { 'name' => 'qp_base', 'offset' => '0', 'type' => '4906' }, '1' => { 'name' => 'comp_mask', 'offset' => '352', 'type' => '2023' }, '10' => { 'name' => 'wr_rdma_write_imm', 'offset' => '562', 'type' => '9622' }, '11' => { 'name' => 'wr_send', 'offset' => '576', 'type' => '9638' }, '12' => { 'name' => 'wr_send_imm', 'offset' => '584', 'type' => '9659' }, '13' => { 'name' => 'wr_send_inv', 'offset' => '598', 'type' => '9565' }, '14' => { 'name' => 'wr_send_tso', 'offset' => '612', 'type' => '9690' }, '15' => { 'name' => 'wr_set_ud_addr', 'offset' => '626', 'type' => '9721' }, '16' => { 'name' => 'wr_set_xrc_srqn', 'offset' => '640', 'type' => '9565' }, '17' => { 'name' => 'wr_set_inline_data', 'offset' => '648', 'type' => '9747' }, '18' => { 'name' => 'wr_set_inline_data_list', 'offset' => '662', 'type' => '9778' }, '19' => { 'name' => 'wr_set_sge', 'offset' => '772', 'type' => '9809' }, '2' => { 'name' => 'wr_id', 'offset' => '360', 'type' => '2023' }, '20' => { 'name' => 'wr_set_sge_list', 'offset' => '786', 'type' => '9840' }, '21' => { 'name' => 'wr_start', 'offset' => '800', 'type' => '9638' }, '22' => { 'name' => 'wr_complete', 'offset' => '808', 'type' => '9860' }, '23' => { 'name' => 'wr_abort', 'offset' => '822', 'type' => '9638' }, '24' => { 'name' => 'wr_atomic_write', 'offset' => '836', 'type' => '9891' }, '25' => { 
'name' => 'wr_flush', 'offset' => '850', 'type' => '9932' }, '3' => { 'name' => 'wr_flags', 'offset' => '374', 'type' => '70' }, '4' => { 'name' => 'wr_atomic_cmp_swp', 'offset' => '388', 'type' => '9477' }, '5' => { 'name' => 'wr_atomic_fetch_add', 'offset' => '402', 'type' => '9508' }, '6' => { 'name' => 'wr_bind_mw', 'offset' => '512', 'type' => '9544' }, '7' => { 'name' => 'wr_local_inv', 'offset' => '520', 'type' => '9565' }, '8' => { 'name' => 'wr_rdma_read', 'offset' => '534', 'type' => '9591' }, '9' => { 'name' => 'wr_rdma_write', 'offset' => '548', 'type' => '9591' } }, 'Name' => 'struct ibv_qp_ex', 'Size' => '360', 'Type' => 'Struct' }, '9472' => { 'BaseType' => '9049', 'Name' => 'struct ibv_qp_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '9477' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t, uint64_t, uint64_t)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '2001' }, '2' => { 'type' => '2023' }, '3' => { 'type' => '2023' }, '4' => { 'type' => '2023' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '950' => { 'BaseType' => '356', 'Name' => 'char[48]', 'Size' => '48', 'Type' => 'Array' }, '9508' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t, uint64_t)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '2001' }, '2' => { 'type' => '2023' }, '3' => { 'type' => '2023' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9539' => { 'BaseType' => '6011', 'Name' => 'struct ibv_mw_bind_info const*', 'Size' => '8', 'Type' => 'Pointer' }, '9544' => { 'Name' => 'void(*)(struct ibv_qp_ex*, struct ibv_mw*, uint32_t, struct ibv_mw_bind_info const*)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '8385' }, '2' => { 'type' => '2001' }, '3' => { 'type' => '9539' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9565' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '2001' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9591' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '2001' }, '2' => { 'type' => '2023' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9622' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t, __be32)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '2001' }, '2' => { 'type' => '2023' }, '3' => { 'type' => '2203' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9638' => { 'Name' => 'void(*)(struct ibv_qp_ex*)', 'Param' => { '0' => { 'type' => '9472' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9659' => { 'Name' => 'void(*)(struct ibv_qp_ex*, __be32)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '2203' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9690' => { 'Name' => 'void(*)(struct ibv_qp_ex*, void*, uint16_t, uint16_t)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '308' }, '2' => { 'type' => '1989' }, '3' => { 'type' => '1989' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9721' => { 'Name' => 'void(*)(struct ibv_qp_ex*, struct ibv_ah*, uint32_t, uint32_t)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '8232' }, '2' => { 'type' => '2001' }, '3' => { 'type' => '2001' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9747' => { 'Name' => 'void(*)(struct ibv_qp_ex*, void*, size_t)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '308' }, '2' => { 'type' => 
'419' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9773' => { 'BaseType' => '7917', 'Name' => 'struct ibv_data_buf const*', 'Size' => '8', 'Type' => 'Pointer' }, '9778' => { 'Name' => 'void(*)(struct ibv_qp_ex*, size_t, struct ibv_data_buf const*)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '419' }, '2' => { 'type' => '9773' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '978' => { 'BaseType' => '356', 'Name' => 'char[8]', 'Size' => '8', 'Type' => 'Array' }, '9809' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t, uint32_t)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '2001' }, '2' => { 'type' => '2023' }, '3' => { 'type' => '2001' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9835' => { 'BaseType' => '7978', 'Name' => 'struct ibv_sge const*', 'Size' => '8', 'Type' => 'Pointer' }, '9840' => { 'Name' => 'void(*)(struct ibv_qp_ex*, size_t, struct ibv_sge const*)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '419' }, '2' => { 'type' => '9835' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9860' => { 'Name' => 'int(*)(struct ibv_qp_ex*)', 'Param' => { '0' => { 'type' => '9472' } }, 'Return' => '159', 'Size' => '8', 'Type' => 'FuncPtr' }, '9891' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t, void const*)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '2001' }, '2' => { 'type' => '2023' }, '3' => { 'type' => '1961' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9932' => { 'Name' => 'void(*)(struct ibv_qp_ex*, uint32_t, uint64_t, size_t, uint8_t, uint8_t)', 'Param' => { '0' => { 'type' => '9472' }, '1' => { 'type' => '2001' }, '2' => { 'type' => '2023' }, '3' => { 'type' => '419' }, '4' => { 'type' => '1977' }, '5' => { 'type' => '1977' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '9937' => { 'Header' => undef, 'Line' => '1502', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '2944' }, '1' => { 'name' => 'fd', 'offset' => '8', 'type' => '159' }, '2' => { 'name' => 'refcnt', 'offset' => '18', 'type' => '159' } }, 'Name' => 'struct ibv_comp_channel', 'Size' => '16', 'Type' => 'Struct' }, '994' => { 'BaseType' => '166', 'Header' => undef, 'Line' => '103', 'Name' => 'pthread_spinlock_t', 'Size' => '4', 'Type' => 'Typedef' }, '9992' => { 'BaseType' => '9937', 'Name' => 'struct ibv_comp_channel*', 'Size' => '8', 'Type' => 'Pointer' }, '9997' => { 'Header' => undef, 'Line' => '1521', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '2001' } }, 'Name' => 'struct ibv_poll_cq_attr', 'Size' => '4', 'Type' => 'Struct' } }, 'UndefinedSymbols' => { 'libmlx5.so.1.25.56.0' => { '_ITM_deregisterTMCloneTable' => 0, '_ITM_registerTMCloneTable' => 0, '__cxa_finalize@GLIBC_2.2.5' => 0, '__errno_location@GLIBC_2.2.5' => 0, '__fprintf_chk@GLIBC_2.3.4' => 0, '__gmon_start__' => 0, '__isoc99_sscanf@GLIBC_2.7' => 0, '__memcpy_chk@GLIBC_2.3.4' => 0, '__pread_chk@GLIBC_2.4' => 0, '__snprintf_chk@GLIBC_2.3.4' => 0, '__sprintf_chk@GLIBC_2.3.4' => 0, '__stack_chk_fail@GLIBC_2.4' => 0, '__strncat_chk@GLIBC_2.3.4' => 0, '__vfprintf_chk@GLIBC_2.3.4' => 0, '__xpg_basename@GLIBC_2.2.5' => 0, '_verbs_init_and_alloc_context@IBVERBS_PRIVATE_34' => 0, 'abort@GLIBC_2.2.5' => 0, 'calloc@GLIBC_2.2.5' => 0, 'close@GLIBC_2.2.5' => 0, 'eventfd@GLIBC_2.7' => 0, 'execute_ioctl@IBVERBS_PRIVATE_34' => 0, 'fclose@GLIBC_2.2.5' => 0, 'fcntl@GLIBC_2.2.5' => 0, 'fgets@GLIBC_2.2.5' => 0, 'fopen@GLIBC_2.2.5' => 0, 
'fputc@GLIBC_2.2.5' => 0, 'free@GLIBC_2.2.5' => 0, 'fwrite@GLIBC_2.2.5' => 0, 'getenv@GLIBC_2.2.5' => 0, 'gethostname@GLIBC_2.2.5' => 0, 'getpid@GLIBC_2.2.5' => 0, 'getrandom@GLIBC_2.25' => 0, 'gettimeofday@GLIBC_2.2.5' => 0, 'ibv_alloc_pd@IBVERBS_1.1' => 0, 'ibv_cmd_advise_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_alloc_dm@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_alloc_mw@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_alloc_pd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_attach_mcast@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_close_xrcd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_ah@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_counters@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_cq_ex2@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_flow@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_flow_action_esp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_qp_ex2@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_qp_ex@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_rwq_ind_table@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_srq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_srq_ex@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_create_wq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_dealloc_mw@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_dealloc_pd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_dereg_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_ah@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_counters@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_cq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_flow@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_flow_action@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_rwq_ind_table@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_srq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_destroy_wq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_detach_mcast@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_free_dm@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_get_context@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_cq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_flow_action_esp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_qp_ex@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_srq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_modify_wq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_open_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_open_xrcd@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_context@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_device_any@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_port@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_qp@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_query_srq@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_read_counters@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_reg_dm_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_reg_dmabuf_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_reg_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_rereg_mr@IBVERBS_PRIVATE_34' => 0, 'ibv_cmd_resize_cq@IBVERBS_PRIVATE_34' => 0, 'ibv_create_cq@IBVERBS_1.1' => 0, 'ibv_dealloc_pd@IBVERBS_1.1' => 0, 'ibv_dereg_mr@IBVERBS_1.1' => 0, 'ibv_destroy_cq@IBVERBS_1.1' => 0, 'ibv_dofork_range@IBVERBS_1.1' => 0, 'ibv_dontfork_range@IBVERBS_1.1' => 0, 'ibv_get_device_name@IBVERBS_1.1' => 0, 'ibv_qp_to_qp_ex@IBVERBS_1.6' => 0, 'ibv_query_device@IBVERBS_1.1' => 0, 'ibv_query_gid_type@IBVERBS_PRIVATE_34' => 0, 'ibv_query_port@IBVERBS_1.1' => 0, 'ibv_reg_mr@IBVERBS_1.1' => 0, 'ibv_resolve_eth_l2_from_gid@IBVERBS_1.1' => 0, 'ioctl@GLIBC_2.2.5' => 0, 'malloc@GLIBC_2.2.5' => 0, 'memcmp@GLIBC_2.2.5' => 0, 'memcpy@GLIBC_2.14' => 0, 'memset@GLIBC_2.2.5' => 0, 'mmap@GLIBC_2.2.5' => 0, 'munmap@GLIBC_2.2.5' => 0, 'open@GLIBC_2.2.5' => 0, 'poll@GLIBC_2.2.5' => 0, 'posix_memalign@GLIBC_2.2.5' => 0, 'pthread_mutex_destroy@GLIBC_2.2.5' => 0, 'pthread_mutex_init@GLIBC_2.2.5' => 
0, 'pthread_mutex_lock@GLIBC_2.2.5' => 0, 'pthread_mutex_unlock@GLIBC_2.2.5' => 0, 'pthread_spin_destroy@GLIBC_2.34' => 0, 'pthread_spin_init@GLIBC_2.34' => 0, 'pthread_spin_lock@GLIBC_2.34' => 0, 'pthread_spin_unlock@GLIBC_2.34' => 0, 'pwrite@GLIBC_2.2.5' => 0, 'rand_r@GLIBC_2.2.5' => 0, 'read@GLIBC_2.2.5' => 0, 'readlink@GLIBC_2.2.5' => 0, 'realloc@GLIBC_2.2.5' => 0, 'sched_getaffinity@GLIBC_2.3.4' => 0, 'sched_yield@GLIBC_2.2.5' => 0, 'shmat@GLIBC_2.2.5' => 0, 'shmctl@GLIBC_2.2.5' => 0, 'shmdt@GLIBC_2.2.5' => 0, 'shmget@GLIBC_2.2.5' => 0, 'sleep@GLIBC_2.2.5' => 0, 'stat@GLIBC_2.33' => 0, 'stderr@GLIBC_2.2.5' => 0, 'strcasecmp@GLIBC_2.2.5' => 0, 'strchr@GLIBC_2.2.5' => 0, 'strdup@GLIBC_2.2.5' => 0, 'strerror@GLIBC_2.2.5' => 0, 'strlen@GLIBC_2.2.5' => 0, 'strncpy@GLIBC_2.2.5' => 0, 'strrchr@GLIBC_2.2.5' => 0, 'strtol@GLIBC_2.2.5' => 0, 'strtoul@GLIBC_2.2.5' => 0, 'sysconf@GLIBC_2.2.5' => 0, 'time@GLIBC_2.2.5' => 0, 'usleep@GLIBC_2.2.5' => 0, 'verbs_allow_disassociate_destroy@IBVERBS_PRIVATE_34' => 0, 'verbs_init_cq@IBVERBS_PRIVATE_34' => 0, 'verbs_open_device@IBVERBS_PRIVATE_34' => 0, 'verbs_register_driver_34@IBVERBS_PRIVATE_34' => 0, 'verbs_set_ops@IBVERBS_PRIVATE_34' => 0, 'verbs_uninit_context@IBVERBS_PRIVATE_34' => 0, 'write@GLIBC_2.2.5' => 0 } }, 'WordSize' => '8' }; rdma-core-56.1/ABI/rdmacm.dump000066400000000000000000013666751477342711600161070ustar00rootroot00000000000000$VAR1 = { 'ABI_DUMPER_VERSION' => '1.2', 'ABI_DUMP_VERSION' => '3.5', 'Arch' => 'x86_64', 'GccVersion' => '12.3.0', 'Headers' => {}, 'Language' => 'C', 'LibraryName' => 'librdmacm.so.1.3.56.0', 'LibraryVersion' => 'rdmacm', 'MissedOffsets' => '1', 'MissedRegs' => '1', 'NameSpaces' => {}, 'Needed' => { 'ld-linux-x86-64.so.2' => 1, 'libc.so.6' => 1, 'libibverbs.so.1' => 1, 'libnl-3.so.200' => 1 }, 'Sources' => {}, 'SymbolInfo' => { '113434' => { 'Header' => undef, 'Line' => '4129', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'buf', 'type' => '1888' }, '2' => { 'name' => 'count', 'type' => '46' }, '3' => { 'name' => 'offset', 'type' => '79982' }, '4' => { 'name' => 'flags', 'type' => '161' } }, 'Return' => '46', 'ShortName' => 'riowrite' }, '115213' => { 'Header' => undef, 'Line' => '4082', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'buf', 'type' => '82' }, '2' => { 'name' => 'len', 'type' => '46' } }, 'Return' => '161', 'ShortName' => 'riounmap' }, '115747' => { 'Header' => undef, 'Line' => '4033', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'buf', 'type' => '82' }, '2' => { 'name' => 'len', 'type' => '46' }, '3' => { 'name' => 'prot', 'type' => '161' }, '4' => { 'name' => 'flags', 'type' => '161' }, '5' => { 'name' => 'offset', 'type' => '79982' } }, 'Return' => '79982', 'ShortName' => 'riomap' }, '116991' => { 'Header' => undef, 'Line' => '3976', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'cmd', 'type' => '161' }, '2' => { 'type' => '-1' } }, 'Return' => '161', 'ShortName' => 'rfcntl' }, '117454' => { 'Header' => undef, 'Line' => '3831', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'level', 'type' => '161' }, '2' => { 'name' => 'optname', 'type' => '161' }, '3' => { 'name' => 'optval', 'type' => '82' }, '4' => { 'name' => 'optlen', 'type' => '14413' } }, 'Return' => '161', 'ShortName' => 'rgetsockopt' }, '118451' => { 'Header' => undef, 'Line' => '3650', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 
'level', 'type' => '161' }, '2' => { 'name' => 'optname', 'type' => '161' }, '3' => { 'name' => 'optval', 'type' => '1888' }, '4' => { 'name' => 'optlen', 'type' => '1110' } }, 'Return' => '161', 'ShortName' => 'rsetsockopt' }, '120137' => { 'Header' => undef, 'Line' => '3608', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'addr', 'type' => '1883' }, '2' => { 'name' => 'addrlen', 'type' => '14413' } }, 'Return' => '161', 'ShortName' => 'rgetsockname' }, '120782' => { 'Header' => undef, 'Line' => '3593', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'addr', 'type' => '1883' }, '2' => { 'name' => 'addrlen', 'type' => '14413' } }, 'Return' => '161', 'ShortName' => 'rgetpeername' }, '121029' => { 'Header' => undef, 'Line' => '3555', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'rclose' }, '121616' => { 'Header' => undef, 'Line' => '3484', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'how', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'rshutdown' }, '12234' => { 'Header' => undef, 'Line' => '748', 'Param' => { '0' => { 'name' => 'node', 'type' => '1356' }, '1' => { 'name' => 'service', 'type' => '1356' }, '2' => { 'name' => 'hints', 'type' => '7867' }, '3' => { 'name' => 'res', 'type' => '7862' } }, 'Return' => '161', 'ShortName' => 'rdma_getaddrinfo' }, '122346' => { 'Header' => undef, 'Line' => '3453', 'Param' => { '0' => { 'name' => 'nfds', 'type' => '161' }, '1' => { 'name' => 'readfds', 'type' => '123303' }, '2' => { 'name' => 'writefds', 'type' => '123303' }, '3' => { 'name' => 'exceptfds', 'type' => '123303' }, '4' => { 'name' => 'timeout', 'type' => '123308' } }, 'Return' => '161', 'ShortName' => 'rselect' }, '123666' => { 'Header' => undef, 'Line' => '3338', 'Param' => { '0' => { 'name' => 'fds', 'type' => '100026' }, '1' => { 'name' => 'nfds', 'type' => '98765' }, '2' => { 'name' => 'timeout', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'rpoll' }, '127594' => { 'Header' => undef, 'Line' => '3035', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'iov', 'type' => '127729' }, '2' => { 'name' => 'iovcnt', 'type' => '161' } }, 'Return' => '767', 'ShortName' => 'rwritev' }, '127734' => { 'Header' => undef, 'Line' => '3030', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'buf', 'type' => '1888' }, '2' => { 'name' => 'count', 'type' => '46' } }, 'Return' => '767', 'ShortName' => 'rwrite' }, '127869' => { 'Header' => undef, 'Line' => '3022', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'msg', 'type' => '103196' }, '2' => { 'name' => 'flags', 'type' => '161' } }, 'Return' => '767', 'ShortName' => 'rsendmsg' }, '129474' => { 'Header' => undef, 'Line' => '2881', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'buf', 'type' => '1888' }, '2' => { 'name' => 'len', 'type' => '46' }, '3' => { 'name' => 'flags', 'type' => '161' }, '4' => { 'name' => 'dest_addr', 'type' => '4573' }, '5' => { 'name' => 'addrlen', 'type' => '1110' } }, 'Return' => '767', 'ShortName' => 'rsendto' }, '130289' => { 'Header' => undef, 'Line' => '2792', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'buf', 'type' => '1888' }, '2' => { 'name' => 'len', 'type' => '46' }, '3' => { 'name' => 'flags', 'type' => '161' } }, 'Return' => '767', 'ShortName' => 'rsend' }, '134562' => { 'Header' => 
undef, 'Line' => '2634', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'iov', 'type' => '127729' }, '2' => { 'name' => 'iovcnt', 'type' => '161' } }, 'Return' => '767', 'ShortName' => 'rreadv' }, '134764' => { 'Header' => undef, 'Line' => '2629', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'buf', 'type' => '82' }, '2' => { 'name' => 'count', 'type' => '46' } }, 'Return' => '767', 'ShortName' => 'rread' }, '134899' => { 'Header' => undef, 'Line' => '2621', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'msg', 'type' => '135157' }, '2' => { 'name' => 'flags', 'type' => '161' } }, 'Return' => '767', 'ShortName' => 'rrecvmsg' }, '135233' => { 'Header' => undef, 'Line' => '2589', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'buf', 'type' => '82' }, '2' => { 'name' => 'len', 'type' => '46' }, '3' => { 'name' => 'flags', 'type' => '161' }, '4' => { 'name' => 'src_addr', 'type' => '1883' }, '5' => { 'name' => 'addrlen', 'type' => '14413' } }, 'Return' => '767', 'ShortName' => 'rrecvfrom' }, '135813' => { 'Header' => undef, 'Line' => '2518', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'buf', 'type' => '82' }, '2' => { 'name' => 'len', 'type' => '46' }, '3' => { 'name' => 'flags', 'type' => '161' } }, 'Return' => '767', 'ShortName' => 'rrecv' }, '145096' => { 'Header' => undef, 'Line' => '1717', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'addr', 'type' => '4573' }, '2' => { 'name' => 'addrlen', 'type' => '1110' } }, 'Return' => '161', 'ShortName' => 'rconnect' }, '150949' => { 'Header' => undef, 'Line' => '1352', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'addr', 'type' => '1883' }, '2' => { 'name' => 'addrlen', 'type' => '14413' } }, 'Return' => '161', 'ShortName' => 'raccept' }, '151604' => { 'Header' => undef, 'Line' => '1261', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'backlog', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'rlisten' }, '152060' => { 'Header' => undef, 'Line' => '1238', 'Param' => { '0' => { 'name' => 'socket', 'type' => '161' }, '1' => { 'name' => 'addr', 'type' => '4573' }, '2' => { 'name' => 'addrlen', 'type' => '1110' } }, 'Return' => '161', 'ShortName' => 'rbind' }, '152456' => { 'Header' => undef, 'Line' => '1196', 'Param' => { '0' => { 'name' => 'domain', 'type' => '161' }, '1' => { 'name' => 'type', 'type' => '161' }, '2' => { 'name' => 'protocol', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'rsocket' }, '40777' => { 'Header' => undef, 'Line' => '2929', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'ece', 'type' => '33828' } }, 'Return' => '161', 'ShortName' => 'rdma_get_remote_ece' }, '40921' => { 'Header' => undef, 'Line' => '2915', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'ece', 'type' => '33828' } }, 'Return' => '161', 'ShortName' => 'rdma_set_local_ece' }, '41065' => { 'Header' => undef, 'Line' => '2910', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' } }, 'Return' => '2055', 'ShortName' => 'rdma_get_dst_port' }, '41143' => { 'Header' => undef, 'Line' => '2905', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' } }, 'Return' => '2055', 'ShortName' => 'rdma_get_src_port' }, '4148' => { 'Header' => undef, 'Line' => '752', 'Param' => { '0' => { 'name' => 'res', 'type' => '3170' } }, 
'Return' => '1', 'ShortName' => 'rdma_freeaddrinfo' }, '41934' => { 'Header' => undef, 'Line' => '2853', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' } }, 'Return' => '1', 'ShortName' => 'rdma_destroy_ep' }, '42085' => { 'Header' => undef, 'Line' => '2789', 'Param' => { '0' => { 'name' => 'id', 'type' => '43044' }, '1' => { 'name' => 'res', 'type' => '3170' }, '2' => { 'name' => 'pd', 'type' => '22955' }, '3' => { 'name' => 'qp_init_attr', 'type' => '33813' } }, 'Return' => '161', 'ShortName' => 'rdma_create_ep' }, '43144' => { 'Header' => undef, 'Line' => '2705', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'channel', 'type' => '32397' } }, 'Return' => '161', 'ShortName' => 'rdma_migrate_id' }, '43766' => { 'Header' => undef, 'Line' => '735', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'level', 'type' => '161' }, '2' => { 'name' => 'optname', 'type' => '161' }, '3' => { 'name' => 'optval', 'type' => '82' }, '4' => { 'name' => 'optlen', 'type' => '46' } }, 'Return' => '161', 'ShortName' => 'rdma_set_option' }, '44094' => { 'Header' => undef, 'Line' => '2643', 'Param' => { '0' => { 'name' => 'event', 'type' => '31707' } }, 'Return' => '1356', 'ShortName' => 'rdma_event_str' }, '44143' => { 'Header' => undef, 'Line' => '2498', 'Param' => { '0' => { 'name' => 'channel', 'type' => '32397' }, '1' => { 'name' => 'event', 'type' => '47987' } }, 'Return' => '161', 'ShortName' => 'rdma_get_cm_event' }, '47997' => { 'Header' => undef, 'Line' => '2486', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' } }, 'Return' => '161', 'ShortName' => 'rdma_establish' }, '49058' => { 'Header' => undef, 'Line' => '2236', 'Param' => { '0' => { 'name' => 'event', 'type' => '32480' } }, 'Return' => '161', 'ShortName' => 'rdma_ack_cm_event' }, '49159' => { 'Header' => undef, 'Line' => '2168', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'addr', 'type' => '1883' } }, 'Return' => '161', 'ShortName' => 'rdma_leave_multicast' }, '50003' => { 'Header' => undef, 'Line' => '2155', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'addr', 'type' => '1883' }, '2' => { 'name' => 'context', 'type' => '82' } }, 'Return' => '161', 'ShortName' => 'rdma_join_multicast' }, '50246' => { 'Header' => undef, 'Line' => '2131', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'mc_join_attr', 'type' => '50531' }, '2' => { 'name' => 'context', 'type' => '82' } }, 'Return' => '161', 'ShortName' => 'rdma_join_multicast_ex' }, '51760' => { 'Header' => undef, 'Line' => '2036', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' } }, 'Return' => '161', 'ShortName' => 'rdma_disconnect' }, '52500' => { 'Header' => undef, 'Line' => '2003', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'event', 'type' => '21106' } }, 'Return' => '161', 'ShortName' => 'rdma_notify' }, '52767' => { 'Header' => undef, 'Line' => '1996', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'private_data', 'type' => '1888' }, '2' => { 'name' => 'private_data_len', 'type' => '789' } }, 'Return' => '161', 'ShortName' => 'rdma_reject_ece' }, '52905' => { 'Header' => undef, 'Line' => '524', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'private_data', 'type' => '1888' }, '2' => { 'name' => 'private_data_len', 'type' => '789' } }, 'Return' => '161', 'ShortName' => 'rdma_reject' }, '53467' => { 'Header' => undef, 
'Line' => '506', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'conn_param', 'type' => '48249' } }, 'Return' => '161', 'ShortName' => 'rdma_accept' }, '54387' => { 'Header' => undef, 'Line' => '485', 'Param' => { '0' => { 'name' => 'listen', 'type' => '32753' }, '1' => { 'name' => 'id', 'type' => '43044' } }, 'Return' => '161', 'ShortName' => 'rdma_get_request' }, '54868' => { 'Header' => undef, 'Line' => '480', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'backlog', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'rdma_listen' }, '55165' => { 'Header' => undef, 'Line' => '442', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'conn_param', 'type' => '48249' } }, 'Return' => '161', 'ShortName' => 'rdma_connect' }, '56135' => { 'Header' => undef, 'Line' => '424', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' } }, 'Return' => '1', 'ShortName' => 'rdma_destroy_qp' }, '56218' => { 'Header' => undef, 'Line' => '408', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'pd', 'type' => '22955' }, '2' => { 'name' => 'qp_init_attr', 'type' => '33813' } }, 'Return' => '161', 'ShortName' => 'rdma_create_qp' }, '56530' => { 'Header' => undef, 'Line' => '1622', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'attr', 'type' => '31065' } }, 'Return' => '161', 'ShortName' => 'rdma_create_qp_ex' }, '57709' => { 'Header' => undef, 'Line' => '1560', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' } }, 'Return' => '1', 'ShortName' => 'rdma_destroy_srq' }, '57792' => { 'Header' => undef, 'Line' => '1541', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'pd', 'type' => '22955' }, '2' => { 'name' => 'attr', 'type' => '33818' } }, 'Return' => '161', 'ShortName' => 'rdma_create_srq' }, '58100' => { 'Header' => undef, 'Line' => '1496', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'attr', 'type' => '31125' } }, 'Return' => '161', 'ShortName' => 'rdma_create_srq_ex' }, '60825' => { 'Header' => undef, 'Line' => '1243', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'qp_attr', 'type' => '33823' }, '2' => { 'name' => 'qp_attr_mask', 'type' => '4143' } }, 'Return' => '161', 'ShortName' => 'rdma_init_qp_attr' }, '61203' => { 'Header' => undef, 'Line' => '385', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'timeout_ms', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'rdma_resolve_route' }, '61903' => { 'Header' => undef, 'Line' => '368', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'src_addr', 'type' => '1883' }, '2' => { 'name' => 'dst_addr', 'type' => '1883' }, '3' => { 'name' => 'timeout_ms', 'type' => '161' } }, 'Return' => '161', 'ShortName' => 'rdma_resolve_addr' }, '63751' => { 'Header' => undef, 'Line' => '343', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' }, '1' => { 'name' => 'addr', 'type' => '1883' } }, 'Return' => '161', 'ShortName' => 'rdma_bind_addr' }, '67071' => { 'Header' => undef, 'Line' => '325', 'Param' => { '0' => { 'name' => 'id', 'type' => '32753' } }, 'Return' => '161', 'ShortName' => 'rdma_destroy_id' }, '67651' => { 'Header' => undef, 'Line' => '272', 'Param' => { '0' => { 'name' => 'channel', 'type' => '32397' }, '1' => { 'name' => 'id', 'type' => '43044' }, '2' => { 'name' => 'context', 'type' => '82' }, '3' => { 'name' => 'ps', 'type' => '2913' } 
}, 'Return' => '161', 'ShortName' => 'rdma_create_id' }, '70450' => { 'Header' => undef, 'Line' => '588', 'Param' => { '0' => { 'name' => 'channel', 'type' => '32397' } }, 'Return' => '1', 'ShortName' => 'rdma_destroy_event_channel' }, '70534' => { 'Header' => undef, 'Line' => '567', 'Return' => '32397', 'ShortName' => 'rdma_create_event_channel' }, '70691' => { 'Header' => undef, 'Line' => '543', 'Param' => { '0' => { 'name' => 'list', 'type' => '71165' } }, 'Return' => '1', 'ShortName' => 'rdma_free_devices' }, '71170' => { 'Header' => undef, 'Line' => '497', 'Param' => { '0' => { 'name' => 'num_devices', 'type' => '4143' } }, 'Return' => '71165', 'ShortName' => 'rdma_get_devices' } }, 'SymbolVersion' => { 'raccept' => 'raccept@@RDMACM_1.0', 'rbind' => 'rbind@@RDMACM_1.0', 'rclose' => 'rclose@@RDMACM_1.0', 'rconnect' => 'rconnect@@RDMACM_1.0', 'rdma_accept' => 'rdma_accept@@RDMACM_1.0', 'rdma_ack_cm_event' => 'rdma_ack_cm_event@@RDMACM_1.0', 'rdma_bind_addr' => 'rdma_bind_addr@@RDMACM_1.0', 'rdma_connect' => 'rdma_connect@@RDMACM_1.0', 'rdma_create_ep' => 'rdma_create_ep@@RDMACM_1.0', 'rdma_create_event_channel' => 'rdma_create_event_channel@@RDMACM_1.0', 'rdma_create_id' => 'rdma_create_id@@RDMACM_1.0', 'rdma_create_qp' => 'rdma_create_qp@@RDMACM_1.0', 'rdma_create_qp_ex' => 'rdma_create_qp_ex@@RDMACM_1.0', 'rdma_create_srq' => 'rdma_create_srq@@RDMACM_1.0', 'rdma_create_srq_ex' => 'rdma_create_srq_ex@@RDMACM_1.0', 'rdma_destroy_ep' => 'rdma_destroy_ep@@RDMACM_1.0', 'rdma_destroy_event_channel' => 'rdma_destroy_event_channel@@RDMACM_1.0', 'rdma_destroy_id' => 'rdma_destroy_id@@RDMACM_1.0', 'rdma_destroy_qp' => 'rdma_destroy_qp@@RDMACM_1.0', 'rdma_destroy_srq' => 'rdma_destroy_srq@@RDMACM_1.0', 'rdma_disconnect' => 'rdma_disconnect@@RDMACM_1.0', 'rdma_establish' => 'rdma_establish@@RDMACM_1.2', 'rdma_event_str' => 'rdma_event_str@@RDMACM_1.0', 'rdma_free_devices' => 'rdma_free_devices@@RDMACM_1.0', 'rdma_freeaddrinfo' => 'rdma_freeaddrinfo@@RDMACM_1.0', 'rdma_get_cm_event' => 'rdma_get_cm_event@@RDMACM_1.0', 'rdma_get_devices' => 'rdma_get_devices@@RDMACM_1.0', 'rdma_get_dst_port' => 'rdma_get_dst_port@@RDMACM_1.0', 'rdma_get_remote_ece' => 'rdma_get_remote_ece@@RDMACM_1.3', 'rdma_get_request' => 'rdma_get_request@@RDMACM_1.0', 'rdma_get_src_port' => 'rdma_get_src_port@@RDMACM_1.0', 'rdma_getaddrinfo' => 'rdma_getaddrinfo@@RDMACM_1.0', 'rdma_init_qp_attr' => 'rdma_init_qp_attr@@RDMACM_1.2', 'rdma_join_multicast' => 'rdma_join_multicast@@RDMACM_1.0', 'rdma_join_multicast_ex' => 'rdma_join_multicast_ex@@RDMACM_1.1', 'rdma_leave_multicast' => 'rdma_leave_multicast@@RDMACM_1.0', 'rdma_listen' => 'rdma_listen@@RDMACM_1.0', 'rdma_migrate_id' => 'rdma_migrate_id@@RDMACM_1.0', 'rdma_notify' => 'rdma_notify@@RDMACM_1.0', 'rdma_reject' => 'rdma_reject@@RDMACM_1.0', 'rdma_reject_ece' => 'rdma_reject_ece@@RDMACM_1.3', 'rdma_resolve_addr' => 'rdma_resolve_addr@@RDMACM_1.0', 'rdma_resolve_route' => 'rdma_resolve_route@@RDMACM_1.0', 'rdma_set_local_ece' => 'rdma_set_local_ece@@RDMACM_1.3', 'rdma_set_option' => 'rdma_set_option@@RDMACM_1.0', 'rfcntl' => 'rfcntl@@RDMACM_1.0', 'rgetpeername' => 'rgetpeername@@RDMACM_1.0', 'rgetsockname' => 'rgetsockname@@RDMACM_1.0', 'rgetsockopt' => 'rgetsockopt@@RDMACM_1.0', 'riomap' => 'riomap@@RDMACM_1.0', 'riounmap' => 'riounmap@@RDMACM_1.0', 'riowrite' => 'riowrite@@RDMACM_1.0', 'rlisten' => 'rlisten@@RDMACM_1.0', 'rpoll' => 'rpoll@@RDMACM_1.0', 'rread' => 'rread@@RDMACM_1.0', 'rreadv' => 'rreadv@@RDMACM_1.0', 'rrecv' => 'rrecv@@RDMACM_1.0', 'rrecvfrom' => 
'rrecvfrom@@RDMACM_1.0', 'rrecvmsg' => 'rrecvmsg@@RDMACM_1.0', 'rselect' => 'rselect@@RDMACM_1.0', 'rsend' => 'rsend@@RDMACM_1.0', 'rsendmsg' => 'rsendmsg@@RDMACM_1.0', 'rsendto' => 'rsendto@@RDMACM_1.0', 'rsetsockopt' => 'rsetsockopt@@RDMACM_1.0', 'rshutdown' => 'rshutdown@@RDMACM_1.0', 'rsocket' => 'rsocket@@RDMACM_1.0', 'rwrite' => 'rwrite@@RDMACM_1.0', 'rwritev' => 'rwritev@@RDMACM_1.0' }, 'Symbols' => { 'librdmacm.so.1.3.56.0' => { 'raccept@@RDMACM_1.0' => 1, 'rbind@@RDMACM_1.0' => 1, 'rclose@@RDMACM_1.0' => 1, 'rconnect@@RDMACM_1.0' => 1, 'rdma_accept@@RDMACM_1.0' => 1, 'rdma_ack_cm_event@@RDMACM_1.0' => 1, 'rdma_bind_addr@@RDMACM_1.0' => 1, 'rdma_connect@@RDMACM_1.0' => 1, 'rdma_create_ep@@RDMACM_1.0' => 1, 'rdma_create_event_channel@@RDMACM_1.0' => 1, 'rdma_create_id@@RDMACM_1.0' => 1, 'rdma_create_qp@@RDMACM_1.0' => 1, 'rdma_create_qp_ex@@RDMACM_1.0' => 1, 'rdma_create_srq@@RDMACM_1.0' => 1, 'rdma_create_srq_ex@@RDMACM_1.0' => 1, 'rdma_destroy_ep@@RDMACM_1.0' => 1, 'rdma_destroy_event_channel@@RDMACM_1.0' => 1, 'rdma_destroy_id@@RDMACM_1.0' => 1, 'rdma_destroy_qp@@RDMACM_1.0' => 1, 'rdma_destroy_srq@@RDMACM_1.0' => 1, 'rdma_disconnect@@RDMACM_1.0' => 1, 'rdma_establish@@RDMACM_1.2' => 1, 'rdma_event_str@@RDMACM_1.0' => 1, 'rdma_free_devices@@RDMACM_1.0' => 1, 'rdma_freeaddrinfo@@RDMACM_1.0' => 1, 'rdma_get_cm_event@@RDMACM_1.0' => 1, 'rdma_get_devices@@RDMACM_1.0' => 1, 'rdma_get_dst_port@@RDMACM_1.0' => 1, 'rdma_get_remote_ece@@RDMACM_1.3' => 1, 'rdma_get_request@@RDMACM_1.0' => 1, 'rdma_get_src_port@@RDMACM_1.0' => 1, 'rdma_getaddrinfo@@RDMACM_1.0' => 1, 'rdma_init_qp_attr@@RDMACM_1.2' => 1, 'rdma_join_multicast@@RDMACM_1.0' => 1, 'rdma_join_multicast_ex@@RDMACM_1.1' => 1, 'rdma_leave_multicast@@RDMACM_1.0' => 1, 'rdma_listen@@RDMACM_1.0' => 1, 'rdma_migrate_id@@RDMACM_1.0' => 1, 'rdma_notify@@RDMACM_1.0' => 1, 'rdma_reject@@RDMACM_1.0' => 1, 'rdma_reject_ece@@RDMACM_1.3' => 1, 'rdma_resolve_addr@@RDMACM_1.0' => 1, 'rdma_resolve_route@@RDMACM_1.0' => 1, 'rdma_set_local_ece@@RDMACM_1.3' => 1, 'rdma_set_option@@RDMACM_1.0' => 1, 'rfcntl@@RDMACM_1.0' => 1, 'rgetpeername@@RDMACM_1.0' => 1, 'rgetsockname@@RDMACM_1.0' => 1, 'rgetsockopt@@RDMACM_1.0' => 1, 'riomap@@RDMACM_1.0' => 1, 'riounmap@@RDMACM_1.0' => 1, 'riowrite@@RDMACM_1.0' => 1, 'rlisten@@RDMACM_1.0' => 1, 'rpoll@@RDMACM_1.0' => 1, 'rread@@RDMACM_1.0' => 1, 'rreadv@@RDMACM_1.0' => 1, 'rrecv@@RDMACM_1.0' => 1, 'rrecvfrom@@RDMACM_1.0' => 1, 'rrecvmsg@@RDMACM_1.0' => 1, 'rselect@@RDMACM_1.0' => 1, 'rsend@@RDMACM_1.0' => 1, 'rsendmsg@@RDMACM_1.0' => 1, 'rsendto@@RDMACM_1.0' => 1, 'rsetsockopt@@RDMACM_1.0' => 1, 'rshutdown@@RDMACM_1.0' => 1, 'rsocket@@RDMACM_1.0' => 1, 'rwrite@@RDMACM_1.0' => 1, 'rwritev@@RDMACM_1.0' => 1 } }, 'Target' => 'unix', 'TypeInfo' => { '-1' => { 'Name' => '...', 'Type' => 'Intrinsic' }, '1' => { 'Name' => 'void', 'Type' => 'Intrinsic' }, '100026' => { 'BaseType' => '98777', 'Name' => 'struct pollfd*', 'Size' => '8', 'Type' => 'Pointer' }, '101' => { 'Name' => 'unsigned short', 'Size' => '2', 'Type' => 'Intrinsic' }, '103196' => { 'BaseType' => '81376', 'Name' => 'struct msghdr const*', 'Size' => '8', 'Type' => 'Pointer' }, '1110' => { 'BaseType' => '272', 'Header' => undef, 'Line' => '33', 'Name' => 'socklen_t', 'Size' => '4', 'Type' => 'Typedef' }, '11209' => { 'Header' => undef, 'Line' => '901', 'Memb' => { '0' => { 'name' => 'IBV_QPT_RC', 'value' => '2' }, '1' => { 'name' => 'IBV_QPT_UC', 'value' => '3' }, '2' => { 'name' => 'IBV_QPT_UD', 'value' => '4' }, '3' => { 'name' => 'IBV_QPT_RAW_PACKET', 
'value' => '8' }, '4' => { 'name' => 'IBV_QPT_XRC_SEND', 'value' => '9' }, '5' => { 'name' => 'IBV_QPT_XRC_RECV', 'value' => '10' }, '6' => { 'name' => 'IBV_QPT_DRIVER', 'value' => '255' } }, 'Name' => 'enum ibv_qp_type', 'Size' => '4', 'Type' => 'Enum' }, '1196' => { 'BaseType' => '101', 'Header' => undef, 'Line' => '28', 'Name' => 'sa_family_t', 'Size' => '2', 'Type' => 'Typedef' }, '1208' => { 'Header' => undef, 'Line' => '180', 'Memb' => { '0' => { 'name' => 'sa_family', 'offset' => '0', 'type' => '1196' }, '1' => { 'name' => 'sa_data', 'offset' => '2', 'type' => '1253' } }, 'Name' => 'struct sockaddr', 'Size' => '16', 'Type' => 'Struct' }, '123308' => { 'BaseType' => '80030', 'Name' => 'struct timeval*', 'Size' => '8', 'Type' => 'Pointer' }, '1248' => { 'BaseType' => '1208', 'Name' => 'struct sockaddr const', 'Size' => '16', 'Type' => 'Const' }, '125' => { 'BaseType' => '89', 'Header' => undef, 'Line' => '38', 'Name' => '__uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '1253' => { 'BaseType' => '255', 'Name' => 'char[14]', 'Size' => '14', 'Type' => 'Array' }, '127729' => { 'BaseType' => '80862', 'Name' => 'struct iovec const*', 'Size' => '8', 'Type' => 'Pointer' }, '135157' => { 'BaseType' => '81264', 'Name' => 'struct msghdr*', 'Size' => '8', 'Type' => 'Pointer' }, '1356' => { 'BaseType' => '262', 'Name' => 'char const*', 'Size' => '8', 'Type' => 'Pointer' }, '1366' => { 'BaseType' => '813', 'Header' => undef, 'Line' => '30', 'Name' => 'in_addr_t', 'Size' => '4', 'Type' => 'Typedef' }, '137' => { 'Name' => 'short', 'Size' => '2', 'Type' => 'Intrinsic' }, '1378' => { 'Header' => undef, 'Line' => '31', 'Memb' => { '0' => { 'name' => 's_addr', 'offset' => '0', 'type' => '1366' } }, 'Name' => 'struct in_addr', 'Size' => '4', 'Type' => 'Struct' }, '14413' => { 'BaseType' => '1110', 'Name' => 'socklen_t*', 'Size' => '8', 'Type' => 'Pointer' }, '149' => { 'BaseType' => '101', 'Header' => undef, 'Line' => '40', 'Name' => '__uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '15670' => { 'BaseType' => '185', 'Header' => undef, 'Line' => '160', 'Name' => '__time_t', 'Size' => '8', 'Type' => 'Typedef' }, '1588' => { 'BaseType' => '801', 'Header' => undef, 'Line' => '123', 'Name' => 'in_port_t', 'Size' => '2', 'Type' => 'Typedef' }, '1600' => { 'Header' => undef, 'Line' => '221', 'Memb' => { '0' => { 'name' => '__u6_addr8', 'offset' => '0', 'type' => '1646' }, '1' => { 'name' => '__u6_addr16', 'offset' => '0', 'type' => '1662' }, '2' => { 'name' => '__u6_addr32', 'offset' => '0', 'type' => '1678' } }, 'Size' => '16', 'Type' => 'Union' }, '161' => { 'Name' => 'int', 'Size' => '4', 'Type' => 'Intrinsic' }, '1646' => { 'BaseType' => '789', 'Name' => 'uint8_t[16]', 'Size' => '16', 'Type' => 'Array' }, '1662' => { 'BaseType' => '801', 'Name' => 'uint16_t[8]', 'Size' => '16', 'Type' => 'Array' }, '1678' => { 'BaseType' => '813', 'Name' => 'uint32_t[4]', 'Size' => '16', 'Type' => 'Array' }, '16782' => { 'Header' => undef, 'Line' => '193', 'Memb' => { '0' => { 'name' => 'ss_family', 'offset' => '0', 'type' => '1196' }, '1' => { 'name' => '__ss_padding', 'offset' => '2', 'type' => '16835' }, '2' => { 'name' => '__ss_align', 'offset' => '288', 'type' => '58' } }, 'Name' => 'struct sockaddr_storage', 'Size' => '128', 'Type' => 'Struct' }, '16835' => { 'BaseType' => '255', 'Name' => 'char[118]', 'Size' => '118', 'Type' => 'Array' }, '1694' => { 'Header' => undef, 'Line' => '219', 'Memb' => { '0' => { 'name' => '__in6_u', 'offset' => '0', 'type' => '1600' } }, 'Name' => 'struct in6_addr', 'Size' => '16', 
'Type' => 'Struct' }, '1721' => { 'Header' => undef, 'Line' => '245', 'Memb' => { '0' => { 'name' => 'sin_family', 'offset' => '0', 'type' => '1196' }, '1' => { 'name' => 'sin_port', 'offset' => '2', 'type' => '1588' }, '2' => { 'name' => 'sin_addr', 'offset' => '4', 'type' => '1378' }, '3' => { 'name' => 'sin_zero', 'offset' => '8', 'type' => '1787' } }, 'Name' => 'struct sockaddr_in', 'Size' => '16', 'Type' => 'Struct' }, '173' => { 'BaseType' => '70', 'Header' => undef, 'Line' => '42', 'Name' => '__uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '1787' => { 'BaseType' => '89', 'Name' => 'unsigned char[8]', 'Size' => '8', 'Type' => 'Array' }, '1803' => { 'Header' => undef, 'Line' => '260', 'Memb' => { '0' => { 'name' => 'sin6_family', 'offset' => '0', 'type' => '1196' }, '1' => { 'name' => 'sin6_port', 'offset' => '2', 'type' => '1588' }, '2' => { 'name' => 'sin6_flowinfo', 'offset' => '4', 'type' => '813' }, '3' => { 'name' => 'sin6_addr', 'offset' => '8', 'type' => '1694' }, '4' => { 'name' => 'sin6_scope_id', 'offset' => '36', 'type' => '813' } }, 'Name' => 'struct sockaddr_in6', 'Size' => '28', 'Type' => 'Struct' }, '185' => { 'Name' => 'long', 'Size' => '8', 'Type' => 'Intrinsic' }, '1883' => { 'BaseType' => '1208', 'Name' => 'struct sockaddr*', 'Size' => '8', 'Type' => 'Pointer' }, '18857' => { 'Header' => undef, 'Line' => '95', 'Memb' => { '0' => { 'name' => 'IBV_NODE_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_NODE_CA', 'value' => '1' }, '2' => { 'name' => 'IBV_NODE_SWITCH', 'value' => '2' }, '3' => { 'name' => 'IBV_NODE_ROUTER', 'value' => '3' }, '4' => { 'name' => 'IBV_NODE_RNIC', 'value' => '4' }, '5' => { 'name' => 'IBV_NODE_USNIC', 'value' => '5' }, '6' => { 'name' => 'IBV_NODE_USNIC_UDP', 'value' => '6' }, '7' => { 'name' => 'IBV_NODE_UNSPECIFIED', 'value' => '7' } }, 'Name' => 'enum ibv_node_type', 'Size' => '4', 'Type' => 'Enum' }, '1888' => { 'BaseType' => '1898', 'Name' => 'void const*', 'Size' => '8', 'Type' => 'Pointer' }, '18921' => { 'Header' => undef, 'Line' => '106', 'Memb' => { '0' => { 'name' => 'IBV_TRANSPORT_UNKNOWN', 'value' => '18446744073709551615 (-1)' }, '1' => { 'name' => 'IBV_TRANSPORT_IB', 'value' => '0' }, '2' => { 'name' => 'IBV_TRANSPORT_IWARP', 'value' => '1' }, '3' => { 'name' => 'IBV_TRANSPORT_USNIC', 'value' => '2' }, '4' => { 'name' => 'IBV_TRANSPORT_USNIC_UDP', 'value' => '3' }, '5' => { 'name' => 'IBV_TRANSPORT_UNSPECIFIED', 'value' => '4' } }, 'Name' => 'enum ibv_transport_type', 'Size' => '4', 'Type' => 'Enum' }, '18973' => { 'Header' => undef, 'Line' => '155', 'Memb' => { '0' => { 'name' => 'IBV_ATOMIC_NONE', 'value' => '0' }, '1' => { 'name' => 'IBV_ATOMIC_HCA', 'value' => '1' }, '2' => { 'name' => 'IBV_ATOMIC_GLOB', 'value' => '2' } }, 'Name' => 'enum ibv_atomic_cap', 'Size' => '4', 'Type' => 'Enum' }, '1898' => { 'BaseType' => '1', 'Name' => 'void const', 'Type' => 'Const' }, '19140' => { 'Header' => undef, 'Line' => '2037', 'Memb' => { '0' => { 'name' => 'device', 'offset' => '0', 'type' => '28129' }, '1' => { 'name' => 'ops', 'offset' => '8', 'type' => '28311' }, '2' => { 'name' => 'cmd_fd', 'offset' => '612', 'type' => '161' }, '3' => { 'name' => 'async_fd', 'offset' => '616', 'type' => '161' }, '4' => { 'name' => 'num_comp_vectors', 'offset' => '626', 'type' => '161' }, '5' => { 'name' => 'mutex', 'offset' => '640', 'type' => '16386' }, '6' => { 'name' => 'abi_compat', 'offset' => '800', 'type' => '82' } }, 'Name' => 'struct ibv_context', 'Size' => '328', 'Type' => 'Struct' }, '19258' => { 'BaseType' 
=> '19140', 'Name' => 'struct ibv_context*', 'Size' => '8', 'Type' => 'Pointer' }, '19338' => { 'Header' => undef, 'Line' => '182', 'Memb' => { '0' => { 'name' => 'fw_ver', 'offset' => '0', 'type' => '19872' }, '1' => { 'name' => 'node_guid', 'offset' => '100', 'type' => '2079' }, '10' => { 'name' => 'device_cap_flags', 'offset' => '278', 'type' => '70' }, '11' => { 'name' => 'max_sge', 'offset' => '288', 'type' => '161' }, '12' => { 'name' => 'max_sge_rd', 'offset' => '292', 'type' => '161' }, '13' => { 'name' => 'max_cq', 'offset' => '296', 'type' => '161' }, '14' => { 'name' => 'max_cqe', 'offset' => '306', 'type' => '161' }, '15' => { 'name' => 'max_mr', 'offset' => '310', 'type' => '161' }, '16' => { 'name' => 'max_pd', 'offset' => '320', 'type' => '161' }, '17' => { 'name' => 'max_qp_rd_atom', 'offset' => '324', 'type' => '161' }, '18' => { 'name' => 'max_ee_rd_atom', 'offset' => '328', 'type' => '161' }, '19' => { 'name' => 'max_res_rd_atom', 'offset' => '338', 'type' => '161' }, '2' => { 'name' => 'sys_image_guid', 'offset' => '114', 'type' => '2079' }, '20' => { 'name' => 'max_qp_init_rd_atom', 'offset' => '342', 'type' => '161' }, '21' => { 'name' => 'max_ee_init_rd_atom', 'offset' => '352', 'type' => '161' }, '22' => { 'name' => 'atomic_cap', 'offset' => '356', 'type' => '18973' }, '23' => { 'name' => 'max_ee', 'offset' => '360', 'type' => '161' }, '24' => { 'name' => 'max_rdd', 'offset' => '370', 'type' => '161' }, '25' => { 'name' => 'max_mw', 'offset' => '374', 'type' => '161' }, '26' => { 'name' => 'max_raw_ipv6_qp', 'offset' => '384', 'type' => '161' }, '27' => { 'name' => 'max_raw_ethy_qp', 'offset' => '388', 'type' => '161' }, '28' => { 'name' => 'max_mcast_grp', 'offset' => '392', 'type' => '161' }, '29' => { 'name' => 'max_mcast_qp_attach', 'offset' => '402', 'type' => '161' }, '3' => { 'name' => 'max_mr_size', 'offset' => '128', 'type' => '825' }, '30' => { 'name' => 'max_total_mcast_qp_attach', 'offset' => '406', 'type' => '161' }, '31' => { 'name' => 'max_ah', 'offset' => '512', 'type' => '161' }, '32' => { 'name' => 'max_fmr', 'offset' => '516', 'type' => '161' }, '33' => { 'name' => 'max_map_per_fmr', 'offset' => '520', 'type' => '161' }, '34' => { 'name' => 'max_srq', 'offset' => '530', 'type' => '161' }, '35' => { 'name' => 'max_srq_wr', 'offset' => '534', 'type' => '161' }, '36' => { 'name' => 'max_srq_sge', 'offset' => '544', 'type' => '161' }, '37' => { 'name' => 'max_pkeys', 'offset' => '548', 'type' => '801' }, '38' => { 'name' => 'local_ca_ack_delay', 'offset' => '550', 'type' => '789' }, '39' => { 'name' => 'phys_port_cnt', 'offset' => '551', 'type' => '789' }, '4' => { 'name' => 'page_size_cap', 'offset' => '136', 'type' => '825' }, '5' => { 'name' => 'vendor_id', 'offset' => '150', 'type' => '813' }, '6' => { 'name' => 'vendor_part_id', 'offset' => '256', 'type' => '813' }, '7' => { 'name' => 'hw_ver', 'offset' => '260', 'type' => '813' }, '8' => { 'name' => 'max_qp', 'offset' => '264', 'type' => '161' }, '9' => { 'name' => 'max_qp_wr', 'offset' => '274', 'type' => '161' } }, 'Name' => 'struct ibv_device_attr', 'Size' => '232', 'Type' => 'Struct' }, '197' => { 'BaseType' => '58', 'Header' => undef, 'Line' => '45', 'Name' => '__uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '19872' => { 'BaseType' => '255', 'Name' => 'char[64]', 'Size' => '64', 'Type' => 'Array' }, '2019' => { 'BaseType' => '101', 'Header' => undef, 'Line' => '24', 'Name' => '__u16', 'Size' => '2', 'Type' => 'Typedef' }, '2031' => { 'BaseType' => '70', 'Header' => undef, 'Line' => '27', 
'Name' => '__u32', 'Size' => '4', 'Type' => 'Typedef' }, '2043' => { 'BaseType' => '837', 'Header' => undef, 'Line' => '31', 'Name' => '__u64', 'Size' => '8', 'Type' => 'Typedef' }, '2055' => { 'BaseType' => '2019', 'Header' => undef, 'Line' => '25', 'Name' => '__be16', 'Size' => '2', 'Type' => 'Typedef' }, '20639' => { 'Header' => undef, 'Line' => '364', 'Memb' => { '0' => { 'name' => 'IBV_MTU_256', 'value' => '1' }, '1' => { 'name' => 'IBV_MTU_512', 'value' => '2' }, '2' => { 'name' => 'IBV_MTU_1024', 'value' => '3' }, '3' => { 'name' => 'IBV_MTU_2048', 'value' => '4' }, '4' => { 'name' => 'IBV_MTU_4096', 'value' => '5' } }, 'Name' => 'enum ibv_mtu', 'Size' => '4', 'Type' => 'Enum' }, '2067' => { 'BaseType' => '2031', 'Header' => undef, 'Line' => '27', 'Name' => '__be32', 'Size' => '4', 'Type' => 'Typedef' }, '2079' => { 'BaseType' => '2043', 'Header' => undef, 'Line' => '29', 'Name' => '__be64', 'Size' => '8', 'Type' => 'Typedef' }, '209' => { 'BaseType' => '185', 'Header' => undef, 'Line' => '152', 'Name' => '__off_t', 'Size' => '8', 'Type' => 'Typedef' }, '21106' => { 'Header' => undef, 'Line' => '451', 'Memb' => { '0' => { 'name' => 'IBV_EVENT_CQ_ERR', 'value' => '0' }, '1' => { 'name' => 'IBV_EVENT_QP_FATAL', 'value' => '1' }, '10' => { 'name' => 'IBV_EVENT_PORT_ERR', 'value' => '10' }, '11' => { 'name' => 'IBV_EVENT_LID_CHANGE', 'value' => '11' }, '12' => { 'name' => 'IBV_EVENT_PKEY_CHANGE', 'value' => '12' }, '13' => { 'name' => 'IBV_EVENT_SM_CHANGE', 'value' => '13' }, '14' => { 'name' => 'IBV_EVENT_SRQ_ERR', 'value' => '14' }, '15' => { 'name' => 'IBV_EVENT_SRQ_LIMIT_REACHED', 'value' => '15' }, '16' => { 'name' => 'IBV_EVENT_QP_LAST_WQE_REACHED', 'value' => '16' }, '17' => { 'name' => 'IBV_EVENT_CLIENT_REREGISTER', 'value' => '17' }, '18' => { 'name' => 'IBV_EVENT_GID_CHANGE', 'value' => '18' }, '19' => { 'name' => 'IBV_EVENT_WQ_FATAL', 'value' => '19' }, '2' => { 'name' => 'IBV_EVENT_QP_REQ_ERR', 'value' => '2' }, '3' => { 'name' => 'IBV_EVENT_QP_ACCESS_ERR', 'value' => '3' }, '4' => { 'name' => 'IBV_EVENT_COMM_EST', 'value' => '4' }, '5' => { 'name' => 'IBV_EVENT_SQ_DRAINED', 'value' => '5' }, '6' => { 'name' => 'IBV_EVENT_PATH_MIG', 'value' => '6' }, '7' => { 'name' => 'IBV_EVENT_PATH_MIG_ERR', 'value' => '7' }, '8' => { 'name' => 'IBV_EVENT_DEVICE_FATAL', 'value' => '8' }, '9' => { 'name' => 'IBV_EVENT_PORT_ACTIVE', 'value' => '9' } }, 'Name' => 'enum ibv_event_type', 'Size' => '4', 'Type' => 'Enum' }, '21243' => { 'Header' => undef, 'Line' => '1508', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '19258' }, '1' => { 'name' => 'channel', 'offset' => '8', 'type' => '26787' }, '2' => { 'name' => 'cq_context', 'offset' => '22', 'type' => '82' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '813' }, '4' => { 'name' => 'cqe', 'offset' => '40', 'type' => '161' }, '5' => { 'name' => 'mutex', 'offset' => '50', 'type' => '16386' }, '6' => { 'name' => 'cond', 'offset' => '114', 'type' => '16460' }, '7' => { 'name' => 'comp_events_completed', 'offset' => '288', 'type' => '813' }, '8' => { 'name' => 'async_events_completed', 'offset' => '292', 'type' => '813' } }, 'Name' => 'struct ibv_cq', 'Size' => '128', 'Type' => 'Struct' }, '21383' => { 'BaseType' => '21243', 'Name' => 'struct ibv_cq*', 'Size' => '8', 'Type' => 'Pointer' }, '21388' => { 'Header' => undef, 'Line' => '1283', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '19258' }, '1' => { 'name' => 'qp_context', 'offset' => '8', 'type' => '82' }, '10' => { 'name' => 'mutex', 
'offset' => '100', 'type' => '16386' }, '11' => { 'name' => 'cond', 'offset' => '260', 'type' => '16460' }, '12' => { 'name' => 'events_completed', 'offset' => '338', 'type' => '813' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '22955' }, '3' => { 'name' => 'send_cq', 'offset' => '36', 'type' => '21383' }, '4' => { 'name' => 'recv_cq', 'offset' => '50', 'type' => '21383' }, '5' => { 'name' => 'srq', 'offset' => '64', 'type' => '21699' }, '6' => { 'name' => 'handle', 'offset' => '72', 'type' => '813' }, '7' => { 'name' => 'qp_num', 'offset' => '82', 'type' => '813' }, '8' => { 'name' => 'state', 'offset' => '86', 'type' => '24917' }, '9' => { 'name' => 'qp_type', 'offset' => '96', 'type' => '11209' } }, 'Name' => 'struct ibv_qp', 'Size' => '160', 'Type' => 'Struct' }, '21583' => { 'BaseType' => '21388', 'Name' => 'struct ibv_qp*', 'Size' => '8', 'Type' => 'Pointer' }, '21588' => { 'Header' => undef, 'Line' => '1243', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '19258' }, '1' => { 'name' => 'srq_context', 'offset' => '8', 'type' => '82' }, '2' => { 'name' => 'pd', 'offset' => '22', 'type' => '22955' }, '3' => { 'name' => 'handle', 'offset' => '36', 'type' => '813' }, '4' => { 'name' => 'mutex', 'offset' => '50', 'type' => '16386' }, '5' => { 'name' => 'cond', 'offset' => '114', 'type' => '16460' }, '6' => { 'name' => 'events_completed', 'offset' => '288', 'type' => '813' } }, 'Name' => 'struct ibv_srq', 'Size' => '128', 'Type' => 'Struct' }, '21699' => { 'BaseType' => '21588', 'Name' => 'struct ibv_srq*', 'Size' => '8', 'Type' => 'Pointer' }, '21903' => { 'Header' => undef, 'Line' => '485', 'Memb' => { '0' => { 'name' => 'IBV_WC_SUCCESS', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_LOC_LEN_ERR', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_REM_ACCESS_ERR', 'value' => '10' }, '11' => { 'name' => 'IBV_WC_REM_OP_ERR', 'value' => '11' }, '12' => { 'name' => 'IBV_WC_RETRY_EXC_ERR', 'value' => '12' }, '13' => { 'name' => 'IBV_WC_RNR_RETRY_EXC_ERR', 'value' => '13' }, '14' => { 'name' => 'IBV_WC_LOC_RDD_VIOL_ERR', 'value' => '14' }, '15' => { 'name' => 'IBV_WC_REM_INV_RD_REQ_ERR', 'value' => '15' }, '16' => { 'name' => 'IBV_WC_REM_ABORT_ERR', 'value' => '16' }, '17' => { 'name' => 'IBV_WC_INV_EECN_ERR', 'value' => '17' }, '18' => { 'name' => 'IBV_WC_INV_EEC_STATE_ERR', 'value' => '18' }, '19' => { 'name' => 'IBV_WC_FATAL_ERR', 'value' => '19' }, '2' => { 'name' => 'IBV_WC_LOC_QP_OP_ERR', 'value' => '2' }, '20' => { 'name' => 'IBV_WC_RESP_TIMEOUT_ERR', 'value' => '20' }, '21' => { 'name' => 'IBV_WC_GENERAL_ERR', 'value' => '21' }, '22' => { 'name' => 'IBV_WC_TM_ERR', 'value' => '22' }, '23' => { 'name' => 'IBV_WC_TM_RNDV_INCOMPLETE', 'value' => '23' }, '3' => { 'name' => 'IBV_WC_LOC_EEC_OP_ERR', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_LOC_PROT_ERR', 'value' => '4' }, '5' => { 'name' => 'IBV_WC_WR_FLUSH_ERR', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_MW_BIND_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_BAD_RESP_ERR', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_LOC_ACCESS_ERR', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_REM_INV_REQ_ERR', 'value' => '9' } }, 'Name' => 'enum ibv_wc_status', 'Size' => '4', 'Type' => 'Enum' }, '22064' => { 'Header' => undef, 'Line' => '513', 'Memb' => { '0' => { 'name' => 'IBV_WC_SEND', 'value' => '0' }, '1' => { 'name' => 'IBV_WC_RDMA_WRITE', 'value' => '1' }, '10' => { 'name' => 'IBV_WC_RECV', 'value' => '128' }, '11' => { 'name' => 'IBV_WC_RECV_RDMA_WITH_IMM', 'value' => '129' }, '12' => { 'name' => 
'IBV_WC_TM_ADD', 'value' => '130' }, '13' => { 'name' => 'IBV_WC_TM_DEL', 'value' => '131' }, '14' => { 'name' => 'IBV_WC_TM_SYNC', 'value' => '132' }, '15' => { 'name' => 'IBV_WC_TM_RECV', 'value' => '133' }, '16' => { 'name' => 'IBV_WC_TM_NO_TAG', 'value' => '134' }, '17' => { 'name' => 'IBV_WC_DRIVER1', 'value' => '135' }, '18' => { 'name' => 'IBV_WC_DRIVER2', 'value' => '136' }, '19' => { 'name' => 'IBV_WC_DRIVER3', 'value' => '137' }, '2' => { 'name' => 'IBV_WC_RDMA_READ', 'value' => '2' }, '3' => { 'name' => 'IBV_WC_COMP_SWAP', 'value' => '3' }, '4' => { 'name' => 'IBV_WC_FETCH_ADD', 'value' => '4' }, '5' => { 'name' => 'IBV_WC_BIND_MW', 'value' => '5' }, '6' => { 'name' => 'IBV_WC_LOCAL_INV', 'value' => '6' }, '7' => { 'name' => 'IBV_WC_TSO', 'value' => '7' }, '8' => { 'name' => 'IBV_WC_FLUSH', 'value' => '8' }, '9' => { 'name' => 'IBV_WC_ATOMIC_WRITE', 'value' => '9' } }, 'Name' => 'enum ibv_wc_opcode', 'Size' => '4', 'Type' => 'Enum' }, '22332' => { 'Header' => undef, 'Line' => '598', 'Memb' => { '0' => { 'name' => 'imm_data', 'offset' => '0', 'type' => '2067' }, '1' => { 'name' => 'invalidated_rkey', 'offset' => '0', 'type' => '813' } }, 'Size' => '4', 'Type' => 'Union' }, '22367' => { 'Header' => undef, 'Line' => '589', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '825' }, '1' => { 'name' => 'status', 'offset' => '8', 'type' => '21903' }, '10' => { 'name' => 'slid', 'offset' => '66', 'type' => '801' }, '11' => { 'name' => 'sl', 'offset' => '68', 'type' => '789' }, '12' => { 'name' => 'dlid_path_bits', 'offset' => '69', 'type' => '789' }, '2' => { 'name' => 'opcode', 'offset' => '18', 'type' => '22064' }, '3' => { 'name' => 'vendor_err', 'offset' => '22', 'type' => '813' }, '4' => { 'name' => 'byte_len', 'offset' => '32', 'type' => '813' }, '5' => { 'name' => 'unnamed0', 'offset' => '36', 'type' => '22332' }, '6' => { 'name' => 'qp_num', 'offset' => '40', 'type' => '813' }, '7' => { 'name' => 'src_qp', 'offset' => '50', 'type' => '813' }, '8' => { 'name' => 'wc_flags', 'offset' => '54', 'type' => '70' }, '9' => { 'name' => 'pkey_index', 'offset' => '64', 'type' => '801' } }, 'Name' => 'struct ibv_wc', 'Size' => '48', 'Type' => 'Struct' }, '22554' => { 'Header' => undef, 'Line' => '625', 'Memb' => { '0' => { 'name' => 'mr', 'offset' => '0', 'type' => '22734' }, '1' => { 'name' => 'addr', 'offset' => '8', 'type' => '825' }, '2' => { 'name' => 'length', 'offset' => '22', 'type' => '825' }, '3' => { 'name' => 'mw_access_flags', 'offset' => '36', 'type' => '70' } }, 'Name' => 'struct ibv_mw_bind_info', 'Size' => '32', 'Type' => 'Struct' }, '22623' => { 'Header' => undef, 'Line' => '668', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '19258' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '22955' }, '2' => { 'name' => 'addr', 'offset' => '22', 'type' => '82' }, '3' => { 'name' => 'length', 'offset' => '36', 'type' => '46' }, '4' => { 'name' => 'handle', 'offset' => '50', 'type' => '813' }, '5' => { 'name' => 'lkey', 'offset' => '54', 'type' => '813' }, '6' => { 'name' => 'rkey', 'offset' => '64', 'type' => '813' } }, 'Name' => 'struct ibv_mr', 'Size' => '48', 'Type' => 'Struct' }, '22734' => { 'BaseType' => '22623', 'Name' => 'struct ibv_mr*', 'Size' => '8', 'Type' => 'Pointer' }, '22739' => { 'Header' => undef, 'Line' => '632', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '19258' }, '1' => { 'name' => 'handle', 'offset' => '8', 'type' => '813' } }, 'Name' => 'struct ibv_pd', 'Size' => '16', 'Type' => 'Struct' 
}, '22927' => { 'Header' => undef, 'Line' => '657', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '19258' } }, 'Name' => 'struct ibv_xrcd', 'Size' => '8', 'Type' => 'Struct' }, '22955' => { 'BaseType' => '22739', 'Name' => 'struct ibv_pd*', 'Size' => '8', 'Type' => 'Pointer' }, '22960' => { 'Header' => undef, 'Line' => '678', 'Memb' => { '0' => { 'name' => 'IBV_MW_TYPE_1', 'value' => '1' }, '1' => { 'name' => 'IBV_MW_TYPE_2', 'value' => '2' } }, 'Name' => 'enum ibv_mw_type', 'Size' => '4', 'Type' => 'Enum' }, '22989' => { 'Header' => undef, 'Line' => '683', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '19258' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '22955' }, '2' => { 'name' => 'rkey', 'offset' => '22', 'type' => '813' }, '3' => { 'name' => 'handle', 'offset' => '32', 'type' => '813' }, '4' => { 'name' => 'type', 'offset' => '36', 'type' => '22960' } }, 'Name' => 'struct ibv_mw', 'Size' => '32', 'Type' => 'Struct' }, '23072' => { 'Header' => undef, 'Line' => '691', 'Memb' => { '0' => { 'name' => 'dgid', 'offset' => '0', 'type' => '2428' }, '1' => { 'name' => 'flow_label', 'offset' => '22', 'type' => '813' }, '2' => { 'name' => 'sgid_index', 'offset' => '32', 'type' => '789' }, '3' => { 'name' => 'hop_limit', 'offset' => '33', 'type' => '789' }, '4' => { 'name' => 'traffic_class', 'offset' => '34', 'type' => '789' } }, 'Name' => 'struct ibv_global_route', 'Size' => '24', 'Type' => 'Struct' }, '23156' => { 'Header' => undef, 'Line' => '762', 'Memb' => { '0' => { 'name' => 'grh', 'offset' => '0', 'type' => '23072' }, '1' => { 'name' => 'dlid', 'offset' => '36', 'type' => '801' }, '2' => { 'name' => 'sl', 'offset' => '38', 'type' => '789' }, '3' => { 'name' => 'src_path_bits', 'offset' => '39', 'type' => '789' }, '4' => { 'name' => 'static_rate', 'offset' => '40', 'type' => '789' }, '5' => { 'name' => 'is_global', 'offset' => '41', 'type' => '789' }, '6' => { 'name' => 'port_num', 'offset' => '48', 'type' => '789' } }, 'Name' => 'struct ibv_ah_attr', 'Size' => '32', 'Type' => 'Struct' }, '23267' => { 'Header' => undef, 'Line' => '777', 'Memb' => { '0' => { 'name' => 'max_wr', 'offset' => '0', 'type' => '813' }, '1' => { 'name' => 'max_sge', 'offset' => '4', 'type' => '813' }, '2' => { 'name' => 'srq_limit', 'offset' => '8', 'type' => '813' } }, 'Name' => 'struct ibv_srq_attr', 'Size' => '12', 'Type' => 'Struct' }, '233' => { 'BaseType' => '185', 'Header' => undef, 'Line' => '194', 'Name' => '__ssize_t', 'Size' => '8', 'Type' => 'Typedef' }, '23323' => { 'Header' => undef, 'Line' => '783', 'Memb' => { '0' => { 'name' => 'srq_context', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'attr', 'offset' => '8', 'type' => '23267' } }, 'Name' => 'struct ibv_srq_init_attr', 'Size' => '24', 'Type' => 'Struct' }, '23365' => { 'Header' => undef, 'Line' => '788', 'Memb' => { '0' => { 'name' => 'IBV_SRQT_BASIC', 'value' => '0' }, '1' => { 'name' => 'IBV_SRQT_XRC', 'value' => '1' }, '2' => { 'name' => 'IBV_SRQT_TM', 'value' => '2' } }, 'Name' => 'enum ibv_srq_type', 'Size' => '4', 'Type' => 'Enum' }, '23453' => { 'Header' => undef, 'Line' => '803', 'Memb' => { '0' => { 'name' => 'max_num_tags', 'offset' => '0', 'type' => '813' }, '1' => { 'name' => 'max_ops', 'offset' => '4', 'type' => '813' } }, 'Name' => 'struct ibv_tm_cap', 'Size' => '8', 'Type' => 'Struct' }, '23495' => { 'Header' => undef, 'Line' => '808', 'Memb' => { '0' => { 'name' => 'srq_context', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'attr', 'offset' => '8', 'type' 
=> '23267' }, '2' => { 'name' => 'comp_mask', 'offset' => '32', 'type' => '813' }, '3' => { 'name' => 'srq_type', 'offset' => '36', 'type' => '23365' }, '4' => { 'name' => 'pd', 'offset' => '50', 'type' => '22955' }, '5' => { 'name' => 'xrcd', 'offset' => '64', 'type' => '23619' }, '6' => { 'name' => 'cq', 'offset' => '72', 'type' => '21383' }, '7' => { 'name' => 'tm_cap', 'offset' => '86', 'type' => '23453' } }, 'Name' => 'struct ibv_srq_init_attr_ex', 'Size' => '64', 'Type' => 'Struct' }, '23619' => { 'BaseType' => '22927', 'Name' => 'struct ibv_xrcd*', 'Size' => '8', 'Type' => 'Pointer' }, '23896' => { 'Header' => undef, 'Line' => '880', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '19258' }, '1' => { 'name' => 'ind_tbl_handle', 'offset' => '8', 'type' => '161' }, '2' => { 'name' => 'ind_tbl_num', 'offset' => '18', 'type' => '161' }, '3' => { 'name' => 'comp_mask', 'offset' => '22', 'type' => '813' } }, 'Name' => 'struct ibv_rwq_ind_table', 'Size' => '24', 'Type' => 'Struct' }, '2392' => { 'Header' => undef, 'Line' => '67', 'Memb' => { '0' => { 'name' => 'subnet_prefix', 'offset' => '0', 'type' => '2079' }, '1' => { 'name' => 'interface_id', 'offset' => '8', 'type' => '2079' } }, 'Size' => '16', 'Type' => 'Struct' }, '24086' => { 'Header' => undef, 'Line' => '911', 'Memb' => { '0' => { 'name' => 'max_send_wr', 'offset' => '0', 'type' => '813' }, '1' => { 'name' => 'max_recv_wr', 'offset' => '4', 'type' => '813' }, '2' => { 'name' => 'max_send_sge', 'offset' => '8', 'type' => '813' }, '3' => { 'name' => 'max_recv_sge', 'offset' => '18', 'type' => '813' }, '4' => { 'name' => 'max_inline_data', 'offset' => '22', 'type' => '813' } }, 'Name' => 'struct ibv_qp_cap', 'Size' => '20', 'Type' => 'Struct' }, '24170' => { 'Header' => undef, 'Line' => '919', 'Memb' => { '0' => { 'name' => 'qp_context', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'send_cq', 'offset' => '8', 'type' => '21383' }, '2' => { 'name' => 'recv_cq', 'offset' => '22', 'type' => '21383' }, '3' => { 'name' => 'srq', 'offset' => '36', 'type' => '21699' }, '4' => { 'name' => 'cap', 'offset' => '50', 'type' => '24086' }, '5' => { 'name' => 'qp_type', 'offset' => '82', 'type' => '11209' }, '6' => { 'name' => 'sq_sig_all', 'offset' => '86', 'type' => '161' } }, 'Name' => 'struct ibv_qp_init_attr', 'Size' => '64', 'Type' => 'Struct' }, '2428' => { 'Header' => undef, 'Line' => '65', 'Memb' => { '0' => { 'name' => 'raw', 'offset' => '0', 'type' => '1646' }, '1' => { 'name' => 'global', 'offset' => '0', 'type' => '2392' } }, 'Name' => 'union ibv_gid', 'Size' => '16', 'Type' => 'Union' }, '24341' => { 'Header' => undef, 'Line' => '963', 'Memb' => { '0' => { 'name' => 'rx_hash_function', 'offset' => '0', 'type' => '789' }, '1' => { 'name' => 'rx_hash_key_len', 'offset' => '1', 'type' => '789' }, '2' => { 'name' => 'rx_hash_key', 'offset' => '8', 'type' => '24411' }, '3' => { 'name' => 'rx_hash_fields_mask', 'offset' => '22', 'type' => '825' } }, 'Name' => 'struct ibv_rx_hash_conf', 'Size' => '24', 'Type' => 'Struct' }, '24411' => { 'BaseType' => '789', 'Name' => 'uint8_t*', 'Size' => '8', 'Type' => 'Pointer' }, '24416' => { 'Header' => undef, 'Line' => '972', 'Memb' => { '0' => { 'name' => 'qp_context', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'send_cq', 'offset' => '8', 'type' => '21383' }, '10' => { 'name' => 'create_flags', 'offset' => '128', 'type' => '813' }, '11' => { 'name' => 'max_tso_header', 'offset' => '132', 'type' => '801' }, '12' => { 'name' => 'rwq_ind_tbl', 'offset' => '136', 
'type' => '24653' }, '13' => { 'name' => 'rx_hash_conf', 'offset' => '150', 'type' => '24341' }, '14' => { 'name' => 'source_qpn', 'offset' => '288', 'type' => '813' }, '15' => { 'name' => 'send_ops_flags', 'offset' => '296', 'type' => '825' }, '2' => { 'name' => 'recv_cq', 'offset' => '22', 'type' => '21383' }, '3' => { 'name' => 'srq', 'offset' => '36', 'type' => '21699' }, '4' => { 'name' => 'cap', 'offset' => '50', 'type' => '24086' }, '5' => { 'name' => 'qp_type', 'offset' => '82', 'type' => '11209' }, '6' => { 'name' => 'sq_sig_all', 'offset' => '86', 'type' => '161' }, '7' => { 'name' => 'comp_mask', 'offset' => '96', 'type' => '813' }, '8' => { 'name' => 'pd', 'offset' => '100', 'type' => '22955' }, '9' => { 'name' => 'xrcd', 'offset' => '114', 'type' => '23619' } }, 'Name' => 'struct ibv_qp_init_attr_ex', 'Size' => '136', 'Type' => 'Struct' }, '245' => { 'BaseType' => '255', 'Name' => 'char*', 'Size' => '8', 'Type' => 'Pointer' }, '24653' => { 'BaseType' => '23896', 'Name' => 'struct ibv_rwq_ind_table*', 'Size' => '8', 'Type' => 'Pointer' }, '24917' => { 'Header' => undef, 'Line' => '1050', 'Memb' => { '0' => { 'name' => 'IBV_QPS_RESET', 'value' => '0' }, '1' => { 'name' => 'IBV_QPS_INIT', 'value' => '1' }, '2' => { 'name' => 'IBV_QPS_RTR', 'value' => '2' }, '3' => { 'name' => 'IBV_QPS_RTS', 'value' => '3' }, '4' => { 'name' => 'IBV_QPS_SQD', 'value' => '4' }, '5' => { 'name' => 'IBV_QPS_SQE', 'value' => '5' }, '6' => { 'name' => 'IBV_QPS_ERR', 'value' => '6' }, '7' => { 'name' => 'IBV_QPS_UNKNOWN', 'value' => '7' } }, 'Name' => 'enum ibv_qp_state', 'Size' => '4', 'Type' => 'Enum' }, '24982' => { 'Header' => undef, 'Line' => '1061', 'Memb' => { '0' => { 'name' => 'IBV_MIG_MIGRATED', 'value' => '0' }, '1' => { 'name' => 'IBV_MIG_REARM', 'value' => '1' }, '2' => { 'name' => 'IBV_MIG_ARMED', 'value' => '2' } }, 'Name' => 'enum ibv_mig_state', 'Size' => '4', 'Type' => 'Enum' }, '25017' => { 'Header' => undef, 'Line' => '1067', 'Memb' => { '0' => { 'name' => 'qp_state', 'offset' => '0', 'type' => '24917' }, '1' => { 'name' => 'cur_qp_state', 'offset' => '4', 'type' => '24917' }, '10' => { 'name' => 'ah_attr', 'offset' => '86', 'type' => '23156' }, '11' => { 'name' => 'alt_ah_attr', 'offset' => '136', 'type' => '23156' }, '12' => { 'name' => 'pkey_index', 'offset' => '288', 'type' => '801' }, '13' => { 'name' => 'alt_pkey_index', 'offset' => '290', 'type' => '801' }, '14' => { 'name' => 'en_sqd_async_notify', 'offset' => '292', 'type' => '789' }, '15' => { 'name' => 'sq_draining', 'offset' => '293', 'type' => '789' }, '16' => { 'name' => 'max_rd_atomic', 'offset' => '294', 'type' => '789' }, '17' => { 'name' => 'max_dest_rd_atomic', 'offset' => '295', 'type' => '789' }, '18' => { 'name' => 'min_rnr_timer', 'offset' => '296', 'type' => '789' }, '19' => { 'name' => 'port_num', 'offset' => '297', 'type' => '789' }, '2' => { 'name' => 'path_mtu', 'offset' => '8', 'type' => '20639' }, '20' => { 'name' => 'timeout', 'offset' => '304', 'type' => '789' }, '21' => { 'name' => 'retry_cnt', 'offset' => '305', 'type' => '789' }, '22' => { 'name' => 'rnr_retry', 'offset' => '306', 'type' => '789' }, '23' => { 'name' => 'alt_port_num', 'offset' => '307', 'type' => '789' }, '24' => { 'name' => 'alt_timeout', 'offset' => '308', 'type' => '789' }, '25' => { 'name' => 'rate_limit', 'offset' => '310', 'type' => '813' }, '3' => { 'name' => 'path_mig_state', 'offset' => '18', 'type' => '24982' }, '4' => { 'name' => 'qkey', 'offset' => '22', 'type' => '813' }, '5' => { 'name' => 'rq_psn', 'offset' => '32', 
'type' => '813' }, '6' => { 'name' => 'sq_psn', 'offset' => '36', 'type' => '813' }, '7' => { 'name' => 'dest_qp_num', 'offset' => '40', 'type' => '813' }, '8' => { 'name' => 'qp_access_flags', 'offset' => '50', 'type' => '70' }, '9' => { 'name' => 'cap', 'offset' => '54', 'type' => '24086' } }, 'Name' => 'struct ibv_qp_attr', 'Size' => '144', 'Type' => 'Struct' }, '25465' => { 'Header' => undef, 'Line' => '1103', 'Memb' => { '0' => { 'name' => 'IBV_WR_RDMA_WRITE', 'value' => '0' }, '1' => { 'name' => 'IBV_WR_RDMA_WRITE_WITH_IMM', 'value' => '1' }, '10' => { 'name' => 'IBV_WR_TSO', 'value' => '10' }, '11' => { 'name' => 'IBV_WR_DRIVER1', 'value' => '11' }, '12' => { 'name' => 'IBV_WR_FLUSH', 'value' => '14' }, '13' => { 'name' => 'IBV_WR_ATOMIC_WRITE', 'value' => '15' }, '2' => { 'name' => 'IBV_WR_SEND', 'value' => '2' }, '3' => { 'name' => 'IBV_WR_SEND_WITH_IMM', 'value' => '3' }, '4' => { 'name' => 'IBV_WR_RDMA_READ', 'value' => '4' }, '5' => { 'name' => 'IBV_WR_ATOMIC_CMP_AND_SWP', 'value' => '5' }, '6' => { 'name' => 'IBV_WR_ATOMIC_FETCH_AND_ADD', 'value' => '6' }, '7' => { 'name' => 'IBV_WR_LOCAL_INV', 'value' => '7' }, '8' => { 'name' => 'IBV_WR_BIND_MW', 'value' => '8' }, '9' => { 'name' => 'IBV_WR_SEND_WITH_INV', 'value' => '9' } }, 'Name' => 'enum ibv_wr_opcode', 'Size' => '4', 'Type' => 'Enum' }, '255' => { 'Name' => 'char', 'Size' => '1', 'Type' => 'Intrinsic' }, '25566' => { 'Header' => undef, 'Line' => '1145', 'Memb' => { '0' => { 'name' => 'addr', 'offset' => '0', 'type' => '825' }, '1' => { 'name' => 'length', 'offset' => '8', 'type' => '813' }, '2' => { 'name' => 'lkey', 'offset' => '18', 'type' => '813' } }, 'Name' => 'struct ibv_sge', 'Size' => '16', 'Type' => 'Struct' }, '25622' => { 'Header' => undef, 'Line' => '1161', 'Memb' => { '0' => { 'name' => 'imm_data', 'offset' => '0', 'type' => '2067' }, '1' => { 'name' => 'invalidate_rkey', 'offset' => '0', 'type' => '813' } }, 'Size' => '4', 'Type' => 'Union' }, '25657' => { 'Header' => undef, 'Line' => '1166', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '825' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '813' } }, 'Size' => '16', 'Type' => 'Struct' }, '25695' => { 'Header' => undef, 'Line' => '1170', 'Memb' => { '0' => { 'name' => 'remote_addr', 'offset' => '0', 'type' => '825' }, '1' => { 'name' => 'compare_add', 'offset' => '8', 'type' => '825' }, '2' => { 'name' => 'swap', 'offset' => '22', 'type' => '825' }, '3' => { 'name' => 'rkey', 'offset' => '36', 'type' => '813' } }, 'Size' => '32', 'Type' => 'Struct' }, '25761' => { 'Header' => undef, 'Line' => '1176', 'Memb' => { '0' => { 'name' => 'ah', 'offset' => '0', 'type' => '25867' }, '1' => { 'name' => 'remote_qpn', 'offset' => '8', 'type' => '813' }, '2' => { 'name' => 'remote_qkey', 'offset' => '18', 'type' => '813' } }, 'Size' => '16', 'Type' => 'Struct' }, '25812' => { 'Header' => undef, 'Line' => '1695', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '19258' }, '1' => { 'name' => 'pd', 'offset' => '8', 'type' => '22955' }, '2' => { 'name' => 'handle', 'offset' => '22', 'type' => '813' } }, 'Name' => 'struct ibv_ah', 'Size' => '24', 'Type' => 'Struct' }, '25867' => { 'BaseType' => '25812', 'Name' => 'struct ibv_ah*', 'Size' => '8', 'Type' => 'Pointer' }, '25872' => { 'Header' => undef, 'Line' => '1165', 'Memb' => { '0' => { 'name' => 'rdma', 'offset' => '0', 'type' => '25657' }, '1' => { 'name' => 'atomic', 'offset' => '0', 'type' => '25695' }, '2' => { 'name' => 'ud', 'offset' => '0', 'type' => '25761' } 
}, 'Size' => '32', 'Type' => 'Union' }, '25919' => { 'Header' => undef, 'Line' => '1183', 'Memb' => { '0' => { 'name' => 'remote_srqn', 'offset' => '0', 'type' => '813' } }, 'Size' => '4', 'Type' => 'Struct' }, '25943' => { 'Header' => undef, 'Line' => '1182', 'Memb' => { '0' => { 'name' => 'xrc', 'offset' => '0', 'type' => '25919' } }, 'Size' => '4', 'Type' => 'Union' }, '25965' => { 'Header' => undef, 'Line' => '1188', 'Memb' => { '0' => { 'name' => 'mw', 'offset' => '0', 'type' => '26016' }, '1' => { 'name' => 'rkey', 'offset' => '8', 'type' => '813' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '22554' } }, 'Size' => '48', 'Type' => 'Struct' }, '26016' => { 'BaseType' => '22989', 'Name' => 'struct ibv_mw*', 'Size' => '8', 'Type' => 'Pointer' }, '26021' => { 'Header' => undef, 'Line' => '1193', 'Memb' => { '0' => { 'name' => 'hdr', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'hdr_sz', 'offset' => '8', 'type' => '801' }, '2' => { 'name' => 'mss', 'offset' => '16', 'type' => '801' } }, 'Size' => '16', 'Type' => 'Struct' }, '26073' => { 'Header' => undef, 'Line' => '1187', 'Memb' => { '0' => { 'name' => 'bind_mw', 'offset' => '0', 'type' => '25965' }, '1' => { 'name' => 'tso', 'offset' => '0', 'type' => '26021' } }, 'Size' => '48', 'Type' => 'Union' }, '26108' => { 'Header' => undef, 'Line' => '1151', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '825' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '26245' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '26250' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '161' }, '4' => { 'name' => 'opcode', 'offset' => '40', 'type' => '25465' }, '5' => { 'name' => 'send_flags', 'offset' => '50', 'type' => '70' }, '6' => { 'name' => 'unnamed0', 'offset' => '54', 'type' => '25622' }, '7' => { 'name' => 'wr', 'offset' => '64', 'type' => '25872' }, '8' => { 'name' => 'qp_type', 'offset' => '114', 'type' => '25943' }, '9' => { 'name' => 'unnamed1', 'offset' => '128', 'type' => '26073' } }, 'Name' => 'struct ibv_send_wr', 'Size' => '128', 'Type' => 'Struct' }, '262' => { 'BaseType' => '255', 'Name' => 'char const', 'Size' => '1', 'Type' => 'Const' }, '26245' => { 'BaseType' => '26108', 'Name' => 'struct ibv_send_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '26250' => { 'BaseType' => '25566', 'Name' => 'struct ibv_sge*', 'Size' => '8', 'Type' => 'Pointer' }, '26255' => { 'Header' => undef, 'Line' => '1201', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '825' }, '1' => { 'name' => 'next', 'offset' => '8', 'type' => '26325' }, '2' => { 'name' => 'sg_list', 'offset' => '22', 'type' => '26250' }, '3' => { 'name' => 'num_sge', 'offset' => '36', 'type' => '161' } }, 'Name' => 'struct ibv_recv_wr', 'Size' => '32', 'Type' => 'Struct' }, '26325' => { 'BaseType' => '26255', 'Name' => 'struct ibv_recv_wr*', 'Size' => '8', 'Type' => 'Pointer' }, '26585' => { 'Header' => undef, 'Line' => '1237', 'Memb' => { '0' => { 'name' => 'wr_id', 'offset' => '0', 'type' => '825' }, '1' => { 'name' => 'send_flags', 'offset' => '8', 'type' => '70' }, '2' => { 'name' => 'bind_info', 'offset' => '22', 'type' => '22554' } }, 'Name' => 'struct ibv_mw_bind', 'Size' => '48', 'Type' => 'Struct' }, '26666' => { 'BaseType' => '26325', 'Name' => 'struct ibv_recv_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '26676' => { 'Header' => undef, 'Line' => '1487', 'Memb' => { '0' => { 'name' => 'vendor_id', 'offset' => '0', 'type' => '813' }, '1' => { 'name' => 'options', 'offset' => '4', 'type' => 
'813' }, '2' => { 'name' => 'comp_mask', 'offset' => '8', 'type' => '813' } }, 'Name' => 'struct ibv_ece', 'Size' => '12', 'Type' => 'Struct' }, '26732' => { 'Header' => undef, 'Line' => '1502', 'Memb' => { '0' => { 'name' => 'context', 'offset' => '0', 'type' => '19258' }, '1' => { 'name' => 'fd', 'offset' => '8', 'type' => '161' }, '2' => { 'name' => 'refcnt', 'offset' => '18', 'type' => '161' } }, 'Name' => 'struct ibv_comp_channel', 'Size' => '16', 'Type' => 'Struct' }, '26787' => { 'BaseType' => '26732', 'Name' => 'struct ibv_comp_channel*', 'Size' => '8', 'Type' => 'Pointer' }, '272' => { 'BaseType' => '70', 'Header' => undef, 'Line' => '210', 'Name' => '__socklen_t', 'Size' => '4', 'Type' => 'Typedef' }, '28067' => { 'Header' => undef, 'Line' => '1969', 'Memb' => { '0' => { 'name' => '_dummy1', 'offset' => '0', 'type' => '28248' }, '1' => { 'name' => '_dummy2', 'offset' => '8', 'type' => '28264' } }, 'Name' => 'struct _ibv_device_ops', 'Size' => '16', 'Type' => 'Struct' }, '28129' => { 'BaseType' => '28134', 'Name' => 'struct ibv_device*', 'Size' => '8', 'Type' => 'Pointer' }, '28134' => { 'Header' => undef, 'Line' => '1979', 'Memb' => { '0' => { 'name' => '_ops', 'offset' => '0', 'type' => '28067' }, '1' => { 'name' => 'node_type', 'offset' => '22', 'type' => '18857' }, '2' => { 'name' => 'transport_type', 'offset' => '32', 'type' => '18921' }, '3' => { 'name' => 'name', 'offset' => '36', 'type' => '19872' }, '4' => { 'name' => 'dev_name', 'offset' => '136', 'type' => '19872' }, '5' => { 'name' => 'dev_path', 'offset' => '338', 'type' => '28295' }, '6' => { 'name' => 'ibdev_path', 'offset' => '1032', 'type' => '28295' } }, 'Name' => 'struct ibv_device', 'Size' => '664', 'Type' => 'Struct' }, '28248' => { 'Name' => 'struct ibv_context*(*)(struct ibv_device*, int)', 'Param' => { '0' => { 'type' => '28129' }, '1' => { 'type' => '161' } }, 'Return' => '19258', 'Size' => '8', 'Type' => 'FuncPtr' }, '28264' => { 'Name' => 'void(*)(struct ibv_context*)', 'Param' => { '0' => { 'type' => '19258' } }, 'Return' => '1', 'Size' => '8', 'Type' => 'FuncPtr' }, '28295' => { 'BaseType' => '255', 'Name' => 'char[256]', 'Size' => '256', 'Type' => 'Array' }, '28311' => { 'Header' => undef, 'Line' => '1994', 'Memb' => { '0' => { 'name' => '_compat_query_device', 'offset' => '0', 'type' => '28799' }, '1' => { 'name' => '_compat_query_port', 'offset' => '8', 'type' => '28839' }, '10' => { 'name' => '_compat_create_cq', 'offset' => '128', 'type' => '28849' }, '11' => { 'name' => 'poll_cq', 'offset' => '136', 'type' => '28964' }, '12' => { 'name' => 'req_notify_cq', 'offset' => '150', 'type' => '28989' }, '13' => { 'name' => '_compat_cq_event', 'offset' => '260', 'type' => '28849' }, '14' => { 'name' => '_compat_resize_cq', 'offset' => '274', 'type' => '28849' }, '15' => { 'name' => '_compat_destroy_cq', 'offset' => '288', 'type' => '28849' }, '16' => { 'name' => '_compat_create_srq', 'offset' => '296', 'type' => '28849' }, '17' => { 'name' => '_compat_modify_srq', 'offset' => '310', 'type' => '28849' }, '18' => { 'name' => '_compat_query_srq', 'offset' => '324', 'type' => '28849' }, '19' => { 'name' => '_compat_destroy_srq', 'offset' => '338', 'type' => '28849' }, '2' => { 'name' => '_compat_alloc_pd', 'offset' => '22', 'type' => '28849' }, '20' => { 'name' => 'post_srq_recv', 'offset' => '352', 'type' => '29019' }, '21' => { 'name' => '_compat_create_qp', 'offset' => '360', 'type' => '28849' }, '22' => { 'name' => '_compat_query_qp', 'offset' => '374', 'type' => '28849' }, '23' => { 'name' => 
'_compat_modify_qp', 'offset' => '388', 'type' => '28849' }, '24' => { 'name' => '_compat_destroy_qp', 'offset' => '402', 'type' => '28849' }, '25' => { 'name' => 'post_send', 'offset' => '512', 'type' => '29054' }, '26' => { 'name' => 'post_recv', 'offset' => '520', 'type' => '29084' }, '27' => { 'name' => '_compat_create_ah', 'offset' => '534', 'type' => '28849' }, '28' => { 'name' => '_compat_destroy_ah', 'offset' => '548', 'type' => '28849' }, '29' => { 'name' => '_compat_attach_mcast', 'offset' => '562', 'type' => '28849' }, '3' => { 'name' => '_compat_dealloc_pd', 'offset' => '36', 'type' => '28849' }, '30' => { 'name' => '_compat_detach_mcast', 'offset' => '576', 'type' => '28849' }, '31' => { 'name' => '_compat_async_event', 'offset' => '584', 'type' => '28849' }, '4' => { 'name' => '_compat_reg_mr', 'offset' => '50', 'type' => '28849' }, '5' => { 'name' => '_compat_rereg_mr', 'offset' => '64', 'type' => '28849' }, '6' => { 'name' => '_compat_dereg_mr', 'offset' => '72', 'type' => '28849' }, '7' => { 'name' => 'alloc_mw', 'offset' => '86', 'type' => '28874' }, '8' => { 'name' => 'bind_mw', 'offset' => '100', 'type' => '28909' }, '9' => { 'name' => 'dealloc_mw', 'offset' => '114', 'type' => '28929' } }, 'Name' => 'struct ibv_context_ops', 'Size' => '256', 'Type' => 'Struct' }, '28794' => { 'BaseType' => '19338', 'Name' => 'struct ibv_device_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '28799' => { 'Name' => 'int(*)(struct ibv_context*, struct ibv_device_attr*)', 'Param' => { '0' => { 'type' => '19258' }, '1' => { 'type' => '28794' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '28829' => { 'BaseType' => '28834', 'Name' => 'struct _compat_ibv_port_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '28834' => { 'Name' => 'struct _compat_ibv_port_attr', 'Type' => 'Struct' }, '28839' => { 'Name' => 'int(*)(struct ibv_context*, uint8_t, struct _compat_ibv_port_attr*)', 'Param' => { '0' => { 'type' => '19258' }, '1' => { 'type' => '789' }, '2' => { 'type' => '28829' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '28849' => { 'Name' => 'void*(*)()', 'Return' => '82', 'Size' => '8', 'Type' => 'FuncPtr' }, '28874' => { 'Name' => 'struct ibv_mw*(*)(struct ibv_pd*, enum ibv_mw_type)', 'Param' => { '0' => { 'type' => '22955' }, '1' => { 'type' => '22960' } }, 'Return' => '26016', 'Size' => '8', 'Type' => 'FuncPtr' }, '28904' => { 'BaseType' => '26585', 'Name' => 'struct ibv_mw_bind*', 'Size' => '8', 'Type' => 'Pointer' }, '28909' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_mw*, struct ibv_mw_bind*)', 'Param' => { '0' => { 'type' => '21583' }, '1' => { 'type' => '26016' }, '2' => { 'type' => '28904' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '28929' => { 'Name' => 'int(*)(struct ibv_mw*)', 'Param' => { '0' => { 'type' => '26016' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '28959' => { 'BaseType' => '22367', 'Name' => 'struct ibv_wc*', 'Size' => '8', 'Type' => 'Pointer' }, '28964' => { 'Name' => 'int(*)(struct ibv_cq*, int, struct ibv_wc*)', 'Param' => { '0' => { 'type' => '21383' }, '1' => { 'type' => '161' }, '2' => { 'type' => '28959' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '28989' => { 'Name' => 'int(*)(struct ibv_cq*, int)', 'Param' => { '0' => { 'type' => '21383' }, '1' => { 'type' => '161' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '29019' => { 'Name' => 'int(*)(struct ibv_srq*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '21699' }, '1' => { 'type' 
=> '26325' }, '2' => { 'type' => '26666' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '29049' => { 'BaseType' => '26245', 'Name' => 'struct ibv_send_wr**', 'Size' => '8', 'Type' => 'Pointer' }, '29054' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_send_wr*, struct ibv_send_wr**)', 'Param' => { '0' => { 'type' => '21583' }, '1' => { 'type' => '26245' }, '2' => { 'type' => '29049' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '29084' => { 'Name' => 'int(*)(struct ibv_qp*, struct ibv_recv_wr*, struct ibv_recv_wr**)', 'Param' => { '0' => { 'type' => '21583' }, '1' => { 'type' => '26325' }, '2' => { 'type' => '26666' } }, 'Return' => '161', 'Size' => '8', 'Type' => 'FuncPtr' }, '2913' => { 'Header' => undef, 'Line' => '69', 'Memb' => { '0' => { 'name' => 'RDMA_PS_IPOIB', 'value' => '2' }, '1' => { 'name' => 'RDMA_PS_TCP', 'value' => '262' }, '2' => { 'name' => 'RDMA_PS_UDP', 'value' => '273' }, '3' => { 'name' => 'RDMA_PS_IB', 'value' => '319' } }, 'Name' => 'enum rdma_port_space', 'Size' => '4', 'Type' => 'Enum' }, '2956' => { 'Header' => undef, 'Line' => '182', 'Memb' => { '0' => { 'name' => 'ai_flags', 'offset' => '0', 'type' => '161' }, '1' => { 'name' => 'ai_family', 'offset' => '4', 'type' => '161' }, '10' => { 'name' => 'ai_route_len', 'offset' => '86', 'type' => '46' }, '11' => { 'name' => 'ai_route', 'offset' => '100', 'type' => '82' }, '12' => { 'name' => 'ai_connect_len', 'offset' => '114', 'type' => '46' }, '13' => { 'name' => 'ai_connect', 'offset' => '128', 'type' => '82' }, '14' => { 'name' => 'ai_next', 'offset' => '136', 'type' => '3170' }, '2' => { 'name' => 'ai_qp_type', 'offset' => '8', 'type' => '161' }, '3' => { 'name' => 'ai_port_space', 'offset' => '18', 'type' => '161' }, '4' => { 'name' => 'ai_src_len', 'offset' => '22', 'type' => '1110' }, '5' => { 'name' => 'ai_dst_len', 'offset' => '32', 'type' => '1110' }, '6' => { 'name' => 'ai_src_addr', 'offset' => '36', 'type' => '1883' }, '7' => { 'name' => 'ai_dst_addr', 'offset' => '50', 'type' => '1883' }, '8' => { 'name' => 'ai_src_canonname', 'offset' => '64', 'type' => '245' }, '9' => { 'name' => 'ai_dst_canonname', 'offset' => '72', 'type' => '245' } }, 'Name' => 'struct rdma_addrinfo', 'Size' => '96', 'Type' => 'Struct' }, '31065' => { 'BaseType' => '24416', 'Name' => 'struct ibv_qp_init_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '31125' => { 'BaseType' => '23495', 'Name' => 'struct ibv_srq_init_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '31185' => { 'Header' => undef, 'Line' => '40', 'Memb' => { '0' => { 'name' => 'dgid', 'offset' => '0', 'type' => '2428' }, '1' => { 'name' => 'sgid', 'offset' => '22', 'type' => '2428' }, '10' => { 'name' => 'pkey', 'offset' => '84', 'type' => '2055' }, '11' => { 'name' => 'sl', 'offset' => '86', 'type' => '789' }, '12' => { 'name' => 'mtu_selector', 'offset' => '87', 'type' => '789' }, '13' => { 'name' => 'mtu', 'offset' => '88', 'type' => '789' }, '14' => { 'name' => 'rate_selector', 'offset' => '89', 'type' => '789' }, '15' => { 'name' => 'rate', 'offset' => '96', 'type' => '789' }, '16' => { 'name' => 'packet_life_time_selector', 'offset' => '97', 'type' => '789' }, '17' => { 'name' => 'packet_life_time', 'offset' => '98', 'type' => '789' }, '18' => { 'name' => 'preference', 'offset' => '99', 'type' => '789' }, '2' => { 'name' => 'dlid', 'offset' => '50', 'type' => '2055' }, '3' => { 'name' => 'slid', 'offset' => '52', 'type' => '2055' }, '4' => { 'name' => 'raw_traffic', 'offset' => '54', 'type' => '161' }, '5' => { 'name' => 
'flow_label', 'offset' => '64', 'type' => '2067' }, '6' => { 'name' => 'hop_limit', 'offset' => '68', 'type' => '789' }, '7' => { 'name' => 'traffic_class', 'offset' => '69', 'type' => '789' }, '8' => { 'name' => 'reversible', 'offset' => '72', 'type' => '161' }, '9' => { 'name' => 'numb_path', 'offset' => '82', 'type' => '789' } }, 'Name' => 'struct ibv_sa_path_rec', 'Size' => '64', 'Type' => 'Struct' }, '3165' => { 'BaseType' => '2956', 'Name' => 'struct rdma_addrinfo const', 'Size' => '96', 'Type' => 'Const' }, '3170' => { 'BaseType' => '2956', 'Name' => 'struct rdma_addrinfo*', 'Size' => '8', 'Type' => 'Pointer' }, '31707' => { 'Header' => undef, 'Line' => '50', 'Memb' => { '0' => { 'name' => 'RDMA_CM_EVENT_ADDR_RESOLVED', 'value' => '0' }, '1' => { 'name' => 'RDMA_CM_EVENT_ADDR_ERROR', 'value' => '1' }, '10' => { 'name' => 'RDMA_CM_EVENT_DISCONNECTED', 'value' => '10' }, '11' => { 'name' => 'RDMA_CM_EVENT_DEVICE_REMOVAL', 'value' => '11' }, '12' => { 'name' => 'RDMA_CM_EVENT_MULTICAST_JOIN', 'value' => '12' }, '13' => { 'name' => 'RDMA_CM_EVENT_MULTICAST_ERROR', 'value' => '13' }, '14' => { 'name' => 'RDMA_CM_EVENT_ADDR_CHANGE', 'value' => '14' }, '15' => { 'name' => 'RDMA_CM_EVENT_TIMEWAIT_EXIT', 'value' => '15' }, '2' => { 'name' => 'RDMA_CM_EVENT_ROUTE_RESOLVED', 'value' => '2' }, '3' => { 'name' => 'RDMA_CM_EVENT_ROUTE_ERROR', 'value' => '3' }, '4' => { 'name' => 'RDMA_CM_EVENT_CONNECT_REQUEST', 'value' => '4' }, '5' => { 'name' => 'RDMA_CM_EVENT_CONNECT_RESPONSE', 'value' => '5' }, '6' => { 'name' => 'RDMA_CM_EVENT_CONNECT_ERROR', 'value' => '6' }, '7' => { 'name' => 'RDMA_CM_EVENT_UNREACHABLE', 'value' => '7' }, '8' => { 'name' => 'RDMA_CM_EVENT_REJECTED', 'value' => '8' }, '9' => { 'name' => 'RDMA_CM_EVENT_ESTABLISHED', 'value' => '9' } }, 'Name' => 'enum rdma_cm_event_type', 'Size' => '4', 'Type' => 'Enum' }, '31864' => { 'Header' => undef, 'Line' => '88', 'Memb' => { '0' => { 'name' => 'sgid', 'offset' => '0', 'type' => '2428' }, '1' => { 'name' => 'dgid', 'offset' => '22', 'type' => '2428' }, '2' => { 'name' => 'pkey', 'offset' => '50', 'type' => '2055' } }, 'Name' => 'struct rdma_ib_addr', 'Size' => '40', 'Type' => 'Struct' }, '31917' => { 'Header' => undef, 'Line' => '95', 'Memb' => { '0' => { 'name' => 'src_addr', 'offset' => '0', 'type' => '1208' }, '1' => { 'name' => 'src_sin', 'offset' => '0', 'type' => '1721' }, '2' => { 'name' => 'src_sin6', 'offset' => '0', 'type' => '1803' }, '3' => { 'name' => 'src_storage', 'offset' => '0', 'type' => '16782' } }, 'Size' => '128', 'Type' => 'Union' }, '31975' => { 'Header' => undef, 'Line' => '101', 'Memb' => { '0' => { 'name' => 'dst_addr', 'offset' => '0', 'type' => '1208' }, '1' => { 'name' => 'dst_sin', 'offset' => '0', 'type' => '1721' }, '2' => { 'name' => 'dst_sin6', 'offset' => '0', 'type' => '1803' }, '3' => { 'name' => 'dst_storage', 'offset' => '0', 'type' => '16782' } }, 'Size' => '128', 'Type' => 'Union' }, '32033' => { 'Header' => undef, 'Line' => '107', 'Memb' => { '0' => { 'name' => 'ibaddr', 'offset' => '0', 'type' => '31864' } }, 'Size' => '40', 'Type' => 'Union' }, '32055' => { 'Header' => undef, 'Line' => '94', 'Memb' => { '0' => { 'name' => 'unnamed0', 'offset' => '0', 'type' => '31917' }, '1' => { 'name' => 'unnamed1', 'offset' => '296', 'type' => '31975' }, '2' => { 'name' => 'addr', 'offset' => '598', 'type' => '32033' } }, 'Name' => 'struct rdma_addr', 'Size' => '296', 'Type' => 'Struct' }, '32095' => { 'Header' => undef, 'Line' => '112', 'Memb' => { '0' => { 'name' => 'addr', 'offset' => '0', 'type' => 
'32055' }, '1' => { 'name' => 'path_rec', 'offset' => '662', 'type' => '32150' }, '2' => { 'name' => 'num_paths', 'offset' => '772', 'type' => '161' } }, 'Name' => 'struct rdma_route', 'Size' => '312', 'Type' => 'Struct' }, '32150' => { 'BaseType' => '31185', 'Name' => 'struct ibv_sa_path_rec*', 'Size' => '8', 'Type' => 'Pointer' }, '32155' => { 'Header' => undef, 'Line' => '118', 'Memb' => { '0' => { 'name' => 'fd', 'offset' => '0', 'type' => '161' } }, 'Name' => 'struct rdma_event_channel', 'Size' => '4', 'Type' => 'Struct' }, '32181' => { 'Header' => undef, 'Line' => '122', 'Memb' => { '0' => { 'name' => 'verbs', 'offset' => '0', 'type' => '19258' }, '1' => { 'name' => 'channel', 'offset' => '8', 'type' => '32397' }, '10' => { 'name' => 'recv_cq_channel', 'offset' => '886', 'type' => '26787' }, '11' => { 'name' => 'recv_cq', 'offset' => '900', 'type' => '21383' }, '12' => { 'name' => 'srq', 'offset' => '914', 'type' => '21699' }, '13' => { 'name' => 'pd', 'offset' => '1024', 'type' => '22955' }, '14' => { 'name' => 'qp_type', 'offset' => '1032', 'type' => '11209' }, '2' => { 'name' => 'context', 'offset' => '22', 'type' => '82' }, '3' => { 'name' => 'qp', 'offset' => '36', 'type' => '21583' }, '4' => { 'name' => 'route', 'offset' => '50', 'type' => '32095' }, '5' => { 'name' => 'ps', 'offset' => '836', 'type' => '2913' }, '6' => { 'name' => 'port_num', 'offset' => '840', 'type' => '789' }, '7' => { 'name' => 'event', 'offset' => '850', 'type' => '32480' }, '8' => { 'name' => 'send_cq_channel', 'offset' => '864', 'type' => '26787' }, '9' => { 'name' => 'send_cq', 'offset' => '872', 'type' => '21383' } }, 'Name' => 'struct rdma_cm_id', 'Size' => '416', 'Type' => 'Struct' }, '32397' => { 'BaseType' => '32155', 'Name' => 'struct rdma_event_channel*', 'Size' => '8', 'Type' => 'Pointer' }, '32402' => { 'Header' => undef, 'Line' => '166', 'Memb' => { '0' => { 'name' => 'id', 'offset' => '0', 'type' => '32753' }, '1' => { 'name' => 'listen_id', 'offset' => '8', 'type' => '32753' }, '2' => { 'name' => 'event', 'offset' => '22', 'type' => '31707' }, '3' => { 'name' => 'status', 'offset' => '32', 'type' => '161' }, '4' => { 'name' => 'param', 'offset' => '36', 'type' => '32720' } }, 'Name' => 'struct rdma_cm_event', 'Size' => '80', 'Type' => 'Struct' }, '32480' => { 'BaseType' => '32402', 'Name' => 'struct rdma_cm_event*', 'Size' => '8', 'Type' => 'Pointer' }, '32510' => { 'Header' => undef, 'Line' => '145', 'Memb' => { '0' => { 'name' => 'private_data', 'offset' => '0', 'type' => '1888' }, '1' => { 'name' => 'private_data_len', 'offset' => '8', 'type' => '789' }, '2' => { 'name' => 'responder_resources', 'offset' => '9', 'type' => '789' }, '3' => { 'name' => 'initiator_depth', 'offset' => '16', 'type' => '789' }, '4' => { 'name' => 'flow_control', 'offset' => '17', 'type' => '789' }, '5' => { 'name' => 'retry_count', 'offset' => '18', 'type' => '789' }, '6' => { 'name' => 'rnr_retry_count', 'offset' => '19', 'type' => '789' }, '7' => { 'name' => 'srq', 'offset' => '20', 'type' => '789' }, '8' => { 'name' => 'qp_num', 'offset' => '22', 'type' => '813' } }, 'Name' => 'struct rdma_conn_param', 'Size' => '24', 'Type' => 'Struct' }, '32641' => { 'Header' => undef, 'Line' => '158', 'Memb' => { '0' => { 'name' => 'private_data', 'offset' => '0', 'type' => '1888' }, '1' => { 'name' => 'private_data_len', 'offset' => '8', 'type' => '789' }, '2' => { 'name' => 'ah_attr', 'offset' => '22', 'type' => '23156' }, '3' => { 'name' => 'qp_num', 'offset' => '72', 'type' => '813' }, '4' => { 'name' => 'qkey', 
'offset' => '82', 'type' => '813' } }, 'Name' => 'struct rdma_ud_param', 'Size' => '56', 'Type' => 'Struct' }, '32720' => { 'Header' => undef, 'Line' => '171', 'Memb' => { '0' => { 'name' => 'conn', 'offset' => '0', 'type' => '32510' }, '1' => { 'name' => 'ud', 'offset' => '0', 'type' => '32641' } }, 'Size' => '56', 'Type' => 'Union' }, '32753' => { 'BaseType' => '32181', 'Name' => 'struct rdma_cm_id*', 'Size' => '8', 'Type' => 'Pointer' }, '33047' => { 'Header' => undef, 'Line' => '214', 'Memb' => { '0' => { 'name' => 'comp_mask', 'offset' => '0', 'type' => '813' }, '1' => { 'name' => 'join_flags', 'offset' => '4', 'type' => '813' }, '2' => { 'name' => 'addr', 'offset' => '8', 'type' => '1883' } }, 'Name' => 'struct rdma_cm_join_mc_attr_ex', 'Size' => '16', 'Type' => 'Struct' }, '33813' => { 'BaseType' => '24170', 'Name' => 'struct ibv_qp_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '33818' => { 'BaseType' => '23323', 'Name' => 'struct ibv_srq_init_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '33823' => { 'BaseType' => '25017', 'Name' => 'struct ibv_qp_attr*', 'Size' => '8', 'Type' => 'Pointer' }, '33828' => { 'BaseType' => '26676', 'Name' => 'struct ibv_ece*', 'Size' => '8', 'Type' => 'Pointer' }, '4143' => { 'BaseType' => '161', 'Name' => 'int*', 'Size' => '8', 'Type' => 'Pointer' }, '43044' => { 'BaseType' => '32753', 'Name' => 'struct rdma_cm_id**', 'Size' => '8', 'Type' => 'Pointer' }, '4573' => { 'BaseType' => '1248', 'Name' => 'struct sockaddr const*', 'Size' => '8', 'Type' => 'Pointer' }, '46' => { 'BaseType' => '58', 'Header' => undef, 'Line' => '214', 'Name' => 'size_t', 'Size' => '8', 'Type' => 'Typedef' }, '47987' => { 'BaseType' => '32480', 'Name' => 'struct rdma_cm_event**', 'Size' => '8', 'Type' => 'Pointer' }, '48249' => { 'BaseType' => '32510', 'Name' => 'struct rdma_conn_param*', 'Size' => '8', 'Type' => 'Pointer' }, '50531' => { 'BaseType' => '33047', 'Name' => 'struct rdma_cm_join_mc_attr_ex*', 'Size' => '8', 'Type' => 'Pointer' }, '58' => { 'Name' => 'unsigned long', 'Size' => '8', 'Type' => 'Intrinsic' }, '70' => { 'Name' => 'unsigned int', 'Size' => '4', 'Type' => 'Intrinsic' }, '71165' => { 'BaseType' => '19258', 'Name' => 'struct ibv_context**', 'Size' => '8', 'Type' => 'Pointer' }, '767' => { 'BaseType' => '233', 'Header' => undef, 'Line' => '77', 'Name' => 'ssize_t', 'Size' => '8', 'Type' => 'Typedef' }, '7862' => { 'BaseType' => '3170', 'Name' => 'struct rdma_addrinfo**', 'Size' => '8', 'Type' => 'Pointer' }, '7867' => { 'BaseType' => '3165', 'Name' => 'struct rdma_addrinfo const*', 'Size' => '8', 'Type' => 'Pointer' }, '789' => { 'BaseType' => '125', 'Header' => undef, 'Line' => '24', 'Name' => 'uint8_t', 'Size' => '1', 'Type' => 'Typedef' }, '79888' => { 'BaseType' => '185', 'Header' => undef, 'Line' => '162', 'Name' => '__suseconds_t', 'Size' => '8', 'Type' => 'Typedef' }, '79982' => { 'BaseType' => '209', 'Header' => undef, 'Line' => '85', 'Name' => 'off_t', 'Size' => '8', 'Type' => 'Typedef' }, '80030' => { 'Header' => undef, 'Line' => '8', 'Memb' => { '0' => { 'name' => 'tv_sec', 'offset' => '0', 'type' => '15670' }, '1' => { 'name' => 'tv_usec', 'offset' => '8', 'type' => '79888' } }, 'Name' => 'struct timeval', 'Size' => '16', 'Type' => 'Struct' }, '801' => { 'BaseType' => '149', 'Header' => undef, 'Line' => '25', 'Name' => 'uint16_t', 'Size' => '2', 'Type' => 'Typedef' }, '80822' => { 'Header' => undef, 'Line' => '26', 'Memb' => { '0' => { 'name' => 'iov_base', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'iov_len', 'offset' => '8', 
'type' => '46' } }, 'Name' => 'struct iovec', 'Size' => '16', 'Type' => 'Struct' }, '80862' => { 'BaseType' => '80822', 'Name' => 'struct iovec const', 'Size' => '16', 'Type' => 'Const' }, '81264' => { 'Header' => undef, 'Line' => '259', 'Memb' => { '0' => { 'name' => 'msg_name', 'offset' => '0', 'type' => '82' }, '1' => { 'name' => 'msg_namelen', 'offset' => '8', 'type' => '1110' }, '2' => { 'name' => 'msg_iov', 'offset' => '22', 'type' => '81381' }, '3' => { 'name' => 'msg_iovlen', 'offset' => '36', 'type' => '46' }, '4' => { 'name' => 'msg_control', 'offset' => '50', 'type' => '82' }, '5' => { 'name' => 'msg_controllen', 'offset' => '64', 'type' => '46' }, '6' => { 'name' => 'msg_flags', 'offset' => '72', 'type' => '161' } }, 'Name' => 'struct msghdr', 'Size' => '56', 'Type' => 'Struct' }, '813' => { 'BaseType' => '173', 'Header' => undef, 'Line' => '26', 'Name' => 'uint32_t', 'Size' => '4', 'Type' => 'Typedef' }, '81376' => { 'BaseType' => '81264', 'Name' => 'struct msghdr const', 'Size' => '56', 'Type' => 'Const' }, '81381' => { 'BaseType' => '80822', 'Name' => 'struct iovec*', 'Size' => '8', 'Type' => 'Pointer' }, '82' => { 'BaseType' => '1', 'Name' => 'void*', 'Size' => '8', 'Type' => 'Pointer' }, '825' => { 'BaseType' => '197', 'Header' => undef, 'Line' => '27', 'Name' => 'uint64_t', 'Size' => '8', 'Type' => 'Typedef' }, '837' => { 'Name' => 'unsigned long long', 'Size' => '8', 'Type' => 'Intrinsic' }, '89' => { 'Name' => 'unsigned char', 'Size' => '1', 'Type' => 'Intrinsic' }, '98765' => { 'BaseType' => '58', 'Header' => undef, 'Line' => '33', 'Name' => 'nfds_t', 'Size' => '8', 'Type' => 'Typedef' }, '98777' => { 'Header' => undef, 'Line' => '36', 'Memb' => { '0' => { 'name' => 'fd', 'offset' => '0', 'type' => '161' }, '1' => { 'name' => 'events', 'offset' => '4', 'type' => '137' }, '2' => { 'name' => 'revents', 'offset' => '6', 'type' => '137' } }, 'Name' => 'struct pollfd', 'Size' => '8', 'Type' => 'Struct' } }, 'UndefinedSymbols' => { 'librdmacm.so.1.3.56.0' => { '_ITM_deregisterTMCloneTable' => 0, '_ITM_registerTMCloneTable' => 0, '__asprintf_chk@GLIBC_2.8' => 0, '__cxa_finalize@GLIBC_2.2.5' => 0, '__errno_location@GLIBC_2.2.5' => 0, '__fdelt_chk@GLIBC_2.15' => 0, '__gmon_start__' => 0, '__isoc99_fscanf@GLIBC_2.7' => 0, '__memcpy_chk@GLIBC_2.3.4' => 0, '__poll_chk@GLIBC_2.16' => 0, '__stack_chk_fail@GLIBC_2.4' => 0, '__syslog_chk@GLIBC_2.4' => 0, '__tls_get_addr@GLIBC_2.3' => 0, 'bind@GLIBC_2.2.5' => 0, 'calloc@GLIBC_2.2.5' => 0, 'clock_gettime@GLIBC_2.17' => 0, 'close@GLIBC_2.2.5' => 0, 'connect@GLIBC_2.2.5' => 0, 'epoll_create@GLIBC_2.3.2' => 0, 'epoll_ctl@GLIBC_2.3.2' => 0, 'epoll_wait@GLIBC_2.3.2' => 0, 'eventfd@GLIBC_2.7' => 0, 'fclose@GLIBC_2.2.5' => 0, 'fcntl@GLIBC_2.2.5' => 0, 'fopen@GLIBC_2.2.5' => 0, 'free@GLIBC_2.2.5' => 0, 'freeaddrinfo@GLIBC_2.2.5' => 0, 'fstat@GLIBC_2.33' => 0, 'getaddrinfo@GLIBC_2.2.5' => 0, 'getenv@GLIBC_2.2.5' => 0, 'getpeername@GLIBC_2.2.5' => 0, 'getrandom@GLIBC_2.25' => 0, 'getsockname@GLIBC_2.2.5' => 0, 'ibv_ack_cq_events@IBVERBS_1.1' => 0, 'ibv_alloc_pd@IBVERBS_1.1' => 0, 'ibv_attach_mcast@IBVERBS_1.1' => 0, 'ibv_close_device@IBVERBS_1.1' => 0, 'ibv_copy_ah_attr_from_kern@IBVERBS_1.1' => 0, 'ibv_copy_path_rec_from_kern@IBVERBS_1.0' => 0, 'ibv_copy_qp_attr_from_kern@IBVERBS_1.0' => 0, 'ibv_create_ah@IBVERBS_1.1' => 0, 'ibv_create_comp_channel@IBVERBS_1.0' => 0, 'ibv_create_cq@IBVERBS_1.1' => 0, 'ibv_create_qp@IBVERBS_1.1' => 0, 'ibv_create_srq@IBVERBS_1.1' => 0, 'ibv_dealloc_pd@IBVERBS_1.1' => 0, 'ibv_dereg_mr@IBVERBS_1.1' => 0, 
'ibv_destroy_ah@IBVERBS_1.1' => 0, 'ibv_destroy_comp_channel@IBVERBS_1.0' => 0, 'ibv_destroy_cq@IBVERBS_1.1' => 0, 'ibv_destroy_qp@IBVERBS_1.1' => 0, 'ibv_destroy_srq@IBVERBS_1.1' => 0, 'ibv_detach_mcast@IBVERBS_1.1' => 0, 'ibv_free_device_list@IBVERBS_1.1' => 0, 'ibv_get_cq_event@IBVERBS_1.1' => 0, 'ibv_get_device_guid@IBVERBS_1.1' => 0, 'ibv_get_device_index@IBVERBS_1.9' => 0, 'ibv_get_device_list@IBVERBS_1.1' => 0, 'ibv_get_pkey_index@IBVERBS_1.5' => 0, 'ibv_get_sysfs_path@IBVERBS_1.0' => 0, 'ibv_modify_qp@IBVERBS_1.1' => 0, 'ibv_open_device@IBVERBS_1.1' => 0, 'ibv_query_device@IBVERBS_1.1' => 0, 'ibv_query_ece@IBVERBS_1.10' => 0, 'ibv_query_gid@IBVERBS_1.1' => 0, 'ibv_query_port@IBVERBS_1.1' => 0, 'ibv_read_sysfs_file@IBVERBS_1.0' => 0, 'ibv_reg_mr@IBVERBS_1.1' => 0, 'ibv_set_ece@IBVERBS_1.10' => 0, 'in6addr_any@GLIBC_2.2.5' => 0, 'in6addr_loopback@GLIBC_2.2.5' => 0, 'inotify_add_watch@GLIBC_2.4' => 0, 'inotify_init1@GLIBC_2.9' => 0, 'malloc@GLIBC_2.2.5' => 0, 'memcmp@GLIBC_2.2.5' => 0, 'memcpy@GLIBC_2.14' => 0, 'memset@GLIBC_2.2.5' => 0, 'nl_connect' => 0, 'nl_recvmsgs_default' => 0, 'nl_send_auto' => 0, 'nl_send_simple' => 0, 'nl_socket_alloc' => 0, 'nl_socket_disable_auto_ack' => 0, 'nl_socket_disable_msg_peek' => 0, 'nl_socket_free' => 0, 'nl_socket_modify_cb' => 0, 'nl_socket_modify_err_cb' => 0, 'nla_get_string' => 0, 'nla_get_u64' => 0, 'nla_put' => 0, 'nlmsg_alloc_simple' => 0, 'nlmsg_free' => 0, 'nlmsg_hdr' => 0, 'nlmsg_parse' => 0, 'open@GLIBC_2.2.5' => 0, 'poll@GLIBC_2.2.5' => 0, 'posix_memalign@GLIBC_2.2.5' => 0, 'pthread_cond_destroy@GLIBC_2.3.2' => 0, 'pthread_cond_init@GLIBC_2.3.2' => 0, 'pthread_cond_signal@GLIBC_2.3.2' => 0, 'pthread_cond_wait@GLIBC_2.3.2' => 0, 'pthread_create@GLIBC_2.34' => 0, 'pthread_join@GLIBC_2.34' => 0, 'pthread_mutex_destroy@GLIBC_2.2.5' => 0, 'pthread_mutex_init@GLIBC_2.2.5' => 0, 'pthread_mutex_lock@GLIBC_2.2.5' => 0, 'pthread_mutex_unlock@GLIBC_2.2.5' => 0, 'qsort@GLIBC_2.2.5' => 0, 'rand_r@GLIBC_2.2.5' => 0, 'read@GLIBC_2.2.5' => 0, 'recv@GLIBC_2.2.5' => 0, 'recvfrom@GLIBC_2.2.5' => 0, 'sched_yield@GLIBC_2.2.5' => 0, 'sem_destroy@GLIBC_2.34' => 0, 'sem_init@GLIBC_2.34' => 0, 'sem_post@GLIBC_2.34' => 0, 'sem_wait@GLIBC_2.34' => 0, 'send@GLIBC_2.2.5' => 0, 'sendmsg@GLIBC_2.2.5' => 0, 'setsockopt@GLIBC_2.2.5' => 0, 'shutdown@GLIBC_2.2.5' => 0, 'snprintf@GLIBC_2.2.5' => 0, 'socket@GLIBC_2.2.5' => 0, 'socketpair@GLIBC_2.2.5' => 0, 'strdup@GLIBC_2.2.5' => 0, 'strlen@GLIBC_2.2.5' => 0, 'strtol@GLIBC_2.2.5' => 0, 'sysconf@GLIBC_2.2.5' => 0, 'tdelete@GLIBC_2.2.5' => 0, 'tdestroy@GLIBC_2.2.5' => 0, 'tfind@GLIBC_2.2.5' => 0, 'time@GLIBC_2.2.5' => 0, 'timerfd_create@GLIBC_2.8' => 0, 'timerfd_settime@GLIBC_2.8' => 0, 'tsearch@GLIBC_2.2.5' => 0, 'write@GLIBC_2.2.5' => 0 } }, 'WordSize' => '8' }; rdma-core-56.1/CMakeLists.txt000066400000000000000000000741351477342711600161040ustar00rootroot00000000000000# COPYRIGHT (c) 2016 Obsidian Research Corporation. See COPYING file # Run cmake as: # mkdir build # cmake -GNinja .. # ninja # # Common options passed to cmake are: # -DIN_PLACE=1 # Configure the build to be run from the build directory, this results in something # that is not installable. 
# -DCMAKE_EXPORT_COMPILE_COMMANDS=1 # Write a compile_commands.json file for clang tooling # -DCMAKE_BUILD_TYPE=RelWithDebInfo # Change the optimization level, Debug disables optimization, # Release is for packagers # -DENABLE_VALGRIND=0 (default enabled) # Disable valgrind annotations, this has a tiny positive performance impact # -DENABLE_RESOLVE_NEIGH=0 (default enabled) # Do not link to libnl and do not resolve neighbours internally for Ethernet, # and do not build iwpmd. # -DENABLE_STATIC=1 (default disabled) # Produce static libraries along with the usual shared libraries. # -DVERBS_PROVIDER_DIR='' (default /usr/lib.../libibverbs) # Use the historical search path for providers, in the standard system library. # -DNO_COMPAT_SYMS=1 (default disabled) # Do not generate backwards compatibility symbols in the shared # libraries. This may be necessary if using a dynamic linker that does # not support symbol versions, such as uclibc. # -DIOCTL_MODE=write (default both) # Disable new kABI ioctl() support and support only the legacy write # path. May also be 'ioctl' to disable fallback to write. # -DIBACM_SERVER_MODE_DEFAULT (default unix) # Selects how clients can connect to this server: # open) Allow incoming connections from any TCP client (internal or external). # loop) Limit incoming connections for server_port to 127.0.0.1. # unix) Use unix-domain sockets, hence limits service to the same machine. # -DIBACM_ACME_PLUS_KERNEL_ONLY_DEFAULT (default 0) # If non-zero, limit incoming requests to kernel or the ib_acme utility # (i.e. do not serve librdmacm requests) # -DPYTHON_EXECUTABLE # Override automatic detection of python to use a certain # executable. This can be used to force the build to use python2 on a # system that has python3 installed. Otherwise the build automatically # prefers python3 if available. # -DNO_PYVERBS=1 (default, build pyverbs) # Do not invoke cython to build pyverbs; by default cython is invoked # and pyverbs is built. # -DENABLE_IBDIAGS_COMPAT=True (default False) # Include obsolete scripts. These scripts are replaced by C programs with # a different interface now. # -DNO_MAN_PAGES=1 (default 0, build/install the man pages) # Disable man pages. Allows rdma-core to be built and installed # (without man pages) when neither pandoc/rst2man nor the pandoc-prebuilt # directory is available. # -DENABLE_LTTNG (default, no tracing support) # Enable LTTng tracing. if (${CMAKE_VERSION} VERSION_LESS "3.18.1") # Centos 7 support cmake_minimum_required(VERSION 2.8.12 FATAL_ERROR) else() cmake_minimum_required(VERSION 3.18.1 FATAL_ERROR) endif() project(rdma-core C) # CMake likes to use -rdynamic too much, they fixed it in 3.4. if(POLICY CMP0065) cmake_policy(SET CMP0065 NEW) else() # .. but we really do want to opt out.
string(REPLACE "-rdynamic" "" CMAKE_SHARED_LIBRARY_LINK_C_FLAGS "${CMAKE_SHARED_LIBRARY_LINK_C_FLAGS}") endif() # Make RDMA_CHECK_C_LINKER_FLAG work better if(POLICY CMP0056) cmake_policy(SET CMP0056 NEW) endif() set(PACKAGE_NAME "RDMA") # See Documentation/versioning.md set(PACKAGE_VERSION "56.1") # When this is changed the values in these files need changing too: # debian/control # debian/libibverbs1.symbols set(IBVERBS_PABI_VERSION "34") set(IBVERBS_PROVIDER_SUFFIX "-rdmav${IBVERBS_PABI_VERSION}.so") #------------------------- # Basic standard paths # Override the CMAKE_INSTALL_ dirs to be under the build/ directory if (IN_PLACE) set(CMAKE_INSTALL_SYSCONFDIR "${PROJECT_BINARY_DIR}/etc") set(CMAKE_INSTALL_BINDIR "${PROJECT_BINARY_DIR}/bin") set(CMAKE_INSTALL_SBINDIR "${PROJECT_BINARY_DIR}/bin") set(CMAKE_INSTALL_PREFIX "${PROJECT_BINARY_DIR}") set(CMAKE_INSTALL_LIBDIR "lib") set(CMAKE_INSTALL_INCLUDEDIR "include") endif() include(GNUInstallDirs) # C include root set(BUILD_INCLUDE ${PROJECT_BINARY_DIR}/include) # Executables set(BUILD_BIN ${PROJECT_BINARY_DIR}/bin) # Libraries set(BUILD_LIB ${PROJECT_BINARY_DIR}/lib) # Static library pre-processing set(BUILD_STATIC_LIB ${PROJECT_BINARY_DIR}/lib/statics) # Used for IN_PLACE configuration set(BUILD_ETC ${PROJECT_BINARY_DIR}/etc) set(BUILD_PYTHON ${PROJECT_BINARY_DIR}/python) set(IBDIAG_CONFIG_PATH "${CMAKE_INSTALL_FULL_SYSCONFDIR}/infiniband-diags") set(IBDIAG_NODENAME_MAP_PATH "${CMAKE_INSTALL_FULL_SYSCONFDIR}/rdma/ib-node-name-map") set(CMAKE_INSTALL_INITDDIR "${CMAKE_INSTALL_SYSCONFDIR}/init.d" CACHE PATH "Location for init.d files") set(CMAKE_INSTALL_MODPROBEDIR "${CMAKE_INSTALL_SYSCONFDIR}/modprobe.d/" CACHE PATH "Location for modprobe.d files") set(CMAKE_INSTALL_SYSTEMD_SERVICEDIR "${CMAKE_INSTALL_PREFIX}/lib/systemd/system" CACHE PATH "Location for systemd service files") set(CMAKE_INSTALL_SYSTEMD_BINDIR "/lib/systemd" CACHE PATH "Location for systemd extra binaries") set(ACM_PROVIDER_DIR "${CMAKE_INSTALL_FULL_LIBDIR}/ibacm" CACHE PATH "Location for ibacm provider plugin shared library files.") # Location to find the provider plugin shared library files set(VERBS_PROVIDER_DIR "${CMAKE_INSTALL_FULL_LIBDIR}/libibverbs" CACHE PATH "Location for provider plugin shared library files. 
If set to empty the system search path is used.") # Allow the 'run' dir to be configurable, this historically has been /var/run, but # some systems now use /run/ set(CMAKE_INSTALL_RUNDIR "var/run" CACHE PATH "Location for runtime information, typically /var/run, or /run") if(NOT IS_ABSOLUTE ${CMAKE_INSTALL_RUNDIR}) set(CMAKE_INSTALL_FULL_RUNDIR "${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_RUNDIR}") else() set(CMAKE_INSTALL_FULL_RUNDIR "${CMAKE_INSTALL_RUNDIR}") endif() # Allow the udev rules.d dir to be configurable, this has historically been # /lib/udev/rules.d/, but some systems now prefix /usr/ set(CMAKE_INSTALL_UDEV_RULESDIR "lib/udev/rules.d" CACHE PATH "Location for system udev rules, typically /lib/udev/rules.d or /usr/lib/udev/rules.d") if(NOT IS_ABSOLUTE ${CMAKE_INSTALL_UDEV_RULESDIR}) set(CMAKE_INSTALL_FULL_UDEV_RULESDIR "${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_UDEV_RULESDIR}") else() set(CMAKE_INSTALL_FULL_UDEV_RULESDIR "${CMAKE_INSTALL_UDEV_RULESDIR}") endif() # Allow the perl library dir to be configurable set(CMAKE_INSTALL_PERLDIR "share/perl5" CACHE PATH "Location for system perl library, typically /usr/share/perl5") if(NOT IS_ABSOLUTE ${CMAKE_INSTALL_PERLDIR}) set(CMAKE_INSTALL_FULL_PERLDIR "${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_PERLDIR}") else() set(CMAKE_INSTALL_FULL_PERLDIR "${CMAKE_INSTALL_PERLDIR}") endif() # Location to place provider .driver files if (IN_PLACE) set(CONFIG_DIR "${BUILD_ETC}/libibverbs.d") set(VERBS_PROVIDER_DIR "${BUILD_LIB}") set(ACM_PROVIDER_DIR "${BUILD_LIB}/ibacm") else() set(CONFIG_DIR "${CMAKE_INSTALL_FULL_SYSCONFDIR}/libibverbs.d") endif() set(DISTRO_FLAVOUR "None" CACHE STRING "Flavour of distribution to install for. This primarily impacts the init.d scripts installed.") #------------------------- # Load CMake components set(BUILDLIB "${PROJECT_SOURCE_DIR}/buildlib") set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${BUILDLIB}") include(CMakeParseArguments) include(CheckCCompilerFlag) include(CheckCSourceCompiles) include(CheckIncludeFile) include(CheckTypeSize) include(RDMA_EnableCStd) include(RDMA_Sparse) include(RDMA_BuildType) include(RDMA_DoFixup) include(publish_headers) include(rdma_functions) include(pyverbs_functions) check_c_compiler_flag("-Wcast-align=strict" HAVE_WCAST_ALIGN_STRICT) if (NO_MAN_PAGES) # define empty stub functions to omit man page processing function(rdma_man_pages) endfunction() function(rdma_alias_man_pages) endfunction() else() include(rdma_man) endif() if (NOT DEFINED ENABLE_STATIC) set(ENABLE_STATIC "OFF" CACHE BOOL "Produce static linking libraries as well as shared libraries.") endif() #------------------------- # Setup the basic C compiler RDMA_BuildType() include_directories(${BUILD_INCLUDE}) # Working means that the compiler doesn't spew output that confuses cmake's # capability tests, i.e. cmake will test and succeed compiling a simple program RDMA_Check_C_Compiles(HAVE_WORKING_WERROR "int main(int argc,const char *argv[]) { return 0; }" "") if (NOT HAVE_WORKING_WERROR) message(FATAL_ERROR "-Werror doesn't work (compiler always creates warnings?).
Werror is required for CMake.") endif() # Use Python modules based on CMake version for backward compatibility if (${CMAKE_VERSION} VERSION_LESS "3.12") FIND_PACKAGE(PythonInterp REQUIRED) FIND_PACKAGE(PythonLibs ${PYTHON_VERSION_MAJOR}.${PYTHON_VERSION_MINOR} EXACT) elseif (${CMAKE_VERSION} VERSION_GREATER_EQUAL "3.12") set(Python_EXECUTABLE ${PYTHON_EXECUTABLE}) FIND_PACKAGE(Python 3 REQUIRED COMPONENTS Interpreter OPTIONAL_COMPONENTS Development) set(PYTHON_EXECUTABLE ${Python_EXECUTABLE}) if(Python_Development_FOUND) set(PYTHONLIBS_FOUND ${Python_Development_FOUND}) set(PYTHON_LIBRARIES ${Python_LIBRARIES}) set(PYTHON_INCLUDE_DIRS ${Python_INCLUDE_DIRS}) endif() endif() set(CYTHON_EXECUTABLE "") if(NOT NO_PYVERBS AND PYTHONLIBS_FOUND) execute_process(COMMAND "${PYTHON_EXECUTABLE}" -c "import sysconfig; print(sysconfig.get_path(\"platlib\"))" OUTPUT_VARIABLE py_path) string(STRIP ${py_path} py_path) set(CMAKE_INSTALL_PYTHON_ARCH_LIB "${py_path}" CACHE PATH "Location for architecture specific python libraries") # See PEP3149 execute_process(COMMAND "${PYTHON_EXECUTABLE}" -c "import sysconfig; x = sysconfig.get_config_var(\"EXT_SUFFIX\"); print(x if x else '.so')" OUTPUT_VARIABLE py_path) string(STRIP ${py_path} CMAKE_PYTHON_SO_SUFFIX) FIND_PACKAGE(cython) elseif(NOT NO_PYVERBS AND NOT PYTHONLIBS_FOUND) message(WARNING "pyverbs build requested but python development files not found") endif() find_program(SYSTEMCTL_BIN systemctl HINTS "/usr/bin" "/bin") if (NOT SYSTEMCTL_BIN) set (SYSTEMCTL_BIN "/bin/systemctl") endif() RDMA_CheckSparse() # Require GNU99 mode RDMA_EnableCStd() # Extra warnings. Turn on -Wextra to keep aware of interesting developments from gcc, # but turn off some that are not terribly useful for this source. # FIXME: I wonder how many of the signed compares are bugs? RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WARNINGS "-Wall -Wextra -Wno-sign-compare -Wno-unused-parameter") RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WMISSING_PROTOTYPES "-Wmissing-prototypes") RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WMISSING_DECLARATIONS "-Wmissing-declarations") RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WWRITE_STRINGS "-Wwrite-strings") RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WFORMAT_2 "-Wformat=2") RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WCAST_FUNCTION "-Wcast-function-type") RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WFORMAT_NONLITERAL "-Wformat-nonliteral") RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WDATE_TIME "-Wdate-time") RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WNESTED_EXTERNS "-Wnested-externs") # At some point after 4.4 gcc fixed shadow to ignore function vs variable # conflicts RDMA_Check_C_Compiles(HAVE_C_WORKING_SHADOW " #include int main(int argc,const char *argv[]) { int access = 1; return access; }" "-Wshadow") if (HAVE_C_WORKING_SHADOW) RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WORKING_SHADOW "-Wshadow") endif() # At some point around 5.4 gcc fixed missing-field-initializers to ignore this # common idiom we use extensively. Since this is a useful warning for # developers try and leave it on if the compiler supports it. 
RDMA_Check_C_Compiles(HAVE_C_WORKING_MISSING_FIELD_INITIALIZERS " struct foo { int a; int b; }; int main(int argc,const char *argv[]) { struct foo tmp = {}; return tmp.a; }" ) if (NOT HAVE_C_WORKING_MISSING_FIELD_INITIALIZERS) RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WNO_MISSING_FIELD_INITIALIZERS "-Wno-missing-field-initializers") endif() # clang doesn't support the variable size GCC extension RDMA_Check_C_Compiles(HAVE_C_VARIABLE_SIZE " struct c { int a; int b[]; }; struct foo { struct c c; int b; }; int main(int argc,const char *argv[]) { return 0; }" ) if (NOT HAVE_C_VARIABLE_SIZE) RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WNO_VARIABLE_SIZE "-Wno-gnu-variable-sized-type-not-at-end") endif() # Check that the compiler supports -fno-strict-aliasing. # The use of this flag in the source is discouraged set(NO_STRICT_ALIASING_FLAGS "") RDMA_AddOptCFlag(NO_STRICT_ALIASING_FLAGS HAVE_NO_STRICT_ALIASING "-fno-strict-aliasing") # pyverbs has a problem with var-tracking warnings; turn it off if we can. set(NO_VAR_TRACKING_FLAGS "") RDMA_AddOptCFlag(NO_VAR_TRACKING_FLAGS HAVE_NO_VAR_TRACKING_ASSIGNMENTS "-fno-var-tracking-assignments") RDMA_Check_C_Compiles(HAVE_FUNC_ATTRIBUTE_IFUNC " #include void entry(void); static void do_entry(void) {} void entry(void) __attribute__((ifunc(\"resolve_entry\"))); typedef void (*fn_t)(void); static fn_t resolve_entry(void) {return &do_entry;} int main(int argc,const char *argv[]) { entry(); }" ) RDMA_Check_C_Compiles(HAVE_FUNC_ATTRIBUTE_SYMVER " #include void _sym(void); __attribute__((__symver__(\"sym@TEST_1.1\"))) void _sym(void) {} int main(int argc,const char *argv[]) { _sym(); }" ) # The code does not do the racy fcntl if the various CLOEXECs are not # supported so it really doesn't work right if this isn't available. Thus it is a hard # requirement. CHECK_C_SOURCE_COMPILES(" #include #include #include #include int main(int argc,const char *argv[]) { open(\".\",O_RDONLY | O_CLOEXEC); socket(AF_INET, SOCK_STREAM | SOCK_CLOEXEC, 0); return 0; }" HAS_CLOEXEC) if (NOT HAS_CLOEXEC) # At least uclibc wrongly hides this POSIX constant behind _GNU_SOURCE CHECK_C_SOURCE_COMPILES(" #define _GNU_SOURCE #include #include #include #include int main(int argc,const char *argv[]) { open(\".\",O_RDONLY | O_CLOEXEC); socket(AF_INET, SOCK_STREAM | SOCK_CLOEXEC, 0); return 0; }" HAS_CLOEXEC_GNU_SOURCE) if (HAS_CLOEXEC_GNU_SOURCE) set(HAS_CLOEXEC 1) add_definitions("-D_GNU_SOURCE=") endif() endif() if (NOT HAS_CLOEXEC) message(FATAL_ERROR "O_CLOEXEC/SOCK_CLOEXEC/fopen(..,\"e\") support is required but not found") endif() # always_inline is supported RDMA_Check_C_Compiles(HAVE_FUNC_ATTRIBUTE_ALWAYS_INLINE " int foo(void); inline __attribute__((always_inline)) int foo(void) {return 0;} int main(int argc,const char *argv[]) { return foo(); }" ) # Linux __u64 is an unsigned long long RDMA_Check_C_Compiles(HAVE_LONG_LONG_U64 " #include <linux/types.h> int main(int argc,const char *argv[]) { __u64 tmp = 0; unsigned long long *tmp2 = &tmp; return *tmp2; }" ) if (NOT HAVE_LONG_LONG_U64) # Modern Linux has switched to use ull in all cases, but to avoid disturbing # userspace some platforms continued to use unsigned long by default.
This # define will cause kernel headers to consistently use unsigned long long add_definitions("-D__SANE_USERSPACE_TYPES__") endif() # glibc and kernel uapi headers can co-exist CHECK_C_SOURCE_COMPILES(" #include #include #include #include int main(int argc,const char *argv[]) { return 0; }" HAVE_GLIBC_UAPI_COMPAT) RDMA_DoFixup("${HAVE_GLIBC_UAPI_COMPAT}" "linux/in.h") RDMA_DoFixup("${HAVE_GLIBC_UAPI_COMPAT}" "linux/in6.h") # The compiler has working -fstrict-aliasing support, old gccs do not. If # broken then globally disable strict aliasing. RDMA_Check_Aliasing(HAVE_WORKING_STRICT_ALIASING) if (NOT HAVE_WORKING_STRICT_ALIASING) set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${NO_STRICT_ALIASING_FLAGS}") endif() # Check if off_t is 64 bits, e.g. large file support is enabled CHECK_C_SOURCE_COMPILES(" #include #define BUILD_ASSERT_OR_ZERO(cond) (sizeof(char [1 - 2*!(cond)]) - 1) int main(int argc,const char *argv[]) { return BUILD_ASSERT_OR_ZERO(sizeof(off_t) >= 8); }" HAVE_LARGE_FILES) if (NOT HAVE_LARGE_FILES) CHECK_C_SOURCE_COMPILES(" #define _FILE_OFFSET_BITS 64 #include #define BUILD_ASSERT_OR_ZERO(cond) (sizeof(char [1 - 2*!(cond)]) - 1) int main(int argc,const char *argv[]) { return BUILD_ASSERT_OR_ZERO(sizeof(off_t) >= 8); }" HAVE_LARGE_FILES2) if (NOT HAVE_LARGE_FILES2) message(FATAL_ERROR "Could not enable large file support") endif() add_definitions("-D_FILE_OFFSET_BITS=64") endif() # Provide a shim if C11 stdatomic.h is not supported. if (NOT HAVE_SPARSE) CHECK_INCLUDE_FILE("stdatomic.h" HAVE_STDATOMIC) RDMA_DoFixup("${HAVE_STDATOMIC}" "stdatomic.h") endif() RDMA_Check_SSE(HAVE_TARGET_SSE) # Enable development support features # Prune unneeded shared libraries during linking RDMA_AddOptLDFlag(CMAKE_EXE_LINKER_FLAGS SUPPORTS_AS_NEEDED "-Wl,--as-needed") RDMA_AddOptLDFlag(CMAKE_SHARED_LINKER_FLAGS SUPPORTS_AS_NEEDED "-Wl,--as-needed") RDMA_AddOptLDFlag(CMAKE_MODULE_LINKER_FLAGS SUPPORTS_AS_NEEDED "-Wl,--as-needed") # Ensure all shared ELFs have fully described linking RDMA_AddOptLDFlag(CMAKE_EXE_LINKER_FLAGS SUPPORTS_NO_UNDEFINED "-Wl,--no-undefined") RDMA_AddOptLDFlag(CMAKE_SHARED_LINKER_FLAGS SUPPORTS_NO_UNDEFINED "-Wl,--no-undefined") # Enable gold linker - gold has different linking checks #RDMA_AddOptLDFlag(CMAKE_EXE_LINKER_FLAGS SUPPORTS_NO_UNDEFINED "-fuse-ld=gold") #RDMA_AddOptLDFlag(CMAKE_SHARED_LINKER_FLAGS SUPPORTS_NO_UNDEFINED "-fuse-ld=gold") #RDMA_AddOptLDFlag(CMAKE_MODULE_LINKER_FLAGS SUPPORTS_NO_UNDEFINED "-fuse-ld=gold") # Verify that GNU --version-script and asm(".symver") work find_package(LDSymVer REQUIRED) if (NO_COMPAT_SYMS) set(HAVE_LIMITED_SYMBOL_VERSIONS 1) else() set(HAVE_FULL_SYMBOL_VERSIONS 1) endif() set(NO_MAN_PAGES "OFF" CACHE BOOL "Disable build/install of man pages") if (NOT NO_MAN_PAGES) # Look for pandoc and rst2man for making manual pages FIND_PACKAGE(pandoc) FIND_PACKAGE(rst2man) endif () #------------------------- # Find libraries # pthread FIND_PACKAGE (Threads REQUIRED) FIND_PACKAGE(PkgConfig REQUIRED) # libnl if (NOT DEFINED ENABLE_RESOLVE_NEIGH) set(ENABLE_RESOLVE_NEIGH "ON" CACHE BOOL "Enable internal resolution of neighbours for Ethernet") endif() if (ENABLE_RESOLVE_NEIGH) # FIXME use of pkgconfig is discouraged pkg_check_modules(NL libnl-3.0 libnl-route-3.0 REQUIRED) include_directories(${NL_INCLUDE_DIRS}) link_directories(${NL_LIBRARY_DIRS}) set(NL_KIND 3) else() set(NL_KIND 0) set(NL_LIBRARIES "") RDMA_DoFixup(0 "netlink/attr.h") RDMA_DoFixup(0 "netlink/msg.h") RDMA_DoFixup(0 "netlink/netlink.h") RDMA_DoFixup(0 "netlink/object-api.h")
RDMA_DoFixup(0 "netlink/route/link.h") RDMA_DoFixup(0 "netlink/route/link/vlan.h") RDMA_DoFixup(0 "netlink/route/neighbour.h") RDMA_DoFixup(0 "netlink/route/route.h") RDMA_DoFixup(0 "netlink/route/rtnl.h") endif() # Older stuff blows up if these headers are included together if (NOT NL_KIND EQUAL 0) set(SAFE_CMAKE_REQUIRED_INCLUDES "${CMAKE_REQUIRED_INCLUDES}") set(CMAKE_REQUIRED_INCLUDES "${NL_INCLUDE_DIRS}") CHECK_C_SOURCE_COMPILES(" #include #include int main(int argc,const char *argv[]) {return 0;}" HAVE_WORKING_IF_H) set(CMAKE_REQUIRED_INCLUDES "${SAFE_CMAKE_REQUIRED_INCLUDES}") endif() # udev find_package(UDev) include_directories(${UDEV_INCLUDE_DIRS}) # Statically determine sizeof(long), this is largely unnecessary, no new code # should rely on this. check_type_size("long" SIZEOF_LONG BUILTIN_TYPES_ONLY LANGUAGE C) # Determine if this arch supports cache coherent DMA. This isn't really an # arch specific property, but for our purposes arches that do not support it # also do not define wmb/etc which breaks our compile. # As a special case s390x always has coherent DMA but needs linking for its wmb CHECK_C_SOURCE_COMPILES(" #if !defined(__s390x__) #include \"${CMAKE_CURRENT_SOURCE_DIR}/util/udma_barrier.h\" #endif int main(int argc,const char *argv[]) {return 0;}" HAVE_COHERENT_DMA) find_package(Systemd) include_directories(${SYSTEMD_INCLUDE_DIRS}) RDMA_DoFixup("${SYSTEMD_FOUND}" "systemd/sd-daemon.h") # drm headers # Check if the headers have been installed by kernel-headers find_path(DRM_INCLUDE_DIRS "drm.h" PATH_SUFFIXES "drm" "libdrm") # Alternatively the headers could have been installed by libdrm if (NOT DRM_INCLUDE_DIRS) pkg_check_modules(DRM libdrm) endif() if (DRM_INCLUDE_DIRS) if (EXISTS "${DRM_INCLUDE_DIRS}/i915_drm.h" AND EXISTS "${DRM_INCLUDE_DIRS}/amdgpu_drm.h") include_directories(${DRM_INCLUDE_DIRS}) else() unset(DRM_INCLUDE_DIRS CACHE) endif() endif() # LTTng Tracer support if (DEFINED ENABLE_LTTNG) include(FindLTTngUST REQUIRED) add_definitions(-DLTTNG_ENABLED) endif() #------------------------- # Apply fixups # We prefer to build with valgrind memcheck.h present, but if not, or the user # requested valgrind disabled, then replace it with our dummy stub. if (NOT DEFINED ENABLE_VALGRIND) set(ENABLE_VALGRIND "ON" CACHE BOOL "Enable use of valgrind annotations") endif() if (ENABLE_VALGRIND) CHECK_INCLUDE_FILE("valgrind/memcheck.h" HAVE_VALGRIND_MEMCHECK) CHECK_INCLUDE_FILE("valgrind/drd.h" HAVE_VALGRIND_DRD) else() set(HAVE_VALGRIND_MEMCHECK 0) set(HAVE_VALGRIND_DRD 0) endif() RDMA_DoFixup("${HAVE_VALGRIND_MEMCHECK}" "valgrind/memcheck.h") RDMA_DoFixup("${HAVE_VALGRIND_DRD}" "valgrind/drd.h") # Older glibc does not include librt CHECK_C_SOURCE_COMPILES(" #include int main(int argc,const char *argv[]) { clock_gettime(CLOCK_MONOTONIC,0); clock_nanosleep(CLOCK_MONOTONIC,0,0,0); return 0; };" LIBC_HAS_LIBRT) if (NOT LIBC_HAS_LIBRT) set(RT_LIBRARIES "rt") endif() # Check for static_assert CHECK_C_SOURCE_COMPILES(" #include static_assert(1, \"failed\"); int main(int argc,const char *argv[]) { static_assert(1, \"failed\"); return 0; };" HAVE_STATIC_ASSERT) RDMA_DoFixup("${HAVE_STATIC_ASSERT}" "assert.h") #------------------------- # Final warning flags # Old version of cmake used 'main(){..}' as their test program which breaks with -Werror. # So set this flag last. 
RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WSTRICT_PROTOTYPES "-Wstrict-prototypes") RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WOLD_STYLE_DEFINITION "-Wold-style-definition") if (ENABLE_WERROR) set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -Werror") message(STATUS "Enabled -Werror") endif() # Old versions of libnl have a duplicated rtnl_route_put, disable the warning on those # systems if (NOT NL_KIND EQUAL 0) set(SAFE_CMAKE_REQUIRED_FLAGS "${CMAKE_REQUIRED_FLAGS}") set(CMAKE_REQUIRED_INCLUDES "${NL_INCLUDE_DIRS}") RDMA_Check_C_Compiles(HAVE_C_WREDUNDANT_DECLS " #include int main(int argc,const char *argv[]) { return 0; }" "-Wredundant-decls") set(CMAKE_REQUIRED_INCLUDES "${SAFE_CMAKE_REQUIRED_INCLUDES}") endif() RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_C_WREDUNDANT_DECLS "-Wredundant-decls") # Support of getrandom() was added to glibc in version 2.25 CHECK_C_SOURCE_COMPILES(" #include <sys/random.h> int main(int argc,const char *argv[]) {char buf[64]; return getrandom(buf, 64, GRND_NONBLOCK);}" HAVE_GLIBC_GETRANDOM) RDMA_DoFixup("${HAVE_GLIBC_GETRANDOM}" "sys/random.h") # glibc 2.33 and newer stopped properly declaring __fxstat in sys/stat.h RDMA_Check_C_Compiles(HAVE_GLIBC_FXSTAT " #include <sys/stat.h> int main(int argc,const char *argv[]) { struct stat stat = {}; __fxstat(0, 0, &stat); return 0;}") RDMA_DoFixup("${HAVE_GLIBC_FXSTAT}" "sys/stat.h") # glibc before 2.35 does not necessarily define the HWCAP_S390_PCI_MIO hardware # capability bit constant. Check for it and if necessary shim it in such that # kernel support for PCI MIO instructions can always be checked. RDMA_Check_C_Compiles(HAVE_GLIBC_HWCAP_S390_PCI_MIO " #if defined(__s390x__) #include <sys/auxv.h> int main(int argc, const char *argv[]) { return !!(getauxval(AT_HWCAP) & HWCAP_S390_PCI_MIO);} #else int main(int argc, const char *argv[]) {return 0;} #endif ") RDMA_DoFixup("${HAVE_GLIBC_HWCAP_S390_PCI_MIO}" "sys/auxv.h") #------------------------- # Build Prep # Write out a git ignore file to the build directory if it isn't the source # directory.
For developer convenience if (NOT ${CMAKE_CURRENT_BINARY_DIR} STREQUAL ${CMAKE_CURRENT_SOURCE_DIR}) file(WRITE ${PROJECT_BINARY_DIR}/.gitignore "*") endif() if ("${IOCTL_MODE}" STREQUAL "both") set(IOCTL_MODE_NUM 3) elseif ("${IOCTL_MODE}" STREQUAL "write") set(IOCTL_MODE_NUM 2) elseif ("${IOCTL_MODE}" STREQUAL "ioctl") set(IOCTL_MODE_NUM 1) elseif ("${IOCTL_MODE}" STREQUAL "") set(IOCTL_MODE_NUM 3) else() message(FATAL_ERROR "-DIOCTL_MODE=${IOCTL_MODE} is not a valid choice") endif() # Configuration defaults if ("${IBACM_SERVER_MODE_DEFAULT}" STREQUAL "open") set(IBACM_SERVER_MODE_DEFAULT "IBACM_SERVER_MODE_OPEN") elseif ("${IBACM_SERVER_MODE_DEFAULT}" STREQUAL "loop") set(IBACM_SERVER_MODE_DEFAULT "IBACM_SERVER_MODE_LOOP") else() set(IBACM_SERVER_MODE_DEFAULT "IBACM_SERVER_MODE_UNIX") endif() if (IBACM_ACME_PLUS_KERNEL_ONLY_DEFAULT) set(IBACM_ACME_PLUS_KERNEL_ONLY_DEFAULT 1) else() set(IBACM_ACME_PLUS_KERNEL_ONLY_DEFAULT 0) endif() configure_file("${BUILDLIB}/config.h.in" "${BUILD_INCLUDE}/config.h" ESCAPE_QUOTES @ONLY) #------------------------- # Sub-directories add_subdirectory(ccan) add_subdirectory(util) add_subdirectory(util/tests) add_subdirectory(Documentation) add_subdirectory(kernel-boot) add_subdirectory(kernel-headers) # Libraries add_subdirectory(libibumad) add_subdirectory(libibumad/man) add_subdirectory(libibverbs) add_subdirectory(libibverbs/man) add_subdirectory(librdmacm) add_subdirectory(librdmacm/man) # Providers if (HAVE_COHERENT_DMA) add_subdirectory(providers/bnxt_re) add_subdirectory(providers/cxgb4) # NO SPARSE add_subdirectory(providers/efa) add_subdirectory(providers/efa/man) add_subdirectory(providers/erdma) add_subdirectory(providers/hns) add_subdirectory(providers/hns/man) add_subdirectory(providers/irdma) add_subdirectory(providers/mana) add_subdirectory(providers/mana/man) add_subdirectory(providers/mlx4) add_subdirectory(providers/mlx4/man) add_subdirectory(providers/mlx5) add_subdirectory(providers/mlx5/man) add_subdirectory(providers/mthca) add_subdirectory(providers/ocrdma) add_subdirectory(providers/qedr) add_subdirectory(providers/vmw_pvrdma) endif() add_subdirectory(providers/hfi1verbs) add_subdirectory(providers/ipathverbs) add_subdirectory(providers/rxe) add_subdirectory(providers/rxe/man) add_subdirectory(providers/siw) add_subdirectory(libibmad) add_subdirectory(libibnetdisc) add_subdirectory(libibnetdisc/man) add_subdirectory(infiniband-diags) add_subdirectory(infiniband-diags/scripts) add_subdirectory(infiniband-diags/man) if (CYTHON_EXECUTABLE) add_subdirectory(pyverbs) add_subdirectory(tests) endif() # Binaries if (NOT NL_KIND EQUAL 0) add_subdirectory(ibacm) # NO SPARSE endif() if (NOT NL_KIND EQUAL 0) add_subdirectory(iwpmd) endif() add_subdirectory(libibumad/tests) add_subdirectory(libibverbs/examples) add_subdirectory(librdmacm/examples) if (UDEV_FOUND) add_subdirectory(rdma-ndd) endif() add_subdirectory(srp_daemon) ibverbs_finalize() rdma_finalize_libs() #------------------------- # Display a summary # Only report things that are non-ideal. 
message(STATUS "Missing Optional Items:") if (NOT HAVE_FUNC_ATTRIBUTE_ALWAYS_INLINE) message(STATUS " Compiler attribute always_inline NOT supported") endif() if (NOT HAVE_FUNC_ATTRIBUTE_IFUNC) message(STATUS " Compiler attribute ifunc NOT supported") endif() if (NOT HAVE_FUNC_ATTRIBUTE_SYMVER) message(STATUS " Compiler attribute symver NOT supported, can not use LTO") endif() if (NOT HAVE_COHERENT_DMA) message(STATUS " Architecture NOT able to do coherent DMA (check util/udma_barrier.h) some providers disabled!") endif() if (NOT HAVE_STDATOMIC) message(STATUS " C11 stdatomic.h NOT available (old compiler)") endif() if (NOT HAVE_STATIC_ASSERT) message(STATUS " C11 static_assert NOT available (old compiler)") endif() if (NOT HAVE_WORKING_STRICT_ALIASING) message(STATUS " Compiler cannot do strict aliasing") endif() if (NOT HAVE_VALGRIND_MEMCHECK) message(STATUS " Valgrind memcheck.h NOT enabled") endif() if (NOT HAVE_VALGRIND_DRD) message(STATUS " Valgrind drd.h NOT enabled") endif() if (NL_KIND EQUAL 0) message(STATUS " neighbour resolution NOT enabled") else() if (NOT HAVE_WORKING_IF_H) message(STATUS " netlink/route/link.h and net/if.h NOT co-includable (old headers)") endif() endif() if (NO_MAN_PAGES) message(STATUS " man pages NOT built") else() if (NOT PANDOC_FOUND) if (NOT EXISTS "${PROJECT_SOURCE_DIR}/buildlib/pandoc-prebuilt") message(STATUS " pandoc NOT found and NO prebuilt man pages. 'install' disabled") else() message(STATUS " pandoc NOT found (using prebuilt man pages)") endif() endif() if (NOT RST2MAN_FOUND) if (NOT EXISTS "${PROJECT_SOURCE_DIR}/buildlib/pandoc-prebuilt") message(STATUS " rst2man NOT found and NO prebuilt man pages. 'install' disabled") else() message(STATUS " rst2man NOT found (using prebuilt man pages)") endif() endif() endif() if (NOT CYTHON_EXECUTABLE) message(STATUS " cython NOT found (disabling pyverbs)") endif() if (NOT SYSTEMD_FOUND) message(STATUS " libsystemd NOT found (disabling features)") endif() if (NOT UDEV_FOUND) message(STATUS " libudev NOT found (disabling features)") endif() if (NOT HAVE_C_WARNINGS) message(STATUS " extended C warnings NOT supported") endif() if (NOT HAVE_NO_STRICT_ALIASING) message(STATUS " -fno-strict-aliasing NOT supported") endif() if (NOT HAVE_C_WORKING_MISSING_FIELD_INITIALIZERS) message(STATUS " -Wmissing-field-initializers does NOT work") endif() if (NOT HAVE_C_WORKING_SHADOW) message(STATUS " -Wshadow does NOT work") endif() if (NOT HAVE_C_WREDUNDANT_DECLS) message(STATUS " -Wredundant-decls does NOT work") endif() if (NOT HAVE_GLIBC_UAPI_COMPAT) message(STATUS " libc netinet/in.h and linux/in.h do NOT coexist") endif() if (NOT HAVE_TARGET_SSE) message(STATUS " attribute(target(\"sse\")) does NOT work") endif() if (NOT DRM_INCLUDE_DIRS) message(STATUS " DMABUF NOT supported (disabling some tests)") endif() if (NOT HAVE_GLIBC_HWCAP_S390_PCI_MIO ) message(STATUS " Glibc version does not contain the HWCAP_S390_PCI_MIO bit, using shim version") endif() rdma-core-56.1/COPYING.BSD_FB000066400000000000000000000024201477342711600153410ustar00rootroot00000000000000 OpenIB.org BSD license (FreeBSD Variant) Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: - Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. rdma-core-56.1/COPYING.BSD_MIT000066400000000000000000000017511477342711600155110ustar00rootroot00000000000000 OpenIB.org BSD license (MIT variant) Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: - Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. - Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. rdma-core-56.1/COPYING.GPL2000066400000000000000000000432541477342711600151000ustar00rootroot00000000000000 GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. 
These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. 
b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. 
However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. 
Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 
END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found.

<one line to give the program's name and a brief idea of what it does.> Copyright (C) <year> <name of author>

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

Also add information on how to contact you by electronic and paper mail.

If the program is interactive, make it output a short notice like this when it starts in an interactive mode:

Gnomovision version 69, Copyright (C) year name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program.

You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names:

Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker.

<signature of Ty Coon>, 1 April 1989 Ty Coon, President of Vice

This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License.

rdma-core-56.1/COPYING.md000066400000000000000000000037201477342711600147660ustar00rootroot00000000000000
# Default Dual License

Unless otherwise stated this software is available to you under a choice of one of two licenses. You may choose to be licensed under the terms of the OpenIB.org BSD (MIT variant) license (see COPYING.BSD_MIT) or the GNU General Public License (GPL) Version 2 (see COPYING.GPL2), both included in this package.

Files marked 'See COPYING file' are licensed under the above Dual License.

# Other Options

Individual source files may use a license different from the above Default Dual License. If a license is declared in the file then it supersedes the Default License.

If a directory contains a COPYING file then the License from that file becomes the Default License for files in that directory and below.
# Copyright Holders

Refer to individual files for information on the copyright holders.

# License Catalog (Informative, Non Binding)

## Utilities

Utility source code that may be linked into any binary is available under several licenses:

- MIT license (see ccan/LICENSE.MIT)
- Creative Commons CC0 1.0 Universal License (see ccan/LICENSE.CC0)

## Providers

The following providers use a different license than the Default Dual License. Refer to files in each directory for details.

hfi1verbs : Dual License: GPLv2 or Intel 3 clause BSD license

ipathverbs : Dual License: GPLv2 or PathScale BSD Patent license

ocrdma : Dual License: GPLv2 or OpenIB.org BSD (FreeBSD variant), See COPYING.BSD_FB

## Libraries

All library compilable source code (.c and .h files) is available under the Default Dual License. Unmarked ancillary files may be available under a Dual License: GPLv2 or OpenIB.org BSD (FreeBSD variant).

## Tools (iwpmd, srp_daemon, ibacm)

All compilable source code (.c and .h files) is available under the Default Dual License. Unmarked ancillary files may be available under a Dual License: GPLv2 or OpenIB.org BSD (FreeBSD variant).

srp_daemon/srp_daemon/srp_daemon.sh: Any one of the GPLv2, a 2 clause BSD license or the CPLv1.

rdma-core-56.1/Documentation/000077500000000000000000000000001477342711600161435ustar00rootroot00000000000000rdma-core-56.1/Documentation/CMakeLists.txt000066400000000000000000000002631477342711600207040ustar00rootroot00000000000000
install(FILES
  ibacm.md
  ibsrpdm.md
  libibverbs.md
  librdmacm.md
  rxe.md
  udev.md
  tag_matching.md
  ../README.md
  ../MAINTAINERS
  DESTINATION "${CMAKE_INSTALL_DOCDIR}")

rdma-core-56.1/Documentation/azure-pipelines.md000066400000000000000000000073701477342711600216070ustar00rootroot00000000000000
# Azure Pipelines Continuous Integration

rdma-core uses Azure Pipelines to run a variety of compile tests on every pull request. These tests are intended to run through a variety of distribution configurations, with the goal of having rdma-core build and work on a wide range of distributions.

The system consists of several components:

- An Azure Container Registry
- The script buildlib/cbuild to produce the container images representing the test scenarios
- The instructions in buildlib/azure-pipelines.yml and related support scripts
- An Azure Pipelines account linked to the rdma-core GitHub
- A GitHub Check

Things are arranged so that the cbuild script can run the same commands in the same containers on the local docker system; it does not rely on any special or unique capabilities of Azure Pipelines.

# The Containers

Containers are built with the cbuild script. Internally it generates a Dockerfile and builds a docker container.

```sh
$ buildlib/cbuild build-images centos7
```

cbuild has definitions for a wide range of platforms that are interesting to test.

## Uploading Containers

Containers that are used by Azure Pipelines are prefixed with ucfconsort.azurecr.io/rdma-core/ to indicate they are served from that docker registry (which is implemented as an Azure Container Registry service). Once built, the container should be uploaded with:

```sh
# Needed one time
$ az login
$ sudo az acr login --name ucfconsort
$ sudo docker push ucfconsort.azurecr.io/rdma-core/centos7:latest
```

The user will need to be authorized to access the private registry.

## Testing containers locally

cbuild has several modes for doing local testing on the container. The fastest is to use 'cbuild make' as a replacement for Ninja.
It will run cmake and ninja commands inside the container, but using the local source tree unmodified. This is useful to test and resolve compilation problems.

```sh
$ buildlib/cbuild make centos7
```

Using 'make --run-shell' will perform all container setup, but instead of running Ninja it will open a bash shell inside the same container environment. This is useful to test and debug the container contents.

Package builds can be tested using 'cbuild pkg'. This automatically generates a source .tar.gz and then runs rpmbuild/etc within the container. This is useful for testing the package building scripts. Note that any changes must be checked in or they will not be included. Package builds are some of the tests that Azure Pipelines runs.

# Azure Pipelines

The actions are controlled by the content of buildlib/azure-pipelines.yml. The process is fairly straightforward and consists of running distribution package builds as well as a series of different compiler and analysis checks.

The compiler checks are run in a special 'azure-pipelines' container that has several compilers, ARM64 cross compilation, and other things.

cbuild is able to run an emulation of the pipelines commands using 'buildlib/cbuild pkg azp'

## Azure Pipelines Security

Microsoft has a strange security model - by default they do not send any login secrets to the VM if the VM is triggered from a GitHub Pull Request. This is required as the VM runs code from the PR, and a hostile PR could exfiltrate the secret data.

However, since fetching the containers requires a security token, PRs cannot get the container, and are basically entirely useless. The only option Azure Pipelines has is to inject *all* security tokens, including the GitHub token, which is madness.

The compromise is that when a non-team-member user proposes a Pull Request, a team member must review it and add "/azp run" to the comments to ack that the PR content is not hostile.

See https://developercommunity.visualstudio.com/content/idea/392281/granular-permissions-on-secrets-for-github-fork-pu.html

rdma-core-56.1/Documentation/contributing.md000066400000000000000000000163501477342711600212010ustar00rootroot00000000000000
# Contributing to rdma-core

rdma-core is a userspace project for a Linux kernel interface and follows many of the same expectations as contributing to the Linux kernel:

- One change per patch

  Carefully describe your change in the commit message and break up work into appropriate reviewable commits. Refer to [Linux Kernel Submitting Patches](https://github.com/torvalds/linux/blob/master/Documentation/process/submitting-patches.rst) for general information.

- Developer Certificate of Origin 'Signed-off-by' lines

  Include a Signed-off-by line to indicate your submission is suitably licensed and you have the legal authority to make this submission and accept the [DCO](#developers-certificate-of-origin-11)

- Broadly follow the [Linux Kernel coding style](https://github.com/torvalds/linux/blob/master/Documentation/process/coding-style.rst)

As in the Linux Kernel, commits that are fixing bugs should be marked with a Fixes: line to help backporting.

Test your change locally before submitting it; you can use 'buildlib/cbuild' to run the CI process locally and ensure your code meets the mechanical expectations before sending the PR.
# Using GitHub

Changes to rdma-core should be delivered via [GitHub Pull Request](https://docs.github.com/en/github/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests) to the [rdma-core](https://github.com/linux-rdma/rdma-core) project. Each pull request should have a descriptive title and "cover letter" summary indicating what commits are present.

A brief summary of the required steps:

- Create a github account for yourself
- [Clone](https://docs.github.com/en/github/creating-cloning-and-archiving-repositories/cloning-a-repository-from-github/cloning-a-repository) the [rdma-core](https://github.com/linux-rdma/rdma-core) project in GitHub
- Set up a local clone of your repository using 'git clone'.
- Ensure your local branch is updated to the tip of rdma-core
- Make your change. Form the commits and ensure they are correct
- Push your local git repository to your GitHub on a dedicated branch.
- Using the GitHub GUI, make a Pull Request from the dedicated branch to rdma-core

## Making Revisions

If changes are required, they should be integrated into the commits and the pull request updated via force push to your branch. As a policy, rdma-core wishes to have clean commit objects. As a courtesy to others, describe the changes you made in a Pull Request comment and consider including a before/after diff in that note. Do not close/open additional pull requests for the same topic.

## Continuous Integration

rdma-core performs a matrix of compile tests on each Pull Request. This is to ensure the project continues to be buildable on the wide range of supported distributions and compilers. These tests include some "static analysis" passes that are designed to weed out bugs.

Serious errors will result in a red X in the PR and will need to be corrected. Less serious errors, including checkpatch-related ones, will show up with a green check, but it is necessary to check the details to see that everything is appropriate. checkpatch is an informative tool; not all of its feedback is appropriate to fix.

A build similar to AZP can be run locally using docker and the 'buildlib/cbuild' script.

```sh
$ buildlib/cbuild build-images azp
$ buildlib/cbuild pkg azp
```

## Coordinating with Kernel Changes

Some changes consume a new uAPI that needs to be added to the kernel. Adding a new rdma uAPI requires kernel and user changes that must be presented together for review.

- Prepare the kernel patches and rdma-core patches together. Test everything
- Send the rdma-core patches as a PR to GitHub and possibly the mailing list
- Send the kernel patches to linux-rdma@vger.kernel.org. Refer to the matching GitHub PR in the cover letter by URL
- The GitHub PR will be marked with a 'needs-kernel-patch' tag and will not advance until the kernel component is merged.

Keeping the kernel include/uapi header files in sync requires some special actions. The first commit in the series should synchronize the kernel header copies in rdma-core with the proposed new kernel-headers that this change requires. This commit is created with the script:

```sh
$ kernel-headers/update ~/linux.git HEAD --not-final
```

It will generate a new commit in the rdma-core.git that properly copies the kernel headers from a kernel git tree. The --not-final should be used until official, final, commits are available in the canonical [git tree](http://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git).

This will allow the CI to run and the patches to be reviewed.
Once the kernel commits are applied, a final git rebase should be used to revise the kernel-headers commit:

```sh
$ kernel-headers/update ~/linux.git --amend
```

The updated commits should be force-pushed to GitHub.

Newer kernels should always work with older rdma-core and newer rdma-core should always work with older kernels. Changes forcing the simultaneous upgrade of the kernel and rdma-core are forbidden.

# Participating in the Mailing List

Patches of general interest should be sent to the mailing list linux-rdma@vger.kernel.org for detailed discussion. In particular, patches that modify any of the ELF versioned symbols or external programming API should be sent to the mailing list. While all patches must have a GitHub Pull Request created, minor patches can skip the mailing list process.

# Making a new library API

All new library APIs that can be called externally from rdma-core require a man page describing the API and must be sent to the mailing list for review. This includes device-specific "dv" APIs. Breaking the ABI of any exported symbol is forbidden.

# Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file; or

(b) The contribution is based upon previous work that, to the best of my knowledge, is covered under an appropriate open source license and I have the right under that license to submit that work with modifications, whether created in whole or in part by me, under the same open source license (unless I am permitted to submit under a different license), as indicated in the file; or

(c) The contribution was provided directly to me by some other person who certified (a), (b) or (c) and I have not modified it.

(d) I understand and agree that this project and the contribution are public and that a record of the contribution (including all personal information I submit with it, including my sign-off) is maintained indefinitely and may be redistributed consistent with this project or the open source license(s) involved.

then you just add a line saying:

Signed-off-by: Random J Developer

using your real name (sorry, no pseudonyms or anonymous contributions.) This will be done for you automatically if you use ``git commit -s``. Reverts should also include "Signed-off-by". ``git revert -s`` does that for you.

rdma-core-56.1/Documentation/ibacm.md000066400000000000000000000124751477342711600175470ustar00rootroot00000000000000
# The Assistant for InfiniBand Communication Management (IB ACM)

The IB ACM library implements and provides a framework for name, address, and route resolution services over InfiniBand. The IB ACM provides information needed to establish a connection, but does not implement the CM protocol.

IB ACM services are used by librdmacm to implement the rdma_resolve_addr, rdma_resolve_route, and rdma_getaddrinfo routines. The IB ACM is focused on being scalable and efficient. The current implementation limits network traffic, SA interactions, and centralized services. ACM supports multiple resolution protocols in order to handle different fabric topologies.

This release is limited in its handling of dynamic changes.

The IB ACM package comprises two components: the ibacm service and a test/configuration utility - ib_acme.

# Details

### ib_acme

The ib_acme program serves a dual role.
It acts as a utility to test ibacm operation and help verify whether the ibacm service and selected protocol are usable for a given cluster configuration. Additionally, it automatically generates ibacm configuration files to assist with or eliminate manual setup.

### acm configuration files

The ibacm service relies on two configuration files.

The acm_addr.cfg file contains name and address mappings for each IB endpoint. Although the names in the acm_addr.cfg file can be anything, ib_acme maps the host name and IP addresses to the IB endpoints.

The acm_opts.cfg file provides a set of configurable options for the ibacm service, such as timeout, number of retries, logging level, etc. ib_acme generates the acm_opts.cfg file using static information. A future enhancement would adjust options based on the current system and cluster size.

### ibacm

The ibacm service is responsible for resolving names and addresses to InfiniBand path information and caching such data. It is implemented as a daemon that executes with administrative privileges.

The ibacm implements a client interface over TCP sockets, which is abstracted by the librdmacm library.

One or more back-end protocols are used by the ibacm service to satisfy user requests. Although the ibacm supports standard SA path record queries on the back-end, it provides an experimental multicast resolution protocol in the hope of achieving greater scalability. The latter is not usable on all fabric topologies, specifically ones that may not have reversible paths. Users should use the ib_acme utility to verify that the multicast protocol is usable before running other applications.

Conceptually, the ibacm service implements an ARP-like protocol and either uses IB multicast records to construct path record data or queries the SA directly, depending on the selected route protocol. By default, the ibacm service uses and caches SA path record queries.

Specifically, all IB endpoints join a number of multicast groups. Multicast groups differ based on rates, mtu, sl, etc., and are prioritized. All participating endpoints must be able to communicate on the lowest priority multicast group.

The ibacm assigns one or more names/addresses to each IB endpoint using the acm_addr.cfg file. Clients provide source and destination names or addresses as input to the service, and receive as output path record data. The service maps a client's source name/address to a local IB endpoint. If a client does not provide a source address, then the ibacm service will select one based on the destination and local routing tables.

If the destination name/address is not cached locally, it sends a multicast request out on the lowest priority multicast group on the local endpoint. The request carries a list of multicast groups that the sender can use. The recipient of the request selects the highest priority multicast group that it can also use and returns that information directly to the sender. The request data is cached by all endpoints that receive the multicast request message. The source endpoint also caches the response and uses the multicast group that was selected to construct or obtain path record data, which is returned to the client.

The current implementation of the IB ACM has several additional restrictions:

- The ibacm is limited in its handling of dynamic changes; the ibacm should be stopped and restarted if a cluster is reconfigured.
- Support for IPv6 has not been verified.
- The number of addresses that can be assigned to a single endpoint is limited to 4.
- The number of multicast groups that an endpoint can support is limited to 2.

The ibacm contains several internal caches. These include caches for GID and LID destination addresses. These caches can be optionally preloaded. ibacm supports the OpenSM dump_pr plugin "full" PathRecord format, which is used to preload these caches. The file format is specified in the ibacm_opts.cfg file via the route_preload setting, which should be set to opensm_full_v1 for this file format. The default format is none, which does not preload these caches. See dump_pr.notes.txt in dump_pr for more information on the opensm_full_v1 file format and how to configure OpenSM to generate this file.

Additionally, the name, IPv4, and IPv6 caches can be preloaded by using the addr_preload option. The default is none, which does not preload these caches. To preload these caches, set this option to acm_hosts and configure the addr_data_file appropriately.

rdma-core-56.1/Documentation/ibsrpdm.md000066400000000000000000000033761477342711600201360ustar00rootroot00000000000000
# Using ibsrpdm

ibsrpdm is used for discovering and connecting to SRP SCSI targets on InfiniBand fabrics. These targets can be accessed with the InfiniBand SRP initiator module, "ib_srp," included in Linux kernels 2.6.15 and newer.

To run ibsrpdm, the ib_umad module must be loaded, as well as an appropriate low-level driver for the installed IB hardware.

With no command line parameters, ibsrpdm displays information about SRP targets in human-readable form:

    # ibsrpdm
    IO Unit Info:
        port LID:        0009
        port GID:        fe800000000000000005ad00000013e9
        change ID:       73b0
        max controllers: 0x01
    controller[ 1]
        GUID:      0005ad00000013e7
        vendor ID: 0005ad
        device ID: 0005ad
        IO class : 0100
        ID:        Topspin SRP/FC TCA
        service entries: 2
        service[ 0]: 0000000000000066 / SRP.T10:20030003BA27CC7A
        service[ 1]: 0000000000000066 / SRP.T10:20030003BA27CF53

With the "-c" flag, ibsrpdm displays information in a form that can be written to the kernel SRP initiator's add_target file to connect to the SRP targets. For example:

    # ibsrpdm -c
    id_ext=20030003BA27CC7A,ioc_guid=0005ad00000013e7,dgid=fe800000000000000005ad00000013e9,pkey=ffff,service_id=0000000000000066
    id_ext=20030003BA27CF53,ioc_guid=0005ad00000013e7,dgid=fe800000000000000005ad00000013e9,pkey=ffff,service_id=0000000000000066

Given this, the command below will connect to the first target discovered from the first port of the local HCA device "mthca0":

    # echo -n id_ext=20030003BA27CC7A,ioc_guid=0005ad00000013e7,dgid=fe800000000000000005ad00000013e9,pkey=ffff,service_id=0000000000000066 > /sys/class/infiniband_srp/srp-mthca0-1/add_target

rdma-core-56.1/Documentation/libibverbs.md000066400000000000000000000060371477342711600206160ustar00rootroot00000000000000
# Introduction

libibverbs is a library that allows programs to use RDMA "verbs" for direct access to RDMA (currently InfiniBand and iWARP) hardware from userspace. For more information on RDMA verbs, see the InfiniBand Architecture Specification vol. 1, especially chapter 11, and the RDMA Consortium's RDMA Protocol Verbs Specification.

# Using libibverbs

### Device nodes

The verbs library expects special character device files named /dev/infiniband/uverbsN to be created. When you load the kernel modules, including both the low-level driver for your IB hardware as well as the ib_uverbs module, you should see one or more uverbsN entries in /sys/class/infiniband_verbs in addition to the /dev/infiniband/uverbsN character device files.
To create the appropriate character device files automatically with udev, a rule like

    KERNEL="uverbs*", NAME="infiniband/%k"

can be used. This will create device nodes named /dev/infiniband/uverbs0 and so on. Since the RDMA userspace verbs should be safe for use by non-privileged users, you may want to add an appropriate MODE or GROUP to your udev rule.

### Permissions

To use IB verbs from userspace, a process must be able to access the appropriate /dev/infiniband/uverbsN special device file. You can check the permissions on this file with the command

    ls -l /dev/infiniband/uverbs*

Make sure that the permissions on these files are such that the user/group that your verbs program runs as can access the device file.

To use IB verbs from userspace, a process must also have permission to tell the kernel to lock sufficient memory for all of your registered memory regions as well as the memory used internally by IB resources such as queue pairs (QPs) and completion queues (CQs). To check your resource limits, use the command

    ulimit -l

(or "limit memorylocked" for csh-like shells). If you see a small number such as 32 (the units are KB) then you will need to increase this limit. This is usually done for ordinary users via the file /etc/security/limits.conf. More configuration may be necessary if you are logging in via OpenSSH and your sshd is configured to use privilege separation.

# Debugging

### Enabling debug prints

Library and provider debug prints can be enabled using the `VERBS_LOG_LEVEL` environment variable; the output shall be written to the file provided in the `VERBS_LOG_FILE` environment variable. When the library is compiled in debug mode and no file is provided, the output will be written to stderr.

Note: some of the debug prints are only available when the library is compiled in debug mode.

The following table describes the expected behavior when VERBS_LOG_LEVEL is set:

|                 | Release                         | Debug                                          |
|-----------------|---------------------------------|------------------------------------------------|
| Regular prints  | Output to VERBS_LOG_FILE if set | Output to VERBS_LOG_FILE, or stderr if not set |
| Datapath prints | Compiled out, no output         | Output to VERBS_LOG_FILE, or stderr if not set |

rdma-core-56.1/Documentation/librdmacm.md000066400000000000000000000025161477342711600204230ustar00rootroot00000000000000
# Device files

The userspace CMA uses a single device file regardless of the number of adapters or ports present.

To create the appropriate character device file automatically with udev, a rule like

    KERNEL="rdma_cm", NAME="infiniband/%k", MODE="0666"

can be used. This will create the device node named /dev/infiniband/rdma_cm or you can create it manually

    mknod /dev/infiniband/rdma_cm c 231 255

# Common issues

Using multiple interfaces : The librdmacm does support multiple interfaces. To make use of multiple interfaces, however, you need to instruct Linux to only send ARP replies on the interface targeted in the ARP request. This can be done using a command similar to the following:

    sysctl -w net.ipv4.conf.all.arp_ignore=2

Without this change, it's possible for Linux to respond to ARP requests on a different interface (IP address) than the IP address carried in the ARP request. This causes the RDMA stack to incorrectly map the remote IP address to the wrong RDMA device.

Using loopback : The librdmacm relies on ARP to resolve IP addresses to RDMA addresses.
To support loopback connections between different ports on the same system, ARP must be enabled for local resolution: sysctl net.ipv4.conf.all.accept_local=1 Without this setting, loopback connections may timeout during address resolution. rdma-core-56.1/Documentation/pyverbs.md000066400000000000000000000651071477342711600201700ustar00rootroot00000000000000# Pyverbs Pyverbs provides a Python API over rdma-core, the Linux userspace C API for the RDMA stack. ## Goals 1. Provide easier access to RDMA: RDMA has a steep learning curve as is and the C interface requires the user to initialize multiple structs before having usable objects. Pyverbs attempts to remove much of this overhead and provide a smoother user experience. 2. Improve our code by providing a test suite for rdma-core. This means that new features will be tested before merge, and it also means that users and distros will have tests for new and existing features, as well as the means to create them quickly. 3. Stay up-to-date with rdma-core - cover new features during development and provide a test / unit-test alongside the feature. ## Limitations Python handles memory for users. As a result, memory is allocated by Pyverbs when needed (e.g. user buffer for memory region). The memory will be accessible to the users, but not allocated or freed by them. ## Usage Examples Note that all examples use a hard-coded device name ('mlx5_0'). ##### Open an IB device Import the device module and open a device by name: ```python import pyverbs.device as d ctx = d.Context(name='mlx5_0') ``` 'ctx' is Pyverbs' equivalent to rdma-core's ibv_context. At this point, the IB device is already open and ready to use. ##### Query a device ```python import pyverbs.device as d ctx = d.Context(name='mlx5_0') attr = ctx.query_device() print(attr) FW version : 16.24.0185 Node guid : 9803:9b03:0000:e4c6 Sys image GUID : 9803:9b03:0000:e4c6 Max MR size : 0xffffffffffffffff Page size cap : 0xfffffffffffff000 Vendor ID : 0x2c9 Vendor part ID : 4119 HW version : 0 Max QP : 262144 Max QP WR : 32768 Device cap flags : 3983678518 Max SGE : 30 Max SGE RD : 30 MAX CQ : 16777216 Max CQE : 4194303 Max MR : 16777216 Max PD : 16777216 Max QP RD atom : 16 Max EE RD atom : 0 Max res RD atom : 4194304 Max QP init RD atom : 16 Max EE init RD atom : 0 Atomic caps : 1 Max EE : 0 Max RDD : 0 Max MW : 16777216 Max raw IPv6 QPs : 0 Max raw ethy QP : 0 Max mcast group : 2097152 Max mcast QP attach : 240 Max AH : 2147483647 Max FMR : 0 Max map per FMR : 2147483647 Max SRQ : 8388608 Max SRQ WR : 32767 Max SRQ SGE : 31 Max PKeys : 128 local CA ack delay : 16 Phys port count : 1 ``` 'attr' is Pyverbs' equivalent to ibv_device_attr. Pyverbs will provide it to the user upon completion of the call to ibv_query_device. ##### Query GID ```python import pyverbs.device as d ctx = d.Context(name='mlx5_0') gid = ctx.query_gid(port_num=1, index=3) print(gid) 0000:0000:0000:0000:0000:ffff:0b87:3c08 ``` 'gid' is Pyverbs' equivalent to ibv_gid, provided to the user by Pyverbs. ##### Query port The following code snippet provides an example of pyverbs' equivalent of querying a port. Context's query_port() command wraps ibv_query_port(). The example below queries the first port of the device. 
```python import pyverbs.device as d ctx=d.Context(name='mlx5_0') port_attr = ctx.query_port(1) print(port_attr) Port state : Active (4) Max MTU : 4096 (5) Active MTU : 1024 (3) SM lid : 0 Port lid : 0 lmc : 0x0 Link layer : Ethernet Max message size : 0x40000000 Port cap flags : IBV_PORT_CM_SUP IBV_PORT_IP_BASED_GIDS Port cap flags 2 : max VL num : 0 Bad Pkey counter : 0 Qkey violations counter : 0 Gid table len : 256 Pkey table len : 1 SM sl : 0 Subnet timeout : 0 Init type reply : 0 Active width : 4X (2) Ative speed : 25.0 Gbps (32) Phys state : Link up (5) Flags : 1 ``` ##### Extended query device The example below shows how to open a device using pyverbs and query the extended device's attributes. Context's query_device_ex() command wraps ibv_query_device_ex(). ```python import pyverbs.device as d ctx = d.Context(name='mlx5_0') attr = ctx.query_device_ex() attr.max_dm_size 131072 attr.rss_caps.max_rwq_indirection_table_size 2048 ``` #### Create RDMA objects ##### PD The following example shows how to open a device and use its context to create a PD. ```python import pyverbs.device as d from pyverbs.pd import PD with d.Context(name='mlx5_0') as ctx: pd = PD(ctx) ``` ##### MR The example below shows how to create a MR using pyverbs. Similar to C, a device must be opened prior to creation and a PD has to be allocated. ```python import pyverbs.device as d from pyverbs.pd import PD from pyverbs.mr import MR import pyverbs.enums as e with d.Context(name='mlx5_0') as ctx: with PD(ctx) as pd: mr_len = 1000 flags = e.IBV_ACCESS_LOCAL_WRITE mr = MR(pd, mr_len, flags) ``` ##### Memory window The following example shows the equivalent of creating a type 1 memory window. It includes opening a device and allocating the necessary PD. The user should unbind or close the memory window before being able to deregister an MR that the MW is bound to. ```python import pyverbs.device as d from pyverbs.pd import PD from pyverbs.mr import MW import pyverbs.enums as e with d.Context(name='mlx5_0') as ctx: with PD(ctx) as pd: mw = MW(pd, e.IBV_MW_TYPE_1) ``` ##### Device memory The following snippet shows how to allocate a DM - a direct memory object, using the device's memory. ```python import random from pyverbs.device import DM, AllocDmAttr import pyverbs.device as d with d.Context(name='mlx5_0') as ctx: attr = ctx.query_device_ex() if attr.max_dm_size != 0: dm_len = random.randint(4, attr.max_dm_size) dm_attrs = AllocDmAttr(dm_len) dm = DM(ctx, dm_attrs) ``` ##### DM MR The example below shows how to open a DMMR - device memory MR, using the device's own memory rather than a user-allocated buffer. ```python import random from pyverbs.device import DM, AllocDmAttr from pyverbs.mr import DMMR import pyverbs.device as d from pyverbs.pd import PD import pyverbs.enums as e with d.Context(name='mlx5_0') as ctx: attr = ctx.query_device_ex() if attr.max_dm_size != 0: dm_len = random.randint(4, attr.max_dm_size) dm_attrs = AllocDmAttr(dm_len) dm_mr_len = random.randint(4, dm_len) with DM(ctx, dm_attrs) as dm: with PD(ctx) as pd: dm_mr = DMMR(pd, dm_mr_len, e.IBV_ACCESS_ZERO_BASED, dm=dm, offset=0) ``` ##### CQ The following snippets show how to create CQs using pyverbs. Pyverbs supports both CQ and extended CQ (CQEX). As in C, a completion queue can be created with or without a completion channel, the snippets show that. CQ's 3rd parameter is cq_context, a user-defined context. We're using None in our snippets. 
```python import random from pyverbs.cq import CompChannel, CQ import pyverbs.device as d with d.Context(name='mlx5_0') as ctx: num_cqes = random.randint(0, 200) # Just arbitrary values. Max value can be # found in device attributes comp_vector = 0 # An arbitrary value. comp_vector is limited by the # context's num_comp_vectors if random.choice([True, False]): with CompChannel(ctx) as cc: cq = CQ(ctx, num_cqes, None, cc, comp_vector) else: cq = CQ(ctx, num_cqes, None, None, comp_vector) print(cq) CQ Handle : 0 CQEs : 63 ``` ```python import random from pyverbs.cq import CqInitAttrEx, CQEX import pyverbs.device as d import pyverbs.enums as e with d.Context(name='mlx5_0') as ctx: num_cqe = random.randint(0, 200) wc_flags = e.IBV_WC_EX_WITH_CVLAN comp_mask = 0 # Not using flags in this example # completion channel is not used in this example attrs = CqInitAttrEx(cqe=num_cqe, wc_flags=wc_flags, comp_mask=comp_mask, flags=0) print(attrs) cq_ex = CQEX(ctx, attrs) print(cq_ex) Number of CQEs : 10 WC flags : IBV_WC_EX_WITH_CVLAN comp mask : 0 flags : 0 Extended CQ: Handle : 0 CQEs : 15 ``` ##### Addressing related objects The following code demonstrates creation of GlobalRoute, AHAttr and AH objects. The example creates a global AH so it can also run on RoCE without modifications. ```python from pyverbs.addr import GlobalRoute, AHAttr, AH import pyverbs.device as d from pyverbs.pd import PD with d.Context(name='mlx5_0') as ctx: port_number = 1 gid_index = 0 # GID index 0 always exists and valid gid = ctx.query_gid(port_number, gid_index) gr = GlobalRoute(dgid=gid, sgid_index=gid_index) ah_attr = AHAttr(gr=gr, is_global=1, port_num=port_number) print(ah_attr) with PD(ctx) as pd: ah = AH(pd, attr=ah_attr) DGID : fe80:0000:0000:0000:9a03:9bff:fe00:e4bf flow label : 0 sgid index : 0 hop limit : 1 traffic class : 0 ``` ##### QP The following snippets will demonstrate creation of a QP and a simple post_send operation. For more complex examples, please see pyverbs/examples section. ```python from pyverbs.qp import QPCap, QPInitAttr, QPAttr, QP from pyverbs.addr import GlobalRoute from pyverbs.addr import AH, AHAttr import pyverbs.device as d import pyverbs.enums as e from pyverbs.pd import PD from pyverbs.cq import CQ import pyverbs.wr as pwr ctx = d.Context(name='mlx5_0') pd = PD(ctx) cq = CQ(ctx, 100, None, None, 0) cap = QPCap(100, 10, 1, 1, 0) qia = QPInitAttr(cap=cap, qp_type = e.IBV_QPT_UD, scq=cq, rcq=cq) # A UD QP will be in RTS if a QPAttr object is provided udqp = QP(pd, qia, QPAttr()) port_num = 1 gid_index = 3 # Hard-coded for RoCE v2 interface gid = ctx.query_gid(port_num, gid_index) gr = GlobalRoute(dgid=gid, sgid_index=gid_index) ah_attr = AHAttr(gr=gr, is_global=1, port_num=port_num) ah=AH(pd, ah_attr) wr = pwr.SendWR() wr.set_wr_ud(ah, 0x1101, 0) # in real life, use real values udqp.post_send(wr) ``` ###### Extended QP An extended QP exposes a new set of QP send operations to the user - extensibility for new send opcodes, vendor specific send opcodes and even vendor specific QP types. Pyverbs now exposes the needed interface to create such a QP. Note that the IBV_QP_INIT_ATTR_SEND_OPS_FLAGS in the `comp_mask` is mandatory when using the extended QP's new post send mechanism. 
```python
from pyverbs.qp import QPCap, QPInitAttrEx, QPAttr, QPEx
import pyverbs.device as d
import pyverbs.enums as e
from pyverbs.pd import PD
from pyverbs.cq import CQ

ctx = d.Context(name='mlx5_0')
pd = PD(ctx)
cq = CQ(ctx, 100)
cap = QPCap(100, 10, 1, 1, 0)
qia = QPInitAttrEx(qp_type=e.IBV_QPT_UD, scq=cq, rcq=cq, cap=cap, pd=pd,
                   comp_mask=e.IBV_QP_INIT_ATTR_SEND_OPS_FLAGS| \
                   e.IBV_QP_INIT_ATTR_PD)
qp = QPEx(ctx, qia)
```

##### XRCD

The following code demonstrates creation of an XRCD object.

```python
from pyverbs.xrcd import XRCD, XRCDInitAttr
import pyverbs.device as d
import pyverbs.enums as e
import stat
import os

ctx = d.Context(name='ibp0s8f0')
xrcd_fd = os.open('/tmp/xrcd', os.O_RDONLY | os.O_CREAT,
                  stat.S_IRUSR | stat.S_IRGRP)
init = XRCDInitAttr(e.IBV_XRCD_INIT_ATTR_FD | e.IBV_XRCD_INIT_ATTR_OFLAGS,
                    os.O_CREAT, xrcd_fd)
xrcd = XRCD(ctx, init)
```

##### SRQ

The following code snippet will demonstrate creation of an XRC SRQ object. For more complex examples, please see pyverbs/tests/test_odp.

```python
from pyverbs.xrcd import XRCD, XRCDInitAttr
from pyverbs.srq import SRQ, SrqInitAttrEx
import pyverbs.device as d
import pyverbs.enums as e
from pyverbs.cq import CQ
from pyverbs.pd import PD
import stat
import os

ctx = d.Context(name='ibp0s8f0')
pd = PD(ctx)
cq = CQ(ctx, 100, None, None, 0)
xrcd_fd = os.open('/tmp/xrcd', os.O_RDONLY | os.O_CREAT,
                  stat.S_IRUSR | stat.S_IRGRP)
init = XRCDInitAttr(e.IBV_XRCD_INIT_ATTR_FD | e.IBV_XRCD_INIT_ATTR_OFLAGS,
                    os.O_CREAT, xrcd_fd)
xrcd = XRCD(ctx, init)
srq_attr = SrqInitAttrEx(max_wr=10)
srq_attr.srq_type = e.IBV_SRQT_XRC
srq_attr.pd = pd
srq_attr.xrcd = xrcd
srq_attr.cq = cq
srq_attr.comp_mask = e.IBV_SRQ_INIT_ATTR_TYPE | e.IBV_SRQ_INIT_ATTR_PD | \
                     e.IBV_SRQ_INIT_ATTR_CQ | e.IBV_SRQ_INIT_ATTR_XRCD
srq = SRQ(ctx, srq_attr)
```

##### Open an mlx5 provider

A provider is essentially a Context with driver-specific extra features. As such, it inherits from Context. In the legacy flow, Context iterates over the IB devices and opens the one that matches the name given by the user (name= argument). When provider attributes are also given (attr=), the Context will assign the relevant ib_device to its device member, so that the provider will be able to open the device in its specific way, as demonstrated below:

```python
import pyverbs.providers.mlx5.mlx5dv as m
from pyverbs.pd import PD
attr = m.Mlx5DVContextAttr()  # Default values are fine
ctx = m.Mlx5Context(attr=attr, name='rocep0s8f0')
# The provider context can be used as a regular Context, e.g.:
pd = PD(ctx)  # Success
```

##### Query an mlx5 provider

After opening an mlx5 provider, users can use the device-specific query for non-legacy attributes. The following snippet demonstrates how to do that.
```python import pyverbs.providers.mlx5.mlx5dv as m ctx = m.Mlx5Context(attr=m.Mlx5DVContextAttr(), name='ibp0s8f0') mlx5_attrs = ctx.query_mlx5_device() print(mlx5_attrs) Version : 0 Flags : CQE v1, Support CQE 128B compression, Support CQE 128B padding, Support packet based credit mode (in RC QP) comp mask : CQE compression, SW parsing, Striding RQ, Tunnel offloads, Dynamic BF regs, Clock info update, Flow action flags CQE compression caps: max num : 64 supported formats : with hash, with RX checksum CSUM, with stride index SW parsing caps: SW parsing offloads : supported QP types : Striding RQ caps: min single stride log num of bytes: 6 max single stride log num of bytes: 13 min single wqe log num of strides: 9 max single wqe log num of strides: 16 supported QP types : Raw Packet Tunnel offloads caps: Max dynamic BF registers: 1024 Max clock info update [nsec]: 1099511 Flow action flags : 0 ``` ##### Create an mlx5 QP Using an Mlx5Context object, one can create either a legacy QP (creation process is the same) or an mlx5 QP. An mlx5 QP is a QP by inheritance but its constructor receives a keyword argument named `dv_init_attr`. If the user provides it, the QP will be created using `mlx5dv_create_qp` rather than `ibv_create_qp_ex`. The following snippet demonstrates how to create both a DC (dynamically connected) QP and a Raw Packet QP which uses mlx5-specific capabilities, unavailable using the legacy interface. Currently, pyverbs supports only creation of a DCI. DCT support will be added in one of the following PRs. ```python from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr from pyverbs.providers.mlx5.mlx5dv import Mlx5DVQPInitAttr, Mlx5QP import pyverbs.providers.mlx5.mlx5_enums as me from pyverbs.qp import QPInitAttrEx, QPCap import pyverbs.enums as e from pyverbs.cq import CQ from pyverbs.pd import PD with Mlx5Context(name='rocep0s8f0', attr=Mlx5DVContextAttr()) as ctx: with PD(ctx) as pd: with CQ(ctx, 100) as cq: cap = QPCap(100, 0, 1, 0) # Create a DC QP of type DCI qia = QPInitAttrEx(cap=cap, pd=pd, scq=cq, qp_type=e.IBV_QPT_DRIVER, comp_mask=e.IBV_QP_INIT_ATTR_PD, rcq=cq) attr = Mlx5DVQPInitAttr(comp_mask=me.MLX5DV_QP_INIT_ATTR_MASK_DC) attr.dc_type = me.MLX5DV_DCTYPE_DCI dci = Mlx5QP(ctx, qia, dv_init_attr=attr) # Create a Raw Packet QP using mlx5-specific capabilities qia.qp_type = e.IBV_QPT_RAW_PACKET attr.comp_mask = me.MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS attr.create_flags = me.MLX5DV_QP_CREATE_ALLOW_SCATTER_TO_CQE |\ me.MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_UC |\ me.MLX5DV_QP_CREATE_TUNNEL_OFFLOADS qp = Mlx5QP(ctx, qia, dv_init_attr=attr) ``` ##### Create an mlx5 CQ Mlx5Context also allows users to create an mlx5 specific CQ. The Mlx5CQ inherits from CQEX, but its constructor receives 3 parameters instead of 2. The 3rd parameter is a keyword argument named `dv_init_attr`. If provided by the user, the CQ will be created using `mlx5dv_create_cq`. The following snippet shows this simple creation process. 
```python from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr from pyverbs.providers.mlx5.mlx5dv import Mlx5DVCQInitAttr, Mlx5CQ import pyverbs.providers.mlx5.mlx5_enums as me from pyverbs.cq import CqInitAttrEx with Mlx5Context(name='rocep0s8f0', attr=Mlx5DVContextAttr()) as ctx: cqia = CqInitAttrEx() mlx5_cqia = Mlx5DVCQInitAttr(comp_mask=me.MLX5DV_CQ_INIT_ATTR_MASK_COMPRESSED_CQE, cqe_comp_res_format=me.MLX5DV_CQE_RES_FORMAT_CSUM) cq = Mlx5CQ(ctx, cqia, dv_init_attr=mlx5_cqia) ``` ##### CMID The following code snippet will demonstrate creation of a CMID object, which represents rdma_cm_id C struct, and establish connection between two peers. Currently only synchronous control path is supported (rdma_create_ep). For more complex examples, please see tests/test_rdmacm. ```python from pyverbs.qp import QPInitAttr, QPCap from pyverbs.cmid import CMID, AddrInfo import pyverbs.cm_enums as ce cap = QPCap(max_recv_wr=1) qp_init_attr = QPInitAttr(cap=cap) addr = '11.137.14.124' port = '7471' # Passive side sai = AddrInfo(src=addr, src_service=port, port_space=ce.RDMA_PS_TCP, flags=ce.RAI_PASSIVE) sid = CMID(creator=sai, qp_init_attr=qp_init_attr) sid.listen() # listen for incoming connection requests new_id = sid.get_request() # check if there are any connection requests new_id.accept() # new_id is connected to remote peer and ready to communicate # Active side cai = AddrInfo(src=addr, dst=addr, dst_service=port, port_space=ce.RDMA_PS_TCP) cid = CMID(creator=cai, qp_init_attr=qp_init_attr) cid.connect() # send connection request to passive addr ``` ##### ParentDomain The following code demonstrates the creation of Parent Domain object. In this example, a simple Python allocator is defined. It uses MemAlloc class to allocate aligned memory using a C style aligned_alloc. ```python from pyverbs.pd import PD, ParentDomainInitAttr, ParentDomain, \ ParentDomainContext from pyverbs.device import Context import pyverbs.mem_alloc as mem def alloc_p_func(pd, context, size, alignment, resource_type): p = mem.posix_memalign(size, alignment) return p def free_p_func(pd, context, ptr, resource_type): mem.free(ptr) ctx = Context(name='rocep0s8f0') pd = PD(ctx) pd_ctx = ParentDomainContext(pd, alloc_p_func, free_p_func) pd_attr = ParentDomainInitAttr(pd=pd, pd_context=pd_ctx) parent_domain = ParentDomain(ctx, attr=pd_attr) ``` ##### MLX5 VAR The following code snippet demonstrates how to allocate an mlx5dv_var then using it for memory address mapping, then freeing the VAR. ```python from pyverbs.providers.mlx5.mlx5dv import Mlx5VAR from pyverbs.device import Context import mmap ctx = Context(name='rocep0s8f0') var = Mlx5VAR(ctx) var_map = mmap.mmap(fileno=ctx.cmd_fd, length=var.length, offset=var.mmap_off) # There is no munmap method in mmap Python module, but by closing the mmap # instance the memory is unmapped. var_map.close() var.close() ``` ##### MLX5 PP Packet Pacing (PP) entry can be used for some device commands over the DEVX interface. It allows a rate-limited flow configuration on SQs. The following code snippet demonstrates how to allocate an mlx5dv_pp with rate limit value of 5, then frees the entry. 
```python
from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr, Mlx5PP
import pyverbs.providers.mlx5.mlx5_enums as e

# The device must be opened as DEVX context
mlx5dv_attr = Mlx5DVContextAttr(e.MLX5DV_CONTEXT_FLAGS_DEVX)
ctx = Mlx5Context(attr=mlx5dv_attr, name='rocep0s8f0')
rate_limit_inbox = (5).to_bytes(length=4, byteorder='big', signed=True)
pp = Mlx5PP(ctx, rate_limit_inbox)
pp.close()
```

##### MLX5 UAR

User Access Region (UAR) is part of PCI address space that is mapped for direct access to the HCA from the CPU. The UAR is needed for some device commands over the DevX interface. The following code snippet demonstrates how to allocate and free an mlx5dv_devx_uar.

```python
from pyverbs.providers.mlx5.mlx5dv import Mlx5UAR
from pyverbs.device import Context

ctx = Context(name='rocep0s8f0')
uar = Mlx5UAR(ctx)
uar.close()
```

##### Import device, PD and MR

Importing a device, PD and MR enables processes to share their context and then share the PDs and MRs associated with it. A process creates a device and then uses some of the Linux system calls to dup its 'cmd_fd' member, which lets another process obtain ownership. Once the other process obtains the 'cmd_fd', it can import the device, and then the PD(s) and MR(s), to share these objects.

Like in C, Pyverbs users are responsible for unimporting the imported objects (which will also close the Pyverbs instance in our case) after they finish using them, and they have to sync between the different processes in order to coordinate the closure of the objects. Unlike in C, closing the underlying objects is currently supported only via the "original" object (meaning only by the process that creates them) and not via the imported object. This limitation exists because currently there's no reference or relation between different Pyverbs objects in different processes. But it's doable and might be added in the future.

Here is a demonstration of importing a device, PD and MR in one process.

```python
from pyverbs.device import Context
from pyverbs.pd import PD
from pyverbs.mr import MR
import pyverbs.enums as e
import os

ctx = Context(name='ibp0s8f0')
pd = PD(ctx)
mr = MR(pd, 100, e.IBV_ACCESS_LOCAL_WRITE)

cmd_fd_dup = os.dup(ctx.cmd_fd)
imported_ctx = Context(cmd_fd=cmd_fd_dup)
imported_pd = PD(imported_ctx, handle=pd.handle)
imported_mr = MR(imported_pd, handle=mr.handle)
# MRs can be created as usual on the imported PD
secondary_mr = MR(imported_pd, 100, e.IBV_ACCESS_REMOTE_READ)
# Must manually unimport the imported objects (which close the object and frees
# other resources that use them) before closing the "original" objects.
# This prevents unexpected behaviours caused by the GC.
imported_mr.unimport()
imported_pd.unimport()
```

##### Flow Steering

Flow steering rules define packet matching done by the hardware. A spec describes packet matching on a specific layer (L2, L3 etc.). A flow is a collection of specs. A user QP can attach to flows in order to receive specific packets.
###### Flow and FlowAttr

```python
from pyverbs.qp import QPCap, QPInitAttr, QPAttr, QP
from pyverbs.flow import FlowAttr, Flow
from pyverbs.spec import EthSpec
import pyverbs.device as d
import pyverbs.enums as e
from pyverbs.pd import PD
from pyverbs.cq import CQ

ctx = d.Context(name='rocep0s8f0')
pd = PD(ctx)
cq = CQ(ctx, 100, None, None, 0)
cap = QPCap(100, 10, 1, 1, 0)
qia = QPInitAttr(cap=cap, qp_type = e.IBV_QPT_UD, scq=cq, rcq=cq)
qp = QP(pd, qia, QPAttr())

# Create Eth spec
eth_spec = EthSpec(ether_type=0x800, dst_mac="01:50:56:19:20:a7")
eth_spec.src_mac = "24:8a:07:a5:28:c8"
eth_spec.src_mac_mask = "ff:ff:ff:ff:ff:ff"

# Create Flow
flow_attr = FlowAttr(num_of_specs=1)
flow_attr.specs.append(eth_spec)
flow = Flow(qp, flow_attr)
```

###### Specs

Each spec holds specific network layer parameters for matching. To enforce a match, the user sets a mask for each parameter. If a bit is set in the mask, the corresponding bit in the value should be matched. Packets coming from the wire are matched against the flow specification. If a match is found, the associated flow actions are executed on the packet. In ingress flows, the QP parameter is treated as another action of scattering the packet to the respective QP.

###### Notes

* When creating a spec, the mask will be set to all FF's for each given value (unless explicitly provided by the user). When editing a spec, the mask should be specified explicitly.
* If a field is not provided, its value and mask will be set to zeros.
* Hardware only supports full / empty masks.
* Ethernet, IPv4, TCP/UDP, IPv6 and ESP specs can be inner (IBV_FLOW_SPEC_INNER), but are set to outer by default.

###### Ethernet spec

Example of creating and editing an Ethernet spec:

```python
from pyverbs.spec import EthSpec

eth_spec = EthSpec(src_mac="ab:cd:ef:ab:cd:ef", vlan_tag=0x123, is_inner=1)
eth_spec.dst_mac = "de:de:de:00:de:de"
eth_spec.dst_mac_mask = "ff:ff:ff:ff:ff:ff"
eth_spec.ether_type = 0x321
eth_spec.ether_type_mask = 0xffff
# Resulting spec
print(f'{eth_spec}')
```

Below is the output when printing the spec.

    Spec type   : IBV_FLOW_SPEC_INNER IBV_FLOW_SPEC_ETH
    Size        : 40
    Src mac     : ab:cd:ef:ab:cd:ef    mask: ff:ff:ff:ff:ff:ff
    Dst mac     : de:de:de:00:de:de    mask: ff:ff:ff:ff:ff:ff
    Ether type  : 8451                 mask: 65535
    Vlan tag    : 8961                 mask: 65535

##### MLX5 DevX Objects

A DevX object represents some underlying firmware object; the input command to create it is raw data given by the user application, which should match the device specification. Upon successful creation, the output buffer includes the raw data from the device according to its specification and is stored in the Mlx5DevxObj instance. This data can be used as part of related firmware commands to this object. In addition to creation, the user can query/modify and destroy the object.

Although weakrefs and DevX object closure are implemented and handled by Pyverbs, users must manually close these objects when finished, and should not leave them to be handled by the GC or by closing the Mlx5Context directly, since there's no guarantee that the DevX objects would be closed in the correct order; Mlx5DevxObj is a general class that can represent any of the device's available objects. Pyverbs does, however, guarantee to close DevX UARs and UMEMs in order, and after closing the other DevX objects.

The following code snippet shows how to allocate and destroy a PD object over DevX.
```python
from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr, Mlx5DevxObj
import pyverbs.providers.mlx5.mlx5_enums as dve
import struct

attr = Mlx5DVContextAttr(dve.MLX5DV_CONTEXT_FLAGS_DEVX)
ctx = Mlx5Context(attr, 'rocep8s0f0')
MLX5_CMD_OP_ALLOC_PD = 0x800
MLX5_CMD_OP_ALLOC_PD_OUTLEN = 0x10
cmd_in = struct.pack('!H14s', MLX5_CMD_OP_ALLOC_PD, bytes(0))
pd = Mlx5DevxObj(ctx, cmd_in, MLX5_CMD_OP_ALLOC_PD_OUTLEN)
pd.close()
```

rdma-core-56.1/Documentation/release.md000066400000000000000000000100741477342711600201070ustar00rootroot00000000000000
# Release Process

The release process of the rdma-core library consists of the following stages:

1. Change the library version, according to the [Overall Package Version](versioning.md) guide.
2. Push the change above to the master branch and ensure that Travis CI reports a successful build.
3. Create a local annotated signed tag vX.X.X (`git tag vX.X.X -a -s`).
4. Issue the `git release` command, which will push the tag and trigger Travis CI to upload the release tar.gz file and create the release notes based on the tag content.

## git release

There are many implementations of different `git release` commands. We recommend using the command from [this](https://github.com/mpalmer/github-release) repository due to its simplicity.

--- Copy&Paste from relevant [README](https://github.com/mpalmer/github-release/blob/master/README.md) ---

This very simple gem provides a `git release` command, which will automatically fill out any and all "release tags" into fully-blown "Github Releases", complete with release notes, a heading, and all the other good things in life.

Using this gem, you can turn the following tag annotation:

    First Release

    It is with much fanfare and blowing of horns that I bequeath the
    awesomeness of `git release` upon the world. Features in this release
    include:

    * Ability to create a release from a tag annotation or commit message;
    * Automatically generates an OAuth token if needed;
    * Feeds your cat while you're hacking(*)

    You should install it now! `gem install github-release`

Into [this](https://github.com/mpalmer/github-release/releases/tag/v0.1.0) simply by running

    git release

### Installation

Simply install the gem:

    gem install github-release

### Usage

Using `git release` is very simple. Just make sure that your `origin` remote points to your Github repo, and then run `git release`. All tags that look like a "version tag" (see "Configuration", below) will be created as Github releases (if they don't already exist) and the message from the tag will be used as the release notes.

The format of the release notes is quite straightforward -- the first line of the message associated with the commit will be used as the "name" of the release, with the rest of the message used as the "body" of the release. The body will be interpreted as Github-flavoured markdown, so if you'd like to get fancy, go for your life.

The message associated with the "release tag" is either the tag's annotation message (if it is an annotated tag) or else the commit log of the commit on which the tag is placed. I *strongly* recommend annotated tags (but then again, [I'm biased...](http://theshed.hezmatt.org/git-version-bump))

The first time you use `git release`, it will ask you for your Github username and password. This is used to request an OAuth token to talk to the Github API, which is then stored in your global git config. Hence you *shouldn't* be asked for your credentials every time you use `git release`.
If you need to use multiple github accounts for different repos, you can
override the `release.api-token` config parameter in your repo configuration
(but you'll have to get your own OAuth token).

### Configuration

There are a few things you can configure to make `git release` work slightly
differently. None of them should be required for normal, sane use.

* `release.remote` (default `origin`) -- The name of the remote which is used
  to determine what github repository to send release notes to.
* `release.api-token` (default is runtime generated) -- The OAuth token to use
  to authenticate access to the Github API. When you first run `git release`,
  you'll be prompted for a username and password to use to generate an initial
  token; if you need to override it on a per-repo basis, this is the key
  you'll use.
* `release.tag-regex` (default `v\d+\.\d+(\.\d+)?$`) -- The regular expression
  to filter which tags denote releases, as opposed to other tags you might
  have decided to make. Only tags which match this regular expression will be
  pushed up by `git release`, and only those tags will be marked as releases.
rdma-core-56.1/Documentation/rxe.md000066400000000000000000000007041477342711600172640ustar00rootroot00000000000000# Configure Soft-RoCE (RXE):

Create an RXE device over a network interface (e.g. eth0):

    # rdma link add rxe_eth0 type rxe netdev eth0

Use the status command to display the current configuration:

    # rdma link

If you are using a Mellanox HCA, make sure that the mlx4_ib/mlx5_ib kernel
module is not loaded (modprobe -rv mlx4_ib) in the Soft-RoCE machine.

Now you have an InfiniBand device called "rxe_eth0" that can be used to run
any RoCE app.
rdma-core-56.1/Documentation/stable.md000066400000000000000000000101771477342711600177450ustar00rootroot00000000000000# Stable Branch Release

## General

Current Maintainer: Nicolas Morey

Upstream rdma-core is considered stable after each mainline release. Branched
stable releases, off a mainline release, are on an as-needed basis and limited
to bug fixes only. All bug fixes are to be backported from mainline and
applied by the stable branch maintainer. Branched stable releases will append
an additional release number (e.g. 15.1) and will ensure that Azure Pipelines
CI reports a successful build.

Regular stable releases are usually generated at the same time as a mainline
release. Some mainline releases are, however, skipped if not enough
significant patches have been queued. Additional stable releases can be
generated if the need arises (needed by distributions or OFED). Please contact
the maintainer if a stable release is needed outside of the regular schedule.

Stable branches are named stable-vXX where XX is the base version number. Once
older releases are no longer supported, their branches will be deleted, but
the stable release tags will be kept. Branches are maintained for about 4
years.

## Patch Rules

* It must be obviously correct and tested.
* It cannot be bigger than 100 lines, with context.
* It must fix only one thing.
* It must fix a real bug that bothers people (not a "This could be a
  problem..." type thing).
* ABI must NOT be changed by the fix.

## Submitting to the stable branch

Submissions to the stable branch follow the same process as
[kernel-stable](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/stable-kernel-rules.rst).

### Option 1

Patches sent to master should add the tag: `Cc: stable@linux-rdma.org` in the
sign-off area.
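For illustration, the sign-off area of such a patch could look like this (the
subject, commit ID and author shown here are hypothetical):

```
verbs: Fix a NULL pointer dereference in the example path

Description of the bug and of the fix.

Fixes: 123456789abc ("commit that introduced the bug")
Cc: stable@linux-rdma.org
Signed-off-by: Jane Developer <jane@example.com>
```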
Once the patch is merged, it will be applied to the stable tree without
anything else needing to be done by the author or subsystem maintainer. If the
patch should be applied to more than one release, add the version info as
such:

`Cc: stable@linux-rdma.org # v15.1 v14`

### Option 2

After the patch has been merged to master, send an email to
stable@linux-rdma.org containing the subject of the patch, the commit ID, why
you think it should be applied, and what rdma-core version you wish it to be
applied to.

### Option 3

Send the patch, after verifying that it follows the above rules, to
stable@linux-rdma.org. You must note the upstream commit ID in the changelog
of your submission, as well as the rdma-core version you wish it to be applied
to.

Option 1 is strongly preferred; it is the easiest and most common. Option 2
and Option 3 are more useful if the patch isn't deemed worthy at the time it
is applied to a public git tree (for instance, because it deserves more
regression testing first). Option 3 is especially useful if the patch needs
some special handling to apply to an older version.

Note that for Option 3, if the patch deviates from the original upstream patch
(for example because it had to be backported) this must be very clearly
documented and justified in the patch description.

## Versioning

See versioning.md for setting the package version on a stable branch.

## Creating a stable branch

A stable branch should be created from a release tag of the master branch.
The first thing to do on a stable branch is to commit the mainline release ABI
info so that later patches/fixes can be checked against this reference. To do
that, the creator of the branch should run

```
./buildlib/cbuild build-images azp
mkdir ABI
touch ABI/.gitignore
git add ABI/.gitignore
echo " changeLogCompareToRelease: lastNonDraftReleaseByTag" >> buildlib/azure-pipelines-release.yml
echo " changeLogCompareToReleaseTag: $(git describe HEAD --match="v*.0" --abbrev=0 | sed 's/0$/*/')" >> buildlib/azure-pipelines-release.yml
git add buildlib/azure-pipelines-release.yml
git commit -s -m "stable branch creation" -m "Add ABI files and tune Azure pipeline for changelog generation"
./buildlib/cbuild pkg azp
git add ABI/*
git commit --amend
```

'cbuild pkg azp' will fail as the ABI verification step fails, but it will
produce the ABI reference files. Note that the ABI directory must NOT be
committed at any point in the master branch.
rdma-core-56.1/Documentation/tag_matching.md000066400000000000000000000341021477342711600211120ustar00rootroot00000000000000# Hardware tag matching

## Introduction

The MPI standard defines a set of rules, known as tag-matching, for matching
source send operations to destination receives according to the following
attributes:

* Communicator
* User tag - wild card may be specified by the receiver
* Source rank - wild card may be specified by the receiver
* Destination rank - wild card may be specified by the receiver

These matching attributes are specified by all Send and Receive operations.
Send operations from a given source to a given destination are processed in
the order in which the Sends were posted. Receive operations are associated
with the earliest send operation (from any source) that matches the
attributes, in the order in which the Receives were posted. Note that Receive
tags are not necessarily consumed in the order they are created; e.g., a
later-generated tag may be consumed if earlier tags do not satisfy the
matching rules.
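To illustrate these rules before diving into the offload, here is a minimal
software model of a receive matching queue (plain illustrative Python, not an
rdma-core API; all names are invented):

```python
# Minimal software model of the MPI tag-matching rules described above.
ANY = object()  # wildcard the receiver may use for source rank or tag

posted_recvs = []  # receive entries, kept in posting order


def post_recv(src, tag, buf):
    posted_recvs.append((src, tag, buf))


def on_arrival(msg_src, msg_tag):
    # Receives are scanned in posting order; the earliest entry whose
    # attributes match consumes the message. A later-posted entry may
    # therefore be consumed first when earlier entries do not match.
    for entry in posted_recvs:
        src, tag, buf = entry
        if src in (ANY, msg_src) and tag in (ANY, msg_tag):
            posted_recvs.remove(entry)
            return buf  # matched: an 'expected' message
    return None  # 'unexpected' message: saved for a later matching receive
```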
When a message arrives at the receiver, MPI implementations often classify it
as either 'expected' or 'unexpected' according to whether a Receive operation
with a matching tag has already been posted by the application. In the
expected case, the message may be processed immediately. In the unexpected
case, the message is saved in an unexpected message queue, and will be
processed when a matching Receive operation is posted.

To bound the amount of memory needed to hold unexpected messages, MPI
implementations use two data transfer protocols. The 'eager' protocol is used
for small messages. Eager messages are sent without any prior synchronization
and are processed/buffered at the receiver. Typically, with RDMA, a single
RDMA-Send operation is used to transfer the data. The 'rendezvous' protocol is
used for large messages. Initially, only the message tag is sent along with
some meta-data. Only when the tag is matched to a Receive operation will the
receiver initiate the corresponding data transfer. A common RDMA
implementation is to send the message tag with an RDMA-Send, and transfer the
data with an RDMA-Read issued by the receiver. When the transfer is complete,
the receiver will notify the sender that its buffer may be freed, using an
RDMA-Send.

## RDMA tag-matching offload

Tag-matching offload satisfies the following principles:

- Tag-matching is viewed as an RDMA application, and thus does not affect the
  RDMA transport in any way [(*)](#m1)
- Tag-matching processing will be split between HW and SW.
  * HW will hold a bounded prefix of Receive tags
    - HW will process and transfer any expected message that matches a tag
      held in HW.
    - In case the message uses the rendezvous protocol, HW will also initiate
      the RDMA-Read data transfer and send a notification message when the
      data transfer completes.
  * SW will handle any message that is either unexpected or whose tag is not
    held in HW.

(*) This concept can apply to additional application-specific offloads in the
future.

Tag-matching is initially defined for the RC transport. Tag-matching messages
are encapsulated in RDMA-Send messages and contain the following headers:

```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1

Tag Matching Header (TMH):
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   Operation   |                   reserved                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      User data (optional)                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              Tag                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              Tag                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Rendezvous Header (RVH):
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Virtual Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Virtual Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Remote Key                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            Length                             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```

Tag-matching messages always contain a TMH. An RVH is added for Rendezvous
request messages. The following message formats are defined:

- Eager request: TMH | payload
- Rendezvous request: TMH | RVH | optional meta-data [(**)](#m2)
- Rendezvous response: TMH

Note that rendezvous data transfers are standard RDMA-Reads.

(**) Rendezvous request messages may also arrive unexpected; in this case, the
message is handled in SW, optionally leveraging additional meta-data passed by
the sender.
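To make the wire layout above concrete, the following sketch packs a TMH and
an RVH with Python's struct module. It is illustrative only: real applications
use the `ibv_tmh`/`ibv_rvh` structs shown later, and the numeric opcode
constants come from infiniband/tm_types.h, not from this sketch.

```python
import struct


def pack_tmh(opcode, app_ctx, tag):
    # TMH: a 1-byte opcode, 3 reserved bytes, a 32-bit user-data word and
    # a 64-bit tag, all big-endian as in the diagram above (16 bytes).
    return struct.pack('!B3xIQ', opcode, app_ctx, tag)


def pack_rvh(va, rkey, length):
    # RVH: a 64-bit virtual address, a 32-bit rkey and a 32-bit length
    # (16 bytes).
    return struct.pack('!QII', va, rkey, length)


def rendezvous_request(opcode, app_ctx, tag, va, rkey, length, meta=b''):
    # Rendezvous request: TMH | RVH | optional meta-data.
    return pack_tmh(opcode, app_ctx, tag) + pack_rvh(va, rkey, length) + meta
```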
As tag-matching messages are standard RDMA-Sends, no special HW support is needed at the sender. At the receiver, we introduce a new SRQ type - a Tag-Matching SRQ (TM-SRQ). The TM-SRQ forms the serialization point for matching messages coming from any of the associated RC connections, and reports all tag matching completions and events to a dedicated CQ. 2 kinds of buffers may be posted to the TM-SRQ: - Buffers associated with tags (tagged-buffers), which are used when a match is made by HW - Standard SRQ buffers, which are used for unexpected messages (from HW's perspective) When a message is matched by HW, the payload is transferred directly to the application buffer (both in the eager and the rendezvous case), while skipping any TM headers. Otherwise, the entire message, including any TM headers, is scattered to the SRQ buffer. Since unexpected messages are handled in SW, there exists an inherent race between the arrival of messages from the wire and posting of new tagged buffers. For example, consider 2 incoming messages m1 and m2 and matching buffers b1 and b2 that are posted asynchronously. If b1 is posted after m1 arrives but before m2, m1 would be delivered as an unexpected message while m2 would match b1, violating the ordering rules. Consequently, whenever HW deems a message unexpected, tag matching must be disabled for new tags until SW and HW synchronize. This synchronization is achieved by reporting to HW the number of unexpected messages handled by SW (with respect to the current posted tags). When the SW and HW are in synch, tag matching resumes normally. ## Tag Matching Verbs ### Capabilities Tag matching capabilities are queried by ibv_query_device_ex(), and report the following attributes: * **max_rndv_hdr_size** - Max size of rendezvous request header * **max_num_tags** - Max number of tagged buffers in a TM-SRQ matching list * **max_ops** - Max number of outstanding tag matching list operations * **max_sge** - Max number of SGEs in a tagged buffer * **flags** - the following flags are currently defined: - IBV_TM_CAP_RC - Support tag matching on RC transport ### TM-SRQ creation TM-SRQs are created by the ibv_create_srq_ex() Verb, which accepts the following new attributes: * **srq_type** - set to **IBV_SRQT_TM** * **comp_mask** - set the **IBV_SRQ_INIT_ATTR_TM** flag * **tm_cap** - TM properties for this TM-SRQ; defined as follows: ```h struct ibv_tm_cap { uint32_t max_num_tags; /* Matching list size */ uint32_t max_ops; /* Number of outstanding TM operations */ } ``` Similarly to XRC SRQs, a TM-SRQ has a dedicated CQ. RC QPs are associated with the TM-SRQ just like standard SRQs. However, the ownership of the QP's Send Queue is passed to the TM-SRQ, which uses it to initiate rendezvous RDMA-Reads. Receive completions are reported to the TM-SRQ's CQ. ### Managing TM receive buffers Untagged (unexpected) buffers are posted using the standard **ibv_post_srq_recv**() Verb. 
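Continuing the struct-based sketch from the previous section, software that
finds an unexpected message in a standard SRQ buffer could start by re-parsing
the headers (again, illustrative only):

```python
import struct


def parse_unexpected(buf):
    # For unexpected messages the entire message, TM headers included, is
    # scattered to the SRQ buffer, so SW re-parses the 16-byte TMH (and,
    # for rendezvous requests, the RVH that follows it).
    opcode, app_ctx, tag = struct.unpack_from('!B3xIQ', buf, 0)
    return opcode, app_ctx, tag, buf[16:]
```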
Tagged buffers are manipulated by a new **ibv_post_srq_ops**() Verb: ```h int ibv_post_srq_ops(struct ibv_srq *srq, struct ibv_ops_wr *wr, struct ibv_ops_wr **bad_wr); ``` ```h struct ibv_ops_wr { uint64_t wr_id; /* User defined WR ID */ /* Pointer to next WR in list, NULL if last WR */ struct ibv_ops_wr *next; enum ibv_ops_wr_opcode opcode; /* From enum ibv_ops_wr_opcode */ int flags; /* From enum ibv_ops_flags */ struct { /* Number of unexpected messages * handled by SW */ uint32_t unexpected_cnt; /* Input parameter for the DEL opcode * and output parameter for the ADD opcode */ uint32_t handle; struct { /* WR ID for TM_RECV */ uint64_t recv_wr_id; struct ibv_sge *sg_list; int num_sge; uint64_t tag; uint64_t mask; } add; } tm; }; ``` The following opcodes are defined: Opcode **IBV_WR_TAG_ADD** - add a tagged buffer entry to the tag matching list. The input consists of an SGE list, a tag, a mask (matching parameters), and the latest unexpected message count. A handle that uniquely identifies the entry is returned upon success. Opcode **IBV_WR_TAG_DEL** - delete a tag entry. The input is an entry handle returned from a previous **IBV_WR_TAG_ADD** operation, and the latest unexpected message count. Note that the operation may fail if the associated tag was consumed by an incoming message. In this case **IBV_WC_TM_ERR** status will be returned in WC. Opcode **IBV_WR_TAG_SYNC** - report the number of unexpected messages handled by the SW. The input comprises only the unexpected message count. To reduce explicit synchronization to a minimum, all completions indicate when synchronization is necessary by setting the **IBV_WC_TM_SYNC_REQ** flag. **ibv_post_srq_ops**() operations are non-signaled by default. To request an explicit completion for a given operation, the standard **IBV_OPS_SIGNALED** flag must be set. The number of outstanding tag-manipulation operations must not exceed the **max_ops** capability. While **wr_id** identifies the tag manipulation operation itself, the **recv_wr_id** field is used to identify the tagged buffer in receive completions. ### Sending TM messages TM messages are sent using standard RC Send operations. A TM message comprises a Tag-Matching Header (TMH), an optional Rendezvous Header (RVH), and a payload. TMH and RVH are defined in infiniband/tm_types.h: ```h struct ibv_tmh { uint8_t opcode; uint8_t reserved[3]; __be32 app_ctx; __be64 tag; }; ``` ```h struct ibv_rvh { __be64 va; __be32 rkey; __be32 len; }; ``` The following opcodes are defined: * **IBV_TM_NO_TAG** - Send a message without a tag. Such a message will always be treated as unexpected by the receiver TM-SRQ. Any data following the opcode is ignored by the tag matching logic, and the message is delivered in its entirety (including the opcode) to the standard SRQ buffer. * **IBV_TM_OP_EAGER** - Send an eager tagged message. The message consists of a TMH followed by payload. * **IBV_TM_OP_RNDV** - Send a tagged rendezvous request. The message consists of a TMH, an RVH, and optional additional data (which may be inspected by receiver SW if the message is deemed unexpected). The RVH must refer to a registered buffer containing the rendezvous payload. The total rendezvous message size must not exceed the **max_rndv_hdr_size** capability. The Sender must consider the operation outstanding until a TM message with the **IBV_TM_OP_FIN** opcode is received, after which the buffer may be deregistered and freed. * **IBV_TM_OP_FIN** - Send a rendezvous completion indication. 
The message consists of a copy of the original TMH and RVH of the rendezvous
request, apart from the opcode. This message is sent after the receiver has
completed the transfer of the rendezvous payload by an RDMA-Read operation. It
may be sent either by HW or SW, depending on whether the rendezvous request
was handled as expected or unexpected by the TM-SRQ.

### TM completion processing

There are two types of TM completions: tag-manipulation and receive
completions.

Tag-manipulation operations generate the following completion opcodes:

* **IBV_WC_TM_ADD** - completion of a tag addition operation
* **IBV_WC_TM_DEL** - completion of a tag removal operation
* **IBV_WC_TM_SYNC** - completion of a synchronization operation

These completions are complemented by the **IBV_WC_TM_SYNC_REQ** flag, which
indicates whether further HW synchronization is needed.

TM receive completions generate the following completion codes:

* **IBV_WC_RECV** - standard SRQ completion; used for unexpected messages
* **IBV_WC_TM_NO_TAG** - completion of a message sent with the
  **IBV_TM_NO_TAG** opcode.
* **IBV_WC_TM_RECV** - completion of a tag-matching operation

The **IBV_WC_TM_RECV** completion is complemented by the following completion
flags:

- **IBV_WC_TM_MATCH** - a match was performed
- **IBV_WC_TM_DATA_VALID** - all data of the matched message has been
  delivered to memory

In single-packet eager messages, both flags are set. When larger messages or
rendezvous transfers are involved, matching and data transfer completion are
distinct events that generate two completion events for the same
**recv_wr_id**. While data transfer completions may be arbitrarily delayed
depending on message size, matching completion is reported immediately and is
always serialized with respect to other matches and the completion of
unexpected messages.

In addition, **IBV_WC_TM_RECV** completions provide further information about
the matched message. This information is obtained using extended CQ processing
via the following extractor function:

```h
static inline void ibv_wc_read_tm_info(struct ibv_cq_ex *cq,
				       struct ibv_wc_tm_info *tm_info);
```

```h
struct ibv_wc_tm_info {
	uint64_t tag;  /* tag from TMH */
	uint32_t priv; /* opaque user data from TMH */
};
```

Finally, when a posted tagged buffer is insufficient to hold the data of a
rendezvous request, the HW completes the buffer with an
IBV_WC_TM_RNDV_INCOMPLETE status. In this case, the TMH and RVH headers are
scattered into the tagged buffer (tag-matching has still been completed!), and
message handling is resumed by SW.
rdma-core-56.1/Documentation/testing.md000066400000000000000000000160241477342711600201450ustar00rootroot00000000000000# Testing in rdma-core

rdma-core now offers an infrastructure for quick and easy additions of
feature-specific tests.

## Design

### Resources Management

The `BaseResources` class is the most basic objects aggregator available. It
includes a Context and a PD.

Inheriting from it is the `TrafficResources` class, which also holds an MR, a
CQ and a QP, making it sufficient to support loopback traffic testing. It
exposes methods for the creation of these objects, which can be overridden by
inheriting classes.

Three classes currently inherit from `TrafficResources`:
- `RCResources`
- `UDResources`
- `XRCResources`

The above subclasses add traffic-specific constants. For example, `UDResources`
overrides create_mr and adds the size of the GRH header to the message size.
`RCResources` exposes a wrapper to modify the QP to RTS.
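As a sketch of what such a subclass can look like (modeled on the ODP example
discussed later in this document; the exact access flags and the presence of a
capability-check decorator are assumptions, not a verbatim copy of the tests):

```python
import pyverbs.enums as e
from pyverbs.mr import MR
from tests.base import RCResources


class OdpRC(RCResources):
    def create_mr(self):
        # Override the default MR registration to request ODP. In the real
        # tests, a capability-check decorator from tests/utils.py guards
        # this call so that unsupported devices skip the test.
        access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_ON_DEMAND
        self.mr = MR(self.pd, self.msg_size, access)
```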
### Tests-related Classes

`unittest.TestCase` is a logical test unit in Python's unittest module.
`RDMATestCase` inherits from it and adds the option to accept parameters (an
example follows below) or use a random set of valid parameters:

- If no device was provided, it iterates over the existing devices; for each
  port of each device, it checks which GID indexes are valid (in RoCE, only
  IPv4- and IPv6-based GIDs are used). Each is added to an array and one entry
  is selected.
- If a device was provided, the same process is done for all ports of this
  device, and so on.

### Traffic Utilities

tests/utils.py offers a few wrappers for common traffic operations, which use
default values to keep their callers short. These traffic utilities accept an
aggregation object as their first parameter and rely on that object to have
valid RDMA resources for proper functioning.

- get_[send, recv]_wr() creates a [Send, Recv]WR object with a single SGE. It
  also sets the MR content to be 'c's for the client side or 's's for the
  server side (this is later validated).
- post_send() posts a single send request to the aggregation object's QP. If
  the QP is a UD QP, an address vector will be added to the send WR.
- post_recv() posts the given RecvWR multiple times, so it can be used to fill
  the RQ prior to traffic as well as during traffic.
- poll_cq() polls completions from the CQ and raises an exception on a
  non-success status.
- validate() verifies that the data in the MR is as expected ('c's for the
  server, 's's for the client).
- traffic() runs iterations of send/recv between 2 players.

## How to run rdma-core's tests

#### Developers

The tests can be executed from ./build/bin:

```
./build.sh
./build/bin/run_tests.py
```

#### Users

The tests are not a Python package; as such, they can be found under
/usr/share/doc/rdma-core-{version}/tests. In order to run all tests:

```
python /usr/share/doc/rdma-core-{version}/tests/run_tests.py
```

#### Execution output

Output will be something like:

```
$ ./build/bin/run_tests.py
..........................................ss...............
----------------------------------------------------------------------
Ran 59 tests in 13.268s

OK (skipped=2)
```

A dot represents a passing test. 's' means a skipped test. 'E' means a test
that failed.

Tests can also be executed in verbose mode:

```
$ python3 /usr/share/doc/rdma-core-26.0/tests/run_tests.py -v
test_create_ah (test_addr.AHTest) ... ok
test_create_ah_roce (test_addr.AHTest) ... ok
test_destroy_ah (test_addr.AHTest) ... ok
test_create_comp_channel (test_cq.CCTest) ... ok
<many more lines here>
test_odp_rc_traffic (test_odp.OdpTestCase) ... skipped 'No port is up, can't run traffic'
test_odp_ud_traffic (test_odp.OdpTestCase) ... skipped 'No port is up, can't run traffic'
----------------------------------------------------------------------
Ran 59 tests in 12.857s

OK (skipped=2)
```

Verbose mode provides the reason for skipping the test (if one was provided by
the test developer).

### Customized Execution

tests/\_\_init\_\_.py defines a `_load_tests` function that returns an array
with the tests that will be executed. The default implementation collects all
test_* methods from all the classes that inherit from `unittest.TestCase` (or
`RDMATestCase`) and are located in files under the tests directory whose names
start with test_.

Users can execute part of the tests by adding `-k` to the run_tests.py
command. The following example executes only the test cases in files whose
names start with `test_device`, rather than all files starting with `test_`.
```
$ build/bin/run_tests.py -v -k test_device
test_create_dm (tests.test_device.DMTest) ... ok
test_create_dm_bad_flow (tests.test_device.DMTest) ... ok
test_destroy_dm (tests.test_device.DMTest) ... ok
test_destroy_dm_bad_flow (tests.test_device.DMTest) ... ok
test_dm_read (tests.test_device.DMTest) ... ok
test_dm_write (tests.test_device.DMTest) ... ok
test_dm_write_bad_flow (tests.test_device.DMTest) ... ok
test_dev_list (tests.test_device.DeviceTest) ... ok
test_open_dev (tests.test_device.DeviceTest) ... ok
test_query_device (tests.test_device.DeviceTest) ... ok
test_query_device_ex (tests.test_device.DeviceTest) ... ok
test_query_gid (tests.test_device.DeviceTest) ... ok
test_query_port (tests.test_device.DeviceTest) ... ok
test_query_port_bad_flow (tests.test_device.DeviceTest) ... ok
----------------------------------------------------------------------
Ran 14 tests in 0.152s

OK
```

### GPUDirect RDMA

Part of the available tests includes GPUDirect RDMA. Those tests run RDMA
traffic over CUDA-allocated memory. In order to run them successfully, it's
required to have a supported NVIDIA GPU, CUDA 11.7 or above with an "Open
flavor" ("-m kernel-open") driver 515 or later, and cuda-python 12.0 or above.
Running the tests is similar to other tests, with the option to choose which
GPU unit to use in case there are multiple GPUs on the setup.

```
$ build/bin/run_tests.py -v --gpu 0 -k cuda
test_cuda_dmabuf_rdma_write_traffic (tests.test_cuda_dmabuf.DmabufCudaTest)
Runs RDMA Write traffic over CUDA allocated memory using DMA BUF and ... ok
test_mlx_devx_cuda_send_imm_traffic (tests.test_mlx5_cuda_umem.Mlx5GpuDevxRcTrafficTest)
Creates two DevX RC QPs and runs SEND_IMM traffic over CUDA allocated ... ok
----------------------------------------------------------------------
Ran 2 tests in 3.033s

OK
```

## Writing Tests

The following section explains how to add a new test, using tests/test_odp.py
as an example. It's a simple test that runs ping-pong over a few different
traffic types. ODP requires a capability check, so a decorator was added to
tests/utils.py. The first change for ODP execution is when registering a
memory region (we need to set the ON_DEMAND access flag), so we do as follows:

1. Create the players by inheriting from `RCResources` (for RC traffic).
2. In the player, override create_mr() and add the decorator to it. It will
   run before the actual call to ibv_reg_mr, and if the ODP caps are off, the
   test will be skipped.
3. Create the `OdpTestCase` by inheriting from `RDMATestCase`.
4. In the test case, add a method starting with test_, to let the unittest
   infrastructure know that this is a test.
5. In the test method, create the players (which already check the ODP caps)
   and call the traffic() function, providing it the two players.
rdma-core-56.1/Documentation/udev.md000066400000000000000000000173511477342711600174370ustar00rootroot00000000000000# Kernel Module Loading

The RDMA subsystem relies on the kernel, udev and systemd to load modules on
demand when RDMA hardware is present. The RDMA subsystem is unique since it
does not load the optional RDMA hardware modules unless the system has the
rdma-core package installed. This is to avoid enabling RDMA on systems that do
not use it, for instance if a system has a multi-protocol ethernet adapter but
is only using the net stack interface.

## Boot ordering with systemd

systemd assumes everything is hot pluggable and runs in an event driven
manner.
This creates a chain of hot plug events as each part of the system autoloads based on earlier parts. The first step in the process is udev loading the physical hardware driver. This can happen in several spots along the bootup: - From the initrd or built into the kernel. If hardware modules are present in the initrd then they are loaded into the kernel before booting the system. This is done largely synchronously with the boot process. - From udev when it auto detects PCI hardware or otherwise. This happens asynchronously in the boot process, systemd does not wait for udev to finish loading modules before it continues on. This path makes it very likely the system will experience an RDMA 'hot plug' scenario. - From systemd's fixed module loader systemd-modules-load.service, e.g. from the list in /etc/modules-load.d/. In this case the modules load happens synchronously within systemd and it will hold off sysinit.target until modules are loaded Once the hardware module is loaded it may be necessary to load a protocol module, e.g. to enable RDMA support on an ethernet device. This is triggered automatically by udev rules that match the master devices and load the protocol module with udev's module loader. This happens asynchronously to the rest of the systemd startup. Once an RDMA device is created by the kernel then udev will cause systemd to schedule ULP module loading services (e.g. rdma-load-modules@.service) specific to the plugged hardware. If sysinit.target has not yet been passed then these loaders will defer sysinit.target until they complete, otherwise this is a hot plug event and things will load asynchronously to the boot up process. Finally udev will cause systemd to start RDMA specific daemons like srp_daemon, rdma-ndd and iwpmd. These starts are linked to the detection of the first RDMA hardware, and the daemons internally handle hot plug events for other hardware. ## Hot Plug compatible services Services using RDMA need to have device specific systemd dependencies in their unit files, either created by hand by the admin or by using udev rules. For instance, a service that uses /dev/infiniband/umad0 requires: ``` After=dev-infiniband-umad0.device BindsTo=dev-infiniband-umad0.device ``` Which will ensure the service will not run until the required umad device appears, and will be stopped if the umad device is unplugged. This is similar to how systemd handles mounting filesystems and configuring ethernet devices. ## Interaction with legacy non-hotplug services Services that cannot handle hot plug must be ordered after systemd-udev-settle.service, which will wait for udev to complete loading modules and scheduling systemd services. This ensures that all RDMA hardware present at boot is setup before proceeding to run the legacy service. Admins using legacy services can also place their RDMA hardware modules (e.g. mlx4_ib) directly in /etc/modules-load.d/ or in their initrd which will cause systemd to defer passing to sysinit.target until all RDMA hardware is setup, this is usually sufficient for legacy services. This is probably the default behavior in many configurations. # Systemd Ordering Within rdma-core we have a series of units which run in the pre `basic.target` world to setup kernel services: - `iwpmd` - `rdma-ndd` - `rdma-load-modules@.service` - `ibacmd.socket` These special units use DefaultDependencies=no and order before any other unit that uses DefaultDependencies=yes. This will happen even in the case of hotplug. 
Units for normal rdma-using daemons should use DefaultDependencies=yes, and
either this pattern for 'any RDMA device':

```
[Unit]
# Order after rdma-hw.target has become active and setup the kernel services
Requires=rdma-hw.target
After=rdma-hw.target

[Install]
# Autostart when RDMA hardware is present
WantedBy=rdma-hw.target
```

Or this pattern for a specific RDMA device:

```
[Unit]
# Order after RDMA services are setup
After=rdma-hw.target
# Run only while a specific umad device is present
After=dev-infiniband-umad0.device
BindsTo=dev-infiniband-umad0.device

[Install]
# Schedule the unit to be runnable when RDMA hardware is present, but
# it will only start once the requested device actually appears.
WantedBy=rdma-hw.target
```

Note, the above does explicitly reference `After=rdma-hw.target` even though
all the current constituents of that target order before `sysinit.target`.
This is to provide greater flexibility in the future.

## rdma-hw.target

This target is Wanted automatically by udev as soon as any RDMA hardware is
plugged in or becomes available at boot. This may be used to pull in rdma
management daemons dynamically when RDMA hardware is found. Such daemons
should use:

```
[Install]
WantedBy=rdma-hw.target
```

In their unit files. `rdma-hw.target` is also a synchronization point that
orders after the low level, pre `sysinit.target` RDMA related units have been
started.

# Stable names

The library provides a general utility and udev rule to automatically perform
stable IB device name assignments, so users will always see names based on
topology/GUID information. Such a naming scheme has the big advantage that the
names are fully automatic and fully predictable, that they stay fixed even if
hardware is added or removed (i.e. no re-enumeration takes place), and that
broken hardware can be replaced seamlessly.

The name is a combination of the link type (InfiniBand, RoCE, iWARP, OPA or
USNIC) and the chosen naming policy: NAME_KERNEL, NAME_PCI, NAME_GUID,
NAME_ONBOARD or NAME_FALLBACK. These naming policies are controlled by a udev
rule and can be overridden by placing your own rename-policy udev rules into
the /etc/udev/rules.d/ directory.

* NAME_KERNEL - don't change names and rely on kernel assignment. This will
  keep RDMA names as before. Example: "mlx5_0".
* NAME_PCI - read PCI location and topology as a source for stable names,
  which won't change in any software event (reset, PCI probe, etc.).
  Example: "ibp0s12f4".
* NAME_GUID - read node GUID information, in a similar manner to the netdev
  MAC naming policy. Example: "rocex525400c0fe123455".
* NAME_ONBOARD - read Firmware/BIOS provided index numbers for on-board
  devices. Example: "ibo3".
* NAME_FALLBACK - automatic fallback: NAME_ONBOARD->NAME_PCI->NAME_KERNEL

No doubt the new names are harder to read than the "mlx5_0" everybody is used
to, but being consistent in scripts is much more important.

There is a distinction between real devices and virtual ones like RXE or SIW.
For real devices, the naming policy is NAME_FALLBACK, while virtual devices
keep their kernel name. In a similar way to netdev, the NAME_GUID scheme does
not participate in the fallback mechanism and needs to be enabled explicitly
by the users.

Types of names:

* o<index> - on-board device index number
* s<slot>[f<function>] - hotplug slot index number
* x<GUID> - Node GUID
* [P<domain>]p<bus>s<slot>[f<function>] - PCI geographical location

Notes:

* All multi-function PCI devices will carry the [f<function>] number in the
  device name, including the function 0 device.
* When using PCI geography, the PCI domain is only prepended when it is not 0.
* SR-IOV virtual devices are named based on the name of the parent interface,
  with a suffix of "v<N>", where <N> is the virtual device number.
rdma-core-56.1/Documentation/versioning.md000066400000000000000000000133241477342711600206530ustar00rootroot00000000000000# Overall Package Version

This version number is set in the top level CMakeLists.txt:

```sh
set(PACKAGE_VERSION "11")
```

For upstream releases this is a single integer showing the release ordering.
We do not attempt to encode any 'ABI' information in this version. Branched
stable releases can append an additional counter, e.g. `11.2`.

Unofficial releases should include a distributor tag, e.g. '11.vendor2'.

When the PACKAGE_VERSION is changed, the packaging files should be updated:

```diff
diff --git a/CMakeLists.txt b/CMakeLists.txt
index a2464ec5..cf237904 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -44,7 +44,7 @@ endif()
 set(PACKAGE_NAME "RDMA")

 # See Documentation/versioning.md
-set(PACKAGE_VERSION "15")
+set(PACKAGE_VERSION "16")
 # When this is changed the values in these files need changing too:
 # debian/libibverbs1.symbols
 # libibverbs/libibverbs.map
diff --git a/redhat/rdma-core.spec b/redhat/rdma-core.spec
index cc0c3ba0..62334730 100644
--- a/redhat/rdma-core.spec
+++ b/redhat/rdma-core.spec
@@ -1,5 +1,5 @@
 Name: rdma-core
-Version: 15
+Version: 16
 Release: 1%{?dist}
 Summary: RDMA core userspace libraries and daemons
diff --git a/suse/rdma-core.spec b/suse/rdma-core.spec
index 76ca7286..a19f9e01 100644
--- a/suse/rdma-core.spec
+++ b/suse/rdma-core.spec
@@ -19,7 +19,7 @@
 %bcond_without systemd
 %define git_ver %{nil}
 Name:           rdma-core
-Version:        15
+Version:        16
 Release:        0
 Summary:        RDMA core userspace libraries and daemons
 License:        GPL-2.0 or BSD-2-Clause
```

# Shared Library Versions

The shared libraries use the typical semantic versioning scheme, e.g.
*libibumad* has a version like `3.1.11`. The version number is broken up into
three fields:

- '3' is called the SONAME and is embedded into the ELF:

```sh
$ readelf -ds build/lib/libibumad.so.3.1.11
0x000000000000000e (SONAME) Library soname: [libibumad.so.3]
```

  We do not expect this value to ever change for our libraries. It indicates
  the overall ABI; changing it means the library will not dynamically link to
  old programs anymore.

- '1' is called the ABI level and is used within the ELF as the last component
  symbol version tag. This version must be changed every time a new symbol is
  introduced. It allows the user to see what version of the ABI the library
  provides.

- '11' is the overall release number and is copied from `PACKAGE_VERSION`.
  This version increases with every package release, even if the library code
  did not change. It allows the user to see what upstream source was used to
  build the library.

This version is encoded into the filename `build/lib/libibumad.so.3.1.11` and
a symlink from `libibumad.so.3` to `build/lib/libibumad.so.3.1.11` is created.

## Shared Library Symbol Versions

Symbol versions are a linker technique that lets the library author provide
two symbols with different ABIs that have the same API name. The linker
differentiates the two cases internally. This allows the library author to
change the ABI that the API uses. This project typically does not make use of
this feature.

As a secondary feature, the symbol version is also used by package managers
like RPM to manage the ABI level. To make this work properly the ABI level
must be correctly encoded into the symbol version.

## Adding a new symbol

First, increase the ABI level of the library.
It is safe to re-use the ABI level for multiple new functions within a single release, but once a release is tagged the ABI level becomes *immutable*. The maintainer can provide guidance on what ABI level to use for each series. ```diff rdma_library(ibumad libibumad.map # See Documentation/versioning.md - 3 3.1.${PACKAGE_VERSION} + 3 3.2.${PACKAGE_VERSION} ``` Next, add your new symbol to the symbol version file: ```diff + IBUMAD_3.2 { + global: + umad_new_symbol; + } IBUMAD_1.0; ``` NOTE: Once a release is made the stanzas in the map file are *immutable* and cannot be changed. Do not add your new symbol to old stanzas. The new symbol should appear in the ELF: ```sh $ readelf -s build/lib/libibumad.so.3.1.11 35: 00000000000031e0 450 FUNC GLOBAL DEFAULT 12 umad_new_symbol@@IBUMAD_3.2 ``` Finally update the `debian/libibumad3.symbols` file. ## Private symbols in libibverbs Many symbols in libibverbs are private to rdma-core, they are being marked in the map file using the IBVERBS_PRIVATE_ prefix. For simplicity, there is only one version of the private symbol version stanza, and it is bumped whenever any change (add/remove/modify) to any of the private ABI is done. This makes it very clear if an incompatible provider is being used with libibverbs. Due to this there is no reason to provide compat symbol versions for the private ABI. When the private symbol version is bumped, the packaging files should be updated: ```diff diff --git a/debian/control b/debian/control index 642a715e..8def05c9 100644 --- a/debian/control +++ b/debian/control @@ -138,7 +138,7 @@ Section: libs Pre-Depends: ${misc:Pre-Depends} Depends: adduser, ${misc:Depends}, ${shlibs:Depends} Recommends: ibverbs-providers -Breaks: ibverbs-providers (<< 16~) +Breaks: ibverbs-providers (<< 17~) Description: Library for direct userspace use of RDMA (InfiniBand/iWARP) libibverbs is a library that allows userspace processes to use RDMA "verbs" as described in the InfiniBand Architecture Specification and ``` ### Use of private symbols between component packages A distribution packaging system still must have the correct dependencies between libraries within rdma-core that may use these private symbols. For this reason the private symbols can only be used by provider libraries and the distribution must ensure that a matched set of provider libraries and libibverbs are installed. rdma-core-56.1/MAINTAINERS000066400000000000000000000127521477342711600150360ustar00rootroot00000000000000 List of maintainers Generally patches should be submitted to the main development mailing list: linux-rdma@vger.kernel.org Descriptions of section entries: F: Files and directories with wildcard patterns. A trailing slash includes all files and subdirectory files. F: providers/mlx4/ all files in and below providers/mlx4/ F: providers/* all files in providers, but not below F: */net/* all files in "any top level directory"/net One pattern per line. Multiple F: lines acceptable. H: Historical authors L: Mailing list that is relevant to this area M: Designated reviewer: FullName These reviewers should be CCed on patches. S: Status, one of the following: Supported: Someone is actually paid to look after this. Maintained: Someone actually looks after it. Odd Fixes: It has a maintainer but they don't have time to do much other than throw the odd patch in. See below.. Orphan: No current maintainer [but maybe you could take the role as you write your new code]. Obsolete: Old code. 
Something tagged obsolete generally means it has been replaced by a better system and you should be using that. ----------------------------------- * OVERALL PACKAGE M: Leon Romanovsky M: Jason Gunthorpe H: Doug Ledford S: Supported BUILD SYSTEM M: Jason Gunthorpe S: Supported F: */CMakeLists.txt F: */lib*.map F: buildlib/ DEBIAN PACKAGING M: Benjamin Drung S: Supported F: debian/ BNXT_RE USERSPACE PROVIDER (for bnxt_re.ko) M: Selvin Xavier S: Supported F: providers/bnxt_re/ CXGB4 USERSPACE PROVIDER (for iw_cxgb4.ko) M: Steve Wise S: Supported F: providers/cxgb4/ EFA USERSPACE PROVIDER (for efa.ko) M: Michael Margolin S: Supported F: providers/efa/ ERDMA USERSPACE PROVIDER (for erdma.ko) M: Cheng Xu S: Supported F: providers/erdma/ HF1 USERSPACE PROVIDER (for hf1.ko) M: Dennis Dalessandro S: Supported F: providers/hfi1verbs/ HNS USERSPACE PROVIDER (for hns-roce-hw-v2.ko) M: Junxian Huang M: Chengchang Tang S: Supported F: providers/hns/ IRDMA USERSPACE PROVIDER (for i40iw.ko and irdma.ko) M: Sindhu Devale M: Tatyana Nikolova S: Supported F: providers/irdma/ RDMA Communication Manager Assistant (for librdmacm.so) M: Haakon Bugge M: Mark Haywood S: Supported F: ibacm/* IPATH/QIB USERSPACE PROVIDER (for ib_qib.ko) M: Dennis Dalessandro S: Supported F: providers/ipathverbs/ IWARP PORT MAPPER DAEMON (for iwarp kernel providers) M: Tatyana Nikolova M: Steve Wise H: Robert Sharp S: Supported F: iwpmd/ LIBIBUMAD USERSPACE LIBRARY FOR SMP AND GMP MAD PROCESSING (/dev/infiniband/umadX) M: Daniel Klein H: Hal Rosenstock H: Sasha Khapyorsky H: Shahar Frank S: Supported F: libibumad/ LIBIBVERBS USERSPACE LIBRARY FOR RDMA VERBS (/dev/infiniband/uverbsX) M: Yishai Hadas H: Michael S. Tsirkin H: Doug Ledford H: Sean Hefty H: Dotan Barak H: Roland Dreier S: Supported F: libibverbs/ LIBRDMACM USERSPACE LIBRARY FOR RDMA CONNECTION MANAGEMENT (/dev/infiniband/rdma_cm) M: Sean Hefty S: Supported F: librdmacm/ MANA USERSPACE PROVIDER (for mana_ib.ko) M: Long Li S: Supported F: providers/mana/ MLX4 USERSPACE PROVIDER (for mlx4_ib.ko) M: Yishai Hadas H: Roland Dreier S: Supported F: providers/mlx4/ MLX5 USERSPACE PROVIDER (for mlx5_ib.ko) M: Yishai Hadas H: Eli Cohen S: Supported F: providers/mlx5/ MTHCA USERSPACE PROVIDER (for ib_mthca.ko) M: Vladimir Sokolovsky H: Michael S. Tsirkin H: Roland Dreier S: Supported F: providers/mthca/ OCRDMA USERSPACE PROVIDER (for ocrdma.ko) M: Selvin Xavier S: Supported F: providers/ocrdma/ QEDR USERSPACE PROVIDER (for qedr.ko) M: Michal Kalderon M: Ariel Elior S: Supported F: providers/qedr/ RXE SOFT ROCEE USERSPACE PROVIDER (for rdma_rxe.ko) M: Moni Shoua S: Supported F: providers/rxe/ SIW SOFT IWARP USERSPACE PROVIDER (for siw.ko) M: Bernard Metzler S: Supported F: providers/siw/ SRP DAEMON (for ib_srp.ko) M: Bart Van Assche S: Supported F: srp_daemon/ SUSE PACKAGING M: Nicolas Morey-Chaisemartin S: Supported F: suse/ VMWARE PVRDMA USERSPACE PROVIDER (for vmw_pvrdma.ko) M: Adit Ranadive L: pv-drivers@vmware.com S: Supported F: providers/vmw_pvrdma/ PYVERBS M: Edward Srouji S: Supported F: pyverbs/ rdma-core-56.1/README.md000066400000000000000000000074041477342711600146160ustar00rootroot00000000000000[![Build Status](https://dev.azure.com/ucfconsort/rdma-core/_apis/build/status/linux-rdma.rdma-core?branchName=master)](https://dev.azure.com/ucfconsort/rdma-core/_build/latest?definitionId=2&branchName=master) # RDMA Core Userspace Libraries and Daemons This is the userspace components for the Linux Kernel's drivers/infiniband subsystem. 
Specifically this contains the userspace libraries for the following device
nodes:

 - /dev/infiniband/uverbsX (libibverbs)
 - /dev/infiniband/rdma_cm (librdmacm)
 - /dev/infiniband/umadX (libibumad)

The userspace component of the libibverbs RDMA kernel drivers are included
under the providers/ directory. Support for the following Kernel RDMA drivers
is included:

 - bnxt_re.ko
 - efa.ko
 - erdma.ko
 - iw_cxgb4.ko
 - hfi1.ko
 - hns-roce-hw-v2.ko
 - irdma.ko
 - ib_qib.ko
 - mana_ib.ko
 - mlx4_ib.ko
 - mlx5_ib.ko
 - ib_mthca.ko
 - ocrdma.ko
 - qedr.ko
 - rdma_rxe.ko
 - siw.ko
 - vmw_pvrdma.ko

Additional service daemons are provided for:

 - srp_daemon (ib_srp.ko)
 - iwpmd (for iwarp kernel providers)
 - ibacm (for InfiniBand communication management assistant)

# Building

This project uses a cmake based build system. Quick start:

```sh
$ bash build.sh
```

*build/bin* will contain the sample programs and *build/lib* will contain the
shared libraries. The build is configured to run all the programs 'in-place'
and cannot be installed.

### Debian Derived

```sh
$ apt-get install build-essential cmake gcc libudev-dev libnl-3-dev libnl-route-3-dev ninja-build pkg-config valgrind python3-dev cython3 python3-docutils pandoc
```

Supported releases:

* Debian 9 (stretch) or newer
* Ubuntu 16.04 LTS (xenial) or newer

### Fedora, CentOS 8

```sh
$ dnf builddep redhat/rdma-core.spec
```

NOTE: Fedora Core uses the name 'ninja-build' for the 'ninja' command.

### openSUSE

```sh
$ zypper install cmake gcc libnl3-devel libudev-devel ninja pkg-config valgrind-devel python3-devel python3-Cython python3-docutils pandoc
```

## Building on CentOS 7, Amazon Linux 2

Install required packages:

```sh
$ yum install cmake gcc libnl3-devel libudev-devel make pkgconfig valgrind-devel
```

Developers on CentOS 7 or Amazon Linux 2 are suggested to install more modern
tooling for the best experience.

CentOS 7:

```sh
$ yum install epel-release
$ yum install cmake3 ninja-build pandoc
```

Amazon Linux 2:

```sh
$ amazon-linux-extras install epel
$ yum install cmake3 ninja-build pandoc
```

NOTE: EPEL uses the name 'ninja-build' for the 'ninja' command, and 'cmake3'
for the 'cmake' command.

# Usage

To set up software RDMA on an existing interface with either of the available
drivers, use the following commands, substituting `<DRIVER>` with the name of
the driver of your choice (`rdma_rxe` or `siw`) and `<TYPE>` with the type
corresponding to the driver (`rxe` or `siw`).

```
# modprobe <DRIVER>
# rdma link add <NAME> type <TYPE> netdev <DEVICE>
```

Please note that an `iproute2` version recent enough is required for the
command above to work.

You can use either `ibv_devices` or `rdma link` to verify that the device was
successfully added.

# Reporting bugs

Bugs should be reported to the linux-rdma@vger.kernel.org mailing list.

In your bug report, please include:

 * Information about your system:
   - Linux distribution and version
   - Linux kernel and version
   - InfiniBand hardware and firmware version
   - ... any other relevant information

 * How to reproduce the bug.

 * If the bug is a crash, the exact output printed out when the crash
   occurred, including any kernel messages produced.

# Submitting patches

See [Contributing to rdma-core](Documentation/contributing.md).
# Stable branches Stable versions are released regularly with backported fixes (see Documentation/stable.md) The current minimum version still maintained is 'v33.X' rdma-core-56.1/build.sh000077500000000000000000000010451477342711600147700ustar00rootroot00000000000000#!/bin/bash set -e SRCDIR=`dirname $0` BUILDDIR="$SRCDIR/build" mkdir -p "$BUILDDIR" if hash cmake3 2>/dev/null; then # CentOS users are encouraged to install cmake3 from EPEL CMAKE=cmake3 else CMAKE=cmake fi if hash ninja-build 2>/dev/null; then # Fedora uses this name NINJA=ninja-build elif hash ninja 2>/dev/null; then NINJA=ninja fi cd "$BUILDDIR" if [ "x$NINJA" == "x" ]; then $CMAKE -DIN_PLACE=1 ${EXTRA_CMAKE_FLAGS:-} .. make else $CMAKE -DIN_PLACE=1 -GNinja ${EXTRA_CMAKE_FLAGS:-} .. $NINJA fi rdma-core-56.1/buildlib/000077500000000000000000000000001477342711600151205ustar00rootroot00000000000000rdma-core-56.1/buildlib/FindLDSymVer.cmake000066400000000000000000000043001477342711600203650ustar00rootroot00000000000000# COPYRIGHT (c) 2016 Obsidian Research Corporation. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. # find_package helper to detect symbol version support in the compiler and # linker. If supported then LDSYMVER_MODE will be set to GNU # Basic sample GNU style map file file(WRITE "${CMAKE_CURRENT_BINARY_DIR}/test.map" " IBVERBS_1.0 { global: ibv_get_device_list; local: *; }; IBVERBS_1.1 { global: ibv_get_device_list; } IBVERBS_1.0; ") # See RDMA_CHECK_C_LINKER_FLAG set(SAFE_CMAKE_REQUIRED_LIBRARIES "${CMAKE_REQUIRED_LIBRARIES}") set(SAFE_CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS}") if (POLICY CMP0056) set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--version-script=${CMAKE_CURRENT_BINARY_DIR}/test.map") else() set(CMAKE_REQUIRED_LIBRARIES "${CMAKE_REQUIRED_LIBRARIES} -Wl,--version-script=${CMAKE_CURRENT_BINARY_DIR}/test.map") endif() # And matching source, this also checks that .symver asm works if (HAVE_FUNC_ATTRIBUTE_SYMVER) check_c_source_compiles(" void ibv_get_device_list_1(void); __attribute((__symver__(\"ibv_get_device_list@IBVERBS_1.1\"))) void ibv_get_device_list_1(void){} void ibv_get_device_list_0(void); __attribute((__symver__(\"ibv_get_device_list@IBVERBS_1.0\"))) void ibv_get_device_list_0(void){} int main(int argc,const char *argv[]){return 0;}" _LDSYMVER_SUCCESS) else() check_c_source_compiles(" void ibv_get_device_list_1(void); void ibv_get_device_list_1(void){} asm(\".symver ibv_get_device_list_1, ibv_get_device_list@IBVERBS_1.1\"); void ibv_get_device_list_0(void); void ibv_get_device_list_0(void){} asm(\".symver ibv_get_device_list_0, ibv_get_device_list@@IBVERBS_1.0\"); int main(int argc,const char *argv[]){return 0;}" _LDSYMVER_SUCCESS) endif() file(REMOVE "${CMAKE_CURRENT_BINARY_DIR}/test.map") set(CMAKE_EXE_LINKER_FLAGS "${SAFE_CMAKE_EXE_LINKER_FLAGS}") set(CMAKE_REQUIRED_LIBRARIES "${SAFE_CMAKE_REQUIRED_LIBRARIES}") if (_LDSYMVER_SUCCESS) set(LDSYMVER_MODE "GNU" CACHE INTERNAL "How to set symbol versions on shared libraries") endif() include(FindPackageHandleStandardArgs) find_package_handle_standard_args( LDSymVer REQUIRED_VARS LDSYMVER_MODE ) rdma-core-56.1/buildlib/FindLTTngUST.cmake000066400000000000000000000103701477342711600203100ustar00rootroot00000000000000#.rst: # FindLTTngUST # ------------ # # This module finds the `LTTng-UST `__ library. 
# # Imported target # ^^^^^^^^^^^^^^^ # # This module defines the following :prop_tgt:`IMPORTED` target: # # ``LTTng::UST`` # The LTTng-UST library, if found # # Result variables # ^^^^^^^^^^^^^^^^ # # This module sets the following # # ``LTTNGUST_FOUND`` # ``TRUE`` if system has LTTng-UST # ``LTTNGUST_INCLUDE_DIRS`` # The LTTng-UST include directories # ``LTTNGUST_LIBRARIES`` # The libraries needed to use LTTng-UST # ``LTTNGUST_VERSION_STRING`` # The LTTng-UST version # ``LTTNGUST_HAS_TRACEF`` # ``TRUE`` if the ``tracef()`` API is available in the system's LTTng-UST # ``LTTNGUST_HAS_TRACELOG`` # ``TRUE`` if the ``tracelog()`` API is available in the system's LTTng-UST #============================================================================= # Copyright 2016 Kitware, Inc. # Copyright 2016 Philippe Proulx # # Distributed under the OSI-approved BSD License (the "License"); # see accompanying file Copyright.txt for details. # # This software is distributed WITHOUT ANY WARRANTY; without even the # implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. # See the License for more information. #============================================================================= # (To distribute this file outside of CMake, substitute the full # License text for the above reference.) find_path(LTTNGUST_INCLUDE_DIRS NAMES lttng/tracepoint.h) # Must also check for the path of generated header files since out-of-tree # build is a possibility (Yocto). find_path(LTTNGUST_INCLUDE_DIRS_GENERATED NAMES lttng/ust-config.h) find_library(LTTNGUST_LIBRARIES NAMES lttng-ust) if(LTTNGUST_INCLUDE_DIRS AND LTTNGUST_INCLUDE_DIRS_GENERATED AND LTTNGUST_LIBRARIES) # find tracef() and tracelog() support set(LTTNGUST_HAS_TRACEF 0) set(LTTNGUST_HAS_TRACELOG 0) if(EXISTS "${LTTNGUST_INCLUDE_DIRS}/lttng/tracef.h") set(LTTNGUST_HAS_TRACEF TRUE) endif() if(EXISTS "${LTTNGUST_INCLUDE_DIRS}/lttng/tracelog.h") set(LTTNGUST_HAS_TRACELOG TRUE) endif() # get version set(lttngust_version_file "${LTTNGUST_INCLUDE_DIRS_GENERATED}/lttng/ust-version.h") if(EXISTS "${lttngust_version_file}") file(STRINGS "${lttngust_version_file}" lttngust_version_major_string REGEX "^[\t ]*#define[\t ]+LTTNG_UST_MAJOR_VERSION[\t ]+[0-9]+[\t ]*$") file(STRINGS "${lttngust_version_file}" lttngust_version_minor_string REGEX "^[\t ]*#define[\t ]+LTTNG_UST_MINOR_VERSION[\t ]+[0-9]+[\t ]*$") file(STRINGS "${lttngust_version_file}" lttngust_version_patch_string REGEX "^[\t ]*#define[\t ]+LTTNG_UST_PATCHLEVEL_VERSION[\t ]+[0-9]+[\t ]*$") string(REGEX REPLACE ".*([0-9]+).*" "\\1" lttngust_v_major "${lttngust_version_major_string}") string(REGEX REPLACE ".*([0-9]+).*" "\\1" lttngust_v_minor "${lttngust_version_minor_string}") string(REGEX REPLACE ".*([0-9]+).*" "\\1" lttngust_v_patch "${lttngust_version_patch_string}") set(LTTNGUST_VERSION_STRING "${lttngust_v_major}.${lttngust_v_minor}.${lttngust_v_patch}") unset(lttngust_version_major_string) unset(lttngust_version_minor_string) unset(lttngust_version_patch_string) unset(lttngust_v_major) unset(lttngust_v_minor) unset(lttngust_v_patch) else() message(FATAL_ERROR "Missing version header") endif() unset(lttngust_version_file) if(NOT TARGET LTTng::UST) add_library(LTTng::UST UNKNOWN IMPORTED) set_target_properties(LTTng::UST PROPERTIES INTERFACE_INCLUDE_DIRECTORIES "${LTTNGUST_INCLUDE_DIRS};${LTTNGUST_INCLUDE_DIRS_GENERATED}" INTERFACE_LINK_LIBRARIES ${CMAKE_DL_LIBS} IMPORTED_LINK_INTERFACE_LANGUAGES "C" IMPORTED_LOCATION "${LTTNGUST_LIBRARIES}") endif() # add libdl to required libraries 
set(LTTNGUST_LIBRARIES ${LTTNGUST_LIBRARIES} ${CMAKE_DL_LIBS}) endif() # handle the QUIETLY and REQUIRED arguments and set LTTNGUST_FOUND to # TRUE if all listed variables are TRUE include(FindPackageHandleStandardArgs) find_package_handle_standard_args(LTTngUST FOUND_VAR LTTNGUST_FOUND REQUIRED_VARS LTTNGUST_LIBRARIES LTTNGUST_INCLUDE_DIRS VERSION_VAR LTTNGUST_VERSION_STRING) mark_as_advanced(LTTNGUST_LIBRARIES LTTNGUST_INCLUDE_DIRS) rdma-core-56.1/buildlib/FindSystemd.cmake000066400000000000000000000023021477342711600203500ustar00rootroot00000000000000# COPYRIGHT (c) 2015 Obsidian Research Corporation. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. find_path(LIBSYSTEMD_INCLUDE_DIRS "systemd/sd-journal.h") if (LIBSYSTEMD_INCLUDE_DIRS) set(SYSTEMD_INCLUDE_DIRS ${LIBSYSTEMD_INCLUDE_DIRS}) find_library(LIBSYSTEMD_LIBRARY NAMES systemd libsystemd) # Older systemd uses a split library if (NOT LIBSYSTEMD_LIBRARY) find_library(LIBSYSTEMD_JOURNAL_LIBRARY NAMES systemd-journal libsystemd-journal) find_library(LIBSYSTEMD_ID128_LIBRARY NAMES systemd-id128 libsystemd-id128) find_library(LIBSYSTEMD_DAEMON_LIBRARY NAMES systemd-daemon libsystemd-daemon) if (LIBSYSTEMD_JOURNAL_LIBRARY AND LIBSYSTEMD_ID128_LIBRARY AND LIBSYSTEMD_DAEMON_LIBRARY) set(SYSTEMD_LIBRARIES ${LIBSYSTEMD_JOURNAL_LIBRARY} ${LIBSYSTEMD_ID128_LIBRARY} ${LIBSYSTEMD_DAEMON_LIBRARY}) endif() else() set(SYSTEMD_LIBRARIES ${LIBSYSTEMD_LIBRARY}) endif() set(SYSTEMD_INCLUDE_DIRS) endif() include(FindPackageHandleStandardArgs) find_package_handle_standard_args(Systemd REQUIRED_VARS SYSTEMD_LIBRARIES LIBSYSTEMD_INCLUDE_DIRS) mark_as_advanced(LIBSYSTEMD_LIBRARY LIBSYSTEMD_JOURNAL_LIBRARY LIBSYSTEMD_ID128_LIBRARY LIBSYSTEMD_DAEMON_LIBRARY) rdma-core-56.1/buildlib/FindUDev.cmake000066400000000000000000000005311477342711600175650ustar00rootroot00000000000000# COPYRIGHT (c) 2016 Obsidian Research Corporation. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. find_library(LIBUDEV_LIBRARY NAMES udev libudev) set(UDEV_LIBRARIES ${LIBUDEV_LIBRARY}) include(FindPackageHandleStandardArgs) find_package_handle_standard_args(UDev REQUIRED_VARS LIBUDEV_LIBRARY) mark_as_advanced(LIBUDEV_LIBRARY) rdma-core-56.1/buildlib/Findcython.cmake000066400000000000000000000026541477342711600202360ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2018, Mellanox Technologies. All rights reserved. See COPYING file execute_process(COMMAND "${PYTHON_EXECUTABLE}" -c "from Cython.Compiler.Main import main; import Cython; print(Cython.__version__);" OUTPUT_VARIABLE _VERSION RESULT_VARIABLE _VERSION_RESULT ERROR_QUIET) if(NOT _VERSION_RESULT) # We make our own cython script because it is very hard to figure out which # cython exectuable wrapper is appropriately matched to the python # interpreter we want to use. Cython must use the matching version of python # or things will go wrong. string(STRIP "${_VERSION}" CYTHON_VERSION_STRING) set(CYTHON_EXECUTABLE "${BUILD_PYTHON}/cython") file(WRITE "${CYTHON_EXECUTABLE}" "#!${PYTHON_EXECUTABLE} from Cython.Compiler.Main import main main(command_line = 1)") execute_process(COMMAND "chmod" "a+x" "${CYTHON_EXECUTABLE}") # Dockers with older Cython versions fail to build pyverbs. Until we get to # the bottom of this, disable pyverbs for older Cython versions. 
if (CYTHON_VERSION_STRING VERSION_LESS "0.25") message("Cython version < 0.25, disabling") unset(CYTHON_EXECUTABLE) endif() endif() unset(_VERSION_RESULT) unset(_VERSION) include(FindPackageHandleStandardArgs) find_package_handle_standard_args(cython REQUIRED_VARS CYTHON_EXECUTABLE CYTHON_VERSION_STRING VERSION_VAR CYTHON_VERSION_STRING) mark_as_advanced(CYTHON_EXECUTABLE) rdma-core-56.1/buildlib/Findpandoc.cmake000066400000000000000000000012651477342711600201730ustar00rootroot00000000000000# COPYRIGHT (c) 2017 Mellanox Technologies Ltd # Licensed under BSD (MIT variant) or GPLv2. See COPYING. find_program(PANDOC_EXECUTABLE NAMES pandoc) if(PANDOC_EXECUTABLE) execute_process(COMMAND "${PANDOC_EXECUTABLE}" -v OUTPUT_VARIABLE _VERSION RESULT_VARIABLE _VERSION_RESULT ERROR_QUIET) if(NOT _VERSION_RESULT) string(REGEX REPLACE "^pandoc ([^\n]+)\n.*" "\\1" PANDOC_VERSION_STRING "${_VERSION}") endif() unset(_VERSION_RESULT) unset(_VERSION) endif() include(FindPackageHandleStandardArgs) find_package_handle_standard_args(pandoc REQUIRED_VARS PANDOC_EXECUTABLE PANDOC_VERSION_STRING VERSION_VAR PANDOC_VERSION_STRING) mark_as_advanced(PANDOC_EXECUTABLE) rdma-core-56.1/buildlib/Findrst2man.cmake000066400000000000000000000013221477342711600203070ustar00rootroot00000000000000# COPYRIGHT (c) 2019 Mellanox Technologies Ltd # Licensed under BSD (MIT variant) or GPLv2. See COPYING. find_program(RST2MAN_EXECUTABLE NAMES rst2man) if(RST2MAN_EXECUTABLE) execute_process(COMMAND "${RST2MAN_EXECUTABLE}" --version OUTPUT_VARIABLE _VERSION RESULT_VARIABLE _VERSION_RESULT ERROR_QUIET) if(NOT _VERSION_RESULT) string(REGEX REPLACE "^rst2man \\(Docutils ([^,]+), .*" "\\1" RST2MAN_VERSION_STRING "${_VERSION}") endif() unset(_VERSION_RESULT) unset(_VERSION) endif() include(FindPackageHandleStandardArgs) find_package_handle_standard_args(rst2man REQUIRED_VARS RST2MAN_EXECUTABLE RST2MAN_VERSION_STRING VERSION_VAR RST2MAN_VERSION_STRING) mark_as_advanced(RST2MAN_EXECUTABLE) rdma-core-56.1/buildlib/RDMA_BuildType.cmake000066400000000000000000000032571477342711600206350ustar00rootroot00000000000000# COPYRIGHT (c) 2015 Obsidian Research Corporation. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. function(RDMA_BuildType) set(build_types Debug Release RelWithDebInfo MinSizeRel) # Set the default build type to RelWithDebInfo. Since RDMA is typically used # in performance contexts it doesn't make much sense to have the default build # turn off the optimizer. 
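  # A user can still pick any of the types above explicitly, e.g. (sample
  # command lines, not executed by this function):
  #   cmake -DCMAKE_BUILD_TYPE=Debug ..      # optimizer off, VERBS_DEBUG on
  #   cmake -DCMAKE_BUILD_TYPE=Release ..    # -O2 plus -DNDEBUG, for packagers
  # The cache entry below only takes effect when no type was chosen.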
if(NOT CMAKE_BUILD_TYPE) set(CMAKE_BUILD_TYPE RelWithDebInfo CACHE STRING "Options are ${build_types}" FORCE ) set_property(CACHE CMAKE_BUILD_TYPE PROPERTY STRINGS ${build_types}) endif() # Release should be used by packagers, it is the same as the default RelWithDebInfo, # this means it uses -O2 and -DNDEBUG (not -O3) foreach (language CXX C) set(VAR_TO_MODIFY "CMAKE_${language}_FLAGS_RELEASE") if ("${${VAR_TO_MODIFY}}" STREQUAL "${${VAR_TO_MODIFY}_INIT}") set(${VAR_TO_MODIFY} "${CMAKE_${language}_FLAGS_RELWITHDEBINFO_INIT}" CACHE STRING "Default flags for Release configuration" FORCE) endif() endforeach() # RelWithDebInfo should be used by developers, it is the same as Release but # with the -DNDEBUG removed foreach (language CXX C) set(VAR_TO_MODIFY "CMAKE_${language}_FLAGS_RELWITHDEBINFO") if ("${${VAR_TO_MODIFY}}" STREQUAL "${${VAR_TO_MODIFY}_INIT}") string(REGEX REPLACE "(^| )[/-]D *NDEBUG($| )" " " replacement "${${VAR_TO_MODIFY}}" ) set(${VAR_TO_MODIFY} "${replacement}" CACHE STRING "Default flags for RelWithDebInfo configuration" FORCE) endif() endforeach() if (CMAKE_BUILD_TYPE STREQUAL Debug OR CMAKE_BUILD_TYPE STREQUAL RelWithDebInfo) add_definitions("-DVERBS_DEBUG") endif() endfunction() rdma-core-56.1/buildlib/RDMA_DoFixup.cmake000066400000000000000000000027001477342711600203020ustar00rootroot00000000000000# COPYRIGHT (c) 2016 Obsidian Research Corporation. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. # Execute a header fixup based on NOT_NEEDED for HEADER # The buildlib includes alternate header file shims for several scenarios, if # the build system detects a feature is present then it should call RDMA_DoFixup # with the test as true. If false then the shim header will be installed. # Typically the shim header will replace a missing header with stubs, or it # will augment an existing header with include_next. function(RDMA_DoFixup not_needed header) cmake_parse_arguments(ARGS "NO_SHIM" "" "" ${ARGN}) string(REPLACE / - header-bl ${header}) if (NOT EXISTS "${BUILDLIB}/fixup-include/${header-bl}") # NO_SHIM lets cmake succeed if the header exists in the system but no # shim is provided, but this will always fail if the shim is needed but # does not exist. if (NOT ARGS_NO_SHIM OR NOT "${not_needed}") message(FATAL_ERROR "Fixup header ${BUILDLIB}/fixup-include/${header-bl} is not present") endif() endif() set(DEST "${BUILD_INCLUDE}/${header}") if (NOT "${not_needed}") if(CMAKE_VERSION VERSION_LESS "2.8.12") get_filename_component(DIR ${DEST} PATH) else() get_filename_component(DIR ${DEST} DIRECTORY) endif() file(MAKE_DIRECTORY "${DIR}") rdma_create_symlink("${BUILDLIB}/fixup-include/${header-bl}" "${DEST}") else() file(REMOVE ${DEST}) endif() endfunction() rdma-core-56.1/buildlib/RDMA_EnableCStd.cmake000066400000000000000000000075051477342711600207000ustar00rootroot00000000000000# COPYRIGHT (c) 2016 Obsidian Research Corporation. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. 
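# The helpers in this file are meant to be called from the project's
# CMakeLists.txt, along the lines of the following sketch (the flag strings
# and cache-variable names here are hypothetical examples, not the project's
# actual call sites):
#   RDMA_AddOptCFlag(CMAKE_C_FLAGS HAVE_WMISSING_PROTOTYPES "-Wmissing-prototypes")
#   RDMA_AddOptLDFlag(CMAKE_EXE_LINKER_FLAGS HAVE_AS_NEEDED "-Wl,--as-needed")
#   RDMA_EnableCStd()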
# cmake does not have way to do this even slightly sanely until CMP0056 function(RDMA_CHECK_C_LINKER_FLAG FLAG CACHE_VAR) set(SAFE_CMAKE_REQUIRED_LIBRARIES "${CMAKE_REQUIRED_LIBRARIES}") set(SAFE_CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS}") if (POLICY CMP0056) set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${FLAG}") else() set(CMAKE_REQUIRED_LIBRARIES "${CMAKE_REQUIRED_LIBRARIES} ${FLAG}") endif() CHECK_C_COMPILER_FLAG("" ${CACHE_VAR}) set(CMAKE_EXE_LINKER_FLAGS "${SAFE_CMAKE_EXE_LINKER_FLAGS}") set(CMAKE_REQUIRED_LIBRARIES "${SAFE_CMAKE_REQUIRED_LIBRARIES}") endfunction() # Test if the CC compiler supports the linker flag and if so add it to TO_VAR function(RDMA_AddOptLDFlag TO_VAR CACHE_VAR FLAG) RDMA_CHECK_C_LINKER_FLAG("${FLAG}" ${CACHE_VAR}) if (${CACHE_VAR}) SET(${TO_VAR} "${${TO_VAR}} ${FLAG}" PARENT_SCOPE) endif() endfunction() # Test if the CC compiler supports the flag and if so add it to TO_VAR function(RDMA_AddOptCFlag TO_VAR CACHE_VAR FLAG) CHECK_C_COMPILER_FLAG("${FLAG}" ${CACHE_VAR}) if (${CACHE_VAR}) SET(${TO_VAR} "${${TO_VAR}} ${FLAG}" PARENT_SCOPE) endif() endfunction() # Enable the minimum required gnu11 standard in the compiler # This was introduced in GCC 4.7 function(RDMA_EnableCStd) if (HAVE_SPARSE) # Sparse doesn't support gnu11, but doesn't fail if the option is present, # force gnu99 instead. SET(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=gnu99" PARENT_SCOPE) return() endif() if (CMAKE_VERSION VERSION_LESS "3.1") # Check for support of the usual flag CHECK_C_COMPILER_FLAG("-std=gnu11" SUPPORTS_GNU11) if (SUPPORTS_GNU11) SET(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=gnu11" PARENT_SCOPE) else() SET(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=gnu99" PARENT_SCOPE) endif() else() # Newer cmake can do this internally set(CMAKE_C_STANDARD 11 PARENT_SCOPE) endif() endfunction() function(RDMA_Check_C_Compiles TO_VAR CHECK_PROGRAM) set(CMAKE_REQUIRED_FLAGS "${ARGV2} -Werror") CHECK_C_SOURCE_COMPILES("${CHECK_PROGRAM}" ${TO_VAR}) set(${TO_VAR} ${${TO_VAR}} PARENT_SCOPE) endfunction() function(RDMA_Check_Aliasing TO_VAR) SET(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O2") RDMA_Check_C_Compiles(HAVE_WORKING_STRICT_ALIASING " struct in6_addr {unsigned int u6_addr32[4];}; struct iphdr {unsigned int daddr;}; union ibv_gid {unsigned char raw[16];}; static void map_ipv4_addr_to_ipv6(struct in6_addr *ipv6) {ipv6->u6_addr32[0] = 0;} static int set_ah_attr_by_ipv4(struct iphdr *ip4h) { union ibv_gid sgid = {}; map_ipv4_addr_to_ipv6((struct in6_addr *)&sgid); return 0; } int main(int argc, char *argv[]) { struct in6_addr a; struct iphdr h = {}; map_ipv4_addr_to_ipv6(&a); return set_ah_attr_by_ipv4(&h); }" ) set(${TO_VAR} "${HAVE_WORKING_STRICT_ALIASING}" PARENT_SCOPE) endfunction() function(RDMA_Check_SSE TO_VAR) set(SSE_CHECK_PROGRAM " #if defined(__i386__) #include #include int __attribute__((target(\"sse\"))) main(int argc, char *argv[]) { __m128 tmp = {}; tmp = _mm_loadl_pi(tmp, (__m64 *)&main); _mm_storel_pi((__m64 *)&main, tmp); return memchr(&tmp, 0, sizeof(tmp)) == &tmp; } #else int main(int argc, char *argv[]) { return 0; } #endif ") RDMA_Check_C_Compiles(HAVE_TARGET_SSE "${SSE_CHECK_PROGRAM}") if(NOT HAVE_TARGET_SSE) # Older compiler, we can work around this by adding -msse instead of # relying on the function attribute. 
RDMA_Check_C_Compiles(NEED_MSSE_FLAG "${SSE_CHECK_PROGRAM}" "-msse") if(NEED_MSSE_FLAG) set(SSE_FLAGS "-msse" PARENT_SCOPE) else() message(FATAL_ERROR "Can not figure out how to turn on sse instructions for i386") endif() endif() set(${TO_VAR} "${HAVE_TARGET_SSE}" PARENT_SCOPE) endFunction() rdma-core-56.1/buildlib/RDMA_Sparse.cmake000066400000000000000000000023141477342711600201620ustar00rootroot00000000000000# COPYRIGHT (c) 2017 Obsidian Research Corporation. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. function(RDMA_CheckSparse) # Sparse defines __CHECKER__, but only for the 'sparse pass', which has no # way to fail the compiler. CHECK_C_SOURCE_COMPILES(" #if __CHECKER__ #warning \"SPARSE DETECTED\" #endif int main(int argc,const char *argv[]) {return 0;} " HAVE_NO_SPARSE FAIL_REGEX "SPARSE DETECTED") if (HAVE_NO_SPARSE) set(HAVE_SPARSE FALSE PARENT_SCOPE) else() set(HAVE_SPARSE TRUE PARENT_SCOPE) # Replace various glibc headers with our own versions that have embedded sparse annotations. execute_process(COMMAND "${PYTHON_EXECUTABLE}" "${BUILDLIB}/gen-sparse.py" "--out" "${BUILD_INCLUDE}/" "--src" "${PROJECT_SOURCE_DIR}/" "--cc" "${CMAKE_C_COMPILER}" RESULT_VARIABLE retcode) if(NOT "${retcode}" STREQUAL "0") message(FATAL_ERROR "glibc header file patching for sparse failed. Review include/*.rej and fix the rejects, then do " "${BUILDLIB}/gen-sparse.py --out ${BUILD_INCLUDE}/ --src ${PROJECT_SOURCE_DIR}/ --save") endif() # Enable endian analysis in sparse add_definitions("-D__CHECK_ENDIAN__") endif() endfunction() rdma-core-56.1/buildlib/azp-checkpatch000077500000000000000000000057151477342711600177430ustar00rootroot00000000000000#!/usr/bin/env python3 import subprocess import urllib.request import os import re import tempfile import collections import copy import sys base = os.environ["SYSTEM_PULLREQUEST_TARGETBRANCH"] if not re.match("^[0-9a-fA-F]{40}$", base): base = "refs/remotes/origin/" + base with tempfile.TemporaryDirectory() as dfn: patches = subprocess.check_output( [ "git", "format-patch", "--output-directory", dfn, os.environ["SYSTEM_PULLREQUEST_SOURCECOMMITID"], "^" + base ], universal_newlines=True).splitlines() if len(patches) == 0: sys.exit(0) ckp = os.path.join(dfn, "checkpatch.pl") urllib.request.urlretrieve( "https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/scripts/checkpatch.pl", ckp) urllib.request.urlretrieve( "https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/scripts/spelling.txt", os.path.join(dfn, "spelling.txt")) os.symlink( os.path.join(os.getcwd(), "buildlib/const_structs.checkpatch"), os.path.join(dfn, "const_structs.checkpatch")) checkpatch = [ "perl", ckp, "--no-tree", "--show-types", "--ignore", "PREFER_KERNEL_TYPES,FILE_PATH_CHANGES,EXECUTE_PERMISSIONS,USE_NEGATIVE_ERRNO,CONST_STRUCT", "--emacs", "--mailback", "--quiet", "--no-summary" ] environ = copy.copy(os.environ) environ["GIT_DIR"] = subprocess.check_output(["git","rev-parse","--absolute-git-dir"]).decode().strip() warnings = False errors = False for fn in patches: proc = subprocess.run( checkpatch + [os.path.basename(fn)], cwd=dfn, stdout=subprocess.PIPE, universal_newlines=True, env=environ, stderr=subprocess.STDOUT) if proc.returncode == 0: assert (not proc.stdout) continue sys.stdout.write(proc.stdout) warnings = True for g in re.finditer( r"^\d+-.*:\d+: (\S+):(\S+): (.*)(?:\n#(\d+): (?:FILE: (.*):(\d+):)?)?$", proc.stdout, flags=re.MULTILINE): itms = {} if g.group(1) == "WARNING": itms["type"] = "warning" else: itms["type"] = "error" 
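            # Each match is turned into an Azure Pipelines "logissue"
            # annotation, printed further below.  A sample emitted line, with
            # hypothetical values (keys are sorted alphabetically):
            #   ##vso[task.logissue code=LONG_LINE;linenumber=12;sourcepath=foo.c;type=warning]line over 80 characters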
itms["code"]=g.group(2) if g.group(4): itms["sourcepath"] = g.group(5) itms["linenumber"] = g.group(6) # Bump some warnings to errors if itms["code"] == "UNKNOWN_COMMIT_ID": itms["type"] = "error" if itms["type"] == "error": errors = True print("##vso[task.logissue %s]%s" % (";".join( "%s=%s" % (k, v) for k, v in sorted(itms.items())), g.group(3))) if errors: print("##vso[task.complete result=Failed]azp-checkpatch") elif warnings: print("##vso[task.complete result=SucceededWithIssues]azp-checkpatch") rdma-core-56.1/buildlib/azure-pipelines-release.yml000066400000000000000000000032021477342711600223720ustar00rootroot00000000000000# See https://aka.ms/yaml # This pipeline runs to produce GitHub releases when tags are pushed. The # pipeline is never run from a PR and has access to all the build secrets, # including write permission to GitHub. trigger: tags: include: - v* resources: containers: - container: azp image: ucfconsort.azurecr.io/rdma-core/azure_pipelines:44.0 endpoint: ucfconsort_registry stages: - stage: Release jobs: - job: SrcPrep displayName: Build Source Tar pool: vmImage: 'ubuntu-latest' container: azp steps: - checkout: self fetchDepth: 1 - bash: | set -e git_tag=$(git describe --exact-match HEAD) rel_ver=$(echo $git_tag | sed -e 's/^v//') echo "Version is $rel_ver" echo "##vso[task.setvariable variable=rel_ver]$rel_ver" mkdir build-pandoc artifacts cd build-pandoc CC=gcc-12 cmake -GNinja .. ninja docs cd .. python3 buildlib/cbuild make-dist-tar build-pandoc displayName: Prebuild Documentation - task: GithubRelease@1 displayName: 'Create GitHub Release' inputs: githubConnection: github_release repositoryName: linux-rdma/rdma-core assets: ./*.tar.gz action: create title: rdma-core-$(rel_ver) isDraft: true addChangeLog: true changeLogCompareToRelease: lastNonDraftReleaseByTag changeLogCompareToReleaseTag: v56.* rdma-core-56.1/buildlib/azure-pipelines.yml000066400000000000000000000205111477342711600207560ustar00rootroot00000000000000# See https://aka.ms/yaml trigger: - master - stable-v* - dev/stable-v*/* pr: - master resources: containers: - container: azp image: ucfconsort.azurecr.io/rdma-core/azure_pipelines:44.0 endpoint: ucfconsort_registry - container: centos7 image: ucfconsort.azurecr.io/rdma-core/centos7:25.0 endpoint: ucfconsort_registry - container: centos8 image: ucfconsort.azurecr.io/rdma-core/centos8:44.0 endpoint: ucfconsort_registry - container: centos9 image: ucfconsort.azurecr.io/rdma-core/centos9:44.0 endpoint: ucfconsort_registry - container: fedora image: ucfconsort.azurecr.io/rdma-core/fc41:54.0 endpoint: ucfconsort_registry - container: xenial image: ucfconsort.azurecr.io/rdma-core/ubuntu-16.04:28.0 endpoint: ucfconsort_registry - container: bionic image: ucfconsort.azurecr.io/rdma-core/ubuntu-18.04:29.0 endpoint: ucfconsort_registry - container: focal image: ucfconsort.azurecr.io/rdma-core/ubuntu-20.04:44.0 endpoint: ucfconsort_registry - container: leap image: ucfconsort.azurecr.io/rdma-core/opensuse-15.0:25.0 endpoint: ucfconsort_registry - container: i386 image: ucfconsort.azurecr.io/rdma-core/debian-11-i386:37.0 options: --platform linux/386 endpoint: ucfconsort_registry stages: - stage: Build jobs: - job: Compile displayName: Compile Tests pool: vmImage: 'ubuntu-latest' container: azp steps: - task: PythonScript@0 displayName: checkpatch condition: eq(variables['Build.Reason'], 'PullRequest') inputs: scriptPath: buildlib/azp-checkpatch pythonInterpreter: /usr/bin/python3 - bash: | set -e mkdir build-gcc12 cd build-gcc12 CC=gcc-12 cmake -GNinja .. 
-DIOCTL_MODE=both -DENABLE_STATIC=1 -DENABLE_WERROR=1 ninja displayName: gcc 12.1 Compile - task: PythonScript@0 displayName: Check Build Script inputs: scriptPath: buildlib/check-build arguments: --src .. --cc gcc-12 workingDirectory: build-gcc12 pythonInterpreter: /usr/bin/python3 # Run sparse on the subdirectories which are sparse clean - bash: | set -e mkdir build-sparse mv CMakeLists.txt CMakeLists-orig.txt grep -v "# NO SPARSE" CMakeLists-orig.txt > CMakeLists.txt cd build-sparse CC=cgcc cmake -GNinja .. -DIOCTL_MODE=both -DNO_PYVERBS=1 -DENABLE_WERROR=1 ninja | grep -v '^\[' | tee out # sparse does not fail gcc on messages if [ -s out ]; then false fi mv ../CMakeLists-orig.txt ../CMakeLists.txt displayName: sparse Analysis - bash: | set -e mkdir build-clang cd build-clang CC=clang-15 cmake -GNinja .. -DCMAKE_BUILD_TYPE=Debug -DIOCTL_MODE=both -DENABLE_WERROR=1 ninja displayName: clang 15 Compile - bash: | set -e mv util/udma_barrier.h util/udma_barrier.h.old echo "#error Fail" >> util/udma_barrier.h cd build-gcc12 rm CMakeCache.txt CC=gcc-12 cmake -GNinja .. -DIOCTL_MODE=both -DENABLE_WERROR=1 ninja mv ../util/udma_barrier.h.old ../util/udma_barrier.h displayName: Simulate non-coherent DMA Platform Compile - bash: | set -e mkdir build-arm64 cd build-arm64 CC=aarch64-linux-gnu-gcc-12 PKG_CONFIG_PATH=/usr/lib/aarch64-linux-gnu/pkgconfig/ cmake -GNinja .. -DIOCTL_MODE=both -DNO_PYVERBS=1 -DENABLE_WERROR=1 ninja displayName: gcc 12.1 ARM64 Compile - bash: | set -e mkdir build-ppc64el cd build-ppc64el CC=powerpc64le-linux-gnu-gcc-12 PKG_CONFIG_PATH=/usr/lib/powerpc64le-linux-gnu/pkgconfig/ cmake -GNinja .. -DIOCTL_MODE=both -DNO_PYVERBS=1 -DENABLE_WERROR=1 ninja displayName: gcc 12.1 PPC64EL Compile - job: Compile32 displayName: Compile Tests 32 bit pool: vmImage: 'ubuntu-latest' container: i386 steps: - bash: | set -e mkdir build-i386 cd build-i386 cmake -GNinja .. -DIOCTL_MODE=both -DENABLE_WERROR=1 ninja displayName: gcc 10.2 i386 Compile - job: SrcPrep displayName: Build Source Tar pool: vmImage: 'ubuntu-latest' container: azp steps: - checkout: self fetchDepth: 1 - bash: | set -e mkdir build-pandoc artifacts cd build-pandoc CC=gcc-12 cmake -GNinja .. ninja docs cd ../artifacts # FIXME: Check Build.SourceBranch for tag consistency python3 ../buildlib/cbuild make-dist-tar ../build-pandoc displayName: Prebuild Documentation - task: PublishPipelineArtifact@0 inputs: # Contains an rdma-core-XX.tar.gz file artifactName: source_tar targetPath: artifacts - job: RPM_Distros displayName: Test Build RPMs for dependsOn: SrcPrep pool: vmImage: 'ubuntu-latest' strategy: matrix: centos7: CONTAINER: centos7 SPEC: redhat/rdma-core.spec RPMBUILD_OPTS: --define 'EXTRA_CMAKE_FLAGS -DCMAKE_BUILD_TYPE=Debug -DENABLE_WERROR=1' centos8: CONTAINER: centos8 SPEC: redhat/rdma-core.spec RPMBUILD_OPTS: --define 'EXTRA_CMAKE_FLAGS -DCMAKE_BUILD_TYPE=Debug -DENABLE_WERROR=1' centos9: CONTAINER: centos9 SPEC: redhat/rdma-core.spec RPMBUILD_OPTS: --define 'EXTRA_CMAKE_FLAGS -DCMAKE_BUILD_TYPE=Debug -DENABLE_WERROR=1' fedora41: CONTAINER: fedora SPEC: redhat/rdma-core.spec RPMBUILD_OPTS: --define 'EXTRA_CMAKE_FLAGS -DCMAKE_BUILD_TYPE=Debug -DENABLE_WERROR=1' leap: CONTAINER: leap SPEC: suse/rdma-core.spec RPMBUILD_OPTS: --define 'EXTRA_CMAKE_FLAGS -DCMAKE_BUILD_TYPE=Debug -DENABLE_WERROR=1' --without=curlmini container: $[ variables['CONTAINER'] ] steps: - checkout: none - task: DownloadPipelineArtifact@2 inputs: artifactName: source_tar targetPath: . 
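      # The package-build step below can also be reproduced locally with the
      # cbuild helper shipped in this tree (a sketch; see buildlib/cbuild
      # --help for the full interface):
      #   buildlib/cbuild build-images centos9
      #   buildlib/cbuild pkg centos9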
- bash: | set -e mkdir SOURCES tmp tar --wildcards -xzf rdma-core*.tar.gz */$(SPEC) --strip-components=2 RPM_SRC=$((rpmspec -P *.spec || grep ^Source: *.spec) | awk '/^Source:/{split($0,a,"[ \t]+");print(a[2])}') (cd SOURCES && ln -sf ../rdma-core*.tar.gz "$RPM_SRC") rpmbuild --define '_tmppath '$(pwd)'/tmp' --define '_topdir '$(pwd) -bb *.spec $(RPMBUILD_OPTS) displayName: Perform Package Build - job: DEB_Distros displayName: Test Build DEBs for dependsOn: SrcPrep pool: vmImage: 'ubuntu-latest' strategy: matrix: xenial: CONTAINER: xenial bionic: CONTAINER: bionic focal: CONTAINER: focal jammy: CONTAINER: azp LINTIAN: true container: $[ variables['CONTAINER'] ] steps: - checkout: none - task: DownloadPipelineArtifact@2 inputs: artifactName: source_tar targetPath: . - bash: | set -e mv *.tar.gz src.tar.gz tar -xzf src.tar.gz cd rdma-core*/ dpkg-buildpackage -b -d displayName: Perform Package Build - bash: | lintian *.deb displayName: Debian Lintian for .deb packages condition: eq(variables['LINTIAN'], 'true') rdma-core-56.1/buildlib/cbuild000077500000000000000000001300761477342711600163170ustar00rootroot00000000000000#!/usr/bin/env python3 # Copyright 2015-2016 Obsidian Research Corp. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. # PYTHON_ARGCOMPLETE_OK """cbuild - Build in a docker container This script helps using docker containers to run software builds. This allows building for a wide range of distributions without having to install them. Each target distribution has a base docker image and a set of packages to install. The first step is to build the customized docker container: $ buildlib/cbuild build-images fedora This will download the base image and customize it with the required packages. Next, a build can be performed 'in place'. This is useful to do edit/compile cycles with an alternate distribution. $ buildlib/cbuild make fedora The build output will be placed in build-fcXX, where XX is latest fedora release. Finally, a full package build can be performed inside the container. Note this mode actually creates a source tree inside the container based on the current git HEAD commit, so any uncommitted edits will be lost. $ buildlib/cbuild pkg fedora In this case only the final package results are copied outside the container (to ..) and everything else is discarded. In all cases the containers that are spun up are deleted after they are finished, only the base container created during 'build-images' is kept. The '--run-shell' option can be used to setup the container to the point of running the build command and instead run an interactive bash shell. 
This is useful for debugging certain kinds of build problems.""" from __future__ import print_function import argparse import collections import filecmp import grp import inspect import json import multiprocessing import os import pwd import re import shutil import subprocess import sys import tempfile import yaml from contextlib import contextmanager; project = "rdma-core"; def get_version(): """Return the version string for the project, this gets automatically written into the packaging files.""" with open("CMakeLists.txt","r") as F: for ln in F: g = re.match(r'^set\(PACKAGE_VERSION "(.+)"\)',ln) if g is None: continue; return g.group(1); raise RuntimeError("Could not find version"); class DockerFile(object): def __init__(self,src): self.lines = ["FROM %s"%(src)]; class Environment(object): azp_images = None; pandoc = True; python_cmd = "python3"; aliases = set(); use_make = False; proxy = True; build_pyverbs = True; docker_opts = [] to_azp = False; def _get_azp_names(self): if Environment.azp_images: return Environment.azp_images; with open("buildlib/azure-pipelines.yml") as F: azp = yaml.safe_load(F) Environment.azp_images = set(I["image"] for I in azp["resources"]["containers"]) return Environment.azp_images; def image_name(self): if self.to_azp: # Get the version number of the container out of the azp file. prefix = "ucfconsort.azurecr.io/%s/%s:"%(project, self.name); for I in self._get_azp_names(): if I.startswith(prefix): return I; raise ValueError("Image is not used in buildlib/azure-pipelines.yml") return "build-%s/%s"%(project,self.name); # ------------------------------------------------------------------------- class YumEnvironment(Environment): is_rpm = True; def get_docker_file(self,tmpdir): res = DockerFile(self.docker_parent); res.lines.append("RUN yum install -y %s && yum clean all"%( " ".join(sorted(self.pkgs)))); return res; class centos7(YumEnvironment): docker_parent = "centos:7"; pkgs = { 'cmake', 'gcc', 'libnl3-devel', 'libudev-devel', 'make', 'pkgconfig', 'python', 'python-argparse', 'python-docutils', 'rpm-build', 'systemd-devel', 'valgrind-devel', } name = "centos7"; use_make = True; pandoc = False; build_pyverbs = False; specfile = "redhat/rdma-core.spec"; python_cmd = "python"; to_azp = True; class centos7_epel(centos7): pkgs = (centos7.pkgs - {"cmake","make"}) | { "cmake3", "ninja-build", "pandoc", "python34-setuptools", 'python34-Cython', 'python34-devel', }; name = "centos7_epel"; build_pyverbs = True; use_make = False; pandoc = True; ninja_cmd = "ninja-build"; # Our spec file does not know how to cope with cmake3 is_rpm = False; to_azp = False; def get_docker_file(self,tmpdir): res = YumEnvironment.get_docker_file(self,tmpdir); res.lines.insert(1,"RUN yum install -y epel-release"); res.lines.append("RUN ln -s /usr/bin/cmake3 /usr/local/bin/cmake && ln -sf /usr/bin/python3.4 /usr/bin/python3"); return res; class amazonlinux2(YumEnvironment): docker_parent = "amazonlinux:2"; pkgs = centos7.pkgs; name = "amazonlinux2"; use_make = True; pandoc = False; build_pyverbs = False; specfile = "redhat/rdma-core.spec"; python_cmd = "python"; to_azp = False; class centos8(Environment): docker_parent = "quay.io/centos/centos:stream8" pkgs = { "pandoc", "perl-generators", "python3-Cython", "python3-devel", "python3-docutils", 'cmake', 'gcc', 'libnl3-devel', 'libudev-devel', 'ninja-build', 'pkgconfig', 'rpm-build', 'systemd-devel', 'valgrind-devel', }; name = "centos8"; specfile = "redhat/rdma-core.spec"; is_rpm = True; to_azp = True; proxy = False; def 
get_docker_file(self,tmpdir): res = DockerFile(self.docker_parent); res.lines.append("RUN dnf config-manager --set-enabled powertools && " "dnf install -y %s && dnf clean all" % (" ".join(sorted(self.pkgs)))) return res; class centos9(Environment): docker_parent = "quay.io/centos/centos:stream9" pkgs = centos8.pkgs name = "centos9" specfile = "redhat/rdma-core.spec" ninja_cmd = "ninja-build" is_rpm = True to_azp = True proxy = False def get_docker_file(self,tmpdir): res = DockerFile(self.docker_parent); res.lines.append("RUN dnf install -y 'dnf-command(config-manager)' epel-release &&" "dnf config-manager --set-enabled crb && " "dnf install -y %s && dnf clean all" % (" ".join(sorted(self.pkgs)))) return res class fc41(Environment): docker_parent = "fedora:41"; pkgs = centos8.pkgs | {"util-linux"} name = "fc41"; specfile = "redhat/rdma-core.spec"; ninja_cmd = "ninja-build"; is_rpm = True; aliases = {"fedora"}; to_azp = True; def get_docker_file(self,tmpdir): res = DockerFile(self.docker_parent); res.lines.append("RUN dnf install -y %s && dnf clean all"%( " ".join(sorted(self.pkgs)))); return res; # ------------------------------------------------------------------------- class APTEnvironment(Environment): is_deb = True; build_python = True; def get_docker_file(self,tmpdir): res = DockerFile(self.docker_parent); res.lines.append("RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends %s && apt-get clean && rm -rf /usr/share/doc/ /usr/lib/debug /var/lib/apt/lists/"%( " ".join(sorted(self.pkgs)))); return res; def add_source_list(self,tmpdir,name,content): sld = os.path.join(tmpdir,"etc","apt","sources.list.d"); if not os.path.isdir(sld): os.makedirs(sld); with open(os.path.join(sld,name),"w") as F: F.write(content + "\n"); def fix_https(self,tmpdir): """The ubuntu image does not include ca-certificates, so if we want to use HTTPS disable certificate validation.""" cfgd = os.path.join(tmpdir,"etc","apt","apt.conf.d") if not os.path.isdir(cfgd): os.makedirs(cfgd) with open(os.path.join(cfgd,"01nossl"),"w") as F: F.write('Acquire::https { Verify-Peer "false"; };') class xenial(APTEnvironment): docker_parent = "ubuntu:16.04" pkgs = { 'build-essential', 'cmake', 'debhelper', 'dh-systemd', 'fakeroot', # for AZP 'gcc', 'libnl-3-dev', 'libnl-route-3-dev', 'libsystemd-dev', 'libudev-dev', 'make', 'ninja-build', 'pandoc', 'pkg-config', 'python3', 'python3-docutils', 'valgrind', }; name = "ubuntu-16.04"; aliases = {"xenial"}; to_azp = True; class bionic(APTEnvironment): docker_parent = "ubuntu:18.04" pkgs = xenial.pkgs | { 'cython3', 'python3-dev', }; name = "ubuntu-18.04"; aliases = {"bionic", "ubuntu"}; to_azp = True class focal(APTEnvironment): docker_parent = "ubuntu:20.04" pkgs = bionic.pkgs | { 'dh-python', } name = "ubuntu-20.04"; aliases = {"focal", "ubuntu"}; to_azp = True class jammy(APTEnvironment): docker_parent = "ubuntu:22.04" pkgs = (bionic.pkgs ^ {"dh-systemd"}) | { 'dh-python', } name = "ubuntu-22.04"; aliases = {"jammy", "ubuntu"}; class jessie(APTEnvironment): docker_parent = "debian:8" pkgs = xenial.pkgs; name = "debian-8"; aliases = {"jessie"}; build_pyverbs = False; class stretch(APTEnvironment): docker_parent = "debian:9" pkgs = bionic.pkgs; name = "debian-9"; aliases = {"stretch"}; class bullseye(APTEnvironment): docker_parent = "debian:11" pkgs = { 'build-essential', 'cmake', 'debhelper', 'fakeroot', # for AZP 'gcc', 'libnl-3-dev', 'libnl-route-3-dev', 'libsystemd-dev', 'libudev-dev', 'make', 'ninja-build', 'pandoc', 'pkg-config', 
'python3', 'python3-docutils', 'valgrind', }; name = "debian-11"; aliases = {"bullseye"}; class bullseye_i386(APTEnvironment): docker_parent = "debian:11" pkgs = bullseye.pkgs | {"nodejs"} name = "debian-11-i386"; aliases = {"bullseye_i386"}; docker_opts = ["--platform","linux/386"] to_azp = True def get_docker_file(self,tmpdir): res = json.loads(docker_cmd_str(args,"manifest","inspect",self.docker_parent)) # Docker is somewhat obnoxious in how it handles the multi-platform # images since it does not store the manifest locally and thus # overwrites the local tag. Figure out the tag we want by hash and # use it directly. Docker will cache this. for image in res["manifests"]: platform = image["platform"] if platform["architecture"] == "386" and platform["os"] == "linux": base = f"{self.docker_parent}@{image['digest']}" break else: raise RuntimeError("Docker manifest failed"); res = DockerFile(base); res.lines.append("RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends %s && apt-get clean && rm -rf /usr/share/doc/ /usr/lib/debug /var/lib/apt/lists/"%( " ".join(sorted(self.pkgs)))); res.lines.append('LABEL "com.azure.dev.pipelines.agent.handler.node.path"="/usr/bin/node"') return res; class debian_experimental(APTEnvironment): docker_parent = "debian:experimental" pkgs = (stretch.pkgs ^ {"gcc"}) | {"gcc-9"}; name = "debian-experimental"; def get_docker_file(self,tmpdir): res = DockerFile(self.docker_parent); res.lines.append("RUN apt-get update && apt-get -t experimental install -y --no-install-recommends %s && apt-get clean"%( " ".join(sorted(self.pkgs)))); return res; # ------------------------------------------------------------------------- class ZypperEnvironment(Environment): proxy = False; is_rpm = True; def get_docker_file(self,tmpdir): res = DockerFile(self.docker_parent); res.lines.append("RUN zypper --non-interactive refresh"); res.lines.append("RUN zypper --non-interactive dist-upgrade"); res.lines.append("RUN zypper --non-interactive install %s"%( " ".join(sorted(self.pkgs)))); return res; class leap(ZypperEnvironment): docker_parent = "opensuse/leap:15.0"; specfile = "suse/rdma-core.spec"; pkgs = { 'cmake', 'gcc', 'libnl3-devel', 'libudev-devel', 'udev', 'make', 'ninja', 'pandoc', 'pkg-config', 'python3', 'rpm-build', 'systemd-devel', 'valgrind-devel', 'python3-Cython', 'python3-devel', 'python3-docutils', }; rpmbuild_options = [ "--without=curlmini" ]; to_azp = True; name = "opensuse-15.0"; aliases = {"leap"}; class tumbleweed(ZypperEnvironment): docker_parent = "opensuse/tumbleweed:latest"; pkgs = (leap.pkgs ^ {"valgrind-devel"}) | { "valgrind-client-headers", "perl" }; name = "tumbleweed"; specfile = "suse/rdma-core.spec"; rpmbuild_options = [ "--without=curlmini" ]; # ------------------------------------------------------------------------- class azure_pipelines(APTEnvironment): docker_parent = "ubuntu:22.04" pkgs = { "abi-compliance-checker", "abi-dumper", "ca-certificates", "clang-15", "cmake", "cython3", "debhelper", "dh-python", "dpkg-dev", "fakeroot", "gcc-12", "git", "libc6-dev", "libnl-3-dev", "libnl-route-3-dev", "libsystemd-dev", "libudev-dev", "lintian", "make", "ninja-build", "pandoc", "pkg-config", "python3", "python3-dev", "python3-docutils", "python3-pkg-resources", "python3-yaml", "sparse", "valgrind", } | { # ARM 64 cross compiler "gcc-12-aarch64-linux-gnu", "libgcc-12-dev:arm64", "libc6-dev:arm64", "libnl-3-dev:arm64", "libnl-route-3-dev:arm64", "libsystemd-dev:arm64", "libudev-dev:arm64", } | { # PPC 64 cross 
compiler "gcc-12-powerpc64le-linux-gnu", "libgcc-12-dev:ppc64el", "libc6-dev:ppc64el", "libnl-3-dev:ppc64el", "libnl-route-3-dev:ppc64el", "libsystemd-dev:ppc64el", "libudev-dev:ppc64el", } to_azp = True; name = "azure_pipelines"; aliases = {"azp"} llvm_sources = """ Types: deb URIs: http://apt.llvm.org/jammy/ Suites: llvm-toolchain-jammy-15 Components: main Architectures: amd64 Signed-By: -----BEGIN PGP PUBLIC KEY BLOCK----- Version: GnuPG v1.4.12 (GNU/Linux) . mQINBFE9lCwBEADi0WUAApM/mgHJRU8lVkkw0CHsZNpqaQDNaHefD6Rw3S4LxNmM EZaOTkhP200XZM8lVdbfUW9xSjA3oPldc1HG26NjbqqCmWpdo2fb+r7VmU2dq3NM R18ZlKixiLDE6OUfaXWKamZsXb6ITTYmgTO6orQWYrnW6ckYHSeaAkW0wkDAryl2 B5v8aoFnQ1rFiVEMo4NGzw4UX+MelF7rxaaregmKVTPiqCOSPJ1McC1dHFN533FY Wh/RVLKWo6npu+owtwYFQW+zyQhKzSIMvNujFRzhIxzxR9Gn87MoLAyfgKEzrbbT DhqqNXTxS4UMUKCQaO93TzetX/EBrRpJj+vP640yio80h4Dr5pAd7+LnKwgpTDk1 G88bBXJAcPZnTSKu9I2c6KY4iRNbvRz4i+ZdwwZtdW4nSdl2792L7Sl7Nc44uLL/ ZqkKDXEBF6lsX5XpABwyK89S/SbHOytXv9o4puv+65Ac5/UShspQTMSKGZgvDauU cs8kE1U9dPOqVNCYq9Nfwinkf6RxV1k1+gwtclxQuY7UpKXP0hNAXjAiA5KS5Crq 7aaJg9q2F4bub0mNU6n7UI6vXguF2n4SEtzPRk6RP+4TiT3bZUsmr+1ktogyOJCc Ha8G5VdL+NBIYQthOcieYCBnTeIH7D3Sp6FYQTYtVbKFzmMK+36ERreL/wARAQAB tD1TeWx2ZXN0cmUgTGVkcnUgLSBEZWJpYW4gTExWTSBwYWNrYWdlcyA8c3lsdmVz dHJlQGRlYmlhbi5vcmc+iQI4BBMBAgAiBQJRPZQsAhsDBgsJCAcDAgYVCAIJCgsE FgIDAQIeAQIXgAAKCRAVz00Yr090Ibx+EADArS/hvkDF8juWMXxh17CgR0WZlHCC 9CTBWkg5a0bNN/3bb97cPQt/vIKWjQtkQpav6/5JTVCSx2riL4FHYhH0iuo4iAPR udC7Cvg8g7bSPrKO6tenQZNvQm+tUmBHgFiMBJi92AjZ/Qn1Shg7p9ITivFxpLyX wpmnF1OKyI2Kof2rm4BFwfSWuf8Fvh7kDMRLHv+MlnK/7j/BNpKdozXxLcwoFBmn l0WjpAH3OFF7Pvm1LJdf1DjWKH0Dc3sc6zxtmBR/KHHg6kK4BGQNnFKujcP7TVdv gMYv84kun14pnwjZcqOtN3UJtcx22880DOQzinoMs3Q4w4o05oIF+sSgHViFpc3W R0v+RllnH05vKZo+LDzc83DQVrdwliV12eHxrMQ8UYg88zCbF/cHHnlzZWAJgftg hB08v1BKPgYRUzwJ6VdVqXYcZWEaUJmQAPuAALyZESw94hSo28FAn0/gzEc5uOYx K+xG/lFwgAGYNb3uGM5m0P6LVTfdg6vDwwOeTNIExVk3KVFXeSQef2ZMkhwA7wya KJptkb62wBHFE+o9TUdtMCY6qONxMMdwioRE5BYNwAsS1PnRD2+jtlI0DzvKHt7B MWd8hnoUKhMeZ9TNmo+8CpsAtXZcBho0zPGz/R8NlJhAWpdAZ1CmcPo83EW86Yq7 BxQUKnNHcwj2ebkCDQRRPZQsARAA4jxYmbTHwmMjqSizlMJYNuGOpIidEdx9zQ5g zOr431/VfWq4S+VhMDhs15j9lyml0y4ok215VRFwrAREDg6UPMr7ajLmBQGau0Fc bvZJ90l4NjXp5p0NEE/qOb9UEHT7EGkEhaZ1ekkWFTWCgsy7rRXfZLxB6sk7pzLC DshyW3zjIakWAnpQ5j5obiDy708pReAuGB94NSyb1HoW/xGsGgvvCw4r0w3xPStw F1PhmScE6NTBIfLliea3pl8vhKPlCh54Hk7I8QGjo1ETlRP4Qll1ZxHJ8u25f/ta RES2Aw8Hi7j0EVcZ6MT9JWTI83yUcnUlZPZS2HyeWcUj+8nUC8W4N8An+aNps9l/ 21inIl2TbGo3Yn1JQLnA1YCoGwC34g8QZTJhElEQBN0X29ayWW6OdFx8MDvllbBV ymmKq2lK1U55mQTfDli7S3vfGz9Gp/oQwZ8bQpOeUkc5hbZszYwP4RX+68xDPfn+ M9udl+qW9wu+LyePbW6HX90LmkhNkkY2ZzUPRPDHZANU5btaPXc2H7edX4y4maQa xenqD0lGh9LGz/mps4HEZtCI5CY8o0uCMF3lT0XfXhuLksr7Pxv57yue8LLTItOJ d9Hmzp9G97SRYYeqU+8lyNXtU2PdrLLq7QHkzrsloG78lCpQcalHGACJzrlUWVP/ fN3Ht3kAEQEAAYkCHwQYAQIACQUCUT2ULAIbDAAKCRAVz00Yr090IbhWEADbr50X OEXMIMGRLe+YMjeMX9NG4jxs0jZaWHc/WrGR+CCSUb9r6aPXeLo+45949uEfdSsB pbaEdNWxF5Vr1CSjuO5siIlgDjmT655voXo67xVpEN4HhMrxugDJfCa6z97P0+ML PdDxim57uNqkam9XIq9hKQaurxMAECDPmlEXI4QT3eu5qw5/knMzDMZj4Vi6hovL wvvAeLHO/jsyfIdNmhBGU2RWCEZ9uo/MeerPHtRPfg74g+9PPfP6nyHD2Wes6yGd oVQwtPNAQD6Cj7EaA2xdZYLJ7/jW6yiPu98FFWP74FN2dlyEA2uVziLsfBrgpS4l tVOlrO2YzkkqUGrybzbLpj6eeHx+Cd7wcjI8CalsqtL6cG8cUEjtWQUHyTbQWAgG 5VPEgIAVhJ6RTZ26i/G+4J8neKyRs4vz+57UGwY6zI4AB1ZcWGEE3Bf+CDEDgmnP LSwbnHefK9IljT9XU98PelSryUO/5UPw7leE0akXKB4DtekToO226px1VnGp3Bov 1GBGvpHvL2WizEwdk+nfk8LtrLzej+9FtIcq3uIrYnsac47Pf7p0otcFeTJTjSq3 krCaoG4Hx0zGQG2ZFpHrSrZTVy6lxvIdfi0beMgY6h78p6M9eYZHQHc02DjFkQXN bXb5c6gCHESH5PXwPU4jQEE7Ib9J6sbk7ZT2Mw== =j+4q -----END PGP PUBLIC KEY BLOCK----- """ 
gcc12_sources = """ Types: deb URIs: https://ppa.launchpadcontent.net/ubuntu-toolchain-r/test/ubuntu Suites: jammy Components: main Architectures: amd64 arm64 ppc64el Signed-By: -----BEGIN PGP PUBLIC KEY BLOCK----- . xo0ESuBvRwEEAMi4cDba7xlKaaoXjO1n1HX8RKrkW+HEIl79nSOSJyvzysajs7zU ow/OzCQp9NswqrDmNuH1+lPTTRNAGtK8r2ouq2rnXT1mTl23dpgHZ9spseR73s4Z BGw/ag4bpU5dNUStvfmHhIjVCuiSpNn7cyy1JSSvSs3N2mxteKjXLBf7ABEBAAHN GkxhdW5jaHBhZCBUb29sY2hhaW4gYnVpbGRzwrYEEwECACAFAkrgb0cCGwMGCwkI BwMCBBUCCAMEFgIDAQIeAQIXgAAKCRAek3eiup7yfzGKA/4xzUqNACSlB+k+DxFF HqkwKa/ziFiAlkLQyyhm+iqz80htRZr7Ls/ZRYZl0aSU56/hLe0V+TviJ1s8qdN2 lamkKdXIAFfavA04nOnTzyIBJ82EAUT3Nh45skMxo4z4iZMNmsyaQpNl/m/lNtOL hR64v5ZybofB2EWkMxUzX8D/FQ== =xe+/ -----END PGP PUBLIC KEY BLOCK----- """ ports_sources = """ Types: deb URIS: http://ports.ubuntu.com Suites: jammy jammy-security jammy-updates Components: main universe Architectures: arm64 ppc64el """ amd64_sources = """ Types: deb URIS: http://archive.ubuntu.com/ubuntu Suites: jammy jammy-security jammy-updates Components: main universe Architectures: amd64 """ def get_docker_file(self,tmpdir): res = focal.get_docker_file(self,tmpdir); self.fix_https(tmpdir) self.add_source_list(tmpdir, "llvm.sources", self.llvm_sources) self.add_source_list(tmpdir, "ubuntu-toolchain-r-ubuntu-test-jammy.sources", self.gcc12_sources) self.add_source_list(tmpdir, "ports.sources", self.ports_sources) # Replace the main sources so we can limit the architecture self.add_source_list(tmpdir, "amd64.sources", self.amd64_sources) res.lines.insert(1,"ADD etc/ /etc/"); res.lines.insert(1,"RUN rm /etc/apt/sources.list && " "dpkg --add-architecture ppc64el && " "dpkg --add-architecture arm64") return res; # ------------------------------------------------------------------------- environments = [centos7(), centos7_epel(), centos8(), centos9(), amazonlinux2(), xenial(), bionic(), focal(), jammy(), jessie(), stretch(), fc41(), leap(), tumbleweed(), debian_experimental(), azure_pipelines(), bullseye(), bullseye_i386(), ]; class ToEnvActionPkg(argparse.Action): """argparse helper to parse environment lists into environment classes""" def __call__(self, parser, namespace, values, option_string=None): if not isinstance(values,list): values = [values]; res = set(); for I in values: if I == "all": for env in environments: if env.name != "centos7_epel": res.add(env); else: for env in environments: if env.name == I or I in env.aliases: res.add(env); setattr(namespace, self.dest, sorted(res,key=lambda x:x.name)) class ToEnvAction(argparse.Action): """argparse helper to parse environment lists into environment classes""" def __call__(self, parser, namespace, values, option_string=None): if not isinstance(values,list): values = [values]; res = set(); for I in values: if I == "all": res.update(environments); else: for env in environments: if env.name == I or I in env.aliases: res.add(env); setattr(namespace, self.dest, sorted(res,key=lambda x:x.name)) def env_choices_pkg(): """All the names that can be used with ToEnvAction""" envs = set(("all",)); for I in environments: if getattr(I,"is_deb",False) or getattr(I,"is_rpm",False): envs.add(I.name); envs.update(I.aliases); return envs; def env_choices(): """All the names that can be used with ToEnvAction""" envs = set(("all",)); for I in environments: envs.add(I.name); envs.update(I.aliases); return envs; def docker_cmd(env,*cmd): """Invoke docker""" cmd = list(cmd); if env.sudo: return subprocess.check_call(["sudo","docker"] + cmd); return subprocess.check_call(["docker"] + cmd); def 
docker_cmd_str(env,*cmd): """Invoke docker""" cmd = list(cmd); if env.sudo: return subprocess.check_output(["sudo","docker"] + cmd).decode(); return subprocess.check_output(["docker"] + cmd).decode(); @contextmanager def private_tmp(args): """Simple version of Python 3's tempfile.TemporaryDirectory""" dfn = tempfile.mkdtemp(); try: yield dfn; finally: try: shutil.rmtree(dfn); except: # The debian builds result in root owned files because we don't use fakeroot subprocess.check_call(['sudo','rm','-rf',dfn]); @contextmanager def inDirectory(dir): cdir = os.getcwd(); try: os.chdir(dir); yield True; finally: os.chdir(cdir); def map_git_args(src_root,to): """Return a list of docker arguments that will map the .git directory into the container""" git_dir = subprocess.check_output([ "git", "-C", src_root, "rev-parse", "--absolute-git-dir", ]).decode().strip() if ".git/worktrees" in git_dir: with open(os.path.join(git_dir, "commondir")) as F: git_dir = os.path.join(git_dir, F.read().strip()) git_dir = os.path.abspath(git_dir) res = ["-v", "%s:%s:ro" % (os.path.join(src_root, ".git"), os.path.join(to, ".git")), "-v", "%s:%s:ro" % (git_dir, git_dir)] else: res = ["-v", "%s:%s:ro" % (git_dir, os.path.join(to, ".git"))] alternates = os.path.join(git_dir, "objects/info/alternates") if os.path.exists(alternates): with open(alternates) as F: for I in F.readlines(): I = os.path.normpath(I.strip()) res.extend(["-v","%s:%s:ro"%(I,I)]); return res; def get_image_id(args,image_name): img = json.loads(docker_cmd_str(args,"inspect",image_name)); image_id = img[0]["Id"]; # Newer dockers put a prefix if ":" in image_id: image_id = image_id.partition(':')[2]; return image_id; # ------------------------------------------------------------------------- def get_tar_file(args,tarfn,pandoc_prebuilt=False): """Create a tar file that matches what buildlib/github-release would do if it was a tagged release""" prefix = "%s-%s/"%(project,get_version()); if not pandoc_prebuilt: subprocess.check_call(["git","archive", # This must match the prefix generated buildlib/github-release "--prefix",prefix, "--output",tarfn, "HEAD"]); return; # When the OS does not support pandoc we got through the extra step to # build pandoc output in the azp container and include it in the # tar. 
if not args.use_prebuilt_pandoc: subprocess.check_call(["buildlib/cbuild","make","azure_pipelines","docs"]); cmd_make_dist_tar(argparse.Namespace(BUILD="build-azure_pipelines",tarfn=tarfn, script_pwd="",tag=None)); def run_rpm_build(args,spec_file,env): with open(spec_file,"r") as F: for ln in F: if ln.startswith("Version:"): ver = ln.strip().partition(' ')[2].strip(); assert(ver == get_version()); if ln.startswith("Source:"): tarfn = ln.strip().partition(' ')[2].strip(); image_id = get_image_id(args,env.image_name()); with private_tmp(args) as tmpdir: os.mkdir(os.path.join(tmpdir,"SOURCES")); os.mkdir(os.path.join(tmpdir,"tmp")); get_tar_file(args,os.path.join(tmpdir,"SOURCES",tarfn), pandoc_prebuilt=not env.pandoc); with open(spec_file,"r") as inF: spec = list(inF); tspec_file = os.path.basename(spec_file); with open(os.path.join(tmpdir,tspec_file),"w") as outF: outF.write("".join(spec)); home = os.path.join(os.path.sep,"home",os.getenv("LOGNAME")); vdir = os.path.join(home,"rpmbuild"); opts = [ "run", "--rm=true", "-v","%s:%s"%(tmpdir,vdir), "-w",vdir, "-h","builder-%s"%(image_id[:12]), "-e","HOME=%s"%(home), "-e","TMPDIR=%s"%(os.path.join(vdir,"tmp")), ]; # rpmbuild complains if we do not have an entry in passwd and group # for the user we are going to use to do the build. with open(os.path.join(tmpdir,"go.py"),"w") as F: print(""" import os,subprocess; with open("/etc/passwd","a") as F: F.write({passwd!r} + "\\n"); with open("/etc/group","a") as F: F.write({group!r} + "\\n"); os.setgid({gid:d}); os.setuid({uid:d}); # Get RPM to tell us the expected tar filename. for ln in subprocess.check_output(["rpmspec","-P",{tspec_file!r}]).splitlines(): if ln.startswith(b"Source:"): tarfn = ln.strip().partition(b' ')[2].strip(); if tarfn != {tarfn!r}: os.symlink({tarfn!r},os.path.join(b"SOURCES",tarfn)); """.format(passwd=":".join(str(I) for I in pwd.getpwuid(os.getuid())), group=":".join(str(I) for I in grp.getgrgid(os.getgid())), uid=os.getuid(), gid=os.getgid(), tarfn=tarfn, tspec_file=tspec_file), file=F); extra_opts = getattr(env,"rpmbuild_options", []) bopts = ["-bb",tspec_file] + extra_opts; for arg in args.with_flags: bopts.extend(["--with", arg]); for arg in args.without_flags: bopts.extend(["--without", arg]); if "pyverbs" not in args.with_flags + args.without_flags: if env.build_pyverbs: bopts.extend(["--with", "pyverbs"]); print('os.execlp("rpmbuild","rpmbuild",%s)'%( ",".join(repr(I) for I in bopts)), file=F); if args.run_shell: opts.append("-ti"); opts.append(env.image_name()); if args.run_shell: opts.append("/bin/bash"); else: opts.extend([env.python_cmd,"go.py"]); docker_cmd(args,*opts) print() for path,jnk,files in os.walk(os.path.join(tmpdir,"RPMS")): for I in files: print("Final RPM: ",os.path.join("..",I)); shutil.move(os.path.join(path,I), os.path.join("..",I)); def run_deb_build(args,env): image_id = get_image_id(args,env.image_name()); with private_tmp(args) as tmpdir: os.mkdir(os.path.join(tmpdir,"src")); os.mkdir(os.path.join(tmpdir,"tmp")); opwd = os.getcwd(); with inDirectory(os.path.join(tmpdir,"src")): subprocess.check_call(["git", "--git-dir",os.path.join(opwd,".git"), "reset","--hard","HEAD"]); home = os.path.join(os.path.sep,"home",os.getenv("LOGNAME")); opts = [ "run", "--read-only", "--rm=true", "-v","%s:%s"%(tmpdir,home), "-w",os.path.join(home,"src"), "-h","builder-%s"%(image_id[:12]), "-e","HOME=%s"%(home), "-e","TMPDIR=%s"%(os.path.join(home,"tmp")), "-e","DEB_BUILD_OPTIONS=parallel=%u"%(multiprocessing.cpu_count()), ]; # Create a go.py that will let us run 
the compilation as the user and # then switch to root only for the packaging step. with open(os.path.join(tmpdir,"go.py"),"w") as F: print(""" import subprocess,os; def to_user(): os.setgid({gid:d}); os.setuid({uid:d}); subprocess.check_call(["debian/rules","debian/rules","build"], preexec_fn=to_user); subprocess.check_call(["debian/rules","debian/rules","binary"]); """.format(uid=os.getuid(), gid=os.getgid()), file=F); if args.run_shell: opts.append("-ti"); opts.append(env.image_name()); if args.run_shell: opts.append("/bin/bash"); else: opts.extend(["python3",os.path.join(home,"go.py")]); docker_cmd(args,*opts); print() for I in os.listdir(tmpdir): if I.endswith(".deb"): print("Final DEB: ",os.path.join("..",I)); shutil.move(os.path.join(tmpdir,I), os.path.join("..",I)); def copy_abi_files(src): """Retrieve the current ABI files and place them in the source tree.""" if not os.path.isdir(src): return; for path,jnk,files in os.walk(src): for I in files: if not I.startswith("current-"): continue; ref_fn = os.path.join("ABI",I[8:]); cur_fn = os.path.join(src, path, I); if os.path.isfile(ref_fn) and filecmp.cmp(ref_fn,cur_fn,False): continue; print("Changed ABI File: ", ref_fn); shutil.copy(cur_fn, ref_fn); def run_azp_build(args,env): # Load the commands from the pipelines file with open("buildlib/azure-pipelines.yml") as F: azp = yaml.safe_load(F); for bst in azp["stages"]: if bst["stage"] == "Build": break; else: raise ValueError("No Build stage found"); for job in bst["jobs"]: if job["job"] == "Compile": break; else: raise ValueError("No Compile job found"); script = ["#!/bin/bash"] workdir = "/__w/1" srcdir = os.path.join(workdir,"s"); for I in job["steps"]: script.append("echo ==================================="); script.append("echo %s"%(I["displayName"])); script.append("cd %s"%(srcdir)); if "bash" in I: script.append(I["bash"]); elif I.get("task") == "PythonScript@0": script.append("set -e"); if "workingDirectory" in I["inputs"]: script.append("cd %s"%(os.path.join(srcdir,I["inputs"]["workingDirectory"]))); script.append("%s %s %s"%(I["inputs"]["pythonInterpreter"], os.path.join(srcdir,I["inputs"]["scriptPath"]), I["inputs"].get("arguments",""))); else: raise ValueError("Unknown stanza %r"%(I)); with private_tmp(args) as tmpdir: os.mkdir(os.path.join(tmpdir,"s")); os.mkdir(os.path.join(tmpdir,"tmp")); opwd = os.getcwd(); with inDirectory(os.path.join(tmpdir,"s")): subprocess.check_call(["git", "--git-dir",os.path.join(opwd,".git"), "reset","--hard","HEAD"]); subprocess.check_call(["git", "--git-dir",os.path.join(opwd,".git"), "fetch", "--no-tags", "https://github.com/linux-rdma/rdma-core.git","HEAD", "master"]); base = subprocess.check_output(["git", "--git-dir",os.path.join(opwd,".git"), "merge-base", "HEAD","FETCH_HEAD"]).decode().strip(); opts = [ "run", "--read-only", "--rm=true", "-v","%s:%s"%(tmpdir, workdir), "-w",srcdir, "-u",str(os.getuid()), "-e","SYSTEM_PULLREQUEST_SOURCECOMMITID=HEAD", # azp puts the branch name 'master' here, we need to put a commit ID.. 
"-e","SYSTEM_PULLREQUEST_TARGETBRANCH=%s"%(base), "-e","HOME=%s"%(workdir), "-e","TMPDIR=%s"%(os.path.join(workdir,"tmp")), ] + map_git_args(opwd,srcdir); if args.run_shell: opts.append("-ti"); opts.append(env.image_name()); with open(os.path.join(tmpdir,"go.sh"),"w") as F: F.write("\n".join(script)) if args.run_shell: opts.append("/bin/bash"); else: opts.extend(["/bin/bash",os.path.join(workdir,"go.sh")]); try: docker_cmd(args,*opts); except subprocess.CalledProcessError as e: copy_abi_files(os.path.join(tmpdir, "s/ABI")); raise; copy_abi_files(os.path.join(tmpdir, "s/ABI")); def args_pkg(parser): parser.add_argument("ENV",action=ToEnvActionPkg,choices=env_choices_pkg()); parser.add_argument("--run-shell",default=False,action="store_true", help="Instead of running the build, enter a shell"); parser.add_argument("--use-prebuilt-pandoc",default=False,action="store_true", help="Do not rebuild the pandoc cache in build-azure_pipelines/pandoc-prebuilt/"); parser.add_argument("--with", default=[],action="append", dest="with_flags", help="Enable specified feature in RPM builds"); parser.add_argument("--without", default=[],action="append", dest="without_flags", help="Disable specified feature in RPM builds"); def cmd_pkg(args): """Build a package in the given environment.""" for env in args.ENV: if env.name == "azure_pipelines": run_azp_build(args,env); elif getattr(env,"is_deb",False): run_deb_build(args,env); elif getattr(env,"is_rpm",False): run_rpm_build(args, getattr(env,"specfile","%s.spec"%(project)), env); else: print("%s does not support packaging"%(env.name)); # ------------------------------------------------------------------------- def args_make(parser): parser.add_argument("--run-shell",default=False,action="store_true", help="Instead of running the build, enter a shell"); parser.add_argument("ENV",action=ToEnvAction,choices=env_choices()); parser.add_argument('ARGS', nargs=argparse.REMAINDER); def cmd_make(args): """Run cmake and ninja within a docker container. If cmake has not yet been run then this runs it with the given environment variables, then invokes ninja. 
    Otherwise ninja is invoked without calling cmake."""
    SRC = os.getcwd();

    for env in args.ENV:
        BUILD = "build-%s"%(env.name)
        if not os.path.exists(BUILD):
            os.mkdir(BUILD);

        home = os.path.join(os.path.sep,"home",os.getenv("LOGNAME"));
        dirs = [os.getcwd(),"/tmp"];

        # Import the symlink target too if BUILD is a symlink
        BUILD_r = os.path.realpath(BUILD);
        if not BUILD_r.startswith(os.path.realpath(SRC)):
            dirs.append(BUILD_r);

        cmake_args = []
        if not env.build_pyverbs:
            cmake_args.extend(["-DNO_PYVERBS=1"]);

        cmake_envs = []
        ninja_args = []
        for I in args.ARGS:
            if I.startswith("-D"):
                cmake_args.append(I);
            elif I.find('=') != -1:
                cmake_envs.append(I);
            else:
                ninja_args.append(I);

        if env.use_make:
            need_cmake = not os.path.exists(os.path.join(BUILD_r,"Makefile"));
        else:
            need_cmake = not os.path.exists(os.path.join(BUILD_r,"build.ninja"));

        opts = ["run",
                "--read-only",
                "--rm=true",
                "-ti",
                "-u",str(os.getuid()),
                "-e","HOME=%s"%(home),
                "-w",BUILD_r,
        ];
        opts.extend(env.docker_opts)
        for I in dirs:
            opts.append("-v");
            opts.append("%s:%s"%(I,I));
        for I in cmake_envs:
            opts.append("-e");
            opts.append(I);
        if args.run_shell:
            opts.append("-ti");
        opts.append(env.image_name());

        if args.run_shell:
            os.execlp("sudo","sudo","docker",*(opts + ["/bin/bash"]));

        if need_cmake:
            if env.use_make:
                prog_args = ["cmake",SRC] + cmake_args;
            else:
                prog_args = ["cmake","-GNinja",SRC] + cmake_args;
            docker_cmd(args,*(opts + prog_args));

        if env.use_make:
            prog_args = ["make","-C",BUILD_r] + ninja_args;
        else:
            prog_args = [getattr(env,"ninja_cmd","ninja"), "-C",BUILD_r] + ninja_args;

        if len(args.ENV) <= 1:
            os.execlp("sudo","sudo","docker",*(opts + prog_args));
        else:
            docker_cmd(args,*(opts + prog_args));

# -------------------------------------------------------------------------

def get_build_args(args,env):
    """Return extra docker arguments for building. Currently this is just the
    system APT proxy, if one is configured."""
    res = [];
    if args.pull:
        res.append("--pull");

    if env.proxy and os.path.exists("/etc/apt/apt.conf.d/01proxy"):
        # The line in this file must be 'Acquire::http { Proxy "http://xxxx:3142"; };'
        with open("/etc/apt/apt.conf.d/01proxy") as F:
            proxy = F.read().strip().split('"')[1];
        res.append("--build-arg");
        res.append('http_proxy=%s'%(proxy));
    return res;

def args_build_images(parser):
    parser.add_argument("ENV",nargs="+",action=ToEnvAction,choices=env_choices());
    parser.add_argument("--no-pull",default=True,action="store_false",
                        dest="pull",
                        help="Do not pull the latest base images from the registry before building");

def cmd_build_images(args):
    """Run from the top level source directory to make the docker images that
    are needed for building. This only needs to be run once."""
    # Docker copies the permissions from the local host and we need this umask
    # to be 022 or the container breaks
    os.umask(0o22)
    for env in args.ENV:
        with private_tmp(args) as tmpdir:
            df = env.get_docker_file(tmpdir);
            fn = os.path.join(tmpdir,"Dockerfile");
            with open(fn,"wt") as F:
                for ln in df.lines:
                    print(ln, file=F);
            opts = (["build"] +
                    get_build_args(args,env) +
                    env.docker_opts +
                    ["-f",fn,
                     "-t",env.image_name(),
                     tmpdir]);
            print(opts)
            docker_cmd(args,*opts);

# -------------------------------------------------------------------------

def args_push_azp_images(args):
    pass
def cmd_push_azp_images(args):
    """Push the images required for Azure Pipelines to the container registry.
Must have done 'az login' first""" subprocess.check_call(["sudo","az","acr","login","--name","ucfconsort"]); with private_tmp(args) as tmpdir: nfn = os.path.join(tmpdir,"build.ninja"); with open(nfn,"w") as F: F.write("""rule push command = docker push $img description=Push $img\n"""); for env in environments: name = env.image_name() if "ucfconsort.azurecr.io" not in name: continue F.write("build push_%s : push\n img = %s\n"%(env.name,env.image_name())); F.write("default push_%s\n"%(env.name)); subprocess.check_call(["sudo","ninja"],cwd=tmpdir); # ------------------------------------------------------------------------- def args_make_dist_tar(parser): parser.add_argument("BUILD",help="Path to the build directory") parser.add_argument("--tarfn",help="Output TAR filename") parser.add_argument("--tag",help="git tag to sanity check against") def cmd_make_dist_tar(args): """Make the standard distribution tar. The BUILD argument must point to a build output directory that has pandoc-prebuilt""" ver = get_version(); if not args.tarfn: args.tarfn = "%s-%s.tar.gz"%(project,ver) # The tag name and the cmake file must match. if args.tag: assert args.tag == "v" + ver; os.umask(0o22) with private_tmp(args) as tmpdir: tmp_tarfn = os.path.join(tmpdir,"tmp.tar"); prefix = "%s-%s/"%(project,get_version()); subprocess.check_call(["git","archive", "--prefix",prefix, "--output",tmp_tarfn, "HEAD"]); # Mangle the paths and append the prebuilt stuff to the tar file if args.BUILD: subprocess.check_call([ "tar", "-C",os.path.join(args.script_pwd,args.BUILD,"pandoc-prebuilt"), "-rf",tmp_tarfn, "./", "--xform",r"s|^\.|%sbuildlib/pandoc-prebuilt|g"%(prefix)]); assert args.tarfn.endswith(".gz") or args.tarfn.endswith(".tgz"); with open(os.path.join(args.script_pwd,args.tarfn),"w") as F: subprocess.check_call(["gzip","-9c",tmp_tarfn],stdout=F); # ------------------------------------------------------------------------- if __name__ == '__main__': parser = argparse.ArgumentParser(description='Operate docker for building this package') subparsers = parser.add_subparsers(title="Sub Commands",dest="command"); subparsers.required = True; funcs = globals(); for k,v in list(funcs.items()): if k.startswith("cmd_") and inspect.isfunction(v): sparser = subparsers.add_parser(k[4:].replace('_','-'), help=v.__doc__); sparser.required = True; funcs["args_" + k[4:]](sparser); sparser.set_defaults(func=v); try: import argcomplete; argcomplete.autocomplete(parser); except ImportError: pass; args = parser.parse_args(); args.sudo = True; # This script must always run from the top of the git tree, and a git # checkout is mandatory. git_top = subprocess.check_output(["git","rev-parse","--show-toplevel"]).strip(); args.script_pwd = os.getcwd(); os.chdir(git_top); args.func(args); rdma-core-56.1/buildlib/check-build000077500000000000000000000445721477342711600172340ustar00rootroot00000000000000#!/usr/bin/env python3 # Copyright 2017 Obsidian Research Corp. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. 
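# Typical invocation, mirroring the CI wiring in buildlib/azure-pipelines.yml
# above (run from within a build directory; a sketch, not a complete CLI
# reference):
#   cd build-gcc12
#   /usr/bin/python3 ../buildlib/check-build --src .. --cc gcc-12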
"""check-build - Run static checks on a build""" from __future__ import print_function import argparse import inspect import os import re import shutil import subprocess import tempfile import sys import copy import shlex import pipes from contextlib import contextmanager; from pkg_resources import parse_version def get_src_dir(): """Get the source directory using git""" git_top = subprocess.check_output(["git","rev-parse","--git-dir"]).decode().strip(); if git_top == ".git": return "."; return os.path.dirname(git_top); def get_package_version(args): """Return PACKAGE_VERSION from CMake""" with open(os.path.join(args.SRC,"CMakeLists.txt")) as F: for ln in F: g = re.match(r'^set\(PACKAGE_VERSION "(.+)"\)',ln) if g is None: continue; return g.group(1); raise RuntimeError("Could not find version"); @contextmanager def inDirectory(dir): cdir = os.getcwd(); try: os.chdir(dir); yield True; finally: os.chdir(cdir); @contextmanager def private_tmp(): """Simple version of Python 3's tempfile.TemporaryDirectory""" dfn = tempfile.mkdtemp(); try: yield dfn; finally: shutil.rmtree(dfn); # ------------------------------------------------------------------------- def get_symbol_vers(fn,exported=True): """Return the symbol version suffixes from the ELF file, eg IB_VERBS_1.0, etc""" syms = subprocess.check_output(["readelf","--wide","-s",fn]).decode(); go = False; res = set(); for I in syms.splitlines(): if I.startswith("Symbol table '.dynsym'"): go = True; continue; if I.startswith(" ") and go: itms = I.split(); if exported: if (len(itms) == 8 and itms[3] == "OBJECT" and itms[4] == "GLOBAL" and itms[6] == "ABS"): res.add(itms[7]); else: if (len(itms) >= 8 and itms[3] == "FUNC" and itms[4] == "GLOBAL" and itms[6] == "UND"): res.add(itms[7]); else: go = False; if not res: raise ValueError("Failed to read ELF symbol versions from %r"%(fn)); return res; def symver_parse_version(ver): return parse_version(ver.partition("_")[-1]) def check_lib_symver(args,fn): g = re.match(r"lib([^.]+)\.so\.(\d+)\.(\d+)\.(.*)",fn); if g.group(4) != args.PACKAGE_VERSION: raise ValueError("Shared Library filename %r does not have the package version %r (%r)"%( fn,args.PACKAGE_VERSION,g.groups())); # umad/etc used the wrong symbol version name when they moved to soname 3.0 if g.group(1) == "ibumad": newest_symver = "%s_%s.%s"%(g.group(1).upper(),'1',g.group(3)); elif g.group(1) == "ibmad": newest_symver = "%s_%s.%s"%(g.group(1).upper(),'1',g.group(3)); elif g.group(1) == "ibnetdisc": newest_symver = "%s_%s.%s"%(g.group(1).upper(),'1',g.group(3)); else: newest_symver = "%s_%s.%s"%(g.group(1).upper(),g.group(2),g.group(3)); syms = get_symbol_vers(fn); if newest_symver not in syms: raise ValueError("Symbol version %r implied by filename %r not in ELF (%r)"%( newest_symver,fn,syms)); # The private symbol tag should also be older than the package version private = set(I for I in syms if "PRIVATE" in I) if len(private) > 1: raise ValueError("Too many private symbol versions in ELF %r (%r)"%(fn,private)); if private: private_rel = list(private)[0].split('_')[-1]; if private_rel > args.PACKAGE_VERSION: raise ValueError("Private Symbol Version %r is newer than the package version %r"%( private,args.PACKAGE_VERSION)); syms = list(syms - private); syms.sort(key=symver_parse_version) if newest_symver != syms[-1]: raise ValueError("Symbol version %r implied by filename %r not the newest in ELF (%r)"%( newest_symver,fn,syms)); def test_lib_names(args): """Check that the library filename matches the symbol versions""" libd = 
os.path.join(args.BUILD,"lib"); # List of shlibs that follow the ABI guidelines libs = {}; with inDirectory(libd): for fn in os.listdir("."): if os.path.islink(fn): lfn = os.readlink(fn); if not os.path.islink(lfn): check_lib_symver(args,lfn); # ------------------------------------------------------------------------- def check_verbs_abi(args,fn): g = re.match(r"lib([^-]+)-rdmav(\d+).so",fn); if g is None: raise ValueError("Provider library has unknown file name format %r"%(fn)); private_ver = int(g.group(2)); syms = get_symbol_vers(fn,exported=False); syms = {I.partition("@")[2] for I in syms}; assert "IBVERBS_PRIVATE_%u"%(private_ver) in syms; assert len([I for I in syms if I.startswith("IBVERBS_PRIVATE")]) == 1; def test_verbs_private(args): """Check that the IBVERBS_PRIVATE symbols match the library name, eg that the map file and the cmake stuff are in sync.""" libd = os.path.join(args.BUILD,"lib"); with inDirectory(libd): for fn in os.listdir("."): if not os.path.islink(fn) and "rdmav" in fn and fn.endswith(".so"): check_verbs_abi(args,fn); # ------------------------------------------------------------------------- def check_abi(args,fn): g1 = re.match(r"lib([^.]+).so\.(.+)\.(.+)",fn); g2 = re.match(r"lib([^.]+).so\.(.+\..+)",fn); if g1 is None or g2 is None: raise ValueError("Library has unknown file name format %r"%(fn)); ref_fn = os.path.join(args.SRC,"ABI",g1.group(1) + ".dump"); cur_fn = os.path.join(args.SRC,"ABI","current-" + g1.group(1) + ".dump"); subprocess.check_call(["abi-dumper", "-lver",g2.group(1), fn, "-o",cur_fn]); if not os.path.exists(ref_fn): print("ABI file does not exist for %r"%(ref_fn), file=sys.stderr); return False; subprocess.check_call(["abi-compliance-checker", "-l",g1.group(1), "-old",ref_fn, "-new",cur_fn]); return True; def test_verbs_uapi(args): """Compare the ABI output from 'abi-dumper' between what is present in git and what was built in this tree. 
This allows us to detect changes in ABI on the -stable branch.""" # User must provide the ABI dir in the source tree if not os.path.isdir(os.path.join(args.SRC,"ABI")): print("ABI check skipped, no ABI/ directory."); return; libd = os.path.join(args.BUILD,"lib"); success = True; with inDirectory(libd): for fn in os.listdir("."): if not os.path.islink(fn) and re.match(r"lib.+\.so\..+\..+",fn): success = success & check_abi(args,fn); assert success == True; # ------------------------------------------------------------------------- def is_obsolete(fn): """True if the header is obsolete and should not be compiled anyhow.""" with open(fn) as F: for ln in F.readlines(): if re.search(r"#warning.*This header is obsolete",ln): return True; return False; def is_fixup(fn): """True if this is a fixup header; fixup headers are exempted because their required includes are not the same as for kernel headers (eg netinet/in.h)""" if os.path.islink(fn): return "buildlib/fixup-include/" in os.readlink(fn); return False; def get_headers(incdir): includes = set(); for root,dirs,files in os.walk(incdir): for I in files: if I.endswith(".h"): includes.add(os.path.join(root,I)); return includes; def compile_test_headers(tmpd,incdir,includes,with_cxx=False): cppflags = subprocess.check_output(["pkg-config","libnl-3.0","--cflags-only-I"]).decode().strip(); cppflags = "-I %s %s"%(incdir,cppflags) with open(os.path.join(tmpd,"build.ninja"),"wt") as F: print("rule comp", file=F); print(" command = %s -Werror -c %s $in -o $out"%(args.CC,cppflags), file=F); print(" description=Header check for $in", file=F); print("rule comp_cxx", file=F); print(" command = %s -Werror -c %s $in -o $out"%(args.CXX,cppflags), file=F); print(" description=Header C++ check for $in", file=F); count = 0; for I in sorted(includes): if is_obsolete(I) or is_fixup(I): continue; print("build %s : comp %s"%("out%d.o"%(count),I), file=F); print("default %s"%("out%d.o"%(count)), file=F); print("build %s : comp_cxx %s"%("outxx%d.o"%(count),I), file=F); if with_cxx: print("default %s"%("outxx%d.o"%(count)), file=F); count = count + 1; subprocess.check_call(["ninja"],cwd=tmpd); def test_published_headers(args): """Test that every header file can be included on its own, and has no obvious implicit dependencies.
This is intended as a first pass check of the public installed API headers""" incdir = os.path.abspath(os.path.join(args.BUILD,"include")); includes = get_headers(incdir); # Make a little ninja file to compile each header with private_tmp() as tmpd: compile_test_headers(tmpd,incdir,includes); # ------------------------------------------------------------------------- allowed_uapi_headers = { # This header is installed in all supported distributions "rdma/ib_user_sa.h", "rdma/ib_user_verbs.h", "linux/stddef.h", } non_cxx_headers = { "infiniband/arch.h", "infiniband/ib.h", "infiniband/ib_user_ioctl_verbs.h", "infiniband/ibnetdisc_osd.h", "infiniband/mad_osd.h", "infiniband/mlx5_api.h", "infiniband/mlx5_user_ioctl_verbs.h", "infiniband/opcode.h", "infiniband/sa-kern-abi.h", "infiniband/sa.h", "infiniband/verbs_api.h", "rdma/rdma_cma_abi.h", } def test_installed_headers(args): """This test also checks that the public headers can be compiled on their own, but goes further and confirms that the public headers do not depend on any internal headers, or kernel kAPI headers.""" with private_tmp() as tmpd: env = copy.deepcopy(os.environ); env["DESTDIR"] = tmpd; subprocess.check_output(["ninja","install"],env=env,cwd=args.BUILD); includes = get_headers(tmpd); incdir = os.path.commonprefix(list(includes)); rincludes = {I[len(incdir):] for I in includes}; bincdir = os.path.abspath(os.path.join(args.BUILD,"include")); all_includes = set(); for I in get_headers(bincdir): if not is_fixup(I) and not is_obsolete(I): all_includes.add(I[len(bincdir)+1:]); # Drop error includes for any include file that is internal, this way # when we compile the public headers any include of an internal header # will fail. for I in sorted(all_includes - rincludes): if I in allowed_uapi_headers: continue; I = os.path.join(incdir,I) dfn = os.path.dirname(I); if not os.path.isdir(dfn): os.makedirs(dfn); assert not os.path.exists(I); with open(I,"w") as F: print('#error "Private internal header"', file=F); # Roughly check that the headers have the extern "C" for C++ # compilation. 
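The header-poisoning step above is the heart of this test: every header that exists in the build tree but was not installed is overwritten with a stub, so a public header that includes a private one fails to compile. A minimal standalone sketch of the idea (the directory layout and header name are hypothetical):

    import os

    def poison_private_headers(incdir, private):
        """Overwrite each non-installed header with an #error stub."""
        for rel in private:  # e.g. {"util/internal.h"}, a made-up name
            fn = os.path.join(incdir, rel)
            dfn = os.path.dirname(fn)
            if dfn and not os.path.isdir(dfn):
                os.makedirs(dfn)
            with open(fn, "w") as F:
                print('#error "Private internal header"', file=F)

The loop that follows completes the test by checking the installed headers for the extern "C" guards that C++ consumers need.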
for I in sorted(rincludes - non_cxx_headers): with open(os.path.join(incdir,I)) as F: if 'extern "C" {' not in F.read(): raise ValueError("No extern C in %r"%(I)); compile_test_headers(tmpd,incdir,includes,with_cxx=True); # ------------------------------------------------------------------------- def get_symbol_names(fn): """Return the defined, public, symbols from an ELF shlib""" syms = subprocess.check_output(["readelf", "--wide", "-s", fn]).decode() go = False res = set() for I in syms.splitlines(): if I.startswith("Symbol table '.dynsym'"): go = True continue if I.startswith(" ") and go: g = re.match( r"\s+\d+:\s+[0-9a-f]+\s+\d+.*(?:FUNC|OBJECT)\s+GLOBAL\s+DEFAULT\s+\d+\s+(\S+)@@(\S+)$", I) if not g or "PRIVATE" in g.group(2): continue res.add(g.group(1)) else: go = False return res def get_cc_args_from_pkgconfig(args, name, static): """Get the compile arguments from pkg-config for the named library""" os.environ["PKG_CONFIG_PATH"] = os.path.join(args.BUILD, "lib", "pkgconfig") flags = ["pkg-config", "--errors-to-stdout", "--cflags", "--libs"] if static: flags.append("--static") opts = subprocess.check_output(flags + ["lib" + name]).decode() opts = shlex.split(opts) opts.insert(0, "-Wall") opts.insert(0, "-Werror") opts.insert(0, "-L%s" % (os.path.join(args.BUILD, "lib"))) opts.insert(1, "-I%s" % (os.path.join(args.BUILD, "include"))) if not static: return opts # Only static link the pkg-config stuff, otherwise we get warnings about # static linking portions of glibc that need NSS. opts.insert(0, "-Wl,-Bstatic") opts.append("-Wl,-Bdynamic") # We need this extra libpthread/m because libnl's pkgconfig file is # broken and doesn't include the private libraries it requires. :( if "-lnl-3" in opts: opts.append("-lm") opts.append("-lpthread") # Put glibc associated libraries outside the static link section. if "-lpthread" in opts: while "-lpthread" in opts: opts.remove("-lpthread") opts.append("-lpthread") if "-lm" in opts: while "-lm" in opts: opts.remove("-lm") opts.append("-lm") return opts def compile_ninja(args, Fninja, name, cfn, opts): print(""" rule comp_{name} command = {CC} -Wall -o $out $in {opts} description = Compile and link $out build {name} : comp_{name} {cfn} default {name}""".format( name=name, CC=args.CC, cfn=cfn, opts=" ".join(pipes.quote(I) for I in opts)), file=Fninja) def get_providers(args): """Return a list of provider names""" return set( I for I in os.listdir(os.path.join(args.SRC, "providers")) if not I.startswith(".")) def check_static_lib(args, tmpd, Fninja, static_lib, shared_lib, name): syms = get_symbol_names(shared_lib) if not syms: return cfn = os.path.join(tmpd, "%s-test.c" % (name)) with open(cfn, "wt") as F: F.write("#include <stdio.h>\n") for I in syms: F.write("extern void %s(void);\n" % (I)) F.write("int main(int argc,const char *argv[]) {\n") for I in syms: F.write('printf("%%p",&%s);\n' % (I)) F.write("return 0; }\n") compile_ninja(args, Fninja, "%s-static-out" % (name), cfn, get_cc_args_from_pkgconfig(args, name, static=True)) compile_ninja(args, Fninja, "%s-shared-out" % (name), cfn, get_cc_args_from_pkgconfig(args, name, static=False)) def check_static_providers(args, tmpd, Fninja): """Test that expected values for RDMA_STATIC_PROVIDERS are accepted and the link works""" cfn = os.path.join(tmpd, "provider-test.c") with open(cfn, "wt") as F: F.write("#include <infiniband/verbs.h>\n") F.write("int main(int argc,const char *argv[]) {\n") F.write('ibv_get_device_list(NULL);\n') F.write("return 0; }\n") opts = get_cc_args_from_pkgconfig( args, "ibverbs", static=True) providers
= get_providers(args) for I in sorted(providers | { "none", "all", }): compile_ninja(args, Fninja, "providers-%s-static-out" % (I), cfn, ["-DRDMA_STATIC_PROVIDERS=%s" % (I)] + opts) compile_ninja( args, Fninja, "providers-static-out", cfn, ["-DRDMA_STATIC_PROVIDERS=%s" % (",".join(providers))] + opts) def test_static_libs(args): """Compile then link statically and dynamically a dummy program that touches every symbol in the libraries using pkgconfig output to guide the link options. This tests that pkgconfig is set up properly and that all the magic with incorporating the internal libraries for static linking has done its job.""" libd = os.path.join(args.BUILD, "lib") success = True libs = [] with inDirectory(libd): fns = set(fn for fn in os.listdir(".") if not os.path.islink(fn)) for static_lib in fns: g = re.match(r"lib(.+)\.a$", static_lib) if g: for shared_lib in fns: if re.match(r"lib%s.*\.so" % (g.group(1)), shared_lib): libs.append((os.path.join(libd, static_lib), os.path.join(libd, shared_lib), g.group(1))) break else: raise ValueError( "Failed to find matching shared library for %r" % (static_lib)) with private_tmp() as tmpd: with open(os.path.join(tmpd, "build.ninja"), "wt") as Fninja: for I in libs: check_static_lib(args, tmpd, Fninja, I[0], I[1], I[2]) check_static_providers(args, tmpd, Fninja) subprocess.check_call(["ninja"], cwd=tmpd) # ------------------------------------------------------------------------- parser = argparse.ArgumentParser(description='Run build time tests') parser.add_argument("--build",default=os.getcwd(),dest="BUILD", help="Build directory to inspect"); parser.add_argument("--src",default=None,dest="SRC", help="Top of the source tree"); parser.add_argument("--cc",default="cc",dest="CC", help="C compiler to use"); parser.add_argument("--cxx",default="c++",dest="CXX", help="C++ compiler to use"); args = parser.parse_args(); if args.SRC is None: args.SRC = get_src_dir(); args.SRC = os.path.abspath(args.SRC); args.PACKAGE_VERSION = get_package_version(args); funcs = globals(); for k,v in list(funcs.items()): if k.startswith("test_") and inspect.isfunction(v): v(args); rdma-core-56.1/buildlib/config.h.in000066400000000000000000000051221477342711600171430ustar00rootroot00000000000000#ifndef CONFIG_H_IN #define CONFIG_H_IN #define HAVE_STATEMENT_EXPR 1 #define HAVE_BUILTIN_TYPES_COMPATIBLE_P 1 #define HAVE_TYPEOF 1 #define HAVE_ISBLANK 1 #define HAVE_BUILTIN_CLZ 1 #define HAVE_BUILTIN_CLZL 1 #define PACKAGE_VERSION "@PACKAGE_VERSION@" // FIXME: Remove this; the cmake version hard-requires new style CLOEXEC support #define STREAM_CLOEXEC "e" #define RDMA_CDEV_DIR "/dev/infiniband" #define IBV_CONFIG_DIR "@CONFIG_DIR@" #define RS_CONF_DIR "@CMAKE_INSTALL_FULL_SYSCONFDIR@/rdma/rsocket" #define IWPM_CONFIG_FILE "@CMAKE_INSTALL_FULL_SYSCONFDIR@/iwpmd.conf" #define SRP_DAEMON_CONFIG_FILE "@CMAKE_INSTALL_FULL_SYSCONFDIR@/srp_daemon.conf" #define SRP_DAEMON_LOCK_PREFIX "@CMAKE_INSTALL_FULL_RUNDIR@/srp_daemon" #define ACM_CONF_DIR "@CMAKE_INSTALL_FULL_SYSCONFDIR@/rdma" #define IBACM_LIB_PATH "@ACM_PROVIDER_DIR@" #define IBACM_BIN_PATH "@CMAKE_INSTALL_FULL_BINDIR@" #define IBACM_PID_FILE "@CMAKE_INSTALL_FULL_RUNDIR@/ibacm.pid" #define IBACM_PORT_BASE "ibacm-tcp.port" #define IBACM_IBACME_PORT_FILE "@CMAKE_INSTALL_FULL_RUNDIR@/" IBACM_PORT_BASE #define IBACM_PORT_FILE "@CMAKE_INSTALL_FULL_RUNDIR@/ibacm.port" #define IBACM_LOG_FILE "@CMAKE_INSTALL_FULL_LOCALSTATEDIR@/log/ibacm.log" #define IBACM_SERVER_BASE "ibacm-unix.sock" #define IBACM_IBACME_SERVER_PATH
"@CMAKE_INSTALL_FULL_RUNDIR@/" IBACM_SERVER_BASE #define IBACM_SERVER_PATH "@CMAKE_INSTALL_FULL_RUNDIR@/ibacm.sock" #define IBDIAG_CONFIG_PATH "@IBDIAG_CONFIG_PATH@" #define IBDIAG_NODENAME_MAP_PATH "@IBDIAG_NODENAME_MAP_PATH@" #define VERBS_PROVIDER_DIR "@VERBS_PROVIDER_DIR@" #define VERBS_PROVIDER_SUFFIX "@IBVERBS_PROVIDER_SUFFIX@" #define IBVERBS_PABI_VERSION @IBVERBS_PABI_VERSION@ // FIXME This has been supported in compilers forever, we should just fail to build on such old systems. #cmakedefine HAVE_FUNC_ATTRIBUTE_ALWAYS_INLINE 1 #cmakedefine HAVE_FUNC_ATTRIBUTE_IFUNC 1 #cmakedefine HAVE_FUNC_ATTRIBUTE_SYMVER 1 #cmakedefine HAVE_WORKING_IF_H 1 // Operating mode for symbol versions #cmakedefine HAVE_FULL_SYMBOL_VERSIONS 1 #cmakedefine HAVE_LIMITED_SYMBOL_VERSIONS 1 @SIZEOF_LONG_CODE@ #if @IOCTL_MODE_NUM@ == 1 # define VERBS_IOCTL_ONLY 1 # define VERBS_WRITE_ONLY 0 #elif @IOCTL_MODE_NUM@ == 2 # define VERBS_IOCTL_ONLY 0 # define VERBS_WRITE_ONLY 1 #elif @IOCTL_MODE_NUM@ == 3 # define VERBS_IOCTL_ONLY 0 # define VERBS_WRITE_ONLY 0 #endif // Configuration defaults #define IBACM_SERVER_MODE_UNIX 0 #define IBACM_SERVER_MODE_LOOP 1 #define IBACM_SERVER_MODE_OPEN 2 #define IBACM_SERVER_MODE_DEFAULT @IBACM_SERVER_MODE_DEFAULT@ #define IBACM_ACME_PLUS_KERNEL_ONLY_DEFAULT @IBACM_ACME_PLUS_KERNEL_ONLY_DEFAULT@ #endif rdma-core-56.1/buildlib/const_structs.checkpatch000066400000000000000000000000001477342711600220420ustar00rootroot00000000000000rdma-core-56.1/buildlib/fixup-include/000077500000000000000000000000001477342711600176745ustar00rootroot00000000000000rdma-core-56.1/buildlib/fixup-include/assert.h000066400000000000000000000003331477342711600213450ustar00rootroot00000000000000#ifndef _FIXUP_ASSERT_H #define _FIXUP_ASSERT_H #include_next /* Without C11 compiler support it is not possible to implement static_assert */ #undef static_assert #define static_assert(_cond, msg) #endif rdma-core-56.1/buildlib/fixup-include/linux-in.h000066400000000000000000000001141477342711600216040ustar00rootroot00000000000000/* if in.h can't be included just leave it empty */ #include rdma-core-56.1/buildlib/fixup-include/linux-in6.h000066400000000000000000000001151477342711600216730ustar00rootroot00000000000000/* if in6.h can't be included just leave it empty */ #include rdma-core-56.1/buildlib/fixup-include/netlink-attr.h000066400000000000000000000112111477342711600224550ustar00rootroot00000000000000#ifndef _FIXUP_NETLINK_ATTR_H #define _FIXUP_NETLINK_ATTR_H #include #include #include #include struct nlmsghdr; struct nl_msg; struct nl_sock; struct nlattr; struct nl_cb; struct sockaddr_nl; struct nlmsgerr; struct nl_addr; struct nl_cache; struct nl_object; typedef int (*nl_recvmsg_msg_cb_t)(struct nl_msg *msg, void *arg); typedef int (*nl_recvmsg_err_cb_t)(struct sockaddr_nl *nla, struct nlmsgerr *nlerr, void *arg); struct nla_policy { int type; }; enum { NLA_U8, NLA_U32, NLA_U64, NL_AUTO_PORT, NL_AUTO_SEQ, NL_STOP, NL_OK, NL_CB_DEFAULT, NL_CB_VALID, NL_CB_CUSTOM, NLE_PARSE_ERR, NLE_NOMEM, }; static inline struct nl_sock *nl_socket_alloc(void) { return NULL; } static inline int nl_connect(struct nl_sock *sk, int kind) { return -1; } static inline void nl_socket_free(struct nl_sock *sk) { } static inline void nl_socket_disable_auto_ack(struct nl_sock *sk) { } static inline void nl_socket_disable_msg_peek(struct nl_sock *sk) { } static inline void nl_socket_disable_seq_check(struct nl_sock *sk) { } static inline int nl_socket_get_fd(struct nl_sock *sk) { return -1; } static inline int 
nl_socket_add_membership(struct nl_sock *sk, int group) { return -1; } static inline struct nlmsghdr *nlmsg_put(struct nl_msg *msg, uint32_t pid, uint32_t seq, int type, int payload, int flags) { return NULL; } static inline struct nl_msg *nlmsg_alloc(void) { return NULL; } static inline struct nl_msg *nlmsg_alloc_simple(int nlmsgtype, int flags) { return NULL; } static inline void nlmsg_free(struct nl_msg *msg) { } static inline int nl_send_auto(struct nl_sock *sk, struct nl_msg *msg) { return -1; } static inline struct nlmsghdr *nlmsg_hdr(struct nl_msg *msg) { return NULL; } static inline int nlmsg_parse(struct nlmsghdr *nlh, int hdrlen, struct nlattr *tb[], int maxtype, struct nla_policy *policy) { return -1; } static inline int nl_msg_parse(struct nl_msg *msg, void (*cb)(struct nl_object *, void *), void *arg) { return -1; } static inline int nlmsg_append(struct nl_msg *n, void *data, size_t len, int pad) { return -1; } static inline int nl_send_simple(struct nl_sock *sk, int type, int flags, void *buf, size_t size) { return -1; } static inline int nl_recvmsgs(struct nl_sock *sk, struct nl_cb *cb) { return -1; } static inline int nl_recvmsgs_default(struct nl_sock *sk) { return -1; } static inline struct nl_cb *nl_cb_alloc(int kind) { return NULL; } static inline int nl_cb_set(struct nl_cb *cb, int type, int kind, nl_recvmsg_msg_cb_t func, void *arg) { return -1; } static inline int nl_socket_modify_err_cb(struct nl_sock *sk, int kind, nl_recvmsg_err_cb_t func, void *arg) { return -1; } static inline int nl_socket_modify_cb(struct nl_sock *sk, int type, int kind, nl_recvmsg_msg_cb_t func, void *arg) { return -1; } #define NLA_PUT_U32(msg, attrtype, value) ({ goto nla_put_failure; }) #define NLA_PUT_STRING(msg, attrtype, value) ({ goto nla_put_failure; }) #define NLA_PUT_ADDR(msg, attrtype, value) ({ goto nla_put_failure; }) static inline const char *nla_get_string(struct nlattr *tb) { return NULL; } static inline uint8_t nla_get_u8(struct nlattr *tb) { return 0; } static inline uint32_t nla_get_u32(struct nlattr *tb) { return 0; } static inline uint64_t nla_get_u64(struct nlattr *tb) { return 0; } static inline struct nl_addr *nl_addr_clone(struct nl_addr *src) { return NULL; } static inline int nl_addr_info(struct nl_addr *addr, struct addrinfo **result) { return -1; } static inline struct nl_addr *nl_addr_build(int family, void *buf, size_t size) { return NULL; } static inline unsigned int nl_addr_get_len(struct nl_addr *addr) { return 0; } static inline void *nl_addr_get_binary_addr(struct nl_addr *addr) { return NULL; } static inline int nl_addr_get_family(struct nl_addr *addr) { return -1; } static inline int nl_addr_get_prefixlen(struct nl_addr *addr) { return -1; } static inline int nl_addr_fill_sockaddr(struct nl_addr *addr, struct sockaddr *sa, socklen_t *salen) { return -1; } static inline void nl_addr_put(struct nl_addr *addr) { } static inline void nl_addr_set_prefixlen(struct nl_addr *addr, int prefixlen) { } static inline void nl_cache_mngt_unprovide(struct nl_cache *cache) { } static inline void nl_cache_free(struct nl_cache *cache) { } static inline int nl_object_match_filter(struct nl_object *obj, struct nl_object *filter) { return -1; } static inline int nl_cache_refill(struct nl_sock *sk, struct nl_cache *cache) { return -1; } static inline void nl_cache_mngt_provide(struct nl_cache *cache) { } #endif 
rdma-core-56.1/buildlib/fixup-include/netlink-msg.h000066400000000000000000000000001477342711600222630ustar00rootroot00000000000000rdma-core-56.1/buildlib/fixup-include/netlink-netlink.h000066400000000000000000000000001477342711600231410ustar00rootroot00000000000000rdma-core-56.1/buildlib/fixup-include/netlink-object-api.h000066400000000000000000000000001477342711600235120ustar00rootroot00000000000000rdma-core-56.1/buildlib/fixup-include/netlink-route-link-vlan.h000066400000000000000000000000001477342711600245240ustar00rootroot00000000000000rdma-core-56.1/buildlib/fixup-include/netlink-route-link.h000066400000000000000000000000001477342711600235660ustar00rootroot00000000000000rdma-core-56.1/buildlib/fixup-include/netlink-route-neighbour.h000066400000000000000000000000001477342711600246130ustar00rootroot00000000000000rdma-core-56.1/buildlib/fixup-include/netlink-route-route.h000066400000000000000000000000001477342711600237670ustar00rootroot00000000000000rdma-core-56.1/buildlib/fixup-include/netlink-route-rtnl.h000066400000000000000000000040471477342711600236270ustar00rootroot00000000000000#ifndef _FIXUP_NETLINK_ROUTE_RTNL_H #define _FIXUP_NETLINK_ROUTE_RTNL_H #include struct rtnl_addr; struct rtnl_neigh; struct rtnl_route; struct rtnl_nexthop; static inline struct rtnl_neigh * rtnl_neigh_get(struct nl_cache *cache, int ifindex, struct nl_addr *dst) { return NULL; } static inline struct rtnl_link *rtnl_link_get(struct nl_cache *cache, int ifindex) { return NULL; } static void rtnl_neigh_put(struct rtnl_neigh *neigh) { } static inline int rtnl_addr_get_family(struct rtnl_addr *addr) { return -1; } static inline struct nl_addr *rtnl_neigh_get_lladdr(struct rtnl_neigh *neigh) { return NULL; } static inline struct rtnl_neigh *rtnl_neigh_alloc(void) { return NULL; } static inline void rtnl_neigh_set_ifindex(struct rtnl_neigh *neigh, int ifindex) { } static inline int rtnl_neigh_set_dst(struct rtnl_neigh *neigh, struct nl_addr *addr) { return -1; } static inline uint8_t rtnl_route_get_type(struct rtnl_route *route) { return 0; } static inline struct nl_addr *rtnl_route_get_pref_src(struct rtnl_route *route) { return NULL; } static inline struct rtnl_nexthop *rtnl_route_nexthop_n(struct rtnl_route *r, int n) { return NULL; } static inline int rtnl_route_nh_get_ifindex(struct rtnl_nexthop *nh) { return -1; } static inline struct nl_addr *rtnl_route_nh_get_gateway(struct rtnl_nexthop *nh) { return NULL; } static inline int rtnl_link_alloc_cache(struct nl_sock *sk, int family, struct nl_cache **result) { return -1; } static inline struct nl_addr *rtnl_link_get_addr(struct rtnl_link *link) { return NULL; } static inline int rtnl_link_vlan_get_id(struct rtnl_link *link) { return -1; } static inline void rtnl_link_put(struct rtnl_link *link) { } static inline int rtnl_link_is_vlan(struct rtnl_link *link) { return -1; } static inline int rtnl_route_alloc_cache(struct nl_sock *sk, int family, int flags, struct nl_cache **result) { return -1; } static inline int rtnl_neigh_alloc_cache(struct nl_sock *sock, struct nl_cache **result) { return -1; } #endif rdma-core-56.1/buildlib/fixup-include/stdatomic.h000066400000000000000000000323751477342711600220460ustar00rootroot00000000000000/* * An implementation of C11 stdatomic.h directly borrowed from FreeBSD * (original copyright follows), with minor modifications for * portability to other systems. Works for recent Clang (that * implement the feature c_atomic) and GCC 4.7+; includes * compatibility for GCC below 4.7 but I wouldn't recommend it. 
* * Caveats and limitations: * - Only the ``_Atomic parentheses'' notation is implemented, while * the ``_Atomic space'' one is not. * - _Atomic types must be typedef'ed, or programs using them will * not type check correctly (incompatible anonymous structure * types). * - Non-scalar _Atomic types would require runtime support for * runtime locking, which, as far as I know, is not currently * available on any system. */ /*- * Copyright (c) 2011 Ed Schouten * David Chisnall * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $FreeBSD: src/include/stdatomic.h,v 1.10.2.2 2012/05/30 19:21:54 theraven Exp $ */ #ifndef _STDATOMIC_H_ #define _STDATOMIC_H_ #include #include #if !defined(__has_feature) #define __has_feature(x) 0 #endif #if !defined(__has_builtin) #define __has_builtin(x) 0 #endif #if !defined(__GNUC_PREREQ__) #if defined(__GNUC__) && defined(__GNUC_MINOR__) #define __GNUC_PREREQ__(maj, min) \ ((__GNUC__ << 16) + __GNUC_MINOR__ >= ((maj) << 16) + (min)) #else #define __GNUC_PREREQ__(maj, min) 0 #endif #endif #if !defined(__CLANG_ATOMICS) && !defined(__GNUC_ATOMICS) #if __has_feature(c_atomic) #define __CLANG_ATOMICS #elif __GNUC_PREREQ__(4, 7) #define __GNUC_ATOMICS #elif !defined(__GNUC__) #error "stdatomic.h does not support your compiler" #endif #endif #if !defined(__CLANG_ATOMICS) #define _Atomic(T) struct { volatile __typeof__(T) __val; } #endif /* * 7.17.2 Initialization. */ #if defined(__CLANG_ATOMICS) #define ATOMIC_VAR_INIT(value) (value) #define atomic_init(obj, value) __c11_atomic_init(obj, value) #else #define ATOMIC_VAR_INIT(value) { .__val = (value) } #define atomic_init(obj, value) do { \ (obj)->__val = (value); \ } while (0) #endif /* * Clang and recent GCC both provide predefined macros for the memory * orderings. If we are using a compiler that doesn't define them, use the * clang values - these will be ignored in the fallback path. */ #ifndef __ATOMIC_RELAXED #define __ATOMIC_RELAXED 0 #endif #ifndef __ATOMIC_CONSUME #define __ATOMIC_CONSUME 1 #endif #ifndef __ATOMIC_ACQUIRE #define __ATOMIC_ACQUIRE 2 #endif #ifndef __ATOMIC_RELEASE #define __ATOMIC_RELEASE 3 #endif #ifndef __ATOMIC_ACQ_REL #define __ATOMIC_ACQ_REL 4 #endif #ifndef __ATOMIC_SEQ_CST #define __ATOMIC_SEQ_CST 5 #endif /* * 7.17.3 Order and consistency. 
* * The memory_order_* constants that denote the barrier behaviour of the * atomic operations. */ enum memory_order { memory_order_relaxed = __ATOMIC_RELAXED, memory_order_consume = __ATOMIC_CONSUME, memory_order_acquire = __ATOMIC_ACQUIRE, memory_order_release = __ATOMIC_RELEASE, memory_order_acq_rel = __ATOMIC_ACQ_REL, memory_order_seq_cst = __ATOMIC_SEQ_CST }; typedef enum memory_order memory_order; /* * 7.17.4 Fences. */ #ifdef __CLANG_ATOMICS #define atomic_thread_fence(order) __c11_atomic_thread_fence(order) #define atomic_signal_fence(order) __c11_atomic_signal_fence(order) #elif defined(__GNUC_ATOMICS) #define atomic_thread_fence(order) __atomic_thread_fence(order) #define atomic_signal_fence(order) __atomic_signal_fence(order) #else #define atomic_thread_fence(order) __sync_synchronize() #define atomic_signal_fence(order) __asm volatile ("" : : : "memory") #endif /* * 7.17.5 Lock-free property. */ #if defined(__CLANG_ATOMICS) #define atomic_is_lock_free(obj) \ __c11_atomic_is_lock_free(sizeof(obj)) #elif defined(__GNUC_ATOMICS) #define atomic_is_lock_free(obj) \ __atomic_is_lock_free(sizeof((obj)->__val)) #else #define atomic_is_lock_free(obj) \ (sizeof((obj)->__val) <= sizeof(void *)) #endif /* * 7.17.6 Atomic integer types. */ typedef _Atomic(_Bool) atomic_bool; typedef _Atomic(char) atomic_char; typedef _Atomic(signed char) atomic_schar; typedef _Atomic(unsigned char) atomic_uchar; typedef _Atomic(short) atomic_short; typedef _Atomic(unsigned short) atomic_ushort; typedef _Atomic(int) atomic_int; typedef _Atomic(unsigned int) atomic_uint; typedef _Atomic(long) atomic_long; typedef _Atomic(unsigned long) atomic_ulong; typedef _Atomic(long long) atomic_llong; typedef _Atomic(unsigned long long) atomic_ullong; #if 0 typedef _Atomic(char16_t) atomic_char16_t; typedef _Atomic(char32_t) atomic_char32_t; #endif typedef _Atomic(wchar_t) atomic_wchar_t; typedef _Atomic(int_least8_t) atomic_int_least8_t; typedef _Atomic(uint_least8_t) atomic_uint_least8_t; typedef _Atomic(int_least16_t) atomic_int_least16_t; typedef _Atomic(uint_least16_t) atomic_uint_least16_t; typedef _Atomic(int_least32_t) atomic_int_least32_t; typedef _Atomic(uint_least32_t) atomic_uint_least32_t; typedef _Atomic(int_least64_t) atomic_int_least64_t; typedef _Atomic(uint_least64_t) atomic_uint_least64_t; typedef _Atomic(int_fast8_t) atomic_int_fast8_t; typedef _Atomic(uint_fast8_t) atomic_uint_fast8_t; typedef _Atomic(int_fast16_t) atomic_int_fast16_t; typedef _Atomic(uint_fast16_t) atomic_uint_fast16_t; typedef _Atomic(int_fast32_t) atomic_int_fast32_t; typedef _Atomic(uint_fast32_t) atomic_uint_fast32_t; typedef _Atomic(int_fast64_t) atomic_int_fast64_t; typedef _Atomic(uint_fast64_t) atomic_uint_fast64_t; typedef _Atomic(intptr_t) atomic_intptr_t; typedef _Atomic(uintptr_t) atomic_uintptr_t; typedef _Atomic(size_t) atomic_size_t; typedef _Atomic(ptrdiff_t) atomic_ptrdiff_t; typedef _Atomic(intmax_t) atomic_intmax_t; typedef _Atomic(uintmax_t) atomic_uintmax_t; /* * 7.17.7 Operations on atomic types. */ /* * Compiler-specific operations. 
*/ #if defined(__CLANG_ATOMICS) #define atomic_compare_exchange_strong_explicit(object, expected, \ desired, success, failure) \ __c11_atomic_compare_exchange_strong(object, expected, desired, \ success, failure) #define atomic_compare_exchange_weak_explicit(object, expected, \ desired, success, failure) \ __c11_atomic_compare_exchange_weak(object, expected, desired, \ success, failure) #define atomic_exchange_explicit(object, desired, order) \ __c11_atomic_exchange(object, desired, order) #define atomic_fetch_add_explicit(object, operand, order) \ __c11_atomic_fetch_add(object, operand, order) #define atomic_fetch_and_explicit(object, operand, order) \ __c11_atomic_fetch_and(object, operand, order) #define atomic_fetch_or_explicit(object, operand, order) \ __c11_atomic_fetch_or(object, operand, order) #define atomic_fetch_sub_explicit(object, operand, order) \ __c11_atomic_fetch_sub(object, operand, order) #define atomic_fetch_xor_explicit(object, operand, order) \ __c11_atomic_fetch_xor(object, operand, order) #define atomic_load_explicit(object, order) \ __c11_atomic_load(object, order) #define atomic_store_explicit(object, desired, order) \ __c11_atomic_store(object, desired, order) #elif defined(__GNUC_ATOMICS) #define atomic_compare_exchange_strong_explicit(object, expected, \ desired, success, failure) \ __atomic_compare_exchange_n(&(object)->__val, expected, \ desired, 0, success, failure) #define atomic_compare_exchange_weak_explicit(object, expected, \ desired, success, failure) \ __atomic_compare_exchange_n(&(object)->__val, expected, \ desired, 1, success, failure) #define atomic_exchange_explicit(object, desired, order) \ __atomic_exchange_n(&(object)->__val, desired, order) #define atomic_fetch_add_explicit(object, operand, order) \ __atomic_fetch_add(&(object)->__val, operand, order) #define atomic_fetch_and_explicit(object, operand, order) \ __atomic_fetch_and(&(object)->__val, operand, order) #define atomic_fetch_or_explicit(object, operand, order) \ __atomic_fetch_or(&(object)->__val, operand, order) #define atomic_fetch_sub_explicit(object, operand, order) \ __atomic_fetch_sub(&(object)->__val, operand, order) #define atomic_fetch_xor_explicit(object, operand, order) \ __atomic_fetch_xor(&(object)->__val, operand, order) #define atomic_load_explicit(object, order) \ __atomic_load_n(&(object)->__val, order) #define atomic_store_explicit(object, desired, order) \ __atomic_store_n(&(object)->__val, desired, order) #else #define atomic_compare_exchange_strong_explicit(object, expected, \ desired, success, failure) ({ \ __typeof__((object)->__val) __v; \ _Bool __r; \ __v = __sync_val_compare_and_swap(&(object)->__val, \ *(expected), desired); \ __r = *(expected) == __v; \ *(expected) = __v; \ __r; \ }) #define atomic_compare_exchange_weak_explicit(object, expected, \ desired, success, failure) \ atomic_compare_exchange_strong_explicit(object, expected, \ desired, success, failure) #if __has_builtin(__sync_swap) /* Clang provides a full-barrier atomic exchange - use it if available. */ #define atomic_exchange_explicit(object, desired, order) \ __sync_swap(&(object)->__val, desired) #else /* * __sync_lock_test_and_set() is only an acquire barrier in theory (although in * practice it is usually a full barrier) so we need an explicit barrier after * it. 
*/ #define atomic_exchange_explicit(object, desired, order) ({ \ __typeof__((object)->__val) __v; \ __v = __sync_lock_test_and_set(&(object)->__val, desired); \ __sync_synchronize(); \ __v; \ }) #endif #define atomic_fetch_add_explicit(object, operand, order) \ __sync_fetch_and_add(&(object)->__val, operand) #define atomic_fetch_and_explicit(object, operand, order) \ __sync_fetch_and_and(&(object)->__val, operand) #define atomic_fetch_or_explicit(object, operand, order) \ __sync_fetch_and_or(&(object)->__val, operand) #define atomic_fetch_sub_explicit(object, operand, order) \ __sync_fetch_and_sub(&(object)->__val, operand) #define atomic_fetch_xor_explicit(object, operand, order) \ __sync_fetch_and_xor(&(object)->__val, operand) #define atomic_load_explicit(object, order) \ __sync_fetch_and_add(&(object)->__val, 0) #define atomic_store_explicit(object, desired, order) do { \ __sync_synchronize(); \ (object)->__val = (desired); \ __sync_synchronize(); \ } while (0) #endif /* * Convenience functions. */ #define atomic_compare_exchange_strong(object, expected, desired) \ atomic_compare_exchange_strong_explicit(object, expected, \ desired, memory_order_seq_cst, memory_order_seq_cst) #define atomic_compare_exchange_weak(object, expected, desired) \ atomic_compare_exchange_weak_explicit(object, expected, \ desired, memory_order_seq_cst, memory_order_seq_cst) #define atomic_exchange(object, desired) \ atomic_exchange_explicit(object, desired, memory_order_seq_cst) #define atomic_fetch_add(object, operand) \ atomic_fetch_add_explicit(object, operand, memory_order_seq_cst) #define atomic_fetch_and(object, operand) \ atomic_fetch_and_explicit(object, operand, memory_order_seq_cst) #define atomic_fetch_or(object, operand) \ atomic_fetch_or_explicit(object, operand, memory_order_seq_cst) #define atomic_fetch_sub(object, operand) \ atomic_fetch_sub_explicit(object, operand, memory_order_seq_cst) #define atomic_fetch_xor(object, operand) \ atomic_fetch_xor_explicit(object, operand, memory_order_seq_cst) #define atomic_load(object) \ atomic_load_explicit(object, memory_order_seq_cst) #define atomic_store(object, desired) \ atomic_store_explicit(object, desired, memory_order_seq_cst) /* * 7.17.8 Atomic flag type and operations. */ typedef atomic_bool atomic_flag; #define ATOMIC_FLAG_INIT ATOMIC_VAR_INIT(0) #define atomic_flag_clear_explicit(object, order) \ atomic_store_explicit(object, 0, order) #define atomic_flag_test_and_set_explicit(object, order) \ atomic_compare_exchange_strong_explicit(object, 0, 1, order, order) #define atomic_flag_clear(object) \ atomic_flag_clear_explicit(object, memory_order_seq_cst) #define atomic_flag_test_and_set(object) \ atomic_flag_test_and_set_explicit(object, memory_order_seq_cst) #endif /* !_STDATOMIC_H_ */ rdma-core-56.1/buildlib/fixup-include/sys-auxv.h000066400000000000000000000002321477342711600216410ustar00rootroot00000000000000#ifndef _FIXUP_SYS_AUXV_H #define _FIXUP_SYS_AUXV_H #if defined(__s390x__) #include_next #define HWCAP_S390_PCI_MIO 2097152 #endif #endif rdma-core-56.1/buildlib/fixup-include/sys-random.h000066400000000000000000000003661477342711600221460ustar00rootroot00000000000000#ifndef _FIXUP_SYS_RANDOM_H #define _FIXUP_SYS_RANDOM_H #include /* Flags for use with getrandom. 
*/ #define GRND_NONBLOCK 0x01 static inline ssize_t getrandom(void *buf, size_t buflen, unsigned int flags) { return -1; } #endif rdma-core-56.1/buildlib/fixup-include/sys-stat.h000066400000000000000000000002371477342711600216360ustar00rootroot00000000000000#ifndef _FIXUP_SYS_STAT_H #define _FIXUP_SYS_STAT_H #include_next extern int __fxstat(int __ver, int __fildes, struct stat *__stat_buf); #endif rdma-core-56.1/buildlib/fixup-include/systemd-sd-daemon.h000066400000000000000000000004271477342711600234050ustar00rootroot00000000000000#define SD_LISTEN_FDS_START 3 static inline int sd_listen_fds(int unset_environment) { return 0; } static inline int sd_is_socket(int fd, int family, int type, int listening) { return 0; } static inline int sd_notify(int unset_environment, const char *state) { return 0; } rdma-core-56.1/buildlib/fixup-include/valgrind-drd.h000066400000000000000000000002351477342711600224220ustar00rootroot00000000000000static inline void ANNOTATE_BENIGN_RACE_SIZED(const void *mem,size_t len,const char *desc) {} #define ANNOTATE_BENIGN_RACE_SIZED ANNOTATE_BENIGN_RACE_SIZED rdma-core-56.1/buildlib/fixup-include/valgrind-memcheck.h000066400000000000000000000004271477342711600234300ustar00rootroot00000000000000static inline void VALGRIND_MAKE_MEM_DEFINED(const void *mem,size_t len) {} #define VALGRIND_MAKE_MEM_DEFINED VALGRIND_MAKE_MEM_DEFINED static inline void VALGRIND_MAKE_MEM_UNDEFINED(const void *mem,size_t len) {} #define VALGRIND_MAKE_MEM_UNDEFINED VALGRIND_MAKE_MEM_UNDEFINED rdma-core-56.1/buildlib/gen-sparse.py000077500000000000000000000142441477342711600175460ustar00rootroot00000000000000#!/usr/bin/env python3 # Copyright 2015-2017 Obsidian Research Corp. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. import argparse import subprocess import os import collections import re import itertools headers = { "bits/sysmacros.h", "endian.h", "netinet/in.h", "pthread.h", "stdatomic.h", "stdlib.h", "sys/socket.h", }; def norm_header(fn): for I in headers: flat = I.replace("/","-"); if fn.endswith(flat): return I; if fn.endswith(flat + ".diff"): return I; return None; def find_system_header(args,hdr): """/usr/include is not always where the include files are, particularly if we are running full multi-arch as the azure_pipeline container does. Get gcc to tell us where /usr/include is""" if "incpath" not in args: cpp = subprocess.check_output([args.cc, "-print-prog-name=cpp"],universal_newlines=True).strip() data = subprocess.check_output([cpp, "-v"],universal_newlines=True,stdin=subprocess.DEVNULL, stderr=subprocess.STDOUT) args.incpath = []; for incdir in re.finditer(r"^ (/\S+)$", data, re.MULTILINE): incdir = incdir.group(1) if "fixed" in incdir: continue; args.incpath.append(incdir) for incdir in args.incpath: fn = os.path.join(incdir,hdr) if os.path.exists(fn): return fn return None; def get_buildlib_patches(dfn): """Within the buildlib directory we store patches for the glibc headers. 
Each patch is in a numbered sub directory that indicates the order to try, the number should match the glibc version used to make the diff.""" ver_hdrs = []; all_hdrs = [] for d,_,files in os.walk(dfn): for I in files: if d != dfn: bn = int(os.path.basename(d)); else: bn = 0; if bn == 0: all_hdrs.append(os.path.join(d,I)); else: ver_hdrs.append((bn,os.path.join(d,I))); ver_hdrs.sort(reverse=True); def add_to_dict(d,lst): for I in lst: nh = norm_header(I) if nh is None: continue; assert nh not in d d[nh] = (I, find_system_header(args,nh)) ret = [] for k,g in itertools.groupby(ver_hdrs,key=lambda x:x[0]): dd = {} ret.append(dd) add_to_dict(dd,(I for _,I in g)) add_to_dict(dd,all_hdrs) return ret; def is_patch(fn): with open(fn) as F: return F.read(10).startswith("-- /"); def apply_patch(src,patch,dest): """Patch a single system header. The output goes into our include search path and takes precedence over the system version.""" if src is None: return False dfn = os.path.dirname(dest); if not os.path.isdir(dfn): os.makedirs(dfn); if not patch.endswith(".diff"): if not os.path.exists(dest): os.symlink(patch,dest); return True; try: if os.path.exists(dest + ".rej"): os.unlink(dest + ".rej"); subprocess.check_output(["patch","-f","--follow-symlinks","-V","never","-i",patch,"-o",dest,src]); if os.path.exists(dest + ".rej"): print("Patch from %r failed"%(patch)); return False; except subprocess.CalledProcessError: print("Patch from %r failed"%(patch)); return False; return True; def replace_headers(suite): # Local system does not have the reference system header, this suite is # not supported for fn,pfn in suite.items(): if pfn[1] is None: return False; for fn,pfn in suite.items(): if not apply_patch(pfn[1],pfn[0],os.path.join(args.INCLUDE,fn)): break; else: return True; for fn,_ in suite.items(): try: os.unlink(os.path.join(args.INCLUDE,fn)) except OSError: continue; return False; def save(fn,outdir): """Diff the header file in our include directory against the system header and store the diff into buildlib. 
This makes it fairly easy to maintain the replacement headers.""" if os.path.islink(os.path.join(args.INCLUDE,fn)): return; flatfn = fn.replace("/","-") + ".diff"; flatfn = os.path.join(outdir,flatfn); includefn = os.path.join(args.INCLUDE,fn) if not os.path.exists(includefn): return cwd = os.getcwd() with open(flatfn,"wt") as F: os.chdir(os.path.join(args.INCLUDE,"..")) try: subprocess.check_call(["diff","-u", find_system_header(args,fn), os.path.join("include",fn)], stdout=F); except subprocess.CalledProcessError as ex: if ex.returncode == 1: return; raise; finally: os.chdir(cwd) parser = argparse.ArgumentParser(description='Produce sparse shim header files') parser.add_argument("--out",dest="INCLUDE",required=True, help="Directory to write header files to"); parser.add_argument("--src",dest="SRC",required=True, help="Top of the source tree"); parser.add_argument("--cc",default="gcc", help="System compiler to use to locate the default system headers"); parser.add_argument("--save",action="store_true",default=False, help="Save mode will write the current content of the headers to buildlib as a diff."); args = parser.parse_args(); if args.save: # Get the glibc version string ver = subprocess.check_output(["ldd","--version"]).decode() ver = ver.splitlines()[0].split(' ')[-1]; ver = ver.partition(".")[-1]; outdir = os.path.join(args.SRC,"buildlib","sparse-include",ver); if not os.path.isdir(outdir): os.makedirs(outdir); for I in headers: save(I,outdir); else: failed = False; suites = get_buildlib_patches(os.path.join(args.SRC,"buildlib","sparse-include")); for I in suites: if replace_headers(I): break; else: raise ValueError("Patch applications failed"); rdma-core-56.1/buildlib/make_abi_structs.py000066400000000000000000000033021477342711600210070ustar00rootroot00000000000000#/usr/bin/env python """This script transforms the structs inside the kernel ABI headers into a define of an anonymous struct. 
eg struct abc {int foo;}; becomes #define _STRUCT_abc struct {int foo;}; This allows the exact same struct to be included in the provider wrapper struct: struct abc_resp { struct ibv_abc ibv_resp; _STRUCT_abc; }; Which duplicates the struct layout and naming we have historically used, but sources the data directly from the kernel headers instead of manually copying.""" import re; import functools; import sys; def in_struct(ln,FO,nesting=0): """Copy a top level structure over to the #define output, keeping track of nested structures.""" if nesting == 0: if re.match(r"(}.*);",ln): FO.write(ln[:-1] + "\n\n"); return find_struct; FO.write(ln + " \\\n"); if ln == "struct {" or ln == "union {": return functools.partial(in_struct,nesting=nesting+1); if re.match(r"}.*;",ln): return functools.partial(in_struct,nesting=nesting-1); return functools.partial(in_struct,nesting=nesting); def find_struct(ln,FO): """Look for the start of a top level structure""" if ln.startswith("struct ") or ln.startswith("union "): g = re.match(r"(struct|union)\s+(\S+)\s+{",ln); FO.write("#define _STRUCT_%s %s { \\\n"%(g.group(2),g.group(1))); return in_struct; return find_struct; with open(sys.argv[1]) as FI: with open(sys.argv[2],"w") as FO: state = find_struct; for ln in FI: # Drop obvious comments ln = ln.strip(); ln = re.sub(r"/\*.*\*/","",ln); ln = re.sub(r"//.*$","",ln); state = state(ln,FO); rdma-core-56.1/buildlib/pandoc-prebuilt.py000066400000000000000000000043341477342711600205660ustar00rootroot00000000000000#!/usr/bin/env python import os import shutil import subprocess import sys import hashlib import re def hash_rst_includes(incdir,txt): h = "" for fn in re.findall(br"^..\s+include::\s+(.*)$", txt, flags=re.MULTILINE): with open(os.path.join(incdir,fn.decode()),"rb") as F: h = h + hashlib.sha1(F.read()).hexdigest(); return h.encode(); def get_id(SRC): """Return a unique ID for the SRC file. For simplicity and robustness we just content hash it""" incdir = os.path.dirname(SRC) with open(SRC,"rb") as F: txt = F.read(); if SRC.endswith(".rst"): txt = txt + hash_rst_includes(incdir,txt); return hashlib.sha1(txt).hexdigest(); def do_retrieve(src_root,SRC): """Retrieve the file from the prebuild cache and write it to DEST""" prebuilt = os.path.join(src_root,"buildlib","pandoc-prebuilt",get_id(SRC)) sys.stdout.write(prebuilt); def do_build_pandoc(build_root,pandoc,SRC,DEST): """Build the markdown into a man page with pandoc and then keep a copy of the output under build/pandoc-prebuilt""" try: subprocess.check_call([pandoc,"-s","-t","man",SRC,"-o",DEST]); except subprocess.CalledProcessError: sys.exit(100); shutil.copy(DEST,os.path.join(build_root,"pandoc-prebuilt",get_id(SRC))); def do_build_rst2man(build_root,rst2man,SRC,DEST): """Build the rst into a man page with rst2man and then keep a copy of the output under build/pandoc-prebuilt""" try: subprocess.check_call([rst2man,SRC,DEST]); except subprocess.CalledProcessError: sys.exit(100); shutil.copy(DEST,os.path.join(build_root,"pandoc-prebuilt",get_id(SRC))); # We support python 2.6 so argparse is not available.
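Both builder functions above key the prebuilt cache on a SHA-1 of the source bytes, so an unpacked distribution tar can satisfy a man-page build without pandoc or rst2man installed. A condensed sketch of the consumer side of that layout (illustrative only; it skips the .rst include hashing that get_id() also folds in):

    import hashlib, os, shutil

    def copy_prebuilt(src_root, src, dest):
        """Copy the cached man page for src into dest, if one exists."""
        with open(src, "rb") as F:
            key = hashlib.sha1(F.read()).hexdigest()
        prebuilt = os.path.join(src_root, "buildlib", "pandoc-prebuilt", key)
        if not os.path.exists(prebuilt):
            return False  # cache miss - the caller must run pandoc/rst2man
        shutil.copy(prebuilt, dest)  # cache hit - no documentation tools needed
        return True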
if len(sys.argv) == 4: assert(sys.argv[1] == "--retrieve"); do_retrieve(sys.argv[2],sys.argv[3]); elif len(sys.argv) == 7: assert(sys.argv[1] == "--build"); if sys.argv[3] == "--pandoc": do_build_pandoc(sys.argv[2],sys.argv[4],sys.argv[5],sys.argv[6]); elif sys.argv[3] == "--rst": do_build_rst2man(sys.argv[2],sys.argv[4],sys.argv[5],sys.argv[6]); else: raise ValueError("Bad sys.argv[3]"); else: raise ValueError("Must provide --build or --retrieve"); rdma-core-56.1/buildlib/provider.map000066400000000000000000000002541477342711600174520ustar00rootroot00000000000000/* The providers do not export any symbols at all. Instead they rely on attribute(constructor) to cause their init function to run at dlopen time. */ { local: *; }; rdma-core-56.1/buildlib/publish_headers.cmake000066400000000000000000000020211477342711600212560ustar00rootroot00000000000000# COPYRIGHT (c) 2016 Obsidian Research Corporation. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. # Same as publish_headers but does not install them during the install phase function(publish_internal_headers DEST) if(NOT ARGN) message(SEND_ERROR "Error: publish_internal_headers called without any files") return() endif() set(DDIR "${BUILD_INCLUDE}/${DEST}") file(MAKE_DIRECTORY "${DDIR}") foreach(SFIL ${ARGN}) get_filename_component(FIL ${SFIL} NAME) rdma_create_symlink("${CMAKE_CURRENT_SOURCE_DIR}/${SFIL}" "${DDIR}/${FIL}") endforeach() endfunction() # Copy headers from the source directory to the proper place in the # build/include directory. This also installs them into /usr/include/xx during # the install phase function(publish_headers DEST) publish_internal_headers("${DEST}" ${ARGN}) foreach(SFIL ${ARGN}) get_filename_component(FIL ${SFIL} NAME) install(FILES "${SFIL}" DESTINATION "${CMAKE_INSTALL_INCLUDEDIR}/${DEST}/" RENAME "${FIL}") endforeach() endfunction() rdma-core-56.1/buildlib/pyverbs_functions.cmake000066400000000000000000000074331477342711600217130ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2018, Mellanox Technologies. All rights reserved. See COPYING file # Copyright (c) 2020, Intel Corporation. All rights reserved. 
See COPYING file set(COMMON_LIBS_PIC ccan_pic rdma_util_pic) function(build_module_from_cfiles PY_MODULE MODULE_NAME ALL_CFILES LINKER_FLAGS) string(REGEX REPLACE "\\.so$" "" SONAME "${MODULE_NAME}${CMAKE_PYTHON_SO_SUFFIX}") add_library(${SONAME} SHARED ${ALL_CFILES}) set_target_properties(${SONAME} PROPERTIES COMPILE_FLAGS "${CMAKE_C_FLAGS} -fPIC -fno-strict-aliasing -Wno-unused-function -Wno-redundant-decls -Wno-shadow -Wno-cast-function-type -Wno-implicit-fallthrough -Wno-unknown-warning -Wno-unknown-warning-option -Wno-deprecated-declarations ${NO_VAR_TRACKING_FLAGS}" LIBRARY_OUTPUT_DIRECTORY "${BUILD_PYTHON}/${PY_MODULE}" PREFIX "") target_link_libraries(${SONAME} LINK_PRIVATE ${PYTHON_LIBRARIES} ibverbs rdmacm ${LINKER_FLAGS} ${COMMON_LIBS_PIC} ${CMAKE_THREAD_LIBS_INIT}) install(TARGETS ${SONAME} DESTINATION ${CMAKE_INSTALL_PYTHON_ARCH_LIB}/${PY_MODULE}) endfunction() function(rdma_cython_module PY_MODULE LINKER_FLAGS) set(ALL_CFILES "") set(MODULE_NAME "") foreach(SRC_FILE ${ARGN}) get_filename_component(FILENAME ${SRC_FILE} NAME_WE) get_filename_component(DIR ${SRC_FILE} DIRECTORY) get_filename_component(EXT ${SRC_FILE} EXT) if (DIR) set(SRC_PATH "${CMAKE_CURRENT_SOURCE_DIR}/${DIR}") else() set(SRC_PATH "${CMAKE_CURRENT_SOURCE_DIR}") endif() if (${EXT} STREQUAL ".pyx") # each .pyx file starts a new module, finish the previous module first if (ALL_CFILES AND MODULE_NAME) build_module_from_cfiles(${PY_MODULE} ${MODULE_NAME} "${ALL_CFILES}" "${LINKER_FLAGS}") endif() set(PYX "${SRC_PATH}/${FILENAME}.pyx") set(CFILE "${CMAKE_CURRENT_BINARY_DIR}/${FILENAME}.c") include_directories(${PYTHON_INCLUDE_DIRS}) add_custom_command( OUTPUT "${CFILE}" MAIN_DEPENDENCY "${PYX}" COMMAND ${CYTHON_EXECUTABLE} "${PYX}" -o "${CFILE}" "-I${PYTHON_INCLUDE_DIRS}" COMMENT "Cythonizing ${PYX}" ) set(MODULE_NAME ${FILENAME}) set(ALL_CFILES "${CFILE}") elseif(${EXT} STREQUAL ".c") # .c files belong to the same module as the most recent .pyx file, # ignored if appearing before all .pyx files set(CFILE "${SRC_PATH}/${FILENAME}.c") set(ALL_CFILES "${ALL_CFILES};${CFILE}") else() continue() endif() endforeach() # finish the last module if (ALL_CFILES AND MODULE_NAME) build_module_from_cfiles(${PY_MODULE} ${MODULE_NAME} "${ALL_CFILES}" "${LINKER_FLAGS}") endif() endfunction() function(rdma_python_module PY_MODULE) foreach(PY_FILE ${ARGN}) get_filename_component(LINK "${CMAKE_CURRENT_SOURCE_DIR}/${PY_FILE}" ABSOLUTE) rdma_create_symlink("${LINK}" "${BUILD_PYTHON}/${PY_MODULE}/${PY_FILE}") install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/${PY_FILE} DESTINATION ${CMAKE_INSTALL_PYTHON_ARCH_LIB}/${PY_MODULE}) endforeach() endfunction() function(rdma_python_test PY_MODULE) foreach(PY_FILE ${ARGN}) install(FILES ${CMAKE_CURRENT_SOURCE_DIR}/${PY_FILE} DESTINATION ${CMAKE_INSTALL_DOCDIR}/${PY_MODULE}) endforeach() endfunction() # Make a python script runnable from the build/bin directory with all the # correct paths filled in function(rdma_internal_binary) foreach(PY_FILE ${ARGN}) get_filename_component(ABS "${CMAKE_CURRENT_SOURCE_DIR}/${PY_FILE}" ABSOLUTE) get_filename_component(FN "${CMAKE_CURRENT_SOURCE_DIR}/${PY_FILE}" NAME) set(BIN_FN "${BUILD_BIN}/${FN}") file(WRITE "${BIN_FN}" "#!/bin/sh PYTHONPATH='${BUILD_PYTHON}' exec '${PYTHON_EXECUTABLE}' '${ABS}' \"$@\" ") execute_process(COMMAND "chmod" "a+x" "${BIN_FN}") endforeach() endfunction() rdma-core-56.1/buildlib/rdma_functions.cmake000066400000000000000000000303151477342711600211370ustar00rootroot00000000000000# COPYRIGHT (c) 2016 Obsidian Research Corporation. 
# Licensed under BSD (MIT variant) or GPLv2. See COPYING. # Helper functions for use in the sub CMakeLists files to make them simpler # and more uniform. # Global list of tuples of (SHARED STATIC MAP) library target names set(RDMA_STATIC_LIBS "" CACHE INTERNAL "Doc" FORCE) # Global list of tuples of (PROVIDER_NAME LIB_NAME) set(RDMA_PROVIDER_LIST "" CACHE INTERNAL "Doc" FORCE) set(COMMON_LIBS_PIC ccan_pic rdma_util_pic) set(COMMON_LIBS ccan rdma_util) function(rdma_public_static_lib SHLIB STATICLIB VERSION_SCRIPT) if (NOT IS_ABSOLUTE ${VERSION_SCRIPT}) set(VERSION_SCRIPT "${CMAKE_CURRENT_SOURCE_DIR}/${VERSION_SCRIPT}") endif() set_target_properties(${STATICLIB} PROPERTIES OUTPUT_NAME ${SHLIB} ARCHIVE_OUTPUT_DIRECTORY "${BUILD_STATIC_LIB}") target_compile_definitions(${STATICLIB} PRIVATE _STATIC_LIBRARY_BUILD_=1) list(APPEND RDMA_STATIC_LIBS ${SHLIB} ${STATICLIB} ${VERSION_SCRIPT}) set(RDMA_STATIC_LIBS "${RDMA_STATIC_LIBS}" CACHE INTERNAL "") endfunction() function(rdma_make_dir DDIR) if(NOT EXISTS "${DDIR}/") execute_process(COMMAND "${CMAKE_COMMAND}" "-E" "make_directory" "${DDIR}" RESULT_VARIABLE retcode) if(NOT "${retcode}" STREQUAL "0") message(FATAL_ERROR "Failed to create directory ${DDIR}") endif() endif() endfunction() # Create a symlink at filename DEST # If the directory containing DEST does not exist then it is created # automatically. function(rdma_create_symlink LINK_CONTENT DEST) if(NOT LINK_CONTENT) message(FATAL_ERROR "Failed to provide LINK_CONTENT") endif() # Make sure the directory exists, cmake doesn't create target DESTINATION # directories until everything is finished, do it manually here if necessary if(CMAKE_VERSION VERSION_LESS "2.8.12") get_filename_component(DDIR "${DEST}" PATH) else() get_filename_component(DDIR "${DEST}" DIRECTORY) endif() rdma_make_dir("${DDIR}") # Newer versions of cmake can use "${CMAKE_COMMAND}" "-E" "create_symlink" # however it is broken weirdly on older versions. execute_process(COMMAND "ln" "-Tsf" "${LINK_CONTENT}" "${DEST}" RESULT_VARIABLE retcode) if(NOT "${retcode}" STREQUAL "0") message(FATAL_ERROR "Failed to create symlink in ${DEST}") endif() endfunction() # Install a symlink during 'make install' function(rdma_install_symlink LINK_CONTENT DEST) # Create a link in the build tree with the right content get_filename_component(FN "${DEST}" NAME) rdma_create_symlink("${LINK_CONTENT}" "${CMAKE_CURRENT_BINARY_DIR}/${FN}") # Have cmake install it. Doing it this way lets cpack work if we ever wish # to use that. get_filename_component(DIR "${DEST}" PATH) install(FILES "${CMAKE_CURRENT_BINARY_DIR}/${FN}" DESTINATION "${DIR}") endfunction() # Wrapper for install() that runs the single file through configure_file first. # This only works with the basic single file install(FILE file ARGS..) 
# pattern
function(rdma_subst_install ARG1 file)
  if (NOT "${ARG1}" STREQUAL "FILES")
    message(FATAL_ERROR "Bad use of rdma_subst_install")
  endif()
  configure_file("${file}" "${CMAKE_CURRENT_BINARY_DIR}/${file}" @ONLY)
  install(FILES "${CMAKE_CURRENT_BINARY_DIR}/${file}" ${ARGN})
endfunction()

# Modify shared library target DEST to use VERSION_SCRIPT as the linker map file
function(rdma_set_library_map DEST VERSION_SCRIPT)
  if (NOT IS_ABSOLUTE ${VERSION_SCRIPT})
    set(VERSION_SCRIPT "${CMAKE_CURRENT_SOURCE_DIR}/${VERSION_SCRIPT}")
  endif()
  set_property(TARGET ${DEST} APPEND_STRING PROPERTY
               LINK_FLAGS " -Wl,--version-script,${VERSION_SCRIPT}")
  # NOTE: This won't work with ninja prior to cmake 3.4
  set_property(TARGET ${DEST} APPEND_STRING PROPERTY
               LINK_DEPENDS ${VERSION_SCRIPT})
endfunction()

# Basic function to produce a standard library with a GNU LD version script.
function(rdma_library DEST VERSION_SCRIPT SOVERSION VERSION)
  # Create a static library
  if (ENABLE_STATIC)
    add_library(${DEST}-static STATIC ${ARGN})
    target_link_libraries(${DEST}-static LINK_PRIVATE ${COMMON_LIBS})
    rdma_public_static_lib(${DEST} ${DEST}-static ${VERSION_SCRIPT})
  endif()

  # Create a shared library
  add_library(${DEST} SHARED ${ARGN})
  rdma_set_library_map(${DEST} ${VERSION_SCRIPT})
  target_link_libraries(${DEST} LINK_PRIVATE ${COMMON_LIBS_PIC})
  set_target_properties(${DEST} PROPERTIES
                        SOVERSION ${SOVERSION}
                        VERSION ${VERSION}
                        LIBRARY_OUTPUT_DIRECTORY "${BUILD_LIB}")
  install(TARGETS ${DEST} DESTINATION "${CMAKE_INSTALL_LIBDIR}")
endfunction()

# Create a special provider with exported symbols in it. The shared provider
# exists as a normal system library with the normal shared library SONAME and
# other conventions. The system library is symlinked into the
# VERBS_PROVIDER_DIR so it can be dlopened as a provider as well.
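#
# Illustrative usage (added commentary, not upstream text; the provider name,
# map file, version numbers and source list below are made up):
#
#   rdma_shared_provider(foo libfoo.map 1 1.0.${PACKAGE_VERSION}
#     foo.c foo_verbs.c)
#
# This would build libfoo.so.1.0.${PACKAGE_VERSION}, install it into the
# normal library directory, and symlink it into VERBS_PROVIDER_DIR so that
# libibverbs can also dlopen it as a provider.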
function(rdma_shared_provider DEST VERSION_SCRIPT SOVERSION VERSION) # Installed driver file file(WRITE "${CMAKE_CURRENT_BINARY_DIR}/${DEST}.driver" "driver ${DEST}\n") install(FILES "${CMAKE_CURRENT_BINARY_DIR}/${DEST}.driver" DESTINATION "${CONFIG_DIR}") # Uninstalled driver file file(MAKE_DIRECTORY "${BUILD_ETC}/libibverbs.d/") file(WRITE "${BUILD_ETC}/libibverbs.d/${DEST}.driver" "driver ${BUILD_LIB}/lib${DEST}\n") list(APPEND RDMA_PROVIDER_LIST ${DEST} ${DEST}) set(RDMA_PROVIDER_LIST "${RDMA_PROVIDER_LIST}" CACHE INTERNAL "") # Create a static provider library if (ENABLE_STATIC) add_library(${DEST}-static STATIC ${ARGN}) rdma_public_static_lib(${DEST} ${DEST}-static ${VERSION_SCRIPT}) endif() # Create the plugin shared library add_library(${DEST} SHARED ${ARGN}) rdma_set_library_map(${DEST} ${VERSION_SCRIPT}) target_link_libraries(${DEST} LINK_PRIVATE ${COMMON_LIBS_PIC}) target_link_libraries(${DEST} LINK_PRIVATE ibverbs) target_link_libraries(${DEST} LINK_PRIVATE ${CMAKE_THREAD_LIBS_INIT}) set_target_properties(${DEST} PROPERTIES SOVERSION ${SOVERSION} VERSION ${VERSION} LIBRARY_OUTPUT_DIRECTORY "${BUILD_LIB}") install(TARGETS ${DEST} DESTINATION "${CMAKE_INSTALL_LIBDIR}") # Compute a relative symlink from VERBS_PROVIDER_DIR to LIBDIR execute_process(COMMAND ${PYTHON_EXECUTABLE} ${PROJECT_SOURCE_DIR}/buildlib/relpath "${CMAKE_INSTALL_FULL_LIBDIR}/lib${DEST}.so.${VERSION}" "${VERBS_PROVIDER_DIR}" OUTPUT_VARIABLE DEST_LINK_PATH OUTPUT_STRIP_TRAILING_WHITESPACE RESULT_VARIABLE retcode) if(NOT "${retcode}" STREQUAL "0") message(FATAL_ERROR "Unable to run buildlib/relpath, do you have python?") endif() rdma_install_symlink("${DEST_LINK_PATH}" "${VERBS_PROVIDER_DIR}/lib${DEST}${IBVERBS_PROVIDER_SUFFIX}") rdma_create_symlink("lib${DEST}.so.${VERSION}" "${BUILD_LIB}/lib${DEST}${IBVERBS_PROVIDER_SUFFIX}") endfunction() # Create a provider shared library for libibverbs function(rdma_provider DEST) # Installed driver file file(WRITE "${CMAKE_CURRENT_BINARY_DIR}/${DEST}.driver" "driver ${DEST}\n") install(FILES "${CMAKE_CURRENT_BINARY_DIR}/${DEST}.driver" DESTINATION "${CONFIG_DIR}") # Uninstalled driver file file(MAKE_DIRECTORY "${BUILD_ETC}/libibverbs.d/") file(WRITE "${BUILD_ETC}/libibverbs.d/${DEST}.driver" "driver ${BUILD_LIB}/lib${DEST}\n") list(APPEND RDMA_PROVIDER_LIST ${DEST} "${DEST}-rdmav${IBVERBS_PABI_VERSION}") set(RDMA_PROVIDER_LIST "${RDMA_PROVIDER_LIST}" CACHE INTERNAL "") # Create a static provider library if (ENABLE_STATIC) add_library(${DEST} STATIC ${ARGN}) rdma_public_static_lib("${DEST}-rdmav${IBVERBS_PABI_VERSION}" ${DEST} ${BUILDLIB}/provider.map) endif() # Create the plugin shared library set(DEST "${DEST}-rdmav${IBVERBS_PABI_VERSION}") add_library(${DEST} MODULE ${ARGN}) # Even though these are modules we still want to use Wl,--no-undefined set_target_properties(${DEST} PROPERTIES LINK_FLAGS ${CMAKE_SHARED_LINKER_FLAGS}) rdma_set_library_map(${DEST} ${BUILDLIB}/provider.map) target_link_libraries(${DEST} LINK_PRIVATE ${COMMON_LIBS_PIC}) target_link_libraries(${DEST} LINK_PRIVATE ibverbs) target_link_libraries(${DEST} LINK_PRIVATE ${CMAKE_THREAD_LIBS_INIT}) set_target_properties(${DEST} PROPERTIES LIBRARY_OUTPUT_DIRECTORY "${BUILD_LIB}") # Provider Plugins do not use SONAME versioning, there is no reason to # create the usual symlinks. 
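  # Added note (not upstream text; the PABI number is only an example): at
  # this point ${DEST} has been rewritten above to
  # "<name>-rdmav${IBVERBS_PABI_VERSION}", so a provider named "foo" with a
  # PABI of 34 is built as libfoo-rdmav34.so. The branch below installs it
  # into VERBS_PROVIDER_DIR when that is configured, otherwise directly into
  # CMAKE_INSTALL_LIBDIR.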
  if (VERBS_PROVIDER_DIR)
    install(TARGETS ${DEST} DESTINATION "${VERBS_PROVIDER_DIR}")
  else()
    install(TARGETS ${DEST} DESTINATION "${CMAKE_INSTALL_LIBDIR}")
    # FIXME: This symlink is provided for compat with the old build, but it
    # never should have existed in the first place, nothing should use this
    # name, we can probably remove it.
    rdma_install_symlink("lib${DEST}${IBVERBS_PROVIDER_SUFFIX}" "${CMAKE_INSTALL_LIBDIR}/lib${DEST}.so")
  endif()
endfunction()

# Create an installed executable
function(rdma_executable EXEC)
  add_executable(${EXEC} ${ARGN})
  target_link_libraries(${EXEC} LINK_PRIVATE ${COMMON_LIBS})
  set_target_properties(${EXEC} PROPERTIES RUNTIME_OUTPUT_DIRECTORY "${BUILD_BIN}")
  install(TARGETS ${EXEC} DESTINATION "${CMAKE_INSTALL_BINDIR}")
endfunction()

# Create an installed executable (under sbin)
function(rdma_sbin_executable EXEC)
  add_executable(${EXEC} ${ARGN})
  target_link_libraries(${EXEC} LINK_PRIVATE ${COMMON_LIBS})
  set_target_properties(${EXEC} PROPERTIES RUNTIME_OUTPUT_DIRECTORY "${BUILD_BIN}")
  install(TARGETS ${EXEC} DESTINATION "${CMAKE_INSTALL_SBINDIR}")
endfunction()

# Create a test executable (not-installed)
function(rdma_test_executable EXEC)
  add_executable(${EXEC} ${ARGN})
  target_link_libraries(${EXEC} LINK_PRIVATE ${COMMON_LIBS})
  set_target_properties(${EXEC} PROPERTIES RUNTIME_OUTPUT_DIRECTORY "${BUILD_BIN}")
endfunction()

# Finalize the setup of the static libraries by copying the meta information
# from the shared to static and setting up the static builder
function(rdma_finalize_libs)
  list(LENGTH RDMA_STATIC_LIBS LEN)
  if (LEN LESS 3)
    return()
  endif()

  math(EXPR LEN ${LEN}-1)
  foreach(I RANGE 0 ${LEN} 3)
    list(GET RDMA_STATIC_LIBS ${I} SHARED)
    math(EXPR I ${I}+1)
    list(GET RDMA_STATIC_LIBS ${I} STATIC)
    math(EXPR I ${I}+1)
    list(GET RDMA_STATIC_LIBS ${I} MAP)

    # PUBLIC libraries
    set(LIBS "")
    get_property(TMP TARGET ${SHARED} PROPERTY INTERFACE_LINK_LIBRARIES SET)
    if (TMP)
      get_target_property(TMP ${SHARED} INTERFACE_LINK_LIBRARIES)
      set_target_properties(${STATIC} PROPERTIES INTERFACE_LINK_LIBRARIES "${TMP}")
      set(LIBS "${TMP}")
    endif()

    # PRIVATE libraries
    get_property(TMP TARGET ${SHARED} PROPERTY LINK_LIBRARIES SET)
    if (TMP)
      get_target_property(TMP ${SHARED} LINK_LIBRARIES)
      set_target_properties(${STATIC} PROPERTIES LINK_LIBRARIES "${TMP}")
      list(APPEND LIBS "${TMP}")
    endif()

    set(ARGS ${ARGS} --map "${MAP}" --lib "$<TARGET_FILE:${STATIC}>")
    set(DEPENDS ${DEPENDS} ${STATIC} ${MAP})
    get_target_property(TMP ${STATIC} OUTPUT_NAME)
    set(OUTPUTS ${OUTPUTS} "${BUILD_LIB}/lib${TMP}.a")
    install(FILES "${BUILD_LIB}/lib${TMP}.a" DESTINATION "${CMAKE_INSTALL_LIBDIR}")
  endforeach()

  foreach(STATIC ${COMMON_LIBS})
    set(ARGS ${ARGS} --internal_lib "$<TARGET_FILE:${STATIC}>")
    set(DEPENDS ${DEPENDS} ${STATIC})
  endforeach()

  add_custom_command(
    OUTPUT ${OUTPUTS}
    COMMAND "${PYTHON_EXECUTABLE}" "${PROJECT_SOURCE_DIR}/buildlib/sanitize_static_lib.py"
      --version ${PACKAGE_VERSION}
      --ar "${CMAKE_AR}" --nm "${CMAKE_NM}" --objcopy "${CMAKE_OBJCOPY}"
      ${ARGS}
    DEPENDS ${DEPENDS} "${PROJECT_SOURCE_DIR}/buildlib/sanitize_static_lib.py"
    COMMENT "Building distributable static libraries"
    VERBATIM)
  add_custom_target("make_static" ALL DEPENDS ${OUTPUTS})
endfunction()

# Generate a pkg-config file
function(rdma_pkg_config PC_LIB_NAME PC_REQUIRES_PRIVATE PC_LIB_PRIVATE)
  set(PC_LIB_NAME "${PC_LIB_NAME}")
  set(PC_LIB_PRIVATE "${PC_LIB_PRIVATE}")
  set(PC_REQUIRES_PRIVATE "${PC_REQUIRES_PRIVATE}")
  get_target_property(PC_VERSION ${PC_LIB_NAME} VERSION)

  # With IN_PLACE=1 the install step is not run, so generate the file in the build dir
  if (IN_PLACE)
    set(PC_RPATH
"-Wl,-rpath,\${libdir}") endif() configure_file(${BUILDLIB}/template.pc.in ${BUILD_LIB}/pkgconfig/lib${PC_LIB_NAME}.pc @ONLY) if (NOT IN_PLACE) install(FILES ${BUILD_LIB}/pkgconfig/lib${PC_LIB_NAME}.pc DESTINATION ${CMAKE_INSTALL_LIBDIR}/pkgconfig) endif() endfunction() rdma-core-56.1/buildlib/rdma_man.cmake000066400000000000000000000105201477342711600176760ustar00rootroot00000000000000# COPYRIGHT (c) 2017-2018 Mellanox Technologies Ltd # Licensed under BSD (MIT variant) or GPLv2. See COPYING. rdma_make_dir("${PROJECT_BINARY_DIR}/pandoc-prebuilt") add_custom_target("docs" ALL DEPENDS "${OBJ}") function(rdma_man_get_prebuilt SRC OUT) # If rst2man is not installed then we install the man page from the # pre-built cache directory under buildlib. When the release tar file is # made the man pages are pre-built and included. This is done via install # so that ./build.sh never depends on pandoc, only 'ninja install'. execute_process( COMMAND "${PYTHON_EXECUTABLE}" "${PROJECT_SOURCE_DIR}/buildlib/pandoc-prebuilt.py" --retrieve "${PROJECT_SOURCE_DIR}" "${SRC}" WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" OUTPUT_VARIABLE OBJ RESULT_VARIABLE retcode) if(NOT "${retcode}" STREQUAL "0") message(FATAL_ERROR "Failed to load prebuilt pandoc output") endif() set(${OUT} "${OBJ}" PARENT_SCOPE) endfunction() function(rdma_md_man_page SRC MAN_SECT MANFN) set(OBJ "${CMAKE_CURRENT_BINARY_DIR}/${MANFN}") if (PANDOC_EXECUTABLE) add_custom_command( OUTPUT "${OBJ}" COMMAND "${PYTHON_EXECUTABLE}" "${PROJECT_SOURCE_DIR}/buildlib/pandoc-prebuilt.py" --build "${PROJECT_BINARY_DIR}" --pandoc "${PANDOC_EXECUTABLE}" "${SRC}" "${OBJ}" MAIN_DEPENDENCY "${SRC}" WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" COMMENT "Creating man page ${MANFN}" VERBATIM) add_custom_target("man-${MANFN}" ALL DEPENDS "${OBJ}") add_dependencies("docs" "man-${MANFN}") else() rdma_man_get_prebuilt(${SRC} OBJ) endif() install(FILES "${OBJ}" RENAME "${MANFN}" DESTINATION "${CMAKE_INSTALL_MANDIR}/man${MAN_SECT}/") endfunction() function(rdma_rst_man_page SRC MAN_SECT MANFN) set(OBJ "${CMAKE_CURRENT_BINARY_DIR}/${MANFN}") if (RST2MAN_EXECUTABLE) add_custom_command( OUTPUT "${OBJ}" COMMAND "${PYTHON_EXECUTABLE}" "${PROJECT_SOURCE_DIR}/buildlib/pandoc-prebuilt.py" --build "${PROJECT_BINARY_DIR}" --rst "${RST2MAN_EXECUTABLE}" "${SRC}" "${OBJ}" MAIN_DEPENDENCY "${SRC}" WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" COMMENT "Creating man page ${MANFN}" VERBATIM) add_custom_target("man-${MANFN}" ALL DEPENDS "${OBJ}") add_dependencies("docs" "man-${MANFN}") else() rdma_man_get_prebuilt(${SRC} OBJ) endif() install(FILES "${OBJ}" RENAME "${MANFN}" DESTINATION "${CMAKE_INSTALL_MANDIR}/man${MAN_SECT}/") endfunction() # Install man pages. 
This deduces the section from the trailing integer in the # filename function(rdma_man_pages) foreach(I ${ARGN}) if ("${I}" MATCHES "\\.md$") string(REGEX REPLACE "^.+[.](.+)\\.md$" "\\1" MAN_SECT "${I}") string(REGEX REPLACE "^(.+)\\.md$" "\\1" BASE_NAME "${I}") get_filename_component(BASE_NAME "${BASE_NAME}" NAME) rdma_md_man_page( "${I}" "${MAN_SECT}" "${BASE_NAME}") elseif ("${I}" MATCHES "\\.in\\.rst$") string(REGEX REPLACE "^.+[.](.+)\\.in\\.rst$" "\\1" MAN_SECT "${I}") string(REGEX REPLACE "^(.+)\\.in\\.rst$" "\\1" BASE_NAME "${I}") get_filename_component(BASE_NAME "${BASE_NAME}" NAME) configure_file("${I}" "${CMAKE_CURRENT_BINARY_DIR}/${BASE_NAME}.rst" @ONLY) rdma_rst_man_page( "${CMAKE_CURRENT_BINARY_DIR}/${BASE_NAME}.rst" "${MAN_SECT}" "${BASE_NAME}") elseif ("${I}" MATCHES "\\.in$") string(REGEX REPLACE "^.+[.](.+)\\.in$" "\\1" MAN_SECT "${I}") string(REGEX REPLACE "^(.+)\\.in$" "\\1" BASE_NAME "${I}") get_filename_component(BASE_NAME "${BASE_NAME}" NAME) rdma_subst_install(FILES "${I}" DESTINATION "${CMAKE_INSTALL_MANDIR}/man${MAN_SECT}/" RENAME "${BASE_NAME}") else() string(REGEX REPLACE "^.+[.](.+)$" "\\1" MAN_SECT "${I}") install(FILES "${I}" DESTINATION "${CMAKE_INSTALL_MANDIR}/man${MAN_SECT}/") endif() endforeach() endfunction() # Create an alias for a man page, using a symlink. # Input is a list of pairs of names (MAN_PAGE ALIAS) # NOTE: The section must currently be the same for both. function(rdma_alias_man_pages) list(LENGTH ARGN LEN) math(EXPR LEN ${LEN}-1) foreach(I RANGE 0 ${LEN} 2) list(GET ARGN ${I} FROM) math(EXPR I ${I}+1) list(GET ARGN ${I} TO) string(REGEX REPLACE "^.+[.](.+)$" "\\1" MAN_SECT ${FROM}) rdma_install_symlink("${FROM}" "${CMAKE_INSTALL_MANDIR}/man${MAN_SECT}/${TO}") endforeach() endfunction() rdma-core-56.1/buildlib/relpath000066400000000000000000000003051477342711600165000ustar00rootroot00000000000000#!/usr/bin/env python # Copyright 2017 Mellanox Technologies, Inc. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. import os import sys print(os.path.relpath(sys.argv[1], sys.argv[2])) rdma-core-56.1/buildlib/sanitize_static_lib.py000066400000000000000000000223361477342711600215230ustar00rootroot00000000000000#!/usr/bin/env python # Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. """This tool is used to create installable versions of the static libraries in rdma-core. This is complicated because rdma-core was not designed with static libraries in mind and relies on the dynamic linker to hide a variety of internal details. The build uses several internal utility libraries across the providers and the libraries. When building statically these libraries have to become inlined into the various main libraries. This script figures out which static libraries should include which internal libraries and inlines them appropriately. rdma-core is not careful to use globally unique names throughout all the libraries and all the providers. Normally the map file in the dynamic linker will hide these external symbols. This script does something similar for static linking by analyzing the libraries and map files then renaming internal symbols with a globally unique prefix. 
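For example (added illustration, the symbol name is made up): when this
script is run with --version 56.1, a non-exported symbol named foo is
rewritten to rdmacore56_1_foo. The prefix is the package version with every
non-word character replaced by '_', as done by the renaming loop at the
bottom of this script.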
This is far too complicated to handle internally with cmake, so we have cmake
produce the nearly completed libraries, then process them here using
binutils, and finally produce the final installation ready libraries."""
import collections
import subprocess
import argparse
import tempfile
import itertools
import sys
import os
import re

SymVer = collections.namedtuple(
    "SymVer", ["version", "prior_version", "globals", "locals"])

try:
    from tempfile import TemporaryDirectory
except ImportError:
    import shutil
    import tempfile

    # From /usr/lib/python3/dist-packages/setuptools/py31compat.py
    class TemporaryDirectory(object):
        def __init__(self):
            self.name = None
            self.name = tempfile.mkdtemp()

        def __enter__(self):
            return self.name

        def __exit__(self, exctype, excvalue, exctrace):
            try:
                shutil.rmtree(self.name, True)
            except OSError:
                pass
            self.name = None

try:
    from subprocess import check_output
except ImportError:
    # From /usr/lib/python2.7/subprocess.py
    def check_output(*popenargs, **kwargs):
        if 'stdout' in kwargs:
            raise ValueError(
                'stdout argument not allowed, it will be overridden.')
        process = subprocess.Popen(
            stdout=subprocess.PIPE, *popenargs, **kwargs)
        output, unused_err = process.communicate()
        retcode = process.poll()
        if retcode:
            cmd = kwargs.get("args")
            if cmd is None:
                cmd = popenargs[0]
            raise subprocess.CalledProcessError(retcode, cmd, output=output)
        return output

    subprocess.check_output = check_output


def parse_stanza(version, prior_version, lines):
    gbl = []
    local = []
    cur = 0
    for I in re.finditer(
            r"\s*(?:(global:)|(local:)(\s*\*\s*;)|(?:(\w+)\s*;))",
            lines,
            flags=re.DOTALL | re.MULTILINE):
        if I.group(1):
            # global
            lst = gbl
        if I.group(2):
            # local
            lst = local
        if I.group(3):
            # wildcard, only valid in a local: block
            lst.append("*")
            assert (lst is not gbl)
        if I.group(4):
            # symbol name
            lst.append(I.group(4))

        assert cur == I.start()
        cur = I.end()
    assert cur == len(lines)

    return SymVer(version or "", prior_version or "", gbl, local)


def load_map(fn):
    """This is a lame regex based parser for GNU linker map files. It asserts
    if the map file is invalid.
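    For example (added illustration, version and symbol names are made up), a
    stanza such as

        IBVERBS_1.1 { global: ibv_foo; local: *; } IBVERBS_1.0;

    is parsed into a SymVer whose version is "IBVERBS_1.1", whose
    prior_version refers to IBVERBS_1.0, with globals ["ibv_foo"] and
    locals ["*"].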
    It returns a list of the parsed SymVer stanzas."""
    with open(fn, "rt") as F:
        lines = F.read()

    # Strip C style comments
    p = re.compile(r"/\*.*?\*/", flags=re.DOTALL)
    lines = re.sub(p, "", lines)
    lines = lines.strip()

    # Extract each stanza
    res = []
    cur = 0
    for I in re.finditer(
            r"\s*(?:(\S+)\s+)?{(.*?)\s*}(\s*\S+)?\s*;",
            lines,
            flags=re.DOTALL | re.MULTILINE):
        assert cur == I.start()
        res.append(parse_stanza(I.group(1), I.group(3), I.group(2)))
        cur = I.end()
    assert cur == len(lines)
    return res


class Lib(object):
    def __init__(self, libfn, tmpdir):
        self.libfn = os.path.basename(libfn)
        self.objdir = os.path.join(tmpdir, self.libfn)
        self.final_objdir = os.path.join(tmpdir, "r-" + self.libfn)
        self.final_lib = os.path.join(os.path.dirname(libfn), "..", self.libfn)
        self.needs = set()
        self.needed = set()

        os.makedirs(self.objdir)
        os.makedirs(self.final_objdir)

        subprocess.check_call([args.ar, "x", libfn], cwd=self.objdir)
        self.objects = [I for I in os.listdir(self.objdir)]
        self.get_syms()

    def get_syms(self):
        """Read the defined symbols from each object file"""
        self.syms = set()
        self.needed_syms = set()
        for I in self.objects:
            I = os.path.join(self.objdir, I)
            syms = subprocess.check_output([args.nm, "--defined-only", I])
            for ln in syms.decode().splitlines():
                ln = ln.split()
                if ln[1].isupper():
                    self.syms.add(ln[2])

            syms = subprocess.check_output([args.nm, "--undefined-only", I])
            for ln in syms.decode().splitlines():
                ln = ln.split()
                if ln[0].isupper():
                    if not ln[1].startswith("verbs_provider_"):
                        self.needed_syms.add(ln[1])

    def rename_syms(self, rename_fn):
        """Invoke objcopy on all the objects to rename their symbols"""
        for I in self.objects:
            subprocess.check_call([
                args.objcopy, "--redefine-syms=%s" % (rename_fn),
                os.path.join(self.objdir, I),
                os.path.join(self.final_objdir, I)
            ])

    def incorporate_internal(self, internal_libs):
        """If this library requires an internal library then we want to
        inline it into this lib when we reconstruct it."""
        for lib in self.needs.intersection(internal_libs):
            self.objects.extend(
                os.path.join(lib.final_objdir, I) for I in lib.objects)

    def finalize(self):
        """Write out the now modified library"""
        try:
            os.unlink(self.final_lib)
        except OSError:
            pass
        subprocess.check_call(
            [args.ar, "qsc", self.final_lib] +
            [os.path.join(self.final_objdir, I) for I in self.objects])


def compute_graph(libs):
    """Look at the symbols each library provides vs the symbols each library
    needs and organize the libraries into a graph."""
    for a, b in itertools.permutations(libs, 2):
        if not a.syms.isdisjoint(b.needed_syms):
            b.needs.add(a)
            a.needed.add(b)

    # Use transitivity to prune the needs list
    def prune(cur_lib, to_prune):
        for I in cur_lib.needed:
            I.needs.discard(to_prune)
            to_prune.needed.discard(I)
            prune(I, to_prune)

    for cur_lib in libs:
        for I in list(cur_lib.needed):
            prune(I, cur_lib)


parser = argparse.ArgumentParser(
    description='Generate static libraries for distribution')
parser.add_argument(
    "--map",
    dest="maps",
    action="append",
    help="List of map files defining all the public symbols",
    default=[])
parser.add_argument(
    "--lib", dest="libs", action="append", help="The input static libraries")
parser.add_argument(
    "--internal_lib",
    dest="internal_libs",
    action="append",
    help=
    "The internal static libraries, these will be merged into other libraries")
parser.add_argument(
    "--version", action="store", help="Package version number", required=True)
parser.add_argument("--ar", action="store", help="ar tool", required=True)
parser.add_argument("--nm", action="store", help="nm tool", required=True)
parser.add_argument(
    "--objcopy",
action="store", help="objcopy tool", required=True) args = parser.parse_args() global_syms = set() for fn in sorted(set(args.maps)): for I in load_map(fn): # Private symbols in libibverbs are also mangled for maximum safety. if "PRIVATE" not in I.version: global_syms.update(I.globals) with TemporaryDirectory() as tmpdir: libs = set(Lib(fn, tmpdir) for fn in args.libs) internal_libs = set(Lib(fn, tmpdir) for fn in args.internal_libs) all_libs = libs | internal_libs all_syms = set() for I in all_libs: all_syms.update(I.syms) compute_graph(all_libs) # To support the ibv_static_providers() machinery these are made global # too, even though they are not in map files. We only want to expose them # for the static linking case. global_syms.add("ibv_static_providers") for I in all_syms: if I.startswith("verbs_provider_"): global_syms.add(I) # Generate a redefine file for objcopy that will sanitize the internal names prefix = re.sub(r"\W", "_", args.version) redefine_fn = os.path.join(tmpdir, "redefine") with open(redefine_fn, "wt") as F: for I in sorted(all_syms - global_syms): F.write("%s rdmacore%s_%s\n" % (I, prefix, I)) for I in all_libs: I.rename_syms(redefine_fn) for I in libs: I.incorporate_internal(internal_libs) I.finalize() rdma-core-56.1/buildlib/sparse-include/000077500000000000000000000000001477342711600200365ustar00rootroot00000000000000rdma-core-56.1/buildlib/sparse-include/19/000077500000000000000000000000001477342711600202675ustar00rootroot00000000000000rdma-core-56.1/buildlib/sparse-include/19/netinet-in.h.diff000066400000000000000000000107221477342711600234230ustar00rootroot00000000000000--- /usr/include/netinet/in.h 2016-05-26 10:27:23.000000000 +0000 +++ build-sparse/include/netinet/in.h 2017-03-15 21:50:20.436860311 +0000 @@ -22,12 +22,12 @@ #include #include #include - +#include __BEGIN_DECLS /* Internet address. */ -typedef uint32_t in_addr_t; +typedef __be32 in_addr_t; struct in_addr { in_addr_t s_addr; @@ -114,7 +114,7 @@ #endif /* !__USE_KERNEL_IPV6_DEFS */ /* Type to represent a port. */ -typedef uint16_t in_port_t; +typedef __be16 in_port_t; /* Standard well-known ports. */ enum @@ -173,36 +173,36 @@ #define IN_CLASSB_HOST (0xffffffff & ~IN_CLASSB_NET) #define IN_CLASSB_MAX 65536 -#define IN_CLASSC(a) ((((in_addr_t)(a)) & 0xe0000000) == 0xc0000000) +#define IN_CLASSC(a) ((((uint32_t)(a)) & 0xe0000000) == 0xc0000000) #define IN_CLASSC_NET 0xffffff00 #define IN_CLASSC_NSHIFT 8 #define IN_CLASSC_HOST (0xffffffff & ~IN_CLASSC_NET) -#define IN_CLASSD(a) ((((in_addr_t)(a)) & 0xf0000000) == 0xe0000000) +#define IN_CLASSD(a) ((((uint32_t)(a)) & 0xf0000000) == 0xe0000000) #define IN_MULTICAST(a) IN_CLASSD(a) -#define IN_EXPERIMENTAL(a) ((((in_addr_t)(a)) & 0xe0000000) == 0xe0000000) -#define IN_BADCLASS(a) ((((in_addr_t)(a)) & 0xf0000000) == 0xf0000000) +#define IN_EXPERIMENTAL(a) ((((uint32_t)(a)) & 0xe0000000) == 0xe0000000) +#define IN_BADCLASS(a) ((((uint32_t)(a)) & 0xf0000000) == 0xf0000000) /* Address to accept any incoming messages. */ -#define INADDR_ANY ((in_addr_t) 0x00000000) +#define INADDR_ANY ((uint32_t) 0x00000000) /* Address to send to all hosts. */ -#define INADDR_BROADCAST ((in_addr_t) 0xffffffff) +#define INADDR_BROADCAST ((uint32_t) 0xffffffff) /* Address indicating an error return. */ -#define INADDR_NONE ((in_addr_t) 0xffffffff) +#define INADDR_NONE ((uint32_t) 0xffffffff) /* Network number for local host loopback. */ #define IN_LOOPBACKNET 127 /* Address to loopback in software to local host. 
*/ #ifndef INADDR_LOOPBACK -# define INADDR_LOOPBACK ((in_addr_t) 0x7f000001) /* Inet 127.0.0.1. */ +# define INADDR_LOOPBACK ((uint32_t) 0x7f000001) /* Inet 127.0.0.1. */ #endif /* Defines for Multicast INADDR. */ -#define INADDR_UNSPEC_GROUP ((in_addr_t) 0xe0000000) /* 224.0.0.0 */ -#define INADDR_ALLHOSTS_GROUP ((in_addr_t) 0xe0000001) /* 224.0.0.1 */ -#define INADDR_ALLRTRS_GROUP ((in_addr_t) 0xe0000002) /* 224.0.0.2 */ -#define INADDR_MAX_LOCAL_GROUP ((in_addr_t) 0xe00000ff) /* 224.0.0.255 */ +#define INADDR_UNSPEC_GROUP ((uint32_t) 0xe0000000) /* 224.0.0.0 */ +#define INADDR_ALLHOSTS_GROUP ((uint32_t) 0xe0000001) /* 224.0.0.1 */ +#define INADDR_ALLRTRS_GROUP ((uint32_t) 0xe0000002) /* 224.0.0.2 */ +#define INADDR_MAX_LOCAL_GROUP ((uint32_t) 0xe00000ff) /* 224.0.0.255 */ #ifndef __USE_KERNEL_IPV6_DEFS /* IPv6 address */ @@ -212,8 +212,8 @@ { uint8_t __u6_addr8[16]; #if defined __USE_MISC || defined __USE_GNU - uint16_t __u6_addr16[8]; - uint32_t __u6_addr32[4]; + __be16 __u6_addr16[8]; + __be32 __u6_addr32[4]; #endif } __in6_u; #define s6_addr __in6_u.__u6_addr8 @@ -253,7 +253,7 @@ { __SOCKADDR_COMMON (sin6_); in_port_t sin6_port; /* Transport layer port # */ - uint32_t sin6_flowinfo; /* IPv6 flow information */ + __be32 sin6_flowinfo; /* IPv6 flow information */ struct in6_addr sin6_addr; /* IPv6 address */ uint32_t sin6_scope_id; /* IPv6 scope-id */ }; @@ -371,12 +371,12 @@ this was a short-sighted decision since on different systems the types may have different representations but the values are always the same. */ -extern uint32_t ntohl (uint32_t __netlong) __THROW __attribute__ ((__const__)); -extern uint16_t ntohs (uint16_t __netshort) +extern uint32_t ntohl (__be32 __netlong) __THROW __attribute__ ((__const__)); +extern uint16_t ntohs (__be16 __netshort) __THROW __attribute__ ((__const__)); -extern uint32_t htonl (uint32_t __hostlong) +extern __be32 htonl (uint32_t __hostlong) __THROW __attribute__ ((__const__)); -extern uint16_t htons (uint16_t __hostshort) +extern __be16 htons (uint16_t __hostshort) __THROW __attribute__ ((__const__)); #include @@ -384,7 +384,7 @@ /* Get machine dependent optimized versions of byte swapping functions. */ #include -#ifdef __OPTIMIZE__ +#ifdef __disabled_OPTIMIZE__ /* We can optimize calls to the conversion functions. Either nothing has to be done or we are using directly the byte-swapping functions which often can be inlined. */ rdma-core-56.1/buildlib/sparse-include/23/000077500000000000000000000000001477342711600202625ustar00rootroot00000000000000rdma-core-56.1/buildlib/sparse-include/23/netinet-in.h.diff000066400000000000000000000106701477342711600234200ustar00rootroot00000000000000--- /usr/include/netinet/in.h 2016-11-16 15:44:03.000000000 -0700 +++ build-sparse/include/netinet/in.h 2017-03-15 13:55:43.865288477 -0600 @@ -22,12 +22,12 @@ #include #include #include - +#include __BEGIN_DECLS /* Internet address. */ -typedef uint32_t in_addr_t; +typedef __be32 in_addr_t; struct in_addr { in_addr_t s_addr; @@ -116,7 +116,7 @@ #endif /* !__USE_KERNEL_IPV6_DEFS */ /* Type to represent a port. */ -typedef uint16_t in_port_t; +typedef __be16 in_port_t; /* Standard well-known ports. 
*/ enum @@ -175,36 +175,36 @@ #define IN_CLASSB_HOST (0xffffffff & ~IN_CLASSB_NET) #define IN_CLASSB_MAX 65536 -#define IN_CLASSC(a) ((((in_addr_t)(a)) & 0xe0000000) == 0xc0000000) +#define IN_CLASSC(a) ((((uint32_t)(a)) & 0xe0000000) == 0xc0000000) #define IN_CLASSC_NET 0xffffff00 #define IN_CLASSC_NSHIFT 8 #define IN_CLASSC_HOST (0xffffffff & ~IN_CLASSC_NET) -#define IN_CLASSD(a) ((((in_addr_t)(a)) & 0xf0000000) == 0xe0000000) +#define IN_CLASSD(a) ((((uint32_t)(a)) & 0xf0000000) == 0xe0000000) #define IN_MULTICAST(a) IN_CLASSD(a) -#define IN_EXPERIMENTAL(a) ((((in_addr_t)(a)) & 0xe0000000) == 0xe0000000) -#define IN_BADCLASS(a) ((((in_addr_t)(a)) & 0xf0000000) == 0xf0000000) +#define IN_EXPERIMENTAL(a) ((((uint32_t)(a)) & 0xe0000000) == 0xe0000000) +#define IN_BADCLASS(a) ((((uint32_t)(a)) & 0xf0000000) == 0xf0000000) /* Address to accept any incoming messages. */ -#define INADDR_ANY ((in_addr_t) 0x00000000) +#define INADDR_ANY ((uint32_t) 0x00000000) /* Address to send to all hosts. */ -#define INADDR_BROADCAST ((in_addr_t) 0xffffffff) +#define INADDR_BROADCAST ((uint32_t) 0xffffffff) /* Address indicating an error return. */ -#define INADDR_NONE ((in_addr_t) 0xffffffff) +#define INADDR_NONE ((uint32_t) 0xffffffff) /* Network number for local host loopback. */ #define IN_LOOPBACKNET 127 /* Address to loopback in software to local host. */ #ifndef INADDR_LOOPBACK -# define INADDR_LOOPBACK ((in_addr_t) 0x7f000001) /* Inet 127.0.0.1. */ +# define INADDR_LOOPBACK ((uint32_t) 0x7f000001) /* Inet 127.0.0.1. */ #endif /* Defines for Multicast INADDR. */ -#define INADDR_UNSPEC_GROUP ((in_addr_t) 0xe0000000) /* 224.0.0.0 */ -#define INADDR_ALLHOSTS_GROUP ((in_addr_t) 0xe0000001) /* 224.0.0.1 */ -#define INADDR_ALLRTRS_GROUP ((in_addr_t) 0xe0000002) /* 224.0.0.2 */ -#define INADDR_MAX_LOCAL_GROUP ((in_addr_t) 0xe00000ff) /* 224.0.0.255 */ +#define INADDR_UNSPEC_GROUP ((uint32_t) 0xe0000000) /* 224.0.0.0 */ +#define INADDR_ALLHOSTS_GROUP ((uint32_t) 0xe0000001) /* 224.0.0.1 */ +#define INADDR_ALLRTRS_GROUP ((uint32_t) 0xe0000002) /* 224.0.0.2 */ +#define INADDR_MAX_LOCAL_GROUP ((uint32_t) 0xe00000ff) /* 224.0.0.255 */ #ifndef __USE_KERNEL_IPV6_DEFS /* IPv6 address */ @@ -214,8 +214,8 @@ { uint8_t __u6_addr8[16]; #ifdef __USE_MISC - uint16_t __u6_addr16[8]; - uint32_t __u6_addr32[4]; + __be16 __u6_addr16[8]; + __be32 __u6_addr32[4]; #endif } __in6_u; #define s6_addr __in6_u.__u6_addr8 @@ -255,7 +255,7 @@ { __SOCKADDR_COMMON (sin6_); in_port_t sin6_port; /* Transport layer port # */ - uint32_t sin6_flowinfo; /* IPv6 flow information */ + __be32 sin6_flowinfo; /* IPv6 flow information */ struct in6_addr sin6_addr; /* IPv6 address */ uint32_t sin6_scope_id; /* IPv6 scope-id */ }; @@ -373,12 +373,12 @@ this was a short-sighted decision since on different systems the types may have different representations but the values are always the same. */ -extern uint32_t ntohl (uint32_t __netlong) __THROW __attribute__ ((__const__)); -extern uint16_t ntohs (uint16_t __netshort) +extern uint32_t ntohl (__be32 __netlong) __THROW __attribute__ ((__const__)); +extern uint16_t ntohs (__be16 __netshort) __THROW __attribute__ ((__const__)); -extern uint32_t htonl (uint32_t __hostlong) +extern __be32 htonl (uint32_t __hostlong) __THROW __attribute__ ((__const__)); -extern uint16_t htons (uint16_t __hostshort) +extern __be16 htons (uint16_t __hostshort) __THROW __attribute__ ((__const__)); #include @@ -386,7 +386,7 @@ /* Get machine dependent optimized versions of byte swapping functions. 
*/ #include -#ifdef __OPTIMIZE__ +#ifdef __disabled_OPTIMIZE__ /* We can optimize calls to the conversion functions. Either nothing has to be done or we are using directly the byte-swapping functions which often can be inlined. */ rdma-core-56.1/buildlib/sparse-include/23/sys-socket.h.diff000066400000000000000000000010211477342711600234400ustar00rootroot00000000000000--- /usr/include/sys/socket.h 2016-11-16 15:43:53.000000000 -0700 +++ build-sparse/include/sys/socket.h 2017-03-15 12:43:28.736376893 -0600 @@ -65,7 +65,7 @@ uses with any of the listed types to be allowed without complaint. G++ 2.7 does not support transparent unions so there we want the old-style declaration, too. */ -#if defined __cplusplus || !__GNUC_PREREQ (2, 7) || !defined __USE_GNU +#if 1 # define __SOCKADDR_ARG struct sockaddr *__restrict # define __CONST_SOCKADDR_ARG const struct sockaddr * #else rdma-core-56.1/buildlib/sparse-include/25/000077500000000000000000000000001477342711600202645ustar00rootroot00000000000000rdma-core-56.1/buildlib/sparse-include/25/netinet-in.h.diff000066400000000000000000000106211477342711600234160ustar00rootroot00000000000000--- /usr/include/netinet/in.h 2017-03-09 00:51:29.000000000 +0000 +++ build-tumbleweed/include/netinet/in.h 2017-03-21 18:13:51.951339197 +0000 @@ -22,12 +22,12 @@ #include #include #include - +#include __BEGIN_DECLS /* Internet address. */ -typedef uint32_t in_addr_t; +typedef __be32 in_addr_t; struct in_addr { in_addr_t s_addr; @@ -116,7 +116,7 @@ #endif /* !__USE_KERNEL_IPV6_DEFS */ /* Type to represent a port. */ -typedef uint16_t in_port_t; +typedef __be16 in_port_t; /* Standard well-known ports. */ enum @@ -175,36 +175,36 @@ #define IN_CLASSB_HOST (0xffffffff & ~IN_CLASSB_NET) #define IN_CLASSB_MAX 65536 -#define IN_CLASSC(a) ((((in_addr_t)(a)) & 0xe0000000) == 0xc0000000) +#define IN_CLASSC(a) ((((uint32_t)(a)) & 0xe0000000) == 0xc0000000) #define IN_CLASSC_NET 0xffffff00 #define IN_CLASSC_NSHIFT 8 #define IN_CLASSC_HOST (0xffffffff & ~IN_CLASSC_NET) -#define IN_CLASSD(a) ((((in_addr_t)(a)) & 0xf0000000) == 0xe0000000) +#define IN_CLASSD(a) ((((uint32_t)(a)) & 0xf0000000) == 0xe0000000) #define IN_MULTICAST(a) IN_CLASSD(a) -#define IN_EXPERIMENTAL(a) ((((in_addr_t)(a)) & 0xe0000000) == 0xe0000000) -#define IN_BADCLASS(a) ((((in_addr_t)(a)) & 0xf0000000) == 0xf0000000) +#define IN_EXPERIMENTAL(a) ((((uint32_t)(a)) & 0xe0000000) == 0xe0000000) +#define IN_BADCLASS(a) ((((uint32_t)(a)) & 0xf0000000) == 0xf0000000) /* Address to accept any incoming messages. */ -#define INADDR_ANY ((in_addr_t) 0x00000000) +#define INADDR_ANY ((uint32_t) 0x00000000) /* Address to send to all hosts. */ -#define INADDR_BROADCAST ((in_addr_t) 0xffffffff) +#define INADDR_BROADCAST ((uint32_t) 0xffffffff) /* Address indicating an error return. */ -#define INADDR_NONE ((in_addr_t) 0xffffffff) +#define INADDR_NONE ((uint32_t) 0xffffffff) /* Network number for local host loopback. */ #define IN_LOOPBACKNET 127 /* Address to loopback in software to local host. */ #ifndef INADDR_LOOPBACK -# define INADDR_LOOPBACK ((in_addr_t) 0x7f000001) /* Inet 127.0.0.1. */ +# define INADDR_LOOPBACK ((uint32_t) 0x7f000001) /* Inet 127.0.0.1. */ #endif /* Defines for Multicast INADDR. 
*/ -#define INADDR_UNSPEC_GROUP ((in_addr_t) 0xe0000000) /* 224.0.0.0 */ -#define INADDR_ALLHOSTS_GROUP ((in_addr_t) 0xe0000001) /* 224.0.0.1 */ -#define INADDR_ALLRTRS_GROUP ((in_addr_t) 0xe0000002) /* 224.0.0.2 */ -#define INADDR_MAX_LOCAL_GROUP ((in_addr_t) 0xe00000ff) /* 224.0.0.255 */ +#define INADDR_UNSPEC_GROUP ((uint32_t) 0xe0000000) /* 224.0.0.0 */ +#define INADDR_ALLHOSTS_GROUP ((uint32_t) 0xe0000001) /* 224.0.0.1 */ +#define INADDR_ALLRTRS_GROUP ((uint32_t) 0xe0000002) /* 224.0.0.2 */ +#define INADDR_MAX_LOCAL_GROUP ((uint32_t) 0xe00000ff) /* 224.0.0.255 */ #if !__USE_KERNEL_IPV6_DEFS /* IPv6 address */ @@ -213,8 +213,8 @@ union { uint8_t __u6_addr8[16]; - uint16_t __u6_addr16[8]; - uint32_t __u6_addr32[4]; + __be16 __u6_addr16[8]; + __be32 __u6_addr32[4]; } __in6_u; #define s6_addr __in6_u.__u6_addr8 #ifdef __USE_MISC @@ -253,7 +253,7 @@ { __SOCKADDR_COMMON (sin6_); in_port_t sin6_port; /* Transport layer port # */ - uint32_t sin6_flowinfo; /* IPv6 flow information */ + __be32 sin6_flowinfo; /* IPv6 flow information */ struct in6_addr sin6_addr; /* IPv6 address */ uint32_t sin6_scope_id; /* IPv6 scope-id */ }; @@ -371,12 +371,12 @@ this was a short-sighted decision since on different systems the types may have different representations but the values are always the same. */ -extern uint32_t ntohl (uint32_t __netlong) __THROW __attribute__ ((__const__)); -extern uint16_t ntohs (uint16_t __netshort) +extern uint32_t ntohl (__be32 __netlong) __THROW __attribute__ ((__const__)); +extern uint16_t ntohs (__be16 __netshort) __THROW __attribute__ ((__const__)); -extern uint32_t htonl (uint32_t __hostlong) +extern __be32 htonl (uint32_t __hostlong) __THROW __attribute__ ((__const__)); -extern uint16_t htons (uint16_t __hostshort) +extern __be16 htons (uint16_t __hostshort) __THROW __attribute__ ((__const__)); #include @@ -385,7 +385,7 @@ #include #include -#ifdef __OPTIMIZE__ +#ifdef __disabled_OPTIMIZE__ /* We can optimize calls to the conversion functions. Either nothing has to be done or we are using directly the byte-swapping functions which often can be inlined. 
*/ rdma-core-56.1/buildlib/sparse-include/27/000077500000000000000000000000001477342711600202665ustar00rootroot00000000000000rdma-core-56.1/buildlib/sparse-include/27/bits-sysmacros.h.diff000066400000000000000000000017071477342711600243350ustar00rootroot00000000000000--- /usr/include/bits/sysmacros.h 2018-04-16 20:14:20.000000000 +0000 +++ include/bits/sysmacros.h 2019-05-16 19:30:02.096174695 +0000 @@ -40,8 +40,8 @@ __SYSMACROS_DECLARE_MAJOR (DECL_TEMPL) \ { \ unsigned int __major; \ - __major = ((__dev & (__dev_t) 0x00000000000fff00u) >> 8); \ - __major |= ((__dev & (__dev_t) 0xfffff00000000000u) >> 32); \ + __major = ((__dev & (__dev_t) 0x00000000000fff00ul) >> 8); \ + __major |= ((__dev & (__dev_t) 0xfffff00000000000ul) >> 32); \ return __major; \ } @@ -52,8 +52,8 @@ __SYSMACROS_DECLARE_MINOR (DECL_TEMPL) \ { \ unsigned int __minor; \ - __minor = ((__dev & (__dev_t) 0x00000000000000ffu) >> 0); \ - __minor |= ((__dev & (__dev_t) 0x00000ffffff00000u) >> 12); \ + __minor = ((__dev & (__dev_t) 0x00000000000000fful) >> 0); \ + __minor |= ((__dev & (__dev_t) 0x00000ffffff00000ul) >> 12); \ return __minor; \ } rdma-core-56.1/buildlib/sparse-include/27/netinet-in.h.diff000066400000000000000000000106131477342711600234210ustar00rootroot00000000000000--- /usr/include/netinet/in.h 2018-04-16 20:14:20.000000000 +0000 +++ include/netinet/in.h 2019-05-16 19:22:42.725853784 +0000 @@ -22,12 +22,12 @@ #include #include #include - +#include __BEGIN_DECLS /* Internet address. */ -typedef uint32_t in_addr_t; +typedef __be32 in_addr_t; struct in_addr { in_addr_t s_addr; @@ -116,7 +116,7 @@ #endif /* !__USE_KERNEL_IPV6_DEFS */ /* Type to represent a port. */ -typedef uint16_t in_port_t; +typedef __be16 in_port_t; /* Standard well-known ports. */ enum @@ -175,36 +175,36 @@ #define IN_CLASSB_HOST (0xffffffff & ~IN_CLASSB_NET) #define IN_CLASSB_MAX 65536 -#define IN_CLASSC(a) ((((in_addr_t)(a)) & 0xe0000000) == 0xc0000000) +#define IN_CLASSC(a) ((((uint32_t)(a)) & 0xe0000000) == 0xc0000000) #define IN_CLASSC_NET 0xffffff00 #define IN_CLASSC_NSHIFT 8 #define IN_CLASSC_HOST (0xffffffff & ~IN_CLASSC_NET) -#define IN_CLASSD(a) ((((in_addr_t)(a)) & 0xf0000000) == 0xe0000000) +#define IN_CLASSD(a) ((((uint32_t)(a)) & 0xf0000000) == 0xe0000000) #define IN_MULTICAST(a) IN_CLASSD(a) -#define IN_EXPERIMENTAL(a) ((((in_addr_t)(a)) & 0xe0000000) == 0xe0000000) -#define IN_BADCLASS(a) ((((in_addr_t)(a)) & 0xf0000000) == 0xf0000000) +#define IN_EXPERIMENTAL(a) ((((uint32_t)(a)) & 0xe0000000) == 0xe0000000) +#define IN_BADCLASS(a) ((((uint32_t)(a)) & 0xf0000000) == 0xf0000000) /* Address to accept any incoming messages. */ -#define INADDR_ANY ((in_addr_t) 0x00000000) +#define INADDR_ANY ((uint32_t) 0x00000000) /* Address to send to all hosts. */ -#define INADDR_BROADCAST ((in_addr_t) 0xffffffff) +#define INADDR_BROADCAST ((uint32_t) 0xffffffff) /* Address indicating an error return. */ -#define INADDR_NONE ((in_addr_t) 0xffffffff) +#define INADDR_NONE ((uint32_t) 0xffffffff) /* Network number for local host loopback. */ #define IN_LOOPBACKNET 127 /* Address to loopback in software to local host. */ #ifndef INADDR_LOOPBACK -# define INADDR_LOOPBACK ((in_addr_t) 0x7f000001) /* Inet 127.0.0.1. */ +# define INADDR_LOOPBACK ((uint32_t) 0x7f000001) /* Inet 127.0.0.1. */ #endif /* Defines for Multicast INADDR. 
*/ -#define INADDR_UNSPEC_GROUP ((in_addr_t) 0xe0000000) /* 224.0.0.0 */ -#define INADDR_ALLHOSTS_GROUP ((in_addr_t) 0xe0000001) /* 224.0.0.1 */ -#define INADDR_ALLRTRS_GROUP ((in_addr_t) 0xe0000002) /* 224.0.0.2 */ -#define INADDR_MAX_LOCAL_GROUP ((in_addr_t) 0xe00000ff) /* 224.0.0.255 */ +#define INADDR_UNSPEC_GROUP ((uint32_t) 0xe0000000) /* 224.0.0.0 */ +#define INADDR_ALLHOSTS_GROUP ((uint32_t) 0xe0000001) /* 224.0.0.1 */ +#define INADDR_ALLRTRS_GROUP ((uint32_t) 0xe0000002) /* 224.0.0.2 */ +#define INADDR_MAX_LOCAL_GROUP ((uint32_t) 0xe00000ff) /* 224.0.0.255 */ #if !__USE_KERNEL_IPV6_DEFS /* IPv6 address */ @@ -213,8 +213,8 @@ union { uint8_t __u6_addr8[16]; - uint16_t __u6_addr16[8]; - uint32_t __u6_addr32[4]; + __be16 __u6_addr16[8]; + __be32 __u6_addr32[4]; } __in6_u; #define s6_addr __in6_u.__u6_addr8 #ifdef __USE_MISC @@ -253,7 +253,7 @@ { __SOCKADDR_COMMON (sin6_); in_port_t sin6_port; /* Transport layer port # */ - uint32_t sin6_flowinfo; /* IPv6 flow information */ + __be32 sin6_flowinfo; /* IPv6 flow information */ struct in6_addr sin6_addr; /* IPv6 address */ uint32_t sin6_scope_id; /* IPv6 scope-id */ }; @@ -371,12 +371,12 @@ this was a short-sighted decision since on different systems the types may have different representations but the values are always the same. */ -extern uint32_t ntohl (uint32_t __netlong) __THROW __attribute__ ((__const__)); -extern uint16_t ntohs (uint16_t __netshort) +extern uint32_t ntohl (__be32 __netlong) __THROW __attribute__ ((__const__)); +extern uint16_t ntohs (__be16 __netshort) __THROW __attribute__ ((__const__)); -extern uint32_t htonl (uint32_t __hostlong) +extern __be32 htonl (uint32_t __hostlong) __THROW __attribute__ ((__const__)); -extern uint16_t htons (uint16_t __hostshort) +extern __be16 htons (uint16_t __hostshort) __THROW __attribute__ ((__const__)); #include @@ -385,7 +385,7 @@ #include #include -#ifdef __OPTIMIZE__ +#ifdef __disabled_OPTIMIZE__ /* We can optimize calls to the conversion functions. Either nothing has to be done or we are using directly the byte-swapping functions which often can be inlined. */ rdma-core-56.1/buildlib/sparse-include/27/stdlib.h.diff000066400000000000000000000013671477342711600226360ustar00rootroot00000000000000--- /usr/include/stdlib.h 2018-04-16 20:14:20.000000000 +0000 +++ include/stdlib.h 2019-05-16 19:38:38.071615242 +0000 @@ -130,6 +130,20 @@ /* Likewise for '_FloatN' and '_FloatNx'. */ +/* For whatever reason our sparse does not understand these new compiler types */ +#undef __GLIBC_USE_IEC_60559_TYPES_EXT +#define __GLIBC_USE_IEC_60559_TYPES_EXT 0 +#undef __HAVE_FLOAT32 +#define __HAVE_FLOAT32 0 +#undef __HAVE_FLOAT32X +#define __HAVE_FLOAT32X 0 +#undef __HAVE_FLOAT64 +#define __HAVE_FLOAT64 0 +#undef __HAVE_FLOAT64X +#define __HAVE_FLOAT64X 0 +#undef __HAVE_FLOAT128 +#define __HAVE_FLOAT128 0 + #if __HAVE_FLOAT16 && __GLIBC_USE (IEC_60559_TYPES_EXT) extern _Float16 strtof16 (const char *__restrict __nptr, char **__restrict __endptr) rdma-core-56.1/buildlib/sparse-include/27/sys-socket.h.diff000066400000000000000000000010041477342711600234450ustar00rootroot00000000000000--- /usr/include/sys/socket.h 2018-04-16 20:14:20.000000000 +0000 +++ include/sys/socket.h 2019-05-16 19:22:42.721853727 +0000 @@ -54,7 +54,7 @@ uses with any of the listed types to be allowed without complaint. G++ 2.7 does not support transparent unions so there we want the old-style declaration, too. 
*/ -#if defined __cplusplus || !__GNUC_PREREQ (2, 7) || !defined __USE_GNU +#if 1 # define __SOCKADDR_ARG struct sockaddr *__restrict # define __CONST_SOCKADDR_ARG const struct sockaddr * #else rdma-core-56.1/buildlib/sparse-include/31/000077500000000000000000000000001477342711600202615ustar00rootroot00000000000000rdma-core-56.1/buildlib/sparse-include/31/bits-sysmacros.h.diff000066400000000000000000000017301477342711600243240ustar00rootroot00000000000000--- /usr/include/x86_64-linux-gnu/bits/sysmacros.h 2020-04-14 19:26:04.000000000 +0000 +++ include/bits/sysmacros.h 2020-05-05 19:03:23.910980758 +0000 @@ -40,8 +40,8 @@ __SYSMACROS_DECLARE_MAJOR (DECL_TEMPL) \ { \ unsigned int __major; \ - __major = ((__dev & (__dev_t) 0x00000000000fff00u) >> 8); \ - __major |= ((__dev & (__dev_t) 0xfffff00000000000u) >> 32); \ + __major = ((__dev & (__dev_t) 0x00000000000fff00ul) >> 8); \ + __major |= ((__dev & (__dev_t) 0xfffff00000000000ul) >> 32); \ return __major; \ } @@ -52,8 +52,8 @@ __SYSMACROS_DECLARE_MINOR (DECL_TEMPL) \ { \ unsigned int __minor; \ - __minor = ((__dev & (__dev_t) 0x00000000000000ffu) >> 0); \ - __minor |= ((__dev & (__dev_t) 0x00000ffffff00000u) >> 12); \ + __minor = ((__dev & (__dev_t) 0x00000000000000fful) >> 0); \ + __minor |= ((__dev & (__dev_t) 0x00000ffffff00000ul) >> 12); \ return __minor; \ } rdma-core-56.1/buildlib/sparse-include/31/netinet-in.h.diff000066400000000000000000000110441477342711600234130ustar00rootroot00000000000000--- /usr/include/netinet/in.h 2020-04-14 19:26:04.000000000 +0000 +++ include/netinet/in.h 2020-05-05 19:11:08.250904392 +0000 @@ -22,12 +22,12 @@ #include #include #include - +#include __BEGIN_DECLS /* Internet address. */ -typedef uint32_t in_addr_t; +typedef __be32 in_addr_t; struct in_addr { in_addr_t s_addr; @@ -116,7 +116,7 @@ #endif /* !__USE_KERNEL_IPV6_DEFS */ /* Type to represent a port. */ -typedef uint16_t in_port_t; +typedef __be16 in_port_t; /* Standard well-known ports. */ enum @@ -175,37 +175,37 @@ #define IN_CLASSB_HOST (0xffffffff & ~IN_CLASSB_NET) #define IN_CLASSB_MAX 65536 -#define IN_CLASSC(a) ((((in_addr_t)(a)) & 0xe0000000) == 0xc0000000) +#define IN_CLASSC(a) ((((uint32_t)(a)) & 0xe0000000) == 0xc0000000) #define IN_CLASSC_NET 0xffffff00 #define IN_CLASSC_NSHIFT 8 #define IN_CLASSC_HOST (0xffffffff & ~IN_CLASSC_NET) -#define IN_CLASSD(a) ((((in_addr_t)(a)) & 0xf0000000) == 0xe0000000) +#define IN_CLASSD(a) ((((uint32_t)(a)) & 0xf0000000) == 0xe0000000) #define IN_MULTICAST(a) IN_CLASSD(a) -#define IN_EXPERIMENTAL(a) ((((in_addr_t)(a)) & 0xe0000000) == 0xe0000000) -#define IN_BADCLASS(a) ((((in_addr_t)(a)) & 0xf0000000) == 0xf0000000) +#define IN_EXPERIMENTAL(a) ((((uint32_t)(a)) & 0xe0000000) == 0xe0000000) +#define IN_BADCLASS(a) ((((uint32_t)(a)) & 0xf0000000) == 0xf0000000) /* Address to accept any incoming messages. */ -#define INADDR_ANY ((in_addr_t) 0x00000000) +#define INADDR_ANY ((uint32_t) 0x00000000) /* Address to send to all hosts. */ -#define INADDR_BROADCAST ((in_addr_t) 0xffffffff) +#define INADDR_BROADCAST ((uint32_t) 0xffffffff) /* Address indicating an error return. */ -#define INADDR_NONE ((in_addr_t) 0xffffffff) +#define INADDR_NONE ((uint32_t) 0xffffffff) /* Network number for local host loopback. */ #define IN_LOOPBACKNET 127 /* Address to loopback in software to local host. */ #ifndef INADDR_LOOPBACK -# define INADDR_LOOPBACK ((in_addr_t) 0x7f000001) /* Inet 127.0.0.1. */ +# define INADDR_LOOPBACK ((uint32_t) 0x7f000001) /* Inet 127.0.0.1. */ #endif /* Defines for Multicast INADDR. 
*/ -#define INADDR_UNSPEC_GROUP ((in_addr_t) 0xe0000000) /* 224.0.0.0 */ -#define INADDR_ALLHOSTS_GROUP ((in_addr_t) 0xe0000001) /* 224.0.0.1 */ -#define INADDR_ALLRTRS_GROUP ((in_addr_t) 0xe0000002) /* 224.0.0.2 */ -#define INADDR_ALLSNOOPERS_GROUP ((in_addr_t) 0xe000006a) /* 224.0.0.106 */ -#define INADDR_MAX_LOCAL_GROUP ((in_addr_t) 0xe00000ff) /* 224.0.0.255 */ +#define INADDR_UNSPEC_GROUP ((uint32_t) 0xe0000000) /* 224.0.0.0 */ +#define INADDR_ALLHOSTS_GROUP ((uint32_t) 0xe0000001) /* 224.0.0.1 */ +#define INADDR_ALLRTRS_GROUP ((uint32_t) 0xe0000002) /* 224.0.0.2 */ +#define INADDR_ALLSNOOPERS_GROUP ((uint32_t) 0xe000006a) /* 224.0.0.106 */ +#define INADDR_MAX_LOCAL_GROUP ((uint32_t) 0xe00000ff) /* 224.0.0.255 */ #if !__USE_KERNEL_IPV6_DEFS /* IPv6 address */ @@ -214,8 +214,8 @@ union { uint8_t __u6_addr8[16]; - uint16_t __u6_addr16[8]; - uint32_t __u6_addr32[4]; + __be16 __u6_addr16[8]; + __be32 __u6_addr32[4]; } __in6_u; #define s6_addr __in6_u.__u6_addr8 #ifdef __USE_MISC @@ -254,7 +254,7 @@ { __SOCKADDR_COMMON (sin6_); in_port_t sin6_port; /* Transport layer port # */ - uint32_t sin6_flowinfo; /* IPv6 flow information */ + __be32 sin6_flowinfo; /* IPv6 flow information */ struct in6_addr sin6_addr; /* IPv6 address */ uint32_t sin6_scope_id; /* IPv6 scope-id */ }; @@ -372,12 +372,12 @@ this was a short-sighted decision since on different systems the types may have different representations but the values are always the same. */ -extern uint32_t ntohl (uint32_t __netlong) __THROW __attribute__ ((__const__)); -extern uint16_t ntohs (uint16_t __netshort) +extern uint32_t ntohl (__be32 __netlong) __THROW __attribute__ ((__const__)); +extern uint16_t ntohs (__be16 __netshort) __THROW __attribute__ ((__const__)); -extern uint32_t htonl (uint32_t __hostlong) +extern __be32 htonl (uint32_t __hostlong) __THROW __attribute__ ((__const__)); -extern uint16_t htons (uint16_t __hostshort) +extern __be16 htons (uint16_t __hostshort) __THROW __attribute__ ((__const__)); #include @@ -386,7 +386,7 @@ #include #include -#ifdef __OPTIMIZE__ +#ifdef __disabled_OPTIMIZE__ /* We can optimize calls to the conversion functions. Either nothing has to be done or we are using directly the byte-swapping functions which often can be inlined. */ rdma-core-56.1/buildlib/sparse-include/31/stdlib.h.diff000066400000000000000000000013671477342711600226310ustar00rootroot00000000000000--- /usr/include/stdlib.h 2020-04-14 19:26:04.000000000 +0000 +++ include/stdlib.h 2020-05-05 19:03:23.910980758 +0000 @@ -130,6 +130,20 @@ /* Likewise for '_FloatN' and '_FloatNx'. */ +/* For whatever reason our sparse does not understand these new compiler types */ +#undef __GLIBC_USE_IEC_60559_TYPES_EXT +#define __GLIBC_USE_IEC_60559_TYPES_EXT 0 +#undef __HAVE_FLOAT32 +#define __HAVE_FLOAT32 0 +#undef __HAVE_FLOAT32X +#define __HAVE_FLOAT32X 0 +#undef __HAVE_FLOAT64 +#define __HAVE_FLOAT64 0 +#undef __HAVE_FLOAT64X +#define __HAVE_FLOAT64X 0 +#undef __HAVE_FLOAT128 +#define __HAVE_FLOAT128 0 + #if __HAVE_FLOAT16 && __GLIBC_USE (IEC_60559_TYPES_EXT) extern _Float16 strtof16 (const char *__restrict __nptr, char **__restrict __endptr) rdma-core-56.1/buildlib/sparse-include/31/sys-socket.h.diff000066400000000000000000000010251477342711600234430ustar00rootroot00000000000000--- /usr/include/x86_64-linux-gnu/sys/socket.h 2020-04-14 19:26:04.000000000 +0000 +++ include/sys/socket.h 2020-05-05 19:03:23.910980758 +0000 @@ -54,7 +54,7 @@ uses with any of the listed types to be allowed without complaint. 
G++ 2.7 does not support transparent unions so there we want the old-style declaration, too. */ -#if defined __cplusplus || !__GNUC_PREREQ (2, 7) || !defined __USE_GNU +#if 1 # define __SOCKADDR_ARG struct sockaddr *__restrict # define __CONST_SOCKADDR_ARG const struct sockaddr * #else rdma-core-56.1/buildlib/sparse-include/35/000077500000000000000000000000001477342711600202655ustar00rootroot00000000000000rdma-core-56.1/buildlib/sparse-include/35/bits-sysmacros.h.diff000066400000000000000000000017301477342711600243300ustar00rootroot00000000000000--- /usr/include/x86_64-linux-gnu/bits/sysmacros.h 2022-07-06 23:23:23.000000000 +0000 +++ include/bits/sysmacros.h 2022-11-25 17:37:07.471320582 +0000 @@ -40,8 +40,8 @@ __SYSMACROS_DECLARE_MAJOR (DECL_TEMPL) \ { \ unsigned int __major; \ - __major = ((__dev & (__dev_t) 0x00000000000fff00u) >> 8); \ - __major |= ((__dev & (__dev_t) 0xfffff00000000000u) >> 32); \ + __major = ((__dev & (__dev_t) 0x00000000000fff00ul) >> 8); \ + __major |= ((__dev & (__dev_t) 0xfffff00000000000ul) >> 32); \ return __major; \ } @@ -52,8 +52,8 @@ __SYSMACROS_DECLARE_MINOR (DECL_TEMPL) \ { \ unsigned int __minor; \ - __minor = ((__dev & (__dev_t) 0x00000000000000ffu) >> 0); \ - __minor |= ((__dev & (__dev_t) 0x00000ffffff00000u) >> 12); \ + __minor = ((__dev & (__dev_t) 0x00000000000000fful) >> 0); \ + __minor |= ((__dev & (__dev_t) 0x00000ffffff00000ul) >> 12); \ return __minor; \ } rdma-core-56.1/buildlib/sparse-include/35/netinet-in.h.diff000066400000000000000000000113271477342711600234230ustar00rootroot00000000000000--- /usr/include/netinet/in.h 2022-07-06 23:23:23.000000000 +0000 +++ include/netinet/in.h 2022-11-25 17:37:07.467320507 +0000 @@ -22,12 +22,12 @@ #include #include #include - +#include __BEGIN_DECLS /* Internet address. */ -typedef uint32_t in_addr_t; +typedef __be32 in_addr_t; struct in_addr { in_addr_t s_addr; @@ -120,7 +120,7 @@ #endif /* !__USE_KERNEL_IPV6_DEFS */ /* Type to represent a port. */ -typedef uint16_t in_port_t; +typedef __be16 in_port_t; /* Standard well-known ports. */ enum @@ -179,40 +179,40 @@ #define IN_CLASSB_HOST (0xffffffff & ~IN_CLASSB_NET) #define IN_CLASSB_MAX 65536 -#define IN_CLASSC(a) ((((in_addr_t)(a)) & 0xe0000000) == 0xc0000000) +#define IN_CLASSC(a) ((((uint32_t)(a)) & 0xe0000000) == 0xc0000000) #define IN_CLASSC_NET 0xffffff00 #define IN_CLASSC_NSHIFT 8 #define IN_CLASSC_HOST (0xffffffff & ~IN_CLASSC_NET) -#define IN_CLASSD(a) ((((in_addr_t)(a)) & 0xf0000000) == 0xe0000000) +#define IN_CLASSD(a) ((((uint32_t)(a)) & 0xf0000000) == 0xe0000000) #define IN_MULTICAST(a) IN_CLASSD(a) -#define IN_EXPERIMENTAL(a) ((((in_addr_t)(a)) & 0xe0000000) == 0xe0000000) -#define IN_BADCLASS(a) ((((in_addr_t)(a)) & 0xf0000000) == 0xf0000000) +#define IN_EXPERIMENTAL(a) ((((uint32_t)(a)) & 0xe0000000) == 0xe0000000) +#define IN_BADCLASS(a) ((((uint32_t)(a)) & 0xf0000000) == 0xf0000000) /* Address to accept any incoming messages. */ -#define INADDR_ANY ((in_addr_t) 0x00000000) +#define INADDR_ANY ((uint32_t) 0x00000000) /* Address to send to all hosts. */ -#define INADDR_BROADCAST ((in_addr_t) 0xffffffff) +#define INADDR_BROADCAST ((uint32_t) 0xffffffff) /* Address indicating an error return. */ -#define INADDR_NONE ((in_addr_t) 0xffffffff) +#define INADDR_NONE ((uint32_t) 0xffffffff) /* Dummy address for source of ICMPv6 errors converted to IPv4 (RFC 7600). */ -#define INADDR_DUMMY ((in_addr_t) 0xc0000008) +#define INADDR_DUMMY ((uint32_t) 0xc0000008) /* Network number for local host loopback. 
*/ #define IN_LOOPBACKNET 127 /* Address to loopback in software to local host. */ #ifndef INADDR_LOOPBACK -# define INADDR_LOOPBACK ((in_addr_t) 0x7f000001) /* Inet 127.0.0.1. */ +# define INADDR_LOOPBACK ((uint32_t) 0x7f000001) /* Inet 127.0.0.1. */ #endif /* Defines for Multicast INADDR. */ -#define INADDR_UNSPEC_GROUP ((in_addr_t) 0xe0000000) /* 224.0.0.0 */ -#define INADDR_ALLHOSTS_GROUP ((in_addr_t) 0xe0000001) /* 224.0.0.1 */ -#define INADDR_ALLRTRS_GROUP ((in_addr_t) 0xe0000002) /* 224.0.0.2 */ -#define INADDR_ALLSNOOPERS_GROUP ((in_addr_t) 0xe000006a) /* 224.0.0.106 */ -#define INADDR_MAX_LOCAL_GROUP ((in_addr_t) 0xe00000ff) /* 224.0.0.255 */ +#define INADDR_UNSPEC_GROUP ((uint32_t) 0xe0000000) /* 224.0.0.0 */ +#define INADDR_ALLHOSTS_GROUP ((uint32_t) 0xe0000001) /* 224.0.0.1 */ +#define INADDR_ALLRTRS_GROUP ((uint32_t) 0xe0000002) /* 224.0.0.2 */ +#define INADDR_ALLSNOOPERS_GROUP ((uint32_t) 0xe000006a) /* 224.0.0.106 */ +#define INADDR_MAX_LOCAL_GROUP ((uint32_t) 0xe00000ff) /* 224.0.0.255 */ #if !__USE_KERNEL_IPV6_DEFS /* IPv6 address */ @@ -221,8 +221,8 @@ union { uint8_t __u6_addr8[16]; - uint16_t __u6_addr16[8]; - uint32_t __u6_addr32[4]; + __be16 __u6_addr16[8]; + __be32 __u6_addr32[4]; } __in6_u; #define s6_addr __in6_u.__u6_addr8 #ifdef __USE_MISC @@ -261,7 +261,7 @@ { __SOCKADDR_COMMON (sin6_); in_port_t sin6_port; /* Transport layer port # */ - uint32_t sin6_flowinfo; /* IPv6 flow information */ + __be32 sin6_flowinfo; /* IPv6 flow information */ struct in6_addr sin6_addr; /* IPv6 address */ uint32_t sin6_scope_id; /* IPv6 scope-id */ }; @@ -379,12 +379,12 @@ this was a short-sighted decision since on different systems the types may have different representations but the values are always the same. */ -extern uint32_t ntohl (uint32_t __netlong) __THROW __attribute__ ((__const__)); -extern uint16_t ntohs (uint16_t __netshort) +extern uint32_t ntohl (__be32 __netlong) __THROW __attribute__ ((__const__)); +extern uint16_t ntohs (__be16 __netshort) __THROW __attribute__ ((__const__)); -extern uint32_t htonl (uint32_t __hostlong) +extern __be32 htonl (uint32_t __hostlong) __THROW __attribute__ ((__const__)); -extern uint16_t htons (uint16_t __hostshort) +extern __be16 htons (uint16_t __hostshort) __THROW __attribute__ ((__const__)); #include @@ -393,7 +393,7 @@ #include #include -#ifdef __OPTIMIZE__ +#ifdef __disabled_OPTIMIZE__ /* We can optimize calls to the conversion functions. Either nothing has to be done or we are using directly the byte-swapping functions which often can be inlined. */ rdma-core-56.1/buildlib/sparse-include/35/stdlib.h.diff000066400000000000000000000021051477342711600226240ustar00rootroot00000000000000--- /usr/include/stdlib.h 2022-07-06 23:23:23.000000000 +0000 +++ include/stdlib.h 2022-11-25 17:40:49.239478341 +0000 @@ -131,6 +131,20 @@ /* Likewise for '_FloatN' and '_FloatNx'. 
*/ +/* For whatever reason our sparse does not understand these new compiler types */ +#undef __GLIBC_USE_IEC_60559_TYPES_EXT +#define __GLIBC_USE_IEC_60559_TYPES_EXT 0 +#undef __HAVE_FLOAT32 +#define __HAVE_FLOAT32 0 +#undef __HAVE_FLOAT32X +#define __HAVE_FLOAT32X 0 +#undef __HAVE_FLOAT64 +#define __HAVE_FLOAT64 0 +#undef __HAVE_FLOAT64X +#define __HAVE_FLOAT64X 0 +#undef __HAVE_FLOAT128 +#define __HAVE_FLOAT128 0 + #if __HAVE_FLOAT16 && __GLIBC_USE (IEC_60559_TYPES_EXT) extern _Float16 strtof16 (const char *__restrict __nptr, char **__restrict __endptr) @@ -564,10 +578,6 @@ __THROW __attribute_warn_unused_result__ __attribute_alloc_size__ ((2, 3)) __attr_dealloc_free; - -/* Add reallocarray as its own deallocator. */ -extern void *reallocarray (void *__ptr, size_t __nmemb, size_t __size) - __THROW __attr_dealloc (reallocarray, 1); #endif #ifdef __USE_MISC rdma-core-56.1/buildlib/sparse-include/35/sys-socket.h.diff000066400000000000000000000010251477342711600234470ustar00rootroot00000000000000--- /usr/include/x86_64-linux-gnu/sys/socket.h 2022-07-06 23:23:23.000000000 +0000 +++ include/sys/socket.h 2022-11-25 17:37:07.463320432 +0000 @@ -54,7 +54,7 @@ uses with any of the listed types to be allowed without complaint. G++ 2.7 does not support transparent unions so there we want the old-style declaration, too. */ -#if defined __cplusplus || !__GNUC_PREREQ (2, 7) || !defined __USE_GNU +#if 1 # define __SOCKADDR_ARG struct sockaddr *__restrict # define __CONST_SOCKADDR_ARG const struct sockaddr * #else rdma-core-56.1/buildlib/sparse-include/endian.h000066400000000000000000000024751477342711600214550ustar00rootroot00000000000000/* COPYRIGHT (c) 2017 Obsidian Research Corporation. Licensed under BSD (MIT variant) or GPLv2. See COPYING. */ #ifndef _SPARSE_ENDIAN_H_ #define _SPARSE_ENDIAN_H_ #include_next #include #undef htobe16 #undef htole16 #undef be16toh #undef le16toh #undef htobe32 #undef htole32 #undef be32toh #undef le32toh #undef htobe64 #undef htole64 #undef be64toh #undef le64toh /* These do not actually work, but this trivially ensures that sparse sees all * the types. */ #define htobe16(x) ((__force __be16)__builtin_bswap16(x)) #define htole16(x) ((__force __le16)__builtin_bswap16(x)) #define be16toh(x) ((uint16_t)__builtin_bswap16((__force uint16_t)(__be16)(x))) #define le16toh(x) ((uint16_t)__builtin_bswap16((__force uint16_t)(__le16)(x))) #define htobe32(x) ((__force __be32)__builtin_bswap32(x)) #define htole32(x) ((__force __le32)__builtin_bswap32(x)) #define be32toh(x) ((uint32_t)__builtin_bswap32((__force uint32_t)(__be32)(x))) #define le32toh(x) ((uint32_t)__builtin_bswap32((__force uint32_t)(__le32)(x))) #define htobe64(x) ((__force __be64)__builtin_bswap64(x)) #define htole64(x) ((__force __le64)__builtin_bswap64(x)) #define be64toh(x) ((uint64_t)__builtin_bswap64((__force uint64_t)(__be64)(x))) #define le64toh(x) ((uint64_t)__builtin_bswap64((__force uint64_t)(__le64)(x))) #endif rdma-core-56.1/buildlib/sparse-include/pthread.h000066400000000000000000000005401477342711600216350ustar00rootroot00000000000000/* COPYRIGHT (c) 2017 Obsidian Research Corporation. Licensed under BSD (MIT variant) or GPLv2. See COPYING. 
*/ #ifndef _SPARSE_PTHREAD_H_ #define _SPARSE_PTHREAD_H_ #include_next <pthread.h> /* Sparse complains that the glibc version of this has 0 instead of NULL */ #undef PTHREAD_MUTEX_INITIALIZER #define PTHREAD_MUTEX_INITIALIZER {} #endif rdma-core-56.1/buildlib/sparse-include/stdatomic.h000066400000000000000000000204321477342711600221770ustar00rootroot00000000000000/* COPYRIGHT (c) 2017 Obsidian Research Corporation. * Licensed under BSD (MIT variant) or GPLv2. See COPYING. * * A version of C11 stdatomic.h that doesn't make sparse angry. This doesn't * actually work. */ #ifndef _SPARSE_STDATOMIC_H_ #define _SPARSE_STDATOMIC_H_ #include #include #define _Atomic(T) struct {volatile __typeof__(T) __val; } #define ATOMIC_VAR_INIT(value) \ { \ .__val = (value) \ } #define atomic_init(obj, value) \ do { \ (obj)->__val = (value); \ } while (0) enum memory_order { memory_order_relaxed, memory_order_consume, memory_order_acquire, memory_order_release, memory_order_acq_rel, memory_order_seq_cst, }; typedef enum memory_order memory_order; #define atomic_thread_fence(order) __asm volatile("" : : : "memory") #define atomic_signal_fence(order) __asm volatile("" : : : "memory") #define atomic_is_lock_free(obj) (sizeof((obj)->__val) <= sizeof(void *)) typedef _Atomic(_Bool) atomic_bool; typedef _Atomic(char) atomic_char; typedef _Atomic(signed char) atomic_schar; typedef _Atomic(unsigned char) atomic_uchar; typedef _Atomic(short) atomic_short; typedef _Atomic(unsigned short) atomic_ushort; typedef _Atomic(int) atomic_int; typedef _Atomic(unsigned int) atomic_uint; typedef _Atomic(long) atomic_long; typedef _Atomic(unsigned long) atomic_ulong; typedef _Atomic(long long) atomic_llong; typedef _Atomic(unsigned long long) atomic_ullong; typedef _Atomic(wchar_t) atomic_wchar_t; typedef _Atomic(int_least8_t) atomic_int_least8_t; typedef _Atomic(uint_least8_t) atomic_uint_least8_t; typedef _Atomic(int_least16_t) atomic_int_least16_t; typedef _Atomic(uint_least16_t) atomic_uint_least16_t; typedef _Atomic(int_least32_t) atomic_int_least32_t; typedef _Atomic(uint_least32_t) atomic_uint_least32_t; typedef _Atomic(int_least64_t) atomic_int_least64_t; typedef _Atomic(uint_least64_t) atomic_uint_least64_t; typedef _Atomic(int_fast8_t) atomic_int_fast8_t; typedef _Atomic(uint_fast8_t) atomic_uint_fast8_t; typedef _Atomic(int_fast16_t) atomic_int_fast16_t; typedef _Atomic(uint_fast16_t) atomic_uint_fast16_t; typedef _Atomic(int_fast32_t) atomic_int_fast32_t; typedef _Atomic(uint_fast32_t) atomic_uint_fast32_t; typedef _Atomic(int_fast64_t) atomic_int_fast64_t; typedef _Atomic(uint_fast64_t) atomic_uint_fast64_t; typedef _Atomic(intptr_t) atomic_intptr_t; typedef _Atomic(uintptr_t) atomic_uintptr_t; typedef _Atomic(size_t) atomic_size_t; typedef _Atomic(ptrdiff_t) atomic_ptrdiff_t; typedef _Atomic(intmax_t) atomic_intmax_t; typedef _Atomic(uintmax_t) atomic_uintmax_t; #define atomic_compare_exchange_strong_explicit(object, expected, desired, \ success, failure) \ ({ \ __typeof__((object)->__val) __v = (object)->__val; \ bool __r; \ if (__v == *(expected)) { \ __r = true; \ (object)->__val = (desired); \ } else { \ __r = false; \ *(expected) = __v; \ } \ __r; \ }) #define atomic_compare_exchange_weak_explicit(object, expected, desired, \ success, failure) \ atomic_compare_exchange_strong_explicit(object, expected, desired, \ success, failure) #define atomic_exchange_explicit(object, desired, order) \ ({ \ __typeof__((object)->__val) __v = (object)->__val; \ (object)->__val = (desired); \ __v; \ }) #define atomic_fetch_add_explicit(object,
operand, order) \ ({ \ __typeof__((object)->__val) __v = (object)->__val; \ (object)->__val += (operand); \ __v; \ }) #define atomic_fetch_and_explicit(object, operand, order) \ ({ \ __typeof__((object)->__val) __v = (object)->__val; \ (object)->__val &= (operand); \ __v; \ }) #define atomic_fetch_or_explicit(object, operand, order) \ ({ \ __typeof__((object)->__val) __v = (object)->__val; \ (object)->__val |= (operand); \ __v; \ }) #define atomic_fetch_sub_explicit(object, operand, order) \ ({ \ __typeof__((object)->__val) __v = (object)->__val; \ (object)->__val -= (operand); \ __v; \ }) #define atomic_fetch_xor_explicit(object, operand, order) \ ({ \ __typeof__((object)->__val) __v = (object)->__val; \ (object)->__val ^= (operand); \ __v; \ }) #define atomic_load_explicit(object, order) ((object)->__val) #define atomic_store_explicit(object, desired, order) \ ({ (object)->__val = (desired); }) #define atomic_compare_exchange_strong(object, expected, desired) \ atomic_compare_exchange_strong_explicit(object, expected, desired, \ memory_order_seq_cst, \ memory_order_seq_cst) #define atomic_compare_exchange_weak(object, expected, desired) \ atomic_compare_exchange_weak_explicit(object, expected, desired, \ memory_order_seq_cst, \ memory_order_seq_cst) #define atomic_exchange(object, desired) \ atomic_exchange_explicit(object, desired, memory_order_seq_cst) #define atomic_fetch_add(object, operand) \ atomic_fetch_add_explicit(object, operand, memory_order_seq_cst) #define atomic_fetch_and(object, operand) \ atomic_fetch_and_explicit(object, operand, memory_order_seq_cst) #define atomic_fetch_or(object, operand) \ atomic_fetch_or_explicit(object, operand, memory_order_seq_cst) #define atomic_fetch_sub(object, operand) \ atomic_fetch_sub_explicit(object, operand, memory_order_seq_cst) #define atomic_fetch_xor(object, operand) \ atomic_fetch_xor_explicit(object, operand, memory_order_seq_cst) #define atomic_load(object) atomic_load_explicit(object, memory_order_seq_cst) #define atomic_store(object, desired) \ atomic_store_explicit(object, desired, memory_order_seq_cst) typedef atomic_bool atomic_flag; #define ATOMIC_FLAG_INIT ATOMIC_VAR_INIT(0) #define atomic_flag_clear_explicit(object, order) \ atomic_store_explicit(object, 0, order) #define atomic_flag_test_and_set_explicit(object, order) \ atomic_compare_exchange_strong_explicit(object, 0, 1, order, order) #define atomic_flag_clear(object) \ atomic_flag_clear_explicit(object, memory_order_seq_cst) #define atomic_flag_test_and_set(object) \ atomic_flag_test_and_set_explicit(object, memory_order_seq_cst) #endif rdma-core-56.1/buildlib/template.pc.in000066400000000000000000000005361477342711600176700ustar00rootroot00000000000000libdir=@CMAKE_INSTALL_FULL_LIBDIR@ includedir=@CMAKE_INSTALL_FULL_INCLUDEDIR@ Name: lib@PC_LIB_NAME@ Description: RDMA Core Userspace Library URL: https://github.com/linux-rdma/rdma-core Version: @PC_VERSION@ Libs: -L${libdir} -l@PC_LIB_NAME@ @PC_RPATH@ Libs.private: @PC_LIB_PRIVATE@ Requires.private: @PC_REQUIRES_PRIVATE@ Cflags: -I${includedir} rdma-core-56.1/ccan/000077500000000000000000000000001477342711600142365ustar00rootroot00000000000000rdma-core-56.1/ccan/CMakeLists.txt000066400000000000000000000005371477342711600170030ustar00rootroot00000000000000publish_internal_headers(ccan array_size.h build_assert.h check_type.h compiler.h container_of.h ilog.h list.h minmax.h str.h str_debug.h ) set(C_FILES ilog.c list.c str.c ) add_library(ccan STATIC ${C_FILES}) add_library(ccan_pic STATIC ${C_FILES}) 
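# ccan_pic is the same source list built with POSITION_INDEPENDENT_CODE (set just below), as required when the archive is linked into a shared library; the plain ccan archive is for ordinary static linking into executables.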
set_property(TARGET ccan_pic PROPERTY POSITION_INDEPENDENT_CODE TRUE) rdma-core-56.1/ccan/LICENSE.CCO000066400000000000000000000143151477342711600156520ustar00rootroot00000000000000Statement of Purpose The laws of most jurisdictions throughout the world automatically confer exclusive Copyright and Related Rights (defined below) upon the creator and subsequent owner(s) (each and all, an "owner") of an original work of authorship and/or a database (each, a "Work"). Certain owners wish to permanently relinquish those rights to a Work for the purpose of contributing to a commons of creative, cultural and scientific works ("Commons") that the public can reliably and without fear of later claims of infringement build upon, modify, incorporate in other works, reuse and redistribute as freely as possible in any form whatsoever and for any purposes, including without limitation commercial purposes. These owners may contribute to the Commons to promote the ideal of a free culture and the further production of creative, cultural and scientific works, or to gain reputation or greater distribution for their Work in part through the use and efforts of others. For these and/or other purposes and motivations, and without any expectation of additional consideration or compensation, the person associating CC0 with a Work (the "Affirmer"), to the extent that he or she is an owner of Copyright and Related Rights in the Work, voluntarily elects to apply CC0 to the Work and publicly distribute the Work under its terms, with knowledge of his or her Copyright and Related Rights in the Work and the meaning and intended legal effect of CC0 on those rights. 1. Copyright and Related Rights. A Work made available under CC0 may be protected by copyright and related or neighboring rights ("Copyright and Related Rights"). Copyright and Related Rights include, but are not limited to, the following: the right to reproduce, adapt, distribute, perform, display, communicate, and translate a Work; moral rights retained by the original author(s) and/or performer(s); publicity and privacy rights pertaining to a person's image or likeness depicted in a Work; rights protecting against unfair competition in regards to a Work, subject to the limitations in paragraph 4(a), below; rights protecting the extraction, dissemination, use and reuse of data in a Work; database rights (such as those arising under Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases, and under any national implementation thereof, including any amended or successor version of such directive); and other similar, equivalent or corresponding rights throughout the world based on applicable law or treaty, and any national implementations thereof. 2. Waiver. To the greatest extent permitted by, but not in contravention of, applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and unconditionally waives, abandons, and surrenders all of Affirmer's Copyright and Related Rights and associated claims and causes of action, whether now known or unknown (including existing as well as future claims and causes of action), in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "Waiver"). 
Affirmer makes the Waiver for the benefit of each member of the public at large and to the detriment of Affirmer's heirs and successors, fully intending that such Waiver shall not be subject to revocation, rescission, cancellation, termination, or any other legal or equitable action to disrupt the quiet enjoyment of the Work by the public as contemplated by Affirmer's express Statement of Purpose. 3. Public License Fallback. Should any part of the Waiver for any reason be judged legally invalid or ineffective under applicable law, then the Waiver shall be preserved to the maximum extent permitted taking into account Affirmer's express Statement of Purpose. In addition, to the extent the Waiver is so judged Affirmer hereby grants to each affected person a royalty-free, non transferable, non sublicensable, non exclusive, irrevocable and unconditional license to exercise Affirmer's Copyright and Related Rights in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "License"). The License shall be deemed effective as of the date CC0 was applied by Affirmer to the Work. Should any part of the License for any reason be judged legally invalid or ineffective under applicable law, such partial invalidity or ineffectiveness shall not invalidate the remainder of the License, and in such case Affirmer hereby affirms that he or she will not (i) exercise any of his or her remaining Copyright and Related Rights in the Work or (ii) assert any associated claims and causes of action with respect to the Work, in either case contrary to Affirmer's express Statement of Purpose. 4. Limitations and Disclaimers. No trademark or patent rights held by Affirmer are waived, abandoned, surrendered, licensed or otherwise affected by this document. Affirmer offers the Work as-is and makes no representations or warranties of any kind concerning the Work, express, implied, statutory or otherwise, including without limitation warranties of title, merchantability, fitness for a particular purpose, non infringement, or the absence of latent or other defects, accuracy, or the present or absence of errors, whether or not discoverable, all to the greatest extent permissible under applicable law. Affirmer disclaims responsibility for clearing rights of other persons that may apply to the Work or any use thereof, including without limitation any person's Copyright and Related Rights in the Work. Further, Affirmer disclaims responsibility for obtaining any necessary consents, permissions or other rights required for any use of the Work. Affirmer understands and acknowledges that Creative Commons is not a party to this document and has no duty or obligation with respect to this CC0 or use of the Work. 
rdma-core-56.1/ccan/LICENSE.MIT000066400000000000000000000017771477342711600157070ustar00rootroot00000000000000Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. rdma-core-56.1/ccan/array_size.h000066400000000000000000000015541477342711600165640ustar00rootroot00000000000000/* CC0 (Public domain) - see LICENSE file for details */ #ifndef CCAN_ARRAY_SIZE_H #define CCAN_ARRAY_SIZE_H #include "config.h" #include /** * ARRAY_SIZE - get the number of elements in a visible array * @arr: the array whose size you want. * * This does not work on pointers, or arrays declared as [], or * function parameters. With correct compiler support, such usage * will cause a build error (see build_assert). */ #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + _array_size_chk(arr)) #if HAVE_BUILTIN_TYPES_COMPATIBLE_P && HAVE_TYPEOF /* Two gcc extensions. * &a[0] degrades to a pointer: a different type from an array */ #define _array_size_chk(arr) \ BUILD_ASSERT_OR_ZERO(!__builtin_types_compatible_p(typeof(arr), \ typeof(&(arr)[0]))) #else #define _array_size_chk(arr) 0 #endif #endif /* CCAN_ALIGNOF_H */ rdma-core-56.1/ccan/build_assert.h000066400000000000000000000023201477342711600170640ustar00rootroot00000000000000/* CC0 (Public domain) - see LICENSE.CC0 file for details */ #ifndef CCAN_BUILD_ASSERT_H #define CCAN_BUILD_ASSERT_H /** * BUILD_ASSERT - assert a build-time dependency. * @cond: the compile-time condition which must be true. * * Your compile will fail if the condition isn't true, or can't be evaluated * by the compiler. This can only be used within a function. * * Example: * #include * ... * static char *foo_to_char(struct foo *foo) * { * // This code needs string to be at start of foo. * BUILD_ASSERT(offsetof(struct foo, string) == 0); * return (char *)foo; * } */ #define BUILD_ASSERT(cond) \ do { (void) sizeof(char [1 - 2*!(cond)]); } while(0) /** * BUILD_ASSERT_OR_ZERO - assert a build-time dependency, as an expression. * @cond: the compile-time condition which must be true. * * Your compile will fail if the condition isn't true, or can't be evaluated * by the compiler. This can be used in an expression: its value is "0". 
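* * (Both macros exploit the fact that the array size 1 - 2*!(cond) is negative exactly when @cond is false, so a false condition forces a compile-time error.)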
* * Example: * #define foo_to_char(foo) \ * ((char *)(foo) \ * + BUILD_ASSERT_OR_ZERO(offsetof(struct foo, string) == 0)) */ #define BUILD_ASSERT_OR_ZERO(cond) \ (sizeof(char [1 - 2*!(cond)]) - 1) #endif /* CCAN_BUILD_ASSERT_H */ rdma-core-56.1/ccan/check_type.h000066400000000000000000000044741477342711600165360ustar00rootroot00000000000000/* CC0 (Public domain) - see LICENSE.CC0 file for details */ #ifndef CCAN_CHECK_TYPE_H #define CCAN_CHECK_TYPE_H #include "config.h" /** * check_type - issue a warning or build failure if type is not correct. * @expr: the expression whose type we should check (not evaluated). * @type: the exact type we expect the expression to be. * * This macro is usually used within other macros to try to ensure that a macro * argument is of the expected type. No type promotion of the expression is * done: an unsigned int is not the same as an int! * * check_type() always evaluates to 0. * * If your compiler does not support typeof, then the best we can do is fail * to compile if the sizes of the types are unequal (a less complete check). * * Example: * // They should always pass a 64-bit value to _set_some_value! * #define set_some_value(expr) \ * _set_some_value((check_type((expr), uint64_t), (expr))) */ /** * check_types_match - issue a warning or build failure if types are not same. * @expr1: the first expression (not evaluated). * @expr2: the second expression (not evaluated). * * This macro is usually used within other macros to try to ensure that * arguments are of identical types. No type promotion of the expressions is * done: an unsigned int is not the same as an int! * * check_types_match() always evaluates to 0. * * If your compiler does not support typeof, then the best we can do is fail * to compile if the sizes of the types are unequal (a less complete check). * * Example: * // Do subtraction to get to enclosing type, but make sure that * // pointer is of correct type for that member. * #define container_of(mbr_ptr, encl_type, mbr) \ * (check_types_match((mbr_ptr), &((encl_type *)0)->mbr), \ * ((encl_type *) \ * ((char *)(mbr_ptr) - offsetof(enclosing_type, mbr)))) */ #if HAVE_TYPEOF #define check_type(expr, type) \ ((typeof(expr) *)0 != (type *)0) #define check_types_match(expr1, expr2) \ ((typeof(expr1) *)0 != (typeof(expr2) *)0) #else #include /* Without typeof, we can only test the sizes. */ #define check_type(expr, type) \ BUILD_ASSERT_OR_ZERO(sizeof(expr) == sizeof(type)) #define check_types_match(expr1, expr2) \ BUILD_ASSERT_OR_ZERO(sizeof(expr1) == sizeof(expr2)) #endif /* HAVE_TYPEOF */ #endif /* CCAN_CHECK_TYPE_H */ rdma-core-56.1/ccan/compiler.h000066400000000000000000000143151477342711600162250ustar00rootroot00000000000000/* CC0 (Public domain) - see LICENSE file for details */ #ifndef CCAN_COMPILER_H #define CCAN_COMPILER_H #include "config.h" #ifndef COLD /** * COLD - a function is unlikely to be called. * * Used to mark an unlikely code path and optimize appropriately. * It is usually used on logging or error routines. * * Example: * static void COLD moan(const char *reason) * { * fprintf(stderr, "Error: %s (%s)\n", reason, strerror(errno)); * } */ #define COLD __attribute__((__cold__)) #endif #ifndef NORETURN /** * NORETURN - a function does not return * * Used to mark a function which exits; useful for suppressing warnings. 
* * Example: * static void NORETURN fail(const char *reason) * { * fprintf(stderr, "Error: %s (%s)\n", reason, strerror(errno)); * exit(1); * } */ #define NORETURN __attribute__((__noreturn__)) #endif #ifndef PRINTF_FMT /** * PRINTF_FMT - a function takes printf-style arguments * @nfmt: the 1-based number of the function's format argument. * @narg: the 1-based number of the function's first variable argument. * * This allows the compiler to check your parameters as it does for printf(). * * Example: * void PRINTF_FMT(2,3) my_printf(const char *prefix, const char *fmt, ...); */ #define PRINTF_FMT(nfmt, narg) \ __attribute__((format(__printf__, nfmt, narg))) #endif #ifndef CONST_FUNCTION /** * CONST_FUNCTION - a function's return depends only on its arguments * * This allows the compiler to assume that the function will return the exact * same value for the exact same arguments. This implies that the function * must not use global variables, or dereference pointer arguments. */ #define CONST_FUNCTION __attribute__((__const__)) #endif #ifndef PURE_FUNCTION /** * PURE_FUNCTION - a function is pure * * A pure function is one that has no side effects other than its return value * and uses no inputs other than its arguments and global variables. */ #define PURE_FUNCTION __attribute__((__pure__)) #endif #ifndef UNNEEDED /** * UNNEEDED - a variable/function may not be needed * * This suppresses warnings about unused variables or functions, but tells * the compiler that if it is unused it need not emit it into the object code. * * Example: * // With some preprocessor options, this is unnecessary. * static UNNEEDED int counter; * * // With some preprocessor options, this is unnecessary. * static UNNEEDED void add_to_counter(int add) * { * counter += add; * } */ #define UNNEEDED __attribute__((__unused__)) #endif #ifndef NEEDED /** * NEEDED - a variable/function is needed * * This suppresses warnings about unused variables or functions, but tells * the compiler that it must exist even if it (seems) unused. * * Example: * // Even if this is unused, these are vital for debugging. * static NEEDED int counter; * static NEEDED void dump_counter(void) * { * printf("Counter is %i\n", counter); * } */ #define NEEDED __attribute__((__used__)) #endif #ifndef UNUSED /** * UNUSED - a parameter is unused * * Some compilers (eg. gcc with -W or -Wunused) warn about unused * function parameters. This suppresses such warnings and indicates * to the reader that it's deliberate. * * Example: * // This is used as a callback, so needs to have this prototype. * static int some_callback(void *unused UNUSED) * { * return 0; * } */ #define UNUSED __attribute__((__unused__)) #endif #ifndef IS_COMPILE_CONSTANT /** * IS_COMPILE_CONSTANT - does the compiler know the value of this expression? * @expr: the expression to evaluate * * When an expression manipulation is complicated, it is usually better to * implement it in a function. However, if the expression being manipulated is * known at compile time, it is better to have the compiler see the entire * expression so it can simply substitute the result. * * This can be done using the IS_COMPILE_CONSTANT() macro. * * Example: * enum greek { ALPHA, BETA, GAMMA, DELTA, EPSILON }; * * // Out-of-line version. * const char *greek_name(enum greek greek); * * // Inline version. * static inline const char *_greek_name(enum greek greek) * { * switch (greek) { * case ALPHA: return "alpha"; * case BETA: return "beta"; * case GAMMA: return "gamma"; * case DELTA: return "delta"; * case EPSILON: return "epsilon"; * default: return "**INVALID**"; * } * } * * // Use inline if compiler knows answer. Otherwise call function * // to avoid copies of the same code everywhere. * #define greek_name(g) \ * (IS_COMPILE_CONSTANT(g) ? _greek_name(g) : greek_name(g)) */ #define IS_COMPILE_CONSTANT(expr) __builtin_constant_p(expr) #endif #ifndef WARN_UNUSED_RESULT /** * WARN_UNUSED_RESULT - warn if a function return value is unused. * * Used to mark a function where it is extremely unlikely that the caller * can ignore the result, eg realloc(). * * Example: * // buf param may be freed by this; need return value! * static char *WARN_UNUSED_RESULT enlarge(char *buf, unsigned *size) * { * return realloc(buf, (*size) *= 2); * } */ #define WARN_UNUSED_RESULT __attribute__((__warn_unused_result__)) #endif /** * WARN_DEPRECATED - warn that a function/type/variable is deprecated when used. * * Used to mark a function, type or variable that should not be used. * * Example: * WARN_DEPRECATED char *oldfunc(char *buf); */ #define WARN_DEPRECATED __attribute__((__deprecated__)) /** * NO_NULL_ARGS - specify that no arguments to this function can be NULL. * * The compiler will warn if any pointer args are NULL. * * Example: * NO_NULL_ARGS char *my_copy(char *buf); */ #define NO_NULL_ARGS __attribute__((__nonnull__)) /** * NON_NULL_ARGS - specify that some arguments to this function can't be NULL. * @...: 1-based argument numbers for which args can't be NULL. * * The compiler will warn if any of the specified pointer args are NULL. * * Example: * char *my_copy2(char *buf, char *maybenull) NON_NULL_ARGS(1); */ #define NON_NULL_ARGS(...) __attribute__((__nonnull__(__VA_ARGS__))) /** * LAST_ARG_NULL - specify the last argument of a variadic function must be NULL. * * The compiler will warn if the last argument isn't NULL. * * Example: * char *join_string(char *buf, ...) LAST_ARG_NULL; */ #define LAST_ARG_NULL __attribute__((__sentinel__)) #endif /* CCAN_COMPILER_H */ rdma-core-56.1/ccan/container_of.h000066400000000000000000000103121477342711600170520ustar00rootroot00000000000000/* CC0 (Public domain) - see LICENSE.CC0 file for details */ #ifndef CCAN_CONTAINER_OF_H #define CCAN_CONTAINER_OF_H #include #include "config.h" #include /** * container_of - get pointer to enclosing structure * @member_ptr: pointer to the structure member * @containing_type: the type this member is within * @member: the name of this member within the structure. * * Given a pointer to a member of a structure, this macro does pointer * subtraction to return the pointer to the enclosing type. * * Example: * struct foo { * int fielda, fieldb; * // ... * }; * struct info { * int some_other_field; * struct foo my_foo; * }; * * static struct info *foo_to_info(struct foo *foo) * { * return container_of(foo, struct info, my_foo); * } */ #ifndef container_of #define container_of(member_ptr, containing_type, member) \ ((containing_type *) \ ((char *)(member_ptr) \ - container_off(containing_type, member)) \ + check_types_match(*(member_ptr), ((containing_type *)0)->member)) #endif /** * container_of_or_null - get pointer to enclosing structure, or NULL * @member_ptr: pointer to the structure member * @containing_type: the type this member is within * @member: the name of this member within the structure.
* * Given a pointer to a member of a structure, this macro does pointer * subtraction to return the pointer to the enclosing type, unless it * is given NULL, in which case it also returns NULL. * * Example: * struct foo { * int fielda, fieldb; * // ... * }; * struct info { * int some_other_field; * struct foo my_foo; * }; * * static struct info *foo_to_info_allowing_null(struct foo *foo) * { * return container_of_or_null(foo, struct info, my_foo); * } */ static inline char *container_of_or_null_(void *member_ptr, size_t offset) { return member_ptr ? (char *)member_ptr - offset : NULL; } #define container_of_or_null(member_ptr, containing_type, member) \ ((containing_type *) \ container_of_or_null_(member_ptr, \ container_off(containing_type, member)) \ + check_types_match(*(member_ptr), ((containing_type *)0)->member)) /** * container_off - get offset to enclosing structure * @containing_type: the type this member is within * @member: the name of this member within the structure. * * Given a pointer to a member of a structure, this macro does * typechecking and figures out the offset to the enclosing type. * * Example: * struct foo { * int fielda, fieldb; * // ... * }; * struct info { * int some_other_field; * struct foo my_foo; * }; * * static struct info *foo_to_info(struct foo *foo) * { * size_t off = container_off(struct info, my_foo); * return (void *)((char *)foo - off); * } */ #define container_off(containing_type, member) \ offsetof(containing_type, member) /** * container_of_var - get pointer to enclosing structure using a variable * @member_ptr: pointer to the structure member * @container_var: a pointer of same type as this member's container * @member: the name of this member within the structure. * * Given a pointer to a member of a structure, this macro does pointer * subtraction to return the pointer to the enclosing type. * * Example: * static struct info *foo_to_i(struct foo *foo) * { * struct info *i = container_of_var(foo, i, my_foo); * return i; * } */ #if HAVE_TYPEOF #define container_of_var(member_ptr, container_var, member) \ container_of(member_ptr, typeof(*container_var), member) #else #define container_of_var(member_ptr, container_var, member) \ ((void *)((char *)(member_ptr) - \ container_off_var(container_var, member))) #endif /** * container_off_var - get offset of a field in enclosing structure * @container_var: a pointer to a container structure * @member: the name of a member within the structure. * * Given (any) pointer to a structure and a its member name, this * macro does pointer subtraction to return offset of member in a * structure memory layout. * */ #if HAVE_TYPEOF #define container_off_var(var, member) \ container_off(typeof(*var), member) #else #define container_off_var(var, member) \ ((const char *)&(var)->member - (const char *)(var)) #endif #endif /* CCAN_CONTAINER_OF_H */ rdma-core-56.1/ccan/ilog.c000066400000000000000000000061321477342711600153360ustar00rootroot00000000000000/*(C) Timothy B. Terriberry (tterribe@xiph.org) 2001-2009 CC0 (Public domain). * See LICENSE file for details. */ #include "ilog.h" #include /*The fastest fallback strategy for platforms with fast multiplication appears to be based on de Bruijn sequences~\cite{LP98}. Tests confirmed this to be true even on an ARM11, where it is actually faster than using the native clz instruction. Define ILOG_NODEBRUIJN to use a simpler fallback on platforms where multiplication or table lookups are too expensive. @UNPUBLISHED{LP98, author="Charles E. 
Leiserson and Harald Prokop", title="Using de {Bruijn} Sequences to Index a 1 in a Computer Word", month=Jun, year=1998, note="\url{http://supertech.csail.mit.edu/papers/debruijn.pdf}" }*/ static UNNEEDED const unsigned char DEBRUIJN_IDX32[32]={ 0, 1,28, 2,29,14,24, 3,30,22,20,15,25,17, 4, 8, 31,27,13,23,21,19,16, 7,26,12,18, 6,11, 5,10, 9 }; /* We always compile these in, in case someone takes address of function. */ #undef ilog32_nz #undef ilog32 #undef ilog64_nz #undef ilog64 int ilog32(uint32_t _v){ /*On a Pentium M, this branchless version tested as the fastest version without multiplications on 1,000,000,000 random 32-bit integers, edging out a similar version with branches, and a 256-entry LUT version.*/ # if defined(ILOG_NODEBRUIJN) int ret; int m; ret=_v>0; m=(_v>0xFFFFU)<<4; _v>>=m; ret|=m; m=(_v>0xFFU)<<3; _v>>=m; ret|=m; m=(_v>0xFU)<<2; _v>>=m; ret|=m; m=(_v>3)<<1; _v>>=m; ret|=m; ret+=_v>1; return ret; /*This de Bruijn sequence version is faster if you have a fast multiplier.*/ # else int ret; ret=_v>0; _v|=_v>>1; _v|=_v>>2; _v|=_v>>4; _v|=_v>>8; _v|=_v>>16; _v=(_v>>1)+1; ret+=DEBRUIJN_IDX32[_v*0x77CB531U>>27&0x1F]; return ret; # endif } int ilog32_nz(uint32_t _v) { return ilog32(_v); } int ilog64(uint64_t _v){ # if defined(ILOG_NODEBRUIJN) uint32_t v; int ret; int m; ret=_v>0; m=(_v>0xFFFFFFFFU)<<5; v=(uint32_t)(_v>>m); ret|=m; m=(v>0xFFFFU)<<4; v>>=m; ret|=m; m=(v>0xFFU)<<3; v>>=m; ret|=m; m=(v>0xFU)<<2; v>>=m; ret|=m; m=(v>3)<<1; v>>=m; ret|=m; ret+=v>1; return ret; # else /*If we don't have a 64-bit word, split it into two 32-bit halves.*/ # if LONG_MAX<9223372036854775807LL uint32_t v; int ret; int m; ret=_v>0; m=(_v>0xFFFFFFFFU)<<5; v=(uint32_t)(_v>>m); ret|=m; v|=v>>1; v|=v>>2; v|=v>>4; v|=v>>8; v|=v>>16; v=(v>>1)+1; ret+=DEBRUIJN_IDX32[v*0x77CB531U>>27&0x1F]; return ret; /*Otherwise do it in one 64-bit operation.*/ # else static const unsigned char DEBRUIJN_IDX64[64]={ 0, 1, 2, 7, 3,13, 8,19, 4,25,14,28, 9,34,20,40, 5,17,26,38,15,46,29,48,10,31,35,54,21,50,41,57, 63, 6,12,18,24,27,33,39,16,37,45,47,30,53,49,56, 62,11,23,32,36,44,52,55,61,22,43,51,60,42,59,58 }; int ret; ret=_v>0; _v|=_v>>1; _v|=_v>>2; _v|=_v>>4; _v|=_v>>8; _v|=_v>>16; _v|=_v>>32; _v=(_v>>1)+1; ret+=DEBRUIJN_IDX64[_v*0x218A392CD3D5DBFULL>>58&0x3F]; return ret; # endif # endif } int ilog64_nz(uint64_t _v) { return ilog64(_v); } rdma-core-56.1/ccan/ilog.h000066400000000000000000000123531477342711600153450ustar00rootroot00000000000000/* CC0 (Public domain) - see LICENSE file for details */ #if !defined(_ilog_H) # define _ilog_H (1) # include "config.h" # include # include # include /** * ilog32 - Integer binary logarithm of a 32-bit value. * @_v: A 32-bit value. * Returns floor(log2(_v))+1, or 0 if _v==0. * This is the number of bits that would be required to represent _v in two's * complement notation with all of the leading zeros stripped. * Note that many uses will resolve to the fast macro version instead. * * See Also: * ilog32_nz(), ilog64() * * Example: * // Rounds up to next power of 2 (if not a power of 2). * static uint32_t round_up32(uint32_t i) * { * assert(i != 0); * return 1U << ilog32(i-1); * } */ int ilog32(uint32_t _v); /** * ilog32_nz - Integer binary logarithm of a non-zero 32-bit value. * @_v: A 32-bit value. * Returns floor(log2(_v))+1, or undefined if _v==0. * This is the number of bits that would be required to represent _v in two's * complement notation with all of the leading zeros stripped. * Note that many uses will resolve to the fast macro version instead. 
* See Also: * ilog32(), ilog64_nz() * Example: * // Find Last Set (ie. highest bit set, 0 to 31). * static uint32_t fls32(uint32_t i) * { * assert(i != 0); * return ilog32_nz(i) - 1; * } */ int ilog32_nz(uint32_t _v); /** * ilog64 - Integer binary logarithm of a 64-bit value. * @_v: A 64-bit value. * Returns floor(log2(_v))+1, or 0 if _v==0. * This is the number of bits that would be required to represent _v in two's * complement notation with all of the leading zeros stripped. * Note that many uses will resolve to the fast macro version instead. * See Also: * ilog64_nz(), ilog32() */ int ilog64(uint64_t _v); /** * ilog64_nz - Integer binary logarithm of a non-zero 64-bit value. * @_v: A 64-bit value. * Returns floor(log2(_v))+1, or undefined if _v==0. * This is the number of bits that would be required to represent _v in two's * complement notation with all of the leading zeros stripped. * Note that many uses will resolve to the fast macro version instead. * See Also: * ilog64(), ilog32_nz() */ int ilog64_nz(uint64_t _v); /** * STATIC_ILOG_32 - The integer logarithm of an (unsigned, 32-bit) constant. * @_v: A non-negative 32-bit constant. * Returns floor(log2(_v))+1, or 0 if _v==0. * This is the number of bits that would be required to represent _v in two's * complement notation with all of the leading zeros stripped. * This macro should only be used when you need a compile-time constant, * otherwise ilog32 or ilog32_nz are just as fast and more flexible. * * Example: * #define MY_PAGE_SIZE 4096 * #define MY_PAGE_BITS (STATIC_ILOG_32(PAGE_SIZE) - 1) */ #define STATIC_ILOG_32(_v) (STATIC_ILOG5((uint32_t)(_v))) /** * STATIC_ILOG_64 - The integer logarithm of an (unsigned, 64-bit) constant. * @_v: A non-negative 64-bit constant. * Returns floor(log2(_v))+1, or 0 if _v==0. * This is the number of bits that would be required to represent _v in two's * complement notation with all of the leading zeros stripped. * This macro should only be used when you need a compile-time constant, * otherwise ilog64 or ilog64_nz are just as fast and more flexible. */ #define STATIC_ILOG_64(_v) (STATIC_ILOG6((uint64_t)(_v))) /* Private implementation details */ /*Note the casts to (int) below: this prevents "upgrading" the type of an entire expression to an (unsigned) size_t.*/ #if INT_MAX>=2147483647 && HAVE_BUILTIN_CLZ #define builtin_ilog32_nz(v) \ (((int)sizeof(unsigned)*CHAR_BIT) - __builtin_clz(v)) #elif LONG_MAX>=2147483647L && HAVE_BUILTIN_CLZL #define builtin_ilog32_nz(v) \ (((int)sizeof(unsigned)*CHAR_BIT) - __builtin_clzl(v)) #endif #if INT_MAX>=9223372036854775807LL && HAVE_BUILTIN_CLZ #define builtin_ilog64_nz(v) \ (((int)sizeof(unsigned)*CHAR_BIT) - __builtin_clz(v)) #elif LONG_MAX>=9223372036854775807LL && HAVE_BUILTIN_CLZL #define builtin_ilog64_nz(v) \ (((int)sizeof(unsigned long)*CHAR_BIT) - __builtin_clzl(v)) #elif HAVE_BUILTIN_CLZLL #define builtin_ilog64_nz(v) \ (((int)sizeof(unsigned long long)*CHAR_BIT) - __builtin_clzll(v)) #endif #ifdef builtin_ilog32_nz #define ilog32(_v) (builtin_ilog32_nz(_v)&-!!(_v)) #define ilog32_nz(_v) builtin_ilog32_nz(_v) #else #define ilog32_nz(_v) ilog32(_v) #define ilog32(_v) (IS_COMPILE_CONSTANT(_v) ? STATIC_ILOG_32(_v) : ilog32(_v)) #endif /* builtin_ilog32_nz */ #ifdef builtin_ilog64_nz #define ilog64(_v) (builtin_ilog64_nz(_v)&-!!(_v)) #define ilog64_nz(_v) builtin_ilog64_nz(_v) #else #define ilog64_nz(_v) ilog64(_v) #define ilog64(_v) (IS_COMPILE_CONSTANT(_v) ? 
STATIC_ILOG_64(_v) : ilog64(_v)) #endif /* builtin_ilog64_nz */ /* Macros for evaluating compile-time constant ilog. */ # define STATIC_ILOG0(_v) (!!(_v)) # define STATIC_ILOG1(_v) (((_v)&0x2)?2:STATIC_ILOG0(_v)) # define STATIC_ILOG2(_v) (((_v)&0xC)?2+STATIC_ILOG1((_v)>>2):STATIC_ILOG1(_v)) # define STATIC_ILOG3(_v) \ (((_v)&0xF0)?4+STATIC_ILOG2((_v)>>4):STATIC_ILOG2(_v)) # define STATIC_ILOG4(_v) \ (((_v)&0xFF00)?8+STATIC_ILOG3((_v)>>8):STATIC_ILOG3(_v)) # define STATIC_ILOG5(_v) \ (((_v)&0xFFFF0000)?16+STATIC_ILOG4((_v)>>16):STATIC_ILOG4(_v)) # define STATIC_ILOG6(_v) \ (((_v)&0xFFFFFFFF00000000ULL)?32+STATIC_ILOG5((_v)>>32):STATIC_ILOG5(_v)) #endif /* _ilog_H */ rdma-core-56.1/ccan/list.c000066400000000000000000000017501477342711600153600ustar00rootroot00000000000000/* Licensed under MIT - see LICENSE.MIT file for details */ #include #include #include "list.h" static void *corrupt(const char *abortstr, const struct list_node *head, const struct list_node *node, unsigned int count) { if (abortstr) { fprintf(stderr, "%s: prev corrupt in node %p (%u) of %p\n", abortstr, node, count, head); abort(); } return NULL; } struct list_node *list_check_node(const struct list_node *node, const char *abortstr) { const struct list_node *p, *n; int count = 0; for (p = node, n = node->next; n != node; p = n, n = n->next) { count++; if (n->prev != p) return corrupt(abortstr, node, n, count); } /* Check prev on head node. */ if (node->prev != p) return corrupt(abortstr, node, node, 0); return (struct list_node *)node; } struct list_head *list_check(const struct list_head *h, const char *abortstr) { if (!list_check_node(&h->n, abortstr)) return NULL; return (struct list_head *)h; } rdma-core-56.1/ccan/list.h000066400000000000000000000573251477342711600153760ustar00rootroot00000000000000/* Licensed under MIT - see LICENSE.MIT file for details */ #ifndef CCAN_LIST_H #define CCAN_LIST_H //#define CCAN_LIST_DEBUG 1 #include #include #include #include #include /** * struct list_node - an entry in a doubly-linked list * @next: next entry (self if empty) * @prev: previous entry (self if empty) * * This is used as an entry in a linked list. * Example: * struct child { * const char *name; * // Linked list of all us children. * struct list_node list; * }; */ struct list_node { struct list_node *next, *prev; }; /** * struct list_head - the head of a doubly-linked list * @h: the list_head (containing next and prev pointers) * * This is used as the head of a linked list. * Example: * struct parent { * const char *name; * struct list_head children; * unsigned int num_children; * }; */ struct list_head { struct list_node n; }; /** * list_check - check head of a list for consistency * @h: the list_head * @abortstr: the location to print on aborting, or NULL. * * Because list_nodes have redundant information, consistency checking between * the back and forward links can be done. This is useful as a debugging check. * If @abortstr is non-NULL, that will be printed in a diagnostic if the list * is inconsistent, and the function will abort. * * Returns the list head if the list is consistent, NULL if not (it * can never return NULL if @abortstr is set). 
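* * (When CCAN_LIST_DEBUG is defined, the list_debug() and list_debug_node() wrappers below invoke these checks automatically on every list operation.)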
* * See also: list_check_node() * * Example: * static void dump_parent(struct parent *p) * { * struct child *c; * * printf("%s (%u children):\n", p->name, p->num_children); * list_check(&p->children, "bad child list"); * list_for_each(&p->children, c, list) * printf(" -> %s\n", c->name); * } */ struct list_head *list_check(const struct list_head *h, const char *abortstr); /** * list_check_node - check node of a list for consistency * @n: the list_node * @abortstr: the location to print on aborting, or NULL. * * Check consistency of the list node is in (it must be in one). * * See also: list_check() * * Example: * static void dump_child(const struct child *c) * { * list_check_node(&c->list, "bad child list"); * printf("%s\n", c->name); * } */ struct list_node *list_check_node(const struct list_node *n, const char *abortstr); #define LIST_LOC __FILE__ ":" stringify(__LINE__) #ifdef CCAN_LIST_DEBUG #define list_debug(h, loc) list_check((h), loc) #define list_debug_node(n, loc) list_check_node((n), loc) #else #define list_debug(h, loc) ((void)loc, h) #define list_debug_node(n, loc) ((void)loc, n) #endif /** * LIST_HEAD_INIT - initializer for an empty list_head * @name: the name of the list. * * Explicit initializer for an empty list. * * See also: * LIST_HEAD, list_head_init() * * Example: * static struct list_head my_list = LIST_HEAD_INIT(my_list); */ #define LIST_HEAD_INIT(name) { { &(name).n, &(name).n } } /** * LIST_HEAD - define and initialize an empty list_head * @name: the name of the list. * * The LIST_HEAD macro defines a list_head and initializes it to an empty * list. It can be prepended by "static" to define a static list_head. * * See also: * LIST_HEAD_INIT, list_head_init() * * Example: * static LIST_HEAD(my_global_list); */ #define LIST_HEAD(name) \ struct list_head name = LIST_HEAD_INIT(name) /** * list_head_init - initialize a list_head * @h: the list_head to set to the empty list * * Example: * ... * struct parent *parent = malloc(sizeof(*parent)); * * list_head_init(&parent->children); * parent->num_children = 0; */ static inline void list_head_init(struct list_head *h) { h->n.next = h->n.prev = &h->n; } /** * list_node_init - initialize a list_node * @n: the list_node to link to itself. * * You don't need to use this normally! But it lets you list_del(@n) * safely. */ static inline void list_node_init(struct list_node *n) { n->next = n->prev = n; } /** * list_add_after - add an entry after an existing node in a linked list * @h: the list_head to add the node to (for debugging) * @p: the existing list_node to add the node after * @n: the new list_node to add to the list. * * The existing list_node must already be a member of the list. * The new list_node does not need to be initialized; it will be overwritten. * * Example: * struct child c1, c2, c3; * LIST_HEAD(h); * * list_add_tail(&h, &c1.list); * list_add_tail(&h, &c3.list); * list_add_after(&h, &c1.list, &c2.list); */ #define list_add_after(h, p, n) list_add_after_(h, p, n, LIST_LOC) static inline void list_add_after_(struct list_head *h, struct list_node *p, struct list_node *n, const char *abortstr) { n->next = p->next; n->prev = p; p->next->prev = n; p->next = n; (void)list_debug(h, abortstr); } /** * list_add - add an entry at the start of a linked list. * @h: the list_head to add the node to * @n: the list_node to add to the list. * * The list_node does not need to be initialized; it will be overwritten. 
* Example: * struct child *child = malloc(sizeof(*child)); * * child->name = "marvin"; * list_add(&parent->children, &child->list); * parent->num_children++; */ #define list_add(h, n) list_add_(h, n, LIST_LOC) static inline void list_add_(struct list_head *h, struct list_node *n, const char *abortstr) { list_add_after_(h, &h->n, n, abortstr); } /** * list_add_before - add an entry before an existing node in a linked list * @h: the list_head to add the node to (for debugging) * @p: the existing list_node to add the node before * @n: the new list_node to add to the list. * * The existing list_node must already be a member of the list. * The new list_node does not need to be initialized; it will be overwritten. * * Example: * list_head_init(&h); * list_add_tail(&h, &c1.list); * list_add_tail(&h, &c3.list); * list_add_before(&h, &c3.list, &c2.list); */ #define list_add_before(h, p, n) list_add_before_(h, p, n, LIST_LOC) static inline void list_add_before_(struct list_head *h, struct list_node *p, struct list_node *n, const char *abortstr) { n->next = p; n->prev = p->prev; p->prev->next = n; p->prev = n; (void)list_debug(h, abortstr); } /** * list_add_tail - add an entry at the end of a linked list. * @h: the list_head to add the node to * @n: the list_node to add to the list. * * The list_node does not need to be initialized; it will be overwritten. * Example: * list_add_tail(&parent->children, &child->list); * parent->num_children++; */ #define list_add_tail(h, n) list_add_tail_(h, n, LIST_LOC) static inline void list_add_tail_(struct list_head *h, struct list_node *n, const char *abortstr) { list_add_before_(h, &h->n, n, abortstr); } /** * list_empty - is a list empty? * @h: the list_head * * If the list is empty, returns true. * * Example: * assert(list_empty(&parent->children) == (parent->num_children == 0)); */ #define list_empty(h) list_empty_(h, LIST_LOC) static inline bool list_empty_(const struct list_head *h, const char* abortstr) { (void)list_debug(h, abortstr); return h->n.next == &h->n; } /** * list_empty_nodebug - is a list empty (and don't perform debug checks)? * @h: the list_head * * If the list is empty, returns true. * This differs from list_empty() in that if CCAN_LIST_DEBUG is set it * will NOT perform debug checks. Only use this function if you REALLY * know what you're doing. * * Example: * assert(list_empty_nodebug(&parent->children) == (parent->num_children == 0)); */ #ifndef CCAN_LIST_DEBUG #define list_empty_nodebug(h) list_empty(h) #else static inline bool list_empty_nodebug(const struct list_head *h) { return h->n.next == &h->n; } #endif /** * list_empty_nocheck - is a list empty? * @h: the list_head * * If the list is empty, returns true. This doesn't perform any * debug check for list consistency, so it can be called without * locks, racing with the list being modified. This is ok for * checks where an incorrect result is not an issue (optimized * bail out path for example). */ static inline bool list_empty_nocheck(const struct list_head *h) { return h->n.next == &h->n; } /** * list_del - delete an entry from an (unknown) linked list. * @n: the list_node to delete from the list. * * Note that this leaves @n in an undefined state; it can be added to * another list, but not deleted again. 
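* (When CCAN_LIST_DEBUG is defined, list_del() also poisons the node's next/prev pointers to NULL to catch use-after-delete; use list_del_init() instead if the node may be deleted again.)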
* * See also: * list_del_from(), list_del_init() * * Example: * list_del(&child->list); * parent->num_children--; */ #define list_del(n) list_del_(n, LIST_LOC) static inline void list_del_(struct list_node *n, const char* abortstr) { (void)list_debug_node(n, abortstr); n->next->prev = n->prev; n->prev->next = n->next; #ifdef CCAN_LIST_DEBUG /* Catch use-after-del. */ n->next = n->prev = NULL; #endif } /** * list_del_init - delete a node, and reset it so it can be deleted again. * @n: the list_node to be deleted. * * list_del(@n) or list_del_init() again after this will be safe, * which can be useful in some cases. * * See also: * list_del_from(), list_del() * * Example: * list_del_init(&child->list); * parent->num_children--; */ #define list_del_init(n) list_del_init_(n, LIST_LOC) static inline void list_del_init_(struct list_node *n, const char *abortstr) { list_del_(n, abortstr); list_node_init(n); } /** * list_del_from - delete an entry from a known linked list. * @h: the list_head the node is in. * @n: the list_node to delete from the list. * * This explicitly indicates which list a node is expected to be in, * which is better documentation and can catch more bugs. * * See also: list_del() * * Example: * list_del_from(&parent->children, &child->list); * parent->num_children--; */ static inline void list_del_from(struct list_head *h, struct list_node *n) { #ifdef CCAN_LIST_DEBUG { /* Thorough check: make sure it was in list! */ struct list_node *i; for (i = h->n.next; i != n; i = i->next) assert(i != &h->n); } #endif /* CCAN_LIST_DEBUG */ /* Quick test that catches a surprising number of bugs. */ assert(!list_empty(h)); list_del(n); } /** * list_swap - swap out an entry from an (unknown) linked list for a new one. * @o: the list_node to replace from the list. * @n: the list_node to insert in place of the old one. * * Note that this leaves @o in an undefined state; it can be added to * another list, but not deleted/swapped again. * * See also: * list_del() * * Example: * struct child x1, x2; * LIST_HEAD(xh); * * list_add(&xh, &x1.list); * list_swap(&x1.list, &x2.list); */ #define list_swap(o, n) list_swap_(o, n, LIST_LOC) static inline void list_swap_(struct list_node *o, struct list_node *n, const char* abortstr) { (void)list_debug_node(o, abortstr); *n = *o; n->next->prev = n; n->prev->next = n; #ifdef CCAN_LIST_DEBUG /* Catch use-after-del. */ o->next = o->prev = NULL; #endif } /** * list_entry - convert a list_node back into the structure containing it. * @n: the list_node * @type: the type of the entry * @member: the list_node member of the type * * Example: * // First list entry is children.next; convert back to child. * child = list_entry(parent->children.n.next, struct child, list); * * See Also: * list_top(), list_for_each() */ #define list_entry(n, type, member) container_of(n, type, member) /** * list_top - get the first entry in a list * @h: the list_head * @type: the type of the entry * @member: the list_node member of the type * * If the list is empty, returns NULL. 
* * Example: * struct child *first; * first = list_top(&parent->children, struct child, list); * if (!first) * printf("Empty list!\n"); */ #define list_top(h, type, member) \ ((type *)list_top_((h), list_off_(type, member))) static inline const void *list_top_(const struct list_head *h, size_t off) { if (list_empty(h)) return NULL; return (const char *)h->n.next - off; } /** * list_pop - remove the first entry in a list * @h: the list_head * @type: the type of the entry * @member: the list_node member of the type * * If the list is empty, returns NULL. * * Example: * struct child *one; * one = list_pop(&parent->children, struct child, list); * if (!one) * printf("Empty list!\n"); */ #define list_pop(h, type, member) \ ((type *)list_pop_((h), list_off_(type, member))) static inline const void *list_pop_(const struct list_head *h, size_t off) { struct list_node *n; if (list_empty(h)) return NULL; n = h->n.next; list_del(n); return (const char *)n - off; } /** * list_tail - get the last entry in a list * @h: the list_head * @type: the type of the entry * @member: the list_node member of the type * * If the list is empty, returns NULL. * * Example: * struct child *last; * last = list_tail(&parent->children, struct child, list); * if (!last) * printf("Empty list!\n"); */ #define list_tail(h, type, member) \ ((type *)list_tail_((h), list_off_(type, member))) static inline const void *list_tail_(const struct list_head *h, size_t off) { if (list_empty(h)) return NULL; return (const char *)h->n.prev - off; } /** * list_for_each - iterate through a list. * @h: the list_head (warning: evaluated multiple times!) * @i: the structure containing the list_node * @member: the list_node member of the structure * * This is a convenient wrapper to iterate @i over the entire list. It's * a for loop, so you can break and continue as normal. * * Example: * list_for_each(&parent->children, child, list) * printf("Name: %s\n", child->name); */ #define list_for_each(h, i, member) \ list_for_each_off(h, i, list_off_var_(i, member)) /** * list_for_each_rev - iterate through a list backwards. * @h: the list_head * @i: the structure containing the list_node * @member: the list_node member of the structure * * This is a convenient wrapper to iterate @i over the entire list. It's * a for loop, so you can break and continue as normal. * * Example: * list_for_each_rev(&parent->children, child, list) * printf("Name: %s\n", child->name); */ #define list_for_each_rev(h, i, member) \ list_for_each_rev_off(h, i, list_off_var_(i, member)) /** * list_for_each_rev_safe - iterate through a list backwards, * maybe during deletion * @h: the list_head * @i: the structure containing the list_node * @nxt: the structure containing the list_node * @member: the list_node member of the structure * * This is a convenient wrapper to iterate @i over the entire list backwards. * It's a for loop, so you can break and continue as normal. The extra * variable * @nxt is used to hold the next element, so you can delete @i * from the list. 
* * Example: * struct child *next; * list_for_each_rev_safe(&parent->children, child, next, list) { * printf("Name: %s\n", child->name); * } */ #define list_for_each_rev_safe(h, i, nxt, member) \ list_for_each_rev_safe_off(h, i, nxt, list_off_var_(i, member)) /** * list_for_each_safe - iterate through a list, maybe during deletion * @h: the list_head * @i: the structure containing the list_node * @nxt: the structure containing the list_node * @member: the list_node member of the structure * * This is a convenient wrapper to iterate @i over the entire list. It's * a for loop, so you can break and continue as normal. The extra variable * @nxt is used to hold the next element, so you can delete @i from the list. * * Example: * list_for_each_safe(&parent->children, child, next, list) { * list_del(&child->list); * parent->num_children--; * } */ #define list_for_each_safe(h, i, nxt, member) \ list_for_each_safe_off(h, i, nxt, list_off_var_(i, member)) /** * list_next - get the next entry in a list * @h: the list_head * @i: a pointer to an entry in the list. * @member: the list_node member of the structure * * If @i was the last entry in the list, returns NULL. * * Example: * struct child *second; * second = list_next(&parent->children, first, list); * if (!second) * printf("No second child!\n"); */ #define list_next(h, i, member) \ ((list_typeof(i))list_entry_or_null(list_debug(h, \ __FILE__ ":" stringify(__LINE__)), \ (i)->member.next, \ list_off_var_((i), member))) /** * list_prev - get the previous entry in a list * @h: the list_head * @i: a pointer to an entry in the list. * @member: the list_node member of the structure * * If @i was the first entry in the list, returns NULL. * * Example: * first = list_prev(&parent->children, second, list); * if (!first) * printf("Can't go back to first child?!\n"); */ #define list_prev(h, i, member) \ ((list_typeof(i))list_entry_or_null(list_debug(h, \ __FILE__ ":" stringify(__LINE__)), \ (i)->member.prev, \ list_off_var_((i), member))) /** * list_append_list - empty one list onto the end of another. * @to: the list to append into * @from: the list to empty. * * This takes the entire contents of @from and moves it to the end of * @to. After this @from will be empty. * * Example: * struct list_head adopter; * * list_append_list(&adopter, &parent->children); * assert(list_empty(&parent->children)); * parent->num_children = 0; */ #define list_append_list(t, f) list_append_list_(t, f, \ __FILE__ ":" stringify(__LINE__)) static inline void list_append_list_(struct list_head *to, struct list_head *from, const char *abortstr) { struct list_node *from_tail = list_debug(from, abortstr)->n.prev; struct list_node *to_tail = list_debug(to, abortstr)->n.prev; /* Sew in head and entire list. */ to->n.prev = from_tail; from_tail->next = &to->n; to_tail->next = &from->n; from->n.prev = to_tail; /* Now remove head. */ list_del(&from->n); list_head_init(from); } /** * list_prepend_list - empty one list into the start of another. * @to: the list to prepend into * @from: the list to empty. * * This takes the entire contents of @from and moves it to the start * of @to. After this @from will be empty. 
* * Example: * list_prepend_list(&adopter, &parent->children); * assert(list_empty(&parent->children)); * parent->num_children = 0; */ #define list_prepend_list(t, f) list_prepend_list_(t, f, LIST_LOC) static inline void list_prepend_list_(struct list_head *to, struct list_head *from, const char *abortstr) { struct list_node *from_tail = list_debug(from, abortstr)->n.prev; struct list_node *to_head = list_debug(to, abortstr)->n.next; /* Sew in head and entire list. */ to->n.next = &from->n; from->n.prev = &to->n; to_head->prev = from_tail; from_tail->next = to_head; /* Now remove head. */ list_del(&from->n); list_head_init(from); } /* internal macros, do not use directly */ #define list_for_each_off_dir_(h, i, off, dir) \ for (i = list_node_to_off_(list_debug(h, LIST_LOC)->n.dir, \ (off)); \ list_node_from_off_((void *)i, (off)) != &(h)->n; \ i = list_node_to_off_(list_node_from_off_((void *)i, (off))->dir, \ (off))) #define list_for_each_safe_off_dir_(h, i, nxt, off, dir) \ for (i = list_node_to_off_(list_debug(h, LIST_LOC)->n.dir, \ (off)), \ nxt = list_node_to_off_(list_node_from_off_(i, (off))->dir, \ (off)); \ list_node_from_off_(i, (off)) != &(h)->n; \ i = nxt, \ nxt = list_node_to_off_(list_node_from_off_(i, (off))->dir, \ (off))) /** * list_for_each_off - iterate through a list of memory regions. * @h: the list_head * @i: the pointer to a memory region wich contains list node data. * @off: offset(relative to @i) at which list node data resides. * * This is a low-level wrapper to iterate @i over the entire list, used to * implement all oher, more high-level, for-each constructs. It's a for loop, * so you can break and continue as normal. * * WARNING! Being the low-level macro that it is, this wrapper doesn't know * nor care about the type of @i. The only assumtion made is that @i points * to a chunk of memory that at some @offset, relative to @i, contains a * properly filled `struct node_list' which in turn contains pointers to * memory chunks and it's turtles all the way down. Whith all that in mind * remember that given the wrong pointer/offset couple this macro will * happilly churn all you memory untill SEGFAULT stops it, in other words * caveat emptor. * * It is worth mentioning that one of legitimate use-cases for that wrapper * is operation on opaque types with known offset for `struct list_node' * member(preferably 0), because it allows you not to disclose the type of * @i. * * Example: * list_for_each_off(&parent->children, child, * offsetof(struct child, list)) * printf("Name: %s\n", child->name); */ #define list_for_each_off(h, i, off) \ list_for_each_off_dir_((h),(i),(off),next) /** * list_for_each_rev_off - iterate through a list of memory regions backwards * @h: the list_head * @i: the pointer to a memory region wich contains list node data. * @off: offset(relative to @i) at which list node data resides. * * See list_for_each_off for details */ #define list_for_each_rev_off(h, i, off) \ list_for_each_off_dir_((h),(i),(off),prev) /** * list_for_each_safe_off - iterate through a list of memory regions, maybe * during deletion * @h: the list_head * @i: the pointer to a memory region wich contains list node data. * @nxt: the structure containing the list_node * @off: offset(relative to @i) at which list node data resides. * * For details see `list_for_each_off' and `list_for_each_safe' * descriptions. 
 *
 * Example:
 *	list_for_each_safe_off(&parent->children, child,
 *			       next, offsetof(struct child, list))
 *		printf("Name: %s\n", child->name);
 */
#define list_for_each_safe_off(h, i, nxt, off)				\
	list_for_each_safe_off_dir_((h),(i),(nxt),(off),next)

/**
 * list_for_each_rev_safe_off - iterate backwards through a list of
 * memory regions, maybe during deletion
 * @h: the list_head
 * @i: the pointer to a memory region which contains list node data.
 * @nxt: the structure containing the list_node
 * @off: offset (relative to @i) at which list node data resides.
 *
 * For details see `list_for_each_rev_off' and `list_for_each_rev_safe'
 * descriptions.
 *
 * Example:
 *	list_for_each_rev_safe_off(&parent->children, child,
 *				   next, offsetof(struct child, list))
 *		printf("Name: %s\n", child->name);
 */
#define list_for_each_rev_safe_off(h, i, nxt, off)			\
	list_for_each_safe_off_dir_((h),(i),(nxt),(off),prev)

/* Other -off variants. */
#define list_entry_off(n, type, off)		\
	((type *)list_node_from_off_((n), (off)))

/* Returns NULL if the list is empty, mirroring list_tail_off(). */
#define list_head_off(h, type, off)		\
	((type *)list_entry_or_null((h), (h)->n.next, (off)))

#define list_tail_off(h, type, off)		\
	((type *)list_tail_((h), (off)))

#define list_add_off(h, n, off)			\
	list_add((h), list_node_from_off_((n), (off)))

#define list_del_off(n, off)			\
	list_del(list_node_from_off_((n), (off)))

#define list_del_from_off(h, n, off)		\
	list_del_from(h, list_node_from_off_((n), (off)))

/* Offset helper functions so we only single-evaluate. */
static inline void *list_node_to_off_(struct list_node *node, size_t off)
{
	return (void *)((char *)node - off);
}
static inline struct list_node *list_node_from_off_(void *ptr, size_t off)
{
	return (struct list_node *)((char *)ptr + off);
}

/* Get the offset of the member, but make sure it's a list_node. */
#define list_off_(type, member)					\
	(container_off(type, member) +				\
	 check_type(((type *)0)->member, struct list_node))

#define list_off_var_(var, member)			\
	(container_off_var(var, member) +		\
	 check_type(var->member, struct list_node))

#if HAVE_TYPEOF
#define list_typeof(var) typeof(var)
#else
#define list_typeof(var) void *
#endif

/* Returns member, or NULL if at end of list. */
static inline void *list_entry_or_null(const struct list_head *h,
				       const struct list_node *n,
				       size_t off)
{
	if (n == &h->n)
		return NULL;
	return (char *)n - off;
}
#endif /* CCAN_LIST_H */
rdma-core-56.1/ccan/minmax.h000066400000000000000000000023511477342711600157010ustar00rootroot00000000000000/* CC0 (Public domain) - see LICENSE.CC0 file for details */
#ifndef CCAN_MINMAX_H
#define CCAN_MINMAX_H

#include "config.h"
#include <ccan/build_assert.h>

#if !HAVE_STATEMENT_EXPR || !HAVE_TYPEOF
/*
 * Without these, there's no way to avoid unsafe double evaluation of
 * the arguments
 */
#error Sorry, minmax module requires statement expressions and typeof
#endif

#if HAVE_BUILTIN_TYPES_COMPATIBLE_P
#define MINMAX_ASSERT_COMPATIBLE(a, b) \
	BUILD_ASSERT(__builtin_types_compatible_p(a, b))
#else
#define MINMAX_ASSERT_COMPATIBLE(a, b) \
	do { } while (0)
#endif

#define min(a, b) \
	({ \
		typeof(a) _a = (a); \
		typeof(b) _b = (b); \
		MINMAX_ASSERT_COMPATIBLE(typeof(_a), typeof(_b)); \
		_a < _b ? _a : _b; \
	})

#define max(a, b) \
	({ \
		typeof(a) _a = (a); \
		typeof(b) _b = (b); \
		MINMAX_ASSERT_COMPATIBLE(typeof(_a), typeof(_b)); \
		_a > _b ? _a : _b; \
	})
#define clamp(v, f, c)	(max(min((v), (c)), (f)))


#define min_t(t, a, b) \
	({ \
		t _ta = (a); \
		t _tb = (b); \
		min(_ta, _tb); \
	})
#define max_t(t, a, b) \
	({ \
		t _ta = (a); \
		t _tb = (b); \
		max(_ta, _tb); \
	})

#define clamp_t(t, v, f, c) \
	({ \
		t _tv = (v); \
		t _tf = (f); \
		t _tc = (c); \
		clamp(_tv, _tf, _tc); \
	})

#endif /* CCAN_MINMAX_H */
rdma-core-56.1/ccan/str.c000066400000000000000000000004331477342711600152120ustar00rootroot00000000000000/* CC0 (Public domain) - see LICENSE.CC0 file for details */
#include <ccan/str.h>

size_t strcount(const char *haystack, const char *needle)
{
	size_t i = 0, nlen = strlen(needle);

	while ((haystack = strstr(haystack, needle)) != NULL) {
		i++;
		haystack += nlen;
	}
	return i;
}
rdma-core-56.1/ccan/str.h000066400000000000000000000134661477342711600152270ustar00rootroot00000000000000/* CC0 (Public domain) - see LICENSE.CC0 file for details */
#ifndef CCAN_STR_H
#define CCAN_STR_H
#include "config.h"
#include <string.h>
#include <stdbool.h>
#include <limits.h>
#include <ctype.h>

/**
 * streq - Are two strings equal?
 * @a: first string
 * @b: second string
 *
 * This macro is arguably more readable than "!strcmp(a, b)".
 *
 * Example:
 *	if (streq(somestring, ""))
 *		printf("String is empty!\n");
 */
#define streq(a,b) (strcmp((a),(b)) == 0)

/**
 * strstarts - Does this string start with this prefix?
 * @str: string to test
 * @prefix: prefix to look for at start of str
 *
 * Example:
 *	if (strstarts(somestring, "foo"))
 *		printf("String %s begins with 'foo'!\n", somestring);
 */
#define strstarts(str,prefix) (strncmp((str),(prefix),strlen(prefix)) == 0)

/**
 * strends - Does this string end with this postfix?
 * @str: string to test
 * @postfix: postfix to look for at end of str
 *
 * Example:
 *	if (strends(somestring, "foo"))
 *		printf("String %s ends with 'foo'!\n", somestring);
 */
static inline bool strends(const char *str, const char *postfix)
{
	if (strlen(str) < strlen(postfix))
		return false;

	return streq(str + strlen(str) - strlen(postfix), postfix);
}

/**
 * stringify - Turn expression into a string literal
 * @expr: any C expression
 *
 * Example:
 *	#define PRINT_COND_IF_FALSE(cond) \
 *		((cond) || printf("%s is false!", stringify(cond)))
 */
#define stringify(expr)		stringify_1(expr)
/* Double-indirection required to stringify expansions */
#define stringify_1(expr)	#expr

/**
 * strcount - Count number of (non-overlapping) occurrences of a substring.
 * @haystack: a C string
 * @needle: a substring
 *
 * Example:
 *	assert(strcount("aaa aaa", "a") == 6);
 *	assert(strcount("aaa aaa", "ab") == 0);
 *	assert(strcount("aaa aaa", "aa") == 2);
 */
size_t strcount(const char *haystack, const char *needle);

/**
 * STR_MAX_CHARS - Maximum possible size of numeric string for this type.
 * @type_or_expr: a pointer or integer type or expression.
 *
 * This provides enough space for a nul-terminated string which represents the
 * largest possible value for the type or expression.
 *
 * Note: The implementation adds extra space so hex values or negative
 * values will fit (eg. sprintf(... "%p"). )
 *
 * Example:
 *	char str[STR_MAX_CHARS(int)];
 *
 *	sprintf(str, "%i", 7);
 */
#define STR_MAX_CHARS(type_or_expr)				\
	((sizeof(type_or_expr) * CHAR_BIT + 8) / 9 * 3 + 2	\
	 + STR_MAX_CHARS_TCHECK_(type_or_expr))

#if HAVE_TYPEOF
/* Only a simple type can have 0 assigned, so test that.
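 * The statement expression below declares "typeof(type_or_expr) x = 0",
 * which compiles only for scalar (integer or pointer) types; e.g.
 * STR_MAX_CHARS(struct tm) fails at compile time while
 * STR_MAX_CHARS(int) is accepted. The expression itself evaluates to 0,
 * so it adds nothing to the computed size.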
 */
#define STR_MAX_CHARS_TCHECK_(type_or_expr)		\
	({ typeof(type_or_expr) x = 0; (void)x; 0; })
#else
#define STR_MAX_CHARS_TCHECK_(type_or_expr) 0
#endif

/**
 * cisalnum - isalnum() which takes a char (and doesn't accept EOF)
 * @c: a character
 *
 * Surprisingly, the standard ctype.h isalnum() takes an int, which
 * must have the value of EOF (-1) or an unsigned char.  This variant
 * takes a real char, and doesn't accept EOF.
 */
static inline bool cisalnum(char c)
{
	return isalnum((unsigned char)c);
}
static inline bool cisalpha(char c)
{
	return isalpha((unsigned char)c);
}
static inline bool cisascii(char c)
{
	return isascii((unsigned char)c);
}
#if HAVE_ISBLANK
static inline bool cisblank(char c)
{
	return isblank((unsigned char)c);
}
#endif
static inline bool ciscntrl(char c)
{
	return iscntrl((unsigned char)c);
}
static inline bool cisdigit(char c)
{
	return isdigit((unsigned char)c);
}
static inline bool cisgraph(char c)
{
	return isgraph((unsigned char)c);
}
static inline bool cislower(char c)
{
	return islower((unsigned char)c);
}
static inline bool cisprint(char c)
{
	return isprint((unsigned char)c);
}
static inline bool cispunct(char c)
{
	return ispunct((unsigned char)c);
}
static inline bool cisspace(char c)
{
	return isspace((unsigned char)c);
}
static inline bool cisupper(char c)
{
	return isupper((unsigned char)c);
}
static inline bool cisxdigit(char c)
{
	return isxdigit((unsigned char)c);
}

#include <ccan/str_debug.h>

/* These checks force things out of line, hence they are under DEBUG. */
#ifdef CCAN_STR_DEBUG
#include <ccan/build_assert.h>

/* These are commonly misused: they take -1 or an *unsigned* char value. */
#undef isalnum
#undef isalpha
#undef isascii
#undef isblank
#undef iscntrl
#undef isdigit
#undef isgraph
#undef islower
#undef isprint
#undef ispunct
#undef isspace
#undef isupper
#undef isxdigit

/* You can use a char if char is unsigned. */
#if HAVE_BUILTIN_TYPES_COMPATIBLE_P && HAVE_TYPEOF
#define str_check_arg_(i)						\
	((i) + BUILD_ASSERT_OR_ZERO(!__builtin_types_compatible_p(typeof(i), \
								  char)	\
				    || (char)255 > 0))
#else
#define str_check_arg_(i) (i)
#endif

#define isalnum(i) str_isalnum(str_check_arg_(i))
#define isalpha(i) str_isalpha(str_check_arg_(i))
#define isascii(i) str_isascii(str_check_arg_(i))
#if HAVE_ISBLANK
#define isblank(i) str_isblank(str_check_arg_(i))
#endif
#define iscntrl(i) str_iscntrl(str_check_arg_(i))
#define isdigit(i) str_isdigit(str_check_arg_(i))
#define isgraph(i) str_isgraph(str_check_arg_(i))
#define islower(i) str_islower(str_check_arg_(i))
#define isprint(i) str_isprint(str_check_arg_(i))
#define ispunct(i) str_ispunct(str_check_arg_(i))
#define isspace(i) str_isspace(str_check_arg_(i))
#define isupper(i) str_isupper(str_check_arg_(i))
#define isxdigit(i) str_isxdigit(str_check_arg_(i))

#if HAVE_TYPEOF
/* With GNU magic, we can make const-respecting standard string functions. */
#undef strstr
#undef strchr
#undef strrchr

/* + 0 is needed to decay array into pointer.
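 * A useful side effect is that typeof((haystack) + 0) keeps any const
 * qualifier of the argument, so unlike the standard prototypes these
 * wrappers return const char * when given a const string.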
*/ #define strstr(haystack, needle) \ ((typeof((haystack) + 0))str_strstr((haystack), (needle))) #define strchr(haystack, c) \ ((typeof((haystack) + 0))str_strchr((haystack), (c))) #define strrchr(haystack, c) \ ((typeof((haystack) + 0))str_strrchr((haystack), (c))) #endif #endif /* CCAN_STR_DEBUG */ #endif /* CCAN_STR_H */ rdma-core-56.1/ccan/str_debug.h000066400000000000000000000014121477342711600163630ustar00rootroot00000000000000/* CC0 (Public domain) - see LICENSE.CC0 file for details */ #ifndef CCAN_STR_DEBUG_H #define CCAN_STR_DEBUG_H /* #define CCAN_STR_DEBUG 1 */ #ifdef CCAN_STR_DEBUG /* Because we mug the real ones with macros, we need our own wrappers. */ int str_isalnum(int i); int str_isalpha(int i); int str_isascii(int i); #if HAVE_ISBLANK int str_isblank(int i); #endif int str_iscntrl(int i); int str_isdigit(int i); int str_isgraph(int i); int str_islower(int i); int str_isprint(int i); int str_ispunct(int i); int str_isspace(int i); int str_isupper(int i); int str_isxdigit(int i); char *str_strstr(const char *haystack, const char *needle); char *str_strchr(const char *s, int c); char *str_strrchr(const char *s, int c); #endif /* CCAN_STR_DEBUG */ #endif /* CCAN_STR_DEBUG_H */ rdma-core-56.1/debian/000077500000000000000000000000001477342711600145545ustar00rootroot00000000000000rdma-core-56.1/debian/.gitignore000066400000000000000000000002541477342711600165450ustar00rootroot00000000000000/*.debhelper /*.log /*.substvars /files /ibacm/ /ibverbs-providers/ /ibverbs-utils/ /infiniband-diags/ /lib*/ /python3-pyverbs/ /rdma-core/ /rdmacm-utils/ /srptools/ /tmp/ rdma-core-56.1/debian/changelog000066400000000000000000000345171477342711600164400ustar00rootroot00000000000000rdma-core (56.1-1) unstable; urgency=low * New upstream release. -- Nicolas Morey Thu, 03 Apr 2025 08:44:30 +0200 rdma-core (56.0-1) unstable; urgency=medium * New upstream release. -- Jason Gunthorpe Tue, 21 Jan 2025 13:44:36 +0100 rdma-core (55.0-1) unstable; urgency=medium * New upstream release. * Bump Standards-Version to 4.7.0 * Update library symbols -- Benjamin Drung Tue, 21 Jan 2025 12:50:59 +0100 rdma-core (52.0-2) unstable; urgency=medium * Exclude hns provider on archs without coherent DMA (Closes: #1073050) -- Benjamin Drung Thu, 27 Jun 2024 13:49:34 +0200 rdma-core (52.0-1) unstable; urgency=medium * New upstream release. -- Benjamin Drung Mon, 03 Jun 2024 11:07:43 +0200 rdma-core (50.0-2) unstable; urgency=medium * Rename libraries for 64-bit time_t transition (Closes: #1064313) -- Benjamin Drung Thu, 29 Feb 2024 03:11:46 +0100 rdma-core (50.0-1) unstable; urgency=medium * New upstream release. - support cython 3.0.x (Closes: #1056882) - add support for loongarch64 (Closes: #1059022) * Replace obsolete pkg-config by pkgconf * Update years in debian/copyright * Use unversioned library names in package description * Installs udev/systemd files below /usr/lib (Closes: #1059188) -- Benjamin Drung Mon, 26 Feb 2024 17:41:04 +0100 rdma-core (48.0-1) unstable; urgency=medium * New upstream release. -- Benjamin Drung Mon, 23 Oct 2023 18:40:29 +0200 rdma-core (47.0-1) unstable; urgency=medium * New upstream release. * Drop obsolete versioned dependency on lsb-base -- Benjamin Drung Mon, 07 Aug 2023 11:26:40 +0200 rdma-core (44.0-2) unstable; urgency=medium * debian: Add 32-bit MIPS architectures to COHERENT_DMA_ARCHS and inverse the list to NON_COHERENT_DMA_ARCHS (Closes: #1026088) -- Benjamin Drung Tue, 10 Jan 2023 13:59:10 +0100 rdma-core (44.0-1) unstable; urgency=medium * New upstream release. 
- Add Microsoft Azure Network Adapter (MANA) RDMA provider - util: mmio: fix build on MIPS with binutils >= 2.35 * Add 64-bit MIPS architectures to COHERENT_DMA_ARCHS (Closes: #1026088) * debian/watch: Query api.github.com for release tarballs -- Benjamin Drung Tue, 03 Jan 2023 17:29:32 +0100 rdma-core (43.0-1) unstable; urgency=medium * New upstream release. - Install 70-persistent-ipoib.rules into docs instead of /etc (Closes: #958385) -- Benjamin Drung Mon, 24 Oct 2022 18:21:42 +0200 rdma-core (42.0-1) unstable; urgency=medium * New upstream release. * Update overrides for lintian 2.115.2 * Bump Standards-Version to 4.6.1 -- Benjamin Drung Thu, 18 Aug 2022 15:24:14 +0200 rdma-core (40.0-1) unstable; urgency=medium [ Benjamin Drung ] * New upstream release. * Update my email address to @ubuntu.com [ Heinrich Schuchardt ] * Add riscv64 to COHERENT_DMA_ARCHS -- Benjamin Drung Mon, 16 May 2022 13:55:12 +0200 rdma-core (39.0-1) unstable; urgency=medium * New upstream release. * Remove i40iw provider conffile (Closes: #1000562) * Update overrides for lintian 2.114 * Override obsolete-command-in-modprobe.d-file lintian error * Override spare-manual-page lintian complaint -- Benjamin Drung Thu, 27 Jan 2022 19:57:04 +0100 rdma-core (38.0-1) unstable; urgency=medium * New upstream release. * debian: Add __verbs_log to private libibverbs symbols -- Benjamin Drung Fri, 19 Nov 2021 17:57:18 +0100 rdma-core (36.0-2) unstable; urgency=medium * Revert installing systemd services in /usr/lib/systemd/system. See #994388 for more details. (Closes: #997727) -- Benjamin Drung Tue, 09 Nov 2021 11:13:56 +0100 rdma-core (36.0-1) unstable; urgency=medium * New upstream release. * Bump Standards-Version to 4.6.0 * Install systemd services in /usr/lib/systemd/system * Mark libraries as Multi-Arch: same -- Benjamin Drung Mon, 13 Sep 2021 14:09:01 +0200 rdma-core (33.2-1) unstable; urgency=medium * New upstream bug-fix release: - libhns: Fix wrong range of a mask - verbs: Fix attr_optional() when 'IOCTL_MODE=write' is used - mlx4: Fix mlx4_read_clock returned errno value - efa: Fix use of uninitialized query device response - libhns: Avoid accessing NULL pointer when locking/unlocking CQ - mlx5: Fix mlx5_read_clock returned errno value - bnxt_re/lib: Check AH handler validity before use - iwpmd: Check returned value of parse_iwpm_msg - libhns: Bugfix for calculation of extended sge -- Benjamin Drung Thu, 03 Jun 2021 11:19:24 +0200 rdma-core (33.1+git20210317-1) unstable; urgency=medium * New upstream bug-fix snapshot: - mlx5: Fix uuars to have the 'uar_mmap_offset' data - pyverbs: Fix Mlx5 QP constructor - efa: Fix DV extension clear check - verbs: Fix possible port loop overflow - ibacm: Fix possible port loop overflow - mlx5: DR, Check new cap for isolated VL TC QP - kernel-boot: Fix VF lookup - mlx5: DR, Force QP drain on table creation - libhns: Avoid double release of a pointer - libhns: Correct definition of DB_BYTE_4_TAG_M - libhns: Remove assert to check whether a pointer is NULL - libhns: Remove the unnecessary mask on QPN of CQE - libhns: Remove unnecessary mask of ownerbit - libhns: Remove unnecessary barrier when poll cq - rdma-ndd: fix udev racy issue for system with multiple InfiniBand HCAs - verbs: Fix create CQ comp_mask check * Update my email address -- Benjamin Drung Mon, 12 Apr 2021 11:28:57 +0200 rdma-core (33.1-1) unstable; urgency=medium * New upstream bugfix release. -- Benjamin Drung Wed, 27 Jan 2021 14:32:48 +0100 rdma-core (33.0-1) unstable; urgency=medium * New upstream release. 
-- Benjamin Drung Mon, 04 Jan 2021 16:41:27 +0100 rdma-core (32.0-1) unstable; urgency=medium * New upstream release. -- Benjamin Drung Fri, 30 Oct 2020 10:01:11 +0100 rdma-core (31.0-1) unstable; urgency=medium * New upstream release. * Switch to debhelper 13 -- Benjamin Drung Wed, 19 Aug 2020 09:36:17 +0200 rdma-core (29.0-1) unstable; urgency=medium * New upstream release. -- Benjamin Drung Tue, 14 Apr 2020 16:15:54 +0200 rdma-core (28.0-1) unstable; urgency=medium * New upstream release. - rxe: Remove rxe_cfg * Bump Standards-Version to 4.5.0 -- Benjamin Drung Wed, 12 Feb 2020 17:21:38 +0100 rdma-core (27.0-2) unstable; urgency=medium [ Debian Janitor ] * Set upstream metadata fields: Repository, Repository-Browse. [ Benjamin Drung ] * debian: Remove obsolete ibverbs-providers conffiles (Closes: #947307) -- Benjamin Drung Mon, 06 Jan 2020 13:23:44 +0100 rdma-core (27.0-1) unstable; urgency=medium * New upstream release - libcxgb3: Remove libcxgb3 from rdma-core - libnes: Remove libnes from rdma-core * Add missing build dependency dh-python -- Benjamin Drung Mon, 23 Dec 2019 13:22:46 +0100 rdma-core (26.0-2) unstable; urgency=medium * Improve/extent description of python3-pyverbs * Bump Standards-Version to 4.4.1 (no changes required) * Add Rules-Requires-Root: no -- Benjamin Drung Tue, 29 Oct 2019 13:22:15 +0100 rdma-core (26.0-1) unstable; urgency=medium * New upstream release. - Include infiniband-diags source package producing infiniband-diags, libibmad5, libibmad-dev, libibnetdisc5, and libibnetdisc-dev. * Update private libibverbs symbols * Specify Build-Depends-Package for libibmad5 and libibnetdisc5 -- Benjamin Drung Thu, 24 Oct 2019 11:27:45 +0200 rdma-core (24.0-2) unstable; urgency=medium * Skip installing efa if the architecture lacks coherent DMA support -- Benjamin Drung Thu, 11 Jul 2019 12:34:23 +0200 rdma-core (24.0-1) unstable; urgency=medium * New upstream release. * Drop pyverbs-Add-shebang-to-ib_devices.py-example.patch (applied upstream) * Bump Standards-Version to 4.4.0 (no changes needed) * Switch to debhelper 12 * Add Pre-Depends on ${misc:Pre-Depends} * Drop debug symbol migration -- Benjamin Drung Wed, 10 Jul 2019 12:39:27 +0200 rdma-core (22.1-1) unstable; urgency=medium * New upstream bugfix release. -- Benjamin Drung Wed, 06 Feb 2019 15:58:48 +0100 rdma-core (22.0-1) unstable; urgency=medium * New upstream release. - mlx5: Add DEVX APIs for interop with verbs objects - Add pyverbs Python binding * Update private libibverbs symbols * Bump Standards-Version to 4.3.0 (no changes required) -- Benjamin Drung Tue, 22 Jan 2019 13:27:29 +0100 rdma-core (21.0-1) unstable; urgency=medium * New upstream release. - Drop ibacm sysV init script to avoid issues with the sysV to systemd wrapper starting the service instead of the socket (LP: #1794825) - Include static libraries in the build * Update private libibverbs symbols * Specify Build-Depends-Package in symbols -- Benjamin Drung Tue, 20 Nov 2018 11:49:25 +0100 rdma-core (20.0-1) unstable; urgency=medium * New upstream release. - Switch from net-tools to iproute2 for rxe_cfg - Install pkg-config files * Update libibverbs symbols and let libibverbs1 break ibverbs-providers < 20~ * Drop all patches (accepted upstream) * Bump Standards-Version to 4.2.1 (no changes needed) -- Benjamin Drung Mon, 10 Sep 2018 11:23:11 +0200 rdma-core (19.0-1) unstable; urgency=medium * New upstream release. 
* Switch to debhelper 11 * Add patch to fix bad whatis entries in man pages -- Benjamin Drung Thu, 28 Jun 2018 15:01:27 +0200 rdma-core (18.1-1) unstable; urgency=medium * New upstream bugfix release. * Drop all patches (applied upstream) -- Benjamin Drung Tue, 12 Jun 2018 11:53:44 +0200 rdma-core (18.0-1) unstable; urgency=medium * New upstream release. * Update private libibverbs symbols and let libibverbs1 break ibverbs-providers < 18~ * Fix bad whatis entries in man pages * Fix spelling mistakes in ibv_create_flow_action.3 man page * Use versioned Breaks & Replaces for ibverbs-providers to make it multi-arch coinstallable (Closes: #898055) -- Benjamin Drung Mon, 07 May 2018 13:40:40 +0200 rdma-core (17.1-2) unstable; urgency=medium * Support for new architecture riscv64 (Closes: #894995) by - Whitelist (instead of blacklist) architectures that support valgrind - Whitelist (instead of blacklist) coherent DMA supporting architectures * Bump Standards-Version to 4.1.4 (no changes needed) -- Benjamin Drung Mon, 30 Apr 2018 19:01:44 +0200 rdma-core (17.1-1) unstable; urgency=medium * New upstream bugfix release. -- Benjamin Drung Mon, 19 Mar 2018 13:32:31 +0100 rdma-core (17.0-1) unstable; urgency=medium * New upstream release - Remove the obsolete libibcm library * Update private libibverbs symbols and let libibverbs1 break ibverbs-providers < 17~ * Update copyright for kernel-headers directory -- Benjamin Drung Mon, 19 Feb 2018 12:47:42 +0100 rdma-core (16.2-1) unstable; urgency=medium * New upstream bugfix release * Guard udevadm call again * Override intentional systemd WantedBy= relationship lintian warning -- Benjamin Drung Thu, 15 Feb 2018 11:41:14 +0100 rdma-core (16.1-2) unstable; urgency=medium * Do not require valgrind on ia64 (Closes: #887511) -- Benjamin Drung Fri, 19 Jan 2018 12:37:05 +0100 rdma-core (16.1-1) unstable; urgency=medium * New upstream bugfix release. * Bump Standards-Version to 4.1.3 (no changes needed) * Add udev dependency to rdma-core and srptools -- Benjamin Drung Thu, 04 Jan 2018 14:42:26 +0100 rdma-core (16.0-1) unstable; urgency=medium * New upstream release. * Update private libibverbs symbols * Bump Standards-Version to 4.1.2 (no changes needed) -- Benjamin Drung Tue, 12 Dec 2017 11:01:38 +0100 rdma-core (15.1-1) unstable; urgency=medium * New upstream release. * Add m68k as non-coherent DMA architecture * Mark libraries as Multi-Arch: same -- Benjamin Drung Thu, 30 Nov 2017 12:08:26 +0100 rdma-core (15-3) unstable; urgency=medium * debian/rules: Include architecture.mk for DEB_HOST_ARCH definition * Add alpha, hppa, sh4 as non-coherent DMA archs * Do not require valgrind on x32 (not available there due to build failure) -- Benjamin Drung Thu, 16 Nov 2017 17:33:48 +0100 rdma-core (15-2) unstable; urgency=medium * Do not build ibacm for non-Linux architectures * Do not require valgrind if not available * Let libibverbs1 15 break ibverbs-providers 14 * Drop dh-systemd build dependency * Bump Standards-Version to 4.1.1 (no changes needed) * Drop lintian overrides for false positives * Set myself as maintainer (instead of linux-rdma) * Do not try to install disabled ibverbs providers on architectures that do not provide cache coherent DMA (Closes: #881731) * Explicitly list private libibverbs symbols -- Benjamin Drung Thu, 16 Nov 2017 12:55:28 +0100 rdma-core (15-1) unstable; urgency=medium * New upstream version. ibverbs-providers combines the source packages libcxgb3, libipathverbs, libmlx4, libmlx5, libmthca, and libnes. 
rdma-core also combines the source packages ibacm, libibcm, libibumad,
    libibverbs, librdmacm, and srptools (Closes: #848971)

 -- Benjamin Drung  Mon, 18 Sep 2017 11:00:39 +0200
rdma-core-56.1/debian/compat000066400000000000000000000000031477342711600157530ustar00rootroot0000000000000010
rdma-core-56.1/debian/control000066400000000000000000000344161477342711600161660ustar00rootroot00000000000000Source: rdma-core
Maintainer: Linux RDMA Mailing List
Uploaders: Benjamin Drung , Talat Batheesh
Section: net
Priority: optional
Build-Depends: cmake (>= 2.8.11), cython3, debhelper (>= 10), dh-python, dpkg-dev (>= 1.17), libnl-3-dev, libnl-route-3-dev, libsystemd-dev, libudev-dev, ninja-build, pandoc, pkg-config, python3-dev, python3-docutils, valgrind [amd64 arm64 armhf i386 mips mips64el mipsel powerpc ppc64 ppc64el s390x]
Rules-Requires-Root: no
Standards-Version: 4.7.0
Vcs-Git: https://github.com/linux-rdma/rdma-core.git
Vcs-Browser: https://github.com/linux-rdma/rdma-core
Homepage: https://github.com/linux-rdma/rdma-core

Package: rdma-core
Architecture: linux-any
Depends: udev, ${misc:Depends}, ${perl:Depends}, ${shlibs:Depends}
Pre-Depends: ${misc:Pre-Depends}
Recommends: dmidecode, ethtool, iproute2
Breaks: infiniband-diags (<< 2.0.0)
Replaces: infiniband-diags (<< 2.0.0)
Description: RDMA core userspace infrastructure and documentation
 This package provides the basic boot time support for systems that use
 the Linux kernel's remote direct memory access (RDMA) subsystem, which
 includes InfiniBand, iWARP, and RDMA over Converged Ethernet (RoCE).
 .
 Several kernel RDMA support daemons are included:
  - The rdma-ndd daemon which watches for RDMA device changes and/or
    hostname changes and updates the Node Description of the RDMA devices
    based on those changes.
  - The iWARP Port Mapper Daemon (iwpmd) which provides a kernel support
    service in userspace for iWARP drivers to claim TCP ports through the
    standard socket interface.

Package: ibacm
Architecture: linux-any
Depends: rdma-core (>= 15), ${misc:Depends}, ${shlibs:Depends}
Description: InfiniBand Communication Manager Assistant (ACM)
 The IB ACM implements and provides a framework for name, address, and
 route (path) resolution services over InfiniBand. It is intended to
 address connection setup scalability issues running MPI applications on
 large clusters. The IB ACM provides information needed to establish a
 connection, but does not implement the CM protocol. A primary user of
 the ibacm service is the librdmacm library.

Package: ibverbs-providers
Architecture: linux-any
Multi-Arch: same
Depends: ${misc:Depends}, ${shlibs:Depends}
Provides: libefa1, libipathverbs1, libmana1, libmlx4-1, libmlx5-1, libmthca1
Replaces: libipathverbs1 (<< 15), libmlx4-1 (<< 15), libmlx5-1 (<< 15), libmthca1 (<< 15)
Breaks: libipathverbs1 (<< 15), libmlx4-1 (<< 15), libmlx5-1 (<< 15), libmthca1 (<< 15)
Description: User space provider drivers for libibverbs
 libibverbs is a library that allows userspace processes to use RDMA
 "verbs" as described in the InfiniBand Architecture Specification and
 the RDMA Protocol Verbs Specification. iWARP ethernet NICs support RDMA
 over hardware-offloaded TCP/IP, while InfiniBand is a high-throughput,
 low-latency networking technology. InfiniBand host channel adapters
 (HCAs) and iWARP NICs commonly support direct hardware access from
 userspace (kernel bypass), and libibverbs supports this when available.
 .
 An RDMA driver consists of a kernel portion and a user space portion.
 This package contains the user space verbs drivers:
 .
  - bnxt_re: Broadcom NetXtreme-E RoCE HCAs
  - cxgb4: Chelsio T4 iWARP HCAs
  - efa: Amazon Elastic Fabric Adapter
  - erdma: Alibaba Elastic RDMA (iWARP) Adapter
  - hfi1verbs: Intel Omni-Path HFI
  - hns: HiSilicon Hip06 SoC
  - ipathverbs: QLogic InfiniPath HCAs
  - irdma: Intel Ethernet Connection RDMA
  - mana: Microsoft Azure Network Adapter
  - mlx4: Mellanox ConnectX-3 InfiniBand HCAs
  - mlx5: Mellanox Connect-IB/X-4+ InfiniBand HCAs
  - mthca: Mellanox InfiniBand HCAs
  - ocrdma: Emulex OneConnect RDMA/RoCE device
  - qedr: QLogic QL4xxx RoCE HCAs
  - rxe: A software implementation of the RoCE protocol
  - siw: A software implementation of the iWARP protocol
  - vmw_pvrdma: VMware paravirtual RDMA device

Package: ibverbs-utils
Architecture: linux-any
Depends: ${misc:Depends}, ${shlibs:Depends}
Description: Examples for the libibverbs library
 libibverbs is a library that allows userspace processes to use RDMA
 "verbs" as described in the InfiniBand Architecture Specification and
 the RDMA Protocol Verbs Specification. iWARP ethernet NICs support RDMA
 over hardware-offloaded TCP/IP, while InfiniBand is a high-throughput,
 low-latency networking technology. InfiniBand host channel adapters
 (HCAs) and iWARP NICs commonly support direct hardware access from
 userspace (kernel bypass), and libibverbs supports this when available.
 .
 This package contains useful libibverbs1 example programs such as
 ibv_devinfo, which displays information about InfiniBand devices.

Package: libibverbs-dev
Section: libdevel
Architecture: linux-any
Multi-Arch: same
Depends: ibverbs-providers (= ${binary:Version}), libibverbs1 (= ${binary:Version}), libnl-3-dev, libnl-route-3-dev, ${misc:Depends}
Description: Development files for the libibverbs library
 libibverbs is a library that allows userspace processes to use RDMA
 "verbs" as described in the InfiniBand Architecture Specification and
 the RDMA Protocol Verbs Specification. iWARP ethernet NICs support RDMA
 over hardware-offloaded TCP/IP, while InfiniBand is a high-throughput,
 low-latency networking technology. InfiniBand host channel adapters
 (HCAs) and iWARP NICs commonly support direct hardware access from
 userspace (kernel bypass), and libibverbs supports this when available.
 .
 This package is needed to compile programs against libibverbs1. It
 contains the header files and static libraries (optionally) needed for
 compiling.

Package: libibverbs1
Architecture: linux-any
Multi-Arch: same
Section: libs
Pre-Depends: ${misc:Pre-Depends}
Depends: adduser, ${misc:Depends}, ${shlibs:Depends}
Recommends: ibverbs-providers
Breaks: ibverbs-providers (<< 34~)
Description: Library for direct userspace use of RDMA (InfiniBand/iWARP)
 libibverbs is a library that allows userspace processes to use RDMA
 "verbs" as described in the InfiniBand Architecture Specification and
 the RDMA Protocol Verbs Specification. iWARP ethernet NICs support RDMA
 over hardware-offloaded TCP/IP, while InfiniBand is a high-throughput,
 low-latency networking technology. InfiniBand host channel adapters
 (HCAs) and iWARP NICs commonly support direct hardware access from
 userspace (kernel bypass), and libibverbs supports this when available.
 .
 For this library to be useful, a device-specific plug-in module should
 also be installed.
 .
 This package contains the shared library.
Package: libibumad-dev
Section: libdevel
Architecture: linux-any
Multi-Arch: same
Depends: libibumad3 (= ${binary:Version}), ${misc:Depends}
Description: Development files for libibumad
 libibumad provides userspace InfiniBand Management Datagram (uMAD)
 functions which sit on top of the uMAD modules in the kernel. These are
 used by InfiniBand diagnostic and management tools.
 .
 This package is needed to compile programs against libibumad. It
 contains the header files and static libraries (optionally) needed for
 compiling.

Package: libibumad3
Architecture: linux-any
Multi-Arch: same
Section: libs
Pre-Depends: ${misc:Pre-Depends}
Depends: ${misc:Depends}, ${shlibs:Depends}
Description: InfiniBand Userspace Management Datagram (uMAD) library
 libibumad provides userspace InfiniBand Management Datagram (uMAD)
 functions which sit on top of the uMAD modules in the kernel. These are
 used by InfiniBand diagnostic and management tools.
 .
 This package contains the shared library.

Package: librdmacm-dev
Section: libdevel
Architecture: linux-any
Multi-Arch: same
Depends: libibverbs-dev, librdmacm1 (= ${binary:Version}), ${misc:Depends}
Description: Development files for the librdmacm library
 librdmacm is a library that allows applications to set up reliable
 connected and unreliable datagram transfers when using RDMA adapters. It
 provides a transport-neutral interface in the sense that the same code
 can be used for both InfiniBand and iWARP adapters. The interface is
 based on sockets, but adapted for queue pair (QP) based semantics:
 communication must use a specific RDMA device, and data transfers are
 message-based.
 .
 librdmacm only provides communication management (connection setup and
 tear-down) and works in conjunction with the verbs interface provided by
 libibverbs, which provides the interface used to actually transfer data.
 .
 This package is needed to compile programs against librdmacm. It
 contains the header files and static libraries (optionally) needed for
 compiling.

Package: librdmacm1
Architecture: linux-any
Multi-Arch: same
Section: libs
Pre-Depends: ${misc:Pre-Depends}
Depends: ${misc:Depends}, ${shlibs:Depends}
Description: Library for managing RDMA connections
 librdmacm is a library that allows applications to set up reliable
 connected and unreliable datagram transfers when using RDMA adapters. It
 provides a transport-neutral interface in the sense that the same code
 can be used for both InfiniBand and iWARP adapters. The interface is
 based on sockets, but adapted for queue pair (QP) based semantics:
 communication must use a specific RDMA device, and data transfers are
 message-based.
 .
 librdmacm only provides communication management (connection setup and
 tear-down) and works in conjunction with the verbs interface provided by
 libibverbs, which provides the interface used to actually transfer data.
 .
 This package contains the shared library.

Package: rdmacm-utils
Architecture: linux-any
Depends: ${misc:Depends}, ${shlibs:Depends}
Description: Examples for the librdmacm library
 librdmacm is a library that allows applications to set up reliable
 connected and unreliable datagram transfers when using RDMA adapters. It
 provides a transport-neutral interface in the sense that the same code
 can be used for both InfiniBand and iWARP adapters. The interface is
 based on sockets, but adapted for queue pair (QP) based semantics:
 communication must use a specific RDMA device, and data transfers are
 message-based.
 .
 librdmacm only provides communication management (connection setup and
 tear-down) and works in conjunction with the verbs interface provided by
 libibverbs, which provides the interface used to actually transfer data.
 .
 This package contains useful librdmacm example programs such as rping
 and udaddy.

Package: srptools
Architecture: linux-any
Depends: rdma-core (>= 15), udev, ${misc:Depends}, ${shlibs:Depends}
Pre-Depends: ${misc:Pre-Depends}
Description: Tools for InfiniBand attached storage (SRP)
 In conjunction with the kernel ib_srp driver, srptools allows you to
 discover and use InfiniBand attached storage devices which use the SCSI
 RDMA Protocol (SRP).

Package: python3-pyverbs
Section: python
Architecture: linux-any
Depends: rdma-core (>= 21), ${misc:Depends}, ${python3:Depends}, ${shlibs:Depends}
Provides: ${python3:Provides}
Description: Python bindings for rdma-core
 Pyverbs provides a Python API over rdma-core, the Linux userspace C API
 for the remote direct memory access (RDMA) stack.
 .
 One goal is to provide easier access to RDMA: RDMA has a steep learning
 curve as is and the C interface requires the user to initialize multiple
 structs before having usable objects. Pyverbs attempts to remove much of
 this overhead and provide a smoother user experience.

Package: infiniband-diags
Architecture: linux-any
Depends: libibnetdisc5 (= ${binary:Version}), ${misc:Depends}, ${perl:Depends}, ${shlibs:Depends}
Description: InfiniBand diagnostic programs
 InfiniBand is a switched fabric communications link used in
 high-performance computing and enterprise data centers. Its features
 include high throughput, low latency, quality of service and failover,
 and it is designed to be scalable.
 .
 This package provides diagnostic programs and scripts needed to diagnose
 an InfiniBand subnet.

Package: libibmad5
Section: libs
Architecture: linux-any
Multi-Arch: same
Pre-Depends: ${misc:Pre-Depends}
Depends: ${misc:Depends}, ${shlibs:Depends}
Description: InfiniBand Management Datagram (MAD) library
 libibmad provides low layer InfiniBand functions for use by the
 InfiniBand diagnostic and management programs. These include Management
 Datagrams (MAD), Subnet Administration (SA), Subnet Management Packets
 (SMP) and other basic functions.
 .
 This package contains the shared library.

Package: libibmad-dev
Section: libdevel
Architecture: linux-any
Multi-Arch: same
Depends: libibmad5 (= ${binary:Version}), ${misc:Depends}
Description: Development files for libibmad
 libibmad provides low layer InfiniBand functions for use by the
 InfiniBand diagnostic and management programs. These include Management
 Datagrams (MAD), Subnet Administration (SA), Subnet Management Packets
 (SMP) and other basic functions.
 .
 This package is needed to compile programs against libibmad5. It
 contains the header files and static libraries (optionally) needed for
 compiling.

Package: libibnetdisc5
Section: libs
Architecture: linux-any
Multi-Arch: same
Pre-Depends: ${misc:Pre-Depends}
Depends: ${misc:Depends}, ${shlibs:Depends}
Description: InfiniBand diagnostics library
 InfiniBand is a switched fabric communications link used in
 high-performance computing and enterprise data centers. Its features
 include high throughput, low latency, quality of service and failover,
 and it is designed to be scalable.
 .
 This package provides libraries required by the InfiniBand diagnostic
 programs.
Package: libibnetdisc-dev Section: libdevel Architecture: linux-any Multi-Arch: same Depends: libibnetdisc5 (= ${binary:Version}), ${misc:Depends} Breaks: infiniband-diags (<< 2.0.0) Replaces: infiniband-diags (<< 2.0.0) Description: InfiniBand diagnostics library headers InfiniBand is a switched fabric communications link used in high-performance computing and enterprise data centers. Its features include high throughput, low latency, quality of service and failover, and it is designed to be scalable. . This package provides development files required to build applications against the libibnetdisc InfiniBand diagnostic libraries. rdma-core-56.1/debian/copyright000066400000000000000000000747331477342711600165250ustar00rootroot00000000000000Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ Upstream-Name: rdma-core Upstream-Contact: Doug Ledford , Leon Romanovsky Source: https://github.com/linux-rdma/rdma-core Files: * Copyright: disclaimed License: BSD-MIT or GPL-2 Files: debian/* Copyright: 2008, Genome Research Ltd 2014, Ana Beatriz Guerrero Lopez 2015-2016, Jason Gunthorpe 2016-2024, Benjamin Drung 2016-2017, Talat Batheesh License: GPL-2+ This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. . This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. . You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. . On Debian systems, the full text of the GNU General Public License version 2 can be found in the file `/usr/share/common-licenses/GPL-2'. Files: CMakeLists.txt Copyright: 2015-2017, Obsidian Research Corporation. License: BSD-MIT or GPL-2 Files: buildlib/* Copyright: 2015-2017, Obsidian Research Corporation. 2016-2017 Mellanox Technologies, Inc License: BSD-MIT or GPL-2 Files: buildlib/fixup-include/stdatomic.h Copyright: 2011 Ed Schouten David Chisnall License: BSD-2-clause Files: ccan/* Copyright: unspecified License: CC0 Files: ccan/list.* Copyright: unspecified License: MIT Files: ibacm/* Copyright: 2009-2014, Intel Corporation. 2013, Mellanox Technologies LTD. License: BSD-MIT Files: ibacm/man/* ibacm/ibacm.init.in Copyright: disclaimed License: BSD-2-clause Files: ibacm/CMakeLists.txt ibacm/ibacm_hosts.data Copyright: disclaimed License: BSD-MIT or GPL-2 Files: iwpmd/* Copyright: 2013-2016, Intel Corporation. License: BSD-MIT or GPL-2 Files: kernel-headers/* Copyright: disclaimed License: GPL-2 or BSD-2-clause Files: kernel-headers/rdma/rdma_netlink.h Copyright: disclaimed License: GPL-2 Files: kernel-headers/rdma/hfi/* Copyright: disclaimed License: GPL-2 or BSD-3-clause Files: libibumad/* Copyright: 2004-2017, Mellanox Technologies Ltd. 2004, Infinicon Corporation. 2004-2014, Intel Corporation. 2004, Topspin Corporation. 2004-2009, Voltaire Inc. 2013 Lawrence Livermore National Security 2013, Oracle and/or its affiliates. License: BSD-MIT or GPL-2 Files: libibumad/man/* Copyright: disclaimed License: BSD-2-clause Files: libibverbs/* Copyright: 2004-2012, Intel Corporation. 2004-2005, Topspin Communications. 2005-2007, Cisco Systems, Inc. 
2005, PathScale, Inc. 2005, Mellanox Technologies Ltd. 2005, Voltaire, Inc. 2008, Lawrence Livermore National Laboratory. License: BSD-MIT or GPL-2 Files: libibverbs/man/* libibverbs/neigh.h libibverbs/neigh.c Copyright: disclaimed License: BSD-2-clause Files: librdmacm/* Copyright: 2005-2014, Intel Corporation. 2005, Ammasso, Inc. 2005, Voltaire Inc. 2006, Open Grid Computing, Inc. 2014-2015, Mellanox Technologies LTD. License: BSD-MIT or GPL-2 Files: librdmacm/examples/cmtime.c librdmacm/examples/rcopy.c librdmacm/examples/rdma_client.c librdmacm/examples/rdma_server.c librdmacm/examples/rdma_xclient.c librdmacm/examples/rdma_xserver.c librdmacm/examples/riostream.c librdmacm/examples/rstream.c librdmacm/examples/udpong.c Copyright: 2005-2014, Intel Corporation. 2014-2015, Mellanox Technologies LTD. License: BSD-MIT Files: librdmacm/docs/rsocket Copyright: disclaimed License: BSD-2-clause Files: librdmacm/man/* Copyright: disclaimed License: BSD-2-clause Files: providers/bnxt_re/* Copyright: 2015-2017, Broadcom Limited and/or its subsidiaries License: BSD-2-clause or GPL-2 Files: providers/cxgb4/* Copyright: 2003-2016, Chelsio Communications, Inc. License: BSD-MIT or GPL-2 Files: providers/efa/* pyverbs/providers/efa/* Copyright: 2019-2024 Amazon.com, Inc. or its affiliates. License: BSD-2-clause or GPL-2 Files: providers/erdma/* Copyright: 2020-2021, Alibaba Group License: BSD-MIT or GPL-2 Files: providers/hfi1verbs/* Copyright: 2005 PathScale, Inc. 2006-2009 QLogic Corporation 2015 Intel Corporation License: BSD-3-clause or GPL-2 Files: providers/hns/* Copyright: 2016, Hisilicon Limited. License: BSD-MIT or GPL-2 Files: providers/ipathverbs/* Copyright: 2006-2010, QLogic Corp. 2005, PathScale, Inc. 2013, Intel Corporation License: BSD-MIT or GPL-2 Files: providers/irdma/* Copyright: 2015-2023, Intel Corporation. License: BSD-MIT or GPL-2 Files: providers/mana/* Copyright: 2022, Microsoft Corporation. License: BSD-MIT or GPL-2 Files: providers/mlx4/* Copyright: 2004-2005, Topspin Communications. 2005-2007, Cisco, Inc. 2005-2017, Mellanox Technologies Ltd. License: BSD-MIT or GPL-2 Files: providers/mlx5/* Copyright: 2019, Mellanox Technologies, Inc. 2020-2024 Nvidia, Inc. License: BSD-MIT or GPL-2 Files: providers/mlx5/man/*.3 providers/mlx5/man/*.7 Copyright: disclaimed License: BSD-MIT Files: providers/mthca/* Copyright: 2004-2005, Topspin Communications. 2005-2006, Cisco Systems. 2005, Mellanox Technologies Ltd. License: BSD-MIT or GPL-2 Files: providers/ocrdma/* Copyright: 2008-2013, Emulex. License: BSD-2-clause or GPL-2 Files: providers/qedr/* Copyright: 2015-2016, QLogic Corporation. License: BSD-MIT or GPL-2 Files: providers/rxe/* Copyright: 2009-2011, System Fabric Works, Inc. 2009-2011, Mellanox Technologies Ltd. 2006-2007, QLogic Corporation. 2005, PathScale, Inc. License: BSD-MIT or GPL-2 Files: providers/siw/* Copyright: 2008-2019, IBM Corporation. License: BSD-3-clause or GPL-2 Files: providers/vmw_pvrdma/* Copyright: 2012-2016 VMware, Inc. License: BSD-2-clause or GPL-2 Files: rdma-ndd/* Copyright: 2004-2016, Intel Corporation. License: BSD-MIT or GPL-2 Files: redhat/* Copyright: 1996-2013, Red Hat, Inc. License: GPL-2 Files: srp_daemon/* Copyright: 2005, Topspin Communications. 2006, Cisco Systems, Inc. 2006, Mellanox Technologies Ltd. License: BSD-MIT or GPL-2 Files: srp_daemon/srp_daemon.8.in Copyright: 2006 Mellanox Technologies. 
License: CPL-1.0 or BSD-2-clause or GPL-2 Files: srp_daemon/srpd.in srp_daemon/ibsrpdm.8 Copyright: disclaimed License: BSD-2-clause Files: util/udma_barrier.h Copyright: 2005 Topspin Communications. License: BSD-MIT or GPL-2 License: BSD-MIT OpenIB.org BSD license (MIT variant) . Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: . - Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. . - Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. . THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. License: BSD-2-clause OpenIB.org BSD license (FreeBSD Variant) . Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: . - Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. . - Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. . THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. License: BSD-3-clause Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: . * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of Intel Corporation nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. . THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. License: GPL-2 This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; version 2 of the License. . This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. . You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. . On Debian systems, the full text of the GNU General Public License version 2 can be found in the file `/usr/share/common-licenses/GPL-2'. License: CC0 The laws of most jurisdictions throughout the world automatically confer exclusive Copyright and Related Rights (defined below) upon the creator and subsequent owner(s) (each and all, an "owner") of an original work of authorship and/or a database (each, a "Work"). . Certain owners wish to permanently relinquish those rights to a Work for the purpose of contributing to a commons of creative, cultural and scientific works ("Commons") that the public can reliably and without fear of later claims of infringement build upon, modify, incorporate in other works, reuse and redistribute as freely as possible in any form whatsoever and for any purposes, including without limitation commercial purposes. These owners may contribute to the Commons to promote the ideal of a free culture and the further production of creative, cultural and scientific works, or to gain reputation or greater distribution for their Work in part through the use and efforts of others. . For these and/or other purposes and motivations, and without any expectation of additional consideration or compensation, the person associating CC0 with a Work (the "Affirmer"), to the extent that he or she is an owner of Copyright and Related Rights in the Work, voluntarily elects to apply CC0 to the Work and publicly distribute the Work under its terms, with knowledge of his or her Copyright and Related Rights in the Work and the meaning and intended legal effect of CC0 on those rights. . 1. Copyright and Related Rights. A Work made available under CC0 may be protected by copyright and related or neighboring rights ("Copyright and Related Rights"). Copyright and Related Rights include, but are not limited to, the following: . 
the right to reproduce, adapt, distribute, perform, display, communicate, and translate a Work; moral rights retained by the original author(s) and/or performer(s); publicity and privacy rights pertaining to a person's image or likeness depicted in a Work; rights protecting against unfair competition in regards to a Work, subject to the limitations in paragraph 4(a), below; rights protecting the extraction, dissemination, use and reuse of data in a Work; database rights (such as those arising under Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases, and under any national implementation thereof, including any amended or successor version of such directive); and other similar, equivalent or corresponding rights throughout the world based on applicable law or treaty, and any national implementations thereof. . 2. Waiver. To the greatest extent permitted by, but not in contravention of, applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and unconditionally waives, abandons, and surrenders all of Affirmer's Copyright and Related Rights and associated claims and causes of action, whether now known or unknown (including existing as well as future claims and causes of action), in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each member of the public at large and to the detriment of Affirmer's heirs and successors, fully intending that such Waiver shall not be subject to revocation, rescission, cancellation, termination, or any other legal or equitable action to disrupt the quiet enjoyment of the Work by the public as contemplated by Affirmer's express Statement of Purpose. . 3. Public License Fallback. Should any part of the Waiver for any reason be judged legally invalid or ineffective under applicable law, then the Waiver shall be preserved to the maximum extent permitted taking into account Affirmer's express Statement of Purpose. In addition, to the extent the Waiver is so judged Affirmer hereby grants to each affected person a royalty-free, non transferable, non sublicensable, non exclusive, irrevocable and unconditional license to exercise Affirmer's Copyright and Related Rights in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "License"). The License shall be deemed effective as of the date CC0 was applied by Affirmer to the Work. Should any part of the License for any reason be judged legally invalid or ineffective under applicable law, such partial invalidity or ineffectiveness shall not invalidate the remainder of the License, and in such case Affirmer hereby affirms that he or she will not (i) exercise any of his or her remaining Copyright and Related Rights in the Work or (ii) assert any associated claims and causes of action with respect to the Work, in either case contrary to Affirmer's express Statement of Purpose. . 4. Limitations and Disclaimers. . 
No trademark or patent rights held by Affirmer are waived, abandoned, surrendered, licensed or otherwise affected by this document. Affirmer offers the Work as-is and makes no representations or warranties of any kind concerning the Work, express, implied, statutory or otherwise, including without limitation warranties of title, merchantability, fitness for a particular purpose, non infringement, or the absence of latent or other defects, accuracy, or the present or absence of errors, whether or not discoverable, all to the greatest extent permissible under applicable law. Affirmer disclaims responsibility for clearing rights of other persons that may apply to the Work or any use thereof, including without limitation any person's Copyright and Related Rights in the Work. Further, Affirmer disclaims responsibility for obtaining any necessary consents, permissions or other rights required for any use of the Work. Affirmer understands and acknowledges that Creative Commons is not a party to this document and has no duty or obligation with respect to this CC0 or use of the Work. License: MIT Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: . The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. . THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. License: CPL-1.0 THE ACCOMPANYING PROGRAM IS PROVIDED UNDER THE TERMS OF THIS COMMON PUBLIC LICENSE ("AGREEMENT"). ANY USE, REPRODUCTION OR DISTRIBUTION OF THE PROGRAM CONSTITUTES RECIPIENT'S ACCEPTANCE OF THIS AGREEMENT. . 1. DEFINITIONS . "Contribution" means: . a) in the case of the initial Contributor, the initial code and documentation distributed under this Agreement, and . b) in the case of each subsequent Contributor: . i) changes to the Program, and . ii) additions to the Program; . where such changes and/or additions to the Program originate from and are distributed by that particular Contributor. A Contribution 'originates' from a Contributor if it was added to the Program by such Contributor itself or anyone acting on such Contributor's behalf. Contributions do not include additions to the Program which: (i) are separate modules of software distributed in conjunction with the Program under their own license agreement, and (ii) are not derivative works of the Program. . "Contributor" means any person or entity that distributes the Program. . "Licensed Patents " mean patent claims licensable by a Contributor which are necessarily infringed by the use or sale of its Contribution alone or when combined with the Program. . "Program" means the Contributions distributed in accordance with this Agreement. . 
"Recipient" means anyone who receives the Program under this Agreement, including all Contributors. . 2. GRANT OF RIGHTS . a) Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, distribute and sublicense the Contribution of such Contributor, if any, and such derivative works, in source code and object code form. . b) Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free patent license under Licensed Patents to make, use, sell, offer to sell, import and otherwise transfer the Contribution of such Contributor, if any, in source code and object code form. This patent license shall apply to the combination of the Contribution and the Program if, at the time the Contribution is added by the Contributor, such addition of the Contribution causes such combination to be covered by the Licensed Patents. The patent license shall not apply to any other combinations which include the Contribution. No hardware per se is licensed hereunder. . c) Recipient understands that although each Contributor grants the licenses to its Contributions set forth herein, no assurances are provided by any Contributor that the Program does not infringe the patent or other intellectual property rights of any other entity. Each Contributor disclaims any liability to Recipient for claims brought by any other entity based on infringement of intellectual property rights or otherwise. As a condition to exercising the rights and licenses granted hereunder, each Recipient hereby assumes sole responsibility to secure any other intellectual property rights needed, if any. For example, if a third party patent license is required to allow Recipient to distribute the Program, it is Recipient's responsibility to acquire that license before distributing the Program. . d) Each Contributor represents that to its knowledge it has sufficient copyright rights in its Contribution, if any, to grant the copyright license set forth in this Agreement. . 3. REQUIREMENTS . A Contributor may choose to distribute the Program in object code form under its own license agreement, provided that: . a) it complies with the terms and conditions of this Agreement; and . b) its license agreement: . i) effectively disclaims on behalf of all Contributors all warranties and conditions, express and implied, including warranties or conditions of title and non-infringement, and implied warranties or conditions of merchantability and fitness for a particular purpose; . ii) effectively excludes on behalf of all Contributors all liability for damages, including direct, indirect, special, incidental and consequential damages, such as lost profits; . iii) states that any provisions which differ from this Agreement are offered by that Contributor alone and not by any other party; and . iv) states that source code for the Program is available from such Contributor, and informs licensees how to obtain it in a reasonable manner on or through a medium customarily used for software exchange. . When the Program is made available in source code form: . a) it must be made available under this Agreement; and . b) a copy of this Agreement must be included with each copy of the Program. . Contributors may not remove or alter any copyright notices contained within the Program. . 
Each Contributor must identify itself as the originator of its Contribution, if any, in a manner that reasonably allows subsequent Recipients to identify the originator of the Contribution. . 4. COMMERCIAL DISTRIBUTION . Commercial distributors of software may accept certain responsibilities with respect to end users, business partners and the like. While this license is intended to facilitate the commercial use of the Program, the Contributor who includes the Program in a commercial product offering should do so in a manner which does not create potential liability for other Contributors. Therefore, if a Contributor includes the Program in a commercial product offering, such Contributor ("Commercial Contributor") hereby agrees to defend and indemnify every other Contributor ("Indemnified Contributor") against any losses, damages and costs (collectively "Losses") arising from claims, lawsuits and other legal actions brought by a third party against the Indemnified Contributor to the extent caused by the acts or omissions of such Commercial Contributor in connection with its distribution of the Program in a commercial product offering. The obligations in this section do not apply to any claims or Losses relating to any actual or alleged intellectual property infringement. In order to qualify, an Indemnified Contributor must: a) promptly notify the Commercial Contributor in writing of such claim, and b) allow the Commercial Contributor to control, and cooperate with the Commercial Contributor in, the defense and any related settlement negotiations. The Indemnified Contributor may participate in any such claim at its own expense. . For example, a Contributor might include the Program in a commercial product offering, Product X. That Contributor is then a Commercial Contributor. If that Commercial Contributor then makes performance claims, or offers warranties related to Product X, those performance claims and warranties are such Commercial Contributor's responsibility alone. Under this section, the Commercial Contributor would have to defend claims against the other Contributors related to those performance claims and warranties, and if a court requires any other Contributor to pay any damages as a result, the Commercial Contributor must pay those damages. . 5. NO WARRANTY . EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, THE PROGRAM IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Each Recipient is solely responsible for determining the appropriateness of using and distributing the Program and assumes all risks associated with its exercise of rights under this Agreement, including but not limited to the risks and costs of program errors, compliance with applicable laws, damage to or loss of data, programs or equipment, and unavailability or interruption of operations. . 6. DISCLAIMER OF LIABILITY . 
EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, NEITHER RECIPIENT NOR ANY CONTRIBUTORS SHALL HAVE ANY LIABILITY FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING WITHOUT LIMITATION LOST PROFITS), HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OR DISTRIBUTION OF THE PROGRAM OR THE EXERCISE OF ANY RIGHTS GRANTED HEREUNDER, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. . 7. GENERAL . If any provision of this Agreement is invalid or unenforceable under applicable law, it shall not affect the validity or enforceability of the remainder of the terms of this Agreement, and without further action by the parties hereto, such provision shall be reformed to the minimum extent necessary to make such provision valid and enforceable. . If Recipient institutes patent litigation against a Contributor with respect to a patent applicable to software (including a cross-claim or counterclaim in a lawsuit), then any patent licenses granted by that Contributor to such Recipient under this Agreement shall terminate as of the date such litigation is filed. In addition, if Recipient institutes patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Program itself (excluding combinations of the Program with other software or hardware) infringes such Recipient's patent(s), then such Recipient's rights granted under Section 2(b) shall terminate as of the date such litigation is filed. . All Recipient's rights under this Agreement shall terminate if it fails to comply with any of the material terms or conditions of this Agreement and does not cure such failure in a reasonable period of time after becoming aware of such noncompliance. If all Recipient's rights under this Agreement terminate, Recipient agrees to cease use and distribution of the Program as soon as reasonably practicable. However, Recipient's obligations under this Agreement and any licenses granted by Recipient relating to the Program shall continue and survive. . Everyone is permitted to copy and distribute copies of this Agreement, but in order to avoid inconsistency the Agreement is copyrighted and may only be modified in the following manner. The Agreement Steward reserves the right to publish new versions (including revisions) of this Agreement from time to time. No one other than the Agreement Steward has the right to modify this Agreement. IBM is the initial Agreement Steward. IBM may assign the responsibility to serve as the Agreement Steward to a suitable separate entity. Each new version of the Agreement will be given a distinguishing version number. The Program (including Contributions) may always be distributed subject to the version of the Agreement under which it was received. In addition, after a new version of the Agreement is published, Contributor may elect to distribute the Program (including its Contributions) under the new version. Except as expressly stated in Sections 2(a) and 2(b) above, Recipient receives no rights or licenses to the intellectual property of any Contributor under this Agreement, whether expressly, by implication, estoppel or otherwise. All rights in the Program not expressly granted under this Agreement are reserved. . This Agreement is governed by the laws of the State of New York and the intellectual property laws of the United States of America. 
No party to this Agreement will bring a legal action under this Agreement more than one year after the cause of action arose. Each party waives its rights to a jury trial in any resulting litigation. rdma-core-56.1/debian/ibacm.install000066400000000000000000000005451477342711600172230ustar00rootroot00000000000000lib/systemd/system/ibacm.service lib/systemd/system/ibacm.socket usr/bin/ib_acme usr/include/infiniband/acm.h usr/include/infiniband/acm_prov.h usr/lib/*/ibacm/libibacmp.so usr/sbin/ibacm usr/share/doc/rdma-core/ibacm.md usr/share/doc/ibacm/ usr/share/man/man1/ib_acme.1 usr/share/man/man7/ibacm.7 usr/share/man/man7/ibacm_prov.7 usr/share/man/man8/ibacm.8 rdma-core-56.1/debian/ibacm.lintian-overrides000066400000000000000000000002641477342711600212110ustar00rootroot00000000000000# The wantedby target rdma-hw.target is intentional (see rdma-core) ibacm: systemd-service-file-refers-to-unusual-wantedby-target rdma-hw.target [lib/systemd/system/ibacm.service] rdma-core-56.1/debian/ibacm.maintscript000066400000000000000000000000551477342711600201060ustar00rootroot00000000000000rm_conffile /etc/init.d/ibacm 19.0-1ubuntu1~ rdma-core-56.1/debian/ibverbs-providers.install000066400000000000000000000002471477342711600216160ustar00rootroot00000000000000etc/libibverbs.d/ usr/lib/*/libefa.so.* usr/lib/*/libhns.so.* usr/lib/*/libibverbs/lib*-rdmav*.so usr/lib/*/libmana.so.* usr/lib/*/libmlx4.so.* usr/lib/*/libmlx5.so.* rdma-core-56.1/debian/ibverbs-providers.lintian-overrides000066400000000000000000000003041477342711600236000ustar00rootroot00000000000000# libefa, libhns, libmana, libmlx4 and libmlx5 are ibverbs providers that provide additional functions. ibverbs-providers: package-name-doesnt-match-sonames libefa1 libhns1 libmana1 libmlx4-1 libmlx5-1 rdma-core-56.1/debian/ibverbs-providers.maintscript000066400000000000000000000002271477342711600225030ustar00rootroot00000000000000rm_conffile /etc/libibverbs.d/cxgb3.driver 27.0-2~ rm_conffile /etc/libibverbs.d/i40iw.driver 39.0-1~ rm_conffile /etc/libibverbs.d/nes.driver 27.0-2~ rdma-core-56.1/debian/ibverbs-providers.symbols000066400000000000000000000153231477342711600216410ustar00rootroot00000000000000libmlx4.so.1 ibverbs-providers #MINVER# * Build-Depends-Package: libibverbs-dev MLX4_1.0@MLX4_1.0 15 mlx4dv_init_obj@MLX4_1.0 15 mlx4dv_query_device@MLX4_1.0 15 mlx4dv_create_qp@MLX4_1.0 15 mlx4dv_set_context_attr@MLX4_1.0 15 libmlx5.so.1 ibverbs-providers #MINVER# * Build-Depends-Package: libibverbs-dev MLX5_1.0@MLX5_1.0 13 MLX5_1.1@MLX5_1.1 14 MLX5_1.2@MLX5_1.2 15 MLX5_1.3@MLX5_1.3 16 MLX5_1.4@MLX5_1.4 17 MLX5_1.5@MLX5_1.5 18 MLX5_1.6@MLX5_1.6 20 MLX5_1.7@MLX5_1.7 21 MLX5_1.8@MLX5_1.8 22 MLX5_1.9@MLX5_1.9 23 MLX5_1.10@MLX5_1.10 24 MLX5_1.11@MLX5_1.11 25 MLX5_1.12@MLX5_1.12 28 MLX5_1.13@MLX5_1.13 29 MLX5_1.14@MLX5_1.14 30 MLX5_1.15@MLX5_1.15 31 MLX5_1.16@MLX5_1.16 32 MLX5_1.17@MLX5_1.17 33 MLX5_1.18@MLX5_1.18 34 MLX5_1.19@MLX5_1.19 35 MLX5_1.20@MLX5_1.20 36 MLX5_1.21@MLX5_1.21 37 MLX5_1.22@MLX5_1.22 38 MLX5_1.23@MLX5_1.23 40 MLX5_1.24@MLX5_1.24 42 MLX5_1.25@MLX5_1.25 54 mlx5dv_init_obj@MLX5_1.0 13 mlx5dv_init_obj@MLX5_1.2 15 mlx5dv_query_device@MLX5_1.0 13 mlx5dv_create_cq@MLX5_1.1 14 mlx5dv_set_context_attr@MLX5_1.2 15 mlx5dv_create_qp@MLX5_1.3 16 mlx5dv_create_wq@MLX5_1.3 16 mlx5dv_get_clock_info@MLX5_1.4 17 mlx5dv_create_flow_action_esp@MLX5_1.5 18 mlx5dv_create_flow_matcher@MLX5_1.6 20 mlx5dv_destroy_flow_matcher@MLX5_1.6 20 mlx5dv_create_flow@MLX5_1.6 20 mlx5dv_create_flow_action_modify_header@MLX5_1.7 21
mlx5dv_create_flow_action_packet_reformat@MLX5_1.7 21 mlx5dv_devx_alloc_uar@MLX5_1.7 21 mlx5dv_devx_free_uar@MLX5_1.7 21 mlx5dv_devx_general_cmd@MLX5_1.7 21 mlx5dv_devx_obj_create@MLX5_1.7 21 mlx5dv_devx_obj_destroy@MLX5_1.7 21 mlx5dv_devx_obj_modify@MLX5_1.7 21 mlx5dv_devx_obj_query@MLX5_1.7 21 mlx5dv_devx_query_eqn@MLX5_1.7 21 mlx5dv_devx_umem_dereg@MLX5_1.7 21 mlx5dv_devx_umem_reg@MLX5_1.7 21 mlx5dv_open_device@MLX5_1.7 21 mlx5dv_devx_cq_modify@MLX5_1.8 22 mlx5dv_devx_cq_query@MLX5_1.8 22 mlx5dv_devx_ind_tbl_modify@MLX5_1.8 22 mlx5dv_devx_ind_tbl_query@MLX5_1.8 22 mlx5dv_devx_qp_modify@MLX5_1.8 22 mlx5dv_devx_qp_query@MLX5_1.8 22 mlx5dv_devx_srq_modify@MLX5_1.8 22 mlx5dv_devx_srq_query@MLX5_1.8 22 mlx5dv_devx_wq_modify@MLX5_1.8 22 mlx5dv_devx_wq_query@MLX5_1.8 22 mlx5dv_is_supported@MLX5_1.8 22 mlx5dv_devx_create_cmd_comp@MLX5_1.9 23 mlx5dv_devx_destroy_cmd_comp@MLX5_1.9 23 mlx5dv_devx_get_async_cmd_comp@MLX5_1.9 23 mlx5dv_devx_obj_query_async@MLX5_1.9 23 mlx5dv_alloc_dm@MLX5_1.10 24 mlx5dv_create_mkey@MLX5_1.10 24 mlx5dv_destroy_mkey@MLX5_1.10 24 mlx5dv_dr_action_create_dest_table@MLX5_1.10 24 mlx5dv_dr_action_create_dest_ibv_qp@MLX5_1.10 24 mlx5dv_dr_action_create_dest_vport@MLX5_1.10 24 mlx5dv_dr_action_create_flow_counter@MLX5_1.10 24 mlx5dv_dr_action_create_drop@MLX5_1.10 24 mlx5dv_dr_action_create_modify_header@MLX5_1.10 24 mlx5dv_dr_action_create_packet_reformat@MLX5_1.10 24 mlx5dv_dr_action_create_tag@MLX5_1.10 24 mlx5dv_dr_action_destroy@MLX5_1.10 24 mlx5dv_dr_domain_create@MLX5_1.10 24 mlx5dv_dr_domain_destroy@MLX5_1.10 24 mlx5dv_dr_domain_sync@MLX5_1.10 24 mlx5dv_dr_matcher_create@MLX5_1.10 24 mlx5dv_dr_matcher_destroy@MLX5_1.10 24 mlx5dv_dr_rule_create@MLX5_1.10 24 mlx5dv_dr_rule_destroy@MLX5_1.10 24 mlx5dv_dr_table_create@MLX5_1.10 24 mlx5dv_dr_table_destroy@MLX5_1.10 24 mlx5dv_qp_ex_from_ibv_qp_ex@MLX5_1.10 24 mlx5dv_devx_create_event_channel@MLX5_1.11 25 mlx5dv_devx_destroy_event_channel@MLX5_1.11 25 mlx5dv_devx_get_event@MLX5_1.11 25 mlx5dv_devx_subscribe_devx_event@MLX5_1.11 25 mlx5dv_devx_subscribe_devx_event_fd@MLX5_1.11 25 mlx5dv_alloc_var@MLX5_1.12 28 mlx5dv_dr_action_create_flow_meter@MLX5_1.12 28 mlx5dv_dr_action_modify_flow_meter@MLX5_1.12 28 mlx5dv_dump_dr_domain@MLX5_1.12 28 mlx5dv_dump_dr_matcher@MLX5_1.12 28 mlx5dv_dump_dr_rule@MLX5_1.12 28 mlx5dv_dump_dr_table@MLX5_1.12 28 mlx5dv_free_var@MLX5_1.12 28 mlx5dv_pp_alloc@MLX5_1.13 29 mlx5dv_pp_free@MLX5_1.13 29 mlx5dv_dr_action_create_default_miss@MLX5_1.14 30 mlx5dv_dr_domain_set_reclaim_device_memory@MLX5_1.14 30 mlx5dv_modify_qp_lag_port@MLX5_1.14 30 mlx5dv_query_qp_lag_port@MLX5_1.14 30 mlx5dv_dr_action_create_dest_devx_tir@MLX5_1.15 31 mlx5dv_dr_action_create_flow_sampler@MLX5_1.16 32 mlx5dv_dr_action_create_dest_array@MLX5_1.16 32 mlx5dv_dr_action_create_aso@MLX5_1.17 33 mlx5dv_dr_action_create_pop_vlan@MLX5_1.17 33 mlx5dv_dr_action_create_push_vlan@MLX5_1.17 33 mlx5dv_dr_action_modify_aso@MLX5_1.17 33 mlx5dv_modify_qp_sched_elem@MLX5_1.17 33 mlx5dv_modify_qp_udp_sport@MLX5_1.17 33 mlx5dv_sched_leaf_create@MLX5_1.17 33 mlx5dv_sched_leaf_destroy@MLX5_1.17 33 mlx5dv_sched_leaf_modify@MLX5_1.17 33 mlx5dv_sched_node_create@MLX5_1.17 33 mlx5dv_sched_node_destroy@MLX5_1.17 33 mlx5dv_sched_node_modify@MLX5_1.17 33 mlx5dv_reserved_qpn_alloc@MLX5_1.18 34 mlx5dv_reserved_qpn_dealloc@MLX5_1.18 34 mlx5dv_devx_umem_reg_ex@MLX5_1.19 35 mlx5dv_dm_map_op_addr@MLX5_1.19 35 _mlx5dv_query_port@MLX5_1.19 35 mlx5dv_dr_domain_allow_duplicate_rules@MLX5_1.20 36 mlx5dv_map_ah_to_qp@MLX5_1.20 36 
mlx5dv_qp_cancel_posted_send_wrs@MLX5_1.20 36 _mlx5dv_mkey_check@MLX5_1.20 36 mlx5dv_crypto_login@MLX5_1.21 37 mlx5dv_crypto_login_query_state@MLX5_1.21 37 mlx5dv_crypto_logout@MLX5_1.21 37 mlx5dv_dci_stream_id_reset@MLX5_1.21 37 mlx5dv_dek_create@MLX5_1.21 37 mlx5dv_dek_destroy@MLX5_1.21 37 mlx5dv_dek_query@MLX5_1.21 37 mlx5dv_dr_action_create_dest_ib_port@MLX5_1.21 37 mlx5dv_dr_matcher_set_layout@MLX5_1.21 37 mlx5dv_get_vfio_device_list@MLX5_1.21 37 mlx5dv_vfio_get_events_fd@MLX5_1.21 37 mlx5dv_vfio_process_events@MLX5_1.21 37 mlx5dv_dr_aso_other_domain_link@MLX5_1.22 38 mlx5dv_dr_aso_other_domain_unlink@MLX5_1.22 38 mlx5dv_devx_alloc_msi_vector@MLX5_1.23 40 mlx5dv_devx_create_eq@MLX5_1.23 40 mlx5dv_devx_destroy_eq@MLX5_1.23 40 mlx5dv_devx_free_msi_vector@MLX5_1.23 40 mlx5dv_create_steering_anchor@MLX5_1.24 42 mlx5dv_crypto_login_create@MLX5_1.24 42 mlx5dv_crypto_login_destroy@MLX5_1.24 42 mlx5dv_crypto_login_query@MLX5_1.24 42 mlx5dv_destroy_steering_anchor@MLX5_1.24 42 mlx5dv_dr_action_create_dest_root_table@MLX5_1.24 42 mlx5dv_get_data_direct_sysfs_path@MLX5_1.25 54 mlx5dv_reg_dmabuf_mr@MLX5_1.25 54 libefa.so.1 ibverbs-providers #MINVER# * Build-Depends-Package: libibverbs-dev EFA_1.0@EFA_1.0 24 EFA_1.1@EFA_1.1 26 EFA_1.2@EFA_1.2 43 EFA_1.3@EFA_1.3 50 efadv_create_driver_qp@EFA_1.0 24 efadv_create_qp_ex@EFA_1.1 26 efadv_query_device@EFA_1.1 26 efadv_query_ah@EFA_1.1 26 efadv_cq_from_ibv_cq_ex@EFA_1.2 43 efadv_create_cq@EFA_1.2 43 efadv_query_mr@EFA_1.3 50 libhns.so.1 ibverbs-providers #MINVER# * Build-Depends-Package: libibverbs-dev HNS_1.0@HNS_1.0 51 hnsdv_is_supported@HNS_1.0 51 hnsdv_create_qp@HNS_1.0 51 hnsdv_query_device@HNS_1.0 51 libmana.so.1 ibverbs-providers #MINVER# * Build-Depends-Package: libibverbs-dev MANA_1.0@MANA_1.0 41 manadv_init_obj@MANA_1.0 41 manadv_set_context_attr@MANA_1.0 41 rdma-core-56.1/debian/ibverbs-utils.install000066400000000000000000000007341477342711600207420ustar00rootroot00000000000000usr/bin/ibv_asyncwatch usr/bin/ibv_devices usr/bin/ibv_devinfo usr/bin/ibv_rc_pingpong usr/bin/ibv_srq_pingpong usr/bin/ibv_uc_pingpong usr/bin/ibv_ud_pingpong usr/bin/ibv_xsrq_pingpong usr/share/man/man1/ibv_asyncwatch.1 usr/share/man/man1/ibv_devices.1 usr/share/man/man1/ibv_devinfo.1 usr/share/man/man1/ibv_rc_pingpong.1 usr/share/man/man1/ibv_srq_pingpong.1 usr/share/man/man1/ibv_uc_pingpong.1 usr/share/man/man1/ibv_ud_pingpong.1 usr/share/man/man1/ibv_xsrq_pingpong.1 rdma-core-56.1/debian/infiniband-diags.install000066400000000000000000000031701477342711600213330ustar00rootroot00000000000000etc/infiniband-diags/error_thresholds etc/infiniband-diags/ibdiag.conf usr/sbin/check_lft_balance usr/sbin/dump_fts usr/sbin/dump_lfts usr/sbin/dump_mfts usr/sbin/ibaddr usr/sbin/ibcacheedit usr/sbin/ibccconfig usr/sbin/ibccquery usr/sbin/ibfindnodesusing usr/sbin/ibhosts usr/sbin/ibidsverify usr/sbin/iblinkinfo usr/sbin/ibnetdiscover usr/sbin/ibnodes usr/sbin/ibping usr/sbin/ibportstate usr/sbin/ibqueryerrors usr/sbin/ibroute usr/sbin/ibrouters usr/sbin/ibstat usr/sbin/ibstatus usr/sbin/ibswitches usr/sbin/ibsysstat usr/sbin/ibtracert usr/sbin/perfquery usr/sbin/saquery usr/sbin/sminfo usr/sbin/smpdump usr/sbin/smpquery usr/sbin/vendstat usr/share/man/man8/check_lft_balance.8 usr/share/man/man8/dump_fts.8 usr/share/man/man8/dump_lfts.8 usr/share/man/man8/dump_mfts.8 usr/share/man/man8/ibaddr.8 usr/share/man/man8/ibcacheedit.8 usr/share/man/man8/ibccconfig.8 usr/share/man/man8/ibccquery.8 usr/share/man/man8/ibfindnodesusing.8 usr/share/man/man8/ibhosts.8 
usr/share/man/man8/ibidsverify.8 usr/share/man/man8/iblinkinfo.8 usr/share/man/man8/ibnetdiscover.8 usr/share/man/man8/ibnodes.8 usr/share/man/man8/ibping.8 usr/share/man/man8/ibportstate.8 usr/share/man/man8/ibqueryerrors.8 usr/share/man/man8/ibroute.8 usr/share/man/man8/ibrouters.8 usr/share/man/man8/ibstat.8 usr/share/man/man8/ibstatus.8 usr/share/man/man8/ibswitches.8 usr/share/man/man8/ibsysstat.8 usr/share/man/man8/ibtracert.8 usr/share/man/man8/infiniband-diags.8 usr/share/man/man8/perfquery.8 usr/share/man/man8/saquery.8 usr/share/man/man8/sminfo.8 usr/share/man/man8/smpdump.8 usr/share/man/man8/smpquery.8 usr/share/man/man8/vendstat.8 usr/share/perl5/IBswcountlimits.pm rdma-core-56.1/debian/infiniband-diags.lintian-overrides000066400000000000000000000004011477342711600233150ustar00rootroot00000000000000# The infiniband-diags man page gives an overview of the available commands. infiniband-diags: spare-manual-page [usr/share/man/man8/infiniband-diags.8.gz] # For lintian < 2.115.2 infiniband-diags: spare-manual-page usr/share/man/man8/infiniband-diags.8.gz rdma-core-56.1/debian/libibmad-dev.install000066400000000000000000000002131477342711600204570ustar00rootroot00000000000000usr/include/infiniband/mad.h usr/include/infiniband/mad_osd.h usr/lib/*/libibmad*.a usr/lib/*/libibmad*.so usr/lib/*/pkgconfig/libibmad.pc rdma-core-56.1/debian/libibmad5.install000066400000000000000000000000311477342711600177660ustar00rootroot00000000000000usr/lib/*/libibmad*.so.* rdma-core-56.1/debian/libibmad5.symbols000066400000000000000000000141551477342711600200240ustar00rootroot00000000000000libibmad.so.5 libibmad5 #MINVER# * Build-Depends-Package: libibmad-dev IBMAD_1.3@IBMAD_1.3 1.3.11 IBMAD_1.4@IBMAD_1.4 54 IBMAD_1.5@IBMAD_1.5 56 bm_call_via@IBMAD_1.3 1.3.11 cc_config_status_via@IBMAD_1.3 1.3.11 cc_query_status_via@IBMAD_1.3 1.3.11 drpath2str@IBMAD_1.3 1.3.11 ib_node_query_via@IBMAD_1.3 1.3.11 ib_path_query@IBMAD_1.3 1.3.11 ib_path_query_via@IBMAD_1.3 1.3.11 ib_resolve_gid_via@IBMAD_1.3 1.3.11 ib_resolve_guid_via@IBMAD_1.3 1.3.11 ib_resolve_portid_str@IBMAD_1.3 1.3.11 ib_resolve_portid_str_via@IBMAD_1.3 1.3.11 ib_resolve_self@IBMAD_1.3 1.3.11 ib_resolve_self_via@IBMAD_1.3 1.3.11 ib_resolve_smlid@IBMAD_1.3 1.3.11 ib_resolve_smlid_via@IBMAD_1.3 1.3.11 ib_vendor_call@IBMAD_1.3 1.3.11 ib_vendor_call_via@IBMAD_1.3 1.3.11 ibdebug@IBMAD_1.3 1.3.11 mad_alloc@IBMAD_1.3 1.3.11 mad_build_pkt@IBMAD_1.3 1.3.11 mad_class_agent@IBMAD_1.3 1.3.11 mad_decode_field@IBMAD_1.3 1.3.11 mad_dump_array@IBMAD_1.3 1.3.11 mad_dump_bitfield@IBMAD_1.3 1.3.11 mad_dump_cc_cacongestionentry@IBMAD_1.3 1.3.11 mad_dump_cc_cacongestionsetting@IBMAD_1.3 1.3.11 mad_dump_cc_congestioncontroltable@IBMAD_1.3 1.3.11 mad_dump_cc_congestioncontroltableentry@IBMAD_1.3 1.3.11 mad_dump_cc_congestioninfo@IBMAD_1.3 1.3.11 mad_dump_cc_congestionkeyinfo@IBMAD_1.3 1.3.11 mad_dump_cc_congestionlog@IBMAD_1.3 1.3.11 mad_dump_cc_congestionlogca@IBMAD_1.3 1.3.11 mad_dump_cc_congestionlogentryca@IBMAD_1.3 1.3.11 mad_dump_cc_congestionlogentryswitch@IBMAD_1.3 1.3.11 mad_dump_cc_congestionlogswitch@IBMAD_1.3 1.3.11 mad_dump_cc_switchcongestionsetting@IBMAD_1.3 1.3.11 mad_dump_cc_switchportcongestionsettingelement@IBMAD_1.3 1.3.11 mad_dump_cc_timestamp@IBMAD_1.3 1.3.11 mad_dump_classportinfo@IBMAD_1.3 1.3.11 mad_dump_field@IBMAD_1.3 1.3.11 mad_dump_fields@IBMAD_1.3 1.3.11 mad_dump_hex@IBMAD_1.3 1.3.11 mad_dump_int@IBMAD_1.3 1.3.11 mad_dump_linkdowndefstate@IBMAD_1.3 1.3.11 mad_dump_linkspeed@IBMAD_1.3 1.3.11 mad_dump_linkspeeden@IBMAD_1.3 1.3.11 
mad_dump_linkspeedext@IBMAD_1.3 1.3.11 mad_dump_linkspeedext2@IBMAD_1.4 5.4.54 mad_dump_linkspeedexten@IBMAD_1.3 1.3.11 mad_dump_linkspeedexten2@IBMAD_1.4 5.4.54 mad_dump_linkspeedextsup@IBMAD_1.3 1.3.11 mad_dump_linkspeedextsup2@IBMAD_1.4 5.4.54 mad_dump_linkspeedsup@IBMAD_1.3 1.3.11 mad_dump_linkwidth@IBMAD_1.3 1.3.11 mad_dump_linkwidthen@IBMAD_1.3 1.3.11 mad_dump_linkwidthsup@IBMAD_1.3 1.3.11 mad_dump_mlnx_ext_port_info@IBMAD_1.3 1.3.11 mad_dump_mtu@IBMAD_1.3 1.3.11 mad_dump_node_type@IBMAD_1.3 1.3.11 mad_dump_nodedesc@IBMAD_1.3 1.3.11 mad_dump_nodeinfo@IBMAD_1.3 1.3.11 mad_dump_opervls@IBMAD_1.3 1.3.11 mad_dump_perfcounters@IBMAD_1.3 1.3.11 mad_dump_perfcounters_ext@IBMAD_1.3 1.3.11 mad_dump_perfcounters_port_flow_ctl_counters@IBMAD_1.3 1.3.11 mad_dump_perfcounters_port_op_rcv_counters@IBMAD_1.3 1.3.11 mad_dump_perfcounters_port_vl_op_data@IBMAD_1.3 1.3.11 mad_dump_perfcounters_port_vl_op_packet@IBMAD_1.3 1.3.11 mad_dump_perfcounters_port_vl_xmit_flow_ctl_update_errors@IBMAD_1.3 1.3.11 mad_dump_perfcounters_port_vl_xmit_wait_counters@IBMAD_1.3 1.3.11 mad_dump_perfcounters_rcv_con_ctrl@IBMAD_1.3 1.3.11 mad_dump_perfcounters_rcv_err@IBMAD_1.3 1.3.11 mad_dump_perfcounters_rcv_sl@IBMAD_1.3 1.3.11 mad_dump_perfcounters_sl_rcv_becn@IBMAD_1.3 1.3.11 mad_dump_perfcounters_sl_rcv_fecn@IBMAD_1.3 1.3.11 mad_dump_perfcounters_sw_port_vl_congestion@IBMAD_1.3 1.3.11 mad_dump_perfcounters_vl_xmit_time_cong@IBMAD_1.3 1.3.11 mad_dump_perfcounters_xmit_con_ctrl@IBMAD_1.3 1.3.11 mad_dump_perfcounters_xmt_disc@IBMAD_1.3 1.3.11 mad_dump_perfcounters_xmt_sl@IBMAD_1.3 1.3.11 mad_dump_physportstate@IBMAD_1.3 1.3.11 mad_dump_port_ext_speeds_counters@IBMAD_1.3 1.3.11 mad_dump_port_ext_speeds_counters_rsfec_active@IBMAD_1.3 1.3.12 mad_dump_portcapmask2@IBMAD_1.3 2.1.0 mad_dump_portcapmask@IBMAD_1.3 1.3.11 mad_dump_portinfo@IBMAD_1.3 1.3.11 mad_dump_portinfo_ext@IBMAD_1.3 1.3.12 mad_dump_portsamples_control@IBMAD_1.3 1.3.11 mad_dump_portsamples_result@IBMAD_1.3 1.3.11 mad_dump_portstate@IBMAD_1.3 1.3.11 mad_dump_portstates@IBMAD_1.3 1.3.11 mad_dump_rhex@IBMAD_1.3 1.3.11 mad_dump_sltovl@IBMAD_1.3 1.3.11 mad_dump_string@IBMAD_1.3 1.3.11 mad_dump_switchinfo@IBMAD_1.3 1.3.11 mad_dump_uint@IBMAD_1.3 1.3.11 mad_dump_val@IBMAD_1.3 1.3.11 mad_dump_vlarbitration@IBMAD_1.3 1.3.11 mad_dump_vlcap@IBMAD_1.3 1.3.11 mad_encode@IBMAD_1.3 1.3.11 mad_encode_field@IBMAD_1.3 1.3.11 mad_field_name@IBMAD_1.3 1.3.11 mad_free@IBMAD_1.3 1.3.11 mad_get_array@IBMAD_1.3 1.3.11 mad_get_field64@IBMAD_1.3 1.3.11 mad_get_field@IBMAD_1.3 1.3.11 mad_get_retries@IBMAD_1.3 1.3.11 mad_get_timeout@IBMAD_1.3 1.3.11 mad_print_field@IBMAD_1.3 1.3.11 mad_receive@IBMAD_1.3 1.3.11 mad_receive_via@IBMAD_1.3 1.3.11 mad_register_client@IBMAD_1.3 1.3.11 mad_register_client_via@IBMAD_1.3 1.3.11 mad_register_server@IBMAD_1.3 1.3.11 mad_register_server_via@IBMAD_1.3 1.3.11 mad_respond@IBMAD_1.3 1.3.11 mad_respond_via@IBMAD_1.3 1.3.11 mad_rpc@IBMAD_1.3 1.3.11 mad_rpc_class_agent@IBMAD_1.3 1.3.11 mad_rpc_close_port@IBMAD_1.3 1.3.11 mad_rpc_open_port@IBMAD_1.3 1.3.11 mad_rpc_portid@IBMAD_1.3 1.3.11 mad_rpc_rmpp@IBMAD_1.3 1.3.11 mad_rpc_set_retries@IBMAD_1.3 1.3.11 mad_rpc_set_timeout@IBMAD_1.3 1.3.11 mad_send@IBMAD_1.3 1.3.11 mad_send_via@IBMAD_1.3 1.3.11 mad_set_array@IBMAD_1.3 1.3.11 mad_set_field64@IBMAD_1.3 1.3.11 mad_set_field@IBMAD_1.3 1.3.11 mad_trid@IBMAD_1.3 1.3.11 madrpc@IBMAD_1.3 1.3.11 madrpc_init@IBMAD_1.3 1.3.11 madrpc_portid@IBMAD_1.3 1.3.11 madrpc_rmpp@IBMAD_1.3 1.3.11 madrpc_save_mad@IBMAD_1.3 1.3.11 madrpc_set_retries@IBMAD_1.3 1.3.11 
madrpc_set_timeout@IBMAD_1.3 1.3.11 madrpc_show_errors@IBMAD_1.3 1.3.11 performance_reset_via@IBMAD_1.3 1.3.11 pma_query_via@IBMAD_1.3 1.3.11 portid2portnum@IBMAD_1.3 1.3.11 portid2str@IBMAD_1.3 1.3.11 sa_call@IBMAD_1.3 1.3.11 sa_rpc_call@IBMAD_1.3 1.3.11 smp_mkey_get@IBMAD_1.3 1.3.11 smp_mkey_set@IBMAD_1.3 1.3.11 smp_query@IBMAD_1.3 1.3.11 smp_query_status_via@IBMAD_1.3 1.3.11 smp_query_via@IBMAD_1.3 1.3.11 smp_set@IBMAD_1.3 1.3.11 smp_set_status_via@IBMAD_1.3 1.3.11 smp_set_via@IBMAD_1.3 1.3.11 str2drpath@IBMAD_1.3 1.3.11 xdump@IBMAD_1.3 1.3.11 mad_rpc_open_port2@IBMAD_1.5 56 mad_rpc_close_port2@IBMAD_1.5 56 rdma-core-56.1/debian/libibnetdisc-dev.install000066400000000000000000000007441477342711600213600ustar00rootroot00000000000000usr/include/infiniband/ibnetdisc* usr/lib/*/libibnetdisc*.a usr/lib/*/libibnetdisc*.so usr/lib/*/pkgconfig/libibnetdisc.pc usr/share/man/man3/ibnd_debug.3 usr/share/man/man3/ibnd_destroy_fabric.3 usr/share/man/man3/ibnd_discover_fabric.3 usr/share/man/man3/ibnd_find_node_dr.3 usr/share/man/man3/ibnd_find_node_guid.3 usr/share/man/man3/ibnd_iter_nodes.3 usr/share/man/man3/ibnd_iter_nodes_type.3 usr/share/man/man3/ibnd_set_max_smps_on_wire.3 usr/share/man/man3/ibnd_show_progress.3 rdma-core-56.1/debian/libibnetdisc5.install000066400000000000000000000000351477342711600206620ustar00rootroot00000000000000usr/lib/*/libibnetdisc*.so.* rdma-core-56.1/debian/libibnetdisc5.symbols000066400000000000000000000023461477342711600207130ustar00rootroot00000000000000libibnetdisc.so.5 libibnetdisc5 #MINVER# * Build-Depends-Package: libibnetdisc-dev IBNETDISC_1.0@IBNETDISC_1.0 1.6.1 IBNETDISC_1.1@IBNETDISC_1.1 49 ibnd_cache_fabric@IBNETDISC_1.0 1.6.1 ibnd_destroy_fabric@IBNETDISC_1.0 1.6.1 ibnd_discover_fabric@IBNETDISC_1.0 1.6.1 ibnd_find_node_dr@IBNETDISC_1.0 1.6.1 ibnd_find_node_guid@IBNETDISC_1.0 1.6.1 ibnd_find_port_dr@IBNETDISC_1.0 1.6.1 ibnd_find_port_guid@IBNETDISC_1.0 1.6.1 ibnd_find_port_lid@IBNETDISC_1.0 1.6.4 ibnd_get_chassis_guid@IBNETDISC_1.0 1.6.1 ibnd_get_chassis_slot_str@IBNETDISC_1.0 1.6.1 ibnd_get_chassis_type@IBNETDISC_1.0 1.6.1 ibnd_is_xsigo_guid@IBNETDISC_1.0 1.6.1 ibnd_is_xsigo_hca@IBNETDISC_1.0 1.6.1 ibnd_is_xsigo_tca@IBNETDISC_1.0 1.6.1 ibnd_iter_nodes@IBNETDISC_1.0 1.6.1 ibnd_iter_nodes_type@IBNETDISC_1.0 1.6.1 ibnd_iter_ports@IBNETDISC_1.0 1.6.1 ibnd_load_fabric@IBNETDISC_1.0 1.6.1 ibnd_get_agg_linkspeedext_field@IBNETDISC_1.1 49 ibnd_get_agg_linkspeedext@IBNETDISC_1.1 49 ibnd_get_agg_linkspeedexten@IBNETDISC_1.1 49 ibnd_get_agg_linkspeedextsup@IBNETDISC_1.1 49 ibnd_dump_agg_linkspeedext_bits@IBNETDISC_1.1 49 ibnd_dump_agg_linkspeedext@IBNETDISC_1.1 49 ibnd_dump_agg_linkspeedexten@IBNETDISC_1.1 49 ibnd_dump_agg_linkspeedextsup@IBNETDISC_1.1 49 rdma-core-56.1/debian/libibumad-dev.install000066400000000000000000000002101477342711600206410ustar00rootroot00000000000000usr/include/infiniband/umad*.h usr/lib/*/libibumad*.so usr/lib/*/libibumad.a usr/lib/*/pkgconfig/libibumad.pc usr/share/man/man3/umad_* rdma-core-56.1/debian/libibumad3.install000066400000000000000000000000321477342711600201520ustar00rootroot00000000000000usr/lib/*/libibumad*.so.* rdma-core-56.1/debian/libibumad3.symbols000066400000000000000000000032161477342711600202030ustar00rootroot00000000000000libibumad.so.3 libibumad3 #MINVER# * Build-Depends-Package: libibumad-dev IBUMAD_1.0@IBUMAD_1.0 1.3.9 IBUMAD_1.1@IBUMAD_1.1 3.1.26 IBUMAD_1.2@IBUMAD_1.2 3.2.30 IBUMAD_1.3@IBUMAD_1.3 3.3.53 IBUMAD_1.4@IBUMAD_1.4 56 umad_addr_dump@IBUMAD_1.0 1.3.9 umad_attribute_str@IBUMAD_1.0 1.3.10.2 
umad_class_str@IBUMAD_1.0 1.3.10.2 umad_close_port@IBUMAD_1.0 1.3.9 umad_common_mad_status_str@IBUMAD_1.0 1.3.10.2 umad_debug@IBUMAD_1.0 1.3.9 umad_done@IBUMAD_1.0 1.3.9 umad_dump@IBUMAD_1.0 1.3.9 umad_free_ca_device_list@IBUMAD_1.1 3.1.26 umad_get_ca@IBUMAD_1.0 1.3.9 umad_get_ca_device_list@IBUMAD_1.1 3.1.26 umad_get_ca_portguids@IBUMAD_1.0 1.3.9 umad_get_cas_names@IBUMAD_1.0 1.3.9 umad_get_fd@IBUMAD_1.0 1.3.9 umad_get_issm_path@IBUMAD_1.0 1.3.9 umad_get_mad@IBUMAD_1.0 1.3.9 umad_get_mad_addr@IBUMAD_1.0 1.3.9 umad_get_pkey@IBUMAD_1.0 1.3.9 umad_get_port@IBUMAD_1.0 1.3.9 umad_init@IBUMAD_1.0 1.3.9 umad_method_str@IBUMAD_1.0 1.3.10.2 umad_open_port@IBUMAD_1.0 1.3.9 umad_open_smi_port@IBUMAD_1.3 3.3.53 umad_poll@IBUMAD_1.0 1.3.9 umad_recv@IBUMAD_1.0 1.3.9 umad_register2@IBUMAD_1.0 1.3.10.2 umad_register@IBUMAD_1.0 1.3.9 umad_register_oui@IBUMAD_1.0 1.3.9 umad_release_ca@IBUMAD_1.0 1.3.9 umad_release_port@IBUMAD_1.0 1.3.9 umad_sa_mad_status_str@IBUMAD_1.0 1.3.10.2 umad_send@IBUMAD_1.0 1.3.9 umad_set_addr@IBUMAD_1.0 1.3.9 umad_set_addr_net@IBUMAD_1.0 1.3.9 umad_set_grh@IBUMAD_1.0 1.3.9 umad_set_pkey@IBUMAD_1.0 1.3.9 umad_size@IBUMAD_1.0 1.3.9 umad_sort_ca_device_list@IBUMAD_1.2 3.2.30 umad_status@IBUMAD_1.0 1.3.9 umad_unregister@IBUMAD_1.0 1.3.9 umad_get_smi_gsi_pairs@IBUMAD_1.4 56 umad_get_smi_gsi_pair_by_ca_name@IBUMAD_1.4 56 rdma-core-56.1/debian/libibverbs-dev.install000066400000000000000000000025161477342711600210470ustar00rootroot00000000000000usr/include/infiniband/arch.h usr/include/infiniband/efadv.h usr/include/infiniband/hnsdv.h usr/include/infiniband/ib_user_ioctl_verbs.h usr/include/infiniband/manadv.h usr/include/infiniband/mlx4dv.h usr/include/infiniband/mlx5_api.h usr/include/infiniband/mlx5_user_ioctl_verbs.h usr/include/infiniband/mlx5dv.h usr/include/infiniband/opcode.h usr/include/infiniband/sa-kern-abi.h usr/include/infiniband/sa.h usr/include/infiniband/tm_types.h usr/include/infiniband/verbs.h usr/include/infiniband/verbs_api.h usr/lib/*/lib*-rdmav*.a usr/lib/*/libefa.a usr/lib/*/libefa.so usr/lib/*/libhns.a usr/lib/*/libhns.so usr/lib/*/libibverbs*.so usr/lib/*/libibverbs.a usr/lib/*/libmana.a usr/lib/*/libmana.so usr/lib/*/libmlx4.a usr/lib/*/libmlx4.so usr/lib/*/libmlx5.a usr/lib/*/libmlx5.so usr/lib/*/pkgconfig/libefa.pc usr/lib/*/pkgconfig/libhns.pc usr/lib/*/pkgconfig/libibverbs.pc usr/lib/*/pkgconfig/libmana.pc usr/lib/*/pkgconfig/libmlx4.pc usr/lib/*/pkgconfig/libmlx5.pc usr/share/man/man3/efadv_*.3 usr/share/man/man3/hnsdv_*.3 usr/share/man/man3/ibv_* usr/share/man/man3/manadv_*.3 usr/share/man/man3/mbps_to_ibv_rate.3 usr/share/man/man3/mlx4dv_*.3 usr/share/man/man3/mlx5dv_*.3 usr/share/man/man3/mult_to_ibv_rate.3 usr/share/man/man7/efadv.7 usr/share/man/man7/hnsdv.7 usr/share/man/man7/manadv.7 usr/share/man/man7/mlx4dv.7 usr/share/man/man7/mlx5dv.7 rdma-core-56.1/debian/libibverbs1.install000066400000000000000000000001341477342711600203460ustar00rootroot00000000000000usr/lib/*/libibverbs*.so.* usr/share/doc/rdma-core/libibverbs.md usr/share/doc/libibverbs1/ rdma-core-56.1/debian/libibverbs1.postinst000066400000000000000000000002541477342711600205660ustar00rootroot00000000000000#!/bin/sh # postinst script for libibverbs1 set -e if [ "$1" = configure ]; then getent group rdma > /dev/null 2>&1 || addgroup --system --quiet rdma fi #DEBHELPER# rdma-core-56.1/debian/libibverbs1.symbols000066400000000000000000000104061477342711600203730ustar00rootroot00000000000000libibverbs.so.1 libibverbs1 #MINVER# * Build-Depends-Package: libibverbs-dev IBVERBS_1.0@IBVERBS_1.0 
1.1.6 IBVERBS_1.1@IBVERBS_1.1 1.1.6 IBVERBS_1.5@IBVERBS_1.5 20 IBVERBS_1.6@IBVERBS_1.6 24 IBVERBS_1.7@IBVERBS_1.7 25 IBVERBS_1.8@IBVERBS_1.8 28 IBVERBS_1.9@IBVERBS_1.9 30 IBVERBS_1.10@IBVERBS_1.10 31 IBVERBS_1.11@IBVERBS_1.11 32 IBVERBS_1.12@IBVERBS_1.12 34 IBVERBS_1.13@IBVERBS_1.13 35 IBVERBS_1.14@IBVERBS_1.14 36 (symver)IBVERBS_PRIVATE_34 34 _ibv_query_gid_ex@IBVERBS_1.11 32 _ibv_query_gid_table@IBVERBS_1.11 32 ibv_ack_async_event@IBVERBS_1.0 1.1.6 ibv_ack_async_event@IBVERBS_1.1 1.1.6 ibv_ack_cq_events@IBVERBS_1.0 1.1.6 ibv_ack_cq_events@IBVERBS_1.1 1.1.6 ibv_alloc_pd@IBVERBS_1.0 1.1.6 ibv_alloc_pd@IBVERBS_1.1 1.1.6 ibv_attach_mcast@IBVERBS_1.0 1.1.6 ibv_attach_mcast@IBVERBS_1.1 1.1.6 ibv_close_device@IBVERBS_1.0 1.1.6 ibv_close_device@IBVERBS_1.1 1.1.6 ibv_copy_ah_attr_from_kern@IBVERBS_1.1 1.1.6 ibv_copy_path_rec_from_kern@IBVERBS_1.0 1.1.6 ibv_copy_path_rec_to_kern@IBVERBS_1.0 1.1.6 ibv_copy_qp_attr_from_kern@IBVERBS_1.0 1.1.6 ibv_create_ah@IBVERBS_1.0 1.1.6 ibv_create_ah@IBVERBS_1.1 1.1.6 ibv_create_ah_from_wc@IBVERBS_1.1 1.1.6 ibv_create_comp_channel@IBVERBS_1.0 1.1.6 ibv_create_cq@IBVERBS_1.0 1.1.6 ibv_create_cq@IBVERBS_1.1 1.1.6 ibv_create_qp@IBVERBS_1.0 1.1.6 ibv_create_qp@IBVERBS_1.1 1.1.6 ibv_create_srq@IBVERBS_1.0 1.1.6 ibv_create_srq@IBVERBS_1.1 1.1.6 ibv_dealloc_pd@IBVERBS_1.0 1.1.6 ibv_dealloc_pd@IBVERBS_1.1 1.1.6 ibv_dereg_mr@IBVERBS_1.0 1.1.6 ibv_dereg_mr@IBVERBS_1.1 1.1.6 ibv_destroy_ah@IBVERBS_1.0 1.1.6 ibv_destroy_ah@IBVERBS_1.1 1.1.6 ibv_destroy_comp_channel@IBVERBS_1.0 1.1.6 ibv_destroy_cq@IBVERBS_1.0 1.1.6 ibv_destroy_cq@IBVERBS_1.1 1.1.6 ibv_destroy_qp@IBVERBS_1.0 1.1.6 ibv_destroy_qp@IBVERBS_1.1 1.1.6 ibv_destroy_srq@IBVERBS_1.0 1.1.6 ibv_destroy_srq@IBVERBS_1.1 1.1.6 ibv_detach_mcast@IBVERBS_1.0 1.1.6 ibv_detach_mcast@IBVERBS_1.1 1.1.6 ibv_dofork_range@IBVERBS_1.1 1.1.6 ibv_dontfork_range@IBVERBS_1.1 1.1.6 ibv_event_type_str@IBVERBS_1.1 1.1.6 ibv_fork_init@IBVERBS_1.1 1.1.6 ibv_free_device_list@IBVERBS_1.0 1.1.6 ibv_free_device_list@IBVERBS_1.1 1.1.6 ibv_get_async_event@IBVERBS_1.0 1.1.6 ibv_get_async_event@IBVERBS_1.1 1.1.6 ibv_get_cq_event@IBVERBS_1.0 1.1.6 ibv_get_cq_event@IBVERBS_1.1 1.1.6 ibv_get_device_guid@IBVERBS_1.0 1.1.6 ibv_get_device_guid@IBVERBS_1.1 1.1.6 ibv_get_device_index@IBVERBS_1.9 30 ibv_get_device_list@IBVERBS_1.0 1.1.6 ibv_get_device_list@IBVERBS_1.1 1.1.6 ibv_get_device_name@IBVERBS_1.0 1.1.6 ibv_get_device_name@IBVERBS_1.1 1.1.6 ibv_get_pkey_index@IBVERBS_1.5 20 ibv_get_sysfs_path@IBVERBS_1.0 1.1.6 ibv_import_device@IBVERBS_1.10 31 ibv_import_dm@IBVERBS_1.13 35 ibv_import_mr@IBVERBS_1.10 31 ibv_import_pd@IBVERBS_1.10 31 ibv_init_ah_from_wc@IBVERBS_1.1 1.1.6 ibv_is_fork_initialized@IBVERBS_1.13 35 ibv_modify_qp@IBVERBS_1.0 1.1.6 ibv_modify_qp@IBVERBS_1.1 1.1.6 ibv_modify_srq@IBVERBS_1.0 1.1.6 ibv_modify_srq@IBVERBS_1.1 1.1.6 ibv_node_type_str@IBVERBS_1.1 1.1.6 ibv_open_device@IBVERBS_1.0 1.1.6 ibv_open_device@IBVERBS_1.1 1.1.6 ibv_port_state_str@IBVERBS_1.1 1.1.6 ibv_qp_to_qp_ex@IBVERBS_1.6 24 ibv_query_device@IBVERBS_1.0 1.1.6 ibv_query_device@IBVERBS_1.1 1.1.6 ibv_query_ece@IBVERBS_1.10 31 ibv_query_gid@IBVERBS_1.0 1.1.6 ibv_query_gid@IBVERBS_1.1 1.1.6 ibv_query_pkey@IBVERBS_1.0 1.1.6 ibv_query_pkey@IBVERBS_1.1 1.1.6 ibv_query_port@IBVERBS_1.0 1.1.6 ibv_query_port@IBVERBS_1.1 1.1.6 ibv_query_qp@IBVERBS_1.0 1.1.6 ibv_query_qp@IBVERBS_1.1 1.1.6 ibv_query_qp_data_in_order@IBVERBS_1.14 36 ibv_query_srq@IBVERBS_1.0 1.1.6 ibv_query_srq@IBVERBS_1.1 1.1.6 ibv_rate_to_mbps@IBVERBS_1.1 1.1.8 ibv_rate_to_mult@IBVERBS_1.0 1.1.6 
ibv_read_sysfs_file@IBVERBS_1.0 1.1.6 ibv_reg_dmabuf_mr@IBVERBS_1.12 34 ibv_reg_mr@IBVERBS_1.0 1.1.6 ibv_reg_mr@IBVERBS_1.1 1.1.6 ibv_reg_mr_iova@IBVERBS_1.7 25 ibv_reg_mr_iova2@IBVERBS_1.8 28 ibv_register_driver@IBVERBS_1.1 1.1.6 ibv_rereg_mr@IBVERBS_1.1 1.2.1 ibv_resize_cq@IBVERBS_1.0 1.1.6 ibv_resize_cq@IBVERBS_1.1 1.1.6 ibv_resolve_eth_l2_from_gid@IBVERBS_1.1 1.2.0 ibv_set_ece@IBVERBS_1.10 31 ibv_unimport_dm@IBVERBS_1.13 35 ibv_unimport_mr@IBVERBS_1.10 31 ibv_unimport_pd@IBVERBS_1.10 31 ibv_wc_status_str@IBVERBS_1.1 1.1.6 mbps_to_ibv_rate@IBVERBS_1.1 1.1.8 mult_to_ibv_rate@IBVERBS_1.0 1.1.6 rdma-core-56.1/debian/librdmacm-dev.install000066400000000000000000000004661477342711600206600ustar00rootroot00000000000000usr/include/infiniband/ib.h usr/include/rdma/rdma_cma.h usr/include/rdma/rdma_cma_abi.h usr/include/rdma/rdma_verbs.h usr/include/rdma/rsocket.h usr/lib/*/librdmacm*.so usr/lib/*/librdmacm.a usr/lib/*/pkgconfig/librdmacm.pc usr/share/man/man3/rdma_*.3 usr/share/man/man7/rdma_cm.7 usr/share/man/man7/rsocket.7 rdma-core-56.1/debian/librdmacm1.install000066400000000000000000000001751477342711600201620ustar00rootroot00000000000000usr/lib/*/librdmacm*.so.* usr/lib/*/rsocket/librspreload*.so* usr/share/doc/rdma-core/librdmacm.md usr/share/doc/librdmacm1/ rdma-core-56.1/debian/librdmacm1.symbols000066400000000000000000000045151477342711600202060ustar00rootroot00000000000000librdmacm.so.1 librdmacm1 #MINVER# * Build-Depends-Package: librdmacm-dev RDMACM_1.0@RDMACM_1.0 1.0.15 RDMACM_1.1@RDMACM_1.1 16 RDMACM_1.2@RDMACM_1.2 23 RDMACM_1.3@RDMACM_1.3 31 raccept@RDMACM_1.0 1.0.16 rbind@RDMACM_1.0 1.0.16 rclose@RDMACM_1.0 1.0.16 rconnect@RDMACM_1.0 1.0.16 rdma_accept@RDMACM_1.0 1.0.15 rdma_ack_cm_event@RDMACM_1.0 1.0.15 rdma_bind_addr@RDMACM_1.0 1.0.15 rdma_connect@RDMACM_1.0 1.0.15 rdma_create_ep@RDMACM_1.0 1.0.15 rdma_create_event_channel@RDMACM_1.0 1.0.15 rdma_create_id@RDMACM_1.0 1.0.15 rdma_create_qp@RDMACM_1.0 1.0.15 rdma_create_qp_ex@RDMACM_1.0 1.0.19 rdma_create_srq@RDMACM_1.0 1.0.15 rdma_create_srq_ex@RDMACM_1.0 1.0.19 rdma_destroy_ep@RDMACM_1.0 1.0.15 rdma_destroy_event_channel@RDMACM_1.0 1.0.15 rdma_destroy_id@RDMACM_1.0 1.0.15 rdma_destroy_qp@RDMACM_1.0 1.0.15 rdma_destroy_srq@RDMACM_1.0 1.0.15 rdma_disconnect@RDMACM_1.0 1.0.15 rdma_event_str@RDMACM_1.0 1.0.15 rdma_establish@RDMACM_1.2 23 rdma_free_devices@RDMACM_1.0 1.0.15 rdma_freeaddrinfo@RDMACM_1.0 1.0.15 rdma_get_cm_event@RDMACM_1.0 1.0.15 rdma_get_devices@RDMACM_1.0 1.0.15 rdma_get_dst_port@RDMACM_1.0 1.0.19 rdma_get_remote_ece@RDMACM_1.3 31 rdma_get_request@RDMACM_1.0 1.0.15 rdma_get_src_port@RDMACM_1.0 1.0.19 rdma_getaddrinfo@RDMACM_1.0 1.0.15 rdma_init_qp_attr@RDMACM_1.2 23 rdma_join_multicast@RDMACM_1.0 1.0.15 rdma_join_multicast_ex@RDMACM_1.1 16 rdma_leave_multicast@RDMACM_1.0 1.0.15 rdma_listen@RDMACM_1.0 1.0.15 rdma_migrate_id@RDMACM_1.0 1.0.15 rdma_notify@RDMACM_1.0 1.0.15 rdma_reject@RDMACM_1.0 1.0.15 rdma_reject_ece@RDMACM_1.3 31 rdma_resolve_addr@RDMACM_1.0 1.0.15 rdma_resolve_route@RDMACM_1.0 1.0.15 rdma_set_local_ece@RDMACM_1.3 31 rdma_set_option@RDMACM_1.0 1.0.15 rfcntl@RDMACM_1.0 1.0.16 rgetpeername@RDMACM_1.0 1.0.16 rgetsockname@RDMACM_1.0 1.0.16 rgetsockopt@RDMACM_1.0 1.0.16 riomap@RDMACM_1.0 1.0.19 riounmap@RDMACM_1.0 1.0.19 riowrite@RDMACM_1.0 1.0.19 rlisten@RDMACM_1.0 1.0.16 rpoll@RDMACM_1.0 1.0.16 rread@RDMACM_1.0 1.0.16 rreadv@RDMACM_1.0 1.0.16 rrecv@RDMACM_1.0 1.0.16 rrecvfrom@RDMACM_1.0 1.0.16 rrecvmsg@RDMACM_1.0 1.0.16 rselect@RDMACM_1.0 1.0.16 rsend@RDMACM_1.0 1.0.16 rsendmsg@RDMACM_1.0 1.0.16 
rsendto@RDMACM_1.0 1.0.16 rsetsockopt@RDMACM_1.0 1.0.16 rshutdown@RDMACM_1.0 1.0.16 rsocket@RDMACM_1.0 1.0.16 rwrite@RDMACM_1.0 1.0.16 rwritev@RDMACM_1.0 1.0.16 rdma-core-56.1/debian/python3-pyverbs.examples000066400000000000000000000000371477342711600214100ustar00rootroot00000000000000pyverbs/examples/ib_devices.py rdma-core-56.1/debian/python3-pyverbs.install000066400000000000000000000001041477342711600212330ustar00rootroot00000000000000usr/lib/python3/dist-packages/pyverbs usr/share/doc/rdma-core/tests rdma-core-56.1/debian/rdma-core.install000066400000000000000000000021221477342711600200120ustar00rootroot00000000000000etc/init.d/iwpmd etc/iwpmd.conf etc/modprobe.d/mlx4.conf etc/modprobe.d/truescale.conf etc/rdma/modules/infiniband.conf etc/rdma/modules/iwarp.conf etc/rdma/modules/iwpmd.conf etc/rdma/modules/opa.conf etc/rdma/modules/rdma.conf etc/rdma/modules/roce.conf lib/systemd/system/iwpmd.service lib/systemd/system/rdma-hw.target lib/systemd/system/rdma-load-modules@.service lib/systemd/system/rdma-ndd.service lib/udev/rdma_rename lib/udev/rules.d/60-rdma-ndd.rules lib/udev/rules.d/60-rdma-persistent-naming.rules lib/udev/rules.d/75-rdma-description.rules lib/udev/rules.d/90-iwpmd.rules lib/udev/rules.d/90-rdma-hw-modules.rules lib/udev/rules.d/90-rdma-ulp-modules.rules lib/udev/rules.d/90-rdma-umad.rules usr/lib/truescale-serdes.cmds usr/sbin/iwpmd usr/sbin/rdma-ndd usr/share/doc/rdma-core/70-persistent-ipoib.rules usr/share/doc/rdma-core/MAINTAINERS usr/share/doc/rdma-core/README.md usr/share/doc/rdma-core/rxe.md usr/share/doc/rdma-core/tag_matching.md usr/share/doc/rdma-core/udev.md usr/share/man/man5/iwpmd.conf.5 usr/share/man/man7/rxe.7 usr/share/man/man8/iwpmd.8 usr/share/man/man8/rdma-ndd.8 rdma-core-56.1/debian/rdma-core.lintian-overrides000066400000000000000000000007711477342711600220120ustar00rootroot00000000000000# "module -i ib_qib" will execute code. This cannot be replaced by the softdep command. rdma-core: obsolete-command-in-modprobe.d-file install [etc/modprobe.d/truescale.conf] # The rdma-ndd service is started by udev. rdma-core: systemd-service-file-missing-install-key [lib/systemd/system/iwpmd.service] rdma-core: systemd-service-file-missing-install-key [lib/systemd/system/rdma-ndd.service] # For lintian < 2.115.2 rdma-core: obsolete-command-in-modprobe.d-file etc/modprobe.d/truescale.conf install rdma-core-56.1/debian/rdma-core.maintscript000066400000000000000000000001071477342711600207020ustar00rootroot00000000000000rm_conffile /etc/udev/rules.d/70-persistent-ipoib.rules 43.0 rdma-core rdma-core-56.1/debian/rdma-core.postinst000066400000000000000000000006521477342711600202350ustar00rootroot00000000000000#!/bin/sh set -e #DEBHELPER# if [ "$1" = "configure" ]; then # we ship udev rules, so trigger an update. This has to be done after # DEBHELPER restarts systemd to get our new service files loaded.
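# The '|| true' on the triggers below is deliberate: udevadm may fail in
# environments without a running udev (e.g. chroots or containers), and the
# postinst must not abort there.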
udevadm trigger --subsystem-match=infiniband --action=change || true udevadm trigger --subsystem-match=net --action=change || true udevadm trigger --subsystem-match=infiniband_mad --action=change || true fi rdma-core-56.1/debian/rdmacm-utils.install000066400000000000000000000011411477342711600205420ustar00rootroot00000000000000usr/bin/cmtime usr/bin/mckey usr/bin/rcopy usr/bin/rdma_client usr/bin/rdma_server usr/bin/rdma_xclient usr/bin/rdma_xserver usr/bin/riostream usr/bin/rping usr/bin/rstream usr/bin/ucmatose usr/bin/udaddy usr/bin/udpong usr/share/man/man1/cmtime.1 usr/share/man/man1/mckey.1 usr/share/man/man1/rcopy.1 usr/share/man/man1/rdma_client.1 usr/share/man/man1/rdma_server.1 usr/share/man/man1/rdma_xclient.1 usr/share/man/man1/rdma_xserver.1 usr/share/man/man1/riostream.1 usr/share/man/man1/rping.1 usr/share/man/man1/rstream.1 usr/share/man/man1/ucmatose.1 usr/share/man/man1/udaddy.1 usr/share/man/man1/udpong.1 rdma-core-56.1/debian/rules000077500000000000000000000073041477342711600156400ustar00rootroot00000000000000#!/usr/bin/make -f include /usr/share/dpkg/architecture.mk export DEB_BUILD_MAINT_OPTIONS=hardening=+all NON_COHERENT_DMA_ARCHS = alpha arc armel armhf hppa m68k sh4 dh_params = --with python3 --builddirectory=build-deb %: dh $@ $(dh_params) override_dh_auto_clean: dh_auto_clean rm -rf build-deb for package in ibverbs-providers libibverbs-dev rdma-core; do \ test ! -e debian/$$package.install.backup || mv debian/$$package.install.backup debian/$$package.install; \ done # Upstream wishes to use CMAKE_BUILD_TYPE=Release, and ensures that it has a # sensible basis of options (e.g. no -O3, including -g). Debian-specific options # come from CFLAGS as usual. # # Upstream encourages the use of Ninja to build the source; convince dh to use # it until someone writes native support for dh+cmake+ninja.
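# For reference only, a rough sketch of the equivalent manual build outside of
# dh (illustrative; the packaging itself always goes through dh_auto_configure
# with the flag list assembled below):
#   cmake -GNinja -DCMAKE_BUILD_TYPE=Release -S . -B build-deb
#   ninja -C build-deb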
DH_AUTO_CONFIGURE := "--" \ "-GNinja" \ "-DDISTRO_FLAVOUR=Debian" \ "-DCMAKE_BUILD_TYPE=Release" \ "-DCMAKE_INSTALL_SYSCONFDIR:PATH=/etc" \ "-DCMAKE_INSTALL_SYSTEMD_SERVICEDIR:PATH=/lib/systemd/system" \ "-DCMAKE_INSTALL_INITDDIR:PATH=/etc/init.d" \ "-DCMAKE_INSTALL_LIBEXECDIR:PATH=/usr/lib" \ "-DCMAKE_INSTALL_SHAREDSTATEDIR:PATH=/var/lib" \ "-DCMAKE_INSTALL_RUNDIR:PATH=/run" \ "-DCMAKE_INSTALL_UDEV_RULESDIR:PATH=/lib/udev/rules.d" \ "-DCMAKE_INSTALL_PERLDIR:PATH=/usr/share/perl5" \ "-DENABLE_STATIC=1" \ $(EXTRA_CMAKE_FLAGS) override_dh_auto_configure: if [ -e /usr/bin/python3 ]; then \ dh_auto_configure $(DH_AUTO_CONFIGURE) \ -DPYTHON_EXECUTABLE:PATH=/usr/bin/python3 \ -DCMAKE_INSTALL_PYTHON_ARCH_LIB:PATH=/usr/lib/python3/dist-packages; \ else \ dh_auto_configure $(DH_AUTO_CONFIGURE) \ -DNO_PYVERBS=1; \ fi override_dh_auto_build: ninja -C build-deb -v # upstream does not ship test cases override_dh_auto_test: override_dh_auto_install: # Some providers are disabled on architectures that are not able to do coherent DMA ifeq (,$(filter-out $(NON_COHERENT_DMA_ARCHS),$(DEB_HOST_ARCH))) for package in ibverbs-providers libibverbs-dev rdma-core; do \ test -e debian/$$package.install.backup || cp debian/$$package.install debian/$$package.install.backup; \ done sed -i '/efa\|hns\|mana\|mlx[45]/d' debian/ibverbs-providers.install debian/libibverbs-dev.install debian/rdma-core.install endif DESTDIR=$(CURDIR)/debian/tmp ninja -C build-deb install # The following files are not used on Debian (we ship our own sysvinit script) INST_EXCLUDE := "etc/init.d/srpd" \ "etc/init.d/ibacm" \ "usr/sbin/run_srp_daemon" \ "usr/sbin/srp_daemon.sh" INST_EXCLUDE := $(addprefix -X,$(INST_EXCLUDE)) override_dh_install: if [ -e build-deb/python/pyverbs/__init__.py ]; then \ dh_install --fail-missing $(INST_EXCLUDE); \ else \ dh_install -Npython3-pyverbs --fail-missing $(INST_EXCLUDE) --remaining-packages; \ fi # cmake installs the correct init scripts in the correct place, just set up the # pre/postrm scripts override_dh_installinit: dh_installinit -prdma-core --onlyscripts --name=iwpmd dh_installinit --remaining-packages override_dh_installsystemd: dh_installsystemd -pibacm --no-start ibacm.service dh_installsystemd -pibacm ibacm.socket dh_installsystemd --remaining-packages # Provider plugin libraries are not shared libraries and do not belong in the # shlibs file. # librspreload is an LD_PRELOAD library and does not belong in the shlibs file SHLIBS_EXCLUDE = "/libibverbs/" "librspreload" "/ibacm/" SHLIBS_EXCLUDE := $(addprefix --exclude=,$(SHLIBS_EXCLUDE)) override_dh_makeshlibs: dh_makeshlibs $(SHLIBS_EXCLUDE) # Upstream encourages the use of 'build' as the developer build output # directory, allow that directory to be present and still allow dh to work. .PHONY: build build: dh $@ $(dh_params) rdma-core-56.1/debian/source/000077500000000000000000000000001477342711600160545ustar00rootroot00000000000000rdma-core-56.1/debian/source/format000066400000000000000000000000141477342711600172620ustar00rootroot000000000000003.0 (quilt) rdma-core-56.1/debian/srptools.default000066400000000000000000000004651477342711600200140ustar00rootroot00000000000000# How often should srp_daemon rescan the fabric (seconds). RETRIES=60 # Where should srp_daemon log to. LOG=/var/log/srp_daemon.log # What ports should srp_daemon be started on.
# Format is CA:port # ALL or NONE will run on all ports or none, # respectively PORTS=NONE #PORTS=ALL #PORTS="mthca0:1 mlx4_0:2" rdma-core-56.1/debian/srptools.init000066400000000000000000000054371477342711600173350ustar00rootroot00000000000000#!/bin/bash ### BEGIN INIT INFO # Provides: srptools # Required-Start: $remote_fs $syslog # Required-Stop: $remote_fs $syslog # Default-Start: 2 3 4 5 # Default-Stop: 0 1 6 # Short-Description: Discovers SRP scsi targets. # Description: Discovers SRP scsi over infiniband targets. ### END INIT INFO DAEMON=/usr/sbin/srp_daemon IBDIR=/sys/class/infiniband PORTS="" RETRIES="" RETRIES_DEFAULT=60 LOG="" LOG_DEFAULT=/var/log/srp_daemon.log [ -x $DAEMON ] || exit 0 . /lib/lsb/init-functions [ -f /etc/default/srptools ] && . /etc/default/srptools max() { echo $(($1 > $2 ? $1 : $2)) } run_daemon() { # srp_daemon does not background itself; using the start-stop-daemon background # function causes us to lose stdout, which is where it logs to nohup start-stop-daemon --start --quiet -m \ --pidfile "/var/run/srp_daemon.${HCA_ID}.${PORT}" \ --exec $DAEMON -- -e -c -n \ -i "${HCA_ID}" -p "${PORT}" -R "${RETRIES:-${RETRIES_DEFAULT}}" \ >> "${LOG:-${LOG_DEFAULT}}" 2>&1 & RETVAL=$(max "$RETVAL" $?) } # Evaluate shell command $1 for every port in $PORTS for_all_ports() { local cmd=$1 p if [ "$PORTS" = "ALL" ]; then for p in ${IBDIR}/*/ports/*; do [ -e "$p" ] || continue PORT=$(basename "$p") HCA_ID=$(basename "$(dirname "$(dirname "$p")")") eval "$cmd" done else for ADAPTER in $PORTS; do HCA_ID=${ADAPTER%%:*} PORT=${ADAPTER#${HCA_ID}:} [ -n "$HCA_ID" ] && [ -n "$PORT" ] && eval "$cmd" done fi } start_daemon() { local RETVAL=0 if [ "$PORTS" = "NONE" ] ; then echo "srptools disabled." exit 0 fi for_all_ports run_daemon case $RETVAL in 0) log_success_msg "started $DAEMON";; *) log_failure_msg "failed to start $DAEMON";; esac return $RETVAL } stop_daemon() { local RETVAL=0 PORTS=ALL for_all_ports 'start-stop-daemon --stop --quiet --oknodo -m --pidfile "/var/run/srp_daemon.${HCA_ID}.${PORT}"; RETVAL=$(max $RETVAL $?)' case $RETVAL in 0) log_success_msg "stopped $DAEMON";; *) log_failure_msg "failed to stop $DAEMON";; esac return $RETVAL } check_status() { local pidfile=$1 pid [ -e "$pidfile" ] || return 3 # not running pid=$(<"$pidfile") [ -n "$pid" ] || return 3 # not running [ -d "/proc/$pid" ] || return 1 # not running and pid file exists return 0 # running } daemon_status() { local RETVAL=0 for_all_ports 'check_status /var/run/srp_daemon.${HCA_ID}.${PORT} $DAEMON; RETVAL=$(max $RETVAL $?)' case $RETVAL in 0) log_success_msg "$DAEMON is running";; *) log_failure_msg "$DAEMON is not running";; esac return $RETVAL } case "$1" in start) start_daemon ;; stop) stop_daemon ;; status) daemon_status ;; restart | reload | force-reload ) stop_daemon start_daemon ;; esac rdma-core-56.1/debian/srptools.install000066400000000000000000000007051477342711600200330ustar00rootroot00000000000000etc/rdma/modules/srp_daemon.conf etc/srp_daemon.conf lib/systemd/system/srp_daemon.service lib/systemd/system/srp_daemon_port@.service lib/udev/rules.d/60-srp_daemon.rules usr/lib/srp_daemon/start_on_all_ports usr/sbin/ibsrpdm usr/sbin/srp_daemon usr/share/doc/rdma-core/ibsrpdm.md usr/share/doc/srptools/ usr/share/man/man5/srp_daemon.service.5 usr/share/man/man5/srp_daemon_port@.service.5 usr/share/man/man8/ibsrpdm.8 usr/share/man/man8/srp_daemon.8 rdma-core-56.1/debian/srptools.links000066400000000000000000000001141477342711600174770ustar00rootroot00000000000000/lib/systemd/system/srp_daemon.service
/lib/systemd/system/srptools.service rdma-core-56.1/debian/srptools.lintian-overrides000066400000000000000000000002571477342711600220250ustar00rootroot00000000000000# The wantedby target remote-fs-pre.target is intentional srptools: systemd-service-file-refers-to-unusual-wantedby-target remote-fs-pre.target [lib/systemd/system/*.service] rdma-core-56.1/debian/srptools.postinst000066400000000000000000000004371477342711600202520ustar00rootroot00000000000000#!/bin/sh set -e #DEBHELPER# if [ "$1" = "configure" ]; then # we ship udev rules, so trigger an update. This has to be done after # DEBHELPER restarts systemd to get our new service files loaded. udevadm trigger --subsystem-match=infiniband_mad --action=change || true fi rdma-core-56.1/debian/upstream/000077500000000000000000000000001477342711600164145ustar00rootroot00000000000000rdma-core-56.1/debian/upstream/metadata000066400000000000000000000001631477342711600201170ustar00rootroot00000000000000Repository: https://github.com/linux-rdma/rdma-core.git Repository-Browse: https://github.com/linux-rdma/rdma-core rdma-core-56.1/debian/watch000066400000000000000000000003301477342711600156010ustar00rootroot00000000000000version=4 opts="searchmode=plain" \ https://api.github.com/repos/linux-rdma/rdma-core/releases?per_page=100 \ https://github.com/linux-rdma/rdma-core/releases/download/v[^/]+/rdma-core-@ANY_VERSION@@ARCHIVE_EXT@ rdma-core-56.1/ibacm/000077500000000000000000000000001477342711600144055ustar00rootroot00000000000000rdma-core-56.1/ibacm/CMakeLists.txt000066400000000000000000000040731477342711600171510ustar00rootroot00000000000000publish_headers(infiniband include/infiniband/acm_prov.h ) # FIXME: Fixup the include scheme to not require all these -Is include_directories("include") include_directories("src") include_directories("linux") include_directories(${NL_INCLUDE_DIRS}) # NOTE: ibacm exports symbols from its own binary for use by ibacm providers rdma_sbin_executable(ibacm src/acm.c src/acm_util.c ) target_link_libraries(ibacm LINK_PRIVATE ibverbs ibumad ${NL_LIBRARIES} ${SYSTEMD_LIBRARIES} ${CMAKE_THREAD_LIBS_INIT} ${CMAKE_DL_LIBS} ) # FIXME: We should probably list the symbols we want to export. set_target_properties(ibacm PROPERTIES ENABLE_EXPORTS TRUE) # This is a plugin module that dynamically links to ibacm add_library(ibacmp MODULE prov/acmp/src/acmp.c ) rdma_set_library_map(ibacmp "prov/acmp/src/libibacmp.map") target_link_libraries(ibacmp LINK_PRIVATE ibacm ibverbs ibumad ${CMAKE_THREAD_LIBS_INIT} ) set_target_properties(ibacmp PROPERTIES LIBRARY_OUTPUT_DIRECTORY "${BUILD_LIB}") install(TARGETS ibacmp DESTINATION "${ACM_PROVIDER_DIR}") # ACM providers are linked into a subdir so that IN_PLACE can work.
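# The two commands below are what implement that: they create
# ${BUILD_LIB}/ibacm/libibacmp.so as a symlink to ../libibacmp.so, so an
# in-place (uninstalled) ibacm can resolve its provider at the expected subpath.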
file(MAKE_DIRECTORY "${BUILD_LIB}/ibacm/") rdma_create_symlink("../libibacmp.so" "${BUILD_LIB}/ibacm/libibacmp.so") rdma_executable(ib_acme src/acme.c src/libacm.c src/parse.c ) target_link_libraries(ib_acme LINK_PRIVATE ibverbs ) target_compile_definitions(ib_acme PRIVATE "-DACME_PRINTS") rdma_man_pages( man/ib_acme.1 man/ibacm.7 man/ibacm.8 man/ibacm_prov.7.in ) # FIXME: update the .init.in rdma_subst_install(FILES "ibacm.init.in" DESTINATION "${CMAKE_INSTALL_INITDDIR}" RENAME "ibacm" PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ OWNER_EXECUTE GROUP_EXECUTE WORLD_EXECUTE) rdma_subst_install(FILES "ibacm.service.in" DESTINATION "${CMAKE_INSTALL_SYSTEMD_SERVICEDIR}" RENAME ibacm.service PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ) install(FILES "ibacm.socket" DESTINATION "${CMAKE_INSTALL_SYSTEMD_SERVICEDIR}" RENAME ibacm.socket PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ) rdma-core-56.1/ibacm/ibacm.init.in000066400000000000000000000055711477342711600167600ustar00rootroot00000000000000#!/bin/bash # Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md # # Bring up/down the ibacm daemon # # chkconfig: 2345 25 75 # description: Starts/Stops InfiniBand ACM service # ### BEGIN INIT INFO # Provides: ibacm # Default-Start: 2 3 4 5 # Default-Stop: 0 1 6 # Required-Start: $network $remote_fs # Required-Stop: $network $remote_fs # Should-Start: # Should-Stop: # Short-Description: Starts and stops the InfiniBand ACM service # Description: The InfiniBand ACM service provides a user space implementation # of something resembling an ARP cache for InfiniBand SA queries and # host route lookups. ### END INIT INFO pidfile=@CMAKE_INSTALL_FULL_RUNDIR@/ibacm.pid subsys=/var/lock/subsys/ibacm daemon() { /sbin/daemon ${1+"$@"}; } if [ -s /etc/init.d/functions ]; then # RHEL / CentOS / SL / Fedora . /etc/init.d/functions _daemon() { daemon ${1+"$@"}; } _checkpid() { checkpid `cat $pidfile`; } _success() { success; echo; } _failure() { failure; echo; } elif [ -s /lib/lsb/init-functions ]; then # SLES / OpenSuSE / Debian . /lib/lsb/init-functions _daemon() { start_daemon "$@"; } _checkpid() { checkproc -p $pidfile @CMAKE_INSTALL_FULL_SBINDIR@/ibacm; } _success() { log_success_msg; } _failure() { log_failure_msg; } elif [ -s /etc/rc.status ]; then # Older SuSE . /etc/rc.status _daemon() { /sbin/start_daemon ${1+"$@"}; } _checkpid() { checkproc -p $pidfile @CMAKE_INSTALL_FULL_SBINDIR@/ibacm; } _success() { rc_status -v; } _failure() { rc_status -v; } fi start() { echo -n "Starting ibacm daemon:" _daemon @CMAKE_INSTALL_FULL_SBINDIR@/ibacm RETVAL=$? if [[ $RETVAL -eq 0 ]]; then _success else _failure fi } stop() { echo -n "Stopping ibacm daemon:" killproc -p $pidfile ibacm RETVAL=$? if [[ $RETVAL -eq 0 ]]; then _success else _failure fi rm -f $subsys } status() { echo -n "Checking for ibacm service " if [ ! -f $subsys -a ! -f $pidfile ]; then RETVAL=3 elif [ -f $pidfile ]; then _checkpid RETVAL=$?
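# No pidfile, but the subsys lock remains: report LSB status 2
# (program is dead and lock file exists).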
elif [ -f $subsys ]; then RETVAL=2 else RETVAL=0 fi if [[ $RETVAL -eq 0 ]]; then _success else _failure fi } restart () { stop start } condrestart () { [ -e $subsys ] && restart || return 0 } usage () { echo echo "Usage: `basename $0` {start|stop|restart|condrestart|try-restart|force-reload|status}" echo return 2 } case $1 in start|stop|restart|condrestart|try-restart|force-reload) [ `id -u` != "0" ] && exit 4 ;; esac case $1 in start) start ;; stop) stop ;; restart | reload) restart ;; condrestart | try-restart | force-reload) condrestart ;; status) status ;; *) usage ;; esac exit $RETVAL rdma-core-56.1/ibacm/ibacm.service.in000066400000000000000000000016551477342711600174550ustar00rootroot00000000000000[Unit] Description=InfiniBand Address Cache Manager Daemon Documentation=man:ibacm file:@CMAKE_INSTALL_FULL_SYSCONFDIR@/rdma/ibacm_opts.cfg # Cause systemd to always start the socket, which means the parameters in # ibacm.socket always configure the listening socket, even if the daemon is # started directly. Wants=ibacm.socket # Ensure required kernel modules are loaded before starting Wants=rdma-load-modules@rdma.service After=rdma-load-modules@rdma.service # Order ibacm startup after basic RDMA hw setup. After=rdma-hw.target # Implicitly after basic.target, note that ibacm writes to /var/log directly # and thus needs writable filesystems setup. [Service] Type=notify ExecStart=@CMAKE_INSTALL_FULL_SBINDIR@/ibacm --systemd ProtectSystem=full ProtectHome=true ProtectHostname=true ProtectKernelLogs=true [Install] Also=ibacm.socket # Only want ibacm if RDMA hardware is present (or the socket is touched) WantedBy=rdma-hw.target rdma-core-56.1/ibacm/ibacm.socket000066400000000000000000000023651477342711600167000ustar00rootroot00000000000000# Please copy this file to /etc/systemd/system/ibacm.socket # before modification, if not done already. # # When using socket-based activation of the 'ibacm' service # ibacm's configuration option 'acme_plus_kernel_only' is ignored # (i.e. an implicit 'acme_plus_kernel_only no') # # In order to get the equivalent behavior of # configuration 'acme_plus_kernel_only yes' # Please add a comment (i.e. a '#' character) in front # of the line 'Symlinks' below, and ensure that the file # '/run/ibacm.sock' does not exist: # e.g. by using "rm -f /run/ibacm.sock" after modifying # the copy of this file that lives in /etc/systemd/system. # # Please also remember to reload the systemd configuration by running: # % systemctl --system daemon-reload [Unit] Description=Socket for InfiniBand Address Cache Manager Daemon Documentation=man:ibacm # Ensure that anything ordered after rdma-hw.target will see the socket, even # if that thing is not ordered after sockets.target/basic.target.
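# (Note: Before= only expresses ordering; the socket is actually pulled in
# via the WantedBy= line in the [Install] section below.)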
Before=rdma-hw.target # ibacm.socket always starts [Socket] ListenStream=/run/ibacm-unix.sock Symlinks=/run/ibacm.sock # Bind to PF_NETLINK, NETLINK_RDMA, RDMA_NL_GROUP_LS # Supported in systemd > 234 ListenNetlink=rdma 4 [Install] # Standard for all sockets WantedBy=sockets.target rdma-core-56.1/ibacm/ibacm_hosts.data000066400000000000000000000006611477342711600175360ustar00rootroot00000000000000# InfiniBand Communication Management Assistant for clusters hosts file # # Entry format is: # address IB GID # # The address may be one of the following: # host_name - ascii character string, up to 31 characters # address - IPv4 or IPv6 formatted address # # There can be multiple entries for a single IB GID # # Samples: # luna3 fe80::8:f104:39a:169 # 192.168.1.3 fe80::8:f104:39a:169 # fe80::208:f104:39a:169 fe80::8:f104:39a:169 rdma-core-56.1/ibacm/include/000077500000000000000000000000001477342711600160305ustar00rootroot00000000000000rdma-core-56.1/ibacm/include/acm_mad.h000066400000000000000000000142541477342711600175700ustar00rootroot00000000000000/* * Copyright (c) 2009 Intel Corporation. All rights reserved. * * This software is available to you under the OpenFabrics.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE.
*/ #if !defined(ACM_MAD_H) #define ACM_MAD_H #include #include #include #define ACM_SEND_SIZE 256 #define ACM_RECV_SIZE (ACM_SEND_SIZE + sizeof(struct ibv_grh)) #define IB_METHOD_GET 0x01 #define IB_METHOD_SET 0x02 #define IB_METHOD_SEND 0x03 #define IB_METHOD_GET_TABLE 0x12 #define IB_METHOD_DELETE 0x15 #define IB_METHOD_RESP 0x80 #define ACM_MGMT_CLASS 0x2C #define ACM_CTRL_ACK htobe16(0x8000) #define ACM_CTRL_RESOLVE htobe16(0x0001) #define IB_PKEY_FULL_MEMBER 0x8000 struct acm_mad { uint8_t base_version; uint8_t mgmt_class; uint8_t class_version; uint8_t method; __be16 status; __be16 control; __be64 tid; uint8_t data[240]; }; #define acm_class_status(status) ((uint8_t) (be16toh(status) >> 8)) #define ACM_QKEY 0x80010000 /* Map to ACM_EP_INFO_* */ #define ACM_ADDRESS_INVALID 0x00 #define ACM_ADDRESS_NAME 0x01 #define ACM_ADDRESS_IP 0x02 #define ACM_ADDRESS_IP6 0x03 #define ACM_ADDRESS_GID 0x04 #define ACM_ADDRESS_LID 0x05 #define ACM_ADDRESS_RESERVED 0x06 /* start of reserved range */ #define ACM_MAX_GID_COUNT 10 struct acm_resolve_rec { uint8_t dest_type; uint8_t dest_length; uint8_t src_type; uint8_t src_length; uint8_t gid_cnt; uint8_t resp_resources; uint8_t init_depth; uint8_t reserved; uint8_t dest[ACM_MAX_ADDRESS]; uint8_t src[ACM_MAX_ADDRESS]; union ibv_gid gid[ACM_MAX_GID_COUNT]; }; #define IB_MGMT_CLASS_SA 0x03 struct ib_sa_mad { uint8_t base_version; uint8_t mgmt_class; uint8_t class_version; uint8_t method; __be16 status; __be16 reserved1; __be64 tid; __be16 attr_id; __be16 reserved2; __be32 attr_mod; uint8_t rmpp_version; uint8_t rmpp_type; uint8_t rmpp_flags; uint8_t rmpp_status; __be32 seg_num; __be32 paylen_newwin; __be32 sm_key[2]; __be16 attr_offset; __be16 reserved3; __be64 comp_mask; uint8_t data[200]; }; #define IB_SA_ATTR_PATH_REC htobe16(0x0035) #define IB_COMP_MASK_PR_SERVICE_ID (htobe64(1 << 0) | \ htobe64(1 << 1)) #define IB_COMP_MASK_PR_DGID htobe64(1 << 2) #define IB_COMP_MASK_PR_SGID htobe64(1 << 3) #define IB_COMP_MASK_PR_DLID htobe64(1 << 4) #define IB_COMP_MASK_PR_SLID htobe64(1 << 5) #define IB_COMP_MASK_PR_RAW_TRAFFIC htobe64(1 << 6) /* RESERVED htobe64(1 << 7) */ #define IB_COMP_MASK_PR_FLOW_LABEL htobe64(1 << 8) #define IB_COMP_MASK_PR_HOP_LIMIT htobe64(1 << 9) #define IB_COMP_MASK_PR_TCLASS htobe64(1 << 10) #define IB_COMP_MASK_PR_REVERSIBLE htobe64(1 << 11) #define IB_COMP_MASK_PR_NUM_PATH htobe64(1 << 12) #define IB_COMP_MASK_PR_PKEY htobe64(1 << 13) #define IB_COMP_MASK_PR_QOS_CLASS htobe64(1 << 14) #define IB_COMP_MASK_PR_SL htobe64(1 << 15) #define IB_COMP_MASK_PR_MTU_SELECTOR htobe64(1 << 16) #define IB_COMP_MASK_PR_MTU htobe64(1 << 17) #define IB_COMP_MASK_PR_RATE_SELECTOR htobe64(1 << 18) #define IB_COMP_MASK_PR_RATE htobe64(1 << 19) #define IB_COMP_MASK_PR_PACKET_LIFETIME_SELECTOR htobe64(1 << 20) #define IB_COMP_MASK_PR_PACKET_LIFETIME htobe64(1 << 21) #define IB_COMP_MASK_PR_PREFERENCE htobe64(1 << 22) /* RESERVED htobe64(1 << 23) */ #define IB_MC_QPN 0xffffff #define IB_SA_ATTR_MC_MEMBER_REC htobe16(0x0038) #define IB_COMP_MASK_MC_MGID htobe64(1 << 0) #define IB_COMP_MASK_MC_PORT_GID htobe64(1 << 1) #define IB_COMP_MASK_MC_QKEY htobe64(1 << 2) #define IB_COMP_MASK_MC_MLID htobe64(1 << 3) #define IB_COMP_MASK_MC_MTU_SEL htobe64(1 << 4) #define IB_COMP_MASK_MC_MTU htobe64(1 << 5) #define IB_COMP_MASK_MC_TCLASS htobe64(1 << 6) #define IB_COMP_MASK_MC_PKEY htobe64(1 << 7) #define IB_COMP_MASK_MC_RATE_SEL htobe64(1 << 8) #define IB_COMP_MASK_MC_RATE htobe64(1 << 9) #define IB_COMP_MASK_MC_PACKET_LIFETIME_SEL htobe64(1 << 10) #define 
IB_COMP_MASK_MC_PACKET_LIFETIME htobe64(1 << 11) #define IB_COMP_MASK_MC_SL htobe64(1 << 12) #define IB_COMP_MASK_MC_FLOW htobe64(1 << 13) #define IB_COMP_MASK_MC_HOP htobe64(1 << 14) #define IB_COMP_MASK_MC_SCOPE htobe64(1 << 15) #define IB_COMP_MASK_MC_JOIN_STATE htobe64(1 << 16) #define IB_COMP_MASK_MC_PROXY_JOIN htobe64(1 << 17) struct ib_mc_member_rec { union ibv_gid mgid; union ibv_gid port_gid; __be32 qkey; __be16 mlid; uint8_t mtu; uint8_t tclass; __be16 pkey; uint8_t rate; uint8_t packet_lifetime; __be32 sl_flow_hop; uint8_t scope_state; uint8_t proxy_join; uint8_t reserved[2]; uint8_t pad[4]; }; #endif /* ACM_MAD_H */ rdma-core-56.1/ibacm/include/infiniband/000077500000000000000000000000001477342711600201315ustar00rootroot00000000000000rdma-core-56.1/ibacm/include/infiniband/acm_prov.h000066400000000000000000000101161477342711600221070ustar00rootroot00000000000000/* * Copyright (c) 2014 Intel Corporation. All rights reserved. * * This software is available to you under the OpenFabrics.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #if !defined(ACM_PROV_H) #define ACM_PROV_H #include #include #include #ifdef __cplusplus extern "C" { #endif #define ACM_PROV_VERSION 1 struct acm_device { struct ibv_context *verbs; __be64 dev_guid; }; struct acm_port { struct acm_device *dev; uint8_t port_num; }; struct acm_endpoint { struct acm_port *port; uint16_t pkey; }; struct acm_address { struct acm_endpoint *endpoint; union acm_ep_info info; char *id_string; uint16_t type; }; struct acm_provider { size_t size; uint32_t version; const char *name; int (*open_device)(const struct acm_device *device, void **dev_context); void (*close_device)(void *dev_context); int (*open_port)(const struct acm_port *port, void *dev_context, void **port_context); void (*close_port)(void *port_context); int (*open_endpoint)(const struct acm_endpoint *endpoint, void *port_context, void **ep_context); void (*close_endpoint)(void *ep_context); int (*add_address)(const struct acm_address *addr, void *ep_context, void **addr_context); void (*remove_address)(void *addr_context); int (*resolve)(void *addr_context, struct acm_msg *msg, uint64_t id); int (*query)(void *addr_context, struct acm_msg *msg, uint64_t id); int (*handle_event)(void *port_context, enum ibv_event_type type); void (*query_perf)(void *ep_context, uint64_t *values, uint8_t *cnt); }; int provider_query(struct acm_provider **info, uint32_t *version); /* Functions exported from core */ #define acm_log(level, format, ...)
\ acm_write(level, "%s: "format, __func__, ## __VA_ARGS__) extern void acm_write(int level, const char *format, ...) __attribute__((format(printf, 2, 3))); extern void acm_format_name(int level, char *name, size_t name_size, uint8_t addr_type, const uint8_t *addr, size_t addr_size); extern int ib_any_gid(union ibv_gid *gid); extern uint8_t acm_gid_index(struct acm_port *port, union ibv_gid *gid); extern int acm_get_gid(struct acm_port *port, int index, union ibv_gid *gid); extern __be64 acm_path_comp_mask(struct ibv_path_record *path); extern int acm_resolve_response(uint64_t id, struct acm_msg *msg); extern int acm_query_response(uint64_t id, struct acm_msg *msg); extern enum ibv_rate acm_get_rate(uint8_t width, uint8_t speed); extern enum ibv_mtu acm_convert_mtu(int mtu); extern enum ibv_rate acm_convert_rate(int rate); struct acm_sa_mad { void *context; struct ib_user_mad umad; struct umad_sa_packet sa_mad; /* must follow umad and be 64-bit aligned */ }; extern struct acm_sa_mad * acm_alloc_sa_mad(const struct acm_endpoint *endpoint, void *context, void (*handler)(struct acm_sa_mad *)); extern void acm_free_sa_mad(struct acm_sa_mad *mad); extern int acm_send_sa_mad(struct acm_sa_mad *mad); extern const char *acm_get_opts_file(void); extern void acm_increment_counter(int type); #ifdef __cplusplus } #endif #endif /* ACM_PROV_H */ rdma-core-56.1/ibacm/linux/000077500000000000000000000000001477342711600155445ustar00rootroot00000000000000rdma-core-56.1/ibacm/linux/osd.h000066400000000000000000000072451477342711600165120ustar00rootroot00000000000000/* * Copyright (c) 2009 Intel Corporation. All rights reserved. * Copyright (c) 2013 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under the OpenFabrics.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE.
*/ #if !defined(OSD_H) #define OSD_H #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define ACM_ADDR_FILE "ibacm_addr.cfg" #define ACM_OPTS_FILE "ibacm_opts.cfg" #if DEFINE_ATOMICS typedef struct { pthread_mutex_t mut; int val; } atomic_t; static inline int atomic_inc(atomic_t *atomic) { int v; pthread_mutex_lock(&atomic->mut); v = ++(atomic->val); pthread_mutex_unlock(&atomic->mut); return v; } static inline int atomic_dec(atomic_t *atomic) { int v; pthread_mutex_lock(&atomic->mut); v = --(atomic->val); pthread_mutex_unlock(&atomic->mut); return v; } static inline void atomic_init(atomic_t *atomic) { pthread_mutex_init(&atomic->mut, NULL); atomic->val = 0; } #else typedef struct { volatile int val; } atomic_t; #define atomic_inc(v) (__sync_add_and_fetch(&(v)->val, 1)) #define atomic_dec(v) (__sync_sub_and_fetch(&(v)->val, 1)) #define atomic_init(v) ((v)->val = 0) #endif #define atomic_get(v) ((v)->val) #define atomic_set(v, s) ((v)->val = s) typedef struct { pthread_cond_t cond; pthread_mutex_t mutex; } event_t; static inline void event_init(event_t *e) { pthread_condattr_t attr; pthread_condattr_init(&attr); pthread_condattr_setclock(&attr, CLOCK_MONOTONIC); pthread_cond_init(&e->cond, &attr); pthread_mutex_init(&e->mutex, NULL); } #define event_signal(e) pthread_cond_signal(&(e)->cond) #define ONE_SEC_IN_NSEC 1000000000ULL static inline int event_wait(event_t *e, unsigned int timeout) { struct timespec wait; int ret; clock_gettime(CLOCK_MONOTONIC, &wait); wait.tv_sec = wait.tv_sec + timeout / 1000; wait.tv_nsec = wait.tv_nsec + (timeout % 1000) * 1000000; if (wait.tv_nsec > ONE_SEC_IN_NSEC) { wait.tv_sec++; wait.tv_nsec -= ONE_SEC_IN_NSEC; } pthread_mutex_lock(&e->mutex); ret = pthread_cond_timedwait(&e->cond, &e->mutex, &wait); pthread_mutex_unlock(&e->mutex); return ret; } static inline uint64_t time_stamp_us(void) { struct timespec t; clock_gettime(CLOCK_MONOTONIC, &t); return (t.tv_sec * ONE_SEC_IN_NSEC + t.tv_nsec) / 1000; } #define time_stamp_ms() (time_stamp_us() / (uint64_t) 1000) #define time_stamp_sec() (time_stamp_ms() / (uint64_t) 1000) #define time_stamp_min() (time_stamp_sec() / (uint64_t) 60) #endif /* OSD_H */ rdma-core-56.1/ibacm/man/000077500000000000000000000000001477342711600151605ustar00rootroot00000000000000rdma-core-56.1/ibacm/man/ib_acme.1000066400000000000000000000101631477342711600166220ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "ib_acme" 1 "2014-06-16" "ib_acme" "ib_acme" ib_acme .SH NAME ib_acme \- test and configuration utility for the IB ACM .SH SYNOPSIS .sp .nf \fIib_acme\fR [-f addr_format] [-s src_addr] -d dest_addr [-v] [-c] [-e] [-P] [-S svc_addr] [-C repetitions] .fi .nf \fIib_acme\fR [-A [addr_file]] [-O [opt_file]] [-D dest_dir] [-V] .fi .SH "DESCRIPTION" ib_acme provides assistance configuring and testing the ibacm service. The first usage of the service will test that the ibacm is running and operating correctly. The second usage model will automatically create address and configuration files for the ibacm service. .SH "OPTIONS" .TP \-f addr_format Specifies the format of the src_addr and dest_addr parameters. Valid address formats are: 'i' ip address, 'n' host name, 'l' lid, 'g' gid, and 'u' unspecified. If the -f option is omitted, an unspecified address format is assumed. ib_acme will use getaddrinfo or other mechanisms to determine which format the address uses. 
.TP \-s src_addr Specifies the local source address of the path to resolve. The source address can be an IP address, system network name, or LID, as indicated by the addr_format option. .TP \-d dest_addr Specifies the destination address of the path to resolve. The destination address can be an IP address, system network name, or LID, as indicated by the addr_format option. .TP \-v Indicates that the resolved path information should be verified with the active IB SA. Use of the -v option provides a sanity check that resolved path information is usable given the current cluster configuration. .TP \-c Instructs the ACM service to only return information that currently resides in its local cache. .TP \-e [N] Displays one (N = 1, 2, ...) or all endpoints (N = 0 or not present). .TP \-P [opt] Queries performance data from the destination service. Valid options are: "col" for outputting combined data in column format, "N" (N = 1, 2, ...) for outputting data for a specific endpoint N, "all" for outputting data for all endpoints, and "s" for outputting data for a specific endpoint with the address given by the -s option. .TP \-S svc_addr Hostname, IPv4-address or Unix-domain socket of the ACM service, default: /run/ibacm.sock .TP \-C repetitions Number of times to repeat the resolution. Used to measure the performance of ACM cache lookups. Defaults to 1. .TP \-A [addr_file] With this option, the ib_acme utility automatically generates the address configuration file ibacm_addr.cfg. The generated file is constructed using the system host name. .TP \-O [opt_file] With this option, the ib_acme utility automatically generates the option configuration file ibacm_opts.cfg. This file is currently generated using static information. .TP \-D dest_dir Specify the destination directory for the output files. .TP \-V Enables verbose output. When combined with the -A or -O options, ib_acme will display additional details, such as the generated address information saved to the ibacm_addr.cfg file. .SH "NOTES" The ib_acme utility performs two main functions. With the -A and -O options, it automatically generates address or options configuration files. The generated files are text-based and may be edited. These options are intended to provide a simple way to configure address and option information on all nodes in a cluster. .P The other function of the ib_acme utility is to test the ibacm service, including helping to verify that the service is usable given the current cluster configuration. The ib_acme utility can resolve IP addresses, network names, or IB LIDs into a path record. It can then compare that path record against one obtained from the SA. When used to test the ibacm service, the ib_acme utility has the side effect of loading the ibacm caches. .P Multiple numerical destinations can be specified by adding brackets [] to the end of a base destination name or address. Users may specify a list of numerical ranges inside the brackets using the following example as a guide: node[1-3,5,7-8]. This will result in testing node1, node2, node3, node5, node7, and node8.
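.P
For example, a single invocation such as the following (the host names are
illustrative) resolves node1, node2, node3, and node5 in one run:
.P
.nf
    ib_acme \-f n \-d node[1\-3,5]
.fi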
.SH "SEE ALSO" ibacm(7), ibacm(8) rdma-core-56.1/ibacm/man/ibacm.7000066400000000000000000000024761477342711600163340ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "IBACM" 7 "2014-06-16" "IBACM" "IB ACM User Guide" IBACM .SH NAME ibacm \- InfiniBand communication management assistant .SH SYNOPSIS .B "#include " .SH "DESCRIPTION" Used to resolve remote endpoint information before establishing communications over InfiniBand. .SH "NOTES" Th IB ACM provides scalable address and route resolution services over InfiniBand. It resolves system network names and IP addresses to InfiniBand path record data using efficient mechanisms, including caching of data. .P The IB ACM provides information needed to establish a connection, but does not implement the communication management protocol. It provides services similar to rdma_getaddrinfo, rdma_resolve_addr, and rdma_resolve_route using IB multicast. The IB ACM does not require IPoIB or use standard naming services, such as DNS, and limits network communication, especially with the IB SA. The ib_acme utility assists in verifying what options of the ibacm service may be usable for the current fabric topology. .P Client interactions with the ibacm service are done over sockets through a standard TCP connection. The librdmacm abstracts this interaction. .SH "RETURN CODES" .IP "== 0" success .IP "!= 0" error .SH "SEE ALSO" ib_acme(1), ibacm(8) rdma-core-56.1/ibacm/man/ibacm.8000066400000000000000000000175011477342711600163300ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "ibacm" 8 "2014-06-16" "ibacm" "ibacm" ibacm .SH NAME ibacm \- address and route resolution services for InfiniBand. .SH SYNOPSIS .sp .nf \fIibacm\fR [-D] [-P] [-A addr_file] [-O option_file] .fi .SH "DESCRIPTION" The IB ACM implements and provides a framework for name, address, and route (path) resolution services over InfiniBand. It is intended to address connection setup scalability issues running MPI applications on large clusters. The IB ACM provides information needed to establish a connection, but does not implement the CM protocol. .P A primary user of the ibacm service is the librdmacm library. This enables applications to make use of the ibacm service without code changes or needing to be aware that the service is in use. librdmacm versions 1.0.12 - 1.0.15 can invoke IB ACM services when built using the --with-ib_acm option. Version 1.0.16 and newer of librdmacm will automatically use the IB ACM if it is installed. The IB ACM services tie in under the rdma_resolve_addr, rdma_resolve_route, and rdma_getaddrinfo routines. For maximum benefit, the rdma_getaddrinfo routine should be used, however existing applications should still see significant connection scaling benefits using the calls available in librdmacm 1.0.11 and previous releases. .P The IB ACM is focused on being scalable, efficient, and extensible. It implements a plugin architecture that allows a vendor to supply its proprietary provider in addition to the default provider. The current default provider implementation ibacmp limits network traffic, SA interactions, and centralized services. Ibacmp supports multiple resolution protocols in order to handle different fabric topologies. .P The IB ACM package is comprised of three components: the ibacm core service, the default provider ibacmp shared library, and a test/configuration utility - ib_acme. 
All three are userspace components and are available for Linux. Additional details are given below. .SH "OPTIONS" .TP \-D run in daemon mode (default) .TP \-P run as standard process .TP \-A addr_file address configuration file .TP \-O option_file option configuration file .TP \--systemd Enable systemd integration. This includes optional socket activation of the daemon's listening socket. .SH "QUICK START GUIDE" 1. Prerequisites: libibverbs and libibumad must be installed. The IB stack should be running with IPoIB configured. These steps assume that the user has administrative privileges. .P 2. Install the IB ACM package. This installs ibacm, ibacmp, ib_acme, and init.d scripts. .P 3. Run 'ibacm' as administrator to start the ibacm daemon. .P 4. Optionally, run 'ib_acme -d -v' to verify that the ibacm service is running. .P 5. Install librdmacm, using the build option --with-ib_acm if needed. This build option is not needed with librdmacm 1.0.17 or newer. The librdmacm will automatically use the ibacm service. On failures, the librdmacm will fall back to normal resolution. .P 6. You can use ib_acme -P to gather performance statistics from the local ibacm daemon to see if the service is working correctly. Similarly, the command ib_acme -e can be used to enumerate all endpoints created by the local ibacm service. .SH "NOTES" ib_acme: .P The ib_acme program serves a dual role. It acts as a utility to test ibacm operation and help verify if the ibacm service and selected protocol is usable for a given cluster configuration. Additionally, it automatically generates ibacm configuration files to assist with or eliminate manual setup. .P ibacm configuration files: .P The ibacm service relies on two configuration files. .P The ibacm_addr.cfg file contains name and address mappings for each IB endpoint. Although the names in the ibacm_addr.cfg file can be anything, ib_acme maps the host name to the IB endpoints. IP addresses, on the other hand, are assigned dynamically. If the address file cannot be found, the ibacm service will attempt to create one using default values. .P The ibacm_opts.cfg file provides a set of configurable options for the ibacm core service and default provider, such as timeout, number of retries, logging level, etc. ib_acme generates the ibacm_opts.cfg file using static information. If an option file cannot be found, ibacm will use default values. .P ibacm: .P The ibacm service is responsible for resolving names and addresses to InfiniBand path information and caching such data. It should execute with administrative privileges. .P The ibacm service implements a client interface over TCP sockets, which is abstracted by the librdmacm library. One or more providers can be loaded by the core service, depending on the configuration. In the default provider ibacmp, one or more back-end protocols are used to satisfy user requests. Although ibacmp supports standard SA path record queries on the back-end, it also supports a resolution protocol based on multicast traffic. The latter is not usable on all fabric topologies, specifically ones that may not have reversible paths or fabrics using torus routing. Users should use the ib_acme utility to verify that the multicast protocol is usable before running other applications. .P Conceptually, the default provider ibacmp implements an ARP-like protocol and either uses IB multicast records to construct path record data or queries the SA directly, depending on the selected route protocol.
By default, the ibacmp provider uses and caches SA path record queries. .P Specifically, all IB endpoints join a number of multicast groups. Multicast groups differ based on rates, mtu, sl, etc., and are prioritized. All participating endpoints must be able to communicate on the lowest priority multicast group. The ibacmp assigns one or more names/addresses to each IB endpoint using the ibacm_addr.cfg file. Clients provide source and destination names or addresses as input to the service, and receive as output path record data. .P The service maps a client's source name/address to a local IB endpoint. If the destination name/address is not cached locally in the default provider, it sends a multicast request out on the lowest priority multicast group on the local endpoint. The request carries a list of multicast groups that the sender can use. The recipient of the request selects the highest priority multicast group that it can use as well and returns that information directly to the sender. The request data is cached by all endpoints that receive the multicast request message. The source endpoint also caches the response and uses the multicast group that was selected to construct or obtain path record data, which is returned to the client. .P The current implementation of the provider ibacmp has several additional restrictions: .P - The ibacmp is limited in its handling of dynamic changes. ibacm must be stopped and restarted if a cluster is reconfigured. .P - Support for IPv6 has not been verified. .P - The number of multicast groups that an endpoint can support is limited to 2. .P The ibacmp contains several internal caches. These include caches for GID and LID destination addresses. These caches can be optionally preloaded. ibacm supports the OpenSM dump_pr plugin "full" PathRecord format, which is used to preload these caches. The file format is specified in the ibacm_opts.cfg file via the route_preload setting, which should be set to full_opensm_v1 for this file format. The default format is none, which does not preload these caches. See dump_pr.notes.txt in dump_pr for more information on the full_opensm_v1 file format and how to configure OpenSM to generate this file. .P Additionally, the name, IPv4, and IPv6 caches can be preloaded by using the addr_preload option. The default is none, which does not preload these caches. To preload these caches, set this option to acm_hosts and configure the addr_data_file appropriately. .SH "SEE ALSO" ibacm(7), ib_acme(1), rdma_cm(7) rdma-core-56.1/ibacm/man/ibacm_prov.7.in000066400000000000000000000072431477342711600200040ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "IBACM_PROV" 7 "2014-06-16" "IBACM_PROV" "IB ACM Provider Guide" IBACM_PROV .SH NAME ibacm_prov \- InfiniBand communication management assistant provider interface .SH SYNOPSIS .B "#include " .SH "DESCRIPTION" The ibacm provider interface is a plugin interface that allows a vendor to implement proprietary solutions to support scalable address and route resolution services over InfiniBand. .P To add a provider to the ibacm core service, the provider must .TP 1. be implemented as a shared library; .TP 2. be installed under a configured directory, e.g., @ACM_PROVIDER_DIR@; .TP 3. export a function provider_query() that returns a pointer to its provider info and version info.
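.P
As an illustration, a provider might satisfy point 3 with a definition along
the following lines (a minimal sketch: the provider name "myprov" is
hypothetical and most callbacks are omitted; the formal prototype and
structure layout are shown below):
.P
.nf
static struct acm_provider my_prov = {
	.size = sizeof(struct acm_provider),
	.version = ACM_PROV_VERSION,
	.name = "myprov",
	/* open_device(), resolve(), and the other callbacks go here */
};

int provider_query(struct acm_provider **info, uint32_t *version)
{
	*info = &my_prov;
	*version = my_prov.version;
	return 0;
}
.fi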
.P The prototype of the provider_query function is defined below: .P .nf int provider_query(struct acm_provider **info, uint32_t *version); .fi .P This function should return a pointer to its provider structure: .P .nf struct acm_provider { size_t size; uint32_t version; char *name; int (*open_device)(const struct acm_device *device, void **dev_context); void (*close_device)(void *dev_context); int (*open_port)(const struct acm_port *port, void *dev_context, void **port_context); void (*close_port)(void *port_context); int (*open_endpoint)(const struct acm_endpoint *endpoint, void *port_context, void **ep_context); void (*close_endpoint)(void *ep_context); int (*add_address)(const struct acm_address *addr, void *ep_context, void **addr_context); void (*remove_address)(void *addr_context); int (*resolve)(void *addr_context, struct acm_msg *msg, uint64_t id); int (*query)(void *addr_context, struct acm_msg *msg, uint64_t id); int (*handle_event)(void *port_context, enum ibv_event_type type); void (*query_perf)(void *ep_context, uint64_t *values, uint8_t *cnt); }; .fi .P The size and version fields provide a way to detect version compatibility. When a port is assigned to the provider, the ibacm core will call the open/add_address functions; similarly, when a port is down or re-assigned to another provider, the close/remove_address functions will be invoked to release resources. The ibacm core will centralize the management of events for each device, and events not handled by the ibacm core will be forwarded to the relevant port through the handle_event() function. The resolve() function will be called to resolve a destination name into a path record. The performance of the provider for each endpoint can be queried by calling query_perf(). .P To share a configuration file, the path of the ibacm configuration file is exported through the variable opts_file. Each loaded provider can open this configuration file and parse the contents related to its own operation. Non-related sections should be ignored. .P Some helper functions are also exported by the ibacm core. For example, the acm_log define (or the acm_write() function) can be used to log messages into ibacm's log file (default @CMAKE_INSTALL_FULL_LOCALSTATEDIR@/log/ibacm.log). For details, refer to the acm_prov.h file. .SH "NOTES" A provider should always set the version in its provider info structure to the value of the define ACM_PROV_VERSION at the time the provider is implemented. Never set the version to ACM_PROV_VERSION itself, as the define may be changed over time when the provider interface is changed, unless the provider itself is placed in the ibacm source tree. This is to avoid the version problem when an old provider implementation is built against a new acm_prov.h file. The ibacm core will always check the version of the provider at loading time. .SH "SEE ALSO" ib_acme(1), ibacm(7), ibacm(8) rdma-core-56.1/ibacm/prov/000077500000000000000000000000001477342711600153735ustar00rootroot00000000000000rdma-core-56.1/ibacm/prov/acmp/000077500000000000000000000000001477342711600163135ustar00rootroot00000000000000rdma-core-56.1/ibacm/prov/acmp/src/000077500000000000000000000000001477342711600171025ustar00rootroot00000000000000rdma-core-56.1/ibacm/prov/acmp/src/acmp.c000066400000000000000000002331761477342711600202020ustar00rootroot00000000000000/* * Copyright (c) 2009-2014 Intel Corporation. All rights reserved. * Copyright (c) 2013 Mellanox Technologies LTD. All rights reserved.
 * * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "acm_util.h" #include "acm_mad.h" #define IB_LID_MCAST_START 0xc000 #define MAX_EP_ADDR 4 #define MAX_EP_MC 2 enum acmp_state { ACMP_INIT, ACMP_QUERY_ADDR, ACMP_ADDR_RESOLVED, ACMP_QUERY_ROUTE, ACMP_READY }; enum acmp_addr_prot { ACMP_ADDR_PROT_ACM }; enum acmp_route_prot { ACMP_ROUTE_PROT_ACM, ACMP_ROUTE_PROT_SA }; enum acmp_loopback_prot { ACMP_LOOPBACK_PROT_NONE, ACMP_LOOPBACK_PROT_LOCAL }; enum acmp_route_preload { ACMP_ROUTE_PRELOAD_NONE, ACMP_ROUTE_PRELOAD_OSM_FULL_V1 }; enum acmp_addr_preload { ACMP_ADDR_PRELOAD_NONE, ACMP_ADDR_PRELOAD_HOSTS }; /* * Nested locking order: dest -> ep, dest -> port */ struct acmp_ep; struct acmp_dest { uint8_t address[ACM_MAX_ADDRESS]; /* keep first */ char name[ACM_MAX_ADDRESS]; struct ibv_ah *ah; struct ibv_ah_attr av; struct ibv_path_record path; union ibv_gid mgid; __be64 req_id; struct list_head req_queue; uint32_t remote_qpn; pthread_mutex_t lock; enum acmp_state state; atomic_t refcnt; uint64_t addr_timeout; uint64_t route_timeout; uint8_t addr_type; struct acmp_ep *ep; }; struct acmp_device; struct acmp_port { struct acmp_device *dev; const struct acm_port *port; struct list_head ep_list; pthread_mutex_t lock; struct acmp_dest sa_dest; enum ibv_port_state state; enum ibv_mtu mtu; enum ibv_rate rate; int subnet_timeout; uint16_t default_pkey_ix; uint16_t lid; uint16_t lid_mask; uint8_t port_num; }; struct acmp_device { struct ibv_context *verbs; const struct acm_device *device; struct ibv_comp_channel *channel; struct ibv_pd *pd; __be64 guid; struct list_node entry; pthread_t comp_thread_id; int port_cnt; struct acmp_port port[0]; }; /* Maintain separate virtual send queues to avoid deadlock */ struct acmp_send_queue { int credits; struct list_head pending; }; struct acmp_addr { uint16_t type; union acm_ep_info info; struct acm_address addr; struct acmp_ep *ep; }; struct acmp_addr_ctx { struct acmp_ep *ep; int addr_inx; }; struct acmp_ep { struct acmp_port *port; struct ibv_cq *cq; struct ibv_qp *qp; struct ibv_mr *mr; uint8_t *recv_bufs; struct list_node entry; char id_string[IBV_SYSFS_NAME_MAX + 11]; void *dest_map[ACM_ADDRESS_RESERVED - 1]; struct acmp_dest mc_dest[MAX_EP_MC]; int mc_cnt; uint16_t pkey_index; uint16_t pkey; const
struct acm_endpoint *endpoint; pthread_mutex_t lock; struct acmp_send_queue resolve_queue; struct acmp_send_queue resp_queue; struct list_head active_queue; struct list_head wait_queue; enum acmp_state state; /* This lock protects nmbr_ep_addrs and addr_info */ pthread_rwlock_t rwlock; int nmbr_ep_addrs; struct acmp_addr *addr_info; atomic_t counters[ACM_MAX_COUNTER]; }; struct acmp_send_msg { struct list_node entry; struct acmp_ep *ep; struct acmp_dest *dest; struct ibv_ah *ah; void *context; void (*resp_handler)(struct acmp_send_msg *req, struct ibv_wc *wc, struct acm_mad *resp); struct acmp_send_queue *req_queue; struct ibv_mr *mr; struct ibv_send_wr wr; struct ibv_sge sge; uint64_t expires; int tries; uint8_t data[ACM_SEND_SIZE]; }; struct acmp_request { uint64_t id; struct list_node entry; struct acm_msg msg; struct acmp_ep *ep; }; static int acmp_open_dev(const struct acm_device *device, void **dev_context); static void acmp_close_dev(void *dev_context); static int acmp_open_port(const struct acm_port *port, void *dev_context, void **port_context); static void acmp_close_port(void *port_context); static int acmp_open_endpoint(const struct acm_endpoint *endpoint, void *port_context, void **ep_context); static void acmp_close_endpoint(void *ep_context); static int acmp_add_addr(const struct acm_address *addr, void *ep_context, void **addr_context); static void acmp_remove_addr(void *addr_context); static int acmp_resolve(void *addr_context, struct acm_msg *msg, uint64_t id); static int acmp_query(void *addr_context, struct acm_msg *msg, uint64_t id); static int acmp_handle_event(void *port_context, enum ibv_event_type type); static void acmp_query_perf(void *ep_context, uint64_t *values, uint8_t *cnt); static struct acm_provider def_prov = { .size = sizeof(struct acm_provider), .version = ACM_PROV_VERSION, .name = "ibacmp", .open_device = acmp_open_dev, .close_device = acmp_close_dev, .open_port = acmp_open_port, .close_port = acmp_close_port, .open_endpoint = acmp_open_endpoint, .close_endpoint = acmp_close_endpoint, .add_address = acmp_add_addr, .remove_address = acmp_remove_addr, .resolve = acmp_resolve, .query = acmp_query, .handle_event = acmp_handle_event, .query_perf = acmp_query_perf, }; static LIST_HEAD(acmp_dev_list); static pthread_mutex_t acmp_dev_lock; static atomic_t g_tid; static LIST_HEAD(timeout_list); static event_t timeout_event; static atomic_t wait_cnt; static pthread_t retry_thread_id; static int retry_thread_started = 0; static __thread char log_data[ACM_MAX_ADDRESS]; /* * Service options - may be set through ibacm_opts.cfg file. 
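 * (addr_timeout and route_timeout are expressed in minutes; see
 * acmp_acquire_dest() below for how cached address records are aged out.)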
*/ static char route_data_file[128] = ACM_CONF_DIR "/ibacm_route.data"; static char addr_data_file[128] = ACM_CONF_DIR "/ibacm_hosts.data"; static enum acmp_addr_prot addr_prot = ACMP_ADDR_PROT_ACM; static int addr_timeout = 1440; static enum acmp_route_prot route_prot = ACMP_ROUTE_PROT_SA; static int route_timeout = -1; static enum acmp_loopback_prot loopback_prot = ACMP_LOOPBACK_PROT_LOCAL; static int timeout = 2000; static int retries = 2; static int resolve_depth = 1; static int send_depth = 1; static int recv_depth = 1024; static uint8_t min_mtu = IBV_MTU_2048; static uint8_t min_rate = IBV_RATE_10_GBPS; static enum acmp_route_preload route_preload; static enum acmp_addr_preload addr_preload; static int acmp_initialized = 0; static int acmp_compare_dest(const void *dest1, const void *dest2) { return memcmp(dest1, dest2, ACM_MAX_ADDRESS); } static void acmp_set_dest_addr(struct acmp_dest *dest, uint8_t addr_type, const uint8_t *addr, size_t size) { memcpy(dest->address, addr, size); dest->addr_type = addr_type; acm_format_name(0, dest->name, sizeof dest->name, addr_type, addr, size); } static void acmp_init_dest(struct acmp_dest *dest, uint8_t addr_type, const uint8_t *addr, size_t size) { list_head_init(&dest->req_queue); atomic_init(&dest->refcnt); atomic_set(&dest->refcnt, 1); pthread_mutex_init(&dest->lock, NULL); if (size) acmp_set_dest_addr(dest, addr_type, addr, size); dest->state = ACMP_INIT; } static struct acmp_dest * acmp_alloc_dest(uint8_t addr_type, const uint8_t *addr) { struct acmp_dest *dest; dest = calloc(1, sizeof *dest); if (!dest) { acm_log(0, "ERROR - unable to allocate dest\n"); return NULL; } acmp_init_dest(dest, addr_type, addr, ACM_MAX_ADDRESS); acm_log(1, "%s\n", dest->name); return dest; } /* Caller must hold ep lock. */ static struct acmp_dest * acmp_get_dest(struct acmp_ep *ep, uint8_t addr_type, const uint8_t *addr) { struct acmp_dest *dest, **tdest; tdest = tfind(addr, &ep->dest_map[addr_type - 1], acmp_compare_dest); if (tdest) { dest = *tdest; (void) atomic_inc(&dest->refcnt); acm_log(2, "%s\n", dest->name); } else { dest = NULL; acm_format_name(2, log_data, sizeof log_data, addr_type, addr, ACM_MAX_ADDRESS); acm_log(2, "%s not found\n", log_data); } return dest; } static void acmp_put_dest(struct acmp_dest *dest) { acm_log(2, "%s\n", dest->name); if (atomic_dec(&dest->refcnt) == 0) { free(dest); } } /* Caller must hold ep lock. 
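 * Removes the dest from the per-address-type search tree and drops the
 * tree's reference; the dest itself is freed by acmp_put_dest() once the
 * last outstanding reference is released.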
*/ static void acmp_remove_dest(struct acmp_ep *ep, struct acmp_dest *dest) { acm_log(2, "%s\n", dest->name); if (!tdelete(dest->address, &ep->dest_map[dest->addr_type - 1], acmp_compare_dest)) acm_log(0, "ERROR: %s not found!!\n", dest->name); acmp_put_dest(dest); } static struct acmp_dest * acmp_acquire_dest(struct acmp_ep *ep, uint8_t addr_type, const uint8_t *addr) { struct acmp_dest *dest; int64_t rec_expr_minutes; acm_format_name(2, log_data, sizeof log_data, addr_type, addr, ACM_MAX_ADDRESS); acm_log(2, "%s\n", log_data); pthread_mutex_lock(&ep->lock); dest = acmp_get_dest(ep, addr_type, addr); if (dest && dest->state == ACMP_READY && dest->addr_timeout != (uint64_t)~0ULL) { rec_expr_minutes = dest->addr_timeout - time_stamp_min(); if (rec_expr_minutes <= 0) { acm_log(2, "Record expired\n"); acmp_remove_dest(ep, dest); dest = NULL; } else { acm_log(2, "Record valid for the next %" PRId64 " minute(s)\n", rec_expr_minutes); } } if (!dest) { dest = acmp_alloc_dest(addr_type, addr); if (dest) { dest->ep = ep; tsearch(dest, &ep->dest_map[addr_type - 1], acmp_compare_dest); (void) atomic_inc(&dest->refcnt); } } pthread_mutex_unlock(&ep->lock); return dest; } static struct acmp_request *acmp_alloc_req(uint64_t id, struct acm_msg *msg) { struct acmp_request *req; req = calloc(1, sizeof *req); if (!req) { acm_log(0, "ERROR - unable to alloc client request\n"); return NULL; } req->id = id; memcpy(&req->msg, msg, sizeof(req->msg)); acm_log(2, "id %" PRIu64 ", req %p\n", id, req); return req; } static void acmp_free_req(struct acmp_request *req) { acm_log(2, "%p\n", req); free(req); } static struct acmp_send_msg * acmp_alloc_send(struct acmp_ep *ep, struct acmp_dest *dest, size_t size) { struct acmp_send_msg *msg; msg = (struct acmp_send_msg *) calloc(1, sizeof *msg); if (!msg) { acm_log(0, "ERROR - unable to allocate send buffer\n"); return NULL; } msg->ep = ep; msg->mr = ibv_reg_mr(ep->port->dev->pd, msg->data, size, 0); if (!msg->mr) { acm_log(0, "ERROR - failed to register send buffer\n"); goto err1; } if (!dest->ah) { msg->ah = ibv_create_ah(ep->port->dev->pd, &dest->av); if (!msg->ah) { acm_log(0, "ERROR - unable to create ah\n"); goto err2; } msg->wr.wr.ud.ah = msg->ah; } else { msg->wr.wr.ud.ah = dest->ah; } acm_log(2, "get dest %s\n", dest->name); (void) atomic_inc(&dest->refcnt); msg->dest = dest; msg->wr.next = NULL; msg->wr.sg_list = &msg->sge; msg->wr.num_sge = 1; msg->wr.opcode = IBV_WR_SEND; msg->wr.send_flags = IBV_SEND_SIGNALED; msg->wr.wr_id = (uintptr_t) msg; msg->wr.wr.ud.remote_qpn = dest->remote_qpn; msg->wr.wr.ud.remote_qkey = ACM_QKEY; msg->sge.length = size; msg->sge.lkey = msg->mr->lkey; msg->sge.addr = (uintptr_t) msg->data; acm_log(2, "%p\n", msg); return msg; err2: ibv_dereg_mr(msg->mr); err1: free(msg); return NULL; } static void acmp_init_send_req(struct acmp_send_msg *msg, void *context, void (*resp_handler)(struct acmp_send_msg *req, struct ibv_wc *wc, struct acm_mad *resp)) { acm_log(2, "%p\n", msg); msg->tries = retries + 1; msg->context = context; msg->resp_handler = resp_handler; } static void acmp_free_send(struct acmp_send_msg *msg) { acm_log(2, "%p\n", msg); if (msg->ah) ibv_destroy_ah(msg->ah); ibv_dereg_mr(msg->mr); acmp_put_dest(msg->dest); free(msg); } static void acmp_post_send(struct acmp_send_queue *queue, struct acmp_send_msg *msg) { struct acmp_ep *ep = msg->ep; struct ibv_send_wr *bad_wr; msg->req_queue = queue; pthread_mutex_lock(&ep->lock); if (queue->credits) { acm_log(2, "posting send to QP\n"); queue->credits--; 
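		/* A send credit is available: track the message on the active
		 * queue and post it to the QP. The credit is recycled through
		 * acmp_send_available() once the send completes. */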
list_add_tail(&ep->active_queue, &msg->entry); ibv_post_send(ep->qp, &msg->wr, &bad_wr); } else { acm_log(2, "no sends available, queuing message\n"); list_add_tail(&queue->pending, &msg->entry); } pthread_mutex_unlock(&ep->lock); } static void acmp_post_recv(struct acmp_ep *ep, uint64_t address) { struct ibv_recv_wr wr, *bad_wr; struct ibv_sge sge; wr.next = NULL; wr.sg_list = &sge; wr.num_sge = 1; wr.wr_id = address; sge.length = ACM_RECV_SIZE; sge.lkey = ep->mr->lkey; sge.addr = address; ibv_post_recv(ep->qp, &wr, &bad_wr); } /* Caller must hold ep lock */ static void acmp_send_available(struct acmp_ep *ep, struct acmp_send_queue *queue) { struct acmp_send_msg *msg; struct ibv_send_wr *bad_wr; msg = list_pop(&queue->pending, struct acmp_send_msg, entry); if (msg) { acm_log(2, "posting queued send message\n"); list_add_tail(&ep->active_queue, &msg->entry); ibv_post_send(ep->qp, &msg->wr, &bad_wr); } else { queue->credits++; } } static void acmp_complete_send(struct acmp_send_msg *msg) { struct acmp_ep *ep = msg->ep; pthread_mutex_lock(&ep->lock); list_del(&msg->entry); if (msg->tries) { acm_log(2, "waiting for response\n"); msg->expires = time_stamp_ms() + ep->port->subnet_timeout + timeout; list_add_tail(&ep->wait_queue, &msg->entry); if (atomic_inc(&wait_cnt) == 1) event_signal(&timeout_event); } else { acm_log(2, "freeing\n"); acmp_send_available(ep, msg->req_queue); acmp_free_send(msg); } pthread_mutex_unlock(&ep->lock); } static struct acmp_send_msg *acmp_get_request(struct acmp_ep *ep, __be64 tid, int *free) { struct acmp_send_msg *msg, *next, *req = NULL; struct acm_mad *mad; acm_log(2, "\n"); pthread_mutex_lock(&ep->lock); list_for_each_safe(&ep->wait_queue, msg, next, entry) { mad = (struct acm_mad *) msg->data; if (mad->tid == tid) { acm_log(2, "match found in wait queue\n"); req = msg; list_del(&msg->entry); (void) atomic_dec(&wait_cnt); acmp_send_available(ep, msg->req_queue); *free = 1; goto unlock; } } list_for_each(&ep->active_queue, msg, entry) { mad = (struct acm_mad *) msg->data; if (mad->tid == tid && msg->tries) { acm_log(2, "match found in active queue\n"); req = msg; req->tries = 0; *free = 0; break; } } unlock: pthread_mutex_unlock(&ep->lock); return req; } static int acmp_mc_index(struct acmp_ep *ep, union ibv_gid *gid) { int i; for (i = 0; i < ep->mc_cnt; i++) { if (!memcmp(&ep->mc_dest[i].address, gid, sizeof(*gid))) return i; } return -1; } /* Multicast groups are ordered lowest to highest preference. 
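 * acmp_best_mc_index() therefore scans the peer's candidate GID list from
 * the end, returning the first (most preferred) group that this endpoint
 * has also joined.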
*/ static int acmp_best_mc_index(struct acmp_ep *ep, struct acm_resolve_rec *rec) { int i, index; for (i = min_t(int, rec->gid_cnt, ACM_MAX_GID_COUNT) - 1; i >= 0; i--) { index = acmp_mc_index(ep, &rec->gid[i]); if (index >= 0) { return index; } } return -1; } static void acmp_record_mc_av(struct acmp_port *port, struct ib_mc_member_rec *mc_rec, struct acmp_dest *dest) { uint32_t sl_flow_hop; sl_flow_hop = be32toh(mc_rec->sl_flow_hop); dest->av.dlid = be16toh(mc_rec->mlid); dest->av.sl = (uint8_t) (sl_flow_hop >> 28); dest->av.src_path_bits = port->sa_dest.av.src_path_bits; dest->av.static_rate = mc_rec->rate & 0x3F; dest->av.port_num = port->port_num; dest->av.is_global = 1; dest->av.grh.dgid = mc_rec->mgid; dest->av.grh.flow_label = (sl_flow_hop >> 8) & 0xFFFFF; dest->av.grh.sgid_index = acm_gid_index((struct acm_port *) port->port, &mc_rec->port_gid); dest->av.grh.hop_limit = (uint8_t) sl_flow_hop; dest->av.grh.traffic_class = mc_rec->tclass; dest->path.dgid = mc_rec->mgid; dest->path.sgid = mc_rec->port_gid; dest->path.dlid = mc_rec->mlid; dest->path.slid = htobe16(port->lid | port->sa_dest.av.src_path_bits); dest->path.flowlabel_hoplimit = htobe32(sl_flow_hop & 0xFFFFFFF); dest->path.tclass = mc_rec->tclass; dest->path.reversible_numpath = IBV_PATH_RECORD_REVERSIBLE | 1; dest->path.pkey = mc_rec->pkey; dest->path.qosclass_sl = htobe16((uint16_t) (sl_flow_hop >> 28)); dest->path.mtu = mc_rec->mtu; dest->path.rate = mc_rec->rate; dest->path.packetlifetime = mc_rec->packet_lifetime; } /* Always send the GRH to transfer GID data to remote side */ static void acmp_init_path_av(struct acmp_port *port, struct acmp_dest *dest) { uint32_t flow_hop; dest->av.dlid = be16toh(dest->path.dlid); dest->av.sl = be16toh(dest->path.qosclass_sl) & 0xF; dest->av.src_path_bits = be16toh(dest->path.slid) & 0x7F; dest->av.static_rate = dest->path.rate & 0x3F; dest->av.port_num = port->port_num; flow_hop = be32toh(dest->path.flowlabel_hoplimit); dest->av.is_global = 1; dest->av.grh.flow_label = (flow_hop >> 8) & 0xFFFFF; pthread_mutex_lock(&port->lock); if (port->port) dest->av.grh.sgid_index = acm_gid_index( (struct acm_port *) port->port, &dest->path.sgid); else dest->av.grh.sgid_index = 0; pthread_mutex_unlock(&port->lock); dest->av.grh.hop_limit = (uint8_t) flow_hop; dest->av.grh.traffic_class = dest->path.tclass; } static void acmp_process_join_resp(struct acm_sa_mad *sa_mad) { struct acmp_dest *dest; struct ib_mc_member_rec *mc_rec; struct ib_sa_mad *mad; int index, ret; struct acmp_ep *ep = sa_mad->context; mad = (struct ib_sa_mad *) &sa_mad->sa_mad; acm_log(1, "response status: 0x%x, mad status: 0x%x\n", sa_mad->umad.status, mad->status); pthread_mutex_lock(&ep->lock); if (sa_mad->umad.status) { acm_log(0, "ERROR - send join failed 0x%x\n", sa_mad->umad.status); goto out; } if (mad->status) { acm_log(0, "ERROR - join response status 0x%x\n", mad->status); goto out; } mc_rec = (struct ib_mc_member_rec *) mad->data; index = acmp_mc_index(ep, &mc_rec->mgid); if (index < 0) { acm_log(0, "ERROR - MGID in join response not found\n"); goto out; } dest = &ep->mc_dest[index]; dest->remote_qpn = IB_MC_QPN; dest->mgid = mc_rec->mgid; acmp_record_mc_av(ep->port, mc_rec, dest); if (index == 0) { dest->ah = ibv_create_ah(ep->port->dev->pd, &dest->av); if (!dest->ah) { acm_log(0, "ERROR - unable to create ah\n"); goto out; } ret = ibv_attach_mcast(ep->qp, &dest->mgid, dest->av.dlid); if (ret) { acm_log(0, "ERROR - unable to attach QP to multicast group\n"); ibv_destroy_ah(dest->ah); dest->ah = NULL; goto out; } 
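		/* Joining and attaching to the base (index 0) group succeeded;
		 * all endpoints must support this group, so the endpoint can
		 * now service resolution requests. */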
ep->state = ACMP_READY; } atomic_set(&dest->refcnt, 1); dest->state = ACMP_READY; acm_log(1, "join successful\n"); out: acm_free_sa_mad(sa_mad); pthread_mutex_unlock(&ep->lock); } static uint8_t acmp_record_acm_route(struct acmp_ep *ep, struct acmp_dest *dest) { int i; acm_log(2, "\n"); for (i = 0; i < MAX_EP_MC; i++) { if (!memcmp(&dest->mgid, &ep->mc_dest[i].mgid, sizeof dest->mgid)) break; } if (i == MAX_EP_MC) { acm_log(0, "ERROR - cannot match mgid\n"); return ACM_STATUS_EINVAL; } dest->path = ep->mc_dest[i].path; dest->path.dgid = dest->av.grh.dgid; dest->path.dlid = htobe16(dest->av.dlid); dest->addr_timeout = time_stamp_min() + (unsigned) addr_timeout; dest->route_timeout = time_stamp_min() + (unsigned) route_timeout; dest->state = ACMP_READY; return ACM_STATUS_SUCCESS; } static void acmp_init_path_query(struct ib_sa_mad *mad) { acm_log(2, "\n"); mad->base_version = 1; mad->mgmt_class = IB_MGMT_CLASS_SA; mad->class_version = 2; mad->method = IB_METHOD_GET; mad->tid = htobe64((uint64_t) atomic_inc(&g_tid)); mad->attr_id = IB_SA_ATTR_PATH_REC; } /* Caller must hold dest lock */ static uint8_t acmp_resolve_path_sa(struct acmp_ep *ep, struct acmp_dest *dest, void (*handler)(struct acm_sa_mad *)) { struct ib_sa_mad *mad; uint8_t ret; struct acm_sa_mad *sa_mad; acm_log(2, "%s\n", dest->name); sa_mad = acm_alloc_sa_mad(ep->endpoint, dest, handler); if (!sa_mad) { acm_log(0, "Error - failed to allocate sa_mad\n"); ret = ACM_STATUS_ENOMEM; goto err; } mad = (struct ib_sa_mad *) &sa_mad->sa_mad; acmp_init_path_query(mad); memcpy(mad->data, &dest->path, sizeof(dest->path)); mad->comp_mask = acm_path_comp_mask(&dest->path); acm_increment_counter(ACM_CNTR_ROUTE_QUERY); atomic_inc(&ep->counters[ACM_CNTR_ROUTE_QUERY]); dest->state = ACMP_QUERY_ROUTE; if (acm_send_sa_mad(sa_mad)) { acm_log(0, "Error - Failed to send sa mad\n"); ret = ACM_STATUS_ENODATA; goto free_mad; } return ACM_STATUS_SUCCESS; free_mad: acm_free_sa_mad(sa_mad); err: dest->state = ACMP_INIT; return ret; } static uint8_t acmp_record_acm_addr(struct acmp_ep *ep, struct acmp_dest *dest, struct ibv_wc *wc, struct acm_resolve_rec *rec) { int index; acm_log(2, "%s\n", dest->name); index = acmp_best_mc_index(ep, rec); if (index < 0) { acm_log(0, "ERROR - no shared multicast groups\n"); dest->state = ACMP_INIT; return ACM_STATUS_ENODATA; } acm_log(2, "selecting MC group at index %d\n", index); dest->av = ep->mc_dest[index].av; dest->av.dlid = wc->slid; dest->av.src_path_bits = wc->dlid_path_bits; dest->av.grh.dgid = ((struct ibv_grh *) (uintptr_t) wc->wr_id)->sgid; dest->mgid = ep->mc_dest[index].mgid; dest->path.sgid = ep->mc_dest[index].path.sgid; dest->path.dgid = dest->av.grh.dgid; dest->path.tclass = ep->mc_dest[index].path.tclass; dest->path.pkey = ep->mc_dest[index].path.pkey; dest->remote_qpn = wc->src_qp; dest->state = ACMP_ADDR_RESOLVED; return ACM_STATUS_SUCCESS; } static void acmp_record_path_addr(struct acmp_ep *ep, struct acmp_dest *dest, struct ibv_path_record *path) { acm_log(2, "%s\n", dest->name); dest->path.pkey = htobe16(ep->pkey); dest->path.dgid = path->dgid; if (path->slid) { dest->path.slid = path->slid; } else { dest->path.slid = htobe16(ep->port->lid); } if (!ib_any_gid(&path->sgid)) { dest->path.sgid = path->sgid; } else { dest->path.sgid = ep->mc_dest[0].path.sgid; } dest->path.dlid = path->dlid; dest->state = ACMP_ADDR_RESOLVED; } static uint8_t acmp_validate_addr_req(struct acm_mad *mad) { struct acm_resolve_rec *rec; if (mad->method != IB_METHOD_GET) { acm_log(0, "ERROR - invalid method 0x%x\n", 
mad->method); return ACM_STATUS_EINVAL; } rec = (struct acm_resolve_rec *) mad->data; if (!rec->src_type || rec->src_type >= ACM_ADDRESS_RESERVED) { acm_log(0, "ERROR - unknown src type 0x%x\n", rec->src_type); return ACM_STATUS_EINVAL; } return ACM_STATUS_SUCCESS; } static void acmp_send_addr_resp(struct acmp_ep *ep, struct acmp_dest *dest) { struct acm_resolve_rec *rec; struct acmp_send_msg *msg; struct acm_mad *mad; acm_log(2, "%s\n", dest->name); msg = acmp_alloc_send(ep, dest, sizeof (*mad)); if (!msg) { acm_log(0, "ERROR - failed to allocate message\n"); return; } mad = (struct acm_mad *) msg->data; rec = (struct acm_resolve_rec *) mad->data; mad->base_version = 1; mad->mgmt_class = ACM_MGMT_CLASS; mad->class_version = 1; mad->method = IB_METHOD_GET | IB_METHOD_RESP; mad->status = ACM_STATUS_SUCCESS; mad->control = ACM_CTRL_RESOLVE; mad->tid = dest->req_id; rec->gid_cnt = 1; memcpy(rec->gid, dest->mgid.raw, sizeof(union ibv_gid)); acmp_post_send(&ep->resp_queue, msg); } static int acmp_resolve_response(uint64_t id, struct acm_msg *req_msg, struct acmp_dest *dest, uint8_t status) { struct acm_msg msg; acm_log(2, "client %" PRIu64 ", status 0x%x\n", id, status); memset(&msg, 0, sizeof msg); if (dest) { if (status == ACM_STATUS_ENODATA) atomic_inc(&dest->ep->counters[ACM_CNTR_NODATA]); else if (status) atomic_inc(&dest->ep->counters[ACM_CNTR_ERROR]); } msg.hdr = req_msg->hdr; msg.hdr.status = status; msg.hdr.length = ACM_MSG_HDR_LENGTH; memset(msg.hdr.data, 0, sizeof(msg.hdr.data)); if (status == ACM_STATUS_SUCCESS) { msg.hdr.length += ACM_MSG_EP_LENGTH; msg.resolve_data[0].flags = IBV_PATH_FLAG_GMP | IBV_PATH_FLAG_PRIMARY | IBV_PATH_FLAG_BIDIRECTIONAL; msg.resolve_data[0].type = ACM_EP_INFO_PATH; msg.resolve_data[0].info.path = dest->path; if (req_msg->hdr.src_out) { msg.hdr.length += ACM_MSG_EP_LENGTH; memcpy(&msg.resolve_data[1], &req_msg->resolve_data[req_msg->hdr.src_index], ACM_MSG_EP_LENGTH); } } return acm_resolve_response(id, &msg); } static void acmp_complete_queued_req(struct acmp_dest *dest, uint8_t status) { struct acmp_request *req; acm_log(2, "status %d\n", status); pthread_mutex_lock(&dest->lock); while ((req = list_pop(&dest->req_queue, struct acmp_request, entry))) { pthread_mutex_unlock(&dest->lock); acm_log(2, "completing request, client %" PRIu64 "\n", req->id); acmp_resolve_response(req->id, &req->msg, dest, status); acmp_free_req(req); pthread_mutex_lock(&dest->lock); } pthread_mutex_unlock(&dest->lock); } static void acmp_dest_sa_resp(struct acm_sa_mad *mad) { struct acmp_dest *dest = (struct acmp_dest *) mad->context; struct ib_sa_mad *sa_mad = (struct ib_sa_mad *) &mad->sa_mad; uint8_t status; if (!mad->umad.status) { status = (uint8_t) (be16toh(sa_mad->status) >> 8); } else { status = ACM_STATUS_ETIMEDOUT; } acm_log(2, "%s status=0x%x\n", dest->name, status); pthread_mutex_lock(&dest->lock); if (dest->state != ACMP_QUERY_ROUTE) { acm_log(1, "notice - discarding SA response\n"); pthread_mutex_unlock(&dest->lock); goto out; } if (!status) { memcpy(&dest->path, sa_mad->data, sizeof(dest->path)); acmp_init_path_av(dest->ep->port, dest); dest->addr_timeout = time_stamp_min() + (unsigned) addr_timeout; dest->route_timeout = time_stamp_min() + (unsigned) route_timeout; acm_log(2, "timeout addr %" PRIu64 " route %" PRIu64 "\n", dest->addr_timeout, dest->route_timeout); dest->state = ACMP_READY; } else { dest->state = ACMP_INIT; } pthread_mutex_unlock(&dest->lock); acmp_complete_queued_req(dest, status); out: acm_free_sa_mad(mad); } static void 
acmp_resolve_sa_resp(struct acm_sa_mad *mad) { struct acmp_dest *dest = (struct acmp_dest *) mad->context; int send_resp; acm_log(2, "\n"); acmp_dest_sa_resp(mad); pthread_mutex_lock(&dest->lock); send_resp = (dest->state == ACMP_READY); pthread_mutex_unlock(&dest->lock); if (send_resp) acmp_send_addr_resp(dest->ep, dest); } static struct acmp_addr * acmp_addr_lookup(struct acmp_ep *ep, uint8_t *addr, uint16_t type) { struct acmp_addr *ret = NULL; int i; pthread_rwlock_rdlock(&ep->rwlock); for (i = 0; i < ep->nmbr_ep_addrs; i++) { if (ep->addr_info[i].type != type) continue; if ((type == ACM_ADDRESS_NAME && !strncasecmp((char *) ep->addr_info[i].info.name, (char *) addr, ACM_MAX_ADDRESS)) || !memcmp(ep->addr_info[i].info.addr, addr, ACM_MAX_ADDRESS)) { ret = ep->addr_info + i; break; } } pthread_rwlock_unlock(&ep->rwlock); return ret; } static void acmp_process_addr_req(struct acmp_ep *ep, struct ibv_wc *wc, struct acm_mad *mad) { struct acm_resolve_rec *rec; struct acmp_dest *dest; uint8_t status; struct acmp_addr *addr; acm_log(2, "\n"); if ((status = acmp_validate_addr_req(mad))) { acm_log(0, "ERROR - invalid request\n"); return; } rec = (struct acm_resolve_rec *) mad->data; dest = acmp_acquire_dest(ep, rec->src_type, rec->src); if (!dest) { acm_log(0, "ERROR - unable to add source\n"); return; } addr = acmp_addr_lookup(ep, rec->dest, rec->dest_type); if (addr) dest->req_id = mad->tid; pthread_mutex_lock(&dest->lock); acm_log(2, "dest state %d\n", dest->state); switch (dest->state) { case ACMP_READY: if (dest->remote_qpn == wc->src_qp) break; acm_log(2, "src service has new qp, resetting\n"); /* fall through */ case ACMP_INIT: case ACMP_QUERY_ADDR: status = acmp_record_acm_addr(ep, dest, wc, rec); if (status) break; /* fall through */ case ACMP_ADDR_RESOLVED: if (route_prot == ACMP_ROUTE_PROT_ACM) { status = acmp_record_acm_route(ep, dest); break; } if (addr || !list_empty(&dest->req_queue)) { status = acmp_resolve_path_sa(ep, dest, acmp_resolve_sa_resp); if (status) break; } /* fall through */ default: pthread_mutex_unlock(&dest->lock); acmp_put_dest(dest); return; } pthread_mutex_unlock(&dest->lock); acmp_complete_queued_req(dest, status); if (addr && !status) { acmp_send_addr_resp(ep, dest); } acmp_put_dest(dest); } static void acmp_process_addr_resp(struct acmp_send_msg *msg, struct ibv_wc *wc, struct acm_mad *mad) { struct acm_resolve_rec *resp_rec; struct acmp_dest *dest = (struct acmp_dest *) msg->context; uint8_t status; if (mad) { status = acm_class_status(mad->status); resp_rec = (struct acm_resolve_rec *) mad->data; } else { status = ACM_STATUS_ETIMEDOUT; resp_rec = NULL; } acm_log(2, "resp status 0x%x\n", status); pthread_mutex_lock(&dest->lock); if (dest->state != ACMP_QUERY_ADDR) { pthread_mutex_unlock(&dest->lock); goto put; } if (!status) { status = acmp_record_acm_addr(msg->ep, dest, wc, resp_rec); if (!status) { if (route_prot == ACMP_ROUTE_PROT_ACM) { status = acmp_record_acm_route(msg->ep, dest); } else { status = acmp_resolve_path_sa(msg->ep, dest, acmp_dest_sa_resp); if (!status) { pthread_mutex_unlock(&dest->lock); goto put; } } } } else { dest->state = ACMP_INIT; } pthread_mutex_unlock(&dest->lock); acmp_complete_queued_req(dest, status); put: acmp_put_dest(dest); } static void acmp_process_acm_recv(struct acmp_ep *ep, struct ibv_wc *wc, struct acm_mad *mad) { struct acmp_send_msg *req; struct acm_resolve_rec *rec; int free; acm_log(2, "\n"); if (mad->base_version != 1 || mad->class_version != 1) { acm_log(0, "ERROR - invalid version %d %d\n", mad->base_version, 
mad->class_version); return; } if (mad->control != ACM_CTRL_RESOLVE) { acm_log(0, "ERROR - invalid control 0x%x\n", mad->control); return; } rec = (struct acm_resolve_rec *) mad->data; acm_format_name(2, log_data, sizeof log_data, rec->src_type, rec->src, sizeof rec->src); acm_log(2, "src %s\n", log_data); acm_format_name(2, log_data, sizeof log_data, rec->dest_type, rec->dest, sizeof rec->dest); acm_log(2, "dest %s\n", log_data); if (mad->method & IB_METHOD_RESP) { acm_log(2, "received response\n"); req = acmp_get_request(ep, mad->tid, &free); if (!req) { acm_log(1, "notice - response did not match active request\n"); return; } acm_log(2, "found matching request\n"); req->resp_handler(req, wc, mad); if (free) acmp_free_send(req); } else { acm_log(2, "unsolicited request\n"); acmp_process_addr_req(ep, wc, mad); } } static void acmp_sa_resp(struct acm_sa_mad *mad) { struct acmp_request *req = (struct acmp_request *) mad->context; struct ib_sa_mad *sa_mad = (struct ib_sa_mad *) &mad->sa_mad; req->msg.hdr.opcode |= ACM_OP_ACK; if (!mad->umad.status) { struct acm_ep_addr_data *resolve_data = req->msg.resolve_data; req->msg.hdr.status = (uint8_t) (be16toh(sa_mad->status) >> 8); memcpy(&resolve_data->info.path, sa_mad->data, sizeof(struct ibv_path_record)); } else { req->msg.hdr.status = ACM_STATUS_ETIMEDOUT; } acm_log(2, "status 0x%x\n", req->msg.hdr.status); if (req->msg.hdr.status) atomic_inc(&req->ep->counters[ACM_CNTR_ERROR]); acm_query_response(req->id, &req->msg); acm_free_sa_mad(mad); acmp_free_req(req); } static void acmp_process_sa_recv(struct acmp_ep *ep, struct ibv_wc *wc, struct acm_mad *mad) { struct ib_sa_mad *sa_mad = (struct ib_sa_mad *) mad; struct acmp_send_msg *req; int free; acm_log(2, "\n"); if (mad->base_version != 1 || mad->class_version != 2 || !(mad->method & IB_METHOD_RESP) || sa_mad->attr_id != IB_SA_ATTR_PATH_REC) { acm_log(0, "ERROR - unexpected SA MAD %d %d\n", mad->base_version, mad->class_version); return; } req = acmp_get_request(ep, mad->tid, &free); if (!req) { acm_log(1, "notice - response did not match active request\n"); return; } acm_log(2, "found matching request\n"); req->resp_handler(req, wc, mad); if (free) acmp_free_send(req); } static void acmp_process_recv(struct acmp_ep *ep, struct ibv_wc *wc) { struct acm_mad *mad; acm_log(2, "base endpoint name %s\n", ep->id_string); mad = (struct acm_mad *) (uintptr_t) (wc->wr_id + sizeof(struct ibv_grh)); switch (mad->mgmt_class) { case IB_MGMT_CLASS_SA: acmp_process_sa_recv(ep, wc, mad); break; case ACM_MGMT_CLASS: acmp_process_acm_recv(ep, wc, mad); break; default: acm_log(0, "ERROR - invalid mgmt class 0x%x\n", mad->mgmt_class); break; } acmp_post_recv(ep, wc->wr_id); } static void acmp_process_comp(struct acmp_ep *ep, struct ibv_wc *wc) { if (wc->status) { acm_log(0, "ERROR - work completion error\n" "\topcode %d, completion status %d\n", wc->opcode, wc->status); return; } if (wc->opcode & IBV_WC_RECV) acmp_process_recv(ep, wc); else acmp_complete_send((struct acmp_send_msg *) (uintptr_t) wc->wr_id); } static void *acmp_comp_handler(void *context) { struct acmp_device *dev = (struct acmp_device *) context; struct acmp_ep *ep; struct ibv_cq *cq; struct ibv_wc wc; int cnt; acm_log(1, "started\n"); if (pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, NULL)) { acm_log(0, "Error: failed to set cancel type for dev %s\n", dev->verbs->device->name); pthread_exit(NULL); } if (pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL)) { acm_log(0, "Error: failed to set cancel state for dev %s\n", dev->verbs->device->name); 
pthread_exit(NULL); } while (1) { pthread_testcancel(); ibv_get_cq_event(dev->channel, &cq, (void *) &ep); cnt = 0; while (ibv_poll_cq(cq, 1, &wc) > 0) { cnt++; acmp_process_comp(ep, &wc); } ibv_req_notify_cq(cq, 0); while (ibv_poll_cq(cq, 1, &wc) > 0) { cnt++; acmp_process_comp(ep, &wc); } ibv_ack_cq_events(cq, cnt); } return NULL; } static void acmp_format_mgid(union ibv_gid *mgid, uint16_t pkey, uint8_t tos, uint8_t rate, uint8_t mtu) { mgid->raw[0] = 0xFF; mgid->raw[1] = 0x10 | 0x05; mgid->raw[2] = 0x40; mgid->raw[3] = 0x01; mgid->raw[4] = (uint8_t) (pkey >> 8); mgid->raw[5] = (uint8_t) pkey; mgid->raw[6] = tos; mgid->raw[7] = rate; mgid->raw[8] = mtu; mgid->raw[9] = 0; mgid->raw[10] = 0; mgid->raw[11] = 0; mgid->raw[12] = 0; mgid->raw[13] = 0; mgid->raw[14] = 0; mgid->raw[15] = 0; } static void acmp_init_join(struct ib_sa_mad *mad, union ibv_gid *port_gid, uint16_t pkey, uint8_t tos, uint8_t tclass, uint8_t sl, uint8_t rate, uint8_t mtu) { struct ib_mc_member_rec *mc_rec; acm_log(2, "\n"); mad->base_version = 1; mad->mgmt_class = IB_MGMT_CLASS_SA; mad->class_version = 2; mad->method = IB_METHOD_SET; mad->tid = htobe64((uint64_t) atomic_inc(&g_tid)); mad->attr_id = IB_SA_ATTR_MC_MEMBER_REC; mad->comp_mask = IB_COMP_MASK_MC_MGID | IB_COMP_MASK_MC_PORT_GID | IB_COMP_MASK_MC_QKEY | IB_COMP_MASK_MC_MTU_SEL| IB_COMP_MASK_MC_MTU | IB_COMP_MASK_MC_TCLASS | IB_COMP_MASK_MC_PKEY | IB_COMP_MASK_MC_RATE_SEL | IB_COMP_MASK_MC_RATE | IB_COMP_MASK_MC_SL | IB_COMP_MASK_MC_FLOW | IB_COMP_MASK_MC_SCOPE | IB_COMP_MASK_MC_JOIN_STATE; mc_rec = (struct ib_mc_member_rec *) mad->data; acmp_format_mgid(&mc_rec->mgid, pkey | IB_PKEY_FULL_MEMBER, tos, rate, mtu); mc_rec->port_gid = *port_gid; mc_rec->qkey = htobe32(ACM_QKEY); mc_rec->mtu = umad_sa_set_rate_mtu_or_life(UMAD_SA_SELECTOR_EXACTLY, mtu); mc_rec->tclass = tclass; mc_rec->pkey = htobe16(pkey); mc_rec->rate = umad_sa_set_rate_mtu_or_life(UMAD_SA_SELECTOR_EXACTLY, rate); mc_rec->sl_flow_hop = umad_sa_mcm_set_sl_flow_hop(sl, 0, 0); mc_rec->scope_state = umad_sa_mcm_set_scope_state(UMAD_SA_MCM_ADDR_SCOPE_SITE_LOCAL, UMAD_SA_MCM_JOIN_STATE_FULL_MEMBER); } static void acmp_join_group(struct acmp_ep *ep, union ibv_gid *port_gid, uint8_t tos, uint8_t tclass, uint8_t sl, uint8_t rate, uint8_t mtu) { struct ib_sa_mad *mad; struct ib_mc_member_rec *mc_rec; struct acm_sa_mad *sa_mad; acm_log(2, "\n"); sa_mad = acm_alloc_sa_mad(ep->endpoint, ep, acmp_process_join_resp); if (!sa_mad) { acm_log(0, "Error - failed to allocate sa_mad\n"); return; } acm_log(0, "%s %d pkey 0x%x, sl 0x%x, rate 0x%x, mtu 0x%x\n", ep->port->dev->verbs->device->name, ep->port->port_num, ep->pkey, sl, rate, mtu); mad = (struct ib_sa_mad *) &sa_mad->sa_mad; acmp_init_join(mad, port_gid, ep->pkey, tos, tclass, sl, rate, mtu); mc_rec = (struct ib_mc_member_rec *) mad->data; acmp_set_dest_addr(&ep->mc_dest[ep->mc_cnt++], ACM_ADDRESS_GID, mc_rec->mgid.raw, sizeof(mc_rec->mgid)); ep->mc_dest[ep->mc_cnt - 1].state = ACMP_INIT; if (acm_send_sa_mad(sa_mad)) { acm_log(0, "Error - Failed to send sa mad\n"); acm_free_sa_mad(sa_mad); } } static void acmp_ep_join(struct acmp_ep *ep) { struct acmp_port *port; union ibv_gid gid; port = ep->port; acm_log(1, "%s\n", ep->id_string); if (ep->mc_dest[0].state == ACMP_READY && ep->mc_dest[0].ah) { ibv_detach_mcast(ep->qp, &ep->mc_dest[0].mgid, ep->mc_dest[0].av.dlid); ibv_destroy_ah(ep->mc_dest[0].ah); ep->mc_dest[0].ah = NULL; } ep->mc_cnt = 0; ep->state = ACMP_INIT; acm_get_gid((struct acm_port *)ep->port->port, 0, &gid); acmp_join_group(ep, &gid, 0, 0, 0, 
min_rate, min_mtu); if ((route_prot == ACMP_ROUTE_PROT_ACM) && (port->rate != min_rate || port->mtu != min_mtu)) acmp_join_group(ep, &gid, 0, 0, 0, port->rate, port->mtu); acm_log(1, "join for %s complete\n", ep->id_string); } static int acmp_port_join(void *port_context) { struct acmp_ep *ep; struct acmp_port *port = port_context; acm_log(1, "device %s port %d\n", port->dev->verbs->device->name, port->port_num); list_for_each(&port->ep_list, ep, entry) { if (!ep->endpoint) { /* Stale endpoint */ continue; } acmp_ep_join(ep); } acm_log(1, "joins for device %s port %d complete\n", port->dev->verbs->device->name, port->port_num); return 0; } static int acmp_handle_event(void *port_context, enum ibv_event_type type) { int ret = 0; acm_log(2, "event %s\n", ibv_event_type_str(type)); switch (type) { case IBV_EVENT_CLIENT_REREGISTER: ret = acmp_port_join(port_context); break; default: break; } return ret; } static void acmp_process_timeouts(void) { struct acmp_send_msg *msg; struct acm_resolve_rec *rec; struct acm_mad *mad; while ((msg = list_pop(&timeout_list, struct acmp_send_msg, entry))) { mad = (struct acm_mad *) &msg->data[0]; rec = (struct acm_resolve_rec *) mad->data; acm_format_name(0, log_data, sizeof log_data, rec->dest_type, rec->dest, sizeof rec->dest); acm_log(0, "notice - dest %s\n", log_data); msg->resp_handler(msg, NULL, NULL); acmp_free_send(msg); } } static void acmp_process_wait_queue(struct acmp_ep *ep, uint64_t *next_expire) { struct acmp_send_msg *msg, *next; struct ibv_send_wr *bad_wr; list_for_each_safe(&ep->wait_queue, msg, next, entry) { if (msg->expires <= time_stamp_ms()) { list_del(&msg->entry); (void) atomic_dec(&wait_cnt); if (--msg->tries) { acm_log(1, "notice - retrying request\n"); list_add_tail(&ep->active_queue, &msg->entry); ibv_post_send(ep->qp, &msg->wr, &bad_wr); } else { acm_log(0, "notice - failing request\n"); acmp_send_available(ep, msg->req_queue); list_add_tail(&timeout_list, &msg->entry); } } else { *next_expire = min(*next_expire, msg->expires); break; } } } /* While the device/port/ep will not be freed, we need to be careful of * their addition while walking the link lists. Therefore, we need to acquire * the appropriate locks. 
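* The walk below therefore drops each list lock while visiting an entry * and reacquires it before advancing, relying on entries never being * removed. As a sketch: lock(list); list_for_each(list, e) { unlock(list); * visit(e); lock(list); } unlock(list);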
*/ static void *acmp_retry_handler(void *context) { struct acmp_device *dev; struct acmp_port *port; struct acmp_ep *ep; uint64_t next_expire; int i, wait; acm_log(0, "started\n"); if (pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, NULL)) { acm_log(0, "Error: failed to set cancel type \n"); pthread_exit(NULL); } if (pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL)) { acm_log(0, "Error: failed to set cancel state\n"); pthread_exit(NULL); } retry_thread_started = 1; while (1) { while (!atomic_get(&wait_cnt)) { pthread_testcancel(); event_wait(&timeout_event, -1); } next_expire = -1; pthread_mutex_lock(&acmp_dev_lock); list_for_each(&acmp_dev_list, dev, entry) { pthread_mutex_unlock(&acmp_dev_lock); for (i = 0; i < dev->port_cnt; i++) { port = &dev->port[i]; pthread_mutex_lock(&port->lock); list_for_each(&port->ep_list, ep, entry) { pthread_mutex_unlock(&port->lock); pthread_mutex_lock(&ep->lock); if (!list_empty(&ep->wait_queue)) acmp_process_wait_queue(ep, &next_expire); pthread_mutex_unlock(&ep->lock); pthread_mutex_lock(&port->lock); } pthread_mutex_unlock(&port->lock); } pthread_mutex_lock(&acmp_dev_lock); } pthread_mutex_unlock(&acmp_dev_lock); acmp_process_timeouts(); if (next_expire != -1) { wait = (int) (next_expire - time_stamp_ms()); if (wait > 0 && atomic_get(&wait_cnt)) { pthread_testcancel(); event_wait(&timeout_event, wait); } } } retry_thread_started = 0; return NULL; } /* rwlock must be held read-locked */ static int __acmp_query(struct acmp_ep *ep, struct acm_msg *msg, uint64_t id) { struct acmp_request *req; struct ib_sa_mad *mad; uint8_t status; struct acm_sa_mad *sa_mad; if (ep->state != ACMP_READY) { status = ACM_STATUS_ENODATA; goto resp; } req = acmp_alloc_req(id, msg); if (!req) { status = ACM_STATUS_ENOMEM; goto resp; } req->ep = ep; sa_mad = acm_alloc_sa_mad(ep->endpoint, req, acmp_sa_resp); if (!sa_mad) { acm_log(0, "Error - failed to allocate sa_mad\n"); status = ACM_STATUS_ENOMEM; goto free_req; } mad = (struct ib_sa_mad *) &sa_mad->sa_mad; acmp_init_path_query(mad); memcpy(mad->data, &msg->resolve_data[0].info.path, sizeof(struct ibv_path_record)); mad->comp_mask = acm_path_comp_mask(&msg->resolve_data[0].info.path); acm_increment_counter(ACM_CNTR_ROUTE_QUERY); atomic_inc(&ep->counters[ACM_CNTR_ROUTE_QUERY]); if (acm_send_sa_mad(sa_mad)) { acm_log(0, "Error - Failed to send sa mad\n"); status = ACM_STATUS_ENODATA; goto free_mad; } return ACM_STATUS_SUCCESS; free_mad: acm_free_sa_mad(sa_mad); free_req: acmp_free_req(req); resp: msg->hdr.opcode |= ACM_OP_ACK; msg->hdr.status = status; if (status == ACM_STATUS_ENODATA) atomic_inc(&ep->counters[ACM_CNTR_NODATA]); else atomic_inc(&ep->counters[ACM_CNTR_ERROR]); return acm_query_response(id, msg); } static int acmp_query(void *addr_context, struct acm_msg *msg, uint64_t id) { struct acmp_addr_ctx *addr_ctx = addr_context; struct acmp_addr *address; int ret; pthread_rwlock_rdlock(&addr_ctx->ep->rwlock); address = addr_ctx->ep->addr_info + addr_ctx->addr_inx; ret = __acmp_query(address->ep, msg, id); pthread_rwlock_unlock(&addr_ctx->ep->rwlock); return ret; } static uint8_t acmp_send_resolve(struct acmp_ep *ep, struct acmp_dest *dest, struct acm_ep_addr_data *saddr) { struct acmp_send_msg *msg; struct acm_mad *mad; struct acm_resolve_rec *rec; int i; acm_log(2, "\n"); msg = acmp_alloc_send(ep, &ep->mc_dest[0], sizeof(*mad)); if (!msg) { acm_log(0, "ERROR - cannot allocate send msg\n"); return ACM_STATUS_ENOMEM; } acmp_init_send_req(msg, (void *) dest, acmp_process_addr_resp); (void) atomic_inc(&dest->refcnt); mad = 
(struct acm_mad *) msg->data; mad->base_version = 1; mad->mgmt_class = ACM_MGMT_CLASS; mad->class_version = 1; mad->method = IB_METHOD_GET; mad->control = ACM_CTRL_RESOLVE; mad->tid = htobe64((uint64_t) atomic_inc(&g_tid)); rec = (struct acm_resolve_rec *) mad->data; rec->src_type = (uint8_t) saddr->type; rec->src_length = ACM_MAX_ADDRESS; memcpy(rec->src, saddr->info.addr, ACM_MAX_ADDRESS); rec->dest_type = dest->addr_type; rec->dest_length = ACM_MAX_ADDRESS; memcpy(rec->dest, dest->address, ACM_MAX_ADDRESS); rec->gid_cnt = (uint8_t) ep->mc_cnt; for (i = 0; i < ep->mc_cnt; i++) memcpy(&rec->gid[i], ep->mc_dest[i].address, 16); acm_increment_counter(ACM_CNTR_ADDR_QUERY); atomic_inc(&ep->counters[ACM_CNTR_ADDR_QUERY]); acmp_post_send(&ep->resolve_queue, msg); return 0; } /* Caller must hold dest lock */ static uint8_t acmp_queue_req(struct acmp_dest *dest, uint64_t id, struct acm_msg *msg) { struct acmp_request *req; acm_log(2, "id %" PRIu64 "\n", id); req = acmp_alloc_req(id, msg); if (!req) { return ACM_STATUS_ENOMEM; } req->ep = dest->ep; list_add_tail(&dest->req_queue, &req->entry); return ACM_STATUS_SUCCESS; } static int acmp_dest_timeout(struct acmp_dest *dest) { uint64_t timestamp = time_stamp_min(); if (timestamp > dest->addr_timeout) { acm_log(2, "%s address timed out\n", dest->name); dest->state = ACMP_INIT; return 1; } else if (timestamp > dest->route_timeout) { acm_log(2, "%s route timed out\n", dest->name); dest->state = ACMP_ADDR_RESOLVED; return 1; } return 0; } static int acmp_check_addr_match(struct ifaddrs *iap, struct acm_ep_addr_data *saddr, unsigned int d_family) { char sip[INET6_ADDRSTRLEN] = {0}; char dip[INET6_ADDRSTRLEN] = {0}; const char *tmp; size_t sock_size; unsigned int s_family; int ret; s_family = iap->ifa_addr->sa_family; if (!(iap->ifa_flags & IFF_UP) || (s_family != d_family)) return -1; sock_size = (s_family == AF_INET) ? sizeof(struct sockaddr_in) : sizeof(struct sockaddr_in6); ret = getnameinfo(iap->ifa_addr, sock_size, sip, sizeof(sip), NULL, 0, NI_NUMERICHOST); if (ret) return ret; tmp = inet_ntop(d_family, (void *)saddr->info.addr, dip, sizeof(dip)); if (!tmp) return -1; ret = memcmp(sip, dip, strlen(dip)); return ret; } static void acmp_acquire_sgid(struct acm_ep_addr_data *saddr, struct acmp_dest *dest) { struct ifaddrs *addrs, *iap; unsigned int d_family; int ret; if (!ib_any_gid(&dest->path.sgid)) return; if (dest->addr_type != ACM_ADDRESS_IP6 && dest->addr_type != ACM_ADDRESS_IP) return; if (getifaddrs(&addrs)) return; d_family = (dest->addr_type == ACM_ADDRESS_IP) ? 
AF_INET : AF_INET6; for (iap = addrs; iap != NULL; iap = iap->ifa_next) { ret = acmp_check_addr_match(iap, saddr, d_family); if (!ret) { ret = acm_if_get_sgid(iap->ifa_name, &dest->path.sgid); if (!ret) break; } } freeifaddrs(addrs); } static int acmp_resolve_dest(struct acmp_ep *ep, struct acm_msg *msg, uint64_t id) { struct acmp_dest *dest; struct acm_ep_addr_data *saddr, *daddr; uint8_t status; int ret; saddr = &msg->resolve_data[msg->hdr.src_index]; daddr = &msg->resolve_data[msg->hdr.dst_index]; acm_format_name(2, log_data, sizeof log_data, daddr->type, daddr->info.addr, sizeof daddr->info.addr); acm_log(2, "dest %s\n", log_data); dest = acmp_acquire_dest(ep, daddr->type, daddr->info.addr); if (!dest) { acm_log(0, "ERROR - unable to allocate destination in request\n"); atomic_inc(&ep->counters[ACM_CNTR_ERROR]); return acmp_resolve_response(id, msg, NULL, ACM_STATUS_ENOMEM); } pthread_mutex_lock(&dest->lock); test: switch (dest->state) { case ACMP_READY: if (acmp_dest_timeout(dest)) goto test; acm_log(2, "request satisfied from local cache\n"); acm_increment_counter(ACM_CNTR_ROUTE_CACHE); atomic_inc(&ep->counters[ACM_CNTR_ROUTE_CACHE]); status = ACM_STATUS_SUCCESS; break; case ACMP_ADDR_RESOLVED: acm_log(2, "have address, resolving route\n"); acm_increment_counter(ACM_CNTR_ADDR_CACHE); atomic_inc(&ep->counters[ACM_CNTR_ADDR_CACHE]); acmp_acquire_sgid(saddr, dest); status = acmp_resolve_path_sa(ep, dest, acmp_dest_sa_resp); if (status) { break; } goto queue; case ACMP_INIT: acm_log(2, "sending resolve msg to dest\n"); status = acmp_send_resolve(ep, dest, saddr); if (status) { break; } dest->state = ACMP_QUERY_ADDR; /* fall through */ default: queue: if (daddr->flags & ACM_FLAGS_NODELAY) { acm_log(2, "lookup initiated, but client wants no delay\n"); status = ACM_STATUS_ENODATA; break; } status = acmp_queue_req(dest, id, msg); if (status) { break; } ret = 0; pthread_mutex_unlock(&dest->lock); goto put; } pthread_mutex_unlock(&dest->lock); ret = acmp_resolve_response(id, msg, dest, status); put: acmp_put_dest(dest); return ret; } static int acmp_resolve_path(struct acmp_ep *ep, struct acm_msg *msg, uint64_t id) { struct acmp_dest *dest; struct ibv_path_record *path; uint8_t *addr; uint8_t status; int ret; path = &msg->resolve_data[0].info.path; addr = msg->resolve_data[1].info.addr; memset(addr, 0, ACM_MAX_ADDRESS); if (path->dlid) { * ((__be16 *) addr) = path->dlid; dest = acmp_acquire_dest(ep, ACM_ADDRESS_LID, addr); } else { memcpy(addr, &path->dgid, sizeof path->dgid); dest = acmp_acquire_dest(ep, ACM_ADDRESS_GID, addr); } if (!dest) { acm_log(0, "ERROR - unable to allocate destination in request\n"); atomic_inc(&ep->counters[ACM_CNTR_ERROR]); return acmp_resolve_response(id, msg, NULL, ACM_STATUS_ENOMEM); } pthread_mutex_lock(&dest->lock); test: switch (dest->state) { case ACMP_READY: if (acmp_dest_timeout(dest)) goto test; acm_log(2, "request satisfied from local cache\n"); acm_increment_counter(ACM_CNTR_ROUTE_CACHE); atomic_inc(&ep->counters[ACM_CNTR_ROUTE_CACHE]); status = ACM_STATUS_SUCCESS; break; case ACMP_INIT: acm_log(2, "have path, bypassing address resolution\n"); acmp_record_path_addr(ep, dest, path); /* fall through */ case ACMP_ADDR_RESOLVED: acm_log(2, "have address, resolving route\n"); status = acmp_resolve_path_sa(ep, dest, acmp_dest_sa_resp); if (status) { break; } /* fall through */ default: if (msg->resolve_data[0].flags & ACM_FLAGS_NODELAY) { acm_log(2, "lookup initiated, but client wants no delay\n"); status = ACM_STATUS_ENODATA; break; } status = 
acmp_queue_req(dest, id, msg); if (status) { break; } ret = 0; pthread_mutex_unlock(&dest->lock); goto put; } pthread_mutex_unlock(&dest->lock); ret = acmp_resolve_response(id, msg, dest, status); put: acmp_put_dest(dest); return ret; } static int acmp_resolve(void *addr_context, struct acm_msg *msg, uint64_t id) { struct acmp_addr_ctx *addr_ctx = addr_context; struct acmp_addr *address = addr_ctx->ep->addr_info + addr_ctx->addr_inx; struct acmp_ep *ep = address->ep; if (ep->state != ACMP_READY) { atomic_inc(&ep->counters[ACM_CNTR_NODATA]); return acmp_resolve_response(id, msg, NULL, ACM_STATUS_ENODATA); } atomic_inc(&ep->counters[ACM_CNTR_RESOLVE]); if (msg->resolve_data[0].type == ACM_EP_INFO_PATH) return acmp_resolve_path(ep, msg, id); else return acmp_resolve_dest(ep, msg, id); } static void acmp_query_perf(void *ep_context, uint64_t *values, uint8_t *cnt) { struct acmp_ep *ep = ep_context; int i; for (i = 0; i < ACM_MAX_COUNTER; i++) values[i] = htobe64((uint64_t) atomic_get(&ep->counters[i])); *cnt = ACM_MAX_COUNTER; } static enum acmp_addr_prot acmp_convert_addr_prot(char *param) { if (!strcasecmp("acm", param)) return ACMP_ADDR_PROT_ACM; return addr_prot; } static enum acmp_route_prot acmp_convert_route_prot(char *param) { if (!strcasecmp("acm", param)) return ACMP_ROUTE_PROT_ACM; else if (!strcasecmp("sa", param)) return ACMP_ROUTE_PROT_SA; return route_prot; } static enum acmp_loopback_prot acmp_convert_loopback_prot(char *param) { if (!strcasecmp("none", param)) return ACMP_LOOPBACK_PROT_NONE; else if (!strcasecmp("local", param)) return ACMP_LOOPBACK_PROT_LOCAL; return loopback_prot; } static enum acmp_route_preload acmp_convert_route_preload(char *param) { if (!strcasecmp("none", param) || !strcasecmp("no", param)) return ACMP_ROUTE_PRELOAD_NONE; else if (!strcasecmp("opensm_full_v1", param)) return ACMP_ROUTE_PRELOAD_OSM_FULL_V1; return route_preload; } static enum acmp_addr_preload acmp_convert_addr_preload(char *param) { if (!strcasecmp("none", param) || !strcasecmp("no", param)) return ACMP_ADDR_PRELOAD_NONE; else if (!strcasecmp("acm_hosts", param)) return ACMP_ADDR_PRELOAD_HOSTS; return addr_preload; } static int acmp_post_recvs(struct acmp_ep *ep) { int i, size; size = recv_depth * ACM_RECV_SIZE; ep->recv_bufs = malloc(size); if (!ep->recv_bufs) { acm_log(0, "ERROR - unable to allocate receive buffer\n"); return ACM_STATUS_ENOMEM; } ep->mr = ibv_reg_mr(ep->port->dev->pd, ep->recv_bufs, size, IBV_ACCESS_LOCAL_WRITE); if (!ep->mr) { acm_log(0, "ERROR - unable to register receive buffer\n"); goto err; } for (i = 0; i < recv_depth; i++) { acmp_post_recv(ep, (uintptr_t) (ep->recv_bufs + ACM_RECV_SIZE * i)); } return 0; err: free(ep->recv_bufs); return -1; } /* Parse "opensm full v1" file to build LID to GUID table */ static void acmp_parse_osm_fullv1_lid2guid(FILE *f, __be64 *lid2guid) { char s[128]; char *p, *ptr, *p_guid, *p_lid; uint64_t guid; uint16_t lid; while (fgets(s, sizeof s, f)) { if (s[0] == '#') continue; if (!(p = strtok_r(s, " \n", &ptr))) continue; /* ignore blank lines */ if (strncmp(p, "Switch", sizeof("Switch") - 1) && strncmp(p, "Channel", sizeof("Channel") - 1) && strncmp(p, "Router", sizeof("Router") - 1)) continue; if (!strncmp(p, "Channel", sizeof("Channel") - 1)) { p = strtok_r(NULL, " ", &ptr); /* skip 'Adapter' */ if (!p) continue; } p_guid = strtok_r(NULL, ",", &ptr); if (!p_guid) continue; guid = (uint64_t) strtoull(p_guid, NULL, 16); ptr = strstr(ptr, "base LID"); if (!ptr) continue; ptr += sizeof("base LID"); p_lid = strtok_r(NULL, ",", &ptr); 
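/* p_lid, if present, is the LID field of a node header line of the * (illustrative) form: "Switch 0x0002c90200000001, base LID 12, ..." */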
if (!p_lid) continue; lid = (uint16_t) strtoul(p_lid, NULL, 0); if (lid >= IB_LID_MCAST_START) continue; if (lid2guid[lid]) acm_log(0, "ERROR - duplicate lid %u\n", lid); else lid2guid[lid] = htobe64(guid); } } /* Parse 'opensm full v1' file to populate PR cache */ static int acmp_parse_osm_fullv1_paths(FILE *f, __be64 *lid2guid, struct acmp_ep *ep) { union ibv_gid sgid, dgid; struct ibv_port_attr attr = {}; struct acmp_dest *dest; char s[128]; char *p, *ptr, *p_guid, *p_lid; uint64_t guid; uint16_t lid, dlid; __be16 net_dlid; int sl, mtu, rate; int ret = 1, i; uint8_t addr[ACM_MAX_ADDRESS]; uint8_t addr_type; acm_get_gid((struct acm_port *)ep->port->port, 0, &sgid); /* Search for endpoint's SLID */ while (fgets(s, sizeof s, f)) { if (s[0] == '#') continue; if (!(p = strtok_r(s, " \n", &ptr))) continue; /* ignore blank lines */ if (strncmp(p, "Switch", sizeof("Switch") - 1) && strncmp(p, "Channel", sizeof("Channel") - 1) && strncmp(p, "Router", sizeof("Router") - 1)) continue; if (!strncmp(p, "Channel", sizeof("Channel") - 1)) { p = strtok_r(NULL, " ", &ptr); /* skip 'Adapter' */ if (!p) continue; } p_guid = strtok_r(NULL, ",", &ptr); if (!p_guid) continue; guid = (uint64_t) strtoull(p_guid, NULL, 16); if (guid != be64toh(sgid.global.interface_id)) continue; ptr = strstr(ptr, "base LID"); if (!ptr) continue; ptr += sizeof("base LID"); p_lid = strtok_r(NULL, ",", &ptr); if (!p_lid) continue; lid = (uint16_t) strtoul(p_lid, NULL, 0); if (lid != ep->port->lid) continue; ibv_query_port(ep->port->dev->verbs, ep->port->port_num, &attr); ret = 0; break; } while (fgets(s, sizeof s, f)) { if (s[0] == '#') continue; if (!(p = strtok_r(s, " \n", &ptr))) continue; /* ignore blank lines */ if (!strncmp(p, "Switch", sizeof("Switch") - 1) || !strncmp(p, "Channel", sizeof("Channel") - 1) || !strncmp(p, "Router", sizeof("Router") - 1)) break; dlid = strtoul(p, NULL, 0); net_dlid = htobe16(dlid); p = strtok_r(NULL, ":", &ptr); if (!p) continue; if (strcmp(p, "UNREACHABLE") == 0) continue; sl = atoi(p); p = strtok_r(NULL, ":", &ptr); if (!p) continue; mtu = atoi(p); p = strtok_r(NULL, ":", &ptr); if (!p) continue; rate = atoi(p); if (!lid2guid[dlid]) { acm_log(0, "ERROR - dlid %u not found in lid2guid table\n", dlid); continue; } dgid.global.subnet_prefix = sgid.global.subnet_prefix; dgid.global.interface_id = lid2guid[dlid]; for (i = 0; i < 2; i++) { memset(addr, 0, ACM_MAX_ADDRESS); if (i == 0) { addr_type = ACM_ADDRESS_LID; memcpy(addr, &net_dlid, sizeof net_dlid); } else { addr_type = ACM_ADDRESS_GID; memcpy(addr, &dgid, sizeof(dgid)); } dest = acmp_acquire_dest(ep, addr_type, addr); if (!dest) { acm_log(0, "ERROR - unable to create dest\n"); break; } dest->path.sgid = sgid; dest->path.slid = htobe16(ep->port->lid); dest->path.dgid = dgid; dest->path.dlid = net_dlid; dest->path.reversible_numpath = IBV_PATH_RECORD_REVERSIBLE; dest->path.pkey = htobe16(ep->pkey); dest->path.mtu = (uint8_t) mtu; dest->path.rate = (uint8_t) rate; dest->path.qosclass_sl = htobe16((uint16_t) sl & 0xF); if (dlid == ep->port->lid) { dest->path.packetlifetime = 0; dest->addr_timeout = (uint64_t)~0ULL; dest->route_timeout = (uint64_t)~0ULL; } else { dest->path.packetlifetime = attr.subnet_timeout; dest->addr_timeout = time_stamp_min() + (unsigned) addr_timeout; dest->route_timeout = time_stamp_min() + (unsigned) route_timeout; } dest->remote_qpn = 1; dest->state = ACMP_READY; acmp_put_dest(dest); acm_log(1, "added cached dest %s\n", dest->name); } } return ret; } static int acmp_parse_osm_fullv1(struct acmp_ep *ep) { FILE *f; 
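/* Preload the path record cache from route_data_file in two passes: * first build a LID-to-GUID table covering all unicast LIDs below * IB_LID_MCAST_START, rewind the file, then convert each * "dlid : sl : mtu : rate" entry into a cached ACMP_READY destination. */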
__be64 *lid2guid; int ret = 1; if (!(f = fopen(route_data_file, "r"))) { acm_log(0, "ERROR - couldn't open %s\n", route_data_file); return ret; } lid2guid = calloc(IB_LID_MCAST_START, sizeof(*lid2guid)); if (!lid2guid) { acm_log(0, "ERROR - no memory for path record parsing\n"); goto err; } acmp_parse_osm_fullv1_lid2guid(f, lid2guid); rewind(f); ret = acmp_parse_osm_fullv1_paths(f, lid2guid, ep); free(lid2guid); err: fclose(f); return ret; } static void acmp_parse_hosts_file(struct acmp_ep *ep) { FILE *f; char s[120]; char addr[INET6_ADDRSTRLEN], gid[INET6_ADDRSTRLEN]; uint8_t name[ACM_MAX_ADDRESS]; struct in6_addr ip_addr, ib_addr; struct acmp_dest *dest, *gid_dest; uint8_t addr_type; if (!(f = fopen(addr_data_file, "r"))) { acm_log(0, "ERROR - couldn't open %s\n", addr_data_file); return; } while (fgets(s, sizeof s, f)) { if (s[0] == '#') continue; if (sscanf(s, "%45s%45s", addr, gid) != 2) continue; acm_log(2, "%s", s); if (inet_pton(AF_INET6, gid, &ib_addr) <= 0) { acm_log(0, "ERROR - %s is not IB GID\n", gid); continue; } memset(name, 0, ACM_MAX_ADDRESS); if (inet_pton(AF_INET, addr, &ip_addr) > 0) { addr_type = ACM_ADDRESS_IP; memcpy(name, &ip_addr, 4); } else if (inet_pton(AF_INET6, addr, &ip_addr) > 0) { addr_type = ACM_ADDRESS_IP6; memcpy(name, &ip_addr, sizeof(ip_addr)); } else { addr_type = ACM_ADDRESS_NAME; strncpy((char *)name, addr, ACM_MAX_ADDRESS); } dest = acmp_acquire_dest(ep, addr_type, name); if (!dest) { acm_log(0, "ERROR - unable to create dest %s\n", addr); continue; } memset(name, 0, ACM_MAX_ADDRESS); memcpy(name, &ib_addr, sizeof(ib_addr)); gid_dest = acmp_get_dest(ep, ACM_ADDRESS_GID, name); if (gid_dest) { dest->path = gid_dest->path; dest->state = ACMP_READY; acmp_put_dest(gid_dest); } else { memcpy(&dest->path.dgid, &ib_addr, 16); //ibv_query_gid(ep->port->dev->verbs, ep->port->port_num, // 0, &dest->path.sgid); dest->path.slid = htobe16(ep->port->lid); dest->path.reversible_numpath = IBV_PATH_RECORD_REVERSIBLE; dest->path.pkey = htobe16(ep->pkey); dest->state = ACMP_ADDR_RESOLVED; } dest->remote_qpn = 1; dest->addr_timeout = time_stamp_min() + (unsigned) addr_timeout; dest->route_timeout = time_stamp_min() + (unsigned) route_timeout; acmp_put_dest(dest); acm_log(1, "added host %s address type %d IB GID %s\n", addr, addr_type, gid); } fclose(f); } /* * We currently require that the routing data be preloaded in order to * load the address data. This is backwards from normal operation, which * usually resolves the address before the route. 
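* Concretely, acmp_parse_hosts_file() above looks up each host's GID via * acmp_get_dest(ep, ACM_ADDRESS_GID, ...): when route preloading already * cached a path for that GID, the new host entry starts out ACMP_READY; * otherwise it is left ACMP_ADDR_RESOLVED and the route is obtained * later through an SA query.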
*/ static void acmp_ep_preload(struct acmp_ep *ep) { switch (route_preload) { case ACMP_ROUTE_PRELOAD_OSM_FULL_V1: if (acmp_parse_osm_fullv1(ep)) acm_log(0, "ERROR - failed to preload EP\n"); break; default: break; } switch (addr_preload) { case ACMP_ADDR_PRELOAD_HOSTS: acmp_parse_hosts_file(ep); break; default: break; } } /* rwlock must be held write-locked */ static int __acmp_add_addr(const struct acm_address *addr, struct acmp_ep *ep, void **addr_context) { struct acmp_dest *dest; struct acmp_addr_ctx *addr_ctx; int i; for (i = 0; (i < ep->nmbr_ep_addrs) && (ep->addr_info[i].type != ACM_ADDRESS_INVALID); i++) ; if (i == ep->nmbr_ep_addrs) { struct acmp_addr *new_info; new_info = realloc(ep->addr_info, (i + 1) * sizeof(*ep->addr_info)); if (!new_info) { acm_log(0, "ERROR - no more space for local address\n"); return -1; } ep->addr_info = new_info; /* Added memory is not initialized */ memset(ep->addr_info + i, 0, sizeof(*ep->addr_info)); ++ep->nmbr_ep_addrs; } ep->addr_info[i].type = addr->type; memcpy(&ep->addr_info[i].info, &addr->info, sizeof(addr->info)); memcpy(&ep->addr_info[i].addr, addr, sizeof(*addr)); ep->addr_info[i].ep = ep; addr_ctx = malloc(sizeof(*addr_ctx)); if (!addr_ctx) { acm_log(0, "ERROR - unable to alloc address context struct\n"); return -1; } addr_ctx->ep = ep; addr_ctx->addr_inx = i; if (loopback_prot != ACMP_LOOPBACK_PROT_LOCAL) { *addr_context = addr_ctx; return 0; } dest = acmp_acquire_dest(ep, addr->type, (uint8_t *)addr->info.addr); if (!dest) { acm_log(0, "ERROR - unable to create loopback dest %s\n", addr->id_string); memset(&ep->addr_info[i], 0, sizeof(ep->addr_info[i])); free(addr_ctx); return -1; } acm_get_gid((struct acm_port *) ep->port->port, 0, &dest->path.sgid); dest->path.dgid = dest->path.sgid; dest->path.dlid = dest->path.slid = htobe16(ep->port->lid); dest->path.reversible_numpath = IBV_PATH_RECORD_REVERSIBLE; dest->path.pkey = htobe16(ep->pkey); dest->path.mtu = (uint8_t) ep->port->mtu; dest->path.rate = (uint8_t) ep->port->rate; dest->remote_qpn = ep->qp->qp_num; dest->addr_timeout = (uint64_t) ~0ULL; dest->route_timeout = (uint64_t) ~0ULL; dest->state = ACMP_READY; acmp_put_dest(dest); *addr_context = addr_ctx; acm_log(1, "added loopback dest %s\n", dest->name); return 0; } static int acmp_add_addr(const struct acm_address *addr, void *ep_context, void **addr_context) { struct acmp_ep *ep = ep_context; int ret; acm_log(2, "\n"); pthread_rwlock_wrlock(&ep->rwlock); ret = __acmp_add_addr(addr, ep, addr_context); pthread_rwlock_unlock(&ep->rwlock); return ret; } static void acmp_remove_addr(void *addr_context) { struct acmp_addr_ctx *addr_ctx = addr_context; struct acmp_addr *address = addr_ctx->ep->addr_info + addr_ctx->addr_inx; struct acmp_device *dev; struct acmp_dest *dest; struct acmp_ep *ep; int i; acm_log(2, "\n"); /* * The address may be a local destination address. If so, * delete it from the cache. 
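* Because the address may have been cached as a destination by any * endpoint (for example through loopback provisioning or an earlier * resolution), every endpoint on every port of every device is walked, * dropping the list locks around each removal in the same hand-over-hand * pattern used by the retry handler.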
*/ pthread_mutex_lock(&acmp_dev_lock); list_for_each(&acmp_dev_list, dev, entry) { pthread_mutex_unlock(&acmp_dev_lock); for (i = 0; i < dev->port_cnt; i++) { struct acmp_port *port = &dev->port[i]; pthread_mutex_lock(&port->lock); list_for_each(&port->ep_list, ep, entry) { pthread_mutex_unlock(&port->lock); dest = acmp_get_dest(ep, address->type, address->addr.info.addr); if (dest) { acm_log(2, "Found a dest addr, deleting it\n"); pthread_mutex_lock(&ep->lock); acmp_remove_dest(ep, dest); pthread_mutex_unlock(&ep->lock); } pthread_mutex_lock(&port->lock); } pthread_mutex_unlock(&port->lock); } pthread_mutex_lock(&acmp_dev_lock); } pthread_mutex_unlock(&acmp_dev_lock); memset(address, 0, sizeof(*address)); free(addr_ctx); } static struct acmp_port *acmp_get_port(struct acm_endpoint *endpoint) { struct acmp_device *dev; acm_log(1, "dev 0x%" PRIx64 " port %d pkey 0x%x\n", be64toh(endpoint->port->dev->dev_guid), endpoint->port->port_num, endpoint->pkey); list_for_each(&acmp_dev_list, dev, entry) { if (dev->guid == endpoint->port->dev->dev_guid) return &dev->port[endpoint->port->port_num - 1]; } return NULL; } static struct acmp_ep * acmp_get_ep(struct acmp_port *port, struct acm_endpoint *endpoint) { struct acmp_ep *ep; acm_log(1, "dev 0x%" PRIx64 " port %d pkey 0x%x\n", be64toh(endpoint->port->dev->dev_guid), endpoint->port->port_num, endpoint->pkey); list_for_each(&port->ep_list, ep, entry) { if (ep->pkey == endpoint->pkey) return ep; } return NULL; } static uint16_t acmp_get_pkey_index(struct acm_endpoint *endpoint) { struct acmp_port *port; int i; port = acmp_get_port(endpoint); if (!port) return 0; i = ibv_get_pkey_index(port->dev->verbs, port->port_num, htobe16(endpoint->pkey)); if (i < 0) return 0; return i; } static void acmp_close_endpoint(void *ep_context) { struct acmp_ep *ep = ep_context; acm_log(1, "%s %d pkey 0x%04x\n", ep->port->dev->verbs->device->name, ep->port->port_num, ep->pkey); ep->endpoint = NULL; } static struct acmp_ep * acmp_alloc_ep(struct acmp_port *port, struct acm_endpoint *endpoint) { struct acmp_ep *ep; int i; acm_log(1, "\n"); ep = calloc(1, sizeof *ep); if (!ep) return NULL; ep->port = port; ep->endpoint = endpoint; ep->pkey = endpoint->pkey; ep->resolve_queue.credits = resolve_depth; ep->resp_queue.credits = send_depth; list_head_init(&ep->resolve_queue.pending); list_head_init(&ep->resp_queue.pending); list_head_init(&ep->active_queue); list_head_init(&ep->wait_queue); pthread_mutex_init(&ep->lock, NULL); sprintf(ep->id_string, "%s-%d-0x%x", port->dev->verbs->device->name, port->port_num, endpoint->pkey); if (pthread_rwlock_init(&ep->rwlock, NULL)) { free(ep); return NULL; } ep->addr_info = NULL; ep->nmbr_ep_addrs = 0; for (i = 0; i < ACM_MAX_COUNTER; i++) atomic_init(&ep->counters[i]); return ep; } static int acmp_open_endpoint(const struct acm_endpoint *endpoint, void *port_context, void **ep_context) { struct acmp_port *port = port_context; struct acmp_ep *ep; struct ibv_qp_init_attr init_attr; struct ibv_qp_attr attr; int ret, sq_size; ep = acmp_get_ep(port, (struct acm_endpoint *) endpoint); if (ep) { acm_log(2, "endpoint for pkey 0x%x already exists\n", endpoint->pkey); pthread_mutex_lock(&ep->lock); ep->endpoint = (struct acm_endpoint *) endpoint; pthread_mutex_unlock(&ep->lock); *ep_context = (void *) ep; return 0; } acm_log(2, "creating endpoint for pkey 0x%x\n", endpoint->pkey); ep = acmp_alloc_ep(port, (struct acm_endpoint *) endpoint); if (!ep) return -1; sprintf(ep->id_string, "%s-%d-0x%x", port->dev->verbs->device->name, port->port_num, 
endpoint->pkey); sq_size = resolve_depth + send_depth; ep->cq = ibv_create_cq(port->dev->verbs, sq_size + recv_depth, ep, port->dev->channel, 0); if (!ep->cq) { acm_log(0, "ERROR - failed to create CQ\n"); goto err0; } ret = ibv_req_notify_cq(ep->cq, 0); if (ret) { acm_log(0, "ERROR - failed to arm CQ\n"); goto err1; } memset(&init_attr, 0, sizeof init_attr); init_attr.cap.max_send_wr = sq_size; init_attr.cap.max_recv_wr = recv_depth; init_attr.cap.max_send_sge = 1; init_attr.cap.max_recv_sge = 1; init_attr.qp_context = ep; init_attr.sq_sig_all = 1; init_attr.qp_type = IBV_QPT_UD; init_attr.send_cq = ep->cq; init_attr.recv_cq = ep->cq; ep->qp = ibv_create_qp(ep->port->dev->pd, &init_attr); if (!ep->qp) { acm_log(0, "ERROR - failed to create QP\n"); goto err1; } attr.qp_state = IBV_QPS_INIT; attr.port_num = port->port_num; attr.pkey_index = acmp_get_pkey_index((struct acm_endpoint *) endpoint); attr.qkey = ACM_QKEY; ret = ibv_modify_qp(ep->qp, &attr, IBV_QP_STATE | IBV_QP_PKEY_INDEX | IBV_QP_PORT | IBV_QP_QKEY); if (ret) { acm_log(0, "ERROR - failed to modify QP to init\n"); goto err2; } attr.qp_state = IBV_QPS_RTR; ret = ibv_modify_qp(ep->qp, &attr, IBV_QP_STATE); if (ret) { acm_log(0, "ERROR - failed to modify QP to rtr\n"); goto err2; } attr.qp_state = IBV_QPS_RTS; attr.sq_psn = 0; ret = ibv_modify_qp(ep->qp, &attr, IBV_QP_STATE | IBV_QP_SQ_PSN); if (ret) { acm_log(0, "ERROR - failed to modify QP to rts\n"); goto err2; } ret = acmp_post_recvs(ep); if (ret) goto err2; pthread_mutex_lock(&port->lock); list_add(&port->ep_list, &ep->entry); pthread_mutex_unlock(&port->lock); acmp_ep_preload(ep); acmp_ep_join(ep); *ep_context = (void *) ep; return 0; err2: ibv_destroy_qp(ep->qp); err1: ibv_destroy_cq(ep->cq); err0: free(ep); return -1; } static void acmp_port_up(struct acmp_port *port) { struct ibv_port_attr attr; uint16_t pkey; __be16 pkey_be; __be16 sm_lid; int i, ret; int instance; acm_log(1, "%s %d\n", port->dev->verbs->device->name, port->port_num); ret = ibv_query_port(port->dev->verbs, port->port_num, &attr); if (ret) { acm_log(0, "ERROR - unable to get port attribute\n"); return; } port->mtu = attr.active_mtu; port->rate = acm_get_rate(attr.active_width, attr.active_speed); if (attr.subnet_timeout >= 8) port->subnet_timeout = 1 << (attr.subnet_timeout - 8); port->lid = attr.lid; port->lid_mask = 0xffff - ((1 << attr.lmc) - 1); port->sa_dest.av.src_path_bits = 0; port->sa_dest.av.dlid = attr.sm_lid; port->sa_dest.av.sl = attr.sm_sl; port->sa_dest.av.port_num = port->port_num; port->sa_dest.remote_qpn = 1; sm_lid = htobe16(attr.sm_lid); acmp_set_dest_addr(&port->sa_dest, ACM_ADDRESS_LID, (uint8_t *) &sm_lid, sizeof(sm_lid)); instance = atomic_inc(&port->sa_dest.refcnt) - 1; port->sa_dest.state = ACMP_READY; for (i = 0; i < attr.pkey_tbl_len; i++) { ret = ibv_query_pkey(port->dev->verbs, port->port_num, i, &pkey_be); if (ret) continue; pkey = be16toh(pkey_be); if (!(pkey & 0x7fff)) continue; /* Determine pkey index for default partition with preference * for full membership */ if ((pkey & 0x7fff) == 0x7fff) { port->default_pkey_ix = i; break; } } port->state = IBV_PORT_ACTIVE; acm_log(1, "%s %d %d is up\n", port->dev->verbs->device->name, port->port_num, instance); } static void acmp_port_down(struct acmp_port *port) { int instance; acm_log(1, "%s %d\n", port->dev->verbs->device->name, port->port_num); pthread_mutex_lock(&port->lock); port->state = IBV_PORT_DOWN; pthread_mutex_unlock(&port->lock); /* * We wait for the SA destination to be released. 
We could use an * event instead of a sleep loop, but it's not worth it given how * infrequently we should be processing a port down event in practice. */ instance = atomic_dec(&port->sa_dest.refcnt); if (instance == 1) { pthread_mutex_lock(&port->sa_dest.lock); port->sa_dest.state = ACMP_INIT; pthread_mutex_unlock(&port->sa_dest.lock); } acm_log(1, "%s %d %d is down\n", port->dev->verbs->device->name, port->port_num, instance); } static int acmp_open_port(const struct acm_port *cport, void *dev_context, void **port_context) { struct acmp_device *dev = dev_context; struct acmp_port *port; if (cport->port_num < 1 || cport->port_num > dev->port_cnt) { acm_log(0, "Error: port_num %d is out of range (max %d)\n", cport->port_num, dev->port_cnt); return -1; } port = &dev->port[cport->port_num - 1]; pthread_mutex_lock(&port->lock); port->port = cport; port->state = IBV_PORT_DOWN; pthread_mutex_unlock(&port->lock); acmp_port_up(port); *port_context = port; return 0; } static void acmp_close_port(void *port_context) { struct acmp_port *port = port_context; acmp_port_down(port); pthread_mutex_lock(&port->lock); port->port = NULL; pthread_mutex_unlock(&port->lock); } static void acmp_init_port(struct acmp_port *port, struct acmp_device *dev, uint8_t port_num) { acm_log(1, "%s %d\n", dev->verbs->device->name, port_num); port->dev = dev; port->port_num = port_num; pthread_mutex_init(&port->lock, NULL); list_head_init(&port->ep_list); acmp_init_dest(&port->sa_dest, ACM_ADDRESS_LID, NULL, 0); port->state = IBV_PORT_DOWN; } static int acmp_open_dev(const struct acm_device *device, void **dev_context) { struct acmp_device *dev; size_t size; struct ibv_device_attr attr; int i, ret; struct ibv_context *verbs; acm_log(1, "dev_guid 0x%" PRIx64 " %s\n", be64toh(device->dev_guid), device->verbs->device->name); list_for_each(&acmp_dev_list, dev, entry) { if (dev->guid == device->dev_guid) { acm_log(2, "dev_guid 0x%" PRIx64 " already exists\n", be64toh(device->dev_guid)); *dev_context = dev; dev->device = device; return 0; } } /* We need to release the core device structure when device close is * called. But this provider does not support dynamic add/removal of * devices/ports/endpoints. To avoid use-after-free issues, we open * our own verbs context, rather than using the one in the core * device structure. 
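* The provider-owned context (verbs context, PD and completion channel * below) then stays valid for the life of the daemon: acmp_close_dev() * only clears the core device pointer and releases nothing else.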
*/ verbs = ibv_open_device(device->verbs->device); if (!verbs) { acm_log(0, "ERROR - opening device %s\n", device->verbs->device->name); goto err; } ret = ibv_query_device(verbs, &attr); if (ret) { acm_log(0, "ERROR - ibv_query_device (%s) %d\n", verbs->device->name, ret); goto err; } size = sizeof(*dev) + sizeof(struct acmp_port) * attr.phys_port_cnt; dev = (struct acmp_device *) calloc(1, size); if (!dev) goto err; dev->verbs = verbs; dev->device = device; dev->port_cnt = attr.phys_port_cnt; dev->pd = ibv_alloc_pd(dev->verbs); if (!dev->pd) { acm_log(0, "ERROR - unable to allocate PD\n"); goto err1; } dev->channel = ibv_create_comp_channel(dev->verbs); if (!dev->channel) { acm_log(0, "ERROR - unable to create comp channel\n"); goto err2; } for (i = 0; i < dev->port_cnt; i++) { acmp_init_port(&dev->port[i], dev, i + 1); } if (pthread_create(&dev->comp_thread_id, NULL, acmp_comp_handler, dev)) { acm_log(0, "Error -- failed to create the comp thread for dev %s\n", dev->verbs->device->name); goto err3; } pthread_mutex_lock(&acmp_dev_lock); list_add(&acmp_dev_list, &dev->entry); pthread_mutex_unlock(&acmp_dev_lock); dev->guid = device->dev_guid; *dev_context = dev; acm_log(1, "%s opened\n", dev->verbs->device->name); return 0; err3: ibv_destroy_comp_channel(dev->channel); err2: ibv_dealloc_pd(dev->pd); err1: free(dev); err: return -1; } static void acmp_close_dev(void *dev_context) { struct acmp_device *dev = dev_context; acm_log(1, "dev_guid 0x%" PRIx64 "\n", be64toh(dev->device->dev_guid)); dev->device = NULL; } static void acmp_set_options(void) { FILE *f; char s[120]; char opt[32], value[256]; const char *opts_file = acm_get_opts_file(); if (!(f = fopen(opts_file, "r"))) return; while (fgets(s, sizeof s, f)) { if (s[0] == '#') continue; if (sscanf(s, "%31s%255s", opt, value) != 2) continue; if (!strcasecmp("addr_prot", opt)) addr_prot = acmp_convert_addr_prot(value); else if (!strcasecmp("addr_timeout", opt)) addr_timeout = atoi(value); else if (!strcasecmp("route_prot", opt)) route_prot = acmp_convert_route_prot(value); else if (!strcasecmp("route_timeout", opt)) route_timeout = atoi(value); else if (!strcasecmp("loopback_prot", opt)) loopback_prot = acmp_convert_loopback_prot(value); else if (!strcasecmp("timeout", opt)) timeout = atoi(value); else if (!strcasecmp("retries", opt)) retries = atoi(value); else if (!strcasecmp("resolve_depth", opt)) resolve_depth = atoi(value); else if (!strcasecmp("send_depth", opt)) send_depth = atoi(value); else if (!strcasecmp("recv_depth", opt)) recv_depth = atoi(value); else if (!strcasecmp("min_mtu", opt)) min_mtu = acm_convert_mtu(atoi(value)); else if (!strcasecmp("min_rate", opt)) min_rate = acm_convert_rate(atoi(value)); else if (!strcasecmp("route_preload", opt)) route_preload = acmp_convert_route_preload(value); else if (!strcasecmp("route_data_file", opt)) strcpy(route_data_file, value); else if (!strcasecmp("addr_preload", opt)) addr_preload = acmp_convert_addr_preload(value); else if (!strcasecmp("addr_data_file", opt)) strcpy(addr_data_file, value); } fclose(f); } static void acmp_log_options(void) { acm_log(0, "address resolution %d\n", addr_prot); acm_log(0, "address timeout %d\n", addr_timeout); acm_log(0, "route resolution %d\n", route_prot); acm_log(0, "route timeout %d\n", route_timeout); acm_log(0, "loopback resolution %d\n", loopback_prot); acm_log(0, "timeout %d ms\n", timeout); acm_log(0, "retries %d\n", retries); acm_log(0, "resolve depth %d\n", resolve_depth); acm_log(0, "send depth %d\n", send_depth); acm_log(0, "receive depth 
%d\n", recv_depth); acm_log(0, "minimum mtu %d\n", min_mtu); acm_log(0, "minimum rate %d\n", min_rate); acm_log(0, "route preload %d\n", route_preload); acm_log(0, "route data file %s\n", route_data_file); acm_log(0, "address preload %d\n", addr_preload); acm_log(0, "address data file %s\n", addr_data_file); } static void __attribute__((constructor)) acmp_init(void) { acmp_set_options(); acmp_log_options(); atomic_init(&g_tid); atomic_init(&wait_cnt); pthread_mutex_init(&acmp_dev_lock, NULL); event_init(&timeout_event); umad_init(); acm_log(1, "starting timeout/retry thread\n"); if (pthread_create(&retry_thread_id, NULL, acmp_retry_handler, NULL)) { acm_log(0, "Error: failed to create the retry thread"); retry_thread_started = 0; return; } acmp_initialized = 1; } int provider_query(struct acm_provider **provider, uint32_t *version) { acm_log(1, "\n"); if (!acmp_initialized) return -1; if (provider) *provider = &def_prov; if (version) *version = ACM_PROV_VERSION; return 0; } rdma-core-56.1/ibacm/prov/acmp/src/libibacmp.map000066400000000000000000000000641477342711600215230ustar00rootroot00000000000000ACMP_1.0 { global: provider_query; local: *; }; rdma-core-56.1/ibacm/src/000077500000000000000000000000001477342711600151745ustar00rootroot00000000000000rdma-core-56.1/ibacm/src/acm.c000066400000000000000000002466011477342711600161110ustar00rootroot00000000000000/* * Copyright (c) 2009-2014 Intel Corporation. All rights reserved. * Copyright (c) 2013 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AWV * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "acm_mad.h" #include "acm_util.h" #define NL_MSG_BUF_SIZE 4096 #define ACM_PROV_NAME_SIZE 64 #define NL_CLIENT_INDEX 0 struct acmc_subnet { struct list_node entry; __be64 subnet_prefix; }; struct acmc_prov { struct acm_provider *prov; void *handle; struct list_node entry; struct list_head subnet_list; }; struct acmc_prov_context { struct list_node entry; atomic_t refcnt; struct acm_provider *prov; void *context; }; struct acmc_device; struct acmc_port { struct acmc_device *dev; struct acm_port port; struct acm_provider *prov; /* limit to 1 provider per port for now */ void *prov_port_context; int mad_portid; int mad_agentid; struct ib_mad_addr sa_addr; struct list_head sa_pending; struct list_head sa_wait; int sa_credits; pthread_mutex_t lock; struct list_head ep_list; enum ibv_port_state state; int gid_cnt; union ibv_gid *gid_tbl; uint16_t lid; uint16_t lid_mask; int sa_pkey_index; bool pending_rereg; uint16_t def_acm_pkey; }; struct acmc_device { struct acm_device device; struct list_node entry; struct list_head prov_dev_context_list; int port_cnt; struct acmc_port port[0]; }; struct acmc_addr { struct acm_address addr; void *prov_addr_context; char string_buf[ACM_MAX_ADDRESS]; }; struct acmc_ep { struct acmc_port *port; struct acm_endpoint endpoint; void *prov_ep_context; /* Although the below two entries are used for dynamic allocations, * they are accessed by a single thread, so no locking is required. */ int nmbr_ep_addrs; struct acmc_addr *addr_info; struct list_node entry; }; struct acmc_client { pthread_mutex_t lock; /* acquire ep lock first */ int sock; int index; atomic_t refcnt; }; union socket_addr { struct sockaddr sa; struct sockaddr_in sin; struct sockaddr_in6 sin6; }; struct acmc_sa_req { struct list_node entry; struct acmc_ep *ep; void (*resp_handler)(struct acm_sa_mad *); struct acm_sa_mad mad; }; struct acm_nl_path { struct nlattr attr_hdr; struct ib_path_rec_data rec; }; struct acm_nl_msg { struct nlmsghdr nlmsg_header; union { uint8_t data[ACM_MSG_DATA_LENGTH]; struct rdma_ls_resolve_header resolve_header; struct nlattr attr[0]; struct acm_nl_path path[0]; }; }; static char def_prov_name[ACM_PROV_NAME_SIZE] = "ibacmp"; static LIST_HEAD(provider_list); static struct acmc_prov *def_provider = NULL; static LIST_HEAD(dev_list); static int listen_socket; static int ip_mon_socket; static struct acmc_client client_array[FD_SETSIZE - 1]; static FILE *flog; static pthread_mutex_t log_lock; static __thread char log_data[ACM_MAX_ADDRESS]; static atomic_t counter[ACM_MAX_COUNTER]; static struct acmc_device * acm_get_device_from_gid(union ibv_gid *sgid, uint8_t *port); static struct acmc_ep *acm_find_ep(struct acmc_port *port, uint16_t pkey); static int acm_ep_insert_addr(struct acmc_ep *ep, const char *name, uint8_t *addr, uint8_t addr_type); static void acm_event_handler(struct acmc_device *dev); static int acm_nl_send(int sock, struct acm_msg *msg); static struct sa_data { int timeout; int retries; int depth; pthread_t thread_id; struct pollfd *fds; struct acmc_port **ports; int nfds; } sa = { 2000, 2, 1, 0, NULL, NULL, 0}; /* * Service options - may be set through ibacm_opts.cfg file. 
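* Each option is a "name value" pair on its own line, with '#' starting * a comment, mirroring the acmp_set_options() parser in the provider. * Illustrative entries (option names shown here are examples only): * log_level 2 * server_port 6125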
*/ static const char *acme = IBACM_BIN_PATH "/ib_acme -A"; static const char *opts_file = ACM_CONF_DIR "/" ACM_OPTS_FILE; static const char *addr_file = ACM_CONF_DIR "/" ACM_ADDR_FILE; static char log_file[128] = IBACM_LOG_FILE; static int log_level = 0; static int umad_debug_level; static char lock_file[128] = IBACM_PID_FILE; static short server_port = 6125; static int server_mode = IBACM_SERVER_MODE_DEFAULT; static int acme_plus_kernel_only = IBACM_ACME_PLUS_KERNEL_ONLY_DEFAULT; static int support_ips_in_addr_cfg = 0; static char prov_lib_path[256] = IBACM_LIB_PATH; void acm_write(int level, const char *format, ...) { va_list args; struct timeval tv; struct tm tmtime; char buffer[20]; if (level > log_level) return; gettimeofday(&tv, NULL); localtime_r(&tv.tv_sec, &tmtime); strftime(buffer, 20, "%Y-%m-%dT%H:%M:%S", &tmtime); va_start(args, format); pthread_mutex_lock(&log_lock); fprintf(flog, "%s.%03u: ", buffer, (unsigned) (tv.tv_usec / 1000)); vfprintf(flog, format, args); fflush(flog); pthread_mutex_unlock(&log_lock); va_end(args); } void acm_format_name(int level, char *name, size_t name_size, uint8_t addr_type, const uint8_t *addr, size_t addr_size) { struct ibv_path_record *path; if (level > log_level) return; switch (addr_type) { case ACM_EP_INFO_NAME: memcpy(name, addr, addr_size); break; case ACM_EP_INFO_ADDRESS_IP: inet_ntop(AF_INET, addr, name, name_size); break; case ACM_EP_INFO_ADDRESS_IP6: case ACM_ADDRESS_GID: inet_ntop(AF_INET6, addr, name, name_size); break; case ACM_EP_INFO_PATH: path = (struct ibv_path_record *) addr; if (path->dlid) { snprintf(name, name_size, "SLID(%u) DLID(%u)", be16toh(path->slid), be16toh(path->dlid)); } else { acm_format_name(level, name, name_size, ACM_ADDRESS_GID, path->dgid.raw, sizeof path->dgid); } break; case ACM_ADDRESS_LID: snprintf(name, name_size, "LID(%u)", be16toh(*((__be16 *) addr))); break; default: strcpy(name, "Unknown"); break; } } int ib_any_gid(union ibv_gid *gid) { return ((gid->global.subnet_prefix | gid->global.interface_id) == 0); } const char *acm_get_opts_file(void) { return opts_file; } void acm_increment_counter(int type) { if (type >= 0 && type < ACM_MAX_COUNTER) atomic_inc(&counter[type]); } static struct acmc_prov_context * acm_alloc_prov_context(struct acm_provider *prov) { struct acmc_prov_context *ctx; ctx = calloc(1, sizeof(*ctx)); if (!ctx) { acm_log(0, "Error: failed to allocate prov context\n"); return NULL; } atomic_set(&ctx->refcnt, 1); ctx->prov = prov; return ctx; } static struct acmc_prov_context * acm_get_prov_context(struct list_head *list, struct acm_provider *prov) { struct acmc_prov_context *ctx; list_for_each(list, ctx, entry) { if (ctx->prov == prov) { return ctx; } } return NULL; } static struct acmc_prov_context * acm_acquire_prov_context(struct list_head *list, struct acm_provider *prov) { struct acmc_prov_context *ctx; ctx = acm_get_prov_context(list, prov); if (!ctx) { ctx = acm_alloc_prov_context(prov); if (!ctx) { acm_log(0, "Error -- failed to allocate provider context\n"); return NULL; } list_add_tail(list, &ctx->entry); } else { atomic_inc(&ctx->refcnt); } return ctx; } static void acm_release_prov_context(struct acmc_prov_context *ctx) { if (atomic_dec(&ctx->refcnt) <= 0) { list_del(&ctx->entry); free(ctx); } } uint8_t acm_gid_index(struct acm_port *port, union ibv_gid *gid) { uint8_t i; struct acmc_port *cport; cport = container_of(port, struct acmc_port, port); for (i = 0; i < cport->gid_cnt; i++) { if (!memcmp(&cport->gid_tbl[i], gid, sizeof (*gid))) break; } return i; } int 
acm_get_gid(struct acm_port *port, int index, union ibv_gid *gid) { struct acmc_port *cport; cport = container_of(port, struct acmc_port, port); if (index >= 0 && index < cport->gid_cnt) { *gid = cport->gid_tbl[index]; return 0; } else { return -1; } } static size_t acm_addr_len(uint8_t addr_type) { switch (addr_type) { case ACM_ADDRESS_NAME: return ACM_MAX_ADDRESS; case ACM_ADDRESS_IP: return sizeof(struct in_addr); case ACM_ADDRESS_IP6: return sizeof(struct in6_addr); case ACM_ADDRESS_GID: return sizeof(union ibv_gid); case ACM_ADDRESS_LID: return sizeof(uint16_t); default: acm_log(2, "illegal address type %d\n", addr_type); } return 0; } static int acm_addr_cmp(struct acm_address *acm_addr, uint8_t *addr, uint8_t addr_type) { if (acm_addr->type != addr_type) return -2; if (acm_addr->type == ACM_ADDRESS_NAME) return strncasecmp((char *) acm_addr->info.name, (char *) addr, acm_addr_len(acm_addr->type)); return memcmp(acm_addr->info.addr, addr, acm_addr_len(acm_addr->type)); } static void acm_mark_addr_invalid(struct acmc_ep *ep, struct acm_ep_addr_data *data) { int i; for (i = 0; i < ep->nmbr_ep_addrs; i++) { if (!acm_addr_cmp(&ep->addr_info[i].addr, data->info.addr, data->type)) { ep->addr_info[i].addr.type = ACM_ADDRESS_INVALID; ep->port->prov->remove_address(ep->addr_info[i].prov_addr_context); break; } } } static struct acm_address * acm_addr_lookup(const struct acm_endpoint *endpoint, uint8_t *addr, uint8_t addr_type) { struct acmc_ep *ep; int i; ep = container_of(endpoint, struct acmc_ep, endpoint); for (i = 0; i < ep->nmbr_ep_addrs; i++) if (!acm_addr_cmp(&ep->addr_info[i].addr, addr, addr_type)) return &ep->addr_info[i].addr; return NULL; } __be64 acm_path_comp_mask(struct ibv_path_record *path) { uint32_t fl_hop; uint16_t qos_sl; __be64 comp_mask = 0; acm_log(2, "\n"); if (path->service_id) comp_mask |= IB_COMP_MASK_PR_SERVICE_ID; if (!ib_any_gid(&path->dgid)) comp_mask |= IB_COMP_MASK_PR_DGID; if (!ib_any_gid(&path->sgid)) comp_mask |= IB_COMP_MASK_PR_SGID; if (path->dlid) comp_mask |= IB_COMP_MASK_PR_DLID; if (path->slid) comp_mask |= IB_COMP_MASK_PR_SLID; fl_hop = be32toh(path->flowlabel_hoplimit); if (fl_hop >> 8) comp_mask |= IB_COMP_MASK_PR_FLOW_LABEL; if (fl_hop & 0xFF) comp_mask |= IB_COMP_MASK_PR_HOP_LIMIT; if (path->tclass) comp_mask |= IB_COMP_MASK_PR_TCLASS; if (path->reversible_numpath & 0x80) comp_mask |= IB_COMP_MASK_PR_REVERSIBLE; if (path->pkey) comp_mask |= IB_COMP_MASK_PR_PKEY; qos_sl = be16toh(path->qosclass_sl); if (qos_sl >> 4) comp_mask |= IB_COMP_MASK_PR_QOS_CLASS; if (qos_sl & 0xF) comp_mask |= IB_COMP_MASK_PR_SL; if (path->mtu & 0xC0) comp_mask |= IB_COMP_MASK_PR_MTU_SELECTOR; if (path->mtu & 0x3F) comp_mask |= IB_COMP_MASK_PR_MTU; if (path->rate & 0xC0) comp_mask |= IB_COMP_MASK_PR_RATE_SELECTOR; if (path->rate & 0x3F) comp_mask |= IB_COMP_MASK_PR_RATE; if (path->packetlifetime & 0xC0) comp_mask |= IB_COMP_MASK_PR_PACKET_LIFETIME_SELECTOR; if (path->packetlifetime & 0x3F) comp_mask |= IB_COMP_MASK_PR_PACKET_LIFETIME; return comp_mask; } int acm_resolve_response(uint64_t id, struct acm_msg *msg) { struct acmc_client *client = &client_array[id]; int ret; acm_log(2, "client %d, status 0x%x\n", client->index, msg->hdr.status); if (msg->hdr.status == ACM_STATUS_ENODATA) atomic_inc(&counter[ACM_CNTR_NODATA]); else if (msg->hdr.status) atomic_inc(&counter[ACM_CNTR_ERROR]); pthread_mutex_lock(&client->lock); if (client->sock == -1) { acm_log(0, "ERROR - connection lost\n"); ret = ACM_STATUS_ENOTCONN; goto release; } if (id == NL_CLIENT_INDEX) ret = 
acm_nl_send(client->sock, msg); else ret = send(client->sock, (char *) msg, msg->hdr.length, 0); if (ret != msg->hdr.length) acm_log(0, "ERROR - failed to send response\n"); else ret = 0; release: pthread_mutex_unlock(&client->lock); (void) atomic_dec(&client->refcnt); return ret; } static int acmc_resolve_response(uint64_t id, struct acm_msg *req_msg, uint8_t status) { req_msg->hdr.opcode |= ACM_OP_ACK; req_msg->hdr.status = status; if (status != ACM_STATUS_SUCCESS) req_msg->hdr.length = ACM_MSG_HDR_LENGTH; memset(req_msg->hdr.data, 0, sizeof(req_msg->hdr.data)); return acm_resolve_response(id, req_msg); } int acm_query_response(uint64_t id, struct acm_msg *msg) { struct acmc_client *client = &client_array[id]; int ret; acm_log(2, "status 0x%x\n", msg->hdr.status); pthread_mutex_lock(&client->lock); if (client->sock == -1) { acm_log(0, "ERROR - connection lost\n"); ret = ACM_STATUS_ENOTCONN; goto release; } ret = send(client->sock, (char *) msg, msg->hdr.length, 0); if (ret != msg->hdr.length) acm_log(0, "ERROR - failed to send response\n"); else ret = 0; release: pthread_mutex_unlock(&client->lock); (void) atomic_dec(&client->refcnt); return ret; } static int acmc_query_response(uint64_t id, struct acm_msg *msg, uint8_t status) { acm_log(2, "status 0x%x\n", status); msg->hdr.opcode |= ACM_OP_ACK; msg->hdr.status = status; return acm_query_response(id, msg); } static void acm_init_server(void) { FILE *f; int i; for (i = 0; i < FD_SETSIZE - 1; i++) { pthread_mutex_init(&client_array[i].lock, NULL); client_array[i].index = i; client_array[i].sock = -1; atomic_init(&client_array[i].refcnt); } if (server_mode != IBACM_SERVER_MODE_UNIX) { f = fopen(IBACM_IBACME_PORT_FILE, "w"); if (f) { fprintf(f, "%hu\n", server_port); fclose(f); } else acm_log(0, "notice - cannot publish ibacm port number\n"); unlink(IBACM_PORT_FILE); if (!acme_plus_kernel_only) { if (symlink(IBACM_PORT_BASE, IBACM_PORT_FILE) != 0) acm_log(0, "notice - can't create port symlink\n"); } } else { unlink(IBACM_IBACME_PORT_FILE); unlink(IBACM_PORT_FILE); } } static int acm_listen(void) { union { struct sockaddr any; struct sockaddr_in inet; struct sockaddr_un unx; } addr; mode_t saved_mask; int ret, saved_errno; acm_log(2, "\n"); memset(&addr, 0, sizeof(addr)); if (server_mode == IBACM_SERVER_MODE_UNIX) { addr.any.sa_family = AF_UNIX; BUILD_ASSERT(sizeof(IBACM_IBACME_SERVER_PATH) <= sizeof(addr.unx.sun_path)); strcpy(addr.unx.sun_path, IBACM_IBACME_SERVER_PATH); listen_socket = socket(AF_UNIX, SOCK_STREAM, 0); if (listen_socket < 0) { acm_log(0, "ERROR - unable to allocate unix socket\n"); return errno; } unlink(addr.unx.sun_path); saved_mask = umask(0); ret = bind(listen_socket, &addr.any, sizeof(addr.unx)); saved_errno = errno; umask(saved_mask); if (ret) { acm_log(0, "ERROR - unable to bind listen socket '%s'\n", addr.unx.sun_path); return saved_errno; } unlink(IBACM_SERVER_PATH); if (!acme_plus_kernel_only) { if (symlink(IBACM_SERVER_BASE, IBACM_SERVER_PATH) != 0) { saved_errno = errno; acm_log(0, "notice - can't create symlink\n"); return saved_errno; } } } else { unlink(IBACM_IBACME_SERVER_PATH); unlink(IBACM_SERVER_PATH); listen_socket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP); if (listen_socket == -1) { acm_log(0, "ERROR - unable to allocate TCP socket\n"); return errno; } addr.any.sa_family = AF_INET; addr.inet.sin_port = htobe16(server_port); if (server_mode == IBACM_SERVER_MODE_LOOP) addr.inet.sin_addr.s_addr = htonl(INADDR_LOOPBACK); ret = bind(listen_socket, &addr.any, sizeof(addr.inet)); if (ret == -1) { 
acm_log(0, "ERROR - unable to bind listen socket\n"); return errno; } } ret = listen(listen_socket, 0); if (ret == -1) { acm_log(0, "ERROR - unable to start listen\n"); return errno; } acm_log(2, "listen active\n"); return 0; } /* Retrieve the listening socket from systemd. */ static int acm_listen_systemd(void) { int fd; int rc = sd_listen_fds(1); if (rc == -1) { fprintf(stderr, "sd_listen_fds failed %d\n", rc); return rc; } if (rc > 2) { fprintf(stderr, "sd_listen_fds returned %d fds, expected <= 2\n", rc); return -1; } for (fd = SD_LISTEN_FDS_START; fd != SD_LISTEN_FDS_START + rc; fd++) { if (sd_is_socket(fd, AF_NETLINK, SOCK_RAW, 0)) { /* ListenNetlink for RDMA_NL_GROUP_LS multicast * messages from the kernel */ if (client_array[NL_CLIENT_INDEX].sock != -1) { fprintf(stderr, "sd_listen_fds returned more than one netlink socket\n"); return -1; } client_array[NL_CLIENT_INDEX].sock = fd; /* systemd sets NONBLOCK on the netlink socket, while * we want blocking send to the kernel. */ if (set_fd_nonblock(fd, false)) { fprintf(stderr, "Unable to drop O_NOBLOCK on netlink socket"); return -1; } } else if (sd_is_socket(SD_LISTEN_FDS_START, AF_UNSPEC, SOCK_STREAM, 1)) { /* Socket for user space client communication */ if (listen_socket != -1) { fprintf(stderr, "sd_listen_fds returned more than one listening socket\n"); return -1; } listen_socket = fd; } else { fprintf(stderr, "sd_listen_fds socket is not a SOCK_STREAM/SOCK_NETLINK listening socket\n"); return -1; } } return 0; } static void acm_disconnect_client(struct acmc_client *client) { pthread_mutex_lock(&client->lock); shutdown(client->sock, SHUT_RDWR); close(client->sock); client->sock = -1; pthread_mutex_unlock(&client->lock); (void) atomic_dec(&client->refcnt); } static void acm_svr_accept(void) { int s; int i; acm_log(2, "\n"); s = accept(listen_socket, NULL, NULL); if (s == -1) { acm_log(0, "ERROR - failed to accept connection\n"); return; } for (i = 0; i < FD_SETSIZE - 1; i++) { if (i == NL_CLIENT_INDEX) continue; if (!atomic_get(&client_array[i].refcnt)) break; } if (i == FD_SETSIZE - 1) { acm_log(0, "ERROR - all connections busy - rejecting\n"); close(s); return; } client_array[i].sock = s; atomic_set(&client_array[i].refcnt, 1); acm_log(2, "assigned client %d\n", i); } static int acm_is_path_from_port(struct acmc_port *port, struct ibv_path_record *path) { uint8_t i; if (!ib_any_gid(&path->sgid)) { return (acm_gid_index(&port->port, &path->sgid) < port->gid_cnt); } if (path->slid) { return (port->lid == (be16toh(path->slid) & port->lid_mask)); } if (ib_any_gid(&path->dgid)) { return 1; } if (acm_gid_index(&port->port, &path->dgid) < port->gid_cnt) { return 1; } for (i = 0; i < port->gid_cnt; i++) { if (port->gid_tbl[i].global.subnet_prefix == path->dgid.global.subnet_prefix) { return 1; } } return 0; } static bool acm_same_partition(uint16_t pkey_a, uint16_t pkey_b) { acm_log(2, "pkey_a: 0x%04x pkey_b: 0x%04x\n", pkey_a, pkey_b); return ((pkey_a | IB_PKEY_FULL_MEMBER) == (pkey_b | IB_PKEY_FULL_MEMBER)); } static struct acmc_addr * acm_get_port_ep_address(struct acmc_port *port, struct acm_ep_addr_data *data) { struct acmc_ep *ep; struct acm_address *addr; int i; if (port->state != IBV_PORT_ACTIVE) return NULL; if (data->type == ACM_EP_INFO_PATH && !acm_is_path_from_port(port, &data->info.path)) return NULL; list_for_each(&port->ep_list, ep, entry) { if ((data->type == ACM_EP_INFO_PATH) && (!data->info.path.pkey || acm_same_partition(be16toh(data->info.path.pkey), ep->endpoint.pkey))) { for (i = 0; i < ep->nmbr_ep_addrs; i++) { if 
(ep->addr_info[i].addr.type) return &ep->addr_info[i]; } return NULL; } if ((addr = acm_addr_lookup(&ep->endpoint, data->info.addr, (uint8_t) data->type))) return container_of(addr, struct acmc_addr, addr); } return NULL; } static struct acmc_addr *acm_get_ep_address(struct acm_ep_addr_data *data) { struct acmc_device *dev; struct acmc_addr *addr; int i; acm_format_name(2, log_data, sizeof log_data, data->type, data->info.addr, sizeof data->info.addr); acm_log(2, "%s\n", log_data); list_for_each(&dev_list, dev, entry) { for (i = 0; i < dev->port_cnt; i++) { addr = acm_get_port_ep_address(&dev->port[i], data); if (addr) return addr; } } acm_format_name(0, log_data, sizeof log_data, data->type, data->info.addr, sizeof data->info.addr); acm_log(1, "notice - could not find %s\n", log_data); return NULL; } /* If port_num is zero, iterate through all ports, otherwise consider * only the specific port_num */ static struct acmc_ep *acm_get_ep(int index, uint8_t port_num) { struct acmc_device *dev; struct acmc_ep *ep; int i, inx = 0; acm_log(2, "ep index %d\n", index); list_for_each(&dev_list, dev, entry) { for (i = 0; i < dev->port_cnt; i++) { if (port_num && port_num != (i + 1)) continue; if (dev->port[i].state != IBV_PORT_ACTIVE) continue; list_for_each(&dev->port[i].ep_list, ep, entry) { if (index == inx) return ep; ++inx; } } } acm_log(1, "notice - could not find ep %d\n", index); return NULL; } static int acm_svr_query_path(struct acmc_client *client, struct acm_msg *msg) { struct acmc_addr *addr; struct acmc_ep *ep; acm_log(2, "client %d\n", client->index); if (msg->hdr.length != ACM_MSG_HDR_LENGTH + ACM_MSG_EP_LENGTH) { acm_log(0, "ERROR - invalid length: 0x%x\n", msg->hdr.length); return acmc_query_response(client->index, msg, ACM_STATUS_EINVAL); } addr = acm_get_ep_address(&msg->resolve_data[0]); if (!addr) { acm_log(1, "notice - could not find local end point address\n"); return acmc_query_response(client->index, msg, ACM_STATUS_ESRCADDR); } ep = container_of(addr->addr.endpoint, struct acmc_ep, endpoint); return ep->port->prov->query(addr->prov_addr_context, msg, client->index); } static int acm_svr_select_src(struct acm_ep_addr_data *src, struct acm_ep_addr_data *dst) { union socket_addr addr; socklen_t len; int ret; int s; acm_log(2, "selecting source address\n"); memset(&addr, 0, sizeof addr); switch (dst->type) { case ACM_EP_INFO_ADDRESS_IP: addr.sin.sin_family = AF_INET; memcpy(&addr.sin.sin_addr, dst->info.addr, 4); len = sizeof(struct sockaddr_in); break; case ACM_EP_INFO_ADDRESS_IP6: addr.sin6.sin6_family = AF_INET6; memcpy(&addr.sin6.sin6_addr, dst->info.addr, 16); len = sizeof(struct sockaddr_in6); break; default: acm_log(1, "notice - bad destination type, cannot lookup source\n"); return ACM_STATUS_EDESTTYPE; } s = socket(addr.sa.sa_family, SOCK_DGRAM, IPPROTO_UDP); if (s == -1) { acm_log(0, "ERROR - unable to allocate socket\n"); return errno; } ret = connect(s, &addr.sa, len); if (ret) { acm_log(0, "ERROR - unable to connect socket\n"); ret = errno; goto out; } ret = getsockname(s, &addr.sa, &len); if (ret) { acm_log(0, "ERROR - failed to get socket address\n"); ret = errno; goto out; } src->type = dst->type; src->flags = ACM_EP_FLAG_SOURCE; if (dst->type == ACM_EP_INFO_ADDRESS_IP) { memcpy(&src->info.addr, &addr.sin.sin_addr, 4); } else { memcpy(&src->info.addr, &addr.sin6.sin6_addr, 16); } out: close(s); return ret; } /* * Verify the resolve message from the client and return * references to the source and destination addresses. 
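 * (Illustrative example: a client resolving an IPv4 destination sends a
 * single acm_ep_addr_data entry with type ACM_EP_INFO_ADDRESS_IP and
 * ACM_EP_FLAG_DEST set; hdr.src_out then remains 1, hdr.src_index is
 * pointed at the spare trailing buffer, and acm_svr_select_src() later
 * fills that buffer in via a connected UDP socket.)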
* The message buffer contains extra address data buffers. If a * source address is not given, reference an empty address buffer, * and we'll resolve a source address later. Record the location of * the source and destination addresses in the message header data * to avoid further searches. */ static uint8_t acm_svr_verify_resolve(struct acm_msg *msg) { int i, cnt, have_dst = 0; if (msg->hdr.length < ACM_MSG_HDR_LENGTH) { acm_log(0, "ERROR - invalid msg hdr length %d\n", msg->hdr.length); return ACM_STATUS_EINVAL; } msg->hdr.src_out = 1; cnt = (msg->hdr.length - ACM_MSG_HDR_LENGTH) / ACM_MSG_EP_LENGTH; for (i = 0; i < cnt; i++) { if (msg->resolve_data[i].flags & ACM_EP_FLAG_SOURCE) { if (!msg->hdr.src_out) { acm_log(0, "ERROR - multiple sources specified\n"); return ACM_STATUS_ESRCADDR; } if (!msg->resolve_data[i].type || (msg->resolve_data[i].type >= ACM_ADDRESS_RESERVED)) { acm_log(0, "ERROR - unsupported source address type\n"); return ACM_STATUS_ESRCTYPE; } msg->hdr.src_out = 0; msg->hdr.src_index = i; } if (msg->resolve_data[i].flags & ACM_EP_FLAG_DEST) { if (have_dst) { acm_log(0, "ERROR - multiple destinations specified\n"); return ACM_STATUS_EDESTADDR; } if (!msg->resolve_data[i].type || (msg->resolve_data[i].type >= ACM_ADDRESS_RESERVED)) { acm_log(0, "ERROR - unsupported destination address type\n"); return ACM_STATUS_EDESTTYPE; } have_dst = 1; msg->hdr.dst_index = i; } } if (!have_dst) { acm_log(0, "ERROR - destination address required\n"); return ACM_STATUS_EDESTTYPE; } if (msg->hdr.src_out) { msg->hdr.src_index = i; memset(&msg->resolve_data[i], 0, sizeof(struct acm_ep_addr_data)); } return ACM_STATUS_SUCCESS; } static int acm_svr_resolve_dest(struct acmc_client *client, struct acm_msg *msg) { struct acmc_addr *addr; struct acmc_ep *ep; struct acm_ep_addr_data *saddr, *daddr; uint8_t status; acm_log(2, "client %d\n", client->index); status = acm_svr_verify_resolve(msg); if (status) { acm_log(0, "notice - misformatted or unsupported request\n"); return acmc_resolve_response(client->index, msg, status); } saddr = &msg->resolve_data[msg->hdr.src_index]; daddr = &msg->resolve_data[msg->hdr.dst_index]; if (msg->hdr.src_out) { status = acm_svr_select_src(saddr, daddr); if (status) { acm_log(0, "notice - unable to select suitable source address\n"); return acmc_resolve_response(client->index, msg, status); } } acm_format_name(2, log_data, sizeof log_data, saddr->type, saddr->info.addr, sizeof saddr->info.addr); acm_log(2, "src %s\n", log_data); addr = acm_get_ep_address(saddr); if (!addr) { acm_log(0, "notice - unknown local end point address\n"); return acmc_resolve_response(client->index, msg, ACM_STATUS_ESRCADDR); } ep = container_of(addr->addr.endpoint, struct acmc_ep, endpoint); return ep->port->prov->resolve(addr->prov_addr_context, msg, client->index); } /* * The message buffer contains extra address data buffers. We extract the * destination address from the path record into an extra buffer, so we can * lookup the destination by either LID or GID. 
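 *
 * Illustrative example: a request whose path record carries a DLID can be
 * matched by LID, one carrying only a DGID (DLID zero) by GID, and a
 * record with neither is rejected below with ACM_STATUS_EDESTADDR.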
*/ static int acm_svr_resolve_path(struct acmc_client *client, struct acm_msg *msg) { struct acmc_addr *addr; struct acmc_ep *ep; struct ibv_path_record *path; acm_log(2, "client %d\n", client->index); if (msg->hdr.length < (ACM_MSG_HDR_LENGTH + ACM_MSG_EP_LENGTH)) { acm_log(0, "notice - invalid msg hdr length %d\n", msg->hdr.length); return acmc_resolve_response(client->index, msg, ACM_STATUS_EINVAL); } path = &msg->resolve_data[0].info.path; if (!path->dlid && ib_any_gid(&path->dgid)) { acm_log(0, "notice - no destination specified\n"); return acmc_resolve_response(client->index, msg, ACM_STATUS_EDESTADDR); } acm_format_name(2, log_data, sizeof log_data, ACM_EP_INFO_PATH, msg->resolve_data[0].info.addr, sizeof *path); acm_log(2, "path %s\n", log_data); addr = acm_get_ep_address(&msg->resolve_data[0]); if (!addr) { acm_log(0, "notice - unknown local end point address\n"); return acmc_resolve_response(client->index, msg, ACM_STATUS_ESRCADDR); } ep = container_of(addr->addr.endpoint, struct acmc_ep, endpoint); return ep->port->prov->resolve(addr->prov_addr_context, msg, client->index); } static int acm_svr_resolve(struct acmc_client *client, struct acm_msg *msg) { (void) atomic_inc(&client->refcnt); if (msg->resolve_data[0].type == ACM_EP_INFO_PATH) { if (msg->resolve_data[0].flags & ACM_FLAGS_QUERY_SA) { return acm_svr_query_path(client, msg); } else { return acm_svr_resolve_path(client, msg); } } else { return acm_svr_resolve_dest(client, msg); } } static int acm_svr_perf_query(struct acmc_client *client, struct acm_msg *msg) { int ret, i; uint16_t len; struct acmc_addr *addr; struct acmc_ep *ep = NULL; int index; acm_log(2, "client %d\n", client->index); index = msg->hdr.src_index; msg->hdr.opcode |= ACM_OP_ACK; msg->hdr.status = ACM_STATUS_SUCCESS; msg->hdr.dst_index = 0; if ((be16toh(msg->hdr.length) < (ACM_MSG_HDR_LENGTH + ACM_MSG_EP_LENGTH) && index < 1) || ((be16toh(msg->hdr.length) >= (ACM_MSG_HDR_LENGTH + ACM_MSG_EP_LENGTH) && !(msg->resolve_data[0].flags & ACM_EP_FLAG_SOURCE)))) { for (i = 0; i < ACM_MAX_COUNTER; i++) msg->perf_data[i] = htobe64((uint64_t) atomic_get(&counter[i])); msg->hdr.src_out = ACM_MAX_COUNTER; len = ACM_MSG_HDR_LENGTH + (ACM_MAX_COUNTER * sizeof(uint64_t)); } else { if (index >= 1) { ep = acm_get_ep(index - 1, msg->hdr.src_index); } else { addr = acm_get_ep_address(&msg->resolve_data[0]); if (addr) ep = container_of(addr->addr.endpoint, struct acmc_ep, endpoint); } if (ep) { ep->port->prov->query_perf(ep->prov_ep_context, msg->perf_data, &msg->hdr.src_out); len = ACM_MSG_HDR_LENGTH + (msg->hdr.src_out * sizeof(uint64_t)); } else { msg->hdr.status = ACM_STATUS_ESRCADDR; len = ACM_MSG_HDR_LENGTH; } } msg->hdr.length = htobe16(len); ret = send(client->sock, (char *) msg, len, 0); if (ret != len) acm_log(0, "ERROR - failed to send response\n"); else ret = 0; return ret; } static int may_be_realloc(struct acm_msg **msg_ptr, int len, int cnt, int *cur_msg_siz_ptr, int max_msg_siz) { /* Check if a new address exceeds the protocol constrained max size */ if (len + (cnt + 1) * ACM_MAX_ADDRESS > max_msg_siz) { acm_log(0, "ERROR - unable to amend more addresses to acm_msg due to protocol constraints\n"); return ENOMEM; } /* Check if a new address exceeds current size of msg */ if (len + (cnt + 1) * ACM_MAX_ADDRESS > *cur_msg_siz_ptr) { const size_t chunk_size = 16 * ACM_MAX_ADDRESS; struct acm_msg *new_msg = realloc(*msg_ptr, *cur_msg_siz_ptr + chunk_size); if (!new_msg) { acm_log(0, "ERROR - failed to allocate longer acm_msg\n"); return ENOMEM; } *msg_ptr = new_msg; 
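/* Growth arithmetic (illustrative): assuming the usual ACM_MAX_ADDRESS of
 * 64 bytes, each realloc above grows the message by room for 16 more
 * addresses (1 KiB), while the protocol check above keeps the reply within
 * the 16-bit hdr.length wire limit (USHRT_MAX). */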
*cur_msg_siz_ptr += chunk_size; } return 0; } static int acm_svr_ep_query(struct acmc_client *client, struct acm_msg **_msg) { int sts; int ret, i; uint16_t len; struct acmc_ep *ep; int index, cnt = 0; struct acm_msg *msg = *_msg; int cur_msg_siz = sizeof(*msg); int max_msg_siz = USHRT_MAX; acm_log(2, "client %d\n", client->index); index = msg->hdr.src_out; ep = acm_get_ep(index - 1, msg->hdr.src_index); if (ep) { msg->hdr.status = ACM_STATUS_SUCCESS; msg->ep_data.dev_guid = ep->port->dev->device.dev_guid; msg->ep_data.port_num = ep->port->port.port_num; msg->ep_data.phys_port_cnt = ep->port->dev->port_cnt; msg->ep_data.pkey = htobe16(ep->endpoint.pkey); strncpy((char *)msg->ep_data.prov_name, ep->port->prov->name, ACM_MAX_PROV_NAME - 1); msg->ep_data.prov_name[ACM_MAX_PROV_NAME - 1] = '\0'; len = ACM_MSG_HDR_LENGTH + sizeof(struct acm_ep_config_data); for (i = 0; i < ep->nmbr_ep_addrs; i++) { if (ep->addr_info[i].addr.type != ACM_ADDRESS_INVALID) { sts = may_be_realloc(_msg, len, cnt, &cur_msg_siz, max_msg_siz); msg = *_msg; if (sts) break; memcpy(msg->ep_data.addrs[cnt++].name, ep->addr_info[i].string_buf, ACM_MAX_ADDRESS); } } msg->ep_data.addr_cnt = htobe16(cnt); len += cnt * ACM_MAX_ADDRESS; } else { msg->hdr.status = ACM_STATUS_EINVAL; len = ACM_MSG_HDR_LENGTH; } msg->hdr.opcode |= ACM_OP_ACK; msg->hdr.src_index = 0; msg->hdr.dst_index = 0; msg->hdr.length = htobe16(len); ret = send(client->sock, (char *) msg, len, 0); if (ret != len) acm_log(0, "ERROR - failed to send response\n"); else ret = 0; return ret; } static int acm_msg_length(struct acm_msg *msg) { return (msg->hdr.opcode == ACM_OP_RESOLVE) ? msg->hdr.length : be16toh(msg->hdr.length); } static void acm_svr_receive(struct acmc_client *client) { struct acm_msg *msg = malloc(sizeof(*msg)); int ret; if (!msg) { acm_log(0, "ERROR - Unable to alloc acm_msg\n"); ret = ENOMEM; goto out; } acm_log(2, "client %d\n", client->index); ret = recv(client->sock, (char *)msg, sizeof(*msg), 0); if (ret <= 0 || ret != acm_msg_length(msg)) { acm_log(2, "client disconnected\n"); ret = ACM_STATUS_ENOTCONN; goto out; } if (msg->hdr.version != ACM_VERSION) { acm_log(0, "ERROR - unsupported version %d\n", msg->hdr.version); goto out; } switch (msg->hdr.opcode & ACM_OP_MASK) { case ACM_OP_RESOLVE: atomic_inc(&counter[ACM_CNTR_RESOLVE]); ret = acm_svr_resolve(client, msg); break; case ACM_OP_PERF_QUERY: ret = acm_svr_perf_query(client, msg); break; case ACM_OP_EP_QUERY: ret = acm_svr_ep_query(client, &msg); break; default: acm_log(0, "ERROR - unknown opcode 0x%x\n", msg->hdr.opcode); break; } out: free(msg); if (ret) acm_disconnect_client(client); } static int acm_nl_to_addr_data(struct acm_ep_addr_data *ad, int af_family, uint8_t *addr, size_t addr_len) { if (addr_len > ACM_MAX_ADDRESS) return EINVAL; /* find the ep associated with this address "if any" */ switch (af_family) { case AF_INET: ad->type = ACM_ADDRESS_IP; break; case AF_INET6: ad->type = ACM_ADDRESS_IP6; break; default: return EINVAL; } memcpy(&ad->info.addr, addr, addr_len); return 0; } static void acm_add_ep_ip(char *ifname, struct acm_ep_addr_data *data, char *ip_str) { struct acmc_ep *ep; struct acmc_device *dev; uint8_t port_num; uint16_t pkey; union ibv_gid sgid; struct acmc_addr *addr; addr = acm_get_ep_address(data); if (addr) { acm_log(1, "Address '%s' already available\n", ip_str); return; } if (acm_if_get_sgid(ifname, &sgid)) return; dev = acm_get_device_from_gid(&sgid, &port_num); if (!dev) return; if (acm_if_get_pkey(ifname, &pkey)) return; acm_log(0, " %s\n", ip_str); ep = 
acm_find_ep(&dev->port[port_num - 1], pkey); if (ep) { if (acm_ep_insert_addr(ep, ip_str, data->info.addr, data->type)) acm_log(0, "Failed to add '%s' to EP\n", ip_str); } else { acm_log(0, "Failed to add '%s' no EP for pkey\n", ip_str); } } static void acm_rm_ep_ip(struct acm_ep_addr_data *data) { struct acmc_ep *ep; struct acmc_addr *addr; addr = acm_get_ep_address(data); if (addr) { ep = container_of(addr->addr.endpoint, struct acmc_ep, endpoint); acm_format_name(0, log_data, sizeof log_data, data->type, data->info.addr, sizeof data->info.addr); acm_log(0, " %s\n", log_data); acm_mark_addr_invalid(ep, data); } } static int acm_ipnl_create(void) { struct sockaddr_nl addr; if ((ip_mon_socket = socket(PF_NETLINK, SOCK_RAW | SOCK_NONBLOCK, NETLINK_ROUTE)) == -1) { acm_log(0, "Failed to open NETLINK_ROUTE socket"); return EIO; } memset(&addr, 0, sizeof(addr)); addr.nl_family = AF_NETLINK; addr.nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR; if (bind(ip_mon_socket, (struct sockaddr *)&addr, sizeof(addr)) == -1) { acm_log(0, "Failed to bind NETLINK_ROUTE socket"); return EIO; } return 0; } static void acm_ip_iter_cb(char *ifname, union ibv_gid *gid, uint16_t pkey, uint8_t addr_type, uint8_t *addr, char *ip_str, void *ctx) { int ret = EINVAL; struct acmc_device *dev; struct acmc_ep *ep; uint8_t port_num; char gid_str[INET6_ADDRSTRLEN]; dev = acm_get_device_from_gid(gid, &port_num); if (dev) { ep = acm_find_ep(&dev->port[port_num - 1], pkey); if (ep) ret = acm_ep_insert_addr(ep, ip_str, addr, addr_type); } if (ret) { inet_ntop(AF_INET6, gid->raw, gid_str, sizeof(gid_str)); acm_log(0, "Failed to add '%s' (gid %s; pkey 0x%x)\n", ip_str, gid_str, pkey); } } /* Netlink updates have indicated a failure which means we are no longer in * sync. This should be a rare condition so we handle this with a "big * hammer" by clearing and re-reading all the system IP's. 
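 *
 * Concretely, resync_system_ips() below first marks every
 * ACM_ADDRESS_IP/ACM_ADDRESS_IP6 entry on every endpoint as
 * ACM_ADDRESS_INVALID, then acm_if_iter_sys() re-walks the system
 * interfaces and re-adds each address via acm_ip_iter_cb(), bringing the
 * endpoint tables back in sync with the kernel's view.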
*/ static int resync_system_ips(void) { struct acmc_device *dev; struct acmc_port *port; struct acmc_ep *ep; int i, cnt; acm_log(0, "Resyncing all IP's\n"); /* mark all IP's invalid */ list_for_each(&dev_list, dev, entry) { for (cnt = 0; cnt < dev->port_cnt; cnt++) { port = &dev->port[cnt]; list_for_each(&port->ep_list, ep, entry) { for (i = 0; i < ep->nmbr_ep_addrs; i++) { if (ep->addr_info[i].addr.type == ACM_ADDRESS_IP || ep->addr_info[i].addr.type == ACM_ADDRESS_IP6) ep->addr_info[i].addr.type = ACM_ADDRESS_INVALID; } } } } return acm_if_iter_sys(acm_ip_iter_cb, NULL); } static void acm_ipnl_handler(void) { int len; char buffer[NL_MSG_BUF_SIZE]; struct nlmsghdr *nlh; char ifname[IFNAMSIZ]; char ip_str[INET6_ADDRSTRLEN]; struct acm_ep_addr_data ad; while ((len = recv(ip_mon_socket, buffer, NL_MSG_BUF_SIZE, 0)) > 0) { nlh = (struct nlmsghdr *)buffer; while ((NLMSG_OK(nlh, len)) && (nlh->nlmsg_type != NLMSG_DONE)) { struct ifaddrmsg *ifa = (struct ifaddrmsg *) NLMSG_DATA(nlh); struct ifinfomsg *ifi = (struct ifinfomsg *) NLMSG_DATA(nlh); struct rtattr *rth = IFA_RTA(ifa); int rtl = IFA_PAYLOAD(nlh); switch (nlh->nlmsg_type) { case RTM_NEWADDR: if_indextoname(ifa->ifa_index, ifname); while (rtl && RTA_OK(rth, rtl)) { if (rth->rta_type == IFA_LOCAL) { acm_log(1, "New system address available %s : %s\n", ifname, inet_ntop(ifa->ifa_family, RTA_DATA(rth), ip_str, sizeof(ip_str))); if (!acm_nl_to_addr_data(&ad, ifa->ifa_family, RTA_DATA(rth), RTA_PAYLOAD(rth))) { acm_add_ep_ip(ifname, &ad, ip_str); } } rth = RTA_NEXT(rth, rtl); } break; case RTM_DELADDR: if_indextoname(ifa->ifa_index, ifname); while (rtl && RTA_OK(rth, rtl)) { if (rth->rta_type == IFA_LOCAL) { acm_log(1, "System address removed %s : %s\n", ifname, inet_ntop(ifa->ifa_family, RTA_DATA(rth), ip_str, sizeof(ip_str))); if (!acm_nl_to_addr_data(&ad, ifa->ifa_family, RTA_DATA(rth), RTA_PAYLOAD(rth))) { acm_rm_ep_ip(&ad); } } rth = RTA_NEXT(rth, rtl); } break; case RTM_NEWLINK: acm_log(2, "Link added : %s\n", if_indextoname(ifi->ifi_index, ifname)); break; case RTM_DELLINK: acm_log(2, "Link removed : %s\n", if_indextoname(ifi->ifi_index, ifname)); break; default: acm_log(2, "unknown netlink message\n"); break; } nlh = NLMSG_NEXT(nlh, len); } } if (len < 0 && errno == ENOBUFS) { acm_log(0, "ENOBUFS returned from netlink...\n"); resync_system_ips(); } } static int acm_nl_send(int sock, struct acm_msg *msg) { struct sockaddr_nl dst_addr; struct acm_nl_msg acmnlmsg; struct acm_nl_msg *orig; int ret; int datalen; orig = (struct acm_nl_msg *)(uintptr_t)msg->hdr.tid; memset(&dst_addr, 0, sizeof(dst_addr)); dst_addr.nl_family = AF_NETLINK; dst_addr.nl_groups = (1 << (RDMA_NL_GROUP_LS - 1)); memset(&acmnlmsg, 0, sizeof(acmnlmsg)); acmnlmsg.nlmsg_header.nlmsg_len = NLMSG_HDRLEN; acmnlmsg.nlmsg_header.nlmsg_pid = getpid(); acmnlmsg.nlmsg_header.nlmsg_type = orig->nlmsg_header.nlmsg_type; acmnlmsg.nlmsg_header.nlmsg_seq = orig->nlmsg_header.nlmsg_seq; if (msg->hdr.status != ACM_STATUS_SUCCESS) { acm_log(2, "acm status no success = %d\n", msg->hdr.status); acmnlmsg.nlmsg_header.nlmsg_flags |= RDMA_NL_LS_F_ERR; } else { acm_log(2, "acm status success\n"); acmnlmsg.nlmsg_header.nlmsg_len += NLA_ALIGN(sizeof(struct acm_nl_path)); acmnlmsg.path[0].attr_hdr.nla_type = LS_NLA_TYPE_PATH_RECORD; acmnlmsg.path[0].attr_hdr.nla_len = sizeof(struct acm_nl_path); if (orig->resolve_header.path_use == LS_RESOLVE_PATH_USE_UNIDIRECTIONAL) acmnlmsg.path[0].rec.flags = IB_PATH_PRIMARY | IB_PATH_OUTBOUND; else acmnlmsg.path[0].rec.flags = IB_PATH_PRIMARY | IB_PATH_GMP 
| IB_PATH_BIDIRECTIONAL; memcpy(acmnlmsg.path[0].rec.path_rec, &msg->resolve_data[0].info.path, sizeof(struct ibv_path_record)); } datalen = NLMSG_ALIGN(acmnlmsg.nlmsg_header.nlmsg_len); ret = sendto(sock, &acmnlmsg, datalen, 0, (const struct sockaddr *)&dst_addr, (socklen_t)sizeof(dst_addr)); if (ret != datalen) { acm_log(0, "ERROR - sendto = %d errno = %d\n", ret, errno); ret = -1; } else { ret = msg->hdr.length; } free(orig); return ret; } #define NLA_LEN(nla) ((nla)->nla_len - NLA_HDRLEN) #define NLA_DATA(nla) ((char *)(nla) + NLA_HDRLEN) static int acm_nl_parse_path_attr(struct nlattr *attr, struct acm_ep_addr_data *data) { struct ibv_path_record *path; uint64_t *sid; struct rdma_nla_ls_gid *gid; uint8_t *tcl; uint16_t *pkey; uint16_t *qos; uint16_t val; int ret = 0; #define IBV_PATH_RECORD_QOS_MASK 0xfff0 path = &data->info.path; switch (attr->nla_type & RDMA_NLA_TYPE_MASK) { case LS_NLA_TYPE_SERVICE_ID: sid = (uint64_t *) NLA_DATA(attr); if (NLA_LEN(attr) == sizeof(*sid)) { acm_log(2, "service_id 0x%" PRIx64 "\n", *sid); path->service_id = htobe64(*sid); } else { ret = -1; } break; case LS_NLA_TYPE_DGID: gid = (struct rdma_nla_ls_gid *) NLA_DATA(attr); if (NLA_LEN(attr) == sizeof(gid->gid)) { acm_format_name(2, log_data, sizeof(log_data), ACM_ADDRESS_GID, gid->gid, sizeof(union ibv_gid)); acm_log(2, "path dgid %s\n", log_data); memcpy(path->dgid.raw, gid->gid, sizeof(path->dgid)); data->flags |= ACM_EP_FLAG_DEST; } else { ret = -1; } break; case LS_NLA_TYPE_SGID: gid = (struct rdma_nla_ls_gid *) NLA_DATA(attr); if (NLA_LEN(attr) == sizeof(gid->gid)) { acm_format_name(2, log_data, sizeof(log_data), ACM_ADDRESS_GID, gid->gid, sizeof(union ibv_gid)); acm_log(2, "path sgid %s\n", log_data); memcpy(path->sgid.raw, gid->gid, sizeof(path->sgid)); data->flags |= ACM_EP_FLAG_SOURCE; } else { ret = -1; } break; case LS_NLA_TYPE_TCLASS: tcl = (uint8_t *) NLA_DATA(attr); if (NLA_LEN(attr) == sizeof(*tcl)) { acm_log(2, "tclass 0x%x\n", *tcl); path->tclass = *tcl; } else { ret = -1; } break; case LS_NLA_TYPE_PKEY: pkey = (uint16_t *) NLA_DATA(attr); if (NLA_LEN(attr) == sizeof(*pkey)) { acm_log(2, "pkey 0x%x\n", *pkey); path->pkey = htobe16(*pkey); } else { ret = -1; } break; case LS_NLA_TYPE_QOS_CLASS: qos = (uint16_t *) NLA_DATA(attr); if (NLA_LEN(attr) == sizeof(*qos)) { acm_log(2, "qos_class 0x%x\n", *qos); val = be16toh(path->qosclass_sl); val &= ~IBV_PATH_RECORD_QOS_MASK; val |= (*qos & IBV_PATH_RECORD_QOS_MASK); path->qosclass_sl = htobe16(val); } else { ret = -1; } break; default: acm_log(1, "WARN: unknown attr %x\n", attr->nla_type); /* We can not ignore a mandatory attribute */ if (attr->nla_type & RDMA_NLA_F_MANDATORY) ret = -1; break; } return ret; } static void acm_nl_process_invalid_request(struct acmc_client *client, struct acm_nl_msg *acmnlmsg) { struct acm_msg msg; memset(&msg, 0, sizeof(msg)); msg.hdr.opcode = ACM_OP_RESOLVE; msg.hdr.version = ACM_VERSION; msg.hdr.length = ACM_MSG_HDR_LENGTH; msg.hdr.status = ACM_STATUS_EINVAL; msg.hdr.tid = (uintptr_t) acmnlmsg; acm_nl_send(client->sock, &msg); } static void acm_nl_process_resolve(struct acmc_client *client, struct acm_nl_msg *acmnlmsg) { struct acm_msg msg; struct nlattr *attr; int payload_len; int resolve_hdr_len; int rem; int total_attr_len; int status; unsigned char *data; memset(&msg, 0, sizeof(msg)); msg.hdr.opcode = ACM_OP_RESOLVE; msg.hdr.version = ACM_VERSION; msg.hdr.length = ACM_MSG_HDR_LENGTH + ACM_MSG_EP_LENGTH; msg.hdr.status = ACM_STATUS_SUCCESS; msg.hdr.tid = (uintptr_t) acmnlmsg; msg.resolve_data[0].type = 
ACM_EP_INFO_PATH; /* We support only one pathrecord */ acm_log(2, "path use 0x%x\n", acmnlmsg->resolve_header.path_use); if (acmnlmsg->resolve_header.path_use == LS_RESOLVE_PATH_USE_UNIDIRECTIONAL) msg.resolve_data[0].info.path.reversible_numpath = 1; else msg.resolve_data[0].info.path.reversible_numpath = IBV_PATH_RECORD_REVERSIBLE | 1; data = (unsigned char *) &acmnlmsg->nlmsg_header + NLMSG_HDRLEN; resolve_hdr_len = NLMSG_ALIGN(sizeof(struct rdma_ls_resolve_header)); attr = (struct nlattr *) (data + resolve_hdr_len); payload_len = acmnlmsg->nlmsg_header.nlmsg_len - NLMSG_HDRLEN - resolve_hdr_len; rem = payload_len; while (1) { if (rem < (int) sizeof(*attr) || attr->nla_len < sizeof(*attr) || attr->nla_len > rem) break; status = acm_nl_parse_path_attr(attr, &msg.resolve_data[0]); if (status) { acm_nl_process_invalid_request(client, acmnlmsg); return; } /* Next attribute */ total_attr_len = NLA_ALIGN(attr->nla_len); rem -= total_attr_len; attr = (struct nlattr *) ((char *) attr + total_attr_len); } atomic_inc(&counter[ACM_CNTR_RESOLVE]); acm_svr_resolve(client, &msg); } static int acm_nl_is_valid_resolve_request(struct acm_nl_msg *acmnlmsg) { int payload_len; payload_len = acmnlmsg->nlmsg_header.nlmsg_len - NLMSG_HDRLEN; if (payload_len < (sizeof(struct rdma_ls_resolve_header) + sizeof(struct nlattr))) return 0; return 1; } static void acm_nl_receive(struct acmc_client *client) { struct acm_nl_msg *acmnlmsg; int datalen = sizeof(*acmnlmsg); int ret; uint16_t client_inx, op; acmnlmsg = calloc(1, sizeof(*acmnlmsg)); if (!acmnlmsg) { acm_log(0, "Out of memory for recving nl msg.\n"); return; } ret = recv(client->sock, acmnlmsg, datalen, 0); if (!NLMSG_OK(&acmnlmsg->nlmsg_header, ret)) { acm_log(0, "Netlink receive error: %d.\n", ret); goto rcv_cleanup; } acm_log(2, "nlmsg: len %d type 0x%x flags 0x%x seq %d pid %d\n", acmnlmsg->nlmsg_header.nlmsg_len, acmnlmsg->nlmsg_header.nlmsg_type, acmnlmsg->nlmsg_header.nlmsg_flags, acmnlmsg->nlmsg_header.nlmsg_seq, acmnlmsg->nlmsg_header.nlmsg_pid); /* Currently we handle only request from the local service client */ client_inx = RDMA_NL_GET_CLIENT(acmnlmsg->nlmsg_header.nlmsg_type); op = RDMA_NL_GET_OP(acmnlmsg->nlmsg_header.nlmsg_type); if (client_inx != RDMA_NL_LS) { acm_log_once(0, "ERROR - Unknown NL client ID (%d)\n", client_inx); goto rcv_cleanup; } switch (op) { case RDMA_NL_LS_OP_RESOLVE: if (acm_nl_is_valid_resolve_request(acmnlmsg)) acm_nl_process_resolve(client, acmnlmsg); else acm_nl_process_invalid_request(client, acmnlmsg); break; default: /* Not supported*/ acm_log_once(0, "WARN - invalid opcode %x\n", op); acm_nl_process_invalid_request(client, acmnlmsg); break; } return; rcv_cleanup: free(acmnlmsg); } static int acm_init_nl(void) { struct sockaddr_nl src_addr; int ret; int nl_rcv_socket; nl_rcv_socket = socket(PF_NETLINK, SOCK_RAW, NETLINK_RDMA); if (nl_rcv_socket == -1) { acm_log(0, "ERROR - unable to allocate netlink recv socket\n"); return errno; } memset(&src_addr, 0, sizeof(src_addr)); src_addr.nl_family = AF_NETLINK; src_addr.nl_pid = getpid(); src_addr.nl_groups = (1 << (RDMA_NL_GROUP_LS - 1)); ret = bind(nl_rcv_socket, (struct sockaddr *)&src_addr, sizeof(src_addr)); if (ret == -1) { acm_log(0, "ERROR - unable to bind netlink socket\n"); close(nl_rcv_socket); return errno; } /* init nl client structure */ client_array[NL_CLIENT_INDEX].sock = nl_rcv_socket; return 0; } static void acm_server(bool systemd) { fd_set readfds; int i, n, ret; struct acmc_device *dev; acm_log(0, "started\n"); acm_init_server(); 
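/*
 * Illustrative client sketch (not part of the original source): one way a
 * process could query the daemon's counters over the socket serviced by the
 * loop below. It assumes the <infiniband/acm.h> definitions used by
 * acm_svr_receive()/acm_svr_perf_query() (struct acm_msg, ACM_VERSION,
 * ACM_OP_PERF_QUERY, ACM_MSG_HDR_LENGTH) and a daemon running in
 * unix-socket mode at IBACM_SERVER_PATH.
 */
#if 0
static int example_perf_query(void)
{
	struct sockaddr_un dst = { .sun_family = AF_UNIX };
	struct acm_msg msg;
	int s, ret;

	strncpy(dst.sun_path, IBACM_SERVER_PATH, sizeof(dst.sun_path) - 1);
	s = socket(AF_UNIX, SOCK_STREAM, 0);
	if (s < 0 || connect(s, (struct sockaddr *) &dst, sizeof(dst))) {
		if (s >= 0)
			close(s);
		return -1;
	}
	memset(&msg, 0, sizeof(msg));
	msg.hdr.version = ACM_VERSION;
	msg.hdr.opcode = ACM_OP_PERF_QUERY;
	/* perf queries carry hdr.length in network order; with only the
	 * header present and src_index 0, the server returns all counters */
	msg.hdr.length = htobe16(ACM_MSG_HDR_LENGTH);
	ret = send(s, &msg, ACM_MSG_HDR_LENGTH, 0);
	if (ret == ACM_MSG_HDR_LENGTH)
		ret = recv(s, &msg, sizeof(msg), 0); /* header + counters */
	close(s);
	return ret > 0 ? 0 : -1;
}
#endif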
client_array[NL_CLIENT_INDEX].sock = -1; listen_socket = -1; if (systemd) { ret = acm_listen_systemd(); if (ret) { acm_log(0, "ERROR - systemd server listen failed\n"); return; } } if (listen_socket == -1) { ret = acm_listen(); if (ret) { acm_log(0, "ERROR - server listen failed\n"); return; } } if (client_array[NL_CLIENT_INDEX].sock == -1) { ret = acm_init_nl(); if (ret) acm_log(1, "Warn - Netlink init failed\n"); } if (systemd) sd_notify(0, "READY=1"); while (1) { n = (int) listen_socket; FD_ZERO(&readfds); FD_SET(listen_socket, &readfds); n = max(n, (int) ip_mon_socket); FD_SET(ip_mon_socket, &readfds); for (i = 0; i < FD_SETSIZE - 1; i++) { if (client_array[i].sock != -1) { FD_SET(client_array[i].sock, &readfds); n = max(n, (int) client_array[i].sock); } } list_for_each(&dev_list, dev, entry) { FD_SET(dev->device.verbs->async_fd, &readfds); n = max(n, (int) dev->device.verbs->async_fd); } ret = select(n + 1, &readfds, NULL, NULL, NULL); if (ret == -1) { acm_log(0, "ERROR - server select error\n"); continue; } if (FD_ISSET(listen_socket, &readfds)) acm_svr_accept(); if (FD_ISSET(ip_mon_socket, &readfds)) acm_ipnl_handler(); for (i = 0; i < FD_SETSIZE - 1; i++) { if (client_array[i].sock != -1 && FD_ISSET(client_array[i].sock, &readfds)) { acm_log(2, "receiving from client %d\n", i); if (i == NL_CLIENT_INDEX) acm_nl_receive(&client_array[i]); else acm_svr_receive(&client_array[i]); } } list_for_each(&dev_list, dev, entry) { if (FD_ISSET(dev->device.verbs->async_fd, &readfds)) { acm_log(2, "handling event from %s\n", dev->device.verbs->device->name); acm_event_handler(dev); } } } } enum ibv_rate acm_get_rate(uint8_t width, uint8_t speed) { switch (width) { case 1: /* 1x */ switch (speed) { case 1: return IBV_RATE_2_5_GBPS; case 2: return IBV_RATE_5_GBPS; case 4: /* fall through */ case 8: return IBV_RATE_10_GBPS; case 16: return IBV_RATE_14_GBPS; case 32: return IBV_RATE_25_GBPS; default: return IBV_RATE_MAX; } case 2: /* 4x */ switch (speed) { case 1: return IBV_RATE_10_GBPS; case 2: return IBV_RATE_20_GBPS; case 4: /* fall through */ case 8: return IBV_RATE_40_GBPS; case 16: return IBV_RATE_56_GBPS; case 32: return IBV_RATE_100_GBPS; default: return IBV_RATE_MAX; } case 4: /* 8x */ switch (speed) { case 1: return IBV_RATE_20_GBPS; case 2: return IBV_RATE_40_GBPS; case 4: /* fall through */ case 8: return IBV_RATE_80_GBPS; case 16: return IBV_RATE_112_GBPS; case 32: return IBV_RATE_200_GBPS; default: return IBV_RATE_MAX; } case 8: /* 12x */ switch (speed) { case 1: return IBV_RATE_30_GBPS; case 2: return IBV_RATE_60_GBPS; case 4: /* fall through */ case 8: return IBV_RATE_120_GBPS; case 16: return IBV_RATE_168_GBPS; case 32: return IBV_RATE_300_GBPS; default: return IBV_RATE_MAX; } default: acm_log(0, "ERROR - unknown link width 0x%x\n", width); return IBV_RATE_MAX; } } enum ibv_mtu acm_convert_mtu(int mtu) { switch (mtu) { case 256: return IBV_MTU_256; case 512: return IBV_MTU_512; case 1024: return IBV_MTU_1024; case 2048: return IBV_MTU_2048; case 4096: return IBV_MTU_4096; default: return IBV_MTU_2048; } } enum ibv_rate acm_convert_rate(int rate) { switch (rate) { case 2: return IBV_RATE_2_5_GBPS; case 5: return IBV_RATE_5_GBPS; case 10: return IBV_RATE_10_GBPS; case 20: return IBV_RATE_20_GBPS; case 30: return IBV_RATE_30_GBPS; case 40: return IBV_RATE_40_GBPS; case 60: return IBV_RATE_60_GBPS; case 80: return IBV_RATE_80_GBPS; case 120: return IBV_RATE_120_GBPS; case 14: return IBV_RATE_14_GBPS; case 56: return IBV_RATE_56_GBPS; case 112: return IBV_RATE_112_GBPS; case 168: return 
IBV_RATE_168_GBPS; case 25: return IBV_RATE_25_GBPS; case 100: return IBV_RATE_100_GBPS; case 200: return IBV_RATE_200_GBPS; case 300: return IBV_RATE_300_GBPS; default: return IBV_RATE_10_GBPS; } } static FILE *acm_open_addr_file(void) { FILE *f; if ((f = fopen(addr_file, "r"))) return f; acm_log(0, "notice - generating %s file\n", addr_file); if (!(f = popen(acme, "r"))) { acm_log(0, "ERROR - cannot generate %s\n", addr_file); return NULL; } pclose(f); return fopen(addr_file, "r"); } static int __acm_ep_insert_addr(struct acmc_ep *ep, const char *name, uint8_t *addr, uint8_t addr_type) { int i; int ret; uint8_t tmp[ACM_MAX_ADDRESS] = {}; memcpy(tmp, addr, acm_addr_len(addr_type)); for (i = 0; (i < ep->nmbr_ep_addrs) && (ep->addr_info[i].addr.type != ACM_ADDRESS_INVALID); i++) ; if (i == ep->nmbr_ep_addrs) { struct acmc_addr *new_info; int j; new_info = realloc(ep->addr_info, (i + 1) * sizeof(*ep->addr_info)); if (!new_info) { ret = ENOMEM; goto out; } /* id_string needs to point to the reallocated string_buf */ for (j = 0; (j < ep->nmbr_ep_addrs); j++) { new_info[j].addr.id_string = new_info[j].string_buf; } ep->addr_info = new_info; /* Added memory is not initialized */ memset(ep->addr_info + i, 0, sizeof(*ep->addr_info)); ep->addr_info[i].addr.endpoint = &ep->endpoint; ep->addr_info[i].addr.id_string = ep->addr_info[i].string_buf; ++ep->nmbr_ep_addrs; } /* Open the provider endpoint only if at least a name or address is found */ if (!ep->prov_ep_context) { ret = ep->port->prov->open_endpoint(&ep->endpoint, ep->port->prov_port_context, &ep->prov_ep_context); if (ret) { acm_log(0, "Error: failed to open prov ep\n"); goto out; } } ep->addr_info[i].addr.type = addr_type; if (!check_snprintf(ep->addr_info[i].string_buf, sizeof(ep->addr_info[i].string_buf), "%s", name)) return EINVAL; memcpy(ep->addr_info[i].addr.info.addr, tmp, ACM_MAX_ADDRESS); ret = ep->port->prov->add_address(&ep->addr_info[i].addr, ep->prov_ep_context, &ep->addr_info[i].prov_addr_context); if (ret) { acm_log(0, "Error: failed to add addr to provider\n"); ep->addr_info[i].addr.type = ACM_ADDRESS_INVALID; } out: return ret; } static int acm_ep_insert_addr(struct acmc_ep *ep, const char *name, uint8_t *addr, uint8_t addr_type) { int ret = -1; if (!acm_addr_lookup(&ep->endpoint, addr, addr_type)) { ret = __acm_ep_insert_addr(ep, name, addr, addr_type); } return ret; } static struct acmc_device * acm_get_device_from_gid(union ibv_gid *sgid, uint8_t *port) { struct acmc_device *dev; int i; list_for_each(&dev_list, dev, entry) { for (*port = 1; *port <= dev->port_cnt; (*port)++) { for (i = 0; i < dev->port[*port - 1].gid_cnt; i++) { if (!memcmp(sgid->raw, dev->port[*port - 1].gid_tbl[i].raw, sizeof(*sgid))) return dev; } } } return NULL; } static void acm_ep_ip_iter_cb(char *ifname, union ibv_gid *gid, uint16_t pkey, uint8_t addr_type, uint8_t *addr, char *ip_str, void *ctx) { uint8_t port_num; struct acmc_device *dev; struct acmc_ep *ep = ctx; dev = acm_get_device_from_gid(gid, &port_num); if (dev && ep->port->dev == dev && ep->port->port.port_num == port_num && /* the pkey retrieved from ipoib always has the full member bit set */ (ep->endpoint.pkey | IB_PKEY_FULL_MEMBER) == pkey) { if (!acm_ep_insert_addr(ep, ip_str, addr, addr_type)) { acm_log(0, "Added %s %s %d 0x%x from %s\n", ip_str, dev->device.verbs->device->name, port_num, ep->endpoint.pkey, ifname); } } } static int acm_get_system_ips(struct acmc_ep *ep) { return acm_if_iter_sys(acm_ep_ip_iter_cb, ep); } static int acm_assign_ep_names(struct acmc_ep *ep) { FILE *faddr; char
*dev_name; char s[120]; char dev[32], name[ACM_MAX_ADDRESS], pkey_str[8]; uint16_t pkey; uint8_t addr[ACM_MAX_ADDRESS], type; int port; dev_name = ep->port->dev->device.verbs->device->name; acm_log(1, "device %s, port %d, pkey 0x%x\n", dev_name, ep->port->port.port_num, ep->endpoint.pkey); acm_get_system_ips(ep); if (!(faddr = acm_open_addr_file())) { acm_log(0, "ERROR - address file not found\n"); goto out; } while (fgets(s, sizeof s, faddr)) { if (s[0] == '#') continue; if (sscanf(s, "%46s%31s%d%7s", name, dev, &port, pkey_str) != 4) continue; acm_log(2, "%s", s); if (inet_pton(AF_INET, name, addr) > 0) { if (!support_ips_in_addr_cfg) { acm_log(0, "ERROR - IP's are not configured to be read from ibacm_addr.cfg\n"); continue; } type = ACM_ADDRESS_IP; } else if (inet_pton(AF_INET6, name, addr) > 0) { if (!support_ips_in_addr_cfg) { acm_log(0, "ERROR - IP's are not configured to be read from ibacm_addr.cfg\n"); continue; } type = ACM_ADDRESS_IP6; } else { type = ACM_ADDRESS_NAME; strncpy((char *)addr, name, sizeof(addr)); } if (strcasecmp(pkey_str, "default")) { if (sscanf(pkey_str, "%hx", &pkey) != 1) { acm_log(0, "ERROR - bad pkey format %s\n", pkey_str); continue; } } else { pkey = ep->port->def_acm_pkey; } if (!strcasecmp(dev_name, dev) && (ep->port->port.port_num == (uint8_t) port) && acm_same_partition(ep->endpoint.pkey, pkey)) { acm_log(1, "assigning %s\n", name); if (acm_ep_insert_addr(ep, name, addr, type)) { acm_log(1, "maximum number of names assigned to EP\n"); break; } } } fclose(faddr); out: return (!ep->nmbr_ep_addrs || ep->addr_info[0].addr.type == ACM_ADDRESS_INVALID); } static struct acmc_ep *acm_find_ep(struct acmc_port *port, uint16_t pkey) { struct acmc_ep *ep, *res = NULL; acm_log(2, "pkey 0x%x\n", pkey); list_for_each(&port->ep_list, ep, entry) { if (acm_same_partition(ep->endpoint.pkey, pkey)) { res = ep; break; } } return res; } static void acm_ep_down(struct acmc_ep *ep) { int i; acm_log(1, "%s %d pkey 0x%04x\n", ep->port->dev->device.verbs->device->name, ep->port->port.port_num, ep->endpoint.pkey); for (i = 0; i < ep->nmbr_ep_addrs; i++) { if (ep->addr_info[i].addr.type && ep->addr_info[i].prov_addr_context) ep->port->prov->remove_address(ep->addr_info[i]. 
prov_addr_context); } if (ep->prov_ep_context) ep->port->prov->close_endpoint(ep->prov_ep_context); free(ep); } static struct acmc_ep * acm_alloc_ep(struct acmc_port *port, uint16_t pkey) { struct acmc_ep *ep; acm_log(1, "\n"); ep = calloc(1, sizeof *ep); if (!ep) return NULL; ep->port = port; ep->endpoint.port = &port->port; ep->endpoint.pkey = pkey; ep->addr_info = NULL; ep->nmbr_ep_addrs = 0; return ep; } static void acm_ep_up(struct acmc_port *port, uint16_t pkey) { struct acmc_ep *ep; int ret; acm_log(1, "\n"); if (acm_find_ep(port, pkey)) { acm_log(2, "endpoint for pkey 0x%x already exists\n", pkey); return; } acm_log(2, "creating endpoint for pkey 0x%x\n", pkey); ep = acm_alloc_ep(port, pkey); if (!ep) return; ret = acm_assign_ep_names(ep); if (ret) { acm_log(1, "unable to assign EP name for pkey 0x%x\n", pkey); goto ep_close; } list_add(&port->ep_list, &ep->entry); return; ep_close: if (ep->prov_ep_context) port->prov->close_endpoint(ep->prov_ep_context); free(ep); } static void acm_assign_provider(struct acmc_port *port) { struct acmc_prov *prov; struct acmc_subnet *subnet; acm_log(2, "port %s/%d\n", port->port.dev->verbs->device->name, port->port.port_num); list_for_each(&provider_list, prov, entry) { list_for_each(&prov->subnet_list, subnet, entry) { if (subnet->subnet_prefix == port->gid_tbl[0].global.subnet_prefix) { acm_log(2, "Found provider %s for port %s/%d\n", prov->prov->name, port->port.dev->verbs->device->name, port->port.port_num); port->prov = prov->prov; return; } } } /* If no provider is found, assign the default provider*/ if (!port->prov) { acm_log(2, "No prov found, assign default prov %s to %s/%d\n", def_provider ? def_provider->prov->name: "NULL", port->port.dev->verbs->device->name, port->port.port_num); port->prov = def_provider ? 
def_provider->prov : NULL; } } static void acm_port_get_gid_tbl(struct acmc_port *port) { union ibv_gid gid; int i, j, ret; for (i = 0;; i++) { ret = ibv_query_gid(port->port.dev->verbs, port->port.port_num, i, &gid); if (ret || !gid.global.interface_id) break; } if (i > 0) { port->gid_tbl = calloc(i, sizeof(union ibv_gid)); if (!port->gid_tbl) { acm_log(0, "Error: failed to allocate gid table\n"); port->gid_cnt = 0; return; } for (j = 0; j < i; j++) { ret = ibv_query_gid(port->port.dev->verbs, port->port.port_num, j, &port->gid_tbl[j]); if (ret || !port->gid_tbl[j].global.interface_id) break; acm_log(2, "guid %d: 0x%" PRIx64 " %" PRIx64 "\n", j, be64toh(port->gid_tbl[j].global.subnet_prefix), be64toh(port->gid_tbl[j].global.interface_id)); } port->gid_cnt = j; } acm_log(2, "port %d gid_cnt %d\n", port->port.port_num, port->gid_cnt); } static void acm_port_up(struct acmc_port *port) { struct ibv_port_attr attr; uint16_t pkey; __be16 pkey_be; int i, ret; struct acmc_prov_context *dev_ctx; int index = -1; uint16_t first_pkey = 0; acm_log(1, "%s %d\n", port->dev->device.verbs->device->name, port->port.port_num); ret = ibv_query_port(port->dev->device.verbs, port->port.port_num, &attr); if (ret) { acm_log(0, "ERROR - unable to get port state\n"); return; } if (attr.state != IBV_PORT_ACTIVE) { acm_log(1, "port not active\n"); return; } acm_port_get_gid_tbl(port); port->lid = attr.lid; port->lid_mask = 0xffff - ((1 << attr.lmc) - 1); port->sa_addr.lid = htobe16(attr.sm_lid); port->sa_addr.sl = attr.sm_sl; port->state = IBV_PORT_ACTIVE; acm_assign_provider(port); if (!port->prov) { acm_log(1, "no provider assigned to port\n"); return; } dev_ctx = acm_acquire_prov_context(&port->dev->prov_dev_context_list, port->prov); if (!dev_ctx) { acm_log(0, "Error -- failed to acquire dev context\n"); return; } if (atomic_get(&dev_ctx->refcnt) == 1) { if (port->prov->open_device(&port->dev->device, &dev_ctx->context)) { acm_log(0, "Error -- failed to open the prov device\n"); goto err1; } } if (port->prov->open_port(&port->port, dev_ctx->context, &port->prov_port_context)) { acm_log(0, "Error -- failed to open the prov port\n"); goto err1; } /* Determine the default pkey for SA access first. * Order of preference: 0xffff, 0x7fff * Use the first pkey as the default pkey for parsing address file. */ for (i = 0; i < attr.pkey_tbl_len; i++) { ret = ibv_query_pkey(port->dev->device.verbs, port->port.port_num, i, &pkey_be); if (ret) continue; pkey = be16toh(pkey_be); if (i == 0) first_pkey = pkey; if (pkey == 0xffff) { index = i; break; } else if (pkey == 0x7fff) { index = i; } } port->sa_pkey_index = index < 0 ? 
0 : index; port->def_acm_pkey = first_pkey; for (i = 0; i < attr.pkey_tbl_len; i++) { ret = ibv_query_pkey(port->dev->device.verbs, port->port.port_num, i, &pkey_be); if (ret) continue; pkey = be16toh(pkey_be); if (!(pkey & 0x7fff)) continue; acm_ep_up(port, pkey); } return; err1: acm_release_prov_context(dev_ctx); } static void acm_shutdown_port(struct acmc_port *port) { struct acmc_ep *ep; struct acmc_prov_context *dev_ctx; while ((ep = list_pop(&port->ep_list, struct acmc_ep, entry))) acm_ep_down(ep); if (port->prov_port_context) { port->prov->close_port(port->prov_port_context); port->prov_port_context = NULL; dev_ctx = acm_get_prov_context(&port->dev->prov_dev_context_list, port->prov); if (dev_ctx) { if (atomic_get(&dev_ctx->refcnt) == 1) port->prov->close_device(dev_ctx->context); acm_release_prov_context(dev_ctx); } } port->prov = NULL; if (port->gid_tbl) { free(port->gid_tbl); port->gid_tbl = NULL; } port->gid_cnt = 0; } static void acm_port_down(struct acmc_port *port) { struct ibv_port_attr attr; int ret; acm_log(1, "%s %d\n", port->port.dev->verbs->device->name, port->port.port_num); ret = ibv_query_port(port->port.dev->verbs, port->port.port_num, &attr); if (!ret && attr.state == IBV_PORT_ACTIVE) { acm_log(1, "port active\n"); return; } port->state = attr.state; acm_shutdown_port(port); acm_log(1, "%s %d is down\n", port->dev->device.verbs->device->name, port->port.port_num); } static void acm_port_change(struct acmc_port *port) { struct ibv_port_attr attr; int ret; acm_log(1, "%s %d\n", port->port.dev->verbs->device->name, port->port.port_num); ret = ibv_query_port(port->port.dev->verbs, port->port.port_num, &attr); if (ret || attr.state != IBV_PORT_ACTIVE) { acm_log(1, "port not active: don't care\n"); return; } port->state = attr.state; acm_shutdown_port(port); acm_port_up(port); } static void acm_event_handler(struct acmc_device *dev) { struct ibv_async_event event; int i, ret; ret = ibv_get_async_event(dev->device.verbs, &event); if (ret) return; acm_log(2, "processing async event %s for %s\n", ibv_event_type_str(event.event_type), dev->device.verbs->device->name); i = event.element.port_num - 1; switch (event.event_type) { case IBV_EVENT_PORT_ACTIVE: if (dev->port[i].state != IBV_PORT_ACTIVE) acm_port_up(&dev->port[i]); if (dev->port[i].pending_rereg && dev->port[i].prov_port_context) { dev->port[i].prov->handle_event(dev->port[i].prov_port_context, IBV_EVENT_CLIENT_REREGISTER); dev->port[i].pending_rereg = false; acm_log(1, "%s %d delayed reregistration\n", dev->device.verbs->device->name, i + 1); } break; case IBV_EVENT_PORT_ERR: if (dev->port[i].state == IBV_PORT_ACTIVE) acm_port_down(&dev->port[i]); break; case IBV_EVENT_CLIENT_REREGISTER: if ((dev->port[i].state == IBV_PORT_ACTIVE) && dev->port[i].prov_port_context) { dev->port[i].prov->handle_event(dev->port[i].prov_port_context, event.event_type); acm_log(1, "%s %d has reregistered\n", dev->device.verbs->device->name, i + 1); } else { acm_log(2, "%s %d rereg on inactive port, postpone handling\n", dev->device.verbs->device->name, i + 1); dev->port[i].pending_rereg = true; } break; case IBV_EVENT_LID_CHANGE: case IBV_EVENT_GID_CHANGE: case IBV_EVENT_PKEY_CHANGE: acm_port_change(&dev->port[i]); break; default: break; } ibv_ack_async_event(&event); } static void acm_activate_devices(void) { struct acmc_device *dev; int i; acm_log(1, "\n"); list_for_each(&dev_list, dev, entry) { for (i = 0; i < dev->port_cnt; i++) { acm_port_up(&dev->port[i]); } } } static void acm_open_port(struct acmc_port *port, struct acmc_device 
*dev, uint8_t port_num) { acm_log(1, "%s %d\n", dev->device.verbs->device->name, port_num); port->dev = dev; port->port.dev = &dev->device; port->port.port_num = port_num; pthread_mutex_init(&port->lock, NULL); list_head_init(&port->ep_list); list_head_init(&port->sa_pending); list_head_init(&port->sa_wait); port->sa_credits = sa.depth; port->sa_addr.qpn = htobe32(1); port->sa_addr.qkey = htobe32(ACM_QKEY); port->mad_portid = umad_open_port(dev->device.verbs->device->name, port_num); if (port->mad_portid < 0) acm_log(0, "ERROR - unable to open MAD port\n"); port->mad_agentid = umad_register(port->mad_portid, IB_MGMT_CLASS_SA, 1, 1, NULL); if (port->mad_agentid < 0) { umad_close_port(port->mad_portid); acm_log(0, "ERROR - unable to register MAD client\n"); } port->prov = NULL; port->state = IBV_PORT_DOWN; } static void acm_open_dev(struct ibv_device *ibdev) { struct acmc_device *dev; struct ibv_device_attr attr; struct ibv_port_attr port_attr; struct ibv_context *verbs; size_t size; int i, ret; bool has_ib_port = false; acm_log(1, "%s\n", ibdev->name); verbs = ibv_open_device(ibdev); if (verbs == NULL) { acm_log(0, "ERROR - opening device %s\n", ibdev->name); return; } ret = ibv_query_device(verbs, &attr); if (ret) { acm_log(0, "ERROR - ibv_query_device (%d) %s\n", ret, ibdev->name); goto err1; } for (i = 0; i < attr.phys_port_cnt; i++) { ret = ibv_query_port(verbs, i + 1, &port_attr); if (ret) { acm_log(0, "ERROR - ibv_query_port (%s, %d) return (%d)\n", ibdev->name, i + 1, ret); continue; } if (port_attr.link_layer == IBV_LINK_LAYER_INFINIBAND) { acm_log(1, "%s port %d is an InfiniBand port\n", ibdev->name, i + 1); has_ib_port = true; } else { acm_log(1, "%s port %d is not an InfiniBand port\n", ibdev->name, i + 1); } } if (!has_ib_port) { acm_log(1, "%s does not support InfiniBand.\n", ibdev->name); goto err1; } size = sizeof(*dev) + sizeof(struct acmc_port) * attr.phys_port_cnt; dev = (struct acmc_device *) calloc(1, size); if (!dev) goto err1; dev->device.verbs = verbs; dev->device.dev_guid = ibv_get_device_guid(ibdev); dev->port_cnt = attr.phys_port_cnt; list_head_init(&dev->prov_dev_context_list); for (i = 0; i < dev->port_cnt; i++) { acm_open_port(&dev->port[i], dev, i + 1); } list_add(&dev_list, &dev->entry); acm_log(1, "%s opened\n", ibdev->name); return; err1: ibv_close_device(verbs); } static int acm_open_devices(void) { struct ibv_device **ibdev; int dev_cnt; int i; acm_log(1, "\n"); ibdev = ibv_get_device_list(&dev_cnt); if (!ibdev) { acm_log(0, "ERROR - unable to get device list\n"); return -1; } for (i = 0; i < dev_cnt; i++) acm_open_dev(ibdev[i]); ibv_free_device_list(ibdev); if (list_empty(&dev_list)) { acm_log(0, "ERROR - no devices\n"); return -1; } return 0; } static void acm_load_prov_config(void) { FILE *fd; char s[128]; char *p, *ptr; char prov_name[ACM_PROV_NAME_SIZE]; uint64_t prefix; struct acmc_prov *prov; struct acmc_subnet *subnet; if (!(fd = fopen(opts_file, "r"))) return; while (fgets(s, sizeof s, fd)) { if (s[0] == '#') continue; /* Ignore blank lines */ if (!(p = strtok_r(s, " \n", &ptr))) continue; if (strncasecmp(p, "provider", sizeof("provider") - 1)) continue; p = strtok_r(NULL, " ", &ptr); if (!p) continue; strncpy(prov_name, p, sizeof(prov_name)); prov_name[sizeof(prov_name) -1] = '\0'; p = strtok_r(NULL, " ", &ptr); if (!p) continue; if (!strncasecmp(p, "default", sizeof("default") - 1)) { strncpy(def_prov_name, prov_name, sizeof(def_prov_name)); def_prov_name[sizeof(def_prov_name) -1] = '\0'; acm_log(2, "default provider: %s\n", def_prov_name); 
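		/*
		 * The lines handled by this parser follow the format that the
		 * ibacm_opts.cfg template in acme.c documents:
		 * "provider <name> (<prefix> | default)". Examples taken from
		 * that template (values are illustrative only):
		 *
		 *   provider ibacmp default
		 *   provider ibacmp 0xFE80000000000000
		 *
		 * The "default" form was consumed above; a hex subnet prefix
		 * is parsed below and appended to the named provider's
		 * subnet list.
		 */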
continue; } prefix = strtoull(p, NULL, 0); acm_log(2, "provider %s subnet_prefix 0x%" PRIx64 "\n", prov_name, prefix); list_for_each(&provider_list, prov, entry) { if (!strcasecmp(prov->prov->name, prov_name)) { subnet = calloc(1, sizeof (*subnet)); if (!subnet) { acm_log(0, "Error: out of memory\n"); fclose(fd); return; } subnet->subnet_prefix = htobe64(prefix); list_add_tail(&prov->subnet_list, &subnet->entry); } } } fclose(fd); list_for_each(&provider_list, prov, entry) { if (!strcasecmp(prov->prov->name, def_prov_name)) { def_provider = prov; break; } } } static int acm_string_end_compare(const char *s1, const char *s2) { size_t s1_len = strlen(s1); size_t s2_len = strlen(s2); if (s1_len < s2_len) return -1; return strcmp(s1 + s1_len - s2_len, s2); } static int acm_open_providers(void) { DIR *shlib_dir; struct dirent *dent; char file_name[256]; struct stat buf; void *handle; struct acmc_prov *prov; struct acm_provider *provider; uint32_t version; char *err_str; int (*query)(struct acm_provider **, uint32_t *); acm_log(1, "\n"); shlib_dir = opendir(prov_lib_path); if (!shlib_dir) { acm_log(0, "ERROR - could not open provider lib dir: %s\n", prov_lib_path); return -1; } while ((dent = readdir(shlib_dir))) { if (acm_string_end_compare(dent->d_name, ".so")) continue; if (!check_snprintf(file_name, sizeof(file_name), "%s/%s", prov_lib_path, dent->d_name)) continue; if (lstat(file_name, &buf)) { acm_log(0, "Error - could not stat: %s\n", file_name); continue; } if (!S_ISREG(buf.st_mode)) continue; acm_log(2, "Loading provider %s...\n", file_name); if (!(handle = dlopen(file_name, RTLD_LAZY))) { acm_log(0, "Error - could not load provider %s (%s)\n", file_name, dlerror()); continue; } query = dlsym(handle, "provider_query"); if ((err_str = dlerror()) != NULL) { acm_log(0, "Error - provider_query not found in %s (%s)\n", file_name, err_str); dlclose(handle); continue; } if (query(&provider, &version)) { acm_log(0, "Error - provider_query failed to %s\n", file_name); dlclose(handle); continue; } if (version != ACM_PROV_VERSION || provider->size != sizeof(struct acm_provider)) { acm_log(0, "Error -unmatched provider version 0x%08x (size %zd)" " core 0x%08x (size %zd)\n", version, provider->size, ACM_PROV_VERSION, sizeof(struct acm_provider)); dlclose(handle); continue; } acm_log(1, "Provider %s (%s) loaded\n", provider->name, file_name); prov = calloc(1, sizeof(*prov)); if (!prov) { acm_log(0, "Error -failed to allocate provider %s\n", file_name); dlclose(handle); continue; } prov->prov = provider; prov->handle = handle; list_head_init(&prov->subnet_list); list_add_tail(&provider_list, &prov->entry); if (!strcasecmp(provider->name, def_prov_name)) def_provider = prov; } closedir(shlib_dir); acm_load_prov_config(); return 0; } static void acm_close_providers(void) { struct acmc_prov *prov; struct acmc_subnet *subnet; acm_log(1, "\n"); def_provider = NULL; while ((prov = list_pop(&provider_list, struct acmc_prov, entry))) { while ((subnet = list_pop(&prov->subnet_list, struct acmc_subnet, entry))) free(subnet); dlclose(prov->handle); free(prov); } } static int acmc_init_sa_fds(void) { struct acmc_device *dev; int ret, p, i = 0; list_for_each(&dev_list, dev, entry) sa.nfds += dev->port_cnt; sa.fds = calloc(sa.nfds, sizeof(*sa.fds)); sa.ports = calloc(sa.nfds, sizeof(*sa.ports)); if (!sa.fds || !sa.ports) return -ENOMEM; list_for_each(&dev_list, dev, entry) { for (p = 0; p < dev->port_cnt; p++) { sa.fds[i].fd = umad_get_fd(dev->port[p].mad_portid); sa.fds[i].events = POLLIN; ret = 
set_fd_nonblock(sa.fds[i].fd, true); if (ret) acm_log(0, "WARNING - umad fd is blocking\n"); sa.ports[i++] = &dev->port[p]; } } return 0; } struct acm_sa_mad * acm_alloc_sa_mad(const struct acm_endpoint *endpoint, void *context, void (*handler)(struct acm_sa_mad *)) { struct acmc_sa_req *req; if (!endpoint) { acm_log(0, "Error: NULL endpoint\n"); return NULL; } req = calloc(1, sizeof (*req)); if (!req) { acm_log(0, "Error: failed to allocate sa request\n"); return NULL; } req->ep = container_of(endpoint, struct acmc_ep, endpoint); req->mad.context = context; req->resp_handler = handler; acm_log(2, "%p\n", req); return &req->mad; } void acm_free_sa_mad(struct acm_sa_mad *mad) { struct acmc_sa_req *req; req = container_of(mad, struct acmc_sa_req, mad); acm_log(2, "%p\n", req); free(req); } int acm_send_sa_mad(struct acm_sa_mad *mad) { struct acmc_port *port; struct acmc_sa_req *req; int ret; req = container_of(mad, struct acmc_sa_req, mad); acm_log(2, "%p from %s\n", req, req->ep->addr_info[0].addr.id_string); port = req->ep->port; mad->umad.addr.qpn = port->sa_addr.qpn; mad->umad.addr.qkey = port->sa_addr.qkey; mad->umad.addr.lid = port->sa_addr.lid; mad->umad.addr.sl = port->sa_addr.sl; mad->umad.addr.pkey_index = req->ep->port->sa_pkey_index; pthread_mutex_lock(&port->lock); if (port->sa_credits && list_empty(&port->sa_wait)) { ret = umad_send(port->mad_portid, port->mad_agentid, &mad->umad, sizeof mad->sa_mad, sa.timeout, sa.retries); if (!ret) { port->sa_credits--; list_add_tail(&port->sa_pending, &req->entry); } } else { ret = 0; list_add_tail(&port->sa_wait, &req->entry); } pthread_mutex_unlock(&port->lock); return ret; } static void acmc_send_queued_req(struct acmc_port *port) { struct acmc_sa_req *req; int ret; pthread_mutex_lock(&port->lock); if (list_empty(&port->sa_wait) || !port->sa_credits) { pthread_mutex_unlock(&port->lock); return; } req = list_pop(&port->sa_wait, struct acmc_sa_req, entry); ret = umad_send(port->mad_portid, port->mad_agentid, &req->mad.umad, sizeof req->mad.sa_mad, sa.timeout, sa.retries); if (!ret) { port->sa_credits--; list_add_tail(&port->sa_pending, &req->entry); } pthread_mutex_unlock(&port->lock); if (ret) { req->mad.umad.status = -ret; req->resp_handler(&req->mad); } } static void acmc_recv_mad(struct acmc_port *port) { struct acmc_sa_req *req; struct acm_sa_mad resp; int ret, len, found; struct umad_hdr *hdr; if (!port->prov) { acm_log(1, "no provider assigned to port\n"); return; } acm_log(2, "\n"); len = sizeof(resp.sa_mad); ret = umad_recv(port->mad_portid, &resp.umad, &len, 0); if (ret < 0) { acm_log(1, "umad_recv error %d\n", ret); return; } hdr = &resp.sa_mad.mad_hdr; acm_log(2, "bv %x cls %x cv %x mtd %x st %d tid %" PRIx64 " at %x atm %x\n", hdr->base_version, hdr->mgmt_class, hdr->class_version, hdr->method, be16toh(hdr->status), be64toh(hdr->tid), be16toh(hdr->attr_id), be32toh(hdr->attr_mod)); found = 0; pthread_mutex_lock(&port->lock); list_for_each(&port->sa_pending, req, entry) { /* The upper 32-bit of the tid is used for agentid in umad */ if (req->mad.sa_mad.mad_hdr.tid == (hdr->tid & htobe64(0xFFFFFFFF))) { found = 1; list_del(&req->entry); port->sa_credits++; break; } } pthread_mutex_unlock(&port->lock); if (found) { memcpy(&req->mad.umad, &resp.umad, sizeof(resp.umad) + len); req->resp_handler(&req->mad); } } static void *acm_sa_handler(void *context) { int i, ret; acm_log(0, "started\n"); ret = acmc_init_sa_fds(); if (ret) { acm_log(0, "ERROR - failed to init fds\n"); return NULL; } if 
(pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, NULL)) { acm_log(0, "Error: failed to set cancel type \n"); return NULL; } if (pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, NULL)) { acm_log(0, "Error: failed to set cancel state\n"); return NULL; } for (;;) { pthread_testcancel(); ret = poll(sa.fds, sa.nfds, -1); if (ret < 0) { acm_log(0, "ERROR - sa poll error: %d\n", errno); continue; } for (i = 0; i < sa.nfds; i++) { if (!sa.fds[i].revents) continue; if (sa.fds[i].revents & POLLIN) { acmc_recv_mad(sa.ports[i]); acmc_send_queued_req(sa.ports[i]); } sa.fds[i].revents = 0; } } return NULL; } static void acm_stop_sa_handler(void) { if (pthread_cancel(sa.thread_id)) { acm_log(0, "Error: failed to cancel sa resp thread \n"); return; } if (pthread_join(sa.thread_id, NULL)) { acm_log(0, "Error: failed to join sa resp thread\n"); return; } } static void acm_set_options(void) { FILE *f; char s[120]; char opt[32], value[256]; if (!(f = fopen(opts_file, "r"))) return; while (fgets(s, sizeof s, f)) { if (s[0] == '#') continue; if (sscanf(s, "%31s%255s", opt, value) != 2) continue; if (!strcasecmp("log_file", opt)) strcpy(log_file, value); else if (!strcasecmp("log_level", opt)) log_level = atoi(value); else if (!strcasecmp("umad_debug_level", opt)) { umad_debug_level = atoi(value); if (umad_debug_level > 0) umad_debug(umad_debug_level); } else if (!strcasecmp("lock_file", opt)) strcpy(lock_file, value); else if (!strcasecmp("server_port", opt)) server_port = (short) atoi(value); else if (!strcasecmp("server_mode", opt)) { if (!strcasecmp(value, "open")) server_mode = IBACM_SERVER_MODE_OPEN; else if (!strcasecmp(value, "loop")) server_mode = IBACM_SERVER_MODE_LOOP; else server_mode = IBACM_SERVER_MODE_UNIX; } else if (!strcasecmp("acme_plus_kernel_only", opt)) acme_plus_kernel_only = !strcasecmp(value, "true") || !strcasecmp(value, "yes") || strtol(value, NULL, 0); else if (!strcasecmp("provider_lib_path", opt)) strcpy(prov_lib_path, value); else if (!strcasecmp("support_ips_in_addr_cfg", opt)) support_ips_in_addr_cfg = atoi(value); else if (!strcasecmp("timeout", opt)) sa.timeout = atoi(value); else if (!strcasecmp("retries", opt)) sa.retries = atoi(value); else if (!strcasecmp("sa_depth", opt)) sa.depth = atoi(value); } fclose(f); } static void acm_log_options(void) { static const char * const server_mode_names[] = { [IBACM_SERVER_MODE_UNIX] = "unix", [IBACM_SERVER_MODE_LOOP] = "loop", [IBACM_SERVER_MODE_OPEN] = "open", }; acm_log(0, "log file %s\n", log_file); acm_log(0, "log level %d\n", log_level); acm_log(0, "umad debug level %d\n", umad_debug_level); acm_log(0, "lock file %s\n", lock_file); acm_log(0, "server_port %d\n", server_port); acm_log(0, "server_mode %s\n", server_mode_names[server_mode]); acm_log(0, "acme_plus_kernel_only %s\n", acme_plus_kernel_only ? 
"yes" : "no"); acm_log(0, "timeout %d ms\n", sa.timeout); acm_log(0, "retries %d\n", sa.retries); acm_log(0, "sa depth %d\n", sa.depth); acm_log(0, "options file %s\n", opts_file); acm_log(0, "addr file %s\n", addr_file); acm_log(0, "provider lib path %s\n", prov_lib_path); acm_log(0, "support IP's in ibacm_addr.cfg %d\n", support_ips_in_addr_cfg); } static FILE *acm_open_log(void) { FILE *f; if (!strcasecmp(log_file, "stdout")) return stdout; if (!strcasecmp(log_file, "stderr")) return stderr; if (!(f = fopen(log_file, "w"))) f = stdout; return f; } static int acm_open_lock_file(void) { int lock_fd; char pid[16]; lock_fd = open(lock_file, O_RDWR | O_CREAT, 0640); if (lock_fd < 0) return lock_fd; if (lockf(lock_fd, F_TLOCK, 0)) { close(lock_fd); return -1; } snprintf(pid, sizeof pid, "%d\n", getpid()); if (write(lock_fd, pid, strlen(pid)) != strlen(pid)){ close(lock_fd); return -1; } return 0; } static void show_usage(char *program) { printf("usage: %s\n", program); printf(" [-D] - run as a daemon (default)\n"); printf(" [-P] - run as a standard process\n"); printf(" [-A addr_file] - address configuration file\n"); printf(" (default %s/%s)\n", ACM_CONF_DIR, ACM_ADDR_FILE); printf(" [-O option_file] - option configuration file\n"); printf(" (default %s/%s)\n", ACM_CONF_DIR, ACM_OPTS_FILE); } int main(int argc, char **argv) { int i, op, as_daemon = 1; bool systemd = false; static const struct option long_opts[] = { {"systemd", 0, NULL, 's'}, {} }; while ((op = getopt_long(argc, argv, "DPA:O:", long_opts, NULL)) != -1) { switch (op) { case 'D': /* option no longer required */ break; case 'P': as_daemon = 0; break; case 'A': addr_file = optarg; break; case 'O': opts_file = optarg; break; case 's': systemd = true; break; default: show_usage(argv[0]); exit(1); } } if (as_daemon && !systemd) { if (daemon(0, 0)) return EXIT_FAILURE; } acm_set_options(); /* usage of systemd implies unix-domain communication */ if (systemd) server_mode = IBACM_SERVER_MODE_UNIX; if (acm_open_lock_file()) return -1; pthread_mutex_init(&log_lock, NULL); flog = acm_open_log(); acm_log(0, "Assistant to the InfiniBand Communication Manager\n"); acm_log_options(); for (i = 0; i < ACM_MAX_COUNTER; i++) atomic_init(&counter[i]); if (umad_init() != 0) { acm_log(0, "ERROR - fail to initialize umad\n"); return -1; } if (acm_open_providers()) { acm_log(0, "ERROR - unable to open any providers\n"); return -1; } if (acm_open_devices()) { acm_log(0, "ERROR - unable to open any devices\n"); return -1; } acm_log(1, "creating IP Netlink socket\n"); acm_ipnl_create(); acm_log(1, "starting sa response receiving thread\n"); if (pthread_create(&sa.thread_id, NULL, acm_sa_handler, NULL)) { acm_log(0, "Error: failed to create sa resp rcving thread"); return -1; } if (acm_init_if_iter_sys()) { acm_log(0, "Error: unable to initialize acm_if_iter_sys"); return -1; } acm_activate_devices(); acm_log(1, "starting server\n"); acm_server(systemd); acm_log(0, "shutting down\n"); if (client_array[NL_CLIENT_INDEX].sock != -1) close(client_array[NL_CLIENT_INDEX].sock); acm_close_providers(); acm_stop_sa_handler(); umad_done(); acm_fini_if_iter_sys(); fclose(flog); return 0; } rdma-core-56.1/ibacm/src/acm_util.c000066400000000000000000000130301477342711600171320ustar00rootroot00000000000000/* * Copyright (c) 2014 Intel Corporation. All rights reserved. 
 *
 * This software is available to you under the OpenFabrics.org BSD license
 * below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include "acm_mad.h"
#include "acm_util.h"

int acm_if_get_pkey(char *ifname, uint16_t *pkey)
{
	char buf[128], *end;
	FILE *f;
	int ret;

	snprintf(buf, sizeof buf, "//sys//class//net//%s//pkey", ifname);
	f = fopen(buf, "r");
	if (!f) {
		acm_log(0, "failed to open %s\n", buf);
		return -1;
	}

	if (fgets(buf, sizeof buf, f)) {
		*pkey = strtol(buf, &end, 16);
		ret = 0;
	} else {
		acm_log(0, "failed to read pkey\n");
		ret = -1;
	}
	fclose(f);
	return ret;
}

int acm_if_get_sgid(char *ifname, union ibv_gid *sgid)
{
	char buf[128], *end;
	FILE *f;
	int i, p, ret;

	snprintf(buf, sizeof buf, "//sys//class//net//%s//address", ifname);
	f = fopen(buf, "r");
	if (!f) {
		acm_log(0, "failed to open %s\n", buf);
		return -1;
	}

	if (fgets(buf, sizeof buf, f)) {
		for (i = 0, p = 12; i < 16; i++, p += 3) {
			buf[p + 2] = '\0';
			sgid->raw[i] = (uint8_t) strtol(buf + p, &end, 16);
		}
		ret = 0;
	} else {
		acm_log(0, "failed to read sgid\n");
		ret = -1;
	}
	fclose(f);
	return ret;
}

static struct nl_sock *sk;
static struct nl_cache *link_cache;
static struct nl_cache *addr_cache;

int acm_init_if_iter_sys(void)
{
	int sts;

	sk = nl_socket_alloc();
	if (!sk) {
		acm_log(0, "nl_socket_alloc");
		return -1;
	}

	sts = nl_connect(sk, NETLINK_ROUTE);
	if (sts) {
		acm_log(0, "nl_connect failed");
		goto out_connect;
	}

	sts = rtnl_link_alloc_cache(sk, AF_UNSPEC, &link_cache);
	if (sts) {
		acm_log(0, "rtnl_link_alloc_cache failed");
		goto out_connect;
	}

	sts = rtnl_addr_alloc_cache(sk, &addr_cache);
	if (sts) {
		acm_log(0, "rtnl_addr_alloc_cache");
		goto out_addr;
	}

	return 0;

out_addr:
	nl_cache_free(link_cache);
out_connect:
	nl_close(sk);
	return sts;
}

void acm_fini_if_iter_sys(void)
{
	nl_cache_free(link_cache);
	nl_cache_free(addr_cache);
	nl_close(sk);
}

static inline int af2acm_addr_type(int af)
{
	switch (af) {
	case AF_INET:
		return ACM_ADDRESS_IP;
	case AF_INET6:
		return ACM_ADDRESS_IP6;
	}

	acm_log(0, "Unknown address family\n");
	return ACM_ADDRESS_INVALID;
}

struct ctx_and_cb {
	void *ctx;
	acm_if_iter_cb cb;
};

static void acm_if_iter(struct nl_object *obj, void *_ctx_and_cb)
{
	struct ctx_and_cb *ctx_cb = (struct ctx_and_cb *)_ctx_and_cb;
	struct rtnl_addr *addr = (struct rtnl_addr *)obj;
	struct nl_addr *a = rtnl_addr_get_local(addr);
	uint8_t bin_addr[ACM_MAX_ADDRESS] = {};
	int addr_len = nl_addr_get_len(a);
	char ip_str[INET6_ADDRSTRLEN];
	struct nl_addr *link_addr;
	struct rtnl_link *link;
	char flags_str[128];
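	/*
	 * acm_if_iter() runs once per address in the netlink address cache.
	 * Only IPv4/IPv6 addresses on IPoIB (ARPHRD_INFINIBAND) links are
	 * reported to the callback; the port GID is recovered from the
	 * 20-byte IPoIB hardware address (the GID begins at byte 4) and the
	 * pkey comes from /sys/class/net/<ifname>/pkey via acm_if_get_pkey()
	 * above.
	 */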
	union ibv_gid sgid;
	uint16_t pkey;
	char *label;
	int af;

	link = rtnl_link_get(link_cache, rtnl_addr_get_ifindex(addr));
	if (rtnl_link_get_arptype(link) != ARPHRD_INFINIBAND)
		return;

	if (!a)
		return;

	if (addr_len > ACM_MAX_ADDRESS) {
		acm_log(0, "address too long (%d)\n", addr_len);
		return;
	}

	af = nl_addr_get_family(a);
	if (af != AF_INET && af != AF_INET6)
		return;

	label = rtnl_addr_get_label(addr);
	link_addr = rtnl_link_get_addr(link);
	/* gid has a 4 byte offset into the link address */
	memcpy(sgid.raw, nl_addr_get_binary_addr(link_addr) + 4, sizeof(sgid));

	if (acm_if_get_pkey(rtnl_link_get_name(link), &pkey))
		return;

	acm_log(2, "name: %5s label: %9s index: %2d flags: %s addr: %s pkey: 0x%04x guid: 0x%" PRIx64 "\n",
		rtnl_link_get_name(link), label, rtnl_addr_get_ifindex(addr),
		rtnl_link_flags2str(rtnl_link_get_flags(link), flags_str,
				    sizeof(flags_str)),
		nl_addr2str(a, ip_str, sizeof(ip_str)), pkey,
		be64toh(sgid.global.interface_id));

	memcpy(&bin_addr, nl_addr_get_binary_addr(a), addr_len);
	ctx_cb->cb(label ? label : rtnl_link_get_name(link), &sgid, pkey,
		   af2acm_addr_type(af), bin_addr, ip_str, ctx_cb->ctx);
}

int acm_if_iter_sys(acm_if_iter_cb cb, void *ctx)
{
	struct ctx_and_cb ctx_cb;
	int sts;

	sts = nl_cache_refill(sk, link_cache);
	if (sts) {
		acm_log(0, "nl_cache_refill link_cache");
		return sts;
	}

	sts = nl_cache_refill(sk, addr_cache);
	if (sts) {
		acm_log(0, "nl_cache_refill addr_cache");
		return sts;
	}

	ctx_cb.ctx = ctx;
	ctx_cb.cb = cb;
	nl_cache_foreach(addr_cache, acm_if_iter, (void *)&ctx_cb);

	return 0;
}
rdma-core-56.1/ibacm/src/acm_util.h000066400000000000000000000047171477342711600171510ustar00rootroot00000000000000/*
 * Copyright (c) 2014 Intel Corporation. All rights reserved.
 *
 * This software is available to you under the OpenFabrics.org BSD license
 * below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
#if !defined(ACM_IF_H)
#define ACM_IF_H

#include
#include

#ifdef ACME_PRINTS
#undef acm_log
#define acm_log(level, format, ...) \
	printf(format, ## __VA_ARGS__)
#define acm_log_once(level, format, ...) \
	printf(format, ## __VA_ARGS__)
#else /* !ACME_PRINTS */
#define acm_log(level, format, ...) \
	acm_write(level, "%s: "format, __func__, ## __VA_ARGS__)
#define acm_log_once(level, format, ...) \
	do { \
		static bool once; \
		if (!once) { \
			acm_write(level, "%s: "format, __func__, ## __VA_ARGS__); \
			once = true; \
		} \
	} while (0)
#endif /* ACME_PRINTS */

int acm_if_get_pkey(char *ifname, uint16_t *pkey);
int acm_if_get_sgid(char *ifname, union ibv_gid *sgid);

int acm_init_if_iter_sys(void);
void acm_fini_if_iter_sys(void);

typedef void (*acm_if_iter_cb)(char *ifname, union ibv_gid *gid,
			       uint16_t pkey, uint8_t addr_type,
			       uint8_t *addr, char *ip_str, void *ctx);
int acm_if_iter_sys(acm_if_iter_cb cb, void *ctx);

char **parse(const char *args, int *count);

#endif /* ACM_IF_H */
rdma-core-56.1/ibacm/src/acme.c000066400000000000000000001013071477342711600162470ustar00rootroot00000000000000/*
 * Copyright (c) 2009-2010 Intel Corporation. All rights reserved.
 * Copyright (c) 2013 Mellanox Technologies LTD. All rights reserved.
 *
 * This software is available to you under the OpenIB.org BSD license
 * below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include "libacm.h"
#include "acm_util.h"

static const char *dest_dir = ACM_CONF_DIR;
static const char *addr_file = ACM_ADDR_FILE;
static const char *opts_file = ACM_OPTS_FILE;
static char *dest_addr;
static char *src_addr;
#if IBACM_SERVER_MODE_DEFAULT == IBACM_SERVER_MODE_UNIX
static const char *svc_arg = IBACM_IBACME_SERVER_PATH;
#else
static const char *svc_arg = "localhost";
#endif
static char *dest_arg;
static char *src_arg;
static char addr_type = 'u';
static int verify;
static int nodelay;
static int repetitions = 1;
static int ep_index;
static int enum_ep;

enum perf_query_output {
	PERF_QUERY_NONE,
	PERF_QUERY_ROW,
	PERF_QUERY_COL,
	PERF_QUERY_EP_INDEX,
	PERF_QUERY_EP_ALL,
	PERF_QUERY_EP_ADDR
};
static enum perf_query_output perf_query;
static int verbose;

static struct ibv_context **verbs;
static int dev_cnt;

#define VPRINT(format, ...) \
do { if (verbose) printf(format, ## __VA_ARGS__ ); } while (0) static void show_usage(char *program) { printf("usage 1: %s\n", program); printf("Query specified ibacm service for data\n"); printf(" [-e [N]] - display one or all endpoints:\n"); printf(" No index: all endpoints\n"); printf(" N: endpoint N (N = 1, 2, ...)\n"); printf(" [-f addr_format] - i(p), n(ame), l(id), g(gid), or u(nspecified)\n"); printf(" address format for -s and -d options, default: 'u'\n"); printf(" [-s src_addr] - source address for path queries\n"); printf(" [-d dest_addr] - destination addresses for path queries\n"); printf(" [-v] - verify ACM response against SA query response\n"); printf(" [-c] - read ACM cached data only\n"); printf(" [-P [opt]] - query performance data from destination service:\n"); printf(" No option: output combined data in row format.\n"); printf(" col: output combined data in column format.\n"); printf(" N: output data for endpoint N (N = 1, 2,...)\n"); printf(" all: output data for all endpoints\n"); printf(" s: output data for the endpoint with the\n"); printf(" address specified in -s option\n"); printf(" [-S svc_addr] - address of ACM service, default: local service\n"); printf(" [-C repetitions] - repeat count for resolution\n"); printf("usage 2: %s\n", program); printf("Generate default ibacm service configuration and option files\n"); printf(" -A [addr_file] - generate local address configuration file\n"); printf(" (default is %s)\n", ACM_ADDR_FILE); printf(" -O [opt_file] - generate local ibacm_opts.cfg options file\n"); printf(" (default is %s)\n", ACM_OPTS_FILE); printf(" -D dest_dir - specify destination directory for output files\n"); printf(" (default is %s)\n", ACM_CONF_DIR); printf(" -V - enable verbose output\n"); } static void gen_opts_temp(FILE *f) { fprintf(f, "# InfiniBand Communication Manager Assistant for clusters configuration file\n"); fprintf(f, "#\n"); fprintf(f, "# Use ib_acme utility with -O option to automatically generate a sample\n"); fprintf(f, "# ibacm_opts.cfg file for the current system.\n"); fprintf(f, "#\n"); fprintf(f, "# Entry format is:\n"); fprintf(f, "# name value\n"); fprintf(f, "\n"); fprintf(f, "# log_file:\n"); fprintf(f, "# Specifies the location of the ACM service output. The log file is used to\n"); fprintf(f, "# assist with ACM service debugging and troubleshooting. The log_file can\n"); fprintf(f, "# be set to 'stdout', 'stderr', or the name of a file.\n"); fprintf(f, "# Examples:\n"); fprintf(f, "# log_file stdout\n"); fprintf(f, "# log_file stderr\n"); fprintf(f, "# log_file %s\n", IBACM_LOG_FILE); fprintf(f, "\n"); fprintf(f, "log_file %s\n", IBACM_LOG_FILE); fprintf(f, "\n"); fprintf(f, "# log_level:\n"); fprintf(f, "# Indicates the amount of detailed data written to the log file. 
Log levels\n"); fprintf(f, "# should be one of the following values:\n"); fprintf(f, "# 0 - basic configuration & errors\n"); fprintf(f, "# 1 - verbose configuration & errors\n"); fprintf(f, "# 2 - verbose operation\n"); fprintf(f, "\n"); fprintf(f, "log_level 0\n"); fprintf(f, "\n"); fprintf(f, "# libibumad debug level:\n"); fprintf(f, "# Set the umad library internal debug level to level.\n"); fprintf(f, "# 0 - no debug (the default)\n"); fprintf(f, "# 1 - basic debug information\n"); fprintf(f, "# 2 - verbose debug information.\n"); fprintf(f, "\n"); fprintf(f, "umad_debug_level 0\n"); fprintf(f, "\n"); fprintf(f, "# lock_file:\n"); fprintf(f, "# Specifies the location of the ACM lock file used to ensure that only a\n"); fprintf(f, "# single instance of ACM is running.\n"); fprintf(f, "\n"); fprintf(f, "lock_file %s\n", IBACM_PID_FILE); fprintf(f, "\n"); fprintf(f, "# addr_prot:\n"); fprintf(f, "# Default resolution protocol to resolve IP addresses into IB GIDs.\n"); fprintf(f, "# Supported protocols are:\n"); fprintf(f, "# acm - Use ACM multicast protocol, which is similar to ARP.\n"); fprintf(f, "\n"); fprintf(f, "addr_prot acm\n"); fprintf(f, "\n"); fprintf(f, "# addr_timeout:\n"); fprintf(f, "# Number of minutes to maintain IP address to GID mapping before\n"); fprintf(f, "# repeating address resolution. A value of -1 indicates that the\n"); fprintf(f, "# mapping will not time out.\n"); fprintf(f, "# 1 hour = 60, 1 day = 1440, 1 week = 10080, 1 month ~ 43200"); fprintf(f, "\n"); fprintf(f, "addr_timeout 1440\n"); fprintf(f, "\n"); fprintf(f, "# route_prot:\n"); fprintf(f, "# Default resolution protocol to resolve IB routing information.\n"); fprintf(f, "# Supported protocols are:\n"); fprintf(f, "# sa - Query SA for path record data and cache results.\n"); fprintf(f, "# acm - Use ACM multicast protocol.\n"); fprintf(f, "\n"); fprintf(f, "route_prot sa\n"); fprintf(f, "\n"); fprintf(f, "# route_timeout:\n"); fprintf(f, "# Number of minutes to maintain IB routing information before\n"); fprintf(f, "# repeating route resolution. A value of -1 indicates that the\n"); fprintf(f, "# mapping will not time out. 
However, the route will\n");
	fprintf(f, "# automatically time out when the address times out.\n");
	fprintf(f, "# 1 hour = 60, 1 day = 1440, 1 week = 10080, 1 month ~ 43200");
	fprintf(f, "\n");
	fprintf(f, "route_timeout -1\n");
	fprintf(f, "\n");
	fprintf(f, "# loopback_prot:\n");
	fprintf(f, "# Address and route resolution protocol to resolve local addresses\n");
	fprintf(f, "# Supported protocols are:\n");
	fprintf(f, "# none - Use same protocols defined for addr_prot and route_prot\n");
	fprintf(f, "# local - Resolve information used locally available data\n");
	fprintf(f, "\n");
	fprintf(f, "loopback_prot local\n");
	fprintf(f, "\n");
	fprintf(f, "# server_port:\n");
	fprintf(f, "# TCP port number that the server listens on.\n");
	fprintf(f, "# If this value is changed, then a corresponding change is required for\n");
	fprintf(f, "# client applications.\n");
	fprintf(f, "\n");
	fprintf(f, "server_port 6125\n");
	fprintf(f, "\n");
	fprintf(f, "# server_mode:\n");
	fprintf(f, "# Selects how clients can connect to this server:\n");
	fprintf(f, "# unix - Use unix-domain sockets,");
	fprintf(f, " hence limits service to the same machine.\n");
	fprintf(f, "# loop - Limit incoming connections");
	fprintf(f, " for server_port to 127.0.0.1.\n");
	fprintf(f, "# open - Allow incoming connections");
	fprintf(f, " from any TCP client (internal or external).\n");
	fprintf(f, "\n");
#if IBACM_SERVER_MODE_DEFAULT == IBACM_SERVER_MODE_OPEN
	fprintf(f, "server_mode open\n");
#elif IBACM_SERVER_MODE_DEFAULT == IBACM_SERVER_MODE_LOOP
	fprintf(f, "server_mode loop\n");
#else
	fprintf(f, "server_mode unix\n");
#endif
	fprintf(f, "\n");
	fprintf(f, "# acme_plus_kernel_only:\n");
	fprintf(f, "# If set to 'true', 'yes' or a non-zero number\n");
	fprintf(f, "# ibacm will only serve requests originating\n");
	fprintf(f, "# from the kernel or the ib_acme utility.\n");
	fprintf(f, "# Please note that this option is ignored if the ibacm\n");
	fprintf(f, "# service is started on demand by systemd,\n");
	fprintf(f, "# in which case this option is treated\n");
	fprintf(f, "# as if it were set to 'no'\n");
	fprintf(f, "\n");
#if IBACM_ACME_PLUS_KERNEL_ONLY_DEFAULT
	fprintf(f, "acme_plus_kernel_only yes\n");
#else
	fprintf(f, "acme_plus_kernel_only no\n");
#endif
	fprintf(f, "\n");
	fprintf(f, "# timeout:\n");
	fprintf(f, "# Additional time, in milliseconds, that the ACM service will wait for a\n");
	fprintf(f, "# response from a remote ACM service or the IB SA. The actual request\n");
	fprintf(f, "# timeout is this value plus the subnet timeout.\n");
	fprintf(f, "\n");
	fprintf(f, "timeout 2000\n");
	fprintf(f, "\n");
	fprintf(f, "# retries:\n");
	fprintf(f, "# Number of times that the ACM service will retry a request. This affects\n");
	fprintf(f, "# both ACM multicast messages and IB SA messages.\n");
	fprintf(f, "\n");
	fprintf(f, "retries 2\n");
	fprintf(f, "\n");
	fprintf(f, "# resolve_depth:\n");
	fprintf(f, "# Specifies the maximum number of outstanding requests that can be in\n");
	fprintf(f, "# progress simultaneously. 
A larger resolve depth allows for greater\n"); fprintf(f, "# parallelism, but increases system resource usage and subnet load.\n"); fprintf(f, "# If the number of pending requests is greater than the resolve_depth,\n"); fprintf(f, "# the additional requests will automatically be queued until some of\n"); fprintf(f, "# the previous requests complete.\n"); fprintf(f, "\n"); fprintf(f, "resolve_depth 1\n"); fprintf(f, "\n"); fprintf(f, "# sa_depth:\n"); fprintf(f, "# Specifies the maximum number of outstanding requests to the SA that\n"); fprintf(f, "# can be in progress simultaneously. A larger SA depth allows for greater\n"); fprintf(f, "# parallelism, but increases system resource usage and SA load.\n"); fprintf(f, "# If the number of pending SA requests is greater than the sa_depth,\n"); fprintf(f, "# the additional requests will automatically be queued until some of\n"); fprintf(f, "# the previous requests complete. The number of outstanding SA requests\n"); fprintf(f, "# is separate from the specified resolve_depth.\n"); fprintf(f, "\n"); fprintf(f, "sa_depth 1\n"); fprintf(f, "\n"); fprintf(f, "# send_depth:\n"); fprintf(f, "# Specifies the number of outstanding send operations that can\n"); fprintf(f, "# be in progress simultaneously. A larger send depth allows for\n"); fprintf(f, "# greater parallelism, but consumes more system resources and subnet load.\n"); fprintf(f, "# The send_depth is in addition to resolve_depth and sa_depth, and limits\n"); fprintf(f, "# the transfer of responses.\n"); fprintf(f, "\n"); fprintf(f, "send_depth 1\n"); fprintf(f, "\n"); fprintf(f, "# recv_depth:\n"); fprintf(f, "# Specifies the number of buffers allocated and ready to receive remote\n"); fprintf(f, "# requests. A larger receive depth consumes more system resources, but\n"); fprintf(f, "# can avoid dropping requests due to insufficient receive buffers.\n"); fprintf(f, "\n"); fprintf(f, "recv_depth 1024\n"); fprintf(f, "\n"); fprintf(f, "# min_mtu:\n"); fprintf(f, "# Indicates the minimum MTU supported by the ACM service. The ACM service\n"); fprintf(f, "# negotiates to use the largest MTU available between both sides of a\n"); fprintf(f, "# connection. It is most efficient and recommended that min_mtu be set\n"); fprintf(f, "# to the largest MTU value supported by all nodes in a cluster.\n"); fprintf(f, "\n"); fprintf(f, "min_mtu 2048\n"); fprintf(f, "\n"); fprintf(f, "# min_rate:\n"); fprintf(f, "# Indicates the minimum link rate, in Gbps, supported by the ACM service.\n"); fprintf(f, "# The ACM service negotiates to use the highest rate available between both\n"); fprintf(f, "# sides of a connection. It is most efficient and recommended that the\n"); fprintf(f, "# min_rate be set to the largest rate supported by all nodes in a cluster.\n"); fprintf(f, "\n"); fprintf(f, "min_rate 10\n"); fprintf(f, "\n"); fprintf(f, "# route_preload:\n"); fprintf(f, "# Specifies if the ACM routing cache should be preloaded, or built on demand.\n"); fprintf(f, "# If preloaded, indicates the method used to build the cache.\n"); fprintf(f, "# Supported preload values are:\n"); fprintf(f, "# none - The routing cache is not pre-built (default)\n"); fprintf(f, "# opensm_full_v1 - OpenSM 'full' path records dump file format (version 1)\n"); fprintf(f, "\n"); fprintf(f, "route_preload none\n"); fprintf(f, "\n"); fprintf(f, "# route_data_file:\n"); fprintf(f, "# Specifies the location of the route data file to use when preloading\n"); fprintf(f, "# the ACM cache. 
This option is only valid if route_preload\n"); fprintf(f, "# indicates that routing data should be read from a file.\n"); fprintf(f, "# Default is %s/ibacm_route.data\n", ACM_CONF_DIR); fprintf(f, "# route_data_file %s/ibacm_route.data\n", ACM_CONF_DIR); fprintf(f, "\n"); fprintf(f, "# addr_preload:\n"); fprintf(f, "# Specifies if the ACM address cache should be preloaded, or built on demand.\n"); fprintf(f, "# If preloaded, indicates the method used to build the cache.\n"); fprintf(f, "# Supported preload values are:\n"); fprintf(f, "# none - The address cache is not pre-built (default)\n"); fprintf(f, "# acm_hosts - ACM address to GID file format\n"); fprintf(f, "\n"); fprintf(f, "addr_preload none\n"); fprintf(f, "\n"); fprintf(f, "# addr_data_file:\n"); fprintf(f, "# Specifies the location of the address data file to use when preloading\n"); fprintf(f, "# the ACM cache. This option is only valid if addr_preload\n"); fprintf(f, "# indicates that address data should be read from a file.\n"); fprintf(f, "# Default is %s/ibacm_hosts.data\n", ACM_CONF_DIR); fprintf(f, "# addr_data_file %s/ibacm_hosts.data\n", ACM_CONF_DIR); fprintf(f, "\n"); fprintf(f, "# support_ips_in_addr_cfg:\n"); fprintf(f, "# If 1 continue to read IP addresses from ibacm_addr.cfg\n"); fprintf(f, "# Default is 0 \"no\"\n"); fprintf(f, "# support_ips_in_addr_cfg 0\n"); fprintf(f, "\n"); fprintf(f, "# provider_lib_path:\n"); fprintf(f, "# Specifies the directory of the provider libraries\n"); fprintf(f, "\n"); fprintf(f, "# provider_lib_path %s\n", IBACM_LIB_PATH); fprintf(f, "\n"); fprintf(f, "# provider:\n"); fprintf(f, "# Specifies the provider to assign to each subnet\n"); fprintf(f, "# ACM providers may override the address and route resolution\n"); fprintf(f, "# protocols with provider specific protocols.\n"); fprintf(f, "# provider name (prefix | default)\n"); fprintf(f, "# Example:\n"); fprintf(f, "# provider ibacmp 0xFE80000000000000\n"); fprintf(f, "# provider ibacmp default\n"); fprintf(f, "\n"); } static int open_dir(void) { mkdir(dest_dir, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH); if (chdir(dest_dir)) { printf("Failed to open directory %s: %s\n", dest_dir, strerror(errno)); return -1; } return 0; } static int gen_opts(void) { FILE *f; VPRINT("Generating %s/%s\n", dest_dir, opts_file); if (open_dir() || !(f = fopen(opts_file, "w"))) { printf("Failed to open option configuration file: %s\n", strerror(errno)); return -1; } gen_opts_temp(f); fclose(f); return 0; } static void gen_addr_temp(FILE *f) { fprintf(f, "# InfiniBand Communication Management Assistant for clusters address file\n"); fprintf(f, "#\n"); fprintf(f, "# Use ib_acme utility with -A option to automatically generate a sample\n"); fprintf(f, "# ibacm_addr.cfg file for the current system.\n"); fprintf(f, "#\n"); fprintf(f, "# Entry format is:\n"); fprintf(f, "# address device port pkey\n"); fprintf(f, "#\n"); fprintf(f, "# NOTE: IP addresses are now automatically read and monitored on the system.\n"); fprintf(f, "# Therefore they are no longer required in this file.\n"); fprintf(f, "#\n"); fprintf(f, "# The address may be one of the following:\n"); fprintf(f, "# host_name - ascii character string, up to 31 characters\n"); fprintf(f, "#\n"); fprintf(f, "# device name - struct ibv_device name\n"); fprintf(f, "# port number - valid port number on device (numbering starts at 1)\n"); fprintf(f, "# pkey - partition key in hex (can specify 'default' for first entry in pkey table)\n"); fprintf(f, "#\n"); fprintf(f, "# Up to 4 addresses can be associated 
with a given tuple\n"); fprintf(f, "#\n"); fprintf(f, "# Samples:\n"); fprintf(f, "# node31 ibv_device0 1 default\n"); fprintf(f, "# node31-1 ibv_device0 1 0x00FF\n"); fprintf(f, "# node31-2 ibv_device0 2 0x00FF\n"); } static int open_verbs(void) { struct ibv_device **dev_array; int i, ret; dev_array = ibv_get_device_list(&dev_cnt); if (!dev_array) { printf("ibv_get_device_list - no devices present?\n"); return -1; } verbs = malloc(sizeof(struct ibv_context *) * dev_cnt); if (!verbs) { ret = -1; goto err1; } for (i = 0; i < dev_cnt; i++) { verbs[i] = ibv_open_device(dev_array[i]); if (!verbs[i]) { printf("ibv_open_device - failed to open device\n"); ret = -1; goto err2; } } ibv_free_device_list(dev_array); return 0; err2: while (i--) ibv_close_device(verbs[i]); free(verbs); err1: ibv_free_device_list(dev_array); return ret; } static void close_verbs(void) { int i; for (i = 0; i < dev_cnt; i++) ibv_close_device(verbs[i]); free(verbs); } static int gen_addr_names(FILE *f) { struct ibv_device_attr dev_attr; struct ibv_port_attr port_attr; int i, index, ret, found_active; char host_name[256]; uint32_t p; ret = gethostname(host_name, sizeof host_name); if (ret) { printf("gethostname error: %d\n", ret); return ret; } strtok(host_name, "."); found_active = 0; index = 1; for (i = 0; i < dev_cnt; i++) { ret = ibv_query_device(verbs[i], &dev_attr); if (ret) break; for (p = 1; p <= dev_attr.phys_port_cnt; p++) { if (!found_active) { ret = ibv_query_port(verbs[i], p, &port_attr); if (!ret && port_attr.state == IBV_PORT_ACTIVE) { VPRINT("%s %s %u default\n", host_name, verbs[i]->device->name, p); fprintf(f, "%s %s %u default\n", host_name, verbs[i]->device->name, p); found_active = 1; } } VPRINT("%s-%d %s %u default\n", host_name, index, verbs[i]->device->name, p); fprintf(f, "%s-%d %s %u default\n", host_name, index++, verbs[i]->device->name, p); } } return ret; } static int gen_addr(void) { FILE *f; int ret; VPRINT("Generating %s/%s\n", dest_dir, addr_file); if (open_dir() || !(f = fopen(addr_file, "w"))) { printf("Failed to open address configuration file: %s\n", strerror(errno)); return -1; } ret = open_verbs(); if (ret) { goto out1; } gen_addr_temp(f); ret = gen_addr_names(f); if (ret) { printf("Failed to auto generate host names in config file\n"); goto out2; } out2: close_verbs(); out1: fclose(f); return ret; } static void show_path(struct ibv_path_record *path) { char gid[sizeof "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"]; uint32_t fl_hop; printf("Path information\n"); inet_ntop(AF_INET6, path->dgid.raw, gid, sizeof gid); printf(" dgid: %s\n", gid); inet_ntop(AF_INET6, path->sgid.raw, gid, sizeof gid); printf(" sgid: %s\n", gid); printf(" dlid: %u\n", be16toh(path->dlid)); printf(" slid: %u\n", be16toh(path->slid)); fl_hop = be32toh(path->flowlabel_hoplimit); printf(" flow label: 0x%x\n", fl_hop >> 8); printf(" hop limit: %d\n", (uint8_t) fl_hop); printf(" tclass: %d\n", path->tclass); printf(" reversible: %d\n", path->reversible_numpath >> 7); printf(" pkey: 0x%x\n", be16toh(path->pkey)); printf(" sl: %d\n", be16toh(path->qosclass_sl) & 0xF); printf(" mtu: %d\n", path->mtu & 0x1F); printf(" rate: %d\n", path->rate & 0x1F); printf(" packet lifetime: %d\n", path->packetlifetime & 0x1F); } static uint32_t get_resolve_flags(void) { uint32_t flags = 0; if (nodelay) flags |= ACM_FLAGS_NODELAY; return flags; } static int inet_any_pton(char *addr, struct sockaddr *sa) { struct sockaddr_in *sin; struct sockaddr_in6 *sin6; int ret; sin = (struct sockaddr_in *) sa; sa->sa_family = AF_INET; ret = 
inet_pton(AF_INET, addr, &sin->sin_addr); if (ret <= 0) { sin6 = (struct sockaddr_in6 *) sa; sa->sa_family = AF_INET6; ret = inet_pton(AF_INET6, addr, &sin6->sin6_addr); } return ret; } static int resolve_ip(struct ibv_path_record *path) { struct ibv_path_data *paths; struct sockaddr_storage src, dest; struct sockaddr *saddr; int ret, count; if (src_addr) { saddr = (struct sockaddr *) &src; ret = inet_any_pton(src_addr, saddr); if (ret <= 0) { printf("inet_pton error on source address (%s): 0x%x\n", src_addr, ret); return -1; } } else { saddr = NULL; } ret = inet_any_pton(dest_addr, (struct sockaddr *) &dest); if (ret <= 0) { printf("inet_pton error on destination address (%s): 0x%x\n", dest_addr, ret); return -1; } if (src_addr && src.ss_family != dest.ss_family) { printf("source and destination address families don't match\n"); return -1; } ret = ib_acm_resolve_ip(saddr, (struct sockaddr *) &dest, &paths, &count, get_resolve_flags(), (repetitions == 1)); if (ret) { printf("ib_acm_resolve_ip failed: %s\n", strerror(errno)); return ret; } *path = paths[0].path; ib_acm_free_paths(paths); return 0; } static int resolve_name(struct ibv_path_record *path) { struct ibv_path_data *paths; int ret, count; ret = ib_acm_resolve_name(src_addr, dest_addr, &paths, &count, get_resolve_flags(), (repetitions == 1)); if (ret) { printf("ib_acm_resolve_name failed: %s\n", strerror(errno)); return ret; } *path = paths[0].path; ib_acm_free_paths(paths); return 0; } static int resolve_lid(struct ibv_path_record *path) { int ret; if (src_addr) path->slid = htobe16((uint16_t) atoi(src_addr)); path->dlid = htobe16((uint16_t) atoi(dest_addr)); path->reversible_numpath = IBV_PATH_RECORD_REVERSIBLE | 1; ret = ib_acm_resolve_path(path, get_resolve_flags()); if (ret) printf("ib_acm_resolve_path failed: %s\n", strerror(errno)); return ret; } static int resolve_gid(struct ibv_path_record *path) { int ret; if (src_addr) { ret = inet_pton(AF_INET6, src_addr, &path->sgid); if (ret <= 0) { printf("inet_pton error on source address (%s): 0x%x\n", src_addr, ret); return ret ? ret : -1; } } ret = inet_pton(AF_INET6, dest_addr, &path->dgid); if (ret <= 0) { printf("inet_pton error on dest address (%s): 0x%x\n", dest_addr, ret); return ret ? 
ret : -1; } path->reversible_numpath = IBV_PATH_RECORD_REVERSIBLE | 1; ret = ib_acm_resolve_path(path, get_resolve_flags()); if (ret) printf("ib_acm_resolve_path failed: %s\n", strerror(errno)); return ret; } static int verify_resolve(struct ibv_path_record *path) { int ret; ret = ib_acm_resolve_path(path, ACM_FLAGS_QUERY_SA); if (ret) printf("SA verification: failed %s\n", strerror(errno)); else printf("SA verification: success\n"); return ret; } static char *get_dest(char *arg, char *format) { static char addr[64]; struct addrinfo hint, *res; const char *ai; int ret; if (!arg || addr_type != 'u') { *format = addr_type; return arg; } if ((inet_pton(AF_INET, arg, addr) > 0) || (inet_pton(AF_INET6, arg, addr) > 0)) { *format = 'i'; return arg; } memset(&hint, 0, sizeof hint); hint.ai_protocol = IPPROTO_TCP; ret = getaddrinfo(arg, NULL, &hint, &res); if (ret) { *format = 'l'; return arg; } if (res->ai_family == AF_INET) { ai = inet_ntop(AF_INET, &((struct sockaddr_in *) res->ai_addr)->sin_addr, addr, sizeof addr); } else { ai = inet_ntop(AF_INET6, &((struct sockaddr_in6 *) res->ai_addr)->sin6_addr, addr, sizeof addr); } freeaddrinfo(res); if (ai) { *format = 'i'; return addr; } else { *format = 'n'; return arg; } } static int resolve(char *svc) { char **dest_list, **src_list; struct ibv_path_record path; int ret = -1, d = 0, s = 0, i; char dest_type; dest_list = parse(dest_arg, NULL); if (!dest_list) { printf("Unable to parse destination argument\n"); return ret; } src_list = src_arg ? parse(src_arg, NULL) : NULL; printf("Service: %s\n", svc); for (dest_addr = get_dest(dest_list[d], &dest_type); dest_addr; dest_addr = get_dest(dest_list[++d], &dest_type)) { s = 0; src_addr = src_list ? src_list[s] : NULL; do { printf("Destination: %s\n", dest_addr); if (src_addr) printf("Source: %s\n", src_addr); for (i = 0; i < repetitions; i++) { switch (dest_type) { case 'i': ret = resolve_ip(&path); break; case 'n': ret = resolve_name(&path); break; case 'l': memset(&path, 0, sizeof path); ret = resolve_lid(&path); break; case 'g': memset(&path, 0, sizeof path); ret = resolve_gid(&path); break; default: break; } } if (!ret) show_path(&path); if (!ret && verify) ret = verify_resolve(&path); printf("\n"); if (src_list) src_addr = src_list[++s]; } while (src_addr); } free(src_list); free(dest_list); return ret; } static int query_perf_ip(uint64_t **counters, int *cnt) { union _sockaddr { struct sockaddr_storage src; struct sockaddr saddr; } addr; uint8_t type; struct sockaddr_in *sin; struct sockaddr_in6 *sin6; int ret; VPRINT("%s: src_addr %s\n", __FUNCTION__, src_addr); addr.saddr.sa_family = AF_INET; sin = (struct sockaddr_in *) &addr.saddr; ret = inet_pton(AF_INET, src_addr, &sin->sin_addr); if (ret <= 0) { addr.saddr.sa_family = AF_INET6; sin6 = (struct sockaddr_in6 *)&addr.saddr; ret = inet_pton(AF_INET6, src_addr, &sin6->sin6_addr); if (ret <= 0) { printf("inet_pton error on src address (%s): 0x%x\n", src_addr, ret); return -1; } type = ACM_EP_INFO_ADDRESS_IP6; } else { type = ACM_EP_INFO_ADDRESS_IP; } ret = ib_acm_query_perf_ep_addr((uint8_t *)&addr.src, type, counters, cnt); if (ret) { printf("ib_acm_query_perf failed: %s\n", strerror(errno)); return ret; } return 0; } static int query_perf_name(uint64_t **counters, int *cnt) { int ret; VPRINT("%s: src_addr %s\n", __FUNCTION__, src_addr); ret = ib_acm_query_perf_ep_addr((uint8_t *)src_addr, ACM_EP_INFO_NAME, counters, cnt); if (ret) { printf("ib_acm_query_perf failed: %s\n", strerror(errno)); return ret; } return 0; } static int 
query_perf_ep_addr(uint64_t **counters, int *cnt) { int ret; char src_type; src_addr = get_dest(src_arg, &src_type); switch (src_type) { case 'i': ret = query_perf_ip(counters, cnt); break; case 'n': ret = query_perf_name(counters, cnt); break; default: printf("Unsupported src_type %d\n", src_type); return -1; } return ret; } static int query_perf_one(char *svc, int index) { static int labels; int ret, cnt, i; uint64_t *counters; if (perf_query == PERF_QUERY_EP_ADDR) ret = query_perf_ep_addr(&counters, &cnt); else ret = ib_acm_query_perf(index, &counters, &cnt); if (ret) { if (perf_query != PERF_QUERY_EP_ALL) { printf("%s: Failed to query perf data: %s\n", svc, strerror(errno)); } return ret; } if (perf_query != PERF_QUERY_COL) { if (!labels) { printf("svc,"); for (i = 0; i < cnt - 1; i++) printf("%s,", ib_acm_cntr_name(i)); printf("%s\n", ib_acm_cntr_name(i)); labels = 1; } printf("%s,", svc); for (i = 0; i < cnt - 1; i++) printf("%llu,", (unsigned long long) counters[i]); printf("%llu\n", (unsigned long long) counters[i]); } else { printf("%s\n", svc); for (i = 0; i < cnt; i++) { printf("%s : ", ib_acm_cntr_name(i)); printf("%llu\n", (unsigned long long) counters[i]); } } ib_acm_free_perf(counters); return 0; } static void query_perf(char *svc) { int index = 1; if (perf_query != PERF_QUERY_EP_ALL) { query_perf_one(svc, ep_index); } else { while (!query_perf_one(svc, index++)); } } static int enumerate_ep(char *svc, int index) { static int labels; int ret, i; struct acm_ep_config_data *ep_data; int phys_port_cnt = 255; int found = 0; int port; for (port = 1; port <= phys_port_cnt; ++port) { ret = ib_acm_enum_ep(index, &ep_data, port); if (ret) continue; found = 1; if (!labels) { printf("svc,guid,port,pkey,ep_index,prov,addr_0,addresses\n"); labels = 1; } printf("%s,0x%016" PRIx64 ",%d,0x%04x,%d,%s", svc, ep_data->dev_guid, ep_data->port_num, ep_data->pkey, index, ep_data->prov_name); for (i = 0; i < ep_data->addr_cnt; i++) printf(",%s", ep_data->addrs[i].name); printf("\n"); phys_port_cnt = ep_data->phys_port_cnt; ib_acm_free_ep_data(ep_data); } return !found; } static void enumerate_eps(char *svc) { int index = 1; if (ep_index > 0) { if (enumerate_ep(svc, ep_index)) printf(" Endpoint %d is not available\n", ep_index); } else { while (!enumerate_ep(svc, index++)); } } static int query_svcs(void) { char **svc_list; int ret = -1, i; svc_list = parse(svc_arg, NULL); if (!svc_list) { printf("Unable to parse service list argument\n"); return -1; } for (i = 0; svc_list[i]; i++) { ret = ib_acm_connect(svc_list[i]); if (ret) { printf("%s,unable to contact service: %s\n", svc_list[i], strerror(errno)); continue; } if (dest_arg) ret = resolve(svc_list[i]); if (perf_query) query_perf(svc_list[i]); if (enum_ep) enumerate_eps(svc_list[i]); ib_acm_disconnect(); } free(svc_list); return ret; } static char *opt_arg(int argc, char **argv) { if (optarg) return optarg; if ((optind < argc) && (argv[optind][0] != '-')) return argv[optind]; return NULL; } static void parse_perf_arg(char *arg) { if (!strncasecmp("col", arg, 3)) { perf_query = PERF_QUERY_COL; } else if (!strncasecmp("all", arg, 3)) { perf_query = PERF_QUERY_EP_ALL; } else if (!strcmp("s", arg)) { perf_query = PERF_QUERY_EP_ADDR; } else { ep_index = atoi(arg); if (ep_index > 0) perf_query = PERF_QUERY_EP_INDEX; else perf_query = PERF_QUERY_ROW; } } int main(int argc, char **argv) { int op, ret = 0; int make_addr = 0; int make_opts = 0; while ((op = getopt(argc, argv, "e::f:s:d:vcA::O::D:P::S:C:V")) != -1) { switch (op) { case 'e': enum_ep = 1; if 
(opt_arg(argc, argv))
				ep_index = atoi(opt_arg(argc, argv));
			break;
		case 'f':
			addr_type = optarg[0];
			if (addr_type != 'i' && addr_type != 'n' &&
			    addr_type != 'l' && addr_type != 'g')
				goto show_use;
			break;
		case 's':
			src_arg = optarg;
			break;
		case 'd':
			dest_arg = optarg;
			break;
		case 'v':
			verify = 1;
			break;
		case 'c':
			nodelay = 1;
			break;
		case 'A':
			make_addr = 1;
			if (opt_arg(argc, argv))
				addr_file = opt_arg(argc, argv);
			break;
		case 'O':
			make_opts = 1;
			if (opt_arg(argc, argv))
				opts_file = opt_arg(argc, argv);
			break;
		case 'D':
			dest_dir = optarg;
			break;
		case 'P':
			if (opt_arg(argc, argv))
				parse_perf_arg(opt_arg(argc, argv));
			else
				perf_query = PERF_QUERY_ROW;
			break;
		case 'S':
			svc_arg = optarg;
			break;
		case 'C':
			repetitions = atoi(optarg);
			if (!repetitions)
				repetitions = 1;
			break;
		case 'V':
			verbose = 1;
			break;
		default:
			goto show_use;
		}
	}

	if ((src_arg && (!dest_arg && perf_query != PERF_QUERY_EP_ADDR)) ||
	    (perf_query == PERF_QUERY_EP_ADDR && !src_arg) ||
	    (!src_arg && !dest_arg && !perf_query && !make_addr &&
	     !make_opts && !enum_ep))
		goto show_use;

	if (dest_arg || perf_query || enum_ep)
		ret = query_svcs();

	if (!ret && make_addr)
		ret = gen_addr();

	if (!ret && make_opts)
		ret = gen_opts();

	if (verbose || !(make_addr || make_opts) || ret)
		printf("return status 0x%x\n", ret);
	return ret;

show_use:
	show_usage(argv[0]);
	exit(1);
}
rdma-core-56.1/ibacm/src/libacm.c000066400000000000000000000301651477342711600165740ustar00rootroot00000000000000/*
 * Copyright (c) 2009 Intel Corporation. All rights reserved.
 * Copyright (c) 2013 Mellanox Technologies LTD. All rights reserved.
 *
 * This software is available to you under the OpenIB.org BSD license
 * below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
*/ #include #include #include "libacm.h" #include #include #include #include #include #include #include #include static pthread_mutex_t acm_lock = PTHREAD_MUTEX_INITIALIZER; static int sock = -1; static short server_port = 6125; static void acm_set_server_port(void) { FILE *f; f = fopen(IBACM_IBACME_PORT_FILE, "r"); if (f) { if (fscanf(f, "%hu", (unsigned short *) &server_port) != 1) printf("Failed to read server port\n"); fclose(f); } } static int ib_acm_connect_open(char *dest) { struct addrinfo hint, *res; int ret; acm_set_server_port(); memset(&hint, 0, sizeof hint); hint.ai_family = AF_UNSPEC; hint.ai_protocol = IPPROTO_TCP; ret = getaddrinfo(dest, NULL, &hint, &res); if (ret) return ret; sock = socket(res->ai_family, res->ai_socktype, res->ai_protocol); if (sock == -1) { ret = errno; goto freeaddr; } ((struct sockaddr_in *) res->ai_addr)->sin_port = htobe16(server_port); ret = connect(sock, res->ai_addr, res->ai_addrlen); if (ret) { close(sock); sock = -1; } freeaddr: freeaddrinfo(res); return ret; } static int ib_acm_connect_unix(char *dest) { struct sockaddr_un addr; int ret; addr.sun_family = AF_UNIX; if (dest) { if (snprintf(addr.sun_path, sizeof(addr.sun_path), "%s", dest) >= sizeof(addr.sun_path)) { errno = ENAMETOOLONG; return errno; } } else { BUILD_ASSERT(sizeof(IBACM_IBACME_SERVER_PATH) <= sizeof(addr.sun_path)); strcpy(addr.sun_path, IBACM_IBACME_SERVER_PATH); } sock = socket(AF_UNIX, SOCK_STREAM, 0); if (sock < 0) return errno; if (connect(sock, (struct sockaddr *)&addr, sizeof(addr)) != 0) { ret = errno; close(sock); sock = -1; errno = ret; return ret; } return 0; } int ib_acm_connect(char *dest) { if (dest && *dest == '/') return ib_acm_connect_unix(dest); return ib_acm_connect_open(dest); } void ib_acm_disconnect(void) { if (sock != -1) { shutdown(sock, SHUT_RDWR); close(sock); sock = -1; } } static int acm_format_resp(struct acm_msg *msg, struct ibv_path_data **paths, int *count, int print) { struct ibv_path_data *path_data; char addr[ACM_MAX_ADDRESS]; int i, addr_cnt; *count = 0; addr_cnt = (msg->hdr.length - ACM_MSG_HDR_LENGTH) / sizeof(struct acm_ep_addr_data); path_data = (struct ibv_path_data *) calloc(1, addr_cnt * sizeof(struct ibv_path_data)); if (!path_data) return -1; for (i = 0; i < addr_cnt; i++) { switch (msg->resolve_data[i].type) { case ACM_EP_INFO_PATH: path_data[i].flags = msg->resolve_data[i].flags; path_data[i].path = msg->resolve_data[i].info.path; (*count)++; break; default: if (!(msg->resolve_data[i].flags & ACM_EP_FLAG_SOURCE)) goto err; switch (msg->resolve_data[i].type) { case ACM_EP_INFO_ADDRESS_IP: inet_ntop(AF_INET, msg->resolve_data[i].info.addr, addr, sizeof addr); break; case ACM_EP_INFO_ADDRESS_IP6: inet_ntop(AF_INET6, msg->resolve_data[i].info.addr, addr, sizeof addr); break; case ACM_EP_INFO_NAME: memcpy(addr, msg->resolve_data[i].info.name, ACM_MAX_ADDRESS); break; default: goto err; } if (print) printf("Source: %s\n", addr); break; } } *paths = path_data; return 0; err: free(path_data); return -1; } static int acm_format_ep_addr(struct acm_ep_addr_data *data, uint8_t *addr, uint8_t type, uint32_t flags) { data->type = type; data->flags = flags; switch (type) { case ACM_EP_INFO_NAME: if (!check_snprintf((char *)data->info.name, sizeof(data->info.name), "%s", (char *)addr)) return -1; break; case ACM_EP_INFO_ADDRESS_IP: memcpy(data->info.addr, &((struct sockaddr_in *) addr)->sin_addr, 4); break; case ACM_EP_INFO_ADDRESS_IP6: memcpy(data->info.addr, &((struct sockaddr_in6 *) addr)->sin6_addr, 16); break; default: return -1; } return 0; 
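	/*
	 * The entry built above is what travels on the wire: acm_resolve()
	 * below packs an optional ACM_EP_FLAG_SOURCE entry plus one
	 * ACM_EP_FLAG_DEST entry into msg.resolve_data[] and sets
	 * hdr.length to ACM_MSG_HDR_LENGTH + cnt * ACM_MSG_EP_LENGTH.
	 */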
static int acm_format_resp(struct acm_msg *msg, struct ibv_path_data **paths,
			   int *count, int print)
{
	struct ibv_path_data *path_data;
	char addr[ACM_MAX_ADDRESS];
	int i, addr_cnt;

	*count = 0;
	addr_cnt = (msg->hdr.length - ACM_MSG_HDR_LENGTH) /
		   sizeof(struct acm_ep_addr_data);
	path_data = (struct ibv_path_data *)
		    calloc(1, addr_cnt * sizeof(struct ibv_path_data));
	if (!path_data)
		return -1;

	for (i = 0; i < addr_cnt; i++) {
		switch (msg->resolve_data[i].type) {
		case ACM_EP_INFO_PATH:
			path_data[i].flags = msg->resolve_data[i].flags;
			path_data[i].path = msg->resolve_data[i].info.path;
			(*count)++;
			break;
		default:
			if (!(msg->resolve_data[i].flags & ACM_EP_FLAG_SOURCE))
				goto err;

			switch (msg->resolve_data[i].type) {
			case ACM_EP_INFO_ADDRESS_IP:
				inet_ntop(AF_INET,
					  msg->resolve_data[i].info.addr,
					  addr, sizeof addr);
				break;
			case ACM_EP_INFO_ADDRESS_IP6:
				inet_ntop(AF_INET6,
					  msg->resolve_data[i].info.addr,
					  addr, sizeof addr);
				break;
			case ACM_EP_INFO_NAME:
				memcpy(addr, msg->resolve_data[i].info.name,
				       ACM_MAX_ADDRESS);
				break;
			default:
				goto err;
			}
			if (print)
				printf("Source: %s\n", addr);
			break;
		}
	}

	*paths = path_data;
	return 0;
err:
	free(path_data);
	return -1;
}

static int acm_format_ep_addr(struct acm_ep_addr_data *data, uint8_t *addr,
			      uint8_t type, uint32_t flags)
{
	data->type = type;
	data->flags = flags;

	switch (type) {
	case ACM_EP_INFO_NAME:
		if (!check_snprintf((char *)data->info.name,
				    sizeof(data->info.name), "%s",
				    (char *)addr))
			return -1;
		break;
	case ACM_EP_INFO_ADDRESS_IP:
		memcpy(data->info.addr,
		       &((struct sockaddr_in *) addr)->sin_addr, 4);
		break;
	case ACM_EP_INFO_ADDRESS_IP6:
		memcpy(data->info.addr,
		       &((struct sockaddr_in6 *) addr)->sin6_addr, 16);
		break;
	default:
		return -1;
	}

	return 0;
}

static inline int ERR(int err)
{
	errno = err;
	return -1;
}

static int acm_error(uint8_t status)
{
	switch (status) {
	case ACM_STATUS_SUCCESS:
		return 0;
	case ACM_STATUS_ENOMEM:
		return ERR(ENOMEM);
	case ACM_STATUS_EINVAL:
		return ERR(EINVAL);
	case ACM_STATUS_ENODATA:
		return ERR(ENODATA);
	case ACM_STATUS_ENOTCONN:
		return ERR(ENOTCONN);
	case ACM_STATUS_ETIMEDOUT:
		return ERR(ETIMEDOUT);
	case ACM_STATUS_ESRCADDR:
	case ACM_STATUS_EDESTADDR:
		return ERR(EADDRNOTAVAIL);
	case ACM_STATUS_ESRCTYPE:
	case ACM_STATUS_EDESTTYPE:
	default:
		return ERR(EINVAL);
	}
}

static int acm_resolve(uint8_t *src, uint8_t *dest, uint8_t type,
		       struct ibv_path_data **paths, int *count,
		       uint32_t flags, int print)
{
	struct acm_msg msg;
	int ret, cnt = 0;

	pthread_mutex_lock(&acm_lock);
	memset(&msg, 0, sizeof msg);
	msg.hdr.version = ACM_VERSION;
	msg.hdr.opcode = ACM_OP_RESOLVE;

	if (src) {
		ret = acm_format_ep_addr(&msg.resolve_data[cnt++], src, type,
					 ACM_EP_FLAG_SOURCE);
		if (ret)
			goto out;
	}

	ret = acm_format_ep_addr(&msg.resolve_data[cnt++], dest, type,
				 ACM_EP_FLAG_DEST | flags);
	if (ret)
		goto out;

	msg.hdr.length = ACM_MSG_HDR_LENGTH + (cnt * ACM_MSG_EP_LENGTH);
	ret = send(sock, (char *) &msg, msg.hdr.length, 0);
	if (ret != msg.hdr.length)
		goto out;

	ret = recv(sock, (char *) &msg, sizeof msg, 0);
	if (ret < ACM_MSG_HDR_LENGTH || ret != msg.hdr.length)
		goto out;

	if (msg.hdr.status) {
		ret = acm_error(msg.hdr.status);
		goto out;
	}

	ret = acm_format_resp(&msg, paths, count, print);
out:
	pthread_mutex_unlock(&acm_lock);
	return ret;
}

int ib_acm_resolve_name(char *src, char *dest, struct ibv_path_data **paths,
			int *count, uint32_t flags, int print)
{
	return acm_resolve((uint8_t *) src, (uint8_t *) dest,
			   ACM_EP_INFO_NAME, paths, count, flags, print);
}

int ib_acm_resolve_ip(struct sockaddr *src, struct sockaddr *dest,
		      struct ibv_path_data **paths, int *count,
		      uint32_t flags, int print)
{
	if (((struct sockaddr *) dest)->sa_family == AF_INET) {
		return acm_resolve((uint8_t *) src, (uint8_t *) dest,
				   ACM_EP_INFO_ADDRESS_IP, paths, count,
				   flags, print);
	} else {
		return acm_resolve((uint8_t *) src, (uint8_t *) dest,
				   ACM_EP_INFO_ADDRESS_IP6, paths, count,
				   flags, print);
	}
}

int ib_acm_resolve_path(struct ibv_path_record *path, uint32_t flags)
{
	struct acm_msg msg;
	struct acm_ep_addr_data *data;
	int ret;

	pthread_mutex_lock(&acm_lock);
	memset(&msg, 0, sizeof msg);
	msg.hdr.version = ACM_VERSION;
	msg.hdr.opcode = ACM_OP_RESOLVE;
	msg.hdr.length = ACM_MSG_HDR_LENGTH + ACM_MSG_EP_LENGTH;

	data = &msg.resolve_data[0];
	data->flags = flags;
	data->type = ACM_EP_INFO_PATH;
	data->info.path = *path;

	ret = send(sock, (char *) &msg, msg.hdr.length, 0);
	if (ret != msg.hdr.length)
		goto out;

	ret = recv(sock, (char *) &msg, sizeof msg, 0);
	if (ret < ACM_MSG_HDR_LENGTH || ret != msg.hdr.length)
		goto out;

	ret = acm_error(msg.hdr.status);
	if (!ret)
		*path = data->info.path;
out:
	pthread_mutex_unlock(&acm_lock);
	return ret;
}
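/*
 * Usage sketch (editorial example, not part of the original source):
 * resolving path records for an IPv4 destination after a successful
 * ib_acm_connect(); the address is illustrative and error handling is
 * elided.
 *
 *	struct sockaddr_in dst = { .sin_family = AF_INET };
 *	struct ibv_path_data *paths;
 *	int cnt;
 *
 *	inet_pton(AF_INET, "192.0.2.1", &dst.sin_addr);
 *	if (!ib_acm_resolve_ip(NULL, (struct sockaddr *) &dst,
 *			       &paths, &cnt, 0, 0))
 *		ib_acm_free_paths(paths);
 */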
int ib_acm_query_perf(int index, uint64_t **counters, int *count)
{
	struct acm_msg msg;
	int ret, i;

	pthread_mutex_lock(&acm_lock);
	memset(&msg, 0, sizeof msg);
	msg.hdr.version = ACM_VERSION;
	msg.hdr.opcode = ACM_OP_PERF_QUERY;
	msg.hdr.src_index = index;
	msg.hdr.length = htobe16(ACM_MSG_HDR_LENGTH);

	ret = send(sock, (char *) &msg, ACM_MSG_HDR_LENGTH, 0);
	if (ret != ACM_MSG_HDR_LENGTH)
		goto out;

	ret = recv(sock, (char *) &msg, sizeof msg, 0);
	if (ret < ACM_MSG_HDR_LENGTH || ret != be16toh(msg.hdr.length)) {
		ret = ACM_STATUS_EINVAL;
		goto out;
	}

	if (msg.hdr.status) {
		ret = acm_error(msg.hdr.status);
		goto out;
	}

	*counters = malloc(sizeof(uint64_t) * msg.hdr.src_out);
	if (!*counters) {
		ret = ACM_STATUS_ENOMEM;
		goto out;
	}

	*count = msg.hdr.src_out;
	for (i = 0; i < *count; i++)
		(*counters)[i] = be64toh(msg.perf_data[i]);
	ret = 0;
out:
	pthread_mutex_unlock(&acm_lock);
	return ret;
}

int ib_acm_enum_ep(int index, struct acm_ep_config_data **data, uint8_t port)
{
	struct acm_ep_config_data *netw_edata = NULL;
	struct acm_ep_config_data *host_edata = NULL;
	struct acm_hdr hdr;
	struct acm_msg msg;
	int ret;
	int len;
	int i;

	pthread_mutex_lock(&acm_lock);
	memset(&msg, 0, sizeof msg);
	msg.hdr.version = ACM_VERSION;
	msg.hdr.opcode = ACM_OP_EP_QUERY;
	msg.hdr.src_out = index;
	msg.hdr.src_index = port;
	msg.hdr.length = htobe16(ACM_MSG_HDR_LENGTH);

	ret = send(sock, (char *) &msg, ACM_MSG_HDR_LENGTH, 0);
	if (ret != ACM_MSG_HDR_LENGTH)
		goto out;

	ret = recv(sock, (char *) &hdr, sizeof(hdr), 0);
	if (ret != sizeof(hdr)) {
		ret = ACM_STATUS_EINVAL;
		goto out;
	}

	if (hdr.status) {
		ret = acm_error(hdr.status);
		goto out;
	}

	len = be16toh(hdr.length) - sizeof(hdr);
	netw_edata = malloc(len);
	host_edata = malloc(len);
	if (!netw_edata || !host_edata) {
		ret = ACM_STATUS_ENOMEM;
		goto out;
	}

	ret = recv(sock, (char *)netw_edata, len, 0);
	if (ret != len) {
		ret = ACM_STATUS_EINVAL;
		goto out;
	}

	host_edata->dev_guid = be64toh(netw_edata->dev_guid);
	host_edata->port_num = netw_edata->port_num;
	host_edata->phys_port_cnt = netw_edata->phys_port_cnt;
	host_edata->pkey = be16toh(netw_edata->pkey);
	host_edata->addr_cnt = be16toh(netw_edata->addr_cnt);
	memcpy(host_edata->prov_name, netw_edata->prov_name,
	       sizeof(host_edata->prov_name));
	for (i = 0; i < host_edata->addr_cnt; ++i)
		host_edata->addrs[i] = netw_edata->addrs[i];

	*data = host_edata;
	ret = 0;
out:
	free(netw_edata);
	if (ret)
		free(host_edata);
	pthread_mutex_unlock(&acm_lock);
	return ret;
}

int ib_acm_query_perf_ep_addr(uint8_t *src, uint8_t type,
			      uint64_t **counters, int *count)
{
	struct acm_msg msg;
	int ret, i, len;

	if (!src)
		return -1;

	pthread_mutex_lock(&acm_lock);
	memset(&msg, 0, sizeof msg);
	msg.hdr.version = ACM_VERSION;
	msg.hdr.opcode = ACM_OP_PERF_QUERY;

	ret = acm_format_ep_addr(&msg.resolve_data[0], src, type,
				 ACM_EP_FLAG_SOURCE);
	if (ret)
		goto out;

	len = ACM_MSG_HDR_LENGTH + ACM_MSG_EP_LENGTH;
	msg.hdr.length = htobe16(len);

	ret = send(sock, (char *) &msg, len, 0);
	if (ret != len)
		goto out;

	ret = recv(sock, (char *) &msg, sizeof msg, 0);
	if (ret < ACM_MSG_HDR_LENGTH || ret != be16toh(msg.hdr.length)) {
		ret = ACM_STATUS_EINVAL;
		goto out;
	}

	if (msg.hdr.status) {
		ret = acm_error(msg.hdr.status);
		goto out;
	}

	*counters = malloc(sizeof(uint64_t) * msg.hdr.src_out);
	if (!*counters) {
		ret = ACM_STATUS_ENOMEM;
		goto out;
	}

	*count = msg.hdr.src_out;
	for (i = 0; i < *count; i++)
		(*counters)[i] = be64toh(msg.perf_data[i]);
	ret = 0;
out:
	pthread_mutex_unlock(&acm_lock);
	return ret;
}

const char *ib_acm_cntr_name(int index)
{
	static const char *const cntr_name[] = {
		[ACM_CNTR_ERROR]       = "Error Count",
		[ACM_CNTR_RESOLVE]     = "Resolve Count",
		[ACM_CNTR_NODATA]      = "No Data",
		[ACM_CNTR_ADDR_QUERY]  = "Addr Query Count",
		[ACM_CNTR_ADDR_CACHE]  = "Addr Cache Count",
		[ACM_CNTR_ROUTE_QUERY] = "Route Query Count",
		[ACM_CNTR_ROUTE_CACHE] = "Route Cache Count",
	};

	if (index < ACM_CNTR_ERROR || index > ACM_MAX_COUNTER)
		return "Unknown";

	return cntr_name[index];
}
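/*
 * Usage sketch (editorial example, not part of the original source):
 * dumping the service's performance counters; the endpoint index 1 is
 * purely illustrative.
 *
 *	uint64_t *cntrs;
 *	int n, i;
 *
 *	if (!ib_acm_query_perf(1, &cntrs, &n)) {
 *		for (i = 0; i < n; i++)
 *			printf("%s: %" PRIu64 "\n",
 *			       ib_acm_cntr_name(i), cntrs[i]);
 *		ib_acm_free_perf(cntrs);
 *	}
 */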
rdma-core-56.1/ibacm/src/libacm.h000066400000000000000000000042471477342711600166030ustar00rootroot00000000000000/*
 * Copyright (c) 2009 Intel Corporation. All rights reserved.
 * Copyright (c) 2013 Mellanox Technologies LTD. All rights reserved.
 *
 * This software is available to you under the OpenIB.org BSD license
 * below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

#ifndef LIBACM_H
#define LIBACM_H

/* include target lost in the archive flattening; <infiniband/acm.h> is
 * assumed here since the prototypes below use its types */
#include <infiniband/acm.h>

struct sockaddr;

int ib_acm_connect(char *dest_svc);
void ib_acm_disconnect(void);

int ib_acm_resolve_name(char *src, char *dest,
			struct ibv_path_data **paths, int *count,
			uint32_t flags, int print);
int ib_acm_resolve_ip(struct sockaddr *src, struct sockaddr *dest,
		      struct ibv_path_data **paths, int *count,
		      uint32_t flags, int print);
int ib_acm_resolve_path(struct ibv_path_record *path, uint32_t flags);
#define ib_acm_free_paths(paths) free(paths)

int ib_acm_query_perf(int index, uint64_t **counters, int *count);
int ib_acm_query_perf_ep_addr(uint8_t *src, uint8_t type,
			      uint64_t **counters, int *count);
#define ib_acm_free_perf(counters) free(counters)

const char *ib_acm_cntr_name(int index);

int ib_acm_enum_ep(int index, struct acm_ep_config_data **data, uint8_t port);
#define ib_acm_free_ep_data(data) free(data)

#endif /* LIBACM_H */
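/*
 * Usage sketch (editorial example, not part of the original source):
 * enumerating one endpoint's configuration; both index values are
 * illustrative.
 *
 *	struct acm_ep_config_data *ep;
 *
 *	if (!ib_acm_enum_ep(1, &ep, 1)) {
 *		printf("guid 0x%" PRIx64 " pkey 0x%x\n",
 *		       ep->dev_guid, ep->pkey);
 *		ib_acm_free_ep_data(ep);
 *	}
 */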
rdma-core-56.1/ibacm/src/parse.c000066400000000000000000000056211477342711600164560ustar00rootroot00000000000000/*
 * Copyright (c) 2009-2010 Intel Corporation. All rights reserved.
 *
 * This software is available to you under the OpenIB.org BSD license
 * below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

/* include targets lost in the archive flattening; reconstructed from
 * what this file uses */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include "acm_util.h"

/* expand "basename" plus a comma-separated list of numbers and ranges
 * (e.g. "1-3,5") into consecutive NUL-separated names */
static char *expand(char *basename, char *args, int *str_cnt, int *str_size)
{
	char buf[256];
	char *str_buf = NULL;
	char *token, *tmp;
	int from, to, width;
	int size = 0, cnt = 0;

	token = strtok(args, ",");
	do {
		from = atoi(token);
		tmp = index(token, '-');
		if (tmp) {
			to = atoi(tmp+1);
			width = tmp - token;
		} else {
			to = from;
			width = strlen(token);
		}

		while (from <= to) {
			snprintf(buf, sizeof buf, "%s%0*d",
				 basename, width, from);
			str_buf = realloc(str_buf, size + strlen(buf)+1);
			strcpy(&str_buf[size], buf);

			from++;
			cnt++;
			size += strlen(buf)+1;
		}

		token = strtok(NULL, ",");
	} while (token);

	*str_size = size;
	*str_cnt = cnt;
	return str_buf;
}

char **parse(const char *args, int *count)
{
	char **ptrs = NULL;
	char *str_buf, *cpy, *token, *next;
	int cnt = 0, str_size = 0;
	int i;

	/* make a copy that strtok can modify */
	cpy = strdup(args);
	if (!cpy)
		return NULL;

	if (args[0] == '[') {
		cpy[0] = '\0';
		token = cpy;
		next = strtok(cpy + 1, "]");
	} else {
		token = strtok(cpy, "[");
		next = strtok(NULL, "]");
	}

	if (!next) {
		str_size = strlen(token) + 1;
		str_buf = malloc(str_size);
		if (!str_buf)
			goto out_cpy;

		strcpy(str_buf, token);
		cnt = 1;
	} else {
		str_buf = expand(cpy, next, &cnt, &str_size);
	}

	ptrs = malloc((sizeof str_buf * (cnt + 1)) + str_size);
	if (!ptrs)
		goto out_str_buf;

	memcpy(&ptrs[cnt + 1], str_buf, str_size);

	ptrs[0] = (char*) &ptrs[cnt + 1];
	for (i = 1; i < cnt; i++)
		ptrs[i] = index(ptrs[i - 1], 0) + 1;
	ptrs[i] = NULL;

	if (count)
		*count = cnt;

out_str_buf:
	free(str_buf);
out_cpy:
	free(cpy);
	return ptrs;
}
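/*
 * Example (editorial note, not part of the original source): parse()
 * expands bracketed numeric ranges into a NULL-terminated vector, e.g.
 *
 *	int cnt;
 *	char **names = parse("ib[1-3,5]", &cnt);
 *
 * yields cnt == 4 and names == { "ib1", "ib2", "ib3", "ib5", NULL }.
 * The pointer array and the strings share a single allocation, so one
 * free(names) releases everything.
 */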
rdma-core-56.1/infiniband-diags/000077500000000000000000000000001477342711600165205ustar00rootroot00000000000000rdma-core-56.1/infiniband-diags/CMakeLists.txt000066400000000000000000000017741477342711600212670ustar00rootroot00000000000000publish_internal_headers(""
  ibdiag_common.h
  ibdiag_sa.h
  )

install(FILES
  etc/error_thresholds
  etc/ibdiag.conf
  DESTINATION "${IBDIAG_CONFIG_PATH}")

add_library(ibdiags_tools STATIC
  ibdiag_common.c
  ibdiag_sa.c
  )
target_link_libraries(ibdiags_tools LINK_PRIVATE
  ibnetdisc
  )

function(ibdiag_programs)
  foreach(I ${ARGN})
    rdma_sbin_executable(${I} "${I}.c")
    target_link_libraries(${I} LINK_PRIVATE ${RT_LIBRARIES} ibdiags_tools
      ibumad ibmad ibnetdisc)
  endforeach()
endfunction()

ibdiag_programs(
  dump_fts
  ibaddr
  ibcacheedit
  ibccconfig
  ibccquery
  iblinkinfo
  ibnetdiscover
  ibping
  ibportstate
  ibqueryerrors
  ibroute
  ibstat
  ibsysstat
  ibtracert
  perfquery
  saquery
  sminfo
  smpdump
  smpquery
  vendstat
  )

rdma_test_executable(ibsendtrap "ibsendtrap.c")
target_link_libraries(ibsendtrap LINK_PRIVATE ibdiags_tools ibumad ibmad)

rdma_test_executable(mcm_rereg_test "mcm_rereg_test.c")
target_link_libraries(mcm_rereg_test LINK_PRIVATE ibdiags_tools ibumad ibmad)
rdma-core-56.1/infiniband-diags/dump_fts.c000066400000000000000000000306671477342711600205150ustar00rootroot00000000000000/*
 * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved.
 * Copyright (c) 2009-2011 Mellanox Technologies LTD. All rights reserved.
 * Copyright (c) 2013 Lawrence Livermore National Security. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses. You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

/* include targets lost in the archive flattening; reconstructed from
 * what this file uses and may not match the upstream list exactly */
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <getopt.h>
#include <errno.h>
#include <inttypes.h>
#include <netinet/in.h>
#include <infiniband/mad.h>
#include <infiniband/ibnetdisc.h>
#include <complib/cl_nodenamemap.h>
#include "ibdiag_common.h"

static struct ibmad_port *srcport;
static struct ibmad_ports_pair *srcports;

static unsigned startlid, endlid;
static int brief, dump_all, multicast;

static char *node_name_map_file = NULL;
static nn_map_t *node_name_map = NULL;

#define IB_MLIDS_IN_BLOCK	(IB_SMP_DATA_SIZE/2)

static int dump_mlid(char *str, int strlen, unsigned mlid, unsigned nports,
		     __be16 mft[16][IB_MLIDS_IN_BLOCK])
{
	uint16_t mask;
	unsigned i, chunk, bit, nonzero = 0;

	if (brief) {
		int n = 0;
		unsigned chunks = ALIGN(nports + 1, 16) / 16;

		for (i = 0; i < chunks; i++) {
			mask = ntohs(mft[i][mlid % IB_MLIDS_IN_BLOCK]);
			if (mask)
				nonzero++;
			n += snprintf(str + n, strlen - n, "%04hx", mask);
			if (n >= strlen) {
				n = strlen;
				break;
			}
		}
		if (!nonzero && !dump_all) {
			str[0] = 0;
			return 0;
		}
		return n;
	}
	for (i = 0; i <= nports; i++) {
		chunk = i / 16;
		bit = i % 16;

		mask = ntohs(mft[chunk][mlid % IB_MLIDS_IN_BLOCK]);
		if (mask)
			nonzero++;
		str[i * 2] = (mask & (1 << bit)) ?
'x' : ' '; str[i * 2 + 1] = ' '; } if (!nonzero && !dump_all) { str[0] = 0; return 0; } str[i * 2] = 0; return i * 2; } static __be16 mft[16][IB_MLIDS_IN_BLOCK]; static void dump_multicast_tables(ibnd_node_t *node, unsigned startl, unsigned endl, struct ibmad_port *mad_port) { ib_portid_t *portid = &node->path_portid; char str[512]; char *s; uint64_t nodeguid; uint32_t mod; unsigned block, i, j, e, nports, cap, chunks, startblock, lastblock, top; char *mapnd = NULL; int n = 0; nports = node->numports; nodeguid = node->guid; mad_decode_field(node->switchinfo, IB_SW_MCAST_FDB_CAP_F, &cap); mad_decode_field(node->switchinfo, IB_SW_MCAST_FDB_TOP_F, &top); if (!endl || endl > IB_MIN_MCAST_LID + cap - 1) endl = IB_MIN_MCAST_LID + cap - 1; if (!dump_all && top && top < endl) { if (top < IB_MIN_MCAST_LID - 1) IBWARN("illegal top mlid %x", top); else endl = top; } if (!startl) startl = IB_MIN_MCAST_LID; else if (startl < IB_MIN_MCAST_LID) { IBWARN("illegal start mlid %x, set to %x", startl, IB_MIN_MCAST_LID); startl = IB_MIN_MCAST_LID; } if (endl > IB_MAX_MCAST_LID) { IBWARN("illegal end mlid %x, truncate to %x", endl, IB_MAX_MCAST_LID); endl = IB_MAX_MCAST_LID; } mapnd = remap_node_name(node_name_map, nodeguid, node->nodedesc); printf("Multicast mlids [0x%x-0x%x] of switch %s guid 0x%016" PRIx64 " (%s):\n", startl, endl, portid2str(portid), nodeguid, mapnd); if (brief) printf(" MLid Port Mask\n"); else { if (nports > 9) { for (i = 0, s = str; i <= nports; i++) { *s++ = (i % 10) ? ' ' : '0' + i / 10; *s++ = ' '; } *s = 0; printf(" %s\n", str); } for (i = 0, s = str; i <= nports; i++) s += sprintf(s, "%d ", i % 10); printf(" Ports: %s\n", str); printf(" MLid\n"); } if (ibverbose) printf("Switch multicast mlid capability is %d top is 0x%x\n", cap, top); chunks = ALIGN(nports + 1, 16) / 16; startblock = startl / IB_MLIDS_IN_BLOCK; lastblock = endl / IB_MLIDS_IN_BLOCK; for (block = startblock; block <= lastblock; block++) { for (j = 0; j < chunks; j++) { int status; mod = (block - IB_MIN_MCAST_LID / IB_MLIDS_IN_BLOCK) | (j << 28); DEBUG("reading block %x chunk %d mod %x", block, j, mod); if (!smp_query_status_via (mft + j, portid, IB_ATTR_MULTICASTFORWTBL, mod, 0, &status, mad_port)) { fprintf(stderr, "SubnGet(MFT) failed on switch " "'%s' %s Node GUID 0x%"PRIx64 " SMA LID %d; MAD status 0x%x " "AM 0x%x\n", mapnd, portid2str(portid), node->guid, node->smalid, status, mod); } } i = block * IB_MLIDS_IN_BLOCK; e = i + IB_MLIDS_IN_BLOCK; if (i < startl) i = startl; if (e > endl + 1) e = endl + 1; for (; i < e; i++) { if (dump_mlid(str, sizeof str, i, nports, mft) == 0) continue; printf("0x%04x %s\n", i, str); n++; } } printf("%d %smlids dumped \n", n, dump_all ? 
"" : "valid "); free(mapnd); } static int dump_lid(char *str, int str_len, int lid, int valid, ibnd_fabric_t *fabric, int *last_port_lid, int *base_port_lid, uint64_t *portguid) { ibnd_port_t *port = NULL; char ntype[50], sguid[30]; uint64_t nodeguid; int baselid, lmc, type; char *mapnd = NULL; int rc; if (brief) { str[0] = 0; return 0; } if (lid <= *last_port_lid) { if (!valid) return snprintf(str, str_len, ": (path #%d - illegal port)", lid - *base_port_lid); else if (!*portguid) return snprintf(str, str_len, ": (path #%d out of %d)", lid - *base_port_lid + 1, *last_port_lid - *base_port_lid + 1); else { return snprintf(str, str_len, ": (path #%d out of %d: portguid %s)", lid - *base_port_lid + 1, *last_port_lid - *base_port_lid + 1, mad_dump_val(IB_NODE_PORT_GUID_F, sguid, sizeof sguid, portguid)); } } if (!valid) return snprintf(str, str_len, ": (illegal port)"); *portguid = 0; port = ibnd_find_port_lid(fabric, lid); if (!port) { return snprintf(str, str_len, ": (node info not available fabric scan)"); } nodeguid = port->node->guid; *portguid = port->guid; type = port->node->type; baselid = port->base_lid; lmc = port->lmc; if (lmc > 0) { *base_port_lid = baselid; *last_port_lid = baselid + (1 << lmc) - 1; } mapnd = remap_node_name(node_name_map, nodeguid, port->node->nodedesc); rc = snprintf(str, str_len, ": (%s portguid %s: '%s')", mad_dump_val(IB_NODE_TYPE_F, ntype, sizeof ntype, &type), mad_dump_val(IB_NODE_PORT_GUID_F, sguid, sizeof sguid, portguid), mapnd); free(mapnd); return rc; } static void dump_unicast_tables(ibnd_node_t *node, int startl, int endl, struct ibmad_port *mad_port, ibnd_fabric_t *fabric) { ib_portid_t * portid = &node->path_portid; char lft[IB_SMP_DATA_SIZE] = { 0 }; char str[200]; uint64_t nodeguid; int block, i, e, top; unsigned nports; int n = 0, startblock, endblock; char *mapnd = NULL; int last_port_lid = 0, base_port_lid = 0; uint64_t portguid = 0; mad_decode_field(node->switchinfo, IB_SW_LINEAR_FDB_TOP_F, &top); nodeguid = node->guid; nports = node->numports; if (!endl || endl > top) endl = top; if (endl > IB_MAX_UCAST_LID) { IBWARN("illegal lft top %d, truncate to %d", endl, IB_MAX_UCAST_LID); endl = IB_MAX_UCAST_LID; } mapnd = remap_node_name(node_name_map, nodeguid, node->nodedesc); printf("Unicast lids [0x%x-0x%x] of switch %s guid 0x%016" PRIx64 " (%s):\n", startl, endl, portid2str(portid), nodeguid, mapnd); DEBUG("Switch top is 0x%x\n", top); printf(" Lid Out Destination\n"); printf(" Port Info \n"); startblock = startl / IB_SMP_DATA_SIZE; endblock = ALIGN(endl, IB_SMP_DATA_SIZE) / IB_SMP_DATA_SIZE; for (block = startblock; block < endblock; block++) { int status; DEBUG("reading block %d", block); if (!smp_query_status_via(lft, portid, IB_ATTR_LINEARFORWTBL, block, 0, &status, mad_port)) { fprintf(stderr, "SubnGet(LFT) failed on switch " "'%s' %s Node GUID 0x%"PRIx64 " SMA LID %d; MAD status 0x%x AM 0x%x\n", mapnd, portid2str(portid), node->guid, node->smalid, status, block); } i = block * IB_SMP_DATA_SIZE; e = i + IB_SMP_DATA_SIZE; if (i < startl) i = startl; if (e > endl + 1) e = endl + 1; for (; i < e; i++) { unsigned outport = lft[i % IB_SMP_DATA_SIZE]; unsigned valid = (outport <= nports); if (!valid && !dump_all) continue; dump_lid(str, sizeof str, i, valid, fabric, &last_port_lid, &base_port_lid, &portguid); printf("0x%04x %03u %s\n", i, outport & 0xff, str); n++; } } printf("%d %slids dumped \n", n, dump_all ? 
"" : "valid "); free(mapnd); } static void dump_node(ibnd_node_t *node, struct ibmad_port *mad_port, ibnd_fabric_t *fabric) { if (multicast) dump_multicast_tables(node, startlid, endlid, mad_port); else dump_unicast_tables(node, startlid, endlid, mad_port, fabric); } static void process_switch(ibnd_node_t *node, void *fabric) { dump_node(node, srcport, (ibnd_fabric_t *)fabric); } static int process_opt(void *context, int ch) { switch (ch) { case 'a': dump_all++; break; case 'M': multicast++; break; case 'n': brief++; break; case 1: node_name_map_file = strdup(optarg); if (node_name_map_file == NULL) IBEXIT("out of memory, strdup for node_name_map_file name failed"); break; default: return -1; } return 0; } int main(int argc, char **argv) { int rc = 0; int mgmt_classes[3] = { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS, IB_SA_CLASS }; struct ibnd_config config = { 0 }; ibnd_fabric_t *fabric = NULL; const struct ibdiag_opt opts[] = { {"all", 'a', 0, NULL, "show all lids, even invalid entries"}, {"no_dests", 'n', 0, NULL, "do not try to resolve destinations"}, {"Multicast", 'M', 0, NULL, "show multicast forwarding tables"}, {"node-name-map", 1, 1, "", "node name map file"}, {} }; char usage_args[] = "[ [ []]]"; const char *usage_examples[] = { " -- Unicast examples:", "-a\t# same, but dump all lids, even with invalid out ports", "-n\t# simple dump format - no destination resolving", "10\t# dump lids starting from 10", "0x10 0x20\t# dump lid range", " -- Multicast examples:", "-M\t# dump all non empty mlids of switch with lid 4", "-M 0xc010 0xc020\t# same, but with range", "-M -n\t# simple dump format", NULL, }; ibdiag_process_opts(argc, argv, &config, "KGDLs", opts, process_opt, usage_args, usage_examples); argc -= optind; argv += optind; if (argc > 0) startlid = strtoul(argv[0], NULL, 0); if (argc > 1) endlid = strtoul(argv[1], NULL, 0); node_name_map = open_node_name_map(node_name_map_file); if (ibd_timeout) config.timeout_ms = ibd_timeout; config.flags = ibd_ibnetdisc_flags; config.mkey = ibd_mkey; if ((fabric = ibnd_discover_fabric(ibd_ca, ibd_ca_port, NULL, &config)) != NULL) { srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 3, 1); if (!srcports) { fprintf(stderr, "Failed to open '%s' port '%d'\n", ibd_ca, ibd_ca_port); rc = -1; goto Exit; } srcport = srcports->smi.port; if (!srcport) { fprintf(stderr, "Failed to open '%s' port '%d'\n", ibd_ca, ibd_ca_port); rc = -1; goto Exit; } smp_mkey_set(srcport, ibd_mkey); if (ibd_timeout) { mad_rpc_set_timeout(srcport, ibd_timeout); } ibnd_iter_nodes_type(fabric, process_switch, IB_NODE_SWITCH, fabric); mad_rpc_close_port2(srcports); } else { fprintf(stderr, "Failed to discover fabric\n"); rc = -1; } Exit: ibnd_destroy_fabric(fabric); close_node_name_map(node_name_map); exit(rc); } rdma-core-56.1/infiniband-diags/etc/000077500000000000000000000000001477342711600172735ustar00rootroot00000000000000rdma-core-56.1/infiniband-diags/etc/error_thresholds000066400000000000000000000005571477342711600226150ustar00rootroot00000000000000# Define error thresholds here #SymbolErrorCounter=10 #LinkErrorRecoveryCounter=10 #LinkDownedCounter=10 #PortRcvErrors=10 #PortRcvRemotePhysicalErrors=100 #PortRcvSwitchRelayErrors=100 #PortXmitDiscards=100 #PortXmitConstraintErrors=100 #PortRcvConstraintErrors=100 #LocalLinkIntegrityErrors=10 #ExcessiveBufferOverrunErrors=10 #VL15Dropped=100 #PortXmitWait=1000 rdma-core-56.1/infiniband-diags/etc/ibdiag.conf000066400000000000000000000011411477342711600213560ustar00rootroot00000000000000# Define different defaults for 
all infiniband-diag tools. These can be # defined on the command line but this offers a more global config. # Defaults are to find the first port with Physical state == "LinkUp" #CA=mlx4_0 # NOTE: that using a different Port may require an altered DR path. # for example -D 0,1 will not work with port 2 #Port=1 # define a different default timeout #timeout=50 # disable query of Mellanox Extended PortInfo on ibnetdiscover subnet sweeps # Default = true #MLX_EPI=false # define a default m_key #m_key=0x00 # default smkey to be used for SA requests #sa_key=0x00 rdma-core-56.1/infiniband-diags/ibaddr.c000066400000000000000000000114401477342711600201110ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
 *
 */

/* include targets lost in the archive flattening; reconstructed from
 * what this file uses */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <getopt.h>
#include <arpa/inet.h>
#include <infiniband/mad.h>
#include "ibdiag_common.h"

static struct ibmad_port *srcport;
static struct ibmad_ports_pair *srcports;

static int ib_resolve_addr(ib_portid_t * portid, int portnum, int show_lid,
			   int show_gid)
{
	char gid_str[INET6_ADDRSTRLEN];
	uint8_t portinfo[IB_SMP_DATA_SIZE] = { 0 };
	uint8_t nodeinfo[IB_SMP_DATA_SIZE] = { 0 };
	uint64_t guid, prefix;
	ibmad_gid_t gid;
	int lmc;

	if (!smp_query_via(nodeinfo, portid, IB_ATTR_NODE_INFO, 0, 0, srcport))
		return -1;

	if (!smp_query_via(portinfo, portid, IB_ATTR_PORT_INFO, portnum, 0,
			   srcport))
		return -1;

	mad_decode_field(portinfo, IB_PORT_LID_F, &portid->lid);
	mad_decode_field(portinfo, IB_PORT_GID_PREFIX_F, &prefix);
	mad_decode_field(portinfo, IB_PORT_LMC_F, &lmc);
	mad_decode_field(nodeinfo, IB_NODE_PORT_GUID_F, &guid);

	mad_encode_field(gid, IB_GID_PREFIX_F, &prefix);
	mad_encode_field(gid, IB_GID_GUID_F, &guid);

	if (show_gid) {
		printf("GID %s ", inet_ntop(AF_INET6, gid, gid_str,
					    sizeof gid_str));
	}

	if (show_lid > 0)
		printf("LID start 0x%x end 0x%x", portid->lid,
		       portid->lid + (1 << lmc) - 1);
	else if (show_lid < 0)
		printf("LID start %u end %u", portid->lid,
		       portid->lid + (1 << lmc) - 1);
	printf("\n");
	return 0;
}

static int show_lid, show_gid;

static int process_opt(void *context, int ch)
{
	switch (ch) {
	case 'g':
		show_gid = 1;
		break;
	case 'l':
		show_lid++;
		break;
	case 'L':
		show_lid = -100;
		break;
	default:
		return -1;
	}
	return 0;
}

int main(int argc, char **argv)
{
	int mgmt_classes[3] =
	    { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS, IB_SA_CLASS };
	ib_portid_t portid = { 0 };
	int port = 0;

	const struct ibdiag_opt opts[] = {
		{"gid_show", 'g', 0, NULL, "show gid address only"},
		{"lid_show", 'l', 0, NULL, "show lid range only"},
		{"Lid_show", 'L', 0, NULL, "show lid range (in decimal) only"},
		{}
	};
	/* argument placeholder reconstructed; the <...> text was lost in
	 * the archive flattening */
	char usage_args[] = "[<lid|dr_path|guid>]";
	const char *usage_examples[] = {
		"\t\t# local port's address",
		"32\t\t# show lid range and gid of lid 32",
		"-G 0x8f1040023\t# same but using guid address",
		"-l 32\t\t# show lid range only",
		"-L 32\t\t# show decimal lid range only",
		"-g 32\t\t# show gid address only",
		NULL
	};

	ibdiag_process_opts(argc, argv, NULL, "KL", opts, process_opt,
			    usage_args, usage_examples);

	argc -= optind;
	argv += optind;

	if (argc > 1)
		port = strtoul(argv[1], NULL, 0);

	if (!show_lid && !show_gid)
		show_lid = show_gid = 1;

	srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 3, 1);
	if (!srcports)
		IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port);

	srcport = srcports->smi.port;
	if (!srcport)
		IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port);

	smp_mkey_set(srcport, ibd_mkey);

	if (argc) {
		if (resolve_portid_str(srcports->gsi.ca_name, ibd_ca_port,
				       &portid, argv[0], ibd_dest_type,
				       ibd_sm_id, srcports->gsi.port) < 0)
			IBEXIT("can't resolve destination port %s", argv[0]);
	} else {
		if (resolve_self(srcports->gsi.ca_name, ibd_ca_port, &portid,
				 &port, NULL) < 0)
			IBEXIT("can't resolve self port");
	}

	if (ib_resolve_addr(&portid, port, show_lid, show_gid) < 0)
		IBEXIT("can't resolve requested address");

	mad_rpc_close_port2(srcports);
	exit(0);
}
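/*
 * Note on the lid range printed by ib_resolve_addr() above: with LID
 * Mask Control (LMC) set, a port answers to 2^lmc consecutive lids, so
 * e.g. lmc = 2 with base lid 0x10 yields the range 0x10-0x13.
 */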
rdma-core-56.1/infiniband-diags/ibcacheedit.c000066400000000000000000000177051477342711600211160ustar00rootroot00000000000000/*
 * Copyright (c) 2010 Lawrence Livermore National Lab. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses. You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

#define _GNU_SOURCE
/* include targets lost in the archive flattening; reconstructed from
 * what this file uses */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <getopt.h>
#include <inttypes.h>
#include <infiniband/ibnetdisc.h>
#include "ibdiag_common.h"

static uint64_t switchguid_before;
static uint64_t switchguid_after;
static int switchguid_flag;

static uint64_t caguid_before;
static uint64_t caguid_after;
static int caguid_flag;

static uint64_t sysimgguid_before;
static uint64_t sysimgguid_after;
static int sysimgguid_flag;

static uint64_t portguid_nodeguid;
static uint64_t portguid_before;
static uint64_t portguid_after;
static int portguid_flag;

struct guids {
	uint64_t searchguid;
	int searchguid_found;
	uint64_t before;
	uint64_t after;
	int found;
};

static int parse_beforeafter(char *arg, uint64_t *before, uint64_t *after)
{
	char *ptr;
	char *before_str;
	char *after_str;

	/* note: this helper scans the optarg global rather than its arg
	 * parameter; the callers always pass optarg, so the two alias */
	ptr = strchr(optarg, ':');
	if (!ptr || !(*(ptr + 1))) {
		fprintf(stderr, "invalid input '%s'\n", arg);
		return -1;
	}
	(*ptr) = '\0';
	before_str = arg;
	after_str = ptr + 1;

	(*before) = strtoull(before_str, NULL, 0);
	(*after) = strtoull(after_str, NULL, 0);
	return 0;
}

static int parse_guidbeforeafter(char *arg, uint64_t *guid,
				 uint64_t *before, uint64_t *after)
{
	char *ptr1;
	char *ptr2;
	char *guid_str;
	char *before_str;
	char *after_str;

	ptr1 = strchr(optarg, ':');
	if (!ptr1 || !(*(ptr1 + 1))) {
		fprintf(stderr, "invalid input '%s'\n", arg);
		return -1;
	}
	guid_str = arg;
	before_str = ptr1 + 1;

	ptr2 = strchr(before_str, ':');
	if (!ptr2 || !(*(ptr2 + 1))) {
		fprintf(stderr, "invalid input '%s'\n", arg);
		return -1;
	}
	(*ptr1) = '\0';
	(*ptr2) = '\0';
	after_str = ptr2 + 1;

	(*guid) = strtoull(guid_str, NULL, 0);
	(*before) = strtoull(before_str, NULL, 0);
	(*after) = strtoull(after_str, NULL, 0);
	return 0;
}

static int process_opt(void *context, int ch)
{
	switch (ch) {
	case 1:
		if (parse_beforeafter(optarg, &switchguid_before,
				      &switchguid_after) < 0)
			return -1;
		switchguid_flag++;
		break;
	case 2:
		if (parse_beforeafter(optarg, &caguid_before,
				      &caguid_after) < 0)
			return -1;
		caguid_flag++;
		break;
	case 3:
		if (parse_beforeafter(optarg, &sysimgguid_before,
				      &sysimgguid_after) < 0)
			return -1;
		sysimgguid_flag++;
		break;
	case 4:
		if (parse_guidbeforeafter(optarg, &portguid_nodeguid,
					  &portguid_before,
					  &portguid_after) < 0)
			return -1;
		portguid_flag++;
		break;
	default:
		return -1;
	}

	return 0;
}

static void update_switchportguids(ibnd_node_t *node)
{
	ibnd_port_t *port;
	int p;

	for (p = 0; p <= node->numports; p++) {
		port = node->ports[p];
		if (port)
			port->guid = node->guid;
	}
}
static void replace_node_guid(ibnd_node_t *node, void *user_data)
{
	struct guids *guids;

	guids = (struct guids *)user_data;

	if (node->guid == guids->before) {
		node->guid = guids->after;

		/* port guids are identical to switch guids on
		 * switches, so update port guids too */
		if (node->type == IB_NODE_SWITCH)
			update_switchportguids(node);

		guids->found++;
	}
}

static void replace_sysimgguid(ibnd_node_t *node, void *user_data)
{
	struct guids *guids;
	uint64_t sysimgguid;

	guids = (struct guids *)user_data;

	sysimgguid = mad_get_field64(node->info, 0, IB_NODE_SYSTEM_GUID_F);
	if (sysimgguid == guids->before) {
		mad_set_field64(node->info, 0, IB_NODE_SYSTEM_GUID_F,
				guids->after);
		guids->found++;
	}
}

static void replace_portguid(ibnd_node_t *node, void *user_data)
{
	struct guids *guids;

	guids = (struct guids *)user_data;

	if (node->guid != guids->searchguid)
		return;

	guids->searchguid_found++;

	if (node->type == IB_NODE_SWITCH) {
		/* port guids are identical to switch guids on
		 * switches, so update switch guid too */
		if (node->guid == guids->before) {
			node->guid = guids->after;
			update_switchportguids(node);
			guids->found++;
		}
	} else {
		ibnd_port_t *port;
		int p;

		for (p = 1; p <= node->numports; p++) {
			port = node->ports[p];
			if (port && port->guid == guids->before) {
				port->guid = guids->after;
				guids->found++;
				break;
			}
		}
	}
}

int main(int argc, char **argv)
{
	ibnd_fabric_t *fabric = NULL;
	char *orig_cache_file = NULL;
	char *new_cache_file = NULL;
	struct guids guids;

	const struct ibdiag_opt opts[] = {
		{"switchguid", 1, 1, "BEFOREGUID:AFTERGUID",
		 "Specify before and after switchguid to edit"},
		{"caguid", 2, 1, "BEFOREGUID:AFTERGUID",
		 "Specify before and after caguid to edit"},
		{"sysimgguid", 3, 1, "BEFOREGUID:AFTERGUID",
		 "Specify before and after sysimgguid to edit"},
		{"portguid", 4, 1, "NODEGUID:BEFOREGUID:AFTERGUID",
		 "Specify before and after port guid to edit"},
		{}
	};
	/* argument names reconstructed; the <...> text was lost in the
	 * archive flattening */
	const char *usage_args = "<orig_cache_file> <new_cache_file>";

	ibdiag_process_opts(argc, argv, NULL, "CDdeGKLPstvy", opts,
			    process_opt, usage_args, NULL);

	argc -= optind;
	argv += optind;

	orig_cache_file = argv[0];
	new_cache_file = argv[1];

	if (!orig_cache_file)
		IBEXIT("original cache file not specified");

	if (!new_cache_file)
		IBEXIT("new cache file not specified");

	if ((fabric = ibnd_load_fabric(orig_cache_file, 0)) == NULL)
		IBEXIT("loading original cached fabric failed");

	if (switchguid_flag) {
		guids.before = switchguid_before;
		guids.after = switchguid_after;
		guids.found = 0;
		ibnd_iter_nodes_type(fabric, replace_node_guid,
				     IB_NODE_SWITCH, &guids);
		if (!guids.found)
			IBEXIT("switchguid = %" PRIx64 " not found",
			       switchguid_before);
	}

	if (caguid_flag) {
		guids.before = caguid_before;
		guids.after = caguid_after;
		guids.found = 0;
		ibnd_iter_nodes_type(fabric, replace_node_guid, IB_NODE_CA,
				     &guids);
		if (!guids.found)
			IBEXIT("caguid = %" PRIx64 " not found",
			       caguid_before);
	}

	if (sysimgguid_flag) {
		guids.before = sysimgguid_before;
		guids.after = sysimgguid_after;
		guids.found = 0;
		ibnd_iter_nodes(fabric, replace_sysimgguid, &guids);
		if (!guids.found)
			IBEXIT("sysimgguid = %" PRIx64 " not found",
			       sysimgguid_before);
	}

	if (portguid_flag) {
		guids.searchguid = portguid_nodeguid;
		guids.searchguid_found = 0;
		guids.before = portguid_before;
		guids.after = portguid_after;
		guids.found = 0;
		ibnd_iter_nodes(fabric, replace_portguid, &guids);
		if (!guids.searchguid_found)
			IBEXIT("nodeguid = %" PRIx64 " not found",
			       portguid_nodeguid);
		if (!guids.found)
			IBEXIT("portguid = %" PRIx64 " not found",
			       portguid_before);
	}
	if (ibnd_cache_fabric(fabric, new_cache_file, 0) < 0)
		IBEXIT("caching new cache data failed");

	ibnd_destroy_fabric(fabric);
	exit(0);
}
rdma-core-56.1/infiniband-diags/ibccconfig.c000066400000000000000000000411231477342711600207530ustar00rootroot00000000000000/*
 * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved.
 * Copyright (c) 2011 Mellanox Technologies LTD. All rights reserved.
 * Copyright (c) 2011 Lawrence Livermore National Lab. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses. You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

/* include targets lost in the archive flattening; reconstructed from
 * what this file uses */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <getopt.h>
#include <errno.h>
#include <ctype.h>
#include <limits.h>

#define __STDC_FORMAT_MACROS
#include <inttypes.h>
#include <infiniband/mad.h>

#include "ibdiag_common.h"

static struct ibmad_port *srcport;
static struct ibmad_ports_pair *srcports;

static op_fn_t congestion_key_info;
static op_fn_t switch_congestion_setting;
static op_fn_t switch_port_congestion_setting;
static op_fn_t ca_congestion_setting;
static op_fn_t congestion_control_table;

/* the <...> argument descriptions below were lost in the archive
 * flattening; they are reconstructed from each op's parser and may not
 * match the upstream wording exactly */
static const match_rec_t match_tbl[] = {
	{"CongestionKeyInfo", "CK", congestion_key_info, 0,
	 "<cckey> <cckeyprotectbit> <cckeyleaseperiod> <cckeyviolations>"},
	{"SwitchCongestionSetting", "SS", switch_congestion_setting, 0,
	 "<controlmap> <victimmask> <creditmask> <threshold> <packetsize> "
	 "<csthreshold> <csreturndelay> <markingrate>"},
	{"SwitchPortCongestionSetting", "SP", switch_port_congestion_setting, 1,
	 "<valid> <controltype> <threshold> <packetsize> <congparmmarkingrate>"},
	{"CACongestionSetting", "CS", ca_congestion_setting, 0,
	 "<portcontrol> <controlmap> <cctitimer> <cctiincrease> "
	 "<triggerthreshold> <cctimin>"},
	{"CongestionControlTable", "CT", congestion_control_table, 0,
	 "<cctilimit> <index> <cctentry> <cctentry> ..."},
	{}
};

static uint64_t cckey;

/*******************************************/

static const char *parselonglongint(char *arg, uint64_t *val)
{
	char *endptr = NULL;

	errno = 0;
	*val = strtoull(arg, &endptr, 0);
	if ((endptr && *endptr != '\0') || errno != 0) {
		if (errno == ERANGE)
			return "value out of range";
		return "invalid integer input";
	}

	return NULL;
}

static const char *parseint(char *arg, uint32_t *val, int hexonly)
{
	char *endptr = NULL;

	errno = 0;
	*val = strtoul(arg, &endptr, hexonly ?
16 : 0); if ((endptr && *endptr != '\0') || errno != 0) { if (errno == ERANGE) return "value out of range"; return "invalid integer input"; } return NULL; } static const char *congestion_key_info(ib_portid_t *dest, char **argv, int argc) { uint8_t rcv[IB_CC_DATA_SZ] = { 0 }; uint8_t payload[IB_CC_DATA_SZ] = { 0 }; uint64_t cc_key; uint32_t cc_keyprotectbit; uint32_t cc_keyleaseperiod; uint32_t cc_keyviolations; const char *errstr; if (argc != 4) return "invalid number of parameters for CongestionKeyInfo"; if ((errstr = parselonglongint(argv[0], &cc_key))) return errstr; if ((errstr = parseint(argv[1], &cc_keyprotectbit, 0))) return errstr; if ((errstr = parseint(argv[2], &cc_keyleaseperiod, 0))) return errstr; if ((errstr = parseint(argv[3], &cc_keyviolations, 0))) return errstr; if (cc_keyprotectbit != 0 && cc_keyprotectbit != 1) return "invalid cc_keyprotectbit value"; if (cc_keyleaseperiod > USHRT_MAX) return "invalid cc_keyleaseperiod value"; if (cc_keyviolations > USHRT_MAX) return "invalid cc_keyviolations value"; mad_set_field64(payload, 0, IB_CC_CONGESTION_KEY_INFO_CC_KEY_F, cc_key); mad_encode_field(payload, IB_CC_CONGESTION_KEY_INFO_CC_KEY_PROTECT_BIT_F, &cc_keyprotectbit); mad_encode_field(payload, IB_CC_CONGESTION_KEY_INFO_CC_KEY_LEASE_PERIOD_F, &cc_keyleaseperiod); /* spec says "setting the counter to a value other than zero results * in the counter being left unchanged. So if user wants no change, * they gotta input non-zero */ mad_encode_field(payload, IB_CC_CONGESTION_KEY_INFO_CC_KEY_VIOLATIONS_F, &cc_keyviolations); if (!cc_config_status_via(payload, rcv, dest, IB_CC_ATTR_CONGESTION_KEY_INFO, 0, 0, NULL, srcport, cckey)) return "congestion key info config failed"; return NULL; } /* parse like it's a hypothetical 256 bit hex code */ static const char *parse256(char *arg, uint8_t *buf) { int numdigits = 0; int startindex; char *ptr; int i; if (!strncmp(arg, "0x", 2) || !strncmp(arg, "0X", 2)) arg += 2; for (ptr = arg; *ptr; ptr++) { if (!isxdigit(*ptr)) return "invalid hex digit read"; numdigits++; } if (numdigits > 64) return "hex code too long"; /* we need to imagine that this is like a 256-bit int stored * in big endian. So we need to find the first index * point where the user's input would start in our array. */ startindex = 32 - ((numdigits - 1) / 2) - 1; for (i = startindex; i <= 31; i++) { char tmp[3] = { 0 }; uint32_t tmpint; const char *errstr; /* I can't help but think there is a strtoX that * will do this for me, but I can't find it. 
*/ if (i == startindex && numdigits % 2) { memcpy(tmp, arg, 1); arg++; } else { memcpy(tmp, arg, 2); arg += 2; } if ((errstr = parseint(tmp, &tmpint, 1))) return errstr; buf[i] = tmpint; } return NULL; } static const char *parsecct(char *arg, uint32_t *shift, uint32_t *multiplier) { char buf[1024] = { 0 }; const char *errstr; char *ptr; strcpy(buf, arg); if (!(ptr = strchr(buf, ':'))) return "ccts are formatted shift:multiplier"; *ptr = '\0'; ptr++; if ((errstr = parseint(buf, shift, 0))) return errstr; if ((errstr = parseint(ptr, multiplier, 0))) return errstr; return NULL; } static const char *switch_congestion_setting(ib_portid_t *dest, char **argv, int argc) { uint8_t rcv[IB_CC_DATA_SZ] = { 0 }; uint8_t payload[IB_CC_DATA_SZ] = { 0 }; uint32_t control_map; uint8_t victim_mask[32] = { 0 }; uint8_t credit_mask[32] = { 0 }; uint32_t threshold; uint32_t packet_size; uint32_t cs_threshold; uint32_t cs_returndelay_s; uint32_t cs_returndelay_m; uint32_t cs_returndelay; uint32_t marking_rate; const char *errstr; if (argc != 8) return "invalid number of parameters for SwitchCongestionSetting"; if ((errstr = parseint(argv[0], &control_map, 0))) return errstr; if ((errstr = parse256(argv[1], victim_mask))) return errstr; if ((errstr = parse256(argv[2], credit_mask))) return errstr; if ((errstr = parseint(argv[3], &threshold, 0))) return errstr; if ((errstr = parseint(argv[4], &packet_size, 0))) return errstr; if ((errstr = parseint(argv[5], &cs_threshold, 0))) return errstr; if ((errstr = parsecct(argv[6], &cs_returndelay_s, &cs_returndelay_m))) return errstr; cs_returndelay = cs_returndelay_m; cs_returndelay |= (cs_returndelay_s << 14); if ((errstr = parseint(argv[7], &marking_rate, 0))) return errstr; mad_encode_field(payload, IB_CC_SWITCH_CONGESTION_SETTING_CONTROL_MAP_F, &control_map); mad_set_array(payload, 0, IB_CC_SWITCH_CONGESTION_SETTING_VICTIM_MASK_F, victim_mask); mad_set_array(payload, 0, IB_CC_SWITCH_CONGESTION_SETTING_CREDIT_MASK_F, credit_mask); mad_encode_field(payload, IB_CC_SWITCH_CONGESTION_SETTING_THRESHOLD_F, &threshold); mad_encode_field(payload, IB_CC_SWITCH_CONGESTION_SETTING_PACKET_SIZE_F, &packet_size); mad_encode_field(payload, IB_CC_SWITCH_CONGESTION_SETTING_CS_THRESHOLD_F, &cs_threshold); mad_encode_field(payload, IB_CC_SWITCH_CONGESTION_SETTING_CS_RETURN_DELAY_F, &cs_returndelay); mad_encode_field(payload, IB_CC_SWITCH_CONGESTION_SETTING_MARKING_RATE_F, &marking_rate); if (!cc_config_status_via(payload, rcv, dest, IB_CC_ATTR_SWITCH_CONGESTION_SETTING, 0, 0, NULL, srcport, cckey)) return "switch congestion setting config failed"; return NULL; } static const char *switch_port_congestion_setting(ib_portid_t *dest, char **argv, int argc) { uint8_t rcv[IB_CC_DATA_SZ] = { 0 }; uint8_t payload[IB_CC_DATA_SZ] = { 0 }; uint8_t data[IB_CC_DATA_SZ] = { 0 }; uint32_t portnum; uint32_t valid; uint32_t control_type; uint32_t threshold; uint32_t packet_size; uint32_t cong_parm_marking_rate; uint32_t type; uint32_t numports; uint8_t *ptr; const char *errstr; if (argc != 6) return "invalid number of parameters for SwitchPortCongestion"; if ((errstr = parseint(argv[0], &portnum, 0))) return errstr; if ((errstr = parseint(argv[1], &valid, 0))) return errstr; if ((errstr = parseint(argv[2], &control_type, 0))) return errstr; if ((errstr = parseint(argv[3], &threshold, 0))) return errstr; if ((errstr = parseint(argv[4], &packet_size, 0))) return errstr; if ((errstr = parseint(argv[5], &cong_parm_marking_rate, 0))) return errstr; /* Figure out number of ports first */ if 
(!smp_query_via(data, dest, IB_ATTR_NODE_INFO, 0, 0, srcports->smi.port)) return "node info config failed"; mad_decode_field((uint8_t *)data, IB_NODE_TYPE_F, &type); mad_decode_field((uint8_t *)data, IB_NODE_NPORTS_F, &numports); if (type != IB_NODE_SWITCH) return "destination not a switch"; if (portnum > numports) return "invalid port number specified"; /* We are modifying only 1 port, so get the current config */ if (!cc_query_status_via(payload, dest, IB_CC_ATTR_SWITCH_PORT_CONGESTION_SETTING, portnum / 32, 0, NULL, srcport, cckey)) return "switch port congestion setting query failed"; ptr = payload + (((portnum % 32) * 4)); mad_encode_field(ptr, IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_VALID_F, &valid); mad_encode_field(ptr, IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_CONTROL_TYPE_F, &control_type); mad_encode_field(ptr, IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_THRESHOLD_F, &threshold); mad_encode_field(ptr, IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_PACKET_SIZE_F, &packet_size); mad_encode_field(ptr, IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_CONG_PARM_MARKING_RATE_F, &cong_parm_marking_rate); if (!cc_config_status_via(payload, rcv, dest, IB_CC_ATTR_SWITCH_PORT_CONGESTION_SETTING, portnum / 48, 0, NULL, srcport, cckey)) return "switch port congestion setting config failed"; return NULL; } static const char *ca_congestion_setting(ib_portid_t *dest, char **argv, int argc) { uint8_t rcv[IB_CC_DATA_SZ] = { 0 }; uint8_t payload[IB_CC_DATA_SZ] = { 0 }; uint32_t port_control; uint32_t control_map; uint32_t ccti_timer; uint32_t ccti_increase; uint32_t trigger_threshold; uint32_t ccti_min; const char *errstr; int i; if (argc != 6) return "invalid number of parameters for CACongestionSetting"; if ((errstr = parseint(argv[0], &port_control, 0))) return errstr; if ((errstr = parseint(argv[1], &control_map, 0))) return errstr; if ((errstr = parseint(argv[2], &ccti_timer, 0))) return errstr; if ((errstr = parseint(argv[3], &ccti_increase, 0))) return errstr; if ((errstr = parseint(argv[4], &trigger_threshold, 0))) return errstr; if ((errstr = parseint(argv[5], &ccti_min, 0))) return errstr; mad_encode_field(payload, IB_CC_CA_CONGESTION_SETTING_PORT_CONTROL_F, &port_control); mad_encode_field(payload, IB_CC_CA_CONGESTION_SETTING_CONTROL_MAP_F, &control_map); for (i = 0; i < 16; i++) { uint8_t *ptr; if (!(control_map & (0x1 << i))) continue; ptr = payload + 2 + 2 + i * 8; mad_encode_field(ptr, IB_CC_CA_CONGESTION_ENTRY_CCTI_TIMER_F, &ccti_timer); mad_encode_field(ptr, IB_CC_CA_CONGESTION_ENTRY_CCTI_INCREASE_F, &ccti_increase); mad_encode_field(ptr, IB_CC_CA_CONGESTION_ENTRY_TRIGGER_THRESHOLD_F, &trigger_threshold); mad_encode_field(ptr, IB_CC_CA_CONGESTION_ENTRY_CCTI_MIN_F, &ccti_min); } if (!cc_config_status_via(payload, rcv, dest, IB_CC_ATTR_CA_CONGESTION_SETTING, 0, 0, NULL, srcport, cckey)) return "ca congestion setting config failed"; return NULL; } static const char *congestion_control_table(ib_portid_t *dest, char **argv, int argc) { uint8_t rcv[IB_CC_DATA_SZ] = { 0 }; uint8_t payload[IB_CC_DATA_SZ] = { 0 }; uint32_t ccti_limit; uint32_t index; uint32_t cctshifts[64]; uint32_t cctmults[64]; const char *errstr; int i; if (argc < 2 || argc > 66) return "invalid number of parameters for CongestionControlTable"; if ((errstr = parseint(argv[0], &ccti_limit, 0))) return errstr; if ((errstr = parseint(argv[1], &index, 0))) return errstr; if (ccti_limit && (ccti_limit + 1) != (index * 64 + (argc - 2))) return "invalid number of cct entries input given ccti_limit and index"; for (i = 0; i 
< (argc - 2); i++) { if ((errstr = parsecct(argv[i + 2], &cctshifts[i], &cctmults[i]))) return errstr; } mad_encode_field(payload, IB_CC_CONGESTION_CONTROL_TABLE_CCTI_LIMIT_F, &ccti_limit); for (i = 0; i < (argc - 2); i++) { mad_encode_field(payload + 4 + i * 2, IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_CCT_SHIFT_F, &cctshifts[i]); mad_encode_field(payload + 4 + i * 2, IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_CCT_MULTIPLIER_F, &cctmults[i]); } if (!cc_config_status_via(payload, rcv, dest, IB_CC_ATTR_CONGESTION_CONTROL_TABLE, index, 0, NULL, srcport, cckey)) return "congestion control table config failed"; return NULL; } static int process_opt(void *context, int ch) { switch (ch) { case 'c': cckey = (uint64_t) strtoull(optarg, NULL, 0); break; default: return -1; } return 0; } int main(int argc, char **argv) { char usage_args[1024]; int mgmt_classes[3] = { IB_SMI_CLASS, IB_SA_CLASS, IB_CC_CLASS }; ib_portid_t portid = { 0 }; const char *err; op_fn_t *fn; const match_rec_t *r; int n; const struct ibdiag_opt opts[] = { {"cckey", 'c', 1, "", "CC key"}, {} }; const char *usage_examples[] = { "SwitchCongestionSetting 2 0x1F 0x1FFFFFFFFF 0x0 0xF 8 0 0:0 1\t# Configure Switch Congestion Settings", "CACongestionSetting 1 0 0x3 150 1 0 0\t\t# Configure CA Congestion Settings to SL 0 and SL 1", "CACongestionSetting 1 0 0x4 200 1 0 0\t\t# Configure CA Congestion Settings to SL 2", "CongestionControlTable 1 63 0 0:0 0:1 ...\t# Configure first block of Congestion Control Table", "CongestionControlTable 1 127 0 0:64 0:65 ...\t# Configure second block of Congestion Control Table", NULL }; n = sprintf(usage_args, "[-c key] \n" "\nWARNING -- You should understand what you are " "doing before using this tool. Misuse of this " "tool could result in a broken fabric.\n" "\nSupported ops (and aliases, case insensitive):\n"); for (r = match_tbl; r->name; r++) { n += snprintf(usage_args + n, sizeof(usage_args) - n, " %s (%s) %s%s%s\n", r->name, r->alias ? r->alias : "", r->opt_portnum ? " " : "", r->ops_extra ? " " : "", r->ops_extra ? r->ops_extra : ""); if (n >= sizeof(usage_args)) exit(-1); } ibdiag_process_opts(argc, argv, NULL, "DK", opts, process_opt, usage_args, usage_examples); argc -= optind; argv += optind; if (argc < 2) ibdiag_show_usage(); if (!(fn = match_op(match_tbl, argv[0]))) IBEXIT("operation '%s' not supported", argv[0]); srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 3, 0); if (!srcports) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); srcport = srcports->gsi.port; if (!srcport) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); smp_mkey_set(srcports->smi.port, ibd_mkey); if (resolve_portid_str(srcports->gsi.ca_name, ibd_ca_port, &portid, argv[1], ibd_dest_type, ibd_sm_id, srcport) < 0) IBEXIT("can't resolve destination %s", argv[1]); if ((err = fn(&portid, argv + 2, argc - 2))) IBEXIT("operation %s: %s", argv[0], err); mad_rpc_close_port2(srcports); exit(0); } rdma-core-56.1/infiniband-diags/ibccquery.c000066400000000000000000000277511477342711600206660ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2011 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 2011 Lawrence Livermore National Lab. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

/* include targets lost in the archive flattening; reconstructed from
 * what this file uses */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <getopt.h>

#define __STDC_FORMAT_MACROS
#include <inttypes.h>
#include <infiniband/mad.h>

#include "ibdiag_common.h"

static struct ibmad_port *srcport;
static struct ibmad_ports_pair *srcports;

static op_fn_t class_port_info;
static op_fn_t congestion_info;
static op_fn_t congestion_key_info;
static op_fn_t congestion_log;
static op_fn_t switch_congestion_setting;
static op_fn_t switch_port_congestion_setting;
static op_fn_t ca_congestion_setting;
static op_fn_t congestion_control_table;
static op_fn_t timestamp_dump;

static const match_rec_t match_tbl[] = {
	{"ClassPortInfo", "CP", class_port_info, 0, ""},
	{"CongestionInfo", "CI", congestion_info, 0, ""},
	{"CongestionKeyInfo", "CK", congestion_key_info, 0, ""},
	{"CongestionLog", "CL", congestion_log, 0, ""},
	{"SwitchCongestionSetting", "SS", switch_congestion_setting, 0, ""},
	{"SwitchPortCongestionSetting", "SP", switch_port_congestion_setting,
	 1, ""},
	{"CACongestionSetting", "CS", ca_congestion_setting, 0, ""},
	{"CongestionControlTable", "CT", congestion_control_table, 0, ""},
	{"Timestamp", "TI", timestamp_dump, 0, ""},
	{}
};

static uint64_t cckey = 0;

/*******************************************/

static const char *class_port_info(ib_portid_t *dest, char **argv, int argc)
{
	char buf[2048];
	char data[IB_CC_DATA_SZ] = { 0 };

	if (!cc_query_status_via(data, dest, CLASS_PORT_INFO,
				 0, 0, NULL, srcport, cckey))
		return "class port info query failed";

	mad_dump_classportinfo(buf, sizeof buf, data, sizeof data);

	printf("# ClassPortInfo: %s\n%s", portid2str(dest), buf);
	return NULL;
}

static const char *congestion_info(ib_portid_t *dest, char **argv, int argc)
{
	char buf[2048];
	char data[IB_CC_DATA_SZ] = { 0 };

	if (!cc_query_status_via(data, dest, IB_CC_ATTR_CONGESTION_INFO,
				 0, 0, NULL, srcport, cckey))
		return "congestion info query failed";

	mad_dump_cc_congestioninfo(buf, sizeof buf, data, sizeof data);

	printf("# CongestionInfo: %s\n%s", portid2str(dest), buf);
	return NULL;
}

static const char *congestion_key_info(ib_portid_t *dest, char **argv,
				       int argc)
{
	char buf[2048];
	char data[IB_CC_DATA_SZ] = { 0 };

	if (!cc_query_status_via(data, dest, IB_CC_ATTR_CONGESTION_KEY_INFO,
				 0, 0, NULL, srcport, cckey))
		return "congestion key info query failed";

	mad_dump_cc_congestionkeyinfo(buf, sizeof buf, data, sizeof data);

	printf("# CongestionKeyInfo: %s\n%s", portid2str(dest),
buf); return NULL; } static const char *congestion_log(ib_portid_t *dest, char **argv, int argc) { char buf[2048]; char data[IB_CC_LOG_DATA_SZ] = { 0 }; char emptybuf[16] = { 0 }; int i, type; if (!cc_query_status_via(data, dest, IB_CC_ATTR_CONGESTION_LOG, 0, 0, NULL, srcport, cckey)) return "congestion log query failed"; mad_decode_field((uint8_t *)data, IB_CC_CONGESTION_LOG_LOGTYPE_F, &type); if (type != 1 && type != 2) return "unrecognized log type"; mad_dump_cc_congestionlog(buf, sizeof buf, data, sizeof data); printf("# CongestionLog: %s\n%s", portid2str(dest), buf); if (type == 1) { mad_dump_cc_congestionlogswitch(buf, sizeof buf, data, sizeof data); printf("%s\n", buf); for (i = 0; i < 15; i++) { /* output only if entry not 0 */ if (memcmp(data + 40 + i * 12, emptybuf, 12)) { mad_dump_cc_congestionlogentryswitch(buf, sizeof buf, data + 40 + i * 12, 12); printf("%s\n", buf); } } } else { /* XXX: Q3/2010 errata lists first entry offset at 80, but we assume * will be updated to 96 once CurrentTimeStamp field is word aligned. * In addition, assume max 13 log events instead of 16. Due to * errata changes increasing size of CA log event, 16 log events is * no longer possible to fit in max MAD size. */ mad_dump_cc_congestionlogca(buf, sizeof buf, data, sizeof data); printf("%s\n", buf); for (i = 0; i < 13; i++) { /* output only if entry not 0 */ if (memcmp(data + 12 + i * 16, emptybuf, 16)) { mad_dump_cc_congestionlogentryca(buf, sizeof buf, data + 12 + i * 16, 16); printf("%s\n", buf); } } } return NULL; } static const char *switch_congestion_setting(ib_portid_t *dest, char **argv, int argc) { char buf[2048]; char data[IB_CC_DATA_SZ] = { 0 }; if (!cc_query_status_via(data, dest, IB_CC_ATTR_SWITCH_CONGESTION_SETTING, 0, 0, NULL, srcport, cckey)) return "switch congestion setting query failed"; mad_dump_cc_switchcongestionsetting(buf, sizeof buf, data, sizeof data); printf("# SwitchCongestionSetting: %s\n%s", portid2str(dest), buf); return NULL; } static const char *switch_port_congestion_setting(ib_portid_t *dest, char **argv, int argc) { char buf[2048]; char data[IB_CC_DATA_SZ] = { 0 }; int type, numports, maxblocks, i, j; int portnum = 0; int outputcount = 0; if (argc > 0) portnum = strtol(argv[0], NULL, 0); /* Figure out number of ports first */ if (!smp_query_via(data, dest, IB_ATTR_NODE_INFO, 0, 0, srcports->smi.port)) return "node info query failed"; mad_decode_field((uint8_t *)data, IB_NODE_TYPE_F, &type); mad_decode_field((uint8_t *)data, IB_NODE_NPORTS_F, &numports); if (type != IB_NODE_SWITCH) return "destination not a switch"; printf("# SwitchPortCongestionSetting: %s\n", portid2str(dest)); if (portnum) { if (portnum > numports) return "invalid port number specified"; memset(data, '\0', sizeof data); if (!cc_query_status_via(data, dest, IB_CC_ATTR_SWITCH_PORT_CONGESTION_SETTING, portnum / 48, 0, NULL, srcport, cckey)) return "switch port congestion setting query failed"; mad_dump_cc_switchportcongestionsettingelement(buf, sizeof buf, data + ((portnum % 48) * 4), 4); printf("%s", buf); return NULL; } /* else get all port info */ maxblocks = numports / 48 + 1; for (i = 0; i < maxblocks; i++) { memset(data, '\0', sizeof data); if (!cc_query_status_via(data, dest, IB_CC_ATTR_SWITCH_PORT_CONGESTION_SETTING, i, 0, NULL, srcport, cckey)) return "switch port congestion setting query failed"; for (j = 0; j < 48 && outputcount <= numports; j++) { printf("Port:............................%u\n", i * 48 + j); mad_dump_cc_switchportcongestionsettingelement(buf, sizeof buf, data + j * 4, 4); 
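/* (note: SwitchPortCongestionSetting data is read in blocks of 48 four-byte
 * elements, so port p maps to attribute-modifier block p/48 at byte offset
 * (p%48)*4 -- the same indexing used by the queries in this function) */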
printf("%s\n", buf); outputcount++; } } return NULL; } static const char *ca_congestion_setting(ib_portid_t *dest, char **argv, int argc) { char buf[2048]; char data[IB_CC_DATA_SZ] = { 0 }; int i; if (!cc_query_status_via(data, dest, IB_CC_ATTR_CA_CONGESTION_SETTING, 0, 0, NULL, srcport, cckey)) return "ca congestion setting query failed"; mad_dump_cc_cacongestionsetting(buf, sizeof buf, data, sizeof data); printf("# CACongestionSetting: %s\n%s\n", portid2str(dest), buf); for (i = 0; i < 16; i++) { printf("SL:..............................%u\n", i); mad_dump_cc_cacongestionentry(buf, sizeof buf, data + 4 + i * 8, 8); printf("%s\n", buf); } return NULL; } static const char *congestion_control_table(ib_portid_t *dest, char **argv, int argc) { char buf[2048]; char data[IB_CC_DATA_SZ] = { 0 }; int limit, outputcount = 0; int i, j; if (!cc_query_status_via(data, dest, IB_CC_ATTR_CONGESTION_CONTROL_TABLE, 0, 0, NULL, srcport, cckey)) return "congestion control table query failed"; mad_decode_field((uint8_t *)data, IB_CC_CONGESTION_CONTROL_TABLE_CCTI_LIMIT_F, &limit); mad_dump_cc_congestioncontroltable(buf, sizeof buf, data, sizeof data); printf("# CongestionControlTable: %s\n%s\n", portid2str(dest), buf); if (!limit) return NULL; for (i = 0; i < (limit/64) + 1; i++) { /* first query done */ if (i) if (!cc_query_status_via(data, dest, IB_CC_ATTR_CONGESTION_CONTROL_TABLE, i, 0, NULL, srcport, cckey)) return "congestion control table query failed"; for (j = 0; j < 64 && outputcount <= limit; j++) { printf("Entry:...........................%u\n", i*64 + j); mad_dump_cc_congestioncontroltableentry(buf, sizeof buf, data + 4 + j * 2, sizeof data - 4 - j * 2); printf("%s\n", buf); outputcount++; } } return NULL; } static const char *timestamp_dump(ib_portid_t *dest, char **argv, int argc) { char buf[2048]; char data[IB_CC_DATA_SZ] = { 0 }; if (!cc_query_status_via(data, dest, IB_CC_ATTR_TIMESTAMP, 0, 0, NULL, srcport, cckey)) return "timestamp query failed"; mad_dump_cc_timestamp(buf, sizeof buf, data, sizeof data); printf("# Timestamp: %s\n%s", portid2str(dest), buf); return NULL; } static int process_opt(void *context, int ch) { switch (ch) { case 'c': cckey = (uint64_t) strtoull(optarg, NULL, 0); break; default: return -1; } return 0; } int main(int argc, char **argv) { char usage_args[1024]; int mgmt_classes[3] = { IB_SMI_CLASS, IB_SA_CLASS, IB_CC_CLASS }; ib_portid_t portid = { 0 }; const char *err; op_fn_t *fn; const match_rec_t *r; int n; const struct ibdiag_opt opts[] = { {"cckey", 'c', 1, "", "CC key"}, {} }; const char *usage_examples[] = { "CongestionInfo 3\t\t\t# Congestion Info by lid", "SwitchPortCongestionSetting 3\t# Query all Switch Port Congestion Settings", "SwitchPortCongestionSetting 3 1\t# Query Switch Port Congestion Setting for port 1", NULL }; n = sprintf(usage_args, "[-c key] \n" "\nSupported ops (and aliases, case insensitive):\n"); for (r = match_tbl; r->name; r++) { n += snprintf(usage_args + n, sizeof(usage_args) - n, " %s (%s) %s\n", r->name, r->alias ? r->alias : "", r->opt_portnum ? 
" []" : ""); if (n >= sizeof(usage_args)) exit(-1); } ibdiag_process_opts(argc, argv, NULL, "DK", opts, process_opt, usage_args, usage_examples); argc -= optind; argv += optind; if (argc < 2) ibdiag_show_usage(); if (!(fn = match_op(match_tbl, argv[0]))) IBEXIT("operation '%s' not supported", argv[0]); srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 3, 0); if (!srcports) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); srcport = srcports->gsi.port; if (!srcport) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); smp_mkey_set(srcports->smi.port, ibd_mkey); if (resolve_portid_str(srcports->gsi.ca_name, ibd_ca_port, &portid, argv[1], ibd_dest_type, ibd_sm_id, srcport) < 0) IBEXIT("can't resolve destination %s", argv[1]); if ((err = fn(&portid, argv + 2, argc - 2))) IBEXIT("operation %s: %s", argv[0], err); mad_rpc_close_port2(srcports); exit(0); } rdma-core-56.1/infiniband-diags/ibdiag_common.c000066400000000000000000000565261477342711600214710ustar00rootroot00000000000000/* * Copyright (c) 2006-2007 The Regents of the University of California. * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2010 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * Copyright (c) 2011 Lawrence Livermore National Security. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ /** * Define common functions which can be included in the various C based diags. 
*/ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include int ibverbose; enum MAD_DEST ibd_dest_type = IB_DEST_LID; ib_portid_t *ibd_sm_id; static ib_portid_t sm_portid = { 0 }; /* general config options */ #define IBDIAG_CONFIG_GENERAL IBDIAG_CONFIG_PATH"/ibdiag.conf" char *ibd_ca = NULL; int ibd_ca_port = 0; int ibd_timeout = 0; uint32_t ibd_ibnetdisc_flags = IBND_CONFIG_MLX_EPI; uint64_t ibd_mkey; uint64_t ibd_sakey = 0; int show_keys = 0; char *ibd_nd_format = NULL; static const char *prog_name; static const char *prog_args; static const char **prog_examples; static struct option *long_opts = NULL; static const struct ibdiag_opt *opts_map[256]; static const char *get_build_version(void) { return "BUILD VERSION: " PACKAGE_VERSION; } static void pretty_print(int start, int width, const char *str) { int len = width - start; const char *p, *e; while (1) { while (isspace(*str)) str++; p = str; do { e = p + 1; p = strchr(e, ' '); } while (p && p - str < len); if (!p) { fprintf(stderr, "%s", str); break; } if (e - str == 1) e = p; fprintf(stderr, "%.*s\n%*s", (int)(e - str), str, start, ""); str = e; } } static inline int val_str_true(const char *val_str) { return ((strncmp(val_str, "TRUE", strlen("TRUE")) == 0) || (strncmp(val_str, "true", strlen("true")) == 0)); } static void read_ibdiag_config(const char *file) { char buf[1024]; FILE *config_fd = NULL; char *p_prefix, *p_last; char *name; char *val_str; struct stat statbuf; /* silently ignore missing config file */ if (stat(file, &statbuf)) return; config_fd = fopen(file, "r"); if (!config_fd) return; while (fgets(buf, sizeof buf, config_fd) != NULL) { p_prefix = strtok_r(buf, "\n", &p_last); if (!p_prefix) continue; /* ignore blank lines */ if (*p_prefix == '#') continue; /* ignore comment lines */ name = strtok_r(p_prefix, "=", &p_last); val_str = strtok_r(NULL, "\n", &p_last); if (strncmp(name, "CA", strlen("CA")) == 0) { free(ibd_ca); ibd_ca = strdup(val_str); } else if (strncmp(name, "Port", strlen("Port")) == 0) { ibd_ca_port = strtoul(val_str, NULL, 0); } else if (strncmp(name, "timeout", strlen("timeout")) == 0) { ibd_timeout = strtoul(val_str, NULL, 0); } else if (strncmp(name, "MLX_EPI", strlen("MLX_EPI")) == 0) { if (val_str_true(val_str)) { ibd_ibnetdisc_flags |= IBND_CONFIG_MLX_EPI; } else { ibd_ibnetdisc_flags &= ~IBND_CONFIG_MLX_EPI; } } else if (strncmp(name, "m_key", strlen("m_key")) == 0) { ibd_mkey = strtoull(val_str, NULL, 0); } else if (strncmp(name, "sa_key", strlen("sa_key")) == 0) { ibd_sakey = strtoull(val_str, NULL, 0); } else if (strncmp(name, "nd_format", strlen("nd_format")) == 0) { if (ibd_nd_format) free(ibd_nd_format); ibd_nd_format = strdup(val_str); } } fclose(config_fd); } void ibdiag_show_usage(void) { struct option *o = long_opts; int n; fprintf(stderr, "\nUsage: %s [options] %s\n\n", prog_name, prog_args ? prog_args : ""); if (long_opts[0].name) fprintf(stderr, "Options:\n"); for (o = long_opts; o->name; o++) { const struct ibdiag_opt *io = opts_map[o->val]; n = fprintf(stderr, " --%s", io->name); if (isprint(io->letter)) n += fprintf(stderr, ", -%c", io->letter); if (io->has_arg) n += fprintf(stderr, " %s", io->arg_tmpl ? io->arg_tmpl : ""); if (io->description && *io->description) { n += fprintf(stderr, "%*s ", 24 - n > 0 ? 
24 - n : 0, ""); pretty_print(n, 74, io->description); } fprintf(stderr, "\n"); } if (prog_examples) { const char **p; fprintf(stderr, "\nExamples:\n"); for (p = prog_examples; *p && **p; p++) fprintf(stderr, " %s %s\n", prog_name, *p); } fprintf(stderr, "\n"); exit(2); } static int process_opt(int ch) { char *endp; long val; switch (ch) { case 'z': read_ibdiag_config(optarg); break; case 'h': ibdiag_show_usage(); break; case 'V': fprintf(stderr, "%s %s\n", prog_name, get_build_version()); exit(0); case 'e': madrpc_show_errors(1); break; case 'v': ibverbose++; break; case 'd': ibdebug++; madrpc_show_errors(1); umad_debug(ibdebug - 1); break; case 'C': ibd_ca = optarg; break; case 'P': ibd_ca_port = strtoul(optarg, NULL, 0); break; case 'D': ibd_dest_type = IB_DEST_DRPATH; break; case 'L': ibd_dest_type = IB_DEST_LID; break; case 'G': ibd_dest_type = IB_DEST_GUID; break; case 't': errno = 0; val = strtol(optarg, &endp, 0); if (errno || (endp && *endp != '\0') || val <= 0 || val > INT_MAX) IBEXIT("Invalid timeout \"%s\". Timeout requires a " "positive integer value < %d.", optarg, INT_MAX); else { madrpc_set_timeout((int)val); ibd_timeout = (int)val; } break; case 's': /* srcport is not required when resolving via IB_DEST_LID */ if (resolve_portid_str(ibd_ca, ibd_ca_port, &sm_portid, optarg, IB_DEST_LID, NULL, NULL) < 0) IBEXIT("cannot resolve SM destination port %s", optarg); ibd_sm_id = &sm_portid; break; case 'K': show_keys = 1; break; case 'y': errno = 0; ibd_mkey = strtoull(optarg, &endp, 0); if (errno || *endp != '\0') { errno = 0; ibd_mkey = strtoull(getpass("M_Key: "), &endp, 0); if (errno || *endp != '\0') { IBEXIT("Bad M_Key"); } } break; default: return -1; } return 0; } static const struct ibdiag_opt common_opts[] = { {"config", 'z', 1, "", "use config file, default: " IBDIAG_CONFIG_GENERAL}, {"Ca", 'C', 1, "", "Ca name to use"}, {"Port", 'P', 1, "", "Ca port number to use"}, {"Direct", 'D', 0, NULL, "use Direct address argument"}, {"Lid", 'L', 0, NULL, "use LID address argument"}, {"Guid", 'G', 0, NULL, "use GUID address argument"}, {"timeout", 't', 1, "", "timeout in ms"}, {"sm_port", 's', 1, "", "SM port lid"}, {"show_keys", 'K', 0, NULL, "display security keys in output"}, {"m_key", 'y', 1, "", "M_Key to use in request"}, {"errors", 'e', 0, NULL, "show send and receive errors"}, {"verbose", 'v', 0, NULL, "increase verbosity level"}, {"debug", 'd', 0, NULL, "raise debug level"}, {"help", 'h', 0, NULL, "help message"}, {"version", 'V', 0, NULL, "show version"}, {} }; static void make_opt(struct option *l, const struct ibdiag_opt *o, const struct ibdiag_opt *map[]) { l->name = o->name; l->has_arg = o->has_arg; l->flag = NULL; l->val = o->letter; if (!map[l->val]) map[l->val] = o; } static struct option *make_long_opts(const char *exclude_str, const struct ibdiag_opt *custom_opts, const struct ibdiag_opt *map[]) { struct option *res, *l; const struct ibdiag_opt *o; unsigned n = 0; if (custom_opts) for (o = custom_opts; o->name; o++) n++; res = malloc((sizeof(common_opts) / sizeof(common_opts[0]) + n) * sizeof(*res)); if (!res) return NULL; l = res; if (custom_opts) for (o = custom_opts; o->name; o++) make_opt(l++, o, map); for (o = common_opts; o->name; o++) { if (exclude_str && strchr(exclude_str, o->letter)) continue; make_opt(l++, o, map); } memset(l, 0, sizeof(*l)); return res; } static void make_str_opts(const struct option *o, char *p, unsigned size) { unsigned i, n = 0; for (n = 0; o->name && n + 2 + o->has_arg < size; o++) { p[n++] = (char)o->val; for (i = 0; i < 
(unsigned)o->has_arg; i++) p[n++] = ':'; } p[n] = '\0'; } int ibdiag_process_opts(int argc, char *const argv[], void *cxt, const char *exclude_common_str, const struct ibdiag_opt custom_opts[], int (*custom_handler) (void *cxt, int val), const char *usage_args, const char *usage_examples[]) { char str_opts[1024]; const struct ibdiag_opt *o; prog_name = argv[0]; prog_args = usage_args; prog_examples = usage_examples; if (long_opts) free(long_opts); long_opts = make_long_opts(exclude_common_str, custom_opts, opts_map); if (!long_opts) return -1; read_ibdiag_config(IBDIAG_CONFIG_GENERAL); make_str_opts(long_opts, str_opts, sizeof(str_opts)); while (1) { int ch = getopt_long(argc, argv, str_opts, long_opts, NULL); if (ch == -1) break; o = opts_map[ch]; if (!o) ibdiag_show_usage(); if (custom_handler) { if (custom_handler(cxt, ch) && process_opt(ch)) ibdiag_show_usage(); } else if (process_opt(ch)) ibdiag_show_usage(); } return 0; } void ibexit(const char *fn, const char *msg, ...) { char buf[512]; va_list va; int n; va_start(va, msg); n = vsprintf(buf, msg, va); va_end(va); buf[n] = 0; if (ibdebug) printf("%s: iberror: [pid %d] %s: failed: %s\n", prog_name ? prog_name : "", getpid(), fn, buf); else printf("%s: iberror: failed: %s\n", prog_name ? prog_name : "", buf); exit(-1); } const char *conv_cnt_human_readable(uint64_t val64, float *val, int data) { uint64_t tmp = val64; int ui = 0; uint64_t div = 1; tmp /= 1024; while (tmp) { ui++; tmp /= 1024; div *= 1024; } *val = (float)(val64); if (data) { *val *= 4; if (*val/div > 1024) { ui++; div *= 1024; } } *val /= div; if (data) { switch (ui) { case 0: return ("B"); case 1: return ("KB"); case 2: return ("MB"); case 3: return ("GB"); case 4: return ("TB"); case 5: return ("PB"); case 6: return ("EB"); default: return (""); } } else { switch (ui) { case 0: return (""); case 1: return ("K"); case 2: return ("M"); case 3: return ("G"); case 4: return ("T"); case 5: return ("P"); case 6: return ("E"); default: return (""); } } return (""); } int is_port_info_extended_supported(ib_portid_t * dest, int port, struct ibmad_port *srcport) { uint8_t data[IB_SMP_DATA_SIZE] = { 0 }; uint32_t cap_mask; uint16_t cap_mask2; int type, portnum; if (!smp_query_via(data, dest, IB_ATTR_NODE_INFO, 0, 0, srcport)) IBEXIT("node info query failed"); mad_decode_field(data, IB_NODE_TYPE_F, &type); if (type == IB_NODE_SWITCH) portnum = 0; else portnum = port; if (!smp_query_via(data, dest, IB_ATTR_PORT_INFO, portnum, 0, srcport)) IBEXIT("port info query failed"); mad_decode_field(data, IB_PORT_CAPMASK_F, &cap_mask); if (cap_mask & be32toh(IB_PORT_CAP_HAS_CAP_MASK2)) { mad_decode_field(data, IB_PORT_CAPMASK2_F, &cap_mask2); if (!(cap_mask2 & be16toh(IB_PORT_CAP2_IS_PORT_INFO_EXT_SUPPORTED))) { IBWARN("port info capability mask2 = 0x%x doesn't" " indicate PortInfoExtended support", cap_mask2); return 0; } } else { IBWARN("port info capability mask2 not supported"); return 0; } return 1; } int is_mlnx_ext_port_info_supported(uint32_t vendorid, uint16_t devid) { if (ibd_ibnetdisc_flags & IBND_CONFIG_MLX_EPI) { if ((devid >= 0xc738 && devid <= 0xc73b) || devid == 0xc839 || devid == 0xcb20 || devid == 0xcf08 || devid == 0xcf09 || devid == 0xd2f0 || ((vendorid == 0x119f) && /* Bull SwitchX */ (devid == 0x1b02 || devid == 0x1b50 || /* Bull SwitchIB and SwitchIB2 */ devid == 0x1ba0 || (devid >= 0x1bd0 && devid <= 0x1bd5) || /* Bull Quantum */ devid == 0x1bf0))) return 1; if ((devid >= 0x1003 && devid <= 0x101b) || (devid == 0xa2d2) || ((vendorid == 0x119f) && /* Bull ConnectX3 */ 
(devid == 0x1b33 || devid == 0x1b73 || devid == 0x1b40 || devid == 0x1b41 || devid == 0x1b60 || devid == 0x1b61 || /* Bull ConnectIB */ devid == 0x1b83 || devid == 0x1b93 || devid == 0x1b94 || /* Bull ConnectX4, Sequana HDR and HDR100 */ devid == 0x1bb4 || devid == 0x1bb5 || (devid >= 0x1bc4 && devid <= 0x1bc6)))) return 1; } return 0; } /** ========================================================================= * Resolve the SM portid using the umad layer rather than using * ib_resolve_smlid_via which requires a PortInfo query on the local port. */ int resolve_sm_portid(char *ca_name, uint8_t portnum, ib_portid_t *sm_id) { umad_port_t port; int rc; if (!sm_id) return (-1); if ((rc = umad_get_port(ca_name, portnum, &port)) < 0) return rc; memset(sm_id, 0, sizeof(*sm_id)); sm_id->lid = port.sm_lid; sm_id->sl = port.sm_sl; umad_release_port(&port); return 0; } /** ========================================================================= * Resolve local CA characteristics using the umad layer rather than using * ib_resolve_self_via which requires SMP queries on the local port. */ int resolve_self(char *ca_name, uint8_t ca_port, ib_portid_t *portid, int *portnum, ibmad_gid_t *gid) { umad_port_t port; uint64_t prefix, guid; int rc; if (!(portid || portnum || gid)) return (-1); if ((rc = umad_get_port(ca_name, ca_port, &port)) < 0) return rc; if (portid) { memset(portid, 0, sizeof(*portid)); portid->lid = port.base_lid; portid->sl = port.sm_sl; } if (portnum) *portnum = port.portnum; if (gid) { memset(gid, 0, sizeof(*gid)); prefix = be64toh(port.gid_prefix); guid = be64toh(port.port_guid); mad_encode_field(*gid, IB_GID_PREFIX_F, &prefix); mad_encode_field(*gid, IB_GID_GUID_F, &guid); } umad_release_port(&port); return 0; } static int resolve_gid(char *ca_name, uint8_t ca_port, ib_portid_t *portid, ibmad_gid_t gid, ib_portid_t *sm_id, const struct ibmad_port *srcport) { ib_portid_t tmp; char buf[IB_SA_DATA_SIZE] = { 0 }; if (!sm_id) { sm_id = &tmp; if (resolve_sm_portid(ca_name, ca_port, sm_id) < 0) return -1; } if ((portid->lid = ib_path_query_via(srcport, gid, gid, sm_id, buf)) < 0) return -1; return 0; } static int resolve_guid(char *ca_name, uint8_t ca_port, ib_portid_t *portid, uint64_t *guid, ib_portid_t *sm_id, const struct ibmad_port *srcport) { ib_portid_t tmp; uint8_t buf[IB_SA_DATA_SIZE] = { 0 }; __be64 prefix; ibmad_gid_t selfgid; if (!sm_id) { sm_id = &tmp; if (resolve_sm_portid(ca_name, ca_port, sm_id) < 0) return -1; } if (resolve_self(ca_name, ca_port, NULL, NULL, &selfgid) < 0) return -1; memcpy(&prefix, selfgid, sizeof(prefix)); mad_set_field64(portid->gid, 0, IB_GID_PREFIX_F, prefix ? be64toh(prefix) : IB_DEFAULT_SUBN_PREFIX); if (guid) mad_set_field64(portid->gid, 0, IB_GID_GUID_F, *guid); if ((portid->lid = ib_path_query_via(srcport, selfgid, portid->gid, sm_id, buf)) < 0) return -1; mad_decode_field(buf, IB_SA_PR_SL_F, &portid->sl); return 0; } /* * Callers of this function should ensure their ibmad_port has been opened with * IB_SA_CLASS as this function may require the SA to resolve addresses. 
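 *
 * A minimal usage sketch (the GUID string is a made-up example and error
 * handling is trimmed; mad_rpc_open_port() is the standard libibmad opener):
 *
 *	int classes[2] = { IB_SMI_CLASS, IB_SA_CLASS };
 *	struct ibmad_port *port = mad_rpc_open_port(ibd_ca, ibd_ca_port,
 *						    classes, 2);
 *	ib_portid_t dest;
 *
 *	if (!port)
 *		IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port);
 *	if (resolve_portid_str(ibd_ca, ibd_ca_port, &dest,
 *			       "0x0002c9030000abcd", IB_DEST_GUID,
 *			       NULL, port) < 0)
 *		IBWARN("can't resolve destination");
 *	mad_rpc_close_port(port);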
*/ int resolve_portid_str(char *ca_name, uint8_t ca_port, ib_portid_t * portid, char *addr_str, enum MAD_DEST dest_type, ib_portid_t *sm_id, const struct ibmad_port *srcport) { ibmad_gid_t gid; uint64_t guid; int lid; char *routepath; ib_portid_t selfportid = { 0 }; int selfport = 0; memset(portid, 0, sizeof *portid); switch (dest_type) { case IB_DEST_LID: lid = strtol(addr_str, NULL, 0); if (!IB_LID_VALID(lid)) return -1; return ib_portid_set(portid, lid, 0, 0); case IB_DEST_DRPATH: if (str2drpath(&portid->drpath, addr_str, 0, 0) < 0) return -1; return 0; case IB_DEST_GUID: if (!(guid = strtoull(addr_str, NULL, 0))) return -1; /* keep guid in portid? */ return resolve_guid(ca_name, ca_port, portid, &guid, sm_id, srcport); case IB_DEST_DRSLID: lid = strtol(addr_str, &routepath, 0); routepath++; if (!IB_LID_VALID(lid)) return -1; ib_portid_set(portid, lid, 0, 0); /* handle DR parsing and set DrSLID to local lid */ if (resolve_self(ca_name, ca_port, &selfportid, &selfport, NULL) < 0) return -1; if (str2drpath(&portid->drpath, routepath, selfportid.lid, 0) < 0) return -1; return 0; case IB_DEST_GID: if (inet_pton(AF_INET6, addr_str, &gid) <= 0) return -1; return resolve_gid(ca_name, ca_port, portid, gid, sm_id, srcport); default: IBWARN("bad dest_type %d", dest_type); } return -1; } static unsigned int get_max_width(unsigned int num) { unsigned r = 0; /* 1x */ if (num & 8) r = 3; /* 12x */ else { if (num & 4) r = 2; /* 8x */ else if (num & 2) r = 1; /* 4x */ else if (num & 0x10) r = 4; /* 2x */ } return (1 << r); } static unsigned int get_max(unsigned int num) { unsigned r = 0; // r will be lg(num) while (num >>= 1) // unroll for more speed... r++; return (1 << r); } static uint8_t *get_port_info_for_cap_mask(ibnd_port_t *port) { uint8_t *info = NULL; if (port->node->type == IB_NODE_SWITCH) { if (port->node->ports[0]) info = (uint8_t *)&port->node->ports[0]->info; } else info = (uint8_t *)&port->info; return info; } void get_max_msg(char *width_msg, char *speed_msg, int msg_size, ibnd_port_t * port) { char buf[64]; uint32_t max_speed = 0; uint32_t espeed = 0, e2speed = 0; uint32_t cap_mask, rem_cap_mask, fdr10; uint8_t *info = NULL; uint32_t max_width = get_max_width(mad_get_field(port->info, 0, IB_PORT_LINK_WIDTH_SUPPORTED_F) & mad_get_field(port->remoteport->info, 0, IB_PORT_LINK_WIDTH_SUPPORTED_F)); if ((max_width & mad_get_field(port->info, 0, IB_PORT_LINK_WIDTH_ACTIVE_F)) == 0) // we are not at the max supported width // print what we could be at. 
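// (illustrative numbers: if both ends support 1x|4x -- a supported-width
// mask of 0x3 -- get_max_width() above returns the 4x bit (0x2); with the
// link active at 1x (0x1) the AND in the condition above is 0, so we
// report "Could be 4X")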
snprintf(width_msg, msg_size, "Could be %s", mad_dump_val(IB_PORT_LINK_WIDTH_ACTIVE_F, buf, 64, &max_width)); info = get_port_info_for_cap_mask(port); if (info) cap_mask = mad_get_field(info, 0, IB_PORT_CAPMASK_F); else cap_mask = 0; info = get_port_info_for_cap_mask(port->remoteport); if (info) rem_cap_mask = mad_get_field(info, 0, IB_PORT_CAPMASK_F); else rem_cap_mask = 0; if (cap_mask & be32toh(IB_PORT_CAP_HAS_EXT_SPEEDS) && rem_cap_mask & be32toh(IB_PORT_CAP_HAS_EXT_SPEEDS)) goto check_ext_speed; check_fdr10_supp: fdr10 = (mad_get_field(port->ext_info, 0, IB_MLNX_EXT_PORT_LINK_SPEED_SUPPORTED_F) & FDR10) && (mad_get_field(port->remoteport->ext_info, 0, IB_MLNX_EXT_PORT_LINK_SPEED_SUPPORTED_F) & FDR10); if (fdr10) goto check_fdr10_active; max_speed = get_max(mad_get_field(port->info, 0, IB_PORT_LINK_SPEED_SUPPORTED_F) & mad_get_field(port->remoteport->info, 0, IB_PORT_LINK_SPEED_SUPPORTED_F)); if ((max_speed & mad_get_field(port->info, 0, IB_PORT_LINK_SPEED_ACTIVE_F)) == 0) // we are not at the max supported speed // print what we could be at. snprintf(speed_msg, msg_size, "Could be %s", mad_dump_val(IB_PORT_LINK_SPEED_ACTIVE_F, buf, 64, &max_speed)); return; check_ext_speed: espeed = ibnd_get_agg_linkspeedextsup(get_port_info_for_cap_mask(port), port->info); e2speed = ibnd_get_agg_linkspeedextsup(get_port_info_for_cap_mask(port->remoteport), port->remoteport->info); if (!espeed || !e2speed) goto check_fdr10_supp; max_speed = get_max(espeed & e2speed); espeed = ibnd_get_agg_linkspeedext(get_port_info_for_cap_mask(port), port->info); if ((max_speed & espeed) == 0) // we are not at the max supported extended speed // print what we could be at. snprintf(speed_msg, msg_size, "Could be %s", ibnd_dump_agg_linkspeedext(buf, 64, max_speed)); return; check_fdr10_active: if ((mad_get_field(port->ext_info, 0, IB_MLNX_EXT_PORT_LINK_SPEED_ACTIVE_F) & FDR10) == 0) { /* Special case QDR to try to avoid confusion with FDR10 */ if (mad_get_field(port->info, 0, IB_PORT_LINK_SPEED_ACTIVE_F) == 4) /* QDR (10.0 Gbps) */ snprintf(speed_msg, msg_size, "Could be FDR10 (Found link at QDR but expected speed is FDR10)"); else snprintf(speed_msg, msg_size, "Could be FDR10"); } } int vsnprint_field(char *buf, size_t n, enum MAD_FIELDS f, int spacing, const char *format, va_list va_args) { int len, i, ret; len = strlen(mad_field_name(f)); if (len + 2 > n || spacing + 1 > n) return 0; strncpy(buf, mad_field_name(f), n); buf[len] = ':'; for (i = len+1; i < spacing+1; i++) { buf[i] = '.'; } ret = vsnprintf(&buf[spacing+1], n - spacing, format, va_args); if (ret >= n - spacing) buf[n] = '\0'; return ret + spacing; } int snprint_field(char *buf, size_t n, enum MAD_FIELDS f, int spacing, const char *format, ...) { va_list val; int ret; va_start(val, format); ret = vsnprint_field(buf, n, f, spacing, format, val); va_end(val); return ret; } void dump_portinfo(void *pi, int tabs) { int field, i; char val[64]; char buf[1024]; for (field = IB_PORT_FIRST_F; field < IB_PORT_LAST_F; field++) { for (i=0;i<tabs;i++) printf("\t"); mad_decode_field(pi, field, val); if (!mad_dump_field(field, buf, 1024, val)) return; printf("%s\n", buf); } } op_fn_t *match_op(const match_rec_t match_tbl[], char *name) { const match_rec_t *r; for (r = match_tbl; r->name; r++) if (!strcasecmp(r->name, name) || (r->alias && !strcasecmp(r->alias, name))) return r->fn; return NULL; } rdma-core-56.1/infiniband-diags/ibdiag_common.h000066400000000000000000000115411477342711600214620ustar00rootroot00000000000000/* * Copyright (c) 2006-2007 The Regents of the University of California. * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2002-2010 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
* Copyright (c) 2009 HNR Consulting. All rights reserved. * Copyright (c) 2011 Lawrence Livermore National Security. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #ifndef _IBDIAG_COMMON_H_ #define _IBDIAG_COMMON_H_ #include #include #include #include #include #include extern int ibverbose; extern char *ibd_ca; extern int ibd_ca_port; extern enum MAD_DEST ibd_dest_type; extern ib_portid_t *ibd_sm_id; extern int ibd_timeout; extern uint32_t ibd_ibnetdisc_flags; extern uint64_t ibd_mkey; extern uint64_t ibd_sakey; extern int show_keys; extern char *ibd_nd_format; /*========================================================*/ /* External interface */ /*========================================================*/ #undef DEBUG #define DEBUG(fmt, ...) do { \ if (ibdebug) IBDEBUG(fmt, ## __VA_ARGS__); \ } while (0) #define VERBOSE(fmt, ...) do { \ if (ibverbose) IBVERBOSE(fmt, ## __VA_ARGS__); \ } while (0) #define IBEXIT(fmt, ...) ibexit(__FUNCTION__, fmt, ## __VA_ARGS__) #define NOT_DISPLAYED_STR "" struct ibdiag_opt { const char *name; char letter; unsigned has_arg; const char *arg_tmpl; const char *description; }; extern int ibdiag_process_opts(int argc, char *const argv[], void *context, const char *exclude_common_str, const struct ibdiag_opt custom_opts[], int (*custom_handler) (void *cxt, int val), const char *usage_args, const char *usage_examples[]); extern void ibdiag_show_usage(void); extern void ibexit(const char *fn, const char *msg, ...) 
__attribute__((format(printf, 2, 3))); /* convert counter values to a float with a unit specifier returned (using * binary prefix) * "data" is a flag indicating this counter is a byte counter multiplied by 4 * as per PortCounters[Extended] */ const char *conv_cnt_human_readable(uint64_t val64, float *val, int data); int is_mlnx_ext_port_info_supported(uint32_t vendorid, uint16_t devid); int is_port_info_extended_supported(ib_portid_t * dest, int port, struct ibmad_port *srcport); void get_max_msg(char *width_msg, char *speed_msg, int msg_size, ibnd_port_t * port); int resolve_sm_portid(char *ca_name, uint8_t portnum, ib_portid_t *sm_id); int resolve_self(char *ca_name, uint8_t ca_port, ib_portid_t *portid, int *port, ibmad_gid_t *gid); int resolve_portid_str(char *ca_name, uint8_t ca_port, ib_portid_t * portid, char *addr_str, enum MAD_DEST dest_type, ib_portid_t *sm_id, const struct ibmad_port *srcport); int vsnprint_field(char *buf, size_t n, enum MAD_FIELDS f, int spacing, const char *format, va_list va_args) __attribute__((format(printf, 5, 0))); int snprint_field(char *buf, size_t n, enum MAD_FIELDS f, int spacing, const char *format, ...) __attribute__((format(printf, 5, 6))); void dump_portinfo(void *pi, int tabs); /** * Some common command line parsing */ typedef const char *(op_fn_t)(ib_portid_t *dest, char **argv, int argc); typedef struct match_rec { const char *name, *alias; op_fn_t *fn; unsigned opt_portnum; const char *ops_extra; } match_rec_t; op_fn_t *match_op(const match_rec_t match_tbl[], char *name); #endif /* _IBDIAG_COMMON_H_ */ rdma-core-56.1/infiniband-diags/ibdiag_sa.c000066400000000000000000000157371477342711600206030ustar00rootroot00000000000000/* * Copyright (c) 2006-2007 The Regents of the University of California. * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2010 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * Copyright (c) 2011 Lawrence Livermore National Security. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #include #include #include "ibdiag_common.h" #include "ibdiag_sa.h" /* define a common SA query structure * This is by no means optimal but it moves the saquery functionality out of * the saquery tool and provides it to other utilities. */ struct sa_handle *sa_get_handle(char *ca_name) { struct sa_handle *handle; handle = calloc(1, sizeof(*handle)); if (!handle) IBPANIC("calloc failed"); char *name = ca_name ? ca_name : ibd_ca; resolve_sm_portid(name, ibd_ca_port, &handle->dport); if (!handle->dport.lid) { IBWARN("No SM/SA found on port %s:%d", name ? name : "", ibd_ca_port); goto err; } handle->dport.qp = 1; if (!handle->dport.qkey) handle->dport.qkey = IB_DEFAULT_QP1_QKEY; handle->fd = umad_open_port(name, ibd_ca_port); if (handle->fd < 0) { IBWARN("umad_open_port on port %s:%d failed", name ? name : "", ibd_ca_port); goto err; } if ((handle->agent = umad_register(handle->fd, IB_SA_CLASS, 2, 1, NULL)) < 0) { umad_close_port(handle->fd); IBWARN("umad_register for SA class failed on port %s:%d", name ? name : "", ibd_ca_port); goto err; } return handle; err: free(handle); return (NULL); } void sa_free_handle(struct sa_handle * h) { umad_unregister(h->fd, h->agent); umad_close_port(h->fd); free(h); } int sa_query(struct sa_handle * h, uint8_t method, uint16_t attr, uint32_t mod, uint64_t comp_mask, uint64_t sm_key, void *data, size_t datasz, struct sa_query_result *result) { ib_rpc_t rpc; void *umad, *mad; int ret, offset, len = 256; memset(&rpc, 0, sizeof(rpc)); rpc.mgtclass = IB_SA_CLASS; rpc.method = method; rpc.attr.id = attr; rpc.attr.mod = mod; rpc.mask = comp_mask; rpc.datasz = datasz; rpc.dataoffs = IB_SA_DATA_OFFS; umad = calloc(1, len + umad_size()); if (!umad) IBPANIC("cannot alloc mem for umad: %s\n", strerror(errno)); mad_build_pkt(umad, &rpc, &h->dport, NULL, data); mad_set_field64(umad_get_mad(umad), 0, IB_SA_MKEY_F, sm_key); if (ibdebug > 1) xdump(stdout, "SA Request:\n", umad_get_mad(umad), len); ret = umad_send(h->fd, h->agent, umad, len, ibd_timeout, 0); if (ret < 0) { IBWARN("umad_send failed: attr 0x%x: %s\n", attr, strerror(errno)); free(umad); return (-ret); } recv_mad: ret = umad_recv(h->fd, umad, &len, ibd_timeout); if (ret < 0) { if (errno == ENOSPC) { umad = realloc(umad, umad_size() + len); goto recv_mad; } IBWARN("umad_recv failed: attr 0x%x: %s\n", attr, strerror(errno)); free(umad); return (-ret); } if ((ret = umad_status(umad))) return ret; mad = umad_get_mad(umad); if (ibdebug > 1) xdump(stdout, "SA Response:\n", mad, len); method = (uint8_t) mad_get_field(mad, 0, IB_MAD_METHOD_F); offset = mad_get_field(mad, 0, IB_SA_ATTROFFS_F); result->status = mad_get_field(mad, 0, IB_MAD_STATUS_F); result->p_result_madw = mad; if (result->status != IB_SA_MAD_STATUS_SUCCESS) result->result_cnt = 0; else if (method != IB_MAD_METHOD_GET_TABLE) result->result_cnt = 1; else if (!offset) result->result_cnt = 0; else result->result_cnt = (len - IB_SA_DATA_OFFS) / (offset << 3); return 0; } void sa_free_result_mad(struct sa_query_result *result) { if (result->p_result_madw) { free((uint8_t *) result->p_result_madw - umad_size()); result->p_result_madw = NULL; } } void *sa_get_query_rec(void *mad, unsigned i) { int offset = mad_get_field(mad, 0, IB_SA_ATTROFFS_F); return (uint8_t *) mad + IB_SA_DATA_OFFS + i * (offset << 3); } static const char *ib_sa_error_str[] = { "SA_NO_ERROR", "SA_ERR_NO_RESOURCES", "SA_ERR_REQ_INVALID", "SA_ERR_NO_RECORDS", "SA_ERR_TOO_MANY_RECORDS", "SA_ERR_REQ_INVALID_GID", "SA_ERR_REQ_INSUFFICIENT_COMPONENTS", "SA_ERR_REQ_DENIED",
"SA_ERR_STATUS_PRIO_SUGGESTED", "SA_ERR_UNKNOWN" }; #define ARR_SIZE(a) (sizeof(a)/sizeof((a)[0])) #define SA_ERR_UNKNOWN (ARR_SIZE(ib_sa_error_str) - 1) static inline const char *ib_sa_err_str(uint8_t status) { if (status > SA_ERR_UNKNOWN) status = SA_ERR_UNKNOWN; return (ib_sa_error_str[status]); } static const char *ib_mad_inv_field_str[] = { "MAD No invalid fields", "MAD Bad version", "MAD Method specified is not supported", "MAD Method/Attribute combination is not supported", "MAD Reserved", "MAD Reserved", "MAD Reserved", "MAD Invalid value in Attribute field(s) or Attribute Modifier", "MAD UNKNOWN ERROR" }; #define MAD_ERR_UNKNOWN (ARR_SIZE(ib_mad_inv_field_str) - 1) static inline const char *ib_mad_inv_field_err_str(uint8_t f) { if (f > MAD_ERR_UNKNOWN) f = MAD_ERR_UNKNOWN; return (ib_mad_inv_field_str[f]); } void sa_report_err(int status) { int st = status & 0xff; char mad_err_str[128] = { 0 }; char sa_err_str[64] = { 0 }; int rc; if (st) { rc = snprintf(mad_err_str, sizeof(mad_err_str), " (%s; %s; %s)", (st & 0x1) ? "BUSY" : "", (st & 0x2) ? "Redirection Required" : "", ib_mad_inv_field_err_str(st>>2)); if (rc > sizeof(mad_err_str)) fprintf(stderr, "WARN: string buffer overflow\n"); } st = status >> 8; if (st) { rc = snprintf(sa_err_str, sizeof(sa_err_str), " SA(%s)", ib_sa_err_str((uint8_t) st)); if (rc > sizeof(sa_err_str)) fprintf(stderr, "WARN: string buffer overflow\n"); } fprintf(stderr, "ERROR: Query result returned 0x%04x, %s%s\n", status, mad_err_str, sa_err_str); } rdma-core-56.1/infiniband-diags/ibdiag_sa.h000066400000000000000000000066201477342711600205770ustar00rootroot00000000000000/* * Copyright (c) 2006-2007 The Regents of the University of California. * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2002-2010 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2003 Intel Corporation. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * Copyright (c) 2012 Lawrence Livermore National Security. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #ifndef _IBDIAG_SA_H_ #define _IBDIAG_SA_H_ #include #include /* define an SA query structure to be common * This is by no means optimal but it moves the saquery functionality out of * the saquery tool and provides it to other utilities. */ struct sa_handle { int fd, agent; ib_portid_t dport; struct ibmad_port *srcport; }; struct sa_query_result { uint32_t status; unsigned result_cnt; void *p_result_madw; }; /* NOTE: umad_init must be called prior to sa_get_handle */ struct sa_handle *sa_get_handle(char *ca_name); void sa_free_handle(struct sa_handle * h); int sa_query(struct sa_handle *h, uint8_t method, uint16_t attr, uint32_t mod, uint64_t comp_mask, uint64_t sm_key, void *data, size_t datasz, struct sa_query_result *result); void sa_free_result_mad(struct sa_query_result *result); void *sa_get_query_rec(void *mad, unsigned i); void sa_report_err(int status); /* Macros for setting query values and ComponentMasks */ static inline uint8_t htobe8(uint8_t val) { return val; } #define CHECK_AND_SET_VAL(val, size, comp_with, target, name, mask) \ if ((int##size##_t) val != (int##size##_t) comp_with) { \ target = htobe##size((uint##size##_t) val); \ comp_mask |= IB_##name##_COMPMASK_##mask; \ } #define CHECK_AND_SET_GID(val, target, name, mask) \ if (valid_gid(&(val))) { \ memcpy(&(target), &(val), sizeof(val)); \ comp_mask |= IB_##name##_COMPMASK_##mask; \ } #define CHECK_AND_SET_VAL_AND_SEL(val, target, name, mask, sel) \ if (val) { \ target = val; \ comp_mask |= IB_##name##_COMPMASK_##mask##sel; \ comp_mask |= IB_##name##_COMPMASK_##mask; \ } #endif /* _IBDIAG_SA_H_ */ rdma-core-56.1/infiniband-diags/iblinkinfo.c000066400000000000000000000517521477342711600210220ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2007 Xsigo Systems Inc. All rights reserved. * Copyright (c) 2008 Lawrence Livermore National Lab. All rights reserved. * Copyright (c) 2010,2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include "ibdiag_common.h" #define DIFF_FLAG_PORT_CONNECTION 0x01 #define DIFF_FLAG_PORT_STATE 0x02 #define DIFF_FLAG_LID 0x04 #define DIFF_FLAG_NODE_DESCRIPTION 0x08 #define DIFF_FLAG_DEFAULT (DIFF_FLAG_PORT_CONNECTION | DIFF_FLAG_PORT_STATE) static char *node_name_map_file = NULL; static nn_map_t *node_name_map = NULL; static char *load_cache_file = NULL; static char *diff_cache_file = NULL; static unsigned diffcheck_flags = DIFF_FLAG_DEFAULT; static char *filterdownports_cache_file = NULL; static ibnd_fabric_t *filterdownports_fabric = NULL; static struct { uint64_t guid; char *guid_str; } node_label; static char *dr_path = NULL; static int all = 0; static int down_links_only = 0; static int line_mode = 0; static int add_sw_settings = 0; static int only_flag = 0; static int only_type = 0; static int filterdownport_check(ibnd_node_t *node, ibnd_port_t *port) { ibnd_node_t *fsw; ibnd_port_t *fport; int fistate; fsw = ibnd_find_node_guid(filterdownports_fabric, node->guid); if (!fsw) return 0; if (port->portnum > fsw->numports) return 0; fport = fsw->ports[port->portnum]; if (!fport) return 0; fistate = mad_get_field(fport->info, 0, IB_PORT_STATE_F); return (fistate == IB_LINK_DOWN) ? 1 : 0; } static void print_port(ibnd_node_t *node, ibnd_port_t *port, const char *out_prefix) { char width[64], speed[64], state[64], physstate[64]; char remote_guid_str[256]; char remote_str[256]; char link_str[256]; char width_msg[256]; char speed_msg[256]; char ext_port_str[256]; int iwidth, ispeed, fdr10, espeed, istate, iphystate; int n = 0; uint8_t *info = NULL; int rc; if (!port) return; iwidth = mad_get_field(port->info, 0, IB_PORT_LINK_WIDTH_ACTIVE_F); ispeed = mad_get_field(port->info, 0, IB_PORT_LINK_SPEED_ACTIVE_F); fdr10 = mad_get_field(port->ext_info, 0, IB_MLNX_EXT_PORT_LINK_SPEED_ACTIVE_F) & FDR10; if (port->node->type == IB_NODE_SWITCH) { if (port->node->ports[0]) info = (uint8_t *)&port->node->ports[0]->info; } else info = (uint8_t *)&port->info; if (info) { espeed = ibnd_get_agg_linkspeedext(info, port->info); } else { ispeed = 0; iwidth = 0; espeed = 0; } istate = mad_get_field(port->info, 0, IB_PORT_STATE_F); iphystate = mad_get_field(port->info, 0, IB_PORT_PHYS_STATE_F); remote_guid_str[0] = '\0'; remote_str[0] = '\0'; link_str[0] = '\0'; width_msg[0] = '\0'; speed_msg[0] = '\0'; if (istate == IB_LINK_DOWN && filterdownports_fabric && filterdownport_check(node, port)) return; /* C14-24.2.1 states that a down port allows for invalid data to be * returned for all PortInfo components except PortState and * PortPhysicalState */ if (istate != IB_LINK_DOWN) { if (!espeed) { if (fdr10) sprintf(speed, "10.0 Gbps (FDR10)"); else mad_dump_val(IB_PORT_LINK_SPEED_ACTIVE_F, speed, 64, &ispeed); } else ibnd_dump_agg_linkspeedext(speed, 64, espeed); n = snprintf(link_str, 256, "(%3s %18s %6s/%8s)", mad_dump_val(IB_PORT_LINK_WIDTH_ACTIVE_F, width, 64, &iwidth), speed, mad_dump_val(IB_PORT_STATE_F, state, 64, &istate), mad_dump_val(IB_PORT_PHYS_STATE_F, physstate, 64, &iphystate)); } else { n = snprintf(link_str, 256, "( %6s/%8s)", mad_dump_val(IB_PORT_STATE_F, state, 64, &istate), mad_dump_val(IB_PORT_PHYS_STATE_F, physstate, 64, &iphystate)); } /* again default values due to C14-24.2.1 */ if (add_sw_settings && istate != IB_LINK_DOWN) { snprintf(link_str + n, 256 - n, " (HOQ:%d VL_Stall:%d)", mad_get_field(port->info, 0, IB_PORT_HOQ_LIFE_F), mad_get_field(port->info, 0, 
IB_PORT_VL_STALL_COUNT_F)); } if (port->remoteport) { char *remap = remap_node_name(node_name_map, port->remoteport->node->guid, port->remoteport->node->nodedesc); if (port->remoteport->ext_portnum) snprintf(ext_port_str, 256, "%d", port->remoteport->ext_portnum); else ext_port_str[0] = '\0'; get_max_msg(width_msg, speed_msg, 256, port); if (line_mode) { snprintf(remote_guid_str, 256, "0x%016" PRIx64 " ", port->remoteport->guid); } rc = snprintf(remote_str, sizeof(remote_str), "%s%6d %4d[%2s] \"%s\" (%s %s)\n", remote_guid_str, port->remoteport->base_lid ? port->remoteport->base_lid : port->remoteport->node->smalid, port->remoteport->portnum, ext_port_str, remap, width_msg, speed_msg); if (rc > sizeof(remote_str)) fprintf(stderr, "WARN: string buffer overflow\n"); free(remap); } else { if (istate == IB_LINK_DOWN) snprintf(remote_str, 256, " [ ] \"\" ( )\n"); else snprintf(remote_str, 256, " \"Port not available\"\n"); } if (port->ext_portnum) snprintf(ext_port_str, 256, "%d", port->ext_portnum); else ext_port_str[0] = '\0'; if (line_mode) { char *remap = remap_node_name(node_name_map, node->guid, node->nodedesc); printf("%s0x%016" PRIx64 " \"%30s\" ", out_prefix ? out_prefix : "", port->guid, remap); free(remap); } else printf("%s ", out_prefix ? out_prefix : ""); if (port->node->type != IB_NODE_SWITCH) { if (!line_mode) printf("0x%016" PRIx64 " ", port->guid); printf("%6d %4d[%2s] ==%s==> %s", port->base_lid, port->portnum, ext_port_str, link_str, remote_str); } else printf("%6d %4d[%2s] ==%s==> %s", node->smalid, port->portnum, ext_port_str, link_str, remote_str); } static inline const char *nodetype_str(ibnd_node_t * node) { switch (node->type) { case IB_NODE_SWITCH: return "Switch"; case IB_NODE_CA: return "CA"; case IB_NODE_ROUTER: return "Router"; } return "??"; } static void print_node_header(ibnd_node_t *node, int *out_header_flag, const char *out_prefix) { uint64_t guid = 0; if ((!out_header_flag || !(*out_header_flag)) && !line_mode) { char *remap = remap_node_name(node_name_map, node->guid, node->nodedesc); if (node->type == IB_NODE_SWITCH) { if (node->ports[0]) guid = node->ports[0]->guid; else guid = mad_get_field64(node->info, 0, IB_NODE_PORT_GUID_F); printf("%s%s: 0x%016" PRIx64 " %s:\n", out_prefix ? out_prefix : "", nodetype_str(node), guid, remap); } else printf("%s%s: %s:\n", out_prefix ? 
out_prefix : "", nodetype_str(node), remap); (*out_header_flag)++; free(remap); } } static void print_node(ibnd_node_t *node, void *user_data) { int i = 0; int head_print = 0; char *out_prefix = (char *)user_data; for (i = 1; i <= node->numports; i++) { ibnd_port_t *port = node->ports[i]; if (!port) continue; if (!down_links_only || mad_get_field(port->info, 0, IB_PORT_STATE_F) == IB_LINK_DOWN) { print_node_header(node, &head_print, out_prefix); print_port(node, port, out_prefix); } } } struct iter_diff_data { uint32_t diff_flags; ibnd_fabric_t *fabric1; ibnd_fabric_t *fabric2; const char *fabric1_prefix; const char *fabric2_prefix; }; static void diff_node_ports(ibnd_node_t *fabric1_node, ibnd_node_t *fabric2_node, int *head_print, struct iter_diff_data *data) { int i = 0; for (i = 1; i <= fabric1_node->numports; i++) { ibnd_port_t *fabric1_port, *fabric2_port; int output_diff = 0; fabric1_port = fabric1_node->ports[i]; fabric2_port = fabric2_node->ports[i]; if (!fabric1_port && !fabric2_port) continue; if (data->diff_flags & DIFF_FLAG_PORT_CONNECTION) { if ((fabric1_port && !fabric2_port) || (!fabric1_port && fabric2_port) || (fabric1_port->remoteport && !fabric2_port->remoteport) || (!fabric1_port->remoteport && fabric2_port->remoteport) || (fabric1_port->remoteport && fabric2_port->remoteport && fabric1_port->remoteport->guid != fabric2_port->remoteport->guid)) output_diff++; } /* if either fabric1_port or fabric2_port NULL, should be * handled by port connection diff code */ if (data->diff_flags & DIFF_FLAG_PORT_STATE && fabric1_port && fabric2_port) { int state1, state2; state1 = mad_get_field(fabric1_port->info, 0, IB_PORT_STATE_F); state2 = mad_get_field(fabric2_port->info, 0, IB_PORT_STATE_F); if (state1 != state2) output_diff++; } if (data->diff_flags & DIFF_FLAG_PORT_CONNECTION && data->diff_flags & DIFF_FLAG_LID && fabric1_port && fabric2_port && fabric1_port->remoteport && fabric2_port->remoteport && fabric1_port->remoteport->base_lid != fabric2_port->remoteport->base_lid) output_diff++; if (data->diff_flags & DIFF_FLAG_PORT_CONNECTION && data->diff_flags & DIFF_FLAG_NODE_DESCRIPTION && fabric1_port && fabric2_port && fabric1_port->remoteport && fabric2_port->remoteport && memcmp(fabric1_port->remoteport->node->nodedesc, fabric2_port->remoteport->node->nodedesc, IB_SMP_DATA_SIZE)) output_diff++; if (output_diff && fabric1_port) { print_node_header(fabric1_node, head_print, NULL); print_port(fabric1_node, fabric1_port, data->fabric1_prefix); } if (output_diff && fabric2_port) { print_node_header(fabric1_node, head_print, NULL); print_port(fabric2_node, fabric2_port, data->fabric2_prefix); } } } static void diff_node_iter(ibnd_node_t *fabric1_node, void *iter_user_data) { struct iter_diff_data *data = iter_user_data; ibnd_node_t *fabric2_node; int head_print = 0; DEBUG("DEBUG: fabric1_node %p\n", fabric1_node); fabric2_node = ibnd_find_node_guid(data->fabric2, fabric1_node->guid); if (!fabric2_node) print_node(fabric1_node, (void *)data->fabric1_prefix); else if (data->diff_flags & (DIFF_FLAG_PORT_CONNECTION | DIFF_FLAG_PORT_STATE | DIFF_FLAG_LID | DIFF_FLAG_NODE_DESCRIPTION)) { if ((fabric1_node->type == IB_NODE_SWITCH && data->diff_flags & DIFF_FLAG_LID && fabric1_node->smalid != fabric2_node->smalid) || (data->diff_flags & DIFF_FLAG_NODE_DESCRIPTION && memcmp(fabric1_node->nodedesc, fabric2_node->nodedesc, IB_SMP_DATA_SIZE))) { print_node_header(fabric1_node, NULL, data->fabric1_prefix); print_node_header(fabric2_node, NULL, data->fabric2_prefix); head_print++; } if 
(fabric1_node->numports != fabric2_node->numports) { print_node_header(fabric1_node, &head_print, NULL); printf("%snumports = %d\n", data->fabric1_prefix, fabric1_node->numports); printf("%snumports = %d\n", data->fabric2_prefix, fabric2_node->numports); return; } diff_node_ports(fabric1_node, fabric2_node, &head_print, data); } } static int diff_node(ibnd_node_t *node, ibnd_fabric_t *orig_fabric, ibnd_fabric_t *new_fabric) { struct iter_diff_data iter_diff_data; iter_diff_data.diff_flags = diffcheck_flags; iter_diff_data.fabric1 = orig_fabric; iter_diff_data.fabric2 = new_fabric; iter_diff_data.fabric1_prefix = "< "; iter_diff_data.fabric2_prefix = "> "; if (node) diff_node_iter(node, &iter_diff_data); else { if (only_flag) ibnd_iter_nodes_type(orig_fabric, diff_node_iter, only_type, &iter_diff_data); else ibnd_iter_nodes(orig_fabric, diff_node_iter, &iter_diff_data); } /* Do opposite diff to find existence of node types * in new_fabric but not in orig_fabric. * * In this diff, we don't need to check port connections, * port state, lids, or node descriptions since it has already * been done (i.e. checks are only done when guid exists on both * orig and new). */ iter_diff_data.diff_flags = diffcheck_flags & ~DIFF_FLAG_PORT_CONNECTION; iter_diff_data.diff_flags &= ~DIFF_FLAG_PORT_STATE; iter_diff_data.diff_flags &= ~DIFF_FLAG_LID; iter_diff_data.diff_flags &= ~DIFF_FLAG_NODE_DESCRIPTION; iter_diff_data.fabric1 = new_fabric; iter_diff_data.fabric2 = orig_fabric; iter_diff_data.fabric1_prefix = "> "; iter_diff_data.fabric2_prefix = "< "; if (node) diff_node_iter(node, &iter_diff_data); else { if (only_flag) ibnd_iter_nodes_type(new_fabric, diff_node_iter, only_type, &iter_diff_data); else ibnd_iter_nodes(new_fabric, diff_node_iter, &iter_diff_data); } return 0; } static int process_opt(void *context, int ch) { struct ibnd_config *cfg = context; char *p; switch (ch) { case 1: node_name_map_file = strdup(optarg); if (node_name_map_file == NULL) IBEXIT("out of memory, strdup for node_name_map_file name failed"); break; case 2: load_cache_file = strdup(optarg); break; case 3: diff_cache_file = strdup(optarg); break; case 4: diffcheck_flags = 0; p = strtok(optarg, ","); while (p) { if (!strcasecmp(p, "port")) diffcheck_flags |= DIFF_FLAG_PORT_CONNECTION; else if (!strcasecmp(p, "state")) diffcheck_flags |= DIFF_FLAG_PORT_STATE; else if (!strcasecmp(p, "lid")) diffcheck_flags |= DIFF_FLAG_LID; else if (!strcasecmp(p, "nodedesc")) diffcheck_flags |= DIFF_FLAG_NODE_DESCRIPTION; else { fprintf(stderr, "invalid diff check key: %s\n", p); return -1; } p = strtok(NULL, ","); } break; case 5: filterdownports_cache_file = strdup(optarg); break; case 6: only_flag = 1; only_type = IB_NODE_SWITCH; break; case 7: only_flag = 1; only_type = IB_NODE_CA; break; case 'S': case 'G': node_label.guid_str = optarg; node_label.guid = (uint64_t)strtoull(node_label.guid_str, NULL, 0); break; case 'D': dr_path = strdup(optarg); break; case 'a': all = 1; break; case 'n': cfg->max_hops = strtoul(optarg, NULL, 0); break; case 'd': down_links_only = 1; break; case 'l': line_mode = 1; break; case 'p': add_sw_settings = 1; break; case 'R': /* nop */ break; case 'o': cfg->max_smps = strtoul(optarg, NULL, 0); break; default: return -1; } return 0; } int main(int argc, char **argv) { struct ibnd_config config = { 0 }; int rc = 0; int resolved = -1; ibnd_fabric_t *fabric = NULL; ibnd_fabric_t *diff_fabric = NULL; struct ibmad_port *ibmad_port; struct ibmad_ports_pair *ibmad_ports; ib_portid_t port_id = { 0 }; uint8_t 
ni[IB_SMP_DATA_SIZE] = { 0 }; int mgmt_classes[3] = { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS, IB_SA_CLASS }; const struct ibdiag_opt opts[] = { {"node-name-map", 1, 1, "", "node name map file"}, {"switch", 'S', 1, "", "start partial scan at the port specified by (hex format)"}, {"port-guid", 'G', 1, "", "(same as -S)"}, {"Direct", 'D', 1, "", "start partial scan at the port specified by "}, {"all", 'a', 0, NULL, "print all nodes found in a partial fabric scan"}, {"hops", 'n', 1, "", "Number of hops to include away from specified node"}, {"down", 'd', 0, NULL, "print only down links"}, {"line", 'l', 0, NULL, "(line mode) print all information for each link on a single line"}, {"additional", 'p', 0, NULL, "print additional port settings (PktLifeTime, HoqLife, VLStallCount)"}, {"load-cache", 2, 1, "", "filename of ibnetdiscover cache to load"}, {"diff", 3, 1, "", "filename of ibnetdiscover cache to diff"}, {"diffcheck", 4, 1, "", "specify checks to execute for --diff"}, {"filterdownports", 5, 1, "", "filename of ibnetdiscover cache to filter downports"}, {"outstanding_smps", 'o', 1, NULL, "specify the number of outstanding SMP's which should be " "issued during the scan"}, {"switches-only", 6, 0, NULL, "Output only switches"}, {"cas-only", 7, 0, NULL, "Output only CAs"}, {} }; char usage_args[] = ""; ibdiag_process_opts(argc, argv, &config, "aDdGgKLlnpRS", opts, process_opt, usage_args, NULL); argc -= optind; argv += optind; ibmad_ports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 3, 1); if (!ibmad_ports) { fprintf(stderr, "Failed to open %s port %d\n", ibd_ca, ibd_ca_port); exit(1); } ibmad_port = ibmad_ports->smi.port; if (!ibmad_port) { fprintf(stderr, "Failed to open %s port %d\n", ibd_ca, ibd_ca_port); exit(1); } smp_mkey_set(ibmad_port, ibd_mkey); if (ibd_timeout) { mad_rpc_set_timeout(ibmad_port, ibd_timeout); config.timeout_ms = ibd_timeout; } config.flags = ibd_ibnetdisc_flags; config.mkey = ibd_mkey; node_name_map = open_node_name_map(node_name_map_file); if (dr_path && load_cache_file) { mad_rpc_close_port2(ibmad_ports); fprintf(stderr, "Cannot specify cache and direct route path\n"); exit(1); } if (dr_path) { /* only scan part of the fabric */ if ((resolved = resolve_portid_str(ibmad_ports->gsi.ca_name, ibd_ca_port, &port_id, dr_path, IB_DEST_DRPATH, NULL, ibmad_ports->gsi.port)) < 0) IBWARN("Failed to resolve %s; attempting full scan", dr_path); } else if (node_label.guid_str) { if ((resolved = resolve_portid_str( ibmad_ports->gsi.ca_name, ibd_ca_port, &port_id, node_label.guid_str, IB_DEST_GUID, NULL, ibmad_ports->gsi.port)) < 0) IBWARN("Failed to resolve %s; attempting full scan\n", node_label.guid_str); } if (!all && dr_path) { if (!smp_query_via(ni, &port_id, IB_ATTR_NODE_INFO, 0, ibd_timeout, ibmad_port)){ mad_rpc_close_port2(ibmad_ports); fprintf(stderr, "Failed to get local Node Info\n"); exit(1); } } mad_rpc_close_port2(ibmad_ports); if (diff_cache_file && !(diff_fabric = ibnd_load_fabric(diff_cache_file, 0))) IBEXIT("loading cached fabric for diff failed\n"); if (filterdownports_cache_file && !(filterdownports_fabric = ibnd_load_fabric(filterdownports_cache_file, 0))) IBEXIT("loading cached fabric for filterdownports failed\n"); if (load_cache_file) { if ((fabric = ibnd_load_fabric(load_cache_file, 0)) == NULL) { fprintf(stderr, "loading cached fabric failed\n"); exit(1); } } else { if (resolved >= 0) { if (!config.max_hops) config.max_hops = 1; if (!(fabric = ibnd_discover_fabric(ibd_ca, ibd_ca_port, &port_id, &config))) IBWARN("Partial fabric scan failed;" " 
attempting full scan\n"); } if (!fabric && !(fabric = ibnd_discover_fabric(ibd_ca, ibd_ca_port, NULL, &config))) { fprintf(stderr, "discover failed\n"); rc = 1; goto close_port; } } if (!all && node_label.guid_str) { ibnd_port_t *p = ibnd_find_port_guid(fabric, node_label.guid); if (p && (!only_flag || p->node->type == only_type)) { ibnd_node_t *n = p->node; if (diff_fabric) diff_node(n, diff_fabric, fabric); else print_node(n, NULL); } else fprintf(stderr, "Failed to find port: %s\n", node_label.guid_str); } else if (!all && dr_path) { ibnd_port_t *p = NULL; mad_decode_field(ni, IB_NODE_PORT_GUID_F, &node_label.guid); p = ibnd_find_port_guid(fabric, node_label.guid); if (p && (!only_flag || p->node->type == only_type)) { ibnd_node_t *n = p->node; if (diff_fabric) diff_node(n, diff_fabric, fabric); else print_node(n, NULL); } else fprintf(stderr, "Failed to find port: %s\n", dr_path); } else { if (diff_fabric) diff_node(NULL, diff_fabric, fabric); else { if (only_flag) ibnd_iter_nodes_type(fabric, print_node, only_type, NULL); else ibnd_iter_nodes(fabric, print_node, NULL); } } ibnd_destroy_fabric(fabric); if (diff_fabric) ibnd_destroy_fabric(diff_fabric); close_port: close_node_name_map(node_name_map); exit(rc); } rdma-core-56.1/infiniband-diags/ibnetdiscover.c000066400000000000000000000747131477342711600215400ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2007 Xsigo Systems Inc. All rights reserved. * Copyright (c) 2008 Lawrence Livermore National Lab. All rights reserved. * Copyright (c) 2010-2020 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include "ibdiag_common.h" #define LIST_CA_NODE (1 << IB_NODE_CA) #define LIST_SWITCH_NODE (1 << IB_NODE_SWITCH) #define LIST_ROUTER_NODE (1 << IB_NODE_ROUTER) #define DIFF_FLAG_SWITCH 0x01 #define DIFF_FLAG_CA 0x02 #define DIFF_FLAG_ROUTER 0x04 #define DIFF_FLAG_PORT_CONNECTION 0x08 #define DIFF_FLAG_LID 0x10 #define DIFF_FLAG_NODE_DESCRIPTION 0x20 #define DIFF_FLAG_DEFAULT (DIFF_FLAG_SWITCH | DIFF_FLAG_CA | DIFF_FLAG_ROUTER \ | DIFF_FLAG_PORT_CONNECTION) static FILE *f; static char *node_name_map_file = NULL; static nn_map_t *node_name_map = NULL; static char *cache_file = NULL; static char *load_cache_file = NULL; static char *diff_cache_file = NULL; static unsigned diffcheck_flags = DIFF_FLAG_DEFAULT; static int report_max_hops = 0; static int full_info; /** * Define our own conversion functions to maintain compatibility with the old * ibnetdiscover which did not use the ibmad conversion functions. */ static const char *dump_linkspeed_compat(uint32_t speed) { switch (speed) { case 1: return ("SDR"); break; case 2: return ("DDR"); break; case 4: return ("QDR"); break; } return ("???"); } static const char *dump_linkspeedext_compat(uint32_t espeed, uint32_t speed, uint32_t fdr10) { switch (espeed) { case 0: if (fdr10 & FDR10) return ("FDR10"); else return dump_linkspeed_compat(speed); break; case 1: return ("FDR"); break; case 2: return ("EDR"); break; case 4: return ("HDR"); break; case 8: return ("NDR"); /* case 16: non used value */ case 32: return ("XDR"); } return ("???"); } static const char *dump_linkwidth_compat(uint32_t width) { switch (width) { case 1: return ("1x"); break; case 2: return ("4x"); break; case 4: return ("8x"); break; case 8: return ("12x"); break; case 16: return ("2x"); break; } return ("??"); } static inline const char *ports_nt_str_compat(ibnd_node_t * node) { switch (node->type) { case IB_NODE_SWITCH: return "SW"; case IB_NODE_CA: return "CA"; case IB_NODE_ROUTER: return "RT"; } return "??"; } static char *node_name(ibnd_node_t * node) { static char buf[256]; switch (node->type) { case IB_NODE_SWITCH: sprintf(buf, "\"%s", "S"); break; case IB_NODE_CA: sprintf(buf, "\"%s", "H"); break; case IB_NODE_ROUTER: sprintf(buf, "\"%s", "R"); break; default: sprintf(buf, "\"%s", "?"); break; } sprintf(buf + 2, "-%016" PRIx64 "\"", node->guid); return buf; } static void list_node(ibnd_node_t *node, void *user_data) { const char *node_type; char *nodename = remap_node_name(node_name_map, node->guid, node->nodedesc); switch (node->type) { case IB_NODE_SWITCH: node_type = "Switch"; break; case IB_NODE_CA: node_type = "Ca"; break; case IB_NODE_ROUTER: node_type = "Router"; break; default: node_type = "???"; break; } fprintf(f, "%s\t : 0x%016" PRIx64 " ports %d devid 0x%x vendid 0x%x \"%s\"\n", node_type, node->guid, node->numports, mad_get_field(node->info, 0, IB_NODE_DEVID_F), mad_get_field(node->info, 0, IB_NODE_VENDORID_F), nodename); free(nodename); } static void list_nodes(ibnd_fabric_t *fabric, int list) { if (list & LIST_CA_NODE) ibnd_iter_nodes_type(fabric, list_node, IB_NODE_CA, NULL); if (list & LIST_SWITCH_NODE) ibnd_iter_nodes_type(fabric, list_node, IB_NODE_SWITCH, NULL); if (list & LIST_ROUTER_NODE) ibnd_iter_nodes_type(fabric, list_node, IB_NODE_ROUTER, NULL); } static void out_ids(ibnd_node_t *node, int group, char *chname, const char *out_prefix) { uint64_t sysimgguid = mad_get_field64(node->info, 0, IB_NODE_SYSTEM_GUID_F); fprintf(f, 
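/* The *_compat dumpers above keep the legacy ibnetdiscover strings:
 * speed 1/2/4 maps to SDR/DDR/QDR, extended speed 1/2/4/8/32 maps to
 * FDR/EDR/HDR/NDR/XDR, and an extended speed of 0 falls back to FDR10
 * when the Mellanox FDR10 bit is active. Call sites select the dumper
 * with an expression like the sketch below (s, es and f10 standing for
 * the decoded active speed, extended speed and FDR10 fields); note that
 * QDR ports (s == 4) are still routed through the extended dumper so an
 * active FDR10 override is honored:
 *
 *	const char *str = (s != 4 && !es)
 *		? dump_linkspeed_compat(s)
 *		: dump_linkspeedext_compat(es, s, f10);
 */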
"\n%svendid=0x%x\n", out_prefix ? out_prefix : "", mad_get_field(node->info, 0, IB_NODE_VENDORID_F)); fprintf(f, "%sdevid=0x%x\n", out_prefix ? out_prefix : "", mad_get_field(node->info, 0, IB_NODE_DEVID_F)); if (sysimgguid) fprintf(f, "%ssysimgguid=0x%" PRIx64, out_prefix ? out_prefix : "", sysimgguid); if (group && node->chassis && node->chassis->chassisnum) { fprintf(f, "\t\t# Chassis %d", node->chassis->chassisnum); if (chname) fprintf(f, " (%s)", clean_nodedesc(chname)); if (ibnd_is_xsigo_tca(node->guid) && node->ports[1] && node->ports[1]->remoteport) fprintf(f, " slot %d", node->ports[1]->remoteport->portnum); } if (sysimgguid || (group && node->chassis && node->chassis->chassisnum)) fprintf(f, "\n"); } static uint64_t out_chassis(ibnd_fabric_t *fabric, unsigned char chassisnum) { uint64_t guid; fprintf(f, "\nChassis %u", chassisnum); guid = ibnd_get_chassis_guid(fabric, chassisnum); if (guid) fprintf(f, " (guid 0x%" PRIx64 ")", guid); fprintf(f, "\n"); return guid; } static void out_switch_detail(ibnd_node_t *node, const char *sw_prefix) { char *nodename = NULL; nodename = remap_node_name(node_name_map, node->guid, node->nodedesc); fprintf(f, "%sSwitch\t%d %s\t\t# \"%s\" %s port 0 lid %d lmc %d", sw_prefix ? sw_prefix : "", node->numports, node_name(node), nodename, node->smaenhsp0 ? "enhanced" : "base", node->smalid, node->smalmc); free(nodename); } static void out_switch(ibnd_node_t *node, int group, char *chname, const char *id_prefix, const char *sw_prefix) { const char *str; char str2[256]; out_ids(node, group, chname, id_prefix); fprintf(f, "%sswitchguid=0x%" PRIx64, id_prefix ? id_prefix : "", node->guid); fprintf(f, "(%" PRIx64 ")", mad_get_field64(node->info, 0, IB_NODE_PORT_GUID_F)); if (group) { fprintf(f, "\t# "); str = ibnd_get_chassis_type(node); if (str) fprintf(f, "%s ", str); str = ibnd_get_chassis_slot_str(node, str2, 256); if (str) fprintf(f, "%s", str); } fprintf(f, "\n"); out_switch_detail(node, sw_prefix); fprintf(f, "\n"); } static void out_ca_detail(ibnd_node_t *node, const char *ca_prefix) { const char *node_type; switch (node->type) { case IB_NODE_CA: node_type = "Ca"; break; case IB_NODE_ROUTER: node_type = "Rt"; break; default: node_type = "???"; break; } fprintf(f, "%s%s\t%d %s\t\t# \"%s\"", ca_prefix ? ca_prefix : "", node_type, node->numports, node_name(node), clean_nodedesc(node->nodedesc)); } static void out_ca(ibnd_node_t *node, int group, char *chname, const char *id_prefix, const char *ca_prefix) { const char *node_type; out_ids(node, group, chname, id_prefix); switch (node->type) { case IB_NODE_CA: node_type = "ca"; break; case IB_NODE_ROUTER: node_type = "rt"; break; default: node_type = "???"; break; } fprintf(f, "%s%sguid=0x%" PRIx64 "\n", id_prefix ? 
id_prefix : "", node_type, node->guid); out_ca_detail(node, ca_prefix); if (group && ibnd_is_xsigo_hca(node->guid)) fprintf(f, " (scp)"); fprintf(f, "\n"); } #define OUT_BUFFER_SIZE 16 static char *out_ext_port(ibnd_port_t * port, int group) { static char mapping[OUT_BUFFER_SIZE]; if (group && port->ext_portnum != 0) { snprintf(mapping, OUT_BUFFER_SIZE, "[ext %d]", port->ext_portnum); return (mapping); } return (NULL); } static void out_switch_port(ibnd_port_t *port, int group, const char *out_prefix) { char *ext_port_str = NULL; char *rem_nodename = NULL; uint32_t iwidth = mad_get_field(port->info, 0, IB_PORT_LINK_WIDTH_ACTIVE_F); uint32_t ispeed = mad_get_field(port->info, 0, IB_PORT_LINK_SPEED_ACTIVE_F); uint32_t vlcap = mad_get_field(port->info, 0, IB_PORT_VL_CAP_F); uint32_t fdr10 = mad_get_field(port->ext_info, 0, IB_MLNX_EXT_PORT_LINK_SPEED_ACTIVE_F); uint32_t espeed; DEBUG("port %p:%d remoteport %p\n", port, port->portnum, port->remoteport); fprintf(f, "%s[%d]", out_prefix ? out_prefix : "", port->portnum); ext_port_str = out_ext_port(port, group); if (ext_port_str) fprintf(f, "%s", ext_port_str); rem_nodename = remap_node_name(node_name_map, port->remoteport->node->guid, port->remoteport->node->nodedesc); ext_port_str = out_ext_port(port->remoteport, group); if (!port->node->ports[0]) { ispeed = 0; espeed = 0; } else espeed = ibnd_get_agg_linkspeedext(port->node->ports[0]->info, port->info); fprintf(f, "\t%s[%d]%s", node_name(port->remoteport->node), port->remoteport->portnum, ext_port_str ? ext_port_str : ""); if (port->remoteport->node->type != IB_NODE_SWITCH) fprintf(f, "(%" PRIx64 ") ", port->remoteport->guid); fprintf(f, "\t\t# \"%s\" lid %d %s%s", rem_nodename, port->remoteport->node->type == IB_NODE_SWITCH ? port->remoteport->node->smalid : port->remoteport->base_lid, dump_linkwidth_compat(iwidth), (ispeed != 4 && !espeed) ? dump_linkspeed_compat(ispeed) : dump_linkspeedext_compat(espeed, ispeed, fdr10)); if (full_info) fprintf(f, " s=%d w=%d v=%d", ispeed, iwidth, vlcap); if (ibnd_is_xsigo_tca(port->remoteport->guid)) fprintf(f, " slot %d", port->portnum); else if (ibnd_is_xsigo_hca(port->remoteport->guid)) fprintf(f, " (scp)"); fprintf(f, "\n"); free(rem_nodename); } static void out_ca_port(ibnd_port_t *port, int group, const char *out_prefix) { char *str = NULL; char *rem_nodename = NULL; uint32_t iwidth = mad_get_field(port->info, 0, IB_PORT_LINK_WIDTH_ACTIVE_F); uint32_t ispeed = mad_get_field(port->info, 0, IB_PORT_LINK_SPEED_ACTIVE_F); uint32_t vlcap = mad_get_field(port->info, 0, IB_PORT_VL_CAP_F); uint32_t fdr10 = mad_get_field(port->ext_info, 0, IB_MLNX_EXT_PORT_LINK_SPEED_ACTIVE_F); uint32_t espeed; fprintf(f, "%s[%d]", out_prefix ? out_prefix : "", port->portnum); if (port->node->type != IB_NODE_SWITCH) fprintf(f, "(%" PRIx64 ") ", port->guid); fprintf(f, "\t%s[%d]", node_name(port->remoteport->node), port->remoteport->portnum); str = out_ext_port(port->remoteport, group); if (str) fprintf(f, "%s", str); if (port->remoteport->node->type != IB_NODE_SWITCH) fprintf(f, " (%" PRIx64 ") ", port->remoteport->guid); rem_nodename = remap_node_name(node_name_map, port->remoteport->node->guid, port->remoteport->node->nodedesc); espeed = ibnd_get_agg_linkspeedext(port->info, port->info); fprintf(f, "\t\t# lid %d lmc %d \"%s\" lid %d %s%s", port->base_lid, port->lmc, rem_nodename, port->remoteport->node->type == IB_NODE_SWITCH ? port->remoteport->node->smalid : port->remoteport->base_lid, dump_linkwidth_compat(iwidth), (ispeed != 4 && !espeed) ? 
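/* out_switch_port() above aggregates ports[0]->info with the external
 * port's own info before trusting any extended-speed field, presumably
 * because a switch advertises its capability masks on the management
 * port. A guarded sketch of that pattern, using the libibnetdisc helper
 * already called above:
 *
 *	uint32_t espeed = 0;
 *	ibnd_port_t *p0 = port->node->ports[0];
 *
 *	if (p0)
 *		espeed = ibnd_get_agg_linkspeedext(p0->info, port->info);
 */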
dump_linkspeed_compat(ispeed) : dump_linkspeedext_compat(espeed, ispeed, fdr10)); if (full_info) fprintf(f, " s=%d w=%d v=%d", ispeed, iwidth, vlcap); fprintf(f, "\n"); free(rem_nodename); } struct iter_user_data { int group; int skip_chassis_nodes; }; static void switch_iter_func(ibnd_node_t * node, void *iter_user_data) { ibnd_port_t *port; int p = 0; struct iter_user_data *data = (struct iter_user_data *)iter_user_data; DEBUG("SWITCH: node %p\n", node); /* skip chassis based switches if flagged */ if (data->skip_chassis_nodes && node->chassis && node->chassis->chassisnum) return; out_switch(node, data->group, NULL, NULL, NULL); for (p = 1; p <= node->numports; p++) { port = node->ports[p]; if (port && port->remoteport) out_switch_port(port, data->group, NULL); } } static void ca_iter_func(ibnd_node_t * node, void *iter_user_data) { ibnd_port_t *port; int p = 0; struct iter_user_data *data = (struct iter_user_data *)iter_user_data; DEBUG("CA: node %p\n", node); /* Now, skip chassis based CAs */ if (data->group && node->chassis && node->chassis->chassisnum) return; out_ca(node, data->group, NULL, NULL, NULL); for (p = 1; p <= node->numports; p++) { port = node->ports[p]; if (port && port->remoteport) out_ca_port(port, data->group, NULL); } } static void router_iter_func(ibnd_node_t * node, void *iter_user_data) { ibnd_port_t *port; int p = 0; struct iter_user_data *data = (struct iter_user_data *)iter_user_data; DEBUG("RT: node %p\n", node); /* Now, skip chassis based RTs */ if (data->group && node->chassis && node->chassis->chassisnum) return; out_ca(node, data->group, NULL, NULL, NULL); for (p = 1; p <= node->numports; p++) { port = node->ports[p]; if (port && port->remoteport) out_ca_port(port, data->group, NULL); } } static int dump_topology(int group, ibnd_fabric_t *fabric) { ibnd_node_t *node; ibnd_port_t *port; int i = 0, p = 0; time_t t = time(NULL); uint64_t chguid; char *chname = NULL; struct iter_user_data iter_user_data; fprintf(f, "#\n# Topology file: generated on %s#\n", ctime(&t)); if (report_max_hops) fprintf(f, "# Reported max hops discovered: %u\n" "# Total MADs used: %u\n", fabric->maxhops_discovered, fabric->total_mads_used); fprintf(f, "# Initiated from node %016" PRIx64 " port %016" PRIx64 "\n", fabric->from_node->guid, mad_get_field64(fabric->from_node->info, 0, IB_NODE_PORT_GUID_F)); /* Make pass on switches */ if (group) { ibnd_chassis_t *ch = NULL; /* Chassis based switches first */ for (ch = fabric->chassis; ch; ch = ch->next) { int n = 0; if (!ch->chassisnum) continue; chguid = out_chassis(fabric, ch->chassisnum); chname = NULL; if (ibnd_is_xsigo_guid(chguid)) { for (node = ch->nodes; node; node = node->next_chassis_node) { if (ibnd_is_xsigo_hca(node->guid)) { chname = node->nodedesc; fprintf(f, "Hostname: %s\n", clean_nodedesc (node->nodedesc)); } } } fprintf(f, "\n# Spine Nodes"); for (n = 1; n <= SPINES_MAX_NUM; n++) { if (ch->spinenode[n]) { out_switch(ch->spinenode[n], group, chname, NULL, NULL); for (p = 1; p <= ch->spinenode[n]->numports; p++) { port = ch->spinenode[n]->ports[p]; if (port && port->remoteport) out_switch_port(port, group, NULL); } } } fprintf(f, "\n# Line Nodes"); for (n = 1; n <= LINES_MAX_NUM; n++) { if (ch->linenode[n]) { out_switch(ch->linenode[n], group, chname, NULL, NULL); for (p = 1; p <= ch->linenode[n]->numports; p++) { port = ch->linenode[n]->ports[p]; if (port && port->remoteport) out_switch_port(port, group, NULL); } } } fprintf(f, "\n# Chassis Switches"); for (node = ch->nodes; node; node = node->next_chassis_node) { if 
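/* With grouping enabled, dump_topology() above visits each chassis in
 * four passes (spine nodes, line nodes, remaining chassis switches,
 * then chassis CAs) before the plain per-type iterators run. A minimal
 * sketch of the chassis walk, with handle_node standing in as a
 * hypothetical per-node callback:
 *
 *	ibnd_chassis_t *ch;
 *	ibnd_node_t *n;
 *
 *	for (ch = fabric->chassis; ch; ch = ch->next) {
 *		if (!ch->chassisnum)
 *			continue;
 *		for (n = ch->nodes; n; n = n->next_chassis_node)
 *			handle_node(n);
 *	}
 */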
(node->type == IB_NODE_SWITCH) { out_switch(node, group, chname, NULL, NULL); for (p = 1; p <= node->numports; p++) { port = node->ports[p]; if (port && port->remoteport) out_switch_port(port, group, NULL); } } } fprintf(f, "\n# Chassis CAs"); for (node = ch->nodes; node; node = node->next_chassis_node) { if (node->type == IB_NODE_CA) { out_ca(node, group, chname, NULL, NULL); for (p = 1; p <= node->numports; p++) { port = node->ports[p]; if (port && port->remoteport) out_ca_port(port, group, NULL); } } } } } else { /* !group */ iter_user_data.group = group; iter_user_data.skip_chassis_nodes = 0; ibnd_iter_nodes_type(fabric, switch_iter_func, IB_NODE_SWITCH, &iter_user_data); } chname = NULL; if (group) { iter_user_data.group = group; iter_user_data.skip_chassis_nodes = 1; fprintf(f, "\nNon-Chassis Nodes\n"); ibnd_iter_nodes_type(fabric, switch_iter_func, IB_NODE_SWITCH, &iter_user_data); } iter_user_data.group = group; iter_user_data.skip_chassis_nodes = 0; /* Make pass on CAs */ ibnd_iter_nodes_type(fabric, ca_iter_func, IB_NODE_CA, &iter_user_data); /* Make pass on routers */ ibnd_iter_nodes_type(fabric, router_iter_func, IB_NODE_ROUTER, &iter_user_data); return i; } static void dump_ports_report(ibnd_node_t *node, void *user_data) { int p = 0; ibnd_port_t *port = NULL; char *nodename = NULL; char *rem_nodename = NULL; uint32_t espeed; /* for each port */ for (p = node->numports, port = node->ports[p]; p > 0; port = node->ports[--p]) { uint32_t iwidth, ispeed, fdr10; uint8_t *info = NULL; if (port == NULL) continue; iwidth = mad_get_field(port->info, 0, IB_PORT_LINK_WIDTH_ACTIVE_F); ispeed = mad_get_field(port->info, 0, IB_PORT_LINK_SPEED_ACTIVE_F); if (port->node->type == IB_NODE_SWITCH) { if (port->node->ports[0]) info = (uint8_t *)&port->node->ports[0]->info; } else info = (uint8_t *)&port->info; if (info) { espeed = ibnd_get_agg_linkspeedext(info, port->info); } else { ispeed = 0; iwidth = 0; espeed = 0; } fdr10 = mad_get_field(port->ext_info, 0, IB_MLNX_EXT_PORT_LINK_SPEED_ACTIVE_F); nodename = remap_node_name(node_name_map, port->node->guid, port->node->nodedesc); fprintf(stdout, "%2s %5d %2d 0x%016" PRIx64 " %s %s", ports_nt_str_compat(node), node->type == IB_NODE_SWITCH ? node->smalid : port->base_lid, port->portnum, port->guid, dump_linkwidth_compat(iwidth), (ispeed != 4 && !espeed) ? dump_linkspeed_compat(ispeed) : dump_linkspeedext_compat(espeed, ispeed, fdr10)); if (port->remoteport) { rem_nodename = remap_node_name(node_name_map, port->remoteport->node->guid, port->remoteport->node->nodedesc); fprintf(stdout, " - %2s %5d %2d 0x%016" PRIx64 " ( '%s' - '%s' )\n", ports_nt_str_compat(port->remoteport->node), port->remoteport->node->type == IB_NODE_SWITCH ? 
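/* The LID column in dump_ports_report() above depends on the node type:
 * a switch carries a single LID on its management port (smalid), while
 * each CA or router port has its own base_lid. A small selector
 * capturing that rule (a sketch, not code the tool ships):
 *
 *	static int report_lid(ibnd_node_t *node, ibnd_port_t *port)
 *	{
 *		return node->type == IB_NODE_SWITCH ? node->smalid
 *						    : port->base_lid;
 *	}
 */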
port->remoteport->node->smalid : port->remoteport->base_lid, port->remoteport->portnum, port->remoteport->guid, nodename, rem_nodename); free(rem_nodename); } else fprintf(stdout, "%36s'%s'\n", "", nodename); free(nodename); } } struct iter_diff_data { uint32_t diff_flags; ibnd_fabric_t *fabric1; ibnd_fabric_t *fabric2; const char *fabric1_prefix; const char *fabric2_prefix; void (*out_header)(ibnd_node_t *, int, char *, const char *, const char *); void (*out_header_detail)(ibnd_node_t *, const char *); void (*out_port)(ibnd_port_t *, int, const char *); }; static void diff_iter_out_header(ibnd_node_t * node, struct iter_diff_data *data, int *out_header_flag) { if (!(*out_header_flag)) { (*data->out_header) (node, 0, NULL, NULL, NULL); (*out_header_flag)++; } } static void diff_ports(ibnd_node_t * fabric1_node, ibnd_node_t * fabric2_node, int *out_header_flag, struct iter_diff_data *data) { ibnd_port_t *fabric1_port; ibnd_port_t *fabric2_port; int p; for (p = 1; p <= fabric1_node->numports; p++) { int fabric1_out = 0, fabric2_out = 0; fabric1_port = fabric1_node->ports[p]; fabric2_port = fabric2_node->ports[p]; if (data->diff_flags & DIFF_FLAG_PORT_CONNECTION) { if ((fabric1_port && !fabric2_port) || ((fabric1_port && fabric2_port) && (fabric1_port->remoteport && !fabric2_port->remoteport))) fabric1_out++; else if ((!fabric1_port && fabric2_port) || ((fabric1_port && fabric2_port) && (!fabric1_port->remoteport && fabric2_port->remoteport))) fabric2_out++; else if ((fabric1_port && fabric2_port) && ((fabric1_port->guid != fabric2_port->guid) || ((fabric1_port->remoteport && fabric2_port->remoteport) && (fabric1_port->remoteport->guid != fabric2_port->remoteport->guid)))) { fabric1_out++; fabric2_out++; } } if ((data->diff_flags & DIFF_FLAG_LID) && fabric1_port && fabric2_port && fabric1_port->base_lid != fabric2_port->base_lid) { fabric1_out++; fabric2_out++; } if (data->diff_flags & DIFF_FLAG_PORT_CONNECTION && data->diff_flags & DIFF_FLAG_NODE_DESCRIPTION && fabric1_port && fabric2_port && fabric1_port->remoteport && fabric2_port->remoteport && memcmp(fabric1_port->remoteport->node->nodedesc, fabric2_port->remoteport->node->nodedesc, IB_SMP_DATA_SIZE)) { fabric1_out++; fabric2_out++; } if (data->diff_flags & DIFF_FLAG_PORT_CONNECTION && data->diff_flags & DIFF_FLAG_NODE_DESCRIPTION && fabric1_port && fabric2_port && fabric1_port->remoteport && fabric2_port->remoteport && memcmp(fabric1_port->remoteport->node->nodedesc, fabric2_port->remoteport->node->nodedesc, IB_SMP_DATA_SIZE)) { fabric1_out++; fabric2_out++; } if (data->diff_flags & DIFF_FLAG_PORT_CONNECTION && data->diff_flags & DIFF_FLAG_LID && fabric1_port && fabric2_port && fabric1_port->remoteport && fabric2_port->remoteport && fabric1_port->remoteport->base_lid != fabric2_port->remoteport->base_lid) { fabric1_out++; fabric2_out++; } if (fabric1_out) { diff_iter_out_header(fabric1_node, data, out_header_flag); (*data->out_port) (fabric1_port, 0, data->fabric1_prefix); } if (fabric2_out) { diff_iter_out_header(fabric1_node, data, out_header_flag); (*data->out_port) (fabric2_port, 0, data->fabric2_prefix); } } } static void diff_iter_func(ibnd_node_t * fabric1_node, void *iter_user_data) { struct iter_diff_data *data = iter_user_data; ibnd_node_t *fabric2_node; ibnd_port_t *fabric1_port; int p; DEBUG("DEBUG: fabric1_node %p\n", fabric1_node); fabric2_node = ibnd_find_node_guid(data->fabric2, fabric1_node->guid); if (!fabric2_node) { (*data->out_header) (fabric1_node, 0, NULL, data->fabric1_prefix, data->fabric1_prefix); for (p = 1; 
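/* diff_ports() above compares the two fabrics port-by-port at the same
 * index and prints both sides with the "<" and ">" prefixes when any
 * enabled check disagrees; a port or link present on only one side
 * counts as a difference. Its connection check boils down to the sketch
 * below (condensed: the real loop also compares the local port GUIDs):
 *
 *	static int conn_differs(ibnd_port_t *a, ibnd_port_t *b)
 *	{
 *		if (!a || !b)
 *			return a != b;
 *		if (!a->remoteport || !b->remoteport)
 *			return a->remoteport != b->remoteport;
 *		return a->remoteport->guid != b->remoteport->guid;
 *	}
 */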
p <= fabric1_node->numports; p++) { fabric1_port = fabric1_node->ports[p]; if (fabric1_port && fabric1_port->remoteport) (*data->out_port) (fabric1_port, 0, data->fabric1_prefix); } } else if (data->diff_flags & (DIFF_FLAG_PORT_CONNECTION | DIFF_FLAG_LID | DIFF_FLAG_NODE_DESCRIPTION)) { int out_header_flag = 0; if ((data->diff_flags & DIFF_FLAG_LID && fabric1_node->smalid != fabric2_node->smalid) || (data->diff_flags & DIFF_FLAG_NODE_DESCRIPTION && memcmp(fabric1_node->nodedesc, fabric2_node->nodedesc, IB_SMP_DATA_SIZE))) { (*data->out_header) (fabric1_node, 0, NULL, NULL, data->fabric1_prefix); (*data->out_header_detail) (fabric2_node, data->fabric2_prefix); fprintf(f, "\n"); out_header_flag++; } if (fabric1_node->numports != fabric2_node->numports) { diff_iter_out_header(fabric1_node, data, &out_header_flag); fprintf(f, "%snumports = %d\n", data->fabric1_prefix, fabric1_node->numports); fprintf(f, "%snumports = %d\n", data->fabric2_prefix, fabric2_node->numports); return; } if (data->diff_flags & DIFF_FLAG_PORT_CONNECTION || data->diff_flags & DIFF_FLAG_LID) diff_ports(fabric1_node, fabric2_node, &out_header_flag, data); } } static int diff_common(ibnd_fabric_t *orig_fabric, ibnd_fabric_t *new_fabric, int node_type, uint32_t diff_flags, void (*out_header)(ibnd_node_t *, int, char *, const char *, const char *), void (*out_header_detail)(ibnd_node_t *, const char *), void (*out_port)(ibnd_port_t *, int, const char *)) { struct iter_diff_data iter_diff_data; iter_diff_data.diff_flags = diff_flags; iter_diff_data.fabric1 = orig_fabric; iter_diff_data.fabric2 = new_fabric; iter_diff_data.fabric1_prefix = "< "; iter_diff_data.fabric2_prefix = "> "; iter_diff_data.out_header = out_header; iter_diff_data.out_header_detail = out_header_detail; iter_diff_data.out_port = out_port; ibnd_iter_nodes_type(orig_fabric, diff_iter_func, node_type, &iter_diff_data); /* Do opposite diff to find existence of node types * in new_fabric but not in orig_fabric. * * In this diff, we don't need to check port connections, * lids, or node descriptions since it has already been * done (i.e. checks are only done when guid exists on both * orig and new). 
*/ iter_diff_data.diff_flags = diff_flags & ~DIFF_FLAG_PORT_CONNECTION; iter_diff_data.diff_flags &= ~DIFF_FLAG_LID; iter_diff_data.diff_flags &= ~DIFF_FLAG_NODE_DESCRIPTION; iter_diff_data.fabric1 = new_fabric; iter_diff_data.fabric2 = orig_fabric; iter_diff_data.fabric1_prefix = "> "; iter_diff_data.fabric2_prefix = "< "; iter_diff_data.out_header = out_header; iter_diff_data.out_header_detail = out_header_detail; iter_diff_data.out_port = out_port; ibnd_iter_nodes_type(new_fabric, diff_iter_func, node_type, &iter_diff_data); return 0; } static int diff(ibnd_fabric_t *orig_fabric, ibnd_fabric_t *new_fabric) { if (diffcheck_flags & DIFF_FLAG_SWITCH) diff_common(orig_fabric, new_fabric, IB_NODE_SWITCH, diffcheck_flags, out_switch, out_switch_detail, out_switch_port); if (diffcheck_flags & DIFF_FLAG_CA) diff_common(orig_fabric, new_fabric, IB_NODE_CA, diffcheck_flags, out_ca, out_ca_detail, out_ca_port); if (diffcheck_flags & DIFF_FLAG_ROUTER) diff_common(orig_fabric, new_fabric, IB_NODE_ROUTER, diffcheck_flags, out_ca, out_ca_detail, out_ca_port); return 0; } static int list, group, ports_report; static int process_opt(void *context, int ch) { struct ibnd_config *cfg = context; char *p; switch (ch) { case 1: node_name_map_file = strdup(optarg); if (node_name_map_file == NULL) IBEXIT("out of memory, strdup for node_name_map_file name failed"); break; case 2: cache_file = strdup(optarg); break; case 3: load_cache_file = strdup(optarg); break; case 4: diff_cache_file = strdup(optarg); break; case 5: diffcheck_flags = 0; p = strtok(optarg, ","); while (p) { if (!strcasecmp(p, "sw")) diffcheck_flags |= DIFF_FLAG_SWITCH; else if (!strcasecmp(p, "ca")) diffcheck_flags |= DIFF_FLAG_CA; else if (!strcasecmp(p, "router")) diffcheck_flags |= DIFF_FLAG_ROUTER; else if (!strcasecmp(p, "port")) diffcheck_flags |= DIFF_FLAG_PORT_CONNECTION; else if (!strcasecmp(p, "lid")) diffcheck_flags |= DIFF_FLAG_LID; else if (!strcasecmp(p, "nodedesc")) diffcheck_flags |= DIFF_FLAG_NODE_DESCRIPTION; else { fprintf(stderr, "invalid diff check key: %s\n", p); return -1; } p = strtok(NULL, ","); } break; case 's': cfg->show_progress = 1; break; case 'f': full_info = 1; break; case 'l': list = LIST_CA_NODE | LIST_SWITCH_NODE | LIST_ROUTER_NODE; break; case 'g': group = 1; break; case 'S': list = LIST_SWITCH_NODE; break; case 'H': list = LIST_CA_NODE; break; case 'R': list = LIST_ROUTER_NODE; break; case 'p': ports_report = 1; break; case 'm': report_max_hops = 1; break; case 'o': cfg->max_smps = strtoul(optarg, NULL, 0); break; default: return -1; } return 0; } int main(int argc, char **argv) { struct ibnd_config config = { 0 }; ibnd_fabric_t *fabric = NULL; ibnd_fabric_t *diff_fabric = NULL; const struct ibdiag_opt opts[] = { {"full", 'f', 0, NULL, "show full information (ports' speed and width, vlcap)"}, {"show", 's', 0, NULL, "show more information"}, {"list", 'l', 0, NULL, "list of connected nodes"}, {"grouping", 'g', 0, NULL, "show grouping"}, {"Hca_list", 'H', 0, NULL, "list of connected CAs"}, {"Switch_list", 'S', 0, NULL, "list of connected switches"}, {"Router_list", 'R', 0, NULL, "list of connected routers"}, {"node-name-map", 1, 1, "", "node name map file"}, {"cache", 2, 1, "", "filename to cache ibnetdiscover data to"}, {"load-cache", 3, 1, "", "filename of ibnetdiscover cache to load"}, {"diff", 4, 1, "", "filename of ibnetdiscover cache to diff"}, {"diffcheck", 5, 1, "", "specify checks to execute for --diff"}, {"ports", 'p', 0, NULL, "obtain a ports report"}, {"max_hops", 'm', 0, NULL, "report max hops 
discovered by the library"}, {"outstanding_smps", 'o', 1, NULL, "specify the number of outstanding SMP's which should be " "issued during the scan"}, {} }; char usage_args[] = "[topology-file]"; ibdiag_process_opts(argc, argv, &config, "DGKLs", opts, process_opt, usage_args, NULL); f = stdout; argc -= optind; argv += optind; if (ibd_timeout) config.timeout_ms = ibd_timeout; config.flags = ibd_ibnetdisc_flags; if (argc && !(f = fopen(argv[0], "w"))) IBEXIT("can't open file %s for writing", argv[0]); config.mkey = ibd_mkey; node_name_map = open_node_name_map(node_name_map_file); if (diff_cache_file && !(diff_fabric = ibnd_load_fabric(diff_cache_file, 0))) IBEXIT("loading cached fabric for diff failed\n"); if (load_cache_file) { if ((fabric = ibnd_load_fabric(load_cache_file, 0)) == NULL) IBEXIT("loading cached fabric failed\n"); } else { if ((fabric = ibnd_discover_fabric(ibd_ca, ibd_ca_port, NULL, &config)) == NULL) IBEXIT("discover failed\n"); } if (ports_report) ibnd_iter_nodes(fabric, dump_ports_report, NULL); else if (list) list_nodes(fabric, list); else if (diff_fabric) diff(diff_fabric, fabric); else dump_topology(group, fabric); if (cache_file) if (ibnd_cache_fabric(fabric, cache_file, 0) < 0) IBEXIT("caching ibnetdiscover data failed\n"); ibnd_destroy_fabric(fabric); if (diff_fabric) ibnd_destroy_fabric(diff_fabric); close_node_name_map(node_name_map); exit(0); } rdma-core-56.1/infiniband-diags/ibping.c000066400000000000000000000152421477342711600201400ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #include #include #include #include #include #include #include #include #include "ibdiag_common.h" static struct ibmad_port *srcport; static struct ibmad_ports_pair *srcports; static uint64_t time_stamp(void) { struct timespec ts; clock_gettime(CLOCK_MONOTONIC, &ts); return ((uint64_t)ts.tv_sec * 1000000ULL) + ts.tv_nsec / 10000ULL; } static char host_and_domain[IB_VENDOR_RANGE2_DATA_SIZE]; static char last_host[IB_VENDOR_RANGE2_DATA_SIZE]; static void get_host_and_domain(char *data, int sz) { char *s = data; int n; if (gethostname(s, sz) < 0) snprintf(s, sz, "?hostname?"); s[sz - 1] = 0; if ((n = strlen(s)) >= sz) return; s[n] = '.'; s += n + 1; sz -= n + 1; if (getdomainname(s, sz) < 0) snprintf(s, sz, "?domainname?"); if (strlen(s) == 0) s[-1] = 0; /* no domain */ } static char *ibping_serv(void) { void *umad; void *mad; char *data; DEBUG("starting to serve..."); while ((umad = mad_receive_via(NULL, -1, srcport))) { if (umad_status(umad) == 0) { mad = umad_get_mad(umad); data = (char *)mad + IB_VENDOR_RANGE2_DATA_OFFS; memcpy(data, host_and_domain, IB_VENDOR_RANGE2_DATA_SIZE); DEBUG("Pong: %s", data); if (mad_respond_via(umad, NULL, 0, srcport) < 0) DEBUG("respond failed"); } mad_free(umad); } DEBUG("server out"); return NULL; } static int oui = IB_OPENIB_OUI; static uint64_t ibping(ib_portid_t * portid, int quiet) { char data[IB_VENDOR_RANGE2_DATA_SIZE] = { 0 }; ib_vendor_call_t call; uint64_t start, rtt; DEBUG("Ping.."); start = time_stamp(); call.method = IB_MAD_METHOD_GET; call.mgmt_class = IB_VENDOR_OPENIB_PING_CLASS; call.attrid = 0; call.mod = 0; call.oui = oui; call.timeout = 0; memset(&call.rmpp, 0, sizeof call.rmpp); if (!ib_vendor_call_via(data, portid, &call, srcport)) return ~0ull; rtt = time_stamp() - start; if (!last_host[0]) memcpy(last_host, data, sizeof last_host); if (!quiet) printf("Pong from %s (%s): time %" PRIu64 ".%03" PRIu64 " ms\n", data, portid2str(portid), rtt / 1000, rtt % 1000); return rtt; } static uint64_t minrtt = ~0ull, maxrtt, total_rtt; static uint64_t start, total_time, replied, lost, ntrans; static ib_portid_t portid = { 0 }; static void report(int sig) { total_time = time_stamp() - start; DEBUG("out due signal %d", sig); printf("\n--- %s (%s) ibping statistics ---\n", last_host, portid2str(&portid)); printf("%" PRIu64 " packets transmitted, %" PRIu64 " received, %" PRIu64 "%% packet loss, time %" PRIu64 " ms\n", ntrans, replied, (lost != 0) ? lost * 100 / ntrans : 0, total_time / 1000); printf("rtt min/avg/max = %" PRIu64 ".%03" PRIu64 "/%" PRIu64 ".%03" PRIu64 "/%" PRIu64 ".%03" PRIu64 " ms\n", minrtt == ~0ull ? 0 : minrtt / 1000, minrtt == ~0ull ? 0 : minrtt % 1000, replied ? total_rtt / replied / 1000 : 0, replied ? 
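/* ibping() above measures one vendor-class GET round trip: take a stamp,
 * issue ib_vendor_call_via() against IB_VENDOR_OPENIB_PING_CLASS, and
 * subtract on return. The statistics printed here assume microsecond
 * stamps (rtt / 1000 yields ms). A conventional monotonic microsecond
 * clock for that purpose looks like the sketch below; note that
 * tv_nsec must be divided by 1000, not 10000, to yield microseconds:
 *
 *	static uint64_t usec_now(void)
 *	{
 *		struct timespec ts;
 *
 *		clock_gettime(CLOCK_MONOTONIC, &ts);
 *		return (uint64_t)ts.tv_sec * 1000000ULL + ts.tv_nsec / 1000;
 *	}
 */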
(total_rtt / replied) % 1000 : 0, maxrtt / 1000, maxrtt % 1000); exit(0); } static int server = 0, flood = 0; static unsigned count = ~0; static int process_opt(void *context, int ch) { switch (ch) { case 'c': count = strtoul(optarg, NULL, 0); break; case 'f': flood++; break; case 'o': oui = strtoul(optarg, NULL, 0); break; case 'S': server++; break; default: return -1; } return 0; } int main(int argc, char **argv) { int mgmt_classes[1] = { IB_SA_CLASS }; int ping_class = IB_VENDOR_OPENIB_PING_CLASS; uint64_t rtt; char *err; const struct ibdiag_opt opts[] = { {"count", 'c', 1, "", "stop after count packets"}, {"flood", 'f', 0, NULL, "flood destination"}, {"oui", 'o', 1, NULL, "use specified OUI number"}, {"Server", 'S', 0, NULL, "start in server mode"}, {} }; char usage_args[] = ""; ibdiag_process_opts(argc, argv, NULL, "DKy", opts, process_opt, usage_args, NULL); argc -= optind; argv += optind; if (!argc && !server) ibdiag_show_usage(); srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 1, 0); if (!srcports) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); srcport = srcports->gsi.port; if (!srcport) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); if (server) { if (mad_register_server_via(ping_class, 0, NULL, oui, srcport) < 0) IBEXIT("can't serve class %d on this port", ping_class); get_host_and_domain(host_and_domain, sizeof host_and_domain); if ((err = ibping_serv())) IBEXIT("ibping to %s: %s", portid2str(&portid), err); exit(0); } if (mad_register_client_via(ping_class, 0, srcport) < 0) IBEXIT("can't register ping class %d on this port", ping_class); if (resolve_portid_str(srcports->gsi.ca_name, ibd_ca_port, &portid, argv[0], ibd_dest_type, ibd_sm_id, srcport) < 0) IBEXIT("can't resolve destination port %s", argv[0]); signal(SIGINT, report); signal(SIGTERM, report); start = time_stamp(); while (count-- > 0) { ntrans++; if ((rtt = ibping(&portid, flood)) == ~0ull) { DEBUG("ibping to %s failed", portid2str(&portid)); lost++; } else { if (rtt < minrtt) minrtt = rtt; if (rtt > maxrtt) maxrtt = rtt; total_rtt += rtt; replied++; } if (!flood) sleep(1); } report(0); mad_rpc_close_port2(srcports); exit(-1); } rdma-core-56.1/infiniband-diags/ibportstate.c000066400000000000000000000606321477342711600212330ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2010,2011 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 2011,2016 Oracle and/or its affiliates. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include #include #include #include "ibdiag_common.h" #include enum port_ops { QUERY, ENABLE, RESET, DISABLE, SPEED, ESPEED, FDR10SPEED, WIDTH, DOWN, ARM, ACTIVE, VLS, MTU, LID, SMLID, LMC, MKEY, MKEYLEASE, MKEYPROT, ON, OFF }; static struct ibmad_port *srcport; static struct ibmad_ports_pair *srcports; static uint64_t speed; /* no state change */ static uint64_t espeed; /* no state change */ static uint64_t fdr10; /* no state change */ static uint64_t width; /* no state change */ static uint64_t lid; static uint64_t smlid; static uint64_t lmc; static uint64_t mtu; static uint64_t vls; /* no state change */ static uint64_t mkey; static uint64_t mkeylease; static uint64_t mkeyprot; static struct { const char *name; uint64_t *val; int set; } port_args[] = { {"query", NULL, 0}, /* QUERY */ {"enable", NULL, 0}, /* ENABLE */ {"reset", NULL, 0}, /* RESET */ {"disable", NULL, 0}, /* DISABLE */ {"speed", &speed, 0}, /* SPEED */ {"espeed", &espeed, 0}, /* EXTENDED SPEED */ {"fdr10", &fdr10, 0}, /* FDR10 SPEED */ {"width", &width, 0}, /* WIDTH */ {"down", NULL, 0}, /* DOWN */ {"arm", NULL, 0}, /* ARM */ {"active", NULL, 0}, /* ACTIVE */ {"vls", &vls, 0}, /* VLS */ {"mtu", &mtu, 0}, /* MTU */ {"lid", &lid, 0}, /* LID */ {"smlid", &smlid, 0}, /* SMLID */ {"lmc", &lmc, 0}, /* LMC */ {"mkey", &mkey, 0}, /* MKEY */ {"mkeylease", &mkeylease, 0}, /* MKEY LEASE */ {"mkeyprot", &mkeyprot, 0}, /* MKEY PROTECT BITS */ {"on", NULL, 0}, /* ON */ {"off", NULL, 0}, /* OFF */ }; #define NPORT_ARGS (sizeof(port_args) / sizeof(port_args[0])) /*******************************************/ /* * Return 1 if node is a switch, else zero. */ static int get_node_info(ib_portid_t * dest, uint8_t * data) { int node_type; if (!smp_query_via(data, dest, IB_ATTR_NODE_INFO, 0, 0, srcport)) IBEXIT("smp query nodeinfo failed"); node_type = mad_get_field(data, 0, IB_NODE_TYPE_F); if (node_type == IB_NODE_SWITCH) /* Switch NodeType ? */ return 1; else return 0; } static int get_port_info(ib_portid_t * dest, uint8_t * data, int portnum, int is_switch) { uint8_t smp[IB_SMP_DATA_SIZE]; uint8_t *info; int cap_mask, cap_mask2; if (is_switch) { if (!smp_query_via(smp, dest, IB_ATTR_PORT_INFO, 0, 0, srcport)) IBEXIT("smp query port 0 portinfo failed"); info = smp; } else info = data; if (!smp_query_via(data, dest, IB_ATTR_PORT_INFO, portnum, 0, srcport)) IBEXIT("smp query portinfo failed"); cap_mask = mad_get_field(info, 0, IB_PORT_CAPMASK_F); if (cap_mask & be32toh(IB_PORT_CAP_HAS_CAP_MASK2)) { cap_mask2 = (mad_get_field(info, 0, IB_PORT_CAPMASK2_F) & be16toh(IB_PORT_CAP2_IS_EXT_SPEEDS_2_SUPPORTED)) ? 0x02 : 0x00; } else cap_mask2 = 0; return cap_mask2 | ((cap_mask & be32toh(IB_PORT_CAP_HAS_EXT_SPEEDS)) ? 
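/* get_port_info() here folds two capability tests into its return value:
 * bit 0 is set when CapabilityMask advertises extended link speeds, and
 * bit 1 when CapabilityMask2 advertises the second extended-speed table.
 * Consumers test the bits as in this sketch; show_port_info() below dumps
 * the IB_PORT_LINK_SPEED_EXT_*_F fields only when has_ext is set and the
 * *_2_F variants only when has_ext2 is set:
 *
 *	int cap = get_port_info(&portid, data, portnum, is_switch);
 *	int has_ext = cap & 0x01;
 *	int has_ext2 = cap & 0x02;
 */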
0x01 : 0x00); } static void show_port_info(ib_portid_t * dest, uint8_t * data, int portnum, int espeed_cap, int is_switch) { char buf[2300]; char val[64]; mad_dump_portstates(buf, sizeof buf, data, sizeof *data); mad_decode_field(data, IB_PORT_LID_F, val); mad_dump_field(IB_PORT_LID_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_SMLID_F, val); mad_dump_field(IB_PORT_SMLID_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_LMC_F, val); mad_dump_field(IB_PORT_LMC_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_LINK_WIDTH_SUPPORTED_F, val); mad_dump_field(IB_PORT_LINK_WIDTH_SUPPORTED_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_LINK_WIDTH_ENABLED_F, val); mad_dump_field(IB_PORT_LINK_WIDTH_ENABLED_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_LINK_WIDTH_ACTIVE_F, val); mad_dump_field(IB_PORT_LINK_WIDTH_ACTIVE_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_LINK_SPEED_SUPPORTED_F, val); mad_dump_field(IB_PORT_LINK_SPEED_SUPPORTED_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_LINK_SPEED_ENABLED_F, val); mad_dump_field(IB_PORT_LINK_SPEED_ENABLED_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_LINK_SPEED_ACTIVE_F, val); mad_dump_field(IB_PORT_LINK_SPEED_ACTIVE_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); if (espeed_cap & 0x01) { mad_decode_field(data, IB_PORT_LINK_SPEED_EXT_SUPPORTED_F, val); mad_dump_field(IB_PORT_LINK_SPEED_EXT_SUPPORTED_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_LINK_SPEED_EXT_ENABLED_F, val); mad_dump_field(IB_PORT_LINK_SPEED_EXT_ENABLED_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_LINK_SPEED_EXT_ACTIVE_F, val); mad_dump_field(IB_PORT_LINK_SPEED_EXT_ACTIVE_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); } if (espeed_cap & 0x02) { mad_decode_field(data, IB_PORT_LINK_SPEED_EXT_SUPPORTED_2_F, val); mad_dump_field(IB_PORT_LINK_SPEED_EXT_SUPPORTED_2_F, buf + strlen(buf), sizeof(buf) - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_LINK_SPEED_EXT_ENABLED_2_F, val); mad_dump_field(IB_PORT_LINK_SPEED_EXT_ENABLED_2_F, buf + strlen(buf), sizeof(buf) - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_LINK_SPEED_EXT_ACTIVE_2_F, val); mad_dump_field(IB_PORT_LINK_SPEED_EXT_ACTIVE_2_F, buf + strlen(buf), sizeof(buf) - strlen(buf), val); sprintf(buf + strlen(buf), "%s", "\n"); } if (!is_switch || portnum == 0) { if (show_keys) { mad_decode_field(data, IB_PORT_MKEY_F, val); mad_dump_field(IB_PORT_MKEY_F, buf + strlen(buf), sizeof buf - strlen(buf), val); } else snprint_field(buf+strlen(buf), sizeof(buf)-strlen(buf), IB_PORT_MKEY_F, 32, NOT_DISPLAYED_STR); sprintf(buf+strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_MKEY_LEASE_F, val); 
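/* show_port_info() above repeats a three-step pattern per field: decode
 * into val, dump formatted text onto the end of buf, append a newline.
 * A sketch of factoring that into a helper (a possible refactoring, not
 * code the tool ships):
 *
 *	static void dump_field_line(enum MAD_FIELDS fld, uint8_t *data,
 *				    char *buf, size_t sz)
 *	{
 *		char val[64];
 *
 *		mad_decode_field(data, fld, val);
 *		mad_dump_field(fld, buf + strlen(buf), sz - strlen(buf), val);
 *		sprintf(buf + strlen(buf), "%s", "\n");
 *	}
 */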
mad_dump_field(IB_PORT_MKEY_LEASE_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf+strlen(buf), "%s", "\n"); mad_decode_field(data, IB_PORT_MKEY_PROT_BITS_F, val); mad_dump_field(IB_PORT_MKEY_PROT_BITS_F, buf + strlen(buf), sizeof buf - strlen(buf), val); sprintf(buf+strlen(buf), "%s", "\n"); } printf("# Port info: %s port %d\n%s", portid2str(dest), portnum, buf); } static void set_port_info(ib_portid_t * dest, uint8_t * data, int portnum, int espeed_cap, int is_switch) { unsigned mod; mod = portnum; if (espeed_cap) mod |= (1U)<<31; if (!smp_set_via(data, dest, IB_ATTR_PORT_INFO, mod, 0, srcport)) IBEXIT("smp set portinfo failed"); printf("\nAfter PortInfo set:\n"); show_port_info(dest, data, portnum, espeed_cap, is_switch); } static void get_mlnx_ext_port_info(ib_portid_t * dest, uint8_t * data, int portnum) { if (!smp_query_via(data, dest, IB_ATTR_MLNX_EXT_PORT_INFO, portnum, 0, srcport)) IBEXIT("smp query ext portinfo failed"); } static void show_mlnx_ext_port_info(ib_portid_t * dest, uint8_t * data, int portnum) { char buf[256]; mad_dump_mlnx_ext_port_info(buf, sizeof buf, data, IB_SMP_DATA_SIZE); printf("# MLNX ext Port info: %s port %d\n%s", portid2str(dest), portnum, buf); } static void set_mlnx_ext_port_info(ib_portid_t * dest, uint8_t * data, int portnum) { if (!smp_set_via(data, dest, IB_ATTR_MLNX_EXT_PORT_INFO, portnum, 0, srcport)) IBEXIT("smp set MLNX ext portinfo failed"); printf("\nAfter MLNXExtendedPortInfo set:\n"); show_mlnx_ext_port_info(dest, data, portnum); } static int get_link_width(int lwe, int lws) { if (lwe == 255) return lws; else return lwe; } static int get_link_speed(int lse, int lss) { if (lse == 15) return lss; else return lse; } static int get_link_speed_ext(int lsee, int lses) { if (lsee & 0x20) return lsee; if (lsee == 31) return lses; else return lsee; } static void validate_width(int peerwidth, int lwa) { if ((width & peerwidth & 0x8)) { if (lwa != 8) IBWARN ("Peer ports operating at active width %d rather than 8 (12x)", lwa); } else if ((width & peerwidth & 0x4)) { if (lwa != 4) IBWARN ("Peer ports operating at active width %d rather than 4 (8x)", lwa); } else if ((width & peerwidth & 0x2)) { if (lwa != 2) IBWARN ("Peer ports operating at active width %d rather than 2 (4x)", lwa); } else if ((width & peerwidth & 0x10)) { if (lwa != 16) IBWARN ("Peer ports operating at active width %d rather than 16 (2x)", lwa); } else if ((width & peerwidth & 0x1)) { if (lwa != 1) IBWARN ("Peer ports operating at active width %d rather than 1 (1x)", lwa); } } static void validate_speed(int peerspeed, int lsa) { if ((speed & peerspeed & 0x4)) { if (lsa != 4) IBWARN ("Peer ports operating at active speed %d rather than 4 (10.0 Gbps)", lsa); } else if ((speed & peerspeed & 0x2)) { if (lsa != 2) IBWARN ("Peer ports operating at active speed %d rather than 2 (5.0 Gbps)", lsa); } else if ((speed & peerspeed & 0x1)) { if (lsa != 1) IBWARN ("Peer ports operating at active speed %d rather than 1 (2.5 Gbps)", lsa); } } static void validate_extended_speed(int peerespeed, int lsea) { if ((espeed & peerespeed & 0x20)) { if (lsea != 32) IBWARN ("Peer ports operating at active extended speed %d rather than 32 (212.5 Gbps)", lsea); } else if ((espeed & peerespeed & 0x8)) { if (lsea != 8) IBWARN ("Peer ports operating at active extended speed %d rather than 8 (106.25 Gbps)", lsea); } else if ((espeed & peerespeed & 0x4)) { if (lsea != 4) IBWARN ("Peer ports operating at active extended speed %d rather than 4 (53.125 Gbps)", lsea); } else if ((espeed & peerespeed & 
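/* The validate_* helpers around this point compare the two ports'
 * enabled-bit masks, highest common capability first, and warn when the
 * active value does not match the best mutual bit. They rely on the
 * PortInfo link-width encoding: bit 0x1 means 1x, 0x2 means 4x, 0x4
 * means 8x, 0x8 means 12x and 0x10 means 2x. Restated as a sketch:
 *
 *	static int width_to_lanes(int w)
 *	{
 *		switch (w) {
 *		case 1: return 1;
 *		case 2: return 4;
 *		case 4: return 8;
 *		case 8: return 12;
 *		case 16: return 2;
 *		}
 *		return 0;
 *	}
 */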
0x2)) { if (lsea != 2) IBWARN ("Peer ports operating at active extended speed %d rather than 2 (25.78125 Gbps)", lsea); } else if ((espeed & peerespeed & 0x1)) { if (lsea != 1) IBWARN ("Peer ports operating at active extended speed %d rather than 1 (14.0625 Gbps)", lsea); } } int main(int argc, char **argv) { int mgmt_classes[3] = { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS, IB_SA_CLASS }; ib_portid_t portid = { 0 }; int port_op = -1; int is_switch, is_peer_switch, espeed_cap, peer_espeed_cap; int state, physstate, lwe, lws, lwa, lse, lss, lsa, lsee, lses, lsea, fdr10s, fdr10e, fdr10a; int peerlocalportnum, peerlwe, peerlws, peerlwa, peerlse, peerlss, peerlsa, peerlsee, peerlses, peerfdr10s, peerfdr10e, peerfdr10a; int peerwidth, peerspeed, peerespeed; uint8_t data[IB_SMP_DATA_SIZE] = { 0 }; uint8_t data2[IB_SMP_DATA_SIZE] = { 0 }; ib_portid_t peerportid = { 0 }; int portnum = 0; ib_portid_t selfportid = { 0 }; int selfport = 0; int changed = 0; int i; uint32_t vendorid, rem_vendorid; uint16_t devid, rem_devid; uint64_t val; char *endp; char usage_args[] = " []\n" "\nSupported ops: enable, disable, on, off, reset, speed, espeed, fdr10,\n" "\twidth, query, down, arm, active, vls, mtu, lid, smlid, lmc,\n" "\tmkey, mkeylease, mkeyprot\n"; const char *usage_examples[] = { "-C qib0 -P 1 3 1 disable # by CA name, CA Port Number, lid, physical port number", "-C qib0 -P 1 3 1 enable # by CA name, CA Port Number, lid, physical port number", "-D 0 1\t\t\t# (query) by direct route", "3 1 reset\t\t\t# by lid", "3 1 speed 1\t\t\t# by lid", "3 1 width 1\t\t\t# by lid", "-D 0 1 lid 0x1234 arm\t\t# by direct route", NULL }; ibdiag_process_opts(argc, argv, NULL, NULL, NULL, NULL, usage_args, usage_examples); argc -= optind; argv += optind; if (argc < 2) ibdiag_show_usage(); srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 3, 1); if (!srcports) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); srcport = srcports->smi.port; if (!srcport) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); smp_mkey_set(srcport, ibd_mkey); if (resolve_portid_str(srcports->gsi.ca_name, ibd_ca_port, &portid, argv[0], ibd_dest_type, ibd_sm_id, srcports->gsi.port) < 0) IBEXIT("can't resolve destination port %s", argv[0]); if (argc > 1) portnum = strtol(argv[1], NULL, 0); for (i = 2; i < argc; i++) { int j; for (j = 0; j < NPORT_ARGS; j++) { if (strcmp(argv[i], port_args[j].name)) continue; port_args[j].set = 1; if (!port_args[j].val) { if (port_op >= 0) IBEXIT("%s only one of: " "query, enable, disable, " "reset, down, arm, active, " "can be specified", port_args[j].name); port_op = j; break; } if (++i >= argc) IBEXIT("%s requires an additional parameter", port_args[j].name); val = strtoull(argv[i], NULL, 0); switch (j) { case SPEED: if (val > 15) IBEXIT("invalid speed value %" PRIu64, val); break; case ESPEED: if (val > 31) IBEXIT("invalid extended speed value %" PRIu64, val); break; case FDR10SPEED: if (val > 1) IBEXIT("invalid fdr10 speed value %" PRIu64, val); break; case WIDTH: if ((val > 31 && val != 255)) IBEXIT("invalid width value %" PRIu64, val); break; case VLS: if (val == 0 || val > 5) IBEXIT("invalid vls value %" PRIu64, val); break; case MTU: if (val == 0 || val > 5) IBEXIT("invalid mtu value %" PRIu64, val); break; case LID: if (val == 0 || val >= 0xC000) IBEXIT("invalid lid value 0x%" PRIx64, val); break; case SMLID: if (val == 0 || val >= 0xC000) IBEXIT("invalid smlid value 0x%" PRIx64, val); break; case LMC: if (val > 7) IBEXIT("invalid lmc value %" PRIu64, val); break; case MKEY: 
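/* A note on the numeric parsing just below: strtoull() reports failure
 * only through errno and its end pointer, so the MKEY case resets errno,
 * converts, and accepts the value only when the whole argument was
 * consumed, falling back to a getpass() prompt otherwise. The same
 * errno/endptr idiom in isolation (a common variant that also rejects
 * empty input):
 *
 *	char *end;
 *
 *	errno = 0;
 *	uint64_t v = strtoull(str, &end, 0);
 *	int ok = !errno && end != str && *end == '\0';
 */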
errno = 0; val = strtoull(argv[i], &endp, 0); if (errno || *endp != '\0') { errno = 0; val = strtoull(getpass("New M_Key: "), &endp, 0); if (errno || *endp != '\0') { IBEXIT("Bad new M_Key\n"); } } /* All 64-bit values are legal */ break; case MKEYLEASE: if (val > 0xFFFF) IBEXIT("invalid mkey lease time %" PRIu64, val); break; case MKEYPROT: if (val > 3) IBEXIT("invalid mkey protection bit setting %" PRIu64, val); } *port_args[j].val = val; changed = 1; break; } if (j == NPORT_ARGS) IBEXIT("invalid operation: %s", argv[i]); } if (port_op < 0) port_op = QUERY; is_switch = get_node_info(&portid, data); vendorid = (uint32_t) mad_get_field(data, 0, IB_NODE_VENDORID_F); devid = (uint16_t) mad_get_field(data, 0, IB_NODE_DEVID_F); if ((port_args[MKEY].set || port_args[MKEYLEASE].set || port_args[MKEYPROT].set) && is_switch && portnum != 0) IBEXIT("Can't set M_Key fields on switch port != 0"); if (port_op != QUERY || changed) printf("Initial %s PortInfo:\n", is_switch ? "Switch" : "CA/RT"); else printf("%s PortInfo:\n", is_switch ? "Switch" : "CA/RT"); espeed_cap = get_port_info(&portid, data, portnum, is_switch); show_port_info(&portid, data, portnum, espeed_cap, is_switch); if (is_mlnx_ext_port_info_supported(vendorid, devid)) { get_mlnx_ext_port_info(&portid, data2, portnum); show_mlnx_ext_port_info(&portid, data2, portnum); } if (port_op != QUERY || changed) { /* * If we aren't setting the LID and the LID is the default, * the SMA command will fail due to an invalid LID. * Set it to something unlikely but valid. */ physstate = mad_get_field(data, 0, IB_PORT_PHYS_STATE_F); val = mad_get_field(data, 0, IB_PORT_LID_F); if (!port_args[LID].set && (!val || val == 0xFFFF)) mad_set_field(data, 0, IB_PORT_LID_F, 0x1234); val = mad_get_field(data, 0, IB_PORT_SMLID_F); if (!port_args[SMLID].set && (!val || val == 0xFFFF)) mad_set_field(data, 0, IB_PORT_SMLID_F, 0x1234); mad_set_field(data, 0, IB_PORT_STATE_F, 0); /* NOP */ mad_set_field(data, 0, IB_PORT_PHYS_STATE_F, 0); /* NOP */ switch (port_op) { case ON: /* Enable only if state is Disable */ if(physstate != 3) { printf("Port is already in enable state\n"); goto close_port; } SWITCH_FALLTHROUGH; case ENABLE: case RESET: /* Polling */ mad_set_field(data, 0, IB_PORT_PHYS_STATE_F, 2); break; case OFF: case DISABLE: printf("Disable may be irreversible\n"); mad_set_field(data, 0, IB_PORT_PHYS_STATE_F, 3); break; case DOWN: mad_set_field(data, 0, IB_PORT_STATE_F, 1); break; case ARM: mad_set_field(data, 0, IB_PORT_STATE_F, 3); break; case ACTIVE: mad_set_field(data, 0, IB_PORT_STATE_F, 4); break; } /* always set enabled speeds/width - defaults to NOP */ mad_set_field(data, 0, IB_PORT_LINK_SPEED_ENABLED_F, speed); mad_set_field(data, 0, IB_PORT_LINK_SPEED_EXT_ENABLED_F, espeed & 0x1f); mad_set_field(data, 0, IB_PORT_LINK_SPEED_EXT_ENABLED_2_F, espeed >> 5); mad_set_field(data, 0, IB_PORT_LINK_WIDTH_ENABLED_F, width); if (port_args[VLS].set) mad_set_field(data, 0, IB_PORT_OPER_VLS_F, vls); if (port_args[MTU].set) mad_set_field(data, 0, IB_PORT_NEIGHBOR_MTU_F, mtu); if (port_args[LID].set) mad_set_field(data, 0, IB_PORT_LID_F, lid); if (port_args[SMLID].set) mad_set_field(data, 0, IB_PORT_SMLID_F, smlid); if (port_args[LMC].set) mad_set_field(data, 0, IB_PORT_LMC_F, lmc); if (port_args[FDR10SPEED].set) { mad_set_field(data2, 0, IB_MLNX_EXT_PORT_STATE_CHG_ENABLE_F, FDR10); mad_set_field(data2, 0, IB_MLNX_EXT_PORT_LINK_SPEED_ENABLED_F, fdr10); set_mlnx_ext_port_info(&portid, data2, portnum); } if (port_args[MKEY].set) mad_set_field64(data, 0, IB_PORT_MKEY_F, 
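/* PortInfo Set semantics: a value of 0 in the state fields means NOP,
 * which is why the block above first clears PortState and
 * PortPhysicalState and only overwrites them for the requested
 * operation; the enabled speed/width fields likewise default to NOP.
 * A minimal "arm" request under those rules (a sketch that ignores the
 * extended-speed modifier bit handled by set_port_info() above):
 *
 *	mad_set_field(data, 0, IB_PORT_STATE_F, 3);
 *	mad_set_field(data, 0, IB_PORT_PHYS_STATE_F, 0);
 *	if (!smp_set_via(data, &portid, IB_ATTR_PORT_INFO, portnum, 0,
 *			 srcport))
 *		IBEXIT("smp set portinfo failed");
 */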
mkey); if (port_args[MKEYLEASE].set) mad_set_field(data, 0, IB_PORT_MKEY_LEASE_F, mkeylease); if (port_args[MKEYPROT].set) mad_set_field(data, 0, IB_PORT_MKEY_PROT_BITS_F, mkeyprot); set_port_info(&portid, data, portnum, espeed_cap, is_switch); } else if (is_switch && portnum) { /* Now, make sure PortState is Active */ /* Or is PortPhysicalState LinkUp sufficient ? */ mad_decode_field(data, IB_PORT_STATE_F, &state); mad_decode_field(data, IB_PORT_PHYS_STATE_F, &physstate); if (state == 4) { /* Active */ mad_decode_field(data, IB_PORT_LINK_WIDTH_ENABLED_F, &lwe); mad_decode_field(data, IB_PORT_LINK_WIDTH_SUPPORTED_F, &lws); mad_decode_field(data, IB_PORT_LINK_WIDTH_ACTIVE_F, &lwa); mad_decode_field(data, IB_PORT_LINK_SPEED_SUPPORTED_F, &lss); mad_decode_field(data, IB_PORT_LINK_SPEED_ACTIVE_F, &lsa); mad_decode_field(data, IB_PORT_LINK_SPEED_ENABLED_F, &lse); mad_decode_field(data2, IB_MLNX_EXT_PORT_LINK_SPEED_SUPPORTED_F, &fdr10s); mad_decode_field(data2, IB_MLNX_EXT_PORT_LINK_SPEED_ENABLED_F, &fdr10e); mad_decode_field(data2, IB_MLNX_EXT_PORT_LINK_SPEED_ACTIVE_F, &fdr10a); if (espeed_cap) { lsea = ibnd_get_agg_linkspeedext(data, data); lsee = ibnd_get_agg_linkspeedexten(data, data); lses = ibnd_get_agg_linkspeedextsup(data, data); } else { lsea = 0; lsee = 0; lses = 0; } /* Setup portid for peer port */ memcpy(&peerportid, &portid, sizeof(peerportid)); if (portid.lid == 0) { peerportid.drpath.cnt++; if (peerportid.drpath.cnt == IB_SUBNET_PATH_HOPS_MAX) { IBEXIT("Too many hops"); } } else { peerportid.drpath.cnt = 1; /* Set DrSLID to local lid */ if (resolve_self(ibd_ca, ibd_ca_port, &selfportid, &selfport, NULL) < 0) IBEXIT("could not resolve self"); peerportid.drpath.drslid = (uint16_t) selfportid.lid; peerportid.drpath.drdlid = 0xffff; } peerportid.drpath.p[peerportid.drpath.cnt] = (uint8_t) portnum; /* Get peer port NodeInfo to obtain peer port number */ is_peer_switch = get_node_info(&peerportid, data); rem_vendorid = (uint32_t) mad_get_field(data, 0, IB_NODE_VENDORID_F); rem_devid = (uint16_t) mad_get_field(data, 0, IB_NODE_DEVID_F); mad_decode_field(data, IB_NODE_LOCAL_PORT_F, &peerlocalportnum); printf("Peer PortInfo:\n"); /* Get peer port characteristics */ peer_espeed_cap = get_port_info(&peerportid, data, peerlocalportnum, is_peer_switch); if (is_mlnx_ext_port_info_supported(rem_vendorid, rem_devid)) get_mlnx_ext_port_info(&peerportid, data2, peerlocalportnum); show_port_info(&peerportid, data, peerlocalportnum, peer_espeed_cap, is_peer_switch); if (is_mlnx_ext_port_info_supported(rem_vendorid, rem_devid)) show_mlnx_ext_port_info(&peerportid, data2, peerlocalportnum); mad_decode_field(data, IB_PORT_LINK_WIDTH_ENABLED_F, &peerlwe); mad_decode_field(data, IB_PORT_LINK_WIDTH_SUPPORTED_F, &peerlws); mad_decode_field(data, IB_PORT_LINK_WIDTH_ACTIVE_F, &peerlwa); mad_decode_field(data, IB_PORT_LINK_SPEED_SUPPORTED_F, &peerlss); mad_decode_field(data, IB_PORT_LINK_SPEED_ACTIVE_F, &peerlsa); mad_decode_field(data, IB_PORT_LINK_SPEED_ENABLED_F, &peerlse); mad_decode_field(data2, IB_MLNX_EXT_PORT_LINK_SPEED_SUPPORTED_F, &peerfdr10s); mad_decode_field(data2, IB_MLNX_EXT_PORT_LINK_SPEED_ENABLED_F, &peerfdr10e); mad_decode_field(data2, IB_MLNX_EXT_PORT_LINK_SPEED_ACTIVE_F, &peerfdr10a); if (peer_espeed_cap) { peerlsee = ibnd_get_agg_linkspeedexten(data, data); peerlses = ibnd_get_agg_linkspeedextsup(data, data); } else { peerlsee = 0; peerlses = 0; } /* Now validate peer port characteristics */ /* Examine Link Width */ width = get_link_width(lwe, lws); peerwidth = get_link_width(peerlwe, 
peerlws); validate_width(peerwidth, lwa); /* Examine Link Speeds */ speed = get_link_speed(lse, lss); peerspeed = get_link_speed(peerlse, peerlss); validate_speed(peerspeed, lsa); if (espeed_cap && peer_espeed_cap) { espeed = get_link_speed_ext(lsee, lses); peerespeed = get_link_speed_ext(peerlsee, peerlses); validate_extended_speed(peerespeed, lsea); } else { if (fdr10e & FDR10 && peerfdr10e & FDR10) { if (!(fdr10a & FDR10)) IBWARN("Peer ports operating at active speed %d rather than FDR10", lsa); } } } } close_port: mad_rpc_close_port2(srcports); exit(0); } rdma-core-56.1/infiniband-diags/ibqueryerrors.c000066400000000000000000000762041477342711600216120ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2007 Xsigo Systems Inc. All rights reserved. * Copyright (c) 2008 Lawrence Livermore National Lab. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * Copyright (c) 2010,2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #include #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include "ibdiag_common.h" #include "ibdiag_sa.h" static struct ibmad_port *ibmad_port; static struct ibmad_ports_pair *ibmad_ports; static char *node_name_map_file = NULL; static nn_map_t *node_name_map = NULL; static char *load_cache_file = NULL; static uint16_t lid2sl_table[sizeof(uint8_t) * 1024 * 48] = { 0 }; static int obtain_sl = 1; static int data_counters; static int data_counters_only; static int port_config; static uint64_t port_guid; static char *port_guid_str; #define SUP_MAX 64 static int sup_total; static enum MAD_FIELDS suppressed_fields[SUP_MAX]; static char *dr_path; static uint8_t node_type_to_print; static unsigned clear_errors, clear_counts, details; #define PRINT_SWITCH 0x1 #define PRINT_CA 0x2 #define PRINT_ROUTER 0x4 #define PRINT_ALL 0xFF /* all nodes default flag */ #define DEFAULT_HALF_WORLD_PR_TIMEOUT (3000) static struct { int nodes_checked; int bad_nodes; int ports_checked; int bad_ports; int pma_query_failures; } summary; #define DEF_THRES_FILE IBDIAG_CONFIG_PATH"/error_thresholds" static const char *threshold_file = DEF_THRES_FILE; /* define a "packet" with threshold values in it */ static uint8_t thresholds[1024]; static char *threshold_str; static unsigned valid_gid(ib_gid_t * gid) { ib_gid_t zero_gid; memset(&zero_gid, 0, sizeof zero_gid); return memcmp(&zero_gid, gid, sizeof(*gid)); } static void set_thres(char *name, uint64_t val) { int f; int n; char tmp[256]; for (f = IB_PC_EXT_ERR_SYM_F; f <= IB_PC_EXT_XMT_WAIT_F; f++) { if (strcmp(name, mad_field_name(f)) == 0) { mad_encode_field(thresholds, f, &val); snprintf(tmp, 255, "[%s = %" PRIu64 "]", name, val); threshold_str = realloc(threshold_str, strlen(threshold_str)+strlen(tmp)+1); if (!threshold_str) { fprintf(stderr, "Failed to allocate memory: " "%s\n", strerror(errno)); exit(1); } n = strlen(threshold_str); strcpy(threshold_str+n, tmp); } } } static void set_thresholds(void) { char buf[1024]; uint64_t val = 0; FILE *thresf = fopen(threshold_file, "r"); char *p_prefix, *p_last; char *name; char *val_str; char str[64]; if (!thresf) return; snprintf(str, 63, "Thresholds: "); threshold_str = malloc(strlen(str)+1); if (!threshold_str) { fprintf(stderr, "Failed to allocate memory: %s\n", strerror(errno)); exit(1); } strcpy(threshold_str, str); while (fgets(buf, sizeof buf, thresf) != NULL) { p_prefix = strtok_r(buf, "\n", &p_last); if (!p_prefix) continue; /* ignore blank lines */ if (*p_prefix == '#') continue; /* ignore comment lines */ name = strtok_r(p_prefix, "=", &p_last); val_str = strtok_r(NULL, "\n", &p_last); val = strtoul(val_str, NULL, 0); set_thres(name, val); } fclose(thresf); } static int exceeds_threshold(int field, uint64_t val) { uint64_t thres = 0; mad_decode_field(thresholds, field, &thres); return (val > thres); } static void print_port_config(ibnd_node_t * node, int portnum) { char width[64], speed[64], state[64], physstate[64]; char remote_str[256]; char link_str[256]; char width_msg[256]; char speed_msg[256]; char ext_port_str[256]; int iwidth, ispeed, fdr10, espeed, istate, iphystate; uint8_t *info; int rc; ibnd_port_t *port = node->ports[portnum]; if (!port) return; iwidth = mad_get_field(port->info, 0, IB_PORT_LINK_WIDTH_ACTIVE_F); ispeed = mad_get_field(port->info, 0, IB_PORT_LINK_SPEED_ACTIVE_F); fdr10 = mad_get_field(port->ext_info, 0, IB_MLNX_EXT_PORT_LINK_SPEED_ACTIVE_F) & FDR10; if (port->node->type == IB_NODE_SWITCH) info = (uint8_t
*)&port->node->ports[0]->info; else info = (uint8_t *)&port->info; espeed = ibnd_get_agg_linkspeedext(info, port->info); istate = mad_get_field(port->info, 0, IB_PORT_STATE_F); iphystate = mad_get_field(port->info, 0, IB_PORT_PHYS_STATE_F); remote_str[0] = '\0'; link_str[0] = '\0'; width_msg[0] = '\0'; speed_msg[0] = '\0'; /* C14-24.2.1 states that a down port allows for invalid data to be * returned for all PortInfo components except PortState and * PortPhysicalState */ if (istate != IB_LINK_DOWN) { if (!espeed) { if (fdr10) sprintf(speed, "10.0 Gbps (FDR10)"); else mad_dump_val(IB_PORT_LINK_SPEED_ACTIVE_F, speed, 64, &ispeed); } else ibnd_dump_agg_linkspeedext(speed, 64, espeed); snprintf(link_str, 256, "(%3s %18s %6s/%8s)", mad_dump_val(IB_PORT_LINK_WIDTH_ACTIVE_F, width, 64, &iwidth), speed, mad_dump_val(IB_PORT_STATE_F, state, 64, &istate), mad_dump_val(IB_PORT_PHYS_STATE_F, physstate, 64, &iphystate)); } else { snprintf(link_str, 256, "( %6s/%8s)", mad_dump_val(IB_PORT_STATE_F, state, 64, &istate), mad_dump_val(IB_PORT_PHYS_STATE_F, physstate, 64, &iphystate)); } if (port->remoteport) { char *rem_node_name = NULL; if (port->remoteport->ext_portnum) snprintf(ext_port_str, 256, "%d", port->remoteport->ext_portnum); else ext_port_str[0] = '\0'; get_max_msg(width_msg, speed_msg, 256, port); rem_node_name = remap_node_name(node_name_map, port->remoteport->node->guid, port->remoteport->node-> nodedesc); rc = snprintf(remote_str, sizeof(remote_str), "0x%016" PRIx64 " %6d %4d[%2s] \"%s\" (%s %s)\n", port->remoteport->guid, port->remoteport->base_lid ? port->remoteport-> base_lid : port->remoteport->node->smalid, port->remoteport->portnum, ext_port_str, rem_node_name, width_msg, speed_msg); if (rc > sizeof(remote_str)) fprintf(stderr, "WARN: string buffer overflow\n"); free(rem_node_name); } else snprintf(remote_str, 256, " [ ] \"\" ( )\n"); if (port->ext_portnum) snprintf(ext_port_str, 256, "%d", port->ext_portnum); else ext_port_str[0] = '\0'; if (node->type == IB_NODE_SWITCH) printf(" Link info: %6d", node->smalid); else printf(" Link info: %6d", port->base_lid); printf("%4d[%2s] ==%s==> %s", port->portnum, ext_port_str, link_str, remote_str); } static int suppress(enum MAD_FIELDS field) { int i = 0; for (i = 0; i < sup_total; i++) if (field == suppressed_fields[i]) return 1; return 0; } static void report_suppressed(void) { int i = 0; printf("## Suppressed:"); for (i = 0; i < sup_total; i++) printf(" %s", mad_field_name(suppressed_fields[i])); printf("\n"); } static int print_summary(void) { printf("\n## Summary: %d nodes checked, %d bad nodes found\n", summary.nodes_checked, summary.bad_nodes); printf("## %d ports checked, %d ports have errors beyond threshold\n", summary.ports_checked, summary.bad_ports); printf("## %s\n", threshold_str); if (summary.pma_query_failures) printf("## %d PMA query failures\n", summary.pma_query_failures); report_suppressed(); return (summary.bad_ports); } static void insert_lid2sl_table(struct sa_query_result *r) { unsigned int i; for (i = 0; i < r->result_cnt; i++) { ib_path_rec_t *p_pr = (ib_path_rec_t *)sa_get_query_rec(r->p_result_madw, i); lid2sl_table[be16toh(p_pr->dlid)] = ib_path_rec_sl(p_pr); } } static int path_record_query(ib_gid_t sgid,uint64_t dguid) { ib_path_rec_t pr; __be64 comp_mask = 0; uint8_t reversible = 0; struct sa_handle *h; h = sa_get_handle(ibmad_ports->gsi.ca_name); if (!h) return -1; ibd_timeout = DEFAULT_HALF_WORLD_PR_TIMEOUT; memset(&pr, 0, sizeof(pr)); CHECK_AND_SET_GID(sgid, pr.sgid, PR, SGID); if(dguid) { 
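/* A destination port GUID was supplied: overwrite the GUID half of the
 * source GID with dguid to form the DGID, keeping the subnet prefix. */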
mad_encode_field(sgid.raw, IB_GID_GUID_F, &dguid); CHECK_AND_SET_GID(sgid, pr.dgid, PR, DGID); } CHECK_AND_SET_VAL(1, 8, -1, pr.num_path, PR, NUMBPATH);/*to get only one PathRecord for each source and destination pair*/ CHECK_AND_SET_VAL(1, 8, -1, reversible, PR, REVERSIBLE);/*for a reversible path*/ pr.num_path |= reversible << 7; struct sa_query_result result; int ret = sa_query(h, IB_MAD_METHOD_GET_TABLE, (uint16_t)IB_SA_ATTR_PATHRECORD,0,be64toh(comp_mask),ibd_sakey, &pr, sizeof(pr), &result); if (ret) { sa_free_handle(h); fprintf(stderr, "Query SA failed: %s; sa call path_query failed\n", strerror(ret)); return ret; } if (result.status != IB_SA_MAD_STATUS_SUCCESS) { sa_report_err(result.status); ret = EIO; goto Exit; } insert_lid2sl_table(&result); Exit: sa_free_handle(h); sa_free_result_mad(&result); return ret; } static int query_and_dump(char *buf, size_t size, ib_portid_t * portid, char *node_name, int portnum, const char *attr_name, uint16_t attr_id, int start_field, int end_field) { uint8_t pc[1024]; uint32_t val = 0; int i, n; memset(pc, 0, sizeof(pc)); if (!pma_query_via(pc, portid, portnum, ibd_timeout, attr_id, ibmad_port)) { IBWARN("%s query failed on %s, %s port %d", attr_name, node_name, portid2str(portid), portnum); summary.pma_query_failures++; return 0; } for (n = 0, i = start_field; i < end_field; i++) { mad_decode_field(pc, i, (void *)&val); if (val) n += snprintf(buf + n, size - n, " [%s == %u]", mad_field_name(i), val); } return n; } static int check_threshold(uint8_t *pc, uint8_t *pce, uint32_t cap_mask2, int i, int ext_i, int *n, char *str, size_t size) { uint32_t val32 = 0; uint64_t val64 = 0; int is_exceeds = 0; float val = 0; const char *unit = ""; if (htonl(cap_mask2) & IB_PM_IS_ADDL_PORT_CTRS_EXT_SUP) { mad_decode_field(pce, ext_i, (void *)&val64); if (exceeds_threshold(ext_i, val64)) { unit = conv_cnt_human_readable(val64, &val, 0); *n += snprintf(str + *n, size - *n, " [%s == %" PRIu64 " (%5.3f%s)]", mad_field_name(ext_i), val64, val, unit); is_exceeds = 1; } } else { mad_decode_field(pc, i, (void *)&val32); if (exceeds_threshold(ext_i, val32)) { *n += snprintf(str + *n, size - *n, " [%s == %u]", mad_field_name(i), val32); is_exceeds = 1; } } return is_exceeds; } static int print_results(ib_portid_t * portid, char *node_name, ibnd_node_t * node, uint8_t * pc, int portnum, int *header_printed, uint8_t *pce, __be16 cap_mask, uint32_t cap_mask2) { char buf[2048]; char *str = buf; int i, ext_i, n; for (n = 0, i = IB_PC_ERR_SYM_F, ext_i = IB_PC_EXT_ERR_SYM_F; i <= IB_PC_VL15_DROPPED_F; i++, ext_i++ ) { if (suppress(i)) continue; /* this is not a counter, skip it */ if (i == IB_PC_COUNTER_SELECT2_F) { ext_i--; continue; } if (check_threshold(pc, pce, cap_mask2, i, ext_i, &n, str, sizeof(buf))) { /* If there are PortXmitDiscards, get details (if supported) */ if (i == IB_PC_XMT_DISCARDS_F && details) { n += query_and_dump(str + n, sizeof(buf) - n, portid, node_name, portnum, "PortXmitDiscardDetails", IB_GSI_PORT_XMIT_DISCARD_DETAILS, IB_PC_RCV_LOCAL_PHY_ERR_F, IB_PC_RCV_ERR_LAST_F); /* If there are PortRcvErrors, get details (if supported) */ } else if (i == IB_PC_ERR_RCV_F && details) { n += query_and_dump(str + n, sizeof(buf) - n, portid, node_name, portnum, "PortRcvErrorDetails", IB_GSI_PORT_RCV_ERROR_DETAILS, IB_PC_XMT_INACT_DISC_F, IB_PC_XMT_DISC_LAST_F); } } } if (!suppress(IB_PC_XMT_WAIT_F)) { check_threshold(pc, pce, cap_mask2, IB_PC_XMT_WAIT_F, IB_PC_EXT_XMT_WAIT_F, &n, str, sizeof(buf)); } /* if we found errors. 
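 * When --data was requested, the matching data counters (bytes and
 * packets, using the 64-bit extended set when the PMA supports it)
 * are appended to the same line below.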
*/ if (n != 0) { if (data_counters) { uint8_t *pkt = pc; int start_field = IB_PC_XMT_BYTES_F; int end_field = IB_PC_RCV_PKTS_F; if (pce) { pkt = pce; start_field = IB_PC_EXT_XMT_BYTES_F; if (cap_mask & IB_PM_EXT_WIDTH_SUPPORTED) end_field = IB_PC_EXT_RCV_MPKTS_F; else end_field = IB_PC_EXT_RCV_PKTS_F; } for (i = start_field; i <= end_field; i++) { uint64_t val64 = 0; float val = 0; const char *unit = ""; mad_decode_field(pkt, i, (void *)&val64); if (val64) { int data = 0; if (i == IB_PC_EXT_XMT_BYTES_F || i == IB_PC_EXT_RCV_BYTES_F || i == IB_PC_XMT_BYTES_F || i == IB_PC_RCV_BYTES_F) data = 1; unit = conv_cnt_human_readable(val64, &val, data); n += snprintf(str + n, sizeof(buf) - n, " [%s == %" PRIu64 " (%5.3f%s)]", mad_field_name(i), val64, val, unit); } } } if (!*header_printed) { if (node->type == IB_NODE_SWITCH) printf("Errors for 0x%" PRIx64 " \"%s\"\n", node->ports[0]->guid, node_name); else printf("Errors for \"%s\"\n", node_name); *header_printed = 1; summary.bad_nodes++; } if (portnum == 0xFF) { if (node->type == IB_NODE_SWITCH) printf(" GUID 0x%" PRIx64 " port ALL:%s\n", node->ports[0]->guid, str); } else { printf(" GUID 0x%" PRIx64 " port %d:%s\n", node->ports[portnum]->guid, portnum, str); if (port_config) print_port_config(node, portnum); summary.bad_ports++; } } return (n); } static int query_cap_mask(ib_portid_t * portid, char *node_name, int portnum, __be16 * cap_mask, uint32_t * cap_mask2) { uint8_t pc[1024] = { 0 }; __be16 rc_cap_mask; __be32 rc_cap_mask2; portid->sl = lid2sl_table[portid->lid]; /* PerfMgt ClassPortInfo is a required attribute */ if (!pma_query_via(pc, portid, portnum, ibd_timeout, CLASS_PORT_INFO, ibmad_port)) { IBWARN("classportinfo query failed on %s, %s port %d", node_name, portid2str(portid), portnum); summary.pma_query_failures++; return -1; } /* ClassPortInfo should be supported as part of libibmad */ memcpy(&rc_cap_mask, pc + 2, sizeof(rc_cap_mask)); /* CapabilityMask */ memcpy(&rc_cap_mask2, pc + 4, sizeof(rc_cap_mask2)); /* CapabilityMask2 */ *cap_mask = rc_cap_mask; *cap_mask2 = ntohl(rc_cap_mask2) >> 5; return 0; } static int print_data_cnts(ib_portid_t * portid, __be16 cap_mask, char *node_name, ibnd_node_t * node, int portnum, int *header_printed) { uint8_t pc[1024]; int i; int start_field = IB_PC_XMT_BYTES_F; int end_field = IB_PC_RCV_PKTS_F; memset(pc, 0, 1024); portid->sl = lid2sl_table[portid->lid]; if (cap_mask & (IB_PM_EXT_WIDTH_SUPPORTED | IB_PM_EXT_WIDTH_NOIETF_SUP)) { if (!pma_query_via(pc, portid, portnum, ibd_timeout, IB_GSI_PORT_COUNTERS_EXT, ibmad_port)) { IBWARN("IB_GSI_PORT_COUNTERS_EXT query failed on %s, %s port %d", node_name, portid2str(portid), portnum); summary.pma_query_failures++; return (1); } start_field = IB_PC_EXT_XMT_BYTES_F; if (cap_mask & IB_PM_EXT_WIDTH_SUPPORTED) end_field = IB_PC_EXT_RCV_MPKTS_F; else end_field = IB_PC_EXT_RCV_PKTS_F; } else { if (!pma_query_via(pc, portid, portnum, ibd_timeout, IB_GSI_PORT_COUNTERS, ibmad_port)) { IBWARN("IB_GSI_PORT_COUNTERS query failed on %s, %s port %d", node_name, portid2str(portid), portnum); summary.pma_query_failures++; return (1); } start_field = IB_PC_XMT_BYTES_F; end_field = IB_PC_RCV_PKTS_F; } if (!*header_printed) { printf("Data Counters for 0x%" PRIx64 " \"%s\"\n", node->guid, node_name); *header_printed = 1; } if (portnum == 0xFF) printf(" GUID 0x%" PRIx64 " port ALL:", node->guid); else printf(" GUID 0x%" PRIx64 " port %d:", node->guid, portnum); for (i = start_field; i <= end_field; i++) { uint64_t val64 = 0; float val = 0; const char *unit = ""; int data = 
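/* IBA data counters (PortXmitData etc.) count in units of four
 * octets; flag them so conv_cnt_human_readable() scales them. */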
0; mad_decode_field(pc, i, (void *)&val64); if (i == IB_PC_EXT_XMT_BYTES_F || i == IB_PC_EXT_RCV_BYTES_F || i == IB_PC_XMT_BYTES_F || i == IB_PC_RCV_BYTES_F) data = 1; unit = conv_cnt_human_readable(val64, &val, data); printf(" [%s == %" PRIu64 " (%5.3f%s)]", mad_field_name(i), val64, val, unit); } printf("\n"); if (portnum != 0xFF && port_config) print_port_config(node, portnum); return (0); } static int print_errors(ib_portid_t * portid, __be16 cap_mask, uint32_t cap_mask2, char *node_name, ibnd_node_t * node, int portnum, int *header_printed) { uint8_t pc[1024]; uint8_t pce[1024]; uint8_t *pc_ext = NULL; memset(pc, 0, 1024); memset(pce, 0, 1024); portid->sl = lid2sl_table[portid->lid]; if (!pma_query_via(pc, portid, portnum, ibd_timeout, IB_GSI_PORT_COUNTERS, ibmad_port)) { IBWARN("IB_GSI_PORT_COUNTERS query failed on %s, %s port %d", node_name, portid2str(portid), portnum); summary.pma_query_failures++; return (0); } if (cap_mask & (IB_PM_EXT_WIDTH_SUPPORTED | IB_PM_EXT_WIDTH_NOIETF_SUP)) { if (!pma_query_via(pce, portid, portnum, ibd_timeout, IB_GSI_PORT_COUNTERS_EXT, ibmad_port)) { IBWARN("IB_GSI_PORT_COUNTERS_EXT query failed on %s, %s port %d", node_name, portid2str(portid), portnum); summary.pma_query_failures++; return (0); } pc_ext = pce; } if (!(cap_mask & IB_PM_PC_XMIT_WAIT_SUP)) { /* if PortCounters:PortXmitWait not supported clear this counter */ uint32_t foo = 0; mad_encode_field(pc, IB_PC_XMT_WAIT_F, &foo); } return (print_results(portid, node_name, node, pc, portnum, header_printed, pc_ext, cap_mask, cap_mask2)); } static uint8_t *reset_pc_ext(void *rcvbuf, ib_portid_t *dest, int port, unsigned mask, unsigned timeout, const struct ibmad_port *srcport) { ib_rpc_t rpc = { 0 }; int lid = dest->lid; DEBUG("lid %u port %d mask 0x%x", lid, port, mask); if (lid == -1) { IBWARN("only lid routed is supported"); return NULL; } if (!mask) mask = ~0; rpc.mgtclass = IB_PERFORMANCE_CLASS; rpc.method = IB_MAD_METHOD_SET; rpc.attr.id = IB_GSI_PORT_COUNTERS_EXT; memset(rcvbuf, 0, IB_MAD_SIZE); /* Same for attribute IDs */ mad_set_field(rcvbuf, 0, IB_PC_EXT_PORT_SELECT_F, port); mad_set_field(rcvbuf, 0, IB_PC_EXT_COUNTER_SELECT_F, mask); mask = mask >> 16; mad_set_field(rcvbuf, 0, IB_PC_EXT_COUNTER_SELECT2_F, mask); rpc.attr.mod = 0; rpc.timeout = timeout; rpc.datasz = IB_PC_DATA_SZ; rpc.dataoffs = IB_PC_DATA_OFFS; if (!dest->qp) dest->qp = 1; if (!dest->qkey) dest->qkey = IB_DEFAULT_QP1_QKEY; return mad_rpc(srcport, &rpc, dest, rcvbuf, rcvbuf); } static void clear_port(ib_portid_t * portid, __be16 cap_mask, uint32_t cap_mask2, char *node_name, int port) { uint8_t pc[1024] = { 0 }; /* bits defined in Table 228 PortCounters CounterSelect and * CounterSelect2 */ uint32_t mask = 0; if (clear_errors) { mask |= 0xFFF; if (cap_mask & IB_PM_PC_XMIT_WAIT_SUP) mask |= 0x10000; } if (clear_counts) mask |= 0xF000; if (mask) if (!performance_reset_via(pc, portid, port, mask, ibd_timeout, IB_GSI_PORT_COUNTERS, ibmad_port)) fprintf(stderr, "Failed to reset errors %s port %d\n", node_name, port); if (clear_errors && details) { memset(pc, 0, 1024); performance_reset_via(pc, portid, port, 0xf, ibd_timeout, IB_GSI_PORT_XMIT_DISCARD_DETAILS, ibmad_port); memset(pc, 0, 1024); performance_reset_via(pc, portid, port, 0x3f, ibd_timeout, IB_GSI_PORT_RCV_ERROR_DETAILS, ibmad_port); } if (cap_mask & (IB_PM_EXT_WIDTH_SUPPORTED | IB_PM_EXT_WIDTH_NOIETF_SUP)) { mask = 0; if (clear_counts) { if (cap_mask & IB_PM_EXT_WIDTH_SUPPORTED) mask = 0xFF; else mask = 0x0F; } if (clear_errors && (htonl(cap_mask2) & 
IB_PM_IS_ADDL_PORT_CTRS_EXT_SUP)) { mask |= 0xfff0000; if (cap_mask & IB_PM_PC_XMIT_WAIT_SUP) mask |= (1 << 28); } if (mask && !reset_pc_ext(pc, portid, port, mask, ibd_timeout, ibmad_port)) fprintf(stderr, "Failed to reset extended data counters %s, " "%s port %d\n", node_name, portid2str(portid), port); } } static void print_node(ibnd_node_t *node, void *user_data) { int header_printed = 0; int p = 0; int startport = 1; int type = 0; int all_port_sup = 0; ib_portid_t portid = { 0 }; __be16 cap_mask = 0; uint32_t cap_mask2 = 0; char *node_name = NULL; switch (node->type) { case IB_NODE_SWITCH: type = PRINT_SWITCH; break; case IB_NODE_CA: type = PRINT_CA; break; case IB_NODE_ROUTER: type = PRINT_ROUTER; break; } if ((type & node_type_to_print) == 0) return; if (node->type == IB_NODE_SWITCH && node->smaenhsp0) startport = 0; node_name = remap_node_name(node_name_map, node->guid, node->nodedesc); if (node->type == IB_NODE_SWITCH) { ib_portid_set(&portid, node->smalid, 0, 0); p = 0; } else { for (p = 1; p <= node->numports; p++) { if (node->ports[p]) { ib_portid_set(&portid, node->ports[p]->base_lid, 0, 0); break; } } } if ((query_cap_mask(&portid, node_name, p, &cap_mask, &cap_mask2) == 0) && (cap_mask & IB_PM_ALL_PORT_SELECT)) all_port_sup = 1; if (data_counters_only) { for (p = startport; p <= node->numports; p++) { if (node->ports[p]) { if (node->type == IB_NODE_SWITCH) ib_portid_set(&portid, node->smalid, 0, 0); else ib_portid_set(&portid, node->ports[p]->base_lid, 0, 0); print_data_cnts(&portid, cap_mask, node_name, node, p, &header_printed); summary.ports_checked++; if (!all_port_sup) clear_port(&portid, cap_mask, cap_mask2, node_name, p); } } } else { if (all_port_sup) if (!print_errors(&portid, cap_mask, cap_mask2, node_name, node, 0xFF, &header_printed)) { summary.ports_checked += node->numports; goto clear; } for (p = startport; p <= node->numports; p++) { if (node->ports[p]) { if (node->type == IB_NODE_SWITCH) ib_portid_set(&portid, node->smalid, 0, 0); else ib_portid_set(&portid, node->ports[p]->base_lid, 0, 0); print_errors(&portid, cap_mask, cap_mask2, node_name, node, p, &header_printed); summary.ports_checked++; if (!all_port_sup) clear_port(&portid, cap_mask, cap_mask2, node_name, p); } } } clear: summary.nodes_checked++; if (all_port_sup) clear_port(&portid, cap_mask, cap_mask2, node_name, 0xFF); free(node_name); } static void add_suppressed(enum MAD_FIELDS field) { if (sup_total >= SUP_MAX) { IBWARN("Maximum (%d) fields have been suppressed; skipping %s", sup_total, mad_field_name(field)); return; } suppressed_fields[sup_total++] = field; } static void calculate_suppressed_fields(char *str) { enum MAD_FIELDS f; char *val, *lasts = NULL; char *tmp = strdup(str); val = strtok_r(tmp, ",", &lasts); while (val) { for (f = IB_PC_FIRST_F; f <= IB_PC_LAST_F; f++) if (strcmp(val, mad_field_name(f)) == 0) add_suppressed(f); val = strtok_r(NULL, ",", &lasts); } free(tmp); } static int process_opt(void *context, int ch) { struct ibnd_config *cfg = context; switch (ch) { case 's': calculate_suppressed_fields(optarg); break; case 'c': /* Right now this is the only "common" error */ add_suppressed(IB_PC_ERR_SWITCH_REL_F); break; case 1: node_name_map_file = strdup(optarg); if (node_name_map_file == NULL) IBEXIT("out of memory, strdup for node_name_map_file name failed"); break; case 2: data_counters++; break; case 3: node_type_to_print |= PRINT_SWITCH; break; case 4: node_type_to_print |= PRINT_CA; break; case 5: node_type_to_print |= PRINT_ROUTER; break; case 6: details = 1; break; case 
7: load_cache_file = strdup(optarg); break; case 8: threshold_file = strdup(optarg); break; case 9: data_counters_only = 1; break; case 10: obtain_sl = 0; break; case 'G': case 'S': port_guid_str = optarg; port_guid = strtoull(optarg, NULL, 0); break; case 'D': dr_path = strdup(optarg); break; case 'r': port_config++; break; case 'R': /* nop */ break; case 'k': clear_errors = 1; break; case 'K': clear_counts = 1; break; case 'o': cfg->max_smps = strtoul(optarg, NULL, 0); break; default: return -1; } return 0; } int main(int argc, char **argv) { struct ibnd_config config = { 0 }; int resolved = -1; ib_portid_t portid = { 0 }; ib_portid_t self_portid = { 0 }; int rc = 0; ibnd_fabric_t *fabric = NULL; ib_gid_t self_gid; int port = 0; int mgmt_classes[4] = { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS, IB_SA_CLASS, IB_PERFORMANCE_CLASS }; const struct ibdiag_opt opts[] = { {"suppress", 's', 1, "", "suppress errors listed"}, {"suppress-common", 'c', 0, NULL, "suppress some of the common counters"}, {"node-name-map", 1, 1, "", "node name map file"}, {"port-guid", 'G', 1, "", "report the node containing the port specified by "}, {"", 'S', 1, "", "Same as \"-G\" for backward compatibility"}, {"Direct", 'D', 1, "", "report the node containing the port specified by "}, {"skip-sl", 10, 0, NULL,"don't obtain SL to all destinations"}, {"report-port", 'r', 0, NULL, "report port link information"}, {"threshold-file", 8, 1, NULL, "specify an alternate threshold file, default: " DEF_THRES_FILE}, {"GNDN", 'R', 0, NULL, "(This option is obsolete and does nothing)"}, {"data", 2, 0, NULL, "include data counters for ports with errors"}, {"switch", 3, 0, NULL, "print data for switches only"}, {"ca", 4, 0, NULL, "print data for CA's only"}, {"router", 5, 0, NULL, "print data for routers only"}, {"details", 6, 0, NULL, "include transmit discard details"}, {"counters", 9, 0, NULL, "print data counters only"}, {"clear-errors", 'k', 0, NULL, "Clear error counters after read"}, {"clear-counts", 'K', 0, NULL, "Clear data counters after read"}, {"load-cache", 7, 1, "", "filename of ibnetdiscover cache to load"}, {"outstanding_smps", 'o', 1, NULL, "specify the number of outstanding SMP's which should be " "issued during the scan"}, {} }; char usage_args[] = ""; memset(suppressed_fields, 0, sizeof suppressed_fields); ibdiag_process_opts(argc, argv, &config, "cDGKLnRrSs", opts, process_opt, usage_args, NULL); argc -= optind; argv += optind; if (!node_type_to_print) node_type_to_print = PRINT_ALL; ibmad_ports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 4, 0); if (!ibmad_ports) IBEXIT("Failed to open port; %s:%d\n", ibd_ca, ibd_ca_port); ibmad_port = ibmad_ports->gsi.port; smp_mkey_set(ibmad_port, ibd_mkey); if (ibd_timeout) { mad_rpc_set_timeout(ibmad_port, ibd_timeout); config.timeout_ms = ibd_timeout; } config.flags = ibd_ibnetdisc_flags; config.mkey = ibd_mkey; if (dr_path && load_cache_file) { mad_rpc_close_port2(ibmad_ports); fprintf(stderr, "Cannot specify cache and direct route path\n"); exit(-1); } if (resolve_self(ibmad_ports->gsi.ca_name, ibd_ca_port, &self_portid, &port, &self_gid.raw) < 0) { mad_rpc_close_port2(ibmad_ports); IBEXIT("can't resolve self port %s", argv[0]); } node_name_map = open_node_name_map(node_name_map_file); /* limit the scan the fabric around the target */ if (dr_path) { if ((resolved = resolve_portid_str(ibmad_ports->gsi.ca_name, ibd_ca_port, &portid, dr_path, IB_DEST_DRPATH, NULL, ibmad_port)) < 0) IBWARN("Failed to resolve %s; attempting full scan", dr_path); } else if (port_guid_str) { 
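/* A port GUID was given: resolve it to a port id so discovery can be
 * limited to the area around that node, and remember its SL when SL
 * resolution is enabled. */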
if ((resolved = resolve_portid_str(ibmad_ports->gsi.ca_name, ibd_ca_port, &portid, port_guid_str, IB_DEST_GUID, ibd_sm_id, ibmad_port)) < 0) IBWARN("Failed to resolve %s; attempting full scan", port_guid_str); if(obtain_sl) lid2sl_table[portid.lid] = portid.sl; } mad_rpc_close_port2(ibmad_ports); if (load_cache_file) { if ((fabric = ibnd_load_fabric(load_cache_file, 0)) == NULL) { fprintf(stderr, "loading cached fabric failed\n"); rc = -1; goto close_name_map; } } else { if (resolved >= 0) { if (!config.max_hops) config.max_hops = 1; if (!(fabric = ibnd_discover_fabric(ibd_ca, ibd_ca_port, &portid, &config))) IBWARN("Single node discover failed;" " attempting full scan"); } if (!fabric && !(fabric = ibnd_discover_fabric(ibd_ca, ibd_ca_port, NULL, &config))) { fprintf(stderr, "discover failed\n"); rc = -1; goto close_name_map; } } set_thresholds(); /* reopen the global ibmad_port */ ibmad_ports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 4, dr_path ? 1 : 0); if (!ibmad_ports) { ibnd_destroy_fabric(fabric); close_node_name_map(node_name_map); IBEXIT("Failed to reopen port: %s:%d\n", ibd_ca, ibd_ca_port); } ibmad_port = ibmad_ports->gsi.port; if (!ibmad_port) { ibnd_destroy_fabric(fabric); close_node_name_map(node_name_map); IBEXIT("Failed to reopen port: %s:%d\n", ibd_ca, ibd_ca_port); } smp_mkey_set(ibmad_port, ibd_mkey); if (ibd_timeout) mad_rpc_set_timeout(ibmad_port, ibd_timeout); if (port_guid_str) { ibnd_port_t *ndport = ibnd_find_port_guid(fabric, port_guid); if (ndport) print_node(ndport->node, NULL); else fprintf(stderr, "Failed to find node: %s\n", port_guid_str); } else if (dr_path) { ibnd_port_t *ndport; uint8_t ni[IB_SMP_DATA_SIZE] = { 0 }; if (!smp_query_via(ni, &portid, IB_ATTR_NODE_INFO, 0, ibd_timeout, ibmad_port)) { fprintf(stderr, "Failed to query local Node Info\n"); goto close_port; } mad_decode_field(ni, IB_NODE_PORT_GUID_F, &(port_guid)); ndport = ibnd_find_port_guid(fabric, port_guid); if (ndport) { if(obtain_sl) if(path_record_query(self_gid,ndport->guid)) goto close_port; print_node(ndport->node, NULL); } else fprintf(stderr, "Failed to find node: %s\n", dr_path); } else { if(obtain_sl) if(path_record_query(self_gid,0)) goto close_port; ibnd_iter_nodes(fabric, print_node, NULL); } rc = print_summary(); if (rc) rc = 1; close_port: mad_rpc_close_port2(ibmad_ports); ibnd_destroy_fabric(fabric); close_name_map: close_node_name_map(node_name_map); exit(rc); } rdma-core-56.1/infiniband-diags/ibroute.c000066400000000000000000000316721477342711600203460ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2009-2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include #include #include #include #include #include "ibdiag_common.h" static struct ibmad_port *srcport; static struct ibmad_ports_pair *srcports; static int brief, dump_all, multicast; static char *node_name_map_file = NULL; static nn_map_t *node_name_map = NULL; /*******************************************/ static const char *check_switch(ib_portid_t *portid, unsigned int *nports, uint64_t *guid, uint8_t *sw, char *nd) { uint8_t ni[IB_SMP_DATA_SIZE] = { 0 }; int type; DEBUG("checking node type"); if (!smp_query_via(ni, portid, IB_ATTR_NODE_INFO, 0, 0, srcport)) { xdump(stderr, "nodeinfo\n", ni, sizeof ni); return "node info failed: valid addr?"; } if (!smp_query_via(nd, portid, IB_ATTR_NODE_DESC, 0, 0, srcport)) return "node desc failed"; mad_decode_field(ni, IB_NODE_TYPE_F, &type); if (type != IB_NODE_SWITCH) return "not a switch"; DEBUG("Gathering information about switch"); mad_decode_field(ni, IB_NODE_NPORTS_F, nports); mad_decode_field(ni, IB_NODE_GUID_F, guid); if (!smp_query_via(sw, portid, IB_ATTR_SWITCH_INFO, 0, 0, srcport)) return "switch info failed: is a switch node?"; return NULL; } #define IB_MLIDS_IN_BLOCK (IB_SMP_DATA_SIZE/2) static int dump_mlid(char *str, int strlen, unsigned mlid, unsigned nports, __be16 mft[16][IB_MLIDS_IN_BLOCK]) { uint16_t mask; unsigned i, chunk, bit, nonzero = 0; if (brief) { int n = 0; unsigned chunks = ALIGN(nports + 1, 16) / 16; for (i = 0; i < chunks; i++) { mask = ntohs(mft[i][mlid % IB_MLIDS_IN_BLOCK]); if (mask) nonzero++; n += snprintf(str + n, strlen - n, "%04hx", mask); if (n >= strlen) { n = strlen; break; } } if (!nonzero && !dump_all) { str[0] = 0; return 0; } return n; } for (i = 0; i <= nports; i++) { chunk = i / 16; bit = i % 16; mask = ntohs(mft[chunk][mlid % IB_MLIDS_IN_BLOCK]); if (mask) nonzero++; str[i * 2] = (mask & (1 << bit)) ? 
'x' : ' '; str[i * 2 + 1] = ' '; } if (!nonzero && !dump_all) { str[0] = 0; return 0; } str[i * 2] = 0; return i * 2; } static __be16 mft[16][IB_MLIDS_IN_BLOCK]; static const char *dump_multicast_tables(ib_portid_t *portid, unsigned startlid, unsigned endlid) { char nd[IB_SMP_DATA_SIZE + 1] = { 0 }; uint8_t sw[IB_SMP_DATA_SIZE] = { 0 }; char str[512], *s; const char *err; uint64_t nodeguid; uint32_t mod; unsigned block, i, j, e, nports, cap, chunks, startblock, lastblock, top; char *mapnd = NULL; int n = 0; if ((err = check_switch(portid, &nports, &nodeguid, sw, nd))) return err; mad_decode_field(sw, IB_SW_MCAST_FDB_CAP_F, &cap); mad_decode_field(sw, IB_SW_MCAST_FDB_TOP_F, &top); if (!endlid || endlid > IB_MIN_MCAST_LID + cap - 1) endlid = IB_MIN_MCAST_LID + cap - 1; if (!dump_all && top && top < endlid) { if (top < IB_MIN_MCAST_LID - 1) IBWARN("illegal top mlid %x", top); else endlid = top; } if (!startlid) startlid = IB_MIN_MCAST_LID; else if (startlid < IB_MIN_MCAST_LID) { IBWARN("illegal start mlid %x, set to %x", startlid, IB_MIN_MCAST_LID); startlid = IB_MIN_MCAST_LID; } if (endlid > IB_MAX_MCAST_LID) { IBWARN("illegal end mlid %x, truncate to %x", endlid, IB_MAX_MCAST_LID); endlid = IB_MAX_MCAST_LID; } mapnd = remap_node_name(node_name_map, nodeguid, nd); printf("Multicast mlids [0x%x-0x%x] of switch %s guid 0x%016" PRIx64 " (%s):\n", startlid, endlid, portid2str(portid), nodeguid, mapnd); if (brief) printf(" MLid Port Mask\n"); else { if (nports > 9) { for (i = 0, s = str; i <= nports; i++) { *s++ = (i % 10) ? ' ' : '0' + i / 10; *s++ = ' '; } *s = 0; printf(" %s\n", str); } for (i = 0, s = str; i <= nports; i++) s += sprintf(s, "%d ", i % 10); printf(" Ports: %s\n", str); printf(" MLid\n"); } if (ibverbose) printf("Switch multicast mlid capability is %d top is 0x%x\n", cap, top); chunks = ALIGN(nports + 1, 16) / 16; startblock = startlid / IB_MLIDS_IN_BLOCK; lastblock = endlid / IB_MLIDS_IN_BLOCK; for (block = startblock; block <= lastblock; block++) { for (j = 0; j < chunks; j++) { int status; mod = (block - IB_MIN_MCAST_LID / IB_MLIDS_IN_BLOCK) | (j << 28); DEBUG("reading block %x chunk %d mod %x", block, j, mod); if (!smp_query_status_via (mft + j, portid, IB_ATTR_MULTICASTFORWTBL, mod, 0, &status, srcport)) { fprintf(stderr, "SubnGet() failed" "; MAD status 0x%x AM 0x%x\n", status, mod); free(mapnd); return NULL; } } i = block * IB_MLIDS_IN_BLOCK; e = i + IB_MLIDS_IN_BLOCK; if (i < startlid) i = startlid; if (e > endlid + 1) e = endlid + 1; for (; i < e; i++) { if (dump_mlid(str, sizeof str, i, nports, mft) == 0) continue; printf("0x%04x %s\n", i, str); n++; } } printf("%d %smlids dumped \n", n, dump_all ? 
"" : "valid "); free(mapnd); return NULL; } static int dump_lid(char *str, int strlen, int lid, int valid) { char nd[IB_SMP_DATA_SIZE + 1] = { 0 }; uint8_t ni[IB_SMP_DATA_SIZE] = { 0 }; uint8_t pi[IB_SMP_DATA_SIZE] = { 0 }; ib_portid_t lidport = { 0 }; static int last_port_lid, base_port_lid; char ntype[50], sguid[30]; static uint64_t portguid; uint64_t nodeguid; int baselid, lmc, type; char *mapnd = NULL; int rc; if (brief) { str[0] = 0; return 0; } if (lid <= last_port_lid) { if (!valid) return snprintf(str, strlen, ": (path #%d - illegal port)", lid - base_port_lid); else if (!portguid) return snprintf(str, strlen, ": (path #%d out of %d)", lid - base_port_lid + 1, last_port_lid - base_port_lid + 1); else { return snprintf(str, strlen, ": (path #%d out of %d: portguid %s)", lid - base_port_lid + 1, last_port_lid - base_port_lid + 1, mad_dump_val(IB_NODE_PORT_GUID_F, sguid, sizeof sguid, &portguid)); } } if (!valid) return snprintf(str, strlen, ": (illegal port)"); portguid = 0; lidport.lid = lid; if (!smp_query_via(nd, &lidport, IB_ATTR_NODE_DESC, 0, 100, srcport) || !smp_query_via(pi, &lidport, IB_ATTR_PORT_INFO, 0, 100, srcport) || !smp_query_via(ni, &lidport, IB_ATTR_NODE_INFO, 0, 100, srcport)) return snprintf(str, strlen, ": (unknown node and type)"); mad_decode_field(ni, IB_NODE_GUID_F, &nodeguid); mad_decode_field(ni, IB_NODE_PORT_GUID_F, &portguid); mad_decode_field(ni, IB_NODE_TYPE_F, &type); mad_decode_field(pi, IB_PORT_LID_F, &baselid); mad_decode_field(pi, IB_PORT_LMC_F, &lmc); if (lmc > 0) { base_port_lid = baselid; last_port_lid = baselid + (1 << lmc) - 1; } mapnd = remap_node_name(node_name_map, nodeguid, nd); rc = snprintf(str, strlen, ": (%s portguid %s: '%s')", mad_dump_val(IB_NODE_TYPE_F, ntype, sizeof ntype, &type), mad_dump_val(IB_NODE_PORT_GUID_F, sguid, sizeof sguid, &portguid), mapnd); free(mapnd); return rc; } static const char *dump_unicast_tables(ib_portid_t *portid, int startlid, int endlid) { uint8_t lft[IB_SMP_DATA_SIZE] = { 0 }; char nd[IB_SMP_DATA_SIZE + 1] = { 0 }; uint8_t sw[IB_SMP_DATA_SIZE] = { 0 }; char str[200]; const char *s; uint64_t nodeguid; int block, i, e, top; unsigned nports; int n = 0, startblock, endblock; char *mapnd = NULL; if ((s = check_switch(portid, &nports, &nodeguid, sw, nd))) return s; mad_decode_field(sw, IB_SW_LINEAR_FDB_TOP_F, &top); if (!endlid || endlid > top) endlid = top; if (endlid > IB_MAX_UCAST_LID) { IBWARN("illegal lft top %d, truncate to %d", endlid, IB_MAX_UCAST_LID); endlid = IB_MAX_UCAST_LID; } mapnd = remap_node_name(node_name_map, nodeguid, nd); printf("Unicast lids [0x%x-0x%x] of switch %s guid 0x%016" PRIx64 " (%s):\n", startlid, endlid, portid2str(portid), nodeguid, mapnd); DEBUG("Switch top is 0x%x\n", top); printf(" Lid Out Destination\n"); printf(" Port Info \n"); startblock = startlid / IB_SMP_DATA_SIZE; endblock = ALIGN(endlid, IB_SMP_DATA_SIZE) / IB_SMP_DATA_SIZE; for (block = startblock; block < endblock; block++) { int status; DEBUG("reading block %d", block); if (!smp_query_status_via(lft, portid, IB_ATTR_LINEARFORWTBL, block, 0, &status, srcport)) { fprintf(stderr, "SubnGet() failed" "; MAD status 0x%x AM 0x%x\n", status, block); free(mapnd); return NULL; } i = block * IB_SMP_DATA_SIZE; e = i + IB_SMP_DATA_SIZE; if (i < startlid) i = startlid; if (e > endlid + 1) e = endlid + 1; for (; i < e; i++) { unsigned outport = lft[i % IB_SMP_DATA_SIZE]; unsigned valid = (outport <= nports); if (!valid && !dump_all) continue; dump_lid(str, sizeof str, i, valid); printf("0x%04x %03u %s\n", i, outport & 0xff, 
str); n++; } } printf("%d %slids dumped \n", n, dump_all ? "" : "valid "); free(mapnd); return NULL; } static int process_opt(void *context, int ch) { switch (ch) { case 'a': dump_all++; break; case 'M': multicast++; break; case 'n': brief++; break; case 1: node_name_map_file = strdup(optarg); if (node_name_map_file == NULL) IBEXIT("out of memory, strdup for node_name_map_file name failed"); break; default: return -1; } return 0; } int main(int argc, char **argv) { int mgmt_classes[3] = { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS, IB_SA_CLASS }; ib_portid_t portid = { 0 }; unsigned startlid = 0, endlid = 0; const char *err; const struct ibdiag_opt opts[] = { {"all", 'a', 0, NULL, "show all lids, even invalid entries"}, {"no_dests", 'n', 0, NULL, "do not try to resolve destinations"}, {"Multicast", 'M', 0, NULL, "show multicast forwarding tables"}, {"node-name-map", 1, 1, "", "node name map file"}, {} }; char usage_args[] = "[ [ []]]"; const char *usage_examples[] = { " -- Unicast examples:", "4\t# dump all lids with valid out ports of switch with lid 4", "-a 4\t# same, but dump all lids, even with invalid out ports", "-n 4\t# simple dump format - no destination resolving", "4 10\t# dump lids starting from 10", "4 0x10 0x20\t# dump lid range", "-G 0x08f1040023\t# resolve switch by GUID", "-D 0,1\t# resolve switch by direct path", " -- Multicast examples:", "-M 4\t# dump all non empty mlids of switch with lid 4", "-M 4 0xc010 0xc020\t# same, but with range", "-M -n 4\t# simple dump format", NULL, }; ibdiag_process_opts(argc, argv, NULL, "K", opts, process_opt, usage_args, usage_examples); argc -= optind; argv += optind; if (!argc) ibdiag_show_usage(); if (argc > 1) startlid = strtoul(argv[1], NULL, 0); if (argc > 2) endlid = strtoul(argv[2], NULL, 0); node_name_map = open_node_name_map(node_name_map_file); srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 3, 1); if (!srcports) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); srcport = srcports->smi.port; if (!srcport) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); smp_mkey_set(srcport, ibd_mkey); if (resolve_portid_str(srcports->gsi.ca_name, ibd_ca_port, &portid, argv[0], ibd_dest_type, ibd_sm_id, srcports->gsi.port) < 0) IBEXIT("can't resolve destination port %s", argv[0]); if (multicast) err = dump_multicast_tables(&portid, startlid, endlid); else err = dump_unicast_tables(&portid, startlid, endlid); if (err) IBEXIT("dump tables: %s", err); mad_rpc_close_port2(srcports); close_node_name_map(node_name_map); exit(0); } rdma-core-56.1/infiniband-diags/ibsendtrap.c000066400000000000000000000220271477342711600210220ustar00rootroot00000000000000/* * Copyright (c) 2008 Lawrence Livermore National Security * Copyright (c) 2008-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * Copyright (c) 2011 Mellanox Technologies LTD. All rights reserved. * * Produced at Lawrence Livermore National Laboratory. * Written by Ira Weiny . * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #define _GNU_SOURCE #include #include "ibdiag_common.h" static struct ibmad_port *srcport; static struct ibmad_ports_pair *srcports; /* for local link integrity */ static int error_port = 1; static uint16_t get_node_type(ib_portid_t * port) { uint16_t node_type = IB_NODE_TYPE_CA; uint8_t data[IB_SMP_DATA_SIZE] = { 0 }; if (smp_query_via(data, port, IB_ATTR_NODE_INFO, 0, 0, srcport)) node_type = (uint16_t) mad_get_field(data, 0, IB_NODE_TYPE_F); return node_type; } static uint32_t get_cap_mask(ib_portid_t * port) { uint8_t data[IB_SMP_DATA_SIZE] = { 0 }; uint32_t cap_mask = 0; if (smp_query_via(data, port, IB_ATTR_PORT_INFO, 0, 0, srcport)) cap_mask = (uint32_t) mad_get_field(data, 0, IB_PORT_CAPMASK_F); return cap_mask; } static void build_trap145(ib_mad_notice_attr_t * n, ib_portid_t * port) { n->generic_type = 0x80 | IB_NOTICE_TYPE_INFO; n->g_or_v.generic.prod_type_lsb = htobe16(get_node_type(port)); n->g_or_v.generic.trap_num = htobe16(145); n->issuer_lid = htobe16((uint16_t) port->lid); n->data_details.ntc_145.new_sys_guid = htobe64(0x1234567812345678ULL); } static void build_trap144_local(ib_mad_notice_attr_t * n, ib_portid_t * port) { n->generic_type = 0x80 | IB_NOTICE_TYPE_INFO; n->g_or_v.generic.prod_type_lsb = htobe16(get_node_type(port)); n->g_or_v.generic.trap_num = htobe16(144); n->issuer_lid = htobe16((uint16_t) port->lid); n->data_details.ntc_144.lid = n->issuer_lid; n->data_details.ntc_144.new_cap_mask = htobe32(get_cap_mask(port)); n->data_details.ntc_144.local_changes = TRAP_144_MASK_OTHER_LOCAL_CHANGES; } static void build_trap144_nodedesc(ib_mad_notice_attr_t * n, ib_portid_t * port) { build_trap144_local(n, port); n->data_details.ntc_144.change_flgs = TRAP_144_MASK_NODE_DESCRIPTION_CHANGE; } static void build_trap144_linkspeed(ib_mad_notice_attr_t * n, ib_portid_t * port) { build_trap144_local(n, port); n->data_details.ntc_144.change_flgs = TRAP_144_MASK_LINK_SPEED_ENABLE_CHANGE; } static void build_trap129(ib_mad_notice_attr_t * n, ib_portid_t * port) { n->generic_type = 0x80 | IB_NOTICE_TYPE_URGENT; n->g_or_v.generic.prod_type_lsb = htobe16(get_node_type(port)); n->g_or_v.generic.trap_num = htobe16(129); n->issuer_lid = htobe16((uint16_t) port->lid); n->data_details.ntc_129_131.lid = n->issuer_lid; n->data_details.ntc_129_131.pad = 0; 
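/* Trap 129 is "local link integrity threshold reached"; report the
 * affected port number (error_port, settable from the command line). */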
n->data_details.ntc_129_131.port_num = (uint8_t) error_port; } static void build_trap256_local(ib_mad_notice_attr_t * n, ib_portid_t * port) { n->generic_type = 0x80 | IB_NOTICE_TYPE_SECURITY; n->g_or_v.generic.prod_type_lsb = htobe16(get_node_type(port)); n->g_or_v.generic.trap_num = htobe16(256); n->issuer_lid = htobe16((uint16_t) port->lid); n->data_details.ntc_256.lid = n->issuer_lid; n->data_details.ntc_256.dr_slid = htobe16(0xffff); n->data_details.ntc_256.method = 1; n->data_details.ntc_256.attr_id = htobe16(0x15); n->data_details.ntc_256.attr_mod = htobe32(0x12); n->data_details.ntc_256.mkey = htobe64(0x1234567812345678ULL); } static void build_trap256_lid(ib_mad_notice_attr_t * n, ib_portid_t * port) { build_trap256_local(n, port); n->data_details.ntc_256.dr_trunc_hop = 0; } static void build_trap256_dr(ib_mad_notice_attr_t * n, ib_portid_t * port) { build_trap256_local(n, port); n->data_details.ntc_256.dr_trunc_hop = 0x80 | 0x4; n->data_details.ntc_256.dr_rtn_path[0] = 5; n->data_details.ntc_256.dr_rtn_path[1] = 6; n->data_details.ntc_256.dr_rtn_path[2] = 7; n->data_details.ntc_256.dr_rtn_path[3] = 8; } static void build_trap257_258(ib_mad_notice_attr_t * n, ib_portid_t * port, uint16_t trap_num) { n->generic_type = 0x80 | IB_NOTICE_TYPE_SECURITY; n->g_or_v.generic.prod_type_lsb = htobe16(get_node_type(port)); n->g_or_v.generic.trap_num = htobe16(trap_num); n->issuer_lid = htobe16((uint16_t) port->lid); n->data_details.ntc_257_258.lid1 = htobe16(1); n->data_details.ntc_257_258.lid2 = htobe16(2); n->data_details.ntc_257_258.key = htobe32(0x12345678); n->data_details.ntc_257_258.qp1 = htobe32(0x010101); n->data_details.ntc_257_258.qp2 = htobe32(0x020202); n->data_details.ntc_257_258.gid1.unicast.prefix = htobe64(0xf8c0000000000001ULL); n->data_details.ntc_257_258.gid1.unicast.interface_id = htobe64(0x1111222233334444ULL); n->data_details.ntc_257_258.gid2.unicast.prefix = htobe64(0xf8c0000000000001ULL); n->data_details.ntc_257_258.gid2.unicast.interface_id = htobe64(0x5678567812341234ULL); } static void build_trap257(ib_mad_notice_attr_t * n, ib_portid_t * port) { build_trap257_258(n, port, 257); } static void build_trap258(ib_mad_notice_attr_t * n, ib_portid_t * port) { build_trap257_258(n, port, 258); } static int send_trap(void (*build) (ib_mad_notice_attr_t *, ib_portid_t *)) { ib_portid_t sm_port; ib_portid_t selfportid; int selfport; ib_rpc_t trap_rpc; ib_mad_notice_attr_t notice; if (resolve_self(srcports->smi.ca_name, ibd_ca_port, &selfportid, &selfport, NULL)) IBEXIT("can't resolve self"); if (resolve_sm_portid(srcports->smi.ca_name, ibd_ca_port, &sm_port)) IBEXIT("can't resolve SM destination port"); memset(&trap_rpc, 0, sizeof(trap_rpc)); trap_rpc.mgtclass = IB_SMI_CLASS; trap_rpc.method = IB_MAD_METHOD_TRAP; trap_rpc.trid = mad_trid(); trap_rpc.attr.id = NOTICE; trap_rpc.datasz = IB_SMP_DATA_SIZE; trap_rpc.dataoffs = IB_SMP_DATA_OFFS; memset(¬ice, 0, sizeof(notice)); build(¬ice, &selfportid); return mad_send_via(&trap_rpc, &sm_port, NULL, ¬ice, srcport); } typedef struct _trap_def { const char *trap_name; void (*build_func) (ib_mad_notice_attr_t *, ib_portid_t *); } trap_def_t; static const trap_def_t traps[] = { {"node_desc_change", build_trap144_nodedesc}, {"link_speed_enabled_change", build_trap144_linkspeed}, {"local_link_integrity", build_trap129}, {"sys_image_guid_change", build_trap145}, {"mkey_lid", build_trap256_lid}, {"mkey_dr", build_trap256_dr}, {"pkey", build_trap257}, {"qkey", build_trap258}, {NULL, NULL} }; static int process_send_trap(const char 
*trap_name) { int i; for (i = 0; traps[i].trap_name; i++) if (strcmp(traps[i].trap_name, trap_name) == 0) return send_trap(traps[i].build_func); ibdiag_show_usage(); return 1; } int main(int argc, char **argv) { char usage_args[1024]; int mgmt_classes[2] = { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS }; const char *trap_name = NULL; int i, n, rc; n = sprintf(usage_args, "[] []\n" "\nArgument can be one of the following:\n"); for (i = 0; traps[i].trap_name; i++) { n += snprintf(usage_args + n, sizeof(usage_args) - n, " %s\n", traps[i].trap_name); if (n >= sizeof(usage_args)) exit(-1); } snprintf(usage_args + n, sizeof(usage_args) - n, "\n default behavior is to send \"%s\"", traps[0].trap_name); ibdiag_process_opts(argc, argv, NULL, "DGKL", NULL, NULL, usage_args, NULL); argc -= optind; argv += optind; trap_name = argv[0] ? argv[0] : traps[0].trap_name; if (argc > 1) error_port = atoi(argv[1]); madrpc_show_errors(1); srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 2, 1); if (!srcports) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); srcport = srcports->smi.port; if (!srcport) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); smp_mkey_set(srcport, ibd_mkey); rc = process_send_trap(trap_name); mad_rpc_close_port2(srcports); return rc; } rdma-core-56.1/infiniband-diags/ibstat.c000066400000000000000000000201761477342711600201600ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include /* __be64 */ #include #include static const char * const node_type_str[] = { "???", "CA", "Switch", "Router", "iWARP RNIC" }; static void ca_dump(umad_ca_t * ca) { if (!ca->node_type) return; printf("%s '%s'\n", ((unsigned)ca->node_type <= IB_NODE_MAX ? node_type_str[ca->node_type] : "???"), ca->ca_name); printf("\t%s type: %s\n", ((unsigned)ca->node_type <= IB_NODE_MAX ? 
node_type_str[ca->node_type] : "???"), ca->ca_type); printf("\tNumber of ports: %d\n", ca->numports); printf("\tFirmware version: %s\n", ca->fw_ver); printf("\tHardware version: %s\n", ca->hw_ver); printf("\tNode GUID: 0x%016" PRIx64 "\n", be64toh(ca->node_guid)); printf("\tSystem image GUID: 0x%016" PRIx64 "\n", be64toh(ca->system_guid)); } static const char * const port_state_str[] = { "???", "Down", "Initializing", "Armed", "Active" }; static const char * const port_phy_state_str[] = { "No state change", "Sleep", "Polling", "Disabled", "PortConfigurationTraining", "LinkUp", "LinkErrorRecovery", "PhyTest" }; static int ret_code(void) { int e = errno; if (e > 0) return -e; return e; } static int sys_read_string(const char *dir_name, const char *file_name, char *str, int max_len) { char path[256], *s; int fd, r; r = snprintf(path, sizeof(path), "%s/%s", dir_name, file_name); if (r > sizeof(path)) return -ENOENT; if ((fd = open(path, O_RDONLY)) < 0) return ret_code(); if ((r = read(fd, str, max_len)) < 0) { int e = errno; close(fd); errno = e; return ret_code(); } str[(r < max_len) ? r : max_len - 1] = 0; if ((s = strrchr(str, '\n'))) *s = 0; close(fd); return 0; } static int is_fdr10(umad_port_t *port) { char port_dir[256]; char rate[32]; int len, fdr10 = 0; char *p; len = snprintf(port_dir, sizeof(port_dir), "%s/%s/%s/%d", SYS_INFINIBAND, port->ca_name, SYS_CA_PORTS_DIR, port->portnum); if (len < 0 || len > sizeof(port_dir)) goto done; if (sys_read_string(port_dir, SYS_PORT_RATE, rate, sizeof(rate)) == 0) { if ((p = strchr(rate, ')'))) { if (!strncasecmp(p - 5, "fdr10", 5)) fdr10 = 1; } } done: return fdr10; } static int port_dump(umad_port_t * port, int alone) { const char *pre = ""; const char *hdrpre = ""; if (!port) return -1; if (!alone) { pre = " "; hdrpre = " "; } printf("%sPort %d:\n", hdrpre, port->portnum); printf("%sState: %s\n", pre, (unsigned)port->state <= 4 ? port_state_str[port->state] : "???"); printf("%sPhysical state: %s\n", pre, (unsigned)port->phys_state <= 7 ? port_phy_state_str[port->phys_state] : "???"); if (is_fdr10(port)) printf("%sRate: %d (FDR10)\n", pre, port->rate); else if (port->rate != 2) /* 1x SDR */ printf("%sRate: %d\n", pre, port->rate); else printf("%sRate: 2.5\n", pre); printf("%sBase lid: %d\n", pre, port->base_lid); printf("%sLMC: %d\n", pre, port->lmc); printf("%sSM lid: %d\n", pre, port->sm_lid); printf("%sCapability mask: 0x%08x\n", pre, ntohl(port->capmask)); printf("%sPort GUID: 0x%016" PRIx64 "\n", pre, be64toh(port->port_guid)); printf("%sLink layer: %s\n", pre, port->link_layer); return 0; } static int ca_stat(const char *ca_name, int portnum, int no_ports) { umad_ca_t ca; int r; if ((r = umad_get_ca(ca_name, &ca)) < 0) return r; if (!ca.node_type) return 0; if (!no_ports && portnum >= 0) { if (portnum > ca.numports || !ca.ports[portnum]) { IBWARN("%s: '%s' has no port number %d - max (%d)", ((unsigned)ca.node_type <= IB_NODE_MAX ? node_type_str[ca.node_type] : "???"), ca_name, portnum, ca.numports); return -1; } printf("%s: '%s'\n", ((unsigned)ca.node_type <= IB_NODE_MAX ? 
node_type_str[ca.node_type] : "???"), ca.ca_name); port_dump(ca.ports[portnum], 1); return 0; } /* print ca header */ ca_dump(&ca); if (no_ports) return 0; for (portnum = 0; portnum <= ca.numports; portnum++) port_dump(ca.ports[portnum], 0); return 0; } static int ports_list(struct umad_device_node *first_node, struct umad_device_node *last_node) { __be64 guids[64]; struct umad_device_node *node; int ports, j; for (node = first_node; node && node != last_node; node = node->next) { if ((ports = umad_get_ca_portguids(node->ca_name, &guids[0], 64)) < 0) return -1; for (j = 0; j < ports; j++) if (guids[j]) printf("0x%016" PRIx64 "\n", be64toh(guids[j])); } return 0; } static int list_only, short_format, list_ports; static int process_opt(void *context, int ch) { switch (ch) { case 'l': list_only++; break; case 's': short_format++; break; case 'p': list_ports++; break; default: return -1; } return 0; } int main(int argc, char *argv[]) { struct umad_device_node *device_list; struct umad_device_node *node; struct umad_device_node *first_node; struct umad_device_node *last_node; int dev_port = -1; const char *ca_name; const struct ibdiag_opt opts[] = { {"list_of_cas", 'l', 0, NULL, "list all IB devices"}, {"short", 's', 0, NULL, "short output"}, {"port_list", 'p', 0, NULL, "show port list"}, {} }; char usage_args[] = " [portnum]"; const char *usage_examples[] = { "-l # list all IB devices", "mthca0 2 # stat port 2 of 'mthca0'", NULL }; ibdiag_process_opts(argc, argv, NULL, "CDeGKLPsty", opts, process_opt, usage_args, usage_examples); argc -= optind; argv += optind; if (argc > 1) dev_port = strtol(argv[1], NULL, 0); if (umad_init() < 0) IBPANIC("can't init UMAD library"); device_list = umad_get_ca_device_list(); if (!device_list && errno) IBPANIC("can't list IB device names"); if (umad_sort_ca_device_list(&device_list, 0)) IBWARN("can't sort list IB device names"); if (argc) { for (node = device_list; node; node = node->next) if (!strcmp(node->ca_name, argv[0])) break; if (!node) IBPANIC("'%s' IB device can't be found", argv[0]); first_node = node; last_node = node->next; } else { first_node = device_list; last_node = NULL; } if (list_ports) { if (ports_list(first_node, last_node) < 0) IBPANIC("can't list ports"); umad_free_ca_device_list(device_list); return 0; } for (node = first_node; node != last_node; node = node->next) { ca_name = node->ca_name; if (list_only) printf("%s\n", ca_name); else if (ca_stat(ca_name, dev_port, short_format) < 0) IBWARN("stat of IB device '%s' failed", ca_name); } umad_free_ca_device_list(device_list); return 0; } rdma-core-56.1/infiniband-diags/ibsysstat.c000066400000000000000000000214471477342711600207210ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include #include #include "ibdiag_common.h" #define MAX_CPUS 256 static struct ibmad_port *srcport; static struct ibmad_ports_pair *srcports; enum ib_sysstat_attr_t { IB_PING_ATTR = 0x10, IB_HOSTINFO_ATTR = 0x11, IB_CPUINFO_ATTR = 0x12, }; typedef struct cpu_info { char *model; char *mhz; } cpu_info; static cpu_info cpus[MAX_CPUS]; static int host_ncpu; static int server = 0, oui = IB_OPENIB_OUI; static int server_respond(void *umad, int size) { ib_rpc_t rpc = { 0 }; ib_rmpp_hdr_t rmpp = { 0 }; ib_portid_t rport; uint8_t *mad = umad_get_mad(umad); ib_mad_addr_t *mad_addr; if (!(mad_addr = umad_get_mad_addr(umad))) return -1; memset(&rport, 0, sizeof(rport)); rport.lid = ntohs(mad_addr->lid); rport.qp = ntohl(mad_addr->qpn); rport.qkey = ntohl(mad_addr->qkey); rport.sl = mad_addr->sl; if (!rport.qkey && rport.qp == 1) rport.qkey = IB_DEFAULT_QP1_QKEY; rpc.mgtclass = mad_get_field(mad, 0, IB_MAD_MGMTCLASS_F); rpc.method = IB_MAD_METHOD_GET | IB_MAD_RESPONSE; rpc.attr.id = mad_get_field(mad, 0, IB_MAD_ATTRID_F); rpc.attr.mod = mad_get_field(mad, 0, IB_MAD_ATTRMOD_F); rpc.oui = mad_get_field(mad, 0, IB_VEND2_OUI_F); rpc.trid = mad_get_field64(mad, 0, IB_MAD_TRID_F); if (size > IB_MAD_SIZE) rmpp.flags = IB_RMPP_FLAG_ACTIVE; DEBUG("responding %d bytes to %s, attr 0x%x mod 0x%x qkey %x", size, portid2str(&rport), rpc.attr.id, rpc.attr.mod, rport.qkey); if (mad_build_pkt(umad, &rpc, &rport, &rmpp, NULL) < 0) return -1; if (ibdebug > 1) xdump(stderr, "mad respond pkt\n", mad, IB_MAD_SIZE); if (umad_send(mad_rpc_portid(srcport), mad_rpc_class_agent(srcport, rpc.mgtclass), umad, size, rpc.timeout, 0) < 0) { DEBUG("send failed; %m"); return -1; } return 0; } static int mk_reply(int attr, void *data, int sz) { char *s = data; int n, i, ret = 0; switch (attr) { case IB_PING_ATTR: break; /* nothing to do here, just reply */ case IB_HOSTINFO_ATTR: if (gethostname(s, sz) < 0) snprintf(s, sz, "?hostname?"); s[sz - 1] = 0; if ((n = strlen(s)) >= sz - 1) { ret = sz; break; } s[n] = '.'; s += n + 1; sz -= n + 1; ret += n + 1; if (getdomainname(s, sz) < 0) snprintf(s, sz, "?domainname?"); if ((n = strlen(s)) == 0) s[-1] = 0; /* no domain */ else ret += n; break; case IB_CPUINFO_ATTR: s[0] = '\0'; for (i = 0; i < host_ncpu && sz > 0; i++) { n = snprintf(s, sz, "cpu %d: model %s MHZ %s\n", i, cpus[i].model, cpus[i].mhz); if (n >= sz) { IBWARN("cpuinfo truncated"); ret = sz; break; } sz -= n; s += n; ret += n; } ret++; break; default: DEBUG("unknown attr %d", attr); } return ret; } static uint8_t buf[2048 * 32]; static char *ibsystat_serv(void) { void *umad; void *mad; int attr, mod, size; DEBUG("starting to serve..."); while ((umad = mad_receive_via(buf, -1, srcport))) { if (umad_status(buf)) { DEBUG("drop mad with status %x: %s", umad_status(buf), strerror(umad_status(buf))); continue; } 
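/*
 * (Explanatory note, added for clarity; not part of the original code.)
 * The loop below implements the responder side of the sysstat vendor
 * class: mk_reply() fills the vendor range-2 data area according to the
 * requested attribute (ping/host/cpu), and server_respond() turns the
 * MAD into a GET response aimed back at the requester's lid/qpn/qkey,
 * taken from the incoming umad address.
 */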
mad = umad_get_mad(umad); attr = mad_get_field(mad, 0, IB_MAD_ATTRID_F); mod = mad_get_field(mad, 0, IB_MAD_ATTRMOD_F); DEBUG("got packet: attr 0x%x mod 0x%x", attr, mod); size = mk_reply(attr, (uint8_t *) mad + IB_VENDOR_RANGE2_DATA_OFFS, sizeof(buf) - umad_size() - IB_VENDOR_RANGE2_DATA_OFFS); if (server_respond(umad, IB_VENDOR_RANGE2_DATA_OFFS + size) < 0) DEBUG("respond failed"); } DEBUG("server out"); return NULL; } static int match_attr(char *str) { if (!strcmp(str, "ping")) return IB_PING_ATTR; if (!strcmp(str, "host")) return IB_HOSTINFO_ATTR; if (!strcmp(str, "cpu")) return IB_CPUINFO_ATTR; return -1; } static char *ibsystat(ib_portid_t * portid, int attr) { ib_rpc_t rpc = { 0 }; int fd, agent, timeout, len; void *data = (uint8_t *) umad_get_mad(buf) + IB_VENDOR_RANGE2_DATA_OFFS; DEBUG("Sysstat ping.."); rpc.mgtclass = IB_VENDOR_OPENIB_SYSSTAT_CLASS; rpc.method = IB_MAD_METHOD_GET; rpc.attr.id = attr; rpc.attr.mod = 0; rpc.oui = oui; rpc.timeout = 0; rpc.datasz = IB_VENDOR_RANGE2_DATA_SIZE; rpc.dataoffs = IB_VENDOR_RANGE2_DATA_OFFS; portid->qp = 1; if (!portid->qkey) portid->qkey = IB_DEFAULT_QP1_QKEY; if ((len = mad_build_pkt(buf, &rpc, portid, NULL, NULL)) < 0) IBPANIC("cannot build packet."); fd = mad_rpc_portid(srcport); agent = mad_rpc_class_agent(srcport, rpc.mgtclass); timeout = ibd_timeout ? ibd_timeout : MAD_DEF_TIMEOUT_MS; if (umad_send(fd, agent, buf, len, timeout, 0) < 0) IBPANIC("umad_send failed."); len = sizeof(buf) - umad_size(); if (umad_recv(fd, buf, &len, timeout) < 0) IBPANIC("umad_recv failed."); if (umad_status(buf)) return strerror(umad_status(buf)); DEBUG("Got sysstat pong.."); if (attr != IB_PING_ATTR) puts(data); else printf("sysstat ping succeeded\n"); return NULL; } static int build_cpuinfo(void) { char line[1024] = { 0 }, *s, *e; FILE *f; int ncpu = 0; if (!(f = fopen("/proc/cpuinfo", "r"))) { IBWARN("couldn't open /proc/cpuinfo"); return 0; } while (fgets(line, sizeof(line) - 1, f)) { if (!strncmp(line, "processor\t", 10)) { ncpu++; if (ncpu > MAX_CPUS) { fclose(f); return MAX_CPUS; } continue; } if (!ncpu || !(s = strchr(line, ':'))) continue; if ((e = strchr(s, '\n'))) *e = 0; if (!strncmp(line, "model name\t", 11)) cpus[ncpu - 1].model = strdup(s + 1); else if (!strncmp(line, "cpu MHz\t", 8)) cpus[ncpu - 1].mhz = strdup(s + 1); } fclose(f); DEBUG("ncpu %d", ncpu); return ncpu; } static int process_opt(void *context, int ch) { switch (ch) { case 'o': oui = strtoul(optarg, NULL, 0); break; case 'S': server++; break; default: return -1; } return 0; } int main(int argc, char **argv) { int mgmt_classes[3] = { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS, IB_SA_CLASS }; int sysstat_class = IB_VENDOR_OPENIB_SYSSTAT_CLASS; ib_portid_t portid = { 0 }; int attr = IB_PING_ATTR; char *err; const struct ibdiag_opt opts[] = { {"oui", 'o', 1, NULL, "use specified OUI number"}, {"Server", 'S', 0, NULL, "start in server mode"}, {} }; char usage_args[] = " []"; ibdiag_process_opts(argc, argv, NULL, "DKy", opts, process_opt, usage_args, NULL); argc -= optind; argv += optind; if (!argc && !server) ibdiag_show_usage(); if (argc > 1 && (attr = match_attr(argv[1])) < 0) ibdiag_show_usage(); srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 3, 0); if (!srcports) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); srcport = srcports->gsi.port; if (!srcport) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); if (server) { if (mad_register_server_via(sysstat_class, 1, NULL, oui, srcport) < 0) IBEXIT("can't serve class %d", sysstat_class); host_ncpu = 
build_cpuinfo(); if ((err = ibsystat_serv())) IBEXIT("ibsysstat to %s: %s", portid2str(&portid), err); exit(0); } if (mad_register_client_via(sysstat_class, 1, srcport) < 0) IBEXIT("can't register to sysstat class %d", sysstat_class); if (resolve_portid_str(srcports->gsi.ca_name, ibd_ca_port, &portid, argv[0], ibd_dest_type, ibd_sm_id, srcport) < 0) IBEXIT("can't resolve destination port %s", argv[0]); if ((err = ibsystat(&portid, attr))) IBEXIT("ibsystat to %s: %s", portid2str(&portid), err); mad_rpc_close_port2(srcports); exit(0); } rdma-core-56.1/infiniband-diags/ibtracert.c000066400000000000000000000577241477342711600206550ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * Copyright (c) 2010,2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include "ibdiag_common.h" static struct ibmad_port *srcport; static struct ibmad_ports_pair *srcports; #define MAXHOPS 63 static const char * const node_type_str[] = { "???", "ca", "switch", "router", "iwarp rnic" }; static int timeout = 0; /* ms */ static int force; static FILE *f; static FILE *ports_fd; static char *node_name_map_file = NULL; static char *ports_file = NULL; static nn_map_t *node_name_map = NULL; typedef struct Port Port; typedef struct Switch Switch; typedef struct Node Node; struct Port { Port *next; Port *remoteport; uint64_t portguid; int portnum; int lid; int lmc; int state; int physstate; char portinfo[64]; }; struct Switch { int linearcap; int mccap; int linearFDBtop; int fdb_base; int enhsp0; uint8_t fdb[64]; char switchinfo[64]; }; struct Node { Node *htnext; Node *dnext; Port *ports; ib_portid_t path; int type; int dist; int numports; int upport; Node *upnode; uint64_t nodeguid; /* also portguid */ char nodedesc[IB_SMP_DATA_SIZE + 1]; char nodeinfo[64]; }; static Node *nodesdist[MAXHOPS]; static uint64_t target_portguid; /* * is_port_inactive * Checks whether or not the port state is other than active. * The "sw" argument is only relevant when the port is on a * switch; for HCAs and routers, this argument is ignored.
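 * (Added note: in the PortInfo PortState encoding, 4 means ACTIVE, which
 * is exactly what the "state != 4" test in the function body checks.)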
* Returns 1 when port is not active and 0 when active. * Base switch port 0 is considered always active. */ static int is_port_inactive(Node * node, Port * port, Switch * sw) { int res = 0; if (port->state != 4 && (node->type != IB_NODE_SWITCH || (node->type == IB_NODE_SWITCH && sw->enhsp0))) res = 1; return res; } static int get_node(Node * node, Port * port, ib_portid_t * portid) { void *pi = port->portinfo, *ni = node->nodeinfo, *nd = node->nodedesc; char *s, *e; memset(ni, 0, sizeof(node->nodeinfo)); if (!smp_query_via(ni, portid, IB_ATTR_NODE_INFO, 0, timeout, srcport)) return -1; memset(nd, 0, sizeof(node->nodedesc)); if (!smp_query_via(nd, portid, IB_ATTR_NODE_DESC, 0, timeout, srcport)) return -1; for (s = nd, e = s + 64; s < e; s++) { if (!*s) break; if (!isprint(*s)) *s = ' '; } memset(pi, 0, sizeof(port->portinfo)); if (!smp_query_via(pi, portid, IB_ATTR_PORT_INFO, 0, timeout, srcport)) return -1; mad_decode_field(ni, IB_NODE_GUID_F, &node->nodeguid); mad_decode_field(ni, IB_NODE_TYPE_F, &node->type); mad_decode_field(ni, IB_NODE_NPORTS_F, &node->numports); mad_decode_field(ni, IB_NODE_PORT_GUID_F, &port->portguid); mad_decode_field(ni, IB_NODE_LOCAL_PORT_F, &port->portnum); mad_decode_field(pi, IB_PORT_LID_F, &port->lid); mad_decode_field(pi, IB_PORT_LMC_F, &port->lmc); mad_decode_field(pi, IB_PORT_STATE_F, &port->state); DEBUG("portid %s: got node %" PRIx64 " '%s'", portid2str(portid), node->nodeguid, node->nodedesc); return 0; } static int switch_lookup(Switch * sw, ib_portid_t * portid, int lid) { void *si = sw->switchinfo, *fdb = sw->fdb; memset(si, 0, sizeof(sw->switchinfo)); if (!smp_query_via(si, portid, IB_ATTR_SWITCH_INFO, 0, timeout, srcport)) return -1; mad_decode_field(si, IB_SW_LINEAR_FDB_CAP_F, &sw->linearcap); mad_decode_field(si, IB_SW_LINEAR_FDB_TOP_F, &sw->linearFDBtop); mad_decode_field(si, IB_SW_ENHANCED_PORT0_F, &sw->enhsp0); if (lid >= sw->linearcap && lid > sw->linearFDBtop) return -1; memset(fdb, 0, sizeof(sw->fdb)); if (!smp_query_via(fdb, portid, IB_ATTR_LINEARFORWTBL, lid / 64, timeout, srcport)) return -1; DEBUG("portid %s: forward lid %d to port %d", portid2str(portid), lid, sw->fdb[lid % 64]); return sw->fdb[lid % 64]; } static int sameport(Port * a, Port * b) { return a->portguid == b->portguid || (force && a->lid == b->lid); } static int extend_dpath(ib_dr_path_t * path, int nextport) { if (path->cnt + 2 >= sizeof(path->p)) return -1; ++path->cnt; path->p[path->cnt] = (uint8_t) nextport; return path->cnt; } static void dump_endnode(int dump, const char *prompt, Node *node, Port *port) { char *nodename = NULL; if (!dump) return; if (dump == 1) { fprintf(f, "%s {0x%016" PRIx64 "}[%d]\n", prompt, node->nodeguid, node->type == IB_NODE_SWITCH ? 0 : port->portnum); return; } nodename = remap_node_name(node_name_map, node->nodeguid, node->nodedesc); fprintf(f, "%s %s {0x%016" PRIx64 "} portnum %d lid %u-%u \"%s\"\n", prompt, (node->type <= IB_NODE_MAX ? node_type_str[node->type] : "???"), node->nodeguid, node->type == IB_NODE_SWITCH ? 0 : port->portnum, port->lid, port->lid + (1 << port->lmc) - 1, nodename); free(nodename); } static void dump_route(int dump, Node * node, int outport, Port * port) { char *nodename = NULL; if (!dump && !ibverbose) return; nodename = remap_node_name(node_name_map, node->nodeguid, node->nodedesc); if (dump == 1) fprintf(f, "[%d] -> {0x%016" PRIx64 "}[%d]\n", outport, port->portguid, port->portnum); else fprintf(f, "[%d] -> %s port {0x%016" PRIx64 "}[%d] lid %u-%u \"%s\"\n", outport, (node->type <= IB_NODE_MAX ? 
node_type_str[node->type] : "???"), port->portguid, port->portnum, port->lid, port->lid + (1 << port->lmc) - 1, nodename); free(nodename); } static int find_route(ib_portid_t * from, ib_portid_t * to, int dump) { Node *node, fromnode, tonode, nextnode; Port *port, fromport, toport, nextport; Switch sw; int maxhops = MAXHOPS; int portnum, outport = 255, next_sw_outport = 255; memset(&fromnode,0,sizeof(Node)); memset(&tonode,0,sizeof(Node)); memset(&nextnode,0,sizeof(Node)); memset(&fromport,0,sizeof(Port)); memset(&toport,0,sizeof(Port)); memset(&nextport,0,sizeof(Port)); DEBUG("from %s", portid2str(from)); if (get_node(&fromnode, &fromport, from) < 0 || get_node(&tonode, &toport, to) < 0) { IBWARN("can't reach to/from ports"); if (!force) return -1; if (to->lid > 0) toport.lid = to->lid; IBWARN("Force: look for lid %d", to->lid); } node = &fromnode; port = &fromport; portnum = port->portnum; dump_endnode(dump, "From", node, port); if (node->type == IB_NODE_SWITCH) { next_sw_outport = switch_lookup(&sw, from, to->lid); if (next_sw_outport < 0 || next_sw_outport > node->numports) { /* needed to print the port in badtbl */ outport = next_sw_outport; goto badtbl; } } while (maxhops--) { if (is_port_inactive(node, port, &sw)) goto badport; if (sameport(port, &toport)) break; /* found */ if (node->type == IB_NODE_SWITCH) { DEBUG("switch node"); outport = next_sw_outport; if (extend_dpath(&from->drpath, outport) < 0) goto badpath; if (get_node(&nextnode, &nextport, from) < 0) { IBWARN("can't reach port at %s", portid2str(from)); return -1; } if (outport == 0) { if (!sameport(&nextport, &toport)) goto badtbl; else break; /* found SMA port */ } } else if ((node->type == IB_NODE_CA) || (node->type == IB_NODE_ROUTER)) { int ca_src = 0; outport = portnum; DEBUG("ca or router node"); if (!sameport(port, &fromport)) { IBWARN ("can't continue: reached CA or router port %" PRIx64 ", lid %d", port->portguid, port->lid); return -1; } /* we are at CA or router "from" - go one hop back to (hopefully) a switch */ if (from->drpath.cnt > 0) { DEBUG("ca or router node - return back 1 hop"); from->drpath.cnt--; } else { ca_src = 1; if (portnum && extend_dpath(&from->drpath, portnum) < 0) goto badpath; } if (get_node(&nextnode, &nextport, from) < 0) { IBWARN("can't reach port at %s", portid2str(from)); return -1; } /* fix port num to be seen from the CA or router side */ if (!ca_src) nextport.portnum = from->drpath.p[from->drpath.cnt + 1]; } /* only if the next node is a switch, get switch info */ if (nextnode.type == IB_NODE_SWITCH) { next_sw_outport = switch_lookup(&sw, from, to->lid); if (next_sw_outport < 0 || next_sw_outport > nextnode.numports) { /* needed to print the port in badtbl */ outport = next_sw_outport; goto badtbl; } } port = &nextport; if (is_port_inactive(&nextnode, port, &sw)) goto badoutport; node = &nextnode; portnum = port->portnum; dump_route(dump, node, outport, port); } if (maxhops <= 0) { IBWARN("no route found after %d hops", MAXHOPS); return -1; } dump_endnode(dump, "To", node, port); return 0; badport: IBWARN("Bad port state found: node \"%s\" port %d state %d", clean_nodedesc(node->nodedesc), portnum, port->state); return -1; badoutport: IBWARN("Bad out port state found: node \"%s\" outport %d state %d", clean_nodedesc(node->nodedesc), outport, port->state); return -1; badtbl: IBWARN ("Bad forwarding table entry found at: node \"%s\" lid entry %d is %d (top %d)", clean_nodedesc(node->nodedesc), to->lid, outport, sw.linearFDBtop); return -1; badpath: IBWARN("Direct path too long!"); 
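	/*
	 * (Added note: we land on badpath when extend_dpath() refuses to
	 * grow the directed route past the ib_dr_path_t hop array;
	 * routing loops are bounded separately by MAXHOPS in the walk
	 * above.)
	 */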
return -1; } /************************** * MC span part */ #define HASHGUID(guid) ((uint32_t)(((uint32_t)(guid) * 101) ^ ((uint32_t)((guid) >> 32) * 103))) #define HTSZ 137 static int insert_node(Node * new) { static Node *nodestbl[HTSZ]; int hash = HASHGUID(new->nodeguid) % HTSZ; Node *node; for (node = nodestbl[hash]; node; node = node->htnext) if (node->nodeguid == new->nodeguid) { DEBUG("node %" PRIx64 " already exists", new->nodeguid); return -1; } new->htnext = nodestbl[hash]; nodestbl[hash] = new; return 0; } static int get_port(Port * port, int portnum, ib_portid_t * portid) { char portinfo[64] = { 0 }; void *pi = portinfo; port->portnum = portnum; if (!smp_query_via(pi, portid, IB_ATTR_PORT_INFO, portnum, timeout, srcport)) return -1; mad_decode_field(pi, IB_PORT_LID_F, &port->lid); mad_decode_field(pi, IB_PORT_LMC_F, &port->lmc); mad_decode_field(pi, IB_PORT_STATE_F, &port->state); mad_decode_field(pi, IB_PORT_PHYS_STATE_F, &port->physstate); VERBOSE("portid %s portnum %d: lid %d state %d physstate %d", portid2str(portid), portnum, port->lid, port->state, port->physstate); return 1; } static void link_port(Port * port, Node * node) { port->next = node->ports; node->ports = port; } static int new_node(Node * node, Port * port, ib_portid_t * path, int dist) { if (port->portguid == target_portguid) { node->dist = -1; /* tag as target */ link_port(port, node); dump_endnode(ibverbose, "found target", node, port); return 1; /* found; */ } /* BFS search start with my self */ if (insert_node(node) < 0) return -1; /* known switch */ VERBOSE("insert dist %d node %p port %d lid %d", dist, node, port->portnum, port->lid); link_port(port, node); node->dist = dist; node->path = *path; node->dnext = nodesdist[dist]; nodesdist[dist] = node; return 0; } static int switch_mclookup(Node * node, ib_portid_t * portid, int mlid, char *map) { Switch sw; char mdb[64]; void *si = sw.switchinfo; __be16 *msets = (__be16 *) mdb; int maxsets, block, i, set; memset(map, 0, 256); memset(si, 0, sizeof(sw.switchinfo)); if (!smp_query_via(si, portid, IB_ATTR_SWITCH_INFO, 0, timeout, srcport)) return -1; mlid -= 0xc000; mad_decode_field(si, IB_SW_MCAST_FDB_CAP_F, &sw.mccap); if (mlid >= sw.mccap) return -1; block = mlid / 32; maxsets = (node->numports + 15) / 16; /* round up */ for (set = 0; set < maxsets; set++) { memset(mdb, 0, sizeof(mdb)); if (!smp_query_via(mdb, portid, IB_ATTR_MULTICASTFORWTBL, block | (set << 28), timeout, srcport)) return -1; for (i = 0; i < 16; i++, map++) { uint16_t mask = ntohs(msets[mlid % 32]); if (mask & (1 << i)) *map = 1; else continue; VERBOSE("Switch guid 0x%" PRIx64 ": mlid 0x%x is forwarded to port %d", node->nodeguid, mlid + 0xc000, i + set * 16); } } return 0; } /* * Return 1 if found, 0 if not, -1 on errors. 
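 * (Added note: the 1/0/-1 convention above describes the integer return
 * of new_node() earlier in this file; find_mcpath() itself returns a
 * Node pointer, the target node on success and NULL when no multicast
 * route is found.)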
*/ static Node *find_mcpath(ib_portid_t * from, int mlid) { Node *node, *remotenode; Port *port, *remoteport; char map[256]; int r, i; int dist = 0, leafport = 0; ib_portid_t *path; DEBUG("from %s", portid2str(from)); node = calloc(1, sizeof(Node)); if (!node) IBEXIT("out of memory"); port = calloc(1, sizeof(Port)); if (!port) { free(node); IBEXIT("out of memory"); } if (get_node(node, port, from) < 0) { IBWARN("can't reach node %s", portid2str(from)); free(node); free(port); return NULL; } node->upnode = NULL; /* root */ if ((r = new_node(node, port, from, 0)) > 0) { if (node->type != IB_NODE_SWITCH) { IBWARN("ibtracert from CA to CA is unsupported"); free(node); free(port); return NULL; /* ibtracert from host to itself is unsupported */ } if (switch_mclookup(node, from, mlid, map) < 0 || !map[0]) { free(node); free(port); return NULL; } return node; } for (dist = 0; dist < MAXHOPS; dist++) { for (node = nodesdist[dist]; node; node = node->dnext) { path = &node->path; VERBOSE("dist %d node %p", dist, node); dump_endnode(ibverbose, "processing", node, node->ports); memset(map, 0, sizeof(map)); if (node->type != IB_NODE_SWITCH) { if (dist) continue; leafport = path->drpath.p[path->drpath.cnt]; map[port->portnum] = 1; node->upport = 0; /* starting here */ DEBUG("Starting from CA 0x%" PRIx64 " lid %d port %d (leafport %d)", node->nodeguid, port->lid, port->portnum, leafport); } else { /* switch */ /* if starting from a leaf port fix up port (up port) */ if (dist == 1 && leafport) node->upport = leafport; if (switch_mclookup(node, path, mlid, map) < 0) { IBWARN("skipping bad Switch 0x%" PRIx64 "", node->nodeguid); continue; } } for (i = 1; i <= node->numports; i++) { if (!map[i] || i == node->upport) continue; if (dist == 0 && leafport) { if (from->drpath.cnt > 0) path->drpath.cnt--; } else { port = calloc(1, sizeof(Port)); if (!port) IBEXIT("out of memory"); if (get_port(port, i, path) < 0) { IBWARN ("can't reach node %s port %d", portid2str(path), i); free(port); return NULL; } if (port->physstate != 5) { /* LinkUP */ free(port); continue; } #if 0 link_port(port, node); #endif if (extend_dpath(&path->drpath, i) < 0) { free(port); return NULL; } } remotenode = calloc(1, sizeof(Node)); if (!remotenode) { free(port); IBEXIT("out of memory"); } remoteport = calloc(1, sizeof(Port)); if (!remoteport) { free(port); free(remotenode); IBEXIT("out of memory"); } if (get_node(remotenode, remoteport, path) < 0) { IBWARN ("NodeInfo on %s port %d failed, skipping port", portid2str(path), i); path->drpath.cnt--; /* restore path */ free(remotenode); free(remoteport); continue; } remotenode->upnode = node; remotenode->upport = remoteport->portnum; remoteport->remoteport = port; if ((r = new_node(remotenode, remoteport, path, dist + 1)) > 0) return remotenode; if (r == 0) dump_endnode(ibverbose, "new remote", remotenode, remoteport); else if (remotenode->type == IB_NODE_SWITCH) dump_endnode(2, "ERR: circle discovered at", remotenode, remoteport); path->drpath.cnt--; /* restore path */ free(port); free(remotenode); free(remoteport); } } } return NULL; /* not found */ } static uint64_t find_target_portguid(ib_portid_t * to) { Node tonode; Port toport; if (get_node(&tonode, &toport, to) < 0) { IBWARN("can't find to port\n"); return -1; } return toport.portguid; } static void dump_mcpath(Node * node, int dumplevel) { char *nodename = NULL; if (node->upnode) dump_mcpath(node->upnode, dumplevel); nodename = remap_node_name(node_name_map, node->nodeguid, node->nodedesc); if (!node->dist) { printf("From %s 0x%" 
PRIx64 " port %d lid %u-%u \"%s\"\n", (node->type <= IB_NODE_MAX ? node_type_str[node->type] : "???"), node->nodeguid, node->ports->portnum, node->ports->lid, node->ports->lid + (1 << node->ports->lmc) - 1, nodename); goto free_name; } if (node->dist) { if (dumplevel == 1) printf("[%d] -> %s {0x%016" PRIx64 "}[%d]\n", node->ports->remoteport->portnum, (node->type <= IB_NODE_MAX ? node_type_str[node->type] : "???"), node->nodeguid, node->upport); else printf("[%d] -> %s 0x%" PRIx64 "[%d] lid %u \"%s\"\n", node->ports->remoteport->portnum, (node->type <= IB_NODE_MAX ? node_type_str[node->type] : "???"), node->nodeguid, node->upport, node->ports->lid, nodename); } if (node->dist < 0) /* target node */ printf("To %s 0x%" PRIx64 " port %d lid %u-%u \"%s\"\n", (node->type <= IB_NODE_MAX ? node_type_str[node->type] : "???"), node->nodeguid, node->ports->portnum, node->ports->lid, node->ports->lid + (1 << node->ports->lmc) - 1, nodename); free_name: free(nodename); } static int resolve_lid(ib_portid_t *portid) { uint8_t portinfo[64] = { 0 }; uint16_t lid; if (!smp_query_via(portinfo, portid, IB_ATTR_PORT_INFO, 0, 0, NULL)) return -1; mad_decode_field(portinfo, IB_PORT_LID_F, &lid); ib_portid_set(portid, lid, 0, 0); return 0; } static int dumplevel = 2, multicast, mlid; static int process_opt(void *context, int ch) { switch (ch) { case 1: node_name_map_file = strdup(optarg); if (node_name_map_file == NULL) IBEXIT("out of memory, strdup for node_name_map_file name failed"); break; case 2: ports_file = strdup(optarg); if (ports_file == NULL) IBEXIT("out of memory, strdup for ports_file name failed"); break; case 'm': multicast++; mlid = strtoul(optarg, NULL, 0); break; case 'f': force++; break; case 'n': dumplevel = 1; break; default: return -1; } return 0; } static int get_route(char *srcid, char *dstid) { ib_portid_t my_portid = { 0 }; ib_portid_t src_portid = { 0 }; ib_portid_t dest_portid = { 0 }; Node *endnode; if (resolve_portid_str(srcports->gsi.ca_name, ibd_ca_port, &src_portid, srcid, ibd_dest_type, ibd_sm_id, srcports->gsi.port) < 0) { IBWARN("can't resolve source port %s", srcid); return -1; } if (resolve_portid_str(srcports->gsi.ca_name, ibd_ca_port, &dest_portid, dstid, ibd_dest_type, ibd_sm_id, srcports->gsi.port) < 0) { IBWARN("can't resolve destination port %s", dstid); return -1; } if (ibd_dest_type == IB_DEST_DRPATH) { if (resolve_lid(&src_portid) < 0) { IBWARN("cannot resolve lid for port \'%s\'", portid2str(&src_portid)); return -1; } if (resolve_lid(&dest_portid) < 0) { IBWARN("cannot resolve lid for port \'%s\'", portid2str(&dest_portid)); return -1; } } if (dest_portid.lid == 0 || src_portid.lid == 0) { IBWARN("bad src/dest lid"); ibdiag_show_usage(); } if (ibd_dest_type != IB_DEST_DRPATH) { /* first find a direct path to the src port */ if (find_route(&my_portid, &src_portid, 0) < 0) { IBWARN("can't find a route to the src port"); return -1; } src_portid = my_portid; } if (!multicast) { if (find_route(&src_portid, &dest_portid, dumplevel) < 0) { IBWARN("can't find a route from src to dest"); return -1; } return 0; } else { if (mlid < 0xc000) IBWARN("invalid MLID; must be 0xc000 or larger"); } if (!(target_portguid = find_target_portguid(&dest_portid))) { IBWARN("can't reach target lid %d", dest_portid.lid); return -1; } if (!(endnode = find_mcpath(&src_portid, mlid))) { IBWARN("can't find a multicast route from src to dest"); return -1; } /* dump multicast path */ dump_mcpath(endnode, dumplevel); return 0; } int main(int argc, char **argv) { char dstbuf[21]; char srcbuf[21]; 
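	/*
	 * (Added note: 21 bytes holds a GUID string of "0x" plus 16 hex
	 * digits plus the NUL terminator, matching the %20s widths in the
	 * sscanf() used for the ports file below.)
	 */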
char portsbuf[80]; char *p_first; int len, i; int line_count = 0; int num_port_pairs = 0; int mgmt_classes[3] = { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS, IB_SA_CLASS }; const struct ibdiag_opt opts[] = { {"force", 'f', 0, NULL, "force"}, {"no_info", 'n', 0, NULL, "simple format"}, {"mlid", 'm', 1, "", "multicast trace of the mlid"}, {"node-name-map", 1, 1, "", "node name map file"}, {"ports-file", 2, 1, "", "port pairs file"}, {} }; char usage_args[] = " "; const char *usage_examples[] = { "- Unicast examples:", "4 16\t\t\t# show path between lids 4 and 16", "-n 4 16\t\t# same, but using simple output format", "-G 0x8f1040396522d 0x002c9000100d051\t# use guid addresses", " - Multicast examples:", "-m 0xc000 4 16\t# show multicast path of mlid 0xc000 between lids 4 and 16", NULL, }; ibdiag_process_opts(argc, argv, NULL, "DK", opts, process_opt, usage_args, usage_examples); f = stdout; argc -= optind; argv += optind; if (argc < 2 && ports_file == NULL) ibdiag_show_usage(); if (ibd_timeout) timeout = ibd_timeout; srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 3, 1); if (!srcports) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); srcport = srcports->smi.port; if (!srcport) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); smp_mkey_set(srcport, ibd_mkey); node_name_map = open_node_name_map(node_name_map_file); if (ports_file == NULL) { /* single get_route call when lids/guids on command line */ if (get_route(argv[0], argv[1]) != 0) IBEXIT("Failed to get route information"); } else { /* multiple get_route calls when reading lids/guids from a file */ ports_fd = fopen(ports_file, "r"); if (!ports_fd) IBEXIT("cannot open ports-file %s", ports_file); while (fgets(portsbuf, sizeof(portsbuf), ports_fd) != NULL) { line_count++; p_first = strtok(portsbuf, "\n"); if (!p_first) continue; /* ignore blank lines */ len = (int) strlen(p_first); for (i = 0; i < len; i++) { if (!isspace(p_first[i])) break; } if (i == len) /* ignore all spaces */ continue; if (p_first[i] == '#') continue; /* ignore comment lines */ if (sscanf(portsbuf, "%20s %20s", srcbuf, dstbuf) != 2) IBEXIT("ports-file, %s, at line %i contains bad data", ports_file, line_count); num_port_pairs++; if (get_route(srcbuf, dstbuf) != 0) IBEXIT("Failed to get route information at line %i", line_count); } printf("%i lid/guid pairs processed from %s\n", num_port_pairs, ports_file); } close_node_name_map(node_name_map); mad_rpc_close_port2(srcports); exit(0); } rdma-core-56.1/infiniband-diags/man/000077500000000000000000000000001477342711600172735ustar00rootroot00000000000000rdma-core-56.1/infiniband-diags/man/CMakeLists.txt000066400000000000000000000043761477342711600220450ustar00rootroot00000000000000# rst2man has no way to set the include search path and we need to substitute # into the common files, so subst/link them all into the build directory function(rdma_rst_common) foreach(I ${ARGN}) if ("${I}" MATCHES "\\.in.rst$") string(REGEX REPLACE "^(.+)\\.in.rst$" "\\1" BASE_NAME "${I}") configure_file("common/${I}" "${CMAKE_CURRENT_BINARY_DIR}/common/${BASE_NAME}.rst" @ONLY) else() if (NOT CMAKE_CURRENT_SOURCE_DIR STREQUAL CMAKE_CURRENT_BINARY_DIR) rdma_create_symlink("${CMAKE_CURRENT_SOURCE_DIR}/common/${I}" "${CMAKE_CURRENT_BINARY_DIR}/common/${I}") endif() endif() endforeach() endfunction() rdma_rst_common( opt_cache.rst opt_C.rst opt_diffcheck.rst opt_diff.rst opt_debug.rst opt_D.rst opt_D_with_param.rst opt_e.rst opt_G.rst opt_G_with_param.rst opt_h.rst opt_K.rst opt_load-cache.rst opt_L.rst 
opt_node_name_map.rst opt_o-outstanding_smps.rst opt_ports-file.rst opt_P.rst opt_s.rst opt_t.rst opt_verbose.rst opt_V.rst opt_y.rst opt_z-config.in.rst sec_config-file.in.rst sec_node-name-map.rst sec_portselection.rst sec_ports-file.rst sec_topology-file.rst ) rdma_man_pages( check_lft_balance.8.in.rst dump_fts.8.in.rst ibaddr.8.in.rst ibcacheedit.8.in.rst ibccconfig.8.in.rst ibccquery.8.in.rst ibfindnodesusing.8.in.rst ibhosts.8.in.rst ibidsverify.8.in.rst iblinkinfo.8.in.rst ibnetdiscover.8.in.rst ibnodes.8.in.rst ibping.8.in.rst ibportstate.8.in.rst ibqueryerrors.8.in.rst ibroute.8.in.rst ibrouters.8.in.rst ibstat.8.in.rst ibstatus.8.in.rst ibswitches.8.in.rst ibsysstat.8.in.rst ibtracert.8.in.rst infiniband-diags.8.in.rst perfquery.8.in.rst saquery.8.in.rst sminfo.8.in.rst smpdump.8.in.rst smpquery.8.in.rst vendstat.8.in.rst ) rdma_alias_man_pages( dump_fts.8 dump_lfts.8 dump_fts.8 dump_mfts.8 ) if (ENABLE_IBDIAGS_COMPAT) rdma_man_pages( ibcheckerrors.8 ibcheckerrs.8 ibchecknet.8 ibchecknode.8 ibcheckport.8 ibcheckportstate.8 ibcheckportwidth.8 ibcheckstate.8 ibcheckwidth.8 ibclearcounters.8 ibclearerrors.8 ibdatacounters.8 ibdatacounts.8 ibdiscover.8 ibprintca.8 ibprintrt.8 ibprintswitch.8 ibswportwatch.8 ) endif() rdma-core-56.1/infiniband-diags/man/check_lft_balance.8.in.rst000066400000000000000000000017301477342711600241700ustar00rootroot00000000000000================= check_lft_balance ================= -------------------------------------------------- check InfiniBand unicast forwarding tables balance -------------------------------------------------- :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== check_lft_balance.sh [-hRv] DESCRIPTION =========== check_lft_balance.sh is a script which checks for balancing in Infiniband unicast forwarding tables. It analyzes the output of **dump_lfts(8)** and **iblinkinfo(8)** OPTIONS ======= **-h** show help **-R** Recalculate dump_lfts information, ie do not use the cached information. This option is slower but should be used if the diag tools have not been used for some time or if there are other reasons to believe that the fabric has changed. **-v** verbose output SEE ALSO ======== **dump_lfts(8)** **iblinkinfo(8)** AUTHORS ======= Albert Chu < chu11@llnl.gov > rdma-core-56.1/infiniband-diags/man/common/000077500000000000000000000000001477342711600205635ustar00rootroot00000000000000rdma-core-56.1/infiniband-diags/man/common/opt_C.rst000066400000000000000000000001261477342711600223600ustar00rootroot00000000000000.. Define the common option -C **-C, --Ca ** use the specified ca_name. rdma-core-56.1/infiniband-diags/man/common/opt_D.rst000066400000000000000000000007411477342711600223640ustar00rootroot00000000000000.. Define the common option -D for Directed routes **-D, --Direct** The address specified is a directed route :: Examples: [options] -D [options] "0" # self port [options] -D [options] "0,1,2,1,4" # out via port 1, then 2, ... (Note the second number in the path specified must match the port being used. This can be specified using the port selection flag '-P' or the port found through the automatic selection process.) rdma-core-56.1/infiniband-diags/man/common/opt_D_with_param.rst000066400000000000000000000007031477342711600245750ustar00rootroot00000000000000.. Define the common option -D for Directed routes **-D, --Direct ** The address specified is a directed route :: Examples: -D "0" # self port -D "0,1,2,1,4" # out via port 1, then 2, ... 
(Note the second number in the path specified must match the port being used. This can be specified using the port selection flag '-P' or the port found through the automatic selection process.) rdma-core-56.1/infiniband-diags/man/common/opt_G.rst000066400000000000000000000001311477342711600223600ustar00rootroot00000000000000.. Define the common option -G **-G, --Guid** The address specified is a Port GUID rdma-core-56.1/infiniband-diags/man/common/opt_G_with_param.rst000066400000000000000000000001261477342711600245770ustar00rootroot00000000000000.. Define the common option -G **--port-guid, -G ** Specify a port_guid rdma-core-56.1/infiniband-diags/man/common/opt_K.rst000066400000000000000000000001721477342711600223710ustar00rootroot00000000000000.. Define the common option -K **-K, --show_keys** show security keys (mkey, smkey, etc.) associated with the request. rdma-core-56.1/infiniband-diags/man/common/opt_L.rst000066400000000000000000000001201477342711600223630ustar00rootroot00000000000000.. Define the common option -L **-L, --Lid** The address specified is a LID rdma-core-56.1/infiniband-diags/man/common/opt_P.rst000066400000000000000000000001301477342711600223700ustar00rootroot00000000000000.. Define the common option -P **-P, --Port ** use the specified ca_port. rdma-core-56.1/infiniband-diags/man/common/opt_V.rst000066400000000000000000000001161477342711600224020ustar00rootroot00000000000000.. Define the common option -V **-V, --version** show the version info. rdma-core-56.1/infiniband-diags/man/common/opt_cache.rst000066400000000000000000000002671477342711600232470ustar00rootroot00000000000000.. Define the common option cache **--cache ** Cache the ibnetdiscover network data in the specified filename. This cache may be used by other tools for later analysis. rdma-core-56.1/infiniband-diags/man/common/opt_debug.rst000066400000000000000000000002001477342711600232550ustar00rootroot00000000000000.. Define the common option -d -d raise the IB debugging level. May be used several times (-ddd or -d -d -d). rdma-core-56.1/infiniband-diags/man/common/opt_diff.rst000066400000000000000000000006101477342711600231040ustar00rootroot00000000000000.. Define the common option diff **--diff ** Load cached ibnetdiscover data and do a diff comparison to the current network or another cache. A special diff output for ibnetdiscover output will be displayed showing differences between the old and current fabric. By default, the following are compared for differences: switches, channel adapters, routers, and port connections. rdma-core-56.1/infiniband-diags/man/common/opt_diffcheck.rst000066400000000000000000000011541477342711600241060ustar00rootroot00000000000000.. Define the common option diffcheck **--diffcheck ** Specify what diff checks should be done in the **--diff** option above. Comma separate multiple diff check key(s). The available diff checks are: **sw = switches**, **ca = channel adapters**, **router** = routers, **port** = port connections, **lid** = lids, **nodedesc** = node descriptions. Note that **port**, **lid**, and **nodedesc** are checked only for the node types that are specified (e.g. **sw**, **ca**, **router**). If **port** is specified alongside **lid** or **nodedesc**, remote port lids and node descriptions will also be compared. rdma-core-56.1/infiniband-diags/man/common/opt_e.rst000066400000000000000000000001331477342711600224200ustar00rootroot00000000000000.. 
Define the common option -e -e show send and receive errors (timeouts and others) rdma-core-56.1/infiniband-diags/man/common/opt_h.rst000066400000000000000000000001141477342711600224220ustar00rootroot00000000000000.. Define the common option -h **-h, --help** show the usage message rdma-core-56.1/infiniband-diags/man/common/opt_load-cache.rst000066400000000000000000000003631477342711600241610ustar00rootroot00000000000000.. Define the common option load-cache **--load-cache ** Load and use the cached ibnetdiscover data stored in the specified filename. May be useful for outputting and learning about other fabrics or a previous state of a fabric. rdma-core-56.1/infiniband-diags/man/common/opt_node_name_map.rst000066400000000000000000000002721477342711600247620ustar00rootroot00000000000000.. Define the common option --node-name-map **--node-name-map ** Specify a node name map. This file maps GUIDs to more user friendly names. See FILES section. rdma-core-56.1/infiniband-diags/man/common/opt_o-outstanding_smps.rst000066400000000000000000000002551477342711600260360ustar00rootroot00000000000000.. Define the common option -z **--outstanding_smps, -o ** Specify the number of outstanding SMP's which should be issued during the scan Default: 2 rdma-core-56.1/infiniband-diags/man/common/opt_ports-file.rst000066400000000000000000000003031477342711600242570ustar00rootroot00000000000000.. Define the common option --ports-file **--ports-file ** Specify a ports file. This file contains multiple source and destination lid or guid pairs. See FILES section. rdma-core-56.1/infiniband-diags/man/common/opt_s.rst000066400000000000000000000001551477342711600224420ustar00rootroot00000000000000.. Define the common option -s **-s, --sm_port ** use 'smlid' as the target lid for SA queries. rdma-core-56.1/infiniband-diags/man/common/opt_t.rst000066400000000000000000000001651477342711600224440ustar00rootroot00000000000000.. Define the common option -t **-t, --timeout ** override the default timeout for the solicited mads. rdma-core-56.1/infiniband-diags/man/common/opt_verbose.rst000066400000000000000000000002311477342711600236400ustar00rootroot00000000000000.. Define the common option -v **-v, --verbose** increase the application verbosity level. May be used several times (-vv or -v -v -v) rdma-core-56.1/infiniband-diags/man/common/opt_y.rst000066400000000000000000000002751477342711600224530ustar00rootroot00000000000000.. Define the common option -y **-y, --m_key ** use the specified M_key for requests. If non-numeric value (like 'x') is specified then a value will be prompted for. rdma-core-56.1/infiniband-diags/man/common/opt_z-config.in.rst000066400000000000000000000002231477342711600243150ustar00rootroot00000000000000.. Define the common option -z **--config, -z ** Specify alternate config file. Default: @IBDIAG_CONFIG_PATH@/ibdiag.conf rdma-core-56.1/infiniband-diags/man/common/sec_config-file.in.rst000066400000000000000000000003311477342711600247330ustar00rootroot00000000000000.. Common text for the config file CONFIG FILE ----------- @IBDIAG_CONFIG_PATH@/ibdiag.conf A global config file is provided to set some of the common options for all tools. See supplied config file for details. rdma-core-56.1/infiniband-diags/man/common/sec_node-name-map.rst000066400000000000000000000027531477342711600245740ustar00rootroot00000000000000.. Common text to describe the node name map file. NODE NAME MAP FILE FORMAT ------------------------- The node name map is used to specify user friendly names for nodes in the output. 
GUIDs are used to perform the lookup. This functionality is provided by the opensm-libs package. See **opensm(8)** for the file location for your installation. **Generically:** :: # comment "" **Example:** :: # IB1 # Line cards 0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB-24D" 0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB-24D" 0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB-24D" 0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB-24D" 0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB-24D" # Spines 0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB-12D" 0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB-12D" 0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB-12D" 0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB-12D" 0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB-12D" # GUID Node Name 0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D" rdma-core-56.1/infiniband-diags/man/common/sec_ports-file.rst000066400000000000000000000010011477342711600242230ustar00rootroot00000000000000.. Common text to describe the port file. PORTS FILE FORMAT ------------------------- The ports file can be used to specify multiple source and destination pairs. They can be lids or guids. If guids, use the -G option to indicate that. **Generically:** :: # comment **Example:** :: 73 207 203 657 531 101 > OR < 0x0008f104003f125c 0x0008f104003f133d 0x0008f1040011ab07 0x0008f104004265c0 0x0008f104007c5510 0x0008f1040099bb08 rdma-core-56.1/infiniband-diags/man/common/sec_portselection.rst000066400000000000000000000015021477342711600250370ustar00rootroot00000000000000.. Explanation of local port selection Local port Selection -------------------- Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: 1. the first port that is ACTIVE. 2. if not found, the first port that is UP (physical link up). If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. For example: :: ibaddr # use the first port (criteria #1 above) ibaddr -C mthca1 # pick the best port from "mthca1" only. ibaddr -P 2 # use the second (active/up) port from the first available IB device. ibaddr -C mthca0 -P 2 # use the specified port only. rdma-core-56.1/infiniband-diags/man/common/sec_topology-file.rst000066400000000000000000000103601477342711600247400ustar00rootroot00000000000000.. Common text to describe the Topology file. TOPOLOGY FILE FORMAT -------------------- The topology file format is human readable and largely intuitive. Most identifiers are given textual names like vendor ID (vendid), device ID (device ID), GUIDs of various types (sysimgguid, caguid, switchguid, etc.). PortGUIDs are shown in parentheses (). For switches, this is shown on the switchguid line. For CA and router ports, it is shown on the connectivity lines. The IB node is identified followed by the number of ports and a quoted the node GUID. On the right of this line is a comment (#) followed by the NodeDescription in quotes. 
If the node is a switch, this line also contains whether switch port 0 is base or enhanced, and the LID and LMC of port 0. Subsequent lines pertaining to this node show the connectivity. On the left is the port number of the current node. On the right is the peer node (node at other end of link). It is identified in quotes with nodetype followed by - followed by NodeGUID with the port number in square brackets. Further on the right is a comment (#). What follows the comment is dependent on the node type. If it it a switch node, it is followed by the NodeDescription in quotes and the LID of the peer node. If it is a CA or router node, it is followed by the local LID and LMC and then followed by the NodeDescription in quotes and the LID of the peer node. The active link width and speed are then appended to the end of this output line. An example of this is: :: # # Topology file: generated on Tue Jun 5 14:15:10 2007 # # Max of 3 hops discovered # Initiated from node 0008f10403960558 port 0008f10403960559 Non-Chassis Nodes vendid=0x8f1 devid=0x5a06 sysimgguid=0x5442ba00003000 switchguid=0x5442ba00003080(5442ba00003080) Switch 24 "S-005442ba00003080" # "ISR9024 Voltaire" base port 0 lid 6 lmc 0 [22] "H-0008f10403961354"[1](8f10403961355) # "MT23108 InfiniHost Mellanox Technologies" lid 4 4xSDR [10] "S-0008f10400410015"[1] # "SW-6IB4 Voltaire" lid 3 4xSDR [8] "H-0008f10403960558"[2](8f1040396055a) # "MT23108 InfiniHost Mellanox Technologies" lid 14 4xSDR [6] "S-0008f10400410015"[3] # "SW-6IB4 Voltaire" lid 3 4xSDR [12] "H-0008f10403960558"[1](8f10403960559) # "MT23108 InfiniHost Mellanox Technologies" lid 10 4xSDR vendid=0x8f1 devid=0x5a05 switchguid=0x8f10400410015(8f10400410015) Switch 8 "S-0008f10400410015" # "SW-6IB4 Voltaire" base port 0 lid 3 lmc 0 [6] "H-0008f10403960984"[1](8f10403960985) # "MT23108 InfiniHost Mellanox Technologies" lid 16 4xSDR [4] "H-005442b100004900"[1](5442b100004901) # "MT23108 InfiniHost Mellanox Technologies" lid 12 4xSDR [1] "S-005442ba00003080"[10] # "ISR9024 Voltaire" lid 6 1xSDR [3] "S-005442ba00003080"[6] # "ISR9024 Voltaire" lid 6 4xSDR vendid=0x2c9 devid=0x5a44 caguid=0x8f10403960984 Ca 2 "H-0008f10403960984" # "MT23108 InfiniHost Mellanox Technologies" [1](8f10403960985) "S-0008f10400410015"[6] # lid 16 lmc 1 "SW-6IB4 Voltaire" lid 3 4xSDR vendid=0x2c9 devid=0x5a44 caguid=0x5442b100004900 Ca 2 "H-005442b100004900" # "MT23108 InfiniHost Mellanox Technologies" [1](5442b100004901) "S-0008f10400410015"[4] # lid 12 lmc 1 "SW-6IB4 Voltaire" lid 3 4xSDR vendid=0x2c9 devid=0x5a44 caguid=0x8f10403961354 Ca 2 "H-0008f10403961354" # "MT23108 InfiniHost Mellanox Technologies" [1](8f10403961355) "S-005442ba00003080"[22] # lid 4 lmc 1 "ISR9024 Voltaire" lid 6 4xSDR vendid=0x2c9 devid=0x5a44 caguid=0x8f10403960558 Ca 2 "H-0008f10403960558" # "MT23108 InfiniHost Mellanox Technologies" [2](8f1040396055a) "S-005442ba00003080"[8] # lid 14 lmc 1 "ISR9024 Voltaire" lid 6 4xSDR [1](8f10403960559) "S-005442ba00003080"[12] # lid 10 lmc 1 "ISR9024 Voltaire" lid 6 1xSDR When grouping is used, IB nodes are organized into chassis which are numbered. Nodes which cannot be determined to be in a chassis are displayed as "Non-Chassis Nodes". External ports are also shown on the connectivity lines. 
rdma-core-56.1/infiniband-diags/man/dump_fts.8.in.rst000066400000000000000000000030371477342711600224240ustar00rootroot00000000000000======== DUMP_FTS ======== --------------------------------- dump InfiniBand forwarding tables --------------------------------- :Date: 2013-03-26 :Manual section: 8 :Manual group: OpenIB Diagnostics SYNOPSIS ======== dump_fts [options] [ []] DESCRIPTION =========== dump_fts is similar to ibroute but dumps tables for every switch found in an ibnetdiscover scan of the subnet. The dump file format is compatible with loading into OpenSM using the -R file -U /path/to/dump-file syntax. OPTIONS ======= **-a, --all** show all lids in range, even invalid entries **-n, --no_dests** do not try to resolve destinations **-M, --Multicast** show multicast forwarding tables In this case, the range parameters specify the mlid range. Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Configuration flags ------------------- .. include:: common/opt_t.rst .. include:: common/opt_y.rst .. include:: common/opt_node_name_map.rst .. include:: common/opt_z-config.rst FILES ===== .. include:: common/sec_config-file.rst .. include:: common/sec_node-name-map.rst SEE ALSO ======== **dump_lfts(8), dump_mfts(8), ibroute(8), ibswitches(8), opensm(8)** AUTHORS ======= Ira Weiny < ira.weiny@intel.com > rdma-core-56.1/infiniband-diags/man/ibaddr.8.in.rst000066400000000000000000000033261477342711600220310ustar00rootroot00000000000000====== IBADDR ====== ---------------------------- query InfiniBand address(es) ---------------------------- :Date: 2013-10-11 :Manual section: 8 :Manual group: OpenIB Diagnostics SYNOPSIS ======== ibaddr [options] DESCRIPTION =========== Display the lid (and range) as well as the GID address of the port specified (by DR path, lid, or GUID) or the local port by default. Note: this utility can be used as a simple address resolver. OPTIONS ======= **--gid_show, -g** show gid address only **--lid_show, -l** show lid range only **--Lid_show, -L** show lid range (in decimal) only Addressing Flags ---------------- .. include:: common/opt_D.rst .. include:: common/opt_G.rst .. include:: common/opt_s.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Configuration flags ------------------- .. include:: common/opt_y.rst .. include:: common/opt_t.rst .. include:: common/opt_z-config.rst FILES ===== ..
include:: common/sec_config-file.rst EXAMPLES ======== :: ibaddr # local port\'s address ibaddr 32 # show lid range and gid of lid 32 ibaddr -G 0x8f1040023 # same but using guid address ibaddr -l 32 # show lid range only ibaddr -L 32 # show decimal lid range only ibaddr -g 32 # show gid address only SEE ALSO ======== **ibroute (8), ibtracert (8)** AUTHOR ====== Hal Rosenstock < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibcacheedit.8.in.rst000066400000000000000000000030451477342711600230260ustar00rootroot00000000000000=========== ibcacheedit =========== --------------------------- edit an ibnetdiscover cache --------------------------- :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== ibcacheedit [options] DESCRIPTION =========== ibcacheedit allows users to edit an ibnetdiscover cache created through the **--cache** option in **ibnetdiscover(8)**. OPTIONS ======= **--switchguid BEFOREGUID:AFTERGUID** Specify a switchguid that should be changed. The before and after guid should be separated by a colon. On switches, port guids are identical to the switch guid, so port guids will be adjusted as well on switches. **--caguid BEFOREGUID:AFTERGUID** Specify a caguid that should be changed. The before and after guid should be separated by a colon. **--sysimgguid BEFOREGUID:AFTERGUID** Specify a sysimgguid that should be changed. The before and after guid should be separated by a colon. **--portguid NODEGUID:BEFOREGUID:AFTERGUID** Specify a portguid that should be changed. The nodeguid of the port (e.g. switchguid or caguid) should be specified first, followed by a colon, the before port guid, another colon, then the after port guid. On switches, port guids are identical to the switch guid, so the switch guid will be adjusted as well on switches. Debugging flags --------------- .. include:: common/opt_h.rst .. include:: common/opt_V.rst AUTHORS ======= Albert Chu < chu11@llnl.gov > rdma-core-56.1/infiniband-diags/man/ibccconfig.8.in.rst000066400000000000000000000052001477342711600226630ustar00rootroot00000000000000========== IBCCCONFIG ========== ------------------------------------- configure congestion control settings ------------------------------------- :Date: 2012-05-31 :Manual section: 8 :Manual group: OpenIB Diagnostics SYNOPSIS ======== ibccconfig [common_options] [-c cckey] [port] DESCRIPTION =========== **ibccconfig** supports the configuration of congestion control settings on switches and HCAs. **WARNING -- You should understand what you are doing before using this tool. Misuse of this tool could result in a broken fabric.** OPTIONS ======= Currently supported operations and their parameters: CongestionKeyInfo (CK) SwitchCongestionSetting (SS) SwitchPortCongestionSetting (SP) CACongestionSetting (CS) CongestionControlTable (CT) ... **--cckey, -c, ** Specify a congestion control (CC) key. If none is specified, a key of 0 is used. Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Addressing Flags ---------------- .. include:: common/opt_G.rst .. include:: common/opt_L.rst .. include:: common/opt_s.rst Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Configuration flags ------------------- .. include:: common/opt_y.rst ..
EXAMPLES
========

::

        ibccconfig SwitchCongestionSetting 2 0x1F 0x1FFFFFFFFF 0x0 0xF 8 0 0:0 1   # Configure Switch Congestion Settings
        ibccconfig CACongestionSetting 1 0 0x3 150 1 0 0                           # Configure CA Congestion Settings to SL 0 and SL 1
        ibccconfig CACongestionSetting 1 0 0x4 200 1 0 0                           # Configure CA Congestion Settings to SL 2
        ibccconfig CongestionControlTable 1 63 0 0:0 0:1 ...                       # Configure first block of Congestion Control Table
        ibccconfig CongestionControlTable 1 127 0 0:64 0:65 ...                    # Configure second block of Congestion Control Table

FILES
=====

.. include:: common/sec_config-file.rst

AUTHOR
======

Albert Chu
        < chu11@llnl.gov >

rdma-core-56.1/infiniband-diags/man/ibccquery.8.in.rst000066400000000000000000000034561477342711600225740ustar00rootroot00000000000000
=========
IBCCQUERY
=========

--------------------------------------
query congestion control settings/info
--------------------------------------

:Date: 2012-05-31
:Manual section: 8
:Manual group: OpenIB Diagnostics

SYNOPSIS
========

ibccquery [common_options] [-c cckey] <op> <lid|guid> [port]

DESCRIPTION
===========

ibccquery supports the querying of settings and other information related to
congestion control.

OPTIONS
=======

Current supported operations and their parameters:

::

        CongestionInfo (CI)
        CongestionKeyInfo (CK)
        CongestionLog (CL)
        SwitchCongestionSetting (SS)
        SwitchPortCongestionSetting (SP) [<portnum>]
        CACongestionSetting (CS)
        CongestionControlTable (CT)
        Timestamp (TI)

**--cckey, -c <cckey>**
        Specify a congestion control (CC) key. If none is specified, a key
        of 0 is used.

Debugging flags
---------------

.. include:: common/opt_debug.rst
.. include:: common/opt_e.rst
.. include:: common/opt_h.rst
.. include:: common/opt_verbose.rst
.. include:: common/opt_V.rst

Addressing Flags
----------------

.. include:: common/opt_G.rst
.. include:: common/opt_L.rst
.. include:: common/opt_s.rst

Port Selection flags
--------------------

.. include:: common/opt_C.rst
.. include:: common/opt_P.rst
.. include:: common/sec_portselection.rst

Configuration flags
-------------------

.. include:: common/opt_y.rst
.. include:: common/opt_z-config.rst

FILES
=====

.. include:: common/sec_config-file.rst

EXAMPLES
========

::

        ibccquery CongestionInfo 3                  # Congestion Info by lid
        ibccquery SwitchPortCongestionSetting 3     # Query all Switch Port Congestion Settings
        ibccquery SwitchPortCongestionSetting 3 1   # Query Switch Port Congestion Setting for port 1

AUTHOR
======

Albert Chu
        < chu11@llnl.gov >

rdma-core-56.1/infiniband-diags/man/ibcheckerrors.8000066400000000000000000000017311477342711600222130ustar00rootroot00000000000000
.TH IBCHECKERRORS 8 "May 21, 2007" "OpenIB" "OpenIB Diagnostics"
.SH NAME
ibcheckerrors \- validate IB subnet and report errors
.SH SYNOPSIS
.B ibcheckerrors
[\-h] [\-b] [\-v] [\-N | \-nocolor] [<topology-file> | \-C ca_name \-P ca_port \-t(imeout) timeout_ms]
.SH DESCRIPTION
.PP
ibcheckerrors is a script which uses a full topology file that was created by
ibnetdiscover, scans the network to validate the connectivity, and reports
errors (from port counters).
.SH OPTIONS
.PP
\-v increase the verbosity level
.PP
\-b brief mode. Reduce the output to show only whether errors are present, not
what they are.
.PP
\-N | \-nocolor use mono rather than color mode
.PP
\-C use the specified ca_name.
.PP
\-P use the specified ca_port.
.PP
\-t override the default timeout for the solicited mads.
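.SH EXAMPLE
.PP
The invocations below are illustrative; my.topo is a placeholder for a
topology file previously created by ibnetdiscover.
.PP
ibcheckerrors            # scan the subnet and report port counter errors
.PP
ibcheckerrors -b my.topo # brief scan validated against a saved topology file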
.SH SEE ALSO
.BR ibnetdiscover(8),
.BR ibchecknode(8),
.BR ibcheckport(8),
.BR ibcheckerrs(8)
.SH AUTHOR
.TP
Hal Rosenstock
.RI < halr@voltaire.com >
rdma-core-56.1/infiniband-diags/man/ibcheckerrs.8000066400000000000000000000031121477342711600216450ustar00rootroot00000000000000
.TH IBCHECKERRS 8 "May 30, 2007" "OpenIB" "OpenIB Diagnostics"
.SH NAME
ibcheckerrs \- validate IB port (or node) and report errors in counters above threshold
.SH SYNOPSIS
.B ibcheckerrs
[\-h] [\-b] [\-v] [\-G] [\-T <threshold_file>] [\-s(how_thresholds)] [\-N | \-nocolor] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] <lid|guid> [<port>]
.SH DESCRIPTION
.PP
Check the specified port (or node) and report errors that surpass their
predefined thresholds. The port address is a lid unless the -G option is used
to specify a GUID address. The predefined thresholds can be dumped using the
-s option, and a user defined threshold_file (using the same format as the
dump) can be specified using the -T option.
.SH OPTIONS
.PP
\-G use GUID address argument. In most cases, it is the Port GUID. Example:
"0x08f1040023"
.PP
\-s show predefined thresholds
.PP
\-T use specified threshold file
.PP
\-v increase the verbosity level
.PP
\-b brief mode. Reduce the output to show only whether errors are present, not
what they are.
.PP
\-N | \-nocolor use mono rather than color mode
.PP
\-C use the specified ca_name.
.PP
\-P use the specified ca_port.
.PP
\-t override the default timeout for the solicited mads.
.SH EXAMPLE
.PP
ibcheckerrs 2         # check aggregated node counter for lid 2
.PP
ibcheckerrs 2 4       # check port counters for lid 2 port 4
.PP
ibcheckerrs -T xxx 2  # check node using xxx threshold file
.SH SEE ALSO
.BR perfquery(8),
.BR ibaddr(8)
.SH AUTHOR
.TP
Hal Rosenstock
.RI < halr@voltaire.com >
rdma-core-56.1/infiniband-diags/man/ibchecknet.8000066400000000000000000000015001477342711600214610ustar00rootroot00000000000000
.TH IBCHECKNET 8 "May 21, 2007" "OpenIB" "OpenIB Diagnostics"
.SH NAME
ibchecknet \- validate IB subnet and report errors
.SH SYNOPSIS
.B ibchecknet
[\-h] [\-N | \-nocolor] [<topology-file> | \-C ca_name \-P ca_port \-t(imeout) timeout_ms]
.SH DESCRIPTION
.PP
ibchecknet is a script which uses a full topology file that was created by
ibnetdiscover, scans the network to validate the connectivity, and reports
errors (from port counters).
.SH OPTIONS
.PP
\-N | \-nocolor use mono rather than color mode
.PP
\-C use the specified ca_name.
.PP
\-P use the specified ca_port.
.PP
\-t override the default timeout for the solicited mads.
.SH SEE ALSO
.BR ibnetdiscover(8),
.BR ibchecknode(8),
.BR ibcheckport(8),
.BR ibcheckerrs(8)
.SH AUTHOR
.TP
Hal Rosenstock
.RI < halr@voltaire.com >
rdma-core-56.1/infiniband-diags/man/ibchecknode.8000066400000000000000000000017261477342711600216300ustar00rootroot00000000000000
.TH IBCHECKNODE 8 "May 21, 2007" "OpenIB" "OpenIB Diagnostics"
.SH NAME
ibchecknode \- validate IB node and report errors
.SH SYNOPSIS
.B ibchecknode
[\-h] [\-v] [\-N | \-nocolor] [\-G] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] <lid|guid>
.SH DESCRIPTION
.PP
Check connectivity and do some simple sanity checks for the specified node.
The port address is a lid unless the -G option is used to specify a GUID
address.
.SH OPTIONS
.PP
\-G use GUID address argument. In most cases, it is the Port GUID. Example:
"0x08f1040023"
.PP
\-v increase the verbosity level
.PP
\-N | \-nocolor use mono rather than color mode
.PP
\-C use the specified ca_name.
.PP
\-P use the specified ca_port.
.PP
\-t override the default timeout for the solicited mads.
.SH EXAMPLE .PP ibchecknode 2 # check node via lid 2 .SH SEE ALSO .BR smpquery(8), .BR ibaddr(8) .SH AUTHOR .TP Hal Rosenstock .RI < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibcheckport.8000066400000000000000000000017341477342711600216660ustar00rootroot00000000000000.TH IBCHECKPORT 8 "May 21, 2007" "OpenIB" "OpenIB Diagnostics" .SH NAME ibcheckport \- validate IB port and report errors .SH SYNOPSIS .B ibcheckport [\-h] [\-v] [\-N | \-nocolor] [\-G] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] .SH DESCRIPTION .PP Check connectivity and do some simple sanity checks for the specified port. Port address is a lid unless -G option is used to specify a GUID address. .SH OPTIONS .PP \-G use GUID address argument. In most cases, it is the Port GUID. Example: "0x08f1040023" .PP \-v increase the verbosity level .PP \-N | \-nocolor use mono rather than color mode .PP \-C use the specified ca_name. .PP \-P use the specified ca_port. .PP \-t override the default timeout for the solicited mads. .SH EXAMPLE .PP ibcheckport 2 3 # check lid 2 port 3 .SH SEE ALSO .BR smpquery(8), .BR ibaddr(8) .SH AUTHOR .TP Hal Rosenstock .RI < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibcheckportstate.8000066400000000000000000000020421477342711600227200ustar00rootroot00000000000000.TH IBCHECKPORTSTATE 8 "May 21, 2007" "OpenIB" "OpenIB Diagnostics" .SH NAME ibcheckportstate \- validate IB port for LinkUp and not Active state .SH SYNOPSIS .B ibcheckportstate [\-h] [\-v] [\-N | \-nocolor] [\-G] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] .SH DESCRIPTION .PP Check connectivity and check the specified port for proper port state (Active) and port physical state (LinkUp). Port address is a lid unless -G option is used to specify a GUID address. .SH OPTIONS .PP \-G use GUID address argument. In most cases, it is the Port GUID. Example: "0x08f1040023" .PP \-v increase the verbosity level .PP \-N | \-nocolor use mono rather than color mode .PP \-C use the specified ca_name. .PP \-P use the specified ca_port. .PP \-t override the default timeout for the solicited mads. .SH EXAMPLE .PP ibcheckportstate 2 3 # check lid 2 port 3 .SH SEE ALSO .BR smpquery(8), .BR ibaddr(8) .SH AUTHOR .TP Hal Rosenstock .RI < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibcheckportwidth.8000066400000000000000000000017471477342711600227320ustar00rootroot00000000000000.TH IBCHECKPORTWIDTH 8 "May 21, 2007" "OpenIB" "OpenIB Diagnostics" .SH NAME ibcheckportwidth \- validate IB port for 1x link width .SH SYNOPSIS .B ibcheckportwidth [\-h] [\-v] [\-N | \-nocolor] [\-G] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] .SH DESCRIPTION .PP Check connectivity and check the specified port for 1x link width. Port address is a lid unless -G option is used to specify a GUID address. .SH OPTIONS .PP \-G use GUID address argument. In most cases, it is the Port GUID. Example: "0x08f1040023" .PP \-v increase the verbosity level .PP \-N | \-nocolor use mono rather than color mode .PP \-C use the specified ca_name. .PP \-P use the specified ca_port. .PP \-t override the default timeout for the solicited mads. 
.SH EXAMPLE
.PP
ibcheckportwidth 2 3  # check lid 2 port 3
.SH SEE ALSO
.BR smpquery(8),
.BR ibaddr(8)
.SH AUTHOR
.TP
Hal Rosenstock
.RI < halr@voltaire.com >
rdma-core-56.1/infiniband-diags/man/ibcheckstate.8000066400000000000000000000016501477342711600220170ustar00rootroot00000000000000
.TH IBCHECKSTATE 8 "May 21, 2007" "OpenIB" "OpenIB Diagnostics"
.SH NAME
ibcheckstate \- find ports in IB subnet which are link up but not active
.SH SYNOPSIS
.B ibcheckstate
[\-h] [\-v] [\-N | \-nocolor] [<topology-file> | \-C ca_name \-P ca_port \-t(imeout) timeout_ms]
.SH DESCRIPTION
.PP
ibcheckstate is a script which uses a full topology file that was created by
ibnetdiscover, scans the network to validate the port state and port physical
state, and reports any ports which have a port state other than Active or a
port physical state other than LinkUp.
.SH OPTIONS
.PP
\-N | \-nocolor use mono rather than color mode
.PP
\-C use the specified ca_name.
.PP
\-P use the specified ca_port.
.PP
\-t override the default timeout for the solicited mads.
.SH SEE ALSO
.BR ibnetdiscover(8),
.BR ibchecknode(8),
.BR ibcheckportstate(8)
.SH AUTHOR
.TP
Hal Rosenstock
.RI < halr@voltaire.com >
rdma-core-56.1/infiniband-diags/man/ibcheckwidth.8000066400000000000000000000014521477342711600220160ustar00rootroot00000000000000
.TH IBCHECKWIDTH 8 "May 21, 2007" "OpenIB" "OpenIB Diagnostics"
.SH NAME
ibcheckwidth \- find 1x links in IB subnet
.SH SYNOPSIS
.B ibcheckwidth
[\-h] [\-v] [\-N | \-nocolor] [<topology-file> | \-C ca_name \-P ca_port \-t(imeout) timeout_ms]
.SH DESCRIPTION
.PP
ibcheckwidth is a script which uses a full topology file that was created by
ibnetdiscover, scans the network to validate the active link widths and
reports any 1x links.
.SH OPTIONS
.PP
\-N | \-nocolor use mono rather than color mode
.PP
\-C use the specified ca_name.
.PP
\-P use the specified ca_port.
.PP
\-t override the default timeout for the solicited mads.
.SH SEE ALSO
.BR ibnetdiscover(8),
.BR ibchecknode(8),
.BR ibcheckportwidth(8)
.SH AUTHOR
.TP
Hal Rosenstock
.RI < halr@voltaire.com >
rdma-core-56.1/infiniband-diags/man/ibclearcounters.8000066400000000000000000000012651477342711600225540ustar00rootroot00000000000000
.TH IBCLEARCOUNTERS 8 "May 21, 2007" "OpenIB" "OpenIB Diagnostics"
.SH NAME
ibclearcounters \- clear port counters in IB subnet
.SH SYNOPSIS
.B ibclearcounters
[\-h] [<topology-file> | \-C ca_name \-P ca_port \-t(imeout) timeout_ms]
.SH DESCRIPTION
.PP
ibclearcounters is a script that clears the PMA port counters by either
walking the IB subnet topology or using an already saved topology file.
.SH OPTIONS
.PP
\-C use the specified ca_name.
.PP
\-P use the specified ca_port.
.PP
\-t override the default timeout for the solicited mads.
.SH SEE ALSO
.BR ibnetdiscover(8),
.BR perfquery(8)
.SH AUTHOR
.TP
Hal Rosenstock
.RI < halr@voltaire.com >
rdma-core-56.1/infiniband-diags/man/ibclearerrors.8000066400000000000000000000014071477342711600222240ustar00rootroot00000000000000
.TH IBCLEARERRORS 8 "May 21, 2007" "OpenIB" "OpenIB Diagnostics"
.SH NAME
ibclearerrors \- clear error counters in IB subnet
.SH SYNOPSIS
.B ibclearerrors
[\-h] [\-N | \-nocolor] [<topology-file> | \-C ca_name \-P ca_port \-t(imeout) timeout_ms]
.SH DESCRIPTION
.PP
ibclearerrors is a script which clears the PMA error counters in PortCounters
by either walking the IB subnet topology or using an already saved topology
file.
.SH OPTIONS
.PP
\-N | \-nocolor use mono rather than color mode
.PP
\-C use the specified ca_name.
.PP
\-P use the specified ca_port.
.PP
\-t override the default timeout for the solicited mads.
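.SH EXAMPLE
.PP
The invocations below are illustrative; my.topo is a placeholder for a
topology file previously created by ibnetdiscover.
.PP
ibclearerrors          # walk the subnet and clear the PMA error counters
.PP
ibclearerrors my.topo  # clear error counters using a saved topology file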
.SH SEE ALSO
.BR ibnetdiscover(8),
.BR perfquery(8)
.SH AUTHOR
.TP
Hal Rosenstock
.RI < halr@voltaire.com >
rdma-core-56.1/infiniband-diags/man/ibdatacounters.8000066400000000000000000000017001477342711600223710ustar00rootroot00000000000000
.TH IBDATACOUNTERS 8 "May 31, 2007" "OpenIB" "OpenIB Diagnostics"
.SH NAME
ibdatacounters \- query IB subnet for data counters
.SH SYNOPSIS
.B ibdatacounters
[\-h] [\-b] [\-v] [\-N | \-nocolor] [<topology-file> | \-C ca_name \-P ca_port \-t(imeout) timeout_ms]
.SH DESCRIPTION
.PP
ibdatacounters is a script which uses a full topology file that was created by
ibnetdiscover, scans the network to validate the connectivity, and reports the
data counters (from port counters).
.SH OPTIONS
.PP
\-v increase the verbosity level
.PP
\-b brief mode. Reduce the output to show only whether errors are present, not
what they are.
.PP
\-N | \-nocolor use mono rather than color mode
.PP
\-C use the specified ca_name.
.PP
\-P use the specified ca_port.
.PP
\-t override the default timeout for the solicited mads.
.SH SEE ALSO
.BR ibnetdiscover(8),
.BR ibdatacounts(8)
.SH AUTHOR
.TP
Hal Rosenstock
.RI < halr@voltaire.com >
rdma-core-56.1/infiniband-diags/man/ibdatacounts.8000066400000000000000000000020561477342711600220470ustar00rootroot00000000000000
.TH IBDATACOUNTS 8 "May 30, 2007" "OpenIB" "OpenIB Diagnostics"
.SH NAME
ibdatacounts \- get IB port data counters
.SH SYNOPSIS
.B ibdatacounts
[\-h] [\-b] [\-v] [\-G] [\-N | \-nocolor] [\-C ca_name] [\-P ca_port] [\-t(imeout) timeout_ms] <lid|guid> [<port>]
.SH DESCRIPTION
.PP
Obtain PMA data counters from the specified port (or node). The port address
is a lid unless the -G option is used to specify a GUID address.
.SH OPTIONS
.PP
\-G use GUID address argument. In most cases, it is the Port GUID. Example:
"0x08f1040023"
.PP
\-v increase the verbosity level
.PP
\-b brief mode
.PP
\-N | \-nocolor use mono rather than color mode
.PP
\-C use the specified ca_name.
.PP
\-P use the specified ca_port.
.PP
\-t override the default timeout for the solicited mads.
.SH EXAMPLE
.PP
ibdatacounts 2    # show data counters for lid 2
.PP
ibdatacounts 2 4  # show data counters for lid 2 port 4
.SH SEE ALSO
.BR perfquery(8),
.BR ibaddr(8)
.SH AUTHOR
.TP
Hal Rosenstock
.RI < halr@voltaire.com >
rdma-core-56.1/infiniband-diags/man/ibdiscover.8000066400000000000000000000023501477342711600215150ustar00rootroot00000000000000
.TH IBDISCOVER.PL 8 "September 21, 2006" "OpenIB" "OpenIB Diagnostics"
.SH NAME
ibdiscover.pl \- annotate and compare InfiniBand topology
.SH SYNOPSIS
.B ibdiscover.pl
.SH DESCRIPTION
.PP
ibdiscover.pl uses a topology file created by ibnetdiscover, an
ibdiscover.map file (created by the network administrator) which indicates
the nodes to be expected, and an ibdiscover.topo file which holds the
expected connectivity. It produces a new connectivity file
(ibdiscover.topo.new) and outputs the changes to stdout. The network
administrator can choose to replace the "old" topo file with the new one, or
to merge selected changes into it.

The syntax of the ibdiscover.map file is:

<GUID>|port|"Text for node"|

e.g.

8f10400410015|8|"ISR 6000"|# SW-6IB4 Voltaire port 0 lid 5
8f10403960558|2|"HCA 1"|# MT23108 InfiniHost Mellanox Technologies

The syntax of the old and new topo files (ibdiscover.topo and
ibdiscover.topo.new) is:

<LocalPort>|<LocalGUID>|<RemotePort>|<RemoteGUID>

e.g.

10|5442ba00003080|1|8f10400410015

These topo files are produced by the ibdiscover.pl tool.
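.PP
For example, assuming the map and topo files are kept in the current working
directory, one possible check-and-accept cycle is:
.PP
ibnetdiscover | ibdiscover.pl          # report changes against ibdiscover.topo
.PP
cp ibdiscover.topo.new ibdiscover.topo # accept the new connectivity as the baseline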
.SH USAGE .PP ibnetdiscover | ibdiscover.pl .SH SEE ALSO .BR ibnetdiscover(8) .SH AUTHOR .TP Hal Rosenstock .RI < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibfindnodesusing.8.in.rst000066400000000000000000000023721477342711600241360ustar00rootroot00000000000000================ ibfindnodesusing ================ ------------------------------------------------------------------------------- find a list of end nodes which are routed through the specified switch and port ------------------------------------------------------------------------------- :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== ibfindnodesusing.pl [options] DESCRIPTION =========== ibfindnodesusing.pl uses ibroute and detects the current nodes which are routed through both directions of the link specified. The link is specified by one switch port end; the script finds the remote end automatically. OPTIONS ======= **-h** show help **-R** Recalculate the ibnetdiscover information, ie do not use the cached information. This option is slower but should be used if the diag tools have not been used for some time or if there are other reasons to believe that the fabric has changed. **-C ** use the specified ca_name. **-P ** use the specified ca_port. FILES ===== .. include:: common/sec_config-file.rst .. include:: common/sec_node-name-map.rst AUTHOR ====== Ira Weiny < ira.weiny@intel.com > rdma-core-56.1/infiniband-diags/man/ibhosts.8.in.rst000066400000000000000000000016431477342711600222570ustar00rootroot00000000000000======= IBHOSTS ======= -------------------------------------- show InfiniBand host nodes in topology -------------------------------------- :Date: 2016-12-20 :Manual section: 8 :Manual group: OpenIB Diagnostics SYNOPSIS ======== ibhosts [options] [] DESCRIPTION =========== ibhosts is a script which either walks the IB subnet topology or uses an already saved topology file and extracts the CA nodes. OPTIONS ======= .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/opt_t.rst .. include:: common/opt_y.rst .. include:: common/opt_h.rst .. include:: common/opt_z-config.rst .. include:: common/sec_portselection.rst FILES ===== .. include:: common/sec_config-file.rst .. include:: common/sec_node-name-map.rst SEE ALSO ======== ibnetdiscover(8) DEPENDENCIES ============ ibnetdiscover, ibnetdiscover format AUTHOR ====== Hal Rosenstock < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibidsverify.8.in.rst000066400000000000000000000025041477342711600231200ustar00rootroot00000000000000=========== ibidsverify =========== --------------------------------------------------- validate IB identifiers in subnet and report errors --------------------------------------------------- :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== ibidsverify.pl [-h] [-R] DESCRIPTION =========== ibidsverify.pl is a perl script which uses a full topology file that was created by ibnetdiscover, scans the network to validate the LIDs and GUIDs in the subnet. The validation consists of checking that there are no zero or duplicate identifiers. Finally, ibidsverify.pl will also reuse the cached ibnetdiscover output from some of the other diag tools which makes it a bit faster than running ibnetdiscover from scratch. OPTIONS ======= **-R** Recalculate the ibnetdiscover information, ie do not use the cached information. 
This option is slower but should be used if the diag tools have not been used for some time or if there are other reasons to believe the fabric has changed. **-C ** use the specified ca_name. **-P ** use the specified ca_port. EXIT STATUS =========== Exit status is 1 if errors are found, 0 otherwise. FILES ===== .. include:: common/sec_config-file.rst SEE ALSO ======== **ibnetdiscover(8)** AUTHOR ====== Hal Rosenstock < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/iblinkinfo.8.in.rst000066400000000000000000000070671477342711600227360ustar00rootroot00000000000000========== IBLINKINFO ========== -------------------------------------------- report link info for all links in the fabric -------------------------------------------- :Date: 2018-07-09 :Manual section: 8 :Manual group: OpenIB Diagnostics SYNOPSIS ======== iblinkinfo DESCRIPTION =========== iblinkinfo reports link info for each port in an IB fabric, node by node. Optionally, iblinkinfo can do partial scans and limit its output to parts of a fabric. OPTIONS ======= **--down, -d** Print only nodes which have a port in the "Down" state. **--line, -l** Print all information for each link on one line. Default is to print a header with the node information and then a list for each port (useful for grep'ing output). **--additional, -p** Print additional port settings (,,) **--switches-only** Show only switches in output. **--cas-only** Show only CAs in output. Partial Scan flags ------------------ The node to start a partial scan can be specified with the following addresses. .. include:: common/opt_G_with_param.rst .. include:: common/opt_D_with_param.rst **Note:** For switches results are printed for all ports not just switch port 0. **--switch, -S ** same as "-G". (provided only for backward compatibility) How much of the scan to be printed can be controlled with the following. **--all, -a** Print all nodes found in a partial fabric scan. Normally a partial fabric scan will return only the node specified. This option will print the other nodes found as well. **--hops, -n ** Specify the number of hops away from a specified node to scan. This is useful to expand a partial fabric scan beyond the node specified. Cache File flags ---------------- .. include:: common/opt_load-cache.rst .. include:: common/opt_diff.rst **--diffcheck ** Specify what diff checks should be done in the **--diff** option above. Comma separate multiple diff check key(s). The available diff checks are: **port** = port connections, **state** = port state, **lid** = lids, **nodedesc** = node descriptions. Note that **port**, **lid**, and **nodedesc** are checked only for the node types that are specified (e.g. **switches-only**, **cas-only**). If **port** is specified alongside **lid** or **nodedesc**, remote port lids and node descriptions will also be compared. **--filterdownports ** Filter downports indicated in a ibnetdiscover cache. If a port was previously indicated as down in the specified cache, and is still down, do not output it in the resulting output. This option may be particularly useful for environments where switches are not fully populated, thus much of the default iblinkinfo info is considered useless. See **ibnetdiscover** for information on caching ibnetdiscover output. Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Configuration flags ------------------- .. include:: common/opt_z-config.rst .. include:: common/opt_o-outstanding_smps.rst .. 
include:: common/opt_node_name_map.rst .. include:: common/opt_t.rst .. include:: common/opt_y.rst Debugging flags --------------- .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst EXIT STATUS =========== 0 on success, -1 on failure to scan the fabric, 1 if check mode is used and inconsistencies are found. FILES ===== .. include:: common/sec_config-file.rst .. include:: common/sec_node-name-map.rst AUTHOR ====== Ira Weiny < ira.weiny@intel.com > rdma-core-56.1/infiniband-diags/man/ibnetdiscover.8.in.rst000066400000000000000000000046101477342711600234410ustar00rootroot00000000000000============= IBNETDISCOVER ============= ---------------------------- discover InfiniBand topology ---------------------------- :Date: 2013-06-22 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== ibnetdiscover [options] [] DESCRIPTION =========== ibnetdiscover performs IB subnet discovery and outputs a human readable topology file. GUIDs, node types, and port numbers are displayed as well as port LIDs and NodeDescriptions. All nodes (and links) are displayed (full topology). Optionally, this utility can be used to list the current connected nodes by nodetype. The output is printed to standard output unless a topology file is specified. OPTIONS ======= **-l, --list** List of connected nodes **-g, --grouping** Show grouping. Grouping correlates IB nodes by different vendor specific schemes. It may also show the switch external ports correspondence. **-H, --Hca_list** List of connected CAs **-S, --Switch_list** List of connected switches **-R, --Router_list** List of connected routers **-s, --show** Show progress information during discovery. **-f, --full** Show full information (ports' speed and width, vlcap) **-p, --ports** Obtain a ports report which is a list of connected ports with relevant information (like LID, portnum, GUID, width, speed, and NodeDescription). **-m, --max_hops** Report max hops discovered. .. include:: common/opt_o-outstanding_smps.rst Cache File flags ---------------- .. include:: common/opt_cache.rst .. include:: common/opt_load-cache.rst .. include:: common/opt_diff.rst .. include:: common/opt_diffcheck.rst Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Configuration flags ------------------- .. include:: common/opt_z-config.rst .. include:: common/opt_o-outstanding_smps.rst .. include:: common/opt_node_name_map.rst .. include:: common/opt_t.rst .. include:: common/opt_y.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst FILES ===== .. include:: common/sec_config-file.rst .. include:: common/sec_node-name-map.rst .. 
include:: common/sec_topology-file.rst AUTHORS ======= Hal Rosenstock < halr@voltaire.com > Ira Weiny < ira.weiny@intel.com > rdma-core-56.1/infiniband-diags/man/ibnodes.8.in.rst000066400000000000000000000016131477342711600222240ustar00rootroot00000000000000======= IBNODES ======= --------------------------------- show InfiniBand nodes in topology --------------------------------- :Date: 2012-05-14 :Manual section: 8 :Manual group: OpenIB Diagnostics SYNOPSIS ======== ibnodes [options] [] DESCRIPTION =========== ibnodes is a script which either walks the IB subnet topology or uses an already saved topology file and extracts the IB nodes (CAs and switches). OPTIONS ======= .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/opt_t.rst .. include:: common/opt_h.rst .. include:: common/opt_z-config.rst .. include:: common/sec_portselection.rst FILES ===== .. include:: common/sec_config-file.rst .. include:: common/sec_node-name-map.rst SEE ALSO ======== ibnetdiscover(8) DEPENDENCIES ============ ibnetdiscover, ibnetdiscover format AUTHOR ====== Hal Rosenstock < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibping.8.in.rst000066400000000000000000000026501477342711600220530ustar00rootroot00000000000000====== IBPING ====== -------------------------- ping an InfiniBand address -------------------------- :Date: 2012-05-14 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== ibping [options] DESCRIPTION =========== ibping uses vendor mads to validate connectivity between IB nodes. On exit, (IP) ping like output is show. ibping is run as client/server. Default is to run as client. Note also that a default ping server is implemented within the kernel. OPTIONS ======= **-c, --count** stop after count packets **-f, --flood** flood destination: send packets back to back without delay **-o, --oui** use specified OUI number to multiplex vendor mads **-S, --Server** start in server mode (do not return) Addressing Flags ---------------- .. include:: common/opt_L.rst .. include:: common/opt_G.rst .. include:: common/opt_s.rst Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Configuration flags ------------------- .. include:: common/opt_z-config.rst .. include:: common/opt_t.rst Debugging flags --------------- .. include:: common/opt_h.rst .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst FILES ===== .. include:: common/sec_config-file.rst AUTHOR ====== Hal Rosenstock < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibportstate.8.in.rst000066400000000000000000000105131477342711600231400ustar00rootroot00000000000000=========== IBPORTSTATE =========== ----------------------------------------------------------------- handle port (physical) state and link speed of an InfiniBand port ----------------------------------------------------------------- :Date: 2013-03-26 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== ibportstate [options] [] DESCRIPTION =========== ibportstate allows the port state and port physical state of an IB port to be queried (in addition to link width and speed being validated relative to the peer port when the port queried is a switch port), or a switch port to be disabled, enabled, or reset. InfiniBand HCA port state may be changed locally without the knowledge of the Subnet Manager. 
It also allows the link speed/width enabled on any IB port to be adjusted.

OPTIONS
=======

**<op>**
        Supported ops: enable, disable, reset, speed, espeed, fdr10, width,
        query, on, off, down, arm, active, vls, mtu, lid, smlid, lmc, mkey,
        mkeylease, mkeyprot (Default is query)

**enable, disable, and reset** change or reset a switch or HCA port state
(You must specify the CA name and Port number when locally changing CA port
state.)

**off** change the port state to disable.

**on** change the port state to enable (only when the current state is
disable).

**speed and width** are allowed on any port

**speed** values are the legal values for PortInfo:LinkSpeedEnabled (An error
is indicated if PortInfo:LinkSpeedSupported does not support this setting)

**espeed** is allowed on any port supporting extended link speeds

**fdr10** is allowed on any port supporting fdr10 (An error is indicated if
the port's capability mask indicates extended link speeds are not supported
or if PortInfo:LinkSpeedExtSupported does not support this setting)

**width** values are legal values for PortInfo:LinkWidthEnabled (An error is
indicated if PortInfo:LinkWidthSupported does not support this setting)
(NOTE: Speed and width changes are not effected until the port goes through
link renegotiation)

**query** also validates port characteristics (link width, speed, espeed,
and fdr10) based on the peer port. This checking is done when the port
queried is a switch port, as it relies on combined routing (an initial LID
route with directed routing to the peer) which can only be done on a switch.
This peer port validation feature of the query op requires LID routing to be
functioning in the subnet.

**mkey, mkeylease, and mkeyprot** are only allowed on CAs, routers, or
switch port 0 (An error is generated if attempted on external switch ports).
Hexadecimal and octal mkeys may be specified by prepending the key with '0x'
or '0', respectively. If a non-numeric value (like 'x') is specified for the
mkey, then ibportstate will prompt for a value.

Addressing Flags
----------------

.. include:: common/opt_L.rst
.. include:: common/opt_G.rst
.. include:: common/opt_D.rst
.. include:: common/opt_s.rst

Port Selection flags
--------------------

.. include:: common/opt_C.rst
.. include:: common/opt_P.rst
.. include:: common/sec_portselection.rst

Configuration flags
-------------------

.. include:: common/opt_z-config.rst
.. include:: common/opt_t.rst
.. include:: common/opt_y.rst

Debugging flags
---------------

.. include:: common/opt_h.rst
.. include:: common/opt_debug.rst
.. include:: common/opt_e.rst
.. include:: common/opt_K.rst
.. include:: common/opt_verbose.rst
.. include:: common/opt_V.rst

FILES
=====

..
include:: common/sec_config-file.rst EXAMPLES ======== :: ibportstate -C qib0 -P 1 3 1 disable # by CA name, CA Port Number, lid, physical port number ibportstate -C qib0 -P 1 3 1 enable # by CA name, CA Port Number, lid, physical port number ibportstate -D 0 1 # (query) by direct route ibportstate 3 1 reset # by lid ibportstate 3 1 speed 1 # by lid ibportstate 3 1 width 1 # by lid ibportstate -D 0 1 lid 0x1234 arm # by direct route AUTHOR ====== Hal Rosenstock < hal.rosenstock@gmail.com > rdma-core-56.1/infiniband-diags/man/ibprintca.8000066400000000000000000000022761477342711600213460ustar00rootroot00000000000000.TH IBPRINTCA 8 "May 31, 2007" "OpenIB" "OpenIB Diagnostics" .SH NAME ibprintca.pl \- print either the ca specified or the list of cas from the ibnetdiscover output .SH SYNOPSIS .B ibprintca.pl [-R -l -C -P ] [] .SH DESCRIPTION .PP Faster than greping/viewing with an editor the output of ibnetdiscover, ibprintca.pl will parse out and print either the CA information for the specified CA or a list of all the CAs in the subnet. Finally, ibprintca.pl will also reuse the cached ibnetdiscover output from some of the other diag tools which makes it a bit faster than running ibnetdiscover from scratch. .SH OPTIONS .PP .TP \fB\-l\fR List the CAs (simply a wrapper for ibhosts). .TP \fB\-R\fR Recalculate the ibnetdiscover information, ie do not use the cached information. This option is slower but should be used if the diag tools have not been used for some time or if there are other reasons to believe that the fabric has changed. .TP \fB\-C \fR use the specified ca_name for the search. .TP \fB\-P \fR use the specified ca_port for the search. .SH AUTHORS .TP Ira Weiny .RI < weiny2@llnl.gov > .TP Hal Rosenstock .RI < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibprintrt.8000066400000000000000000000022631477342711600214040ustar00rootroot00000000000000.TH IBPRINTRT 8 "May 31, 2007" "OpenIB" "OpenIB Diagnostics" .SH NAME ibprintrt.pl \- print either only the router specified or a list of routers from the ibnetdiscover output .SH SYNOPSIS .B ibprintrt.pl [-R -l -C -P ] [] .SH DESCRIPTION .PP Faster than greping/viewing with an editor the output of ibnetdiscover, ibprintrt.pl will parse out and print either the router information for the specified IB router or a list of all IB routers in the subnet. Finally, ibprintrt.pl will also reuse the cached ibnetdiscover output from some of the other diag tools which makes it a bit faster than running ibnetdiscover from scratch. .SH OPTIONS .PP .TP \fB\-l\fR List the Rts (simply a wrapper for ibrouters). .TP \fB\-R\fR Recalculate the ibnetdiscover information, ie do not use the cached information. This option is slower but should be used if the diag tools have not been used for some time or if there are other reasons to believe that the fabric has changed. .TP \fB\-C \fR use the specified ca_name for the search. .TP \fB\-P \fR use the specified ca_port for the search. 
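.SH EXAMPLE
.PP
Illustrative invocations using only the options documented above:
.PP
ibprintrt.pl -l     # list all IB routers in the subnet (wraps ibrouters)
.PP
ibprintrt.pl -R -l  # same, but rescan the fabric instead of using the cache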
.SH AUTHOR
.TP
Hal Rosenstock
.RI < halr@voltaire.com >
rdma-core-56.1/infiniband-diags/man/ibprintswitch.8000066400000000000000000000026541477342711600222640ustar00rootroot00000000000000
.TH IBPRINTSWITCH 8 "May 31, 2007" "OpenIB" "OpenIB Diagnostics"
.SH NAME
ibprintswitch.pl \- print either the switch specified or a list of switches from the ibnetdiscover output
.SH SYNOPSIS
.B ibprintswitch.pl
[-R -l -C <ca_name> -P <ca_port>] [<switch_guid|switch_name>]
.SH DESCRIPTION
.PP
Faster than greping/viewing with an editor the output of ibnetdiscover,
ibprintswitch.pl will parse out and print either the switch information for
the switch specified or a list of all the switches found in the subnet. In
addition, it will crudely parse the node description information and, if that
information is consistent, report all the information for an entire chassis.
Finally, ibprintswitch.pl will also reuse the cached ibnetdiscover output from
some of the other diag tools, which makes it a bit faster than running
ibnetdiscover from scratch.
.SH OPTIONS
.PP
.TP
\fB\-l\fR
List the switches (simply a wrapper for ibswitches).
.TP
\fB\-R\fR
Recalculate the ibnetdiscover information, ie do not use the cached
information. This option is slower but should be used if the diag tools have
not been used for some time or if there are other reasons to believe that the
fabric has changed.
.TP
\fB\-C <ca_name>\fR
use the specified ca_name for the search.
.TP
\fB\-P <ca_port>\fR
use the specified ca_port for the search.
.SH AUTHORS
.TP
Ira Weiny
.RI < weiny2@llnl.gov >
.TP
Hal Rosenstock
.RI < halr@voltaire.com >
rdma-core-56.1/infiniband-diags/man/ibqueryerrors.8.in.rst000066400000000000000000000072671477342711600235250ustar00rootroot00000000000000
=============
IBQUERYERRORS
=============

---------------------------------
query and report IB port counters
---------------------------------

:Date: 2016-09-26
:Manual section: 8
:Manual group: OpenIB Diagnostics

SYNOPSIS
========

ibqueryerrors [options]

DESCRIPTION
===========

The default behavior is to report the port error counters which exceed a
threshold for each port in the fabric. The default threshold is zero (0).
Error fields can also be suppressed entirely.

In addition to reporting errors on every port, ibqueryerrors can report the
port transmit and receive data as well as report full link information to the
remote port if available.

OPTIONS
=======

**-s, --suppress <err1,err2,...>**
        Suppress the errors listed in the comma separated list provided.

**-c, --suppress-common**
        Suppress some of the common "side effect" counters. These counters
        usually do not indicate an error condition and can usually be safely
        ignored.

**-r, --report-port**
        Report the port information. This includes LID, port, external port
        (if applicable), link speed setting, remote GUID, remote port, remote
        external port (if applicable), and remote node description
        information.

**--data**
        Include the optional transmit and receive data counters.

**--threshold-file <threshold_file>**
        Specify an alternate threshold file. The default is
        @IBDIAG_CONFIG_PATH@/error_thresholds

**--switch**
        print data for switches only

**--ca**
        print data for CAs only

**--skip-sl**
        Use the default sl for queries. This is not recommended when using a
        QoS aware routing engine as it can cause a credit deadlock.

**--router**
        print data for routers only

**--clear-errors -k**
        Clear error counters after read.

**--clear-counts -K**
        Clear data counters after read.

**CAUTION** clearing data or error counters will occur regardless of whether
they are printed or not.
See **--counters** and **--data** for details on controlling which counters are printed. **--details** include receive error and transmit discard details **--counters** print data counters only Partial Scan flags ------------------ The node to start a partial scan can be specified with the following addresses. .. include:: common/opt_G_with_param.rst .. include:: common/opt_D_with_param.rst **Note:** For switches results are printed for all ports not just switch port 0. **-S ** same as "-G". (provided only for backward compatibility) Cache File flags ---------------- .. include:: common/opt_load-cache.rst Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Configuration flags ------------------- .. include:: common/opt_z-config.rst .. include:: common/opt_o-outstanding_smps.rst .. include:: common/opt_node_name_map.rst .. include:: common/opt_t.rst .. include:: common/opt_y.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst **-R** (This option is obsolete and does nothing) EXIT STATUS =========== **-1** if scan fails. **0** if scan succeeds without errors beyond thresholds **1** if errors are found beyond thresholds or inconsistencies are found in check mode. FILES ===== ERROR THRESHOLD --------------- @IBDIAG_CONFIG_PATH@/error_thresholds Define threshold values for errors. File format is simple "name=val". Comments begin with '#' **Example:** :: # Define thresholds for error counters SymbolErrorCounter=10 LinkErrorRecoveryCounter=10 VL15Dropped=100 .. include:: common/sec_config-file.rst .. include:: common/sec_node-name-map.rst AUTHOR ====== Ira Weiny < ira.weiny@intel.com > rdma-core-56.1/infiniband-diags/man/ibroute.8.in.rst000066400000000000000000000047411477342711600222570ustar00rootroot00000000000000======= ibroute ======= ----------------------------------------- query InfiniBand switch forwarding tables ----------------------------------------- :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== ibroute [options] [ [ []]] DESCRIPTION =========== ibroute uses SMPs to display the forwarding tables (unicast (LinearForwardingTable or LFT) or multicast (MulticastForwardingTable or MFT)) for the specified switch LID and the optional lid (mlid) range. The default range is all valid entries in the range 1...FDBTop. OPTIONS ======= **-a, --all** show all lids in range, even invalid entries **-n, --no_dests** do not try to resolve destinations **-M, --Multicast** show multicast forwarding tables In this case, the range parameters are specifying the mlid range. Addressing Flags ---------------- .. include:: common/opt_D.rst .. include:: common/opt_G.rst .. include:: common/opt_L.rst .. include:: common/opt_s.rst Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Configuration flags ------------------- .. include:: common/opt_t.rst .. include:: common/opt_y.rst .. include:: common/opt_node_name_map.rst .. include:: common/opt_z-config.rst FILES ===== .. include:: common/sec_config-file.rst .. 
include:: common/sec_node-name-map.rst EXAMPLES ======== Unicast examples :: ibroute 4 # dump all lids with valid out ports of switch with lid 4 ibroute -a 4 # same, but dump all lids, even with invalid out ports ibroute -n 4 # simple dump format - no destination resolution ibroute 4 10 # dump lids starting from 10 (up to FDBTop) ibroute 4 0x10 0x20 # dump lid range ibroute -G 0x08f1040023 # resolve switch by GUID ibroute -D 0,1 # resolve switch by direct path Multicast examples :: ibroute -M 4 # dump all non empty mlids of switch with lid 4 ibroute -M 4 0xc010 0xc020 # same, but with range ibroute -M -n 4 # simple dump format SEE ALSO ======== ibtracert (8) AUTHOR ====== Hal Rosenstock < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibrouters.8.in.rst000066400000000000000000000016701477342711600226220ustar00rootroot00000000000000========= IBROUTERS ========= ---------------------------------------- show InfiniBand router nodes in topology ---------------------------------------- :Date: 2016-12-20 :Manual section: 8 :Manual group: OpenIB Diagnostics SYNOPSIS ======== ibrouters [options] [] DESCRIPTION =========== ibrouters is a script which either walks the IB subnet topology or uses an already saved topology file and extracts the router nodes. OPTIONS ======= .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/opt_t.rst .. include:: common/opt_y.rst .. include:: common/opt_h.rst .. include:: common/opt_z-config.rst .. include:: common/sec_portselection.rst FILES ===== .. include:: common/sec_config-file.rst .. include:: common/sec_node-name-map.rst SEE ALSO ======== ibnetdiscover(8) DEPENDENCIES ============ ibnetdiscover, ibnetdiscover format AUTHOR ====== Hal Rosenstock < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibstat.8.in.rst000066400000000000000000000027241477342711600220730ustar00rootroot00000000000000====== ibstat ====== ------------------------------------------ query basic status of InfiniBand device(s) ------------------------------------------ :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== ibstat [options] [portnum] DESCRIPTION =========== ibstat is a binary which displays basic information obtained from the local IB driver. Output includes LID, SMLID, port state, link width active, and port physical state. It is similar to the ibstatus utility but implemented as a binary rather than a script. It has options to list CAs and/or ports and displays more information than ibstatus. OPTIONS ======= **-l, --list_of_cas** list all IB devices **-s, --short** short output **-p, --port_list** show port list **ca_name** InfiniBand device name **portnum** port number of InfiniBand device Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Configuration flags ------------------- .. 
include:: common/opt_z-config.rst EXAMPLES ======== :: ibstat # display status of all ports on all IB devices ibstat -l # list all IB devices ibstat -p # show port guids ibstat mthca0 2 # show status of port 2 of 'mthca0' SEE ALSO ======== ibstatus (8) AUTHOR ====== Hal Rosenstock < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibstatus.8.in.rst000066400000000000000000000016641477342711600224450ustar00rootroot00000000000000======== ibstatus ======== ------------------------------------------ query basic status of InfiniBand device(s) ------------------------------------------ :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== ibstatus [\-h] [devname[:port]]... DESCRIPTION =========== ibstatus is a script which displays basic information obtained from the local IB driver. Output includes LID, SMLID, port state, link width active, and port physical state. OPTIONS ======= .. include:: common/opt_h.rst **devname** InfiniBand device name **portnum** port number of InfiniBand device EXAMPLES ======== :: ibstatus # display status of all IB ports ibstatus mthca1 # status of mthca1 ports ibstatus mthca1:1 mthca0:2 # show status of specified ports SEE ALSO ======== **ibstat (8)** AUTHOR ====== Hal Rosenstock < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibswitches.8.in.rst000066400000000000000000000016731477342711600227530ustar00rootroot00000000000000========== IBSWITCHES ========== ---------------------------------------- show InfiniBand switch nodes in topology ---------------------------------------- :Date: 2016-12-20 :Manual section: 8 :Manual group: OpenIB Diagnostics SYNOPSIS ======== ibswitches [options] [] DESCRIPTION =========== ibswitches is a script which either walks the IB subnet topology or uses an already saved topology file and extracts the switch nodes. OPTIONS ======= .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/opt_t.rst .. include:: common/opt_y.rst .. include:: common/opt_h.rst .. include:: common/opt_z-config.rst .. include:: common/sec_portselection.rst FILES ===== .. include:: common/sec_config-file.rst .. include:: common/sec_node-name-map.rst SEE ALSO ======== ibnetdiscover(8) DEPENDENCIES ============ ibnetdiscover, ibnetdiscover format AUTHOR ====== Hal Rosenstock < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibswportwatch.8000066400000000000000000000013551477342711600222700ustar00rootroot00000000000000.TH IBSWPORTWATCH 8 "September 27, 2006" "OpenIB" "OpenIB Diagnostics" .SH NAME ibswportwatch.pl \- poll the counters on the specified switch/port and report rate of change information. .SH SYNOPSIS .B ibswportwatch.pl [-p -v -n -G] .SH DESCRIPTION .PP ibswportwatch.pl polls the port counters of the specified port and calculates rate of change information. .SH OPTIONS .PP .TP \fB\-p \fR Specify a pause time (polling interval) other than the default. .TP \fB\-v\fR Be verbose. .TP \fB\-n \fR Run for a set number of poll intervals and stop. (Default == -1 == forever) .TP \fB\-G\fR The address provided is a GUID rather than LID. 
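.SH EXAMPLE
.PP
Illustrative invocations; lid 2 and port 13 are placeholders for a real
switch port, and the \-p and \-n values are arbitrary examples:
.PP
ibswportwatch.pl 2 13            # poll lid 2, port 13 at the default interval
.PP
ibswportwatch.pl -p 10 -n 6 2 13 # poll 6 intervals with a pause time of 10, then stop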
.SH AUTHOR .TP Ira Weiny .RI < weiny2@llnl.gov > rdma-core-56.1/infiniband-diags/man/ibsysstat.8.in.rst000066400000000000000000000030021477342711600226200ustar00rootroot00000000000000========= ibsysstat ========= -------------------------------------- system status on an InfiniBand address -------------------------------------- :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== ibsysstat [options] [] DESCRIPTION =========== ibsysstat uses vendor mads to validate connectivity between IB nodes and obtain other information about the IB node. ibsysstat is run as client/server. Default is to run as client. OPTIONS ======= Current supported operations: :: ping \- verify connectivity to server (default) host \- obtain host information from server cpu \- obtain cpu information from server **-o, --oui** use specified OUI number to multiplex vendor mads **-S, --Server** start in server mode (do not return) Addressing Flags ---------------- .. include:: common/opt_G.rst .. include:: common/opt_L.rst .. include:: common/opt_s.rst Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Configuration flags ------------------- .. include:: common/opt_t.rst .. include:: common/opt_z-config.rst FILES ===== .. include:: common/sec_config-file.rst AUTHOR ====== Hal Rosenstock < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/ibtracert.8.in.rst000066400000000000000000000041611477342711600225610ustar00rootroot00000000000000========= ibtracert ========= --------------------- trace InfiniBand path --------------------- :Date: 2018-04-02 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== ibtracert [options] [ [ []]] DESCRIPTION =========== ibtracert uses SMPs to trace the path from a source GID/LID to a destination GID/LID. Each hop along the path is displayed until the destination is reached or a hop does not respond. By using the -m option, multicast path tracing can be performed between source and destination nodes. OPTIONS ======= **-n, --no_info** simple format; don't show additional information **-m** show the multicast trace of the specified mlid **-f, --force** force route to destination port Addressing Flags ---------------- .. include:: common/opt_G.rst .. include:: common/opt_L.rst .. include:: common/opt_s.rst .. include:: common/opt_ports-file.rst Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Configuration flags ------------------- .. include:: common/opt_t.rst .. include:: common/opt_node_name_map.rst .. include:: common/opt_y.rst .. include:: common/opt_z-config.rst FILES ===== .. include:: common/sec_config-file.rst .. include:: common/sec_node-name-map.rst .. 
include:: common/sec_ports-file.rst EXAMPLES ======== Unicast examples :: ibtracert 4 16 # show path between lids 4 and 16 ibtracert -n 4 16 # same, but using simple output format ibtracert -G 0x8f1040396522d 0x002c9000100d051 # use guid addresses Multicast example :: ibtracert -m 0xc000 4 16 # show multicast path of mlid 0xc000 between lids 4 and 16 SEE ALSO ======== ibroute (8) AUTHOR ====== Hal Rosenstock Ira Weiny < ira.weiny@intel.com > rdma-core-56.1/infiniband-diags/man/infiniband-diags.8.in.rst000066400000000000000000000072741477342711600240000ustar00rootroot00000000000000================ infiniband-diags ================ ---------------------------------- Diagnostics for InfiniBand Fabrics ---------------------------------- :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics DESCRIPTION =========== infiniband-diags is a set of utilities designed to help configure, debug, and maintain infiniband fabrics. Many tools and utilities are provided. Some with similar functionality. The base utilities use directed route MAD's to perform their operations. They may therefore work even in unconfigured subnets. Other, higher level utilities, require LID routed MAD's and to some extent SA/SM access. THE USE OF SMPs (QP0) ===================== Many of the tools in this package rely on the use of SMPs via QP0 to acquire data directly from the SMA. While this mode of operation is not technically in compliance with the InfiniBand specification, practical experience has found that this level of diagnostics is valuable when working with a fabric which is broken or only partially configured. For this reason many of these tools may require the use of an MKey or operation from Virtual Machines may be restricted for security reasons. COMMON OPTIONS ============== Most OpenIB diagnostics take some of the following common flags. The exact list of supported flags per utility can be found in the documentation for those commands. Addressing Flags ---------------- The -D and -G option have two forms: .. include:: common/opt_D.rst .. include:: common/opt_D_with_param.rst .. include:: common/opt_G.rst .. include:: common/opt_G_with_param.rst .. include:: common/opt_L.rst .. include:: common/opt_s.rst Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Configuration flags ------------------- .. include:: common/opt_t.rst .. include:: common/opt_o-outstanding_smps.rst .. include:: common/opt_node_name_map.rst .. include:: common/opt_z-config.rst COMMON FILES ============ The following config files are common amongst many of the utilities. .. include:: common/sec_config-file.rst .. include:: common/sec_node-name-map.rst .. 
include:: common/sec_topology-file.rst

Utilities list
==============

Basic fabric connectivity
-------------------------

See: ibnetdiscover, iblinkinfo

Node information
----------------

See: ibnodes, ibswitches, ibhosts, ibrouters

Port information
----------------

See: ibportstate, ibaddr

Switch Forwarding Table info
----------------------------

See: ibtracert, ibroute, dump_lfts, dump_mfts, check_lft_balance, ibfindnodesusing

Performance counters
--------------------

See: ibqueryerrors, perfquery

Local HCA info
--------------

See: ibstat, ibstatus

Connectivity check
------------------

See: ibping, ibsysstat

Low level query tools
---------------------

See: smpquery, smpdump, saquery, sminfo

Fabric verification tools
-------------------------

See: ibidsverify

Backwards compatibility scripts
===============================

The following scripts have been identified as redundant and/or lower
performing as compared to the above scripts. They are provided as legacy
scripts when --enable-compat-utils is specified at build time.

ibcheckerrors, ibclearcounters, ibclearerrors, ibdatacounters, ibchecknet,
ibchecknode, ibcheckport, ibcheckportstate, ibcheckportwidth, ibcheckstate,
ibcheckwidth, ibswportwatch, ibprintca, ibprintrt, ibprintswitch,
set_nodedesc.sh

AUTHORS
=======

Ira Weiny
        < ira.weiny@intel.com >

rdma-core-56.1/infiniband-diags/man/perfquery.8.in.rst000066400000000000000000000125751477342711600226320ustar00rootroot00000000000000
=========
perfquery
=========

-----------------------------------------------
query InfiniBand port counters on a single port
-----------------------------------------------

:Date: 2017-08-21
:Manual section: 8
:Manual group: Open IB Diagnostics

SYNOPSIS
========

perfquery [options] [<lid|guid> [[port(s)] [reset_mask]]]

DESCRIPTION
===========

perfquery uses PerfMgt GMPs to obtain the PortCounters (basic performance and
error counters), PortExtendedCounters, PortXmitDataSL, PortRcvDataSL,
PortRcvErrorDetails, PortXmitDiscardDetails, PortExtendedSpeedsCounters, or
PortSamplesControl from the PMA at the node/port specified. Optionally, it
shows aggregated counters for all ports of a node. Finally, it can reset
counters after reading them, or just reset the counters.

Note: In PortCounters, PortCountersExtended, PortXmitDataSL, and
PortRcvDataSL, components that represent Data (e.g. PortXmitData and
PortRcvData) indicate octets divided by 4 rather than just octets.

Note: Inputting a port of 255 indicates that the operation is to be performed
on all ports.

Note: For PortCounters, ExtendedCounters, and resets, multiple ports can be
specified by either a comma separated list or a port range. See examples
below.

OPTIONS
=======

**-x, --extended**
        show extended port counters rather than (basic) port counters. Note
        that the extended port counters attribute is optional.

**-X, --xmtsl**
        show transmit data SL counter. This is an optional counter for QoS.

**-S, --rcvsl**
        show receive data SL counter. This is an optional counter for QoS.

**-D, --xmtdisc**
        show transmit discard details. This is an optional counter.

**-E, --rcverr**
        show receive error details. This is an optional counter.

**-T, --extended_speeds**
        show extended speeds port counters. This is an optional counter.

**--oprcvcounters**
        show Rcv Counters per Op code. This is an optional counter.

**--flowctlcounters**
        show flow control counters. This is an optional counter.

**--vloppackets**
        show packets received per Op code per VL. This is an optional counter.
OPTIONS ======= **-x, --extended** show extended port counters rather than (basic) port counters. Note that the extended port counters attribute is optional. **-X, --xmtsl** show transmit data SL counter. This is an optional counter for QoS. **-S, --rcvsl** show receive data SL counter. This is an optional counter for QoS. **-D, --xmtdisc** show transmit discard details. This is an optional counter. **-E, --rcverr** show receive error details. This is an optional counter. **-T, --extended_speeds** show extended speeds port counters. This is an optional counter. **--oprcvcounters** show Rcv Counters per Op code. This is an optional counter. **--flowctlcounters** show flow control counters. This is an optional counter. **--vloppackets** show packets received per Op code per VL. This is an optional counter. **--vlopdata** show data received per Op code per VL. This is an optional counter. **--vlxmitflowctlerrors** show flow control update errors per VL. This is an optional counter. **--vlxmitcounters** show ticks waiting to transmit counters per VL. This is an optional counter. **--swportvlcong** show sw port VL congestion. This is an optional counter. **--rcvcc** show Rcv congestion control counters. This is an optional counter. **--slrcvfecn** show SL Rcv FECN counters. This is an optional counter. **--slrcvbecn** show SL Rcv BECN counters. This is an optional counter. **--xmitcc** show Xmit congestion control counters. This is an optional counter. **--vlxmittimecc** show VL Xmit Time congestion control counters. This is an optional counter. **-c, --smplctl** show port samples control. **-a, --all_ports** show aggregated counters for all ports of the destination lid, reset all counters for all ports, or, if multiple ports are specified, aggregate the counters of the specified ports. If the destination lid does not support the AllPortSelect flag, all ports will be iterated through to emulate AllPortSelect behavior. **-l, --loop_ports** If all ports are selected by the user (either through the **-a** option or port 255) or multiple ports are specified, iterate through each port rather than doing the aggregate operation. **-r, --reset_after_read** reset counters after read **-R, --Reset_only** only reset counters Addressing Flags ---------------- .. include:: common/opt_G.rst .. include:: common/opt_L.rst .. include:: common/opt_s.rst Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Configuration flags ------------------- .. include:: common/opt_t.rst .. include:: common/opt_y.rst .. include:: common/opt_z-config.rst FILES =====
.. include:: common/sec_config-file.rst EXAMPLES ======== :: perfquery # read local port performance counters perfquery 32 1 # read performance counters from lid 32, port 1 perfquery -x 32 1 # read extended performance counters from lid 32, port 1 perfquery -a 32 # read perf counters from lid 32, all ports perfquery -r 32 1 # read performance counters and reset perfquery -x -r 32 1 # read extended performance counters and reset perfquery -R 0x20 1 # reset performance counters of port 1 only perfquery -x -R 0x20 1 # reset extended performance counters of port 1 only perfquery -R -a 32 # reset performance counters of all ports perfquery -R 32 2 0x0fff # reset only error counters of port 2 perfquery -R 32 2 0xf000 # reset only non-error counters of port 2 perfquery -a 32 1-10 # read performance counters from lid 32, port 1-10, aggregate output perfquery -l 32 1-10 # read performance counters from lid 32, port 1-10, output each port perfquery -a 32 1,4,8 # read performance counters from lid 32, port 1, 4, and 8, aggregate output perfquery -l 32 1,4,8 # read performance counters from lid 32, port 1, 4, and 8, output each port AUTHOR ====== Hal Rosenstock < hal.rosenstock@gmail.com > rdma-core-56.1/infiniband-diags/man/saquery.8.in.rst000066400000000000000000000114671477342711600223020ustar00rootroot00000000000000======= saquery ======= ------------------------------------------------- query InfiniBand subnet administration attributes ------------------------------------------------- :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== saquery [options] [<name> | <lid> | <guid>] DESCRIPTION =========== saquery issues the selected SA query. Node records are queried by default. OPTIONS ======= **-p** get PathRecord info **-N** get NodeRecord info **-D, --list** get NodeDescriptions of CAs only **-S** get ServiceRecord info **-I** get InformInfoRecord (subscription) info **-L** return the Lids of the name specified **-l** return the unique Lid of the name specified **-G** return the Guids of the name specified **-O** return the name for the Lid specified **-U** return the name for the Guid specified **-c** get the SA's class port info **-s** return the PortInfoRecords with isSM or isSMdisabled capability mask bit on **-g** get multicast group info **-m** get multicast member info. If a group is specified, limit the output to the group specified and print one line containing only the GUID and node description for each entry. Example: saquery -m 0xc000 **-x** get LinkRecord info **--src-to-dst <src:dst>** get a PathRecord for <src:dst> where src and dst are either node names or LIDs **--sgid-to-dgid <sgid-dgid>** get a PathRecord for **sgid** to **dgid** where both GIDs are in an IPv6 format acceptable to **inet_pton (3)** **--smkey <val>** use SM_Key value for the query. Will be used only with "trusted" queries. If a non-numeric value (like 'x') is specified then saquery will prompt for a value. Default (when not specified here or in @IBDIAG_CONFIG_PATH@/ibdiag.conf) is to use SM_Key == 0 (or \"untrusted\") .. include:: common/opt_K.rst **--slid <lid>** Source LID (PathRecord) **--dlid <lid>** Destination LID (PathRecord) **--mlid <lid>** Multicast LID (MCMemberRecord) **--sgid <gid>** Source GID (IPv6 format) (PathRecord) **--dgid <gid>** Destination GID (IPv6 format) (PathRecord) **--gid <gid>** Port GID (MCMemberRecord) **--mgid <gid>** Multicast GID (MCMemberRecord) **--reversible** Reversible path (PathRecord) **--numb_path** Number of paths (PathRecord) **--pkey** P_Key (PathRecord, MCMemberRecord).
If a non-numeric value (like 'x') is specified then saquery will prompt for a value **--qos_class** QoS Class (PathRecord) **--sl** Service level (PathRecord, MCMemberRecord) **--mtu** MTU and selector (PathRecord, MCMemberRecord) **--rate** Rate and selector (PathRecord, MCMemberRecord) **--pkt_lifetime** Packet lifetime and selector (PathRecord, MCMemberRecord) **--qkey** Q_Key (MCMemberRecord). If a non-numeric value (like 'x') is specified then saquery will prompt for a value **--tclass** Traffic Class (PathRecord, MCMemberRecord) **--flow_label** Flow Label (PathRecord, MCMemberRecord) **--hop_limit** Hop limit (PathRecord, MCMemberRecord) **--scope** Scope (MCMemberRecord) **--join_state** Join state (MCMemberRecord) **--proxy_join** Proxy join (MCMemberRecord) **--service_id** ServiceID (PathRecord) Supported query names (and aliases): :: ClassPortInfo (CPI) NodeRecord (NR) [lid] PortInfoRecord (PIR) [[lid]/[port]/[options]] SL2VLTableRecord (SL2VL) [[lid]/[in_port]/[out_port]] PKeyTableRecord (PKTR) [[lid]/[port]/[block]] VLArbitrationTableRecord (VLAR) [[lid]/[port]/[block]] InformInfoRecord (IIR) LinkRecord (LR) [[from_lid]/[from_port]] [[to_lid]/[to_port]] ServiceRecord (SR) PathRecord (PR) MCMemberRecord (MCMR) LFTRecord (LFTR) [[lid]/[block]] MFTRecord (MFTR) [[mlid]/[position]/[block]] GUIDInfoRecord (GIR) [[lid]/[block]] SwitchInfoRecord (SWIR) [lid] SMInfoRecord (SMIR) [lid] Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Configuration flags ------------------- .. include:: common/opt_t.rst .. include:: common/opt_o-outstanding_smps.rst .. include:: common/opt_node_name_map.rst .. include:: common/opt_z-config.rst COMMON FILES ============ .. include:: common/sec_config-file.rst .. include:: common/sec_node-name-map.rst DEPENDENCIES ============ OpenSM (or other running SM/SA), libosmcomp, libibumad, libibmad AUTHORS ======= Ira Weiny < ira.weiny@intel.com > Hal Rosenstock < halr@mellanox.com > rdma-core-56.1/infiniband-diags/man/sminfo.8.in.rst000066400000000000000000000034411477342711600220750ustar00rootroot00000000000000====== sminfo ====== --------------------------------- query InfiniBand SMInfo attribute --------------------------------- :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== sminfo [options] sm_lid | sm_dr_path [modifier] DESCRIPTION =========== Optionally set and display the output of an sminfo query in human readable format. The target SM is the one listed in the local port info, or the SM specified by the optional SM lid or by the SM direct routed path. Note: using sminfo for any purpose other than a simple query may be very dangerous, and may result in a malfunction of the target SM.
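For reference, the query half of this operation is a single SMP. The following is a hypothetical C fragment, not taken from sminfo.c: it reuses the ``smp_query_via()`` helper seen elsewhere in this tree and assumes the ``IB_SMINFO_*_F`` field names from libibmad's mad.h and a port opened with ``mad_rpc_open_port2()``:

::

    /* Illustrative only: fetch and decode the SMInfo attribute by SM lid. */
    static void show_sminfo(const struct ibmad_port *srcport, int sm_lid)
    {
            uint8_t sminfo[IB_SMP_DATA_SIZE] = { 0 };
            uint32_t state, prio, act;
            ib_portid_t portid = { 0 };

            ib_portid_set(&portid, sm_lid, 0, 0);
            if (!smp_query_via(sminfo, &portid, IB_ATTR_SMINFO, 0, 0, srcport))
                    IBEXIT("SMInfo query failed");

            mad_decode_field(sminfo, IB_SMINFO_STATE_F, &state);
            mad_decode_field(sminfo, IB_SMINFO_PRIO_F, &prio);
            mad_decode_field(sminfo, IB_SMINFO_ACT_F, &act);
            printf("state %u priority %u activity %u\n", state, prio, act);
    }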
OPTIONS ======= **-s, --state <state>** set SM state 0 not active 1 discovering 2 standby 3 master **-p, --priority <priority>** set priority (0-15) **-a, --activity <val>** set activity count Addressing Flags ---------------- .. include:: common/opt_D.rst .. include:: common/opt_G.rst .. include:: common/opt_L.rst Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Configuration flags ------------------- .. include:: common/opt_t.rst .. include:: common/opt_y.rst .. include:: common/opt_z-config.rst FILES ===== .. include:: common/sec_config-file.rst EXAMPLES ======== :: sminfo # local port\'s sminfo sminfo 32 # show sminfo of lid 32 sminfo -G 0x8f1040023 # same but using guid address SEE ALSO ======== smpdump (8) AUTHOR ====== Hal Rosenstock < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/smpdump.8.in.rst000066400000000000000000000032421477342711600222660ustar00rootroot00000000000000======= smpdump ======= -------------------------------------------- dump InfiniBand subnet management attributes -------------------------------------------- :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== smpdump [options] <dlid|dr_path> <attribute> [attribute_modifier] DESCRIPTION =========== smpdump is a general purpose SMP utility which gets SM attributes from a specified SMA. The result is dumped in hex by default. OPTIONS ======= **dlid|drpath** LID or DR path to SMA **attribute** IBA attribute ID for SM attribute **attribute_modifier** IBA modifier for SM attribute **-s, --string** Print strings in packet if possible Addressing Flags ---------------- .. include:: common/opt_D.rst .. include:: common/opt_L.rst Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Configuration flags ------------------- .. include:: common/opt_t.rst .. include:: common/opt_z-config.rst FILES ===== .. include:: common/sec_config-file.rst EXAMPLES ======== Direct Routed Examples :: smpdump -D 0,1,2,3,5 16 # NODE DESC smpdump -D 0,1,2 0x15 2 # PORT INFO, port 2 LID Routed Examples :: smpdump 3 0x15 2 # PORT INFO, lid 3 port 2 smpdump 0xa0 0x11 # NODE INFO, lid 0xa0 SEE ALSO ======== smpquery (8) AUTHOR ====== Hal Rosenstock < halr@voltaire.com > rdma-core-56.1/infiniband-diags/man/smpquery.8.in.rst000066400000000000000000000047431477342711600224730ustar00rootroot00000000000000======== smpquery ======== --------------------------------------------- query InfiniBand subnet management attributes --------------------------------------------- :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== smpquery [options] <op> <dest dr_path|lid|guid> [op params] DESCRIPTION =========== smpquery allows a basic subset of standard SMP queries including the following: node info, node description, switch info, port info. Fields are displayed in human readable format. OPTIONS ======= Current supported operations (case insensitive) and their parameters: :: Nodeinfo (NI) Nodedesc (ND) Portinfo (PI) [<portnum>] # default port is zero PortInfoExtended (PIE) [<portnum>] Switchinfo (SI) PKeyTable (PKeys) [<portnum>] SL2VLTable (SL2VL) [<portnum>] VLArbitration (VLArb) [<portnum>] GUIDInfo (GI) MlnxExtPortInfo (MEPI) [<portnum>] # default port is zero **-c, --combined** Use Combined route address argument ``<lid> <DR_Path>`` **-x, --extended** Set SMSupportsExtendedSpeeds bit 31 in AttributeModifier (only impacts PortInfo queries). .. include:: common/opt_K.rst Addressing Flags ---------------- .. include:: common/opt_D.rst .. include:: common/opt_G.rst .. include:: common/opt_L.rst .. include:: common/opt_s.rst Port Selection flags --------------------
.. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Configuration flags ------------------- .. include:: common/opt_t.rst .. include:: common/opt_node_name_map.rst .. include:: common/opt_y.rst .. include:: common/opt_z-config.rst FILES ===== .. include:: common/sec_config-file.rst .. include:: common/sec_node-name-map.rst EXAMPLES ======== :: smpquery portinfo 3 1 # portinfo by lid, with port modifier smpquery -G switchinfo 0x2C9000100D051 1 # switchinfo by guid smpquery -D nodeinfo 0 # nodeinfo by direct route smpquery -c nodeinfo 6 0,12 # nodeinfo by combined route SEE ALSO ======== smpdump (8) AUTHOR ====== Hal Rosenstock < hal@mellanox.com > rdma-core-56.1/infiniband-diags/man/vendstat.8.in.rst000066400000000000000000000043211477342711600224300ustar00rootroot00000000000000======== vendstat ======== ------------------------------------------ query InfiniBand vendor specific functions ------------------------------------------ :Date: 2017-08-21 :Manual section: 8 :Manual group: Open IB Diagnostics SYNOPSIS ======== vendstat [options] <lid|guid> DESCRIPTION =========== vendstat uses vendor specific MADs to access vendor specific functionality beyond the IB spec. Currently, there is support for Mellanox InfiniSwitch-III (IS3) and InfiniSwitch-IV (IS4). OPTIONS ======= **-N** show IS3 or IS4 general information. **-w** show IS3 port xmit wait counters. **-i** show IS4 counter group info. **-c <num,num>** configure IS4 counter groups. Configure IS4 counter groups 0 and 1. Such configuration is not persistent across IS4 reboot. First number is for counter group 0 and second is for counter group 1. Group 0 counter config values: :: 0 - PortXmitDataSL0-7 1 - PortXmitDataSL8-15 2 - PortRcvDataSL0-7 Group 1 counter config values: :: 1 - PortXmitDataSL8-15 2 - PortRcvDataSL0-7 8 - PortRcvDataSL8-15 **-R, --Read <addr,mask>** Read configuration space record at addr **-W, --Write <addr,val,mask>** Write configuration space record at addr Addressing Flags ---------------- .. include:: common/opt_G.rst .. include:: common/opt_L.rst .. include:: common/opt_s.rst Port Selection flags -------------------- .. include:: common/opt_C.rst .. include:: common/opt_P.rst .. include:: common/sec_portselection.rst Debugging flags --------------- .. include:: common/opt_debug.rst .. include:: common/opt_e.rst .. include:: common/opt_h.rst .. include:: common/opt_verbose.rst .. include:: common/opt_V.rst Configuration flags ------------------- .. include:: common/opt_t.rst .. include:: common/opt_z-config.rst FILES ===== .. include:: common/sec_config-file.rst EXAMPLES ======== :: vendstat -N 6 # read IS3 or IS4 general information vendstat -w 6 # read IS3 port xmit wait counters vendstat -i 6 12 # read IS4 port 12 counter group info vendstat -c 0,1 6 12 # configure IS4 port 12 counter groups for PortXmitDataSL vendstat -c 2,8 6 12 # configure IS4 port 12 counter groups for PortRcvDataSL AUTHOR ====== Hal Rosenstock < hal.rosenstock@gmail.com > rdma-core-56.1/infiniband-diags/mcm_rereg_test.c000066400000000000000000000240331477342711600216650ustar00rootroot00000000000000/* * Copyright (c) 2006-2009 Voltaire, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses.
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include #include #include "ibdiag_common.h" #define info(fmt, ...) fprintf(stderr, "INFO: " fmt, ## __VA_ARGS__ ) #define err(fmt, ...) fprintf(stderr, "ERR: " fmt, ## __VA_ARGS__ ) #ifdef NOISY_DEBUG #define dbg(fmt, ...) fprintf(stderr, "DBG: " fmt, ## __VA_ARGS__ ) #else __attribute__((format(printf, 1, 2))) static inline void dbg(const char *fmt, ...) { } #endif #define TMO 100 static ibmad_gid_t mgid_ipoib = { 0xff, 0x12, 0x40, 0x1b, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff }; static struct ibmad_port *srcport; static struct ibmad_ports_pair *srcports; static uint64_t build_mcm_rec(uint8_t *data, ibmad_gid_t mgid, ibmad_gid_t port_gid) { memset(data, 0, IB_SA_DATA_SIZE); mad_set_array(data, 0, IB_SA_MCM_MGID_F, mgid); mad_set_array(data, 0, IB_SA_MCM_PORTGID_F, port_gid); mad_set_field(data, 0, IB_SA_MCM_JOIN_STATE_F, 1); return be64toh(IB_MCR_COMPMASK_MGID | IB_MCR_COMPMASK_PORT_GID | IB_MCR_COMPMASK_JOIN_STATE); } static void build_mcm_rec_umad(void *umad, ib_portid_t * dport, int method, uint64_t comp_mask, uint8_t * data) { ib_rpc_t rpc; memset(&rpc, 0, sizeof(rpc)); rpc.mgtclass = IB_SA_CLASS; rpc.method = method; rpc.attr.id = IB_SA_ATTR_MCRECORD; rpc.attr.mod = 0; // ??? rpc.mask = comp_mask; rpc.datasz = IB_SA_DATA_SIZE; rpc.dataoffs = IB_SA_DATA_OFFS; mad_build_pkt(umad, &rpc, dport, NULL, data); } static int rereg_send(int port, int agent, ib_portid_t * dport, uint8_t * umad, int len, int method, ibmad_gid_t port_gid) { uint8_t data[IB_SA_DATA_SIZE]; uint64_t comp_mask; comp_mask = build_mcm_rec(data, mgid_ipoib, port_gid); build_mcm_rec_umad(umad, dport, method, comp_mask, data); if (umad_send(port, agent, umad, len, TMO, 0) < 0) { err("umad_send %s failed: %s\n", (method == IB_MAD_METHOD_GET) ? 
"query" : "non query", strerror(errno)); return -1; } dbg("umad_send %d: tid = 0x%016" PRIx64 "\n", method, mad_get_field64(umad_get_mad(umad), 0, IB_MAD_TRID_F)); return 0; } static int rereg_port_gid(int port, int agent, ib_portid_t * dport, uint8_t * umad, int len, ibmad_gid_t port_gid) { uint8_t data[IB_SA_DATA_SIZE]; uint64_t comp_mask; comp_mask = build_mcm_rec(data, mgid_ipoib, port_gid); build_mcm_rec_umad(umad, dport, IB_MAD_METHOD_DELETE, comp_mask, data); if (umad_send(port, agent, umad, len, TMO, 0) < 0) { err("umad_send leave failed: %s\n", strerror(errno)); return -1; } dbg("umad_send leave: tid = 0x%016" PRIx64 "\n", mad_get_field64(umad_get_mad(umad), 0, IB_MAD_TRID_F)); build_mcm_rec_umad(umad, dport, IB_MAD_METHOD_SET, comp_mask, data); if (umad_send(port, agent, umad, len, TMO, 0) < 0) { err("umad_send join failed: %s\n", strerror(errno)); return -1; } dbg("umad_send join: tid = 0x%016" PRIx64 "\n", mad_get_field64(umad_get_mad(umad), 0, IB_MAD_TRID_F)); return 0; } struct guid_trid { ibmad_gid_t gid; __be64 guid; uint64_t trid; }; static int rereg_send_all(int port, int agent, ib_portid_t * dport, struct guid_trid *list, unsigned cnt) { uint8_t *umad; int len = umad_size() + 256; unsigned i; int ret; info("rereg_send_all... cnt = %u\n", cnt); umad = calloc(1, len); if (!umad) { err("cannot alloc mem for umad: %s\n", strerror(errno)); return -1; } for (i = 0; i < cnt; i++) { ret = rereg_port_gid(port, agent, dport, umad, len, list[i].gid); if (ret < 0) { err("rereg_send_all: rereg_port_gid 0x%016" PRIx64 " failed\n", be64toh(list[i].guid)); continue; } list[i].trid = mad_get_field64(umad_get_mad(umad), 0, IB_MAD_TRID_F); } info("rereg_send_all: sent %u requests\n", cnt * 2); free(umad); return 0; } static int rereg_recv(int port, int agent, ib_portid_t * dport, uint8_t * umad, int length, int tmo) { int ret, retry = 0; int len = length; while ((ret = umad_recv(port, umad, &len, tmo)) < 0 && errno == ETIMEDOUT) { if (retry++ > 3) return 0; } if (ret < 0) { err("umad_recv %d failed: %s\n", ret, strerror(errno)); return -1; } dbg("umad_recv (retries %d), tid = 0x%016" PRIx64 ": len = %d, status = %d\n", retry, mad_get_field64(umad_get_mad(umad), 0, IB_MAD_TRID_F), len, umad_status(umad)); return 1; } static int rereg_recv_all(int port, int agent, ib_portid_t * dport, struct guid_trid *list, unsigned cnt) { uint8_t *umad, *mad; int len = umad_size() + 256; uint64_t trid; unsigned n, method, status; unsigned i; info("rereg_recv_all...\n"); umad = calloc(1, len); if (!umad) { err("cannot alloc mem for umad: %s\n", strerror(errno)); return -1; } n = 0; while (rereg_recv(port, agent, dport, umad, len, TMO) > 0) { dbg("rereg_recv_all: done %d\n", n); n++; mad = umad_get_mad(umad); method = mad_get_field(mad, 0, IB_MAD_METHOD_F); status = mad_get_field(mad, 0, IB_MAD_STATUS_F); if (status) dbg("MAD status %x, method %x\n", status, method); if (status && (method & 0x7f) == (IB_MAD_METHOD_GET_RESPONSE & 0x7f)) { trid = mad_get_field64(mad, 0, IB_MAD_TRID_F); for (i = 0; i < cnt; i++) if (trid == list[i].trid) break; if (i == cnt) { err("cannot find trid 0x%016" PRIx64 "\n", trid); continue; } info("guid 0x%016" PRIx64 ": method = %x status = %x. 
Resending\n", be64toh(list[i].guid), method, status); rereg_port_gid(port, agent, dport, umad, len, list[i].gid); list[i].trid = mad_get_field64(umad_get_mad(umad), 0, IB_MAD_TRID_F); } } info("rereg_recv_all: got %u responses\n", n); free(umad); return 0; } static int rereg_query_all(int port, int agent, ib_portid_t * dport, struct guid_trid *list, unsigned cnt) { uint8_t *umad, *mad; int len = umad_size() + 256; unsigned method, status; unsigned i; int ret; info("rereg_query_all...\n"); umad = calloc(1, len); if (!umad) { err("cannot alloc mem for umad: %s\n", strerror(errno)); return -1; } for (i = 0; i < cnt; i++) { ret = rereg_send(port, agent, dport, umad, len, IB_MAD_METHOD_GET, list[i].gid); if (ret < 0) { err("query_all: rereg_send failed.\n"); continue; } ret = rereg_recv(port, agent, dport, umad, len, TMO); if (ret < 0) { err("query_all: rereg_recv failed.\n"); continue; } mad = umad_get_mad(umad); method = mad_get_field(mad, 0, IB_MAD_METHOD_F); status = mad_get_field(mad, 0, IB_MAD_STATUS_F); if (status) info("guid 0x%016" PRIx64 ": status %x, method %x\n", be64toh(list[i].guid), status, method); } info("rereg_query_all: %u queried.\n", cnt); free(umad); return 0; } #define MAX_CLIENTS 50 static int rereg_and_test_port(const char *guid_file, int port, int agent, ib_portid_t *dport, int timeout) { char line[256]; FILE *f; ibmad_gid_t port_gid; __be64 prefix = htobe64(0xfe80000000000000ull); __be64 guid = htobe64(0x0002c90200223825ull); struct guid_trid *list; int i = 0; list = calloc(MAX_CLIENTS, sizeof(*list)); if (!list) { err("cannot alloc mem for guid/trid list: %s\n", strerror(errno)); return -1; } f = fopen(guid_file, "r"); if (!f) { err("cannot open %s: %s\n", guid_file, strerror(errno)); free(list); return -1; } while (fgets(line, sizeof(line), f)) { guid = htobe64(strtoull(line, NULL, 0)); memcpy(&port_gid[0], &prefix, 8); memcpy(&port_gid[8], &guid, 8); list[i].guid = guid; memcpy(list[i].gid, port_gid, sizeof(list[i].gid)); list[i].trid = 0; if (++i >= MAX_CLIENTS) break; } fclose(f); rereg_send_all(port, agent, dport, list, i); rereg_recv_all(port, agent, dport, list, i); rereg_query_all(port, agent, dport, list, i); free(list); return 0; } int main(int argc, const char **argv) { const char *guid_file = "port_guids.list"; int mgmt_classes[2] = { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS }; ib_portid_t dport_id; int port, agent; uint8_t *umad; int len; if (argc > 1) guid_file = argv[1]; srcports = mad_rpc_open_port2(NULL, 0, mgmt_classes, 2, 0); if (!srcports) err("Failed to open port"); srcport = srcports->gsi.port; if (!srcport) err("Failed to open port"); resolve_sm_portid(NULL, 0, &dport_id); dport_id.qp = 1; if (!dport_id.qkey) dport_id.qkey = IB_DEFAULT_QP1_QKEY; len = umad_size() + 256; umad = calloc(1, len); if (!umad) { err("cannot alloc mem for umad: %s\n", strerror(errno)); return -1; } port = mad_rpc_portid(srcport); agent = umad_register(port, IB_SA_CLASS, 2, 0, NULL); rereg_and_test_port(guid_file, port, agent, &dport_id, TMO); free(umad); umad_unregister(port, agent); mad_rpc_close_port2(srcports); umad_done(); return 0; } rdma-core-56.1/infiniband-diags/perfquery.c000066400000000000000000001062111477342711600207070ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2007 Xsigo Systems Inc. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * Copyright (c) 2011 Mellanox Technologies LTD. All rights reserved. 
* * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include #include #include "ibdiag_common.h" static struct ibmad_port *srcport; static struct ibmad_ports_pair *srcports; struct perf_count { uint32_t portselect; uint32_t counterselect; uint32_t symbolerrors; uint32_t linkrecovers; uint32_t linkdowned; uint32_t rcverrors; uint32_t rcvremotephyerrors; uint32_t rcvswrelayerrors; uint32_t xmtdiscards; uint32_t xmtconstrainterrors; uint32_t rcvconstrainterrors; uint32_t linkintegrityerrors; uint32_t excbufoverrunerrors; uint32_t qp1dropped; uint32_t vl15dropped; uint32_t xmtdata; uint32_t rcvdata; uint32_t xmtpkts; uint32_t rcvpkts; uint32_t xmtwait; }; struct perf_count_ext { uint32_t portselect; uint32_t counterselect; uint64_t portxmitdata; uint64_t portrcvdata; uint64_t portxmitpkts; uint64_t portrcvpkts; uint64_t portunicastxmitpkts; uint64_t portunicastrcvpkts; uint64_t portmulticastxmitpkits; uint64_t portmulticastrcvpkts; uint32_t counterSelect2; uint64_t symbolErrorCounter; uint64_t linkErrorRecoveryCounter; uint64_t linkDownedCounter; uint64_t portRcvErrors; uint64_t portRcvRemotePhysicalErrors; uint64_t portRcvSwitchRelayErrors; uint64_t portXmitDiscards; uint64_t portXmitConstraintErrors; uint64_t portRcvConstraintErrors; uint64_t localLinkIntegrityErrors; uint64_t excessiveBufferOverrunErrors; uint64_t VL15Dropped; uint64_t portXmitWait; uint64_t QP1Dropped; }; static uint8_t pc[1024]; static struct perf_count perf_count = {}; static struct perf_count_ext perf_count_ext = {}; #define ALL_PORTS 0xFF #define MAX_PORTS 255 /* Notes: IB semantics is to cap counters if count has exceeded limits. * Therefore we must check for overflows and cap the counters if necessary. * * mad_decode_field and mad_encode_field assume 32 bit integers passed in * for fields < 32 bits in length. 
*/ static void aggregate_4bit(uint32_t * dest, uint32_t val) { if ((((*dest) + val) < (*dest)) || ((*dest) + val) > 0xf) (*dest) = 0xf; else (*dest) = (*dest) + val; } static void aggregate_8bit(uint32_t * dest, uint32_t val) { if ((((*dest) + val) < (*dest)) || ((*dest) + val) > 0xff) (*dest) = 0xff; else (*dest) = (*dest) + val; } static void aggregate_16bit(uint32_t * dest, uint32_t val) { if ((((*dest) + val) < (*dest)) || ((*dest) + val) > 0xffff) (*dest) = 0xffff; else (*dest) = (*dest) + val; } static void aggregate_32bit(uint32_t * dest, uint32_t val) { if (((*dest) + val) < (*dest)) (*dest) = 0xffffffff; else (*dest) = (*dest) + val; } static void aggregate_64bit(uint64_t * dest, uint64_t val) { if (((*dest) + val) < (*dest)) (*dest) = 0xffffffffffffffffULL; else (*dest) = (*dest) + val; } static void aggregate_perfcounters(void) { uint32_t val; mad_decode_field(pc, IB_PC_PORT_SELECT_F, &val); perf_count.portselect = val; mad_decode_field(pc, IB_PC_COUNTER_SELECT_F, &val); perf_count.counterselect = val; mad_decode_field(pc, IB_PC_ERR_SYM_F, &val); aggregate_16bit(&perf_count.symbolerrors, val); mad_decode_field(pc, IB_PC_LINK_RECOVERS_F, &val); aggregate_8bit(&perf_count.linkrecovers, val); mad_decode_field(pc, IB_PC_LINK_DOWNED_F, &val); aggregate_8bit(&perf_count.linkdowned, val); mad_decode_field(pc, IB_PC_ERR_RCV_F, &val); aggregate_16bit(&perf_count.rcverrors, val); mad_decode_field(pc, IB_PC_ERR_PHYSRCV_F, &val); aggregate_16bit(&perf_count.rcvremotephyerrors, val); mad_decode_field(pc, IB_PC_ERR_SWITCH_REL_F, &val); aggregate_16bit(&perf_count.rcvswrelayerrors, val); mad_decode_field(pc, IB_PC_XMT_DISCARDS_F, &val); aggregate_16bit(&perf_count.xmtdiscards, val); mad_decode_field(pc, IB_PC_ERR_XMTCONSTR_F, &val); aggregate_8bit(&perf_count.xmtconstrainterrors, val); mad_decode_field(pc, IB_PC_ERR_RCVCONSTR_F, &val); aggregate_8bit(&perf_count.rcvconstrainterrors, val); mad_decode_field(pc, IB_PC_ERR_LOCALINTEG_F, &val); aggregate_4bit(&perf_count.linkintegrityerrors, val); mad_decode_field(pc, IB_PC_ERR_EXCESS_OVR_F, &val); aggregate_4bit(&perf_count.excbufoverrunerrors, val); mad_decode_field(pc, IB_PC_QP1_DROP_F, &val); aggregate_16bit(&perf_count.qp1dropped, val); mad_decode_field(pc, IB_PC_VL15_DROPPED_F, &val); aggregate_16bit(&perf_count.vl15dropped, val); mad_decode_field(pc, IB_PC_XMT_BYTES_F, &val); aggregate_32bit(&perf_count.xmtdata, val); mad_decode_field(pc, IB_PC_RCV_BYTES_F, &val); aggregate_32bit(&perf_count.rcvdata, val); mad_decode_field(pc, IB_PC_XMT_PKTS_F, &val); aggregate_32bit(&perf_count.xmtpkts, val); mad_decode_field(pc, IB_PC_RCV_PKTS_F, &val); aggregate_32bit(&perf_count.rcvpkts, val); mad_decode_field(pc, IB_PC_XMT_WAIT_F, &val); aggregate_32bit(&perf_count.xmtwait, val); } static void output_aggregate_perfcounters(ib_portid_t * portid, __be16 cap_mask) { char buf[1024]; uint32_t val = ALL_PORTS; /* set port_select to 255 to emulate AllPortSelect */ mad_encode_field(pc, IB_PC_PORT_SELECT_F, &val); mad_encode_field(pc, IB_PC_COUNTER_SELECT_F, &perf_count.counterselect); mad_encode_field(pc, IB_PC_ERR_SYM_F, &perf_count.symbolerrors); mad_encode_field(pc, IB_PC_LINK_RECOVERS_F, &perf_count.linkrecovers); mad_encode_field(pc, IB_PC_LINK_DOWNED_F, &perf_count.linkdowned); mad_encode_field(pc, IB_PC_ERR_RCV_F, &perf_count.rcverrors); mad_encode_field(pc, IB_PC_ERR_PHYSRCV_F, &perf_count.rcvremotephyerrors); mad_encode_field(pc, IB_PC_ERR_SWITCH_REL_F, &perf_count.rcvswrelayerrors); mad_encode_field(pc, IB_PC_XMT_DISCARDS_F, &perf_count.xmtdiscards); 
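/* Descriptive note: the saturated totals accumulated via the aggregate_*()
 * helpers are re-encoded into the raw MAD buffer here so that the stock
 * mad_dump_perfcounters() call below can format the aggregate exactly like
 * a single PortCounters reply. */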
mad_encode_field(pc, IB_PC_ERR_XMTCONSTR_F, &perf_count.xmtconstrainterrors); mad_encode_field(pc, IB_PC_ERR_RCVCONSTR_F, &perf_count.rcvconstrainterrors); mad_encode_field(pc, IB_PC_ERR_LOCALINTEG_F, &perf_count.linkintegrityerrors); mad_encode_field(pc, IB_PC_ERR_EXCESS_OVR_F, &perf_count.excbufoverrunerrors); mad_encode_field(pc, IB_PC_QP1_DROP_F, &perf_count.qp1dropped); mad_encode_field(pc, IB_PC_VL15_DROPPED_F, &perf_count.vl15dropped); mad_encode_field(pc, IB_PC_XMT_BYTES_F, &perf_count.xmtdata); mad_encode_field(pc, IB_PC_RCV_BYTES_F, &perf_count.rcvdata); mad_encode_field(pc, IB_PC_XMT_PKTS_F, &perf_count.xmtpkts); mad_encode_field(pc, IB_PC_RCV_PKTS_F, &perf_count.rcvpkts); mad_encode_field(pc, IB_PC_XMT_WAIT_F, &perf_count.xmtwait); mad_dump_perfcounters(buf, sizeof buf, pc, sizeof pc); printf("# Port counters: %s port %d (CapMask: 0x%02X)\n%s", portid2str(portid), ALL_PORTS, ntohs(cap_mask), buf); } static void aggregate_perfcounters_ext(__be16 cap_mask, uint32_t cap_mask2) { uint32_t val; uint64_t val64; mad_decode_field(pc, IB_PC_EXT_PORT_SELECT_F, &val); perf_count_ext.portselect = val; mad_decode_field(pc, IB_PC_EXT_COUNTER_SELECT_F, &val); perf_count_ext.counterselect = val; mad_decode_field(pc, IB_PC_EXT_XMT_BYTES_F, &val64); aggregate_64bit(&perf_count_ext.portxmitdata, val64); mad_decode_field(pc, IB_PC_EXT_RCV_BYTES_F, &val64); aggregate_64bit(&perf_count_ext.portrcvdata, val64); mad_decode_field(pc, IB_PC_EXT_XMT_PKTS_F, &val64); aggregate_64bit(&perf_count_ext.portxmitpkts, val64); mad_decode_field(pc, IB_PC_EXT_RCV_PKTS_F, &val64); aggregate_64bit(&perf_count_ext.portrcvpkts, val64); if (cap_mask & IB_PM_EXT_WIDTH_SUPPORTED) { mad_decode_field(pc, IB_PC_EXT_XMT_UPKTS_F, &val64); aggregate_64bit(&perf_count_ext.portunicastxmitpkts, val64); mad_decode_field(pc, IB_PC_EXT_RCV_UPKTS_F, &val64); aggregate_64bit(&perf_count_ext.portunicastrcvpkts, val64); mad_decode_field(pc, IB_PC_EXT_XMT_MPKTS_F, &val64); aggregate_64bit(&perf_count_ext.portmulticastxmitpkits, val64); mad_decode_field(pc, IB_PC_EXT_RCV_MPKTS_F, &val64); aggregate_64bit(&perf_count_ext.portmulticastrcvpkts, val64); } if (htonl(cap_mask2) & IB_PM_IS_ADDL_PORT_CTRS_EXT_SUP) { mad_decode_field(pc, IB_PC_EXT_COUNTER_SELECT2_F, &val); perf_count_ext.counterSelect2 = val; mad_decode_field(pc, IB_PC_EXT_ERR_SYM_F, &val64); aggregate_64bit(&perf_count_ext.symbolErrorCounter, val64); mad_decode_field(pc, IB_PC_EXT_LINK_RECOVERS_F, &val64); aggregate_64bit(&perf_count_ext.linkErrorRecoveryCounter, val64); mad_decode_field(pc, IB_PC_EXT_LINK_DOWNED_F, &val64); aggregate_64bit(&perf_count_ext.linkDownedCounter, val64); mad_decode_field(pc, IB_PC_EXT_ERR_RCV_F, &val64); aggregate_64bit(&perf_count_ext.portRcvErrors, val64); mad_decode_field(pc, IB_PC_EXT_ERR_PHYSRCV_F, &val64); aggregate_64bit(&perf_count_ext.portRcvRemotePhysicalErrors, val64); mad_decode_field(pc, IB_PC_EXT_ERR_SWITCH_REL_F, &val64); aggregate_64bit(&perf_count_ext.portRcvSwitchRelayErrors, val64); mad_decode_field(pc, IB_PC_EXT_XMT_DISCARDS_F, &val64); aggregate_64bit(&perf_count_ext.portXmitDiscards, val64); mad_decode_field(pc, IB_PC_EXT_ERR_XMTCONSTR_F, &val64); aggregate_64bit(&perf_count_ext.portXmitConstraintErrors, val64); mad_decode_field(pc, IB_PC_EXT_ERR_RCVCONSTR_F, &val64); aggregate_64bit(&perf_count_ext.portRcvConstraintErrors, val64); mad_decode_field(pc, IB_PC_EXT_ERR_LOCALINTEG_F, &val64); aggregate_64bit(&perf_count_ext.localLinkIntegrityErrors, val64); mad_decode_field(pc, IB_PC_EXT_ERR_EXCESS_OVR_F, &val64); 
aggregate_64bit(&perf_count_ext.excessiveBufferOverrunErrors, val64); mad_decode_field(pc, IB_PC_EXT_VL15_DROPPED_F, &val64); aggregate_64bit(&perf_count_ext.VL15Dropped, val64); mad_decode_field(pc, IB_PC_EXT_XMT_WAIT_F, &val64); aggregate_64bit(&perf_count_ext.portXmitWait, val64); mad_decode_field(pc, IB_PC_EXT_QP1_DROP_F, &val64); aggregate_64bit(&perf_count_ext.QP1Dropped, val64); } } static void dump_perfcounters_ext(char *buf, int size, __be16 cap_mask, uint32_t cap_mask2) { size_t offset, tmp_offset; mad_dump_fields(buf, size, pc, sizeof(pc), IB_PC_EXT_FIRST_F, IB_PC_EXT_XMT_UPKTS_F); offset = strlen(buf); if (cap_mask & IB_PM_EXT_WIDTH_SUPPORTED) { mad_dump_fields(buf + offset, size - offset, pc, sizeof(pc), IB_PC_EXT_XMT_UPKTS_F, IB_PC_EXT_LAST_F); tmp_offset = strlen(buf + offset); offset += tmp_offset; } if (htonl(cap_mask2) & IB_PM_IS_ADDL_PORT_CTRS_EXT_SUP) { mad_dump_fields(buf + offset, size - offset, pc, sizeof(pc), IB_PC_EXT_COUNTER_SELECT2_F, IB_PC_EXT_ERR_LAST_F); } } static void output_aggregate_perfcounters_ext(ib_portid_t * portid, __be16 cap_mask, uint32_t cap_mask2) { char buf[1536]; uint32_t val = ALL_PORTS; memset(buf, 0, sizeof(buf)); /* set port_select to 255 to emulate AllPortSelect */ mad_encode_field(pc, IB_PC_EXT_PORT_SELECT_F, &val); mad_encode_field(pc, IB_PC_EXT_COUNTER_SELECT_F, &perf_count_ext.counterselect); mad_encode_field(pc, IB_PC_EXT_XMT_BYTES_F, &perf_count_ext.portxmitdata); mad_encode_field(pc, IB_PC_EXT_RCV_BYTES_F, &perf_count_ext.portrcvdata); mad_encode_field(pc, IB_PC_EXT_XMT_PKTS_F, &perf_count_ext.portxmitpkts); mad_encode_field(pc, IB_PC_EXT_RCV_PKTS_F, &perf_count_ext.portrcvpkts); if (cap_mask & IB_PM_EXT_WIDTH_SUPPORTED) { mad_encode_field(pc, IB_PC_EXT_XMT_UPKTS_F, &perf_count_ext.portunicastxmitpkts); mad_encode_field(pc, IB_PC_EXT_RCV_UPKTS_F, &perf_count_ext.portunicastrcvpkts); mad_encode_field(pc, IB_PC_EXT_XMT_MPKTS_F, &perf_count_ext.portmulticastxmitpkits); mad_encode_field(pc, IB_PC_EXT_RCV_MPKTS_F, &perf_count_ext.portmulticastrcvpkts); } if (htonl(cap_mask2) & IB_PM_IS_ADDL_PORT_CTRS_EXT_SUP) { mad_encode_field(pc, IB_PC_EXT_COUNTER_SELECT2_F, &perf_count_ext.counterSelect2); mad_encode_field(pc, IB_PC_EXT_ERR_SYM_F, &perf_count_ext.symbolErrorCounter); mad_encode_field(pc, IB_PC_EXT_LINK_RECOVERS_F, &perf_count_ext.linkErrorRecoveryCounter); mad_encode_field(pc, IB_PC_EXT_LINK_DOWNED_F, &perf_count_ext.linkDownedCounter); mad_encode_field(pc, IB_PC_EXT_ERR_RCV_F, &perf_count_ext.portRcvErrors); mad_encode_field(pc, IB_PC_EXT_ERR_PHYSRCV_F, &perf_count_ext.portRcvRemotePhysicalErrors); mad_encode_field(pc, IB_PC_EXT_ERR_SWITCH_REL_F, &perf_count_ext.portRcvSwitchRelayErrors); mad_encode_field(pc, IB_PC_EXT_XMT_DISCARDS_F, &perf_count_ext.portXmitDiscards); mad_encode_field(pc, IB_PC_EXT_ERR_XMTCONSTR_F, &perf_count_ext.portXmitConstraintErrors); mad_encode_field(pc, IB_PC_EXT_ERR_RCVCONSTR_F, &perf_count_ext.portRcvConstraintErrors); mad_encode_field(pc, IB_PC_EXT_ERR_LOCALINTEG_F, &perf_count_ext.localLinkIntegrityErrors); mad_encode_field(pc, IB_PC_EXT_ERR_EXCESS_OVR_F, &perf_count_ext.excessiveBufferOverrunErrors); mad_encode_field(pc, IB_PC_EXT_VL15_DROPPED_F, &perf_count_ext.VL15Dropped); mad_encode_field(pc, IB_PC_EXT_XMT_WAIT_F, &perf_count_ext.portXmitWait); mad_encode_field(pc, IB_PC_EXT_QP1_DROP_F, &perf_count_ext.QP1Dropped); } dump_perfcounters_ext(buf, sizeof(buf), cap_mask, cap_mask2); printf("# Port extended counters: %s port %d (CapMask: 0x%02X CapMask2: 0x%07X)\n%s", portid2str(portid), ALL_PORTS, 
ntohs(cap_mask), cap_mask2, buf); } static void dump_perfcounters(int extended, int timeout, __be16 cap_mask, uint32_t cap_mask2, ib_portid_t * portid, int port, int aggregate) { char buf[1536]; if (extended != 1) { memset(pc, 0, sizeof(pc)); if (!pma_query_via(pc, portid, port, timeout, IB_GSI_PORT_COUNTERS, srcport)) IBEXIT("perfquery"); if (!(cap_mask & IB_PM_PC_XMIT_WAIT_SUP)) { /* if PortCounters:PortXmitWait not supported clear this counter */ VERBOSE("PortXmitWait not indicated" " so ignore this counter"); perf_count.xmtwait = 0; mad_encode_field(pc, IB_PC_XMT_WAIT_F, &perf_count.xmtwait); } if (aggregate) aggregate_perfcounters(); else mad_dump_perfcounters(buf, sizeof buf, pc, sizeof pc); } else { /* 1.2 errata: bit 9 is extended counter support * bit 10 is extended counter NoIETF */ if (!(cap_mask & IB_PM_EXT_WIDTH_SUPPORTED) && !(cap_mask & IB_PM_EXT_WIDTH_NOIETF_SUP)) IBWARN ("PerfMgt ClassPortInfo CapMask 0x%02X; No extended counter support indicated\n", ntohs(cap_mask)); memset(pc, 0, sizeof(pc)); if (!pma_query_via(pc, portid, port, timeout, IB_GSI_PORT_COUNTERS_EXT, srcport)) IBEXIT("perfextquery"); if (aggregate) aggregate_perfcounters_ext(cap_mask, cap_mask2); else dump_perfcounters_ext(buf, sizeof(buf), cap_mask, cap_mask2); } if (!aggregate) { if (extended) printf("# Port extended counters: %s port %d " "(CapMask: 0x%02X CapMask2: 0x%07X)\n%s", portid2str(portid), port, ntohs(cap_mask), cap_mask2, buf); else printf("# Port counters: %s port %d " "(CapMask: 0x%02X)\n%s", portid2str(portid), port, ntohs(cap_mask), buf); } } static void reset_counters(int extended, int timeout, int mask, ib_portid_t * portid, int port) { memset(pc, 0, sizeof(pc)); if (extended != 1) { if (!performance_reset_via(pc, portid, port, mask, timeout, IB_GSI_PORT_COUNTERS, srcport)) IBEXIT("perf reset"); } else { if (!performance_reset_via(pc, portid, port, mask, timeout, IB_GSI_PORT_COUNTERS_EXT, srcport)) IBEXIT("perf ext reset"); } } static struct { int reset, reset_only, all_ports, loop_ports, port, extended, xmt_sl, rcv_sl, xmt_disc, rcv_err, extended_speeds, smpl_ctl, oprcvcounters, flowctlcounters, vloppackets, vlopdata, vlxmitflowctlerrors, vlxmitcounters, swportvlcong, rcvcc, slrcvfecn, slrcvbecn, xmitcc, vlxmittimecc; int ports[MAX_PORTS]; int ports_count; } info; static void common_func(ib_portid_t * portid, int port_num, int mask, unsigned query, unsigned reset, const char *name, uint16_t attr, void dump_func(char *, int, void *, int)) { char buf[1536]; if (query) { memset(pc, 0, sizeof(pc)); if (!pma_query_via(pc, portid, port_num, ibd_timeout, attr, srcport)) IBEXIT("cannot query %s", name); dump_func(buf, sizeof(buf), pc, sizeof(pc)); printf("# %s counters: %s port %d\n%s", name, portid2str(portid), port_num, buf); } memset(pc, 0, sizeof(pc)); if (reset && !performance_reset_via(pc, portid, info.port, mask, ibd_timeout, attr, srcport)) IBEXIT("cannot reset %s", name); } static void xmt_sl_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "PortXmitDataSL", IB_GSI_PORT_XMIT_DATA_SL, mad_dump_perfcounters_xmt_sl); } static void rcv_sl_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "PortRcvDataSL", IB_GSI_PORT_RCV_DATA_SL, mad_dump_perfcounters_rcv_sl); } static void xmt_disc_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), 
"PortXmitDiscardDetails", IB_GSI_PORT_XMIT_DISCARD_DETAILS, mad_dump_perfcounters_xmt_disc); } static void rcv_err_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "PortRcvErrorDetails", IB_GSI_PORT_RCV_ERROR_DETAILS, mad_dump_perfcounters_rcv_err); } static uint8_t *ext_speeds_reset_via(void *rcvbuf, ib_portid_t * dest, int port, uint64_t mask, unsigned timeout) { ib_rpc_t rpc = { 0 }; int lid = dest->lid; DEBUG("lid %u port %d mask 0x%" PRIx64, lid, port, mask); if (lid == -1) { IBWARN("only lid routed is supported"); return NULL; } if (!mask) mask = ~0; rpc.mgtclass = IB_PERFORMANCE_CLASS; rpc.method = IB_MAD_METHOD_SET; rpc.attr.id = IB_GSI_PORT_EXT_SPEEDS_COUNTERS; memset(rcvbuf, 0, IB_MAD_SIZE); mad_set_field(rcvbuf, 0, IB_PESC_PORT_SELECT_F, port); mad_set_field64(rcvbuf, 0, IB_PESC_COUNTER_SELECT_F, mask); rpc.attr.mod = 0; rpc.timeout = timeout; rpc.datasz = IB_PC_DATA_SZ; rpc.dataoffs = IB_PC_DATA_OFFS; if (!dest->qp) dest->qp = 1; if (!dest->qkey) dest->qkey = IB_DEFAULT_QP1_QKEY; return mad_rpc(srcport, &rpc, dest, rcvbuf, rcvbuf); } static uint8_t is_rsfec_mode_active(ib_portid_t * portid, int port, __be16 cap_mask) { uint8_t data[IB_SMP_DATA_SIZE] = { 0 }; uint32_t fec_mode_active = 0; uint32_t pie_capmask = 0; if (cap_mask & IS_PM_RSFEC_COUNTERS_SUP) { if (!is_port_info_extended_supported(portid, port, srcports->smi.port)) { IBWARN("Port Info Extended not supported"); return 0; } if (!smp_query_via(data, portid, IB_ATTR_PORT_INFO_EXT, port, 0, srcports->smi.port)) IBEXIT("smp query portinfo extended failed"); mad_decode_field(data, IB_PORT_EXT_CAPMASK_F, &pie_capmask); mad_decode_field(data, IB_PORT_EXT_FEC_MODE_ACTIVE_F, &fec_mode_active); if((pie_capmask & be32toh(IB_PORT_EXT_CAP_IS_FEC_MODE_SUPPORTED)) && ((be16toh(IB_PORT_EXT_RS_FEC_MODE_ACTIVE) == (fec_mode_active & 0xffff)) || (be16toh(IB_PORT_EXT_RS_FEC2_MODE_ACTIVE) == (fec_mode_active & 0xffff)))) return 1; } return 0; } static void extended_speeds_query(ib_portid_t * portid, int port, uint64_t ext_mask, __be16 cap_mask) { int mask = ext_mask; if (!info.reset_only) { if (is_rsfec_mode_active(portid, port, cap_mask)) common_func(portid, port, mask, 1, 0, "PortExtendedSpeedsCounters with RS-FEC Active", IB_GSI_PORT_EXT_SPEEDS_COUNTERS, mad_dump_port_ext_speeds_counters_rsfec_active); else common_func(portid, port, mask, 1, 0, "PortExtendedSpeedsCounters", IB_GSI_PORT_EXT_SPEEDS_COUNTERS, mad_dump_port_ext_speeds_counters); } if ((info.reset_only || info.reset) && !ext_speeds_reset_via(pc, portid, port, ext_mask, ibd_timeout)) IBEXIT("cannot reset PortExtendedSpeedsCounters"); } static void oprcvcounters_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "PortOpRcvCounters", IB_GSI_PORT_PORT_OP_RCV_COUNTERS, mad_dump_perfcounters_port_op_rcv_counters); } static void flowctlcounters_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "PortFlowCtlCounters", IB_GSI_PORT_PORT_FLOW_CTL_COUNTERS, mad_dump_perfcounters_port_flow_ctl_counters); } static void vloppackets_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "PortVLOpPackets", IB_GSI_PORT_PORT_VL_OP_PACKETS, mad_dump_perfcounters_port_vl_op_packet); } static void vlopdata_query(ib_portid_t * portid, int port, int mask) { common_func(portid, 
port, mask, !info.reset_only, (info.reset_only || info.reset), "PortVLOpData", IB_GSI_PORT_PORT_VL_OP_DATA, mad_dump_perfcounters_port_vl_op_data); } static void vlxmitflowctlerrors_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "PortVLXmitFlowCtlUpdateErrors", IB_GSI_PORT_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS, mad_dump_perfcounters_port_vl_xmit_flow_ctl_update_errors); } static void vlxmitcounters_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "PortVLXmitWaitCounters", IB_GSI_PORT_PORT_VL_XMIT_WAIT_COUNTERS, mad_dump_perfcounters_port_vl_xmit_wait_counters); } static void swportvlcong_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "SwPortVLCongestion", IB_GSI_SW_PORT_VL_CONGESTION, mad_dump_perfcounters_sw_port_vl_congestion); } static void rcvcc_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "PortRcvConCtrl", IB_GSI_PORT_RCV_CON_CTRL, mad_dump_perfcounters_rcv_con_ctrl); } static void slrcvfecn_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "PortSLRcvFECN", IB_GSI_PORT_SL_RCV_FECN, mad_dump_perfcounters_sl_rcv_fecn); } static void slrcvbecn_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "PortSLRcvBECN", IB_GSI_PORT_SL_RCV_BECN, mad_dump_perfcounters_sl_rcv_becn); } static void xmitcc_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "PortXmitConCtrl", IB_GSI_PORT_XMIT_CON_CTRL, mad_dump_perfcounters_xmit_con_ctrl); } static void vlxmittimecc_query(ib_portid_t * portid, int port, int mask) { common_func(portid, port, mask, !info.reset_only, (info.reset_only || info.reset), "PortVLXmitTimeCong", IB_GSI_PORT_VL_XMIT_TIME_CONG, mad_dump_perfcounters_vl_xmit_time_cong); } static void dump_portsamples_control(ib_portid_t *portid, int port) { char buf[1280]; memset(pc, 0, sizeof(pc)); if (!pma_query_via(pc, portid, port, ibd_timeout, IB_GSI_PORT_SAMPLES_CONTROL, srcport)) IBEXIT("sampctlquery"); mad_dump_portsamples_control(buf, sizeof buf, pc, sizeof pc); printf("# PortSamplesControl: %s port %d\n%s", portid2str(portid), port, buf); } static int process_opt(void *context, int ch) { switch (ch) { case 'x': info.extended = 1; break; case 'X': info.xmt_sl = 1; break; case 'S': info.rcv_sl = 1; break; case 'D': info.xmt_disc = 1; break; case 'E': info.rcv_err = 1; break; case 'T': info.extended_speeds = 1; break; case 'c': info.smpl_ctl = 1; break; case 1: info.oprcvcounters = 1; break; case 2: info.flowctlcounters = 1; break; case 3: info.vloppackets = 1; break; case 4: info.vlopdata = 1; break; case 5: info.vlxmitflowctlerrors = 1; break; case 6: info.vlxmitcounters = 1; break; case 7: info.swportvlcong = 1; break; case 8: info.rcvcc = 1; break; case 9: info.slrcvfecn = 1; break; case 10: info.slrcvbecn = 1; break; case 11: info.xmitcc = 1; break; case 12: info.vlxmittimecc = 1; break; case 'a': info.all_ports++; info.port = ALL_PORTS; break; case 'l': info.loop_ports++; break; case 'r': info.reset++; break; case 'R': info.reset_only++; break; default: return -1; } return 0; } int main(int argc, char **argv) { 
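/* Overall flow: parse options, open a MAD port pair, resolve the target
 * portid, read PerfMgt ClassPortInfo to learn CapabilityMask/CapabilityMask2,
 * then dispatch to the requested counter query and/or reset handler. */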
int mgmt_classes[3] = { IB_SMI_CLASS, IB_SA_CLASS, IB_PERFORMANCE_CLASS }; ib_portid_t portid = { 0 }; int mask = 0xffff; uint64_t ext_mask = 0xffffffffffffffffULL; __be32 cap_mask2_be; uint32_t cap_mask2; __be16 cap_mask; int all_ports_loop = 0; int node_type, num_ports = 0; uint8_t data[IB_SMP_DATA_SIZE] = { 0 }; int start_port = 1; int enhancedport0; char *tmpstr; int i; const struct ibdiag_opt opts[] = { {"extended", 'x', 0, NULL, "show extended port counters"}, {"xmtsl", 'X', 0, NULL, "show Xmt SL port counters"}, {"rcvsl", 'S', 0, NULL, "show Rcv SL port counters"}, {"xmtdisc", 'D', 0, NULL, "show Xmt Discard Details"}, {"rcverr", 'E', 0, NULL, "show Rcv Error Details"}, {"extended_speeds", 'T', 0, NULL, "show port extended speeds counters"}, {"oprcvcounters", 1, 0, NULL, "show Rcv Counters per Op code"}, {"flowctlcounters", 2, 0, NULL, "show flow control counters"}, {"vloppackets", 3, 0, NULL, "show packets received per Op code per VL"}, {"vlopdata", 4, 0, NULL, "show data received per Op code per VL"}, {"vlxmitflowctlerrors", 5, 0, NULL, "show flow control update errors per VL"}, {"vlxmitcounters", 6, 0, NULL, "show ticks waiting to transmit counters per VL"}, {"swportvlcong", 7, 0, NULL, "show sw port VL congestion"}, {"rcvcc", 8, 0, NULL, "show Rcv congestion control counters"}, {"slrcvfecn", 9, 0, NULL, "show SL Rcv FECN counters"}, {"slrcvbecn", 10, 0, NULL, "show SL Rcv BECN counters"}, {"xmitcc", 11, 0, NULL, "show Xmit congestion control counters"}, {"vlxmittimecc", 12, 0, NULL, "show VL Xmit Time congestion control counters"}, {"smplctl", 'c', 0, NULL, "show samples control"}, {"all_ports", 'a', 0, NULL, "show aggregated counters"}, {"loop_ports", 'l', 0, NULL, "iterate through each port"}, {"reset_after_read", 'r', 0, NULL, "reset counters after read"}, {"Reset_only", 'R', 0, NULL, "only reset counters"}, {} }; char usage_args[] = " [ [[port(s)] [reset_mask]]]"; const char *usage_examples[] = { "\t\t# read local port's performance counters", "32 1\t\t# read performance counters from lid 32, port 1", "-x 32 1\t# read extended performance counters from lid 32, port 1", "-a 32\t\t# read performance counters from lid 32, all ports", "-r 32 1\t# read performance counters and reset", "-x -r 32 1\t# read extended performance counters and reset", "-R 0x20 1\t# reset performance counters of port 1 only", "-x -R 0x20 1\t# reset extended performance counters of port 1 only", "-R -a 32\t# reset performance counters of all ports", "-R 32 2 0x0fff\t# reset only error counters of port 2", "-R 32 2 0xf000\t# reset only non-error counters of port 2", "-a 32 1-10\t# read performance counters from lid 32, port 1-10, aggregate output", "-l 32 1-10\t# read performance counters from lid 32, port 1-10, output each port", "-a 32 1,4,8\t# read performance counters from lid 32, port 1, 4, and 8, aggregate output", "-l 32 1,4,8\t# read performance counters from lid 32, port 1, 4, and 8, output each port", NULL, }; ibdiag_process_opts(argc, argv, NULL, "DK", opts, process_opt, usage_args, usage_examples); argc -= optind; argv += optind; if (argc > 1) { if (strchr(argv[1], ',')) { tmpstr = strtok(argv[1], ","); while (tmpstr) { info.ports[info.ports_count++] = strtoul(tmpstr, NULL, 0); tmpstr = strtok(NULL, ","); } info.port = info.ports[0]; } else if ((tmpstr = strchr(argv[1], '-'))) { int pmin, pmax; *tmpstr = '\0'; tmpstr++; pmin = strtoul(argv[1], NULL, 0); pmax = strtoul(tmpstr, NULL, 0); if (pmin >= pmax) IBEXIT("max port must be greater than min port in range"); while (pmin <= pmax) 
info.ports[info.ports_count++] = pmin++; info.port = info.ports[0]; } else info.port = strtoul(argv[1], NULL, 0); } if (argc > 2) { ext_mask = strtoull(argv[2], NULL, 0); mask = ext_mask; } srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 3, 0); if (!srcports) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); srcport = srcports->gsi.port; smp_mkey_set(srcports->smi.port, ibd_mkey); if (argc) { if (resolve_portid_str(srcports->gsi.ca_name, ibd_ca_port, &portid, argv[0], ibd_dest_type, ibd_sm_id, srcport) < 0) IBEXIT("can't resolve destination port %s", argv[0]); } else { if (resolve_self(srcports->gsi.ca_name, ibd_ca_port, &portid, &info.port, NULL) < 0) IBEXIT("can't resolve self port %s", argv[0]); } /* PerfMgt ClassPortInfo is a required attribute */ memset(pc, 0, sizeof(pc)); if (!pma_query_via(pc, &portid, info.port, ibd_timeout, CLASS_PORT_INFO, srcport)) IBEXIT("classportinfo query"); /* ClassPortInfo should be supported as part of libibmad */ memcpy(&cap_mask, pc + 2, sizeof(cap_mask)); /* CapabilityMask */ memcpy(&cap_mask2_be, pc + 4, sizeof(cap_mask2_be)); /* CapabilityMask2 */ cap_mask2 = ntohl(cap_mask2_be) >> 5; if (!(cap_mask & IB_PM_ALL_PORT_SELECT)) { /* bit 8 is AllPortSelect */ if (!info.all_ports && info.port == ALL_PORTS) IBEXIT("AllPortSelect not supported"); if (info.all_ports && info.port == ALL_PORTS) all_ports_loop = 1; } if (info.xmt_sl) { xmt_sl_query(&portid, info.port, mask); goto done; } if (info.rcv_sl) { rcv_sl_query(&portid, info.port, mask); goto done; } if (info.xmt_disc) { xmt_disc_query(&portid, info.port, mask); goto done; } if (info.rcv_err) { rcv_err_query(&portid, info.port, mask); goto done; } if (info.extended_speeds) { extended_speeds_query(&portid, info.port, ext_mask, cap_mask); goto done; } if (info.oprcvcounters) { oprcvcounters_query(&portid, info.port, mask); goto done; } if (info.flowctlcounters) { flowctlcounters_query(&portid, info.port, mask); goto done; } if (info.vloppackets) { vloppackets_query(&portid, info.port, mask); goto done; } if (info.vlopdata) { vlopdata_query(&portid, info.port, mask); goto done; } if (info.vlxmitflowctlerrors) { vlxmitflowctlerrors_query(&portid, info.port, mask); goto done; } if (info.vlxmitcounters) { vlxmitcounters_query(&portid, info.port, mask); goto done; } if (info.swportvlcong) { swportvlcong_query(&portid, info.port, mask); goto done; } if (info.rcvcc) { rcvcc_query(&portid, info.port, mask); goto done; } if (info.slrcvfecn) { slrcvfecn_query(&portid, info.port, mask); goto done; } if (info.slrcvbecn) { slrcvbecn_query(&portid, info.port, mask); goto done; } if (info.xmitcc) { xmitcc_query(&portid, info.port, mask); goto done; } if (info.vlxmittimecc) { vlxmittimecc_query(&portid, info.port, mask); goto done; } if (info.smpl_ctl) { dump_portsamples_control(&portid, info.port); goto done; } if (all_ports_loop || (info.loop_ports && (info.all_ports || info.port == ALL_PORTS))) { if (!smp_query_via(data, &portid, IB_ATTR_NODE_INFO, 0, 0, srcports->smi.port)) IBEXIT("smp query nodeinfo failed"); node_type = mad_get_field(data, 0, IB_NODE_TYPE_F); mad_decode_field(data, IB_NODE_NPORTS_F, &num_ports); if (!num_ports) IBEXIT("smp query nodeinfo: num ports invalid"); if (node_type == IB_NODE_SWITCH) { if (!smp_query_via(data, &portid, IB_ATTR_SWITCH_INFO, 0, 0, srcports->smi.port)) IBEXIT("smp query nodeinfo failed"); enhancedport0 = mad_get_field(data, 0, IB_SW_ENHANCED_PORT0_F); if (enhancedport0) start_port = 0; } if (all_ports_loop && !info.loop_ports) IBWARN ("Emulating 
AllPortSelect by iterating through all ports"); } if (info.reset_only) goto do_reset; if (all_ports_loop || (info.loop_ports && (info.all_ports || info.port == ALL_PORTS))) { for (i = start_port; i <= num_ports; i++) dump_perfcounters(info.extended, ibd_timeout, cap_mask, cap_mask2, &portid, i, (all_ports_loop && !info.loop_ports)); if (all_ports_loop && !info.loop_ports) { if (info.extended != 1) output_aggregate_perfcounters(&portid, cap_mask); else output_aggregate_perfcounters_ext(&portid, cap_mask, cap_mask2); } } else if (info.ports_count > 1) { for (i = 0; i < info.ports_count; i++) dump_perfcounters(info.extended, ibd_timeout, cap_mask, cap_mask2, &portid, info.ports[i], (info.all_ports && !info.loop_ports)); if (info.all_ports && !info.loop_ports) { if (info.extended != 1) output_aggregate_perfcounters(&portid, cap_mask); else output_aggregate_perfcounters_ext(&portid, cap_mask, cap_mask2); } } else dump_perfcounters(info.extended, ibd_timeout, cap_mask, cap_mask2, &portid, info.port, 0); if (!info.reset) goto done; do_reset: if (argc <= 2 && !info.extended) { if (cap_mask & IB_PM_PC_XMIT_WAIT_SUP) mask |= (1 << 16); /* reset portxmitwait */ if (cap_mask & IB_PM_IS_QP1_DROP_SUP) mask |= (1 << 17); /* reset qp1dropped */ } if (info.extended) { mask |= 0xfff0000; if (cap_mask & IB_PM_PC_XMIT_WAIT_SUP) mask |= (1 << 28); if (cap_mask & IB_PM_IS_QP1_DROP_SUP) mask |= (1 << 29); } if (all_ports_loop || (info.loop_ports && (info.all_ports || info.port == ALL_PORTS))) { for (i = start_port; i <= num_ports; i++) reset_counters(info.extended, ibd_timeout, mask, &portid, i); } else if (info.ports_count > 1) { for (i = 0; i < info.ports_count; i++) reset_counters(info.extended, ibd_timeout, mask, &portid, info.ports[i]); } else reset_counters(info.extended, ibd_timeout, mask, &portid, info.port); done: mad_rpc_close_port2(srcports); exit(0); } rdma-core-56.1/infiniband-diags/saquery.c000066400000000000000000001673231477342711600203710ustar00rootroot00000000000000/* * Copyright (c) 2006,2007 The Regents of the University of California. * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2002-2013 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 1996-2013 Intel Corporation. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * * Produced at Lawrence Livermore National Laboratory. * Written by Ira Weiny . * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include #include #define _GNU_SOURCE #include #include #include #include "ibdiag_common.h" #include "ibdiag_sa.h" #ifndef IB_PR_COMPMASK_SERVICEID #define IB_PR_COMPMASK_SERVICEID (IB_PR_COMPMASK_SERVICEID_MSB | \ IB_PR_COMPMASK_SERVICEID_LSB) #endif #define UMAD_SA_CAP_MASK2_IS_MCAST_TOP_SUP (1 << 3) struct query_params { uint64_t service_id; ibmad_gid_t sgid, dgid, gid, mgid; uint16_t slid, dlid, mlid; uint32_t flow_label; int hop_limit; uint8_t tclass; int reversible, numb_path; uint16_t pkey; int qos_class, sl; uint8_t mtu, rate, pkt_life; uint32_t qkey; uint8_t scope; uint8_t join_state; int proxy_join; ib_class_port_info_t cpi; }; struct query_cmd { const char *name, *alias; uint16_t query_type; const char *usage; int (*handler) (const struct query_cmd * q, struct sa_handle * h, struct query_params * p, int argc, char *argv[]); }; static char *node_name_map_file = NULL; static nn_map_t *node_name_map = NULL; /** * Declare some globals because I don't want this to be too complex. */ #define MAX_PORTS (8) #define DEFAULT_SA_TIMEOUT_MS (1000) static enum { ALL, LID_ONLY, UNIQUE_LID_ONLY, GUID_ONLY, ALL_DESC, NAME_OF_LID, NAME_OF_GUID, } node_print_desc = ALL; static char *requested_name; static uint16_t requested_lid; static int requested_lid_flag; static uint64_t requested_guid; static int requested_guid_flag; static unsigned valid_gid(ibmad_gid_t * gid) { ibmad_gid_t zero_gid; memset(&zero_gid, 0, sizeof zero_gid); return memcmp(&zero_gid, gid, sizeof(*gid)); } static void print_node_desc(ib_node_record_t * node_record) { ib_node_info_t *p_ni = &(node_record->node_info); ib_node_desc_t *p_nd = &(node_record->node_desc); char *name; if (p_ni->node_type == IB_NODE_TYPE_CA) { name = remap_node_name(node_name_map, be64toh(node_record->node_info.node_guid), (char *)p_nd->description); printf("%6d \"%s\"\n", be16toh(node_record->lid), name); free(name); } } static void dump_node_record(void *data, struct query_params *p) { ib_node_record_t *nr = data; ib_node_info_t *ni = &nr->node_info; char *name = remap_node_name(node_name_map, be64toh(ni->node_guid), (char *)nr->node_desc.description); printf("NodeRecord dump:\n" "\t\tlid.....................%u\n" "\t\treserved................0x%X\n" "\t\tbase_version............0x%X\n" "\t\tclass_version...........0x%X\n" "\t\tnode_type...............%s\n" "\t\tnum_ports...............%u\n" "\t\tsys_guid................0x%016" PRIx64 "\n" "\t\tnode_guid...............0x%016" PRIx64 "\n" "\t\tport_guid...............0x%016" PRIx64 "\n" "\t\tpartition_cap...........0x%X\n" "\t\tdevice_id...............0x%X\n" "\t\trevision................0x%X\n" "\t\tport_num................%u\n" "\t\tvendor_id...............0x%X\n" "\t\tNodeDescription.........%s\n", be16toh(nr->lid), be16toh(nr->resv), ni->base_version, ni->class_version, ib_get_node_type_str(ni->node_type), ni->num_ports, be64toh(ni->sys_guid), be64toh(ni->node_guid), be64toh(ni->port_guid), be16toh(ni->partition_cap), be16toh(ni->device_id), be32toh(ni->revision), ib_node_info_get_local_port_num(ni), be32toh(ib_node_info_get_vendor_id(ni)), name); free(name); } static void print_node_record(ib_node_record_t * node_record) { ib_node_info_t *p_ni = &node_record->node_info; ib_node_desc_t *p_nd = 
&node_record->node_desc; char *name; switch (node_print_desc) { case LID_ONLY: case UNIQUE_LID_ONLY: printf("%u\n", be16toh(node_record->lid)); return; case GUID_ONLY: printf("0x%016" PRIx64 "\n", be64toh(p_ni->port_guid)); return; case NAME_OF_LID: case NAME_OF_GUID: name = remap_node_name(node_name_map, be64toh(p_ni->node_guid), (char *)p_nd->description); printf("%s\n", name); free(name); return; case ALL: default: break; } dump_node_record(node_record, NULL); } static void dump_path_record(void *data, struct query_params *p) { char gid_str[INET6_ADDRSTRLEN]; char gid_str2[INET6_ADDRSTRLEN]; ib_path_rec_t *p_pr = data; printf("PathRecord dump:\n" "\t\tservice_id..............0x%016" PRIx64 "\n" "\t\tdgid....................%s\n" "\t\tsgid....................%s\n" "\t\tdlid....................%u\n" "\t\tslid....................%u\n" "\t\thop_flow_raw............0x%X\n" "\t\ttclass..................0x%X\n" "\t\tnum_path_revers.........0x%X\n" "\t\tpkey....................0x%X\n" "\t\tqos_class...............0x%X\n" "\t\tsl......................0x%X\n" "\t\tmtu.....................0x%X\n" "\t\trate....................0x%X\n" "\t\tpkt_life................0x%X\n" "\t\tpreference..............0x%X\n" "\t\tresv2...................0x%02X%02X%02X%02X%02X%02X\n", be64toh(p_pr->service_id), inet_ntop(AF_INET6, p_pr->dgid.raw, gid_str, sizeof gid_str), inet_ntop(AF_INET6, p_pr->sgid.raw, gid_str2, sizeof gid_str2), be16toh(p_pr->dlid), be16toh(p_pr->slid), be32toh(p_pr->hop_flow_raw), p_pr->tclass, p_pr->num_path, be16toh(p_pr->pkey), ib_path_rec_qos_class(p_pr), ib_path_rec_sl(p_pr), p_pr->mtu, p_pr->rate, p_pr->pkt_life, p_pr->preference, p_pr->resv2[0], p_pr->resv2[1], p_pr->resv2[2], p_pr->resv2[3], p_pr->resv2[4], p_pr->resv2[5]); } static void dump_class_port_info(ib_class_port_info_t *cpi) { char gid_str[INET6_ADDRSTRLEN]; char gid_str2[INET6_ADDRSTRLEN]; printf("SA ClassPortInfo:\n" "\t\tBase version.............%d\n" "\t\tClass version............%d\n" "\t\tCapability mask..........0x%04X\n" "\t\tCapability mask 2........0x%08X\n" "\t\tResponse time value......0x%02X\n" "\t\tRedirect GID.............%s\n" "\t\tRedirect TC/SL/FL........0x%08X\n" "\t\tRedirect LID.............%u\n" "\t\tRedirect PKey............0x%04X\n" "\t\tRedirect QP..............0x%08X\n" "\t\tRedirect QKey............0x%08X\n" "\t\tTrap GID.................%s\n" "\t\tTrap TC/SL/FL............0x%08X\n" "\t\tTrap LID.................%u\n" "\t\tTrap PKey................0x%04X\n" "\t\tTrap HL/QP...............0x%08X\n" "\t\tTrap QKey................0x%08X\n", cpi->base_ver, cpi->class_ver, be16toh(cpi->cap_mask), ib_class_cap_mask2(cpi), ib_class_resp_time_val(cpi), inet_ntop(AF_INET6, &(cpi->redir_gid), gid_str, sizeof gid_str), be32toh(cpi->redir_tc_sl_fl), be16toh(cpi->redir_lid), be16toh(cpi->redir_pkey), be32toh(cpi->redir_qp), be32toh(cpi->redir_qkey), inet_ntop(AF_INET6, &(cpi->trap_gid), gid_str2, sizeof gid_str2), be32toh(cpi->trap_tc_sl_fl), be16toh(cpi->trap_lid), be16toh(cpi->trap_pkey), be32toh(cpi->trap_hop_qp), be32toh(cpi->trap_qkey)); } static void dump_portinfo_record(void *data, struct query_params *p) { ib_portinfo_record_t *p_pir = data; const ib_port_info_t *const p_pi = &p_pir->port_info; printf("PortInfoRecord dump:\n" "\t\tEndPortLid..............%u\n" "\t\tPortNum.................%u\n" "\t\tbase_lid................%u\n" "\t\tmaster_sm_base_lid......%u\n" "\t\tcapability_mask.........0x%X\n", be16toh(p_pir->lid), p_pir->port_num, be16toh(p_pi->base_lid), be16toh(p_pi->master_sm_base_lid), 
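/* SA records arrive in wire (big-endian) order; multi-byte fields are
 * converted with be16toh()/be32toh()/be64toh() only at print time, which
 * also leaves the record buffer itself untouched. */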
be32toh(p_pi->capability_mask)); } static void dump_one_portinfo_record(void *data, struct query_params *p) { ib_portinfo_record_t *pir = data; ib_port_info_t *pi = &pir->port_info; printf("PortInfoRecord dump:\n" "\tRID\n" "\t\tEndPortLid..............%u\n" "\t\tPortNum.................%u\n" "\t\tOptions.................0x%x\n" "\tPortInfo dump:\n", be16toh(pir->lid), pir->port_num, pir->options); dump_portinfo(pi, 2); } static void dump_one_mcmember_record(void *data, struct query_params *p) { char mgid[INET6_ADDRSTRLEN], gid[INET6_ADDRSTRLEN]; ib_member_rec_t *mr = data; uint32_t flow; uint8_t sl, hop, scope, join; ib_member_get_sl_flow_hop(mr->sl_flow_hop, &sl, &flow, &hop); ib_member_get_scope_state(mr->scope_state, &scope, &join); printf("MCMember Record dump:\n" "\t\tMGID....................%s\n" "\t\tPortGid.................%s\n" "\t\tqkey....................0x%x\n" "\t\tmlid....................0x%x\n" "\t\tmtu.....................0x%x\n" "\t\tTClass..................0x%x\n" "\t\tpkey....................0x%x\n" "\t\trate....................0x%x\n" "\t\tpkt_life................0x%x\n" "\t\tSL......................0x%x\n" "\t\tFlowLabel...............0x%x\n" "\t\tHopLimit................0x%x\n" "\t\tScope...................0x%x\n" "\t\tJoinState...............0x%x\n" "\t\tProxyJoin...............0x%x\n", inet_ntop(AF_INET6, mr->mgid.raw, mgid, sizeof(mgid)), inet_ntop(AF_INET6, mr->port_gid.raw, gid, sizeof(gid)), be32toh(mr->qkey), be16toh(mr->mlid), mr->mtu, mr->tclass, be16toh(mr->pkey), mr->rate, mr->pkt_life, sl, flow, hop, scope, join, mr->proxy_join); } static void dump_multicast_group_record(void *data, struct query_params *p) { char gid_str[INET6_ADDRSTRLEN]; ib_member_rec_t *p_mcmr = data; uint8_t sl; ib_member_get_sl_flow_hop(p_mcmr->sl_flow_hop, &sl, NULL, NULL); printf("MCMemberRecord group dump:\n" "\t\tMGID....................%s\n" "\t\tMlid....................0x%X\n" "\t\tMtu.....................0x%X\n" "\t\tpkey....................0x%X\n" "\t\tRate....................0x%X\n" "\t\tSL......................0x%X\n", inet_ntop(AF_INET6, p_mcmr->mgid.raw, gid_str, sizeof gid_str), be16toh(p_mcmr->mlid), p_mcmr->mtu, be16toh(p_mcmr->pkey), p_mcmr->rate, sl); } static void dump_multicast_member_record(ib_member_rec_t *p_mcmr, struct sa_query_result *nr_result, struct query_params *params) { char gid_str[INET6_ADDRSTRLEN]; char gid_str2[INET6_ADDRSTRLEN]; uint16_t mlid = be16toh(p_mcmr->mlid); unsigned i = 0; char *node_name = strdup(""); /* go through the node records searching for a port guid which matches * this port gid interface id. * This gives us a node name to print, if available. 
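 * The comparison below works on raw wire-order values: port_gid's
 * unicast.interface_id holds the low 64 bits of the GID, which for an end
 * port is its port GUID, so both sides are __be64 and can be compared
 * without byte swapping.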
*/ for (i = 0; i < nr_result->result_cnt; i++) { ib_node_record_t *nr = sa_get_query_rec(nr_result->p_result_madw, i); if (nr->node_info.port_guid == p_mcmr->port_gid.unicast.interface_id) { if(node_name != NULL) free(node_name); node_name = remap_node_name(node_name_map, be64toh(nr->node_info.node_guid), (char *)nr->node_desc.description); break; } } if (requested_name) { if (strtol(requested_name, NULL, 0) == mlid) printf("\t\tPortGid.................%s (%s)\n", inet_ntop(AF_INET6, p_mcmr->port_gid.raw, gid_str, sizeof gid_str), node_name); } else { printf("MCMemberRecord member dump:\n" "\t\tMGID....................%s\n" "\t\tMlid....................0x%X\n" "\t\tPortGid.................%s\n" "\t\tScopeState..............0x%X\n" "\t\tProxyJoin...............0x%X\n" "\t\tNodeDescription.........%s\n", inet_ntop(AF_INET6, p_mcmr->mgid.raw, gid_str, sizeof gid_str), be16toh(p_mcmr->mlid), inet_ntop(AF_INET6, p_mcmr->port_gid.raw, gid_str2, sizeof gid_str2), p_mcmr->scope_state, p_mcmr->proxy_join, node_name); } free(node_name); } static void dump_service_record(void *data, struct query_params *p) { char gid[INET6_ADDRSTRLEN]; char buf_service_key[35]; char buf_service_name[65]; ib_service_record_t *p_sr = data; sprintf(buf_service_key, "0x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x", p_sr->service_key[0], p_sr->service_key[1], p_sr->service_key[2], p_sr->service_key[3], p_sr->service_key[4], p_sr->service_key[5], p_sr->service_key[6], p_sr->service_key[7], p_sr->service_key[8], p_sr->service_key[9], p_sr->service_key[10], p_sr->service_key[11], p_sr->service_key[12], p_sr->service_key[13], p_sr->service_key[14], p_sr->service_key[15]); strncpy(buf_service_name, (char *)p_sr->service_name, 64); buf_service_name[64] = '\0'; printf("ServiceRecord dump:\n" "\t\tServiceID...............0x%016" PRIx64 "\n" "\t\tServiceGID..............%s\n" "\t\tServiceP_Key............0x%X\n" "\t\tServiceLease............0x%X\n" "\t\tServiceKey..............%s\n" "\t\tServiceName.............%s\n" "\t\tServiceData8.1..........0x%X\n" "\t\tServiceData8.2..........0x%X\n" "\t\tServiceData8.3..........0x%X\n" "\t\tServiceData8.4..........0x%X\n" "\t\tServiceData8.5..........0x%X\n" "\t\tServiceData8.6..........0x%X\n" "\t\tServiceData8.7..........0x%X\n" "\t\tServiceData8.8..........0x%X\n" "\t\tServiceData8.9..........0x%X\n" "\t\tServiceData8.10.........0x%X\n" "\t\tServiceData8.11.........0x%X\n" "\t\tServiceData8.12.........0x%X\n" "\t\tServiceData8.13.........0x%X\n" "\t\tServiceData8.14.........0x%X\n" "\t\tServiceData8.15.........0x%X\n" "\t\tServiceData8.16.........0x%X\n" "\t\tServiceData16.1.........0x%X\n" "\t\tServiceData16.2.........0x%X\n" "\t\tServiceData16.3.........0x%X\n" "\t\tServiceData16.4.........0x%X\n" "\t\tServiceData16.5.........0x%X\n" "\t\tServiceData16.6.........0x%X\n" "\t\tServiceData16.7.........0x%X\n" "\t\tServiceData16.8.........0x%X\n" "\t\tServiceData32.1.........0x%X\n" "\t\tServiceData32.2.........0x%X\n" "\t\tServiceData32.3.........0x%X\n" "\t\tServiceData32.4.........0x%X\n" "\t\tServiceData64.1.........0x%016" PRIx64 "\n" "\t\tServiceData64.2.........0x%016" PRIx64 "\n", be64toh(p_sr->service_id), inet_ntop(AF_INET6, p_sr->service_gid.raw, gid, sizeof gid), be16toh(p_sr->service_pkey), be32toh(p_sr->service_lease), (show_keys ? 
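/* ServiceKey is security-sensitive, so it is shown only when the common
 * ibdiag show_keys option is enabled; otherwise NOT_DISPLAYED_STR is
 * printed in its place. */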
buf_service_key : NOT_DISPLAYED_STR), buf_service_name, p_sr->service_data8[0], p_sr->service_data8[1], p_sr->service_data8[2], p_sr->service_data8[3], p_sr->service_data8[4], p_sr->service_data8[5], p_sr->service_data8[6], p_sr->service_data8[7], p_sr->service_data8[8], p_sr->service_data8[9], p_sr->service_data8[10], p_sr->service_data8[11], p_sr->service_data8[12], p_sr->service_data8[13], p_sr->service_data8[14], p_sr->service_data8[15], be16toh(p_sr->service_data16[0]), be16toh(p_sr->service_data16[1]), be16toh(p_sr->service_data16[2]), be16toh(p_sr->service_data16[3]), be16toh(p_sr->service_data16[4]), be16toh(p_sr->service_data16[5]), be16toh(p_sr->service_data16[6]), be16toh(p_sr->service_data16[7]), be32toh(p_sr->service_data32[0]), be32toh(p_sr->service_data32[1]), be32toh(p_sr->service_data32[2]), be32toh(p_sr->service_data32[3]), be64toh(p_sr->service_data64[0]), be64toh(p_sr->service_data64[1])); } static void dump_sm_info_record(void *data, struct query_params *p) { ib_sminfo_record_t *p_smr = data; const ib_sm_info_t *const p_smi = &p_smr->sm_info; uint8_t priority, state; priority = ib_sminfo_get_priority(p_smi); state = ib_sminfo_get_state(p_smi); printf("SMInfoRecord dump:\n" "\t\tRID\n" "\t\tLID...................%u\n" "\t\tSMInfo dump:\n" "\t\tGUID..................0x%016" PRIx64 "\n" "\t\tSM_Key................0x%016" PRIx64 "\n" "\t\tActCount..............%u\n" "\t\tPriority..............%u\n" "\t\tSMState...............%u\n", be16toh(p_smr->lid), be64toh(p_smr->sm_info.guid), be64toh(p_smr->sm_info.sm_key), be32toh(p_smr->sm_info.act_count), priority, state); } static void dump_switch_info_record(void *data, struct query_params *p) { ib_switch_info_record_t *p_sir = data; uint32_t sa_cap_mask2 = ib_class_cap_mask2(&p->cpi); printf("SwitchInfoRecord dump:\n" "\t\tRID\n" "\t\tLID.....................................%u\n" "\t\tSwitchInfo dump:\n" "\t\tLinearFDBCap............................0x%X\n" "\t\tRandomFDBCap............................0x%X\n" "\t\tMulticastFDBCap.........................0x%X\n" "\t\tLinearFDBTop............................0x%X\n" "\t\tDefaultPort.............................%u\n" "\t\tDefaultMulticastPrimaryPort.............%u\n" "\t\tDefaultMulticastNotPrimaryPort..........%u\n" "\t\tLifeTimeValue/PortStateChange/OpSL2VL...0x%X\n" "\t\tLIDsPerPort.............................0x%X\n" "\t\tPartitionEnforcementCap.................0x%X\n" "\t\tflags...................................0x%X\n", be16toh(p_sir->lid), be16toh(p_sir->switch_info.lin_cap), be16toh(p_sir->switch_info.rand_cap), be16toh(p_sir->switch_info.mcast_cap), be16toh(p_sir->switch_info.lin_top), p_sir->switch_info.def_port, p_sir->switch_info.def_mcast_pri_port, p_sir->switch_info.def_mcast_not_port, p_sir->switch_info.life_state, be16toh(p_sir->switch_info.lids_per_port), be16toh(p_sir->switch_info.enforce_cap), p_sir->switch_info.flags); if (sa_cap_mask2 & UMAD_SA_CAP_MASK2_IS_MCAST_TOP_SUP) printf("\t\tMulticastFDBTop.........................0x%X\n", be16toh(p_sir->switch_info.mcast_top)); } static void dump_inform_info_record(void *data, struct query_params *p) { char gid_str[INET6_ADDRSTRLEN]; char gid_str2[INET6_ADDRSTRLEN]; ib_inform_info_record_t *p_iir = data; __be32 qpn; uint8_t resp_time_val; ib_inform_info_get_qpn_resp_time(p_iir->inform_info.g_or_v. 
generic.qpn_resp_time_val, &qpn, &resp_time_val); if (p_iir->inform_info.is_generic) { printf("InformInfoRecord dump:\n" "\t\tRID\n" "\t\tSubscriberGID...........%s\n" "\t\tSubscriberEnum..........0x%X\n" "\t\tInformInfo dump:\n" "\t\tgid.....................%s\n" "\t\tlid_range_begin.........%u\n" "\t\tlid_range_end...........%u\n" "\t\tis_generic..............0x%X\n" "\t\tsubscribe...............0x%X\n" "\t\ttrap_type...............0x%X\n" "\t\ttrap_num................%u\n", inet_ntop(AF_INET6, p_iir->subscriber_gid.raw, gid_str, sizeof gid_str), be16toh(p_iir->subscriber_enum), inet_ntop(AF_INET6, p_iir->inform_info.gid.raw, gid_str2, sizeof gid_str2), be16toh(p_iir->inform_info.lid_range_begin), be16toh(p_iir->inform_info.lid_range_end), p_iir->inform_info.is_generic, p_iir->inform_info.subscribe, be16toh(p_iir->inform_info.trap_type), be16toh(p_iir->inform_info.g_or_v.generic.trap_num)); if (show_keys) { printf("\t\tqpn.....................0x%06X\n", be32toh(qpn)); } else { printf("\t\tqpn....................." NOT_DISPLAYED_STR "\n"); } printf("\t\tresp_time_val...........0x%X\n" "\t\tnode_type...............0x%06X\n", resp_time_val, be32toh(ib_inform_info_get_prod_type (&p_iir->inform_info))); } else { printf("InformInfoRecord dump:\n" "\t\tRID\n" "\t\tSubscriberGID...........%s\n" "\t\tSubscriberEnum..........0x%X\n" "\t\tInformInfo dump:\n" "\t\tgid.....................%s\n" "\t\tlid_range_begin.........%u\n" "\t\tlid_range_end...........%u\n" "\t\tis_generic..............0x%X\n" "\t\tsubscribe...............0x%X\n" "\t\ttrap_type...............0x%X\n" "\t\tdev_id..................0x%X\n", inet_ntop(AF_INET6, p_iir->subscriber_gid.raw, gid_str, sizeof gid_str), be16toh(p_iir->subscriber_enum), inet_ntop(AF_INET6, p_iir->inform_info.gid.raw, gid_str2, sizeof gid_str2), be16toh(p_iir->inform_info.lid_range_begin), be16toh(p_iir->inform_info.lid_range_end), p_iir->inform_info.is_generic, p_iir->inform_info.subscribe, be16toh(p_iir->inform_info.trap_type), be16toh(p_iir->inform_info.g_or_v.vend.dev_id)); if (show_keys) { printf("\t\tqpn.....................0x%06X\n", be32toh(qpn)); } else { printf("\t\tqpn....................." 
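/* qpn_resp_time_val packs a 24-bit QPN together with a 5-bit response-time
 * value into one 32-bit field; ib_inform_info_get_qpn_resp_time() above
 * unpacked it. As with keys, the QPN is masked unless show_keys is set. */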
NOT_DISPLAYED_STR "\n"); } printf("\t\tresp_time_val...........0x%X\n" "\t\tvendor_id...............0x%06X\n", resp_time_val, be32toh(ib_inform_info_get_prod_type (&p_iir->inform_info))); } } static void dump_one_link_record(void *data, struct query_params *p) { ib_link_record_t *lr = data; printf("LinkRecord dump:\n" "\t\tFromLID....................%u\n" "\t\tFromPort...................%u\n" "\t\tToPort.....................%u\n" "\t\tToLID......................%u\n", be16toh(lr->from_lid), lr->from_port_num, lr->to_port_num, be16toh(lr->to_lid)); } static void dump_one_slvl_record(void *data, struct query_params *p) { ib_slvl_table_record_t *slvl = data; ib_slvl_table_t *t = &slvl->slvl_tbl; printf("SL2VLTableRecord dump:\n" "\t\tLID........................%u\n" "\t\tInPort.....................%u\n" "\t\tOutPort....................%u\n" "\t\tSL: 0| 1| 2| 3| 4| 5| 6| 7| 8| 9|10|11|12|13|14|15|\n" "\t\tVL:%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u" "|%2u|%2u|%2u|\n", be16toh(slvl->lid), slvl->in_port_num, slvl->out_port_num, ib_slvl_table_get(t, 0), ib_slvl_table_get(t, 1), ib_slvl_table_get(t, 2), ib_slvl_table_get(t, 3), ib_slvl_table_get(t, 4), ib_slvl_table_get(t, 5), ib_slvl_table_get(t, 6), ib_slvl_table_get(t, 7), ib_slvl_table_get(t, 8), ib_slvl_table_get(t, 9), ib_slvl_table_get(t, 10), ib_slvl_table_get(t, 11), ib_slvl_table_get(t, 12), ib_slvl_table_get(t, 13), ib_slvl_table_get(t, 14), ib_slvl_table_get(t, 15)); } static void dump_one_vlarb_record(void *data, struct query_params *p) { ib_vl_arb_table_record_t *vlarb = data; ib_vl_arb_element_t *e = vlarb->vl_arb_tbl.vl_entry; int i; printf("VLArbTableRecord dump:\n" "\t\tLID........................%u\n" "\t\tPort.......................%u\n" "\t\tBlock......................%u\n", be16toh(vlarb->lid), vlarb->port_num, vlarb->block_num); for (i = 0; i < 32; i += 16) printf("\t\tVL :%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|" "%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|\n" "\t\tWeight:%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|" "%2u|%2u|%2u|%2u|%2u|%2u|%2u|%2u|\n", e[i + 0].vl, e[i + 1].vl, e[i + 2].vl, e[i + 3].vl, e[i + 4].vl, e[i + 5].vl, e[i + 6].vl, e[i + 7].vl, e[i + 8].vl, e[i + 9].vl, e[i + 10].vl, e[i + 11].vl, e[i + 12].vl, e[i + 13].vl, e[i + 14].vl, e[i + 15].vl, e[i + 0].weight, e[i + 1].weight, e[i + 2].weight, e[i + 3].weight, e[i + 4].weight, e[i + 5].weight, e[i + 6].weight, e[i + 7].weight, e[i + 8].weight, e[i + 9].weight, e[i + 10].weight, e[i + 11].weight, e[i + 12].weight, e[i + 13].weight, e[i + 14].weight, e[i + 15].weight); } static void dump_one_pkey_tbl_record(void *data, struct query_params *params) { ib_pkey_table_record_t *pktr = data; __be16 *p = pktr->pkey_tbl.pkey_entry; int i; printf("PKeyTableRecord dump:\n" "\t\tLID........................%u\n" "\t\tPort.......................%u\n" "\t\tBlock......................%u\n" "\t\tPKey Table:\n", be16toh(pktr->lid), pktr->port_num, pktr->block_num); for (i = 0; i < 32; i += 8) printf("\t\t0x%04x 0x%04x 0x%04x 0x%04x" " 0x%04x 0x%04x 0x%04x 0x%04x\n", be16toh(p[i + 0]), be16toh(p[i + 1]), be16toh(p[i + 2]), be16toh(p[i + 3]), be16toh(p[i + 4]), be16toh(p[i + 5]), be16toh(p[i + 6]), be16toh(p[i + 7])); printf("\n"); } static void dump_one_lft_record(void *data, struct query_params *p) { ib_lft_record_t *lftr = data; unsigned block = be16toh(lftr->block_num); int i; printf("LFT Record dump:\n" "\t\tLID........................%u\n" "\t\tBlock......................%u\n" "\t\tLFT:\n\t\tLID\tPort Number\n", be16toh(lftr->lid), block); for (i = 0; i < 64; i++) printf("\t\t%u\t%u\n", 
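/* Each LinearForwardingTable block covers 64 consecutive LIDs, so entry i
 * of block B gives the egress port for DLID B * 64 + i. */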
block * 64 + i, lftr->lft[i]); printf("\n"); } static void dump_one_guidinfo_record(void *data, struct query_params *p) { ib_guidinfo_record_t *gir = data; printf("GUIDInfo Record dump:\n" "\t\tLID........................%u\n" "\t\tBlock......................%u\n" "\t\tGUID 0.....................0x%016" PRIx64 "\n" "\t\tGUID 1.....................0x%016" PRIx64 "\n" "\t\tGUID 2.....................0x%016" PRIx64 "\n" "\t\tGUID 3.....................0x%016" PRIx64 "\n" "\t\tGUID 4.....................0x%016" PRIx64 "\n" "\t\tGUID 5.....................0x%016" PRIx64 "\n" "\t\tGUID 6.....................0x%016" PRIx64 "\n" "\t\tGUID 7.....................0x%016" PRIx64 "\n", be16toh(gir->lid), gir->block_num, be64toh(gir->guid_info.guid[0]), be64toh(gir->guid_info.guid[1]), be64toh(gir->guid_info.guid[2]), be64toh(gir->guid_info.guid[3]), be64toh(gir->guid_info.guid[4]), be64toh(gir->guid_info.guid[5]), be64toh(gir->guid_info.guid[6]), be64toh(gir->guid_info.guid[7])); } static void dump_one_mft_record(void *data, struct query_params *p) { ib_mft_record_t *mftr = data; unsigned position = be16toh(mftr->position_block_num) >> 12; unsigned block = be16toh(mftr->position_block_num) & IB_MCAST_BLOCK_ID_MASK_HO; int i; unsigned offset; printf("MFT Record dump:\n" "\t\tLID........................%u\n" "\t\tPosition...................%u\n" "\t\tBlock......................%u\n" "\t\tMFT:\n\t\tMLID\tPort Mask\n", be16toh(mftr->lid), position, block); offset = IB_LID_MCAST_START_HO + block * 32; for (i = 0; i < IB_MCAST_BLOCK_SIZE; i++) printf("\t\t0x%04x\t0x%04x\n", offset + i, be16toh(mftr->mft[i])); printf("\n"); } static void dump_results(struct sa_query_result *r, void (*dump_func) (void *, struct query_params *), struct query_params *p) { unsigned i; for (i = 0; i < r->result_cnt; i++) { void *data = sa_get_query_rec(r->p_result_madw, i); dump_func(data, p); } } /** * Get any record(s) */ static int get_any_records(struct sa_handle * h, uint16_t attr_id, uint32_t attr_mod, __be64 comp_mask, void *attr, size_t attr_size, struct sa_query_result *result) { int ret = sa_query(h, IB_MAD_METHOD_GET_TABLE, attr_id, attr_mod, be64toh(comp_mask), ibd_sakey, attr, attr_size, result); if (ret) { fprintf(stderr, "Query SA failed: %s\n", strerror(ret)); return ret; } if (result->status != IB_SA_MAD_STATUS_SUCCESS) { sa_report_err(result->status); return EIO; } return ret; } static int get_and_dump_any_records(struct sa_handle * h, uint16_t attr_id, uint32_t attr_mod, __be64 comp_mask, void *attr, size_t attr_size, void (*dump_func) (void *, struct query_params *), struct query_params *p) { struct sa_query_result result; int ret = get_any_records(h, attr_id, attr_mod, comp_mask, attr, attr_size, &result); if (ret) return ret; dump_results(&result, dump_func, p); sa_free_result_mad(&result); return 0; } /** * Get all the records available for requested query type. 
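 * Implemented as a wildcard SubnAdmGetTable: passing a zero component mask
 * means no field constrains the match, so the SA returns every record of
 * the given attribute in one (possibly RMPP-segmented) reply.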
*/ static int get_all_records(struct sa_handle * h, uint16_t attr_id, struct sa_query_result *result) { return get_any_records(h, attr_id, 0, 0, NULL, 0, result); } static int get_and_dump_all_records(struct sa_handle * h, uint16_t attr_id, void (*dump_func) (void *, struct query_params *p), struct query_params *p) { struct sa_query_result result; int ret = get_all_records(h, attr_id, &result); if (ret) return ret; dump_results(&result, dump_func, p); sa_free_result_mad(&result); return ret; } /** * return the lid from the node descriptor (name) supplied */ static int get_lid_from_name(struct sa_handle * h, const char *name, uint16_t * lid) { ib_node_record_t *node_record = NULL; unsigned i; int ret; struct sa_query_result result; ret = get_all_records(h, IB_SA_ATTR_NODERECORD, &result); if (ret) return ret; ret = ENONET; for (i = 0; i < result.result_cnt; i++) { node_record = sa_get_query_rec(result.p_result_madw, i); if (name && strncmp(name, (char *)node_record->node_desc.description, sizeof(node_record->node_desc.description)) == 0) { *lid = be16toh(node_record->lid); ret = 0; break; } } sa_free_result_mad(&result); return ret; } static uint16_t get_lid(struct sa_handle * h, const char *name) { int rc = 0; uint16_t rc_lid = 0; if (!name) return 0; if (isalpha(name[0])) { if ((rc = get_lid_from_name(h, name, &rc_lid)) != 0) { fprintf(stderr, "Failed to find lid for \"%s\": %s\n", name, strerror(rc)); exit(rc); } } else { long val; errno = 0; val = strtol(name, NULL, 0); if (errno != 0 || val <= 0 || val > UINT16_MAX) { fprintf(stderr, "Invalid lid specified: \"%s\"\n", name); exit(EINVAL); } rc_lid = (uint16_t)val; } return rc_lid; } static int parse_iir_subscriber_gid(char *str, ib_inform_info_record_t *ir) { int rc = inet_pton(AF_INET6,str,&(ir->subscriber_gid.raw)); if(rc < 1){ fprintf(stderr, "Invalid SubscriberGID specified: \"%s\"\n",str); exit(EINVAL); } return rc; } static int parse_lid_and_ports(struct sa_handle * h, char *str, int *lid, int *port1, int *port2) { char *p, *e; if (port1) *port1 = -1; if (port2) *port2 = -1; p = strchr(str, '/'); if (p) *p = '\0'; if (lid) *lid = get_lid(h, str); if (!p) return 0; str = p + 1; p = strchr(str, '/'); if (p) *p = '\0'; if (port1) { *port1 = strtoul(str, &e, 0); if (e == str) *port1 = -1; } if (!p) return 0; str = p + 1; if (port2) { *port2 = strtoul(str, &e, 0); if (e == str) *port2 = -1; } return 0; } /* * Get the portinfo records available with IsSM or IsSMdisabled CapabilityMask bit on. 
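 * Setting bit 31 of the attribute modifier selects capability-mask
 * matching, where the SA returns the PortInfoRecords whose
 * capability_mask has at least the requested bits set.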
*/ static int get_issm_records(struct sa_handle * h, __be32 capability_mask, struct sa_query_result *result) { ib_portinfo_record_t attr; memset(&attr, 0, sizeof(attr)); attr.port_info.capability_mask = capability_mask; return get_any_records(h, IB_SA_ATTR_PORTINFORECORD, 1 << 31, IB_PIR_COMPMASK_CAPMASK, &attr, sizeof(attr), result); } static int print_node_records(struct sa_handle * h, struct query_params *p) { unsigned i; int ret; struct sa_query_result result; ret = get_all_records(h, IB_SA_ATTR_NODERECORD, &result); if (ret) return ret; if (node_print_desc == ALL_DESC) { printf(" LID \"name\"\n"); printf("================\n"); } for (i = 0; i < result.result_cnt; i++) { ib_node_record_t *node_record; node_record = sa_get_query_rec(result.p_result_madw, i); if (node_print_desc == ALL_DESC) { print_node_desc(node_record); } else if (node_print_desc == NAME_OF_LID) { if (requested_lid == be16toh(node_record->lid)) print_node_record(node_record); } else if (node_print_desc == NAME_OF_GUID) { ib_node_info_t *p_ni = &(node_record->node_info); if (requested_guid == be64toh(p_ni->port_guid)) print_node_record(node_record); } else { ib_node_info_t *p_ni = &(node_record->node_info); ib_node_desc_t *p_nd = &(node_record->node_desc); char *name; name = remap_node_name (node_name_map, be64toh(p_ni->node_guid), (char *)p_nd->description); if (!requested_name || (strncmp(requested_name, (char *)node_record->node_desc.description, sizeof(node_record-> node_desc.description)) == 0) || (strncmp(requested_name, name, sizeof(node_record-> node_desc.description)) == 0)) { print_node_record(node_record); if (node_print_desc == UNIQUE_LID_ONLY) { sa_free_result_mad(&result); exit(0); } } free(name); } } sa_free_result_mad(&result); return ret; } static int query_path_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { ib_path_rec_t pr; __be64 comp_mask = 0; uint32_t flow = 0; int qos_class = 0; uint8_t reversible = 0; memset(&pr, 0, sizeof(pr)); CHECK_AND_SET_VAL(p->service_id, 64, 0, pr.service_id, PR, SERVICEID); CHECK_AND_SET_GID(p->sgid, pr.sgid, PR, SGID); CHECK_AND_SET_GID(p->dgid, pr.dgid, PR, DGID); CHECK_AND_SET_VAL(p->slid, 16, 0, pr.slid, PR, SLID); CHECK_AND_SET_VAL(p->dlid, 16, 0, pr.dlid, PR, DLID); CHECK_AND_SET_VAL(p->hop_limit, 32, -1, pr.hop_flow_raw, PR, HOPLIMIT); CHECK_AND_SET_VAL(p->flow_label, 8, 0, flow, PR, FLOWLABEL); pr.hop_flow_raw |= htobe32(flow << 8); CHECK_AND_SET_VAL(p->tclass, 8, 0, pr.tclass, PR, TCLASS); CHECK_AND_SET_VAL(p->reversible, 8, -1, reversible, PR, REVERSIBLE); CHECK_AND_SET_VAL(p->numb_path, 8, -1, pr.num_path, PR, NUMBPATH); pr.num_path |= reversible << 7; CHECK_AND_SET_VAL(p->pkey, 16, 0, pr.pkey, PR, PKEY); CHECK_AND_SET_VAL(p->sl, 16, -1, pr.qos_class_sl, PR, SL); if (p->qos_class != -1) { qos_class = p->qos_class; comp_mask |= IB_PR_COMPMASK_QOS_CLASS; } ib_path_rec_set_qos_class(&pr, qos_class); CHECK_AND_SET_VAL_AND_SEL(p->mtu, pr.mtu, PR, MTU, SELEC); CHECK_AND_SET_VAL_AND_SEL(p->rate, pr.rate, PR, RATE, SELEC); CHECK_AND_SET_VAL_AND_SEL(p->pkt_life, pr.pkt_life, PR, PKTLIFETIME, SELEC); return get_and_dump_any_records(h, IB_SA_ATTR_PATHRECORD, 0, comp_mask, &pr, sizeof(pr), dump_path_record, p); } static int print_issm_records(struct sa_handle * h, struct query_params *p) { struct sa_query_result result; int ret = 0; /* First, get IsSM records */ ret = get_issm_records(h, IB_PORT_CAP_IS_SM, &result); if (ret != 0) return (ret); printf("IsSM ports\n"); dump_results(&result, dump_portinfo_record, 
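/* Two separate queries are issued because the capability-mask match is
 * "all requested bits present": asking for IS_SM | SM_DISAB at once would
 * only return ports that advertise both bits together. */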
p); sa_free_result_mad(&result); /* Now, get IsSMdisabled records */ ret = get_issm_records(h, IB_PORT_CAP_SM_DISAB, &result); if (ret != 0) return (ret); printf("\nIsSMdisabled ports\n"); dump_results(&result, dump_portinfo_record, p); sa_free_result_mad(&result); return (ret); } static int print_multicast_member_records(struct sa_handle * h, struct query_params *params) { struct sa_query_result mc_group_result; struct sa_query_result nr_result; int ret; unsigned i; ret = get_all_records(h, IB_SA_ATTR_MCRECORD, &mc_group_result); if (ret) return ret; ret = get_all_records(h, IB_SA_ATTR_NODERECORD, &nr_result); if (ret) goto return_mc; for (i = 0; i < mc_group_result.result_cnt; i++) { ib_member_rec_t *rec = (ib_member_rec_t *) sa_get_query_rec(mc_group_result.p_result_madw, i); dump_multicast_member_record(rec, &nr_result, params); } sa_free_result_mad(&nr_result); return_mc: sa_free_result_mad(&mc_group_result); return ret; } static int print_multicast_group_records(struct sa_handle * h, struct query_params *p) { return get_and_dump_all_records(h, IB_SA_ATTR_MCRECORD, dump_multicast_group_record, p); } static int query_class_port_info(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { dump_class_port_info(&p->cpi); return (0); } static int query_node_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { ib_node_record_t nr; __be64 comp_mask = 0; int lid = 0; if (argc > 0) parse_lid_and_ports(h, argv[0], &lid, NULL, NULL); memset(&nr, 0, sizeof(nr)); CHECK_AND_SET_VAL(lid, 16, 0, nr.lid, NR, LID); return get_and_dump_any_records(h, IB_SA_ATTR_NODERECORD, 0, comp_mask, &nr, sizeof(nr), dump_node_record, p); } static int query_portinfo_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { ib_portinfo_record_t pir; __be64 comp_mask = 0; int lid = 0, port = -1, options = -1; if (argc > 0) parse_lid_and_ports(h, argv[0], &lid, &port, &options); memset(&pir, 0, sizeof(pir)); CHECK_AND_SET_VAL(lid, 16, 0, pir.lid, PIR, LID); CHECK_AND_SET_VAL(port, 8, -1, pir.port_num, PIR, PORTNUM); CHECK_AND_SET_VAL(options, 8, -1, pir.options, PIR, OPTIONS); return get_and_dump_any_records(h, IB_SA_ATTR_PORTINFORECORD, 0, comp_mask, &pir, sizeof(pir), dump_one_portinfo_record, p); } static int query_mcmember_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { ib_member_rec_t mr; __be64 comp_mask = 0; uint32_t flow = 0; uint8_t sl = 0, hop = 0, scope = 0; memset(&mr, 0, sizeof(mr)); CHECK_AND_SET_GID(p->mgid, mr.mgid, MCR, MGID); CHECK_AND_SET_GID(p->gid, mr.port_gid, MCR, PORT_GID); CHECK_AND_SET_VAL(p->mlid, 16, 0, mr.mlid, MCR, MLID); CHECK_AND_SET_VAL(p->qkey, 32, 0, mr.qkey, MCR, QKEY); CHECK_AND_SET_VAL_AND_SEL(p->mtu, mr.mtu, MCR, MTU, _SEL); CHECK_AND_SET_VAL_AND_SEL(p->rate, mr.rate, MCR, RATE, _SEL); CHECK_AND_SET_VAL_AND_SEL(p->pkt_life, mr.pkt_life, MCR, LIFE, _SEL); CHECK_AND_SET_VAL(p->tclass, 8, 0, mr.tclass, MCR, TCLASS); CHECK_AND_SET_VAL(p->pkey, 16, 0, mr.pkey, MCR, PKEY); CHECK_AND_SET_VAL(p->sl, 8, -1, sl, MCR, SL); CHECK_AND_SET_VAL(p->flow_label, 8, 0, flow, MCR, FLOW); CHECK_AND_SET_VAL(p->hop_limit, 8, -1, hop, MCR, HOP); mr.sl_flow_hop = ib_member_set_sl_flow_hop(sl, flow, hop); CHECK_AND_SET_VAL(p->scope, 8, 0, scope, MCR, SCOPE); CHECK_AND_SET_VAL(p->join_state, 8, 0, mr.scope_state, MCR, JOIN_STATE); mr.scope_state |= scope << 4; CHECK_AND_SET_VAL(p->proxy_join, 8, -1, 
mr.proxy_join, MCR, PROXY); return get_and_dump_any_records(h, IB_SA_ATTR_MCRECORD, 0, comp_mask, &mr, sizeof(mr), dump_one_mcmember_record, p); } static int query_service_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { return get_and_dump_all_records(h, IB_SA_ATTR_SERVICERECORD, dump_service_record, p); } static int query_sm_info_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { ib_sminfo_record_t smir; __be64 comp_mask = 0; int lid = 0; if (argc > 0) parse_lid_and_ports(h, argv[0], &lid, NULL, NULL); memset(&smir, 0, sizeof(smir)); CHECK_AND_SET_VAL(lid, 16, 0, smir.lid, SMIR, LID); return get_and_dump_any_records(h, IB_SA_ATTR_SMINFORECORD, 0, comp_mask, &smir, sizeof(smir), dump_sm_info_record, p); } static int query_switchinfo_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { ib_switch_info_record_t swir; __be64 comp_mask = 0; int lid = 0; if (argc > 0) parse_lid_and_ports(h, argv[0], &lid, NULL, NULL); memset(&swir, 0, sizeof(swir)); CHECK_AND_SET_VAL(lid, 16, 0, swir.lid, SWIR, LID); return get_and_dump_any_records(h, IB_SA_ATTR_SWITCHINFORECORD, 0, comp_mask, &swir, sizeof(swir), dump_switch_info_record, p); } static int query_inform_info_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { int rc = 0; ib_inform_info_record_t ir; __be64 comp_mask = 0; memset(&ir, 0, sizeof(ir)); if (argc > 0) { comp_mask = IB_IIR_COMPMASK_SUBSCRIBERGID; if((rc = parse_iir_subscriber_gid(argv[0], &ir)) < 1) return rc; } return get_and_dump_any_records(h, IB_SA_ATTR_INFORMINFORECORD, 0, comp_mask, &ir, sizeof(ir), dump_inform_info_record, p); } static int query_link_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { ib_link_record_t lr; __be64 comp_mask = 0; int from_lid = 0, to_lid = 0, from_port = -1, to_port = -1; if (argc > 0) parse_lid_and_ports(h, argv[0], &from_lid, &from_port, NULL); if (argc > 1) parse_lid_and_ports(h, argv[1], &to_lid, &to_port, NULL); memset(&lr, 0, sizeof(lr)); CHECK_AND_SET_VAL(from_lid, 16, 0, lr.from_lid, LR, FROM_LID); CHECK_AND_SET_VAL(from_port, 8, -1, lr.from_port_num, LR, FROM_PORT); CHECK_AND_SET_VAL(to_lid, 16, 0, lr.to_lid, LR, TO_LID); CHECK_AND_SET_VAL(to_port, 8, -1, lr.to_port_num, LR, TO_PORT); return get_and_dump_any_records(h, IB_SA_ATTR_LINKRECORD, 0, comp_mask, &lr, sizeof(lr), dump_one_link_record, p); } static int query_sl2vl_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { ib_slvl_table_record_t slvl; __be64 comp_mask = 0; int lid = 0, in_port = -1, out_port = -1; if (argc > 0) parse_lid_and_ports(h, argv[0], &lid, &in_port, &out_port); memset(&slvl, 0, sizeof(slvl)); CHECK_AND_SET_VAL(lid, 16, 0, slvl.lid, SLVL, LID); CHECK_AND_SET_VAL(in_port, 8, -1, slvl.in_port_num, SLVL, IN_PORT); CHECK_AND_SET_VAL(out_port, 8, -1, slvl.out_port_num, SLVL, OUT_PORT); return get_and_dump_any_records(h, IB_SA_ATTR_SL2VLTABLERECORD, 0, comp_mask, &slvl, sizeof(slvl), dump_one_slvl_record, p); } static int query_vlarb_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { ib_vl_arb_table_record_t vlarb; __be64 comp_mask = 0; int lid = 0, port = -1, block = -1; if (argc > 0) parse_lid_and_ports(h, argv[0], &lid, &port, &block); memset(&vlarb, 0, sizeof(vlarb)); 
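/* CHECK_AND_SET_VAL(val, width, invalid, field, rec, name) is the
 * component-mask idiom shared by all the query_*_records() helpers.
 * Roughly -- a sketch, see ibdiag_sa.h for the real definition:
 *
 *     if ((uintWIDTH_t)(val) != (uintWIDTH_t)(invalid)) {
 *             field = htobeWIDTH(val);           // store in wire order
 *             comp_mask |= IB_rec_COMPMASK_name; // constrain the match
 *     }
 *
 * so only the fields the caller actually supplied take part in the SA
 * lookup. */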
CHECK_AND_SET_VAL(lid, 16, 0, vlarb.lid, VLA, LID); CHECK_AND_SET_VAL(port, 8, -1, vlarb.port_num, VLA, OUT_PORT); CHECK_AND_SET_VAL(block, 8, -1, vlarb.block_num, VLA, BLOCK); return get_and_dump_any_records(h, IB_SA_ATTR_VLARBTABLERECORD, 0, comp_mask, &vlarb, sizeof(vlarb), dump_one_vlarb_record, p); } static int query_pkey_tbl_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { ib_pkey_table_record_t pktr; __be64 comp_mask = 0; int lid = 0, port = -1, block = -1; if (argc > 0) parse_lid_and_ports(h, argv[0], &lid, &port, &block); memset(&pktr, 0, sizeof(pktr)); CHECK_AND_SET_VAL(lid, 16, 0, pktr.lid, PKEY, LID); CHECK_AND_SET_VAL(port, 8, -1, pktr.port_num, PKEY, PORT); CHECK_AND_SET_VAL(block, 16, -1, pktr.block_num, PKEY, BLOCK); return get_and_dump_any_records(h, IB_SA_ATTR_PKEYTABLERECORD, 0, comp_mask, &pktr, sizeof(pktr), dump_one_pkey_tbl_record, p); } static int query_lft_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { ib_lft_record_t lftr; __be64 comp_mask = 0; int lid = 0, block = -1; if (argc > 0) parse_lid_and_ports(h, argv[0], &lid, &block, NULL); memset(&lftr, 0, sizeof(lftr)); CHECK_AND_SET_VAL(lid, 16, 0, lftr.lid, LFTR, LID); CHECK_AND_SET_VAL(block, 16, -1, lftr.block_num, LFTR, BLOCK); return get_and_dump_any_records(h, IB_SA_ATTR_LFTRECORD, 0, comp_mask, &lftr, sizeof(lftr), dump_one_lft_record, p); } static int query_guidinfo_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { ib_guidinfo_record_t gir; __be64 comp_mask = 0; int lid = 0, block = -1; if (argc > 0) parse_lid_and_ports(h, argv[0], &lid, &block, NULL); memset(&gir, 0, sizeof(gir)); CHECK_AND_SET_VAL(lid, 16, 0, gir.lid, GIR, LID); CHECK_AND_SET_VAL(block, 8, -1, gir.block_num, GIR, BLOCKNUM); return get_and_dump_any_records(h, IB_SA_ATTR_GUIDINFORECORD, 0, comp_mask, &gir, sizeof(gir), dump_one_guidinfo_record, p); } static int query_mft_records(const struct query_cmd *q, struct sa_handle * h, struct query_params *p, int argc, char *argv[]) { ib_mft_record_t mftr; __be64 comp_mask = 0; int lid = 0, block = -1, position = -1; uint16_t pos = 0; if (argc > 0) parse_lid_and_ports(h, argv[0], &lid, &position, &block); memset(&mftr, 0, sizeof(mftr)); CHECK_AND_SET_VAL(lid, 16, 0, mftr.lid, MFTR, LID); CHECK_AND_SET_VAL(block, 16, -1, mftr.position_block_num, MFTR, BLOCK); mftr.position_block_num &= htobe16(IB_MCAST_BLOCK_ID_MASK_HO); CHECK_AND_SET_VAL(position, 8, -1, pos, MFTR, POSITION); mftr.position_block_num |= htobe16(pos << 12); return get_and_dump_any_records(h, IB_SA_ATTR_MFTRECORD, 0, comp_mask, &mftr, sizeof(mftr), dump_one_mft_record, p); } static int query_sa_cpi(struct sa_handle *h, struct query_params *query_params) { ib_class_port_info_t *cpi; struct sa_query_result result; int ret = sa_query(h, IB_MAD_METHOD_GET, CLASS_PORT_INFO, 0, 0, ibd_sakey, NULL, 0, &result); if (ret) { fprintf(stderr, "Query SA failed: %s\n", strerror(ret)); return ret; } if (result.status != IB_SA_MAD_STATUS_SUCCESS) { sa_report_err(result.status); ret = EIO; goto Exit; } cpi = sa_get_query_rec(result.p_result_madw, 0); memcpy(&query_params->cpi, cpi, sizeof(query_params->cpi)); Exit: sa_free_result_mad(&result); return ret; } static const struct query_cmd query_cmds[] = { {"ClassPortInfo", "CPI", CLASS_PORT_INFO, NULL, query_class_port_info}, {"NodeRecord", "NR", IB_SA_ATTR_NODERECORD, "[lid]", query_node_records}, {"PortInfoRecord", "PIR", 
IB_SA_ATTR_PORTINFORECORD, "[[lid]/[port]/[options]]", query_portinfo_records}, {"SL2VLTableRecord", "SL2VL", IB_SA_ATTR_SL2VLTABLERECORD, "[[lid]/[in_port]/[out_port]]", query_sl2vl_records}, {"PKeyTableRecord", "PKTR", IB_SA_ATTR_PKEYTABLERECORD, "[[lid]/[port]/[block]]", query_pkey_tbl_records}, {"VLArbitrationTableRecord", "VLAR", IB_SA_ATTR_VLARBTABLERECORD, "[[lid]/[port]/[block]]", query_vlarb_records}, {"InformInfoRecord", "IIR", IB_SA_ATTR_INFORMINFORECORD, "[subscriber_gid]", query_inform_info_records}, {"LinkRecord", "LR", IB_SA_ATTR_LINKRECORD, "[[from_lid]/[from_port]] [[to_lid]/[to_port]]", query_link_records}, {"ServiceRecord", "SR", IB_SA_ATTR_SERVICERECORD, NULL, query_service_records}, {"PathRecord", "PR", IB_SA_ATTR_PATHRECORD, NULL, query_path_records}, {"MCMemberRecord", "MCMR", IB_SA_ATTR_MCRECORD, NULL, query_mcmember_records}, {"LFTRecord", "LFTR", IB_SA_ATTR_LFTRECORD, "[[lid]/[block]]", query_lft_records}, {"MFTRecord", "MFTR", IB_SA_ATTR_MFTRECORD, "[[mlid]/[position]/[block]]", query_mft_records}, {"GUIDInfoRecord", "GIR", IB_SA_ATTR_GUIDINFORECORD, "[[lid]/[block]]", query_guidinfo_records}, {"SwitchInfoRecord", "SWIR", IB_SA_ATTR_SWITCHINFORECORD, "[lid]", query_switchinfo_records}, {"SMInfoRecord", "SMIR", IB_SA_ATTR_SMINFORECORD, "[lid]", query_sm_info_records}, {} }; static const struct query_cmd *find_query(const char *name) { const struct query_cmd *q; for (q = query_cmds; q->name; q++) if (!strcasecmp(name, q->name) || (q->alias && !strcasecmp(name, q->alias))) return q; return NULL; } static const struct query_cmd *find_query_by_type(uint16_t type) { const struct query_cmd *q; for (q = query_cmds; q->name; q++) if (q->query_type == type) return q; return NULL; } enum saquery_command { SAQUERY_CMD_QUERY, SAQUERY_CMD_NODE_RECORD, SAQUERY_CMD_CLASS_PORT_INFO, SAQUERY_CMD_ISSM, SAQUERY_CMD_MCGROUPS, SAQUERY_CMD_MCMEMBERS, }; static enum saquery_command command = SAQUERY_CMD_QUERY; static uint16_t query_type; static char *src_lid, *dst_lid; static int process_opt(void *context, int ch) { struct query_params *p = context; switch (ch) { case 1: { src_lid = strdup(optarg); dst_lid = strchr(src_lid, ':'); if (!dst_lid) ibdiag_show_usage(); *dst_lid++ = '\0'; } p->numb_path = 0x7f; query_type = IB_SA_ATTR_PATHRECORD; break; case 2: { char *src_addr = strdup(optarg); char *dst_addr = strchr(src_addr, '-'); if (!dst_addr) ibdiag_show_usage(); *dst_addr++ = '\0'; if (inet_pton(AF_INET6, src_addr, &p->sgid) <= 0) ibdiag_show_usage(); if (inet_pton(AF_INET6, dst_addr, &p->dgid) <= 0) ibdiag_show_usage(); free(src_addr); } p->numb_path = 0x7f; query_type = IB_SA_ATTR_PATHRECORD; break; case 3: node_name_map_file = strdup(optarg); if (node_name_map_file == NULL) IBEXIT("out of memory, strdup for node_name_map_file name failed"); break; case 4: if (!isxdigit(*optarg) && !(optarg = getpass("SM_Key: "))) { fprintf(stderr, "cannot get SM_Key\n"); ibdiag_show_usage(); } ibd_sakey = strtoull(optarg, NULL, 0); break; case 'p': query_type = IB_SA_ATTR_PATHRECORD; break; case 'D': node_print_desc = ALL_DESC; command = SAQUERY_CMD_NODE_RECORD; break; case 'c': command = SAQUERY_CMD_CLASS_PORT_INFO; break; case 'S': query_type = IB_SA_ATTR_SERVICERECORD; break; case 'I': query_type = IB_SA_ATTR_INFORMINFORECORD; break; case 'N': command = SAQUERY_CMD_NODE_RECORD; break; case 'L': node_print_desc = LID_ONLY; command = SAQUERY_CMD_NODE_RECORD; break; case 'l': node_print_desc = UNIQUE_LID_ONLY; command = SAQUERY_CMD_NODE_RECORD; break; case 'G': node_print_desc = GUID_ONLY; command 
= SAQUERY_CMD_NODE_RECORD; break; case 'O': node_print_desc = NAME_OF_LID; command = SAQUERY_CMD_NODE_RECORD; break; case 'U': node_print_desc = NAME_OF_GUID; command = SAQUERY_CMD_NODE_RECORD; break; case 's': command = SAQUERY_CMD_ISSM; break; case 'g': command = SAQUERY_CMD_MCGROUPS; break; case 'm': command = SAQUERY_CMD_MCMEMBERS; break; case 'x': query_type = IB_SA_ATTR_LINKRECORD; break; case 5: p->slid = (uint16_t) strtoul(optarg, NULL, 0); break; case 6: p->dlid = (uint16_t) strtoul(optarg, NULL, 0); break; case 7: p->mlid = (uint16_t) strtoul(optarg, NULL, 0); break; case 14: if (inet_pton(AF_INET6, optarg, &p->sgid) <= 0) ibdiag_show_usage(); break; case 15: if (inet_pton(AF_INET6, optarg, &p->dgid) <= 0) ibdiag_show_usage(); break; case 16: if (inet_pton(AF_INET6, optarg, &p->gid) <= 0) ibdiag_show_usage(); break; case 17: if (inet_pton(AF_INET6, optarg, &p->mgid) <= 0) ibdiag_show_usage(); break; case 'r': p->reversible = strtoul(optarg, NULL, 0); break; case 'n': p->numb_path = strtoul(optarg, NULL, 0); break; case 18: if (!isxdigit(*optarg) && !(optarg = getpass("P_Key: "))) { fprintf(stderr, "cannot get P_Key\n"); ibdiag_show_usage(); } p->pkey = (uint16_t) strtoul(optarg, NULL, 0); break; case 'Q': p->qos_class = strtoul(optarg, NULL, 0); break; case 19: p->sl = strtoul(optarg, NULL, 0); break; case 'M': p->mtu = (uint8_t) strtoul(optarg, NULL, 0); break; case 'R': p->rate = (uint8_t) strtoul(optarg, NULL, 0); break; case 20: p->pkt_life = (uint8_t) strtoul(optarg, NULL, 0); break; case 'q': if (!isxdigit(*optarg) && !(optarg = getpass("Q_Key: "))) { fprintf(stderr, "cannot get Q_Key\n"); ibdiag_show_usage(); } p->qkey = strtoul(optarg, NULL, 0); break; case 'T': p->tclass = (uint8_t) strtoul(optarg, NULL, 0); break; case 'F': p->flow_label = strtoul(optarg, NULL, 0); break; case 'H': p->hop_limit = strtoul(optarg, NULL, 0); break; case 21: p->scope = (uint8_t) strtoul(optarg, NULL, 0); break; case 'J': p->join_state = (uint8_t) strtoul(optarg, NULL, 0); break; case 'X': p->proxy_join = strtoul(optarg, NULL, 0); break; case 22: p->service_id = strtoull(optarg, NULL, 0); break; default: return -1; } return 0; } int main(int argc, char **argv) { int sa_cpi_required = 0; char usage_args[1024]; struct sa_handle * h; struct query_params params; const struct query_cmd *q; int status; int n; const struct ibdiag_opt opts[] = { {"p", 'p', 0, NULL, "get PathRecord info"}, {"N", 'N', 0, NULL, "get NodeRecord info"}, {"L", 'L', 0, NULL, "return the Lids of the name specified"}, {"l", 'l', 0, NULL, "return the unique Lid of the name specified"}, {"G", 'G', 0, NULL, "return the Guids of the name specified"}, {"O", 'O', 0, NULL, "return name for the Lid specified"}, {"U", 'U', 0, NULL, "return name for the Guid specified"}, {"s", 's', 0, NULL, "return the PortInfoRecords with isSM or" " isSMdisabled capability mask bit on"}, {"g", 'g', 0, NULL, "get multicast group info"}, {"m", 'm', 0, NULL, "get multicast member info (if multicast" " group specified, list member GIDs only for group specified," " for example 'saquery -m 0xC000')"}, {"x", 'x', 0, NULL, "get LinkRecord info"}, {"c", 'c', 0, NULL, "get the SA's class port info"}, {"S", 'S', 0, NULL, "get ServiceRecord info"}, {"I", 'I', 0, NULL, "get InformInfoRecord (subscription) info"}, {"list", 'D', 0, NULL, "the node desc of the CA's"}, {"src-to-dst", 1, 1, "", "get a PathRecord for" " where src and dst are either node names or LIDs"}, {"sgid-to-dgid", 2, 1, "", "get a PathRecord for" " where sgid and dgid are addresses in IPv6 
format"}, {"node-name-map", 3, 1, "", "specify a node name map file"}, {"smkey", 4, 1, "", "SA SM_Key value for the query." " If non-numeric value (like 'x') is specified then" " saquery will prompt for a value. " " Default (when not specified here or in ibdiag.conf) is to " " use SM_Key == 0 (or \"untrusted\")"}, {"slid", 5, 1, "", "Source LID (PathRecord)"}, {"dlid", 6, 1, "", "Destination LID (PathRecord)"}, {"mlid", 7, 1, "", "Multicast LID (MCMemberRecord)"}, {"sgid", 14, 1, "", "Source GID (IPv6 format) (PathRecord)"}, {"dgid", 15, 1, "", "Destination GID (IPv6 format) (PathRecord)"}, {"gid", 16, 1, "", "Port GID (MCMemberRecord)"}, {"mgid", 17, 1, "", "Multicast GID (MCMemberRecord)"}, {"reversible", 'r', 1, NULL, "Reversible path (PathRecord)"}, {"numb_path", 'n', 1, NULL, "Number of paths (PathRecord)"}, {"pkey", 18, 1, NULL, "P_Key (PathRecord, MCMemberRecord)." " If non-numeric value (like 'x') is specified then" " saquery will prompt for a value"}, {"qos_class", 'Q', 1, NULL, "QoS Class (PathRecord)"}, {"sl", 19, 1, NULL, "Service level (PathRecord, MCMemberRecord)"}, {"mtu", 'M', 1, NULL, "MTU and selector (PathRecord, MCMemberRecord)"}, {"rate", 'R', 1, NULL, "Rate and selector (PathRecord, MCMemberRecord)"}, {"pkt_lifetime", 20, 1, NULL, "Packet lifetime and selector (PathRecord, MCMemberRecord)"}, {"qkey", 'q', 1, NULL, "Q_Key (MCMemberRecord)." " If non-numeric value (like 'x') is specified then" " saquery will prompt for a value"}, {"tclass", 'T', 1, NULL, "Traffic Class (PathRecord, MCMemberRecord)"}, {"flow_label", 'F', 1, NULL, "Flow Label (PathRecord, MCMemberRecord)"}, {"hop_limit", 'H', 1, NULL, "Hop limit (PathRecord, MCMemberRecord)"}, {"scope", 21, 1, NULL, "Scope (MCMemberRecord)"}, {"join_state", 'J', 1, NULL, "Join state (MCMemberRecord)"}, {"proxy_join", 'X', 1, NULL, "Proxy join (MCMemberRecord)"}, {"service_id", 22, 1, NULL, "ServiceID (PathRecord)"}, {} }; memset(¶ms, 0, sizeof params); params.hop_limit = -1; params.reversible = -1; params.numb_path = -1; params.qos_class = -1; params.sl = -1; params.proxy_join = -1; n = sprintf(usage_args, "[query-name] [ | | ]\n" "\nSupported query names (and aliases):\n"); for (q = query_cmds; q->name; q++) { n += snprintf(usage_args + n, sizeof(usage_args) - n, " %s (%s) %s\n", q->name, q->alias ? q->alias : "", q->usage ? 
q->usage : ""); if (n >= sizeof(usage_args)) exit(-1); } snprintf(usage_args + n, sizeof(usage_args) - n, "\n Queries node records by default."); q = NULL; ibd_timeout = DEFAULT_SA_TIMEOUT_MS; ibdiag_process_opts(argc, argv, ¶ms, "DGLsy", opts, process_opt, usage_args, NULL); argc -= optind; argv += optind; if (!query_type && command == SAQUERY_CMD_QUERY) { if (!argc || !(q = find_query(argv[0]))) query_type = IB_SA_ATTR_NODERECORD; else { query_type = q->query_type; argc--; argv++; } } if (argc) { if (node_print_desc == NAME_OF_LID) { requested_lid = (uint16_t) strtoul(argv[0], NULL, 0); requested_lid_flag++; } else if (node_print_desc == NAME_OF_GUID) { requested_guid = strtoul(argv[0], NULL, 0); requested_guid_flag++; } else requested_name = argv[0]; } if ((node_print_desc == LID_ONLY || node_print_desc == UNIQUE_LID_ONLY || node_print_desc == GUID_ONLY) && !requested_name) { fprintf(stderr, "ERROR: name not specified\n"); ibdiag_show_usage(); } if (node_print_desc == NAME_OF_LID && !requested_lid_flag) { fprintf(stderr, "ERROR: lid not specified\n"); ibdiag_show_usage(); } if (node_print_desc == NAME_OF_GUID && !requested_guid_flag) { fprintf(stderr, "ERROR: guid not specified\n"); ibdiag_show_usage(); } /* Note: lid cannot be 0; see infiniband spec 4.1.3 */ if (node_print_desc == NAME_OF_LID && !requested_lid) { fprintf(stderr, "ERROR: lid invalid\n"); ibdiag_show_usage(); } if (umad_init()) IBEXIT("Failed to initialized umad library"); h = sa_get_handle(NULL); if (!h) IBPANIC("Failed to bind to the SA"); node_name_map = open_node_name_map(node_name_map_file); if (src_lid && *src_lid) params.slid = get_lid(h, src_lid); if (dst_lid && *dst_lid) params.dlid = get_lid(h, dst_lid); if (command == SAQUERY_CMD_CLASS_PORT_INFO || query_type == CLASS_PORT_INFO || query_type == IB_SA_ATTR_SWITCHINFORECORD) sa_cpi_required = 1; if (sa_cpi_required && (status = query_sa_cpi(h, ¶ms)) != 0) { fprintf(stderr, "Failed to query SA:ClassPortInfo\n"); goto error; } switch (command) { case SAQUERY_CMD_NODE_RECORD: status = print_node_records(h, ¶ms); break; case SAQUERY_CMD_CLASS_PORT_INFO: dump_class_port_info(¶ms.cpi); status = 0; break; case SAQUERY_CMD_ISSM: status = print_issm_records(h, ¶ms); break; case SAQUERY_CMD_MCGROUPS: status = print_multicast_group_records(h, ¶ms); break; case SAQUERY_CMD_MCMEMBERS: status = print_multicast_member_records(h, ¶ms); break; default: if ((!q && !(q = find_query_by_type(query_type))) || !q->handler) { fprintf(stderr, "Unknown query type %d\n", query_type); status = EINVAL; } else status = q->handler(q, h, ¶ms, argc, argv); break; } error: if (src_lid) free(src_lid); sa_free_handle(h); umad_done(); close_node_name_map(node_name_map); return (status); } rdma-core-56.1/infiniband-diags/scripts/000077500000000000000000000000001477342711600202075ustar00rootroot00000000000000rdma-core-56.1/infiniband-diags/scripts/CMakeLists.txt000066400000000000000000000063001477342711600227460ustar00rootroot00000000000000function(_rdma_sbin_interp INTERP IFN OFN) configure_file("${IFN}" "${CMAKE_CURRENT_BINARY_DIR}/${OFN}" @ONLY) file(WRITE "${BUILD_BIN}/${OFN}" "#!${INTERP}\nexec ${INTERP} ${CMAKE_CURRENT_BINARY_DIR}/${OFN} \"$@\"\n") execute_process(COMMAND "chmod" "a+x" "${BUILD_BIN}/${OFN}") install(FILES "${CMAKE_CURRENT_BINARY_DIR}/${OFN}" DESTINATION "${CMAKE_INSTALL_SBINDIR}" PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ OWNER_EXECUTE GROUP_EXECUTE WORLD_EXECUTE) endfunction() function(_rdma_sbin_interp_link INTERP IFN OFN) file(WRITE "${BUILD_BIN}/${OFN}" 
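# Like _rdma_sbin_interp above, this writes a two-line launcher into
# ${BUILD_BIN} so the scripts are runnable straight from the build tree;
# for a script "foo" the generated file looks roughly like (a sketch):
#   #!/bin/bash
#   exec /bin/bash /path/to/source/foo "$@"
# The difference is that _rdma_sbin_interp points the shim at a
# configure_file()-processed copy, while this variant execs the unmodified
# in-tree script and install()s it under its output name.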
"#!${INTERP}\nexec ${INTERP} ${CMAKE_CURRENT_SOURCE_DIR}/${IFN} \"$@\"\n") execute_process(COMMAND "chmod" "a+x" "${BUILD_BIN}/${OFN}") install(FILES "${IFN}" DESTINATION "${CMAKE_INSTALL_SBINDIR}" RENAME "${OFN}" PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ OWNER_EXECUTE GROUP_EXECUTE WORLD_EXECUTE) endfunction() function(rdma_sbin_shell_program) foreach(IFN ${ARGN}) if (IFN MATCHES "\\.sh\\.in") if (DISTRO_FLAVOUR STREQUAL Debian) string(REGEX REPLACE "^(.+)\\.sh\\.in$" "\\1" OFN "${IFN}") else() string(REGEX REPLACE "^(.+)\\.in$" "\\1" OFN "${IFN}") endif() _rdma_sbin_interp("/bin/bash" "${IFN}" "${OFN}") elseif (IFN MATCHES "\\.in") string(REGEX REPLACE "^(.+)\\.in$" "\\1" OFN "${IFN}") _rdma_sbin_interp("/bin/bash" "${IFN}" "${OFN}") elseif (IFN MATCHES "\\.sh") if (DISTRO_FLAVOUR STREQUAL Debian) string(REGEX REPLACE "^(.+)\\.sh$" "\\1" OFN "${IFN}") else() set(OFN "${IFN}") endif() _rdma_sbin_interp_link("/bin/bash" "${IFN}" "${OFN}") else() _rdma_sbin_interp_link("/bin/bash" "${IFN}" "${IFN}") endif() endforeach() endfunction() function(rdma_sbin_perl_program) foreach(IFN ${ARGN}) if (IFN MATCHES "\\.pl\\.in") if (DISTRO_FLAVOUR STREQUAL Debian) string(REGEX REPLACE "^(.+)\\.pl\\.in$" "\\1" OFN "${IFN}") else() string(REGEX REPLACE "^(.+)\\.in$" "\\1" OFN "${IFN}") endif() _rdma_sbin_interp("/usr/bin/perl" "${IFN}" "${OFN}") elseif (IFN MATCHES "\\.pl") if (DISTRO_FLAVOUR STREQUAL Debian) string(REGEX REPLACE "^(.+)\\.pl$" "\\1" OFN "${IFN}") else() set(OFN "${IFN}") endif() _rdma_sbin_interp_link("/usr/bin/perl" "${IFN}" "${OFN}") endif() endforeach() endfunction() set(IBSCRIPTPATH "${CMAKE_INSTALL_FULL_SBINDIR}") rdma_sbin_shell_program( dump_lfts.sh.in dump_mfts.sh.in ibhosts.in ibnodes.in ibrouters.in ibstatus ibswitches.in ) rdma_sbin_perl_program( check_lft_balance.pl ibfindnodesusing.pl ibidsverify.pl ) install(FILES "IBswcountlimits.pm" DESTINATION "${CMAKE_INSTALL_PERLDIR}") if (ENABLE_IBDIAGS_COMPAT) rdma_sbin_shell_program( ibcheckerrors.in ibcheckerrs.in ibchecknet.in ibchecknode.in ibcheckport.in ibcheckportstate.in ibcheckportwidth.in ibcheckstate.in ibcheckwidth.in ibclearcounters.in ibclearerrors.in ibdatacounters.in ibdatacounts.in set_nodedesc.sh ) rdma_sbin_perl_program( ibdiscover.pl iblinkinfo.pl.in ibprintca.pl ibprintrt.pl ibprintswitch.pl ibqueryerrors.pl.in ibswportwatch.pl ) endif() rdma-core-56.1/infiniband-diags/scripts/IBswcountlimits.pm000077500000000000000000000345301477342711600237140ustar00rootroot00000000000000#!/usr/bin/perl # # Copyright (c) 2006 The Regents of the University of California. # Copyright (c) 2006-2008 Voltaire, Inc. All rights reserved. # # Produced at Lawrence Livermore National Laboratory. # Written by Ira Weiny . # Erez Strauss from Voltaire for help in the get_link_ends code. # # This software is available to you under a choice of one of two # licenses. You may choose to be licensed under the terms of the GNU # General Public License (GPL) Version 2, available from the file # COPYING in the main directory of this source tree, or the # OpenIB.org BSD license below: # # Redistribution and use in source and binary forms, with or # without modification, are permitted provided that the following # conditions are met: # # - Redistributions of source code must retain the above # copyright notice, this list of conditions and the following # disclaimer. 
# # - Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials # provided with the distribution. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS # BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN # ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN # CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE. # use strict; %IBswcountlimits::cur_counts = (); %IBswcountlimits::new_counts = (); @IBswcountlimits::suppress_errors = (); $IBswcountlimits::link_ends = undef; $IBswcountlimits::pause_time = 10; $IBswcountlimits::cache_dir = "/var/cache/infiniband-diags"; # all the PerfMgt counters @IBswcountlimits::counters = ( "SymbolErrorCounter", "LinkErrorRecoveryCounter", "LinkDownedCounter", "PortRcvErrors", "PortRcvRemotePhysicalErrors", "PortRcvSwitchRelayErrors", "PortXmitDiscards", "PortXmitConstraintErrors", "PortRcvConstraintErrors", "LocalLinkIntegrityErrors", "ExcessiveBufferOverrunErrors", "VL15Dropped", "PortXmitData", "PortRcvData", "PortXmitPkts", "PortRcvPkts" ); # non-critical counters %IBswcountlimits::error_counters = ( "SymbolErrorCounter", "No action is required except if counter is increasing along with LinkErrorRecoveryCounter", "LinkErrorRecoveryCounter", "If this is increasing along with SymbolErrorCounter this may indicate a bad link, run ibswportwatch.pl on this port", "LinkDownedCounter", "Number of times the port has gone down (Usually for valid reasons)", "PortRcvErrors", "This is a bad link, if the link is internal to a 288 try setting SDR, otherwise check the cable", "PortRcvRemotePhysicalErrors", "This indicates a problem ELSEWHERE in the fabric.", "PortXmitDiscards", "This is a symptom of congestion and may require tweaking either HOQ or switch lifetime values", "PortXmitConstraintErrors", "This is a result of bad partitioning, check partition configuration.", "PortRcvConstraintErrors", "This is a result of bad partitioning, check partition configuration.", "LocalLinkIntegrityErrors", "May indicate a bad link, run ibswportwatch.pl on this port", "ExcessiveBufferOverrunErrors", "This is a flow control state machine error and can be caused by packets with physical errors", "VL15Dropped", "check with ibswportwatch.pl, if increasing in SMALL increments, OK", "PortRcvSwitchRelayErrors", "This counter can increase due to a valid network event" ); sub check_counters { my $print_action = $_[0]; my $actions = undef; COUNTER: foreach my $cnt (keys %IBswcountlimits::error_counters) { if ($IBswcountlimits::cur_counts{$cnt} > 0) { foreach my $sup_cnt (@IBswcountlimits::suppress_errors) { if ("$cnt" eq $sup_cnt) { next COUNTER; } } print " [$cnt == $IBswcountlimits::cur_counts{$cnt}]"; if ("$print_action" eq "yes") { $actions = join " ", ( $actions, " $cnt: $IBswcountlimits::error_counters{$cnt}\n" ); } } } if ($actions) { print "\n Actions:\n$actions"; } } # Data counters %IBswcountlimits::data_counters = ( "PortXmitData", "Total number of data octets, divided by 4, transmitted on all VLs from the port", "PortRcvData", "Total number of data octets, divided by 4, received on all VLs to the port", "PortXmitPkts", "Total number of packets, excluding link packets, transmitted on all VLs from the port", 
"PortRcvPkts", "Total number of packets, excluding link packets, received on all VLs to the port" ); sub check_data_counters { my $print_action = $_[0]; my $actions = undef; COUNTER: foreach my $cnt (keys %IBswcountlimits::data_counters) { print " [$cnt == $IBswcountlimits::cur_counts{$cnt}]"; if ("$print_action" eq "yes") { $actions = join " ", ( $actions, " $cnt: $IBswcountlimits::data_counters{$cnt}\n" ); } } if ($actions) { print "\n Descriptions:\n$actions"; } } sub print_data_rates { COUNTER: foreach my $cnt (keys %IBswcountlimits::data_counters) { my $cnt_per_second = calculate_rate( $IBswcountlimits::cur_counts{$cnt}, $IBswcountlimits::new_counts{$cnt} ); print " $cnt_per_second $cnt/second\n"; } } # ========================================================================= # Rate dependent counters # calculate the count/sec # calculate_rate old_count new_count sub calculate_rate { my $rate = 0; my $old_val = $_[0]; my $new_val = $_[1]; my $rate = ($new_val - $old_val) / $IBswcountlimits::pause_time; return ($rate); } %IBswcountlimits::rate_dep_thresholds = ( "SymbolErrorCounter", 10, "LinkErrorRecoveryCounter", 10, "PortRcvErrors", 10, "LocalLinkIntegrityErrors", 10, "PortXmitDiscards", 10 ); sub check_counter_rates { foreach my $rate_count (keys %IBswcountlimits::rate_dep_thresholds) { my $rate = calculate_rate( $IBswcountlimits::cur_counts{$rate_count}, $IBswcountlimits::new_counts{$rate_count} ); if ($rate > $IBswcountlimits::rate_dep_thresholds{$rate_count}) { print "Detected excessive rate for $rate_count ($rate cnts/sec)\n"; } elsif ($rate > 0) { print "Detected rate for $rate_count ($rate cnts/sec)\n"; } } } # ========================================================================= # sub clear_counters { # clear the counters foreach my $count (@IBswcountlimits::counters) { $IBswcountlimits::cur_counts{$count} = 0; } } # ========================================================================= # sub any_counts { my $total = 0; my $count = 0; foreach $count (keys %IBswcountlimits::critical) { $total = $total + $IBswcountlimits::cur_counts{$count}; } COUNTER: foreach $count (keys %IBswcountlimits::error_counters) { foreach my $sup_cnt (@IBswcountlimits::suppress_errors) { if ("$count" eq $sup_cnt) { next COUNTER; } } $total = $total + $IBswcountlimits::cur_counts{$count}; } return ($total); } # ========================================================================= # sub ensure_cache_dir { if (!(-d "$IBswcountlimits::cache_dir") && !mkdir($IBswcountlimits::cache_dir, 0700)) { die "cannot create $IBswcountlimits::cache_dir: $!\n"; } } # ========================================================================= # get_cache_file(ca_name, ca_port) # sub get_cache_file { my $ca_name = $_[0]; my $ca_port = $_[1]; ensure_cache_dir; return ( "$IBswcountlimits::cache_dir/ibnetdiscover-$ca_name-$ca_port.topology"); } # ========================================================================= # get_ca_name_port_param_string(ca_name, ca_port) # sub get_ca_name_port_param_string { my $ca_name = $_[0]; my $ca_port = $_[1]; if ("$ca_name" ne "") { $ca_name = "-C $ca_name"; } if ("$ca_port" ne "") { $ca_port = "-P $ca_port"; } return ("$ca_name $ca_port"); } # ========================================================================= # generate_ibnetdiscover_topology(ca_name, ca_port) # sub generate_ibnetdiscover_topology { my $ca_name = $_[0]; my $ca_port = $_[1]; my $cache_file = get_cache_file($ca_name, $ca_port); my $extra_params = get_ca_name_port_param_string($ca_name, $ca_port); if 
(`ibnetdiscover -g $extra_params > $cache_file`) { die "Execution of ibnetdiscover failed: $!\n"; } } # ========================================================================= # get_link_ends(regenerate_map, ca_name, ca_port) # sub get_link_ends { my $regenerate_map = $_[0]; my $ca_name = $_[1]; my $ca_port = $_[2]; my $cache_file = get_cache_file($ca_name, $ca_port); if ($regenerate_map || !(-f "$cache_file")) { generate_ibnetdiscover_topology($ca_name, $ca_port); } open IBNET_TOPO, "<$cache_file" or die "Failed to open ibnet topology: $!\n"; my $in_switch = "no"; my $desc = ""; my $guid = ""; my $loc_sw_lid = ""; my $loc_port = ""; my $line = ""; while ($line = <IBNET_TOPO>) { if ($line =~ /^Switch.*\"S-(.*)\"\s+#.*\"(.*)\".* lid (\d+).*/) { $guid = $1; $desc = $2; $loc_sw_lid = $3; $in_switch = "yes"; } if ($in_switch eq "yes") { my $rec = undef; if ($line =~ /^\[(\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\](\(.+\))?\s+#.*\"(.*)\"\.* lid (\d+).*/ ) { $loc_port = $1; my $rem_guid = $2; my $rem_port = $3; my $rem_port_guid = $4; my $rem_desc = $5; my $rem_lid = $6; $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => "", loc_desc => $desc, loc_sw_lid => $loc_sw_lid, rem_guid => "0x$rem_guid", rem_lid => $rem_lid, rem_port => $rem_port, rem_ext_port => "", rem_desc => $rem_desc, rem_port_guid => $rem_port_guid }; } if ($line =~ /^\[(\d+)\]\[ext (\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\](\(.+\))?\s+#.*\"(.*)\"\.* lid (\d+).*/ ) { $loc_port = $1; my $loc_ext_port = $2; my $rem_guid = $3; my $rem_port = $4; my $rem_port_guid = $5; my $rem_desc = $6; my $rem_lid = $7; $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => $loc_ext_port, loc_desc => $desc, loc_sw_lid => $loc_sw_lid, rem_guid => "0x$rem_guid", rem_lid => $rem_lid, rem_port => $rem_port, rem_ext_port => "", rem_desc => $rem_desc, rem_port_guid => $rem_port_guid }; } if ($line =~ /^\[(\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\]\[ext (\d+)\](\(.+\))?\s+#.*\"(.*)\"\.* lid (\d+).*/ ) { $loc_port = $1; my $rem_guid = $2; my $rem_port = $3; my $rem_ext_port = $4; my $rem_port_guid = $5; my $rem_desc = $6; my $rem_lid = $7; $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => "", loc_desc => $desc, loc_sw_lid => $loc_sw_lid, rem_guid => "0x$rem_guid", rem_lid => $rem_lid, rem_port => $rem_port, rem_ext_port => $rem_ext_port, rem_desc => $rem_desc, rem_port_guid => $rem_port_guid }; } if ($line =~ /^\[(\d+)\]\[ext (\d+)\]\s+\"[HSR]-(.+)\"\[(\d+)\]\[ext (\d+)\](\(.+\))?\s+#.*\"(.*)\"\.* lid (\d+).*/ ) { $loc_port = $1; my $loc_ext_port = $2; my $rem_guid = $3; my $rem_port = $4; my $rem_ext_port = $5; my $rem_port_guid = $6; my $rem_desc = $7; my $rem_lid = $8; $rec = { loc_guid => "0x$guid", loc_port => $loc_port, loc_ext_port => $loc_ext_port, loc_desc => $desc, loc_sw_lid => $loc_sw_lid, rem_guid => "0x$rem_guid", rem_lid => $rem_lid, rem_port => $rem_port, rem_ext_port => $rem_ext_port, rem_desc => $rem_desc, rem_port_guid => $rem_port_guid }; } if ($rec) { $rec->{rem_port_guid} =~ s/\((.*)\)/$1/; $IBswcountlimits::link_ends{"0x$guid"}{$loc_port} = $rec; } } if ($line =~ /^Ca.*/ || $line =~ /^Rt.*/) { $in_switch = "no"; } } close IBNET_TOPO; } # ========================================================================= # get_num_ports(switch_guid, ca_name, ca_port) # sub get_num_ports { my $guid = $_[0]; my $ca_name = $_[1]; my $ca_port = $_[2]; my $num_ports = 0; my $extra_params = get_ca_name_port_param_string($ca_name, $ca_port); my $data = `smpquery $extra_params -G nodeinfo $guid` || die "'smpquery $extra_params -G
nodeinfo $guid' failed\n"; my @lines = split("\n", $data); my $pkt_lifetime = ""; foreach my $line (@lines) { if ($line =~ /^NumPorts:\.+(.*)/) { $num_ports = $1; } } return ($num_ports); } # ========================================================================= # format_guid(guid) # The diags store the guids as strings. This converts the guid supplied # to the correct string format. # eg: 0x0008f10400411f56 == 0x8f10400411f56 # sub format_guid { my $guid = $_[0]; my $guid_str = ""; $guid =~ tr/[A-F]/[a-f]/; if ($guid =~ /0x(.*)/) { $guid_str = sprintf("0x%016s", $1); } else { $guid_str = sprintf("0x%016s", $guid); } return ($guid_str); } # ========================================================================= # convert_dr_to_guid(direct_route) # sub convert_dr_to_guid { my $guid = undef; my $data = `smpquery nodeinfo -D $_[0]` || die "'smpquery nodeinfo -D $_[0]' failed\n"; my @lines = split("\n", $data); foreach my $line (@lines) { if ($line =~ /^PortGuid:\.+(.*)/) { $guid = $1; } } return format_guid($guid); } # ========================================================================= # get_node_type(guid_or_direct_route) # sub get_node_type { my $type = undef; my $query_arg = "smpquery nodeinfo "; if ($_[0] =~ /x/) { # assume arg is a guid if contains an x $query_arg .= "-G " . $_[0]; } else { # assume arg is a direct path $query_arg .= "-D " . $_[0]; } my $data = `$query_arg` || die "'$query_arg' failed\n"; my @lines = split("\n", $data); foreach my $line (@lines) { if ($line =~ /^NodeType:\.+(.*)/) { $type = $1; } } return $type; } # ========================================================================= # is_switch(guid_or_direct_route) # sub is_switch { my $node_type = &get_node_type($_[0]); return ($node_type =~ /Switch/); } rdma-core-56.1/infiniband-diags/scripts/check_lft_balance.pl000077500000000000000000000241771477342711600241470ustar00rootroot00000000000000#!/usr/bin/perl # # Copyright (C) 2001-2003 The Regents of the University of California. # Copyright (c) 2006 The Regents of the University of California. # Copyright (c) 2007-2008 Voltaire, Inc. All rights reserved. # # Produced at Lawrence Livermore National Laboratory. # Written by Ira Weiny # Jim Garlick # Albert Chu # # This software is available to you under a choice of one of two # licenses. You may choose to be licensed under the terms of the GNU # General Public License (GPL) Version 2, available from the file # COPYING in the main directory of this source tree, or the # OpenIB.org BSD license below: # # Redistribution and use in source and binary forms, with or # without modification, are permitted provided that the following # conditions are met: # # - Redistributions of source code must retain the above # copyright notice, this list of conditions and the following # disclaimer. # # - Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials # provided with the distribution. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS # BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN # ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN # CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE.
# use strict; use Getopt::Std; my $ibnetdiscover_cache = ""; my $dump_lft_file = ""; my $verbose = 0; my $switch_lid = undef; my $switch_guid = undef; my $switch_name = undef; my %switch_port_count = (); my @switch_maybe_directly_connected_hosts = (); my $host = undef; my @host_ports = (); my @lft_lines = (); my $lft_line; my $lids_per_port; my $lids_per_port_calculated; my $heuristic_flag = 0; sub usage { my $prog = `basename $0`; chomp($prog); print "Usage: $prog -l lft-output -i ibnetdiscover-cache [-e] [-v]\n"; print " Generate lft-output via \"dump_lfts.sh > lft-output\"\n"; print " Generate ibnetdiscover-cache via \"ibnetdiscover --cache ibnetdiscover-cache\"\n"; print " -e turn on heuristic(s) to look at switch balances deeper\n"; print " -v verbose output, output all switches\n"; exit 2; } sub is_port_up { my $iblinkinfo_output = $_[0]; my $port = $_[1]; my $decport; my @lines; my $line; $port =~ /0+(.+)/; $decport = $1; # Add a space if necessary if ($decport >= 1 && $decport <= 9) { $decport = " $decport"; } @lines = split("\n", $iblinkinfo_output); foreach $line (@lines) { if ($line =~ /$decport\[..\] ==/) { if ($line =~ /Down/) { return 0; } else { return 1; } } } # return 0 if not found return 0; } sub is_directly_connected { my $iblinkinfo_output = $_[0]; my $port = $_[1]; my $decport; my $str; my $rv = 0; my $host_tmp; my @lines; my $line; if (($switch_port_count{$port} != $lids_per_port) || !(@switch_maybe_directly_connected_hosts)) { return $rv; } $port =~ /0+(.+)/; $decport = $1; # Add a space if necessary if ($decport >= 1 && $decport <= 9) { $decport = " $decport"; } @lines = split("\n", $iblinkinfo_output); foreach $line (@lines) { if ($line =~ /$decport\[..\] ==/) { $str = $line; } } if ($str =~ "Active") { $str =~ /[\d]+[\s]+[\d]+\[.+\] \=\=.+\=\=>[\s]+[\d]+[\s]+[\d]+\[.+\] \"(.+)\".+/; for $host_tmp (@switch_maybe_directly_connected_hosts) { if ($1 == $host_tmp) { $rv = 1; last; } } } return $rv; } sub output_switch_port_usage { my $min_usage = 999999; my $max_usage = 0; my $min_usage2 = 999999; my $max_usage2 = 0; my @ports = ( "001", "002", "003", "004", "005", "006", "007", "008", "009", "010", "011", "012", "013", "014", "015", "016", "017", "018", "019", "020", "021", "022", "023", "024", "025", "026", "027", "028", "029", "030", "031", "032", "033", "034", "035", "036" ); my @output_ports = (); my @double_check_ports = (); my $port; my $iblinkinfo_output; my $is_unbalanced = 0; my $ports_on_switch = 0; my $all_zero_flag = 1; my $ret; $iblinkinfo_output = `iblinkinfo --load-cache $ibnetdiscover_cache -S $switch_guid`; for $port (@ports) { if (!defined($switch_port_count{$port})) { $switch_port_count{$port} = 0; } if ($switch_port_count{$port} == 0) { # If port is down, don't use it in this calculation $ret = is_port_up($iblinkinfo_output, $port); if ($ret == 0) { next; } } $ports_on_switch++; # If port is directly connected to a node, don't use # it in this calculation. if (is_directly_connected($iblinkinfo_output, $port) == 1) { next; } # Save off ports that should be output later push(@output_ports, $port); if ($switch_port_count{$port} < $min_usage) { $min_usage = $switch_port_count{$port}; } if ($switch_port_count{$port} > $max_usage) { $max_usage = $switch_port_count{$port}; } } if ($max_usage > ($min_usage + 1)) { $is_unbalanced = 1; } # In the event this is a switch lineboard, it will almost always never # balanced. Half the ports go up to the spine, and the rest of the ports # go down to HCAs. 
So we will do a special heuristic: # # If about 1/2 of the remaining ports are balanced, then we will consider the # entire switch balanced. # # Also, we do this only if there are enough alive ports on the switch to care. # I picked 12 somewhat randomly if ($heuristic_flag == 1 && $is_unbalanced == 1 && $ports_on_switch > 12) { @double_check_ports = (); for $port (@output_ports) { if ($switch_port_count{$port} == $max_usage || $switch_port_count{$port} == ($max_usage - 1) || $switch_port_count{$port} == 0) { next; } push(@double_check_ports, $port); } # we'll call half +/- 1 "about half" if (@double_check_ports == int($ports_on_switch / 2) || @double_check_ports == int($ports_on_switch / 2) + 1 || @double_check_ports == int($ports_on_switch / 2) - 1) { for $port (@double_check_ports) { if ($switch_port_count{$port} < $min_usage2) { $min_usage2 = $switch_port_count{$port}; } if ($switch_port_count{$port} > $max_usage2) { $max_usage2 = $switch_port_count{$port}; } } if (!($max_usage2 > ($min_usage2 + 1))) { $is_unbalanced = 0; } } } # Another special case is when you have a non-fully-populated switch # Many ports will be zero. So if all active ports != max or max-1 are = 0 # we will also consider this balanced. if ($heuristic_flag == 1 && $is_unbalanced == 1 && $ports_on_switch > 12) { @double_check_ports = (); for $port (@output_ports) { if ($switch_port_count{$port} == $max_usage || $switch_port_count{$port} == ($max_usage - 1)) { next; } push(@double_check_ports, $port); } for $port (@double_check_ports) { if ($switch_port_count{$port} != 0) { $all_zero_flag = 0; last; } } if ($all_zero_flag == 1) { $is_unbalanced = 0; } } if ($verbose || $is_unbalanced == 1) { if ($is_unbalanced == 1) { print "Unbalanced Switch Port Usage: "; print "$switch_name, $switch_guid\n"; } else { print "Switch Port Usage: $switch_name, $switch_guid\n"; } for $port (@output_ports) { print "Port $port: $switch_port_count{$port}\n"; } } } sub process_host_ports { my $test_port; my $tmp; my $flag = 0; if (@host_ports == $lids_per_port) { # Are all the host ports identical? $test_port = $host_ports[0]; for $tmp (@host_ports) { if ($tmp != $test_port) { $flag = 1; last; } } # If all host ports are identical, maybe its directly # connected to a host. 
if ($flag == 0) { push(@switch_maybe_directly_connected_hosts, $host); } } } if (!getopts("hl:i:ve")) { usage(); } if (defined($main::opt_h)) { usage(); } if (defined($main::opt_l)) { $dump_lft_file = $main::opt_l; } else { print STDERR ("Must specify dump lfts file\n"); usage(); exit 1; } if (defined($main::opt_i)) { $ibnetdiscover_cache = $main::opt_i; } else { print STDERR ("Must specify ibnetdiscover cache\n"); usage(); exit 1; } if (defined($main::opt_v)) { $verbose = 1; } if (defined($main::opt_e)) { $heuristic_flag = 1; } if (!open(FH, "< $dump_lft_file")) { print STDERR ("Couldn't open dump lfts file: $dump_lft_file: $!\n"); } @lft_lines = <FH>; foreach $lft_line (@lft_lines) { chomp($lft_line); if ($lft_line =~ /Unicast/) { if (@host_ports) { process_host_ports(); } if (defined($switch_name)) { output_switch_port_usage(); } if ($lft_line =~ /Unicast lids .+ of switch DR path slid .+ guid (.+) \((.+)\)/) { $switch_guid = $1; $switch_name = $2; } if ($lft_line =~ /Unicast lids .+ of switch Lid .+ guid (.+) \((.+)\)/) { $switch_guid = $1; $switch_name = $2; } @switch_maybe_directly_connected_hosts = (); %switch_port_count = (); @host_ports = (); $lids_per_port = 0; $lids_per_port_calculated = 0; } elsif ($lft_line =~ /Channel/ || $lft_line =~ /Router/) { $lft_line =~ /.+ (.+) : \(.+ portguid .+: '(.+)'\)/; $host = $2; $switch_port_count{$1}++; if (@host_ports) { process_host_ports(); } @host_ports = ($1); if ($lids_per_port == 0) { $lids_per_port++; } else { $lids_per_port_calculated++; } } elsif ($lft_line =~ /path/) { $lft_line =~ /.+ (.+) : \(path #. out of .: portguid .+\)/; $switch_port_count{$1}++; if ($lids_per_port_calculated == 0) { $lids_per_port++; } push(@host_ports, $1); } else { if ($lids_per_port) { $lids_per_port_calculated++; } next; } } if (@host_ports) { process_host_ports(); } output_switch_port_usage(); rdma-core-56.1/infiniband-diags/scripts/dump_lfts.sh.in000077500000000000000000000004351477342711600231520ustar00rootroot00000000000000#!/bin/sh # # This simple script will collect outputs of ibroute for all switches # on the subnet and drop it on stdout. It can be used for LFTs dump # generation. # @IBSCRIPTPATH@/dump_fts $@ echo "" echo "*** WARNING ***: this command has been replaced by dump_fts" echo "" echo "" rdma-core-56.1/infiniband-diags/scripts/dump_mfts.sh.in000077500000000000000000000004431477342711600231520ustar00rootroot00000000000000#!/bin/sh # # This simple script will collect outputs of ibroute for all switches # on the subnet and drop it on stdout. It can be used for MFTs dump # generation.
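# Typical invocation (illustrative): capture the dump by redirecting
# stdout, e.g. "dump_mfts.sh > /tmp/mfts.dump".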
# @IBSCRIPTPATH@/dump_fts -M $@ echo "" echo "*** WARNING ***: this command has been replaced by dump_fts -M" echo "" echo "" rdma-core-56.1/infiniband-diags/scripts/ibcheckerrors.in000066400000000000000000000044521477342711600233710ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [-b] [-v] [-N | -nocolor]"\ "[ | -C ca_name -P ca_port -t(imeout) timeout_ms]" exit 255 } user_abort() { echo "Aborted" exit 1 } trap user_abort SIGINT gflags="" verbose="" brief="" v=0 ntype="" nodeguid="" topofile="" ca_info="" while [ "$1" ]; do case $1 in -h) usage ;; -N|-nocolor) gflags=-N ;; -v) verbose=-v brief="" v=1 ;; -b) brief=-b verbose="" ;; -P | -C | -t | -timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -*) usage ;; *) if [ "$topofile" ]; then usage fi topofile="$1" ;; esac shift done if [ "$topofile" ]; then netcmd="cat $topofile" else netcmd="$IBPATH/ibnetdiscover $ca_info" fi text="`eval $netcmd`" rv=$? echo "$text" | awk ' BEGIN { ne=0 } function check_node(lid, port) { if (system("'$IBPATH'/ibchecknode -S '"$ca_info"' '$gflags' '$verbose' " lid)) { ne++ print "\n# " ntype ": nodeguid 0x" nodeguid " failed" return 1; } if (system("'$IBPATH'/ibcheckerrs -S '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port)) return 2; return 0; } /^Ca/ || /^Switch/ || /^Rt/ { nnodes++ ntype=$1; nodeguid=substr($3, 4, 16); ports=$2 if ('$v') print "\n# Checking " ntype ": nodeguid 0x" nodeguid err = 0; if (ntype != "Switch") next lid = substr($0, index($0, "port 0 lid ") + 11) lid = substr(lid, 1, index(lid, " ") - 1) err = check_node(lid, 255) } /^\[/ { nports++ port = $1 sub("\\(.*\\)", "", port) gsub("[\\[\\]]", "", port) if (ntype != "Switch") { lid = substr($0, index($0, " lid ") + 5) lid = substr(lid, 1, index(lid, " ") - 1) if (check_node(lid, port) == 2) pcnterr++; } else if (err && system("'$IBPATH'/ibcheckerrs -S '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port)) pcnterr++; } /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} END { printf "\n*** WARNING ***: this command is deprecated; Please use \"ibqueryerrors\"" printf "\n## Summary: %d nodes checked, %d bad nodes found\n", nnodes, ne printf "## %d ports checked, %d ports have errors beyond threshold\n", nports, pcnterr exit (ne + pcnterr) } ' exit $rv rdma-core-56.1/infiniband-diags/scripts/ibcheckerrs.in000066400000000000000000000115111477342711600230220ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [-b] [-v] [-G] [-T ]" \ "[-s(how_thresholds)] [-N \| -nocolor] [-C ca_name] [-P ca_port]" \ "[-t(imeout) timeout_ms] []" exit 255 } green() { if [ "$bw" = "yes" ]; then if [ "$verbose" = "yes" ]; then echo $1 fi return fi if [ "$verbose" = "yes" ]; then echo -e "\\033[1;032m" $1 "\\033[0;39m" fi } red() { if [ "$bw" = "yes" ]; then echo $1 return fi echo -e "\\033[1;031m" $1 "\\033[0;39m" } show_thresholds() { echo "SymbolErrorCounter=$SymbolErrorCounter" echo "LinkErrorRecoveryCounter=$LinkErrorRecoveryCounter" echo "LinkDownedCounter=$LinkDownedCounter" echo "PortRcvErrors=$PortRcvErrors" echo "PortRcvRemotePhysicalErrors=$PortRcvRemotePhysicalErrors" echo "PortRcvSwitchRelayErrors=$PortRcvSwitchRelayErrors" echo "PortXmitDiscards=$PortXmitDiscards" echo "PortXmitConstraintErrors=$PortXmitConstraintErrors" echo "PortRcvConstraintErrors=$PortRcvConstraintErrors" echo 
"LocalLinkIntegrityErrors=$LocalLinkIntegrityErrors" echo "ExcessiveBufferOverrunErrors=$ExcessiveBufferOverrunErrors" echo "VL15Dropped=$VL15Dropped" } get_thresholds() { . $1 } # Default thresholds SymbolErrorCounter=10 LinkErrorRecoveryCounter=10 LinkDownedCounter=10 PortRcvErrors=10 PortRcvRemotePhysicalErrors=100 PortRcvSwitchRelayErrors=100 PortXmitDiscards=100 PortXmitConstraintErrors=100 PortRcvConstraintErrors=100 LocalLinkIntegrityErrors=10 ExcessiveBufferOverrunErrors=10 VL15Dropped=100 guid_addr="" bw="" verbose="" brief="" ca_info="" suppress_deprecated="no" while [ "$1" ]; do case $1 in -G) guid_addr=yes ;; -nocolor|-N) bw=yes ;; -v) verbose=yes brief="" ;; -b) brief=yes verbose="" ;; -T) if ! [ -r $2 ]; then echo "Can't use threshold file '$2'" usage fi get_thresholds $2 shift ;; -s) show_thresholds exit 0 ;; -S) suppress_deprecated="yes" ;; -P | -C | -t | -timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -*) usage ;; *) break ;; esac shift done #default is all ports portnum=255 if [ $# -lt 1 ]; then usage fi if [ "$2" ]; then portnum=$2 fi if [ "$portnum" = "255" ]; then portname="all" else portname=$2 fi if [ "$suppress_deprecated" = "no" ]; then /usr/bin/echo -e "*** WARNING ***: this command is deprecated; Please use \"ibqueryerrors\"\n\n" 1>&2 fi if [ "$guid_addr" ]; then if ! lid=`$IBPATH/ibaddr $ca_info -G -L $1 | awk '/failed/{exit 255} {print $3}'`; then echo -n "guid $1 address resolution: " red "FAILED" exit 255 fi guid=$1 else lid=$1 if ! temp=`$IBPATH/ibaddr $ca_info -L $1 | awk '/failed/{exit 255} {print $1}'`; then echo -n "lid $1 address resolution: " red "FAILED" exit 255 fi fi nodename=`$IBPATH/smpquery $ca_info nodedesc $lid | sed -e "s/^Node Description:\.*\(.*\)/\1/"` text="`eval $IBPATH/perfquery $ca_info $lid $portnum`" rv=$? 
if echo $text | grep -q 'AllPortSelect not supported'; then if [ "$verbose" = "yes" ]; then echo -n "Error check on lid $lid ($nodename) port $portname: " green "AllPortSelect not supported" fi exit 0 fi if echo "$text" | awk -v mono=$bw -v brief=$brief -F '[.:]*' ' function blue(s) { if (brief == "yes") { return } if (mono) printf s else if (!quiet) { printf "\033[1;034m" s printf "\033[0;39m" } } BEGIN { th["SymbolErrorCounter"] = '$SymbolErrorCounter' th["LinkErrorRecoveryCounter"] = '$LinkErrorRecoveryCounter' th["LinkDownedCounter"] = '$LinkDownedCounter' th["PortRcvErrors"] = '$PortRcvErrors' th["PortRcvRemotePhysicalErrors"] = '$PortRcvRemotePhysicalErrors' th["PortRcvSwitchRelayErrors"] = '$PortRcvSwitchRelayErrors' th["PortXmitDiscards"] = '$PortXmitDiscards' th["PortXmitConstraintErrors"] = '$PortXmitConstraintErrors' th["PortRcvConstraintErrors"] = '$PortRcvConstraintErrors' th["LocalLinkIntegrityErrors"] = '$LocalLinkIntegrityErrors' th["ExcessiveBufferOverrunErrors"] = '$ExcessiveBufferOverrunErrors' th["VL15Dropped"] = '$VL15Dropped' } /^CounterSelect/ {next} /AllPortSelect/ {next} /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} /^PortSelect/ { if ($2 != '$portnum') {err = err "error: lid '$lid' port " $2 " does not match query ('$portnum')\n"; exit 255}} $1 ~ "(Xmt|Rcv)(Pkts|Data)" { next } { if (th[$1] > 0 && $2 >= th[$1]) warn = warn "#warn: counter " $1 " = " $2 " \t(threshold " th[$1] ") lid '$lid' port '$portnum'\n" } END { if (err != "") { blue(err) exit 255 } if (warn != "") { blue(warn) exit 255 } exit 0 }' 2>&1 && test $rv -eq 0 ; then if [ "$verbose" = "yes" ]; then echo -n "Error check on lid $lid ($nodename) port $portname: " green OK fi exit 0 else echo -n "Error check on lid $lid ($nodename) port $portname: " red FAILED exit 255 fi rdma-core-56.1/infiniband-diags/scripts/ibchecknet.in000066400000000000000000000051161477342711600226410ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [-v] [-N | -nocolor]" \ "[ | -C ca_name -P ca_port -t(imeout) timeout_ms]" exit 255 } user_abort() { echo "Aborted" exit 1 } trap user_abort SIGINT gflags="" verbose="" v=0 oldlid="" topofile="" ca_info="" while [ "$1" ]; do case $1 in -h) usage ;; -N|-nocolor) gflags=-N ;; -v) verbose=-v v=0 ;; -P | -C | -t | -timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -*) usage ;; *) if [ "$topofile" ]; then usage fi topofile="$1" ;; esac shift done if [ "$topofile" ]; then netcmd="cat $topofile" else netcmd="$IBPATH/ibnetdiscover $ca_info" fi text="`eval $netcmd`" rv=$? 
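# The awk program below walks ibnetdiscover output: lines beginning
# with "Ca", "Switch" or "Rt" introduce a node, and subsequent "[N]"
# lines describe its ports, roughly (illustrative sample):
#   Switch 24 "S-0008f10400400588" # "ISR9024" ... port 0 lid 2 ...
#   [1] "H-0008f10403961354"[1] # "somehost HCA-1" lid 7 4xSDR
# Every node is checked with ibchecknode/ibcheckerrs and every port
# with ibcheckport.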
echo "$text" | awk ' BEGIN { ne=0 pe=0 } function check_node(lid, port) { if (system("'$IBPATH'/ibchecknode -S '"$ca_info"' '$gflags' '$verbose' " lid)) { ne++ print "\n# " ntype ": nodeguid 0x" nodeguid " failed" return 1; } if (system("'$IBPATH'/ibcheckerrs -S '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port)) return 2; return 0; } /^Ca/ || /^Switch/ || /^Rt/ { nnodes++ ntype=$1; nodeguid=substr($3, 4, 16); ports=$2 if ('$v' || ntype != "Switch") print "\n# Checking " ntype ": nodeguid 0x" nodeguid err = 0; if (ntype != "Switch") next lid = substr($0, index($0, "port 0 lid ") + 11) lid = substr(lid, 1, index(lid, " ") - 1) err = check_node(lid, 255) } /^\[/ { nports++ port = $1 sub("\\(.*\\)", "", port) gsub("[\\[\\]]", "", port) if (ntype != "Switch") { lid = substr($0, index($0, " lid ") + 5) lid = substr(lid, 1, index(lid, " ") - 1) if (check_node(lid, port) == 2) pcnterr++; } else if (err && system("'$IBPATH'/ibcheckerrs -S '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port)) pcnterr++; if (system("'$IBPATH'/ibcheckport -S '"$ca_info"' '$gflags' '$verbose' " lid " " port)) { if (!'$v' && oldlid != lid) { print "# Checked " ntype ": nodeguid 0x" nodeguid " with failure" oldlid = lid } pe++; } } /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} END { printf "\n*** WARNING ***: this command is deprecated; Please use \"ibqueryerrors\"" printf "\n## Summary: %d nodes checked, %d bad nodes found\n", nnodes, ne printf "## %d ports checked, %d bad ports found\n", nports, pe printf "## %d ports have errors beyond threshold\n", pcnterr exit (ne + pe + pcnterr) } ' av=$? if [ $av -ne 0 ] ; then exit $av else exit $rv fi rdma-core-56.1/infiniband-diags/scripts/ibchecknode.in000066400000000000000000000032721477342711600230010ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [-v] [-N | -nocolor] [-G]" \ "[-C ca_name] [-P ca_port] [-t(imeout) timeout_ms] " exit 255 } green() { if [ "$bw" = "yes" ]; then if [ "$verbose" = "yes" ]; then echo $1 fi return fi if [ "$verbose" = "yes" ]; then echo -e "\\033[1;032m" $1 "\\033[0;39m" fi } red() { if [ "$bw" = "yes" ]; then echo $1 return fi echo -e "\\033[1;031m" $1 "\\033[0;39m" } guid_addr="" bw="" verbose="" ca_info="" suppress_deprecated="no" while [ "$1" ]; do case $1 in -G) guid_addr=yes ;; -nocolor|-N) bw=yes ;; -v) verbose=yes ;; -P | -C | -t | -timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -S) suppress_deprecated="yes" ;; -*) usage ;; *) break ;; esac shift done if [ -z "$1" ]; then usage fi if [ "$suppress_deprecated" = "no" ]; then /usr/bin/echo -e "*** WARNING ***: this command is deprecated; Please use \"smpquery nodeinfo\"\n\n" 1>&2 fi if [ "$guid_addr" ]; then if ! lid=`$IBPATH/ibaddr $ca_info -G -L $1 | awk '/failed/{exit 255} {print $3}'`; then echo -n "guid $1 address resolution: " red "FAILED" exit 255 fi else lid=$1 if ! 
temp=`$IBPATH/ibaddr $ca_info -L $1 | awk '/failed/{exit 255} {print $1}'`; then echo -n "lid $1 address resolution: " red "FAILED" exit 255 fi fi ## For now, check node only checks if node info is replied if $IBPATH/smpquery $ca_info nodeinfo $lid > /dev/null 2>&1 ; then if [ "$verbose" = "yes" ]; then echo -n "Node check lid $lid: " green OK fi exit 0 else echo -n "Node check lid $lid: " red FAILED exit 255 fi rdma-core-56.1/infiniband-diags/scripts/ibcheckport.in000066400000000000000000000057011477342711600230370ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [-v] [-N | -nocolor] [-G]" \ "[-C ca_name] [-P ca_port] [-t(imeout) timeout_ms] " exit 255 } green() { if [ "$bw" = "yes" ]; then if [ "$verbose" = "yes" ]; then echo $1 fi return fi if [ "$verbose" = "yes" ]; then echo -e "\\033[1;032m" $1 "\\033[0;39m" fi } red() { if [ "$bw" = "yes" ]; then echo $1 return fi echo -e "\\033[1;031m" $1 "\\033[0;39m" } guid_addr="" bw="" verbose="" ca_info="" suppress_deprecated="no" while [ "$1" ]; do case $1 in -G) guid_addr=yes ;; -nocolor|-N) bw=yes ;; -v) verbose=yes ;; -P | -C | -t | -timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -S) suppress_deprecated="yes" ;; -*) usage ;; *) break ;; esac shift done if [ $# -lt 2 ]; then usage fi portnum=$2 if [ "$suppress_deprecated" = "no" ]; then /usr/bin/echo -e "*** WARNING ***: this command is deprecated\n\n" 1>&2 fi if [ "$guid_addr" ]; then if ! lid=`$IBPATH/ibaddr $ca_info -G -L $1 | awk '/failed/{exit 255} {print $3}'`; then echo -n "guid $1 address resolution: " red "FAILED" exit 255 fi guid=$1 else lid=$1 if ! temp=`$IBPATH/ibaddr $ca_info -L $1 | awk '/failed/{exit 255} {print $1}'`; then echo -n "lid $1 address resolution: " red "FAILED" exit 255 fi fi is_switch=`$IBPATH/smpquery $ca_info nodeinfo $lid $portnum | awk -F '[.:]*' '/^NodeType/{ if ($2 == "Switch") {print 1}}'` if [ "$is_switch" -a "$portnum" == "0" ]; then ignore_check=true fi text="`eval $IBPATH/smpquery $ca_info portinfo $lid $portnum`" rv=$? 
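# The checks below are driven by smpquery portinfo "Field:....value"
# output: PhysLinkState must be LinkUp (error), LinkState should be
# Active (warning), LinkWidthActive of 1X draws a warning, and
# unconfigured Lid/SMLid values are flagged; ignore_check is meant to
# skip the LID checks for switch port 0.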
if echo "$text" | awk -v ignore_check=$ignore_check -v mono=$bw -F '[.:]*' ' function blue(s) { if (mono) printf s else if (!quiet) { printf "\033[1;034m" s printf "\033[0;39m" } } # Checks /^PhysLinkState/{ if ($2 != "LinkUp") {err = err "#error: Physical link state is " $2 " lid '$lid' port '$portnum'\n"; exit 255}} /^LinkState/{ if ($2 != "Active") warn = warn "#warn: Logical link state is " $2 " lid '$lid' port '$portnum'\n"} /^LinkWidthActive/{ if ($2 == "1X") warn = warn "#warn: Link configured as 1X lid '$lid' port '$portnum'\n"} /^Lid/{ if (ignore_check == "0" && $2 == "0") warn = warn "#warn: Lid is not configured lid '$lid' port '$portnum'\n"} /^SMLid/{ if (ignore_check == "0" && $2 == "0") warn = warn "#warn: SM Lid is not configured\n"} #/^LocalPort/ { if ($2 != '$portnum') {err = err "#error: port " $2 " does not match query ('$portnum')\n"; exit 255}} /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} END { if (err != "") { blue(err) exit 255 } if (warn != "") { blue(warn) exit 255 } exit 0 }' 2>&1 && test $rv -eq 0 ; then if [ "$verbose" = "yes" ]; then echo -n "Port check lid $lid port $portnum: " green "OK" fi exit 0 else echo -n "Port check lid $lid port $portnum: " red "FAILED" exit 255 fi rdma-core-56.1/infiniband-diags/scripts/ibcheckportstate.in000066400000000000000000000045041477342711600241000ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [-v] [-N | -nocolor] [-G]" \ "[-C ca_name] [-P ca_port] [-t(imeout) timeout_ms] " exit 255 } green() { if [ "$bw" = "yes" ]; then if [ "$verbose" = "yes" ]; then echo $1 fi return fi if [ "$verbose" = "yes" ]; then echo -e "\\033[1;032m" $1 "\\033[0;39m" fi } red() { if [ "$bw" = "yes" ]; then echo $1 return fi echo -e "\\033[1;031m" $1 "\\033[0;39m" } guid_addr="" bw="" verbose="" ca_info="" suppress_deprecated="no" while [ "$1" ]; do case $1 in -G) guid_addr=yes ;; -nocolor|-N) bw=yes ;; -v) verbose=yes ;; -S) suppress_deprecated="yes" ;; -P | -C | -t | -timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -*) usage ;; *) break ;; esac shift done if [ $# -lt 2 ]; then usage fi portnum=$2 if [ "$suppress_deprecated" = "no" ]; then /usr/bin/echo -e "*** WARNING ***: this command is deprecated\n\n" 1>&2 fi if [ "$guid_addr" ]; then if ! lid=`$IBPATH/ibaddr $ca_info -G -L $1 | awk '/failed/{exit 255} {print $3}'`; then echo -n "guid $1 address resolution: " red "FAILED" exit 255 fi guid=$1 else lid=$1 if ! temp=`$IBPATH/ibaddr $ca_info -L $1 | awk '/failed/{exit 255} {print $1}'`; then echo -n "lid $1 address resolution: " red "FAILED" exit 255 fi fi text="`eval $IBPATH/smpquery $ca_info portinfo $lid $portnum`" rv=$? 
if echo "$text" | awk -v mono=$bw -F '[.:]*' ' function blue(s) { if (mono) printf s else if (!quiet) { printf "\033[1;034m" s printf "\033[0;39m" } } # Only check PortPhysicalState and PortState /^PhysLinkState/{ if ($2 != "LinkUp") {err = err "#error: Physical link state is " $2 " lid '$lid' port '$portnum'\n"; exit 255}} /^LinkState/{ if ($2 != "Active") warn = warn "#warn: Logical link state is " $2 " lid '$lid' port '$portnum'\n"} /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} END { if (err != "") { blue(err) exit 255 } if (warn != "") { blue(warn) exit 255 } exit 0 }' 2>&1 && test $rv -eq 0 ; then if [ "$verbose" = "yes" ]; then echo -n "Port check lid $lid port $portnum: " green "OK" fi exit 0 else echo -n "Port check lid $lid port $portnum: " red "FAILED" exit 255 fi rdma-core-56.1/infiniband-diags/scripts/ibcheckportwidth.in000066400000000000000000000043741477342711600241040ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [-v] [-N | -nocolor] [-G]" \ "[-C ca_name] [-P ca_port] [-t(imeout) timeout_ms] " exit 255 } green() { if [ "$bw" = "yes" ]; then if [ "$verbose" = "yes" ]; then echo $1 fi return fi if [ "$verbose" = "yes" ]; then echo -e "\\033[1;032m" $1 "\\033[0;39m" fi } red() { if [ "$bw" = "yes" ]; then echo $1 return fi echo -e "\\033[1;031m" $1 "\\033[0;39m" } guid_addr="" bw="" verbose="" ca_info="" suppress_deprecated="no" while [ "$1" ]; do case $1 in -G) guid_addr=yes ;; -nocolor|-N) bw=yes ;; -v) verbose=yes ;; -S) suppress_deprecated="yes" ;; -P | -C | -t | -timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -*) usage ;; *) break ;; esac shift done if [ $# -lt 2 ]; then usage fi portnum=$2 if [ "$suppress_deprecated" = "no" ]; then /usr/bin/echo -e "*** WARNING ***: this command is deprecated\n\n" 1>&2 fi if [ "$guid_addr" ]; then if ! lid=`$IBPATH/ibaddr $ca_info -G -L $1 | awk '/failed/{exit 255} {print $3}'`; then echo -n "guid $1 address resolution: " red "FAILED" exit 255 fi guid=$1 else lid=$1 if ! temp=`$IBPATH/ibaddr $ca_info -L $1 | awk '/failed/{exit 255} {print $1}'`; then echo -n "lid $1 address resolution: " red "FAILED" exit 255 fi fi text="`eval $IBPATH/smpquery $ca_info portinfo $lid $portnum`" rv=$? 
if echo "$text" | awk -v mono=$bw -F '[.:]*' ' function blue(s) { if (mono) printf s else if (!quiet) { printf "\033[1;034m" s printf "\033[0;39m" } } # Only check LinkWidthActive if LinkWidthSupported is not 1X /^LinkWidthSupported/{ if ($2 == "1X") { exit } } /^LinkWidthActive/{ if ($2 == "1X") warn = warn "#warn: Link configured as 1X lid '$lid' port '$portnum'\n"} /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} END { if (err != "") { blue(err) exit 255 } if (warn != "") { blue(warn) exit 255 } exit 0 }' 2>&1 && test $rv -eq 0 ; then if [ "$verbose" = "yes" ]; then echo -n "Port check lid $lid port $portnum: " green "OK" fi exit 0 else echo -n "Port check lid $lid port $portnum: " red "FAILED" exit 255 fi rdma-core-56.1/infiniband-diags/scripts/ibcheckstate.in000066400000000000000000000043071477342711600231740ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [-v] [-N | -nocolor]" \ "[ | -C ca_name -P ca_port -t(imeout) timeout_ms]" exit 255 } user_abort() { echo "Aborted" exit 1 } trap user_abort SIGINT gflags="" verbose="" v=0 ntype="" nodeguid="" oldlid="" topofile="" ca_info="" while [ "$1" ]; do case $1 in -h) usage ;; -N|-nocolor) gflags=-N ;; -v) verbose=-v v=1 ;; -P | -C | -t | -timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -*) usage ;; *) if [ "$topofile" ]; then usage fi topofile="$1" ;; esac shift done if [ "$topofile" ]; then netcmd="cat $topofile" else netcmd="$IBPATH/ibnetdiscover $ca_info" fi text="`eval $netcmd`" rv=$? echo "$text" | awk ' BEGIN { ne=0 pe=0 } function check_node(lid) { nodechecked=1 if (system("'$IBPATH'/ibchecknode -S '"$ca_info"' '$gflags' '$verbose' " lid)) { ne++ badnode=1 return } } /^Ca/ || /^Switch/ || /^Rt/ { nnodes++ ntype=$1; nodeguid=substr($3, 4, 16); ports=$2 if ('$v') print "\n# Checking " ntype ": nodeguid 0x" nodeguid nodechecked=0 badnode=0 if (ntype != "Switch") next lid = substr($0, index($0, "port 0 lid ") + 11) lid = substr(lid, 1, index(lid, " ") - 1) check_node(lid) } /^\[/ { nports++ port = $1 if (!nodechecked) { lid = substr($0, index($0, " lid ") + 5) lid = substr(lid, 1, index(lid, " ") - 1) check_node(lid) } if (badnode) { print "\n# " ntype ": nodeguid 0x" nodeguid " failed" next } sub("\\(.*\\)", "", port) gsub("[\\[\\]]", "", port) if (system("'$IBPATH'/ibcheckportstate -S '"$ca_info"' '$gflags' '$verbose' " lid " " port)) { if (!'$v' && oldlid != lid) { print "# Checked " ntype ": nodeguid 0x" nodeguid " with failure" oldlid = lid } pe++; } } /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} END { printf "\n*** WARNING ***: this command is deprecated\n" printf "\n## Summary: %d nodes checked, %d bad nodes found\n", nnodes, ne printf "## %d ports checked, %d ports with bad state found\n", nports, pe } ' exit $rv rdma-core-56.1/infiniband-diags/scripts/ibcheckwidth.in000066400000000000000000000043211477342711600231670ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [-v] [-N | -nocolor]" \ "[ \| -C ca_name -P ca_port -t(imeout) timeout_ms]" exit 255 } user_abort() { echo "Aborted" exit 1 } trap user_abort SIGINT gflags="" verbose="" v=0 ntype="" nodeguid="" oldlid="" topofile="" ca_info="" while [ "$1" ]; do case $1 in -h) usage ;; -N|-nocolor) gflags=-N ;; -v) verbose="-v" v=1 ;; -P | -C | -t | -timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi 
ca_info="$ca_info $1 $2" shift ;; -*) usage ;; *) if [ "$topofile" ]; then usage fi topofile="$1" ;; esac shift done if [ "$topofile" ]; then netcmd="cat $topofile" else netcmd="$IBPATH/ibnetdiscover $ca_info" fi text="`eval $netcmd`" rv=$? echo "$text" | awk ' BEGIN { ne=0 pe=0 } function check_node(lid) { nodechecked=1 if (system("'$IBPATH'/ibchecknode -S '"$ca_info"' '$gflags' '$verbose' " lid)) { ne++ badnode=1 return } } /^Ca/ || /^Switch/ || /^Rt/ { nnodes++ ntype=$1; nodeguid=substr($3, 4, 16); ports=$2 if ('$v') print "\n# Checking " ntype ": nodeguid 0x" nodeguid nodechecked=0 badnode=0 if (ntype != "Switch") next lid = substr($0, index($0, "port 0 lid ") + 11) lid = substr(lid, 1, index(lid, " ") - 1) check_node(lid) } /^\[/ { nports++ port = $1 if (!nodechecked) { lid = substr($0, index($0, " lid ") + 5) lid = substr(lid, 1, index(lid, " ") - 1) check_node(lid) } if (badnode) { print "\n# " ntype ": nodeguid 0x" nodeguid " failed" next } sub("\\(.*\\)", "", port) gsub("[\\[\\]]", "", port) if (system("'$IBPATH'/ibcheckportwidth -S '"$ca_info"' '$gflags' '$verbose' " lid " " port)) { if (!'$v' && oldlid != lid) { print "# Checked " ntype ": nodeguid 0x" nodeguid " with failure" oldlid = lid } pe++; } } /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} END { printf "\n*** WARNING ***: this command is deprecated\n" printf "\n## Summary: %d nodes checked, %d bad nodes found\n", nnodes, ne printf "## %d ports checked, %d ports with 1x width in error found\n", nports, pe } ' exit $rv rdma-core-56.1/infiniband-diags/scripts/ibclearcounters.in000066400000000000000000000033261477342711600237270ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [" \ "| -C ca_name -P ca_port -t(imeout) timeout_ms]" exit 255 } user_abort() { echo "Aborted" exit 1 } trap user_abort SIGINT gflags="" verbose="" v=0 topofile="" ca_info="" while [ "$1" ]; do case $1 in -h) usage ;; -P | -C | -t | -timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -*) usage ;; *) if [ "$topofile" ]; then usage fi topofile="$1" ;; esac shift done if [ "$topofile" ]; then netcmd="cat $topofile" else netcmd="$IBPATH/ibnetdiscover $ca_info" fi text="`eval $netcmd`" rv=$? 
echo "$text" | awk ' function clear_counters(lid) { if (system("'$IBPATH'/perfquery'"$ca_info"' '$gflags' -R -a " lid)) nodeerr++ } function clear_port_counters(lid, port) { if (system("'$IBPATH'/perfquery'"$ca_info"' '$gflags' -R " lid " " port)) nodeerr++ } /^Ca/ || /^Switch/ || /^Rt/ { nnodes++ ntype=$1; nodeguid=substr($3, 4, 16); ports=$2 if (ntype != "Switch") next lid = substr($0, index($0, "port 0 lid ") + 11) lid = substr(lid, 1, index(lid, " ") - 1) clear_counters(lid) } /^\[/ { port = $1 sub("\\(.*\\)", "", port) gsub("[\\[\\]]", "", port) if (ntype != "Switch") { lid = substr($0, index($0, " lid ") + 5) lid = substr(lid, 1, index(lid, " ") - 1) clear_port_counters(lid, port) } } /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} END { printf "\n*** WARNING ***: this command is deprecated; Please use \"ibqueryerrors -K\"\n" printf "\n## Summary: %d nodes cleared %d errors\n", nnodes, nodeerr } ' exit $rv rdma-core-56.1/infiniband-diags/scripts/ibclearerrors.in000066400000000000000000000034551477342711600234040ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [-N | -nocolor] [" \ "| -C ca_name -P ca_port -t(imeout) timeout_ms]" exit 255 } user_abort() { echo "Aborted" exit 1 } trap user_abort SIGINT gflags="" verbose="" v=0 oldlid="" topofile="" ca_info="" while [ "$1" ]; do case $1 in -h) usage ;; -N|-nocolor) gflags=-N ;; -P | -C | -t | -timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -*) usage ;; *) if [ "$topofile" ]; then usage fi topofile="$1" ;; esac shift done if [ "$topofile" ]; then netcmd="cat $topofile" else netcmd="$IBPATH/ibnetdiscover $ca_info" fi text="`eval $netcmd`" rv=$? 
echo "$text" | awk ' function clear_all_errors(lid, port) { if (system("'$IBPATH'/perfquery'"$ca_info"' '$gflags' -R -a " lid " " port " 0x0fff")) nodeerr++ } function clear_errors(lid, port) { if (system("'$IBPATH'/perfquery'"$ca_info"' '$gflags' -R " lid " " port " 0x0fff")) nodeerr++ } /^Ca/ || /^Switch/ || /^Rt/ { nnodes++ ntype=$1; nodeguid=substr($3, 4, 16); ports=$2 if (ntype != "Switch") next lid = substr($0, index($0, "port 0 lid ") + 11) lid = substr(lid, 1, index(lid, " ") - 1) clear_all_errors(lid, 255) } /^\[/ { port = $1 sub("\\(.*\\)", "", port) gsub("[\\[\\]]", "", port) if (ntype != "Switch") { lid = substr($0, index($0, " lid ") + 5) lid = substr(lid, 1, index(lid, " ") - 1) clear_errors(lid, port) } } /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} END { printf "\n*** WARNING ***: this command is deprecated; Please use \"ibqueryerrors -k\"\n" printf "\n## Summary: %d nodes cleared %d errors\n", nnodes, nodeerr } ' exit $rv rdma-core-56.1/infiniband-diags/scripts/ibdatacounters.in000066400000000000000000000043011477342711600235440ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [-b] [-v] [-N | -nocolor]" \ "[ \| -C ca_name -P ca_port -t(imeout) timeout_ms]" exit 255 } user_abort() { echo "Aborted" exit 1 } trap user_abort SIGINT gflags="" verbose="" brief="" v=0 ntype="" nodeguid="" topofile="" ca_info="" while [ "$1" ]; do case $1 in -h) usage ;; -N|-nocolor) gflags=-N ;; -v) verbose=-v brief="" v=1 ;; -b) brief=-b verbose="" ;; -P | -C | -t | -timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -*) usage ;; *) if [ "$topofile" ]; then usage fi topofile="$1" ;; esac shift done if [ "$topofile" ]; then netcmd="cat $topofile" else netcmd="$IBPATH/ibnetdiscover $ca_info" fi text="`eval $netcmd`" rv=$? 
echo "$text" | awk ' BEGIN { ne=0 } function check_node(lid, port) { if (system("'$IBPATH'/ibchecknode -S '"$ca_info"' '$gflags' '$verbose' " lid)) { ne++ print "\n# " ntype ": nodeguid 0x" nodeguid " failed" return 1; } return system("'$IBPATH'/ibdatacounts -S '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port); } /^Ca/ || /^Switch/ || /^Rt/ { nnodes++ ntype=$1; nodeguid=substr($3, 4, 16); ports=$2 if ('$v') print "\n# Checking " ntype ": nodeguid 0x" nodeguid err = 0; if (ntype != "Switch") next lid = substr($0, index($0, "port 0 lid ") + 11) lid = substr(lid, 1, index(lid, " ") - 1) err = check_node(lid, 255) } /^\[/ { nports++ port = $1 sub("\\(.*\\)", "", port) gsub("[\\[\\]]", "", port) if (ntype != "Switch") { lid = substr($0, index($0, " lid ") + 5) lid = substr(lid, 1, index(lid, " ") - 1) check_node(lid, port) } else if (err) system("'$IBPATH'/ibdatacounts -S '"$ca_info"' '$gflags' '$verbose' '$brief' " lid " " port); } /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} END { printf "*** WARNING ***: this command is deprecated; Please use \"ibqueryerrors --counters\n" printf "\n## Summary: %d nodes checked, %d bad nodes found\n", nnodes, ne printf "## %d ports checked\n", nports exit (ne ) } ' exit $rv rdma-core-56.1/infiniband-diags/scripts/ibdatacounts.in000066400000000000000000000053471477342711600232300ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [-b] [-v] [-G] [-N | -nocolor]" \ "[-C ca_name] [-P ca_port] [-t(imeout) timeout_ms] " \ "[]" exit 255 } green() { if [ "$bw" = "yes" ]; then if [ "$verbose" = "yes" ]; then echo $1 fi return fi if [ "$verbose" = "yes" ]; then echo -e "\\033[1;032m" $1 "\\033[0;39m" fi } red() { if [ "$bw" = "yes" ]; then echo $1 return fi echo -e "\\033[1;031m" $1 "\\033[0;39m" } guid_addr="" bw="" verbose="" brief="" ca_info="" suppress_deprecated="no" while [ "$1" ]; do case $1 in -G) guid_addr=yes ;; -nocolor|-N) bw=yes ;; -v) verbose=yes brief="" ;; -b) brief=yes verbose="" ;; -P | -C | -t | -timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -S) suppress_deprecated="yes" ;; -*) usage ;; *) break ;; esac shift done #default is all ports portnum=255 if [ $# -lt 1 ]; then usage fi if [ "$2" ]; then portnum=$2 fi if [ "$portnum" = "255" ]; then portname="all" else portname=$2 fi if [ "$guid_addr" ]; then if ! lid=`$IBPATH/ibaddr $ca_info -G -L $1 | awk '/failed/{exit 255} {print $3}'`; then echo -n "guid $1 address resolution: " red "FAILED" exit 255 fi guid=$1 else lid=$1 if ! temp=`$IBPATH/ibaddr $ca_info -L $1 | awk '/failed/{exit 255} {print $1}'`; then echo -n "lid $1 address resolution: " red "FAILED" exit 255 fi fi nodename=`$IBPATH/smpquery $ca_info nodedesc $lid | sed -e "s/^Node Description:\.*\(.*\)/\1/"` if [ "$suppress_deprecated" = "no" ]; then /usr/bin/echo -e "*** WARNING ***: this command is deprecated; Please use \"ibqueryerrors --counters\"\n\n" 1>&2 fi text="`eval $IBPATH/perfquery $ca_info $lid $portnum`" rv=$? 
if echo "$text" | awk -v mono=$bw -v brief=$brief -F '[.:]*' ' function blue(s) { if (brief == "yes") { return } if (mono) printf s else if (!quiet) { printf "\033[1;034m" s printf "\033[0;39m" } } # Only display Xmit/Rcv Pkts/Data /^# Port counters/ {print} /^CounterSelect/ {next} /AllPortSelect/ {next} /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} /^PortSelect/ { if ($2 != '$portnum') {err = err "error: lid '$lid' port " $2 " does not match query ('$portnum')\n"; exit 255}} $1 ~ "(Xmt|Rcv)(Pkts|Data)" { print $1 ":........................." $2 } END { if (err != "") { blue(err) exit 255 } if (warn != "") { blue(warn) exit 255 } exit 0 }' 2>&1 && test $rv -eq 0 ; then if [ "$verbose" = "yes" ]; then echo -n "Error on lid $lid ($nodename) port $portname: " green OK fi exit 0 else echo -n "Error on lid $lid ($nodename) port $portname: " red FAILED exit 255 fi rdma-core-56.1/infiniband-diags/scripts/ibdiscover.map000066400000000000000000000006051477342711600230400ustar00rootroot000000000000008f10400410015|8|"ISR 6000"|# SW-6IB4 Voltaire port 0 lid 5 5442ba00003080|24|"ISR 9024"|# ISR9024 Voltaire port 0 lid 2 8f10403960558|2|"HCA 1"|# MT23108 InfiniHost Mellanox Technologies 5442b100004900|2|"HCA 2"|# MT23108 InfiniHost Mellanox Technologies 8f10403961354|2|"HCA 3"|# MT23108 InfiniHost Mellanox Technologies 8f10403960984|2|"HCA 4"|# MT23108 InfiniHost Mellanox Technologies rdma-core-56.1/infiniband-diags/scripts/ibdiscover.pl000077500000000000000000000047131477342711600227050ustar00rootroot00000000000000#!/usr/bin/perl printf (STDERR "*** WARNING ***; this command is deprecated;\n"); printf (STDERR " see ibnetdiscover cache features\n"); printf (STDERR " and/or iblinkinfo \"check\" features\n\n"); # # Read mapfile # open(MAP, "< ibdiscover.map"); while () { ($pre, $port, $desc) = split /\|/; $val{$pre} = $desc; # print "Ack1 - $pre - $port - $desc\n"; } close(MAP); # # Read old topo map in # open(TOPO, "< ibdiscover.topo"); $topomap = 0; while () { $topomap = 1; ($localPort, $localGuid, $remotePort, $remoteGuid) = split /\|/; chomp $remoteGuid; $var = sprintf("%s|%2s|%2s|%s", $localGuid, $localPort, $remotePort, $remoteGuid); $topo{$var} = 1; # ${$pre} = $desc; # print "Ack1 - $pre - $port - $desc\n"; } close(TOPO); # # Read stdin and output enhanced output # # Search and replace =0x???? with value # Search and replace -000???? with value open(TOPO2, " >ibdiscover.topo.new"); while () { ($a, $b, $local, $d) = /([sh])([\s\S]*)=0x([a-f\d]*)([\s\S]*)/; if ($local ne "") { printf( "\n%s GUID: %s %s\n", ($a eq "s" ? "Switch" : "Host"), $local, $val{$local} ); chomp $local; $localGuid = $local; } else { ($localPort, $type, $remoteGuid, $remotePort) = /([\s\S]*)"([SH])\-000([a-f\d]*)"([\s\S]*)\n/; ($localPort) = $localPort =~ /\[(\d*)]/; ($remotePort) = $remotePort =~ /\[(\d*)]/; if ($remoteGuid ne "" && $localPort ne "") { printf(TOPO2 "%d|%s|%d|%s\n", $localPort, $localGuid, $remotePort, $remoteGuid); $var = sprintf("%s|%2s|%2s|%s", $localGuid, $localPort, $remotePort, $remoteGuid); $topo{$var} += 1; printf( "Local: %2s Remote: %2s %7s GUID: %s Location: %s\n", $localPort, $remotePort, ($type eq "H" ? "Host" : "Switch"), $remoteGuid, ($val{$remoteGuid} ne "" ? 
$val{$remoteGuid} : $remoteGuid) ); } } } close(STDIN); close(TOPO2); printf("\nDelta change in topo (change between successive runs)\n\n"); foreach $el (keys %topo) { if ($topo{$el} < 2 || $topomap == 0) { ($lg, $lp, $rp, $rg) = split(/\|/, $el); printf( "Link change: Local/Remote Port %2d/%2d Local/Remote GUID: %s/%s\n", $lp, $rp, $lg, $rg); printf("\tLocations: Local/Remote\n\t\t%s\n\t\t%s\n\n", $val{$lg}, $val{$rg}); } } printf (STDERR "*** WARNING ***; this command is deprecated;\n"); printf (STDERR " see ibnetdiscover cache features\n"); printf (STDERR " and/or iblinkinfo \"check\" features\n\n"); rdma-core-56.1/infiniband-diags/scripts/ibfindnodesusing.pl000077500000000000000000000145431477342711600241100ustar00rootroot00000000000000#!/usr/bin/perl # # Copyright (C) 2001-2003 The Regents of the University of California. # Copyright (c) 2006 The Regents of the University of California. # Copyright (c) 2007-2008 Voltaire, Inc. All rights reserved. # # Produced at Lawrence Livermore National Laboratory. # Written by Ira Weiny # Jim Garlick # Albert Chu # # This software is available to you under a choice of one of two # licenses. You may choose to be licensed under the terms of the GNU # General Public License (GPL) Version 2, available from the file # COPYING in the main directory of this source tree, or the # OpenIB.org BSD license below: # # Redistribution and use in source and binary forms, with or # without modification, are permitted provided that the following # conditions are met: # # - Redistributions of source code must retain the above # copyright notice, this list of conditions and the following # disclaimer. # # - Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials # provided with the distribution. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS # BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN # ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN # CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE. 
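#
# Illustrative example (switch GUID and port are hypothetical):
#   ibfindnodesusing.pl -C mthca0 -P 1 0x5442ba00003080 7
# prints, for both link directions, the hosts whose routes (per ibroute's
# LFT dump) traverse port 7 of that switch, compressed into ranges such as
# "lx01-03" by comp2()/compress_hostlist() below.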
#
use strict;

use Getopt::Std;
use IBswcountlimits;

my $ca_name = "";
my $ca_port = "";

# =========================================================================
#
sub get_hosts_routed
{
	my $sw_guid = $_[0];
	my $sw_port = $_[1];
	my @hosts = undef;
	my $extra_params = get_ca_name_port_param_string($ca_name, $ca_port);

	if ($sw_guid eq "") { return (@hosts); }

	my $data = `ibroute $extra_params -G $sw_guid`;
	my @lines = split("\n", $data);
	foreach my $line (@lines) {
		if ($line =~ /\w+\s+(\d+)\s+:\s+\(Channel Adapter.*:\s+'(.*)'\)/) {
			if ($1 == $sw_port) { push @hosts, $2; }
		}
	}
	return (@hosts);
}

# =========================================================================
#
sub usage_and_exit
{
	my $prog = $_[0];
	print "Usage: $prog [-R -C <ca_name> -P <ca_port>] <switch_guid|switch_name> <port>\n";
	print "   find a list of nodes which are routed through switch:port\n";
	print "   -R Recalculate ibnetdiscover information\n";
	print "   -C <ca_name> use selected Channel Adaptor name for queries\n";
	print "   -P <ca_port> use selected channel adaptor port for queries\n";
	exit 2;
}

my $argv0 = `basename $0`;
my $regenerate_map = undef;

chomp $argv0;
if (!getopts("hRC:P:")) { usage_and_exit $argv0; }
if (defined $Getopt::Std::opt_h) { usage_and_exit $argv0; }
if (defined $Getopt::Std::opt_R) { $regenerate_map = $Getopt::Std::opt_R; }
if (defined $Getopt::Std::opt_C) { $ca_name = $Getopt::Std::opt_C; }
if (defined $Getopt::Std::opt_P) { $ca_port = $Getopt::Std::opt_P; }

my $target_switch = format_guid($ARGV[0]);
my $target_port = $ARGV[1];

get_link_ends($regenerate_map, $ca_name, $ca_port);

if ($target_switch eq "" || $target_port eq "") { usage_and_exit $argv0; }

# sortn:
#
# sort a group of alphanumeric strings by the last group of digits on
# those strings, if such exists (good for numerically suffixed host lists)
#
sub sortn
{
	map { $$_[0] }
	sort { ($$a[1] || 0) <=> ($$b[1] || 0) } map { [$_, /(\d*)$/] } @_;
}

# comp2():
#
# takes a list of names and returns a hash of arrays, indexed by name prefix,
# each containing a list of numerical ranges describing the initial list.
#
# e.g.: %hash = comp2(lx01,lx02,lx03,lx05,dev0,dev1,dev21)
# will return:
#   $hash{"lx"}  = ["01-03", "05"]
#   $hash{"dev"} = ["0-1", "21"]
#
sub comp2
{
	my (%i) = ();
	my (%s) = ();

	# turn off warnings here to avoid perl complaints about
	# uninitialized values for members of %i and %s
	local ($^W) = 0;
	push(
		@{
			$s{$$_[0]}[
				(
					$s{$$_[0]}[$i{$$_[0]}][$#{$s{$$_[0]}[$i{$$_[0]}]}]
					== ($$_[1] - 1)
				) ? $i{$$_[0]} : ++$i{$$_[0]}
			]
		},
		($$_[1])
	) for map { [/(.*?)(\d*)$/] } sortn(@_);

	for my $key (keys %s) {
		@{$s{$key}} = map { $#$_ > 0 ?
"$$_[0]-$$_[$#$_]" : @{$_} } @{$s{$key}}; } return %s; } sub compress_hostlist { my %rng = comp2(@_); my @list = (); local $" = ","; foreach my $k (keys %rng) { @{$rng{$k}} = map { "$k$_" } @{$rng{$k}}; } @list = map { @{$rng{$_}} } sort keys %rng; return "@list"; } # ========================================================================= # sub main { my $found_switch = undef; my $cache_file = get_cache_file($ca_name, $ca_port); open IBNET_TOPO, "<$cache_file" or die "Failed to open ibnet topology\n"; my $in_switch = "no"; my $switch_guid = ""; my $desc = undef; my %ports = undef; while (my $line = ) { if ($line =~ /^Switch.*\"S-(.*)\"\s+# (.*) port.*/) { $switch_guid = $1; $desc = $2; if ("0x$switch_guid" eq $target_switch || $desc =~ /.*$target_switch\s+.*/) { $found_switch = "yes"; goto FOUND; } } if ($line =~ /^Ca.*/ || $line =~ /^Rt.*/) { $in_switch = "no"; } if ($line =~ /^\[(\d+)\].*/ && $in_switch eq "yes") { $ports{$1} = $line; } } FOUND: close IBNET_TOPO; if (!$found_switch) { print "Switch \"$target_switch\" not found\n"; print " Try running with the \"-R\" or \"-P\" option.\n"; exit 1; } $switch_guid = "0x$switch_guid"; my $hr = $IBswcountlimits::link_ends{$switch_guid}{$target_port}; my $rem_sw_guid = $hr->{rem_guid}; my $rem_sw_port = $hr->{rem_port}; my $rem_sw_desc = $hr->{rem_desc}; my @hosts = undef; @hosts = get_hosts_routed($switch_guid, $target_port); my $hosts = compress_hostlist(@hosts); @hosts = split ",", $hosts; print "$switch_guid $target_port ($desc) ==>> $rem_sw_guid $rem_sw_port ($rem_sw_desc)\n"; print "@hosts\n\n"; @hosts = get_hosts_routed($rem_sw_guid, $rem_sw_port); $hosts = compress_hostlist(@hosts); @hosts = split ",", $hosts; print "$switch_guid $target_port ($desc) <<== $rem_sw_guid $rem_sw_port ($rem_sw_desc)\n"; print "@hosts\n"; } main rdma-core-56.1/infiniband-diags/scripts/ibhosts.in000066400000000000000000000017631477342711600222210ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [ | -y mkey" \ "-C ca_name -P ca_port -t timeout_ms]" exit 255 } topofile="" ca_info="" mkey="0" while [ "$1" ]; do case $1 in -h | --help) usage ;; -y | --m_key) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi shift mkey="$1" ;; -P | --Port | -C | --Ca | -t | --timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -*) usage ;; *) if [ "$topofile" ]; then usage fi topofile="$1" ;; esac shift done if [ "$topofile" ]; then netcmd="cat $topofile" else netcmd="$IBPATH/ibnetdiscover -y $mkey $ca_info" fi text="`eval $netcmd`" rv=$? echo "$text" | awk ' /^Ca/ {print $1 "\t: 0x" substr($3, 4, 16) " ports " $2 " "\ substr($0, match($0, "#[ \t]*")+RLENGTH)} /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} ' exit $rv rdma-core-56.1/infiniband-diags/scripts/ibidsverify.pl000077500000000000000000000157731477342711600231030ustar00rootroot00000000000000#!/usr/bin/perl # # Copyright (c) 2007-2008 Voltaire, Inc. All rights reserved. # Copyright (c) 2006 The Regents of the University of California. # # This software is available to you under a choice of one of two # licenses. 
# You may choose to be licensed under the terms of the GNU
# General Public License (GPL) Version 2, available from the file
# COPYING in the main directory of this source tree, or the
# OpenIB.org BSD license below:
#
#     Redistribution and use in source and binary forms, with or
#     without modification, are permitted provided that the following
#     conditions are met:
#
#      - Redistributions of source code must retain the above
#        copyright notice, this list of conditions and the following
#        disclaimer.
#
#      - Redistributions in binary form must reproduce the above
#        copyright notice, this list of conditions and the following
#        disclaimer in the documentation and/or other materials
#        provided with the distribution.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
# BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
# ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#

use strict;

use Getopt::Std;
use IBswcountlimits;

my $return_code = 0;

sub usage_and_exit
{
	my $prog = $_[0];
	print "Usage: $prog [-Rh]\n";
	print "   Validate LIDs and GUIDs (check for zero and duplicates) in the local subnet\n";
	print "   -h This help message\n";
	print "   -R Recalculate ibnetdiscover information (Default is to reuse ibnetdiscover output)\n";
	print "   -C <ca_name> use selected Channel Adaptor name for queries\n";
	print "   -P <ca_port> use selected channel adaptor port for queries\n";
	exit 2;
}

my $argv0 = `basename $0`;
my $regenerate_map = undef;
my $ca_name = "";
my $ca_port = "";

chomp $argv0;
if (!getopts("hRC:P:")) { usage_and_exit $argv0; }
if (defined $Getopt::Std::opt_h) { usage_and_exit $argv0; }
if (defined $Getopt::Std::opt_R) { $regenerate_map = $Getopt::Std::opt_R; }
if (defined $Getopt::Std::opt_C) { $ca_name = $Getopt::Std::opt_C; }
if (defined $Getopt::Std::opt_P) { $ca_port = $Getopt::Std::opt_P; }

sub validate_non_zero_lid
{
	my ($lid)      = shift(@_);
	my ($nodeguid) = shift(@_);
	my ($nodetype) = shift(@_);

	if ($lid eq 0) {
		print "LID 0 found for $nodetype NodeGUID $nodeguid\n";
		return 1;
	}
	return 0;
}

sub validate_non_zero_guid
{
	my ($lid)      = shift(@_);
	my ($guid)     = shift(@_);
	my ($nodetype) = shift(@_);

	if ($guid eq 0x0) {
		print "$nodetype GUID 0x0 found with LID $lid\n";
		return 1;
	}
	return 0;
}

$insert_lid::lids = undef;
$insert_nodeguid::nodeguids = undef;
$insert_portguid::portguids = undef;

sub insert_lid
{
	my ($lid)      = shift(@_);
	my ($nodeguid) = shift(@_);
	my ($nodetype) = shift(@_);
	my $rec = undef;
	my $status = "";

	$status = validate_non_zero_lid($lid, $nodeguid, $nodetype);
	if ($status eq 0) {
		if (defined($insert_lid::lids{$lid})) {
			print "LID $lid already defined for NodeGUID $insert_lid::lids{$lid}->{nodeguid}\n";
			$return_code = 1;
		} else {
			$rec = {lid => $lid, nodeguid => $nodeguid};
			$insert_lid::lids{$lid} = $rec;
		}
	} else {
		$return_code = $status;
	}
}

sub insert_nodeguid
{
	my ($lid)      = shift(@_);
	my ($nodeguid) = shift(@_);
	my ($nodetype) = shift(@_);
	my $rec = undef;
	my $status = "";

	$status = validate_non_zero_guid($lid, $nodeguid, $nodetype);
	if ($status eq 0) {
		if (defined($insert_nodeguid::nodeguids{$nodeguid})) {
			print "NodeGUID $nodeguid already defined for LID $insert_nodeguid::nodeguids{$nodeguid}->{lid}\n";
			$return_code = 1;
		} else {
			$rec = {lid => $lid, nodeguid => $nodeguid};
			$insert_nodeguid::nodeguids{$nodeguid} = $rec;
		}
	} else {
		$return_code = $status;
	}
}

sub validate_portguid
{
	my ($portguid) = shift(@_);
	my ($nodeguid) = shift(@_);

	if (($nodeguid ne $portguid)
		&& defined($insert_nodeguid::nodeguids{$portguid})) {
		print "PortGUID $portguid is an invalid duplicate of a NodeGUID\n";
		$return_code = 1;
	}
}

sub insert_portguid
{
	my ($lid)      = shift(@_);
	my ($portguid) = shift(@_);
	my ($nodetype) = shift(@_);
	my ($nodeguid) = shift(@_);
	my $rec = undef;
	my $status = "";

	$status = validate_non_zero_guid($lid, $portguid, $nodetype);
	if ($status eq 0) {
		if (defined($insert_portguid::portguids{$portguid})) {
			print "PortGUID $portguid already defined for LID $insert_portguid::portguids{$portguid}->{lid}\n";
			$return_code = 1;
		} else {
			$rec = {lid => $lid, portguid => $portguid};
			$insert_portguid::portguids{$portguid} = $rec;
			validate_portguid($portguid, $nodeguid);
		}
	} else {
		$return_code = $status;
	}
}

sub main
{
	my $cache_file = get_cache_file($ca_name, $ca_port);

	if ($regenerate_map || !(-f "$cache_file")) {
		generate_ibnetdiscover_topology($ca_name, $ca_port);
	}

	open IBNET_TOPO, "<$cache_file"
		or die "Failed to open ibnet topology: $!\n";

	my $nodetype  = "";
	my $nodeguid  = "";
	my $portguid  = "";
	my $lid       = "";
	my $line      = "";
	my $firstport = "";

	while ($line = <IBNET_TOPO>) {

		if ($line =~ /^caguid=(.*)/ || $line =~ /^rtguid=(.*)/) {
			$nodeguid = $1;
			$nodetype = "";
		}

		if ($line =~ /^switchguid=(.*)/) {
			$nodeguid = $1;
			$portguid = "";
			$nodetype = "";
		}
		if ($line =~ /^switchguid=(.*)\((.*)\)/) {
			$nodeguid = $1;
			$portguid = "0x" . $2;
		}

		if ($line =~ /^Switch.*\"S-(.*)\"\s+# (.*) port.* lid (\d+) .*/) {
			$nodetype  = "switch";
			$firstport = "yes";
			$lid       = $3;
			insert_lid($lid, $nodeguid, $nodetype);
			insert_nodeguid($lid, $nodeguid, $nodetype);
			if ($portguid ne "") {
				insert_portguid($lid, $portguid, $nodetype, $nodeguid);
			}
		}
		if ($line =~ /^Ca.*/) {
			$nodetype  = "ca";
			$firstport = "yes";
		}
		if ($line =~ /^Rt.*/) {
			$nodetype  = "router";
			$firstport = "yes";
		}

		if ($nodetype eq "ca" || $nodetype eq "router") {
			if ($line =~ /"S-(.*)\# lid (\d+) .*/) {
				$lid = $2;
				insert_lid($lid, $nodeguid, $nodetype);
				if ($firstport eq "yes") {
					insert_nodeguid($lid, $nodeguid, $nodetype);
					$firstport = "no";
				}
			}
			if ($line =~ /^.*"H-(.*)\# lid (\d+) .*/) {
				$lid = $2;
				insert_lid($lid, $nodeguid, $nodetype);
				if ($firstport eq "yes") {
					insert_nodeguid($lid, $nodeguid, $nodetype);
					$firstport = "no";
				}
			}
			if ($line =~ /^.*"R-(.*)\# lid (\d+) .*/) {
				$lid = $2;
				insert_lid($lid, $nodeguid, $nodetype);
				if ($firstport eq "yes") {
					insert_nodeguid($lid, $nodeguid, $nodetype);
					$firstport = "no";
				}
			}
			if ($line =~ /^\[(\d+)\]\((.*)\)/) {
				$portguid = "0x" . $2;
				insert_portguid($lid, $portguid, $nodetype, $nodeguid);
			}
		}
	}

	close IBNET_TOPO;
}
main;
exit ($return_code);
rdma-core-56.1/infiniband-diags/scripts/iblinkinfo.pl.in000077500000000000000000000032631477342711600233040ustar00rootroot00000000000000#!/usr/bin/perl
#
# Copyright (c) 2009 Lawrence Livermore National Security
#
# Produced at Lawrence Livermore National Laboratory.
# Written by Ira Weiny .
#
# This software is available to you under a choice of one of two
# licenses.
You may choose to be licensed under the terms of the GNU # General Public License (GPL) Version 2, available from the file # COPYING in the main directory of this source tree, or the # OpenIB.org BSD license below: # # Redistribution and use in source and binary forms, with or # without modification, are permitted provided that the following # conditions are met: # # - Redistributions of source code must retain the above # copyright notice, this list of conditions and the following # disclaimer. # # - Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials # provided with the distribution. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS # BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN # ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN # CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE. # # this is now just a wrapper for the C based utility $str = join " ",@ARGV; system "@IBSCRIPTPATH@/iblinkinfo $str"; printf (STDERR "\n*** WARNING ***: this command has been replaced by iblinkinfo\n\n"); rdma-core-56.1/infiniband-diags/scripts/ibnodes.in000066400000000000000000000001271477342711600221620ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} $IBPATH/ibhosts $@; $IBPATH/ibswitches $@ rdma-core-56.1/infiniband-diags/scripts/ibprintca.pl000077500000000000000000000106231477342711600225240ustar00rootroot00000000000000#!/usr/bin/perl # # Copyright (c) 2006 The Regents of the University of California. # Copyright (c) 2007-2008 Voltaire, Inc. All rights reserved. # # Produced at Lawrence Livermore National Laboratory. # Written by Ira Weiny . # # This software is available to you under a choice of one of two # licenses. You may choose to be licensed under the terms of the GNU # General Public License (GPL) Version 2, available from the file # COPYING in the main directory of this source tree, or the # OpenIB.org BSD license below: # # Redistribution and use in source and binary forms, with or # without modification, are permitted provided that the following # conditions are met: # # - Redistributions of source code must retain the above # copyright notice, this list of conditions and the following # disclaimer. # # - Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials # provided with the distribution. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS # BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN # ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN # CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE. 
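#
# Illustrative usage (names/GUIDs are hypothetical):
#   ibprintca.pl -l                  # list all CAs found in the cached topology
#   ibprintca.pl "HCA 1"             # print the ibnetdiscover stanza for that CA
#   ibprintca.pl -G 0x8f10403960558  # same lookup, addressed by node GUID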
#
use strict;

use Getopt::Std;
use IBswcountlimits;

printf (STDERR "*** WARNING ***: this command is deprecated; Please use \"ibhosts\"\n\n");

# =========================================================================
#
sub usage_and_exit
{
	my $prog = $_[0];
	print "Usage: $prog [-R -l] [-G <ca_guid> | <ca_name>]\n";
	print "   print only the ca specified from the ibnetdiscover output\n";
	print "   -R Recalculate ibnetdiscover information\n";
	print "   -l list cas\n";
	print "   -C <ca_name> use selected channel adaptor name for queries\n";
	print "   -P <ca_port> use selected channel adaptor port for queries\n";
	print "   -G node is specified with GUID\n";
	exit 2;
}

my $argv0 = `basename $0`;
my $regenerate_map = undef;
my $list_hcas = undef;
my $ca_name = "";
my $ca_port = "";
my $name_is_guid = "no";

chomp $argv0;
if (!getopts("hRlC:P:G")) { usage_and_exit $argv0; }
if (defined $Getopt::Std::opt_h) { usage_and_exit $argv0; }
if (defined $Getopt::Std::opt_R) { $regenerate_map = $Getopt::Std::opt_R; }
if (defined $Getopt::Std::opt_l) { $list_hcas = $Getopt::Std::opt_l; }
if (defined $Getopt::Std::opt_C) { $ca_name = $Getopt::Std::opt_C; }
if (defined $Getopt::Std::opt_P) { $ca_port = $Getopt::Std::opt_P; }
if (defined $Getopt::Std::opt_G) { $name_is_guid = "yes"; }

my $target_hca = $ARGV[0];

if ($name_is_guid eq "yes") { $target_hca = format_guid($target_hca); }

my $cache_file = get_cache_file($ca_name, $ca_port);

if ($regenerate_map || !(-f "$cache_file")) {
	generate_ibnetdiscover_topology($ca_name, $ca_port);
}

if ($list_hcas) {
	system("ibhosts $cache_file");
	exit 1;
}

if ($target_hca eq "") { usage_and_exit $argv0; }

# =========================================================================
#
sub main
{
	my $found_hca = 0;
	open IBNET_TOPO, "<$cache_file" or die "Failed to open ibnet topology\n";
	my $in_hca = "no";
	my %ports = undef;
	while (my $line = <IBNET_TOPO>) {
		if ($line =~ /^Ca.*\"H-(.*)\"\s+# (.*)/) {
			my $guid = $1;
			my $desc = $2;
			if ($in_hca eq "yes") {
				$in_hca = "no";
				foreach my $port (sort { $a <=> $b } (keys %ports)) {
					print $ports{$port};
				}
			}
			if ("0x$guid" eq $target_hca
				|| $desc =~ /[\s\"]$target_hca[\s\"]/) {
				print $line;
				$in_hca = "yes";
				$found_hca++;
			}
		}
		if ($line =~ /^Switch.*/ || $line =~ /^Rt.*/) { $in_hca = "no"; }
		if ($line =~ /^\[(\d+)\].*/ && $in_hca eq "yes") {
			$ports{$1} = $line;
		}
	}
	if ($in_hca eq "yes") {
		foreach my $port (sort { $a <=> $b } (keys %ports)) {
			print $ports{$port};
		}
	}
	if ($found_hca == 0) {
		die "\"$target_hca\" not found\n" .
			"   Try running with the \"-R\" option.\n" .
			"   If still not found the node is probably down.\n";
	}
	if ($found_hca > 1) {
		print "\nWARNING: Found $found_hca CA's with the name \"$target_hca\"\n";
	}
	close IBNET_TOPO;
}
main
rdma-core-56.1/infiniband-diags/scripts/ibprintrt.pl000077500000000000000000000104241477342711600225650ustar00rootroot00000000000000#!/usr/bin/perl
#
# Copyright (c) 2006 The Regents of the University of California.
# Copyright (c) 2007-2008 Voltaire, Inc. All rights reserved.
#
# Produced at Lawrence Livermore National Laboratory.
# Written by Ira Weiny .
#
# This software is available to you under a choice of one of two
# licenses.
# You may choose to be licensed under the terms of the GNU
# General Public License (GPL) Version 2, available from the file
# COPYING in the main directory of this source tree, or the
# OpenIB.org BSD license below:
#
#     Redistribution and use in source and binary forms, with or
#     without modification, are permitted provided that the following
#     conditions are met:
#
#      - Redistributions of source code must retain the above
#        copyright notice, this list of conditions and the following
#        disclaimer.
#
#      - Redistributions in binary form must reproduce the above
#        copyright notice, this list of conditions and the following
#        disclaimer in the documentation and/or other materials
#        provided with the distribution.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
# BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
# ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#

use strict;

use Getopt::Std;
use IBswcountlimits;

printf (STDERR "*** WARNING ***: this command is deprecated; Please use \"ibrouters\"\n\n");

# =========================================================================
#
sub usage_and_exit
{
	my $prog = $_[0];
	print "Usage: $prog [-R -l] [-G <rt_guid> | <rt_name>]\n";
	print "   print only the rt specified from the ibnetdiscover output\n";
	print "   -R Recalculate ibnetdiscover information\n";
	print "   -l list rts\n";
	print "   -C <ca_name> use selected channel adaptor name for queries\n";
	print "   -P <ca_port> use selected channel adaptor port for queries\n";
	print "   -G node is specified with GUID\n";
	exit 2;
}

my $argv0 = `basename $0`;
my $regenerate_map = undef;
my $list_rts = undef;
my $ca_name = "";
my $ca_port = "";
my $name_is_guid = "no";

chomp $argv0;
if (!getopts("hRlC:P:G")) { usage_and_exit $argv0; }
if (defined $Getopt::Std::opt_h) { usage_and_exit $argv0; }
if (defined $Getopt::Std::opt_R) { $regenerate_map = $Getopt::Std::opt_R; }
if (defined $Getopt::Std::opt_l) { $list_rts = $Getopt::Std::opt_l; }
if (defined $Getopt::Std::opt_C) { $ca_name = $Getopt::Std::opt_C; }
if (defined $Getopt::Std::opt_P) { $ca_port = $Getopt::Std::opt_P; }
if (defined $Getopt::Std::opt_G) { $name_is_guid = "yes"; }

my $target_rt = $ARGV[0];

if ($name_is_guid eq "yes") { $target_rt = format_guid($target_rt); }

my $cache_file = get_cache_file($ca_name, $ca_port);

if ($regenerate_map || !(-f "$cache_file")) {
	generate_ibnetdiscover_topology($ca_name, $ca_port);
}

if ($list_rts) {
	system("ibrouters $cache_file");
	exit 1;
}

if ($target_rt eq "") { usage_and_exit $argv0; }

# =========================================================================
#
sub main
{
	my $found_rt = 0;
	open IBNET_TOPO, "<$cache_file" or die "Failed to open ibnet topology\n";
	my $in_rt = "no";
	my %ports = undef;
	while (my $line = <IBNET_TOPO>) {
		if ($line =~ /^Rt.*\"R-(.*)\"\s+# (.*)/) {
			my $guid = $1;
			my $desc = $2;
			if ($in_rt eq "yes") {
				$in_rt = "no";
				foreach my $port (sort { $a <=> $b } (keys %ports)) {
					print $ports{$port};
				}
			}
			if ("0x$guid" eq $target_rt
				|| $desc =~ /[\s\"]$target_rt[\s\"]/) {
				print $line;
				$in_rt = "yes";
				$found_rt++;
			}
		}
		if ($line =~ /^Switch.*/ || $line =~ /^Ca.*/) { $in_rt = "no"; }
		if ($line =~ /^\[(\d+)\].*/ && $in_rt eq "yes") {
			$ports{$1} = $line;
		}
	}
	if ($found_rt == 0) {
		die "\"$target_rt\" not found\n" .
" Try running with the \"-R\" option.\n" . " If still not found the node is probably down.\n"; } if ($found_rt > 1) { print "\nWARNING: Found $found_rt Router's with the name \"$target_rt\"\n"; } close IBNET_TOPO; } main rdma-core-56.1/infiniband-diags/scripts/ibprintswitch.pl000077500000000000000000000104651477342711600234460ustar00rootroot00000000000000#!/usr/bin/perl # # Copyright (c) 2008 Voltaire, Inc. All rights reserved. # Copyright (c) 2006 The Regents of the University of California. # # Produced at Lawrence Livermore National Laboratory. # Written by Ira Weiny . # # This software is available to you under a choice of one of two # licenses. You may choose to be licensed under the terms of the GNU # General Public License (GPL) Version 2, available from the file # COPYING in the main directory of this source tree, or the # OpenIB.org BSD license below: # # Redistribution and use in source and binary forms, with or # without modification, are permitted provided that the following # conditions are met: # # - Redistributions of source code must retain the above # copyright notice, this list of conditions and the following # disclaimer. # # - Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials # provided with the distribution. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS # BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN # ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN # CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE. 
#
use strict;

use Getopt::Std;
use IBswcountlimits;

printf (STDERR "*** WARNING ***: this command is deprecated; Please use \"ibswitches\"\n\n");

# =========================================================================
#
sub usage_and_exit
{
	my $prog = $_[0];
	print "Usage: $prog [-R -l] [-G <switch_guid> | <switch_name>]\n";
	print "   print only the switch specified from the ibnetdiscover output\n";
	print "   -R Recalculate ibnetdiscover information\n";
	print "   -l list switches\n";
	print "   -C <ca_name> use selected channel adaptor name for queries\n";
	print "   -P <ca_port> use selected channel adaptor port for queries\n";
	print "   -G node is specified with GUID\n";
	exit 2;
}

my $argv0 = `basename $0`;
my $regenerate_map = undef;
my $list_switches = undef;
my $ca_name = "";
my $ca_port = "";
my $name_is_guid = "no";

chomp $argv0;
if (!getopts("hRlC:P:G")) { usage_and_exit $argv0; }
if (defined $Getopt::Std::opt_h) { usage_and_exit $argv0; }
if (defined $Getopt::Std::opt_R) { $regenerate_map = $Getopt::Std::opt_R; }
if (defined $Getopt::Std::opt_l) { $list_switches = $Getopt::Std::opt_l; }
if (defined $Getopt::Std::opt_C) { $ca_name = $Getopt::Std::opt_C; }
if (defined $Getopt::Std::opt_P) { $ca_port = $Getopt::Std::opt_P; }
if (defined $Getopt::Std::opt_G) { $name_is_guid = "yes"; }

my $target_switch = $ARGV[0];

if ($name_is_guid eq "yes") { $target_switch = format_guid($target_switch); }

my $cache_file = get_cache_file($ca_name, $ca_port);

if ($regenerate_map || !(-f "$cache_file")) {
	generate_ibnetdiscover_topology($ca_name, $ca_port);
}

if ($list_switches) {
	system("ibswitches $cache_file");
	exit 1;
}

if ($target_switch eq "") { usage_and_exit $argv0; }

# =========================================================================
#
sub main
{
	my $found_switch = 0;
	open IBNET_TOPO, "<$cache_file" or die "Failed to open ibnet topology\n";
	my $in_switch = "no";
	my %ports = undef;
	while (my $line = <IBNET_TOPO>) {
		if ($line =~ /^Switch.*\"S-(.*)\"\s+# (.*) port.*/) {
			my $guid = $1;
			my $desc = $2;
			if ($in_switch eq "yes") {
				$in_switch = "no";
				foreach my $port (sort { $a <=> $b } (keys %ports)) {
					print $ports{$port};
				}
			}
			if ("0x$guid" eq $target_switch
				|| $desc =~ /[\s\"]$target_switch[\s\"]/) {
				print $line;
				$in_switch = "yes";
				$found_switch++;
			}
		}
		if ($line =~ /^Ca.*/) { $in_switch = "no"; }
		if ($line =~ /^\[(\d+)\].*/ && $in_switch eq "yes") {
			$ports{$1} = $line;
		}
	}
	if ($found_switch == 0) {
		die "Switch \"$target_switch\" not found\n" .
			"   Try running with the \"-R\" option.\n";
	}
	if ($found_switch > 1) {
		print "\nWARNING: Found $found_switch switches with the name \"$target_switch\"\n";
	}
	close IBNET_TOPO;
}
main
rdma-core-56.1/infiniband-diags/scripts/ibqueryerrors.pl.in000066400000000000000000000032701477342711600240700ustar00rootroot00000000000000#!/usr/bin/perl
#
# Copyright (c) 2009 Lawrence Livermore National Security
#
# Produced at Lawrence Livermore National Laboratory.
# Written by Ira Weiny .
#
# This software is available to you under a choice of one of two
# licenses. You may choose to be licensed under the terms of the GNU
# General Public License (GPL) Version 2, available from the file
# COPYING in the main directory of this source tree, or the
# OpenIB.org BSD license below:
#
#     Redistribution and use in source and binary forms, with or
#     without modification, are permitted provided that the following
#     conditions are met:
#
#      - Redistributions of source code must retain the above
#        copyright notice, this list of conditions and the following
#        disclaimer.
# # - Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials # provided with the distribution. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS # BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN # ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN # CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE. # # this is now just a wrapper for the C based utility $str = join " ",@ARGV; system "@IBSCRIPTPATH@/ibqueryerrors $str"; printf (STDERR "\n*** WARNING ***: this command has been replaced by ibqueryerrors\n\n"); rdma-core-56.1/infiniband-diags/scripts/ibrouters.in000066400000000000000000000017631477342711600225640ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [ | -y mkey" \ "-C ca_name -P ca_port -t timeout_ms]" exit 255 } topofile="" ca_info="" mkey="0" while [ "$1" ]; do case $1 in -h | --help) usage ;; -y | --m_key) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi shift mkey="$1" ;; -P | --Port | -C | --Ca | -t | --timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -*) usage ;; *) if [ "$topofile" ]; then usage fi topofile="$1" ;; esac shift done if [ "$topofile" ]; then netcmd="cat $topofile" else netcmd="$IBPATH/ibnetdiscover -y $mkey $ca_info" fi text="`eval $netcmd`" rv=$? echo "$text" | awk ' /^Rt/ {print $1 "\t: 0x" substr($3, 4, 16) " ports " $2 " "\ substr($0, match($0, "#[ \t]*")+RLENGTH)} /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} ' exit $rv rdma-core-56.1/infiniband-diags/scripts/ibstatus000077500000000000000000000035161477342711600220000ustar00rootroot00000000000000#!/bin/sh # Usage ibstatus [devname[:port]] infiniband_base="/sys/class/infiniband" def_ibdev="mthca0" usage() { prog=`basename $0` echo "Usage: " $prog " [-h] [devname[:portnum]]" echo " -h: this help screen" echo " Examples:" echo " $prog mthca1 # shows status of all ports of 'mthca1'" echo " $prog mthca0:2 # shows status port number 2 of 'mthca0'" echo " $prog # default: shows status of all '$def_ibdev' ports" exit 255 } fatal() { echo "Fatal error: " $* exit 255 } port_status() { port_dir="$infiniband_base/$1/ports/$2" echo "Infiniband device '$1' port $2 status:" echo " default gid: " `[ -r $port_dir/gids/0 ] && cat $port_dir/gids/0 || echo unknown` echo " base lid: " `[ -r $port_dir/lid ] && cat $port_dir/lid || echo unknown` echo " sm lid: " `[ -r $port_dir/sm_lid ] && cat $port_dir/sm_lid || echo unknown` echo " state: " `[ -r $port_dir/state ] && cat $port_dir/state || echo unknown` echo " phys state: " `[ -r $port_dir/phys_state ] && cat $port_dir/phys_state || echo unknown` echo " rate: " `[ -r $port_dir/rate ] && cat $port_dir/rate || echo unknown` echo " link_layer: " `[ -r $port_dir/link_layer ] && cat $port_dir/link_layer || echo IB` echo } ib_status() { ports_dir="$infiniband_base/$1/ports" if ! 
[ -d "$ports_dir" ]; then fatal "device '$1': sys files not found ($ports_dir)" fi if [ "$2" = "+" ]; then ports=`(cd "$infiniband_base/$1/ports" 2>/dev/null || fatal No devices; echo *)` else ports=$2 fi for i in $ports; do port_status $1 $i done } if [ "$1" = "-h" ]; then usage fi if [ -z "$1" ]; then cd $infiniband_base 2>/dev/null || fatal No devices for dev in *; do ib_status $dev "+"; done exit 0 fi while [ "$1" ]; do dev=`echo $1 | sed 's/:.*$//'` port=`echo $1 | sed 's/^.*://'` if [ "$port" = "$dev" ]; then port="+" fi ib_status $dev $port shift done rdma-core-56.1/infiniband-diags/scripts/ibswitches.in000066400000000000000000000026511477342711600227070ustar00rootroot00000000000000#!/bin/sh IBPATH=${IBPATH:-@IBSCRIPTPATH@} usage() { echo Usage: `basename $0` "[-h] [ | -y mkey" \ "-C ca_name -P ca_port -t timeout_ms]" exit 255 } topofile="" ca_info="" mkey="0" while [ "$1" ]; do case $1 in -h | --help) usage ;; -y | --m_key) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi shift mkey="$1" ;; -P | --Port | -C | --Ca | -t | --timeout) case $2 in -*) usage ;; esac if [ x$2 = x ] ; then usage fi ca_info="$ca_info $1 $2" shift ;; -*) usage ;; *) if [ "$topofile" ]; then usage fi topofile="$1" ;; esac shift done if [ "$topofile" ]; then netcmd="cat $topofile" else netcmd="$IBPATH/ibnetdiscover -y $mkey $ca_info" fi text="`eval $netcmd`" rv=$? echo "$text" | awk ' /^Switch/ { l=$0 desc=substr(l, match(l, "#[ \t]*")+RLENGTH) pi=match(desc, "port 0.*") pinfo=substr(desc, pi) desc=substr(desc, 1, pi-2) type="base port 0" ti=match(desc, type) if (ti==0) { type="enhanced port 0" ti=match(desc, type) if (ti!=0) desc=substr(desc, 1, ti-2) } else desc=substr(desc, 1, ti-2) if (ti==0) print $1 "\t: 0x" substr($3, 4, 16) " ports " $2 " "\ desc " " pinfo else print $1 "\t: 0x" substr($3, 4, 16) " ports " $2 " "\ desc " " type " " pinfo} /^ib/ {print $0; next} /ibpanic:/ {print $0} /ibwarn:/ {print $0} /iberror:/ {print $0} ' exit $rv rdma-core-56.1/infiniband-diags/scripts/ibswportwatch.pl000077500000000000000000000117061477342711600234540ustar00rootroot00000000000000#!/usr/bin/perl # # Copyright (c) 2008 Voltaire, Inc. All rights reserved. # Copyright (c) 2006 The Regents of the University of California. # # Produced at Lawrence Livermore National Laboratory. # Written by Ira Weiny . # # This software is available to you under a choice of one of two # licenses. You may choose to be licensed under the terms of the GNU # General Public License (GPL) Version 2, available from the file # COPYING in the main directory of this source tree, or the # OpenIB.org BSD license below: # # Redistribution and use in source and binary forms, with or # without modification, are permitted provided that the following # conditions are met: # # - Redistributions of source code must retain the above # copyright notice, this list of conditions and the following # disclaimer. # # - Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials # provided with the distribution. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND # NONINFRINGEMENT. 
# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
# BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
# ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#

use strict;

use Getopt::Std;
use IBswcountlimits;

my $sw_addr = "";
my $sw_port = "";
my $verbose = undef;

# =========================================================================
#
sub print_verbose
{
	if ($verbose) {
		print $_[0];
	}
}

# =========================================================================
#
sub print_all_counts
{
	if (!$verbose) { return; }
	print "   Counter\t\t\tNew ==> Old\n";
	foreach my $cnt (@IBswcountlimits::counters) {
		print "   $cnt\t\t\t$IBswcountlimits::new_counts{$cnt} ==> $IBswcountlimits::cur_counts{$cnt}\n";
	}
}

# =========================================================================
#
sub usage_and_exit
{
	my $prog = $_[0];
	print "Usage: $prog [-p <pause_time> -b -v -n <cycles> -G] <guid|lid> <port>\n";
	print "   Attempt to diagnose a problem on a port\n";
	print "   Run this on a link while a job is running which utilizes that link.\n";
	print "   -p <pause_time> define the amount of time between counter polls (default $IBswcountlimits::pause_time)\n";
	print "   -v Be verbose\n";
	print "   -n <cycles> run n cycles then exit (default -1 == forever)\n";
	print "   -G Address provided is a GUID\n";
	print "   -b report bytes/second packets/second\n";
	exit 2;
}

# =========================================================================
#
sub clear_counters
{
	# clear the counters
	foreach my $count (@IBswcountlimits::counters) {
		$IBswcountlimits::cur_counts{$count} = 0;
		$IBswcountlimits::new_counts{$count} = 0;
	}
}

# =========================================================================
#
sub mv_counts
{
	foreach my $count (@IBswcountlimits::counters) {
		$IBswcountlimits::cur_counts{$count} = $IBswcountlimits::new_counts{$count};
	}
}

# =========================================================================
# use perfquery to get the counters.
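# get_new_counts() below scrapes perfquery output, which contains lines like
# (illustrative values):
#   XmtData:.....................4086081
#   RcvPkts:.....................7643
# The /^$count:\.+(\d+)/ match pulls the trailing integer into %new_counts.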
my $GUID = ""; sub get_new_counts { my $addr = $_[0]; my $port = $_[1]; mv_counts; ensure_cache_dir; if ( system( "perfquery $GUID $addr $port > $IBswcountlimits::cache_dir/perfquery.out" ) ) { die "perfquery failed : \"perfquery $GUID $addr $port\"\n"; } open PERF_QUERY, "<$IBswcountlimits::cache_dir/perfquery.out" or die "cannot read '$IBswcountlimits::cache_dir/perfquery.out': $!\n"; while (my $line = ) { foreach my $count (@IBswcountlimits::counters) { if ($line =~ /^$count:\.+(\d+)/) { $IBswcountlimits::new_counts{$count} = $1; } } } close PERF_QUERY; } my $cycle = -1; # forever my $bytes_per_second = undef; my $argv0 = `basename $0`; chomp $argv0; if (!getopts("hbvp:n:G")) { usage_and_exit $argv0; } if (defined $Getopt::Std::opt_h) { usage_and_exit $argv0; } if (defined $Getopt::Std::opt_p) { $IBswcountlimits::pause_time = $Getopt::Std::opt_p; } if (defined $Getopt::Std::opt_v) { $verbose = $Getopt::Std::opt_v; } if (defined $Getopt::Std::opt_n) { $cycle = $Getopt::Std::opt_n; } if (defined $Getopt::Std::opt_G) { $GUID = "-G"; } if (defined $Getopt::Std::opt_b) { $bytes_per_second = $Getopt::Std::opt_b; } my $sw_addr = $ARGV[0]; my $sw_port = $ARGV[1]; sub main { clear_counters; get_new_counts($sw_addr, $sw_port); while ($cycle != 0) { print "Checking counts...\n"; sleep($IBswcountlimits::pause_time); get_new_counts($sw_addr, $sw_port); check_counter_rates; if ($bytes_per_second) { print_data_rates; } print_all_counts; if ($cycle != -1) { $cycle = $cycle - 1; } } } main; rdma-core-56.1/infiniband-diags/scripts/set_nodedesc.sh000077500000000000000000000021741477342711600232110ustar00rootroot00000000000000#!/bin/sh if [ -f /etc/sysconfig/network ]; then . /etc/sysconfig/network fi ib_sysfs="/sys/class/infiniband" newname="$HOSTNAME" echo "" echo "*** WARNING ***: this command is deprecated." echo "" function usage { echo "Usage: `basename $0` [-hv] []" echo " set the node_desc field of all hca's found in \"$ib_sysfs\"" echo " -h this help" echo " -v view all node descriptors" echo " [] set name to name specified." echo " Default is to use the hostname: \"$HOSTNAME\"" exit 2 } function viewall { for hca in `ls $ib_sysfs`; do if [ -f $ib_sysfs/$hca/node_desc ]; then echo -n "$hca: " cat $ib_sysfs/$hca/node_desc else logger -s "Failed to set node_desc for : $hca" fi done exit 0 } while getopts "hv" flag do case $flag in "h") usage;; "v") viewall;; esac done shift $(($OPTIND - 1)) if [ "$1" != "" ]; then newname="$1" fi for hca in `ls $ib_sysfs`; do if [ -f $ib_sysfs/$hca/node_desc ]; then echo -n "$newname" >> $ib_sysfs/$hca/node_desc else logger -s "Failed to set node_desc for : $hca" fi done exit 0 rdma-core-56.1/infiniband-diags/sminfo.c000066400000000000000000000110571477342711600201630ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <inttypes.h>

#include <infiniband/umad.h>
#include <infiniband/mad.h>

#include "ibdiag_common.h"

static uint8_t sminfo[1024] = { 0 };

static struct ibmad_port *srcport;
static struct ibmad_ports_pair *srcports;

enum {
	SMINFO_NOTACT,
	SMINFO_DISCOVER,
	SMINFO_STANDBY,
	SMINFO_MASTER,
	SMINFO_STATE_LAST,
};

static const char *const statestr[] = {
	"SMINFO_NOTACT",
	"SMINFO_DISCOVER",
	"SMINFO_STANDBY",
	"SMINFO_MASTER",
};

#define STATESTR(s) (((unsigned)(s)) < SMINFO_STATE_LAST ? statestr[s] : "???")

static unsigned act;
static int prio, state = SMINFO_STANDBY;

static int process_opt(void *context, int ch)
{
	switch (ch) {
	case 'a':
		act = strtoul(optarg, NULL, 0);
		break;
	case 's':
		state = strtoul(optarg, NULL, 0);
		break;
	case 'p':
		prio = strtoul(optarg, NULL, 0);
		break;
	default:
		return -1;
	}
	return 0;
}

int main(int argc, char **argv)
{
	int mgmt_classes[3] = { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS, IB_SA_CLASS };
	int mod = 0;
	ib_portid_t portid = { 0 };
	uint8_t *p;
	uint64_t guid = 0, key = 0;

	const struct ibdiag_opt opts[] = {
		{"state", 's', 1, "<0-3>", "set SM state"},
		{"priority", 'p', 1, "<0-15>", "set SM priority"},
		{"activity", 'a', 1, NULL, "set activity count"},
		{}
	};
	char usage_args[] = "[<sm_lid>] [modifier]";

	ibdiag_process_opts(argc, argv, NULL, "sK", opts, process_opt,
			    usage_args, NULL);

	argc -= optind;
	argv += optind;

	if (argc > 1)
		mod = atoi(argv[1]);

	srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 3, 1);
	if (!srcports)
		IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port);

	srcport = srcports->smi.port;
	if (!srcport)
		IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port);

	smp_mkey_set(srcport, ibd_mkey);

	if (argc) {
		if (resolve_portid_str(srcports->gsi.ca_name, ibd_ca_port,
				       &portid, argv[0], ibd_dest_type, NULL,
				       srcports->gsi.port) < 0)
			IBEXIT("can't resolve destination port %s", argv[0]);
	} else {
		if (resolve_sm_portid(srcports->smi.ca_name, ibd_ca_port,
				      &portid) < 0)
			IBEXIT("can't resolve sm port %s", argv[0]);
	}

	mad_encode_field(sminfo, IB_SMINFO_GUID_F, &guid);
	mad_encode_field(sminfo, IB_SMINFO_ACT_F, &act);
	mad_encode_field(sminfo, IB_SMINFO_KEY_F, &key);
	mad_encode_field(sminfo, IB_SMINFO_PRIO_F, &prio);
	mad_encode_field(sminfo, IB_SMINFO_STATE_F, &state);

	if (mod) {
		if (!(p = smp_set_via(sminfo, &portid, IB_ATTR_SMINFO, mod,
				      ibd_timeout, srcport)))
			IBEXIT("query");
	} else if (!(p = smp_query_via(sminfo, &portid, IB_ATTR_SMINFO, 0,
				       ibd_timeout, srcport)))
		IBEXIT("query");

	mad_decode_field(sminfo, IB_SMINFO_GUID_F, &guid);
	mad_decode_field(sminfo, IB_SMINFO_ACT_F, &act);
	mad_decode_field(sminfo, IB_SMINFO_KEY_F, &key);
	mad_decode_field(sminfo, IB_SMINFO_PRIO_F, &prio);
	mad_decode_field(sminfo, IB_SMINFO_STATE_F, &state);

	printf("sminfo: sm lid %d sm guid 0x%" PRIx64
	       ", activity count %u priority %d state %d %s\n",
	       portid.lid, guid, act, prio, state, STATESTR(state));
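	/* Illustrative output (all values hypothetical):
	 * sminfo: sm lid 1 sm guid 0x8f10403960558, activity count 20
	 * priority 14 state 3 SMINFO_MASTER */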
	mad_rpc_close_port(srcport);
	exit(0);
}
rdma-core-56.1/infiniband-diags/smpdump.c000066400000000000000000000141241477342711600203530ustar00rootroot00000000000000/*
 * Copyright (c) 2004-2009 Voltaire Inc.  All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses.  You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

#define _GNU_SOURCE
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdarg.h>
#include <errno.h>
#include <netinet/in.h>
#include <endian.h>
#include <infiniband/umad.h>
#include <infiniband/mad.h>

#include "ibdiag_common.h"

static int mad_agent;
static int drmad_tid = 0x123;

typedef struct {
	char path[64];
	int hop_cnt;
} DRPath;

struct drsmp {
	uint8_t base_version;
	uint8_t mgmt_class;
	uint8_t class_version;
	uint8_t method;
	__be16 status;
	uint8_t hop_ptr;
	uint8_t hop_cnt;
	__be64 tid;
	__be16 attr_id;
	uint16_t resv;
	__be32 attr_mod;
	__be64 mkey;
	__be16 dr_slid;
	__be16 dr_dlid;
	uint8_t reserved[28];
	uint8_t data[64];
	uint8_t initial_path[64];
	uint8_t return_path[64];
};

static void drsmp_get_init(void *umad, DRPath * path, int attr, int mod)
{
	struct drsmp *smp = (struct drsmp *)(umad_get_mad(umad));

	memset(smp, 0, sizeof(*smp));

	smp->base_version = 1;
	smp->mgmt_class = IB_SMI_DIRECT_CLASS;
	smp->class_version = 1;

	smp->method = 1;
	smp->attr_id = htons(attr);
	smp->attr_mod = htonl(mod);
	smp->tid = htobe64(drmad_tid);
	drmad_tid++;
	smp->dr_slid = htobe16(0xffff);
	smp->dr_dlid = htobe16(0xffff);

	umad_set_addr(umad, 0xffff, 0, 0, 0);

	if (path)
		memcpy(smp->initial_path, path->path, path->hop_cnt + 1);

	smp->hop_cnt = (uint8_t) path->hop_cnt;
}

static void smp_get_init(void *umad, int lid, int attr, int mod)
{
	struct drsmp *smp = (struct drsmp *)(umad_get_mad(umad));

	memset(smp, 0, sizeof(*smp));

	smp->base_version = 1;
	smp->mgmt_class = IB_SMI_CLASS;
	smp->class_version = 1;

	smp->method = 1;
	smp->attr_id = htons(attr);
	smp->attr_mod = htonl(mod);
	smp->tid = htobe64(drmad_tid);
	drmad_tid++;

	umad_set_addr(umad, lid, 0, 0, 0);
}

static int str2DRPath(char *str, DRPath * path)
{
	char *s;

	path->hop_cnt = -1;

	DEBUG("DR str: %s", str);
	while (str && *str) {
		if ((s = strchr(str, ',')))
			*s = 0;
		path->path[++path->hop_cnt] = (char)atoi(str);
		if (!s)
			break;
		str = s + 1;
	}

#if 0
	if (path->path[0] != 0 ||
	    (path->hop_cnt > 0 && dev_port && path->path[1] != dev_port)) {
		DEBUG("hop 0 != 0 or hop 1 != dev_port");
		return -1;
	}
#endif

	return path->hop_cnt;
}

static int dump_char, mgmt_class = IB_SMI_CLASS;

static int process_opt(void *context, int ch)
{
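	/* -L selects LID routing (IB_SMI_CLASS), -D directed routing
	 * (IB_SMI_DIRECT_CLASS); e.g. "smpdump -D 0,1,2 0x15 2" (a
	 * hypothetical path) queries PortInfo (0x15) two hops out, as in
	 * the usage examples listed in main() below. */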
switch (ch) { case 's': dump_char++; break; case 'D': mgmt_class = IB_SMI_DIRECT_CLASS; break; case 'L': mgmt_class = IB_SMI_CLASS; break; default: return -1; } return 0; } int main(int argc, char *argv[]) { int dlid = 0; void *umad; struct drsmp *smp; int i, portid, mod = 0, attr; DRPath path; uint8_t *desc; int length; const struct ibdiag_opt opts[] = { {"string", 's', 0, NULL, ""}, {} }; char usage_args[] = " [mod]"; const char *usage_examples[] = { " -- DR routed examples:", "-D 0,1,2,3,5 16 # NODE DESC", "-D 0,1,2 0x15 2 # PORT INFO, port 2", " -- LID routed examples:", "3 0x15 2 # PORT INFO, lid 3 port 2", "0xa0 0x11 # NODE INFO, lid 0xa0", NULL }; ibd_timeout = 1000; ibdiag_process_opts(argc, argv, NULL, "GKs", opts, process_opt, usage_args, usage_examples); argc -= optind; argv += optind; if (argc < 2) ibdiag_show_usage(); if (mgmt_class == IB_SMI_DIRECT_CLASS && str2DRPath(strdupa(argv[0]), &path) < 0) IBPANIC("bad path str '%s'", argv[0]); if (mgmt_class == IB_SMI_CLASS) dlid = strtoul(argv[0], NULL, 0); attr = strtoul(argv[1], NULL, 0); if (argc > 2) mod = strtoul(argv[2], NULL, 0); if (umad_init() < 0) IBPANIC("can't init UMAD library"); if ((portid = umad_open_port(ibd_ca, ibd_ca_port)) < 0) IBPANIC("can't open UMAD port (%s:%d)", ibd_ca, ibd_ca_port); if ((mad_agent = umad_register(portid, mgmt_class, 1, 0, NULL)) < 0) IBPANIC("Couldn't register agent for SMPs"); if (!(umad = umad_alloc(1, umad_size() + IB_MAD_SIZE))) IBPANIC("can't alloc MAD"); smp = umad_get_mad(umad); if (mgmt_class == IB_SMI_DIRECT_CLASS) drsmp_get_init(umad, &path, attr, mod); else smp_get_init(umad, dlid, attr, mod); if (ibdebug > 1) xdump(stderr, "before send:\n", smp, 256); length = IB_MAD_SIZE; if (umad_send(portid, mad_agent, umad, length, ibd_timeout, 0) < 0) IBPANIC("send failed"); if (umad_recv(portid, umad, &length, -1) != mad_agent) IBPANIC("recv error: %s", strerror(errno)); if (ibdebug) fprintf(stderr, "%d bytes received\n", length); if (!dump_char) { xdump(stdout, NULL, smp->data, 64); if (smp->status) fprintf(stdout, "SMP status: 0x%x\n", ntohs(smp->status)); goto exit; } desc = smp->data; for (i = 0; i < 64; ++i) { if (!desc[i]) break; putchar(desc[i]); } putchar('\n'); if (smp->status) fprintf(stdout, "SMP status: 0x%x\n", ntohs(smp->status)); exit: umad_free(umad); return 0; } rdma-core-56.1/infiniband-diags/smpquery.c000066400000000000000000000342411477342711600205550ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <getopt.h>

#define __STDC_FORMAT_MACROS
#include <inttypes.h>

#include <infiniband/mad.h>
#include <infiniband/complib/cl_nodenamemap.h>

#include "ibdiag_common.h"

static struct ibmad_port *srcport;
static struct ibmad_ports_pair *srcports;

static op_fn_t node_desc, node_info, port_info, switch_info, pkey_table,
    sl2vl_table, vlarb_table, guid_info, mlnx_ext_port_info,
    port_info_extended;

static const match_rec_t match_tbl[] = {
	{"NodeInfo", "NI", node_info, 0, ""},
	{"NodeDesc", "ND", node_desc, 0, ""},
	{"PortInfo", "PI", port_info, 1, ""},
	{"PortInfoExtended", "PIE", port_info_extended, 1, ""},
	{"SwitchInfo", "SI", switch_info, 0, ""},
	{"PKeyTable", "PKeys", pkey_table, 1, ""},
	{"SL2VLTable", "SL2VL", sl2vl_table, 1, ""},
	{"VLArbitration", "VLArb", vlarb_table, 1, ""},
	{"GUIDInfo", "GI", guid_info, 0, ""},
	{"MlnxExtPortInfo", "MEPI", mlnx_ext_port_info, 1, ""},
	{}
};

static char *node_name_map_file = NULL;
static nn_map_t *node_name_map = NULL;
static int extended_speeds = 0;

/*******************************************/

static const char *node_desc(ib_portid_t *dest, char **argv, int argc)
{
	int node_type, l;
	uint64_t node_guid;
	char nd[IB_SMP_DATA_SIZE + 1] = { 0 };
	uint8_t data[IB_SMP_DATA_SIZE] = { 0 };
	char dots[128];
	char *nodename = NULL;

	if (!smp_query_via(data, dest, IB_ATTR_NODE_INFO, 0, 0, srcport))
		return "node info query failed";

	mad_decode_field(data, IB_NODE_TYPE_F, &node_type);
	mad_decode_field(data, IB_NODE_GUID_F, &node_guid);

	if (!smp_query_via(nd, dest, IB_ATTR_NODE_DESC, 0, 0, srcport))
		return "node desc query failed";

	nodename = remap_node_name(node_name_map, node_guid, nd);

	l = strlen(nodename);
	if (l < 32) {
		memset(dots, '.', 32 - l);
		dots[32 - l] = '\0';
	} else {
		dots[0] = '.';
		dots[1] = '\0';
	}

	printf("Node Description:%s%s\n", dots, nodename);
	free(nodename);
	return NULL;
}

static const char *node_info(ib_portid_t * dest, char **argv, int argc)
{
	char buf[2048];
	char data[IB_SMP_DATA_SIZE] = { 0 };

	if (!smp_query_via(data, dest, IB_ATTR_NODE_INFO, 0, 0, srcport))
		return "node info query failed";

	mad_dump_nodeinfo(buf, sizeof buf, data, sizeof data);

	printf("# Node info: %s\n%s", portid2str(dest), buf);
	return NULL;
}

static const char *port_info_extended(ib_portid_t *dest, char **argv, int argc)
{
	char buf[2048];
	uint8_t data[IB_SMP_DATA_SIZE] = { 0 };
	int portnum = 0;

	if (argc > 0)
		portnum = strtol(argv[0], NULL, 0);

	if (!is_port_info_extended_supported(dest, portnum, srcport))
		return "port info extended not supported";

	if (!smp_query_via(data, dest, IB_ATTR_PORT_INFO_EXT, portnum, 0,
			   srcport))
		return "port info extended query failed";

	mad_dump_portinfo_ext(buf, sizeof buf, data, sizeof data);
	printf("# Port info Extended: %s port %d\n%s", portid2str(dest),
	       portnum, buf);
	return NULL;
}

static const char *port_info(ib_portid_t *dest, char **argv, int argc)
{
	char data[IB_SMP_DATA_SIZE] = { 0 };
	int portnum = 0, orig_portnum;

	if (argc > 0)
		portnum = strtol(argv[0], NULL, 0);
	orig_portnum = portnum;
	if (extended_speeds)
		portnum |= (1U) << 31;

	if (!smp_query_via(data, dest, IB_ATTR_PORT_INFO, portnum, 0, srcport))
return "port info query failed"; printf("# Port info: %s port %d\n", portid2str(dest), orig_portnum); dump_portinfo(data, 0); return NULL; } static const char *mlnx_ext_port_info(ib_portid_t *dest, char **argv, int argc) { char buf[2300]; char data[IB_SMP_DATA_SIZE]; int portnum = 0; if (argc > 0) portnum = strtol(argv[0], NULL, 0); if (!smp_query_via(data, dest, IB_ATTR_MLNX_EXT_PORT_INFO, portnum, 0, srcport)) return "Mellanox ext port info query failed"; mad_dump_mlnx_ext_port_info(buf, sizeof buf, data, sizeof data); printf("# MLNX ext Port info: %s port %d\n%s", portid2str(dest), portnum, buf); return NULL; } static const char *switch_info(ib_portid_t *dest, char **argv, int argc) { char buf[2048]; char data[IB_SMP_DATA_SIZE] = { 0 }; if (!smp_query_via(data, dest, IB_ATTR_SWITCH_INFO, 0, 0, srcport)) return "switch info query failed"; mad_dump_switchinfo(buf, sizeof buf, data, sizeof data); printf("# Switch info: %s\n%s", portid2str(dest), buf); return NULL; } static const char *pkey_table(ib_portid_t *dest, char **argv, int argc) { uint8_t data[IB_SMP_DATA_SIZE] = { 0 }; int i, j, k; __be16 *p; unsigned mod; int n, t, phy_ports; int portnum = 0; if (argc > 0) portnum = strtol(argv[0], NULL, 0); /* Get the partition capacity */ if (!smp_query_via(data, dest, IB_ATTR_NODE_INFO, 0, 0, srcport)) return "node info query failed"; mad_decode_field(data, IB_NODE_TYPE_F, &t); mad_decode_field(data, IB_NODE_NPORTS_F, &phy_ports); if (portnum > phy_ports) return "invalid port number"; if ((t == IB_NODE_SWITCH) && (portnum != 0)) { if (!smp_query_via(data, dest, IB_ATTR_SWITCH_INFO, 0, 0, srcport)) return "switch info failed"; mad_decode_field(data, IB_SW_PARTITION_ENFORCE_CAP_F, &n); } else mad_decode_field(data, IB_NODE_PARTITION_CAP_F, &n); for (i = 0; i < (n + 31) / 32; i++) { mod = i | (portnum << 16); if (!smp_query_via(data, dest, IB_ATTR_PKEY_TBL, mod, 0, srcport)) return "pkey table query failed"; if (i + 1 == (n + 31) / 32) k = ((n + 7 - i * 32) / 8) * 8; else k = 32; p = (__be16 *) data; for (j = 0; j < k; j += 8, p += 8) { printf ("%4u: 0x%04x 0x%04x 0x%04x 0x%04x 0x%04x 0x%04x 0x%04x 0x%04x\n", (i * 32) + j, ntohs(p[0]), ntohs(p[1]), ntohs(p[2]), ntohs(p[3]), ntohs(p[4]), ntohs(p[5]), ntohs(p[6]), ntohs(p[7])); } } printf("%d pkeys capacity for this port\n", n); return NULL; } static const char *sl2vl_dump_table_entry(ib_portid_t *dest, int in, int out) { char buf[2048]; char data[IB_SMP_DATA_SIZE] = { 0 }; int portnum = (in << 8) | out; if (!smp_query_via(data, dest, IB_ATTR_SLVL_TABLE, portnum, 0, srcport)) return "slvl query failed"; mad_dump_sltovl(buf, sizeof buf, data, sizeof data); printf("ports: in %2d, out %2d: ", in, out); printf("%s", buf); return NULL; } static const char *sl2vl_table(ib_portid_t *dest, char **argv, int argc) { uint8_t data[IB_SMP_DATA_SIZE] = { 0 }; int type, num_ports, portnum = 0; int i; const char *ret; if (argc > 0) portnum = strtol(argv[0], NULL, 0); if (!smp_query_via(data, dest, IB_ATTR_NODE_INFO, 0, 0, srcport)) return "node info query failed"; mad_decode_field(data, IB_NODE_TYPE_F, &type); mad_decode_field(data, IB_NODE_NPORTS_F, &num_ports); if (portnum > num_ports) return "invalid port number"; printf("# SL2VL table: %s\n", portid2str(dest)); printf("# SL: |"); for (i = 0; i < 16; i++) printf("%2d|", i); printf("\n"); if (type != IB_NODE_SWITCH) return sl2vl_dump_table_entry(dest, 0, 0); for (i = 0; i <= num_ports; i++) { ret = sl2vl_dump_table_entry(dest, i, portnum); if (ret) return ret; } return NULL; } static const char 
*vlarb_dump_table_entry(ib_portid_t *dest, int portnum, int offset, unsigned cap) { char buf[2048]; char data[IB_SMP_DATA_SIZE] = { 0 }; if (!smp_query_via(data, dest, IB_ATTR_VL_ARBITRATION, (offset << 16) | portnum, 0, srcport)) return "vl arb query failed"; mad_dump_vlarbitration(buf, sizeof(buf), data, cap * 2); printf("%s", buf); return NULL; } static const char *vlarb_dump_table(ib_portid_t *dest, int portnum, const char *name, int offset, int cap) { const char *ret; printf("# %s priority VL Arbitration Table:", name); ret = vlarb_dump_table_entry(dest, portnum, offset, cap < 32 ? cap : 32); if (!ret && cap > 32) ret = vlarb_dump_table_entry(dest, portnum, offset + 1, cap - 32); return ret; } static const char *vlarb_table(ib_portid_t *dest, char **argv, int argc) { uint8_t data[IB_SMP_DATA_SIZE] = { 0 }; int portnum = 0; int type, enhsp0, lowcap, highcap; const char *ret = NULL; if (argc > 0) portnum = strtol(argv[0], NULL, 0); /* port number of 0 could mean SP0 or port MAD arrives on */ if (portnum == 0) { if (!smp_query_via(data, dest, IB_ATTR_NODE_INFO, 0, 0, srcport)) return "node info query failed"; mad_decode_field(data, IB_NODE_TYPE_F, &type); if (type == IB_NODE_SWITCH) { memset(data, 0, sizeof(data)); if (!smp_query_via(data, dest, IB_ATTR_SWITCH_INFO, 0, 0, srcport)) return "switch info query failed"; mad_decode_field(data, IB_SW_ENHANCED_PORT0_F, &enhsp0); if (!enhsp0) { printf ("# No VLArbitration tables (BSP0): %s port %d\n", portid2str(dest), 0); return NULL; } memset(data, 0, sizeof(data)); } } if (!smp_query_via(data, dest, IB_ATTR_PORT_INFO, portnum, 0, srcport)) return "port info query failed"; mad_decode_field(data, IB_PORT_VL_ARBITRATION_LOW_CAP_F, &lowcap); mad_decode_field(data, IB_PORT_VL_ARBITRATION_HIGH_CAP_F, &highcap); printf("# VLArbitration tables: %s port %d LowCap %d HighCap %d\n", portid2str(dest), portnum, lowcap, highcap); if (lowcap > 0) ret = vlarb_dump_table(dest, portnum, "Low", 1, lowcap); if (!ret && highcap > 0) ret = vlarb_dump_table(dest, portnum, "High", 3, highcap); return ret; } static const char *guid_info(ib_portid_t *dest, char **argv, int argc) { uint8_t data[IB_SMP_DATA_SIZE] = { 0 }; int i, j, k; __be64 *p; unsigned mod; int n; /* Get the guid capacity */ if (!smp_query_via(data, dest, IB_ATTR_PORT_INFO, 0, 0, srcport)) return "port info failed"; mad_decode_field(data, IB_PORT_GUID_CAP_F, &n); for (i = 0; i < (n + 7) / 8; i++) { mod = i; if (!smp_query_via(data, dest, IB_ATTR_GUID_INFO, mod, 0, srcport)) return "guid info query failed"; if (i + 1 == (n + 7) / 8) k = ((n + 1 - i * 8) / 2) * 2; else k = 8; p = (__be64 *) data; for (j = 0; j < k; j += 2, p += 2) { printf("%4u: 0x%016" PRIx64 " 0x%016" PRIx64 "\n", (i * 8) + j, be64toh(p[0]), be64toh(p[1])); } } printf("%d guids capacity for this port\n", n); return NULL; } static int process_opt(void *context, int ch) { switch (ch) { case 1: node_name_map_file = strdup(optarg); if (node_name_map_file == NULL) IBEXIT("out of memory, strdup for node_name_map_file name failed"); break; case 'c': ibd_dest_type = IB_DEST_DRSLID; break; case 'x': extended_speeds = 1; break; default: return -1; } return 0; } int main(int argc, char **argv) { char usage_args[1024]; int mgmt_classes[3] = { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS, IB_SA_CLASS }; ib_portid_t portid = { 0 }; const char *err; op_fn_t *fn; const match_rec_t *r; int n; const struct ibdiag_opt opts[] = { {"combined", 'c', 0, NULL, "use Combined route address argument"}, {"node-name-map", 1, 1, "", "node name map file"}, {"extended", 'x', 
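/*
 * (option table continues below) A note on the GUIDInfo loop above:
 * GUIDs are fetched eight per MAD, so an illustrative GuidCap of
 * n = 12 takes (12 + 7) / 8 = 2 queries, with the final block clipped
 * to whole pairs because two GUIDs are printed per row.
 */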
0, NULL, "use extended speeds"}, {} }; const char *usage_examples[] = { "portinfo 3 1\t\t\t\t# portinfo by lid, with port modifier", "-G switchinfo 0x2C9000100D051 1\t# switchinfo by guid", "-D nodeinfo 0\t\t\t\t# nodeinfo by direct route", "-c nodeinfo 6 0,12\t\t\t# nodeinfo by combined route", NULL }; n = sprintf(usage_args, " [op params]\n" "\nSupported ops (and aliases, case insensitive):\n"); for (r = match_tbl; r->name; r++) { n += snprintf(usage_args + n, sizeof(usage_args) - n, " %s (%s) %s\n", r->name, r->alias ? r->alias : "", r->opt_portnum ? " []" : ""); if (n >= sizeof(usage_args)) exit(-1); } ibdiag_process_opts(argc, argv, NULL, NULL, opts, process_opt, usage_args, usage_examples); argc -= optind; argv += optind; if (argc < 2) ibdiag_show_usage(); if (!(fn = match_op(match_tbl, argv[0]))) IBEXIT("operation '%s' not supported", argv[0]); srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 3, 1); if (!srcports) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); srcport = srcports->smi.port; smp_mkey_set(srcport, ibd_mkey); node_name_map = open_node_name_map(node_name_map_file); if (ibd_dest_type != IB_DEST_DRSLID) { if (resolve_portid_str(srcports->gsi.ca_name, ibd_ca_port, &portid, argv[1], ibd_dest_type, ibd_sm_id, srcports->gsi.port) < 0) IBEXIT("can't resolve destination port %s", argv[1]); if ((err = fn(&portid, argv + 2, argc - 2))) IBEXIT("operation %s: %s", argv[0], err); } else { char concat[64]; memset(concat, 0, 64); snprintf(concat, sizeof(concat), "%s %s", argv[1], argv[2]); if (resolve_portid_str(srcports->smi.ca_name, ibd_ca_port, &portid, concat, ibd_dest_type, ibd_sm_id, srcport) < 0) IBEXIT("can't resolve destination port %s", concat); if ((err = fn(&portid, argv + 3, argc - 3))) IBEXIT("operation %s: %s", argv[0], err); } close_node_name_map(node_name_map); mad_rpc_close_port2(srcports); exit(0); } rdma-core-56.1/infiniband-diags/vendstat.c000066400000000000000000000364031477342711600205220ustar00rootroot00000000000000/* * Copyright (c) 2012 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #include #include #include #include #include #include #include "ibdiag_common.h" #define IS3_DEVICE_ID 47396 #define IB_MLX_VENDOR_CLASS 10 /* Vendor specific Attribute IDs */ #define IB_MLX_IS3_GENERAL_INFO 0x17 #define IB_MLX_IS3_CONFIG_SPACE_ACCESS 0x50 #define IB_MLX_IS4_COUNTER_GROUP_INFO 0x90 #define IB_MLX_IS4_CONFIG_COUNTER_GROUP 0x91 /* Config space addresses */ #define IB_MLX_IS3_PORT_XMIT_WAIT 0x10013C static struct ibmad_port *srcport; static struct ibmad_ports_pair *srcports; typedef struct { __be16 hw_revision; __be16 device_id; uint8_t reserved[24]; __be32 uptime; } is3_hw_info_t; typedef struct { uint8_t resv1; uint8_t major; uint8_t minor; uint8_t sub_minor; __be32 build_id; uint8_t month; uint8_t day; __be16 year; __be16 resv2; __be16 hour; uint8_t psid[16]; __be32 ini_file_version; } is3_fw_info_t; typedef struct { __be32 ext_major; __be32 ext_minor; __be32 ext_sub_minor; __be32 reserved[4]; } is4_fw_ext_info_t; typedef struct { uint8_t resv1; uint8_t major; uint8_t minor; uint8_t sub_minor; uint8_t resv2[28]; } is3_sw_info_t; typedef struct { uint8_t reserved[8]; is3_hw_info_t hw_info; is3_fw_info_t fw_info; is3_sw_info_t sw_info; } is3_general_info_t; typedef struct { uint8_t reserved[8]; is3_hw_info_t hw_info; is3_fw_info_t fw_info; is4_fw_ext_info_t ext_fw_info; is3_sw_info_t sw_info; } is4_general_info_t; typedef struct { uint8_t reserved[8]; struct is3_record { __be32 address; __be32 data; __be32 mask; } record[18]; } is3_config_space_t; #define COUNTER_GROUPS_NUM 2 typedef struct { uint8_t reserved1[8]; uint8_t reserved[3]; uint8_t num_of_counter_groups; __be32 group_masks[COUNTER_GROUPS_NUM]; } is4_counter_group_info_t; typedef struct { uint8_t reserved[3]; uint8_t group_select; } is4_group_select_t; typedef struct { uint8_t reserved1[8]; uint8_t reserved[4]; is4_group_select_t group_selects[COUNTER_GROUPS_NUM]; } is4_config_counter_groups_t; static uint16_t ext_fw_info_device[][2] = { {0x0245, 0x0245}, /* Switch-X */ {0xc738, 0xc73b}, /* Switch-X */ {0xcb20, 0xcb20}, /* Switch-IB */ {0xcf08, 0xcf08}, /* Switch-IB2 */ {0xd2f0, 0xd2f0}, /* Quantum */ {0x01b3, 0x01b3}, /* IS-4 */ {0x1003, 0x101b}, /* Connect-X */ {0xa2d2, 0xa2d2}, /* BlueField */ {0x1b02, 0x1b02}, /* Bull SwitchX */ {0x1b50, 0x1b50}, /* Bull SwitchX */ {0x1ba0, 0x1ba0}, /* Bull SwitchIB */ {0x1bd0, 0x1bd5}, /* Bull SwitchIB and SwitchIB2 */ {0x1bf0, 0x1bf0}, /* Bull Sequana Quantum */ {0x1b33, 0x1b33}, /* Bull ConnectX3 */ {0x1b73, 0x1b73}, /* Bull ConnectX3 */ {0x1b40, 0x1b41}, /* Bull ConnectX3 */ {0x1b60, 0x1b61}, /* Bull ConnectX3 */ {0x1b83, 0x1b83}, /* Bull ConnectIB */ {0x1b93, 0x1b94}, /* Bull ConnectIB */ {0x1bb4, 0x1bb5}, /* Bull ConnectX4 */ {0x1bc4, 0x1bc6}, /* Bull ConnectX4, Sequana HDR and HDR100 */ {0x0000, 0x0000}}; static int is_ext_fw_info_supported(uint16_t device_id) { int i; for (i = 0; ext_fw_info_device[i][0]; i++) if (ext_fw_info_device[i][0] <= device_id && device_id <= ext_fw_info_device[i][1]) return 1; return 0; } static int do_vendor(ib_portid_t *portid, uint8_t class, uint8_t method, uint16_t attr_id, uint32_t attr_mod, void *data) { ib_vendor_call_t call; memset(&call, 0, sizeof(call)); call.mgmt_class = class; call.method = method; call.timeout = ibd_timeout; call.attrid = attr_id; call.mod = attr_mod; if (!ib_vendor_call_via(data, portid, &call, srcport)) { fprintf(stderr,"vendstat: method %u, attribute %u failure\n", method, attr_id); return -1; } return 0; } static int do_config_space_records(ib_portid_t *portid, unsigned set, is3_config_space_t *cs, 
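/*
 * (parameter list continues below) As far as this code shows, the
 * vendor attribute modifier built in the body encodes an access mode
 * above bit 22 and the record count at bit 16:
 * (2 << 22) | (16 << 16) = 0x00900000 for a 16-record read.
 */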
unsigned records) { unsigned i; if (records > 18) records = 18; if (do_vendor(portid, IB_MLX_VENDOR_CLASS, set ? IB_MAD_METHOD_SET : IB_MAD_METHOD_GET, IB_MLX_IS3_CONFIG_SPACE_ACCESS, 2 << 22 | records << 16, cs)) { fprintf(stderr,"cannot %s config space records\n", set ? "set" : "get"); return -1; } for (i = 0; i < records; i++) { printf("Config space record at 0x%x: 0x%x\n", ntohl(cs->record[i].address), ntohl(cs->record[i].data & cs->record[i].mask)); } return 0; } static int counter_groups_info(ib_portid_t * portid, int port) { char buf[1024]; is4_counter_group_info_t *cg_info; int i, num_cg; /* Counter Group Info */ memset(&buf, 0, sizeof(buf)); if (do_vendor(portid, IB_MLX_VENDOR_CLASS, IB_MAD_METHOD_GET, IB_MLX_IS4_COUNTER_GROUP_INFO, port, buf)) { fprintf(stderr,"counter group info query failure\n"); return -1; } cg_info = (is4_counter_group_info_t *) & buf; num_cg = cg_info->num_of_counter_groups; printf("counter_group_info:\n"); printf("%d counter groups\n", num_cg); for (i = 0; i < num_cg; i++) printf("group%d mask %#x\n", i, ntohl(cg_info->group_masks[i])); return 0; } /* Group0 counter config values */ #define IS4_G0_PortXmtDataSL_0_7 0 #define IS4_G0_PortXmtDataSL_8_15 1 #define IS4_G0_PortRcvDataSL_0_7 2 /* Group1 counter config values */ #define IS4_G1_PortXmtDataSL_8_15 1 #define IS4_G1_PortRcvDataSL_0_7 2 #define IS4_G1_PortRcvDataSL_8_15 8 static int cg0, cg1; static int config_counter_groups(ib_portid_t * portid, int port) { char buf[1024]; is4_config_counter_groups_t *cg_config; /* configure counter groups for groups 0 and 1 */ memset(&buf, 0, sizeof(buf)); cg_config = (is4_config_counter_groups_t *) & buf; printf("counter_groups_config: configuring group0 %d group1 %d\n", cg0, cg1); cg_config->group_selects[0].group_select = (uint8_t) cg0; cg_config->group_selects[1].group_select = (uint8_t) cg1; if (do_vendor(portid, IB_MLX_VENDOR_CLASS, IB_MAD_METHOD_SET, IB_MLX_IS4_CONFIG_COUNTER_GROUP, port, buf)) { fprintf(stderr, "config counter group set failure\n"); return -1; } /* get config counter groups */ memset(&buf, 0, sizeof(buf)); if (do_vendor(portid, IB_MLX_VENDOR_CLASS, IB_MAD_METHOD_GET, IB_MLX_IS4_CONFIG_COUNTER_GROUP, port, buf)) { fprintf(stderr, "config counter group query failure\n"); return -1; } return 0; } static int general_info, xmit_wait, counter_group_info, config_counter_group; static is3_config_space_t write_cs, read_cs; static unsigned write_cs_records, read_cs_records; static int process_opt(void *context, int ch) { int ret; unsigned int address, data, mask; switch (ch) { case 'N': general_info = 1; break; case 'w': xmit_wait = 1; break; case 'i': counter_group_info = 1; break; case 'c': config_counter_group = 1; ret = sscanf(optarg, "%d,%d", &cg0, &cg1); if (ret != 2) return -1; break; case 'R': if (read_cs_records >= 18) break; ret = sscanf(optarg, "%x,%x", &address, &mask); if (ret < 1) return -1; else if (ret == 1) mask = 0xffffffff; read_cs.record[read_cs_records].address = htobe32(address); read_cs.record[read_cs_records].mask = htobe32(mask); read_cs_records++; break; case 'W': if (write_cs_records >= 18) break; ret = sscanf(optarg, "%x,%x,%x", &address, &data, &mask); if (ret < 2) return -1; else if (ret == 2) mask = 0xffffffff; write_cs.record[write_cs_records].address = htobe32(address); write_cs.record[write_cs_records].data = htobe32(data); write_cs.record[write_cs_records].mask = htobe32(mask); write_cs_records++; break; default: return -1; } return 0; } int main(int argc, char **argv) { int mgmt_classes[2] = { IB_SA_CLASS, 
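/*
 * (initializer continues below) Counter-group cheat sheet from the
 * IS4_G0 and IS4_G1 defines above: "-c 0,1" selects group0 =
 * PortXmtDataSL_0_7 and group1 = PortXmtDataSL_8_15, while "-c 2,8"
 * selects group0 = PortRcvDataSL_0_7 and group1 = PortRcvDataSL_8_15.
 */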
IB_MLX_VENDOR_CLASS }; ib_portid_t portid = { 0 }; int port = 0; char buf[1024]; uint32_t fw_ver_major = 0; uint32_t fw_ver_minor = 0; uint32_t fw_ver_sub_minor = 0; uint8_t sw_ver_major = 0, sw_ver_minor = 0, sw_ver_sub_minor = 0; is3_general_info_t *gi_is3; is4_general_info_t *gi_is4; const struct ibdiag_opt opts[] = { {"N", 'N', 0, NULL, "show IS3 or IS4 general information"}, {"w", 'w', 0, NULL, "show IS3 port xmit wait counters"}, {"i", 'i', 0, NULL, "show IS4 counter group info"}, {"c", 'c', 1, "", "configure IS4 counter groups"}, {"Read", 'R', 1, "", "Read configuration space record at addr"}, {"Write", 'W', 1, "", "Write configuration space record at addr"}, {} }; char usage_args[] = " [port]"; const char *usage_examples[] = { "-N 6\t\t# read IS3 or IS4 general information", "-w 6\t\t# read IS3 port xmit wait counters", "-i 6 12\t# read IS4 port 12 counter group info", "-c 0,1 6 12\t# configure IS4 port 12 counter groups for PortXmitDataSL", "-c 2,8 6 12\t# configure IS4 port 12 counter groups for PortRcvDataSL", NULL }; ibdiag_process_opts(argc, argv, NULL, "DKy", opts, process_opt, usage_args, usage_examples); argc -= optind; argv += optind; if (argc > 1) port = strtoul(argv[1], NULL, 0); srcports = mad_rpc_open_port2(ibd_ca, ibd_ca_port, mgmt_classes, 2, 0); if (!srcports) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); srcport = srcports->gsi.port; if (!srcport) IBEXIT("Failed to open '%s' port '%d'", ibd_ca, ibd_ca_port); if (argc) { if (resolve_portid_str(srcports->gsi.ca_name, ibd_ca_port, &portid, argv[0], ibd_dest_type, ibd_sm_id, srcports->gsi.port) < 0) { mad_rpc_close_port2(srcports); IBEXIT("can't resolve destination port %s", argv[0]); } } else { if (resolve_self(srcports->gsi.ca_name, ibd_ca_port, &portid, &port, NULL) < 0) { mad_rpc_close_port2(srcports); IBEXIT("can't resolve self port %s", argv[0]); } } if (counter_group_info) { counter_groups_info(&portid, port); mad_rpc_close_port2(srcports); exit(0); } if (config_counter_group) { config_counter_groups(&portid, port); mad_rpc_close_port2(srcports); exit(0); } if (read_cs_records || write_cs_records) { if (read_cs_records) do_config_space_records(&portid, 0, &read_cs, read_cs_records); if (write_cs_records) do_config_space_records(&portid, 1, &write_cs, write_cs_records); mad_rpc_close_port2(srcports); exit(0); } /* These are Mellanox specific vendor MADs */ /* but vendors change the VendorId so how know for sure ? 
*/ /* Only General Info and Port Xmit Wait Counters */ /* queries are currently supported */ if (!general_info && !xmit_wait) { mad_rpc_close_port2(srcports); IBEXIT("at least one of -N and -w must be specified"); } /* Would need a list of these and it might not be complete */ /* so for right now, punt on this */ /* vendor ClassPortInfo is required attribute if class supported */ memset(&buf, 0, sizeof(buf)); if (do_vendor(&portid, IB_MLX_VENDOR_CLASS, IB_MAD_METHOD_GET, CLASS_PORT_INFO, 0, buf)) { mad_rpc_close_port2(srcports); IBEXIT("classportinfo query"); } memset(&buf, 0, sizeof(buf)); gi_is3 = (is3_general_info_t *) &buf; if (do_vendor(&portid, IB_MLX_VENDOR_CLASS, IB_MAD_METHOD_GET, IB_MLX_IS3_GENERAL_INFO, 0, gi_is3)) { mad_rpc_close_port2(srcports); IBEXIT("generalinfo query"); } if (is_ext_fw_info_supported(ntohs(gi_is3->hw_info.device_id))) { gi_is4 = (is4_general_info_t *) &buf; fw_ver_major = ntohl(gi_is4->ext_fw_info.ext_major); fw_ver_minor = ntohl(gi_is4->ext_fw_info.ext_minor); fw_ver_sub_minor = ntohl(gi_is4->ext_fw_info.ext_sub_minor); sw_ver_major = gi_is4->sw_info.major; sw_ver_minor = gi_is4->sw_info.minor; sw_ver_sub_minor = gi_is4->sw_info.sub_minor; } else { fw_ver_major = gi_is3->fw_info.major; fw_ver_minor = gi_is3->fw_info.minor; fw_ver_sub_minor = gi_is3->fw_info.sub_minor; sw_ver_major = gi_is3->sw_info.major; sw_ver_minor = gi_is3->sw_info.minor; sw_ver_sub_minor = gi_is3->sw_info.sub_minor; } if (general_info) { /* dump IS3 or IS4 general info here */ printf("hw_dev_rev: 0x%04x\n", ntohs(gi_is3->hw_info.hw_revision)); printf("hw_dev_id: 0x%04x\n", ntohs(gi_is3->hw_info.device_id)); printf("hw_uptime: 0x%08x\n", ntohl(gi_is3->hw_info.uptime)); printf("fw_version: %02d.%02d.%02d\n", fw_ver_major, fw_ver_minor, fw_ver_sub_minor); printf("fw_build_id: 0x%04x\n", ntohl(gi_is3->fw_info.build_id)); printf("fw_date: %02x/%02x/%04x\n", gi_is3->fw_info.month, gi_is3->fw_info.day, ntohs(gi_is3->fw_info.year)); printf("fw_psid: '%s'\n", gi_is3->fw_info.psid); printf("fw_ini_ver: %d\n", ntohl(gi_is3->fw_info.ini_file_version)); printf("sw_version: %02d.%02d.%02d\n", sw_ver_major, sw_ver_minor, sw_ver_sub_minor); } if (xmit_wait) { is3_config_space_t *cs; unsigned i; if (ntohs(gi_is3->hw_info.device_id) != IS3_DEVICE_ID) { mad_rpc_close_port2(srcports); IBEXIT("Unsupported device ID 0x%x", ntohs(gi_is3->hw_info.device_id)); } memset(&buf, 0, sizeof(buf)); /* Set record addresses for each port */ cs = (is3_config_space_t *) & buf; for (i = 0; i < 16; i++) cs->record[i].address = htonl(IB_MLX_IS3_PORT_XMIT_WAIT + ((i + 1) << 12)); if (do_vendor(&portid, IB_MLX_VENDOR_CLASS, IB_MAD_METHOD_GET, IB_MLX_IS3_CONFIG_SPACE_ACCESS, 2 << 22 | 16 << 16, cs)) { mad_rpc_close_port2(srcports); IBEXIT("vendstat"); } for (i = 0; i < 16; i++) if (cs->record[i].data) /* PortXmitWait is 32 bit counter */ printf("Port %d: PortXmitWait 0x%x\n", i + 4, ntohl(cs->record[i].data)); /* port 4 is first port */ /* Last 8 ports is another query */ memset(&buf, 0, sizeof(buf)); /* Set record addresses for each port */ cs = (is3_config_space_t *) & buf; for (i = 0; i < 8; i++) cs->record[i].address = htonl(IB_MLX_IS3_PORT_XMIT_WAIT + ((i + 17) << 12)); if (do_vendor(&portid, IB_MLX_VENDOR_CLASS, IB_MAD_METHOD_GET, IB_MLX_IS3_CONFIG_SPACE_ACCESS, 2 << 22 | 8 << 16, cs)) { mad_rpc_close_port2(srcports); IBEXIT("vendstat"); } for (i = 0; i < 8; i++) if (cs->record[i].data) /* PortXmitWait is 32 bit counter */ printf("Port %d: PortXmitWait 0x%x\n", i < 4 ? 
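/* port remap sketch, inferred from this expression: config-space
   slots 17-20 print as ports 21-24, slots 21-24 wrap to ports 1-4 */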
i + 21 : i - 3, ntohl(cs->record[i].data)); } mad_rpc_close_port2(srcports); exit(0); } rdma-core-56.1/iwpmd/000077500000000000000000000000001477342711600144525ustar00rootroot00000000000000rdma-core-56.1/iwpmd/CMakeLists.txt000066400000000000000000000015451477342711600172170ustar00rootroot00000000000000rdma_sbin_executable(iwpmd iwarp_pm_common.c iwarp_pm_helper.c iwarp_pm_server.c ) target_link_libraries(iwpmd LINK_PRIVATE ${SYSTEMD_LIBRARIES} ${NL_LIBRARIES} ${CMAKE_THREAD_LIBS_INIT} ) rdma_man_pages( iwpmd.8.in iwpmd.conf.5.in ) rdma_subst_install(FILES "iwpmd.service.in" RENAME "iwpmd.service" DESTINATION "${CMAKE_INSTALL_SYSTEMD_SERVICEDIR}") rdma_subst_install(FILES "iwpmd_init.in" DESTINATION "${CMAKE_INSTALL_INITDDIR}" RENAME "iwpmd" PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ OWNER_EXECUTE GROUP_EXECUTE WORLD_EXECUTE) install(FILES "iwpmd.conf" DESTINATION "${CMAKE_INSTALL_SYSCONFDIR}") install(FILES "iwpmd.rules" RENAME "90-iwpmd.rules" DESTINATION "${CMAKE_INSTALL_UDEV_RULESDIR}") install(FILES modules-iwpmd.conf RENAME "iwpmd.conf" DESTINATION "${CMAKE_INSTALL_SYSCONFDIR}/rdma/modules") rdma-core-56.1/iwpmd/iwarp_pm.h000066400000000000000000000202041477342711600164370ustar00rootroot00000000000000/* * Copyright (c) 2013 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #ifndef IWARP_PM_H #define IWARP_PM_H #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define IWARP_PM_PORT 3935 #define IWARP_PM_VER_SHIFT 6 #define IWARP_PM_VER_MASK 0xc0 #define IWARP_PM_MT_SHIFT 4 #define IWARP_PM_MT_MASK 0x30 #define IWARP_PM_IPVER_SHIFT 0 #define IWARP_PM_IPVER_MASK 0x0F #define IWARP_PM_MESSAGE_SIZE 48 /* bytes */ #define IWARP_PM_ASSOC_OFFSET 0x10 /* different assochandles for passive/active side map requests */ #define IWARP_PM_IPV4_ADDR 4 #define IWARP_PM_MT_REQ 0 #define IWARP_PM_MT_ACC 1 #define IWARP_PM_MT_ACK 2 #define IWARP_PM_MT_REJ 3 #define IWARP_PM_REQ_QUERY 1 #define IWARP_PM_REQ_ACCEPT 2 #define IWARP_PM_REQ_ACK 4 #define IWARP_PM_RECV_PAYLOAD 4096 #define IWARP_PM_MAX_CLIENTS 64 #define IWPM_MAP_REQ_TIMEOUT 10 /* sec */ #define IWPM_SEND_MSG_RETRIES 3 #define IWPM_ULIB_NAME "iWarpPortMapperUser" #define IWPM_ULIBNAME_SIZE 32 #define IWPM_DEVNAME_SIZE 32 #define IWPM_IFNAME_SIZE 16 #define IWPM_IPADDR_SIZE 16 #define IWPM_PARAM_NUM 1 #define IWPM_PARAM_NAME_LEN 64 #define IWARP_PM_NETLINK_DBG 0x01 #define IWARP_PM_WIRE_DBG 0x02 #define IWARP_PM_RETRY_DBG 0x04 #define IWARP_PM_ALL_DBG 0x07 #define IWARP_PM_DEBUG 0x08 #define iwpm_debug(dbg_level, str, args...) \ do { if (dbg_level & IWARP_PM_DEBUG) { \ syslog(LOG_WARNING, str, ##args); } \ } while (0) /* Port Mapper errors */ enum { IWPM_INVALID_NLMSG_ERR = 10, IWPM_CREATE_MAPPING_ERR, IWPM_DUPLICATE_MAPPING_ERR, IWPM_UNKNOWN_MAPPING_ERR, IWPM_CLIENT_DEV_INFO_ERR, IWPM_USER_LIB_INFO_ERR, IWPM_REMOTE_QUERY_REJECT, IWPM_VERSION_MISMATCH_ERR, }; /* iwpm param indexes */ enum { NL_SOCK_RBUF_SIZE }; typedef struct iwpm_client { char ifname[IWPM_IFNAME_SIZE]; /* netdev interface name */ char ibdevname[IWPM_DEVNAME_SIZE]; /* OFED device name */ char ulibname[IWPM_ULIBNAME_SIZE]; /* library name of the userpace PM agent provider */ __u32 nl_seq; char valid; } iwpm_client; typedef union sockaddr_union { struct sockaddr_storage s_sockaddr; struct sockaddr sock_addr; struct sockaddr_in v4_sockaddr; struct sockaddr_in6 v6_sockaddr; struct sockaddr_nl nl_sockaddr; } sockaddr_union; typedef struct iwpm_mapped_port { struct list_node entry; int owner_client; int sd; struct sockaddr_storage local_addr; struct sockaddr_storage mapped_addr; int wcard; _Atomic(int) ref_cnt; /* the number of owners */ } iwpm_mapped_port; typedef struct iwpm_wire_msg { __u8 magic; __u8 pmtime; __be16 reserved; __be16 apport; __be16 cpport; __be64 assochandle; /* big endian IP addresses and ports */ __u8 cpipaddr[IWPM_IPADDR_SIZE]; __u8 apipaddr[IWPM_IPADDR_SIZE]; __u8 mapped_cpipaddr[IWPM_IPADDR_SIZE]; } iwpm_wire_msg; typedef struct iwpm_send_msg { int pm_sock; struct sockaddr_storage dest_addr; iwpm_wire_msg data; int length; } iwpm_send_msg; typedef struct iwpm_mapping_request { struct list_node entry; struct sockaddr_storage src_addr; struct sockaddr_storage remote_addr; __u16 nlmsg_type; /* Message content */ __u32 nlmsg_seq; /* Sequence number */ __u32 nlmsg_pid; __u64 assochandle; iwpm_send_msg * send_msg; int timeout; int complete; int msg_type; } iwpm_mapping_request; typedef struct iwpm_pending_msg { struct list_node entry; iwpm_send_msg send_msg; } iwpm_pending_msg; typedef struct iwpm_msg_parms { __u32 ip_ver; __u16 address_family; char apipaddr[IWPM_IPADDR_SIZE]; __be16 apport; char cpipaddr[IWPM_IPADDR_SIZE]; __be16 cpport; char mapped_cpipaddr[IWPM_IPADDR_SIZE]; __be16 mapped_cpport; 
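	/* ver (port mapper protocol version) and mt (message type) below
	 * are unpacked from the wire-message magic byte via the
	 * IWARP_PM_VER_MASK and IWARP_PM_MT_MASK shifts defined above */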
unsigned char ver; unsigned char mt; unsigned char pmtime; __u64 assochandle; int msize; } iwpm_msg_parms; /* iwarp_pm_common.c */ void parse_iwpm_config(FILE *); int create_iwpm_socket_v4(__u16); int create_iwpm_socket_v6(__u16); int create_netlink_socket(void); void destroy_iwpm_socket(int); int parse_iwpm_nlmsg(struct nlmsghdr *, int, struct nla_policy *, struct nlattr * [], const char *); int parse_iwpm_msg(iwpm_wire_msg *, iwpm_msg_parms *); void form_iwpm_request(iwpm_wire_msg *, iwpm_msg_parms *); void form_iwpm_accept(iwpm_wire_msg *, iwpm_msg_parms *); void form_iwpm_ack(iwpm_wire_msg *, iwpm_msg_parms *); void form_iwpm_reject(iwpm_wire_msg *, iwpm_msg_parms *); int send_iwpm_nlmsg(int, struct nl_msg *, int); struct nl_msg *create_iwpm_nlmsg(__u16, int); void print_iwpm_sockaddr(struct sockaddr_storage *, const char *, __u32); __be16 get_sockaddr_port(struct sockaddr_storage *sockaddr); void copy_iwpm_sockaddr(__u16, struct sockaddr_storage *, struct sockaddr_storage *, char *, char *, __be16 *); int is_wcard_ipaddr(struct sockaddr_storage *); /* iwarp_pm_helper.c */ iwpm_mapped_port *create_iwpm_mapped_port(struct sockaddr_storage *, int, __u32 flags); iwpm_mapped_port *reopen_iwpm_mapped_port(struct sockaddr_storage *, struct sockaddr_storage *, int, __u32 flags); void add_iwpm_mapped_port(iwpm_mapped_port *); iwpm_mapped_port *find_iwpm_mapping(struct sockaddr_storage *, int); iwpm_mapped_port *find_iwpm_same_mapping(struct sockaddr_storage *, int); void remove_iwpm_mapped_port(iwpm_mapped_port *); void print_iwpm_mapped_ports(void); void free_iwpm_port(iwpm_mapped_port *); iwpm_mapping_request *create_iwpm_map_request(struct nlmsghdr *, struct sockaddr_storage *, struct sockaddr_storage *, __u64, int, iwpm_send_msg *); void add_iwpm_map_request(iwpm_mapping_request *); int update_iwpm_map_request(__u64, struct sockaddr_storage *, int, iwpm_mapping_request *, int); void remove_iwpm_map_request(iwpm_mapping_request *); void form_iwpm_send_msg(int, struct sockaddr_storage *, int, iwpm_send_msg *); int send_iwpm_msg(void (*form_msg_type)(iwpm_wire_msg *, iwpm_msg_parms *), iwpm_msg_parms *, struct sockaddr_storage *, int); int add_iwpm_pending_msg(iwpm_send_msg *); int check_same_sockaddr(struct sockaddr_storage *, struct sockaddr_storage *); void free_iwpm_mapped_ports(void); extern struct list_head pending_messages; extern struct list_head mapping_reqs; extern iwpm_client client_list[IWARP_PM_MAX_CLIENTS]; extern pthread_cond_t cond_req_complete; extern pthread_mutex_t map_req_mutex; extern int wake; extern pthread_cond_t cond_pending_msg; extern pthread_mutex_t pending_msg_mutex; #endif rdma-core-56.1/iwpmd/iwarp_pm_common.c000066400000000000000000000464411477342711600200150ustar00rootroot00000000000000/* * Copyright (c) 2013-2015 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include "iwarp_pm.h" #include /* iwpm config params */ static const char * iwpm_param_names[IWPM_PARAM_NUM] = { "nl_sock_rbuf_size" }; static int iwpm_param_vals[IWPM_PARAM_NUM] = { 0 }; /** * get_iwpm_param() */ static int get_iwpm_param(char *param_name, int val) { int i, ret; for (i = 0; i < IWPM_PARAM_NUM; i++) { ret = strcmp(param_name, iwpm_param_names[i]); if (!ret && val > 0) { syslog(LOG_WARNING, "get_iwpm_param: Got param (name = %s val = %d)\n", param_name, val); iwpm_param_vals[i] = val; return ret; } } return ret; } /** * parse_iwpm_config() */ void parse_iwpm_config(FILE *fp) { char line_buf[128]; char param_name[IWPM_PARAM_NAME_LEN]; int n, val, ret; char *str; str = fgets(line_buf, 128, fp); while (str) { if (line_buf[0] == '#' || line_buf[0] == '\n') goto parse_next_line; n = sscanf(line_buf, "%63[^= ] %*[=]%d", param_name, &val); if (n != 2) { syslog(LOG_WARNING, "parse_iwpm_config: Couldn't parse a line (n = %d, name = %s, val = %d\n", n, param_name, val); goto parse_next_line; } ret = get_iwpm_param(param_name, val); if (ret) syslog(LOG_WARNING, "parse_iwpm_config: Couldn't find param (ret = %d)\n", ret); parse_next_line: str = fgets(line_buf, 128, fp); } } /** * create_iwpm_socket_v4 - Create an ipv4 socket for the iwarp port mapper * @bind_port: UDP port to bind the socket * * Return a handle of ipv4 socket */ int create_iwpm_socket_v4(__u16 bind_port) { sockaddr_union bind_addr; struct sockaddr_in *bind_in4; int pm_sock; socklen_t sockname_len; char ip_address_text[INET6_ADDRSTRLEN]; /* create a socket */ pm_sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); if (pm_sock < 0) { syslog(LOG_WARNING, "create_iwpm_socket_v4: Unable to create socket. %s.\n", strerror(errno)); pm_sock = -errno; goto create_socket_v4_exit; } /* bind the socket to the given port */ memset(&bind_addr, 0, sizeof(bind_addr)); bind_in4 = &bind_addr.v4_sockaddr; bind_in4->sin_family = AF_INET; bind_in4->sin_addr.s_addr = htobe32(INADDR_ANY); bind_in4->sin_port = htobe16(bind_port); if (bind(pm_sock, &bind_addr.sock_addr, sizeof(struct sockaddr_in))) { syslog(LOG_WARNING, "create_iwpm_socket_v4: Unable to bind socket (port = %u). %s.\n", bind_port, strerror(errno)); close(pm_sock); pm_sock = -errno; goto create_socket_v4_exit; } /* get the socket name (local port number) */ sockname_len = sizeof(struct sockaddr_in); if (getsockname(pm_sock, &bind_addr.sock_addr, &sockname_len)) { syslog(LOG_WARNING, "create_iwpm_socket_v4: Unable to get socket name. 
%s.\n", strerror(errno)); close(pm_sock); pm_sock = -errno; goto create_socket_v4_exit; } iwpm_debug(IWARP_PM_WIRE_DBG, "create_iwpm_socket_v4: Socket IP address:port %s:%u\n", inet_ntop(bind_in4->sin_family, &bind_in4->sin_addr.s_addr, ip_address_text, INET6_ADDRSTRLEN), be16toh(bind_in4->sin_port)); create_socket_v4_exit: return pm_sock; } /** * create_iwpm_socket_v6 - Create an ipv6 socket for the iwarp port mapper * @bind_port: UDP port to bind the socket * * Return a handle of ipv6 socket */ int create_iwpm_socket_v6(__u16 bind_port) { sockaddr_union bind_addr; struct sockaddr_in6 *bind_in6; int pm_sock, ret_value, ipv6_only; socklen_t sockname_len; char ip_address_text[INET6_ADDRSTRLEN]; /* create a socket */ pm_sock = socket(AF_INET6, SOCK_DGRAM, IPPROTO_UDP); if (pm_sock < 0) { syslog(LOG_WARNING, "create_iwpm_socket_v6: Unable to create socket. %s.\n", strerror(errno)); pm_sock = -errno; goto create_socket_v6_exit; } ipv6_only = 1; ret_value = setsockopt(pm_sock, IPPROTO_IPV6, IPV6_V6ONLY, &ipv6_only, sizeof(ipv6_only)); if (ret_value < 0) { syslog(LOG_WARNING, "create_iwpm_socket_v6: Unable to set sock options. %s.\n", strerror(errno)); close(pm_sock); pm_sock = -errno; goto create_socket_v6_exit; } /* bind the socket to the given port */ memset(&bind_addr, 0, sizeof(bind_addr)); bind_in6 = &bind_addr.v6_sockaddr; bind_in6->sin6_family = AF_INET6; bind_in6->sin6_addr = in6addr_any; bind_in6->sin6_port = htobe16(bind_port); if (bind(pm_sock, &bind_addr.sock_addr, sizeof(struct sockaddr_in6))) { syslog(LOG_WARNING, "create_iwpm_socket_v6: Unable to bind socket (port = %u). %s.\n", bind_port, strerror(errno)); close(pm_sock); pm_sock = -errno; goto create_socket_v6_exit; } /* get the socket name (local port number) */ sockname_len = sizeof(struct sockaddr_in6); if (getsockname(pm_sock, &bind_addr.sock_addr, &sockname_len)) { syslog(LOG_WARNING, "create_iwpm_socket_v6: Unable to get socket name. %s.\n", strerror(errno)); close(pm_sock); pm_sock = -errno; goto create_socket_v6_exit; } iwpm_debug(IWARP_PM_WIRE_DBG, "create_iwpm_socket_v6: Socket IP address:port %s:%04X\n", inet_ntop(bind_in6->sin6_family, &bind_in6->sin6_addr, ip_address_text, INET6_ADDRSTRLEN), be16toh(bind_in6->sin6_port)); create_socket_v6_exit: return pm_sock; } /** * create_netlink_socket - Create netlink socket for the iwarp port mapper */ int create_netlink_socket(void) { sockaddr_union bind_addr; struct sockaddr_nl *bind_nl; int nl_sock; __u32 rbuf_size; socklen_t opt_len; /* create a socket */ nl_sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_RDMA); if (nl_sock < 0) { syslog(LOG_WARNING, "create_netlink_socket: Unable to create socket. %s.\n", strerror(errno)); nl_sock = -errno; goto create_nl_socket_exit; } /* bind the socket */ memset(&bind_addr, 0, sizeof(bind_addr)); bind_nl = &bind_addr.nl_sockaddr; bind_nl->nl_family = AF_NETLINK; bind_nl->nl_pid = getpid(); bind_nl->nl_groups = 3; /* != 0 support multicast */ if (bind(nl_sock, &bind_addr.sock_addr, sizeof(struct sockaddr_nl))) { syslog(LOG_WARNING, "create_netlink_socket: Unable to bind socket. %s.\n", strerror(errno)); close(nl_sock); nl_sock = -errno; goto create_nl_socket_exit; } if (iwpm_param_vals[NL_SOCK_RBUF_SIZE] > 0) { rbuf_size = iwpm_param_vals[NL_SOCK_RBUF_SIZE]; if (setsockopt(nl_sock, SOL_SOCKET, SO_RCVBUFFORCE, &rbuf_size, sizeof rbuf_size)) { syslog(LOG_WARNING, "create_netlink_socket: Unable to set sock option " "(rbuf_size = %u). 
%s.\n", rbuf_size, strerror(errno)); if (setsockopt(nl_sock, SOL_SOCKET, SO_RCVBUF, &rbuf_size, sizeof rbuf_size)) { syslog(LOG_WARNING, "create_netlink_socket: " "Unable to set sock option %s. Closing socket\n", strerror(errno)); close(nl_sock); nl_sock = -errno; goto create_nl_socket_exit; } } } opt_len = sizeof(rbuf_size); if (getsockopt(nl_sock, SOL_SOCKET, SO_RCVBUF, &rbuf_size, &opt_len)) { iwpm_debug(IWARP_PM_NETLINK_DBG, "create_netlink_socket: Setting a sock option (rbuf_size = %u).\n", rbuf_size); } else { syslog(LOG_WARNING, "create_netlink_socket: Failed to get socket option. %s.\n", strerror(errno)); } create_nl_socket_exit: return nl_sock; } /** * destroy_iwpm_socket - Close socket */ void destroy_iwpm_socket(int pm_sock) { if (pm_sock >= 0) close(pm_sock); pm_sock = -1; } /** * check_iwpm_nlattr - Check for NULL netlink attribute */ static int check_iwpm_nlattr(struct nlattr *nltb[], int nla_count) { int i, ret = 0; for (i = 1; i < nla_count; i++) { if (!nltb[i]) { iwpm_debug(IWARP_PM_NETLINK_DBG, "check_iwpm_nlattr: NULL (attr idx = %d)\n", i); ret = -EINVAL; } } return ret; } /** * parse_iwpm_nlmsg - Parse a netlink message * @req_nlh: netlink header of the received message to parse * @policy_max: the number of attributes in the policy * @nlmsg_policy: the attribute policy * @nltb: array to store the parsed attributes * @msg_type: netlink message type (dbg purpose) */ int parse_iwpm_nlmsg(struct nlmsghdr *req_nlh, int policy_max, struct nla_policy *nlmsg_policy, struct nlattr *nltb [], const char *msg_type) { const char *str_err; int ret; if ((ret = nlmsg_validate(req_nlh, 0, policy_max-1, nlmsg_policy))) { str_err = "nlmsg_validate error"; goto parse_nlmsg_error; } if ((ret = nlmsg_parse(req_nlh, 0, nltb, policy_max-1, nlmsg_policy))) { str_err = "nlmsg_parse error"; goto parse_nlmsg_error; } if (check_iwpm_nlattr(nltb, policy_max)) { ret = -EINVAL; str_err = "NULL nlmsg attribute"; goto parse_nlmsg_error; } return 0; parse_nlmsg_error: syslog(LOG_WARNING, "parse_iwpm_nlmsg: msg type = %s (%s ret = %d)\n", msg_type, str_err, ret); return ret; } /** * send_iwpm_nlmsg - Send a netlink message * @nl_sock: netlink socket to use for sending the message * @nlmsg: netlink message to send * @dest_pid: pid of the destination of the nlmsg */ int send_iwpm_nlmsg(int nl_sock, struct nl_msg *nlmsg, int dest_pid) { struct sockaddr_nl dest_addr; struct nlmsghdr *nlh = nlmsg_hdr(nlmsg); __u32 nlmsg_len = nlh->nlmsg_len; int len; /* fill in the netlink address of the client */ memset(&dest_addr, 0, sizeof(dest_addr)); dest_addr.nl_groups = 0; dest_addr.nl_family = AF_NETLINK; dest_addr.nl_pid = dest_pid; /* send response to the client */ len = sendto(nl_sock, (char *)nlh, nlmsg_len, 0, (struct sockaddr *)&dest_addr, sizeof(dest_addr)); if (len != nlmsg_len) return -errno; return 0; } /** * create_iwpm_nlmsg - Create a netlink message * @nlmsg_type: type of the netlink message * @client: the port mapper client to receive the message */ struct nl_msg *create_iwpm_nlmsg(__u16 nlmsg_type, int client_idx) { struct nl_msg *nlmsg; struct nlmsghdr *nlh; __u32 seq = 0; nlmsg = nlmsg_alloc(); if (!nlmsg) return NULL; if (client_idx > 0) seq = client_list[client_idx].nl_seq++; nlh = nlmsg_put(nlmsg, getpid(), seq, nlmsg_type, 0, NLM_F_REQUEST); if (!nlh) { nlmsg_free(nlmsg); return NULL; } return nlmsg; } /** * parse_iwpm_msg - Parse iwarp port mapper wire message * @pm_msg: iwpm message to be parsed * @msg_parms: contains the parameters of the iwpm message after parsing */ int 
parse_iwpm_msg(iwpm_wire_msg *pm_msg, iwpm_msg_parms *msg_parms) { int ret_value = 0; msg_parms->pmtime = pm_msg->pmtime; msg_parms->assochandle = be64toh(pm_msg->assochandle); msg_parms->ip_ver = (pm_msg->magic & IWARP_PM_IPVER_MASK) >> IWARP_PM_IPVER_SHIFT; switch (msg_parms->ip_ver) { case 4: msg_parms->address_family = AF_INET; break; case 6: msg_parms->address_family = AF_INET6; break; default: syslog(LOG_WARNING, "parse_iwpm_msg: Invalid IP version = %d.\n", msg_parms->ip_ver); return -EINVAL; } /* port mapper protocol version */ msg_parms->ver = (pm_msg->magic & IWARP_PM_VER_MASK) >> IWARP_PM_VER_SHIFT; /* message type */ msg_parms->mt = (pm_msg->magic & IWARP_PM_MT_MASK) >> IWARP_PM_MT_SHIFT; msg_parms->apport = pm_msg->apport; /* accepting peer port */ msg_parms->cpport = pm_msg->cpport; /* connecting peer port */ /* copy accepting peer IP address */ memcpy(&msg_parms->apipaddr, &pm_msg->apipaddr, IWPM_IPADDR_SIZE); /* copy connecting peer IP address */ memcpy(&msg_parms->cpipaddr, &pm_msg->cpipaddr, IWPM_IPADDR_SIZE); if (msg_parms->mt == IWARP_PM_MT_REQ) { msg_parms->mapped_cpport = pm_msg->reserved; memcpy(&msg_parms->mapped_cpipaddr, &pm_msg->mapped_cpipaddr, IWPM_IPADDR_SIZE); } return ret_value; } /** * form_iwpm_msg - Form iwarp port mapper wire message * @pm_msg: iwpm message to be formed * @msg_parms: the parameters to be packed in a iwpm message */ static void form_iwpm_msg(iwpm_wire_msg *pm_msg, iwpm_msg_parms *msg_parms) { memset(pm_msg, 0, sizeof(struct iwpm_wire_msg)); pm_msg->pmtime = msg_parms->pmtime; pm_msg->assochandle = htobe64(msg_parms->assochandle); /* record IP version, port mapper version, message type */ pm_msg->magic = (msg_parms->ip_ver << IWARP_PM_IPVER_SHIFT) & IWARP_PM_IPVER_MASK; pm_msg->magic |= (msg_parms->ver << IWARP_PM_VER_SHIFT) & IWARP_PM_VER_MASK; pm_msg->magic |= (msg_parms->mt << IWARP_PM_MT_SHIFT) & IWARP_PM_MT_MASK; pm_msg->apport = msg_parms->apport; pm_msg->cpport = msg_parms->cpport; memcpy(&pm_msg->apipaddr, &msg_parms->apipaddr, IWPM_IPADDR_SIZE); memcpy(&pm_msg->cpipaddr, &msg_parms->cpipaddr, IWPM_IPADDR_SIZE); if (msg_parms->mt == IWARP_PM_MT_REQ) { pm_msg->reserved = msg_parms->mapped_cpport; memcpy(&pm_msg->mapped_cpipaddr, &msg_parms->mapped_cpipaddr, IWPM_IPADDR_SIZE); } } /** * form_iwpm_request - Form iwarp port mapper request message * @pm_msg: iwpm message to be formed * @msg_parms: the parameters to be packed in a iwpm message **/ void form_iwpm_request(struct iwpm_wire_msg *pm_msg, struct iwpm_msg_parms *msg_parms) { msg_parms->mt = IWARP_PM_MT_REQ; msg_parms->msize = IWARP_PM_MESSAGE_SIZE + IWPM_IPADDR_SIZE; form_iwpm_msg(pm_msg, msg_parms); } /** * form_iwpm_accept - Form iwarp port mapper accept message * @pm_msg: iwpm message to be formed * @msg_parms: the parameters to be packed in a iwpm message **/ void form_iwpm_accept(struct iwpm_wire_msg *pm_msg, struct iwpm_msg_parms *msg_parms) { msg_parms->mt = IWARP_PM_MT_ACC; msg_parms->msize = IWARP_PM_MESSAGE_SIZE; form_iwpm_msg(pm_msg, msg_parms); } /** * form_iwpm_ack - Form iwarp port mapper ack message * @pm_msg: iwpm message to be formed * @msg_parms: the parameters to be packed in a iwpm message **/ void form_iwpm_ack(struct iwpm_wire_msg *pm_msg, struct iwpm_msg_parms *msg_parms) { msg_parms->mt = IWARP_PM_MT_ACK; msg_parms->msize = IWARP_PM_MESSAGE_SIZE; form_iwpm_msg(pm_msg, msg_parms); } /** * form_iwpm_reject - Form iwarp port mapper reject message * @pm_msg: iwpm message to be formed * @msg_parms: the parameters to be packed in a iwpm message */ void 
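/*
 * Worked example of the magic byte form_iwpm_msg() packs above, with
 * illustrative values ver = 1, mt = IWARP_PM_MT_ACC (1), ip_ver = 4:
 *
 *	magic = (4 << IWARP_PM_IPVER_SHIFT)
 *	      | (1 << IWARP_PM_MT_SHIFT)
 *	      | (1 << IWARP_PM_VER_SHIFT) = 0x54
 */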
form_iwpm_reject(struct iwpm_wire_msg *pm_msg, struct iwpm_msg_parms *msg_parms) { msg_parms->mt = IWARP_PM_MT_REJ; msg_parms->msize = IWARP_PM_MESSAGE_SIZE; form_iwpm_msg(pm_msg, msg_parms); } /** * get_sockaddr_port - Report the tcp port number, contained in the sockaddr * @sockaddr: sockaddr storage to get the tcp port from */ __be16 get_sockaddr_port(struct sockaddr_storage *sockaddr) { struct sockaddr_in *sockaddr_v4; struct sockaddr_in6 *sockaddr_v6; __be16 port = 0; switch (sockaddr->ss_family) { case AF_INET: sockaddr_v4 = (struct sockaddr_in *)sockaddr; port = sockaddr_v4->sin_port; break; case AF_INET6: sockaddr_v6 = (struct sockaddr_in6 *)sockaddr; port = sockaddr_v6->sin6_port; break; default: syslog(LOG_WARNING, "get_sockaddr_port: Invalid sockaddr family.\n"); break; } return port; } /** * copy_iwpm_sockaddr - Copy (IP address and Port) from src to dst * @address_family: Internet address family * @src_sockaddr: socket address to copy (if NULL, use src_addr) * @dst_sockaddr: socket address to update (if NULL, use dst_addr) * @src_addr: IP address to copy (if NULL, use src_sockaddr) * @dst_addr: IP address to update (if NULL, use dst_sockaddr) * @src_port: port to copy in dst_sockaddr, if src_sockaddr = NULL * port to update, if src_sockaddr != NULL and dst_sockaddr = NULL */ void copy_iwpm_sockaddr(__u16 addr_family, struct sockaddr_storage *src_sockaddr, struct sockaddr_storage *dst_sockaddr, char *src_addr, char *dst_addr, __be16 *src_port) { switch (addr_family) { case AF_INET: { const struct in_addr *src = (void *)src_addr; struct in_addr *dst = (void *)dst_addr; const struct sockaddr_in *src_sockaddr_in; struct sockaddr_in *dst_sockaddr_in; if (src_sockaddr) { src_sockaddr_in = (const void *)src_sockaddr; src = &src_sockaddr_in->sin_addr; *src_port = src_sockaddr_in->sin_port; } if (dst_sockaddr) { dst_sockaddr_in = (void *)dst_sockaddr; dst = &dst_sockaddr_in->sin_addr; dst_sockaddr_in->sin_port = *src_port; dst_sockaddr_in->sin_family = AF_INET; } *dst = *src; break; } case AF_INET6: { const struct in6_addr *src = (void *)src_addr; struct in6_addr *dst = (void *)dst_addr; const struct sockaddr_in6 *src_sockaddr_in6; struct sockaddr_in6 *dst_sockaddr_in6; if (src_sockaddr) { src_sockaddr_in6 = (const void *)src_sockaddr; src = &src_sockaddr_in6->sin6_addr; *src_port = src_sockaddr_in6->sin6_port; } if (dst_sockaddr) { dst_sockaddr_in6 = (void *)dst_sockaddr; dst = &dst_sockaddr_in6->sin6_addr; dst_sockaddr_in6->sin6_port = *src_port; dst_sockaddr_in6->sin6_family = AF_INET6; } *dst = *src; break; } default: assert(false); if (dst_sockaddr) dst_sockaddr->ss_family = addr_family; } } /** * is_wcard_ipaddr - Check if the search_addr has a wild card ip address */ int is_wcard_ipaddr(struct sockaddr_storage *search_addr) { int ret = 0; switch (search_addr->ss_family) { case AF_INET: { struct sockaddr_in wcard_addr; struct sockaddr_in *in4addr = (struct sockaddr_in *)search_addr; inet_pton(AF_INET, "0.0.0.0", &wcard_addr.sin_addr); if (in4addr->sin_addr.s_addr == wcard_addr.sin_addr.s_addr) ret = 1; break; } case AF_INET6: { struct sockaddr_in6 wcard_addr; struct sockaddr_in6 *in6addr = (struct sockaddr_in6 *)search_addr; inet_pton(AF_INET6, "::", &wcard_addr.sin6_addr); if (!memcmp(in6addr->sin6_addr.s6_addr, wcard_addr.sin6_addr.s6_addr, IWPM_IPADDR_SIZE)) ret = 1; break; } default: syslog(LOG_WARNING, "check_same_sockaddr: Invalid addr family 0x%02X\n", search_addr->ss_family); break; } return ret; } /** * print_iwpm_sockaddr - Print socket address (IP address and Port) * 
@sockaddr: socket address to print * @msg: message to print */ void print_iwpm_sockaddr(struct sockaddr_storage *sockaddr, const char *msg, __u32 dbg_flag) { struct sockaddr_in6 *sockaddr_v6; struct sockaddr_in *sockaddr_v4; char ip_address_text[INET6_ADDRSTRLEN]; switch (sockaddr->ss_family) { case AF_INET: sockaddr_v4 = (struct sockaddr_in *)sockaddr; iwpm_debug(dbg_flag, "%s IPV4 %s:%u(0x%04X)\n", msg, inet_ntop(AF_INET, &sockaddr_v4->sin_addr, ip_address_text, INET6_ADDRSTRLEN), be16toh(sockaddr_v4->sin_port), be16toh(sockaddr_v4->sin_port)); break; case AF_INET6: sockaddr_v6 = (struct sockaddr_in6 *)sockaddr; iwpm_debug(dbg_flag, "%s IPV6 %s:%u(0x%04X)\n", msg, inet_ntop(AF_INET6, &sockaddr_v6->sin6_addr, ip_address_text, INET6_ADDRSTRLEN), be16toh(sockaddr_v6->sin6_port), be16toh(sockaddr_v6->sin6_port)); break; default: break; } } rdma-core-56.1/iwpmd/iwarp_pm_helper.c000066400000000000000000000413741477342711600200040ustar00rootroot00000000000000/* * Copyright (c) 2013-2016 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include "iwarp_pm.h" static LIST_HEAD(mapped_ports); /* list of mapped ports */ /** * create_iwpm_map_request - Create a new map request tracking object * @req_nlh: netlink header of the received client message * @src_addr: the local address of the client initiating the request * @remote_addr: the destination (the port mapper peer) address * @assochandle: unique number per host * @msg_type: message types are request, accept and ack * @send_msg: message to retransmit to the remote port mapper peer, * if the request isn't serviced on time. 
*/ iwpm_mapping_request *create_iwpm_map_request(struct nlmsghdr *req_nlh, struct sockaddr_storage *src_addr, struct sockaddr_storage *remote_addr, __u64 assochandle, int msg_type, iwpm_send_msg *send_msg) { iwpm_mapping_request *iwpm_map_req; __u32 type = 0, seq = 0, pid = 0; /* create iwpm conversation tracking object */ iwpm_map_req = malloc(sizeof(iwpm_mapping_request)); if (!iwpm_map_req) return NULL; if (req_nlh) { type = req_nlh->nlmsg_type; seq = req_nlh->nlmsg_seq; pid = req_nlh->nlmsg_pid; } memset(iwpm_map_req, 0, sizeof(iwpm_mapping_request)); iwpm_map_req->timeout = IWPM_MAP_REQ_TIMEOUT; iwpm_map_req->complete = 0; iwpm_map_req->msg_type = msg_type; iwpm_map_req->send_msg = send_msg; iwpm_map_req->nlmsg_type = type; iwpm_map_req->nlmsg_seq = seq; iwpm_map_req->nlmsg_pid = pid; /* assochandle helps match iwpm request sent to remote peer with future iwpm accept/reject */ iwpm_map_req->assochandle = assochandle; if (!assochandle) iwpm_map_req->assochandle = (uintptr_t)iwpm_map_req; memcpy(&iwpm_map_req->src_addr, src_addr, sizeof(struct sockaddr_storage)); /* keep record of remote IP address and port */ memcpy(&iwpm_map_req->remote_addr, remote_addr, sizeof(struct sockaddr_storage)); return iwpm_map_req; } /** * add_iwpm_map_request - Add a map request tracking object to a global list * @iwpm_map_req: mapping request to be saved */ void add_iwpm_map_request(iwpm_mapping_request *iwpm_map_req) { pthread_mutex_lock(&map_req_mutex); list_add(&mapping_reqs, &iwpm_map_req->entry); /* if not wake, signal the thread that a new request has been posted */ if (!wake) pthread_cond_signal(&cond_req_complete); pthread_mutex_unlock(&map_req_mutex); } /** * remove_iwpm_map_request - Free a map request tracking object * @iwpm_map_req: mapping request to be removed * * Routine must be called within lock context */ void remove_iwpm_map_request(iwpm_mapping_request *iwpm_map_req) { if (!iwpm_map_req->complete && iwpm_map_req->msg_type != IWARP_PM_REQ_ACK) { iwpm_debug(IWARP_PM_RETRY_DBG, "remove_iwpm_map_request: " "Timeout for request (type = %u pid = %d)\n", iwpm_map_req->msg_type, iwpm_map_req->nlmsg_pid); } list_del(&iwpm_map_req->entry); if (iwpm_map_req->send_msg) free(iwpm_map_req->send_msg); free(iwpm_map_req); } /** * update_iwpm_map_request - Find and update a map request tracking object * @assochandle: the request assochandle to search for * @src_addr: the request src address to search for * @msg_type: the request type to search for * @iwpm_copy_req: to store a copy of the found map request object * @update: if set update the found request, otherwise don't update */ int update_iwpm_map_request(__u64 assochandle, struct sockaddr_storage *src_addr, int msg_type, iwpm_mapping_request *iwpm_copy_req, int update) { iwpm_mapping_request *iwpm_map_req; int ret = -EINVAL; pthread_mutex_lock(&map_req_mutex); /* look for a matching entry in the list */ list_for_each(&mapping_reqs, iwpm_map_req, entry) { if (assochandle == iwpm_map_req->assochandle && (msg_type & iwpm_map_req->msg_type) && check_same_sockaddr(src_addr, &iwpm_map_req->src_addr)) { ret = 0; /* get a copy of the request (a different thread is in charge of freeing it) */ memcpy(iwpm_copy_req, iwpm_map_req, sizeof(iwpm_mapping_request)); if (!update) goto update_map_request_exit; if (iwpm_map_req->complete) goto update_map_request_exit; /* update the request object */ if (iwpm_map_req->msg_type == IWARP_PM_REQ_ACK) { iwpm_map_req->timeout = IWPM_MAP_REQ_TIMEOUT; iwpm_map_req->complete = 0; } else { /* already serviced request could be 
freed */ iwpm_map_req->timeout = 0; iwpm_map_req->complete = 1; } goto update_map_request_exit; } } update_map_request_exit: pthread_mutex_unlock(&map_req_mutex); return ret; } /** * send_iwpm_msg - Form and send iwpm message to the remote peer */ int send_iwpm_msg(void (*form_msg_type)(iwpm_wire_msg *, iwpm_msg_parms *), iwpm_msg_parms *msg_parms, struct sockaddr_storage *recv_addr, int send_sock) { iwpm_send_msg send_msg; form_msg_type(&send_msg.data, msg_parms); form_iwpm_send_msg(send_sock, recv_addr, msg_parms->msize, &send_msg); return add_iwpm_pending_msg(&send_msg); } /** * get_iwpm_tcp_port - Get a new TCP port from the host stack * @addr_family: should be valid AF_INET or AF_INET6 * @requested_port: set only if reopening of mapped port * @mapped_addr: to store the mapped TCP port * @new_sock: to store socket handle (bound to the mapped TCP port) */ static int get_iwpm_tcp_port(__u16 addr_family, __be16 requested_port, struct sockaddr_storage *mapped_addr, int *new_sock) { sockaddr_union bind_addr; struct sockaddr_in *bind_in4; struct sockaddr_in6 *bind_in6; socklen_t sockname_len; __be16 *new_port = NULL, *mapped_port = NULL; const char *str_err = ""; /* create a socket */ *new_sock = socket(addr_family, SOCK_STREAM, 0); if (*new_sock < 0) { str_err = "Unable to create socket"; goto get_tcp_port_error; } memset(&bind_addr, 0, sizeof(bind_addr)); switch (addr_family) { case AF_INET: mapped_port = &((struct sockaddr_in *)mapped_addr)->sin_port; bind_in4 = &bind_addr.v4_sockaddr; bind_in4->sin_family = addr_family; bind_in4->sin_addr.s_addr = htobe32(INADDR_ANY); if (requested_port) requested_port = *mapped_port; bind_in4->sin_port = requested_port; new_port = &bind_in4->sin_port; break; case AF_INET6: mapped_port = &((struct sockaddr_in6 *)mapped_addr)->sin6_port; bind_in6 = &bind_addr.v6_sockaddr; bind_in6->sin6_family = addr_family; bind_in6->sin6_addr = in6addr_any; if (requested_port) requested_port = *mapped_port; bind_in6->sin6_port = requested_port; new_port = &bind_in6->sin6_port; break; default: str_err = "Invalid Internet address family"; goto get_tcp_port_error; } if (bind(*new_sock, &bind_addr.sock_addr, sizeof(bind_addr))) { str_err = "Unable to bind the socket"; goto get_tcp_port_error; } /* get the TCP port */ sockname_len = sizeof(bind_addr); if (getsockname(*new_sock, &bind_addr.sock_addr, &sockname_len)) { str_err = "Unable to get socket name"; goto get_tcp_port_error; } *mapped_port = *new_port; iwpm_debug(IWARP_PM_ALL_DBG, "get_iwpm_tcp_port: Open tcp port " "(addr family = %04X, requested port = %04X, mapped port = %04X).\n", addr_family, be16toh(requested_port), be16toh(*mapped_port)); return 0; get_tcp_port_error: syslog(LOG_WARNING, "get_iwpm_tcp_port: %s (addr family = %04X, requested port = %04X).\n", str_err, addr_family, be16toh(requested_port)); return -errno; } /** * get_iwpm_port - Allocate and initialize a new mapped port object */ static iwpm_mapped_port *get_iwpm_port(int client_idx, struct sockaddr_storage *local_addr, struct sockaddr_storage *mapped_addr, int sd) { iwpm_mapped_port *iwpm_port; iwpm_port = malloc(sizeof(iwpm_mapped_port)); if (!iwpm_port) { syslog(LOG_WARNING, "get_iwpm_port: Unable to allocate a mapped port.\n"); return NULL; } memset(iwpm_port, 0, sizeof(*iwpm_port)); /* record local and mapped address in the mapped port object */ memcpy(&iwpm_port->local_addr, local_addr, sizeof(struct sockaddr_storage)); memcpy(&iwpm_port->mapped_addr, mapped_addr, sizeof(struct sockaddr_storage)); iwpm_port->owner_client = client_idx; 
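	/* sd is the socket holding the mapped port open; callers pass -1
	 * when the mapping was created with IWPM_FLAGS_NO_PORT_MAP, and
	 * free_iwpm_port() skips the close() in that case */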
iwpm_port->sd = sd; atomic_init(&iwpm_port->ref_cnt, 1); if (is_wcard_ipaddr(local_addr)) iwpm_port->wcard = 1; return iwpm_port; } /** * create_iwpm_mapped_port - Create a new mapped port object * @local_addr: local address to be mapped (IP address and TCP port) * @client_idx: the index of the client owner of the mapped port */ iwpm_mapped_port *create_iwpm_mapped_port(struct sockaddr_storage *local_addr, int client_idx, __u32 flags) { iwpm_mapped_port *iwpm_port; struct sockaddr_storage mapped_addr; int new_sd; memcpy(&mapped_addr, local_addr, sizeof(mapped_addr)); /* get a tcp port from the host net stack */ if (flags & IWPM_FLAGS_NO_PORT_MAP) { new_sd = -1; } else { if (get_iwpm_tcp_port(local_addr->ss_family, 0, &mapped_addr, &new_sd)) goto create_mapped_port_error; } iwpm_port = get_iwpm_port(client_idx, local_addr, &mapped_addr, new_sd); return iwpm_port; create_mapped_port_error: iwpm_debug(IWARP_PM_ALL_DBG, "create_iwpm_mapped_port: Could not make port mapping.\n"); return NULL; } /** * reopen_iwpm_mapped_port - Create a new mapped port object * @local_addr: local address to be mapped (IP address and TCP port) * @mapped_addr: mapped address to be remapped (IP address and TCP port) * @client_idx: the index of the client owner of the mapped port */ iwpm_mapped_port *reopen_iwpm_mapped_port(struct sockaddr_storage *local_addr, struct sockaddr_storage *mapped_addr, int client_idx, __u32 flags) { iwpm_mapped_port *iwpm_port; int new_sd = -1; const char *str_err = ""; if (local_addr->ss_family != mapped_addr->ss_family) { str_err = "Different local and mapped sockaddr families"; goto reopen_mapped_port_error; } if (!(flags & IWPM_FLAGS_NO_PORT_MAP)) { if (get_iwpm_tcp_port(local_addr->ss_family, htobe16(1), mapped_addr, &new_sd)) goto reopen_mapped_port_error; } iwpm_port = get_iwpm_port(client_idx, local_addr, mapped_addr, new_sd); return iwpm_port; reopen_mapped_port_error: iwpm_debug(IWARP_PM_ALL_DBG, "reopen_iwpm_mapped_port: Could not make port mapping (%s).\n", str_err); if (new_sd >= 0) close(new_sd); return NULL; } /** * add_iwpm_mapped_port - Add mapping to a global list * @iwpm_port: mapping to be saved */ void add_iwpm_mapped_port(iwpm_mapped_port *iwpm_port) { static int dbg_idx = 1; if (atomic_load(&iwpm_port->ref_cnt) > 1) return; iwpm_debug(IWARP_PM_ALL_DBG, "add_iwpm_mapped_port: Adding a new mapping #%d\n", dbg_idx++); list_add(&mapped_ports, &iwpm_port->entry); } /** * check_same_sockaddr - Compare two sock addresses; * return true if they are same, false otherwise */ int check_same_sockaddr(struct sockaddr_storage *sockaddr_a, struct sockaddr_storage *sockaddr_b) { int ret = 0; if (sockaddr_a->ss_family == sockaddr_b->ss_family) { switch (sockaddr_a->ss_family) { case AF_INET: { struct sockaddr_in *in4addr_a = (struct sockaddr_in *)sockaddr_a; struct sockaddr_in *in4addr_b = (struct sockaddr_in *)sockaddr_b; if ((in4addr_a->sin_addr.s_addr == in4addr_b->sin_addr.s_addr) && (in4addr_a->sin_port == in4addr_b->sin_port)) ret = 1; break; } case AF_INET6: { struct sockaddr_in6 *in6addr_a = (struct sockaddr_in6 *)sockaddr_a; struct sockaddr_in6 *in6addr_b = (struct sockaddr_in6 *)sockaddr_b; if ((!memcmp(in6addr_a->sin6_addr.s6_addr, in6addr_b->sin6_addr.s6_addr, IWPM_IPADDR_SIZE)) && (in6addr_a->sin6_port == in6addr_b->sin6_port)) ret = 1; break; } default: syslog(LOG_WARNING, "check_same_sockaddr: Invalid addr family 0x%02X\n", sockaddr_a->ss_family); break; } } return ret; } /** * find_iwpm_mapping - Find saved mapped port object * @search_addr: IP address and port to 
search for in the list * @not_mapped: if set, compare local addresses, otherwise compare mapped addresses * * Compares the search_sockaddr to the addresses in the list, * to find a saved port object with the sockaddr or * a wild card address with the same tcp port */ iwpm_mapped_port *find_iwpm_mapping(struct sockaddr_storage *search_addr, int not_mapped) { iwpm_mapped_port *iwpm_port, *saved_iwpm_port = NULL; struct sockaddr_storage *current_addr; list_for_each(&mapped_ports, iwpm_port, entry) { current_addr = (not_mapped)? &iwpm_port->local_addr : &iwpm_port->mapped_addr; if (get_sockaddr_port(search_addr) == get_sockaddr_port(current_addr)) { if (check_same_sockaddr(search_addr, current_addr) || iwpm_port->wcard || is_wcard_ipaddr(search_addr)) { saved_iwpm_port = iwpm_port; goto find_mapping_exit; } } } find_mapping_exit: return saved_iwpm_port; } /** * find_iwpm_same_mapping - Find saved mapped port object * @search_addr: IP address and port to search for in the list * @not_mapped: if set, compare local addresses, otherwise compare mapped addresses * * Compares the search_sockaddr to the addresses in the list, * to find a saved port object with the same sockaddr */ iwpm_mapped_port *find_iwpm_same_mapping(struct sockaddr_storage *search_addr, int not_mapped) { iwpm_mapped_port *iwpm_port, *saved_iwpm_port = NULL; struct sockaddr_storage *current_addr; list_for_each(&mapped_ports, iwpm_port, entry) { current_addr = (not_mapped)? &iwpm_port->local_addr : &iwpm_port->mapped_addr; if (check_same_sockaddr(search_addr, current_addr)) { saved_iwpm_port = iwpm_port; goto find_same_mapping_exit; } } find_same_mapping_exit: return saved_iwpm_port; } /** * free_iwpm_port - Free mapping object * @iwpm_port: mapped port object to be freed */ void free_iwpm_port(iwpm_mapped_port *iwpm_port) { if (iwpm_port->sd != -1) close(iwpm_port->sd); free(iwpm_port); } /** * remove_iwpm_mapped_port - Remove a mapping from a global list * @iwpm_port: mapping to be removed * * Called only by the main iwarp port mapper thread */ void remove_iwpm_mapped_port(iwpm_mapped_port *iwpm_port) { static int dbg_idx = 1; iwpm_debug(IWARP_PM_ALL_DBG, "remove_iwpm_mapped_port: index = %d\n", dbg_idx++); list_del(&iwpm_port->entry); } void print_iwpm_mapped_ports(void) { iwpm_mapped_port *iwpm_port; int i = 0; syslog(LOG_WARNING, "print_iwpm_mapped_ports:\n"); list_for_each(&mapped_ports, iwpm_port, entry) { syslog(LOG_WARNING, "Mapping #%d\n", i++); print_iwpm_sockaddr(&iwpm_port->local_addr, "Local address", IWARP_PM_DEBUG); print_iwpm_sockaddr(&iwpm_port->mapped_addr, "Mapped address", IWARP_PM_DEBUG); } } /** * form_iwpm_send_msg - Form a message to send on the wire */ void form_iwpm_send_msg(int pm_sock, struct sockaddr_storage *dest, int length, iwpm_send_msg *send_msg) { send_msg->pm_sock = pm_sock; send_msg->length = length; memcpy(&send_msg->dest_addr, dest, sizeof(send_msg->dest_addr)); } /** * add_iwpm_pending_msg - Add wire message to a global list of pending messages * @send_msg: message to send to the remote port mapper peer */ int add_iwpm_pending_msg(iwpm_send_msg *send_msg) { iwpm_pending_msg *pending_msg = malloc(sizeof(iwpm_pending_msg)); if (!pending_msg) { syslog(LOG_WARNING, "add_iwpm_pending_msg: Unable to allocate message.\n"); return -ENOMEM; } memcpy(&pending_msg->send_msg, send_msg, sizeof(iwpm_send_msg)); pthread_mutex_lock(&pending_msg_mutex); list_add(&pending_messages, &pending_msg->entry); pthread_mutex_unlock(&pending_msg_mutex); /* signal the thread that a new message has been posted */ 
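/* Note: pending_msg_mutex has already been released here; that is safe because iwpm_pending_msgs_handler() re-locks and drains the whole list on each wakeup, so a coalesced or early signal at worst delays this message until the next post or retransmit pass. */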
pthread_cond_signal(&cond_pending_msg); return 0; } /** * free_iwpm_mapped_ports - Free all iwpm mapped port objects */ void free_iwpm_mapped_ports(void) { iwpm_mapped_port *iwpm_port; while ((iwpm_port = list_pop(&mapped_ports, iwpm_mapped_port, entry))) free_iwpm_port(iwpm_port); } rdma-core-56.1/iwpmd/iwarp_pm_server.c000066400000000000000000001521331477342711600200270ustar00rootroot00000000000000/* * Copyright (c) 2013-2016 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include "config.h" #include #include #include "iwarp_pm.h" #include #include static const char iwpm_ulib_name [] = "iWarpPortMapperUser"; static __u16 iwpm_version = IWPM_UABI_VERSION; LIST_HEAD(mapping_reqs); /* list of map tracking objects */ LIST_HEAD(pending_messages); /* list of pending wire messages */ iwpm_client client_list[IWARP_PM_MAX_CLIENTS];/* list of iwarp port mapper clients */ static int mapinfo_num_list[IWARP_PM_MAX_CLIENTS]; /* list of iwarp port mapper clients */ /* socket handles */ static int pmv4_sock, pmv6_sock, netlink_sock, pmv4_client_sock, pmv6_client_sock; static pthread_t map_req_thread; /* handling mapping requests timeout */ pthread_cond_t cond_req_complete; pthread_mutex_t map_req_mutex = PTHREAD_MUTEX_INITIALIZER; int wake = 0; /* set if map_req_thread is wake */ static pthread_t pending_msg_thread; /* sending iwpm wire messages */ pthread_cond_t cond_pending_msg; pthread_mutex_t pending_msg_mutex = PTHREAD_MUTEX_INITIALIZER; static void iwpm_cleanup(void); static int print_mappings = 0; static int send_iwpm_mapinfo_request(int nl_sock, int client); /** * iwpm_signal_handler - Handle signals which iwarp port mapper receives * @signum: the number of the caught signal */ static void iwpm_signal_handler(int signum) { switch(signum) { case SIGHUP: syslog(LOG_WARNING, "iwpm_signal_handler: Received SIGHUP signal\n"); iwpm_cleanup(); exit(signum); break; case SIGTERM: syslog(LOG_WARNING, "iwpm_signal_handler: Received SIGTERM signal\n"); iwpm_cleanup(); exit(EXIT_SUCCESS); break; case SIGUSR1: syslog(LOG_WARNING, "iwpm_signal_handler: Received SIGUSR1 signal\n"); print_mappings = 1; break; default: syslog(LOG_WARNING, "iwpm_signal_handler: Unhandled signal %d\n", signum); break; } } /** * iwpm_mapping_reqs_handler - Handle mapping requests 
timeouts and retries */ static void *iwpm_mapping_reqs_handler(void *unused) { iwpm_mapping_request *iwpm_map_req, *next_map_req; int ret = 0; while (1) { pthread_mutex_lock(&map_req_mutex); wake = 0; if (list_empty(&mapping_reqs)) { /* wait until a new mapping request is posted */ ret = pthread_cond_wait(&cond_req_complete, &map_req_mutex); if (ret) { syslog(LOG_WARNING, "mapping_reqs_handler: " "Condition wait failed (ret = %d)\n", ret); pthread_mutex_unlock(&map_req_mutex); goto mapping_reqs_handler_exit; } } pthread_mutex_unlock(&map_req_mutex); /* update timeouts of the posted mapping requests */ do { pthread_mutex_lock(&map_req_mutex); wake = 1; list_for_each_safe(&mapping_reqs, iwpm_map_req, next_map_req, entry) { if (iwpm_map_req->timeout > 0) { if (iwpm_map_req->timeout < IWPM_MAP_REQ_TIMEOUT && iwpm_map_req->msg_type != IWARP_PM_REQ_ACK) { /* the request is still incomplete, retransmit the message (every 1sec) */ add_iwpm_pending_msg(iwpm_map_req->send_msg); iwpm_debug(IWARP_PM_RETRY_DBG, "mapping_reqs_handler: " "Going to retransmit a msg, map request " "(assochandle = %llu, type = %u, timeout = %d)\n", iwpm_map_req->assochandle, iwpm_map_req->msg_type, iwpm_map_req->timeout); } iwpm_map_req->timeout--; /* hang around for 10s */ } else { remove_iwpm_map_request(iwpm_map_req); } } pthread_mutex_unlock(&map_req_mutex); sleep(1); } while (!list_empty(&mapping_reqs)); } mapping_reqs_handler_exit: return NULL; } /** * iwpm_pending_msgs_handler - Handle sending iwarp port mapper wire messages */ static void *iwpm_pending_msgs_handler(void *unused) { iwpm_pending_msg *pending_msg; iwpm_send_msg *send_msg; int retries = IWPM_SEND_MSG_RETRIES; int ret = 0; pthread_mutex_lock(&pending_msg_mutex); while (1) { /* wait until a new message is posted */ ret = pthread_cond_wait(&cond_pending_msg, &pending_msg_mutex); if (ret) { syslog(LOG_WARNING, "pending_msgs_handler: " "Condition wait failed (ret = %d)\n", ret); pthread_mutex_unlock(&pending_msg_mutex); goto pending_msgs_handler_exit; } /* try sending out each pending message and remove it from the list */ while ((pending_msg = list_pop(&pending_messages, iwpm_pending_msg, entry))) { retries = IWPM_SEND_MSG_RETRIES; while (retries) { send_msg = &pending_msg->send_msg; /* send out the message */ int bytes_sent = sendto(send_msg->pm_sock, (char *)&send_msg->data, send_msg->length, 0, (struct sockaddr *)&send_msg->dest_addr, sizeof(send_msg->dest_addr)); if (bytes_sent != send_msg->length) { retries--; syslog(LOG_WARNING, "pending_msgs_handler: " "Could not send to PM Socket send_msg = %p, retries = %d\n", send_msg, retries); } else retries = 0; /* no need to retry */ } free(pending_msg); } } pthread_mutex_unlock(&pending_msg_mutex); pending_msgs_handler_exit: return NULL; } static int send_iwpm_error_msg(__u32, __u16, int, int); /* Register pid query - nlmsg attributes */ static struct nla_policy reg_pid_policy[IWPM_NLA_REG_PID_MAX] = { [IWPM_NLA_REG_PID_SEQ] = { .type = NLA_U32 }, [IWPM_NLA_REG_IF_NAME] = { .type = NLA_STRING, .maxlen = IWPM_IFNAME_SIZE }, [IWPM_NLA_REG_IBDEV_NAME] = { .type = NLA_STRING, .maxlen = IWPM_ULIBNAME_SIZE }, [IWPM_NLA_REG_ULIB_NAME] = { .type = NLA_STRING, .maxlen = IWPM_ULIBNAME_SIZE } }; /** * process_iwpm_register_pid - Service a client query for port mapper pid * @req_nlh: netlink header of the received client message * @client_idx: the index of the client (unique for each iwpm client) * @nl_sock: netlink socket to send a message back to the client * * Process a query and send a response to the client which 
contains the iwpm pid * nlmsg response attributes: * IWPM_NLA_RREG_PID_SEQ * IWPM_NLA_RREG_IBDEV_NAME * IWPM_NLA_RREG_ULIB_NAME * IWPM_NLA_RREG_ULIB_VER * IWPM_NLA_RREG_PID_ERR */ static int process_iwpm_register_pid(struct nlmsghdr *req_nlh, int client_idx, int nl_sock) { iwpm_client *client; struct nlattr *nltb [IWPM_NLA_REG_PID_MAX]; struct nl_msg *resp_nlmsg = NULL; const char *ifname, *devname, *libname; __u16 err_code = 0; const char *msg_type = "Register Pid Request"; const char *str_err; int ret = -EINVAL; if (parse_iwpm_nlmsg(req_nlh, IWPM_NLA_REG_PID_MAX, reg_pid_policy, nltb, msg_type)) { str_err = "Received Invalid nlmsg"; err_code = IWPM_INVALID_NLMSG_ERR; goto register_pid_error; } ifname = (const char *)nla_get_string(nltb[IWPM_NLA_REG_IF_NAME]); devname = (const char *)nla_get_string(nltb[IWPM_NLA_REG_IBDEV_NAME]); libname = (const char *)nla_get_string(nltb[IWPM_NLA_REG_ULIB_NAME]); iwpm_debug(IWARP_PM_NETLINK_DBG, "process_register_pid: PID request from " "IB device %s Ethernet device %s User library %s " "(client idx = %d, msg seq = %u).\n", devname, ifname, libname, client_idx, req_nlh->nlmsg_seq); /* register a first time client */ client = &client_list[client_idx]; if (!client->valid) { memcpy(client->ibdevname, devname, IWPM_DEVNAME_SIZE); memcpy(client->ifname, ifname, IWPM_IFNAME_SIZE); memcpy(client->ulibname, libname, IWPM_ULIBNAME_SIZE); client->valid = 1; } else { /* check client info */ if (strcmp(client->ulibname, libname)) { str_err = "Incorrect library version"; err_code = IWPM_USER_LIB_INFO_ERR; goto register_pid_error; } } resp_nlmsg = create_iwpm_nlmsg(req_nlh->nlmsg_type, client_idx); if (!resp_nlmsg) { ret = -ENOMEM; str_err = "Unable to create nlmsg response"; goto register_pid_error; } str_err = "Invalid nlmsg attribute"; if ((ret = nla_put_u32(resp_nlmsg, IWPM_NLA_RREG_PID_SEQ, req_nlh->nlmsg_seq))) goto register_pid_error; if ((ret = nla_put_string(resp_nlmsg, IWPM_NLA_RREG_IBDEV_NAME, devname))) goto register_pid_error; if ((ret = nla_put_string(resp_nlmsg, IWPM_NLA_RREG_ULIB_NAME, iwpm_ulib_name))) goto register_pid_error; if ((ret = nla_put_u16(resp_nlmsg, IWPM_NLA_RREG_ULIB_VER, iwpm_version))) goto register_pid_error; if ((ret = nla_put_u16(resp_nlmsg, IWPM_NLA_RREG_PID_ERR, err_code))) goto register_pid_error; if ((ret = send_iwpm_nlmsg(nl_sock, resp_nlmsg, req_nlh->nlmsg_pid))) { str_err = "Unable to send nlmsg response"; goto register_pid_error; } nlmsg_free(resp_nlmsg); return 0; register_pid_error: if (resp_nlmsg) nlmsg_free(resp_nlmsg); syslog(LOG_WARNING, "process_register_pid: %s ret = %d.\n", str_err, ret); if (err_code) send_iwpm_error_msg(req_nlh->nlmsg_seq, err_code, client_idx, nl_sock); return ret; } /* Add mapping request - nlmsg attributes */ static struct nla_policy manage_map_policy[IWPM_NLA_MANAGE_MAPPING_MAX] = { [IWPM_NLA_MANAGE_MAPPING_SEQ] = { .type = NLA_U32 }, [IWPM_NLA_MANAGE_ADDR] = { .minlen = sizeof(struct sockaddr_storage) }, [IWPM_NLA_MANAGE_FLAGS] = { .type = NLA_U32 } }; /** * process_iwpm_add_mapping - Service a client request for mapping of a local address * @req_nlh: netlink header of the received client message * @client_idx: the index of the client (unique for each iwpm client) * @nl_sock: netlink socket to send a message back to the client * * Process a mapping request for a local address and send a response to the client * which contains the mapped local address (IP address and TCP port) * nlmsg response attributes: * [IWPM_NLA_RMANAGE_MAPPING_SEQ] * [IWPM_NLA_RMANAGE_ADDR] * 
[IWPM_NLA_RMANAGE_MAPPED_LOC_ADDR] * [IWPM_NLA_RMANAGE_MAPPING_ERR] */ static int process_iwpm_add_mapping(struct nlmsghdr *req_nlh, int client_idx, int nl_sock) { iwpm_mapped_port *iwpm_port = NULL; struct nlattr *nltb [IWPM_NLA_MANAGE_MAPPING_MAX] = {}; struct nl_msg *resp_nlmsg = NULL; struct sockaddr_storage *local_addr; int not_mapped = 1; __u16 err_code = 0; const char *msg_type = "Add Mapping Request"; const char *str_err = ""; int ret = -EINVAL; __u32 flags; int max = IWPM_NLA_MANAGE_MAPPING_MAX; if (iwpm_version != IWPM_UABI_VERSION) max--; if (parse_iwpm_nlmsg(req_nlh, max, manage_map_policy, nltb, msg_type)) { err_code = IWPM_INVALID_NLMSG_ERR; str_err = "Received Invalid nlmsg"; goto add_mapping_error; } local_addr = (struct sockaddr_storage *)nla_data(nltb[IWPM_NLA_MANAGE_ADDR]); flags = nltb[IWPM_NLA_MANAGE_FLAGS] ? nla_get_u32(nltb[IWPM_NLA_MANAGE_FLAGS]) : 0; iwpm_port = find_iwpm_mapping(local_addr, not_mapped); if (iwpm_port) { if (check_same_sockaddr(local_addr, &iwpm_port->local_addr) && iwpm_port->wcard) { atomic_fetch_add(&iwpm_port->ref_cnt, 1); } else { err_code = IWPM_DUPLICATE_MAPPING_ERR; str_err = "Duplicate mapped port"; goto add_mapping_error; } } else { iwpm_port = create_iwpm_mapped_port(local_addr, client_idx, flags); if (!iwpm_port) { err_code = IWPM_CREATE_MAPPING_ERR; str_err = "Unable to create new mapping"; goto add_mapping_error; } } resp_nlmsg = create_iwpm_nlmsg(req_nlh->nlmsg_type, client_idx); if (!resp_nlmsg) { ret = -ENOMEM; str_err = "Unable to create nlmsg response"; goto add_mapping_free_error; } str_err = "Invalid nlmsg attribute"; if ((ret = nla_put_u32(resp_nlmsg, IWPM_NLA_RMANAGE_MAPPING_SEQ, req_nlh->nlmsg_seq))) goto add_mapping_free_error; if ((ret = nla_put(resp_nlmsg, IWPM_NLA_RMANAGE_ADDR, sizeof(struct sockaddr_storage), &iwpm_port->local_addr))) goto add_mapping_free_error; if ((ret = nla_put(resp_nlmsg, IWPM_NLA_RMANAGE_MAPPED_LOC_ADDR, sizeof(struct sockaddr_storage), &iwpm_port->mapped_addr))) goto add_mapping_free_error; if ((ret = nla_put_u16(resp_nlmsg, IWPM_NLA_RMANAGE_MAPPING_ERR, err_code))) goto add_mapping_free_error; if ((ret = send_iwpm_nlmsg(nl_sock, resp_nlmsg, req_nlh->nlmsg_pid))) { str_err = "Unable to send nlmsg response"; goto add_mapping_free_error; } /* add the new mapping to the list */ add_iwpm_mapped_port(iwpm_port); nlmsg_free(resp_nlmsg); return 0; add_mapping_free_error: if (resp_nlmsg) nlmsg_free(resp_nlmsg); if (iwpm_port) { if (atomic_fetch_sub(&iwpm_port->ref_cnt, 1) == 1) free_iwpm_port(iwpm_port); } add_mapping_error: syslog(LOG_WARNING, "process_add_mapping: %s (failed request from client = %s).\n", str_err, client_list[client_idx].ibdevname); if (err_code) { /* send error message to the client */ send_iwpm_error_msg(req_nlh->nlmsg_seq, err_code, client_idx, nl_sock); } return ret; } /* Query mapping request - nlmsg attributes */ static struct nla_policy query_map_policy[IWPM_NLA_QUERY_MAPPING_MAX] = { [IWPM_NLA_QUERY_MAPPING_SEQ] = { .type = NLA_U32 }, [IWPM_NLA_QUERY_LOCAL_ADDR] = { .minlen = sizeof(struct sockaddr_storage) }, [IWPM_NLA_QUERY_REMOTE_ADDR] = { .minlen = sizeof(struct sockaddr_storage) }, [IWPM_NLA_QUERY_FLAGS] = { .type = NLA_U32 } }; /** * process_iwpm_query_mapping - Service a client request for local and remote mapping * @req_nlh: netlink header of the received client message * @client_idx: the index of the client (the index is unique for each iwpm client) * @nl_sock: netlink socket to send a message back to the client * * Process a client request for local and remote address 
mapping * Create mapping for the local address (IP address and TCP port) * Send a request to the remote port mapper peer to find out the remote address mapping */ static int process_iwpm_query_mapping(struct nlmsghdr *req_nlh, int client_idx, int nl_sock) { iwpm_mapped_port *iwpm_port = NULL; iwpm_mapping_request *iwpm_map_req = NULL; struct nlattr *nltb [IWPM_NLA_QUERY_MAPPING_MAX] = {}; struct sockaddr_storage *local_addr, *remote_addr; sockaddr_union dest_addr; iwpm_msg_parms msg_parms; iwpm_send_msg *send_msg = NULL; int pm_client_sock; int not_mapped = 1; __u16 err_code = 0; const char *msg_type = "Add & Query Mapping Request"; const char *str_err = ""; int ret = -EINVAL; __u32 flags; int max = IWPM_NLA_QUERY_MAPPING_MAX; if (iwpm_version != IWPM_UABI_VERSION) max--; if (parse_iwpm_nlmsg(req_nlh, max, query_map_policy, nltb, msg_type)) { err_code = IWPM_INVALID_NLMSG_ERR; str_err = "Received Invalid nlmsg"; goto query_mapping_error; } local_addr = (struct sockaddr_storage *)nla_data(nltb[IWPM_NLA_QUERY_LOCAL_ADDR]); remote_addr = (struct sockaddr_storage *)nla_data(nltb[IWPM_NLA_QUERY_REMOTE_ADDR]); flags = nltb[IWPM_NLA_QUERY_FLAGS] ? nla_get_u32(nltb[IWPM_NLA_QUERY_FLAGS]) : 0; iwpm_port = find_iwpm_mapping(local_addr, not_mapped); if (iwpm_port) { atomic_fetch_add(&iwpm_port->ref_cnt, 1); } else { iwpm_port = create_iwpm_mapped_port(local_addr, client_idx, flags); if (!iwpm_port) { err_code = IWPM_CREATE_MAPPING_ERR; str_err = "Unable to create new mapping"; goto query_mapping_error; } } if (iwpm_port->wcard) { err_code = IWPM_CREATE_MAPPING_ERR; str_err = "Invalid wild card mapping"; goto query_mapping_free_error; } /* create iwpm wire message */ memcpy(&dest_addr.s_sockaddr, remote_addr, sizeof(struct sockaddr_storage)); switch (dest_addr.s_sockaddr.ss_family) { case AF_INET: dest_addr.v4_sockaddr.sin_port = htobe16(IWARP_PM_PORT); msg_parms.ip_ver = 4; msg_parms.address_family = AF_INET; pm_client_sock = pmv4_client_sock; break; case AF_INET6: dest_addr.v6_sockaddr.sin6_port = htobe16(IWARP_PM_PORT); msg_parms.ip_ver = 6; msg_parms.address_family = AF_INET6; pm_client_sock = pmv6_client_sock; break; default: str_err = "Invalid Internet address family"; goto query_mapping_free_error; } /* fill in the remote peer address and the local mapped address */ copy_iwpm_sockaddr(dest_addr.s_sockaddr.ss_family, remote_addr, NULL, NULL, &msg_parms.apipaddr[0], &msg_parms.apport); copy_iwpm_sockaddr(dest_addr.s_sockaddr.ss_family, local_addr, NULL, NULL, &msg_parms.cpipaddr[0], &msg_parms.cpport); copy_iwpm_sockaddr(dest_addr.s_sockaddr.ss_family, &iwpm_port->mapped_addr, NULL, NULL, &msg_parms.mapped_cpipaddr[0], &msg_parms.mapped_cpport); msg_parms.pmtime = 0; msg_parms.ver = 0; iwpm_debug(IWARP_PM_WIRE_DBG, "process_query_mapping: Local port = 0x%04X, " "remote port = 0x%04X\n", be16toh(msg_parms.cpport), be16toh(msg_parms.apport)); ret = -ENOMEM; send_msg = malloc(sizeof(iwpm_send_msg)); if (!send_msg) { str_err = "Unable to allocate send msg buffer"; goto query_mapping_free_error; } iwpm_map_req = create_iwpm_map_request(req_nlh, &iwpm_port->local_addr, remote_addr, 0, IWARP_PM_REQ_QUERY, send_msg); if (!iwpm_map_req) { str_err = "Unable to allocate mapping request"; goto query_mapping_free_error; } msg_parms.assochandle = iwpm_map_req->assochandle; form_iwpm_request(&send_msg->data, &msg_parms); form_iwpm_send_msg(pm_client_sock, &dest_addr.s_sockaddr, msg_parms.msize, send_msg); add_iwpm_map_request(iwpm_map_req); add_iwpm_mapped_port(iwpm_port); return 
send_iwpm_msg(form_iwpm_request, &msg_parms, &dest_addr.s_sockaddr, pm_client_sock); query_mapping_free_error: if (iwpm_port) { if (atomic_fetch_sub(&iwpm_port->ref_cnt, 1) == 1) free_iwpm_port(iwpm_port); } if (send_msg) free(send_msg); if (iwpm_map_req) free(iwpm_map_req); query_mapping_error: syslog(LOG_WARNING, "process_query_mapping: %s (failed request from client = %s).\n", str_err, client_list[client_idx].ibdevname); if (err_code) { /* send error message to the client */ send_iwpm_error_msg(req_nlh->nlmsg_seq, err_code, client_idx, nl_sock); } return ret; } /** * process_iwpm_remove_mapping - Remove a local mapping and close the mapped TCP port * @req_nlh: netlink header of the received client message * @client_idx: the index of the client (the index is unique for each iwpm client) * @nl_sock: netlink socket to send a message to the client */ static int process_iwpm_remove_mapping(struct nlmsghdr *req_nlh, int client_idx, int nl_sock) { iwpm_mapped_port *iwpm_port = NULL; struct sockaddr_storage *local_addr; struct nlattr *nltb [IWPM_NLA_MANAGE_MAPPING_MAX]; int not_mapped = 1; const char *msg_type = "Remove Mapping Request"; int ret = 0; if (parse_iwpm_nlmsg(req_nlh, IWPM_NLA_REMOVE_MAPPING_MAX, manage_map_policy, nltb, msg_type)) { send_iwpm_error_msg(req_nlh->nlmsg_seq, IWPM_INVALID_NLMSG_ERR, client_idx, nl_sock); syslog(LOG_WARNING, "process_remove_mapping: Received Invalid nlmsg from client = %d\n", client_idx); ret = -EINVAL; goto remove_mapping_exit; } local_addr = (struct sockaddr_storage *)nla_data(nltb[IWPM_NLA_MANAGE_ADDR]); iwpm_debug(IWARP_PM_NETLINK_DBG, "process_remove_mapping: Going to remove mapping" " (client idx = %d)\n", client_idx); iwpm_port = find_iwpm_same_mapping(local_addr, not_mapped); if (!iwpm_port) { iwpm_debug(IWARP_PM_NETLINK_DBG, "process_remove_mapping: Unable to find mapped port object\n"); print_iwpm_sockaddr(local_addr, "process_remove_mapping: Local address", IWARP_PM_ALL_DBG); /* the client sends a remove mapping request when terminating a connection and it is possible that there isn't a successful mapping for this connection */ goto remove_mapping_exit; } if (iwpm_port->owner_client != client_idx) { syslog(LOG_WARNING, "process_remove_mapping: Invalid request from client = %d\n", client_idx); goto remove_mapping_exit; } if (atomic_fetch_sub(&iwpm_port->ref_cnt, 1) == 1) { remove_iwpm_mapped_port(iwpm_port); free_iwpm_port(iwpm_port); } remove_mapping_exit: return ret; } static int send_conn_info_nlmsg(struct sockaddr_storage *local_addr, struct sockaddr_storage *remote_addr, struct sockaddr_storage *mapped_loc_addr, struct sockaddr_storage *mapped_rem_addr, int owner_client, __u16 nlmsg_type, __u32 nlmsg_seq, __u32 nlmsg_pid, __u16 nlmsg_err, int nl_sock) { struct nl_msg *resp_nlmsg = NULL; const char *str_err; int ret; resp_nlmsg = create_iwpm_nlmsg(nlmsg_type, owner_client); if (!resp_nlmsg) { str_err = "Unable to create nlmsg response"; ret = -ENOMEM; goto nlmsg_error; } str_err = "Invalid nlmsg attribute"; if ((ret = nla_put_u32(resp_nlmsg, IWPM_NLA_QUERY_MAPPING_SEQ, nlmsg_seq))) goto nlmsg_free_error; if ((ret = nla_put(resp_nlmsg, IWPM_NLA_QUERY_LOCAL_ADDR, sizeof(struct sockaddr_storage), local_addr))) goto nlmsg_free_error; if ((ret = nla_put(resp_nlmsg, IWPM_NLA_QUERY_REMOTE_ADDR, sizeof(struct sockaddr_storage), remote_addr))) goto nlmsg_free_error; if ((ret = nla_put(resp_nlmsg, IWPM_NLA_RQUERY_MAPPED_LOC_ADDR, sizeof(struct sockaddr_storage), mapped_loc_addr))) goto nlmsg_free_error; if ((ret = nla_put(resp_nlmsg, 
IWPM_NLA_RQUERY_MAPPED_REM_ADDR, sizeof(struct sockaddr_storage), mapped_rem_addr))) goto nlmsg_free_error; if ((ret = nla_put_u16(resp_nlmsg, IWPM_NLA_RQUERY_MAPPING_ERR, nlmsg_err))) goto nlmsg_free_error; if ((ret = send_iwpm_nlmsg(nl_sock, resp_nlmsg, nlmsg_pid))) { str_err = "Unable to send nlmsg response"; goto nlmsg_free_error; } nlmsg_free(resp_nlmsg); return 0; nlmsg_free_error: if (resp_nlmsg) nlmsg_free(resp_nlmsg); nlmsg_error: syslog(LOG_WARNING, "send_conn_info_nlmsg: %s.\n", str_err); return ret; } /** * process_iwpm_wire_request - Process a mapping query from remote port mapper peer * @msg_parms: the received iwpm request message * @recv_addr: address of the remote peer * @pm_sock: socket handle to send a response to the remote iwpm peer * * Look up the accepting peer local address to find the corresponding mapping, * send reject message to the remote connecting peer, if no mapping is found, * otherwise, send accept message with the accepting peer mapping info */ static int process_iwpm_wire_request(iwpm_msg_parms *msg_parms, int nl_sock, struct sockaddr_storage *recv_addr, int pm_sock) { iwpm_mapped_port *iwpm_port; iwpm_mapping_request *iwpm_map_req = NULL; iwpm_mapping_request iwpm_copy_req; iwpm_send_msg *send_msg = NULL; struct sockaddr_storage local_addr, mapped_loc_addr; struct sockaddr_storage remote_addr = {}, mapped_rem_addr = {}; __u16 nlmsg_type; int not_mapped = 1; int ret = 0; copy_iwpm_sockaddr(msg_parms->address_family, NULL, &local_addr, &msg_parms->apipaddr[0], NULL, &msg_parms->apport); iwpm_port = find_iwpm_mapping(&local_addr, not_mapped); if (!iwpm_port) { /* could not find mapping for the requested address */ iwpm_debug(IWARP_PM_WIRE_DBG, "process_wire_request: " "Sending Reject to port mapper peer.\n"); print_iwpm_sockaddr(&local_addr, "process_wire_request: Local address", IWARP_PM_ALL_DBG); return send_iwpm_msg(form_iwpm_reject, msg_parms, recv_addr, pm_sock); } /* record mapping in the accept message */ if (iwpm_port->wcard) msg_parms->apport = get_sockaddr_port(&iwpm_port->mapped_addr); else copy_iwpm_sockaddr(msg_parms->address_family, &iwpm_port->mapped_addr, NULL, NULL, &msg_parms->apipaddr[0], &msg_parms->apport); copy_iwpm_sockaddr(msg_parms->address_family, NULL, &mapped_loc_addr, &msg_parms->apipaddr[0], NULL, &msg_parms->apport); /* check if there is already a request */ ret = update_iwpm_map_request(msg_parms->assochandle, &mapped_loc_addr, IWARP_PM_REQ_ACCEPT, &iwpm_copy_req, 0); if (!ret) { /* found request */ iwpm_debug(IWARP_PM_WIRE_DBG,"process_wire_request: Detected retransmission " "map request (assochandle = %llu type = %d timeout = %u complete = %d)\n", iwpm_copy_req.assochandle, iwpm_copy_req.msg_type, iwpm_copy_req.timeout, iwpm_copy_req.complete); return 0; } /* allocate response message */ send_msg = malloc(sizeof(iwpm_send_msg)); if (!send_msg) { syslog(LOG_WARNING, "process_wire_request: Unable to allocate send msg.\n"); return -ENOMEM; } form_iwpm_accept(&send_msg->data, msg_parms); form_iwpm_send_msg(pm_sock, recv_addr, msg_parms->msize, send_msg); copy_iwpm_sockaddr(msg_parms->address_family, NULL, &remote_addr, &msg_parms->cpipaddr[0], NULL, &msg_parms->cpport); copy_iwpm_sockaddr(msg_parms->address_family, NULL, &mapped_rem_addr, &msg_parms->mapped_cpipaddr[0], NULL, &msg_parms->mapped_cpport); iwpm_map_req = create_iwpm_map_request(NULL, &mapped_loc_addr, &remote_addr, msg_parms->assochandle, IWARP_PM_REQ_ACCEPT, send_msg); if (!iwpm_map_req) { syslog(LOG_WARNING, "process_wire_request: Unable to allocate mapping 
request.\n"); free(send_msg); return -ENOMEM; } add_iwpm_map_request(iwpm_map_req); ret = send_iwpm_msg(form_iwpm_accept, msg_parms, recv_addr, pm_sock); if (ret) { syslog(LOG_WARNING, "process_wire_request: Unable to allocate accept message.\n"); return ret; } nlmsg_type = RDMA_NL_GET_TYPE(iwpm_port->owner_client, RDMA_NL_IWPM_REMOTE_INFO); ret = send_conn_info_nlmsg(&iwpm_port->local_addr, &remote_addr, &iwpm_port->mapped_addr, &mapped_rem_addr, iwpm_port->owner_client, nlmsg_type, 0, 0, 0, nl_sock); return ret; } /** * process_iwpm_wire_accept - Process accept message from the remote port mapper peer * @msg_parms: the received iwpm accept message, containing the remote peer mapping info * @nl_sock: netlink socket to send a message to the iwpm client * @recv_addr: address of the remote peer * @pm_sock: socket handle to send ack message back to the remote peer * * Send acknowledgement to the remote/accepting peer, * send a netlink message with the local and remote mapping info to the iwpm client * nlmsg response attributes: * [IWPM_NLA_QUERY_MAPPING_SEQ] * [IWPM_NLA_QUERY_LOCAL_ADDR] * [IWPM_NLA_QUERY_REMOTE_ADDR] * [IWPM_NLA_RQUERY_MAPPED_LOC_ADDR] * [IWPM_NLA_RQUERY_MAPPED_REM_ADDR] * [IWPM_NLA_RQUERY_MAPPING_ERR] */ static int process_iwpm_wire_accept(iwpm_msg_parms *msg_parms, int nl_sock, struct sockaddr_storage *recv_addr, int pm_sock) { iwpm_mapping_request iwpm_map_req; iwpm_mapping_request *iwpm_retry_req = NULL; iwpm_mapped_port *iwpm_port; struct sockaddr_storage local_addr, remote_mapped_addr; int not_mapped = 1; const char *str_err; int ret; copy_iwpm_sockaddr(msg_parms->address_family, NULL, &local_addr, &msg_parms->cpipaddr[0], NULL, &msg_parms->cpport); copy_iwpm_sockaddr(msg_parms->address_family, NULL, &remote_mapped_addr, &msg_parms->apipaddr[0], NULL, &msg_parms->apport); ret = -EINVAL; iwpm_port = find_iwpm_same_mapping(&local_addr, not_mapped); if (!iwpm_port) { iwpm_debug(IWARP_PM_WIRE_DBG, "process_wire_accept: " "Received accept for unknown mapping.\n"); return 0; } /* there should be a request for the accept message */ ret = update_iwpm_map_request(msg_parms->assochandle, &iwpm_port->local_addr, (IWARP_PM_REQ_QUERY|IWARP_PM_REQ_ACK), &iwpm_map_req, 1); if (ret) { iwpm_debug(IWARP_PM_WIRE_DBG, "process_wire_accept: " "No matching mapping request (assochandle = %llu)\n", msg_parms->assochandle); return 0; /* ok when retransmission */ } if (iwpm_map_req.complete) return 0; /* if the accept has already been processed and this is retransmission */ if (iwpm_map_req.msg_type == IWARP_PM_REQ_ACK) { iwpm_debug(IWARP_PM_RETRY_DBG, "process_wire_accept: Detected retransmission " "(map request assochandle = %llu)\n", iwpm_map_req.assochandle); goto wire_accept_send_ack; } ret = send_conn_info_nlmsg(&iwpm_port->local_addr, &iwpm_map_req.remote_addr, &iwpm_port->mapped_addr, &remote_mapped_addr, iwpm_port->owner_client, iwpm_map_req.nlmsg_type, iwpm_map_req.nlmsg_seq, iwpm_map_req.nlmsg_pid, 0, nl_sock); if (ret) { str_err = "Unable to send nlmsg response"; goto wire_accept_error; } /* object to detect retransmission */ iwpm_retry_req = create_iwpm_map_request(NULL, &iwpm_map_req.src_addr, &iwpm_map_req.remote_addr, iwpm_map_req.assochandle, IWARP_PM_REQ_ACK, NULL); if (!iwpm_retry_req) { ret = -ENOMEM; str_err = "Unable to allocate retry request"; goto wire_accept_error; } add_iwpm_map_request(iwpm_retry_req); wire_accept_send_ack: return send_iwpm_msg(form_iwpm_ack, msg_parms, recv_addr, pm_sock); wire_accept_error: syslog(LOG_WARNING, "process_iwpm_wire_accept: %s.\n", 
str_err); return ret; } /** * process_iwpm_wire_reject - Process reject message from the port mapper remote peer * @msg_parms: the received iwpm reject message * @nl_sock: netlink socket to send through a message to the iwpm client * * Send notification to the iwpm client that its * mapping request is rejected by the remote/accepting port mapper peer */ static int process_iwpm_wire_reject(iwpm_msg_parms *msg_parms, int nl_sock) { iwpm_mapping_request iwpm_map_req; iwpm_mapped_port *iwpm_port; struct sockaddr_storage local_addr, remote_addr; int not_mapped = 1; __u16 err_code = IWPM_REMOTE_QUERY_REJECT; const char *str_err; int ret = -EINVAL; copy_iwpm_sockaddr(msg_parms->address_family, NULL, &local_addr, &msg_parms->cpipaddr[0], NULL, &msg_parms->cpport); copy_iwpm_sockaddr(msg_parms->address_family, NULL, &remote_addr, &msg_parms->apipaddr[0], NULL, &msg_parms->apport); print_iwpm_sockaddr(&local_addr, "process_wire_reject: Local address", IWARP_PM_ALL_DBG); print_iwpm_sockaddr(&remote_addr, "process_wire_reject: Remote address", IWARP_PM_ALL_DBG); ret = -EINVAL; iwpm_port = find_iwpm_same_mapping(&local_addr, not_mapped); if (!iwpm_port) { syslog(LOG_WARNING, "process_wire_reject: Received reject for unknown mapping.\n"); return 0; } /* make sure there is request posted */ ret = update_iwpm_map_request(msg_parms->assochandle, &iwpm_port->local_addr, IWARP_PM_REQ_QUERY, &iwpm_map_req, 1); if (ret) { iwpm_debug(IWARP_PM_WIRE_DBG, "process_wire_reject: " "No matching mapping request (assochandle = %llu)\n", msg_parms->assochandle); return 0; /* ok when retransmission */ } if (iwpm_map_req.complete) return 0; ret = send_conn_info_nlmsg(&iwpm_port->local_addr, &iwpm_map_req.remote_addr, &iwpm_port->mapped_addr, &iwpm_map_req.remote_addr, iwpm_port->owner_client, iwpm_map_req.nlmsg_type, iwpm_map_req.nlmsg_seq, iwpm_map_req.nlmsg_pid, err_code, nl_sock); if (ret) { str_err = "Unable to send nlmsg response"; goto wire_reject_error; } return 0; wire_reject_error: syslog(LOG_WARNING, "process_wire_reject: %s.\n", str_err); return ret; } /** * process_iwpm_wire_ack - Process acknowledgement from the remote port mapper peer * @msg_parms: received iwpm acknowledgement */ static int process_iwpm_wire_ack(iwpm_msg_parms *msg_parms) { iwpm_mapped_port *iwpm_port; iwpm_mapping_request iwpm_map_req; struct sockaddr_storage local_mapped_addr; int not_mapped = 0; int ret; copy_iwpm_sockaddr(msg_parms->address_family, NULL, &local_mapped_addr, &msg_parms->apipaddr[0], NULL, &msg_parms->apport); iwpm_port = find_iwpm_mapping(&local_mapped_addr, not_mapped); if (!iwpm_port) { iwpm_debug(IWARP_PM_WIRE_DBG, "process_wire_ack: Received ack for unknown mapping.\n"); return 0; } /* make sure there is accept for the ack */ ret = update_iwpm_map_request(msg_parms->assochandle, &local_mapped_addr, IWARP_PM_REQ_ACCEPT, &iwpm_map_req, 1); if (ret) iwpm_debug(IWARP_PM_WIRE_DBG, "process_wire_ack: No matching mapping request\n"); return 0; } /* Mapping info message - nlmsg attributes */ static struct nla_policy mapinfo_policy[IWPM_NLA_MAPINFO_MAX] = { [IWPM_NLA_MAPINFO_LOCAL_ADDR] = { .minlen = sizeof(struct sockaddr_storage) }, [IWPM_NLA_MAPINFO_MAPPED_ADDR] = { .minlen = sizeof(struct sockaddr_storage) }, [IWPM_NLA_MAPINFO_FLAGS] = { .type = NLA_U32 } }; /** * process_iwpm_mapinfo - Process a mapping info message from the port mapper client * @req_nlh: netlink header of the received client message * @client_idx: the index of the client (the index is unique for each iwpm client) * @nl_sock: netlink socket to send a 
message to the client * * In case the userspace iwarp port mapper daemon is restarted, * the iwpm client needs to send a record of mappings it is currently using. * The port mapper needs to reopen the mapped ports used by the client. */ static int process_iwpm_mapinfo(struct nlmsghdr *req_nlh, int client_idx, int nl_sock) { iwpm_mapped_port *iwpm_port = NULL; struct sockaddr_storage *local_addr, *local_mapped_addr; struct nlattr *nltb [IWPM_NLA_MAPINFO_MAX] = {}; int not_mapped = 1; __u16 err_code = 0; const char *msg_type = "Mapping Info Msg"; const char *str_err = ""; int ret = -EINVAL; __u32 flags; int max = IWPM_NLA_MAPINFO_MAX; if (iwpm_version != IWPM_UABI_VERSION) max--; if (parse_iwpm_nlmsg(req_nlh, max, mapinfo_policy, nltb, msg_type)) { err_code = IWPM_INVALID_NLMSG_ERR; str_err = "Received Invalid nlmsg"; goto process_mapinfo_error; } local_addr = (struct sockaddr_storage *)nla_data(nltb[IWPM_NLA_MAPINFO_LOCAL_ADDR]); local_mapped_addr = (struct sockaddr_storage *)nla_data(nltb[IWPM_NLA_MAPINFO_MAPPED_ADDR]); flags = nltb[IWPM_NLA_MAPINFO_FLAGS] ? nla_get_u32(nltb[IWPM_NLA_MAPINFO_FLAGS]) : 0; iwpm_port = find_iwpm_mapping(local_addr, not_mapped); if (iwpm_port) { /* Can be safely ignored, if the mapinfo is exactly the same, * because the client will provide all the port information it has and * it could have started using the port mapper service already */ if (check_same_sockaddr(&iwpm_port->local_addr, local_addr) && check_same_sockaddr(&iwpm_port->mapped_addr, local_mapped_addr)) goto process_mapinfo_exit; /* partial duplicates matching a wcard ip address aren't allowed either */ err_code = IWPM_DUPLICATE_MAPPING_ERR; str_err = "Duplicate mapped port"; goto process_mapinfo_error; } iwpm_port = reopen_iwpm_mapped_port(local_addr, local_mapped_addr, client_idx, flags); if (!iwpm_port) { err_code = IWPM_CREATE_MAPPING_ERR; str_err = "Unable to create new mapping"; goto process_mapinfo_error; } /* add the new mapping to the list */ add_iwpm_mapped_port(iwpm_port); process_mapinfo_exit: mapinfo_num_list[client_idx]++; return 0; process_mapinfo_error: syslog(LOG_WARNING, "process_mapinfo: %s.\n", str_err); if (err_code) { /* send error message to the client */ send_iwpm_error_msg(req_nlh->nlmsg_seq, err_code, client_idx, nl_sock); } return ret; } /* Mapping info message count - nlmsg attributes */ static struct nla_policy mapinfo_count_policy[IWPM_NLA_MAPINFO_SEND_MAX] = { [IWPM_NLA_MAPINFO_SEQ] = { .type = NLA_U32 }, [IWPM_NLA_MAPINFO_SEND_NUM] = { .type = NLA_U32 } }; /** * process_iwpm_mapinfo_count - Process mapinfo count message * @req_nlh: netlink header of the received message from the client * @client_idx: the index of the client * @nl_sock: netlink socket to send a message to the client * * Mapinfo count message is a mechanism for the port mapper and the client to * synchronize on the number of mapinfo messages which were successfully exchanged and processed */ static int process_iwpm_mapinfo_count(struct nlmsghdr *req_nlh, int client_idx, int nl_sock) { struct nlattr *nltb [IWPM_NLA_MAPINFO_SEND_MAX]; struct nl_msg *resp_nlmsg = NULL; const char *msg_type = "Number of Mappings Msg"; __u32 map_count; __u16 err_code = 0; const char *str_err = ""; int ret = -EINVAL; if (parse_iwpm_nlmsg(req_nlh, IWPM_NLA_MAPINFO_SEND_MAX, mapinfo_count_policy, nltb, msg_type)) { str_err = "Received Invalid nlmsg"; err_code = IWPM_INVALID_NLMSG_ERR; goto mapinfo_count_error; } map_count = nla_get_u32(nltb[IWPM_NLA_MAPINFO_SEND_NUM]); if (map_count != mapinfo_num_list[client_idx]) 
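/* a mismatch is only logged: the reply below echoes the client's count in IWPM_NLA_MAPINFO_SEND_NUM and reports the number iwpmd actually processed in IWPM_NLA_MAPINFO_ACK_NUM, letting the client detect lost or rejected mapinfo messages */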
iwpm_debug(IWARP_PM_NETLINK_DBG, "get_mapinfo_count: Client (idx = %d) " "send mapinfo count = %u processed mapinfo count = %u.\n", client_idx, map_count, mapinfo_num_list[client_idx]); resp_nlmsg = create_iwpm_nlmsg(req_nlh->nlmsg_type, client_idx); if (!resp_nlmsg) { str_err = "Unable to create nlmsg response"; ret = -ENOMEM; goto mapinfo_count_error; } str_err = "Invalid nlmsg attribute"; if ((ret = nla_put_u32(resp_nlmsg, IWPM_NLA_MAPINFO_SEQ, req_nlh->nlmsg_seq))) goto mapinfo_count_free_error; if ((ret = nla_put_u32(resp_nlmsg, IWPM_NLA_MAPINFO_SEND_NUM, map_count))) goto mapinfo_count_free_error; if ((ret = nla_put_u32(resp_nlmsg, IWPM_NLA_MAPINFO_ACK_NUM, mapinfo_num_list[client_idx]))) goto mapinfo_count_free_error; if ((ret = send_iwpm_nlmsg(nl_sock, resp_nlmsg, req_nlh->nlmsg_pid))) { str_err = "Unable to send nlmsg response"; goto mapinfo_count_free_error; } nlmsg_free(resp_nlmsg); return 0; mapinfo_count_free_error: if (resp_nlmsg) nlmsg_free(resp_nlmsg); mapinfo_count_error: syslog(LOG_WARNING, "process_mapinfo_count: %s.\n", str_err); if (err_code) { /* send error message to the client */ send_iwpm_error_msg(req_nlh->nlmsg_seq, err_code, client_idx, nl_sock); } return ret; } /** * send_iwpm_error_msg - Send error message to the iwpm client * @seq: last received netlink message sequence * @err_code: used to differentiate between errors * @client_idx: the index of the client * @nl_sock: netlink socket to send a message to the client */ static int send_iwpm_error_msg(__u32 seq, __u16 err_code, int client_idx, int nl_sock) { struct nl_msg *resp_nlmsg; __u16 nlmsg_type; const char *str_err = ""; int ret; nlmsg_type = RDMA_NL_GET_TYPE(client_idx, RDMA_NL_IWPM_HANDLE_ERR); resp_nlmsg = create_iwpm_nlmsg(nlmsg_type, client_idx); if (!resp_nlmsg) { ret = -ENOMEM; str_err = "Unable to create nlmsg response"; goto send_error_msg_exit; } str_err = "Invalid nlmsg attribute"; if ((ret = nla_put_u32(resp_nlmsg, IWPM_NLA_ERR_SEQ, seq))) goto send_error_msg_exit; if ((ret = nla_put_u16(resp_nlmsg, IWPM_NLA_ERR_CODE, err_code))) goto send_error_msg_exit; if ((ret = send_iwpm_nlmsg(nl_sock, resp_nlmsg, 0))) { str_err = "Unable to send nlmsg response"; goto send_error_msg_exit; } nlmsg_free(resp_nlmsg); return 0; send_error_msg_exit: if (resp_nlmsg) nlmsg_free(resp_nlmsg); syslog(LOG_WARNING, "send_iwpm_error_msg: %s (ret = %d).\n", str_err, ret); return ret; } /* Hello message - nlmsg attributes */ static struct nla_policy hello_policy[IWPM_NLA_HELLO_MAX] = { [IWPM_NLA_HELLO_ABI_VERSION] = { .type = NLA_U16 } }; /** * process_iwpm_hello - Process a hello message from the port mapper client * @req_nlh: netlink header of the received message from the client * @client_idx: the index of the client * @nl_sock: netlink socket to send a message to the client * * The hello message carries the ABI version the kernel client supports. * Negotiate the version to use and then request the client's existing mappings * (an NLMSG_ERROR reply stands for a V3 kernel, which implies IWPM_UABI_VERSION_MIN) */ static int process_iwpm_hello(struct nlmsghdr *req_nlh, int client_idx, int nl_sock) { struct nlattr *nltb [IWPM_NLA_HELLO_MAX]; const char *msg_type = "Hello Msg"; __u16 abi_version; __u16 err_code = 0; const char *str_err = ""; int ret = -EINVAL; if (req_nlh->nlmsg_type == NLMSG_ERROR) { abi_version = IWPM_UABI_VERSION_MIN; } else { if (parse_iwpm_nlmsg(req_nlh, IWPM_NLA_HELLO_MAX, hello_policy, nltb, msg_type)) { str_err = "Received Invalid nlmsg"; err_code = IWPM_INVALID_NLMSG_ERR; goto hello_error; } abi_version = nla_get_u16(nltb[IWPM_NLA_HELLO_ABI_VERSION]); } if (abi_version > 
IWPM_UABI_VERSION) { str_err = "UABI Version mismatch"; err_code = IWPM_VERSION_MISMATCH_ERR; goto hello_error; } iwpm_version = abi_version; iwpm_debug(IWARP_PM_NETLINK_DBG, "process_iwpm_hello: using abi_version %u\n", iwpm_version); send_iwpm_mapinfo_request(nl_sock, RDMA_NL_IWCM); if (iwpm_version == 3) { /* Legacy RDMA_NL_C4IW for old kernels */ send_iwpm_mapinfo_request(nl_sock, RDMA_NL_IWCM+1); } return 0; hello_error: syslog(LOG_WARNING, "process_iwpm_hello: %s.\n", str_err); if (err_code) { /* send error message to the client */ send_iwpm_error_msg(req_nlh->nlmsg_seq, err_code, client_idx, nl_sock); } return ret; } /** * process_iwpm_netlink_msg - Dispatch received netlink messages * @nl_sock: netlink socket to read the messages from */ static int process_iwpm_netlink_msg(int nl_sock) { char *recv_buffer = NULL; struct nlmsghdr *nlh; struct sockaddr_nl src_addr; int len, type, client_idx, op; socklen_t src_addr_len; const char *str_err = ""; int ret = 0; recv_buffer = malloc(NLMSG_SPACE(IWARP_PM_RECV_PAYLOAD)); if (!recv_buffer) { ret = -ENOMEM; str_err = "Unable to allocate receive socket buffer"; goto process_netlink_msg_exit; } /* receive a new message */ nlh = (struct nlmsghdr *)recv_buffer; memset(nlh, 0, NLMSG_SPACE(IWARP_PM_RECV_PAYLOAD)); memset(&src_addr, 0, sizeof(src_addr)); src_addr_len = sizeof(src_addr); len = recvfrom(nl_sock, (void *)nlh, NLMSG_SPACE(IWARP_PM_RECV_PAYLOAD), 0, (struct sockaddr *)&src_addr, &src_addr_len); if (len <= 0) { ret = -errno; str_err = "Unable to receive data from netlink socket"; goto process_netlink_msg_exit; } /* loop for multiple netlink messages packed together */ while (NLMSG_OK(nlh, len) != 0) { if (nlh->nlmsg_type == NLMSG_DONE) { goto process_netlink_msg_exit; } type = nlh->nlmsg_type; client_idx = RDMA_NL_GET_CLIENT(type); if (type == NLMSG_ERROR) { /* RDMA_NL_IWCM HELLO error indicates V3 kernel */ if (nlh->nlmsg_seq == 0) { ret = process_iwpm_hello(nlh, client_idx, nl_sock); } else { iwpm_debug(IWARP_PM_NETLINK_DBG, "process_netlink_msg: " "Netlink error message seq = %u\n", nlh->nlmsg_seq); } goto process_netlink_msg_exit; } op = RDMA_NL_GET_OP(type); iwpm_debug(IWARP_PM_NETLINK_DBG, "process_netlink_msg: Received a new message: " "opcode = %u client idx = %u, client pid = %u," " msg seq = %u, type = %u, length = %u.\n", op, client_idx, nlh->nlmsg_pid, nlh->nlmsg_seq, type, len); if (client_idx >= IWARP_PM_MAX_CLIENTS) { ret = -EINVAL; str_err = "Invalid client index"; goto process_netlink_msg_exit; } switch (op) { case RDMA_NL_IWPM_REG_PID: str_err = "Register Pid request"; ret = process_iwpm_register_pid(nlh, client_idx, nl_sock); break; case RDMA_NL_IWPM_ADD_MAPPING: str_err = "Add Mapping request"; if (!client_list[client_idx].valid) { ret = -EINVAL; goto process_netlink_msg_exit; } ret = process_iwpm_add_mapping(nlh, client_idx, nl_sock); break; case RDMA_NL_IWPM_QUERY_MAPPING: str_err = "Query Mapping request"; if (!client_list[client_idx].valid) { ret = -EINVAL; goto process_netlink_msg_exit; } ret = process_iwpm_query_mapping(nlh, client_idx, nl_sock); break; case RDMA_NL_IWPM_REMOVE_MAPPING: str_err = "Remove Mapping request"; ret = process_iwpm_remove_mapping(nlh, client_idx, nl_sock); break; case RDMA_NL_IWPM_MAPINFO: ret = process_iwpm_mapinfo(nlh, client_idx, nl_sock); break; case RDMA_NL_IWPM_MAPINFO_NUM: ret = process_iwpm_mapinfo_count(nlh, client_idx, nl_sock); break; case RDMA_NL_IWPM_HELLO: ret = process_iwpm_hello(nlh, client_idx, nl_sock); break; default: str_err = "Netlink message with invalid opcode"; 
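/* Note: any non-zero ret, including this one, stops the NLMSG_OK() loop below, so remaining messages packed in the same receive buffer are not processed. */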
ret = -1; break; } nlh = NLMSG_NEXT(nlh, len); if (ret) goto process_netlink_msg_exit; } process_netlink_msg_exit: if (recv_buffer) free(recv_buffer); if (ret) syslog(LOG_WARNING, "process_netlink_msg: %s error (ret = %d).\n", str_err, ret); return ret; } /** * process_iwpm_msg - Dispatch iwpm wire messages sent by the remote peer * @pm_sock: socket handle to read the messages from */ static int process_iwpm_msg(int pm_sock) { iwpm_msg_parms msg_parms; struct sockaddr_storage recv_addr; iwpm_wire_msg recv_buffer; /* received message */ int bytes_recv, ret = 0; int max_bytes_send = IWARP_PM_MESSAGE_SIZE + IWPM_IPADDR_SIZE; socklen_t recv_addr_len = sizeof(recv_addr); bytes_recv = recvfrom(pm_sock, &recv_buffer, max_bytes_send, 0, (struct sockaddr *)&recv_addr, &recv_addr_len); if (bytes_recv != IWARP_PM_MESSAGE_SIZE && bytes_recv != max_bytes_send) { syslog(LOG_WARNING, "process_iwpm_msg: Unable to receive data from PM socket. %s.\n", strerror(errno)); ret = -errno; goto process_iwpm_msg_exit; } ret = parse_iwpm_msg(&recv_buffer, &msg_parms); if (ret) goto process_iwpm_msg_exit; switch (msg_parms.mt) { case IWARP_PM_MT_REQ: iwpm_debug(IWARP_PM_WIRE_DBG, "process_iwpm_msg: Received Request message.\n"); ret = process_iwpm_wire_request(&msg_parms, netlink_sock, &recv_addr, pm_sock); break; case IWARP_PM_MT_ACK: iwpm_debug(IWARP_PM_WIRE_DBG, "process_iwpm_msg: Received Acknowledgement.\n"); ret = process_iwpm_wire_ack(&msg_parms); break; case IWARP_PM_MT_ACC: iwpm_debug(IWARP_PM_WIRE_DBG, "process_iwpm_msg: Received Accept message.\n"); ret = process_iwpm_wire_accept(&msg_parms, netlink_sock, &recv_addr, pm_sock); break; case IWARP_PM_MT_REJ: iwpm_debug(IWARP_PM_WIRE_DBG, "process_iwpm_msg: Received Reject message.\n"); ret = process_iwpm_wire_reject(&msg_parms, netlink_sock); break; default: syslog(LOG_WARNING, "process_iwpm_msg: Received Invalid message type = %u.\n", msg_parms.mt); } process_iwpm_msg_exit: return ret; } /** * send_iwpm_hello - Notify the client that the V4 iwarp port mapper is available * @nl_sock: netlink socket to send a message to the client * * Send a HELLO message including the ABI_VERSION supported by iwpmd. If the * response is an ERROR message, then we know the kernel driver is < V4, so we * drop back to the V3 protocol. If the kernel is >= V4, then it will reply * with its ABI Version. The response is handled in iwarp_port_mapper(). Once * the ABI version is negotiated, iwpmd will send a mapinfo request to get any * current mappings, using the correct ABI version. This allows working with * V3 kernels. 
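* In short: iwpmd advertises IWPM_UABI_VERSION in IWPM_NLA_HELLO_ABI_VERSION, a V4+ kernel replies with the version it selected (which process_iwpm_hello() rejects if it is newer than iwpmd's own), and a V3 kernel replies with NLMSG_ERROR and seq == 0, which is treated as IWPM_UABI_VERSION_MIN.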
*/ static int send_iwpm_hello(int nl_sock) { struct nl_msg *req_nlmsg; const char *str_err; __u16 nlmsg_type; int ret; nlmsg_type = RDMA_NL_GET_TYPE(RDMA_NL_IWCM, RDMA_NL_IWPM_HELLO); req_nlmsg = create_iwpm_nlmsg(nlmsg_type, RDMA_NL_IWCM); if (!req_nlmsg) { ret = -ENOMEM; str_err = "Unable to create nlmsg request"; goto send_hello_error; } str_err = "Invalid nlmsg attribute"; if ((ret = nla_put_u16(req_nlmsg, IWPM_NLA_HELLO_ABI_VERSION, iwpm_version))) goto send_hello_error; if ((ret = send_iwpm_nlmsg(nl_sock, req_nlmsg, 0))) { str_err = "Unable to send nlmsg response"; goto send_hello_error; } nlmsg_free(req_nlmsg); return 0; send_hello_error: if (req_nlmsg) nlmsg_free(req_nlmsg); syslog(LOG_WARNING, "send_hello_request: %s ret = %d.\n", str_err, ret); return ret; } /** * send_iwpm_mapinfo_request - Notify the client that the iwarp port mapper is available * @nl_sock: netlink socket to send a message to the client * @client - client to receive the message */ static int send_iwpm_mapinfo_request(int nl_sock, int client) { struct nl_msg *req_nlmsg; __u16 nlmsg_type; const char *str_err; int ret; nlmsg_type = RDMA_NL_GET_TYPE(client, RDMA_NL_IWPM_MAPINFO); req_nlmsg = create_iwpm_nlmsg(nlmsg_type, client); if (!req_nlmsg) { ret = -ENOMEM; str_err = "Unable to create nlmsg request"; goto send_mapinfo_error; } str_err = "Invalid nlmsg attribute"; if ((ret = nla_put_string(req_nlmsg, IWPM_NLA_MAPINFO_ULIB_NAME, iwpm_ulib_name))) goto send_mapinfo_error; if ((ret = nla_put_u16(req_nlmsg, IWPM_NLA_MAPINFO_ULIB_VER, iwpm_version))) goto send_mapinfo_error; if ((ret = send_iwpm_nlmsg(nl_sock, req_nlmsg, 0))) { str_err = "Unable to send nlmsg response"; goto send_mapinfo_error; } nlmsg_free(req_nlmsg); return 0; send_mapinfo_error: if (req_nlmsg) nlmsg_free(req_nlmsg); syslog(LOG_WARNING, "send_mapinfo_request: %s ret = %d.\n", str_err, ret); return ret; } /** iwpm_cleanup - Close socket handles and free mapped ports */ static void iwpm_cleanup(void) { free_iwpm_mapped_ports(); destroy_iwpm_socket(netlink_sock); destroy_iwpm_socket(pmv6_client_sock); destroy_iwpm_socket(pmv6_sock); destroy_iwpm_socket(pmv4_client_sock); destroy_iwpm_socket(pmv4_sock); /* close up logging */ closelog(); } /** * iwarp_port_mapper - Distribute work orders for processing different types of iwpm messages */ static int iwarp_port_mapper(void) { fd_set select_fdset; /* read fdset */ struct timeval select_timeout; int select_rc, max_sock = 0, ret = 0; if (pmv4_sock > max_sock) max_sock = pmv4_sock; if (pmv6_sock > max_sock) max_sock = pmv6_sock; if (netlink_sock > max_sock) max_sock = netlink_sock; if (pmv4_client_sock > max_sock) max_sock = pmv4_client_sock; if (pmv6_client_sock > max_sock) max_sock = pmv6_client_sock; /* poll a set of sockets */ do { do { if (print_mappings) { print_iwpm_mapped_ports(); print_mappings = 0; } /* initialize the file sets for select */ FD_ZERO(&select_fdset); /* add the UDP and Netlink sockets to the file set */ if (pmv4_sock >= 0) { FD_SET(pmv4_sock, &select_fdset); FD_SET(pmv4_client_sock, &select_fdset); } if (pmv6_sock >= 0) { FD_SET(pmv6_sock, &select_fdset); FD_SET(pmv6_client_sock, &select_fdset); } FD_SET(netlink_sock, &select_fdset); /* set the timeout for select */ select_timeout.tv_sec = 10; select_timeout.tv_usec = 0; /* timeout is an upper bound of time elapsed before select returns */ select_rc = select(max_sock + 1, &select_fdset, NULL, NULL, &select_timeout); } while (select_rc == 0); /* select_rc is the number of fds ready for IO ( IO won't block) */ if (select_rc == -1) { 
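/* select() returning EINTR means a signal handler ran (for instance SIGUSR1 requesting a dump of the current mappings); restart the loop so print_iwpm_mapped_ports() gets a chance to run at the top. */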
if (errno == EINTR) continue; syslog(LOG_WARNING, "iwarp_port_mapper: Select failed (%s).\n", strerror(errno)); ret = -errno; goto iwarp_port_mapper_exit; } if (pmv4_sock >= 0) { if (FD_ISSET(pmv4_sock, &select_fdset)) ret = process_iwpm_msg(pmv4_sock); if (FD_ISSET(pmv4_client_sock, &select_fdset)) ret = process_iwpm_msg(pmv4_client_sock); } if (pmv6_sock >= 0) { if (FD_ISSET(pmv6_sock, &select_fdset)) ret = process_iwpm_msg(pmv6_sock); if (FD_ISSET(pmv6_client_sock, &select_fdset)) ret = process_iwpm_msg(pmv6_client_sock); } if (FD_ISSET(netlink_sock, &select_fdset)) ret = process_iwpm_netlink_msg(netlink_sock); } while (1); iwarp_port_mapper_exit: return ret; } /** * daemonize_iwpm_server - Make iwarp port mapper a daemon process */ static void daemonize_iwpm_server(void) { if (daemon(0, 0) != 0) { syslog(LOG_ERR, "Failed to daemonize\n"); exit(EXIT_FAILURE); } syslog(LOG_WARNING, "daemonize_iwpm_server: Starting iWarp Port Mapper V%d process\n", iwpm_version); } int main(int argc, char *argv[]) { FILE *fp; int c; int ret = EXIT_FAILURE; bool systemd = false; while (1) { static const struct option long_opts[] = { {"systemd", 0, NULL, 's'}, {} }; c = getopt_long(argc, argv, "fs", long_opts, NULL); if (c == -1) break; switch (c) { case 's': systemd = true; break; default: break; } } openlog(NULL, LOG_NDELAY | LOG_CONS | LOG_PID, LOG_DAEMON); if (!systemd) daemonize_iwpm_server(); umask(0); /* change file mode mask */ fp = fopen(IWPM_CONFIG_FILE, "r"); if (fp) { parse_iwpm_config(fp); fclose(fp); } memset(client_list, 0, sizeof(client_list)); pmv4_client_sock = -1; pmv6_sock = -1; pmv6_client_sock = pmv6_sock; pmv4_sock = create_iwpm_socket_v4(IWARP_PM_PORT); if (pmv4_sock < 0 && pmv4_sock != -EAFNOSUPPORT) goto error_exit_sock; pmv6_sock = create_iwpm_socket_v6(IWARP_PM_PORT); if (pmv6_sock < 0 && pmv6_sock != -EAFNOSUPPORT) goto error_exit_sock; /* If neither IPv4 nor IPv6 is supported, exit */ if (pmv4_sock < 0 && pmv6_sock < 0) goto error_exit_sock; if (pmv4_sock >= 0) { pmv4_client_sock = create_iwpm_socket_v4(0); if (pmv4_client_sock < 0) goto error_exit_sock; } if (pmv6_sock >= 0) { pmv6_client_sock = create_iwpm_socket_v6(0); if (pmv6_client_sock < 0) goto error_exit_sock; } netlink_sock = create_netlink_socket(); if (netlink_sock < 0) goto error_exit_sock; signal(SIGHUP, iwpm_signal_handler); signal(SIGTERM, iwpm_signal_handler); signal(SIGUSR1, iwpm_signal_handler); pthread_cond_init(&cond_req_complete, NULL); pthread_cond_init(&cond_pending_msg, NULL); ret = pthread_create(&map_req_thread, NULL, iwpm_mapping_reqs_handler, NULL); if (ret) goto error_exit; ret = pthread_create(&pending_msg_thread, NULL, iwpm_pending_msgs_handler, NULL); if (ret) goto error_exit; ret = send_iwpm_hello(netlink_sock); if (ret) goto error_exit; if (systemd) sd_notify(0, "READY=1"); iwarp_port_mapper(); /* start iwarp port mapper process */ free_iwpm_mapped_ports(); closelog(); error_exit: destroy_iwpm_socket(netlink_sock); error_exit_sock: destroy_iwpm_socket(pmv4_client_sock); destroy_iwpm_socket(pmv6_client_sock); destroy_iwpm_socket(pmv4_sock); destroy_iwpm_socket(pmv6_sock); syslog(LOG_WARNING, "main: Couldn't start iWarp Port Mapper.\n"); return ret; } rdma-core-56.1/iwpmd/iwpmd.8.in000066400000000000000000000050261477342711600162730ustar00rootroot00000000000000.TH "iwpmd" 8 "2016-09-16" "iwpmd" "iwpmd" iwpmd .SH NAME iwpmd \- port mapping services for iWARP. 
.SH SYNOPSIS .sp .nf \fIiwpmd\fR .fi .SH "DESCRIPTION" The iWARP Port Mapper Daemon provides a user space service (iwpmd) for the iWarp drivers to claim TCP ports through the standard socket interface. .P The kernel space support for the port mapper is part of the iw_cm module. The ib_core module includes netlink support, which is used by the port mapper clients to exchange messages with iwpmd. Both modules iw_cm and ib_core need to be loaded in order for the iwpmd service to start successfully. .SH "IWARP PORT MAPPING DETAILS" The iWARP Port Mapper implementation is based on the port mapper specification section in the Sockets Direct Protocol: http://www.rdmaconsortium.org/home/draft-pinkerton-iwarp-sdp-v1.0.pdf .P Existing iWARP RDMA providers use the same IP address as the native TCP/IP stack when creating RDMA connections. They need a mechanism to claim the TCP ports used for RDMA connections to prevent TCP port collisions when other host applications use TCP ports. The iWARP Port Mapper provides a standard mechanism to accomplish this. Without this service it is possible for an RDMA application to bind/listen on the same port which is already being used by a native TCP host application. If that happens, the incoming TCP connection data can be passed to the RDMA stack in error. .P When starting a connection, the iWARP Connection Manager (port mapper client) sends the IWPM service the local IP address and TCP port it has received from the RDMA application. The IWPM service performs a socket bind from user space to get an available TCP port, called a mapped port, and communicates it back to the client. In that sense, the IWPM service is used to map the TCP port which the RDMA application uses to any port available from the host TCP port space. The mapped ports are used in iWARP RDMA connections to avoid collisions with the native TCP stack, which is aware that these ports are taken. When an RDMA connection using a mapped port is terminated, the client notifies the IWPM service, which then releases the TCP port. .P The message exchange between iwpmd and the iWARP Connection Manager (between user space and kernel space) is implemented using netlink sockets. .SH OPTIONS .sp \fB\-s, \-\-systemd\fP Enable systemd integration. .SH "SIGNALS" SIGUSR1 will force a dump of the current mappings to the system message log. .P SIGTERM/SIGHUP will force iwpmd to exit. .SH "FILES" @CMAKE_INSTALL_FULL_SYSCONFDIR@/iwpmd.conf .SH "SEE ALSO" rdma_cm(7) rdma-core-56.1/iwpmd/iwpmd.conf000066400000000000000000000000341477342711600164360ustar00rootroot00000000000000nl_sock_rbuf_size=419430400 rdma-core-56.1/iwpmd/iwpmd.conf.5.in000066400000000000000000000011231477342711600172060ustar00rootroot00000000000000.TH "iwpmd.conf" 5 "2016-09-16" "iwpmd.conf" "iwpmd.conf" iwpmd.conf .SH NAME iwpmd.conf \- iWARP port mapper config file. .SH SYNOPSIS .sp .nf \fIiwpmd.conf\fR .fi .SH "DESCRIPTION" The iwpmd.conf file provides configuration parameters for iwpmd. Parameters are in the form: param=value, and one per line. Parameters include: .P nl_sock_rbuf_size - The socket buffer size of the netlink socket used to communicate with the kernel port map client. The default is 400MB. 
.SH "EXAMPLES" nl_sock_rbuf_size=419430400 .SH "FILES" @CMAKE_INSTALL_FULL_SYSCONFDIR@/iwpmd.conf .SH "SEE ALSO" iwpmd(8) rdma-core-56.1/iwpmd/iwpmd.rules000066400000000000000000000001151477342711600166430ustar00rootroot00000000000000TAG+="systemd", ENV{ID_RDMA_IWARP}=="1", ENV{SYSTEMD_WANTS}+="iwpmd.service" rdma-core-56.1/iwpmd/iwpmd.service.in000066400000000000000000000021751477342711600175660ustar00rootroot00000000000000[Unit] Description=iWarp Port Mapper Documentation=man:iwpmd file:/etc/iwpmd.conf StopWhenUnneeded=yes # iwpmd is a kernel support program and needs to run as early as possible, # otherwise the kernel or userspace cannot establish RDMA connections and # things will just fail, not block until iwpmd arrives. DefaultDependencies=no Before=sysinit.target # Do not execute concurrently with an ongoing shutdown (required for DefaultDependencies=no) Conflicts=shutdown.target Before=shutdown.target # Ensure required kernel modules are loaded before starting Wants=rdma-load-modules@iwpmd.service After=rdma-load-modules@iwpmd.service # iwpmd needs to start before networking is brought up, even kernel networking # (eg NFS) since it provides kernel support for iWarp's RDMA CM. Wants=network-pre.target Before=network-pre.target # rdma-hw is not ready until iwpmd is running Before=rdma-hw.target [Service] Type=notify ExecStart=@CMAKE_INSTALL_FULL_SBINDIR@/iwpmd --systemd LimitNOFILE=102400 ProtectSystem=full ProtectHome=true ProtectHostname=true ProtectKernelLogs=true # iwpmd is automatically wanted by udev when an iWarp RDMA device is present rdma-core-56.1/iwpmd/iwpmd_init.in000066400000000000000000000042061477342711600171470ustar00rootroot00000000000000#!/bin/bash # Start the IWPMD daemon # # chkconfig: 1235 90 15 # description: iWarp Port Mapper Daemon for opening sockets to reserve ports from userspace # processname: iwpmd # pidfile: /var/run/iwpmd.pid # ### BEGIN INIT INFO # Provides: iwpmd # Required-Start: $network $syslog $remote_fs # Required-Stop: $remote_fs # Default-Stop: 0 1 6 # Default-Start: 2 3 4 5 # Short-Description: iWarp Port Mapper Daemon # Description: iWarp Port Mapper Daemon for opening sockets to claim TCP ports from userspace ### END INIT INFO IWPMD_BIN="@CMAKE_INSTALL_FULL_SBINDIR@/iwpmd" LOCK="/var/lock/subsys/iwpmd" IWPMD_PID=0 RETVAL=0 # Source function library. if [ -f "/etc/redhat-release" ]; then . /etc/rc.d/init.d/functions STARTD=daemon STOPD=killproc STATUSD=status GETPID=/sbin/pidof else # Debian / openSUSE / Ubuntu . /lib/lsb/init-functions STARTD=start_daemon STOPD=killproc STATUSD=/sbin/checkproc GETPID=pidofproc fi check() { # Check if iwpm is executable test -x $IWPMD_BIN || ( echo "Couldn't find $IWPMD_BIN"; exit 5 ) } start() { check RETVAL=$? [ $RETVAL -gt 0 ] && exit $RETVAL echo -n $"Starting iwpm daemon: " if [ ! -f "$LOCK" ]; then ulimit -n 102400 $STARTD $IWPMD_BIN &> /dev/null RETVAL=$? [ $RETVAL -eq 0 ] && ( touch $LOCK; echo "OK" ) || echo "NO" else echo "NO (iwpm is already running)" fi return $RETVAL } stop() { check RETVAL=$? [ $RETVAL -gt 0 ] && exit $RETVAL echo -n $"Stopping iwpm daemon: " if [ -f "$LOCK" ]; then $STOPD $IWPMD_BIN &> /dev/null RETVAL=$? [ $RETVAL -eq 0 ] && ( rm -f $LOCK; echo "OK" ) || echo "NO" else echo "NO (iwpm is already stopped)" fi return $RETVAL } restart() { stop start } show_status() { check RETVAL=$? [ $RETVAL -gt 0 ] && exit $RETVAL IWPMD_PID="$($GETPID $IWPMD_BIN)" $STATUSD $IWPMD_BIN &> /dev/null RETVAL=$? 
[ $RETVAL -eq 0 ] && echo "iwpm daemon (pid $IWPMD_PID) is running" || echo "iwpm daemon isn't available" return $RETVAL } case "$1" in start) start ;; stop) stop ;; restart) restart ;; force-reload) restart ;; status) show_status ;; *) echo $"Usage: $0 {start|stop|restart|force-reload|status}" RETVAL=2 esac exit $RETVAL rdma-core-56.1/iwpmd/modules-iwpmd.conf000066400000000000000000000001051477342711600201030ustar00rootroot00000000000000# These modules are loaded by the system if iwpmd is to be run iw_cm rdma-core-56.1/kernel-boot/000077500000000000000000000000001477342711600155535ustar00rootroot00000000000000rdma-core-56.1/kernel-boot/CMakeLists.txt000066400000000000000000000036271477342711600203230ustar00rootroot00000000000000rdma_subst_install(FILES rdma-load-modules@.service.in DESTINATION "${CMAKE_INSTALL_SYSTEMD_SERVICEDIR}" RENAME rdma-load-modules@.service PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ) rdma_subst_install(FILES "rdma-hw.target.in" RENAME "rdma-hw.target" DESTINATION "${CMAKE_INSTALL_SYSTEMD_SERVICEDIR}" PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ) install(FILES modules/infiniband.conf modules/iwarp.conf modules/opa.conf modules/rdma.conf modules/roce.conf DESTINATION "${CMAKE_INSTALL_SYSCONFDIR}/rdma/modules") install(FILES "rdma-persistent-naming.rules" RENAME "60-rdma-persistent-naming.rules" DESTINATION "${CMAKE_INSTALL_UDEV_RULESDIR}") install(FILES "rdma-description.rules" RENAME "75-rdma-description.rules" DESTINATION "${CMAKE_INSTALL_UDEV_RULESDIR}") install(FILES "rdma-hw-modules.rules" RENAME "90-rdma-hw-modules.rules" DESTINATION "${CMAKE_INSTALL_UDEV_RULESDIR}") install(FILES "rdma-ulp-modules.rules" RENAME "90-rdma-ulp-modules.rules" DESTINATION "${CMAKE_INSTALL_UDEV_RULESDIR}") install(FILES "rdma-umad.rules" RENAME "90-rdma-umad.rules" DESTINATION "${CMAKE_INSTALL_UDEV_RULESDIR}") rdma_subst_install(FILES "persistent-ipoib.rules.in" RENAME "70-persistent-ipoib.rules" DESTINATION "${CMAKE_INSTALL_DOCDIR}" PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ) set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS}") # Create an installed executable (under /usr/lib/udev) function(rdma_udev_executable EXEC) add_executable(${EXEC} ${ARGN}) target_link_libraries(${EXEC} LINK_PRIVATE ${COMMON_LIBS}) set_target_properties(${EXEC} PROPERTIES RUNTIME_OUTPUT_DIRECTORY "${BUILD_BIN}") install(TARGETS ${EXEC} DESTINATION "${CMAKE_INSTALL_UDEV_RULESDIR}/../") endfunction() if (NOT NL_KIND EQUAL 0) rdma_udev_executable(rdma_rename rdma_rename.c ) target_link_libraries(rdma_rename LINK_PRIVATE ${NL_LIBRARIES} ) endif() rdma-core-56.1/kernel-boot/modules/000077500000000000000000000000001477342711600172235ustar00rootroot00000000000000rdma-core-56.1/kernel-boot/modules/infiniband.conf000066400000000000000000000004531477342711600221750ustar00rootroot00000000000000# These modules are loaded by the system if any InfiniBand device is installed # InfiniBand over IP netdevice ib_ipoib # Access to fabric management SMPs and GMPs from userspace. 
ib_umad # SCSI Remote Protocol target support # ib_srpt # ib_ucm provides the obsolete /dev/infiniband/ucm0 # ib_ucm rdma-core-56.1/kernel-boot/modules/iwarp.conf000066400000000000000000000001121477342711600212060ustar00rootroot00000000000000# These modules are loaded by the system if any iWarp device is installed rdma-core-56.1/kernel-boot/modules/opa.conf000066400000000000000000000003751477342711600206560ustar00rootroot00000000000000# These modules are loaded by the system if any OmniPath Architecture device # is installed # Infiniband over IP netdevice ib_ipoib # Access to fabric management SMPs and GMPs from userspace. ib_umad # Omnipath Ethernet Virtual NIC netdevice opa_vnic rdma-core-56.1/kernel-boot/modules/rdma.conf000066400000000000000000000006351477342711600210210ustar00rootroot00000000000000# These modules are loaded by the system if any RDMA devices is installed # iSCSI over RDMA client support ib_iser # iSCSI over RDMA target support # ib_isert # User access to RDMA verbs (supports libibverbs) ib_uverbs # User access to RDMA connection management (supports librdmacm) rdma_ucm # RDS over RDMA support # rds_rdma # NFS over RDMA client support xprtrdma # NFS over RDMA server support svcrdma rdma-core-56.1/kernel-boot/modules/roce.conf000066400000000000000000000001431477342711600210200ustar00rootroot00000000000000# These modules are loaded by the system if any RDMA over Converged Ethernet # device is installed rdma-core-56.1/kernel-boot/persistent-ipoib.rules.in000066400000000000000000000013061477342711600225340ustar00rootroot00000000000000# This is a sample udev rules file that demonstrates how to get udev to # set the name of IPoIB interfaces to whatever you wish. Copy this file # into @CMAKE_INSTALL_SYSCONFDIR@/udev/rules.d before editing it! There is a 16 character limit # on network device names. # # Important items to note: ATTR{type}=="32" is IPoIB interfaces, and the # ATTR{address} match must start with ?* and only reference the last 8 # bytes of the address or else the address might not match the variable QPN # portion. # # Modern udev is case sensitive and all addresses need to be in lower case. # # ACTION=="add", SUBSYSTEM=="net", DRIVERS=="?*", ATTR{type}=="32", ATTR{address}=="?*00:02:c9:03:00:31:78:f2", NAME="mlx4_ib3" rdma-core-56.1/kernel-boot/rdma-description.rules000066400000000000000000000034111477342711600220720ustar00rootroot00000000000000# This is a version of net-description.rules for /sys/class/infiniband devices ACTION=="remove", GOTO="rdma_description_end" SUBSYSTEM!="infiniband", GOTO="rdma_description_end" # NOTE: DRIVERS searches up the sysfs path to find the driver that is bound to # the PCI/etc device that the RDMA device is linked to. This is not the kernel # driver that is supplying the RDMA device (eg as seen in ID_NET_DRIVER) # FIXME: with kernel support we could actually detect the protocols the RDMA # driver itself supports, this is a work around for lack of that support. # In future we could do this with a udev IMPORT{program} helper program # that extracted the ID information from the RDMA netlink. 
# Hardware that supports InfiniBand DRIVERS=="ib_mthca", ENV{ID_RDMA_INFINIBAND}="1" DRIVERS=="mlx4_core", ENV{ID_RDMA_INFINIBAND}="1" DRIVERS=="mlx5_core", ENV{ID_RDMA_INFINIBAND}="1" DRIVERS=="ib_qib", ENV{ID_RDMA_INFINIBAND}="1" # Hardware that supports OPA DRIVERS=="hfi1", ENV{ID_RDMA_OPA}="1" # Hardware that supports iWarp DRIVERS=="cxgb4", ENV{ID_RDMA_IWARP}="1" DRIVERS=="i40e", ENV{ID_RDMA_IWARP}="1" # Hardware that supports RoCE DRIVERS=="be2net", ENV{ID_RDMA_ROCE}="1" DRIVERS=="bnxt_en", ENV{ID_RDMA_ROCE}="1" DRIVERS=="hns", ENV{ID_RDMA_ROCE}="1" DRIVERS=="mlx4_core", ENV{ID_RDMA_ROCE}="1" DRIVERS=="mlx5_core", ENV{ID_RDMA_ROCE}="1" DRIVERS=="qede", ENV{ID_RDMA_ROCE}="1" DRIVERS=="vmw_pvrdma", ENV{ID_RDMA_ROCE}="1" DEVPATH=="*/infiniband/rxe*", ATTR{parent}=="*", ENV{ID_RDMA_ROCE}="1" # Setup the usual ID information so that systemd will display a sane name for # the RDMA device units. SUBSYSTEMS=="pci", ENV{ID_BUS}="pci", ENV{ID_VENDOR_ID}="$attr{vendor}", ENV{ID_MODEL_ID}="$attr{device}" SUBSYSTEMS=="pci", IMPORT{builtin}="hwdb --subsystem=pci" LABEL="rdma_description_end" rdma-core-56.1/kernel-boot/rdma-hw-modules.rules000066400000000000000000000034161477342711600216400ustar00rootroot00000000000000ACTION=="remove", GOTO="rdma_hw_modules_end" SUBSYSTEM!="net", GOTO="rdma_hw_modules_net_end" # For Ethernet cards with RoCE support # Automatically load RDMA specific kernel modules when a multi-function device is installed # These drivers autoload an ethernet driver based on hardware detection and # need userspace to load the module that has their RDMA component to turn on # RDMA. ENV{ID_NET_DRIVER}=="be2net", RUN{builtin}+="kmod load ocrdma" ENV{ID_NET_DRIVER}=="bnxt_en", RUN{builtin}+="kmod load bnxt_re" ENV{ID_NET_DRIVER}=="cxgb4", RUN{builtin}+="kmod load iw_cxgb4" ENV{ID_NET_DRIVER}=="hns", RUN{builtin}+="kmod load hns_roce" ENV{ID_NET_DRIVER}=="i40e", RUN{builtin}+="kmod load i40iw" ENV{ID_NET_DRIVER}=="mlx4_en", RUN{builtin}+="kmod load mlx4_ib" ENV{ID_NET_DRIVER}=="mlx5_core", RUN{builtin}+="kmod load mlx5_ib" ENV{ID_NET_DRIVER}=="qede", RUN{builtin}+="kmod load qedr" # The user must explicitly load these modules via /etc/modules-load.d/ or otherwise # rxe # enic no longer has a userspace verbs driver, this rule should probably be # owned by libfabric ENV{ID_NET_DRIVER}=="enic", RUN{builtin}+="kmod load usnic_verbs" # These providers are single function and autoload RDMA automatically based on # PCI probing # hfi1verbs # ipathverbs # mthca # vmw_pvrdma LABEL="rdma_hw_modules_net_end" SUBSYSTEM!="pci", GOTO="rdma_hw_modules_pci_end" # For InfiniBand cards # Normally the request_module inside the driver will trigger this, but in case that fails due to # missing modules in the initrd, trigger it again. HW that doesn't create a netdevice will not # trigger the net based rules above. 
ENV{DRIVER}=="mlx4_core", RUN{builtin}+="kmod load mlx4_ib" ENV{DRIVER}=="mlx5_core", RUN{builtin}+="kmod load mlx5_ib" LABEL="rdma_hw_modules_pci_end" LABEL="rdma_hw_modules_end" rdma-core-56.1/kernel-boot/rdma-hw.target.in000066400000000000000000000011071477342711600207260ustar00rootroot00000000000000[Unit] Description=RDMA Hardware Documentation=file:@CMAKE_INSTALL_FULL_DOCDIR@/udev.md StopWhenUnneeded=yes # Start the basic ULP RDMA kernel modules when RDMA hardware is detected (note # the rdma-load-modules@.service is already before this target) Wants=rdma-load-modules@rdma.service # Order after the standard network.target for compatibility with init.d # scripts that order after networking - this will mean RDMA is ready too. Before=network.target # We do not order rdma-hw before basic.target, units for daemons that use RDMA # have to manually order after rdma-hw.target rdma-core-56.1/kernel-boot/rdma-load-modules@.service.in000066400000000000000000000021461477342711600231530ustar00rootroot00000000000000[Unit] Description=Load RDMA modules from @CMAKE_INSTALL_FULL_SYSCONFDIR@/rdma/modules/%I.conf Documentation=file:@CMAKE_INSTALL_FULL_DOCDIR@/udev.md # Kernel module loading must take place before sysinit.target, similar to # systemd-modules-load.service DefaultDependencies=no Before=sysinit.target # Kernel modules must load in initrd before initrd.target to avoid being killed # when initrd-cleanup.service isolates to initrd-switch-root.target. Before=initrd.target # Do not execute concurrently with an ongoing shutdown Conflicts=shutdown.target Before=shutdown.target # Partially support distro network setup scripts that run after # systemd-modules-load.service but before sysinit.target, eg a classic network # setup script. Run them after modules have loaded. Wants=network-pre.target Before=network-pre.target # Orders all kernel module startup before rdma-hw.target can become ready Before=rdma-hw.target ConditionCapability=CAP_SYS_MODULE [Service] Type=oneshot RemainAfterExit=yes ExecStart=@CMAKE_INSTALL_SYSTEMD_BINDIR@/systemd-modules-load @CMAKE_INSTALL_FULL_SYSCONFDIR@/rdma/modules/%I.conf TimeoutSec=90s rdma-core-56.1/kernel-boot/rdma-persistent-naming.rules000066400000000000000000000021251477342711600232170ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file # # Rename modes: # NAME_FALLBACK - Try to name devices in the following order: # by-pci -> by-guid -> kernel # NAME_KERNEL - leave name as kernel provided # NAME_PCI - based on PCI/slot/function location # NAME_GUID - based on system image GUID # NAME_FIXED - rename the device to the fixed named in the next argument # # The stable names are combination of device type technology and rename mode. 
# Infiniband - ib* # RoCE - roce* # iWARP - iw* # OPA - opa* # Default (unknown protocol) - rdma* # # Example: # * NAME_PCI # pci = 0000:00:0c.4 # Device type = IB # mlx5_0 -> ibp0s12f4 # * NAME_GUID # GUID = 5254:00c0:fe12:3455 # Device type = RoCE # mlx5_0 -> rocex525400c0fe123455 # ACTION=="add", SUBSYSTEM=="infiniband", PROGRAM="rdma_rename %k NAME_FALLBACK" # Example: # * NAME_FIXED # fixed name for specific board_id # #ACTION=="add", ATTR{board_id}=="MSF0010110035", SUBSYSTEM=="infiniband", PROGRAM="rdma_rename %k NAME_FIXED myib"rdma-core-56.1/kernel-boot/rdma-ulp-modules.rules000066400000000000000000000012051477342711600220140ustar00rootroot00000000000000ACTION=="remove", GOTO="rdma_ulp_modules_end" SUBSYSTEM!="infiniband", GOTO="rdma_ulp_modules_end" # Automatically load general RDMA ULP modules when RDMA hardware is installed TAG+="systemd", ENV{SYSTEMD_WANTS}+="rdma-hw.target" TAG+="systemd", ENV{ID_RDMA_INFINIBAND}=="1", ENV{SYSTEMD_WANTS}+="rdma-load-modules@infiniband.service" TAG+="systemd", ENV{ID_RDMA_IWARP}=="1", ENV{SYSTEMD_WANTS}+="rdma-load-modules@iwarp.service" TAG+="systemd", ENV{ID_RDMA_OPA}=="1", ENV{SYSTEMD_WANTS}+="rdma-load-modules@opa.service" TAG+="systemd", ENV{ID_RDMA_ROCE}=="1", ENV{SYSTEMD_WANTS}+="rdma-load-modules@roce.service" LABEL="rdma_ulp_modules_end" rdma-core-56.1/kernel-boot/rdma-umad.rules000066400000000000000000000002161477342711600204750ustar00rootroot00000000000000SUBSYSTEM=="infiniband_mad", KERNEL=="*umad*", TAG+="systemd", ENV{SYSTEMD_ALIAS}="/sys/subsystem/rdma/devices/$attr{ibdev}:$attr{port}/umad" rdma-core-56.1/kernel-boot/rdma_rename.c000066400000000000000000000325521477342711600201760ustar00rootroot00000000000000// SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) /* Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include /* * Rename modes: * NAME_FALLBACK - Try to name devices in the following order: * by-onboard -> by-pci -> by-guid -> kernel * NAME_KERNEL - leave name as kernel provided * NAME_PCI - based on PCI/slot/function location * NAME_GUID - based on node GUID * NAME_ONBOARD - based on the on-board device index * NAME_FIXED - rename the device to the fixed name in the next argument * * The stable names are a combination of device type technology and rename mode. * Infiniband - ib* * RoCE - roce* * iWARP - iw* * OPA - opa* * Default (unknown protocol) - rdma* * * Example: * NAME_PCI * pci = 0000:00:0c.4 * Device type = IB * mlx5_0 -> ibp0s12f4 * NAME_GUID * GUID = 5254:00c0:fe12:3455 * Device type = RoCE * mlx5_0 -> rocex525400c0fe123455 * NAME_ONBOARD * Index = 3 * Device type = OPA * hfi1_1 -> opao3 */ struct data { const char *curr; char *prefix; uint64_t node_guid; char *name; int idx; int name_assign_type; }; static bool debug_mode; #define pr_err(args...) syslog(LOG_ERR, ##args) #define pr_dbg(args...)
\ do { \ if (debug_mode) \ syslog(LOG_ERR, ##args); \ } while (0) #define ONBOARD_INDEX_MAX (16*1024-1) static int by_onboard(struct data *d) { char *index = NULL; char *acpi = NULL; unsigned int o; FILE *fp; int ret; /* * ACPI_DSM - device specific method for naming * PCI or PCI Express device */ ret = asprintf(&acpi, "/sys/class/infiniband/%s/device/acpi_index", d->curr); if (ret < 0) return -ENOMEM; /* SMBIOS type 41 - Onboard Devices Extended Information */ ret = asprintf(&index, "/sys/class/infiniband/%s/device/index", d->curr); if (ret < 0) { index = NULL; ret = -ENOMEM; goto out; } fp = fopen(acpi, "r"); if (!fp) fp = fopen(index, "r"); if (!fp) { pr_dbg("%s: Device is not embedded onboard\n", d->curr); ret = -ENOENT; goto out; } ret = fscanf(fp, "%u", &o); fclose(fp); /* https://github.com/systemd/systemd/blob/master/src/udev/udev-builtin-net_id.c#L263 */ if (!ret || o > ONBOARD_INDEX_MAX) { pr_err("%s: Onboard index %d and ret %d\n", d->curr, o, ret); ret = -ENOENT; goto out; } ret = asprintf(&d->name, "%so%u", d->prefix, o); if (ret < 0) { pr_err("%s: Failed to allocate name with prefix %s and onboard index %d\n", d->curr, d->prefix, o); ret = -ENOENT; d->name = NULL; goto out; } ret = 0; out: free(index); free(acpi); return ret; } static int find_sun(char *devname, char *pci) { char bof[256], tmp[256]; struct dirent *dent; char *slots; DIR *dir; int ret; ret = asprintf(&slots, "%s/subsystem/slots", devname); if (ret < 0) return 0; ret = 0; dir = opendir(slots); if (!dir) goto err_dir; if (sscanf(pci, "%s.%s", bof, tmp) != 2) goto out; while ((dent = readdir(dir))) { char *str, address[256]; FILE *fp; int i; if (dent->d_name[0] == '.') continue; i = atoi(dent->d_name); if (i <= 0) continue; ret = asprintf(&str, "%s/%s/address", slots, dent->d_name); if (ret < 0) { ret = 0; goto out; } fp = fopen(str, "r"); free(str); if (!fp) { ret = 0; goto out; } ret = fscanf(fp, "%255s", address); fclose(fp); if (ret != 1) { ret = 0; goto out; } if (!strcmp(bof, address)) { ret = i; break; } } out: closedir(dir); err_dir: free(slots); return ret; } static int is_pci_multifunction(char *devname) { char c[64] = {}; char *config; FILE *fp; int ret; ret = asprintf(&config, "%s/config", devname); if (ret < 0) return 0; fp = fopen(config, "r"); free(config); if (!fp) return 0; ret = fread(c, 1, sizeof(c), fp); fclose(fp); if (ret != sizeof(c)) return 0; /* bit 0-6 header type, bit 7 multi/single function device */ return c[PCI_HEADER_TYPE] & 0x80; } static int is_pci_ari_enabled(char *devname) { int ret, a; char *ari; FILE *fp; ret = asprintf(&ari, "%s/ari_enabled", devname); if (ret < 0) return 0; fp = fopen(ari, "r"); free(ari); if (!fp) return 0; ret = fscanf(fp, "%d", &a); fclose(fp); return (ret) ? 
a == 1 : 0; } struct pci_info { char *pcidev; unsigned int domain; unsigned int bus; unsigned int slot; unsigned int func; unsigned int sun; unsigned int vf; bool valid_vf; }; static int fill_pci_info(struct data *d, struct pci_info *p) { char buf[256] = {}; char *pci; int ret; ret = readlink(p->pcidev, buf, sizeof(buf)-1); if (ret == -1 || ret == sizeof(buf)) return -EINVAL; buf[ret] = 0; pci = basename(buf); /* * pci = 0000:00:0c.0 */ ret = sscanf(pci, "%x:%x:%x.%u", &p->domain, &p->bus, &p->slot, &p->func); if (ret != 4) { pr_err("%s: Failed to read PCI BOF\n", d->curr); return -ENOENT; } if (is_pci_ari_enabled(p->pcidev)) { /* * ARI devices support up to 256 functions on a single device * ("slot"), and interpret the traditional 5-bit slot and 3-bit * function number as a single 8-bit function number, where the * slot makes up the upper 5 bits. * * https://github.com/systemd/systemd/blob/master/src/udev/udev-builtin-net_id.c#L344 */ p->func += p->slot * 8; pr_dbg("%s: This is ARI device, new PCI BOF is %04x:%02x:%02x.%u\n", d->curr, p->domain, p->bus, p->slot, p->func); } p->sun = find_sun(p->pcidev, pci); return 0; } static int get_virtfn_info(struct data *d, struct pci_info *p) { struct pci_info vf = {}; char *physfn_pcidev; struct dirent *dent; DIR *dir; int ret; /* Check if this is a virtual function. */ ret = asprintf(&physfn_pcidev, "%s/physfn", p->pcidev); if (ret < 0) return -ENOMEM; /* We are VF, get VF number and replace pcidev to point to PF */ dir = opendir(physfn_pcidev); if (!dir) { /* * -ENOENT means that we are already in PF * and pcidev points to right PCI. */ ret = (errno == ENOENT) ? 0 : -ENOMEM; goto err_free; } p->valid_vf = true; vf.pcidev = p->pcidev; ret = fill_pci_info(d, &vf); if (ret) goto err_dir; while ((dent = readdir(dir))) { const char *s = "virtfn"; struct pci_info v = {}; if (strncmp(dent->d_name, s, strlen(s)) || strlen(dent->d_name) == strlen(s)) continue; ret = asprintf(&v.pcidev, "%s/%s", physfn_pcidev, dent->d_name); if (ret < 0) { ret = -ENOMEM; goto err_dir; } ret = fill_pci_info(d, &v); free(v.pcidev); if (ret) { ret = -ENOMEM; goto err_dir; } if (vf.func == v.func && vf.slot == v.slot) { p->vf = atoi(&dent->d_name[6]); break; } } p->pcidev = physfn_pcidev; closedir(dir); return 0; err_dir: closedir(dir); err_free: free(physfn_pcidev); return ret; } static int by_pci(struct data *d) { struct pci_info p = {}; char *subsystem; char buf[256] = {}; char *subs; int ret; ret = asprintf(&subsystem, "/sys/class/infiniband/%s/device/subsystem", d->curr); if (ret < 0) return -ENOMEM; ret = readlink(subsystem, buf, sizeof(buf)-1); if (ret == -1 || ret == sizeof(buf)) { ret = -EINVAL; goto out; } buf[ret] = 0; subs = basename(buf); if (strcmp(subs, "pci")) { /* Bail out for virtual devices */ pr_dbg("%s: Non-PCI device (%s) was detected\n", d->curr, subs); ret = -EINVAL; goto out; } /* Real devices */ ret = asprintf(&p.pcidev, "/sys/class/infiniband/%s/device", d->curr); if (ret < 0) { ret = -ENOMEM; p.pcidev = NULL; goto out; } ret = get_virtfn_info(d, &p); if (ret) goto out; ret = fill_pci_info(d, &p); if (ret) { pr_err("%s: Failed to fill PCI device information\n", d->curr); goto out; } d->name = calloc(256, sizeof(char)); if (!d->name) { ret = -ENOMEM; goto out; } ret = sprintf(d->name, "%s", d->prefix); if (ret == -1) { ret = -EINVAL; goto out; } if (p.domain > 0) { ret = sprintf(buf, "P%u", p.domain); if (ret == -1) { ret = -ENOMEM; goto out; } strcat(d->name, buf); } if (p.sun > 0) ret = sprintf(buf, "s%u", p.sun); else ret = sprintf(buf, "p%us%u",
p.bus, p.slot); if (ret == -1) { ret = -ENOMEM; goto out; } strcat(d->name, buf); if (p.func > 0 || is_pci_multifunction(p.pcidev)) { ret = sprintf(buf, "f%u", p.func); if (ret == -1) { ret = -ENOMEM; goto out; } strcat(d->name, buf); if (p.valid_vf) { ret = sprintf(buf, "v%u", p.vf); if (ret == -1) { ret = -ENOMEM; goto out; } strcat(d->name, buf); } } ret = 0; out: free(p.pcidev); free(subsystem); if (ret) { free(d->name); d->name = NULL; } return ret; } static int by_guid(struct data *d) { uint16_t vp[4]; int ret = -1; if (!d->node_guid) /* virtual devices start without GUID */ goto out; memcpy(vp, &d->node_guid, sizeof(uint64_t)); ret = asprintf(&d->name, "%sx%04x%04x%04x%04x", d->prefix, vp[3], vp[2], vp[1], vp[0]); out: if (ret == -1) { d->name = NULL; return -ENOMEM; } return 0; } static int set_fixed_name(struct data *d, char *name) { int ret; ret = asprintf(&d->name, "%s", name); if (ret == -1) { d->name = NULL; return -ENOMEM; } return 0; } static int device_rename(struct nl_sock *nl, struct data *d) { struct nlmsghdr *hdr; struct nl_msg *msg; int ret = -1; msg = nlmsg_alloc(); if (!msg) return -ENOMEM; hdr = nlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_SET), 0, 0); if (!hdr) { ret = -ENOMEM; goto nla_put_failure; } NLA_PUT_U32(msg, RDMA_NLDEV_ATTR_DEV_INDEX, d->idx); NLA_PUT_STRING(msg, RDMA_NLDEV_ATTR_DEV_NAME, d->name); ret = nl_send_auto(nl, msg); if (ret < 0) return ret; nla_put_failure: nlmsg_free(msg); return (ret < 0) ? ret : 0; } static int get_nldata_cb(struct nl_msg *msg, void *data) { struct nlattr *tb[RDMA_NLDEV_ATTR_MAX] = {}; struct nlmsghdr *hdr = nlmsg_hdr(msg); struct data *d = data; int ret; ret = nlmsg_parse(hdr, 0, tb, RDMA_NLDEV_ATTR_MAX - 1, rdmanl_policy); if (ret < 0) return NL_STOP; if (!tb[RDMA_NLDEV_ATTR_DEV_NAME] || !tb[RDMA_NLDEV_ATTR_DEV_INDEX] || !tb[RDMA_NLDEV_ATTR_NODE_GUID]) return NL_STOP; ret = strcmp(d->curr, nla_get_string(tb[RDMA_NLDEV_ATTR_DEV_NAME])); if (ret) return NL_OK; if (tb[RDMA_NLDEV_ATTR_DEV_PROTOCOL]) d->prefix = strdup( nla_get_string(tb[RDMA_NLDEV_ATTR_DEV_PROTOCOL])); if (!d->prefix) ret = asprintf(&d->prefix, "rdma"); if (ret < 0) return NL_STOP; d->idx = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]); d->node_guid = nla_get_u64(tb[RDMA_NLDEV_ATTR_NODE_GUID]); if (tb[RDMA_NLDEV_ATTR_NAME_ASSIGN_TYPE]) d->name_assign_type = nla_get_u8(tb[RDMA_NLDEV_ATTR_NAME_ASSIGN_TYPE]); return NL_STOP; } enum name_policy { NAME_KERNEL = 1 << 0, NAME_PCI = 1 << 1, NAME_GUID = 1 << 2, NAME_ONBOARD = 1 << 3, NAME_FIXED = 1 << 4, NAME_ERROR = 1 << 8 }; static int str2policy(const char *np) { if (!strcmp(np, "NAME_KERNEL")) return NAME_KERNEL; if (!strcmp(np, "NAME_PCI")) return NAME_PCI; if (!strcmp(np, "NAME_GUID")) return NAME_GUID; if (!strcmp(np, "NAME_ONBOARD")) return NAME_ONBOARD; if (!strcmp(np, "NAME_FIXED")) return NAME_FIXED; if (!strcmp(np, "NAME_FALLBACK")) return NAME_ONBOARD | NAME_PCI; return NAME_ERROR; }; int main(int argc, char **argv) { struct data d = { .idx = -1 }; struct nl_sock *nl; int ret = -1; int np, opt; if (argc < 3) goto err; while ((opt = getopt(argc, argv, "v")) >= 0) { switch (opt) { case 'v': debug_mode = true; break; default: goto err; } } argc -= optind; argv += optind; d.curr = argv[0]; np = str2policy(argv[1]); if (np & NAME_ERROR) { pr_err("%s: Unknown policy %s\n", d.curr, argv[1]); goto err; } if (np & NAME_FIXED && argc < 3) { pr_err("%s: No name specified\n", d.curr); goto err; } pr_dbg("%s: Requested policy is %s\n", d.curr, argv[1]); if (np & NAME_KERNEL) { 
pr_dbg("%s: Leave kernel names, do nothing\n", d.curr); /* Do nothing */ exit(0); } nl = rdmanl_socket_alloc(); if (!nl) { pr_err("%s: Failed to allocate netlink socket\n", d.curr); goto err; } if (rdmanl_get_devices(nl, get_nldata_cb, &d)) { pr_err("%s: Failed to connect to NETLINK_RDMA\n", d.curr); goto out; } if (d.name_assign_type == RDMA_NAME_ASSIGN_TYPE_USER) { pr_dbg("%s: Leave user-assigned names, do nothing\n", d.curr); /* Do nothing */ ret = 0; goto out; } if (d.idx == -1 || !d.prefix) { pr_err("%s: Failed to get current device name and index\n", d.curr); goto out; } ret = -1; if (np & NAME_ONBOARD) ret = by_onboard(&d); if (ret && (np & NAME_PCI)) ret = by_pci(&d); if (ret && (np & NAME_GUID)) ret = by_guid(&d); if (ret && (np & NAME_FIXED)) ret = set_fixed_name(&d, argv[2]); if (ret) goto out; ret = device_rename(nl, &d); if (ret) { pr_err("%s: Device rename to %s failed with error %d\n", d.curr, d.name, ret); goto out; } pr_dbg("%s: Successfully renamed device to be %s\n", d.curr, d.name); printf("%s\n", d.name); free(d.name); out: free(d.prefix); nl_socket_free(nl); err: ret = (ret) ? 1 : 0; exit(ret); } rdma-core-56.1/kernel-headers/000077500000000000000000000000001477342711600162235ustar00rootroot00000000000000rdma-core-56.1/kernel-headers/CMakeLists.txt000066400000000000000000000040241477342711600207630ustar00rootroot00000000000000publish_internal_headers(rdma rdma/bnxt_re-abi.h rdma/cxgb4-abi.h rdma/efa-abi.h rdma/erdma-abi.h rdma/hns-abi.h rdma/ib_user_ioctl_cmds.h rdma/ib_user_ioctl_verbs.h rdma/ib_user_mad.h rdma/ib_user_sa.h rdma/ib_user_verbs.h rdma/irdma-abi.h rdma/mana-abi.h rdma/mlx4-abi.h rdma/mlx5-abi.h rdma/mlx5_user_ioctl_cmds.h rdma/mlx5_user_ioctl_verbs.h rdma/mthca-abi.h rdma/ocrdma-abi.h rdma/qedr-abi.h rdma/rdma_netlink.h rdma/rdma_user_cm.h rdma/rdma_user_ioctl.h rdma/rdma_user_ioctl_cmds.h rdma/rdma_user_rxe.h rdma/rvt-abi.h rdma/siw-abi.h rdma/vmw_pvrdma-abi.h ) publish_internal_headers(rdma/hfi rdma/hfi/hfi1_ioctl.h rdma/hfi/hfi1_user.h ) publish_internal_headers(linux linux/stddef.h linux/vfio.h ) function(rdma_kernel_provider_abi) # Older versions of cmake do not create the output directory automatically set(DDIR "${BUILD_INCLUDE}/kernel-abi") rdma_make_dir("${DDIR}") set(HDRS "") foreach(IHDR ${ARGN}) get_filename_component(FIL ${IHDR} NAME) set(OHDR "${DDIR}/${FIL}") set(HDRS ${HDRS} ${OHDR}) add_custom_command( OUTPUT "${OHDR}" COMMAND "${PYTHON_EXECUTABLE}" "${PROJECT_SOURCE_DIR}/buildlib/make_abi_structs.py" "${IHDR}" "${OHDR}" MAIN_DEPENDENCY "${IHDR}" DEPENDS "${PROJECT_SOURCE_DIR}/buildlib/make_abi_structs.py" WORKING_DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}" COMMENT "Creating ABI wrapper ${OHDR}" ) endforeach() # This weird construction is needed to ensure ordering of the build. 
add_library(kern-abi STATIC kern-abi.c ${HDRS}) endfunction() # Transform the kernel ABIs used by the providers rdma_kernel_provider_abi( rdma/bnxt_re-abi.h rdma/cxgb4-abi.h rdma/efa-abi.h rdma/erdma-abi.h rdma/hns-abi.h rdma/ib_user_verbs.h rdma/irdma-abi.h rdma/mana-abi.h rdma/mlx4-abi.h rdma/mlx5-abi.h rdma/mthca-abi.h rdma/ocrdma-abi.h rdma/qedr-abi.h rdma/rdma_user_rxe.h rdma/siw-abi.h rdma/vmw_pvrdma-abi.h ) publish_headers(infiniband rdma/ib_user_ioctl_verbs.h ) rdma-core-56.1/kernel-headers/kern-abi.c000066400000000000000000000000331477342711600200560ustar00rootroot00000000000000/* empty file for cmake */ rdma-core-56.1/kernel-headers/linux/000077500000000000000000000000001477342711600173625ustar00rootroot00000000000000rdma-core-56.1/kernel-headers/linux/stddef.h000066400000000000000000000026761477342711600210170ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ #ifndef __always_inline #define __always_inline inline #endif /** * __struct_group() - Create a mirrored named and anonymous struct * * @TAG: The tag name for the named sub-struct (usually empty) * @NAME: The identifier name of the mirrored sub-struct * @ATTRS: Any struct attributes (usually empty) * @MEMBERS: The member declarations for the mirrored structs * * Used to create an anonymous union of two structs with identical layout * and size: one anonymous and one named. The former's members can be used * normally without sub-struct naming, and the latter can be used to * reason about the start, end, and size of the group of struct members. * The named struct can also be explicitly tagged for layer reuse, as well * as both having struct attributes appended. */ #define __struct_group(TAG, NAME, ATTRS, MEMBERS...) \ union { \ struct { MEMBERS } ATTRS; \ struct TAG { MEMBERS } ATTRS NAME; \ } /** * __DECLARE_FLEX_ARRAY() - Declare a flexible array usable in a union * * @TYPE: The type of each flexible array element * @NAME: The name of the flexible array member * * In order to have a flexible array member in a union or alone in a * struct, it needs to be wrapped in an anonymous struct with at least 1 * named member, but that member can be empty. */ #define __DECLARE_FLEX_ARRAY(TYPE, NAME) \ struct { \ struct { } __empty_ ## NAME; \ TYPE NAME[]; \ } rdma-core-56.1/kernel-headers/linux/vfio.h000066400000000000000000001511311477342711600205000ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ /* * VFIO API definition * * Copyright (C) 2012 Red Hat, Inc. All rights reserved. * Author: Alex Williamson <alex.williamson@redhat.com> * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as * published by the Free Software Foundation. */ #ifndef _UAPIVFIO_H #define _UAPIVFIO_H #include <linux/types.h> #include <linux/ioctl.h> #ifndef __kernel #define __user #endif #define VFIO_API_VERSION 0 /* Kernel & User level defines for VFIO IOCTLs. */ /* Extensions */ #define VFIO_TYPE1_IOMMU 1 #define VFIO_SPAPR_TCE_IOMMU 2 #define VFIO_TYPE1v2_IOMMU 3 /* * IOMMU enforces DMA cache coherence (ex. PCIe NoSnoop stripping). This * capability is subject to change as groups are added or removed. */ #define VFIO_DMA_CC_IOMMU 4 /* Check if EEH is supported */ #define VFIO_EEH 5 /* Two-stage IOMMU */ #define VFIO_TYPE1_NESTING_IOMMU 6 /* Implies v2 */ #define VFIO_SPAPR_TCE_v2_IOMMU 7 /* * The No-IOMMU IOMMU offers no translation or isolation for devices and * supports no ioctls outside of VFIO_CHECK_EXTENSION.
Use of VFIO's No-IOMMU * code will taint the host kernel and should be used with extreme caution. */ #define VFIO_NOIOMMU_IOMMU 8 /* Supports VFIO_DMA_UNMAP_FLAG_ALL */ #define VFIO_UNMAP_ALL 9 /* Supports the vaddr flag for DMA map and unmap */ #define VFIO_UPDATE_VADDR 10 /* * The IOCTL interface is designed for extensibility by embedding the * structure length (argsz) and flags into structures passed between * kernel and userspace. We therefore use the _IO() macro for these * defines to avoid implicitly embedding a size into the ioctl request. * As structure fields are added, argsz will increase to match and flag * bits will be defined to indicate additional fields with valid data. * It's *always* the caller's responsibility to indicate the size of * the structure passed by setting argsz appropriately. */ #define VFIO_TYPE (';') #define VFIO_BASE 100 /* * For extension of INFO ioctls, VFIO makes use of a capability chain * designed after PCI/e capabilities. A flag bit indicates whether * this capability chain is supported and a field defined in the fixed * structure defines the offset of the first capability in the chain. * This field is only valid when the corresponding bit in the flags * bitmap is set. This offset field is relative to the start of the * INFO buffer, as is the next field within each capability header. * The id within the header is a shared address space per INFO ioctl, * while the version field is specific to the capability id. The * contents following the header are specific to the capability id. */ struct vfio_info_cap_header { __u16 id; /* Identifies capability */ __u16 version; /* Version specific to the capability ID */ __u32 next; /* Offset of next capability */ }; /* * Callers of INFO ioctls passing insufficiently sized buffers will see * the capability chain flag bit set, a zero value for the first capability * offset (if available within the provided argsz), and argsz will be * updated to report the necessary buffer size. For compatibility, the * INFO ioctl will not report error in this case, but the capability chain * will not be available. */ /* -------- IOCTLs for VFIO file descriptor (/dev/vfio/vfio) -------- */ /** * VFIO_GET_API_VERSION - _IO(VFIO_TYPE, VFIO_BASE + 0) * * Report the version of the VFIO API. This allows us to bump the entire * API version should we later need to add or change features in incompatible * ways. * Return: VFIO_API_VERSION * Availability: Always */ #define VFIO_GET_API_VERSION _IO(VFIO_TYPE, VFIO_BASE + 0) /** * VFIO_CHECK_EXTENSION - _IOW(VFIO_TYPE, VFIO_BASE + 1, __u32) * * Check whether an extension is supported. * Return: 0 if not supported, 1 (or some other positive integer) if supported. * Availability: Always */ #define VFIO_CHECK_EXTENSION _IO(VFIO_TYPE, VFIO_BASE + 1) /** * VFIO_SET_IOMMU - _IOW(VFIO_TYPE, VFIO_BASE + 2, __s32) * * Set the iommu to the given type. The type must be supported by an * iommu driver as verified by calling CHECK_EXTENSION using the same * type. A group must be set to this file descriptor before this * ioctl is available. The IOMMU interfaces enabled by this call are * specific to the value set. * Return: 0 on success, -errno on failure * Availability: When VFIO group attached */ #define VFIO_SET_IOMMU _IO(VFIO_TYPE, VFIO_BASE + 2) /* -------- IOCTLs for GROUP file descriptors (/dev/vfio/$GROUP) -------- */ /** * VFIO_GROUP_GET_STATUS - _IOR(VFIO_TYPE, VFIO_BASE + 3, * struct vfio_group_status) * * Retrieve information about the group. Fills in provided * struct vfio_group_status.
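 *
 * For illustration only (not part of this ABI), the argsz protocol described
 * above is typically driven from userspace like this; a hedged sketch with
 * error handling omitted:
 *
 *	struct vfio_group_status status = { .argsz = sizeof(status) };
 *
 *	ioctl(group_fd, VFIO_GROUP_GET_STATUS, &status);
 *	if (!(status.flags & VFIO_GROUP_FLAGS_VIABLE))
 *		return -1;	// group not viable, some devices unbound
 *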
Caller sets argsz. * Return: 0 on success, -errno on failure. * Availability: Always */ struct vfio_group_status { __u32 argsz; __u32 flags; #define VFIO_GROUP_FLAGS_VIABLE (1 << 0) #define VFIO_GROUP_FLAGS_CONTAINER_SET (1 << 1) }; #define VFIO_GROUP_GET_STATUS _IO(VFIO_TYPE, VFIO_BASE + 3) /** * VFIO_GROUP_SET_CONTAINER - _IOW(VFIO_TYPE, VFIO_BASE + 4, __s32) * * Set the container for the VFIO group to the open VFIO file * descriptor provided. Groups may only belong to a single * container. Containers may, at their discretion, support multiple * groups. Only when a container is set are all of the interfaces * of the VFIO file descriptor and the VFIO group file descriptor * available to the user. * Return: 0 on success, -errno on failure. * Availability: Always */ #define VFIO_GROUP_SET_CONTAINER _IO(VFIO_TYPE, VFIO_BASE + 4) /** * VFIO_GROUP_UNSET_CONTAINER - _IO(VFIO_TYPE, VFIO_BASE + 5) * * Remove the group from the attached container. This is the * opposite of the SET_CONTAINER call and returns the group to * an initial state. All device file descriptors must be released * prior to calling this interface. When removing the last group * from a container, the IOMMU will be disabled and all state lost, * effectively also returning the VFIO file descriptor to an initial * state. * Return: 0 on success, -errno on failure. * Availability: When attached to container */ #define VFIO_GROUP_UNSET_CONTAINER _IO(VFIO_TYPE, VFIO_BASE + 5) /** * VFIO_GROUP_GET_DEVICE_FD - _IOW(VFIO_TYPE, VFIO_BASE + 6, char) * * Return a new file descriptor for the device object described by * the provided string. The string should match a device listed in * the devices subdirectory of the IOMMU group sysfs entry. The * group containing the device must already be added to this context. * Return: new file descriptor on success, -errno on failure. * Availability: When attached to container */ #define VFIO_GROUP_GET_DEVICE_FD _IO(VFIO_TYPE, VFIO_BASE + 6) /* --------------- IOCTLs for DEVICE file descriptors --------------- */ /** * VFIO_DEVICE_GET_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 7, * struct vfio_device_info) * * Retrieve information about the device. Fills in provided * struct vfio_device_info. Caller sets argsz. * Return: 0 on success, -errno on failure. */ struct vfio_device_info { __u32 argsz; __u32 flags; #define VFIO_DEVICE_FLAGS_RESET (1 << 0) /* Device supports reset */ #define VFIO_DEVICE_FLAGS_PCI (1 << 1) /* vfio-pci device */ #define VFIO_DEVICE_FLAGS_PLATFORM (1 << 2) /* vfio-platform device */ #define VFIO_DEVICE_FLAGS_AMBA (1 << 3) /* vfio-amba device */ #define VFIO_DEVICE_FLAGS_CCW (1 << 4) /* vfio-ccw device */ #define VFIO_DEVICE_FLAGS_AP (1 << 5) /* vfio-ap device */ #define VFIO_DEVICE_FLAGS_FSL_MC (1 << 6) /* vfio-fsl-mc device */ #define VFIO_DEVICE_FLAGS_CAPS (1 << 7) /* Info supports caps */ __u32 num_regions; /* Max region index + 1 */ __u32 num_irqs; /* Max IRQ index + 1 */ __u32 cap_offset; /* Offset within info struct of first cap */ }; #define VFIO_DEVICE_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 7) /* * A vendor driver using the Mediated device framework should provide a * device_api attribute in the supported type attribute groups. The device API * string should be one of the following, corresponding to the device flags in * the vfio_device_info structure.
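 *
 * For illustration only, the container/group/device ioctls above are normally
 * used in the order shown below; a hedged sketch with error handling omitted,
 * where the group number "26" and the PCI address are purely hypothetical:
 *
 *	int container = open("/dev/vfio/vfio", O_RDWR);
 *	int group = open("/dev/vfio/26", O_RDWR);
 *
 *	ioctl(container, VFIO_GET_API_VERSION);	// must equal VFIO_API_VERSION
 *	ioctl(container, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU);
 *	ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
 *	ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1v2_IOMMU);
 *	int device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, "0000:06:0d.0");
 *
 *	struct vfio_device_info device_info = { .argsz = sizeof(device_info) };
 *	ioctl(device, VFIO_DEVICE_GET_INFO, &device_info);
 *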
*/ #define VFIO_DEVICE_API_PCI_STRING "vfio-pci" #define VFIO_DEVICE_API_PLATFORM_STRING "vfio-platform" #define VFIO_DEVICE_API_AMBA_STRING "vfio-amba" #define VFIO_DEVICE_API_CCW_STRING "vfio-ccw" #define VFIO_DEVICE_API_AP_STRING "vfio-ap" /* * The following capabilities are unique to s390 zPCI devices. Their contents * are further-defined in vfio_zdev.h */ #define VFIO_DEVICE_INFO_CAP_ZPCI_BASE 1 #define VFIO_DEVICE_INFO_CAP_ZPCI_GROUP 2 #define VFIO_DEVICE_INFO_CAP_ZPCI_UTIL 3 #define VFIO_DEVICE_INFO_CAP_ZPCI_PFIP 4 /** * VFIO_DEVICE_GET_REGION_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 8, * struct vfio_region_info) * * Retrieve information about a device region. Caller provides * struct vfio_region_info with index value set. Caller sets argsz. * Implementation of region mapping is bus driver specific. This is * intended to describe MMIO, I/O port, as well as bus specific * regions (ex. PCI config space). Zero sized regions may be used * to describe unimplemented regions (ex. unimplemented PCI BARs). * Return: 0 on success, -errno on failure. */ struct vfio_region_info { __u32 argsz; __u32 flags; #define VFIO_REGION_INFO_FLAG_READ (1 << 0) /* Region supports read */ #define VFIO_REGION_INFO_FLAG_WRITE (1 << 1) /* Region supports write */ #define VFIO_REGION_INFO_FLAG_MMAP (1 << 2) /* Region supports mmap */ #define VFIO_REGION_INFO_FLAG_CAPS (1 << 3) /* Info supports caps */ __u32 index; /* Region index */ __u32 cap_offset; /* Offset within info struct of first cap */ __u64 size; /* Region size (bytes) */ __u64 offset; /* Region offset from start of device fd */ }; #define VFIO_DEVICE_GET_REGION_INFO _IO(VFIO_TYPE, VFIO_BASE + 8) /* * The sparse mmap capability allows finer granularity of specifying areas * within a region with mmap support. When specified, the user should only * mmap the offset ranges specified by the areas array. mmaps outside of the * areas specified may fail (such as the range covering a PCI MSI-X table) or * may result in improper device behavior. * * The structures below define version 1 of this capability. */ #define VFIO_REGION_INFO_CAP_SPARSE_MMAP 1 struct vfio_region_sparse_mmap_area { __u64 offset; /* Offset of mmap'able area within region */ __u64 size; /* Size of mmap'able area */ }; struct vfio_region_info_cap_sparse_mmap { struct vfio_info_cap_header header; __u32 nr_areas; __u32 reserved; struct vfio_region_sparse_mmap_area areas[]; }; /* * The device specific type capability allows regions unique to a specific * device or class of devices to be exposed. This helps solve the problem for * vfio bus drivers of defining which region indexes correspond to which region * on the device, without needing to resort to static indexes, as done by * vfio-pci. For instance, if we were to go back in time, we might remove * VFIO_PCI_VGA_REGION_INDEX and let vfio-pci simply define that all indexes * greater than or equal to VFIO_PCI_NUM_REGIONS are device specific and we'd * make a "VGA" device specific type to describe the VGA access space. This * means that non-VGA devices wouldn't need to waste this index, and thus the * address space associated with it due to implementation of device file * descriptor offsets in vfio-pci. * * The current implementation is now part of the user ABI, so we can't use this * for VGA, but there are other upcoming use cases, such as opregions for Intel * IGD devices and framebuffers for vGPU devices. We missed VGA, but we'll * use this for future additions. * * The structure below defines version 1 of this capability. 
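 *
 * For illustration only, userspace typically walks a capability chain such as
 * the ones above as follows; a hedged sketch that assumes "info" points to a
 * buffer already grown (via the argsz protocol) to hold the full chain:
 *
 *	struct vfio_info_cap_header *hdr;
 *	__u32 off = (info->flags & VFIO_REGION_INFO_FLAG_CAPS) ?
 *		    info->cap_offset : 0;
 *
 *	while (off) {
 *		hdr = (struct vfio_info_cap_header *)((char *)info + off);
 *		if (hdr->id == VFIO_REGION_INFO_CAP_SPARSE_MMAP)
 *			handle_sparse(hdr);	// hypothetical helper
 *		off = hdr->next;	// offset 0 terminates the chain
 *	}
 *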
*/ #define VFIO_REGION_INFO_CAP_TYPE 2 struct vfio_region_info_cap_type { struct vfio_info_cap_header header; __u32 type; /* global per bus driver */ __u32 subtype; /* type specific */ }; /* * List of region types, global per bus driver. * If you introduce a new type, please add it here. */ /* PCI region type containing a PCI vendor part */ #define VFIO_REGION_TYPE_PCI_VENDOR_TYPE (1 << 31) #define VFIO_REGION_TYPE_PCI_VENDOR_MASK (0xffff) #define VFIO_REGION_TYPE_GFX (1) #define VFIO_REGION_TYPE_CCW (2) #define VFIO_REGION_TYPE_MIGRATION (3) /* sub-types for VFIO_REGION_TYPE_PCI_* */ /* 8086 vendor PCI sub-types */ #define VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION (1) #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG (2) #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG (3) /* 10de vendor PCI sub-types */ /* * NVIDIA GPU NVlink2 RAM is coherent RAM mapped onto the host address space. * * Deprecated, region no longer provided */ #define VFIO_REGION_SUBTYPE_NVIDIA_NVLINK2_RAM (1) /* 1014 vendor PCI sub-types */ /* * IBM NPU NVlink2 ATSD (Address Translation Shootdown) register of NPU * to do TLB invalidation on a GPU. * * Deprecated, region no longer provided */ #define VFIO_REGION_SUBTYPE_IBM_NVLINK2_ATSD (1) /* sub-types for VFIO_REGION_TYPE_GFX */ #define VFIO_REGION_SUBTYPE_GFX_EDID (1) /** * struct vfio_region_gfx_edid - EDID region layout. * * Set display link state and EDID blob. * * The EDID blob has monitor information such as brand, name, serial * number, physical size, supported video modes and more. * * This special region allows userspace (typically qemu) to set a virtual * EDID for the virtual monitor, which allows a flexible display * configuration. * * For the edid blob spec look here: * https://en.wikipedia.org/wiki/Extended_Display_Identification_Data * * On linux systems you can find the EDID blob in sysfs: * /sys/class/drm/${card}/${connector}/edid * * You can use the edid-decode utility (comes with xorg-x11-utils) to * decode the EDID blob. * * @edid_offset: location of the edid blob, relative to the * start of the region (readonly). * @edid_max_size: max size of the edid blob (readonly). * @edid_size: actual edid size (read/write). * @link_state: display link state (read/write). * VFIO_DEVICE_GFX_LINK_STATE_UP: Monitor is turned on. * VFIO_DEVICE_GFX_LINK_STATE_DOWN: Monitor is turned off. * @max_xres: max display width (0 == no limitation, readonly). * @max_yres: max display height (0 == no limitation, readonly). * * EDID update protocol: * (1) set link-state to down. * (2) update edid blob and size. * (3) set link-state to up. */ struct vfio_region_gfx_edid { __u32 edid_offset; __u32 edid_max_size; __u32 edid_size; __u32 max_xres; __u32 max_yres; __u32 link_state; #define VFIO_DEVICE_GFX_LINK_STATE_UP 1 #define VFIO_DEVICE_GFX_LINK_STATE_DOWN 2 }; /* sub-types for VFIO_REGION_TYPE_CCW */ #define VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD (1) #define VFIO_REGION_SUBTYPE_CCW_SCHIB (2) #define VFIO_REGION_SUBTYPE_CCW_CRW (3) /* sub-types for VFIO_REGION_TYPE_MIGRATION */ #define VFIO_REGION_SUBTYPE_MIGRATION (1) /* * The structure vfio_device_migration_info is placed at the 0th offset of * the VFIO_REGION_SUBTYPE_MIGRATION region to get and set VFIO device related * migration information. Field accesses from this structure are only supported * at their native width and alignment. Otherwise, the result is undefined and * vendor drivers should return an error.
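 *
 * For illustration only, a native-width read-modify-write of device_state
 * through this region might look as follows; a hedged sketch where mig_off
 * is a hypothetical variable holding the migration region's file offset as
 * reported by VFIO_DEVICE_GET_REGION_INFO:
 *
 *	__u32 state;
 *	off_t pos = mig_off +
 *		    offsetof(struct vfio_device_migration_info, device_state);
 *
 *	pread(device_fd, &state, sizeof(state), pos);
 *	state = (state & ~VFIO_DEVICE_STATE_MASK) | VFIO_DEVICE_STATE_RUNNING;
 *	pwrite(device_fd, &state, sizeof(state), pos);
 *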
* * device_state: (read/write) * - The user application writes to this field to inform the vendor driver * about the device state to be transitioned to. * - The vendor driver should take the necessary actions to change the * device state. After successful transition to a given state, the * vendor driver should return success on write(device_state, state) * system call. If the device state transition fails, the vendor driver * should return an appropriate -errno for the fault condition. * - On the user application side, if the device state transition fails, * that is, if write(device_state, state) returns an error, read * device_state again to determine the current state of the device from * the vendor driver. * - The vendor driver should return previous state of the device unless * the vendor driver has encountered an internal error, in which case * the vendor driver may report the device_state VFIO_DEVICE_STATE_ERROR. * - The user application must use the device reset ioctl to recover the * device from VFIO_DEVICE_STATE_ERROR state. If the device is * indicated to be in a valid device state by reading device_state, the * user application may attempt to transition the device to any valid * state reachable from the current state or terminate itself. * * device_state consists of 3 bits: * - If bit 0 is set, it indicates the _RUNNING state. If bit 0 is clear, * it indicates the _STOP state. When the device state is changed to * _STOP, driver should stop the device before write() returns. * - If bit 1 is set, it indicates the _SAVING state, which means that the * driver should start gathering device state information that will be * provided to the VFIO user application to save the device's state. * - If bit 2 is set, it indicates the _RESUMING state, which means that * the driver should prepare to resume the device. Data provided through * the migration region should be used to resume the device. * Bits 3 - 31 are reserved for future use. To preserve them, the user * application should perform a read-modify-write operation on this * field when modifying the specified bits. * * +------- _RESUMING * |+------ _SAVING * ||+----- _RUNNING * ||| * 000b => Device Stopped, not saving or resuming * 001b => Device running, which is the default state * 010b => Stop the device & save the device state, stop-and-copy state * 011b => Device running and save the device state, pre-copy state * 100b => Device stopped and the device state is resuming * 101b => Invalid state * 110b => Error state * 111b => Invalid state * * State transitions: * * _RESUMING _RUNNING Pre-copy Stop-and-copy _STOP * (100b) (001b) (011b) (010b) (000b) * 0. Running or default state * | * * 1. Normal Shutdown (optional) * |------------------------------------->| * * 2. Save the state or suspend * |------------------------->|---------->| * * 3. Save the state during live migration * |----------->|------------>|---------->| * * 4. Resuming * |<---------| * * 5. Resumed * |--------->| * * 0. Default state of VFIO device is _RUNNING when the user application starts. * 1. During normal shutdown of the user application, the user application may * optionally change the VFIO device state from _RUNNING to _STOP. This * transition is optional. The vendor driver must support this transition but * must not require it. * 2. When the user application saves state or suspends the application, the * device state transitions from _RUNNING to stop-and-copy and then to _STOP. 
* On state transition from _RUNNING to stop-and-copy, driver must stop the * device, save the device state and send it to the application through the * migration region. The sequence to be followed for such transition is given * below. * 3. In live migration of user application, the state transitions from _RUNNING * to pre-copy, to stop-and-copy, and to _STOP. * On state transition from _RUNNING to pre-copy, the driver should start * gathering the device state while the application is still running and send * the device state data to application through the migration region. * On state transition from pre-copy to stop-and-copy, the driver must stop * the device, save the device state and send it to the user application * through the migration region. * Vendor drivers must support the pre-copy state even for implementations * where no data is provided to the user before the stop-and-copy state. The * user must not be required to consume all migration data before the device * transitions to a new state, including the stop-and-copy state. * The sequence to be followed for above two transitions is given below. * 4. To start the resuming phase, the device state should be transitioned from * the _RUNNING to the _RESUMING state. * In the _RESUMING state, the driver should use the device state data * received through the migration region to resume the device. * 5. After providing saved device data to the driver, the application should * change the state from _RESUMING to _RUNNING. * * reserved: * Reads on this field return zero and writes are ignored. * * pending_bytes: (read only) * The number of pending bytes still to be migrated from the vendor driver. * * data_offset: (read only) * The user application should read data_offset field from the migration * region. The user application should read the device data from this * offset within the migration region during the _SAVING state or write * the device data during the _RESUMING state. See below for details of * sequence to be followed. * * data_size: (read/write) * The user application should read data_size to get the size in bytes of * the data copied in the migration region during the _SAVING state and * write the size in bytes of the data copied in the migration region * during the _RESUMING state. * * The format of the migration region is as follows: * ------------------------------------------------------------------ * |vfio_device_migration_info| data section | * | | /////////////////////////////// | * ------------------------------------------------------------------ * ^ ^ * offset 0-trapped part data_offset * * The structure vfio_device_migration_info is always followed by the data * section in the region, so data_offset will always be nonzero. The offset * from where the data is copied is decided by the kernel driver. The data * section can be trapped, mmapped, or partitioned, depending on how the kernel * driver defines the data section. The data section partition can be defined * as mapped by the sparse mmap capability. If mmapped, data_offset must be * page aligned, whereas initial section which contains the * vfio_device_migration_info structure, might not end at the offset, which is * page aligned. The user is not required to access through mmap regardless * of the capabilities of the region mmap. * The vendor driver should determine whether and how to partition the data * section. The vendor driver should return data_offset accordingly. 
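 *
 * For illustration only, one saving iteration over this layout might look as
 * follows from userspace; a hedged sketch where mig_off and buf are
 * hypothetical (the migration region's file offset and a user buffer):
 *
 *	__u64 pending, data_off, data_sz;
 *
 *	pread(device_fd, &pending, sizeof(pending), mig_off +
 *	      offsetof(struct vfio_device_migration_info, pending_bytes));
 *	if (pending) {
 *		pread(device_fd, &data_off, sizeof(data_off), mig_off +
 *		      offsetof(struct vfio_device_migration_info, data_offset));
 *		pread(device_fd, &data_sz, sizeof(data_sz), mig_off +
 *		      offsetof(struct vfio_device_migration_info, data_size));
 *		pread(device_fd, buf, data_sz, mig_off + data_off);
 *	}
 *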
* * The sequence to be followed while in pre-copy state and stop-and-copy state * is as follows: * a. Read pending_bytes, indicating the start of a new iteration to get device * data. Repeated read on pending_bytes at this stage should have no side * effects. * If pending_bytes == 0, the user application should not iterate to get data * for that device. * If pending_bytes > 0, perform the following steps. * b. Read data_offset, indicating that the vendor driver should make data * available through the data section. The vendor driver should return this * read operation only after data is available from (region + data_offset) * to (region + data_offset + data_size). * c. Read data_size, which is the amount of data in bytes available through * the migration region. * Read on data_offset and data_size should return the offset and size of * the current buffer if the user application reads data_offset and * data_size more than once here. * d. Read data_size bytes of data from (region + data_offset) from the * migration region. * e. Process the data. * f. Read pending_bytes, which indicates that the data from the previous * iteration has been read. If pending_bytes > 0, go to step b. * * The user application can transition from the _SAVING|_RUNNING * (pre-copy state) to the _SAVING (stop-and-copy) state regardless of the * number of pending bytes. The user application should iterate in _SAVING * (stop-and-copy) until pending_bytes is 0. * * The sequence to be followed while _RESUMING device state is as follows: * While data for this device is available, repeat the following steps: * a. Read data_offset from where the user application should write data. * b. Write migration data starting at the migration region + data_offset for * the length determined by data_size from the migration source. * c. Write data_size, which indicates to the vendor driver that data is * written in the migration region. Vendor driver must return this write * operations on consuming data. Vendor driver should apply the * user-provided migration region data to the device resume state. * * If an error occurs during the above sequences, the vendor driver can return * an error code for next read() or write() operation, which will terminate the * loop. The user application should then take the next necessary action, for * example, failing migration or terminating the user application. * * For the user application, data is opaque. The user application should write * data in the same order as the data is received and the data should be of * same transaction size at the source. */ struct vfio_device_migration_info { __u32 device_state; /* VFIO device state */ #define VFIO_DEVICE_STATE_STOP (0) #define VFIO_DEVICE_STATE_RUNNING (1 << 0) #define VFIO_DEVICE_STATE_SAVING (1 << 1) #define VFIO_DEVICE_STATE_RESUMING (1 << 2) #define VFIO_DEVICE_STATE_MASK (VFIO_DEVICE_STATE_RUNNING | \ VFIO_DEVICE_STATE_SAVING | \ VFIO_DEVICE_STATE_RESUMING) #define VFIO_DEVICE_STATE_VALID(state) \ (state & VFIO_DEVICE_STATE_RESUMING ? 
	(state & VFIO_DEVICE_STATE_MASK) == VFIO_DEVICE_STATE_RESUMING : 1)

#define VFIO_DEVICE_STATE_IS_ERROR(state) \
	((state & VFIO_DEVICE_STATE_MASK) == (VFIO_DEVICE_STATE_SAVING | \
					      VFIO_DEVICE_STATE_RESUMING))

#define VFIO_DEVICE_STATE_SET_ERROR(state) \
	((state & ~VFIO_DEVICE_STATE_MASK) | VFIO_DEVICE_STATE_SAVING | \
					     VFIO_DEVICE_STATE_RESUMING)

	__u32 reserved;
	__u64 pending_bytes;
	__u64 data_offset;
	__u64 data_size;
};

/*
 * The MSIX mappable capability informs that the MSIX data of a BAR can be
 * mmapped, which allows direct access to non-MSIX registers which happen to
 * be within the same system page.
 *
 * Even though the userspace gets direct access to the MSIX data, the existing
 * VFIO_DEVICE_SET_IRQS interface must still be used for MSIX configuration.
 */
#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE	3

/*
 * Capability with compressed real address (aka SSA - small system address)
 * where GPU RAM is mapped on a system bus. Used by a GPU for DMA routing
 * and by the userspace to associate a NVLink bridge with a GPU.
 *
 * Deprecated, capability no longer provided.
 */
#define VFIO_REGION_INFO_CAP_NVLINK2_SSATGT	4

struct vfio_region_info_cap_nvlink2_ssatgt {
	struct vfio_info_cap_header header;
	__u64 tgt;
};

/*
 * Capability with an NVLink link speed. The value is read by
 * the NVlink2 bridge driver from the bridge's "ibm,nvlink-speed"
 * property in the device tree. The value is fixed in the hardware
 * and failing to provide the correct value results in the link
 * not working with no indication from the driver why.
 *
 * Deprecated, capability no longer provided.
 */
#define VFIO_REGION_INFO_CAP_NVLINK2_LNKSPD	5

struct vfio_region_info_cap_nvlink2_lnkspd {
	struct vfio_info_cap_header header;
	__u32 link_speed;
	__u32 __pad;
};

/**
 * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
 *				    struct vfio_irq_info)
 *
 * Retrieve information about a device IRQ. Caller provides
 * struct vfio_irq_info with index value set. Caller sets argsz.
 * Implementation of IRQ mapping is bus driver specific. Indexes
 * using multiple IRQs are primarily intended to support MSI-like
 * interrupt blocks. Zero count irq blocks may be used to describe
 * unimplemented interrupt types.
 *
 * The EVENTFD flag indicates the interrupt index supports eventfd based
 * signaling.
 *
 * The MASKABLE flag indicates the index supports MASK and UNMASK
 * actions described below.
 *
 * AUTOMASKED indicates that after signaling, the interrupt line is
 * automatically masked by VFIO and the user needs to unmask the line
 * to receive new interrupts. This is primarily intended to distinguish
 * level triggered interrupts.
 *
 * The NORESIZE flag indicates that the interrupt lines within the index
 * are set up as a set and new subindexes cannot be enabled without first
 * disabling the entire index. This is used for interrupts like PCI MSI
 * and MSI-X where the driver may only use a subset of the available
 * indexes, but VFIO needs to enable a specific number of vectors
 * upfront. In the case of MSI-X, where the user can enable MSI-X and
 * then add and unmask vectors, it's up to userspace to make the decision
 * whether to allocate the maximum supported number of vectors or tear
 * down setup and incrementally increase the vectors as each is enabled.
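 *
 * Illustrative sketch (not part of this header): querying the MSI-X index
 * of a vfio-pci device fd `device` (a name assumed here) might look like
 *
 *	struct vfio_irq_info info = { .argsz = sizeof(info),
 *				      .index = VFIO_PCI_MSIX_IRQ_INDEX };
 *	ioctl(device, VFIO_DEVICE_GET_IRQ_INFO, &info);
 *
 * after which info.count holds the number of available vectors and
 * info.flags carries the EVENTFD/MASKABLE/AUTOMASKED/NORESIZE bits
 * described above.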
 */
struct vfio_irq_info {
	__u32	argsz;
	__u32	flags;
#define VFIO_IRQ_INFO_EVENTFD		(1 << 0)
#define VFIO_IRQ_INFO_MASKABLE		(1 << 1)
#define VFIO_IRQ_INFO_AUTOMASKED	(1 << 2)
#define VFIO_IRQ_INFO_NORESIZE		(1 << 3)
	__u32	index;		/* IRQ index */
	__u32	count;		/* Number of IRQs within this index */
};
#define VFIO_DEVICE_GET_IRQ_INFO	_IO(VFIO_TYPE, VFIO_BASE + 9)

/**
 * VFIO_DEVICE_SET_IRQS - _IOW(VFIO_TYPE, VFIO_BASE + 10, struct vfio_irq_set)
 *
 * Set signaling, masking, and unmasking of interrupts. Caller provides
 * struct vfio_irq_set with all fields set. 'start' and 'count' indicate
 * the range of subindexes being specified.
 *
 * The DATA flags specify the type of data provided. If DATA_NONE, the
 * operation performs the specified action immediately on the specified
 * interrupt(s). For example, to unmask AUTOMASKED interrupt [0,0]:
 * flags = (DATA_NONE|ACTION_UNMASK), index = 0, start = 0, count = 1.
 *
 * DATA_BOOL allows sparse support for the same on arrays of interrupts.
 * For example, to mask interrupts [0,1] and [0,3] (but not [0,2]):
 * flags = (DATA_BOOL|ACTION_MASK), index = 0, start = 1, count = 3,
 * data = {1,0,1}
 *
 * DATA_EVENTFD binds the specified ACTION to the provided __s32 eventfd.
 * A value of -1 can be used to either de-assign interrupts if already
 * assigned or skip un-assigned interrupts. For example, to set an eventfd
 * to be triggered for interrupts [0,0] and [0,2]:
 * flags = (DATA_EVENTFD|ACTION_TRIGGER), index = 0, start = 0, count = 3,
 * data = {fd1, -1, fd2}
 * If index [0,1] is previously set, two count = 1 ioctl calls would be
 * required to set [0,0] and [0,2] without changing [0,1].
 *
 * Once a signaling mechanism is set, DATA_BOOL or DATA_NONE can be used
 * with ACTION_TRIGGER to perform kernel level interrupt loopback testing
 * from userspace (i.e. simulate hardware triggering).
 *
 * Setting an event triggering mechanism to userspace for ACTION_TRIGGER
 * enables the interrupt index for the device. Individual subindex interrupts
 * can be disabled using the -1 value for DATA_EVENTFD or the index can be
 * disabled as a whole with: flags = (DATA_NONE|ACTION_TRIGGER), count = 0.
 *
 * Note that ACTION_[UN]MASK specify user->kernel signaling (irqfds) while
 * ACTION_TRIGGER specifies kernel->user signaling.
 */
struct vfio_irq_set {
	__u32	argsz;
	__u32	flags;
#define VFIO_IRQ_SET_DATA_NONE		(1 << 0) /* Data not present */
#define VFIO_IRQ_SET_DATA_BOOL		(1 << 1) /* Data is bool (u8) */
#define VFIO_IRQ_SET_DATA_EVENTFD	(1 << 2) /* Data is eventfd (s32) */
#define VFIO_IRQ_SET_ACTION_MASK	(1 << 3) /* Mask interrupt */
#define VFIO_IRQ_SET_ACTION_UNMASK	(1 << 4) /* Unmask interrupt */
#define VFIO_IRQ_SET_ACTION_TRIGGER	(1 << 5) /* Trigger interrupt */
	__u32	index;
	__u32	start;
	__u32	count;
	__u8	data[];
};
#define VFIO_DEVICE_SET_IRQS	_IO(VFIO_TYPE, VFIO_BASE + 10)

#define VFIO_IRQ_SET_DATA_TYPE_MASK	(VFIO_IRQ_SET_DATA_NONE | \
					 VFIO_IRQ_SET_DATA_BOOL | \
					 VFIO_IRQ_SET_DATA_EVENTFD)
#define VFIO_IRQ_SET_ACTION_TYPE_MASK	(VFIO_IRQ_SET_ACTION_MASK | \
					 VFIO_IRQ_SET_ACTION_UNMASK | \
					 VFIO_IRQ_SET_ACTION_TRIGGER)

/**
 * VFIO_DEVICE_RESET - _IO(VFIO_TYPE, VFIO_BASE + 11)
 *
 * Reset a device.
 */
#define VFIO_DEVICE_RESET	_IO(VFIO_TYPE, VFIO_BASE + 11)

/*
 * The VFIO-PCI bus driver makes use of the following fixed region and
 * IRQ index mapping. Unimplemented regions return a size of zero.
 * Unimplemented IRQ types return a count of zero.
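 *
 * Illustrative sketch (not part of this ABI): because the mapping below is
 * fixed, BAR0 of a vfio-pci device fd `device` (a name assumed here) can be
 * read without enumerating regions:
 *
 *	struct vfio_region_info reg = { .argsz = sizeof(reg),
 *					.index = VFIO_PCI_BAR0_REGION_INDEX };
 *	ioctl(device, VFIO_DEVICE_GET_REGION_INFO, &reg);
 *	pread(device, buf, reg.size, reg.offset);
 *
 * where `buf` is assumed to be a user buffer of at least reg.size bytes.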
 */
enum {
	VFIO_PCI_BAR0_REGION_INDEX,
	VFIO_PCI_BAR1_REGION_INDEX,
	VFIO_PCI_BAR2_REGION_INDEX,
	VFIO_PCI_BAR3_REGION_INDEX,
	VFIO_PCI_BAR4_REGION_INDEX,
	VFIO_PCI_BAR5_REGION_INDEX,
	VFIO_PCI_ROM_REGION_INDEX,
	VFIO_PCI_CONFIG_REGION_INDEX,
	/*
	 * Expose VGA regions defined for PCI base class 03, subclass 00.
	 * This includes I/O port ranges 0x3b0 to 0x3bb and 0x3c0 to 0x3df
	 * as well as the MMIO range 0xa0000 to 0xbffff. Each implemented
	 * range is found at its identity mapped offset from the region
	 * offset, for example 0x3b0 is region_info.offset + 0x3b0. Areas
	 * between described ranges are unimplemented.
	 */
	VFIO_PCI_VGA_REGION_INDEX,
	VFIO_PCI_NUM_REGIONS = 9 /* Fixed user ABI, region indexes >=9 use */
				 /* device specific cap to define content. */
};

enum {
	VFIO_PCI_INTX_IRQ_INDEX,
	VFIO_PCI_MSI_IRQ_INDEX,
	VFIO_PCI_MSIX_IRQ_INDEX,
	VFIO_PCI_ERR_IRQ_INDEX,
	VFIO_PCI_REQ_IRQ_INDEX,
	VFIO_PCI_NUM_IRQS
};

/*
 * The vfio-ccw bus driver makes use of the following fixed region and
 * IRQ index mapping. Unimplemented regions return a size of zero.
 * Unimplemented IRQ types return a count of zero.
 */
enum {
	VFIO_CCW_CONFIG_REGION_INDEX,
	VFIO_CCW_NUM_REGIONS
};

enum {
	VFIO_CCW_IO_IRQ_INDEX,
	VFIO_CCW_CRW_IRQ_INDEX,
	VFIO_CCW_REQ_IRQ_INDEX,
	VFIO_CCW_NUM_IRQS
};

/**
 * VFIO_DEVICE_GET_PCI_HOT_RESET_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 12,
 *					      struct vfio_pci_hot_reset_info)
 *
 * Return: 0 on success, -errno on failure:
 *	-ENOSPC = insufficient buffer, -ENODEV = unsupported for device.
 */
struct vfio_pci_dependent_device {
	__u32	group_id;
	__u16	segment;
	__u8	bus;
	__u8	devfn; /* Use PCI_SLOT/PCI_FUNC */
};

struct vfio_pci_hot_reset_info {
	__u32	argsz;
	__u32	flags;
	__u32	count;
	struct vfio_pci_dependent_device	devices[];
};

#define VFIO_DEVICE_GET_PCI_HOT_RESET_INFO	_IO(VFIO_TYPE, VFIO_BASE + 12)

/**
 * VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13,
 *				    struct vfio_pci_hot_reset)
 *
 * Return: 0 on success, -errno on failure.
 */
struct vfio_pci_hot_reset {
	__u32	argsz;
	__u32	flags;
	__u32	count;
	__s32	group_fds[];
};

#define VFIO_DEVICE_PCI_HOT_RESET	_IO(VFIO_TYPE, VFIO_BASE + 13)

/**
 * VFIO_DEVICE_QUERY_GFX_PLANE - _IOW(VFIO_TYPE, VFIO_BASE + 14,
 *				      struct vfio_device_gfx_plane_info)
 *
 * Set the drm_plane_type and flags, then retrieve the gfx plane info.
 *
 * flags supported:
 * - VFIO_GFX_PLANE_TYPE_PROBE and VFIO_GFX_PLANE_TYPE_DMABUF are set
 *   to ask if the mdev supports dma-buf. 0 on support, -EINVAL on no
 *   support for dma-buf.
 * - VFIO_GFX_PLANE_TYPE_PROBE and VFIO_GFX_PLANE_TYPE_REGION are set
 *   to ask if the mdev supports region. 0 on support, -EINVAL on no
 *   support for region.
 * - VFIO_GFX_PLANE_TYPE_DMABUF or VFIO_GFX_PLANE_TYPE_REGION is set
 *   with each call to query the plane info.
 * - Others are invalid and return -EINVAL.
 *
 * Note:
 * 1. Plane could be disabled by guest. In that case, success will be
 *    returned with zero-initialized drm_format, size, width and height
 *    fields.
 * 2. x_hot/y_hot is set to 0xFFFFFFFF if no hotspot information is
 *    available.
 *
 * Return: 0 on success, -errno on other failure.
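 *
 * Illustrative sketch (not part of this ABI): probing for dma-buf plane
 * support on a device fd `device` (a name assumed here) could look like
 *
 *	struct vfio_device_gfx_plane_info plane = {
 *		.argsz = sizeof(plane),
 *		.flags = VFIO_GFX_PLANE_TYPE_PROBE | VFIO_GFX_PLANE_TYPE_DMABUF,
 *	};
 *	int ret = ioctl(device, VFIO_DEVICE_QUERY_GFX_PLANE, &plane);
 *
 * with ret == 0 indicating dma-buf support and a -EINVAL failure indicating
 * no support, as described above.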
 */
struct vfio_device_gfx_plane_info {
	__u32 argsz;
	__u32 flags;
#define VFIO_GFX_PLANE_TYPE_PROBE	(1 << 0)
#define VFIO_GFX_PLANE_TYPE_DMABUF	(1 << 1)
#define VFIO_GFX_PLANE_TYPE_REGION	(1 << 2)
	/* in */
	__u32 drm_plane_type;	/* type of plane: DRM_PLANE_TYPE_* */
	/* out */
	__u32 drm_format;	/* drm format of plane */
	__u64 drm_format_mod;	/* tiled mode */
	__u32 width;		/* width of plane */
	__u32 height;		/* height of plane */
	__u32 stride;		/* stride of plane */
	__u32 size;		/* size of plane in bytes, align on page */
	__u32 x_pos;		/* horizontal position of cursor plane */
	__u32 y_pos;		/* vertical position of cursor plane */
	__u32 x_hot;		/* horizontal position of cursor hotspot */
	__u32 y_hot;		/* vertical position of cursor hotspot */
	union {
		__u32 region_index;	/* region index */
		__u32 dmabuf_id;	/* dma-buf id */
	};
};

#define VFIO_DEVICE_QUERY_GFX_PLANE	_IO(VFIO_TYPE, VFIO_BASE + 14)

/**
 * VFIO_DEVICE_GET_GFX_DMABUF - _IOW(VFIO_TYPE, VFIO_BASE + 15, __u32)
 *
 * Return a new dma-buf file descriptor for an exposed guest framebuffer
 * described by the provided dmabuf_id. The dmabuf_id is returned from
 * VFIO_DEVICE_QUERY_GFX_PLANE as a token of the exposed guest framebuffer.
 */
#define VFIO_DEVICE_GET_GFX_DMABUF	_IO(VFIO_TYPE, VFIO_BASE + 15)

/**
 * VFIO_DEVICE_IOEVENTFD - _IOW(VFIO_TYPE, VFIO_BASE + 16,
 *			       struct vfio_device_ioeventfd)
 *
 * Perform a write to the device at the specified device fd offset, with
 * the specified data and width when the provided eventfd is triggered.
 * vfio bus drivers may not support this for all regions, for all widths,
 * or at all. vfio-pci currently only enables support for BAR regions,
 * excluding the MSI-X vector table.
 *
 * Return: 0 on success, -errno on failure.
 */
struct vfio_device_ioeventfd {
	__u32	argsz;
	__u32	flags;
#define VFIO_DEVICE_IOEVENTFD_8		(1 << 0) /* 1-byte write */
#define VFIO_DEVICE_IOEVENTFD_16	(1 << 1) /* 2-byte write */
#define VFIO_DEVICE_IOEVENTFD_32	(1 << 2) /* 4-byte write */
#define VFIO_DEVICE_IOEVENTFD_64	(1 << 3) /* 8-byte write */
#define VFIO_DEVICE_IOEVENTFD_SIZE_MASK	(0xf)
	__u64	offset;		/* device fd offset of write */
	__u64	data;		/* data to be written */
	__s32	fd;		/* -1 for de-assignment */
};
#define VFIO_DEVICE_IOEVENTFD	_IO(VFIO_TYPE, VFIO_BASE + 16)

/**
 * VFIO_DEVICE_FEATURE - _IOWR(VFIO_TYPE, VFIO_BASE + 17,
 *			       struct vfio_device_feature)
 *
 * Get, set, or probe feature data of the device. The feature is selected
 * using the FEATURE_MASK portion of the flags field. Support for a feature
 * can be probed by setting both the FEATURE_MASK and PROBE bits. A probe
 * may optionally include the GET and/or SET bits to determine read vs write
 * access of the feature respectively. Probing a feature will return success
 * if the feature is supported and all of the optionally indicated GET/SET
 * methods are supported. The format of the data portion of the structure is
 * specific to the given feature. The data portion is not required for
 * probing. GET and SET are mutually exclusive, except for use with PROBE.
 *
 * Return: 0 on success, -errno on failure.
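 *
 * Illustrative sketch (not part of this ABI; `device` and `token_uuid` are
 * names assumed here): probing the PCI VF token feature defined below
 * before setting it might look like
 *
 *	__u8 buf[sizeof(struct vfio_device_feature) + 16];
 *	struct vfio_device_feature *f = (void *)buf;
 *	f->argsz = sizeof(buf);
 *	f->flags = VFIO_DEVICE_FEATURE_PCI_VF_TOKEN |
 *		   VFIO_DEVICE_FEATURE_SET | VFIO_DEVICE_FEATURE_PROBE;
 *	if (!ioctl(device, VFIO_DEVICE_FEATURE, f)) {
 *		f->flags &= ~VFIO_DEVICE_FEATURE_PROBE;
 *		memcpy(f->data, token_uuid, 16);
 *		ioctl(device, VFIO_DEVICE_FEATURE, f);
 *	}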
 */
struct vfio_device_feature {
	__u32	argsz;
	__u32	flags;
#define VFIO_DEVICE_FEATURE_MASK	(0xffff) /* 16-bit feature index */
#define VFIO_DEVICE_FEATURE_GET		(1 << 16) /* Get feature into data[] */
#define VFIO_DEVICE_FEATURE_SET		(1 << 17) /* Set feature from data[] */
#define VFIO_DEVICE_FEATURE_PROBE	(1 << 18) /* Probe feature support */
	__u8	data[];
};
#define VFIO_DEVICE_FEATURE		_IO(VFIO_TYPE, VFIO_BASE + 17)

/*
 * Provide support for setting a PCI VF Token, which is used as a shared
 * secret between PF and VF drivers. This feature may only be set on a
 * PCI SR-IOV PF when SR-IOV is enabled on the PF and there are no existing
 * open VFs. Data provided when setting this feature is a 16-byte array
 * (__u8 b[16]), representing a UUID.
 */
#define VFIO_DEVICE_FEATURE_PCI_VF_TOKEN	(0)

/* -------- API for Type1 VFIO IOMMU -------- */

/**
 * VFIO_IOMMU_GET_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 12, struct vfio_iommu_info)
 *
 * Retrieve information about the IOMMU object. Fills in provided
 * struct vfio_iommu_info. Caller sets argsz.
 *
 * XXX Should we do these by CHECK_EXTENSION too?
 */
struct vfio_iommu_type1_info {
	__u32	argsz;
	__u32	flags;
#define VFIO_IOMMU_INFO_PGSIZES (1 << 0)	/* supported page sizes info */
#define VFIO_IOMMU_INFO_CAPS	(1 << 1)	/* Info supports caps */
	__u64	iova_pgsizes;		/* Bitmap of supported page sizes */
	__u32	cap_offset;	/* Offset within info struct of first cap */
};

/*
 * The IOVA capability allows reporting the valid IOVA range(s),
 * excluding any non-relaxable reserved regions exposed by
 * devices attached to the container. Any DMA map attempt
 * outside the valid iova range will return an error.
 *
 * The structures below define version 1 of this capability.
 */
#define VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE	1

struct vfio_iova_range {
	__u64	start;
	__u64	end;
};

struct vfio_iommu_type1_info_cap_iova_range {
	struct	vfio_info_cap_header header;
	__u32	nr_iovas;
	__u32	reserved;
	struct	vfio_iova_range iova_ranges[];
};

/*
 * The migration capability allows reporting supported features for migration.
 *
 * The structures below define version 1 of this capability.
 *
 * The existence of this capability indicates that the IOMMU kernel driver
 * supports dirty page logging.
 *
 * pgsize_bitmap: Kernel driver returns a bitmap of supported page sizes for
 * dirty page logging.
 * max_dirty_bitmap_size: Kernel driver returns the maximum supported dirty
 * bitmap size in bytes that can be used by user applications when getting the
 * dirty bitmap.
 */
#define VFIO_IOMMU_TYPE1_INFO_CAP_MIGRATION	2

struct vfio_iommu_type1_info_cap_migration {
	struct	vfio_info_cap_header header;
	__u32	flags;
	__u64	pgsize_bitmap;
	__u64	max_dirty_bitmap_size;		/* in bytes */
};

/*
 * The DMA available capability allows reporting the current number of
 * simultaneously outstanding DMA mappings that are allowed.
 *
 * The structure below defines version 1 of this capability.
 *
 * avail: specifies the current number of outstanding DMA mappings allowed.
 */
#define VFIO_IOMMU_TYPE1_INFO_DMA_AVAIL	3

struct vfio_iommu_type1_info_dma_avail {
	struct	vfio_info_cap_header header;
	__u32	avail;
};

#define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)

/**
 * VFIO_IOMMU_MAP_DMA - _IOW(VFIO_TYPE, VFIO_BASE + 13, struct vfio_dma_map)
 *
 * Map process virtual addresses to IO virtual addresses using the
 * provided struct vfio_dma_map. Caller sets argsz. READ &/ WRITE required.
 *
 * If flags & VFIO_DMA_MAP_FLAG_VADDR, update the base vaddr for iova, and
 * unblock translation of host virtual addresses in the iova range. The vaddr
 * must have previously been invalidated with VFIO_DMA_UNMAP_FLAG_VADDR. To
 * maintain memory consistency within the user application, the updated vaddr
 * must address the same memory object as originally mapped. Failure to do so
 * will result in user memory corruption and/or device misbehavior. iova and
 * size must match those in the original MAP_DMA call. Protection is not
 * changed, and the READ & WRITE flags must be 0.
 */
struct vfio_iommu_type1_dma_map {
	__u32	argsz;
	__u32	flags;
#define VFIO_DMA_MAP_FLAG_READ (1 << 0)		/* readable from device */
#define VFIO_DMA_MAP_FLAG_WRITE (1 << 1)	/* writable from device */
#define VFIO_DMA_MAP_FLAG_VADDR (1 << 2)
	__u64	vaddr;				/* Process virtual address */
	__u64	iova;				/* IO virtual address */
	__u64	size;				/* Size of mapping (bytes) */
};

#define VFIO_IOMMU_MAP_DMA _IO(VFIO_TYPE, VFIO_BASE + 13)

struct vfio_bitmap {
	__u64	pgsize;		/* page size for bitmap in bytes */
	__u64	size;		/* in bytes */
	__u64 __user *data;	/* one bit per page */
};

/**
 * VFIO_IOMMU_UNMAP_DMA - _IOWR(VFIO_TYPE, VFIO_BASE + 14,
 *				struct vfio_dma_unmap)
 *
 * Unmap IO virtual addresses using the provided struct vfio_dma_unmap.
 * Caller sets argsz. The actual unmapped size is returned in the size
 * field. No guarantee is made to the user that arbitrary unmaps of iova
 * or size different from those used in the original mapping call will
 * succeed.
 *
 * VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP should be set to get the dirty bitmap
 * before unmapping IO virtual addresses. When this flag is set, the user must
 * provide a struct vfio_bitmap in data[]. The user must provide zeroed memory
 * via vfio_bitmap.data and its size in the vfio_bitmap.size field. Each bit in
 * the bitmap represents one page, of the user-provided page size in the
 * vfio_bitmap.pgsize field, consecutively starting from the iova offset. A set
 * bit indicates that the page at that offset from iova is dirty. A bitmap of
 * the pages in the range of the unmapped size is returned in the user-provided
 * vfio_bitmap.data.
 *
 * If flags & VFIO_DMA_UNMAP_FLAG_ALL, unmap all addresses. iova and size
 * must be 0. This cannot be combined with the get-dirty-bitmap flag.
 *
 * If flags & VFIO_DMA_UNMAP_FLAG_VADDR, do not unmap, but invalidate host
 * virtual addresses in the iova range. Tasks that attempt to translate an
 * iova's vaddr will block. DMA to already-mapped pages continues. This
 * cannot be combined with the get-dirty-bitmap flag.
 */
struct vfio_iommu_type1_dma_unmap {
	__u32	argsz;
	__u32	flags;
#define VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP (1 << 0)
#define VFIO_DMA_UNMAP_FLAG_ALL		     (1 << 1)
#define VFIO_DMA_UNMAP_FLAG_VADDR	     (1 << 2)
	__u64	iova;				/* IO virtual address */
	__u64	size;				/* Size of mapping (bytes) */
	__u8    data[];
};

#define VFIO_IOMMU_UNMAP_DMA _IO(VFIO_TYPE, VFIO_BASE + 14)

/*
 * IOCTLs to enable/disable IOMMU container usage.
 * No parameters are supported.
 */
#define VFIO_IOMMU_ENABLE	_IO(VFIO_TYPE, VFIO_BASE + 15)
#define VFIO_IOMMU_DISABLE	_IO(VFIO_TYPE, VFIO_BASE + 16)

/**
 * VFIO_IOMMU_DIRTY_PAGES - _IOWR(VFIO_TYPE, VFIO_BASE + 17,
 *				  struct vfio_iommu_type1_dirty_bitmap)
 * IOCTL is used for dirty pages logging.
 * The caller should set the flag depending on which operation to perform,
 * details as below:
 *
 * Calling the IOCTL with the VFIO_IOMMU_DIRTY_PAGES_FLAG_START flag set
 * instructs the IOMMU driver to log pages that are dirtied or potentially
 * dirtied by the device; designed to be used when a migration is in progress.
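 *
 * Illustrative sketch (not part of this ABI): starting logging on a
 * container fd `container` (a name assumed here) needs only the header
 * portion:
 *
 *	struct vfio_iommu_type1_dirty_bitmap db = {
 *		.argsz = sizeof(db),
 *		.flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_START,
 *	};
 *	ioctl(container, VFIO_IOMMU_DIRTY_PAGES, &db);
 *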
 * Dirty pages are logged until logging is disabled by the user application by
 * calling the IOCTL with the VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP flag.
 *
 * Calling the IOCTL with the VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP flag set
 * instructs the IOMMU driver to stop logging dirtied pages.
 *
 * Calling the IOCTL with the VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP flag set
 * returns the dirty pages bitmap for the IOMMU container for a given IOVA
 * range. The user must specify the IOVA range and the pgsize through the
 * structure vfio_iommu_type1_dirty_bitmap_get in the data[] portion. This
 * interface supports getting a bitmap of the smallest supported pgsize only
 * and can be modified in the future to get a bitmap of any specified
 * supported pgsize. The user must provide a zeroed memory area for the bitmap
 * memory and specify its size in bitmap.size. One bit is used to represent
 * one page consecutively starting from the iova offset. The user should
 * provide the page size in the bitmap.pgsize field. A bit set in the bitmap
 * indicates that the page at that offset from iova is dirty. The caller must
 * set argsz to a value including the size of structure
 * vfio_iommu_type1_dirty_bitmap_get, but excluding the size of the actual
 * bitmap. If dirty pages logging is not enabled, an error will be returned.
 *
 * Only one of the flags _START, _STOP and _GET may be specified at a time.
 *
 */
struct vfio_iommu_type1_dirty_bitmap {
	__u32        argsz;
	__u32        flags;
#define VFIO_IOMMU_DIRTY_PAGES_FLAG_START	(1 << 0)
#define VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP	(1 << 1)
#define VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP	(1 << 2)
	__u8         data[];
};

struct vfio_iommu_type1_dirty_bitmap_get {
	__u64              iova;	/* IO virtual address */
	__u64              size;	/* Size of iova range */
	struct vfio_bitmap bitmap;
};

#define VFIO_IOMMU_DIRTY_PAGES             _IO(VFIO_TYPE, VFIO_BASE + 17)

/* -------- Additional API for SPAPR TCE (Server POWERPC) IOMMU -------- */

/*
 * The SPAPR TCE DDW info struct provides information about
 * the details of the Dynamic DMA window capability.
 *
 * @pgsizes contains a page size bitmask, 4K/64K/16M are supported.
 * @max_dynamic_windows_supported tells the maximum number of windows
 * which the platform can create.
 * @levels tells the maximum number of levels in multi-level IOMMU tables;
 * this allows splitting a table into smaller chunks which reduces
 * the amount of physically contiguous memory required for the table.
 */
struct vfio_iommu_spapr_tce_ddw_info {
	__u64 pgsizes;			/* Bitmap of supported page sizes */
	__u32 max_dynamic_windows_supported;
	__u32 levels;
};

/*
 * The SPAPR TCE info struct provides information about the PCI bus
 * address ranges available for DMA; these values are programmed into
 * the hardware so the guest has to know that information.
 *
 * The DMA 32 bit window start is an absolute PCI bus address.
 * The IOVA addresses passed via map/unmap ioctls are absolute PCI bus
 * addresses too, so the window works as a filter rather than an offset
 * for IOVA addresses.
 *
 * Flags supported:
 * - VFIO_IOMMU_SPAPR_INFO_DDW: informs the userspace that dynamic DMA windows
 *   (DDW) support is present. @ddw is only supported when DDW is present.
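 *
 * Illustrative sketch (not part of this ABI): userspace should test this
 * flag before trusting the @ddw fields, e.g.
 *
 *	if (info.flags & VFIO_IOMMU_SPAPR_INFO_DDW)
 *		windows = info.ddw.max_dynamic_windows_supported;
 *
 * where `info` is a struct vfio_iommu_spapr_tce_info (defined below) filled
 * by VFIO_IOMMU_SPAPR_TCE_GET_INFO, and `windows` is a name assumed here.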
*/ struct vfio_iommu_spapr_tce_info { __u32 argsz; __u32 flags; #define VFIO_IOMMU_SPAPR_INFO_DDW (1 << 0) /* DDW supported */ __u32 dma32_window_start; /* 32 bit window start (bytes) */ __u32 dma32_window_size; /* 32 bit window size (bytes) */ struct vfio_iommu_spapr_tce_ddw_info ddw; }; #define VFIO_IOMMU_SPAPR_TCE_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12) /* * EEH PE operation struct provides ways to: * - enable/disable EEH functionality; * - unfreeze IO/DMA for frozen PE; * - read PE state; * - reset PE; * - configure PE; * - inject EEH error. */ struct vfio_eeh_pe_err { __u32 type; __u32 func; __u64 addr; __u64 mask; }; struct vfio_eeh_pe_op { __u32 argsz; __u32 flags; __u32 op; union { struct vfio_eeh_pe_err err; }; }; #define VFIO_EEH_PE_DISABLE 0 /* Disable EEH functionality */ #define VFIO_EEH_PE_ENABLE 1 /* Enable EEH functionality */ #define VFIO_EEH_PE_UNFREEZE_IO 2 /* Enable IO for frozen PE */ #define VFIO_EEH_PE_UNFREEZE_DMA 3 /* Enable DMA for frozen PE */ #define VFIO_EEH_PE_GET_STATE 4 /* PE state retrieval */ #define VFIO_EEH_PE_STATE_NORMAL 0 /* PE in functional state */ #define VFIO_EEH_PE_STATE_RESET 1 /* PE reset in progress */ #define VFIO_EEH_PE_STATE_STOPPED 2 /* Stopped DMA and IO */ #define VFIO_EEH_PE_STATE_STOPPED_DMA 4 /* Stopped DMA only */ #define VFIO_EEH_PE_STATE_UNAVAIL 5 /* State unavailable */ #define VFIO_EEH_PE_RESET_DEACTIVATE 5 /* Deassert PE reset */ #define VFIO_EEH_PE_RESET_HOT 6 /* Assert hot reset */ #define VFIO_EEH_PE_RESET_FUNDAMENTAL 7 /* Assert fundamental reset */ #define VFIO_EEH_PE_CONFIGURE 8 /* PE configuration */ #define VFIO_EEH_PE_INJECT_ERR 9 /* Inject EEH error */ #define VFIO_EEH_PE_OP _IO(VFIO_TYPE, VFIO_BASE + 21) /** * VFIO_IOMMU_SPAPR_REGISTER_MEMORY - _IOW(VFIO_TYPE, VFIO_BASE + 17, struct vfio_iommu_spapr_register_memory) * * Registers user space memory where DMA is allowed. It pins * user pages and does the locked memory accounting so * subsequent VFIO_IOMMU_MAP_DMA/VFIO_IOMMU_UNMAP_DMA calls * get faster. */ struct vfio_iommu_spapr_register_memory { __u32 argsz; __u32 flags; __u64 vaddr; /* Process virtual address */ __u64 size; /* Size of mapping (bytes) */ }; #define VFIO_IOMMU_SPAPR_REGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 17) /** * VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY - _IOW(VFIO_TYPE, VFIO_BASE + 18, struct vfio_iommu_spapr_register_memory) * * Unregisters user space memory registered with * VFIO_IOMMU_SPAPR_REGISTER_MEMORY. * Uses vfio_iommu_spapr_register_memory for parameters. */ #define VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 18) /** * VFIO_IOMMU_SPAPR_TCE_CREATE - _IOWR(VFIO_TYPE, VFIO_BASE + 19, struct vfio_iommu_spapr_tce_create) * * Creates an additional TCE table and programs it (sets a new DMA window) * to every IOMMU group in the container. It receives page shift, window * size and number of levels in the TCE table being created. * * It allocates and returns an offset on a PCI bus of the new DMA window. */ struct vfio_iommu_spapr_tce_create { __u32 argsz; __u32 flags; /* in */ __u32 page_shift; __u32 __resv1; __u64 window_size; __u32 levels; __u32 __resv2; /* out */ __u64 start_addr; }; #define VFIO_IOMMU_SPAPR_TCE_CREATE _IO(VFIO_TYPE, VFIO_BASE + 19) /** * VFIO_IOMMU_SPAPR_TCE_REMOVE - _IOW(VFIO_TYPE, VFIO_BASE + 20, struct vfio_iommu_spapr_tce_remove) * * Unprograms a TCE table from all groups in the container and destroys it. * It receives a PCI bus offset as a window id. 
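 *
 * Illustrative sketch (not part of this ABI): the window id expected here is
 * the start_addr returned by VFIO_IOMMU_SPAPR_TCE_CREATE above, so a
 * create/remove pair on a container fd `container` (a name assumed here,
 * with assumed window parameters) might look like
 *
 *	struct vfio_iommu_spapr_tce_create c = { .argsz = sizeof(c),
 *		.page_shift = 16, .window_size = 1ULL << 32, .levels = 1 };
 *	ioctl(container, VFIO_IOMMU_SPAPR_TCE_CREATE, &c);
 *	struct vfio_iommu_spapr_tce_remove r = { .argsz = sizeof(r),
 *						 .start_addr = c.start_addr };
 *	ioctl(container, VFIO_IOMMU_SPAPR_TCE_REMOVE, &r);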
*/ struct vfio_iommu_spapr_tce_remove { __u32 argsz; __u32 flags; /* in */ __u64 start_addr; }; #define VFIO_IOMMU_SPAPR_TCE_REMOVE _IO(VFIO_TYPE, VFIO_BASE + 20) /* ***************************************************************** */ #endif /* _UAPIVFIO_H */ rdma-core-56.1/kernel-headers/rdma/000077500000000000000000000000001477342711600171465ustar00rootroot00000000000000rdma-core-56.1/kernel-headers/rdma/bnxt_re-abi.h000066400000000000000000000130601477342711600215110ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) */ /* * Broadcom NetXtreme-E RoCE driver. * * Copyright (c) 2016 - 2017, Broadcom. All rights reserved. The term * Broadcom refers to Broadcom Limited and/or its subsidiaries. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Description: Uverbs ABI header file */ #ifndef __BNXT_RE_UVERBS_ABI_H__ #define __BNXT_RE_UVERBS_ABI_H__ #include #include #define BNXT_RE_ABI_VERSION 1 #define BNXT_RE_CHIP_ID0_CHIP_NUM_SFT 0x00 #define BNXT_RE_CHIP_ID0_CHIP_REV_SFT 0x10 #define BNXT_RE_CHIP_ID0_CHIP_MET_SFT 0x18 enum { BNXT_RE_UCNTX_CMASK_HAVE_CCTX = 0x1ULL, BNXT_RE_UCNTX_CMASK_HAVE_MODE = 0x02ULL, BNXT_RE_UCNTX_CMASK_WC_DPI_ENABLED = 0x04ULL, BNXT_RE_UCNTX_CMASK_DBR_PACING_ENABLED = 0x08ULL, BNXT_RE_UCNTX_CMASK_POW2_DISABLED = 0x10ULL, BNXT_RE_UCNTX_CMASK_MSN_TABLE_ENABLED = 0x40, }; enum bnxt_re_wqe_mode { BNXT_QPLIB_WQE_MODE_STATIC = 0x00, BNXT_QPLIB_WQE_MODE_VARIABLE = 0x01, BNXT_QPLIB_WQE_MODE_INVALID = 0x02, }; enum { BNXT_RE_COMP_MASK_REQ_UCNTX_POW2_SUPPORT = 0x01, BNXT_RE_COMP_MASK_REQ_UCNTX_VAR_WQE_SUPPORT = 0x02, }; struct bnxt_re_uctx_req { __aligned_u64 comp_mask; }; struct bnxt_re_uctx_resp { __u32 dev_id; __u32 max_qp; __u32 pg_size; __u32 cqe_sz; __u32 max_cqd; __u32 rsvd; __aligned_u64 comp_mask; __u32 chip_id0; __u32 chip_id1; __u32 mode; __u32 rsvd1; /* padding */ }; /* * This struct is placed after the ib_uverbs_alloc_pd_resp struct, which is * not 8 byted aligned. 
To avoid undesired padding in various cases we have to * set this struct to packed. */ struct bnxt_re_pd_resp { __u32 pdid; __u32 dpi; __u64 dbr; } __attribute__((packed, aligned(4))); struct bnxt_re_cq_req { __aligned_u64 cq_va; __aligned_u64 cq_handle; }; enum bnxt_re_cq_mask { BNXT_RE_CQ_TOGGLE_PAGE_SUPPORT = 0x1, }; struct bnxt_re_cq_resp { __u32 cqid; __u32 tail; __u32 phase; __u32 rsvd; __aligned_u64 comp_mask; }; struct bnxt_re_resize_cq_req { __aligned_u64 cq_va; }; enum bnxt_re_qp_mask { BNXT_RE_QP_REQ_MASK_VAR_WQE_SQ_SLOTS = 0x1, }; struct bnxt_re_qp_req { __aligned_u64 qpsva; __aligned_u64 qprva; __aligned_u64 qp_handle; __aligned_u64 comp_mask; __u32 sq_slots; }; struct bnxt_re_qp_resp { __u32 qpid; __u32 rsvd; }; struct bnxt_re_srq_req { __aligned_u64 srqva; __aligned_u64 srq_handle; }; enum bnxt_re_srq_mask { BNXT_RE_SRQ_TOGGLE_PAGE_SUPPORT = 0x1, }; struct bnxt_re_srq_resp { __u32 srqid; __u32 rsvd; /* padding */ __aligned_u64 comp_mask; }; enum bnxt_re_shpg_offt { BNXT_RE_BEG_RESV_OFFT = 0x00, BNXT_RE_AVID_OFFT = 0x10, BNXT_RE_AVID_SIZE = 0x04, BNXT_RE_END_RESV_OFFT = 0xFF0 }; enum bnxt_re_objects { BNXT_RE_OBJECT_ALLOC_PAGE = (1U << UVERBS_ID_NS_SHIFT), BNXT_RE_OBJECT_NOTIFY_DRV, BNXT_RE_OBJECT_GET_TOGGLE_MEM, }; enum bnxt_re_alloc_page_type { BNXT_RE_ALLOC_WC_PAGE = 0, BNXT_RE_ALLOC_DBR_BAR_PAGE, BNXT_RE_ALLOC_DBR_PAGE, }; enum bnxt_re_var_alloc_page_attrs { BNXT_RE_ALLOC_PAGE_HANDLE = (1U << UVERBS_ID_NS_SHIFT), BNXT_RE_ALLOC_PAGE_TYPE, BNXT_RE_ALLOC_PAGE_DPI, BNXT_RE_ALLOC_PAGE_MMAP_OFFSET, BNXT_RE_ALLOC_PAGE_MMAP_LENGTH, }; enum bnxt_re_alloc_page_attrs { BNXT_RE_DESTROY_PAGE_HANDLE = (1U << UVERBS_ID_NS_SHIFT), }; enum bnxt_re_alloc_page_methods { BNXT_RE_METHOD_ALLOC_PAGE = (1U << UVERBS_ID_NS_SHIFT), BNXT_RE_METHOD_DESTROY_PAGE, }; enum bnxt_re_notify_drv_methods { BNXT_RE_METHOD_NOTIFY_DRV = (1U << UVERBS_ID_NS_SHIFT), }; /* Toggle mem */ enum bnxt_re_get_toggle_mem_type { BNXT_RE_CQ_TOGGLE_MEM = 0, BNXT_RE_SRQ_TOGGLE_MEM, }; enum bnxt_re_var_toggle_mem_attrs { BNXT_RE_TOGGLE_MEM_HANDLE = (1U << UVERBS_ID_NS_SHIFT), BNXT_RE_TOGGLE_MEM_TYPE, BNXT_RE_TOGGLE_MEM_RES_ID, BNXT_RE_TOGGLE_MEM_MMAP_PAGE, BNXT_RE_TOGGLE_MEM_MMAP_OFFSET, BNXT_RE_TOGGLE_MEM_MMAP_LENGTH, }; enum bnxt_re_toggle_mem_attrs { BNXT_RE_RELEASE_TOGGLE_MEM_HANDLE = (1U << UVERBS_ID_NS_SHIFT), }; enum bnxt_re_toggle_mem_methods { BNXT_RE_METHOD_GET_TOGGLE_MEM = (1U << UVERBS_ID_NS_SHIFT), BNXT_RE_METHOD_RELEASE_TOGGLE_MEM, }; #endif /* __BNXT_RE_UVERBS_ABI_H__*/ rdma-core-56.1/kernel-headers/rdma/cxgb4-abi.h000066400000000000000000000060621477342711600210630ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */ /* * Copyright (c) 2009-2010 Chelsio, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef CXGB4_ABI_USER_H #define CXGB4_ABI_USER_H #include #define C4IW_UVERBS_ABI_VERSION 3 /* * Make sure that all structs defined in this file remain laid out so * that they pack the same way on 32-bit and 64-bit architectures (to * avoid incompatibility between 32-bit userspace and 64-bit kernels). * In particular do not use pointer types -- pass pointers in __aligned_u64 * instead. */ enum { C4IW_64B_CQE = (1 << 0) }; struct c4iw_create_cq { __u32 flags; __u32 reserved; }; struct c4iw_create_cq_resp { __aligned_u64 key; __aligned_u64 gts_key; __aligned_u64 memsize; __u32 cqid; __u32 size; __u32 qid_mask; __u32 flags; }; enum { C4IW_QPF_ONCHIP = (1 << 0), C4IW_QPF_WRITE_W_IMM = (1 << 1) }; struct c4iw_create_qp_resp { __aligned_u64 ma_sync_key; __aligned_u64 sq_key; __aligned_u64 rq_key; __aligned_u64 sq_db_gts_key; __aligned_u64 rq_db_gts_key; __aligned_u64 sq_memsize; __aligned_u64 rq_memsize; __u32 sqid; __u32 rqid; __u32 sq_size; __u32 rq_size; __u32 qid_mask; __u32 flags; }; struct c4iw_create_srq_resp { __aligned_u64 srq_key; __aligned_u64 srq_db_gts_key; __aligned_u64 srq_memsize; __u32 srqid; __u32 srq_size; __u32 rqt_abs_idx; __u32 qid_mask; __u32 flags; __u32 reserved; /* explicit padding */ }; enum { /* HW supports SRQ_LIMIT_REACHED event */ T4_SRQ_LIMIT_SUPPORT = 1 << 0, }; struct c4iw_alloc_ucontext_resp { __aligned_u64 status_page_key; __u32 status_page_size; __u32 reserved; /* explicit padding (optional for i386) */ }; struct c4iw_alloc_pd_resp { __u32 pdid; }; #endif /* CXGB4_ABI_USER_H */ rdma-core-56.1/kernel-headers/rdma/efa-abi.h000066400000000000000000000071101477342711600206020ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) */ /* * Copyright 2018-2024 Amazon.com, Inc. or its affiliates. All rights reserved. */ #ifndef EFA_ABI_USER_H #define EFA_ABI_USER_H #include #include /* * Increment this value if any changes that break userspace ABI * compatibility are made. */ #define EFA_UVERBS_ABI_VERSION 1 /* * Keep structs aligned to 8 bytes. * Keep reserved fields as arrays of __u8 named reserved_XXX where XXX is the * hex bit offset of the field. 
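 *
 * For example (illustrative): in struct efa_ibv_alloc_ucontext_cmd below,
 * comp_mask occupies bits [0x0, 0x20), so the 4-byte pad that follows it
 * starts at bit offset 0x20 and is therefore named reserved_20.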
*/ enum { EFA_ALLOC_UCONTEXT_CMD_COMP_TX_BATCH = 1 << 0, EFA_ALLOC_UCONTEXT_CMD_COMP_MIN_SQ_WR = 1 << 1, }; struct efa_ibv_alloc_ucontext_cmd { __u32 comp_mask; __u8 reserved_20[4]; }; enum efa_ibv_user_cmds_supp_udata { EFA_USER_CMDS_SUPP_UDATA_QUERY_DEVICE = 1 << 0, EFA_USER_CMDS_SUPP_UDATA_CREATE_AH = 1 << 1, }; struct efa_ibv_alloc_ucontext_resp { __u32 comp_mask; __u32 cmds_supp_udata_mask; __u16 sub_cqs_per_cq; __u16 inline_buf_size; __u32 max_llq_size; /* bytes */ __u16 max_tx_batch; /* units of 64 bytes */ __u16 min_sq_wr; __u8 reserved_a0[4]; }; struct efa_ibv_alloc_pd_resp { __u32 comp_mask; __u16 pdn; __u8 reserved_30[2]; }; enum { EFA_CREATE_CQ_WITH_COMPLETION_CHANNEL = 1 << 0, EFA_CREATE_CQ_WITH_SGID = 1 << 1, }; struct efa_ibv_create_cq { __u32 comp_mask; __u32 cq_entry_size; __u16 num_sub_cqs; __u8 flags; __u8 reserved_58[5]; }; enum { EFA_CREATE_CQ_RESP_DB_OFF = 1 << 0, }; struct efa_ibv_create_cq_resp { __u32 comp_mask; __u8 reserved_20[4]; __aligned_u64 q_mmap_key; __aligned_u64 q_mmap_size; __u16 cq_idx; __u8 reserved_d0[2]; __u32 db_off; __aligned_u64 db_mmap_key; }; enum { EFA_QP_DRIVER_TYPE_SRD = 0, }; enum { EFA_CREATE_QP_WITH_UNSOLICITED_WRITE_RECV = 1 << 0, }; struct efa_ibv_create_qp { __u32 comp_mask; __u32 rq_ring_size; /* bytes */ __u32 sq_ring_size; /* bytes */ __u32 driver_qp_type; __u16 flags; __u8 sl; __u8 reserved_98[5]; }; struct efa_ibv_create_qp_resp { __u32 comp_mask; /* the offset inside the page of the rq db */ __u32 rq_db_offset; /* the offset inside the page of the sq db */ __u32 sq_db_offset; /* the offset inside the page of descriptors buffer */ __u32 llq_desc_offset; __aligned_u64 rq_mmap_key; __aligned_u64 rq_mmap_size; __aligned_u64 rq_db_mmap_key; __aligned_u64 sq_db_mmap_key; __aligned_u64 llq_desc_mmap_key; __u16 send_sub_cq_idx; __u16 recv_sub_cq_idx; __u8 reserved_1e0[4]; }; struct efa_ibv_create_ah_resp { __u32 comp_mask; __u16 efa_address_handle; __u8 reserved_30[2]; }; enum { EFA_QUERY_DEVICE_CAPS_RDMA_READ = 1 << 0, EFA_QUERY_DEVICE_CAPS_RNR_RETRY = 1 << 1, EFA_QUERY_DEVICE_CAPS_CQ_NOTIFICATIONS = 1 << 2, EFA_QUERY_DEVICE_CAPS_CQ_WITH_SGID = 1 << 3, EFA_QUERY_DEVICE_CAPS_DATA_POLLING_128 = 1 << 4, EFA_QUERY_DEVICE_CAPS_RDMA_WRITE = 1 << 5, EFA_QUERY_DEVICE_CAPS_UNSOLICITED_WRITE_RECV = 1 << 6, }; struct efa_ibv_ex_query_device_resp { __u32 comp_mask; __u32 max_sq_wr; __u32 max_rq_wr; __u16 max_sq_sge; __u16 max_rq_sge; __u32 max_rdma_size; __u32 device_caps; }; enum { EFA_QUERY_MR_VALIDITY_RECV_IC_ID = 1 << 0, EFA_QUERY_MR_VALIDITY_RDMA_READ_IC_ID = 1 << 1, EFA_QUERY_MR_VALIDITY_RDMA_RECV_IC_ID = 1 << 2, }; enum efa_query_mr_attrs { EFA_IB_ATTR_QUERY_MR_HANDLE = (1U << UVERBS_ID_NS_SHIFT), EFA_IB_ATTR_QUERY_MR_RESP_IC_ID_VALIDITY, EFA_IB_ATTR_QUERY_MR_RESP_RECV_IC_ID, EFA_IB_ATTR_QUERY_MR_RESP_RDMA_READ_IC_ID, EFA_IB_ATTR_QUERY_MR_RESP_RDMA_RECV_IC_ID, }; enum efa_mr_methods { EFA_IB_METHOD_MR_QUERY = (1U << UVERBS_ID_NS_SHIFT), }; #endif /* EFA_ABI_USER_H */ rdma-core-56.1/kernel-headers/rdma/erdma-abi.h000066400000000000000000000014531477342711600211430ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */ /* * Copyright (c) 2020-2022, Alibaba Group. 
*/ #ifndef __ERDMA_USER_H__ #define __ERDMA_USER_H__ #include #define ERDMA_ABI_VERSION 1 struct erdma_ureq_create_cq { __aligned_u64 db_record_va; __aligned_u64 qbuf_va; __u32 qbuf_len; __u32 rsvd0; }; struct erdma_uresp_create_cq { __u32 cq_id; __u32 num_cqe; }; struct erdma_ureq_create_qp { __aligned_u64 db_record_va; __aligned_u64 qbuf_va; __u32 qbuf_len; __u32 rsvd0; }; struct erdma_uresp_create_qp { __u32 qp_id; __u32 num_sqe; __u32 num_rqe; __u32 rq_offset; }; struct erdma_uresp_alloc_ctx { __u32 dev_id; __u32 pad; __u32 sdb_type; __u32 sdb_offset; __aligned_u64 sdb; __aligned_u64 rdb; __aligned_u64 cdb; }; #endif rdma-core-56.1/kernel-headers/rdma/hfi/000077500000000000000000000000001477342711600177145ustar00rootroot00000000000000rdma-core-56.1/kernel-headers/rdma/hfi/hfi1_ioctl.h000066400000000000000000000147321477342711600221150ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */ /* * * This file is provided under a dual BSD/GPLv2 license. When using or * redistributing this file, you may do so under either license. * * GPL LICENSE SUMMARY * * Copyright(c) 2015 Intel Corporation. * * This program is free software; you can redistribute it and/or modify * it under the terms of version 2 of the GNU General Public License as * published by the Free Software Foundation. * * This program is distributed in the hope that it will be useful, but * WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU * General Public License for more details. * * BSD LICENSE * * Copyright(c) 2015 Intel Corporation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * - Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * - Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * - Neither the name of Intel Corporation nor the names of its * contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * */ #ifndef _LINUX__HFI1_IOCTL_H #define _LINUX__HFI1_IOCTL_H #include /* * This structure is passed to the driver to tell it where * user code buffers are, sizes, etc. The offsets and sizes of the * fields must remain unchanged, for binary compatibility. It can * be extended, if userversion is changed so user code can tell, if needed */ struct hfi1_user_info { /* * version of user software, to detect compatibility issues. 
* Should be set to HFI1_USER_SWVERSION. */ __u32 userversion; __u32 pad; /* * If two or more processes wish to share a context, each process * must set the subcontext_cnt and subcontext_id to the same * values. The only restriction on the subcontext_id is that * it be unique for a given node. */ __u16 subctxt_cnt; __u16 subctxt_id; /* 128bit UUID passed in by PSM. */ __u8 uuid[16]; }; struct hfi1_ctxt_info { __aligned_u64 runtime_flags; /* chip/drv runtime flags (HFI1_CAP_*) */ __u32 rcvegr_size; /* size of each eager buffer */ __u16 num_active; /* number of active units */ __u16 unit; /* unit (chip) assigned to caller */ __u16 ctxt; /* ctxt on unit assigned to caller */ __u16 subctxt; /* subctxt on unit assigned to caller */ __u16 rcvtids; /* number of Rcv TIDs for this context */ __u16 credits; /* number of PIO credits for this context */ __u16 numa_node; /* NUMA node of the assigned device */ __u16 rec_cpu; /* cpu # for affinity (0xffff if none) */ __u16 send_ctxt; /* send context in use by this user context */ __u16 egrtids; /* number of RcvArray entries for Eager Rcvs */ __u16 rcvhdrq_cnt; /* number of RcvHdrQ entries */ __u16 rcvhdrq_entsize; /* size (in bytes) for each RcvHdrQ entry */ __u16 sdma_ring_size; /* number of entries in SDMA request ring */ }; struct hfi1_tid_info { /* virtual address of first page in transfer */ __aligned_u64 vaddr; /* pointer to tid array. this array is big enough */ __aligned_u64 tidlist; /* number of tids programmed by this request */ __u32 tidcnt; /* length of transfer buffer programmed by this request */ __u32 length; }; /* * This structure is returned by the driver immediately after * open to get implementation-specific info, and info specific to this * instance. * * This struct must have explicit pad fields where type sizes * may result in different alignments between 32 and 64 bit * programs, since the 64 bit * bit kernel requires the user code * to have matching offsets */ struct hfi1_base_info { /* version of hardware, for feature checking. */ __u32 hw_version; /* version of software, for feature checking. */ __u32 sw_version; /* Job key */ __u16 jkey; __u16 padding1; /* * The special QP (queue pair) value that identifies PSM * protocol packet from standard IB packets. */ __u32 bthqp; /* PIO credit return address, */ __aligned_u64 sc_credits_addr; /* * Base address of write-only pio buffers for this process. * Each buffer has sendpio_credits*64 bytes. */ __aligned_u64 pio_bufbase_sop; /* * Base address of write-only pio buffers for this process. * Each buffer has sendpio_credits*64 bytes. */ __aligned_u64 pio_bufbase; /* address where receive buffer queue is mapped into */ __aligned_u64 rcvhdr_bufbase; /* base address of Eager receive buffers. */ __aligned_u64 rcvegr_bufbase; /* base address of SDMA completion ring */ __aligned_u64 sdma_comp_bufbase; /* * User register base for init code, not to be used directly by * protocol or applications. Always maps real chip register space. * the register addresses are: * ur_rcvhdrhead, ur_rcvhdrtail, ur_rcvegrhead, ur_rcvegrtail, * ur_rcvtidflow */ __aligned_u64 user_regbase; /* notification events */ __aligned_u64 events_bufbase; /* status page */ __aligned_u64 status_bufbase; /* rcvhdrtail update */ __aligned_u64 rcvhdrtail_base; /* * shared memory pages for subctxts if ctxt is shared; these cover * all the processes in the group sharing a single context. * all have enough space for the num_subcontexts value on this job. 
*/ __aligned_u64 subctxt_uregbase; __aligned_u64 subctxt_rcvegrbuf; __aligned_u64 subctxt_rcvhdrbuf; }; #endif /* _LINIUX__HFI1_IOCTL_H */ rdma-core-56.1/kernel-headers/rdma/hfi/hfi1_user.h000066400000000000000000000221221477342711600217510ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */ /* * * This file is provided under a dual BSD/GPLv2 license. When using or * redistributing this file, you may do so under either license. * * GPL LICENSE SUMMARY * * Copyright(c) 2015 - 2020 Intel Corporation. * * This program is free software; you can redistribute it and/or modify * it under the terms of version 2 of the GNU General Public License as * published by the Free Software Foundation. * * This program is distributed in the hope that it will be useful, but * WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU * General Public License for more details. * * BSD LICENSE * * Copyright(c) 2015 Intel Corporation. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * - Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * - Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * - Neither the name of Intel Corporation nor the names of its * contributors may be used to endorse or promote products derived * from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * */ /* * This file contains defines, structures, etc. that are used * to communicate between kernel and user code. */ #ifndef _LINUX__HFI1_USER_H #define _LINUX__HFI1_USER_H #include #include /* * This version number is given to the driver by the user code during * initialization in the spu_userversion field of hfi1_user_info, so * the driver can check for compatibility with user code. * * The major version changes when data structures change in an incompatible * way. The driver must be the same for initialization to succeed. */ #define HFI1_USER_SWMAJOR 6 /* * Minor version differences are always compatible * a within a major version, however if user software is larger * than driver software, some new features and/or structure fields * may not be implemented; the user code must deal with this if it * cares, or it must abort after initialization reports the difference. */ #define HFI1_USER_SWMINOR 3 /* * We will encode the major/minor inside a single 32bit version number. 
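 *
 * Illustratively (assuming the conventional encoding; the combined version
 * macro itself is not part of this excerpt):
 *
 *	swversion = (HFI1_USER_SWMAJOR << HFI1_SWMAJOR_SHIFT) | HFI1_USER_SWMINOR;
 *
 * which places the major number in the upper 16 bits and the minor number in
 * the lower 16 bits of the 32-bit value.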
*/ #define HFI1_SWMAJOR_SHIFT 16 /* * Set of HW and driver capability/feature bits. * These bit values are used to configure enabled/disabled HW and * driver features. The same set of bits are communicated to user * space. */ #define HFI1_CAP_DMA_RTAIL (1UL << 0) /* Use DMA'ed RTail value */ #define HFI1_CAP_SDMA (1UL << 1) /* Enable SDMA support */ #define HFI1_CAP_SDMA_AHG (1UL << 2) /* Enable SDMA AHG support */ #define HFI1_CAP_EXTENDED_PSN (1UL << 3) /* Enable Extended PSN support */ #define HFI1_CAP_HDRSUPP (1UL << 4) /* Enable Header Suppression */ #define HFI1_CAP_TID_RDMA (1UL << 5) /* Enable TID RDMA operations */ #define HFI1_CAP_USE_SDMA_HEAD (1UL << 6) /* DMA Hdr Q tail vs. use CSR */ #define HFI1_CAP_MULTI_PKT_EGR (1UL << 7) /* Enable multi-packet Egr buffs*/ #define HFI1_CAP_NODROP_RHQ_FULL (1UL << 8) /* Don't drop on Hdr Q full */ #define HFI1_CAP_NODROP_EGR_FULL (1UL << 9) /* Don't drop on EGR buffs full */ #define HFI1_CAP_TID_UNMAP (1UL << 10) /* Disable Expected TID caching */ #define HFI1_CAP_PRINT_UNIMPL (1UL << 11) /* Show for unimplemented feats */ #define HFI1_CAP_ALLOW_PERM_JKEY (1UL << 12) /* Allow use of permissive JKEY */ #define HFI1_CAP_NO_INTEGRITY (1UL << 13) /* Enable ctxt integrity checks */ #define HFI1_CAP_PKEY_CHECK (1UL << 14) /* Enable ctxt PKey checking */ #define HFI1_CAP_STATIC_RATE_CTRL (1UL << 15) /* Allow PBC.StaticRateControl */ #define HFI1_CAP_OPFN (1UL << 16) /* Enable the OPFN protocol */ #define HFI1_CAP_SDMA_HEAD_CHECK (1UL << 17) /* SDMA head checking */ #define HFI1_CAP_EARLY_CREDIT_RETURN (1UL << 18) /* early credit return */ #define HFI1_CAP_AIP (1UL << 19) /* Enable accelerated IP */ #define HFI1_RCVHDR_ENTSIZE_2 (1UL << 0) #define HFI1_RCVHDR_ENTSIZE_16 (1UL << 1) #define HFI1_RCVDHR_ENTSIZE_32 (1UL << 2) #define _HFI1_EVENT_FROZEN_BIT 0 #define _HFI1_EVENT_LINKDOWN_BIT 1 #define _HFI1_EVENT_LID_CHANGE_BIT 2 #define _HFI1_EVENT_LMC_CHANGE_BIT 3 #define _HFI1_EVENT_SL2VL_CHANGE_BIT 4 #define _HFI1_EVENT_TID_MMU_NOTIFY_BIT 5 #define _HFI1_MAX_EVENT_BIT _HFI1_EVENT_TID_MMU_NOTIFY_BIT #define HFI1_EVENT_FROZEN (1UL << _HFI1_EVENT_FROZEN_BIT) #define HFI1_EVENT_LINKDOWN (1UL << _HFI1_EVENT_LINKDOWN_BIT) #define HFI1_EVENT_LID_CHANGE (1UL << _HFI1_EVENT_LID_CHANGE_BIT) #define HFI1_EVENT_LMC_CHANGE (1UL << _HFI1_EVENT_LMC_CHANGE_BIT) #define HFI1_EVENT_SL2VL_CHANGE (1UL << _HFI1_EVENT_SL2VL_CHANGE_BIT) #define HFI1_EVENT_TID_MMU_NOTIFY (1UL << _HFI1_EVENT_TID_MMU_NOTIFY_BIT) /* * These are the status bits readable (in ASCII form, 64bit value) * from the "status" sysfs file. For binary compatibility, values * must remain as is; removed states can be reused for different * purposes. */ #define HFI1_STATUS_INITTED 0x1 /* basic initialization done */ /* Chip has been found and initialized */ #define HFI1_STATUS_CHIP_PRESENT 0x20 /* IB link is at ACTIVE, usable for data traffic */ #define HFI1_STATUS_IB_READY 0x40 /* link is configured, LID, MTU, etc. have been set */ #define HFI1_STATUS_IB_CONF 0x80 /* A Fatal hardware error has occurred. */ #define HFI1_STATUS_HWERROR 0x200 /* * Number of supported shared contexts. * This is the maximum number of software contexts that can share * a hardware send/receive context. 
*/ #define HFI1_MAX_SHARED_CTXTS 8 /* * Poll types */ #define HFI1_POLL_TYPE_ANYRCV 0x0 #define HFI1_POLL_TYPE_URGENT 0x1 enum hfi1_sdma_comp_state { FREE = 0, QUEUED, COMPLETE, ERROR }; /* * SDMA completion ring entry */ struct hfi1_sdma_comp_entry { __u32 status; __u32 errcode; }; /* * Device status and notifications from driver to user-space. */ struct hfi1_status { __aligned_u64 dev; /* device/hw status bits */ __aligned_u64 port; /* port state and status bits */ char freezemsg[]; }; enum sdma_req_opcode { EXPECTED = 0, EAGER }; #define HFI1_SDMA_REQ_VERSION_MASK 0xF #define HFI1_SDMA_REQ_VERSION_SHIFT 0x0 #define HFI1_SDMA_REQ_OPCODE_MASK 0xF #define HFI1_SDMA_REQ_OPCODE_SHIFT 0x4 #define HFI1_SDMA_REQ_IOVCNT_MASK 0xFF #define HFI1_SDMA_REQ_IOVCNT_SHIFT 0x8 struct sdma_req_info { /* * bits 0-3 - version (currently unused) * bits 4-7 - opcode (enum sdma_req_opcode) * bits 8-15 - io vector count */ __u16 ctrl; /* * Number of fragments contained in this request. * User-space has already computed how many * fragment-sized packet the user buffer will be * split into. */ __u16 npkts; /* * Size of each fragment the user buffer will be * split into. */ __u16 fragsize; /* * Index of the slot in the SDMA completion ring * this request should be using. User-space is * in charge of managing its own ring. */ __u16 comp_idx; } __attribute__((__packed__)); /* * SW KDETH header. * swdata is SW defined portion. */ struct hfi1_kdeth_header { __le32 ver_tid_offset; __le16 jkey; __le16 hcrc; __le32 swdata[7]; } __attribute__((__packed__)); /* * Structure describing the headers that User space uses. The * structure above is a subset of this one. */ struct hfi1_pkt_header { __le16 pbc[4]; __be16 lrh[4]; __be32 bth[3]; struct hfi1_kdeth_header kdeth; } __attribute__((__packed__)); /* * The list of usermode accessible registers. */ enum hfi1_ureg { /* (RO) DMA RcvHdr to be used next. */ ur_rcvhdrtail = 0, /* (RW) RcvHdr entry to be processed next by host. */ ur_rcvhdrhead = 1, /* (RO) Index of next Eager index to use. */ ur_rcvegrindextail = 2, /* (RW) Eager TID to be processed next */ ur_rcvegrindexhead = 3, /* (RO) Receive Eager Offset Tail */ ur_rcvegroffsettail = 4, /* For internal use only; max register number. */ ur_maxreg, /* (RW) Receive TID flow table */ ur_rcvtidflowtable = 256 }; #endif /* _LINIUX__HFI1_USER_H */ rdma-core-56.1/kernel-headers/rdma/hns-abi.h000066400000000000000000000076051477342711600206500ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */ /* * Copyright (c) 2016 Hisilicon Limited. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef HNS_ABI_USER_H #define HNS_ABI_USER_H #include struct hns_roce_ib_create_cq { __aligned_u64 buf_addr; __aligned_u64 db_addr; __u32 cqe_size; __u32 reserved; }; enum hns_roce_cq_cap_flags { HNS_ROCE_CQ_FLAG_RECORD_DB = 1 << 0, }; struct hns_roce_ib_create_cq_resp { __aligned_u64 cqn; /* Only 32 bits used, 64 for compat */ __aligned_u64 cap_flags; }; enum hns_roce_srq_cap_flags { HNS_ROCE_SRQ_CAP_RECORD_DB = 1 << 0, }; enum hns_roce_srq_cap_flags_resp { HNS_ROCE_RSP_SRQ_CAP_RECORD_DB = 1 << 0, }; struct hns_roce_ib_create_srq { __aligned_u64 buf_addr; __aligned_u64 db_addr; __aligned_u64 que_addr; __u32 req_cap_flags; /* Use enum hns_roce_srq_cap_flags */ __u32 reserved; }; struct hns_roce_ib_create_srq_resp { __u32 srqn; __u32 cap_flags; /* Use enum hns_roce_srq_cap_flags */ }; enum hns_roce_congest_type_flags { HNS_ROCE_CREATE_QP_FLAGS_DCQCN, HNS_ROCE_CREATE_QP_FLAGS_LDCP, HNS_ROCE_CREATE_QP_FLAGS_HC3, HNS_ROCE_CREATE_QP_FLAGS_DIP, }; enum hns_roce_create_qp_comp_mask { HNS_ROCE_CREATE_QP_MASK_CONGEST_TYPE = 1 << 0, }; struct hns_roce_ib_create_qp { __aligned_u64 buf_addr; __aligned_u64 db_addr; __u8 log_sq_bb_count; __u8 log_sq_stride; __u8 sq_no_prefetch; __u8 reserved[5]; __aligned_u64 sdb_addr; __aligned_u64 comp_mask; /* Use enum hns_roce_create_qp_comp_mask */ __aligned_u64 create_flags; __aligned_u64 cong_type_flags; }; enum hns_roce_qp_cap_flags { HNS_ROCE_QP_CAP_RQ_RECORD_DB = 1 << 0, HNS_ROCE_QP_CAP_SQ_RECORD_DB = 1 << 1, HNS_ROCE_QP_CAP_OWNER_DB = 1 << 2, HNS_ROCE_QP_CAP_DIRECT_WQE = 1 << 5, }; struct hns_roce_ib_create_qp_resp { __aligned_u64 cap_flags; __aligned_u64 dwqe_mmap_key; }; struct hns_roce_ib_modify_qp_resp { __u8 tc_mode; __u8 priority; __u8 reserved[6]; }; enum { HNS_ROCE_EXSGE_FLAGS = 1 << 0, HNS_ROCE_RQ_INLINE_FLAGS = 1 << 1, HNS_ROCE_CQE_INLINE_FLAGS = 1 << 2, }; enum { HNS_ROCE_RSP_EXSGE_FLAGS = 1 << 0, HNS_ROCE_RSP_RQ_INLINE_FLAGS = 1 << 1, HNS_ROCE_RSP_CQE_INLINE_FLAGS = 1 << 2, }; struct hns_roce_ib_alloc_ucontext_resp { __u32 qp_tab_size; __u32 cqe_size; __u32 srq_tab_size; __u32 reserved; __u32 config; __u32 max_inline_data; __u8 congest_type; __u8 reserved0[7]; }; struct hns_roce_ib_alloc_ucontext { __u32 config; __u32 reserved; }; struct hns_roce_ib_alloc_pd_resp { __u32 pdn; }; struct hns_roce_ib_create_ah_resp { __u8 dmac[6]; __u8 priority; __u8 tc_mode; }; #endif /* HNS_ABI_USER_H */ rdma-core-56.1/kernel-headers/rdma/ib_user_ioctl_cmds.h000066400000000000000000000232101477342711600231450ustar00rootroot00000000000000/* * Copyright (c) 2018, Mellanox Technologies inc. All rights reserved. * Copyright (c) 2020, Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef IB_USER_IOCTL_CMDS_H #define IB_USER_IOCTL_CMDS_H #define UVERBS_ID_NS_MASK 0xF000 #define UVERBS_ID_NS_SHIFT 12 enum uverbs_default_objects { UVERBS_OBJECT_DEVICE, /* No instances of DEVICE are allowed */ UVERBS_OBJECT_PD, UVERBS_OBJECT_COMP_CHANNEL, UVERBS_OBJECT_CQ, UVERBS_OBJECT_QP, UVERBS_OBJECT_SRQ, UVERBS_OBJECT_AH, UVERBS_OBJECT_MR, UVERBS_OBJECT_MW, UVERBS_OBJECT_FLOW, UVERBS_OBJECT_XRCD, UVERBS_OBJECT_RWQ_IND_TBL, UVERBS_OBJECT_WQ, UVERBS_OBJECT_FLOW_ACTION, UVERBS_OBJECT_DM, UVERBS_OBJECT_COUNTERS, UVERBS_OBJECT_ASYNC_EVENT, }; enum { UVERBS_ID_DRIVER_NS = 1UL << UVERBS_ID_NS_SHIFT, UVERBS_ATTR_UHW_IN = UVERBS_ID_DRIVER_NS, UVERBS_ATTR_UHW_OUT, UVERBS_ID_DRIVER_NS_WITH_UHW, }; enum uverbs_methods_device { UVERBS_METHOD_INVOKE_WRITE, UVERBS_METHOD_INFO_HANDLES, UVERBS_METHOD_QUERY_PORT, UVERBS_METHOD_GET_CONTEXT, UVERBS_METHOD_QUERY_CONTEXT, UVERBS_METHOD_QUERY_GID_TABLE, UVERBS_METHOD_QUERY_GID_ENTRY, }; enum uverbs_attrs_invoke_write_cmd_attr_ids { UVERBS_ATTR_CORE_IN, UVERBS_ATTR_CORE_OUT, UVERBS_ATTR_WRITE_CMD, }; enum uverbs_attrs_query_port_cmd_attr_ids { UVERBS_ATTR_QUERY_PORT_PORT_NUM, UVERBS_ATTR_QUERY_PORT_RESP, }; enum uverbs_attrs_get_context_attr_ids { UVERBS_ATTR_GET_CONTEXT_NUM_COMP_VECTORS, UVERBS_ATTR_GET_CONTEXT_CORE_SUPPORT, }; enum uverbs_attrs_query_context_attr_ids { UVERBS_ATTR_QUERY_CONTEXT_NUM_COMP_VECTORS, UVERBS_ATTR_QUERY_CONTEXT_CORE_SUPPORT, }; enum uverbs_attrs_create_cq_cmd_attr_ids { UVERBS_ATTR_CREATE_CQ_HANDLE, UVERBS_ATTR_CREATE_CQ_CQE, UVERBS_ATTR_CREATE_CQ_USER_HANDLE, UVERBS_ATTR_CREATE_CQ_COMP_CHANNEL, UVERBS_ATTR_CREATE_CQ_COMP_VECTOR, UVERBS_ATTR_CREATE_CQ_FLAGS, UVERBS_ATTR_CREATE_CQ_RESP_CQE, UVERBS_ATTR_CREATE_CQ_EVENT_FD, }; enum uverbs_attrs_destroy_cq_cmd_attr_ids { UVERBS_ATTR_DESTROY_CQ_HANDLE, UVERBS_ATTR_DESTROY_CQ_RESP, }; enum uverbs_attrs_create_flow_action_esp { UVERBS_ATTR_CREATE_FLOW_ACTION_ESP_HANDLE, UVERBS_ATTR_FLOW_ACTION_ESP_ATTRS, UVERBS_ATTR_FLOW_ACTION_ESP_ESN, UVERBS_ATTR_FLOW_ACTION_ESP_KEYMAT, UVERBS_ATTR_FLOW_ACTION_ESP_REPLAY, UVERBS_ATTR_FLOW_ACTION_ESP_ENCAP, }; enum uverbs_attrs_modify_flow_action_esp { UVERBS_ATTR_MODIFY_FLOW_ACTION_ESP_HANDLE = UVERBS_ATTR_CREATE_FLOW_ACTION_ESP_HANDLE, }; enum uverbs_attrs_destroy_flow_action_esp { UVERBS_ATTR_DESTROY_FLOW_ACTION_HANDLE, }; enum uverbs_attrs_create_qp_cmd_attr_ids { 
UVERBS_ATTR_CREATE_QP_HANDLE, UVERBS_ATTR_CREATE_QP_XRCD_HANDLE, UVERBS_ATTR_CREATE_QP_PD_HANDLE, UVERBS_ATTR_CREATE_QP_SRQ_HANDLE, UVERBS_ATTR_CREATE_QP_SEND_CQ_HANDLE, UVERBS_ATTR_CREATE_QP_RECV_CQ_HANDLE, UVERBS_ATTR_CREATE_QP_IND_TABLE_HANDLE, UVERBS_ATTR_CREATE_QP_USER_HANDLE, UVERBS_ATTR_CREATE_QP_CAP, UVERBS_ATTR_CREATE_QP_TYPE, UVERBS_ATTR_CREATE_QP_FLAGS, UVERBS_ATTR_CREATE_QP_SOURCE_QPN, UVERBS_ATTR_CREATE_QP_EVENT_FD, UVERBS_ATTR_CREATE_QP_RESP_CAP, UVERBS_ATTR_CREATE_QP_RESP_QP_NUM, }; enum uverbs_attrs_destroy_qp_cmd_attr_ids { UVERBS_ATTR_DESTROY_QP_HANDLE, UVERBS_ATTR_DESTROY_QP_RESP, }; enum uverbs_methods_qp { UVERBS_METHOD_QP_CREATE, UVERBS_METHOD_QP_DESTROY, }; enum uverbs_attrs_create_srq_cmd_attr_ids { UVERBS_ATTR_CREATE_SRQ_HANDLE, UVERBS_ATTR_CREATE_SRQ_PD_HANDLE, UVERBS_ATTR_CREATE_SRQ_XRCD_HANDLE, UVERBS_ATTR_CREATE_SRQ_CQ_HANDLE, UVERBS_ATTR_CREATE_SRQ_USER_HANDLE, UVERBS_ATTR_CREATE_SRQ_MAX_WR, UVERBS_ATTR_CREATE_SRQ_MAX_SGE, UVERBS_ATTR_CREATE_SRQ_LIMIT, UVERBS_ATTR_CREATE_SRQ_MAX_NUM_TAGS, UVERBS_ATTR_CREATE_SRQ_TYPE, UVERBS_ATTR_CREATE_SRQ_EVENT_FD, UVERBS_ATTR_CREATE_SRQ_RESP_MAX_WR, UVERBS_ATTR_CREATE_SRQ_RESP_MAX_SGE, UVERBS_ATTR_CREATE_SRQ_RESP_SRQ_NUM, }; enum uverbs_attrs_destroy_srq_cmd_attr_ids { UVERBS_ATTR_DESTROY_SRQ_HANDLE, UVERBS_ATTR_DESTROY_SRQ_RESP, }; enum uverbs_methods_srq { UVERBS_METHOD_SRQ_CREATE, UVERBS_METHOD_SRQ_DESTROY, }; enum uverbs_methods_cq { UVERBS_METHOD_CQ_CREATE, UVERBS_METHOD_CQ_DESTROY, }; enum uverbs_attrs_create_wq_cmd_attr_ids { UVERBS_ATTR_CREATE_WQ_HANDLE, UVERBS_ATTR_CREATE_WQ_PD_HANDLE, UVERBS_ATTR_CREATE_WQ_CQ_HANDLE, UVERBS_ATTR_CREATE_WQ_USER_HANDLE, UVERBS_ATTR_CREATE_WQ_TYPE, UVERBS_ATTR_CREATE_WQ_EVENT_FD, UVERBS_ATTR_CREATE_WQ_MAX_WR, UVERBS_ATTR_CREATE_WQ_MAX_SGE, UVERBS_ATTR_CREATE_WQ_FLAGS, UVERBS_ATTR_CREATE_WQ_RESP_MAX_WR, UVERBS_ATTR_CREATE_WQ_RESP_MAX_SGE, UVERBS_ATTR_CREATE_WQ_RESP_WQ_NUM, }; enum uverbs_attrs_destroy_wq_cmd_attr_ids { UVERBS_ATTR_DESTROY_WQ_HANDLE, UVERBS_ATTR_DESTROY_WQ_RESP, }; enum uverbs_methods_wq { UVERBS_METHOD_WQ_CREATE, UVERBS_METHOD_WQ_DESTROY, }; enum uverbs_methods_actions_flow_action_ops { UVERBS_METHOD_FLOW_ACTION_ESP_CREATE, UVERBS_METHOD_FLOW_ACTION_DESTROY, UVERBS_METHOD_FLOW_ACTION_ESP_MODIFY, }; enum uverbs_attrs_alloc_dm_cmd_attr_ids { UVERBS_ATTR_ALLOC_DM_HANDLE, UVERBS_ATTR_ALLOC_DM_LENGTH, UVERBS_ATTR_ALLOC_DM_ALIGNMENT, }; enum uverbs_attrs_free_dm_cmd_attr_ids { UVERBS_ATTR_FREE_DM_HANDLE, }; enum uverbs_methods_dm { UVERBS_METHOD_DM_ALLOC, UVERBS_METHOD_DM_FREE, }; enum uverbs_attrs_reg_dm_mr_cmd_attr_ids { UVERBS_ATTR_REG_DM_MR_HANDLE, UVERBS_ATTR_REG_DM_MR_OFFSET, UVERBS_ATTR_REG_DM_MR_LENGTH, UVERBS_ATTR_REG_DM_MR_PD_HANDLE, UVERBS_ATTR_REG_DM_MR_ACCESS_FLAGS, UVERBS_ATTR_REG_DM_MR_DM_HANDLE, UVERBS_ATTR_REG_DM_MR_RESP_LKEY, UVERBS_ATTR_REG_DM_MR_RESP_RKEY, }; enum uverbs_methods_mr { UVERBS_METHOD_DM_MR_REG, UVERBS_METHOD_MR_DESTROY, UVERBS_METHOD_ADVISE_MR, UVERBS_METHOD_QUERY_MR, UVERBS_METHOD_REG_DMABUF_MR, }; enum uverbs_attrs_mr_destroy_ids { UVERBS_ATTR_DESTROY_MR_HANDLE, }; enum uverbs_attrs_advise_mr_cmd_attr_ids { UVERBS_ATTR_ADVISE_MR_PD_HANDLE, UVERBS_ATTR_ADVISE_MR_ADVICE, UVERBS_ATTR_ADVISE_MR_FLAGS, UVERBS_ATTR_ADVISE_MR_SGE_LIST, }; enum uverbs_attrs_query_mr_cmd_attr_ids { UVERBS_ATTR_QUERY_MR_HANDLE, UVERBS_ATTR_QUERY_MR_RESP_LKEY, UVERBS_ATTR_QUERY_MR_RESP_RKEY, UVERBS_ATTR_QUERY_MR_RESP_LENGTH, UVERBS_ATTR_QUERY_MR_RESP_IOVA, }; enum uverbs_attrs_reg_dmabuf_mr_cmd_attr_ids { UVERBS_ATTR_REG_DMABUF_MR_HANDLE, 
UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE, UVERBS_ATTR_REG_DMABUF_MR_OFFSET, UVERBS_ATTR_REG_DMABUF_MR_LENGTH, UVERBS_ATTR_REG_DMABUF_MR_IOVA, UVERBS_ATTR_REG_DMABUF_MR_FD, UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY, UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY, }; enum uverbs_attrs_create_counters_cmd_attr_ids { UVERBS_ATTR_CREATE_COUNTERS_HANDLE, }; enum uverbs_attrs_destroy_counters_cmd_attr_ids { UVERBS_ATTR_DESTROY_COUNTERS_HANDLE, }; enum uverbs_attrs_read_counters_cmd_attr_ids { UVERBS_ATTR_READ_COUNTERS_HANDLE, UVERBS_ATTR_READ_COUNTERS_BUFF, UVERBS_ATTR_READ_COUNTERS_FLAGS, }; enum uverbs_methods_actions_counters_ops { UVERBS_METHOD_COUNTERS_CREATE, UVERBS_METHOD_COUNTERS_DESTROY, UVERBS_METHOD_COUNTERS_READ, }; enum uverbs_attrs_info_handles_id { UVERBS_ATTR_INFO_OBJECT_ID, UVERBS_ATTR_INFO_TOTAL_HANDLES, UVERBS_ATTR_INFO_HANDLES_LIST, }; enum uverbs_methods_pd { UVERBS_METHOD_PD_DESTROY, }; enum uverbs_attrs_pd_destroy_ids { UVERBS_ATTR_DESTROY_PD_HANDLE, }; enum uverbs_methods_mw { UVERBS_METHOD_MW_DESTROY, }; enum uverbs_attrs_mw_destroy_ids { UVERBS_ATTR_DESTROY_MW_HANDLE, }; enum uverbs_methods_xrcd { UVERBS_METHOD_XRCD_DESTROY, }; enum uverbs_attrs_xrcd_destroy_ids { UVERBS_ATTR_DESTROY_XRCD_HANDLE, }; enum uverbs_methods_ah { UVERBS_METHOD_AH_DESTROY, }; enum uverbs_attrs_ah_destroy_ids { UVERBS_ATTR_DESTROY_AH_HANDLE, }; enum uverbs_methods_rwq_ind_tbl { UVERBS_METHOD_RWQ_IND_TBL_DESTROY, }; enum uverbs_attrs_rwq_ind_tbl_destroy_ids { UVERBS_ATTR_DESTROY_RWQ_IND_TBL_HANDLE, }; enum uverbs_methods_flow { UVERBS_METHOD_FLOW_DESTROY, }; enum uverbs_attrs_flow_destroy_ids { UVERBS_ATTR_DESTROY_FLOW_HANDLE, }; enum uverbs_method_async_event { UVERBS_METHOD_ASYNC_EVENT_ALLOC, }; enum uverbs_attrs_async_event_create { UVERBS_ATTR_ASYNC_EVENT_ALLOC_FD_HANDLE, }; enum uverbs_attrs_query_gid_table_cmd_attr_ids { UVERBS_ATTR_QUERY_GID_TABLE_ENTRY_SIZE, UVERBS_ATTR_QUERY_GID_TABLE_FLAGS, UVERBS_ATTR_QUERY_GID_TABLE_RESP_ENTRIES, UVERBS_ATTR_QUERY_GID_TABLE_RESP_NUM_ENTRIES, }; enum uverbs_attrs_query_gid_entry_cmd_attr_ids { UVERBS_ATTR_QUERY_GID_ENTRY_PORT, UVERBS_ATTR_QUERY_GID_ENTRY_GID_INDEX, UVERBS_ATTR_QUERY_GID_ENTRY_FLAGS, UVERBS_ATTR_QUERY_GID_ENTRY_RESP_ENTRY, }; #endif rdma-core-56.1/kernel-headers/rdma/ib_user_ioctl_verbs.h000066400000000000000000000174031477342711600233470ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */ /* * Copyright (c) 2017-2018, Mellanox Technologies inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
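/*
 * A minimal sketch of the id namespacing defined at the top of this
 * header: ids whose namespace field is zero are core attributes, while
 * driver-private ids start at UVERBS_ID_DRIVER_NS (1 << UVERBS_ID_NS_SHIFT).
 * The helper name is hypothetical.
 */
static inline int uverbs_attr_id_is_driver(__u16 id)
{
	return (id & UVERBS_ID_NS_MASK) != 0;
}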
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef IB_USER_IOCTL_VERBS_H #define IB_USER_IOCTL_VERBS_H #include <linux/types.h> #include <rdma/ib_user_verbs.h> #ifndef RDMA_UAPI_PTR #define RDMA_UAPI_PTR(_type, _name) __aligned_u64 _name #endif #define IB_UVERBS_ACCESS_OPTIONAL_FIRST (1 << 20) #define IB_UVERBS_ACCESS_OPTIONAL_LAST (1 << 29) enum ib_uverbs_core_support { IB_UVERBS_CORE_SUPPORT_OPTIONAL_MR_ACCESS = 1 << 0, }; enum ib_uverbs_access_flags { IB_UVERBS_ACCESS_LOCAL_WRITE = 1 << 0, IB_UVERBS_ACCESS_REMOTE_WRITE = 1 << 1, IB_UVERBS_ACCESS_REMOTE_READ = 1 << 2, IB_UVERBS_ACCESS_REMOTE_ATOMIC = 1 << 3, IB_UVERBS_ACCESS_MW_BIND = 1 << 4, IB_UVERBS_ACCESS_ZERO_BASED = 1 << 5, IB_UVERBS_ACCESS_ON_DEMAND = 1 << 6, IB_UVERBS_ACCESS_HUGETLB = 1 << 7, IB_UVERBS_ACCESS_FLUSH_GLOBAL = 1 << 8, IB_UVERBS_ACCESS_FLUSH_PERSISTENT = 1 << 9, IB_UVERBS_ACCESS_RELAXED_ORDERING = IB_UVERBS_ACCESS_OPTIONAL_FIRST, IB_UVERBS_ACCESS_OPTIONAL_RANGE = ((IB_UVERBS_ACCESS_OPTIONAL_LAST << 1) - 1) & ~(IB_UVERBS_ACCESS_OPTIONAL_FIRST - 1) }; enum ib_uverbs_srq_type { IB_UVERBS_SRQT_BASIC, IB_UVERBS_SRQT_XRC, IB_UVERBS_SRQT_TM, }; enum ib_uverbs_wq_type { IB_UVERBS_WQT_RQ, }; enum ib_uverbs_wq_flags { IB_UVERBS_WQ_FLAGS_CVLAN_STRIPPING = 1 << 0, IB_UVERBS_WQ_FLAGS_SCATTER_FCS = 1 << 1, IB_UVERBS_WQ_FLAGS_DELAY_DROP = 1 << 2, IB_UVERBS_WQ_FLAGS_PCI_WRITE_END_PADDING = 1 << 3, }; enum ib_uverbs_qp_type { IB_UVERBS_QPT_RC = 2, IB_UVERBS_QPT_UC, IB_UVERBS_QPT_UD, IB_UVERBS_QPT_RAW_PACKET = 8, IB_UVERBS_QPT_XRC_INI, IB_UVERBS_QPT_XRC_TGT, IB_UVERBS_QPT_DRIVER = 0xFF, }; enum ib_uverbs_qp_create_flags { IB_UVERBS_QP_CREATE_BLOCK_MULTICAST_LOOPBACK = 1 << 1, IB_UVERBS_QP_CREATE_SCATTER_FCS = 1 << 8, IB_UVERBS_QP_CREATE_CVLAN_STRIPPING = 1 << 9, IB_UVERBS_QP_CREATE_PCI_WRITE_END_PADDING = 1 << 11, IB_UVERBS_QP_CREATE_SQ_SIG_ALL = 1 << 12, }; enum ib_uverbs_query_port_cap_flags { IB_UVERBS_PCF_SM = 1 << 1, IB_UVERBS_PCF_NOTICE_SUP = 1 << 2, IB_UVERBS_PCF_TRAP_SUP = 1 << 3, IB_UVERBS_PCF_OPT_IPD_SUP = 1 << 4, IB_UVERBS_PCF_AUTO_MIGR_SUP = 1 << 5, IB_UVERBS_PCF_SL_MAP_SUP = 1 << 6, IB_UVERBS_PCF_MKEY_NVRAM = 1 << 7, IB_UVERBS_PCF_PKEY_NVRAM = 1 << 8, IB_UVERBS_PCF_LED_INFO_SUP = 1 << 9, IB_UVERBS_PCF_SM_DISABLED = 1 << 10, IB_UVERBS_PCF_SYS_IMAGE_GUID_SUP = 1 << 11, IB_UVERBS_PCF_PKEY_SW_EXT_PORT_TRAP_SUP = 1 << 12, IB_UVERBS_PCF_EXTENDED_SPEEDS_SUP = 1 << 14, IB_UVERBS_PCF_CM_SUP = 1 << 16, IB_UVERBS_PCF_SNMP_TUNNEL_SUP = 1 << 17, IB_UVERBS_PCF_REINIT_SUP = 1 << 18, IB_UVERBS_PCF_DEVICE_MGMT_SUP = 1 << 19, IB_UVERBS_PCF_VENDOR_CLASS_SUP = 1 << 20, IB_UVERBS_PCF_DR_NOTICE_SUP = 1 << 21, IB_UVERBS_PCF_CAP_MASK_NOTICE_SUP = 1 << 22, IB_UVERBS_PCF_BOOT_MGMT_SUP = 1 << 23, IB_UVERBS_PCF_LINK_LATENCY_SUP = 1 << 24, IB_UVERBS_PCF_CLIENT_REG_SUP = 1 << 25, /* * IsOtherLocalChangesNoticeSupported is aliased by IP_BASED_GIDS and * is inaccessible */ IB_UVERBS_PCF_LINK_SPEED_WIDTH_TABLE_SUP = 1 << 27, IB_UVERBS_PCF_VENDOR_SPECIFIC_MADS_TABLE_SUP = 1 << 28, IB_UVERBS_PCF_MCAST_PKEY_TRAP_SUPPRESSION_SUP = 1 << 29, IB_UVERBS_PCF_MCAST_FDB_TOP_SUP = 1 << 30, IB_UVERBS_PCF_HIERARCHY_INFO_SUP = 1ULL << 31, /* NOTE this is an internal flag, not an IBA flag */ IB_UVERBS_PCF_IP_BASED_GIDS = 1 << 26, }; enum ib_uverbs_query_port_flags { IB_UVERBS_QPF_GRH_REQUIRED = 1 << 0, }; enum ib_uverbs_flow_action_esp_keymat { IB_UVERBS_FLOW_ACTION_ESP_KEYMAT_AES_GCM, 
}; enum ib_uverbs_flow_action_esp_keymat_aes_gcm_iv_algo { IB_UVERBS_FLOW_ACTION_IV_ALGO_SEQ, }; struct ib_uverbs_flow_action_esp_keymat_aes_gcm { __aligned_u64 iv; __u32 iv_algo; /* Use enum ib_uverbs_flow_action_esp_keymat_aes_gcm_iv_algo */ __u32 salt; __u32 icv_len; __u32 key_len; __u32 aes_key[256 / 32]; }; enum ib_uverbs_flow_action_esp_replay { IB_UVERBS_FLOW_ACTION_ESP_REPLAY_NONE, IB_UVERBS_FLOW_ACTION_ESP_REPLAY_BMP, }; struct ib_uverbs_flow_action_esp_replay_bmp { __u32 size; }; enum ib_uverbs_flow_action_esp_flags { IB_UVERBS_FLOW_ACTION_ESP_FLAGS_INLINE_CRYPTO = 0UL << 0, /* Default */ IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD = 1UL << 0, IB_UVERBS_FLOW_ACTION_ESP_FLAGS_TUNNEL = 0UL << 1, /* Default */ IB_UVERBS_FLOW_ACTION_ESP_FLAGS_TRANSPORT = 1UL << 1, IB_UVERBS_FLOW_ACTION_ESP_FLAGS_DECRYPT = 0UL << 2, /* Default */ IB_UVERBS_FLOW_ACTION_ESP_FLAGS_ENCRYPT = 1UL << 2, IB_UVERBS_FLOW_ACTION_ESP_FLAGS_ESN_NEW_WINDOW = 1UL << 3, }; struct ib_uverbs_flow_action_esp_encap { /* This struct represents a list of pointers to flow_xxxx_filter that * encapsulates the payload in ESP tunnel mode. */ RDMA_UAPI_PTR(void *, val_ptr); /* pointer to a flow_xxxx_filter */ RDMA_UAPI_PTR(struct ib_uverbs_flow_action_esp_encap *, next_ptr); __u16 len; /* Len of the filter struct val_ptr points to */ __u16 type; /* Use flow_spec_type enum */ }; struct ib_uverbs_flow_action_esp { __u32 spi; __u32 seq; __u32 tfc_pad; __u32 flags; __aligned_u64 hard_limit_pkts; }; enum ib_uverbs_read_counters_flags { /* prefer read values from driver cache */ IB_UVERBS_READ_COUNTERS_PREFER_CACHED = 1 << 0, }; enum ib_uverbs_advise_mr_advice { IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH, IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_WRITE, IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT, }; enum ib_uverbs_advise_mr_flag { IB_UVERBS_ADVISE_MR_FLAG_FLUSH = 1 << 0, }; struct ib_uverbs_query_port_resp_ex { struct ib_uverbs_query_port_resp legacy_resp; __u16 port_cap_flags2; __u8 reserved[2]; __u32 active_speed_ex; }; struct ib_uverbs_qp_cap { __u32 max_send_wr; __u32 max_recv_wr; __u32 max_send_sge; __u32 max_recv_sge; __u32 max_inline_data; }; enum rdma_driver_id { RDMA_DRIVER_UNKNOWN, RDMA_DRIVER_MLX5, RDMA_DRIVER_MLX4, RDMA_DRIVER_CXGB3, RDMA_DRIVER_CXGB4, RDMA_DRIVER_MTHCA, RDMA_DRIVER_BNXT_RE, RDMA_DRIVER_OCRDMA, RDMA_DRIVER_NES, RDMA_DRIVER_I40IW, RDMA_DRIVER_IRDMA = RDMA_DRIVER_I40IW, RDMA_DRIVER_VMW_PVRDMA, RDMA_DRIVER_QEDR, RDMA_DRIVER_HNS, RDMA_DRIVER_USNIC, RDMA_DRIVER_RXE, RDMA_DRIVER_HFI1, RDMA_DRIVER_QIB, RDMA_DRIVER_EFA, RDMA_DRIVER_SIW, RDMA_DRIVER_ERDMA, RDMA_DRIVER_MANA, }; enum ib_uverbs_gid_type { IB_UVERBS_GID_TYPE_IB, IB_UVERBS_GID_TYPE_ROCE_V1, IB_UVERBS_GID_TYPE_ROCE_V2, }; struct ib_uverbs_gid_entry { __aligned_u64 gid[2]; __u32 gid_index; __u32 port_num; __u32 gid_type; __u32 netdev_ifindex; /* It is 0 if there is no netdev associated with it */ }; #endif rdma-core-56.1/kernel-headers/rdma/ib_user_mad.h000066400000000000000000000205221477342711600215710ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */ /* * Copyright (c) 2004 Topspin Communications. All rights reserved. * Copyright (c) 2005 Voltaire, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
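/*
 * A minimal compile-time sketch (assuming C11) of what the
 * optional-access arithmetic earlier in this header evaluates to:
 * IB_UVERBS_ACCESS_OPTIONAL_RANGE covers exactly bits 20..29.
 */
_Static_assert(IB_UVERBS_ACCESS_OPTIONAL_RANGE == 0x3ff00000,
	       "optional MR access flags span bits 20..29");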
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef IB_USER_MAD_H #define IB_USER_MAD_H #include <linux/types.h> #include <rdma/rdma_user_ioctl.h> /* * Increment this value if any changes that break userspace ABI * compatibility are made. */ #define IB_USER_MAD_ABI_VERSION 5 /* * Make sure that all structs defined in this file remain laid out so * that they pack the same way on 32-bit and 64-bit architectures (to * avoid incompatibility between 32-bit userspace and 64-bit kernels). */ /** * ib_user_mad_hdr_old - Old version of MAD packet header without pkey_index * @id - ID of agent MAD received with/to be sent with * @status - 0 on successful receive, ETIMEDOUT if no response * received (transaction ID in data[] will be set to TID of original * request) (ignored on send) * @timeout_ms - Milliseconds to wait for response (unset on receive) * @retries - Number of automatic retries to attempt * @qpn - Remote QP number received from/to be sent to * @qkey - Remote Q_Key to be sent with (unset on receive) * @lid - Remote lid received from/to be sent to * @sl - Service level received with/to be sent with * @path_bits - Local path bits received with/to be sent with * @grh_present - If set, GRH was received/should be sent * @gid_index - Local GID index to send with (unset on receive) * @hop_limit - Hop limit in GRH * @traffic_class - Traffic class in GRH * @gid - Remote GID in GRH * @flow_label - Flow label in GRH */ struct ib_user_mad_hdr_old { __u32 id; __u32 status; __u32 timeout_ms; __u32 retries; __u32 length; __be32 qpn; __be32 qkey; __be16 lid; __u8 sl; __u8 path_bits; __u8 grh_present; __u8 gid_index; __u8 hop_limit; __u8 traffic_class; __u8 gid[16]; __be32 flow_label; }; /** * ib_user_mad_hdr - MAD packet header * This layout allows specifying/receiving the P_Key index. To use * this capability, an application must call the * IB_USER_MAD_ENABLE_PKEY ioctl on the user MAD file handle before * any other actions with the file handle. 
* @id - ID of agent MAD received with/to be sent with * @status - 0 on successful receive, ETIMEDOUT if no response * received (transaction ID in data[] will be set to TID of original * request) (ignored on send) * @timeout_ms - Milliseconds to wait for response (unset on receive) * @retries - Number of automatic retries to attempt * @qpn - Remote QP number received from/to be sent to * @qkey - Remote Q_Key to be sent with (unset on receive) * @lid - Remote lid received from/to be sent to * @sl - Service level received with/to be sent with * @path_bits - Local path bits received with/to be sent with * @grh_present - If set, GRH was received/should be sent * @gid_index - Local GID index to send with (unset on receive) * @hop_limit - Hop limit in GRH * @traffic_class - Traffic class in GRH * @gid - Remote GID in GRH * @flow_label - Flow label in GRH * @pkey_index - P_Key index */ struct ib_user_mad_hdr { __u32 id; __u32 status; __u32 timeout_ms; __u32 retries; __u32 length; __be32 qpn; __be32 qkey; __be16 lid; __u8 sl; __u8 path_bits; __u8 grh_present; __u8 gid_index; __u8 hop_limit; __u8 traffic_class; __u8 gid[16]; __be32 flow_label; __u16 pkey_index; __u8 reserved[6]; }; /** * ib_user_mad - MAD packet * @hdr - MAD packet header * @data - Contents of MAD * */ struct ib_user_mad { struct ib_user_mad_hdr hdr; __aligned_u64 data[]; }; /* * Earlier versions of this interface definition declared the * method_mask[] member as an array of __u32 but treated it as a * bitmap made up of longs in the kernel. This ambiguity meant that * 32-bit big-endian applications that can run on both 32-bit and * 64-bit kernels had no consistent ABI to rely on, and 64-bit * big-endian applications that treated method_mask as being made up * of 32-bit words would have their bitmap misinterpreted. * * To clear up this confusion, we change the declaration of * method_mask[] to use unsigned long and handle the conversion from * 32-bit userspace to 64-bit kernel for big-endian systems in the * compat_ioctl method. Unfortunately, to keep the structure layout * the same, we need the method_mask[] array to be aligned only to 4 * bytes even when long is 64 bits, which forces us into this ugly * typedef. */ typedef unsigned long __attribute__((aligned(4))) packed_ulong; #define IB_USER_MAD_LONGS_PER_METHOD_MASK (128 / (8 * sizeof (long))) /** * ib_user_mad_reg_req - MAD registration request * @id - Set by the kernel; used to identify agent in future requests. * @qpn - Queue pair number; must be 0 or 1. * @method_mask - The caller will receive unsolicited MADs for any method * where @method_mask = 1. * @mgmt_class - Indicates which management class of MADs should be receive * by the caller. This field is only required if the user wishes to * receive unsolicited MADs, otherwise it should be 0. * @mgmt_class_version - Indicates which version of MADs for the given * management class to receive. * @oui: Indicates IEEE OUI when mgmt_class is a vendor class * in the range from 0x30 to 0x4f. Otherwise not used. * @rmpp_version: If set, indicates the RMPP version used. * */ struct ib_user_mad_reg_req { __u32 id; packed_ulong method_mask[IB_USER_MAD_LONGS_PER_METHOD_MASK]; __u8 qpn; __u8 mgmt_class; __u8 mgmt_class_version; __u8 oui[3]; __u8 rmpp_version; }; /** * ib_user_mad_reg_req2 - MAD registration request * * @id - Set by the _kernel_; used by userspace to identify the * registered agent in future requests. * @qpn - Queue pair number; must be 0 or 1. 
* @mgmt_class - Indicates which management class of MADs should be * receive by the caller. This field is only required if * the user wishes to receive unsolicited MADs, otherwise * it should be 0. * @mgmt_class_version - Indicates which version of MADs for the given * management class to receive. * @res - Ignored. * @flags - additional registration flags; Must be in the set of * flags defined in IB_USER_MAD_REG_FLAGS_CAP * @method_mask - The caller wishes to receive unsolicited MADs for the * methods whose bit(s) is(are) set. * @oui - Indicates IEEE OUI to use when mgmt_class is a vendor * class in the range from 0x30 to 0x4f. Otherwise not * used. * @rmpp_version - If set, indicates the RMPP version to use. */ enum { IB_USER_MAD_USER_RMPP = (1 << 0), }; #define IB_USER_MAD_REG_FLAGS_CAP (IB_USER_MAD_USER_RMPP) struct ib_user_mad_reg_req2 { __u32 id; __u32 qpn; __u8 mgmt_class; __u8 mgmt_class_version; __u16 res; __u32 flags; __aligned_u64 method_mask[2]; __u32 oui; __u8 rmpp_version; __u8 reserved[3]; }; #endif /* IB_USER_MAD_H */ rdma-core-56.1/kernel-headers/rdma/ib_user_sa.h000066400000000000000000000044011477342711600214310ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */ /* * Copyright (c) 2005 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
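/*
 * A minimal sketch of driving method_mask[]: as the packed_ulong note
 * above explains, the mask is a bitmap made up of longs, so a method
 * number is addressed long-by-long. The helper name is hypothetical.
 */
static inline void ib_umad_req_set_method(struct ib_user_mad_reg_req *req,
					  unsigned int method)
{
	req->method_mask[method / (8 * sizeof(unsigned long))] |=
		1UL << (method % (8 * sizeof(unsigned long)));
}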
*/ #ifndef IB_USER_SA_H #define IB_USER_SA_H #include <linux/types.h> enum { IB_PATH_GMP = 1, IB_PATH_PRIMARY = (1<<1), IB_PATH_ALTERNATE = (1<<2), IB_PATH_OUTBOUND = (1<<3), IB_PATH_INBOUND = (1<<4), IB_PATH_INBOUND_REVERSE = (1<<5), IB_PATH_BIDIRECTIONAL = IB_PATH_OUTBOUND | IB_PATH_INBOUND_REVERSE }; struct ib_path_rec_data { __u32 flags; __u32 reserved; __u32 path_rec[16]; }; struct ib_user_path_rec { __u8 dgid[16]; __u8 sgid[16]; __be16 dlid; __be16 slid; __u32 raw_traffic; __be32 flow_label; __u32 reversible; __u32 mtu; __be16 pkey; __u8 hop_limit; __u8 traffic_class; __u8 numb_path; __u8 sl; __u8 mtu_selector; __u8 rate_selector; __u8 rate; __u8 packet_life_time_selector; __u8 packet_life_time; __u8 preference; }; #endif /* IB_USER_SA_H */ rdma-core-56.1/kernel-headers/rdma/ib_user_verbs.h000066400000000000000000000704451477342711600221600ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */ /* * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005, 2006 Cisco Systems. All rights reserved. * Copyright (c) 2005 PathScale, Inc. All rights reserved. * Copyright (c) 2006 Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef IB_USER_VERBS_H #define IB_USER_VERBS_H #include <linux/types.h> /* * Increment this value if any changes that break userspace ABI * compatibility are made. 
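/*
 * A minimal sketch, assuming an array of struct ib_path_rec_data such as
 * the rdma_cm interfaces hand back: each entry's flags field carries the
 * IB_PATH_* bits defined above, so callers filter on them. The helper
 * name is hypothetical.
 */
static inline const struct ib_path_rec_data *
ib_path_find_primary(const struct ib_path_rec_data *recs, int nrecs)
{
	int i;

	for (i = 0; i < nrecs; i++)
		if (recs[i].flags & IB_PATH_PRIMARY)
			return &recs[i];
	return (const struct ib_path_rec_data *)0;
}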
*/ #define IB_USER_VERBS_ABI_VERSION 6 #define IB_USER_VERBS_CMD_THRESHOLD 50 enum ib_uverbs_write_cmds { IB_USER_VERBS_CMD_GET_CONTEXT, IB_USER_VERBS_CMD_QUERY_DEVICE, IB_USER_VERBS_CMD_QUERY_PORT, IB_USER_VERBS_CMD_ALLOC_PD, IB_USER_VERBS_CMD_DEALLOC_PD, IB_USER_VERBS_CMD_CREATE_AH, IB_USER_VERBS_CMD_MODIFY_AH, IB_USER_VERBS_CMD_QUERY_AH, IB_USER_VERBS_CMD_DESTROY_AH, IB_USER_VERBS_CMD_REG_MR, IB_USER_VERBS_CMD_REG_SMR, IB_USER_VERBS_CMD_REREG_MR, IB_USER_VERBS_CMD_QUERY_MR, IB_USER_VERBS_CMD_DEREG_MR, IB_USER_VERBS_CMD_ALLOC_MW, IB_USER_VERBS_CMD_BIND_MW, IB_USER_VERBS_CMD_DEALLOC_MW, IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL, IB_USER_VERBS_CMD_CREATE_CQ, IB_USER_VERBS_CMD_RESIZE_CQ, IB_USER_VERBS_CMD_DESTROY_CQ, IB_USER_VERBS_CMD_POLL_CQ, IB_USER_VERBS_CMD_PEEK_CQ, IB_USER_VERBS_CMD_REQ_NOTIFY_CQ, IB_USER_VERBS_CMD_CREATE_QP, IB_USER_VERBS_CMD_QUERY_QP, IB_USER_VERBS_CMD_MODIFY_QP, IB_USER_VERBS_CMD_DESTROY_QP, IB_USER_VERBS_CMD_POST_SEND, IB_USER_VERBS_CMD_POST_RECV, IB_USER_VERBS_CMD_ATTACH_MCAST, IB_USER_VERBS_CMD_DETACH_MCAST, IB_USER_VERBS_CMD_CREATE_SRQ, IB_USER_VERBS_CMD_MODIFY_SRQ, IB_USER_VERBS_CMD_QUERY_SRQ, IB_USER_VERBS_CMD_DESTROY_SRQ, IB_USER_VERBS_CMD_POST_SRQ_RECV, IB_USER_VERBS_CMD_OPEN_XRCD, IB_USER_VERBS_CMD_CLOSE_XRCD, IB_USER_VERBS_CMD_CREATE_XSRQ, IB_USER_VERBS_CMD_OPEN_QP, }; enum { IB_USER_VERBS_EX_CMD_QUERY_DEVICE = IB_USER_VERBS_CMD_QUERY_DEVICE, IB_USER_VERBS_EX_CMD_CREATE_CQ = IB_USER_VERBS_CMD_CREATE_CQ, IB_USER_VERBS_EX_CMD_CREATE_QP = IB_USER_VERBS_CMD_CREATE_QP, IB_USER_VERBS_EX_CMD_MODIFY_QP = IB_USER_VERBS_CMD_MODIFY_QP, IB_USER_VERBS_EX_CMD_CREATE_FLOW = IB_USER_VERBS_CMD_THRESHOLD, IB_USER_VERBS_EX_CMD_DESTROY_FLOW, IB_USER_VERBS_EX_CMD_CREATE_WQ, IB_USER_VERBS_EX_CMD_MODIFY_WQ, IB_USER_VERBS_EX_CMD_DESTROY_WQ, IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL, IB_USER_VERBS_EX_CMD_DESTROY_RWQ_IND_TBL, IB_USER_VERBS_EX_CMD_MODIFY_CQ }; /* see IBA A19.4.1.1 Placement Types */ enum ib_placement_type { IB_FLUSH_GLOBAL = 1U << 0, IB_FLUSH_PERSISTENT = 1U << 1, }; /* see IBA A19.4.1.2 Selectivity Level */ enum ib_selectivity_level { IB_FLUSH_RANGE = 0, IB_FLUSH_MR, }; /* * Make sure that all structs defined in this file remain laid out so * that they pack the same way on 32-bit and 64-bit architectures (to * avoid incompatibility between 32-bit userspace and 64-bit kernels). * Specifically: * - Do not use pointer types -- pass pointers in __u64 instead. * - Make sure that any structure larger than 4 bytes is padded to a * multiple of 8 bytes. Otherwise the structure size will be * different between 32-bit and 64-bit architectures. */ struct ib_uverbs_async_event_desc { __aligned_u64 element; __u32 event_type; /* enum ib_event_type */ __u32 reserved; }; struct ib_uverbs_comp_event_desc { __aligned_u64 cq_handle; }; struct ib_uverbs_cq_moderation_caps { __u16 max_cq_moderation_count; __u16 max_cq_moderation_period; __u32 reserved; }; /* * All commands from userspace should start with a __u32 command field * followed by __u16 in_words and out_words fields (which give the * length of the command block and response buffer if any in 32-bit * words). The kernel driver will read these fields first and read * the rest of the command struct based on these value. 
*/ #define IB_USER_VERBS_CMD_COMMAND_MASK 0xff #define IB_USER_VERBS_CMD_FLAG_EXTENDED 0x80000000u struct ib_uverbs_cmd_hdr { __u32 command; __u16 in_words; __u16 out_words; }; struct ib_uverbs_ex_cmd_hdr { __aligned_u64 response; __u16 provider_in_words; __u16 provider_out_words; __u32 cmd_hdr_reserved; }; struct ib_uverbs_get_context { __aligned_u64 response; __aligned_u64 driver_data[]; }; struct ib_uverbs_get_context_resp { __u32 async_fd; __u32 num_comp_vectors; __aligned_u64 driver_data[]; }; struct ib_uverbs_query_device { __aligned_u64 response; __aligned_u64 driver_data[]; }; struct ib_uverbs_query_device_resp { __aligned_u64 fw_ver; __be64 node_guid; __be64 sys_image_guid; __aligned_u64 max_mr_size; __aligned_u64 page_size_cap; __u32 vendor_id; __u32 vendor_part_id; __u32 hw_ver; __u32 max_qp; __u32 max_qp_wr; __u32 device_cap_flags; __u32 max_sge; __u32 max_sge_rd; __u32 max_cq; __u32 max_cqe; __u32 max_mr; __u32 max_pd; __u32 max_qp_rd_atom; __u32 max_ee_rd_atom; __u32 max_res_rd_atom; __u32 max_qp_init_rd_atom; __u32 max_ee_init_rd_atom; __u32 atomic_cap; __u32 max_ee; __u32 max_rdd; __u32 max_mw; __u32 max_raw_ipv6_qp; __u32 max_raw_ethy_qp; __u32 max_mcast_grp; __u32 max_mcast_qp_attach; __u32 max_total_mcast_qp_attach; __u32 max_ah; __u32 max_fmr; __u32 max_map_per_fmr; __u32 max_srq; __u32 max_srq_wr; __u32 max_srq_sge; __u16 max_pkeys; __u8 local_ca_ack_delay; __u8 phys_port_cnt; __u8 reserved[4]; }; struct ib_uverbs_ex_query_device { __u32 comp_mask; __u32 reserved; }; struct ib_uverbs_odp_caps { __aligned_u64 general_caps; struct { __u32 rc_odp_caps; __u32 uc_odp_caps; __u32 ud_odp_caps; } per_transport_caps; __u32 reserved; }; struct ib_uverbs_rss_caps { /* Corresponding bit will be set if qp type from * 'enum ib_qp_type' is supported, e.g. 
* supported_qpts |= 1 << IB_QPT_UD */ __u32 supported_qpts; __u32 max_rwq_indirection_tables; __u32 max_rwq_indirection_table_size; __u32 reserved; }; struct ib_uverbs_tm_caps { /* Max size of rendezvous request message */ __u32 max_rndv_hdr_size; /* Max number of entries in tag matching list */ __u32 max_num_tags; /* TM flags */ __u32 flags; /* Max number of outstanding list operations */ __u32 max_ops; /* Max number of SGE in tag matching entry */ __u32 max_sge; __u32 reserved; }; struct ib_uverbs_ex_query_device_resp { struct ib_uverbs_query_device_resp base; __u32 comp_mask; __u32 response_length; struct ib_uverbs_odp_caps odp_caps; __aligned_u64 timestamp_mask; __aligned_u64 hca_core_clock; /* in KHZ */ __aligned_u64 device_cap_flags_ex; struct ib_uverbs_rss_caps rss_caps; __u32 max_wq_type_rq; __u32 raw_packet_caps; struct ib_uverbs_tm_caps tm_caps; struct ib_uverbs_cq_moderation_caps cq_moderation_caps; __aligned_u64 max_dm_size; __u32 xrc_odp_caps; __u32 reserved; }; struct ib_uverbs_query_port { __aligned_u64 response; __u8 port_num; __u8 reserved[7]; __aligned_u64 driver_data[]; }; struct ib_uverbs_query_port_resp { __u32 port_cap_flags; /* see ib_uverbs_query_port_cap_flags */ __u32 max_msg_sz; __u32 bad_pkey_cntr; __u32 qkey_viol_cntr; __u32 gid_tbl_len; __u16 pkey_tbl_len; __u16 lid; __u16 sm_lid; __u8 state; __u8 max_mtu; __u8 active_mtu; __u8 lmc; __u8 max_vl_num; __u8 sm_sl; __u8 subnet_timeout; __u8 init_type_reply; __u8 active_width; __u8 active_speed; __u8 phys_state; __u8 link_layer; __u8 flags; /* see ib_uverbs_query_port_flags */ __u8 reserved; }; struct ib_uverbs_alloc_pd { __aligned_u64 response; __aligned_u64 driver_data[]; }; struct ib_uverbs_alloc_pd_resp { __u32 pd_handle; __u32 driver_data[]; }; struct ib_uverbs_dealloc_pd { __u32 pd_handle; }; struct ib_uverbs_open_xrcd { __aligned_u64 response; __u32 fd; __u32 oflags; __aligned_u64 driver_data[]; }; struct ib_uverbs_open_xrcd_resp { __u32 xrcd_handle; __u32 driver_data[]; }; struct ib_uverbs_close_xrcd { __u32 xrcd_handle; }; struct ib_uverbs_reg_mr { __aligned_u64 response; __aligned_u64 start; __aligned_u64 length; __aligned_u64 hca_va; __u32 pd_handle; __u32 access_flags; __aligned_u64 driver_data[]; }; struct ib_uverbs_reg_mr_resp { __u32 mr_handle; __u32 lkey; __u32 rkey; __u32 driver_data[]; }; struct ib_uverbs_rereg_mr { __aligned_u64 response; __u32 mr_handle; __u32 flags; __aligned_u64 start; __aligned_u64 length; __aligned_u64 hca_va; __u32 pd_handle; __u32 access_flags; __aligned_u64 driver_data[]; }; struct ib_uverbs_rereg_mr_resp { __u32 lkey; __u32 rkey; __aligned_u64 driver_data[]; }; struct ib_uverbs_dereg_mr { __u32 mr_handle; }; struct ib_uverbs_alloc_mw { __aligned_u64 response; __u32 pd_handle; __u8 mw_type; __u8 reserved[3]; __aligned_u64 driver_data[]; }; struct ib_uverbs_alloc_mw_resp { __u32 mw_handle; __u32 rkey; __aligned_u64 driver_data[]; }; struct ib_uverbs_dealloc_mw { __u32 mw_handle; }; struct ib_uverbs_create_comp_channel { __aligned_u64 response; }; struct ib_uverbs_create_comp_channel_resp { __u32 fd; }; struct ib_uverbs_create_cq { __aligned_u64 response; __aligned_u64 user_handle; __u32 cqe; __u32 comp_vector; __s32 comp_channel; __u32 reserved; __aligned_u64 driver_data[]; }; enum ib_uverbs_ex_create_cq_flags { IB_UVERBS_CQ_FLAGS_TIMESTAMP_COMPLETION = 1 << 0, IB_UVERBS_CQ_FLAGS_IGNORE_OVERRUN = 1 << 1, }; struct ib_uverbs_ex_create_cq { __aligned_u64 user_handle; __u32 cqe; __u32 comp_vector; __s32 comp_channel; __u32 comp_mask; __u32 flags; /* bitmask of 
ib_uverbs_ex_create_cq_flags */ __u32 reserved; }; struct ib_uverbs_create_cq_resp { __u32 cq_handle; __u32 cqe; __aligned_u64 driver_data[0]; }; struct ib_uverbs_ex_create_cq_resp { struct ib_uverbs_create_cq_resp base; __u32 comp_mask; __u32 response_length; }; struct ib_uverbs_resize_cq { __aligned_u64 response; __u32 cq_handle; __u32 cqe; __aligned_u64 driver_data[]; }; struct ib_uverbs_resize_cq_resp { __u32 cqe; __u32 reserved; __aligned_u64 driver_data[]; }; struct ib_uverbs_poll_cq { __aligned_u64 response; __u32 cq_handle; __u32 ne; }; enum ib_uverbs_wc_opcode { IB_UVERBS_WC_SEND = 0, IB_UVERBS_WC_RDMA_WRITE = 1, IB_UVERBS_WC_RDMA_READ = 2, IB_UVERBS_WC_COMP_SWAP = 3, IB_UVERBS_WC_FETCH_ADD = 4, IB_UVERBS_WC_BIND_MW = 5, IB_UVERBS_WC_LOCAL_INV = 6, IB_UVERBS_WC_TSO = 7, IB_UVERBS_WC_FLUSH = 8, IB_UVERBS_WC_ATOMIC_WRITE = 9, }; struct ib_uverbs_wc { __aligned_u64 wr_id; __u32 status; __u32 opcode; __u32 vendor_err; __u32 byte_len; union { __be32 imm_data; __u32 invalidate_rkey; } ex; __u32 qp_num; __u32 src_qp; __u32 wc_flags; __u16 pkey_index; __u16 slid; __u8 sl; __u8 dlid_path_bits; __u8 port_num; __u8 reserved; }; struct ib_uverbs_poll_cq_resp { __u32 count; __u32 reserved; struct ib_uverbs_wc wc[]; }; struct ib_uverbs_req_notify_cq { __u32 cq_handle; __u32 solicited_only; }; struct ib_uverbs_destroy_cq { __aligned_u64 response; __u32 cq_handle; __u32 reserved; }; struct ib_uverbs_destroy_cq_resp { __u32 comp_events_reported; __u32 async_events_reported; }; struct ib_uverbs_global_route { __u8 dgid[16]; __u32 flow_label; __u8 sgid_index; __u8 hop_limit; __u8 traffic_class; __u8 reserved; }; struct ib_uverbs_ah_attr { struct ib_uverbs_global_route grh; __u16 dlid; __u8 sl; __u8 src_path_bits; __u8 static_rate; __u8 is_global; __u8 port_num; __u8 reserved; }; struct ib_uverbs_qp_attr { __u32 qp_attr_mask; __u32 qp_state; __u32 cur_qp_state; __u32 path_mtu; __u32 path_mig_state; __u32 qkey; __u32 rq_psn; __u32 sq_psn; __u32 dest_qp_num; __u32 qp_access_flags; struct ib_uverbs_ah_attr ah_attr; struct ib_uverbs_ah_attr alt_ah_attr; /* ib_qp_cap */ __u32 max_send_wr; __u32 max_recv_wr; __u32 max_send_sge; __u32 max_recv_sge; __u32 max_inline_data; __u16 pkey_index; __u16 alt_pkey_index; __u8 en_sqd_async_notify; __u8 sq_draining; __u8 max_rd_atomic; __u8 max_dest_rd_atomic; __u8 min_rnr_timer; __u8 port_num; __u8 timeout; __u8 retry_cnt; __u8 rnr_retry; __u8 alt_port_num; __u8 alt_timeout; __u8 reserved[5]; }; struct ib_uverbs_create_qp { __aligned_u64 response; __aligned_u64 user_handle; __u32 pd_handle; __u32 send_cq_handle; __u32 recv_cq_handle; __u32 srq_handle; __u32 max_send_wr; __u32 max_recv_wr; __u32 max_send_sge; __u32 max_recv_sge; __u32 max_inline_data; __u8 sq_sig_all; __u8 qp_type; __u8 is_srq; __u8 reserved; __aligned_u64 driver_data[]; }; enum ib_uverbs_create_qp_mask { IB_UVERBS_CREATE_QP_MASK_IND_TABLE = 1UL << 0, }; enum { IB_UVERBS_CREATE_QP_SUP_COMP_MASK = IB_UVERBS_CREATE_QP_MASK_IND_TABLE, }; struct ib_uverbs_ex_create_qp { __aligned_u64 user_handle; __u32 pd_handle; __u32 send_cq_handle; __u32 recv_cq_handle; __u32 srq_handle; __u32 max_send_wr; __u32 max_recv_wr; __u32 max_send_sge; __u32 max_recv_sge; __u32 max_inline_data; __u8 sq_sig_all; __u8 qp_type; __u8 is_srq; __u8 reserved; __u32 comp_mask; __u32 create_flags; __u32 rwq_ind_tbl_handle; __u32 source_qpn; }; struct ib_uverbs_open_qp { __aligned_u64 response; __aligned_u64 user_handle; __u32 pd_handle; __u32 qpn; __u8 qp_type; __u8 reserved[7]; __aligned_u64 driver_data[]; }; /* also used for open 
response */ struct ib_uverbs_create_qp_resp { __u32 qp_handle; __u32 qpn; __u32 max_send_wr; __u32 max_recv_wr; __u32 max_send_sge; __u32 max_recv_sge; __u32 max_inline_data; __u32 reserved; __u32 driver_data[0]; }; struct ib_uverbs_ex_create_qp_resp { struct ib_uverbs_create_qp_resp base; __u32 comp_mask; __u32 response_length; }; /* * This struct needs to remain a multiple of 8 bytes to keep the * alignment of the modify QP parameters. */ struct ib_uverbs_qp_dest { __u8 dgid[16]; __u32 flow_label; __u16 dlid; __u16 reserved; __u8 sgid_index; __u8 hop_limit; __u8 traffic_class; __u8 sl; __u8 src_path_bits; __u8 static_rate; __u8 is_global; __u8 port_num; }; struct ib_uverbs_query_qp { __aligned_u64 response; __u32 qp_handle; __u32 attr_mask; __aligned_u64 driver_data[]; }; struct ib_uverbs_query_qp_resp { struct ib_uverbs_qp_dest dest; struct ib_uverbs_qp_dest alt_dest; __u32 max_send_wr; __u32 max_recv_wr; __u32 max_send_sge; __u32 max_recv_sge; __u32 max_inline_data; __u32 qkey; __u32 rq_psn; __u32 sq_psn; __u32 dest_qp_num; __u32 qp_access_flags; __u16 pkey_index; __u16 alt_pkey_index; __u8 qp_state; __u8 cur_qp_state; __u8 path_mtu; __u8 path_mig_state; __u8 sq_draining; __u8 max_rd_atomic; __u8 max_dest_rd_atomic; __u8 min_rnr_timer; __u8 port_num; __u8 timeout; __u8 retry_cnt; __u8 rnr_retry; __u8 alt_port_num; __u8 alt_timeout; __u8 sq_sig_all; __u8 reserved[5]; __aligned_u64 driver_data[]; }; struct ib_uverbs_modify_qp { struct ib_uverbs_qp_dest dest; struct ib_uverbs_qp_dest alt_dest; __u32 qp_handle; __u32 attr_mask; __u32 qkey; __u32 rq_psn; __u32 sq_psn; __u32 dest_qp_num; __u32 qp_access_flags; __u16 pkey_index; __u16 alt_pkey_index; __u8 qp_state; __u8 cur_qp_state; __u8 path_mtu; __u8 path_mig_state; __u8 en_sqd_async_notify; __u8 max_rd_atomic; __u8 max_dest_rd_atomic; __u8 min_rnr_timer; __u8 port_num; __u8 timeout; __u8 retry_cnt; __u8 rnr_retry; __u8 alt_port_num; __u8 alt_timeout; __u8 reserved[2]; __aligned_u64 driver_data[0]; }; struct ib_uverbs_ex_modify_qp { struct ib_uverbs_modify_qp base; __u32 rate_limit; __u32 reserved; }; struct ib_uverbs_ex_modify_qp_resp { __u32 comp_mask; __u32 response_length; }; struct ib_uverbs_destroy_qp { __aligned_u64 response; __u32 qp_handle; __u32 reserved; }; struct ib_uverbs_destroy_qp_resp { __u32 events_reported; }; /* * The ib_uverbs_sge structure isn't used anywhere, since we assume * the ib_sge structure is packed the same way on 32-bit and 64-bit * architectures in both kernel and user space. It's just here to * document the ABI. 
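/*
 * A minimal compile-time sketch (assuming C11) of the sizing rule stated
 * above: ib_uverbs_qp_dest must remain a multiple of 8 bytes so the
 * modify QP command lays out identically on 32-bit and 64-bit ABIs.
 */
_Static_assert(sizeof(struct ib_uverbs_qp_dest) % 8 == 0,
	       "ib_uverbs_qp_dest must stay padded to a multiple of 8 bytes");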
*/ struct ib_uverbs_sge { __aligned_u64 addr; __u32 length; __u32 lkey; }; enum ib_uverbs_wr_opcode { IB_UVERBS_WR_RDMA_WRITE = 0, IB_UVERBS_WR_RDMA_WRITE_WITH_IMM = 1, IB_UVERBS_WR_SEND = 2, IB_UVERBS_WR_SEND_WITH_IMM = 3, IB_UVERBS_WR_RDMA_READ = 4, IB_UVERBS_WR_ATOMIC_CMP_AND_SWP = 5, IB_UVERBS_WR_ATOMIC_FETCH_AND_ADD = 6, IB_UVERBS_WR_LOCAL_INV = 7, IB_UVERBS_WR_BIND_MW = 8, IB_UVERBS_WR_SEND_WITH_INV = 9, IB_UVERBS_WR_TSO = 10, IB_UVERBS_WR_RDMA_READ_WITH_INV = 11, IB_UVERBS_WR_MASKED_ATOMIC_CMP_AND_SWP = 12, IB_UVERBS_WR_MASKED_ATOMIC_FETCH_AND_ADD = 13, IB_UVERBS_WR_FLUSH = 14, IB_UVERBS_WR_ATOMIC_WRITE = 15, /* Review enum ib_wr_opcode before modifying this */ }; struct ib_uverbs_send_wr { __aligned_u64 wr_id; __u32 num_sge; __u32 opcode; /* see enum ib_uverbs_wr_opcode */ __u32 send_flags; union { __be32 imm_data; __u32 invalidate_rkey; } ex; union { struct { __aligned_u64 remote_addr; __u32 rkey; __u32 reserved; } rdma; struct { __aligned_u64 remote_addr; __aligned_u64 compare_add; __aligned_u64 swap; __u32 rkey; __u32 reserved; } atomic; struct { __u32 ah; __u32 remote_qpn; __u32 remote_qkey; __u32 reserved; } ud; } wr; }; struct ib_uverbs_post_send { __aligned_u64 response; __u32 qp_handle; __u32 wr_count; __u32 sge_count; __u32 wqe_size; struct ib_uverbs_send_wr send_wr[]; }; struct ib_uverbs_post_send_resp { __u32 bad_wr; }; struct ib_uverbs_recv_wr { __aligned_u64 wr_id; __u32 num_sge; __u32 reserved; }; struct ib_uverbs_post_recv { __aligned_u64 response; __u32 qp_handle; __u32 wr_count; __u32 sge_count; __u32 wqe_size; struct ib_uverbs_recv_wr recv_wr[]; }; struct ib_uverbs_post_recv_resp { __u32 bad_wr; }; struct ib_uverbs_post_srq_recv { __aligned_u64 response; __u32 srq_handle; __u32 wr_count; __u32 sge_count; __u32 wqe_size; struct ib_uverbs_recv_wr recv[]; }; struct ib_uverbs_post_srq_recv_resp { __u32 bad_wr; }; struct ib_uverbs_create_ah { __aligned_u64 response; __aligned_u64 user_handle; __u32 pd_handle; __u32 reserved; struct ib_uverbs_ah_attr attr; __aligned_u64 driver_data[]; }; struct ib_uverbs_create_ah_resp { __u32 ah_handle; __u32 driver_data[]; }; struct ib_uverbs_destroy_ah { __u32 ah_handle; }; struct ib_uverbs_attach_mcast { __u8 gid[16]; __u32 qp_handle; __u16 mlid; __u16 reserved; __aligned_u64 driver_data[]; }; struct ib_uverbs_detach_mcast { __u8 gid[16]; __u32 qp_handle; __u16 mlid; __u16 reserved; __aligned_u64 driver_data[]; }; struct ib_uverbs_flow_spec_hdr { __u32 type; __u16 size; __u16 reserved; /* followed by flow_spec */ __aligned_u64 flow_spec_data[0]; }; struct ib_uverbs_flow_eth_filter { __u8 dst_mac[6]; __u8 src_mac[6]; __be16 ether_type; __be16 vlan_tag; }; struct ib_uverbs_flow_spec_eth { union { struct ib_uverbs_flow_spec_hdr hdr; struct { __u32 type; __u16 size; __u16 reserved; }; }; struct ib_uverbs_flow_eth_filter val; struct ib_uverbs_flow_eth_filter mask; }; struct ib_uverbs_flow_ipv4_filter { __be32 src_ip; __be32 dst_ip; __u8 proto; __u8 tos; __u8 ttl; __u8 flags; }; struct ib_uverbs_flow_spec_ipv4 { union { struct ib_uverbs_flow_spec_hdr hdr; struct { __u32 type; __u16 size; __u16 reserved; }; }; struct ib_uverbs_flow_ipv4_filter val; struct ib_uverbs_flow_ipv4_filter mask; }; struct ib_uverbs_flow_tcp_udp_filter { __be16 dst_port; __be16 src_port; }; struct ib_uverbs_flow_spec_tcp_udp { union { struct ib_uverbs_flow_spec_hdr hdr; struct { __u32 type; __u16 size; __u16 reserved; }; }; struct ib_uverbs_flow_tcp_udp_filter val; struct ib_uverbs_flow_tcp_udp_filter mask; }; struct ib_uverbs_flow_ipv6_filter { __u8 src_ip[16]; 
__u8 dst_ip[16]; __be32 flow_label; __u8 next_hdr; __u8 traffic_class; __u8 hop_limit; __u8 reserved; }; struct ib_uverbs_flow_spec_ipv6 { union { struct ib_uverbs_flow_spec_hdr hdr; struct { __u32 type; __u16 size; __u16 reserved; }; }; struct ib_uverbs_flow_ipv6_filter val; struct ib_uverbs_flow_ipv6_filter mask; }; struct ib_uverbs_flow_spec_action_tag { union { struct ib_uverbs_flow_spec_hdr hdr; struct { __u32 type; __u16 size; __u16 reserved; }; }; __u32 tag_id; __u32 reserved1; }; struct ib_uverbs_flow_spec_action_drop { union { struct ib_uverbs_flow_spec_hdr hdr; struct { __u32 type; __u16 size; __u16 reserved; }; }; }; struct ib_uverbs_flow_spec_action_handle { union { struct ib_uverbs_flow_spec_hdr hdr; struct { __u32 type; __u16 size; __u16 reserved; }; }; __u32 handle; __u32 reserved1; }; struct ib_uverbs_flow_spec_action_count { union { struct ib_uverbs_flow_spec_hdr hdr; struct { __u32 type; __u16 size; __u16 reserved; }; }; __u32 handle; __u32 reserved1; }; struct ib_uverbs_flow_tunnel_filter { __be32 tunnel_id; }; struct ib_uverbs_flow_spec_tunnel { union { struct ib_uverbs_flow_spec_hdr hdr; struct { __u32 type; __u16 size; __u16 reserved; }; }; struct ib_uverbs_flow_tunnel_filter val; struct ib_uverbs_flow_tunnel_filter mask; }; struct ib_uverbs_flow_spec_esp_filter { __u32 spi; __u32 seq; }; struct ib_uverbs_flow_spec_esp { union { struct ib_uverbs_flow_spec_hdr hdr; struct { __u32 type; __u16 size; __u16 reserved; }; }; struct ib_uverbs_flow_spec_esp_filter val; struct ib_uverbs_flow_spec_esp_filter mask; }; struct ib_uverbs_flow_gre_filter { /* c_ks_res0_ver field is bits 0-15 in offset 0 of a standard GRE header: * bit 0 - C - checksum bit. * bit 1 - reserved. set to 0. * bit 2 - key bit. * bit 3 - sequence number bit. * bits 4:12 - reserved. set to 0. * bits 13:15 - GRE version. */ __be16 c_ks_res0_ver; __be16 protocol; __be32 key; }; struct ib_uverbs_flow_spec_gre { union { struct ib_uverbs_flow_spec_hdr hdr; struct { __u32 type; __u16 size; __u16 reserved; }; }; struct ib_uverbs_flow_gre_filter val; struct ib_uverbs_flow_gre_filter mask; }; struct ib_uverbs_flow_mpls_filter { /* The field includes the entire MPLS label: * bits 0:19 - label field. * bits 20:22 - traffic class field. * bits 23 - bottom of stack bit. * bits 24:31 - ttl field. 
*/ __be32 label; }; struct ib_uverbs_flow_spec_mpls { union { struct ib_uverbs_flow_spec_hdr hdr; struct { __u32 type; __u16 size; __u16 reserved; }; }; struct ib_uverbs_flow_mpls_filter val; struct ib_uverbs_flow_mpls_filter mask; }; struct ib_uverbs_flow_attr { __u32 type; __u16 size; __u16 priority; __u8 num_of_specs; __u8 reserved[2]; __u8 port; __u32 flags; /* Following are the optional layers according to user request * struct ib_flow_spec_xxx * struct ib_flow_spec_yyy */ struct ib_uverbs_flow_spec_hdr flow_specs[]; }; struct ib_uverbs_create_flow { __u32 comp_mask; __u32 qp_handle; struct ib_uverbs_flow_attr flow_attr; }; struct ib_uverbs_create_flow_resp { __u32 comp_mask; __u32 flow_handle; }; struct ib_uverbs_destroy_flow { __u32 comp_mask; __u32 flow_handle; }; struct ib_uverbs_create_srq { __aligned_u64 response; __aligned_u64 user_handle; __u32 pd_handle; __u32 max_wr; __u32 max_sge; __u32 srq_limit; __aligned_u64 driver_data[]; }; struct ib_uverbs_create_xsrq { __aligned_u64 response; __aligned_u64 user_handle; __u32 srq_type; __u32 pd_handle; __u32 max_wr; __u32 max_sge; __u32 srq_limit; __u32 max_num_tags; __u32 xrcd_handle; __u32 cq_handle; __aligned_u64 driver_data[]; }; struct ib_uverbs_create_srq_resp { __u32 srq_handle; __u32 max_wr; __u32 max_sge; __u32 srqn; __u32 driver_data[]; }; struct ib_uverbs_modify_srq { __u32 srq_handle; __u32 attr_mask; __u32 max_wr; __u32 srq_limit; __aligned_u64 driver_data[]; }; struct ib_uverbs_query_srq { __aligned_u64 response; __u32 srq_handle; __u32 reserved; __aligned_u64 driver_data[]; }; struct ib_uverbs_query_srq_resp { __u32 max_wr; __u32 max_sge; __u32 srq_limit; __u32 reserved; }; struct ib_uverbs_destroy_srq { __aligned_u64 response; __u32 srq_handle; __u32 reserved; }; struct ib_uverbs_destroy_srq_resp { __u32 events_reported; }; struct ib_uverbs_ex_create_wq { __u32 comp_mask; __u32 wq_type; __aligned_u64 user_handle; __u32 pd_handle; __u32 cq_handle; __u32 max_wr; __u32 max_sge; __u32 create_flags; /* Use enum ib_wq_flags */ __u32 reserved; }; struct ib_uverbs_ex_create_wq_resp { __u32 comp_mask; __u32 response_length; __u32 wq_handle; __u32 max_wr; __u32 max_sge; __u32 wqn; }; struct ib_uverbs_ex_destroy_wq { __u32 comp_mask; __u32 wq_handle; }; struct ib_uverbs_ex_destroy_wq_resp { __u32 comp_mask; __u32 response_length; __u32 events_reported; __u32 reserved; }; struct ib_uverbs_ex_modify_wq { __u32 attr_mask; __u32 wq_handle; __u32 wq_state; __u32 curr_wq_state; __u32 flags; /* Use enum ib_wq_flags */ __u32 flags_mask; /* Use enum ib_wq_flags */ }; /* Prevent memory allocation rather than max expected size */ #define IB_USER_VERBS_MAX_LOG_IND_TBL_SIZE 0x0d struct ib_uverbs_ex_create_rwq_ind_table { __u32 comp_mask; __u32 log_ind_tbl_size; /* Following are the wq handles according to log_ind_tbl_size * wq_handle1 * wq_handle2 */ __u32 wq_handles[]; }; struct ib_uverbs_ex_create_rwq_ind_table_resp { __u32 comp_mask; __u32 response_length; __u32 ind_tbl_handle; __u32 ind_tbl_num; }; struct ib_uverbs_ex_destroy_rwq_ind_table { __u32 comp_mask; __u32 ind_tbl_handle; }; struct ib_uverbs_cq_moderation { __u16 cq_count; __u16 cq_period; }; struct ib_uverbs_ex_modify_cq { __u32 cq_handle; __u32 attr_mask; struct ib_uverbs_cq_moderation attr; __u32 reserved; }; #define IB_DEVICE_NAME_MAX 64 /* * bits 9, 15, 16, 19, 22, 27, 30, 31, 32, 33, 35 and 37 may be set by old * kernels and should not be used. 
*/ enum ib_uverbs_device_cap_flags { IB_UVERBS_DEVICE_RESIZE_MAX_WR = 1 << 0, IB_UVERBS_DEVICE_BAD_PKEY_CNTR = 1 << 1, IB_UVERBS_DEVICE_BAD_QKEY_CNTR = 1 << 2, IB_UVERBS_DEVICE_RAW_MULTI = 1 << 3, IB_UVERBS_DEVICE_AUTO_PATH_MIG = 1 << 4, IB_UVERBS_DEVICE_CHANGE_PHY_PORT = 1 << 5, IB_UVERBS_DEVICE_UD_AV_PORT_ENFORCE = 1 << 6, IB_UVERBS_DEVICE_CURR_QP_STATE_MOD = 1 << 7, IB_UVERBS_DEVICE_SHUTDOWN_PORT = 1 << 8, /* IB_UVERBS_DEVICE_INIT_TYPE = 1 << 9, (not in use) */ IB_UVERBS_DEVICE_PORT_ACTIVE_EVENT = 1 << 10, IB_UVERBS_DEVICE_SYS_IMAGE_GUID = 1 << 11, IB_UVERBS_DEVICE_RC_RNR_NAK_GEN = 1 << 12, IB_UVERBS_DEVICE_SRQ_RESIZE = 1 << 13, IB_UVERBS_DEVICE_N_NOTIFY_CQ = 1 << 14, IB_UVERBS_DEVICE_MEM_WINDOW = 1 << 17, IB_UVERBS_DEVICE_UD_IP_CSUM = 1 << 18, IB_UVERBS_DEVICE_XRC = 1 << 20, IB_UVERBS_DEVICE_MEM_MGT_EXTENSIONS = 1 << 21, IB_UVERBS_DEVICE_MEM_WINDOW_TYPE_2A = 1 << 23, IB_UVERBS_DEVICE_MEM_WINDOW_TYPE_2B = 1 << 24, IB_UVERBS_DEVICE_RC_IP_CSUM = 1 << 25, /* Deprecated. Please use IB_UVERBS_RAW_PACKET_CAP_IP_CSUM. */ IB_UVERBS_DEVICE_RAW_IP_CSUM = 1 << 26, IB_UVERBS_DEVICE_MANAGED_FLOW_STEERING = 1 << 29, /* Deprecated. Please use IB_UVERBS_RAW_PACKET_CAP_SCATTER_FCS. */ IB_UVERBS_DEVICE_RAW_SCATTER_FCS = 1ULL << 34, IB_UVERBS_DEVICE_PCI_WRITE_END_PADDING = 1ULL << 36, /* Flush placement types */ IB_UVERBS_DEVICE_FLUSH_GLOBAL = 1ULL << 38, IB_UVERBS_DEVICE_FLUSH_PERSISTENT = 1ULL << 39, /* Atomic write attributes */ IB_UVERBS_DEVICE_ATOMIC_WRITE = 1ULL << 40, }; enum ib_uverbs_raw_packet_caps { IB_UVERBS_RAW_PACKET_CAP_CVLAN_STRIPPING = 1 << 0, IB_UVERBS_RAW_PACKET_CAP_SCATTER_FCS = 1 << 1, IB_UVERBS_RAW_PACKET_CAP_IP_CSUM = 1 << 2, IB_UVERBS_RAW_PACKET_CAP_DELAY_DROP = 1 << 3, }; #endif /* IB_USER_VERBS_H */ rdma-core-56.1/kernel-headers/rdma/irdma-abi.h000066400000000000000000000045121477342711600211460ustar00rootroot00000000000000/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB */ /* * Copyright (c) 2006 - 2021 Intel Corporation. All rights reserved. * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005 Cisco Systems. All rights reserved. * Copyright (c) 2005 Open Grid Computing, Inc. All rights reserved. 
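/*
 * A minimal sketch, assuming an extended query-device response named
 * resp: capability bits at position 32 and above are defined with 1ULL,
 * so they are only visible in the 64-bit device_cap_flags_ex field, not
 * in the legacy 32-bit device_cap_flags. The helper name is hypothetical.
 */
static inline int ib_dev_has_atomic_write(const struct ib_uverbs_ex_query_device_resp *resp)
{
	return !!(resp->device_cap_flags_ex & IB_UVERBS_DEVICE_ATOMIC_WRITE);
}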
*/ #ifndef IRDMA_ABI_H #define IRDMA_ABI_H #include <linux/types.h> /* irdma must support legacy GEN_1 i40iw kernel * and user-space whose last ABI ver is 5 */ #define IRDMA_ABI_VER 5 enum irdma_memreg_type { IRDMA_MEMREG_TYPE_MEM = 0, IRDMA_MEMREG_TYPE_QP = 1, IRDMA_MEMREG_TYPE_CQ = 2, }; enum { IRDMA_ALLOC_UCTX_USE_RAW_ATTR = 1 << 0, IRDMA_ALLOC_UCTX_MIN_HW_WQ_SIZE = 1 << 1, }; struct irdma_alloc_ucontext_req { __u32 rsvd32; __u8 userspace_ver; __u8 rsvd8[3]; __aligned_u64 comp_mask; }; struct irdma_alloc_ucontext_resp { __u32 max_pds; __u32 max_qps; __u32 wq_size; /* size of the WQs (SQ+RQ) in the mmaped area */ __u8 kernel_ver; __u8 rsvd[3]; __aligned_u64 feature_flags; __aligned_u64 db_mmap_key; __u32 max_hw_wq_frags; __u32 max_hw_read_sges; __u32 max_hw_inline; __u32 max_hw_rq_quanta; __u32 max_hw_wq_quanta; __u32 min_hw_cq_size; __u32 max_hw_cq_size; __u16 max_hw_sq_chunk; __u8 hw_rev; __u8 rsvd2; __aligned_u64 comp_mask; __u16 min_hw_wq_size; __u8 rsvd3[6]; }; struct irdma_alloc_pd_resp { __u32 pd_id; __u8 rsvd[4]; }; struct irdma_resize_cq_req { __aligned_u64 user_cq_buffer; }; struct irdma_create_cq_req { __aligned_u64 user_cq_buf; __aligned_u64 user_shadow_area; }; struct irdma_create_qp_req { __aligned_u64 user_wqe_bufs; __aligned_u64 user_compl_ctx; }; struct irdma_mem_reg_req { __u16 reg_type; /* enum irdma_memreg_type */ __u16 cq_pages; __u16 rq_pages; __u16 sq_pages; }; struct irdma_modify_qp_req { __u8 sq_flush; __u8 rq_flush; __u8 rsvd[6]; }; struct irdma_create_cq_resp { __u32 cq_id; __u32 cq_size; }; struct irdma_create_qp_resp { __u32 qp_id; __u32 actual_sq_size; __u32 actual_rq_size; __u32 irdma_drv_opt; __u16 push_idx; __u8 lsmm; __u8 rsvd; __u32 qp_caps; }; struct irdma_modify_qp_resp { __aligned_u64 push_wqe_mmap_key; __aligned_u64 push_db_mmap_key; __u16 push_offset; __u8 push_valid; __u8 rsvd[5]; }; struct irdma_create_ah_resp { __u32 ah_id; __u8 rsvd[4]; }; #endif /* IRDMA_ABI_H */ rdma-core-56.1/kernel-headers/rdma/mana-abi.h000066400000000000000000000027521477342711600207720ustar00rootroot00000000000000/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) */ /* * Copyright (c) 2022, Microsoft Corporation. All rights reserved. */ #ifndef MANA_ABI_USER_H #define MANA_ABI_USER_H #include <linux/types.h> #include <rdma/ib_user_ioctl_verbs.h> /* * Increment this value if any changes that break userspace ABI * compatibility are made. 
*/ #define MANA_IB_UVERBS_ABI_VERSION 1 enum mana_ib_create_cq_flags { MANA_IB_CREATE_RNIC_CQ = 1 << 0, }; struct mana_ib_create_cq { __aligned_u64 buf_addr; __u16 flags; __u16 reserved0; __u32 reserved1; }; struct mana_ib_create_cq_resp { __u32 cqid; __u32 reserved; }; struct mana_ib_create_qp { __aligned_u64 sq_buf_addr; __u32 sq_buf_size; __u32 port; }; struct mana_ib_create_qp_resp { __u32 sqid; __u32 cqid; __u32 tx_vp_offset; __u32 reserved; }; struct mana_ib_create_rc_qp { __aligned_u64 queue_buf[4]; __u32 queue_size[4]; }; struct mana_ib_create_rc_qp_resp { __u32 queue_id[4]; }; struct mana_ib_create_wq { __aligned_u64 wq_buf_addr; __u32 wq_buf_size; __u32 reserved; }; /* RX Hash function flags */ enum mana_ib_rx_hash_function_flags { MANA_IB_RX_HASH_FUNC_TOEPLITZ = 1 << 0, }; struct mana_ib_create_qp_rss { __aligned_u64 rx_hash_fields_mask; __u8 rx_hash_function; __u8 reserved[7]; __u32 rx_hash_key_len; __u8 rx_hash_key[40]; __u32 port; }; struct rss_resp_entry { __u32 cqid; __u32 wqid; }; struct mana_ib_create_qp_rss_resp { __aligned_u64 num_entries; struct rss_resp_entry entries[64]; }; #endif rdma-core-56.1/kernel-headers/rdma/mlx4-abi.h000066400000000000000000000117751477342711600207470ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */ /* * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved. * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef MLX4_ABI_USER_H #define MLX4_ABI_USER_H #include <linux/types.h> /* * Increment this value if any changes that break userspace ABI * compatibility are made. */ #define MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION 3 #define MLX4_IB_UVERBS_ABI_VERSION 4 /* * Make sure that all structs defined in this file remain laid out so * that they pack the same way on 32-bit and 64-bit architectures (to * avoid incompatibility between 32-bit userspace and 64-bit kernels). * In particular do not use pointer types -- pass pointers in __u64 * instead.
*/ struct mlx4_ib_alloc_ucontext_resp_v3 { __u32 qp_tab_size; __u16 bf_reg_size; __u16 bf_regs_per_page; }; enum { MLX4_USER_DEV_CAP_LARGE_CQE = 1L << 0, }; struct mlx4_ib_alloc_ucontext_resp { __u32 dev_caps; __u32 qp_tab_size; __u16 bf_reg_size; __u16 bf_regs_per_page; __u32 cqe_size; }; struct mlx4_ib_alloc_pd_resp { __u32 pdn; __u32 reserved; }; struct mlx4_ib_create_cq { __aligned_u64 buf_addr; __aligned_u64 db_addr; }; struct mlx4_ib_create_cq_resp { __u32 cqn; __u32 reserved; }; struct mlx4_ib_resize_cq { __aligned_u64 buf_addr; }; struct mlx4_ib_create_srq { __aligned_u64 buf_addr; __aligned_u64 db_addr; }; struct mlx4_ib_create_srq_resp { __u32 srqn; __u32 reserved; }; struct mlx4_ib_create_qp_rss { __aligned_u64 rx_hash_fields_mask; /* Use enum mlx4_ib_rx_hash_fields */ __u8 rx_hash_function; /* Use enum mlx4_ib_rx_hash_function_flags */ __u8 reserved[7]; __u8 rx_hash_key[40]; __u32 comp_mask; __u32 reserved1; }; struct mlx4_ib_create_qp { __aligned_u64 buf_addr; __aligned_u64 db_addr; __u8 log_sq_bb_count; __u8 log_sq_stride; __u8 sq_no_prefetch; __u8 reserved; __u32 inl_recv_sz; }; struct mlx4_ib_create_wq { __aligned_u64 buf_addr; __aligned_u64 db_addr; __u8 log_range_size; __u8 reserved[3]; __u32 comp_mask; }; struct mlx4_ib_modify_wq { __u32 comp_mask; __u32 reserved; }; struct mlx4_ib_create_rwq_ind_tbl_resp { __u32 response_length; __u32 reserved; }; /* RX Hash function flags */ enum mlx4_ib_rx_hash_function_flags { MLX4_IB_RX_HASH_FUNC_TOEPLITZ = 1 << 0, }; /* * RX Hash flags: these flags select which fields of an incoming packet * participate in RX Hash. Each flag represents a certain packet field; * when the flag is set, the field it represents participates in the * RX Hash calculation. */ enum mlx4_ib_rx_hash_fields { MLX4_IB_RX_HASH_SRC_IPV4 = 1 << 0, MLX4_IB_RX_HASH_DST_IPV4 = 1 << 1, MLX4_IB_RX_HASH_SRC_IPV6 = 1 << 2, MLX4_IB_RX_HASH_DST_IPV6 = 1 << 3, MLX4_IB_RX_HASH_SRC_PORT_TCP = 1 << 4, MLX4_IB_RX_HASH_DST_PORT_TCP = 1 << 5, MLX4_IB_RX_HASH_SRC_PORT_UDP = 1 << 6, MLX4_IB_RX_HASH_DST_PORT_UDP = 1 << 7, MLX4_IB_RX_HASH_INNER = 1ULL << 31, }; struct mlx4_ib_rss_caps { __aligned_u64 rx_hash_fields_mask; /* enum mlx4_ib_rx_hash_fields */ __u8 rx_hash_function; /* enum mlx4_ib_rx_hash_function_flags */ __u8 reserved[7]; }; enum query_device_resp_mask { MLX4_IB_QUERY_DEV_RESP_MASK_CORE_CLOCK_OFFSET = 1UL << 0, }; struct mlx4_ib_tso_caps { __u32 max_tso; /* Maximum tso payload size in bytes */ /* Corresponding bit will be set if qp type from * 'enum ib_qp_type' is supported. */ __u32 supported_qpts; }; struct mlx4_uverbs_ex_query_device_resp { __u32 comp_mask; __u32 response_length; __aligned_u64 hca_core_clock_offset; __u32 max_inl_recv_sz; __u32 reserved; struct mlx4_ib_rss_caps rss_caps; struct mlx4_ib_tso_caps tso_caps; }; #endif /* MLX4_ABI_USER_H */ rdma-core-56.1/kernel-headers/rdma/mlx5-abi.h000066400000000000000000000333531477342711600207440ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */ /* * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses.
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef MLX5_ABI_USER_H #define MLX5_ABI_USER_H #include <linux/types.h> #include <linux/if_ether.h> /* For ETH_ALEN. */ #include <rdma/ib_user_ioctl_verbs.h> #include <rdma/mlx5_user_ioctl_verbs.h> enum { MLX5_QP_FLAG_SIGNATURE = 1 << 0, MLX5_QP_FLAG_SCATTER_CQE = 1 << 1, MLX5_QP_FLAG_TUNNEL_OFFLOADS = 1 << 2, MLX5_QP_FLAG_BFREG_INDEX = 1 << 3, MLX5_QP_FLAG_TYPE_DCT = 1 << 4, MLX5_QP_FLAG_TYPE_DCI = 1 << 5, MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_UC = 1 << 6, MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_MC = 1 << 7, MLX5_QP_FLAG_ALLOW_SCATTER_CQE = 1 << 8, MLX5_QP_FLAG_PACKET_BASED_CREDIT_MODE = 1 << 9, MLX5_QP_FLAG_UAR_PAGE_INDEX = 1 << 10, MLX5_QP_FLAG_DCI_STREAM = 1 << 11, }; enum { MLX5_SRQ_FLAG_SIGNATURE = 1 << 0, }; enum { MLX5_WQ_FLAG_SIGNATURE = 1 << 0, }; /* Increment this value if any changes that break userspace ABI * compatibility are made. */ #define MLX5_IB_UVERBS_ABI_VERSION 1 /* Make sure that all structs defined in this file remain laid out so * that they pack the same way on 32-bit and 64-bit architectures (to * avoid incompatibility between 32-bit userspace and 64-bit kernels). * In particular do not use pointer types -- pass pointers in __u64 * instead. */ struct mlx5_ib_alloc_ucontext_req { __u32 total_num_bfregs; __u32 num_low_latency_bfregs; }; enum mlx5_lib_caps { MLX5_LIB_CAP_4K_UAR = (__u64)1 << 0, MLX5_LIB_CAP_DYN_UAR = (__u64)1 << 1, }; enum mlx5_ib_alloc_uctx_v2_flags { MLX5_IB_ALLOC_UCTX_DEVX = 1 << 0, }; struct mlx5_ib_alloc_ucontext_req_v2 { __u32 total_num_bfregs; __u32 num_low_latency_bfregs; __u32 flags; __u32 comp_mask; __u8 max_cqe_version; __u8 reserved0; __u16 reserved1; __u32 reserved2; __aligned_u64 lib_caps; }; enum mlx5_ib_alloc_ucontext_resp_mask { MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_CORE_CLOCK_OFFSET = 1UL << 0, MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_DUMP_FILL_MKEY = 1UL << 1, MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_ECE = 1UL << 2, MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_SQD2RTS = 1UL << 3, MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_REAL_TIME_TS = 1UL << 4, MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_MKEY_UPDATE_TAG = 1UL << 5, }; enum mlx5_user_cmds_supp_uhw { MLX5_USER_CMDS_SUPP_UHW_QUERY_DEVICE = 1 << 0, MLX5_USER_CMDS_SUPP_UHW_CREATE_AH = 1 << 1, }; /* The eth_min_inline response value is set to off-by-one vs the FW * returned value to allow user-space to deal with older kernels.
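 *
 * (Editor's note on the off-by-one rule above: a value of 0,
 * MLX5_USER_INLINE_MODE_NA, is what user-space sees from an older
 * kernel that never fills the field, while a newer kernel reports the
 * FW mode N as N + 1, so FW "none" (0) arrives as
 * MLX5_USER_INLINE_MODE_NONE (1).)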
*/ enum mlx5_user_inline_mode { MLX5_USER_INLINE_MODE_NA, MLX5_USER_INLINE_MODE_NONE, MLX5_USER_INLINE_MODE_L2, MLX5_USER_INLINE_MODE_IP, MLX5_USER_INLINE_MODE_TCP_UDP, }; enum { MLX5_USER_ALLOC_UCONTEXT_FLOW_ACTION_FLAGS_ESP_AES_GCM = 1 << 0, MLX5_USER_ALLOC_UCONTEXT_FLOW_ACTION_FLAGS_ESP_AES_GCM_REQ_METADATA = 1 << 1, MLX5_USER_ALLOC_UCONTEXT_FLOW_ACTION_FLAGS_ESP_AES_GCM_SPI_STEERING = 1 << 2, MLX5_USER_ALLOC_UCONTEXT_FLOW_ACTION_FLAGS_ESP_AES_GCM_FULL_OFFLOAD = 1 << 3, MLX5_USER_ALLOC_UCONTEXT_FLOW_ACTION_FLAGS_ESP_AES_GCM_TX_IV_IS_ESN = 1 << 4, }; struct mlx5_ib_alloc_ucontext_resp { __u32 qp_tab_size; __u32 bf_reg_size; __u32 tot_bfregs; __u32 cache_line_size; __u16 max_sq_desc_sz; __u16 max_rq_desc_sz; __u32 max_send_wqebb; __u32 max_recv_wr; __u32 max_srq_recv_wr; __u16 num_ports; __u16 flow_action_flags; __u32 comp_mask; __u32 response_length; __u8 cqe_version; __u8 cmds_supp_uhw; __u8 eth_min_inline; __u8 clock_info_versions; __aligned_u64 hca_core_clock_offset; __u32 log_uar_size; __u32 num_uars_per_page; __u32 num_dyn_bfregs; __u32 dump_fill_mkey; }; struct mlx5_ib_alloc_pd_resp { __u32 pdn; }; struct mlx5_ib_tso_caps { __u32 max_tso; /* Maximum tso payload size in bytes */ /* Corresponding bit will be set if qp type from * 'enum ib_qp_type' is supported, e.g. * supported_qpts |= 1 << IB_QPT_UD */ __u32 supported_qpts; }; struct mlx5_ib_rss_caps { __aligned_u64 rx_hash_fields_mask; /* enum mlx5_rx_hash_fields */ __u8 rx_hash_function; /* enum mlx5_rx_hash_function_flags */ __u8 reserved[7]; }; enum mlx5_ib_cqe_comp_res_format { MLX5_IB_CQE_RES_FORMAT_HASH = 1 << 0, MLX5_IB_CQE_RES_FORMAT_CSUM = 1 << 1, MLX5_IB_CQE_RES_FORMAT_CSUM_STRIDX = 1 << 2, }; struct mlx5_ib_cqe_comp_caps { __u32 max_num; __u32 supported_format; /* enum mlx5_ib_cqe_comp_res_format */ }; enum mlx5_ib_packet_pacing_cap_flags { MLX5_IB_PP_SUPPORT_BURST = 1 << 0, }; struct mlx5_packet_pacing_caps { __u32 qp_rate_limit_min; __u32 qp_rate_limit_max; /* In kbps */ /* Corresponding bit will be set if qp type from * 'enum ib_qp_type' is supported, e.g. * supported_qpts |= 1 << IB_QPT_RAW_PACKET */ __u32 supported_qpts; __u8 cap_flags; /* enum mlx5_ib_packet_pacing_cap_flags */ __u8 reserved[3]; }; enum mlx5_ib_mpw_caps { MPW_RESERVED = 1 << 0, MLX5_IB_ALLOW_MPW = 1 << 1, MLX5_IB_SUPPORT_EMPW = 1 << 2, }; enum mlx5_ib_sw_parsing_offloads { MLX5_IB_SW_PARSING = 1 << 0, MLX5_IB_SW_PARSING_CSUM = 1 << 1, MLX5_IB_SW_PARSING_LSO = 1 << 2, }; struct mlx5_ib_sw_parsing_caps { __u32 sw_parsing_offloads; /* enum mlx5_ib_sw_parsing_offloads */ /* Corresponding bit will be set if qp type from * 'enum ib_qp_type' is supported, e.g. * supported_qpts |= 1 << IB_QPT_RAW_PACKET */ __u32 supported_qpts; }; struct mlx5_ib_striding_rq_caps { __u32 min_single_stride_log_num_of_bytes; __u32 max_single_stride_log_num_of_bytes; __u32 min_single_wqe_log_num_of_strides; __u32 max_single_wqe_log_num_of_strides; /* Corresponding bit will be set if qp type from * 'enum ib_qp_type' is supported, e.g.
* supported_qpts |= 1 << IB_QPT_RAW_PACKET */ __u32 supported_qpts; __u32 reserved; }; struct mlx5_ib_dci_streams_caps { __u8 max_log_num_concurent; __u8 max_log_num_errored; }; enum mlx5_ib_query_dev_resp_flags { /* Support 128B CQE compression */ MLX5_IB_QUERY_DEV_RESP_FLAGS_CQE_128B_COMP = 1 << 0, MLX5_IB_QUERY_DEV_RESP_FLAGS_CQE_128B_PAD = 1 << 1, MLX5_IB_QUERY_DEV_RESP_PACKET_BASED_CREDIT_MODE = 1 << 2, MLX5_IB_QUERY_DEV_RESP_FLAGS_SCAT2CQE_DCT = 1 << 3, MLX5_IB_QUERY_DEV_RESP_FLAGS_OOO_DP = 1 << 4, }; enum mlx5_ib_tunnel_offloads { MLX5_IB_TUNNELED_OFFLOADS_VXLAN = 1 << 0, MLX5_IB_TUNNELED_OFFLOADS_GRE = 1 << 1, MLX5_IB_TUNNELED_OFFLOADS_GENEVE = 1 << 2, MLX5_IB_TUNNELED_OFFLOADS_MPLS_GRE = 1 << 3, MLX5_IB_TUNNELED_OFFLOADS_MPLS_UDP = 1 << 4, }; struct mlx5_ib_query_device_resp { __u32 comp_mask; __u32 response_length; struct mlx5_ib_tso_caps tso_caps; struct mlx5_ib_rss_caps rss_caps; struct mlx5_ib_cqe_comp_caps cqe_comp_caps; struct mlx5_packet_pacing_caps packet_pacing_caps; __u32 mlx5_ib_support_multi_pkt_send_wqes; __u32 flags; /* Use enum mlx5_ib_query_dev_resp_flags */ struct mlx5_ib_sw_parsing_caps sw_parsing_caps; struct mlx5_ib_striding_rq_caps striding_rq_caps; __u32 tunnel_offloads_caps; /* enum mlx5_ib_tunnel_offloads */ struct mlx5_ib_dci_streams_caps dci_streams_caps; __u16 reserved; struct mlx5_ib_uapi_reg reg_c0; }; enum mlx5_ib_create_cq_flags { MLX5_IB_CREATE_CQ_FLAGS_CQE_128B_PAD = 1 << 0, MLX5_IB_CREATE_CQ_FLAGS_UAR_PAGE_INDEX = 1 << 1, MLX5_IB_CREATE_CQ_FLAGS_REAL_TIME_TS = 1 << 2, }; struct mlx5_ib_create_cq { __aligned_u64 buf_addr; __aligned_u64 db_addr; __u32 cqe_size; __u8 cqe_comp_en; __u8 cqe_comp_res_format; __u16 flags; __u16 uar_page_index; __u16 reserved0; __u32 reserved1; }; struct mlx5_ib_create_cq_resp { __u32 cqn; __u32 reserved; }; struct mlx5_ib_resize_cq { __aligned_u64 buf_addr; __u16 cqe_size; __u16 reserved0; __u32 reserved1; }; struct mlx5_ib_create_srq { __aligned_u64 buf_addr; __aligned_u64 db_addr; __u32 flags; __u32 reserved0; /* explicit padding (optional on i386) */ __u32 uidx; __u32 reserved1; }; struct mlx5_ib_create_srq_resp { __u32 srqn; __u32 reserved; }; struct mlx5_ib_create_qp_dci_streams { __u8 log_num_concurent; __u8 log_num_errored; }; struct mlx5_ib_create_qp { __aligned_u64 buf_addr; __aligned_u64 db_addr; __u32 sq_wqe_count; __u32 rq_wqe_count; __u32 rq_wqe_shift; __u32 flags; __u32 uidx; __u32 bfreg_index; union { __aligned_u64 sq_buf_addr; __aligned_u64 access_key; }; __u32 ece_options; struct mlx5_ib_create_qp_dci_streams dci_streams; __u16 reserved; }; /* RX Hash function flags */ enum mlx5_rx_hash_function_flags { MLX5_RX_HASH_FUNC_TOEPLITZ = 1 << 0, }; /* * RX Hash flags: these flags select which fields of an incoming packet * participate in RX Hash. Each flag represents a certain packet field; * when the flag is set, the field it represents participates in the * RX Hash calculation. * Note: *IPV4 and *IPV6 flags can't be enabled together on the same QP * and *TCP and *UDP flags can't be enabled together on the same QP.
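 *
 * For example, MLX5_RX_HASH_SRC_IPV4 | MLX5_RX_HASH_DST_IPV4 |
 * MLX5_RX_HASH_SRC_PORT_TCP | MLX5_RX_HASH_DST_PORT_TCP is a valid
 * IPv4/TCP 4-tuple mask, while mixing *IPV4 with *IPV6 bits (or *TCP
 * with *UDP bits) on one QP is not.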
*/ enum mlx5_rx_hash_fields { MLX5_RX_HASH_SRC_IPV4 = 1 << 0, MLX5_RX_HASH_DST_IPV4 = 1 << 1, MLX5_RX_HASH_SRC_IPV6 = 1 << 2, MLX5_RX_HASH_DST_IPV6 = 1 << 3, MLX5_RX_HASH_SRC_PORT_TCP = 1 << 4, MLX5_RX_HASH_DST_PORT_TCP = 1 << 5, MLX5_RX_HASH_SRC_PORT_UDP = 1 << 6, MLX5_RX_HASH_DST_PORT_UDP = 1 << 7, MLX5_RX_HASH_IPSEC_SPI = 1 << 8, /* Save bits for future fields */ MLX5_RX_HASH_INNER = (1UL << 31), }; struct mlx5_ib_create_qp_rss { __aligned_u64 rx_hash_fields_mask; /* enum mlx5_rx_hash_fields */ __u8 rx_hash_function; /* enum mlx5_rx_hash_function_flags */ __u8 rx_key_len; /* valid only for Toeplitz */ __u8 reserved[6]; __u8 rx_hash_key[128]; /* valid only for Toeplitz */ __u32 comp_mask; __u32 flags; }; enum mlx5_ib_create_qp_resp_mask { MLX5_IB_CREATE_QP_RESP_MASK_TIRN = 1UL << 0, MLX5_IB_CREATE_QP_RESP_MASK_TISN = 1UL << 1, MLX5_IB_CREATE_QP_RESP_MASK_RQN = 1UL << 2, MLX5_IB_CREATE_QP_RESP_MASK_SQN = 1UL << 3, MLX5_IB_CREATE_QP_RESP_MASK_TIR_ICM_ADDR = 1UL << 4, }; struct mlx5_ib_create_qp_resp { __u32 bfreg_index; __u32 ece_options; __u32 comp_mask; __u32 tirn; __u32 tisn; __u32 rqn; __u32 sqn; __u32 reserved1; __u64 tir_icm_addr; }; struct mlx5_ib_alloc_mw { __u32 comp_mask; __u8 num_klms; __u8 reserved1; __u16 reserved2; }; enum mlx5_ib_create_wq_mask { MLX5_IB_CREATE_WQ_STRIDING_RQ = (1 << 0), }; struct mlx5_ib_create_wq { __aligned_u64 buf_addr; __aligned_u64 db_addr; __u32 rq_wqe_count; __u32 rq_wqe_shift; __u32 user_index; __u32 flags; __u32 comp_mask; __u32 single_stride_log_num_of_bytes; __u32 single_wqe_log_num_of_strides; __u32 two_byte_shift_en; }; struct mlx5_ib_create_ah_resp { __u32 response_length; __u8 dmac[ETH_ALEN]; __u8 reserved[6]; }; struct mlx5_ib_burst_info { __u32 max_burst_sz; __u16 typical_pkt_sz; __u16 reserved; }; enum mlx5_ib_modify_qp_mask { MLX5_IB_MODIFY_QP_OOO_DP = 1 << 0, }; struct mlx5_ib_modify_qp { __u32 comp_mask; struct mlx5_ib_burst_info burst_info; __u32 ece_options; }; struct mlx5_ib_modify_qp_resp { __u32 response_length; __u32 dctn; __u32 ece_options; __u32 reserved; }; struct mlx5_ib_create_wq_resp { __u32 response_length; __u32 reserved; }; struct mlx5_ib_create_rwq_ind_tbl_resp { __u32 response_length; __u32 reserved; }; struct mlx5_ib_modify_wq { __u32 comp_mask; __u32 reserved; }; struct mlx5_ib_clock_info { __u32 sign; __u32 resv; __aligned_u64 nsec; __aligned_u64 cycles; __aligned_u64 frac; __u32 mult; __u32 shift; __aligned_u64 mask; __aligned_u64 overflow_period; }; enum mlx5_ib_mmap_cmd { MLX5_IB_MMAP_REGULAR_PAGE = 0, MLX5_IB_MMAP_GET_CONTIGUOUS_PAGES = 1, MLX5_IB_MMAP_WC_PAGE = 2, MLX5_IB_MMAP_NC_PAGE = 3, /* 5 is chosen in order to be compatible with old versions of libmlx5 */ MLX5_IB_MMAP_CORE_CLOCK = 5, MLX5_IB_MMAP_ALLOC_WC = 6, MLX5_IB_MMAP_CLOCK_INFO = 7, MLX5_IB_MMAP_DEVICE_MEM = 8, }; enum { MLX5_IB_CLOCK_INFO_KERNEL_UPDATING = 1, }; /* Bit indexes for the mlx5_alloc_ucontext_resp.clock_info_versions bitmap */ enum { MLX5_IB_CLOCK_INFO_V1 = 0, }; struct mlx5_ib_flow_counters_desc { __u32 description; __u32 index; }; struct mlx5_ib_flow_counters_data { RDMA_UAPI_PTR(struct mlx5_ib_flow_counters_desc *, counters_data); __u32 ncounters; __u32 reserved; }; struct mlx5_ib_create_flow { __u32 ncounters_data; __u32 reserved; /* * Following are counters data based on ncounters_data, each * entry in the data[] should match a corresponding counter object * that was pointed by a counters spec upon the flow creation */ struct mlx5_ib_flow_counters_data data[]; }; #endif /* MLX5_ABI_USER_H */ 
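/*
 * Editor's illustration, not part of the kernel headers above: a
 * minimal sketch of how user-space might fill struct
 * mlx5_ib_create_qp_rss from mlx5-abi.h. The helper name, the IPv4/TCP
 * 4-tuple field choice and the key-length handling are assumptions made
 * for the example, not requirements of the ABI.
 */
#include <string.h>
/* Including <rdma/mlx5-abi.h> would provide the struct and flags used here. */

static void mlx5_fill_rss_cmd_example(struct mlx5_ib_create_qp_rss *cmd,
				      const __u8 *key, __u8 key_len)
{
	memset(cmd, 0, sizeof(*cmd));
	/* Hash on the IPv4/TCP 4-tuple; IPv4 and IPv6 bits must not mix. */
	cmd->rx_hash_fields_mask = MLX5_RX_HASH_SRC_IPV4 |
				   MLX5_RX_HASH_DST_IPV4 |
				   MLX5_RX_HASH_SRC_PORT_TCP |
				   MLX5_RX_HASH_DST_PORT_TCP;
	cmd->rx_hash_function = MLX5_RX_HASH_FUNC_TOEPLITZ;
	/* rx_key_len and rx_hash_key are valid only for Toeplitz. */
	if (key_len > sizeof(cmd->rx_hash_key))
		key_len = sizeof(cmd->rx_hash_key);
	cmd->rx_key_len = key_len;
	memcpy(cmd->rx_hash_key, key, key_len);
}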
rdma-core-56.1/kernel-headers/rdma/mlx5_user_ioctl_cmds.h000066400000000000000000000256331477342711600234510ustar00rootroot00000000000000/* * Copyright (c) 2018, Mellanox Technologies inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef MLX5_USER_IOCTL_CMDS_H #define MLX5_USER_IOCTL_CMDS_H #include <linux/types.h> #include <rdma/ib_user_ioctl_cmds.h> enum mlx5_ib_create_flow_action_attrs { /* This attribute belongs to the driver namespace */ MLX5_IB_ATTR_CREATE_FLOW_ACTION_FLAGS = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_dm_methods { MLX5_IB_METHOD_DM_MAP_OP_ADDR = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_METHOD_DM_QUERY, }; enum mlx5_ib_dm_map_op_addr_attrs { MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_OP, MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_START_OFFSET, MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_PAGE_INDEX, }; enum mlx5_ib_query_dm_attrs { MLX5_IB_ATTR_QUERY_DM_REQ_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_QUERY_DM_RESP_START_OFFSET, MLX5_IB_ATTR_QUERY_DM_RESP_PAGE_INDEX, MLX5_IB_ATTR_QUERY_DM_RESP_LENGTH, }; enum mlx5_ib_alloc_dm_attrs { MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX, MLX5_IB_ATTR_ALLOC_DM_REQ_TYPE, }; enum mlx5_ib_devx_methods { MLX5_IB_METHOD_DEVX_OTHER = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_METHOD_DEVX_QUERY_UAR, MLX5_IB_METHOD_DEVX_QUERY_EQN, MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT, }; enum mlx5_ib_devx_other_attrs { MLX5_IB_ATTR_DEVX_OTHER_CMD_IN = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_DEVX_OTHER_CMD_OUT, }; enum mlx5_ib_devx_obj_create_attrs { MLX5_IB_ATTR_DEVX_OBJ_CREATE_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_DEVX_OBJ_CREATE_CMD_IN, MLX5_IB_ATTR_DEVX_OBJ_CREATE_CMD_OUT, }; enum mlx5_ib_devx_query_uar_attrs { MLX5_IB_ATTR_DEVX_QUERY_UAR_USER_IDX = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_DEVX_QUERY_UAR_DEV_IDX, }; enum mlx5_ib_devx_obj_destroy_attrs { MLX5_IB_ATTR_DEVX_OBJ_DESTROY_HANDLE = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_devx_obj_modify_attrs { MLX5_IB_ATTR_DEVX_OBJ_MODIFY_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_IN, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_OUT, }; enum mlx5_ib_devx_obj_query_attrs { MLX5_IB_ATTR_DEVX_OBJ_QUERY_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_IN,
MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_OUT, }; enum mlx5_ib_devx_obj_query_async_attrs { MLX5_IB_ATTR_DEVX_OBJ_QUERY_ASYNC_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_DEVX_OBJ_QUERY_ASYNC_CMD_IN, MLX5_IB_ATTR_DEVX_OBJ_QUERY_ASYNC_FD, MLX5_IB_ATTR_DEVX_OBJ_QUERY_ASYNC_WR_ID, MLX5_IB_ATTR_DEVX_OBJ_QUERY_ASYNC_OUT_LEN, }; enum mlx5_ib_devx_subscribe_event_attrs { MLX5_IB_ATTR_DEVX_SUBSCRIBE_EVENT_FD_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_DEVX_SUBSCRIBE_EVENT_OBJ_HANDLE, MLX5_IB_ATTR_DEVX_SUBSCRIBE_EVENT_TYPE_NUM_LIST, MLX5_IB_ATTR_DEVX_SUBSCRIBE_EVENT_FD_NUM, MLX5_IB_ATTR_DEVX_SUBSCRIBE_EVENT_COOKIE, }; enum mlx5_ib_devx_query_eqn_attrs { MLX5_IB_ATTR_DEVX_QUERY_EQN_USER_VEC = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_DEVX_QUERY_EQN_DEV_EQN, }; enum mlx5_ib_devx_obj_methods { MLX5_IB_METHOD_DEVX_OBJ_CREATE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_METHOD_DEVX_OBJ_DESTROY, MLX5_IB_METHOD_DEVX_OBJ_MODIFY, MLX5_IB_METHOD_DEVX_OBJ_QUERY, MLX5_IB_METHOD_DEVX_OBJ_ASYNC_QUERY, }; enum mlx5_ib_var_alloc_attrs { MLX5_IB_ATTR_VAR_OBJ_ALLOC_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_VAR_OBJ_ALLOC_MMAP_OFFSET, MLX5_IB_ATTR_VAR_OBJ_ALLOC_MMAP_LENGTH, MLX5_IB_ATTR_VAR_OBJ_ALLOC_PAGE_ID, }; enum mlx5_ib_var_obj_destroy_attrs { MLX5_IB_ATTR_VAR_OBJ_DESTROY_HANDLE = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_var_obj_methods { MLX5_IB_METHOD_VAR_OBJ_ALLOC = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_METHOD_VAR_OBJ_DESTROY, }; enum mlx5_ib_uar_alloc_attrs { MLX5_IB_ATTR_UAR_OBJ_ALLOC_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_UAR_OBJ_ALLOC_TYPE, MLX5_IB_ATTR_UAR_OBJ_ALLOC_MMAP_OFFSET, MLX5_IB_ATTR_UAR_OBJ_ALLOC_MMAP_LENGTH, MLX5_IB_ATTR_UAR_OBJ_ALLOC_PAGE_ID, }; enum mlx5_ib_uar_obj_destroy_attrs { MLX5_IB_ATTR_UAR_OBJ_DESTROY_HANDLE = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_uar_obj_methods { MLX5_IB_METHOD_UAR_OBJ_ALLOC = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_METHOD_UAR_OBJ_DESTROY, }; enum mlx5_ib_devx_umem_reg_attrs { MLX5_IB_ATTR_DEVX_UMEM_REG_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_DEVX_UMEM_REG_ADDR, MLX5_IB_ATTR_DEVX_UMEM_REG_LEN, MLX5_IB_ATTR_DEVX_UMEM_REG_ACCESS, MLX5_IB_ATTR_DEVX_UMEM_REG_OUT_ID, MLX5_IB_ATTR_DEVX_UMEM_REG_PGSZ_BITMAP, MLX5_IB_ATTR_DEVX_UMEM_REG_DMABUF_FD, }; enum mlx5_ib_devx_umem_dereg_attrs { MLX5_IB_ATTR_DEVX_UMEM_DEREG_HANDLE = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_pp_obj_methods { MLX5_IB_METHOD_PP_OBJ_ALLOC = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_METHOD_PP_OBJ_DESTROY, }; enum mlx5_ib_pp_alloc_attrs { MLX5_IB_ATTR_PP_OBJ_ALLOC_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_PP_OBJ_ALLOC_CTX, MLX5_IB_ATTR_PP_OBJ_ALLOC_FLAGS, MLX5_IB_ATTR_PP_OBJ_ALLOC_INDEX, }; enum mlx5_ib_pp_obj_destroy_attrs { MLX5_IB_ATTR_PP_OBJ_DESTROY_HANDLE = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_devx_umem_methods { MLX5_IB_METHOD_DEVX_UMEM_REG = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_METHOD_DEVX_UMEM_DEREG, }; enum mlx5_ib_devx_async_cmd_fd_alloc_attrs { MLX5_IB_ATTR_DEVX_ASYNC_CMD_FD_ALLOC_HANDLE = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_devx_async_event_fd_alloc_attrs { MLX5_IB_ATTR_DEVX_ASYNC_EVENT_FD_ALLOC_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_DEVX_ASYNC_EVENT_FD_ALLOC_FLAGS, }; enum mlx5_ib_devx_async_cmd_fd_methods { MLX5_IB_METHOD_DEVX_ASYNC_CMD_FD_ALLOC = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_devx_async_event_fd_methods { MLX5_IB_METHOD_DEVX_ASYNC_EVENT_FD_ALLOC = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_objects { MLX5_IB_OBJECT_DEVX = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_OBJECT_DEVX_OBJ, MLX5_IB_OBJECT_DEVX_UMEM, 
MLX5_IB_OBJECT_FLOW_MATCHER, MLX5_IB_OBJECT_DEVX_ASYNC_CMD_FD, MLX5_IB_OBJECT_DEVX_ASYNC_EVENT_FD, MLX5_IB_OBJECT_VAR, MLX5_IB_OBJECT_PP, MLX5_IB_OBJECT_UAR, MLX5_IB_OBJECT_STEERING_ANCHOR, }; enum mlx5_ib_flow_matcher_create_attrs { MLX5_IB_ATTR_FLOW_MATCHER_CREATE_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_FLOW_MATCHER_MATCH_MASK, MLX5_IB_ATTR_FLOW_MATCHER_FLOW_TYPE, MLX5_IB_ATTR_FLOW_MATCHER_MATCH_CRITERIA, MLX5_IB_ATTR_FLOW_MATCHER_FLOW_FLAGS, MLX5_IB_ATTR_FLOW_MATCHER_FT_TYPE, }; enum mlx5_ib_flow_matcher_destroy_attrs { MLX5_IB_ATTR_FLOW_MATCHER_DESTROY_HANDLE = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_flow_matcher_methods { MLX5_IB_METHOD_FLOW_MATCHER_CREATE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_METHOD_FLOW_MATCHER_DESTROY, }; enum mlx5_ib_flow_steering_anchor_create_attrs { MLX5_IB_ATTR_STEERING_ANCHOR_CREATE_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_STEERING_ANCHOR_FT_TYPE, MLX5_IB_ATTR_STEERING_ANCHOR_PRIORITY, MLX5_IB_ATTR_STEERING_ANCHOR_FT_ID, }; enum mlx5_ib_flow_steering_anchor_destroy_attrs { MLX5_IB_ATTR_STEERING_ANCHOR_DESTROY_HANDLE = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_steering_anchor_methods { MLX5_IB_METHOD_STEERING_ANCHOR_CREATE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_METHOD_STEERING_ANCHOR_DESTROY, }; enum mlx5_ib_device_query_context_attrs { MLX5_IB_ATTR_QUERY_CONTEXT_RESP_UCTX = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_create_cq_attrs { MLX5_IB_ATTR_CREATE_CQ_UAR_INDEX = UVERBS_ID_DRIVER_NS_WITH_UHW, }; enum mlx5_ib_reg_dmabuf_mr_attrs { MLX5_IB_ATTR_REG_DMABUF_MR_ACCESS_FLAGS = (1U << UVERBS_ID_NS_SHIFT), }; #define MLX5_IB_DW_MATCH_PARAM 0xA0 struct mlx5_ib_match_params { __u32 match_params[MLX5_IB_DW_MATCH_PARAM]; }; enum mlx5_ib_flow_type { MLX5_IB_FLOW_TYPE_NORMAL, MLX5_IB_FLOW_TYPE_SNIFFER, MLX5_IB_FLOW_TYPE_ALL_DEFAULT, MLX5_IB_FLOW_TYPE_MC_DEFAULT, }; enum mlx5_ib_create_flow_flags { MLX5_IB_ATTR_CREATE_FLOW_FLAGS_DEFAULT_MISS = 1 << 0, MLX5_IB_ATTR_CREATE_FLOW_FLAGS_DROP = 1 << 1, }; enum mlx5_ib_create_flow_attrs { MLX5_IB_ATTR_CREATE_FLOW_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_CREATE_FLOW_MATCH_VALUE, MLX5_IB_ATTR_CREATE_FLOW_DEST_QP, MLX5_IB_ATTR_CREATE_FLOW_DEST_DEVX, MLX5_IB_ATTR_CREATE_FLOW_MATCHER, MLX5_IB_ATTR_CREATE_FLOW_ARR_FLOW_ACTIONS, MLX5_IB_ATTR_CREATE_FLOW_TAG, MLX5_IB_ATTR_CREATE_FLOW_ARR_COUNTERS_DEVX, MLX5_IB_ATTR_CREATE_FLOW_ARR_COUNTERS_DEVX_OFFSET, MLX5_IB_ATTR_CREATE_FLOW_FLAGS, }; enum mlx5_ib_destroy_flow_attrs { MLX5_IB_ATTR_DESTROY_FLOW_HANDLE = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_flow_methods { MLX5_IB_METHOD_CREATE_FLOW = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_METHOD_DESTROY_FLOW, }; enum mlx5_ib_flow_action_methods { MLX5_IB_METHOD_FLOW_ACTION_CREATE_MODIFY_HEADER = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_METHOD_FLOW_ACTION_CREATE_PACKET_REFORMAT, }; enum mlx5_ib_create_flow_action_create_modify_header_attrs { MLX5_IB_ATTR_CREATE_MODIFY_HEADER_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_CREATE_MODIFY_HEADER_ACTIONS_PRM, MLX5_IB_ATTR_CREATE_MODIFY_HEADER_FT_TYPE, }; enum mlx5_ib_create_flow_action_create_packet_reformat_attrs { MLX5_IB_ATTR_CREATE_PACKET_REFORMAT_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_CREATE_PACKET_REFORMAT_TYPE, MLX5_IB_ATTR_CREATE_PACKET_REFORMAT_FT_TYPE, MLX5_IB_ATTR_CREATE_PACKET_REFORMAT_DATA_BUF, }; enum mlx5_ib_query_pd_attrs { MLX5_IB_ATTR_QUERY_PD_HANDLE = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_QUERY_PD_RESP_PDN, }; enum mlx5_ib_pd_methods { MLX5_IB_METHOD_PD_QUERY = (1U << UVERBS_ID_NS_SHIFT), }; enum mlx5_ib_device_methods { 
MLX5_IB_METHOD_QUERY_PORT = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_METHOD_GET_DATA_DIRECT_SYSFS_PATH, }; enum mlx5_ib_query_port_attrs { MLX5_IB_ATTR_QUERY_PORT_PORT_NUM = (1U << UVERBS_ID_NS_SHIFT), MLX5_IB_ATTR_QUERY_PORT, }; enum mlx5_ib_get_data_direct_sysfs_path_attrs { MLX5_IB_ATTR_GET_DATA_DIRECT_SYSFS_PATH = (1U << UVERBS_ID_NS_SHIFT), }; #endif rdma-core-56.1/kernel-headers/rdma/mlx5_user_ioctl_verbs.h000066400000000000000000000072021477342711600236360ustar00rootroot00000000000000/* * Copyright (c) 2018, Mellanox Technologies inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef MLX5_USER_IOCTL_VERBS_H #define MLX5_USER_IOCTL_VERBS_H #include <linux/types.h> enum mlx5_ib_uapi_flow_action_flags { MLX5_IB_UAPI_FLOW_ACTION_FLAGS_REQUIRE_METADATA = 1 << 0, }; enum mlx5_ib_uapi_flow_table_type { MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_RX = 0x0, MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_TX = 0x1, MLX5_IB_UAPI_FLOW_TABLE_TYPE_FDB = 0x2, MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_RX = 0x3, MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TX = 0x4, }; enum mlx5_ib_uapi_flow_action_packet_reformat_type { MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2 = 0x0, MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL = 0x1, MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L3_TUNNEL_TO_L2 = 0x2, MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL = 0x3, }; enum mlx5_ib_uapi_reg_dmabuf_flags { MLX5_IB_UAPI_REG_DMABUF_ACCESS_DATA_DIRECT = 1 << 0, }; struct mlx5_ib_uapi_devx_async_cmd_hdr { __aligned_u64 wr_id; __u8 out_data[]; }; enum mlx5_ib_uapi_dm_type { MLX5_IB_UAPI_DM_TYPE_MEMIC, MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM, MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM, MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_PATTERN_SW_ICM, MLX5_IB_UAPI_DM_TYPE_ENCAP_SW_ICM, }; enum mlx5_ib_uapi_devx_create_event_channel_flags { MLX5_IB_UAPI_DEVX_CR_EV_CH_FLAGS_OMIT_DATA = 1 << 0, }; struct mlx5_ib_uapi_devx_async_event_hdr { __aligned_u64 cookie; __u8 out_data[]; }; enum mlx5_ib_uapi_pp_alloc_flags { MLX5_IB_UAPI_PP_ALLOC_FLAGS_DEDICATED_INDEX = 1 << 0, }; enum mlx5_ib_uapi_uar_alloc_type { MLX5_IB_UAPI_UAR_ALLOC_TYPE_BF = 0x0, MLX5_IB_UAPI_UAR_ALLOC_TYPE_NC = 0x1, }; enum mlx5_ib_uapi_query_port_flags { MLX5_IB_UAPI_QUERY_PORT_VPORT = 1 << 0, MLX5_IB_UAPI_QUERY_PORT_VPORT_VHCA_ID = 1 << 1, MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_RX = 1 << 2, MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_TX = 1 << 3, MLX5_IB_UAPI_QUERY_PORT_VPORT_REG_C0 = 1 << 4, MLX5_IB_UAPI_QUERY_PORT_ESW_OWNER_VHCA_ID = 1 << 5, }; struct mlx5_ib_uapi_reg { __u32 value; __u32 mask; }; struct mlx5_ib_uapi_query_port { __aligned_u64 flags; __u16 vport; __u16 vport_vhca_id; __u16 esw_owner_vhca_id; __u16 rsvd0; __aligned_u64 vport_steering_icm_rx; __aligned_u64 vport_steering_icm_tx; struct mlx5_ib_uapi_reg reg_c0; }; #endif rdma-core-56.1/kernel-headers/rdma/mthca-abi.h000066400000000000000000000057571477342711600211600ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */ /* * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005, 2006 Cisco Systems. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef MTHCA_ABI_USER_H #define MTHCA_ABI_USER_H #include <linux/types.h> /* * Increment this value if any changes that break userspace ABI * compatibility are made. */ #define MTHCA_UVERBS_ABI_VERSION 1 /* * Make sure that all structs defined in this file remain laid out so * that they pack the same way on 32-bit and 64-bit architectures (to * avoid incompatibility between 32-bit userspace and 64-bit kernels). * In particular do not use pointer types -- pass pointers in __u64 * instead. */ struct mthca_alloc_ucontext_resp { __u32 qp_tab_size; __u32 uarc_size; }; struct mthca_alloc_pd_resp { __u32 pdn; __u32 reserved; }; /* * Mark the memory region with a DMA attribute that causes * in-flight DMA to be flushed when the region is written to: */ #define MTHCA_MR_DMASYNC 0x1 struct mthca_reg_mr { __u32 mr_attrs; __u32 reserved; }; struct mthca_create_cq { __u32 lkey; __u32 pdn; __aligned_u64 arm_db_page; __aligned_u64 set_db_page; __u32 arm_db_index; __u32 set_db_index; }; struct mthca_create_cq_resp { __u32 cqn; __u32 reserved; }; struct mthca_resize_cq { __u32 lkey; __u32 reserved; }; struct mthca_create_srq { __u32 lkey; __u32 db_index; __aligned_u64 db_page; }; struct mthca_create_srq_resp { __u32 srqn; __u32 reserved; }; struct mthca_create_qp { __u32 lkey; __u32 reserved; __aligned_u64 sq_db_page; __aligned_u64 rq_db_page; __u32 sq_db_index; __u32 rq_db_index; }; #endif /* MTHCA_ABI_USER_H */ rdma-core-56.1/kernel-headers/rdma/ocrdma-abi.h000066400000000000000000000100241477342711600213120ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) */ /* This file is part of the Emulex RoCE Device Driver for * RoCE (RDMA over Converged Ethernet) adapters. * Copyright (C) 2012-2015 Emulex. All rights reserved. * EMULEX and SLI are trademarks of Emulex. * www.emulex.com * * This software is available to you under a choice of one of two licenses. * You may choose to be licensed under the terms of the GNU General Public * License (GPL) Version 2, available from the file COPYING in the main * directory of this source tree, or the BSD license below: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * - Redistributions of source code must retain the above copyright notice, * this list of conditions and the following disclaimer. * * - Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR * OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Contact Information: * linux-drivers@emulex.com * * Emulex * 3333 Susan Street * Costa Mesa, CA 92626 */ #ifndef OCRDMA_ABI_USER_H #define OCRDMA_ABI_USER_H #include <linux/types.h> #define OCRDMA_ABI_VERSION 2 #define OCRDMA_BE_ROCE_ABI_VERSION 1 /* user kernel communication data structures. */ struct ocrdma_alloc_ucontext_resp { __u32 dev_id; __u32 wqe_size; __u32 max_inline_data; __u32 dpp_wqe_size; __aligned_u64 ah_tbl_page; __u32 ah_tbl_len; __u32 rqe_size; __u8 fw_ver[32]; /* for future use/new features in progress */ __aligned_u64 rsvd1; __aligned_u64 rsvd2; }; struct ocrdma_alloc_pd_ureq { __u32 rsvd[2]; }; struct ocrdma_alloc_pd_uresp { __u32 id; __u32 dpp_enabled; __u32 dpp_page_addr_hi; __u32 dpp_page_addr_lo; __u32 rsvd[2]; }; struct ocrdma_create_cq_ureq { __u32 dpp_cq; __u32 rsvd; /* pad */ }; #define MAX_CQ_PAGES 8 struct ocrdma_create_cq_uresp { __u32 cq_id; __u32 page_size; __u32 num_pages; __u32 max_hw_cqe; __aligned_u64 page_addr[MAX_CQ_PAGES]; __aligned_u64 db_page_addr; __u32 db_page_size; __u32 phase_change; /* for future use/new features in progress */ __aligned_u64 rsvd1; __aligned_u64 rsvd2; }; #define MAX_QP_PAGES 8 #define MAX_UD_AV_PAGES 8 struct ocrdma_create_qp_ureq { __u8 enable_dpp_cq; __u8 rsvd; __u16 dpp_cq_id; __u32 rsvd1; /* pad */ }; struct ocrdma_create_qp_uresp { __u16 qp_id; __u16 sq_dbid; __u16 rq_dbid; __u16 resv0; /* pad */ __u32 sq_page_size; __u32 rq_page_size; __u32 num_sq_pages; __u32 num_rq_pages; __aligned_u64 sq_page_addr[MAX_QP_PAGES]; __aligned_u64 rq_page_addr[MAX_QP_PAGES]; __aligned_u64 db_page_addr; __u32 db_page_size; __u32 dpp_credit; __u32 dpp_offset; __u32 num_wqe_allocated; __u32 num_rqe_allocated; __u32 db_sq_offset; __u32 db_rq_offset; __u32 db_shift; __aligned_u64 rsvd[11]; }; struct ocrdma_create_srq_uresp { __u16 rq_dbid; __u16 resv0; /* pad */ __u32 resv1; __u32 rq_page_size; __u32 num_rq_pages; __aligned_u64 rq_page_addr[MAX_QP_PAGES]; __aligned_u64 db_page_addr; __u32 db_page_size; __u32 num_rqe_allocated; __u32 db_rq_offset; __u32 db_shift; __aligned_u64 rsvd2; __aligned_u64 rsvd3; }; #endif /* OCRDMA_ABI_USER_H */ rdma-core-56.1/kernel-headers/rdma/qedr-abi.h000066400000000000000000000103231477342711600210020ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */ /* QLogic qedr NIC Driver * Copyright (c) 2015-2016 QLogic Corporation * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer.
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __QEDR_USER_H__ #define __QEDR_USER_H__ #include <linux/types.h> #define QEDR_ABI_VERSION (8) /* user kernel communication data structures. */ enum qedr_alloc_ucontext_flags { QEDR_ALLOC_UCTX_EDPM_MODE = 1 << 0, QEDR_ALLOC_UCTX_DB_REC = 1 << 1, QEDR_SUPPORT_DPM_SIZES = 1 << 2, }; struct qedr_alloc_ucontext_req { __u32 context_flags; __u32 reserved; }; #define QEDR_LDPM_MAX_SIZE (8192) #define QEDR_EDPM_TRANS_SIZE (64) #define QEDR_EDPM_MAX_SIZE (ROCE_REQ_MAX_INLINE_DATA_SIZE) enum qedr_rdma_dpm_type { QEDR_DPM_TYPE_NONE = 0, QEDR_DPM_TYPE_ROCE_ENHANCED = 1 << 0, QEDR_DPM_TYPE_ROCE_LEGACY = 1 << 1, QEDR_DPM_TYPE_IWARP_LEGACY = 1 << 2, QEDR_DPM_TYPE_ROCE_EDPM_MODE = 1 << 3, QEDR_DPM_SIZES_SET = 1 << 4, }; struct qedr_alloc_ucontext_resp { __aligned_u64 db_pa; __u32 db_size; __u32 max_send_wr; __u32 max_recv_wr; __u32 max_srq_wr; __u32 sges_per_send_wr; __u32 sges_per_recv_wr; __u32 sges_per_srq_wr; __u32 max_cqes; __u8 dpm_flags; __u8 wids_enabled; __u16 wid_count; __u16 ldpm_limit_size; __u8 edpm_trans_size; __u8 reserved; __u16 edpm_limit_size; __u8 padding[6]; }; struct qedr_alloc_pd_ureq { __aligned_u64 rsvd1; }; struct qedr_alloc_pd_uresp { __u32 pd_id; __u32 reserved; }; struct qedr_create_cq_ureq { __aligned_u64 addr; __aligned_u64 len; }; struct qedr_create_cq_uresp { __u32 db_offset; __u16 icid; __u16 reserved; __aligned_u64 db_rec_addr; }; struct qedr_create_qp_ureq { __u32 qp_handle_hi; __u32 qp_handle_lo; /* SQ */ /* user space virtual address of SQ buffer */ __aligned_u64 sq_addr; /* length of SQ buffer */ __aligned_u64 sq_len; /* RQ */ /* user space virtual address of RQ buffer */ __aligned_u64 rq_addr; /* length of RQ buffer */ __aligned_u64 rq_len; }; struct qedr_create_qp_uresp { __u32 qp_id; __u32 atomic_supported; /* SQ */ __u32 sq_db_offset; __u16 sq_icid; /* RQ */ __u32 rq_db_offset; __u16 rq_icid; __u32 rq_db2_offset; __u32 reserved; /* address of SQ doorbell recovery user entry */ __aligned_u64 sq_db_rec_addr; /* address of RQ doorbell recovery user entry */ __aligned_u64 rq_db_rec_addr; }; struct qedr_create_srq_ureq { /* user space virtual address of producer pair */ __aligned_u64 prod_pair_addr; /* user space virtual address of SRQ buffer */ __aligned_u64 srq_addr; /* length of SRQ buffer */ __aligned_u64 srq_len; }; struct qedr_create_srq_uresp { __u16 srq_id; __u16 reserved0; __u32 reserved1; }; /* doorbell recovery entry allocated and populated by userspace doorbelling * entities and mapped to kernel. Kernel uses this to register doorbell * information with doorbell drop recovery mechanism.
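 *
 * An illustrative user-space sequence (an editor's sketch, not part of
 * this ABI; "db_rec" and "db_addr" are hypothetical names) is to
 * refresh the entry before each doorbell so that the kernel can replay
 * the latest value after a doorbell drop:
 *
 *	db_rec->db_data = db_value;		- update recovery entry
 *	*(volatile __u64 *)db_addr = db_value;	- ring the doorbell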
*/ struct qedr_user_db_rec { __aligned_u64 db_data; /* doorbell data */ }; #endif /* __QEDR_USER_H__ */ rdma-core-56.1/kernel-headers/rdma/rdma_netlink.h000066400000000000000000000371671477342711600220010ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ #ifndef _UAPI_RDMA_NETLINK_H #define _UAPI_RDMA_NETLINK_H #include <linux/types.h> enum { RDMA_NL_IWCM = 2, RDMA_NL_RSVD, RDMA_NL_LS, /* RDMA Local Services */ RDMA_NL_NLDEV, /* RDMA device interface */ RDMA_NL_NUM_CLIENTS }; enum { RDMA_NL_GROUP_IWPM = 2, RDMA_NL_GROUP_LS, RDMA_NL_GROUP_NOTIFY, RDMA_NL_NUM_GROUPS }; #define RDMA_NL_GET_CLIENT(type) ((type & (((1 << 6) - 1) << 10)) >> 10) #define RDMA_NL_GET_OP(type) (type & ((1 << 10) - 1)) #define RDMA_NL_GET_TYPE(client, op) ((client << 10) + op) /* The minimum version that the iwpm kernel supports */ #define IWPM_UABI_VERSION_MIN 3 /* The latest version that the iwpm kernel supports */ #define IWPM_UABI_VERSION 4 /* iwarp port mapper message flags */ enum { /* Do not map the port for this IWPM request */ IWPM_FLAGS_NO_PORT_MAP = (1 << 0), }; /* iwarp port mapper op-codes */ enum { RDMA_NL_IWPM_REG_PID = 0, RDMA_NL_IWPM_ADD_MAPPING, RDMA_NL_IWPM_QUERY_MAPPING, RDMA_NL_IWPM_REMOVE_MAPPING, RDMA_NL_IWPM_REMOTE_INFO, RDMA_NL_IWPM_HANDLE_ERR, RDMA_NL_IWPM_MAPINFO, RDMA_NL_IWPM_MAPINFO_NUM, RDMA_NL_IWPM_HELLO, RDMA_NL_IWPM_NUM_OPS }; enum { IWPM_NLA_REG_PID_UNSPEC = 0, IWPM_NLA_REG_PID_SEQ, IWPM_NLA_REG_IF_NAME, IWPM_NLA_REG_IBDEV_NAME, IWPM_NLA_REG_ULIB_NAME, IWPM_NLA_REG_PID_MAX }; enum { IWPM_NLA_RREG_PID_UNSPEC = 0, IWPM_NLA_RREG_PID_SEQ, IWPM_NLA_RREG_IBDEV_NAME, IWPM_NLA_RREG_ULIB_NAME, IWPM_NLA_RREG_ULIB_VER, IWPM_NLA_RREG_PID_ERR, IWPM_NLA_RREG_PID_MAX }; enum { IWPM_NLA_MANAGE_MAPPING_UNSPEC = 0, IWPM_NLA_MANAGE_MAPPING_SEQ, IWPM_NLA_MANAGE_ADDR, IWPM_NLA_MANAGE_FLAGS, IWPM_NLA_MANAGE_MAPPING_MAX }; enum { IWPM_NLA_RMANAGE_MAPPING_UNSPEC = 0, IWPM_NLA_RMANAGE_MAPPING_SEQ, IWPM_NLA_RMANAGE_ADDR, IWPM_NLA_RMANAGE_MAPPED_LOC_ADDR, /* The following maintains bisectability of rdma-core */ IWPM_NLA_MANAGE_MAPPED_LOC_ADDR = IWPM_NLA_RMANAGE_MAPPED_LOC_ADDR, IWPM_NLA_RMANAGE_MAPPING_ERR, IWPM_NLA_RMANAGE_MAPPING_MAX }; #define IWPM_NLA_MAPINFO_SEND_MAX 3 #define IWPM_NLA_REMOVE_MAPPING_MAX 3 enum { IWPM_NLA_QUERY_MAPPING_UNSPEC = 0, IWPM_NLA_QUERY_MAPPING_SEQ, IWPM_NLA_QUERY_LOCAL_ADDR, IWPM_NLA_QUERY_REMOTE_ADDR, IWPM_NLA_QUERY_FLAGS, IWPM_NLA_QUERY_MAPPING_MAX, }; enum { IWPM_NLA_RQUERY_MAPPING_UNSPEC = 0, IWPM_NLA_RQUERY_MAPPING_SEQ, IWPM_NLA_RQUERY_LOCAL_ADDR, IWPM_NLA_RQUERY_REMOTE_ADDR, IWPM_NLA_RQUERY_MAPPED_LOC_ADDR, IWPM_NLA_RQUERY_MAPPED_REM_ADDR, IWPM_NLA_RQUERY_MAPPING_ERR, IWPM_NLA_RQUERY_MAPPING_MAX }; enum { IWPM_NLA_MAPINFO_REQ_UNSPEC = 0, IWPM_NLA_MAPINFO_ULIB_NAME, IWPM_NLA_MAPINFO_ULIB_VER, IWPM_NLA_MAPINFO_REQ_MAX }; enum { IWPM_NLA_MAPINFO_UNSPEC = 0, IWPM_NLA_MAPINFO_LOCAL_ADDR, IWPM_NLA_MAPINFO_MAPPED_ADDR, IWPM_NLA_MAPINFO_FLAGS, IWPM_NLA_MAPINFO_MAX }; enum { IWPM_NLA_MAPINFO_NUM_UNSPEC = 0, IWPM_NLA_MAPINFO_SEQ, IWPM_NLA_MAPINFO_SEND_NUM, IWPM_NLA_MAPINFO_ACK_NUM, IWPM_NLA_MAPINFO_NUM_MAX }; enum { IWPM_NLA_ERR_UNSPEC = 0, IWPM_NLA_ERR_SEQ, IWPM_NLA_ERR_CODE, IWPM_NLA_ERR_MAX }; enum { IWPM_NLA_HELLO_UNSPEC = 0, IWPM_NLA_HELLO_ABI_VERSION, IWPM_NLA_HELLO_MAX }; /* For RDMA_NLDEV_ATTR_DEV_NODE_TYPE */ enum { /* IB values map to NodeInfo:NodeType.
*/ RDMA_NODE_IB_CA = 1, RDMA_NODE_IB_SWITCH, RDMA_NODE_IB_ROUTER, RDMA_NODE_RNIC, RDMA_NODE_USNIC, RDMA_NODE_USNIC_UDP, RDMA_NODE_UNSPECIFIED, }; /* * Local service operations: * RESOLVE - The client requests the local service to resolve a path. * SET_TIMEOUT - The local service requests the client to set the timeout. * IP_RESOLVE - The client requests the local service to resolve an IP to GID. */ enum { RDMA_NL_LS_OP_RESOLVE = 0, RDMA_NL_LS_OP_SET_TIMEOUT, RDMA_NL_LS_OP_IP_RESOLVE, RDMA_NL_LS_NUM_OPS }; /* Local service netlink message flags */ #define RDMA_NL_LS_F_ERR 0x0100 /* Failed response */ /* * Local service resolve operation family header. * The layout for the resolve operation: * nlmsg header * family header * attributes */ /* * Local service path use: * Specify how the path(s) will be used. * ALL - For connected CM operation (6 pathrecords) * UNIDIRECTIONAL - For unidirectional UD (1 pathrecord) * GMP - For miscellaneous GMP like operation (at least 1 reversible * pathrecord) */ enum { LS_RESOLVE_PATH_USE_ALL = 0, LS_RESOLVE_PATH_USE_UNIDIRECTIONAL, LS_RESOLVE_PATH_USE_GMP, LS_RESOLVE_PATH_USE_MAX }; #define LS_DEVICE_NAME_MAX 64 struct rdma_ls_resolve_header { __u8 device_name[LS_DEVICE_NAME_MAX]; __u8 port_num; __u8 path_use; }; struct rdma_ls_ip_resolve_header { __u32 ifindex; }; /* Local service attribute type */ #define RDMA_NLA_F_MANDATORY (1 << 13) #define RDMA_NLA_TYPE_MASK (~(NLA_F_NESTED | NLA_F_NET_BYTEORDER | \ RDMA_NLA_F_MANDATORY)) /* * Local service attributes: * Attr Name Size Byte order * ----------------------------------------------------- * PATH_RECORD struct ib_path_rec_data * TIMEOUT u32 cpu * SERVICE_ID u64 cpu * DGID u8[16] BE * SGID u8[16] BE * TCLASS u8 * PKEY u16 cpu * QOS_CLASS u16 cpu * IPV4 u32 BE * IPV6 u8[16] BE */ enum { LS_NLA_TYPE_UNSPEC = 0, LS_NLA_TYPE_PATH_RECORD, LS_NLA_TYPE_TIMEOUT, LS_NLA_TYPE_SERVICE_ID, LS_NLA_TYPE_DGID, LS_NLA_TYPE_SGID, LS_NLA_TYPE_TCLASS, LS_NLA_TYPE_PKEY, LS_NLA_TYPE_QOS_CLASS, LS_NLA_TYPE_IPV4, LS_NLA_TYPE_IPV6, LS_NLA_TYPE_MAX }; /* Local service DGID/SGID attribute: big endian */ struct rdma_nla_ls_gid { __u8 gid[16]; }; enum rdma_nldev_command { RDMA_NLDEV_CMD_UNSPEC, RDMA_NLDEV_CMD_GET, /* can dump */ RDMA_NLDEV_CMD_SET, RDMA_NLDEV_CMD_NEWLINK, RDMA_NLDEV_CMD_DELLINK, RDMA_NLDEV_CMD_PORT_GET, /* can dump */ RDMA_NLDEV_CMD_SYS_GET, RDMA_NLDEV_CMD_SYS_SET, /* 8 is free to use */ RDMA_NLDEV_CMD_RES_GET = 9, /* can dump */ RDMA_NLDEV_CMD_RES_QP_GET, /* can dump */ RDMA_NLDEV_CMD_RES_CM_ID_GET, /* can dump */ RDMA_NLDEV_CMD_RES_CQ_GET, /* can dump */ RDMA_NLDEV_CMD_RES_MR_GET, /* can dump */ RDMA_NLDEV_CMD_RES_PD_GET, /* can dump */ RDMA_NLDEV_CMD_GET_CHARDEV, RDMA_NLDEV_CMD_STAT_SET, RDMA_NLDEV_CMD_STAT_GET, /* can dump */ RDMA_NLDEV_CMD_STAT_DEL, RDMA_NLDEV_CMD_RES_QP_GET_RAW, RDMA_NLDEV_CMD_RES_CQ_GET_RAW, RDMA_NLDEV_CMD_RES_MR_GET_RAW, RDMA_NLDEV_CMD_RES_CTX_GET, /* can dump */ RDMA_NLDEV_CMD_RES_SRQ_GET, /* can dump */ RDMA_NLDEV_CMD_STAT_GET_STATUS, RDMA_NLDEV_CMD_RES_SRQ_GET_RAW, RDMA_NLDEV_CMD_NEWDEV, RDMA_NLDEV_CMD_DELDEV, RDMA_NLDEV_CMD_MONITOR, RDMA_NLDEV_NUM_OPS }; enum rdma_nldev_print_type { RDMA_NLDEV_PRINT_TYPE_UNSPEC, RDMA_NLDEV_PRINT_TYPE_HEX, }; enum rdma_nldev_attr { /* don't change the order or add anything between, this is ABI! 
*/ RDMA_NLDEV_ATTR_UNSPEC, /* Pad attribute for 64b alignment */ RDMA_NLDEV_ATTR_PAD = RDMA_NLDEV_ATTR_UNSPEC, /* Identifier for ib_device */ RDMA_NLDEV_ATTR_DEV_INDEX, /* u32 */ RDMA_NLDEV_ATTR_DEV_NAME, /* string */ /* * Device index together with port index are identifiers * for port/link properties. * * For the RDMA_NLDEV_CMD_GET command, port index will return the number * of available ports in ib_device, while for port specific operations, * it will be the real port index as it appears in sysfs. Port index follows * sysfs notation and starts from 1 for the first port. */ RDMA_NLDEV_ATTR_PORT_INDEX, /* u32 */ /* * Device and port capabilities * * When used for port info, first 32-bits are CapabilityMask followed by * 16-bit CapabilityMask2. */ RDMA_NLDEV_ATTR_CAP_FLAGS, /* u64 */ /* * FW version */ RDMA_NLDEV_ATTR_FW_VERSION, /* string */ /* * Node GUID (in host byte order) associated with the RDMA device. */ RDMA_NLDEV_ATTR_NODE_GUID, /* u64 */ /* * System image GUID (in host byte order) associated with * this RDMA device and other devices which are part of a * single system. */ RDMA_NLDEV_ATTR_SYS_IMAGE_GUID, /* u64 */ /* * Subnet prefix (in host byte order) */ RDMA_NLDEV_ATTR_SUBNET_PREFIX, /* u64 */ /* * Local Identifier (LID), * According to the IB specification, it is a 16-bit address assigned * by the Subnet Manager. Extended to be 32-bit for OmniPath users. */ RDMA_NLDEV_ATTR_LID, /* u32 */ RDMA_NLDEV_ATTR_SM_LID, /* u32 */ /* * LID mask control (LMC) */ RDMA_NLDEV_ATTR_LMC, /* u8 */ RDMA_NLDEV_ATTR_PORT_STATE, /* u8 */ RDMA_NLDEV_ATTR_PORT_PHYS_STATE, /* u8 */ RDMA_NLDEV_ATTR_DEV_NODE_TYPE, /* u8 */ RDMA_NLDEV_ATTR_RES_SUMMARY, /* nested table */ RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY, /* nested table */ RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_NAME, /* string */ RDMA_NLDEV_ATTR_RES_SUMMARY_ENTRY_CURR, /* u64 */ RDMA_NLDEV_ATTR_RES_QP, /* nested table */ RDMA_NLDEV_ATTR_RES_QP_ENTRY, /* nested table */ /* * Local QPN */ RDMA_NLDEV_ATTR_RES_LQPN, /* u32 */ /* * Remote QPN, * Applicable for RC and UC only IBTA 11.2.5.3 QUERY QUEUE PAIR */ RDMA_NLDEV_ATTR_RES_RQPN, /* u32 */ /* * Receive Queue PSN, * Applicable for RC and UC only 11.2.5.3 QUERY QUEUE PAIR */ RDMA_NLDEV_ATTR_RES_RQ_PSN, /* u32 */ /* * Send Queue PSN */ RDMA_NLDEV_ATTR_RES_SQ_PSN, /* u32 */ RDMA_NLDEV_ATTR_RES_PATH_MIG_STATE, /* u8 */ /* * QP types as visible to RDMA/core, the reserved QPT * are not exported through this interface. */ RDMA_NLDEV_ATTR_RES_TYPE, /* u8 */ RDMA_NLDEV_ATTR_RES_STATE, /* u8 */ /* * Process ID which created object, * in case of kernel origin, PID won't exist. */ RDMA_NLDEV_ATTR_RES_PID, /* u32 */ /* * The name of the process which created the resource. * It will exist only for kernel objects. * For user created objects, the user is supposed * to read /proc/PID/comm file. */ RDMA_NLDEV_ATTR_RES_KERN_NAME, /* string */ RDMA_NLDEV_ATTR_RES_CM_ID, /* nested table */ RDMA_NLDEV_ATTR_RES_CM_ID_ENTRY, /* nested table */ /* * rdma_cm_id port space.
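 * Values follow the rdma_cm port spaces, e.g. RDMA_PS_TCP from
 * enum rdma_ucm_port_space in rdma_user_cm.h (editor's note).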
*/ RDMA_NLDEV_ATTR_RES_PS, /* u32 */ /* * Source and destination socket addresses */ RDMA_NLDEV_ATTR_RES_SRC_ADDR, /* __kernel_sockaddr_storage */ RDMA_NLDEV_ATTR_RES_DST_ADDR, /* __kernel_sockaddr_storage */ RDMA_NLDEV_ATTR_RES_CQ, /* nested table */ RDMA_NLDEV_ATTR_RES_CQ_ENTRY, /* nested table */ RDMA_NLDEV_ATTR_RES_CQE, /* u32 */ RDMA_NLDEV_ATTR_RES_USECNT, /* u64 */ RDMA_NLDEV_ATTR_RES_POLL_CTX, /* u8 */ RDMA_NLDEV_ATTR_RES_MR, /* nested table */ RDMA_NLDEV_ATTR_RES_MR_ENTRY, /* nested table */ RDMA_NLDEV_ATTR_RES_RKEY, /* u32 */ RDMA_NLDEV_ATTR_RES_LKEY, /* u32 */ RDMA_NLDEV_ATTR_RES_IOVA, /* u64 */ RDMA_NLDEV_ATTR_RES_MRLEN, /* u64 */ RDMA_NLDEV_ATTR_RES_PD, /* nested table */ RDMA_NLDEV_ATTR_RES_PD_ENTRY, /* nested table */ RDMA_NLDEV_ATTR_RES_LOCAL_DMA_LKEY, /* u32 */ RDMA_NLDEV_ATTR_RES_UNSAFE_GLOBAL_RKEY, /* u32 */ /* * Provides logical name and index of netdevice which is * connected to physical port. This information is relevant * for RoCE and iWARP. * * The netdevices which are associated with containers are * supposed to be exported together with the GID table once it * is exposed through netlink, because the associated * netdevices are properties of GIDs. */ RDMA_NLDEV_ATTR_NDEV_INDEX, /* u32 */ RDMA_NLDEV_ATTR_NDEV_NAME, /* string */ /* * driver-specific attributes. */ RDMA_NLDEV_ATTR_DRIVER, /* nested table */ RDMA_NLDEV_ATTR_DRIVER_ENTRY, /* nested table */ RDMA_NLDEV_ATTR_DRIVER_STRING, /* string */ /* * u8 values from enum rdma_nldev_print_type */ RDMA_NLDEV_ATTR_DRIVER_PRINT_TYPE, /* u8 */ RDMA_NLDEV_ATTR_DRIVER_S32, /* s32 */ RDMA_NLDEV_ATTR_DRIVER_U32, /* u32 */ RDMA_NLDEV_ATTR_DRIVER_S64, /* s64 */ RDMA_NLDEV_ATTR_DRIVER_U64, /* u64 */ /* * Indexes to get/set a specific entry, * for QP use RDMA_NLDEV_ATTR_RES_LQPN */ RDMA_NLDEV_ATTR_RES_PDN, /* u32 */ RDMA_NLDEV_ATTR_RES_CQN, /* u32 */ RDMA_NLDEV_ATTR_RES_MRN, /* u32 */ RDMA_NLDEV_ATTR_RES_CM_IDN, /* u32 */ RDMA_NLDEV_ATTR_RES_CTXN, /* u32 */ /* * Identifies the rdma driver, e.g. "rxe" or "siw" */ RDMA_NLDEV_ATTR_LINK_TYPE, /* string */ /* * net namespace mode for rdma subsystem: * either shared or exclusive among multiple net namespaces. */ RDMA_NLDEV_SYS_ATTR_NETNS_MODE, /* u8 */ /* * Device protocol, e.g. ib, iw, usnic, roce and opa */ RDMA_NLDEV_ATTR_DEV_PROTOCOL, /* string */ /* * File descriptor handle of the net namespace object */ RDMA_NLDEV_NET_NS_FD, /* u32 */ /* * Information about a chardev. * CHARDEV_TYPE is the name of the chardev ABI (i.e. uverbs, umad, etc.) * CHARDEV_ABI signals the ABI revision (historical) * CHARDEV_NAME is the kernel name for the /dev/ file (no directory) * CHARDEV is the 64 bit dev_t for the inode */ RDMA_NLDEV_ATTR_CHARDEV_TYPE, /* string */ RDMA_NLDEV_ATTR_CHARDEV_NAME, /* string */ RDMA_NLDEV_ATTR_CHARDEV_ABI, /* u64 */ RDMA_NLDEV_ATTR_CHARDEV, /* u64 */ RDMA_NLDEV_ATTR_UVERBS_DRIVER_ID, /* u64 */ /* * Counter-specific attributes.
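 * (Editor's note: these attributes are carried by the
 * RDMA_NLDEV_CMD_STAT_SET/_GET/_DEL commands above.)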
RDMA_NLDEV_ATTR_STAT_MODE, /* u32 */ RDMA_NLDEV_ATTR_STAT_RES, /* u32 */ RDMA_NLDEV_ATTR_STAT_AUTO_MODE_MASK, /* u32 */ RDMA_NLDEV_ATTR_STAT_COUNTER, /* nested table */ RDMA_NLDEV_ATTR_STAT_COUNTER_ENTRY, /* nested table */ RDMA_NLDEV_ATTR_STAT_COUNTER_ID, /* u32 */ RDMA_NLDEV_ATTR_STAT_HWCOUNTERS, /* nested table */ RDMA_NLDEV_ATTR_STAT_HWCOUNTER_ENTRY, /* nested table */ RDMA_NLDEV_ATTR_STAT_HWCOUNTER_ENTRY_NAME, /* string */ RDMA_NLDEV_ATTR_STAT_HWCOUNTER_ENTRY_VALUE, /* u64 */ /* * CQ adaptive moderation (DIM) */ RDMA_NLDEV_ATTR_DEV_DIM, /* u8 */ RDMA_NLDEV_ATTR_RES_RAW, /* binary */ RDMA_NLDEV_ATTR_RES_CTX, /* nested table */ RDMA_NLDEV_ATTR_RES_CTX_ENTRY, /* nested table */ RDMA_NLDEV_ATTR_RES_SRQ, /* nested table */ RDMA_NLDEV_ATTR_RES_SRQ_ENTRY, /* nested table */ RDMA_NLDEV_ATTR_RES_SRQN, /* u32 */ RDMA_NLDEV_ATTR_MIN_RANGE, /* u32 */ RDMA_NLDEV_ATTR_MAX_RANGE, /* u32 */ RDMA_NLDEV_SYS_ATTR_COPY_ON_FORK, /* u8 */ RDMA_NLDEV_ATTR_STAT_HWCOUNTER_INDEX, /* u32 */ RDMA_NLDEV_ATTR_STAT_HWCOUNTER_DYNAMIC, /* u8 */ RDMA_NLDEV_SYS_ATTR_PRIVILEGED_QKEY_MODE, /* u8 */ RDMA_NLDEV_ATTR_DRIVER_DETAILS, /* u8 */ /* * QP subtype string, used for driver QPs */ RDMA_NLDEV_ATTR_RES_SUBTYPE, /* string */ RDMA_NLDEV_ATTR_DEV_TYPE, /* u8 */ RDMA_NLDEV_ATTR_PARENT_NAME, /* string */ RDMA_NLDEV_ATTR_NAME_ASSIGN_TYPE, /* u8 */ RDMA_NLDEV_ATTR_EVENT_TYPE, /* u8 */ RDMA_NLDEV_SYS_ATTR_MONITOR_MODE, /* u8 */ /* * Always the end */ RDMA_NLDEV_ATTR_MAX }; /* * Supported counter bind modes. All modes are mutually exclusive. */ enum rdma_nl_counter_mode { RDMA_COUNTER_MODE_NONE, /* * A QP is bound to a counter automatically during initialization, * based on the auto mode criteria (e.g., QP type, ...) */ RDMA_COUNTER_MODE_AUTO, /* * Which QP is bound to which counter is explicitly specified * by the user */ RDMA_COUNTER_MODE_MANUAL, /* * Always the end */ RDMA_COUNTER_MODE_MAX, }; /* * Supported criteria in counter auto mode. * Currently "qp type" and "pid" are supported. */ enum rdma_nl_counter_mask { RDMA_COUNTER_MASK_QP_TYPE = 1, RDMA_COUNTER_MASK_PID = 1 << 1, }; /* Supported rdma device types. */ enum rdma_nl_dev_type { RDMA_DEVICE_TYPE_SMI = 1, }; /* RDMA device name assignment types */ enum rdma_nl_name_assign_type { RDMA_NAME_ASSIGN_TYPE_UNKNOWN = 0, RDMA_NAME_ASSIGN_TYPE_USER = 1, /* Provided by user-space */ }; /* * Supported rdma monitoring event types. */ enum rdma_nl_notify_event_type { RDMA_REGISTER_EVENT, RDMA_UNREGISTER_EVENT, RDMA_NETDEV_ATTACH_EVENT, RDMA_NETDEV_DETACH_EVENT, };
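/*
 * Illustrative sketch, not part of the kernel ABI above: a minimal
 * userspace dump request against the nldev interface this header
 * defines. It assumes the usual netlink conventions -- NETLINK_RDMA
 * from <linux/netlink.h> plus the RDMA_NL_GET_TYPE()/RDMA_NL_NLDEV
 * definitions earlier in this header -- and relies on nldev replies
 * carrying their attributes directly after the nlmsghdr. Only
 * RDMA_NLDEV_ATTR_DEV_NAME is extracted here.
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

static void nldev_dump_device_names(void)
{
	struct sockaddr_nl sa = { .nl_family = AF_NETLINK };
	struct nlmsghdr req = {
		.nlmsg_len = NLMSG_LENGTH(0),	/* no payload, just a dump request */
		.nlmsg_type = RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
					       RDMA_NLDEV_CMD_GET),
		.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP,
	};
	char buf[8192];
	ssize_t len;
	int fd;

	fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_RDMA);
	if (fd < 0)
		return;
	if (sendto(fd, &req, req.nlmsg_len, 0,
		   (struct sockaddr *)&sa, sizeof(sa)) < 0)
		goto out;

	while ((len = recv(fd, buf, sizeof(buf), 0)) > 0) {
		struct nlmsghdr *nlh = (struct nlmsghdr *)buf;

		for (; NLMSG_OK(nlh, len); nlh = NLMSG_NEXT(nlh, len)) {
			struct nlattr *nla = NLMSG_DATA(nlh);
			int rem = NLMSG_PAYLOAD(nlh, 0);

			if (nlh->nlmsg_type == NLMSG_DONE ||
			    nlh->nlmsg_type == NLMSG_ERROR)
				goto out;
			/* walk the flat attribute list of one device reply */
			while (rem >= (int)sizeof(*nla) &&
			       nla->nla_len >= sizeof(*nla) &&
			       nla->nla_len <= rem) {
				if ((nla->nla_type & NLA_TYPE_MASK) ==
				    RDMA_NLDEV_ATTR_DEV_NAME)
					printf("%s\n", (char *)nla + NLA_HDRLEN);
				rem -= NLA_ALIGN(nla->nla_len);
				nla = (struct nlattr *)((char *)nla +
							NLA_ALIGN(nla->nla_len));
			}
		}
	}
out:
	close(fd);
}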
#endif /* _UAPI_RDMA_NETLINK_H */ rdma-core-56.1/kernel-headers/rdma/rdma_user_cm.h000066400000000000000000000157341477342711600217670ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */ /* * Copyright (c) 2005-2006 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef RDMA_USER_CM_H #define RDMA_USER_CM_H #include #include #include #include #include #define RDMA_USER_CM_ABI_VERSION 4 #define RDMA_MAX_PRIVATE_DATA 256 enum { RDMA_USER_CM_CMD_CREATE_ID, RDMA_USER_CM_CMD_DESTROY_ID, RDMA_USER_CM_CMD_BIND_IP, RDMA_USER_CM_CMD_RESOLVE_IP, RDMA_USER_CM_CMD_RESOLVE_ROUTE, RDMA_USER_CM_CMD_QUERY_ROUTE, RDMA_USER_CM_CMD_CONNECT, RDMA_USER_CM_CMD_LISTEN, RDMA_USER_CM_CMD_ACCEPT, RDMA_USER_CM_CMD_REJECT, RDMA_USER_CM_CMD_DISCONNECT, RDMA_USER_CM_CMD_INIT_QP_ATTR, RDMA_USER_CM_CMD_GET_EVENT, RDMA_USER_CM_CMD_GET_OPTION, RDMA_USER_CM_CMD_SET_OPTION, RDMA_USER_CM_CMD_NOTIFY, RDMA_USER_CM_CMD_JOIN_IP_MCAST, RDMA_USER_CM_CMD_LEAVE_MCAST, RDMA_USER_CM_CMD_MIGRATE_ID, RDMA_USER_CM_CMD_QUERY, RDMA_USER_CM_CMD_BIND, RDMA_USER_CM_CMD_RESOLVE_ADDR, RDMA_USER_CM_CMD_JOIN_MCAST }; /* See IBTA Annex A11, service ID bytes 4 & 5 */ enum rdma_ucm_port_space { RDMA_PS_IPOIB = 0x0002, RDMA_PS_IB = 0x013F, RDMA_PS_TCP = 0x0106, RDMA_PS_UDP = 0x0111, }; /* * Command ABI structures. */ struct rdma_ucm_cmd_hdr { __u32 cmd; __u16 in; __u16 out; }; struct rdma_ucm_create_id { __aligned_u64 uid; __aligned_u64 response; __u16 ps; /* use enum rdma_ucm_port_space */ __u8 qp_type; __u8 reserved[5]; }; struct rdma_ucm_create_id_resp { __u32 id; }; struct rdma_ucm_destroy_id { __aligned_u64 response; __u32 id; __u32 reserved; }; struct rdma_ucm_destroy_id_resp { __u32 events_reported; }; struct rdma_ucm_bind_ip { __aligned_u64 response; struct sockaddr_in6 addr; __u32 id; }; struct rdma_ucm_bind { __u32 id; __u16 addr_size; __u16 reserved; struct __kernel_sockaddr_storage addr; }; struct rdma_ucm_resolve_ip { struct sockaddr_in6 src_addr; struct sockaddr_in6 dst_addr; __u32 id; __u32 timeout_ms; }; struct rdma_ucm_resolve_addr { __u32 id; __u32 timeout_ms; __u16 src_size; __u16 dst_size; __u32 reserved; struct __kernel_sockaddr_storage src_addr; struct __kernel_sockaddr_storage dst_addr; }; struct rdma_ucm_resolve_route { __u32 id; __u32 timeout_ms; }; enum { RDMA_USER_CM_QUERY_ADDR, RDMA_USER_CM_QUERY_PATH, RDMA_USER_CM_QUERY_GID }; struct rdma_ucm_query { __aligned_u64 response; __u32 id; __u32 option; }; struct rdma_ucm_query_route_resp { __aligned_u64 node_guid; struct ib_user_path_rec ib_route[2]; struct sockaddr_in6 src_addr; struct sockaddr_in6 dst_addr; __u32 num_paths; __u8 port_num; __u8 reserved[3]; __u32 ibdev_index; __u32 reserved1; }; struct rdma_ucm_query_addr_resp { __aligned_u64 node_guid; __u8 port_num; __u8 reserved; __u16 pkey; __u16 src_size; __u16 dst_size; struct __kernel_sockaddr_storage src_addr; struct __kernel_sockaddr_storage dst_addr; __u32 ibdev_index; __u32 reserved1; }; struct rdma_ucm_query_path_resp { __u32 num_paths; __u32 reserved; struct ib_path_rec_data path_data[]; };
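/*
 * Illustrative sketch, not part of this ABI header: every command in
 * this file is issued by write()ing a struct rdma_ucm_cmd_hdr followed
 * by the command payload to the rdma_cm character device, with hdr.in
 * set to the payload size and hdr.out to the size of the response the
 * kernel should write through the command's response pointer. The
 * device path and the numeric RC qp_type used below are assumptions of
 * this example, as is the uid cookie value.
 */
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

static int ucma_create_id(int fd, uint32_t *id_out)
{
	struct {
		struct rdma_ucm_cmd_hdr hdr;
		struct rdma_ucm_create_id cmd;
	} msg;
	struct rdma_ucm_create_id_resp resp;

	memset(&msg, 0, sizeof(msg));
	msg.hdr.cmd = RDMA_USER_CM_CMD_CREATE_ID;
	msg.hdr.in = sizeof(msg.cmd);
	msg.hdr.out = sizeof(resp);
	msg.cmd.uid = 1;			/* opaque cookie, echoed back in events */
	msg.cmd.response = (uintptr_t)&resp;	/* kernel writes the new id here */
	msg.cmd.ps = RDMA_PS_TCP;
	msg.cmd.qp_type = 2;			/* IB_QPT_RC; numeric value assumed */

	if (write(fd, &msg, sizeof(msg)) != sizeof(msg))
		return -1;
	*id_out = resp.id;
	return 0;
}

/*
 * A caller would obtain fd with open("/dev/infiniband/rdma_cm", O_RDWR),
 * the conventional device node for this ABI.
 */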
struct rdma_ucm_conn_param { __u32 qp_num; __u32 qkey; __u8 private_data[RDMA_MAX_PRIVATE_DATA]; __u8 private_data_len; __u8 srq; __u8 responder_resources; __u8 initiator_depth; __u8 flow_control; __u8 retry_count; __u8 rnr_retry_count; __u8 valid; }; struct rdma_ucm_ud_param { __u32 qp_num; __u32 qkey; struct ib_uverbs_ah_attr ah_attr; __u8 private_data[RDMA_MAX_PRIVATE_DATA]; __u8 private_data_len; __u8 reserved[7]; }; struct rdma_ucm_ece { __u32 vendor_id; __u32 attr_mod; }; struct rdma_ucm_connect { struct rdma_ucm_conn_param conn_param; __u32 id; __u32 reserved; struct rdma_ucm_ece ece; }; struct rdma_ucm_listen { __u32 id; __u32 backlog; }; struct rdma_ucm_accept { __aligned_u64 uid; struct rdma_ucm_conn_param conn_param; __u32 id; __u32 reserved; struct rdma_ucm_ece ece; }; struct rdma_ucm_reject { __u32 id; __u8 private_data_len; __u8 reason; __u8 reserved[2]; __u8 private_data[RDMA_MAX_PRIVATE_DATA]; }; struct rdma_ucm_disconnect { __u32 id; }; struct rdma_ucm_init_qp_attr { __aligned_u64 response; __u32 id; __u32 qp_state; }; struct rdma_ucm_notify { __u32 id; __u32 event; }; struct rdma_ucm_join_ip_mcast { __aligned_u64 response; /* rdma_ucm_create_id_resp */ __aligned_u64 uid; struct sockaddr_in6 addr; __u32 id; }; /* Multicast join flags */ enum { RDMA_MC_JOIN_FLAG_FULLMEMBER, RDMA_MC_JOIN_FLAG_SENDONLY_FULLMEMBER, RDMA_MC_JOIN_FLAG_RESERVED, }; struct rdma_ucm_join_mcast { __aligned_u64 response; /* rdma_ucm_create_id_resp */ __aligned_u64 uid; __u32 id; __u16 addr_size; __u16 join_flags; struct __kernel_sockaddr_storage addr; }; struct rdma_ucm_get_event { __aligned_u64 response; }; struct rdma_ucm_event_resp { __aligned_u64 uid; __u32 id; __u32 event; __u32 status; /* * NOTE: This union is not aligned to 8 bytes so none of the union * members may contain a u64 or anything with higher alignment than 4. */ union { struct rdma_ucm_conn_param conn; struct rdma_ucm_ud_param ud; } param; __u32 reserved; struct rdma_ucm_ece ece; }; /* Option levels */ enum { RDMA_OPTION_ID = 0, RDMA_OPTION_IB = 1 }; /* Option details */ enum { RDMA_OPTION_ID_TOS = 0, RDMA_OPTION_ID_REUSEADDR = 1, RDMA_OPTION_ID_AFONLY = 2, RDMA_OPTION_ID_ACK_TIMEOUT = 3 }; enum { RDMA_OPTION_IB_PATH = 1 }; struct rdma_ucm_set_option { __aligned_u64 optval; __u32 id; __u32 level; __u32 optname; __u32 optlen; }; struct rdma_ucm_migrate_id { __aligned_u64 response; __u32 id; __u32 fd; }; struct rdma_ucm_migrate_resp { __u32 events_reported; };
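/*
 * Illustrative sketch, not part of this ABI header: blocking on the
 * next CM event with the structures above. GET_EVENT is written like
 * any other command; the kernel fills the caller-supplied response
 * buffer once an event arrives. Reuses the includes of the earlier
 * sketch in this file.
 */
static int ucma_get_event(int fd, struct rdma_ucm_event_resp *evt)
{
	struct {
		struct rdma_ucm_cmd_hdr hdr;
		struct rdma_ucm_get_event cmd;
	} msg = {
		.hdr = {
			.cmd = RDMA_USER_CM_CMD_GET_EVENT,
			.in = sizeof(msg.cmd),
			.out = sizeof(*evt),
		},
		.cmd = { .response = (uintptr_t)evt },
	};

	/* blocks until an event is queued on this file */
	if (write(fd, &msg, sizeof(msg)) != sizeof(msg))
		return -1;
	return 0;
}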
#endif /* RDMA_USER_CM_H */ rdma-core-56.1/kernel-headers/rdma/rdma_user_ioctl.h000066400000000000000000000072471477342711600225040ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */ /* * Copyright (c) 2016 Mellanox Technologies, LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef RDMA_USER_IOCTL_H #define RDMA_USER_IOCTL_H #include #include #include /* Legacy name, for user space applications which already use it */ #define IB_IOCTL_MAGIC RDMA_IOCTL_MAGIC /* * General blocks assignments * It is closed on purpose - do not expose it to user space * #define MAD_CMD_BASE 0x00 * #define HFI1_CMD_BASE 0xE0 */ /* MAD specific section */ #define IB_USER_MAD_REGISTER_AGENT _IOWR(RDMA_IOCTL_MAGIC, 0x01, struct ib_user_mad_reg_req) #define IB_USER_MAD_UNREGISTER_AGENT _IOW(RDMA_IOCTL_MAGIC, 0x02, __u32) #define IB_USER_MAD_ENABLE_PKEY _IO(RDMA_IOCTL_MAGIC, 0x03) #define IB_USER_MAD_REGISTER_AGENT2 _IOWR(RDMA_IOCTL_MAGIC, 0x04, struct ib_user_mad_reg_req2) /* HFI specific section */ /* allocate HFI and context */ #define HFI1_IOCTL_ASSIGN_CTXT _IOWR(RDMA_IOCTL_MAGIC, 0xE1, struct hfi1_user_info) /* find out what resources we got */ #define HFI1_IOCTL_CTXT_INFO _IOW(RDMA_IOCTL_MAGIC, 0xE2, struct hfi1_ctxt_info) /* set up userspace */ #define HFI1_IOCTL_USER_INFO _IOW(RDMA_IOCTL_MAGIC, 0xE3, struct hfi1_base_info) /* update expected TID entries */ #define HFI1_IOCTL_TID_UPDATE _IOWR(RDMA_IOCTL_MAGIC, 0xE4, struct hfi1_tid_info) /* free expected TID entries */ #define HFI1_IOCTL_TID_FREE _IOWR(RDMA_IOCTL_MAGIC, 0xE5, struct hfi1_tid_info) /* force an update of PIO credit */ #define HFI1_IOCTL_CREDIT_UPD _IO(RDMA_IOCTL_MAGIC, 0xE6) /* control receipt of packets */ #define HFI1_IOCTL_RECV_CTRL _IOW(RDMA_IOCTL_MAGIC, 0xE8, int) /* set the kind of polling we want */ #define HFI1_IOCTL_POLL_TYPE _IOW(RDMA_IOCTL_MAGIC, 0xE9, int) /* ack & clear user status bits */ #define HFI1_IOCTL_ACK_EVENT _IOW(RDMA_IOCTL_MAGIC, 0xEA, unsigned long) /* set context's pkey */ #define HFI1_IOCTL_SET_PKEY _IOW(RDMA_IOCTL_MAGIC, 0xEB, __u16) /* reset context's HW send context */ #define HFI1_IOCTL_CTXT_RESET _IO(RDMA_IOCTL_MAGIC, 0xEC) /* read TID cache invalidations */ #define HFI1_IOCTL_TID_INVAL_READ _IOWR(RDMA_IOCTL_MAGIC, 0xED, struct hfi1_tid_info) /* get the version of the user cdev */ #define HFI1_IOCTL_GET_VERS _IOR(RDMA_IOCTL_MAGIC, 0xEE, int) #endif /* RDMA_USER_IOCTL_H */ rdma-core-56.1/kernel-headers/rdma/rdma_user_ioctl_cmds.h000066400000000000000000000050711477342711600235030ustar00rootroot00000000000000/* * Copyright (c) 2018, Mellanox Technologies inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer.
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef RDMA_USER_IOCTL_CMDS_H #define RDMA_USER_IOCTL_CMDS_H #include #include /* Documentation/userspace-api/ioctl/ioctl-number.rst */ #define RDMA_IOCTL_MAGIC 0x1b #define RDMA_VERBS_IOCTL \ _IOWR(RDMA_IOCTL_MAGIC, 1, struct ib_uverbs_ioctl_hdr) enum { /* User input */ UVERBS_ATTR_F_MANDATORY = 1U << 0, /* * Valid output bit should be ignored and considered set in * mandatory fields. This bit is kernel output. */ UVERBS_ATTR_F_VALID_OUTPUT = 1U << 1, }; struct ib_uverbs_attr { __u16 attr_id; /* command specific type attribute */ __u16 len; /* only for pointers and IDRs array */ __u16 flags; /* combination of UVERBS_ATTR_F_XXXX */ union { struct { __u8 elem_id; __u8 reserved; } enum_data; __u16 reserved; } attr_data; union { /* * ptr to command, inline data, idr/fd or * ptr to __u32 array of IDRs */ __aligned_u64 data; /* Used by FD_IN and FD_OUT */ __s64 data_s64; }; }; struct ib_uverbs_ioctl_hdr { __u16 length; __u16 object_id; __u16 method_id; __u16 num_attrs; __aligned_u64 reserved1; __u32 driver_id; __u32 reserved2; struct ib_uverbs_attr attrs[]; }; #endif rdma-core-56.1/kernel-headers/rdma/rdma_user_rxe.h000066400000000000000000000120071477342711600221560ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR Linux-OpenIB) */ /* * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef RDMA_USER_RXE_H #define RDMA_USER_RXE_H #include #include #include #include enum { RXE_NETWORK_TYPE_IPV4 = 1, RXE_NETWORK_TYPE_IPV6 = 2, }; union rxe_gid { __u8 raw[16]; struct { __be64 subnet_prefix; __be64 interface_id; } global; }; struct rxe_global_route { union rxe_gid dgid; __u32 flow_label; __u8 sgid_index; __u8 hop_limit; __u8 traffic_class; }; struct rxe_av { __u8 port_num; /* From RXE_NETWORK_TYPE_* */ __u8 network_type; __u8 dmac[6]; struct rxe_global_route grh; union { struct sockaddr_in _sockaddr_in; struct sockaddr_in6 _sockaddr_in6; } sgid_addr, dgid_addr; }; struct rxe_send_wr { __aligned_u64 wr_id; __u32 reserved; __u32 opcode; __u32 send_flags; union { __be32 imm_data; __u32 invalidate_rkey; } ex; union { struct { __aligned_u64 remote_addr; __u32 length; __u32 rkey; __u8 type; __u8 level; } flush; struct { __aligned_u64 remote_addr; __u32 rkey; __u32 reserved; } rdma; struct { __aligned_u64 remote_addr; __aligned_u64 compare_add; __aligned_u64 swap; __u32 rkey; __u32 reserved; } atomic; struct { __u32 remote_qpn; __u32 remote_qkey; __u16 pkey_index; __u16 reserved; __u32 ah_num; __u32 pad[4]; struct rxe_av av; } ud; struct { __aligned_u64 addr; __aligned_u64 length; __u32 mr_lkey; __u32 mw_rkey; __u32 rkey; __u32 access; } mw; /* reg is only used by the kernel and is not part of the uapi */ #ifdef __KERNEL__ struct { union { struct ib_mr *mr; __aligned_u64 reserved; }; __u32 key; __u32 access; } reg; #endif } wr; }; struct rxe_sge { __aligned_u64 addr; __u32 length; __u32 lkey; }; struct mminfo { __aligned_u64 offset; __u32 size; __u32 pad; }; struct rxe_dma_info { __u32 length; __u32 resid; __u32 cur_sge; __u32 num_sge; __u32 sge_offset; __u32 reserved; union { __DECLARE_FLEX_ARRAY(__u8, inline_data); __DECLARE_FLEX_ARRAY(__u8, atomic_wr); __DECLARE_FLEX_ARRAY(struct rxe_sge, sge); }; }; struct rxe_send_wqe { struct rxe_send_wr wr; __u32 status; __u32 state; __aligned_u64 iova; __u32 mask; __u32 first_psn; __u32 last_psn; __u32 ack_length; __u32 ssn; __u32 has_rd_atomic; struct rxe_dma_info dma; }; struct rxe_recv_wqe { __aligned_u64 wr_id; __u32 reserved; __u32 padding; struct rxe_dma_info dma; }; struct rxe_create_ah_resp { __u32 ah_num; __u32 reserved; }; struct rxe_create_cq_resp { struct mminfo mi; }; struct rxe_resize_cq_resp { struct mminfo mi; }; struct rxe_create_qp_resp { struct mminfo rq_mi; struct mminfo sq_mi; }; struct rxe_create_srq_resp { struct mminfo mi; __u32 srq_num; __u32 reserved; }; struct rxe_modify_srq_cmd { __aligned_u64 mmap_info_addr; }; /* This data structure is stored at the base of work and * completion queues shared between user space and kernel space. * It contains the producer and consumer indices. It also * contains a copy of the queue size parameters for user space * to use, but the kernel must use the parameters in the * rxe_queue struct. For performance reasons the producer and * consumer indices are arranged to be in separate cache lines. * The kernel should always mask the indices to avoid accessing * memory outside of the data area. */ struct rxe_queue_buf { __u32 log2_elem_size; __u32 index_mask; __u32 pad_1[30]; __u32 producer_index; __u32 pad_2[31]; __u32 consumer_index; __u32 pad_3[31]; __u8 data[]; }; #endif /* RDMA_USER_RXE_H */ rdma-core-56.1/kernel-headers/rdma/rvt-abi.h000066400000000000000000000033531477342711600206670ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */ /* * This file contains defines, structures, etc.
that are used * to communicate between kernel and user code. */ #ifndef RVT_ABI_USER_H #define RVT_ABI_USER_H #include #include #ifndef RDMA_ATOMIC_UAPI #define RDMA_ATOMIC_UAPI(_type, _name) struct{ _type val; } _name #endif struct rvt_wqe_sge { __aligned_u64 addr; __u32 length; __u32 lkey; }; /* * This structure is used to contain the head pointer, tail pointer, * and completion queue entries as a single memory allocation so * it can be mmap'ed into user space. */ struct rvt_cq_wc { /* index of next entry to fill */ RDMA_ATOMIC_UAPI(__u32, head); /* index of next ib_poll_cq() entry */ RDMA_ATOMIC_UAPI(__u32, tail); /* these are actually size ibcq.cqe + 1 */ struct ib_uverbs_wc uqueue[]; }; /* * Receive work request queue entry. * The size of the sg_list is determined when the QP (or SRQ) is created * and stored in qp->r_rq.max_sge (or srq->rq.max_sge). */ struct rvt_rwqe { __u64 wr_id; __u8 num_sge; __u8 padding[7]; struct rvt_wqe_sge sg_list[]; }; /* * This structure is used to contain the head pointer, tail pointer, * and receive work queue entries as a single memory allocation so * it can be mmap'ed into user space. * Note that the wq array elements are variable size so you can't * just index into the array to get the N'th element; * use get_rwqe_ptr() for user space and rvt_get_rwqe_ptr() * for kernel space. */ struct rvt_rwq { /* new work requests posted to the head */ RDMA_ATOMIC_UAPI(__u32, head); /* receives pull requests from here. */ RDMA_ATOMIC_UAPI(__u32, tail); struct rvt_rwqe wq[]; }; #endif /* RVT_ABI_USER_H */ rdma-core-56.1/kernel-headers/rdma/siw-abi.h000066400000000000000000000065421477342711600206610ustar00rootroot00000000000000/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause */ /* Authors: Bernard Metzler */ /* Copyright (c) 2008-2019, IBM Corporation */ #ifndef _SIW_USER_H #define _SIW_USER_H #include #define SIW_NODE_DESC_COMMON "Software iWARP stack" #define SIW_ABI_VERSION 1 #define SIW_MAX_SGE 6 #define SIW_UOBJ_MAX_KEY 0x08FFFF #define SIW_INVAL_UOBJ_KEY (SIW_UOBJ_MAX_KEY + 1) struct siw_uresp_create_cq { __u32 cq_id; __u32 num_cqe; __aligned_u64 cq_key; }; struct siw_uresp_create_qp { __u32 qp_id; __u32 num_sqe; __u32 num_rqe; __u32 pad; __aligned_u64 sq_key; __aligned_u64 rq_key; }; struct siw_ureq_reg_mr { __u8 stag_key; __u8 reserved[3]; __u32 pad; }; struct siw_uresp_reg_mr { __u32 stag; __u32 pad; }; struct siw_uresp_create_srq { __u32 num_rqe; __u32 pad; __aligned_u64 srq_key; }; struct siw_uresp_alloc_ctx { __u32 dev_id; __u32 pad; }; enum siw_opcode { SIW_OP_WRITE, SIW_OP_READ, SIW_OP_READ_LOCAL_INV, SIW_OP_SEND, SIW_OP_SEND_WITH_IMM, SIW_OP_SEND_REMOTE_INV, /* Unsupported */ SIW_OP_FETCH_AND_ADD, SIW_OP_COMP_AND_SWAP, SIW_OP_RECEIVE, /* provider internal SQE */ SIW_OP_READ_RESPONSE, /* * below opcodes valid for * in-kernel clients only */ SIW_OP_INVAL_STAG, SIW_OP_REG_MR, SIW_NUM_OPCODES }; /* Keep it same as ibv_sge to allow for memcpy */ struct siw_sge { __aligned_u64 laddr; __u32 length; __u32 lkey; }; /* * Inline data are kept within the work request itself occupying * the space of sge[1] .. sge[n]. Therefore, inline data cannot be * supported if SIW_MAX_SGE is below 2 elements. 
*/ #define SIW_MAX_INLINE (sizeof(struct siw_sge) * (SIW_MAX_SGE - 1)) #if SIW_MAX_SGE < 2 #error "SIW_MAX_SGE must be at least 2" #endif enum siw_wqe_flags { SIW_WQE_VALID = 1, SIW_WQE_INLINE = (1 << 1), SIW_WQE_SIGNALLED = (1 << 2), SIW_WQE_SOLICITED = (1 << 3), SIW_WQE_READ_FENCE = (1 << 4), SIW_WQE_REM_INVAL = (1 << 5), SIW_WQE_COMPLETED = (1 << 6) }; /* Send Queue Element */ struct siw_sqe { __aligned_u64 id; __u16 flags; __u8 num_sge; /* Contains enum siw_opcode values */ __u8 opcode; __u32 rkey; union { __aligned_u64 raddr; __aligned_u64 base_mr; }; union { struct siw_sge sge[SIW_MAX_SGE]; __aligned_u64 access; }; }; /* Receive Queue Element */ struct siw_rqe { __aligned_u64 id; __u16 flags; __u8 num_sge; /* * only used by kernel driver, * ignored if set by user */ __u8 opcode; __u32 unused; struct siw_sge sge[SIW_MAX_SGE]; }; enum siw_notify_flags { SIW_NOTIFY_NOT = (0), SIW_NOTIFY_SOLICITED = (1 << 0), SIW_NOTIFY_NEXT_COMPLETION = (1 << 1), SIW_NOTIFY_MISSED_EVENTS = (1 << 2), SIW_NOTIFY_ALL = SIW_NOTIFY_SOLICITED | SIW_NOTIFY_NEXT_COMPLETION | SIW_NOTIFY_MISSED_EVENTS }; enum siw_wc_status { SIW_WC_SUCCESS, SIW_WC_LOC_LEN_ERR, SIW_WC_LOC_PROT_ERR, SIW_WC_LOC_QP_OP_ERR, SIW_WC_WR_FLUSH_ERR, SIW_WC_BAD_RESP_ERR, SIW_WC_LOC_ACCESS_ERR, SIW_WC_REM_ACCESS_ERR, SIW_WC_REM_INV_REQ_ERR, SIW_WC_GENERAL_ERR, SIW_NUM_WC_STATUS }; struct siw_cqe { __aligned_u64 id; __u8 flags; __u8 opcode; __u16 status; __u32 bytes; union { __aligned_u64 imm_data; __u32 inval_stag; }; /* QP number or QP pointer */ union { struct ib_qp *base_qp; __aligned_u64 qp_id; }; }; /* * Shared structure between user and kernel * to control CQ arming. */ struct siw_cq_ctrl { __u32 flags; __u32 pad; }; #endif rdma-core-56.1/kernel-headers/rdma/vmw_pvrdma-abi.h000066400000000000000000000175131477342711600222410ustar00rootroot00000000000000/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) */ /* * Copyright (c) 2012-2016 VMware, Inc. All rights reserved. * * This program is free software; you can redistribute it and/or * modify it under the terms of EITHER the GNU General Public License * version 2 as published by the Free Software Foundation or the BSD * 2-Clause License. This program is distributed in the hope that it * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. * See the GNU General Public License version 2 for more details at * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html. * * You should have received a copy of the GNU General Public License * along with this program available in the file COPYING in the main * directory of this source tree. * * The BSD 2-Clause License * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED * OF THE POSSIBILITY OF SUCH DAMAGE. */ #ifndef __VMW_PVRDMA_ABI_H__ #define __VMW_PVRDMA_ABI_H__ #include #define PVRDMA_UVERBS_ABI_VERSION 3 /* ABI Version. */ #define PVRDMA_UAR_HANDLE_MASK 0x00FFFFFF /* Bottom 24 bits. */ #define PVRDMA_UAR_QP_OFFSET 0 /* QP doorbell. */ #define PVRDMA_UAR_QP_SEND (1 << 30) /* Send bit. */ #define PVRDMA_UAR_QP_RECV (1 << 31) /* Recv bit. */ #define PVRDMA_UAR_CQ_OFFSET 4 /* CQ doorbell. */ #define PVRDMA_UAR_CQ_ARM_SOL (1 << 29) /* Arm solicited bit. */ #define PVRDMA_UAR_CQ_ARM (1 << 30) /* Arm bit. */ #define PVRDMA_UAR_CQ_POLL (1 << 31) /* Poll bit. */ #define PVRDMA_UAR_SRQ_OFFSET 8 /* SRQ doorbell. */ #define PVRDMA_UAR_SRQ_RECV (1 << 30) /* Recv bit. */ enum pvrdma_wr_opcode { PVRDMA_WR_RDMA_WRITE, PVRDMA_WR_RDMA_WRITE_WITH_IMM, PVRDMA_WR_SEND, PVRDMA_WR_SEND_WITH_IMM, PVRDMA_WR_RDMA_READ, PVRDMA_WR_ATOMIC_CMP_AND_SWP, PVRDMA_WR_ATOMIC_FETCH_AND_ADD, PVRDMA_WR_LSO, PVRDMA_WR_SEND_WITH_INV, PVRDMA_WR_RDMA_READ_WITH_INV, PVRDMA_WR_LOCAL_INV, PVRDMA_WR_FAST_REG_MR, PVRDMA_WR_MASKED_ATOMIC_CMP_AND_SWP, PVRDMA_WR_MASKED_ATOMIC_FETCH_AND_ADD, PVRDMA_WR_BIND_MW, PVRDMA_WR_REG_SIG_MR, PVRDMA_WR_ERROR, }; enum pvrdma_wc_status { PVRDMA_WC_SUCCESS, PVRDMA_WC_LOC_LEN_ERR, PVRDMA_WC_LOC_QP_OP_ERR, PVRDMA_WC_LOC_EEC_OP_ERR, PVRDMA_WC_LOC_PROT_ERR, PVRDMA_WC_WR_FLUSH_ERR, PVRDMA_WC_MW_BIND_ERR, PVRDMA_WC_BAD_RESP_ERR, PVRDMA_WC_LOC_ACCESS_ERR, PVRDMA_WC_REM_INV_REQ_ERR, PVRDMA_WC_REM_ACCESS_ERR, PVRDMA_WC_REM_OP_ERR, PVRDMA_WC_RETRY_EXC_ERR, PVRDMA_WC_RNR_RETRY_EXC_ERR, PVRDMA_WC_LOC_RDD_VIOL_ERR, PVRDMA_WC_REM_INV_RD_REQ_ERR, PVRDMA_WC_REM_ABORT_ERR, PVRDMA_WC_INV_EECN_ERR, PVRDMA_WC_INV_EEC_STATE_ERR, PVRDMA_WC_FATAL_ERR, PVRDMA_WC_RESP_TIMEOUT_ERR, PVRDMA_WC_GENERAL_ERR, }; enum pvrdma_wc_opcode { PVRDMA_WC_SEND, PVRDMA_WC_RDMA_WRITE, PVRDMA_WC_RDMA_READ, PVRDMA_WC_COMP_SWAP, PVRDMA_WC_FETCH_ADD, PVRDMA_WC_BIND_MW, PVRDMA_WC_LSO, PVRDMA_WC_LOCAL_INV, PVRDMA_WC_FAST_REG_MR, PVRDMA_WC_MASKED_COMP_SWAP, PVRDMA_WC_MASKED_FETCH_ADD, PVRDMA_WC_RECV = 1 << 7, PVRDMA_WC_RECV_RDMA_WITH_IMM, }; enum pvrdma_wc_flags { PVRDMA_WC_GRH = 1 << 0, PVRDMA_WC_WITH_IMM = 1 << 1, PVRDMA_WC_WITH_INVALIDATE = 1 << 2, PVRDMA_WC_IP_CSUM_OK = 1 << 3, PVRDMA_WC_WITH_SMAC = 1 << 4, PVRDMA_WC_WITH_VLAN = 1 << 5, PVRDMA_WC_WITH_NETWORK_HDR_TYPE = 1 << 6, PVRDMA_WC_FLAGS_MAX = PVRDMA_WC_WITH_NETWORK_HDR_TYPE, }; enum pvrdma_network_type { PVRDMA_NETWORK_IB, PVRDMA_NETWORK_ROCE_V1 = PVRDMA_NETWORK_IB, PVRDMA_NETWORK_IPV4, PVRDMA_NETWORK_IPV6 }; struct pvrdma_alloc_ucontext_resp { __u32 qp_tab_size; __u32 reserved; }; struct pvrdma_alloc_pd_resp { __u32 pdn; __u32 reserved; }; struct pvrdma_create_cq { __aligned_u64 buf_addr; __u32 buf_size; __u32 reserved; }; struct pvrdma_create_cq_resp { __u32 cqn; __u32 reserved; }; struct pvrdma_resize_cq { __aligned_u64 buf_addr; __u32 buf_size; __u32 reserved; }; struct pvrdma_create_srq { __aligned_u64 buf_addr; __u32 buf_size; __u32 reserved; }; struct pvrdma_create_srq_resp { __u32 srqn; __u32 reserved; }; struct pvrdma_create_qp { __aligned_u64 rbuf_addr; 
__aligned_u64 sbuf_addr; __u32 rbuf_size; __u32 sbuf_size; __aligned_u64 qp_addr; }; struct pvrdma_create_qp_resp { __u32 qpn; __u32 qp_handle; }; /* PVRDMA masked atomic compare and swap */ struct pvrdma_ex_cmp_swap { __aligned_u64 swap_val; __aligned_u64 compare_val; __aligned_u64 swap_mask; __aligned_u64 compare_mask; }; /* PVRDMA masked atomic fetch and add */ struct pvrdma_ex_fetch_add { __aligned_u64 add_val; __aligned_u64 field_boundary; }; /* PVRDMA address vector. */ struct pvrdma_av { __u32 port_pd; __u32 sl_tclass_flowlabel; __u8 dgid[16]; __u8 src_path_bits; __u8 gid_index; __u8 stat_rate; __u8 hop_limit; __u8 dmac[6]; __u8 reserved[6]; }; /* PVRDMA scatter/gather entry */ struct pvrdma_sge { __aligned_u64 addr; __u32 length; __u32 lkey; }; /* PVRDMA receive queue work request */ struct pvrdma_rq_wqe_hdr { __aligned_u64 wr_id; /* wr id */ __u32 num_sge; /* size of s/g array */ __u32 total_len; /* reserved */ }; /* Use pvrdma_sge (ib_sge) for receive queue s/g array elements. */ /* PVRDMA send queue work request */ struct pvrdma_sq_wqe_hdr { __aligned_u64 wr_id; /* wr id */ __u32 num_sge; /* size of s/g array */ __u32 total_len; /* reserved */ __u32 opcode; /* operation type */ __u32 send_flags; /* wr flags */ union { __be32 imm_data; __u32 invalidate_rkey; } ex; __u32 reserved; union { struct { __aligned_u64 remote_addr; __u32 rkey; __u8 reserved[4]; } rdma; struct { __aligned_u64 remote_addr; __aligned_u64 compare_add; __aligned_u64 swap; __u32 rkey; __u32 reserved; } atomic; struct { __aligned_u64 remote_addr; __u32 log_arg_sz; __u32 rkey; union { struct pvrdma_ex_cmp_swap cmp_swap; struct pvrdma_ex_fetch_add fetch_add; } wr_data; } masked_atomics; struct { __aligned_u64 iova_start; __aligned_u64 pl_pdir_dma; __u32 page_shift; __u32 page_list_len; __u32 length; __u32 access_flags; __u32 rkey; __u32 reserved; } fast_reg; struct { __u32 remote_qpn; __u32 remote_qkey; struct pvrdma_av av; } ud; } wr; }; /* Use pvrdma_sge (ib_sge) for send queue s/g array elements. */ /* Completion queue element. */ struct pvrdma_cqe { __aligned_u64 wr_id; __aligned_u64 qp; __u32 opcode; __u32 status; __u32 byte_len; __be32 imm_data; __u32 src_qp; __u32 wc_flags; __u32 vendor_err; __u16 pkey_index; __u16 slid; __u8 sl; __u8 dlid_path_bits; __u8 port_num; __u8 smac[6]; __u8 network_hdr_type; __u8 reserved2[6]; /* Pad to next power of 2 (64). */ }; #endif /* __VMW_PVRDMA_ABI_H__ */ rdma-core-56.1/kernel-headers/update000077500000000000000000000161621477342711600174410ustar00rootroot00000000000000#!/usr/bin/env python3 # Copyright 2018 Mellanox Technologies Inc. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. # PYTHON_ARGCOMPLETE_OK """This script takes a commitish from a kernel tree and synchronizes the RDMA headers we use with that tree. During development, before commits are accepted to the official kernel git tree, the --not-final option should be used. 
Once finalized, the commit should be revised using --amend, e.g. using the exec feature of 'git rebase'""" import argparse import subprocess import tempfile import os import contextlib import textwrap import email.utils import collections def git_call(args): """Run git and display the output to the terminal""" return subprocess.check_call(['git',] + args); def git_output(args,mode=None,input=None): """Run git and return the output""" o = subprocess.check_output(['git',] + args,input=input); if mode == "raw": return o; return o.strip(); @contextlib.contextmanager def in_directory(dir): """Context manager that chdirs into a directory and restores the original directory when closed.""" cdir = os.getcwd(); old_env = {}; try: # git rebase invokes its exec with a bunch of git variables set that # prevent us from invoking git in another tree; blow them away. for k in list(os.environ.keys()): if k.startswith("GIT"): old_env[k] = os.environ[k]; del os.environ[k]; os.chdir(dir); yield True; finally: os.chdir(cdir); os.environ.update(old_env); def copy_objects(args): """Copy the uapi header objects from the kernel repo at the commit indicated into our repo. This is done by having git copy the tree object and blobs from the kernel tree into this tree and then revising our index. This is a simple way to ensure they match exactly.""" with in_directory(args.KERNEL_GIT): if args.not_final: fmt = "--format=?? (\"%s\")"; else: fmt = "--format=%h (\"%s\")"; kernel_desc = git_output(["log", "--abbrev=12","-1", fmt, args.COMMIT]); ntree = git_output(["rev-parse", "%s:include/uapi/rdma"%(args.COMMIT)]); pack = git_output(["pack-objects", "-q","--revs","--stdout"], mode="raw", input=ntree); git_output(["unpack-objects","-q"],input=pack); return (ntree,kernel_desc); def update_cmake(args,ntree): """Create a new CMakeLists.txt that lists all of the kernel headers for installation.""" # We need to expand to a publish_internal_headers for each directory fns = git_output(["ls-tree","--name-only","--full-tree","-r",ntree]).splitlines(); groups = collections.defaultdict(list); for I in fns: d,p = os.path.split(os.path.join("rdma",I.decode())); groups[d].append(p); data = subprocess.check_output(['git',"cat-file","blob", ":kernel-headers/CMakeLists.txt"]); data = data.decode(); # Build a new CMakeLists.txt in a temporary file with tempfile.NamedTemporaryFile("wt") as F: # Emit the headers lists for I,vals in sorted(groups.items()): F.write("publish_internal_headers(%s\n"%(I)); for J in sorted(vals): F.write(" %s\n"%(os.path.join(I,J))); F.write(" )\n"); F.write("\n"); # Throw away the old header lists cur = iter(data.splitlines()); for ln in cur: if not ln: continue; if ln.startswith("publish_internal_headers(rdma"): while not next(cur).startswith(" )"): pass; continue; F.write(ln + '\n'); break; # and copy the remaining lines for ln in cur: F.write(ln + '\n'); F.flush(); blob = git_output(["hash-object","-w",F.name]); git_call(["update-index","--cacheinfo", b"0644,%s,kernel-headers/CMakeLists.txt"%(blob)]);
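# Illustrative usage notes (not part of the script logic): typical
# invocations, assuming a kernel checkout at ~/linux; the commit names
# below are placeholders.
#
#   ./kernel-headers/update ~/linux v6.9
#   ./kernel-headers/update --amend --not-final ~/linux my-wip-branch
#
# As the module docstring suggests, the --amend form can be driven from
# a rebase once the kernel commit is final, e.g.:
#
#   git rebase -i --exec './kernel-headers/update --amend ~/linux v6.9'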
def make_commit(args,ntree,kernel_desc): """Make the rdma-core commit that syncs the kernel header directory.""" head_id = git_output(["rev-parse","HEAD"]); old_tree_id = git_output(["rev-parse",b"%s^{tree}"%(head_id)]); if args.amend: subject = git_output(["log","-1","--format=%s"]).decode(); if subject != "Update kernel headers": raise ValueError("In amend mode, but current HEAD does not seem to be a kernel update with subject %r"%( subject)); parent = git_output(["rev-parse",head_id + b"^"]); else: parent = head_id; emaila = email.utils.formataddr((git_output(["config","user.name"]).decode(), git_output(["config","user.email"]).decode())); # Build a new tree object that replaces the kernel headers directory with tempfile.NamedTemporaryFile() as F: os.environ["GIT_INDEX_FILE"] = F.name; git_call(["read-tree",head_id]); git_call(["rm","-r","--quiet","--cached", "kernel-headers/rdma"]); git_call(["read-tree","--prefix=kernel-headers/rdma",ntree]); update_cmake(args,ntree); all_tree = git_output(["write-tree"]); del os.environ["GIT_INDEX_FILE"]; if not args.amend and old_tree_id == all_tree: raise ValueError("Commit is empty, aborting"); # And now create the commit msg="Update kernel headers\n\n"; p = textwrap.fill("To commit: %s."%(kernel_desc.decode()), width=74) msg = msg + p; msg = msg + "\n\nSigned-off-by: %s\n"%(emaila); commit = git_output(["commit-tree",all_tree,"-p",parent, "-F","-"], input=msg.encode()); return commit,head_id; parser = argparse.ArgumentParser(description='Update kernel headers from the kernel tree') parser.add_argument("--amend", action="store_true", default=False, help="Replace the top commit with the kernel header commit"); parser.add_argument("--not-final", action="store_true", default=False, help="Use if the git commit given is not part of the official kernel git tree. This option should be used during development."); parser.add_argument("KERNEL_GIT", action="store", help="Kernel git directory"); parser.add_argument("COMMIT", action="store", help="Kernel commitish to synchronize headers with"); try: import argcomplete; argcomplete.autocomplete(parser); except ImportError: pass; args = parser.parse_args(); ntree,kernel_desc = copy_objects(args); commit,head_id = make_commit(args,ntree,kernel_desc); # Finalize if args.amend: print("Commit amended"); git_call(["--no-pager","diff","--stat",head_id,commit]); git_call(["reset","--merge",commit]); else: git_call(["merge","--ff","--ff-only",commit]); rdma-core-56.1/libibmad/000077500000000000000000000000001477342711600150755ustar00rootroot00000000000000rdma-core-56.1/libibmad/CMakeLists.txt000066400000000000000000000006461477342711600176430ustar00rootroot00000000000000publish_headers(infiniband mad.h mad_osd.h ) publish_internal_headers(util iba_types.h ) rdma_library(ibmad libibmad.map # See Documentation/versioning.md 5 5.5.${PACKAGE_VERSION} bm.c cc.c dump.c fields.c gs.c mad.c portid.c register.c resolve.c rpc.c sa.c serv.c smp.c vendor.c ) target_link_libraries(ibmad LINK_PRIVATE ibumad ) rdma_pkg_config("ibmad" "libibumad" "") rdma-core-56.1/libibmad/bm.c000066400000000000000000000061721477342711600156450ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution.
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #undef DEBUG #define DEBUG if (ibdebug) IBWARN static inline int response_expected(int method) { return method == IB_MAD_METHOD_GET || method == IB_MAD_METHOD_SET || method == IB_MAD_METHOD_TRAP; } uint8_t *bm_call_via(void *data, ib_portid_t * portid, ib_bm_call_t * call, struct ibmad_port * srcport) { ib_rpc_t rpc = { 0 }; int resp_expected; struct { uint64_t bkey; uint8_t reserved[32]; uint8_t data[IB_BM_DATA_SZ]; } bm_data; DEBUG("route %s data %p", portid2str(portid), data); if (portid->lid <= 0) { IBWARN("only lid routes are supported"); return NULL; } resp_expected = response_expected(call->method); rpc.mgtclass = IB_BOARD_MGMT_CLASS; rpc.method = call->method; rpc.attr.id = call->attrid; rpc.attr.mod = call->mod; rpc.timeout = resp_expected ? call->timeout : 0; // send data and bkey rpc.datasz = IB_BM_BKEY_AND_DATA_SZ; rpc.dataoffs = IB_BM_BKEY_OFFS; // copy data to a buffer which also includes the bkey bm_data.bkey = htonll(call->bkey); memset(bm_data.reserved, 0, sizeof(bm_data.reserved)); memcpy(bm_data.data, data, IB_BM_DATA_SZ); DEBUG ("method 0x%x attr 0x%x mod 0x%x datasz %d off %d res_ex %d bkey 0x%08x%08x", rpc.method, rpc.attr.id, rpc.attr.mod, rpc.datasz, rpc.dataoffs, resp_expected, (int)(call->bkey >> 32), (int)call->bkey); portid->qp = 1; if (!portid->qkey) portid->qkey = IB_DEFAULT_QP1_QKEY; if (resp_expected) { /* FIXME: no RMPP for now */ if (mad_rpc(srcport, &rpc, portid, &bm_data, &bm_data)) goto return_ok; return NULL; } if (mad_send_via(&rpc, portid, NULL, &bm_data, srcport) < 0) return NULL; return_ok: memcpy(data, bm_data.data, IB_BM_DATA_SZ); return data; } rdma-core-56.1/libibmad/cc.c000066400000000000000000000065021477342711600156310ustar00rootroot00000000000000/* * Copyright (c) 2011 Lawrence Livermore National Lab. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include "mad_internal.h" #undef DEBUG #define DEBUG if (ibdebug) IBWARN void *cc_query_status_via(void *rcvbuf, ib_portid_t * portid, unsigned attrid, unsigned mod, unsigned timeout, int *rstatus, const struct ibmad_port * srcport, uint64_t cckey) { ib_rpc_cc_t rpc = { 0 }; void *res; DEBUG("attr 0x%x mod 0x%x route %s", attrid, mod, portid2str(portid)); rpc.method = IB_MAD_METHOD_GET; rpc.attr.id = attrid; rpc.attr.mod = mod; rpc.timeout = timeout; if (attrid == IB_CC_ATTR_CONGESTION_LOG) { rpc.datasz = IB_CC_LOG_DATA_SZ; rpc.dataoffs = IB_CC_LOG_DATA_OFFS; } else { rpc.datasz = IB_CC_DATA_SZ; rpc.dataoffs = IB_CC_DATA_OFFS; } rpc.mgtclass = IB_CC_CLASS; rpc.cckey = cckey; portid->qp = 1; if (!portid->qkey) portid->qkey = IB_DEFAULT_QP1_QKEY; res = mad_rpc(srcport, (ib_rpc_t *)&rpc, portid, rcvbuf, rcvbuf); if (rstatus) *rstatus = rpc.rstatus; return res; } void *cc_config_status_via(void *payload, void *rcvbuf, ib_portid_t * portid, unsigned attrid, unsigned mod, unsigned timeout, int *rstatus, const struct ibmad_port * srcport, uint64_t cckey) { ib_rpc_cc_t rpc = { 0 }; void *res; DEBUG("attr 0x%x mod 0x%x route %s", attrid, mod, portid2str(portid)); rpc.method = IB_MAD_METHOD_SET; rpc.attr.id = attrid; rpc.attr.mod = mod; rpc.timeout = timeout; if (attrid == IB_CC_ATTR_CONGESTION_LOG) { rpc.datasz = IB_CC_LOG_DATA_SZ; rpc.dataoffs = IB_CC_LOG_DATA_OFFS; } else { rpc.datasz = IB_CC_DATA_SZ; rpc.dataoffs = IB_CC_DATA_OFFS; } rpc.mgtclass = IB_CC_CLASS; rpc.cckey = cckey; portid->qp = 1; if (!portid->qkey) portid->qkey = IB_DEFAULT_QP1_QKEY; res = mad_rpc(srcport, (ib_rpc_t *)&rpc, portid, payload, rcvbuf); if (rstatus) *rstatus = rpc.rstatus; return res; } rdma-core-56.1/libibmad/dump.c000066400000000000000000000744641477342711600162250ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2007 Xsigo Systems Inc. All rights reserved. * Copyright (c) 2009-2011 Mellanox Technologies LTD. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include void mad_dump_int(char *buf, int bufsz, void *val, int valsz) { switch (valsz) { case 1: snprintf(buf, bufsz, "%d", *(uint32_t *) val & 0xff); break; case 2: snprintf(buf, bufsz, "%d", *(uint32_t *) val & 0xffff); break; case 3: case 4: snprintf(buf, bufsz, "%d", *(uint32_t *) val); break; case 5: case 6: case 7: case 8: snprintf(buf, bufsz, "%" PRIu64, *(uint64_t *) val); break; default: IBWARN("bad int sz %d", valsz); buf[0] = 0; } } void mad_dump_uint(char *buf, int bufsz, void *val, int valsz) { switch (valsz) { case 1: snprintf(buf, bufsz, "%u", *(uint32_t *) val & 0xff); break; case 2: snprintf(buf, bufsz, "%u", *(uint32_t *) val & 0xffff); break; case 3: case 4: snprintf(buf, bufsz, "%u", *(uint32_t *) val); break; case 5: case 6: case 7: case 8: snprintf(buf, bufsz, "%" PRIu64, *(uint64_t *) val); break; default: IBWARN("bad int sz %u", valsz); buf[0] = 0; } } void mad_dump_hex(char *buf, int bufsz, void *val, int valsz) { switch (valsz) { case 1: snprintf(buf, bufsz, "0x%02x", *(uint32_t *) val & 0xff); break; case 2: snprintf(buf, bufsz, "0x%04x", *(uint32_t *) val & 0xffff); break; case 3: snprintf(buf, bufsz, "0x%06x", *(uint32_t *) val & 0xffffff); break; case 4: snprintf(buf, bufsz, "0x%08x", *(uint32_t *) val); break; case 5: snprintf(buf, bufsz, "0x%010" PRIx64, *(uint64_t *) val & (uint64_t) 0xffffffffffULL); break; case 6: snprintf(buf, bufsz, "0x%012" PRIx64, *(uint64_t *) val & (uint64_t) 0xffffffffffffULL); break; case 7: snprintf(buf, bufsz, "0x%014" PRIx64, *(uint64_t *) val & (uint64_t) 0xffffffffffffffULL); break; case 8: snprintf(buf, bufsz, "0x%016" PRIx64, *(uint64_t *) val); break; default: IBWARN("bad int sz %d", valsz); buf[0] = 0; } } void mad_dump_rhex(char *buf, int bufsz, void *val, int valsz) { switch (valsz) { case 1: snprintf(buf, bufsz, "%02x", *(uint32_t *) val & 0xff); break; case 2: snprintf(buf, bufsz, "%04x", *(uint32_t *) val & 0xffff); break; case 3: snprintf(buf, bufsz, "%06x", *(uint32_t *) val & 0xffffff); break; case 4: snprintf(buf, bufsz, "%08x", *(uint32_t *) val); break; case 5: snprintf(buf, bufsz, "%010" PRIx64, *(uint64_t *) val & (uint64_t) 0xffffffffffULL); break; case 6: snprintf(buf, bufsz, "%012" PRIx64, *(uint64_t *) val & (uint64_t) 0xffffffffffffULL); break; case 7: snprintf(buf, bufsz, "%014" PRIx64, *(uint64_t *) val & (uint64_t) 0xffffffffffffffULL); break; case 8: snprintf(buf, bufsz, "%016" PRIx64, *(uint64_t *) val); break; default: IBWARN("bad int sz %d", valsz); buf[0] = 0; } } void mad_dump_linkwidth(char *buf, int bufsz, void *val, int valsz) { int width = *(int *)val; switch (width) { case 1: snprintf(buf, bufsz, "1X"); break; case 2: snprintf(buf, bufsz, "4X"); break; case 4: snprintf(buf, bufsz, "8X"); break; case 8: snprintf(buf, bufsz, "12X"); break; case 16: snprintf(buf, bufsz, "2X"); break; default: IBWARN("bad width %d", width); snprintf(buf, bufsz, "undefined (%d)", width); break; } } static void dump_linkwidth(char *buf, int bufsz, int width) { int n = 0; if (width & 0x1) n += snprintf(buf + n, bufsz - n, "1X or "); if (n < bufsz && (width & 0x2)) n += snprintf(buf + n, bufsz - n, "4X or "); if (n < bufsz && (width & 0x4)) n += snprintf(buf + n, bufsz - n, "8X or "); if (n < bufsz && (width & 0x8)) n += 
snprintf(buf + n, bufsz - n, "12X or "); if (n < bufsz && (width & 0x10)) n += snprintf(buf + n, bufsz - n, "2X or "); if (n >= bufsz) return; else if (width == 0 || (width >> 5)) snprintf(buf + n, bufsz - n, "undefined (%d)", width); else if (bufsz > 3) buf[n - 4] = '\0'; } void mad_dump_linkwidthsup(char *buf, int bufsz, void *val, int valsz) { int width = *(int *)val; dump_linkwidth(buf, bufsz, width); switch (width) { case 1: case 3: case 7: case 11: case 15: case 17: case 19: case 23: case 27: case 31: break; default: if (!(width >> 5)) snprintf(buf + strlen(buf), bufsz - strlen(buf), " (IBA extension)"); break; } } void mad_dump_linkwidthen(char *buf, int bufsz, void *val, int valsz) { int width = *(int *)val; dump_linkwidth(buf, bufsz, width); } void mad_dump_linkspeed(char *buf, int bufsz, void *val, int valsz) { int speed = *(int *)val; switch (speed) { case 0: snprintf(buf, bufsz, "Extended speed"); break; case 1: snprintf(buf, bufsz, "2.5 Gbps"); break; case 2: snprintf(buf, bufsz, "5.0 Gbps"); break; case 4: snprintf(buf, bufsz, "10.0 Gbps"); break; default: snprintf(buf, bufsz, "undefined (%d)", speed); break; } } static void dump_linkspeed(char *buf, int bufsz, int speed) { int n = 0; if (speed & 0x1) n += snprintf(buf + n, bufsz - n, "2.5 Gbps or "); if (n < bufsz && (speed & 0x2)) n += snprintf(buf + n, bufsz - n, "5.0 Gbps or "); if (n < bufsz && (speed & 0x4)) n += snprintf(buf + n, bufsz - n, "10.0 Gbps or "); if (n >= bufsz) return; else if (speed == 0 || (speed >> 3)) { n += snprintf(buf + n, bufsz - n, "undefined (%d)", speed); if (n >= bufsz) return; } else if (bufsz > 3) { buf[n - 4] = '\0'; n -= 4; } switch (speed) { case 1: case 3: case 5: case 7: break; default: if (!(speed >> 3)) snprintf(buf + n, bufsz - n, " (IBA extension)"); break; } } void mad_dump_linkspeedsup(char *buf, int bufsz, void *val, int valsz) { int speed = *(int *)val; dump_linkspeed(buf, bufsz, speed); } void mad_dump_linkspeeden(char *buf, int bufsz, void *val, int valsz) { int speed = *(int *)val; dump_linkspeed(buf, bufsz, speed); } void mad_dump_linkspeedext(char *buf, int bufsz, void *val, int valsz) { int speed = *(int *)val; switch (speed) { case 0: snprintf(buf, bufsz, "No Extended Speed"); break; case 1: snprintf(buf, bufsz, "14.0625 Gbps"); break; case 2: snprintf(buf, bufsz, "25.78125 Gbps"); break; case 4: snprintf(buf, bufsz, "53.125 Gbps"); break; case 8: snprintf(buf, bufsz, "106.25 Gbps"); break; default: snprintf(buf, bufsz, "undefined (%d)", speed); break; } } static void dump_linkspeedext(char *buf, int bufsz, int speed) { int n = 0; if (speed == 0) { sprintf(buf, "%d", speed); return; } if (speed & 0x1) n += snprintf(buf + n, bufsz - n, "14.0625 Gbps or "); if (n < bufsz && speed & 0x2) n += snprintf(buf + n, bufsz - n, "25.78125 Gbps or "); if (n < bufsz && speed & 0x4) n += snprintf(buf + n, bufsz - n, "53.125 Gbps or "); if (n < bufsz && speed & 0x8) n += snprintf(buf + n, bufsz - n, "106.25 Gbps or "); if (n >= bufsz) { if (bufsz > 3) buf[n - 4] = '\0'; return; } if (speed >> 4) { n += snprintf(buf + n, bufsz - n, "undefined (%d)", speed); return; } else if (bufsz > 3) buf[n - 4] = '\0'; } void mad_dump_linkspeedextsup(char *buf, int bufsz, void *val, int valsz) { int speed = *(int *)val; dump_linkspeedext(buf, bufsz, speed); } void mad_dump_linkspeedexten(char *buf, int bufsz, void *val, int valsz) { int speed = *(int *)val; if (speed == 30) { sprintf(buf, "%s", "Extended link speeds disabled"); return; } dump_linkspeedext(buf, bufsz, speed); } void 
mad_dump_linkspeedext2(char *buf, int bufsz, void *val, int valsz) { int speed = *(int *) val; switch (speed) { case 0: snprintf(buf, bufsz, "No Extended Speed 2"); break; case 1: snprintf(buf, bufsz, "212.5 Gbps"); break; default: snprintf(buf, bufsz, "undefined (%d)", speed); break; } } static void dump_linkspeedext2(char *buf, int bufsz, int speed) { int n = 0; if (speed == 0) { snprintf(buf, bufsz, "%d", speed); return; } if (speed & 0x1) snprintf(buf, bufsz, "212.5 Gbps"); if (n >= bufsz) return; if (speed >> 1) snprintf(buf + n, bufsz - n, " undefined (%d)", speed); } void mad_dump_linkspeedextsup2(char *buf, int bufsz, void *val, int valsz) { int speed = *(int *) val; dump_linkspeedext2(buf, bufsz, speed); } void mad_dump_linkspeedexten2(char *buf, int bufsz, void *val, int valsz) { int speed = *(int *) val; dump_linkspeedext2(buf, bufsz, speed); } void mad_dump_portstate(char *buf, int bufsz, void *val, int valsz) { int state = *(int *)val; switch (state) { case 0: snprintf(buf, bufsz, "NoChange"); break; case 1: snprintf(buf, bufsz, "Down"); break; case 2: snprintf(buf, bufsz, "Initialize"); break; case 3: snprintf(buf, bufsz, "Armed"); break; case 4: snprintf(buf, bufsz, "Active"); break; default: snprintf(buf, bufsz, "?(%d)", state); } } void mad_dump_linkdowndefstate(char *buf, int bufsz, void *val, int valsz) { int state = *(int *)val; switch (state) { case 0: snprintf(buf, bufsz, "NoChange"); break; case 1: snprintf(buf, bufsz, "Sleep"); break; case 2: snprintf(buf, bufsz, "Polling"); break; default: snprintf(buf, bufsz, "?(%d)", state); break; } } void mad_dump_physportstate(char *buf, int bufsz, void *val, int valsz) { int state = *(int *)val; switch (state) { case 0: snprintf(buf, bufsz, "NoChange"); break; case 1: snprintf(buf, bufsz, "Sleep"); break; case 2: snprintf(buf, bufsz, "Polling"); break; case 3: snprintf(buf, bufsz, "Disabled"); break; case 4: snprintf(buf, bufsz, "PortConfigurationTraining"); break; case 5: snprintf(buf, bufsz, "LinkUp"); break; case 6: snprintf(buf, bufsz, "LinkErrorRecovery"); break; case 7: snprintf(buf, bufsz, "PhyTest"); break; default: snprintf(buf, bufsz, "?(%d)", state); } } void mad_dump_mtu(char *buf, int bufsz, void *val, int valsz) { int mtu = *(int *)val; switch (mtu) { case 1: snprintf(buf, bufsz, "256"); break; case 2: snprintf(buf, bufsz, "512"); break; case 3: snprintf(buf, bufsz, "1024"); break; case 4: snprintf(buf, bufsz, "2048"); break; case 5: snprintf(buf, bufsz, "4096"); break; default: snprintf(buf, bufsz, "?(%d)", mtu); } } void mad_dump_vlcap(char *buf, int bufsz, void *val, int valsz) { int vlcap = *(int *)val; switch (vlcap) { case 1: snprintf(buf, bufsz, "VL0"); break; case 2: snprintf(buf, bufsz, "VL0-1"); break; case 3: snprintf(buf, bufsz, "VL0-3"); break; case 4: snprintf(buf, bufsz, "VL0-7"); break; case 5: snprintf(buf, bufsz, "VL0-14"); break; default: snprintf(buf, bufsz, "?(%d)", vlcap); } } void mad_dump_opervls(char *buf, int bufsz, void *val, int valsz) { int opervls = *(int *)val; switch (opervls) { case 0: snprintf(buf, bufsz, "No change"); break; case 1: snprintf(buf, bufsz, "VL0"); break; case 2: snprintf(buf, bufsz, "VL0-1"); break; case 3: snprintf(buf, bufsz, "VL0-3"); break; case 4: snprintf(buf, bufsz, "VL0-7"); break; case 5: snprintf(buf, bufsz, "VL0-14"); break; default: snprintf(buf, bufsz, "?(%d)", opervls); } } void mad_dump_portcapmask(char *buf, int bufsz, void *val, int valsz) { unsigned mask = *(unsigned *)val; char *s = buf; s += sprintf(s, "0x%x\n", mask); if (mask & (1 << 1)) s += 
sprintf(s, "\t\t\t\tIsSM\n"); if (mask & (1 << 2)) s += sprintf(s, "\t\t\t\tIsNoticeSupported\n"); if (mask & (1 << 3)) s += sprintf(s, "\t\t\t\tIsTrapSupported\n"); if (mask & (1 << 4)) s += sprintf(s, "\t\t\t\tIsOptionalIPDSupported\n"); if (mask & (1 << 5)) s += sprintf(s, "\t\t\t\tIsAutomaticMigrationSupported\n"); if (mask & (1 << 6)) s += sprintf(s, "\t\t\t\tIsSLMappingSupported\n"); if (mask & (1 << 7)) s += sprintf(s, "\t\t\t\tIsMKeyNVRAM\n"); if (mask & (1 << 8)) s += sprintf(s, "\t\t\t\tIsPKeyNVRAM\n"); if (mask & (1 << 9)) s += sprintf(s, "\t\t\t\tIsLedInfoSupported\n"); if (mask & (1 << 10)) s += sprintf(s, "\t\t\t\tIsSMdisabled\n"); if (mask & (1 << 11)) s += sprintf(s, "\t\t\t\tIsSystemImageGUIDsupported\n"); if (mask & (1 << 12)) s += sprintf(s, "\t\t\t\tIsPkeySwitchExternalPortTrapSupported\n"); if (mask & (1 << 14)) s += sprintf(s, "\t\t\t\tIsExtendedSpeedsSupported\n"); if (mask & (1 << 15)) s += sprintf(s, "\t\t\t\tIsCapabilityMask2Supported\n"); if (mask & (1 << 16)) s += sprintf(s, "\t\t\t\tIsCommunicatonManagementSupported\n"); if (mask & (1 << 17)) s += sprintf(s, "\t\t\t\tIsSNMPTunnelingSupported\n"); if (mask & (1 << 18)) s += sprintf(s, "\t\t\t\tIsReinitSupported\n"); if (mask & (1 << 19)) s += sprintf(s, "\t\t\t\tIsDeviceManagementSupported\n"); if (mask & (1 << 20)) s += sprintf(s, "\t\t\t\tIsVendorClassSupported\n"); if (mask & (1 << 21)) s += sprintf(s, "\t\t\t\tIsDRNoticeSupported\n"); if (mask & (1 << 22)) s += sprintf(s, "\t\t\t\tIsCapabilityMaskNoticeSupported\n"); if (mask & (1 << 23)) s += sprintf(s, "\t\t\t\tIsBootManagementSupported\n"); if (mask & (1 << 24)) s += sprintf(s, "\t\t\t\tIsLinkRoundTripLatencySupported\n"); if (mask & (1 << 25)) s += sprintf(s, "\t\t\t\tIsClientRegistrationSupported\n"); if (mask & (1 << 26)) s += sprintf(s, "\t\t\t\tIsOtherLocalChangesNoticeSupported\n"); if (mask & (1 << 27)) s += sprintf(s, "\t\t\t\tIsLinkSpeedWidthPairsTableSupported\n"); if (mask & (1 << 28)) s += sprintf(s, "\t\t\t\tIsVendorSpecificMadsTableSupported\n"); if (mask & (1 << 29)) s += sprintf(s, "\t\t\t\tIsMcastPkeyTrapSuppressionSupported\n"); if (mask & (1 << 30)) s += sprintf(s, "\t\t\t\tIsMulticastFDBTopSupported\n"); if (mask & ((1U) << 31)) s += sprintf(s, "\t\t\t\tIsHierarchyInfoSupported\n"); if (s != buf) *(--s) = 0; } void mad_dump_portcapmask2(char *buf, int bufsz, void *val, int valsz) { int mask = *(int *)val; char *s = buf; s += sprintf(s, "0x%x\n", mask); if (mask & (1 << 0)) s += sprintf(s, "\t\t\t\tIsSetNodeDescriptionSupported\n"); if (mask & (1 << 1)) s += sprintf(s, "\t\t\t\tIsPortInfoExtendedSupported\n"); if (mask & (1 << 2)) s += sprintf(s, "\t\t\t\tIsVirtualizationSupported\n"); if (mask & (1 << 3)) s += sprintf(s, "\t\t\t\tIsSwitchPortStateTableSupported\n"); if (mask & (1 << 4)) s += sprintf(s, "\t\t\t\tIsLinkWidth2xSupported\n"); if (mask & (1 << 5)) s += sprintf(s, "\t\t\t\tIsLinkSpeedHDRSupported\n"); if (mask & (1 << 6)) s += sprintf(s, "\t\t\t\tIsMKeyProtectBitsExtSupported\n"); if (mask & (1 << 7)) s += sprintf(s, "\t\t\t\tIsEnhancedTrap128Supported\n"); if (mask & (1 << 8)) s += sprintf(s, "\t\t\t\tIsPartitionTopSupported\n"); if (mask & (1 << 9)) s += sprintf(s, "\t\t\t\tIsEnhancedQoSArbiterSupported\n"); if (mask & (1 << 10)) s += sprintf(s, "\t\t\t\tIsLinkSpeedNDRSupported\n"); if (s != buf) *(--s) = 0; } void mad_dump_bitfield(char *buf, int bufsz, void *val, int valsz) { snprintf(buf, bufsz, "0x%x", *(uint32_t *) val); } void mad_dump_array(char *buf, int bufsz, void *val, int valsz) { uint8_t *p = val, *e; char *s 
= buf; if (bufsz < valsz * 2) valsz = bufsz / 2; for (p = val, e = p + valsz; p < e; p++, s += 2) sprintf(s, "%02x", *p); } void mad_dump_string(char *buf, int bufsz, void *val, int valsz) { if (bufsz < valsz) valsz = bufsz; snprintf(buf, valsz, "'%s'", (char *)val); } void mad_dump_node_type(char *buf, int bufsz, void *val, int valsz) { int nodetype = *(int *)val; switch (nodetype) { case 1: snprintf(buf, bufsz, "Channel Adapter"); break; case 2: snprintf(buf, bufsz, "Switch"); break; case 3: snprintf(buf, bufsz, "Router"); break; default: snprintf(buf, bufsz, "?(%d)?", nodetype); break; } } #define IB_MAX_NUM_VLS 16 #define IB_MAX_NUM_VLS_TO_U8 ((IB_MAX_NUM_VLS)/2) typedef struct _ib_slvl_table { uint8_t vl_by_sl_num[IB_MAX_NUM_VLS_TO_U8]; } ib_slvl_table_t; static inline void ib_slvl_get_i(ib_slvl_table_t * tbl, int i, uint8_t * vl) { *vl = (tbl->vl_by_sl_num[i >> 1] >> ((!(i & 1)) << 2)) & 0xf; } #define IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK 32 typedef struct _ib_vl_arb_table { struct { uint8_t res_vl; uint8_t weight; } vl_entry[IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK]; } ib_vl_arb_table_t; static inline void ib_vl_arb_get_vl(uint8_t res_vl, uint8_t * const vl) { *vl = res_vl & 0x0F; } void mad_dump_sltovl(char *buf, int bufsz, void *val, int valsz) { ib_slvl_table_t *p_slvl_tbl = val; uint8_t vl; int i, n = 0; n = snprintf(buf, bufsz, "|"); for (i = 0; i < 16; i++) { ib_slvl_get_i(p_slvl_tbl, i, &vl); n += snprintf(buf + n, bufsz - n, "%2u|", vl); if (n >= bufsz) break; } snprintf(buf + n, bufsz - n, "\n"); } void mad_dump_vlarbitration(char *buf, int bufsz, void *val, int num) { ib_vl_arb_table_t *p_vla_tbl = val; int i, n; uint8_t vl; num /= sizeof(p_vla_tbl->vl_entry[0]); n = snprintf(buf, bufsz, "\nVL : |"); if (n >= bufsz) return; for (i = 0; i < num; i++) { ib_vl_arb_get_vl(p_vla_tbl->vl_entry[i].res_vl, &vl); n += snprintf(buf + n, bufsz - n, "0x%-2X|", vl); if (n >= bufsz) return; } n += snprintf(buf + n, bufsz - n, "\nWEIGHT: |"); if (n >= bufsz) return; for (i = 0; i < num; i++) { n += snprintf(buf + n, bufsz - n, "0x%-2X|", p_vla_tbl->vl_entry[i].weight); if (n >= bufsz) return; } snprintf(buf + n, bufsz - n, "\n"); } static int _dump_fields(char *buf, int bufsz, void *data, int start, int end) { char val[64]; char *s = buf; int n, field; for (field = start; field < end && bufsz > 0; field++) { mad_decode_field(data, field, val); if (!mad_dump_field(field, s, bufsz-1, val)) return -1; n = strlen(s); s += n; *s++ = '\n'; *s = 0; n++; bufsz -= n; } return (int)(s - buf); } void mad_dump_fields(char *buf, int bufsz, void *val, int valsz, int start, int end) { _dump_fields(buf, bufsz, val, start, end); } void mad_dump_nodedesc(char *buf, int bufsz, void *val, int valsz) { strncpy(buf, val, bufsz); if (valsz < bufsz) buf[valsz] = 0; } void mad_dump_nodeinfo(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_NODE_FIRST_F, IB_NODE_LAST_F); } void mad_dump_portinfo(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PORT_FIRST_F, IB_PORT_LAST_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PORT_CAPMASK2_F, IB_PORT_LINK_SPEED_EXT_LAST_F); } void mad_dump_portstates(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_PORT_STATE_F, IB_PORT_LINK_DOWN_DEF_F); } void mad_dump_switchinfo(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_SW_FIRST_F, IB_SW_LAST_F); } void mad_dump_perfcounters(char *buf, int bufsz, void *val, int valsz) { int cnt, cnt2; cnt = 
_dump_fields(buf, bufsz, val, IB_PC_FIRST_F, IB_PC_VL15_DROPPED_F); if (cnt < 0) return; cnt2 = _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_QP1_DROP_F, IB_PC_QP1_DROP_F + 1); if (cnt2 < 0) return; _dump_fields(buf + cnt + cnt2, bufsz - cnt - cnt2, val, IB_PC_VL15_DROPPED_F, IB_PC_LAST_F); } void mad_dump_perfcounters_ext(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_FIRST_F, IB_PC_EXT_LAST_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_EXT_COUNTER_SELECT2_F, IB_PC_EXT_ERR_LAST_F); } void mad_dump_perfcounters_xmt_sl(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_XMT_DATA_SL_FIRST_F, IB_PC_XMT_DATA_SL_LAST_F); } void mad_dump_perfcounters_rcv_sl(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_RCV_DATA_SL_FIRST_F, IB_PC_RCV_DATA_SL_LAST_F); } void mad_dump_perfcounters_xmt_disc(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_XMT_INACT_DISC_F, IB_PC_XMT_DISC_LAST_F); } void mad_dump_perfcounters_rcv_err(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_RCV_LOCAL_PHY_ERR_F, IB_PC_RCV_ERR_LAST_F); } void mad_dump_portsamples_control(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_PSC_OPCODE_F, IB_PSC_LAST_F); } void mad_dump_portsamples_result(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_PSR_TAG_F, IB_PSR_LAST_F); } void mad_dump_port_ext_speeds_counters_rsfec_active(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_PESC_RSFEC_FIRST_F, IB_PESC_RSFEC_LAST_F); } void mad_dump_port_ext_speeds_counters(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_PESC_PORT_SELECT_F, IB_PESC_LAST_F); } void mad_dump_perfcounters_port_op_rcv_counters(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_PORT_OP_RCV_COUNTERS_FIRST_F, IB_PC_PORT_OP_RCV_COUNTERS_LAST_F); } void mad_dump_perfcounters_port_flow_ctl_counters(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_PORT_FLOW_CTL_COUNTERS_FIRST_F, IB_PC_PORT_FLOW_CTL_COUNTERS_LAST_F); } void mad_dump_perfcounters_port_vl_op_packet(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_PORT_VL_OP_PACKETS_FIRST_F, IB_PC_PORT_VL_OP_PACKETS_LAST_F); } void mad_dump_perfcounters_port_vl_op_data(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + 
cnt, bufsz - cnt, val, IB_PC_PORT_VL_OP_DATA_FIRST_F, IB_PC_PORT_VL_OP_DATA_LAST_F); } void mad_dump_perfcounters_port_vl_xmit_flow_ctl_update_errors(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS_FIRST_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS_LAST_F); } void mad_dump_perfcounters_port_vl_xmit_wait_counters(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_PORT_VL_XMIT_WAIT_COUNTERS_FIRST_F, IB_PC_PORT_VL_XMIT_WAIT_COUNTERS_LAST_F); } void mad_dump_perfcounters_sw_port_vl_congestion(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_SW_PORT_VL_CONGESTION_FIRST_F, IB_PC_SW_PORT_VL_CONGESTION_LAST_F); } void mad_dump_perfcounters_rcv_con_ctrl(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_RCV_CON_CTRL_FIRST_F, IB_PC_RCV_CON_CTRL_LAST_F); } void mad_dump_perfcounters_sl_rcv_fecn(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_SL_RCV_FECN_FIRST_F, IB_PC_SL_RCV_FECN_LAST_F); } void mad_dump_perfcounters_sl_rcv_becn(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_SL_RCV_BECN_FIRST_F, IB_PC_SL_RCV_BECN_LAST_F); } void mad_dump_perfcounters_xmit_con_ctrl(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_XMIT_CON_CTRL_FIRST_F, IB_PC_XMIT_CON_CTRL_LAST_F); } void mad_dump_perfcounters_vl_xmit_time_cong(char *buf, int bufsz, void *val, int valsz) { int cnt; cnt = _dump_fields(buf, bufsz, val, IB_PC_EXT_PORT_SELECT_F, IB_PC_EXT_XMT_BYTES_F); if (cnt < 0) return; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PC_VL_XMIT_TIME_CONG_FIRST_F, IB_PC_VL_XMIT_TIME_CONG_LAST_F); } void mad_dump_mlnx_ext_port_info(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_MLNX_EXT_PORT_STATE_CHG_ENABLE_F, IB_MLNX_EXT_PORT_LAST_F); } void mad_dump_cc_congestioninfo(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_CONGESTION_INFO_FIRST_F, IB_CC_CONGESTION_INFO_LAST_F); } void mad_dump_cc_congestionkeyinfo(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_CONGESTION_KEY_INFO_FIRST_F, IB_CC_CONGESTION_KEY_INFO_LAST_F); } void mad_dump_cc_congestionlog(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_CONGESTION_LOG_FIRST_F, IB_CC_CONGESTION_LOG_LAST_F); } void mad_dump_cc_congestionlogswitch(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_CONGESTION_LOG_SWITCH_FIRST_F, IB_CC_CONGESTION_LOG_SWITCH_LAST_F); } void 
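/*
 * The congestion-control dump helpers below are all thin wrappers over
 * _dump_fields(); a sketch of how a new one would look (IB_CC_FOO_FIRST_F
 * and IB_CC_FOO_LAST_F are hypothetical placeholders, not real enum
 * values):
 *
 *	void mad_dump_cc_foo(char *buf, int bufsz, void *val, int valsz)
 *	{
 *		_dump_fields(buf, bufsz, val,
 *			     IB_CC_FOO_FIRST_F, IB_CC_FOO_LAST_F);
 *	}
 *
 * The valsz argument is accepted for signature uniformity and ignored.
 */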
mad_dump_cc_congestionlogentryswitch(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_CONGESTION_LOG_ENTRY_SWITCH_FIRST_F, IB_CC_CONGESTION_LOG_ENTRY_SWITCH_LAST_F); } void mad_dump_cc_congestionlogca(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_CONGESTION_LOG_CA_FIRST_F, IB_CC_CONGESTION_LOG_CA_LAST_F); } void mad_dump_cc_congestionlogentryca(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_CONGESTION_LOG_ENTRY_CA_FIRST_F, IB_CC_CONGESTION_LOG_ENTRY_CA_LAST_F); } void mad_dump_cc_switchcongestionsetting(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_SWITCH_CONGESTION_SETTING_FIRST_F, IB_CC_SWITCH_CONGESTION_SETTING_LAST_F); } void mad_dump_cc_switchportcongestionsettingelement(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_FIRST_F, IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_LAST_F); } void mad_dump_cc_cacongestionsetting(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_CA_CONGESTION_SETTING_FIRST_F, IB_CC_CA_CONGESTION_SETTING_LAST_F); } void mad_dump_cc_cacongestionentry(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_CA_CONGESTION_ENTRY_FIRST_F, IB_CC_CA_CONGESTION_ENTRY_LAST_F); } void mad_dump_cc_congestioncontroltable(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_CONGESTION_CONTROL_TABLE_FIRST_F, IB_CC_CONGESTION_CONTROL_TABLE_LAST_F); } void mad_dump_cc_congestioncontroltableentry(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_FIRST_F, IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_LAST_F); } void mad_dump_cc_timestamp(char *buf, int bufsz, void *val, int valsz) { _dump_fields(buf, bufsz, val, IB_CC_TIMESTAMP_FIRST_F, IB_CC_TIMESTAMP_LAST_F); } void mad_dump_classportinfo(char *buf, int bufsz, void *val, int valsz) { /* no FIRST_F and LAST_F for CPI field enums, must do a hack */ _dump_fields(buf, bufsz, val, IB_CPI_BASEVER_F, IB_CPI_TRAP_QKEY_F + 1); } void mad_dump_portinfo_ext(char *buf, int bufsz, void *val, int valsz) { int cnt, n; cnt = _dump_fields(buf, bufsz, val, IB_PORT_EXT_FIRST_F, IB_PORT_EXT_LAST_F); if (cnt < 0) return; n = _dump_fields(buf + cnt, bufsz - cnt, val, IB_PORT_EXT_HDR_FEC_MODE_SUPPORTED_F, IB_PORT_EXT_HDR_FEC_MODE_LAST_F); if (n < 0) return; cnt += n; _dump_fields(buf + cnt, bufsz - cnt, val, IB_PORT_EXT_NDR_FEC_MODE_SUPPORTED_F, IB_PORT_EXT_NDR_FEC_MODE_LAST_F); } void xdump(FILE * file, const char *msg, void *p, int size) { #define HEX(x) ((x) < 10 ? '0' + (x) : 'a' + ((x) -10)) uint8_t *cp = p; int i; if (msg) fputs(msg, file); for (i = 0; i < size;) { fputc(HEX(*cp >> 4), file); fputc(HEX(*cp & 0xf), file); if (++i >= size) break; fputc(HEX(cp[1] >> 4), file); fputc(HEX(cp[1] & 0xf), file); if ((++i) % 16) fputc(' ', file); else fputc('\n', file); cp += 2; } if (i % 16) fputc('\n', file); } rdma-core-56.1/libibmad/fields.c000066400000000000000000001310231477342711600165070ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * Copyright (c) 2009-2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <infiniband/mad.h>

/*
 * BITSOFFS and BE_OFFS are required due to the fact that the bit offsets
 * are inconsistently encoded in the IB spec - IB headers are encoded such
 * that the bit offsets are in big endian convention (BE_OFFS), while the
 * SMI/GSI queries data fields bit offsets are specified using real bit
 * offset (?!). The following macros normalize everything to big endian
 * offsets.
 */
#define BITSOFFS(o, w)	(((o) & ~31) | ((32 - ((o) & 31) - (w)))), (w)
#define BE_OFFS(o, w)	(o), (w)
#define BE_TO_BITSOFFS(o, w)	(((o) & ~31) | ((32 - ((o) & 31) - (w))))

static const ib_field_t ib_mad_f[] = {
	{},			/* IB_NO_FIELD - reserved as invalid */
	{0, 64, "GidPrefix", mad_dump_rhex},
	{64, 64, "GidGuid", mad_dump_rhex},

	/*
	 * MAD: common MAD fields (IB spec 13.4.2)
	 * SMP: Subnet Management packets - lid routed (IB spec 14.2.1.1)
	 * DSMP: Subnet Management packets - direct route (IB spec 14.2.1.2)
	 * SA: Subnet Administration packets (IB spec 15.2.1.1)
	 */

	/* first MAD word (0-3 bytes) */
	{BE_OFFS(0, 7), "MadMethod", mad_dump_hex},	/* TODO: add dumper */
	{BE_OFFS(7, 1), "MadIsResponse", mad_dump_uint},	/* TODO: add dumper */
	{BE_OFFS(8, 8), "MadClassVersion", mad_dump_uint},
	{BE_OFFS(16, 8), "MadMgmtClass", mad_dump_uint},	/* TODO: add dumper */
	{BE_OFFS(24, 8), "MadBaseVersion", mad_dump_uint},

	/* second MAD word (4-7 bytes) */
	{BE_OFFS(48, 16), "MadStatus", mad_dump_hex},	/* TODO: add dumper */

	/* DR SMP only */
	{BE_OFFS(32, 8), "DrSmpHopCnt", mad_dump_uint},
	{BE_OFFS(40, 8), "DrSmpHopPtr", mad_dump_uint},
	{BE_OFFS(48, 15), "DrSmpStatus", mad_dump_hex},	/* TODO: add dumper */
	{BE_OFFS(63, 1), "DrSmpDirection", mad_dump_uint},	/* TODO: add dumper */

	/* words 3,4,5,6 (8-23 bytes) */
	{64, 64, "MadTRID", mad_dump_hex},
	{BE_OFFS(144, 16), "MadAttr", mad_dump_hex},	/* TODO: add dumper */
	{160, 32, "MadModifier", mad_dump_hex},	/* TODO: add dumper */

	/* word 7,8 (24-31 bytes) */
	{192, 64, "MadMkey", mad_dump_hex},

	/* word 9 (32-37 bytes) */
	{BE_OFFS(256, 16), "DrSmpDLID", mad_dump_uint},
	{BE_OFFS(272, 16), "DrSmpSLID", mad_dump_uint},

	/* word 10,11 (36-43 bytes) */
	{288, 64, "SaSMkey", mad_dump_hex},

	/* word 12 (44-47 bytes) */
	{BE_OFFS(46 * 8, 16), "SaAttrOffs", mad_dump_uint},

	/* word 13,14 (48-55 bytes) */
	{48 * 8, 64, "SaCompMask", mad_dump_hex},

	/* word 13,14 (56-255 bytes) */
	{56 * 8, (256 - 56) * 8, "SaData",
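/*
 * Worked example of the BITSOFFS() normalization above (illustrative,
 * using the PortInfo "Lid" entry found further down): BITSOFFS(128, 16)
 * expands to (128 & ~31) | (32 - (128 & 31) - 16) = 128 | 16 = 144, so
 * the table stores bitoffs 144 and bitlen 16 - the field is mirrored to
 * the low half of the big-endian 32-bit word that begins at bit 128.
 */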
mad_dump_hex}, /* bytes 64 - 127 */ {}, /* IB_SM_DATA_F - reserved as invalid */ /* bytes 64 - 256 */ {64 * 8, (256 - 64) * 8, "GsData", mad_dump_hex}, /* bytes 128 - 191 */ {1024, 512, "DrSmpPath", mad_dump_hex}, /* bytes 192 - 255 */ {1536, 512, "DrSmpRetPath", mad_dump_hex}, /* * PortInfo fields */ {0, 64, "Mkey", mad_dump_hex}, {64, 64, "GidPrefix", mad_dump_hex}, {BITSOFFS(128, 16), "Lid", mad_dump_uint}, {BITSOFFS(144, 16), "SMLid", mad_dump_uint}, {160, 32, "CapMask", mad_dump_portcapmask}, {BITSOFFS(192, 16), "DiagCode", mad_dump_hex}, {BITSOFFS(208, 16), "MkeyLeasePeriod", mad_dump_uint}, {BITSOFFS(224, 8), "LocalPort", mad_dump_uint}, {BITSOFFS(232, 8), "LinkWidthEnabled", mad_dump_linkwidthen}, {BITSOFFS(240, 8), "LinkWidthSupported", mad_dump_linkwidthsup}, {BITSOFFS(248, 8), "LinkWidthActive", mad_dump_linkwidth}, {BITSOFFS(256, 4), "LinkSpeedSupported", mad_dump_linkspeedsup}, {BITSOFFS(260, 4), "LinkState", mad_dump_portstate}, {BITSOFFS(264, 4), "PhysLinkState", mad_dump_physportstate}, {BITSOFFS(268, 4), "LinkDownDefState", mad_dump_linkdowndefstate}, {BITSOFFS(272, 2), "ProtectBits", mad_dump_uint}, {BITSOFFS(277, 3), "LMC", mad_dump_uint}, {BITSOFFS(280, 4), "LinkSpeedActive", mad_dump_linkspeed}, {BITSOFFS(284, 4), "LinkSpeedEnabled", mad_dump_linkspeeden}, {BITSOFFS(288, 4), "NeighborMTU", mad_dump_mtu}, {BITSOFFS(292, 4), "SMSL", mad_dump_uint}, {BITSOFFS(296, 4), "VLCap", mad_dump_vlcap}, {BITSOFFS(300, 4), "InitType", mad_dump_hex}, {BITSOFFS(304, 8), "VLHighLimit", mad_dump_uint}, {BITSOFFS(312, 8), "VLArbHighCap", mad_dump_uint}, {BITSOFFS(320, 8), "VLArbLowCap", mad_dump_uint}, {BITSOFFS(328, 4), "InitReply", mad_dump_hex}, {BITSOFFS(332, 4), "MtuCap", mad_dump_mtu}, {BITSOFFS(336, 3), "VLStallCount", mad_dump_uint}, {BITSOFFS(339, 5), "HoqLife", mad_dump_uint}, {BITSOFFS(344, 4), "OperVLs", mad_dump_opervls}, {BITSOFFS(348, 1), "PartEnforceInb", mad_dump_uint}, {BITSOFFS(349, 1), "PartEnforceOutb", mad_dump_uint}, {BITSOFFS(350, 1), "FilterRawInb", mad_dump_uint}, {BITSOFFS(351, 1), "FilterRawOutb", mad_dump_uint}, {BITSOFFS(352, 16), "MkeyViolations", mad_dump_uint}, {BITSOFFS(368, 16), "PkeyViolations", mad_dump_uint}, {BITSOFFS(384, 16), "QkeyViolations", mad_dump_uint}, {BITSOFFS(400, 8), "GuidCap", mad_dump_uint}, {BITSOFFS(408, 1), "ClientReregister", mad_dump_uint}, {BITSOFFS(409, 2), "McastPkeyTrapSuppressionEnabled", mad_dump_uint}, {BITSOFFS(411, 5), "SubnetTimeout", mad_dump_uint}, {BITSOFFS(419, 5), "RespTimeVal", mad_dump_uint}, {BITSOFFS(424, 4), "LocalPhysErr", mad_dump_uint}, {BITSOFFS(428, 4), "OverrunErr", mad_dump_uint}, {BITSOFFS(432, 16), "MaxCreditHint", mad_dump_uint}, {BITSOFFS(456, 24), "RoundTrip", mad_dump_uint}, {}, /* IB_PORT_LAST_F */ /* * NodeInfo fields */ {BITSOFFS(0, 8), "BaseVers", mad_dump_uint}, {BITSOFFS(8, 8), "ClassVers", mad_dump_uint}, {BITSOFFS(16, 8), "NodeType", mad_dump_node_type}, {BITSOFFS(24, 8), "NumPorts", mad_dump_uint}, {32, 64, "SystemGuid", mad_dump_hex}, {96, 64, "Guid", mad_dump_hex}, {160, 64, "PortGuid", mad_dump_hex}, {BITSOFFS(224, 16), "PartCap", mad_dump_uint}, {BITSOFFS(240, 16), "DevId", mad_dump_hex}, {256, 32, "Revision", mad_dump_hex}, {BITSOFFS(288, 8), "LocalPort", mad_dump_uint}, {BITSOFFS(296, 24), "VendorId", mad_dump_hex}, {}, /* IB_NODE_LAST_F */ /* * SwitchInfo fields */ {BITSOFFS(0, 16), "LinearFdbCap", mad_dump_uint}, {BITSOFFS(16, 16), "RandomFdbCap", mad_dump_uint}, {BITSOFFS(32, 16), "McastFdbCap", mad_dump_uint}, {BITSOFFS(48, 16), "LinearFdbTop", mad_dump_uint}, {BITSOFFS(64, 8), 
"DefPort", mad_dump_uint}, {BITSOFFS(72, 8), "DefMcastPrimPort", mad_dump_uint}, {BITSOFFS(80, 8), "DefMcastNotPrimPort", mad_dump_uint}, {BITSOFFS(88, 5), "LifeTime", mad_dump_uint}, {BITSOFFS(93, 1), "StateChange", mad_dump_uint}, {BITSOFFS(94, 2), "OptSLtoVLMapping", mad_dump_uint}, {BITSOFFS(96, 16), "LidsPerPort", mad_dump_uint}, {BITSOFFS(112, 16), "PartEnforceCap", mad_dump_uint}, {BITSOFFS(128, 1), "InboundPartEnf", mad_dump_uint}, {BITSOFFS(129, 1), "OutboundPartEnf", mad_dump_uint}, {BITSOFFS(130, 1), "FilterRawInbound", mad_dump_uint}, {BITSOFFS(131, 1), "FilterRawOutbound", mad_dump_uint}, {BITSOFFS(132, 1), "EnhancedPort0", mad_dump_uint}, {BITSOFFS(144, 16), "MulticastFDBTop", mad_dump_hex}, {}, /* IB_SW_LAST_F */ /* * SwitchLinearForwardingTable fields */ {0, 512, "LinearForwTbl", mad_dump_array}, /* * SwitchMulticastForwardingTable fields */ {0, 512, "MulticastForwTbl", mad_dump_array}, /* * NodeDescription fields */ {0, 64 * 8, "NodeDesc", mad_dump_string}, /* * Notice/Trap fields */ {BITSOFFS(0, 1), "NoticeIsGeneric", mad_dump_uint}, {BITSOFFS(1, 7), "NoticeType", mad_dump_uint}, {BITSOFFS(8, 24), "NoticeProducerType", mad_dump_node_type}, {BITSOFFS(32, 16), "NoticeTrapNumber", mad_dump_uint}, {BITSOFFS(48, 16), "NoticeIssuerLID", mad_dump_uint}, {BITSOFFS(64, 1), "NoticeToggle", mad_dump_uint}, {BITSOFFS(65, 15), "NoticeCount", mad_dump_uint}, {80, 432, "NoticeDataDetails", mad_dump_array}, {BITSOFFS(80, 16), "NoticeDataLID", mad_dump_uint}, {BITSOFFS(96, 16), "NoticeDataTrap144LID", mad_dump_uint}, {BITSOFFS(128, 32), "NoticeDataTrap144CapMask", mad_dump_uint}, /* * Port counters */ {BITSOFFS(8, 8), "PortSelect", mad_dump_uint}, {BITSOFFS(16, 16), "CounterSelect", mad_dump_hex}, {BITSOFFS(32, 16), "SymbolErrorCounter", mad_dump_uint}, {BITSOFFS(48, 8), "LinkErrorRecoveryCounter", mad_dump_uint}, {BITSOFFS(56, 8), "LinkDownedCounter", mad_dump_uint}, {BITSOFFS(64, 16), "PortRcvErrors", mad_dump_uint}, {BITSOFFS(80, 16), "PortRcvRemotePhysicalErrors", mad_dump_uint}, {BITSOFFS(96, 16), "PortRcvSwitchRelayErrors", mad_dump_uint}, {BITSOFFS(112, 16), "PortXmitDiscards", mad_dump_uint}, {BITSOFFS(128, 8), "PortXmitConstraintErrors", mad_dump_uint}, {BITSOFFS(136, 8), "PortRcvConstraintErrors", mad_dump_uint}, {BITSOFFS(144, 8), "CounterSelect2", mad_dump_hex}, {BITSOFFS(152, 4), "LocalLinkIntegrityErrors", mad_dump_uint}, {BITSOFFS(156, 4), "ExcessiveBufferOverrunErrors", mad_dump_uint}, {BITSOFFS(176, 16), "VL15Dropped", mad_dump_uint}, {192, 32, "PortXmitData", mad_dump_uint}, {224, 32, "PortRcvData", mad_dump_uint}, {256, 32, "PortXmitPkts", mad_dump_uint}, {288, 32, "PortRcvPkts", mad_dump_uint}, {320, 32, "PortXmitWait", mad_dump_uint}, {}, /* IB_PC_LAST_F */ /* * SMInfo */ {0, 64, "SmInfoGuid", mad_dump_hex}, {64, 64, "SmInfoKey", mad_dump_hex}, {128, 32, "SmActivity", mad_dump_uint}, {BITSOFFS(160, 4), "SmPriority", mad_dump_uint}, {BITSOFFS(164, 4), "SmState", mad_dump_uint}, /* * SA RMPP */ {BE_OFFS(24 * 8 + 24, 8), "RmppVers", mad_dump_uint}, {BE_OFFS(24 * 8 + 16, 8), "RmppType", mad_dump_uint}, {BE_OFFS(24 * 8 + 11, 5), "RmppResp", mad_dump_uint}, {BE_OFFS(24 * 8 + 8, 3), "RmppFlags", mad_dump_hex}, {BE_OFFS(24 * 8 + 0, 8), "RmppStatus", mad_dump_hex}, /* data1 */ {28 * 8, 32, "RmppData1", mad_dump_hex}, {28 * 8, 32, "RmppSegNum", mad_dump_uint}, /* data2 */ {32 * 8, 32, "RmppData2", mad_dump_hex}, {32 * 8, 32, "RmppPayload", mad_dump_uint}, {32 * 8, 32, "RmppNewWin", mad_dump_uint}, /* * SA Get Multi Path */ {BITSOFFS(41, 7), "MultiPathNumPath", mad_dump_uint}, 
{BITSOFFS(120, 8), "MultiPathNumSrc", mad_dump_uint}, {BITSOFFS(128, 8), "MultiPathNumDest", mad_dump_uint}, {192, 128, "MultiPathGid", mad_dump_array}, /* * SA Path rec */ {64, 128, "PathRecDGid", mad_dump_array}, {192, 128, "PathRecSGid", mad_dump_array}, {BITSOFFS(320, 16), "PathRecDLid", mad_dump_uint}, {BITSOFFS(336, 16), "PathRecSLid", mad_dump_uint}, {BITSOFFS(393, 7), "PathRecNumPath", mad_dump_uint}, {BITSOFFS(428, 4), "PathRecSL", mad_dump_uint}, /* * MC Member rec */ {0, 128, "McastMemMGid", mad_dump_array}, {128, 128, "McastMemPortGid", mad_dump_array}, {256, 32, "McastMemQkey", mad_dump_hex}, {BITSOFFS(288, 16), "McastMemMLid", mad_dump_hex}, {BITSOFFS(352, 4), "McastMemSL", mad_dump_uint}, {BITSOFFS(306, 6), "McastMemMTU", mad_dump_uint}, {BITSOFFS(338, 6), "McastMemRate", mad_dump_uint}, {BITSOFFS(312, 8), "McastMemTClass", mad_dump_uint}, {BITSOFFS(320, 16), "McastMemPkey", mad_dump_uint}, {BITSOFFS(356, 20), "McastMemFlowLbl", mad_dump_uint}, {BITSOFFS(388, 4), "McastMemJoinState", mad_dump_uint}, {BITSOFFS(392, 1), "McastMemProxyJoin", mad_dump_uint}, /* * Service record */ {0, 64, "ServRecID", mad_dump_hex}, {64, 128, "ServRecGid", mad_dump_array}, {BITSOFFS(192, 16), "ServRecPkey", mad_dump_hex}, {224, 32, "ServRecLease", mad_dump_hex}, {256, 128, "ServRecKey", mad_dump_hex}, {384, 512, "ServRecName", mad_dump_string}, {896, 512, "ServRecData", mad_dump_array}, /* ATS for example */ /* * ATS SM record - within SA_SR_DATA */ {12 * 8, 32, "ATSNodeAddr", mad_dump_hex}, {BITSOFFS(16 * 8, 16), "ATSMagicKey", mad_dump_hex}, {BITSOFFS(18 * 8, 16), "ATSNodeType", mad_dump_hex}, {32 * 8, 32 * 8, "ATSNodeName", mad_dump_string}, /* * SLTOVL MAPPING TABLE */ {0, 64, "SLToVLMap", mad_dump_hex}, /* * VL ARBITRATION TABLE */ {0, 512, "VLArbTbl", mad_dump_array}, /* * IB vendor classes range 2 */ {BE_OFFS(36 * 8, 24), "OUI", mad_dump_array}, {40 * 8, (256 - 40) * 8, "Vendor2Data", mad_dump_array}, /* * Extended port counters */ {BITSOFFS(8, 8), "PortSelect", mad_dump_uint}, {BITSOFFS(16, 16), "CounterSelect", mad_dump_hex}, {64, 64, "PortXmitData", mad_dump_uint}, {128, 64, "PortRcvData", mad_dump_uint}, {192, 64, "PortXmitPkts", mad_dump_uint}, {256, 64, "PortRcvPkts", mad_dump_uint}, {320, 64, "PortUnicastXmitPkts", mad_dump_uint}, {384, 64, "PortUnicastRcvPkts", mad_dump_uint}, {448, 64, "PortMulticastXmitPkts", mad_dump_uint}, {512, 64, "PortMulticastRcvPkts", mad_dump_uint}, {}, /* IB_PC_EXT_LAST_F */ /* * GUIDInfo fields */ {0, 64, "GUID0", mad_dump_hex}, /* * ClassPortInfo fields */ {BITSOFFS(0, 8), "BaseVersion", mad_dump_uint}, {BITSOFFS(8, 8), "ClassVersion", mad_dump_uint}, {BITSOFFS(16, 16), "CapabilityMask", mad_dump_hex}, {BITSOFFS(32, 27), "CapabilityMask2", mad_dump_hex}, {BITSOFFS(59, 5), "RespTimeVal", mad_dump_uint}, {64, 128, "RedirectGID", mad_dump_array}, {BITSOFFS(192, 8), "RedirectTC", mad_dump_hex}, {BITSOFFS(200, 4), "RedirectSL", mad_dump_uint}, {BITSOFFS(204, 20), "RedirectFL", mad_dump_hex}, {BITSOFFS(224, 16), "RedirectLID", mad_dump_uint}, {BITSOFFS(240, 16), "RedirectPKey", mad_dump_hex}, {BITSOFFS(264, 24), "RedirectQP", mad_dump_hex}, {288, 32, "RedirectQKey", mad_dump_hex}, {320, 128, "TrapGID", mad_dump_array}, {BITSOFFS(448, 8), "TrapTC", mad_dump_hex}, {BITSOFFS(456, 4), "TrapSL", mad_dump_uint}, {BITSOFFS(460, 20), "TrapFL", mad_dump_hex}, {BITSOFFS(480, 16), "TrapLID", mad_dump_uint}, {BITSOFFS(496, 16), "TrapPKey", mad_dump_hex}, {BITSOFFS(512, 8), "TrapHL", mad_dump_uint}, {BITSOFFS(520, 24), "TrapQP", mad_dump_hex}, {544, 32, "TrapQKey", 
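/*
 * (The ClassPortInfo fields above have no IB_CPI_..._FIRST_F/_LAST_F
 * sentinels in enum MAD_FIELDS; mad_dump_classportinfo() in dump.c
 * therefore passes IB_CPI_TRAP_QKEY_F + 1 as the exclusive upper bound
 * when it walks this block.)
 */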
mad_dump_hex}, /* * PortXmitDataSL fields */ {32, 32, "XmtDataSL0", mad_dump_uint}, {64, 32, "XmtDataSL1", mad_dump_uint}, {96, 32, "XmtDataSL2", mad_dump_uint}, {128, 32, "XmtDataSL3", mad_dump_uint}, {160, 32, "XmtDataSL4", mad_dump_uint}, {192, 32, "XmtDataSL5", mad_dump_uint}, {224, 32, "XmtDataSL6", mad_dump_uint}, {256, 32, "XmtDataSL7", mad_dump_uint}, {288, 32, "XmtDataSL8", mad_dump_uint}, {320, 32, "XmtDataSL9", mad_dump_uint}, {352, 32, "XmtDataSL10", mad_dump_uint}, {384, 32, "XmtDataSL11", mad_dump_uint}, {416, 32, "XmtDataSL12", mad_dump_uint}, {448, 32, "XmtDataSL13", mad_dump_uint}, {480, 32, "XmtDataSL14", mad_dump_uint}, {512, 32, "XmtDataSL15", mad_dump_uint}, {}, /* IB_PC_XMT_DATA_SL_LAST_F */ /* * PortRcvDataSL fields */ {32, 32, "RcvDataSL0", mad_dump_uint}, {64, 32, "RcvDataSL1", mad_dump_uint}, {96, 32, "RcvDataSL2", mad_dump_uint}, {128, 32, "RcvDataSL3", mad_dump_uint}, {160, 32, "RcvDataSL4", mad_dump_uint}, {192, 32, "RcvDataSL5", mad_dump_uint}, {224, 32, "RcvDataSL6", mad_dump_uint}, {256, 32, "RcvDataSL7", mad_dump_uint}, {288, 32, "RcvDataSL8", mad_dump_uint}, {320, 32, "RcvDataSL9", mad_dump_uint}, {352, 32, "RcvDataSL10", mad_dump_uint}, {384, 32, "RcvDataSL11", mad_dump_uint}, {416, 32, "RcvDataSL12", mad_dump_uint}, {448, 32, "RcvDataSL13", mad_dump_uint}, {480, 32, "RcvDataSL14", mad_dump_uint}, {512, 32, "RcvDataSL15", mad_dump_uint}, {}, /* IB_PC_RCV_DATA_SL_LAST_F */ /* * PortXmitDiscardDetails fields */ {BITSOFFS(32, 16), "PortInactiveDiscards", mad_dump_uint}, {BITSOFFS(48, 16), "PortNeighborMTUDiscards", mad_dump_uint}, {BITSOFFS(64, 16), "PortSwLifetimeLimitDiscards", mad_dump_uint}, {BITSOFFS(80, 16), "PortSwHOQLifetimeLimitDiscards", mad_dump_uint}, {}, /* IB_PC_XMT_DISC_LAST_F */ /* * PortRcvErrorDetails fields */ {BITSOFFS(32, 16), "PortLocalPhysicalErrors", mad_dump_uint}, {BITSOFFS(48, 16), "PortMalformedPktErrors", mad_dump_uint}, {BITSOFFS(64, 16), "PortBufferOverrunErrors", mad_dump_uint}, {BITSOFFS(80, 16), "PortDLIDMappingErrors", mad_dump_uint}, {BITSOFFS(96, 16), "PortVLMappingErrors", mad_dump_uint}, {BITSOFFS(112, 16), "PortLoopingErrors", mad_dump_uint}, {}, /* IB_PC_RCV_ERR_LAST_F */ /* * PortSamplesControl fields */ {BITSOFFS(0, 8), "OpCode", mad_dump_hex}, {BITSOFFS(8, 8), "PortSelect", mad_dump_uint}, {BITSOFFS(16, 8), "Tick", mad_dump_hex}, {BITSOFFS(29, 3), "CounterWidth", mad_dump_uint}, {BITSOFFS(34, 3), "CounterMask0", mad_dump_hex}, {BITSOFFS(37, 27), "CounterMasks1to9", mad_dump_hex}, {BITSOFFS(65, 15), "CounterMasks10to14", mad_dump_hex}, {BITSOFFS(80, 8), "SampleMechanisms", mad_dump_uint}, {BITSOFFS(94, 2), "SampleStatus", mad_dump_uint}, {96, 64, "OptionMask", mad_dump_hex}, {160, 64, "VendorMask", mad_dump_hex}, {224, 32, "SampleStart", mad_dump_uint}, {256, 32, "SampleInterval", mad_dump_uint}, {BITSOFFS(288, 16), "Tag", mad_dump_hex}, {BITSOFFS(304, 16), "CounterSelect0", mad_dump_hex}, {BITSOFFS(320, 16), "CounterSelect1", mad_dump_hex}, {BITSOFFS(336, 16), "CounterSelect2", mad_dump_hex}, {BITSOFFS(352, 16), "CounterSelect3", mad_dump_hex}, {BITSOFFS(368, 16), "CounterSelect4", mad_dump_hex}, {BITSOFFS(384, 16), "CounterSelect5", mad_dump_hex}, {BITSOFFS(400, 16), "CounterSelect6", mad_dump_hex}, {BITSOFFS(416, 16), "CounterSelect7", mad_dump_hex}, {BITSOFFS(432, 16), "CounterSelect8", mad_dump_hex}, {BITSOFFS(448, 16), "CounterSelect9", mad_dump_hex}, {BITSOFFS(464, 16), "CounterSelect10", mad_dump_hex}, {BITSOFFS(480, 16), "CounterSelect11", mad_dump_hex}, {BITSOFFS(496, 16), "CounterSelect12", mad_dump_hex}, 
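/*
 * Usage sketch (illustrative only; the lid variable is assumed to hold
 * a value already decoded from a MAD buffer): any entry in this table
 * can be rendered through the public wrappers at the end of this file,
 * e.g.
 *
 *	char line[256];
 *	uint32_t lid = 1;
 *	if (mad_dump_field(IB_PORT_LID_F, line, sizeof(line), &lid))
 *		puts(line);
 *
 * mad_dump_field() rejects out-of-range fields, requires bufsz > 32,
 * and follows the field name with a dotted leader before the value.
 */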
{BITSOFFS(512, 16), "CounterSelect13", mad_dump_hex}, {BITSOFFS(528, 16), "CounterSelect14", mad_dump_hex}, {576, 64, "SamplesOnlyOptionMask", mad_dump_hex}, {}, /* IB_PSC_LAST_F */ /* GUIDInfo fields */ {0, 64, "GUID0", mad_dump_hex}, {64, 64, "GUID1", mad_dump_hex}, {128, 64, "GUID2", mad_dump_hex}, {192, 64, "GUID3", mad_dump_hex}, {256, 64, "GUID4", mad_dump_hex}, {320, 64, "GUID5", mad_dump_hex}, {384, 64, "GUID6", mad_dump_hex}, {448, 64, "GUID7", mad_dump_hex}, /* GUID Info Record */ {BITSOFFS(0, 16), "Lid", mad_dump_uint}, {BITSOFFS(16, 8), "BlockNum", mad_dump_uint}, {64, 64, "Guid0", mad_dump_hex}, {128, 64, "Guid1", mad_dump_hex}, {192, 64, "Guid2", mad_dump_hex}, {256, 64, "Guid3", mad_dump_hex}, {320, 64, "Guid4", mad_dump_hex}, {384, 64, "Guid5", mad_dump_hex}, {448, 64, "Guid6", mad_dump_hex}, {512, 64, "Guid7", mad_dump_hex}, /* * More PortInfo fields */ {BITSOFFS(480, 16), "CapabilityMask2", mad_dump_portcapmask2}, {BITSOFFS(496, 4), "LinkSpeedExtActive", mad_dump_linkspeedext}, {BITSOFFS(500, 4), "LinkSpeedExtSupported", mad_dump_linkspeedextsup}, {BITSOFFS(507, 5), "LinkSpeedExtEnabled", mad_dump_linkspeedexten}, {}, /* IB_PORT_LINK_SPEED_EXT_LAST_F */ /* * PortExtendedSpeedsCounters fields */ {BITSOFFS(8, 8), "PortSelect", mad_dump_uint}, {64, 64, "CounterSelect", mad_dump_hex}, {BITSOFFS(128, 16), "SyncHeaderErrorCounter", mad_dump_uint}, {BITSOFFS(144, 16), "UnknownBlockCounter", mad_dump_uint}, {BITSOFFS(160, 16), "ErrorDetectionCounterLane0", mad_dump_uint}, {BITSOFFS(176, 16), "ErrorDetectionCounterLane1", mad_dump_uint}, {BITSOFFS(192, 16), "ErrorDetectionCounterLane2", mad_dump_uint}, {BITSOFFS(208, 16), "ErrorDetectionCounterLane3", mad_dump_uint}, {BITSOFFS(224, 16), "ErrorDetectionCounterLane4", mad_dump_uint}, {BITSOFFS(240, 16), "ErrorDetectionCounterLane5", mad_dump_uint}, {BITSOFFS(256, 16), "ErrorDetectionCounterLane6", mad_dump_uint}, {BITSOFFS(272, 16), "ErrorDetectionCounterLane7", mad_dump_uint}, {BITSOFFS(288, 16), "ErrorDetectionCounterLane8", mad_dump_uint}, {BITSOFFS(304, 16), "ErrorDetectionCounterLane9", mad_dump_uint}, {BITSOFFS(320, 16), "ErrorDetectionCounterLane10", mad_dump_uint}, {BITSOFFS(336, 16), "ErrorDetectionCounterLane11", mad_dump_uint}, {352, 32, "FECCorrectableBlockCtrLane0", mad_dump_uint}, {384, 32, "FECCorrectableBlockCtrLane1", mad_dump_uint}, {416, 32, "FECCorrectableBlockCtrLane2", mad_dump_uint}, {448, 32, "FECCorrectableBlockCtrLane3", mad_dump_uint}, {480, 32, "FECCorrectableBlockCtrLane4", mad_dump_uint}, {512, 32, "FECCorrectableBlockCtrLane5", mad_dump_uint}, {544, 32, "FECCorrectableBlockCtrLane6", mad_dump_uint}, {576, 32, "FECCorrectableBlockCtrLane7", mad_dump_uint}, {608, 32, "FECCorrectableBlockCtrLane8", mad_dump_uint}, {640, 32, "FECCorrectableBlockCtrLane9", mad_dump_uint}, {672, 32, "FECCorrectableBlockCtrLane10", mad_dump_uint}, {704, 32, "FECCorrectableBlockCtrLane11", mad_dump_uint}, {736, 32, "FECUncorrectableBlockCtrLane0", mad_dump_uint}, {768, 32, "FECUncorrectableBlockCtrLane1", mad_dump_uint}, {800, 32, "FECUncorrectableBlockCtrLane2", mad_dump_uint}, {832, 32, "FECUncorrectableBlockCtrLane3", mad_dump_uint}, {864, 32, "FECUncorrectableBlockCtrLane4", mad_dump_uint}, {896, 32, "FECUncorrectableBlockCtrLane5", mad_dump_uint}, {928, 32, "FECUncorrectableBlockCtrLane6", mad_dump_uint}, {960, 32, "FECUncorrectableBlockCtrLane7", mad_dump_uint}, {992, 32, "FECUncorrectableBlockCtrLane8", mad_dump_uint}, {1024, 32, "FECUncorrectableBlockCtrLane9", mad_dump_uint}, {1056, 32, 
"FECUncorrectableBlockCtrLane10", mad_dump_uint}, {1088, 32, "FECUncorrectableBlockCtrLane11", mad_dump_uint}, {}, /* IB_PESC_LAST_F */ /* * PortOpRcvCounters fields */ {32, 32, "PortOpRcvPkts", mad_dump_uint}, {64, 32, "PortOpRcvData", mad_dump_uint}, {}, /* IB_PC_PORT_OP_RCV_COUNTERS_LAST_F */ /* * PortFlowCtlCounters fields */ {32, 32, "PortXmitFlowPkts", mad_dump_uint}, {64, 32, "PortRcvFlowPkts", mad_dump_uint}, {}, /* IB_PC_PORT_FLOW_CTL_COUNTERS_LAST_F */ /* * PortVLOpPackets fields */ {BITSOFFS(32, 16), "PortVLOpPackets0", mad_dump_uint}, {BITSOFFS(48, 16), "PortVLOpPackets1", mad_dump_uint}, {BITSOFFS(64, 16), "PortVLOpPackets2", mad_dump_uint}, {BITSOFFS(80, 16), "PortVLOpPackets3", mad_dump_uint}, {BITSOFFS(96, 16), "PortVLOpPackets4", mad_dump_uint}, {BITSOFFS(112, 16), "PortVLOpPackets5", mad_dump_uint}, {BITSOFFS(128, 16), "PortVLOpPackets6", mad_dump_uint}, {BITSOFFS(144, 16), "PortVLOpPackets7", mad_dump_uint}, {BITSOFFS(160, 16), "PortVLOpPackets8", mad_dump_uint}, {BITSOFFS(176, 16), "PortVLOpPackets9", mad_dump_uint}, {BITSOFFS(192, 16), "PortVLOpPackets10", mad_dump_uint}, {BITSOFFS(208, 16), "PortVLOpPackets11", mad_dump_uint}, {BITSOFFS(224, 16), "PortVLOpPackets12", mad_dump_uint}, {BITSOFFS(240, 16), "PortVLOpPackets13", mad_dump_uint}, {BITSOFFS(256, 16), "PortVLOpPackets14", mad_dump_uint}, {BITSOFFS(272, 16), "PortVLOpPackets15", mad_dump_uint}, {}, /* IB_PC_PORT_VL_OP_PACKETS_LAST_F */ /* * PortVLOpData fields */ {32, 32, "PortVLOpData0", mad_dump_uint}, {64, 32, "PortVLOpData1", mad_dump_uint}, {96, 32, "PortVLOpData2", mad_dump_uint}, {128, 32, "PortVLOpData3", mad_dump_uint}, {160, 32, "PortVLOpData4", mad_dump_uint}, {192, 32, "PortVLOpData5", mad_dump_uint}, {224, 32, "PortVLOpData6", mad_dump_uint}, {256, 32, "PortVLOpData7", mad_dump_uint}, {288, 32, "PortVLOpData8", mad_dump_uint}, {320, 32, "PortVLOpData9", mad_dump_uint}, {352, 32, "PortVLOpData10", mad_dump_uint}, {384, 32, "PortVLOpData11", mad_dump_uint}, {416, 32, "PortVLOpData12", mad_dump_uint}, {448, 32, "PortVLOpData13", mad_dump_uint}, {480, 32, "PortVLOpData14", mad_dump_uint}, {512, 32, "PortVLOpData15", mad_dump_uint}, {}, /* IB_PC_PORT_VL_OP_DATA_LAST_F */ /* * PortVLXmitFlowCtlUpdateErrors fields */ {BITSOFFS(32, 2), "PortVLXmitFlowCtlUpdateErrors0", mad_dump_uint}, {BITSOFFS(34, 2), "PortVLXmitFlowCtlUpdateErrors1", mad_dump_uint}, {BITSOFFS(36, 2), "PortVLXmitFlowCtlUpdateErrors2", mad_dump_uint}, {BITSOFFS(38, 2), "PortVLXmitFlowCtlUpdateErrors3", mad_dump_uint}, {BITSOFFS(40, 2), "PortVLXmitFlowCtlUpdateErrors4", mad_dump_uint}, {BITSOFFS(42, 2), "PortVLXmitFlowCtlUpdateErrors5", mad_dump_uint}, {BITSOFFS(44, 2), "PortVLXmitFlowCtlUpdateErrors6", mad_dump_uint}, {BITSOFFS(46, 2), "PortVLXmitFlowCtlUpdateErrors7", mad_dump_uint}, {BITSOFFS(48, 2), "PortVLXmitFlowCtlUpdateErrors8", mad_dump_uint}, {BITSOFFS(50, 2), "PortVLXmitFlowCtlUpdateErrors9", mad_dump_uint}, {BITSOFFS(52, 2), "PortVLXmitFlowCtlUpdateErrors10", mad_dump_uint}, {BITSOFFS(54, 2), "PortVLXmitFlowCtlUpdateErrors11", mad_dump_uint}, {BITSOFFS(56, 2), "PortVLXmitFlowCtlUpdateErrors12", mad_dump_uint}, {BITSOFFS(58, 2), "PortVLXmitFlowCtlUpdateErrors13", mad_dump_uint}, {BITSOFFS(60, 2), "PortVLXmitFlowCtlUpdateErrors14", mad_dump_uint}, {BITSOFFS(62, 2), "PortVLXmitFlowCtlUpdateErrors15", mad_dump_uint}, {}, /* IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS_LAST_F */ /* * PortVLXmitWaitCounters fields */ {BITSOFFS(32, 16), "PortVLXmitWait0", mad_dump_uint}, {BITSOFFS(48, 16), "PortVLXmitWait1", mad_dump_uint}, {BITSOFFS(64, 
16), "PortVLXmitWait2", mad_dump_uint}, {BITSOFFS(80, 16), "PortVLXmitWait3", mad_dump_uint}, {BITSOFFS(96, 16), "PortVLXmitWait4", mad_dump_uint}, {BITSOFFS(112, 16), "PortVLXmitWait5", mad_dump_uint}, {BITSOFFS(128, 16), "PortVLXmitWait6", mad_dump_uint}, {BITSOFFS(144, 16), "PortVLXmitWait7", mad_dump_uint}, {BITSOFFS(160, 16), "PortVLXmitWait8", mad_dump_uint}, {BITSOFFS(176, 16), "PortVLXmitWait9", mad_dump_uint}, {BITSOFFS(192, 16), "PortVLXmitWait10", mad_dump_uint}, {BITSOFFS(208, 16), "PortVLXmitWait11", mad_dump_uint}, {BITSOFFS(224, 16), "PortVLXmitWait12", mad_dump_uint}, {BITSOFFS(240, 16), "PortVLXmitWait13", mad_dump_uint}, {BITSOFFS(256, 16), "PortVLXmitWait14", mad_dump_uint}, {BITSOFFS(272, 16), "PortVLXmitWait15", mad_dump_uint}, {}, /* IB_PC_PORT_VL_XMIT_WAIT_COUNTERS_LAST_F */ /* * SwPortVLCongestion fields */ {BITSOFFS(32, 16), "SWPortVLCongestion0", mad_dump_uint}, {BITSOFFS(48, 16), "SWPortVLCongestion1", mad_dump_uint}, {BITSOFFS(64, 16), "SWPortVLCongestion2", mad_dump_uint}, {BITSOFFS(80, 16), "SWPortVLCongestion3", mad_dump_uint}, {BITSOFFS(96, 16), "SWPortVLCongestion4", mad_dump_uint}, {BITSOFFS(112, 16), "SWPortVLCongestion5", mad_dump_uint}, {BITSOFFS(128, 16), "SWPortVLCongestion6", mad_dump_uint}, {BITSOFFS(144, 16), "SWPortVLCongestion7", mad_dump_uint}, {BITSOFFS(160, 16), "SWPortVLCongestion8", mad_dump_uint}, {BITSOFFS(176, 16), "SWPortVLCongestion9", mad_dump_uint}, {BITSOFFS(192, 16), "SWPortVLCongestion10", mad_dump_uint}, {BITSOFFS(208, 16), "SWPortVLCongestion11", mad_dump_uint}, {BITSOFFS(224, 16), "SWPortVLCongestion12", mad_dump_uint}, {BITSOFFS(240, 16), "SWPortVLCongestion13", mad_dump_uint}, {BITSOFFS(256, 16), "SWPortVLCongestion14", mad_dump_uint}, {BITSOFFS(272, 16), "SWPortVLCongestion15", mad_dump_uint}, {}, /* IB_PC_SW_PORT_VL_CONGESTION_LAST_F */ /* * PortRcvConCtrl fields */ {32, 32, "PortPktRcvFECN", mad_dump_uint}, {64, 32, "PortPktRcvBECN", mad_dump_uint}, {}, /* IB_PC_RCV_CON_CTRL_LAST_F */ /* * PortSLRcvFECN fields */ {32, 32, "PortSLRcvFECN0", mad_dump_uint}, {64, 32, "PortSLRcvFECN1", mad_dump_uint}, {96, 32, "PortSLRcvFECN2", mad_dump_uint}, {128, 32, "PortSLRcvFECN3", mad_dump_uint}, {160, 32, "PortSLRcvFECN4", mad_dump_uint}, {192, 32, "PortSLRcvFECN5", mad_dump_uint}, {224, 32, "PortSLRcvFECN6", mad_dump_uint}, {256, 32, "PortSLRcvFECN7", mad_dump_uint}, {288, 32, "PortSLRcvFECN8", mad_dump_uint}, {320, 32, "PortSLRcvFECN9", mad_dump_uint}, {352, 32, "PortSLRcvFECN10", mad_dump_uint}, {384, 32, "PortSLRcvFECN11", mad_dump_uint}, {416, 32, "PortSLRcvFECN12", mad_dump_uint}, {448, 32, "PortSLRcvFECN13", mad_dump_uint}, {480, 32, "PortSLRcvFECN14", mad_dump_uint}, {512, 32, "PortSLRcvFECN15", mad_dump_uint}, {}, /* IB_PC_SL_RCV_FECN_LAST_F */ /* * PortSLRcvBECN fields */ {32, 32, "PortSLRcvBECN0", mad_dump_uint}, {64, 32, "PortSLRcvBECN1", mad_dump_uint}, {96, 32, "PortSLRcvBECN2", mad_dump_uint}, {128, 32, "PortSLRcvBECN3", mad_dump_uint}, {160, 32, "PortSLRcvBECN4", mad_dump_uint}, {192, 32, "PortSLRcvBECN5", mad_dump_uint}, {224, 32, "PortSLRcvBECN6", mad_dump_uint}, {256, 32, "PortSLRcvBECN7", mad_dump_uint}, {288, 32, "PortSLRcvBECN8", mad_dump_uint}, {320, 32, "PortSLRcvBECN9", mad_dump_uint}, {352, 32, "PortSLRcvBECN10", mad_dump_uint}, {384, 32, "PortSLRcvBECN11", mad_dump_uint}, {416, 32, "PortSLRcvBECN12", mad_dump_uint}, {448, 32, "PortSLRcvBECN13", mad_dump_uint}, {480, 32, "PortSLRcvBECN14", mad_dump_uint}, {512, 32, "PortSLRcvBECN15", mad_dump_uint}, {}, /* IB_PC_SL_RCV_BECN_LAST_F */ /* * PortXmitConCtrl fields 
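 *
 * (Like the other congestion groups in this table, these fields are
 * printed by a dump.c helper that first emits the shared extended-port
 * header, IB_PC_EXT_PORT_SELECT_F up to IB_PC_EXT_XMT_BYTES_F, and then
 * the group's own FIRST_F..LAST_F range.)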
*/ {32, 32, "PortXmitTimeCong", mad_dump_uint}, {}, /* IB_PC_XMIT_CON_CTRL_LAST_F */ /* * PortVLXmitTimeCong fields */ {32, 32, "PortVLXmitTimeCong0", mad_dump_uint}, {64, 32, "PortVLXmitTimeCong1", mad_dump_uint}, {96, 32, "PortVLXmitTimeCong2", mad_dump_uint}, {128, 32, "PortVLXmitTimeCong3", mad_dump_uint}, {160, 32, "PortVLXmitTimeCong4", mad_dump_uint}, {192, 32, "PortVLXmitTimeCong5", mad_dump_uint}, {224, 32, "PortVLXmitTimeCong6", mad_dump_uint}, {256, 32, "PortVLXmitTimeCong7", mad_dump_uint}, {288, 32, "PortVLXmitTimeCong8", mad_dump_uint}, {320, 32, "PortVLXmitTimeCong9", mad_dump_uint}, {352, 32, "PortVLXmitTimeCong10", mad_dump_uint}, {384, 32, "PortVLXmitTimeCong11", mad_dump_uint}, {416, 32, "PortVLXmitTimeCong12", mad_dump_uint}, {448, 32, "PortVLXmitTimeCong13", mad_dump_uint}, {480, 32, "PortVLXmitTimeCong14", mad_dump_uint}, {}, /* IB_PC_VL_XMIT_TIME_CONG_LAST_F */ /* * Mellanox ExtendedPortInfo fields */ {BITSOFFS(24, 8), "StateChangeEnable", mad_dump_hex}, {BITSOFFS(56, 8), "LinkSpeedSupported", mad_dump_hex}, {BITSOFFS(88, 8), "LinkSpeedEnabled", mad_dump_hex}, {BITSOFFS(120, 8), "LinkSpeedActive", mad_dump_hex}, {}, /* IB_MLNX_EXT_PORT_LAST_F */ /* * Congestion Control Mad fields * bytes 24-31 of congestion control mad */ {192, 64, "CC_Key", mad_dump_hex}, /* IB_CC_CCKEY_F */ /* * CongestionInfo fields */ {BITSOFFS(0, 16), "CongestionInfo", mad_dump_hex}, {BITSOFFS(16, 8), "ControlTableCap", mad_dump_uint}, {}, /* IB_CC_CONGESTION_INFO_LAST_F */ /* * CongestionKeyInfo fields */ {0, 64, "CC_Key", mad_dump_hex}, {BITSOFFS(64, 1), "CC_KeyProtectBit", mad_dump_uint}, {BITSOFFS(80, 16), "CC_KeyLeasePeriod", mad_dump_uint}, {BITSOFFS(96, 16), "CC_KeyViolations", mad_dump_uint}, {}, /* IB_CC_CONGESTION_KEY_INFO_LAST_F */ /* * CongestionLog (common) fields */ {BITSOFFS(0, 8), "LogType", mad_dump_uint}, {BITSOFFS(8, 8), "CongestionFlags", mad_dump_hex}, {}, /* IB_CC_CONGESTION_LOG_LAST_F */ /* * CongestionLog (Switch) fields */ {BITSOFFS(16, 16), "LogEventsCounter", mad_dump_uint}, {32, 32, "CurrentTimeStamp", mad_dump_uint}, {64, 256, "PortMap", mad_dump_array}, {}, /* IB_CC_CONGESTION_LOG_SWITCH_LAST_F */ /* * CongestionLogEvent (Switch) fields */ {BITSOFFS(0, 16), "SLID", mad_dump_uint}, {BITSOFFS(16, 16), "DLID", mad_dump_uint}, {BITSOFFS(32, 4), "SL", mad_dump_uint}, {64, 32, "Timestamp", mad_dump_uint}, {}, /* IB_CC_CONGESTION_LOG_ENTRY_SWITCH_LAST_F */ /* * CongestionLog (CA) fields */ {BITSOFFS(16, 16), "ThresholdEventCounter", mad_dump_uint}, {BITSOFFS(32, 16), "ThresholdCongestionEventMap", mad_dump_hex}, /* XXX: Q3/2010 errata lists offset 48, but that means field is not * word aligned. Assume will be aligned to offset 64 later. 
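 * (48 mod 32 = 16, so a 32-bit field at offset 48 would straddle two
 * 32-bit words; encoding it as BITSOFFS(64, 32) keeps it word aligned.)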
*/ {BITSOFFS(64, 32), "CurrentTimeStamp", mad_dump_uint}, {}, /* IB_CC_CONGESTION_LOG_CA_LAST_F */ /* * CongestionLogEvent (CA) fields */ {BITSOFFS(0, 24), "Local_QP_CN_Entry", mad_dump_uint}, {BITSOFFS(24, 4), "SL_CN_Entry", mad_dump_uint}, {BITSOFFS(28, 4), "Service_Type_CN_Entry", mad_dump_hex}, {BITSOFFS(32, 24), "Remote_QP_Number_CN_Entry", mad_dump_uint}, {BITSOFFS(64, 16), "Local_LID_CN", mad_dump_uint}, {BITSOFFS(80, 16), "Remote_LID_CN_Entry", mad_dump_uint}, {BITSOFFS(96, 32), "Timestamp_CN_Entry", mad_dump_uint}, {}, /* IB_CC_CONGESTION_LOG_ENTRY_CA_LAST_F */ /* * SwitchCongestionSetting fields */ {0, 32, "Control_Map", mad_dump_hex}, {32, 256, "Victim_Mask", mad_dump_array}, {288, 256, "Credit_Mask", mad_dump_array}, {BITSOFFS(544, 4), "Threshold", mad_dump_hex}, {BITSOFFS(552, 8), "Packet_Size", mad_dump_uint}, {BITSOFFS(560, 4), "CS_Threshold", mad_dump_hex}, {BITSOFFS(576, 16), "CS_ReturnDelay", mad_dump_hex}, /* TODO: CCT dump */ {BITSOFFS(592, 16), "Marking_Rate", mad_dump_uint}, {}, /* IB_CC_SWITCH_CONGESTION_SETTING_LAST_F */ /* * SwitchPortCongestionSettingElement fields */ {BITSOFFS(0, 1), "Valid", mad_dump_uint}, {BITSOFFS(1, 1), "Control_Type", mad_dump_uint}, {BITSOFFS(4, 4), "Threshold", mad_dump_hex}, {BITSOFFS(8, 8), "Packet_Size", mad_dump_uint}, {BITSOFFS(16, 16), "Cong_Parm_Marking_Rate", mad_dump_uint}, {}, /* IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_LAST_F */ /* * CACongestionSetting fields */ {BITSOFFS(0, 16), "Port_Control", mad_dump_hex}, {BITSOFFS(16, 16), "Control_Map", mad_dump_hex}, {}, /* IB_CC_CA_CONGESTION_SETTING_LAST_F */ /* * CACongestionEntry fields */ {BITSOFFS(0, 16), "CCTI_Timer", mad_dump_uint}, {BITSOFFS(16, 8), "CCTI_Increase", mad_dump_uint}, {BITSOFFS(24, 8), "Trigger_Threshold", mad_dump_uint}, {BITSOFFS(32, 8), "CCTI_Min", mad_dump_uint}, {}, /* IB_CC_CA_CONGESTION_SETTING_ENTRY_LAST_F */ /* * CongestionControlTable fields */ {BITSOFFS(0, 16), "CCTI_Limit", mad_dump_uint}, {}, /* IB_CC_CONGESTION_CONTROL_TABLE_LAST_F */ /* * CongestionControlTableEntry fields */ {BITSOFFS(0, 2), "CCT_Shift", mad_dump_uint}, {BITSOFFS(2, 14), "CCT_Multiplier", mad_dump_uint}, {}, /* IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_LAST_F */ /* * Timestamp fields */ {0, 32, "Timestamp", mad_dump_uint}, {}, /* IB_CC_TIMESTAMP_LAST_F */ /* Node Record */ {BITSOFFS(0, 16), "Lid", mad_dump_uint}, {BITSOFFS(32, 8), "BaseVers", mad_dump_uint}, {BITSOFFS(40, 8), "ClassVers", mad_dump_uint}, {BITSOFFS(48, 8), "NodeType", mad_dump_node_type}, {BITSOFFS(56, 8), "NumPorts", mad_dump_uint}, {64, 64, "SystemGuid", mad_dump_hex}, {128, 64, "Guid", mad_dump_hex}, {192, 64, "PortGuid", mad_dump_hex}, {BITSOFFS(256, 16), "PartCap", mad_dump_uint}, {BITSOFFS(272, 16), "DevId", mad_dump_hex}, {288, 32, "Revision", mad_dump_hex}, {BITSOFFS(320, 8), "LocalPort", mad_dump_uint}, {BITSOFFS(328, 24), "VendorId", mad_dump_hex}, {352, 64 * 8, "NodeDesc", mad_dump_string}, {}, /* IB_SA_NR_LAST_F */ /* * PortSamplesResult fields */ {BITSOFFS(0, 16), "Tag", mad_dump_hex}, {BITSOFFS(30, 2), "SampleStatus", mad_dump_hex}, {32, 32, "Counter0", mad_dump_uint}, {64, 32, "Counter1", mad_dump_uint}, {96, 32, "Counter2", mad_dump_uint}, {128, 32, "Counter3", mad_dump_uint}, {160, 32, "Counter4", mad_dump_uint}, {192, 32, "Counter5", mad_dump_uint}, {224, 32, "Counter6", mad_dump_uint}, {256, 32, "Counter7", mad_dump_uint}, {288, 32, "Counter8", mad_dump_uint}, {320, 32, "Counter9", mad_dump_uint}, {352, 32, "Counter10", mad_dump_uint}, {384, 32, "Counter11", mad_dump_uint}, {416, 32, "Counter12", 
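/*
 * Worked example for the field accessors that follow (illustrative):
 * _get_field() XORs each byte index with 3, mirroring it inside its
 * 32-bit word (0<->3, 1<->2) and thereby undoing the BITSOFFS()
 * mirroring of the bit offset.  For the PortInfo Lid (spec offset 128,
 * stored as bitoffs 144, so idx = 18 and bytelen = 2) the loop reads
 * p[3 ^ 19] and p[3 ^ 18], i.e. p[16] and p[17] - precisely the field's
 * two big-endian bytes on the wire.  mad_decode_field() then dispatches
 * on width: fields up to 32 bits go through _get_field(), 64-bit fields
 * through _get_field64(), and anything larger through _get_array().
 */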
mad_dump_uint}, {448, 32, "Counter13", mad_dump_uint}, {480, 32, "Counter14", mad_dump_uint}, {}, /* IB_PSR_LAST_F */ /* * PortInfoExtended fields */ {0, 32, "CapMask", mad_dump_hex}, {BITSOFFS(32, 16), "FECModeActive", mad_dump_uint}, {BITSOFFS(48, 16), "FDRFECModeSupported", mad_dump_hex}, {BITSOFFS(64, 16), "FDRFECModeEnabled", mad_dump_hex}, {BITSOFFS(80, 16), "EDRFECModeSupported", mad_dump_hex}, {BITSOFFS(96, 16), "EDRFECModeEnabled", mad_dump_hex}, {}, /* IB_PORT_EXT_LAST_F */ /* * PortExtendedSpeedsCounters RSFEC Active fields */ {BITSOFFS(8, 8), "PortSelect", mad_dump_uint}, {64, 64, "CounterSelect", mad_dump_hex}, {BITSOFFS(128, 16), "SyncHeaderErrorCounter", mad_dump_uint}, {BITSOFFS(144, 16), "UnknownBlockCounter", mad_dump_uint}, {352, 32, "FECCorrectableSymbolCtrLane0", mad_dump_uint}, {384, 32, "FECCorrectableSymbolCtrLane1", mad_dump_uint}, {416, 32, "FECCorrectableSymbolCtrLane2", mad_dump_uint}, {448, 32, "FECCorrectableSymbolCtrLane3", mad_dump_uint}, {480, 32, "FECCorrectableSymbolCtrLane4", mad_dump_uint}, {512, 32, "FECCorrectableSymbolCtrLane5", mad_dump_uint}, {544, 32, "FECCorrectableSymbolCtrLane6", mad_dump_uint}, {576, 32, "FECCorrectableSymbolCtrLane7", mad_dump_uint}, {608, 32, "FECCorrectableSymbolCtrLane8", mad_dump_uint}, {640, 32, "FECCorrectableSymbolCtrLane9", mad_dump_uint}, {672, 32, "FECCorrectableSymbolCtrLane10", mad_dump_uint}, {704, 32, "FECCorrectableSymbolCtrLane11", mad_dump_uint}, {1120, 32, "PortFECCorrectableBlockCtr", mad_dump_uint}, {1152, 32, "PortFECUncorrectableBlockCtr", mad_dump_uint}, {1184, 32, "PortFECCorrectedSymbolCtr", mad_dump_uint}, {}, /* IB_PESC_RSFEC_LAST_F */ /* * More PortCountersExtended fields */ {32, 32, "CounterSelect2", mad_dump_hex}, {576, 64, "SymbolErrorCounter", mad_dump_uint}, {640, 64, "LinkErrorRecoveryCounter", mad_dump_uint}, {704, 64, "LinkDownedCounter", mad_dump_uint}, {768, 64, "PortRcvErrors", mad_dump_uint}, {832, 64, "PortRcvRemotePhysicalErrors", mad_dump_uint}, {896, 64, "PortRcvSwitchRelayErrors", mad_dump_uint}, {960, 64, "PortXmitDiscards", mad_dump_uint}, {1024, 64, "PortXmitConstraintErrors", mad_dump_uint}, {1088, 64, "PortRcvConstraintErrors", mad_dump_uint}, {1152, 64, "LocalLinkIntegrityErrors", mad_dump_uint}, {1216, 64, "ExcessiveBufferOverrunErrors", mad_dump_uint}, {1280, 64, "VL15Dropped", mad_dump_uint}, {1344, 64, "PortXmitWait", mad_dump_uint}, {1408, 64, "QP1Dropped", mad_dump_uint}, {}, /* IB_PC_EXT_ERR_LAST_F */ /* * Another PortCounters field */ {160, 16, "QP1Dropped", mad_dump_uint}, /* * More PortInfoExtended fields (HDR) */ {112, 16, "HDRFECModeSupported", mad_dump_hex}, {128, 16, "HDRFECModeEnabled", mad_dump_hex}, {}, /* IB_PORT_EXT_HDR_FEC_MODE_LAST_F */ /* * More PortInfoExtended fields (NDR) */ {144, 16, "NDRFECModeSupported", mad_dump_hex}, {160, 16, "NDRFECModeEnabled", mad_dump_hex}, {}, /* IB_PORT_EXT_NDR_FEC_MODE_LAST_F */ /* * More PortInfo fields (XDR) */ { BITSOFFS(449, 2), "LinkSpeedExtActive2", mad_dump_linkspeedext2 }, { BITSOFFS(451, 2), "LinkSpeedExtSupported2", mad_dump_linkspeedextsup2 }, { BITSOFFS(453, 3), "LinkSpeedExtEnabled2", mad_dump_linkspeedexten2 }, { }, /* IB_PORT_LINK_SPEED_EXT_2_LAST_F */ {} /* IB_FIELD_LAST_ */ }; static void _set_field64(void *buf, int base_offs, const ib_field_t * f, uint64_t val) { uint64_t nval; nval = htonll(val); memcpy(((void *)(char *)buf + base_offs + f->bitoffs / 8), (void *)&nval, sizeof(uint64_t)); } static uint64_t _get_field64(void *buf, int base_offs, const ib_field_t * f) { uint64_t val; memcpy((void *)&val, 
(void *)((char *)buf + base_offs + f->bitoffs / 8), sizeof(uint64_t)); return ntohll(val); } static void _set_field(void *buf, int base_offs, const ib_field_t * f, uint32_t val) { int prebits = (8 - (f->bitoffs & 7)) & 7; int postbits = (f->bitoffs + f->bitlen) & 7; int bytelen = f->bitlen / 8; unsigned idx = base_offs + f->bitoffs / 8; char *p = (char *)buf; if (!bytelen && (f->bitoffs & 7) + f->bitlen < 8) { p[3 ^ idx] &= ~((((1 << f->bitlen) - 1)) << (f->bitoffs & 7)); p[3 ^ idx] |= (val & ((1 << f->bitlen) - 1)) << (f->bitoffs & 7); return; } if (prebits) { /* val lsb in byte msb */ p[3 ^ idx] &= (1 << (8 - prebits)) - 1; p[3 ^ idx++] |= (val & ((1 << prebits) - 1)) << (8 - prebits); val >>= prebits; } /* BIG endian byte order */ for (; bytelen--; val >>= 8) p[3 ^ idx++] = val & 0xff; if (postbits) { /* val msb in byte lsb */ p[3 ^ idx] &= ~((1 << postbits) - 1); p[3 ^ idx] |= val; } } static uint32_t _get_field(void *buf, int base_offs, const ib_field_t * f) { int prebits = (8 - (f->bitoffs & 7)) & 7; int postbits = (f->bitoffs + f->bitlen) & 7; int bytelen = f->bitlen / 8; unsigned idx = base_offs + f->bitoffs / 8; uint8_t *p = (uint8_t *) buf; uint32_t val = 0, v = 0, i; if (!bytelen && (f->bitoffs & 7) + f->bitlen < 8) return (p[3 ^ idx] >> (f->bitoffs & 7)) & ((1 << f->bitlen) - 1); if (prebits) /* val lsb from byte msb */ v = p[3 ^ idx++] >> (8 - prebits); if (postbits) { /* val msb from byte lsb */ i = base_offs + (f->bitoffs + f->bitlen) / 8; val = (p[3 ^ i] & ((1 << postbits) - 1)); } /* BIG endian byte order */ for (idx += bytelen - 1; bytelen--; idx--) val = (val << 8) | p[3 ^ idx]; return (val << prebits) | v; } /* field must be byte aligned */ static void _set_array(void *buf, int base_offs, const ib_field_t * f, void *val) { int bitoffs = f->bitoffs; if (f->bitlen < 32) bitoffs = BE_TO_BITSOFFS(bitoffs, f->bitlen); memcpy((uint8_t *) buf + base_offs + bitoffs / 8, val, f->bitlen / 8); } static void _get_array(void *buf, int base_offs, const ib_field_t * f, void *val) { int bitoffs = f->bitoffs; if (f->bitlen < 32) bitoffs = BE_TO_BITSOFFS(bitoffs, f->bitlen); memcpy(val, (uint8_t *) buf + base_offs + bitoffs / 8, f->bitlen / 8); } uint32_t mad_get_field(void *buf, int base_offs, enum MAD_FIELDS field) { return _get_field(buf, base_offs, ib_mad_f + field); } void mad_set_field(void *buf, int base_offs, enum MAD_FIELDS field, uint32_t val) { _set_field(buf, base_offs, ib_mad_f + field, val); } uint64_t mad_get_field64(void *buf, int base_offs, enum MAD_FIELDS field) { return _get_field64(buf, base_offs, ib_mad_f + field); } void mad_set_field64(void *buf, int base_offs, enum MAD_FIELDS field, uint64_t val) { _set_field64(buf, base_offs, ib_mad_f + field, val); } void mad_set_array(void *buf, int base_offs, enum MAD_FIELDS field, void *val) { _set_array(buf, base_offs, ib_mad_f + field, val); } void mad_get_array(void *buf, int base_offs, enum MAD_FIELDS field, void *val) { _get_array(buf, base_offs, ib_mad_f + field, val); } void mad_decode_field(uint8_t * buf, enum MAD_FIELDS field, void *val) { const ib_field_t *f = ib_mad_f + field; if (!field) { *(int *)val = *(int *)buf; return; } if (f->bitlen <= 32) { *(uint32_t *) val = _get_field(buf, 0, f); return; } if (f->bitlen == 64) { *(uint64_t *) val = _get_field64(buf, 0, f); return; } _get_array(buf, 0, f, val); } void mad_encode_field(uint8_t * buf, enum MAD_FIELDS field, void *val) { const ib_field_t *f = ib_mad_f + field; if (!field) { *(int *)buf = *(int *)val; return; } if (f->bitlen <= 32) { _set_field(buf, 0, f, 
*(uint32_t *) val);
		return;
	}
	if (f->bitlen == 64) {
		_set_field64(buf, 0, f, *(uint64_t *) val);
		return;
	}
	_set_array(buf, 0, f, val);
}

/************************/

static char *_mad_dump_val(const ib_field_t * f, char *buf, int bufsz,
			   void *val)
{
	f->def_dump_fn(buf, bufsz, val, ALIGN(f->bitlen, 8) / 8);
	buf[bufsz - 1] = 0;
	return buf;
}

static char *_mad_dump_field(const ib_field_t * f, const char *name, char *buf,
			     int bufsz, void *val)
{
	/* initialized so the dot padding stays empty for names of 32+ chars */
	char dots[128] = "";
	int l, n;

	if (bufsz <= 32)
		return NULL;	/* buf too small */

	if (!name)
		name = f->name;

	l = strlen(name);
	if (l < 32) {
		memset(dots, '.', 32 - l);
		dots[32 - l] = 0;
	}

	n = snprintf(buf, bufsz, "%s:%s", name, dots);
	_mad_dump_val(f, buf + n, bufsz - n, val);
	buf[bufsz - 1] = 0;

	return buf;
}

static int _mad_dump(ib_mad_dump_fn * fn, const char *name, void *val,
		     int valsz)
{
	ib_field_t f;
	char buf[512];

	f.def_dump_fn = fn;
	f.bitlen = valsz * 8;

	return printf("%s\n", _mad_dump_field(&f, name, buf, sizeof buf, val));
}

static int _mad_print_field(const ib_field_t * f, const char *name, void *val,
			    int valsz)
{
	return _mad_dump(f->def_dump_fn, name ? name : f->name, val,
			 valsz ? valsz : ALIGN(f->bitlen, 8) / 8);
}

int mad_print_field(enum MAD_FIELDS field, const char *name, void *val)
{
	if (field <= IB_NO_FIELD || field >= IB_FIELD_LAST_)
		return -1;
	return _mad_print_field(ib_mad_f + field, name, val, 0);
}

char *mad_dump_field(enum MAD_FIELDS field, char *buf, int bufsz, void *val)
{
	if (field <= IB_NO_FIELD || field >= IB_FIELD_LAST_)
		return NULL;
	return _mad_dump_field(ib_mad_f + field, NULL, buf, bufsz, val);
}

char *mad_dump_val(enum MAD_FIELDS field, char *buf, int bufsz, void *val)
{
	if (field <= IB_NO_FIELD || field >= IB_FIELD_LAST_)
		return NULL;
	return _mad_dump_val(ib_mad_f + field, buf, bufsz, val);
}

const char *mad_field_name(enum MAD_FIELDS field)
{
	return (ib_mad_f[field].name);
}
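/*
 * Usage sketch (illustrative only, not part of the original source): the
 * accessors above take a raw MAD buffer plus an enum MAD_FIELDS id.  The
 * buffer contents and the chosen field are assumptions for the example.
 *
 *	uint8_t umad[256] = { 0 };
 *	uint32_t port;
 *
 *	mad_set_field(umad, 0, IB_PC_PORT_SELECT_F, 1);
 *	port = mad_get_field(umad, 0, IB_PC_PORT_SELECT_F);
 *	mad_decode_field(umad, IB_PC_PORT_SELECT_F, &port);
 */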
rdma-core-56.1/libibmad/gs.c000066400000000000000000000073001477342711600156520ustar00rootroot00000000000000/*
 * Copyright (c) 2004-2009 Voltaire Inc.  All rights reserved.
 * Copyright (c) 2011 Mellanox Technologies LTD.  All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses.  You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

#include <infiniband/mad.h>
#include "mad_internal.h"

#undef DEBUG
#define DEBUG	if (ibdebug) IBWARN

uint8_t *pma_query_via(void *rcvbuf, ib_portid_t * dest, int port,
		       unsigned timeout, unsigned id,
		       const struct ibmad_port * srcport)
{
	ib_rpc_v1_t rpc = { 0 };
	ib_rpc_t *rpcold = (ib_rpc_t *)(void *)&rpc;
	int lid = dest->lid;
	void *p_ret;

	DEBUG("lid %u port %d", lid, port);

	if (lid == -1) {
		IBWARN("only lid routed is supported");
		return NULL;
	}

	rpc.mgtclass = IB_PERFORMANCE_CLASS | IB_MAD_RPC_VERSION1;
	rpc.method = IB_MAD_METHOD_GET;
	rpc.attr.id = id;

	/* Same for attribute IDs */
	mad_set_field(rcvbuf, 0, IB_PC_PORT_SELECT_F, port);
	rpc.attr.mod = 0;
	rpc.timeout = timeout;
	rpc.datasz = IB_PC_DATA_SZ;
	rpc.dataoffs = IB_PC_DATA_OFFS;

	if (!dest->qp)
		dest->qp = 1;

	if (!dest->qkey)
		dest->qkey = IB_DEFAULT_QP1_QKEY;

	p_ret = mad_rpc(srcport, rpcold, dest, rcvbuf, rcvbuf);
	errno = rpc.error;
	return p_ret;
}

uint8_t *performance_reset_via(void *rcvbuf, ib_portid_t * dest, int port,
			       unsigned mask, unsigned timeout, unsigned id,
			       const struct ibmad_port * srcport)
{
	ib_rpc_v1_t rpc = { 0 };
	ib_rpc_t *rpcold = (ib_rpc_t *)(void *)&rpc;
	int lid = dest->lid;
	void *p_ret;

	DEBUG("lid %u port %d mask 0x%x", lid, port, mask);

	if (lid == -1) {
		IBWARN("only lid routed is supported");
		return NULL;
	}

	if (!mask)
		mask = ~0;

	rpc.mgtclass = IB_PERFORMANCE_CLASS | IB_MAD_RPC_VERSION1;
	rpc.method = IB_MAD_METHOD_SET;
	rpc.attr.id = id;

	memset(rcvbuf, 0, IB_MAD_SIZE);

	/* Next 2 lines - same for attribute IDs */
	mad_set_field(rcvbuf, 0, IB_PC_PORT_SELECT_F, port);
	mad_set_field(rcvbuf, 0, IB_PC_COUNTER_SELECT_F, mask);
	mask = mask >> 16;
	if (id == IB_GSI_PORT_COUNTERS_EXT)
		mad_set_field(rcvbuf, 0, IB_PC_EXT_COUNTER_SELECT2_F, mask);
	else
		mad_set_field(rcvbuf, 0, IB_PC_COUNTER_SELECT2_F, mask);
	rpc.attr.mod = 0;
	rpc.timeout = timeout;
	rpc.datasz = IB_PC_DATA_SZ;
	rpc.dataoffs = IB_PC_DATA_OFFS;

	if (!dest->qp)
		dest->qp = 1;

	if (!dest->qkey)
		dest->qkey = IB_DEFAULT_QP1_QKEY;

	p_ret = mad_rpc(srcport, rpcold, dest, rcvbuf, rcvbuf);
	errno = rpc.error;
	return p_ret;
}
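/*
 * Usage sketch (illustrative only, not part of the original source): a
 * PortCounters query against a LID-routed destination.  The LID value and
 * the already-opened "srcport" handle are assumptions for the example.
 *
 *	uint8_t pc[1024] = { 0 };
 *	ib_portid_t portid = { .lid = 4 };
 *
 *	if (!pma_query_via(pc, &portid, 1, 0, IB_GSI_PORT_COUNTERS, srcport))
 *		IBWARN("PortCounters query failed");
 */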
rdma-core-56.1/libibmad/iba_types.h000066400000000000000000001703351477342711600172360ustar00rootroot00000000000000/*
 * Copyright (c) 2004-2009 Voltaire, Inc.  All rights reserved.
 * Copyright (c) 2002-2019 Mellanox Technologies LTD.  All rights reserved.
 * Copyright (c) 1996-2003 Intel Corporation.  All rights reserved.
 * Copyright (c) 2009 HNR Consulting.  All rights reserved.
 * Copyright (c) 2013 Oracle and/or its affiliates.  All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses.  You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

#ifndef __LIBIBMAD_IB_TYPES_H__
#define __LIBIBMAD_IB_TYPES_H__

#include <stdint.h>
#include <stdbool.h>
#include <assert.h>
#include <endian.h>
#include <linux/types.h>

#define MAD_BLOCK_SIZE 256
#define MAD_RMPP_HDR_SIZE 36
#define MAD_BLOCK_GRH_SIZE 296

#define IB_LID_PERMISSIVE 0xFFFF
#define IB_DEFAULT_PKEY 0xFFFF
#define IB_QP1_WELL_KNOWN_Q_KEY htobe32(0x80010000)
#define IB_QP0 0
#define IB_QP1 htobe32(1)
#define IB_QP_PRIVILEGED_Q_KEY htobe32(0x80000000)
#define IB_LID_UCAST_START_HO 0x0001
#define IB_LID_UCAST_START htobe16(IB_LID_UCAST_START_HO)
#define IB_LID_UCAST_END_HO 0xBFFF
#define IB_LID_UCAST_END htobe16(IB_LID_UCAST_END_HO)
#define IB_LID_MCAST_START_HO 0xC000
#define IB_LID_MCAST_START htobe16(IB_LID_MCAST_START_HO)
#define IB_LID_MCAST_END_HO 0xFFFE
#define IB_LID_MCAST_END htobe16(IB_LID_MCAST_END_HO)
#define IB_DEFAULT_SUBNET_PREFIX htobe64(0xFE80000000000000ULL)
#define IB_DEFAULT_SUBNET_PREFIX_HO 0xFE80000000000000ULL
#define IB_NODE_NUM_PORTS_MAX 0xFE
#define IB_INVALID_PORT_NUM 0xFF
#define IB_SUBNET_PATH_HOPS_MAX 64
#define IB_HOPLIMIT_MAX 255
#define IB_MC_SCOPE_LINK_LOCAL 0x2
#define IB_MC_SCOPE_SITE_LOCAL 0x5
#define IB_MC_SCOPE_ORG_LOCAL 0x8
#define IB_MC_SCOPE_GLOBAL 0xE
#define IB_PKEY_MAX_BLOCKS 2048
#define IB_MCAST_MAX_BLOCK_ID 511
#define IB_MCAST_BLOCK_ID_MASK_HO 0x000001FF
#define IB_MCAST_BLOCK_SIZE 32
#define IB_MCAST_MASK_SIZE 16
#define IB_MCAST_POSITION_MASK_HO 0xF0000000
#define IB_MCAST_POSITION_MAX 0xF
#define IB_MCAST_POSITION_SHIFT 28
#define IB_PKEY_BASE_MASK htobe16(0x7FFF)
#define IB_PKEY_TYPE_MASK htobe16(0x8000)
#define IB_DEFAULT_PARTIAL_PKEY htobe16(0x7FFF)

#define IB_MCLASS_SUBN_LID 0x01
#define IB_MCLASS_SUBN_DIR 0x81
#define IB_MCLASS_SUBN_ADM 0x03
#define IB_MCLASS_PERF 0x04
#define IB_MCLASS_BM 0x05
#define IB_MCLASS_DEV_MGMT 0x06
#define IB_MCLASS_COMM_MGMT 0x07
#define IB_MCLASS_SNMP 0x08
#define IB_MCLASS_VENDOR_LOW_RANGE_MIN 0x09
#define IB_MCLASS_VENDOR_LOW_RANGE_MAX 0x0F
#define IB_MCLASS_DEV_ADM 0x10
#define IB_MCLASS_BIS 0x12
#define IB_MCLASS_CC 0x21
#define IB_MCLASS_VENDOR_HIGH_RANGE_MIN 0x30
#define IB_MCLASS_VENDOR_HIGH_RANGE_MAX 0x4F
#define IB_MAX_METHODS 128

#define IB_MAD_METHOD_RESP_MASK 0x80
#define IB_MAD_METHOD_GET 0x01
#define IB_MAD_METHOD_SET 0x02
#define IB_MAD_METHOD_GET_RESP 0x81
#define IB_MAD_METHOD_DELETE 0x15
#define IB_MAD_METHOD_GETTABLE 0x12
#define IB_MAD_METHOD_GETTABLE_RESP 0x92
#define IB_MAD_METHOD_GETTRACETABLE 0x13
#define IB_MAD_METHOD_GETMULTI 0x14
#define IB_MAD_METHOD_GETMULTI_RESP 0x94
#define IB_MAD_METHOD_SEND 0x03
#define IB_MAD_METHOD_TRAP 0x05
#define IB_MAD_METHOD_REPORT 0x06
#define IB_MAD_METHOD_REPORT_RESP 0x86
#define IB_MAD_METHOD_TRAP_REPRESS 0x07

#define IB_MAD_STATUS_BUSY htobe16(0x0001)
#define IB_MAD_STATUS_REDIRECT htobe16(0x0002)
#define IB_MAD_STATUS_UNSUP_CLASS_VER htobe16(0x0004)
#define IB_MAD_STATUS_UNSUP_METHOD htobe16(0x0008)
#define IB_MAD_STATUS_UNSUP_METHOD_ATTR htobe16(0x000C)
#define IB_MAD_STATUS_INVALID_FIELD htobe16(0x001C)
#define IB_MAD_STATUS_CLASS_MASK htobe16(0xFF00)

#define IB_SA_MAD_STATUS_SUCCESS 0x0000
#define IB_SA_MAD_STATUS_NO_RESOURCES htobe16(0x0100)
#define IB_SA_MAD_STATUS_REQ_INVALID htobe16(0x0200)
#define IB_SA_MAD_STATUS_NO_RECORDS htobe16(0x0300)
#define IB_SA_MAD_STATUS_TOO_MANY_RECORDS htobe16(0x0400)
#define
IB_SA_MAD_STATUS_INVALID_GID htobe16(0x0500) #define IB_SA_MAD_STATUS_INSUF_COMPS htobe16(0x0600) #define IB_SA_MAD_STATUS_DENIED htobe16(0x0700) #define IB_SA_MAD_STATUS_PRIO_SUGGESTED htobe16(0x0800) #define IB_DM_MAD_STATUS_NO_IOC_RESP htobe16(0x0100) #define IB_DM_MAD_STATUS_NO_SVC_ENTRIES htobe16(0x0200) #define IB_DM_MAD_STATUS_IOC_FAILURE htobe16(0x8000) #define IB_MAD_ATTR_CLASS_PORT_INFO htobe16(0x0001) #define IB_MAD_ATTR_NOTICE htobe16(0x0002) #define IB_MAD_ATTR_INFORM_INFO htobe16(0x0003) #define IB_MAD_ATTR_NODE_DESC htobe16(0x0010) #define IB_MAD_ATTR_PORT_SMPL_CTRL htobe16(0x0010) #define IB_MAD_ATTR_NODE_INFO htobe16(0x0011) #define IB_MAD_ATTR_PORT_SMPL_RSLT htobe16(0x0011) #define IB_MAD_ATTR_SWITCH_INFO htobe16(0x0012) #define IB_MAD_ATTR_PORT_CNTRS htobe16(0x0012) #define IB_MAD_ATTR_PORT_CNTRS_EXT htobe16(0x001D) #define IB_MAD_ATTR_PORT_XMIT_DATA_SL htobe16(0x0036) #define IB_MAD_ATTR_PORT_RCV_DATA_SL htobe16(0x0037) #define IB_MAD_ATTR_GUID_INFO htobe16(0x0014) #define IB_MAD_ATTR_PORT_INFO htobe16(0x0015) #define IB_MAD_ATTR_P_KEY_TABLE htobe16(0x0016) #define IB_MAD_ATTR_SLVL_TABLE htobe16(0x0017) #define IB_MAD_ATTR_VL_ARBITRATION htobe16(0x0018) #define IB_MAD_ATTR_LIN_FWD_TBL htobe16(0x0019) #define IB_MAD_ATTR_RND_FWD_TBL htobe16(0x001A) #define IB_MAD_ATTR_MCAST_FWD_TBL htobe16(0x001B) #define IB_MAD_ATTR_NODE_RECORD htobe16(0x0011) #define IB_MAD_ATTR_PORTINFO_RECORD htobe16(0x0012) #define IB_MAD_ATTR_SWITCH_INFO_RECORD htobe16(0x0014) #define IB_MAD_ATTR_LINK_RECORD htobe16(0x0020) #define IB_MAD_ATTR_SM_INFO htobe16(0x0020) #define IB_MAD_ATTR_SMINFO_RECORD htobe16(0x0018) #define IB_MAD_ATTR_GUIDINFO_RECORD htobe16(0x0030) #define IB_MAD_ATTR_VENDOR_DIAG htobe16(0x0030) #define IB_MAD_ATTR_LED_INFO htobe16(0x0031) #define IB_MAD_ATTR_MLNX_EXTENDED_PORT_INFO htobe16(0xFF90) #define IB_MAD_ATTR_SERVICE_RECORD htobe16(0x0031) #define IB_MAD_ATTR_LFT_RECORD htobe16(0x0015) #define IB_MAD_ATTR_MFT_RECORD htobe16(0x0017) #define IB_MAD_ATTR_PKEY_TBL_RECORD htobe16(0x0033) #define IB_MAD_ATTR_PATH_RECORD htobe16(0x0035) #define IB_MAD_ATTR_VLARB_RECORD htobe16(0x0036) #define IB_MAD_ATTR_SLVL_RECORD htobe16(0x0013) #define IB_MAD_ATTR_MCMEMBER_RECORD htobe16(0x0038) #define IB_MAD_ATTR_TRACE_RECORD htobe16(0x0039) #define IB_MAD_ATTR_MULTIPATH_RECORD htobe16(0x003A) #define IB_MAD_ATTR_SVC_ASSOCIATION_RECORD htobe16(0x003B) #define IB_MAD_ATTR_INFORM_INFO_RECORD htobe16(0x00F3) #define IB_MAD_ATTR_IO_UNIT_INFO htobe16(0x0010) #define IB_MAD_ATTR_IO_CONTROLLER_PROFILE htobe16(0x0011) #define IB_MAD_ATTR_SERVICE_ENTRIES htobe16(0x0012) #define IB_MAD_ATTR_DIAGNOSTIC_TIMEOUT htobe16(0x0020) #define IB_MAD_ATTR_PREPARE_TO_TEST htobe16(0x0021) #define IB_MAD_ATTR_TEST_DEVICE_ONCE htobe16(0x0022) #define IB_MAD_ATTR_TEST_DEVICE_LOOP htobe16(0x0023) #define IB_MAD_ATTR_DIAG_CODE htobe16(0x0024) #define IB_MAD_ATTR_SVC_ASSOCIATION_RECORD htobe16(0x003B) #define IB_MAD_ATTR_CONG_INFO htobe16(0x0011) #define IB_MAD_ATTR_CONG_KEY_INFO htobe16(0x0012) #define IB_MAD_ATTR_CONG_LOG htobe16(0x0013) #define IB_MAD_ATTR_SW_CONG_SETTING htobe16(0x0014) #define IB_MAD_ATTR_SW_PORT_CONG_SETTING htobe16(0x0015) #define IB_MAD_ATTR_CA_CONG_SETTING htobe16(0x0016) #define IB_MAD_ATTR_CC_TBL htobe16(0x0017) #define IB_MAD_ATTR_TIME_STAMP htobe16(0x0018) #define IB_NODE_TYPE_CA 0x01 #define IB_NODE_TYPE_SWITCH 0x02 #define IB_NODE_TYPE_ROUTER 0x03 #define IB_NOTICE_PRODUCER_TYPE_CA htobe32(0x000001) #define IB_NOTICE_PRODUCER_TYPE_SWITCH htobe32(0x000002) #define 
IB_NOTICE_PRODUCER_TYPE_ROUTER htobe32(0x000003) #define IB_NOTICE_PRODUCER_TYPE_CLASS_MGR htobe32(0x000004) #define IB_MTU_LEN_256 1 #define IB_MTU_LEN_512 2 #define IB_MTU_LEN_1024 3 #define IB_MTU_LEN_2048 4 #define IB_MTU_LEN_4096 5 #define IB_PATH_SELECTOR_GREATER_THAN 0 #define IB_PATH_SELECTOR_LESS_THAN 1 #define IB_PATH_SELECTOR_EXACTLY 2 #define IB_PATH_SELECTOR_LARGEST 3 #define IB_SMINFO_STATE_NOTACTIVE 0 #define IB_SMINFO_STATE_DISCOVERING 1 #define IB_SMINFO_STATE_STANDBY 2 #define IB_SMINFO_STATE_MASTER 3 #define IB_PATH_REC_SL_MASK 0x000F #define IB_MULTIPATH_REC_SL_MASK 0x000F #define IB_PATH_REC_QOS_CLASS_MASK 0xFFF0 #define IB_MULTIPATH_REC_QOS_CLASS_MASK 0xFFF0 #define IB_PATH_REC_SELECTOR_MASK 0xC0 #define IB_MULTIPATH_REC_SELECTOR_MASK 0xC0 #define IB_PATH_REC_BASE_MASK 0x3F #define IB_MULTIPATH_REC_BASE_MASK 0x3F #define IB_LINK_NO_CHANGE 0 #define IB_LINK_DOWN 1 #define IB_LINK_INIT 2 #define IB_LINK_ARMED 3 #define IB_LINK_ACTIVE 4 #define IB_LINK_ACT_DEFER 5 #define IB_JOIN_STATE_FULL 1 #define IB_JOIN_STATE_NON 2 #define IB_JOIN_STATE_SEND_ONLY 4 #define IB_JOIN_STATE_SEND_ONLY_FULL 8 typedef union { uint8_t raw[16]; struct _ib_gid_unicast { __be64 prefix; __be64 interface_id; } __attribute__((packed)) unicast; struct _ib_gid_multicast { uint8_t header[2]; uint8_t raw_group_id[14]; } __attribute__((packed)) multicast; struct _ib_gid_ip_multicast { uint8_t header[2]; __be16 signature; __be16 p_key; uint8_t group_id[10]; } __attribute__((packed)) ip_multicast; } __attribute__((packed)) ib_gid_t; typedef struct { __be64 service_id; ib_gid_t dgid; ib_gid_t sgid; __be16 dlid; __be16 slid; __be32 hop_flow_raw; uint8_t tclass; uint8_t num_path; __be16 pkey; __be16 qos_class_sl; uint8_t mtu; uint8_t rate; uint8_t pkt_life; uint8_t preference; uint8_t resv2[6]; } __attribute__((packed)) ib_path_rec_t; #define IB_PR_COMPMASK_SERVICEID_MSB htobe64(((uint64_t)1) << 0) #define IB_PR_COMPMASK_SERVICEID_LSB htobe64(((uint64_t)1) << 1) #define IB_PR_COMPMASK_DGID htobe64(((uint64_t)1) << 2) #define IB_PR_COMPMASK_SGID htobe64(((uint64_t)1) << 3) #define IB_PR_COMPMASK_DLID htobe64(((uint64_t)1) << 4) #define IB_PR_COMPMASK_SLID htobe64(((uint64_t)1) << 5) #define IB_PR_COMPMASK_RAWTRAFFIC htobe64(((uint64_t)1) << 6) #define IB_PR_COMPMASK_RESV0 htobe64(((uint64_t)1) << 7) #define IB_PR_COMPMASK_FLOWLABEL htobe64(((uint64_t)1) << 8) #define IB_PR_COMPMASK_HOPLIMIT htobe64(((uint64_t)1) << 9) #define IB_PR_COMPMASK_TCLASS htobe64(((uint64_t)1) << 10) #define IB_PR_COMPMASK_REVERSIBLE htobe64(((uint64_t)1) << 11) #define IB_PR_COMPMASK_NUMBPATH htobe64(((uint64_t)1) << 12) #define IB_PR_COMPMASK_PKEY htobe64(((uint64_t)1) << 13) #define IB_PR_COMPMASK_QOS_CLASS htobe64(((uint64_t)1) << 14) #define IB_PR_COMPMASK_SL htobe64(((uint64_t)1) << 15) #define IB_PR_COMPMASK_MTUSELEC htobe64(((uint64_t)1) << 16) #define IB_PR_COMPMASK_MTU htobe64(((uint64_t)1) << 17) #define IB_PR_COMPMASK_RATESELEC htobe64(((uint64_t)1) << 18) #define IB_PR_COMPMASK_RATE htobe64(((uint64_t)1) << 19) #define IB_PR_COMPMASK_PKTLIFETIMESELEC htobe64(((uint64_t)1) << 20) #define IB_PR_COMPMASK_PKTLIFETIME htobe64(((uint64_t)1) << 21) #define IB_LR_COMPMASK_FROM_LID htobe64(((uint64_t)1) << 0) #define IB_LR_COMPMASK_FROM_PORT htobe64(((uint64_t)1) << 1) #define IB_LR_COMPMASK_TO_PORT htobe64(((uint64_t)1) << 2) #define IB_LR_COMPMASK_TO_LID htobe64(((uint64_t)1) << 3) #define IB_VLA_COMPMASK_LID htobe64(((uint64_t)1) << 0) #define IB_VLA_COMPMASK_OUT_PORT htobe64(((uint64_t)1) << 1) #define IB_VLA_COMPMASK_BLOCK 
htobe64(((uint64_t)1) << 2) #define IB_SLVL_COMPMASK_LID htobe64(((uint64_t)1) << 0) #define IB_SLVL_COMPMASK_IN_PORT htobe64(((uint64_t)1) << 1) #define IB_SLVL_COMPMASK_OUT_PORT htobe64(((uint64_t)1) << 2) #define IB_PKEY_COMPMASK_LID htobe64(((uint64_t)1) << 0) #define IB_PKEY_COMPMASK_BLOCK htobe64(((uint64_t)1) << 1) #define IB_PKEY_COMPMASK_PORT htobe64(((uint64_t)1) << 2) #define IB_SWIR_COMPMASK_LID htobe64(((uint64_t)1) << 0) #define IB_SWIR_COMPMASK_RESERVED1 htobe64(((uint64_t)1) << 1) #define IB_LFTR_COMPMASK_LID htobe64(((uint64_t)1) << 0) #define IB_LFTR_COMPMASK_BLOCK htobe64(((uint64_t)1) << 1) #define IB_MFTR_COMPMASK_LID htobe64(((uint64_t)1) << 0) #define IB_MFTR_COMPMASK_POSITION htobe64(((uint64_t)1) << 1) #define IB_MFTR_COMPMASK_RESERVED1 htobe64(((uint64_t)1) << 2) #define IB_MFTR_COMPMASK_BLOCK htobe64(((uint64_t)1) << 3) #define IB_MFTR_COMPMASK_RESERVED2 htobe64(((uint64_t)1) << 4) #define IB_NR_COMPMASK_LID htobe64(((uint64_t)1) << 0) #define IB_NR_COMPMASK_RESERVED1 htobe64(((uint64_t)1) << 1) #define IB_NR_COMPMASK_BASEVERSION htobe64(((uint64_t)1) << 2) #define IB_NR_COMPMASK_CLASSVERSION htobe64(((uint64_t)1) << 3) #define IB_NR_COMPMASK_NODETYPE htobe64(((uint64_t)1) << 4) #define IB_NR_COMPMASK_NUMPORTS htobe64(((uint64_t)1) << 5) #define IB_NR_COMPMASK_SYSIMAGEGUID htobe64(((uint64_t)1) << 6) #define IB_NR_COMPMASK_NODEGUID htobe64(((uint64_t)1) << 7) #define IB_NR_COMPMASK_PORTGUID htobe64(((uint64_t)1) << 8) #define IB_NR_COMPMASK_PARTCAP htobe64(((uint64_t)1) << 9) #define IB_NR_COMPMASK_DEVID htobe64(((uint64_t)1) << 10) #define IB_NR_COMPMASK_REV htobe64(((uint64_t)1) << 11) #define IB_NR_COMPMASK_PORTNUM htobe64(((uint64_t)1) << 12) #define IB_NR_COMPMASK_VENDID htobe64(((uint64_t)1) << 13) #define IB_NR_COMPMASK_NODEDESC htobe64(((uint64_t)1) << 14) #define IB_SR_COMPMASK_SID htobe64(((uint64_t)1) << 0) #define IB_SR_COMPMASK_SGID htobe64(((uint64_t)1) << 1) #define IB_SR_COMPMASK_SPKEY htobe64(((uint64_t)1) << 2) #define IB_SR_COMPMASK_RES1 htobe64(((uint64_t)1) << 3) #define IB_SR_COMPMASK_SLEASE htobe64(((uint64_t)1) << 4) #define IB_SR_COMPMASK_SKEY htobe64(((uint64_t)1) << 5) #define IB_SR_COMPMASK_SNAME htobe64(((uint64_t)1) << 6) #define IB_SR_COMPMASK_SDATA8_0 htobe64(((uint64_t)1) << 7) #define IB_SR_COMPMASK_SDATA8_1 htobe64(((uint64_t)1) << 8) #define IB_SR_COMPMASK_SDATA8_2 htobe64(((uint64_t)1) << 9) #define IB_SR_COMPMASK_SDATA8_3 htobe64(((uint64_t)1) << 10) #define IB_SR_COMPMASK_SDATA8_4 htobe64(((uint64_t)1) << 11) #define IB_SR_COMPMASK_SDATA8_5 htobe64(((uint64_t)1) << 12) #define IB_SR_COMPMASK_SDATA8_6 htobe64(((uint64_t)1) << 13) #define IB_SR_COMPMASK_SDATA8_7 htobe64(((uint64_t)1) << 14) #define IB_SR_COMPMASK_SDATA8_8 htobe64(((uint64_t)1) << 15) #define IB_SR_COMPMASK_SDATA8_9 htobe64(((uint64_t)1) << 16) #define IB_SR_COMPMASK_SDATA8_10 htobe64(((uint64_t)1) << 17) #define IB_SR_COMPMASK_SDATA8_11 htobe64(((uint64_t)1) << 18) #define IB_SR_COMPMASK_SDATA8_12 htobe64(((uint64_t)1) << 19) #define IB_SR_COMPMASK_SDATA8_13 htobe64(((uint64_t)1) << 20) #define IB_SR_COMPMASK_SDATA8_14 htobe64(((uint64_t)1) << 21) #define IB_SR_COMPMASK_SDATA8_15 htobe64(((uint64_t)1) << 22) #define IB_SR_COMPMASK_SDATA16_0 htobe64(((uint64_t)1) << 23) #define IB_SR_COMPMASK_SDATA16_1 htobe64(((uint64_t)1) << 24) #define IB_SR_COMPMASK_SDATA16_2 htobe64(((uint64_t)1) << 25) #define IB_SR_COMPMASK_SDATA16_3 htobe64(((uint64_t)1) << 26) #define IB_SR_COMPMASK_SDATA16_4 htobe64(((uint64_t)1) << 27) #define IB_SR_COMPMASK_SDATA16_5 
htobe64(((uint64_t)1) << 28) #define IB_SR_COMPMASK_SDATA16_6 htobe64(((uint64_t)1) << 29) #define IB_SR_COMPMASK_SDATA16_7 htobe64(((uint64_t)1) << 30) #define IB_SR_COMPMASK_SDATA32_0 htobe64(((uint64_t)1) << 31) #define IB_SR_COMPMASK_SDATA32_1 htobe64(((uint64_t)1) << 32) #define IB_SR_COMPMASK_SDATA32_2 htobe64(((uint64_t)1) << 33) #define IB_SR_COMPMASK_SDATA32_3 htobe64(((uint64_t)1) << 34) #define IB_SR_COMPMASK_SDATA64_0 htobe64(((uint64_t)1) << 35) #define IB_SR_COMPMASK_SDATA64_1 htobe64(((uint64_t)1) << 36) #define IB_PIR_COMPMASK_LID htobe64(((uint64_t)1) << 0) #define IB_PIR_COMPMASK_PORTNUM htobe64(((uint64_t)1) << 1) #define IB_PIR_COMPMASK_OPTIONS htobe64(((uint64_t)1) << 2) #define IB_PIR_COMPMASK_MKEY htobe64(((uint64_t)1) << 3) #define IB_PIR_COMPMASK_GIDPRE htobe64(((uint64_t)1) << 4) #define IB_PIR_COMPMASK_BASELID htobe64(((uint64_t)1) << 5) #define IB_PIR_COMPMASK_SMLID htobe64(((uint64_t)1) << 6) #define IB_PIR_COMPMASK_CAPMASK htobe64(((uint64_t)1) << 7) #define IB_PIR_COMPMASK_DIAGCODE htobe64(((uint64_t)1) << 8) #define IB_PIR_COMPMASK_MKEYLEASEPRD htobe64(((uint64_t)1) << 9) #define IB_PIR_COMPMASK_LOCALPORTNUM htobe64(((uint64_t)1) << 10) #define IB_PIR_COMPMASK_LINKWIDTHENABLED htobe64(((uint64_t)1) << 11) #define IB_PIR_COMPMASK_LNKWIDTHSUPPORT htobe64(((uint64_t)1) << 12) #define IB_PIR_COMPMASK_LNKWIDTHACTIVE htobe64(((uint64_t)1) << 13) #define IB_PIR_COMPMASK_LNKSPEEDSUPPORT htobe64(((uint64_t)1) << 14) #define IB_PIR_COMPMASK_PORTSTATE htobe64(((uint64_t)1) << 15) #define IB_PIR_COMPMASK_PORTPHYSTATE htobe64(((uint64_t)1) << 16) #define IB_PIR_COMPMASK_LINKDWNDFLTSTATE htobe64(((uint64_t)1) << 17) #define IB_PIR_COMPMASK_MKEYPROTBITS htobe64(((uint64_t)1) << 18) #define IB_PIR_COMPMASK_RESV2 htobe64(((uint64_t)1) << 19) #define IB_PIR_COMPMASK_LMC htobe64(((uint64_t)1) << 20) #define IB_PIR_COMPMASK_LINKSPEEDACTIVE htobe64(((uint64_t)1) << 21) #define IB_PIR_COMPMASK_LINKSPEEDENABLE htobe64(((uint64_t)1) << 22) #define IB_PIR_COMPMASK_NEIGHBORMTU htobe64(((uint64_t)1) << 23) #define IB_PIR_COMPMASK_MASTERSMSL htobe64(((uint64_t)1) << 24) #define IB_PIR_COMPMASK_VLCAP htobe64(((uint64_t)1) << 25) #define IB_PIR_COMPMASK_INITTYPE htobe64(((uint64_t)1) << 26) #define IB_PIR_COMPMASK_VLHIGHLIMIT htobe64(((uint64_t)1) << 27) #define IB_PIR_COMPMASK_VLARBHIGHCAP htobe64(((uint64_t)1) << 28) #define IB_PIR_COMPMASK_VLARBLOWCAP htobe64(((uint64_t)1) << 29) #define IB_PIR_COMPMASK_INITTYPEREPLY htobe64(((uint64_t)1) << 30) #define IB_PIR_COMPMASK_MTUCAP htobe64(((uint64_t)1) << 31) #define IB_PIR_COMPMASK_VLSTALLCNT htobe64(((uint64_t)1) << 32) #define IB_PIR_COMPMASK_HOQLIFE htobe64(((uint64_t)1) << 33) #define IB_PIR_COMPMASK_OPVLS htobe64(((uint64_t)1) << 34) #define IB_PIR_COMPMASK_PARENFIN htobe64(((uint64_t)1) << 35) #define IB_PIR_COMPMASK_PARENFOUT htobe64(((uint64_t)1) << 36) #define IB_PIR_COMPMASK_FILTERRAWIN htobe64(((uint64_t)1) << 37) #define IB_PIR_COMPMASK_FILTERRAWOUT htobe64(((uint64_t)1) << 38) #define IB_PIR_COMPMASK_MKEYVIO htobe64(((uint64_t)1) << 39) #define IB_PIR_COMPMASK_PKEYVIO htobe64(((uint64_t)1) << 40) #define IB_PIR_COMPMASK_QKEYVIO htobe64(((uint64_t)1) << 41) #define IB_PIR_COMPMASK_GUIDCAP htobe64(((uint64_t)1) << 42) #define IB_PIR_COMPMASK_CLIENTREREG htobe64(((uint64_t)1) << 43) #define IB_PIR_COMPMASK_RESV3 htobe64(((uint64_t)1) << 44) #define IB_PIR_COMPMASK_SUBNTO htobe64(((uint64_t)1) << 45) #define IB_PIR_COMPMASK_RESV4 htobe64(((uint64_t)1) << 46) #define IB_PIR_COMPMASK_RESPTIME htobe64(((uint64_t)1) << 47) #define 
IB_PIR_COMPMASK_LOCALPHYERR htobe64(((uint64_t)1) << 48) #define IB_PIR_COMPMASK_OVERRUNERR htobe64(((uint64_t)1) << 49) #define IB_PIR_COMPMASK_MAXCREDHINT htobe64(((uint64_t)1) << 50) #define IB_PIR_COMPMASK_RESV5 htobe64(((uint64_t)1) << 51) #define IB_PIR_COMPMASK_LINKRTLAT htobe64(((uint64_t)1) << 52) #define IB_PIR_COMPMASK_CAPMASK2 htobe64(((uint64_t)1) << 53) #define IB_PIR_COMPMASK_LINKSPDEXTACT htobe64(((uint64_t)1) << 54) #define IB_PIR_COMPMASK_LINKSPDEXTSUPP htobe64(((uint64_t)1) << 55) #define IB_PIR_COMPMASK_RESV7 htobe64(((uint64_t)1) << 56) #define IB_PIR_COMPMASK_LINKSPDEXTENAB htobe64(((uint64_t)1) << 57) #define IB_MCR_COMPMASK_GID htobe64(((uint64_t)1) << 0) #define IB_MCR_COMPMASK_MGID htobe64(((uint64_t)1) << 0) #define IB_MCR_COMPMASK_PORT_GID htobe64(((uint64_t)1) << 1) #define IB_MCR_COMPMASK_QKEY htobe64(((uint64_t)1) << 2) #define IB_MCR_COMPMASK_MLID htobe64(((uint64_t)1) << 3) #define IB_MCR_COMPMASK_MTU_SEL htobe64(((uint64_t)1) << 4) #define IB_MCR_COMPMASK_MTU htobe64(((uint64_t)1) << 5) #define IB_MCR_COMPMASK_TCLASS htobe64(((uint64_t)1) << 6) #define IB_MCR_COMPMASK_PKEY htobe64(((uint64_t)1) << 7) #define IB_MCR_COMPMASK_RATE_SEL htobe64(((uint64_t)1) << 8) #define IB_MCR_COMPMASK_RATE htobe64(((uint64_t)1) << 9) #define IB_MCR_COMPMASK_LIFE_SEL htobe64(((uint64_t)1) << 10) #define IB_MCR_COMPMASK_LIFE htobe64(((uint64_t)1) << 11) #define IB_MCR_COMPMASK_SL htobe64(((uint64_t)1) << 12) #define IB_MCR_COMPMASK_FLOW htobe64(((uint64_t)1) << 13) #define IB_MCR_COMPMASK_HOP htobe64(((uint64_t)1) << 14) #define IB_MCR_COMPMASK_SCOPE htobe64(((uint64_t)1) << 15) #define IB_MCR_COMPMASK_JOIN_STATE htobe64(((uint64_t)1) << 16) #define IB_MCR_COMPMASK_PROXY htobe64(((uint64_t)1) << 17) #define IB_GIR_COMPMASK_LID htobe64(((uint64_t)1) << 0) #define IB_GIR_COMPMASK_BLOCKNUM htobe64(((uint64_t)1) << 1) #define IB_GIR_COMPMASK_RESV1 htobe64(((uint64_t)1) << 2) #define IB_GIR_COMPMASK_RESV2 htobe64(((uint64_t)1) << 3) #define IB_GIR_COMPMASK_GID0 htobe64(((uint64_t)1) << 4) #define IB_GIR_COMPMASK_GID1 htobe64(((uint64_t)1) << 5) #define IB_GIR_COMPMASK_GID2 htobe64(((uint64_t)1) << 6) #define IB_GIR_COMPMASK_GID3 htobe64(((uint64_t)1) << 7) #define IB_GIR_COMPMASK_GID4 htobe64(((uint64_t)1) << 8) #define IB_GIR_COMPMASK_GID5 htobe64(((uint64_t)1) << 9) #define IB_GIR_COMPMASK_GID6 htobe64(((uint64_t)1) << 10) #define IB_GIR_COMPMASK_GID7 htobe64(((uint64_t)1) << 11) #define IB_MPR_COMPMASK_RAWTRAFFIC htobe64(((uint64_t)1) << 0) #define IB_MPR_COMPMASK_RESV0 htobe64(((uint64_t)1) << 1) #define IB_MPR_COMPMASK_FLOWLABEL htobe64(((uint64_t)1) << 2) #define IB_MPR_COMPMASK_HOPLIMIT htobe64(((uint64_t)1) << 3) #define IB_MPR_COMPMASK_TCLASS htobe64(((uint64_t)1) << 4) #define IB_MPR_COMPMASK_REVERSIBLE htobe64(((uint64_t)1) << 5) #define IB_MPR_COMPMASK_NUMBPATH htobe64(((uint64_t)1) << 6) #define IB_MPR_COMPMASK_PKEY htobe64(((uint64_t)1) << 7) #define IB_MPR_COMPMASK_QOS_CLASS htobe64(((uint64_t)1) << 8) #define IB_MPR_COMPMASK_SL htobe64(((uint64_t)1) << 9) #define IB_MPR_COMPMASK_MTUSELEC htobe64(((uint64_t)1) << 10) #define IB_MPR_COMPMASK_MTU htobe64(((uint64_t)1) << 11) #define IB_MPR_COMPMASK_RATESELEC htobe64(((uint64_t)1) << 12) #define IB_MPR_COMPMASK_RATE htobe64(((uint64_t)1) << 13) #define IB_MPR_COMPMASK_PKTLIFETIMESELEC htobe64(((uint64_t)1) << 14) #define IB_MPR_COMPMASK_PKTLIFETIME htobe64(((uint64_t)1) << 15) #define IB_MPR_COMPMASK_SERVICEID_MSB htobe64(((uint64_t)1) << 16) #define IB_MPR_COMPMASK_INDEPSELEC htobe64(((uint64_t)1) << 17) #define 
IB_MPR_COMPMASK_RESV3 htobe64(((uint64_t)1) << 18) #define IB_MPR_COMPMASK_SGIDCOUNT htobe64(((uint64_t)1) << 19) #define IB_MPR_COMPMASK_DGIDCOUNT htobe64(((uint64_t)1) << 20) #define IB_MPR_COMPMASK_SERVICEID_LSB htobe64(((uint64_t)1) << 21) #define IB_SMIR_COMPMASK_LID htobe64(((uint64_t)1) << 0) #define IB_SMIR_COMPMASK_RESV0 htobe64(((uint64_t)1) << 1) #define IB_SMIR_COMPMASK_GUID htobe64(((uint64_t)1) << 2) #define IB_SMIR_COMPMASK_SMKEY htobe64(((uint64_t)1) << 3) #define IB_SMIR_COMPMASK_ACTCOUNT htobe64(((uint64_t)1) << 4) #define IB_SMIR_COMPMASK_PRIORITY htobe64(((uint64_t)1) << 5) #define IB_SMIR_COMPMASK_SMSTATE htobe64(((uint64_t)1) << 6) #define IB_IIR_COMPMASK_SUBSCRIBERGID htobe64(((uint64_t)1) << 0) #define IB_IIR_COMPMASK_ENUM htobe64(((uint64_t)1) << 1) #define IB_IIR_COMPMASK_RESV0 htobe64(((uint64_t)1) << 2) #define IB_IIR_COMPMASK_GID htobe64(((uint64_t)1) << 3) #define IB_IIR_COMPMASK_LIDRANGEBEGIN htobe64(((uint64_t)1) << 4) #define IB_IIR_COMPMASK_LIDRANGEEND htobe64(((uint64_t)1) << 5) #define IB_IIR_COMPMASK_RESV1 htobe64(((uint64_t)1) << 6) #define IB_IIR_COMPMASK_ISGENERIC htobe64(((uint64_t)1) << 7) #define IB_IIR_COMPMASK_SUBSCRIBE htobe64(((uint64_t)1) << 8) #define IB_IIR_COMPMASK_TYPE htobe64(((uint64_t)1) << 9) #define IB_IIR_COMPMASK_TRAPNUMB htobe64(((uint64_t)1) << 10) #define IB_IIR_COMPMASK_DEVICEID htobe64(((uint64_t)1) << 10) #define IB_IIR_COMPMASK_QPN htobe64(((uint64_t)1) << 11) #define IB_IIR_COMPMASK_RESV2 htobe64(((uint64_t)1) << 12) #define IB_IIR_COMPMASK_RESPTIME htobe64(((uint64_t)1) << 13) #define IB_IIR_COMPMASK_RESV3 htobe64(((uint64_t)1) << 14) #define IB_IIR_COMPMASK_PRODTYPE htobe64(((uint64_t)1) << 15) #define IB_IIR_COMPMASK_VENDID htobe64(((uint64_t)1) << 15) #define IB_CLASS_CAP_TRAP 0x0001 #define IB_CLASS_CAP_GETSET 0x0002 #define IB_CLASS_CAP_CAPMASK2 0x0004 #define IB_CLASS_ENH_PORT0_CC_MASK 0x0100 #define IB_CLASS_RESP_TIME_MASK 0x1F #define IB_CLASS_CAPMASK2_SHIFT 5 typedef struct { uint8_t base_ver; uint8_t class_ver; __be16 cap_mask; __be32 cap_mask2_resp_time; ib_gid_t redir_gid; __be32 redir_tc_sl_fl; __be16 redir_lid; __be16 redir_pkey; __be32 redir_qp; __be32 redir_qkey; ib_gid_t trap_gid; __be32 trap_tc_sl_fl; __be16 trap_lid; __be16 trap_pkey; __be32 trap_hop_qp; __be32 trap_qkey; } __attribute__((packed)) ib_class_port_info_t; #define IB_PM_ALL_PORT_SELECT htobe16(1 << 8) #define IB_PM_EXT_WIDTH_SUPPORTED htobe16(1 << 9) #define IB_PM_EXT_WIDTH_NOIETF_SUP htobe16(1 << 10) #define IB_PM_SAMPLES_ONLY_SUP htobe16(1 << 11) #define IB_PM_PC_XMIT_WAIT_SUP htobe16(1 << 12) #define IS_PM_INH_LMTD_PKEY_MC_CONSTR_ERR htobe16(1 << 13) #define IS_PM_RSFEC_COUNTERS_SUP htobe16(1 << 14) #define IB_PM_IS_QP1_DROP_SUP htobe16(1 << 15) #define IB_PM_IS_PM_KEY_SUPPORTED htobe32(1 << 0) #define IB_PM_IS_ADDL_PORT_CTRS_EXT_SUP htobe32(1 << 1) typedef struct { __be64 guid; __be64 sm_key; __be32 act_count; uint8_t pri_state; } __attribute__((packed)) ib_sm_info_t; typedef struct { uint8_t base_ver; uint8_t mgmt_class; uint8_t class_ver; uint8_t method; __be16 status; __be16 class_spec; __be64 trans_id; __be16 attr_id; __be16 resv; __be32 attr_mod; } __attribute__((packed)) ib_mad_t; typedef struct { ib_mad_t common_hdr; uint8_t rmpp_version; uint8_t rmpp_type; uint8_t rmpp_flags; uint8_t rmpp_status; __be32 seg_num; __be32 paylen_newwin; } __attribute__((packed)) ib_rmpp_mad_t; #define IB_RMPP_TYPE_DATA 1 #define IB_RMPP_TYPE_ACK 2 #define IB_RMPP_TYPE_STOP 3 #define IB_RMPP_TYPE_ABORT 4 #define IB_RMPP_NO_RESP_TIME 0x1F #define 
IB_RMPP_FLAG_ACTIVE 0x01 #define IB_RMPP_FLAG_FIRST 0x02 #define IB_RMPP_FLAG_LAST 0x04 #define IB_RMPP_STATUS_SUCCESS 0 #define IB_RMPP_STATUS_RESX 1 #define IB_RMPP_STATUS_T2L 118 #define IB_RMPP_STATUS_BAD_LEN 119 #define IB_RMPP_STATUS_BAD_SEG 120 #define IB_RMPP_STATUS_BADT 121 #define IB_RMPP_STATUS_W2S 122 #define IB_RMPP_STATUS_S2B 123 #define IB_RMPP_STATUS_BAD_STATUS 124 #define IB_RMPP_STATUS_UNV 125 #define IB_RMPP_STATUS_TMR 126 #define IB_RMPP_STATUS_UNSPEC 127 #define IB_SMP_DIRECTION_HO 0x8000 #define IB_SMP_DIRECTION htobe16(IB_SMP_DIRECTION_HO) #define IB_SMP_STATUS_MASK_HO 0x7FFF #define IB_SMP_STATUS_MASK htobe16(IB_SMP_STATUS_MASK_HO) #define IB_SMP_DATA_SIZE 64 typedef struct { uint8_t base_ver; uint8_t mgmt_class; uint8_t class_ver; uint8_t method; __be16 status; uint8_t hop_ptr; uint8_t hop_count; __be64 trans_id; __be16 attr_id; __be16 resv; __be32 attr_mod; __be64 m_key; __be16 dr_slid; __be16 dr_dlid; uint32_t resv1[7]; uint8_t data[IB_SMP_DATA_SIZE]; uint8_t initial_path[IB_SUBNET_PATH_HOPS_MAX]; uint8_t return_path[IB_SUBNET_PATH_HOPS_MAX]; } __attribute__((packed)) ib_smp_t; typedef struct { uint8_t base_version; uint8_t class_version; uint8_t node_type; uint8_t num_ports; __be64 sys_guid; __be64 node_guid; __be64 port_guid; __be16 partition_cap; __be16 device_id; __be32 revision; __be32 port_num_vendor_id; } __attribute__((packed)) ib_node_info_t; #define IB_SA_DATA_SIZE 200 typedef struct { uint8_t base_ver; uint8_t mgmt_class; uint8_t class_ver; uint8_t method; __be16 status; __be16 resv; __be64 trans_id; __be16 attr_id; __be16 resv1; __be32 attr_mod; uint8_t rmpp_version; uint8_t rmpp_type; uint8_t rmpp_flags; uint8_t rmpp_status; __be32 seg_num; __be32 paylen_newwin; __be64 sm_key; __be16 attr_offset; __be16 resv3; __be64 comp_mask; uint8_t data[IB_SA_DATA_SIZE]; } __attribute__((packed)) ib_sa_mad_t; #define IB_NODE_INFO_PORT_NUM_MASK htobe32(0xFF000000) #define IB_NODE_INFO_VEND_ID_MASK htobe32(0x00FFFFFF) #define IB_NODE_DESCRIPTION_SIZE 64 typedef struct { // Node String is an array of UTF-8 characters // that describe the node in text format // Note that this string is NOT NULL TERMINATED! 
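	/*
	 * Hedged usage sketch: because the field is not null-terminated,
	 * bound any read of it, e.g.
	 *	printf("%.*s\n", IB_NODE_DESCRIPTION_SIZE,
	 *	       (const char *)nd.description);
	 * where "nd" is an ib_node_desc_t filled in by the caller (the
	 * variable name is illustrative, not part of this header).
	 */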
uint8_t description[IB_NODE_DESCRIPTION_SIZE]; } __attribute__((packed)) ib_node_desc_t; typedef struct { __be16 lid; __be16 resv; ib_node_info_t node_info; ib_node_desc_t node_desc; uint8_t pad[4]; } __attribute__((packed)) ib_node_record_t; typedef struct { __be64 m_key; __be64 subnet_prefix; __be16 base_lid; __be16 master_sm_base_lid; __be32 capability_mask; __be16 diag_code; __be16 m_key_lease_period; uint8_t local_port_num; uint8_t link_width_enabled; uint8_t link_width_supported; uint8_t link_width_active; uint8_t state_info1; /* LinkSpeedSupported and PortState */ uint8_t state_info2; /* PortPhysState and LinkDownDefaultState */ uint8_t mkey_lmc; /* M_KeyProtectBits and LMC */ uint8_t link_speed; /* LinkSpeedEnabled and LinkSpeedActive */ uint8_t mtu_smsl; uint8_t vl_cap; /* VLCap and InitType */ uint8_t vl_high_limit; uint8_t vl_arb_high_cap; uint8_t vl_arb_low_cap; uint8_t mtu_cap; uint8_t vl_stall_life; uint8_t vl_enforce; __be16 m_key_violations; __be16 p_key_violations; __be16 q_key_violations; uint8_t guid_cap; uint8_t subnet_timeout; /* cli_rereg(1b), mcast_pkey_trap_suppr(2b), timeout(5b) */ uint8_t resp_time_value; /* reserv(3b), rtv(5b) */ uint8_t error_threshold; /* local phy errors(4b), overrun errors(4b) */ __be16 max_credit_hint; __be32 link_rt_latency; /* reserv(8b), link round trip lat(24b) */ __be16 capability_mask2; uint8_t link_speed_ext; /* LinkSpeedExtActive and LinkSpeedExtSupported */ uint8_t link_speed_ext_enabled; /* reserv(3b), LinkSpeedExtEnabled(5b) */ } __attribute__((packed)) ib_port_info_t; #define IB_PORT_STATE_MASK 0x0F #define IB_PORT_LMC_MASK 0x07 #define IB_PORT_LMC_MAX 0x07 #define IB_PORT_MPB_MASK 0xC0 #define IB_PORT_MPB_SHIFT 6 #define IB_PORT_LINK_SPEED_SHIFT 4 #define IB_PORT_LINK_SPEED_SUPPORTED_MASK 0xF0 #define IB_PORT_LINK_SPEED_ACTIVE_MASK 0xF0 #define IB_PORT_LINK_SPEED_ENABLED_MASK 0x0F #define IB_PORT_PHYS_STATE_MASK 0xF0 #define IB_PORT_PHYS_STATE_SHIFT 4 #define IB_PORT_PHYS_STATE_NO_CHANGE 0 #define IB_PORT_PHYS_STATE_SLEEP 1 #define IB_PORT_PHYS_STATE_POLLING 2 #define IB_PORT_PHYS_STATE_DISABLED 3 #define IB_PORT_PHYS_STATE_PORTCONFTRAIN 4 #define IB_PORT_PHYS_STATE_LINKUP 5 #define IB_PORT_PHYS_STATE_LINKERRRECOVER 6 #define IB_PORT_PHYS_STATE_PHYTEST 7 #define IB_PORT_LNKDWNDFTSTATE_MASK 0x0F #define IB_PORT_CAP_RESV0 htobe32(0x00000001) #define IB_PORT_CAP_IS_SM htobe32(0x00000002) #define IB_PORT_CAP_HAS_NOTICE htobe32(0x00000004) #define IB_PORT_CAP_HAS_TRAP htobe32(0x00000008) #define IB_PORT_CAP_HAS_IPD htobe32(0x00000010) #define IB_PORT_CAP_HAS_AUTO_MIG htobe32(0x00000020) #define IB_PORT_CAP_HAS_SL_MAP htobe32(0x00000040) #define IB_PORT_CAP_HAS_NV_MKEY htobe32(0x00000080) #define IB_PORT_CAP_HAS_NV_PKEY htobe32(0x00000100) #define IB_PORT_CAP_HAS_LED_INFO htobe32(0x00000200) #define IB_PORT_CAP_SM_DISAB htobe32(0x00000400) #define IB_PORT_CAP_HAS_SYS_IMG_GUID htobe32(0x00000800) #define IB_PORT_CAP_HAS_PKEY_SW_EXT_PORT_TRAP htobe32(0x00001000) #define IB_PORT_CAP_HAS_CABLE_INFO htobe32(0x00002000) #define IB_PORT_CAP_HAS_EXT_SPEEDS htobe32(0x00004000) #define IB_PORT_CAP_HAS_CAP_MASK2 htobe32(0x00008000) #define IB_PORT_CAP_HAS_COM_MGT htobe32(0x00010000) #define IB_PORT_CAP_HAS_SNMP htobe32(0x00020000) #define IB_PORT_CAP_REINIT htobe32(0x00040000) #define IB_PORT_CAP_HAS_DEV_MGT htobe32(0x00080000) #define IB_PORT_CAP_HAS_VEND_CLS htobe32(0x00100000) #define IB_PORT_CAP_HAS_DR_NTC htobe32(0x00200000) #define IB_PORT_CAP_HAS_CAP_NTC htobe32(0x00400000) #define IB_PORT_CAP_HAS_BM htobe32(0x00800000) #define 
IB_PORT_CAP_HAS_LINK_RT_LATENCY htobe32(0x01000000) #define IB_PORT_CAP_HAS_CLIENT_REREG htobe32(0x02000000) #define IB_PORT_CAP_HAS_OTHER_LOCAL_CHANGES_NTC htobe32(0x04000000) #define IB_PORT_CAP_HAS_LINK_SPEED_WIDTH_PAIRS_TBL htobe32(0x08000000) #define IB_PORT_CAP_HAS_VEND_MADS htobe32(0x10000000) #define IB_PORT_CAP_HAS_MCAST_PKEY_TRAP_SUPPRESS htobe32(0x20000000) #define IB_PORT_CAP_HAS_MCAST_FDB_TOP htobe32(0x40000000) #define IB_PORT_CAP_HAS_HIER_INFO htobe32(0x80000000) #define IB_PORT_CAP2_IS_SET_NODE_DESC_SUPPORTED htobe16(0x0001) #define IB_PORT_CAP2_IS_PORT_INFO_EXT_SUPPORTED htobe16(0x0002) #define IB_PORT_CAP2_IS_VIRT_SUPPORTED htobe16(0x0004) #define IB_PORT_CAP2_IS_SWITCH_PORT_STATE_TBL_SUPP htobe16(0x0008) #define IB_PORT_CAP2_IS_LINK_WIDTH_2X_SUPPORTED htobe16(0x0010) #define IB_PORT_CAP2_IS_LINK_SPEED_HDR_SUPPORTED htobe16(0x0020) #define IB_PORT_CAP2_IS_LINK_SPEED_NDR_SUPPORTED htobe16(0x0400) #define IB_PORT_CAP2_IS_EXT_SPEEDS_2_SUPPORTED htobe16(0x0800) #define IB_PORT_CAP2_IS_LINK_SPEED_XDR_SUPPORTED htobe16(0x1000) typedef struct { __be32 cap_mask; __be16 fec_mode_active; __be16 fdr_fec_mode_sup; __be16 fdr_fec_mode_enable; __be16 edr_fec_mode_sup; __be16 edr_fec_mode_enable; __be16 hdr_fec_mode_sup; __be16 hdr_fec_mode_enable; uint8_t reserved[46]; } __attribute__((packed)) ib_port_info_ext_t; #define IB_PORT_EXT_NO_FEC_MODE_ACTIVE 0 #define IB_PORT_EXT_FIRE_CODE_FEC_MODE_ACTIVE htobe16(0x0001) #define IB_PORT_EXT_RS_FEC_MODE_ACTIVE htobe16(0x0002) #define IB_PORT_EXT_LOW_LATENCY_RS_FEC_MODE_ACTIVE htobe16(0x0003) #define IB_PORT_EXT_CAP_IS_FEC_MODE_SUPPORTED htobe32(0x00000001) #define IB_LINK_WIDTH_ACTIVE_1X 1 #define IB_LINK_WIDTH_ACTIVE_4X 2 #define IB_LINK_WIDTH_ACTIVE_8X 4 #define IB_LINK_WIDTH_ACTIVE_12X 8 #define IB_LINK_WIDTH_ACTIVE_2X 16 #define IB_LINK_WIDTH_SET_LWS 255 #define IB_LINK_SPEED_ACTIVE_EXTENDED 0 #define IB_LINK_SPEED_ACTIVE_2_5 1 #define IB_LINK_SPEED_ACTIVE_5 2 #define IB_LINK_SPEED_ACTIVE_10 4 #define IB_LINK_SPEED_SET_LSS 15 #define IB_LINK_SPEED_EXT_ACTIVE_NONE 0 #define IB_LINK_SPEED_EXT_ACTIVE_14 1 #define IB_LINK_SPEED_EXT_ACTIVE_25 2 #define IB_LINK_SPEED_EXT_ACTIVE_50 4 #define IB_LINK_SPEED_EXT_DISABLE 30 #define IB_LINK_SPEED_EXT_SET_LSES 31 #define IB_PATH_RECORD_RATE_2_5_GBS 2 #define IB_PATH_RECORD_RATE_10_GBS 3 #define IB_PATH_RECORD_RATE_30_GBS 4 #define IB_PATH_RECORD_RATE_5_GBS 5 #define IB_PATH_RECORD_RATE_20_GBS 6 #define IB_PATH_RECORD_RATE_40_GBS 7 #define IB_PATH_RECORD_RATE_60_GBS 8 #define IB_PATH_RECORD_RATE_80_GBS 9 #define IB_PATH_RECORD_RATE_120_GBS 10 #define IB_PATH_RECORD_RATE_14_GBS 11 #define IB_PATH_RECORD_RATE_56_GBS 12 #define IB_PATH_RECORD_RATE_112_GBS 13 #define IB_PATH_RECORD_RATE_168_GBS 14 #define IB_PATH_RECORD_RATE_25_GBS 15 #define IB_PATH_RECORD_RATE_100_GBS 16 #define IB_PATH_RECORD_RATE_200_GBS 17 #define IB_PATH_RECORD_RATE_300_GBS 18 #define IB_PATH_RECORD_RATE_28_GBS 19 #define IB_PATH_RECORD_RATE_50_GBS 20 #define IB_PATH_RECORD_RATE_400_GBS 21 #define IB_PATH_RECORD_RATE_600_GBS 22 #define IB_PATH_RECORD_RATE_800_GBS 23 #define IB_PATH_RECORD_RATE_1200_GBS 24 #define FDR10 0x01 typedef struct { uint8_t resvd1[3]; uint8_t state_change_enable; uint8_t resvd2[3]; uint8_t link_speed_supported; uint8_t resvd3[3]; uint8_t link_speed_enabled; uint8_t resvd4[3]; uint8_t link_speed_active; uint8_t resvd5[48]; } __attribute__((packed)) ib_mlnx_ext_port_info_t; typedef struct { __be64 service_id; ib_gid_t service_gid; __be16 service_pkey; __be16 resv; __be32 service_lease; uint8_t service_key[16]; 
uint8_t service_name[64]; uint8_t service_data8[16]; __be16 service_data16[8]; __be32 service_data32[4]; __be64 service_data64[2]; } __attribute__((packed)) ib_service_record_t; typedef struct { __be16 lid; uint8_t port_num; uint8_t options; ib_port_info_t port_info; uint8_t pad[4]; } __attribute__((packed)) ib_portinfo_record_t; typedef struct { __be16 lid; uint8_t port_num; uint8_t options; ib_port_info_ext_t port_info_ext; } __attribute__((packed)) ib_portinfoext_record_t; typedef struct { __be16 from_lid; uint8_t from_port_num; uint8_t to_port_num; __be16 to_lid; uint8_t pad[2]; } __attribute__((packed)) ib_link_record_t; typedef struct { __be16 lid; uint16_t resv0; ib_sm_info_t sm_info; uint8_t pad[7]; } __attribute__((packed)) ib_sminfo_record_t; typedef struct { __be16 lid; __be16 block_num; uint32_t resv0; uint8_t lft[64]; } __attribute__((packed)) ib_lft_record_t; typedef struct { __be16 lid; __be16 position_block_num; uint32_t resv0; __be16 mft[IB_MCAST_BLOCK_SIZE]; } __attribute__((packed)) ib_mft_record_t; typedef struct { __be16 lin_cap; __be16 rand_cap; __be16 mcast_cap; __be16 lin_top; uint8_t def_port; uint8_t def_mcast_pri_port; uint8_t def_mcast_not_port; uint8_t life_state; __be16 lids_per_port; __be16 enforce_cap; uint8_t flags; uint8_t resvd; __be16 mcast_top; } __attribute__((packed)) ib_switch_info_t; typedef struct { __be16 lid; uint16_t resv0; ib_switch_info_t switch_info; } __attribute__((packed)) ib_switch_info_record_t; #define IB_SWITCH_PSC 0x04 #define GUID_TABLE_MAX_ENTRIES 8 typedef struct { __be64 guid[GUID_TABLE_MAX_ENTRIES]; } __attribute__((packed)) ib_guid_info_t; typedef struct { __be16 lid; uint8_t block_num; uint8_t resv; uint32_t reserved; ib_guid_info_t guid_info; } __attribute__((packed)) ib_guidinfo_record_t; #define IB_MULTIPATH_MAX_GIDS 11 typedef struct { __be32 hop_flow_raw; uint8_t tclass; uint8_t num_path; __be16 pkey; __be16 qos_class_sl; uint8_t mtu; uint8_t rate; uint8_t pkt_life; uint8_t service_id_8msb; uint8_t independence; /* formerly resv2 */ uint8_t sgid_count; uint8_t dgid_count; uint8_t service_id_56lsb[7]; ib_gid_t gids[IB_MULTIPATH_MAX_GIDS]; } __attribute__((packed)) ib_multipath_rec_t; #define IB_NUM_PKEY_ELEMENTS_IN_BLOCK 32 typedef struct { __be16 pkey_entry[IB_NUM_PKEY_ELEMENTS_IN_BLOCK]; } ib_pkey_table_t; typedef struct { __be16 lid; // for CA: lid of port, for switch lid of port 0 __be16 block_num; uint8_t port_num; // for switch: port number, for CA: reserved uint8_t reserved1; uint16_t reserved2; ib_pkey_table_t pkey_tbl; } ib_pkey_table_record_t; #define IB_DROP_VL 15 #define IB_MAX_NUM_VLS 16 typedef struct { uint8_t raw_vl_by_sl[IB_MAX_NUM_VLS / 2]; } __attribute__((packed)) ib_slvl_table_t; typedef struct { __be16 lid; // for CA: lid of port, for switch lid of port 0 uint8_t in_port_num; // reserved for CAs uint8_t out_port_num; // reserved for CAs uint32_t resv; ib_slvl_table_t slvl_tbl; } __attribute__((packed)) ib_slvl_table_record_t; typedef struct { uint8_t vl; uint8_t weight; } __attribute__((packed)) ib_vl_arb_element_t; #define IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK 32 typedef struct { ib_vl_arb_element_t vl_entry[IB_NUM_VL_ARB_ELEMENTS_IN_BLOCK]; } __attribute__((packed)) ib_vl_arb_table_t; typedef struct { __be16 lid; // for CA: lid of port, for switch lid of port 0 uint8_t port_num; uint8_t block_num; uint32_t reserved; ib_vl_arb_table_t vl_arb_tbl; } __attribute__((packed)) ib_vl_arb_table_record_t; typedef struct { __be32 ver_class_flow; __be16 resv1; uint8_t resv2; uint8_t hop_limit; ib_gid_t src_gid; 
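	/*
	 * ver_class_flow above packs, most significant bits first, IPVer
	 * (4 bits), TClass (8 bits) and FlowLabel (20 bits), as in the IBA
	 * GRH definition.
	 */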
ib_gid_t dest_gid; } __attribute__((packed)) ib_grh_t; typedef struct { ib_gid_t mgid; ib_gid_t port_gid; __be32 qkey; __be16 mlid; uint8_t mtu; uint8_t tclass; __be16 pkey; uint8_t rate; uint8_t pkt_life; __be32 sl_flow_hop; uint8_t scope_state; uint8_t proxy_join : 1; uint8_t reserved[2]; uint8_t pad[4]; } __attribute__((packed)) ib_member_rec_t; #define IB_MC_REC_STATE_FULL_MEMBER 0x01 #define IB_MC_REC_STATE_NON_MEMBER 0x02 #define IB_MC_REC_STATE_SEND_ONLY_NON_MEMBER 0x04 #define IB_MC_REC_STATE_SEND_ONLY_FULL_MEMBER 0x08 #define IB_NOTICE_TYPE_FATAL 0x00 #define IB_NOTICE_TYPE_URGENT 0x01 #define IB_NOTICE_TYPE_SECURITY 0x02 #define IB_NOTICE_TYPE_SUBN_MGMT 0x03 #define IB_NOTICE_TYPE_INFO 0x04 #define IB_NOTICE_TYPE_EMPTY 0x7F #define SM_GID_IN_SERVICE_TRAP 64 #define SM_GID_OUT_OF_SERVICE_TRAP 65 #define SM_MGID_CREATED_TRAP 66 #define SM_MGID_DESTROYED_TRAP 67 #define SM_UNPATH_TRAP 68 #define SM_REPATH_TRAP 69 #define SM_LINK_STATE_CHANGED_TRAP 128 #define SM_LINK_INTEGRITY_THRESHOLD_TRAP 129 #define SM_BUFFER_OVERRUN_THRESHOLD_TRAP 130 #define SM_WATCHDOG_TIMER_EXPIRED_TRAP 131 #define SM_LOCAL_CHANGES_TRAP 144 #define SM_SYS_IMG_GUID_CHANGED_TRAP 145 #define SM_BAD_MKEY_TRAP 256 #define SM_BAD_PKEY_TRAP 257 #define SM_BAD_QKEY_TRAP 258 #define SM_BAD_SWITCH_PKEY_TRAP 259 typedef struct { uint8_t generic_type; // 1 1 union _notice_g_or_v { struct _notice_generic // 5 6 { uint8_t prod_type_msb; __be16 prod_type_lsb; __be16 trap_num; } __attribute__((packed)) generic; struct _notice_vend { uint8_t vend_id_msb; __be16 vend_id_lsb; __be16 dev_id; } __attribute__((packed)) vend; } g_or_v; __be16 issuer_lid; // 2 8 __be16 toggle_count; // 2 10 union _data_details // 54 64 { struct _raw_data { uint8_t details[54]; } __attribute__((packed)) raw_data; struct _ntc_64_67 { uint8_t res[6]; ib_gid_t gid; // the Node or Multicast Group that came in/out } __attribute__((packed)) ntc_64_67; struct _ntc_128 { __be16 sw_lid; // the sw lid of which link state changed } __attribute__((packed)) ntc_128; struct _ntc_129_131 { __be16 pad; __be16 lid; // lid and port number of the violation uint8_t port_num; } __attribute__((packed)) ntc_129_131; struct _ntc_144 { __be16 pad1; __be16 lid; // lid where change occured uint8_t pad2; // reserved uint8_t local_changes; // 7b reserved 1b local changes __be32 new_cap_mask; // new capability mask __be16 change_flgs; // 10b reserved 6b change flags __be16 cap_mask2; } __attribute__((packed)) ntc_144; struct _ntc_145 { __be16 pad1; __be16 lid; // lid where sys guid changed __be16 pad2; __be64 new_sys_guid; // new system image guid } __attribute__((packed)) ntc_145; struct _ntc_256 { // total: 54 __be16 pad1; // 2 __be16 lid; // 2 __be16 dr_slid; // 2 uint8_t method; // 1 uint8_t pad2; // 1 __be16 attr_id; // 2 __be32 attr_mod; // 4 __be64 mkey; // 8 uint8_t pad3; // 1 uint8_t dr_trunc_hop; // 1 uint8_t dr_rtn_path[30]; // 30 } __attribute__((packed)) ntc_256; struct _ntc_257_258 // violation of p/q_key // 49 { __be16 pad1; // 2 __be16 lid1; // 2 __be16 lid2; // 2 __be32 key; // 4 __be32 qp1; // 4b sl, 4b pad, 24b qp1 __be32 qp2; // 8b pad, 24b qp2 ib_gid_t gid1; // 16 ib_gid_t gid2; // 16 } __attribute__((packed)) ntc_257_258; struct _ntc_259 // pkey violation from switch 51 { __be16 data_valid; // 2 __be16 lid1; // 2 __be16 lid2; // 2 __be16 pkey; // 2 __be32 sl_qp1; // 4b sl, 4b pad, 24b qp1 __be32 qp2; // 8b pad, 24b qp2 ib_gid_t gid1; // 16 ib_gid_t gid2; // 16 __be16 sw_lid; // 2 uint8_t port_no; // 1 } __attribute__((packed)) ntc_259; struct _ntc_bkey_259 
// bkey violation { __be16 lidaddr; uint8_t method; uint8_t reserved; __be16 attribute_id; __be32 attribute_modifier; __be32 qp; // qp is low 24 bits __be64 bkey; ib_gid_t gid; } __attribute__((packed)) ntc_bkey_259; struct _ntc_cckey_0 // CC key violation { __be16 slid; // source LID from offending packet LRH uint8_t method; // method, from common MAD header uint8_t resv0; __be16 attribute_id; // Attribute ID, from common MAD header __be16 resv1; __be32 attribute_modifier; // Attribute Modif, from common MAD header __be32 qp; // 8b pad, 24b dest QP from BTH __be64 cc_key; // CC key of the offending packet ib_gid_t source_gid; // GID from GRH of the offending packet uint8_t padding[14]; // Padding - ignored on read } __attribute__((packed)) ntc_cckey_0; } data_details; ib_gid_t issuer_gid; // 16 80 } __attribute__((packed)) ib_mad_notice_attr_t; #define TRAP_259_MASK_SL htobe32(0xF0000000) #define TRAP_259_MASK_QP htobe32(0x00FFFFFF) #define TRAP_144_MASK_OTHER_LOCAL_CHANGES 0x01 #define TRAP_144_MASK_CAPABILITY_MASK2_CHANGE htobe16(0x0020) #define TRAP_144_MASK_HIERARCHY_INFO_CHANGE htobe16(0x0010) #define TRAP_144_MASK_SM_PRIORITY_CHANGE htobe16(0x0008) #define TRAP_144_MASK_LINK_SPEED_ENABLE_CHANGE htobe16(0x0004) #define TRAP_144_MASK_LINK_WIDTH_ENABLE_CHANGE htobe16(0x0002) #define TRAP_144_MASK_NODE_DESCRIPTION_CHANGE htobe16(0x0001) typedef struct { ib_gid_t gid; __be16 lid_range_begin; __be16 lid_range_end; __be16 reserved1; uint8_t is_generic; uint8_t subscribe; __be16 trap_type; union _inform_g_or_v { struct _inform_generic { __be16 trap_num; __be32 qpn_resp_time_val; uint8_t reserved2; uint8_t node_type_msb; __be16 node_type_lsb; } __attribute__((packed)) generic; struct _inform_vend { __be16 dev_id; __be32 qpn_resp_time_val; uint8_t reserved2; uint8_t vendor_id_msb; __be16 vendor_id_lsb; } __attribute__((packed)) vend; } __attribute__((packed)) g_or_v; } __attribute__((packed)) ib_inform_info_t; typedef struct { ib_gid_t subscriber_gid; __be16 subscriber_enum; uint8_t reserved[6]; ib_inform_info_t inform_info; uint8_t pad[4]; } __attribute__((packed)) ib_inform_info_record_t; typedef struct { ib_mad_t header; uint8_t resv[40]; #define IB_PM_DATA_SIZE 192 uint8_t data[IB_PM_DATA_SIZE]; } __attribute__((packed)) ib_perfmgt_mad_t; typedef struct { uint8_t reserved; uint8_t port_select; __be16 counter_select; __be16 symbol_err_cnt; uint8_t link_err_recover; uint8_t link_downed; __be16 rcv_err; __be16 rcv_rem_phys_err; __be16 rcv_switch_relay_err; __be16 xmit_discards; uint8_t xmit_constraint_err; uint8_t rcv_constraint_err; uint8_t counter_select2; uint8_t link_int_buffer_overrun; __be16 qp1_dropped; __be16 vl15_dropped; __be32 xmit_data; __be32 rcv_data; __be32 xmit_pkts; __be32 rcv_pkts; __be32 xmit_wait; } __attribute__((packed)) ib_port_counters_t; typedef struct { uint8_t reserved; uint8_t port_select; __be16 counter_select; __be32 counter_select2; __be64 xmit_data; __be64 rcv_data; __be64 xmit_pkts; __be64 rcv_pkts; __be64 unicast_xmit_pkts; __be64 unicast_rcv_pkts; __be64 multicast_xmit_pkts; __be64 multicast_rcv_pkts; __be64 symbol_err_cnt; __be64 link_err_recover; __be64 link_downed; __be64 rcv_err; __be64 rcv_rem_phys_err; __be64 rcv_switch_relay_err; __be64 xmit_discards; __be64 xmit_constraint_err; __be64 rcv_constraint_err; __be64 link_integrity_err; __be64 buffer_overrun; __be64 vl15_dropped; __be64 xmit_wait; __be64 qp1_dropped; } __attribute__((packed)) ib_port_counters_ext_t; typedef struct { uint8_t op_code; uint8_t port_select; uint8_t tick; uint8_t 
counter_width; /* 5 bits res : 3bits counter_width */ __be32 counter_mask; /* 2 bits res : 3 bits counter_mask : 27 bits counter_masks_1to9 */ __be16 counter_mask_10to14; /* 1 bits res : 15 bits counter_masks_10to14 */ uint8_t sample_mech; uint8_t sample_status; /* 6 bits res : 2 bits sample_status */ __be64 option_mask; __be64 vendor_mask; __be32 sample_start; __be32 sample_interval; __be16 tag; __be16 counter_select0; __be16 counter_select1; __be16 counter_select2; __be16 counter_select3; __be16 counter_select4; __be16 counter_select5; __be16 counter_select6; __be16 counter_select7; __be16 counter_select8; __be16 counter_select9; __be16 counter_select10; __be16 counter_select11; __be16 counter_select12; __be16 counter_select13; __be16 counter_select14; } __attribute__((packed)) ib_port_samples_control_t; #define IB_CS_PORT_XMIT_DATA htobe16(0x0001) #define IB_CS_PORT_RCV_DATA htobe16(0x0002) #define IB_CS_PORT_XMIT_PKTS htobe16(0x0003) #define IB_CS_PORT_RCV_PKTS htobe16(0x0004) #define IB_CS_PORT_XMIT_WAIT htobe16(0x0005) typedef struct { __be16 tag; __be16 sample_status; /* 14 bits res : 2 bits sample_status */ __be32 counter0; __be32 counter1; __be32 counter2; __be32 counter3; __be32 counter4; __be32 counter5; __be32 counter6; __be32 counter7; __be32 counter8; __be32 counter9; __be32 counter10; __be32 counter11; __be32 counter12; __be32 counter13; __be32 counter14; } __attribute__((packed)) ib_port_samples_result_t; typedef struct { uint8_t reserved; uint8_t port_select; __be16 counter_select; __be32 port_xmit_data_sl[16]; uint8_t resv[124]; } __attribute__((packed)) ib_port_xmit_data_sl_t; typedef struct { uint8_t reserved; uint8_t port_select; __be16 counter_select; __be32 port_rcv_data_sl[16]; uint8_t resv[124]; } __attribute__((packed)) ib_port_rcv_data_sl_t; typedef struct { ib_mad_t header; uint8_t resv[40]; #define IB_DM_DATA_SIZE 192 uint8_t data[IB_DM_DATA_SIZE]; } __attribute__((packed)) ib_dm_mad_t; typedef struct { __be16 change_id; uint8_t max_controllers; uint8_t diag_rom; #define IB_DM_CTRL_LIST_SIZE 128 uint8_t controller_list[IB_DM_CTRL_LIST_SIZE]; #define IOC_NOT_INSTALLED 0x0 #define IOC_INSTALLED 0x1 // Reserved values 0x02-0xE #define SLOT_DOES_NOT_EXIST 0xF } __attribute__((packed)) ib_iou_info_t; typedef struct { __be64 ioc_guid; __be32 vend_id; __be32 dev_id; __be16 dev_ver; __be16 resv2; __be32 subsys_vend_id; __be32 subsys_id; __be16 io_class; __be16 io_subclass; __be16 protocol; __be16 protocol_ver; __be32 resv3; __be16 send_msg_depth; uint8_t resv4; uint8_t rdma_read_depth; __be32 send_msg_size; __be32 rdma_size; uint8_t ctrl_ops_cap; #define CTRL_OPS_CAP_ST 0x01 #define CTRL_OPS_CAP_SF 0x02 #define CTRL_OPS_CAP_RT 0x04 #define CTRL_OPS_CAP_RF 0x08 #define CTRL_OPS_CAP_WT 0x10 #define CTRL_OPS_CAP_WF 0x20 #define CTRL_OPS_CAP_AT 0x40 #define CTRL_OPS_CAP_AF 0x80 uint8_t resv5; uint8_t num_svc_entries; #define MAX_NUM_SVC_ENTRIES 0xff uint8_t resv6[9]; #define CTRL_ID_STRING_LEN 64 char id_string[CTRL_ID_STRING_LEN]; } __attribute__((packed)) ib_ioc_profile_t; typedef struct { #define MAX_SVC_ENTRY_NAME_LEN 40 char name[MAX_SVC_ENTRY_NAME_LEN]; __be64 id; } __attribute__((packed)) ib_svc_entry_t; typedef struct { #define SVC_ENTRY_COUNT 4 ib_svc_entry_t service_entry[SVC_ENTRY_COUNT]; } __attribute__((packed)) ib_svc_entries_t; typedef struct { __be64 module_guid; __be64 iou_guid; ib_ioc_profile_t ioc_profile; __be64 access_key; uint16_t initiators_conf; uint8_t resv[38]; } __attribute__((packed)) ib_ioc_info_t; typedef struct { bool cm; bool snmp; bool 
typedef struct {
        bool cm;
        bool snmp;
        bool dev_mgmt;
        bool vend;
        bool sm;
        bool sm_disable;
        bool qkey_ctr;
        bool pkey_ctr;
        bool notice;
        bool trap;
        bool apm;
        bool slmap;
        bool pkey_nvram;
        bool mkey_nvram;
        bool sysguid;
        bool dr_notice;
        bool boot_mgmt;
        bool capm_notice;
        bool reinit;
        bool ledinfo;
        bool port_active;
} ib_port_cap_t;

#define IB_INIT_TYPE_NO_LOAD 0x01
#define IB_INIT_TYPE_PRESERVE_CONTENT 0x02
#define IB_INIT_TYPE_PRESERVE_PRESENCE 0x04
#define IB_INIT_TYPE_DO_NOT_RESUSCITATE 0x08

typedef struct {
        uint8_t port_num;
        uint8_t sl;
        __be16 dlid;
        bool grh_valid;
        ib_grh_t grh;
        uint8_t static_rate;
        uint8_t path_bits;
        struct _av_conn {
                uint8_t path_mtu;
                uint8_t local_ack_timeout;
                uint8_t seq_err_retry_cnt;
                uint8_t rnr_retry_cnt;
        } conn;
} ib_av_attr_t;

#define IB_AC_RDMA_READ 0x00000001
#define IB_AC_RDMA_WRITE 0x00000002
#define IB_AC_ATOMIC 0x00000004
#define IB_AC_LOCAL_WRITE 0x00000008
#define IB_AC_MW_BIND 0x00000010

#define IB_QPS_RESET 0x00000001
#define IB_QPS_INIT 0x00000002
#define IB_QPS_RTR 0x00000004
#define IB_QPS_RTS 0x00000008
#define IB_QPS_SQD 0x00000010
#define IB_QPS_SQD_DRAINING 0x00000030
#define IB_QPS_SQD_DRAINED 0x00000050
#define IB_QPS_SQERR 0x00000080
#define IB_QPS_ERROR 0x00000100
#define IB_QPS_TIME_WAIT 0xDEAD0000

#define IB_MOD_QP_ALTERNATE_AV 0x00000001
#define IB_MOD_QP_PKEY 0x00000002
#define IB_MOD_QP_APM_STATE 0x00000004
#define IB_MOD_QP_PRIMARY_AV 0x00000008
#define IB_MOD_QP_RNR_NAK_TIMEOUT 0x00000010
#define IB_MOD_QP_RESP_RES 0x00000020
#define IB_MOD_QP_INIT_DEPTH 0x00000040
#define IB_MOD_QP_PRIMARY_PORT 0x00000080
#define IB_MOD_QP_ACCESS_CTRL 0x00000100
#define IB_MOD_QP_QKEY 0x00000200
#define IB_MOD_QP_SQ_DEPTH 0x00000400
#define IB_MOD_QP_RQ_DEPTH 0x00000800
#define IB_MOD_QP_CURRENT_STATE 0x00001000
#define IB_MOD_QP_RETRY_CNT 0x00002000
#define IB_MOD_QP_LOCAL_ACK_TIMEOUT 0x00004000
#define IB_MOD_QP_RNR_RETRY_CNT 0x00008000

#define IB_MOD_EEC_ALTERNATE_AV 0x00000001
#define IB_MOD_EEC_PKEY 0x00000002
#define IB_MOD_EEC_APM_STATE 0x00000004
#define IB_MOD_EEC_PRIMARY_AV 0x00000008
#define IB_MOD_EEC_RNR 0x00000010
#define IB_MOD_EEC_RESP_RES 0x00000020
#define IB_MOD_EEC_OUTSTANDING 0x00000040
#define IB_MOD_EEC_PRIMARY_PORT 0x00000080

#define IB_SEND_OPT_IMMEDIATE 0x00000001
#define IB_SEND_OPT_FENCE 0x00000002
#define IB_SEND_OPT_SIGNALED 0x00000004
#define IB_SEND_OPT_SOLICITED 0x00000008
#define IB_SEND_OPT_INLINE 0x00000010
#define IB_SEND_OPT_LOCAL 0x00000020
#define IB_SEND_OPT_VEND_MASK 0xFFFF0000

#define IB_RECV_OPT_IMMEDIATE 0x00000001
#define IB_RECV_OPT_FORWARD 0x00000002
#define IB_RECV_OPT_GRH_VALID 0x00000004
#define IB_RECV_OPT_VEND_MASK 0xFFFF0000

#define IB_CA_MOD_IS_CM_SUPPORTED 0x00000001
#define IB_CA_MOD_IS_SNMP_SUPPORTED 0x00000002
#define IB_CA_MOD_IS_DEV_MGMT_SUPPORTED 0x00000004
#define IB_CA_MOD_IS_VEND_SUPPORTED 0x00000008
#define IB_CA_MOD_IS_SM 0x00000010
#define IB_CA_MOD_IS_SM_DISABLED 0x00000020
#define IB_CA_MOD_QKEY_CTR 0x00000040
#define IB_CA_MOD_PKEY_CTR 0x00000080
#define IB_CA_MOD_IS_NOTICE_SUPPORTED 0x00000100
#define IB_CA_MOD_IS_TRAP_SUPPORTED 0x00000200
#define IB_CA_MOD_IS_APM_SUPPORTED 0x00000400
#define IB_CA_MOD_IS_SLMAP_SUPPORTED 0x00000800
#define IB_CA_MOD_IS_PKEY_NVRAM_SUPPORTED 0x00001000
#define IB_CA_MOD_IS_MKEY_NVRAM_SUPPORTED 0x00002000
#define IB_CA_MOD_IS_SYSGUID_SUPPORTED 0x00004000
#define IB_CA_MOD_IS_DR_NOTICE_SUPPORTED 0x00008000
#define IB_CA_MOD_IS_BOOT_MGMT_SUPPORTED 0x00010000
#define IB_CA_MOD_IS_CAPM_NOTICE_SUPPORTED 0x00020000
#define IB_CA_MOD_IS_REINIT_SUPORTED 0x00040000
#define IB_CA_MOD_IS_LEDINFO_SUPPORTED 0x00080000
#define IB_CA_MOD_SHUTDOWN_PORT 0x00100000
#define IB_CA_MOD_INIT_TYPE_VALUE 0x00200000
#define IB_CA_MOD_SYSTEM_IMAGE_GUID 0x00400000

#define IB_MR_MOD_ADDR 0x00000001
#define IB_MR_MOD_PD 0x00000002
#define IB_MR_MOD_ACCESS 0x00000004

#define IB_SMINFO_ATTR_MOD_HANDOVER htobe32(0x000001)
#define IB_SMINFO_ATTR_MOD_ACKNOWLEDGE htobe32(0x000002)
#define IB_SMINFO_ATTR_MOD_DISABLE htobe32(0x000003)
#define IB_SMINFO_ATTR_MOD_STANDBY htobe32(0x000004)
#define IB_SMINFO_ATTR_MOD_DISCOVER htobe32(0x000005)

#define IB_CC_LOG_DATA_SIZE 32
#define IB_CC_MGT_DATA_SIZE 192

typedef struct {
        ib_mad_t header;
        __be64 cc_key;
        uint8_t log_data[IB_CC_LOG_DATA_SIZE];
        uint8_t mgt_data[IB_CC_MGT_DATA_SIZE];
} __attribute__((packed)) ib_cc_mad_t;

typedef struct {
        uint8_t cong_info;
        uint8_t resv;
        uint8_t ctrl_table_cap;
} __attribute__((packed)) ib_cong_info_t;

typedef struct {
        __be64 cc_key;
        __be16 protect_bit;
        __be16 lease_period;
        __be16 violations;
} __attribute__((packed)) ib_cong_key_info_t;

typedef struct {
        __be16 slid;
        __be16 dlid;
        __be32 sl;
        __be32 time_stamp;
} __attribute__((packed)) ib_cong_log_event_sw_t;

typedef struct {
        __be32 local_qp_resv0;
        __be32 remote_qp_sl_service_type;
        __be16 remote_lid;
        __be16 resv1;
        __be32 time_stamp;
} __attribute__((packed)) ib_cong_log_event_ca_t;

typedef struct {
        uint8_t log_type;
        union _log_details {
                struct _log_sw {
                        uint8_t cong_flags;
                        __be16 event_counter;
                        __be32 time_stamp;
                        uint8_t port_map[32];
                        ib_cong_log_event_sw_t entry_list[15];
                } __attribute__((packed)) log_sw;
                struct _log_ca {
                        uint8_t cong_flags;
                        __be16 event_counter;
                        __be16 event_map;
                        __be16 resv;
                        __be32 time_stamp;
                        ib_cong_log_event_ca_t log_event[13];
                } __attribute__((packed)) log_ca;
        } log_details;
} __attribute__((packed)) ib_cong_log_t;

#define IB_CC_PORT_MASK_DATA_SIZE 32
typedef struct {
        __be32 control_map;
        uint8_t victim_mask[IB_CC_PORT_MASK_DATA_SIZE];
        uint8_t credit_mask[IB_CC_PORT_MASK_DATA_SIZE];
        uint8_t threshold_resv;
        uint8_t packet_size;
        __be16 cs_threshold_resv;
        __be16 cs_return_delay;
        __be16 marking_rate;
} __attribute__((packed)) ib_sw_cong_setting_t;

typedef struct {
        uint8_t valid_ctrl_type_res_threshold;
        uint8_t packet_size;
        __be16 cong_param;
} __attribute__((packed)) ib_sw_port_cong_setting_element_t;

#define IB_CC_SW_PORT_SETTING_ELEMENTS 32
typedef struct {
        ib_sw_port_cong_setting_element_t block[IB_CC_SW_PORT_SETTING_ELEMENTS];
} __attribute__((packed)) ib_sw_port_cong_setting_t;

typedef struct {
        __be16 ccti_timer;
        uint8_t ccti_increase;
        uint8_t trigger_threshold;
        uint8_t ccti_min;
        uint8_t resv0;
        __be16 resv1;
} __attribute__((packed)) ib_ca_cong_entry_t;

#define IB_CA_CONG_ENTRY_DATA_SIZE 16
typedef struct {
        __be16 port_control;
        __be16 control_map;
        ib_ca_cong_entry_t entry_list[IB_CA_CONG_ENTRY_DATA_SIZE];
} __attribute__((packed)) ib_ca_cong_setting_t;

typedef struct {
        __be16 shift_multiplier;
} __attribute__((packed)) ib_cc_tbl_entry_t;

#define IB_CC_TBL_ENTRY_LIST_MAX 64
typedef struct {
        __be16 ccti_limit;
        __be16 resv;
        ib_cc_tbl_entry_t entry_list[IB_CC_TBL_ENTRY_LIST_MAX];
} __attribute__((packed)) ib_cc_tbl_t;

typedef struct {
        __be32 value;
} __attribute__((packed)) ib_time_stamp_t;
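/*
 * Each ib_cc_tbl_entry_t above packs an IBA CongestionControlTable entry
 * into a single big-endian 16-bit word: the top 2 bits carry the CCT shift
 * and the low 14 bits the CCT multiplier (matching the CCT_SHIFT/
 * CCT_MULTIPLIER fields dumped by libibmad). A minimal sketch of composing
 * such an entry; the helper name is hypothetical, not part of this header:
 */
static inline __be16 cc_tbl_entry_pack(uint8_t shift, uint16_t multiplier)
{
        /* 2-bit shift in bits 15:14, 14-bit multiplier in bits 13:0 */
        return htobe16(((uint16_t)(shift & 0x3) << 14) | (multiplier & 0x3fff));
}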
#define IB_PM_PC_XMIT_WAIT_SUP htobe16(1 << 12)
#define IS_PM_RSFEC_COUNTERS_SUP htobe16(1 << 14)
#define IB_PM_IS_QP1_DROP_SUP htobe16(1 << 15)
#define IB_PM_IS_ADDL_PORT_CTRS_EXT_SUP htobe32(1 << 1)

#define IB_PORT_CAP2_IS_PORT_INFO_EXT_SUPPORTED htobe16(0x0002)
#define IB_PORT_EXT_NO_FEC_MODE_ACTIVE 0
#define IB_PORT_EXT_FIRE_CODE_FEC_MODE_ACTIVE htobe16(0x0001)
#define IB_PORT_EXT_RS_FEC_MODE_ACTIVE htobe16(0x0002)
#define IB_PORT_EXT_LOW_LATENCY_RS_FEC_MODE_ACTIVE htobe16(0x0003)
#define IB_PORT_EXT_RS_FEC2_MODE_ACTIVE htobe16(0x0004)
#define IB_PORT_EXT_CAP_IS_FEC_MODE_SUPPORTED htobe32(0x00000001)

static inline uint32_t ib_class_cap_mask2(const ib_class_port_info_t *p_cpi)
{
        return (be32toh(p_cpi->cap_mask2_resp_time) >> IB_CLASS_CAPMASK2_SHIFT);
}

static inline uint8_t ib_class_resp_time_val(ib_class_port_info_t *p_cpi)
{
        return (uint8_t)(be32toh(p_cpi->cap_mask2_resp_time) &
                         IB_CLASS_RESP_TIME_MASK);
}

static inline const char *ib_get_node_type_str(uint8_t node_type)
{
        static const char *const __ib_node_type_str[] = {
                "UNKNOWN",
                "Channel Adapter",
                "Switch",
                "Router",
        };

        if (node_type > IB_NODE_TYPE_ROUTER)
                node_type = 0;
        return (__ib_node_type_str[node_type]);
}

static inline __be32 ib_inform_info_get_prod_type(const ib_inform_info_t *p_inf)
{
        uint32_t nt;

        nt = be16toh(p_inf->g_or_v.generic.node_type_lsb) |
             (p_inf->g_or_v.generic.node_type_msb << 16);
        return htobe32(nt);
}

static inline void ib_inform_info_get_qpn_resp_time(const __be32 qpn_resp_time_val,
                                                    __be32 *p_qpn,
                                                    uint8_t *p_resp_time_val)
{
        uint32_t tmp = be32toh(qpn_resp_time_val);

        if (p_qpn)
                *p_qpn = htobe32((tmp & 0xffffff00) >> 8);
        if (p_resp_time_val)
                *p_resp_time_val = (uint8_t)(tmp & 0x0000001f);
}

static inline void ib_member_get_scope_state(const uint8_t scope_state,
                                             uint8_t *p_scope, uint8_t *p_state)
{
        uint8_t tmp_scope_state;

        if (p_state)
                *p_state = (uint8_t)(scope_state & 0x0f);
        tmp_scope_state = scope_state >> 4;
        if (p_scope)
                *p_scope = (uint8_t)(tmp_scope_state & 0x0f);
}

static inline void ib_member_get_sl_flow_hop(const __be32 sl_flow_hop,
                                             uint8_t *p_sl, uint32_t *p_flow_lbl,
                                             uint8_t *p_hop)
{
        uint32_t tmp;

        tmp = be32toh(sl_flow_hop);
        if (p_hop)
                *p_hop = (uint8_t)tmp;
        tmp >>= 8;
        if (p_flow_lbl)
                *p_flow_lbl = (uint32_t)(tmp & 0xfffff);
        tmp >>= 20;
        if (p_sl)
                *p_sl = (uint8_t)tmp;
}

static inline __be32 ib_member_set_sl_flow_hop(const uint8_t sl,
                                               const uint32_t flow_label,
                                               const uint8_t hop_limit)
{
        uint32_t tmp;

        tmp = (sl << 28) | ((flow_label & 0xfffff) << 8) | hop_limit;
        return htobe32(tmp);
}

static inline __be32 ib_node_info_get_vendor_id(const ib_node_info_t *p_ni)
{
        return ((__be32)(p_ni->port_num_vendor_id & IB_NODE_INFO_VEND_ID_MASK));
}

static inline uint8_t ib_node_info_get_local_port_num(const ib_node_info_t *p_ni)
{
        return be32toh(p_ni->port_num_vendor_id & IB_NODE_INFO_PORT_NUM_MASK) >> 24;
}

static inline uint16_t ib_path_rec_qos_class(const ib_path_rec_t *p_rec)
{
        return (be16toh(p_rec->qos_class_sl) >> 4);
}

static inline void ib_path_rec_set_qos_class(ib_path_rec_t *p_rec,
                                             const uint16_t qos_class)
{
        p_rec->qos_class_sl =
            (p_rec->qos_class_sl & htobe16(IB_PATH_REC_SL_MASK)) |
            htobe16(qos_class << 4);
}

static inline uint8_t ib_path_rec_sl(const ib_path_rec_t *p_rec)
{
        return (uint8_t)(be16toh(p_rec->qos_class_sl) & IB_PATH_REC_SL_MASK);
}

static inline uint8_t ib_slvl_table_get(const ib_slvl_table_t *p_slvl_tbl,
                                        uint8_t sl_index)
{
        uint8_t idx = sl_index / 2;

        assert(sl_index <= 15);
        if (sl_index % 2)
                /* this is an odd sl. Need to return the ls bits. */
                return (p_slvl_tbl->raw_vl_by_sl[idx] & 0x0F);
        else
                /* this is an even sl. Need to return the ms bits. */
                return ((p_slvl_tbl->raw_vl_by_sl[idx] & 0xF0) >> 4);
}

static inline uint8_t ib_sminfo_get_priority(const ib_sm_info_t *p_smi)
{
        return ((uint8_t)((p_smi->pri_state & 0xF0) >> 4));
}

static inline uint8_t ib_sminfo_get_state(const ib_sm_info_t *p_smi)
{
        return ((uint8_t)(p_smi->pri_state & 0x0F));
}
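/*
 * ib_slvl_table_get() above unpacks two 4-bit VL entries per byte: even SLs
 * occupy the high nibble, odd SLs the low nibble. A companion setter is
 * shown here only as a sketch mirroring that layout; this header does not
 * define one and the name is hypothetical:
 */
static inline void slvl_table_set(ib_slvl_table_t *p_slvl_tbl,
                                  uint8_t sl_index, uint8_t vl)
{
        uint8_t idx = sl_index / 2;

        assert(sl_index <= 15);
        if (sl_index % 2) /* odd sl: low nibble */
                p_slvl_tbl->raw_vl_by_sl[idx] =
                    (p_slvl_tbl->raw_vl_by_sl[idx] & 0xF0) | (vl & 0x0F);
        else /* even sl: high nibble */
                p_slvl_tbl->raw_vl_by_sl[idx] =
                    (p_slvl_tbl->raw_vl_by_sl[idx] & 0x0F) | ((vl & 0x0F) << 4);
}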
#endif
rdma-core-56.1/libibmad/libibmad.map000066400000000000000000000074321477342711600173450ustar00rootroot00000000000000
IBMAD_1.3 {
        global:
                xdump; mad_dump_field; mad_dump_val; mad_print_field;
                mad_dump_array; mad_dump_bitfield; mad_dump_hex; mad_dump_int;
                mad_dump_linkdowndefstate; mad_dump_linkspeed; mad_dump_linkspeeden;
                mad_dump_linkspeedsup; mad_dump_linkspeedext; mad_dump_linkspeedexten;
                mad_dump_linkspeedextsup; mad_dump_linkwidth; mad_dump_linkwidthen;
                mad_dump_linkwidthsup; mad_dump_mlnx_ext_port_info; mad_dump_portinfo_ext;
                mad_dump_mtu; mad_dump_node_type; mad_dump_nodedesc; mad_dump_nodeinfo;
                mad_dump_opervls; mad_dump_fields; mad_dump_perfcounters;
                mad_dump_perfcounters_ext; mad_dump_perfcounters_xmt_sl;
                mad_dump_perfcounters_rcv_sl; mad_dump_perfcounters_xmt_disc;
                mad_dump_perfcounters_rcv_err; mad_dump_physportstate;
                mad_dump_portcapmask; mad_dump_portcapmask2; mad_dump_portinfo;
                mad_dump_portsamples_control; mad_dump_portsamples_result;
                mad_dump_perfcounters_port_op_rcv_counters;
                mad_dump_perfcounters_port_flow_ctl_counters;
                mad_dump_perfcounters_port_vl_op_packet;
                mad_dump_perfcounters_port_vl_op_data;
                mad_dump_perfcounters_port_vl_xmit_flow_ctl_update_errors;
                mad_dump_perfcounters_port_vl_xmit_wait_counters;
                mad_dump_perfcounters_sw_port_vl_congestion;
                mad_dump_perfcounters_rcv_con_ctrl;
                mad_dump_perfcounters_sl_rcv_fecn; mad_dump_perfcounters_sl_rcv_becn;
                mad_dump_perfcounters_xmit_con_ctrl;
                mad_dump_perfcounters_vl_xmit_time_cong;
                mad_dump_cc_congestioninfo; mad_dump_cc_congestionkeyinfo;
                mad_dump_cc_congestionlog; mad_dump_cc_congestionlogswitch;
                mad_dump_cc_congestionlogentryswitch; mad_dump_cc_congestionlogca;
                mad_dump_cc_congestionlogentryca; mad_dump_cc_switchcongestionsetting;
                mad_dump_cc_switchportcongestionsettingelement;
                mad_dump_cc_cacongestionsetting; mad_dump_cc_cacongestionentry;
                mad_dump_cc_congestioncontroltable;
                mad_dump_cc_congestioncontroltableentry; mad_dump_cc_timestamp;
                mad_dump_classportinfo; mad_dump_portstates; mad_dump_portstate;
                mad_dump_rhex; mad_dump_sltovl; mad_dump_string; mad_dump_switchinfo;
                mad_dump_uint; mad_dump_vlarbitration; mad_dump_vlcap;
                mad_get_field; mad_set_field; mad_get_field64; mad_set_field64;
                mad_get_array; mad_set_array; pma_query_via; performance_reset_via;
                mad_build_pkt; mad_decode_field; mad_encode; mad_encode_field;
                mad_trid; portid2portnum; portid2str; str2drpath; drpath2str;
                mad_class_agent; mad_register_client; mad_register_server;
                mad_register_client_via; mad_register_server_via;
                ib_resolve_portid_str; ib_resolve_self; ib_resolve_smlid; ibdebug;
                mad_rpc_open_port; mad_rpc_close_port; mad_rpc; mad_rpc_rmpp;
                mad_rpc_portid; mad_rpc_class_agent; mad_rpc_set_retries;
                mad_rpc_set_timeout; mad_get_timeout; mad_get_retries;
                madrpc; madrpc_init; madrpc_portid; madrpc_rmpp; madrpc_save_mad;
                madrpc_set_retries; madrpc_set_timeout; madrpc_show_errors;
                ib_path_query; sa_call; sa_rpc_call; mad_alloc; mad_free;
                mad_receive; mad_respond; mad_receive_via; mad_respond_via;
                mad_send; mad_send_via; smp_query; smp_set;
                ib_vendor_call; ib_vendor_call_via; smp_query_via;
                smp_query_status_via; smp_set_via; smp_set_status_via;
                ib_path_query_via; ib_resolve_smlid_via; ib_resolve_guid_via;
                ib_resolve_gid_via; ib_resolve_portid_str_via; ib_resolve_self_via;
                mad_field_name; bm_call_via; mad_dump_port_ext_speeds_counters;
                mad_dump_port_ext_speeds_counters_rsfec_active;
                cc_query_status_via; cc_config_status_via;
                smp_mkey_get; smp_mkey_set; ib_node_query_via;
        local: *;
};

IBMAD_1.4 {
        global:
                mad_dump_linkspeedext2;
                mad_dump_linkspeedexten2;
                mad_dump_linkspeedextsup2;
} IBMAD_1.3;

IBMAD_1.5 {
        global:
                mad_rpc_open_port2;
                mad_rpc_close_port2;
} IBMAD_1.4;
rdma-core-56.1/libibmad/mad.c000066400000000000000000000147711477342711600160140ustar00rootroot00000000000000
/*
 * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved.
 * Copyright (c) 2009 HNR Consulting. All rights reserved.
 * Copyright (c) 2011 Mellanox Technologies LTD. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses. You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *  - Redistributions of source code must retain the above
 *    copyright notice, this list of conditions and the following
 *    disclaimer.
 *
 *  - Redistributions in binary form must reproduce the above
 *    copyright notice, this list of conditions and the following
 *    disclaimer in the documentation and/or other materials
 *    provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

/* The bracketed system header names were lost when this archive was
 * flattened; the list below is an assumption reconstructed from usage. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <errno.h>
#include <infiniband/umad.h>
#include <infiniband/mad.h>

#include "mad_internal.h"

#undef DEBUG
#define DEBUG if (ibdebug) IBWARN

#define GET_IB_USERLAND_TID(tid) (tid & 0x00000000ffffffff)

/*
 * Generate the 64 bit MAD transaction ID. The upper 32 bits are reserved for
 * use by the kernel. We clear the upper 32 bits here, but MADs received from
 * the kernel may contain kernel specific data in these bits, consequently
 * userland TID matching should only be done on the lower 32 bits.
 */
uint64_t mad_trid(void)
{
        static uint64_t trid;
        uint64_t next;

        if (!trid) {
                srandom((int)time(NULL) * getpid());
                trid = random();
        }
        next = ++trid;
        next = GET_IB_USERLAND_TID(next);
        return next;
}

int mad_get_timeout(const struct ibmad_port *srcport, int override_ms)
{
        return (override_ms ? override_ms :
                srcport->timeout ? srcport->timeout : madrpc_timeout);
}

int mad_get_retries(const struct ibmad_port *srcport)
{
        return (srcport->retries ? srcport->retries : madrpc_retries);
}

void *mad_encode(void *buf, ib_rpc_t * rpc, ib_dr_path_t * drpath, void *data)
{
        int is_resp = rpc->method & IB_MAD_RESPONSE;
        int mgtclass;

        /* first word */
        mad_set_field(buf, 0, IB_MAD_METHOD_F, rpc->method);
        mad_set_field(buf, 0, IB_MAD_RESPONSE_F, is_resp ? 1 : 0);
        mgtclass = rpc->mgtclass & 0xff;
        if (mgtclass == IB_SA_CLASS || mgtclass == IB_CC_CLASS)
                mad_set_field(buf, 0, IB_MAD_CLASSVER_F, 2);
        else
                mad_set_field(buf, 0, IB_MAD_CLASSVER_F, 1);
        mad_set_field(buf, 0, IB_MAD_MGMTCLASS_F, rpc->mgtclass & 0xff);
        mad_set_field(buf, 0, IB_MAD_BASEVER_F, 1);

        /* second word */
        if ((rpc->mgtclass & 0xff) == IB_SMI_DIRECT_CLASS) {
                if (!drpath) {
                        IBWARN("encoding dr mad without drpath (null)");
                        errno = EINVAL;
                        return NULL;
                }
                if (drpath->cnt >= IB_SUBNET_PATH_HOPS_MAX) {
                        IBWARN("dr path with hop count %d", drpath->cnt);
                        errno = EINVAL;
                        return NULL;
                }
                mad_set_field(buf, 0, IB_DRSMP_HOPCNT_F, drpath->cnt);
                mad_set_field(buf, 0, IB_DRSMP_HOPPTR_F,
                              is_resp ? drpath->cnt + 1 : 0x0);
                mad_set_field(buf, 0, IB_DRSMP_STATUS_F, rpc->rstatus);
                mad_set_field(buf, 0, IB_DRSMP_DIRECTION_F, is_resp ? 1 : 0); /* out */
        } else
                mad_set_field(buf, 0, IB_MAD_STATUS_F, rpc->rstatus);

        /* words 3,4,5,6 */
        if (!rpc->trid)
                rpc->trid = mad_trid();
        mad_set_field64(buf, 0, IB_MAD_TRID_F, rpc->trid);
        mad_set_field(buf, 0, IB_MAD_ATTRID_F, rpc->attr.id);
        mad_set_field(buf, 0, IB_MAD_ATTRMOD_F, rpc->attr.mod);

        /* words 7,8 */
        mad_set_field64(buf, 0, IB_MAD_MKEY_F, rpc->mkey);

        if ((rpc->mgtclass & 0xff) == IB_SMI_DIRECT_CLASS) {
                /* word 9 */
                mad_set_field(buf, 0, IB_DRSMP_DRDLID_F,
                              drpath->drdlid ? drpath->drdlid : 0xffff);
                mad_set_field(buf, 0, IB_DRSMP_DRSLID_F,
                              drpath->drslid ? drpath->drslid : 0xffff);

                /* bytes 128 - 256 - by default should be zero due to memset */
                if (is_resp)
                        mad_set_array(buf, 0, IB_DRSMP_RPATH_F, drpath->p);
                else
                        mad_set_array(buf, 0, IB_DRSMP_PATH_F, drpath->p);
        }

        if ((rpc->mgtclass & 0xff) == IB_SA_CLASS)
                mad_set_field64(buf, 0, IB_SA_COMPMASK_F, rpc->mask);

        if ((rpc->mgtclass & 0xff) == IB_CC_CLASS) {
                ib_rpc_cc_t *rpccc = (ib_rpc_cc_t *)rpc;
                mad_set_field64(buf, 0, IB_CC_CCKEY_F, rpccc->cckey);
        }

        if (data)
                memcpy((char *)buf + rpc->dataoffs, data, rpc->datasz);

        /* vendor mads range 2 */
        if (mad_is_vendor_range2(rpc->mgtclass & 0xff))
                mad_set_field(buf, 0, IB_VEND2_OUI_F, rpc->oui);

        return (uint8_t *) buf + IB_MAD_SIZE;
}

int mad_build_pkt(void *umad, ib_rpc_t * rpc, ib_portid_t * dport,
                  ib_rmpp_hdr_t * rmpp, void *data)
{
        uint8_t *p, *mad;
        int lid_routed = (rpc->mgtclass & 0xff) != IB_SMI_DIRECT_CLASS;
        int is_smi = ((rpc->mgtclass & 0xff) == IB_SMI_CLASS ||
                      (rpc->mgtclass & 0xff) == IB_SMI_DIRECT_CLASS);
        struct ib_mad_addr addr;

        if (!is_smi)
                umad_set_addr(umad, dport->lid, dport->qp, dport->sl,
                              dport->qkey);
        else if (lid_routed)
                umad_set_addr(umad, dport->lid, dport->qp, 0, 0);
        else if ((dport->drpath.drslid != 0xffff) && (dport->lid > 0))
                umad_set_addr(umad, dport->lid, 0, 0, 0);
        else
                umad_set_addr(umad, 0xffff, 0, 0, 0);

        if (dport->grh_present && !is_smi) {
                addr.grh_present = 1;
                memcpy(addr.gid, dport->gid, 16);
                addr.hop_limit = 0xff;
                addr.traffic_class = 0;
                addr.flow_label = 0;
                umad_set_grh(umad, &addr);
        } else
                umad_set_grh(umad, NULL);
        umad_set_pkey(umad, is_smi ? 0 : dport->pkey_idx);

        mad = umad_get_mad(umad);
        p = mad_encode(mad, rpc, lid_routed ? NULL : &dport->drpath, data);
        if (!p)
                return -1;

        if (!is_smi && rmpp) {
                mad_set_field(mad, 0, IB_SA_RMPP_VERS_F, 1);
                mad_set_field(mad, 0, IB_SA_RMPP_TYPE_F, rmpp->type);
                mad_set_field(mad, 0, IB_SA_RMPP_RESP_F, 0x3f);
                mad_set_field(mad, 0, IB_SA_RMPP_FLAGS_F, rmpp->flags);
                mad_set_field(mad, 0, IB_SA_RMPP_STATUS_F, rmpp->status);
                mad_set_field(mad, 0, IB_SA_RMPP_D1_F, rmpp->d1.u);
                mad_set_field(mad, 0, IB_SA_RMPP_D2_F, rmpp->d2.u);
        }

        return ((int)(p - mad));
}
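/*
 * Illustration of the TID rule documented above mad_trid(): the kernel owns
 * the upper 32 bits of the transaction ID, so a response must be matched to
 * its request on the lower 32 bits only. A minimal sketch, not part of the
 * library; the function name is hypothetical:
 */
static inline int example_tid_matches(uint64_t sent_tid, uint64_t rcvd_tid)
{
        /* compare the userland halves only; upper bits may differ */
        return GET_IB_USERLAND_TID(sent_tid) == GET_IB_USERLAND_TID(rcvd_tid);
}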
rdma-core-56.1/libibmad/mad.h000066400000000000000000001403761477342711600160200ustar00rootroot00000000000000
/*
 * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved.
 * Copyright (c) 2009 HNR Consulting. All rights reserved.
 * Copyright (c) 2009-2011 Mellanox Technologies LTD. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses. You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *  - Redistributions of source code must retain the above
 *    copyright notice, this list of conditions and the following
 *    disclaimer.
 *
 *  - Redistributions in binary form must reproduce the above
 *    copyright notice, this list of conditions and the following
 *    disclaimer in the documentation and/or other materials
 *    provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
* */ #ifndef _MAD_H_ #define _MAD_H_ #include #include #include #include #include #include #include #include #include #ifdef __cplusplus extern "C" { #endif #define IB_MAD_RPC_VERSION_MASK 0x0f00 #define IB_MAD_RPC_VERSION1 (1<<8) #define IB_SUBNET_PATH_HOPS_MAX 64 #define IB_DEFAULT_SUBN_PREFIX 0xfe80000000000000ULL #define IB_DEFAULT_QP1_QKEY 0x80010000 #define IB_MAD_SIZE 256 #define IB_SMP_DATA_OFFS 64 #define IB_SMP_DATA_SIZE 64 #define IB_VENDOR_RANGE1_DATA_OFFS 24 #define IB_VENDOR_RANGE1_DATA_SIZE (IB_MAD_SIZE - IB_VENDOR_RANGE1_DATA_OFFS) #define IB_VENDOR_RANGE2_DATA_OFFS 40 #define IB_VENDOR_RANGE2_DATA_SIZE (IB_MAD_SIZE - IB_VENDOR_RANGE2_DATA_OFFS) #define IB_SA_DATA_SIZE 200 #define IB_SA_DATA_OFFS 56 #define IB_PC_DATA_OFFS 64 #define IB_PC_DATA_SZ (IB_MAD_SIZE - IB_PC_DATA_OFFS) #define IB_SA_MCM_RECSZ 53 #define IB_SA_PR_RECSZ 64 #define IB_SA_NR_RECSZ 108 #define IB_SA_GIR_RECSZ 72 #define IB_BM_DATA_OFFS 64 #define IB_BM_DATA_SZ (IB_MAD_SIZE - IB_BM_DATA_OFFS) #define IB_BM_BKEY_OFFS 24 #define IB_BM_BKEY_AND_DATA_SZ (IB_MAD_SIZE - IB_BM_BKEY_OFFS) #define IB_CC_DATA_OFFS 64 #define IB_CC_DATA_SZ (IB_MAD_SIZE - IB_CC_DATA_OFFS) #define IB_CC_LOG_DATA_OFFS 32 #define IB_CC_LOG_DATA_SZ (IB_MAD_SIZE - IB_CC_LOG_DATA_OFFS) enum MAD_CLASSES { IB_SMI_CLASS = 0x1, IB_SMI_DIRECT_CLASS = 0x81, IB_SA_CLASS = 0x3, IB_PERFORMANCE_CLASS = 0x4, IB_BOARD_MGMT_CLASS = 0x5, IB_DEVICE_MGMT_CLASS = 0x6, IB_CM_CLASS = 0x7, IB_SNMP_CLASS = 0x8, IB_VENDOR_RANGE1_START_CLASS = 0x9, IB_VENDOR_RANGE1_END_CLASS = 0x0f, IB_CC_CLASS = 0x21, IB_VENDOR_RANGE2_START_CLASS = 0x30, IB_VENDOR_RANGE2_END_CLASS = 0x4f, }; enum MAD_METHODS { IB_MAD_METHOD_GET = 0x1, IB_MAD_METHOD_SET = 0x2, IB_MAD_METHOD_GET_RESPONSE = 0x81, IB_MAD_METHOD_SEND = 0x3, IB_MAD_METHOD_TRAP = 0x5, IB_MAD_METHOD_TRAP_REPRESS = 0x7, IB_MAD_METHOD_REPORT = 0x6, IB_MAD_METHOD_REPORT_RESPONSE = 0x86, IB_MAD_METHOD_GET_TABLE = 0x12, IB_MAD_METHOD_GET_TABLE_RESPONSE = 0x92, IB_MAD_METHOD_GET_TRACE_TABLE = 0x13, IB_MAD_METHOD_GET_TRACE_TABLE_RESPONSE = 0x93, IB_MAD_METHOD_GETMULTI = 0x14, IB_MAD_METHOD_GETMULTI_RESPONSE = 0x94, IB_MAD_METHOD_DELETE = 0x15, IB_MAD_METHOD_DELETE_RESPONSE = 0x95, IB_MAD_RESPONSE = 0x80, }; enum MAD_ATTR_ID { CLASS_PORT_INFO = 0x1, NOTICE = 0x2, INFORM_INFO = 0x3, }; enum MAD_STATUS { IB_MAD_STS_OK = (0 << 2), IB_MAD_STS_BUSY = (1 << 0), IB_MAD_STS_REDIRECT = (1 << 1), IB_MAD_STS_BAD_BASE_VER_OR_CLASS = (1 << 2), IB_MAD_STS_METHOD_NOT_SUPPORTED = (2 << 2), IB_MAD_STS_METHOD_ATTR_NOT_SUPPORTED = (3 << 2), IB_MAD_STS_INV_ATTR_VALUE = (7 << 2), }; enum SMI_ATTR_ID { IB_ATTR_NODE_DESC = 0x10, IB_ATTR_NODE_INFO = 0x11, IB_ATTR_SWITCH_INFO = 0x12, IB_ATTR_GUID_INFO = 0x14, IB_ATTR_PORT_INFO = 0x15, IB_ATTR_PKEY_TBL = 0x16, IB_ATTR_SLVL_TABLE = 0x17, IB_ATTR_VL_ARBITRATION = 0x18, IB_ATTR_LINEARFORWTBL = 0x19, IB_ATTR_MULTICASTFORWTBL = 0x1b, IB_ATTR_LINKSPEEDWIDTHPAIRSTBL = 0x1c, IB_ATTR_VENDORMADSTBL = 0x1d, IB_ATTR_SMINFO = 0x20, IB_ATTR_PORT_INFO_EXT = 0x33, IB_ATTR_LAST, IB_ATTR_MLNX_EXT_PORT_INFO = 0xff90, }; enum SA_ATTR_ID { IB_SA_ATTR_NOTICE = 0x02, IB_SA_ATTR_INFORMINFO = 0x03, IB_SA_ATTR_NODERECORD = 0x11, IB_SA_ATTR_PORTINFORECORD = 0x12, IB_SA_ATTR_SL2VLTABLERECORD = 0x13, IB_SA_ATTR_SWITCHINFORECORD = 0x14, IB_SA_ATTR_LFTRECORD = 0x15, IB_SA_ATTR_RFTRECORD = 0x16, IB_SA_ATTR_MFTRECORD = 0x17, IB_SA_ATTR_SMINFORECORD = 0x18, IB_SA_ATTR_LINKRECORD = 0x20, IB_SA_ATTR_GUIDINFORECORD = 0x30, IB_SA_ATTR_SERVICERECORD = 0x31, IB_SA_ATTR_PKEYTABLERECORD = 0x33, IB_SA_ATTR_PATHRECORD = 0x35, 
IB_SA_ATTR_VLARBTABLERECORD = 0x36, IB_SA_ATTR_MCRECORD = 0x38, IB_SA_ATTR_MULTIPATH = 0x3a, IB_SA_ATTR_INFORMINFORECORD = 0xf3, IB_SA_ATTR_LAST }; enum GSI_ATTR_ID { IB_GSI_PORT_SAMPLES_CONTROL = 0x10, IB_GSI_PORT_SAMPLES_RESULT = 0x11, IB_GSI_PORT_COUNTERS = 0x12, IB_GSI_PORT_RCV_ERROR_DETAILS = 0x15, IB_GSI_PORT_XMIT_DISCARD_DETAILS = 0x16, IB_GSI_PORT_PORT_OP_RCV_COUNTERS = 0x17, IB_GSI_PORT_PORT_FLOW_CTL_COUNTERS = 0x18, IB_GSI_PORT_PORT_VL_OP_PACKETS = 0x19, IB_GSI_PORT_PORT_VL_OP_DATA = 0x1A, IB_GSI_PORT_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS = 0x1B, IB_GSI_PORT_PORT_VL_XMIT_WAIT_COUNTERS = 0x1C, IB_GSI_PORT_COUNTERS_EXT = 0x1D, IB_GSI_PORT_EXT_SPEEDS_COUNTERS = 0x1F, IB_GSI_SW_PORT_VL_CONGESTION = 0x30, IB_GSI_PORT_RCV_CON_CTRL = 0x31, IB_GSI_PORT_SL_RCV_FECN = 0x32, IB_GSI_PORT_SL_RCV_BECN = 0x33, IB_GSI_PORT_XMIT_CON_CTRL = 0x34, IB_GSI_PORT_VL_XMIT_TIME_CONG = 0x35, IB_GSI_PORT_XMIT_DATA_SL = 0x36, IB_GSI_PORT_RCV_DATA_SL = 0x37, IB_GSI_ATTR_LAST }; enum BM_ATTR_ID { IB_BM_ATTR_BKEYINFO = 0x10, IB_BM_ATTR_WRITE_VPD = 0x20, IB_BM_ATTR_READ_VPD = 0x21, IB_BM_ATTR_RESET_IBML = 0x22, IB_BM_ATTR_SET_MODULE_PM_CONTROL = 0x23, IB_BM_ATTR_GET_MODULE_PM_CONTROL = 0x24, IB_BM_ATTR_SET_UNIT_PM_CONTROL = 0x25, IB_BM_ATTR_GET_UNIT_PM_CONTROL = 0x26, IB_BM_ATTR_SET_IOC_PM_CONTROL = 0x27, IB_BM_ATTR_GET_IOC_PM_CONTROL = 0x28, IB_BM_ATTR_SET_MODULE_STATE = 0x29, IB_BM_ATTR_SET_MODULE_ATTENTION = 0x2A, IB_BM_ATTR_GET_MODULE_STATUS = 0x2B, IB_BM_ATTR_IB2IBML = 0x2C, IB_BM_ATTR_IB2CME = 0x2D, IB_BM_ATTR_IB2MME = 0x2E, IB_BM_ATTR_OEM = 0x2F, IB_BM_ATTR_LAST }; enum CC_ATTRI_ID { IB_CC_ATTR_CONGESTION_INFO = 0x11, IB_CC_ATTR_CONGESTION_KEY_INFO = 0x12, IB_CC_ATTR_CONGESTION_LOG = 0x13, IB_CC_ATTR_SWITCH_CONGESTION_SETTING = 0x14, IB_CC_ATTR_SWITCH_PORT_CONGESTION_SETTING = 0x15, IB_CC_ATTR_CA_CONGESTION_SETTING = 0x16, IB_CC_ATTR_CONGESTION_CONTROL_TABLE = 0x17, IB_CC_ATTR_TIMESTAMP = 0x18, }; #define IB_VENDOR_OPENIB_PING_CLASS (IB_VENDOR_RANGE2_START_CLASS + 2) #define IB_VENDOR_OPENIB_SYSSTAT_CLASS (IB_VENDOR_RANGE2_START_CLASS + 3) #define IB_OPENIB_OUI (0x001405) typedef uint8_t ibmad_gid_t[16]; #ifdef USE_DEPRECATED_IB_GID_T typedef ibmad_gid_t ib_gid_t __attribute__ ((deprecated)); #endif typedef struct { int cnt; uint8_t p[IB_SUBNET_PATH_HOPS_MAX]; uint16_t drslid; uint16_t drdlid; } ib_dr_path_t; typedef struct { unsigned id; unsigned mod; } ib_attr_t; typedef struct { int mgtclass; int method; ib_attr_t attr; uint32_t rstatus; /* return status */ int dataoffs; int datasz; uint64_t mkey; uint64_t trid; /* used for out mad if nonzero, return real val */ uint64_t mask; /* for sa mads */ unsigned recsz; /* for sa mads (attribute offset) */ int timeout; uint32_t oui; /* for vendor range 2 mads */ } ib_rpc_t; typedef struct { int mgtclass; int method; ib_attr_t attr; uint32_t rstatus; /* return status */ int dataoffs; int datasz; uint64_t mkey; uint64_t trid; /* used for out mad if nonzero, return real val */ uint64_t mask; /* for sa mads */ unsigned recsz; /* for sa mads (attribute offset) */ int timeout; uint32_t oui; /* for vendor range 2 mads */ int error; /* errno */ } ib_rpc_v1_t; typedef struct { int mgtclass; int method; ib_attr_t attr; uint32_t rstatus; /* return status */ int dataoffs; int datasz; uint64_t mkey; uint64_t trid; /* used for out mad if nonzero, return real val */ uint64_t mask; /* for sa mads */ unsigned recsz; /* for sa mads (attribute offset) */ int timeout; uint32_t oui; /* for vendor range 2 mads */ int error; /* errno */ uint64_t cckey; } ib_rpc_cc_t; typedef struct 
portid { int lid; /* lid or 0 if directed route */ ib_dr_path_t drpath; int grh_present; /* flag */ ibmad_gid_t gid; uint32_t qp; uint32_t qkey; uint8_t sl; unsigned pkey_idx; } ib_portid_t; typedef void (ib_mad_dump_fn) (char *buf, int bufsz, void *val, int valsz); #define IB_FIELD_NAME_LEN 32 typedef struct ib_field { int bitoffs; int bitlen; char name[IB_FIELD_NAME_LEN]; ib_mad_dump_fn *def_dump_fn; } ib_field_t; enum MAD_FIELDS { IB_NO_FIELD, IB_GID_PREFIX_F, IB_GID_GUID_F, /* first MAD word (0-3 bytes) */ IB_MAD_METHOD_F, IB_MAD_RESPONSE_F, IB_MAD_CLASSVER_F, IB_MAD_MGMTCLASS_F, IB_MAD_BASEVER_F, /* second MAD word (4-7 bytes) */ IB_MAD_STATUS_F, /* DRSMP only */ IB_DRSMP_HOPCNT_F, IB_DRSMP_HOPPTR_F, IB_DRSMP_STATUS_F, IB_DRSMP_DIRECTION_F, /* words 3,4,5,6 (8-23 bytes) */ IB_MAD_TRID_F, IB_MAD_ATTRID_F, IB_MAD_ATTRMOD_F, /* word 7,8 (24-31 bytes) */ IB_MAD_MKEY_F, /* word 9 (32-37 bytes) */ IB_DRSMP_DRDLID_F, IB_DRSMP_DRSLID_F, /* word 10,11 (36-43 bytes) */ IB_SA_MKEY_F, /* word 12 (44-47 bytes) */ IB_SA_ATTROFFS_F, /* word 13,14 (48-55 bytes) */ IB_SA_COMPMASK_F, /* word 13,14 (56-255 bytes) */ IB_SA_DATA_F, /* bytes 64 - 127 */ IB_SM_DATA_F, /* bytes 64 - 256 */ IB_GS_DATA_F, /* bytes 128 - 191 */ IB_DRSMP_PATH_F, /* bytes 192 - 255 */ IB_DRSMP_RPATH_F, /* * PortInfo fields */ IB_PORT_FIRST_F, IB_PORT_MKEY_F = IB_PORT_FIRST_F, IB_PORT_GID_PREFIX_F, IB_PORT_LID_F, IB_PORT_SMLID_F, IB_PORT_CAPMASK_F, IB_PORT_DIAG_F, IB_PORT_MKEY_LEASE_F, IB_PORT_LOCAL_PORT_F, IB_PORT_LINK_WIDTH_ENABLED_F, IB_PORT_LINK_WIDTH_SUPPORTED_F, IB_PORT_LINK_WIDTH_ACTIVE_F, IB_PORT_LINK_SPEED_SUPPORTED_F, IB_PORT_STATE_F, IB_PORT_PHYS_STATE_F, IB_PORT_LINK_DOWN_DEF_F, IB_PORT_MKEY_PROT_BITS_F, IB_PORT_LMC_F, IB_PORT_LINK_SPEED_ACTIVE_F, IB_PORT_LINK_SPEED_ENABLED_F, IB_PORT_NEIGHBOR_MTU_F, IB_PORT_SMSL_F, IB_PORT_VL_CAP_F, IB_PORT_INIT_TYPE_F, IB_PORT_VL_HIGH_LIMIT_F, IB_PORT_VL_ARBITRATION_HIGH_CAP_F, IB_PORT_VL_ARBITRATION_LOW_CAP_F, IB_PORT_INIT_TYPE_REPLY_F, IB_PORT_MTU_CAP_F, IB_PORT_VL_STALL_COUNT_F, IB_PORT_HOQ_LIFE_F, IB_PORT_OPER_VLS_F, IB_PORT_PART_EN_INB_F, IB_PORT_PART_EN_OUTB_F, IB_PORT_FILTER_RAW_INB_F, IB_PORT_FILTER_RAW_OUTB_F, IB_PORT_MKEY_VIOL_F, IB_PORT_PKEY_VIOL_F, IB_PORT_QKEY_VIOL_F, IB_PORT_GUID_CAP_F, IB_PORT_CLIENT_REREG_F, IB_PORT_MCAST_PKEY_SUPR_ENAB_F, IB_PORT_SUBN_TIMEOUT_F, IB_PORT_RESP_TIME_VAL_F, IB_PORT_LOCAL_PHYS_ERR_F, IB_PORT_OVERRUN_ERR_F, IB_PORT_MAX_CREDIT_HINT_F, IB_PORT_LINK_ROUND_TRIP_F, IB_PORT_LAST_F, /* * NodeInfo fields */ IB_NODE_FIRST_F, IB_NODE_BASE_VERS_F = IB_NODE_FIRST_F, IB_NODE_CLASS_VERS_F, IB_NODE_TYPE_F, IB_NODE_NPORTS_F, IB_NODE_SYSTEM_GUID_F, IB_NODE_GUID_F, IB_NODE_PORT_GUID_F, IB_NODE_PARTITION_CAP_F, IB_NODE_DEVID_F, IB_NODE_REVISION_F, IB_NODE_LOCAL_PORT_F, IB_NODE_VENDORID_F, IB_NODE_LAST_F, /* * SwitchInfo fields */ IB_SW_FIRST_F, IB_SW_LINEAR_FDB_CAP_F = IB_SW_FIRST_F, IB_SW_RANDOM_FDB_CAP_F, IB_SW_MCAST_FDB_CAP_F, IB_SW_LINEAR_FDB_TOP_F, IB_SW_DEF_PORT_F, IB_SW_DEF_MCAST_PRIM_F, IB_SW_DEF_MCAST_NOT_PRIM_F, IB_SW_LIFE_TIME_F, IB_SW_STATE_CHANGE_F, IB_SW_OPT_SLTOVL_MAPPING_F, IB_SW_LIDS_PER_PORT_F, IB_SW_PARTITION_ENFORCE_CAP_F, IB_SW_PARTITION_ENF_INB_F, IB_SW_PARTITION_ENF_OUTB_F, IB_SW_FILTER_RAW_INB_F, IB_SW_FILTER_RAW_OUTB_F, IB_SW_ENHANCED_PORT0_F, IB_SW_MCAST_FDB_TOP_F, IB_SW_LAST_F, /* * SwitchLinearForwardingTable fields */ IB_LINEAR_FORW_TBL_F, /* * SwitchMulticastForwardingTable fields */ IB_MULTICAST_FORW_TBL_F, /* * NodeDescription fields */ IB_NODE_DESC_F, /* * Notice/Trap fields */ IB_NOTICE_IS_GENERIC_F, IB_NOTICE_TYPE_F, 
IB_NOTICE_PRODUCER_F, IB_NOTICE_TRAP_NUMBER_F, IB_NOTICE_ISSUER_LID_F, IB_NOTICE_TOGGLE_F, IB_NOTICE_COUNT_F, IB_NOTICE_DATA_DETAILS_F, IB_NOTICE_DATA_LID_F, IB_NOTICE_DATA_144_LID_F, IB_NOTICE_DATA_144_CAPMASK_F, /* * GS Performance */ IB_PC_FIRST_F, IB_PC_PORT_SELECT_F = IB_PC_FIRST_F, IB_PC_COUNTER_SELECT_F, IB_PC_ERR_SYM_F, IB_PC_LINK_RECOVERS_F, IB_PC_LINK_DOWNED_F, IB_PC_ERR_RCV_F, IB_PC_ERR_PHYSRCV_F, IB_PC_ERR_SWITCH_REL_F, IB_PC_XMT_DISCARDS_F, IB_PC_ERR_XMTCONSTR_F, IB_PC_ERR_RCVCONSTR_F, IB_PC_COUNTER_SELECT2_F, IB_PC_ERR_LOCALINTEG_F, IB_PC_ERR_EXCESS_OVR_F, IB_PC_VL15_DROPPED_F, IB_PC_XMT_BYTES_F, IB_PC_RCV_BYTES_F, IB_PC_XMT_PKTS_F, IB_PC_RCV_PKTS_F, IB_PC_XMT_WAIT_F, IB_PC_LAST_F, /* * SMInfo */ IB_SMINFO_GUID_F, IB_SMINFO_KEY_F, IB_SMINFO_ACT_F, IB_SMINFO_PRIO_F, IB_SMINFO_STATE_F, /* * SA RMPP */ IB_SA_RMPP_VERS_F, IB_SA_RMPP_TYPE_F, IB_SA_RMPP_RESP_F, IB_SA_RMPP_FLAGS_F, IB_SA_RMPP_STATUS_F, /* data1 */ IB_SA_RMPP_D1_F, IB_SA_RMPP_SEGNUM_F, /* data2 */ IB_SA_RMPP_D2_F, IB_SA_RMPP_LEN_F, /* DATA: Payload len */ IB_SA_RMPP_NEWWIN_F, /* ACK: new window last */ /* * SA Multi Path rec */ IB_SA_MP_NPATH_F, IB_SA_MP_NSRC_F, IB_SA_MP_NDEST_F, IB_SA_MP_GID0_F, /* * SA Path rec */ IB_SA_PR_DGID_F, IB_SA_PR_SGID_F, IB_SA_PR_DLID_F, IB_SA_PR_SLID_F, IB_SA_PR_NPATH_F, IB_SA_PR_SL_F, /* * MC Member rec */ IB_SA_MCM_MGID_F, IB_SA_MCM_PORTGID_F, IB_SA_MCM_QKEY_F, IB_SA_MCM_MLID_F, IB_SA_MCM_SL_F, IB_SA_MCM_MTU_F, IB_SA_MCM_RATE_F, IB_SA_MCM_TCLASS_F, IB_SA_MCM_PKEY_F, IB_SA_MCM_FLOW_LABEL_F, IB_SA_MCM_JOIN_STATE_F, IB_SA_MCM_PROXY_JOIN_F, /* * Service record */ IB_SA_SR_ID_F, IB_SA_SR_GID_F, IB_SA_SR_PKEY_F, IB_SA_SR_LEASE_F, IB_SA_SR_KEY_F, IB_SA_SR_NAME_F, IB_SA_SR_DATA_F, /* * ATS SM record - within SA_SR_DATA */ IB_ATS_SM_NODE_ADDR_F, IB_ATS_SM_MAGIC_KEY_F, IB_ATS_SM_NODE_TYPE_F, IB_ATS_SM_NODE_NAME_F, /* * SLTOVL MAPPING TABLE */ IB_SLTOVL_MAPPING_TABLE_F, /* * VL ARBITRATION TABLE */ IB_VL_ARBITRATION_TABLE_F, /* * IB vendor class range 2 */ IB_VEND2_OUI_F, IB_VEND2_DATA_F, /* * PortCountersExtended */ IB_PC_EXT_FIRST_F, IB_PC_EXT_PORT_SELECT_F = IB_PC_EXT_FIRST_F, IB_PC_EXT_COUNTER_SELECT_F, IB_PC_EXT_XMT_BYTES_F, IB_PC_EXT_RCV_BYTES_F, IB_PC_EXT_XMT_PKTS_F, IB_PC_EXT_RCV_PKTS_F, IB_PC_EXT_XMT_UPKTS_F, IB_PC_EXT_RCV_UPKTS_F, IB_PC_EXT_XMT_MPKTS_F, IB_PC_EXT_RCV_MPKTS_F, IB_PC_EXT_LAST_F, /* * GUIDInfo fields */ IB_GUID_GUID0_F, /* Obsolete, kept for compatibility Use IB_GI_GUID0_F going forward */ /* * ClassPortInfo fields */ IB_CPI_BASEVER_F, IB_CPI_CLASSVER_F, IB_CPI_CAPMASK_F, IB_CPI_CAPMASK2_F, IB_CPI_RESP_TIME_VALUE_F, IB_CPI_REDIRECT_GID_F, IB_CPI_REDIRECT_TC_F, IB_CPI_REDIRECT_SL_F, IB_CPI_REDIRECT_FL_F, IB_CPI_REDIRECT_LID_F, IB_CPI_REDIRECT_PKEY_F, IB_CPI_REDIRECT_QP_F, IB_CPI_REDIRECT_QKEY_F, IB_CPI_TRAP_GID_F, IB_CPI_TRAP_TC_F, IB_CPI_TRAP_SL_F, IB_CPI_TRAP_FL_F, IB_CPI_TRAP_LID_F, IB_CPI_TRAP_PKEY_F, IB_CPI_TRAP_HL_F, IB_CPI_TRAP_QP_F, IB_CPI_TRAP_QKEY_F, /* * PortXmitDataSL fields */ IB_PC_XMT_DATA_SL_FIRST_F, /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_XMT_DATA_SL0_F = IB_PC_XMT_DATA_SL_FIRST_F, IB_PC_XMT_DATA_SL1_F, IB_PC_XMT_DATA_SL2_F, IB_PC_XMT_DATA_SL3_F, IB_PC_XMT_DATA_SL4_F, IB_PC_XMT_DATA_SL5_F, IB_PC_XMT_DATA_SL6_F, IB_PC_XMT_DATA_SL7_F, IB_PC_XMT_DATA_SL8_F, IB_PC_XMT_DATA_SL9_F, IB_PC_XMT_DATA_SL10_F, IB_PC_XMT_DATA_SL11_F, IB_PC_XMT_DATA_SL12_F, IB_PC_XMT_DATA_SL13_F, IB_PC_XMT_DATA_SL14_F, IB_PC_XMT_DATA_SL15_F, IB_PC_XMT_DATA_SL_LAST_F, /* * PortRcvDataSL fields */ IB_PC_RCV_DATA_SL_FIRST_F, /* for 
PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_RCV_DATA_SL0_F = IB_PC_RCV_DATA_SL_FIRST_F, IB_PC_RCV_DATA_SL1_F, IB_PC_RCV_DATA_SL2_F, IB_PC_RCV_DATA_SL3_F, IB_PC_RCV_DATA_SL4_F, IB_PC_RCV_DATA_SL5_F, IB_PC_RCV_DATA_SL6_F, IB_PC_RCV_DATA_SL7_F, IB_PC_RCV_DATA_SL8_F, IB_PC_RCV_DATA_SL9_F, IB_PC_RCV_DATA_SL10_F, IB_PC_RCV_DATA_SL11_F, IB_PC_RCV_DATA_SL12_F, IB_PC_RCV_DATA_SL13_F, IB_PC_RCV_DATA_SL14_F, IB_PC_RCV_DATA_SL15_F, IB_PC_RCV_DATA_SL_LAST_F, /* * PortXmitDiscardDetails fields */ /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_XMT_INACT_DISC_F, IB_PC_XMT_NEIGH_MTU_DISC_F, IB_PC_XMT_SW_LIFE_DISC_F, IB_PC_XMT_SW_HOL_DISC_F, IB_PC_XMT_DISC_LAST_F, /* * PortRcvErrorDetails fields */ /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_RCV_LOCAL_PHY_ERR_F, IB_PC_RCV_MALFORMED_PKT_ERR_F, IB_PC_RCV_BUF_OVR_ERR_F, IB_PC_RCV_DLID_MAP_ERR_F, IB_PC_RCV_VL_MAP_ERR_F, IB_PC_RCV_LOOPING_ERR_F, IB_PC_RCV_ERR_LAST_F, /* * PortSamplesControl fields */ IB_PSC_OPCODE_F, IB_PSC_PORT_SELECT_F, IB_PSC_TICK_F, IB_PSC_COUNTER_WIDTH_F, IB_PSC_COUNTER_MASK0_F, IB_PSC_COUNTER_MASKS1TO9_F, IB_PSC_COUNTER_MASKS10TO14_F, IB_PSC_SAMPLE_MECHS_F, IB_PSC_SAMPLE_STATUS_F, IB_PSC_OPTION_MASK_F, IB_PSC_VENDOR_MASK_F, IB_PSC_SAMPLE_START_F, IB_PSC_SAMPLE_INTVL_F, IB_PSC_TAG_F, IB_PSC_COUNTER_SEL0_F, IB_PSC_COUNTER_SEL1_F, IB_PSC_COUNTER_SEL2_F, IB_PSC_COUNTER_SEL3_F, IB_PSC_COUNTER_SEL4_F, IB_PSC_COUNTER_SEL5_F, IB_PSC_COUNTER_SEL6_F, IB_PSC_COUNTER_SEL7_F, IB_PSC_COUNTER_SEL8_F, IB_PSC_COUNTER_SEL9_F, IB_PSC_COUNTER_SEL10_F, IB_PSC_COUNTER_SEL11_F, IB_PSC_COUNTER_SEL12_F, IB_PSC_COUNTER_SEL13_F, IB_PSC_COUNTER_SEL14_F, IB_PSC_SAMPLES_ONLY_OPT_MASK_F, IB_PSC_LAST_F, /* * GUIDInfo fields */ IB_GI_GUID0_F, /* a duplicate of IB_GUID_GUID0_F for backwards compatibility */ IB_GI_GUID1_F, IB_GI_GUID2_F, IB_GI_GUID3_F, IB_GI_GUID4_F, IB_GI_GUID5_F, IB_GI_GUID6_F, IB_GI_GUID7_F, /* * GUID Info Record */ IB_SA_GIR_LID_F, IB_SA_GIR_BLOCKNUM_F, IB_SA_GIR_GUID0_F, IB_SA_GIR_GUID1_F, IB_SA_GIR_GUID2_F, IB_SA_GIR_GUID3_F, IB_SA_GIR_GUID4_F, IB_SA_GIR_GUID5_F, IB_SA_GIR_GUID6_F, IB_SA_GIR_GUID7_F, /* * More PortInfo fields */ IB_PORT_CAPMASK2_F, IB_PORT_LINK_SPEED_EXT_ACTIVE_F, IB_PORT_LINK_SPEED_EXT_SUPPORTED_F, IB_PORT_LINK_SPEED_EXT_ENABLED_F, IB_PORT_LINK_SPEED_EXT_LAST_F, /* * PortExtendedSpeedsCounters fields */ IB_PESC_PORT_SELECT_F, IB_PESC_COUNTER_SELECT_F, IB_PESC_SYNC_HDR_ERR_CTR_F, IB_PESC_UNK_BLOCK_CTR_F, IB_PESC_ERR_DET_CTR_LANE0_F, IB_PESC_ERR_DET_CTR_LANE1_F, IB_PESC_ERR_DET_CTR_LANE2_F, IB_PESC_ERR_DET_CTR_LANE3_F, IB_PESC_ERR_DET_CTR_LANE4_F, IB_PESC_ERR_DET_CTR_LANE5_F, IB_PESC_ERR_DET_CTR_LANE6_F, IB_PESC_ERR_DET_CTR_LANE7_F, IB_PESC_ERR_DET_CTR_LANE8_F, IB_PESC_ERR_DET_CTR_LANE9_F, IB_PESC_ERR_DET_CTR_LANE10_F, IB_PESC_ERR_DET_CTR_LANE11_F, IB_PESC_FEC_CORR_BLOCK_CTR_LANE0_F, IB_PESC_FEC_CORR_BLOCK_CTR_LANE1_F, IB_PESC_FEC_CORR_BLOCK_CTR_LANE2_F, IB_PESC_FEC_CORR_BLOCK_CTR_LANE3_F, IB_PESC_FEC_CORR_BLOCK_CTR_LANE4_F, IB_PESC_FEC_CORR_BLOCK_CTR_LANE5_F, IB_PESC_FEC_CORR_BLOCK_CTR_LANE6_F, IB_PESC_FEC_CORR_BLOCK_CTR_LANE7_F, IB_PESC_FEC_CORR_BLOCK_CTR_LANE8_F, IB_PESC_FEC_CORR_BLOCK_CTR_LANE9_F, IB_PESC_FEC_CORR_BLOCK_CTR_LANE10_F, IB_PESC_FEC_CORR_BLOCK_CTR_LANE11_F, IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE0_F, IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE1_F, IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE2_F, IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE3_F, IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE4_F, 
IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE5_F, IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE6_F, IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE7_F, IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE8_F, IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE9_F, IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE10_F, IB_PESC_FEC_UNCORR_BLOCK_CTR_LANE11_F, IB_PESC_LAST_F, /* * PortOpRcvCounters fields */ IB_PC_PORT_OP_RCV_COUNTERS_FIRST_F, /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_PORT_OP_RCV_PKTS_F = IB_PC_PORT_OP_RCV_COUNTERS_FIRST_F, IB_PC_PORT_OP_RCV_DATA_F, IB_PC_PORT_OP_RCV_COUNTERS_LAST_F, /* * PortFlowCtlCounters fields */ IB_PC_PORT_FLOW_CTL_COUNTERS_FIRST_F, /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_PORT_XMIT_FLOW_PKTS_F = IB_PC_PORT_FLOW_CTL_COUNTERS_FIRST_F, IB_PC_PORT_RCV_FLOW_PKTS_F, IB_PC_PORT_FLOW_CTL_COUNTERS_LAST_F, /* * PortVLOpPackets fields */ IB_PC_PORT_VL_OP_PACKETS_FIRST_F, /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_PORT_VL_OP_PACKETS0_F = IB_PC_PORT_VL_OP_PACKETS_FIRST_F, IB_PC_PORT_VL_OP_PACKETS1_F, IB_PC_PORT_VL_OP_PACKETS2_F, IB_PC_PORT_VL_OP_PACKETS3_F, IB_PC_PORT_VL_OP_PACKETS4_F, IB_PC_PORT_VL_OP_PACKETS5_F, IB_PC_PORT_VL_OP_PACKETS6_F, IB_PC_PORT_VL_OP_PACKETS7_F, IB_PC_PORT_VL_OP_PACKETS8_F, IB_PC_PORT_VL_OP_PACKETS9_F, IB_PC_PORT_VL_OP_PACKETS10_F, IB_PC_PORT_VL_OP_PACKETS11_F, IB_PC_PORT_VL_OP_PACKETS12_F, IB_PC_PORT_VL_OP_PACKETS13_F, IB_PC_PORT_VL_OP_PACKETS14_F, IB_PC_PORT_VL_OP_PACKETS15_F, IB_PC_PORT_VL_OP_PACKETS_LAST_F, /* * PortVLOpData fields */ IB_PC_PORT_VL_OP_DATA_FIRST_F, /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_PORT_VL_OP_DATA0_F = IB_PC_PORT_VL_OP_DATA_FIRST_F, IB_PC_PORT_VL_OP_DATA1_F, IB_PC_PORT_VL_OP_DATA2_F, IB_PC_PORT_VL_OP_DATA3_F, IB_PC_PORT_VL_OP_DATA4_F, IB_PC_PORT_VL_OP_DATA5_F, IB_PC_PORT_VL_OP_DATA6_F, IB_PC_PORT_VL_OP_DATA7_F, IB_PC_PORT_VL_OP_DATA8_F, IB_PC_PORT_VL_OP_DATA9_F, IB_PC_PORT_VL_OP_DATA10_F, IB_PC_PORT_VL_OP_DATA11_F, IB_PC_PORT_VL_OP_DATA12_F, IB_PC_PORT_VL_OP_DATA13_F, IB_PC_PORT_VL_OP_DATA14_F, IB_PC_PORT_VL_OP_DATA15_F, IB_PC_PORT_VL_OP_DATA_LAST_F, /* * PortVLXmitFlowCtlUpdateErrors fields */ IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS_FIRST_F, /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS0_F = IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS_FIRST_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS1_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS2_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS3_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS4_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS5_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS6_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS7_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS8_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS9_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS10_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS11_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS12_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS13_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS14_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS15_F, IB_PC_PORT_VL_XMIT_FLOW_CTL_UPDATE_ERRORS_LAST_F, /* * PortVLXmitWaitCounters fields */ IB_PC_PORT_VL_XMIT_WAIT_COUNTERS_FIRST_F, /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_PORT_VL_XMIT_WAIT0_F = IB_PC_PORT_VL_XMIT_WAIT_COUNTERS_FIRST_F, IB_PC_PORT_VL_XMIT_WAIT1_F, IB_PC_PORT_VL_XMIT_WAIT2_F, IB_PC_PORT_VL_XMIT_WAIT3_F, 
IB_PC_PORT_VL_XMIT_WAIT4_F, IB_PC_PORT_VL_XMIT_WAIT5_F, IB_PC_PORT_VL_XMIT_WAIT6_F, IB_PC_PORT_VL_XMIT_WAIT7_F, IB_PC_PORT_VL_XMIT_WAIT8_F, IB_PC_PORT_VL_XMIT_WAIT9_F, IB_PC_PORT_VL_XMIT_WAIT10_F, IB_PC_PORT_VL_XMIT_WAIT11_F, IB_PC_PORT_VL_XMIT_WAIT12_F, IB_PC_PORT_VL_XMIT_WAIT13_F, IB_PC_PORT_VL_XMIT_WAIT14_F, IB_PC_PORT_VL_XMIT_WAIT15_F, IB_PC_PORT_VL_XMIT_WAIT_COUNTERS_LAST_F, /* * SwPortVLCongestion fields */ IB_PC_SW_PORT_VL_CONGESTION_FIRST_F, /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_SW_PORT_VL_CONGESTION0_F = IB_PC_SW_PORT_VL_CONGESTION_FIRST_F, IB_PC_SW_PORT_VL_CONGESTION1_F, IB_PC_SW_PORT_VL_CONGESTION2_F, IB_PC_SW_PORT_VL_CONGESTION3_F, IB_PC_SW_PORT_VL_CONGESTION4_F, IB_PC_SW_PORT_VL_CONGESTION5_F, IB_PC_SW_PORT_VL_CONGESTION6_F, IB_PC_SW_PORT_VL_CONGESTION7_F, IB_PC_SW_PORT_VL_CONGESTION8_F, IB_PC_SW_PORT_VL_CONGESTION9_F, IB_PC_SW_PORT_VL_CONGESTION10_F, IB_PC_SW_PORT_VL_CONGESTION11_F, IB_PC_SW_PORT_VL_CONGESTION12_F, IB_PC_SW_PORT_VL_CONGESTION13_F, IB_PC_SW_PORT_VL_CONGESTION14_F, IB_PC_SW_PORT_VL_CONGESTION15_F, IB_PC_SW_PORT_VL_CONGESTION_LAST_F, /* * PortRcvConCtrl fields */ IB_PC_RCV_CON_CTRL_FIRST_F, /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_RCV_CON_CTRL_PKT_RCV_FECN_F = IB_PC_RCV_CON_CTRL_FIRST_F, IB_PC_RCV_CON_CTRL_PKT_RCV_BECN_F, IB_PC_RCV_CON_CTRL_LAST_F, /* * PortSLRcvFECN fields */ IB_PC_SL_RCV_FECN_FIRST_F, /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_SL_RCV_FECN0_F = IB_PC_SL_RCV_FECN_FIRST_F, IB_PC_SL_RCV_FECN1_F, IB_PC_SL_RCV_FECN2_F, IB_PC_SL_RCV_FECN3_F, IB_PC_SL_RCV_FECN4_F, IB_PC_SL_RCV_FECN5_F, IB_PC_SL_RCV_FECN6_F, IB_PC_SL_RCV_FECN7_F, IB_PC_SL_RCV_FECN8_F, IB_PC_SL_RCV_FECN9_F, IB_PC_SL_RCV_FECN10_F, IB_PC_SL_RCV_FECN11_F, IB_PC_SL_RCV_FECN12_F, IB_PC_SL_RCV_FECN13_F, IB_PC_SL_RCV_FECN14_F, IB_PC_SL_RCV_FECN15_F, IB_PC_SL_RCV_FECN_LAST_F, /* * PortSLRcvBECN fields */ IB_PC_SL_RCV_BECN_FIRST_F, /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_SL_RCV_BECN0_F = IB_PC_SL_RCV_BECN_FIRST_F, IB_PC_SL_RCV_BECN1_F, IB_PC_SL_RCV_BECN2_F, IB_PC_SL_RCV_BECN3_F, IB_PC_SL_RCV_BECN4_F, IB_PC_SL_RCV_BECN5_F, IB_PC_SL_RCV_BECN6_F, IB_PC_SL_RCV_BECN7_F, IB_PC_SL_RCV_BECN8_F, IB_PC_SL_RCV_BECN9_F, IB_PC_SL_RCV_BECN10_F, IB_PC_SL_RCV_BECN11_F, IB_PC_SL_RCV_BECN12_F, IB_PC_SL_RCV_BECN13_F, IB_PC_SL_RCV_BECN14_F, IB_PC_SL_RCV_BECN15_F, IB_PC_SL_RCV_BECN_LAST_F, /* * PortXmitConCtrl fields */ IB_PC_XMIT_CON_CTRL_FIRST_F, /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_XMIT_CON_CTRL_TIME_CONG_F = IB_PC_XMIT_CON_CTRL_FIRST_F, IB_PC_XMIT_CON_CTRL_LAST_F, /* * PortVLXmitTimeCong fields */ IB_PC_VL_XMIT_TIME_CONG_FIRST_F, /* for PortSelect and CounterSelect, use IB_PC_PORT_SELECT_F and IB_PC_COUNTER_SELECT_F */ IB_PC_VL_XMIT_TIME_CONG0_F = IB_PC_VL_XMIT_TIME_CONG_FIRST_F, IB_PC_VL_XMIT_TIME_CONG1_F, IB_PC_VL_XMIT_TIME_CONG2_F, IB_PC_VL_XMIT_TIME_CONG3_F, IB_PC_VL_XMIT_TIME_CONG4_F, IB_PC_VL_XMIT_TIME_CONG5_F, IB_PC_VL_XMIT_TIME_CONG6_F, IB_PC_VL_XMIT_TIME_CONG7_F, IB_PC_VL_XMIT_TIME_CONG8_F, IB_PC_VL_XMIT_TIME_CONG9_F, IB_PC_VL_XMIT_TIME_CONG10_F, IB_PC_VL_XMIT_TIME_CONG11_F, IB_PC_VL_XMIT_TIME_CONG12_F, IB_PC_VL_XMIT_TIME_CONG13_F, IB_PC_VL_XMIT_TIME_CONG14_F, IB_PC_VL_XMIT_TIME_CONG_LAST_F, /* * Mellanox ExtendedPortInfo fields */ IB_MLNX_EXT_PORT_STATE_CHG_ENABLE_F, IB_MLNX_EXT_PORT_LINK_SPEED_SUPPORTED_F, 
IB_MLNX_EXT_PORT_LINK_SPEED_ENABLED_F, IB_MLNX_EXT_PORT_LINK_SPEED_ACTIVE_F, IB_MLNX_EXT_PORT_LAST_F, /* * Congestion Control Mad fields * bytes 24-31 of congestion control mad */ IB_CC_CCKEY_F, /* * CongestionInfo fields */ IB_CC_CONGESTION_INFO_FIRST_F, IB_CC_CONGESTION_INFO_F = IB_CC_CONGESTION_INFO_FIRST_F, IB_CC_CONGESTION_INFO_CONTROL_TABLE_CAP_F, IB_CC_CONGESTION_INFO_LAST_F, /* * CongestionKeyInfo fields */ IB_CC_CONGESTION_KEY_INFO_FIRST_F, IB_CC_CONGESTION_KEY_INFO_CC_KEY_F = IB_CC_CONGESTION_KEY_INFO_FIRST_F, IB_CC_CONGESTION_KEY_INFO_CC_KEY_PROTECT_BIT_F, IB_CC_CONGESTION_KEY_INFO_CC_KEY_LEASE_PERIOD_F, IB_CC_CONGESTION_KEY_INFO_CC_KEY_VIOLATIONS_F, IB_CC_CONGESTION_KEY_INFO_LAST_F, /* * CongestionLog (common) fields */ IB_CC_CONGESTION_LOG_FIRST_F, IB_CC_CONGESTION_LOG_LOGTYPE_F = IB_CC_CONGESTION_LOG_FIRST_F, IB_CC_CONGESTION_LOG_CONGESTION_FLAGS_F, IB_CC_CONGESTION_LOG_LAST_F, /* * CongestionLog (Switch) fields */ IB_CC_CONGESTION_LOG_SWITCH_FIRST_F, IB_CC_CONGESTION_LOG_SWITCH_LOG_EVENTS_COUNTER_F = IB_CC_CONGESTION_LOG_SWITCH_FIRST_F, IB_CC_CONGESTION_LOG_SWITCH_CURRENT_TIME_STAMP_F, IB_CC_CONGESTION_LOG_SWITCH_PORTMAP_F, IB_CC_CONGESTION_LOG_SWITCH_LAST_F, /* * CongestionLogEvent (Switch) fields */ IB_CC_CONGESTION_LOG_ENTRY_SWITCH_FIRST_F, IB_CC_CONGESTION_LOG_ENTRY_SWITCH_SLID_F = IB_CC_CONGESTION_LOG_ENTRY_SWITCH_FIRST_F, IB_CC_CONGESTION_LOG_ENTRY_SWITCH_DLID_F, IB_CC_CONGESTION_LOG_ENTRY_SWITCH_SL_F, IB_CC_CONGESTION_LOG_ENTRY_SWITCH_TIMESTAMP_F, IB_CC_CONGESTION_LOG_ENTRY_SWITCH_LAST_F, /* * CongestionLog (CA) fields */ IB_CC_CONGESTION_LOG_CA_FIRST_F, IB_CC_CONGESTION_LOG_CA_THRESHOLD_EVENT_COUNTER_F = IB_CC_CONGESTION_LOG_CA_FIRST_F, IB_CC_CONGESTION_LOG_CA_THRESHOLD_CONGESTION_EVENT_MAP_F, IB_CC_CONGESTION_LOG_CA_CURRENT_TIMESTAMP_F, IB_CC_CONGESTION_LOG_CA_LAST_F, /* * CongestionLogEvent (CA) fields */ IB_CC_CONGESTION_LOG_ENTRY_CA_FIRST_F, IB_CC_CONGESTION_LOG_ENTRY_CA_LOCAL_QP_CN_ENTRY_F = IB_CC_CONGESTION_LOG_ENTRY_CA_FIRST_F, IB_CC_CONGESTION_LOG_ENTRY_CA_SL_CN_ENTRY_F, IB_CC_CONGESTION_LOG_ENTRY_CA_SERVICE_TYPE_CN_ENTRY_F, IB_CC_CONGESTION_LOG_ENTRY_CA_REMOTE_QP_NUMBER_CN_ENTRY_F, IB_CC_CONGESTION_LOG_ENTRY_CA_LOCAL_LID_CN_F, IB_CC_CONGESTION_LOG_ENTRY_CA_REMOTE_LID_CN_ENTRY_F, IB_CC_CONGESTION_LOG_ENTRY_CA_TIMESTAMP_CN_ENTRY_F, IB_CC_CONGESTION_LOG_ENTRY_CA_LAST_F, /* * SwitchCongestionSetting fields */ IB_CC_SWITCH_CONGESTION_SETTING_FIRST_F, IB_CC_SWITCH_CONGESTION_SETTING_CONTROL_MAP_F = IB_CC_SWITCH_CONGESTION_SETTING_FIRST_F, IB_CC_SWITCH_CONGESTION_SETTING_VICTIM_MASK_F, IB_CC_SWITCH_CONGESTION_SETTING_CREDIT_MASK_F, IB_CC_SWITCH_CONGESTION_SETTING_THRESHOLD_F, IB_CC_SWITCH_CONGESTION_SETTING_PACKET_SIZE_F, IB_CC_SWITCH_CONGESTION_SETTING_CS_THRESHOLD_F, IB_CC_SWITCH_CONGESTION_SETTING_CS_RETURN_DELAY_F, IB_CC_SWITCH_CONGESTION_SETTING_MARKING_RATE_F, IB_CC_SWITCH_CONGESTION_SETTING_LAST_F, /* * SwitchPortCongestionSettingElement fields */ IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_FIRST_F, IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_VALID_F = IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_FIRST_F, IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_CONTROL_TYPE_F, IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_THRESHOLD_F, IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_PACKET_SIZE_F, IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_CONG_PARM_MARKING_RATE_F, IB_CC_SWITCH_PORT_CONGESTION_SETTING_ELEMENT_LAST_F, /* * CACongestionSetting fields */ IB_CC_CA_CONGESTION_SETTING_FIRST_F, IB_CC_CA_CONGESTION_SETTING_PORT_CONTROL_F = IB_CC_CA_CONGESTION_SETTING_FIRST_F, 
IB_CC_CA_CONGESTION_SETTING_CONTROL_MAP_F, IB_CC_CA_CONGESTION_SETTING_LAST_F, /* * CACongestionEntry fields */ IB_CC_CA_CONGESTION_ENTRY_FIRST_F, IB_CC_CA_CONGESTION_ENTRY_CCTI_TIMER_F = IB_CC_CA_CONGESTION_ENTRY_FIRST_F, IB_CC_CA_CONGESTION_ENTRY_CCTI_INCREASE_F, IB_CC_CA_CONGESTION_ENTRY_TRIGGER_THRESHOLD_F, IB_CC_CA_CONGESTION_ENTRY_CCTI_MIN_F, IB_CC_CA_CONGESTION_ENTRY_LAST_F, /* * CongestionControlTable fields */ IB_CC_CONGESTION_CONTROL_TABLE_FIRST_F, IB_CC_CONGESTION_CONTROL_TABLE_CCTI_LIMIT_F = IB_CC_CONGESTION_CONTROL_TABLE_FIRST_F, IB_CC_CONGESTION_CONTROL_TABLE_LAST_F, /* * CongestionControlTableEntry fields */ IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_FIRST_F, IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_CCT_SHIFT_F = IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_FIRST_F, IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_CCT_MULTIPLIER_F, IB_CC_CONGESTION_CONTROL_TABLE_ENTRY_LAST_F, /* * Timestamp fields */ IB_CC_TIMESTAMP_FIRST_F, IB_CC_TIMESTAMP_F = IB_CC_TIMESTAMP_FIRST_F, IB_CC_TIMESTAMP_LAST_F, /* * Node Record */ IB_SA_NR_FIRST_F, IB_SA_NR_LID_F = IB_SA_NR_FIRST_F, IB_SA_NR_BASEVER_F, IB_SA_NR_CLASSVER_F, IB_SA_NR_TYPE_F, IB_SA_NR_NPORTS_F, IB_SA_NR_SYSTEM_GUID_F, IB_SA_NR_GUID_F, IB_SA_NR_PORT_GUID_F, IB_SA_NR_PARTITION_CAP_F, IB_SA_NR_DEVID_F, IB_SA_NR_REVISION_F, IB_SA_NR_LOCAL_PORT_F, IB_SA_NR_VENDORID_F, IB_SA_NR_NODEDESC_F, IB_SA_NR_LAST_F, /* * PortSamplesResult fields */ IB_PSR_TAG_F, IB_PSR_SAMPLE_STATUS_F, IB_PSR_COUNTER0_F, IB_PSR_COUNTER1_F, IB_PSR_COUNTER2_F, IB_PSR_COUNTER3_F, IB_PSR_COUNTER4_F, IB_PSR_COUNTER5_F, IB_PSR_COUNTER6_F, IB_PSR_COUNTER7_F, IB_PSR_COUNTER8_F, IB_PSR_COUNTER9_F, IB_PSR_COUNTER10_F, IB_PSR_COUNTER11_F, IB_PSR_COUNTER12_F, IB_PSR_COUNTER13_F, IB_PSR_COUNTER14_F, IB_PSR_LAST_F, /* * PortInfoExtended fields */ IB_PORT_EXT_FIRST_F, IB_PORT_EXT_CAPMASK_F = IB_PORT_EXT_FIRST_F, IB_PORT_EXT_FEC_MODE_ACTIVE_F, IB_PORT_EXT_FDR_FEC_MODE_SUPPORTED_F, IB_PORT_EXT_FDR_FEC_MODE_ENABLED_F, IB_PORT_EXT_EDR_FEC_MODE_SUPPORTED_F, IB_PORT_EXT_EDR_FEC_MODE_ENABLED_F, IB_PORT_EXT_LAST_F, /* * PortExtendedSpeedsCounters RSFEC active fields */ IB_PESC_RSFEC_FIRST_F, IB_PESC_RSFEC_PORT_SELECT_F = IB_PESC_RSFEC_FIRST_F, IB_PESC_RSFEC_COUNTER_SELECT_F, IB_PESC_RSFEC_SYNC_HDR_ERR_CTR_F, IB_PESC_RSFEC_UNK_BLOCK_CTR_F, IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE0_F, IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE1_F, IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE2_F, IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE3_F, IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE4_F, IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE5_F, IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE6_F, IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE7_F, IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE8_F, IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE9_F, IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE10_F, IB_PESC_RSFEC_FEC_CORR_SYMBOL_CTR_LANE11_F, IB_PESC_PORT_FEC_CORR_BLOCK_CTR_F, IB_PESC_PORT_FEC_UNCORR_BLOCK_CTR_F, IB_PESC_PORT_FEC_CORR_SYMBOL_CTR_F, IB_PESC_RSFEC_LAST_F, /* * More PortCountersExtended fields */ IB_PC_EXT_COUNTER_SELECT2_F, IB_PC_EXT_ERR_SYM_F, IB_PC_EXT_LINK_RECOVERS_F, IB_PC_EXT_LINK_DOWNED_F, IB_PC_EXT_ERR_RCV_F, IB_PC_EXT_ERR_PHYSRCV_F, IB_PC_EXT_ERR_SWITCH_REL_F, IB_PC_EXT_XMT_DISCARDS_F, IB_PC_EXT_ERR_XMTCONSTR_F, IB_PC_EXT_ERR_RCVCONSTR_F, IB_PC_EXT_ERR_LOCALINTEG_F, IB_PC_EXT_ERR_EXCESS_OVR_F, IB_PC_EXT_VL15_DROPPED_F, IB_PC_EXT_XMT_WAIT_F, IB_PC_EXT_QP1_DROP_F, IB_PC_EXT_ERR_LAST_F, /* * Another PortCounters field */ IB_PC_QP1_DROP_F, /* * More PortInfoExtended fields (HDR) */ IB_PORT_EXT_HDR_FEC_MODE_SUPPORTED_F, IB_PORT_EXT_HDR_FEC_MODE_ENABLED_F, IB_PORT_EXT_HDR_FEC_MODE_LAST_F, /* * 
More PortInfoExtended fields (NDR) */ IB_PORT_EXT_NDR_FEC_MODE_SUPPORTED_F, IB_PORT_EXT_NDR_FEC_MODE_ENABLED_F, IB_PORT_EXT_NDR_FEC_MODE_LAST_F, /* * More PortInfo fields (XDR) */ IB_PORT_LINK_SPEED_EXT_ACTIVE_2_F, IB_PORT_LINK_SPEED_EXT_SUPPORTED_2_F, IB_PORT_LINK_SPEED_EXT_ENABLED_2_F, IB_PORT_LINK_SPEED_EXT_2_LAST_F, IB_FIELD_LAST_ /* must be last */ }; /* * SA RMPP section */ enum RMPP_TYPE_ENUM { IB_RMPP_TYPE_NONE, IB_RMPP_TYPE_DATA, IB_RMPP_TYPE_ACK, IB_RMPP_TYPE_STOP, IB_RMPP_TYPE_ABORT, }; enum RMPP_FLAGS_ENUM { IB_RMPP_FLAG_ACTIVE = 1 << 0, IB_RMPP_FLAG_FIRST = 1 << 1, IB_RMPP_FLAG_LAST = 1 << 2, }; typedef struct { int type; int flags; int status; union { uint32_t u; uint32_t segnum; } d1; union { uint32_t u; uint32_t len; uint32_t newwin; } d2; } ib_rmpp_hdr_t; enum SA_SIZES_ENUM { SA_HEADER_SZ = 20, }; typedef struct ib_sa_call { unsigned attrid; unsigned mod; uint64_t mask; unsigned method; uint64_t trid; /* used for out mad if nonzero, return real val */ unsigned recsz; /* return field */ ib_rmpp_hdr_t rmpp; } ib_sa_call_t; typedef struct ib_vendor_call { unsigned method; unsigned mgmt_class; unsigned attrid; unsigned mod; uint32_t oui; unsigned timeout; ib_rmpp_hdr_t rmpp; } ib_vendor_call_t; typedef struct ib_bm_call { unsigned method; unsigned attrid; unsigned mod; unsigned timeout; uint64_t bkey; } ib_bm_call_t; typedef struct ibmad_ports_item { struct ibmad_port *port; char ca_name[20]; } ibmad_ports_item_t; struct ibmad_ports_pair { ibmad_ports_item_t smi; ibmad_ports_item_t gsi; }; #define IB_MIN_UCAST_LID 1 #define IB_MAX_UCAST_LID (0xc000-1) #define IB_MIN_MCAST_LID 0xc000 #define IB_MAX_MCAST_LID (0xffff-1) #define IB_LID_VALID(lid) ((lid) >= IB_MIN_UCAST_LID && lid <= IB_MAX_UCAST_LID) #define IB_MLID_VALID(lid) ((lid) >= IB_MIN_MCAST_LID && lid <= IB_MAX_MCAST_LID) #define MAD_DEF_RETRIES 3 #define MAD_DEF_TIMEOUT_MS 1000 enum MAD_DEST { IB_DEST_LID, IB_DEST_DRPATH, IB_DEST_GUID, IB_DEST_DRSLID, IB_DEST_GID }; enum MAD_NODE_TYPE { IB_NODE_CA = 1, IB_NODE_SWITCH, IB_NODE_ROUTER, NODE_RNIC, IB_NODE_MAX = NODE_RNIC }; /******************************************************************************/ /* portid.c */ char *portid2str(ib_portid_t *portid); int portid2portnum(ib_portid_t *portid); int str2drpath(ib_dr_path_t *path, char *routepath, int drslid, int drdlid); char *drpath2str(ib_dr_path_t *path, char *dstr, size_t dstr_size); static inline int ib_portid_set(ib_portid_t * portid, int lid, int qp, int qkey) { portid->lid = lid; portid->qp = qp; portid->qkey = qkey; portid->grh_present = 0; return 0; } /* fields.c */ uint32_t mad_get_field(void *buf, int base_offs, enum MAD_FIELDS field); void mad_set_field(void *buf, int base_offs, enum MAD_FIELDS field, uint32_t val); /* field must be byte aligned */ uint64_t mad_get_field64(void *buf, int base_offs, enum MAD_FIELDS field); void mad_set_field64(void *buf, int base_offs, enum MAD_FIELDS field, uint64_t val); void mad_set_array(void *buf, int base_offs, enum MAD_FIELDS field, void *val); void mad_get_array(void *buf, int base_offs, enum MAD_FIELDS field, void *val); void mad_decode_field(uint8_t *buf, enum MAD_FIELDS field, void *val); void mad_encode_field(uint8_t *buf, enum MAD_FIELDS field, void *val); int mad_print_field(enum MAD_FIELDS field, const char *name, void *val); char *mad_dump_field(enum MAD_FIELDS field, char *buf, int bufsz, void *val); char *mad_dump_val(enum MAD_FIELDS field, char *buf, int bufsz, void *val); const char *mad_field_name(enum MAD_FIELDS field); /* mad.c */ void *mad_encode(void 
*buf, ib_rpc_t *rpc, ib_dr_path_t *drpath, void *data); uint64_t mad_trid(void); int mad_build_pkt(void *umad, ib_rpc_t *rpc, ib_portid_t *dport, ib_rmpp_hdr_t *rmpp, void *data); /* New interface */ void madrpc_show_errors(int set); int madrpc_set_retries(int retries); int madrpc_set_timeout(int timeout); struct ibmad_port *mad_rpc_open_port(char *dev_name, int dev_port, int *mgmt_classes, int num_classes); struct ibmad_ports_pair *mad_rpc_open_port2(char *dev_name, int dev_port, int *mgmt_classes, int num_classes, unsigned enforce_smi); void mad_rpc_close_port(struct ibmad_port *srcport); void mad_rpc_close_port2(struct ibmad_ports_pair *srcport); /* * On redirection, the dport argument is updated with the redirection target, * so subsequent MADs will not go through the redirection process again but * reach the target directly. */ void *mad_rpc(const struct ibmad_port *srcport, ib_rpc_t *rpc, ib_portid_t *dport, void *payload, void *rcvdata); void *mad_rpc_rmpp(const struct ibmad_port *srcport, ib_rpc_t *rpc, ib_portid_t *dport, ib_rmpp_hdr_t *rmpp, void *data); int mad_rpc_portid(struct ibmad_port *srcport); void mad_rpc_set_retries(struct ibmad_port *port, int retries); void mad_rpc_set_timeout(struct ibmad_port *port, int timeout); int mad_rpc_class_agent(struct ibmad_port *srcport, int cls); int mad_get_timeout(const struct ibmad_port *srcport, int override_ms); int mad_get_retries(const struct ibmad_port *srcport); /* register.c */ int mad_register_client(int mgmt, uint8_t rmpp_version) __attribute__((deprecated)); int mad_register_server(int mgmt, uint8_t rmpp_version, long method_mask[16 / sizeof(long)], uint32_t class_oui) __attribute__((deprecated)); /* register.c new interface */ int mad_register_client_via(int mgmt, uint8_t rmpp_version, struct ibmad_port *srcport); int mad_register_server_via(int mgmt, uint8_t rmpp_version, long method_mask[16 / sizeof(long)], uint32_t class_oui, struct ibmad_port *srcport); int mad_class_agent(int mgmt) __attribute__((deprecated)); /* serv.c */ int mad_send(ib_rpc_t *rpc, ib_portid_t *dport, ib_rmpp_hdr_t *rmpp, void *data) __attribute__((deprecated)); void *mad_receive(void *umad, int timeout) __attribute__((deprecated)); int mad_respond(void *umad, ib_portid_t *portid, uint32_t rstatus) __attribute__((deprecated)); /* serv.c new interface */ int mad_send_via(ib_rpc_t *rpc, ib_portid_t *dport, ib_rmpp_hdr_t *rmpp, void *data, struct ibmad_port *srcport); void *mad_receive_via(void *umad, int timeout, struct ibmad_port *srcport); int mad_respond_via(void *umad, ib_portid_t *portid, uint32_t rstatus, struct ibmad_port *srcport); void *mad_alloc(void); void mad_free(void *umad); /* vendor.c */ uint8_t *ib_vendor_call(void *data, ib_portid_t *portid, ib_vendor_call_t *call) __attribute__((deprecated)); /* vendor.c new interface */ uint8_t *ib_vendor_call_via(void *data, ib_portid_t *portid, ib_vendor_call_t *call, struct ibmad_port *srcport); static inline int mad_is_vendor_range1(int mgmt) { return mgmt >= 0x9 && mgmt <= 0xf; } static inline int mad_is_vendor_range2(int mgmt) { return mgmt >= 0x30 && mgmt <= 0x4f; } /* rpc.c */ int madrpc_portid(void) __attribute__((deprecated)); void *madrpc(ib_rpc_t *rpc, ib_portid_t *dport, void *payload, void *rcvdata) __attribute__((deprecated)); void *madrpc_rmpp(ib_rpc_t *rpc, ib_portid_t *dport, ib_rmpp_hdr_t *rmpp, void *data) __attribute__((deprecated)); void madrpc_init(char *dev_name, int dev_port, int *mgmt_classes, int num_classes) __attribute__((deprecated)); void madrpc_save_mad(void *madbuf, 
int len) __attribute__((deprecated)); /* smp.c */ uint8_t *smp_query(void *buf, ib_portid_t *id, unsigned attrid, unsigned mod, unsigned timeout) __attribute__((deprecated)); uint8_t *smp_set(void *buf, ib_portid_t *id, unsigned attrid, unsigned mod, unsigned timeout) __attribute__((deprecated)); /* smp.c new interface */ uint8_t *smp_query_via(void *buf, ib_portid_t *id, unsigned attrid, unsigned mod, unsigned timeout, const struct ibmad_port *srcport); uint8_t *smp_set_via(void *buf, ib_portid_t *id, unsigned attrid, unsigned mod, unsigned timeout, const struct ibmad_port *srcport); uint8_t *smp_query_status_via(void *rcvbuf, ib_portid_t *portid, unsigned attrid, unsigned mod, unsigned timeout, int *rstatus, const struct ibmad_port *srcport); uint8_t *smp_set_status_via(void *data, ib_portid_t *portid, unsigned attrid, unsigned mod, unsigned timeout, int *rstatus, const struct ibmad_port *srcport); void smp_mkey_set(struct ibmad_port *srcport, uint64_t mkey); uint64_t smp_mkey_get(const struct ibmad_port *srcport); /* cc.c */ void *cc_query_status_via(void *rcvbuf, ib_portid_t *portid, unsigned attrid, unsigned mod, unsigned timeout, int *rstatus, const struct ibmad_port *srcport, uint64_t cckey); void *cc_config_status_via(void *payload, void *rcvbuf, ib_portid_t *portid, unsigned attrid, unsigned mod, unsigned timeout, int *rstatus, const struct ibmad_port *srcport, uint64_t cckey); /* sa.c */ uint8_t *sa_call(void *rcvbuf, ib_portid_t *portid, ib_sa_call_t *sa, unsigned timeout) __attribute__((deprecated)); int ib_path_query(ibmad_gid_t srcgid, ibmad_gid_t destgid, ib_portid_t *sm_id, void *buf) __attribute__((deprecated)); /* sa.c new interface */ uint8_t *sa_rpc_call(const struct ibmad_port *srcport, void *rcvbuf, ib_portid_t *portid, ib_sa_call_t *sa, unsigned timeout); int ib_path_query_via(const struct ibmad_port *srcport, ibmad_gid_t srcgid, ibmad_gid_t destgid, ib_portid_t *sm_id, void *buf); /* returns lid */ int ib_node_query_via(const struct ibmad_port *srcport, uint64_t guid, ib_portid_t *sm_id, void *buf); /* resolve.c */ int ib_resolve_smlid(ib_portid_t *sm_id, int timeout) __attribute__((deprecated)); int ib_resolve_portid_str(ib_portid_t *portid, char *addr_str, enum MAD_DEST dest, ib_portid_t *sm_id) __attribute__((deprecated)); int ib_resolve_self(ib_portid_t *portid, int *portnum, ibmad_gid_t *gid) __attribute__((deprecated)); /* resolve.c new interface */ int ib_resolve_smlid_via(ib_portid_t *sm_id, int timeout, const struct ibmad_port *srcport); int ib_resolve_guid_via(ib_portid_t *portid, uint64_t *guid, ib_portid_t *sm_id, int timeout, const struct ibmad_port *srcport); int ib_resolve_gid_via(ib_portid_t *portid, ibmad_gid_t gid, ib_portid_t *sm_id, int timeout, const struct ibmad_port *srcport); int ib_resolve_portid_str_via(ib_portid_t *portid, char *addr_str, enum MAD_DEST dest, ib_portid_t *sm_id, const struct ibmad_port *srcport); int ib_resolve_self_via(ib_portid_t *portid, int *portnum, ibmad_gid_t *gid, const struct ibmad_port *srcport); /* gs.c new interface */ uint8_t *pma_query_via(void *rcvbuf, ib_portid_t *dest, int port, unsigned timeout, unsigned id, const struct ibmad_port *srcport); uint8_t *performance_reset_via(void *rcvbuf, ib_portid_t *dest, int port, unsigned mask, unsigned timeout, unsigned id, const struct ibmad_port *srcport); /* bm.c */ uint8_t *bm_call_via(void *data, ib_portid_t *portid, ib_bm_call_t *call, struct ibmad_port *srcport); /* dump.c */ ib_mad_dump_fn mad_dump_int, mad_dump_uint, mad_dump_hex, mad_dump_rhex, 
mad_dump_bitfield, mad_dump_array, mad_dump_string, mad_dump_linkwidth, mad_dump_linkwidthsup, mad_dump_linkwidthen, mad_dump_linkdowndefstate, mad_dump_linkspeed, mad_dump_linkspeedsup, mad_dump_linkspeeden, mad_dump_linkspeedext, mad_dump_linkspeedextsup, mad_dump_linkspeedexten, mad_dump_portstate, mad_dump_portstates, mad_dump_physportstate, mad_dump_portcapmask, mad_dump_portcapmask2, mad_dump_mtu, mad_dump_vlcap, mad_dump_opervls, mad_dump_node_type, mad_dump_sltovl, mad_dump_vlarbitration, mad_dump_nodedesc, mad_dump_nodeinfo, mad_dump_portinfo, mad_dump_switchinfo, mad_dump_perfcounters, mad_dump_perfcounters_ext, mad_dump_perfcounters_xmt_sl, mad_dump_perfcounters_rcv_sl, mad_dump_perfcounters_xmt_disc, mad_dump_perfcounters_rcv_err, mad_dump_portsamples_control, mad_dump_port_ext_speeds_counters, mad_dump_perfcounters_port_op_rcv_counters, mad_dump_perfcounters_port_flow_ctl_counters, mad_dump_perfcounters_port_vl_op_packet, mad_dump_perfcounters_port_vl_op_data, mad_dump_perfcounters_port_vl_xmit_flow_ctl_update_errors, mad_dump_perfcounters_port_vl_xmit_wait_counters, mad_dump_perfcounters_sw_port_vl_congestion, mad_dump_perfcounters_rcv_con_ctrl, mad_dump_perfcounters_sl_rcv_fecn, mad_dump_perfcounters_sl_rcv_becn, mad_dump_perfcounters_xmit_con_ctrl, mad_dump_perfcounters_vl_xmit_time_cong, mad_dump_mlnx_ext_port_info, mad_dump_cc_congestioninfo, mad_dump_cc_congestionkeyinfo, mad_dump_cc_congestionlog, mad_dump_cc_congestionlogswitch, mad_dump_cc_congestionlogentryswitch, mad_dump_cc_congestionlogca, mad_dump_cc_congestionlogentryca, mad_dump_cc_switchcongestionsetting, mad_dump_cc_switchportcongestionsettingelement, mad_dump_cc_cacongestionsetting, mad_dump_cc_cacongestionentry, mad_dump_cc_congestioncontroltable, mad_dump_cc_congestioncontroltableentry, mad_dump_cc_timestamp, mad_dump_classportinfo, mad_dump_portsamples_result, mad_dump_portinfo_ext, mad_dump_port_ext_speeds_counters_rsfec_active, mad_dump_linkspeedext2, mad_dump_linkspeedextsup2, mad_dump_linkspeedexten2; void mad_dump_fields(char *buf, int bufsz, void *val, int valsz, int start, int end); extern int ibdebug; #if __BYTE_ORDER == __LITTLE_ENDIAN #ifndef ntohll #define ntohll bswap_64 #endif #ifndef htonll #define htonll bswap_64 #endif #elif __BYTE_ORDER == __BIG_ENDIAN #ifndef ntohll #define ntohll(x) (x) #endif #ifndef htonll #define htonll(x) (x) #endif #endif /* __BYTE_ORDER == __BIG_ENDIAN */ /* Misc. macros: */ /** align value \a l to \a size (ceil) */ #define ALIGN(l, size) (((l) + ((size) - 1)) / (size) * (size)) /** printf style warning MACRO, includes name of function and pid */ #define IBWARN(fmt, ...) fprintf(stderr, "ibwarn: [%d] %s: " fmt "\n", \ (int)getpid(), __func__, ## __VA_ARGS__) #define IBDEBUG(fmt, ...) fprintf(stdout, "ibdebug: [%d] %s: " fmt "\n", \ (int)getpid(), __func__, ## __VA_ARGS__) #define IBVERBOSE(fmt, ...) fprintf(stdout, "[%d] %s: " fmt "\n", \ (int)getpid(), __func__, ## __VA_ARGS__) #define IBPANIC(fmt, ...) do { \ fprintf(stderr, "ibpanic: [%d] %s: " fmt ": %m\n", \ (int)getpid(), __func__, ## __VA_ARGS__); \ exit(-1); \ } while(0) void xdump(FILE *file, const char *msg, void *p, int size); #ifdef __cplusplus } #endif #endif /* _MAD_H_ */ rdma-core-56.1/libibmad/mad_internal.h000066400000000000000000000034201477342711600177020ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #ifndef _MAD_INTERNAL_H_ #define _MAD_INTERNAL_H_ #define MAX_CLASS 256 struct ibmad_port { int port_id; /* file descriptor returned by umad_open() */ int class_agents[MAX_CLASS]; /* class2agent mapper */ int timeout, retries; uint64_t smp_mkey; }; extern struct ibmad_port *ibmp; extern int madrpc_timeout; extern int madrpc_retries; #endif /* _MAD_INTERNAL_H_ */ rdma-core-56.1/libibmad/mad_osd.h000066400000000000000000000000441477342711600166520ustar00rootroot00000000000000#warning "This header is obsolete." rdma-core-56.1/libibmad/portid.c000066400000000000000000000062301477342711600165430ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <arpa/inet.h> #include <infiniband/mad.h> #undef DEBUG #define DEBUG if (ibdebug) IBWARN int portid2portnum(ib_portid_t * portid) { if (portid->lid > 0) return -1; if (portid->drpath.cnt == 0) return 0; return portid->drpath.p[(portid->drpath.cnt - 1)]; } char *portid2str(ib_portid_t * portid) { static char buf[1024] = "local"; int n = 0; if (portid->lid > 0) { n += sprintf(buf + n, "Lid %d", portid->lid); if (portid->grh_present) { char gid[sizeof "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"]; if (inet_ntop(AF_INET6, portid->gid, gid, sizeof(gid))) n += sprintf(buf + n, " Gid %s", gid); } if (portid->drpath.cnt) n += sprintf(buf + n, " "); else return buf; } n += sprintf(buf + n, "DR path "); drpath2str(&(portid->drpath), buf + n, sizeof(buf) - n); return buf; } int str2drpath(ib_dr_path_t * path, char *routepath, int drslid, int drdlid) { char *s, *str; char *tmp; path->cnt = -1; if (!routepath || !(tmp = strdup(routepath))) goto Exit; DEBUG("DR str: %s", routepath); str = tmp; while (str && *str) { if ((s = strchr(str, ','))) *s = 0; path->p[++path->cnt] = (uint8_t) atoi(str); if (!s) break; str = s + 1; } free(tmp); Exit: path->drdlid = drdlid ? drdlid : 0xffff; path->drslid = drslid ? drslid : 0xffff; return path->cnt; } char *drpath2str(ib_dr_path_t * path, char *dstr, size_t dstr_size) { int i = 0; int rc = snprintf(dstr, dstr_size, "slid %u; dlid %u; %d", path->drslid, path->drdlid, path->p[0]); if (rc >= (int)dstr_size) return dstr; for (i = 1; i <= path->cnt; i++) { rc += snprintf(dstr + rc, dstr_size - rc, ",%d", path->p[i]); if (rc >= (int)dstr_size) break; } return (dstr); }
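/*
 * Usage sketch (added illustration, not part of the original source):
 * parse a directed-route string with the helpers above and print it
 * back; the route "0,1,4" is a hypothetical example.
 *
 *	ib_dr_path_t path;
 *	char buf[256];
 *	if (str2drpath(&path, "0,1,4", 0, 0) >= 0)
 *		printf("%s\n", drpath2str(&path, buf, sizeof(buf)));
 */
rdma-core-56.1/libibmad/register.c000066400000000000000000000110271477342711600170660ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE.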
* */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <errno.h> #include <infiniband/umad.h> #include <infiniband/mad.h> #include "mad_internal.h" #undef DEBUG #define DEBUG if (ibdebug) IBWARN static int mgmt_class_vers(int mgmt_class) { if ((mgmt_class >= IB_VENDOR_RANGE1_START_CLASS && mgmt_class <= IB_VENDOR_RANGE1_END_CLASS) || (mgmt_class >= IB_VENDOR_RANGE2_START_CLASS && mgmt_class <= IB_VENDOR_RANGE2_END_CLASS)) return 1; switch (mgmt_class) { case IB_SMI_CLASS: case IB_SMI_DIRECT_CLASS: return 1; case IB_SA_CLASS: return 2; case IB_PERFORMANCE_CLASS: return 1; case IB_DEVICE_MGMT_CLASS: return 1; case IB_CC_CLASS: return 2; case IB_BOARD_MGMT_CLASS: return 1; } return 0; } int mad_class_agent(int mgmt) { if (mgmt < 1 || mgmt >= MAX_CLASS) return -1; return ibmp->class_agents[mgmt]; } static int mad_register_port_client(int port_id, int mgmt, uint8_t rmpp_version) { int vers, agent; if ((vers = mgmt_class_vers(mgmt)) <= 0) { DEBUG("Unknown class %d mgmt_class", mgmt); return -1; } agent = umad_register(port_id, mgmt, vers, rmpp_version, NULL); if (agent < 0) DEBUG("Can't register agent for class %d", mgmt); return agent; } int mad_register_client(int mgmt, uint8_t rmpp_version) { return mad_register_client_via(mgmt, rmpp_version, ibmp); } int mad_register_client_via(int mgmt, uint8_t rmpp_version, struct ibmad_port *srcport) { int agent; if (!srcport) return -1; agent = mad_register_port_client(mad_rpc_portid(srcport), mgmt, rmpp_version); if (agent < 0) return agent; srcport->class_agents[mgmt] = agent; return 0; } int mad_register_server(int mgmt, uint8_t rmpp_version, long method_mask[16 / sizeof(long)], uint32_t class_oui) { return mad_register_server_via(mgmt, rmpp_version, method_mask, class_oui, ibmp); } int mad_register_server_via(int mgmt, uint8_t rmpp_version, long method_mask[16 / sizeof(long)], uint32_t class_oui, struct ibmad_port *srcport) { long class_method_mask[16 / sizeof(long)]; uint8_t oui[3]; int agent, vers; if (method_mask) memcpy(class_method_mask, method_mask, sizeof class_method_mask); else memset(class_method_mask, 0xff, sizeof(class_method_mask)); if (!srcport) return -1; if (srcport->class_agents[mgmt] >= 0) { DEBUG("Class 0x%x already registered %d", mgmt, srcport->class_agents[mgmt]); return -1; } if ((vers = mgmt_class_vers(mgmt)) <= 0) { DEBUG("Unknown class 0x%x mgmt_class", mgmt); return -1; } if (mgmt >= IB_VENDOR_RANGE2_START_CLASS && mgmt <= IB_VENDOR_RANGE2_END_CLASS) { oui[0] = (class_oui >> 16) & 0xff; oui[1] = (class_oui >> 8) & 0xff; oui[2] = class_oui & 0xff; if ((agent = umad_register_oui(srcport->port_id, mgmt, rmpp_version, oui, class_method_mask)) < 0) { DEBUG("Can't register agent for class %d", mgmt); return -1; } } else if ((agent = umad_register(srcport->port_id, mgmt, vers, rmpp_version, class_method_mask)) < 0) { DEBUG("Can't register agent for class %d", mgmt); return -1; } srcport->class_agents[mgmt] = agent; return agent; }
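/*
 * Usage sketch (added illustration, not part of the original source):
 * register as a client of the Performance Management class on a port
 * assumed to come from mad_rpc_open_port(), then look up its agent id.
 *
 *	int agent;
 *	if (mad_register_client_via(IB_PERFORMANCE_CLASS, 0, port) < 0)
 *		IBWARN("PMA registration failed");
 *	else
 *		agent = mad_rpc_class_agent(port, IB_PERFORMANCE_CLASS);
 */
rdma-core-56.1/libibmad/resolve.c000066400000000000000000000147031477342711600167250ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * * This software is available to you under a choice of one of two * licenses.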
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <string.h> #include <errno.h> #include <arpa/inet.h> #include <infiniband/mad.h> #include "mad_internal.h" #undef DEBUG #define DEBUG if (ibdebug) IBWARN int ib_resolve_smlid_via(ib_portid_t * sm_id, int timeout, const struct ibmad_port *srcport) { ib_portid_t self = { 0 }; uint8_t portinfo[64]; int lid; memset(sm_id, 0, sizeof(*sm_id)); if (!smp_query_via(portinfo, &self, IB_ATTR_PORT_INFO, 0, 0, srcport)) return -1; mad_decode_field(portinfo, IB_PORT_SMLID_F, &lid); if (!IB_LID_VALID(lid)) { errno = ENXIO; return -1; } mad_decode_field(portinfo, IB_PORT_SMSL_F, &sm_id->sl); return ib_portid_set(sm_id, lid, 0, 0); } int ib_resolve_smlid(ib_portid_t * sm_id, int timeout) { return ib_resolve_smlid_via(sm_id, timeout, ibmp); } int ib_resolve_gid_via(ib_portid_t * portid, ibmad_gid_t gid, ib_portid_t * sm_id, int timeout, const struct ibmad_port *srcport) { ib_portid_t sm_portid = { 0 }; char buf[IB_SA_DATA_SIZE] = { 0 }; if (!sm_id) sm_id = &sm_portid; if (!IB_LID_VALID(sm_id->lid)) { if (ib_resolve_smlid_via(sm_id, timeout, srcport) < 0) return -1; } if ((portid->lid = ib_path_query_via(srcport, gid, gid, sm_id, buf)) < 0) return -1; return 0; } int ib_resolve_guid_via(ib_portid_t * portid, uint64_t * guid, ib_portid_t * sm_id, int timeout, const struct ibmad_port *srcport) { ib_portid_t sm_portid = { 0 }; uint8_t buf[IB_SA_DATA_SIZE] = { 0 }; ib_portid_t self = { 0 }; uint64_t selfguid, prefix; ibmad_gid_t selfgid; uint8_t nodeinfo[64]; if (!sm_id) sm_id = &sm_portid; if (!IB_LID_VALID(sm_id->lid)) { if (ib_resolve_smlid_via(sm_id, timeout, srcport) < 0) return -1; } if (!smp_query_via(nodeinfo, &self, IB_ATTR_NODE_INFO, 0, 0, srcport)) return -1; mad_decode_field(nodeinfo, IB_NODE_PORT_GUID_F, &selfguid); mad_set_field64(selfgid, 0, IB_GID_PREFIX_F, IB_DEFAULT_SUBN_PREFIX); mad_set_field64(selfgid, 0, IB_GID_GUID_F, selfguid); memcpy(&prefix, portid->gid, sizeof(prefix)); if (!prefix) mad_set_field64(portid->gid, 0, IB_GID_PREFIX_F, IB_DEFAULT_SUBN_PREFIX); if (guid) mad_set_field64(portid->gid, 0, IB_GID_GUID_F, *guid); if ((portid->lid = ib_path_query_via(srcport, selfgid, portid->gid, sm_id, buf)) < 0) return -1; mad_decode_field(buf, IB_SA_PR_SL_F, &portid->sl); return 0; } int ib_resolve_portid_str_via(ib_portid_t * portid, char *addr_str, enum MAD_DEST dest_type, ib_portid_t * sm_id, const struct ibmad_port *srcport) {
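/*
 * Note added for clarity: addr_str is interpreted per dest_type; the
 * values below are hypothetical examples. IB_DEST_LID: "42".
 * IB_DEST_DRPATH: "0,1,4". IB_DEST_GUID: "0x0002c90300001234".
 * IB_DEST_DRSLID: "42,0,1,4" (a LID, then the DR route; the DrSLID is
 * set to the local LID below). IB_DEST_GID: "fe80::2:c903:0:1234".
 */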
ibmad_gid_t gid; uint64_t guid; int lid; char *routepath; ib_portid_t selfportid = { 0 }; int selfport = 0; memset(portid, 0, sizeof *portid); switch (dest_type) { case IB_DEST_LID: lid = strtol(addr_str, NULL, 0); if (!IB_LID_VALID(lid)) { errno = EINVAL; return -1; } return ib_portid_set(portid, lid, 0, 0); case IB_DEST_DRPATH: if (str2drpath(&portid->drpath, addr_str, 0, 0) < 0) { errno = EINVAL; return -1; } return 0; case IB_DEST_GUID: if (!(guid = strtoull(addr_str, NULL, 0))) { errno = EINVAL; return -1; } /* keep guid in portid? */ return ib_resolve_guid_via(portid, &guid, sm_id, 0, srcport); case IB_DEST_DRSLID: lid = strtol(addr_str, &routepath, 0); routepath++; if (!IB_LID_VALID(lid)) { errno = EINVAL; return -1; } ib_portid_set(portid, lid, 0, 0); /* handle DR parsing and set DrSLID to local lid */ if (ib_resolve_self_via(&selfportid, &selfport, NULL, srcport) < 0) return -1; if (str2drpath(&portid->drpath, routepath, selfportid.lid, 0) < 0) { errno = EINVAL; return -1; } return 0; case IB_DEST_GID: if (inet_pton(AF_INET6, addr_str, &gid) <= 0) return -1; return ib_resolve_gid_via(portid, gid, sm_id, 0, srcport); default: IBWARN("bad dest_type %d", dest_type); errno = EINVAL; } return -1; } int ib_resolve_portid_str(ib_portid_t * portid, char *addr_str, enum MAD_DEST dest_type, ib_portid_t * sm_id) { return ib_resolve_portid_str_via(portid, addr_str, dest_type, sm_id, ibmp); } int ib_resolve_self_via(ib_portid_t * portid, int *portnum, ibmad_gid_t * gid, const struct ibmad_port *srcport) { ib_portid_t self = { 0 }; uint8_t portinfo[64]; uint8_t nodeinfo[64]; uint64_t guid, prefix; if (!smp_query_via(nodeinfo, &self, IB_ATTR_NODE_INFO, 0, 0, srcport)) return -1; if (!smp_query_via(portinfo, &self, IB_ATTR_PORT_INFO, 0, 0, srcport)) return -1; mad_decode_field(portinfo, IB_PORT_LID_F, &portid->lid); mad_decode_field(portinfo, IB_PORT_SMSL_F, &portid->sl); mad_decode_field(portinfo, IB_PORT_GID_PREFIX_F, &prefix); mad_decode_field(nodeinfo, IB_NODE_PORT_GUID_F, &guid); if (portnum) mad_decode_field(nodeinfo, IB_NODE_LOCAL_PORT_F, portnum); if (gid) { mad_encode_field(*gid, IB_GID_PREFIX_F, &prefix); mad_encode_field(*gid, IB_GID_GUID_F, &guid); } return 0; } int ib_resolve_self(ib_portid_t * portid, int *portnum, ibmad_gid_t * gid) { return ib_resolve_self_via(portid, portnum, gid, ibmp); } rdma-core-56.1/libibmad/rpc.c000066400000000000000000000357211477342711600160350ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2009 HNR Consulting. All rights reserved. * Copyright (c) 2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <errno.h> #include <infiniband/umad.h> #include <infiniband/mad.h> #include "mad_internal.h" int ibdebug; static struct ibmad_port mad_port; struct ibmad_port *ibmp = &mad_port; static int iberrs; int madrpc_retries = MAD_DEF_RETRIES; int madrpc_timeout = MAD_DEF_TIMEOUT_MS; static void *save_mad; static int save_mad_len = 256; #undef DEBUG #define DEBUG if (ibdebug) IBWARN #define ERRS(fmt, ...) do { \ if (iberrs || ibdebug) \ IBWARN(fmt, ## __VA_ARGS__); \ } while (0) #define MAD_TID(mad) (*((uint64_t *)((char *)(mad) + 8))) void madrpc_show_errors(int set) { iberrs = set; } void madrpc_save_mad(void *madbuf, int len) { save_mad = madbuf; save_mad_len = len; } int madrpc_set_retries(int retries) { if (retries > 0) madrpc_retries = retries; return madrpc_retries; } int madrpc_set_timeout(int timeout) { madrpc_timeout = timeout; return 0; } void mad_rpc_set_retries(struct ibmad_port *port, int retries) { port->retries = retries; } void mad_rpc_set_timeout(struct ibmad_port *port, int timeout) { port->timeout = timeout; } int madrpc_portid(void) { return ibmp->port_id; } int mad_rpc_portid(struct ibmad_port *srcport) { return srcport->port_id; } int mad_rpc_class_agent(struct ibmad_port *port, int class) { if (class < 1 || class >= MAX_CLASS) return -1; return port->class_agents[class]; } static int _do_madrpc(int port_id, void *sndbuf, void *rcvbuf, int agentid, int len, int timeout, int max_retries, int *p_error) { uint32_t trid; /* only low 32 bits - see mad_trid() */ int retries; int length, status; if (ibdebug > 1) { IBWARN(">>> sending: len %d pktsz %zu", len, umad_size() + len); xdump(stderr, "send buf\n", sndbuf, umad_size() + len); } if (save_mad) { memcpy(save_mad, umad_get_mad(sndbuf), save_mad_len < len ? save_mad_len : len); save_mad = NULL; } if (max_retries <= 0) { errno = EINVAL; *p_error = EINVAL; ERRS("max_retries %d <= 0", max_retries); return -1; } trid = (uint32_t) mad_get_field64(umad_get_mad(sndbuf), 0, IB_MAD_TRID_F); for (retries = 0; retries < max_retries; retries++) { if (retries) ERRS("retry %d (timeout %d ms)", retries, timeout); length = len; if (umad_send(port_id, agentid, sndbuf, length, timeout, 0) < 0) { IBWARN("send failed; %s", strerror(errno)); return -1; } /* Use same timeout on receive side just in case */ /* send packet is lost somewhere.
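* Added note: the do-while below re-reads until the reply's transaction
* id matches the request's, so stale replies left over from earlier
* retries are dropped instead of being returned to the caller.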
*/ do { length = len; if (umad_recv(port_id, rcvbuf, &length, timeout) < 0) { IBWARN("recv failed: %s", strerror(errno)); return -1; } if (ibdebug > 2) umad_addr_dump(umad_get_mad_addr(rcvbuf)); if (ibdebug > 1) { IBWARN("rcv buf:"); xdump(stderr, "rcv buf\n", umad_get_mad(rcvbuf), IB_MAD_SIZE); } } while ((uint32_t) mad_get_field64(umad_get_mad(rcvbuf), 0, IB_MAD_TRID_F) != trid); status = umad_status(rcvbuf); if (!status) return length; /* done */ if (status == ENOMEM) return length; } errno = status; *p_error = ETIMEDOUT; ERRS("timeout after %d retries, %d ms", retries, timeout * retries); return -1; } static int redirect_port(ib_portid_t * port, uint8_t * mad) { port->lid = mad_get_field(mad, 64, IB_CPI_REDIRECT_LID_F); if (!port->lid) { IBWARN("GID-based redirection is not supported"); return -1; } port->qp = mad_get_field(mad, 64, IB_CPI_REDIRECT_QP_F); port->qkey = mad_get_field(mad, 64, IB_CPI_REDIRECT_QKEY_F); port->sl = (uint8_t) mad_get_field(mad, 64, IB_CPI_REDIRECT_SL_F); /* TODO: Reverse map redirection P_Key to P_Key index */ if (ibdebug) IBWARN("redirected to lid %d, qp 0x%x, qkey 0x%x, sl 0x%x", port->lid, port->qp, port->qkey, port->sl); return 0; } void *mad_rpc(const struct ibmad_port *port, ib_rpc_t * rpc, ib_portid_t * dport, void *payload, void *rcvdata) { int status, len; uint8_t sndbuf[1024], rcvbuf[1024], *mad; ib_rpc_v1_t *rpcv1 = (ib_rpc_v1_t *)rpc; int error = 0; if ((rpc->mgtclass & IB_MAD_RPC_VERSION_MASK) == IB_MAD_RPC_VERSION1) rpcv1->error = 0; do { len = 0; memset(sndbuf, 0, umad_size() + IB_MAD_SIZE); if ((len = mad_build_pkt(sndbuf, rpc, dport, NULL, payload)) < 0) return NULL; if ((len = _do_madrpc(port->port_id, sndbuf, rcvbuf, port->class_agents[rpc->mgtclass & 0xff], len, mad_get_timeout(port, rpc->timeout), mad_get_retries(port), &error)) < 0) { if ((rpc->mgtclass & IB_MAD_RPC_VERSION_MASK) == IB_MAD_RPC_VERSION1) rpcv1->error = error; IBWARN("_do_madrpc failed; dport (%s)", portid2str(dport)); return NULL; } mad = umad_get_mad(rcvbuf); status = mad_get_field(mad, 0, IB_DRSMP_STATUS_F); /* check for exact match instead of only the redirect bit; * that way, weird statuses cause an error, too */ if (status == IB_MAD_STS_REDIRECT) { /* update dport for next request and retry */ /* bail if redirection fails */ if (redirect_port(dport, mad)) break; } else break; } while (1); if ((rpc->mgtclass & IB_MAD_RPC_VERSION_MASK) == IB_MAD_RPC_VERSION1) rpcv1->error = error; rpc->rstatus = status; if (status != 0) { ERRS("MAD completed with error status 0x%x; dport (%s)", status, portid2str(dport)); errno = EIO; return NULL; } if (rcvdata) memcpy(rcvdata, mad + rpc->dataoffs, rpc->datasz); return rcvdata; } void *mad_rpc_rmpp(const struct ibmad_port *port, ib_rpc_t * rpc, ib_portid_t * dport, ib_rmpp_hdr_t * rmpp, void *data) { int status, len; uint8_t sndbuf[1024], rcvbuf[1024], *mad; ib_rpc_v1_t *rpcv1 = (ib_rpc_v1_t *)rpc; int error = 0; memset(sndbuf, 0, umad_size() + IB_MAD_SIZE); DEBUG("rmpp %p data %p", rmpp, data); if ((rpc->mgtclass & IB_MAD_RPC_VERSION_MASK) == IB_MAD_RPC_VERSION1) rpcv1->error = 0; if ((len = mad_build_pkt(sndbuf, rpc, dport, rmpp, data)) < 0) return NULL; if ((len = _do_madrpc(port->port_id, sndbuf, rcvbuf, port->class_agents[rpc->mgtclass & 0xff], len, mad_get_timeout(port, rpc->timeout), mad_get_retries(port), &error)) < 0) { if ((rpc->mgtclass & IB_MAD_RPC_VERSION_MASK) == IB_MAD_RPC_VERSION1) rpcv1->error = error; IBWARN("_do_madrpc failed; dport (%s)", portid2str(dport)); return NULL; } if ((rpc->mgtclass & 
IB_MAD_RPC_VERSION_MASK) == IB_MAD_RPC_VERSION1) rpcv1->error = error; mad = umad_get_mad(rcvbuf); if ((status = mad_get_field(mad, 0, IB_MAD_STATUS_F)) != 0) { ERRS("MAD completed with error status 0x%x; dport (%s)", status, portid2str(dport)); errno = EIO; return NULL; } if (rmpp) { rmpp->flags = mad_get_field(mad, 0, IB_SA_RMPP_FLAGS_F); if ((rmpp->flags & 0x3) && mad_get_field(mad, 0, IB_SA_RMPP_VERS_F) != 1) { IBWARN("bad rmpp version"); return NULL; } rmpp->type = mad_get_field(mad, 0, IB_SA_RMPP_TYPE_F); rmpp->status = mad_get_field(mad, 0, IB_SA_RMPP_STATUS_F); DEBUG("rmpp type %d status %d", rmpp->type, rmpp->status); rmpp->d1.u = mad_get_field(mad, 0, IB_SA_RMPP_D1_F); rmpp->d2.u = mad_get_field(mad, 0, IB_SA_RMPP_D2_F); } if (data) memcpy(data, mad + rpc->dataoffs, rpc->datasz); rpc->recsz = mad_get_field(mad, 0, IB_SA_ATTROFFS_F); return data; } void *madrpc(ib_rpc_t * rpc, ib_portid_t * dport, void *payload, void *rcvdata) { return mad_rpc(ibmp, rpc, dport, payload, rcvdata); } void *madrpc_rmpp(ib_rpc_t * rpc, ib_portid_t * dport, ib_rmpp_hdr_t * rmpp, void *data) { return mad_rpc_rmpp(ibmp, rpc, dport, rmpp, data); } void madrpc_init(char *dev_name, int dev_port, int *mgmt_classes, int num_classes) { int fd; if (umad_init() < 0) IBPANIC("can't init UMAD library"); if ((fd = umad_open_port(dev_name, dev_port)) < 0) IBPANIC("can't open UMAD port (%s:%d)", dev_name ? dev_name : "(nil)", dev_port); if (num_classes >= MAX_CLASS) IBPANIC("too many classes %d requested", num_classes); ibmp->port_id = fd; memset(ibmp->class_agents, 0xff, sizeof ibmp->class_agents); while (num_classes--) { uint8_t rmpp_version = 0; int mgmt = *mgmt_classes++; if (mgmt == IB_SA_CLASS) rmpp_version = 1; if (mad_register_client_via(mgmt, rmpp_version, ibmp) < 0) IBPANIC("client_register for mgmt class %d failed", mgmt); } } struct ibmad_port *mad_rpc_open_port(char *dev_name, int dev_port, int *mgmt_classes, int num_classes) { struct ibmad_port *p; int port_id; char *debug_level_env; if (num_classes >= MAX_CLASS) { IBWARN("too many classes %d requested", num_classes); errno = EINVAL; return NULL; } if (umad_init() < 0) { IBWARN("can't init UMAD library"); errno = ENODEV; return NULL; } debug_level_env = getenv("LIBIBMAD_DEBUG_LEVEL"); if (debug_level_env) { ibdebug = atoi(debug_level_env); } p = malloc(sizeof(*p)); if (!p) { errno = ENOMEM; return NULL; } memset(p, 0, sizeof(*p)); if ((port_id = umad_open_port(dev_name, dev_port)) < 0) { IBWARN("can't open UMAD port (%s:%d)", dev_name, dev_port); if (!errno) errno = EIO; free(p); return NULL; } p->port_id = port_id; memset(p->class_agents, 0xff, sizeof p->class_agents); while (num_classes--) { uint8_t rmpp_version = 0; int mgmt = *mgmt_classes++; if (mgmt == IB_SA_CLASS) rmpp_version = 1; if (mgmt < 0 || mgmt >= MAX_CLASS || mad_register_client_via(mgmt, rmpp_version, p) < 0) { IBWARN("client_register for mgmt %d failed", mgmt); if (!errno) errno = EINVAL; umad_close_port(port_id); free(p); return NULL; } } return p; } void mad_rpc_close_port(struct ibmad_port *port) { umad_close_port(port->port_id); free(port); } static int get_smi_gsi_pair(const char *ca_name, int portnum, struct ibmad_ports_pair *ports_pair, unsigned enforce_smi) { struct umad_ca_pair ca_pair; int smi_port_id = -1; int gsi_port_id = -1; int rc = -1; rc = umad_get_smi_gsi_pair_by_ca_name(ca_name, portnum, &ca_pair, enforce_smi); if (rc < 0) { IBWARN("Can't open UMAD port (%s) (%s:%d)", strerror(-rc), ca_name, portnum); return rc; } smi_port_id = umad_open_port(ca_pair.smi_name, 
ca_pair.smi_preferred_port); if (smi_port_id < 0 && enforce_smi) { IBWARN("Can't open SMI UMAD port (%s) (%s:%d)", strerror(-smi_port_id), ca_pair.smi_name, ca_pair.smi_preferred_port); return smi_port_id; } gsi_port_id = umad_open_port(ca_pair.gsi_name, ca_pair.gsi_preferred_port); if (gsi_port_id < 0) { IBWARN("Can't open GSI UMAD port (%s) (%s:%d)", strerror(-gsi_port_id), ca_pair.gsi_name, ca_pair.gsi_preferred_port); umad_close_port(smi_port_id); return gsi_port_id; } ports_pair->smi.port->port_id = smi_port_id; ports_pair->gsi.port->port_id = gsi_port_id; strncpy(ports_pair->smi.ca_name, ca_pair.smi_name, UMAD_CA_NAME_LEN); strncpy(ports_pair->gsi.ca_name, ca_pair.gsi_name, UMAD_CA_NAME_LEN); return 0; } static struct ibmad_port *get_port_for_class(struct ibmad_ports_pair *ports_pair, int mgmt_class, unsigned enforce_smi) { if (mgmt_class != IB_SMI_CLASS && mgmt_class != IB_SMI_DIRECT_CLASS) { if (ports_pair->gsi.port->port_id < 0) { IBWARN("required port for GSI is invalid"); return NULL; } return ports_pair->gsi.port; } if (ports_pair->smi.port->port_id < 0) { if (enforce_smi) { IBWARN("required port for SMI is invalid"); return NULL; } return ports_pair->gsi.port; } return ports_pair->smi.port; } struct ibmad_ports_pair *mad_rpc_open_port2(char *dev_name, int dev_port, int *mgmt_classes, int num_classes, unsigned enforce_smi) { struct ibmad_port *smi; struct ibmad_port *gsi; struct ibmad_ports_pair *ports_pair; char *debug_level_env; if (num_classes >= MAX_CLASS) { IBWARN("too many classes %d requested", num_classes); errno = EINVAL; return NULL; } if (umad_init() < 0) { IBWARN("can't init UMAD library"); errno = ENODEV; return NULL; } debug_level_env = getenv("LIBIBMAD_DEBUG_LEVEL"); if (debug_level_env) { ibdebug = atoi(debug_level_env); } smi = malloc(sizeof(*smi)); if (!smi) { errno = ENOMEM; goto smi_error; } memset(smi, 0, sizeof(*smi)); gsi = malloc(sizeof(*gsi)); if (!gsi) { errno = ENOMEM; goto gsi_error; } memset(gsi, 0, sizeof(*gsi)); ports_pair = malloc(sizeof(*ports_pair)); if (!ports_pair) { errno = ENOMEM; goto ports_pair_error; } memset(ports_pair, 0, sizeof(*ports_pair)); ports_pair->smi.port = smi; ports_pair->gsi.port = gsi; if ((get_smi_gsi_pair(dev_name, dev_port, ports_pair, enforce_smi)) < 0) { IBWARN("can't open UMAD port (%s:%d)", dev_name, dev_port); if (!errno) errno = EIO; goto get_smi_error; } memset(ports_pair->smi.port->class_agents, 0xff, sizeof ports_pair->smi.port->class_agents); memset(ports_pair->gsi.port->class_agents, 0xff, sizeof ports_pair->gsi.port->class_agents); while (num_classes--) { uint8_t rmpp_version = 0; int mgmt = *mgmt_classes++; struct ibmad_port *p = get_port_for_class(ports_pair, mgmt, enforce_smi); if (mgmt == IB_SA_CLASS) rmpp_version = 1; if (mgmt < 0 || mgmt >= MAX_CLASS || !p || mad_register_client_via(mgmt, rmpp_version, p) < 0) { IBWARN("client_register for mgmt %d failed", mgmt); if (!errno) errno = EINVAL; goto mad_reg_error; } } return ports_pair; mad_reg_error: umad_close_port(ports_pair->smi.port->port_id); umad_close_port(ports_pair->gsi.port->port_id); get_smi_error: free(ports_pair); ports_pair_error: free(gsi); gsi_error: free(smi); smi_error: return NULL; } void mad_rpc_close_port2(struct ibmad_ports_pair *srcport) { umad_close_port(srcport->smi.port->port_id); umad_close_port(srcport->gsi.port->port_id); free(srcport->smi.port); free(srcport->gsi.port); free(srcport); } rdma-core-56.1/libibmad/sa.c000066400000000000000000000127761477342711600156610ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 
Voltaire Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <infiniband/mad.h> #include "mad_internal.h" #undef DEBUG #define DEBUG if (ibdebug) IBWARN uint8_t *sa_rpc_call(const struct ibmad_port *ibmad_port, void *rcvbuf, ib_portid_t * portid, ib_sa_call_t * sa, unsigned timeout) { ib_rpc_t rpc = { 0 }; uint8_t *p; DEBUG("attr 0x%x mod 0x%x route %s", sa->attrid, sa->mod, portid2str(portid)); if (portid->lid <= 0) { IBWARN("only lid routes are supported"); return NULL; } rpc.mgtclass = IB_SA_CLASS; rpc.method = sa->method; rpc.attr.id = sa->attrid; rpc.attr.mod = sa->mod; rpc.mask = sa->mask; rpc.timeout = timeout; rpc.datasz = IB_SA_DATA_SIZE; rpc.dataoffs = IB_SA_DATA_OFFS; rpc.trid = sa->trid; portid->qp = 1; if (!portid->qkey) portid->qkey = IB_DEFAULT_QP1_QKEY; p = mad_rpc_rmpp(ibmad_port, &rpc, portid, NULL /*&sa->rmpp */ , rcvbuf); /* TODO: RMPP */ sa->recsz = rpc.recsz; return p; } uint8_t *sa_call(void *rcvbuf, ib_portid_t * portid, ib_sa_call_t * sa, unsigned timeout) { return sa_rpc_call(ibmp, rcvbuf, portid, sa, timeout); } /* PathRecord */ #define IB_PR_COMPMASK_DGID (1ull<<2) #define IB_PR_COMPMASK_SGID (1ull<<3) #define IB_PR_COMPMASK_DLID (1ull<<4) #define IB_PR_COMPMASK_SLID (1ull<<5) #define IB_PR_COMPMASK_RAWTRAFIC (1ull<<6) #define IB_PR_COMPMASK_RESV0 (1ull<<7) #define IB_PR_COMPMASK_FLOWLABEL (1ull<<8) #define IB_PR_COMPMASK_HOPLIMIT (1ull<<9) #define IB_PR_COMPMASK_TCLASS (1ull<<10) #define IB_PR_COMPMASK_REVERSIBLE (1ull<<11) #define IB_PR_COMPMASK_NUMBPATH (1ull<<12) #define IB_PR_COMPMASK_PKEY (1ull<<13) #define IB_PR_COMPMASK_RESV1 (1ull<<14) #define IB_PR_COMPMASK_SL (1ull<<15) #define IB_PR_COMPMASK_MTUSELEC (1ull<<16) #define IB_PR_COMPMASK_MTU (1ull<<17) #define IB_PR_COMPMASK_RATESELEC (1ull<<18) #define IB_PR_COMPMASK_RATE (1ull<<19) #define IB_PR_COMPMASK_PKTLIFETIMESELEC (1ull<<20) #define IB_PR_COMPMASK_PKTLIFETIME (1ull<<21) #define IB_PR_COMPMASK_PREFERENCE (1ull<<22) #define IB_PR_DEF_MASK (IB_PR_COMPMASK_DGID |\ IB_PR_COMPMASK_SGID) int ib_path_query_via(const struct ibmad_port *srcport, ibmad_gid_t srcgid, ibmad_gid_t destgid, ib_portid_t * sm_id, void *buf) { ib_sa_call_t sa = { 0 }; uint8_t *p; int dlid; memset(&sa, 0, sizeof sa); sa.method = IB_MAD_METHOD_GET; sa.attrid = IB_SA_ATTR_PATHRECORD;
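/* Note added for clarity: IB_PR_DEF_MASK (set below) asks the SA to match on DGID and SGID only; the DLID is then decoded from the returned PathRecord. */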
sa.mask = IB_PR_DEF_MASK; sa.trid = mad_trid(); memset(buf, 0, IB_SA_PR_RECSZ); mad_encode_field(buf, IB_SA_PR_DGID_F, destgid); mad_encode_field(buf, IB_SA_PR_SGID_F, srcgid); p = sa_rpc_call(srcport, buf, sm_id, &sa, 0); if (!p) { IBWARN("sa call path_query failed"); return -1; } mad_decode_field(p, IB_SA_PR_DLID_F, &dlid); return dlid; } int ib_path_query(ibmad_gid_t srcgid, ibmad_gid_t destgid, ib_portid_t * sm_id, void *buf) { return ib_path_query_via(ibmp, srcgid, destgid, sm_id, buf); } /* NodeRecord */ #define IB_NR_COMPMASK_LID (1ull<<0) #define IB_NR_COMPMASK_RESERVED1 (1ull<<1) #define IB_NR_COMPMASK_BASEVERSION (1ull<<2) #define IB_NR_COMPMASK_CLASSVERSION (1ull<<3) #define IB_NR_COMPMASK_NODETYPE (1ull<<4) #define IB_NR_COMPMASK_NUMPORTS (1ull<<5) #define IB_NR_COMPMASK_SYSIMAGEGUID (1ull<<6) #define IB_NR_COMPMASK_NODEGUID (1ull<<7) #define IB_NR_COMPMASK_PORTGUID (1ull<<8) #define IB_NR_COMPMASK_PARTCAP (1ull<<9) #define IB_NR_COMPMASK_DEVID (1ull<<10) #define IB_NR_COMPMASK_REV (1ull<<11) #define IB_NR_COMPMASK_PORTNUM (1ull<<12) #define IB_NR_COMPMASK_VENDID (1ull<<13) #define IB_NR_COMPMASK_NODEDESC (1ull<<14) #define IB_NR_DEF_MASK IB_NR_COMPMASK_PORTGUID int ib_node_query_via(const struct ibmad_port *srcport, uint64_t guid, ib_portid_t * sm_id, void *buf) { ib_sa_call_t sa = { 0 }; uint8_t *p; memset(&sa, 0, sizeof sa); sa.method = IB_MAD_METHOD_GET; sa.attrid = IB_SA_ATTR_NODERECORD; sa.mask = IB_NR_DEF_MASK; sa.trid = mad_trid(); memset(buf, 0, IB_SA_NR_RECSZ); mad_encode_field(buf, IB_SA_NR_PORT_GUID_F, &guid); p = sa_rpc_call(srcport, buf, sm_id, &sa, 0); if (!p) { IBWARN("sa call node_query failed"); return -1; } return 0; } rdma-core-56.1/libibmad/serv.c000066400000000000000000000123661477342711600162300ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #include <stdio.h> #include <string.h> #include <errno.h> #include <arpa/inet.h> #include <infiniband/umad.h> #include <infiniband/mad.h> #include "mad_internal.h" #undef DEBUG #define DEBUG if (ibdebug) IBWARN int mad_send(ib_rpc_t * rpc, ib_portid_t * dport, ib_rmpp_hdr_t * rmpp, void *data) { return mad_send_via(rpc, dport, rmpp, data, ibmp); } int mad_send_via(ib_rpc_t * rpc, ib_portid_t * dport, ib_rmpp_hdr_t * rmpp, void *data, struct ibmad_port *srcport) { uint8_t pktbuf[1024]; void *umad = pktbuf; memset(pktbuf, 0, umad_size() + IB_MAD_SIZE); DEBUG("rmpp %p data %p", rmpp, data); if (mad_build_pkt(umad, rpc, dport, rmpp, data) < 0) return -1; if (ibdebug) { IBWARN("data offs %d sz %d", rpc->dataoffs, rpc->datasz); xdump(stderr, "mad send data\n", (char *)umad_get_mad(umad) + rpc->dataoffs, rpc->datasz); } if (umad_send(srcport->port_id, srcport->class_agents[rpc->mgtclass & 0xff], umad, IB_MAD_SIZE, mad_get_timeout(srcport, rpc->timeout), 0) < 0) { IBWARN("send failed; %s", strerror(errno)); return -1; } return 0; } int mad_respond(void *umad, ib_portid_t * portid, uint32_t rstatus) { return mad_respond_via(umad, portid, rstatus, ibmp); } int mad_respond_via(void *umad, ib_portid_t * portid, uint32_t rstatus, struct ibmad_port *srcport) { uint8_t *mad = umad_get_mad(umad); ib_mad_addr_t *mad_addr; ib_rpc_t rpc = { 0 }; ib_portid_t rport; int is_smi; if (!portid) { if (!(mad_addr = umad_get_mad_addr(umad))) { errno = EINVAL; return -1; } memset(&rport, 0, sizeof(rport)); rport.lid = ntohs(mad_addr->lid); rport.qp = ntohl(mad_addr->qpn); rport.qkey = ntohl(mad_addr->qkey); rport.sl = mad_addr->sl; portid = &rport; } DEBUG("dest %s", portid2str(portid)); rpc.mgtclass = mad_get_field(mad, 0, IB_MAD_MGMTCLASS_F); rpc.method = mad_get_field(mad, 0, IB_MAD_METHOD_F); if (rpc.method == IB_MAD_METHOD_SET) rpc.method = IB_MAD_METHOD_GET; if (rpc.method != IB_MAD_METHOD_SEND) rpc.method |= IB_MAD_RESPONSE; rpc.attr.id = mad_get_field(mad, 0, IB_MAD_ATTRID_F); rpc.attr.mod = mad_get_field(mad, 0, IB_MAD_ATTRMOD_F); if (rpc.mgtclass == IB_SA_CLASS) rpc.recsz = mad_get_field(mad, 0, IB_SA_ATTROFFS_F); if (mad_is_vendor_range2(rpc.mgtclass)) rpc.oui = mad_get_field(mad, 0, IB_VEND2_OUI_F); rpc.trid = mad_get_field64(mad, 0, IB_MAD_TRID_F); rpc.rstatus = rstatus; /* cleared by default: timeout, datasz, dataoffs, mkey, mask */ is_smi = rpc.mgtclass == IB_SMI_CLASS || rpc.mgtclass == IB_SMI_DIRECT_CLASS; if (is_smi) portid->qp = 0; else if (!portid->qp) portid->qp = 1; if (!portid->qkey && portid->qp == 1) portid->qkey = IB_DEFAULT_QP1_QKEY; DEBUG ("qp 0x%x class 0x%x method %d attr 0x%x mod 0x%x datasz %d off %d qkey %x", portid->qp, rpc.mgtclass, rpc.method, rpc.attr.id, rpc.attr.mod, rpc.datasz, rpc.dataoffs, portid->qkey); if (mad_build_pkt(umad, &rpc, portid, NULL, NULL) < 0) return -1; if (ibdebug > 1) xdump(stderr, "mad respond pkt\n", mad, IB_MAD_SIZE); if (umad_send (srcport->port_id, srcport->class_agents[rpc.mgtclass], umad, IB_MAD_SIZE, mad_get_timeout(srcport, rpc.timeout), 0) < 0) { DEBUG("send failed; %s", strerror(errno)); return -1; } return 0; } void *mad_receive(void *umad, int timeout) { return mad_receive_via(umad, timeout, ibmp); } void *mad_receive_via(void *umad, int timeout, struct ibmad_port *srcport) { void *mad = umad ?
umad : umad_alloc(1, umad_size() + IB_MAD_SIZE); int agent; int length = IB_MAD_SIZE; if ((agent = umad_recv(srcport->port_id, mad, &length, mad_get_timeout(srcport, timeout))) < 0) { if (!umad) umad_free(mad); DEBUG("recv failed: %s", strerror(errno)); return NULL; } return mad; } void *mad_alloc(void) { return umad_alloc(1, umad_size() + IB_MAD_SIZE); } void mad_free(void *umad) { umad_free(umad); } rdma-core-56.1/libibmad/smp.c000066400000000000000000000104751477342711600160470ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <infiniband/mad.h> #include "mad_internal.h" #undef DEBUG #define DEBUG if (ibdebug) IBWARN void smp_mkey_set(struct ibmad_port *srcport, uint64_t mkey) { srcport->smp_mkey = mkey; } uint64_t smp_mkey_get(const struct ibmad_port *srcport) { return srcport->smp_mkey; } uint8_t *smp_set_status_via(void *data, ib_portid_t * portid, unsigned attrid, unsigned mod, unsigned timeout, int *rstatus, const struct ibmad_port *srcport) { ib_rpc_t rpc = { 0 }; uint8_t *res; DEBUG("attr 0x%x mod 0x%x route %s", attrid, mod, portid2str(portid)); if ((portid->lid <= 0) || (portid->drpath.drslid == 0xffff) || (portid->drpath.drdlid == 0xffff)) rpc.mgtclass = IB_SMI_DIRECT_CLASS; /* direct SMI */ else rpc.mgtclass = IB_SMI_CLASS; /* Lid routed SMI */ rpc.method = IB_MAD_METHOD_SET; rpc.attr.id = attrid; rpc.attr.mod = mod; rpc.timeout = timeout; rpc.datasz = IB_SMP_DATA_SIZE; rpc.dataoffs = IB_SMP_DATA_OFFS; rpc.mkey = srcport->smp_mkey; portid->sl = 0; portid->qp = 0; res = mad_rpc(srcport, &rpc, portid, data, data); if (rstatus) *rstatus = rpc.rstatus; return res; } uint8_t *smp_set_via(void *data, ib_portid_t * portid, unsigned attrid, unsigned mod, unsigned timeout, const struct ibmad_port *srcport) { return smp_set_status_via(data, portid, attrid, mod, timeout, NULL, srcport); } uint8_t *smp_set(void *data, ib_portid_t * portid, unsigned attrid, unsigned mod, unsigned timeout) { return smp_set_via(data, portid, attrid, mod, timeout, ibmp); } uint8_t *smp_query_status_via(void *rcvbuf, ib_portid_t * portid, unsigned attrid, unsigned mod, unsigned timeout, int *rstatus, const struct ibmad_port * srcport) { ib_rpc_t rpc = { 0 }; uint8_t *res; DEBUG("attr 0x%x mod 0x%x route %s", attrid, mod, portid2str(portid)); rpc.method = IB_MAD_METHOD_GET; rpc.attr.id = attrid; rpc.attr.mod = mod; rpc.timeout = timeout; rpc.datasz = IB_SMP_DATA_SIZE; rpc.dataoffs = IB_SMP_DATA_OFFS; rpc.mkey = srcport->smp_mkey; if ((portid->lid <= 0) || (portid->drpath.drslid == 0xffff) || (portid->drpath.drdlid == 0xffff)) rpc.mgtclass = IB_SMI_DIRECT_CLASS; /* direct SMI */ else rpc.mgtclass = IB_SMI_CLASS; /* Lid routed SMI */ portid->sl = 0; portid->qp = 0; res = mad_rpc(srcport, &rpc, portid, rcvbuf, rcvbuf); if (rstatus) *rstatus = rpc.rstatus; return res; } uint8_t *smp_query_via(void *rcvbuf, ib_portid_t * portid, unsigned attrid, unsigned mod, unsigned timeout, const struct ibmad_port * srcport) { return smp_query_status_via(rcvbuf, portid, attrid, mod, timeout, NULL, srcport); } uint8_t *smp_query(void *rcvbuf, ib_portid_t * portid, unsigned attrid, unsigned mod, unsigned timeout) { return smp_query_via(rcvbuf, portid, attrid, mod, timeout, ibmp); }
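/*
 * Usage sketch (added illustration, not part of the original source):
 * query PortInfo of the local port through the directed-route SMI with
 * an empty path and decode the LID, mirroring resolve.c; "srcport" is
 * assumed to come from mad_rpc_open_port().
 *
 *	ib_portid_t self = { 0 };
 *	uint8_t portinfo[64];
 *	int lid;
 *	if (smp_query_via(portinfo, &self, IB_ATTR_PORT_INFO, 0, 0, srcport)) {
 *		mad_decode_field(portinfo, IB_PORT_LID_F, &lid);
 *		printf("local LID %d\n", lid);
 *	}
 */
rdma-core-56.1/libibmad/vendor.c000066400000000000000000000066051477342711600165450ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer.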
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <errno.h> #include <infiniband/mad.h> #include "mad_internal.h" #undef DEBUG #define DEBUG if (ibdebug) IBWARN static inline int response_expected(int method) { return method == IB_MAD_METHOD_GET || method == IB_MAD_METHOD_SET || method == IB_MAD_METHOD_TRAP; } uint8_t *ib_vendor_call(void *data, ib_portid_t * portid, ib_vendor_call_t * call) { return ib_vendor_call_via(data, portid, call, ibmp); } uint8_t *ib_vendor_call_via(void *data, ib_portid_t * portid, ib_vendor_call_t * call, struct ibmad_port * srcport) { ib_rpc_v1_t rpc = { 0 }; ib_rpc_t *rpcold = (ib_rpc_t *)(void *)&rpc; int range1 = 0, resp_expected; void *p_ret; DEBUG("route %s data %p", portid2str(portid), data); if (portid->lid <= 0) return NULL; /* no direct SMI */ if (!(range1 = mad_is_vendor_range1(call->mgmt_class)) && !(mad_is_vendor_range2(call->mgmt_class))) return NULL; resp_expected = response_expected(call->method); rpc.mgtclass = call->mgmt_class | IB_MAD_RPC_VERSION1; rpc.method = call->method; rpc.attr.id = call->attrid; rpc.attr.mod = call->mod; rpc.timeout = resp_expected ? call->timeout : 0; rpc.datasz = range1 ? IB_VENDOR_RANGE1_DATA_SIZE : IB_VENDOR_RANGE2_DATA_SIZE; rpc.dataoffs = range1 ? IB_VENDOR_RANGE1_DATA_OFFS : IB_VENDOR_RANGE2_DATA_OFFS; if (!range1) rpc.oui = call->oui; DEBUG ("class 0x%x method 0x%x attr 0x%x mod 0x%x datasz %d off %d res_ex %d", rpc.mgtclass, rpc.method, rpc.attr.id, rpc.attr.mod, rpc.datasz, rpc.dataoffs, resp_expected); portid->qp = 1; if (!portid->qkey) portid->qkey = IB_DEFAULT_QP1_QKEY; if (resp_expected) { p_ret = mad_rpc_rmpp(srcport, rpcold, portid, NULL, data); /* FIXME: no RMPP for now */ errno = rpc.error; return p_ret; } return mad_send_via(rpcold, portid, NULL, data, srcport) < 0 ? NULL : data; /* FIXME: no RMPP for now */ } rdma-core-56.1/libibnetdisc/000077500000000000000000000000001477342711600157655ustar00rootroot00000000000000rdma-core-56.1/libibnetdisc/CMakeLists.txt000066400000000000000000000007351477342711600205320ustar00rootroot00000000000000publish_headers(infiniband ibnetdisc.h ibnetdisc_osd.h ) rdma_library(ibnetdisc libibnetdisc.map # See Documentation/versioning.md 5 5.1.${PACKAGE_VERSION} chassis.c ibnetdisc.c ibnetdisc_cache.c query_smp.c ) target_link_libraries(ibnetdisc LINK_PRIVATE ibmad ibumad ) rdma_pkg_config("ibnetdisc" "libibumad libibmad" "") rdma_test_executable(testleaks tests/testleaks.c) target_link_libraries(testleaks LINK_PRIVATE ibmad ibumad ibnetdisc ) rdma-core-56.1/libibnetdisc/chassis.c000066400000000000000000001115531477342711600175740ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2007 Xsigo Systems Inc. All rights reserved. * Copyright (c) 2008 Lawrence Livermore National Lab. All rights reserved. * Copyright (c) 2010 HNR Consulting. All rights reserved.
* * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ /*========================================================*/ /* FABRIC SCANNER SPECIFIC DATA */ /*========================================================*/ #include <stdio.h> #include <stdlib.h> #include <string.h> #include "internal.h" #include "chassis.h" static const char * const ChassisTypeStr[] = { "", "ISR9288", "ISR9096", "ISR2012", "ISR2004", "ISR4700", "ISR4200" }; static const char * const ChassisSlotTypeStr[] = { "", "Line", "Spine", "SRBD" }; typedef struct chassis_scan { ibnd_chassis_t *first_chassis; ibnd_chassis_t *current_chassis; ibnd_chassis_t *last_chassis; } chassis_scan_t; const char *ibnd_get_chassis_type(ibnd_node_t * node) { int chassis_type; if (!node) { IBND_DEBUG("node parameter NULL\n"); return NULL; } if (!node->chassis) return NULL; chassis_type = mad_get_field(node->info, 0, IB_NODE_VENDORID_F); switch (chassis_type) { case VTR_VENDOR_ID: /* Voltaire chassis */ { if (node->ch_type == UNRESOLVED_CT || node->ch_type > ISR4200_CT) return NULL; return ChassisTypeStr[node->ch_type]; } case MLX_VENDOR_ID: { if (node->ch_type_str[0] == '\0') return NULL; return node->ch_type_str; } default: { break; } } return NULL; } char *ibnd_get_chassis_slot_str(ibnd_node_t * node, char *str, size_t size) { int vendor_id; if (!node) { IBND_DEBUG("node parameter NULL\n"); return NULL; } /* Currently, only if Voltaire or Mellanox chassis */ vendor_id = mad_get_field(node->info, 0, IB_NODE_VENDORID_F); if ((vendor_id != VTR_VENDOR_ID) && (vendor_id != MLX_VENDOR_ID)) return NULL; if (!node->chassis) return NULL; if (node->ch_slot == UNRESOLVED_CS || node->ch_slot > SRBD_CS) return NULL; if (!str) return NULL; snprintf(str, size, "%s %d Chip %d", ChassisSlotTypeStr[node->ch_slot], node->ch_slotnum, node->ch_anafanum); return str; } static ibnd_chassis_t *find_chassisnum(ibnd_fabric_t * fabric, unsigned char chassisnum) { ibnd_chassis_t *current; for (current = fabric->chassis; current; current = current->next) if (current->chassisnum == chassisnum) return current; return NULL; } static uint64_t topspin_chassisguid(uint64_t guid) { /* Byte 3 in system image GUID is chassis type, and */ /* Byte 4 is location ID (slot) so just mask off byte 4 */ return guid & 0xffffffff00ffffffULL; } int ibnd_is_xsigo_guid(uint64_t guid) { if ((guid & 0xffffff0000000000ULL) ==
0x0013970000000000ULL) return 1; else return 0; } static int is_xsigo_leafone(uint64_t guid) { if ((guid & 0xffffffffff000000ULL) == 0x0013970102000000ULL) return 1; else return 0; } int ibnd_is_xsigo_hca(uint64_t guid) { /* NodeType 2 is HCA */ if ((guid & 0xffffffff00000000ULL) == 0x0013970200000000ULL) return 1; else return 0; } int ibnd_is_xsigo_tca(uint64_t guid) { /* NodeType 3 is TCA */ if ((guid & 0xffffffff00000000ULL) == 0x0013970300000000ULL) return 1; else return 0; } static int is_xsigo_ca(uint64_t guid) { if (ibnd_is_xsigo_hca(guid) || ibnd_is_xsigo_tca(guid)) return 1; else return 0; } static int is_xsigo_switch(uint64_t guid) { if ((guid & 0xffffffff00000000ULL) == 0x0013970100000000ULL) return 1; else return 0; } static uint64_t xsigo_chassisguid(ibnd_node_t * node) { uint64_t sysimgguid = mad_get_field64(node->info, 0, IB_NODE_SYSTEM_GUID_F); uint64_t remote_sysimgguid; if (!is_xsigo_ca(sysimgguid)) { /* Byte 3 is NodeType and byte 4 is PortType */ /* If NodeType is 1 (switch), PortType is masked */ if (is_xsigo_switch(sysimgguid)) return sysimgguid & 0xffffffff00ffffffULL; else return sysimgguid; } else { if (!node->ports || !node->ports[1]) return 0; /* Is there a peer port ? */ if (!node->ports[1]->remoteport) return sysimgguid; /* If peer port is Leaf 1, use its chassis GUID */ remote_sysimgguid = mad_get_field64(node->ports[1]->remoteport->node->info, 0, IB_NODE_SYSTEM_GUID_F); if (is_xsigo_leafone(remote_sysimgguid)) return remote_sysimgguid & 0xffffffff00ffffffULL; else return sysimgguid; } } static uint64_t get_chassisguid(ibnd_node_t * node) { uint32_t vendid = mad_get_field(node->info, 0, IB_NODE_VENDORID_F); uint64_t sysimgguid = mad_get_field64(node->info, 0, IB_NODE_SYSTEM_GUID_F); if (vendid == TS_VENDOR_ID || vendid == SS_VENDOR_ID) return topspin_chassisguid(sysimgguid); else if (vendid == XS_VENDOR_ID || ibnd_is_xsigo_guid(sysimgguid)) return xsigo_chassisguid(node); else return sysimgguid; } static ibnd_chassis_t *find_chassisguid(ibnd_fabric_t * fabric, ibnd_node_t * node) { ibnd_chassis_t *current; uint64_t chguid; chguid = get_chassisguid(node); for (current = fabric->chassis; current; current = current->next) if (current->chassisguid == chguid) return current; return NULL; } uint64_t ibnd_get_chassis_guid(ibnd_fabric_t * fabric, unsigned char chassisnum) { ibnd_chassis_t *chassis; if (!fabric) { IBND_DEBUG("fabric parameter NULL\n"); return 0; } chassis = find_chassisnum(fabric, chassisnum); if (chassis) return chassis->chassisguid; else return 0; } static int is_router(ibnd_node_t * n) { uint32_t devid = mad_get_field(n->info, 0, IB_NODE_DEVID_F); return (devid == VTR_DEVID_IB_FC_ROUTER || devid == VTR_DEVID_IB_IP_ROUTER); } static int is_spine_9096(ibnd_node_t * n) { uint32_t devid = mad_get_field(n->info, 0, IB_NODE_DEVID_F); return (devid == VTR_DEVID_SFB4 || devid == VTR_DEVID_SFB4_DDR); } static int is_spine_9288(ibnd_node_t * n) { uint32_t devid = mad_get_field(n->info, 0, IB_NODE_DEVID_F); return (devid == VTR_DEVID_SFB12 || devid == VTR_DEVID_SFB12_DDR); } static int is_spine_2004(ibnd_node_t * n) { uint32_t devid = mad_get_field(n->info, 0, IB_NODE_DEVID_F); return (devid == VTR_DEVID_SFB2004); } static int is_spine_2012(ibnd_node_t * n) { uint32_t devid = mad_get_field(n->info, 0, IB_NODE_DEVID_F); return (devid == VTR_DEVID_SFB2012); } static int is_spine_4700(ibnd_node_t * n) { uint32_t devid = mad_get_field(n->info, 0, IB_NODE_DEVID_F); return (devid == VTR_DEVID_SFB4700); } static int is_spine_4700x2(ibnd_node_t * n) { uint32_t devid = 
mad_get_field(n->info, 0, IB_NODE_DEVID_F); return (devid == VTR_DEVID_SFB4700X2); } static int is_spine_4200(ibnd_node_t * n) { uint32_t devid = mad_get_field(n->info, 0, IB_NODE_DEVID_F); return (devid == VTR_DEVID_SFB4200); } static int is_spine(ibnd_node_t * n) { return (is_spine_9096(n) || is_spine_9288(n) || is_spine_2004(n) || is_spine_2012(n) || is_spine_4700(n) || is_spine_4700x2(n) || is_spine_4200(n)); } static int is_line_24(ibnd_node_t * n) { uint32_t devid = mad_get_field(n->info, 0, IB_NODE_DEVID_F); return (devid == VTR_DEVID_SLB24 || devid == VTR_DEVID_SLB24_DDR || devid == VTR_DEVID_SRB2004); } static int is_line_8(ibnd_node_t * n) { uint32_t devid = mad_get_field(n->info, 0, IB_NODE_DEVID_F); return (devid == VTR_DEVID_SLB8); } static int is_line_2024(ibnd_node_t * n) { uint32_t devid = mad_get_field(n->info, 0, IB_NODE_DEVID_F); return (devid == VTR_DEVID_SLB2024); } static int is_line_4700(ibnd_node_t * n) { uint32_t devid = mad_get_field(n->info, 0, IB_NODE_DEVID_F); return (devid == VTR_DEVID_SLB4018); } static int is_line(ibnd_node_t * n) { return (is_line_24(n) || is_line_8(n) || is_line_2024(n) || is_line_4700(n)); } /* these structs help find Line (Anafa) slot number while using spine portnum */ static const char line_slot_2_sfb4[37] = { 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; static const char anafa_line_slot_2_sfb4[37] = { 0, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; static const char line_slot_2_sfb12[37] = { 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; static const char anafa_line_slot_2_sfb12[37] = { 0, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; /* LB slot = table[spine port] */ static const char line_slot_2_sfb18[37] = { 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15, 16, 16, 17, 17, 18, 18}; /* LB asic num = table[spine port] */ static const char anafa_line_slot_2_sfb18[37] = { 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 }; /* LB slot = table[spine port] */ static const char line_slot_2_sfb18x2[37] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}; /* LB asic num = table[spine port] */ static const char anafa_line_slot_2_sfb18x2[37] = { 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; /* LB slot = table[spine port] */ static const char line_slot_2_sfb4200[37] = { 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 9}; /* LB asic num = table[spine port] */ static const char anafa_line_slot_2_sfb4200[37] = { 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 }; /* IPR FCR modules connectivity while using sFB4 port as reference */ static const char ipr_slot_2_sfb4_port[37] = { 0, 3, 2, 1, 3, 2, 1, 3, 2, 1, 3, 2, 1, 3, 2, 1, 3, 2, 1, 3, 2, 1, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; /* these structs help find Spine (Anafa) slot number while using spine portnum */ static const char spine12_slot_2_slb[37] = { 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0 }; static const char anafa_spine12_slot_2_slb[37] = { 0, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; static const char spine4_slot_2_slb[37] = { 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; static const char anafa_spine4_slot_2_slb[37] = { 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; /* FB slot = table[line port] */ static const char spine18_slot_2_slb[37] = { 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; /* FB asic = table[line port] */ static const char anafa_spine18_slot_2_slb[37] = { 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; static const char anafa_spine18x2_slot_2_slb[37] = { 0, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; /* FB slot = table[line port] */ static const char sfb4200_slot_2_slb[37] = { 0, 1, 1, 1, 1, 0, 0, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; /* FB asic = table[line port] */ static const char anafa_sfb4200_slot_2_slb[37] = { 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; /* reference { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 }; */ static int get_sfb_slot(ibnd_node_t * n, ibnd_port_t * lineport) { n->ch_slot = SPINE_CS; if (is_spine_9096(n)) { n->ch_type = ISR9096_CT; n->ch_slotnum = spine4_slot_2_slb[lineport->portnum]; n->ch_anafanum = anafa_spine4_slot_2_slb[lineport->portnum]; } else if (is_spine_9288(n)) { n->ch_type = ISR9288_CT; n->ch_slotnum = spine12_slot_2_slb[lineport->portnum]; n->ch_anafanum = anafa_spine12_slot_2_slb[lineport->portnum]; } else if (is_spine_2012(n)) { n->ch_type = ISR2012_CT; n->ch_slotnum = spine12_slot_2_slb[lineport->portnum]; n->ch_anafanum = anafa_spine12_slot_2_slb[lineport->portnum]; } else if (is_spine_2004(n)) { n->ch_type = ISR2004_CT; n->ch_slotnum = spine4_slot_2_slb[lineport->portnum]; n->ch_anafanum = anafa_spine4_slot_2_slb[lineport->portnum]; } else if (is_spine_4700(n)) { n->ch_type = ISR4700_CT; n->ch_slotnum = spine18_slot_2_slb[lineport->portnum]; n->ch_anafanum = anafa_spine18_slot_2_slb[lineport->portnum]; } else if (is_spine_4700x2(n)) { n->ch_type = ISR4700_CT; n->ch_slotnum = spine18_slot_2_slb[lineport->portnum]; n->ch_anafanum = anafa_spine18x2_slot_2_slb[lineport->portnum]; } else if (is_spine_4200(n)) { n->ch_type = ISR4200_CT; n->ch_slotnum = sfb4200_slot_2_slb[lineport->portnum]; n->ch_anafanum = anafa_sfb4200_slot_2_slb[lineport->portnum]; } else { IBND_ERROR("Unexpected node found: guid 0x%016" PRIx64 "\n", n->guid); } return 0; } static int get_router_slot(ibnd_node_t * n, ibnd_port_t * spineport) { uint64_t guessnum = 0; n->ch_found = 1; n->ch_slot = SRBD_CS; if (is_spine_9096(spineport->node)) { n->ch_type = ISR9096_CT; n->ch_slotnum = line_slot_2_sfb4[spineport->portnum]; n->ch_anafanum = ipr_slot_2_sfb4_port[spineport->portnum]; } else if (is_spine_9288(spineport->node)) { n->ch_type = ISR9288_CT; n->ch_slotnum = line_slot_2_sfb12[spineport->portnum]; /* this is a smart guess based on nodeguids order on sFB-12 module */ guessnum = spineport->node->guid % 4; /* module 1 <--> remote anafa 3 */ /* module 2 <--> remote anafa 2 */ /* 
 module 3 <--> remote anafa 1 */
		n->ch_anafanum = (guessnum == 3 ? 1 : (guessnum == 1 ? 3 : 2));
	} else if (is_spine_2012(spineport->node)) {
		n->ch_type = ISR2012_CT;
		n->ch_slotnum = line_slot_2_sfb12[spineport->portnum];
		/* this is a smart guess based on nodeguids order on sFB-12 module */
		guessnum = spineport->node->guid % 4;
		// module 1 <--> remote anafa 3
		// module 2 <--> remote anafa 2
		// module 3 <--> remote anafa 1
		n->ch_anafanum = (guessnum == 3 ? 1 : (guessnum == 1 ? 3 : 2));
	} else if (is_spine_2004(spineport->node)) {
		n->ch_type = ISR2004_CT;
		n->ch_slotnum = line_slot_2_sfb4[spineport->portnum];
		n->ch_anafanum = ipr_slot_2_sfb4_port[spineport->portnum];
	} else {
		IBND_ERROR("Unexpected node found: guid 0x%016" PRIx64 "\n",
			   spineport->node->guid);
	}
	return 0;
}

static int get_slb_slot(ibnd_node_t * n, ibnd_port_t * spineport)
{
	n->ch_slot = LINE_CS;
	if (is_spine_9096(spineport->node)) {
		n->ch_type = ISR9096_CT;
		n->ch_slotnum = line_slot_2_sfb4[spineport->portnum];
		n->ch_anafanum = anafa_line_slot_2_sfb4[spineport->portnum];
	} else if (is_spine_9288(spineport->node)) {
		n->ch_type = ISR9288_CT;
		n->ch_slotnum = line_slot_2_sfb12[spineport->portnum];
		n->ch_anafanum = anafa_line_slot_2_sfb12[spineport->portnum];
	} else if (is_spine_2012(spineport->node)) {
		n->ch_type = ISR2012_CT;
		n->ch_slotnum = line_slot_2_sfb12[spineport->portnum];
		n->ch_anafanum = anafa_line_slot_2_sfb12[spineport->portnum];
	} else if (is_spine_2004(spineport->node)) {
		n->ch_type = ISR2004_CT;
		n->ch_slotnum = line_slot_2_sfb4[spineport->portnum];
		n->ch_anafanum = anafa_line_slot_2_sfb4[spineport->portnum];
	} else if (is_spine_4700(spineport->node)) {
		n->ch_type = ISR4700_CT;
		n->ch_slotnum = line_slot_2_sfb18[spineport->portnum];
		n->ch_anafanum = anafa_line_slot_2_sfb18[spineport->portnum];
	} else if (is_spine_4700x2(spineport->node)) {
		n->ch_type = ISR4700_CT;
		n->ch_slotnum = line_slot_2_sfb18x2[spineport->portnum];
		n->ch_anafanum = anafa_line_slot_2_sfb18x2[spineport->portnum];
	} else if (is_spine_4200(spineport->node)) {
		n->ch_type = ISR4200_CT;
		n->ch_slotnum = line_slot_2_sfb4200[spineport->portnum];
		n->ch_anafanum = anafa_line_slot_2_sfb4200[spineport->portnum];
	} else {
		IBND_ERROR("Unexpected node found: guid 0x%016" PRIx64 "\n",
			   spineport->node->guid);
	}
	return 0;
}

/* This function called for every Mellanox node in fabric */
static int fill_mellanox_chassis_record(ibnd_node_t * node)
{
	int p = 0;
	ibnd_port_t *port;

	char node_desc[IB_SMP_DATA_SIZE + 1];
	char *system_name;
	char *system_type;
	char *system_slot_name;
	char *node_index;
	char *iter;
	int dev_id;

	/*
	   The node description has the following format:

	   'MF0;<system name>:<system type>/<system slot name>[:board type]/U<node index>'

	   - System slot name in our systems can be L[01-36], S[01-18]
	   - Node index is always 1 (we don't have boards with multiple IS4 chips).
	   - System name is taken from the currently configured host name.
	   - The board type is optional and we don't set it currently
	   - A leaf or spine slot can currently hold a single type of board.
*/ memcpy(node_desc, node->nodedesc, IB_SMP_DATA_SIZE); node_desc[IB_SMP_DATA_SIZE] = '\0'; IBND_DEBUG("fill_mellanox_chassis_record: node_desc:%s \n",node_desc); if (node->ch_found) /* somehow this node has already been passed */ return 0; /* All mellanox IS4 switches have the same vendor id*/ dev_id = mad_get_field(node->info, 0,IB_NODE_DEVID_F); if (dev_id != MLX_DEVID_IS4) return 0; if((node_desc[0] != 'M') || (node_desc[1] != 'F') || (node_desc[2] != '0') || (node_desc[3] != ';')) { IBND_DEBUG("fill_mellanox_chassis_record: Unsupported node description format:%s \n",node_desc); return 0; } /* parse system name*/ system_name = &node_desc[4]; for (iter = system_name ; (*iter != ':') && (*iter != '\0') ; iter++); if(*iter == '\0'){ IBND_DEBUG("fill_mellanox_chassis_record: Unsupported node description format:%s - (get system_name failed) \n",node_desc); return 0; } *iter = '\0'; iter++; /* parse system type*/ system_type = iter; for ( ; (*iter != '/') && (*iter != '\0') ; iter++); if(*iter == '\0'){ IBND_DEBUG("fill_mellanox_chassis_record: Unsupported node description format:%s - (get system_type failed) \n",node_desc); return 0; } *iter = '\0'; iter++; /* parse system slot name*/ system_slot_name = iter; for ( ; (*iter != '/') && (*iter != ':') && (*iter != '\0') ; iter++); if(*iter == '\0'){ IBND_DEBUG("fill_mellanox_chassis_record: Unsupported node description format:%s - (get system_slot_name failed) \n",node_desc); return 0; } if(*iter == ':'){ *iter = '\0'; iter++; for ( ; (*iter != '/') && (*iter != '\0') ; iter++); if(*iter == '\0'){ IBND_DEBUG("fill_mellanox_chassis_record: Unsupported node description format:%s - (get board type failed) \n",node_desc); return 0; } } *iter = '\0'; iter++; node_index = iter; if(node_index[0] != 'U'){ IBND_DEBUG("fill_mellanox_chassis_record: Unsupported node description format:%s - (get node index) \n",node_desc); return 0; } /* set Chip number (node index) */ node->ch_anafanum = (unsigned char) atoi(&node_index[1]); if(node->ch_anafanum != 1){ IBND_DEBUG("Unexpected Chip number:%d \n",node->ch_anafanum); } /* set Line Spine numbers */ if(system_slot_name[0] == 'L') node->ch_slot = LINE_CS; else if(system_slot_name[0] == 'S') node->ch_slot = SPINE_CS; else{ IBND_DEBUG("fill_mellanox_chassis_record: Unsupported system_slot_name:%s \n",system_slot_name); return 0; } /* The switch will be displayed under Line or Spine and not under Chassis switches */ node->ch_found = 1; node->ch_slotnum = (unsigned char) atoi(&system_slot_name[1]); if((node->ch_slot == LINE_CS && (node->ch_slotnum > (LINES_MAX_NUM + 1))) || (node->ch_slot == SPINE_CS && (node->ch_slotnum > (SPINES_MAX_NUM + 1)))){ IBND_ERROR("fill_mellanox_chassis_record: invalid slot number:%d \n",node->ch_slotnum); node->ch_slotnum = 0; return 0; } /*set ch_type_str*/ strncpy(node->ch_type_str , system_type, sizeof(node->ch_type_str)-1); /* Line ports 1-18 are mapped to external ports 1-18*/ if(node->ch_slot == LINE_CS) { for (p = 1; p <= node->numports && p <= 18 ; p++) { port = node->ports[p]; if (!port) continue; port->ext_portnum = p; } } return 0; } static int insert_mellanox_line_and_spine(ibnd_node_t * node, ibnd_chassis_t * chassis) { if (node->ch_slot == LINE_CS){ if (chassis->linenode[node->ch_slotnum]) return 0; /* already filled slot */ chassis->linenode[node->ch_slotnum] = node; } else if (node->ch_slot == SPINE_CS){ if (chassis->spinenode[node->ch_slotnum]) return 0; /* already filled slot */ chassis->spinenode[node->ch_slotnum] = node; } else return 0; node->chassis = chassis; 
return 0; } /* forward declare this */ static void voltaire_portmap(ibnd_port_t * port); /* This function called for every Voltaire node in fabric It could be optimized so, but time overhead is very small and its only diag.util */ static int fill_voltaire_chassis_record(ibnd_node_t * node) { int p = 0; ibnd_port_t *port; ibnd_node_t *remnode = NULL; if (node->ch_found) /* somehow this node has already been passed */ return 0; node->ch_found = 1; /* node is router only in case of using unique lid */ /* (which is lid of chassis router port) */ /* in such case node->ports is actually a requested port... */ if (is_router(node)) /* find the remote node */ for (p = 1; p <= node->numports; p++) { port = node->ports[p]; if (port && is_spine(port->remoteport->node)) get_router_slot(node, port->remoteport); } else if (is_spine(node)) { int is_4700x2 = is_spine_4700x2(node); for (p = 1; p <= node->numports; p++) { port = node->ports[p]; if (!port || !port->remoteport) continue; /* * Skip ISR4700 double density fabric boards ports 19-36 * as they are chassis external ports */ if (is_4700x2 && (port->portnum > 18)) continue; remnode = port->remoteport->node; if (remnode->type != IB_NODE_SWITCH) { if (!remnode->ch_found) get_router_slot(remnode, port); continue; } if (!node->ch_type) /* we assume here that remoteport belongs to line */ get_sfb_slot(node, port->remoteport); /* we could break here, but need to find if more routers connected */ } } else if (is_line(node)) { int is_4700_line = is_line_4700(node); for (p = 1; p <= node->numports; p++) { port = node->ports[p]; if (!port || !port->remoteport) continue; if ((is_4700_line && (port->portnum > 18)) || (!is_4700_line && (port->portnum > 12))) continue; /* we assume here that remoteport belongs to spine */ get_slb_slot(node, port->remoteport); break; } } /* for each port of this node, map external ports */ for (p = 1; p <= node->numports; p++) { port = node->ports[p]; if (!port) continue; voltaire_portmap(port); } return 0; } static int get_line_index(ibnd_node_t * node) { int retval; if (is_line_4700(node)) retval = node->ch_slotnum; else retval = 3 * (node->ch_slotnum - 1) + node->ch_anafanum; if (retval > LINES_MAX_NUM || retval < 1) { printf("%s: retval = %d\n", __FUNCTION__, retval); IBND_ERROR("Internal error\n"); return -1; } return retval; } static int get_spine_index(ibnd_node_t * node) { int retval; if (is_spine_9288(node) || is_spine_2012(node)) retval = 3 * (node->ch_slotnum - 1) + node->ch_anafanum; else if (is_spine_4700(node) || is_spine_4700x2(node)) retval = 2 * (node->ch_slotnum - 1) + node->ch_anafanum; else retval = node->ch_slotnum; if (retval > SPINES_MAX_NUM || retval < 1) { IBND_ERROR("Internal error\n"); return -1; } return retval; } static int insert_line_router(ibnd_node_t * node, ibnd_chassis_t * chassis) { int i = get_line_index(node); if (i < 0) return i; if (chassis->linenode[i]) return 0; /* already filled slot */ chassis->linenode[i] = node; node->chassis = chassis; return 0; } static int insert_spine(ibnd_node_t * node, ibnd_chassis_t * chassis) { int i = get_spine_index(node); if (i < 0) return i; if (chassis->spinenode[i]) return 0; /* already filled slot */ chassis->spinenode[i] = node; node->chassis = chassis; return 0; } static int pass_on_lines_catch_spines(ibnd_chassis_t * chassis) { ibnd_node_t *node, *remnode; ibnd_port_t *port; int i, p; for (i = 1; i <= LINES_MAX_NUM; i++) { int is_4700_line; node = chassis->linenode[i]; if (!(node && is_line(node))) continue; /* empty slot or router */ is_4700_line = 
is_line_4700(node); for (p = 1; p <= node->numports; p++) { port = node->ports[p]; if (!port || !port->remoteport) continue; if ((is_4700_line && (port->portnum > 18)) || (!is_4700_line && (port->portnum > 12))) continue; remnode = port->remoteport->node; if (!remnode->ch_found) continue; /* some error - spine not initialized ? FIXME */ if (insert_spine(remnode, chassis)) return -1; } } return 0; } static int pass_on_spines_catch_lines(ibnd_chassis_t * chassis) { ibnd_node_t *node, *remnode; ibnd_port_t *port; int i, p; for (i = 1; i <= SPINES_MAX_NUM; i++) { int is_4700x2; node = chassis->spinenode[i]; if (!node) continue; /* empty slot */ is_4700x2 = is_spine_4700x2(node); for (p = 1; p <= node->numports; p++) { port = node->ports[p]; if (!port || !port->remoteport) continue; /* * ISR4700 double density fabric board ports 19-36 are * chassis external ports, so skip them */ if (is_4700x2 && (port->portnum > 18)) continue; remnode = port->remoteport->node; if (!remnode->ch_found) continue; /* some error - line/router not initialized ? FIXME */ if (insert_line_router(remnode, chassis)) return -1; } } return 0; } /* Stupid interpolation algorithm... But nothing to do - have to be compliant with VoltaireSM/NMS */ static void pass_on_spines_interpolate_chguid(ibnd_chassis_t * chassis) { ibnd_node_t *node; int i; for (i = 1; i <= SPINES_MAX_NUM; i++) { node = chassis->spinenode[i]; if (!node) continue; /* skip the empty slots */ /* take first guid minus one to be consistent with SM */ chassis->chassisguid = node->guid - 1; break; } } /* This function fills chassis structure with all nodes in that chassis chassis structure = structure of one standalone chassis */ static int build_chassis(ibnd_node_t * node, ibnd_chassis_t * chassis) { int p = 0; ibnd_node_t *remnode = NULL; ibnd_port_t *port = NULL; /* we get here with node = chassis_spine */ if (insert_spine(node, chassis)) return -1; /* loop: pass on all ports of node */ for (p = 1; p <= node->numports; p++) { port = node->ports[p]; if (!port || !port->remoteport) continue; /* * ISR4700 double density fabric board ports 19-36 are * chassis external ports, so skip them */ if (is_spine_4700x2(node) && (port->portnum > 18)) continue; remnode = port->remoteport->node; if (!remnode->ch_found) continue; /* some error - line or router not initialized ? FIXME */ insert_line_router(remnode, chassis); } if (pass_on_lines_catch_spines(chassis)) return -1; /* this pass needed for to catch routers, since routers connected only */ /* to spines in slot 1 or 4 and we could miss them first time */ if (pass_on_spines_catch_lines(chassis)) return -1; /* additional 2 passes needed for to overcome a problem of pure "in-chassis" */ /* connectivity - extra pass to ensure that all related chips/modules */ /* inserted into the chassis */ if (pass_on_lines_catch_spines(chassis)) return -1; if (pass_on_spines_catch_lines(chassis)) return -1; pass_on_spines_interpolate_chguid(chassis); return 0; } /*========================================================*/ /* INTERNAL TO EXTERNAL PORT MAPPING */ /*========================================================*/ /* Description : On ISR9288/9096 external ports indexing is not matching the internal ( anafa ) port indexes. Use this MAP to translate the data you get from the OpenIB diagnostics (smpquery, ibroute, ibtracert, etc.) 
Module : sLB-24 anafa 1 anafa 2 ext port | 13 14 15 16 17 18 | 19 20 21 22 23 24 int port | 22 23 24 18 17 16 | 22 23 24 18 17 16 ext port | 1 2 3 4 5 6 | 7 8 9 10 11 12 int port | 19 20 21 15 14 13 | 19 20 21 15 14 13 ------------------------------------------------ Module : sLB-8 anafa 1 anafa 2 ext port | 13 14 15 16 17 18 | 19 20 21 22 23 24 int port | 24 23 22 18 17 16 | 24 23 22 18 17 16 ext port | 1 2 3 4 5 6 | 7 8 9 10 11 12 int port | 21 20 19 15 14 13 | 21 20 19 15 14 13 -----------> anafa 1 anafa 2 ext port | - - 5 - - 6 | - - 7 - - 8 int port | 24 23 22 18 17 16 | 24 23 22 18 17 16 ext port | - - 1 - - 2 | - - 3 - - 4 int port | 21 20 19 15 14 13 | 21 20 19 15 14 13 ------------------------------------------------ Module : sLB-2024 ext port | 13 14 15 16 17 18 19 20 21 22 23 24 A1 int port| 13 14 15 16 17 18 19 20 21 22 23 24 ext port | 1 2 3 4 5 6 7 8 9 10 11 12 A2 int port| 13 14 15 16 17 18 19 20 21 22 23 24 --------------------------------------------------- Module : sLB-4018 int port | 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 ext port | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 --------------------------------------------------- Module : sFB-4700X2 12X port -> 3 x 4X ports: A1 int port | 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 ext port | 7 7 7 8 8 8 9 9 9 10 10 10 11 11 11 12 12 12 A2 int port | 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 ext port | 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 */ static int int2ext_map_slb24[2][25] = { {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 5, 4, 18, 17, 16, 1, 2, 3, 13, 14, 15}, {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 12, 11, 10, 24, 23, 22, 7, 8, 9, 19, 20, 21} }; static int int2ext_map_slb8[2][25] = { {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 6, 6, 6, 1, 1, 1, 5, 5, 5}, {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 4, 4, 8, 8, 8, 3, 3, 3, 7, 7, 7} }; static int int2ext_map_slb2024[2][25] = { {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24}, {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} }; static int int2ext_map_slb4018[37] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18 }; static int int2ext_map_sfb4700x2[2][37] = { {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 11, 11, 11, 12, 12, 12}, {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6} }; /* reference { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24 }; */ /* map internal ports to external ports if appropriate */ static void voltaire_portmap(ibnd_port_t * port) { int portnum = port->portnum; int chipnum = 0; ibnd_node_t *node = port->node; int is_4700_line = is_line_4700(node); int is_4700x2_spine = is_spine_4700x2(node); if (!node->ch_found || (!is_line(node) && !is_4700x2_spine)) { port->ext_portnum = 0; return; } if (((is_4700_line || is_4700x2_spine) && (portnum < 19 || portnum > 36)) || ((!is_4700_line && !is_4700x2_spine) && (portnum < 13 || portnum > 24))) { port->ext_portnum = 0; return; } if (port->node->ch_anafanum < 1 || port->node->ch_anafanum > 2) { port->ext_portnum = 0; return; } chipnum = port->node->ch_anafanum - 1; if (is_line_24(node)) port->ext_portnum = int2ext_map_slb24[chipnum][portnum]; else if (is_line_2024(node)) port->ext_portnum = int2ext_map_slb2024[chipnum][portnum]; /* sLB-4018: Only one asic per LB */ else if 
(is_4700_line) port->ext_portnum = int2ext_map_slb4018[portnum]; /* sFB-4700X2 4X port */ else if (is_4700x2_spine) port->ext_portnum = int2ext_map_sfb4700x2[chipnum][portnum]; else port->ext_portnum = int2ext_map_slb8[chipnum][portnum]; } static int add_chassis(chassis_scan_t * chassis_scan) { if (!(chassis_scan->current_chassis = calloc(1, sizeof(ibnd_chassis_t)))) { IBND_ERROR("OOM: failed to allocate chassis object\n"); return -1; } if (chassis_scan->first_chassis == NULL) { chassis_scan->first_chassis = chassis_scan->current_chassis; chassis_scan->last_chassis = chassis_scan->current_chassis; } else { chassis_scan->last_chassis->next = chassis_scan->current_chassis; chassis_scan->last_chassis = chassis_scan->current_chassis; } return 0; } static void add_node_to_chassis(ibnd_chassis_t * chassis, ibnd_node_t * node) { node->chassis = chassis; node->next_chassis_node = chassis->nodes; chassis->nodes = node; } /* Main grouping function Algorithm: 1. pass on every Voltaire node 2. catch spine chip for every Voltaire node 2.1 build/interpolate chassis around this chip 2.2 go to 1. 3. pass on non Voltaire nodes (SystemImageGUID based grouping) 4. now group non Voltaire nodes by SystemImageGUID Returns: 0 on success, -1 on failure */ int group_nodes(ibnd_fabric_t * fabric) { ibnd_node_t *node; int chassisnum = 0; ibnd_chassis_t *chassis; ibnd_chassis_t *ch, *ch_next; chassis_scan_t chassis_scan; int vendor_id; chassis_scan.first_chassis = NULL; chassis_scan.current_chassis = NULL; chassis_scan.last_chassis = NULL; /* first pass on switches and build for every Voltaire node */ /* an appropriate chassis record (slotnum and position) */ /* according to internal connectivity */ /* not very efficient but clear code so... */ for (node = fabric->switches; node; node = node->type_next) { vendor_id = mad_get_field(node->info, 0,IB_NODE_VENDORID_F); if (vendor_id == VTR_VENDOR_ID && fill_voltaire_chassis_record(node)) goto cleanup; else if (vendor_id == MLX_VENDOR_ID && fill_mellanox_chassis_record(node)) goto cleanup; } /* separate every Voltaire chassis from each other and build linked list of them */ /* algorithm: catch spine and find all surrounding nodes */ for (node = fabric->switches; node; node = node->type_next) { if (mad_get_field(node->info, 0, IB_NODE_VENDORID_F) != VTR_VENDOR_ID) continue; if (!node->ch_found || (node->chassis && node->chassis->chassisnum) || !is_spine(node)) continue; if (add_chassis(&chassis_scan)) goto cleanup; chassis_scan.current_chassis->chassisnum = ++chassisnum; if (build_chassis(node, chassis_scan.current_chassis)) goto cleanup; } /* now make pass on nodes for chassis which are not Voltaire */ /* grouped by common SystemImageGUID */ for (node = fabric->nodes; node; node = node->next) { if (mad_get_field(node->info, 0, IB_NODE_VENDORID_F) == VTR_VENDOR_ID) continue; if (mad_get_field64(node->info, 0, IB_NODE_SYSTEM_GUID_F)) { chassis = find_chassisguid(fabric, node); if (chassis) chassis->nodecount++; else { /* Possible new chassis */ if (add_chassis(&chassis_scan)) goto cleanup; chassis_scan.current_chassis->chassisguid = get_chassisguid(node); chassis_scan.current_chassis->nodecount = 1; if (!fabric->chassis) fabric->chassis = chassis_scan.first_chassis; } } } /* now, make another pass to see which nodes are part of chassis */ /* (defined as chassis->nodecount > 1) */ for (node = fabric->nodes; node; node = node->next) { vendor_id = mad_get_field(node->info, 0,IB_NODE_VENDORID_F); if (vendor_id == VTR_VENDOR_ID) continue; if (mad_get_field64(node->info, 0, 
		    IB_NODE_SYSTEM_GUID_F)) {
			chassis = find_chassisguid(fabric, node);
			if (chassis && chassis->nodecount > 1) {
				if (!chassis->chassisnum)
					chassis->chassisnum = ++chassisnum;
				if (!node->ch_found) {
					node->ch_found = 1;
					add_node_to_chassis(chassis, node);
				} else if (vendor_id == MLX_VENDOR_ID) {
					insert_mellanox_line_and_spine(node, chassis);
				}
			}
		}
	}

	fabric->chassis = chassis_scan.first_chassis;
	return 0;

cleanup:
	ch = chassis_scan.first_chassis;
	while (ch) {
		ch_next = ch->next;
		free(ch);
		ch = ch_next;
	}
	fabric->chassis = NULL;
	return -1;
}
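/*
 * Illustrative usage sketch (not part of the original file): how a client
 * consumes the grouping produced by group_nodes().  After
 * ibnd_discover_fabric() returns, each detected chassis hangs off
 * fabric->chassis; nodes grouped by SystemImageGUID are reachable via
 * next_chassis_node (Voltaire line/spine boards additionally sit in the
 * linenode[]/spinenode[] arrays).  CA name/port are left NULL/0 here to use
 * the defaults, and error handling is minimal.
 */
#include <stdio.h>
#include <inttypes.h>
#include <infiniband/ibnetdisc.h>

static void print_chassis_example(void)
{
	struct ibnd_config config = { 0 };
	ibnd_fabric_t *fabric;
	ibnd_chassis_t *ch;
	ibnd_node_t *node;
	char slot[64];

	fabric = ibnd_discover_fabric(NULL, 0, NULL, &config);
	if (!fabric)
		return;

	for (ch = fabric->chassis; ch; ch = ch->next) {
		printf("chassis %u guid 0x%016" PRIx64 "\n",
		       (unsigned) ch->chassisnum, ch->chassisguid);
		for (node = ch->nodes; node; node = node->next_chassis_node) {
			const char *type = ibnd_get_chassis_type(node);
			printf("  %s %s\n", type ? type : "?",
			       ibnd_get_chassis_slot_str(node, slot,
							 sizeof(slot)) ?
			       slot : "");
		}
	}

	ibnd_destroy_fabric(fabric);
}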
rdma-core-56.1/libibnetdisc/chassis.h000066400000000000000000000065231477342711600176010ustar00rootroot00000000000000/*
 * Copyright (c) 2004-2007 Voltaire Inc.  All rights reserved.
 * Copyright (c) 2007 Xsigo Systems Inc.  All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses.  You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

#ifndef _CHASSIS_H_
#define _CHASSIS_H_

#include <infiniband/ibnetdisc.h>
#include "internal.h"

/*========================================================*/
/*          CHASSIS RECOGNITION SPECIFIC DATA             */
/*========================================================*/

/* Device IDs */
#define VTR_DEVID_IB_FC_ROUTER		0x5a00
#define VTR_DEVID_IB_IP_ROUTER		0x5a01
#define VTR_DEVID_ISR9600_SPINE		0x5a02
#define VTR_DEVID_ISR9600_LEAF		0x5a03
#define VTR_DEVID_HCA1			0x5a04
#define VTR_DEVID_HCA2			0x5a44
#define VTR_DEVID_HCA3			0x6278
#define VTR_DEVID_SW_6IB4		0x5a05
#define VTR_DEVID_ISR9024		0x5a06
#define VTR_DEVID_ISR9288		0x5a07
#define VTR_DEVID_SLB24			0x5a09
#define VTR_DEVID_SFB12			0x5a08
#define VTR_DEVID_SFB4			0x5a0b
#define VTR_DEVID_ISR9024_12		0x5a0c
#define VTR_DEVID_SLB8			0x5a0d
#define VTR_DEVID_RLX_SWITCH_BLADE	0x5a20
#define VTR_DEVID_ISR9024_DDR		0x5a31
#define VTR_DEVID_SFB12_DDR		0x5a32
#define VTR_DEVID_SFB4_DDR		0x5a33
#define VTR_DEVID_SLB24_DDR		0x5a34
#define VTR_DEVID_SFB2012		0x5a37
#define VTR_DEVID_SLB2024		0x5a38
#define VTR_DEVID_ISR2012		0x5a39
#define VTR_DEVID_SFB2004		0x5a40
#define VTR_DEVID_ISR2004		0x5a41
#define VTR_DEVID_SRB2004		0x5a42
#define VTR_DEVID_SLB4018		0x5a5b
#define VTR_DEVID_SFB4700		0x5a5c
#define VTR_DEVID_SFB4700X2		0x5a5d
#define VTR_DEVID_SFB4200		0x5a60

#define MLX_DEVID_IS4			0xbd36

/* Vendor IDs (for chassis based systems) */
#define VTR_VENDOR_ID			0x8f1	/* Voltaire */
#define MLX_VENDOR_ID			0x2c9	/* Mellanox */
#define TS_VENDOR_ID			0x5ad	/* Cisco */
#define SS_VENDOR_ID			0x66a	/* InfiniCon */
#define XS_VENDOR_ID			0x1397	/* Xsigo */

enum ibnd_chassis_type {
	UNRESOLVED_CT,
	ISR9288_CT,
	ISR9096_CT,
	ISR2012_CT,
	ISR2004_CT,
	ISR4700_CT,
	ISR4200_CT
};
enum ibnd_chassis_slot_type {
	UNRESOLVED_CS,
	LINE_CS,
	SPINE_CS,
	SRBD_CS
};

int group_nodes(struct ibnd_fabric *fabric);

#endif				/* _CHASSIS_H_ */
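/*
 * Illustrative sketch (not part of the original header): how the device-ID
 * constants above drive chassis recognition.  The helper below is
 * hypothetical; chassis.c implements the same mapping with one predicate per
 * board type (is_spine_9096(), is_spine_9288(), ...) and assigns ch_type in
 * get_sfb_slot()/get_slb_slot().
 */
#include <stdint.h>

static enum ibnd_chassis_type example_devid_to_chassis_type(uint32_t devid)
{
	switch (devid) {
	case VTR_DEVID_SFB4:
	case VTR_DEVID_SFB4_DDR:
		return ISR9096_CT;	/* 4-slot spine boards */
	case VTR_DEVID_SFB12:
	case VTR_DEVID_SFB12_DDR:
		return ISR9288_CT;	/* 12-slot spine boards */
	case VTR_DEVID_SFB2012:
		return ISR2012_CT;
	case VTR_DEVID_SFB2004:
		return ISR2004_CT;
	case VTR_DEVID_SFB4700:
	case VTR_DEVID_SFB4700X2:
		return ISR4700_CT;
	case VTR_DEVID_SFB4200:
		return ISR4200_CT;
	default:
		return UNRESOLVED_CT;	/* not a known spine board */
	}
}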
rdma-core-56.1/libibnetdisc/ibnetdisc.c000066400000000000000000000742041477342711600201040ustar00rootroot00000000000000/*
 * Copyright (c) 2004-2009 Voltaire Inc.  All rights reserved.
 * Copyright (c) 2007 Xsigo Systems Inc.  All rights reserved.
 * Copyright (c) 2008 Lawrence Livermore National Laboratory
 * Copyright (c) 2010-2011 Mellanox Technologies LTD.  All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses.  You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <stdarg.h>
#include <time.h>
#include <string.h>
#include <errno.h>
#include <inttypes.h>

#include <infiniband/umad.h>
#include <infiniband/mad.h>
#include <infiniband/ibnetdisc.h>
#include <complib/cl_qmap.h>

#include "internal.h"
#include "chassis.h"

#define container_of(ptr, type, member) \
	((type *)((uint8_t *)(ptr)-offsetof(type, member)))

/* forward declarations */
struct ni_cbdata {
	ibnd_node_t *node;
	int port_num;
};
static int query_node_info(smp_engine_t * engine, ib_portid_t * portid,
			   struct ni_cbdata * cbdata);
static int query_port_info(smp_engine_t * engine, ib_portid_t * portid,
			   ibnd_node_t * node, int portnum);

static int recv_switch_info(smp_engine_t * engine, ibnd_smp_t * smp,
			    uint8_t * mad, void *cb_data)
{
	uint8_t *switch_info = mad + IB_SMP_DATA_OFFS;
	ibnd_node_t *node = cb_data;
	memcpy(node->switchinfo, switch_info, sizeof(node->switchinfo));
	mad_decode_field(node->switchinfo, IB_SW_ENHANCED_PORT0_F,
			 &node->smaenhsp0);
	return 0;
}

static int query_switch_info(smp_engine_t * engine, ib_portid_t * portid,
			     ibnd_node_t * node)
{
	node->smaenhsp0 = 0;	/* assume base SP0 */
	return issue_smp(engine, portid, IB_ATTR_SWITCH_INFO, 0,
			 recv_switch_info, node);
}

static int add_port_to_dpath(ib_dr_path_t * path, int nextport)
{
	if (path->cnt > sizeof(path->p) - 2)
		return -1;
	++path->cnt;
	path->p[path->cnt] = (uint8_t) nextport;
	return path->cnt;
}

static int retract_dpath(smp_engine_t * engine, ib_portid_t * portid)
{
	ibnd_scan_t *scan = engine->user_data;
	f_internal_t *f_int = scan->f_int;

	if (scan->cfg->max_hops &&
	    f_int->fabric.maxhops_discovered > scan->cfg->max_hops)
		return 0;

	/* this may seem wrong but the only time we would retract the path is
	 * if the user specified a CA for the DR path and we are retracting
	 * from that to find the node it is connected to.  This counts as a
	 * positive hop discovered
	 */
	f_int->fabric.maxhops_discovered++;
	portid->drpath.p[portid->drpath.cnt] = 0;
	portid->drpath.cnt--;
	return 1;
}

static int extend_dpath(smp_engine_t * engine, ib_portid_t * portid,
			int nextport)
{
	ibnd_scan_t *scan = engine->user_data;
	f_internal_t *f_int = scan->f_int;

	if (scan->cfg->max_hops &&
	    f_int->fabric.maxhops_discovered > scan->cfg->max_hops)
		return 0;

	if (portid->lid) {
		/* If we were LID routed we need to set up the drslid */
		portid->drpath.drslid = (uint16_t) scan->selfportid.lid;
		portid->drpath.drdlid = 0xFFFF;
	}

	if (add_port_to_dpath(&portid->drpath, nextport) < 0) {
		IBND_ERROR("add port %d to DR path failed; %s\n", nextport,
			   portid2str(portid));
		return -1;
	}

	if (((unsigned) portid->drpath.cnt - scan->initial_hops) >
	    f_int->fabric.maxhops_discovered)
		f_int->fabric.maxhops_discovered++;
	return 1;
}

static int recv_node_desc(smp_engine_t * engine, ibnd_smp_t * smp,
			  uint8_t * mad, void *cb_data)
{
	uint8_t *node_desc = mad + IB_SMP_DATA_OFFS;
	ibnd_node_t *node = cb_data;
	memcpy(node->nodedesc, node_desc, sizeof(node->nodedesc));
	return 0;
}

static int query_node_desc(smp_engine_t * engine, ib_portid_t * portid,
			   ibnd_node_t * node)
{
	return issue_smp(engine, portid, IB_ATTR_NODE_DESC, 0,
			 recv_node_desc, node);
}

static void debug_port(ib_portid_t * portid, ibnd_port_t * port)
{
	char width[64], speed[64];
	int iwidth;
	int ispeed, fdr10, espeed;
	uint8_t *info;

	iwidth = mad_get_field(port->info, 0, IB_PORT_LINK_WIDTH_ACTIVE_F);
	ispeed = mad_get_field(port->info, 0, IB_PORT_LINK_SPEED_ACTIVE_F);
	fdr10 = mad_get_field(port->ext_info, 0,
			      IB_MLNX_EXT_PORT_LINK_SPEED_ACTIVE_F);
	if (port->node->type == IB_NODE_SWITCH)
		info = (uint8_t *)&port->node->ports[0]->info;
	else
		info = (uint8_t *)&port->info;
	espeed = ibnd_get_agg_linkspeedext(info, port->info);
	IBND_DEBUG
("portid %s portnum %d: base lid %d state %d physstate %d %s %s %s %s\n", portid2str(portid), port->portnum, port->base_lid, mad_get_field(port->info, 0, IB_PORT_STATE_F), mad_get_field(port->info, 0, IB_PORT_PHYS_STATE_F), mad_dump_val(IB_PORT_LINK_WIDTH_ACTIVE_F, width, 64, &iwidth), mad_dump_val(IB_PORT_LINK_SPEED_ACTIVE_F, speed, 64, &ispeed), (fdr10 & FDR10) ? "FDR10" : "", ibnd_dump_agg_linkspeedext(speed, 64, espeed)); } static int is_mlnx_ext_port_info_supported(ibnd_port_t * port) { uint16_t devid = (uint16_t) mad_get_field(port->node->info, 0, IB_NODE_DEVID_F); uint32_t vendorid = (uint32_t) mad_get_field(port->node->info, 0, IB_NODE_VENDORID_F); if ((devid >= 0xc738 && devid <= 0xc73b) || devid == 0xc839 || devid == 0xcb20 || devid == 0xcf08 || devid == 0xcf09 || devid == 0xd2f0 || ((vendorid == 0x119f) && /* Bull SwitchX */ (devid == 0x1b02 || devid == 0x1b50 || /* Bull SwitchIB and SwitchIB2 */ devid == 0x1ba0 || (devid >= 0x1bd0 && devid <= 0x1bd5) || /* Bull Quantum */ devid == 0x1bf0))) return 1; if ((devid >= 0x1003 && devid <= 0x101b) || (devid == 0xa2d2) || ((vendorid == 0x119f) && /* Bull ConnectX3 */ (devid == 0x1b33 || devid == 0x1b73 || devid == 0x1b40 || devid == 0x1b41 || devid == 0x1b60 || devid == 0x1b61 || /* Bull ConnectIB */ devid == 0x1b83 || devid == 0x1b93 || devid == 0x1b94 || /* Bull ConnectX4, Sequana HDR and HDR100 */ devid == 0x1bb4 || devid == 0x1bb5 || (devid >= 0x1bc4 && devid <= 0x1bc6)))) return 1; return 0; } int mlnx_ext_port_info_err(smp_engine_t * engine, ibnd_smp_t * smp, uint8_t * mad, void *cb_data) { f_internal_t *f_int = ((ibnd_scan_t *) engine->user_data)->f_int; ibnd_node_t *node = cb_data; ibnd_port_t *port; uint8_t port_num, local_port; port_num = (uint8_t) mad_get_field(mad, 0, IB_MAD_ATTRMOD_F); port = node->ports[port_num]; if (!port) { IBND_ERROR("Failed to find 0x%" PRIx64 " port %u\n", node->guid, port_num); return -1; } local_port = (uint8_t) mad_get_field(port->info, 0, IB_PORT_LOCAL_PORT_F); debug_port(&smp->path, port); if (port_num && mad_get_field(port->info, 0, IB_PORT_PHYS_STATE_F) == IB_PORT_PHYS_STATE_LINKUP && ((node->type == IB_NODE_SWITCH && port_num != local_port) || (node == f_int->fabric.from_node && port_num == f_int->fabric.from_portnum))) { int rc = 0; ib_portid_t path = smp->path; if (node->type != IB_NODE_SWITCH && node == f_int->fabric.from_node && path.drpath.cnt > 1) rc = retract_dpath(engine, &path); else { /* we can't proceed through an HCA with DR */ if (path.lid == 0 || node->type == IB_NODE_SWITCH) rc = extend_dpath(engine, &path, port_num); } if (rc > 0) { struct ni_cbdata * cbdata = malloc(sizeof(*cbdata)); cbdata->node = node; cbdata->port_num = port_num; query_node_info(engine, &path, cbdata); } } return 0; } static int recv_mlnx_ext_port_info(smp_engine_t * engine, ibnd_smp_t * smp, uint8_t * mad, void *cb_data) { f_internal_t *f_int = ((ibnd_scan_t *) engine->user_data)->f_int; ibnd_node_t *node = cb_data; ibnd_port_t *port; uint8_t *ext_port_info = mad + IB_SMP_DATA_OFFS; uint8_t port_num, local_port; port_num = (uint8_t) mad_get_field(mad, 0, IB_MAD_ATTRMOD_F); port = node->ports[port_num]; if (!port) { IBND_ERROR("Failed to find 0x%" PRIx64 " port %u\n", node->guid, port_num); return -1; } memcpy(port->ext_info, ext_port_info, sizeof(port->ext_info)); local_port = (uint8_t) mad_get_field(port->info, 0, IB_PORT_LOCAL_PORT_F); debug_port(&smp->path, port); if (port_num && mad_get_field(port->info, 0, IB_PORT_PHYS_STATE_F) == IB_PORT_PHYS_STATE_LINKUP && ((node->type == IB_NODE_SWITCH && 
port_num != local_port) || (node == f_int->fabric.from_node && port_num == f_int->fabric.from_portnum))) { int rc = 0; ib_portid_t path = smp->path; if (node->type != IB_NODE_SWITCH && node == f_int->fabric.from_node && path.drpath.cnt > 1) rc = retract_dpath(engine, &path); else { /* we can't proceed through an HCA with DR */ if (path.lid == 0 || node->type == IB_NODE_SWITCH) rc = extend_dpath(engine, &path, port_num); } if (rc > 0) { struct ni_cbdata * cbdata = malloc(sizeof(*cbdata)); cbdata->node = node; cbdata->port_num = port_num; query_node_info(engine, &path, cbdata); } } return 0; } static int query_mlnx_ext_port_info(smp_engine_t * engine, ib_portid_t * portid, ibnd_node_t * node, int portnum) { IBND_DEBUG("Query MLNX Extended Port Info; %s (0x%" PRIx64 "):%d\n", portid2str(portid), node->guid, portnum); return issue_smp(engine, portid, IB_ATTR_MLNX_EXT_PORT_INFO, portnum, recv_mlnx_ext_port_info, node); } static int recv_port_info(smp_engine_t * engine, ibnd_smp_t * smp, uint8_t * mad, void *cb_data) { ibnd_scan_t *scan = (ibnd_scan_t *)engine->user_data; f_internal_t *f_int = scan->f_int; ibnd_node_t *node = cb_data; ibnd_port_t *port; uint8_t *port_info = mad + IB_SMP_DATA_OFFS; uint8_t port_num, local_port; int phystate, ispeed, espeed; uint8_t *info; port_num = (uint8_t) mad_get_field(mad, 0, IB_MAD_ATTRMOD_F); local_port = (uint8_t) mad_get_field(port_info, 0, IB_PORT_LOCAL_PORT_F); /* this may have been created before */ port = node->ports[port_num]; if (!port) { port = node->ports[port_num] = calloc(1, sizeof(*port)); if (!port) { IBND_ERROR("Failed to allocate 0x%" PRIx64 " port %u\n", node->guid, port_num); return -1; } port->guid = mad_get_field64(node->info, 0, IB_NODE_PORT_GUID_F); } memcpy(port->info, port_info, sizeof(port->info)); port->node = node; port->portnum = port_num; port->ext_portnum = 0; port->base_lid = (uint16_t) mad_get_field(port->info, 0, IB_PORT_LID_F); port->lmc = (uint8_t) mad_get_field(port->info, 0, IB_PORT_LMC_F); if (port_num == 0) { node->smalid = port->base_lid; node->smalmc = port->lmc; } else if (node->type == IB_NODE_SWITCH) { port->base_lid = node->smalid; port->lmc = node->smalmc; } int rc1 = add_to_portguid_hash(port, f_int->fabric.portstbl); if (rc1) IBND_ERROR("Error Occurred when trying" " to insert new port guid 0x%016" PRIx64 " to DB\n", port->guid); add_to_portlid_hash(port, f_int); if ((scan->cfg->flags & IBND_CONFIG_MLX_EPI) && is_mlnx_ext_port_info_supported(port)) { phystate = mad_get_field(port->info, 0, IB_PORT_PHYS_STATE_F); ispeed = mad_get_field(port->info, 0, IB_PORT_LINK_SPEED_ACTIVE_F); if (port->node->type == IB_NODE_SWITCH) info = (uint8_t *)&port->node->ports[0]->info; else info = (uint8_t *)&port->info; espeed = ibnd_get_agg_linkspeedext(info, port->info); if (phystate == IB_PORT_PHYS_STATE_LINKUP && ispeed == IB_LINK_SPEED_ACTIVE_10 && espeed == IB_LINK_SPEED_EXT_ACTIVE_NONE) { /* LinkUp/QDR */ query_mlnx_ext_port_info(engine, &smp->path, node, port_num); return 0; } } debug_port(&smp->path, port); if (port_num && mad_get_field(port->info, 0, IB_PORT_PHYS_STATE_F) == IB_PORT_PHYS_STATE_LINKUP && ((node->type == IB_NODE_SWITCH && port_num != local_port) || (node == f_int->fabric.from_node && port_num == f_int->fabric.from_portnum))) { int rc = 0; ib_portid_t path = smp->path; if (node->type != IB_NODE_SWITCH && node == f_int->fabric.from_node && path.drpath.cnt > 1) rc = retract_dpath(engine, &path); else { /* we can't proceed through an HCA with DR */ if (path.lid == 0 || node->type == IB_NODE_SWITCH) rc = 
extend_dpath(engine, &path, port_num); } if (rc > 0) { struct ni_cbdata * cbdata = malloc(sizeof(*cbdata)); cbdata->node = node; cbdata->port_num = port_num; query_node_info(engine, &path, cbdata); } } return 0; } static int recv_port0_info(smp_engine_t * engine, ibnd_smp_t * smp, uint8_t * mad, void *cb_data) { ibnd_node_t *node = cb_data; int i, status; status = recv_port_info(engine, smp, mad, cb_data); /* Query PortInfo on switch external/physical ports */ for (i = 1; i <= node->numports; i++) query_port_info(engine, &smp->path, node, i); return status; } static int query_port_info(smp_engine_t * engine, ib_portid_t * portid, ibnd_node_t * node, int portnum) { IBND_DEBUG("Query Port Info; %s (0x%" PRIx64 "):%d\n", portid2str(portid), node->guid, portnum); return issue_smp(engine, portid, IB_ATTR_PORT_INFO, portnum, portnum ? recv_port_info : recv_port0_info, node); } static ibnd_node_t *create_node(smp_engine_t * engine, ib_portid_t * path, uint8_t * node_info) { f_internal_t *f_int = ((ibnd_scan_t *) engine->user_data)->f_int; ibnd_node_t *rc = calloc(1, sizeof(*rc)); if (!rc) { IBND_ERROR("OOM: node creation failed\n"); return NULL; } /* decode just a couple of fields for quicker reference. */ mad_decode_field(node_info, IB_NODE_GUID_F, &rc->guid); mad_decode_field(node_info, IB_NODE_TYPE_F, &rc->type); mad_decode_field(node_info, IB_NODE_NPORTS_F, &rc->numports); rc->ports = calloc(rc->numports + 1, sizeof(*rc->ports)); if (!rc->ports) { free(rc); IBND_ERROR("OOM: Failed to allocate the ports array\n"); return NULL; } rc->path_portid = *path; memcpy(rc->info, node_info, sizeof(rc->info)); int rc1 = add_to_nodeguid_hash(rc, f_int->fabric.nodestbl); if (rc1) IBND_ERROR("Error Occurred when trying" " to insert new node guid 0x%016" PRIx64 " to DB\n", rc->guid); /* add this to the all nodes list */ rc->next = f_int->fabric.nodes; f_int->fabric.nodes = rc; add_to_type_list(rc, f_int); return rc; } static void link_ports(ibnd_node_t * node, ibnd_port_t * port, ibnd_node_t * remotenode, ibnd_port_t * remoteport) { IBND_DEBUG("linking: 0x%" PRIx64 " %p->%p:%u and 0x%" PRIx64 " %p->%p:%u\n", node->guid, node, port, port->portnum, remotenode->guid, remotenode, remoteport, remoteport->portnum); if (port->remoteport) port->remoteport->remoteport = NULL; if (remoteport->remoteport) remoteport->remoteport->remoteport = NULL; port->remoteport = remoteport; remoteport->remoteport = port; } static void dump_endnode(ib_portid_t *path, const char *prompt, ibnd_node_t *node, ibnd_port_t *port) { char type[64]; mad_dump_node_type(type, sizeof(type), &node->type, sizeof(int)); printf("%s -> %s %s {%016" PRIx64 "} portnum %d lid %d-%d \"%s\"\n", portid2str(path), prompt, type, node->guid, node->type == IB_NODE_SWITCH ? 
0 : port->portnum, port->base_lid, port->base_lid + (1 << port->lmc) - 1, node->nodedesc); } static int recv_node_info(smp_engine_t * engine, ibnd_smp_t * smp, uint8_t * mad, void *cb_data) { ibnd_scan_t *scan = engine->user_data; f_internal_t *f_int = scan->f_int; uint8_t *node_info = mad + IB_SMP_DATA_OFFS; struct ni_cbdata *ni_cbdata = (struct ni_cbdata *)cb_data; ibnd_node_t *rem_node = NULL; int rem_port_num = 0; ibnd_node_t *node; int node_is_new = 0; uint64_t node_guid = mad_get_field64(node_info, 0, IB_NODE_GUID_F); uint64_t port_guid = mad_get_field64(node_info, 0, IB_NODE_PORT_GUID_F); int port_num = mad_get_field(node_info, 0, IB_NODE_LOCAL_PORT_F); ibnd_port_t *port = NULL; if (ni_cbdata) { rem_node = ni_cbdata->node; rem_port_num = ni_cbdata->port_num; free(ni_cbdata); } node = ibnd_find_node_guid(&f_int->fabric, node_guid); if (!node) { node = create_node(engine, &smp->path, node_info); if (!node) return -1; node_is_new = 1; } IBND_DEBUG("Found %s node GUID 0x%" PRIx64 " (%s)\n", node_is_new ? "new" : "old", node->guid, portid2str(&smp->path)); port = node->ports[port_num]; if (!port) { /* If we have not see this port before create a shell for it */ port = node->ports[port_num] = calloc(1, sizeof(*port)); if (!port) return -1; port->node = node; port->portnum = port_num; } port->guid = port_guid; if (scan->cfg->show_progress) dump_endnode(&smp->path, node_is_new ? "new" : "known", node, port); if (rem_node == NULL) { /* this is the start node */ f_int->fabric.from_node = node; f_int->fabric.from_portnum = port_num; } else { /* link ports... */ if (!rem_node->ports[rem_port_num]) { IBND_ERROR("Internal Error; " "Node(%p) 0x%" PRIx64 " Port %d no port created!?!?!?\n\n", rem_node, rem_node->guid, rem_port_num); return -1; } link_ports(node, port, rem_node, rem_node->ports[rem_port_num]); } if (node_is_new) { query_node_desc(engine, &smp->path, node); if (node->type == IB_NODE_SWITCH) { query_switch_info(engine, &smp->path, node); /* Query PortInfo on Switch Port 0 first */ query_port_info(engine, &smp->path, node, 0); } } if (node->type != IB_NODE_SWITCH) query_port_info(engine, &smp->path, node, port_num); return 0; } static int query_node_info(smp_engine_t * engine, ib_portid_t * portid, struct ni_cbdata * cbdata) { IBND_DEBUG("Query Node Info; %s\n", portid2str(portid)); return issue_smp(engine, portid, IB_ATTR_NODE_INFO, 0, recv_node_info, (void *)cbdata); } ibnd_node_t *ibnd_find_node_guid(ibnd_fabric_t * fabric, uint64_t guid) { int hash = HASHGUID(guid) % HTSZ; ibnd_node_t *node; if (!fabric) { IBND_DEBUG("fabric parameter NULL\n"); return NULL; } for (node = fabric->nodestbl[hash]; node; node = node->htnext) if (node->guid == guid) return node; return NULL; } ibnd_node_t *ibnd_find_node_dr(ibnd_fabric_t * fabric, char *dr_str) { ibnd_port_t *rc = ibnd_find_port_dr(fabric, dr_str); return rc->node; } int add_to_nodeguid_hash(ibnd_node_t * node, ibnd_node_t * hash[]) { int rc = 0; ibnd_node_t *tblnode; int hash_idx = HASHGUID(node->guid) % HTSZ; for (tblnode = hash[hash_idx]; tblnode; tblnode = tblnode->htnext) { if (tblnode == node) { IBND_ERROR("Duplicate Node: Node with guid 0x%016" PRIx64 " already exists in nodes DB\n", node->guid); return 1; } } node->htnext = hash[hash_idx]; hash[hash_idx] = node; return rc; } int add_to_portguid_hash(ibnd_port_t * port, ibnd_port_t * hash[]) { int rc = 0; ibnd_port_t *tblport; int hash_idx = HASHGUID(port->guid) % HTSZ; for (tblport = hash[hash_idx]; tblport; tblport = tblport->htnext) { if (tblport == port) { 
IBND_ERROR("Duplicate Port: Port with guid 0x%016" PRIx64 " already exists in ports DB\n", port->guid); return 1; } } port->htnext = hash[hash_idx]; hash[hash_idx] = port; return rc; } struct lid2guid_item { cl_map_item_t cl_map; ibnd_port_t *port; }; void create_lid2guid(f_internal_t *f_int) { cl_qmap_init(&f_int->lid2guid); } void destroy_lid2guid(f_internal_t *f_int) { cl_map_item_t *item; for (item = cl_qmap_head(&f_int->lid2guid); item != cl_qmap_end(&f_int->lid2guid); item = cl_qmap_head(&f_int->lid2guid)) { cl_qmap_remove_item(&f_int->lid2guid, item); free(container_of(item, struct lid2guid_item, cl_map)); } } void add_to_portlid_hash(ibnd_port_t * port, f_internal_t *f_int) { uint16_t base_lid = port->base_lid; uint16_t lid_mask = ((1 << port->lmc) -1); uint16_t lid = 0; /* 0 < valid lid <= 0xbfff */ if (base_lid > 0 && base_lid <= 0xbfff) { /* We add the port for all lids * so it is easier to find any "random" lid specified */ for (lid = base_lid; lid <= (base_lid + lid_mask); lid++) { struct lid2guid_item *item; item = malloc(sizeof(*item)); if (item) { item->port = port; if (cl_qmap_insert(&f_int->lid2guid, lid, &item->cl_map) != &item->cl_map) { /* Port is already in map, release item */ free(item); } } } } } void add_to_type_list(ibnd_node_t * node, f_internal_t * f_int) { ibnd_fabric_t *fabric = &f_int->fabric; switch (node->type) { case IB_NODE_CA: node->type_next = fabric->ch_adapters; fabric->ch_adapters = node; break; case IB_NODE_SWITCH: node->type_next = fabric->switches; fabric->switches = node; break; case IB_NODE_ROUTER: node->type_next = fabric->routers; fabric->routers = node; break; } } static int set_config(struct ibnd_config *config, struct ibnd_config *cfg) { if (!config) return (-EINVAL); if (cfg) memcpy(config, cfg, sizeof(*config)); if (!config->max_smps) config->max_smps = DEFAULT_MAX_SMP_ON_WIRE; if (!config->timeout_ms) config->timeout_ms = DEFAULT_TIMEOUT; if (!config->retries) config->retries = DEFAULT_RETRIES; return (0); } f_internal_t *allocate_fabric_internal(void) { f_internal_t *f = calloc(1, sizeof(*f)); if (f) create_lid2guid(f); return (f); } ibnd_fabric_t *ibnd_discover_fabric(char * ca_name, int ca_port, ib_portid_t * from, struct ibnd_config *cfg) { struct ibnd_config config = { 0 }; f_internal_t *f_int = NULL; ib_portid_t my_portid = { 0 }; smp_engine_t engine; ibnd_scan_t scan; struct ibmad_port *ibmad_port; struct ibmad_ports_pair *ibmad_ports; int nc = 2; int mc[2] = { IB_SMI_CLASS, IB_SMI_DIRECT_CLASS }; /* If not specified start from "my" port */ if (!from) from = &my_portid; if (set_config(&config, cfg)) { IBND_ERROR("Invalid ibnd_config\n"); return NULL; } f_int = allocate_fabric_internal(); if (!f_int) { IBND_ERROR("OOM: failed to calloc ibnd_fabric_t\n"); return NULL; } memset(&scan.selfportid, 0, sizeof(scan.selfportid)); scan.f_int = f_int; scan.cfg = &config; scan.initial_hops = from->drpath.cnt; ibmad_ports = mad_rpc_open_port2(ca_name, ca_port, mc, nc, 1); if (!ibmad_ports) { IBND_ERROR("can't open MAD port (%s:%d)\n", ca_name, ca_port); goto error_int; } ibmad_port = ibmad_ports->smi.port; if (!ibmad_port) { IBND_ERROR("can't open MAD port (%s:%d)\n", ca_name, ca_port); goto error_int; } mad_rpc_set_timeout(ibmad_port, cfg->timeout_ms); mad_rpc_set_retries(ibmad_port, cfg->retries); smp_mkey_set(ibmad_port, cfg->mkey); if (ib_resolve_self_via(&scan.selfportid, NULL, NULL, ibmad_port) < 0) { IBND_ERROR("Failed to resolve self\n"); mad_rpc_close_port2(ibmad_ports); goto error_int; } //in case of smi/gsi seperation make sure we 
take the smi name char fixed_ca_name[UMAD_CA_NAME_LEN]; memset(fixed_ca_name, 0, UMAD_CA_NAME_LEN); strncpy(fixed_ca_name, ibmad_ports->smi.ca_name, UMAD_CA_NAME_LEN); mad_rpc_close_port2(ibmad_ports); if (smp_engine_init(&engine, fixed_ca_name, ca_port, &scan, &config)) { goto error_int; } IBND_DEBUG("from %s\n", portid2str(from)); if (!query_node_info(&engine, from, NULL)) if (process_mads(&engine) != 0) goto error; f_int->fabric.total_mads_used = engine.total_smps; f_int->fabric.maxhops_discovered += scan.initial_hops; if (group_nodes(&f_int->fabric)) goto error; smp_engine_destroy(&engine); return (ibnd_fabric_t *)f_int; error: smp_engine_destroy(&engine); ibnd_destroy_fabric(&f_int->fabric); error_int: free(f_int); return NULL; } void destroy_node(ibnd_node_t * node) { int p = 0; if (node->ports) { for (p = 0; p <= node->numports; p++) free(node->ports[p]); free(node->ports); } free(node); } void ibnd_destroy_fabric(ibnd_fabric_t * fabric) { ibnd_node_t *node = NULL; ibnd_node_t *next = NULL; ibnd_chassis_t *ch, *ch_next; if (!fabric) return; ch = fabric->chassis; while (ch) { ch_next = ch->next; free(ch); ch = ch_next; } node = fabric->nodes; while (node) { next = node->next; destroy_node(node); node = next; } destroy_lid2guid((f_internal_t *)fabric); free(fabric); } void ibnd_iter_nodes(ibnd_fabric_t * fabric, ibnd_iter_node_func_t func, void *user_data) { ibnd_node_t *cur = NULL; if (!fabric) { IBND_DEBUG("fabric parameter NULL\n"); return; } if (!func) { IBND_DEBUG("func parameter NULL\n"); return; } for (cur = fabric->nodes; cur; cur = cur->next) func(cur, user_data); } void ibnd_iter_nodes_type(ibnd_fabric_t * fabric, ibnd_iter_node_func_t func, int node_type, void *user_data) { ibnd_node_t *list = NULL; ibnd_node_t *cur = NULL; if (!fabric) { IBND_DEBUG("fabric parameter NULL\n"); return; } if (!func) { IBND_DEBUG("func parameter NULL\n"); return; } switch (node_type) { case IB_NODE_SWITCH: list = fabric->switches; break; case IB_NODE_CA: list = fabric->ch_adapters; break; case IB_NODE_ROUTER: list = fabric->routers; break; default: IBND_DEBUG("Invalid node_type specified %d\n", node_type); break; } for (cur = list; cur; cur = cur->type_next) func(cur, user_data); } ibnd_port_t *ibnd_find_port_lid(ibnd_fabric_t * fabric, uint16_t lid) { f_internal_t *f = (f_internal_t *)fabric; cl_map_item_t *p_item = cl_qmap_get(&f->lid2guid, lid); if (p_item == &f->lid2guid.nil) return NULL; return container_of(p_item, struct lid2guid_item, cl_map) ->port; } ibnd_port_t *ibnd_find_port_guid(ibnd_fabric_t * fabric, uint64_t guid) { int hash = HASHGUID(guid) % HTSZ; ibnd_port_t *port; if (!fabric) { IBND_DEBUG("fabric parameter NULL\n"); return NULL; } for (port = fabric->portstbl[hash]; port; port = port->htnext) if (port->guid == guid) return port; return NULL; } ibnd_port_t *ibnd_find_port_dr(ibnd_fabric_t * fabric, char *dr_str) { int i = 0; ibnd_node_t *cur_node; ibnd_port_t *rc = NULL; ib_dr_path_t path; if (!fabric) { IBND_DEBUG("fabric parameter NULL\n"); return NULL; } if (!dr_str) { IBND_DEBUG("dr_str parameter NULL\n"); return NULL; } cur_node = fabric->from_node; if (str2drpath(&path, dr_str, 0, 0) == -1) return NULL; for (i = 0; i <= path.cnt; i++) { ibnd_port_t *remote_port = NULL; if (path.p[i] == 0) continue; if (!cur_node->ports) return NULL; remote_port = cur_node->ports[path.p[i]]->remoteport; if (!remote_port) return NULL; rc = remote_port; cur_node = remote_port->node; } return rc; } void ibnd_iter_ports(ibnd_fabric_t * fabric, ibnd_iter_port_func_t func, void *user_data) { 
int i = 0;
ibnd_port_t *cur = NULL;
if (!fabric) {
	IBND_DEBUG("fabric parameter NULL\n");
	return;
}
if (!func) {
	IBND_DEBUG("func parameter NULL\n");
	return;
}
for (i = 0; i < HTSZ; i++)
	for (cur = fabric->portstbl[i]; cur; cur = cur->htnext)
		func(cur, user_data);
}
int ibnd_get_agg_linkspeedext_field(void *cap_info, void *info,
				    enum MAD_FIELDS efield, enum MAD_FIELDS e2field)
{
	int espeed = 0, e2speed = 0;
	int cap_mask = cap_info ? mad_get_field(cap_info, 0, IB_PORT_CAPMASK_F) : 0;
	int cap_mask2 = 0;
	if (cap_mask & be32toh(IB_PORT_CAP_HAS_EXT_SPEEDS)) {
		espeed = mad_get_field(info, 0, efield);
		if (efield == IB_PORT_LINK_SPEED_EXT_ENABLED_F)
			if (espeed == 30)
				espeed = 0;
		if (cap_mask & be32toh(IB_PORT_CAP_HAS_CAP_MASK2))
			cap_mask2 = cap_info ? mad_get_field(cap_info, 0, IB_PORT_CAPMASK2_F) : 0;
		if (cap_mask2 & be16toh(IB_PORT_CAP2_IS_EXT_SPEEDS_2_SUPPORTED)) {
			e2speed = (mad_get_field(info, 0, e2field) << 5);
		}
	}
	if (efield == IB_PORT_LINK_SPEED_EXT_ACTIVE_F)
		return e2speed ? e2speed : espeed;
	return espeed | e2speed;
}
int ibnd_get_agg_linkspeedext(void *cap_info, void *info)
{
	return ibnd_get_agg_linkspeedext_field(cap_info, info, IB_PORT_LINK_SPEED_EXT_ACTIVE_F, IB_PORT_LINK_SPEED_EXT_ACTIVE_2_F);
}
int ibnd_get_agg_linkspeedexten(void *cap_info, void *info)
{
	return ibnd_get_agg_linkspeedext_field(cap_info, info, IB_PORT_LINK_SPEED_EXT_ENABLED_F, IB_PORT_LINK_SPEED_EXT_ENABLED_2_F);
}
int ibnd_get_agg_linkspeedextsup(void *cap_info, void *info)
{
	return ibnd_get_agg_linkspeedext_field(cap_info, info, IB_PORT_LINK_SPEED_EXT_SUPPORTED_F, IB_PORT_LINK_SPEED_EXT_SUPPORTED_2_F);
}
char *ibnd_dump_agg_linkspeedext(char *buf, int bufsz, int speed)
{
	switch (speed) {
	case 0: snprintf(buf, bufsz, "No Extended Speed"); break;
	case 1: snprintf(buf, bufsz, "14.0625 Gbps"); break;
	case 2: snprintf(buf, bufsz, "25.78125 Gbps"); break;
	case 4: snprintf(buf, bufsz, "53.125 Gbps"); break;
	case 8: snprintf(buf, bufsz, "106.25 Gbps"); break;
	/* case 16: not used value */
	case 32: snprintf(buf, bufsz, "212.5 Gbps"); break;
	default: snprintf(buf, bufsz, "undefined (%d)", speed); break;
	}
	return buf;
}
char *ibnd_dump_agg_linkspeedext_bits(char *buf, int bufsz, int speed)
{
	int n = 0;
	if (speed == 0) {
		snprintf(buf, bufsz, "%d", speed);
		return buf;
	}
	if (speed & 0x1)
		n += snprintf(buf + n, bufsz - n, "14.0625 Gbps or ");
	if (n < bufsz && (speed & 0x02))
		n += snprintf(buf + n, bufsz - n, "25.78125 Gbps or ");
	if (n < bufsz && (speed & 0x04))
		n += snprintf(buf + n, bufsz - n, "53.125 Gbps or ");
	if (n < bufsz && (speed & 0x08))
		n += snprintf(buf + n, bufsz - n, "106.25 Gbps or ");
	if (n < bufsz && (speed & 0x20))
		n += snprintf(buf + n, bufsz - n, "212.5 Gbps or ");
	if (speed >> 6) {
		n += snprintf(buf + n, bufsz - n, "undefined (%d)", speed);
		return buf;
	} else if (bufsz > 3)
		buf[n - 4] = '\0';
	return buf;
}
char *ibnd_dump_agg_linkspeedexten(char *buf, int bufsz, int speed)
{
	return ibnd_dump_agg_linkspeedext_bits(buf, bufsz, speed);
}
char *ibnd_dump_agg_linkspeedextsup(char *buf, int bufsz, int speed)
{
	return ibnd_dump_agg_linkspeedext_bits(buf, bufsz, speed);
}
rdma-core-56.1/libibnetdisc/ibnetdisc.h000066400000000000000000000203001477342711600200750ustar00rootroot00000000000000/* * Copyright (c) 2009 Voltaire, Inc. All rights reserved. * Copyright (c) 2008 Lawrence Livermore National Lab. All rights reserved. * Copyright (c) 2010-2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses.
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #ifndef _IBNETDISC_H_ #define _IBNETDISC_H_ #include #include #ifdef __cplusplus extern "C" { #endif struct ibnd_chassis; /* forward declare */ struct ibnd_port; /* forward declare */ #define CHASSIS_TYPE_SIZE 20 /** ========================================================================= * Node */ typedef struct ibnd_node { struct ibnd_node *next; /* all node list in fabric */ ib_portid_t path_portid; /* path from "from_node" */ /* NOTE: this is not valid on a fabric * read from a cache file */ uint16_t smalid; uint8_t smalmc; /* quick cache of switchinfo below */ int smaenhsp0; /* use libibmad decoder functions for switchinfo */ uint8_t switchinfo[IB_SMP_DATA_SIZE]; /* quick cache of info below */ uint64_t guid; int type; int numports; /* use libibmad decoder functions for info */ uint8_t info[IB_SMP_DATA_SIZE]; char nodedesc[IB_SMP_DATA_SIZE + 1]; struct ibnd_port **ports; /* array of ports, indexed by port number ports[1] == port 1, ports[2] == port 2, etc... Any port in the array MAY BE NULL! 
Most notable is non-switches have no port 0 therefore node.ports[0] == NULL for those nodes */ /* chassis info */ struct ibnd_node *next_chassis_node; /* next node in ibnd_chassis_t->nodes */ struct ibnd_chassis *chassis; /* if != NULL the chassis this node belongs to */ unsigned char ch_type; char ch_type_str[CHASSIS_TYPE_SIZE]; unsigned char ch_anafanum; unsigned char ch_slotnum; unsigned char ch_slot; /* internal use only */ unsigned char ch_found; struct ibnd_node *htnext; /* hash table list */ struct ibnd_node *type_next; /* next based on type */ } ibnd_node_t; /** ========================================================================= * Port */ typedef struct ibnd_port { uint64_t guid; int portnum; int ext_portnum; /* optional if != 0 external port num */ ibnd_node_t *node; /* node this port belongs to */ struct ibnd_port *remoteport; /* null if SMA, or does not exist */ /* quick cache of info below */ uint16_t base_lid; uint8_t lmc; /* use libibmad decoder functions for info */ uint8_t info[IB_SMP_DATA_SIZE]; uint8_t ext_info[IB_SMP_DATA_SIZE]; /* internal use only */ struct ibnd_port *htnext; } ibnd_port_t; /** ========================================================================= * Chassis */ typedef struct ibnd_chassis { struct ibnd_chassis *next; uint64_t chassisguid; unsigned char chassisnum; /* generic grouping by SystemImageGUID */ unsigned char nodecount; ibnd_node_t *nodes; /* specific to voltaire type nodes */ #define SPINES_MAX_NUM 18 #define LINES_MAX_NUM 36 ibnd_node_t *spinenode[SPINES_MAX_NUM + 1]; ibnd_node_t *linenode[LINES_MAX_NUM + 1]; } ibnd_chassis_t; #define HTSZ 137 /* define config flags */ #define IBND_CONFIG_MLX_EPI (1 << 0) typedef struct ibnd_config { unsigned max_smps; unsigned show_progress; unsigned max_hops; unsigned debug; unsigned timeout_ms; unsigned retries; uint32_t flags; uint64_t mkey; uint8_t pad[44]; } ibnd_config_t; /** ========================================================================= * Fabric * Main fabric object which is returned and represents the data discovered */ typedef struct ibnd_fabric { /* the node the discover was initiated from * "from" parameter in ibnd_discover_fabric * or by default the node you ar running on */ ibnd_node_t *from_node; int from_portnum; /* NULL term list of all nodes in the fabric */ ibnd_node_t *nodes; /* NULL terminated list of all chassis found in the fabric */ ibnd_chassis_t *chassis; unsigned maxhops_discovered; unsigned total_mads_used; /* internal use only */ ibnd_node_t *nodestbl[HTSZ]; ibnd_port_t *portstbl[HTSZ]; ibnd_node_t *switches; ibnd_node_t *ch_adapters; ibnd_node_t *routers; } ibnd_fabric_t; /** ========================================================================= * Initialization (fabric operations) */ ibnd_fabric_t *ibnd_discover_fabric(char *ca_name, int ca_port, ib_portid_t *from, struct ibnd_config *config); /** * ca_name: (optional) name of the CA to use * ca_port: (optional) CA port to use * from: (optional) specify the node to start scanning from. 
* If NULL start from the CA/CA port specified * config: (optional) additional config options for the scan */ void ibnd_destroy_fabric(ibnd_fabric_t *fabric); ibnd_fabric_t *ibnd_load_fabric(const char *file, unsigned int flags); int ibnd_cache_fabric(ibnd_fabric_t *fabric, const char *file, unsigned int flags); #define IBND_CACHE_FABRIC_FLAG_DEFAULT 0x0000 #define IBND_CACHE_FABRIC_FLAG_NO_OVERWRITE 0x0001 /** ========================================================================= * Node operations */ ibnd_node_t *ibnd_find_node_guid(ibnd_fabric_t *fabric, uint64_t guid); ibnd_node_t *ibnd_find_node_dr(ibnd_fabric_t *fabric, char *dr_str); typedef void (*ibnd_iter_node_func_t) (ibnd_node_t * node, void *user_data); void ibnd_iter_nodes(ibnd_fabric_t *fabric, ibnd_iter_node_func_t func, void *user_data); void ibnd_iter_nodes_type(ibnd_fabric_t *fabric, ibnd_iter_node_func_t func, int node_type, void *user_data); /** ========================================================================= * Port operations */ ibnd_port_t *ibnd_find_port_guid(ibnd_fabric_t *fabric, uint64_t guid); ibnd_port_t *ibnd_find_port_dr(ibnd_fabric_t *fabric, char *dr_str); ibnd_port_t *ibnd_find_port_lid(ibnd_fabric_t *fabric, uint16_t lid); typedef void (*ibnd_iter_port_func_t) (ibnd_port_t * port, void *user_data); void ibnd_iter_ports(ibnd_fabric_t *fabric, ibnd_iter_port_func_t func, void *user_data); /** ========================================================================= * Chassis queries */ uint64_t ibnd_get_chassis_guid(ibnd_fabric_t *fabric, unsigned char chassisnum); const char *ibnd_get_chassis_type(ibnd_node_t *node); char *ibnd_get_chassis_slot_str(ibnd_node_t *node, char *str, size_t size); int ibnd_is_xsigo_guid(uint64_t guid); int ibnd_is_xsigo_tca(uint64_t guid); int ibnd_is_xsigo_hca(uint64_t guid); int ibnd_get_agg_linkspeedext_field(void *cap_info, void *info, enum MAD_FIELDS efield, enum MAD_FIELDS e2field); int ibnd_get_agg_linkspeedext(void *cap_info, void *info); int ibnd_get_agg_linkspeedexten(void *cap_info, void *info); int ibnd_get_agg_linkspeedextsup(void *cap_info, void *info); char *ibnd_dump_agg_linkspeedext_bits(char *buf, int bufsz, int speed); char *ibnd_dump_agg_linkspeedext(char *buf, int bufsz, int speed); char *ibnd_dump_agg_linkspeedexten(char *buf, int bufsz, int speed); char *ibnd_dump_agg_linkspeedextsup(char *buf, int bufsz, int speed); #ifdef __cplusplus } #endif #endif /* _IBNETDISC_H_ */ rdma-core-56.1/libibnetdisc/ibnetdisc_cache.c000066400000000000000000000573351477342711600212350ustar00rootroot00000000000000/* * Copyright (c) 2004-2007 Voltaire Inc. All rights reserved. * Copyright (c) 2007 Xsigo Systems Inc. All rights reserved. * Copyright (c) 2008 Lawrence Livermore National Laboratory * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include "internal.h" #include "chassis.h" /* For this caching lib, we always cache little endian */ /* Cache format * * Bytes 1-4 - magic number * Bytes 5-8 - version number * Bytes 9-12 - node count * Bytes 13-16 - port count * Bytes 17-24 - "from node" guid * Bytes 25-28 - maxhops discovered * Bytes X-Y - nodes (variable length) * Bytes X-Y - ports (variable length) * * Nodes are cached as * * 2 bytes - smalid * 1 byte - smalmc * 1 byte - smaenhsp0 flag * IB_SMP_DATA_SIZE bytes - switchinfo * 8 bytes - guid * 1 byte - type * 1 byte - numports * IB_SMP_DATA_SIZE bytes - info * IB_SMP_DATA_SIZE bytes - nodedesc * 1 byte - number of ports stored * 8 bytes - portguid A * 1 byte - port num A * 8 bytes - portguid B * 1 byte - port num B * ... etc., depending on number of ports stored * * Ports are cached as * * 8 bytes - guid * 1 byte - portnum * 1 byte - external portnum * 2 bytes - base lid * 1 byte - lmc * IB_SMP_DATA_SIZE bytes - info * 8 bytes - node guid port "owned" by * 1 byte - flag indicating if remote port exists * 8 bytes - port guid remotely connected to * 1 byte - port num remotely connected to */ /* Structs that hold cache info temporarily before * the real structs can be reconstructed. 
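 * Ports are keyed by (guid, portnum) pairs (ibnd_port_cache_key_t below),
 * so that node and remote-port pointers can be re-linked once both the
 * node and port tables have been loaded.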
*/ typedef struct ibnd_port_cache_key { uint64_t guid; uint8_t portnum; } ibnd_port_cache_key_t; typedef struct ibnd_node_cache { ibnd_node_t *node; uint8_t ports_stored_count; ibnd_port_cache_key_t *port_cache_keys; struct ibnd_node_cache *next; struct ibnd_node_cache *htnext; int node_stored_to_fabric; } ibnd_node_cache_t; typedef struct ibnd_port_cache { ibnd_port_t *port; uint64_t node_guid; uint8_t remoteport_flag; ibnd_port_cache_key_t remoteport_cache_key; struct ibnd_port_cache *next; struct ibnd_port_cache *htnext; int port_stored_to_fabric; } ibnd_port_cache_t; typedef struct ibnd_fabric_cache { f_internal_t *f_int; uint64_t from_node_guid; ibnd_node_cache_t *nodes_cache; ibnd_port_cache_t *ports_cache; ibnd_node_cache_t *nodescachetbl[HTSZ]; ibnd_port_cache_t *portscachetbl[HTSZ]; } ibnd_fabric_cache_t; #define IBND_FABRIC_CACHE_BUFLEN 4096 #define IBND_FABRIC_CACHE_MAGIC 0x8FE7832B #define IBND_FABRIC_CACHE_VERSION 0x00000001 #define IBND_FABRIC_CACHE_COUNT_OFFSET 8 #define IBND_FABRIC_CACHE_HEADER_LEN (28) #define IBND_NODE_CACHE_HEADER_LEN (15 + IB_SMP_DATA_SIZE*3) #define IBND_PORT_CACHE_KEY_LEN (8 + 1) #define IBND_PORT_CACHE_LEN (31 + IB_SMP_DATA_SIZE) static ssize_t ibnd_read(int fd, void *buf, size_t count) { size_t count_done = 0; ssize_t ret; while ((count - count_done) > 0) { ret = read(fd, ((char *) buf) + count_done, count - count_done); if (ret < 0) { if (errno == EINTR) continue; else { IBND_DEBUG("read: %s\n", strerror(errno)); return -1; } } if (!ret) break; count_done += ret; } if (count_done != count) { IBND_DEBUG("read: read short\n"); return -1; } return count_done; } static size_t _unmarshall8(uint8_t * inbuf, uint8_t * num) { (*num) = inbuf[0]; return (sizeof(*num)); } static size_t _unmarshall16(uint8_t * inbuf, uint16_t * num) { (*num) = ((uint16_t) inbuf[1] << 8) | inbuf[0]; return (sizeof(*num)); } static size_t _unmarshall32(uint8_t * inbuf, uint32_t * num) { (*num) = (uint32_t) inbuf[0]; (*num) |= ((uint32_t) inbuf[1] << 8); (*num) |= ((uint32_t) inbuf[2] << 16); (*num) |= ((uint32_t) inbuf[3] << 24); return (sizeof(*num)); } static size_t _unmarshall64(uint8_t * inbuf, uint64_t * num) { (*num) = (uint64_t) inbuf[0]; (*num) |= ((uint64_t) inbuf[1] << 8); (*num) |= ((uint64_t) inbuf[2] << 16); (*num) |= ((uint64_t) inbuf[3] << 24); (*num) |= ((uint64_t) inbuf[4] << 32); (*num) |= ((uint64_t) inbuf[5] << 40); (*num) |= ((uint64_t) inbuf[6] << 48); (*num) |= ((uint64_t) inbuf[7] << 56); return (sizeof(*num)); } static size_t _unmarshall_buf(const void *inbuf, void *outbuf, unsigned int len) { memcpy(outbuf, inbuf, len); return len; } static int _load_header_info(int fd, ibnd_fabric_cache_t * fabric_cache, unsigned int *node_count, unsigned int *port_count) { uint8_t buf[IBND_FABRIC_CACHE_BUFLEN]; uint32_t magic = 0; uint32_t version = 0; size_t offset = 0; uint32_t tmp32; if (ibnd_read(fd, buf, IBND_FABRIC_CACHE_HEADER_LEN) < 0) return -1; offset += _unmarshall32(buf + offset, &magic); if (magic != IBND_FABRIC_CACHE_MAGIC) { IBND_DEBUG("invalid fabric cache file\n"); return -1; } offset += _unmarshall32(buf + offset, &version); if (version != IBND_FABRIC_CACHE_VERSION) { IBND_DEBUG("invalid fabric cache version\n"); return -1; } offset += _unmarshall32(buf + offset, node_count); offset += _unmarshall32(buf + offset, port_count); offset += _unmarshall64(buf + offset, &fabric_cache->from_node_guid); offset += _unmarshall32(buf + offset, &tmp32); fabric_cache->f_int->fabric.maxhops_discovered = tmp32; return 0; } static void 
_destroy_ibnd_node_cache(ibnd_node_cache_t * node_cache) { free(node_cache->port_cache_keys); if (!node_cache->node_stored_to_fabric && node_cache->node) destroy_node(node_cache->node); free(node_cache); } static void _destroy_ibnd_fabric_cache(ibnd_fabric_cache_t * fabric_cache) { ibnd_node_cache_t *node_cache; ibnd_node_cache_t *node_cache_next; ibnd_port_cache_t *port_cache; ibnd_port_cache_t *port_cache_next; if (!fabric_cache) return; node_cache = fabric_cache->nodes_cache; while (node_cache) { node_cache_next = node_cache->next; _destroy_ibnd_node_cache(node_cache); node_cache = node_cache_next; } port_cache = fabric_cache->ports_cache; while (port_cache) { port_cache_next = port_cache->next; if (!port_cache->port_stored_to_fabric && port_cache->port) free(port_cache->port); free(port_cache); port_cache = port_cache_next; } free(fabric_cache); } static void store_node_cache(ibnd_node_cache_t * node_cache, ibnd_fabric_cache_t * fabric_cache) { int hash_indx = HASHGUID(node_cache->node->guid) % HTSZ; node_cache->next = fabric_cache->nodes_cache; fabric_cache->nodes_cache = node_cache; node_cache->htnext = fabric_cache->nodescachetbl[hash_indx]; fabric_cache->nodescachetbl[hash_indx] = node_cache; } static int _load_node(int fd, ibnd_fabric_cache_t * fabric_cache) { uint8_t buf[IBND_FABRIC_CACHE_BUFLEN]; ibnd_node_cache_t *node_cache = NULL; ibnd_node_t *node = NULL; size_t offset = 0; uint8_t tmp8; node_cache = (ibnd_node_cache_t *) malloc(sizeof(ibnd_node_cache_t)); if (!node_cache) { IBND_DEBUG("OOM: node_cache\n"); return -1; } memset(node_cache, '\0', sizeof(ibnd_node_cache_t)); node = (ibnd_node_t *) malloc(sizeof(ibnd_node_t)); if (!node) { IBND_DEBUG("OOM: node\n"); free(node_cache); return -1; } memset(node, '\0', sizeof(ibnd_node_t)); node_cache->node = node; if (ibnd_read(fd, buf, IBND_NODE_CACHE_HEADER_LEN) < 0) goto cleanup; offset += _unmarshall16(buf + offset, &node->smalid); offset += _unmarshall8(buf + offset, &node->smalmc); offset += _unmarshall8(buf + offset, &tmp8); node->smaenhsp0 = tmp8; offset += _unmarshall_buf(buf + offset, node->switchinfo, IB_SMP_DATA_SIZE); offset += _unmarshall64(buf + offset, &node->guid); offset += _unmarshall8(buf + offset, &tmp8); node->type = tmp8; offset += _unmarshall8(buf + offset, &tmp8); node->numports = tmp8; offset += _unmarshall_buf(buf + offset, node->info, IB_SMP_DATA_SIZE); offset += _unmarshall_buf(buf + offset, node->nodedesc, IB_SMP_DATA_SIZE); offset += _unmarshall8(buf + offset, &node_cache->ports_stored_count); if (node_cache->ports_stored_count) { unsigned int tomalloc = 0; unsigned int toread = 0; unsigned int i; tomalloc = sizeof(ibnd_port_cache_key_t) * node_cache->ports_stored_count; toread = IBND_PORT_CACHE_KEY_LEN * node_cache->ports_stored_count; node_cache->port_cache_keys = (ibnd_port_cache_key_t *) malloc(tomalloc); if (!node_cache->port_cache_keys) { IBND_DEBUG("OOM: node_cache port_cache_keys\n"); goto cleanup; } if (ibnd_read(fd, buf, toread) < 0) goto cleanup; offset = 0; for (i = 0; i < node_cache->ports_stored_count; i++) { offset += _unmarshall64(buf + offset, &node_cache->port_cache_keys[i].guid); offset += _unmarshall8(buf + offset, &node_cache-> port_cache_keys[i].portnum); } } store_node_cache(node_cache, fabric_cache); return 0; cleanup: _destroy_ibnd_node_cache(node_cache); return -1; } static void store_port_cache(ibnd_port_cache_t * port_cache, ibnd_fabric_cache_t * fabric_cache) { int hash_indx = HASHGUID(port_cache->port->guid) % HTSZ; port_cache->next = fabric_cache->ports_cache; 
fabric_cache->ports_cache = port_cache; port_cache->htnext = fabric_cache->portscachetbl[hash_indx]; fabric_cache->portscachetbl[hash_indx] = port_cache; } static int _load_port(int fd, ibnd_fabric_cache_t * fabric_cache) { uint8_t buf[IBND_FABRIC_CACHE_BUFLEN]; ibnd_port_cache_t *port_cache = NULL; ibnd_port_t *port = NULL; size_t offset = 0; uint8_t tmp8; port_cache = (ibnd_port_cache_t *) malloc(sizeof(ibnd_port_cache_t)); if (!port_cache) { IBND_DEBUG("OOM: port_cache\n"); return -1; } memset(port_cache, '\0', sizeof(ibnd_port_cache_t)); port = (ibnd_port_t *) malloc(sizeof(ibnd_port_t)); if (!port) { IBND_DEBUG("OOM: port\n"); free(port_cache); return -1; } memset(port, '\0', sizeof(ibnd_port_t)); port_cache->port = port; if (ibnd_read(fd, buf, IBND_PORT_CACHE_LEN) < 0) goto cleanup; offset += _unmarshall64(buf + offset, &port->guid); offset += _unmarshall8(buf + offset, &tmp8); port->portnum = tmp8; offset += _unmarshall8(buf + offset, &tmp8); port->ext_portnum = tmp8; offset += _unmarshall16(buf + offset, &port->base_lid); offset += _unmarshall8(buf + offset, &port->lmc); offset += _unmarshall_buf(buf + offset, port->info, IB_SMP_DATA_SIZE); offset += _unmarshall64(buf + offset, &port_cache->node_guid); offset += _unmarshall8(buf + offset, &port_cache->remoteport_flag); offset += _unmarshall64(buf + offset, &port_cache->remoteport_cache_key.guid); offset += _unmarshall8(buf + offset, &port_cache->remoteport_cache_key.portnum); store_port_cache(port_cache, fabric_cache); return 0; cleanup: free(port); free(port_cache); return -1; } static ibnd_port_cache_t *_find_port(ibnd_fabric_cache_t * fabric_cache, ibnd_port_cache_key_t * port_cache_key) { int hash_indx = HASHGUID(port_cache_key->guid) % HTSZ; ibnd_port_cache_t *port_cache; for (port_cache = fabric_cache->portscachetbl[hash_indx]; port_cache; port_cache = port_cache->htnext) { if (port_cache->port->guid == port_cache_key->guid && port_cache->port->portnum == port_cache_key->portnum) return port_cache; } return NULL; } static ibnd_node_cache_t *_find_node(ibnd_fabric_cache_t * fabric_cache, uint64_t guid) { int hash_indx = HASHGUID(guid) % HTSZ; ibnd_node_cache_t *node_cache; for (node_cache = fabric_cache->nodescachetbl[hash_indx]; node_cache; node_cache = node_cache->htnext) { if (node_cache->node->guid == guid) return node_cache; } return NULL; } static int _fill_port(ibnd_fabric_cache_t * fabric_cache, ibnd_node_t * node, ibnd_port_cache_key_t * port_cache_key) { ibnd_port_cache_t *port_cache; if (!(port_cache = _find_port(fabric_cache, port_cache_key))) { IBND_DEBUG("Cache invalid: cannot find port\n"); return -1; } if (port_cache->port_stored_to_fabric) { IBND_DEBUG("Cache invalid: duplicate port discovered\n"); return -1; } node->ports[port_cache->port->portnum] = port_cache->port; port_cache->port_stored_to_fabric++; /* achu: needed if user wishes to re-cache a loaded fabric. * Otherwise, mostly unnecessary to do this. 
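 * (ibnd_cache_fabric() walks fabric.portstbl when writing a cache file
 * back out, so the port guid hash must be kept consistent here as well.)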
*/ int rc = add_to_portguid_hash(port_cache->port, fabric_cache->f_int->fabric.portstbl); if (rc) { IBND_DEBUG("Error Occurred when trying" " to insert new port guid 0x%016" PRIx64 " to DB\n", port_cache->port->guid); } return 0; } static int _rebuild_nodes(ibnd_fabric_cache_t * fabric_cache) { ibnd_node_cache_t *node_cache; ibnd_node_cache_t *node_cache_next; node_cache = fabric_cache->nodes_cache; while (node_cache) { ibnd_node_t *node; int i; node_cache_next = node_cache->next; node = node_cache->node; /* Insert node into appropriate data structures */ node->next = fabric_cache->f_int->fabric.nodes; fabric_cache->f_int->fabric.nodes = node; int rc = add_to_nodeguid_hash(node_cache->node, fabric_cache-> f_int-> fabric.nodestbl); if (rc) { IBND_DEBUG("Error Occurred when trying" " to insert new node guid 0x%016" PRIx64 " to DB\n", node_cache->node->guid); } add_to_type_list(node_cache->node, fabric_cache->f_int); node_cache->node_stored_to_fabric++; /* Rebuild node ports array */ if (!(node->ports = calloc(node->numports + 1, sizeof(*node->ports)))) { IBND_DEBUG("OOM: node->ports\n"); return -1; } for (i = 0; i < node_cache->ports_stored_count; i++) { if (_fill_port(fabric_cache, node, &node_cache->port_cache_keys[i]) < 0) return -1; } node_cache = node_cache_next; } return 0; } static int _rebuild_ports(ibnd_fabric_cache_t * fabric_cache) { ibnd_port_cache_t *port_cache; ibnd_port_cache_t *port_cache_next; port_cache = fabric_cache->ports_cache; while (port_cache) { ibnd_node_cache_t *node_cache; ibnd_port_cache_t *remoteport_cache; ibnd_port_t *port; port_cache_next = port_cache->next; port = port_cache->port; if (!(node_cache = _find_node(fabric_cache, port_cache->node_guid))) { IBND_DEBUG("Cache invalid: cannot find node\n"); return -1; } port->node = node_cache->node; if (port_cache->remoteport_flag) { if (!(remoteport_cache = _find_port(fabric_cache, &port_cache->remoteport_cache_key))) { IBND_DEBUG ("Cache invalid: cannot find remote port\n"); return -1; } port->remoteport = remoteport_cache->port; } else port->remoteport = NULL; add_to_portlid_hash(port, fabric_cache->f_int); port_cache = port_cache_next; } return 0; } ibnd_fabric_t *ibnd_load_fabric(const char *file, unsigned int flags) { unsigned int node_count = 0; unsigned int port_count = 0; ibnd_fabric_cache_t *fabric_cache = NULL; f_internal_t *f_int = NULL; ibnd_node_cache_t *node_cache = NULL; int fd = -1; unsigned int i; if (!file) { IBND_DEBUG("file parameter NULL\n"); return NULL; } if ((fd = open(file, O_RDONLY)) < 0) { IBND_DEBUG("open: %s\n", strerror(errno)); return NULL; } fabric_cache = (ibnd_fabric_cache_t *) malloc(sizeof(ibnd_fabric_cache_t)); if (!fabric_cache) { IBND_DEBUG("OOM: fabric_cache\n"); goto cleanup; } memset(fabric_cache, '\0', sizeof(ibnd_fabric_cache_t)); f_int = allocate_fabric_internal(); if (!f_int) { IBND_DEBUG("OOM: fabric\n"); goto cleanup; } fabric_cache->f_int = f_int; if (_load_header_info(fd, fabric_cache, &node_count, &port_count) < 0) goto cleanup; for (i = 0; i < node_count; i++) { if (_load_node(fd, fabric_cache) < 0) goto cleanup; } for (i = 0; i < port_count; i++) { if (_load_port(fd, fabric_cache) < 0) goto cleanup; } /* Special case - find from node */ if (!(node_cache = _find_node(fabric_cache, fabric_cache->from_node_guid))) { IBND_DEBUG("Cache invalid: cannot find from node\n"); goto cleanup; } f_int->fabric.from_node = node_cache->node; if (_rebuild_nodes(fabric_cache) < 0) goto cleanup; if (_rebuild_ports(fabric_cache) < 0) goto cleanup; if (group_nodes(&f_int->fabric)) 
goto cleanup; _destroy_ibnd_fabric_cache(fabric_cache); close(fd); return (ibnd_fabric_t *)&f_int->fabric; cleanup: ibnd_destroy_fabric((ibnd_fabric_t *)f_int); _destroy_ibnd_fabric_cache(fabric_cache); close(fd); return NULL; } static ssize_t ibnd_write(int fd, const void *buf, size_t count) { size_t count_done = 0; ssize_t ret; while ((count - count_done) > 0) { ret = write(fd, ((char *) buf) + count_done, count - count_done); if (ret < 0) { if (errno == EINTR) continue; else { IBND_DEBUG("write: %s\n", strerror(errno)); return -1; } } count_done += ret; } return count_done; } static size_t _marshall8(uint8_t * outbuf, uint8_t num) { outbuf[0] = num; return (sizeof(num)); } static size_t _marshall16(uint8_t * outbuf, uint16_t num) { outbuf[0] = num & 0x00FF; outbuf[1] = (num & 0xFF00) >> 8; return (sizeof(num)); } static size_t _marshall32(uint8_t * outbuf, uint32_t num) { outbuf[0] = num & 0x000000FF; outbuf[1] = (num & 0x0000FF00) >> 8; outbuf[2] = (num & 0x00FF0000) >> 16; outbuf[3] = (num & 0xFF000000) >> 24; return (sizeof(num)); } static size_t _marshall64(uint8_t * outbuf, uint64_t num) { outbuf[0] = (uint8_t) num; outbuf[1] = (uint8_t) (num >> 8); outbuf[2] = (uint8_t) (num >> 16); outbuf[3] = (uint8_t) (num >> 24); outbuf[4] = (uint8_t) (num >> 32); outbuf[5] = (uint8_t) (num >> 40); outbuf[6] = (uint8_t) (num >> 48); outbuf[7] = (uint8_t) (num >> 56); return (sizeof(num)); } static size_t _marshall_buf(void *outbuf, const void *inbuf, unsigned int len) { memcpy(outbuf, inbuf, len); return len; } static int _cache_header_info(int fd, ibnd_fabric_t * fabric) { uint8_t buf[IBND_FABRIC_CACHE_BUFLEN]; size_t offset = 0; /* Store magic number, version, and other important info */ /* For this caching lib, we always assume cached as little endian */ offset += _marshall32(buf + offset, IBND_FABRIC_CACHE_MAGIC); offset += _marshall32(buf + offset, IBND_FABRIC_CACHE_VERSION); /* save space for node count */ offset += _marshall32(buf + offset, 0); /* save space for port count */ offset += _marshall32(buf + offset, 0); offset += _marshall64(buf + offset, fabric->from_node->guid); offset += _marshall32(buf + offset, fabric->maxhops_discovered); if (ibnd_write(fd, buf, offset) < 0) return -1; return 0; } static int _cache_header_counts(int fd, unsigned int node_count, unsigned int port_count) { uint8_t buf[IBND_FABRIC_CACHE_BUFLEN]; size_t offset = 0; offset += _marshall32(buf + offset, node_count); offset += _marshall32(buf + offset, port_count); if (lseek(fd, IBND_FABRIC_CACHE_COUNT_OFFSET, SEEK_SET) < 0) { IBND_DEBUG("lseek: %s\n", strerror(errno)); return -1; } if (ibnd_write(fd, buf, offset) < 0) return -1; return 0; } static int _cache_node(int fd, ibnd_node_t * node) { uint8_t buf[IBND_FABRIC_CACHE_BUFLEN]; size_t offset = 0; size_t ports_stored_offset = 0; uint8_t ports_stored_count = 0; int i; offset += _marshall16(buf + offset, node->smalid); offset += _marshall8(buf + offset, node->smalmc); offset += _marshall8(buf + offset, (uint8_t) node->smaenhsp0); offset += _marshall_buf(buf + offset, node->switchinfo, IB_SMP_DATA_SIZE); offset += _marshall64(buf + offset, node->guid); offset += _marshall8(buf + offset, (uint8_t) node->type); offset += _marshall8(buf + offset, (uint8_t) node->numports); offset += _marshall_buf(buf + offset, node->info, IB_SMP_DATA_SIZE); offset += _marshall_buf(buf + offset, node->nodedesc, IB_SMP_DATA_SIZE); /* need to come back later and store number of stored ports * because port entries can be NULL or (in the case of switches) * there is an additional port 
0 not accounted for in numports. */ ports_stored_offset = offset; offset += sizeof(uint8_t); for (i = 0; i <= node->numports; i++) { if (node->ports[i]) { offset += _marshall64(buf + offset, node->ports[i]->guid); offset += _marshall8(buf + offset, (uint8_t) node->ports[i]->portnum); ports_stored_count++; } } /* go back and store number of port keys stored */ _marshall8(buf + ports_stored_offset, ports_stored_count); if (ibnd_write(fd, buf, offset) < 0) return -1; return 0; } static int _cache_port(int fd, ibnd_port_t * port) { uint8_t buf[IBND_FABRIC_CACHE_BUFLEN]; size_t offset = 0; offset += _marshall64(buf + offset, port->guid); offset += _marshall8(buf + offset, (uint8_t) port->portnum); offset += _marshall8(buf + offset, (uint8_t) port->ext_portnum); offset += _marshall16(buf + offset, port->base_lid); offset += _marshall8(buf + offset, port->lmc); offset += _marshall_buf(buf + offset, port->info, IB_SMP_DATA_SIZE); offset += _marshall64(buf + offset, port->node->guid); if (port->remoteport) { offset += _marshall8(buf + offset, 1); offset += _marshall64(buf + offset, port->remoteport->guid); offset += _marshall8(buf + offset, (uint8_t) port->remoteport->portnum); } else { offset += _marshall8(buf + offset, 0); offset += _marshall64(buf + offset, 0); offset += _marshall8(buf + offset, 0); } if (ibnd_write(fd, buf, offset) < 0) return -1; return 0; } int ibnd_cache_fabric(ibnd_fabric_t * fabric, const char *file, unsigned int flags) { struct stat statbuf; ibnd_node_t *node = NULL; ibnd_node_t *node_next = NULL; unsigned int node_count = 0; ibnd_port_t *port = NULL; ibnd_port_t *port_next = NULL; unsigned int port_count = 0; int fd; int i; if (!fabric) { IBND_DEBUG("fabric parameter NULL\n"); return -1; } if (!file) { IBND_DEBUG("file parameter NULL\n"); return -1; } if (!(flags & IBND_CACHE_FABRIC_FLAG_NO_OVERWRITE)) { if (!stat(file, &statbuf)) { if (unlink(file) < 0) { IBND_DEBUG("error removing '%s': %s\n", file, strerror(errno)); return -1; } } } else { if (!stat(file, &statbuf)) { IBND_DEBUG("file '%s' already exists\n", file); return -1; } } if ((fd = open(file, O_CREAT | O_EXCL | O_WRONLY, 0644)) < 0) { IBND_DEBUG("open: %s\n", strerror(errno)); return -1; } if (_cache_header_info(fd, fabric) < 0) goto cleanup; node = fabric->nodes; while (node) { node_next = node->next; if (_cache_node(fd, node) < 0) goto cleanup; node_count++; node = node_next; } for (i = 0; i < HTSZ; i++) { port = fabric->portstbl[i]; while (port) { port_next = port->htnext; if (_cache_port(fd, port) < 0) goto cleanup; port_count++; port = port_next; } } if (_cache_header_counts(fd, node_count, port_count) < 0) goto cleanup; if (close(fd) < 0) { IBND_DEBUG("close: %s\n", strerror(errno)); goto cleanup; } return 0; cleanup: unlink(file); close(fd); return -1; } rdma-core-56.1/libibnetdisc/ibnetdisc_osd.h000066400000000000000000000000441477342711600207450ustar00rootroot00000000000000#warning "This header is obsolete." rdma-core-56.1/libibnetdisc/internal.h000066400000000000000000000073461477342711600177640ustar00rootroot00000000000000/* * Copyright (c) 2008 Lawrence Livermore National Laboratory * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ /** ========================================================================= * Define the internal data structures. */ #ifndef _INTERNAL_H_ #define _INTERNAL_H_ #include #include #define IBND_DEBUG(fmt, ...) \ if (ibdebug) { \ printf("%s:%u; " fmt, __FILE__, __LINE__, ## __VA_ARGS__); \ } #define IBND_ERROR(fmt, ...) \ fprintf(stderr, "%s:%u; " fmt, __FILE__, __LINE__, ## __VA_ARGS__) /* HASH table defines */ #define HASHGUID(guid) ((uint32_t)(((uint32_t)(guid) * 101) ^ ((uint32_t)((guid) >> 32) * 103))) #define MAXHOPS 63 #define DEFAULT_MAX_SMP_ON_WIRE 2 #define DEFAULT_TIMEOUT 1000 #define DEFAULT_RETRIES 3 typedef struct f_internal { ibnd_fabric_t fabric; cl_qmap_t lid2guid; } f_internal_t; f_internal_t *allocate_fabric_internal(void); void create_lid2guid(f_internal_t *f_int); void destroy_lid2guid(f_internal_t *f_int); void add_to_portlid_hash(ibnd_port_t * port, f_internal_t *f_int); typedef struct ibnd_scan { ib_portid_t selfportid; f_internal_t *f_int; struct ibnd_config *cfg; unsigned initial_hops; } ibnd_scan_t; typedef struct ibnd_smp ibnd_smp_t; typedef struct smp_engine smp_engine_t; typedef int (*smp_comp_cb_t) (smp_engine_t * engine, ibnd_smp_t * smp, uint8_t * mad_resp, void *cb_data); struct ibnd_smp { cl_map_item_t on_wire; struct ibnd_smp *qnext; smp_comp_cb_t cb; void *cb_data; ib_portid_t path; ib_rpc_t rpc; }; struct smp_engine { int umad_fd; int smi_agent; int smi_dir_agent; ibnd_smp_t *smp_queue_head; ibnd_smp_t *smp_queue_tail; void *user_data; cl_qmap_t smps_on_wire; struct ibnd_config *cfg; unsigned total_smps; }; int smp_engine_init(smp_engine_t * engine, char * ca_name, int ca_port, void *user_data, ibnd_config_t *cfg); int issue_smp(smp_engine_t * engine, ib_portid_t * portid, unsigned attrid, unsigned mod, smp_comp_cb_t cb, void *cb_data); int process_mads(smp_engine_t * engine); void smp_engine_destroy(smp_engine_t * engine); int add_to_nodeguid_hash(ibnd_node_t * node, ibnd_node_t * hash[]); int add_to_portguid_hash(ibnd_port_t * port, ibnd_port_t * hash[]); void add_to_type_list(ibnd_node_t * node, f_internal_t * fabric); void destroy_node(ibnd_node_t * node); int mlnx_ext_port_info_err(smp_engine_t *engine, ibnd_smp_t *smp, uint8_t *mad, void *cb_data); #endif /* _INTERNAL_H_ */ 
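/*
 * Example (illustration only, not part of the library): how the SMP
 * engine declared above is typically driven.  ibnd_discover_fabric()
 * follows the same init -> issue -> process -> destroy sequence; the
 * function names below ("example_node_info_cb", "example_probe") and
 * the "seen" counter are hypothetical, and the config values simply
 * mirror the DEFAULT_* macros above.
 */
#include <infiniband/mad.h>

static int example_node_info_cb(smp_engine_t *engine, ibnd_smp_t *smp,
				uint8_t *mad, void *cb_data)
{
	(*(unsigned *)cb_data)++;	/* response arrived; more SMPs may be issued here */
	return 0;			/* returning 0 lets process_mads() keep pumping */
}

static int example_probe(char *ca_name, int ca_port)
{
	smp_engine_t engine;
	ibnd_config_t cfg = { .max_smps = DEFAULT_MAX_SMP_ON_WIRE,
			      .timeout_ms = DEFAULT_TIMEOUT,
			      .retries = DEFAULT_RETRIES };
	ib_portid_t self = { 0 };	/* zero-hop directed route == the local port */
	unsigned seen = 0;

	if (smp_engine_init(&engine, ca_name, ca_port, NULL, &cfg))
		return -1;
	/* queue one NodeInfo SMP; the engine keeps at most max_smps on the wire */
	if (issue_smp(&engine, &self, IB_ATTR_NODE_INFO, 0,
		      example_node_info_cb, &seen) == 0)
		process_mads(&engine);	/* runs until the on-wire map drains */
	smp_engine_destroy(&engine);
	return seen ? 0 : -1;
}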
rdma-core-56.1/libibnetdisc/libibnetdisc.map000066400000000000000000000013611477342711600211200ustar00rootroot00000000000000IBNETDISC_1.0 { global: ibnd_discover_fabric; ibnd_destroy_fabric; ibnd_load_fabric; ibnd_cache_fabric; ibnd_find_node_guid; ibnd_find_node_dr; ibnd_is_xsigo_guid; ibnd_is_xsigo_tca; ibnd_is_xsigo_hca; ibnd_get_chassis_guid; ibnd_get_chassis_type; ibnd_get_chassis_slot_str; ibnd_iter_nodes; ibnd_iter_nodes_type; ibnd_find_port_guid; ibnd_find_port_dr; ibnd_find_port_lid; ibnd_iter_ports; local: *; };
IBNETDISC_1.1 { global: ibnd_get_agg_linkspeedext_field; ibnd_get_agg_linkspeedext; ibnd_get_agg_linkspeedexten; ibnd_get_agg_linkspeedextsup; ibnd_dump_agg_linkspeedext_bits; ibnd_dump_agg_linkspeedext; ibnd_dump_agg_linkspeedexten; ibnd_dump_agg_linkspeedextsup; local: *; } IBNETDISC_1.0;
rdma-core-56.1/libibnetdisc/man/000077500000000000000000000000001477342711600165405ustar00rootroot00000000000000rdma-core-56.1/libibnetdisc/man/CMakeLists.txt000066400000000000000000000006031477342711600212770ustar00rootroot00000000000000rdma_man_pages( ibnd_discover_fabric.3 ibnd_find_node_guid.3 ibnd_iter_nodes.3 )
rdma_alias_man_pages( ibnd_discover_fabric.3 ibnd_debug.3 ibnd_discover_fabric.3 ibnd_destroy_fabric.3 ibnd_discover_fabric.3 ibnd_set_max_smps_on_wire.3 ibnd_discover_fabric.3 ibnd_show_progress.3 ibnd_find_node_guid.3 ibnd_find_node_dr.3 ibnd_iter_nodes.3 ibnd_iter_nodes_type.3 )
rdma-core-56.1/libibnetdisc/man/ibnd_discover_fabric.3000066400000000000000000000043771477342711600227550ustar00rootroot00000000000000.TH IBND_DISCOVER_FABRIC 3 "July 25, 2008" "OpenIB" "OpenIB Programmer's Manual"
.SH "NAME" ibnd_discover_fabric, ibnd_destroy_fabric, ibnd_debug, ibnd_show_progress \- initialize the ibnetdiscover library.
.SH "SYNOPSIS" .nf .B #include <infiniband/ibnetdisc.h> .sp .BI "ibnd_fabric_t *ibnd_discover_fabric(struct ibmad_port *ibmad_port, int timeout_ms, ib_portid_t *from, int hops)" .BI "void ibnd_destroy_fabric(ibnd_fabric_t *fabric)" .BI "void ibnd_debug(int i)" .BI "void ibnd_show_progress(int i)" .BI "int ibnd_set_max_smps_on_wire(int i)"
.SH "DESCRIPTION" .B ibnd_discover_fabric() Discover the fabric connected to the port specified by ibmad_port, using the specified timeout. The "from" and "hops" parameters are optional and allow one to scan part of a fabric by specifying a node "from" and a number of hops away from that node to scan, "hops". This gives the user a "sub-fabric" which is "centered" anywhere they choose. ibmad_port must be opened with at least IB_SMI_CLASS and IB_SMI_DIRECT_CLASS classes for ibnd_discover_fabric to work.
.B ibnd_destroy_fabric() Free all memory and resources associated with the fabric.
.B ibnd_debug() Set the debug level to be printed as library operations take place.
.B ibnd_show_progress() Indicate that the library should print debug output which shows its progress through the fabric.
.B ibnd_set_max_smps_on_wire() Set the number of SMPs that will be issued on the wire simultaneously.
.SH "RETURN VALUE" .B ibnd_discover_fabric() returns NULL on failure, otherwise a valid ibnd_fabric_t object.
.B ibnd_destroy_fabric(), ibnd_debug() NONE
.B ibnd_set_max_smps_on_wire() The previous value is returned.
.SH "EXAMPLES" .B Discover the entire fabric connected to device "mthca0", port 1.
int mgmt_classes[2] = {IB_SMI_CLASS, IB_SMI_DIRECT_CLASS}; struct ibmad_port *ibmad_port = mad_rpc_open_port(ca, ca_port, mgmt_classes, 2); ibnd_fabric_t *fabric = ibnd_discover_fabric(ibmad_port, 100, NULL, 0); ...
ibnd_destroy_fabric(fabric); mad_rpc_close_port(ibmad_port);
.B Discover only a single node and those nodes connected to it.
... str2drpath(&(port_id.drpath), from, 0, 0); ... ibnd_discover_fabric(ibmad_port, 100, &port_id, 1); ...
.SH "SEE ALSO" libibmad, mad_rpc_open_port
.SH "AUTHORS" .TP Ira Weiny
rdma-core-56.1/libibnetdisc/man/ibnd_find_node_guid.3000066400000000000000000000015521477342711600225600ustar00rootroot00000000000000.TH IBND_FIND_NODE_GUID 3 "July 25, 2008" "OpenIB" "OpenIB Programmer's Manual"
.SH "NAME" ibnd_find_node_guid, ibnd_find_node_dr \- given a fabric object, find the node object within it which matches the specified guid or directed route.
.SH "SYNOPSIS" .nf .B #include <infiniband/ibnetdisc.h> .sp .BI "ibnd_node_t *ibnd_find_node_guid(ibnd_fabric_t *fabric, uint64_t guid)" .BI "ibnd_node_t *ibnd_find_node_dr(ibnd_fabric_t *fabric, char *dr_str)"
.SH "DESCRIPTION" .B ibnd_find_node_guid() Given a fabric object and a guid, return the ibnd_node_t object with that node guid.
.B ibnd_find_node_dr() Given a fabric object and a directed route, return the ibnd_node_t object with that directed route.
.SH "RETURN VALUE" .B ibnd_find_node_guid(), ibnd_find_node_dr() return NULL on failure, otherwise a valid ibnd_node_t object.
.SH "AUTHORS" .TP Ira Weiny
rdma-core-56.1/libibnetdisc/man/ibnd_iter_nodes.3000066400000000000000000000014631477342711600217570ustar00rootroot00000000000000.TH IBND_ITER_NODES 3 "July 25, 2008" "OpenIB" "OpenIB Programmer's Manual"
.SH "NAME" ibnd_iter_nodes, ibnd_iter_nodes_type \- given a fabric object and a function, iterate over the nodes in the fabric.
.SH "SYNOPSIS" .nf .B #include <infiniband/ibnetdisc.h> .sp .BI "void ibnd_iter_nodes(ibnd_fabric_t *fabric, ibnd_iter_node_func_t func, void *user_data)" .BI "void ibnd_iter_nodes_type(ibnd_fabric_t *fabric, ibnd_iter_node_func_t func, int node_type, void *user_data)"
.SH "DESCRIPTION" .B ibnd_iter_nodes() Iterate through all the nodes in the fabric and call "func" on them.
.B ibnd_iter_nodes_type() The same as ibnd_iter_nodes() except the iteration is limited to nodes of the specified type.
.SH "RETURN VALUE" .B ibnd_iter_nodes(), ibnd_iter_nodes_type() NONE
.SH "AUTHORS" .TP Ira Weiny
rdma-core-56.1/libibnetdisc/query_smp.c000066400000000000000000000166441477342711600201660ustar00rootroot00000000000000/* * Copyright (c) 2010 Lawrence Livermore National Laboratory * Copyright (c) 2011 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include "internal.h" static void queue_smp(smp_engine_t * engine, ibnd_smp_t * smp) { smp->qnext = NULL; if (!engine->smp_queue_head) { engine->smp_queue_head = smp; engine->smp_queue_tail = smp; } else { engine->smp_queue_tail->qnext = smp; engine->smp_queue_tail = smp; } } static ibnd_smp_t *get_smp(smp_engine_t * engine) { ibnd_smp_t *head = engine->smp_queue_head; ibnd_smp_t *tail = engine->smp_queue_tail; ibnd_smp_t *rc = head; if (head) { if (tail == head) engine->smp_queue_tail = NULL; engine->smp_queue_head = head->qnext; } return rc; } static int send_smp(ibnd_smp_t * smp, smp_engine_t * engine) { int rc = 0; uint8_t umad[1024]; ib_rpc_t *rpc = &smp->rpc; int agent = 0; memset(umad, 0, umad_size() + IB_MAD_SIZE); if (rpc->mgtclass == IB_SMI_CLASS) { agent = engine->smi_agent; } else if (rpc->mgtclass == IB_SMI_DIRECT_CLASS) { agent = engine->smi_dir_agent; } else { IBND_ERROR("Invalid class for RPC\n"); return (-EIO); } if ((rc = mad_build_pkt(umad, &smp->rpc, &smp->path, NULL, NULL)) < 0) { IBND_ERROR("mad_build_pkt failed; %d\n", rc); return rc; } if ((rc = umad_send(engine->umad_fd, agent, umad, IB_MAD_SIZE, engine->cfg->timeout_ms, engine->cfg->retries)) < 0) { IBND_ERROR("send failed; %d\n", rc); return rc; } return 0; } static int process_smp_queue(smp_engine_t * engine) { int rc = 0; ibnd_smp_t *smp; while (cl_qmap_count(&engine->smps_on_wire) < engine->cfg->max_smps) { smp = get_smp(engine); if (!smp) return 0; if ((rc = send_smp(smp, engine)) != 0) { free(smp); return rc; } cl_qmap_insert(&engine->smps_on_wire, (uint32_t) smp->rpc.trid, (cl_map_item_t *) smp); engine->total_smps++; } return 0; } int issue_smp(smp_engine_t * engine, ib_portid_t * portid, unsigned attrid, unsigned mod, smp_comp_cb_t cb, void *cb_data) { ibnd_smp_t *smp = calloc(1, sizeof *smp); if (!smp) { IBND_ERROR("OOM\n"); return -ENOMEM; } smp->cb = cb; smp->cb_data = cb_data; smp->path = *portid; smp->rpc.method = IB_MAD_METHOD_GET; smp->rpc.attr.id = attrid; smp->rpc.attr.mod = mod; smp->rpc.timeout = engine->cfg->timeout_ms; smp->rpc.datasz = IB_SMP_DATA_SIZE; smp->rpc.dataoffs = IB_SMP_DATA_OFFS; smp->rpc.trid = mad_trid(); smp->rpc.mkey = engine->cfg->mkey; if (portid->lid <= 0 || portid->drpath.drslid == 0xffff || portid->drpath.drdlid == 0xffff) smp->rpc.mgtclass = IB_SMI_DIRECT_CLASS; /* direct SMI */ else smp->rpc.mgtclass = IB_SMI_CLASS; /* Lid routed SMI */ portid->sl = 0; portid->qp = 0; queue_smp(engine, smp); return process_smp_queue(engine); } static int process_one_recv(smp_engine_t * engine) { int rc = 0; int status = 0; ibnd_smp_t *smp; uint8_t *mad; uint32_t trid; uint8_t umad[sizeof(struct ib_user_mad) + IB_MAD_SIZE]; int length = umad_size() + IB_MAD_SIZE; memset(umad, 0, sizeof(umad)); /* wait for the next message */ if ((rc = umad_recv(engine->umad_fd, umad, &length, -1)) < 0) { IBND_ERROR("umad_recv failed: %d\n", rc); return -1; } mad = umad_get_mad(umad); trid = (uint32_t) mad_get_field64(mad, 0, IB_MAD_TRID_F); smp = (ibnd_smp_t *) cl_qmap_remove(&engine->smps_on_wire, trid); if ((cl_map_item_t *) smp == cl_qmap_end(&engine->smps_on_wire)) { IBND_ERROR("Failed to find matching smp for trid (%x)\n", trid); return -1; } rc = process_smp_queue(engine); if (rc) goto error; if ((status 
= umad_status(umad))) { IBND_ERROR("umad (%s Attr 0x%x:%u) bad status %d; %s\n", portid2str(&smp->path), smp->rpc.attr.id, smp->rpc.attr.mod, status, strerror(status)); if (smp->rpc.attr.id == IB_ATTR_MLNX_EXT_PORT_INFO) rc = mlnx_ext_port_info_err(engine, smp, mad, smp->cb_data); } else if ((status = mad_get_field(mad, 0, IB_DRSMP_STATUS_F))) { IBND_ERROR("mad (%s Attr 0x%x:%u) bad status 0x%x\n", portid2str(&smp->path), smp->rpc.attr.id, smp->rpc.attr.mod, status); if (smp->rpc.attr.id == IB_ATTR_MLNX_EXT_PORT_INFO) rc = mlnx_ext_port_info_err(engine, smp, mad, smp->cb_data); } else rc = smp->cb(engine, smp, mad, smp->cb_data); error: free(smp); return rc; } int smp_engine_init(smp_engine_t * engine, char * ca_name, int ca_port, void *user_data, ibnd_config_t *cfg) { memset(engine, 0, sizeof(*engine)); if (umad_init() < 0) { IBND_ERROR("umad_init failed\n"); return -EIO; } engine->umad_fd = umad_open_port(ca_name, ca_port); if (engine->umad_fd < 0) { IBND_ERROR("can't open UMAD port (%s:%d)\n", ca_name, ca_port); return -EIO; } if ((engine->smi_agent = umad_register(engine->umad_fd, IB_SMI_CLASS, 1, 0, NULL)) < 0) { IBND_ERROR("Failed to register SMI agent on (%s:%d)\n", ca_name, ca_port); goto eio_close; } if ((engine->smi_dir_agent = umad_register(engine->umad_fd, IB_SMI_DIRECT_CLASS, 1, 0, NULL)) < 0) { IBND_ERROR("Failed to register SMI_DIRECT agent on (%s:%d)\n", ca_name, ca_port); goto eio_close; } engine->user_data = user_data; cl_qmap_init(&engine->smps_on_wire); engine->cfg = cfg; return (0); eio_close: umad_close_port(engine->umad_fd); return (-EIO); } void smp_engine_destroy(smp_engine_t * engine) { cl_map_item_t *item; ibnd_smp_t *smp; /* remove queued smps */ smp = get_smp(engine); if (smp) IBND_ERROR("outstanding SMP's\n"); for ( /* */ ; smp; smp = get_smp(engine)) free(smp); /* remove smps from the wire queue */ item = cl_qmap_head(&engine->smps_on_wire); if (item != cl_qmap_end(&engine->smps_on_wire)) IBND_ERROR("outstanding SMP's on wire\n"); for ( /* */ ; item != cl_qmap_end(&engine->smps_on_wire); item = cl_qmap_head(&engine->smps_on_wire)) { cl_qmap_remove_item(&engine->smps_on_wire, item); free(item); } umad_close_port(engine->umad_fd); } int process_mads(smp_engine_t * engine) { int rc; while (!cl_is_qmap_empty(&engine->smps_on_wire)) if ((rc = process_one_recv(engine)) != 0) return rc; return 0; } rdma-core-56.1/libibnetdisc/tests/000077500000000000000000000000001477342711600171275ustar00rootroot00000000000000rdma-core-56.1/libibnetdisc/tests/testleaks.c000066400000000000000000000104371477342711600212770ustar00rootroot00000000000000/* * Copyright (c) 2004-2007 Voltaire Inc. All rights reserved. * Copyright (c) 2007 Xsigo Systems Inc. All rights reserved. * Copyright (c) 2008 Lawrence Livermore National Lab. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include static const char *argv0 = "iblinkinfotest"; static FILE *f; static void usage(void) { fprintf(stderr, "Usage: %s [-hclp -D -C -P ]\n" " Report link speed and connection for each port of each switch which is active\n" " -h This help message\n" " -i Number of iterations to run (default -1 == infinate)\n" " -f specify node to start \"from\"\n" " -n Number of hops to include away from specified node\n" " -t timeout for any single fabric query\n" " -s show errors\n" " -C use selected Channel Adaptor name for queries\n" " -P use selected channel adaptor port for queries\n" " --debug print debug messages\n", argv0); exit(-1); } int main(int argc, char **argv) { struct ibnd_config config = { 0 }; int rc = 0; char *ca = NULL; int ca_port = 0; ibnd_fabric_t *fabric = NULL; char *from = NULL; ib_portid_t port_id; int iters = -1; static char const str_opts[] = "S:D:n:C:P:t:shuf:i:"; static const struct option long_opts[] = { {"S", 1, NULL, 'S'}, {"D", 1, NULL, 'D'}, {"num-hops", 1, NULL, 'n'}, {"ca-name", 1, NULL, 'C'}, {"ca-port", 1, NULL, 'P'}, {"timeout", 1, NULL, 't'}, {"show", 0, NULL, 's'}, {"help", 0, NULL, 'h'}, {"usage", 0, NULL, 'u'}, {"debug", 0, NULL, 2}, {"from", 1, NULL, 'f'}, {"iters", 1, NULL, 'i'}, {} }; f = stdout; argv0 = argv[0]; while (1) { int ch = getopt_long(argc, argv, str_opts, long_opts, NULL); if (ch == -1) break; switch (ch) { case 2: config.debug++; break; case 'f': from = strdup(optarg); break; case 'C': ca = strdup(optarg); break; case 'P': ca_port = strtoul(optarg, NULL, 0); break; case 'n': config.max_hops = strtoul(optarg, NULL, 0); break; case 'i': iters = (int)strtol(optarg, NULL, 0); break; case 't': config.timeout_ms = strtoul(optarg, NULL, 0); break; default: usage(); break; } } argc -= optind; argv += optind; while (iters == -1 || iters-- > 0) { if (from) { /* only scan part of the fabric */ str2drpath(&(port_id.drpath), from, 0, 0); if ((fabric = ibnd_discover_fabric(ca, ca_port, &port_id, &config)) == NULL) { fprintf(stderr, "discover failed\n"); rc = 1; goto close_port; } } else if ((fabric = ibnd_discover_fabric(ca, ca_port, NULL, &config)) == NULL) { fprintf(stderr, "discover failed\n"); rc = 1; goto close_port; } ibnd_destroy_fabric(fabric); } close_port: exit(rc); } rdma-core-56.1/libibumad/000077500000000000000000000000001477342711600152625ustar00rootroot00000000000000rdma-core-56.1/libibumad/CMakeLists.txt000066400000000000000000000004371477342711600200260ustar00rootroot00000000000000publish_headers(infiniband umad.h umad_cm.h umad_sa.h umad_sa_mcm.h umad_sm.h umad_str.h umad_types.h ) rdma_library(ibumad libibumad.map # See Documentation/versioning.md 3 3.4.${PACKAGE_VERSION} sysfs.c umad.c umad_str.c ) rdma_pkg_config("ibumad" "" "") 
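/*
 * Example (illustration only, not shipped): the minimal lifecycle of
 * the core symbols exported in the map below.  Passing NULL/0 to
 * umad_open_port() selects the first local CA and port; a NULL method
 * mask in umad_register() means only responses to this agent's own
 * transactions are delivered.  "umad_lifecycle_example" is a
 * hypothetical name.
 */
#include <infiniband/umad.h>
#include <infiniband/umad_types.h>

static int umad_lifecycle_example(void)
{
	int portid, agentid, rc = -1;

	if (umad_init() < 0)
		return rc;
	portid = umad_open_port(NULL, 0);	/* first CA, first port */
	if (portid < 0)
		goto done;
	/* LID-routed SMI class, class version 1, no RMPP */
	agentid = umad_register(portid, UMAD_CLASS_SUBN_LID_ROUTED, 1, 0, NULL);
	if (agentid >= 0) {
		umad_unregister(portid, agentid);
		rc = 0;
	}
	umad_close_port(portid);
done:
	umad_done();
	return rc;
}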
rdma-core-56.1/libibumad/libibumad.map000066400000000000000000000020601477342711600177070ustar00rootroot00000000000000/* Do not change this file without reading Documentation/versioning.md */ IBUMAD_1.0 { global: umad_init; umad_done; umad_get_cas_names; umad_get_ca_portguids; umad_open_port; umad_get_ca; umad_release_ca; umad_get_port; umad_release_port; umad_close_port; umad_get_mad; umad_get_issm_path; umad_size; umad_set_grh; umad_set_pkey; umad_get_pkey; umad_set_addr; umad_set_addr_net; umad_send; umad_recv; umad_poll; umad_get_fd; umad_register; umad_register2; umad_register_oui; umad_unregister; umad_status; umad_get_mad_addr; umad_debug; umad_addr_dump; umad_dump; umad_class_str; umad_method_str; umad_common_mad_status_str; umad_sa_mad_status_str; umad_attribute_str; local: *; }; IBUMAD_1.1 { global: umad_free_ca_device_list; umad_get_ca_device_list; } IBUMAD_1.0; IBUMAD_1.2 { global: umad_sort_ca_device_list; } IBUMAD_1.1; IBUMAD_1.3 { global: umad_open_smi_port; } IBUMAD_1.2; IBUMAD_1.4 { global: umad_get_smi_gsi_pairs; umad_get_smi_gsi_pair_by_ca_name; } IBUMAD_1.3; rdma-core-56.1/libibumad/man/000077500000000000000000000000001477342711600160355ustar00rootroot00000000000000rdma-core-56.1/libibumad/man/CMakeLists.txt000066400000000000000000000015101477342711600205720ustar00rootroot00000000000000rdma_man_pages( umad_addr_dump.3 umad_alloc.3 umad_class_str.3 umad_close_port.3 umad_debug.3 umad_dump.3 umad_free.3 umad_get_ca.3 umad_get_ca_portguids.3 umad_get_cas_names.3 umad_get_fd.3 umad_get_issm_path.3 umad_get_mad.3 umad_get_mad_addr.3 umad_get_pkey.3 umad_get_port.3 umad_init.3.md umad_open_port.3 umad_poll.3 umad_recv.3 umad_register.3 umad_register2.3 umad_register_oui.3 umad_send.3 umad_set_addr.3 umad_set_addr_net.3 umad_set_grh.3 umad_set_grh_net.3 umad_set_pkey.3 umad_size.3 umad_status.3 umad_unregister.3 ) rdma_alias_man_pages( umad_class_str.3 umad_attribute_str.3 umad_class_str.3 umad_mad_status_str.3 umad_class_str.3 umad_method_str.3 umad_get_ca.3 umad_release_ca.3 umad_get_port.3 umad_release_port.3 umad_init.3 umad_done.3 ) rdma-core-56.1/libibumad/man/umad_addr_dump.3000066400000000000000000000015761477342711600210770ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_ADDR_DUMP 3 "May 21, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_addr_dump \- dump addr structure to stderr .SH "SYNOPSIS" .nf .B #include .sp .BI "void umad_addr_dump(ib_mad_addr_t " "*addr"); .fi .SH "DESCRIPTION" .B umad_addr_dump() dumps the given .I addr\fR to stderr. The argument .I addr is an .I ib_mad_addr_t struct, as specified in . .PP .nf typedef struct ib_mad_addr { .in +8 uint32_t qpn; uint32_t qkey; uint16_t lid; uint8_t sl; uint8_t path_bits; uint8_t grh_present; uint8_t gid_index; uint8_t hop_limit; uint8_t traffic_class; uint8_t gid[16]; uint32_t flow_label; .in -8 } ib_mad_addr_t; .fi .SH "RETURN VALUE" .B umad_addr_dump() returns no value. 
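.SH "EXAMPLE"
A minimal usage sketch (error handling omitted); it assumes
.I portid
was obtained from
.BR umad_open_port (3):
.PP
.nf
void *umad = umad_alloc(1, umad_size() + 256);
int len = 256;

if (umad_recv(portid, umad, &len, -1) >= 0)
        umad_addr_dump(umad_get_mad_addr(umad));
umad_free(umad);
.fi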
.SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_alloc.3000066400000000000000000000014031477342711600202170ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_ALLOC 3 "May 21, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_alloc \- allocate memory for umad buffers .SH "SYNOPSIS" .nf .B #include .sp .BI "void * umad_alloc(int " "num" ", size_t " "size"); .fi .SH "DESCRIPTION" .B umad_alloc() allocates memory for an array of .I num\fR umad buffers of .I size bytes\fR. Note that .I size\fR should include the .B umad_size() plus the length (MAD_BLOCK_SIZE for normal MADs or the length returned from .B umad_recv() for RMPP MADs). .SH "RETURN VALUE" .B umad_alloc() returns NULL if out of memory. .SH "SEE ALSO" .BR umad_free (3) .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_class_str.3000066400000000000000000000026301477342711600211250ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_CLASS_STR 3 "Feb 15, 2013" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_*_str \- class of functions to return string representations of enums .SH "SYNOPSIS" .nf .B #include .sp .BI "const char * umad_class_str(uint8_t mgmt_class)" .BI "const char * umad_method_str(uint8_t mgmt_class, uint8_t method)" .BI "const char * umad_attribute_str(uint8_t mgmt_class, be16_t attr_id)" .BI "const char * umad_common_mad_status_str(be16_t status)" .BI "const char * umad_sa_mad_status_str(be16_t status)" .SH "DESCRIPTION" .B "const char * umad_class_str(uint8_t mgmt_class)" Return string value of management class enum .B "const char * umad_method_str(uint8_t mgmt_class, uint8_t method)" Return string value of the method for the mgmt_class specified .B "const char * umad_attribute_str(uint8_t mgmt_class, be16_t attr_id)" Return string value of attribute specified in attr_id based on mgmt_class specified. .B "const char * umad_common_mad_status_str(be16_t status)" Return string value for common MAD status values .B "const char * umad_sa_mad_status_str(be16_t status)" Return string value for SA MAD status values .B NOTE: Not all classes are supported. .SH "RETURN VALUE" Returns a string representations of the fields specified. .SH "AUTHOR" .TP Ira Weiny rdma-core-56.1/libibumad/man/umad_close_port.3000066400000000000000000000013071477342711600213010ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_OPEN_PORT 3 "May 11, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_close_port \- close InfiniBand device port for umad access .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_close_port(int " "portid" ); .fi .SH "DESCRIPTION" .B umad_close_port() closes the port specified by the handle .I portid\fR. .SH "RETURN VALUE" .B umad_close_port() returns 0 on success, and a negative value on error. -EINVAL is returned if the .I portid\fR is not a handle to a valid (open) port. 
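.SH "EXAMPLE"
A minimal sketch of the expected open/close pairing (error handling
abbreviated); the device name below is only an example:
.PP
.nf
int portid = umad_open_port("mlx5_0", 1);

if (portid >= 0) {
        /* ... send and receive MADs ... */
        umad_close_port(portid);
}
.fi
Passing NULL and 0 to
.BR umad_open_port (3)
selects the default device and port.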
.SH "SEE ALSO" .BR umad_open_port (3) .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_debug.3000066400000000000000000000015001477342711600202130ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_DEBUG 3 "May 21, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_debug \- set debug level .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_debug(int " "level" ); .fi .SH "DESCRIPTION" .B umad_debug() sets the umad library internal debug level to .I level\fR. The following debug levels are supported: 0 - no debug (the default), 1 - basic debug information, 2 - verbose debug information. Negative values leave the current debug level unchanged. Note that the current debug level can be queried by passing a negative value as .I level\fR. .SH "RETURN VALUE" .B umad_debug() returns the actual debug level. .SH "AUTHORS" .TP Hal Rosenstock .TP Dotan Barak rdma-core-56.1/libibumad/man/umad_dump.3000066400000000000000000000007661477342711600201050ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_DUMP 3 "May 17, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_dump \- dump umad buffer to stderr .SH "SYNOPSIS" .nf .B #include .sp .BI "void umad_dump(void " "*umad"); .fi .SH "DESCRIPTION" .B umad_dump() dumps the given .I umad\fR buffer to stderr. .SH "RETURN VALUE" .B umad_dump() returns no value. .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_free.3000066400000000000000000000010501477342711600200460ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_FREE 3 "May 17, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_free \- frees memory of umad buffers .SH "SYNOPSIS" .nf .B #include .sp .BI "void umad_free(void " "*umad"); .fi .SH "DESCRIPTION" .B umad_free() frees memory previously allocated with .B umad_alloc()\fR. .SH "RETURN VALUE" .B umad_free() returns no value. .SH "SEE ALSO" .BR umad_alloc (3) .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_free_ca_device_list.3.md000066400000000000000000000015701477342711600234670ustar00rootroot00000000000000 --- date: "May 1, 2018" footer: "OpenIB" header: "OpenIB Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: UMAD_FREE_CA_DEVICE_LIST --- # NAME umad_free_ca_device_list - free InfiniBand devices name list # SYNOPSIS ```c #include void umad_free_ca_device_list(struct umad_device_node *head); ``` # DESCRIPTION **umad_free_ca_device_list()** frees the *struct umad_device_node* list and its values that were allocated with **umad_get_ca_device_list()**. The argument *head* is a list of *struct umad_device_node* filled with the names of the local IB devices (CAs). # RETURN VALUE **umad_free_ca_device_list()** returns no value.
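# EXAMPLE

A minimal sketch of the intended allocate/free pairing (error handling omitted):

```c
struct umad_device_node *head = umad_get_ca_device_list();

/* ... walk the list ... */

umad_free_ca_device_list(head);
```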
# SEE ALSO **umad_get_ca_device_list** # AUTHORS Vladimir Koushnir , Hal Rosenstock , Haim Boozaglo rdma-core-56.1/libibumad/man/umad_get_ca.3000066400000000000000000000035351477342711600203570ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_GET_CA 3 "May 21, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_get_ca, umad_release_ca \- get and release InfiniBand device port attributes .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_get_ca(char " "*ca_name" ", umad_ca_t " "*ca" ); .sp .BI "int umad_release_ca(umad_ca_t " "*ca" ); .fi .SH "DESCRIPTION" .B umad_get_ca() gets the attributes of the InfiniBand device .I ca_name\fR. It fills the .I ca structure with the device attributes specified by the .I ca_name or with the default device attributes if .I ca_name is NULL. .B umad_release_ca() should be called before the .I ca structure is deallocated. The argument .I ca is an .I umad_ca_t struct, as specified in . .PP .nf typedef struct umad_ca { .in +8 char ca_name[UMAD_CA_NAME_LEN]; /* Name of the device */ uint node_type; /* Type of the device */ int numports; /* Number of physical ports */ char fw_ver[20]; /* FW version */ char ca_type[40]; /* CA type (e.g. MT23108, etc.) */ char hw_ver[20]; /* Hardware version */ uint64_t node_guid; /* Node GUID */ uint64_t system_guid; /* System image GUID */ umad_port_t *ports[UMAD_CA_MAX_PORTS]; /* Array of device port properties */ .in -8 } umad_ca_t; .fi .PP .B umad_release_ca() releases the resources that were allocated in the function .B umad_get_ca()\fR. .SH "RETURN VALUE" .B umad_get_ca() and .B umad_release_ca() return 0 on success, and a negative value on error. .SH "AUTHORS" .TP Hal Rosenstock .TP Dotan Barak rdma-core-56.1/libibumad/man/umad_get_ca_device_list.3.md000066400000000000000000000025151477342711600233250ustar00rootroot00000000000000 --- date: "May 1, 2018" footer: "OpenIB" header: "OpenIB Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: UMAD_GET_CA_DEVICE_LIST --- # NAME umad_get_ca_device_list - get list of available InfiniBand device names. # SYNOPSIS ```c #include struct umad_device_node *umad_get_ca_device_list(void); ``` # DESCRIPTION **umad_get_ca_device_list()** fills the cas list of *struct umad_device_node* with local IB devices (CAs) names. *struct umad_device_node* is defined as follows: ```c struct umad_device_node { struct umad_device_node *next; const char *ca_name; }; ``` # RETURN VALUE **umad_get_ca_device_list()** returns list of *struct umad_device_node* filled with local IB devices(CAs) names. In case of empty list (zero elements), NULL is returned and *errno* is not set. On error, NULL is returned and *errno* is set appropriately. The last value of the list is NULL in order to indicate the number of entries filled. 
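# EXAMPLE

A minimal sketch that walks the returned list and prints each device name (error handling omitted):

```c
struct umad_device_node *node;
struct umad_device_node *head = umad_get_ca_device_list();

for (node = head; node; node = node->next)
        printf("%s\n", node->ca_name);

umad_free_ca_device_list(head);
```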
# ERRORS **umad_get_ca_device_list()** can fail with the following errors: **ENOMEM** # SEE ALSO **umad_get_ca_portguids**(3), **umad_open_port**(3), **umad_free_ca_device_list** # AUTHORS Vladimir Koushnir , Hal Rosenstock , Haim Boozaglo rdma-core-56.1/libibumad/man/umad_get_ca_portguids.3000066400000000000000000000023121477342711600224470ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_GET_CA_PORTGUIDS 3 "August 8, 2016" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_get_ca_portguids \- get the InfiniBand device ports GUIDs .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_get_ca_portguids(char " "*ca_name" ", __be64 " "*portguids" ", int " "max" ); .fi .SH "DESCRIPTION" .B umad_get_ca_portguids() fills the .I portguids\fR array with up to .I max port GUIDs belonging the specified IB device .I ca_name , or to the default IB device if .I ca_name is NULL. The argument .I portguids is an array of .I max uint64_t entries. .SH "RETURN VALUE" On success, .B umad_get_ca_portguids() returns a non-negative value equal to the number of port GUIDs actually filled. Not all filled entries may be valid. Invalid entries will be 0. For example, on a CA node with only one port, this function returns a value of 2. In this case, the value at index 0 will be invalid as it is reserved for switches. On failure, a negative value is returned. .SH "SEE ALSO" .BR umad_get_cas_names (3) .SH "AUTHORS" .TP Hal Rosenstock .TP Dotan Barak rdma-core-56.1/libibumad/man/umad_get_cas_names.3000066400000000000000000000016131477342711600217200ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_GET_CAS_NAMES 3 "May 21, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_get_cas_names \- get list of available InfiniBand device names .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_get_cas_names(char " "cas[][UMAD_CA_NAME_LEN]" ", int " "max" ); .fi .SH "DESCRIPTION" .B umad_get_cas_names() fills the .I cas array with up to .I max local IB devices (CAs) names. The argument .I cas is a character array with .I max entries, each with .B UMAD_CA_NAME_LEN characters. .SH "RETURN VALUE" .B umad_get_cas_names() returns a non-negative value equal to the number of entries filled, or \-1 on errors. .SH "SEE ALSO" .BR umad_get_ca_portguids (3), .BR umad_open_port (3) .SH "AUTHORS" .TP Hal Rosenstock .TP Dotan Barak rdma-core-56.1/libibumad/man/umad_get_fd.3000066400000000000000000000011361477342711600203600ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_GET_FD 3 "May 17, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_get_fd \- get the umad fd for the requested port .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_get_fd(int " "portid" ); .fi .SH "DESCRIPTION" .B umad_get_fd() returns the umad fd for the port specified by .I portid\fR. .SH "RETURN VALUE" .B umad_get_fd() returns the fd for the .I portid\fR requested or -EINVAL if .I portid\fR is invalid. 
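.SH "EXAMPLE"
One common use of the returned fd is waiting for incoming MADs with
.BR poll (2).
A minimal sketch (error handling omitted), assuming
.I portid
came from
.BR umad_open_port (3):
.PP
.nf
struct pollfd pfd = { .fd = umad_get_fd(portid), .events = POLLIN };

if (poll(&pfd, 1, 1000) > 0 && (pfd.revents & POLLIN)) {
        /* a MAD is ready; fetch it with umad_recv() */
}
.fi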
.SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_get_issm_path.3000066400000000000000000000020661477342711600217610ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_GET_ISSM_PATH 3 "Oct 18, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_get_issm_path \- get path of issm device .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_get_issm_path(char " "*ca_name" ", int " "portnum", char *path, int max); .fi .SH "DESCRIPTION" .B umad_get_issm_path() resolves the path to the issm device (which is used for setting/clearing the PortInfo:CapMask IsSM bit) for .I portnum of the IB device .I ca_name , and stores the resolved path in the .I path array, which cannot exceed .I max bytes in length (including the NULL terminator). .fi Opening the issm device sets the PortInfo:CapMask IsSM bit and closing it clears the bit. .fi .SH "RETURN VALUE" .B umad_get_issm_path() returns 0 on success and a negative value on error as follows: -ENODEV IB device can't be resolved -EINVAL port is not valid (bad .I portnum\fR or no umad device) .SH "SEE ALSO" .BR umad_open_port (3), .BR umad_get_port (3) .SH "AUTHOR" .TP Sasha Khapyorsky rdma-core-56.1/libibumad/man/umad_get_mad.3000066400000000000000000000011451477342711600205300ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_GET_MAD 3 "May 21, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_get_mad \- get the MAD pointer of a umad buffer .SH "SYNOPSIS" .nf .B #include .sp .BI "void * umad_get_mad(void " "*umad"); .fi .SH "DESCRIPTION" .B umad_get_mad() returns a pointer to the MAD contained within the .I umad\fR buffer. .SH "RETURN VALUE" .B umad_get_mad() returns a pointer to the MAD contained within the supplied .I umad\fR buffer. .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_get_mad_addr.3000066400000000000000000000016651477342711600215250ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_GET_MAD_ADDR 3 "May 21, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_get_mad_addr \- get the address of the ib_mad_addr from a umad buffer .SH "SYNOPSIS" .nf .B #include .sp .BI "ib_mad_addr_t * umad_get_mad_addr(void " "*umad"); .fi .SH "DESCRIPTION" .B umad_get_mad_addr() returns a pointer to the ib_mad_addr struct within the specified .I umad\fR buffer. .SH "RETURN VALUE" The return value is a pointer to an .I ib_mad_addr_t struct, as specified in . .PP .nf typedef struct ib_mad_addr { .in +8 uint32_t qpn; uint32_t qkey; uint16_t lid; uint8_t sl; uint8_t path_bits; uint8_t grh_present; uint8_t gid_index; uint8_t hop_limit; uint8_t traffic_class; uint8_t gid[16]; uint32_t flow_label; .in -8 } ib_mad_addr_t; .fi .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_get_pkey.3000066400000000000000000000011511477342711600207360ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_GET_PKEY 3 "Jan 15, 2008" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_get_pkey \- get pkey index from umad buffer .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_get_pkey(void " "*umad"); .fi .SH "DESCRIPTION" .B umad_get_pkey() gets the pkey index from the specified .I umad\fR buffer.
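.SH "EXAMPLE"
A common pattern is to echo the P_Key index of a received request into
the reply; a minimal sketch, where
.I request
and
.I response
are umad buffers (see
.BR umad_recv (3)
and
.BR umad_send (3)):
.PP
.nf
umad_set_pkey(response, umad_get_pkey(request));
.fi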
.SH "RETURN VALUE" .B umad_get_pkey() returns the value of the pkey index (or zero if the pkey index is not supported by the user_mad interface). .SH "AUTHOR" .TP Sasha Khapyorsky rdma-core-56.1/libibumad/man/umad_get_port.3000066400000000000000000000045761477342711600207660ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_GET_PORT 3 "May 21, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_get_port, umad_release_port \- open and close an InfiniBand port .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_get_port(char " "*ca_name" ", int " "portnum" ", umad_port_t " "*port" ); .sp .BI "int umad_release_port(umad_port_t " "*port" ); .fi .SH "DESCRIPTION" .B umad_get_port() fills the .I port structure with the IB port attributes specified by .I ca_name and .I portnum , or the default port if .I ca_name is NULL and .I portnum is zero. If only one of .I ca_name and .I portnum is specified, the specified value is used as a filter. For example, passing a NULL .I ca_name and 2 for the .I portnum means get a port from any of the local IB devices, as long as it is the second port. Note that the library may use some reference scheme to support port caching; therefore, .B umad_release_port() should be called before the .I port structure is deallocated. The argument .I port is an .B umad_port_t struct, as specified in . .PP .nf typedef struct umad_port { .in +8 char ca_name[UMAD_CA_NAME_LEN]; /* Name of the device */ int portnum; /* Physical port number */ uint base_lid; /* Base port LID */ uint lmc; /* LMC of LID */ uint sm_lid; /* SM LID */ uint sm_sl; /* SM service level */ uint state; /* Logical port state */ uint phys_state; /* Physical port state */ uint rate; /* Port link bit rate */ uint64_t capmask; /* Port capabilities */ uint64_t gid_prefix; /* Gid prefix of this port */ uint64_t port_guid; /* GUID of this port */ .in -8 } umad_port_t; .fi .PP .B umad_release_port() releases the resources that were allocated by the .B umad_get_port() function for the specified IB .I port\fR. .SH "RETURN VALUE" .B umad_get_port() and .B umad_release_port() return 0 on success, and a negative value on error. .SH "AUTHORS" .TP Hal Rosenstock .TP Dotan Barak rdma-core-56.1/libibumad/man/umad_get_smi_gsi_pair_by_ca_name.3000066400000000000000000000032561477342711600245760ustar00rootroot00000000000000.. -*- rst -*- .. Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md umad_get_smi_gsi_pair_by_ca_name(3) =================================== Retrieve SMI/GSI pair information based on device name and port number. Synopsis -------- .. code-block:: c #include int umad_get_smi_gsi_pair_by_ca_name(const char *devname, uint8_t portnum, struct umad_ca_pair *ca, unsigned enforce_smi); Description ----------- ``umad_get_smi_gsi_pair_by_ca_name()`` fills the provided ``ca`` structure with the SMI and GSI pair information for the specified device name and port number. The ``devname`` parameter specifies the name of the device, and ``portnum`` is the associated port number. The ``enforce_smi`` parameter, if enabled, limits the lookup to pairs that have both SMI and GSI interfaces. The ``struct umad_ca_pair`` is defined in ```` and includes the following members: .. 
code-block:: c struct umad_ca_pair { char smi_name[UMAD_CA_NAME_LEN]; /* Name of the SMI */ uint32_t smi_preferred_port; /* Preferred port for the SMI */ char gsi_name[UMAD_CA_NAME_LEN]; /* Name of the GSI */ uint32_t gsi_preferred_port; /* Preferred port for the GSI */ }; The function populates this structure with the relevant data for the given ``devname`` and ``portnum``. Return Value ------------ ``umad_get_smi_gsi_pair_by_ca_name()`` returns: - **0**: If the specified device and port are found and the structure is successfully populated. - **1**: If no matching device or port is found. Authors ------- - Asaf Mazor rdma-core-56.1/libibumad/man/umad_get_smi_gsi_pairs.3000066400000000000000000000023051477342711600226160ustar00rootroot00000000000000.. -*- rst -*- .. Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md umad_get_smi_gsi_pairs(3) ========================= Get CAs as SMI/GSI pairs. Synopsis -------- .. code-block:: c #include int umad_get_smi_gsi_pairs(struct umad_ca_pair cas[], size_t max); Description ----------- ``umad_get_smi_gsi_pairs()`` fills a user-allocated array of ``struct umad_ca_pair``. It fills up to ``max`` devices. The argument ``cas`` is an array of ``struct umad_ca_pair`` as specified in ````: .. code-block:: c struct umad_ca_pair { char smi_name[UMAD_CA_NAME_LEN]; /* Name of the SMI */ uint32_t smi_preferred_port; /* Preferred port for the SMI */ char gsi_name[UMAD_CA_NAME_LEN]; /* Name of the GSI */ uint32_t gsi_preferred_port; /* Preferred port for the GSI */ }; The ``smi_preferred_port`` and ``gsi_preferred_port`` fields represent the first ports found active for the corresponding SMI/GSI device. Return Value ------------ ``umad_get_smi_gsi_pairs()`` returns the number of devices filled, or **-1** on error. Authors ------- - Asaf Mazor rdma-core-56.1/libibumad/man/umad_init.3.md000066400000000000000000000017731477342711600205010ustar00rootroot00000000000000 --- date: "May 21, 2007" footer: "OpenIB" header: "OpenIB Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: UMAD_INIT --- # NAME umad_init, umad_done - perform library initialization and finalization # SYNOPSIS ```c #include int umad_init(void); int umad_done(void); ``` # DESCRIPTION **umad_init()** and **umad_done()** do nothing. # RETURN VALUE Always 0. # COMPATIBILITY Versions prior to release 18 of the library require **umad_init()** to be called prior to using any other library functions. Old versions could return a failure code of -1 from **umad_init()**. For compatibility, applications should continue to call **umad_init()**, and check the return code, prior to calling other **umad_** functions. If **umad_init()** returns an error, then no further use of the umad library should be attempted. # AUTHORS Dotan Barak , Hal Rosenstock rdma-core-56.1/libibumad/man/umad_open_port.3000066400000000000000000000021131477342711600211310ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_OPEN_PORT 3 "May 21, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_open_port \- open InfiniBand device port for umad access .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_open_port(char " "*ca_name" ", int " "portnum" ); .fi .SH "DESCRIPTION" .B umad_open_port() opens the port .I portnum of the IB device .I ca_name for umad access. 
The port is selected by the library if not all parameters are provided (see .B umad_get_port() for details). .fi .SH "RETURN VALUE" .B umad_open_port() returns 0 or an unique positive value of umad device descriptor on success, and a negative value on error as follows: -EOPNOTSUPP ABI version doesn't match -ENODEV IB device can't be resolved -EINVAL port is not valid (bad .I portnum\fR or no umad device) -EIO umad device for this port can't be opened .SH "SEE ALSO" .BR umad_close_port (3), .BR umad_get_cas_names (3), .BR umad_get_port (3) .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_open_smi_port.3000066400000000000000000000021341477342711600220040ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_OPEN_SMI_PORT 3 "June 18, 2024" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_open_smi_port \- open InfiniBand device SMI port for umad access .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_open_smi_port(char " "*ca_name" ", int " "portnum" ); .fi .SH "DESCRIPTION" .B umad_open_smi_port() opens the SMI port .I portnum of the IB device .I ca_name for umad access. The port is selected by the library if not all parameters are provided (see .B umad_get_port() for details). Only SMI ports will be selected. .fi .SH "RETURN VALUE" .B umad_open_smi_port() returns 0 or an unique positive value of umad device descriptor on success, and a negative value on error as follows: -EOPNOTSUPP ABI version doesn't match -ENODEV IB device with SMI port can't be resolved -EINVAL port is not valid (bad .I portnum\fR or no umad device) -EIO umad device for this port can't be opened .SH "SEE ALSO" .BR umad_open_port (3), .SH "AUTHOR" .TP Amir Nir rdma-core-56.1/libibumad/man/umad_poll.3000066400000000000000000000022321477342711600200740ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_POLL 3 "October 23, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_poll \- poll umad .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_poll(int " "portid" ", int " "timeout_ms"); .fi .SH "DESCRIPTION" .B umad_poll() waits up to .I timeout_ms\fR milliseconds for a packet to be received from the port specified by .I portid\fR. Once a packet is ready to be read, the function returns 0. After that the packet can be read using .B umad_recv(). Otherwise, \-ETIMEDOUT is returned. Note that successfully polling a port does not guarantee that the subsequent .B umad_recv() will be non blocking when several threads are using the same port. Instead, use a .I timeout_ms\fR parameter of zero to .B umad_recv() to ensure a non-blocking read. 
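.SH "EXAMPLE"
A minimal sketch of the pattern described above, polling first and then
doing a non-blocking receive; it assumes
.I portid
and
.I umad
come from
.BR umad_open_port (3)
and
.BR umad_alloc (3):
.PP
.nf
int len = 256;

if (umad_poll(portid, 1000) == 0 &&
    umad_recv(portid, umad, &len, 0) >= 0) {
        /* process the received MAD */
}
.fi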
.SH "RETURN VALUE" .B umad_poll() returns 0 on success, and a negative value on error as follows: -EINVAL invalid port handle or agentid -ETIMEDOUT poll operation timed out -EIO poll operation failed .SH "SEE ALSO" .BR umad_recv (3) .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_recv.3000066400000000000000000000040331477342711600200660ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_RECV 3 "May 11, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_recv \- receive umad .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_recv(int " "portid" ", void " "*umad" ", int " "*length" ", int " "timeout_ms"); .fi .SH "DESCRIPTION" .B umad_recv() waits up to .I timeout_ms\fR milliseconds for an incoming MAD message to be received from the port specified by .I portid\fR. A MAD "message" consists of a single MAD packet .I or a coalesced multipacket RMPP transmission. In the RMPP case the header of the first RMPP packet is returned as the header of the buffer and the buffer data contains the coalesced data section of each subsequent RMPP MAD packet within the transmission. Thus all the RMPP headers except the first are not copied to user space from the kernel. The message is copied to the .I umad\fR buffer if there is sufficient room and the received .I length\fR is indicated. If the buffer is not large enough, the size of the umad buffer needed is returned in .I length\fR. A negative .I timeout_ms\fR makes the function block until a packet is received. A .I timeout_ms\fR parameter of zero indicates a non blocking read. .B Note .I length is a pointer to the length of the .B data portion of the umad buffer. This means that .I umad must point to a buffer at least umad_size() + .I *length bytes long. .B Note also that .I *length\fR must be >= 256 bytes. This length allows for at least a single MAD packet to be returned. .SH "RETURN VALUE" .B umad_recv() on success return the agentid; on error, errno is set and a negative value is returned as follows: -EINVAL invalid port handle or agentid or *length is less than the minimum supported -EIO receive operation failed -EWOULDBLOCK non blocking read can't be fulfilled -ENOSPC The provided buffer is not long enough for the complete message. .SH "SEE ALSO" .BR umad_poll (3) .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_register.3000066400000000000000000000024041477342711600207530ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_REGISTER 3 "May 11, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_register \- register the specified management class and version for port .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_register(int " "portid" ", int " "mgmt_class" ", int " "mgmt_version" " , uint8_t " "rmpp_version" ", long " "method_mask[16/sizeof(long)]"); .fi .SH "DESCRIPTION" .B umad_register() registers the specified management class, management version, and whether RMPP is being used for the port specified by the .I portid\fR parameter. If .I method_mask\fR array is provided, the caller is registered as a replier (server) for the methods having their corresponding bit on in the .I method_mask\fR. If .I method_mask\fR is NULL, the caller is registered as a MAD client, meaning that it can only receive replies on MADs that it sent (solicited MADs). 
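.SH "EXAMPLE"
A minimal sketch that registers as a client (NULL
.I method_mask\fR)
for the Performance Management class, version 1; it assumes
.I portid
came from
.BR umad_open_port (3):
.PP
.nf
int agent_id = umad_register(portid, 0x04 /* PerfMgmt */, 1, 0, NULL);

if (agent_id >= 0) {
        /* ... send requests, receive replies ... */
        umad_unregister(portid, agent_id);
}
.fi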
.SH "RETURN VALUE" .B umad_register() returns non-negative agent id number on success, and a negative value on error as follows: -EINVAL invalid port handle -EPERM registration failed .SH "SEE ALSO" .BR umad_register_oui(3), .BR umad_unregister (3) .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_register2.3000066400000000000000000000043451477342711600210430ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_REGISTER2 3 "March 25, 2014" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_register2 \- register the specified management class and version for port .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_register2(int " "port_fd" ", struct umad_reg_attr *" "attr" ", uint32_t *" "agent_id"); .fi .SH "DESCRIPTION" .B umad_register2() registers for a MAD agent using the provided registration attributes .I port_fd\fR the port on which to register the agent .I attr\fR The registration attributes as defined by the structure passed. See below for details of this structure. .I agent_id\fR returned on success. agent_id identifies the kernel MAD agent a MAD is received by or to be sent by. agent_id is returned in the umad header "struct ib_user_mad" on recv and specified in umad_send when sending. .SH "REGISTRATION ATTRIBUTE STRUCTURE" .nf struct umad_reg_attr { .in +8 uint8_t mgmt_class; uint8_t mgmt_class_version; uint32_t flags; uint64_t method_mask[2]; uint32_t oui; uint8_t rmpp_version; .in -8 }; .I mgmt_class\fR Management class to register for. .I mgmt_class_version\fR Management class version to register for. .I flags\fR Registration flags. If a flag specified is not supported by the kernel, an error is returned, and the supported flags are returned in this field. .P Current flags are: .in +8 UMAD_USER_RMPP -- flag to indicate the kernel should not process RMPP packets. All RMPP packets will be treated like individual MADs. The user is responsible for implementing the RMPP protocol. .in -8 .I method_mask\fR A bit mask which indicates which unsolicited methods this agent should receive. Setting this array to 0 will result in the agent only receiving response MADs for which a request was sent. .I oui\fR The oui (in host order) to use for vendor classes 0x30 - 0x4f. Otherwise ignored. .I rmpp_version\fR If the class supports RMPP and kernel RMPP is enabled (the default) indicate which rmpp_version to use. .SH "RETURN VALUE" .B umad_register2() returns 0 on success and +ERRNO on failure. .SH "SEE ALSO" .BR umad_unregister (3) .SH "AUTHOR" .TP Ira Weiny rdma-core-56.1/libibumad/man/umad_register_oui.3000066400000000000000000000024641477342711600216350ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_REGISTER_OUI 3 "May 17, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_register_oui \- register the specified class in vendor range 2 for port .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_register_oui(int " "portid" ", int " "mgmt_class" ", uint8_t " "rmpp_version" ", uint8_t " "oui[3]" ", uint32_t " "method_mask[4]"); .fi .SH "DESCRIPTION" .B umad_register_oui() registers the specified class in vendor range 2, the specified .I oui\fR, and whether RMPP is being used for the port specified by the .I portid\fR handle. If .I method_mask\fR array is provided, the caller is registered as a replier (server) for the methods having their corresponding bit on in the .I method_mask\fR. 
If .I method_mask\fR is NULL, the caller is registered as a MAD client, meaning that it can only receive replies on MADs that it sent (solicited MADs). .SH "RETURN VALUE" .B umad_register() returns non-negative agent id number on success, and a negative value on error as follows: -EINVAL invalid port handle or class is not in the vendor class 2 range -EPERM registration failed .SH "SEE ALSO" .BR umad_register (3), .BR umad_unregister (3) .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_send.3000066400000000000000000000032421477342711600200610ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_SEND 3 "May 11, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_send \- send umad .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_send(int " "portid" ", int " "agentid" ", void " "*umad" ", int " "length" ", int " "timeout_ms" ", int " "retries"); .fi .SH "DESCRIPTION" .B umad_send() sends .I length\fR bytes from the specified .I umad\fR buffer from the port specified by .I portid\fR, and using the agent specified by .I agentid\fR. The buffer can contain a RMPP transmission which is larger than a single MAD packet when the agentid specifies a class which utilizes RMPP and the header flags indicate RMPP is active. NOTE currently only RMPPFlags.Active is meaningful in the header in user space. All other RMPP fields are ignored. The data section of the buffer will be sent in multiple RMPP MAD packets with headers built for the user. .I timeout_ms\fR controls the solicited MADs behavior as follows: zero value means not solicited. Positive value makes kernel indicate timeout in milliseconds. If reply is not received within the specified value, the original buffer is returned in the read channel with the status field set (to non zero). Negative .I timeout_ms\fR makes kernel wait forever for the reply. .I retries\fR indicates the number of times the MAD will be retried before giving up. .SH "RETURN VALUE" .B umad_send() returns 0 on success; on error, errno is set and a negative value is returned as follows: -EINVAL invalid port handle or agentid -EIO send operation failed .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_set_addr.3000066400000000000000000000016701477342711600207200ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_SET_ADDR 3 "May 17, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_set_addr \- set MAD address fields within umad buffer using host ordering .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_set_addr(void " "*umad" ", int " "dlid" ", int " "dqp" ", int " "sl" ", int " "qkey"); .fi .SH "DESCRIPTION" .B umad_set_addr() sets the MAD address fields within the specified .I umad\fR buffer using the provided host ordered fields. .I dlid\fR is the destination LID. .I dqp\fR is the destination QP (queue pair). .I sl\fR is the SL (service level). .I qkey\fR is the Q_Key (queue key). .SH "RETURN VALUE" .B umad_set_addr() returns 0 on success, and a negative value on errors. Currently, there are no errors indicated. 
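.SH "EXAMPLE"
A minimal sketch that addresses a umad buffer to LID 1, QP 1, SL 0 and
the default GSI Q_Key (0x80010000); it assumes
.I umad
came from
.BR umad_alloc (3):
.PP
.nf
umad_set_addr(umad, 1, 1, 0, 0x80010000);
/* the buffer can now be sent with umad_send() */
.fi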
.SH "SEE ALSO" .BR umad_set_addr_net (3) .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_set_addr_net.3000066400000000000000000000017271477342711600215710ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_SET_ADDR_NET 3 "May 21, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_set_addr_net \- set MAD address fields within umad buffer using network ordering .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_set_addr_net(void " "*umad" ", __be16 " "dlid" ", __be32 " "dqp" ", int " "sl" ", __be32 " "qkey"); .fi .SH "DESCRIPTION" .B umad_set_addr_net() sets the MAD address fields within the specified .I umad\fR buffer using the provided network ordered fields. .I dlid\fR is the destination LID. .I dqp\fR is the destination QP (queue pair). .I sl\fR is the SL (service level). .I qkey\fR is the Q_Key (queue key). .SH "RETURN VALUE" .B umad_set_addr_net() returns 0 on success, and a negative value on errors. Currently, there are no errors indicated. .SH "SEE ALSO" .BR umad_set_addr (3) .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_set_grh.3000066400000000000000000000031051477342711600205610ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_SET_GRH 3 "May 24, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_set_grh \- set GRH fields within umad buffer using host ordering .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_set_grh(void " "*umad" ", void " "*mad_addr"); .fi .SH "DESCRIPTION" .B umad_set_grh() sets the GRH fields (grh_present, gid, hop_limit, traffic_class, flow_label) within the specified .I umad\fR buffer based on the .I mad_addr\fR supplied. The provided .I mad_addr\fR fields are expected to be in host order. If the .I mad_addr\fR pointer supplied is NULL, no GRH is set. The argument .I mad_addr is a pointer to an .I ib_mad_addr_t struct, as specified in .I . The argument .I umad is a pointer to an .I ib_user_mad_t struct, as specified in .I . .PP .nf typedef struct ib_mad_addr { .in +8 uint32_t qpn; uint32_t qkey; uint16_t lid; uint8_t sl; uint8_t path_bits; uint8_t grh_present; uint8_t gid_index; uint8_t hop_limit; uint8_t traffic_class; uint8_t gid[16]; uint32_t flow_label; .in -8 } ib_mad_addr_t; .PP typedef struct ib_user_mad { .in +8 uint32_t agent_id; uint32_t status; uint32_t timeout_ms; uint32_t retries; uint32_t length; ib_mad_addr_t addr; uint8_t data[0]; .in -8 } ib_user_mad_t; .fi .SH "RETURN VALUE" .B umad_set_grh() returns 0 on success, and a negative value on errors. Currently, there are no errors indicated. .SH "SEE ALSO" .BR umad_set_grh_net (3) .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_set_grh_net.3000066400000000000000000000031721477342711600214330ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_SET_GRH_NET 3 "May 24, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_set_grh_net \- set GRH fields within umad buffer using network ordering .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_set_grh_net(void " "*umad" ", void " "*mad_addr"); .fi .SH "DESCRIPTION" .B umad_set_grh_net() sets the GRH fields (grh_present, gid, hop_limit, traffic_class, flow_label) within the specified .I umad\fR buffer based on the .I mad_addr\fR supplied. The provided .I mad_addr\fR fields are expected to be in network order. 
If the .I mad_addr\fR pointer supplied is NULL, no GRH is set. The argument .I mad_addr is a pointer to an .I ib_mad_addr_t struct, as specified in . The argument .I umad is a pointer to an .I ib_user_mad_t struct, as specified in .I . .PP .nf typedef struct ib_mad_addr { .in +8 uint32_t qpn; uint32_t qkey; uint16_t lid; uint8_t sl; uint8_t path_bits; uint8_t grh_present; uint8_t gid_index; uint8_t hop_limit; uint8_t traffic_class; uint8_t gid[16]; uint32_t flow_label; .in -8 } ib_mad_addr_t; .PP typedef struct ib_user_mad { .in +8 uint32_t agent_id; uint32_t status; uint32_t timeout_ms; uint32_t retries; uint32_t length; ib_mad_addr_t addr; uint8_t data[0]; .in -8 } ib_user_mad_t; .fi .SH "RETURN VALUE" .B umad_set_grh_net() returns 0 on success, and a negative value on errors. Currently, there are no errors indicated. .SH "KNOWN BUGS" Not implemented. .SH "SEE ALSO" .BR umad_set_grh (3) .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_set_pkey.3000066400000000000000000000011341477342711600207510ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_SET_PKEY 3 "June 20, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_set_pkey \- set pkey index within umad buffer .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_set_pkey(void " "*umad" ", int " "pkey_index"); .fi .SH "DESCRIPTION" .B umad_set_pkey() sets the pkey index within the specified .I umad\fR buffer. .SH "RETURN VALUE" .B umad_set_pkey() returns 0 on success, and a negative value on an error. .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_size.3000066400000000000000000000010101477342711600200710ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_SIZE 3 "May 21, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_size \- get the size of umad buffer .SH "SYNOPSIS" .nf .B #include .sp .BI "size_t umad_size(void);" .fi .SH "DESCRIPTION" .B umad_size() returns the size of a umad buffer (in bytes). .SH "RETURN VALUE" .B umad_size() returns the size of a umad buffer (in bytes). .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_sort_ca_device_list.3.md000066400000000000000000000024441477342711600235360ustar00rootroot00000000000000 --- date: "April 23, 2020" footer: "OpenIB" header: "OpenIB Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: UMAD_SORT_CA_DEVICE_LIST --- # NAME umad_sort_ca_device_list - sort list of InfiniBand device names in alphabetical order. # SYNOPSIS ```c #include int umad_sort_ca_device_list(struct umad_device_node **head, size_t size); ``` # DESCRIPTION **umad_sort_ca_device_list()** sorts the cas list of *struct umad_device_node* by IB device (CA) name, in alphabetical order. If the *size* input parameter is zero, the function calculates the size of the cas list itself. *struct umad_device_node* is defined as follows: ```c struct umad_device_node { struct umad_device_node *next; const char *ca_name; }; ``` # RETURN VALUE **umad_sort_ca_device_list()** returns zero if sorting succeeded; the sorted list is returned through the *head* output parameter. On error, a non-zero value is returned. *errno* is not set.
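# EXAMPLE

A minimal sketch that fetches the device list and sorts it in place, letting the function compute the list size (error handling omitted):

```c
struct umad_device_node *head = umad_get_ca_device_list();

if (umad_sort_ca_device_list(&head, 0) == 0) {
        /* head now points to the alphabetically sorted list */
}

umad_free_ca_device_list(head);
```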
# SEE ALSO **umad_get_ca_device_list**, **umad_free_ca_device_list** # AUTHORS Haim Boozaglo rdma-core-56.1/libibumad/man/umad_status.3000066400000000000000000000013251477342711600204530ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_STATUS 3 "May 17, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_status \- get the status of a umad buffer .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_status(void " "*umad" ); .fi .SH "DESCRIPTION" .B umad_status() get the internal .I umad\fR status field. .SH "RETURN VALUE" After a packet is received, .B umad_status() returns 0 on a successful receive, or a non zero status. ETIMEDOUT means that the packet had a send-timeout indication. In this case, the transaction ID will be set to the TID of the original request. .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/man/umad_unregister.3000066400000000000000000000014321477342711600213160ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH UMAD_UNREGISTER 3 "May 21, 2007" "OpenIB" "OpenIB Programmer's Manual" .SH "NAME" umad_unregister \- unregister umad agent .SH "SYNOPSIS" .nf .B #include .sp .BI "int umad_unregister(int " "portid" ", int " "agentid"); .fi .SH "DESCRIPTION" .B umad_unregister() unregisters the specified .I agentid\fR previously registered using .B umad_register() or .B umad_register_oui()\fR. .SH "RETURN VALUE" .B umad_unregister() returns 0 on success and negative value on error as follows: -EINVAL invalid port handle or agentid * (kernel error codes) .SH "SEE ALSO" .BR umad_register (3), .BR umad_register_oui (3) .SH "AUTHOR" .TP Hal Rosenstock rdma-core-56.1/libibumad/sysfs.c000066400000000000000000000067201477342711600166020ustar00rootroot00000000000000/* * Copyright (c) 2004-2008 Voltaire Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #include #include #include #include #include #include #include #include #include #include #include #include "sysfs.h" static int ret_code(void) { int e = errno; if (e > 0) return -e; return e; } int sys_read_string(const char *dir_name, const char *file_name, char *str, int max_len) { char path[256], *s; int fd, r; snprintf(path, sizeof(path), "%s/%s", dir_name, file_name); if ((fd = open(path, O_RDONLY)) < 0) return ret_code(); if ((r = read(fd, (void *)str, max_len)) < 0) { int e = errno; close(fd); errno = e; return ret_code(); } str[(r < max_len) ? r : max_len - 1] = 0; if ((s = strrchr(str, '\n'))) *s = 0; close(fd); return 0; } int sys_read_guid(const char *dir_name, const char *file_name, __be64 *net_guid) { char buf[32], *str, *s; uint64_t guid; int r, i; if ((r = sys_read_string(dir_name, file_name, buf, sizeof(buf))) < 0) return r; guid = 0; for (s = buf, i = 0; i < 4; i++) { if (!(str = strsep(&s, ": \t\n"))) return -EINVAL; guid = (guid << 16) | (strtoul(str, NULL, 16) & 0xffff); } *net_guid = htobe64(guid); return 0; } int sys_read_gid(const char *dir_name, const char *file_name, union umad_gid *gid) { char buf[64], *str, *s; __be16 *ugid = (__be16 *) gid; int r, i; if ((r = sys_read_string(dir_name, file_name, buf, sizeof(buf))) < 0) return r; for (s = buf, i = 0; i < 8; i++) { if (!(str = strsep(&s, ": \t\n"))) return -EINVAL; ugid[i] = htobe16(strtoul(str, NULL, 16) & 0xffff); } return 0; } int sys_read_uint64(const char *dir_name, const char *file_name, uint64_t * u) { char buf[32]; int r; if ((r = sys_read_string(dir_name, file_name, buf, sizeof(buf))) < 0) return r; *u = strtoull(buf, NULL, 0); return 0; } int sys_read_uint(const char *dir_name, const char *file_name, unsigned *u) { char buf[32]; int r; if ((r = sys_read_string(dir_name, file_name, buf, sizeof(buf))) < 0) return r; *u = strtoul(buf, NULL, 0); return 0; } rdma-core-56.1/libibumad/sysfs.h000066400000000000000000000037501477342711600166070ustar00rootroot00000000000000/* * Copyright (c) 2008 Voltaire Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #ifndef _UMAD_SYSFS_H #define _UMAD_SYSFS_H #include #include #include extern int sys_read_string(const char *dir_name, const char *file_name, char *str, int len); extern int sys_read_guid(const char *dir_name, const char *file_name, __be64 * net_guid); extern int sys_read_gid(const char *dir_name, const char *file_name, union umad_gid *gid); extern int sys_read_uint64(const char *dir_name, const char *file_name, uint64_t * u); extern int sys_read_uint(const char *dir_name, const char *file_name, unsigned *u); #endif /* _UMAD_SYSFS_H */ rdma-core-56.1/libibumad/tests/000077500000000000000000000000001477342711600164245ustar00rootroot00000000000000rdma-core-56.1/libibumad/tests/CMakeLists.txt000066400000000000000000000006371477342711600211720ustar00rootroot00000000000000rdma_test_executable(umad_reg2 umad_reg2_compat.c) target_link_libraries(umad_reg2 LINK_PRIVATE ibumad) rdma_test_executable(umad_register2 umad_register2.c) target_link_libraries(umad_register2 LINK_PRIVATE ibumad) rdma_test_executable(umad_sa_mcm_rereg_test umad_sa_mcm_rereg_test.c) target_link_libraries(umad_sa_mcm_rereg_test LINK_PRIVATE ibumad) rdma_test_executable(umad_compile_test umad_compile_test.c) rdma-core-56.1/libibumad/tests/umad_compile_test.c000066400000000000000000000051141477342711600222660ustar00rootroot00000000000000/* * Copyright (c) 2017 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include #include #include #include #include int main(int argc, char *argv[]) { #ifndef __CHECKER__ /* * Hide these checks for sparse because these checks fail with * older versions of sparse. 
*/ BUILD_ASSERT(__alignof__(union umad_gid) == 4); #endif /* umad_types.h structure checks */ BUILD_ASSERT(sizeof(struct umad_hdr) == 24); BUILD_ASSERT(sizeof(struct umad_rmpp_hdr) == 12); BUILD_ASSERT(sizeof(struct umad_packet) == 256); BUILD_ASSERT(sizeof(struct umad_rmpp_packet) == 256); BUILD_ASSERT(sizeof(struct umad_dm_packet) == 256); BUILD_ASSERT(sizeof(struct umad_vendor_packet) == 256); BUILD_ASSERT(sizeof(struct umad_class_port_info) == 72); BUILD_ASSERT(offsetof(struct umad_class_port_info, redirgid) == 8); BUILD_ASSERT(offsetof(struct umad_class_port_info, trapgid) == 40); /* umad_sm.h structure check */ BUILD_ASSERT(sizeof(struct umad_smp) == 256); /* umad_sa.h structure check */ BUILD_ASSERT(sizeof(struct umad_sa_packet) == 256); return 0; } rdma-core-56.1/libibumad/tests/umad_reg2_compat.c000066400000000000000000000131211477342711600217760ustar00rootroot00000000000000/* * Copyright (c) 2014 Intel Corporation, All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include #define UNLIKELY_MGMT_CLASS 0x2F #define UNLIKELY_RMPP_MGMT_CLASS 0x4F static int test_failures = 0; /** ========================================================================= * Stolen from OpenSM's register */ static int set_bit(int nr, void *method_mask) { long mask, *addr = method_mask; int retval; addr += nr / (8 * sizeof(long)); mask = (1UL) << (nr % (8 * sizeof(long))); retval = (mask & *addr) != 0; *addr |= mask; return retval; } static void set_bit64(int b, uint64_t *buf) { uint64_t mask; uint64_t *addr = buf; addr += b >> 6; mask = 1ULL << (b & 0x3f); *addr |= mask; } static void dump_reg_attr(struct umad_reg_attr *reg_attr) { printf("\nmgmt_class %u\n" "mgmt_class_version %u\n" "flags 0x%08x\n" "method_mask 0x%016"PRIx64" %016"PRIx64"\n" "oui 0x%06x\n" "rmpp_version %u\n\n", reg_attr->mgmt_class, reg_attr->mgmt_class_version, reg_attr->flags, reg_attr->method_mask[1], reg_attr->method_mask[0], reg_attr->oui, reg_attr->rmpp_version); } static int open_test_device(void) { int fd = umad_open_port(NULL, 0); if (fd < 0) { printf("\n *****\nOpen Port Failure... 
Aborting\n"); printf(" Ensure you have an HCA to test against.\n"); exit(0); } return fd; } static void test_register(void) { int agent_id; long method_mask[16 / sizeof(long)]; uint32_t class_oui = 0x001405; /* OPENIB_OUI */ uint8_t oui[3]; int fd; printf("\n old register test ... "); fd = open_test_device(); memset(&method_mask, 0, sizeof(method_mask)); set_bit( 1, &method_mask); set_bit(63, &method_mask); set_bit(64, &method_mask); // equal to this with the new register //reg_attr.method_mask[0] = 0x8000000000000002ULL; //reg_attr.method_mask[1] = 0x0000000000000001ULL; agent_id = umad_register(fd, UNLIKELY_MGMT_CLASS, 0x1, 0x00, method_mask); if (agent_id < 0) { printf("\n umad_register Failure, agent_id %d\n", agent_id); printf("\n umad_register(fd, 0x01, 0x1, 0x00, method_mask);\n"); test_failures++; } else { printf(" PASS\n"); umad_unregister(fd, agent_id); } printf("\n old register_oui test ... "); oui[0] = (class_oui >> 16) & 0xff; oui[1] = (class_oui >> 8) & 0xff; oui[2] = class_oui & 0xff; agent_id = umad_register_oui(fd, UNLIKELY_RMPP_MGMT_CLASS, 0x1, oui, method_mask); if (agent_id < 0) { printf("\n umad_register_oui Failure, agent_id %d\n", agent_id); printf("\n umad_register(fd, 0x30, 0x1, oui, method_mask);\n"); test_failures++; } else { printf(" PASS\n"); umad_unregister(fd, agent_id); } umad_close_port(fd); } static void test_fall_back(void) { int rc = 0; struct umad_reg_attr reg_attr; uint32_t agent_id; int fd; fd = open_test_device(); memset(®_attr, 0, sizeof(reg_attr)); reg_attr.mgmt_class = UNLIKELY_MGMT_CLASS; reg_attr.mgmt_class_version = 0x1; reg_attr.oui = 0x001405; /* OPENIB_OUI */ //reg_attr.method_mask[0] = 0x8000000000000002ULL; //reg_attr.method_mask[1] = 0x0000000000000001ULL; set_bit64( 1, (uint64_t *)®_attr.method_mask); set_bit64(63, (uint64_t *)®_attr.method_mask); set_bit64(64, (uint64_t *)®_attr.method_mask); printf("\n umad_register2 fall back (set_bit) ... "); rc = umad_register2(fd, ®_attr, &agent_id); if (rc != 0) { printf("\n umad_register2 failed to fall back. rc = %d\n", rc); dump_reg_attr(®_attr); test_failures++; } else { printf(" PASS\n"); umad_unregister(fd, agent_id); } reg_attr.method_mask[0] = 0x8000000000000002ULL; reg_attr.method_mask[1] = 0x0000000000000001ULL; printf("\n umad_register2 fall back ... "); rc = umad_register2(fd, ®_attr, &agent_id); if (rc != 0) { printf("\n umad_register2 failed to fall back. rc = %d\n", rc); dump_reg_attr(®_attr); test_failures++; } else { printf(" PASS\n"); umad_unregister(fd, agent_id); } umad_close_port(fd); } int main(int argc, char *argv[]) { //umad_debug(1); printf("\n *****\nStart compatibility tests\n"); test_register(); test_fall_back(); printf("\n *******************\n"); printf(" umad_reg2_compat had %d failures\n", test_failures); printf(" *******************\n"); return test_failures; } rdma-core-56.1/libibumad/tests/umad_register2.c000066400000000000000000000167061477342711600215160ustar00rootroot00000000000000/* * Copyright (c) 2014 Intel Corporation, All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include #include #include #define UNLIKELY_MGMT_CLASS 0x2F #define UNLIKELY_RMPP_MGMT_CLASS 0x4F struct ib_user_mad_reg_req2 { uint32_t id; uint32_t qpn; uint8_t mgmt_class; uint8_t mgmt_class_version; uint16_t res; uint32_t flags; uint64_t method_mask[2]; uint32_t oui; uint8_t rmpp_version; uint8_t reserved[3]; }; static int test_failures = 0; static void dump_reg_attr(struct umad_reg_attr *reg_attr) { printf("\nmgmt_class %u\n" "mgmt_class_version %u\n" "flags 0x%08x\n" "method_mask 0x%016"PRIx64" %016"PRIx64"\n" "oui 0x%06x\n" "rmpp_version %u\n\n", reg_attr->mgmt_class, reg_attr->mgmt_class_version, reg_attr->flags, reg_attr->method_mask[1], reg_attr->method_mask[0], reg_attr->oui, reg_attr->rmpp_version); } static int open_test_device(void) { int fd = umad_open_port(NULL, 0); if (fd < 0) { printf("\n *****\nOpen Port Failure... Aborting\n"); printf(" Ensure you have an HCA to test against.\n"); exit(0); } return fd; } static void test_fail(void) { int rc = 0; struct umad_reg_attr reg_attr; uint32_t agent_id; uint32_t agent_id2; int fd; printf("\n *****\nBegin invalid tests\n"); fd = open_test_device(); memset(®_attr, 0, sizeof(reg_attr)); reg_attr.mgmt_class = UNLIKELY_MGMT_CLASS; reg_attr.mgmt_class_version = 0x1; reg_attr.flags = 0x80000000; printf("\n invalid register flags ... "); rc = umad_register2(fd, ®_attr, &agent_id); if (rc == 0) { printf("\n umad_register2 registered invalid flags. rc = %d\n", rc); dump_reg_attr(®_attr); test_failures++; goto out; } else { printf(" PASS\n"); umad_unregister(fd, agent_id); } memset(®_attr, 0, sizeof(reg_attr)); reg_attr.mgmt_class = 0x03; reg_attr.mgmt_class_version = 0x2; reg_attr.rmpp_version = 0x02; printf("\n invalid rmpp_version ... "); rc = umad_register2(fd, ®_attr, &agent_id); if (rc == 0) { printf("\n umad_register2 registered an invalid rmpp_version. rc = %d\n", rc); dump_reg_attr(®_attr); test_failures++; goto out; } else { printf(" PASS\n"); umad_unregister(fd, agent_id); } memset(®_attr, 0, sizeof(reg_attr)); reg_attr.mgmt_class = UNLIKELY_RMPP_MGMT_CLASS; reg_attr.oui = 0x0100066a; printf("\n invalid oui ... "); rc = umad_register2(fd, ®_attr, &agent_id); if (rc == 0) { printf("\n umad_register2 registered an invalid oui. 
rc = %d\n", rc); dump_reg_attr(®_attr); test_failures++; goto out; } else { printf(" PASS\n"); umad_unregister(fd, agent_id); } /* The following 2 registrations attempt to register the same OUI 2 * times. The second one is supposed to fail with the same method * mask. */ printf("\n duplicate oui ... "); memset(®_attr, 0, sizeof(reg_attr)); reg_attr.mgmt_class = UNLIKELY_RMPP_MGMT_CLASS; reg_attr.mgmt_class_version = 0x1; reg_attr.rmpp_version = 0x00; reg_attr.oui = 0x00066a; reg_attr.method_mask[0] = 0x80000000000000DEULL; reg_attr.method_mask[1] = 0xAD00000000000001ULL; rc = umad_register2(fd, ®_attr, &agent_id); if (rc != 0) { printf("\n umad_register2 Failed to register an oui for the duplicate test. rc = %d\n", rc); dump_reg_attr(®_attr); test_failures++; goto out; } memset(®_attr, 0, sizeof(reg_attr)); reg_attr.mgmt_class = UNLIKELY_RMPP_MGMT_CLASS; reg_attr.mgmt_class_version = 0x1; reg_attr.rmpp_version = 0x00; reg_attr.oui = 0x00066a; reg_attr.method_mask[0] = 0x80000000000000DEULL; reg_attr.method_mask[1] = 0xAD00000000000001ULL; rc = umad_register2(fd, ®_attr, &agent_id2); if (rc == 0) { printf("\n umad_register2 registered a duplicate oui. rc = %d\n", rc); dump_reg_attr(®_attr); test_failures++; goto out; } else { printf(" PASS\n"); umad_unregister(fd, agent_id); umad_unregister(fd, agent_id2); } umad_close_port(fd); out: printf("\n *****\nEnd invalid tests\n"); } static void test_oui(void) { int rc = 0; struct umad_reg_attr reg_attr; uint32_t agent_id; int fd; printf("\n *****\nStart valid oui tests\n"); fd = open_test_device(); printf("\n valid oui ... "); memset(®_attr, 0, sizeof(reg_attr)); reg_attr.mgmt_class = UNLIKELY_RMPP_MGMT_CLASS; reg_attr.mgmt_class_version = 0x1; reg_attr.rmpp_version = 0x00; reg_attr.oui = 0x00066a; reg_attr.method_mask[0] = 0x80000000000000DEULL; reg_attr.method_mask[1] = 0xAD00000000000001ULL; rc = umad_register2(fd, ®_attr, &agent_id); if (rc != 0) { printf("\n umad_register2 failed oui 0x%x. rc = %d\n", reg_attr.oui, rc); dump_reg_attr(®_attr); test_failures++; goto out; } else { printf(" PASS\n"); umad_unregister(fd, agent_id); } printf("\n valid oui with flags ... "); memset(®_attr, 0, sizeof(reg_attr)); reg_attr.mgmt_class = UNLIKELY_RMPP_MGMT_CLASS; reg_attr.mgmt_class_version = 0x1; reg_attr.rmpp_version = 0x00; reg_attr.flags = 0x01; /* Use Intel OUI for testing */ reg_attr.oui = 0x00066a; rc = umad_register2(fd, ®_attr, &agent_id); if (rc != 0) { printf("\n umad_register2 failed oui 0x%x with flags 0x%x. rc = %d\n", reg_attr.oui, reg_attr.flags, rc); dump_reg_attr(®_attr); test_failures++; goto out; } else { printf(" PASS\n"); umad_unregister(fd, agent_id); } umad_close_port(fd); out: printf("\n End valid oui tests\n *****\n"); } static void check_register2_support(void) { struct ib_user_mad_reg_req2 req; int fd; fd = open_test_device(); memset(&req, 0, sizeof(req)); req.mgmt_class = UNLIKELY_MGMT_CLASS; req.mgmt_class_version = 0x1; req.qpn = 0x1; if (ioctl(fd, IB_USER_MAD_REGISTER_AGENT2, (void *)&req) != 0) { if (errno == ENOTTY || errno == EINVAL) { printf("\n *****\nKernel does not support the new ioctl. 
Aborting tests\n"); exit(0); } } umad_close_port(fd); } int main(int argc, char *argv[]) { //umad_debug(1); check_register2_support(); test_fail(); test_oui(); printf("\n *******************\n"); printf(" umad_register2 had %d failures\n", test_failures); printf(" *******************\n"); return test_failures; } rdma-core-56.1/libibumad/tests/umad_sa_mcm_rereg_test.c000066400000000000000000000363031477342711600232650ustar00rootroot00000000000000/* * Copyright (c) 2017 Mellanox Technologies Ltd. All rights reserved. * Copyright (c) 2006-2009 Voltaire, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include #include #include #define info(fmt, ...) fprintf(stderr, "INFO: " fmt, ## __VA_ARGS__) #define err(fmt, ...) fprintf(stderr, "ERR: " fmt, ## __VA_ARGS__) #ifdef NOISY_DEBUG #define dbg(fmt, ...) fprintf(stderr, "DBG: " fmt, ## __VA_ARGS__) #else #define dbg(fmt, ...) {} #endif #define DEFAULT_TIMEOUT 100 /* milliseconds */ #define MAX_PORT_GUIDS 64 /* Use null MGID to request SA assigned MGID */ static const uint8_t null_mgid[16] = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 }; static int create, join, leave; static uint8_t rate = 0xff, mtu = 0xff, sl = 0xff; static umad_port_t umad_port; struct guid_trid { uint8_t gid[16]; __be64 guid; uint64_t trid[2]; }; static void build_user_mad_addr(uint8_t *umad) { umad_set_addr(umad, umad_port.sm_lid, 1, umad_port.sm_sl, UMAD_QKEY); /* * The following 2 umad calls are redundant * as umad was originally cleared to */ umad_set_grh(umad, NULL); umad_set_pkey(umad, 0); /* just pkey index 0 for now !!! 
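	 * (index 0 conventionally holds the full default partition key,
	 * 0xffff, matching the pkey written into the MCMember record for
	 * create requests below)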
*/ } static void build_mcm_rec(struct umad_sa_packet *sa, uint8_t method, const uint8_t mgid[], const uint8_t port_gid[], uint64_t tid, int creat) { struct umad_sa_mcmember_record *mcm; memset(sa, 0, sizeof(*sa)); sa->mad_hdr.base_version = UMAD_BASE_VERSION; sa->mad_hdr.mgmt_class = UMAD_CLASS_SUBN_ADM; sa->mad_hdr.class_version = UMAD_SA_CLASS_VERSION; sa->mad_hdr.method = method; sa->mad_hdr.tid = htobe64(tid); sa->mad_hdr.attr_id = htons(UMAD_SA_ATTR_MCMEMBER_REC); if (creat) sa->comp_mask = htobe64(UMAD_SA_MCM_COMP_MASK_MGID | UMAD_SA_MCM_COMP_MASK_PORT_GID | UMAD_SA_MCM_COMP_MASK_QKEY | UMAD_SA_MCM_COMP_MASK_TCLASS | UMAD_SA_MCM_COMP_MASK_PKEY | UMAD_SA_MCM_COMP_MASK_SL | UMAD_SA_MCM_COMP_MASK_FLOW_LABEL | UMAD_SA_MCM_COMP_MASK_JOIN_STATE); else sa->comp_mask = htobe64(UMAD_SA_MCM_COMP_MASK_MGID | UMAD_SA_MCM_COMP_MASK_PORT_GID | UMAD_SA_MCM_COMP_MASK_JOIN_STATE); mcm = (struct umad_sa_mcmember_record *) sa->data; memcpy(mcm->mgid, mgid, sizeof(mcm->mgid)); memcpy(mcm->portgid, port_gid, sizeof(mcm->portgid)); umad_sa_mcm_set_join_state(mcm, UMAD_SA_MCM_JOIN_STATE_FULL_MEMBER); if (creat) { mcm->qkey = htonl(0xb1b); /* assume full default partition (in index 0) */ mcm->pkey = htons(0xffff); if (rate != 0xff) { sa->comp_mask |= htobe64(UMAD_SA_MCM_COMP_MASK_RATE_SEL | UMAD_SA_MCM_COMP_MASK_RATE); mcm->rate = (UMAD_SA_SELECTOR_EXACTLY << UMAD_SA_SELECTOR_SHIFT) | (rate & UMAD_SA_RATE_MTU_PKT_LIFE_MASK); } if (mtu != 0xff) { sa->comp_mask |= htobe64(UMAD_SA_MCM_COMP_MASK_MTU_SEL | UMAD_SA_MCM_COMP_MASK_MTU); mcm->mtu = (UMAD_SA_SELECTOR_EXACTLY << UMAD_SA_SELECTOR_SHIFT) | (mtu & UMAD_SA_RATE_MTU_PKT_LIFE_MASK); } if (sl != 0xff) { sa->comp_mask |= htobe64(UMAD_SA_MCM_COMP_MASK_SL); mcm->sl_flow_hop = umad_sa_mcm_set_sl_flow_hop(sl, 0, 0); } } } static int mcm_send(int portid, int agentid, uint8_t *umad, int len, int tmo, uint8_t method, const uint8_t mgid[], struct guid_trid *entry, int creat) { struct umad_sa_packet *sa = umad_get_mad(umad); build_mcm_rec(sa, method, mgid, entry->gid, entry->trid[0], creat); if (umad_send(portid, agentid, umad, len, tmo, 0) < 0) { err("umad_send %s failed: %s\n", (method == UMAD_METHOD_GET) ? "query" : "non query", strerror(errno)); return -1; } dbg("umad_send %d: tid = 0x%" PRIx64 "\n", method, be64toh(sa->mad_hdr.tid)); return 0; } static int rereg_port_gid(int portid, int agentid, uint8_t *umad, int len, int tmo, const uint8_t mgid[], struct guid_trid *entry) { struct umad_sa_packet *sa = umad_get_mad(umad); build_mcm_rec(sa, UMAD_SA_METHOD_DELETE, mgid, entry->gid, entry->trid[0], 0); if (umad_send(portid, agentid, umad, len, tmo, 0) < 0) { err("umad_send leave failed: %s\n", strerror(errno)); return -1; } dbg("umad_send leave: tid = 0x%" PRIx64 "\n", be64toh(sa->mad_hdr.tid)); entry->trid[0] = be64toh(sa->mad_hdr.tid); /* for agent ID */ sa->mad_hdr.method = UMAD_METHOD_SET; sa->mad_hdr.tid = htobe64(entry->trid[1]); if (umad_send(portid, agentid, umad, len, tmo, 0) < 0) { err("umad_send join failed: %s\n", strerror(errno)); return -1; } dbg("umad_send join: tid = 0x%" PRIx64 "\n", be64toh(sa->mad_hdr.tid)); entry->trid[1] = be64toh(sa->mad_hdr.tid); /* for agent ID */ return 0; } static int rereg_send_all(int portid, int agentid, int tmo, const uint8_t mgid[], struct guid_trid *list, unsigned int cnt) { uint8_t *umad; int len = sizeof(struct umad_hdr) + UMAD_LEN_DATA; unsigned int i, sent = 0; int ret; info("%s... 
cnt = %u\n", __func__, cnt); umad = calloc(1, len + umad_size()); if (!umad) { err("cannot alloc mem for umad: %s\n", strerror(errno)); return -1; } build_user_mad_addr(umad); for (i = 0; i < cnt; i++) { ret = rereg_port_gid(portid, agentid, umad, len, tmo, mgid, &list[i]); if (ret < 0) { err("%s: rereg_port_gid guid 0x%016" PRIx64 " failed\n", __func__, be64toh(list[i].guid)); continue; } sent++; } info("%s: sent %u of %u requests\n", __func__, sent * 2, cnt * 2); free(umad); return 0; } static int mcm_recv(int portid, uint8_t *umad, int length, int tmo) { int ret, retry = 0; int len = length; #ifdef NOISY_DEBUG struct umad_hdr *mad; #endif while ((ret = umad_recv(portid, umad, &len, tmo)) < 0 && errno == ETIMEDOUT) { if (retry++ > 3) return 0; } if (ret < 0) { err("umad_recv %d failed: %s\n", ret, strerror(errno)); return -1; } #ifdef NOISY_DEBUG mad = umad_get_mad(umad); #endif dbg("umad_recv (retries %d), tid = 0x%" PRIx64 ": len = %d, status = %d\n", retry, be64toh(mad->tid), len, umad_status(umad)); return 1; } static int rereg_recv_all(int portid, int agentid, int tmo, const uint8_t mgid[], struct guid_trid *list, unsigned int cnt) { uint8_t *umad; struct umad_hdr *mad; int len = sizeof(struct umad_hdr) + UMAD_LEN_DATA; uint64_t trid; unsigned int n, i, j; uint16_t status; uint8_t method; info("%s...\n", __func__); umad = calloc(1, len + umad_size()); if (!umad) { err("cannot alloc mem for umad: %s\n", strerror(errno)); return -1; } mad = umad_get_mad(umad); n = 0; while (mcm_recv(portid, umad, len, tmo) > 0) { dbg("%s: done %d\n", __func__, n); n++; method = mad->method; status = ntohs(mad->status); trid = be64toh(mad->tid); if (status) dbg("MAD status 0x%x, method 0x%x\n", status, method); if (status && (method == UMAD_METHOD_GET_RESP || method == UMAD_SA_METHOD_DELETE_RESP)) { for (i = 0; i < cnt; i++) for (j = 0; j < 2; j++) if (trid == list[i].trid[j]) break; if (i == cnt) { err("cannot find trid 0x%" PRIx64 ", status 0x%x, method 0x%x\n", trid, status, method); continue; } info("guid 0x%016" PRIx64 ": status 0x%x, method 0x%x. 
Retrying\n", be64toh(list[i].guid), status, method); rereg_port_gid(portid, agentid, umad, len, tmo, mgid, &list[i]); } } info("%s: got %u responses\n", __func__, n); free(umad); return 0; } static int query_all(int portid, int agentid, int tmo, uint8_t method, const uint8_t mgid[], struct guid_trid *list, int creat, unsigned int cnt) { uint8_t *umad; struct umad_hdr *mad; int len = sizeof(struct umad_hdr) + UMAD_LEN_DATA; unsigned int i, sent = 0; int ret; uint16_t status; uint8_t mcgid[16]; info("%s...\n", __func__); memcpy(mcgid, mgid, 16); umad = calloc(1, len + umad_size()); if (!umad) { err("cannot alloc mem for umad: %s\n", strerror(errno)); return -1; } build_user_mad_addr(umad); mad = umad_get_mad(umad); for (i = 0; i < cnt; i++) { ret = mcm_send(portid, agentid, umad, len, tmo, method, mcgid, &list[i], creat); if (ret < 0) { err("%s: mcm_send failed\n", __func__); continue; } sent++; ret = mcm_recv(portid, umad, len, tmo); if (ret < 0) { err("%s: mcm_recv failed\n", __func__); continue; } status = ntohs(mad->status); if (status) info( "guid 0x%016" PRIx64 ": status 0x%x, method 0x%x\n", be64toh(list[i].guid), status, mad->method); else if (creat && i == 0) { if (memcmp(mgid, null_mgid, 16) == 0) { struct umad_sa_packet *sa = (void *) mad; struct umad_sa_mcmember_record *mcm; mcm = (struct umad_sa_mcmember_record *) sa->data; memcpy(mcgid, mcm->mgid, 16); } } } info("%s: %u of %u queried\n", __func__, sent, cnt); free(umad); return 0; } static int test_port(const char *guid_file, int portid, int agentid, int tmo, const uint8_t mgid[]) { char line[256]; FILE *f; uint8_t port_gid[16]; uint64_t guidho; __be64 prefix, guid; uint64_t trid; struct guid_trid *list; int i = 0, j; list = calloc(MAX_PORT_GUIDS, sizeof(*list)); if (!list) { err("cannot alloc mem for guid/trid list: %s\n", strerror(errno)); return -1; } f = fopen(guid_file, "r"); if (!f) { err("cannot open %s: %s\n", guid_file, strerror(errno)); free(list); return -1; } trid = 0x12345678; /* starting tid */ prefix = umad_port.gid_prefix; while (fgets(line, sizeof(line), f)) { guidho = strtoull(line, NULL, 0); guid = htobe64(guidho); memcpy(&port_gid[0], &prefix, 8); memcpy(&port_gid[8], &guid, 8); list[i].guid = guid; memcpy(list[i].gid, port_gid, sizeof(list[i].gid)); for (j = 0; j < 2; j++) list[i].trid[j] = trid++; if (++i >= MAX_PORT_GUIDS) break; } fclose(f); if (create) query_all(portid, agentid, tmo, UMAD_METHOD_SET, mgid, list, 1, i); else if (join) query_all(portid, agentid, tmo, UMAD_METHOD_SET, mgid, list, 0, i); else if (leave) query_all(portid, agentid, tmo, UMAD_SA_METHOD_DELETE, mgid, list, 0, i); else { /* no operation specified - default to rereg */ rereg_send_all(portid, agentid, tmo, mgid, list, i); rereg_recv_all(portid, agentid, tmo, mgid, list, i); query_all(portid, agentid, tmo, UMAD_METHOD_GET, mgid, list, 0, i); } free(list); return 0; } static void show_usage(const char *prog_name) { fprintf(stderr, "%s [-C ] [-P ] [-F ] [-t ] [-g ] [-c] [-j] [-l] [-r ] [-m ] [-s ] [-h]\n", prog_name); fprintf(stderr, " -C use the specified ca_name\n"); fprintf(stderr, " -P use the specific ca_port\n"); fprintf(stderr, " -F use the specified port_guid_file\n"); fprintf(stderr, " defaults to port_guids.lst\n"); fprintf(stderr, " -t override the default timeout of 100 milliseconds\n"); fprintf(stderr, " -g MGID of MC group in IPv6 format\n"); fprintf(stderr, " defaults to IPv4 broadcast group if not specified\n"); fprintf(stderr, " To create SA assigned group, use either :: or 0:0:0:0:0:0:0:0\n"); fprintf(stderr, " -c 
create MC group with ports\n"); fprintf(stderr, " -j join ports to MC group\n"); fprintf(stderr, " -l remove ports from MC group (leave)\n"); fprintf(stderr, " operation defaults to reregister ports if none if c, j, l are specified\n\n"); fprintf(stderr, " -r Encoded rate value (for create)\n"); fprintf(stderr, " -m Encoded mtu value (for create)\n"); fprintf(stderr, " -s SL (for create)\n"); fprintf(stderr, " -h show this usage message\n"); } int main(int argc, char **argv) { char *ibd_ca = NULL; int ibd_ca_port = 0; const char *guid_file = "port_guids.list"; int tmo = DEFAULT_TIMEOUT; int c, portid, agentid; const char *prog_name; const char *const optstring = "F:C:P:t:g:cjlr:m:s:h"; /* IPoIB broadcast group (for full default pkey) */ uint8_t mgid[16] = { 0xff, 0x12, 0x40, 0x1b, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff }; prog_name = argv[0]; while ((c = getopt(argc, argv, optstring)) != -1) { switch (c) { case 'C': ibd_ca = optarg; break; case 'P': ibd_ca_port = strtoul(optarg, NULL, 0); break; case 'F': guid_file = optarg; break; case 't': tmo = atoi(optarg); break; case 'g': if (inet_pton(AF_INET6, optarg, &mgid) <= 0) { fprintf(stderr, "mgid could not be parsed\n"); exit(EXIT_FAILURE); } break; case 'c': create = 1; break; case 'j': join = 1; break; case 'l': leave = 1; break; case 'r': rate = atoi(optarg); break; case 'm': mtu = atoi(optarg); break; case 's': sl = atoi(optarg); break; case 'h': show_usage(prog_name); exit(EXIT_SUCCESS); break; default: fprintf(stderr, "Unrecognized option: -%c\n", optopt); show_usage(prog_name); exit(EXIT_FAILURE); break; } } if (umad_get_port(ibd_ca, ibd_ca_port, &umad_port) < 0) { if (ibd_ca == NULL) err( "umad_get_port failed for first IB CA port %d: %s\n", ibd_ca_port, strerror(errno)); else err("umad_get_port failed for CA %s port %d: %s\n", ibd_ca, ibd_ca_port, strerror(errno)); umad_done(); return -1; } info("using %s port %d guid 0x%016" PRIx64 "\n", umad_port.ca_name, umad_port.portnum, be64toh(umad_port.port_guid)); portid = umad_open_port(umad_port.ca_name, umad_port.portnum); if (portid < 0) { err("umad_open_port failed: %s\n", strerror(errno)); umad_release_port(&umad_port); umad_done(); return -1; } agentid = umad_register(portid, UMAD_CLASS_SUBN_ADM, UMAD_SA_CLASS_VERSION, 0, NULL); if (agentid < 0) { err("umad_register failed: %s\n", strerror(errno)); umad_release_port(&umad_port); umad_close_port(portid); umad_done(); return -1; } test_port(guid_file, portid, agentid, tmo, mgid); umad_release_port(&umad_port); umad_unregister(portid, agentid); umad_close_port(portid); umad_done(); return 0; } rdma-core-56.1/libibumad/umad.c000066400000000000000000001124751477342711600163660ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2014 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #define IB_OPENIB_OUI (0x001405) #define CAPMASK_IS_SM_DISABLED (0x400) #include #include "sysfs.h" typedef struct ib_user_mad_reg_req { uint32_t id; uint32_t method_mask[4]; uint8_t qpn; uint8_t mgmt_class; uint8_t mgmt_class_version; uint8_t oui[3]; uint8_t rmpp_version; } ib_user_mad_reg_req_t; struct ib_user_mad_reg_req2 { uint32_t id; uint32_t qpn; uint8_t mgmt_class; uint8_t mgmt_class_version; uint16_t res; uint32_t flags; uint64_t method_mask[2]; uint32_t oui; uint8_t rmpp_version; uint8_t reserved[3]; }; struct port_guid_port_count { __be64 port_guid; uint8_t count; }; struct guid_ca_pairs_mapping { __be64 port_guid; struct umad_ca_pair *ca_pair; }; #define IBWARN(fmt, args...) fprintf(stderr, "ibwarn: [%d] %s: " fmt "\n", getpid(), __func__, ## args) #define TRACE if (umaddebug) IBWARN #define DEBUG if (umaddebug) IBWARN static int umaddebug = 0; #define UMAD_DEV_FILE_SZ 256 static const char *def_ca_name = "mthca0"; static int def_ca_port = 1; static unsigned new_user_mad_api; static unsigned int get_abi_version(void) { static unsigned int abi_version; if (abi_version != 0) return abi_version & 0x7FFFFFFF; if (sys_read_uint(IB_UMAD_ABI_DIR, IB_UMAD_ABI_FILE, &abi_version) < 0) { IBWARN("can't read ABI version from %s/%s (%m): is ib_umad module loaded?", IB_UMAD_ABI_DIR, IB_UMAD_ABI_FILE); abi_version = (1U) << 31; return 0; } if (abi_version < IB_UMAD_ABI_VERSION) { abi_version = (1U) << 31; return 0; } return abi_version; } /************************************* * Port */ static int find_cached_ca(const char *ca_name, umad_ca_t * ca) { return 0; /* caching not implemented yet */ } static int put_ca(umad_ca_t * ca) { return 0; /* caching not implemented yet */ } static unsigned is_smi_disabled(umad_port_t *port) { return (be32toh(port->capmask) & CAPMASK_IS_SM_DISABLED); } static int release_port(umad_port_t * port) { free(port->pkeys); port->pkeys = NULL; port->pkeys_size = 0; return 0; } static int check_for_digit_name(const struct dirent *dent) { const char *p = dent->d_name; while (*p && isdigit(*p)) p++; return *p ? 
0 : 1; } static int get_port(const char *ca_name, const char *dir, int portnum, umad_port_t * port) { char port_dir[256]; union umad_gid gid; struct dirent **namelist; int i, len, num_pkeys = 0; uint32_t capmask; strncpy(port->ca_name, ca_name, sizeof port->ca_name - 1); port->portnum = portnum; port->pkeys = NULL; len = snprintf(port_dir, sizeof(port_dir), "%s/%d", dir, portnum); if (len < 0 || len > sizeof(port_dir)) return -EIO; if (sys_read_uint(port_dir, SYS_PORT_LMC, &port->lmc) < 0) return -EIO; if (sys_read_uint(port_dir, SYS_PORT_SMLID, &port->sm_lid) < 0) return -EIO; if (sys_read_uint(port_dir, SYS_PORT_SMSL, &port->sm_sl) < 0) return -EIO; if (sys_read_uint(port_dir, SYS_PORT_LID, &port->base_lid) < 0) return -EIO; if (sys_read_uint(port_dir, SYS_PORT_STATE, &port->state) < 0) return -EIO; if (sys_read_uint(port_dir, SYS_PORT_PHY_STATE, &port->phys_state) < 0) return -EIO; if (sys_read_uint(port_dir, SYS_PORT_RATE, &port->rate) < 0) return -EIO; if (sys_read_uint(port_dir, SYS_PORT_CAPMASK, &capmask) < 0) return -EIO; if (sys_read_string(port_dir, SYS_PORT_LINK_LAYER, port->link_layer, UMAD_CA_NAME_LEN) < 0) /* assume IB by default */ sprintf(port->link_layer, "IB"); port->capmask = htobe32(capmask); if (sys_read_gid(port_dir, SYS_PORT_GID, &gid) < 0) return -EIO; port->gid_prefix = gid.global.subnet_prefix; port->port_guid = gid.global.interface_id; snprintf(port_dir + len, sizeof(port_dir) - len, "/pkeys"); num_pkeys = scandir(port_dir, &namelist, check_for_digit_name, NULL); if (num_pkeys <= 0) { IBWARN("no pkeys found for %s:%u (at dir %s)...", port->ca_name, port->portnum, port_dir); return -EIO; } port->pkeys = calloc(num_pkeys, sizeof(port->pkeys[0])); if (!port->pkeys) { IBWARN("get_port: calloc failed: %s", strerror(errno)); goto clean_names; } for (i = 0; i < num_pkeys; i++) { unsigned idx, val; idx = strtoul(namelist[i]->d_name, NULL, 0); if (sys_read_uint(port_dir, namelist[i]->d_name, &val) < 0) goto clean_pkeys; port->pkeys[idx] = val; } port->pkeys_size = num_pkeys; for (i = 0; i < num_pkeys; i++) free(namelist[i]); free(namelist); port_dir[len] = '\0'; /* FIXME: handle gids */ return 0; clean_pkeys: free(port->pkeys); clean_names: for (i = 0; i < num_pkeys; i++) free(namelist[i]); free(namelist); return -EIO; } static int release_ca(umad_ca_t * ca) { int i; for (i = 0; i <= ca->numports; i++) { if (!ca->ports[i]) continue; release_port(ca->ports[i]); free(ca->ports[i]); ca->ports[i] = NULL; } return 0; } /* * if *port > 0, check ca[port] state. Otherwise set *port to * the first port that is active, and if such is not found, to * the first port that is link up and if none are linkup, then * the first port that is not disabled. Otherwise return -1. * if enforce_smi > 0, only search smi ports. if none are found, return -1. 
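 * Returns 1 when an active port was selected, 0 when only a link-up
 * (or at least not-disabled) port was found, and -1 when no usable
 * port exists.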
*/ static int resolve_ca_port(const char *ca_name, int *port, unsigned enforce_smi) { umad_ca_t ca; int active = -1, up = -1; int i, ret = 0; TRACE("checking ca '%s'", ca_name); if (umad_get_ca(ca_name, &ca) < 0) return -1; if (ca.node_type == 2) { *port = 0; /* switch sma port 0 */ ret = 1; goto Exit; } if (*port > 0) { /* check only the port the user wants */ if (*port > ca.numports) { ret = -1; goto Exit; } if (!ca.ports[*port]) { ret = -1; goto Exit; } if (strcmp(ca.ports[*port]->link_layer, "InfiniBand") && strcmp(ca.ports[*port]->link_layer, "IB")) { ret = -1; goto Exit; } if (enforce_smi && is_smi_disabled(ca.ports[*port])) { ret = -1; goto Exit; } if (ca.ports[*port]->state == 4) { ret = 1; goto Exit; } if (ca.ports[*port]->phys_state != 3) goto Exit; ret = -1; goto Exit; } for (i = 0; i <= ca.numports; i++) { DEBUG("checking port %d", i); if (!ca.ports[i]) continue; if (strcmp(ca.ports[i]->link_layer, "InfiniBand") && strcmp(ca.ports[i]->link_layer, "IB")) continue; if (enforce_smi && is_smi_disabled(ca.ports[i])) continue; if (up < 0 && ca.ports[i]->phys_state == 5) up = *port = i; if (ca.ports[i]->state == 4) { active = *port = i; DEBUG("found active port %d", i); break; } } if (active == -1 && up == -1) { /* no active or linkup port found */ for (i = 0; i <= ca.numports; i++) { DEBUG("checking port %d", i); if (!ca.ports[i]) continue; if (enforce_smi && is_smi_disabled(ca.ports[i])) continue; if (ca.ports[i]->phys_state != 3) { up = *port = i; break; } } } if (active >= 0) { ret = 1; goto Exit; } if (up >= 0) { ret = 0; goto Exit; } ret = -1; Exit: release_ca(&ca); return ret; } static int resolve_ca_name(const char *ca_in, int *best_port, char **ca_name, unsigned enforce_smi) { struct umad_device_node *device_list; struct umad_device_node *node; struct umad_device_node *phys_found = NULL; const char *name_found; int port_found = 0, port, port_type; *ca_name = NULL; if (ca_in && (!best_port || *best_port)) { *ca_name = strdup(ca_in); if (!(*ca_name)) return -1; return 0; } if (ca_in) { if (resolve_ca_port(ca_in, best_port, enforce_smi) < 0) return -1; *ca_name = strdup(ca_in); if (!(*ca_name)) return -1; return 0; } /* Get the list of CA names */ device_list = umad_get_ca_device_list(); if (!device_list) return -1; /* Find the first existing CA with an active port */ for (node = device_list; node; node = node->next) { name_found = node->ca_name; TRACE("checking ca '%s'", name_found); port = best_port ? *best_port : 0; port_type = resolve_ca_port(name_found, &port, enforce_smi); if (port_type < 0) continue; DEBUG("found ca %s with port %d type %d", name_found, port, port_type); if (port_type > 0) { if (best_port) *best_port = port; DEBUG("found ca %s with active port %d", name_found, port); *ca_name = strdup(name_found); umad_free_ca_device_list(device_list); if (!(*ca_name)) return -1; return 0; } if (!phys_found) { phys_found = node; port_found = port; } } DEBUG("phys found on %s port %d", phys_found ? phys_found->ca_name : NULL, port_found); if (phys_found) { name_found = phys_found->ca_name; DEBUG("phys found on %s port %d", phys_found ? 
name_found : NULL, port_found); if (best_port) *best_port = port_found; *ca_name = strdup(name_found); umad_free_ca_device_list(device_list); if (!(*ca_name)) return -1; return 0; } umad_free_ca_device_list(device_list); if (best_port) *best_port = def_ca_port; *ca_name = strdup(def_ca_name); if (!(*ca_name)) return -1; return 0; } static int get_ca(const char *ca_name, umad_ca_t * ca) { DIR *dir; char dir_name[256]; struct dirent **namelist; int r, i, ret; int portnum; ca->numports = 0; memset(ca->ports, 0, sizeof ca->ports); strncpy(ca->ca_name, ca_name, sizeof(ca->ca_name) - 1); snprintf(dir_name, sizeof(dir_name), "%s/%s", SYS_INFINIBAND, ca->ca_name); if ((r = sys_read_uint(dir_name, SYS_NODE_TYPE, &ca->node_type)) < 0) return r; if (sys_read_string(dir_name, SYS_CA_FW_VERS, ca->fw_ver, sizeof ca->fw_ver) < 0) ca->fw_ver[0] = '\0'; if (sys_read_string(dir_name, SYS_CA_HW_VERS, ca->hw_ver, sizeof ca->hw_ver) < 0) ca->hw_ver[0] = '\0'; if ((r = sys_read_string(dir_name, SYS_CA_TYPE, ca->ca_type, sizeof ca->ca_type)) < 0) ca->ca_type[0] = '\0'; if ((r = sys_read_guid(dir_name, SYS_CA_NODE_GUID, &ca->node_guid)) < 0) return r; if ((r = sys_read_guid(dir_name, SYS_CA_SYS_GUID, &ca->system_guid)) < 0) return r; snprintf(dir_name, sizeof(dir_name), "%s/%s/%s", SYS_INFINIBAND, ca->ca_name, SYS_CA_PORTS_DIR); if (!(dir = opendir(dir_name))) return -ENOENT; if ((r = scandir(dir_name, &namelist, NULL, alphasort)) < 0) { ret = errno < 0 ? errno : -EIO; goto error; } ret = 0; for (i = 0; i < r; i++) { portnum = 0; if (!strcmp(".", namelist[i]->d_name) || !strcmp("..", namelist[i]->d_name)) continue; if (strcmp("0", namelist[i]->d_name) && ((portnum = atoi(namelist[i]->d_name)) <= 0 || portnum >= UMAD_CA_MAX_PORTS)) { ret = -EIO; goto clean; } if (!(ca->ports[portnum] = calloc(1, sizeof(*ca->ports[portnum])))) { ret = -ENOMEM; goto clean; } if (get_port(ca_name, dir_name, portnum, ca->ports[portnum]) < 0) { free(ca->ports[portnum]); ca->ports[portnum] = NULL; ret = -EIO; goto clean; } if (ca->numports < portnum) ca->numports = portnum; } for (i = 0; i < r; i++) free(namelist[i]); free(namelist); closedir(dir); put_ca(ca); return 0; clean: for (i = 0; i < r; i++) free(namelist[i]); free(namelist); error: closedir(dir); release_ca(ca); return ret; } static int umad_id_to_dev(int umad_id, char *dev, unsigned *port) { char path[256]; int r; snprintf(path, sizeof(path), SYS_INFINIBAND_MAD "/umad%d/", umad_id); if ((r = sys_read_string(path, SYS_IB_MAD_DEV, dev, UMAD_CA_NAME_LEN)) < 0) return r; if ((r = sys_read_uint(path, SYS_IB_MAD_PORT, port)) < 0) return r; return 0; } static int dev_to_umad_id(const char *dev, unsigned port) { char umad_dev[UMAD_CA_NAME_LEN]; unsigned umad_port; int id; for (id = 0; id < UMAD_MAX_PORTS; id++) { if (umad_id_to_dev(id, umad_dev, &umad_port) < 0) continue; if (strncmp(dev, umad_dev, UMAD_CA_NAME_LEN)) continue; if (port != umad_port) continue; DEBUG("mapped %s %d to %d", dev, port, id); return id; } return -1; /* not found */ } static int umad_ca_device_list_compare_function(const void *node_a, const void *node_b) { return strcmp((*((const struct umad_device_node **)node_a))->ca_name, (*((const struct umad_device_node **)node_b))->ca_name); } /******************************* * Public interface */ int umad_init(void) { TRACE("umad_init"); return 0; } int umad_done(void) { TRACE("umad_done"); /* FIXME - verify that all ports are closed */ return 0; } static unsigned is_ib_type(const char *ca_name) { char dir_name[256]; unsigned type; snprintf(dir_name, sizeof(dir_name), 
"%s/%s", SYS_INFINIBAND, ca_name); if (sys_read_uint(dir_name, SYS_NODE_TYPE, &type) < 0) return 0; return type >= 1 && type <= 3 ? 1 : 0; } int umad_get_cas_names(char cas[][UMAD_CA_NAME_LEN], int max) { struct dirent **namelist; int n, i, j = 0; TRACE("max %d", max); n = scandir(SYS_INFINIBAND, &namelist, NULL, alphasort); if (n > 0) { for (i = 0; i < n; i++) { if (strcmp(namelist[i]->d_name, ".") && strcmp(namelist[i]->d_name, "..") && strlen(namelist[i]->d_name) < UMAD_CA_NAME_LEN) { if (j < max && is_ib_type(namelist[i]->d_name)) strcpy(cas[j++], namelist[i]->d_name); } free(namelist[i]); } DEBUG("return %d cas", j); } else { /* Is this still needed ? */ strncpy((char *)cas, def_ca_name, UMAD_CA_NAME_LEN); DEBUG("return 1 ca"); j = 1; } if (n >= 0) free(namelist); return j; } int umad_get_ca_portguids(const char *ca_name, __be64 *portguids, int max) { umad_ca_t ca; int ports = 0, i, result; char *found_ca_name; TRACE("ca name %s max port guids %d", ca_name, max); if (resolve_ca_name(ca_name, NULL, &found_ca_name, 0) < 0) { result = -ENODEV; goto exit; } if (umad_get_ca(found_ca_name, &ca) < 0) { result = -1; goto exit; } if (portguids) { if (ca.numports + 1 > max) { result = -ENOMEM; goto clean; } for (i = 0; i <= ca.numports; i++) portguids[ports++] = ca.ports[i] ? ca.ports[i]->port_guid : htobe64(0); } DEBUG("%s: %d ports", found_ca_name, ports); result = ports; clean: release_ca(&ca); exit: free(found_ca_name); return result; } int umad_get_issm_path(const char *ca_name, int portnum, char path[], int max) { int umad_id, result; char *found_ca_name; TRACE("ca %s port %d", ca_name, portnum); if (resolve_ca_name(ca_name, &portnum, &found_ca_name, 0) < 0) { result = -ENODEV; goto exit; } umad_id = dev_to_umad_id(found_ca_name, portnum); if (umad_id < 0) { result = -EINVAL; goto exit; } snprintf(path, max, "%s/issm%u", RDMA_CDEV_DIR, umad_id); result = 0; exit: free(found_ca_name); return result; } static int do_umad_open_port(const char *ca_name, int portnum, unsigned enforce_smi) { char dev_file[UMAD_DEV_FILE_SZ]; int umad_id, fd, result; unsigned int abi_version = get_abi_version(); char *found_ca_name = NULL; TRACE("ca %s port %d", ca_name, portnum); if (!abi_version) { result = -EOPNOTSUPP; goto exit; } if (resolve_ca_name(ca_name, &portnum, &found_ca_name, enforce_smi) < 0) { result = -ENODEV; goto exit; } DEBUG("opening %s port %d", found_ca_name, portnum); umad_id = dev_to_umad_id(found_ca_name, portnum); if (umad_id < 0) { result = -EINVAL; goto exit; } snprintf(dev_file, sizeof(dev_file), "%s/umad%d", RDMA_CDEV_DIR, umad_id); if ((fd = open(dev_file, O_RDWR | O_NONBLOCK)) < 0) { DEBUG("open %s failed: %s", dev_file, strerror(errno)); result = -EIO; goto exit; } if (abi_version > 5 || !ioctl(fd, IB_USER_MAD_ENABLE_PKEY, NULL)) new_user_mad_api = 1; else new_user_mad_api = 0; DEBUG("opened %s fd %d portid %d", dev_file, fd, umad_id); result = fd; exit: free(found_ca_name); return result; } int umad_open_port(const char *ca_name, int portnum) { return do_umad_open_port(ca_name, portnum, 0); } int umad_open_smi_port(const char *ca_name, int portnum) { return do_umad_open_port(ca_name, portnum, 1); } int umad_get_ca(const char *ca_name, umad_ca_t *ca) { int r = 0; char *found_ca_name; TRACE("ca_name %s", ca_name); if (resolve_ca_name(ca_name, NULL, &found_ca_name, 0) < 0) { r = -ENODEV; goto exit; } if (find_cached_ca(found_ca_name, ca) > 0) goto exit; r = get_ca(found_ca_name, ca); if (r < 0) goto exit; DEBUG("opened %s", found_ca_name); exit: free(found_ca_name); return r; } int 
umad_release_ca(umad_ca_t * ca) { int r; TRACE("ca_name %s", ca->ca_name); if (!ca) return -ENODEV; if ((r = release_ca(ca)) < 0) return r; DEBUG("releasing %s", ca->ca_name); return 0; } int umad_get_port(const char *ca_name, int portnum, umad_port_t *port) { char dir_name[256]; char *found_ca_name; int result; TRACE("ca_name %s portnum %d", ca_name, portnum); if (resolve_ca_name(ca_name, &portnum, &found_ca_name, 0) < 0) { result = -ENODEV; goto exit; } snprintf(dir_name, sizeof(dir_name), "%s/%s/%s", SYS_INFINIBAND, found_ca_name, SYS_CA_PORTS_DIR); result = get_port(found_ca_name, dir_name, portnum, port); exit: free(found_ca_name); return result; } int umad_release_port(umad_port_t * port) { int r; TRACE("port %s:%d", port->ca_name, port->portnum); if (!port) return -ENODEV; if ((r = release_port(port)) < 0) return r; DEBUG("releasing %s:%d", port->ca_name, port->portnum); return 0; } int umad_close_port(int fd) { close(fd); DEBUG("closed fd %d", fd); return 0; } void *umad_get_mad(void *umad) { return new_user_mad_api ? ((struct ib_user_mad *)umad)->data : (void *)&((struct ib_user_mad *)umad)->addr.pkey_index; } size_t umad_size(void) { return new_user_mad_api ? sizeof(struct ib_user_mad) : sizeof(struct ib_user_mad) - 8; } int umad_set_grh(void *umad, void *mad_addr) { struct ib_user_mad *mad = umad; struct ib_mad_addr *addr = mad_addr; if (mad_addr) { mad->addr.grh_present = 1; mad->addr.ib_gid = addr->ib_gid; /* The definition for umad_set_grh requires that the input be * in host order */ mad->addr.flow_label = htobe32((__force uint32_t)addr->flow_label); mad->addr.hop_limit = addr->hop_limit; mad->addr.traffic_class = addr->traffic_class; } else mad->addr.grh_present = 0; return 0; } int umad_set_pkey(void *umad, int pkey_index) { struct ib_user_mad *mad = umad; if (new_user_mad_api) mad->addr.pkey_index = pkey_index; return 0; } int umad_get_pkey(void *umad) { struct ib_user_mad *mad = umad; if (new_user_mad_api) return mad->addr.pkey_index; return 0; } int umad_set_addr(void *umad, int dlid, int dqp, int sl, int qkey) { struct ib_user_mad *mad = umad; TRACE("umad %p dlid %u dqp %d sl %d, qkey %x", umad, dlid, dqp, sl, qkey); mad->addr.qpn = htobe32(dqp); mad->addr.lid = htobe16(dlid); mad->addr.qkey = htobe32(qkey); mad->addr.sl = sl; return 0; } int umad_set_addr_net(void *umad, __be16 dlid, __be32 dqp, int sl, __be32 qkey) { struct ib_user_mad *mad = umad; TRACE("umad %p dlid %u dqp %d sl %d qkey %x", umad, be16toh(dlid), be32toh(dqp), sl, be32toh(qkey)); mad->addr.qpn = dqp; mad->addr.lid = dlid; mad->addr.qkey = qkey; mad->addr.sl = sl; return 0; } int umad_send(int fd, int agentid, void *umad, int length, int timeout_ms, int retries) { struct ib_user_mad *mad = umad; int n; TRACE("fd %d agentid %d umad %p timeout %u", fd, agentid, umad, timeout_ms); errno = 0; mad->timeout_ms = timeout_ms; mad->retries = retries; mad->agent_id = agentid; if (umaddebug > 1) umad_dump(mad); n = write(fd, mad, length + umad_size()); if (n == length + umad_size()) return 0; DEBUG("write returned %d != sizeof umad %zu + length %d (%m)", n, umad_size(), length); if (!errno) errno = EIO; return -EIO; } static int dev_poll(int fd, int timeout_ms) { struct pollfd ufds; int n; ufds.fd = fd; ufds.events = POLLIN; if ((n = poll(&ufds, 1, timeout_ms)) == 1) return 0; if (n == 0) return -ETIMEDOUT; return -EIO; } int umad_recv(int fd, void *umad, int *length, int timeout_ms) { struct ib_user_mad *mad = umad; int n; errno = 0; TRACE("fd %d umad %p timeout %u", fd, umad, timeout_ms); if (!umad || 
!length) { errno = EINVAL; return -EINVAL; } if (timeout_ms && (n = dev_poll(fd, timeout_ms)) < 0) { if (!errno) errno = -n; return n; } n = read(fd, umad, umad_size() + *length); VALGRIND_MAKE_MEM_DEFINED(umad, umad_size() + *length); if ((n >= 0) && (n <= umad_size() + *length)) { DEBUG("mad received by agent %d length %d", mad->agent_id, n); if (n > umad_size()) *length = n - umad_size(); else *length = 0; return mad->agent_id; } if (n == -EWOULDBLOCK) { if (!errno) errno = EWOULDBLOCK; return n; } DEBUG("read returned %zu > sizeof umad %zu + length %d (%m)", mad->length - umad_size(), umad_size(), *length); *length = mad->length - umad_size(); if (!errno) errno = EIO; return -errno; } int umad_poll(int fd, int timeout_ms) { TRACE("fd %d timeout %u", fd, timeout_ms); return dev_poll(fd, timeout_ms); } int umad_get_fd(int fd) { TRACE("fd %d", fd); return fd; } int umad_register_oui(int fd, int mgmt_class, uint8_t rmpp_version, uint8_t oui[3], long method_mask[16 / sizeof(long)]) { struct ib_user_mad_reg_req req; TRACE("fd %d mgmt_class %u rmpp_version %d oui 0x%x%x%x method_mask %p", fd, mgmt_class, (int)rmpp_version, (int)oui[0], (int)oui[1], (int)oui[2], method_mask); if (mgmt_class < 0x30 || mgmt_class > 0x4f) { DEBUG("mgmt class %d not in vendor range 2", mgmt_class); return -EINVAL; } req.qpn = 1; req.mgmt_class = mgmt_class; req.mgmt_class_version = 1; memcpy(req.oui, oui, sizeof req.oui); req.rmpp_version = rmpp_version; if (method_mask) memcpy(req.method_mask, method_mask, sizeof req.method_mask); else memset(req.method_mask, 0, sizeof req.method_mask); VALGRIND_MAKE_MEM_DEFINED(&req, sizeof req); if (!ioctl(fd, IB_USER_MAD_REGISTER_AGENT, (void *)&req)) { DEBUG ("fd %d registered to use agent %d qp %d class 0x%x oui %p", fd, req.id, req.qpn, req.mgmt_class, oui); return req.id; /* return agentid */ } DEBUG("fd %d registering qp %d class 0x%x version %d oui %p failed: %m", fd, req.qpn, req.mgmt_class, req.mgmt_class_version, oui); return -EPERM; } int umad_register(int fd, int mgmt_class, int mgmt_version, uint8_t rmpp_version, long method_mask[16 / sizeof(long)]) { struct ib_user_mad_reg_req req; __be32 oui = htobe32(IB_OPENIB_OUI); int qp; TRACE ("fd %d mgmt_class %u mgmt_version %u rmpp_version %d method_mask %p", fd, mgmt_class, mgmt_version, rmpp_version, method_mask); req.qpn = qp = (mgmt_class == 0x1 || mgmt_class == 0x81) ? 
0 : 1; req.mgmt_class = mgmt_class; req.mgmt_class_version = mgmt_version; req.rmpp_version = rmpp_version; if (method_mask) memcpy(req.method_mask, method_mask, sizeof req.method_mask); else memset(req.method_mask, 0, sizeof req.method_mask); memcpy(&req.oui, (char *)&oui + 1, sizeof req.oui); VALGRIND_MAKE_MEM_DEFINED(&req, sizeof req); if (!ioctl(fd, IB_USER_MAD_REGISTER_AGENT, (void *)&req)) { DEBUG("fd %d registered to use agent %d qp %d", fd, req.id, qp); return req.id; /* return agentid */ } DEBUG("fd %d registering qp %d class 0x%x version %d failed: %m", fd, qp, mgmt_class, mgmt_version); return -EPERM; } int umad_register2(int port_fd, struct umad_reg_attr *attr, uint32_t *agent_id) { struct ib_user_mad_reg_req2 req; int rc; if (!attr || !agent_id) return EINVAL; TRACE("fd %d mgmt_class %u mgmt_class_version %u flags 0x%08x " "method_mask 0x%016" PRIx64 " %016" PRIx64 "oui 0x%06x rmpp_version %u ", port_fd, attr->mgmt_class, attr->mgmt_class_version, attr->flags, attr->method_mask[0], attr->method_mask[1], attr->oui, attr->rmpp_version); if (attr->mgmt_class >= 0x30 && attr->mgmt_class <= 0x4f && ((attr->oui & 0x00ffffff) == 0 || (attr->oui & 0xff000000) != 0)) { DEBUG("mgmt class %d is in vendor range 2 but oui (0x%08x) is invalid", attr->mgmt_class, attr->oui); return EINVAL; } memset(&req, 0, sizeof(req)); req.mgmt_class = attr->mgmt_class; req.mgmt_class_version = attr->mgmt_class_version; req.qpn = (attr->mgmt_class == 0x1 || attr->mgmt_class == 0x81) ? 0 : 1; req.flags = attr->flags; memcpy(req.method_mask, attr->method_mask, sizeof req.method_mask); req.oui = attr->oui; req.rmpp_version = attr->rmpp_version; VALGRIND_MAKE_MEM_DEFINED(&req, sizeof req); if ((rc = ioctl(port_fd, IB_USER_MAD_REGISTER_AGENT2, (void *)&req)) == 0) { DEBUG("fd %d registered to use agent %d qp %d class 0x%x oui 0x%06x", port_fd, req.id, req.qpn, req.mgmt_class, attr->oui); *agent_id = req.id; return 0; } if (errno == ENOTTY || errno == EINVAL) { TRACE("no kernel support for registration flags"); req.flags = 0; if (attr->flags == 0) { struct ib_user_mad_reg_req req_v1; TRACE("attempting original register ioctl"); memset(&req_v1, 0, sizeof(req_v1)); req_v1.mgmt_class = req.mgmt_class; req_v1.mgmt_class_version = req.mgmt_class_version; req_v1.qpn = req.qpn; req_v1.rmpp_version = req.rmpp_version; req_v1.oui[0] = (req.oui & 0xff0000) >> 16; req_v1.oui[1] = (req.oui & 0x00ff00) >> 8; req_v1.oui[2] = req.oui & 0x0000ff; memcpy(req_v1.method_mask, req.method_mask, sizeof req_v1.method_mask); if ((rc = ioctl(port_fd, IB_USER_MAD_REGISTER_AGENT, (void *)&req_v1)) == 0) { DEBUG("fd %d registered to use agent %d qp %d class 0x%x oui 0x%06x", port_fd, req_v1.id, req_v1.qpn, req_v1.mgmt_class, attr->oui); *agent_id = req_v1.id; return 0; } } } rc = errno; attr->flags = req.flags; DEBUG("fd %d registering qp %d class 0x%x version %d " "oui 0x%06x failed flags returned 0x%x : %m", port_fd, req.qpn, req.mgmt_class, req.mgmt_class_version, attr->oui, req.flags); return rc; } int umad_unregister(int fd, int agentid) { TRACE("fd %d unregistering agent %d", fd, agentid); return ioctl(fd, IB_USER_MAD_UNREGISTER_AGENT, &agentid); } int umad_status(void *umad) { struct ib_user_mad *mad = umad; return mad->status; } ib_mad_addr_t *umad_get_mad_addr(void *umad) { struct ib_user_mad *mad = umad; return &mad->addr; } int umad_debug(int level) { if (level >= 0) umaddebug = level; return umaddebug; } void umad_addr_dump(ib_mad_addr_t * addr) { #define HEX(x) ((x) < 10 ? 
'0' + (x) : 'a' + ((x) -10)) char gid_str[64]; int i; for (i = 0; i < sizeof addr->gid; i++) { gid_str[i * 2] = HEX(addr->gid[i] >> 4); gid_str[i * 2 + 1] = HEX(addr->gid[i] & 0xf); } gid_str[i * 2] = 0; IBWARN("qpn %d qkey 0x%x lid %u sl %d\n" "grh_present %d gid_index %d hop_limit %d traffic_class %d flow_label 0x%x pkey_index 0x%x\n" "Gid 0x%s", be32toh(addr->qpn), be32toh(addr->qkey), be16toh(addr->lid), addr->sl, addr->grh_present, (int)addr->gid_index, (int)addr->hop_limit, (int)addr->traffic_class, addr->flow_label, addr->pkey_index, gid_str); } void umad_dump(void *umad) { struct ib_user_mad *mad = umad; IBWARN("agent id %d status %x timeout %d", mad->agent_id, mad->status, mad->timeout_ms); umad_addr_dump(&mad->addr); } int umad_sort_ca_device_list(struct umad_device_node **head, size_t size) { int errsv = 0; size_t i; struct umad_device_node *node; struct umad_device_node **nodes_array = NULL; if (!size) for (node = *head; node; node = node->next) size++; if (size < 2) return 0; nodes_array = calloc(size, sizeof(struct umad_device_node *)); if (!nodes_array) { errsv = ENOMEM; goto exit; } node = *head; for (i = 0; i < size; i++) { if (!node) { errsv = EINVAL; goto exit; } nodes_array[i] = node; node = node->next; } if (node) { errsv = EINVAL; goto exit; } qsort(nodes_array, size, sizeof(struct umad_device_node *), umad_ca_device_list_compare_function); for (i = 0; i < size - 1; i++) nodes_array[i]->next = nodes_array[i + 1]; *head = nodes_array[0]; nodes_array[size - 1]->next = NULL; exit: free(nodes_array); return errsv; } struct umad_device_node *umad_get_ca_device_list(void) { DIR *dir; struct dirent *entry; struct umad_device_node *head = NULL; struct umad_device_node *tail; struct umad_device_node *node; char *ca_name; size_t cas_num = 0; size_t d_name_size; int errsv = 0; dir = opendir(SYS_INFINIBAND); if (!dir) { if (errno == ENOENT) errno = 0; return NULL; } while ((entry = readdir(dir))) { if ((strcmp(entry->d_name, ".") == 0) || (strcmp(entry->d_name, "..") == 0)) continue; if (!is_ib_type(entry->d_name)) continue; d_name_size = strlen(entry->d_name) + 1; node = calloc(1, sizeof(struct umad_device_node) + d_name_size); if (!node) { errsv = ENOMEM; umad_free_ca_device_list(head); head = NULL; goto exit; } if (!head) head = node; else tail->next = node; tail = node; ca_name = (char *)(node + 1); strncpy(ca_name, entry->d_name, d_name_size); node->ca_name = ca_name; cas_num++; } DEBUG("return %zu cas", cas_num); exit: closedir(dir); errno = errsv; return head; } void umad_free_ca_device_list(struct umad_device_node *head) { struct umad_device_node *node; struct umad_device_node *next; for (node = head; node; node = next) { next = node->next; free(node); } } static struct umad_ca_pair *get_ca_pair_from_arr_by_guid(__be64 port_guid, struct guid_ca_pairs_mapping mapping[], size_t map_max, size_t *map_added, struct umad_ca_pair devs[], size_t devs_max, size_t *devs_added) { struct umad_ca_pair *dev = NULL; // attempt to find the port guid in the mapping size_t i = 0; for (i = 0; i < *map_added; ++i) { if (mapping[i].port_guid == port_guid) return mapping[i].ca_pair; } // attempt to add a new mapping/device if (*map_added >= map_max || *devs_added >= devs_max) return NULL; dev = &devs[*devs_added]; mapping[*map_added].port_guid = port_guid; mapping[*map_added].ca_pair = dev; (*devs_added)++; (*map_added)++; return dev; } static uint8_t get_port_guid_count(__be64 guid, const struct port_guid_port_count counts[], size_t max_guids) { size_t i = 0; for (i = 0; i < max_guids; ++i) { 
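		/* linear scan; counts[] holds at most UMAD_MAX_PORTS
		 * entries, so this stays cheap */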
if (counts[i].port_guid == guid) return counts[i].count; } return 0; } static bool find_port_guid_count(struct port_guid_port_count counts[], size_t max, __be64 port_guid, size_t *index) { size_t i = 0; for (i = 0; i < max; ++i) { if (counts[i].port_guid == 0) { *index = i; return false; } if (counts[i].port_guid == port_guid) { *index = i; return true; } } *index = max; return false; } static int count_ports_by_guid(char legacy_ca_names[][UMAD_CA_NAME_LEN], size_t num_cas, struct port_guid_port_count counts[], size_t max) { // how many unique port GUIDs were added size_t num_of_guid = 0; memset(counts, 0, max * sizeof(struct port_guid_port_count)); size_t c_idx = 0; for (c_idx = 0; c_idx < num_cas; ++c_idx) { umad_ca_t curr_ca; if (umad_get_ca(legacy_ca_names[c_idx], &curr_ca) < 0) continue; size_t p_idx = 0; for (p_idx = 0; p_idx < (size_t)curr_ca.numports + 1; ++p_idx) { umad_port_t *p_port = curr_ca.ports[p_idx]; size_t count_idx = 0; if (!p_port) continue; if (find_port_guid_count(counts, max, p_port->port_guid, &count_idx)) { // port GUID already has a count struct ++counts[count_idx].count; } else { // add a new count struct for this GUID. // if the maximum amount was already added, do nothing. if (count_idx != max) { counts[count_idx].port_guid = p_port->port_guid; counts[count_idx].count = 1; ++num_of_guid; } } } umad_release_ca(&curr_ca); } return num_of_guid; } int umad_get_smi_gsi_pairs(struct umad_ca_pair cas[], size_t max) { size_t added_devices = 0, added_mappings = 0; char legacy_ca_names[UMAD_MAX_DEVICES][UMAD_CA_NAME_LEN] = {}; struct port_guid_port_count counts[UMAD_MAX_PORTS] = {}; struct guid_ca_pairs_mapping mapping[UMAD_MAX_PORTS] = {}; memset(cas, 0, sizeof(struct umad_ca_pair) * max); int cas_found = umad_get_cas_names(legacy_ca_names, UMAD_MAX_DEVICES); if (cas_found < 0) return 0; count_ports_by_guid(legacy_ca_names, cas_found, counts, UMAD_MAX_PORTS); size_t c_idx = 0; for (c_idx = 0; c_idx < (size_t)cas_found; ++c_idx) { umad_ca_t curr_ca; if (umad_get_ca(legacy_ca_names[c_idx], &curr_ca) < 0) continue; size_t p_idx = 0; for (p_idx = 0; p_idx < (size_t)curr_ca.numports + 1; ++p_idx) { umad_port_t *p_port = curr_ca.ports[p_idx]; uint8_t guid_count = 0; if (!p_port) continue; guid_count = get_port_guid_count(curr_ca.ports[p_idx]->port_guid, counts, UMAD_MAX_PORTS); struct umad_ca_pair *dev = get_ca_pair_from_arr_by_guid(p_port->port_guid, mapping, UMAD_MAX_PORTS, &added_mappings, cas, max, &added_devices); if (!dev) continue; if (guid_count > 1) { // planarized port char *dev_name = is_smi_disabled(p_port) ? dev->gsi_name : dev->smi_name; strncpy(dev_name, curr_ca.ca_name, UMAD_CA_NAME_LEN); break; } else if (guid_count == 1) { if (!is_smi_disabled(p_port)) strncpy(dev->smi_name, curr_ca.ca_name, UMAD_CA_NAME_LEN); strncpy(dev->gsi_name, curr_ca.ca_name, UMAD_CA_NAME_LEN); break; } else { umad_release_ca(&curr_ca); return -1; } } umad_release_ca(&curr_ca); } return added_devices; } static int umad_check_active(const umad_ca_t *ca, int prefered_portnum) { if (!ca) return 1; if (!ca->ports[prefered_portnum]) return 1; int state = ca->ports[prefered_portnum]->state; return !(state > 1); } static int umad_find_active(struct umad_ca_pair *ca_pair, const umad_ca_t *ca, bool is_gsi) { size_t i = 1; uint32_t *portnum_to_set = is_gsi ? 
&ca_pair->gsi_preferred_port : &ca_pair->smi_preferred_port; if (!ca_pair) return 1; for (i = 0; i < (size_t)ca->numports + 1; ++i) { if (!umad_check_active(ca, i)) { *portnum_to_set = ca->ports[i]->portnum; return 0; } } return 1; } static int find_preferred_ports(struct umad_ca_pair *ca_pair, const umad_ca_t *ca, bool is_gsi, int portnum) { if (portnum) { //in case we have same device, use same port for smi/gsi if (!strncmp(ca_pair->gsi_name, ca_pair->smi_name, UMAD_CA_NAME_LEN)) { if (!umad_check_active(ca, portnum)) { ca_pair->gsi_preferred_port = portnum; ca_pair->smi_preferred_port = portnum; return 0; } return 1; } uint32_t *port_to_set = is_gsi ? &ca_pair->gsi_preferred_port : &ca_pair->smi_preferred_port; if (!umad_check_active(ca, portnum)) { *port_to_set = portnum; return umad_find_active(ca_pair, ca, !is_gsi); } return 1; } return umad_find_active(ca_pair, ca, false) + umad_find_active(ca_pair, ca, true); } int umad_get_smi_gsi_pair_by_ca_name(const char *name, uint8_t portnum, struct umad_ca_pair *ca_pair, unsigned enforce_smi) { int rc = 1; size_t i = 0; int num_cas = 0; bool is_gsi = false; umad_ca_t ca; struct umad_ca_pair cas_pair[UMAD_MAX_PORTS] = {}; if (!ca_pair) return -1; memset(cas_pair, 0, sizeof(cas_pair)); memset(ca_pair, 0, sizeof(*ca_pair)); num_cas = umad_get_smi_gsi_pairs(cas_pair, UMAD_MAX_PORTS); if (num_cas <= 0) return num_cas; for (i = 0; i < (size_t)num_cas; ++i) { if (enforce_smi && !cas_pair[i].smi_name[0]) continue; if (!cas_pair[i].gsi_name[0]) continue; if (name) // name doesn't match - keep searching if (strncmp(cas_pair[i].gsi_name, name, UMAD_CA_NAME_LEN) && strncmp(cas_pair[i].smi_name, name, UMAD_CA_NAME_LEN)) { continue; } // check that the device given by "name" has a port number "portnum" // (if name doesn't exist, assume SMI port is given) // (if enforce_smi is false assume GSI in case no SMI present) is_gsi = (!enforce_smi && !cas_pair[i].smi_name[0]) || (name && !strncmp(name, cas_pair[i].gsi_name, UMAD_CA_NAME_LEN)); if (umad_get_ca(is_gsi ? cas_pair[i].gsi_name : cas_pair[i].smi_name, &ca) < 0) continue; if (portnum) { if (!ca.ports[portnum]) { umad_release_ca(&ca); continue; } } // fill candidate *ca_pair = cas_pair[i]; rc = find_preferred_ports(ca_pair, &ca, is_gsi, portnum); umad_release_ca(&ca); if (!rc) break; } if (rc) { errno = ENODEV; return -errno; } return rc; } rdma-core-56.1/libibumad/umad.h000066400000000000000000000177401477342711600163720ustar00rootroot00000000000000/* * Copyright (c) 2004-2009 Voltaire Inc. All rights reserved. * Copyright (c) 2014 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #ifndef _UMAD_H #define _UMAD_H #include #include #include #include #include /* __be16, __be32 and __be64 */ #ifdef __cplusplus extern "C" { #endif typedef __be16 __attribute__((deprecated)) be16_t; typedef __be32 __attribute__((deprecated)) be32_t; typedef __be64 __attribute__((deprecated)) be64_t; /* * A GID data structure that may be used in definitions of on-the-wire data * structures. Do not cast umad_gid pointers to ibv_gid pointers because the * alignment of these two data structures is different. */ union umad_gid { uint8_t raw[16]; __be16 raw_be16[8]; struct { __be64 subnet_prefix; __be64 interface_id; } global; } __attribute__((aligned(4))) __attribute__((packed)); #define UMAD_MAX_DEVICES 32 #define UMAD_ANY_PORT 0 typedef struct ib_mad_addr { __be32 qpn; __be32 qkey; __be16 lid; uint8_t sl; uint8_t path_bits; uint8_t grh_present; uint8_t gid_index; uint8_t hop_limit; uint8_t traffic_class; union { uint8_t gid[16]; /* network-byte order */ union umad_gid ib_gid; }; __be32 flow_label; uint16_t pkey_index; uint8_t reserved[6]; } ib_mad_addr_t; typedef struct ib_user_mad { uint32_t agent_id; uint32_t status; uint32_t timeout_ms; uint32_t retries; uint32_t length; ib_mad_addr_t addr; uint8_t data[0]; } ib_user_mad_t; #define IB_UMAD_ABI_VERSION 5 #define IB_UMAD_ABI_DIR "/sys/class/infiniband_mad" #define IB_UMAD_ABI_FILE "abi_version" #define IB_IOCTL_MAGIC 0x1b #define IB_USER_MAD_REGISTER_AGENT _IOWR(IB_IOCTL_MAGIC, 1, \ struct ib_user_mad_reg_req) #define IB_USER_MAD_UNREGISTER_AGENT _IOW(IB_IOCTL_MAGIC, 2, uint32_t) #define IB_USER_MAD_ENABLE_PKEY _IO(IB_IOCTL_MAGIC, 3) #define IB_USER_MAD_REGISTER_AGENT2 _IOWR(IB_IOCTL_MAGIC, 4, \ struct ib_user_mad_reg_req2) #define UMAD_CA_NAME_LEN 20 #define UMAD_CA_MAX_PORTS 10 /* 0 - 9 */ #define UMAD_CA_MAX_AGENTS 32 #define SYS_INFINIBAND "/sys/class/infiniband" #define SYS_INFINIBAND_MAD "/sys/class/infiniband_mad" #define SYS_IB_MAD_PORT "port" #define SYS_IB_MAD_DEV "ibdev" #define UMAD_MAX_PORTS 64 #define SYS_CA_PORTS_DIR "ports" #define SYS_NODE_TYPE "node_type" #define SYS_CA_FW_VERS "fw_ver" #define SYS_CA_HW_VERS "hw_rev" #define SYS_CA_TYPE "hca_type" #define SYS_CA_NODE_GUID "node_guid" #define SYS_CA_SYS_GUID "sys_image_guid" #define SYS_PORT_LMC "lid_mask_count" #define SYS_PORT_SMLID "sm_lid" #define SYS_PORT_SMSL "sm_sl" #define SYS_PORT_LID "lid" #define SYS_PORT_STATE "state" #define SYS_PORT_PHY_STATE "phys_state" #define SYS_PORT_CAPMASK "cap_mask" #define SYS_PORT_RATE "rate" #define SYS_PORT_GUID "port_guid" #define SYS_PORT_GID "gids/0" #define SYS_PORT_LINK_LAYER "link_layer" typedef struct umad_port { char ca_name[UMAD_CA_NAME_LEN]; int portnum; unsigned base_lid; unsigned lmc; unsigned sm_lid; unsigned sm_sl; unsigned state; unsigned phys_state; unsigned rate; __be32 capmask; __be64 gid_prefix; __be64 port_guid; unsigned pkeys_size; uint16_t *pkeys; char link_layer[UMAD_CA_NAME_LEN]; } umad_port_t; typedef struct umad_ca { char ca_name[UMAD_CA_NAME_LEN]; unsigned node_type; int numports; char fw_ver[20]; char ca_type[40]; char 
hw_ver[20]; __be64 node_guid; __be64 system_guid; umad_port_t *ports[UMAD_CA_MAX_PORTS]; } umad_ca_t; struct umad_ca_pair { char smi_name[UMAD_CA_NAME_LEN]; uint32_t smi_preferred_port; char gsi_name[UMAD_CA_NAME_LEN]; uint32_t gsi_preferred_port; }; struct umad_device_node { struct umad_device_node *next; /* next umad device node */ const char *ca_name; /* ca name */ }; int umad_init(void); int umad_done(void); int umad_get_cas_names(char cas[][UMAD_CA_NAME_LEN], int max); int umad_get_ca_portguids(const char *ca_name, __be64 *portguids, int max); int umad_get_ca(const char *ca_name, umad_ca_t * ca); int umad_release_ca(umad_ca_t * ca); int umad_get_port(const char *ca_name, int portnum, umad_port_t * port); int umad_release_port(umad_port_t * port); int umad_get_issm_path(const char *ca_name, int portnum, char path[], int max); int umad_open_port(const char *ca_name, int portnum); int umad_open_smi_port(const char *ca_name, int portnum); int umad_close_port(int portid); void *umad_get_mad(void *umad); size_t umad_size(void); int umad_status(void *umad); ib_mad_addr_t *umad_get_mad_addr(void *umad); int umad_set_grh_net(void *umad, void *mad_addr); int umad_set_grh(void *umad, void *mad_addr); int umad_set_addr_net(void *umad, __be16 dlid, __be32 dqp, int sl, __be32 qkey); int umad_set_addr(void *umad, int dlid, int dqp, int sl, int qkey); int umad_set_pkey(void *umad, int pkey_index); int umad_get_pkey(void *umad); int umad_send(int portid, int agentid, void *umad, int length, int timeout_ms, int retries); int umad_recv(int portid, void *umad, int *length, int timeout_ms); int umad_poll(int portid, int timeout_ms); int umad_get_fd(int portid); int umad_register(int portid, int mgmt_class, int mgmt_version, uint8_t rmpp_version, long method_mask[16 / sizeof(long)]); int umad_register_oui(int portid, int mgmt_class, uint8_t rmpp_version, uint8_t oui[3], long method_mask[16 / sizeof(long)]); int umad_unregister(int portid, int agentid); int umad_sort_ca_device_list(struct umad_device_node **head, size_t size); struct umad_device_node *umad_get_ca_device_list(void); void umad_free_ca_device_list(struct umad_device_node *head); enum { UMAD_USER_RMPP = (1 << 0) }; struct umad_reg_attr { uint8_t mgmt_class; uint8_t mgmt_class_version; uint32_t flags; uint64_t method_mask[2]; uint32_t oui; uint8_t rmpp_version; }; int umad_register2(int port_fd, struct umad_reg_attr *attr, uint32_t *agent_id); int umad_debug(int level); void umad_addr_dump(ib_mad_addr_t * addr); void umad_dump(void *umad); int umad_get_smi_gsi_pairs(struct umad_ca_pair cas[], size_t max); int umad_get_smi_gsi_pair_by_ca_name(const char *devname, uint8_t portnum, struct umad_ca_pair *ca_pair, unsigned enforce_smi); static inline void *umad_alloc(int num, size_t size) { /* alloc array of umad buffers */ return calloc(num, size); } static inline void umad_free(void *umad) { free(umad); } /* Users should use the glibc functions directly, not these wrappers */ #ifndef ntohll #undef ntohll static inline __attribute__((deprecated)) uint64_t ntohll(uint64_t x) { return be64toh(x); } #define ntohll ntohll #endif #ifndef htonll #undef htonll static inline __attribute__((deprecated)) uint64_t htonll(uint64_t x) { return htobe64(x); } #define htonll htonll #endif #ifdef __cplusplus } #endif #endif /* _UMAD_H */ rdma-core-56.1/libibumad/umad_cm.h000066400000000000000000000041011477342711600170340ustar00rootroot00000000000000/* * Copyright (c) 2010 Intel Corporation. All rights reserved. * Copyright (c) 2014 Mellanox Technologies LTD. 
All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef _UMAD_CM_H #define _UMAD_CM_H #include #ifdef __cplusplus extern "C" { #endif /* Communication management attributes */ enum { UMAD_CM_ATTR_REQ = 0x0010, UMAD_CM_ATTR_MRA = 0x0011, UMAD_CM_ATTR_REJ = 0x0012, UMAD_CM_ATTR_REP = 0x0013, UMAD_CM_ATTR_RTU = 0x0014, UMAD_CM_ATTR_DREQ = 0x0015, UMAD_CM_ATTR_DREP = 0x0016, UMAD_CM_ATTR_SIDR_REQ = 0x0017, UMAD_CM_ATTR_SIDR_REP = 0x0018, UMAD_CM_ATTR_LAP = 0x0019, UMAD_CM_ATTR_APR = 0x001A, UMAD_CM_ATTR_SAP = 0x001B, UMAD_CM_ATTR_SPR = 0x001C, }; #ifdef __cplusplus } #endif #endif /* _UMAD_CM_H */ rdma-core-56.1/libibumad/umad_sa.h000066400000000000000000000134041477342711600170460ustar00rootroot00000000000000/* * Copyright (c) 2004 Topspin Communications. All rights reserved. * Copyright (c) 2005 Voltaire, Inc. All rights reserved. * Copyright (c) 2006, 2010 Intel Corporation. All rights reserved. * Copyright (c) 2014 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #ifndef _UMAD_SA_H #define _UMAD_SA_H #include #ifdef __cplusplus extern "C" { #endif /* SA specific methods */ enum { UMAD_SA_CLASS_VERSION = 2, /* IB spec version 1.1/1.2 */ UMAD_SA_METHOD_GET_TABLE = 0x12, UMAD_SA_METHOD_GET_TABLE_RESP = 0x92, UMAD_SA_METHOD_DELETE = 0x15, UMAD_SA_METHOD_DELETE_RESP = 0x95, UMAD_SA_METHOD_GET_MULTI = 0x14, UMAD_SA_METHOD_GET_MULTI_RESP = 0x94, UMAD_SA_METHOD_GET_TRACE_TABLE = 0x13 }; enum { UMAD_SA_STATUS_SUCCESS = 0, UMAD_SA_STATUS_NO_RESOURCES = 1, UMAD_SA_STATUS_REQ_INVALID = 2, UMAD_SA_STATUS_NO_RECORDS = 3, UMAD_SA_STATUS_TOO_MANY_RECORDS = 4, UMAD_SA_STATUS_INVALID_GID = 5, UMAD_SA_STATUS_INSUF_COMPS = 6, UMAD_SA_STATUS_REQ_DENIED = 7, UMAD_SA_STATUS_PRI_SUGGESTED = 8 }; /* SA attributes */ enum { UMAD_SA_ATTR_NODE_REC = 0x0011, UMAD_SA_ATTR_PORT_INFO_REC = 0x0012, UMAD_SA_ATTR_SLVL_REC = 0x0013, UMAD_SA_ATTR_SWITCH_INFO_REC = 0x0014, UMAD_SA_ATTR_LINEAR_FT_REC = 0x0015, UMAD_SA_ATTR_RANDOM_FT_REC = 0x0016, UMAD_SA_ATTR_MCAST_FT_REC = 0x0017, UMAD_SA_ATTR_SM_INFO_REC = 0x0018, UMAD_SA_ATTR_LINK_SPD_WIDTH_TABLE_REC = 0x0019, UMAD_SA_ATTR_INFORM_INFO_REC = 0x00F3, UMAD_SA_ATTR_LINK_REC = 0x0020, UMAD_SA_ATTR_GUID_INFO_REC = 0x0030, UMAD_SA_ATTR_SERVICE_REC = 0x0031, UMAD_SA_ATTR_PKEY_TABLE_REC = 0x0033, UMAD_SA_ATTR_PATH_REC = 0x0035, UMAD_SA_ATTR_VL_ARB_REC = 0x0036, UMAD_SA_ATTR_MCMEMBER_REC = 0x0038, UMAD_SA_ATTR_TRACE_REC = 0x0039, UMAD_SA_ATTR_MULTI_PATH_REC = 0x003A, UMAD_SA_ATTR_SERVICE_ASSOC_REC = 0x003B, UMAD_SA_ATTR_HIERARCHY_INFO_REC = 0x003C, UMAD_SA_ATTR_CABLE_INFO_REC = 0x003D, UMAD_SA_ATTR_PORT_INFO_EXT_REC = 0x003E }; enum { UMAD_LEN_SA_DATA = 200 }; /* CM bits */ enum { UMAD_SA_CAP_MASK_IS_SUBNET_OPT_REC_SUP = (1 << 8), UMAD_SA_CAP_MASK_IS_UD_MCAST_SUP = (1 << 9), UMAD_SA_CAP_MASK_IS_MULTIPATH_SUP = (1 << 10), UMAD_SA_CAP_MASK_IS_REINIT_SUP = (1 << 11), UMAD_SA_CAP_MASK_IS_GID_SCOPED_MULTIPATH_SUP = (1 << 12), UMAD_SA_CAP_MASK_IS_PORTINFO_CAP_MASK_MATCH_SUP = (1 << 13), UMAD_SA_CAP_MASK_IS_LINK_SPEED_WIDTH_PAIRS_REC_SUP = (1 << 14), UMAD_SA_CAP_MASK_IS_PA_SERVICES_SUP = (1 << 15) }; /* CM2 bits */ enum { UMAD_SA_CAP_MASK2_IS_UNPATH_REPATH_SUP = (1 << 0), UMAD_SA_CAP_MASK2_IS_QOS_SUP = (1 << 1), UMAD_SA_CAP_MASK2_IS_REV_PATH_PKEY_MEM_BIT_SUP = (1 << 2), UMAD_SA_CAP_MASK2_IS_MCAST_TOP_SUP = (1 << 3), UMAD_SA_CAP_MASK2_IS_HIERARCHY_INFO_SUP = (1 << 4), UMAD_SA_CAP_MASK2_IS_ADDITIONAL_GUID_SUP = (1 << 5), UMAD_SA_CAP_MASK2_IS_FULL_PORTINFO_REC_SUP = (1 << 6), UMAD_SA_CAP_MASK2_IS_EXT_SPEEDS_SUP = (1 << 7), UMAD_SA_CAP_MASK2_IS_MCAST_SERVICE_REC_SUP = (1 << 8), UMAD_SA_CAP_MASK2_IS_CABLE_INFO_REC_SUP = (1 << 9), UMAD_SA_CAP_MASK2_IS_PORT_INFO_CAPMASK2_MATCH_SUP = (1 << 10), UMAD_SA_CAP_MASK2_IS_PORT_INFO_EXT_REC_SUP = (1 << 11) }; /* * Shared by SA MCMemberRecord, PathRecord, and MultiPathRecord */ enum { UMAD_SA_SELECTOR_GREATER_THAN = 0, UMAD_SA_SELECTOR_LESS_THAN = 1, UMAD_SA_SELECTOR_EXACTLY = 2, UMAD_SA_SELECTOR_LARGEST_AVAIL = 3, /* rate & MTU */ UMAD_SA_SELECTOR_SMALLEST_AVAIL = 3 /* packet lifetime */ }; #define UMAD_SA_SELECTOR_SHIFT 6 #define UMAD_SA_RATE_MTU_PKT_LIFE_MASK 0x3f #define UMAD_SA_SELECTOR_MASK 0x3 /* * sm_key is not aligned on an 8-byte boundary, so is defined as a byte array */ struct umad_sa_packet { struct umad_hdr mad_hdr; struct umad_rmpp_hdr rmpp_hdr; uint8_t sm_key[8]; /* network-byte order */ __be16 attr_offset; __be16 reserved; __be64 comp_mask; uint8_t data[UMAD_LEN_SA_DATA]; /* network-byte order */ }; static inline uint8_t umad_sa_get_rate_mtu_or_life(uint8_t rate_mtu_or_life) { return 
(rate_mtu_or_life & UMAD_SA_RATE_MTU_PKT_LIFE_MASK); } static inline uint8_t umad_sa_set_rate_mtu_or_life(uint8_t selector, uint8_t rate_mtu_or_life) { return (((selector & UMAD_SA_SELECTOR_MASK) << UMAD_SA_SELECTOR_SHIFT) | (rate_mtu_or_life & UMAD_SA_RATE_MTU_PKT_LIFE_MASK)); } #ifdef __cplusplus } #endif #endif /* _UMAD_SA_H */ rdma-core-56.1/libibumad/umad_sa_mcm.h000066400000000000000000000116751477342711600177120ustar00rootroot00000000000000/* * Copyright (c) 2017 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #ifndef _UMAD_SA_MCM_H #define _UMAD_SA_MCM_H #include #include #ifdef __cplusplus extern "C" { #endif /* Component mask bits for MCMemberRecord */ enum { UMAD_SA_MCM_COMP_MASK_MGID = (1ULL << 0), UMAD_SA_MCM_COMP_MASK_PORT_GID = (1ULL << 1), UMAD_SA_MCM_COMP_MASK_QKEY = (1ULL << 2), UMAD_SA_MCM_COMP_MASK_MLID = (1ULL << 3), UMAD_SA_MCM_COMP_MASK_MTU_SEL = (1ULL << 4), UMAD_SA_MCM_COMP_MASK_MTU = (1ULL << 5), UMAD_SA_MCM_COMP_MASK_TCLASS = (1ULL << 6), UMAD_SA_MCM_COMP_MASK_PKEY = (1ULL << 7), UMAD_SA_MCM_COMP_MASK_RATE_SEL = (1ULL << 8), UMAD_SA_MCM_COMP_MASK_RATE = (1ULL << 9), UMAD_SA_MCM_COMP_MASK_LIFE_TIME_SEL = (1ULL << 10), UMAD_SA_MCM_COMP_MASK_LIFE_TIME = (1ULL << 11), UMAD_SA_MCM_COMP_MASK_SL = (1ULL << 12), UMAD_SA_MCM_COMP_MASK_FLOW_LABEL = (1ULL << 13), UMAD_SA_MCM_COMP_MASK_HOP_LIMIT = (1ULL << 14), UMAD_SA_MCM_COMP_MASK_SCOPE = (1ULL << 15), UMAD_SA_MCM_COMP_MASK_JOIN_STATE = (1ULL << 16), UMAD_SA_MCM_COMP_MASK_PROXY_JOIN = (1ULL << 17) }; enum { UMAD_SA_MCM_JOIN_STATE_FULL_MEMBER = (1 << 0), UMAD_SA_MCM_JOIN_STATE_NON_MEMBER = (1 << 1), UMAD_SA_MCM_JOIN_STATE_SEND_ONLY_NON_MEMBER = (1 << 2), UMAD_SA_MCM_JOIN_STATE_SEND_ONLY_FULL_MEMBER = (1 << 3) }; enum { UMAD_SA_MCM_ADDR_SCOPE_LINK_LOCAL = 0x2, UMAD_SA_MCM_ADDR_SCOPE_SITE_LOCAL = 0x5, UMAD_SA_MCM_ADDR_SCOPE_ORG_LOCAL = 0x8, UMAD_SA_MCM_ADDR_SCOPE_GLOBAL = 0xE, }; struct umad_sa_mcmember_record { uint8_t mgid[16]; /* network-byte order */ uint8_t portgid[16]; /* network-byte order */ __be32 qkey; __be16 mlid; uint8_t mtu; /* 2 bit selector included */ uint8_t tclass; __be16 pkey; uint8_t rate; /* 2 bit selector included */ uint8_t pkt_life; /* 2 bit selector included */ __be32 sl_flow_hop; /* SL: 4 bits, FlowLabel: 20 bits, */ /* HopLimit: 8 bits */ uint8_t scope_state; /* Scope: 4 bits, JoinState: 4 bits */ uint8_t proxy_join; /* ProxyJoin: 1 bit (computed by SA) */ uint8_t reserved[2]; uint8_t pad[4]; /* SA records are multiple of 8 bytes */ }; static inline void umad_sa_mcm_get_sl_flow_hop(__be32 sl_flow_hop, uint8_t * const p_sl, uint32_t * const p_flow_lbl, uint8_t * const p_hop) { uint32_t tmp; tmp = ntohl(sl_flow_hop); if (p_hop) *p_hop = (uint8_t) tmp; tmp >>= 8; if (p_flow_lbl) *p_flow_lbl = (uint32_t) (tmp & 0xfffff); tmp >>= 20; if (p_sl) *p_sl = (uint8_t) tmp; } static inline __be32 umad_sa_mcm_set_sl_flow_hop(uint8_t sl, uint32_t flow_label, uint8_t hop_limit) { uint32_t tmp; tmp = (sl << 28) | ((flow_label & 0xfffff) << 8) | hop_limit; return htonl(tmp); } static inline void umad_sa_mcm_get_scope_state(const uint8_t scope_state, uint8_t * const p_scope, uint8_t * const p_state) { uint8_t tmp_scope_state; if (p_state) *p_state = (uint8_t) (scope_state & 0x0f); tmp_scope_state = scope_state >> 4; if (p_scope) *p_scope = (uint8_t) (tmp_scope_state & 0x0f); } static inline uint8_t umad_sa_mcm_set_scope_state(const uint8_t scope, const uint8_t state) { uint8_t scope_state; scope_state = scope; scope_state = scope_state << 4; scope_state = scope_state | state; return scope_state; } static inline void umad_sa_mcm_set_join_state(struct umad_sa_mcmember_record *p_mc_rec, const uint8_t state) { /* keep the scope as it is */ p_mc_rec->scope_state = (p_mc_rec->scope_state & 0xf0) | (0x0f & state); } static inline int umad_sa_mcm_get_proxy_join(struct umad_sa_mcmember_record *p_mc_rec) { return ((p_mc_rec->proxy_join & 0x80) == 0x80); } #ifdef __cplusplus } #endif #endif /* _UMAD_SA_MCM_H */ rdma-core-56.1/libibumad/umad_sm.h000066400000000000000000000075261477342711600170720ustar00rootroot00000000000000/* * 
Copyright (c) 2004-2014 Mellanox Technologies Ltd. All rights reserved. * Copyright (c) 2004 Infinicon Corporation. All rights reserved. * Copyright (c) 2004 Intel Corporation. All rights reserved. * Copyright (c) 2004 Topspin Corporation. All rights reserved. * Copyright (c) 2004 Voltaire Corporation. All rights reserved. * Copyright (c) 2013 Oracle and/or its affiliates. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef _UMAD_SM_H #define _UMAD_SM_H #include #ifdef __cplusplus extern "C" { #endif enum { UMAD_SMP_DIRECTION = 0x8000, }; /* Subnet management attributes */ enum { UMAD_SM_ATTR_NODE_DESC = 0x0010, UMAD_SM_ATTR_NODE_INFO = 0x0011, UMAD_SM_ATTR_SWITCH_INFO = 0x0012, UMAD_SM_ATTR_GUID_INFO = 0x0014, UMAD_SM_ATTR_PORT_INFO = 0x0015, UMAD_SM_ATTR_PKEY_TABLE = 0x0016, UMAD_SM_ATTR_SLVL_TABLE = 0x0017, UMAD_SM_ATTR_VL_ARB_TABLE = 0x0018, UMAD_SM_ATTR_LINEAR_FT = 0x0019, UMAD_SM_ATTR_RANDOM_FT = 0x001A, UMAD_SM_ATTR_MCAST_FT = 0x001B, UMAD_SM_ATTR_LINK_SPD_WIDTH_TABLE = 0x001C, UMAD_SM_ATTR_VENDOR_MADS_TABLE = 0x001D, UMAD_SM_ATTR_HIERARCHY_INFO = 0x001E, UMAD_SM_ATTR_SM_INFO = 0x0020, UMAD_SM_ATTR_VENDOR_DIAG = 0x0030, UMAD_SM_ATTR_LED_INFO = 0x0031, UMAD_SM_ATTR_CABLE_INFO = 0x0032, UMAD_SM_ATTR_PORT_INFO_EXT = 0x0033, UMAD_SM_ATTR_VENDOR_MASK = 0xFF00, UMAD_SM_ATTR_MLNX_EXT_PORT_INFO = 0xFF90 }; enum { UMAD_SM_GID_IN_SERVICE_TRAP = 64, UMAD_SM_GID_OUT_OF_SERVICE_TRAP = 65, UMAD_SM_MGID_CREATED_TRAP = 66, UMAD_SM_MGID_DESTROYED_TRAP = 67, UMAD_SM_UNPATH_TRAP = 68, UMAD_SM_REPATH_TRAP = 69, UMAD_SM_LINK_STATE_CHANGED_TRAP = 128, UMAD_SM_LINK_INTEGRITY_THRESHOLD_TRAP = 129, UMAD_SM_BUFFER_OVERRUN_THRESHOLD_TRAP = 130, UMAD_SM_WATCHDOG_TIMER_EXPIRED_TRAP = 131, UMAD_SM_LOCAL_CHANGES_TRAP = 144, UMAD_SM_SYS_IMG_GUID_CHANGED_TRAP = 145, UMAD_SM_BAD_MKEY_TRAP = 256, UMAD_SM_BAD_PKEY_TRAP = 257, UMAD_SM_BAD_QKEY_TRAP = 258, UMAD_SM_BAD_SWITCH_PKEY_TRAP = 259 }; enum { UMAD_LEN_SMP_DATA = 64, UMAD_SMP_MAX_HOPS = 64 }; struct umad_smp { uint8_t base_version; uint8_t mgmt_class; uint8_t class_version; uint8_t method; __be16 status; uint8_t hop_ptr; uint8_t hop_cnt; __be64 tid; __be16 attr_id; __be16 resv; __be32 attr_mod; __be64 mkey; __be16 dr_slid; __be16 dr_dlid; uint8_t reserved[28]; uint8_t data[UMAD_LEN_SMP_DATA]; uint8_t initial_path[UMAD_SMP_MAX_HOPS]; 
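/* Directed-route addressing: initial_path[] holds the outbound hop vector
 * (one egress port per hop, walked via hop_ptr/hop_cnt), while return_path[]
 * is recorded hop by hop so the response can be routed back to the requester. */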
uint8_t return_path[UMAD_SMP_MAX_HOPS]; }; #ifdef __cplusplus } #endif #endif /* _UMAD_SM_H */ rdma-core-56.1/libibumad/umad_str.c000066400000000000000000000235771477342711600172570ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005, 2010 Intel Corporation. All rights reserved. * Copyright (c) 2013 Lawrence Livermore National Security. All rights reserved. * Copyright (c) 2014 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include #include #include #include "umad_str.h" const char * umad_class_str(uint8_t mgmt_class) { switch (mgmt_class) { case UMAD_CLASS_SUBN_LID_ROUTED: case UMAD_CLASS_SUBN_DIRECTED_ROUTE: return("Subn"); case UMAD_CLASS_SUBN_ADM: return("SubnAdm"); case UMAD_CLASS_PERF_MGMT: return("Perf"); case UMAD_CLASS_BM: return("BM"); case UMAD_CLASS_DEVICE_MGMT: return("DevMgt"); case UMAD_CLASS_CM: return("ComMgt"); case UMAD_CLASS_SNMP: return("SNMP"); case UMAD_CLASS_DEVICE_ADM: return("DevAdm"); case UMAD_CLASS_BOOT_MGMT: return("BootMgt"); case UMAD_CLASS_BIS: return("BIS"); case UMAD_CLASS_CONG_MGMT: return("CongestionManagment"); default: break; } if ((UMAD_CLASS_VENDOR_RANGE1_START <= mgmt_class && mgmt_class <= UMAD_CLASS_VENDOR_RANGE1_END) || (UMAD_CLASS_VENDOR_RANGE2_START <= mgmt_class && mgmt_class <= UMAD_CLASS_VENDOR_RANGE2_END)) return("Vendor"); if (UMAD_CLASS_APPLICATION_START <= mgmt_class && mgmt_class <= UMAD_CLASS_APPLICATION_END) { return("Application"); } return ("<unknown>"); } static const char * umad_common_method_str(uint8_t method) { switch(method) { case UMAD_METHOD_GET: return ("Get"); case UMAD_METHOD_SET: return ("Set"); case UMAD_METHOD_GET_RESP: return ("GetResp"); case UMAD_METHOD_SEND: return ("Send"); case UMAD_METHOD_TRAP: return ("Trap"); case UMAD_METHOD_REPORT: return ("Report"); case UMAD_METHOD_REPORT_RESP: return ("ReportResp"); case UMAD_METHOD_TRAP_REPRESS: return ("TrapRepress"); default: return ("<unknown>"); } } const char * umad_sa_mad_status_str(__be16 status) { switch ((be16toh(status) & UMAD_STATUS_CLASS_MASK) >> 8) { case UMAD_SA_STATUS_SUCCESS: return ("Success"); case UMAD_SA_STATUS_NO_RESOURCES: return ("No Resources"); case UMAD_SA_STATUS_REQ_INVALID: return ("Request Invalid"); case UMAD_SA_STATUS_NO_RECORDS: return ("No Records"); case UMAD_SA_STATUS_TOO_MANY_RECORDS: return ("Too Many Records"); case UMAD_SA_STATUS_INVALID_GID: return ("Invalid
GID"); case UMAD_SA_STATUS_INSUF_COMPS: return ("Insufficient Components"); case UMAD_SA_STATUS_REQ_DENIED: return ("Request Denied"); case UMAD_SA_STATUS_PRI_SUGGESTED: return ("Priority Suggested"); } return ("Undefined Error"); } static const char *umad_common_attr_str(__be16 attr_id) { switch(be16toh(attr_id)) { case UMAD_ATTR_CLASS_PORT_INFO: return "Class Port Info"; case UMAD_ATTR_NOTICE: return "Notice"; case UMAD_ATTR_INFORM_INFO: return "Inform Info"; default: return ""; } } static const char * umad_sm_attr_str(__be16 attr_id) { switch(be16toh(attr_id)) { case UMAD_SM_ATTR_NODE_DESC: return ("NodeDescription"); case UMAD_SM_ATTR_NODE_INFO: return ("NodeInfo"); case UMAD_SM_ATTR_SWITCH_INFO: return ("SwitchInfo"); case UMAD_SM_ATTR_GUID_INFO: return ("GUIDInfo"); case UMAD_SM_ATTR_PORT_INFO: return ("PortInfo"); case UMAD_SM_ATTR_PKEY_TABLE: return ("P_KeyTable"); case UMAD_SM_ATTR_SLVL_TABLE: return ("SLtoVLMappingTable"); case UMAD_SM_ATTR_VL_ARB_TABLE: return ("VLArbitrationTable"); case UMAD_SM_ATTR_LINEAR_FT: return ("LinearForwardingTable"); case UMAD_SM_ATTR_RANDOM_FT: return ("RandomForwardingTable"); case UMAD_SM_ATTR_MCAST_FT: return ("MulticastForwardingTable"); case UMAD_SM_ATTR_SM_INFO: return ("SMInfo"); case UMAD_SM_ATTR_VENDOR_DIAG: return ("VendorDiag"); case UMAD_SM_ATTR_LED_INFO: return ("LedInfo"); case UMAD_SM_ATTR_LINK_SPD_WIDTH_TABLE: return ("LinkSpeedWidthPairsTable"); case UMAD_SM_ATTR_VENDOR_MADS_TABLE: return ("VendorSpecificMadsTable"); case UMAD_SM_ATTR_HIERARCHY_INFO: return ("HierarchyInfo"); case UMAD_SM_ATTR_CABLE_INFO: return ("CableInfo"); case UMAD_SM_ATTR_PORT_INFO_EXT: return ("PortInfoExtended"); default: return (umad_common_attr_str(attr_id)); } } static const char * umad_sa_attr_str(__be16 attr_id) { switch(be16toh(attr_id)) { case UMAD_SA_ATTR_NODE_REC: return ("NodeRecord"); case UMAD_SA_ATTR_PORT_INFO_REC: return ("PortInfoRecord"); case UMAD_SA_ATTR_SLVL_REC: return ("SLtoVLMappingTableRecord"); case UMAD_SA_ATTR_SWITCH_INFO_REC: return ("SwitchInfoRecord"); case UMAD_SA_ATTR_LINEAR_FT_REC: return ("LinearForwardingTableRecord"); case UMAD_SA_ATTR_RANDOM_FT_REC: return ("RandomForwardingTableRecord"); case UMAD_SA_ATTR_MCAST_FT_REC: return ("MulticastForwardingTableRecord"); case UMAD_SA_ATTR_SM_INFO_REC: return ("SMInfoRecord"); case UMAD_SA_ATTR_INFORM_INFO_REC: return ("InformInfoRecord"); case UMAD_SA_ATTR_LINK_REC: return ("LinkRecord"); case UMAD_SA_ATTR_GUID_INFO_REC: return ("GuidInfoRecord"); case UMAD_SA_ATTR_SERVICE_REC: return ("ServiceRecord"); case UMAD_SA_ATTR_PKEY_TABLE_REC: return ("P_KeyTableRecord"); case UMAD_SA_ATTR_PATH_REC: return ("PathRecord"); case UMAD_SA_ATTR_VL_ARB_REC: return ("VLArbitrationTableRecord"); case UMAD_SA_ATTR_MCMEMBER_REC: return ("MCMemberRecord"); case UMAD_SA_ATTR_TRACE_REC: return ("TraceRecord"); case UMAD_SA_ATTR_MULTI_PATH_REC: return ("MultiPathRecord"); case UMAD_SA_ATTR_SERVICE_ASSOC_REC: return ("ServiceAssociationRecord"); case UMAD_SA_ATTR_LINK_SPD_WIDTH_TABLE_REC: return ("LinkSpeedWidthPairsTableRecord"); case UMAD_SA_ATTR_HIERARCHY_INFO_REC: return ("HierarchyInfoRecord"); case UMAD_SA_ATTR_CABLE_INFO_REC: return ("CableInfoRecord"); case UMAD_SA_ATTR_PORT_INFO_EXT_REC: return ("PortInfoExtendedRecord"); default: return (umad_common_attr_str(attr_id)); } } static const char * umad_cm_attr_str(__be16 attr_id) { switch(be16toh(attr_id)) { case UMAD_CM_ATTR_REQ: return "ConnectRequest"; case UMAD_CM_ATTR_MRA: return "MsgRcptAck"; case UMAD_CM_ATTR_REJ: return "ConnectReject"; 
case UMAD_CM_ATTR_REP: return "ConnectReply"; case UMAD_CM_ATTR_RTU: return "ReadyToUse"; case UMAD_CM_ATTR_DREQ: return "DisconnectRequest"; case UMAD_CM_ATTR_DREP: return "DisconnectReply"; case UMAD_CM_ATTR_SIDR_REQ: return "ServiceIDResReq"; case UMAD_CM_ATTR_SIDR_REP: return "ServiceIDResReqResp"; case UMAD_CM_ATTR_LAP: return "LoadAlternatePath"; case UMAD_CM_ATTR_APR: return "AlternatePathResponse"; case UMAD_CM_ATTR_SAP: return "SuggestAlternatePath"; case UMAD_CM_ATTR_SPR: return "SuggestPathResponse"; default: return (umad_common_attr_str(attr_id)); } } const char * umad_attribute_str(uint8_t mgmt_class, __be16 attr_id) { switch (mgmt_class) { case UMAD_CLASS_SUBN_LID_ROUTED: case UMAD_CLASS_SUBN_DIRECTED_ROUTE: return(umad_sm_attr_str(attr_id)); case UMAD_CLASS_SUBN_ADM: return(umad_sa_attr_str(attr_id)); case UMAD_CLASS_CM: return(umad_cm_attr_str(attr_id)); } return (umad_common_attr_str(attr_id)); } rdma-core-56.1/libibumad/umad_str.h000066400000000000000000000037001477342711600172510ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005, 2010 Intel Corporation. All rights reserved. * Copyright (c) 2013 Lawrence Livermore National Security. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #ifndef _UMAD_STR_H #define _UMAD_STR_H #include #ifdef __cplusplus extern "C" { #endif const char * umad_class_str(uint8_t mgmt_class); const char * umad_method_str(uint8_t mgmt_class, uint8_t method); const char * umad_attribute_str(uint8_t mgmt_class, __be16 attr_id); const char * umad_common_mad_status_str(__be16 status); const char * umad_sa_mad_status_str(__be16 status); #ifdef __cplusplus } #endif #endif /* _UMAD_STR_H */ rdma-core-56.1/libibumad/umad_types.h000066400000000000000000000130341477342711600176060ustar00rootroot00000000000000/* * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved. * Copyright (c) 2004 Infinicon Corporation. All rights reserved. * Copyright (c) 2004, 2010 Intel Corporation. All rights reserved. * Copyright (c) 2004 Topspin Corporation. All rights reserved. * Copyright (c) 2004-2006 Voltaire Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #ifndef _UMAD_TYPES_H #define _UMAD_TYPES_H #include #include #ifdef __cplusplus extern "C" { #endif #define UMAD_BASE_VERSION 1 #define UMAD_QKEY 0x80010000 /* Management classes */ enum { UMAD_CLASS_SUBN_LID_ROUTED = 0x01, UMAD_CLASS_SUBN_DIRECTED_ROUTE = 0x81, UMAD_CLASS_SUBN_ADM = 0x03, UMAD_CLASS_PERF_MGMT = 0x04, UMAD_CLASS_BM = 0x05, UMAD_CLASS_DEVICE_MGMT = 0x06, UMAD_CLASS_CM = 0x07, UMAD_CLASS_SNMP = 0x08, UMAD_CLASS_VENDOR_RANGE1_START = 0x09, UMAD_CLASS_VENDOR_RANGE1_END = 0x0F, UMAD_CLASS_APPLICATION_START = 0x10, UMAD_CLASS_DEVICE_ADM = UMAD_CLASS_APPLICATION_START, UMAD_CLASS_BOOT_MGMT = 0x11, UMAD_CLASS_BIS = 0x12, UMAD_CLASS_CONG_MGMT = 0x21, UMAD_CLASS_APPLICATION_END = 0x2F, UMAD_CLASS_VENDOR_RANGE2_START = 0x30, UMAD_CLASS_VENDOR_RANGE2_END = 0x4F }; /* Management methods */ enum { UMAD_METHOD_GET = 0x01, UMAD_METHOD_SET = 0x02, UMAD_METHOD_GET_RESP = 0x81, UMAD_METHOD_SEND = 0x03, UMAD_METHOD_TRAP = 0x05, UMAD_METHOD_REPORT = 0x06, UMAD_METHOD_REPORT_RESP = 0x86, UMAD_METHOD_TRAP_REPRESS = 0x07, UMAD_METHOD_RESP_MASK = 0x80 }; enum { UMAD_STATUS_SUCCESS = 0x0000, UMAD_STATUS_BUSY = 0x0001, UMAD_STATUS_REDIRECT = 0x0002, /* Invalid fields, bits 2-4 */ UMAD_STATUS_BAD_VERSION = (1 << 2), UMAD_STATUS_METHOD_NOT_SUPPORTED = (2 << 2), UMAD_STATUS_ATTR_NOT_SUPPORTED = (3 << 2), UMAD_STATUS_INVALID_ATTR_VALUE = (7 << 2), UMAD_STATUS_INVALID_FIELD_MASK = 0x001C, UMAD_STATUS_CLASS_MASK = 0xFF00 }; /* Attributes common to multiple classes */ enum { UMAD_ATTR_CLASS_PORT_INFO = 0x0001, UMAD_ATTR_NOTICE = 0x0002, UMAD_ATTR_INFORM_INFO = 0x0003 }; /* RMPP information */ #define UMAD_RMPP_VERSION 1 enum { UMAD_RMPP_FLAG_ACTIVE = 1, }; enum { UMAD_LEN_DATA = 232, UMAD_LEN_RMPP_DATA = 220, UMAD_LEN_DM_DATA = 192, UMAD_LEN_VENDOR_DATA = 216, }; struct umad_hdr { uint8_t base_version; uint8_t mgmt_class; uint8_t class_version; uint8_t method; __be16 status; __be16 class_specific; __be64 tid; __be16 attr_id; __be16 resv; __be32 attr_mod; }; struct umad_rmpp_hdr { uint8_t rmpp_version; uint8_t rmpp_type; uint8_t rmpp_rtime_flags; uint8_t rmpp_status; __be32 seg_num; __be32 paylen_newwin; }; struct umad_packet { struct umad_hdr mad_hdr; uint8_t data[UMAD_LEN_DATA]; /* network-byte order */ }; struct umad_rmpp_packet { struct umad_hdr mad_hdr; struct umad_rmpp_hdr rmpp_hdr; uint8_t data[UMAD_LEN_RMPP_DATA]; /* 
network-byte order */ }; struct umad_dm_packet { struct umad_hdr mad_hdr; uint8_t reserved[40]; uint8_t data[UMAD_LEN_DM_DATA]; /* network-byte order */ }; struct umad_vendor_packet { struct umad_hdr mad_hdr; struct umad_rmpp_hdr rmpp_hdr; uint8_t reserved; uint8_t oui[3]; /* network-byte order */ uint8_t data[UMAD_LEN_VENDOR_DATA]; /* network-byte order */ }; enum { UMAD_OPENIB_OUI = 0x001405 }; enum { UMAD_CLASS_RESP_TIME_MASK = 0x1F }; struct umad_class_port_info { uint8_t base_ver; uint8_t class_ver; __be16 cap_mask; __be32 cap_mask2_resp_time; union { uint8_t redir_gid[16] __attribute__((deprecated)); /* network byte order */ union umad_gid redirgid; }; __be32 redir_tc_sl_fl; __be16 redir_lid; __be16 redir_pkey; __be32 redir_qp; __be32 redir_qkey; union { uint8_t trap_gid[16] __attribute__((deprecated)); /* network byte order */ union umad_gid trapgid; }; __be32 trap_tc_sl_fl; __be16 trap_lid; __be16 trap_pkey; __be32 trap_hl_qp; __be32 trap_qkey; }; static inline uint32_t umad_class_cap_mask2(struct umad_class_port_info *cpi) { return (be32toh(cpi->cap_mask2_resp_time) >> 5); } static inline uint8_t umad_class_resp_time(struct umad_class_port_info *cpi) { return (uint8_t)(be32toh(cpi->cap_mask2_resp_time) & UMAD_CLASS_RESP_TIME_MASK); } #ifdef __cplusplus } #endif #endif /* _UMAD_TYPES_H */ rdma-core-56.1/libibverbs/000077500000000000000000000000001477342711600154555ustar00rootroot00000000000000rdma-core-56.1/libibverbs/CMakeLists.txt000066400000000000000000000037341477342711600202240ustar00rootroot00000000000000publish_headers(infiniband arch.h opcode.h sa-kern-abi.h sa.h verbs.h verbs_api.h tm_types.h ) publish_internal_headers(infiniband cmd_ioctl.h cmd_write.h driver.h kern-abi.h marshall.h ) configure_file("libibverbs.map.in" "${CMAKE_CURRENT_BINARY_DIR}/libibverbs.map" @ONLY) rdma_library(ibverbs "${CMAKE_CURRENT_BINARY_DIR}/libibverbs.map" # See Documentation/versioning.md 1 1.14.${PACKAGE_VERSION} all_providers.c cmd.c cmd_ah.c cmd_counters.c cmd_cq.c cmd_device.c cmd_dm.c cmd_fallback.c cmd_flow.c cmd_flow_action.c cmd_ioctl.c cmd_mr.c cmd_mw.c cmd_pd.c cmd_qp.c cmd_rwq_ind.c cmd_srq.c cmd_wq.c cmd_xrcd.c compat-1_0.c device.c dummy_ops.c dynamic_driver.c enum_strs.c ibdev_nl.c init.c marshall.c memory.c neigh.c static_driver.c sysfs.c verbs.c ) target_link_libraries(ibverbs LINK_PRIVATE ${NL_LIBRARIES} ${CMAKE_THREAD_LIBS_INIT} ${CMAKE_DL_LIBS} kern-abi ) function(ibverbs_finalize) if (ENABLE_STATIC) # In static mode the .pc file lists all of the providers for static # linking. The user should set RDMA_STATIC_PROVIDERS to select which ones # to include. list(LENGTH RDMA_PROVIDER_LIST LEN) math(EXPR LEN ${LEN}-1) foreach(I RANGE 0 ${LEN} 2) list(GET RDMA_PROVIDER_LIST ${I} PROVIDER_NAME) math(EXPR I ${I}+1) list(GET RDMA_PROVIDER_LIST ${I} LIB_NAME) math(EXPR I ${I}+1) set(PROVIDER_LIBS "${PROVIDER_LIBS} -l${LIB_NAME}") set(FOR_EACH_PROVIDER "${FOR_EACH_PROVIDER} FOR_PROVIDER(${PROVIDER_NAME})") endforeach() if (NOT NL_KIND EQUAL 0) set(REQUIRES "libnl-3.0, libnl-route-3.0") endif() rdma_pkg_config("ibverbs" "${REQUIRES}" "${PROVIDER_LIBS} -libverbs ${CMAKE_THREAD_LIBS_INIT}") file(WRITE ${BUILD_INCLUDE}/infiniband/all_providers.h "#define FOR_EACH_PROVIDER() ${FOR_EACH_PROVIDER}") else() rdma_pkg_config("ibverbs" "" "${CMAKE_THREAD_LIBS_INIT}") endif() endfunction() rdma-core-56.1/libibverbs/all_providers.c000066400000000000000000000041431477342711600204700ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. 
* * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifdef _STATIC_LIBRARY_BUILD_ #define RDMA_STATIC_PROVIDERS none #include #include #include /* When static linking this object will be included in the final link only if * something refers to the 'verbs_provider_all' symbol. It in turn brings all * the providers into the link as well. Otherwise the static linker will not * include this. It is important this is the only thing in this file. */ #define FOR_PROVIDER(x) &verbs_provider_ ## x, static const struct verbs_device_ops *all_providers[] = { FOR_EACH_PROVIDER() NULL }; const struct verbs_device_ops verbs_provider_all = { .static_providers = all_providers, }; #endif rdma-core-56.1/libibverbs/arch.h000066400000000000000000000036231477342711600165470ustar00rootroot00000000000000/* * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef INFINIBAND_ARCH_H #define INFINIBAND_ARCH_H #include #include #warning "This header is obsolete." 
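/* A concrete sketch of the direct <endian.h> calls recommended below
 * (illustrative only; "host_val" is a placeholder, not part of this header):
 *
 *	uint64_t wire = htobe64(host_val);	// host to big-endian
 *	uint64_t host = be64toh(wire);		// big-endian back to host
 */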
#ifndef ntohll #undef htonll #undef ntohll /* Users should use the glibc functions directly, not these wrappers */ static inline __attribute__((deprecated)) uint64_t htonll(uint64_t x) { return htobe64(x); } static inline __attribute__((deprecated)) uint64_t ntohll(uint64_t x) { return be64toh(x); } #define htonll htonll #define ntohll ntohll #endif /* Barrier macros are no longer provided by libibverbs */ #endif /* INFINIBAND_ARCH_H */ rdma-core-56.1/libibverbs/cmd.c000066400000000000000000001072371477342711600163760ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005 PathScale, Inc. All rights reserved. * Copyright (c) 2006 Cisco Systems, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include "ibverbs.h" #include bool verbs_allow_disassociate_destroy; int ibv_cmd_alloc_pd(struct ibv_context *context, struct ibv_pd *pd, struct ibv_alloc_pd *cmd, size_t cmd_size, struct ib_uverbs_alloc_pd_resp *resp, size_t resp_size) { int ret; ret = execute_cmd_write(context, IB_USER_VERBS_CMD_ALLOC_PD, cmd, cmd_size, resp, resp_size); if (ret) return ret; pd->handle = resp->pd_handle; pd->context = context; return 0; } int ibv_cmd_open_xrcd(struct ibv_context *context, struct verbs_xrcd *xrcd, int vxrcd_size, struct ibv_xrcd_init_attr *attr, struct ibv_open_xrcd *cmd, size_t cmd_size, struct ib_uverbs_open_xrcd_resp *resp, size_t resp_size) { int ret; if (attr->comp_mask >= IBV_XRCD_INIT_ATTR_RESERVED) return EOPNOTSUPP; if (!(attr->comp_mask & IBV_XRCD_INIT_ATTR_FD) || !(attr->comp_mask & IBV_XRCD_INIT_ATTR_OFLAGS)) return EINVAL; cmd->fd = attr->fd; cmd->oflags = attr->oflags; ret = execute_cmd_write(context, IB_USER_VERBS_CMD_OPEN_XRCD, cmd, cmd_size, resp, resp_size); if (ret) return ret; xrcd->xrcd.context = context; xrcd->comp_mask = 0; if (vext_field_avail(struct verbs_xrcd, handle, vxrcd_size)) { xrcd->comp_mask = VERBS_XRCD_HANDLE; xrcd->handle = resp->xrcd_handle; } return 0; } int ibv_cmd_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access, struct verbs_mr *vmr, struct ibv_reg_mr *cmd, size_t cmd_size, struct ib_uverbs_reg_mr_resp *resp, size_t resp_size) { int ret; cmd->start = (uintptr_t) addr; cmd->length = length; /* On demand access and entire address space means implicit. * In that case set the value in the command to what kernel expects. */ if (access & IBV_ACCESS_ON_DEMAND) { if (length == SIZE_MAX && addr) { errno = EINVAL; return EINVAL; } if (length == SIZE_MAX) cmd->length = UINT64_MAX; } cmd->hca_va = hca_va; cmd->pd_handle = pd->handle; cmd->access_flags = access; ret = execute_cmd_write(pd->context, IB_USER_VERBS_CMD_REG_MR, cmd, cmd_size, resp, resp_size); if (ret) return ret; vmr->ibv_mr.handle = resp->mr_handle; vmr->ibv_mr.lkey = resp->lkey; vmr->ibv_mr.rkey = resp->rkey; vmr->ibv_mr.context = pd->context; vmr->mr_type = IBV_MR_TYPE_MR; vmr->access = access; return 0; } int ibv_cmd_rereg_mr(struct verbs_mr *vmr, uint32_t flags, void *addr, size_t length, uint64_t hca_va, int access, struct ibv_pd *pd, struct ibv_rereg_mr *cmd, size_t cmd_sz, struct ib_uverbs_rereg_mr_resp *resp, size_t resp_sz) { int ret; cmd->mr_handle = vmr->ibv_mr.handle; cmd->flags = flags; cmd->start = (uintptr_t)addr; cmd->length = length; cmd->hca_va = hca_va; cmd->pd_handle = (flags & IBV_REREG_MR_CHANGE_PD) ? 
pd->handle : 0; cmd->access_flags = access; ret = execute_cmd_write(vmr->ibv_mr.context, IB_USER_VERBS_CMD_REREG_MR, cmd, cmd_sz, resp, resp_sz); if (ret) return ret; vmr->ibv_mr.lkey = resp->lkey; vmr->ibv_mr.rkey = resp->rkey; if (flags & IBV_REREG_MR_CHANGE_PD) vmr->ibv_mr.context = pd->context; return 0; } int ibv_cmd_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type, struct ibv_mw *mw, struct ibv_alloc_mw *cmd, size_t cmd_size, struct ib_uverbs_alloc_mw_resp *resp, size_t resp_size) { int ret; cmd->pd_handle = pd->handle; cmd->mw_type = type; memset(cmd->reserved, 0, sizeof(cmd->reserved)); ret = execute_cmd_write(pd->context, IB_USER_VERBS_CMD_ALLOC_MW, cmd, cmd_size, resp, resp_size); if (ret) return ret; mw->context = pd->context; mw->pd = pd; mw->rkey = resp->rkey; mw->handle = resp->mw_handle; mw->type = type; return 0; } int ibv_cmd_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc) { struct ibv_poll_cq cmd; struct ib_uverbs_poll_cq_resp *resp; int i; int rsize; int ret; rsize = sizeof *resp + ne * sizeof(struct ib_uverbs_wc); resp = malloc(rsize); if (!resp) return -1; cmd.cq_handle = ibcq->handle; cmd.ne = ne; ret = execute_cmd_write_no_uhw(ibcq->context, IB_USER_VERBS_CMD_POLL_CQ, &cmd, sizeof(cmd), resp, rsize); if (ret) { ret = -1; goto out; } for (i = 0; i < resp->count; i++) { wc[i].wr_id = resp->wc[i].wr_id; wc[i].status = resp->wc[i].status; wc[i].opcode = resp->wc[i].opcode; wc[i].vendor_err = resp->wc[i].vendor_err; wc[i].byte_len = resp->wc[i].byte_len; wc[i].imm_data = resp->wc[i].ex.imm_data; wc[i].qp_num = resp->wc[i].qp_num; wc[i].src_qp = resp->wc[i].src_qp; wc[i].wc_flags = resp->wc[i].wc_flags; wc[i].pkey_index = resp->wc[i].pkey_index; wc[i].slid = resp->wc[i].slid; wc[i].sl = resp->wc[i].sl; wc[i].dlid_path_bits = resp->wc[i].dlid_path_bits; } ret = resp->count; out: free(resp); return ret; } int ibv_cmd_req_notify_cq(struct ibv_cq *ibcq, int solicited_only) { struct ibv_req_notify_cq req; req.core_payload = (struct ib_uverbs_req_notify_cq){ .cq_handle = ibcq->handle, .solicited_only = !!solicited_only, }; return execute_cmd_write_req(ibcq->context, IB_USER_VERBS_CMD_REQ_NOTIFY_CQ, &req, sizeof(req)); } int ibv_cmd_resize_cq(struct ibv_cq *cq, int cqe, struct ibv_resize_cq *cmd, size_t cmd_size, struct ib_uverbs_resize_cq_resp *resp, size_t resp_size) { int ret; cmd->cq_handle = cq->handle; cmd->cqe = cqe; ret = execute_cmd_write(cq->context, IB_USER_VERBS_CMD_RESIZE_CQ, cmd, cmd_size, resp, resp_size); if (ret) return ret; cq->cqe = resp->cqe; return 0; } static int ibv_cmd_modify_srq_v3(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, int srq_attr_mask, struct ibv_modify_srq *new_cmd, size_t new_cmd_size) { struct ibv_modify_srq_v3 *cmd; size_t cmd_size; cmd_size = sizeof *cmd + new_cmd_size - sizeof *new_cmd; cmd = alloca(cmd_size); memcpy(cmd + 1, new_cmd + 1, new_cmd_size - sizeof *new_cmd); cmd->core_payload = (struct ib_uverbs_modify_srq_v3){ .srq_handle = srq->handle, .attr_mask = srq_attr_mask, .max_wr = srq_attr->max_wr, .srq_limit = srq_attr->srq_limit, }; return execute_cmd_write_req( srq->context, IB_USER_VERBS_CMD_MODIFY_SRQ_V3, cmd, cmd_size); } int ibv_cmd_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, int srq_attr_mask, struct ibv_modify_srq *cmd, size_t cmd_size) { if (abi_ver == 3) return ibv_cmd_modify_srq_v3(srq, srq_attr, srq_attr_mask, cmd, cmd_size); cmd->srq_handle = srq->handle; cmd->attr_mask = srq_attr_mask; cmd->max_wr = srq_attr->max_wr; cmd->srq_limit = srq_attr->srq_limit; return 
execute_cmd_write_req(srq->context, IB_USER_VERBS_CMD_MODIFY_SRQ, cmd, cmd_size); } int ibv_cmd_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, struct ibv_query_srq *cmd, size_t cmd_size) { struct ib_uverbs_query_srq_resp resp; int ret; cmd->srq_handle = srq->handle; cmd->reserved = 0; ret = execute_cmd_write(srq->context, IB_USER_VERBS_CMD_QUERY_SRQ, cmd, cmd_size, &resp, sizeof(resp)); if (ret) return ret; srq_attr->max_wr = resp.max_wr; srq_attr->max_sge = resp.max_sge; srq_attr->srq_limit = resp.srq_limit; return 0; } enum { CREATE_QP_EX2_SUP_CREATE_FLAGS = IBV_QP_CREATE_BLOCK_SELF_MCAST_LB | IBV_QP_CREATE_SCATTER_FCS | IBV_QP_CREATE_CVLAN_STRIPPING | IBV_QP_CREATE_SOURCE_QPN | IBV_QP_CREATE_PCI_WRITE_END_PADDING, }; int ibv_cmd_open_qp(struct ibv_context *context, struct verbs_qp *qp, int vqp_sz, struct ibv_qp_open_attr *attr, struct ibv_open_qp *cmd, size_t cmd_size, struct ib_uverbs_create_qp_resp *resp, size_t resp_size) { struct verbs_xrcd *xrcd; int ret; if (attr->comp_mask >= IBV_QP_OPEN_ATTR_RESERVED) return EOPNOTSUPP; if (!(attr->comp_mask & IBV_QP_OPEN_ATTR_XRCD) || !(attr->comp_mask & IBV_QP_OPEN_ATTR_NUM) || !(attr->comp_mask & IBV_QP_OPEN_ATTR_TYPE)) return EINVAL; xrcd = container_of(attr->xrcd, struct verbs_xrcd, xrcd); cmd->user_handle = (uintptr_t) qp; cmd->pd_handle = xrcd->handle; cmd->qpn = attr->qp_num; cmd->qp_type = attr->qp_type; ret = execute_cmd_write(context, IB_USER_VERBS_CMD_OPEN_QP, cmd, cmd_size, resp, resp_size); if (ret) return ret; qp->qp.handle = resp->qp_handle; qp->qp.context = context; qp->qp.qp_context = attr->qp_context; qp->qp.pd = NULL; qp->qp.send_cq = NULL; qp->qp.recv_cq = NULL; qp->qp.srq = NULL; qp->qp.qp_num = attr->qp_num; qp->qp.qp_type = attr->qp_type; qp->qp.state = IBV_QPS_UNKNOWN; qp->qp.events_completed = 0; pthread_mutex_init(&qp->qp.mutex, NULL); pthread_cond_init(&qp->qp.cond, NULL); qp->comp_mask = 0; if (vext_field_avail(struct verbs_qp, xrcd, vqp_sz)) { qp->comp_mask = VERBS_QP_XRCD; qp->xrcd = xrcd; } return 0; } int ibv_cmd_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr, struct ibv_query_qp *cmd, size_t cmd_size) { struct ib_uverbs_query_qp_resp resp; int ret; /* * Starting with IBV_QP_RATE_LIMIT the attribute must go through the * _ex path. 
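 * (attr_mask & ~(IBV_QP_RATE_LIMIT - 1) is non-zero exactly when a bit at or
 * above IBV_QP_RATE_LIMIT is set: IBV_QP_RATE_LIMIT is a single-bit flag, so
 * subtracting 1 yields a mask of all older attribute bits and ~ keeps only
 * the newer ones.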
*/ if (attr_mask & ~(IBV_QP_RATE_LIMIT - 1)) return EOPNOTSUPP; cmd->qp_handle = qp->handle; cmd->attr_mask = attr_mask; ret = execute_cmd_write(qp->context, IB_USER_VERBS_CMD_QUERY_QP, cmd, cmd_size, &resp, sizeof(resp)); if (ret) return ret; attr->qkey = resp.qkey; attr->rq_psn = resp.rq_psn; attr->sq_psn = resp.sq_psn; attr->dest_qp_num = resp.dest_qp_num; attr->qp_access_flags = resp.qp_access_flags; attr->pkey_index = resp.pkey_index; attr->alt_pkey_index = resp.alt_pkey_index; attr->qp_state = resp.qp_state; attr->cur_qp_state = resp.cur_qp_state; attr->path_mtu = resp.path_mtu; attr->path_mig_state = resp.path_mig_state; attr->sq_draining = resp.sq_draining; attr->max_rd_atomic = resp.max_rd_atomic; attr->max_dest_rd_atomic = resp.max_dest_rd_atomic; attr->min_rnr_timer = resp.min_rnr_timer; attr->port_num = resp.port_num; attr->timeout = resp.timeout; attr->retry_cnt = resp.retry_cnt; attr->rnr_retry = resp.rnr_retry; attr->alt_port_num = resp.alt_port_num; attr->alt_timeout = resp.alt_timeout; attr->cap.max_send_wr = resp.max_send_wr; attr->cap.max_recv_wr = resp.max_recv_wr; attr->cap.max_send_sge = resp.max_send_sge; attr->cap.max_recv_sge = resp.max_recv_sge; attr->cap.max_inline_data = resp.max_inline_data; memcpy(attr->ah_attr.grh.dgid.raw, resp.dest.dgid, 16); attr->ah_attr.grh.flow_label = resp.dest.flow_label; attr->ah_attr.dlid = resp.dest.dlid; attr->ah_attr.grh.sgid_index = resp.dest.sgid_index; attr->ah_attr.grh.hop_limit = resp.dest.hop_limit; attr->ah_attr.grh.traffic_class = resp.dest.traffic_class; attr->ah_attr.sl = resp.dest.sl; attr->ah_attr.src_path_bits = resp.dest.src_path_bits; attr->ah_attr.static_rate = resp.dest.static_rate; attr->ah_attr.is_global = resp.dest.is_global; attr->ah_attr.port_num = resp.dest.port_num; memcpy(attr->alt_ah_attr.grh.dgid.raw, resp.alt_dest.dgid, 16); attr->alt_ah_attr.grh.flow_label = resp.alt_dest.flow_label; attr->alt_ah_attr.dlid = resp.alt_dest.dlid; attr->alt_ah_attr.grh.sgid_index = resp.alt_dest.sgid_index; attr->alt_ah_attr.grh.hop_limit = resp.alt_dest.hop_limit; attr->alt_ah_attr.grh.traffic_class = resp.alt_dest.traffic_class; attr->alt_ah_attr.sl = resp.alt_dest.sl; attr->alt_ah_attr.src_path_bits = resp.alt_dest.src_path_bits; attr->alt_ah_attr.static_rate = resp.alt_dest.static_rate; attr->alt_ah_attr.is_global = resp.alt_dest.is_global; attr->alt_ah_attr.port_num = resp.alt_dest.port_num; init_attr->qp_context = qp->qp_context; init_attr->send_cq = qp->send_cq; init_attr->recv_cq = qp->recv_cq; init_attr->srq = qp->srq; init_attr->qp_type = qp->qp_type; init_attr->cap.max_send_wr = resp.max_send_wr; init_attr->cap.max_recv_wr = resp.max_recv_wr; init_attr->cap.max_send_sge = resp.max_send_sge; init_attr->cap.max_recv_sge = resp.max_recv_sge; init_attr->cap.max_inline_data = resp.max_inline_data; init_attr->sq_sig_all = resp.sq_sig_all; return 0; } static void copy_modify_qp_fields(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ib_uverbs_modify_qp *cmd) { cmd->qp_handle = qp->handle; cmd->attr_mask = attr_mask; if (attr_mask & IBV_QP_STATE) cmd->qp_state = attr->qp_state; if (attr_mask & IBV_QP_CUR_STATE) cmd->cur_qp_state = attr->cur_qp_state; if (attr_mask & IBV_QP_EN_SQD_ASYNC_NOTIFY) cmd->en_sqd_async_notify = attr->en_sqd_async_notify; if (attr_mask & IBV_QP_ACCESS_FLAGS) cmd->qp_access_flags = attr->qp_access_flags; if (attr_mask & IBV_QP_PKEY_INDEX) cmd->pkey_index = attr->pkey_index; if (attr_mask & IBV_QP_PORT) cmd->port_num = attr->port_num; if (attr_mask & IBV_QP_QKEY) cmd->qkey = 
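/*
 * Usage sketch (illustrative; example_modify_qp is hypothetical): a
 * provider hook built on copy_modify_qp_fields() via
 * ibv_cmd_modify_qp() below would look roughly like:
 *
 *	static int example_modify_qp(struct ibv_qp *qp,
 *				     struct ibv_qp_attr *attr, int attr_mask)
 *	{
 *		struct ibv_modify_qp cmd = {};
 *
 *		return ibv_cmd_modify_qp(qp, attr, attr_mask,
 *					 &cmd, sizeof(cmd));
 *	}
 */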
attr->qkey; if (attr_mask & IBV_QP_AV) { memcpy(cmd->dest.dgid, attr->ah_attr.grh.dgid.raw, 16); cmd->dest.flow_label = attr->ah_attr.grh.flow_label; cmd->dest.dlid = attr->ah_attr.dlid; cmd->dest.reserved = 0; cmd->dest.sgid_index = attr->ah_attr.grh.sgid_index; cmd->dest.hop_limit = attr->ah_attr.grh.hop_limit; cmd->dest.traffic_class = attr->ah_attr.grh.traffic_class; cmd->dest.sl = attr->ah_attr.sl; cmd->dest.src_path_bits = attr->ah_attr.src_path_bits; cmd->dest.static_rate = attr->ah_attr.static_rate; cmd->dest.is_global = attr->ah_attr.is_global; cmd->dest.port_num = attr->ah_attr.port_num; } if (attr_mask & IBV_QP_PATH_MTU) cmd->path_mtu = attr->path_mtu; if (attr_mask & IBV_QP_TIMEOUT) cmd->timeout = attr->timeout; if (attr_mask & IBV_QP_RETRY_CNT) cmd->retry_cnt = attr->retry_cnt; if (attr_mask & IBV_QP_RNR_RETRY) cmd->rnr_retry = attr->rnr_retry; if (attr_mask & IBV_QP_RQ_PSN) cmd->rq_psn = attr->rq_psn; if (attr_mask & IBV_QP_MAX_QP_RD_ATOMIC) cmd->max_rd_atomic = attr->max_rd_atomic; if (attr_mask & IBV_QP_ALT_PATH) { cmd->alt_pkey_index = attr->alt_pkey_index; cmd->alt_port_num = attr->alt_port_num; cmd->alt_timeout = attr->alt_timeout; memcpy(cmd->alt_dest.dgid, attr->alt_ah_attr.grh.dgid.raw, 16); cmd->alt_dest.flow_label = attr->alt_ah_attr.grh.flow_label; cmd->alt_dest.dlid = attr->alt_ah_attr.dlid; cmd->alt_dest.reserved = 0; cmd->alt_dest.sgid_index = attr->alt_ah_attr.grh.sgid_index; cmd->alt_dest.hop_limit = attr->alt_ah_attr.grh.hop_limit; cmd->alt_dest.traffic_class = attr->alt_ah_attr.grh.traffic_class; cmd->alt_dest.sl = attr->alt_ah_attr.sl; cmd->alt_dest.src_path_bits = attr->alt_ah_attr.src_path_bits; cmd->alt_dest.static_rate = attr->alt_ah_attr.static_rate; cmd->alt_dest.is_global = attr->alt_ah_attr.is_global; cmd->alt_dest.port_num = attr->alt_ah_attr.port_num; } if (attr_mask & IBV_QP_MIN_RNR_TIMER) cmd->min_rnr_timer = attr->min_rnr_timer; if (attr_mask & IBV_QP_SQ_PSN) cmd->sq_psn = attr->sq_psn; if (attr_mask & IBV_QP_MAX_DEST_RD_ATOMIC) cmd->max_dest_rd_atomic = attr->max_dest_rd_atomic; if (attr_mask & IBV_QP_PATH_MIG_STATE) cmd->path_mig_state = attr->path_mig_state; if (attr_mask & IBV_QP_DEST_QPN) cmd->dest_qp_num = attr->dest_qp_num; cmd->reserved[0] = cmd->reserved[1] = 0; } int ibv_cmd_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_modify_qp *cmd, size_t cmd_size) { /* * Starting with IBV_QP_RATE_LIMIT the attribute must go through the * _ex path. 
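 *
 * ibv_cmd_modify_qp_ex() below accepts IBV_QP_RATE_LIMIT, provided the
 * caller's command buffer is large enough to carry the rate_limit
 * field; otherwise it fails with EINVAL.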
*/ if (attr_mask & ~(IBV_QP_RATE_LIMIT - 1)) return EOPNOTSUPP; copy_modify_qp_fields(qp, attr, attr_mask, &cmd->core_payload); return execute_cmd_write_req(qp->context, IB_USER_VERBS_CMD_MODIFY_QP, cmd, cmd_size); } int ibv_cmd_modify_qp_ex(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_modify_qp_ex *cmd, size_t cmd_size, struct ib_uverbs_ex_modify_qp_resp *resp, size_t resp_size) { copy_modify_qp_fields(qp, attr, attr_mask, &cmd->base); if (attr_mask & IBV_QP_RATE_LIMIT) { if (cmd_size >= offsetof(struct ibv_modify_qp_ex, rate_limit) + sizeof(cmd->rate_limit)) cmd->rate_limit = attr->rate_limit; else return EINVAL; } return execute_cmd_write_ex(qp->context, IB_USER_VERBS_EX_CMD_MODIFY_QP, cmd, cmd_size, resp, resp_size); } int ibv_cmd_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { struct ibv_post_send *cmd; struct ib_uverbs_post_send_resp resp; struct ibv_send_wr *i; struct ib_uverbs_send_wr *n, *tmp; struct ibv_sge *s; unsigned wr_count = 0; unsigned sge_count = 0; int cmd_size; int ret; for (i = wr; i; i = i->next) { wr_count++; sge_count += i->num_sge; } cmd_size = sizeof *cmd + wr_count * sizeof *n + sge_count * sizeof *s; cmd = alloca(cmd_size); cmd->qp_handle = ibqp->handle; cmd->wr_count = wr_count; cmd->sge_count = sge_count; cmd->wqe_size = sizeof *n; n = (struct ib_uverbs_send_wr *) ((void *) cmd + sizeof *cmd); s = (struct ibv_sge *) (n + wr_count); tmp = n; for (i = wr; i; i = i->next) { tmp->wr_id = i->wr_id; tmp->num_sge = i->num_sge; tmp->opcode = i->opcode; tmp->send_flags = i->send_flags; tmp->ex.imm_data = i->imm_data; if (ibqp->qp_type == IBV_QPT_UD) { tmp->wr.ud.ah = i->wr.ud.ah->handle; tmp->wr.ud.remote_qpn = i->wr.ud.remote_qpn; tmp->wr.ud.remote_qkey = i->wr.ud.remote_qkey; } else { switch (i->opcode) { case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_WRITE_WITH_IMM: case IBV_WR_RDMA_READ: tmp->wr.rdma.remote_addr = i->wr.rdma.remote_addr; tmp->wr.rdma.rkey = i->wr.rdma.rkey; break; case IBV_WR_ATOMIC_CMP_AND_SWP: case IBV_WR_ATOMIC_FETCH_AND_ADD: tmp->wr.atomic.remote_addr = i->wr.atomic.remote_addr; tmp->wr.atomic.compare_add = i->wr.atomic.compare_add; tmp->wr.atomic.swap = i->wr.atomic.swap; tmp->wr.atomic.rkey = i->wr.atomic.rkey; break; default: break; } } if (tmp->num_sge) { memcpy(s, i->sg_list, tmp->num_sge * sizeof *s); s += tmp->num_sge; } tmp++; } resp.bad_wr = 0; ret = execute_cmd_write_no_uhw(ibqp->context, IB_USER_VERBS_CMD_POST_SEND, cmd, cmd_size, &resp, sizeof(resp)); wr_count = resp.bad_wr; if (wr_count) { i = wr; while (--wr_count) i = i->next; *bad_wr = i; } else if (ret) *bad_wr = wr; return ret; } int ibv_cmd_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct ibv_post_recv *cmd; struct ib_uverbs_post_recv_resp resp; struct ibv_recv_wr *i; struct ib_uverbs_recv_wr *n, *tmp; struct ibv_sge *s; unsigned wr_count = 0; unsigned sge_count = 0; int cmd_size; int ret; for (i = wr; i; i = i->next) { wr_count++; sge_count += i->num_sge; } cmd_size = sizeof *cmd + wr_count * sizeof *n + sge_count * sizeof *s; cmd = alloca(cmd_size); cmd->qp_handle = ibqp->handle; cmd->wr_count = wr_count; cmd->sge_count = sge_count; cmd->wqe_size = sizeof *n; n = (struct ib_uverbs_recv_wr *) ((void *) cmd + sizeof *cmd); s = (struct ibv_sge *) (n + wr_count); tmp = n; for (i = wr; i; i = i->next) { tmp->wr_id = i->wr_id; tmp->num_sge = i->num_sge; if (tmp->num_sge) { memcpy(s, i->sg_list, tmp->num_sge * sizeof *s); s += tmp->num_sge; } tmp++; } resp.bad_wr = 0; ret = 
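/*
 * resp.bad_wr, when non-zero, is the 1-based position of the work
 * request the kernel rejected; the loop below walks the caller's list
 * to turn that index back into a *bad_wr pointer, mirroring the
 * post-send path above.
 */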
execute_cmd_write_no_uhw(ibqp->context, IB_USER_VERBS_CMD_POST_RECV, cmd, cmd_size, &resp, sizeof(resp)); wr_count = resp.bad_wr; if (wr_count) { i = wr; while (--wr_count) i = i->next; *bad_wr = i; } else if (ret) *bad_wr = wr; return ret; } int ibv_cmd_post_srq_recv(struct ibv_srq *srq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct ibv_post_srq_recv *cmd; struct ib_uverbs_post_srq_recv_resp resp; struct ibv_recv_wr *i; struct ib_uverbs_recv_wr *n, *tmp; struct ibv_sge *s; unsigned wr_count = 0; unsigned sge_count = 0; int cmd_size; int ret; for (i = wr; i; i = i->next) { wr_count++; sge_count += i->num_sge; } cmd_size = sizeof *cmd + wr_count * sizeof *n + sge_count * sizeof *s; cmd = alloca(cmd_size); cmd->srq_handle = srq->handle; cmd->wr_count = wr_count; cmd->sge_count = sge_count; cmd->wqe_size = sizeof *n; n = (struct ib_uverbs_recv_wr *) ((void *) cmd + sizeof *cmd); s = (struct ibv_sge *) (n + wr_count); tmp = n; for (i = wr; i; i = i->next) { tmp->wr_id = i->wr_id; tmp->num_sge = i->num_sge; if (tmp->num_sge) { memcpy(s, i->sg_list, tmp->num_sge * sizeof *s); s += tmp->num_sge; } tmp++; } resp.bad_wr = 0; ret = execute_cmd_write_no_uhw(srq->context, IB_USER_VERBS_CMD_POST_SRQ_RECV, cmd, cmd_size, &resp, sizeof(resp)); wr_count = resp.bad_wr; if (wr_count) { i = wr; while (--wr_count) i = i->next; *bad_wr = i; } else if (ret) *bad_wr = wr; return ret; } int ibv_cmd_create_ah(struct ibv_pd *pd, struct ibv_ah *ah, struct ibv_ah_attr *attr, struct ib_uverbs_create_ah_resp *resp, size_t resp_size) { struct ibv_create_ah cmd; int ret; cmd.user_handle = (uintptr_t) ah; cmd.pd_handle = pd->handle; cmd.reserved = 0; cmd.attr.dlid = attr->dlid; cmd.attr.sl = attr->sl; cmd.attr.src_path_bits = attr->src_path_bits; cmd.attr.static_rate = attr->static_rate; cmd.attr.is_global = attr->is_global; cmd.attr.port_num = attr->port_num; cmd.attr.reserved = 0; cmd.attr.grh.flow_label = attr->grh.flow_label; cmd.attr.grh.sgid_index = attr->grh.sgid_index; cmd.attr.grh.hop_limit = attr->grh.hop_limit; cmd.attr.grh.traffic_class = attr->grh.traffic_class; cmd.attr.grh.reserved = 0; memcpy(cmd.attr.grh.dgid, attr->grh.dgid.raw, 16); ret = execute_cmd_write(pd->context, IB_USER_VERBS_CMD_CREATE_AH, &cmd, sizeof(cmd), resp, resp_size); if (ret) return ret; ah->handle = resp->ah_handle; ah->context = pd->context; return 0; } int ibv_cmd_attach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid) { struct ibv_attach_mcast req; req.core_payload = (struct ib_uverbs_attach_mcast){ .qp_handle = qp->handle, .mlid = lid, }; memcpy(req.gid, gid->raw, sizeof(req.gid)); return execute_cmd_write_req( qp->context, IB_USER_VERBS_CMD_ATTACH_MCAST, &req, sizeof(req)); } int ibv_cmd_detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid) { struct ibv_detach_mcast req; int ret; req.core_payload = (struct ib_uverbs_detach_mcast){ .qp_handle = qp->handle, .mlid = lid, }; memcpy(req.gid, gid->raw, sizeof(req.gid)); ret = execute_cmd_write_req(qp->context, IB_USER_VERBS_CMD_DETACH_MCAST, &req, sizeof(req)); if (verbs_is_destroy_err(&ret)) return ret; return 0; } static int buffer_is_zero(char *addr, ssize_t size) { return addr[0] == 0 && !memcmp(addr, addr + 1, size - 1); } static int get_filters_size(struct ibv_flow_spec *ib_spec, struct ibv_kern_spec *kern_spec, int *ib_filter_size, int *kern_filter_size, enum ibv_flow_spec_type type) { void *ib_spec_filter_mask; int curr_kern_filter_size; int min_filter_size; *ib_filter_size = (ib_spec->hdr.size - sizeof(ib_spec->hdr)) / 2; switch 
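/*
 * Work out, per spec type, how much filter data the kernel ABI can
 * accept (min_filter_size) versus how much the application supplied
 * (*ib_filter_size).  A larger user filter is tolerated only when all
 * excess mask bytes are zero, i.e. the application is not asking to
 * match on fields this kernel does not understand.
 */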
(type) { case IBV_FLOW_SPEC_IPV4_EXT: min_filter_size = offsetof(struct ib_uverbs_flow_ipv4_filter, flags) + sizeof(kern_spec->ipv4_ext.mask.flags); curr_kern_filter_size = min_filter_size; ib_spec_filter_mask = (void *)&ib_spec->ipv4_ext.val + *ib_filter_size; break; case IBV_FLOW_SPEC_IPV6: min_filter_size = offsetof(struct ib_uverbs_flow_ipv6_filter, hop_limit) + sizeof(kern_spec->ipv6.mask.hop_limit); curr_kern_filter_size = min_filter_size; ib_spec_filter_mask = (void *)&ib_spec->ipv6.val + *ib_filter_size; break; case IBV_FLOW_SPEC_VXLAN_TUNNEL: min_filter_size = offsetof(struct ib_uverbs_flow_tunnel_filter, tunnel_id) + sizeof(kern_spec->tunnel.mask.tunnel_id); curr_kern_filter_size = min_filter_size; ib_spec_filter_mask = (void *)&ib_spec->tunnel.val + *ib_filter_size; break; default: return EINVAL; } if (*ib_filter_size < min_filter_size) return EINVAL; if (*ib_filter_size > curr_kern_filter_size && !buffer_is_zero(ib_spec_filter_mask + curr_kern_filter_size, *ib_filter_size - curr_kern_filter_size)) return EOPNOTSUPP; *kern_filter_size = min_t(int, curr_kern_filter_size, *ib_filter_size); return 0; } static int ib_spec_to_kern_spec(struct ibv_flow_spec *ib_spec, struct ibv_kern_spec *kern_spec) { int kern_filter_size; int ib_filter_size; int ret; kern_spec->hdr.type = ib_spec->hdr.type; switch (kern_spec->hdr.type) { case IBV_FLOW_SPEC_ETH: case IBV_FLOW_SPEC_ETH | IBV_FLOW_SPEC_INNER: kern_spec->eth.size = sizeof(struct ib_uverbs_flow_spec_eth); memcpy(&kern_spec->eth.val, &ib_spec->eth.val, sizeof(struct ibv_flow_eth_filter)); memcpy(&kern_spec->eth.mask, &ib_spec->eth.mask, sizeof(struct ibv_flow_eth_filter)); break; case IBV_FLOW_SPEC_IPV4: case IBV_FLOW_SPEC_IPV4 | IBV_FLOW_SPEC_INNER: kern_spec->ipv4.size = sizeof(struct ibv_kern_spec_ipv4); memcpy(&kern_spec->ipv4.val, &ib_spec->ipv4.val, sizeof(struct ibv_flow_ipv4_filter)); memcpy(&kern_spec->ipv4.mask, &ib_spec->ipv4.mask, sizeof(struct ibv_flow_ipv4_filter)); break; case IBV_FLOW_SPEC_IPV4_EXT: case IBV_FLOW_SPEC_IPV4_EXT | IBV_FLOW_SPEC_INNER: ret = get_filters_size(ib_spec, kern_spec, &ib_filter_size, &kern_filter_size, IBV_FLOW_SPEC_IPV4_EXT); if (ret) return ret; kern_spec->hdr.type = IBV_FLOW_SPEC_IPV4 | (IBV_FLOW_SPEC_INNER & ib_spec->hdr.type); kern_spec->ipv4_ext.size = sizeof(struct ib_uverbs_flow_spec_ipv4); memcpy(&kern_spec->ipv4_ext.val, &ib_spec->ipv4_ext.val, kern_filter_size); memcpy(&kern_spec->ipv4_ext.mask, (void *)&ib_spec->ipv4_ext.val + ib_filter_size, kern_filter_size); break; case IBV_FLOW_SPEC_IPV6: case IBV_FLOW_SPEC_IPV6 | IBV_FLOW_SPEC_INNER: ret = get_filters_size(ib_spec, kern_spec, &ib_filter_size, &kern_filter_size, IBV_FLOW_SPEC_IPV6); if (ret) return ret; kern_spec->ipv6.size = sizeof(struct ib_uverbs_flow_spec_ipv6); memcpy(&kern_spec->ipv6.val, &ib_spec->ipv6.val, kern_filter_size); memcpy(&kern_spec->ipv6.mask, (void *)&ib_spec->ipv6.val + ib_filter_size, kern_filter_size); break; case IBV_FLOW_SPEC_ESP: case IBV_FLOW_SPEC_ESP | IBV_FLOW_SPEC_INNER: kern_spec->esp.size = sizeof(struct ib_uverbs_flow_spec_esp); memcpy(&kern_spec->esp.val, &ib_spec->esp.val, sizeof(struct ib_uverbs_flow_spec_esp_filter)); memcpy(&kern_spec->esp.mask, (void *)&ib_spec->esp.mask, sizeof(struct ib_uverbs_flow_spec_esp_filter)); break; case IBV_FLOW_SPEC_TCP: case IBV_FLOW_SPEC_UDP: case IBV_FLOW_SPEC_TCP | IBV_FLOW_SPEC_INNER: case IBV_FLOW_SPEC_UDP | IBV_FLOW_SPEC_INNER: kern_spec->tcp_udp.size = sizeof(struct ib_uverbs_flow_spec_tcp_udp); memcpy(&kern_spec->tcp_udp.val, &ib_spec->tcp_udp.val, 
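/*
 * Fixed-size specs are copied as a val/mask pair using the kernel ABI
 * struct size directly; only the variable-size types (IPV4_EXT, IPV6,
 * VXLAN_TUNNEL) need the trimming done by get_filters_size() above.
 */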
sizeof(struct ibv_flow_tcp_udp_filter)); memcpy(&kern_spec->tcp_udp.mask, &ib_spec->tcp_udp.mask, sizeof(struct ibv_flow_tcp_udp_filter)); break; case IBV_FLOW_SPEC_GRE: kern_spec->gre.size = sizeof(struct ib_uverbs_flow_spec_gre); memcpy(&kern_spec->gre.val, &ib_spec->gre.val, sizeof(struct ibv_flow_gre_filter)); memcpy(&kern_spec->gre.mask, &ib_spec->gre.mask, sizeof(struct ibv_flow_gre_filter)); break; case IBV_FLOW_SPEC_MPLS: case IBV_FLOW_SPEC_MPLS | IBV_FLOW_SPEC_INNER: kern_spec->mpls.size = sizeof(struct ib_uverbs_flow_spec_mpls); memcpy(&kern_spec->mpls.val, &ib_spec->mpls.val, sizeof(struct ibv_flow_mpls_filter)); memcpy(&kern_spec->mpls.mask, &ib_spec->mpls.mask, sizeof(struct ibv_flow_mpls_filter)); break; case IBV_FLOW_SPEC_VXLAN_TUNNEL: ret = get_filters_size(ib_spec, kern_spec, &ib_filter_size, &kern_filter_size, IBV_FLOW_SPEC_VXLAN_TUNNEL); if (ret) return ret; kern_spec->tunnel.size = sizeof(struct ib_uverbs_flow_spec_tunnel); memcpy(&kern_spec->tunnel.val, &ib_spec->tunnel.val, kern_filter_size); memcpy(&kern_spec->tunnel.mask, (void *)&ib_spec->tunnel.val + ib_filter_size, kern_filter_size); break; case IBV_FLOW_SPEC_ACTION_TAG: kern_spec->flow_tag.size = sizeof(struct ib_uverbs_flow_spec_action_tag); kern_spec->flow_tag.tag_id = ib_spec->flow_tag.tag_id; break; case IBV_FLOW_SPEC_ACTION_DROP: kern_spec->drop.size = sizeof(struct ib_uverbs_flow_spec_action_drop); break; case IBV_FLOW_SPEC_ACTION_HANDLE: { const struct verbs_flow_action *vaction = container_of((const struct ibv_flow_action *)ib_spec->handle.action, const struct verbs_flow_action, action); kern_spec->handle.size = sizeof(struct ib_uverbs_flow_spec_action_handle); kern_spec->handle.handle = vaction->handle; break; } case IBV_FLOW_SPEC_ACTION_COUNT: { const struct verbs_counters *vcounters = container_of(ib_spec->flow_count.counters, const struct verbs_counters, counters); kern_spec->flow_count.size = sizeof(struct ib_uverbs_flow_spec_action_count); kern_spec->flow_count.handle = vcounters->handle; break; } default: return EINVAL; } return 0; } int ibv_cmd_create_flow(struct ibv_qp *qp, struct ibv_flow *flow_id, struct ibv_flow_attr *flow_attr, void *ucmd, size_t ucmd_size) { struct ibv_create_flow *cmd; struct ib_uverbs_create_flow_resp resp; size_t cmd_size; size_t written_size; int i, err; void *kern_spec; void *ib_spec; cmd_size = sizeof(*cmd) + (flow_attr->num_of_specs * sizeof(struct ibv_kern_spec)); cmd = alloca(cmd_size + ucmd_size); memset(cmd, 0, cmd_size + ucmd_size); cmd->qp_handle = qp->handle; cmd->flow_attr.type = flow_attr->type; cmd->flow_attr.priority = flow_attr->priority; cmd->flow_attr.num_of_specs = flow_attr->num_of_specs; cmd->flow_attr.port = flow_attr->port; cmd->flow_attr.flags = flow_attr->flags; kern_spec = cmd + 1; ib_spec = flow_attr + 1; for (i = 0; i < flow_attr->num_of_specs; i++) { err = ib_spec_to_kern_spec(ib_spec, kern_spec); if (err) { errno = err; return err; } cmd->flow_attr.size += ((struct ibv_kern_spec *)kern_spec)->hdr.size; kern_spec += ((struct ibv_kern_spec *)kern_spec)->hdr.size; ib_spec += ((struct ibv_flow_spec *)ib_spec)->hdr.size; } written_size = sizeof(*cmd) + cmd->flow_attr.size; if (ucmd) { memcpy((char *)cmd + written_size, ucmd, ucmd_size); written_size += ucmd_size; } err = execute_cmd_write_ex_full(qp->context, IB_USER_VERBS_EX_CMD_CREATE_FLOW, cmd, written_size - ucmd_size, written_size, &resp, sizeof(resp), sizeof(resp)); if (err) return err; flow_id->context = qp->context; flow_id->handle = resp.flow_handle; return 0; } int 
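/*
 * Wire layout built by ibv_cmd_create_flow() above, sketched for
 * reference:
 *
 *	+------------------------+ <- cmd
 *	| qp_handle, flow_attr   |
 *	+------------------------+ <- kern_spec = cmd + 1
 *	| spec 0 | spec 1 | ...  |    (flow_attr.size bytes total)
 *	+------------------------+ <- written_size - ucmd_size
 *	| optional driver ucmd   |
 *	+------------------------+ <- written_size
 */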
ibv_cmd_modify_wq(struct ibv_wq *wq, struct ibv_wq_attr *attr, struct ibv_modify_wq *cmd, size_t cmd_size) { int err; if (attr->attr_mask >= IBV_WQ_ATTR_RESERVED) return EINVAL; memset(cmd, 0, sizeof(*cmd)); cmd->curr_wq_state = attr->curr_wq_state; cmd->wq_state = attr->wq_state; if (attr->attr_mask & IBV_WQ_ATTR_FLAGS) { if (attr->flags_mask & ~(IBV_WQ_FLAGS_RESERVED - 1)) return EOPNOTSUPP; cmd->flags = attr->flags; cmd->flags_mask = attr->flags_mask; } cmd->wq_handle = wq->handle; cmd->attr_mask = attr->attr_mask; err = execute_cmd_write_ex_req( wq->context, IB_USER_VERBS_EX_CMD_MODIFY_WQ, cmd, cmd_size); if (err) return err; if (attr->attr_mask & IBV_WQ_ATTR_STATE) wq->state = attr->wq_state; return 0; } int ibv_cmd_create_rwq_ind_table(struct ibv_context *context, struct ibv_rwq_ind_table_init_attr *init_attr, struct ibv_rwq_ind_table *rwq_ind_table, struct ib_uverbs_ex_create_rwq_ind_table_resp *resp, size_t resp_size) { struct ibv_create_rwq_ind_table *cmd; int err; unsigned int i; unsigned int num_tbl_entries; size_t cmd_size; if (init_attr->comp_mask >= IBV_CREATE_IND_TABLE_RESERVED) return EINVAL; num_tbl_entries = 1 << init_attr->log_ind_tbl_size; /* The entire message must be size aligned to 8 bytes. */ cmd_size = sizeof(*cmd) + num_tbl_entries * sizeof(cmd->wq_handles[0]); cmd_size = (cmd_size + 7) / 8 * 8; cmd = alloca(cmd_size); memset(cmd, 0, cmd_size); for (i = 0; i < num_tbl_entries; i++) cmd->wq_handles[i] = init_attr->ind_tbl[i]->handle; cmd->log_ind_tbl_size = init_attr->log_ind_tbl_size; cmd->comp_mask = 0; err = execute_cmd_write_ex_full(context, IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL, cmd, cmd_size, cmd_size, resp, sizeof(*resp), resp_size); if (err) return err; if (resp->response_length < sizeof(*resp)) return EINVAL; rwq_ind_table->ind_tbl_handle = resp->ind_tbl_handle; rwq_ind_table->ind_tbl_num = resp->ind_tbl_num; rwq_ind_table->context = context; return 0; } int ibv_cmd_modify_cq(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr, struct ibv_modify_cq *cmd, size_t cmd_size) { if (attr->attr_mask >= IBV_CQ_ATTR_RESERVED) return EINVAL; cmd->cq_handle = cq->handle; cmd->attr_mask = attr->attr_mask; cmd->attr.cq_count = attr->moderate.cq_count; cmd->attr.cq_period = attr->moderate.cq_period; cmd->reserved = 0; return execute_cmd_write_ex_req( cq->context, IB_USER_VERBS_EX_CMD_MODIFY_CQ, cmd, cmd_size); } rdma-core-56.1/libibverbs/cmd_ah.c000066400000000000000000000040661477342711600170420ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include int ibv_cmd_destroy_ah(struct ibv_ah *ah) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_AH, UVERBS_METHOD_AH_DESTROY, 1, NULL); int ret; fill_attr_in_obj(cmdb, UVERBS_ATTR_DESTROY_AH_HANDLE, ah->handle); switch (execute_ioctl_fallback(ah->context, destroy_ah, cmdb, &ret)) { case TRY_WRITE: { struct ibv_destroy_ah req; req.core_payload = (struct ib_uverbs_destroy_ah){ .ah_handle = ah->handle, }; ret = execute_cmd_write_req(ah->context, IB_USER_VERBS_CMD_DESTROY_AH, &req, sizeof(req)); break; } default: break; } if (verbs_is_destroy_err(&ret)) return ret; return 0; } rdma-core-56.1/libibverbs/cmd_counters.c000066400000000000000000000062121477342711600203070ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include int ibv_cmd_create_counters(struct ibv_context *context, struct ibv_counters_init_attr *init_attr, struct verbs_counters *vcounters, struct ibv_command_buffer *link) { DECLARE_COMMAND_BUFFER_LINK(cmd, UVERBS_OBJECT_COUNTERS, UVERBS_METHOD_COUNTERS_CREATE, 1, link); struct ib_uverbs_attr *handle = fill_attr_out_obj(cmd, UVERBS_ATTR_CREATE_COUNTERS_HANDLE); int ret; if (!check_comp_mask(init_attr->comp_mask, 0)) return EOPNOTSUPP; ret = execute_ioctl(context, cmd); if (ret) return ret; vcounters->counters.context = context; vcounters->handle = read_attr_obj(UVERBS_ATTR_CREATE_COUNTERS_HANDLE, handle); return 0; } int ibv_cmd_destroy_counters(struct verbs_counters *vcounters) { DECLARE_COMMAND_BUFFER(cmd, UVERBS_OBJECT_COUNTERS, UVERBS_METHOD_COUNTERS_DESTROY, 1); int ret; fill_attr_in_obj(cmd, UVERBS_ATTR_DESTROY_COUNTERS_HANDLE, vcounters->handle); ret = execute_ioctl(vcounters->counters.context, cmd); if (verbs_is_destroy_err(&ret)) return ret; return 0; } int ibv_cmd_read_counters(struct verbs_counters *vcounters, uint64_t *counters_value, uint32_t ncounters, uint32_t flags, struct ibv_command_buffer *link) { DECLARE_COMMAND_BUFFER_LINK(cmd, UVERBS_OBJECT_COUNTERS, UVERBS_METHOD_COUNTERS_READ, 3, link); fill_attr_in_obj(cmd, UVERBS_ATTR_READ_COUNTERS_HANDLE, vcounters->handle); fill_attr_out_ptr_array(cmd, UVERBS_ATTR_READ_COUNTERS_BUFF, counters_value, ncounters); fill_attr_in_uint32(cmd, UVERBS_ATTR_READ_COUNTERS_FLAGS, flags); return execute_ioctl(vcounters->counters.context, cmd); } rdma-core-56.1/libibverbs/cmd_cq.c000066400000000000000000000163571477342711600170630ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include "ibverbs.h" static int ibv_icmd_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector, uint32_t flags, struct ibv_cq *cq, struct ibv_command_buffer *link, uint32_t cmd_flags) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_CQ, UVERBS_METHOD_CQ_CREATE, 8, link); struct verbs_ex_private *priv = get_priv(context); struct ib_uverbs_attr *handle; struct ib_uverbs_attr *async_fd_attr; uint32_t resp_cqe; int ret; cq->context = context; handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_CREATE_CQ_HANDLE); fill_attr_out_ptr(cmdb, UVERBS_ATTR_CREATE_CQ_RESP_CQE, &resp_cqe); fill_attr_in_uint32(cmdb, UVERBS_ATTR_CREATE_CQ_CQE, cqe); fill_attr_in_uint64(cmdb, UVERBS_ATTR_CREATE_CQ_USER_HANDLE, (uintptr_t)cq); if (channel) fill_attr_in_fd(cmdb, UVERBS_ATTR_CREATE_CQ_COMP_CHANNEL, channel->fd); fill_attr_in_uint32(cmdb, UVERBS_ATTR_CREATE_CQ_COMP_VECTOR, comp_vector); async_fd_attr = fill_attr_in_fd(cmdb, UVERBS_ATTR_CREATE_CQ_EVENT_FD, context->async_fd); if (priv->imported) fallback_require_ioctl(cmdb); else /* Prevent fallback to the 'write' mode if kernel doesn't support it */ attr_optional(async_fd_attr); if (flags) { if ((flags & ~IB_UVERBS_CQ_FLAGS_TIMESTAMP_COMPLETION) || (!(cmd_flags & CREATE_CQ_CMD_FLAGS_TS_IGNORED_EX))) fallback_require_ex(cmdb); fill_attr_in_uint32(cmdb, UVERBS_ATTR_CREATE_CQ_FLAGS, flags); } switch (execute_ioctl_fallback(cq->context, create_cq, cmdb, &ret)) { case TRY_WRITE: { DECLARE_LEGACY_UHW_BUFS(link, IB_USER_VERBS_CMD_CREATE_CQ); *req = (struct ib_uverbs_create_cq){ .user_handle = (uintptr_t)cq, .cqe = cqe, .comp_vector = comp_vector, .comp_channel = channel ? channel->fd : -1, }; ret = execute_write_bufs( cq->context, IB_USER_VERBS_CMD_CREATE_CQ, req, resp); if (ret) return ret; cq->handle = resp->cq_handle; cq->cqe = resp->cqe; return 0; } case TRY_WRITE_EX: { DECLARE_LEGACY_UHW_BUFS_EX(link, IB_USER_VERBS_EX_CMD_CREATE_CQ); *req = (struct ib_uverbs_ex_create_cq){ .user_handle = (uintptr_t)cq, .cqe = cqe, .comp_vector = comp_vector, .comp_channel = channel ? 
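/*
 * execute_ioctl_fallback() drives a three-step ladder here: the ioctl
 * uAPI is tried first; if the kernel lacks it, TRY_WRITE_EX selects
 * the extended write command and TRY_WRITE the legacy one.  Each
 * branch re-marshals the same create-CQ arguments into the matching
 * wire format.
 */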
channel->fd : -1, .flags = flags, }; ret = execute_write_bufs_ex( cq->context, IB_USER_VERBS_EX_CMD_CREATE_CQ, req, resp); if (ret) return ret; cq->handle = resp->base.cq_handle; cq->cqe = resp->base.cqe; return 0; } case ERROR: return ret; case SUCCESS: break; } cq->handle = read_attr_obj(UVERBS_ATTR_CREATE_CQ_HANDLE, handle); cq->cqe = resp_cqe; return 0; } static int ibv_icmd_create_cq_ex(struct ibv_context *context, const struct ibv_cq_init_attr_ex *cq_attr, struct verbs_cq *cq, struct ibv_command_buffer *cmdb, uint32_t cmd_flags) { uint32_t flags = 0; if (!check_comp_mask(cq_attr->comp_mask, IBV_CQ_INIT_ATTR_MASK_FLAGS | IBV_CQ_INIT_ATTR_MASK_PD)) return EOPNOTSUPP; if (cq_attr->wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP || cq_attr->wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK) flags |= IB_UVERBS_CQ_FLAGS_TIMESTAMP_COMPLETION; if ((cq_attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_FLAGS) && cq_attr->flags & IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN) flags |= IB_UVERBS_CQ_FLAGS_IGNORE_OVERRUN; return ibv_icmd_create_cq(context, cq_attr->cqe, cq_attr->channel, cq_attr->comp_vector, flags, &cq->cq, cmdb, cmd_flags); } int ibv_cmd_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector, struct ibv_cq *cq, struct ibv_create_cq *cmd, size_t cmd_size, struct ib_uverbs_create_cq_resp *resp, size_t resp_size) { DECLARE_CMD_BUFFER_COMPAT(cmdb, UVERBS_OBJECT_CQ, UVERBS_METHOD_CQ_CREATE, cmd, cmd_size, resp, resp_size); return ibv_icmd_create_cq(context, cqe, channel, comp_vector, 0, cq, cmdb, 0); } int ibv_cmd_create_cq_ex(struct ibv_context *context, const struct ibv_cq_init_attr_ex *cq_attr, struct verbs_cq *cq, struct ibv_create_cq_ex *cmd, size_t cmd_size, struct ib_uverbs_ex_create_cq_resp *resp, size_t resp_size, uint32_t cmd_flags) { DECLARE_CMD_BUFFER_COMPAT(cmdb, UVERBS_OBJECT_CQ, UVERBS_METHOD_CQ_CREATE, cmd, cmd_size, resp, resp_size); return ibv_icmd_create_cq_ex(context, cq_attr, cq, cmdb, cmd_flags); } int ibv_cmd_create_cq_ex2(struct ibv_context *context, const struct ibv_cq_init_attr_ex *cq_attr, struct verbs_cq *cq, struct ibv_create_cq_ex *cmd, size_t cmd_size, struct ib_uverbs_ex_create_cq_resp *resp, size_t resp_size, uint32_t cmd_flags, struct ibv_command_buffer *driver) { DECLARE_CMD_BUFFER_LINK_COMPAT(cmdb, UVERBS_OBJECT_CQ, UVERBS_METHOD_CQ_CREATE, driver, cmd, cmd_size, resp, resp_size); return ibv_icmd_create_cq_ex(context, cq_attr, cq, cmdb, cmd_flags); } int ibv_cmd_destroy_cq(struct ibv_cq *cq) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_CQ, UVERBS_METHOD_CQ_DESTROY, 2, NULL); struct ib_uverbs_destroy_cq_resp resp; int ret; fill_attr_out_ptr(cmdb, UVERBS_ATTR_DESTROY_CQ_RESP, &resp); fill_attr_in_obj(cmdb, UVERBS_ATTR_DESTROY_CQ_HANDLE, cq->handle); switch (execute_ioctl_fallback(cq->context, destroy_cq, cmdb, &ret)) { case TRY_WRITE: { struct ibv_destroy_cq req; req.core_payload = (struct ib_uverbs_destroy_cq){ .cq_handle = cq->handle, }; ret = execute_cmd_write(cq->context, IB_USER_VERBS_CMD_DESTROY_CQ, &req, sizeof(req), &resp, sizeof(resp)); break; } default: break; } if (verbs_is_destroy_err(&ret)) return ret; pthread_mutex_lock(&cq->mutex); while (cq->comp_events_completed != resp.comp_events_reported || cq->async_events_completed != resp.async_events_reported) pthread_cond_wait(&cq->cond, &cq->mutex); pthread_mutex_unlock(&cq->mutex); return 0; } rdma-core-56.1/libibverbs/cmd_device.c000066400000000000000000000502211477342711600177030ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. 
All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define _GNU_SOURCE #include #include #include #include #include #include static void copy_query_port_resp_to_port_attr(struct ibv_port_attr *port_attr, struct ib_uverbs_query_port_resp *resp) { port_attr->state = resp->state; port_attr->max_mtu = resp->max_mtu; port_attr->active_mtu = resp->active_mtu; port_attr->gid_tbl_len = resp->gid_tbl_len; port_attr->port_cap_flags = resp->port_cap_flags; port_attr->max_msg_sz = resp->max_msg_sz; port_attr->bad_pkey_cntr = resp->bad_pkey_cntr; port_attr->qkey_viol_cntr = resp->qkey_viol_cntr; port_attr->pkey_tbl_len = resp->pkey_tbl_len; port_attr->lid = resp->lid; port_attr->sm_lid = resp->sm_lid; port_attr->lmc = resp->lmc; port_attr->max_vl_num = resp->max_vl_num; port_attr->sm_sl = resp->sm_sl; port_attr->subnet_timeout = resp->subnet_timeout; port_attr->init_type_reply = resp->init_type_reply; port_attr->active_width = resp->active_width; port_attr->active_speed = resp->active_speed; port_attr->phys_state = resp->phys_state; port_attr->link_layer = resp->link_layer; port_attr->flags = resp->flags; } int ibv_cmd_query_port(struct ibv_context *context, uint8_t port_num, struct ibv_port_attr *port_attr, struct ibv_query_port *cmd, size_t cmd_size) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_DEVICE, UVERBS_METHOD_QUERY_PORT, 2, NULL); int ret; struct ib_uverbs_query_port_resp_ex resp_ex = {}; fill_attr_const_in(cmdb, UVERBS_ATTR_QUERY_PORT_PORT_NUM, port_num); fill_attr_out_ptr(cmdb, UVERBS_ATTR_QUERY_PORT_RESP, &resp_ex); switch (execute_ioctl_fallback(context, query_port, cmdb, &ret)) { case TRY_WRITE: { struct ib_uverbs_query_port_resp resp; cmd->port_num = port_num; memset(cmd->reserved, 0, sizeof(cmd->reserved)); memset(&resp, 0, sizeof(resp)); ret = execute_cmd_write(context, IB_USER_VERBS_CMD_QUERY_PORT, cmd, cmd_size, &resp, sizeof(resp)); if (ret) return ret; copy_query_port_resp_to_port_attr(port_attr, &resp); break; } case SUCCESS: copy_query_port_resp_to_port_attr(port_attr, &resp_ex.legacy_resp); port_attr->port_cap_flags2 = resp_ex.port_cap_flags2; port_attr->active_speed_ex = resp_ex.active_speed_ex; break; default: return ret; }; return 0; } int ibv_cmd_alloc_async_fd(struct ibv_context *context) { DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_ASYNC_EVENT, 
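/*
 * The command-buffer pattern used throughout these ioctl paths:
 * declare a buffer sized for N attributes, fill the input/output
 * attributes, execute, then read back any output handles.  A minimal
 * sketch (SOME_METHOD and the SOME_*_ATTR names are hypothetical
 * placeholders):
 *
 *	DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_DEVICE, SOME_METHOD, 2);
 *	struct ib_uverbs_attr *handle;
 *
 *	fill_attr_in_uint32(cmdb, SOME_IN_ATTR, val);
 *	handle = fill_attr_out_obj(cmdb, SOME_OUT_ATTR);
 *	if (!execute_ioctl(context, cmdb))
 *		obj_handle = read_attr_obj(SOME_OUT_ATTR, handle);
 */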
UVERBS_METHOD_ASYNC_EVENT_ALLOC, 1); struct ib_uverbs_attr *handle; int ret; handle = fill_attr_out_fd(cmdb, UVERBS_ATTR_ASYNC_EVENT_ALLOC_FD_HANDLE, 0); ret = execute_ioctl(context, cmdb); if (ret) return ret; context->async_fd = read_attr_fd(UVERBS_ATTR_ASYNC_EVENT_ALLOC_FD_HANDLE, handle); return 0; } static int cmd_get_context(struct verbs_context *context_ex, struct ibv_command_buffer *link) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_DEVICE, UVERBS_METHOD_GET_CONTEXT, 2, link); struct ibv_context *context = &context_ex->context; struct verbs_device *verbs_device; uint64_t core_support; uint32_t num_comp_vectors; int ret; fill_attr_out_ptr(cmdb, UVERBS_ATTR_GET_CONTEXT_NUM_COMP_VECTORS, &num_comp_vectors); fill_attr_out_ptr(cmdb, UVERBS_ATTR_GET_CONTEXT_CORE_SUPPORT, &core_support); /* Using free_context cmd_name as alloc context is not in * verbs_context_ops while free_context is and doesn't use ioctl */ switch (execute_ioctl_fallback(context, free_context, cmdb, &ret)) { case TRY_WRITE: { DECLARE_LEGACY_UHW_BUFS(link, IB_USER_VERBS_CMD_GET_CONTEXT); ret = execute_write_bufs(context, IB_USER_VERBS_CMD_GET_CONTEXT, req, resp); if (ret) return ret; context->async_fd = resp->async_fd; context->num_comp_vectors = resp->num_comp_vectors; return 0; } case SUCCESS: break; default: return ret; }; context->num_comp_vectors = num_comp_vectors; verbs_device = verbs_get_device(context->device); verbs_device->core_support = core_support; return 0; } int ibv_cmd_get_context(struct verbs_context *context_ex, struct ibv_get_context *cmd, size_t cmd_size, struct ib_uverbs_get_context_resp *resp, size_t resp_size) { DECLARE_CMD_BUFFER_COMPAT(cmdb, UVERBS_OBJECT_DEVICE, UVERBS_METHOD_GET_CONTEXT, cmd, cmd_size, resp, resp_size); return cmd_get_context(context_ex, cmdb); } int ibv_cmd_query_context(struct ibv_context *context, struct ibv_command_buffer *driver) { DECLARE_COMMAND_BUFFER_LINK(cmd, UVERBS_OBJECT_DEVICE, UVERBS_METHOD_QUERY_CONTEXT, 2, driver); struct verbs_device *verbs_device; uint64_t core_support; int ret; fill_attr_out_ptr(cmd, UVERBS_ATTR_QUERY_CONTEXT_NUM_COMP_VECTORS, &context->num_comp_vectors); fill_attr_out_ptr(cmd, UVERBS_ATTR_QUERY_CONTEXT_CORE_SUPPORT, &core_support); ret = execute_ioctl(context, cmd); if (ret) return ret; verbs_device = verbs_get_device(context->device); verbs_device->core_support = core_support; return 0; } static int is_zero_gid(union ibv_gid *gid) { const union ibv_gid zgid = {}; return !memcmp(gid, &zgid, sizeof(*gid)); } static int query_sysfs_gid_ndev_ifindex(struct ibv_context *context, uint8_t port_num, uint32_t gid_index, uint32_t *ndev_ifindex) { struct verbs_device *verbs_device = verbs_get_device(context->device); char buff[IF_NAMESIZE]; if (ibv_read_ibdev_sysfs_file(buff, sizeof(buff), verbs_device->sysfs, "ports/%d/gid_attrs/ndevs/%d", port_num, gid_index) <= 0) { *ndev_ifindex = 0; return 0; } *ndev_ifindex = if_nametoindex(buff); return *ndev_ifindex ? 0 : errno; } static int query_sysfs_gid(struct ibv_context *context, uint8_t port_num, int index, union ibv_gid *gid) { struct verbs_device *verbs_device = verbs_get_device(context->device); char attr[41]; uint16_t val; int i; if (ibv_read_ibdev_sysfs_file(attr, sizeof(attr), verbs_device->sysfs, "ports/%d/gids/%d", port_num, index) < 0) return -1; for (i = 0; i < 8; ++i) { if (sscanf(attr + i * 5, "%hx", &val) != 1) return -1; gid->raw[i * 2] = val >> 8; gid->raw[i * 2 + 1] = val & 0xff; } return 0; } /* GID types as appear in sysfs, no change is expected as of ABI * compatibility. 
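 *
 * These strings are matched verbatim against sysfs files of the form
 * /sys/class/infiniband/<dev>/ports/<port>/gid_attrs/types/<index>.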
*/ #define V1_TYPE "IB/RoCE v1" #define V2_TYPE "RoCE v2" static int query_sysfs_gid_type(struct ibv_context *context, uint8_t port_num, unsigned int index, enum ibv_gid_type_sysfs *type) { struct verbs_device *verbs_device = verbs_get_device(context->device); char buff[11]; /* Reset errno so that we can rely on its value upon any error flow in * ibv_read_sysfs_file. */ errno = 0; if (ibv_read_ibdev_sysfs_file(buff, sizeof(buff), verbs_device->sysfs, "ports/%d/gid_attrs/types/%d", port_num, index) <= 0) { char *dir_path; DIR *dir; if (errno == EINVAL) { /* In IB, this file doesn't exist and the kernel sets * errno to -EINVAL. */ *type = IBV_GID_TYPE_SYSFS_IB_ROCE_V1; return 0; } if (asprintf(&dir_path, "%s/%s/%d/%s/", verbs_device->sysfs->ibdev_path, "ports", port_num, "gid_attrs") < 0) return -1; dir = opendir(dir_path); free(dir_path); if (!dir) { if (errno == ENOENT) /* Assuming that if gid_attrs doesn't exist, * we have an old kernel and all GIDs are * IB/RoCE v1 */ *type = IBV_GID_TYPE_SYSFS_IB_ROCE_V1; else return -1; } else { closedir(dir); errno = EFAULT; return -1; } } else { if (!strcmp(buff, V1_TYPE)) { *type = IBV_GID_TYPE_SYSFS_IB_ROCE_V1; } else if (!strcmp(buff, V2_TYPE)) { *type = IBV_GID_TYPE_SYSFS_ROCE_V2; } else { errno = ENOTSUP; return -1; } } return 0; } static int query_sysfs_gid_entry(struct ibv_context *context, uint32_t port_num, uint32_t gid_index, struct ibv_gid_entry *entry, uint32_t attr_mask, int link_layer) { enum ibv_gid_type_sysfs gid_type; struct ibv_port_attr port_attr = {}; int ret = 0; entry->gid_index = gid_index; entry->port_num = port_num; if (attr_mask & VERBS_QUERY_GID_ATTR_GID) { ret = query_sysfs_gid(context, port_num, gid_index, &entry->gid); if (ret) return EINVAL; } if (attr_mask & VERBS_QUERY_GID_ATTR_TYPE) { ret = query_sysfs_gid_type(context, port_num, gid_index, &gid_type); if (ret) return EINVAL; if (gid_type == IBV_GID_TYPE_SYSFS_IB_ROCE_V1) { if (link_layer < 0) { ret = ibv_query_port(context, port_num, &port_attr); if (ret) goto out; link_layer = port_attr.link_layer; } if (link_layer == IBV_LINK_LAYER_INFINIBAND) { entry->gid_type = IBV_GID_TYPE_IB; } else if (link_layer == IBV_LINK_LAYER_ETHERNET) { entry->gid_type = IBV_GID_TYPE_ROCE_V1; } else { /* Unspecified link layer is IB by default */ entry->gid_type = IBV_GID_TYPE_IB; } } else { entry->gid_type = IBV_GID_TYPE_ROCE_V2; } } if (attr_mask & VERBS_QUERY_GID_ATTR_NDEV_IFINDEX) ret = query_sysfs_gid_ndev_ifindex(context, port_num, gid_index, &entry->ndev_ifindex); out: return ret; } static int query_gid_table_fb(struct ibv_context *context, struct ibv_gid_entry *entries, size_t max_entries, uint64_t *num_entries, size_t entry_size) { struct ibv_device_attr dev_attr = {}; struct ibv_port_attr port_attr = {}; struct ibv_gid_entry entry = {}; int attr_mask; void *tmp; int i, j; int ret; ret = ibv_query_device(context, &dev_attr); if (ret) goto out; tmp = entries; *num_entries = 0; attr_mask = VERBS_QUERY_GID_ATTR_GID | VERBS_QUERY_GID_ATTR_TYPE | VERBS_QUERY_GID_ATTR_NDEV_IFINDEX; for (i = 0; i < dev_attr.phys_port_cnt; i++) { ret = ibv_query_port(context, i + 1, &port_attr); if (ret) goto out; for (j = 0; j < port_attr.gid_tbl_len; j++) { /* In case we already reached max_entries, query to some * temp entry, in case all other entries are zeros the * API should succceed. 
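 *
 * (If a further non-zero GID does show up after max_entries is
 * reached, the fallback fails with EINVAL below.)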
*/ if (*num_entries == max_entries) tmp = &entry; ret = query_sysfs_gid_entry(context, i + 1, j, tmp, attr_mask, port_attr.link_layer); if (ret) goto out; if (is_zero_gid(&((struct ibv_gid_entry *)tmp)->gid)) continue; if (*num_entries == max_entries) { ret = EINVAL; goto out; } (*num_entries)++; tmp += entry_size; } } out: return ret; } /* Using async_event cmd_name because query_gid_ex and query_gid_table are not * in verbs_context_ops while async_event is and doesn't use ioctl. * If one of them is not supported, so is the other. Hence, we can use a single * cmd_name for both of them. */ #define query_gid_kernel_cap async_event int __ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num, uint32_t gid_index, struct ibv_gid_entry *entry, uint32_t flags, size_t entry_size, uint32_t fallback_attr_mask) { DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_DEVICE, UVERBS_METHOD_QUERY_GID_ENTRY, 4); int ret; fill_attr_const_in(cmdb, UVERBS_ATTR_QUERY_GID_ENTRY_PORT, port_num); fill_attr_const_in(cmdb, UVERBS_ATTR_QUERY_GID_ENTRY_GID_INDEX, gid_index); fill_attr_in_uint32(cmdb, UVERBS_ATTR_QUERY_GID_ENTRY_FLAGS, flags); fill_attr_out(cmdb, UVERBS_ATTR_QUERY_GID_ENTRY_RESP_ENTRY, entry, entry_size); switch (execute_ioctl_fallback(context, query_gid_kernel_cap, cmdb, &ret)) { case TRY_WRITE: if (flags) return EOPNOTSUPP; ret = query_sysfs_gid_entry(context, port_num, gid_index, entry, fallback_attr_mask, -1); if (ret) return ret; if (fallback_attr_mask & VERBS_QUERY_GID_ATTR_GID && is_zero_gid(&entry->gid)) return ENODATA; return 0; default: return ret; } } int _ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num, uint32_t gid_index, struct ibv_gid_entry *entry, uint32_t flags, size_t entry_size) { return __ibv_query_gid_ex(context, port_num, gid_index, entry, flags, entry_size, VERBS_QUERY_GID_ATTR_GID | VERBS_QUERY_GID_ATTR_TYPE | VERBS_QUERY_GID_ATTR_NDEV_IFINDEX); } ssize_t _ibv_query_gid_table(struct ibv_context *context, struct ibv_gid_entry *entries, size_t max_entries, uint32_t flags, size_t entry_size) { DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_DEVICE, UVERBS_METHOD_QUERY_GID_TABLE, 4); uint64_t num_entries; int ret; fill_attr_const_in(cmdb, UVERBS_ATTR_QUERY_GID_TABLE_ENTRY_SIZE, entry_size); fill_attr_in_uint32(cmdb, UVERBS_ATTR_QUERY_GID_TABLE_FLAGS, flags); fill_attr_out(cmdb, UVERBS_ATTR_QUERY_GID_TABLE_RESP_ENTRIES, entries, _array_len(entry_size, max_entries)); fill_attr_out_ptr(cmdb, UVERBS_ATTR_QUERY_GID_TABLE_RESP_NUM_ENTRIES, &num_entries); switch (execute_ioctl_fallback(context, query_gid_kernel_cap, cmdb, &ret)) { case TRY_WRITE: if (flags) return -EOPNOTSUPP; ret = query_gid_table_fb(context, entries, max_entries, &num_entries, entry_size); break; default: break; } if (ret) return -ret; return num_entries; } int ibv_cmd_query_device_any(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size, struct ib_uverbs_ex_query_device_resp *resp, size_t *resp_size) { struct ib_uverbs_ex_query_device_resp internal_resp; size_t internal_resp_size; int err; if (input && input->comp_mask) return EINVAL; if (attr_size < sizeof(attr->orig_attr)) return EINVAL; if (!resp) { resp = &internal_resp; internal_resp_size = sizeof(internal_resp); resp_size = &internal_resp_size; } memset(attr, 0, attr_size); if (attr_size > sizeof(attr->orig_attr)) { struct ibv_query_device_ex cmd = {}; err = execute_cmd_write_ex(context, IB_USER_VERBS_EX_CMD_QUERY_DEVICE, &cmd, sizeof(cmd), resp, *resp_size); if (err) { if (err 
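/*
 * EOPNOTSUPP or ENOSYS here means the kernel predates the extended
 * query-device command, so the code falls back to the legacy
 * fixed-size query and reports only the base attributes.
 */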
!= EOPNOTSUPP && err != ENOSYS) return err; attr_size = sizeof(attr->orig_attr); } } if (attr_size == sizeof(attr->orig_attr)) { struct ibv_query_device cmd = {}; err = execute_cmd_write(context, IB_USER_VERBS_CMD_QUERY_DEVICE, &cmd, sizeof(cmd), &resp->base, sizeof(resp->base)); if (err) return err; resp->response_length = sizeof(resp->base); } *resp_size = resp->response_length; attr->orig_attr.node_guid = resp->base.node_guid; attr->orig_attr.sys_image_guid = resp->base.sys_image_guid; attr->orig_attr.max_mr_size = resp->base.max_mr_size; attr->orig_attr.page_size_cap = resp->base.page_size_cap; attr->orig_attr.vendor_id = resp->base.vendor_id; attr->orig_attr.vendor_part_id = resp->base.vendor_part_id; attr->orig_attr.hw_ver = resp->base.hw_ver; attr->orig_attr.max_qp = resp->base.max_qp; attr->orig_attr.max_qp_wr = resp->base.max_qp_wr; attr->orig_attr.device_cap_flags = resp->base.device_cap_flags; attr->orig_attr.max_sge = resp->base.max_sge; attr->orig_attr.max_sge_rd = resp->base.max_sge_rd; attr->orig_attr.max_cq = resp->base.max_cq; attr->orig_attr.max_cqe = resp->base.max_cqe; attr->orig_attr.max_mr = resp->base.max_mr; attr->orig_attr.max_pd = resp->base.max_pd; attr->orig_attr.max_qp_rd_atom = resp->base.max_qp_rd_atom; attr->orig_attr.max_ee_rd_atom = resp->base.max_ee_rd_atom; attr->orig_attr.max_res_rd_atom = resp->base.max_res_rd_atom; attr->orig_attr.max_qp_init_rd_atom = resp->base.max_qp_init_rd_atom; attr->orig_attr.max_ee_init_rd_atom = resp->base.max_ee_init_rd_atom; attr->orig_attr.atomic_cap = resp->base.atomic_cap; attr->orig_attr.max_ee = resp->base.max_ee; attr->orig_attr.max_rdd = resp->base.max_rdd; attr->orig_attr.max_mw = resp->base.max_mw; attr->orig_attr.max_raw_ipv6_qp = resp->base.max_raw_ipv6_qp; attr->orig_attr.max_raw_ethy_qp = resp->base.max_raw_ethy_qp; attr->orig_attr.max_mcast_grp = resp->base.max_mcast_grp; attr->orig_attr.max_mcast_qp_attach = resp->base.max_mcast_qp_attach; attr->orig_attr.max_total_mcast_qp_attach = resp->base.max_total_mcast_qp_attach; attr->orig_attr.max_ah = resp->base.max_ah; attr->orig_attr.max_fmr = resp->base.max_fmr; attr->orig_attr.max_map_per_fmr = resp->base.max_map_per_fmr; attr->orig_attr.max_srq = resp->base.max_srq; attr->orig_attr.max_srq_wr = resp->base.max_srq_wr; attr->orig_attr.max_srq_sge = resp->base.max_srq_sge; attr->orig_attr.max_pkeys = resp->base.max_pkeys; attr->orig_attr.local_ca_ack_delay = resp->base.local_ca_ack_delay; attr->orig_attr.phys_port_cnt = resp->base.phys_port_cnt; #define CAN_COPY(_ibv_attr, _uverbs_attr) \ (attr_size >= offsetofend(struct ibv_device_attr_ex, _ibv_attr) && \ resp->response_length >= \ offsetofend(struct ib_uverbs_ex_query_device_resp, \ _uverbs_attr)) if (CAN_COPY(odp_caps, odp_caps)) { attr->odp_caps.general_caps = resp->odp_caps.general_caps; attr->odp_caps.per_transport_caps.rc_odp_caps = resp->odp_caps.per_transport_caps.rc_odp_caps; attr->odp_caps.per_transport_caps.uc_odp_caps = resp->odp_caps.per_transport_caps.uc_odp_caps; attr->odp_caps.per_transport_caps.ud_odp_caps = resp->odp_caps.per_transport_caps.ud_odp_caps; } if (CAN_COPY(completion_timestamp_mask, timestamp_mask)) attr->completion_timestamp_mask = resp->timestamp_mask; if (CAN_COPY(hca_core_clock, hca_core_clock)) attr->hca_core_clock = resp->hca_core_clock; if (CAN_COPY(device_cap_flags_ex, device_cap_flags_ex)) attr->device_cap_flags_ex = resp->device_cap_flags_ex; if (CAN_COPY(rss_caps, rss_caps)) { attr->rss_caps.supported_qpts = resp->rss_caps.supported_qpts; 
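/*
 * CAN_COPY() (defined above) admits an extended field only when both
 * sides know about it: the caller's attr buffer must extend past the
 * field, and the kernel's response_length must cover the matching
 * response member.
 */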
attr->rss_caps.max_rwq_indirection_tables = resp->rss_caps.max_rwq_indirection_tables; attr->rss_caps.max_rwq_indirection_table_size = resp->rss_caps.max_rwq_indirection_table_size; } if (CAN_COPY(max_wq_type_rq, max_wq_type_rq)) attr->max_wq_type_rq = resp->max_wq_type_rq; if (CAN_COPY(raw_packet_caps, raw_packet_caps)) attr->raw_packet_caps = resp->raw_packet_caps; if (CAN_COPY(tm_caps, tm_caps)) { attr->tm_caps.max_rndv_hdr_size = resp->tm_caps.max_rndv_hdr_size; attr->tm_caps.max_num_tags = resp->tm_caps.max_num_tags; attr->tm_caps.flags = resp->tm_caps.flags; attr->tm_caps.max_ops = resp->tm_caps.max_ops; attr->tm_caps.max_sge = resp->tm_caps.max_sge; } if (CAN_COPY(cq_mod_caps, cq_moderation_caps)) { attr->cq_mod_caps.max_cq_count = resp->cq_moderation_caps.max_cq_moderation_count; attr->cq_mod_caps.max_cq_period = resp->cq_moderation_caps.max_cq_moderation_period; } if (CAN_COPY(max_dm_size, max_dm_size)) attr->max_dm_size = resp->max_dm_size; if (CAN_COPY(xrc_odp_caps, xrc_odp_caps)) attr->xrc_odp_caps = resp->xrc_odp_caps; if (attr_size >= offsetofend(struct ibv_device_attr_ex, phys_port_cnt_ex)) { struct verbs_sysfs_dev *sysfs_dev = verbs_get_device(context->device)->sysfs; if (sysfs_dev->num_ports) attr->phys_port_cnt_ex = sysfs_dev->num_ports; else attr->phys_port_cnt_ex = attr->orig_attr.phys_port_cnt; } #undef CAN_COPY return 0; } rdma-core-56.1/libibverbs/cmd_dm.c000066400000000000000000000076421477342711600170550ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include int ibv_cmd_alloc_dm(struct ibv_context *ctx, const struct ibv_alloc_dm_attr *dm_attr, struct verbs_dm *dm, struct ibv_command_buffer *link) { DECLARE_COMMAND_BUFFER_LINK(cmdb, UVERBS_OBJECT_DM, UVERBS_METHOD_DM_ALLOC, 3, link); struct ib_uverbs_attr *handle; int ret; handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_ALLOC_DM_HANDLE); fill_attr_in_uint64(cmdb, UVERBS_ATTR_ALLOC_DM_LENGTH, dm_attr->length); fill_attr_in_uint32(cmdb, UVERBS_ATTR_ALLOC_DM_ALIGNMENT, dm_attr->log_align_req); ret = execute_ioctl(ctx, cmdb); if (ret) return errno; dm->handle = read_attr_obj(UVERBS_ATTR_ALLOC_DM_HANDLE, handle); dm->dm.handle = dm->handle; dm->dm.comp_mask = IBV_DM_MASK_HANDLE; dm->dm.context = ctx; return 0; } int ibv_cmd_free_dm(struct verbs_dm *dm) { DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_DM, UVERBS_METHOD_DM_FREE, 1); int ret; fill_attr_in_obj(cmdb, UVERBS_ATTR_FREE_DM_HANDLE, dm->handle); ret = execute_ioctl(dm->dm.context, cmdb); if (verbs_is_destroy_err(&ret)) return ret; return 0; } int ibv_cmd_reg_dm_mr(struct ibv_pd *pd, struct verbs_dm *dm, uint64_t offset, size_t length, unsigned int access, struct verbs_mr *vmr, struct ibv_command_buffer *link) { DECLARE_COMMAND_BUFFER_LINK(cmdb, UVERBS_OBJECT_MR, UVERBS_METHOD_DM_MR_REG, 8, link); struct ib_uverbs_attr *handle; uint32_t lkey, rkey; int ret; /* * DM MRs are always 0 based since the mmap pointer, if it exists, is * hidden from the user. */ if (!(access & IBV_ACCESS_ZERO_BASED)) { errno = EINVAL; return errno; } handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_REG_DM_MR_HANDLE); fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DM_MR_RESP_LKEY, &lkey); fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DM_MR_RESP_RKEY, &rkey); fill_attr_in_obj(cmdb, UVERBS_ATTR_REG_DM_MR_PD_HANDLE, pd->handle); fill_attr_in_obj(cmdb, UVERBS_ATTR_REG_DM_MR_DM_HANDLE, dm->handle); fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DM_MR_OFFSET, offset); fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DM_MR_LENGTH, length); fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DM_MR_ACCESS_FLAGS, access); ret = execute_ioctl(pd->context, cmdb); if (ret) return errno; vmr->ibv_mr.handle = read_attr_obj(UVERBS_ATTR_REG_DM_MR_HANDLE, handle); vmr->ibv_mr.context = pd->context; vmr->ibv_mr.lkey = lkey; vmr->ibv_mr.rkey = rkey; vmr->ibv_mr.length = length; vmr->ibv_mr.pd = pd; vmr->ibv_mr.addr = NULL; vmr->mr_type = IBV_MR_TYPE_MR; return 0; } rdma-core-56.1/libibverbs/cmd_fallback.c000066400000000000000000000221431477342711600202050ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include "ibverbs.h" #include #include #include #include /* * Check if the command buffer provided by the driver includes anything that * is not compatible with the legacy interface. If so, then * _execute_ioctl_fallback indicates it handled the call and sets the error * code */ enum write_fallback _check_legacy(struct ibv_command_buffer *cmdb, int *ret) { struct ib_uverbs_attr *cur; bool fallback_require_ex = cmdb->fallback_require_ex; bool fallback_ioctl_only = cmdb->fallback_ioctl_only; for (cmdb = cmdb->next; cmdb; cmdb = cmdb->next) { for (cur = cmdb->hdr.attrs; cur != cmdb->next_attr; cur++) { if (cur->attr_id != UVERBS_ATTR_UHW_IN && cur->attr_id != UVERBS_ATTR_UHW_OUT && cur->flags & UVERBS_ATTR_F_MANDATORY) goto not_supp; } fallback_require_ex |= cmdb->fallback_require_ex; fallback_ioctl_only |= cmdb->fallback_ioctl_only; } if (fallback_ioctl_only) goto not_supp; if (fallback_require_ex) return TRY_WRITE_EX; return TRY_WRITE; not_supp: errno = EOPNOTSUPP; *ret = EOPNOTSUPP; return ERROR; } /* * Used to support callers that have a fallback to the old write ABI * interface. */ enum write_fallback _execute_ioctl_fallback(struct ibv_context *ctx, unsigned int cmd_bit, struct ibv_command_buffer *cmdb, int *ret) { struct verbs_ex_private *priv = get_priv(ctx); if (bitmap_test_bit(priv->unsupported_ioctls, cmd_bit)) return _check_legacy(cmdb, ret); *ret = execute_ioctl(ctx, cmdb); if (likely(*ret == 0)) return SUCCESS; if (*ret == ENOTTY) { /* ENOTTY means the ioctl framework is entirely absent */ bitmap_fill(priv->unsupported_ioctls, VERBS_OPS_NUM); return _check_legacy(cmdb, ret); } if (*ret == EPROTONOSUPPORT) { /* * EPROTONOSUPPORT means we have the ioctl framework but this * specific method or a mandatory attribute is not supported */ bitmap_set_bit(priv->unsupported_ioctls, cmd_bit); return _check_legacy(cmdb, ret); } return ERROR; } /* * Within the command implementation we get a pointer to the request and * response buffers for the legacy interface. 
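 * (That is, the buffers of the legacy write() system call ABI, framed
 * by struct ib_uverbs_cmd_hdr.)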
This pointer is either allocated * on the stack (if the driver didn't provide a UHW) or arranged to be * directly before the UHW memory (see _write_set_uhw) */ void *_write_get_req(struct ibv_command_buffer *link, struct ib_uverbs_cmd_hdr *onstack, size_t size) { struct ib_uverbs_cmd_hdr *hdr; size += sizeof(*hdr); if (link->uhw_in_idx != _UHW_NO_INDEX) { struct ib_uverbs_attr *uhw = &link->hdr.attrs[link->uhw_in_idx]; assert(uhw->attr_id == UVERBS_ATTR_UHW_IN); assert(link->uhw_in_headroom_dwords * 4 >= size); hdr = (void *)((uintptr_t)uhw->data - size); hdr->in_words = __check_divide(size + uhw->len, 4); } else { hdr = onstack; hdr->in_words = __check_divide(size, 4); } return hdr + 1; } void *_write_get_req_ex(struct ibv_command_buffer *link, struct ex_hdr *onstack, size_t size) { struct ex_hdr *hdr; size_t full_size = size + sizeof(*hdr); if (link->uhw_in_idx != _UHW_NO_INDEX) { struct ib_uverbs_attr *uhw = &link->hdr.attrs[link->uhw_in_idx]; assert(uhw->attr_id == UVERBS_ATTR_UHW_IN); assert(link->uhw_in_headroom_dwords * 4 >= full_size); hdr = (void *)((uintptr_t)uhw->data - full_size); hdr->ex_hdr.provider_in_words = __check_divide(uhw->len, 8); } else { hdr = onstack; hdr->ex_hdr.provider_in_words = 0; } return hdr + 1; } void *_write_get_resp(struct ibv_command_buffer *link, struct ib_uverbs_cmd_hdr *hdr, void *onstack, size_t resp_size) { void *resp_start; if (link->uhw_out_idx != _UHW_NO_INDEX) { struct ib_uverbs_attr *uhw = &link->hdr.attrs[link->uhw_out_idx]; assert(uhw->attr_id == UVERBS_ATTR_UHW_OUT); assert(link->uhw_out_headroom_dwords * 4 >= resp_size); resp_start = (void *)((uintptr_t)uhw->data - resp_size); hdr->out_words = __check_divide(resp_size + uhw->len, 4); } else { resp_start = onstack; hdr->out_words = __check_divide(resp_size, 4); } return resp_start; } void *_write_get_resp_ex(struct ibv_command_buffer *link, struct ex_hdr *hdr, void *onstack, size_t resp_size) { void *resp_start; if (link->uhw_out_idx != _UHW_NO_INDEX) { struct ib_uverbs_attr *uhw = &link->hdr.attrs[link->uhw_out_idx]; assert(uhw->attr_id == UVERBS_ATTR_UHW_OUT); assert(link->uhw_out_headroom_dwords * 4 >= resp_size); resp_start = (void *)((uintptr_t)uhw->data - resp_size); hdr->ex_hdr.provider_out_words = __check_divide(uhw->len, 8); } else { resp_start = onstack; hdr->ex_hdr.provider_out_words = 0; } return resp_start; } static int ioctl_write(struct ibv_context *ctx, unsigned int write_method, const void *req, size_t core_req_size, size_t req_size, void *resp, size_t core_resp_size, size_t resp_size) { DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_DEVICE, UVERBS_METHOD_INVOKE_WRITE, 5); fill_attr_const_in(cmdb, UVERBS_ATTR_WRITE_CMD, write_method); if (core_req_size) fill_attr_in(cmdb, UVERBS_ATTR_CORE_IN, req, core_req_size); if (core_resp_size) fill_attr_out(cmdb, UVERBS_ATTR_CORE_OUT, resp, core_resp_size); if (req_size - core_req_size) fill_attr_in(cmdb, UVERBS_ATTR_UHW_IN, req + core_req_size, req_size - core_req_size); if (resp_size - core_resp_size) fill_attr_out(cmdb, UVERBS_ATTR_UHW_OUT, resp + core_resp_size, resp_size - core_resp_size); return execute_ioctl(ctx, cmdb); } int _execute_cmd_write(struct ibv_context *ctx, unsigned int write_method, void *vreq, size_t core_req_size, size_t req_size, void *resp, size_t core_resp_size, size_t resp_size) { struct ib_uverbs_cmd_hdr *req = vreq; struct verbs_ex_private *priv = get_priv(ctx); if (!VERBS_WRITE_ONLY && (VERBS_IOCTL_ONLY || priv->use_ioctl_write)) return ioctl_write(ctx, write_method, req + 1, core_req_size - sizeof(*req), 
req_size - sizeof(*req), resp, core_resp_size, resp_size); req->command = write_method; req->in_words = __check_divide(req_size, 4); req->out_words = __check_divide(resp_size, 4); if (write(ctx->cmd_fd, vreq, req_size) != req_size) return errno; if (resp) VALGRIND_MAKE_MEM_DEFINED(resp, resp_size); return 0; } /* * req_size is the total length of the ex_hdr, core payload and driver data. * core_req_size is the total length of the ex_hdr and core_payload. */ int _execute_cmd_write_ex(struct ibv_context *ctx, unsigned int write_method, struct ex_hdr *req, size_t core_req_size, size_t req_size, void *resp, size_t core_resp_size, size_t resp_size) { struct verbs_ex_private *priv = get_priv(ctx); if (!VERBS_WRITE_ONLY && (VERBS_IOCTL_ONLY || priv->use_ioctl_write)) return ioctl_write( ctx, IB_USER_VERBS_CMD_FLAG_EXTENDED | write_method, req + 1, core_req_size - sizeof(*req), req_size - sizeof(*req), resp, core_resp_size, resp_size); req->hdr.command = IB_USER_VERBS_CMD_FLAG_EXTENDED | write_method; req->hdr.in_words = __check_divide(core_req_size - sizeof(struct ex_hdr), 8); req->hdr.out_words = __check_divide(core_resp_size, 8); req->ex_hdr.provider_in_words = __check_divide(req_size - core_req_size, 8); req->ex_hdr.provider_out_words = __check_divide(resp_size - core_resp_size, 8); req->ex_hdr.response = ioctl_ptr_to_u64(resp); req->ex_hdr.cmd_hdr_reserved = 0; /* * Users assumes the stack buffer is zeroed before passing to the * kernel for writing. New kernels with the ioctl path do this * automatically for us. */ if (resp) memset(resp, 0, resp_size); if (write(ctx->cmd_fd, req, req_size) != req_size) return errno; if (resp) VALGRIND_MAKE_MEM_DEFINED(resp, resp_size); return 0; } rdma-core-56.1/libibverbs/cmd_flow.c000066400000000000000000000041531477342711600174160ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include int ibv_cmd_destroy_flow(struct ibv_flow *flow_id) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_FLOW, UVERBS_METHOD_FLOW_DESTROY, 1, NULL); int ret; fill_attr_in_obj(cmdb, UVERBS_ATTR_DESTROY_FLOW_HANDLE, flow_id->handle); switch (execute_ioctl_fallback(flow_id->context, destroy_ah, cmdb, &ret)) { case TRY_WRITE: { struct ibv_destroy_flow req; req.core_payload = (struct ib_uverbs_destroy_flow){ .flow_handle = flow_id->handle, }; ret = execute_cmd_write_ex_req( flow_id->context, IB_USER_VERBS_EX_CMD_DESTROY_FLOW, &req, sizeof(req)); break; } default: break; } if (verbs_is_destroy_err(&ret)) return ret; return 0; } rdma-core-56.1/libibverbs/cmd_flow_action.c000066400000000000000000000102711477342711600207510ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include static void scrub_esp_encap(struct ibv_flow_action_esp_encap *esp_encap) { scrub_ptr_attr(esp_encap->val_ptr); scrub_ptr_attr(esp_encap->next_ptr); } static int copy_flow_action_esp(struct ibv_flow_action_esp_attr *esp, struct ibv_command_buffer *cmd) { if (esp->comp_mask & IBV_FLOW_ACTION_ESP_MASK_ESN) fill_attr_in(cmd, UVERBS_ATTR_FLOW_ACTION_ESP_ESN, &esp->esn, sizeof(esp->esn)); if (esp->keymat_ptr) fill_attr_in_enum(cmd, UVERBS_ATTR_FLOW_ACTION_ESP_KEYMAT, esp->keymat_proto, esp->keymat_ptr, esp->keymat_len); if (esp->replay_ptr) fill_attr_in_enum(cmd, UVERBS_ATTR_FLOW_ACTION_ESP_REPLAY, esp->replay_proto, esp->replay_ptr, esp->replay_len); if (esp->esp_encap) { scrub_esp_encap(esp->esp_encap); fill_attr_in_ptr(cmd, UVERBS_ATTR_FLOW_ACTION_ESP_ENCAP, esp->esp_encap); } if (esp->esp_attr) fill_attr_in_ptr(cmd, UVERBS_ATTR_FLOW_ACTION_ESP_ATTRS, esp->esp_attr); return 0; } #define FLOW_ACTION_ESP_ATTRS_NUM 6 int ibv_cmd_create_flow_action_esp(struct ibv_context *ctx, struct ibv_flow_action_esp_attr *attr, struct verbs_flow_action *flow_action, struct ibv_command_buffer *driver) { DECLARE_COMMAND_BUFFER_LINK(cmd, UVERBS_OBJECT_FLOW_ACTION, UVERBS_METHOD_FLOW_ACTION_ESP_CREATE, FLOW_ACTION_ESP_ATTRS_NUM, driver); struct ib_uverbs_attr *handle = fill_attr_out_obj( cmd, UVERBS_ATTR_CREATE_FLOW_ACTION_ESP_HANDLE); int ret; ret = copy_flow_action_esp(attr, cmd); if (ret) return ret; ret = execute_ioctl(ctx, cmd); if (ret) return errno; flow_action->action.context = ctx; flow_action->type = IBV_FLOW_ACTION_ESP; flow_action->handle = read_attr_obj( UVERBS_ATTR_CREATE_FLOW_ACTION_ESP_HANDLE, handle); return 0; } int ibv_cmd_modify_flow_action_esp(struct verbs_flow_action *flow_action, struct ibv_flow_action_esp_attr *attr, struct ibv_command_buffer *driver) { DECLARE_COMMAND_BUFFER_LINK(cmd, UVERBS_OBJECT_FLOW_ACTION, UVERBS_METHOD_FLOW_ACTION_ESP_MODIFY, FLOW_ACTION_ESP_ATTRS_NUM, driver); int ret; fill_attr_in_obj(cmd, UVERBS_ATTR_MODIFY_FLOW_ACTION_ESP_HANDLE, flow_action->handle); ret = copy_flow_action_esp(attr, cmd); if (ret) return ret; return execute_ioctl(flow_action->action.context, cmd); } int ibv_cmd_destroy_flow_action(struct verbs_flow_action *action) { DECLARE_COMMAND_BUFFER(cmd, UVERBS_OBJECT_FLOW_ACTION, UVERBS_METHOD_FLOW_ACTION_DESTROY, 1); int ret; fill_attr_in_obj(cmd, UVERBS_ATTR_DESTROY_FLOW_ACTION_HANDLE, action->handle); ret = execute_ioctl(action->action.context, cmd); if (verbs_is_destroy_err(&ret)) return ret; return 0; } rdma-core-56.1/libibverbs/cmd_ioctl.c000066400000000000000000000140061477342711600175570ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include "ibverbs.h" #include #include #include #include /* Number of attrs in this and all the link'd buffers */ unsigned int __ioctl_final_num_attrs(unsigned int num_attrs, struct ibv_command_buffer *link) { for (; link; link = link->next) num_attrs += link->next_attr - link->hdr.attrs; return num_attrs; } /* Linearize the link'd buffers into this one */ static void prepare_attrs(struct ibv_command_buffer *cmd) { struct ib_uverbs_attr *end = cmd->next_attr; struct ibv_command_buffer *link; for (link = cmd->next; link; link = link->next) { struct ib_uverbs_attr *cur; assert(cmd->hdr.object_id == link->hdr.object_id); assert(cmd->hdr.method_id == link->hdr.method_id); /* * Keep track of where the uhw_in lands in the final array if * we copy it from a link */ if (!VERBS_IOCTL_ONLY && link->uhw_in_idx != _UHW_NO_INDEX) { assert(cmd->uhw_in_idx == _UHW_NO_INDEX); cmd->uhw_in_idx = link->uhw_in_idx + (end - cmd->hdr.attrs); } for (cur = link->hdr.attrs; cur != link->next_attr; cur++) *end++ = *cur; assert(end <= cmd->last_attr); } cmd->hdr.num_attrs = end - cmd->hdr.attrs; /* * We keep the in UHW uninlined until directly before sending to * support the compat path. See _fill_attr_in_uhw */ if (!VERBS_IOCTL_ONLY && cmd->uhw_in_idx != _UHW_NO_INDEX) { struct ib_uverbs_attr *uhw = &cmd->hdr.attrs[cmd->uhw_in_idx]; assert(uhw->attr_id == UVERBS_ATTR_UHW_IN); if (uhw->len <= sizeof(uhw->data)) memcpy(&uhw->data, (void *)(uintptr_t)uhw->data, uhw->len); } } static void finalize_attr(struct ib_uverbs_attr *attr) { /* Only matches UVERBS_ATTR_TYPE_PTR_OUT */ if (attr->flags & UVERBS_ATTR_F_VALID_OUTPUT && attr->len) VALGRIND_MAKE_MEM_DEFINED((void *)(uintptr_t)attr->data, attr->len); } /* * Copy the link'd attrs back to their source and make all output buffers safe * for VALGRIND */ static void finalize_attrs(struct ibv_command_buffer *cmd) { struct ibv_command_buffer *link; struct ib_uverbs_attr *end; for (end = cmd->hdr.attrs; end != cmd->next_attr; end++) finalize_attr(end); for (link = cmd->next; link; link = link->next) { struct ib_uverbs_attr *cur; for (cur = link->hdr.attrs; cur != link->next_attr; cur++) { finalize_attr(end); *cur = *end++; } } } int execute_ioctl(struct ibv_context *context, struct ibv_command_buffer *cmd) { struct verbs_context *vctx = verbs_get_ctx(context); /* * One of the fill functions was given input that cannot be marshaled */ if (unlikely(cmd->buffer_error)) { errno = EINVAL; return errno; } prepare_attrs(cmd); cmd->hdr.length = sizeof(cmd->hdr) + sizeof(cmd->hdr.attrs[0]) * cmd->hdr.num_attrs; cmd->hdr.reserved1 = 0; cmd->hdr.reserved2 = 0; cmd->hdr.driver_id = vctx->priv->driver_id; if (ioctl(context->cmd_fd, RDMA_VERBS_IOCTL, &cmd->hdr)) return errno; finalize_attrs(cmd); return 0; } /* * The compat scheme for UHW IN requires a pointer in .data, however the * kernel protocol requires pointers < 8 to be inlined into .data. We defer * that transformation until directly before the ioctl. 
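 *
 * For illustration (a sketch of the flow, not an ABI statement): a
 * 4 byte UHW payload is stored here by pointer, and prepare_attrs()
 * later memcpy()'s those bytes directly into attr->data just before
 * the ioctl is issued, matching the inlining done by fill_attr_in().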
*/ static inline struct ib_uverbs_attr * _fill_attr_in_uhw(struct ibv_command_buffer *cmd, uint16_t attr_id, const void *data, size_t len) { struct ib_uverbs_attr *attr = _ioctl_next_attr(cmd, attr_id); if (unlikely(len > UINT16_MAX)) cmd->buffer_error = 1; attr->len = len; attr->data = ioctl_ptr_to_u64(data); return attr; } /* * This helper is used in the driver compat wrappers to build the * command buffer from the legacy input pointers format. */ void _write_set_uhw(struct ibv_command_buffer *cmdb, const void *req, size_t core_req_size, size_t req_size, void *resp, size_t core_resp_size, size_t resp_size) { if (req && core_req_size < req_size) { if (VERBS_IOCTL_ONLY) cmdb->uhw_in_idx = fill_attr_in(cmdb, UVERBS_ATTR_UHW_IN, (uint8_t *)req + core_req_size, req_size - core_req_size) - cmdb->hdr.attrs; else cmdb->uhw_in_idx = _fill_attr_in_uhw(cmdb, UVERBS_ATTR_UHW_IN, (uint8_t *)req + core_req_size, req_size - core_req_size) - cmdb->hdr.attrs; cmdb->uhw_in_headroom_dwords = __check_divide(core_req_size, 4); } if (resp && core_resp_size < resp_size) { cmdb->uhw_out_idx = fill_attr_out(cmdb, UVERBS_ATTR_UHW_OUT, (uint8_t *)resp + core_resp_size, resp_size - core_resp_size) - cmdb->hdr.attrs; cmdb->uhw_out_headroom_dwords = __check_divide(core_resp_size, 4); } } rdma-core-56.1/libibverbs/cmd_ioctl.h000066400000000000000000000327041477342711600175710ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __INFINIBAND_VERBS_IOCTL_H #define __INFINIBAND_VERBS_IOCTL_H #include #include #include #include #include #include #include static inline uint64_t ioctl_ptr_to_u64(const void *ptr) { if (sizeof(ptr) == sizeof(uint64_t)) return (uintptr_t)ptr; /* * Some CPU architectures require sign extension when converting from * a 32 bit to 64 bit pointer. This should match the kernel * implementation of compat_ptr() for the architecture. 
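 *
 * Example (assuming a 32-bit ABI that sign-extends, as the tilegx
 * case below does): a user pointer of 0x80000000 must become
 * 0xffffffff80000000, which the (int64_t)(intptr_t) cast produces;
 * a plain (uintptr_t) cast would zero-extend it instead.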
*/ #if defined(__tilegx__) return (int64_t)(intptr_t)ptr; #else return (uintptr_t)ptr; #endif } static inline void _scrub_ptr_attr(void **ptr) { #if UINTPTR_MAX == UINT64_MAX /* Do nothing */ #else RDMA_UAPI_PTR(void *, data) *scrub_data; scrub_data = container_of(ptr, typeof(*scrub_data), data); scrub_data->data_data_u64 = ioctl_ptr_to_u64(scrub_data->data); #endif } #define scrub_ptr_attr(ptr) _scrub_ptr_attr((void **)(&ptr)) /* * The command buffer is organized as a linked list of blocks of attributes. * Each stack frame allocates its block and then calls up toward to core code * which will do the ioctl. The frame that does the ioctl calls the special * FINAL variant which will allocate enough space to linearize the attribute * buffer for the kernel. * * The current range of attributes to fill is next_attr -> last_attr. */ struct ibv_command_buffer { struct ibv_command_buffer *next; struct ib_uverbs_attr *next_attr; struct ib_uverbs_attr *last_attr; /* * Used by the legacy write interface to keep track of where the UHW * buffer is located and the 'headroom' space that the common code * uses to construct the command header and common command struct * directly before the drivers' UHW. */ uint8_t uhw_in_idx; uint8_t uhw_out_idx; uint8_t uhw_in_headroom_dwords; uint8_t uhw_out_headroom_dwords; uint8_t buffer_error:1; /* * These flags control what execute_ioctl_fallback does if the kernel * does not support ioctl */ uint8_t fallback_require_ex:1; uint8_t fallback_ioctl_only:1; struct ib_uverbs_ioctl_hdr hdr; }; enum {_UHW_NO_INDEX = 0xFF}; /* * Constructing an array of ibv_command_buffer is a reasonable way to expand * the VLA in hdr.attrs on the stack and also allocate some internal state in * a single contiguous stack memory region. It will over-allocate the region in * some cases, but this approach allows the number of elements to be dynamic, * and not fixed as a compile time constant. */ #define _IOCTL_NUM_CMDB(_num_attrs) \ ((sizeof(struct ibv_command_buffer) + \ sizeof(struct ib_uverbs_attr) * (_num_attrs) + \ sizeof(struct ibv_command_buffer) - 1) / \ sizeof(struct ibv_command_buffer)) unsigned int __ioctl_final_num_attrs(unsigned int num_attrs, struct ibv_command_buffer *link); /* If the user doesn't provide a link then don't create a VLA */ #define _ioctl_final_num_attrs(_num_attrs, _link) \ ((__builtin_constant_p(!(_link)) && !(_link)) \ ? (_num_attrs) \ : __ioctl_final_num_attrs(_num_attrs, _link)) #define _COMMAND_BUFFER_INIT(_hdr, _object_id, _method_id, _num_attrs, _link) \ ((struct ibv_command_buffer){ \ .hdr = \ { \ .object_id = (_object_id), \ .method_id = (_method_id), \ }, \ .next = _link, \ .uhw_in_idx = _UHW_NO_INDEX, \ .uhw_out_idx = _UHW_NO_INDEX, \ .next_attr = (_hdr).attrs, \ .last_attr = (_hdr).attrs + _num_attrs}) /* * C99 does not permit an initializer for VLAs, so this function does the init * instead. It is called in the wonky way so that DELCARE_COMMAND_BUFFER can * still be a 'variable', and we so we don't require C11 mode. */ static inline int _ioctl_init_cmdb(struct ibv_command_buffer *cmd, uint16_t object_id, uint16_t method_id, size_t num_attrs, struct ibv_command_buffer *link) { *cmd = _COMMAND_BUFFER_INIT(cmd->hdr, object_id, method_id, num_attrs, link); return 0; } /* * Construct an IOCTL command buffer on the stack with enough space for * _num_attrs elements. _num_attrs does not have to be a compile time constant. * _link is a previous COMMAND_BUFFER in the call chain. 
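 *
 * A minimal usage sketch, mirroring the simplest callers in this
 * library (see ibv_cmd_destroy_flow_action() in cmd_flow_action.c):
 *
 *	DECLARE_COMMAND_BUFFER(cmd, UVERBS_OBJECT_FLOW_ACTION,
 *			       UVERBS_METHOD_FLOW_ACTION_DESTROY, 1);
 *	fill_attr_in_obj(cmd, UVERBS_ATTR_DESTROY_FLOW_ACTION_HANDLE,
 *			 action->handle);
 *	ret = execute_ioctl(action->action.context, cmd);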
*/ #ifndef __CHECKER__ #define DECLARE_COMMAND_BUFFER_LINK(_name, _object_id, _method_id, _num_attrs, \ _link) \ const unsigned int __##_name##total = \ _ioctl_final_num_attrs(_num_attrs, _link); \ struct ibv_command_buffer _name[_IOCTL_NUM_CMDB(__##_name##total)]; \ int __attribute__((unused)) __##_name##dummy = _ioctl_init_cmdb( \ _name, _object_id, _method_id, __##_name##total, _link) #else /* * sparse enforces kernel rules which forbids VLAs. Make the VLA into a static * array when running sparse. Don't actually run the sparse compile result. * Sparse also doesn't like arrays of VLAs */ #define DECLARE_COMMAND_BUFFER_LINK(_name, _object_id, _method_id, _num_attrs, \ _link) \ uint64_t __##_name##storage[10]; \ struct ibv_command_buffer *_name = (void *)__##_name##storage[10]; \ int __attribute__((unused)) __##_name##dummy = \ _ioctl_init_cmdb(_name, _object_id, _method_id, 10, _link) #endif #define DECLARE_COMMAND_BUFFER(_name, _object_id, _method_id, _num_attrs) \ DECLARE_COMMAND_BUFFER_LINK(_name, _object_id, _method_id, _num_attrs, \ NULL) int execute_ioctl(struct ibv_context *context, struct ibv_command_buffer *cmd); static inline struct ib_uverbs_attr * _ioctl_next_attr(struct ibv_command_buffer *cmd, uint16_t attr_id) { struct ib_uverbs_attr *attr; assert(cmd->next_attr < cmd->last_attr); attr = cmd->next_attr++; *attr = (struct ib_uverbs_attr){ .attr_id = attr_id, /* * All attributes default to mandatory. Wrapper the fill_* * call in attr_optional() to make it optional. */ .flags = UVERBS_ATTR_F_MANDATORY, }; return attr; } /* * This construction is insane, an expression with a side effect that returns * from the calling function, but it is a non-invasive way to get the compiler * to elide the IOCTL support in the backwards compat command functions * without disturbing native ioctl support. * * A command function will set last_attr on the stack to NULL, and if it is * coded properly, the compiler will prove that last_attr is never changed and * elide the function. Unfortunately this penalizes native ioctl uses with the * extra if overhead. * * For this reason, _ioctl_next_attr must never be called outside a fill * function. */ #if VERBS_WRITE_ONLY #define _ioctl_next_attr(cmd, attr_id) \ ({ \ if (!((cmd)->last_attr)) \ return NULL; \ _ioctl_next_attr(cmd, attr_id); \ }) #endif /* Make the attribute optional. 
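 * All attributes are created as mandatory by _ioctl_next_attr(), so the
 * fill_* call is wrapped to clear that default, e.g.
 *
 *	attr_optional(fill_attr_in_uint32(cmd, attr_id, val));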
*/ static inline struct ib_uverbs_attr *attr_optional(struct ib_uverbs_attr *attr) { if (!attr) return attr; attr->flags &= ~UVERBS_ATTR_F_MANDATORY; return attr; } /* Send attributes of kernel type UVERBS_ATTR_TYPE_IDR */ static inline struct ib_uverbs_attr * fill_attr_in_obj(struct ibv_command_buffer *cmd, uint16_t attr_id, uint32_t idr) { struct ib_uverbs_attr *attr = _ioctl_next_attr(cmd, attr_id); /* UVERBS_ATTR_TYPE_IDR uses a 64 bit value for the idr # */ attr->data = idr; return attr; } static inline struct ib_uverbs_attr * fill_attr_out_obj(struct ibv_command_buffer *cmd, uint16_t attr_id) { return fill_attr_in_obj(cmd, attr_id, 0); } static inline uint32_t read_attr_obj(uint16_t attr_id, struct ib_uverbs_attr *attr) { assert(attr->attr_id == attr_id); return attr->data; } /* Send attributes of kernel type UVERBS_ATTR_TYPE_PTR_IN */ static inline struct ib_uverbs_attr * fill_attr_in(struct ibv_command_buffer *cmd, uint16_t attr_id, const void *data, size_t len) { struct ib_uverbs_attr *attr = _ioctl_next_attr(cmd, attr_id); if (unlikely(len > UINT16_MAX)) cmd->buffer_error = 1; attr->len = len; if (len <= sizeof(uint64_t)) memcpy(&attr->data, data, len); else attr->data = ioctl_ptr_to_u64(data); return attr; } #define fill_attr_in_ptr(cmd, attr_id, ptr) \ fill_attr_in(cmd, attr_id, ptr, sizeof(*ptr)) /* Send attributes of various inline kernel types */ static inline struct ib_uverbs_attr * fill_attr_in_uint64(struct ibv_command_buffer *cmd, uint16_t attr_id, uint64_t data) { struct ib_uverbs_attr *attr = _ioctl_next_attr(cmd, attr_id); attr->len = sizeof(data); attr->data = data; return attr; } #define fill_attr_const_in(cmd, attr_id, _data) \ fill_attr_in_uint64(cmd, attr_id, _data) static inline struct ib_uverbs_attr * fill_attr_in_uint32(struct ibv_command_buffer *cmd, uint16_t attr_id, uint32_t data) { struct ib_uverbs_attr *attr = _ioctl_next_attr(cmd, attr_id); attr->len = sizeof(data); memcpy(&attr->data, &data, sizeof(data)); return attr; } static inline struct ib_uverbs_attr * fill_attr_in_fd(struct ibv_command_buffer *cmd, uint16_t attr_id, int fd) { struct ib_uverbs_attr *attr; if (fd == -1) return NULL; attr = _ioctl_next_attr(cmd, attr_id); /* UVERBS_ATTR_TYPE_FD uses a 64 bit value for the idr # */ attr->data = fd; return attr; } static inline struct ib_uverbs_attr * fill_attr_out_fd(struct ibv_command_buffer *cmd, uint16_t attr_id, int fd) { struct ib_uverbs_attr *attr = _ioctl_next_attr(cmd, attr_id); attr->data = 0; return attr; } static inline int read_attr_fd(uint16_t attr_id, struct ib_uverbs_attr *attr) { assert(attr->attr_id == attr_id); /* The kernel cannot fail to create a FD here, it never returns -1 */ return attr->data; } /* Send attributes of kernel type UVERBS_ATTR_TYPE_PTR_OUT */ static inline struct ib_uverbs_attr * fill_attr_out(struct ibv_command_buffer *cmd, uint16_t attr_id, void *data, size_t len) { struct ib_uverbs_attr *attr = _ioctl_next_attr(cmd, attr_id); if (unlikely(len > UINT16_MAX)) cmd->buffer_error = 1; attr->len = len; attr->data = ioctl_ptr_to_u64(data); return attr; } #define fill_attr_out_ptr(cmd, attr_id, ptr) \ fill_attr_out(cmd, attr_id, ptr, sizeof(*(ptr))) /* If size*nelems overflows size_t this returns SIZE_MAX */ static inline size_t _array_len(size_t size, size_t nelems) { if (size != 0 && SIZE_MAX / size <= nelems) return SIZE_MAX; return size * nelems; } #define fill_attr_out_ptr_array(cmd, attr_id, ptr, nelems) \ fill_attr_out(cmd, attr_id, ptr, _array_len(sizeof(*ptr), nelems)) #define fill_attr_in_ptr_array(cmd, 
attr_id, ptr, nelems) \ fill_attr_in(cmd, attr_id, ptr, _array_len(sizeof(*ptr), nelems)) static inline size_t __check_divide(size_t val, unsigned int div) { assert(val % div == 0); return val / div; } static inline struct ib_uverbs_attr * fill_attr_in_enum(struct ibv_command_buffer *cmd, uint16_t attr_id, uint8_t elem_id, const void *data, size_t len) { struct ib_uverbs_attr *attr; attr = fill_attr_in(cmd, attr_id, data, len); attr->attr_data.enum_data.elem_id = elem_id; return attr; } /* Send attributes of kernel type UVERBS_ATTR_TYPE_IDRS_ARRAY */ static inline struct ib_uverbs_attr * fill_attr_in_objs_arr(struct ibv_command_buffer *cmd, uint16_t attr_id, const uint32_t *idrs_arr, size_t nelems) { return fill_attr_in(cmd, attr_id, idrs_arr, _array_len(sizeof(*idrs_arr), nelems)); } #endif rdma-core-56.1/libibverbs/cmd_mr.c000066400000000000000000000116371477342711600170720ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * Copyright (c) 2020 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include int ibv_cmd_advise_mr(struct ibv_pd *pd, enum ibv_advise_mr_advice advice, uint32_t flags, struct ibv_sge *sg_list, uint32_t num_sge) { DECLARE_COMMAND_BUFFER(cmd, UVERBS_OBJECT_MR, UVERBS_METHOD_ADVISE_MR, 4); fill_attr_in_obj(cmd, UVERBS_ATTR_ADVISE_MR_PD_HANDLE, pd->handle); fill_attr_const_in(cmd, UVERBS_ATTR_ADVISE_MR_ADVICE, advice); fill_attr_in_uint32(cmd, UVERBS_ATTR_ADVISE_MR_FLAGS, flags); fill_attr_in_ptr_array(cmd, UVERBS_ATTR_ADVISE_MR_SGE_LIST, sg_list, num_sge); return execute_ioctl(pd->context, cmd); } int ibv_cmd_dereg_mr(struct verbs_mr *vmr) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_MR, UVERBS_METHOD_MR_DESTROY, 1, NULL); int ret; fill_attr_in_obj(cmdb, UVERBS_ATTR_DESTROY_MR_HANDLE, vmr->ibv_mr.handle); switch (execute_ioctl_fallback(vmr->ibv_mr.context, dereg_mr, cmdb, &ret)) { case TRY_WRITE: { struct ibv_dereg_mr req; req.core_payload = (struct ib_uverbs_dereg_mr){ .mr_handle = vmr->ibv_mr.handle, }; ret = execute_cmd_write_req(vmr->ibv_mr.context, IB_USER_VERBS_CMD_DEREG_MR, &req, sizeof(req)); break; } default: break; } if (verbs_is_destroy_err(&ret)) return ret; return 0; } int ibv_cmd_query_mr(struct ibv_pd *pd, struct verbs_mr *vmr, uint32_t mr_handle) { DECLARE_FBCMD_BUFFER(cmd, UVERBS_OBJECT_MR, UVERBS_METHOD_QUERY_MR, 4, NULL); struct ibv_mr *mr = &vmr->ibv_mr; int ret; fill_attr_in_obj(cmd, UVERBS_ATTR_QUERY_MR_HANDLE, mr_handle); fill_attr_out_ptr(cmd, UVERBS_ATTR_QUERY_MR_RESP_LKEY, &mr->lkey); fill_attr_out_ptr(cmd, UVERBS_ATTR_QUERY_MR_RESP_RKEY, &mr->rkey); fill_attr_out_ptr(cmd, UVERBS_ATTR_QUERY_MR_RESP_LENGTH, &mr->length); ret = execute_ioctl(pd->context, cmd); if (ret) return ret; mr->handle = mr_handle; mr->context = pd->context; mr->pd = pd; mr->addr = NULL; vmr->mr_type = IBV_MR_TYPE_IMPORTED_MR; return 0; } int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access, struct verbs_mr *vmr, struct ibv_command_buffer *driver) { DECLARE_COMMAND_BUFFER_LINK(cmdb, UVERBS_OBJECT_MR, UVERBS_METHOD_REG_DMABUF_MR, 9, driver); struct ib_uverbs_attr *handle; uint32_t lkey, rkey; int ret; handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_HANDLE); fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_LKEY, &lkey); fill_attr_out_ptr(cmdb, UVERBS_ATTR_REG_DMABUF_MR_RESP_RKEY, &rkey); fill_attr_in_obj(cmdb, UVERBS_ATTR_REG_DMABUF_MR_PD_HANDLE, pd->handle); fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_OFFSET, offset); fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_LENGTH, length); fill_attr_in_uint64(cmdb, UVERBS_ATTR_REG_DMABUF_MR_IOVA, iova); fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_FD, fd); fill_attr_in_uint32(cmdb, UVERBS_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, access); ret = execute_ioctl(pd->context, cmdb); if (ret) return errno; vmr->ibv_mr.handle = read_attr_obj(UVERBS_ATTR_REG_DMABUF_MR_HANDLE, handle); vmr->ibv_mr.context = pd->context; vmr->ibv_mr.lkey = lkey; vmr->ibv_mr.rkey = rkey; vmr->ibv_mr.pd = pd; vmr->ibv_mr.addr = (void *)(uintptr_t)offset; vmr->ibv_mr.length = length; vmr->mr_type = IBV_MR_TYPE_DMABUF_MR; return 0; } rdma-core-56.1/libibverbs/cmd_mw.c000066400000000000000000000040721477342711600170720ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include int ibv_cmd_dealloc_mw(struct ibv_mw *mw) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_MW, UVERBS_METHOD_MW_DESTROY, 1, NULL); int ret; fill_attr_in_obj(cmdb, UVERBS_ATTR_DESTROY_MW_HANDLE, mw->handle); switch (execute_ioctl_fallback(mw->context, dealloc_mw, cmdb, &ret)) { case TRY_WRITE: { struct ibv_dealloc_mw req; req.core_payload = (struct ib_uverbs_dealloc_mw){ .mw_handle = mw->handle, }; ret = execute_cmd_write_req(mw->context, IB_USER_VERBS_CMD_DEALLOC_MW, &req, sizeof(req)); break; } default: break; } if (verbs_is_destroy_err(&ret)) return ret; return 0; } rdma-core-56.1/libibverbs/cmd_pd.c000066400000000000000000000040721477342711600170520ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include int ibv_cmd_dealloc_pd(struct ibv_pd *pd) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_PD, UVERBS_METHOD_PD_DESTROY, 1, NULL); int ret; fill_attr_in_obj(cmdb, UVERBS_ATTR_DESTROY_PD_HANDLE, pd->handle); switch (execute_ioctl_fallback(pd->context, dealloc_pd, cmdb, &ret)) { case TRY_WRITE: { struct ibv_dealloc_pd req; req.core_payload = (struct ib_uverbs_dealloc_pd){ .pd_handle = pd->handle, }; ret = execute_cmd_write_req(pd->context, IB_USER_VERBS_CMD_DEALLOC_PD, &req, sizeof(req)); break; } default: break; } if (verbs_is_destroy_err(&ret)) return ret; return 0; } rdma-core-56.1/libibverbs/cmd_qp.c000066400000000000000000000351521477342711600170720ustar00rootroot00000000000000/* * Copyright (c) 2020 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include "ibverbs.h" enum { CREATE_QP_EX_SUP_CREATE_FLAGS = IBV_QP_CREATE_BLOCK_SELF_MCAST_LB | IBV_QP_CREATE_SCATTER_FCS | IBV_QP_CREATE_CVLAN_STRIPPING | IBV_QP_CREATE_SOURCE_QPN | IBV_QP_CREATE_PCI_WRITE_END_PADDING }; static void set_qp(struct verbs_qp *vqp, struct ibv_qp *qp_in, struct ibv_qp_init_attr_ex *attr_ex, struct verbs_xrcd *vxrcd) { struct ibv_qp *qp = vqp ? &vqp->qp : qp_in; qp->qp_context = attr_ex->qp_context; qp->pd = attr_ex->pd; qp->send_cq = attr_ex->send_cq; qp->recv_cq = attr_ex->recv_cq; qp->srq = attr_ex->srq; qp->qp_type = attr_ex->qp_type; qp->state = IBV_QPS_RESET; qp->events_completed = 0; pthread_mutex_init(&qp->mutex, NULL); pthread_cond_init(&qp->cond, NULL); if (vqp) { vqp->comp_mask = 0; if (attr_ex->comp_mask & IBV_QP_INIT_ATTR_XRCD) { vqp->comp_mask |= VERBS_QP_XRCD; vqp->xrcd = vxrcd; } } } static int ibv_icmd_create_qp(struct ibv_context *context, struct verbs_qp *vqp, struct ibv_qp *qp_in, struct ibv_qp_init_attr_ex *attr_ex, struct ibv_command_buffer *link) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_QP, UVERBS_METHOD_QP_CREATE, 15, link); struct verbs_ex_private *priv = get_priv(context); struct ib_uverbs_attr *handle; uint32_t qp_num; uint32_t pd_handle; uint32_t send_cq_handle = 0; uint32_t recv_cq_handle = 0; int ret; struct ibv_qp *qp = vqp ? 
&vqp->qp : qp_in; struct verbs_xrcd *vxrcd = NULL; uint32_t create_flags = 0; qp->context = context; switch (attr_ex->qp_type) { case IBV_QPT_XRC_RECV: if (!(attr_ex->comp_mask & IBV_QP_INIT_ATTR_XRCD)) { errno = EINVAL; return errno; } vxrcd = container_of(attr_ex->xrcd, struct verbs_xrcd, xrcd); fill_attr_in_obj(cmdb, UVERBS_ATTR_CREATE_QP_XRCD_HANDLE, vxrcd->handle); pd_handle = vxrcd->handle; break; case IBV_QPT_RC: case IBV_QPT_UD: case IBV_QPT_UC: case IBV_QPT_RAW_PACKET: case IBV_QPT_XRC_SEND: case IBV_QPT_DRIVER: if (!(attr_ex->comp_mask & IBV_QP_INIT_ATTR_PD)) { errno = EINVAL; return errno; } fill_attr_in_obj(cmdb, UVERBS_ATTR_CREATE_QP_PD_HANDLE, attr_ex->pd->handle); pd_handle = attr_ex->pd->handle; if (attr_ex->comp_mask & IBV_QP_INIT_ATTR_IND_TABLE) { if (attr_ex->cap.max_recv_wr || attr_ex->cap.max_recv_sge || attr_ex->recv_cq || attr_ex->srq) { errno = EINVAL; return errno; } fallback_require_ex(cmdb); fill_attr_in_obj(cmdb, UVERBS_ATTR_CREATE_QP_IND_TABLE_HANDLE, attr_ex->rwq_ind_tbl->ind_tbl_handle); /* send_cq is optional */ if (attr_ex->cap.max_send_wr) { fill_attr_in_obj(cmdb, UVERBS_ATTR_CREATE_QP_SEND_CQ_HANDLE, attr_ex->send_cq->handle); send_cq_handle = attr_ex->send_cq->handle; } } else { fill_attr_in_obj(cmdb, UVERBS_ATTR_CREATE_QP_SEND_CQ_HANDLE, attr_ex->send_cq->handle); send_cq_handle = attr_ex->send_cq->handle; if (attr_ex->qp_type != IBV_QPT_XRC_SEND) { fill_attr_in_obj(cmdb, UVERBS_ATTR_CREATE_QP_RECV_CQ_HANDLE, attr_ex->recv_cq->handle); recv_cq_handle = attr_ex->recv_cq->handle; } } /* compatible with kernel code from the 'write' mode */ if (attr_ex->qp_type == IBV_QPT_XRC_SEND) { attr_ex->cap.max_recv_wr = 0; attr_ex->cap.max_recv_sge = 0; } break; default: errno = EINVAL; return errno; } handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_CREATE_QP_HANDLE); fill_attr_const_in(cmdb, UVERBS_ATTR_CREATE_QP_TYPE, attr_ex->qp_type); fill_attr_in_uint64(cmdb, UVERBS_ATTR_CREATE_QP_USER_HANDLE, (uintptr_t)qp); static_assert(offsetof(struct ibv_qp_cap, max_send_wr) == offsetof(struct ib_uverbs_qp_cap, max_send_wr), "Bad layout"); static_assert(offsetof(struct ibv_qp_cap, max_recv_wr) == offsetof(struct ib_uverbs_qp_cap, max_recv_wr), "Bad layout"); static_assert(offsetof(struct ibv_qp_cap, max_send_sge) == offsetof(struct ib_uverbs_qp_cap, max_send_sge), "Bad layout"); static_assert(offsetof(struct ibv_qp_cap, max_recv_sge) == offsetof(struct ib_uverbs_qp_cap, max_recv_sge), "Bad layout"); static_assert(offsetof(struct ibv_qp_cap, max_inline_data) == offsetof(struct ib_uverbs_qp_cap, max_inline_data), "Bad layout"); fill_attr_in_ptr(cmdb, UVERBS_ATTR_CREATE_QP_CAP, &attr_ex->cap); fill_attr_in_fd(cmdb, UVERBS_ATTR_CREATE_QP_EVENT_FD, context->async_fd); if (priv->imported) fallback_require_ioctl(cmdb); if (attr_ex->sq_sig_all) create_flags |= IB_UVERBS_QP_CREATE_SQ_SIG_ALL; if (attr_ex->comp_mask & IBV_QP_INIT_ATTR_CREATE_FLAGS) { if (attr_ex->create_flags & ~CREATE_QP_EX_SUP_CREATE_FLAGS) { errno = EINVAL; return errno; } fallback_require_ex(cmdb); create_flags |= attr_ex->create_flags; if (attr_ex->create_flags & IBV_QP_CREATE_SOURCE_QPN) { fill_attr_in_uint32(cmdb, UVERBS_ATTR_CREATE_QP_SOURCE_QPN, attr_ex->source_qpn); /* source QPN is a self attribute once moving to ioctl, * no extra bit is supported. 
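 * (UVERBS_ATTR_CREATE_QP_SOURCE_QPN filled just above already conveys
 * the request, so the flag is cleared here before the remaining bits
 * are passed via UVERBS_ATTR_CREATE_QP_FLAGS below.)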
*/ create_flags &= ~IBV_QP_CREATE_SOURCE_QPN; } } if (create_flags) fill_attr_in_uint32(cmdb, UVERBS_ATTR_CREATE_QP_FLAGS, create_flags); if (attr_ex->srq) fill_attr_in_obj(cmdb, UVERBS_ATTR_CREATE_QP_SRQ_HANDLE, attr_ex->srq->handle); fill_attr_out_ptr(cmdb, UVERBS_ATTR_CREATE_QP_RESP_CAP, &attr_ex->cap); fill_attr_out_ptr(cmdb, UVERBS_ATTR_CREATE_QP_RESP_QP_NUM, &qp_num); switch (execute_ioctl_fallback(context, create_qp, cmdb, &ret)) { case TRY_WRITE: { if (abi_ver > 4) { DECLARE_LEGACY_UHW_BUFS(link, IB_USER_VERBS_CMD_CREATE_QP); *req = (struct ib_uverbs_create_qp){ .pd_handle = pd_handle, .user_handle = (uintptr_t)qp, .max_send_wr = attr_ex->cap.max_send_wr, .max_recv_wr = attr_ex->cap.max_recv_wr, .max_send_sge = attr_ex->cap.max_send_sge, .max_recv_sge = attr_ex->cap.max_recv_sge, .max_inline_data = attr_ex->cap.max_inline_data, .sq_sig_all = attr_ex->sq_sig_all, .qp_type = attr_ex->qp_type, .srq_handle = attr_ex->srq ? attr_ex->srq->handle : 0, .is_srq = !!attr_ex->srq, .recv_cq_handle = recv_cq_handle, .send_cq_handle = send_cq_handle, }; ret = execute_write_bufs( context, IB_USER_VERBS_CMD_CREATE_QP, req, resp); if (ret) return ret; qp->handle = resp->qp_handle; qp->qp_num = resp->qpn; attr_ex->cap.max_recv_sge = resp->max_recv_sge; attr_ex->cap.max_send_sge = resp->max_send_sge; attr_ex->cap.max_recv_wr = resp->max_recv_wr; attr_ex->cap.max_send_wr = resp->max_send_wr; attr_ex->cap.max_inline_data = resp->max_inline_data; } else if (abi_ver == 4) { DECLARE_LEGACY_UHW_BUFS(link, IB_USER_VERBS_CMD_CREATE_QP_V4); *req = (struct ib_uverbs_create_qp){ .pd_handle = pd_handle, .user_handle = (uintptr_t)qp, .max_send_wr = attr_ex->cap.max_send_wr, .max_recv_wr = attr_ex->cap.max_recv_wr, .max_send_sge = attr_ex->cap.max_send_sge, .max_recv_sge = attr_ex->cap.max_recv_sge, .max_inline_data = attr_ex->cap.max_inline_data, .sq_sig_all = attr_ex->sq_sig_all, .qp_type = attr_ex->qp_type, .srq_handle = attr_ex->srq ? attr_ex->srq->handle : 0, .is_srq = !!attr_ex->srq, .recv_cq_handle = recv_cq_handle, .send_cq_handle = send_cq_handle, }; ret = execute_write_bufs( context, IB_USER_VERBS_CMD_CREATE_QP_V4, req, resp); if (ret) return ret; qp->handle = resp->qp_handle; qp->qp_num = resp->qpn; attr_ex->cap.max_recv_sge = resp->max_recv_sge; attr_ex->cap.max_send_sge = resp->max_send_sge; attr_ex->cap.max_recv_wr = resp->max_recv_wr; attr_ex->cap.max_send_wr = resp->max_send_wr; attr_ex->cap.max_inline_data = resp->max_inline_data; } else { DECLARE_LEGACY_UHW_BUFS(link, IB_USER_VERBS_CMD_CREATE_QP_V3); *req = (struct ib_uverbs_create_qp){ .pd_handle = pd_handle, .user_handle = (uintptr_t)qp, .max_send_wr = attr_ex->cap.max_send_wr, .max_recv_wr = attr_ex->cap.max_recv_wr, .max_send_sge = attr_ex->cap.max_send_sge, .max_recv_sge = attr_ex->cap.max_recv_sge, .max_inline_data = attr_ex->cap.max_inline_data, .sq_sig_all = attr_ex->sq_sig_all, .qp_type = attr_ex->qp_type, .srq_handle = attr_ex->srq ? 
attr_ex->srq->handle : 0, .is_srq = !!attr_ex->srq, .recv_cq_handle = recv_cq_handle, .send_cq_handle = send_cq_handle, }; ret = execute_write_bufs( context, IB_USER_VERBS_CMD_CREATE_QP_V3, req, resp); if (ret) return ret; qp->handle = resp->qp_handle; qp->qp_num = resp->qpn; } set_qp(vqp, qp, attr_ex, vxrcd); return 0; } case TRY_WRITE_EX: { DECLARE_LEGACY_UHW_BUFS_EX(link, IB_USER_VERBS_EX_CMD_CREATE_QP); *req = (struct ib_uverbs_ex_create_qp){ .pd_handle = pd_handle, .user_handle = (uintptr_t)qp, .max_send_wr = attr_ex->cap.max_send_wr, .max_recv_wr = attr_ex->cap.max_recv_wr, .max_send_sge = attr_ex->cap.max_send_sge, .max_recv_sge = attr_ex->cap.max_recv_sge, .max_inline_data = attr_ex->cap.max_inline_data, .sq_sig_all = attr_ex->sq_sig_all, .qp_type = attr_ex->qp_type, .srq_handle = attr_ex->srq ? attr_ex->srq->handle : 0, .is_srq = !!attr_ex->srq, .recv_cq_handle = recv_cq_handle, .send_cq_handle = send_cq_handle, }; if (attr_ex->comp_mask & IBV_QP_INIT_ATTR_CREATE_FLAGS) { req->create_flags = attr_ex->create_flags; if (attr_ex->create_flags & IBV_QP_CREATE_SOURCE_QPN) req->source_qpn = attr_ex->source_qpn; } if (attr_ex->comp_mask & IBV_QP_INIT_ATTR_IND_TABLE) { req->rwq_ind_tbl_handle = attr_ex->rwq_ind_tbl->ind_tbl_handle; req->comp_mask = IB_UVERBS_CREATE_QP_MASK_IND_TABLE; } ret = execute_write_bufs_ex( context, IB_USER_VERBS_EX_CMD_CREATE_QP, req, resp); if (ret) return ret; qp->handle = resp->base.qp_handle; qp->qp_num = resp->base.qpn; attr_ex->cap.max_recv_sge = resp->base.max_recv_sge; attr_ex->cap.max_send_sge = resp->base.max_send_sge; attr_ex->cap.max_recv_wr = resp->base.max_recv_wr; attr_ex->cap.max_send_wr = resp->base.max_send_wr; attr_ex->cap.max_inline_data = resp->base.max_inline_data; set_qp(vqp, qp, attr_ex, vxrcd); return 0; } case SUCCESS: break; default: return ret; } qp->handle = read_attr_obj(UVERBS_ATTR_CREATE_QP_HANDLE, handle); qp->qp_num = qp_num; set_qp(vqp, qp, attr_ex, vxrcd); return 0; } int ibv_cmd_create_qp(struct ibv_pd *pd, struct ibv_qp *qp, struct ibv_qp_init_attr *attr, struct ibv_create_qp *cmd, size_t cmd_size, struct ib_uverbs_create_qp_resp *resp, size_t resp_size) { DECLARE_CMD_BUFFER_COMPAT(cmdb, UVERBS_OBJECT_QP, UVERBS_METHOD_QP_CREATE, cmd, cmd_size, resp, resp_size); struct ibv_qp_init_attr_ex attr_ex = {}; int ret; attr_ex.qp_context = attr->qp_context; attr_ex.send_cq = attr->send_cq; attr_ex.recv_cq = attr->recv_cq; attr_ex.srq = attr->srq; attr_ex.cap = attr->cap; attr_ex.qp_type = attr->qp_type; attr_ex.sq_sig_all = attr->sq_sig_all; attr_ex.comp_mask = IBV_QP_INIT_ATTR_PD; attr_ex.pd = pd; ret = ibv_icmd_create_qp(pd->context, NULL, qp, &attr_ex, cmdb); if (!ret) memcpy(&attr->cap, &attr_ex.cap, sizeof(attr_ex.cap)); return ret; } int ibv_cmd_create_qp_ex(struct ibv_context *context, struct verbs_qp *qp, struct ibv_qp_init_attr_ex *attr_ex, struct ibv_create_qp *cmd, size_t cmd_size, struct ib_uverbs_create_qp_resp *resp, size_t resp_size) { DECLARE_CMD_BUFFER_COMPAT(cmdb, UVERBS_OBJECT_QP, UVERBS_METHOD_QP_CREATE, cmd, cmd_size, resp, resp_size); if (!check_comp_mask(attr_ex->comp_mask, IBV_QP_INIT_ATTR_PD | IBV_QP_INIT_ATTR_XRCD | IBV_QP_INIT_ATTR_SEND_OPS_FLAGS)) { errno = EINVAL; return errno; } return ibv_icmd_create_qp(context, qp, NULL, attr_ex, cmdb); } int ibv_cmd_create_qp_ex2(struct ibv_context *context, struct verbs_qp *qp, struct ibv_qp_init_attr_ex *attr_ex, struct ibv_create_qp_ex *cmd, size_t cmd_size, struct ib_uverbs_ex_create_qp_resp *resp, size_t resp_size) { DECLARE_CMD_BUFFER_COMPAT(cmdb, 
UVERBS_OBJECT_QP, UVERBS_METHOD_QP_CREATE, cmd, cmd_size, resp, resp_size); if (!check_comp_mask(attr_ex->comp_mask, IBV_QP_INIT_ATTR_PD | IBV_QP_INIT_ATTR_XRCD | IBV_QP_INIT_ATTR_CREATE_FLAGS | IBV_QP_INIT_ATTR_MAX_TSO_HEADER | IBV_QP_INIT_ATTR_IND_TABLE | IBV_QP_INIT_ATTR_RX_HASH | IBV_QP_INIT_ATTR_SEND_OPS_FLAGS)) { errno = EINVAL; return errno; } return ibv_icmd_create_qp(context, qp, NULL, attr_ex, cmdb); } int ibv_cmd_destroy_qp(struct ibv_qp *qp) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_QP, UVERBS_METHOD_QP_DESTROY, 2, NULL); struct ib_uverbs_destroy_qp_resp resp; int ret; fill_attr_out_ptr(cmdb, UVERBS_ATTR_DESTROY_QP_RESP, &resp); fill_attr_in_obj(cmdb, UVERBS_ATTR_DESTROY_QP_HANDLE, qp->handle); switch (execute_ioctl_fallback(qp->context, destroy_qp, cmdb, &ret)) { case TRY_WRITE: { struct ibv_destroy_qp req; req.core_payload = (struct ib_uverbs_destroy_qp){ .qp_handle = qp->handle, }; ret = execute_cmd_write(qp->context, IB_USER_VERBS_CMD_DESTROY_QP, &req, sizeof(req), &resp, sizeof(resp)); break; } default: break; } if (verbs_is_destroy_err(&ret)) return ret; pthread_mutex_lock(&qp->mutex); while (qp->events_completed != resp.events_reported) pthread_cond_wait(&qp->cond, &qp->mutex); pthread_mutex_unlock(&qp->mutex); return 0; } rdma-core-56.1/libibverbs/cmd_rwq_ind.c000066400000000000000000000043421477342711600201120ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include int ibv_cmd_destroy_rwq_ind_table(struct ibv_rwq_ind_table *rwq_ind_table) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_RWQ_IND_TBL, UVERBS_METHOD_RWQ_IND_TBL_DESTROY, 1, NULL); int ret; fill_attr_in_obj(cmdb, UVERBS_ATTR_DESTROY_RWQ_IND_TBL_HANDLE, rwq_ind_table->ind_tbl_handle); switch (execute_ioctl_fallback(rwq_ind_table->context, destroy_ah, cmdb, &ret)) { case TRY_WRITE: { struct ibv_destroy_rwq_ind_table req; req.core_payload = (struct ib_uverbs_ex_destroy_rwq_ind_table){ .ind_tbl_handle = rwq_ind_table->ind_tbl_handle, }; ret = execute_cmd_write_ex_req( rwq_ind_table->context, IB_USER_VERBS_EX_CMD_DESTROY_RWQ_IND_TBL, &req, sizeof(req)); break; } default: break; } if (verbs_is_destroy_err(&ret)) return ret; return 0; } rdma-core-56.1/libibverbs/cmd_srq.c000066400000000000000000000212441477342711600172540ustar00rootroot00000000000000/* * Copyright (c) 2020 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include "ibverbs.h" static void set_vsrq(struct verbs_srq *vsrq, struct ibv_srq_init_attr_ex *attr_ex, uint32_t srq_num) { vsrq->srq_type = (attr_ex->comp_mask & IBV_SRQ_INIT_ATTR_TYPE) ? attr_ex->srq_type : IBV_SRQT_BASIC; if (vsrq->srq_type == IBV_SRQT_XRC) { vsrq->srq_num = srq_num; vsrq->xrcd = container_of(attr_ex->xrcd, struct verbs_xrcd, xrcd); } if (attr_ex->comp_mask & IBV_SRQ_INIT_ATTR_CQ) vsrq->cq = attr_ex->cq; } static int ibv_icmd_create_srq(struct ibv_pd *pd, struct verbs_srq *vsrq, struct ibv_srq *srq_in, struct ibv_srq_init_attr_ex *attr_ex, struct ibv_command_buffer *link) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_SRQ, UVERBS_METHOD_SRQ_CREATE, 13, link); struct verbs_ex_private *priv = get_priv(pd->context); struct ib_uverbs_attr *handle; uint32_t max_wr; uint32_t max_sge; uint32_t srq_num; int ret; struct ibv_srq *srq = vsrq ? &vsrq->srq : srq_in; struct verbs_xrcd *vxrcd = NULL; enum ibv_srq_type srq_type; srq->context = pd->context; pthread_mutex_init(&srq->mutex, NULL); pthread_cond_init(&srq->cond, NULL); srq_type = (attr_ex->comp_mask & IBV_SRQ_INIT_ATTR_TYPE) ? 
attr_ex->srq_type : IBV_SRQT_BASIC; switch (srq_type) { case IBV_SRQT_XRC: if (!(attr_ex->comp_mask & IBV_SRQ_INIT_ATTR_XRCD) || !(attr_ex->comp_mask & IBV_SRQ_INIT_ATTR_CQ)) { errno = EINVAL; return errno; } vxrcd = container_of(attr_ex->xrcd, struct verbs_xrcd, xrcd); fill_attr_in_obj(cmdb, UVERBS_ATTR_CREATE_SRQ_XRCD_HANDLE, vxrcd->handle); fill_attr_in_obj(cmdb, UVERBS_ATTR_CREATE_SRQ_CQ_HANDLE, attr_ex->cq->handle); fill_attr_out_ptr(cmdb, UVERBS_ATTR_CREATE_SRQ_RESP_SRQ_NUM, &srq_num); break; case IBV_SRQT_TM: if (!(attr_ex->comp_mask & IBV_SRQ_INIT_ATTR_CQ) || !(attr_ex->comp_mask & IBV_SRQ_INIT_ATTR_TM) || !(attr_ex->tm_cap.max_num_tags)) { errno = EINVAL; return errno; } fill_attr_in_obj(cmdb, UVERBS_ATTR_CREATE_SRQ_CQ_HANDLE, attr_ex->cq->handle); fill_attr_in_uint32(cmdb, UVERBS_ATTR_CREATE_SRQ_MAX_NUM_TAGS, attr_ex->tm_cap.max_num_tags); break; default: break; } handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_CREATE_SRQ_HANDLE); fill_attr_const_in(cmdb, UVERBS_ATTR_CREATE_SRQ_TYPE, srq_type); fill_attr_in_uint64(cmdb, UVERBS_ATTR_CREATE_SRQ_USER_HANDLE, (uintptr_t)srq); fill_attr_in_obj(cmdb, UVERBS_ATTR_CREATE_SRQ_PD_HANDLE, pd->handle); fill_attr_in_uint32(cmdb, UVERBS_ATTR_CREATE_SRQ_MAX_WR, attr_ex->attr.max_wr); fill_attr_in_uint32(cmdb, UVERBS_ATTR_CREATE_SRQ_MAX_SGE, attr_ex->attr.max_sge); fill_attr_in_uint32(cmdb, UVERBS_ATTR_CREATE_SRQ_LIMIT, attr_ex->attr.srq_limit); fill_attr_in_fd(cmdb, UVERBS_ATTR_CREATE_SRQ_EVENT_FD, pd->context->async_fd); fill_attr_out_ptr(cmdb, UVERBS_ATTR_CREATE_SRQ_RESP_MAX_WR, &max_wr); fill_attr_out_ptr(cmdb, UVERBS_ATTR_CREATE_SRQ_RESP_MAX_SGE, &max_sge); if (priv->imported) fallback_require_ioctl(cmdb); switch (execute_ioctl_fallback(srq->context, create_srq, cmdb, &ret)) { case TRY_WRITE: { if (attr_ex->srq_type == IBV_SRQT_BASIC && abi_ver > 5) { DECLARE_LEGACY_UHW_BUFS(link, IB_USER_VERBS_CMD_CREATE_SRQ); *req = (struct ib_uverbs_create_srq){ .pd_handle = pd->handle, .user_handle = (uintptr_t)srq, .max_wr = attr_ex->attr.max_wr, .max_sge = attr_ex->attr.max_sge, .srq_limit = attr_ex->attr.srq_limit, }; ret = execute_write_bufs( srq->context, IB_USER_VERBS_CMD_CREATE_SRQ, req, resp); if (ret) return ret; srq->handle = resp->srq_handle; attr_ex->attr.max_wr = resp->max_wr; attr_ex->attr.max_sge = resp->max_sge; } else if (attr_ex->srq_type == IBV_SRQT_BASIC && abi_ver <= 5) { DECLARE_LEGACY_UHW_BUFS(link, IB_USER_VERBS_CMD_CREATE_SRQ_V5); *req = (struct ib_uverbs_create_srq){ .pd_handle = pd->handle, .user_handle = (uintptr_t)srq, .max_wr = attr_ex->attr.max_wr, .max_sge = attr_ex->attr.max_sge, .srq_limit = attr_ex->attr.srq_limit, }; ret = execute_write_bufs( srq->context, IB_USER_VERBS_CMD_CREATE_SRQ_V5, req, resp); if (ret) return ret; srq->handle = resp->srq_handle; } else { DECLARE_LEGACY_UHW_BUFS(link, IB_USER_VERBS_CMD_CREATE_XSRQ); *req = (struct ib_uverbs_create_xsrq){ .pd_handle = pd->handle, .user_handle = (uintptr_t)srq, .max_wr = attr_ex->attr.max_wr, .max_sge = attr_ex->attr.max_sge, .srq_limit = attr_ex->attr.srq_limit, .srq_type = attr_ex->srq_type, .cq_handle = attr_ex->cq->handle, }; if (attr_ex->srq_type == IBV_SRQT_TM) req->max_num_tags = attr_ex->tm_cap.max_num_tags; else req->xrcd_handle = vxrcd->handle; ret = execute_write_bufs( srq->context, IB_USER_VERBS_CMD_CREATE_XSRQ, req, resp); if (ret) return ret; srq->handle = resp->srq_handle; attr_ex->attr.max_wr = resp->max_wr; attr_ex->attr.max_sge = resp->max_sge; set_vsrq(vsrq, attr_ex, resp->srqn); } return 0; } case SUCCESS: break; default: return ret; } srq->handle 
= read_attr_obj(UVERBS_ATTR_CREATE_SRQ_HANDLE, handle); attr_ex->attr.max_wr = max_wr; attr_ex->attr.max_sge = max_sge; if (vsrq) set_vsrq(vsrq, attr_ex, srq_num); return 0; } int ibv_cmd_create_srq(struct ibv_pd *pd, struct ibv_srq *srq, struct ibv_srq_init_attr *attr, struct ibv_create_srq *cmd, size_t cmd_size, struct ib_uverbs_create_srq_resp *resp, size_t resp_size) { DECLARE_CMD_BUFFER_COMPAT(cmdb, UVERBS_OBJECT_SRQ, UVERBS_METHOD_SRQ_CREATE, cmd, cmd_size, resp, resp_size); struct ibv_srq_init_attr_ex attr_ex = {}; int ret; memcpy(&attr_ex, attr, sizeof(*attr)); ret = ibv_icmd_create_srq(pd, NULL, srq, &attr_ex, cmdb); if (!ret) { attr->attr.max_wr = attr_ex.attr.max_wr; attr->attr.max_sge = attr_ex.attr.max_sge; } return ret; } int ibv_cmd_create_srq_ex(struct ibv_context *context, struct verbs_srq *srq, struct ibv_srq_init_attr_ex *attr_ex, struct ibv_create_xsrq *cmd, size_t cmd_size, struct ib_uverbs_create_srq_resp *resp, size_t resp_size) { DECLARE_CMD_BUFFER_COMPAT(cmdb, UVERBS_OBJECT_SRQ, UVERBS_METHOD_SRQ_CREATE, cmd, cmd_size, resp, resp_size); if (attr_ex->comp_mask >= IBV_SRQ_INIT_ATTR_RESERVED) { errno = EOPNOTSUPP; return errno; } if (!(attr_ex->comp_mask & IBV_SRQ_INIT_ATTR_PD)) { errno = EINVAL; return errno; } return ibv_icmd_create_srq(attr_ex->pd, srq, NULL, attr_ex, cmdb); } int ibv_cmd_destroy_srq(struct ibv_srq *srq) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_SRQ, UVERBS_METHOD_SRQ_DESTROY, 2, NULL); struct ib_uverbs_destroy_srq_resp resp; int ret; fill_attr_out_ptr(cmdb, UVERBS_ATTR_DESTROY_SRQ_RESP, &resp); fill_attr_in_obj(cmdb, UVERBS_ATTR_DESTROY_SRQ_HANDLE, srq->handle); switch (execute_ioctl_fallback(srq->context, destroy_srq, cmdb, &ret)) { case TRY_WRITE: { struct ibv_destroy_srq req; req.core_payload = (struct ib_uverbs_destroy_srq){ .srq_handle = srq->handle, }; ret = execute_cmd_write(srq->context, IB_USER_VERBS_CMD_DESTROY_SRQ, &req, sizeof(req), &resp, sizeof(resp)); break; } default: break; } if (verbs_is_destroy_err(&ret)) return ret; pthread_mutex_lock(&srq->mutex); while (srq->events_completed != resp.events_reported) pthread_cond_wait(&srq->cond, &srq->mutex); pthread_mutex_unlock(&srq->mutex); return 0; } rdma-core-56.1/libibverbs/cmd_wq.c000066400000000000000000000131361477342711600170770ustar00rootroot00000000000000/* * Copyright (c) 2020 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include "ibverbs.h" static int ibv_icmd_create_wq(struct ibv_context *context, struct ibv_wq_init_attr *wq_init_attr, struct ibv_wq *wq, struct ibv_command_buffer *link) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_WQ, UVERBS_METHOD_WQ_CREATE, 13, link); struct verbs_ex_private *priv = get_priv(context); struct ib_uverbs_attr *handle; uint32_t create_flags = 0; uint32_t max_wr; uint32_t max_sge; uint32_t wq_num; int ret; wq->context = context; wq->cq = wq_init_attr->cq; wq->pd = wq_init_attr->pd; wq->wq_type = wq_init_attr->wq_type; handle = fill_attr_out_obj(cmdb, UVERBS_ATTR_CREATE_WQ_HANDLE); fill_attr_in_uint64(cmdb, UVERBS_ATTR_CREATE_WQ_USER_HANDLE, (uintptr_t)wq); fill_attr_in_obj(cmdb, UVERBS_ATTR_CREATE_WQ_PD_HANDLE, wq_init_attr->pd->handle); fill_attr_in_obj(cmdb, UVERBS_ATTR_CREATE_WQ_CQ_HANDLE, wq_init_attr->cq->handle); fill_attr_const_in(cmdb, UVERBS_ATTR_CREATE_WQ_TYPE, wq_init_attr->wq_type); fill_attr_in_uint32(cmdb, UVERBS_ATTR_CREATE_WQ_MAX_WR, wq_init_attr->max_wr); fill_attr_in_uint32(cmdb, UVERBS_ATTR_CREATE_WQ_MAX_SGE, wq_init_attr->max_sge); fill_attr_in_fd(cmdb, UVERBS_ATTR_CREATE_WQ_EVENT_FD, wq->context->async_fd); if (wq_init_attr->comp_mask & IBV_WQ_INIT_ATTR_FLAGS) { if (wq_init_attr->create_flags & ~(IBV_WQ_FLAGS_RESERVED - 1)) { errno = EOPNOTSUPP; return errno; } create_flags = wq_init_attr->create_flags; } fill_attr_in_uint32(cmdb, UVERBS_ATTR_CREATE_WQ_FLAGS, create_flags); fill_attr_out_ptr(cmdb, UVERBS_ATTR_CREATE_WQ_RESP_MAX_WR, &max_wr); fill_attr_out_ptr(cmdb, UVERBS_ATTR_CREATE_WQ_RESP_MAX_SGE, &max_sge); fill_attr_out_ptr(cmdb, UVERBS_ATTR_CREATE_WQ_RESP_WQ_NUM, &wq_num); if (priv->imported) fallback_require_ioctl(cmdb); fallback_require_ex(cmdb); switch (execute_ioctl_fallback(context, create_wq, cmdb, &ret)) { case TRY_WRITE_EX: { DECLARE_LEGACY_UHW_BUFS_EX(link, IB_USER_VERBS_EX_CMD_CREATE_WQ); *req = (struct ib_uverbs_ex_create_wq){ .user_handle = (uintptr_t)wq, .pd_handle = wq_init_attr->pd->handle, .cq_handle = wq_init_attr->cq->handle, .max_wr = wq_init_attr->max_wr, .max_sge = wq_init_attr->max_sge, .wq_type = wq_init_attr->wq_type, .create_flags = wq_init_attr->create_flags, }; ret = execute_write_bufs_ex( context, IB_USER_VERBS_EX_CMD_CREATE_WQ, req, resp); if (ret) return ret; wq->handle = resp->wq_handle; wq_init_attr->max_wr = resp->max_wr; wq_init_attr->max_sge = resp->max_sge; wq->wq_num = resp->wqn; return 0; } case SUCCESS: break; default: return ret; } wq->handle = read_attr_obj(UVERBS_ATTR_CREATE_WQ_HANDLE, handle); wq->wq_num = wq_num; wq_init_attr->max_wr = max_wr; wq_init_attr->max_sge = max_sge; return 0; } int ibv_cmd_create_wq(struct ibv_context *context, struct ibv_wq_init_attr *wq_init_attr, struct ibv_wq *wq, struct ibv_create_wq *cmd, size_t cmd_size, struct ib_uverbs_ex_create_wq_resp *resp, size_t resp_size) { DECLARE_CMD_BUFFER_COMPAT(cmdb, UVERBS_OBJECT_WQ, UVERBS_METHOD_WQ_CREATE, cmd, cmd_size, resp, resp_size); if (wq_init_attr->comp_mask >= IBV_WQ_INIT_ATTR_RESERVED) { errno = EINVAL; return errno; } return ibv_icmd_create_wq(context, wq_init_attr, wq, cmdb); } int ibv_cmd_destroy_wq(struct ibv_wq *wq) { DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_WQ, UVERBS_METHOD_WQ_DESTROY, 2, NULL); struct ib_uverbs_ex_destroy_wq_resp resp; int ret; 
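	/*
	 * The response returns how many async events the kernel generated
	 * for this WQ; the wait loop at the end of this function blocks
	 * until the application has acknowledged them all via
	 * ibv_ack_async_event().
	 */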
fill_attr_out_ptr(cmdb, UVERBS_ATTR_DESTROY_WQ_RESP, &resp.events_reported); fill_attr_in_obj(cmdb, UVERBS_ATTR_DESTROY_WQ_HANDLE, wq->handle); switch (execute_ioctl_fallback(wq->context, destroy_wq, cmdb, &ret)) { case TRY_WRITE: { struct ibv_destroy_wq req; req.core_payload = (struct ib_uverbs_ex_destroy_wq){ .wq_handle = wq->handle, }; ret = execute_cmd_write_ex(wq->context, IB_USER_VERBS_EX_CMD_DESTROY_WQ, &req, sizeof(req), &resp, sizeof(resp)); break; } default: break; } if (verbs_is_destroy_err(&ret)) return ret; pthread_mutex_lock(&wq->mutex); while (wq->events_completed != resp.events_reported) pthread_cond_wait(&wq->cond, &wq->mutex); pthread_mutex_unlock(&wq->mutex); return 0; } rdma-core-56.1/libibverbs/cmd_write.h000066400000000000000000000337241477342711600176140ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __INFINIBAND_VERBS_WRITE_H #define __INFINIBAND_VERBS_WRITE_H #include #include #include #include #include void *_write_get_req(struct ibv_command_buffer *link, struct ib_uverbs_cmd_hdr *onstack, size_t size); void *_write_get_req_ex(struct ibv_command_buffer *link, struct ex_hdr *onstack, size_t size); void *_write_get_resp(struct ibv_command_buffer *link, struct ib_uverbs_cmd_hdr *hdr, void *onstack, size_t resp_size); void *_write_get_resp_ex(struct ibv_command_buffer *link, struct ex_hdr *hdr, void *onstack, size_t resp_size); /* * This macro creates 'req' and 'resp' pointers in the local stack frame that * point to the core code write command structures patterned off _pattern. 
* * This should be done before calling execute_write_bufs */ #define DECLARE_LEGACY_UHW_BUFS(_link, _enum) \ IBV_ABI_REQ(_enum) __req_onstack; \ IBV_KABI_RESP(_enum) __resp_onstack; \ IBV_KABI_REQ(_enum) *req = \ _write_get_req(_link, &__req_onstack.hdr, sizeof(*req)); \ IBV_KABI_RESP(_enum) *resp = ({ \ void *_resp = _write_get_resp( \ _link, \ &container_of(req, IBV_ABI_REQ(_enum), core_payload) \ ->hdr, \ &__resp_onstack, sizeof(*resp)); \ _resp; \ }) #define DECLARE_LEGACY_UHW_BUFS_EX(_link, _enum) \ IBV_ABI_REQ(_enum) __req_onstack; \ IBV_KABI_RESP(_enum) __resp_onstack; \ IBV_KABI_REQ(_enum) *req = \ _write_get_req_ex(_link, &__req_onstack.hdr, sizeof(*req)); \ IBV_KABI_RESP(_enum) *resp = _write_get_resp_ex( \ _link, \ &container_of(req, IBV_ABI_REQ(_enum), core_payload)->hdr, \ &__resp_onstack, sizeof(*resp)) /* * This macro is used to implement the compatibility command call wrappers. * Compatibility calls do not accept a command_buffer, and cannot use the new * attribute id mechanism. They accept the legacy kern-abi.h structs that have * the embedded header. */ void _write_set_uhw(struct ibv_command_buffer *cmdb, const void *req, size_t core_req_size, size_t req_size, void *resp, size_t core_resp_size, size_t resp_size); #define DECLARE_CMD_BUFFER_COMPAT(_name, _object_id, _method_id, cmd, \ cmd_size, resp, resp_size) \ DECLARE_COMMAND_BUFFER(_name, _object_id, _method_id, 2); \ _write_set_uhw(_name, cmd, sizeof(*cmd), cmd_size, resp, \ sizeof(*resp), resp_size) #define DECLARE_CMD_BUFFER_LINK_COMPAT(_name, _object_id, _method_id, \ _link, cmd, cmd_size, \ resp, resp_size) \ DECLARE_COMMAND_BUFFER_LINK(_name, _object_id, _method_id, 2, _link); \ _write_set_uhw(_name, cmd, sizeof(*cmd), cmd_size, resp, \ sizeof(*resp), resp_size) /* * The fallback scheme keeps track of which ioctls succeed in a per-context * bitmap. If ENOTTY or EPROTONOSUPPORT is seen then the ioctl is never * retried. * * cmd_name should be the name of the function op from verbs_context_ops * that is being implemented. */ #define _CMD_BIT(cmd_name) \ (offsetof(struct verbs_context_ops, cmd_name) / sizeof(void *)) enum write_fallback { TRY_WRITE, TRY_WRITE_EX, ERROR, SUCCESS }; /* * This bitmask indicate the required behavior of execute_ioctl_fallback when * the ioctl is not supported. It is a priority list where the highest set bit * takes precedence. This approach simplifies the typical required control * flow of the user. */ static inline void fallback_require_ex(struct ibv_command_buffer *cmdb) { cmdb->fallback_require_ex = 1; } static inline void fallback_require_ioctl(struct ibv_command_buffer *cmdb) { cmdb->fallback_ioctl_only = 1; } enum write_fallback _check_legacy(struct ibv_command_buffer *cmdb, int *ret); enum write_fallback _execute_ioctl_fallback(struct ibv_context *ctx, unsigned int cmd_bit, struct ibv_command_buffer *cmdb, int *ret); #define execute_ioctl_fallback(ctx, cmd_name, cmdb, ret) \ _execute_ioctl_fallback(ctx, _CMD_BIT(cmd_name), cmdb, ret) /* * For write() only commands that have fixed core structures and may take uhw * driver data. The last arguments are the same ones passed into the typical * ibv_cmd_* function. execute_cmd_write deduces the length of the core * structure based on the KABI struct linked to the enum op code. 
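 */

/*
 * A rough sketch (not from this header) of the call-site idiom, modelled
 * on the destroy wrappers elsewhere in this directory; the "foo" object
 * and its attribute/command names are hypothetical:
 */
#if 0 /* illustrative only */
int ibv_cmd_destroy_foo(struct ibv_foo *foo)
{
	DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_FOO,
			     UVERBS_METHOD_FOO_DESTROY, 1, NULL);
	int ret;

	fill_attr_in_obj(cmdb, UVERBS_ATTR_DESTROY_FOO_HANDLE, foo->handle);

	/* Prefer the ioctl path; TRY_WRITE means marshal the legacy ABI. */
	switch (execute_ioctl_fallback(foo->context, destroy_foo, cmdb,
				       &ret)) {
	case TRY_WRITE: {
		struct ibv_destroy_foo req;

		req.core_payload = (struct ib_uverbs_destroy_foo){
			.foo_handle = foo->handle,
		};
		ret = execute_cmd_write_req(foo->context,
					    IB_USER_VERBS_CMD_DESTROY_FOO,
					    &req, sizeof(req));
		break;
	}
	default:
		break;
	}

	return ret;
}
#endif

/*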
*/ int _execute_cmd_write(struct ibv_context *ctx, unsigned int write_method, void *req, size_t core_req_size, size_t req_size, void *resp, size_t core_resp_size, size_t resp_size); #define execute_cmd_write(ctx, enum, cmd, cmd_size, resp, resp_size) \ ({ \ (cmd)->core_payload.response = ioctl_ptr_to_u64(resp); \ _execute_cmd_write( \ ctx, enum, cmd + check_type(cmd, IBV_ABI_REQ(enum) *), \ sizeof(*(cmd)), cmd_size, \ resp + check_type(resp, IBV_KABI_RESP(enum) *), \ sizeof(*(resp)), resp_size); \ }) /* For write() commands that have no response */ #define execute_cmd_write_req(ctx, enum, cmd, cmd_size) \ ({ \ static_assert(sizeof(IBV_KABI_RESP(enum)) == 0, \ "Method has a response!"); \ _execute_cmd_write(ctx, enum, \ cmd + check_type(cmd, IBV_ABI_REQ(enum) *), \ sizeof(*(cmd)), cmd_size, NULL, 0, 0); \ }) /* * Execute a write command that does not have a uhw component. The cmd_size * and resp_size are the lengths of the core structure. This version is only * needed if the core structure ends in a flex array, as the internal sizeof() * in execute_cmd_write() will give the wrong size. */ #define execute_cmd_write_no_uhw(ctx, enum, cmd, cmd_size, resp, resp_size) \ ({ \ (cmd)->core_payload.response = ioctl_ptr_to_u64(resp); \ _execute_cmd_write( \ ctx, enum, cmd + check_type(cmd, IBV_ABI_REQ(enum) *), \ cmd_size, cmd_size, \ resp + check_type(resp, IBV_KABI_RESP(enum) *), \ resp_size, resp_size); \ }) /* * For users of DECLARE_LEGACY_UHW_BUFS, in this case the machinery has * already stored the full req/resp length in the hdr. */ #define execute_write_bufs(ctx, enum, req, resp) \ ({ \ IBV_ABI_REQ(enum) *_hdr = \ container_of(req, IBV_ABI_REQ(enum), core_payload); \ execute_cmd_write(ctx, enum, _hdr, _hdr->hdr.in_words * 4, \ resp, _hdr->hdr.out_words * 4); \ }) /* * For write() commands that use the _ex protocol. _full allows the caller to * specify all 4 sizes directly. This version is used when the core structs * end in a flex array. The normal and req versions are similar to write() and * deduce the length of the core struct from the enum. */ int _execute_cmd_write_ex(struct ibv_context *ctx, unsigned int write_method, struct ex_hdr *req, size_t core_req_size, size_t req_size, void *resp, size_t core_resp_size, size_t resp_size); #define execute_cmd_write_ex_full(ctx, enum, cmd, core_cmd_size, cmd_size, \ resp, core_resp_size, resp_size) \ _execute_cmd_write_ex( \ ctx, enum, &(cmd)->hdr + check_type(cmd, IBV_ABI_REQ(enum) *), \ core_cmd_size, cmd_size, \ resp + check_type(resp, IBV_KABI_RESP(enum) *), \ core_resp_size, resp_size) #define execute_cmd_write_ex(ctx, enum, cmd, cmd_size, resp, resp_size) \ execute_cmd_write_ex_full(ctx, enum, cmd, sizeof(*(cmd)), cmd_size, \ resp, sizeof(*(resp)), resp_size) #define execute_cmd_write_ex_req(ctx, enum, cmd, cmd_size) \ ({ \ static_assert(sizeof(IBV_KABI_RESP(enum)) == 0, \ "Method has a response!"); \ _execute_cmd_write_ex( \ ctx, enum, \ &(cmd)->hdr + check_type(cmd, IBV_ABI_REQ(enum) *), \ sizeof(*(cmd)), cmd_size, NULL, 0, 0); \ }) /* For users of DECLARE_LEGACY_UHW_BUFS_EX */ #define execute_write_bufs_ex(ctx, enum, req, resp) \ ({ \ IBV_ABI_REQ(enum) *_hdr = \ container_of(req, IBV_ABI_REQ(enum), core_payload); \ execute_cmd_write_ex( \ ctx, enum, _hdr, \ sizeof(*_hdr) + \ _hdr->hdr.ex_hdr.provider_in_words * 8, \ resp, \ sizeof(*(resp)) + \ _hdr->hdr.ex_hdr.provider_out_words * 8); \ }) /* * These two macros are used only with execute_ioctl_fallback - they allow the * IOCTL code to be elided by the compiler when disabled. 
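 */

/*
 * Editorial note on the size units visible above: the legacy write()
 * header counts the whole request in 4-byte words (hdr.in_words * 4
 * bytes), while the extended header counts only the driver-private
 * trailer in 8-byte quanta on top of the fixed core struct; a 16-byte
 * UHW trailer on an _ex command, for example, means
 * provider_in_words == 2.
 */

/*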
*/ #define DECLARE_FBCMD_BUFFER DECLARE_COMMAND_BUFFER_LINK /* * Munge the macros above to remove certain paths during compilation based on * the cmake flag. */ #if VERBS_IOCTL_ONLY static inline enum write_fallback _execute_ioctl_only(struct ibv_context *context, struct ibv_command_buffer *cmd, int *ret) { *ret = execute_ioctl(context, cmd); if (*ret) return ERROR; return SUCCESS; } #undef execute_ioctl_fallback #define execute_ioctl_fallback(ctx, cmd_name, cmdb, ret) \ _execute_ioctl_only(ctx, cmdb, ret) #undef execute_write_bufs static inline int execute_write_bufs(struct ibv_context *ctx, unsigned int write_command, void *req, void *resp) { return ENOSYS; } #undef execute_write_bufs_ex static inline int execute_write_bufs_ex(struct ibv_context *ctx, unsigned int write_command, void *req, void *resp) { return ENOSYS; } #endif #if VERBS_WRITE_ONLY static inline enum write_fallback _execute_write_only(struct ibv_context *context, struct ibv_command_buffer *cmd, int *ret) { /* * write only still has the command buffer, and the command buffer * carries the fallback guidance that we need to inspect. This is * written in this odd way so the compiler knows that SUCCESS is not a * possible return and optimizes accordingly. */ switch (_check_legacy(cmd, ret)) { case TRY_WRITE: return TRY_WRITE; case TRY_WRITE_EX: return TRY_WRITE_EX; default: return ERROR; } } #undef execute_ioctl_fallback #define execute_ioctl_fallback(ctx, cmd_name, cmdb, ret) \ _execute_write_only(ctx, cmdb, ret) #undef DECLARE_FBCMD_BUFFER #define DECLARE_FBCMD_BUFFER(_name, _object_id, _method_id, _num_attrs, _link) \ struct ibv_command_buffer _name[1] = { \ { \ .next = _link, \ .uhw_in_idx = _UHW_NO_INDEX, \ .uhw_out_idx = _UHW_NO_INDEX, \ }, \ } #endif extern bool verbs_allow_disassociate_destroy; /* * Return true if 'ret' indicates that a destroy operation has failed * and the function should exit. If the kernel destroy failure is being * ignored then this will set ret to 0, so the calling function appears to succeed. */ static inline bool verbs_is_destroy_err(int *ret) { if (*ret == EIO && verbs_allow_disassociate_destroy) { *ret = 0; return true; } return *ret != 0; } #endif rdma-core-56.1/libibverbs/cmd_xrcd.c000066400000000000000000000041411477342711600174040ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
#include <infiniband/cmd_write.h>

int ibv_cmd_close_xrcd(struct verbs_xrcd *xrcd)
{
	DECLARE_FBCMD_BUFFER(cmdb, UVERBS_OBJECT_XRCD,
			     UVERBS_METHOD_XRCD_DESTROY, 1, NULL);
	int ret;

	fill_attr_in_obj(cmdb, UVERBS_ATTR_DESTROY_XRCD_HANDLE, xrcd->handle);

	switch (execute_ioctl_fallback(xrcd->xrcd.context, close_xrcd, cmdb,
				       &ret)) {
	case TRY_WRITE: {
		struct ibv_close_xrcd req;

		req.core_payload = (struct ib_uverbs_close_xrcd){
			.xrcd_handle = xrcd->handle,
		};
		ret = execute_cmd_write_req(xrcd->xrcd.context,
					    IB_USER_VERBS_CMD_CLOSE_XRCD, &req,
					    sizeof(req));
		break;
	}

	default:
		break;
	}

	if (verbs_is_destroy_err(&ret))
		return ret;

	return 0;
}
rdma-core-56.1/libibverbs/compat-1_0.c000066400000000000000000000554241477342711600174710ustar00rootroot00000000000000/*
 * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses. You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
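 */

/*
 * A rough sketch (not from this file): every compat entry point below
 * follows the same shape.  A 1.0 wrapper struct carries the modern
 * object in a real_* member, and the IBVERBS_1.0 symbol unwraps it and
 * forwards to the 1.1+ implementation.  Schematically, with hypothetical
 * "foo" names:
 */
#if 0 /* illustrative only */
struct ibv_foo_1_0 {
	struct ibv_context_1_0 *context;
	struct ibv_foo *real_foo;	/* modern object doing the work */
};

static int ibv_modify_foo_1_0(struct ibv_foo_1_0 *foo, int mask)
{
	return ibv_modify_foo(foo->real_foo, mask);	/* plain forwarding */
}
#endif

/*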
*/ #include #include #include #include #include #include #include #include "ibverbs.h" struct ibv_pd_1_0 { struct ibv_context_1_0 *context; uint32_t handle; struct ibv_pd *real_pd; }; struct ibv_mr_1_0 { struct ibv_context_1_0 *context; struct ibv_pd_1_0 *pd; uint32_t handle; uint32_t lkey; uint32_t rkey; struct ibv_mr *real_mr; }; struct ibv_srq_1_0 { struct ibv_context_1_0 *context; void *srq_context; struct ibv_pd_1_0 *pd; uint32_t handle; pthread_mutex_t mutex; pthread_cond_t cond; uint32_t events_completed; struct ibv_srq *real_srq; }; struct ibv_qp_init_attr_1_0 { void *qp_context; struct ibv_cq_1_0 *send_cq; struct ibv_cq_1_0 *recv_cq; struct ibv_srq_1_0 *srq; struct ibv_qp_cap cap; enum ibv_qp_type qp_type; int sq_sig_all; }; struct ibv_send_wr_1_0 { struct ibv_send_wr_1_0 *next; uint64_t wr_id; struct ibv_sge *sg_list; int num_sge; enum ibv_wr_opcode opcode; int send_flags; __be32 imm_data; union { struct { uint64_t remote_addr; uint32_t rkey; } rdma; struct { uint64_t remote_addr; uint64_t compare_add; uint64_t swap; uint32_t rkey; } atomic; struct { struct ibv_ah_1_0 *ah; uint32_t remote_qpn; uint32_t remote_qkey; } ud; } wr; }; struct ibv_recv_wr_1_0 { struct ibv_recv_wr_1_0 *next; uint64_t wr_id; struct ibv_sge *sg_list; int num_sge; }; struct ibv_qp_1_0 { struct ibv_context_1_0 *context; void *qp_context; struct ibv_pd_1_0 *pd; struct ibv_cq_1_0 *send_cq; struct ibv_cq_1_0 *recv_cq; struct ibv_srq_1_0 *srq; uint32_t handle; uint32_t qp_num; enum ibv_qp_state state; enum ibv_qp_type qp_type; pthread_mutex_t mutex; pthread_cond_t cond; uint32_t events_completed; struct ibv_qp *real_qp; }; struct ibv_cq_1_0 { struct ibv_context_1_0 *context; void *cq_context; uint32_t handle; int cqe; pthread_mutex_t mutex; pthread_cond_t cond; uint32_t comp_events_completed; uint32_t async_events_completed; struct ibv_cq *real_cq; }; struct ibv_ah_1_0 { struct ibv_context_1_0 *context; struct ibv_pd_1_0 *pd; uint32_t handle; struct ibv_ah *real_ah; }; struct ibv_device_1_0 { void *obsolete_sysfs_dev; void *obsolete_sysfs_ibdev; struct ibv_device *real_device; /* was obsolete driver member */ struct _ibv_device_ops _ops; }; struct ibv_context_ops_1_0 { int (*query_device)(struct ibv_context *context, struct ibv_device_attr *device_attr); int (*query_port)(struct ibv_context *context, uint8_t port_num, struct ibv_port_attr *port_attr); struct ibv_pd * (*alloc_pd)(struct ibv_context *context); int (*dealloc_pd)(struct ibv_pd *pd); struct ibv_mr * (*reg_mr)(struct ibv_pd *pd, void *addr, size_t length, int access); int (*dereg_mr)(struct ibv_mr *mr); struct ibv_cq * (*create_cq)(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); int (*poll_cq)(struct ibv_cq_1_0 *cq, int num_entries, struct ibv_wc *wc); int (*req_notify_cq)(struct ibv_cq_1_0 *cq, int solicited_only); void (*cq_event)(struct ibv_cq *cq); int (*resize_cq)(struct ibv_cq *cq, int cqe); int (*destroy_cq)(struct ibv_cq *cq); struct ibv_srq * (*create_srq)(struct ibv_pd *pd, struct ibv_srq_init_attr *srq_init_attr); int (*modify_srq)(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, int srq_attr_mask); int (*query_srq)(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr); int (*destroy_srq)(struct ibv_srq *srq); int (*post_srq_recv)(struct ibv_srq_1_0 *srq, struct ibv_recv_wr_1_0 *recv_wr, struct ibv_recv_wr_1_0 **bad_recv_wr); struct ibv_qp * (*create_qp)(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); int (*query_qp)(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct 
ibv_qp_init_attr *init_attr); int (*modify_qp)(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask); int (*destroy_qp)(struct ibv_qp *qp); int (*post_send)(struct ibv_qp_1_0 *qp, struct ibv_send_wr_1_0 *wr, struct ibv_send_wr_1_0 **bad_wr); int (*post_recv)(struct ibv_qp_1_0 *qp, struct ibv_recv_wr_1_0 *wr, struct ibv_recv_wr_1_0 **bad_wr); struct ibv_ah * (*create_ah)(struct ibv_pd *pd, struct ibv_ah_attr *attr); int (*destroy_ah)(struct ibv_ah *ah); int (*attach_mcast)(struct ibv_qp *qp, union ibv_gid *gid, uint16_t lid); int (*detach_mcast)(struct ibv_qp *qp, union ibv_gid *gid, uint16_t lid); }; struct ibv_context_1_0 { struct ibv_device_1_0 *device; struct ibv_context_ops_1_0 ops; int cmd_fd; int async_fd; int num_comp_vectors; struct ibv_context *real_context; /* was abi_compat member */ }; typedef struct ibv_device *(*ibv_driver_init_func_1_1)(const char *uverbs_sys_path, int abi_version); COMPAT_SYMVER_FUNC(ibv_get_device_list, 1_0, "IBVERBS_1.0", struct ibv_device_1_0 **, int *num) { struct ibv_device **real_list; struct ibv_device_1_0 **l; int i, n; real_list = ibv_get_device_list(&n); if (!real_list) return NULL; l = calloc(n + 2, sizeof (struct ibv_device_1_0 *)); if (!l) goto free_device_list; l[0] = (void *) real_list; for (i = 0; i < n; ++i) { l[i + 1] = calloc(1, sizeof (struct ibv_device_1_0)); if (!l[i + 1]) goto fail; l[i + 1]->real_device = real_list[i]; } if (num) *num = n; return l + 1; fail: for (i = 1; i <= n; ++i) if (l[i]) free(l[i]); free(l); free_device_list: ibv_free_device_list(real_list); return NULL; } COMPAT_SYMVER_FUNC(ibv_free_device_list, 1_0, "IBVERBS_1.0", void, struct ibv_device_1_0 **list) { struct ibv_device_1_0 **l = list; while (*l) { free(*l); ++l; } ibv_free_device_list((void *) list[-1]); free(list - 1); } COMPAT_SYMVER_FUNC(ibv_get_device_name, 1_0, "IBVERBS_1.0", const char *, struct ibv_device_1_0 *device) { return ibv_get_device_name(device->real_device); } COMPAT_SYMVER_FUNC(ibv_get_device_guid, 1_0, "IBVERBS_1.0", __be64, struct ibv_device_1_0 *device) { return ibv_get_device_guid(device->real_device); } static int poll_cq_wrapper_1_0(struct ibv_cq_1_0 *cq, int num_entries, struct ibv_wc *wc) { return cq->context->real_context->ops.poll_cq(cq->real_cq, num_entries, wc); } static int req_notify_cq_wrapper_1_0(struct ibv_cq_1_0 *cq, int sol_only) { return cq->context->real_context->ops.req_notify_cq(cq->real_cq, sol_only); } static int post_srq_recv_wrapper_1_0(struct ibv_srq_1_0 *srq, struct ibv_recv_wr_1_0 *wr, struct ibv_recv_wr_1_0 **bad_wr) { struct ibv_recv_wr_1_0 *w; struct ibv_recv_wr *real_wr, *head_wr = NULL, *tail_wr = NULL, *real_bad_wr; int ret; for (w = wr; w; w = w->next) { real_wr = alloca(sizeof *real_wr); real_wr->wr_id = w->wr_id; real_wr->sg_list = w->sg_list; real_wr->num_sge = w->num_sge; real_wr->next = NULL; if (tail_wr) tail_wr->next = real_wr; else head_wr = real_wr; tail_wr = real_wr; } ret = srq->context->real_context->ops.post_srq_recv(srq->real_srq, head_wr, &real_bad_wr); if (ret) { for (real_wr = head_wr, w = wr; real_wr; real_wr = real_wr->next, w = w->next) if (real_wr == real_bad_wr) { *bad_wr = w; break; } } return ret; } static int post_send_wrapper_1_0(struct ibv_qp_1_0 *qp, struct ibv_send_wr_1_0 *wr, struct ibv_send_wr_1_0 **bad_wr) { struct ibv_send_wr_1_0 *w; struct ibv_send_wr *real_wr, *head_wr = NULL, *tail_wr = NULL, *real_bad_wr; int is_ud = qp->qp_type == IBV_QPT_UD; int ret; for (w = wr; w; w = w->next) { real_wr = alloca(sizeof *real_wr); real_wr->wr_id = w->wr_id; real_wr->next = 
NULL; #define TEST_SIZE_2_POINT(f1, f2) \ ((offsetof(struct ibv_send_wr, f1) - offsetof(struct ibv_send_wr, f2)) \ == offsetof(struct ibv_send_wr_1_0, f1) - offsetof(struct ibv_send_wr_1_0, f2)) #define TEST_SIZE_TO_END(f1) \ ((sizeof(struct ibv_send_wr) - offsetof(struct ibv_send_wr, f1)) == \ (sizeof(struct ibv_send_wr_1_0) - offsetof(struct ibv_send_wr_1_0, f1))) if (TEST_SIZE_TO_END (sg_list)) memcpy(&real_wr->sg_list, &w->sg_list, sizeof *real_wr - offsetof(struct ibv_send_wr, sg_list)); else if (TEST_SIZE_2_POINT (imm_data, sg_list) && TEST_SIZE_TO_END (wr)) { /* we have alignment up to wr, but padding between * imm_data and wr, and we know wr itself is the * same size */ memcpy(&real_wr->sg_list, &w->sg_list, offsetof(struct ibv_send_wr, imm_data) - offsetof(struct ibv_send_wr, sg_list) + sizeof real_wr->imm_data); memcpy(&real_wr->wr, &w->wr, sizeof real_wr->wr); } else { real_wr->sg_list = w->sg_list; real_wr->num_sge = w->num_sge; real_wr->opcode = w->opcode; real_wr->send_flags = w->send_flags; real_wr->imm_data = w->imm_data; if (TEST_SIZE_TO_END (wr)) memcpy(&real_wr->wr, &w->wr, sizeof real_wr->wr); else { real_wr->wr.atomic.remote_addr = w->wr.atomic.remote_addr; real_wr->wr.atomic.compare_add = w->wr.atomic.compare_add; real_wr->wr.atomic.swap = w->wr.atomic.swap; real_wr->wr.atomic.rkey = w->wr.atomic.rkey; } } if (is_ud) real_wr->wr.ud.ah = w->wr.ud.ah->real_ah; if (tail_wr) tail_wr->next = real_wr; else head_wr = real_wr; tail_wr = real_wr; } ret = qp->context->real_context->ops.post_send(qp->real_qp, head_wr, &real_bad_wr); if (ret) { for (real_wr = head_wr, w = wr; real_wr; real_wr = real_wr->next, w = w->next) if (real_wr == real_bad_wr) { *bad_wr = w; break; } } return ret; } static int post_recv_wrapper_1_0(struct ibv_qp_1_0 *qp, struct ibv_recv_wr_1_0 *wr, struct ibv_recv_wr_1_0 **bad_wr) { struct ibv_recv_wr_1_0 *w; struct ibv_recv_wr *real_wr, *head_wr = NULL, *tail_wr = NULL, *real_bad_wr; int ret; for (w = wr; w; w = w->next) { real_wr = alloca(sizeof *real_wr); real_wr->wr_id = w->wr_id; real_wr->sg_list = w->sg_list; real_wr->num_sge = w->num_sge; real_wr->next = NULL; if (tail_wr) tail_wr->next = real_wr; else head_wr = real_wr; tail_wr = real_wr; } ret = qp->context->real_context->ops.post_recv(qp->real_qp, head_wr, &real_bad_wr); if (ret) { for (real_wr = head_wr, w = wr; real_wr; real_wr = real_wr->next, w = w->next) if (real_wr == real_bad_wr) { *bad_wr = w; break; } } return ret; } COMPAT_SYMVER_FUNC(ibv_open_device, 1_0, "IBVERBS_1.0", struct ibv_context_1_0 *, struct ibv_device_1_0 *device) { struct ibv_context *real_ctx; struct ibv_context_1_0 *ctx; ctx = malloc(sizeof *ctx); if (!ctx) return NULL; real_ctx = ibv_open_device(device->real_device); if (!real_ctx) { free(ctx); return NULL; } ctx->device = device; ctx->real_context = real_ctx; ctx->ops.poll_cq = poll_cq_wrapper_1_0; ctx->ops.req_notify_cq = req_notify_cq_wrapper_1_0; ctx->ops.post_send = post_send_wrapper_1_0; ctx->ops.post_recv = post_recv_wrapper_1_0; ctx->ops.post_srq_recv = post_srq_recv_wrapper_1_0; return ctx; } COMPAT_SYMVER_FUNC(ibv_close_device, 1_0, "IBVERBS_1.0", int, struct ibv_context_1_0 *context) { int ret; ret = ibv_close_device(context->real_context); if (ret) return ret; free(context); return 0; } COMPAT_SYMVER_FUNC(ibv_get_async_event, 1_0, "IBVERBS_1.0", int, struct ibv_context_1_0 *context, struct ibv_async_event *event) { int ret; ret = ibv_get_async_event(context->real_context, event); if (ret) return ret; switch (event->event_type) { case IBV_EVENT_CQ_ERR: 
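		/*
		 * The kernel hands back the real (1.1) object; its
		 * cq_context/qp_context/srq_context was pointed at the 1.0
		 * wrapper at create time, so translate back to the wrapper
		 * the 1.0 application knows about.
		 */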
event->element.cq = event->element.cq->cq_context; break; case IBV_EVENT_QP_FATAL: case IBV_EVENT_QP_REQ_ERR: case IBV_EVENT_QP_ACCESS_ERR: case IBV_EVENT_COMM_EST: case IBV_EVENT_SQ_DRAINED: case IBV_EVENT_PATH_MIG: case IBV_EVENT_PATH_MIG_ERR: case IBV_EVENT_QP_LAST_WQE_REACHED: event->element.qp = event->element.qp->qp_context; break; case IBV_EVENT_SRQ_ERR: case IBV_EVENT_SRQ_LIMIT_REACHED: event->element.srq = event->element.srq->srq_context; break; default: break; } return ret; } COMPAT_SYMVER_FUNC(ibv_ack_async_event, 1_0, "IBVERBS_1.0", void, struct ibv_async_event *event) { struct ibv_async_event real_event = *event; switch (event->event_type) { case IBV_EVENT_CQ_ERR: real_event.element.cq = ((struct ibv_cq_1_0 *) event->element.cq)->real_cq; break; case IBV_EVENT_QP_FATAL: case IBV_EVENT_QP_REQ_ERR: case IBV_EVENT_QP_ACCESS_ERR: case IBV_EVENT_COMM_EST: case IBV_EVENT_SQ_DRAINED: case IBV_EVENT_PATH_MIG: case IBV_EVENT_PATH_MIG_ERR: case IBV_EVENT_QP_LAST_WQE_REACHED: real_event.element.qp = ((struct ibv_qp_1_0 *) event->element.qp)->real_qp; break; case IBV_EVENT_SRQ_ERR: case IBV_EVENT_SRQ_LIMIT_REACHED: real_event.element.srq = ((struct ibv_srq_1_0 *) event->element.srq)->real_srq; break; default: break; } ibv_ack_async_event(&real_event); } COMPAT_SYMVER_FUNC(ibv_query_device, 1_0, "IBVERBS_1.0", int, struct ibv_context_1_0 *context, struct ibv_device_attr *device_attr) { return ibv_query_device(context->real_context, device_attr); } COMPAT_SYMVER_FUNC(ibv_query_port, 1_0, "IBVERBS_1.0", int, struct ibv_context_1_0 *context, uint8_t port_num, struct ibv_port_attr *port_attr) { return ibv_query_port(context->real_context, port_num, port_attr); } COMPAT_SYMVER_FUNC(ibv_query_gid, 1_0, "IBVERBS_1.0", int, struct ibv_context_1_0 *context, uint8_t port_num, int index, union ibv_gid *gid) { return ibv_query_gid(context->real_context, port_num, index, gid); } COMPAT_SYMVER_FUNC(ibv_query_pkey, 1_0, "IBVERBS_1.0", int, struct ibv_context_1_0 *context, uint8_t port_num, int index, __be16 *pkey) { return ibv_query_pkey(context->real_context, port_num, index, pkey); } COMPAT_SYMVER_FUNC(ibv_alloc_pd, 1_0, "IBVERBS_1.0", struct ibv_pd_1_0 *, struct ibv_context_1_0 *context) { struct ibv_pd *real_pd; struct ibv_pd_1_0 *pd; pd = malloc(sizeof *pd); if (!pd) return NULL; real_pd = ibv_alloc_pd(context->real_context); if (!real_pd) { free(pd); return NULL; } pd->context = context; pd->real_pd = real_pd; return pd; } COMPAT_SYMVER_FUNC(ibv_dealloc_pd, 1_0, "IBVERBS_1.0", int, struct ibv_pd_1_0 *pd) { int ret; ret = ibv_dealloc_pd(pd->real_pd); if (ret) return ret; free(pd); return 0; } COMPAT_SYMVER_FUNC(ibv_reg_mr, 1_0, "IBVERBS_1.0", struct ibv_mr_1_0 *, struct ibv_pd_1_0 *pd, void *addr, size_t length, int access) { struct ibv_mr *real_mr; struct ibv_mr_1_0 *mr; mr = malloc(sizeof *mr); if (!mr) return NULL; real_mr = ibv_reg_mr(pd->real_pd, addr, length, access); if (!real_mr) { free(mr); return NULL; } mr->context = pd->context; mr->pd = pd; mr->lkey = real_mr->lkey; mr->rkey = real_mr->rkey; mr->real_mr = real_mr; return mr; } COMPAT_SYMVER_FUNC(ibv_dereg_mr, 1_0, "IBVERBS_1.0", int, struct ibv_mr_1_0 *mr) { int ret; ret = ibv_dereg_mr(mr->real_mr); if (ret) return ret; free(mr); return 0; } COMPAT_SYMVER_FUNC(ibv_create_cq, 1_0, "IBVERBS_1.0", struct ibv_cq_1_0 *, struct ibv_context_1_0 *context, int cqe, void *cq_context, struct ibv_comp_channel *channel, int comp_vector) { struct ibv_cq *real_cq; struct ibv_cq_1_0 *cq; cq = malloc(sizeof *cq); if (!cq) return NULL; real_cq = 
ibv_create_cq(context->real_context, cqe, cq_context, channel, comp_vector); if (!real_cq) { free(cq); return NULL; } cq->context = context; cq->cq_context = cq_context; cq->cqe = cqe; cq->real_cq = real_cq; real_cq->cq_context = cq; return cq; } COMPAT_SYMVER_FUNC(ibv_resize_cq, 1_0, "IBVERBS_1.0", int, struct ibv_cq_1_0 *cq, int cqe) { return ibv_resize_cq(cq->real_cq, cqe); } COMPAT_SYMVER_FUNC(ibv_destroy_cq, 1_0, "IBVERBS_1.0", int, struct ibv_cq_1_0 *cq) { int ret; ret = ibv_destroy_cq(cq->real_cq); if (ret) return ret; free(cq); return 0; } COMPAT_SYMVER_FUNC(ibv_get_cq_event, 1_0, "IBVERBS_1.0", int, struct ibv_comp_channel *channel, struct ibv_cq_1_0 **cq, void **cq_context) { struct ibv_cq *real_cq; void *cq_ptr; int ret; ret = ibv_get_cq_event(channel, &real_cq, &cq_ptr); if (ret) return ret; *cq = cq_ptr; *cq_context = (*cq)->cq_context; return 0; } COMPAT_SYMVER_FUNC(ibv_ack_cq_events, 1_0, "IBVERBS_1.0", void, struct ibv_cq_1_0 *cq, unsigned int nevents) { ibv_ack_cq_events(cq->real_cq, nevents); } COMPAT_SYMVER_FUNC(ibv_create_srq, 1_0, "IBVERBS_1.0", struct ibv_srq_1_0 *, struct ibv_pd_1_0 *pd, struct ibv_srq_init_attr *srq_init_attr) { struct ibv_srq *real_srq; struct ibv_srq_1_0 *srq; srq = malloc(sizeof *srq); if (!srq) return NULL; real_srq = ibv_create_srq(pd->real_pd, srq_init_attr); if (!real_srq) { free(srq); return NULL; } srq->context = pd->context; srq->srq_context = srq_init_attr->srq_context; srq->pd = pd; srq->real_srq = real_srq; real_srq->srq_context = srq; return srq; } COMPAT_SYMVER_FUNC(ibv_modify_srq, 1_0, "IBVERBS_1.0", int, struct ibv_srq_1_0 *srq, struct ibv_srq_attr *srq_attr, int srq_attr_mask) { return ibv_modify_srq(srq->real_srq, srq_attr, srq_attr_mask); } COMPAT_SYMVER_FUNC(ibv_query_srq, 1_0, "IBVERBS_1.0", int, struct ibv_srq_1_0 *srq, struct ibv_srq_attr *srq_attr) { return ibv_query_srq(srq->real_srq, srq_attr); } COMPAT_SYMVER_FUNC(ibv_destroy_srq, 1_0, "IBVERBS_1.0", int, struct ibv_srq_1_0 *srq) { int ret; ret = ibv_destroy_srq(srq->real_srq); if (ret) return ret; free(srq); return 0; } COMPAT_SYMVER_FUNC(ibv_create_qp, 1_0, "IBVERBS_1.0", struct ibv_qp_1_0 *, struct ibv_pd_1_0 *pd, struct ibv_qp_init_attr_1_0 *qp_init_attr) { struct ibv_qp *real_qp; struct ibv_qp_1_0 *qp; struct ibv_qp_init_attr real_init_attr; qp = malloc(sizeof *qp); if (!qp) return NULL; real_init_attr.qp_context = qp_init_attr->qp_context; real_init_attr.send_cq = qp_init_attr->send_cq->real_cq; real_init_attr.recv_cq = qp_init_attr->recv_cq->real_cq; real_init_attr.srq = qp_init_attr->srq ? 
qp_init_attr->srq->real_srq : NULL; real_init_attr.cap = qp_init_attr->cap; real_init_attr.qp_type = qp_init_attr->qp_type; real_init_attr.sq_sig_all = qp_init_attr->sq_sig_all; real_qp = ibv_create_qp(pd->real_pd, &real_init_attr); if (!real_qp) { free(qp); return NULL; } qp->context = pd->context; qp->qp_context = qp_init_attr->qp_context; qp->pd = pd; qp->send_cq = qp_init_attr->send_cq; qp->recv_cq = qp_init_attr->recv_cq; qp->srq = qp_init_attr->srq; qp->qp_type = qp_init_attr->qp_type; qp->qp_num = real_qp->qp_num; qp->real_qp = real_qp; qp_init_attr->cap = real_init_attr.cap; real_qp->qp_context = qp; return qp; } COMPAT_SYMVER_FUNC(ibv_query_qp, 1_0, "IBVERBS_1.0", int, struct ibv_qp_1_0 *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr_1_0 *init_attr) { struct ibv_qp_init_attr real_init_attr; int ret; ret = ibv_query_qp(qp->real_qp, attr, attr_mask, &real_init_attr); if (ret) return ret; init_attr->qp_context = qp->qp_context; init_attr->send_cq = real_init_attr.send_cq->cq_context; init_attr->recv_cq = real_init_attr.recv_cq->cq_context; init_attr->srq = real_init_attr.srq->srq_context; init_attr->qp_type = real_init_attr.qp_type; init_attr->cap = real_init_attr.cap; init_attr->sq_sig_all = real_init_attr.sq_sig_all; return 0; } COMPAT_SYMVER_FUNC(ibv_modify_qp, 1_0, "IBVERBS_1.0", int, struct ibv_qp_1_0 *qp, struct ibv_qp_attr *attr, int attr_mask) { return ibv_modify_qp(qp->real_qp, attr, attr_mask); } COMPAT_SYMVER_FUNC(ibv_destroy_qp, 1_0, "IBVERBS_1.0", int, struct ibv_qp_1_0 *qp) { int ret; ret = ibv_destroy_qp(qp->real_qp); if (ret) return ret; free(qp); return 0; } COMPAT_SYMVER_FUNC(ibv_create_ah, 1_0, "IBVERBS_1.0", struct ibv_ah_1_0 *, struct ibv_pd_1_0 *pd, struct ibv_ah_attr *attr) { struct ibv_ah *real_ah; struct ibv_ah_1_0 *ah; ah = malloc(sizeof *ah); if (!ah) return NULL; real_ah = ibv_create_ah(pd->real_pd, attr); if (!real_ah) { free(ah); return NULL; } ah->context = pd->context; ah->pd = pd; ah->real_ah = real_ah; return ah; } COMPAT_SYMVER_FUNC(ibv_destroy_ah, 1_0, "IBVERBS_1.0", int, struct ibv_ah_1_0 *ah) { int ret; ret = ibv_destroy_ah(ah->real_ah); if (ret) return ret; free(ah); return 0; } COMPAT_SYMVER_FUNC(ibv_attach_mcast, 1_0, "IBVERBS_1.0", int, struct ibv_qp_1_0 *qp, union ibv_gid *gid, uint16_t lid) { return ibv_attach_mcast(qp->real_qp, gid, lid); } COMPAT_SYMVER_FUNC(ibv_detach_mcast, 1_0, "IBVERBS_1.0", int, struct ibv_qp_1_0 *qp, union ibv_gid *gid, uint16_t lid) { return ibv_detach_mcast(qp->real_qp, gid, lid); } COMPAT_SYMVER_FUNC(ibv_register_driver, 1_1, "IBVERBS_1.1", void, const char *name, ibv_driver_init_func_1_1 init_func) { /* The driver interface is private as of rdma-core 13. This stub is * left to preserve dynamic-link compatibility with old libfabrics * usnic providers which use this function only to suppress a fprintf * in old versions of libibverbs. */ } rdma-core-56.1/libibverbs/device.c000066400000000000000000000327061477342711600170700ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved. * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include "ibverbs.h" static pthread_mutex_t dev_list_lock = PTHREAD_MUTEX_INITIALIZER; static struct list_head device_list = LIST_HEAD_INIT(device_list); LATEST_SYMVER_FUNC(ibv_get_device_list, 1_1, "IBVERBS_1.1", struct ibv_device **, int *num) { struct ibv_device **l = NULL; struct verbs_device *device; static bool initialized; int num_devices; int i = 0; if (num) *num = 0; pthread_mutex_lock(&dev_list_lock); if (!initialized) { if (ibverbs_init()) goto out; initialized = true; } num_devices = ibverbs_get_device_list(&device_list); if (num_devices < 0) { errno = -num_devices; goto out; } l = calloc(num_devices + 1, sizeof (struct ibv_device *)); if (!l) { errno = ENOMEM; goto out; } list_for_each(&device_list, device, entry) { l[i] = &device->device; ibverbs_device_hold(l[i]); i++; } if (num) *num = num_devices; out: pthread_mutex_unlock(&dev_list_lock); return l; } LATEST_SYMVER_FUNC(ibv_free_device_list, 1_1, "IBVERBS_1.1", void, struct ibv_device **list) { int i; for (i = 0; list[i]; i++) ibverbs_device_put(list[i]); free(list); } LATEST_SYMVER_FUNC(ibv_get_device_name, 1_1, "IBVERBS_1.1", const char *, struct ibv_device *device) { return device->name; } LATEST_SYMVER_FUNC(ibv_get_device_guid, 1_1, "IBVERBS_1.1", __be64, struct ibv_device *device) { struct verbs_sysfs_dev *sysfs_dev = verbs_get_device(device)->sysfs; char attr[24]; uint64_t guid = 0; uint16_t parts[4]; int i; pthread_mutex_lock(&dev_list_lock); if (sysfs_dev && sysfs_dev->flags & VSYSFS_READ_NODE_GUID) { guid = sysfs_dev->node_guid; pthread_mutex_unlock(&dev_list_lock); return htobe64(guid); } pthread_mutex_unlock(&dev_list_lock); if (ibv_read_ibdev_sysfs_file(attr, sizeof(attr), sysfs_dev, "node_guid") < 0) return 0; if (sscanf(attr, "%hx:%hx:%hx:%hx", parts, parts + 1, parts + 2, parts + 3) != 4) return 0; for (i = 0; i < 4; ++i) guid = (guid << 16) | parts[i]; pthread_mutex_lock(&dev_list_lock); sysfs_dev->node_guid = guid; sysfs_dev->flags |= VSYSFS_READ_NODE_GUID; pthread_mutex_unlock(&dev_list_lock); return htobe64(guid); } int ibv_get_device_index(struct ibv_device *device) { struct verbs_sysfs_dev *sysfs_dev = verbs_get_device(device)->sysfs; return sysfs_dev ? 
sysfs_dev->ibdev_idx : -1; } void verbs_init_cq(struct ibv_cq *cq, struct ibv_context *context, struct ibv_comp_channel *channel, void *cq_context) { cq->context = context; cq->channel = channel; if (cq->channel) { pthread_mutex_lock(&context->mutex); ++cq->channel->refcnt; pthread_mutex_unlock(&context->mutex); } cq->cq_context = cq_context; cq->comp_events_completed = 0; cq->async_events_completed = 0; pthread_mutex_init(&cq->mutex, NULL); pthread_cond_init(&cq->cond, NULL); } static struct ibv_cq_ex * __lib_ibv_create_cq_ex(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr) { struct ibv_cq_ex *cq; if (cq_attr->wc_flags & ~IBV_CREATE_CQ_SUP_WC_FLAGS) { errno = EOPNOTSUPP; return NULL; } cq = get_ops(context)->create_cq_ex(context, cq_attr); if (cq) verbs_init_cq(ibv_cq_ex_to_cq(cq), context, cq_attr->channel, cq_attr->cq_context); return cq; } static bool has_ioctl_write(struct ibv_context *ctx) { int rc; DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_DEVICE, UVERBS_METHOD_INVOKE_WRITE, 1); if (VERBS_IOCTL_ONLY) return true; if (VERBS_WRITE_ONLY) return false; /* * This command should return ENOSPC since the request length is too * small. */ fill_attr_const_in(cmdb, UVERBS_ATTR_WRITE_CMD, IB_USER_VERBS_CMD_QUERY_DEVICE); rc = execute_ioctl(ctx, cmdb); if (rc == EPROTONOSUPPORT) return false; if (rc == ENOTTY) return false; return true; } /* * Ownership of cmd_fd is transferred into this function, and it will either * be released during the matching call to verbs_uninit_contxt or during the * failure path of this function. */ int verbs_init_context(struct verbs_context *context_ex, struct ibv_device *device, int cmd_fd, uint32_t driver_id) { struct ibv_context *context = &context_ex->context; ibverbs_device_hold(device); context->device = device; context->cmd_fd = cmd_fd; context->async_fd = -1; pthread_mutex_init(&context->mutex, NULL); context_ex->context.abi_compat = __VERBS_ABI_IS_EXTENDED; context_ex->sz = sizeof(*context_ex); context_ex->priv = calloc(1, sizeof(*context_ex->priv)); if (!context_ex->priv) { errno = ENOMEM; close(cmd_fd); return -1; } context_ex->priv->driver_id = driver_id; verbs_set_ops(context_ex, &verbs_dummy_ops); context_ex->priv->use_ioctl_write = has_ioctl_write(context); return 0; } /* * Allocate and initialize a context structure. This is called to create the * driver wrapper, and context_offset is the number of bytes into the wrapper * structure where the verbs_context starts. */ void *_verbs_init_and_alloc_context(struct ibv_device *device, int cmd_fd, size_t alloc_size, struct verbs_context *context_offset, uint32_t driver_id) { void *drv_context; struct verbs_context *context; drv_context = calloc(1, alloc_size); if (!drv_context) { errno = ENOMEM; close(cmd_fd); return NULL; } context = drv_context + (uintptr_t)context_offset; if (verbs_init_context(context, device, cmd_fd, driver_id)) goto err_free; return drv_context; err_free: free(drv_context); return NULL; } static void set_lib_ops(struct verbs_context *vctx) { vctx->create_cq_ex = __lib_ibv_create_cq_ex; /* * The compat symver entry point behaves identically to what used to * be pointed to by _compat_query_port. 
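 */

/*
 * A rough sketch (not from this file; provider names hypothetical): a
 * provider embeds a struct verbs_context inside its own context struct
 * and passes this allocator the byte offset of that member, so the
 * private wrapper and the verbs_context can be recovered from one
 * another:
 */
#if 0 /* illustrative only */
struct foo_context {
	struct verbs_context ibv_ctx;	/* located via context_offset */
	void *db_page;			/* provider-private state ... */
};

static struct verbs_context *foo_alloc_context(struct ibv_device *ibdev,
					       int cmd_fd, void *private_data)
{
	struct foo_context *ctx;

	ctx = _verbs_init_and_alloc_context(
		ibdev, cmd_fd, sizeof(*ctx),
		(struct verbs_context *)offsetof(struct foo_context, ibv_ctx),
		RDMA_DRIVER_UNKNOWN);
	if (!ctx)
		return NULL;

	return &ctx->ibv_ctx;
}
#endif

/*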
*/ #undef ibv_query_port vctx->context.ops._compat_query_port = ibv_query_port; vctx->query_port = __lib_query_port; vctx->context.ops._compat_query_device = ibv_query_device; /* * In order to maintain backward/forward binary compatibility * with apps compiled against libibverbs-1.1.8 that use the * flow steering addition, we need to set the two * ABI_placeholder entries to match the driver set flow * entries. This is because apps compiled against * libibverbs-1.1.8 use an inline ibv_create_flow and * ibv_destroy_flow function that looks in the placeholder * spots for the proper entry points. For apps compiled * against libibverbs-1.1.9 and later, the inline functions * will be looking in the right place. */ vctx->ABI_placeholder1 = (void (*)(void))vctx->ibv_create_flow; vctx->ABI_placeholder2 = (void (*)(void))vctx->ibv_destroy_flow; } struct ibv_context *verbs_open_device(struct ibv_device *device, void *private_data) { struct verbs_device *verbs_device = verbs_get_device(device); int cmd_fd = -1; struct verbs_context *context_ex; int ret; if (verbs_device->sysfs) { /* * We'll only be doing writes, but we need O_RDWR in case the * provider needs to mmap() the file. */ cmd_fd = open_cdev(verbs_device->sysfs->sysfs_name, verbs_device->sysfs->sysfs_cdev); if (cmd_fd < 0) return NULL; } /* * cmd_fd ownership is transferred into alloc_context, if it fails * then it closes cmd_fd and returns NULL */ context_ex = verbs_device->ops->alloc_context(device, cmd_fd, private_data); if (!context_ex) return NULL; set_lib_ops(context_ex); if (verbs_device->sysfs) { if (context_ex->context.async_fd == -1) { ret = ibv_cmd_alloc_async_fd(&context_ex->context); if (ret) { ibv_close_device(&context_ex->context); return NULL; } } } return &context_ex->context; } LATEST_SYMVER_FUNC(ibv_open_device, 1_1, "IBVERBS_1.1", struct ibv_context *, struct ibv_device *device) { return verbs_open_device(device, NULL); } struct ibv_context *ibv_import_device(int cmd_fd) { struct verbs_device *verbs_device = NULL; struct verbs_context *context_ex; struct ibv_device **dev_list; struct ibv_context *ctx = NULL; struct stat st; int ret; int i; if (fstat(cmd_fd, &st) || !S_ISCHR(st.st_mode)) { errno = EINVAL; return NULL; } dev_list = ibv_get_device_list(NULL); if (!dev_list) { errno = ENODEV; return NULL; } for (i = 0; dev_list[i]; ++i) { if (verbs_get_device(dev_list[i])->sysfs->sysfs_cdev == st.st_rdev) { verbs_device = verbs_get_device(dev_list[i]); break; } } if (!verbs_device) { errno = ENODEV; goto out; } if (!verbs_device->ops->import_context) { errno = EOPNOTSUPP; goto out; } /* In case the underlay cdev number was assigned in the meantime to * other device as of some disassociate flow, the next call on the * FD will end up with EIO (i.e. query_context command) and we should * be safe from using the wrong device. 
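	 * A destroy issued on such a disassociated context likewise fails
	 * with EIO; see verbs_is_destroy_err() in cmd_write.h for how that
	 * is tolerated when verbs_allow_disassociate_destroy is set.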
*/ context_ex = verbs_device->ops->import_context(&verbs_device->device, cmd_fd); if (!context_ex) goto out; set_lib_ops(context_ex); context_ex->priv->imported = true; ctx = &context_ex->context; ret = ibv_cmd_alloc_async_fd(ctx); if (ret) { ibv_close_device(ctx); ctx = NULL; } out: ibv_free_device_list(dev_list); return ctx; } void verbs_uninit_context(struct verbs_context *context_ex) { free(context_ex->priv); if (context_ex->context.cmd_fd != -1) close(context_ex->context.cmd_fd); if (context_ex->context.async_fd != -1) close(context_ex->context.async_fd); ibverbs_device_put(context_ex->context.device); } LATEST_SYMVER_FUNC(ibv_close_device, 1_1, "IBVERBS_1.1", int, struct ibv_context *context) { const struct verbs_context_ops *ops = get_ops(context); ops->free_context(context); return 0; } LATEST_SYMVER_FUNC(ibv_get_async_event, 1_1, "IBVERBS_1.1", int, struct ibv_context *context, struct ibv_async_event *event) { struct ib_uverbs_async_event_desc ev; if (read(context->async_fd, &ev, sizeof ev) != sizeof ev) return -1; event->event_type = ev.event_type; switch (event->event_type) { case IBV_EVENT_CQ_ERR: event->element.cq = (void *) (uintptr_t) ev.element; break; case IBV_EVENT_QP_FATAL: case IBV_EVENT_QP_REQ_ERR: case IBV_EVENT_QP_ACCESS_ERR: case IBV_EVENT_COMM_EST: case IBV_EVENT_SQ_DRAINED: case IBV_EVENT_PATH_MIG: case IBV_EVENT_PATH_MIG_ERR: case IBV_EVENT_QP_LAST_WQE_REACHED: event->element.qp = (void *) (uintptr_t) ev.element; break; case IBV_EVENT_SRQ_ERR: case IBV_EVENT_SRQ_LIMIT_REACHED: event->element.srq = (void *) (uintptr_t) ev.element; break; case IBV_EVENT_WQ_FATAL: event->element.wq = (void *) (uintptr_t) ev.element; break; default: event->element.port_num = ev.element; break; } get_ops(context)->async_event(context, event); return 0; } LATEST_SYMVER_FUNC(ibv_ack_async_event, 1_1, "IBVERBS_1.1", void, struct ibv_async_event *event) { switch (event->event_type) { case IBV_EVENT_CQ_ERR: { struct ibv_cq *cq = event->element.cq; pthread_mutex_lock(&cq->mutex); ++cq->async_events_completed; pthread_cond_signal(&cq->cond); pthread_mutex_unlock(&cq->mutex); return; } case IBV_EVENT_QP_FATAL: case IBV_EVENT_QP_REQ_ERR: case IBV_EVENT_QP_ACCESS_ERR: case IBV_EVENT_COMM_EST: case IBV_EVENT_SQ_DRAINED: case IBV_EVENT_PATH_MIG: case IBV_EVENT_PATH_MIG_ERR: case IBV_EVENT_QP_LAST_WQE_REACHED: { struct ibv_qp *qp = event->element.qp; pthread_mutex_lock(&qp->mutex); ++qp->events_completed; pthread_cond_signal(&qp->cond); pthread_mutex_unlock(&qp->mutex); return; } case IBV_EVENT_SRQ_ERR: case IBV_EVENT_SRQ_LIMIT_REACHED: { struct ibv_srq *srq = event->element.srq; pthread_mutex_lock(&srq->mutex); ++srq->events_completed; pthread_cond_signal(&srq->cond); pthread_mutex_unlock(&srq->mutex); return; } case IBV_EVENT_WQ_FATAL: { struct ibv_wq *wq = event->element.wq; pthread_mutex_lock(&wq->mutex); ++wq->events_completed; pthread_cond_signal(&wq->cond); pthread_mutex_unlock(&wq->mutex); return; } default: return; } } rdma-core-56.1/libibverbs/driver.h000066400000000000000000000700071477342711600171250ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005, 2006 Cisco Systems, Inc. All rights reserved. * Copyright (c) 2005 PathScale, Inc. All rights reserved. * Copyright (c) 2020 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef INFINIBAND_DRIVER_H #define INFINIBAND_DRIVER_H #include #include #include #include #include #include #include #include #include #include struct verbs_device; enum { VERBS_LOG_LEVEL_NONE, VERBS_LOG_ERR, VERBS_LOG_WARN, VERBS_LOG_INFO, VERBS_LOG_DEBUG, }; void __verbs_log(struct verbs_context *ctx, uint32_t level, const char *fmt, ...); #define verbs_log(ctx, level, format, arg...) \ do { \ int tmp = errno; \ __verbs_log(ctx, level, "%s: %s:%d: " format, \ (ctx)->context.device->name, __func__, __LINE__, ##arg); \ errno = tmp; \ } while (0) #define verbs_debug(ctx, format, arg...) \ verbs_log(ctx, VERBS_LOG_DEBUG, format, ##arg) #define verbs_info(ctx, format, arg...) \ verbs_log(ctx, VERBS_LOG_INFO, format, ##arg) #define verbs_warn(ctx, format, arg...) \ verbs_log(ctx, VERBS_LOG_WARN, format, ##arg) #define verbs_err(ctx, format, arg...) \ verbs_log(ctx, VERBS_LOG_ERR, format, ##arg) #ifdef VERBS_DEBUG #define verbs_log_datapath(ctx, level, format, arg...) \ verbs_log(ctx, level, format, ##arg) #else #define verbs_log_datapath(ctx, level, format, arg...) {} #endif #define verbs_debug_datapath(ctx, format, arg...) \ verbs_log_datapath(ctx, VERBS_LOG_DEBUG, format, ##arg) #define verbs_info_datapath(ctx, format, arg...) \ verbs_log_datapath(ctx, VERBS_LOG_INFO, format, ##arg) #define verbs_warn_datapath(ctx, format, arg...) \ verbs_log_datapath(ctx, VERBS_LOG_WARN, format, ##arg) #define verbs_err_datapath(ctx, format, arg...) 
\ verbs_log_datapath(ctx, VERBS_LOG_ERR, format, ##arg) enum verbs_xrcd_mask { VERBS_XRCD_HANDLE = 1 << 0, VERBS_XRCD_RESERVED = 1 << 1 }; enum create_cq_cmd_flags { CREATE_CQ_CMD_FLAGS_TS_IGNORED_EX = 1 << 0, }; struct verbs_xrcd { struct ibv_xrcd xrcd; uint32_t comp_mask; uint32_t handle; }; struct verbs_srq { struct ibv_srq srq; enum ibv_srq_type srq_type; struct verbs_xrcd *xrcd; struct ibv_cq *cq; uint32_t srq_num; }; enum verbs_qp_mask { VERBS_QP_XRCD = 1 << 0, VERBS_QP_EX = 1 << 1, }; enum ibv_gid_type_sysfs { IBV_GID_TYPE_SYSFS_IB_ROCE_V1, IBV_GID_TYPE_SYSFS_ROCE_V2, }; enum verbs_query_gid_attr_mask { VERBS_QUERY_GID_ATTR_GID = 1 << 0, VERBS_QUERY_GID_ATTR_TYPE = 1 << 1, VERBS_QUERY_GID_ATTR_NDEV_IFINDEX = 1 << 2, }; enum ibv_mr_type { IBV_MR_TYPE_MR, IBV_MR_TYPE_NULL_MR, IBV_MR_TYPE_IMPORTED_MR, IBV_MR_TYPE_DMABUF_MR, }; struct verbs_mr { struct ibv_mr ibv_mr; enum ibv_mr_type mr_type; int access; }; static inline struct verbs_mr *verbs_get_mr(struct ibv_mr *mr) { return container_of(mr, struct verbs_mr, ibv_mr); } struct verbs_qp { union { struct ibv_qp qp; struct ibv_qp_ex qp_ex; }; uint32_t comp_mask; struct verbs_xrcd *xrcd; }; static_assert(offsetof(struct ibv_qp_ex, qp_base) == 0, "Invalid qp layout"); struct verbs_cq { union { struct ibv_cq cq; struct ibv_cq_ex cq_ex; }; }; enum ibv_flow_action_type { IBV_FLOW_ACTION_UNSPECIFIED, IBV_FLOW_ACTION_ESP = 1, }; struct verbs_flow_action { struct ibv_flow_action action; uint32_t handle; enum ibv_flow_action_type type; }; struct verbs_dm { struct ibv_dm dm; uint32_t handle; }; enum { VERBS_MATCH_SENTINEL = 0, VERBS_MATCH_PCI = 1, VERBS_MATCH_MODALIAS = 2, VERBS_MATCH_DRIVER_ID = 3, }; struct verbs_match_ent { void *driver_data; union { const char *modalias; uint64_t driver_id; } u; uint16_t vendor; uint16_t device; uint8_t kind; }; #define VERBS_DRIVER_ID(_id) \ { \ .u.driver_id = (_id), .kind = VERBS_MATCH_DRIVER_ID, \ } /* Note: New drivers should only use VERBS_DRIVER_ID, the below are for legacy * drivers */ #define VERBS_PCI_MATCH(_vendor, _device, _data) \ { \ .driver_data = (void *)(_data), \ .vendor = (_vendor), \ .device = (_device), \ .kind = VERBS_MATCH_PCI, \ } #define VERBS_MODALIAS_MATCH(_mod_str, _data) \ { \ .driver_data = (void *)(_data), \ .u.modalias = (_mod_str), \ .kind = VERBS_MATCH_MODALIAS, \ } /* Matching on the IB device name is STRONGLY discouraged. This will only * match if there is no device/modalias file available, and it will eventually * be disabled entirely if the kernel supports renaming. Use is strongly * discouraged. 
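 *
 * New providers should therefore match on the kernel driver id. As an
 * illustration only (the "foo" provider and RDMA_DRIVER_FOO id are
 * hypothetical, not part of this tree), a minimal match table could be:
 *
 *	static const struct verbs_match_ent foo_table[] = {
 *		VERBS_DRIVER_ID(RDMA_DRIVER_FOO),
 *		{},
 *	};
 *
 * The zeroed {} entry is the VERBS_MATCH_SENTINEL that terminates the
 * table.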
*/ #define VERBS_NAME_MATCH(_name_prefix, _data) \ { \ .driver_data = (_data), \ .u.modalias = "rdma_device:*N" _name_prefix "*", \ .kind = VERBS_MATCH_MODALIAS, \ } enum { VSYSFS_READ_MODALIAS = 1 << 0, VSYSFS_READ_NODE_GUID = 1 << 1, }; /* An rdma device detected in sysfs */ struct verbs_sysfs_dev { struct list_node entry; void *provider_data; const struct verbs_match_ent *match; unsigned int flags; char sysfs_name[IBV_SYSFS_NAME_MAX]; dev_t sysfs_cdev; char ibdev_name[IBV_SYSFS_NAME_MAX]; char ibdev_path[IBV_SYSFS_PATH_MAX]; char modalias[512]; uint64_t node_guid; uint32_t driver_id; enum ibv_node_type node_type; int ibdev_idx; uint32_t num_ports; uint32_t abi_ver; struct timespec time_created; }; /* Must change the PRIVATE IBVERBS_PRIVATE_ symbol if this is changed */ struct verbs_device_ops { const char *name; uint32_t match_min_abi_version; uint32_t match_max_abi_version; const struct verbs_match_ent *match_table; const struct verbs_device_ops **static_providers; bool (*match_device)(struct verbs_sysfs_dev *sysfs_dev); struct verbs_context *(*alloc_context)(struct ibv_device *device, int cmd_fd, void *private_data); struct verbs_context *(*import_context)(struct ibv_device *device, int cmd_fd); struct verbs_device *(*alloc_device)(struct verbs_sysfs_dev *sysfs_dev); void (*uninit_device)(struct verbs_device *device); }; /* Must change the PRIVATE IBVERBS_PRIVATE_ symbol if this is changed */ struct verbs_device { struct ibv_device device; /* Must be first */ const struct verbs_device_ops *ops; atomic_int refcount; struct list_node entry; struct verbs_sysfs_dev *sysfs; uint64_t core_support; }; struct verbs_counters { struct ibv_counters counters; uint32_t handle; }; /* * Must change the PRIVATE IBVERBS_PRIVATE_ symbol if this is changed. This is * the union of every op the driver can support. If new elements are added to * this structure then verbs_dummy_ops must also be updated. * * Keep sorted. 
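 *
 * A provider does not have to fill in every entry; ops it leaves NULL are
 * backed by verbs_dummy_ops (see dummy_ops.c). As a sketch only (the
 * foo_* names are hypothetical), a minimal provider might supply:
 *
 *	static const struct verbs_context_ops foo_ctx_ops = {
 *		.free_context = foo_free_context,
 *		.query_device_ex = foo_query_device_ex,
 *		.query_port = foo_query_port,
 *	};
 *
 * and install it with verbs_set_ops(vctx, &foo_ctx_ops) from its
 * alloc_context() callback.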
*/ struct verbs_context_ops { int (*advise_mr)(struct ibv_pd *pd, enum ibv_advise_mr_advice advice, uint32_t flags, struct ibv_sge *sg_list, uint32_t num_sges); struct ibv_dm *(*alloc_dm)(struct ibv_context *context, struct ibv_alloc_dm_attr *attr); struct ibv_mw *(*alloc_mw)(struct ibv_pd *pd, enum ibv_mw_type type); struct ibv_mr *(*alloc_null_mr)(struct ibv_pd *pd); struct ibv_pd *(*alloc_parent_domain)( struct ibv_context *context, struct ibv_parent_domain_init_attr *attr); struct ibv_pd *(*alloc_pd)(struct ibv_context *context); struct ibv_td *(*alloc_td)(struct ibv_context *context, struct ibv_td_init_attr *init_attr); void (*async_event)(struct ibv_context *context, struct ibv_async_event *event); int (*attach_counters_point_flow)(struct ibv_counters *counters, struct ibv_counter_attach_attr *attr, struct ibv_flow *flow); int (*attach_mcast)(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); int (*bind_mw)(struct ibv_qp *qp, struct ibv_mw *mw, struct ibv_mw_bind *mw_bind); int (*close_xrcd)(struct ibv_xrcd *xrcd); void (*cq_event)(struct ibv_cq *cq); struct ibv_ah *(*create_ah)(struct ibv_pd *pd, struct ibv_ah_attr *attr); struct ibv_counters *(*create_counters)(struct ibv_context *context, struct ibv_counters_init_attr *init_attr); struct ibv_cq *(*create_cq)(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); struct ibv_cq_ex *(*create_cq_ex)( struct ibv_context *context, struct ibv_cq_init_attr_ex *init_attr); struct ibv_flow *(*create_flow)(struct ibv_qp *qp, struct ibv_flow_attr *flow_attr); struct ibv_flow_action *(*create_flow_action_esp)(struct ibv_context *context, struct ibv_flow_action_esp_attr *attr); struct ibv_qp *(*create_qp)(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); struct ibv_qp *(*create_qp_ex)( struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_init_attr_ex); struct ibv_rwq_ind_table *(*create_rwq_ind_table)( struct ibv_context *context, struct ibv_rwq_ind_table_init_attr *init_attr); struct ibv_srq *(*create_srq)(struct ibv_pd *pd, struct ibv_srq_init_attr *srq_init_attr); struct ibv_srq *(*create_srq_ex)( struct ibv_context *context, struct ibv_srq_init_attr_ex *srq_init_attr_ex); struct ibv_wq *(*create_wq)(struct ibv_context *context, struct ibv_wq_init_attr *wq_init_attr); int (*dealloc_mw)(struct ibv_mw *mw); int (*dealloc_pd)(struct ibv_pd *pd); int (*dealloc_td)(struct ibv_td *td); int (*dereg_mr)(struct verbs_mr *vmr); int (*destroy_ah)(struct ibv_ah *ah); int (*destroy_counters)(struct ibv_counters *counters); int (*destroy_cq)(struct ibv_cq *cq); int (*destroy_flow)(struct ibv_flow *flow); int (*destroy_flow_action)(struct ibv_flow_action *action); int (*destroy_qp)(struct ibv_qp *qp); int (*destroy_rwq_ind_table)(struct ibv_rwq_ind_table *rwq_ind_table); int (*destroy_srq)(struct ibv_srq *srq); int (*destroy_wq)(struct ibv_wq *wq); int (*detach_mcast)(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); void (*free_context)(struct ibv_context *context); int (*free_dm)(struct ibv_dm *dm); int (*get_srq_num)(struct ibv_srq *srq, uint32_t *srq_num); struct ibv_dm *(*import_dm)(struct ibv_context *context, uint32_t dm_handle); struct ibv_mr *(*import_mr)(struct ibv_pd *pd, uint32_t mr_handle); struct ibv_pd *(*import_pd)(struct ibv_context *context, uint32_t pd_handle); int (*modify_cq)(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr); int (*modify_flow_action_esp)(struct ibv_flow_action *action, struct ibv_flow_action_esp_attr *attr); int (*modify_qp)(struct ibv_qp *qp, struct 
ibv_qp_attr *attr, int attr_mask); int (*modify_qp_rate_limit)(struct ibv_qp *qp, struct ibv_qp_rate_limit_attr *attr); int (*modify_srq)(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, int srq_attr_mask); int (*modify_wq)(struct ibv_wq *wq, struct ibv_wq_attr *wq_attr); struct ibv_qp *(*open_qp)(struct ibv_context *context, struct ibv_qp_open_attr *attr); struct ibv_xrcd *(*open_xrcd)( struct ibv_context *context, struct ibv_xrcd_init_attr *xrcd_init_attr); int (*poll_cq)(struct ibv_cq *cq, int num_entries, struct ibv_wc *wc); int (*post_recv)(struct ibv_qp *qp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); int (*post_send)(struct ibv_qp *qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr); int (*post_srq_ops)(struct ibv_srq *srq, struct ibv_ops_wr *op, struct ibv_ops_wr **bad_op); int (*post_srq_recv)(struct ibv_srq *srq, struct ibv_recv_wr *recv_wr, struct ibv_recv_wr **bad_recv_wr); int (*query_device_ex)(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); int (*query_ece)(struct ibv_qp *qp, struct ibv_ece *ece); int (*query_port)(struct ibv_context *context, uint8_t port_num, struct ibv_port_attr *port_attr); int (*query_qp)(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); int (*query_qp_data_in_order)(struct ibv_qp *qp, enum ibv_wr_opcode op, uint32_t flags); int (*query_rt_values)(struct ibv_context *context, struct ibv_values_ex *values); int (*query_srq)(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr); int (*read_counters)(struct ibv_counters *counters, uint64_t *counters_value, uint32_t ncounters, uint32_t flags); struct ibv_mr *(*reg_dm_mr)(struct ibv_pd *pd, struct ibv_dm *dm, uint64_t dm_offset, size_t length, unsigned int access); struct ibv_mr *(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access); struct ibv_mr *(*reg_mr)(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access); int (*req_notify_cq)(struct ibv_cq *cq, int solicited_only); int (*rereg_mr)(struct verbs_mr *vmr, int flags, struct ibv_pd *pd, void *addr, size_t length, int access); int (*resize_cq)(struct ibv_cq *cq, int cqe); int (*set_ece)(struct ibv_qp *qp, struct ibv_ece *ece); void (*unimport_dm)(struct ibv_dm *dm); void (*unimport_mr)(struct ibv_mr *mr); void (*unimport_pd)(struct ibv_pd *pd); }; static inline struct verbs_device * verbs_get_device(const struct ibv_device *dev) { return container_of(dev, struct verbs_device, device); } typedef struct verbs_device *(*verbs_driver_init_func)(const char *uverbs_sys_path, int abi_version); /* Wire the IBVERBS_PRIVATE version number into the verbs_register_driver * symbol name. This guarantees we link to the correct set of symbols even if * statically linking or using a dynamic linker with symbol versioning turned * off. */ #define ___make_verbs_register_driver(x) verbs_register_driver_ ## x #define __make_verbs_register_driver(x) ___make_verbs_register_driver(x) #define verbs_register_driver __make_verbs_register_driver(IBVERBS_PABI_VERSION) void verbs_register_driver(const struct verbs_device_ops *ops); /* * Macro for providers to use to supply verbs_device_ops to the core code. * This creates a global symbol for the provider structure to be used by the * ibv_static_providers() machinery, and a global constructor for the dlopen * machinery.
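 *
 * Typical provider usage, as a sketch only ("foo" and the foo_* symbols
 * are hypothetical names, not a real provider in this tree):
 *
 *	static const struct verbs_device_ops foo_dev_ops = {
 *		.name = "foo",
 *		.match_min_abi_version = 1,
 *		.match_max_abi_version = 1,
 *		.match_table = foo_table,
 *		.alloc_device = foo_device_alloc,
 *		.uninit_device = foo_device_uninit,
 *		.alloc_context = foo_alloc_context,
 *	};
 *	PROVIDER_DRIVER(foo, foo_dev_ops);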
*/ #define PROVIDER_DRIVER(provider_name, drv_struct) \ extern const struct verbs_device_ops verbs_provider_##provider_name \ __attribute__((alias(stringify(drv_struct)))); \ static __attribute__((constructor)) void provider_name##_register_driver(void) \ { \ verbs_register_driver(&drv_struct); \ } void *_verbs_init_and_alloc_context(struct ibv_device *device, int cmd_fd, size_t alloc_size, struct verbs_context *context_offset, uint32_t driver_id); #define verbs_init_and_alloc_context(ibdev, cmd_fd, drv_ctx_ptr, ctx_memb, \ driver_id) \ ((typeof(drv_ctx_ptr))_verbs_init_and_alloc_context( \ ibdev, cmd_fd, sizeof(*drv_ctx_ptr), \ &((typeof(drv_ctx_ptr))NULL)->ctx_memb, (driver_id))) int verbs_init_context(struct verbs_context *context_ex, struct ibv_device *device, int cmd_fd, uint32_t driver_id); void verbs_uninit_context(struct verbs_context *context); void verbs_set_ops(struct verbs_context *vctx, const struct verbs_context_ops *ops); void verbs_init_cq(struct ibv_cq *cq, struct ibv_context *context, struct ibv_comp_channel *channel, void *cq_context); struct ibv_context *verbs_open_device(struct ibv_device *device, void *private_data); int ibv_cmd_get_context(struct verbs_context *context, struct ibv_get_context *cmd, size_t cmd_size, struct ib_uverbs_get_context_resp *resp, size_t resp_size); int ibv_cmd_query_context(struct ibv_context *ctx, struct ibv_command_buffer *driver); int ibv_cmd_create_flow_action_esp(struct ibv_context *ctx, struct ibv_flow_action_esp_attr *attr, struct verbs_flow_action *flow_action, struct ibv_command_buffer *driver); int ibv_cmd_modify_flow_action_esp(struct verbs_flow_action *flow_action, struct ibv_flow_action_esp_attr *attr, struct ibv_command_buffer *driver); int ibv_cmd_query_device_any(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size, struct ib_uverbs_ex_query_device_resp *resp, size_t *resp_size); int ibv_cmd_query_port(struct ibv_context *context, uint8_t port_num, struct ibv_port_attr *port_attr, struct ibv_query_port *cmd, size_t cmd_size); int ibv_cmd_alloc_async_fd(struct ibv_context *context); int ibv_cmd_alloc_pd(struct ibv_context *context, struct ibv_pd *pd, struct ibv_alloc_pd *cmd, size_t cmd_size, struct ib_uverbs_alloc_pd_resp *resp, size_t resp_size); int ibv_cmd_dealloc_pd(struct ibv_pd *pd); int ibv_cmd_open_xrcd(struct ibv_context *context, struct verbs_xrcd *xrcd, int vxrcd_size, struct ibv_xrcd_init_attr *attr, struct ibv_open_xrcd *cmd, size_t cmd_size, struct ib_uverbs_open_xrcd_resp *resp, size_t resp_size); int ibv_cmd_close_xrcd(struct verbs_xrcd *xrcd); int ibv_cmd_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access, struct verbs_mr *vmr, struct ibv_reg_mr *cmd, size_t cmd_size, struct ib_uverbs_reg_mr_resp *resp, size_t resp_size); int ibv_cmd_rereg_mr(struct verbs_mr *vmr, uint32_t flags, void *addr, size_t length, uint64_t hca_va, int access, struct ibv_pd *pd, struct ibv_rereg_mr *cmd, size_t cmd_sz, struct ib_uverbs_rereg_mr_resp *resp, size_t resp_sz); int ibv_cmd_dereg_mr(struct verbs_mr *vmr); int ibv_cmd_query_mr(struct ibv_pd *pd, struct verbs_mr *vmr, uint32_t mr_handle); int ibv_cmd_advise_mr(struct ibv_pd *pd, enum ibv_advise_mr_advice advice, uint32_t flags, struct ibv_sge *sg_list, uint32_t num_sge); int ibv_cmd_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access, struct verbs_mr *vmr, struct ibv_command_buffer *driver); int ibv_cmd_alloc_mw(struct ibv_pd 
*pd, enum ibv_mw_type type, struct ibv_mw *mw, struct ibv_alloc_mw *cmd, size_t cmd_size, struct ib_uverbs_alloc_mw_resp *resp, size_t resp_size); int ibv_cmd_dealloc_mw(struct ibv_mw *mw); int ibv_cmd_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector, struct ibv_cq *cq, struct ibv_create_cq *cmd, size_t cmd_size, struct ib_uverbs_create_cq_resp *resp, size_t resp_size); int ibv_cmd_create_cq_ex(struct ibv_context *context, const struct ibv_cq_init_attr_ex *cq_attr, struct verbs_cq *cq, struct ibv_create_cq_ex *cmd, size_t cmd_size, struct ib_uverbs_ex_create_cq_resp *resp, size_t resp_size, uint32_t cmd_flags); int ibv_cmd_create_cq_ex2(struct ibv_context *context, const struct ibv_cq_init_attr_ex *cq_attr, struct verbs_cq *cq, struct ibv_create_cq_ex *cmd, size_t cmd_size, struct ib_uverbs_ex_create_cq_resp *resp, size_t resp_size, uint32_t cmd_flags, struct ibv_command_buffer *driver); int ibv_cmd_poll_cq(struct ibv_cq *cq, int ne, struct ibv_wc *wc); int ibv_cmd_req_notify_cq(struct ibv_cq *cq, int solicited_only); int ibv_cmd_resize_cq(struct ibv_cq *cq, int cqe, struct ibv_resize_cq *cmd, size_t cmd_size, struct ib_uverbs_resize_cq_resp *resp, size_t resp_size); int ibv_cmd_destroy_cq(struct ibv_cq *cq); int ibv_cmd_modify_cq(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr, struct ibv_modify_cq *cmd, size_t cmd_size); int ibv_cmd_create_srq(struct ibv_pd *pd, struct ibv_srq *srq, struct ibv_srq_init_attr *attr, struct ibv_create_srq *cmd, size_t cmd_size, struct ib_uverbs_create_srq_resp *resp, size_t resp_size); int ibv_cmd_create_srq_ex(struct ibv_context *context, struct verbs_srq *srq, struct ibv_srq_init_attr_ex *attr_ex, struct ibv_create_xsrq *cmd, size_t cmd_size, struct ib_uverbs_create_srq_resp *resp, size_t resp_size); int ibv_cmd_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, int srq_attr_mask, struct ibv_modify_srq *cmd, size_t cmd_size); int ibv_cmd_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, struct ibv_query_srq *cmd, size_t cmd_size); int ibv_cmd_destroy_srq(struct ibv_srq *srq); int ibv_cmd_create_qp(struct ibv_pd *pd, struct ibv_qp *qp, struct ibv_qp_init_attr *attr, struct ibv_create_qp *cmd, size_t cmd_size, struct ib_uverbs_create_qp_resp *resp, size_t resp_size); int ibv_cmd_create_qp_ex(struct ibv_context *context, struct verbs_qp *qp, struct ibv_qp_init_attr_ex *attr_ex, struct ibv_create_qp *cmd, size_t cmd_size, struct ib_uverbs_create_qp_resp *resp, size_t resp_size); int ibv_cmd_create_qp_ex2(struct ibv_context *context, struct verbs_qp *qp, struct ibv_qp_init_attr_ex *qp_attr, struct ibv_create_qp_ex *cmd, size_t cmd_size, struct ib_uverbs_ex_create_qp_resp *resp, size_t resp_size); int ibv_cmd_open_qp(struct ibv_context *context, struct verbs_qp *qp, int vqp_sz, struct ibv_qp_open_attr *attr, struct ibv_open_qp *cmd, size_t cmd_size, struct ib_uverbs_create_qp_resp *resp, size_t resp_size); int ibv_cmd_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *qp_attr, int attr_mask, struct ibv_qp_init_attr *qp_init_attr, struct ibv_query_qp *cmd, size_t cmd_size); int ibv_cmd_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_modify_qp *cmd, size_t cmd_size); int ibv_cmd_modify_qp_ex(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_modify_qp_ex *cmd, size_t cmd_size, struct ib_uverbs_ex_modify_qp_resp *resp, size_t resp_size); int ibv_cmd_destroy_qp(struct ibv_qp *qp); int ibv_cmd_post_send(struct ibv_qp *ibqp, struct ibv_send_wr 
*wr, struct ibv_send_wr **bad_wr); int ibv_cmd_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); int ibv_cmd_post_srq_recv(struct ibv_srq *srq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); int ibv_cmd_create_ah(struct ibv_pd *pd, struct ibv_ah *ah, struct ibv_ah_attr *attr, struct ib_uverbs_create_ah_resp *resp, size_t resp_size); int ibv_cmd_destroy_ah(struct ibv_ah *ah); int ibv_cmd_attach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); int ibv_cmd_detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); int ibv_cmd_create_flow(struct ibv_qp *qp, struct ibv_flow *flow_id, struct ibv_flow_attr *flow_attr, void *ucmd, size_t ucmd_size); int ibv_cmd_destroy_flow(struct ibv_flow *flow_id); int ibv_cmd_create_wq(struct ibv_context *context, struct ibv_wq_init_attr *wq_init_attr, struct ibv_wq *wq, struct ibv_create_wq *cmd, size_t cmd_size, struct ib_uverbs_ex_create_wq_resp *resp, size_t resp_size); int ibv_cmd_destroy_flow_action(struct verbs_flow_action *action); int ibv_cmd_modify_wq(struct ibv_wq *wq, struct ibv_wq_attr *attr, struct ibv_modify_wq *cmd, size_t cmd_size); int ibv_cmd_destroy_wq(struct ibv_wq *wq); int ibv_cmd_create_rwq_ind_table(struct ibv_context *context, struct ibv_rwq_ind_table_init_attr *init_attr, struct ibv_rwq_ind_table *rwq_ind_table, struct ib_uverbs_ex_create_rwq_ind_table_resp *resp, size_t resp_size); int ibv_cmd_destroy_rwq_ind_table(struct ibv_rwq_ind_table *rwq_ind_table); int ibv_cmd_create_counters(struct ibv_context *context, struct ibv_counters_init_attr *init_attr, struct verbs_counters *vcounters, struct ibv_command_buffer *link); int ibv_cmd_destroy_counters(struct verbs_counters *vcounters); int ibv_cmd_read_counters(struct verbs_counters *vcounters, uint64_t *counters_value, uint32_t ncounters, uint32_t flags, struct ibv_command_buffer *link); int ibv_dontfork_range(void *base, size_t size); int ibv_dofork_range(void *base, size_t size); int ibv_cmd_alloc_dm(struct ibv_context *ctx, const struct ibv_alloc_dm_attr *dm_attr, struct verbs_dm *dm, struct ibv_command_buffer *link); int ibv_cmd_free_dm(struct verbs_dm *dm); int ibv_cmd_reg_dm_mr(struct ibv_pd *pd, struct verbs_dm *dm, uint64_t offset, size_t length, unsigned int access, struct verbs_mr *vmr, struct ibv_command_buffer *link); int __ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num, uint32_t gid_index, struct ibv_gid_entry *entry, uint32_t flags, size_t entry_size, uint32_t fallback_attr_mask); /* * sysfs helper functions */ const char *ibv_get_sysfs_path(void); int ibv_read_sysfs_file(const char *dir, const char *file, char *buf, size_t size); int ibv_read_sysfs_file_at(int dirfd, const char *file, char *buf, size_t size); int ibv_read_ibdev_sysfs_file(char *buf, size_t size, struct verbs_sysfs_dev *sysfs_dev, const char *fnfmt, ...) __attribute__((format(printf, 4, 5))); static inline bool check_comp_mask(uint64_t input, uint64_t supported) { return (input & ~supported) == 0; } int ibv_query_gid_type(struct ibv_context *context, uint8_t port_num, unsigned int index, enum ibv_gid_type_sysfs *type); static inline int ibv_check_alloc_parent_domain(struct ibv_parent_domain_init_attr *attr) { /* A valid protection domain must be set */ if (!attr->pd) { errno = EINVAL; return -1; } return 0; } /* * Initialize the ibv_pd which is being used as a parent_domain. From the * perspective of the core code the new ibv_pd is completely interchangeable * with the passed contained_pd. 
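 *
 * A provider's alloc_parent_domain() is expected to call this on the pd
 * it returns, roughly along these lines (a sketch assuming a trivial
 * wrapper; real providers typically embed ibv_pd in their own struct):
 *
 *	if (ibv_check_alloc_parent_domain(attr))
 *		return NULL;
 *	pd = calloc(1, sizeof(*pd));
 *	if (!pd)
 *		return NULL;
 *	ibv_initialize_parent_domain(pd, attr->pd);
 *	return pd;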
*/ static inline void ibv_initialize_parent_domain(struct ibv_pd *parent_domain, struct ibv_pd *contained_pd) { parent_domain->context = contained_pd->context; parent_domain->handle = contained_pd->handle; } #endif /* INFINIBAND_DRIVER_H */ rdma-core-56.1/libibverbs/dummy_ops.c000066400000000000000000000401011477342711600176310ustar00rootroot00000000000000/* * Copyright (c) 2017 Mellanox Technologies, Inc. All rights reserved. * Copyright (c) 2020 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include "ibverbs.h" #include static int advise_mr(struct ibv_pd *pd, enum ibv_advise_mr_advice advice, uint32_t flags, struct ibv_sge *sg_list, uint32_t num_sges) { return EOPNOTSUPP; } static struct ibv_dm *alloc_dm(struct ibv_context *context, struct ibv_alloc_dm_attr *attr) { errno = EOPNOTSUPP; return NULL; } static struct ibv_mw *alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type) { errno = EOPNOTSUPP; return NULL; } static struct ibv_mr *alloc_null_mr(struct ibv_pd *pd) { errno = EOPNOTSUPP; return NULL; } static struct ibv_pd * alloc_parent_domain(struct ibv_context *context, struct ibv_parent_domain_init_attr *attr) { errno = EOPNOTSUPP; return NULL; } static struct ibv_pd *alloc_pd(struct ibv_context *context) { errno = EOPNOTSUPP; return NULL; } static struct ibv_td *alloc_td(struct ibv_context *context, struct ibv_td_init_attr *init_attr) { errno = EOPNOTSUPP; return NULL; } static void async_event(struct ibv_context *context, struct ibv_async_event *event) { } static int attach_counters_point_flow(struct ibv_counters *counters, struct ibv_counter_attach_attr *attr, struct ibv_flow *flow) { return EOPNOTSUPP; } static int attach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid) { return EOPNOTSUPP; } static int bind_mw(struct ibv_qp *qp, struct ibv_mw *mw, struct ibv_mw_bind *mw_bind) { return EOPNOTSUPP; } static int close_xrcd(struct ibv_xrcd *xrcd) { return EOPNOTSUPP; } static void cq_event(struct ibv_cq *cq) { } static struct ibv_ah *create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr) { errno = EOPNOTSUPP; return NULL; } static struct ibv_counters *create_counters(struct ibv_context *context, struct ibv_counters_init_attr *init_attr) { errno = EOPNOTSUPP; return NULL; } static struct ibv_cq *create_cq(struct 
ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector) { errno = EOPNOTSUPP; return NULL; } static struct ibv_cq_ex *create_cq_ex(struct ibv_context *context, struct ibv_cq_init_attr_ex *init_attr) { errno = EOPNOTSUPP; return NULL; } static struct ibv_flow *create_flow(struct ibv_qp *qp, struct ibv_flow_attr *flow_attr) { errno = EOPNOTSUPP; return NULL; } static struct ibv_flow_action *create_flow_action_esp(struct ibv_context *context, struct ibv_flow_action_esp_attr *attr) { errno = EOPNOTSUPP; return NULL; } static struct ibv_qp *create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { errno = EOPNOTSUPP; return NULL; } static struct ibv_qp *create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_init_attr_ex) { errno = EOPNOTSUPP; return NULL; } static struct ibv_rwq_ind_table * create_rwq_ind_table(struct ibv_context *context, struct ibv_rwq_ind_table_init_attr *init_attr) { errno = EOPNOTSUPP; return NULL; } static struct ibv_srq *create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *srq_init_attr) { errno = EOPNOTSUPP; return NULL; } static struct ibv_srq * create_srq_ex(struct ibv_context *context, struct ibv_srq_init_attr_ex *srq_init_attr_ex) { errno = EOPNOTSUPP; return NULL; } static struct ibv_wq *create_wq(struct ibv_context *context, struct ibv_wq_init_attr *wq_init_attr) { errno = EOPNOTSUPP; return NULL; } static int dealloc_mw(struct ibv_mw *mw) { return EOPNOTSUPP; } static int dealloc_pd(struct ibv_pd *pd) { return EOPNOTSUPP; } static int dealloc_td(struct ibv_td *td) { return EOPNOTSUPP; } static int dereg_mr(struct verbs_mr *vmr) { return EOPNOTSUPP; } static int destroy_ah(struct ibv_ah *ah) { return EOPNOTSUPP; } static int destroy_counters(struct ibv_counters *counters) { return EOPNOTSUPP; } static int destroy_cq(struct ibv_cq *cq) { return EOPNOTSUPP; } static int destroy_flow(struct ibv_flow *flow) { return EOPNOTSUPP; } static int destroy_flow_action(struct ibv_flow_action *action) { return EOPNOTSUPP; } static int destroy_qp(struct ibv_qp *qp) { return EOPNOTSUPP; } static int destroy_rwq_ind_table(struct ibv_rwq_ind_table *rwq_ind_table) { return EOPNOTSUPP; } static int destroy_srq(struct ibv_srq *srq) { return EOPNOTSUPP; } static int destroy_wq(struct ibv_wq *wq) { return EOPNOTSUPP; } static int detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid) { return EOPNOTSUPP; } static void free_context(struct ibv_context *ctx) { return; } static int free_dm(struct ibv_dm *dm) { return EOPNOTSUPP; } static int get_srq_num(struct ibv_srq *srq, uint32_t *srq_num) { return EOPNOTSUPP; } static struct ibv_dm *import_dm(struct ibv_context *context, uint32_t dm_handle) { errno = EOPNOTSUPP; return NULL; } static struct ibv_mr *import_mr(struct ibv_pd *pd, uint32_t mr_handle) { errno = EOPNOTSUPP; return NULL; } static struct ibv_pd *import_pd(struct ibv_context *context, uint32_t pd_handle) { errno = EOPNOTSUPP; return NULL; } static int modify_cq(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr) { return EOPNOTSUPP; } static int modify_flow_action_esp(struct ibv_flow_action *action, struct ibv_flow_action_esp_attr *attr) { return EOPNOTSUPP; } static int modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask) { return EOPNOTSUPP; } static int modify_qp_rate_limit(struct ibv_qp *qp, struct ibv_qp_rate_limit_attr *attr) { return EOPNOTSUPP; } static int modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, int srq_attr_mask) { return EOPNOTSUPP; } static int modify_wq(struct 
ibv_wq *wq, struct ibv_wq_attr *wq_attr) { return EOPNOTSUPP; } static struct ibv_qp *open_qp(struct ibv_context *context, struct ibv_qp_open_attr *attr) { errno = EOPNOTSUPP; return NULL; } static struct ibv_xrcd *open_xrcd(struct ibv_context *context, struct ibv_xrcd_init_attr *xrcd_init_attr) { errno = EOPNOTSUPP; return NULL; } static int poll_cq(struct ibv_cq *cq, int num_entries, struct ibv_wc *wc) { return EOPNOTSUPP; } static int post_recv(struct ibv_qp *qp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { return EOPNOTSUPP; } static int post_send(struct ibv_qp *qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { return EOPNOTSUPP; } static int post_srq_ops(struct ibv_srq *srq, struct ibv_ops_wr *op, struct ibv_ops_wr **bad_op) { return EOPNOTSUPP; } static int post_srq_recv(struct ibv_srq *srq, struct ibv_recv_wr *recv_wr, struct ibv_recv_wr **bad_recv_wr) { return EOPNOTSUPP; } static int query_device_ex(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { return EOPNOTSUPP; } static int query_ece(struct ibv_qp *qp, struct ibv_ece *ece) { return EOPNOTSUPP; } static int query_qp_data_in_order(struct ibv_qp *qp, enum ibv_wr_opcode op, uint32_t flags) { return 0; } static int query_port(struct ibv_context *context, uint8_t port_num, struct ibv_port_attr *port_attr) { return EOPNOTSUPP; } static int query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { return EOPNOTSUPP; } static int query_rt_values(struct ibv_context *context, struct ibv_values_ex *values) { return EOPNOTSUPP; } static int query_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr) { return EOPNOTSUPP; } static int read_counters(struct ibv_counters *counters, uint64_t *counters_value, uint32_t ncounters, uint32_t flags) { return EOPNOTSUPP; } static struct ibv_mr *reg_dm_mr(struct ibv_pd *pd, struct ibv_dm *dm, uint64_t dm_offset, size_t length, unsigned int access) { errno = EOPNOTSUPP; return NULL; } static struct ibv_mr *reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access) { errno = EOPNOTSUPP; return NULL; } static struct ibv_mr *reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access) { errno = EOPNOTSUPP; return NULL; } static int req_notify_cq(struct ibv_cq *cq, int solicited_only) { return EOPNOTSUPP; } static int rereg_mr(struct verbs_mr *vmr, int flags, struct ibv_pd *pd, void *addr, size_t length, int access) { errno = EOPNOTSUPP; return IBV_REREG_MR_ERR_INPUT; } static int resize_cq(struct ibv_cq *cq, int cqe) { return EOPNOTSUPP; } static int set_ece(struct ibv_qp *qp, struct ibv_ece *ece) { return EOPNOTSUPP; } static void unimport_dm(struct ibv_dm *dm) { } static void unimport_mr(struct ibv_mr *mr) { } static void unimport_pd(struct ibv_pd *pd) { } /* * Ops in verbs_dummy_ops simply return an EOPNOTSUPP error code when called, or * do nothing. They are placed in the ops structures if the provider does not * provide an op for the function. * * NOTE: This deliberately does not use named initializers to trigger a * '-Wmissing-field-initializers' warning if the struct is changed without * changing this. * * Keep sorted. 
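 *
 * The visible effect is that calling an op the provider did not implement
 * fails cleanly; e.g. (illustrative):
 *
 *	struct ibv_td *td = ibv_alloc_td(ctx, &td_attr);
 *
 *	if (!td && errno == EOPNOTSUPP)
 *		;	/* provider has no thread domain support */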
*/ const struct verbs_context_ops verbs_dummy_ops = { advise_mr, alloc_dm, alloc_mw, alloc_null_mr, alloc_parent_domain, alloc_pd, alloc_td, async_event, attach_counters_point_flow, attach_mcast, bind_mw, close_xrcd, cq_event, create_ah, create_counters, create_cq, create_cq_ex, create_flow, create_flow_action_esp, create_qp, create_qp_ex, create_rwq_ind_table, create_srq, create_srq_ex, create_wq, dealloc_mw, dealloc_pd, dealloc_td, dereg_mr, destroy_ah, destroy_counters, destroy_cq, destroy_flow, destroy_flow_action, destroy_qp, destroy_rwq_ind_table, destroy_srq, destroy_wq, detach_mcast, free_context, free_dm, get_srq_num, import_dm, import_mr, import_pd, modify_cq, modify_flow_action_esp, modify_qp, modify_qp_rate_limit, modify_srq, modify_wq, open_qp, open_xrcd, poll_cq, post_recv, post_send, post_srq_ops, post_srq_recv, query_device_ex, query_ece, query_port, query_qp, query_qp_data_in_order, query_rt_values, query_srq, read_counters, reg_dm_mr, reg_dmabuf_mr, reg_mr, req_notify_cq, rereg_mr, resize_cq, set_ece, unimport_dm, unimport_mr, unimport_pd, }; /* * Set the ops in a context. If the function pointer in op is NULL then it is * not set. This allows the providers to call the function multiple times in * order to have variations of the ops for different HW configurations. */ void verbs_set_ops(struct verbs_context *vctx, const struct verbs_context_ops *ops) { struct verbs_ex_private *priv = vctx->priv; struct ibv_context_ops *ctx = &vctx->context.ops; /* * We retain the function pointer for now, just as 'just-in-case' ABI * compatibility. If any ever get changed incompatibly they should be * set to NULL instead. */ #define SET_PRIV_OP(ptr, name) \ do { \ if (ops->name) { \ priv->ops.name = ops->name; \ (ptr)->_compat_##name = (void *)ops->name; \ } \ } while (0) /* Same as SET_PRIV_OP but without the compatibility pointer */ #define SET_PRIV_OP_IC(ptr, name) \ do { \ if (ops->name) \ priv->ops.name = ops->name; \ } while (0) #define SET_OP(ptr, name) \ do { \ if (ops->name) { \ priv->ops.name = ops->name; \ (ptr)->name = ops->name; \ } \ } while (0) #define SET_OP2(ptr, iname, name) \ do { \ if (ops->name) { \ priv->ops.name = ops->name; \ (ptr)->iname = ops->name; \ } \ } while (0) SET_OP(vctx, advise_mr); SET_OP(vctx, alloc_dm); SET_OP(ctx, alloc_mw); SET_OP(vctx, alloc_null_mr); SET_PRIV_OP(ctx, alloc_pd); SET_OP(vctx, alloc_parent_domain); SET_OP(vctx, alloc_td); SET_OP(vctx, attach_counters_point_flow); SET_OP(vctx, create_counters); SET_PRIV_OP(ctx, async_event); SET_PRIV_OP(ctx, attach_mcast); SET_OP(ctx, bind_mw); SET_OP(vctx, close_xrcd); SET_PRIV_OP(ctx, cq_event); SET_PRIV_OP(ctx, create_ah); SET_PRIV_OP(ctx, create_cq); SET_PRIV_OP_IC(vctx, create_cq_ex); SET_OP2(vctx, ibv_create_flow, create_flow); SET_OP(vctx, create_flow_action_esp); SET_PRIV_OP(ctx, create_qp); SET_OP(vctx, create_qp_ex); SET_OP(vctx, create_rwq_ind_table); SET_PRIV_OP(ctx, create_srq); SET_OP(vctx, create_srq_ex); SET_OP(vctx, create_wq); SET_OP(ctx, dealloc_mw); SET_PRIV_OP(ctx, dealloc_pd); SET_OP(vctx, dealloc_td); SET_OP(vctx, destroy_counters); SET_PRIV_OP(ctx, dereg_mr); SET_PRIV_OP(ctx, destroy_ah); SET_PRIV_OP(ctx, destroy_cq); SET_OP2(vctx, ibv_destroy_flow, destroy_flow); SET_OP(vctx, destroy_flow_action); SET_PRIV_OP(ctx, destroy_qp); SET_OP(vctx, destroy_rwq_ind_table); SET_PRIV_OP(ctx, destroy_srq); SET_OP(vctx, destroy_wq); SET_PRIV_OP(ctx, detach_mcast); SET_PRIV_OP_IC(ctx, free_context); SET_OP(vctx, free_dm); SET_OP(vctx, get_srq_num); SET_PRIV_OP_IC(vctx, import_dm); 
SET_PRIV_OP_IC(vctx, import_mr); SET_PRIV_OP_IC(vctx, import_pd); SET_OP(vctx, modify_cq); SET_OP(vctx, modify_flow_action_esp); SET_PRIV_OP(ctx, modify_qp); SET_OP(vctx, modify_qp_rate_limit); SET_PRIV_OP(ctx, modify_srq); SET_OP(vctx, modify_wq); SET_OP(vctx, open_qp); SET_OP(vctx, open_xrcd); SET_OP(ctx, poll_cq); SET_OP(ctx, post_recv); SET_OP(ctx, post_send); SET_OP(vctx, post_srq_ops); SET_OP(ctx, post_srq_recv); SET_OP(vctx, query_device_ex); SET_PRIV_OP_IC(vctx, query_ece); SET_PRIV_OP_IC(ctx, query_port); SET_PRIV_OP(ctx, query_qp); SET_PRIV_OP_IC(ctx, query_qp_data_in_order); SET_OP(vctx, query_rt_values); SET_OP(vctx, read_counters); SET_PRIV_OP(ctx, query_srq); SET_OP(vctx, reg_dm_mr); SET_PRIV_OP_IC(vctx, reg_dmabuf_mr); SET_PRIV_OP(ctx, reg_mr); SET_OP(ctx, req_notify_cq); SET_PRIV_OP(ctx, rereg_mr); SET_PRIV_OP(ctx, resize_cq); SET_PRIV_OP_IC(vctx, set_ece); SET_PRIV_OP_IC(vctx, unimport_dm); SET_PRIV_OP_IC(vctx, unimport_mr); SET_PRIV_OP_IC(vctx, unimport_pd); #undef SET_OP #undef SET_OP2 } rdma-core-56.1/libibverbs/dynamic_driver.c000066400000000000000000000133271477342711600206260ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved. * Copyright (c) 2006 Cisco Systems, Inc. All rights reserved. * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef _STATIC_LIBRARY_BUILD_ #define _GNU_SOURCE #include #include #include #include #include #include #include #include "ibverbs.h" struct ibv_driver_name { struct list_node entry; char *name; }; static LIST_HEAD(driver_name_list); static void read_config_file(const char *path) { FILE *conf; char *line = NULL; char *config; char *field; size_t buflen = 0; ssize_t len; conf = fopen(path, "r" STREAM_CLOEXEC); if (!conf) { fprintf(stderr, PFX "Warning: couldn't read config file %s.\n", path); return; } while ((len = getline(&line, &buflen, conf)) != -1) { config = line + strspn(line, "\t "); if (config[0] == '\n' || config[0] == '#') continue; field = strsep(&config, "\n\t "); if (strcmp(field, "driver") == 0 && config != NULL) { struct ibv_driver_name *driver_name; config += strspn(config, "\t "); field = strsep(&config, "\n\t "); driver_name = malloc(sizeof(*driver_name)); if (!driver_name) { fprintf(stderr, PFX "Warning: couldn't allocate driver name '%s'.\n", field); continue; } driver_name->name = strdup(field); if (!driver_name->name) { fprintf(stderr, PFX "Warning: couldn't allocate driver name '%s'.\n", field); free(driver_name); continue; } list_add(&driver_name_list, &driver_name->entry); } else fprintf(stderr, PFX "Warning: ignoring bad config directive '%s' in file '%s'.\n", field, path); } if (line) free(line); fclose(conf); } static void read_config(void) { DIR *conf_dir; struct dirent *dent; char *path; conf_dir = opendir(IBV_CONFIG_DIR); if (!conf_dir) { fprintf(stderr, PFX "Warning: couldn't open config directory '%s'.\n", IBV_CONFIG_DIR); return; } while ((dent = readdir(conf_dir))) { struct stat buf; if (asprintf(&path, "%s/%s", IBV_CONFIG_DIR, dent->d_name) < 0) { fprintf(stderr, PFX "Warning: couldn't read config file %s/%s.\n", IBV_CONFIG_DIR, dent->d_name); goto out; } if (stat(path, &buf)) { fprintf(stderr, PFX "Warning: couldn't stat config file '%s'.\n", path); goto next; } if (!S_ISREG(buf.st_mode)) goto next; read_config_file(path); next: free(path); } out: closedir(conf_dir); } static void load_driver(const char *name) { char *so_name; void *dlhandle; /* If the name is an absolute path then open that path after appending * the trailer suffix */ if (name[0] == '/') { if (asprintf(&so_name, "%s" VERBS_PROVIDER_SUFFIX, name) < 0) goto out_asprintf; dlhandle = dlopen(so_name, RTLD_NOW); if (!dlhandle) goto out_dlopen; free(so_name); return; } /* If configured with a provider plugin path then try that next */ if (sizeof(VERBS_PROVIDER_DIR) > 1) { if (asprintf(&so_name, VERBS_PROVIDER_DIR "/lib%s" VERBS_PROVIDER_SUFFIX, name) < 0) goto out_asprintf; dlhandle = dlopen(so_name, RTLD_NOW); free(so_name); if (dlhandle) return; } /* Otherwise use the system library search path. This is the historical * behavior of libibverbs */ if (asprintf(&so_name, "lib%s" VERBS_PROVIDER_SUFFIX, name) < 0) goto out_asprintf; dlhandle = dlopen(so_name, RTLD_NOW); if (!dlhandle) goto out_dlopen; free(so_name); return; out_asprintf: fprintf(stderr, PFX "Warning: couldn't load driver '%s'.\n", name); return; out_dlopen: fprintf(stderr, PFX "Warning: couldn't load driver '%s': %s\n", so_name, dlerror()); free(so_name); } void load_drivers(void) { struct ibv_driver_name *name, *next_name; const char *env; char *list, *env_name; read_config(); /* Only use drivers passed in through the calling user's environment * if we're not running setuid. 
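 *
 * For example, RDMAV_DRIVERS=mlx5:custom1 would make load_driver() try
 * the provider names "mlx5" and "custom1" in turn ("custom1" being an
 * illustrative name; entries may be separated by ':' or ';').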
*/ if (getuid() == geteuid()) { if ((env = getenv("RDMAV_DRIVERS"))) { list = strdupa(env); while ((env_name = strsep(&list, ":;"))) load_driver(env_name); } else if ((env = getenv("IBV_DRIVERS"))) { list = strdupa(env); while ((env_name = strsep(&list, ":;"))) load_driver(env_name); } } list_for_each_safe (&driver_name_list, name, next_name, entry) { load_driver(name->name); free(name->name); free(name); } } #endif rdma-core-56.1/libibverbs/enum_strs.c000066400000000000000000000135451477342711600176500ustar00rootroot00000000000000/* * Copyright (c) 2008 Lawrence Livermore National Laboratory * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include const char *ibv_node_type_str(enum ibv_node_type node_type) { static const char *const node_type_str[] = { [IBV_NODE_CA] = "InfiniBand channel adapter", [IBV_NODE_SWITCH] = "InfiniBand switch", [IBV_NODE_ROUTER] = "InfiniBand router", [IBV_NODE_RNIC] = "iWARP NIC", [IBV_NODE_USNIC] = "usNIC", [IBV_NODE_USNIC_UDP] = "usNIC UDP", [IBV_NODE_UNSPECIFIED] = "unspecified", }; if (node_type < IBV_NODE_CA || node_type > IBV_NODE_UNSPECIFIED) return "unknown"; return node_type_str[node_type]; } const char *ibv_port_state_str(enum ibv_port_state port_state) { static const char *const port_state_str[] = { [IBV_PORT_NOP] = "no state change (NOP)", [IBV_PORT_DOWN] = "down", [IBV_PORT_INIT] = "init", [IBV_PORT_ARMED] = "armed", [IBV_PORT_ACTIVE] = "active", [IBV_PORT_ACTIVE_DEFER] = "active defer" }; if (port_state < IBV_PORT_NOP || port_state > IBV_PORT_ACTIVE_DEFER) return "unknown"; return port_state_str[port_state]; } const char *ibv_event_type_str(enum ibv_event_type event) { static const char *const event_type_str[] = { [IBV_EVENT_CQ_ERR] = "CQ error", [IBV_EVENT_QP_FATAL] = "local work queue catastrophic error", [IBV_EVENT_QP_REQ_ERR] = "invalid request local work queue error", [IBV_EVENT_QP_ACCESS_ERR] = "local access violation work queue error", [IBV_EVENT_COMM_EST] = "communication established", [IBV_EVENT_SQ_DRAINED] = "send queue drained", [IBV_EVENT_PATH_MIG] = "path migrated", [IBV_EVENT_PATH_MIG_ERR] = "path migration request error", [IBV_EVENT_DEVICE_FATAL] = "local catastrophic error", [IBV_EVENT_PORT_ACTIVE] = "port active", [IBV_EVENT_PORT_ERR] = "port error", [IBV_EVENT_LID_CHANGE] = "LID change", [IBV_EVENT_PKEY_CHANGE] = "P_Key change", [IBV_EVENT_SM_CHANGE] = "SM change", [IBV_EVENT_SRQ_ERR] = "SRQ catastrophic error", [IBV_EVENT_SRQ_LIMIT_REACHED] = "SRQ limit reached", [IBV_EVENT_QP_LAST_WQE_REACHED] = "last WQE reached", [IBV_EVENT_CLIENT_REREGISTER] = "client reregistration", [IBV_EVENT_GID_CHANGE] = "GID table change", [IBV_EVENT_WQ_FATAL] = "WQ fatal" }; if (event < IBV_EVENT_CQ_ERR || event > IBV_EVENT_WQ_FATAL) return "unknown"; return event_type_str[event]; } const char *ibv_wc_status_str(enum ibv_wc_status status) { static const char *const wc_status_str[] = { [IBV_WC_SUCCESS] = "success", [IBV_WC_LOC_LEN_ERR] = "local length error", [IBV_WC_LOC_QP_OP_ERR] = "local QP operation error", [IBV_WC_LOC_EEC_OP_ERR] = "local EE context operation error", [IBV_WC_LOC_PROT_ERR] = "local protection error", [IBV_WC_WR_FLUSH_ERR] = "Work Request Flushed Error", [IBV_WC_MW_BIND_ERR] = "memory management operation error", [IBV_WC_BAD_RESP_ERR] = "bad response error", [IBV_WC_LOC_ACCESS_ERR] = "local access error", [IBV_WC_REM_INV_REQ_ERR] = "remote invalid request error", [IBV_WC_REM_ACCESS_ERR] = "remote access error", [IBV_WC_REM_OP_ERR] = "remote operation error", [IBV_WC_RETRY_EXC_ERR] = "transport retry counter exceeded", [IBV_WC_RNR_RETRY_EXC_ERR] = "RNR retry counter exceeded", [IBV_WC_LOC_RDD_VIOL_ERR] = "local RDD violation error", [IBV_WC_REM_INV_RD_REQ_ERR] = "remote invalid RD request", [IBV_WC_REM_ABORT_ERR] = "aborted error", [IBV_WC_INV_EECN_ERR] = "invalid EE context number", [IBV_WC_INV_EEC_STATE_ERR] = "invalid EE context state", [IBV_WC_FATAL_ERR] = "fatal error", [IBV_WC_RESP_TIMEOUT_ERR] = "response timeout error", [IBV_WC_GENERAL_ERR] = "general error", [IBV_WC_TM_ERR] = "TM error", [IBV_WC_TM_RNDV_INCOMPLETE] = "TM software rendezvous", }; if (status < IBV_WC_SUCCESS || status > IBV_WC_TM_RNDV_INCOMPLETE) return "unknown"; return 
wc_status_str[status]; } const char *ibv_wr_opcode_str(enum ibv_wr_opcode opcode) { static const char *const wr_opcode_str[] = { [IBV_WR_RDMA_WRITE] = "rdma-write", [IBV_WR_RDMA_WRITE_WITH_IMM] = "rdma-write-with-imm", [IBV_WR_SEND] = "send", [IBV_WR_SEND_WITH_IMM] = "send-with-imm", [IBV_WR_RDMA_READ] = "rdma-read", [IBV_WR_ATOMIC_CMP_AND_SWP] = "atomic-cmp-and-swp", [IBV_WR_ATOMIC_FETCH_AND_ADD] = "atomic-fetch-and-add", [IBV_WR_LOCAL_INV] = "local-inv", [IBV_WR_BIND_MW] = "bind-mw", [IBV_WR_SEND_WITH_INV] = "send-with-inv", [IBV_WR_TSO] = "tso", [IBV_WR_DRIVER1] = "driver1", [IBV_WR_FLUSH] = "flush", [IBV_WR_ATOMIC_WRITE] = "atomic-write" }; if (opcode < IBV_WR_RDMA_WRITE || opcode > IBV_WR_ATOMIC_WRITE) return "unknown"; return wr_opcode_str[opcode]; } rdma-core-56.1/libibverbs/examples/000077500000000000000000000000001477342711600172735ustar00rootroot00000000000000rdma-core-56.1/libibverbs/examples/CMakeLists.txt000066400000000000000000000021161477342711600220330ustar00rootroot00000000000000# Shared example files add_library(ibverbs_tools STATIC pingpong.c ) if(HAVE_WCAST_ALIGN_STRICT) target_compile_options(ibverbs_tools PRIVATE "-Wcast-align=strict") endif() rdma_executable(ibv_asyncwatch asyncwatch.c) target_link_libraries(ibv_asyncwatch LINK_PRIVATE ibverbs) rdma_executable(ibv_devices device_list.c) target_link_libraries(ibv_devices LINK_PRIVATE ibverbs) rdma_executable(ibv_devinfo devinfo.c) target_link_libraries(ibv_devinfo LINK_PRIVATE ibverbs) rdma_executable(ibv_rc_pingpong rc_pingpong.c) target_link_libraries(ibv_rc_pingpong LINK_PRIVATE ibverbs ibverbs_tools) rdma_executable(ibv_srq_pingpong srq_pingpong.c) target_link_libraries(ibv_srq_pingpong LINK_PRIVATE ibverbs ibverbs_tools) rdma_executable(ibv_uc_pingpong uc_pingpong.c) target_link_libraries(ibv_uc_pingpong LINK_PRIVATE ibverbs ibverbs_tools) rdma_executable(ibv_ud_pingpong ud_pingpong.c) target_link_libraries(ibv_ud_pingpong LINK_PRIVATE ibverbs ibverbs_tools) rdma_executable(ibv_xsrq_pingpong xsrq_pingpong.c) target_link_libraries(ibv_xsrq_pingpong LINK_PRIVATE ibverbs ibverbs_tools) rdma-core-56.1/libibverbs/examples/asyncwatch.c000066400000000000000000000106221477342711600216040ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include static const char *event_name_str(enum ibv_event_type event_type) { switch (event_type) { case IBV_EVENT_DEVICE_FATAL: return "IBV_EVENT_DEVICE_FATAL"; case IBV_EVENT_PORT_ACTIVE: return "IBV_EVENT_PORT_ACTIVE"; case IBV_EVENT_PORT_ERR: return "IBV_EVENT_PORT_ERR"; case IBV_EVENT_LID_CHANGE: return "IBV_EVENT_LID_CHANGE"; case IBV_EVENT_PKEY_CHANGE: return "IBV_EVENT_PKEY_CHANGE"; case IBV_EVENT_SM_CHANGE: return "IBV_EVENT_SM_CHANGE"; case IBV_EVENT_CLIENT_REREGISTER: return "IBV_EVENT_CLIENT_REREGISTER"; case IBV_EVENT_GID_CHANGE: return "IBV_EVENT_GID_CHANGE"; case IBV_EVENT_CQ_ERR: case IBV_EVENT_QP_FATAL: case IBV_EVENT_QP_REQ_ERR: case IBV_EVENT_QP_ACCESS_ERR: case IBV_EVENT_COMM_EST: case IBV_EVENT_SQ_DRAINED: case IBV_EVENT_PATH_MIG: case IBV_EVENT_PATH_MIG_ERR: case IBV_EVENT_SRQ_ERR: case IBV_EVENT_SRQ_LIMIT_REACHED: case IBV_EVENT_QP_LAST_WQE_REACHED: default: return "unexpected"; } } static void usage(const char *argv0) { printf("Usage:\n"); printf(" %s start an asyncwatch process\n", argv0); printf("\n"); printf("Options:\n"); printf(" -d, --ib-dev= use IB device (default first device found)\n"); printf(" -h, --help print a help text and exit\n"); } int main(int argc, char *argv[]) { struct ibv_device **dev_list; struct ibv_context *context; struct ibv_async_event event; char *ib_devname = NULL; int i = 0; /* Force line-buffering in case stdout is redirected */ setvbuf(stdout, NULL, _IOLBF, 0); while (1) { int ret = 1; int c; static struct option long_options[] = { { .name = "ib-dev", .has_arg = 1, .val = 'd' }, { .name = "help", .has_arg = 0, .val = 'h' }, {} }; c = getopt_long(argc, argv, "d:h", long_options, NULL); if (c == -1) break; switch (c) { case 'd': ib_devname = strdupa(optarg); break; case 'h': ret = 0; SWITCH_FALLTHROUGH; default: usage(argv[0]); return ret; } } dev_list = ibv_get_device_list(NULL); if (!dev_list) { perror("Failed to get IB devices list"); return 1; } if (ib_devname) { for (; dev_list[i]; ++i) { if (!strcmp(ibv_get_device_name(dev_list[i]), ib_devname)) break; } } if (!dev_list[i]) { fprintf(stderr, "IB device %s not found\n", ib_devname ? ib_devname : ""); return 1; } context = ibv_open_device(dev_list[i]); if (!context) { fprintf(stderr, "Couldn't get context for %s\n", ibv_get_device_name(dev_list[i])); return 1; } printf("%s: async event FD %d\n", ibv_get_device_name(dev_list[i]), context->async_fd); while (1) { if (ibv_get_async_event(context, &event)) return 1; printf(" event_type %s (%d), port %d\n", event_name_str(event.event_type), event.event_type, event.element.port_num); ibv_ack_async_event(&event); } return 0; } rdma-core-56.1/libibverbs/examples/device_list.c000066400000000000000000000040401477342711600217270ustar00rootroot00000000000000/* * Copyright (c) 2004 Topspin Communications. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include int main(int argc, char *argv[]) { struct ibv_device **dev_list; int num_devices, i; dev_list = ibv_get_device_list(&num_devices); if (!dev_list) { perror("Failed to get IB devices list"); return 1; } printf(" %-16s\t node GUID\n", "device"); printf(" %-16s\t----------------\n", "------"); for (i = 0; i < num_devices; ++i) { printf(" %-16s\t%016llx\n", ibv_get_device_name(dev_list[i]), (unsigned long long) be64toh(ibv_get_device_guid(dev_list[i]))); } ibv_free_device_list(dev_list); return 0; } rdma-core-56.1/libibverbs/examples/devinfo.c000066400000000000000000000635351477342711600211050ustar00rootroot00000000000000/* * Copyright (c) 2005 Cisco Systems. All rights reserved. * Copyright (c) 2005 Mellanox Technologies Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include #include #include static int verbose; static int null_gid(union ibv_gid *gid) { return !(gid->raw[8] | gid->raw[9] | gid->raw[10] | gid->raw[11] | gid->raw[12] | gid->raw[13] | gid->raw[14] | gid->raw[15]); } static const char *guid_str(__be64 _node_guid, char *str) { uint64_t node_guid = be64toh(_node_guid); sprintf(str, "%04x:%04x:%04x:%04x", (unsigned) (node_guid >> 48) & 0xffff, (unsigned) (node_guid >> 32) & 0xffff, (unsigned) (node_guid >> 16) & 0xffff, (unsigned) (node_guid >> 0) & 0xffff); return str; } static const char *transport_str(enum ibv_transport_type transport) { switch (transport) { case IBV_TRANSPORT_IB: return "InfiniBand"; case IBV_TRANSPORT_IWARP: return "iWARP"; case IBV_TRANSPORT_USNIC: return "usNIC"; case IBV_TRANSPORT_USNIC_UDP: return "usNIC UDP"; case IBV_TRANSPORT_UNSPECIFIED: return "unspecified"; default: return "invalid transport"; } } static const char *port_state_str(enum ibv_port_state pstate) { switch (pstate) { case IBV_PORT_DOWN: return "PORT_DOWN"; case IBV_PORT_INIT: return "PORT_INIT"; case IBV_PORT_ARMED: return "PORT_ARMED"; case IBV_PORT_ACTIVE: return "PORT_ACTIVE"; default: return "invalid state"; } } static const char *port_phy_state_str(uint8_t phys_state) { switch (phys_state) { case 1: return "SLEEP"; case 2: return "POLLING"; case 3: return "DISABLED"; case 4: return "PORT_CONFIGURATION TRAINNING"; case 5: return "LINK_UP"; case 6: return "LINK_ERROR_RECOVERY"; case 7: return "PHY TEST"; default: return "invalid physical state"; } } static const char *atomic_cap_str(enum ibv_atomic_cap atom_cap) { switch (atom_cap) { case IBV_ATOMIC_NONE: return "ATOMIC_NONE"; case IBV_ATOMIC_HCA: return "ATOMIC_HCA"; case IBV_ATOMIC_GLOB: return "ATOMIC_GLOB"; default: return "invalid atomic capability"; } } static const char *mtu_str(enum ibv_mtu max_mtu) { switch (max_mtu) { case IBV_MTU_256: return "256"; case IBV_MTU_512: return "512"; case IBV_MTU_1024: return "1024"; case IBV_MTU_2048: return "2048"; case IBV_MTU_4096: return "4096"; default: return "invalid MTU"; } } static const char *width_str(uint8_t width) { switch (width) { case 1: return "1"; case 2: return "4"; case 4: return "8"; case 8: return "12"; case 16: return "2"; default: return "invalid width"; } } static const char *speed_str(uint32_t speed) { switch (speed) { case 1: return "2.5 Gbps"; case 2: return "5.0 Gbps"; case 4: /* fall through */ case 8: return "10.0 Gbps"; case 16: return "14.0 Gbps"; case 32: return "25.0 Gbps"; case 64: return "50.0 Gbps"; case 128: return "100.0 Gbps"; case 256: return "200.0 Gbps"; default: return "invalid speed"; } } static const char *vl_str(uint8_t vl_num) { switch (vl_num) { case 1: return "1"; case 2: return "2"; case 3: return "4"; case 4: return "8"; case 5: return "15"; default: return "invalid value"; } } #define DEVINFO_INVALID_GID_TYPE 2 static const char *gid_type_str(enum ibv_gid_type_sysfs type) { switch (type) { case IBV_GID_TYPE_SYSFS_IB_ROCE_V1: return "RoCE v1"; case IBV_GID_TYPE_SYSFS_ROCE_V2: return "RoCE v2"; default: return "Invalid gid type"; } } static void print_formated_gid(union ibv_gid *gid, int i, enum ibv_gid_type_sysfs type, int ll) { char gid_str[INET6_ADDRSTRLEN] = {}; char str[20] = {}; if (ll == IBV_LINK_LAYER_ETHERNET) sprintf(str, ", %s", gid_type_str(type)); if (type == IBV_GID_TYPE_SYSFS_IB_ROCE_V1) printf("\t\t\tGID[%3d]:\t\t%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x:%02x%02x%s\n", i, gid->raw[0], 
gid->raw[1], gid->raw[2], gid->raw[3], gid->raw[4], gid->raw[5], gid->raw[6], gid->raw[7], gid->raw[8], gid->raw[9], gid->raw[10], gid->raw[11], gid->raw[12], gid->raw[13], gid->raw[14], gid->raw[15], str); if (type == IBV_GID_TYPE_SYSFS_ROCE_V2) { inet_ntop(AF_INET6, gid->raw, gid_str, sizeof(gid_str)); printf("\t\t\tGID[%3d]:\t\t%s%s\n", i, gid_str, str); } } static int print_all_port_gids(struct ibv_context *ctx, struct ibv_port_attr *port_attr, uint32_t port_num) { enum ibv_gid_type_sysfs type; union ibv_gid gid; int tbl_len; int rc = 0; int i; tbl_len = port_attr->gid_tbl_len; for (i = 0; i < tbl_len; i++) { rc = ibv_query_gid(ctx, port_num, i, &gid); if (rc) { fprintf(stderr, "Failed to query gid to port %u, index %d\n", port_num, i); return rc; } rc = ibv_query_gid_type(ctx, port_num, i, &type); if (rc) { rc = 0; type = DEVINFO_INVALID_GID_TYPE; } if (!null_gid(&gid)) print_formated_gid(&gid, i, type, port_attr->link_layer); } return rc; } static const char *link_layer_str(uint8_t link_layer) { switch (link_layer) { case IBV_LINK_LAYER_UNSPECIFIED: return "Unspecified"; case IBV_LINK_LAYER_INFINIBAND: return "InfiniBand"; case IBV_LINK_LAYER_ETHERNET: return "Ethernet"; default: return "Unknown"; } } static void print_device_cap_flags(uint32_t dev_cap_flags) { uint32_t unknown_flags = ~(IBV_DEVICE_RESIZE_MAX_WR | IBV_DEVICE_BAD_PKEY_CNTR | IBV_DEVICE_BAD_QKEY_CNTR | IBV_DEVICE_RAW_MULTI | IBV_DEVICE_AUTO_PATH_MIG | IBV_DEVICE_CHANGE_PHY_PORT | IBV_DEVICE_UD_AV_PORT_ENFORCE | IBV_DEVICE_CURR_QP_STATE_MOD | IBV_DEVICE_SHUTDOWN_PORT | IBV_DEVICE_INIT_TYPE | IBV_DEVICE_PORT_ACTIVE_EVENT | IBV_DEVICE_SYS_IMAGE_GUID | IBV_DEVICE_RC_RNR_NAK_GEN | IBV_DEVICE_SRQ_RESIZE | IBV_DEVICE_N_NOTIFY_CQ | IBV_DEVICE_MEM_WINDOW | IBV_DEVICE_UD_IP_CSUM | IBV_DEVICE_XRC | IBV_DEVICE_MEM_MGT_EXTENSIONS | IBV_DEVICE_MEM_WINDOW_TYPE_2A | IBV_DEVICE_MEM_WINDOW_TYPE_2B | IBV_DEVICE_RC_IP_CSUM | IBV_DEVICE_RAW_IP_CSUM | IBV_DEVICE_MANAGED_FLOW_STEERING); if (dev_cap_flags & IBV_DEVICE_RESIZE_MAX_WR) printf("\t\t\t\t\tRESIZE_MAX_WR\n"); if (dev_cap_flags & IBV_DEVICE_BAD_PKEY_CNTR) printf("\t\t\t\t\tBAD_PKEY_CNTR\n"); if (dev_cap_flags & IBV_DEVICE_BAD_QKEY_CNTR) printf("\t\t\t\t\tBAD_QKEY_CNTR\n"); if (dev_cap_flags & IBV_DEVICE_RAW_MULTI) printf("\t\t\t\t\tRAW_MULTI\n"); if (dev_cap_flags & IBV_DEVICE_AUTO_PATH_MIG) printf("\t\t\t\t\tAUTO_PATH_MIG\n"); if (dev_cap_flags & IBV_DEVICE_CHANGE_PHY_PORT) printf("\t\t\t\t\tCHANGE_PHY_PORT\n"); if (dev_cap_flags & IBV_DEVICE_UD_AV_PORT_ENFORCE) printf("\t\t\t\t\tUD_AV_PORT_ENFORCE\n"); if (dev_cap_flags & IBV_DEVICE_CURR_QP_STATE_MOD) printf("\t\t\t\t\tCURR_QP_STATE_MOD\n"); if (dev_cap_flags & IBV_DEVICE_SHUTDOWN_PORT) printf("\t\t\t\t\tSHUTDOWN_PORT\n"); if (dev_cap_flags & IBV_DEVICE_INIT_TYPE) printf("\t\t\t\t\tINIT_TYPE\n"); if (dev_cap_flags & IBV_DEVICE_PORT_ACTIVE_EVENT) printf("\t\t\t\t\tPORT_ACTIVE_EVENT\n"); if (dev_cap_flags & IBV_DEVICE_SYS_IMAGE_GUID) printf("\t\t\t\t\tSYS_IMAGE_GUID\n"); if (dev_cap_flags & IBV_DEVICE_RC_RNR_NAK_GEN) printf("\t\t\t\t\tRC_RNR_NAK_GEN\n"); if (dev_cap_flags & IBV_DEVICE_SRQ_RESIZE) printf("\t\t\t\t\tSRQ_RESIZE\n"); if (dev_cap_flags & IBV_DEVICE_N_NOTIFY_CQ) printf("\t\t\t\t\tN_NOTIFY_CQ\n"); if (dev_cap_flags & IBV_DEVICE_MEM_WINDOW) printf("\t\t\t\t\tMEM_WINDOW\n"); if (dev_cap_flags & IBV_DEVICE_UD_IP_CSUM) printf("\t\t\t\t\tUD_IP_CSUM\n"); if (dev_cap_flags & IBV_DEVICE_XRC) printf("\t\t\t\t\tXRC\n"); if (dev_cap_flags & IBV_DEVICE_MEM_MGT_EXTENSIONS) printf("\t\t\t\t\tMEM_MGT_EXTENSIONS\n"); if (dev_cap_flags & 
IBV_DEVICE_MEM_WINDOW_TYPE_2A) printf("\t\t\t\t\tMEM_WINDOW_TYPE_2A\n"); if (dev_cap_flags & IBV_DEVICE_MEM_WINDOW_TYPE_2B) printf("\t\t\t\t\tMEM_WINDOW_TYPE_2B\n"); if (dev_cap_flags & IBV_DEVICE_RC_IP_CSUM) printf("\t\t\t\t\tRC_IP_CSUM\n"); if (dev_cap_flags & IBV_DEVICE_RAW_IP_CSUM) printf("\t\t\t\t\tRAW_IP_CSUM\n"); if (dev_cap_flags & IBV_DEVICE_MANAGED_FLOW_STEERING) printf("\t\t\t\t\tMANAGED_FLOW_STEERING\n"); if (dev_cap_flags & unknown_flags) printf("\t\t\t\t\tUnknown flags: 0x%" PRIX32 "\n", dev_cap_flags & unknown_flags); } static void print_odp_trans_caps(uint32_t trans) { uint32_t unknown_transport_caps = ~(IBV_ODP_SUPPORT_SEND | IBV_ODP_SUPPORT_RECV | IBV_ODP_SUPPORT_WRITE | IBV_ODP_SUPPORT_READ | IBV_ODP_SUPPORT_ATOMIC | IBV_ODP_SUPPORT_SRQ_RECV); if (!trans) { printf("\t\t\t\t\tNO SUPPORT\n"); } else { if (trans & IBV_ODP_SUPPORT_SEND) printf("\t\t\t\t\tSUPPORT_SEND\n"); if (trans & IBV_ODP_SUPPORT_RECV) printf("\t\t\t\t\tSUPPORT_RECV\n"); if (trans & IBV_ODP_SUPPORT_WRITE) printf("\t\t\t\t\tSUPPORT_WRITE\n"); if (trans & IBV_ODP_SUPPORT_READ) printf("\t\t\t\t\tSUPPORT_READ\n"); if (trans & IBV_ODP_SUPPORT_ATOMIC) printf("\t\t\t\t\tSUPPORT_ATOMIC\n"); if (trans & IBV_ODP_SUPPORT_SRQ_RECV) printf("\t\t\t\t\tSUPPORT_SRQ\n"); if (trans & unknown_transport_caps) printf("\t\t\t\t\tUnknown flags: 0x%" PRIX32 "\n", trans & unknown_transport_caps); } } static void print_odp_caps(const struct ibv_device_attr_ex *device_attr) { uint64_t unknown_general_caps = ~(IBV_ODP_SUPPORT | IBV_ODP_SUPPORT_IMPLICIT); const struct ibv_odp_caps *caps = &device_attr->odp_caps; /* general odp caps */ printf("\tgeneral_odp_caps:\n"); if (caps->general_caps & IBV_ODP_SUPPORT) printf("\t\t\t\t\tODP_SUPPORT\n"); if (caps->general_caps & IBV_ODP_SUPPORT_IMPLICIT) printf("\t\t\t\t\tODP_SUPPORT_IMPLICIT\n"); if (caps->general_caps & unknown_general_caps) printf("\t\t\t\t\tUnknown flags: 0x%" PRIX64 "\n", caps->general_caps & unknown_general_caps); /* RC transport */ printf("\trc_odp_caps:\n"); print_odp_trans_caps(caps->per_transport_caps.rc_odp_caps); printf("\tuc_odp_caps:\n"); print_odp_trans_caps(caps->per_transport_caps.uc_odp_caps); printf("\tud_odp_caps:\n"); print_odp_trans_caps(caps->per_transport_caps.ud_odp_caps); printf("\txrc_odp_caps:\n"); print_odp_trans_caps(device_attr->xrc_odp_caps); } static void print_device_cap_flags_ex(uint64_t device_cap_flags_ex) { uint64_t ex_flags = device_cap_flags_ex & 0xffffffff00000000ULL; uint64_t unknown_flags = ~(IBV_DEVICE_RAW_SCATTER_FCS | IBV_DEVICE_PCI_WRITE_END_PADDING); if (ex_flags & IBV_DEVICE_RAW_SCATTER_FCS) printf("\t\t\t\t\tRAW_SCATTER_FCS\n"); if (ex_flags & IBV_DEVICE_PCI_WRITE_END_PADDING) printf("\t\t\t\t\tPCI_WRITE_END_PADDING\n"); if (ex_flags & unknown_flags) printf("\t\t\t\t\tUnknown flags: 0x%" PRIX64 "\n", ex_flags & unknown_flags); } static void print_tm_caps(const struct ibv_tm_caps *caps) { if (caps->max_num_tags) { printf("\tmax_rndv_hdr_size:\t\t%u\n", caps->max_rndv_hdr_size); printf("\tmax_num_tags:\t\t\t%u\n", caps->max_num_tags); printf("\tmax_ops:\t\t\t%u\n", caps->max_ops); printf("\tmax_sge:\t\t\t%u\n", caps->max_sge); printf("\tflags:\n"); if (caps->flags & IBV_TM_CAP_RC) printf("\t\t\t\t\tIBV_TM_CAP_RC\n"); } else { printf("\ttag matching not supported\n"); } } static void print_tso_caps(const struct ibv_tso_caps *caps) { uint32_t unknown_general_caps = ~(1 << IBV_QPT_RAW_PACKET | 1 << IBV_QPT_UD); printf("\ttso_caps:\n"); printf("\t\tmax_tso:\t\t\t%d\n", caps->max_tso); if (caps->max_tso) { printf("\t\tsupported_qp:\n"); 
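/* Editorial note: supported_qpts is a bitmask indexed by QP type, so
 * ibv_is_qpt_supported(caps, qpt) amounts to testing caps & (1 << qpt);
 * unknown_general_caps above is the complement of the two known bits,
 * which is how leftover, unrecognized bits get reported below. */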
if (ibv_is_qpt_supported(caps->supported_qpts, IBV_QPT_RAW_PACKET)) printf("\t\t\t\t\tSUPPORT_RAW_PACKET\n"); if (ibv_is_qpt_supported(caps->supported_qpts, IBV_QPT_UD)) printf("\t\t\t\t\tSUPPORT_UD\n"); if (caps->supported_qpts & unknown_general_caps) printf("\t\t\t\t\tUnknown flags: 0x%" PRIX32 "\n", caps->supported_qpts & unknown_general_caps); } } static void print_rss_caps(const struct ibv_rss_caps *caps) { uint32_t unknown_general_caps = ~(1 << IBV_QPT_RAW_PACKET | 1 << IBV_QPT_UD); printf("\trss_caps:\n"); printf("\t\tmax_rwq_indirection_tables:\t\t\t%u\n", caps->max_rwq_indirection_tables); printf("\t\tmax_rwq_indirection_table_size:\t\t\t%u\n", caps->max_rwq_indirection_table_size); printf("\t\trx_hash_function:\t\t\t\t0x%x\n", caps->rx_hash_function); printf("\t\trx_hash_fields_mask:\t\t\t\t0x%" PRIX64 "\n", caps->rx_hash_fields_mask); if (caps->supported_qpts) { printf("\t\tsupported_qp:\n"); if (ibv_is_qpt_supported(caps->supported_qpts, IBV_QPT_RAW_PACKET)) printf("\t\t\t\t\tSUPPORT_RAW_PACKET\n"); if (ibv_is_qpt_supported(caps->supported_qpts, IBV_QPT_UD)) printf("\t\t\t\t\tSUPPORT_UD\n"); if (caps->supported_qpts & unknown_general_caps) printf("\t\t\t\t\tUnknown flags: 0x%" PRIX32 "\n", caps->supported_qpts & unknown_general_caps); } } static void print_cq_moderation_caps(const struct ibv_cq_moderation_caps *cq_caps) { if (!cq_caps->max_cq_count || !cq_caps->max_cq_period) return; printf("\n\tcq moderation caps:\n"); printf("\t\tmax_cq_count:\t%u\n", cq_caps->max_cq_count); printf("\t\tmax_cq_period:\t%u us\n\n", cq_caps->max_cq_period); } static void print_packet_pacing_caps(const struct ibv_packet_pacing_caps *caps) { uint32_t unknown_general_caps = ~(1 << IBV_QPT_RAW_PACKET | 1 << IBV_QPT_UD); printf("\tpacket_pacing_caps:\n"); printf("\t\tqp_rate_limit_min:\t%ukbps\n", caps->qp_rate_limit_min); printf("\t\tqp_rate_limit_max:\t%ukbps\n", caps->qp_rate_limit_max); if (caps->qp_rate_limit_max) { printf("\t\tsupported_qp:\n"); if (ibv_is_qpt_supported(caps->supported_qpts, IBV_QPT_RAW_PACKET)) printf("\t\t\t\t\tSUPPORT_RAW_PACKET\n"); if (ibv_is_qpt_supported(caps->supported_qpts, IBV_QPT_UD)) printf("\t\t\t\t\tSUPPORT_UD\n"); if (caps->supported_qpts & unknown_general_caps) printf("\t\t\t\t\tUnknown flags: 0x%" PRIX32 "\n", caps->supported_qpts & unknown_general_caps); } } static void print_raw_packet_caps(uint32_t raw_packet_caps) { printf("\traw packet caps:\n"); if (raw_packet_caps & IBV_RAW_PACKET_CAP_CVLAN_STRIPPING) printf("\t\t\t\t\tC-VLAN stripping offload\n"); if (raw_packet_caps & IBV_RAW_PACKET_CAP_SCATTER_FCS) printf("\t\t\t\t\tScatter FCS offload\n"); if (raw_packet_caps & IBV_RAW_PACKET_CAP_IP_CSUM) printf("\t\t\t\t\tIP csum offload\n"); if (raw_packet_caps & IBV_RAW_PACKET_CAP_DELAY_DROP) printf("\t\t\t\t\tDelay drop\n"); } static int print_hca_cap(struct ibv_device *ib_dev, uint8_t ib_port) { struct ibv_context *ctx; struct ibv_device_attr_ex device_attr = {}; struct ibv_port_attr port_attr; int rc = 0; uint32_t port; char buf[256]; ctx = ibv_open_device(ib_dev); if (!ctx) { fprintf(stderr, "Failed to open device\n"); rc = 1; goto cleanup; } if (ibv_query_device_ex(ctx, NULL, &device_attr)) { fprintf(stderr, "Failed to query device props\n"); rc = 2; goto cleanup; } if (ib_port && ib_port > device_attr.orig_attr.phys_port_cnt) { fprintf(stderr, "Invalid port requested for device\n"); /* rc = 3 is taken by failure to clean up */ rc = 4; goto cleanup; } printf("hca_id:\t%s\n", ibv_get_device_name(ib_dev)); printf("\ttransport:\t\t\t%s (%d)\n", 
transport_str(ib_dev->transport_type), ib_dev->transport_type); if (strlen(device_attr.orig_attr.fw_ver)) printf("\tfw_ver:\t\t\t\t%s\n", device_attr.orig_attr.fw_ver); printf("\tnode_guid:\t\t\t%s\n", guid_str(device_attr.orig_attr.node_guid, buf)); printf("\tsys_image_guid:\t\t\t%s\n", guid_str(device_attr.orig_attr.sys_image_guid, buf)); printf("\tvendor_id:\t\t\t0x%04x\n", device_attr.orig_attr.vendor_id); printf("\tvendor_part_id:\t\t\t%d\n", device_attr.orig_attr.vendor_part_id); printf("\thw_ver:\t\t\t\t0x%X\n", device_attr.orig_attr.hw_ver); if (ibv_read_sysfs_file(ib_dev->ibdev_path, "board_id", buf, sizeof buf) > 0) printf("\tboard_id:\t\t\t%s\n", buf); printf("\tphys_port_cnt:\t\t\t%d\n", device_attr.orig_attr.phys_port_cnt); if (verbose) { printf("\tmax_mr_size:\t\t\t0x%llx\n", (unsigned long long) device_attr.orig_attr.max_mr_size); printf("\tpage_size_cap:\t\t\t0x%llx\n", (unsigned long long) device_attr.orig_attr.page_size_cap); printf("\tmax_qp:\t\t\t\t%d\n", device_attr.orig_attr.max_qp); printf("\tmax_qp_wr:\t\t\t%d\n", device_attr.orig_attr.max_qp_wr); printf("\tdevice_cap_flags:\t\t0x%08x\n", device_attr.orig_attr.device_cap_flags); print_device_cap_flags(device_attr.orig_attr.device_cap_flags); printf("\tmax_sge:\t\t\t%d\n", device_attr.orig_attr.max_sge); printf("\tmax_sge_rd:\t\t\t%d\n", device_attr.orig_attr.max_sge_rd); printf("\tmax_cq:\t\t\t\t%d\n", device_attr.orig_attr.max_cq); printf("\tmax_cqe:\t\t\t%d\n", device_attr.orig_attr.max_cqe); printf("\tmax_mr:\t\t\t\t%d\n", device_attr.orig_attr.max_mr); printf("\tmax_pd:\t\t\t\t%d\n", device_attr.orig_attr.max_pd); printf("\tmax_qp_rd_atom:\t\t\t%d\n", device_attr.orig_attr.max_qp_rd_atom); printf("\tmax_ee_rd_atom:\t\t\t%d\n", device_attr.orig_attr.max_ee_rd_atom); printf("\tmax_res_rd_atom:\t\t%d\n", device_attr.orig_attr.max_res_rd_atom); printf("\tmax_qp_init_rd_atom:\t\t%d\n", device_attr.orig_attr.max_qp_init_rd_atom); printf("\tmax_ee_init_rd_atom:\t\t%d\n", device_attr.orig_attr.max_ee_init_rd_atom); printf("\tatomic_cap:\t\t\t%s (%d)\n", atomic_cap_str(device_attr.orig_attr.atomic_cap), device_attr.orig_attr.atomic_cap); printf("\tmax_ee:\t\t\t\t%d\n", device_attr.orig_attr.max_ee); printf("\tmax_rdd:\t\t\t%d\n", device_attr.orig_attr.max_rdd); printf("\tmax_mw:\t\t\t\t%d\n", device_attr.orig_attr.max_mw); printf("\tmax_raw_ipv6_qp:\t\t%d\n", device_attr.orig_attr.max_raw_ipv6_qp); printf("\tmax_raw_ethy_qp:\t\t%d\n", device_attr.orig_attr.max_raw_ethy_qp); printf("\tmax_mcast_grp:\t\t\t%d\n", device_attr.orig_attr.max_mcast_grp); printf("\tmax_mcast_qp_attach:\t\t%d\n", device_attr.orig_attr.max_mcast_qp_attach); printf("\tmax_total_mcast_qp_attach:\t%d\n", device_attr.orig_attr.max_total_mcast_qp_attach); printf("\tmax_ah:\t\t\t\t%d\n", device_attr.orig_attr.max_ah); printf("\tmax_fmr:\t\t\t%d\n", device_attr.orig_attr.max_fmr); if (device_attr.orig_attr.max_fmr) printf("\tmax_map_per_fmr:\t\t%d\n", device_attr.orig_attr.max_map_per_fmr); printf("\tmax_srq:\t\t\t%d\n", device_attr.orig_attr.max_srq); if (device_attr.orig_attr.max_srq) { printf("\tmax_srq_wr:\t\t\t%d\n", device_attr.orig_attr.max_srq_wr); printf("\tmax_srq_sge:\t\t\t%d\n", device_attr.orig_attr.max_srq_sge); } printf("\tmax_pkeys:\t\t\t%d\n", device_attr.orig_attr.max_pkeys); printf("\tlocal_ca_ack_delay:\t\t%d\n", device_attr.orig_attr.local_ca_ack_delay); print_odp_caps(&device_attr); if (device_attr.completion_timestamp_mask) printf("\tcompletion timestamp_mask:\t\t\t0x%016" PRIx64 "\n", device_attr.completion_timestamp_mask); else 
printf("\tcompletion_timestamp_mask not supported\n"); if (device_attr.hca_core_clock) printf("\thca_core_clock:\t\t\t%" PRIu64 "kHZ\n", device_attr.hca_core_clock); else printf("\tcore clock not supported\n"); if (device_attr.raw_packet_caps) print_raw_packet_caps(device_attr.raw_packet_caps); printf("\tdevice_cap_flags_ex:\t\t0x%" PRIX64 "\n", device_attr.device_cap_flags_ex); print_device_cap_flags_ex(device_attr.device_cap_flags_ex); print_tso_caps(&device_attr.tso_caps); print_rss_caps(&device_attr.rss_caps); printf("\tmax_wq_type_rq:\t\t\t%u\n", device_attr.max_wq_type_rq); print_packet_pacing_caps(&device_attr.packet_pacing_caps); print_tm_caps(&device_attr.tm_caps); print_cq_moderation_caps(&device_attr.cq_mod_caps); if (device_attr.max_dm_size) printf("\tmaximum available device memory:\t%" PRIu64"Bytes\n\n", device_attr.max_dm_size); printf("\tnum_comp_vectors:\t\t%d\n", ctx->num_comp_vectors); } for (port = 1; port <= device_attr.orig_attr.phys_port_cnt; ++port) { /* if in the command line the user didn't ask for info about this port */ if ((ib_port) && (port != ib_port)) continue; rc = ibv_query_port(ctx, port, &port_attr); if (rc) { fprintf(stderr, "Failed to query port %u props\n", port); goto cleanup; } printf("\t\tport:\t%u\n", port); printf("\t\t\tstate:\t\t\t%s (%d)\n", port_state_str(port_attr.state), port_attr.state); printf("\t\t\tmax_mtu:\t\t%s (%d)\n", mtu_str(port_attr.max_mtu), port_attr.max_mtu); printf("\t\t\tactive_mtu:\t\t%s (%d)\n", mtu_str(port_attr.active_mtu), port_attr.active_mtu); printf("\t\t\tsm_lid:\t\t\t%d\n", port_attr.sm_lid); printf("\t\t\tport_lid:\t\t%d\n", port_attr.lid); printf("\t\t\tport_lmc:\t\t0x%02x\n", port_attr.lmc); printf("\t\t\tlink_layer:\t\t%s\n", link_layer_str(port_attr.link_layer)); if (verbose) { printf("\t\t\tmax_msg_sz:\t\t0x%x\n", port_attr.max_msg_sz); printf("\t\t\tport_cap_flags:\t\t0x%08x\n", port_attr.port_cap_flags); printf("\t\t\tport_cap_flags2:\t0x%04x\n", port_attr.port_cap_flags2); printf("\t\t\tmax_vl_num:\t\t%s (%d)\n", vl_str(port_attr.max_vl_num), port_attr.max_vl_num); printf("\t\t\tbad_pkey_cntr:\t\t0x%x\n", port_attr.bad_pkey_cntr); printf("\t\t\tqkey_viol_cntr:\t\t0x%x\n", port_attr.qkey_viol_cntr); printf("\t\t\tsm_sl:\t\t\t%d\n", port_attr.sm_sl); printf("\t\t\tpkey_tbl_len:\t\t%d\n", port_attr.pkey_tbl_len); printf("\t\t\tgid_tbl_len:\t\t%d\n", port_attr.gid_tbl_len); printf("\t\t\tsubnet_timeout:\t\t%d\n", port_attr.subnet_timeout); printf("\t\t\tinit_type_reply:\t%d\n", port_attr.init_type_reply); printf("\t\t\tactive_width:\t\t%sX (%d)\n", width_str(port_attr.active_width), port_attr.active_width); printf("\t\t\tactive_speed:\t\t%s (%d)\n", port_attr.active_speed_ex ? speed_str(port_attr.active_speed_ex) : speed_str(port_attr.active_speed), port_attr.active_speed_ex ? 
port_attr.active_speed_ex : port_attr.active_speed); if (ib_dev->transport_type == IBV_TRANSPORT_IB) printf("\t\t\tphys_state:\t\t%s (%d)\n", port_phy_state_str(port_attr.phys_state), port_attr.phys_state); rc = print_all_port_gids(ctx, &port_attr, port); if (rc) goto cleanup; } printf("\n"); } cleanup: if (ctx) if (ibv_close_device(ctx)) { fprintf(stderr, "Failed to close device"); rc = 3; } return rc; } static void usage(const char *argv0) { printf("Usage: %s print the ca attributes\n", argv0); printf("\n"); printf("Options:\n"); printf(" -d, --ib-dev= use IB device (default all devices)\n"); printf(" -i, --ib-port= use port of IB device (default all ports)\n"); printf(" -l, --list print only the IB devices names\n"); printf(" -v, --verbose print all the attributes of the IB device(s)\n"); } int main(int argc, char *argv[]) { char *ib_devname = NULL; int ret = 0; struct ibv_device **dev_list, **orig_dev_list; int num_of_hcas; int ib_port = 0; /* parse command line options */ while (1) { int c; static struct option long_options[] = { { .name = "ib-dev", .has_arg = 1, .val = 'd' }, { .name = "ib-port", .has_arg = 1, .val = 'i' }, { .name = "list", .has_arg = 0, .val = 'l' }, { .name = "verbose", .has_arg = 0, .val = 'v' }, { } }; c = getopt_long(argc, argv, "d:i:lv", long_options, NULL); if (c == -1) break; switch (c) { case 'd': ib_devname = strdup(optarg); break; case 'i': ib_port = strtol(optarg, NULL, 0); if (ib_port <= 0) { usage(argv[0]); return 1; } break; case 'v': verbose = 1; break; case 'l': dev_list = orig_dev_list = ibv_get_device_list(&num_of_hcas); if (!dev_list) { perror("Failed to get IB devices list"); return -1; } printf("%d HCA%s found:\n", num_of_hcas, num_of_hcas != 1 ? "s" : ""); while (*dev_list) { printf("\t%s\n", ibv_get_device_name(*dev_list)); ++dev_list; } printf("\n"); ibv_free_device_list(orig_dev_list); return 0; default: usage(argv[0]); return -1; } } dev_list = orig_dev_list = ibv_get_device_list(NULL); if (!dev_list) { perror("Failed to get IB devices list"); return -1; } if (ib_devname) { while (*dev_list) { if (!strcmp(ibv_get_device_name(*dev_list), ib_devname)) break; ++dev_list; } if (!*dev_list) { fprintf(stderr, "IB device '%s' wasn't found\n", ib_devname); ret = -1; goto out; } ret |= print_hca_cap(*dev_list, ib_port); } else { if (!*dev_list) { fprintf(stderr, "No IB devices found\n"); ret = -1; goto out; } while (*dev_list) { ret |= print_hca_cap(*dev_list, ib_port); ++dev_list; } } out: if (ib_devname) free(ib_devname); ibv_free_device_list(orig_dev_list); return ret; } rdma-core-56.1/libibverbs/examples/pingpong.c000066400000000000000000000045671477342711600212740ustar00rootroot00000000000000/* * Copyright (c) 2006 Cisco Systems. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
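Editorial note: nearly all of devinfo.c hangs off two verbs: ibv_query_device_ex(), whose ibv_device_attr_ex embeds the classic attributes as orig_attr, and ibv_query_port(), called once per 1-based port number up to phys_port_cnt. A condensed sketch of that skeleton (dump_ports() is a hypothetical name; ctx is assumed to be an already-opened context):

/* Condensed devinfo-style query loop; a sketch under the assumptions above. */
#include <stdio.h>
#include <infiniband/verbs.h>

static int dump_ports(struct ibv_context *ctx)
{
	struct ibv_device_attr_ex dev = {};
	struct ibv_port_attr port;
	uint32_t p;

	if (ibv_query_device_ex(ctx, NULL, &dev))
		return 1;
	/* Port numbers are 1-based; phys_port_cnt lives in orig_attr. */
	for (p = 1; p <= dev.orig_attr.phys_port_cnt; ++p) {
		if (ibv_query_port(ctx, p, &port))
			return 1;
		printf("port %u: state %d, lid %d\n", p, port.state,
		       port.lid);
	}
	return 0;
}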
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include "pingpong.h" #include #include #include #include enum ibv_mtu pp_mtu_to_enum(int mtu) { switch (mtu) { case 256: return IBV_MTU_256; case 512: return IBV_MTU_512; case 1024: return IBV_MTU_1024; case 2048: return IBV_MTU_2048; case 4096: return IBV_MTU_4096; default: return 0; } } int pp_get_port_info(struct ibv_context *context, int port, struct ibv_port_attr *attr) { return ibv_query_port(context, port, attr); } void wire_gid_to_gid(const char *wgid, union ibv_gid *gid) { char tmp[9]; __be32 v32; int i; uint32_t tmp_gid[4]; for (tmp[8] = 0, i = 0; i < 4; ++i) { memcpy(tmp, wgid + i * 8, 8); sscanf(tmp, "%x", &v32); tmp_gid[i] = be32toh(v32); } memcpy(gid, tmp_gid, sizeof(*gid)); } void gid_to_wire_gid(const union ibv_gid *gid, char wgid[]) { uint32_t tmp_gid[4]; int i; memcpy(tmp_gid, gid, sizeof(tmp_gid)); for (i = 0; i < 4; ++i) sprintf(&wgid[i * 8], "%08x", htobe32(tmp_gid[i])); } rdma-core-56.1/libibverbs/examples/pingpong.h000066400000000000000000000033751477342711600212750ustar00rootroot00000000000000/* * Copyright (c) 2006 Cisco Systems. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef IBV_PINGPONG_H #define IBV_PINGPONG_H #include enum ibv_mtu pp_mtu_to_enum(int mtu); int pp_get_port_info(struct ibv_context *context, int port, struct ibv_port_attr *attr); void wire_gid_to_gid(const char *wgid, union ibv_gid *gid); void gid_to_wire_gid(const union ibv_gid *gid, char wgid[]); #endif /* IBV_PINGPONG_H */ rdma-core-56.1/libibverbs/examples/rc_pingpong.c000066400000000000000000000705161477342711600217550ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
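Editorial note: the helpers above define the wire format shared by all pingpong examples: a 128-bit GID travels as 32 hex digits, four 8-character big-endian 32-bit words, embedded in the fixed-width "LID:QPN:PSN:GID" message so both peers can exchange addresses with single read()/write() calls. A round-trip sketch against pingpong.h (gid_roundtrip_demo() is a hypothetical name):

/* Round-trip a GID through the 32-hex-digit wire format; a sketch. */
#include <stdio.h>
#include <string.h>
#include "pingpong.h"

int gid_roundtrip_demo(const union ibv_gid *gid)
{
	char wgid[33];	/* 32 hex digits plus the terminating NUL */
	union ibv_gid back;

	gid_to_wire_gid(gid, wgid);
	wire_gid_to_gid(wgid, &back);
	printf("wire form: %s\n", wgid);
	/* Returns 0 when the serialize/parse pair preserved every byte. */
	return memcmp(gid, &back, sizeof(back)) != 0;
}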
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "pingpong.h" #include enum { PINGPONG_RECV_WRID = 1, PINGPONG_SEND_WRID = 2, }; static int page_size; static int use_odp; static int implicit_odp; static int prefetch_mr; static int use_ts; static int validate_buf; static int use_dm; static int use_new_send; struct pingpong_context { struct ibv_context *context; struct ibv_comp_channel *channel; struct ibv_pd *pd; struct ibv_mr *mr; struct ibv_dm *dm; union { struct ibv_cq *cq; struct ibv_cq_ex *cq_ex; } cq_s; struct ibv_qp *qp; struct ibv_qp_ex *qpx; char *buf; int size; int send_flags; int rx_depth; int pending; struct ibv_port_attr portinfo; uint64_t completion_timestamp_mask; }; static struct ibv_cq *pp_cq(struct pingpong_context *ctx) { return use_ts ? 
ibv_cq_ex_to_cq(ctx->cq_s.cq_ex) : ctx->cq_s.cq; } struct pingpong_dest { int lid; int qpn; int psn; union ibv_gid gid; }; static int pp_connect_ctx(struct pingpong_context *ctx, int port, int my_psn, enum ibv_mtu mtu, int sl, struct pingpong_dest *dest, int sgid_idx) { struct ibv_qp_attr attr = { .qp_state = IBV_QPS_RTR, .path_mtu = mtu, .dest_qp_num = dest->qpn, .rq_psn = dest->psn, .max_dest_rd_atomic = 1, .min_rnr_timer = 12, .ah_attr = { .is_global = 0, .dlid = dest->lid, .sl = sl, .src_path_bits = 0, .port_num = port } }; if (dest->gid.global.interface_id) { attr.ah_attr.is_global = 1; attr.ah_attr.grh.hop_limit = 1; attr.ah_attr.grh.dgid = dest->gid; attr.ah_attr.grh.sgid_index = sgid_idx; } if (ibv_modify_qp(ctx->qp, &attr, IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU | IBV_QP_DEST_QPN | IBV_QP_RQ_PSN | IBV_QP_MAX_DEST_RD_ATOMIC | IBV_QP_MIN_RNR_TIMER)) { fprintf(stderr, "Failed to modify QP to RTR\n"); return 1; } attr.qp_state = IBV_QPS_RTS; attr.timeout = 14; attr.retry_cnt = 7; attr.rnr_retry = 7; attr.sq_psn = my_psn; attr.max_rd_atomic = 1; if (ibv_modify_qp(ctx->qp, &attr, IBV_QP_STATE | IBV_QP_TIMEOUT | IBV_QP_RETRY_CNT | IBV_QP_RNR_RETRY | IBV_QP_SQ_PSN | IBV_QP_MAX_QP_RD_ATOMIC)) { fprintf(stderr, "Failed to modify QP to RTS\n"); return 1; } return 0; } static struct pingpong_dest *pp_client_exch_dest(const char *servername, int port, const struct pingpong_dest *my_dest) { struct addrinfo *res, *t; struct addrinfo hints = { .ai_family = AF_UNSPEC, .ai_socktype = SOCK_STREAM }; char *service; char msg[sizeof "0000:000000:000000:00000000000000000000000000000000"]; int n; int sockfd = -1; struct pingpong_dest *rem_dest = NULL; char gid[33]; if (asprintf(&service, "%d", port) < 0) return NULL; n = getaddrinfo(servername, service, &hints, &res); if (n < 0) { fprintf(stderr, "%s for %s:%d\n", gai_strerror(n), servername, port); free(service); return NULL; } for (t = res; t; t = t->ai_next) { sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol); if (sockfd >= 0) { if (!connect(sockfd, t->ai_addr, t->ai_addrlen)) break; close(sockfd); sockfd = -1; } } freeaddrinfo(res); free(service); if (sockfd < 0) { fprintf(stderr, "Couldn't connect to %s:%d\n", servername, port); return NULL; } gid_to_wire_gid(&my_dest->gid, gid); sprintf(msg, "%04x:%06x:%06x:%s", my_dest->lid, my_dest->qpn, my_dest->psn, gid); if (write(sockfd, msg, sizeof msg) != sizeof msg) { fprintf(stderr, "Couldn't send local address\n"); goto out; } if (read(sockfd, msg, sizeof msg) != sizeof msg || write(sockfd, "done", sizeof "done") != sizeof "done") { perror("client read/write"); fprintf(stderr, "Couldn't read/write remote address\n"); goto out; } rem_dest = malloc(sizeof *rem_dest); if (!rem_dest) goto out; sscanf(msg, "%x:%x:%x:%s", &rem_dest->lid, &rem_dest->qpn, &rem_dest->psn, gid); wire_gid_to_gid(gid, &rem_dest->gid); out: close(sockfd); return rem_dest; } static struct pingpong_dest *pp_server_exch_dest(struct pingpong_context *ctx, int ib_port, enum ibv_mtu mtu, int port, int sl, const struct pingpong_dest *my_dest, int sgid_idx) { struct addrinfo *res, *t; struct addrinfo hints = { .ai_flags = AI_PASSIVE, .ai_family = AF_UNSPEC, .ai_socktype = SOCK_STREAM }; char *service; char msg[sizeof "0000:000000:000000:00000000000000000000000000000000"]; int n; int sockfd = -1, connfd; struct pingpong_dest *rem_dest = NULL; char gid[33]; if (asprintf(&service, "%d", port) < 0) return NULL; n = getaddrinfo(NULL, service, &hints, &res); if (n < 0) { fprintf(stderr, "%s for port %d\n", gai_strerror(n), port); 
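		/* Editorial note: both exch_dest paths trade the connection
		 * parameters (LID, QPN, PSN, GID) over plain TCP before any
		 * RDMA traffic flows; the "%04x:%06x:%06x:%s" format keeps
		 * sizeof(msg) identical on both peers, so one read()/write()
		 * per direction moves a whole address. */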
free(service); return NULL; } for (t = res; t; t = t->ai_next) { sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol); if (sockfd >= 0) { n = 1; setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &n, sizeof n); if (!bind(sockfd, t->ai_addr, t->ai_addrlen)) break; close(sockfd); sockfd = -1; } } freeaddrinfo(res); free(service); if (sockfd < 0) { fprintf(stderr, "Couldn't listen to port %d\n", port); return NULL; } listen(sockfd, 1); connfd = accept(sockfd, NULL, NULL); close(sockfd); if (connfd < 0) { fprintf(stderr, "accept() failed\n"); return NULL; } n = read(connfd, msg, sizeof msg); if (n != sizeof msg) { perror("server read"); fprintf(stderr, "%d/%d: Couldn't read remote address\n", n, (int) sizeof msg); goto out; } rem_dest = malloc(sizeof *rem_dest); if (!rem_dest) goto out; sscanf(msg, "%x:%x:%x:%s", &rem_dest->lid, &rem_dest->qpn, &rem_dest->psn, gid); wire_gid_to_gid(gid, &rem_dest->gid); if (pp_connect_ctx(ctx, ib_port, my_dest->psn, mtu, sl, rem_dest, sgid_idx)) { fprintf(stderr, "Couldn't connect to remote QP\n"); free(rem_dest); rem_dest = NULL; goto out; } gid_to_wire_gid(&my_dest->gid, gid); sprintf(msg, "%04x:%06x:%06x:%s", my_dest->lid, my_dest->qpn, my_dest->psn, gid); if (write(connfd, msg, sizeof msg) != sizeof msg || read(connfd, msg, sizeof msg) != sizeof "done") { fprintf(stderr, "Couldn't send/recv local address\n"); free(rem_dest); rem_dest = NULL; goto out; } out: close(connfd); return rem_dest; } static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size, int rx_depth, int port, int use_event) { struct pingpong_context *ctx; int access_flags = IBV_ACCESS_LOCAL_WRITE; ctx = calloc(1, sizeof *ctx); if (!ctx) return NULL; ctx->size = size; ctx->send_flags = IBV_SEND_SIGNALED; ctx->rx_depth = rx_depth; ctx->buf = memalign(page_size, size); if (!ctx->buf) { fprintf(stderr, "Couldn't allocate work buf.\n"); goto clean_ctx; } /* FIXME memset(ctx->buf, 0, size); */ memset(ctx->buf, 0x7b, size); ctx->context = ibv_open_device(ib_dev); if (!ctx->context) { fprintf(stderr, "Couldn't get context for %s\n", ibv_get_device_name(ib_dev)); goto clean_buffer; } if (use_event) { ctx->channel = ibv_create_comp_channel(ctx->context); if (!ctx->channel) { fprintf(stderr, "Couldn't create completion channel\n"); goto clean_device; } } else ctx->channel = NULL; ctx->pd = ibv_alloc_pd(ctx->context); if (!ctx->pd) { fprintf(stderr, "Couldn't allocate PD\n"); goto clean_comp_channel; } if (use_odp || use_ts || use_dm) { const uint32_t rc_caps_mask = IBV_ODP_SUPPORT_SEND | IBV_ODP_SUPPORT_RECV; struct ibv_device_attr_ex attrx; if (ibv_query_device_ex(ctx->context, NULL, &attrx)) { fprintf(stderr, "Couldn't query device for its features\n"); goto clean_pd; } if (use_odp) { if (!(attrx.odp_caps.general_caps & IBV_ODP_SUPPORT) || (attrx.odp_caps.per_transport_caps.rc_odp_caps & rc_caps_mask) != rc_caps_mask) { fprintf(stderr, "The device isn't ODP capable or does not support RC send and receive with ODP\n"); goto clean_pd; } if (implicit_odp && !(attrx.odp_caps.general_caps & IBV_ODP_SUPPORT_IMPLICIT)) { fprintf(stderr, "The device doesn't support implicit ODP\n"); goto clean_pd; } access_flags |= IBV_ACCESS_ON_DEMAND; } if (use_ts) { if (!attrx.completion_timestamp_mask) { fprintf(stderr, "The device isn't completion timestamp capable\n"); goto clean_pd; } ctx->completion_timestamp_mask = attrx.completion_timestamp_mask; } if (use_dm) { struct ibv_alloc_dm_attr dm_attr = {}; if (!attrx.max_dm_size) { fprintf(stderr, "Device doesn't support dm allocation\n"); goto 
clean_pd; } if (attrx.max_dm_size < size) { fprintf(stderr, "Device memory is insufficient\n"); goto clean_pd; } dm_attr.length = size; ctx->dm = ibv_alloc_dm(ctx->context, &dm_attr); if (!ctx->dm) { fprintf(stderr, "Dev mem allocation failed\n"); goto clean_pd; } access_flags |= IBV_ACCESS_ZERO_BASED; } } if (implicit_odp) { ctx->mr = ibv_reg_mr(ctx->pd, NULL, SIZE_MAX, access_flags); } else { ctx->mr = use_dm ? ibv_reg_dm_mr(ctx->pd, ctx->dm, 0, size, access_flags) : ibv_reg_mr(ctx->pd, ctx->buf, size, access_flags); } if (!ctx->mr) { fprintf(stderr, "Couldn't register MR\n"); goto clean_dm; } if (prefetch_mr) { struct ibv_sge sg_list; int ret; sg_list.lkey = ctx->mr->lkey; sg_list.addr = (uintptr_t)ctx->buf; sg_list.length = size; ret = ibv_advise_mr(ctx->pd, IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE, IB_UVERBS_ADVISE_MR_FLAG_FLUSH, &sg_list, 1); if (ret) fprintf(stderr, "Couldn't prefetch MR(%d). Continue anyway\n", ret); } if (use_ts) { struct ibv_cq_init_attr_ex attr_ex = { .cqe = rx_depth + 1, .cq_context = NULL, .channel = ctx->channel, .comp_vector = 0, .wc_flags = IBV_WC_EX_WITH_COMPLETION_TIMESTAMP }; ctx->cq_s.cq_ex = ibv_create_cq_ex(ctx->context, &attr_ex); } else { ctx->cq_s.cq = ibv_create_cq(ctx->context, rx_depth + 1, NULL, ctx->channel, 0); } if (!pp_cq(ctx)) { fprintf(stderr, "Couldn't create CQ\n"); goto clean_mr; } { struct ibv_qp_attr attr; struct ibv_qp_init_attr init_attr = { .send_cq = pp_cq(ctx), .recv_cq = pp_cq(ctx), .cap = { .max_send_wr = 1, .max_recv_wr = rx_depth, .max_send_sge = 1, .max_recv_sge = 1 }, .qp_type = IBV_QPT_RC }; if (use_new_send) { struct ibv_qp_init_attr_ex init_attr_ex = {}; init_attr_ex.send_cq = pp_cq(ctx); init_attr_ex.recv_cq = pp_cq(ctx); init_attr_ex.cap.max_send_wr = 1; init_attr_ex.cap.max_recv_wr = rx_depth; init_attr_ex.cap.max_send_sge = 1; init_attr_ex.cap.max_recv_sge = 1; init_attr_ex.qp_type = IBV_QPT_RC; init_attr_ex.comp_mask |= IBV_QP_INIT_ATTR_PD | IBV_QP_INIT_ATTR_SEND_OPS_FLAGS; init_attr_ex.pd = ctx->pd; init_attr_ex.send_ops_flags = IBV_QP_EX_WITH_SEND; ctx->qp = ibv_create_qp_ex(ctx->context, &init_attr_ex); } else { ctx->qp = ibv_create_qp(ctx->pd, &init_attr); } if (!ctx->qp) { fprintf(stderr, "Couldn't create QP\n"); goto clean_cq; } if (use_new_send) ctx->qpx = ibv_qp_to_qp_ex(ctx->qp); ibv_query_qp(ctx->qp, &attr, IBV_QP_CAP, &init_attr); if (init_attr.cap.max_inline_data >= size && !use_dm) ctx->send_flags |= IBV_SEND_INLINE; } { struct ibv_qp_attr attr = { .qp_state = IBV_QPS_INIT, .pkey_index = 0, .port_num = port, .qp_access_flags = 0 }; if (ibv_modify_qp(ctx->qp, &attr, IBV_QP_STATE | IBV_QP_PKEY_INDEX | IBV_QP_PORT | IBV_QP_ACCESS_FLAGS)) { fprintf(stderr, "Failed to modify QP to INIT\n"); goto clean_qp; } } return ctx; clean_qp: ibv_destroy_qp(ctx->qp); clean_cq: ibv_destroy_cq(pp_cq(ctx)); clean_mr: ibv_dereg_mr(ctx->mr); clean_dm: if (ctx->dm) ibv_free_dm(ctx->dm); clean_pd: ibv_dealloc_pd(ctx->pd); clean_comp_channel: if (ctx->channel) ibv_destroy_comp_channel(ctx->channel); clean_device: ibv_close_device(ctx->context); clean_buffer: free(ctx->buf); clean_ctx: free(ctx); return NULL; } static int pp_close_ctx(struct pingpong_context *ctx) { if (ibv_destroy_qp(ctx->qp)) { fprintf(stderr, "Couldn't destroy QP\n"); return 1; } if (ibv_destroy_cq(pp_cq(ctx))) { fprintf(stderr, "Couldn't destroy CQ\n"); return 1; } if (ibv_dereg_mr(ctx->mr)) { fprintf(stderr, "Couldn't deregister MR\n"); return 1; } if (ctx->dm) { if (ibv_free_dm(ctx->dm)) { fprintf(stderr, "Couldn't free DM\n"); return 1; } } if 
(ibv_dealloc_pd(ctx->pd)) { fprintf(stderr, "Couldn't deallocate PD\n"); return 1; } if (ctx->channel) { if (ibv_destroy_comp_channel(ctx->channel)) { fprintf(stderr, "Couldn't destroy completion channel\n"); return 1; } } if (ibv_close_device(ctx->context)) { fprintf(stderr, "Couldn't release context\n"); return 1; } free(ctx->buf); free(ctx); return 0; } static int pp_post_recv(struct pingpong_context *ctx, int n) { struct ibv_sge list = { .addr = use_dm ? 0 : (uintptr_t) ctx->buf, .length = ctx->size, .lkey = ctx->mr->lkey }; struct ibv_recv_wr wr = { .wr_id = PINGPONG_RECV_WRID, .sg_list = &list, .num_sge = 1, }; struct ibv_recv_wr *bad_wr; int i; for (i = 0; i < n; ++i) if (ibv_post_recv(ctx->qp, &wr, &bad_wr)) break; return i; } static int pp_post_send(struct pingpong_context *ctx) { struct ibv_sge list = { .addr = use_dm ? 0 : (uintptr_t) ctx->buf, .length = ctx->size, .lkey = ctx->mr->lkey }; struct ibv_send_wr wr = { .wr_id = PINGPONG_SEND_WRID, .sg_list = &list, .num_sge = 1, .opcode = IBV_WR_SEND, .send_flags = ctx->send_flags, }; struct ibv_send_wr *bad_wr; if (use_new_send) { ibv_wr_start(ctx->qpx); ctx->qpx->wr_id = PINGPONG_SEND_WRID; ctx->qpx->wr_flags = ctx->send_flags; ibv_wr_send(ctx->qpx); ibv_wr_set_sge(ctx->qpx, list.lkey, list.addr, list.length); return ibv_wr_complete(ctx->qpx); } else { return ibv_post_send(ctx->qp, &wr, &bad_wr); } } struct ts_params { uint64_t comp_recv_max_time_delta; uint64_t comp_recv_min_time_delta; uint64_t comp_recv_total_time_delta; uint64_t comp_recv_prev_time; int last_comp_with_ts; unsigned int comp_with_time_iters; }; static inline int parse_single_wc(struct pingpong_context *ctx, int *scnt, int *rcnt, int *routs, int iters, uint64_t wr_id, enum ibv_wc_status status, uint64_t completion_timestamp, struct ts_params *ts) { if (status != IBV_WC_SUCCESS) { fprintf(stderr, "Failed status %s (%d) for wr_id %d\n", ibv_wc_status_str(status), status, (int)wr_id); return 1; } switch ((int)wr_id) { case PINGPONG_SEND_WRID: ++(*scnt); break; case PINGPONG_RECV_WRID: if (--(*routs) <= 1) { *routs += pp_post_recv(ctx, ctx->rx_depth - *routs); if (*routs < ctx->rx_depth) { fprintf(stderr, "Couldn't post receive (%d)\n", *routs); return 1; } } ++(*rcnt); if (use_ts) { if (ts->last_comp_with_ts) { uint64_t delta; /* checking whether the clock was wrapped around */ if (completion_timestamp >= ts->comp_recv_prev_time) delta = completion_timestamp - ts->comp_recv_prev_time; else delta = ctx->completion_timestamp_mask - ts->comp_recv_prev_time + completion_timestamp + 1; ts->comp_recv_max_time_delta = max(ts->comp_recv_max_time_delta, delta); ts->comp_recv_min_time_delta = min(ts->comp_recv_min_time_delta, delta); ts->comp_recv_total_time_delta += delta; ts->comp_with_time_iters++; } ts->comp_recv_prev_time = completion_timestamp; ts->last_comp_with_ts = 1; } else { ts->last_comp_with_ts = 0; } break; default: fprintf(stderr, "Completion for unknown wr_id %d\n", (int)wr_id); return 1; } ctx->pending &= ~(int)wr_id; if (*scnt < iters && !ctx->pending) { if (pp_post_send(ctx)) { fprintf(stderr, "Couldn't post send\n"); return 1; } ctx->pending = PINGPONG_RECV_WRID | PINGPONG_SEND_WRID; } return 0; } static void usage(const char *argv0) { printf("Usage:\n"); printf(" %s start a server and wait for connection\n", argv0); printf(" %s connect to server at \n", argv0); printf("\n"); printf("Options:\n"); printf(" -p, --port= listen on/connect to port (default 18515)\n"); printf(" -d, --ib-dev= use IB device (default first device found)\n"); printf(" -i, 
--ib-port= use port of IB device (default 1)\n"); printf(" -s, --size= size of message to exchange (default 4096)\n"); printf(" -m, --mtu= path MTU (default 1024)\n"); printf(" -r, --rx-depth= number of receives to post at a time (default 500)\n"); printf(" -n, --iters= number of exchanges (default 1000)\n"); printf(" -l, --sl= service level value\n"); printf(" -e, --events sleep on CQ events (default poll)\n"); printf(" -g, --gid-idx= local port gid index\n"); printf(" -o, --odp use on demand paging\n"); printf(" -O, --iodp use implicit on demand paging\n"); printf(" -P, --prefetch prefetch an ODP MR\n"); printf(" -t, --ts get CQE with timestamp\n"); printf(" -c, --chk validate received buffer\n"); printf(" -j, --dm use device memory\n"); printf(" -N, --new_send use new post send WR API\n"); } int main(int argc, char *argv[]) { struct ibv_device **dev_list; struct ibv_device *ib_dev; struct pingpong_context *ctx; struct pingpong_dest my_dest; struct pingpong_dest *rem_dest; struct timeval start, end; char *ib_devname = NULL; char *servername = NULL; unsigned int port = 18515; int ib_port = 1; unsigned int size = 4096; enum ibv_mtu mtu = IBV_MTU_1024; unsigned int rx_depth = 500; unsigned int iters = 1000; int use_event = 0; int routs; int rcnt, scnt; int num_cq_events = 0; int sl = 0; int gidx = -1; char gid[33]; struct ts_params ts; srand48(getpid() * time(NULL)); while (1) { int c; static struct option long_options[] = { { .name = "port", .has_arg = 1, .val = 'p' }, { .name = "ib-dev", .has_arg = 1, .val = 'd' }, { .name = "ib-port", .has_arg = 1, .val = 'i' }, { .name = "size", .has_arg = 1, .val = 's' }, { .name = "mtu", .has_arg = 1, .val = 'm' }, { .name = "rx-depth", .has_arg = 1, .val = 'r' }, { .name = "iters", .has_arg = 1, .val = 'n' }, { .name = "sl", .has_arg = 1, .val = 'l' }, { .name = "events", .has_arg = 0, .val = 'e' }, { .name = "gid-idx", .has_arg = 1, .val = 'g' }, { .name = "odp", .has_arg = 0, .val = 'o' }, { .name = "iodp", .has_arg = 0, .val = 'O' }, { .name = "prefetch", .has_arg = 0, .val = 'P' }, { .name = "ts", .has_arg = 0, .val = 't' }, { .name = "chk", .has_arg = 0, .val = 'c' }, { .name = "dm", .has_arg = 0, .val = 'j' }, { .name = "new_send", .has_arg = 0, .val = 'N' }, {} }; c = getopt_long(argc, argv, "p:d:i:s:m:r:n:l:eg:oOPtcjN", long_options, NULL); if (c == -1) break; switch (c) { case 'p': port = strtoul(optarg, NULL, 0); if (port > 65535) { usage(argv[0]); return 1; } break; case 'd': ib_devname = strdupa(optarg); break; case 'i': ib_port = strtol(optarg, NULL, 0); if (ib_port < 1) { usage(argv[0]); return 1; } break; case 's': size = strtoul(optarg, NULL, 0); break; case 'm': mtu = pp_mtu_to_enum(strtol(optarg, NULL, 0)); if (mtu == 0) { usage(argv[0]); return 1; } break; case 'r': rx_depth = strtoul(optarg, NULL, 0); break; case 'n': iters = strtoul(optarg, NULL, 0); break; case 'l': sl = strtol(optarg, NULL, 0); break; case 'e': ++use_event; break; case 'g': gidx = strtol(optarg, NULL, 0); break; case 'o': use_odp = 1; break; case 'P': prefetch_mr = 1; break; case 'O': use_odp = 1; implicit_odp = 1; break; case 't': use_ts = 1; break; case 'c': validate_buf = 1; break; case 'j': use_dm = 1; break; case 'N': use_new_send = 1; break; default: usage(argv[0]); return 1; } } if (optind == argc - 1) servername = strdupa(argv[optind]); else if (optind < argc) { usage(argv[0]); return 1; } if (use_odp && use_dm) { fprintf(stderr, "DM memory region can't be on demand\n"); return 1; } if (!use_odp && prefetch_mr) { fprintf(stderr, "prefetch is valid only 
with on-demand memory region\n"); return 1; } if (use_ts) { ts.comp_recv_max_time_delta = 0; ts.comp_recv_min_time_delta = 0xffffffff; ts.comp_recv_total_time_delta = 0; ts.comp_recv_prev_time = 0; ts.last_comp_with_ts = 0; ts.comp_with_time_iters = 0; } page_size = sysconf(_SC_PAGESIZE); dev_list = ibv_get_device_list(NULL); if (!dev_list) { perror("Failed to get IB devices list"); return 1; } if (!ib_devname) { ib_dev = *dev_list; if (!ib_dev) { fprintf(stderr, "No IB devices found\n"); return 1; } } else { int i; for (i = 0; dev_list[i]; ++i) if (!strcmp(ibv_get_device_name(dev_list[i]), ib_devname)) break; ib_dev = dev_list[i]; if (!ib_dev) { fprintf(stderr, "IB device %s not found\n", ib_devname); return 1; } } ctx = pp_init_ctx(ib_dev, size, rx_depth, ib_port, use_event); if (!ctx) return 1; routs = pp_post_recv(ctx, ctx->rx_depth); if (routs < ctx->rx_depth) { fprintf(stderr, "Couldn't post receive (%d)\n", routs); return 1; } if (use_event) if (ibv_req_notify_cq(pp_cq(ctx), 0)) { fprintf(stderr, "Couldn't request CQ notification\n"); return 1; } if (pp_get_port_info(ctx->context, ib_port, &ctx->portinfo)) { fprintf(stderr, "Couldn't get port info\n"); return 1; } my_dest.lid = ctx->portinfo.lid; if (ctx->portinfo.link_layer != IBV_LINK_LAYER_ETHERNET && !my_dest.lid) { fprintf(stderr, "Couldn't get local LID\n"); return 1; } if (gidx >= 0) { if (ibv_query_gid(ctx->context, ib_port, gidx, &my_dest.gid)) { fprintf(stderr, "can't read sgid of index %d\n", gidx); return 1; } } else memset(&my_dest.gid, 0, sizeof my_dest.gid); my_dest.qpn = ctx->qp->qp_num; my_dest.psn = lrand48() & 0xffffff; inet_ntop(AF_INET6, &my_dest.gid, gid, sizeof gid); printf(" local address: LID 0x%04x, QPN 0x%06x, PSN 0x%06x, GID %s\n", my_dest.lid, my_dest.qpn, my_dest.psn, gid); if (servername) rem_dest = pp_client_exch_dest(servername, port, &my_dest); else rem_dest = pp_server_exch_dest(ctx, ib_port, mtu, port, sl, &my_dest, gidx); if (!rem_dest) return 1; inet_ntop(AF_INET6, &rem_dest->gid, gid, sizeof gid); printf(" remote address: LID 0x%04x, QPN 0x%06x, PSN 0x%06x, GID %s\n", rem_dest->lid, rem_dest->qpn, rem_dest->psn, gid); if (servername) if (pp_connect_ctx(ctx, ib_port, my_dest.psn, mtu, sl, rem_dest, gidx)) return 1; ctx->pending = PINGPONG_RECV_WRID; if (servername) { if (validate_buf) for (int i = 0; i < size; i += page_size) ctx->buf[i] = i / page_size % sizeof(char); if (use_dm) if (ibv_memcpy_to_dm(ctx->dm, 0, (void *)ctx->buf, size)) { fprintf(stderr, "Copy to dm buffer failed\n"); return 1; } if (pp_post_send(ctx)) { fprintf(stderr, "Couldn't post send\n"); return 1; } ctx->pending |= PINGPONG_SEND_WRID; } if (gettimeofday(&start, NULL)) { perror("gettimeofday"); return 1; } rcnt = scnt = 0; while (rcnt < iters || scnt < iters) { int ret; if (use_event) { struct ibv_cq *ev_cq; void *ev_ctx; if (ibv_get_cq_event(ctx->channel, &ev_cq, &ev_ctx)) { fprintf(stderr, "Failed to get cq_event\n"); return 1; } ++num_cq_events; if (ev_cq != pp_cq(ctx)) { fprintf(stderr, "CQ event for unknown CQ %p\n", ev_cq); return 1; } if (ibv_req_notify_cq(pp_cq(ctx), 0)) { fprintf(stderr, "Couldn't request CQ notification\n"); return 1; } } if (use_ts) { struct ibv_poll_cq_attr attr = {}; do { ret = ibv_start_poll(ctx->cq_s.cq_ex, &attr); } while (!use_event && ret == ENOENT); if (ret) { fprintf(stderr, "poll CQ failed %d\n", ret); return ret; } ret = parse_single_wc(ctx, &scnt, &rcnt, &routs, iters, ctx->cq_s.cq_ex->wr_id, ctx->cq_s.cq_ex->status, ibv_wc_read_completion_ts(ctx->cq_s.cq_ex), &ts); if (ret) { 
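				/* Editorial note: ibv_start_poll() succeeded
				 * above, so this error path must still call
				 * ibv_end_poll() before returning; skipping it
				 * would leave the CQ stuck in polling state. */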
ibv_end_poll(ctx->cq_s.cq_ex); return ret; } ret = ibv_next_poll(ctx->cq_s.cq_ex); if (!ret) ret = parse_single_wc(ctx, &scnt, &rcnt, &routs, iters, ctx->cq_s.cq_ex->wr_id, ctx->cq_s.cq_ex->status, ibv_wc_read_completion_ts(ctx->cq_s.cq_ex), &ts); ibv_end_poll(ctx->cq_s.cq_ex); if (ret && ret != ENOENT) { fprintf(stderr, "poll CQ failed %d\n", ret); return ret; } } else { int ne, i; struct ibv_wc wc[2]; do { ne = ibv_poll_cq(pp_cq(ctx), 2, wc); if (ne < 0) { fprintf(stderr, "poll CQ failed %d\n", ne); return 1; } } while (!use_event && ne < 1); for (i = 0; i < ne; ++i) { ret = parse_single_wc(ctx, &scnt, &rcnt, &routs, iters, wc[i].wr_id, wc[i].status, 0, &ts); if (ret) { fprintf(stderr, "parse WC failed %d\n", ne); return 1; } } } } if (gettimeofday(&end, NULL)) { perror("gettimeofday"); return 1; } { float usec = (end.tv_sec - start.tv_sec) * 1000000 + (end.tv_usec - start.tv_usec); long long bytes = (long long) size * iters * 2; printf("%lld bytes in %.2f seconds = %.2f Mbit/sec\n", bytes, usec / 1000000., bytes * 8. / usec); printf("%d iters in %.2f seconds = %.2f usec/iter\n", iters, usec / 1000000., usec / iters); if (use_ts && ts.comp_with_time_iters) { printf("Max receive completion clock cycles = %" PRIu64 "\n", ts.comp_recv_max_time_delta); printf("Min receive completion clock cycles = %" PRIu64 "\n", ts.comp_recv_min_time_delta); printf("Average receive completion clock cycles = %f\n", (double)ts.comp_recv_total_time_delta / ts.comp_with_time_iters); } if ((!servername) && (validate_buf)) { if (use_dm) if (ibv_memcpy_from_dm(ctx->buf, ctx->dm, 0, size)) { fprintf(stderr, "Copy from DM buffer failed\n"); return 1; } for (int i = 0; i < size; i += page_size) if (ctx->buf[i] != i / page_size % sizeof(char)) printf("invalid data in page %d\n", i / page_size); } } ibv_ack_cq_events(pp_cq(ctx), num_cq_events); if (pp_close_ctx(ctx)) return 1; ibv_free_device_list(dev_list); free(rem_dest); return 0; } rdma-core-56.1/libibverbs/examples/srq_pingpong.c000066400000000000000000000572761477342711600221660ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
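Editorial note: the timestamp statistics in parse_single_wc() above must tolerate the free-running HCA clock wrapping. completion_timestamp_mask has all valid timestamp bits set, so when a new completion timestamp compares below the previous one, the delta is taken modulo mask + 1. The same arithmetic in isolation, as a sketch:

/* Wrap-safe delta between two HCA completion timestamps; a sketch of the
 * computation used in parse_single_wc(). mask is
 * ibv_device_attr_ex.completion_timestamp_mask (all valid bits set). */
#include <stdint.h>

static uint64_t ts_delta(uint64_t prev, uint64_t now, uint64_t mask)
{
	if (now >= prev)
		return now - prev;
	/* Clock wrapped: distance up to the wrap point plus the new value. */
	return (mask - prev) + now + 1;
}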
*/ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include "pingpong.h" enum { PINGPONG_RECV_WRID = 1, PINGPONG_SEND_WRID = 2, MAX_QP = 256, }; static int page_size; static int validate_buf; static int use_odp; struct pingpong_context { struct ibv_context *context; struct ibv_comp_channel *channel; struct ibv_pd *pd; struct ibv_mr *mr; struct ibv_cq *cq; struct ibv_srq *srq; struct ibv_qp *qp[MAX_QP]; char *buf; int size; int send_flags; int num_qp; int rx_depth; int pending[MAX_QP]; struct ibv_port_attr portinfo; }; struct pingpong_dest { int lid; int qpn; int psn; union ibv_gid gid; }; static int pp_connect_ctx(struct pingpong_context *ctx, int port, enum ibv_mtu mtu, int sl, const struct pingpong_dest *my_dest, const struct pingpong_dest *dest, int sgid_idx) { int i; for (i = 0; i < ctx->num_qp; ++i) { struct ibv_qp_attr attr = { .qp_state = IBV_QPS_RTR, .path_mtu = mtu, .dest_qp_num = dest[i].qpn, .rq_psn = dest[i].psn, .max_dest_rd_atomic = 1, .min_rnr_timer = 12, .ah_attr = { .is_global = 0, .dlid = dest[i].lid, .sl = sl, .src_path_bits = 0, .port_num = port } }; if (dest->gid.global.interface_id) { attr.ah_attr.is_global = 1; attr.ah_attr.grh.hop_limit = 1; attr.ah_attr.grh.dgid = dest->gid; attr.ah_attr.grh.sgid_index = sgid_idx; } if (ibv_modify_qp(ctx->qp[i], &attr, IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU | IBV_QP_DEST_QPN | IBV_QP_RQ_PSN | IBV_QP_MAX_DEST_RD_ATOMIC | IBV_QP_MIN_RNR_TIMER)) { fprintf(stderr, "Failed to modify QP[%d] to RTR\n", i); return 1; } attr.qp_state = IBV_QPS_RTS; attr.timeout = 14; attr.retry_cnt = 7; attr.rnr_retry = 7; attr.sq_psn = my_dest[i].psn; attr.max_rd_atomic = 1; if (ibv_modify_qp(ctx->qp[i], &attr, IBV_QP_STATE | IBV_QP_TIMEOUT | IBV_QP_RETRY_CNT | IBV_QP_RNR_RETRY | IBV_QP_SQ_PSN | IBV_QP_MAX_QP_RD_ATOMIC)) { fprintf(stderr, "Failed to modify QP[%d] to RTS\n", i); return 1; } } return 0; } static struct pingpong_dest *pp_client_exch_dest(const char *servername, int port, const struct pingpong_dest *my_dest) { struct addrinfo *res, *t; struct addrinfo hints = { .ai_family = AF_UNSPEC, .ai_socktype = SOCK_STREAM }; char *service; char msg[sizeof "0000:000000:000000:00000000000000000000000000000000"]; int n; int r; int i; int sockfd = -1; struct pingpong_dest *rem_dest = NULL; char gid[33]; if (asprintf(&service, "%d", port) < 0) return NULL; n = getaddrinfo(servername, service, &hints, &res); if (n < 0) { fprintf(stderr, "%s for %s:%d\n", gai_strerror(n), servername, port); free(service); return NULL; } for (t = res; t; t = t->ai_next) { sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol); if (sockfd >= 0) { if (!connect(sockfd, t->ai_addr, t->ai_addrlen)) break; close(sockfd); sockfd = -1; } } freeaddrinfo(res); free(service); if (sockfd < 0) { fprintf(stderr, "Couldn't connect to %s:%d\n", servername, port); return NULL; } for (i = 0; i < MAX_QP; ++i) { gid_to_wire_gid(&my_dest[i].gid, gid); sprintf(msg, "%04x:%06x:%06x:%s", my_dest[i].lid, my_dest[i].qpn, my_dest[i].psn, gid); if (write(sockfd, msg, sizeof msg) != sizeof msg) { fprintf(stderr, "Couldn't send local address\n"); goto out; } } rem_dest = malloc(MAX_QP * sizeof *rem_dest); if (!rem_dest) goto out; for (i = 0; i < MAX_QP; ++i) { n = 0; while (n < sizeof msg) { r = read(sockfd, msg + n, sizeof msg - n); if (r < 0) { perror("client read"); fprintf(stderr, "%d/%d: Couldn't read remote address [%d]\n", n, (int) sizeof msg, i); goto out; } n += r; } sscanf(msg, "%x:%x:%x:%s", 
&rem_dest[i].lid, &rem_dest[i].qpn, &rem_dest[i].psn, gid); wire_gid_to_gid(gid, &rem_dest[i].gid); } if (write(sockfd, "done", sizeof "done") != sizeof "done") { perror("client write"); goto out; } out: close(sockfd); return rem_dest; } static struct pingpong_dest *pp_server_exch_dest(struct pingpong_context *ctx, int ib_port, enum ibv_mtu mtu, int port, int sl, const struct pingpong_dest *my_dest, int sgid_idx) { struct addrinfo *res, *t; struct addrinfo hints = { .ai_flags = AI_PASSIVE, .ai_family = AF_UNSPEC, .ai_socktype = SOCK_STREAM }; char *service; char msg[sizeof "0000:000000:000000:00000000000000000000000000000000"]; int n; int r; int i; int sockfd = -1, connfd; struct pingpong_dest *rem_dest = NULL; char gid[33]; if (asprintf(&service, "%d", port) < 0) return NULL; n = getaddrinfo(NULL, service, &hints, &res); if (n < 0) { fprintf(stderr, "%s for port %d\n", gai_strerror(n), port); free(service); return NULL; } for (t = res; t; t = t->ai_next) { sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol); if (sockfd >= 0) { n = 1; setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &n, sizeof n); if (!bind(sockfd, t->ai_addr, t->ai_addrlen)) break; close(sockfd); sockfd = -1; } } freeaddrinfo(res); free(service); if (sockfd < 0) { fprintf(stderr, "Couldn't listen to port %d\n", port); return NULL; } listen(sockfd, 1); connfd = accept(sockfd, NULL, NULL); close(sockfd); if (connfd < 0) { fprintf(stderr, "accept() failed\n"); return NULL; } rem_dest = malloc(MAX_QP * sizeof *rem_dest); if (!rem_dest) goto out; for (i = 0; i < MAX_QP; ++i) { n = 0; while (n < sizeof msg) { r = read(connfd, msg + n, sizeof msg - n); if (r < 0) { perror("server read"); fprintf(stderr, "%d/%d: Couldn't read remote address [%d]\n", n, (int) sizeof msg, i); goto out; } n += r; } sscanf(msg, "%x:%x:%x:%s", &rem_dest[i].lid, &rem_dest[i].qpn, &rem_dest[i].psn, gid); wire_gid_to_gid(gid, &rem_dest[i].gid); } if (pp_connect_ctx(ctx, ib_port, mtu, sl, my_dest, rem_dest, sgid_idx)) { fprintf(stderr, "Couldn't connect to remote QP\n"); free(rem_dest); rem_dest = NULL; goto out; } for (i = 0; i < MAX_QP; ++i) { gid_to_wire_gid(&my_dest[i].gid, gid); sprintf(msg, "%04x:%06x:%06x:%s", my_dest[i].lid, my_dest[i].qpn, my_dest[i].psn, gid); if (write(connfd, msg, sizeof msg) != sizeof msg) { fprintf(stderr, "Couldn't send local address\n"); free(rem_dest); rem_dest = NULL; goto out; } } if (read(connfd, msg, sizeof msg) != sizeof "done") { perror("client write"); free(rem_dest); rem_dest = NULL; goto out; } out: close(connfd); return rem_dest; } static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size, int num_qp, int rx_depth, int port, int use_event) { struct pingpong_context *ctx; int i; int access_flags = IBV_ACCESS_LOCAL_WRITE; ctx = calloc(1, sizeof *ctx); if (!ctx) return NULL; ctx->size = size; ctx->send_flags = IBV_SEND_SIGNALED; ctx->num_qp = num_qp; ctx->rx_depth = rx_depth; ctx->buf = memalign(page_size, size); if (!ctx->buf) { fprintf(stderr, "Couldn't allocate work buf.\n"); goto clean_ctx; } memset(ctx->buf, 0, size); ctx->context = ibv_open_device(ib_dev); if (!ctx->context) { fprintf(stderr, "Couldn't get context for %s\n", ibv_get_device_name(ib_dev)); goto clean_buffer; } if (use_odp) { struct ibv_device_attr_ex attrx; const uint32_t rc_caps_mask = IBV_ODP_SUPPORT_SEND | IBV_ODP_SUPPORT_SRQ_RECV; if (ibv_query_device_ex(ctx->context, NULL, &attrx)) { fprintf(stderr, "Couldn't query device for its features\n"); goto clean_device; } if (!(attrx.odp_caps.general_caps & 
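		    /* ODP requires both the device-wide capability bit and
		     * the per-transport RC bits for send and SRQ receive,
		     * checked just below.
		     */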
IBV_ODP_SUPPORT) || (attrx.odp_caps.per_transport_caps.rc_odp_caps & rc_caps_mask) != rc_caps_mask) { fprintf(stderr, "The device isn't ODP capable or does not support RC send, receive and srq with ODP\n"); goto clean_device; } access_flags |= IBV_ACCESS_ON_DEMAND; } if (use_event) { ctx->channel = ibv_create_comp_channel(ctx->context); if (!ctx->channel) { fprintf(stderr, "Couldn't create completion channel\n"); goto clean_device; } } else ctx->channel = NULL; ctx->pd = ibv_alloc_pd(ctx->context); if (!ctx->pd) { fprintf(stderr, "Couldn't allocate PD\n"); goto clean_comp_channel; } ctx->mr = ibv_reg_mr(ctx->pd, ctx->buf, size, access_flags); if (!ctx->mr) { fprintf(stderr, "Couldn't register MR\n"); goto clean_pd; } ctx->cq = ibv_create_cq(ctx->context, rx_depth + num_qp, NULL, ctx->channel, 0); if (!ctx->cq) { fprintf(stderr, "Couldn't create CQ\n"); goto clean_mr; } { struct ibv_srq_init_attr attr = { .attr = { .max_wr = rx_depth, .max_sge = 1 } }; ctx->srq = ibv_create_srq(ctx->pd, &attr); if (!ctx->srq) { fprintf(stderr, "Couldn't create SRQ\n"); goto clean_cq; } } for (i = 0; i < num_qp; ++i) { struct ibv_qp_attr attr; struct ibv_qp_init_attr init_attr = { .send_cq = ctx->cq, .recv_cq = ctx->cq, .srq = ctx->srq, .cap = { .max_send_wr = 1, .max_send_sge = 1, }, .qp_type = IBV_QPT_RC }; ctx->qp[i] = ibv_create_qp(ctx->pd, &init_attr); if (!ctx->qp[i]) { fprintf(stderr, "Couldn't create QP[%d]\n", i); goto clean_qps; } ibv_query_qp(ctx->qp[i], &attr, IBV_QP_CAP, &init_attr); if (init_attr.cap.max_inline_data >= size) { ctx->send_flags |= IBV_SEND_INLINE; } } for (i = 0; i < num_qp; ++i) { struct ibv_qp_attr attr = { .qp_state = IBV_QPS_INIT, .pkey_index = 0, .port_num = port, .qp_access_flags = 0 }; if (ibv_modify_qp(ctx->qp[i], &attr, IBV_QP_STATE | IBV_QP_PKEY_INDEX | IBV_QP_PORT | IBV_QP_ACCESS_FLAGS)) { fprintf(stderr, "Failed to modify QP[%d] to INIT\n", i); goto clean_qps_full; } } return ctx; clean_qps_full: i = num_qp; clean_qps: for (--i; i >= 0; --i) ibv_destroy_qp(ctx->qp[i]); ibv_destroy_srq(ctx->srq); clean_cq: ibv_destroy_cq(ctx->cq); clean_mr: ibv_dereg_mr(ctx->mr); clean_pd: ibv_dealloc_pd(ctx->pd); clean_comp_channel: if (ctx->channel) ibv_destroy_comp_channel(ctx->channel); clean_device: ibv_close_device(ctx->context); clean_buffer: free(ctx->buf); clean_ctx: free(ctx); return NULL; } static int pp_close_ctx(struct pingpong_context *ctx, int num_qp) { int i; for (i = 0; i < num_qp; ++i) { if (ibv_destroy_qp(ctx->qp[i])) { fprintf(stderr, "Couldn't destroy QP[%d]\n", i); return 1; } } if (ibv_destroy_srq(ctx->srq)) { fprintf(stderr, "Couldn't destroy SRQ\n"); return 1; } if (ibv_destroy_cq(ctx->cq)) { fprintf(stderr, "Couldn't destroy CQ\n"); return 1; } if (ibv_dereg_mr(ctx->mr)) { fprintf(stderr, "Couldn't deregister MR\n"); return 1; } if (ibv_dealloc_pd(ctx->pd)) { fprintf(stderr, "Couldn't deallocate PD\n"); return 1; } if (ctx->channel) { if (ibv_destroy_comp_channel(ctx->channel)) { fprintf(stderr, "Couldn't destroy completion channel\n"); return 1; } } if (ibv_close_device(ctx->context)) { fprintf(stderr, "Couldn't release context\n"); return 1; } free(ctx->buf); free(ctx); return 0; } static int pp_post_recv(struct pingpong_context *ctx, int n) { struct ibv_sge list = { .addr = (uintptr_t) ctx->buf, .length = ctx->size, .lkey = ctx->mr->lkey }; struct ibv_recv_wr wr = { .wr_id = PINGPONG_RECV_WRID, .sg_list = &list, .num_sge = 1, }; struct ibv_recv_wr *bad_wr; int i; for (i = 0; i < n; ++i) if (ibv_post_srq_recv(ctx->srq, &wr, &bad_wr)) break; return i; } 
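/*
 * Illustrative sketch, not part of the original example: with an SRQ, all
 * QPs consume receives from the single shared ring filled by pp_post_recv()
 * above, so the completion loop in main() only tracks one count of
 * outstanding receives ("routs") regardless of how many QPs are active.
 * The hypothetical helper below condenses the replenish step that loop
 * performs whenever routs drops to num_qp or below.
 */
static inline int pp_replenish_srq(struct pingpong_context *ctx, int *routs)
{
	/* Top the shared ring back up to rx_depth; a shortfall means
	 * ibv_post_srq_recv() failed inside pp_post_recv(). */
	*routs += pp_post_recv(ctx, ctx->rx_depth - *routs);
	return *routs < ctx->rx_depth ? -1 : 0;
}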
static int pp_post_send(struct pingpong_context *ctx, int qp_index) { struct ibv_sge list = { .addr = (uintptr_t) ctx->buf, .length = ctx->size, .lkey = ctx->mr->lkey }; struct ibv_send_wr wr = { .wr_id = PINGPONG_SEND_WRID, .sg_list = &list, .num_sge = 1, .opcode = IBV_WR_SEND, .send_flags = ctx->send_flags, }; struct ibv_send_wr *bad_wr; return ibv_post_send(ctx->qp[qp_index], &wr, &bad_wr); } static int find_qp(int qpn, struct pingpong_context *ctx, int num_qp) { int i; for (i = 0; i < num_qp; ++i) if (ctx->qp[i]->qp_num == qpn) return i; return -1; } static void usage(const char *argv0) { printf("Usage:\n"); printf(" %s start a server and wait for connection\n", argv0); printf(" %s connect to server at \n", argv0); printf("\n"); printf("Options:\n"); printf(" -p, --port= listen on/connect to port (default 18515)\n"); printf(" -d, --ib-dev= use IB device (default first device found)\n"); printf(" -i, --ib-port= use port of IB device (default 1)\n"); printf(" -s, --size= size of message to exchange (default 4096)\n"); printf(" -m, --mtu= path MTU (default 1024)\n"); printf(" -q, --num-qp= number of QPs to use (default 16)\n"); printf(" -r, --rx-depth= number of receives to post at a time (default 500)\n"); printf(" -n, --iters= number of exchanges per QP(default 1000)\n"); printf(" -l, --sl= service level value\n"); printf(" -e, --events sleep on CQ events (default poll)\n"); printf(" -g, --gid-idx= local port gid index\n"); printf(" -o, --odp use on demand paging\n"); printf(" -c, --chk validate received buffer\n"); } int main(int argc, char *argv[]) { struct ibv_device **dev_list; struct ibv_device *ib_dev; struct ibv_wc *wc; struct pingpong_context *ctx; struct pingpong_dest my_dest[MAX_QP]; struct pingpong_dest *rem_dest; struct timeval start, end; char *ib_devname = NULL; char *servername = NULL; unsigned int port = 18515; int ib_port = 1; unsigned int size = 4096; enum ibv_mtu mtu = IBV_MTU_1024; unsigned int num_qp = 16; unsigned int rx_depth = 500; unsigned int iters = 1000; int use_event = 0; int routs; int rcnt, scnt; int num_wc; int i; int num_cq_events = 0; int sl = 0; int gidx = -1; char gid[33]; srand48(getpid() * time(NULL)); while (1) { int c; static struct option long_options[] = { { .name = "port", .has_arg = 1, .val = 'p' }, { .name = "ib-dev", .has_arg = 1, .val = 'd' }, { .name = "ib-port", .has_arg = 1, .val = 'i' }, { .name = "size", .has_arg = 1, .val = 's' }, { .name = "mtu", .has_arg = 1, .val = 'm' }, { .name = "num-qp", .has_arg = 1, .val = 'q' }, { .name = "rx-depth", .has_arg = 1, .val = 'r' }, { .name = "iters", .has_arg = 1, .val = 'n' }, { .name = "sl", .has_arg = 1, .val = 'l' }, { .name = "events", .has_arg = 0, .val = 'e' }, { .name = "odp", .has_arg = 0, .val = 'o' }, { .name = "gid-idx", .has_arg = 1, .val = 'g' }, { .name = "chk", .has_arg = 0, .val = 'c' }, {} }; c = getopt_long(argc, argv, "p:d:i:s:m:q:r:n:l:eog:c", long_options, NULL); if (c == -1) break; switch (c) { case 'p': port = strtoul(optarg, NULL, 0); if (port > 65535) { usage(argv[0]); return 1; } break; case 'd': ib_devname = strdupa(optarg); break; case 'i': ib_port = strtol(optarg, NULL, 0); if (ib_port < 1) { usage(argv[0]); return 1; } break; case 's': size = strtoul(optarg, NULL, 0); if (size < 1) { usage(argv[0]); return 1; } break; case 'm': mtu = pp_mtu_to_enum(strtol(optarg, NULL, 0)); if (mtu == 0) { usage(argv[0]); return 1; } break; case 'q': num_qp = strtoul(optarg, NULL, 0); break; case 'r': rx_depth = strtoul(optarg, NULL, 0); break; case 'n': iters = strtoul(optarg, 
NULL, 0); break; case 'l': sl = strtol(optarg, NULL, 0); break; case 'e': ++use_event; break; case 'g': gidx = strtol(optarg, NULL, 0); break; case 'o': use_odp = 1; break; case 'c': validate_buf = 1; break; default: usage(argv[0]); return 1; } } if (optind == argc - 1) servername = strdupa(argv[optind]); else if (optind < argc) { usage(argv[0]); return 1; } if (num_qp > rx_depth) { fprintf(stderr, "rx_depth %d is too small for %d QPs -- " "must have at least one receive per QP.\n", rx_depth, num_qp); return 1; } if (num_qp >= MAX_QP) { fprintf(stderr, "num_qp %d must be less than %d\n", num_qp, MAX_QP - 1); return 1; } num_wc = num_qp + rx_depth; wc = alloca(num_wc * sizeof *wc); page_size = sysconf(_SC_PAGESIZE); dev_list = ibv_get_device_list(NULL); if (!dev_list) { perror("Failed to get IB devices list"); return 1; } if (!ib_devname) { ib_dev = *dev_list; if (!ib_dev) { fprintf(stderr, "No IB devices found\n"); return 1; } } else { for (i = 0; dev_list[i]; ++i) if (!strcmp(ibv_get_device_name(dev_list[i]), ib_devname)) break; ib_dev = dev_list[i]; if (!ib_dev) { fprintf(stderr, "IB device %s not found\n", ib_devname); return 1; } } ctx = pp_init_ctx(ib_dev, size, num_qp, rx_depth, ib_port, use_event); if (!ctx) return 1; routs = pp_post_recv(ctx, ctx->rx_depth); if (routs < ctx->rx_depth) { fprintf(stderr, "Couldn't post receive (%d)\n", routs); return 1; } if (use_event) if (ibv_req_notify_cq(ctx->cq, 0)) { fprintf(stderr, "Couldn't request CQ notification\n"); return 1; } memset(my_dest, 0, sizeof my_dest); if (pp_get_port_info(ctx->context, ib_port, &ctx->portinfo)) { fprintf(stderr, "Couldn't get port info\n"); return 1; } for (i = 0; i < num_qp; ++i) { my_dest[i].qpn = ctx->qp[i]->qp_num; my_dest[i].psn = lrand48() & 0xffffff; my_dest[i].lid = ctx->portinfo.lid; if (ctx->portinfo.link_layer != IBV_LINK_LAYER_ETHERNET && !my_dest[i].lid) { fprintf(stderr, "Couldn't get local LID\n"); return 1; } if (gidx >= 0) { if (ibv_query_gid(ctx->context, ib_port, gidx, &my_dest[i].gid)) { fprintf(stderr, "Could not get local gid for " "gid index %d\n", gidx); return 1; } } else memset(&my_dest[i].gid, 0, sizeof my_dest[i].gid); inet_ntop(AF_INET6, &my_dest[i].gid, gid, sizeof gid); printf(" local address: LID 0x%04x, QPN 0x%06x, PSN 0x%06x, " "GID %s\n", my_dest[i].lid, my_dest[i].qpn, my_dest[i].psn, gid); } if (servername) rem_dest = pp_client_exch_dest(servername, port, my_dest); else rem_dest = pp_server_exch_dest(ctx, ib_port, mtu, port, sl, my_dest, gidx); if (!rem_dest) return 1; inet_ntop(AF_INET6, &rem_dest->gid, gid, sizeof gid); for (i = 0; i < num_qp; ++i) { inet_ntop(AF_INET6, &rem_dest[i].gid, gid, sizeof gid); printf(" remote address: LID 0x%04x, QPN 0x%06x, PSN 0x%06x, " "GID %s\n", rem_dest[i].lid, rem_dest[i].qpn, rem_dest[i].psn, gid); } if (servername) if (pp_connect_ctx(ctx, ib_port, mtu, sl, my_dest, rem_dest, gidx)) return 1; if (servername) { if (validate_buf) for (i = 0; i < size; i += page_size) ctx->buf[i] = i / page_size % sizeof(char); for (i = 0; i < num_qp; ++i) { if (pp_post_send(ctx, i)) { fprintf(stderr, "Couldn't post send\n"); return 1; } ctx->pending[i] = PINGPONG_SEND_WRID | PINGPONG_RECV_WRID; } } else for (i = 0; i < num_qp; ++i) ctx->pending[i] = PINGPONG_RECV_WRID; if (gettimeofday(&start, NULL)) { perror("gettimeofday"); return 1; } rcnt = scnt = 0; while (rcnt < iters || scnt < iters) { if (use_event) { struct ibv_cq *ev_cq; void *ev_ctx; if (ibv_get_cq_event(ctx->channel, &ev_cq, &ev_ctx)) { fprintf(stderr, "Failed to get cq_event\n"); return 1; } 
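			/* Count the event so the whole batch can be
			 * acknowledged with ibv_ack_cq_events() after the
			 * loop, and re-arm the CQ below: ibv_req_notify_cq()
			 * only arms a single notification.
			 */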
++num_cq_events; if (ev_cq != ctx->cq) { fprintf(stderr, "CQ event for unknown CQ %p\n", ev_cq); return 1; } if (ibv_req_notify_cq(ctx->cq, 0)) { fprintf(stderr, "Couldn't request CQ notification\n"); return 1; } } { int ne, qp_ind; do { ne = ibv_poll_cq(ctx->cq, num_wc, wc); if (ne < 0) { fprintf(stderr, "poll CQ failed %d\n", ne); return 1; } } while (!use_event && ne < 1); for (i = 0; i < ne; ++i) { if (wc[i].status != IBV_WC_SUCCESS) { fprintf(stderr, "Failed status %s (%d) for wr_id %d\n", ibv_wc_status_str(wc[i].status), wc[i].status, (int) wc[i].wr_id); return 1; } qp_ind = find_qp(wc[i].qp_num, ctx, num_qp); if (qp_ind < 0) { fprintf(stderr, "Couldn't find QPN %06x\n", wc[i].qp_num); return 1; } switch ((int) wc[i].wr_id) { case PINGPONG_SEND_WRID: ++scnt; break; case PINGPONG_RECV_WRID: if (--routs <= num_qp) { routs += pp_post_recv(ctx, ctx->rx_depth - routs); if (routs < ctx->rx_depth) { fprintf(stderr, "Couldn't post receive (%d)\n", routs); return 1; } } ++rcnt; break; default: fprintf(stderr, "Completion for unknown wr_id %d\n", (int) wc[i].wr_id); return 1; } ctx->pending[qp_ind] &= ~(int) wc[i].wr_id; if (scnt < iters && !ctx->pending[qp_ind]) { if (pp_post_send(ctx, qp_ind)) { fprintf(stderr, "Couldn't post send\n"); return 1; } ctx->pending[qp_ind] = PINGPONG_RECV_WRID | PINGPONG_SEND_WRID; } } } } if (gettimeofday(&end, NULL)) { perror("gettimeofday"); return 1; } { float usec = (end.tv_sec - start.tv_sec) * 1000000 + (end.tv_usec - start.tv_usec); long long bytes = (long long) size * iters * 2; printf("%lld bytes in %.2f seconds = %.2f Mbit/sec\n", bytes, usec / 1000000., bytes * 8. / usec); printf("%d iters in %.2f seconds = %.2f usec/iter\n", iters, usec / 1000000., usec / iters); if ((!servername) && (validate_buf)) { for (i = 0; i < size; i += page_size) if (ctx->buf[i] != i / page_size % sizeof(char)) printf("invalid data in page %d\n", i / page_size); } } ibv_ack_cq_events(ctx->cq, num_cq_events); if (pp_close_ctx(ctx, num_qp)) return 1; ibv_free_device_list(dev_list); free(rem_dest); return 0; } rdma-core-56.1/libibverbs/examples/uc_pingpong.c000066400000000000000000000474621477342711600217640ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include "pingpong.h" enum { PINGPONG_RECV_WRID = 1, PINGPONG_SEND_WRID = 2, }; static int page_size; static int validate_buf; struct pingpong_context { struct ibv_context *context; struct ibv_comp_channel *channel; struct ibv_pd *pd; struct ibv_mr *mr; struct ibv_cq *cq; struct ibv_qp *qp; char *buf; int size; int send_flags; int rx_depth; int pending; struct ibv_port_attr portinfo; }; struct pingpong_dest { int lid; int qpn; int psn; union ibv_gid gid; }; static int pp_connect_ctx(struct pingpong_context *ctx, int port, int my_psn, enum ibv_mtu mtu, int sl, struct pingpong_dest *dest, int sgid_idx) { struct ibv_qp_attr attr = { .qp_state = IBV_QPS_RTR, .path_mtu = mtu, .dest_qp_num = dest->qpn, .rq_psn = dest->psn, .ah_attr = { .is_global = 0, .dlid = dest->lid, .sl = sl, .src_path_bits = 0, .port_num = port } }; if (dest->gid.global.interface_id) { attr.ah_attr.is_global = 1; attr.ah_attr.grh.hop_limit = 1; attr.ah_attr.grh.dgid = dest->gid; attr.ah_attr.grh.sgid_index = sgid_idx; } if (ibv_modify_qp(ctx->qp, &attr, IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU | IBV_QP_DEST_QPN | IBV_QP_RQ_PSN)) { fprintf(stderr, "Failed to modify QP to RTR\n"); return 1; } attr.qp_state = IBV_QPS_RTS; attr.sq_psn = my_psn; if (ibv_modify_qp(ctx->qp, &attr, IBV_QP_STATE | IBV_QP_SQ_PSN)) { fprintf(stderr, "Failed to modify QP to RTS\n"); return 1; } return 0; } static struct pingpong_dest *pp_client_exch_dest(const char *servername, int port, const struct pingpong_dest *my_dest) { struct addrinfo *res, *t; struct addrinfo hints = { .ai_family = AF_UNSPEC, .ai_socktype = SOCK_STREAM }; char *service; char msg[sizeof "0000:000000:000000:00000000000000000000000000000000"]; int n; int sockfd = -1; struct pingpong_dest *rem_dest = NULL; char gid[33]; if (asprintf(&service, "%d", port) < 0) return NULL; n = getaddrinfo(servername, service, &hints, &res); if (n < 0) { fprintf(stderr, "%s for %s:%d\n", gai_strerror(n), servername, port); free(service); return NULL; } for (t = res; t; t = t->ai_next) { sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol); if (sockfd >= 0) { if (!connect(sockfd, t->ai_addr, t->ai_addrlen)) break; close(sockfd); sockfd = -1; } } freeaddrinfo(res); free(service); if (sockfd < 0) { fprintf(stderr, "Couldn't connect to %s:%d\n", servername, port); return NULL; } gid_to_wire_gid(&my_dest->gid, gid); sprintf(msg, "%04x:%06x:%06x:%s", my_dest->lid, my_dest->qpn, my_dest->psn, gid); if (write(sockfd, msg, sizeof msg) != sizeof msg) { fprintf(stderr, "Couldn't send local address\n"); goto out; } if (read(sockfd, msg, sizeof msg) != sizeof msg || write(sockfd, "done", sizeof "done") != sizeof "done") { perror("client read/write"); fprintf(stderr, "Couldn't read/write remote address\n"); goto out; } rem_dest = malloc(sizeof *rem_dest); if (!rem_dest) goto out; sscanf(msg, "%x:%x:%x:%s", &rem_dest->lid, &rem_dest->qpn, &rem_dest->psn, gid); wire_gid_to_gid(gid, &rem_dest->gid); out: close(sockfd); return rem_dest; } static struct pingpong_dest *pp_server_exch_dest(struct pingpong_context *ctx, int ib_port, enum ibv_mtu mtu, int port, int sl, const struct pingpong_dest *my_dest, int sgid_idx) { struct addrinfo *res, *t; struct addrinfo hints = { .ai_flags = AI_PASSIVE, .ai_family = AF_UNSPEC, .ai_socktype = SOCK_STREAM }; char *service; char msg[sizeof "0000:000000:000000:00000000000000000000000000000000"]; int n; int sockfd = -1, connfd; struct 
pingpong_dest *rem_dest = NULL; char gid[33]; if (asprintf(&service, "%d", port) < 0) return NULL; n = getaddrinfo(NULL, service, &hints, &res); if (n < 0) { fprintf(stderr, "%s for port %d\n", gai_strerror(n), port); free(service); return NULL; } for (t = res; t; t = t->ai_next) { sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol); if (sockfd >= 0) { n = 1; setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &n, sizeof n); if (!bind(sockfd, t->ai_addr, t->ai_addrlen)) break; close(sockfd); sockfd = -1; } } freeaddrinfo(res); free(service); if (sockfd < 0) { fprintf(stderr, "Couldn't listen to port %d\n", port); return NULL; } listen(sockfd, 1); connfd = accept(sockfd, NULL, NULL); close(sockfd); if (connfd < 0) { fprintf(stderr, "accept() failed\n"); return NULL; } n = read(connfd, msg, sizeof msg); if (n != sizeof msg) { perror("server read"); fprintf(stderr, "%d/%d: Couldn't read remote address\n", n, (int) sizeof msg); goto out; } rem_dest = malloc(sizeof *rem_dest); if (!rem_dest) goto out; sscanf(msg, "%x:%x:%x:%s", &rem_dest->lid, &rem_dest->qpn, &rem_dest->psn, gid); wire_gid_to_gid(gid, &rem_dest->gid); if (pp_connect_ctx(ctx, ib_port, my_dest->psn, mtu, sl, rem_dest, sgid_idx)) { fprintf(stderr, "Couldn't connect to remote QP\n"); free(rem_dest); rem_dest = NULL; goto out; } gid_to_wire_gid(&my_dest->gid, gid); sprintf(msg, "%04x:%06x:%06x:%s", my_dest->lid, my_dest->qpn, my_dest->psn, gid); if (write(connfd, msg, sizeof msg) != sizeof msg || read(connfd, msg, sizeof msg) != sizeof "done") { fprintf(stderr, "Couldn't send/recv local address\n"); free(rem_dest); rem_dest = NULL; goto out; } out: close(connfd); return rem_dest; } static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size, int rx_depth, int port, int use_event) { struct pingpong_context *ctx; ctx = calloc(1, sizeof *ctx); if (!ctx) return NULL; ctx->size = size; ctx->send_flags = IBV_SEND_SIGNALED; ctx->rx_depth = rx_depth; ctx->buf = memalign(page_size, size); if (!ctx->buf) { fprintf(stderr, "Couldn't allocate work buf.\n"); goto clean_ctx; } /* FIXME memset(ctx->buf, 0, size); */ memset(ctx->buf, 0x7b, size); ctx->context = ibv_open_device(ib_dev); if (!ctx->context) { fprintf(stderr, "Couldn't get context for %s\n", ibv_get_device_name(ib_dev)); goto clean_buffer; } if (use_event) { ctx->channel = ibv_create_comp_channel(ctx->context); if (!ctx->channel) { fprintf(stderr, "Couldn't create completion channel\n"); goto clean_device; } } else ctx->channel = NULL; ctx->pd = ibv_alloc_pd(ctx->context); if (!ctx->pd) { fprintf(stderr, "Couldn't allocate PD\n"); goto clean_comp_channel; } ctx->mr = ibv_reg_mr(ctx->pd, ctx->buf, size, IBV_ACCESS_LOCAL_WRITE); if (!ctx->mr) { fprintf(stderr, "Couldn't register MR\n"); goto clean_pd; } ctx->cq = ibv_create_cq(ctx->context, rx_depth + 1, NULL, ctx->channel, 0); if (!ctx->cq) { fprintf(stderr, "Couldn't create CQ\n"); goto clean_mr; } { struct ibv_qp_attr attr; struct ibv_qp_init_attr init_attr = { .send_cq = ctx->cq, .recv_cq = ctx->cq, .cap = { .max_send_wr = 1, .max_recv_wr = rx_depth, .max_send_sge = 1, .max_recv_sge = 1 }, .qp_type = IBV_QPT_UC }; ctx->qp = ibv_create_qp(ctx->pd, &init_attr); if (!ctx->qp) { fprintf(stderr, "Couldn't create QP\n"); goto clean_cq; } ibv_query_qp(ctx->qp, &attr, IBV_QP_CAP, &init_attr); if (init_attr.cap.max_inline_data >= size) { ctx->send_flags |= IBV_SEND_INLINE; } } { struct ibv_qp_attr attr = { .qp_state = IBV_QPS_INIT, .pkey_index = 0, .port_num = port, .qp_access_flags = 0 }; if (ibv_modify_qp(ctx->qp, 
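			  /* RESET -> INIT: the state, pkey index, port and
			   * access-flags attributes must all be set in this
			   * one transition.
			   */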
&attr, IBV_QP_STATE | IBV_QP_PKEY_INDEX | IBV_QP_PORT | IBV_QP_ACCESS_FLAGS)) { fprintf(stderr, "Failed to modify QP to INIT\n"); goto clean_qp; } } return ctx; clean_qp: ibv_destroy_qp(ctx->qp); clean_cq: ibv_destroy_cq(ctx->cq); clean_mr: ibv_dereg_mr(ctx->mr); clean_pd: ibv_dealloc_pd(ctx->pd); clean_comp_channel: if (ctx->channel) ibv_destroy_comp_channel(ctx->channel); clean_device: ibv_close_device(ctx->context); clean_buffer: free(ctx->buf); clean_ctx: free(ctx); return NULL; } static int pp_close_ctx(struct pingpong_context *ctx) { if (ibv_destroy_qp(ctx->qp)) { fprintf(stderr, "Couldn't destroy QP\n"); return 1; } if (ibv_destroy_cq(ctx->cq)) { fprintf(stderr, "Couldn't destroy CQ\n"); return 1; } if (ibv_dereg_mr(ctx->mr)) { fprintf(stderr, "Couldn't deregister MR\n"); return 1; } if (ibv_dealloc_pd(ctx->pd)) { fprintf(stderr, "Couldn't deallocate PD\n"); return 1; } if (ctx->channel) { if (ibv_destroy_comp_channel(ctx->channel)) { fprintf(stderr, "Couldn't destroy completion channel\n"); return 1; } } if (ibv_close_device(ctx->context)) { fprintf(stderr, "Couldn't release context\n"); return 1; } free(ctx->buf); free(ctx); return 0; } static int pp_post_recv(struct pingpong_context *ctx, int n) { struct ibv_sge list = { .addr = (uintptr_t) ctx->buf, .length = ctx->size, .lkey = ctx->mr->lkey }; struct ibv_recv_wr wr = { .wr_id = PINGPONG_RECV_WRID, .sg_list = &list, .num_sge = 1, }; struct ibv_recv_wr *bad_wr; int i; for (i = 0; i < n; ++i) if (ibv_post_recv(ctx->qp, &wr, &bad_wr)) break; return i; } static int pp_post_send(struct pingpong_context *ctx) { struct ibv_sge list = { .addr = (uintptr_t) ctx->buf, .length = ctx->size, .lkey = ctx->mr->lkey }; struct ibv_send_wr wr = { .wr_id = PINGPONG_SEND_WRID, .sg_list = &list, .num_sge = 1, .opcode = IBV_WR_SEND, .send_flags = ctx->send_flags, }; struct ibv_send_wr *bad_wr; return ibv_post_send(ctx->qp, &wr, &bad_wr); } static void usage(const char *argv0) { printf("Usage:\n"); printf(" %s start a server and wait for connection\n", argv0); printf(" %s connect to server at \n", argv0); printf("\n"); printf("Options:\n"); printf(" -p, --port= listen on/connect to port (default 18515)\n"); printf(" -d, --ib-dev= use IB device (default first device found)\n"); printf(" -i, --ib-port= use port of IB device (default 1)\n"); printf(" -s, --size= size of message to exchange (default 4096)\n"); printf(" -m, --mtu= path MTU (default 1024)\n"); printf(" -r, --rx-depth= number of receives to post at a time (default 500)\n"); printf(" -n, --iters= number of exchanges (default 1000)\n"); printf(" -l, --sl= service level value\n"); printf(" -e, --events sleep on CQ events (default poll)\n"); printf(" -g, --gid-idx= local port gid index\n"); printf(" -c, --chk validate received buffer\n"); } int main(int argc, char *argv[]) { struct ibv_device **dev_list; struct ibv_device *ib_dev; struct pingpong_context *ctx; struct pingpong_dest my_dest; struct pingpong_dest *rem_dest; struct timeval start, end; char *ib_devname = NULL; char *servername = NULL; unsigned int port = 18515; int ib_port = 1; unsigned int size = 4096; enum ibv_mtu mtu = IBV_MTU_1024; unsigned int rx_depth = 500; unsigned int iters = 1000; int use_event = 0; int routs; int rcnt, scnt; int num_cq_events = 0; int sl = 0; int gidx = -1; char gid[33]; srand48(getpid() * time(NULL)); while (1) { int c; static struct option long_options[] = { { .name = "port", .has_arg = 1, .val = 'p' }, { .name = "ib-dev", .has_arg = 1, .val = 'd' }, { .name = "ib-port", .has_arg = 1, .val = 'i' }, { 
.name = "size", .has_arg = 1, .val = 's' }, { .name = "mtu", .has_arg = 1, .val = 'm' }, { .name = "rx-depth", .has_arg = 1, .val = 'r' }, { .name = "iters", .has_arg = 1, .val = 'n' }, { .name = "sl", .has_arg = 1, .val = 'l' }, { .name = "events", .has_arg = 0, .val = 'e' }, { .name = "gid-idx", .has_arg = 1, .val = 'g' }, { .name = "chk", .has_arg = 0, .val = 'c' }, {} }; c = getopt_long(argc, argv, "p:d:i:s:m:r:n:l:eg:c", long_options, NULL); if (c == -1) break; switch (c) { case 'p': port = strtoul(optarg, NULL, 0); if (port > 65535) { usage(argv[0]); return 1; } break; case 'd': ib_devname = strdupa(optarg); break; case 'i': ib_port = strtol(optarg, NULL, 0); if (ib_port < 1) { usage(argv[0]); return 1; } break; case 's': size = strtoul(optarg, NULL, 0); break; case 'm': mtu = pp_mtu_to_enum(strtol(optarg, NULL, 0)); if (mtu == 0) { usage(argv[0]); return 1; } break; case 'r': rx_depth = strtoul(optarg, NULL, 0); break; case 'n': iters = strtoul(optarg, NULL, 0); break; case 'l': sl = strtol(optarg, NULL, 0); break; case 'e': ++use_event; break; case 'g': gidx = strtol(optarg, NULL, 0); break; case 'c': validate_buf = 1; break; default: usage(argv[0]); return 1; } } if (optind == argc - 1) servername = strdupa(argv[optind]); else if (optind < argc) { usage(argv[0]); return 1; } page_size = sysconf(_SC_PAGESIZE); dev_list = ibv_get_device_list(NULL); if (!dev_list) { perror("Failed to get IB devices list"); return 1; } if (!ib_devname) { ib_dev = *dev_list; if (!ib_dev) { fprintf(stderr, "No IB devices found\n"); return 1; } } else { int i; for (i = 0; dev_list[i]; ++i) if (!strcmp(ibv_get_device_name(dev_list[i]), ib_devname)) break; ib_dev = dev_list[i]; if (!ib_dev) { fprintf(stderr, "IB device %s not found\n", ib_devname); return 1; } } ctx = pp_init_ctx(ib_dev, size, rx_depth, ib_port, use_event); if (!ctx) return 1; routs = pp_post_recv(ctx, ctx->rx_depth); if (routs < ctx->rx_depth) { fprintf(stderr, "Couldn't post receive (%d)\n", routs); return 1; } if (use_event) if (ibv_req_notify_cq(ctx->cq, 0)) { fprintf(stderr, "Couldn't request CQ notification\n"); return 1; } if (pp_get_port_info(ctx->context, ib_port, &ctx->portinfo)) { fprintf(stderr, "Couldn't get port info\n"); return 1; } my_dest.lid = ctx->portinfo.lid; if (ctx->portinfo.link_layer != IBV_LINK_LAYER_ETHERNET && !my_dest.lid) { fprintf(stderr, "Couldn't get local LID\n"); return 1; } if (gidx >= 0) { if (ibv_query_gid(ctx->context, ib_port, gidx, &my_dest.gid)) { fprintf(stderr, "can't read sgid of index %d\n", gidx); return 1; } } else memset(&my_dest.gid, 0, sizeof my_dest.gid); my_dest.qpn = ctx->qp->qp_num; my_dest.psn = lrand48() & 0xffffff; inet_ntop(AF_INET6, &my_dest.gid, gid, sizeof gid); printf(" local address: LID 0x%04x, QPN 0x%06x, PSN 0x%06x, GID %s\n", my_dest.lid, my_dest.qpn, my_dest.psn, gid); if (servername) rem_dest = pp_client_exch_dest(servername, port, &my_dest); else rem_dest = pp_server_exch_dest(ctx, ib_port, mtu, port, sl, &my_dest, gidx); if (!rem_dest) return 1; inet_ntop(AF_INET6, &rem_dest->gid, gid, sizeof gid); printf(" remote address: LID 0x%04x, QPN 0x%06x, PSN 0x%06x, GID %s\n", rem_dest->lid, rem_dest->qpn, rem_dest->psn, gid); if (servername) if (pp_connect_ctx(ctx, ib_port, my_dest.psn, mtu, sl, rem_dest, gidx)) return 1; ctx->pending = PINGPONG_RECV_WRID; if (servername) { if (validate_buf) for (int i = 0; i < size; i += page_size) ctx->buf[i] = i / page_size % sizeof(char); if (pp_post_send(ctx)) { fprintf(stderr, "Couldn't post send\n"); return 1; } ctx->pending |= 
PINGPONG_SEND_WRID; } if (gettimeofday(&start, NULL)) { perror("gettimeofday"); return 1; } rcnt = scnt = 0; while (rcnt < iters || scnt < iters) { if (use_event) { struct ibv_cq *ev_cq; void *ev_ctx; if (ibv_get_cq_event(ctx->channel, &ev_cq, &ev_ctx)) { fprintf(stderr, "Failed to get cq_event\n"); return 1; } ++num_cq_events; if (ev_cq != ctx->cq) { fprintf(stderr, "CQ event for unknown CQ %p\n", ev_cq); return 1; } if (ibv_req_notify_cq(ctx->cq, 0)) { fprintf(stderr, "Couldn't request CQ notification\n"); return 1; } } { struct ibv_wc wc[2]; int ne, i; do { ne = ibv_poll_cq(ctx->cq, 2, wc); if (ne < 0) { fprintf(stderr, "poll CQ failed %d\n", ne); return 1; } } while (!use_event && ne < 1); for (i = 0; i < ne; ++i) { if (wc[i].status != IBV_WC_SUCCESS) { fprintf(stderr, "Failed status %s (%d) for wr_id %d\n", ibv_wc_status_str(wc[i].status), wc[i].status, (int) wc[i].wr_id); return 1; } switch ((int) wc[i].wr_id) { case PINGPONG_SEND_WRID: ++scnt; break; case PINGPONG_RECV_WRID: if (--routs <= 1) { routs += pp_post_recv(ctx, ctx->rx_depth - routs); if (routs < ctx->rx_depth) { fprintf(stderr, "Couldn't post receive (%d)\n", routs); return 1; } } ++rcnt; break; default: fprintf(stderr, "Completion for unknown wr_id %d\n", (int) wc[i].wr_id); return 1; } ctx->pending &= ~(int) wc[i].wr_id; if (scnt < iters && !ctx->pending) { if (pp_post_send(ctx)) { fprintf(stderr, "Couldn't post send\n"); return 1; } ctx->pending = PINGPONG_RECV_WRID | PINGPONG_SEND_WRID; } } } } if (gettimeofday(&end, NULL)) { perror("gettimeofday"); return 1; } { float usec = (end.tv_sec - start.tv_sec) * 1000000 + (end.tv_usec - start.tv_usec); long long bytes = (long long) size * iters * 2; printf("%lld bytes in %.2f seconds = %.2f Mbit/sec\n", bytes, usec / 1000000., bytes * 8. / usec); printf("%d iters in %.2f seconds = %.2f usec/iter\n", iters, usec / 1000000., usec / iters); if ((!servername) && (validate_buf)) { for (int i = 0; i < size; i += page_size) if (ctx->buf[i] != i / page_size % sizeof(char)) printf("invalid data in page %d\n", i / page_size); } } ibv_ack_cq_events(ctx->cq, num_cq_events); if (pp_close_ctx(ctx)) return 1; ibv_free_device_list(dev_list); free(rem_dest); return 0; } rdma-core-56.1/libibverbs/examples/ud_pingpong.c000066400000000000000000000477611477342711600217670ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include "pingpong.h" enum { PINGPONG_RECV_WRID = 1, PINGPONG_SEND_WRID = 2, }; static int page_size; static int validate_buf; struct pingpong_context { struct ibv_context *context; struct ibv_comp_channel *channel; struct ibv_pd *pd; struct ibv_mr *mr; struct ibv_cq *cq; struct ibv_qp *qp; struct ibv_ah *ah; char *buf; int size; int send_flags; int rx_depth; int pending; struct ibv_port_attr portinfo; }; struct pingpong_dest { int lid; int qpn; int psn; union ibv_gid gid; }; static int pp_connect_ctx(struct pingpong_context *ctx, int port, int my_psn, int sl, struct pingpong_dest *dest, int sgid_idx) { struct ibv_ah_attr ah_attr = { .is_global = 0, .dlid = dest->lid, .sl = sl, .src_path_bits = 0, .port_num = port }; struct ibv_qp_attr attr = { .qp_state = IBV_QPS_RTR }; if (ibv_modify_qp(ctx->qp, &attr, IBV_QP_STATE)) { fprintf(stderr, "Failed to modify QP to RTR\n"); return 1; } attr.qp_state = IBV_QPS_RTS; attr.sq_psn = my_psn; if (ibv_modify_qp(ctx->qp, &attr, IBV_QP_STATE | IBV_QP_SQ_PSN)) { fprintf(stderr, "Failed to modify QP to RTS\n"); return 1; } if (dest->gid.global.interface_id) { ah_attr.is_global = 1; ah_attr.grh.hop_limit = 1; ah_attr.grh.dgid = dest->gid; ah_attr.grh.sgid_index = sgid_idx; } ctx->ah = ibv_create_ah(ctx->pd, &ah_attr); if (!ctx->ah) { fprintf(stderr, "Failed to create AH\n"); return 1; } return 0; } static struct pingpong_dest *pp_client_exch_dest(const char *servername, int port, const struct pingpong_dest *my_dest) { struct addrinfo *res, *t; struct addrinfo hints = { .ai_family = AF_UNSPEC, .ai_socktype = SOCK_STREAM }; char *service; char msg[sizeof "0000:000000:000000:00000000000000000000000000000000"]; int n; int sockfd = -1; struct pingpong_dest *rem_dest = NULL; char gid[33]; if (asprintf(&service, "%d", port) < 0) return NULL; n = getaddrinfo(servername, service, &hints, &res); if (n < 0) { fprintf(stderr, "%s for %s:%d\n", gai_strerror(n), servername, port); free(service); return NULL; } for (t = res; t; t = t->ai_next) { sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol); if (sockfd >= 0) { if (!connect(sockfd, t->ai_addr, t->ai_addrlen)) break; close(sockfd); sockfd = -1; } } freeaddrinfo(res); free(service); if (sockfd < 0) { fprintf(stderr, "Couldn't connect to %s:%d\n", servername, port); return NULL; } gid_to_wire_gid(&my_dest->gid, gid); sprintf(msg, "%04x:%06x:%06x:%s", my_dest->lid, my_dest->qpn, my_dest->psn, gid); if (write(sockfd, msg, sizeof msg) != sizeof msg) { fprintf(stderr, "Couldn't send local address\n"); goto out; } if (read(sockfd, msg, sizeof msg) != sizeof msg || write(sockfd, "done", sizeof "done") != sizeof "done") { perror("client read/write"); fprintf(stderr, "Couldn't read/write remote address\n"); goto out; } rem_dest = malloc(sizeof *rem_dest); if (!rem_dest) goto out; sscanf(msg, "%x:%x:%x:%s", &rem_dest->lid, &rem_dest->qpn, &rem_dest->psn, gid); wire_gid_to_gid(gid, &rem_dest->gid); out: close(sockfd); return rem_dest; } static struct pingpong_dest *pp_server_exch_dest(struct pingpong_context *ctx, int ib_port, int port, int sl, const struct pingpong_dest *my_dest, int sgid_idx) { struct addrinfo *res, 
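						 /* Server side of the
						  * out-of-band TCP rendezvous:
						  * accept one client and swap
						  * fixed-width address records
						  * before connecting the QP.
						  */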
*t; struct addrinfo hints = { .ai_flags = AI_PASSIVE, .ai_family = AF_UNSPEC, .ai_socktype = SOCK_STREAM }; char *service; char msg[sizeof "0000:000000:000000:00000000000000000000000000000000"]; int n; int sockfd = -1, connfd; struct pingpong_dest *rem_dest = NULL; char gid[33]; if (asprintf(&service, "%d", port) < 0) return NULL; n = getaddrinfo(NULL, service, &hints, &res); if (n < 0) { fprintf(stderr, "%s for port %d\n", gai_strerror(n), port); free(service); return NULL; } for (t = res; t; t = t->ai_next) { sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol); if (sockfd >= 0) { n = 1; setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &n, sizeof n); if (!bind(sockfd, t->ai_addr, t->ai_addrlen)) break; close(sockfd); sockfd = -1; } } freeaddrinfo(res); free(service); if (sockfd < 0) { fprintf(stderr, "Couldn't listen to port %d\n", port); return NULL; } listen(sockfd, 1); connfd = accept(sockfd, NULL, NULL); close(sockfd); if (connfd < 0) { fprintf(stderr, "accept() failed\n"); return NULL; } n = read(connfd, msg, sizeof msg); if (n != sizeof msg) { perror("server read"); fprintf(stderr, "%d/%d: Couldn't read remote address\n", n, (int) sizeof msg); goto out; } rem_dest = malloc(sizeof *rem_dest); if (!rem_dest) goto out; sscanf(msg, "%x:%x:%x:%s", &rem_dest->lid, &rem_dest->qpn, &rem_dest->psn, gid); wire_gid_to_gid(gid, &rem_dest->gid); if (pp_connect_ctx(ctx, ib_port, my_dest->psn, sl, rem_dest, sgid_idx)) { fprintf(stderr, "Couldn't connect to remote QP\n"); free(rem_dest); rem_dest = NULL; goto out; } gid_to_wire_gid(&my_dest->gid, gid); sprintf(msg, "%04x:%06x:%06x:%s", my_dest->lid, my_dest->qpn, my_dest->psn, gid); if (write(connfd, msg, sizeof msg) != sizeof msg || read(connfd, msg, sizeof msg) != sizeof "done") { fprintf(stderr, "Couldn't send/recv local address\n"); free(rem_dest); rem_dest = NULL; goto out; } out: close(connfd); return rem_dest; } static struct pingpong_context *pp_init_ctx(struct ibv_device *ib_dev, int size, int rx_depth, int port, int use_event) { struct pingpong_context *ctx; ctx = malloc(sizeof *ctx); if (!ctx) return NULL; ctx->size = size; ctx->send_flags = IBV_SEND_SIGNALED; ctx->rx_depth = rx_depth; ctx->buf = memalign(page_size, size + 40); if (!ctx->buf) { fprintf(stderr, "Couldn't allocate work buf.\n"); goto clean_ctx; } /* FIXME memset(ctx->buf, 0, size + 40); */ memset(ctx->buf, 0x7b, size + 40); ctx->context = ibv_open_device(ib_dev); if (!ctx->context) { fprintf(stderr, "Couldn't get context for %s\n", ibv_get_device_name(ib_dev)); goto clean_buffer; } { struct ibv_port_attr port_info = {}; int mtu; if (ibv_query_port(ctx->context, port, &port_info)) { fprintf(stderr, "Unable to query port info for port %d\n", port); goto clean_device; } mtu = 1 << (port_info.active_mtu + 7); if (size > mtu) { fprintf(stderr, "Requested size larger than port MTU (%d)\n", mtu); goto clean_device; } } if (use_event) { ctx->channel = ibv_create_comp_channel(ctx->context); if (!ctx->channel) { fprintf(stderr, "Couldn't create completion channel\n"); goto clean_device; } } else ctx->channel = NULL; ctx->pd = ibv_alloc_pd(ctx->context); if (!ctx->pd) { fprintf(stderr, "Couldn't allocate PD\n"); goto clean_comp_channel; } ctx->mr = ibv_reg_mr(ctx->pd, ctx->buf, size + 40, IBV_ACCESS_LOCAL_WRITE); if (!ctx->mr) { fprintf(stderr, "Couldn't register MR\n"); goto clean_pd; } ctx->cq = ibv_create_cq(ctx->context, rx_depth + 1, NULL, ctx->channel, 0); if (!ctx->cq) { fprintf(stderr, "Couldn't create CQ\n"); goto clean_mr; } { struct ibv_qp_attr attr; struct 
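	  /* UD completions deliver a 40-byte GRH ahead of the payload,
	   * which is why the buffer and MR above are sized size + 40 and
	   * pp_post_send() starts its SGE at offset 40.
	   */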
ibv_qp_init_attr init_attr = { .send_cq = ctx->cq, .recv_cq = ctx->cq, .cap = { .max_send_wr = 1, .max_recv_wr = rx_depth, .max_send_sge = 1, .max_recv_sge = 1 }, .qp_type = IBV_QPT_UD, }; ctx->qp = ibv_create_qp(ctx->pd, &init_attr); if (!ctx->qp) { fprintf(stderr, "Couldn't create QP\n"); goto clean_cq; } ibv_query_qp(ctx->qp, &attr, IBV_QP_CAP, &init_attr); if (init_attr.cap.max_inline_data >= size) { ctx->send_flags |= IBV_SEND_INLINE; } } { struct ibv_qp_attr attr = { .qp_state = IBV_QPS_INIT, .pkey_index = 0, .port_num = port, .qkey = 0x11111111 }; if (ibv_modify_qp(ctx->qp, &attr, IBV_QP_STATE | IBV_QP_PKEY_INDEX | IBV_QP_PORT | IBV_QP_QKEY)) { fprintf(stderr, "Failed to modify QP to INIT\n"); goto clean_qp; } } return ctx; clean_qp: ibv_destroy_qp(ctx->qp); clean_cq: ibv_destroy_cq(ctx->cq); clean_mr: ibv_dereg_mr(ctx->mr); clean_pd: ibv_dealloc_pd(ctx->pd); clean_comp_channel: if (ctx->channel) ibv_destroy_comp_channel(ctx->channel); clean_device: ibv_close_device(ctx->context); clean_buffer: free(ctx->buf); clean_ctx: free(ctx); return NULL; } static int pp_close_ctx(struct pingpong_context *ctx) { if (ibv_destroy_qp(ctx->qp)) { fprintf(stderr, "Couldn't destroy QP\n"); return 1; } if (ibv_destroy_cq(ctx->cq)) { fprintf(stderr, "Couldn't destroy CQ\n"); return 1; } if (ibv_dereg_mr(ctx->mr)) { fprintf(stderr, "Couldn't deregister MR\n"); return 1; } if (ibv_destroy_ah(ctx->ah)) { fprintf(stderr, "Couldn't destroy AH\n"); return 1; } if (ibv_dealloc_pd(ctx->pd)) { fprintf(stderr, "Couldn't deallocate PD\n"); return 1; } if (ctx->channel) { if (ibv_destroy_comp_channel(ctx->channel)) { fprintf(stderr, "Couldn't destroy completion channel\n"); return 1; } } if (ibv_close_device(ctx->context)) { fprintf(stderr, "Couldn't release context\n"); return 1; } free(ctx->buf); free(ctx); return 0; } static int pp_post_recv(struct pingpong_context *ctx, int n) { struct ibv_sge list = { .addr = (uintptr_t) ctx->buf, .length = ctx->size + 40, .lkey = ctx->mr->lkey }; struct ibv_recv_wr wr = { .wr_id = PINGPONG_RECV_WRID, .sg_list = &list, .num_sge = 1, }; struct ibv_recv_wr *bad_wr; int i; for (i = 0; i < n; ++i) if (ibv_post_recv(ctx->qp, &wr, &bad_wr)) break; return i; } static int pp_post_send(struct pingpong_context *ctx, uint32_t qpn) { struct ibv_sge list = { .addr = (uintptr_t) ctx->buf + 40, .length = ctx->size, .lkey = ctx->mr->lkey }; struct ibv_send_wr wr = { .wr_id = PINGPONG_SEND_WRID, .sg_list = &list, .num_sge = 1, .opcode = IBV_WR_SEND, .send_flags = ctx->send_flags, .wr = { .ud = { .ah = ctx->ah, .remote_qpn = qpn, .remote_qkey = 0x11111111 } } }; struct ibv_send_wr *bad_wr; return ibv_post_send(ctx->qp, &wr, &bad_wr); } static void usage(const char *argv0) { printf("Usage:\n"); printf(" %s start a server and wait for connection\n", argv0); printf(" %s connect to server at \n", argv0); printf("\n"); printf("Options:\n"); printf(" -p, --port= listen on/connect to port (default 18515)\n"); printf(" -d, --ib-dev= use IB device (default first device found)\n"); printf(" -i, --ib-port= use port of IB device (default 1)\n"); printf(" -s, --size= size of message to exchange (default 2048)\n"); printf(" -r, --rx-depth= number of receives to post at a time (default 500)\n"); printf(" -n, --iters= number of exchanges (default 1000)\n"); printf(" -l, --sl= send messages with service level (default 0)\n"); printf(" -e, --events sleep on CQ events (default poll)\n"); printf(" -g, --gid-idx= local port gid index\n"); printf(" -c, --chk validate received buffer\n"); } int main(int argc, char 
*argv[]) { struct ibv_device **dev_list; struct ibv_device *ib_dev; struct pingpong_context *ctx; struct pingpong_dest my_dest; struct pingpong_dest *rem_dest; struct timeval start, end; char *ib_devname = NULL; char *servername = NULL; unsigned int port = 18515; int ib_port = 1; unsigned int size = 1024; unsigned int rx_depth = 500; unsigned int iters = 1000; int use_event = 0; int routs; int rcnt, scnt; int num_cq_events = 0; int sl = 0; int gidx = -1; char gid[33]; srand48(getpid() * time(NULL)); while (1) { int c; static struct option long_options[] = { { .name = "port", .has_arg = 1, .val = 'p' }, { .name = "ib-dev", .has_arg = 1, .val = 'd' }, { .name = "ib-port", .has_arg = 1, .val = 'i' }, { .name = "size", .has_arg = 1, .val = 's' }, { .name = "rx-depth", .has_arg = 1, .val = 'r' }, { .name = "iters", .has_arg = 1, .val = 'n' }, { .name = "sl", .has_arg = 1, .val = 'l' }, { .name = "events", .has_arg = 0, .val = 'e' }, { .name = "gid-idx", .has_arg = 1, .val = 'g' }, { .name = "chk", .has_arg = 0, .val = 'c' }, {} }; c = getopt_long(argc, argv, "p:d:i:s:r:n:l:eg:c", long_options, NULL); if (c == -1) break; switch (c) { case 'p': port = strtol(optarg, NULL, 0); if (port > 65535) { usage(argv[0]); return 1; } break; case 'd': ib_devname = strdupa(optarg); break; case 'i': ib_port = strtol(optarg, NULL, 0); if (ib_port < 1) { usage(argv[0]); return 1; } break; case 's': size = strtoul(optarg, NULL, 0); break; case 'r': rx_depth = strtoul(optarg, NULL, 0); break; case 'n': iters = strtoul(optarg, NULL, 0); break; case 'l': sl = strtol(optarg, NULL, 0); break; case 'e': ++use_event; break; case 'g': gidx = strtol(optarg, NULL, 0); break; case 'c': validate_buf = 1; break; default: usage(argv[0]); return 1; } } if (optind == argc - 1) servername = strdupa(argv[optind]); else if (optind < argc) { usage(argv[0]); return 1; } page_size = sysconf(_SC_PAGESIZE); dev_list = ibv_get_device_list(NULL); if (!dev_list) { perror("Failed to get IB devices list"); return 1; } if (!ib_devname) { ib_dev = *dev_list; if (!ib_dev) { fprintf(stderr, "No IB devices found\n"); return 1; } } else { int i; for (i = 0; dev_list[i]; ++i) if (!strcmp(ibv_get_device_name(dev_list[i]), ib_devname)) break; ib_dev = dev_list[i]; if (!ib_dev) { fprintf(stderr, "IB device %s not found\n", ib_devname); return 1; } } ctx = pp_init_ctx(ib_dev, size, rx_depth, ib_port, use_event); if (!ctx) return 1; routs = pp_post_recv(ctx, ctx->rx_depth); if (routs < ctx->rx_depth) { fprintf(stderr, "Couldn't post receive (%d)\n", routs); return 1; } if (use_event) if (ibv_req_notify_cq(ctx->cq, 0)) { fprintf(stderr, "Couldn't request CQ notification\n"); return 1; } if (pp_get_port_info(ctx->context, ib_port, &ctx->portinfo)) { fprintf(stderr, "Couldn't get port info\n"); return 1; } my_dest.lid = ctx->portinfo.lid; my_dest.qpn = ctx->qp->qp_num; my_dest.psn = lrand48() & 0xffffff; if (gidx >= 0) { if (ibv_query_gid(ctx->context, ib_port, gidx, &my_dest.gid)) { fprintf(stderr, "Could not get local gid for gid index " "%d\n", gidx); return 1; } } else memset(&my_dest.gid, 0, sizeof my_dest.gid); inet_ntop(AF_INET6, &my_dest.gid, gid, sizeof gid); printf(" local address: LID 0x%04x, QPN 0x%06x, PSN 0x%06x: GID %s\n", my_dest.lid, my_dest.qpn, my_dest.psn, gid); if (servername) rem_dest = pp_client_exch_dest(servername, port, &my_dest); else rem_dest = pp_server_exch_dest(ctx, ib_port, port, sl, &my_dest, gidx); if (!rem_dest) return 1; inet_ntop(AF_INET6, &rem_dest->gid, gid, sizeof gid); printf(" remote address: LID 0x%04x, QPN 
0x%06x, PSN 0x%06x, GID %s\n", rem_dest->lid, rem_dest->qpn, rem_dest->psn, gid); if (servername) if (pp_connect_ctx(ctx, ib_port, my_dest.psn, sl, rem_dest, gidx)) return 1; ctx->pending = PINGPONG_RECV_WRID; if (servername) { if (validate_buf) for (int i = 0; i < size; i += page_size) ctx->buf[i + 40] = i / page_size % sizeof(char); if (pp_post_send(ctx, rem_dest->qpn)) { fprintf(stderr, "Couldn't post send\n"); return 1; } ctx->pending |= PINGPONG_SEND_WRID; } if (gettimeofday(&start, NULL)) { perror("gettimeofday"); return 1; } rcnt = scnt = 0; while (rcnt < iters || scnt < iters) { if (use_event) { struct ibv_cq *ev_cq; void *ev_ctx; if (ibv_get_cq_event(ctx->channel, &ev_cq, &ev_ctx)) { fprintf(stderr, "Failed to get cq_event\n"); return 1; } ++num_cq_events; if (ev_cq != ctx->cq) { fprintf(stderr, "CQ event for unknown CQ %p\n", ev_cq); return 1; } if (ibv_req_notify_cq(ctx->cq, 0)) { fprintf(stderr, "Couldn't request CQ notification\n"); return 1; } } { struct ibv_wc wc[2]; int ne, i; do { ne = ibv_poll_cq(ctx->cq, 2, wc); if (ne < 0) { fprintf(stderr, "poll CQ failed %d\n", ne); return 1; } } while (!use_event && ne < 1); for (i = 0; i < ne; ++i) { if (wc[i].status != IBV_WC_SUCCESS) { fprintf(stderr, "Failed status %s (%d) for wr_id %d\n", ibv_wc_status_str(wc[i].status), wc[i].status, (int) wc[i].wr_id); return 1; } switch ((int) wc[i].wr_id) { case PINGPONG_SEND_WRID: ++scnt; break; case PINGPONG_RECV_WRID: if (--routs <= 1) { routs += pp_post_recv(ctx, ctx->rx_depth - routs); if (routs < ctx->rx_depth) { fprintf(stderr, "Couldn't post receive (%d)\n", routs); return 1; } } ++rcnt; break; default: fprintf(stderr, "Completion for unknown wr_id %d\n", (int) wc[i].wr_id); return 1; } ctx->pending &= ~(int) wc[i].wr_id; if (scnt < iters && !ctx->pending) { if (pp_post_send(ctx, rem_dest->qpn)) { fprintf(stderr, "Couldn't post send\n"); return 1; } ctx->pending = PINGPONG_RECV_WRID | PINGPONG_SEND_WRID; } } } } if (gettimeofday(&end, NULL)) { perror("gettimeofday"); return 1; } { float usec = (end.tv_sec - start.tv_sec) * 1000000 + (end.tv_usec - start.tv_usec); long long bytes = (long long) size * iters * 2; printf("%lld bytes in %.2f seconds = %.2f Mbit/sec\n", bytes, usec / 1000000., bytes * 8. / usec); printf("%d iters in %.2f seconds = %.2f usec/iter\n", iters, usec / 1000000., usec / iters); if ((!servername) && (validate_buf)) { for (int i = 0; i < size; i += page_size) if (ctx->buf[i + 40] != i / page_size % sizeof(char)) printf("invalid data in page %d\n", i / page_size); } } ibv_ack_cq_events(ctx->cq, num_cq_events); if (pp_close_ctx(ctx)) return 1; ibv_free_device_list(dev_list); free(rem_dest); return 0; } rdma-core-56.1/libibverbs/examples/xsrq_pingpong.c000066400000000000000000000602401477342711600223370ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2011 Intel Corporation, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "pingpong.h" #define MSG_FORMAT "%04x:%06x:%06x:%06x:%06x:%32s" #define MSG_SIZE 66 #define MSG_SSCAN "%x:%x:%x:%x:%x:%s" #define ADDR_FORMAT \ "%8s: LID %04x, QPN RECV %06x SEND %06x, PSN %06x, SRQN %06x, GID %s\n" #define TERMINATION_FORMAT "%s" #define TERMINATION_MSG_SIZE 4 #define TERMINATION_MSG "END" static int page_size; static int use_odp; struct pingpong_dest { union ibv_gid gid; int lid; int recv_qpn; int send_qpn; int recv_psn; int send_psn; int srqn; int pp_cnt; int sockfd; }; struct pingpong_context { struct ibv_context *context; struct ibv_comp_channel *channel; struct ibv_pd *pd; struct ibv_mr *mr; struct ibv_cq *send_cq; struct ibv_cq *recv_cq; struct ibv_srq *srq; struct ibv_xrcd *xrcd; struct ibv_qp **recv_qp; struct ibv_qp **send_qp; struct pingpong_dest *rem_dest; void *buf; int lid; int sl; enum ibv_mtu mtu; int ib_port; int fd; int size; int num_clients; int num_tests; int use_event; int gidx; }; static struct pingpong_context ctx; static int open_device(char *ib_devname) { struct ibv_device **dev_list; int i = 0; dev_list = ibv_get_device_list(NULL); if (!dev_list) { fprintf(stderr, "Failed to get IB devices list"); return -1; } if (ib_devname) { for (; dev_list[i]; ++i) { if (!strcmp(ibv_get_device_name(dev_list[i]), ib_devname)) break; } } if (!dev_list[i]) { fprintf(stderr, "IB device %s not found\n", ib_devname ? 
ib_devname : ""); return -1; } ctx.context = ibv_open_device(dev_list[i]); if (!ctx.context) { fprintf(stderr, "Couldn't get context for %s\n", ibv_get_device_name(dev_list[i])); return -1; } ibv_free_device_list(dev_list); return 0; } static int create_qps(void) { struct ibv_qp_init_attr_ex init; struct ibv_qp_attr mod; int i; for (i = 0; i < ctx.num_clients; ++i) { memset(&init, 0, sizeof init); init.qp_type = IBV_QPT_XRC_RECV; init.comp_mask = IBV_QP_INIT_ATTR_XRCD; init.xrcd = ctx.xrcd; ctx.recv_qp[i] = ibv_create_qp_ex(ctx.context, &init); if (!ctx.recv_qp[i]) { fprintf(stderr, "Couldn't create recv QP[%d] errno %d\n", i, errno); return 1; } mod.qp_state = IBV_QPS_INIT; mod.pkey_index = 0; mod.port_num = ctx.ib_port; mod.qp_access_flags = IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_REMOTE_READ; if (ibv_modify_qp(ctx.recv_qp[i], &mod, IBV_QP_STATE | IBV_QP_PKEY_INDEX | IBV_QP_PORT | IBV_QP_ACCESS_FLAGS)) { fprintf(stderr, "Failed to modify recv QP[%d] to INIT\n", i); return 1; } memset(&init, 0, sizeof init); init.qp_type = IBV_QPT_XRC_SEND; init.send_cq = ctx.send_cq; init.cap.max_send_wr = ctx.num_clients * ctx.num_tests; init.cap.max_send_sge = 1; init.comp_mask = IBV_QP_INIT_ATTR_PD; init.pd = ctx.pd; ctx.send_qp[i] = ibv_create_qp_ex(ctx.context, &init); if (!ctx.send_qp[i]) { fprintf(stderr, "Couldn't create send QP[%d] errno %d\n", i, errno); return 1; } mod.qp_state = IBV_QPS_INIT; mod.pkey_index = 0; mod.port_num = ctx.ib_port; mod.qp_access_flags = 0; if (ibv_modify_qp(ctx.send_qp[i], &mod, IBV_QP_STATE | IBV_QP_PKEY_INDEX | IBV_QP_PORT | IBV_QP_ACCESS_FLAGS)) { fprintf(stderr, "Failed to modify send QP[%d] to INIT\n", i); return 1; } } return 0; } static int pp_init_ctx(char *ib_devname) { struct ibv_srq_init_attr_ex attr; struct ibv_xrcd_init_attr xrcd_attr; struct ibv_port_attr port_attr; int access_flags = IBV_ACCESS_LOCAL_WRITE; ctx.recv_qp = calloc(ctx.num_clients, sizeof *ctx.recv_qp); ctx.send_qp = calloc(ctx.num_clients, sizeof *ctx.send_qp); ctx.rem_dest = calloc(ctx.num_clients, sizeof *ctx.rem_dest); if (!ctx.recv_qp || !ctx.send_qp || !ctx.rem_dest) return 1; if (open_device(ib_devname)) { fprintf(stderr, "Failed to open device\n"); return 1; } if (use_odp) { struct ibv_device_attr_ex attrx; const uint32_t xrc_caps_mask = IBV_ODP_SUPPORT_SEND | IBV_ODP_SUPPORT_SRQ_RECV; if (ibv_query_device_ex(ctx.context, NULL, &attrx)) { fprintf(stderr, "Couldn't query device for its features\n"); return 1; } if (!(attrx.odp_caps.general_caps & IBV_ODP_SUPPORT) || (attrx.xrc_odp_caps & xrc_caps_mask) != xrc_caps_mask) { fprintf(stderr, "The device isn't ODP capable or does not support XRC send, receive and srq with ODP\n"); return 1; } access_flags |= IBV_ACCESS_ON_DEMAND; } if (pp_get_port_info(ctx.context, ctx.ib_port, &port_attr)) { fprintf(stderr, "Failed to get port info\n"); return 1; } ctx.lid = port_attr.lid; if (port_attr.link_layer != IBV_LINK_LAYER_ETHERNET && !ctx.lid) { fprintf(stderr, "Couldn't get local LID\n"); return 1; } ctx.buf = memalign(page_size, ctx.size); if (!ctx.buf) { fprintf(stderr, "Couldn't allocate work buf.\n"); return 1; } memset(ctx.buf, 0, ctx.size); if (ctx.use_event) { ctx.channel = ibv_create_comp_channel(ctx.context); if (!ctx.channel) { fprintf(stderr, "Couldn't create completion channel\n"); return 1; } } ctx.pd = ibv_alloc_pd(ctx.context); if (!ctx.pd) { fprintf(stderr, "Couldn't allocate PD\n"); return 1; } ctx.mr = ibv_reg_mr(ctx.pd, ctx.buf, ctx.size, access_flags); if (!ctx.mr) { fprintf(stderr, "Couldn't register MR\n"); return 1; } 
ctx.fd = open("/tmp/xrc_domain", O_RDONLY | O_CREAT, S_IRUSR | S_IRGRP); if (ctx.fd < 0) { fprintf(stderr, "Couldn't create the file for the XRC Domain " "but not stopping %d\n", errno); ctx.fd = -1; } memset(&xrcd_attr, 0, sizeof xrcd_attr); xrcd_attr.comp_mask = IBV_XRCD_INIT_ATTR_FD | IBV_XRCD_INIT_ATTR_OFLAGS; xrcd_attr.fd = ctx.fd; xrcd_attr.oflags = O_CREAT; ctx.xrcd = ibv_open_xrcd(ctx.context, &xrcd_attr); if (!ctx.xrcd) { fprintf(stderr, "Couldn't Open the XRC Domain %d\n", errno); return 1; } ctx.recv_cq = ibv_create_cq(ctx.context, ctx.num_clients, &ctx.recv_cq, ctx.channel, 0); if (!ctx.recv_cq) { fprintf(stderr, "Couldn't create recv CQ\n"); return 1; } if (ctx.use_event) { if (ibv_req_notify_cq(ctx.recv_cq, 0)) { fprintf(stderr, "Couldn't request CQ notification\n"); return 1; } } ctx.send_cq = ibv_create_cq(ctx.context, ctx.num_clients, NULL, NULL, 0); if (!ctx.send_cq) { fprintf(stderr, "Couldn't create send CQ\n"); return 1; } memset(&attr, 0, sizeof attr); attr.attr.max_wr = ctx.num_clients; attr.attr.max_sge = 1; attr.comp_mask = IBV_SRQ_INIT_ATTR_TYPE | IBV_SRQ_INIT_ATTR_XRCD | IBV_SRQ_INIT_ATTR_CQ | IBV_SRQ_INIT_ATTR_PD; attr.srq_type = IBV_SRQT_XRC; attr.xrcd = ctx.xrcd; attr.cq = ctx.recv_cq; attr.pd = ctx.pd; ctx.srq = ibv_create_srq_ex(ctx.context, &attr); if (!ctx.srq) { fprintf(stderr, "Couldn't create SRQ\n"); return 1; } if (create_qps()) return 1; return 0; } static int recv_termination_ack(int index) { char msg[TERMINATION_MSG_SIZE]; int n = 0, r; int sockfd = ctx.rem_dest[index].sockfd; while (n < TERMINATION_MSG_SIZE) { r = read(sockfd, msg + n, TERMINATION_MSG_SIZE - n); if (r < 0) { perror("client read"); fprintf(stderr, "%d/%d: Couldn't read remote termination ack\n", n, TERMINATION_MSG_SIZE); return 1; } n += r; } if (strcmp(msg, TERMINATION_MSG)) { fprintf(stderr, "Invalid termination ack was accepted\n"); return 1; } return 0; } static int send_termination_ack(int index) { char msg[TERMINATION_MSG_SIZE]; int sockfd = ctx.rem_dest[index].sockfd; sprintf(msg, TERMINATION_FORMAT, TERMINATION_MSG); if (write(sockfd, msg, TERMINATION_MSG_SIZE) != TERMINATION_MSG_SIZE) { fprintf(stderr, "Couldn't send termination ack\n"); return 1; } return 0; } static int pp_client_termination(void) { if (send_termination_ack(0)) return 1; if (recv_termination_ack(0)) return 1; return 0; } static int pp_server_termination(void) { int i; for (i = 0; i < ctx.num_clients; i++) { if (recv_termination_ack(i)) return 1; } for (i = 0; i < ctx.num_clients; i++) { if (send_termination_ack(i)) return 1; } return 0; } static int send_local_dest(int sockfd, int index) { char msg[MSG_SIZE]; char gid[33]; uint32_t srq_num; union ibv_gid local_gid; if (ctx.gidx >= 0) { if (ibv_query_gid(ctx.context, ctx.ib_port, ctx.gidx, &local_gid)) { fprintf(stderr, "can't read sgid of index %d\n", ctx.gidx); return -1; } } else { memset(&local_gid, 0, sizeof(local_gid)); } ctx.rem_dest[index].recv_psn = lrand48() & 0xffffff; if (ibv_get_srq_num(ctx.srq, &srq_num)) { fprintf(stderr, "Couldn't get SRQ num\n"); return -1; } inet_ntop(AF_INET6, &local_gid, gid, sizeof(gid)); printf(ADDR_FORMAT, "local", ctx.lid, ctx.recv_qp[index]->qp_num, ctx.send_qp[index]->qp_num, ctx.rem_dest[index].recv_psn, srq_num, gid); gid_to_wire_gid(&local_gid, gid); sprintf(msg, MSG_FORMAT, ctx.lid, ctx.recv_qp[index]->qp_num, ctx.send_qp[index]->qp_num, ctx.rem_dest[index].recv_psn, srq_num, gid); if (write(sockfd, msg, MSG_SIZE) != MSG_SIZE) { fprintf(stderr, "Couldn't send local address\n"); return -1; } return 0; } static 
int recv_remote_dest(int sockfd, int index) { struct pingpong_dest *rem_dest; char msg[MSG_SIZE]; char gid[33]; int n = 0, r; while (n < MSG_SIZE) { r = read(sockfd, msg + n, MSG_SIZE - n); if (r < 0) { perror("client read"); fprintf(stderr, "%d/%d: Couldn't read remote address [%d]\n", n, MSG_SIZE, index); return -1; } n += r; } rem_dest = &ctx.rem_dest[index]; sscanf(msg, MSG_SSCAN, &rem_dest->lid, &rem_dest->recv_qpn, &rem_dest->send_qpn, &rem_dest->send_psn, &rem_dest->srqn, gid); wire_gid_to_gid(gid, &rem_dest->gid); inet_ntop(AF_INET6, &rem_dest->gid, gid, sizeof(gid)); printf(ADDR_FORMAT, "remote", rem_dest->lid, rem_dest->recv_qpn, rem_dest->send_qpn, rem_dest->send_psn, rem_dest->srqn, gid); rem_dest->sockfd = sockfd; return 0; } static void set_ah_attr(struct ibv_ah_attr *attr, struct pingpong_context *myctx, int index) { attr->is_global = 1; attr->grh.hop_limit = 5; attr->grh.dgid = myctx->rem_dest[index].gid; attr->grh.sgid_index = myctx->gidx; } static int connect_qps(int index) { struct ibv_qp_attr attr; memset(&attr, 0, sizeof attr); attr.qp_state = IBV_QPS_RTR; attr.dest_qp_num = ctx.rem_dest[index].send_qpn; attr.path_mtu = ctx.mtu; attr.rq_psn = ctx.rem_dest[index].send_psn; attr.min_rnr_timer = 12; attr.ah_attr.dlid = ctx.rem_dest[index].lid; attr.ah_attr.sl = ctx.sl; attr.ah_attr.port_num = ctx.ib_port; if (ctx.rem_dest[index].gid.global.interface_id) set_ah_attr(&attr.ah_attr, &ctx, index); if (ibv_modify_qp(ctx.recv_qp[index], &attr, IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU | IBV_QP_DEST_QPN | IBV_QP_RQ_PSN | IBV_QP_MAX_DEST_RD_ATOMIC | IBV_QP_MIN_RNR_TIMER)) { fprintf(stderr, "Failed to modify recv QP[%d] to RTR\n", index); return 1; } memset(&attr, 0, sizeof attr); attr.qp_state = IBV_QPS_RTS; attr.timeout = 14; attr.sq_psn = ctx.rem_dest[index].recv_psn; if (ibv_modify_qp(ctx.recv_qp[index], &attr, IBV_QP_STATE | IBV_QP_TIMEOUT | IBV_QP_SQ_PSN)) { fprintf(stderr, "Failed to modify recv QP[%d] to RTS\n", index); return 1; } memset(&attr, 0, sizeof attr); attr.qp_state = IBV_QPS_RTR; attr.dest_qp_num = ctx.rem_dest[index].recv_qpn; attr.path_mtu = ctx.mtu; attr.rq_psn = ctx.rem_dest[index].send_psn; attr.ah_attr.dlid = ctx.rem_dest[index].lid; attr.ah_attr.sl = ctx.sl; attr.ah_attr.port_num = ctx.ib_port; if (ctx.rem_dest[index].gid.global.interface_id) set_ah_attr(&attr.ah_attr, &ctx, index); if (ibv_modify_qp(ctx.send_qp[index], &attr, IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU | IBV_QP_DEST_QPN | IBV_QP_RQ_PSN)) { fprintf(stderr, "Failed to modify send QP[%d] to RTR\n", index); return 1; } memset(&attr, 0, sizeof attr); attr.qp_state = IBV_QPS_RTS; attr.timeout = 14; attr.retry_cnt = 7; attr.rnr_retry = 7; attr.sq_psn = ctx.rem_dest[index].recv_psn; if (ibv_modify_qp(ctx.send_qp[index], &attr, IBV_QP_STATE | IBV_QP_TIMEOUT | IBV_QP_SQ_PSN | IBV_QP_RETRY_CNT | IBV_QP_RNR_RETRY | IBV_QP_MAX_QP_RD_ATOMIC)) { fprintf(stderr, "Failed to modify send QP[%d] to RTS\n", index); return 1; } return 0; } static int pp_client_connect(const char *servername, int port) { struct addrinfo *res, *t; char *service; int ret; int sockfd = -1; struct addrinfo hints = { .ai_family = AF_UNSPEC, .ai_socktype = SOCK_STREAM }; if (asprintf(&service, "%d", port) < 0) return 1; ret = getaddrinfo(servername, service, &hints, &res); if (ret < 0) { fprintf(stderr, "%s for %s:%d\n", gai_strerror(ret), servername, port); free(service); return 1; } for (t = res; t; t = t->ai_next) { sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol); if (sockfd >= 0) { if (!connect(sockfd, t->ai_addr, 
t->ai_addrlen)) break; close(sockfd); sockfd = -1; } } freeaddrinfo(res); free(service); if (sockfd < 0) { fprintf(stderr, "Couldn't connect to %s:%d\n", servername, port); return 1; } if (send_local_dest(sockfd, 0)) { close(sockfd); return 1; } if (recv_remote_dest(sockfd, 0)) return 1; if (connect_qps(0)) return 1; return 0; } static int pp_server_connect(int port) { struct addrinfo *res, *t; char *service; int ret, i, n; int sockfd = -1, connfd; struct addrinfo hints = { .ai_flags = AI_PASSIVE, .ai_family = AF_UNSPEC, .ai_socktype = SOCK_STREAM }; if (asprintf(&service, "%d", port) < 0) return 1; ret = getaddrinfo(NULL, service, &hints, &res); if (ret < 0) { fprintf(stderr, "%s for port %d\n", gai_strerror(ret), port); free(service); return 1; } for (t = res; t; t = t->ai_next) { sockfd = socket(t->ai_family, t->ai_socktype, t->ai_protocol); if (sockfd >= 0) { n = 1; setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &n, sizeof n); if (!bind(sockfd, t->ai_addr, t->ai_addrlen)) break; close(sockfd); sockfd = -1; } } freeaddrinfo(res); free(service); if (sockfd < 0) { fprintf(stderr, "Couldn't listen to port %d\n", port); return 1; } listen(sockfd, ctx.num_clients); for (i = 0; i < ctx.num_clients; i++) { connfd = accept(sockfd, NULL, NULL); if (connfd < 0) { fprintf(stderr, "accept() failed for client %d\n", i); return 1; } if (recv_remote_dest(connfd, i)) return 1; if (send_local_dest(connfd, i)) return 1; if (connect_qps(i)) return 1; } close(sockfd); return 0; } static int pp_close_ctx(void) { int i; for (i = 0; i < ctx.num_clients; ++i) { if (ibv_destroy_qp(ctx.send_qp[i])) { fprintf(stderr, "Couldn't destroy INI QP[%d]\n", i); return 1; } if (ibv_destroy_qp(ctx.recv_qp[i])) { fprintf(stderr, "Couldn't destroy TGT QP[%d]\n", i); return 1; } if (ctx.rem_dest[i].sockfd) close(ctx.rem_dest[i].sockfd); } if (ibv_destroy_srq(ctx.srq)) { fprintf(stderr, "Couldn't destroy SRQ\n"); return 1; } if (ctx.xrcd && ibv_close_xrcd(ctx.xrcd)) { fprintf(stderr, "Couldn't close the XRC Domain\n"); return 1; } if (ctx.fd >= 0 && close(ctx.fd)) { fprintf(stderr, "Couldn't close the file for the XRC Domain\n"); return 1; } if (ibv_destroy_cq(ctx.send_cq)) { fprintf(stderr, "Couldn't destroy send CQ\n"); return 1; } if (ibv_destroy_cq(ctx.recv_cq)) { fprintf(stderr, "Couldn't destroy recv CQ\n"); return 1; } if (ibv_dereg_mr(ctx.mr)) { fprintf(stderr, "Couldn't deregister MR\n"); return 1; } if (ibv_dealloc_pd(ctx.pd)) { fprintf(stderr, "Couldn't deallocate PD\n"); return 1; } if (ctx.channel) { if (ibv_destroy_comp_channel(ctx.channel)) { fprintf(stderr, "Couldn't destroy completion channel\n"); return 1; } } if (ibv_close_device(ctx.context)) { fprintf(stderr, "Couldn't release context\n"); return 1; } free(ctx.buf); free(ctx.rem_dest); free(ctx.send_qp); free(ctx.recv_qp); return 0; } static int pp_post_recv(int cnt) { struct ibv_sge sge; struct ibv_recv_wr wr, *bad_wr; sge.addr = (uintptr_t) ctx.buf; sge.length = ctx.size; sge.lkey = ctx.mr->lkey; wr.next = NULL; wr.wr_id = (uintptr_t) &ctx; wr.sg_list = &sge; wr.num_sge = 1; while (cnt--) { if (ibv_post_srq_recv(ctx.srq, &wr, &bad_wr)) { fprintf(stderr, "Failed to post receive to SRQ\n"); return 1; } } return 0; } /* * Send to each client round robin on each set of xrc send/recv qp. * Generate a completion on the last send. 
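 * For a given destination index the remote SRQN stays fixed, while qpi
 * rotates the message across all local XRC send QPs; leaving every send
 * but the last unsignaled means the send CQ only collects one completion
 * per destination, which main() drains once the receive loop finishes.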
*/ static int pp_post_send(int index) { struct ibv_sge sge; struct ibv_send_wr wr, *bad_wr; int qpi; sge.addr = (uintptr_t) ctx.buf; sge.length = ctx.size; sge.lkey = ctx.mr->lkey; wr.wr_id = (uintptr_t) index; wr.next = NULL; wr.sg_list = &sge; wr.num_sge = 1; wr.opcode = IBV_WR_SEND; wr.qp_type.xrc.remote_srqn = ctx.rem_dest[index].srqn; qpi = (index + ctx.rem_dest[index].pp_cnt) % ctx.num_clients; wr.send_flags = (++ctx.rem_dest[index].pp_cnt >= ctx.num_tests) ? IBV_SEND_SIGNALED : 0; return ibv_post_send(ctx.send_qp[qpi], &wr, &bad_wr); }
static int find_qp(int qpn) { int i; if (ctx.num_clients == 1) return 0; for (i = 0; i < ctx.num_clients; ++i) if (ctx.recv_qp[i]->qp_num == qpn) return i; fprintf(stderr, "Unable to find qp %x\n", qpn); return 0; }
static int get_cq_event(void) { struct ibv_cq *ev_cq; void *ev_ctx; if (ibv_get_cq_event(ctx.channel, &ev_cq, &ev_ctx)) { fprintf(stderr, "Failed to get cq_event\n"); return 1; } if (ev_cq != ctx.recv_cq) { fprintf(stderr, "CQ event for unknown CQ %p\n", ev_cq); return 1; } if (ibv_req_notify_cq(ctx.recv_cq, 0)) { fprintf(stderr, "Couldn't request CQ notification\n"); return 1; } return 0; }
static void init(void) { srand48(getpid() * time(NULL)); ctx.size = 4096; ctx.ib_port = 1; ctx.num_clients = 1; ctx.num_tests = 5; ctx.mtu = IBV_MTU_1024; ctx.sl = 0; ctx.gidx = -1; }
static void usage(const char *argv0) { printf("Usage:\n"); printf(" %s start a server and wait for connection\n", argv0); printf(" %s <host> connect to server at <host>\n", argv0); printf("\n"); printf("Options:\n"); printf(" -p, --port=<port> listen on/connect to port <port> (default 18515)\n"); printf(" -d, --ib-dev=<dev> use IB device <dev> (default first device found)\n"); printf(" -i, --ib-port=<port> use port <port> of IB device (default 1)\n"); printf(" -s, --size=<size> size of message to exchange (default 4096)\n"); printf(" -m, --mtu=<size> path MTU (default 1024)\n"); printf(" -c, --clients=<n> number of clients (on server only, default 1)\n"); printf(" -n, --num_tests=<n> number of tests per client (default 5)\n"); printf(" -l, --sl=<sl> service level value\n"); printf(" -e, --events sleep on CQ events (default poll)\n"); printf(" -o, --odp use on demand paging\n"); printf(" -g, --gid-idx=<gid index> local port gid index\n"); }
int main(int argc, char *argv[]) { char *ib_devname = NULL; char *servername = NULL; int port = 18515; int i, total, cnt = 0; int ne, qpi, num_cq_events = 0; struct ibv_wc wc; init(); while (1) { int c; static struct option long_options[] = { { .name = "port", .has_arg = 1, .val = 'p' }, { .name = "ib-dev", .has_arg = 1, .val = 'd' }, { .name = "ib-port", .has_arg = 1, .val = 'i' }, { .name = "size", .has_arg = 1, .val = 's' }, { .name = "mtu", .has_arg = 1, .val = 'm' }, { .name = "clients", .has_arg = 1, .val = 'c' }, { .name = "num_tests", .has_arg = 1, .val = 'n' }, { .name = "sl", .has_arg = 1, .val = 'l' }, { .name = "events", .has_arg = 0, .val = 'e' }, { .name = "odp", .has_arg = 0, .val = 'o' }, { .name = "gid-idx", .has_arg = 1, .val = 'g' }, {} }; c = getopt_long(argc, argv, "p:d:i:s:m:n:l:eog:c:", long_options, NULL); if (c == -1) break; switch (c) { case 'p': port = strtol(optarg, NULL, 0); if (port < 0 || port > 65535) { usage(argv[0]); return 1; } break; case 'd': ib_devname = strdupa(optarg); break; case 'i': ctx.ib_port = strtol(optarg, NULL, 0); if (ctx.ib_port < 0) { usage(argv[0]); return 1; } break; case 's': ctx.size = strtol(optarg, NULL, 0); break; case 'm': ctx.mtu = pp_mtu_to_enum(strtol(optarg, NULL, 0)); if (ctx.mtu == 0) { usage(argv[0]); return 1; } break; case 'c': ctx.num_clients =
strtol(optarg, NULL, 0); break; case 'n': ctx.num_tests = strtol(optarg, NULL, 0); break; case 'l': ctx.sl = strtol(optarg, NULL, 0); break; case 'g': ctx.gidx = strtol(optarg, NULL, 0); break; case 'e': ctx.use_event = 1; break; case 'o': use_odp = 1; break; default: usage(argv[0]); return 1; } } if (optind == argc - 1) { servername = strdupa(argv[optind]); ctx.num_clients = 1; } else if (optind < argc) { usage(argv[0]); return 1; } page_size = sysconf(_SC_PAGESIZE); if (pp_init_ctx(ib_devname)) return 1; if (pp_post_recv(ctx.num_clients)) { fprintf(stderr, "Couldn't post receives\n"); return 1; } if (servername) { if (pp_client_connect(servername, port)) return 1; } else { if (pp_server_connect(port)) return 1; for (i = 0; i < ctx.num_clients; i++) pp_post_send(i); } total = ctx.num_clients * ctx.num_tests; while (cnt < total) { if (ctx.use_event) { if (get_cq_event()) return 1; ++num_cq_events; } do { ne = ibv_poll_cq(ctx.recv_cq, 1, &wc); if (ne < 0) { fprintf(stderr, "Error polling cq %d\n", ne); return 1; } else if (ne == 0) { break; } if (wc.status) { fprintf(stderr, "Work completion error %d\n", wc.status); return 1; } pp_post_recv(ne); qpi = find_qp(wc.qp_num); if (ctx.rem_dest[qpi].pp_cnt < ctx.num_tests) pp_post_send(qpi); cnt += ne; } while (ne > 0); } for (cnt = 0; cnt < ctx.num_clients; cnt += ne) { ne = ibv_poll_cq(ctx.send_cq, 1, &wc); if (ne < 0) { fprintf(stderr, "Error polling cq %d\n", ne); return 1; } } if (ctx.use_event) ibv_ack_cq_events(ctx.recv_cq, num_cq_events); /* Process should get an ack from the daemon to close its resources to * make sure latest daemon's response sent via its target QP destined * to an XSRQ created by another client won't be lost. * Failure to do so may cause the other client to wait for that sent * message forever. See comment on pp_post_send. */ if (servername) { if (pp_client_termination()) return 1; } else if (pp_server_termination()) { return 1; } if (pp_close_ctx()) return 1; printf("success\n"); return 0; } rdma-core-56.1/libibverbs/ibdev_nl.c000066400000000000000000000156411477342711600174120ustar00rootroot00000000000000/* * Copyright (c) 2019, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include "ibverbs.h" /* Determine the name of the uverbsX class for the sysfs_dev using sysfs. */ static int find_uverbs_sysfs(struct verbs_sysfs_dev *sysfs_dev) { char path[IBV_SYSFS_PATH_MAX]; struct dirent *dent; DIR *class_dir; int ret = ENOENT; if (!check_snprintf(path, sizeof(path), "%s/device/infiniband_verbs", sysfs_dev->ibdev_path)) return ENOMEM; class_dir = opendir(path); if (!class_dir) return ENOSYS; while ((dent = readdir(class_dir))) { int uv_dirfd; bool failed; if (dent->d_name[0] == '.') continue; uv_dirfd = openat(dirfd(class_dir), dent->d_name, O_RDONLY | O_DIRECTORY | O_CLOEXEC); if (uv_dirfd == -1) break; failed = setup_sysfs_uverbs(uv_dirfd, dent->d_name, sysfs_dev); close(uv_dirfd); if (!failed) ret = 0; break; } closedir(class_dir); return ret; } static int find_uverbs_nl_cb(struct nl_msg *msg, void *data) { struct verbs_sysfs_dev *sysfs_dev = data; struct nlattr *tb[RDMA_NLDEV_ATTR_MAX]; uint64_t cdev64; int ret; ret = nlmsg_parse(nlmsg_hdr(msg), 0, tb, RDMA_NLDEV_ATTR_MAX - 1, rdmanl_policy); if (ret < 0) return ret; if (!tb[RDMA_NLDEV_ATTR_CHARDEV] || !tb[RDMA_NLDEV_ATTR_CHARDEV_ABI] || !tb[RDMA_NLDEV_ATTR_CHARDEV_NAME]) return NLE_PARSE_ERR; /* * The global uverbs abi is 6 for the request string 'uverbs'. We * don't expect to ever have to change the ABI version for uverbs * again. */ abi_ver = 6; /* * The top 32 bits of CHARDEV_ABI are reserved for a future use, * current kernels set them to 0 */ sysfs_dev->abi_ver = nla_get_u64(tb[RDMA_NLDEV_ATTR_CHARDEV_ABI]); if (tb[RDMA_NLDEV_ATTR_UVERBS_DRIVER_ID]) sysfs_dev->driver_id = nla_get_u32(tb[RDMA_NLDEV_ATTR_UVERBS_DRIVER_ID]); else sysfs_dev->driver_id = RDMA_DRIVER_UNKNOWN; /* Convert from huge_encode_dev to whatever glibc uses */ cdev64 = nla_get_u64(tb[RDMA_NLDEV_ATTR_CHARDEV]); sysfs_dev->sysfs_cdev = makedev((cdev64 & 0xfff00) >> 8, (cdev64 & 0xff) | ((cdev64 >> 12) & 0xfff00)); if (!check_snprintf(sysfs_dev->sysfs_name, sizeof(sysfs_dev->sysfs_name), "%s", nla_get_string(tb[RDMA_NLDEV_ATTR_CHARDEV_NAME]))) return NLE_PARSE_ERR; return 0; } /* Ask the kernel for the uverbs char device information */ static int find_uverbs_nl(struct nl_sock *nl, struct verbs_sysfs_dev *sysfs_dev) { if (rdmanl_get_chardev(nl, sysfs_dev->ibdev_idx, "uverbs", find_uverbs_nl_cb, sysfs_dev)) return -1; if (!sysfs_dev->sysfs_name[0]) return -1; return 0; } static int find_sysfs_devs_nl_cb(struct nl_msg *msg, void *data) { struct nlattr *tb[RDMA_NLDEV_ATTR_MAX]; struct list_head *sysfs_list = data; struct verbs_sysfs_dev *sysfs_dev; int ret; ret = nlmsg_parse(nlmsg_hdr(msg), 0, tb, RDMA_NLDEV_ATTR_MAX - 1, rdmanl_policy); if (ret < 0) return ret; if (!tb[RDMA_NLDEV_ATTR_DEV_NAME] || !tb[RDMA_NLDEV_ATTR_DEV_NODE_TYPE] || !tb[RDMA_NLDEV_ATTR_DEV_INDEX] || !tb[RDMA_NLDEV_ATTR_NODE_GUID] || !tb[RDMA_NLDEV_ATTR_PORT_INDEX]) return NLE_PARSE_ERR; sysfs_dev = calloc(1, sizeof(*sysfs_dev)); if (!sysfs_dev) return NLE_NOMEM; sysfs_dev->ibdev_idx = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]); sysfs_dev->num_ports = nla_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]); sysfs_dev->node_guid = nla_get_u64(tb[RDMA_NLDEV_ATTR_NODE_GUID]); sysfs_dev->flags |= VSYSFS_READ_NODE_GUID; if (!check_snprintf(sysfs_dev->ibdev_name, sizeof(sysfs_dev->ibdev_name), "%s", nla_get_string(tb[RDMA_NLDEV_ATTR_DEV_NAME]))) goto err; if (!check_snprintf( sysfs_dev->ibdev_path, sizeof(sysfs_dev->ibdev_path), "%s/class/infiniband/%s", ibv_get_sysfs_path(), sysfs_dev->ibdev_name)) goto err; 
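/*
 * ibdev_path is populated even in netlink mode; helpers such as
 * ibv_read_ibdev_sysfs_file() still read attributes (e.g.
 * device/modalias) relative to this sysfs directory.
 */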
sysfs_dev->node_type = decode_knode_type( nla_get_u8(tb[RDMA_NLDEV_ATTR_DEV_NODE_TYPE])); /* * We don't need to check the cdev as netlink only shows us devices in * this namespace */ list_add(sysfs_list, &sysfs_dev->entry); return NL_OK; err: free(sysfs_dev); return NLE_PARSE_ERR; } /* Fetch the list of IB devices and uverbs from netlink */ int find_sysfs_devs_nl(struct list_head *tmp_sysfs_dev_list) { struct verbs_sysfs_dev *dev, *dev_tmp; struct nl_sock *nl; nl = rdmanl_socket_alloc(); if (!nl) return -EOPNOTSUPP; if (rdmanl_get_devices(nl, find_sysfs_devs_nl_cb, tmp_sysfs_dev_list)) goto err; list_for_each_safe (tmp_sysfs_dev_list, dev, dev_tmp, entry) { if ((find_uverbs_nl(nl, dev) && find_uverbs_sysfs(dev)) || try_access_device(dev)) { list_del(&dev->entry); free(dev); } } nl_socket_free(nl); return 0; err: list_for_each_safe (tmp_sysfs_dev_list, dev, dev_tmp, entry) { list_del(&dev->entry); free(dev); } nl_socket_free(nl); return EINVAL; } static int get_copy_on_fork_cb(struct nl_msg *msg, void *data) { struct nlattr *tb[RDMA_NLDEV_ATTR_MAX]; int ret; ret = nlmsg_parse(nlmsg_hdr(msg), 0, tb, RDMA_NLDEV_ATTR_MAX - 1, rdmanl_policy); if (ret < 0) return ret; /* Older kernels don't support COF and don't report it through nl */ if (!tb[RDMA_NLDEV_SYS_ATTR_COPY_ON_FORK]) { *(uint8_t *)data = 0; return NL_OK; } *(uint8_t *)data = nla_get_u8(tb[RDMA_NLDEV_SYS_ATTR_COPY_ON_FORK]); return NL_OK; } bool get_copy_on_fork(void) { struct nl_sock *nl; uint8_t cof; nl = rdmanl_socket_alloc(); if (!nl) return false; if (rdmanl_get_copy_on_fork(nl, get_copy_on_fork_cb, &cof)) cof = false; nl_socket_free(nl); return cof; } rdma-core-56.1/libibverbs/ibverbs.h000066400000000000000000000060341477342711600172650ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved. * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef IB_VERBS_H #define IB_VERBS_H #include #include #include #define INIT __attribute__((constructor)) #define PFX "libibverbs: " #define VERBS_OPS_NUM (sizeof(struct verbs_context_ops) / sizeof(void *)) struct ibv_abi_compat_v2 { struct ibv_comp_channel channel; pthread_mutex_t in_use; }; extern int abi_ver; extern const struct verbs_context_ops verbs_dummy_ops; int ibverbs_get_device_list(struct list_head *list); int ibverbs_init(void); void ibverbs_device_put(struct ibv_device *dev); void ibverbs_device_hold(struct ibv_device *dev); int __lib_query_port(struct ibv_context *context, uint8_t port_num, struct ibv_port_attr *port_attr, size_t port_attr_len); int setup_sysfs_uverbs(int uv_dirfd, const char *uverbs, struct verbs_sysfs_dev *sysfs_dev); #ifdef _STATIC_LIBRARY_BUILD_ static inline void load_drivers(void) { } #else void load_drivers(void); #endif struct verbs_ex_private { BMP_DECLARE(unsupported_ioctls, VERBS_OPS_NUM); uint32_t driver_id; bool use_ioctl_write; struct verbs_context_ops ops; bool imported; }; static inline struct verbs_ex_private *get_priv(struct ibv_context *ctx) { return container_of(ctx, struct verbs_context, context)->priv; } static inline const struct verbs_context_ops *get_ops(struct ibv_context *ctx) { return &get_priv(ctx)->ops; } enum ibv_node_type decode_knode_type(unsigned int knode_type); int find_sysfs_devs_nl(struct list_head *tmp_sysfs_dev_list); int try_access_device(const struct verbs_sysfs_dev *sysfs_dev); #endif /* IB_VERBS_H */ rdma-core-56.1/libibverbs/init.c000066400000000000000000000412331477342711600165670ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved. * Copyright (c) 2006 Cisco Systems, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "driver.h" #include "ibverbs.h" #include int abi_ver; static uint32_t verbs_log_level; static FILE *verbs_log_fp; __attribute__((format(printf, 3, 4))) void __verbs_log(struct verbs_context *ctx, uint32_t level, const char *fmt, ...) 
{ va_list args; if (level <= verbs_log_level) { int tmp = errno; va_start(args, fmt); vfprintf(verbs_log_fp, fmt, args); va_end(args); errno = tmp; } } struct ibv_driver { struct list_node entry; const struct verbs_device_ops *ops; }; static LIST_HEAD(driver_list); int try_access_device(const struct verbs_sysfs_dev *sysfs_dev) { struct stat cdev_stat; char *devpath; int ret; if (asprintf(&devpath, RDMA_CDEV_DIR"/%s", sysfs_dev->sysfs_name) < 0) return ENOMEM; ret = stat(devpath, &cdev_stat); free(devpath); return ret; } enum ibv_node_type decode_knode_type(unsigned int knode_type) { switch (knode_type) { case RDMA_NODE_IB_CA: return IBV_NODE_CA; case RDMA_NODE_IB_SWITCH: return IBV_NODE_SWITCH; case RDMA_NODE_IB_ROUTER: return IBV_NODE_ROUTER; case RDMA_NODE_RNIC: return IBV_NODE_RNIC; case RDMA_NODE_USNIC: return IBV_NODE_USNIC; case RDMA_NODE_USNIC_UDP: return IBV_NODE_USNIC_UDP; case RDMA_NODE_UNSPECIFIED: return IBV_NODE_UNSPECIFIED; } return IBV_NODE_UNKNOWN; } int setup_sysfs_uverbs(int uv_dirfd, const char *uverbs, struct verbs_sysfs_dev *sysfs_dev) { unsigned int major; unsigned int minor; struct stat buf; char value[32]; if (!check_snprintf(sysfs_dev->sysfs_name, sizeof(sysfs_dev->sysfs_name), "%s", uverbs)) return -1; if (stat(sysfs_dev->ibdev_path, &buf)) return -1; sysfs_dev->time_created = buf.st_mtim; if (ibv_read_sysfs_file_at(uv_dirfd, "dev", value, sizeof(value)) < 0) return -1; if (sscanf(value, "%u:%u", &major, &minor) != 2) return -1; sysfs_dev->sysfs_cdev = makedev(major, minor); if (ibv_read_sysfs_file_at(uv_dirfd, "abi_version", value, sizeof(value)) > 0) sysfs_dev->abi_ver = strtoul(value, NULL, 10); return 0; } static int setup_sysfs_dev(int dirfd, const char *uverbs, struct list_head *tmp_sysfs_dev_list) { struct verbs_sysfs_dev *sysfs_dev = NULL; char value[32]; int uv_dirfd; sysfs_dev = calloc(1, sizeof(*sysfs_dev)); if (!sysfs_dev) return ENOMEM; sysfs_dev->ibdev_idx = -1; uv_dirfd = openat(dirfd, uverbs, O_RDONLY | O_DIRECTORY | O_CLOEXEC); if (uv_dirfd == -1) goto err_alloc; if (ibv_read_sysfs_file_at(uv_dirfd, "ibdev", sysfs_dev->ibdev_name, sizeof(sysfs_dev->ibdev_name)) < 0) goto err_fd; if (!check_snprintf( sysfs_dev->ibdev_path, sizeof(sysfs_dev->ibdev_path), "%s/class/infiniband/%s", ibv_get_sysfs_path(), sysfs_dev->ibdev_name)) goto err_fd; if (setup_sysfs_uverbs(uv_dirfd, uverbs, sysfs_dev)) goto err_fd; if (ibv_read_ibdev_sysfs_file(value, sizeof(value), sysfs_dev, "node_type") <= 0) sysfs_dev->node_type = IBV_NODE_UNKNOWN; else sysfs_dev->node_type = decode_knode_type(strtoul(value, NULL, 10)); if (try_access_device(sysfs_dev)) goto err_fd; close(uv_dirfd); list_add(tmp_sysfs_dev_list, &sysfs_dev->entry); return 0; err_fd: close(uv_dirfd); err_alloc: free(sysfs_dev); return 0; } static int find_sysfs_devs(struct list_head *tmp_sysfs_dev_list) { struct verbs_sysfs_dev *dev, *dev_tmp; char class_path[IBV_SYSFS_PATH_MAX]; DIR *class_dir; struct dirent *dent; int ret = 0; if (!check_snprintf(class_path, sizeof(class_path), "%s/class/infiniband_verbs", ibv_get_sysfs_path())) return ENOMEM; class_dir = opendir(class_path); if (!class_dir) return ENOSYS; while ((dent = readdir(class_dir))) { if (dent->d_name[0] == '.') continue; ret = setup_sysfs_dev(dirfd(class_dir), dent->d_name, tmp_sysfs_dev_list); if (ret) break; } closedir(class_dir); if (ret) { list_for_each_safe (tmp_sysfs_dev_list, dev, dev_tmp, entry) { list_del(&dev->entry); free(dev); } } return ret; } void verbs_register_driver(const struct verbs_device_ops *ops) { struct ibv_driver *driver; 
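/*
 * Provider libraries call this (via their registration entry point) from
 * constructors as load_drivers() loads them, so driver_list is populated
 * before any device matching below runs.
 */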
driver = malloc(sizeof *driver); if (!driver) { fprintf(stderr, PFX "Warning: couldn't allocate driver for %s\n", ops->name); return; } driver->ops = ops; list_add_tail(&driver_list, &driver->entry); } /* Match a single modalias value */ static bool match_modalias(const struct verbs_match_ent *ent, const char *value) { char pci_ma[100]; switch (ent->kind) { case VERBS_MATCH_MODALIAS: return fnmatch(ent->u.modalias, value, 0) == 0; case VERBS_MATCH_PCI: snprintf(pci_ma, sizeof(pci_ma), "pci:v%08Xd%08Xsv*", ent->vendor, ent->device); return fnmatch(pci_ma, value, 0) == 0; default: return false; } } /* Search a null terminated table of verbs_match_ent's and return the one * that matches the device the verbs sysfs device is bound to or NULL. */ static const struct verbs_match_ent * match_modalias_device(const struct verbs_device_ops *ops, struct verbs_sysfs_dev *sysfs_dev) { const struct verbs_match_ent *i; if (!(sysfs_dev->flags & VSYSFS_READ_MODALIAS)) { sysfs_dev->flags |= VSYSFS_READ_MODALIAS; if (ibv_read_ibdev_sysfs_file( sysfs_dev->modalias, sizeof(sysfs_dev->modalias), sysfs_dev, "device/modalias") <= 0) { sysfs_dev->modalias[0] = 0; return NULL; } } for (i = ops->match_table; i->kind != VERBS_MATCH_SENTINEL; i++) if (match_modalias(i, sysfs_dev->modalias)) return i; return NULL; } /* Match the device name itself */ static const struct verbs_match_ent * match_name(const struct verbs_device_ops *ops, struct verbs_sysfs_dev *sysfs_dev) { char name_ma[100]; const struct verbs_match_ent *i; if (!check_snprintf(name_ma, sizeof(name_ma), "rdma_device:N%s", sysfs_dev->ibdev_name)) return NULL; for (i = ops->match_table; i->kind != VERBS_MATCH_SENTINEL; i++) if (match_modalias(i, name_ma)) return i; return NULL; } /* Match the driver id we get from netlink */ static const struct verbs_match_ent * match_driver_id(const struct verbs_device_ops *ops, struct verbs_sysfs_dev *sysfs_dev) { const struct verbs_match_ent *i; if (sysfs_dev->driver_id == RDMA_DRIVER_UNKNOWN) return NULL; for (i = ops->match_table; i->kind != VERBS_MATCH_SENTINEL; i++) if (i->kind == VERBS_MATCH_DRIVER_ID && i->u.driver_id == sysfs_dev->driver_id) return i; return NULL; } /* True if the provider matches the selected rdma sysfs device */ static bool match_device(const struct verbs_device_ops *ops, struct verbs_sysfs_dev *sysfs_dev) { if (ops->match_table) { sysfs_dev->match = match_driver_id(ops, sysfs_dev); if (!sysfs_dev->match) sysfs_dev->match = match_name(ops, sysfs_dev); if (!sysfs_dev->match) sysfs_dev->match = match_modalias_device(ops, sysfs_dev); } if (ops->match_device) { /* If a matching function is provided then it is called * unconditionally after the table match above, it is * responsible for determining if the device matches based on * the match pointer and any other internal information. 
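 *
 * For illustration only (hypothetical device IDs): a purely table-driven
 * provider with no match_device callback might declare
 *
 *	static const struct verbs_match_ent hca_table[] = {
 *		{ .kind = VERBS_MATCH_PCI,
 *		  .vendor = 0x15b3, .device = 0x1017 },
 *		{},
 *	};
 *
 * where the zero-initialized final entry serves as the
 * VERBS_MATCH_SENTINEL terminator that the table walks above stop on.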
*/ if (!ops->match_device(sysfs_dev)) return false; } else { /* With no match function, we must have a table match */ if (!sysfs_dev->match) return false; } if (sysfs_dev->abi_ver < ops->match_min_abi_version || sysfs_dev->abi_ver > ops->match_max_abi_version) { fprintf(stderr, PFX "Warning: Driver %s does not support the kernel ABI of %u (supports %u to %u) for device %s\n", ops->name, sysfs_dev->abi_ver, ops->match_min_abi_version, ops->match_max_abi_version, sysfs_dev->ibdev_path); return false; } return true; } static struct verbs_device *try_driver(const struct verbs_device_ops *ops, struct verbs_sysfs_dev *sysfs_dev) { struct verbs_device *vdev; struct ibv_device *dev; if (!match_device(ops, sysfs_dev)) return NULL; vdev = ops->alloc_device(sysfs_dev); if (!vdev) { fprintf(stderr, PFX "Fatal: couldn't allocate device for %s\n", sysfs_dev->ibdev_path); return NULL; } vdev->ops = ops; atomic_init(&vdev->refcount, 1); dev = &vdev->device; assert(dev->_ops._dummy1 == NULL); assert(dev->_ops._dummy2 == NULL); dev->node_type = sysfs_dev->node_type; switch (sysfs_dev->node_type) { case IBV_NODE_CA: case IBV_NODE_SWITCH: case IBV_NODE_ROUTER: dev->transport_type = IBV_TRANSPORT_IB; break; case IBV_NODE_RNIC: dev->transport_type = IBV_TRANSPORT_IWARP; break; case IBV_NODE_USNIC: dev->transport_type = IBV_TRANSPORT_USNIC; break; case IBV_NODE_USNIC_UDP: dev->transport_type = IBV_TRANSPORT_USNIC_UDP; break; case IBV_NODE_UNSPECIFIED: dev->transport_type = IBV_TRANSPORT_UNSPECIFIED; break; default: dev->transport_type = IBV_TRANSPORT_UNKNOWN; break; } strcpy(dev->dev_name, sysfs_dev->sysfs_name); if (!check_snprintf(dev->dev_path, sizeof(dev->dev_path), "%s/class/infiniband_verbs/%s", ibv_get_sysfs_path(), sysfs_dev->sysfs_name)) goto err; strcpy(dev->name, sysfs_dev->ibdev_name); strcpy(dev->ibdev_path, sysfs_dev->ibdev_path); vdev->sysfs = sysfs_dev; return vdev; err: ops->uninit_device(vdev); return NULL; } static struct verbs_device *try_drivers(struct verbs_sysfs_dev *sysfs_dev) { struct ibv_driver *driver; struct verbs_device *dev; /* * Matching by driver_id takes priority over other match types, do it * first. 
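 * The driver_id value is reported by the kernel over netlink, so it is an
 * exact identification of the bound kernel driver and is checked before
 * the fnmatch()-based name and modalias fallbacks.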
*/ if (sysfs_dev->driver_id != RDMA_DRIVER_UNKNOWN) { list_for_each (&driver_list, driver, entry) { if (match_driver_id(driver->ops, sysfs_dev)) { dev = try_driver(driver->ops, sysfs_dev); if (dev) return dev; } } } list_for_each(&driver_list, driver, entry) { dev = try_driver(driver->ops, sysfs_dev); if (dev) return dev; } return NULL; }
static int check_abi_version(void) { char value[8]; if (abi_ver) return 0; if (ibv_read_sysfs_file(ibv_get_sysfs_path(), "class/infiniband_verbs/abi_version", value, sizeof(value)) < 0) { return ENOSYS; } abi_ver = strtol(value, NULL, 10); if (abi_ver < IB_USER_VERBS_MIN_ABI_VERSION || abi_ver > IB_USER_VERBS_MAX_ABI_VERSION) { fprintf(stderr, PFX "Fatal: kernel ABI version %d " "doesn't match library version %d.\n", abi_ver, IB_USER_VERBS_MAX_ABI_VERSION); return ENOSYS; } return 0; }
static void check_memlock_limit(void) { struct rlimit rlim; if (!geteuid()) return; if (getrlimit(RLIMIT_MEMLOCK, &rlim)) { fprintf(stderr, PFX "Warning: getrlimit(RLIMIT_MEMLOCK) failed."); return; } if (rlim.rlim_cur <= 32768) fprintf(stderr, PFX "Warning: RLIMIT_MEMLOCK is %llu bytes.\n" " This will severely limit memory registrations.\n", (unsigned long long)rlim.rlim_cur); }
static int same_sysfs_dev(struct verbs_sysfs_dev *sysfs1, struct verbs_sysfs_dev *sysfs2) { if (strcmp(sysfs1->sysfs_name, sysfs2->sysfs_name) != 0) return 0; /* In netlink mode the idx is a globally unique ID */ if (sysfs1->ibdev_idx != sysfs2->ibdev_idx) return 0; if (sysfs1->ibdev_idx == -1 && ts_cmp(&sysfs1->time_created, &sysfs2->time_created, !=)) return 0; return 1; }
/* Match every ibv_sysfs_dev in the sysfs_list to a driver and add a new entry * to device_list. Once matched to a driver the entry in sysfs_list is * removed. */ static void try_all_drivers(struct list_head *sysfs_list, struct list_head *device_list, unsigned int *num_devices) { struct verbs_sysfs_dev *sysfs_dev; struct verbs_sysfs_dev *tmp; struct verbs_device *vdev; list_for_each_safe(sysfs_list, sysfs_dev, tmp, entry) { vdev = try_drivers(sysfs_dev); if (vdev) { list_del(&sysfs_dev->entry); /* Ownership of sysfs_dev moves into vdev->sysfs */ list_add(device_list, &vdev->entry); (*num_devices)++; } } }
int ibverbs_get_device_list(struct list_head *device_list) { LIST_HEAD(sysfs_list); struct verbs_sysfs_dev *sysfs_dev, *next_dev; struct verbs_device *vdev, *tmp; static int drivers_loaded; unsigned int num_devices = 0; int ret; ret = find_sysfs_devs_nl(&sysfs_list); if (ret) { ret = find_sysfs_devs(&sysfs_list); if (ret) return -ret; } if (!list_empty(&sysfs_list)) { ret = check_abi_version(); if (ret) return -ret; } /* Remove entries from the sysfs_list that are already present in the * device_list, and remove entries from the device_list that are not * present in the sysfs_list. */ list_for_each_safe(device_list, vdev, tmp, entry) { struct verbs_sysfs_dev *old_sysfs = NULL; list_for_each(&sysfs_list, sysfs_dev, entry) { if (same_sysfs_dev(vdev->sysfs, sysfs_dev)) { old_sysfs = sysfs_dev; break; } } if (old_sysfs) { list_del(&old_sysfs->entry); free(old_sysfs); num_devices++; } else { list_del(&vdev->entry); ibverbs_device_put(&vdev->device); } } try_all_drivers(&sysfs_list, device_list, &num_devices); if (list_empty(&sysfs_list) || drivers_loaded) goto out; load_drivers(); drivers_loaded = 1; try_all_drivers(&sysfs_list, device_list, &num_devices); out: /* Anything left in sysfs_list was not associated with a * driver.
*/ list_for_each_safe(&sysfs_list, sysfs_dev, next_dev, entry) { if (getenv("IBV_SHOW_WARNINGS")) { fprintf(stderr, PFX "Warning: no userspace device-specific driver found for %s\n", sysfs_dev->ibdev_name); } free(sysfs_dev); } return num_devices; } static void verbs_set_log_level(void) { char *env; env = getenv("VERBS_LOG_LEVEL"); if (env) verbs_log_level = strtol(env, NULL, 0); } /* * Fallback in case log file is not provided or can't be opened. * Release mode: disable debug prints. * Debug mode: Use stderr instead of a file. */ static void verbs_log_file_fallback(void) { #ifdef VERBS_DEBUG verbs_log_fp = stderr; #else verbs_log_level = VERBS_LOG_LEVEL_NONE; #endif } static void verbs_set_log_file(void) { char *env; if (verbs_log_level == VERBS_LOG_LEVEL_NONE) return; env = getenv("VERBS_LOG_FILE"); if (!env) { verbs_log_file_fallback(); return; } verbs_log_fp = fopen(env, "aw+"); if (!verbs_log_fp) { verbs_log_file_fallback(); return; } } int ibverbs_init(void) { if (check_env("RDMAV_FORK_SAFE") || check_env("IBV_FORK_SAFE")) if (ibv_fork_init()) fprintf(stderr, PFX "Warning: fork()-safety requested " "but init failed\n"); verbs_allow_disassociate_destroy = check_env("RDMAV_ALLOW_DISASSOC_DESTROY") /* Backward compatibility for the mlx4 driver env */ || check_env("MLX4_DEVICE_FATAL_CLEANUP"); if (!ibv_get_sysfs_path()) return -errno; check_memlock_limit(); verbs_set_log_level(); verbs_set_log_file(); return 0; } void ibverbs_device_hold(struct ibv_device *dev) { struct verbs_device *verbs_device = verbs_get_device(dev); atomic_fetch_add(&verbs_device->refcount, 1); } void ibverbs_device_put(struct ibv_device *dev) { struct verbs_device *verbs_device = verbs_get_device(dev); if (atomic_fetch_sub(&verbs_device->refcount, 1) == 1) { free(verbs_device->sysfs); if (verbs_device->ops->uninit_device) verbs_device->ops->uninit_device(verbs_device); } } rdma-core-56.1/libibverbs/kern-abi.h000066400000000000000000000366631477342711600173340ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005, 2006 Cisco Systems. All rights reserved. * Copyright (c) 2005 PathScale, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef KERN_ABI_H #define KERN_ABI_H #include #include #include #include #include /* * The minimum and maximum kernel ABI that we can handle. */ #define IB_USER_VERBS_MIN_ABI_VERSION 3 #define IB_USER_VERBS_MAX_ABI_VERSION 6 struct ex_hdr { struct ib_uverbs_cmd_hdr hdr; struct ib_uverbs_ex_cmd_hdr ex_hdr; }; /* * These macros expand to type names that refer to the ABI structure type * associated with the given enum string. */ #define IBV_ABI_REQ(_enum) _ABI_REQ_STRUCT_##_enum #define IBV_KABI_REQ(_enum) _KABI_REQ_STRUCT_##_enum #define IBV_KABI_RESP(_enum) _KABI_RESP_STRUCT_##_enum #define IBV_ABI_ALIGN(_enum) _ABI_ALIGN_##_enum /* * Historically the code had copied the data in the kernel headers, modified * it and placed them in structs. To avoid recoding eveything we continue to * preserve the same struct layout, with the kernel struct 'loose' inside the * modified userspace struct. * * This is automated with the make_abi_structs.py script which produces the * _STRUCT_xx macro that produces a tagless version of the kernel struct. The * tagless struct produces a layout that matches the original code. */ #define DECLARE_CMDX(_enum, _name, _kabi, _kabi_resp) \ struct _name { \ struct ib_uverbs_cmd_hdr hdr; \ union { \ _STRUCT_##_kabi; \ struct _kabi core_payload; \ }; \ }; \ typedef struct _name IBV_ABI_REQ(_enum); \ typedef struct _kabi IBV_KABI_REQ(_enum); \ typedef struct _kabi_resp IBV_KABI_RESP(_enum); \ enum { IBV_ABI_ALIGN(_enum) = 4 }; \ static_assert(sizeof(struct _kabi_resp) % 4 == 0, \ "Bad resp alignment"); \ static_assert(_enum != -1, "Bad enum"); \ static_assert(sizeof(struct _name) == \ sizeof(struct ib_uverbs_cmd_hdr) + \ sizeof(struct _kabi), \ "Bad size") #define DECLARE_CMD(_enum, _name, _kabi) \ DECLARE_CMDX(_enum, _name, _kabi, _kabi##_resp) #define DECLARE_CMD_EXX(_enum, _name, _kabi, _kabi_resp) \ struct _name { \ struct ex_hdr hdr; \ union { \ _STRUCT_##_kabi; \ struct _kabi core_payload; \ }; \ }; \ typedef struct _name IBV_ABI_REQ(_enum); \ typedef struct _kabi IBV_KABI_REQ(_enum); \ typedef struct _kabi_resp IBV_KABI_RESP(_enum); \ enum { IBV_ABI_ALIGN(_enum) = 8 }; \ static_assert(_enum != -1, "Bad enum"); \ static_assert(sizeof(struct _kabi) % 8 == 0, "Bad req alignment"); \ static_assert(sizeof(struct _kabi_resp) % 8 == 0, \ "Bad resp alignment"); \ static_assert(sizeof(struct _name) == \ sizeof(struct ex_hdr) + sizeof(struct _kabi), \ "Bad size"); \ static_assert(sizeof(struct _name) % 8 == 0, "Bad alignment") #define DECLARE_CMD_EX(_enum, _name, _kabi) \ DECLARE_CMD_EXX(_enum, _name, _kabi, _kabi##_resp) /* Drivers may use 'empty' for _kabi to signal no struct */ struct empty {}; #define _STRUCT_empty struct {} /* * Define the ABI struct for use by the driver. The internal cmd APIs require * this layout. The driver specifies the enum # they wish to define for and * the base name, and the macros figure out the rest correctly. * * The static asserts check that the layout produced by the wrapper struct has * no implicit padding in strange places, specifically between the core * structure and the driver structure and between the driver structure and the * end of the struct. * * Implicit padding can arise in various cases where the structs are not sizes * to a multiple of 8 bytes. 
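 *
 * As a sketch (hypothetical driver structs, not from this tree), a driver
 * with no request payload of its own could write:
 *
 *	DECLARE_DRV_CMD(mydrv_alloc_pd, IB_USER_VERBS_CMD_ALLOC_PD,
 *			empty, mydrv_alloc_pd_resp);
 *
 * yielding 'struct mydrv_alloc_pd' that wraps the core request and
 * 'struct mydrv_alloc_pd_resp' that appends the driver's response payload
 * to the core response.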
*/ #define DECLARE_DRV_CMD(_name, _enum, _kabi_req, _kabi_resp) \ struct _name { \ IBV_ABI_REQ(_enum) ibv_cmd; \ union { \ _STRUCT_##_kabi_req; \ struct _kabi_req drv_payload; \ }; \ }; \ struct _name##_resp { \ IBV_KABI_RESP(_enum) ibv_resp; \ union { \ _STRUCT_##_kabi_resp; \ struct _kabi_resp drv_payload; \ }; \ }; \ static_assert(sizeof(IBV_KABI_REQ(_enum)) % \ __alignof__(struct _kabi_req) == \ 0, \ "Bad kabi req struct length"); \ static_assert(sizeof(struct _name) == \ sizeof(IBV_ABI_REQ(_enum)) + \ sizeof(struct _kabi_req), \ "Bad req size"); \ static_assert(sizeof(struct _name) % IBV_ABI_ALIGN(_enum) == 0, \ "Bad kabi req alignment"); \ static_assert(sizeof(IBV_KABI_RESP(_enum)) % \ __alignof__(struct _kabi_resp) == \ 0, \ "Bad kabi resp struct length"); \ static_assert(sizeof(struct _name##_resp) == \ sizeof(IBV_KABI_RESP(_enum)) + \ sizeof(struct _kabi_resp), \ "Bad resp size"); \ static_assert(sizeof(struct _name##_resp) % IBV_ABI_ALIGN(_enum) == 0, \ "Bad kabi resp alignment"); DECLARE_CMD(IB_USER_VERBS_CMD_ALLOC_MW, ibv_alloc_mw, ib_uverbs_alloc_mw); DECLARE_CMD(IB_USER_VERBS_CMD_ALLOC_PD, ibv_alloc_pd, ib_uverbs_alloc_pd); DECLARE_CMDX(IB_USER_VERBS_CMD_ATTACH_MCAST, ibv_attach_mcast, ib_uverbs_attach_mcast, empty); DECLARE_CMDX(IB_USER_VERBS_CMD_CLOSE_XRCD, ibv_close_xrcd, ib_uverbs_close_xrcd, empty); DECLARE_CMD(IB_USER_VERBS_CMD_CREATE_AH, ibv_create_ah, ib_uverbs_create_ah); DECLARE_CMD(IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL, ibv_create_comp_channel, ib_uverbs_create_comp_channel); DECLARE_CMD(IB_USER_VERBS_CMD_CREATE_CQ, ibv_create_cq, ib_uverbs_create_cq); DECLARE_CMD(IB_USER_VERBS_CMD_CREATE_QP, ibv_create_qp, ib_uverbs_create_qp); DECLARE_CMD(IB_USER_VERBS_CMD_CREATE_SRQ, ibv_create_srq, ib_uverbs_create_srq); DECLARE_CMDX(IB_USER_VERBS_CMD_CREATE_XSRQ, ibv_create_xsrq, ib_uverbs_create_xsrq, ib_uverbs_create_srq_resp); DECLARE_CMDX(IB_USER_VERBS_CMD_DEALLOC_MW, ibv_dealloc_mw, ib_uverbs_dealloc_mw, empty); DECLARE_CMDX(IB_USER_VERBS_CMD_DEALLOC_PD, ibv_dealloc_pd, ib_uverbs_dealloc_pd, empty); DECLARE_CMDX(IB_USER_VERBS_CMD_DEREG_MR, ibv_dereg_mr, ib_uverbs_dereg_mr, empty); DECLARE_CMDX(IB_USER_VERBS_CMD_DESTROY_AH, ibv_destroy_ah, ib_uverbs_destroy_ah, empty); DECLARE_CMD(IB_USER_VERBS_CMD_DESTROY_CQ, ibv_destroy_cq, ib_uverbs_destroy_cq); DECLARE_CMD(IB_USER_VERBS_CMD_DESTROY_QP, ibv_destroy_qp, ib_uverbs_destroy_qp); DECLARE_CMD(IB_USER_VERBS_CMD_DESTROY_SRQ, ibv_destroy_srq, ib_uverbs_destroy_srq); DECLARE_CMDX(IB_USER_VERBS_CMD_DETACH_MCAST, ibv_detach_mcast, ib_uverbs_detach_mcast, empty); DECLARE_CMD(IB_USER_VERBS_CMD_GET_CONTEXT, ibv_get_context, ib_uverbs_get_context); DECLARE_CMDX(IB_USER_VERBS_CMD_MODIFY_QP, ibv_modify_qp, ib_uverbs_modify_qp, empty); DECLARE_CMDX(IB_USER_VERBS_CMD_MODIFY_SRQ, ibv_modify_srq, ib_uverbs_modify_srq, empty); DECLARE_CMDX(IB_USER_VERBS_CMD_OPEN_QP, ibv_open_qp, ib_uverbs_open_qp, ib_uverbs_create_qp_resp); DECLARE_CMD(IB_USER_VERBS_CMD_OPEN_XRCD, ibv_open_xrcd, ib_uverbs_open_xrcd); DECLARE_CMD(IB_USER_VERBS_CMD_POLL_CQ, ibv_poll_cq, ib_uverbs_poll_cq); DECLARE_CMD(IB_USER_VERBS_CMD_POST_RECV, ibv_post_recv, ib_uverbs_post_recv); DECLARE_CMD(IB_USER_VERBS_CMD_POST_SEND, ibv_post_send, ib_uverbs_post_send); DECLARE_CMD(IB_USER_VERBS_CMD_POST_SRQ_RECV, ibv_post_srq_recv, ib_uverbs_post_srq_recv); DECLARE_CMD(IB_USER_VERBS_CMD_QUERY_DEVICE, ibv_query_device, ib_uverbs_query_device); DECLARE_CMD(IB_USER_VERBS_CMD_QUERY_PORT, ibv_query_port, ib_uverbs_query_port); DECLARE_CMD(IB_USER_VERBS_CMD_QUERY_QP, ibv_query_qp, 
ib_uverbs_query_qp); DECLARE_CMD(IB_USER_VERBS_CMD_QUERY_SRQ, ibv_query_srq, ib_uverbs_query_srq); DECLARE_CMD(IB_USER_VERBS_CMD_REG_MR, ibv_reg_mr, ib_uverbs_reg_mr); DECLARE_CMDX(IB_USER_VERBS_CMD_REQ_NOTIFY_CQ, ibv_req_notify_cq, ib_uverbs_req_notify_cq, empty); DECLARE_CMD(IB_USER_VERBS_CMD_REREG_MR, ibv_rereg_mr, ib_uverbs_rereg_mr); DECLARE_CMD(IB_USER_VERBS_CMD_RESIZE_CQ, ibv_resize_cq, ib_uverbs_resize_cq); DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_CREATE_CQ, ibv_create_cq_ex, ib_uverbs_ex_create_cq); DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_CREATE_FLOW, ibv_create_flow, ib_uverbs_create_flow); DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_CREATE_QP, ibv_create_qp_ex, ib_uverbs_ex_create_qp); DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL, ibv_create_rwq_ind_table, ib_uverbs_ex_create_rwq_ind_table); DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_CREATE_WQ, ibv_create_wq, ib_uverbs_ex_create_wq); DECLARE_CMD_EXX(IB_USER_VERBS_EX_CMD_DESTROY_FLOW, ibv_destroy_flow, ib_uverbs_destroy_flow, empty); DECLARE_CMD_EXX(IB_USER_VERBS_EX_CMD_DESTROY_RWQ_IND_TBL, ibv_destroy_rwq_ind_table, ib_uverbs_ex_destroy_rwq_ind_table, empty); DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_DESTROY_WQ, ibv_destroy_wq, ib_uverbs_ex_destroy_wq); DECLARE_CMD_EXX(IB_USER_VERBS_EX_CMD_MODIFY_CQ, ibv_modify_cq, ib_uverbs_ex_modify_cq, empty); DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_MODIFY_QP, ibv_modify_qp_ex, ib_uverbs_ex_modify_qp); DECLARE_CMD_EXX(IB_USER_VERBS_EX_CMD_MODIFY_WQ, ibv_modify_wq, ib_uverbs_ex_modify_wq, empty); DECLARE_CMD_EX(IB_USER_VERBS_EX_CMD_QUERY_DEVICE, ibv_query_device_ex, ib_uverbs_ex_query_device); /* * Both ib_uverbs_create_qp and ib_uverbs_ex_create_qp start with the same * structure, this function converts the ex version into the normal version */ static inline struct ib_uverbs_create_qp * ibv_create_qp_ex_to_reg(struct ibv_create_qp_ex *cmd_ex) { /* * user_handle is the start in both places, note that the ex * does not have response located in the same place, so response * cannot be touched. */ return container_of(&cmd_ex->user_handle, struct ib_uverbs_create_qp, user_handle); } /* * This file contains copied data from the kernel's include/uapi/rdma/ib_user_verbs.h, * now included above. * * Whenever possible use the definition from the kernel header and avoid * copying from that header into this file. 
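 * (The ibv_kern_* structs and the *_v3/_v4/_v5 variants below are the
 * copies that still remain, kept for compatibility with older ABI
 * layouts.)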
*/ struct ibv_kern_ipv4_filter { __u32 src_ip; __u32 dst_ip; }; struct ibv_kern_spec_ipv4 { __u32 type; __u16 size; __u16 reserved; struct ibv_kern_ipv4_filter val; struct ibv_kern_ipv4_filter mask; }; struct ibv_kern_spec { union { struct ib_uverbs_flow_spec_hdr hdr; struct ib_uverbs_flow_spec_eth eth; struct ibv_kern_spec_ipv4 ipv4; struct ib_uverbs_flow_spec_ipv4 ipv4_ext; struct ib_uverbs_flow_spec_esp esp; struct ib_uverbs_flow_spec_tcp_udp tcp_udp; struct ib_uverbs_flow_spec_ipv6 ipv6; struct ib_uverbs_flow_spec_gre gre; struct ib_uverbs_flow_spec_tunnel tunnel; struct ib_uverbs_flow_spec_mpls mpls; struct ib_uverbs_flow_spec_action_tag flow_tag; struct ib_uverbs_flow_spec_action_drop drop; struct ib_uverbs_flow_spec_action_handle handle; struct ib_uverbs_flow_spec_action_count flow_count; }; }; struct ib_uverbs_modify_srq_v3 { __u32 srq_handle; __u32 attr_mask; __u32 max_wr; __u32 max_sge; __u32 srq_limit; __u32 reserved; }; #define _STRUCT_ib_uverbs_modify_srq_v3 enum { IB_USER_VERBS_CMD_MODIFY_SRQ_V3 = IB_USER_VERBS_CMD_MODIFY_SRQ }; DECLARE_CMDX(IB_USER_VERBS_CMD_MODIFY_SRQ_V3, ibv_modify_srq_v3, ib_uverbs_modify_srq_v3, empty); struct ibv_create_qp_resp_v3 { __u32 qp_handle; __u32 qpn; }; struct ibv_create_qp_resp_v4 { __u32 qp_handle; __u32 qpn; __u32 max_send_wr; __u32 max_recv_wr; __u32 max_send_sge; __u32 max_recv_sge; __u32 max_inline_data; }; struct ibv_create_srq_resp_v5 { __u32 srq_handle; }; #define _STRUCT_ib_uverbs_create_srq_v5 enum { IB_USER_VERBS_CMD_CREATE_SRQ_V5 = IB_USER_VERBS_CMD_CREATE_SRQ }; DECLARE_CMDX(IB_USER_VERBS_CMD_CREATE_SRQ_V5, ibv_create_srq_v5, ib_uverbs_create_srq, ibv_create_srq_resp_v5); #define _STRUCT_ib_uverbs_create_qp_v4 enum { IB_USER_VERBS_CMD_CREATE_QP_V4 = IB_USER_VERBS_CMD_CREATE_QP }; DECLARE_CMDX(IB_USER_VERBS_CMD_CREATE_QP_V4, ibv_create_qp_v4, ib_uverbs_create_qp, ibv_create_qp_resp_v4); #define _STRUCT_ib_uverbs_create_qp_v3 enum { IB_USER_VERBS_CMD_CREATE_QP_V3 = IB_USER_VERBS_CMD_CREATE_QP }; DECLARE_CMDX(IB_USER_VERBS_CMD_CREATE_QP_V3, ibv_create_qp_v3, ib_uverbs_create_qp, ibv_create_qp_resp_v3); #endif /* KERN_ABI_H */ rdma-core-56.1/libibverbs/libibverbs.map.in000066400000000000000000000116251477342711600207110ustar00rootroot00000000000000/* Do not change this file without reading Documentation/versioning.md */ IBVERBS_1.0 { global: ibv_get_device_list; ibv_free_device_list; ibv_get_device_name; ibv_get_device_guid; ibv_open_device; ibv_close_device; ibv_get_async_event; ibv_ack_async_event; ibv_query_device; ibv_query_port; ibv_query_gid; ibv_query_pkey; ibv_alloc_pd; ibv_dealloc_pd; ibv_reg_mr; ibv_dereg_mr; ibv_create_comp_channel; ibv_destroy_comp_channel; ibv_create_cq; ibv_resize_cq; ibv_destroy_cq; ibv_get_cq_event; ibv_ack_cq_events; ibv_create_srq; ibv_modify_srq; ibv_query_srq; ibv_destroy_srq; ibv_create_qp; ibv_query_qp; ibv_modify_qp; ibv_destroy_qp; ibv_create_ah; ibv_destroy_ah; ibv_attach_mcast; ibv_detach_mcast; ibv_rate_to_mult; mult_to_ibv_rate; /* These historical symbols are now private to libibverbs, but used by other rdma-core libraries. Do not change them. 
*/ ibv_copy_path_rec_from_kern; ibv_copy_path_rec_to_kern; ibv_copy_qp_attr_from_kern; ibv_get_sysfs_path; ibv_read_sysfs_file; local: *; };
IBVERBS_1.1 { global: ibv_ack_async_event; ibv_ack_cq_events; ibv_alloc_pd; ibv_attach_mcast; ibv_close_device; ibv_create_ah; ibv_create_ah_from_wc; ibv_create_cq; ibv_create_qp; ibv_create_srq; ibv_dealloc_pd; ibv_dereg_mr; ibv_destroy_ah; ibv_destroy_cq; ibv_destroy_qp; ibv_destroy_srq; ibv_detach_mcast; ibv_dofork_range; ibv_dontfork_range; ibv_event_type_str; ibv_fork_init; ibv_free_device_list; ibv_get_async_event; ibv_get_cq_event; ibv_get_device_guid; ibv_get_device_list; ibv_get_device_name; ibv_init_ah_from_wc; ibv_modify_qp; ibv_modify_srq; ibv_node_type_str; ibv_open_device; ibv_port_state_str; ibv_query_device; ibv_query_gid; ibv_query_pkey; ibv_query_port; ibv_query_qp; ibv_query_srq; ibv_rate_to_mbps; ibv_reg_mr; ibv_register_driver; ibv_rereg_mr; ibv_resize_cq; ibv_resolve_eth_l2_from_gid; ibv_wc_status_str; mbps_to_ibv_rate; /* These historical symbols are now private to libibverbs, but used by other rdma-core libraries. Do not change them. */ ibv_copy_ah_attr_from_kern; } IBVERBS_1.0;
IBVERBS_1.5 { global: ibv_get_pkey_index; } IBVERBS_1.1; IBVERBS_1.6 { global: ibv_qp_to_qp_ex; } IBVERBS_1.5; IBVERBS_1.7 { global: ibv_reg_mr_iova; } IBVERBS_1.6; IBVERBS_1.8 { global: ibv_reg_mr_iova2; } IBVERBS_1.7; IBVERBS_1.9 { global: ibv_get_device_index; } IBVERBS_1.8; IBVERBS_1.10 { global: ibv_import_device; ibv_import_mr; ibv_import_pd; ibv_query_ece; ibv_set_ece; ibv_unimport_mr; ibv_unimport_pd; } IBVERBS_1.9; IBVERBS_1.11 { global: _ibv_query_gid_ex; _ibv_query_gid_table; } IBVERBS_1.10; IBVERBS_1.12 { global: ibv_reg_dmabuf_mr; } IBVERBS_1.11; IBVERBS_1.13 { global: ibv_import_dm; ibv_is_fork_initialized; ibv_unimport_dm; } IBVERBS_1.12; IBVERBS_1.14 { global: ibv_query_qp_data_in_order; } IBVERBS_1.13;
/* If any symbols in this stanza change ABI then the entire stanza gets a new symbol version. See the top level CMakeLists.txt for this setting.
*/ IBVERBS_PRIVATE_@IBVERBS_PABI_VERSION@ { global: /* These historical symbols are now private to libibverbs */ __ioctl_final_num_attrs; __verbs_log; _verbs_init_and_alloc_context; execute_ioctl; ibv_cmd_advise_mr; ibv_cmd_alloc_dm; ibv_cmd_alloc_mw; ibv_cmd_alloc_pd; ibv_cmd_attach_mcast; ibv_cmd_close_xrcd; ibv_cmd_create_ah; ibv_cmd_create_counters; ibv_cmd_create_cq; ibv_cmd_create_cq_ex; ibv_cmd_create_cq_ex2; ibv_cmd_create_flow; ibv_cmd_create_flow_action_esp; ibv_cmd_create_qp; ibv_cmd_create_qp_ex2; ibv_cmd_create_qp_ex; ibv_cmd_create_rwq_ind_table; ibv_cmd_create_srq; ibv_cmd_create_srq_ex; ibv_cmd_create_wq; ibv_cmd_dealloc_mw; ibv_cmd_dealloc_pd; ibv_cmd_dereg_mr; ibv_cmd_destroy_ah; ibv_cmd_destroy_counters; ibv_cmd_destroy_cq; ibv_cmd_destroy_flow; ibv_cmd_destroy_flow_action; ibv_cmd_destroy_qp; ibv_cmd_destroy_rwq_ind_table; ibv_cmd_destroy_srq; ibv_cmd_destroy_wq; ibv_cmd_detach_mcast; ibv_cmd_free_dm; ibv_cmd_get_context; ibv_cmd_modify_cq; ibv_cmd_modify_flow_action_esp; ibv_cmd_modify_qp; ibv_cmd_modify_qp_ex; ibv_cmd_modify_srq; ibv_cmd_modify_wq; ibv_cmd_open_qp; ibv_cmd_open_xrcd; ibv_cmd_poll_cq; ibv_cmd_post_recv; ibv_cmd_post_send; ibv_cmd_post_srq_recv; ibv_cmd_query_context; ibv_cmd_query_device; ibv_cmd_query_device_any; ibv_cmd_query_mr; ibv_cmd_query_port; ibv_cmd_query_qp; ibv_cmd_query_srq; ibv_cmd_read_counters; ibv_cmd_reg_dm_mr; ibv_cmd_reg_dmabuf_mr; ibv_cmd_reg_mr; ibv_cmd_req_notify_cq; ibv_cmd_rereg_mr; ibv_cmd_resize_cq; ibv_query_gid_type; ibv_read_ibdev_sysfs_file; ibv_wr_opcode_str; verbs_allow_disassociate_destroy; verbs_init_cq; verbs_open_device; verbs_register_driver_@IBVERBS_PABI_VERSION@; verbs_set_ops; verbs_uninit_context; }; rdma-core-56.1/libibverbs/man/000077500000000000000000000000001477342711600162305ustar00rootroot00000000000000rdma-core-56.1/libibverbs/man/CMakeLists.txt000066400000000000000000000075141477342711600207770ustar00rootroot00000000000000rdma_man_pages( ibv_advise_mr.3.md ibv_alloc_dm.3 ibv_alloc_mw.3 ibv_alloc_null_mr.3.md ibv_alloc_parent_domain.3 ibv_alloc_pd.3 ibv_alloc_td.3 ibv_asyncwatch.1 ibv_attach_counters_point_flow.3.md ibv_attach_mcast.3.md ibv_bind_mw.3 ibv_create_ah.3 ibv_create_ah_from_wc.3 ibv_create_comp_channel.3 ibv_create_counters.3.md ibv_create_cq.3 ibv_create_cq_ex.3 ibv_modify_cq.3 ibv_create_flow.3 ibv_create_flow_action.3.md ibv_create_qp.3 ibv_create_qp_ex.3 ibv_create_rwq_ind_table.3 ibv_create_srq.3 ibv_create_srq_ex.3 ibv_create_wq.3 ibv_devices.1 ibv_devinfo.1 ibv_event_type_str.3.md ibv_fork_init.3.md ibv_get_async_event.3 ibv_get_cq_event.3 ibv_get_device_guid.3.md ibv_get_device_index.3.md ibv_get_device_list.3.md ibv_get_device_name.3.md ibv_get_pkey_index.3.md ibv_get_srq_num.3.md ibv_import_device.3.md ibv_import_dm.3.md ibv_import_mr.3.md ibv_import_pd.3.md ibv_inc_rkey.3.md ibv_is_fork_initialized.3.md ibv_modify_qp.3 ibv_modify_qp_rate_limit.3 ibv_modify_srq.3 ibv_modify_wq.3 ibv_open_device.3 ibv_open_qp.3 ibv_open_xrcd.3 ibv_poll_cq.3 ibv_post_recv.3 ibv_post_send.3 ibv_post_srq_ops.3 ibv_post_srq_recv.3 ibv_query_device.3 ibv_query_device_ex.3 ibv_query_ece.3.md ibv_query_gid.3.md ibv_query_gid_ex.3.md ibv_query_gid_table.3.md ibv_query_pkey.3.md ibv_query_port.3 ibv_query_qp.3 ibv_query_qp_data_in_order.3.md ibv_query_rt_values_ex.3 ibv_query_srq.3 ibv_rate_to_mbps.3.md ibv_rate_to_mult.3.md ibv_rc_pingpong.1 ibv_read_counters.3.md ibv_reg_mr.3 ibv_req_notify_cq.3.md ibv_rereg_mr.3.md ibv_resize_cq.3.md ibv_set_ece.3.md ibv_srq_pingpong.1 ibv_uc_pingpong.1 ibv_ud_pingpong.1 
ibv_wr_post.3.md ibv_xsrq_pingpong.1 ) rdma_alias_man_pages( ibv_alloc_dm.3 ibv_free_dm.3 ibv_alloc_dm.3 ibv_reg_dm_mr.3 ibv_alloc_dm.3 ibv_memcpy_to_dm.3 ibv_alloc_dm.3 ibv_memcpy_from_dm.3 ibv_alloc_mw.3 ibv_dealloc_mw.3 ibv_alloc_pd.3 ibv_dealloc_pd.3 ibv_alloc_td.3 ibv_dealloc_td.3 ibv_attach_mcast.3 ibv_detach_mcast.3 ibv_create_ah.3 ibv_destroy_ah.3 ibv_create_ah_from_wc.3 ibv_init_ah_from_wc.3 ibv_create_comp_channel.3 ibv_destroy_comp_channel.3 ibv_create_counters.3 ibv_destroy_counters.3 ibv_create_cq.3 ibv_destroy_cq.3 ibv_create_flow.3 ibv_destroy_flow.3 ibv_create_flow_action.3 ibv_destroy_flow_action.3 ibv_create_flow_action.3 ibv_modify_flow_action.3 ibv_create_qp.3 ibv_destroy_qp.3 ibv_create_rwq_ind_table.3 ibv_destroy_rwq_ind_table.3 ibv_create_srq.3 ibv_destroy_srq.3 ibv_create_wq.3 ibv_destroy_wq.3 ibv_event_type_str.3 ibv_node_type_str.3 ibv_event_type_str.3 ibv_port_state_str.3 ibv_get_async_event.3 ibv_ack_async_event.3 ibv_get_cq_event.3 ibv_ack_cq_events.3 ibv_get_device_list.3 ibv_free_device_list.3 ibv_import_pd.3 ibv_unimport_pd.3 ibv_import_dm.3 ibv_unimport_dm.3 ibv_import_mr.3 ibv_unimport_mr.3 ibv_open_device.3 ibv_close_device.3 ibv_open_xrcd.3 ibv_close_xrcd.3 ibv_rate_to_mbps.3 mbps_to_ibv_rate.3 ibv_rate_to_mult.3 mult_to_ibv_rate.3 ibv_reg_mr.3 ibv_dereg_mr.3 ibv_wr_post.3 ibv_wr_abort.3 ibv_wr_post.3 ibv_wr_complete.3 ibv_wr_post.3 ibv_wr_start.3 ibv_wr_post.3 ibv_wr_atomic_cmp_swp.3 ibv_wr_post.3 ibv_wr_atomic_fetch_add.3 ibv_wr_post.3 ibv_wr_bind_mw.3 ibv_wr_post.3 ibv_wr_local_inv.3 ibv_wr_post.3 ibv_wr_rdma_read.3 ibv_wr_post.3 ibv_wr_rdma_write.3 ibv_wr_post.3 ibv_wr_rdma_write_imm.3 ibv_wr_post.3 ibv_wr_send.3 ibv_wr_post.3 ibv_wr_send_imm.3 ibv_wr_post.3 ibv_wr_send_inv.3 ibv_wr_post.3 ibv_wr_send_tso.3 ibv_wr_post.3 ibv_wr_set_inline_data.3 ibv_wr_post.3 ibv_wr_set_inline_data_list.3 ibv_wr_post.3 ibv_wr_set_sge.3 ibv_wr_post.3 ibv_wr_set_sge_list.3 ibv_wr_post.3 ibv_wr_set_ud_addr.3 ibv_wr_post.3 ibv_wr_set_xrc_srqn.3 ibv_wr_post.3 ibv_wr_flush.3 ) rdma-core-56.1/libibverbs/man/ibv_advise_mr.3.md000066400000000000000000000105471477342711600215320ustar00rootroot00000000000000--- date: 2018-10-19 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_ADVISE_MR --- # NAME ibv_advise_mr - Gives advice or directions to the kernel about an address range belonging to a memory region (MR). # SYNOPSIS ```c #include <infiniband/verbs.h> int ibv_advise_mr(struct ibv_pd *pd, enum ibv_advise_mr_advice advice, uint32_t flags, struct ibv_sge *sg_list, uint32_t num_sge) ``` # DESCRIPTION **ibv_advise_mr()** gives advice or directions to the kernel about an address range belonging to a memory region (MR). Applications that are aware of future access patterns can use this verb in order to leverage this knowledge to improve system or application performance. **Conventional advice values** *IBV_ADVISE_MR_ADVICE_PREFETCH* : Pre-fetch a range of an on-demand paging MR. Make pages present with read-only permission before the actual IO is conducted. This would provide a way to reduce latency by overlapping paging-in and either compute time or IO to other ranges. *IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE* : Like IBV_ADVISE_MR_ADVICE_PREFETCH but with read-access and write-access permission to the fetched memory. *IBV_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT* : Pre-fetch a range of an on-demand paging MR without faulting.
This allows pages that are already present on the CPU side to become present to the device without generating a page fault. # ARGUMENTS *pd* : The protection domain (PD) associated with the MR. *advice* : The requested advice value (as listed above). *flags* : Describes the properties of the advise operation. **Conventional flag values** *IBV_ADVISE_MR_FLAG_FLUSH* : Request to be a synchronized operation. Return to the caller after the operation is completed. *sg_list* : Pointer to the s/g array. When using the IBV_ADVISE_OP_PREFETCH advise value, all the lkeys of all the scatter gather elements (SGEs) must be associated with ODP MRs (MRs that were registered with IBV_ACCESS_ON_DEMAND). *num_sge* : Number of elements in the s/g array # RETURN VALUE **ibv_advise_mr()** returns 0 when the call was successful, or the value of errno on failure (which indicates the failure reason). *EOPNOTSUPP* : libibverbs or the provider driver doesn't support the ibv_advise_mr() verb (ENOSYS may sometimes be returned by old versions of libibverbs). *ENOTSUP* : The advise operation isn't supported. *EFAULT* : In one of the following: o When the requested range is out of the MR bounds, or when parts of it are not part of the process address space. o One of the lkeys provided in the scatter gather list is invalid or lacks the required write access. *EINVAL* : In one of the following: o The PD is invalid. o The flags are invalid. o The requested address doesn't belong to an MR, but to another object, such as an MW. *EPERM* : In one of the following: o Referencing a valid lkey outside the caller's security scope. o The advice is IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE but the specified MR in the scatter gather list is not registered with write access. *ENOENT* : The provided lkeys aren't consistent with the MRs. *ENOMEM* : Not enough memory. # NOTES An application may pre-fetch any address range within an ODP MR when using the **IBV_ADVISE_MR_ADVICE_PREFETCH** or **IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE** advice. Semantically, this operation is best-effort. That means the kernel does not guarantee that the underlying pages are updated in the HCA or that the pre-fetched pages will remain resident. When using the **IBV_ADVISE_MR_ADVICE_PREFETCH** or **IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE** advice, the operation is done in the following stages: o Page in the user pages to memory (pages aren't pinned). o Get the dma mapping of these user pages. o Post the underlying page translations to the HCA. If **IBV_ADVISE_MR_FLAG_FLUSH** is specified then the underlying pages are guaranteed to be updated in the HCA before returning SUCCESS. Otherwise the driver can choose to postpone the posting of the new translations to the HCA. When performing a local RDMA access operation it is recommended to use the IBV_ADVISE_MR_FLAG_FLUSH flag with one of the pre-fetch advices to increase the probability that the page translations are valid in the HCA and to avoid future page faults.
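# EXAMPLE

A minimal sketch of pre-fetching a range of an ODP MR. The names *pd*, *mr*, *buf* and *len* are hypothetical and assumed to exist; *mr* is assumed to have been registered from *buf* with IBV_ACCESS_ON_DEMAND and IBV_ACCESS_LOCAL_WRITE:

```c
struct ibv_sge sge = {
	.addr = (uintptr_t)buf,	/* start of the range to pre-fetch */
	.length = len,		/* range length in bytes */
	.lkey = mr->lkey,
};
int ret;

/* Synchronous pre-fetch with write permission */
ret = ibv_advise_mr(pd, IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE,
		    IBV_ADVISE_MR_FLAG_FLUSH, &sge, 1);
if (ret)
	fprintf(stderr, "ibv_advise_mr: %s\n", strerror(ret));
```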
# SEE ALSO **ibv_reg_mr**(3), **ibv_rereg_mr**(3), **ibv_dereg_mr**(3) # AUTHOR Aviad Yehezkel rdma-core-56.1/libibverbs/man/ibv_alloc_dm.3000066400000000000000000000077001477342711600207320ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_ALLOC_DM 3 2017-07-25 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_alloc_dm, ibv_free_dm, ibv_memcpy_to/from_dm \- allocate or free a device memory buffer (DMs) and perform memory copy to or from it .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "struct ibv_dm *ibv_alloc_dm(struct ibv_context " "*context", .BI " struct ibv_alloc_dm_attr " "*attr"); .sp .BI "int ibv_free_dm(struct ibv_dm " "*dm"); .fi .SH "DESCRIPTION" .B ibv_alloc_dm() allocates a device memory buffer for the RDMA device context .I context\fR. The argument .I attr is a pointer to an ibv_alloc_dm_attr struct, as defined in <infiniband/verbs.h>. .PP .B ibv_free_dm() frees the device memory buffer .I dm\fR. .PP .nf struct ibv_alloc_dm_attr { .in +8 size_t length; /* Length of desired device memory buffer */ uint32_t log_align_req; /* Log base 2 of address alignment requirement */ uint32_t comp_mask; /* Compatibility mask that defines which of the following variables are valid */ .in -8 }; Address alignment may be required in cases where RDMA atomic operations will be performed using the device memory. .PP In such cases, the user may specify the device memory start address alignment using the log_align_req parameter in the allocation attributes struct. .PP .SH "Accessing an allocated device memory" .nf In order to perform a write/read memory access to an allocated device memory, a user could use the ibv_memcpy_to_dm and ibv_memcpy_from_dm calls respectively. .sp .BI "int ibv_memcpy_to_dm(struct ibv_dm " "*dm" ", uint64_t " "dm_offset", .BI " void " "*host_addr" ", size_t " "length" "); .sp .BI "int ibv_memcpy_from_dm(void " "*host_addr" ", struct ibv_dm " "*dm" ", .BI " uint64_t " "dm_offset" ", size_t " "length" "); .sp .I dm_offset is the byte offset from the beginning of the allocated device memory buffer to access. .sp .I host_addr is the host memory buffer address to access. .sp .I length is the copy length in bytes. .sp .fi .SH "Device memory registration" .nf A user may register the allocated device memory as a memory region and use the lkey/rkey inside an sge when posting a receive or send work request. This type of MR is defined as zero based and therefore any reference to it (specifically in sge) is done with a byte offset from the beginning of the region. .sp This type of registration is done using ibv_reg_dm_mr. .sp .BI "struct ibv_mr* ibv_reg_dm_mr(struct ibv_pd " "*pd" ", struct ibv_dm " "*dm" ", uint64_t " "dm_offset", .BI " size_t " "length" ", uint32_t " "access"); .sp .I pd the associated pd for this registration. .sp .I dm the associated dm for this registration. .sp .I dm_offset is the byte offset from the beginning of the allocated device memory buffer to register. .sp .I length the memory length to register. .sp .I access mr access flags (Use enum ibv_access_flags). For this type of registration, user must set the IBV_ACCESS_ZERO_BASED flag. .SH "RETURN VALUE" .B ibv_alloc_dm() returns a pointer to an ibv_dm struct or NULL if the request fails. The output dm contains the handle which could be used by user to import this device memory. .PP .B ibv_free_dm() returns 0 on success, or the value of errno on failure (which indicates the failure reason).
.PP .B ibv_reg_dm_mr() returns a pointer to an ibv_mr struct on success or NULL if the request fails. .PP .B ibv_memcpy_to_dm()/ibv_memcpy_from_dm() returns 0 on success or the failure reason value on failure. .SH "NOTES" .B ibv_alloc_dm() may fail if the device has no free device memory left, where the maximum amount of allocated memory is provided by the .I max_dm_size\fR attribute in the .I ibv_device_attr_ex\fR struct. .B ibv_free_dm() may fail if any other resources (such as an MR) are still associated with the DM being freed. .SH "SEE ALSO" .BR ibv_query_device_ex (3), .SH "AUTHORS" .TP Ariel Levkovich rdma-core-56.1/libibverbs/man/ibv_alloc_mw.3000066400000000000000000000035601477342711600207550ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_ALLOC_MW 3 2016-02-02 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_alloc_mw, ibv_dealloc_mw \- allocate or deallocate a memory window (MW) .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "struct ibv_mw *ibv_alloc_mw(struct ibv_pd " "*pd" , .BI " enum ibv_mw_type " "type"); .sp .BI "int ibv_dealloc_mw(struct ibv_mw " "*mw" ); .fi .SH "DESCRIPTION" .B ibv_alloc_mw() allocates a memory window (MW) associated with the protection domain .I pd\fR. The MW's type (1 or 2A/2B) is .I type\fR. .PP The MW is created unbound. For it to be useful, the MW must be bound, through either ibv_bind_mw (type 1) or a special WR (type 2). Once bound, the memory window allows RDMA (remote) access to a subset of the MR to which it was bound, until invalidated by: the ibv_bind_mw verb with zero length for type 1, an IBV_WR_LOCAL_INV/IBV_WR_SEND_WITH_INV WR opcode for type 2, or deallocation. .PP .B ibv_dealloc_mw() unbinds the MW in case it was previously bound and deallocates the MW .I mw\fR. .SH "RETURN VALUE" .B ibv_alloc_mw() returns a pointer to the allocated MW, or NULL if the request fails. The remote key (\fBR_Key\fR) field .B rkey is used by remote processes to perform Atomic and RDMA operations. This key will be changed during bind operations. The remote process places this .B rkey as the rkey field of struct ibv_send_wr passed to the ibv_post_send function. .PP .B ibv_dealloc_mw() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" .B ibv_dereg_mr() fails if any memory window is still bound to this MR. .SH "SEE ALSO" .BR ibv_alloc_pd (3), .BR ibv_post_send (3), .BR ibv_bind_mw (3), .BR ibv_reg_mr (3), .SH "AUTHORS" .TP Majd Dibbiny .TP Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_alloc_null_mr.3.md000066400000000000000000000026651477342711600224060ustar00rootroot00000000000000--- date: 2018-6-1 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: ibv_alloc_null_mr --- # NAME ibv_alloc_null_mr - allocate a null memory region (MR) # SYNOPSIS ```c #include <infiniband/verbs.h> struct ibv_mr *ibv_alloc_null_mr(struct ibv_pd *pd); ``` # DESCRIPTION **ibv_alloc_null_mr()** allocates a null memory region (MR) that is associated with the protection domain *pd*. A null MR discards all data written to it, and always returns 0 on read. It has the maximum length and only the lkey is valid, the MR is not exposed as an rkey. A device should implement the null MR in a way that bypasses PCI transfers, internally discarding or sourcing 0 data.
This provides a way to avoid PCI bus transfers by using a scatter/gather list in commands if applications do not intend to access the data, or need data to be 0 filled. Specifically, upon **ibv_post_send()** the device skips PCI read cycles, and upon **ibv_post_recv()** the device skips PCI write cycles, which improves performance. **ibv_dereg_mr()** deregisters the MR. The use of ibv_rereg_mr() or ibv_bind_mw() with this MR is invalid. # RETURN VALUE **ibv_alloc_null_mr()** returns a pointer to the allocated MR, or NULL if the request fails. # SEE ALSO **ibv_reg_mr**(3), **ibv_dereg_mr**(3), # AUTHOR Yonatan Cohen rdma-core-56.1/libibverbs/man/ibv_alloc_parent_domain.3000066400000000000000000000074741477342711600231560ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_ALLOC_PARENT_DOMAIN 3 2017-11-06 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_alloc_parent_domain(), ibv_dealloc_pd() \- allocate and deallocate the parent domain object .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "struct ibv_pd *ibv_alloc_parent_domain(struct ibv_context " "*context" ", struct ibv_parent_domain_init_attr " "*attr"); .sp .SH "DESCRIPTION" .B ibv_alloc_parent_domain() allocates a parent domain object for the RDMA device context .I context\fR. .sp The parent domain object extends the normal protection domain with additional objects, such as a thread domain. .sp A parent domain is completely interchangeable with the .I struct ibv_pd used to create it, and can be used as an input argument to any function accepting a .I struct ibv_pd. .sp The behavior of each verb may be different if the verb is passed a parent domain .I struct ibv_pd that contains a .I struct ibv_td pointer\fR. For instance the verb may choose to share resources between objects using the same thread domain. The exact behavior is provider dependent. .sp The .I attr argument specifies the following: .PP .nf enum ibv_parent_domain_init_attr_mask { .in +8 IBV_PARENT_DOMAIN_INIT_ATTR_ALLOCATORS = 1 << 0, IBV_PARENT_DOMAIN_INIT_ATTR_PD_CONTEXT = 1 << 1, .in -8 }; struct ibv_parent_domain_init_attr { .in +8 struct ibv_pd *pd; /* reference to a protection domain, can't be NULL */ struct ibv_td *td; /* reference to a thread domain, or NULL */ uint32_t comp_mask; void *(*alloc)(struct ibv_pd *pd, void *pd_context, size_t size, size_t alignment, uint64_t resource_type); void (*free)(struct ibv_pd *pd, void *pd_context, void *ptr, uint64_t resource_type); void *pd_context; .in -8 }; .fi .PP .sp .B ibv_dealloc_pd() will deallocate the parent domain as it is exposed as an ibv_pd .I pd\fR. All resources created with the parent domain should be destroyed prior to deallocating the parent domain\fR. .SH "ARGUMENTS" .B pd Reference to the protection domain that this parent domain uses. .PP .B td An optional thread domain that the parent domain uses. .PP .B comp_mask Bit-mask of optional fields in the ibv_parent_domain_init_attr struct. .PP .B alloc Custom memory allocation function for this parent domain. Provider memory allocations will use this function to allocate the needed memory. The allocation function is passed the parent domain .B pd and the user-specified context .B pd_context. In addition, the callback receives the .B size and the .B alignment of the requested buffer, as well as a vendor-specific .B resource_type , which is derived from the rdma_driver_id enum (upper 32 bits) and a vendor specific resource code.
The function returns the pointer to the allocated buffer, or NULL to designate an error. It may also return .B IBV_ALLOCATOR_USE_DEFAULT asking the callee to allocate the buffer using the default allocator. The callback makes sure the allocated buffer is initialized with zeros. It is also the responsibility of the callback to make sure the memory cannot be COWed, e.g. by using madvise(MADV_DONTFORK) or by allocating anonymous shared memory. .PP .B free Callback to free memory buffers that were allocated using a successful alloc(). .PP .B pd_context A pointer for additional user-specific data to be associated with this parent domain. The pointer is passed back to the custom allocator functions. .SH "RETURN VALUE" .B ibv_alloc_parent_domain() returns a pointer to the allocated struct .I ibv_pd object, or NULL if the request fails (and sets errno to indicate the failure reason). .sp .SH "SEE ALSO" .BR ibv_alloc_parent_domain (3), .BR ibv_dealloc_pd (3), .BR ibv_alloc_pd (3), .BR ibv_alloc_td (3) .SH "AUTHORS" .TP Alex Rosenbaum .TP Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_alloc_pd.3000066400000000000000000000021321477342711600207300ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_ALLOC_PD 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_alloc_pd, ibv_dealloc_pd \- allocate or deallocate a protection domain (PDs) .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "struct ibv_pd *ibv_alloc_pd(struct ibv_context " "*context" ); .sp .BI "int ibv_dealloc_pd(struct ibv_pd " "*pd" ); .fi .SH "DESCRIPTION" .B ibv_alloc_pd() allocates a PD for the RDMA device context .I context\fR. .PP .B ibv_dealloc_pd() deallocates the PD .I pd\fR. .SH "RETURN VALUE" .B ibv_alloc_pd() returns a pointer to the allocated PD, or NULL if the request fails. .PP .B ibv_dealloc_pd() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" .B ibv_dealloc_pd() may fail if any other resource is still associated with the PD being freed. .SH "SEE ALSO" .BR ibv_reg_mr (3), .BR ibv_create_srq (3), .BR ibv_create_qp (3), .BR ibv_create_ah (3), .BR ibv_create_ah_from_wc (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_alloc_td.3000066400000000000000000000037221477342711600207410ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_ALLOC_TD 3 2017-11-06 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_alloc_td(), ibv_dealloc_td() \- allocate and deallocate thread domain object .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "struct ibv_td *ibv_alloc_td(struct ibv_context " "*context" , .BI " struct ibv_td_init_attr " "*init_attr" ); .sp .BI "int ibv_dealloc_td(struct ibv_td " "*td"); .fi .SH "DESCRIPTION" .B ibv_alloc_td() allocates a thread domain object for the RDMA device context .I context\fR. .sp The thread domain object defines how the verbs libraries and provider will use locks and additional hardware capabilities to achieve best performance for handling multi-thread or single-thread protection. An application assigns verbs resources to a thread domain when it creates a verbs object. .sp If the .I ibv_td object is specified then any objects created under this thread domain will disable internal locking designed to protect against concurrent access to that object from multiple user threads.
By default, all verbs objects are safe for multi-threaded access, whether or not a thread domain is specified. .sp A .I struct ibv_td can be added to a parent domain via .B ibv_alloc_parent_domain() and then the parent domain can be used to create verbs objects. .sp .B ibv_dealloc_td() will deallocate the thread domain .I td\fR. All resources created with the .I td should be destroyed prior to deallocating the .I td\fR. .SH "RETURN VALUE" .B ibv_alloc_td() returns a pointer to the allocated struct .I ibv_td object, or NULL if the request fails (and sets errno to indicate the failure reason). .sp .B ibv_dealloc_td() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "SEE ALSO" .BR ibv_alloc_parent_domain (3), .SH "AUTHORS" .TP Alex Rosenbaum .TP Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_asyncwatch.1000066400000000000000000000012031477342711600213120ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH IBV_ASYNCWATCH 1 "August 30, 2005" "libibverbs" "USER COMMANDS" .SH NAME ibv_asyncwatch \- display asynchronous events .SH SYNOPSIS .B ibv_asyncwatch [\-d device] [-h] .SH DESCRIPTION .PP Display asynchronous events forwarded to userspace for an RDMA device. .SH OPTIONS .PP .TP \fB\-d\fR, \fB\-\-ib\-dev\fR=\fIDEVICE\fR use IB device \fIDEVICE\fR (default first device found) .TP \fB\-h\fR, \fB\-\-help\fR Print a help text and exit. .SH AUTHORS .TP Roland Dreier .RI < rolandd@cisco.com > .TP Eran Ben Elisha .RI < eranbe@mellanox.com > rdma-core-56.1/libibverbs/man/ibv_attach_counters_point_flow.3.md000066400000000000000000000073471477342711600252100ustar00rootroot00000000000000--- date: 2018-04-02 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: ibv_attach_counters_point_flow --- # NAME **ibv_attach_counters_point_flow** - attach individual counter definition to a flow object # SYNOPSIS ```c #include <infiniband/verbs.h> int ibv_attach_counters_point_flow(struct ibv_counters *counters, struct ibv_counter_attach_attr *counter_attach_attr, struct ibv_flow *flow); ``` # DESCRIPTION The attach counters point calls are a family of APIs used to attach an individual counter description definition to a verbs object at a specific index location. A counters object will start collecting values after it is bound to the verbs object resource. A static attach can be created when NULL is provided instead of the reference to the verbs object (e.g., in the case of flow, providing NULL instead of *flow*). In this case, this counters object will only start collecting values after it is bound to the verbs resource; for flow, this happens when the counters handle is referenced during the creation of a flow with **ibv_create_flow**(). Once an ibv_counters is bound statically to a verbs resource, no additional attach is allowed until the counters object is no longer bound to any verbs object. The argument counter_desc specifies which counter value should be collected. It is defined in verbs.h as one of the enum ibv_counter_description options. Supported capabilities of specific counter_desc values per verbs object can be tested by checking the return value for success or ENOTSUP errno. Attaching a counters handle to multiple objects of the same type will accumulate the values into a single index.
e.g.: creating several ibv_flow(s) with the same ibv_counters handle will collect the values from all relevant flows into the relevant index location when reading the values from **ibv_read_counters**(). Setting the index more than once with a different or the same counter_desc will aggregate the values from all relevant counters into the relevant index location. The runtime values of counters can be read from the hardware by calling **ibv_read_counters**(). # ARGUMENTS *counters* : Existing counters to attach new counter point on. *counter_attach_attr* : An ibv_counter_attach_attr struct, as defined in verbs.h. *flow* : Existing flow to attach a new counters point on (in static mode it must be NULL). ## *counter_attach_attr* Argument ```c struct ibv_counter_attach_attr { enum ibv_counter_description counter_desc; uint32_t index; uint32_t comp_mask; }; ``` ## *counter_desc* Argument ```c enum ibv_counter_description { IBV_COUNTER_PACKETS, IBV_COUNTER_BYTES, }; ``` *index* : Desired location of the specific counter at the counters object. *comp_mask* : Bitmask specifying what fields in the structure are valid. # RETURN VALUE **ibv_attach_counters_point_flow**() returns 0 on success, or the value of errno on failure (which indicates the failure reason) # ERRORS EINVAL : invalid argument(s) passed ENOTSUP : *counter_desc* is not supported on the requested object EBUSY : the counter object is already bound to a flow, additional attach calls are not allowed (valid for static attach only) ENOMEM : not enough memory # NOTES Counter values in each index location are cleared upon creation when calling **ibv_create_counters**(). Attaching counters points will only increase these values accordingly. # EXAMPLE An example of use of **ibv_attach_counters_point_flow**() is shown in **ibv_read_counters** # SEE ALSO **ibv_create_counters**, **ibv_destroy_counters**, **ibv_read_counters**, **ibv_create_flow** # AUTHORS Raed Salem Alex Rosenbaum rdma-core-56.1/libibverbs/man/ibv_attach_mcast.3.md000066400000000000000000000026471477342711600222150ustar00rootroot00000000000000--- date: 2006-10-31 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_ATTACH_MCAST --- # NAME ibv_attach_mcast, ibv_detach_mcast - attach and detach a queue pair (QP) to/from a multicast group # SYNOPSIS ```c #include <infiniband/verbs.h> int ibv_attach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); int ibv_detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); ``` # DESCRIPTION **ibv_attach_mcast()** attaches the QP *qp* to the multicast group having MGID *gid* and MLID *lid*. **ibv_detach_mcast()** detaches the QP *qp* from the multicast group having MGID *gid* and MLID *lid*. # RETURN VALUE **ibv_attach_mcast()** and **ibv_detach_mcast()** return 0 on success, or the value of errno on failure (which indicates the failure reason). # NOTES Only QPs of Transport Service Type **IBV_QPT_UD** may be attached to multicast groups. If a QP is attached to the same multicast group multiple times, the QP will still receive a single copy of a multicast message. In order to receive multicast messages, a join request for the multicast group must be sent to the subnet administrator (SA), so that the fabric's multicast routing is configured to deliver messages to the local port.
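# EXAMPLE

A minimal sketch of attaching a UD QP to a multicast group. The *qp* variable is hypothetical, and in practice the MGID/MLID pair shown here would be obtained from a subnet administrator join (e.g. via librdmacm) rather than hard-coded:

```c
union ibv_gid mgid;
uint16_t mlid = 0xc001;	/* hypothetical MLID from the SA join */
int ret;

memset(&mgid, 0, sizeof(mgid));
mgid.raw[0] = 0xff;	/* multicast GIDs start with 0xff */

ret = ibv_attach_mcast(qp, &mgid, mlid);
if (ret)
	fprintf(stderr, "ibv_attach_mcast: %s\n", strerror(ret));

/* ... receive multicast traffic on qp ... */

ret = ibv_detach_mcast(qp, &mgid, mlid);
if (ret)
	fprintf(stderr, "ibv_detach_mcast: %s\n", strerror(ret));
```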
# SEE ALSO **ibv_create_qp**(3) # AUTHOR Dotan Barak rdma-core-56.1/libibverbs/man/ibv_bind_mw.3000066400000000000000000000065241477342711600206020ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_BIND_MW 3 2016-02-02 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_bind_mw \- post a request to bind a type 1 memory window to a memory region .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "int ibv_bind_mw(struct ibv_qp " "*qp" ", struct ibv_mw " "*mw" ", .BI " struct ibv_mw_bind " "*mw_bind" "); .fi .SH "DESCRIPTION" .B ibv_bind_mw() posts to the queue pair .I qp a request to bind the memory window .I mw according to the details in .I mw_bind\fR. .PP The argument .I mw_bind is an ibv_mw_bind struct, as defined in <infiniband/verbs.h>. .PP .nf struct ibv_mw_bind { .in +8 uint64_t wr_id; /* User defined WR ID */ unsigned int send_flags; /* Use ibv_send_flags */ struct ibv_mw_bind_info bind_info; /* MW bind information */ .in -8 } .fi .PP .nf struct ibv_mw_bind_info { .in +8 struct ibv_mr *mr; /* The MR to bind the MW to */ uint64_t addr; /* The address the MW should start at */ uint64_t length; /* The length (in bytes) the MW should span */ unsigned int mw_access_flags; /* Access flags to the MW. Use ibv_access_flags */ .in -8 }; .fi .PP The QP Transport Service Type must be either UC, RC or XRC_SEND for bind operations. .PP The attribute send_flags describes the properties of the \s-1WR\s0. It is either 0 or the bitwise \s-1OR\s0 of one or more of the following flags: .PP .TP .B IBV_SEND_FENCE \fR Set the fence indicator. .TP .B IBV_SEND_SIGNALED \fR Set the completion notification indicator. Relevant only if QP was created with sq_sig_all=0 .PP The mw_access_flags define the allowed access to the MW after the bind completes successfully. It is either 0 or the bitwise \s-1OR\s0 of one or more of the following flags: .TP .B IBV_ACCESS_REMOTE_WRITE \fR Enable Remote Write Access. Requires local write access to the MR. .TP .B IBV_ACCESS_REMOTE_READ\fR Enable Remote Read Access .TP .B IBV_ACCESS_REMOTE_ATOMIC\fR Enable Remote Atomic Operation Access (if supported). Requires local write access to the MR. .TP .B IBV_ACCESS_ZERO_BASED\fR If set, the address set on the 'remote_addr' field on the WR will be an offset from the MW's start address. .SH "RETURN VALUE" .B ibv_bind_mw() returns 0 on success, or the value of errno on failure (which indicates the failure reason). On success, the R_key of the memory window after the bind is returned in the mw_bind->mw->rkey field. .SH "NOTES" The bind does not complete when the function returns - it is merely posted to the QP. The user should keep a copy of the old R_key, and fix the mw structure if the subsequent CQE for the bind operation indicates a failure. The user may safely send the R_key using a send request on the same QP (based on QP ordering rules: a send after a bind request on the same QP is always ordered), but must not transfer it to the remote in any other manner before reading a successful CQE. .PP Note that for type 2 MW, one should directly post bind WR to the QP, using ibv_post_send.
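.SH "EXAMPLE"
A minimal sketch of a type 1 bind. The
.I qp\fR,
.I mw\fR,
.I mr\fR,
.I buf
and
.I len
variables are assumed to exist, and
.I mr
is assumed to have been registered with the IBV_ACCESS_MW_BIND access flag:
.sp
.nf
struct ibv_mw_bind mw_bind = {
	.wr_id = 1,
	.send_flags = IBV_SEND_SIGNALED,
	.bind_info = {
		.mr = mr,
		.addr = (uint64_t)(uintptr_t)buf,
		.length = len,
		.mw_access_flags = IBV_ACCESS_REMOTE_READ,
	},
};

if (ibv_bind_mw(qp, mw, &mw_bind) == 0)
	/* mw->rkey holds the new R_key; poll the CQ for the bind's
	 * CQE before sharing it with the remote side */
	new_rkey = mw->rkey;
.fi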
.SH "SEE ALSO" .BR ibv_alloc_mw (3), .BR ibv_post_send (3), .BR ibv_poll_cq (3) .BR ibv_reg_mr (3), .SH "AUTHORS" .TP Majd Dibbiny .TP Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_create_ah.3000066400000000000000000000041751477342711600210760ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_CREATE_AH 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_create_ah, ibv_destroy_ah \- create or destroy an address handle (AH) .SH "SYNOPSIS" .nf .B #include .sp .BI "struct ibv_ah *ibv_create_ah(struct ibv_pd " "*pd" ", .BI " struct ibv_ah_attr " "*attr" "); .sp .BI "int ibv_destroy_ah(struct ibv_ah " "*ah" "); .fi .SH "DESCRIPTION" .B ibv_create_ah() creates an address handle (AH) associated with the protection domain .I pd\fR. The argument .I attr is an ibv_ah_attr struct, as defined in . .PP .nf struct ibv_ah_attr { .in +8 struct ibv_global_route grh; /* Global Routing Header (GRH) attributes */ uint16_t dlid; /* Destination LID */ uint8_t sl; /* Service Level */ uint8_t src_path_bits; /* Source path bits */ uint8_t static_rate; /* Maximum static rate */ uint8_t is_global; /* GRH attributes are valid */ uint8_t port_num; /* Physical port number */ .in -8 }; .sp .nf struct ibv_global_route { .in +8 union ibv_gid dgid; /* Destination GID or MGID */ uint32_t flow_label; /* Flow label */ uint8_t sgid_index; /* Source GID index */ uint8_t hop_limit; /* Hop limit */ uint8_t traffic_class; /* Traffic class */ .in -8 }; .fi .sp .PP .B ibv_destroy_ah() destroys the AH .I ah\fR. .SH "RETURN VALUE" .B ibv_create_ah() returns a pointer to the created AH, or NULL if the request fails. .SH "NOTES" If port flag IBV_QPF_GRH_REQUIRED is set then .B ibv_create_ah() must be created with definition of 'struct ibv_ah_attr { .is_global = 1; .grh = {...}; }'. .PP .B ibv_destroy_ah() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "SEE ALSO" .BR ibv_alloc_pd (3), .BR ibv_init_ah_from_wc (3), .BR ibv_create_ah_from_wc (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_create_ah_from_wc.3000066400000000000000000000035531477342711600226110ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_CREATE_AH_FROM_WC 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_init_ah_from_wc, ibv_create_ah_from_wc \- initialize or create an address handle (AH) from a work completion .SH "SYNOPSIS" .nf .B #include .sp .BI "int ibv_init_ah_from_wc(struct ibv_context " "*context" ", uint8_t " "port_num" , .BI " struct ibv_wc " "*wc" ", struct ibv_grh " "*grh" , .BI " struct ibv_ah_attr " "*ah_attr" ); .sp .BI "struct ibv_ah *ibv_create_ah_from_wc(struct ibv_pd " "*pd" , .BI " struct ibv_wc " "*wc" , .BI " struct ibv_grh " "*grh" , .BI " uint8_t " "port_num" ); .fi .SH "DESCRIPTION" .B ibv_init_ah_from_wc() initializes the address handle (AH) attribute structure .I ah_attr for the RDMA device context .I context using the port number .I port_num\fR, using attributes from the work completion .I wc and the Global Routing Header (GRH) structure .I grh\fR. .PP .B ibv_create_ah_from_wc() creates an AH associated with the protection domain .I pd using the port number .I port_num\fR, using attributes from the work completion .I wc and the Global Routing Header (GRH) structure .I grh\fR. .SH "RETURN VALUE" .B ibv_init_ah_from_wc() returns 0 on success, and \-1 on error. 
.PP .B ibv_create_ah_from_wc() returns a pointer to the created AH, or NULL if the request fails. .SH "NOTES" The filled structure .I ah_attr returned from .B ibv_init_ah_from_wc() can be used to create a new AH using .B ibv_create_ah()\fR. .SH "SEE ALSO" .BR ibv_open_device (3), .BR ibv_alloc_pd (3), .BR ibv_create_ah (3), .BR ibv_destroy_ah (3), .BR ibv_poll_cq (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_create_comp_channel.3000066400000000000000000000035211477342711600231260ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_CREATE_COMP_CHANNEL 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_create_comp_channel, ibv_destroy_comp_channel \- create or destroy a completion event channel .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "struct ibv_comp_channel *ibv_create_comp_channel(struct ibv_context .BI " " "*context" ); .sp .BI "int ibv_destroy_comp_channel(struct ibv_comp_channel " "*channel" ); .fi .SH "DESCRIPTION" .B ibv_create_comp_channel() creates a completion event channel for the RDMA device context .I context\fR. .PP .B ibv_destroy_comp_channel() destroys the completion event channel .I channel\fR. .SH "RETURN VALUE" .B ibv_create_comp_channel() returns a pointer to the created completion event channel, or NULL if the request fails. .PP .B ibv_destroy_comp_channel() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" A "completion channel" is an abstraction introduced by libibverbs that does not exist in the InfiniBand Architecture verbs specification or RDMA Protocol Verbs Specification. A completion channel is essentially a file descriptor that is used to deliver completion notifications to a userspace process. When a completion event is generated for a completion queue (CQ), the event is delivered via the completion channel attached to that CQ. This may be useful to steer completion events to different threads by using multiple completion channels. .PP .B ibv_destroy_comp_channel() fails if any CQs are still associated with the completion event channel being destroyed. .SH "SEE ALSO" .BR ibv_open_device (3), .BR ibv_create_cq (3), .BR ibv_get_cq_event (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_create_counters.3.md000066400000000000000000000044061477342711600227440ustar00rootroot00000000000000--- date: 2018-04-02 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: ibv_create_counters tagline: Verbs --- # NAME **ibv_create_counters**, **ibv_destroy_counters** - Create or destroy a counters handle # SYNOPSIS ```c #include <infiniband/verbs.h> struct ibv_counters * ibv_create_counters(struct ibv_context *context, struct ibv_counters_init_attr *init_attr); int ibv_destroy_counters(struct ibv_counters *counters); ``` # DESCRIPTION **ibv_create_counters**() creates a new counters handle for the RDMA device context. An ibv_counters handle can be attached to a verbs resource (e.g.: QP, WQ, Flow) statically when these are created. For example, attach an ibv_counters statically to a Flow (struct ibv_flow) during creation of a new Flow by calling **ibv_create_flow()**. Counters are cleared upon creation and values will be monotonically increasing. **ibv_destroy_counters**() releases the counters handle; the user should detach the counters object before destroying it.
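A minimal creation/teardown sketch (assuming *ctx* is a hypothetical, already-opened device context; error handling condensed):

```c
struct ibv_counters_init_attr init_attr = {};
struct ibv_counters *counters;

counters = ibv_create_counters(ctx, &init_attr);
if (!counters)
	fprintf(stderr, "ibv_create_counters: %s\n", strerror(errno));

/* ... attach counter points, create flows, read values ... */

/* detach/unbind from all verbs objects first, then release */
if (ibv_destroy_counters(counters))
	fprintf(stderr, "ibv_destroy_counters failed\n");
```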
# ARGUMENTS *context* : RDMA device context to create the counters on. *init_attr* : An ibv_counters_init_attr struct, as defined in verbs.h. ## *init_attr* Argument ```c struct ibv_counters_init_attr { int comp_mask; }; ``` *comp_mask* : Bitmask specifying what fields in the structure are valid. # RETURN VALUE **ibv_create_counters**() returns a pointer to the allocated ibv_counters object, or NULL if the request fails (and sets errno to indicate the failure reason) **ibv_destroy_counters**() returns 0 on success, or the value of errno on failure (which indicates the failure reason) # ERRORS EOPNOTSUPP : **ibv_create_counters**() is not currently supported on this device (ENOSYS may sometimes be returned by old versions of libibverbs). ENOMEM : **ibv_create_counters**() could not create ibv_counters object, not enough memory EINVAL : invalid parameter supplied to **ibv_destroy_counters**() # EXAMPLE An example of use of ibv_counters is shown in **ibv_read_counters** # SEE ALSO **ibv_attach_counters_point_flow**, **ibv_read_counters**, **ibv_create_flow** # AUTHORS Raed Salem Alex Rosenbaum rdma-core-56.1/libibverbs/man/ibv_create_cq.3000066400000000000000000000034551477342711600211100ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_CREATE_CQ 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_create_cq, ibv_destroy_cq \- create or destroy a completion queue (CQ) .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "struct ibv_cq *ibv_create_cq(struct ibv_context " "*context" ", int " "cqe" , .BI " void " "*cq_context" , .BI " struct ibv_comp_channel " "*channel" , .BI " int " "comp_vector" ); .sp .BI "int ibv_destroy_cq(struct ibv_cq " "*cq" ); .fi .SH "DESCRIPTION" .B ibv_create_cq() creates a completion queue (CQ) with at least .I cqe entries for the RDMA device context .I context\fR. The pointer .I cq_context will be used to set the user context pointer of the CQ structure. The argument .I channel is optional; if not NULL, the completion channel .I channel will be used to return completion events. The CQ will use the completion vector .I comp_vector for signaling completion events; it must be at least zero and less than .I context\fR->num_comp_vectors. .PP .B ibv_destroy_cq() destroys the CQ .I cq\fR. .SH "RETURN VALUE" .B ibv_create_cq() returns a pointer to the CQ, or NULL if the request fails. .PP .B ibv_destroy_cq() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" .B ibv_create_cq() may create a CQ with size greater than or equal to the requested size. Check the cqe attribute in the returned CQ for the actual size. .PP .B ibv_destroy_cq() fails if any queue pair is still associated with this CQ.
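.SH "EXAMPLE"
A minimal sketch, assuming
.I ctx
is a hypothetical, already-opened device context:
.sp
.nf
struct ibv_cq *cq;

cq = ibv_create_cq(ctx, 256, NULL, NULL, 0);
if (!cq)
	/* errno describes the failure */;

/* cq->cqe may be larger than the requested 256 entries */

if (ibv_destroy_cq(cq))
	/* e.g. a QP is still associated with this CQ */;
.fi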
.SH "SEE ALSO" .BR ibv_resize_cq (3), .BR ibv_req_notify_cq (3), .BR ibv_ack_cq_events (3), .BR ibv_create_qp (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_create_cq_ex.3000066400000000000000000000201131477342711600215730ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_CREATE_CQ_EX 3 2016-05-08 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_create_cq_ex \- create a completion queue (CQ) .SH "SYNOPSIS" .nf .B #include .sp .BI "struct ibv_cq_ex *ibv_create_cq_ex(struct ibv_context " "*context" ", .BI " struct ibv_cq_init_attr_ex " "*cq_attr" ); .fi .SH "DESCRIPTION" .B ibv_create_cq_ex() creates a completion queue (CQ) for RDMA device context .I context\fR. The argument .I cq_attr is a pointer to struct ibv_cq_init_attr_ex as defined in . .PP .nf struct ibv_cq_init_attr_ex { .in +8 int cqe; /* Minimum number of entries required for CQ */ void *cq_context; /* Consumer-supplied context returned for completion events */ struct ibv_comp_channel *channel; /* Completion channel where completion events will be queued. May be NULL if completion events will not be used. */ int comp_vector; /* Completion vector used to signal completion events. Must be >= 0 and < context->num_comp_vectors. */ uint64_t wc_flags; /* The wc_flags that should be returned in ibv_poll_cq_ex. Or'ed bit of enum ibv_wc_flags_ex. */ uint32_t comp_mask; /* compatibility mask (extended verb). */ uint32_t flags /* One or more flags from enum ibv_create_cq_attr_flags */ struct ibv_pd *parent_domain; /* Parent domain to be used by this CQ */ .in -8 }; enum ibv_wc_flags_ex { IBV_WC_EX_WITH_BYTE_LEN = 1 << 0, /* Require byte len in WC */ IBV_WC_EX_WITH_IMM = 1 << 1, /* Require immediate in WC */ IBV_WC_EX_WITH_QP_NUM = 1 << 2, /* Require QP number in WC */ IBV_WC_EX_WITH_SRC_QP = 1 << 3, /* Require source QP in WC */ IBV_WC_EX_WITH_SLID = 1 << 4, /* Require slid in WC */ IBV_WC_EX_WITH_SL = 1 << 5, /* Require sl in WC */ IBV_WC_EX_WITH_DLID_PATH_BITS = 1 << 6, /* Require dlid path bits in WC */ IBV_WC_EX_WITH_COMPLETION_TIMESTAMP = 1 << 7, /* Require completion device timestamp in WC /* IBV_WC_EX_WITH_CVLAN = 1 << 8, /* Require VLAN info in WC */ IBV_WC_EX_WITH_FLOW_TAG = 1 << 9, /* Require flow tag in WC */ IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK = 1 << 11, /* Require completion wallclock timestamp in WC */ }; enum ibv_cq_init_attr_mask { IBV_CQ_INIT_ATTR_MASK_FLAGS = 1 << 0, IBV_CQ_INIT_ATTR_MASK_PD = 1 << 1, }; enum ibv_create_cq_attr_flags { IBV_CREATE_CQ_ATTR_SINGLE_THREADED = 1 << 0, /* This CQ is used from a single threaded, thus no locking is required */ IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN = 1 << 1, /* This CQ will not pass to error state if overrun, CQE always will be written to next entry. * An application must be designed to avoid ever overflowing the CQ, otherwise CQEs might be lost. */ }; .SH "Polling an extended CQ" In order to poll an extended CQ efficiently, a user could use the following functions. .TP .B Completion iterator functions .BI "int ibv_start_poll(struct ibv_cq_ex " "*cq" ", struct ibv_poll_cq_attr " "*attr") .br Start polling a batch of work completions. .I attr is given in order to make this function easily extensible in the future. This function either returns 0 on success or an error code otherwise. When no completions are available on the CQ, ENOENT is returned, but the CQ remains in a valid state. 
On success, querying the completion's attribute could be done using the query functions described below. If an error code is given, end_poll shouldn't be called. .BI "int ibv_next_poll(struct ibv_cq_ex " "*cq") .br This function is called in order to get the next work completion. It has to be called after .I start_poll and before .I end_poll are called. This function either returns 0 on success or an error code otherwise. When no completions are available on the CQ, ENOENT is returned, but the CQ remains in a valid state. On success, querying the completion's attribute could be done using the query functions described below. If an error code is given, end_poll should still be called, indicating this is the end of the polled batch. .BI "void ibv_end_poll(struct ibv_cq_ex " "*cq") .br This function indicates the end of polling batch of work completions. After calling this function, the user should start a new batch by calling .I start_poll. .TP .B Polling fields in the completion The members and functions below are used in order to poll the current completion. The current completion is the completion which the iterator points to (start_poll and next_poll advances this iterator). Only fields that the user requested via wc_flags in ibv_create_cq_ex can be queried. In addition, some fields are only valid in certain opcodes and status codes. .BI "uint64_t wr_id - Can be accessed directly from struct ibv_cq_ex". .BI "enum ibv_wc_status - Can be accessed directly from struct ibv_cq_ex". .BI "enum ibv_wc_opcode ibv_wc_read_opcode(struct ibv_cq_ex " "*cq"); \c Get the opcode from the current completion. .BI "uint32_t ibv_wc_read_vendor_err(struct ibv_cq_ex " "*cq"); \c Get the vendor error from the current completion. .BI "uint32_t ibv_wc_read_byte_len(struct ibv_cq_ex " "*cq"); \c Get the payload length from the current completion. .BI "__be32 ibv_wc_read_imm_data(struct ibv_cq_ex " "*cq"); \c Get the immediate data field from the current completion. .BI "uint32_t ibv_wc_read_invalidated_rkey(struct ibv_cq_ex " "*cq"); \c Get the rkey invalidated by the SEND_INVAL from the current completion. .BI "uint32_t ibv_wc_read_qp_num(struct ibv_cq_ex " "*cq"); \c Get the QP number field from the current completion. .BI "uint32_t ibv_wc_read_src_qp(struct ibv_cq_ex " "*cq"); \c Get the source QP number field from the current completion. .BI "unsigned int ibv_wc_read_wc_flags(struct ibv_cq_ex " "*cq"); \c Get the QP flags field from the current completion. .BI "uint16_t ibv_wc_read_pkey_index(struct ibv_cq_ex " "*cq"); \c Get the pkey index field from the current completion. .BI "uint32_t ibv_wc_read_slid(struct ibv_cq_ex " "*cq"); \c Get the slid field from the current completion. .BI "uint8_t ibv_wc_read_sl(struct ibv_cq_ex " "*cq"); \c Get the sl field from the current completion. .BI "uint8_t ibv_wc_read_dlid_path_bits(struct ibv_cq_ex " "*cq"); \c Get the dlid_path_bits field from the current completion. .BI "uint64_t ibv_wc_read_completion_ts(struct ibv_cq_ex " "*cq"); \c Get the completion timestamp from the current completion in HCA clock units. .BI "uint64_t ibv_wc_read_completion_wallclock_ns(struct ibv_cq_ex " *cq "); Get the completion timestamp from the current completion and convert it from HCA clock units to wall clock nanoseconds. .BI "uint16_t ibv_wc_read_cvlan(struct ibv_cq_ex " "*cq"); \c Get the CVLAN field from the current completion. .BI "uint32_t ibv_wc_read_flow_tag(struct ibv_cq_ex " "*cq"); \c Get flow tag from the current completion. .BI "void ibv_wc_read_tm_info(struct ibv_cq_ex " *cq "," .BI "struct ibv_wc_tm_info " *tm_info "); \c Get tag matching info from the current completion. .nf struct ibv_wc_tm_info { .in +8 uint64_t tag; /* tag from TMH */ uint32_t priv; /* opaque user data from TMH */ .in -8 };
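.PP
The following is a minimal polling sketch following the iterator semantics above. It assumes
.I cq
was returned by
.B ibv_create_cq_ex()
and that
.I process()
is a hypothetical consumer function:
.sp
.nf
struct ibv_poll_cq_attr attr = {};
int ret;

ret = ibv_start_poll(cq, &attr);
if (ret)
	return;	/* ENOENT: CQ is empty; do not call ibv_end_poll() */

while (!ret) {
	if (cq->status == IBV_WC_SUCCESS)
		process(cq->wr_id, ibv_wc_read_opcode(cq));
	ret = ibv_next_poll(cq);	/* ENOENT ends the batch */
}

/* after a successful ibv_start_poll(), always end the batch */
ibv_end_poll(cq);
.fi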
.BI "void ibv_wc_read_tm_info(struct ibv_cq_ex " *cq "," .BI "struct ibv_wc_tm_info " *tm_info "); \c Get tag matching info from the current completion. .nf struct ibv_wc_tm_info { .in +8 uint64_t tag; /* tag from TMH */ uint32_t priv; /* opaque user data from TMH */ .in -8 }; .SH "RETURN VALUE" .B ibv_create_cq_ex() returns a pointer to the CQ, or NULL if the request fails. .SH "NOTES" .B ibv_create_cq_ex() may create a CQ with size greater than or equal to the requested size. Check the cqe attribute in the returned CQ for the actual size. .PP CQ should be destroyed with ibv_destroy_cq. .PP .SH "SEE ALSO" .BR ibv_create_cq (3), .BR ibv_destroy_cq (3), .BR ibv_resize_cq (3), .BR ibv_req_notify_cq (3), .BR ibv_ack_cq_events (3), .BR ibv_create_qp (3), .BR ibv_alloc_parent_domain (3) .SH "AUTHORS" .TP Matan Barak rdma-core-56.1/libibverbs/man/ibv_create_flow.3000066400000000000000000000241401477342711600214470ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH IBV_CREATE_FLOW 3 2016-03-15 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_create_flow, ibv_destroy_flow \- create or destroy flow steering rules .SH "SYNOPSIS" .nf .B #include .sp .BI "struct ibv_flow *ibv_create_flow(struct ibv_qp " "*qp" , .BI " struct ibv_flow_attr " "*flow_attr"); .BI "int ibv_destroy_flow(struct ibv_flow " "*flow_id"); .sp .fi .SH "DESCRIPTION" .SS ibv_create_flow() allows a user application QP .I qp to be attached into a specified flow .I flow which is defined in .I .PP .nf struct ibv_flow_attr { .in +8 uint32_t comp_mask; /* Future extendibility */ enum ibv_flow_attr_type type; /* Rule type - see below */ uint16_t size; /* Size of command */ uint16_t priority; /* Rule priority - see below */ uint8_t num_of_specs; /* Number of ibv_flow_spec_xxx */ uint8_t port; /* The uplink port number */ uint32_t flags; /* Extra flags for rule - see below */ /* Following are the optional layers according to user request * struct ibv_flow_spec_xxx * struct ibv_flow_spec_yyy */ .in -8 }; .sp .nf enum ibv_flow_attr_type { .in +8 IBV_FLOW_ATTR_NORMAL = 0x0, /* Steering according to rule specifications */ IBV_FLOW_ATTR_ALL_DEFAULT = 0x1, /* Default unicast and multicast rule - receive all Eth traffic which isn't steered to any QP */ IBV_FLOW_ATTR_MC_DEFAULT = 0x2, /* Default multicast rule - receive all Eth multicast traffic which isn't steered to any QP */ IBV_FLOW_ATTR_SNIFFER = 0x3, /* Sniffer rule - receive all port traffic */ .in -8 }; .sp .nf enum ibv_flow_flags { .in +8 IBV_FLOW_ATTR_FLAGS_DONT_TRAP = 1 << 1, /* Rule doesn't trap received packets, allowing them to match lower prioritized rules */ IBV_FLOW_ATTR_FLAGS_EGRESS = 1 << 2, /* Match sent packets against EGRESS rules and carry associated actions if required */ .in -8 }; .fi .nf .br enum ibv_flow_spec_type { .in +8 IBV_FLOW_SPEC_ETH = 0x20, /* Flow specification of L2 header */ IBV_FLOW_SPEC_IPV4 = 0x30, /* Flow specification of IPv4 header */ IBV_FLOW_SPEC_IPV6 = 0x31, /* Flow specification of IPv6 header */ IBV_FLOW_SPEC_IPV4_EXT = 0x32, /* Extended flow specification of IPv4 */ IBV_FLOW_SPEC_ESP = 0x34, /* Flow specification of ESP (IPSec) header */ IBV_FLOW_SPEC_TCP = 0x40, /* Flow specification of TCP header */ IBV_FLOW_SPEC_UDP = 0x41, /* Flow specification of UDP header */ IBV_FLOW_SPEC_VXLAN_TUNNEL = 0x50, /* Flow specification of VXLAN header */ IBV_FLOW_SPEC_GRE = 0x51, /* Flow specification of GRE header */ IBV_FLOW_SPEC_MPLS = 0x60, /* Flow specification of MPLS header */ 
IBV_FLOW_SPEC_INNER = 0x100, /* Flag making L2/L3/L4 specifications to be applied on the inner header */ IBV_FLOW_SPEC_ACTION_TAG = 0x1000, /* Action tagging matched packet */ IBV_FLOW_SPEC_ACTION_DROP = 0x1001, /* Action dropping matched packet */ IBV_FLOW_SPEC_ACTION_HANDLE = 0x1002, /* Carry out an action created by ibv_create_flow_action_xxxx verb */ IBV_FLOW_SPEC_ACTION_COUNT = 0x1003, /* Action count matched packet with a ibv_counters handle */ .in -8 }; .br Flow specification general structure: .BR struct ibv_flow_spec_xxx { .in +8 enum ibv_flow_spec_type type; uint16_t size; /* Flow specification size = sizeof(struct ibv_flow_spec_xxx) */ struct ibv_flow_xxx_filter val; struct ibv_flow_xxx_filter mask; /* Defines which bits from the filter value are applicable when looking for a match in the incoming packet */ .in -8 }; .PP Each spec struct holds the relevant network layer parameters for matching. To enforce the match, the user sets a mask for each parameter. .br Packets coming from the wire are matched against the flow specification. If a match is found, the associated flow actions are executed on the packet. .br In ingress flows, the QP parameter is treated as another action of scattering the packet to the respective QP. .br If the bit is set in the mask, the corresponding bit in the value should be matched. .br Note that most vendors support either full mask (all "1"s) or zero mask (all "0"s). .br .B Network parameters in the relevant network structs should be given in network order (big endian). .SS Flow domains and priority Flow steering defines the concept of domain and priority. Each domain represents an application that can attach a flow. Domains are prioritized. A higher priority domain will always supersede a lower priority domain when their flow specifications overlap. .br .B IB verbs have the higher priority domain. .br In addition to the domain, there is priority within each of the domains. A lower priority numeric value (higher priority) takes precedence over matching rules with higher numeric priority value (lower priority). It is important to note that the priority value of a flow spec is used not only to establish the precedence of conflicting flow matches but also as a way to abstract the order on which flow specs are tested for matches. Flows with higher priorities will be tested before flows with lower priorities. .SS Rules definition ordering An application can provide the ibv_flow_spec_xxx rules in an unordered scheme. In this case, each spec should be well defined and match a specific network header layer. In some cases, when certain flow spec types are present in the spec list, it is required to provide the list in an ordered manner so that the position of that flow spec type in the protocol stack is strictly defined. When a spec type that requires ordering resides in the inner network protocol stack (in tunnel protocols) the ordering should be applied to the inner network specs and should be combined with the inner spec indication using the IBV_FLOW_SPEC_INNER flag. For example: An MPLS spec which attempts to match an MPLS tag in the inner network should have the IBV_FLOW_SPEC_INNER flag set and so do the rest of the inner network specs. On top of that, all the inner network specs should be provided in an ordered manner. This is essential to represent many of the encapsulation tunnel protocols. .br The flow spec types which require this sort of ordering are: .br .B 1.
IBV_FLOW_SPEC_MPLS - .br Since an MPLS header can appear at several locations in the protocol stack and can also be encapsulated on top of different layers, it is required to place this spec according to its exact location in the protocol stack. .br .SS ibv_destroy_flow() destroys the flow .I flow_id\fR. .SH "RETURN VALUE" .B ibv_create_flow() returns a pointer to the flow, or NULL if the request fails. In case of an error, errno is updated. .PP .B ibv_destroy_flow() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "ERRORS" .SS EINVAL .B ibv_create_flow() flow specification, QP or priority are invalid .PP .B ibv_destroy_flow() flow_id is invalid .SS ENOMEM Couldn't create/destroy flow, not enough memory .SS ENXIO Device managed flow steering isn't currently supported .SS EPERM No permissions to add the flow steering rule .SH "NOTES" 1. These verbs are available only for devices supporting .br IBV_DEVICE_MANAGED_FLOW_STEERING and only for QPs of Transport Service Type .BR IBV_QPT_UD or .BR IBV_QPT_RAW_PACKET .br 2. The user must memset the spec struct with zeros before using it. .br 3. The ether_type field in ibv_flow_eth_filter is the ethertype following the last VLAN tag of the packet. .br 4. Only rule type IBV_FLOW_ATTR_NORMAL supports the IBV_FLOW_ATTR_FLAGS_DONT_TRAP flag. .br 5. No specifications are needed for the IBV_FLOW_ATTR_SNIFFER rule type. .br 6. When the IBV_FLOW_ATTR_FLAGS_EGRESS flag is set, the qp parameter is used only as a means to get the device. .br .PP .SH EXAMPLE .br The flow_attr below defines a rule at priority 0 that matches a destination MAC address and a source IPv4 address. For that, L2 and L3 specs are used. .br If there is a hit on this rule, i.e. the received packet has destination MAC 66:11:22:33:44:55 and source IP 0x0B86C806, the packet is steered to its attached QP.
.sp .nf struct raw_eth_flow_attr { .in +8 struct ibv_flow_attr attr; struct ibv_flow_spec_eth spec_eth; struct ibv_flow_spec_ipv4 spec_ipv4; .in -8 } __attribute__((packed)); .sp .nf struct raw_eth_flow_attr flow_attr = { .in +8 .attr = { .comp_mask = 0, .type = IBV_FLOW_ATTR_NORMAL, .size = sizeof(flow_attr), .priority = 0, .num_of_specs = 2, .port = 1, .flags = 0, }, .spec_eth = { .type = IBV_FLOW_SPEC_ETH, .size = sizeof(struct ibv_flow_spec_eth), .val = { .dst_mac = {0x66, 0x11, 0x22, 0x33, 0x44, 0x55}, .src_mac = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}, .ether_type = 0, .vlan_tag = 0, }, .mask = { .dst_mac = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}, .src_mac = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}, .ether_type = 0, .vlan_tag = 0, } }, .spec_ipv4 = { .type = IBV_FLOW_SPEC_IPV4, .size = sizeof(struct ibv_flow_spec_ipv4), .val = { .src_ip = 0x0B86C806, .dst_ip = 0, }, .mask = { .src_ip = 0xFFFFFFFF, .dst_ip = 0, } } .in -8 }; .sp .nf .SH "AUTHORS" .TP Hadar Hen Zion .TP Matan Barak .TP Yishai Hadas .TP Maor Gottlieb rdma-core-56.1/libibverbs/man/ibv_create_flow_action.3.md000066400000000000000000000263641477342711600234150ustar00rootroot00000000000000--- layout: page title: ibv_flow_action_esp section: 3 tagline: Verbs --- # NAME ibv_flow_action_esp - Flow action esp for verbs # SYNOPSIS ```c #include <infiniband/verbs.h> struct ibv_flow_action * ibv_create_flow_action_esp(struct ibv_context *ctx, struct ibv_flow_action_esp *esp); int ibv_modify_flow_action_esp(struct ibv_flow_action *action, struct ibv_flow_action_esp *esp); int ibv_destroy_flow_action(struct ibv_flow_action *action); ``` # DESCRIPTION An IPSEC ESP flow steering action allows a flow steering rule to decrypt or encrypt a packet after matching. Each action contains the necessary information for this operation in the *esp* argument. After the crypto operation the packet will continue to be processed by flow steering rules until it reaches a final action of discard or delivery. After the action is created, it should be associated with a *struct ibv_flow_attr* using a *struct ibv_flow_spec_action_handle* flow specification. Each action can be associated with multiple flows, and *ibv_modify_flow_action_esp* will alter all associated flows simultaneously. # ARGUMENTS *ctx* : RDMA device context to create the action on. *esp* : ESP parameters and key material for the action. *action* : Existing action to modify ESP parameters. ## *esp* Argument ```c struct ibv_flow_action_esp { struct ibv_flow_action_esp_attr *esp_attr; /* See Key Material */ uint16_t keymat_proto; uint16_t keymat_len; void *keymat_ptr; /* See Replay Protection */ uint16_t replay_proto; uint16_t replay_len; void *replay_ptr; struct ibv_flow_action_esp_encap *esp_encap; uint32_t comp_mask; uint32_t esn; }; ``` *comp_mask* : Bitmask specifying what fields in the structure are valid. *esn* : The starting value of the ESP extended sequence number. Valid only if *IBV_FLOW_ACTION_ESP_MASK_ESN* is set in *comp_mask*. The 32 bits of *esn* will be used to compute the full 64 bit ESN required for the AAD construction. When in *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_INLINE_CRYPTO* mode, the implementation will automatically track rollover of the lower 32 bits of the ESN. However, an update of the window is required once every 2^31 sequences. When in *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD* mode this value is automatically incremented and it is also used for anti-replay checks. *esp_attr* : See *ESP Attributes*. May be NULL on modify.
*keymat_proto*, *keymat_len*, *keymat_ptr* : Describe the key material and encryption standard to use. May be NULL on modify. *replay_proto*, *replay_len*, *replay_ptr* : Describe the replay protection scheme used to manage sequence numbers and prevent replay attacks. These fields are only valid in full offload mode. May be NULL on modify. *esp_encap* : Describes the encapsulation of ESP packets such as the IP tunnel and/or UDP encapsulation. This field is only valid in full offload mode. May be NULL on modify. ## ESP attributes ```c struct ibv_flow_action_esp_attr { uint32_t spi; uint32_t seq; uint32_t tfc_pad; uint32_t flags; uint64_t hard_limit_pkts; }; ``` *flags* : A bitwise OR of the various *IB_UVERBS_FLOW_ACTION_ESP_FLAGS* described below. *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_DECRYPT*, *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_ENCRYPT* : The action will decrypt or encrypt a packet using the provided keying material. The implementation may require that encrypt is only used with an egress flow steering rule, and that decrypt is only used with an ingress flow steering rule. ## Full Offload Mode When the *esp_attr* flag *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD* is set the ESP header and trailer are added and removed automatically during the cipher operation. In this case the *esn* and *spi* are used to populate and check the ESP header, and any information from the *keymat* (e.g. an IV) is placed in the headers and otherwise handled automatically. For decrypt the hardware will perform anti-replay. Decryption failure will cause the packet to be dropped. This action must be combined with the flow steering that identifies the packets protected by the SA defined in this action. The following members of the esp_attr are used only in full offload mode: *spi* : The value for the ESP Security Parameters Index. It is only used for *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD*. *seq* : The initial lower 32 bits of the sequence number. This is the value of the ESP sequence number. It is only used for *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD*. *tfc_pad* : The length of Traffic Flow Confidentiality Padding as specified by RFC4303. If it is set to zero no additional padding is added. It is only used for *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD*. *hard_limit_pkts* : The hard lifetime of the SA measured in number of packets, as specified by RFC4301. After this limit is reached the action will drop future packets to prevent breaking the crypto. It is only used for *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD*. ## Inline Crypto Mode When the *esp_attr* flag *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_INLINE_CRYPTO* is set the user must provide packets with additional headers. For encrypt the packet must contain a fully populated IPSEC packet except the data payload is left unencrypted and there is no IPsec trailer. If the IV must be unpredictable, then a flag should indicate the transformation such as *IB_UVERBS_FLOW_ACTION_IV_ALGO_SEQ*. *IB_UVERBS_FLOW_ACTION_IV_ALGO_SEQ* means that the IV is incremented sequentially. If the IV algorithm is supported by HW, then it could provide support for LSO offload with ESP inline crypto. Finally, the IV used to encrypt the packet replaces the IV field provided, the payload is encrypted and authenticated, a trailer with padding is added and the ICV is added as well. For decrypt the packet is authenticated and decrypted in-place, resulting in a decrypted IPSEC packet with no trailer.
The result of decryption and authentication can be retrieved from an extended CQ via the *ibv_wc_read_XXX(3)* function. This mode must be combined with the flow steering including *IBV_FLOW_SPEC_IPV4* and *IBV_FLOW_SPEC_ESP* to match the outer packet headers to ensure that the action is only applied to IPSEC packets with the correct identifiers. For inline crypto, we have some special requirements to maintain a stateless ESN while maintaining the same parameters as software. The system supports offloading a portion of the IPSEC flow, enabling a single flow to be split between multiple NICs. ### Determining the ESN for Ingress Packets We require a "modify" command once every 2^31 packets. This modify command allows the implementation in HW to be stateless, as follows: ``` ESN 1 ESN 2 ESN 3 |-------------*-------------|-------------*-------------|-------------* ^ ^ ^ ^ ^ ^ ``` ^ - marks where a command is invoked to update the SA ESN state machine. | - marks the start of the ESN scope (0-2^32-1). At this point move the SA ESN "new_window" bit to zero and increment the ESN. * - marks the middle of the ESN scope (2^31). At this point move the SA ESN "new_window" bit to one. For decryption the implementation uses the following state machine to determine the ESN: ```c if (!overlap) { use esn // regardless of packet.seq } else { // new_window if (packet.seq >= 2^31) use esn else // packet.seq < 2^31 use esn+1 } ``` This mechanism is controlled by the *esp_attr* flag: *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_ESN_NEW_WINDOW* : This flag is only used to provide stateless ESN support for inline crypto. It is used only for *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_INLINE_CRYPTO* and *IBV_FLOW_ACTION_ESP_MASK_ESN*. Setting this flag indicates that the bottom of the replay window is between 2^31 and 2^32. ## Key Material for AES GCM (*IBV_FLOW_ACTION_ESP_KEYMAT_AES_GCM*) The AES GCM crypto algorithm as defined by RFC4106. This struct is to be provided in *keymat_ptr* when *keymat_proto* is set to *IBV_FLOW_ACTION_ESP_KEYMAT_AES_GCM*. ```c struct ibv_flow_action_esp_keymat_aes_gcm { uint64_t iv; uint32_t iv_algo; /* Use enum ib_uverbs_flow_action_esp_aes_gcm_keymat_iv_algo */ uint32_t salt; uint32_t icv_len; uint32_t key_len; uint32_t aes_key[256 / 32]; }; ``` *iv* : The starting value for the initialization vector used only with *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD* encryption as defined in RFC4106. This field is ignored for *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_INLINE_CRYPTO*. For a given key, the IV MUST NOT be reused. *iv_algo* : The algorithm used to transform/generate new IVs with *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD* encryption. The only supported value is *IB_UVERBS_FLOW_ACTION_IV_ALGO_SEQ* to generate sequential IVs. *salt* : The salt as defined by RFC4106. *icv_len* : The length of the Integrity Check Value in bytes as defined by RFC4106. *aes_key*, *key_len* : The cipher key data. It must be either 16, 24 or 32 bytes as defined by RFC4106. ## Bitmap Replay Protection (*IBV_FLOW_ACTION_ESP_REPLAY_BMP*) A shifting bitmap is used to identify which packets have already been received. Each bit in the bitmap represents a packet; it is set if a packet with this ESP sequence number has been received and it passed authentication. If a packet with the same sequence is received, then the bit is already set, causing replay protection to drop the packet. The bitmap represents a window of *size* sequence numbers. If a newer sequence number is received, then the bitmap will shift to represent this as in RFC6479.
The replay window cannot shift more than 2^31 sequence numbers forward. This struct is to be provided in *replay_ptr* when *replay_proto* is set to *IBV_FLOW_ACTION_ESP_REPLAY_BMP*. In this mode *replay_ptr* and *replay_len* should point to a struct ibv_flow_action_esp_replay_bmp containing: *size* : The size of the bitmap. ## ESP Encapsulation An *esp_encap* specification is required when the *esp_attr* flag *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_TUNNEL* is set. It is used to provide the fields for the encapsulation header that is added/removed to/from packets. Tunnel and Transport mode are defined as in RFC4301. UDP encapsulation of ESP can be specified by providing the appropriate UDP header. This setting is only used in *IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD* mode. ```c struct ibv_flow_action_esp_encap { void *val; /* pointer to struct ibv_flow_xxxx_filter */ struct ibv_flow_action_esp_encap *next_ptr; uint16_t len; /* Len of mask and pointer (separately) */ uint16_t type; /* Use flow_spec enum */ }; ``` Each link in the list specifies a network header in the same manner as the flow steering API. The header should be selected from a supported header in 'enum ibv_flow_spec_type'. # RETURN VALUE Upon success *ibv_create_flow_action_esp* will return a new *struct ibv_flow_action* object, on error NULL will be returned and errno will be set. Upon success *ibv_modify_flow_action_esp* will return 0. On error the value of errno will be returned. If *ibv_modify_flow_action_esp* fails, it is guaranteed that the previous action settings still hold. If it succeeds, there is a point in time at which the old action is applied to all packets up to that point and the new one is applied to all packets from that point on. # SEE ALSO *ibv_create_flow(3)*, *ibv_destroy_flow_action(3)*, *RFC 4106* rdma-core-56.1/libibverbs/man/ibv_create_qp.3000066400000000000000000000061571477342711600211300ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_CREATE_QP 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_create_qp, ibv_destroy_qp \- create or destroy a queue pair (QP) .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "struct ibv_qp *ibv_create_qp(struct ibv_pd " "*pd" , .BI " struct ibv_qp_init_attr " "*qp_init_attr" ); .sp .BI "int ibv_destroy_qp(struct ibv_qp " "*qp" ); .fi .SH "DESCRIPTION" .B ibv_create_qp() creates a queue pair (QP) associated with the protection domain .I pd\fR. The argument .I qp_init_attr is an ibv_qp_init_attr struct, as defined in <infiniband/verbs.h>.
.PP .nf struct ibv_qp_init_attr { .in +8 void *qp_context; /* Associated context of the QP */ struct ibv_cq *send_cq; /* CQ to be associated with the Send Queue (SQ) */ struct ibv_cq *recv_cq; /* CQ to be associated with the Receive Queue (RQ) */ struct ibv_srq *srq; /* SRQ handle if QP is to be associated with an SRQ, otherwise NULL */ struct ibv_qp_cap cap; /* QP capabilities */ enum ibv_qp_type qp_type; /* QP Transport Service Type: IBV_QPT_RC, IBV_QPT_UC, IBV_QPT_UD, IBV_QPT_RAW_PACKET or IBV_QPT_DRIVER */ int sq_sig_all; /* If set, each Work Request (WR) submitted to the SQ generates a completion entry */ .in -8 }; .sp .nf struct ibv_qp_cap { .in +8 uint32_t max_send_wr; /* Requested max number of outstanding WRs in the SQ */ uint32_t max_recv_wr; /* Requested max number of outstanding WRs in the RQ */ uint32_t max_send_sge; /* Requested max number of scatter/gather (s/g) elements in a WR in the SQ */ uint32_t max_recv_sge; /* Requested max number of s/g elements in a WR in the RQ */ uint32_t max_inline_data;/* Requested max number of data (bytes) that can be posted inline to the SQ, otherwise 0 */ .in -8 }; .fi .PP The function .B ibv_create_qp() will update the .I qp_init_attr\fB\fR->cap struct with the actual \s-1QP\s0 values of the QP that was created; the values will be greater than or equal to the values requested. .PP .B ibv_destroy_qp() destroys the QP .I qp\fR. .SH "RETURN VALUE" .B ibv_create_qp() returns a pointer to the created QP, or NULL if the request fails. Check the QP number (\fBqp_num\fR) in the returned QP. .PP .B ibv_destroy_qp() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" .B ibv_create_qp() will fail if it is asked to create a QP of a type other than .B IBV_QPT_RC or .B IBV_QPT_UD that is associated with an SRQ. .PP The attributes max_recv_wr and max_recv_sge are ignored by .B ibv_create_qp() if the QP is to be associated with an SRQ. .PP .B ibv_destroy_qp() fails if the QP is attached to a multicast group. .PP .B IBV_QPT_DRIVER does not represent a specific service and is used for vendor specific QP logic. .SH "SEE ALSO" .BR ibv_alloc_pd (3), .BR ibv_modify_qp (3), .BR ibv_query_qp (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_create_qp_ex.3000066400000000000000000000143451477342711600216220ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_CREATE_QP_EX 3 2013-06-26 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_create_qp_ex, ibv_destroy_qp \- create or destroy a queue pair (QP) .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "struct ibv_qp *ibv_create_qp_ex(struct ibv_context " "*context" , .BI " struct ibv_qp_init_attr_ex " "*qp_init_attr" ); .sp .BI "int ibv_destroy_qp(struct ibv_qp " "*qp" ); .fi .SH "DESCRIPTION" .B ibv_create_qp_ex() creates a queue pair (QP) associated with the RDMA device context .I context\fR. The argument .I qp_init_attr_ex is an ibv_qp_init_attr_ex struct, as defined in <infiniband/verbs.h>.
.PP .nf struct ibv_qp_init_attr_ex { .in +8 void *qp_context; /* Associated context of the QP */ struct ibv_cq *send_cq; /* CQ to be associated with the Send Queue (SQ) */ struct ibv_cq *recv_cq; /* CQ to be associated with the Receive Queue (RQ) */ struct ibv_srq *srq; /* SRQ handle if QP is to be associated with an SRQ, otherwise NULL */ struct ibv_qp_cap cap; /* QP capabilities */ enum ibv_qp_type qp_type; /* QP Transport Service Type: IBV_QPT_RC, IBV_QPT_UC, IBV_QPT_UD, IBV_QPT_RAW_PACKET or IBV_QPT_DRIVER */ int sq_sig_all; /* If set, each Work Request (WR) submitted to the SQ generates a completion entry */ uint32_t comp_mask; /* Identifies valid fields */ struct ibv_pd *pd; /* PD to be associated with the QP */ struct ibv_xrcd *xrcd; /* XRC domain to be associated with the target QP */ enum ibv_qp_create_flags create_flags; /* Creation flags for this QP */ uint16_t max_tso_header; /* Maximum TSO header size */ struct ibv_rwq_ind_table *rwq_ind_tbl; /* Indirection table to be associated with the QP */ struct ibv_rx_hash_conf rx_hash_conf; /* RX hash configuration to be used */ uint32_t source_qpn; /* Source QP number; the creation flag IBV_QP_CREATE_SOURCE_QPN should be set, see NOTES below */ uint64_t send_ops_flags; /* Select which QP send ops will be defined in struct ibv_qp_ex. Use enum ibv_qp_create_send_ops_flags */ .in -8 }; .sp .nf struct ibv_qp_cap { .in +8 uint32_t max_send_wr; /* Requested max number of outstanding WRs in the SQ */ uint32_t max_recv_wr; /* Requested max number of outstanding WRs in the RQ */ uint32_t max_send_sge; /* Requested max number of scatter/gather (s/g) elements in a WR in the SQ */ uint32_t max_recv_sge; /* Requested max number of s/g elements in a WR in the RQ */ uint32_t max_inline_data;/* Requested max number of data (bytes) that can be posted inline to the SQ, otherwise 0 */ .in -8 }; .nf enum ibv_qp_create_flags { .in +8 IBV_QP_CREATE_BLOCK_SELF_MCAST_LB = 1 << 1, /* Prevent self multicast loopback */ IBV_QP_CREATE_SCATTER_FCS = 1 << 8, /* FCS field will be scattered to host memory */ IBV_QP_CREATE_CVLAN_STRIPPING = 1 << 9, /* CVLAN field will be stripped from incoming packets */ IBV_QP_CREATE_SOURCE_QPN = 1 << 10, /* The created QP will use the source_qpn as its wire QP number */ IBV_QP_CREATE_PCI_WRITE_END_PADDING = 1 << 11, /* Incoming packets will be padded to cacheline size */ .in -8 }; .fi .nf struct ibv_rx_hash_conf { .in +8 uint8_t rx_hash_function; /* RX hash function, use enum ibv_rx_hash_function_flags */ uint8_t rx_hash_key_len; /* RX hash key length */ uint8_t *rx_hash_key; /* RX hash key data */ uint64_t rx_hash_fields_mask; /* RX fields that should participate in the hashing, use enum ibv_rx_hash_fields */ .in -8 }; .fi .nf enum ibv_rx_hash_fields { .in +8 IBV_RX_HASH_SRC_IPV4 = 1 << 0, IBV_RX_HASH_DST_IPV4 = 1 << 1, IBV_RX_HASH_SRC_IPV6 = 1 << 2, IBV_RX_HASH_DST_IPV6 = 1 << 3, IBV_RX_HASH_SRC_PORT_TCP = 1 << 4, IBV_RX_HASH_DST_PORT_TCP = 1 << 5, IBV_RX_HASH_SRC_PORT_UDP = 1 << 6, IBV_RX_HASH_DST_PORT_UDP = 1 << 7, IBV_RX_HASH_IPSEC_SPI = 1 << 8, /* When using a tunneling protocol, e.g. VXLAN, there is an inner (encapsulated) packet and an outer one. * To apply RSS on the inner packet, the following flag should be set together with one of the L3/L4 fields.
*/ IBV_RX_HASH_INNER = (1UL << 31), .in -8 }; .fi .nf enum ibv_qp_create_send_ops_flags { .in +8 IBV_QP_EX_WITH_RDMA_WRITE = 1 << 0, IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM = 1 << 1, IBV_QP_EX_WITH_SEND = 1 << 2, IBV_QP_EX_WITH_SEND_WITH_IMM = 1 << 3, IBV_QP_EX_WITH_RDMA_READ = 1 << 4, IBV_QP_EX_WITH_ATOMIC_CMP_AND_SWP = 1 << 5, IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD = 1 << 6, IBV_QP_EX_WITH_LOCAL_INV = 1 << 7, IBV_QP_EX_WITH_BIND_MW = 1 << 8, IBV_QP_EX_WITH_SEND_WITH_INV = 1 << 9, IBV_QP_EX_WITH_TSO = 1 << 10, .in -8 }; .fi .PP The function .B ibv_create_qp_ex() will update the .I qp_init_attr_ex\fB\fR->cap struct with the actual \s-1QP\s0 values of the QP that was created; the values will be greater than or equal to the values requested. .PP .B ibv_destroy_qp() destroys the QP .I qp\fR. .SH "RETURN VALUE" .B ibv_create_qp_ex() returns a pointer to the created QP, or NULL if the request fails. Check the QP number (\fBqp_num\fR) in the returned QP. .PP .B ibv_destroy_qp() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" .PP The attributes max_recv_wr and max_recv_sge are ignored by .B ibv_create_qp_ex() if the QP is to be associated with an SRQ. .PP The attribute source_qpn is supported only on a UD QP; without flow steering, RX is not possible. .PP Use .B ibv_qp_to_qp_ex() to get the .I ibv_qp_ex for accessing the send ops iterator interface, when the QP create attribute IBV_QP_INIT_ATTR_SEND_OPS_FLAGS is used. .PP .B ibv_destroy_qp() fails if the QP is attached to a multicast group. .PP .B IBV_QPT_DRIVER does not represent a specific service and is used for vendor specific QP logic. .SH "SEE ALSO" .BR ibv_alloc_pd (3), .BR ibv_modify_qp (3), .BR ibv_query_qp (3), .BR ibv_create_rwq_ind_table (3) .SH "AUTHORS" .TP Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_create_rwq_ind_table.3000066400000000000000000000040001477342711600233060ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH CREATE_RWQ_IND_TBL 3 2016-07-27 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_create_rwq_ind_table, ibv_destroy_rwq_ind_table \- create or destroy a Receive Work Queue Indirection Table (RWQ IND TBL). .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "struct ibv_rwq_ind_table *ibv_create_rwq_ind_table(struct ibv_context " "*context," .BI " struct ibv_rwq_ind_table_init_attr " "*init_attr" ); .sp .BI "int ibv_destroy_rwq_ind_table(struct ibv_rwq_ind_table " "*rwq_ind_table" ); .fi .SH "DESCRIPTION" .B ibv_create_rwq_ind_table() creates a RWQ IND TBL associated with the ibv_context .I context\fR. The argument .I init_attr is an ibv_rwq_ind_table_init_attr struct, as defined in <infiniband/verbs.h>. .PP .nf struct ibv_rwq_ind_table_init_attr { .in +8 uint32_t log_ind_tbl_size; /* Log, base 2, of Indirection table size */ struct ibv_wq **ind_tbl; /* Each entry is a pointer to a Receive Work Queue */ uint32_t comp_mask; /* Identifies valid fields. Use ibv_ind_table_init_attr_mask */ .in -8 }; .fi .PP The function .B ibv_create_rwq_ind_table() will create a RWQ IND TBL that holds a table of Receive Work Queues. For further usage of the created object see .I NOTES below\fR. .PP .B ibv_destroy_rwq_ind_table() destroys the RWQ IND TBL .I rwq_ind_table\fR. .SH "RETURN VALUE" .B ibv_create_rwq_ind_table() returns a pointer to the created RWQ IND TBL, or NULL if the request fails. .PP .B ibv_destroy_rwq_ind_table() returns 0 on success, or the value of errno on failure (which indicates the failure reason).
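.SH "EXAMPLE"
.PP
A minimal sketch (error paths trimmed; the names ctx and wqs are placeholders for a device context and an array of 8 Work Queues already created with ibv_create_wq()):
.PP
.nf
struct ibv_rwq_ind_table_init_attr init_attr = {
	.log_ind_tbl_size = 3,   /* 2^3 = 8 entries */
	.ind_tbl          = wqs, /* struct ibv_wq *wqs[8] */
	.comp_mask        = 0,
};
struct ibv_rwq_ind_table *ind_tbl;

ind_tbl = ibv_create_rwq_ind_table(ctx, &init_attr);
if (!ind_tbl)
	return 1;

/* ... reference ind_tbl from ibv_create_qp_ex() via rwq_ind_tbl ... */

ibv_destroy_rwq_ind_table(ind_tbl);
.fi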
.SH "NOTES" The created object should be used as part of .I ibv_create_qp_ex() to enable dispatching of incoming packets based on some RX hash configuration. .SH "SEE ALSO" .BR ibv_create_wq (3), .BR ibv_modify_wq (3), .BR ibv_create_qp_ex (3), .SH "AUTHORS" .TP Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_create_srq.3000066400000000000000000000037551477342711600213160ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_CREATE_SRQ 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_create_srq, ibv_destroy_srq \- create or destroy a shared receive queue (SRQ) .SH "SYNOPSIS" .nf .B #include .sp .BI "struct ibv_srq *ibv_create_srq(struct ibv_pd " "*pd" ", struct " .BI " ibv_srq_init_attr " "*srq_init_attr" ); .sp .BI "int ibv_destroy_srq(struct ibv_srq " "*srq" ); .fi .SH "DESCRIPTION" .B ibv_create_srq() creates a shared receive queue (SRQ) associated with the protection domain .I pd\fR. The argument .I srq_init_attr is an ibv_srq_init_attr struct, as defined in . .PP .nf struct ibv_srq_init_attr { .in +8 void *srq_context; /* Associated context of the SRQ */ struct ibv_srq_attr attr; /* SRQ attributes */ .in -8 }; .sp .nf struct ibv_srq_attr { .in +8 uint32_t max_wr; /* Requested max number of outstanding work requests (WRs) in the SRQ */ uint32_t max_sge; /* Requested max number of scatter elements per WR */ uint32_t srq_limit; /* The limit value of the SRQ (irrelevant for ibv_create_srq) */ .in -8 }; .fi .PP The function .B ibv_create_srq() will update the .I srq_init_attr struct with the original values of the SRQ that was created; the values of max_wr and max_sge will be greater than or equal to the values requested. .PP .B ibv_destroy_srq() destroys the SRQ .I srq\fR. .SH "RETURN VALUE" .B ibv_create_srq() returns a pointer to the created SRQ, or NULL if the request fails. .PP .B ibv_destroy_srq() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" .B ibv_destroy_srq() fails if any queue pair is still associated with this SRQ. .SH "SEE ALSO" .BR ibv_alloc_pd (3), .BR ibv_modify_srq (3), .BR ibv_query_srq (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_create_srq_ex.3000066400000000000000000000052171477342711600220050ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_CREATE_SRQ_EX 3 2013-06-26 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_create_srq_ex, ibv_destroy_srq \- create or destroy a shared receive queue (SRQ) .SH "SYNOPSIS" .nf .B #include .sp .BI "struct ibv_srq *ibv_create_srq_ex(struct ibv_context " "*context" ", struct " .BI " ibv_srq_init_attr_ex " "*srq_init_attr_ex" ); .sp .BI "int ibv_destroy_srq(struct ibv_srq " "*srq" ); .fi .SH "DESCRIPTION" .B ibv_create_srq_ex() creates a shared receive queue (SRQ) supporting both basic and xrc modes. The argument .I srq_init_attr_ex is an ibv_srq_init_attr_ex struct, as defined in . 
.PP .nf struct ibv_srq_init_attr_ex { .in +8 void *srq_context; /* Associated context of the SRQ */ struct ibv_srq_attr attr; /* SRQ attributes */ uint32_t comp_mask; /* Identifies valid fields */ enum ibv_srq_type srq_type; /* Basic / XRC / tag matching */ struct ibv_pd *pd; /* PD associated with the SRQ */ struct ibv_xrcd *xrcd; /* XRC domain to associate with the SRQ */ struct ibv_cq *cq; /* CQ to associate with the SRQ for XRC mode */ struct ibv_tm_cap tm_cap; /* Tag matching attributes */ .in -8 }; .sp .nf struct ibv_srq_attr { .in +8 uint32_t max_wr; /* Requested max number of outstanding work requests (WRs) in the SRQ */ uint32_t max_sge; /* Requested max number of scatter elements per WR */ uint32_t srq_limit; /* The limit value of the SRQ */ .in -8 }; .sp .nf struct ibv_tm_cap { .in +8 uint32_t max_num_tags; /* Tag matching list size */ uint32_t max_ops; /* Number of outstanding tag list operations */ .in -8 }; .sp .nf .fi .PP The function .B ibv_create_srq_ex() will update the .I srq_init_attr_ex struct with the actual values of the SRQ that was created; the values of max_wr and max_sge will be greater than or equal to the values requested. .PP .B ibv_destroy_srq() destroys the SRQ .I srq\fR. .SH "RETURN VALUE" .B ibv_create_srq_ex() returns a pointer to the created SRQ, or NULL if the request fails. .PP .B ibv_destroy_srq() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" .B ibv_destroy_srq() fails if any queue pair is still associated with this SRQ. .SH "SEE ALSO" .BR ibv_alloc_pd (3), .BR ibv_modify_srq (3), .BR ibv_query_srq (3) .SH "AUTHORS" .TP Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_create_wq.3000066400000000000000000000050751477342711600211350ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_CREATE_WQ 3 2016-07-27 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_create_wq, ibv_destroy_wq \- create or destroy a Work Queue (WQ). .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "struct ibv_wq *ibv_create_wq(struct ibv_context " "*context," .BI " struct ibv_wq_init_attr " "*wq_init_attr" ); .sp .BI "int ibv_destroy_wq(struct ibv_wq " "*wq" ); .fi .SH "DESCRIPTION" .B ibv_create_wq() creates a WQ associated with the ibv_context .I context\fR. The argument .I wq_init_attr is an ibv_wq_init_attr struct, as defined in <infiniband/verbs.h>. .PP .nf struct ibv_wq_init_attr { .in +8 void *wq_context; /* Associated context of the WQ */ enum ibv_wq_type wq_type; /* WQ type */ uint32_t max_wr; /* Requested max number of outstanding WRs in the WQ */ uint32_t max_sge; /* Requested max number of scatter/gather (s/g) elements per WR in the WQ */ struct ibv_pd *pd; /* PD to be associated with the WQ */ struct ibv_cq *cq; /* CQ to be associated with the WQ */ uint32_t comp_mask; /* Identifies valid fields.
Use ibv_wq_init_attr_mask */ uint32_t create_flags; /* Creation flags for this WQ, use enum ibv_wq_flags */ .in -8 }; .sp .nf enum ibv_wq_flags { .in +8 IBV_WQ_FLAGS_CVLAN_STRIPPING = 1 << 0, /* CVLAN field will be stripped from incoming packets */ IBV_WQ_FLAGS_SCATTER_FCS = 1 << 1, /* FCS field will be scattered to host memory */ IBV_WQ_FLAGS_DELAY_DROP = 1 << 2, /* Packets won't be dropped immediately if no receive WQEs */ IBV_WQ_FLAGS_PCI_WRITE_END_PADDING = 1 << 3, /* Incoming packets will be padded to cacheline size */ IBV_WQ_FLAGS_RESERVED = 1 << 4, .in -8 }; .nf .fi .PP The function .B ibv_create_wq() will update the .I wq_init_attr\fB\fR->max_wr and .I wq_init_attr\fB\fR->max_sge fields with the actual \s-1WQ\s0 values of the WQ that was created; the values will be greater than or equal to the values requested. .PP .B ibv_destroy_wq() destroys the WQ .I wq\fR. .SH "RETURN VALUE" .B ibv_create_wq() returns a pointer to the created WQ, or NULL if the request fails. .PP .B ibv_destroy_wq() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "SEE ALSO" .BR ibv_modify_wq (3) .SH "AUTHORS" .TP Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_devices.1000066400000000000000000000005761477342711600206040ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH IBV_DEVICES 1 "August 30, 2005" "libibverbs" "USER COMMANDS" .SH NAME ibv_devices \- list RDMA devices .SH SYNOPSIS .B ibv_devices .SH DESCRIPTION .PP List RDMA devices available for use from userspace. .SH SEE ALSO .BR ibv_devinfo (1) .SH AUTHORS .TP Roland Dreier .RI < rolandd@cisco.com > rdma-core-56.1/libibverbs/man/ibv_devinfo.1000066400000000000000000000014611477342711600206060ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH IBV_DEVINFO 1 "August 30, 2005" "libibverbs" "USER COMMANDS" .SH NAME ibv_devinfo \- query RDMA devices .SH SYNOPSIS .B ibv_devinfo [\-d device] [\-i port] [\-l] [\-v] .SH DESCRIPTION .PP Print information about RDMA devices available for use from userspace. .SH OPTIONS .PP .TP \fB\-d\fR, \fB\-\-ib\-dev\fR=\fIDEVICE\fR use IB device \fIDEVICE\fR (default all devices) \fB\-i\fR, \fB\-\-ib\-port\fR=\fIPORT\fR query port \fIPORT\fR (default all ports) \fB\-l\fR, \fB\-\-list\fR only list names of RDMA devices \fB\-v\fR, \fB\-\-verbose\fR print all available information about RDMA devices .SH SEE ALSO .BR ibv_devices (1) .SH AUTHORS .TP Dotan Barak .RI < dotanba@gmail.com > .TP Roland Dreier .RI < rolandd@cisco.com > rdma-core-56.1/libibverbs/man/ibv_event_type_str.3.md000066400000000000000000000021551477342711600226300ustar00rootroot00000000000000--- date: 2006-10-31 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_EVENT_TYPE_STR --- # NAME ibv_event_type_str - Return string describing event_type enum value ibv_node_type_str - Return string describing node_type enum value ibv_port_state_str - Return string describing port_state enum value # SYNOPSIS ```c #include <infiniband/verbs.h> const char *ibv_event_type_str(enum ibv_event_type event_type); const char *ibv_node_type_str(enum ibv_node_type node_type); const char *ibv_port_state_str(enum ibv_port_state port_state); ``` # DESCRIPTION **ibv_node_type_str()** returns a string describing the node type enum value *node_type*.
**ibv_port_state_str()** returns a string describing the port state enum value *port_state*. **ibv_event_type_str()** returns a string describing the event type enum value *event_type*. # RETURN VALUE These functions return a constant string that describes the enum value passed as their argument. # AUTHOR Roland Dreier rdma-core-56.1/libibverbs/man/ibv_fork_init.3.md000066400000000000000000000037741477342711600215500ustar00rootroot00000000000000--- date: 2006-10-31 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_FORK_INIT --- # NAME ibv_fork_init - initialize libibverbs to support fork() # SYNOPSIS ```c #include <infiniband/verbs.h> int ibv_fork_init(void); ``` # DESCRIPTION **ibv_fork_init()** initializes libibverbs's data structures to handle **fork()** function calls correctly and avoid data corruption, whether **fork()** is called explicitly or implicitly (such as in **system()**). It is not necessary to use this function if all parent process threads are always blocked until all child processes end or change address spaces via an **exec()** operation. # RETURN VALUE **ibv_fork_init()** returns 0 on success, or the value of errno on failure (which indicates the failure reason). An error value of EINVAL indicates that RDMA memory registration has already taken place and it is therefore no longer safe to fork. # NOTES **ibv_fork_init()** works on Linux kernels supporting the **MADV_DONTFORK** flag for **madvise()** (2.6.17 and higher). Setting the environment variable **RDMAV_FORK_SAFE** or **IBV_FORK_SAFE** has the same effect as calling **ibv_fork_init()**. Setting the environment variable **RDMAV_HUGEPAGES_SAFE** tells the library to check the underlying page size used by the kernel for memory regions. This is required if an application uses huge pages either directly or indirectly via a library such as libhugetlbfs. Calling **ibv_fork_init()** will reduce performance due to an extra system call for every memory registration, and the additional memory allocated to track memory regions. The precise performance impact depends on the workload and usually will not be significant. Setting **RDMAV_HUGEPAGES_SAFE** adds further overhead to all memory registrations. # SEE ALSO **exec**(3), **fork**(2), **ibv_get_device_list**(3), **system**(3), **wait**(2) # AUTHOR Dotan Barak rdma-core-56.1/libibverbs/man/ibv_get_async_event.3000066400000000000000000000115001477342711600223300ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_GET_ASYNC_EVENT 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_get_async_event, ibv_ack_async_event \- get or acknowledge asynchronous events .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "int ibv_get_async_event(struct ibv_context " "*context" , .BI " struct ibv_async_event " "*event" ); .sp .BI "void ibv_ack_async_event(struct ibv_async_event " "*event" ); .fi .SH "DESCRIPTION" .B ibv_get_async_event() waits for the next async event of the RDMA device context .I context and returns it through the pointer .I event\fR, which is an ibv_async_event struct, as defined in <infiniband/verbs.h>.
.PP .nf struct ibv_async_event { .in +8 union { .in +8 struct ibv_cq *cq; /* CQ that got the event */ struct ibv_qp *qp; /* QP that got the event */ struct ibv_srq *srq; /* SRQ that got the event */ struct ibv_wq *wq; /* WQ that got the event */ int port_num; /* port number that got the event */ .in -8 } element; enum ibv_event_type event_type; /* type of the event */ .in -8 }; .fi .PP One member of the element union will be valid, depending on the event_type member of the structure. event_type will be one of the following events: .PP .I QP events: .TP .B IBV_EVENT_QP_FATAL \fR Error occurred on a QP and it transitioned to error state .TP .B IBV_EVENT_QP_REQ_ERR \fR Invalid Request Local Work Queue Error .TP .B IBV_EVENT_QP_ACCESS_ERR \fR Local access violation error .TP .B IBV_EVENT_COMM_EST \fR Communication was established on a QP .TP .B IBV_EVENT_SQ_DRAINED \fR Send Queue was drained of outstanding messages in progress .TP .B IBV_EVENT_PATH_MIG \fR A connection has migrated to the alternate path .TP .B IBV_EVENT_PATH_MIG_ERR \fR A connection failed to migrate to the alternate path .TP .B IBV_EVENT_QP_LAST_WQE_REACHED \fR Last WQE Reached on a QP associated with an SRQ .PP .I CQ events: .TP .B IBV_EVENT_CQ_ERR \fR CQ is in error (CQ overrun) .PP .I SRQ events: .TP .B IBV_EVENT_SRQ_ERR \fR Error occurred on an SRQ .TP .B IBV_EVENT_SRQ_LIMIT_REACHED \fR SRQ limit was reached .PP .I WQ events: .TP .B IBV_EVENT_WQ_FATAL \fR Error occurred on a WQ and it transitioned to error state .PP .I Port events: .TP .B IBV_EVENT_PORT_ACTIVE \fR Link became active on a port .TP .B IBV_EVENT_PORT_ERR \fR Link became unavailable on a port .TP .B IBV_EVENT_LID_CHANGE \fR LID was changed on a port .TP .B IBV_EVENT_PKEY_CHANGE \fR P_Key table was changed on a port .TP .B IBV_EVENT_SM_CHANGE \fR SM was changed on a port .TP .B IBV_EVENT_CLIENT_REREGISTER \fR SM sent a CLIENT_REREGISTER request to a port .TP .B IBV_EVENT_GID_CHANGE \fR GID table was changed on a port .PP .I CA events: .TP .B IBV_EVENT_DEVICE_FATAL \fR CA is in FATAL state .PP .B ibv_ack_async_event() acknowledges the async event .I event\fR. .SH "RETURN VALUE" .B ibv_get_async_event() returns 0 on success, and \-1 on error. .PP .B ibv_ack_async_event() returns no value. .SH "NOTES" All async events that .B ibv_get_async_event() returns must be acknowledged using .B ibv_ack_async_event()\fR. To avoid races, destroying an object (CQ, SRQ or QP) will wait for all affiliated events for the object to be acknowledged; this avoids an application retrieving an affiliated event after the corresponding object has already been destroyed. .PP .B ibv_get_async_event() is a blocking function. If multiple threads call this function simultaneously, then when an async event occurs, only one thread will receive it, and it is not possible to predict which thread will receive it. .SH "EXAMPLES" The following code example demonstrates one possible way to work with async events in non-blocking mode. It performs the following steps: .PP 1. Set the async event queue's work mode to be non-blocking .br 2. Poll the queue until it has an async event .br 3.
Get the async event and ack it .PP .nf /* change the blocking mode of the async event queue */ flags = fcntl(ctx->async_fd, F_GETFL); rc = fcntl(ctx->async_fd, F_SETFL, flags | O_NONBLOCK); if (rc < 0) { fprintf(stderr, "Failed to change file descriptor of async event queue\en"); return 1; } /* * poll the queue until it has an event and sleep ms_timeout * milliseconds between any iteration */ my_pollfd.fd = ctx->async_fd; my_pollfd.events = POLLIN; my_pollfd.revents = 0; do { rc = poll(&my_pollfd, 1, ms_timeout); } while (rc == 0); if (rc < 0) { fprintf(stderr, "poll failed\en"); return 1; } /* Get the async event */ if (ibv_get_async_event(ctx, &async_event)) { fprintf(stderr, "Failed to get async_event\en"); return 1; } /* Ack the event */ ibv_ack_async_event(&async_event); .fi .SH "SEE ALSO" .BR ibv_open_device (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_get_cq_event.3000066400000000000000000000113571477342711600216260ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_GET_CQ_EVENT 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_get_cq_event, ibv_ack_cq_events \- get and acknowledge completion queue (CQ) events .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "int ibv_get_cq_event(struct ibv_comp_channel " "*channel" , .BI " struct ibv_cq " "**cq" ", void " "**cq_context" ); .sp .BI "void ibv_ack_cq_events(struct ibv_cq " "*cq" ", unsigned int " "nevents" ); .fi .SH "DESCRIPTION" .B ibv_get_cq_event() waits for the next completion event in the completion event channel .I channel\fR. Fills the arguments .I cq with the CQ that got the event and .I cq_context with the CQ's context\fR. .PP .B ibv_ack_cq_events() acknowledges .I nevents events on the CQ .I cq\fR. .SH "RETURN VALUE" .B ibv_get_cq_event() returns 0 on success, and \-1 on error. .PP .B ibv_ack_cq_events() returns no value. .SH "NOTES" All completion events that .B ibv_get_cq_event() returns must be acknowledged using .B ibv_ack_cq_events()\fR. To avoid races, destroying a CQ will wait for all completion events to be acknowledged; this guarantees a one-to-one correspondence between acks and successful gets. .PP Calling .B ibv_ack_cq_events() may be relatively expensive in the datapath, since it must take a mutex. Therefore it may be better to amortize this cost by keeping a count of the number of events needing acknowledgement and acking several completion events in one call to .B ibv_ack_cq_events()\fR. .SH "EXAMPLES" The following code example demonstrates one possible way to work with completion events. It performs the following steps: .PP Stage I: Preparation .br 1. Creates a CQ .br 2. Requests notification upon a new (first) completion event .PP Stage II: Completion Handling Routine .br 3. Wait for the completion event and ack it .br 4. Request notification upon the next completion event .br 5. Empty the CQ .PP Note that an extra event may be triggered without having a corresponding completion entry in the CQ. This occurs if a completion entry is added to the CQ between Step 4 and Step 5, and the CQ is then emptied (polled) in Step 5. .PP .nf cq = ibv_create_cq(ctx, 1, ev_ctx, channel, 0); if (!cq) { fprintf(stderr, "Failed to create CQ\en"); return 1; } .PP /* Request notification before any completion can be created */ if (ibv_req_notify_cq(cq, 0)) { fprintf(stderr, "Couldn't request CQ notification\en"); return 1; } .PP \&. \&. \&.
.PP /* Wait for the completion event */ if (ibv_get_cq_event(channel, &ev_cq, &ev_ctx)) { fprintf(stderr, "Failed to get cq_event\en"); return 1; } /* Ack the event */ ibv_ack_cq_events(ev_cq, 1); .PP /* Request notification upon the next completion event */ if (ibv_req_notify_cq(ev_cq, 0)) { fprintf(stderr, "Couldn't request CQ notification\en"); return 1; } .PP /* Empty the CQ: poll all of the completions from the CQ (if any exist) */ do { ne = ibv_poll_cq(cq, 1, &wc); if (ne < 0) { fprintf(stderr, "Failed to poll completions from the CQ\en"); return 1; } /* there may be an extra event with no completion in the CQ */ if (ne == 0) continue; .PP if (wc.status != IBV_WC_SUCCESS) { fprintf(stderr, "Completion with status 0x%x was found\en", wc.status); return 1; } } while (ne); .fi The following code example demonstrates one possible way to work with completion events in non-blocking mode. It performs the following steps: .PP 1. Set the completion event channel to be non-blocking .br 2. Poll the channel until it has a completion event .br 3. Get the completion event and ack it .PP .nf /* change the blocking mode of the completion channel */ flags = fcntl(channel->fd, F_GETFL); rc = fcntl(channel->fd, F_SETFL, flags | O_NONBLOCK); if (rc < 0) { fprintf(stderr, "Failed to change file descriptor of completion event channel\en"); return 1; } /* * poll the channel until it has an event and sleep ms_timeout * milliseconds between any iteration */ my_pollfd.fd = channel->fd; my_pollfd.events = POLLIN; my_pollfd.revents = 0; do { rc = poll(&my_pollfd, 1, ms_timeout); } while (rc == 0); if (rc < 0) { fprintf(stderr, "poll failed\en"); return 1; } ev_cq = cq; /* Wait for the completion event */ if (ibv_get_cq_event(channel, &ev_cq, &ev_ctx)) { fprintf(stderr, "Failed to get cq_event\en"); return 1; } /* Ack the event */ ibv_ack_cq_events(ev_cq, 1); .fi .SH "SEE ALSO" .BR ibv_create_comp_channel (3), .BR ibv_create_cq (3), .BR ibv_req_notify_cq (3), .BR ibv_poll_cq (3) .SH "AUTHORS" .TP Dotan Barak .RI < dotanba@gmail.com > rdma-core-56.1/libibverbs/man/ibv_get_device_guid.3.md000066400000000000000000000013761477342711600226700ustar00rootroot00000000000000--- date: 2006-10-31 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_GET_DEVICE_GUID --- # NAME ibv_get_device_guid - get an RDMA device's GUID # SYNOPSIS ```c #include <infiniband/verbs.h> uint64_t ibv_get_device_guid(struct ibv_device *device); ``` # DESCRIPTION **ibv_get_device_guid()** returns the Global Unique IDentifier (GUID) of the RDMA device *device*. # RETURN VALUE **ibv_get_device_guid()** returns the GUID of the device in network byte order. # SEE ALSO **ibv_get_device_index**(3), **ibv_get_device_list**(3), **ibv_get_device_name**(3), **ibv_open_device**(3) # AUTHOR Dotan Barak rdma-core-56.1/libibverbs/man/ibv_get_device_index.3.md000066400000000000000000000014151477342711600230410ustar00rootroot00000000000000--- date: ' 2020-04-22' footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_GET_DEVICE_INDEX --- # NAME ibv_get_device_index - get an RDMA device index # SYNOPSIS ```c #include <infiniband/verbs.h> int ibv_get_device_index(struct ibv_device *device); ``` # DESCRIPTION **ibv_get_device_index()** returns the stable IB device index as it is assigned by the kernel.
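For example, a minimal sketch (assuming *device* was obtained from **ibv_get_device_list()**):

```c
#include <stdio.h>
#include <infiniband/verbs.h>

static int print_device_index(struct ibv_device *device)
{
	int index = ibv_get_device_index(device);

	if (index < 0)
		return -1; /* kernel doesn't support device indexes */

	printf("%s: index %d\n", ibv_get_device_name(device), index);
	return 0;
}
```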
# RETURN VALUE **ibv_get_device_index()** returns an index, or -1 if the kernel doesn't support device indexes. # SEE ALSO **ibv_get_device_name**(3), **ibv_get_device_guid**(3), **ibv_get_device_list**(3), **ibv_open_device**(3) # AUTHOR Leon Romanovsky rdma-core-56.1/libibverbs/man/ibv_get_device_list.3.md000066400000000000000000000051071477342711600227070ustar00rootroot00000000000000--- date: 2006-10-31 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_GET_DEVICE_LIST --- # NAME ibv_get_device_list, ibv_free_device_list - get and release list of available RDMA devices # SYNOPSIS ```c #include <infiniband/verbs.h> struct ibv_device **ibv_get_device_list(int *num_devices); void ibv_free_device_list(struct ibv_device **list); ``` # DESCRIPTION **ibv_get_device_list()** returns a NULL-terminated array of RDMA devices currently available. The argument *num_devices* is optional; if not NULL, it is set to the number of devices returned in the array. **ibv_free_device_list()** frees the array of devices *list* returned by **ibv_get_device_list()**. # RETURN VALUE **ibv_get_device_list()** returns the array of available RDMA devices, or sets *errno* and returns NULL if the request fails. If no devices are found then *num_devices* is set to 0, and non-NULL is returned. **ibv_free_device_list()** returns no value. # ERRORS **EPERM** : Permission denied. **ENOSYS** : No kernel support for RDMA. **ENOMEM** : Insufficient memory to complete the operation. # NOTES Client code should open all the devices it intends to use with **ibv_open_device()** before calling **ibv_free_device_list()**. Once it frees the array with **ibv_free_device_list()**, it will be able to use only the open devices; pointers to unopened devices will no longer be valid. Setting the environment variable **IBV_SHOW_WARNINGS** will cause warnings to be emitted to stderr if a kernel verbs device is discovered, but no corresponding userspace driver can be found for it. # STATIC LINKING If **libibverbs** is statically linked to the application then all provider drivers must also be statically linked. The library will not load dynamic providers when static linking is used. To link the providers set the **RDMA_STATIC_PROVIDERS** define to the comma separated list of desired providers when compiling the application. The special keyword 'all' will statically link all supported **libibverbs** providers. This is intended to be used along with **pkg-config(1)** to setup the proper flags for **libibverbs** linking. If this is not done then **ibv_get_device_list** will always return an empty list. Using only dynamic linking for **libibverbs** applications is strongly recommended.
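# EXAMPLE

A minimal sketch of the intended calling sequence (device selection logic omitted):

```c
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
	struct ibv_device **list;
	int num_devices, i;

	list = ibv_get_device_list(&num_devices);
	if (!list) {
		perror("ibv_get_device_list");
		return 1;
	}

	for (i = 0; i < num_devices; i++)
		printf("%s\n", ibv_get_device_name(list[i]));

	/* open any devices to be used before freeing the list */
	ibv_free_device_list(list);
	return 0;
}
```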
# SEE ALSO **ibv_fork_init**(3), **ibv_get_device_guid**(3), **ibv_get_device_name**(3), **ibv_get_device_index**(3), **ibv_open_device**(3) # AUTHOR Dotan Barak rdma-core-56.1/libibverbs/man/ibv_get_device_name.3.md000066400000000000000000000013651477342711600226560ustar00rootroot00000000000000--- date: ' 2006-10-31' footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_GET_DEVICE_NAME --- # NAME ibv_get_device_name - get an RDMA device's name # SYNOPSIS ```c #include <infiniband/verbs.h> const char *ibv_get_device_name(struct ibv_device *device); ``` # DESCRIPTION **ibv_get_device_name()** returns a human-readable name associated with the RDMA device *device*. # RETURN VALUE **ibv_get_device_name()** returns a pointer to the device name, or NULL if the request fails. # SEE ALSO **ibv_get_device_guid**(3), **ibv_get_device_list**(3), **ibv_open_device**(3) # AUTHOR Dotan Barak rdma-core-56.1/libibverbs/man/ibv_get_pkey_index.3.md000066400000000000000000000022341477342711600225520ustar00rootroot00000000000000--- date: 2018-07-16 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_GET_PKEY_INDEX --- # NAME ibv_get_pkey_index - obtain the index in the P_Key table of a P_Key # SYNOPSIS ```c #include <infiniband/verbs.h> int ibv_get_pkey_index(struct ibv_context *context, uint8_t port_num, __be16 pkey); ``` # DESCRIPTION Every InfiniBand HCA maintains a P_Key table for each of its ports that is indexed by an integer and with a P_Key in each element. Certain InfiniBand data structures that work with P_Keys expect a P_Key index, e.g. **struct ibv_qp_attr** and **struct ib_mad_addr**. Hence the function **ibv_get_pkey_index()** accepts a P_Key in network byte order and returns an index in the P_Key table as result. # RETURN VALUE **ibv_get_pkey_index()** returns the P_Key index on success, and -1 on error. # SEE ALSO **ibv_open_device**(3), **ibv_query_device**(3), **ibv_query_gid**(3), **ibv_query_pkey**(3), **ibv_query_port**(3) # AUTHOR Bart Van Assche rdma-core-56.1/libibverbs/man/ibv_get_srq_num.3.md000066400000000000000000000016641477342711600221050ustar00rootroot00000000000000--- date: 2013-06-26 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_GET_SRQ_NUM --- # NAME ibv_get_srq_num - return the SRQ number associated with the given shared receive queue (SRQ) # SYNOPSIS ```c #include <infiniband/verbs.h> int ibv_get_srq_num(struct ibv_srq *srq, uint32_t *srq_num); ``` # DESCRIPTION **ibv_get_srq_num()** returns the SRQ number associated with the given XRC shared receive queue. The argument *srq* is an ibv_srq struct, as defined in *<infiniband/verbs.h>*. *srq_num* is an output parameter that holds the returned SRQ number. # RETURN VALUE **ibv_get_srq_num()** returns 0 on success, or the value of errno on failure (which indicates the failure reason).
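# EXAMPLE

A minimal sketch (assuming *srq* is an XRC SRQ previously created with **ibv_create_srq_ex()**):

```c
uint32_t srq_num;
int err;

err = ibv_get_srq_num(srq, &srq_num);
if (err)
	return err; /* the errno value describing the failure */

/* srq_num can now be sent to the remote side so it can address this SRQ */
```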
# SEE ALSO **ibv_alloc_pd**(3), **ibv_create_srq_ex**(3), **ibv_modify_srq**(3) # AUTHOR Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_import_device.3.md000066400000000000000000000020711477342711600224040ustar00rootroot00000000000000--- date: 2020-5-3 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: ibv_import_device --- # NAME ibv_import_device - import a device from a given command FD # SYNOPSIS ```c #include <infiniband/verbs.h> struct ibv_context *ibv_import_device(int cmd_fd); ``` # DESCRIPTION **ibv_import_device()** returns an *ibv_context* pointer that is associated with the given *cmd_fd*. The *cmd_fd* is obtained from the ibv_context cmd_fd member, which must be dup'd (e.g. by dup(), SCM_RIGHTS, etc.) before being passed to ibv_import_device(). Once the *ibv_context* usage has ended, *ibv_close_device()* should be called. This call cleans up whatever is needed, as the opposite of the import, including closing the command FD. # RETURN VALUE **ibv_import_device()** returns a pointer to the allocated RDMA context, or NULL if the request fails. # SEE ALSO **ibv_open_device**(3), **ibv_close_device**(3) # AUTHOR Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_import_dm.3.md000066400000000000000000000030701477342711600215450ustar00rootroot00000000000000--- date: 2021-1-17 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: ibv_import_dm ibv_unimport_dm --- # NAME ibv_import_dm - import a DM from a given ibv_context ibv_unimport_dm - unimport a DM # SYNOPSIS ```c #include <infiniband/verbs.h> struct ibv_dm *ibv_import_dm(struct ibv_context *context, uint32_t dm_handle); void ibv_unimport_dm(struct ibv_dm *dm) ``` # DESCRIPTION **ibv_import_dm()** returns a Device memory (DM) that is associated with the given *dm_handle* in the RDMA context. The input *dm_handle* value must be a valid kernel handle for a DM object in the associated RDMA context. It can be obtained from the original DM by getting its ibv_dm->handle member value. **ibv_unimport_dm()** unimports the DM. Once the DM usage has ended, ibv_free_dm() or ibv_unimport_dm() should be called. The first goes to the kernel to destroy the object, while the second cleans up whatever is needed, as the opposite of the import, without calling the kernel. It is the responsibility of the application to coordinate between all ibv_context(s) that use this DM. Once the destroy is done, no other process can touch the object except for unimport. All users of the context must collaborate to ensure this. # RETURN VALUE **ibv_import_dm()** returns a pointer to the allocated DM, or NULL if the request fails and errno is set.
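# EXAMPLE

A minimal sketch of the importing side (assuming *ctx* refers to the same RDMA context as the exporter's, e.g. obtained via **ibv_import_device()**, and *dm_handle* was communicated out of band; both names are placeholders):

```c
struct ibv_dm *dm;

dm = ibv_import_dm(ctx, dm_handle);
if (!dm)
	return errno;

/* ... access the device memory, e.g. with ibv_memcpy_to_dm()/ibv_memcpy_from_dm() ... */

ibv_unimport_dm(dm); /* local cleanup only; does not free the DM */
```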
# SEE ALSO **ibv_alloc_dm**(3), **ibv_free_dm**(3) # AUTHOR Maor Gottlieb rdma-core-56.1/libibverbs/man/ibv_import_mr.3.md000066400000000000000000000033261477342711600215670ustar00rootroot00000000000000--- date: 2020-5-3 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: ibv_import_mr ibv_unimport_mr --- # NAME ibv_import_mr - import an MR from a given ibv_pd ibv_unimport_mr - unimport an MR # SYNOPSIS ```c #include <infiniband/verbs.h> struct ibv_mr *ibv_import_mr(struct ibv_pd *pd, uint32_t mr_handle); void ibv_unimport_mr(struct ibv_mr *mr) ``` # DESCRIPTION **ibv_import_mr()** returns a Memory region (MR) that is associated with the given *mr_handle* in the RDMA context that is associated with the given *pd*. The input *mr_handle* value must be a valid kernel handle for an MR object in the associated RDMA context. It can be obtained from the original MR by getting its ibv_mr->handle member value. **ibv_unimport_mr()** unimports the MR. Once the MR usage has ended, ibv_dereg_mr() or ibv_unimport_mr() should be called. The first goes to the kernel to destroy the object, while the second cleans up whatever is needed, as the opposite of the import, without calling the kernel. It is the responsibility of the application to coordinate between all ibv_context(s) that use this MR. Once the destroy is done, no other process can touch the object except for unimport. All users of the context must collaborate to ensure this. # RETURN VALUE **ibv_import_mr()** returns a pointer to the allocated MR, or NULL if the request fails. # NOTES The *addr* field in the imported MR is not applicable; a NULL value is expected. # SEE ALSO **ibv_reg_mr**(3), **ibv_reg_dm_mr**(3), **ibv_reg_mr_iova**(3), **ibv_reg_mr_iova2**(3), **ibv_dereg_mr**(3) # AUTHOR Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_import_pd.3.md000066400000000000000000000031571477342711600215560ustar00rootroot00000000000000--- date: 2020-5-3 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: ibv_import_pd, ibv_unimport_pd --- # NAME ibv_import_pd - import a PD from a given ibv_context ibv_unimport_pd - unimport a PD # SYNOPSIS ```c #include <infiniband/verbs.h> struct ibv_pd *ibv_import_pd(struct ibv_context *context, uint32_t pd_handle); void ibv_unimport_pd(struct ibv_pd *pd) ``` # DESCRIPTION **ibv_import_pd()** returns a protection domain (PD) that is associated with the given *pd_handle* in the given *context*. The input *pd_handle* value must be a valid kernel handle for a PD object in the given *context*. It can be obtained from the original PD by getting its ibv_pd->handle member value. The returned *ibv_pd* can be used in all verbs that get a protection domain. **ibv_unimport_pd()** unimports the PD. Once the PD usage has ended, ibv_dealloc_pd() or ibv_unimport_pd() should be called. The first goes to the kernel to destroy the object, while the second cleans up whatever is needed, as the opposite of the import, without calling the kernel. It is the responsibility of the application to coordinate between all ibv_context(s) that use this PD. Once the destroy is done, no other process can touch the object except for unimport. All users of the context must collaborate to ensure this. # RETURN VALUE **ibv_import_pd()** returns a pointer to the allocated PD, or NULL if the request fails.
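# EXAMPLE

A minimal sketch of a full import sequence (assuming *cmd_fd* is a dup'd command FD and *pd_handle* was communicated by the exporting process; both names are placeholders):

```c
struct ibv_context *ctx;
struct ibv_pd *pd;

ctx = ibv_import_device(cmd_fd);
if (!ctx)
	return errno;

pd = ibv_import_pd(ctx, pd_handle);
if (!pd)
	return errno;

/* ... register MRs, create QPs, etc. against pd ... */

ibv_unimport_pd(pd); /* the exporting process still owns the PD */
ibv_close_device(ctx);
```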
# SEE ALSO **ibv_alloc_pd**(3), **ibv_dealloc_pd**(3)
# AUTHOR Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_inc_rkey.3.md000066400000000000000000000015331477342711600213600ustar00rootroot00000000000000--- date: 2015-01-29 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_INC_RKEY ---
# NAME ibv_inc_rkey - creates a new rkey from the given one
# SYNOPSIS ```c #include <infiniband/verbs.h>
uint32_t ibv_inc_rkey(uint32_t rkey); ```
# DESCRIPTION **ibv_inc_rkey()** increments the 8 LSBs of *rkey* and returns the new value.
# RETURN VALUE **ibv_inc_rkey()** returns the new rkey.
# NOTES The verb generates a new rkey that differs from the previous one in its tag part (the 8 LSBs) but has the same index (bits 0xffffff00). A use case for this verb can be to create a new rkey from a Memory window's rkey when binding it to a Memory region.
# AUTHORS Majd Dibbiny, Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_is_fork_initialized.3.md000066400000000000000000000023551477342711600236010ustar00rootroot00000000000000--- date: 2020-10-09 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_IS_FORK_INITIALIZED ---
# NAME ibv_is_fork_initialized - check if fork support (ibv_fork_init) is enabled
# SYNOPSIS ```c #include <infiniband/verbs.h>
enum ibv_fork_status { IBV_FORK_DISABLED, IBV_FORK_ENABLED, IBV_FORK_UNNEEDED, };
enum ibv_fork_status ibv_is_fork_initialized(void); ```
# DESCRIPTION **ibv_is_fork_initialized()** checks whether libibverbs **fork()** support was enabled through the **ibv_fork_init()** verb.
# RETURN VALUE **ibv_is_fork_initialized()** returns IBV_FORK_DISABLED if fork support is disabled, or IBV_FORK_ENABLED if enabled. The IBV_FORK_UNNEEDED return value indicates that the kernel copies DMA pages on fork, hence a call to **ibv_fork_init()** is unneeded.
# NOTES The IBV_FORK_UNNEEDED return value takes precedence over IBV_FORK_DISABLED and IBV_FORK_ENABLED. If the kernel supports copy-on-fork for DMA pages then IBV_FORK_UNNEEDED will be returned regardless of whether **ibv_fork_init()** was called or not.
# SEE ALSO **fork**(2), **ibv_fork_init**(3)
# AUTHOR Gal Pressman rdma-core-56.1/libibverbs/man/ibv_modify_cq.3000066400000000000000000000021171477342711600211270ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_MODIFY_CQ 3 2017-10-20 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_modify_cq \- modify a completion queue (CQ) .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "int ibv_modify_cq(struct ibv_cq " *cq ", struct ibv_modify_cq_attr "*cq_attr "); .sp .fi .SH "DESCRIPTION" .B ibv_modify_cq() modifies the CQ .I cq\fR. The argument .I cq_attr is an ibv_modify_cq_attr struct, as defined in <infiniband/verbs.h>. .PP .nf struct ibv_moderate_cq { .in +8 uint16_t cq_count; /* number of completions per event */ uint16_t cq_period; /* in microseconds */ .in -8 }; struct ibv_modify_cq_attr { .in +8 uint32_t attr_mask; struct ibv_moderate_cq moderate; .in -8 }; .fi .PP The function .B ibv_modify_cq() will modify the CQ based on the given .I cq_attr\fB\fR->attr_mask .SH "RETURN VALUE" .B ibv_modify_cq() returns 0 on success, or the value of errno on failure (which indicates the failure reason).
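.PP
For illustration, a minimal sketch that enables completion moderation on an existing CQ (the CQ variable and the moderation values are assumptions; the device must support CQ moderation):
.PP
.nf
/* Generate a completion event only every 16 completions,
 * or after 100 microseconds, whichever comes first. */
struct ibv_modify_cq_attr attr = {
	.attr_mask = IBV_CQ_ATTR_MODERATE,
	.moderate = {
		.cq_count  = 16,
		.cq_period = 100,
	},
};

int ret = ibv_modify_cq(cq, &attr);
if (ret)
	fprintf(stderr, "ibv_modify_cq: %s\en", strerror(ret));
.fi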
.SH "SEE ALSO" .BR ibv_create_cq (3) .SH "AUTHORS" .TP Yonatan Cohen rdma-core-56.1/libibverbs/man/ibv_modify_qp.3000066400000000000000000000167531477342711600211570ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_MODIFY_QP 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_modify_qp \- modify the attributes of a queue pair (QP) .SH "SYNOPSIS" .nf .B #include .sp .BI "int ibv_modify_qp(struct ibv_qp " "*qp" ", struct ibv_qp_attr " "*attr" , .BI " int " "attr_mask" ); .fi .SH "DESCRIPTION" .B ibv_modify_qp() modifies the attributes of QP .I qp with the attributes in .I attr according to the mask .I attr_mask\fR. The argument \fIattr\fR is an ibv_qp_attr struct, as defined in . .PP .nf struct ibv_qp_attr { .in +8 enum ibv_qp_state qp_state; /* Move the QP to this state */ enum ibv_qp_state cur_qp_state; /* Assume this is the current QP state */ enum ibv_mtu path_mtu; /* Path MTU (valid only for RC/UC QPs) */ enum ibv_mig_state path_mig_state; /* Path migration state (valid if HCA supports APM) */ uint32_t qkey; /* Q_Key for the QP (valid only for UD QPs) */ uint32_t rq_psn; /* PSN for receive queue (valid only for RC/UC QPs) */ uint32_t sq_psn; /* PSN for send queue */ uint32_t dest_qp_num; /* Destination QP number (valid only for RC/UC QPs) */ unsigned int qp_access_flags; /* Mask of enabled remote access operations (valid only for RC/UC QPs) */ struct ibv_qp_cap cap; /* QP capabilities (valid if HCA supports QP resizing) */ struct ibv_ah_attr ah_attr; /* Primary path address vector (valid only for RC/UC QPs) */ struct ibv_ah_attr alt_ah_attr; /* Alternate path address vector (valid only for RC/UC QPs) */ uint16_t pkey_index; /* Primary P_Key index */ uint16_t alt_pkey_index; /* Alternate P_Key index */ uint8_t en_sqd_async_notify; /* Enable SQD.drained async notification (Valid only if qp_state is SQD) */ uint8_t sq_draining; /* Is the QP draining? Irrelevant for ibv_modify_qp() */ uint8_t max_rd_atomic; /* Number of outstanding RDMA reads & atomic operations on the destination QP (valid only for RC QPs) */ uint8_t max_dest_rd_atomic; /* Number of responder resources for handling incoming RDMA reads & atomic operations (valid only for RC QPs) */ uint8_t min_rnr_timer; /* Minimum RNR NAK timer (valid only for RC QPs) */ uint8_t port_num; /* Primary port number */ uint8_t timeout; /* Local ack timeout for primary path (valid only for RC QPs) */ uint8_t retry_cnt; /* Retry count (valid only for RC QPs) */ uint8_t rnr_retry; /* RNR retry (valid only for RC QPs) */ uint8_t alt_port_num; /* Alternate port number */ uint8_t alt_timeout; /* Local ack timeout for alternate path (valid only for RC QPs) */ uint32_t rate_limit; /* Rate limit in kbps for packet pacing */ .in -8 }; .fi .PP For details on struct ibv_qp_cap see the description of .B ibv_create_qp()\fR. For details on struct ibv_ah_attr see the description of .B ibv_create_ah()\fR. .PP The argument .I attr_mask specifies the QP attributes to be modified. 
The argument is either 0 or the bitwise OR of one or more of the following flags: .PP .TP .B IBV_QP_STATE \fR Modify qp_state .TP .B IBV_QP_CUR_STATE \fR Set cur_qp_state .TP .B IBV_QP_EN_SQD_ASYNC_NOTIFY \fR Set en_sqd_async_notify .TP .B IBV_QP_ACCESS_FLAGS \fR Set qp_access_flags .TP .B IBV_QP_PKEY_INDEX \fR Set pkey_index .TP .B IBV_QP_PORT \fR Set port_num .TP .B IBV_QP_QKEY \fR Set qkey .TP .B IBV_QP_AV \fR Set ah_attr .TP .B IBV_QP_PATH_MTU \fR Set path_mtu .TP .B IBV_QP_TIMEOUT \fR Set timeout .TP .B IBV_QP_RETRY_CNT \fR Set retry_cnt .TP .B IBV_QP_RNR_RETRY \fR Set rnr_retry .TP .B IBV_QP_RQ_PSN \fR Set rq_psn .TP .B IBV_QP_MAX_QP_RD_ATOMIC \fR Set max_rd_atomic .TP .B IBV_QP_ALT_PATH \fR Set the alternative path via: alt_ah_attr, alt_pkey_index, alt_port_num, alt_timeout .TP .B IBV_QP_MIN_RNR_TIMER \fR Set min_rnr_timer .TP .B IBV_QP_SQ_PSN \fR Set sq_psn .TP .B IBV_QP_MAX_DEST_RD_ATOMIC \fR Set max_dest_rd_atomic .TP .B IBV_QP_PATH_MIG_STATE \fR Set path_mig_state .TP .B IBV_QP_CAP \fR Set cap .TP .B IBV_QP_DEST_QPN \fR Set dest_qp_num .TP .B IBV_QP_RATE_LIMIT \fR Set rate_limit .SH "RETURN VALUE" .B ibv_modify_qp() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" If any of the modify attributes or the modify mask are invalid, none of the attributes will be modified (including the QP state). .PP Not all devices support resizing QPs. To check if a device supports it, check if the .B IBV_DEVICE_RESIZE_MAX_WR bit is set in the device capabilities flags. .PP Not all devices support alternate paths. To check if a device supports it, check if the .B IBV_DEVICE_AUTO_PATH_MIG bit is set in the device capabilities flags. .PP The following tables indicate for each QP Transport Service Type, the minimum list of attributes that must be changed upon transitioning QP state from: Reset \-\-> Init \-\-> RTR \-\-> RTS. 
.PP .nf For QP Transport Service Type \fB IBV_QPT_UD\fR: .sp Next state Required attributes \-\-\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Init \fB IBV_QP_STATE, IBV_QP_PKEY_INDEX, IBV_QP_PORT, \fR \fB IBV_QP_QKEY \fR RTR \fB IBV_QP_STATE \fR RTS \fB IBV_QP_STATE, IBV_QP_SQ_PSN \fR .fi .PP .nf For QP Transport Service Type \fB IBV_QPT_UC\fR: .sp Next state Required attributes \-\-\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Init \fB IBV_QP_STATE, IBV_QP_PKEY_INDEX, IBV_QP_PORT, \fR \fB IBV_QP_ACCESS_FLAGS \fR RTR \fB IBV_QP_STATE, IBV_QP_AV, IBV_QP_PATH_MTU, \fR \fB IBV_QP_DEST_QPN, IBV_QP_RQ_PSN \fR RTS \fB IBV_QP_STATE, IBV_QP_SQ_PSN \fR .fi .PP .nf For QP Transport Service Type \fB IBV_QPT_RC\fR: .sp Next state Required attributes \-\-\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Init \fB IBV_QP_STATE, IBV_QP_PKEY_INDEX, IBV_QP_PORT, \fR \fB IBV_QP_ACCESS_FLAGS \fR RTR \fB IBV_QP_STATE, IBV_QP_AV, IBV_QP_PATH_MTU, \fR \fB IBV_QP_DEST_QPN, IBV_QP_RQ_PSN, \fR \fB IBV_QP_MAX_DEST_RD_ATOMIC, IBV_QP_MIN_RNR_TIMER \fR RTS \fB IBV_QP_STATE, IBV_QP_SQ_PSN, IBV_QP_MAX_QP_RD_ATOMIC, \fR \fB IBV_QP_RETRY_CNT, IBV_QP_RNR_RETRY, IBV_QP_TIMEOUT \fR .fi .PP .nf For QP Transport Service Type \fB IBV_QPT_RAW_PACKET\fR: .sp Next state Required attributes \-\-\-\-\-\-\-\-\-\- \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- Init \fB IBV_QP_STATE, IBV_QP_PORT\fR RTR \fB IBV_QP_STATE\fR RTS \fB IBV_QP_STATE\fR .fi .PP If port flag IBV_QPF_GRH_REQUIRED is set then ah_attr and alt_ah_attr must be passed with definition of 'struct ibv_ah_attr { .is_global = 1; .grh = {...}; }'. .PP .SH "SEE ALSO" .BR ibv_create_qp (3), .BR ibv_destroy_qp (3), .BR ibv_query_qp (3), .BR ibv_create_ah (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_modify_qp_rate_limit.3000066400000000000000000000040271477342711600233570ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_MODIFY_QP_RATE_LIMIT 3 2018-01-09 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_modify_qp_rate_limit \- modify the send rate limits attributes of a queue pair (QP) .SH "SYNOPSIS" .nf .B #include .sp .BI "int ibv_modify_qp_rate_limit(struct ibv_qp " "*qp" ", struct ibv_qp_rate_limit_attr " "*attr"); .fi .SH "DESCRIPTION" .B ibv_modify_qp_rate_limit() modifies the send rate limiting packet pacing attributes of QP .I qp with the attributes in .I attr\fR. The argument \fIattr\fR is an ibv_qp_rate_limit_attr struct, as defined in . .PP The .I rate_limit defines the MAX send rate this QP will send as long as the link in not blocked and there are work requests in send queue. .PP Finer control for shaping the rate limit of a QP is achieved by defining the .I max_burst_sz\fR, single burst max bytes size and the .I typical_pkt_sz\fR, typical packet bytes size. These allow the device to adjust the inter-burst gap delay required to correctly shape the scheduling of sends to the wire in order to reach for requested application requirements. .PP Setting a value of 0 for .I max_burst_sz or .I typical_pkt_sz will use the devices defaults. .I typical_pkt_sz will default to the port's MTU value. 
.PP .nf struct ibv_qp_rate_limit_attr { .in +8 uint32_t rate_limit; /* kbps */ uint32_t max_burst_sz; /* bytes */ uint16_t typical_pkt_sz; /* bytes */ .in -8 }; .fi .PP .SH "RETURN VALUE" .B ibv_modify_qp_rate_limit() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "ERRORS" .SS EINVAL Invalid arguments. .SS EOPNOTSUPP Function is not implemented for this device. (ENOSYS may sometimes be returned by old versions of libibverbs). .PP .SH "SEE ALSO" .BR ibv_create_qp (3), .BR ibv_destroy_qp (3), .BR ibv_modify_qp (3), .BR ibv_query_qp (3) .SH "AUTHORS" .TP Alex Rosenbaum .TP Bodong Wang rdma-core-56.1/libibverbs/man/ibv_modify_srq.3000066400000000000000000000037631477342711600213410ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_MODIFY_SRQ 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_modify_srq \- modify attributes of a shared receive queue (SRQ) .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "int ibv_modify_srq(struct ibv_srq " "*srq" , .BI " struct ibv_srq_attr " "*srq_attr" , .BI " int " "srq_attr_mask" ); .fi .SH "DESCRIPTION" .B ibv_modify_srq() modifies the attributes of SRQ .I srq with the attributes in .I srq_attr according to the mask .I srq_attr_mask\fR. The argument \fIsrq_attr\fR is an ibv_srq_attr struct, as defined in <infiniband/verbs.h>. .PP .nf struct ibv_srq_attr { .in +8 uint32_t max_wr; /* maximum number of outstanding work requests (WRs) in the SRQ */ uint32_t max_sge; /* number of scatter elements per WR (irrelevant for ibv_modify_srq) */ uint32_t srq_limit; /* the limit value of the SRQ */ .in -8 }; .fi .PP The argument .I srq_attr_mask specifies the SRQ attributes to be modified. The argument is either 0 or the bitwise OR of one or more of the following flags: .PP .TP .B IBV_SRQ_MAX_WR \fR Resize the SRQ .TP .B IBV_SRQ_LIMIT \fR Set the SRQ limit .SH "RETURN VALUE" .B ibv_modify_srq() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" If any of the modify attributes is invalid, none of the attributes will be modified. .PP Not all devices support resizing SRQs. To check if a device supports it, check if the .B IBV_DEVICE_SRQ_RESIZE bit is set in the device capabilities flags. .PP Modifying the srq_limit arms the SRQ to produce an .B IBV_EVENT_SRQ_LIMIT_REACHED "low watermark" asynchronous event once the number of WRs in the SRQ drops below srq_limit. .SH "SEE ALSO" .BR ibv_query_device (3), .BR ibv_create_srq (3), .BR ibv_destroy_srq (3), .BR ibv_query_srq (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_modify_wq.3000066400000000000000000000024721477342711600211570ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_MODIFY_WQ 3 2016-07-27 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_modify_wq \- Modify a Work Queue (WQ). .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "int ibv_modify_wq(struct ibv_wq " "*wq," .BI " struct ibv_wq_attr " "*wq_attr" ); .sp .fi .SH "DESCRIPTION" .B ibv_modify_wq() modifies the WQ .I wq\fR. The argument .I wq_attr is an ibv_wq_attr struct, as defined in <infiniband/verbs.h>.
.PP .nf struct ibv_wq_attr { .in +8 uint32_t attr_mask; /* Use enum ibv_wq_attr_mask */ enum ibv_wq_state wq_state; /* Move to this state */ enum ibv_wq_state curr_wq_state; /* Assume this is the current state */ uint32_t flags; /* Flags values to modify, use enum ibv_wq_flags */ uint32_t flags_mask; /* Which flags to modify, use enum ibv_wq_flags */ .in -8 }; .fi .PP The function .B ibv_modify_wq() will modify the WQ based on the given .I wq_attr\fB\fR->attr_mask .SH "RETURN VALUE" .B ibv_modify_wq() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "SEE ALSO" .BR ibv_create_wq (3), .BR ibv_destroy_wq (3) .SH "AUTHORS" .TP Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_open_device.3000066400000000000000000000031461477342711600214400ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_OPEN_DEVICE 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_open_device, ibv_close_device \- open and close an RDMA device context .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "struct ibv_context *ibv_open_device(struct ibv_device " "*device" "); .sp .BI "int ibv_close_device(struct ibv_context " "*context" "); .fi .SH "DESCRIPTION" .B ibv_open_device() opens the device .I device and creates a context for further use. .PP .B ibv_close_device() closes the device context .I context\fR. .SH "RETURN VALUE" .B ibv_open_device() returns a pointer to the allocated device context, or NULL if the request fails. .PP .B ibv_close_device() returns 0 on success, \-1 on failure. .SH "NOTES" .B ibv_close_device() does not release all the resources allocated using context .I context\fR. To avoid resource leaks, the user should release all associated resources before closing a context. Setting the environment variable **RDMAV_ALLOW_DISASSOC_DESTROY** tells the library to treat an EIO from destroy commands as success, since the kernel resources were already released. This prevents memory leaks in user space upon device disassociation. Applications using this flag cannot call ibv_get_cq_event or ibv_get_async_event concurrently with any call to an object destruction function. .SH "SEE ALSO" .BR ibv_get_device_list (3), .BR ibv_query_device (3), .BR ibv_query_port (3), .BR ibv_query_gid (3), .BR ibv_query_pkey (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_open_qp.3000066400000000000000000000030601477342711600206140ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_OPEN_QP 3 2011-08-12 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_open_qp \- open a shareable queue pair (QP) .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "struct ibv_qp *ibv_open_qp(struct ibv_context " "*context" , .BI " struct ibv_qp_open_attr " "*qp_open_attr" ); .fi .SH "DESCRIPTION" .B ibv_open_qp() opens an existing queue pair (QP) associated with the extended protection domain .I xrcd\fR. The argument .I qp_open_attr is an ibv_qp_open_attr struct, as defined in <infiniband/verbs.h>. .PP .nf struct ibv_qp_open_attr { .in +8 uint32_t comp_mask; /* Identifies valid fields */ uint32_t qp_num; /* QP number */ struct ibv_xrcd *xrcd; /* XRC domain */ void *qp_context; /* User defined opaque value */ enum ibv_qp_type qp_type; /* QP transport service type */ .in -8 }; .fi .PP .B ibv_destroy_qp() closes the opened QP .I qp and destroys the underlying QP if it has no other references.
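.PP
As an illustration, a minimal sketch of opening a shared XRC QP (here \fIcontext\fR, \fIxrcd\fR and \fIqp_num\fR are assumptions: the XRCD was obtained with \fBibv_open_xrcd()\fR and the QP number was communicated by the process that created the QP):
.PP
.nf
struct ibv_qp_open_attr open_attr = {
	.comp_mask = IBV_QP_OPEN_ATTR_NUM | IBV_QP_OPEN_ATTR_XRCD |
		     IBV_QP_OPEN_ATTR_TYPE,
	.qp_num  = qp_num,
	.xrcd    = xrcd,
	.qp_type = IBV_QPT_XRC_RECV,
};

struct ibv_qp *qp = ibv_open_qp(context, &open_attr);
if (!qp)
	fprintf(stderr, "ibv_open_qp failed\en");
.fi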
.SH "RETURN VALUE" .B ibv_open_qp() returns a pointer to the opened QP, or NULL if the request fails. Check the QP number (\fBqp_num\fR) in the returned QP. .SH "NOTES" .B ibv_open_qp() will fail if a it is asked to open a QP that does not exist within the xrcd with the specified qp_num and qp_type. .SH "SEE ALSO" .BR ibv_alloc_pd (3), .BR ibv_create_qp (3), .BR ibv_create_qp_ex (3), .BR ibv_modify_qp (3), .BR ibv_query_qp (3) .SH "AUTHORS" .TP Sean Hefty rdma-core-56.1/libibverbs/man/ibv_open_xrcd.3000066400000000000000000000040221477342711600211330ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_OPEN_XRCD 3 2011-06-17 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_open_xrcd, ibv_close_xrcd \- open or close an XRC protection domain (XRCDs) .SH "SYNOPSIS" .nf .B #include .sp .BI "struct ibv_xrcd *ibv_open_xrcd(struct ibv_context " "*context" "," .BI " struct ibv_xrcd_init_attr " "*xrcd_init_attr" ); .sp .BI "int ibv_close_xrcd(struct ibv_xrcd " "*xrcd" ); .fi .SH "DESCRIPTION" .B ibv_open_xrcd() open an XRC domain for the RDMA device context .I context .I xrcd_init_attr is an ibv_xrcd_init_attr struct, as defined in . .PP .nf struct ibv_xrcd_init_attr { .in +8 uint32_t comp_mask; /* Identifies valid fields */ int fd; int oflag; .fi .PP .I fd is the file descriptor to associate with the XRCD. .I oflag describes the desired creation attributes. It is a bitwise OR of zero or more of the following flags: .PP .TP .B O_CREAT Indicates that an XRCD should be created and associated with the inode referenced by the given fd. If the XRCD exists, this flag has no effect except as noted under .BR O_EXCL below.\fR .TP .B O_EXCL If .BR O_EXCL and .BR O_CREAT are set, open will fail if an XRCD associated with the inode exists. .PP If .I fd equals -1, no inode is associated with the XRCD. To indicate that XRCD should be created, use .I oflag = .B O_CREAT\fR. .PP .B ibv_close_xrcd() closes the XRCD .I xrcd\fR. If this is the last reference, the XRCD will be destroyed. .SH "RETURN VALUE" .B ibv_open_xrcd() returns a pointer to the opened XRCD, or NULL if the request fails. .PP .B ibv_close_xrcd() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" .B ibv_close_xrcd() may fail if any other resource is still associated with the XRCD being closed. .SH "SEE ALSO" .BR ibv_create_srq_ex (3), .BR ibv_create_qp_ex (3), .SH "AUTHORS" .TP Sean Hefty rdma-core-56.1/libibverbs/man/ibv_poll_cq.3000066400000000000000000000065231477342711600206130ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_POLL_CQ 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_poll_cq \- poll a completion queue (CQ) .SH "SYNOPSIS" .nf .B #include .sp .BI "int ibv_poll_cq(struct ibv_cq " "*cq" ", int " "num_entries" , .BI " struct ibv_wc " "*wc" ); .fi .SH "DESCRIPTION" .B ibv_poll_cq() polls the CQ .I cq for work completions and returns the first .I num_entries (or all available completions if the CQ contains fewer than this number) in the array .I wc\fR. The argument .I wc is a pointer to an array of ibv_wc structs, as defined in . 
.PP .nf struct ibv_wc { .in +8 uint64_t wr_id; /* ID of the completed Work Request (WR) */ enum ibv_wc_status status; /* Status of the operation */ enum ibv_wc_opcode opcode; /* Operation type specified in the completed WR */ uint32_t vendor_err; /* Vendor error syndrome */ uint32_t byte_len; /* Number of bytes transferred */ union { .in +8 __be32 imm_data; /* Immediate data (in network byte order) */ uint32_t invalidated_rkey; /* Local RKey that was invalidated */ .in -8 }; uint32_t qp_num; /* Local QP number of completed WR */ uint32_t src_qp; /* Source QP number (remote QP number) of completed WR (valid only for UD QPs) */ unsigned int wc_flags; /* Flags of the completed WR */ uint16_t pkey_index; /* P_Key index (valid only for GSI QPs) */ uint16_t slid; /* Source LID */ uint8_t sl; /* Service Level */ uint8_t dlid_path_bits; /* DLID path bits (not applicable for multicast messages) */ .in -8 }; .sp .fi .PP The attribute wc_flags describes the properties of the work completion. It is either 0 or the bitwise OR of one or more of the following flags: .PP .TP .B IBV_WC_GRH \fR GRH is present (valid only for UD QPs) .TP .B IBV_WC_WITH_IMM \fR Immediate data value is valid .TP .B IBV_WC_WITH_INV \fR Invalidated RKey data value is valid (cannot be combined with IBV_WC_WITH_IMM) .TP .B IBV_WC_IP_CSUM_OK \fR TCP/UDP checksum over IPv4 and IPv4 header checksum are verified. Valid only when \fBdevice_cap_flags\fR in device_attr indicates that the current QP type supports checksum offload. .PP Not all .I wc attributes are always valid. If the completion status is other than .B IBV_WC_SUCCESS\fR, only the following attributes are valid: wr_id, status, qp_num, and vendor_err. .SH "RETURN VALUE" On success, .B ibv_poll_cq() returns a non-negative value equal to the number of completions found. On failure, a negative value is returned. .SH "NOTES" .PP Each polled completion is removed from the CQ and cannot be returned to it. .PP The user should consume work completions at a rate that prevents CQ overrun from occurring. In case of a CQ overrun, the async event .B IBV_EVENT_CQ_ERR will be triggered, and the CQ cannot be used. .PP IBV_WC_DRIVER1 will be reported as a response to IBV_WR_DRIVER1 opcode; IBV_WC_DRIVER2/IBV_WC_DRIVER3 will be reported on specific driver operations.
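.PP
As an illustration, a minimal polling loop (a sketch; \fIcq\fR is assumed to have been created earlier, and the batch size of 16 is arbitrary):
.PP
.nf
struct ibv_wc wc[16];
int n;

/* Busy-wait until at least one completion arrives; real code
 * may use ibv_req_notify_cq()/ibv_get_cq_event() instead. */
do {
	n = ibv_poll_cq(cq, 16, wc);
} while (n == 0);

if (n < 0) {
	fprintf(stderr, "ibv_poll_cq failed\en");
} else {
	for (int i = 0; i < n; i++)
		if (wc[i].status != IBV_WC_SUCCESS)
			fprintf(stderr, "wr_id %llu: %s\en",
				(unsigned long long)wc[i].wr_id,
				ibv_wc_status_str(wc[i].status));
}
.fi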
.SH "SEE ALSO" .BR ibv_post_send (3), .BR ibv_post_recv (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_post_recv.3000066400000000000000000000047441477342711600211670ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_POST_RECV 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_post_recv \- post a list of work requests (WRs) to a receive queue .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "int ibv_post_recv(struct ibv_qp " "*qp" ", struct ibv_recv_wr " "*wr" , .BI " struct ibv_recv_wr " "**bad_wr" ); .fi .SH "DESCRIPTION" .B ibv_post_recv() posts the linked list of work requests (WRs) starting with .I wr to the receive queue of the queue pair .I qp\fR. It stops processing WRs from this list at the first failure (that can be detected immediately while requests are being posted), and returns this failing WR through .I bad_wr\fR. .PP The argument .I wr is an ibv_recv_wr struct, as defined in <infiniband/verbs.h>. .PP .nf struct ibv_recv_wr { .in +8 uint64_t wr_id; /* User defined WR ID */ struct ibv_recv_wr *next; /* Pointer to next WR in list, NULL if last WR */ struct ibv_sge *sg_list; /* Pointer to the s/g array */ int num_sge; /* Size of the s/g array */ .in -8 }; .sp .nf struct ibv_sge { .in +8 uint64_t addr; /* Start address of the local memory buffer */ uint32_t length; /* Length of the buffer */ uint32_t lkey; /* Key of the local Memory Region */ .in -8 }; .fi .SH "RETURN VALUE" .B ibv_post_recv() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" The buffers used by a WR can only be safely reused after the WR is fully executed and a work completion has been retrieved from the corresponding completion queue (CQ). .PP If the QP .I qp is associated with a shared receive queue, you must use the function .B ibv_post_srq_recv()\fR, and not .B ibv_post_recv()\fR, since the QP's own receive queue will not be used. .PP If a WR is being posted to a UD QP, the Global Routing Header (GRH) of the incoming message will be placed in the first 40 bytes of the buffer(s) in the scatter list. If no GRH is present in the incoming message, then the first bytes will be undefined. This means that in all cases, the actual data of the incoming message will start at an offset of 40 bytes into the buffer(s) in the scatter list. .SH "SEE ALSO" .BR ibv_create_qp (3), .BR ibv_post_send (3), .BR ibv_post_srq_recv (3), .BR ibv_poll_cq (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_post_send.3000066400000000000000000000156221477342711600211600ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_POST_SEND 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_post_send \- post a list of work requests (WRs) to a send queue .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "int ibv_post_send(struct ibv_qp " "*qp" ", struct ibv_send_wr " "*wr" , .BI " struct ibv_send_wr " "**bad_wr" ); .fi .SH "DESCRIPTION" .B ibv_post_send() posts the linked list of work requests (WRs) starting with .I wr to the send queue of the queue pair .I qp\fR. It stops processing WRs from this list at the first failure (that can be detected immediately while requests are being posted), and returns this failing WR through .I bad_wr\fR. .PP The argument .I wr is an ibv_send_wr struct, as defined in <infiniband/verbs.h>.
.PP .nf struct ibv_send_wr { .in +8 uint64_t wr_id; /* User defined WR ID */ struct ibv_send_wr *next; /* Pointer to next WR in list, NULL if last WR */ struct ibv_sge *sg_list; /* Pointer to the s/g array */ int num_sge; /* Size of the s/g array */ enum ibv_wr_opcode opcode; /* Operation type */ unsigned int send_flags; /* Flags of the WR properties */ union { .in +8 __be32 imm_data; /* Immediate data (in network byte order) */ uint32_t invalidate_rkey; /* Remote rkey to invalidate */ .in -8 }; union { .in +8 struct { .in +8 uint64_t remote_addr; /* Start address of remote memory buffer */ uint32_t rkey; /* Key of the remote Memory Region */ .in -8 } rdma; struct { .in +8 uint64_t remote_addr; /* Start address of remote memory buffer */ uint64_t compare_add; /* Compare operand */ uint64_t swap; /* Swap operand */ uint32_t rkey; /* Key of the remote Memory Region */ .in -8 } atomic; struct { .in +8 struct ibv_ah *ah; /* Address handle (AH) for the remote node address */ uint32_t remote_qpn; /* QP number of the destination QP */ uint32_t remote_qkey; /* Q_Key number of the destination QP */ .in -8 } ud; .in -8 } wr; union { .in +8 struct { .in +8 uint32_t remote_srqn; /* Number of the remote SRQ */ .in -8 } xrc; .in -8 } qp_type; union { .in +8 struct { .in +8 struct ibv_mw *mw; /* Memory window (MW) of type 2 to bind */ uint32_t rkey; /* The desired new rkey of the MW */ struct ibv_mw_bind_info bind_info; /* MW additional bind information */ .in -8 } bind_mw; struct { .in +8 void *hdr; /* Pointer address of inline header */ uint16_t hdr_sz; /* Inline header size */ uint16_t mss; /* Maximum segment size for each TSO fragment */ .in -8 } tso; .in -8 }; .in -8 }; .fi .sp .nf struct ibv_mw_bind_info { .in +8 struct ibv_mr *mr; /* The Memory region (MR) to bind the MW to */ uint64_t addr; /* The address the MW should start at */ uint64_t length; /* The length (in bytes) the MW should span */ unsigned int mw_access_flags; /* Access flags to the MW. Use ibv_access_flags */ .in -8 }; .fi .sp .nf struct ibv_sge { .in +8 uint64_t addr; /* Start address of the local memory buffer or number of bytes from the start of the MR for MRs which are IBV_ACCESS_ZERO_BASED */ uint32_t length; /* Length of the buffer */ uint32_t lkey; /* Key of the local Memory Region */ .in -8 }; .fi .PP Each QP Transport Service Type supports a specific set of opcodes, as shown in the following table: .PP .nf OPCODE | IBV_QPT_UD | IBV_QPT_UC | IBV_QPT_RC | IBV_QPT_XRC_SEND | IBV_QPT_RAW_PACKET \-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-+\-\-\-\-\-\-\-\-\-\-\-\-+\-\-\-\-\-\-\-\-\-\-\-\-+\-\-\-\-\-\-\-\-\-\-\-\-+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-+\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\-\- IBV_WR_SEND | X | X | X | X | X IBV_WR_SEND_WITH_IMM | X | X | X | X | IBV_WR_RDMA_WRITE | | X | X | X | IBV_WR_RDMA_WRITE_WITH_IMM | | X | X | X | IBV_WR_RDMA_READ | | | X | X | IBV_WR_ATOMIC_CMP_AND_SWP | | | X | X | IBV_WR_ATOMIC_FETCH_AND_ADD | | | X | X | IBV_WR_LOCAL_INV | | X | X | X | IBV_WR_BIND_MW | | X | X | X | IBV_WR_SEND_WITH_INV | | X | X | X | IBV_WR_TSO | X | | | | X .fi .PP The attribute send_flags describes the properties of the \s-1WR\s0. It is either 0 or the bitwise \s-1OR\s0 of one or more of the following flags: .PP .TP .B IBV_SEND_FENCE \fR Set the fence indicator. Valid only for QPs with Transport Service Type \fBIBV_QPT_RC .TP .B IBV_SEND_SIGNALED \fR Set the completion notification indicator. 
Relevant only if QP was created with sq_sig_all=0 .TP .B IBV_SEND_SOLICITED \fR Set the solicited event indicator. Valid only for Send and RDMA Write with immediate .TP .B IBV_SEND_INLINE \fR Send data in given gather list as inline data in a send WQE. Valid only for Send and RDMA Write. The L_Key will not be checked. .TP .B IBV_SEND_IP_CSUM \fR Offload the IPv4 and TCP/UDP checksum calculation. Valid only when \fBdevice_cap_flags\fR in device_attr indicates that the current QP type supports checksum offload. .SH "RETURN VALUE" .B ibv_post_send() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" The user should not alter or destroy AHs associated with WRs until the request is fully executed and a work completion has been retrieved from the corresponding completion queue (CQ), to avoid unexpected behavior. .PP The buffers used by a WR can only be safely reused after the WR is fully executed and a work completion has been retrieved from the corresponding completion queue (CQ). However, if the IBV_SEND_INLINE flag was set, the buffer can be reused immediately after the call returns. .PP IBV_WR_DRIVER1 is an opcode that should be used to issue a specific driver operation.
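.PP
As an illustration, a minimal sketch posting a single signaled RDMA Write (everything here is an assumption: \fIqp\fR is an RC QP in the RTS state, \fIbuf\fR/\fIlen\fR lie inside an MR \fImr\fR registered with \fBibv_reg_mr()\fR, and \fIremote_addr\fR/\fIremote_rkey\fR were exchanged out of band):
.PP
.nf
struct ibv_sge sge = {
	.addr   = (uintptr_t)buf,
	.length = len,
	.lkey   = mr->lkey,
};

struct ibv_send_wr wr = {
	.wr_id      = 1,
	.sg_list    = &sge,
	.num_sge    = 1,
	.opcode     = IBV_WR_RDMA_WRITE,
	.send_flags = IBV_SEND_SIGNALED,
	.wr.rdma = {
		.remote_addr = remote_addr,
		.rkey        = remote_rkey,
	},
};

struct ibv_send_wr *bad_wr;
int ret = ibv_post_send(qp, &wr, &bad_wr);
if (ret)
	fprintf(stderr, "ibv_post_send: %s\en", strerror(ret));
.fi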
.SH "SEE ALSO" .BR ibv_create_qp (3), .BR ibv_create_ah (3), .BR ibv_post_recv (3), .BR ibv_post_srq_recv (3), .BR ibv_poll_cq (3) .SH "AUTHORS" .TP Dotan Barak .TP Majd Dibbiny .TP Yishai Hadas rdma-core-56.1/libibverbs/man/ibv_post_srq_ops.3000066400000000000000000000073361477342711600217140ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_POST_SRQ_OPS 3 2017-03-26 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_post_srq_ops \- perform configuration manipulations on a special shared receive queue (SRQ) .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "int ibv_post_srq_ops(struct ibv_srq " "*srq" ", struct ibv_ops_wr " "*wr" , .BI " struct ibv_ops_wr " "**bad_wr" ); .fi .SH "DESCRIPTION" The .B ibv_post_srq_ops() performs a series of offload configuration manipulations on special types of SRQ .I srq\fR. Currently it is used to configure tag matching SRQs. The series of configuration operations is defined by a linked list of struct ibv_ops_wr elements starting from .I wr\fR. .PP .nf struct ibv_ops_wr { .in +8 uint64_t wr_id; /* User defined WR ID */ /* Pointer to next WR in list, NULL if last WR */ struct ibv_ops_wr *next; enum ibv_ops_wr_opcode opcode; /* From enum ibv_ops_wr_opcode */ int flags; /* From enum ibv_ops_flags */ struct { .in +8 /* Number of unexpected messages * handled by SW */ uint32_t unexpected_cnt; /* Input parameter for the DEL opcode * and output parameter for the ADD opcode */ uint32_t handle; struct { .in +8 uint64_t recv_wr_id; /* User defined WR ID for TM_RECV */ struct ibv_sge *sg_list; /* Pointer to the s/g array */ int num_sge; /* Size of the s/g array */ uint64_t tag; uint64_t mask; /* Incoming message considered matching if TMH.tag & entry.mask == entry.tag */ .in -8 } add; .in -8 } tm; .in -8 }; .fi .PP The first part of struct ibv_ops_wr follows the ibv_send_wr layout. The opcode defines the operation to perform; the currently supported values are IBV_WR_TAG_ADD, IBV_WR_TAG_DEL and IBV_WR_TAG_SYNC. See below for a detailed description. .PP To allow reliable data delivery, the TM SRQ maintains a special low-level synchronization primitive - phase synchronization. Receive-side message handling comprises two concurrent activities - posting tagged buffers by SW and receiving incoming messages by HW. This process is considered coherent only if all unexpected messages received by HW are completely processed in SW. To pass the number of processed unexpected messages to hardware, the unexpected_cnt field should be used and the IBV_OPS_TM_SYNC flag should be set. .PP To request a WC for tag list operations, the IBV_OPS_SIGNALED flag should be passed. In this case the WC will be generated on the TM SRQ's CQ, and the provided wr_id will identify the WC. .PP Opcode IBV_WR_TAG_ADD is used to add a tag entry to the tag matching list. A tag entry consists of an SGE list, a tag & mask (matching parameters) and a user specified opaque wr_id (passed via the recv_wr_id field), and is uniquely identified by a handle (returned by the driver). The size of the tag matching list is limited by max_num_tags, and the SGE list size by max_sge. .PP Opcode IBV_WR_TAG_DEL removes a previously added tag entry. The handle field should be set to the value returned by a previously performed IBV_WR_TAG_ADD operation. The operation may fail due to concurrent tag consumption - in this case the IBV_WC_TM_ERR status will be returned in the WC. .PP Opcode IBV_WR_TAG_SYNC may be used when no change to the matching list is required, just to update the unexpected message counter. .PP The IBV_WC_TM_SYNC_REQ flag returned in a list operation WC indicates that counter synchronization is required. This flag may also be returned by an unexpected receive WC, asking for an IBV_WR_TAG_SYNC operation to keep TM coherence consistent. .SH "RETURN VALUE" .B ibv_post_srq_ops() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "SEE ALSO" .BR ibv_create_srq_ex (3), .SH "AUTHORS" .TP Artemy Kovalyov rdma-core-56.1/libibverbs/man/ibv_post_srq_recv.3000066400000000000000000000044751477342711600220550ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_POST_SRQ_RECV 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_post_srq_recv \- post a list of work requests (WRs) to a shared receive queue (SRQ) .SH "SYNOPSIS" .nf .B #include <infiniband/verbs.h> .sp .BI "int ibv_post_srq_recv(struct ibv_srq " "*srq" ", struct ibv_recv_wr " "*wr" , .BI " struct ibv_recv_wr " "**bad_wr" ); .fi .SH "DESCRIPTION" .B ibv_post_srq_recv() posts the linked list of work requests (WRs) starting with .I wr to the shared receive queue (SRQ) .I srq\fR. It stops processing WRs from this list at the first failure (that can be detected immediately while requests are being posted), and returns this failing WR through .I bad_wr\fR. .PP The argument .I wr is an ibv_recv_wr struct, as defined in <infiniband/verbs.h>. .PP .nf struct ibv_recv_wr { .in +8 uint64_t wr_id; /* User defined WR ID */ struct ibv_recv_wr *next; /* Pointer to next WR in list, NULL if last WR */ struct ibv_sge *sg_list; /* Pointer to the s/g array */ int num_sge; /* Size of the s/g array */ .in -8 }; .sp .nf struct ibv_sge { .in +8 uint64_t addr; /* Start address of the local memory buffer */ uint32_t length; /* Length of the buffer */ uint32_t lkey; /* Key of the local Memory Region */ .in -8 }; .fi .SH "RETURN VALUE" .B ibv_post_srq_recv() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" The buffers used by a WR can only be safely reused after the WR is fully executed and a work completion has been retrieved from the corresponding completion queue (CQ). .PP If a WR is being posted to a UD QP, the Global Routing Header (GRH) of the incoming message will be placed in the first 40 bytes of the buffer(s) in the scatter list.
If no GRH is present in the incoming message, then the first bytes will be undefined. This means that in all cases, the actual data of the incoming message will start at an offset of 40 bytes into the buffer(s) in the scatter list. .SH "SEE ALSO" .BR ibv_create_qp (3), .BR ibv_post_send (3), .BR ibv_post_recv (3), .BR ibv_poll_cq (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_query_device.3000066400000000000000000000122301477342711600216360ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_QUERY_DEVICE 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_query_device \- query an RDMA device's attributes .SH "SYNOPSIS" .nf .B #include .sp .BI "int ibv_query_device(struct ibv_context " "*context", .BI " struct ibv_device_attr " "*device_attr" ); .fi .SH "DESCRIPTION" .B ibv_query_device() returns the attributes of the device with context .I context\fR. The argument .I device_attr is a pointer to an ibv_device_attr struct, as defined in . .PP .nf struct ibv_device_attr { .in +8 char fw_ver[64]; /* FW version */ uint64_t node_guid; /* Node GUID (in network byte order) */ uint64_t sys_image_guid; /* System image GUID (in network byte order) */ uint64_t max_mr_size; /* Largest contiguous block that can be registered */ uint64_t page_size_cap; /* Supported memory shift sizes */ uint32_t vendor_id; /* Vendor ID, per IEEE */ uint32_t vendor_part_id; /* Vendor supplied part ID */ uint32_t hw_ver; /* Hardware version */ int max_qp; /* Maximum number of supported QPs */ int max_qp_wr; /* Maximum number of outstanding WR on any work queue */ unsigned int device_cap_flags; /* HCA capabilities mask */ int max_sge; /* Maximum number of s/g per WR for SQ & RQ of QP for non RDMA Read operations */ int max_sge_rd; /* Maximum number of s/g per WR for RDMA Read operations */ int max_cq; /* Maximum number of supported CQs */ int max_cqe; /* Maximum number of CQE capacity per CQ */ int max_mr; /* Maximum number of supported MRs */ int max_pd; /* Maximum number of supported PDs */ int max_qp_rd_atom; /* Maximum number of RDMA Read & Atomic operations that can be outstanding per QP */ int max_ee_rd_atom; /* Maximum number of RDMA Read & Atomic operations that can be outstanding per EEC */ int max_res_rd_atom; /* Maximum number of resources used for RDMA Read & Atomic operations by this HCA as the Target */ int max_qp_init_rd_atom; /* Maximum depth per QP for initiation of RDMA Read & Atomic operations */ int max_ee_init_rd_atom; /* Maximum depth per EEC for initiation of RDMA Read & Atomic operations */ enum ibv_atomic_cap atomic_cap; /* Atomic operations support level */ int max_ee; /* Maximum number of supported EE contexts */ int max_rdd; /* Maximum number of supported RD domains */ int max_mw; /* Maximum number of supported MWs */ int max_raw_ipv6_qp; /* Maximum number of supported raw IPv6 datagram QPs */ int max_raw_ethy_qp; /* Maximum number of supported Ethertype datagram QPs */ int max_mcast_grp; /* Maximum number of supported multicast groups */ int max_mcast_qp_attach; /* Maximum number of QPs per multicast group which can be attached */ int max_total_mcast_qp_attach;/* Maximum number of QPs which can be attached to multicast groups */ int max_ah; /* Maximum number of supported address handles */ int max_fmr; /* Maximum number of supported FMRs */ int max_map_per_fmr; /* Maximum number of (re)maps per FMR before an unmap operation in required */ int max_srq; /* Maximum number 
of supported SRQs */ int max_srq_wr; /* Maximum number of WRs per SRQ */ int max_srq_sge; /* Maximum number of s/g per SRQ */ uint16_t max_pkeys; /* Maximum number of partitions */ uint8_t local_ca_ack_delay; /* Local CA ack delay */ uint8_t phys_port_cnt; /* Number of physical ports */ .in -8 }; .fi .SH "RETURN VALUE" .B ibv_query_device() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" The maximum values returned by this function are the upper limits of supported resources by the device. However, it may not be possible to use these maximum values, since the actual number of any resource that can be created may be limited by the machine configuration, the amount of host memory, user permissions, and the amount of resources already in use by other users/processes. .SH "SEE ALSO" .BR ibv_open_device (3), .BR ibv_query_port (3), .BR ibv_query_pkey (3), .BR ibv_query_gid (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_query_device_ex.3000066400000000000000000000177321477342711600223460ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_QUERY_DEVICE_EX 3 2014-12-17 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_query_device_ex \- query an RDMA device's attributes including extended device properties. .SH "SYNOPSIS" .nf .B #include .sp .BI "int ibv_query_device_ex(struct ibv_context " "*context", .BI " struct ibv_query_device_ex_input " "*input", .BI " struct ibv_device_attr_ex " "*attr" ); .fi .SH "DESCRIPTION" .B ibv_query_device_ex() returns the attributes of the device with context .I context\fR. The argument .I input is a pointer to an ibv_query_device_ex_input structure, used for future extensions The argument .I attr is a pointer to an ibv_device_attr_ex struct, as defined in . .PP .nf struct ibv_device_attr_ex { .in +8 struct ibv_device_attr orig_attr; uint32_t comp_mask; /* Compatibility mask that defines which of the following variables are valid */ struct ibv_odp_caps odp_caps; /* On-Demand Paging capabilities */ uint64_t completion_timestamp_mask; /* Completion timestamp mask (0 = unsupported) */ uint64_t hca_core_clock; /* The frequency (in kHZ) of the HCA (0 = unsupported) */ uint64_t device_cap_flags_ex; /* Extended device capability flags */ struct ibv_tso_caps tso_caps; /* TCP segmentation offload capabilities */ struct ibv_rss_caps rss_caps; /* RSS capabilities */ uint32_t max_wq_type_rq; /* Max Work Queue from type RQ */ struct ibv_packet_pacing_caps packet_pacing_caps; /* Packet pacing capabilities */ uint32_t raw_packet_caps; /* Raw packet capabilities, use enum ibv_raw_packet_caps */ struct ibv_tm_caps tm_caps; /* Tag matching capabilities */ struct ibv_cq_moderation_caps cq_mod_caps; /* CQ moderation max capabilities */ uint64_t max_dm_size; /* Max Device Memory size (in bytes) available for allocation */ struct ibv_pci_atomic_caps atomic_caps; /* PCI atomic operations capabilities, use enum ibv_pci_atomic_op_size */ uint32_t xrc_odp_caps; /* Mask with enum ibv_odp_transport_cap_bits to know which operations are supported. */ uint32_t phys_port_cnt_ex /* Extended number of physical port count, allows exposing more than 255 ports device */ .in -8 }; struct ibv_odp_caps { uint64_t general_odp_caps; /* Mask with enum ibv_odp_general_cap_bits */ struct { uint32_t rc_odp_caps; /* Mask with enum ibv_odp_tranport_cap_bits to know which operations are supported. 
*/ uint32_t uc_odp_caps; /* Mask with enum ibv_odp_tranport_cap_bits to know which operations are supported. */ uint32_t ud_odp_caps; /* Mask with enum ibv_odp_tranport_cap_bits to know which operations are supported. */ } per_transport_caps; }; enum ibv_odp_general_cap_bits { IBV_ODP_SUPPORT = 1 << 0, /* On demand paging is supported */ IBV_ODP_SUPPORT_IMPLICIT = 1 << 1, /* Implicit on demand paging is supported */ }; enum ibv_odp_transport_cap_bits { IBV_ODP_SUPPORT_SEND = 1 << 0, /* Send operations support on-demand paging */ IBV_ODP_SUPPORT_RECV = 1 << 1, /* Receive operations support on-demand paging */ IBV_ODP_SUPPORT_WRITE = 1 << 2, /* RDMA-Write operations support on-demand paging */ IBV_ODP_SUPPORT_READ = 1 << 3, /* RDMA-Read operations support on-demand paging */ IBV_ODP_SUPPORT_ATOMIC = 1 << 4, /* RDMA-Atomic operations support on-demand paging */ IBV_ODP_SUPPORT_SRQ_RECV = 1 << 5, /* SRQ receive operations support on-demand paging */ }; struct ibv_tso_caps { uint32_t max_tso; /* Maximum payload size in bytes supported for segmentation by TSO engine.*/ uint32_t supported_qpts; /* Bitmap showing which QP types are supported by TSO operation. */ }; struct ibv_rss_caps { uint32_t supported_qpts; /* Bitmap showing which QP types are supported RSS */ uint32_t max_rwq_indirection_tables; /* Max receive work queue indirection tables */ uint32_t max_rwq_indirection_table_size; /* Max receive work queue indirection table size */ uint64_t rx_hash_fields_mask; /* Mask with enum ibv_rx_hash_fields to know which incoming packet's field can participates in the RX hash */ uint8_t rx_hash_function; /* Mask with enum ibv_rx_hash_function_flags to know which hash functions are supported */ }; struct ibv_packet_pacing_caps { uint32_t qp_rate_limit_min; /* Minimum rate limit in kbps */ uint32_t qp_rate_limit_max; /* Maximum rate limit in kbps */ uint32_t supported_qpts; /* Bitmap showing which QP types are supported. */ }; enum ibv_raw_packet_caps { .in +8 IBV_RAW_PACKET_CAP_CVLAN_STRIPPING = 1 << 0, /* CVLAN stripping is supported */ IBV_RAW_PACKET_CAP_SCATTER_FCS = 1 << 1, /* FCS scattering is supported */ IBV_RAW_PACKET_CAP_IP_CSUM = 1 << 2, /* IP CSUM offload is supported */ .in -8 }; enum ibv_tm_cap_flags { .in +8 IBV_TM_CAP_RC = 1 << 0, /* Support tag matching on RC transport */ .in -8 }; struct ibv_tm_caps { .in +8 uint32_t max_rndv_hdr_size; /* Max size of rendezvous request header */ uint32_t max_num_tags; /* Max number of tagged buffers in a TM-SRQ matching list */ uint32_t flags; /* From enum ibv_tm_cap_flags */ uint32_t max_ops; /* Max number of outstanding list operations */ uint32_t max_sge; /* Max number of SGEs in a tagged buffer */ .in -8 }; struct ibv_cq_moderation_caps { uint16_t max_cq_count; uint16_t max_cq_period; }; enum ibv_pci_atomic_op_size { .in +8 IBV_PCI_ATOMIC_OPERATION_4_BYTE_SIZE_SUP = 1 << 0, IBV_PCI_ATOMIC_OPERATION_8_BYTE_SIZE_SUP = 1 << 1, IBV_PCI_ATOMIC_OPERATION_16_BYTE_SIZE_SUP = 1 << 2, .in -8 }; struct ibv_pci_atomic_caps { .in +8 uint16_t fetch_add; /* Supported sizes for an atomic fetch and add operation, use enum ibv_pci_atomic_op_size */ uint16_t swap; /* Supported sizes for an atomic unconditional swap operation, use enum ibv_pci_atomic_op_size */ uint16_t compare_swap; /* Supported sizes for an atomic compare and swap operation, use enum ibv_pci_atomic_op_size */ .in -8 }; .fi Extended device capability flags (device_cap_flags_ex): .br .TP 7 IBV_DEVICE_PCI_WRITE_END_PADDING Indicates the device has support for padding PCI writes to a full cache line. 
Padding packets to full cache lines reduces the amount of traffic required at the memory controller at the expense of creating more traffic on the PCI-E port. Workloads that have a high CPU memory load and low PCI-E utilization will benefit from this feature, while workloads that have a high PCI-E utilization and small packets will be harmed. For instance, with a 128 byte cache line size, the transfer of any packet smaller than 128 bytes will require a full 128 byte transfer on PCI, potentially doubling the required PCI-E bandwidth. This feature can be enabled on a QP or WQ basis via the IBV_QP_CREATE_PCI_WRITE_END_PADDING or IBV_WQ_FLAGS_PCI_WRITE_END_PADDING flags. .SH "RETURN VALUE" .B ibv_query_device_ex() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH "NOTES" The maximum values returned by this function are the upper limits of supported resources by the device. However, it may not be possible to use these maximum values, since the actual number of any resource that can be created may be limited by the machine configuration, the amount of host memory, user permissions, and the amount of resources already in use by other users/processes. .SH "SEE ALSO" .BR ibv_query_device (3), .BR ibv_open_device (3), .BR ibv_query_port (3), .BR ibv_query_pkey (3), .BR ibv_query_gid (3) .SH "AUTHORS" .TP Majd Dibbiny rdma-core-56.1/libibverbs/man/ibv_query_ece.3.md000066400000000000000000000025371477342711600215430ustar00rootroot00000000000000--- date: 2020-01-22 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_QUERY_ECE ---
# NAME ibv_query_ece - query ECE options.
# SYNOPSIS ```c #include <infiniband/verbs.h>
int ibv_query_ece(struct ibv_qp *qp, struct ibv_ece *ece); ```
# DESCRIPTION **ibv_query_ece()** queries ECE options. It returns to the user the current ECE state for the QP.
# ARGUMENTS *qp* : The queue pair (QP) associated with the ECE options.
## *ece* Argument : The ECE values.
```c struct ibv_ece { uint32_t vendor_id; uint32_t options; uint32_t comp_mask; }; ```
*vendor_id* : Unique identifier of the provider vendor on the network. Providers set their IEEE OUI here to distinguish themselves in a non-homogeneous network.
*options* : Provider specific attributes which are supported.
*comp_mask* : Bitmask specifying what fields in the structure are valid.
# RETURN VALUE **ibv_query_ece()** returns 0 when the call was successful, or the errno value which indicates the failure reason.
*EOPNOTSUPP* : libibverbs or the provider driver doesn't support the ibv_query_ece() verb.
*EINVAL* : In one of the following: o The QP is invalid. o The ECE options are invalid.
# SEE ALSO **ibv_set_ece**(3)
# AUTHOR Leon Romanovsky rdma-core-56.1/libibverbs/man/ibv_query_gid.3.md000066400000000000000000000015341477342711600215460ustar00rootroot00000000000000--- date: 2006-10-31 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_QUERY_GID ---
# NAME ibv_query_gid - query an InfiniBand port's GID table
# SYNOPSIS ```c #include <infiniband/verbs.h>
int ibv_query_gid(struct ibv_context *context, uint8_t port_num, int index, union ibv_gid *gid); ```
# DESCRIPTION **ibv_query_gid()** returns the GID value in entry *index* of port *port_num* for device context *context* through the pointer *gid*.
# RETURN VALUE **ibv_query_gid()** returns 0 on success, and -1 on error.
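For illustration, a minimal sketch that walks the whole GID table of a port (the choice of port 1 and the assumption that *ctx* is an open device context are both hypothetical):

```c
#include <endian.h>
#include <infiniband/verbs.h>
#include <stdio.h>

/* Print every GID entry of port 1 of an open device context. */
static void print_gids(struct ibv_context *ctx)
{
	struct ibv_port_attr port_attr;
	union ibv_gid gid;

	if (ibv_query_port(ctx, 1, &port_attr))
		return;

	for (int i = 0; i < port_attr.gid_tbl_len; i++) {
		if (ibv_query_gid(ctx, 1, i, &gid))
			continue; /* entry could not be read */
		printf("gid[%d]: %016llx:%016llx\n", i,
		       (unsigned long long)be64toh(gid.global.subnet_prefix),
		       (unsigned long long)be64toh(gid.global.interface_id));
	}
}
```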
# SEE ALSO **ibv_open_device**(3), **ibv_query_device**(3), **ibv_query_pkey**(3), **ibv_query_port**(3) # AUTHOR Dotan Barak rdma-core-56.1/libibverbs/man/ibv_query_gid_ex.3.md000066400000000000000000000036431477342711600222450ustar00rootroot00000000000000--- date: 2020-04-24 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_QUERY_GID_EX --- # NAME ibv_query_gid_ex - Query an InfiniBand port's GID table entry # SYNOPSIS ```c #include int ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num, uint32_t gid_index, struct ibv_gid_entry *entry, uint32_t flags); ``` # DESCRIPTION **ibv_query_gid_ex()** returns the GID entry at *entry* for *gid_index* of port *port_num* for device context *context*. # ARGUMENTS *context* : The context of the device to query. *port_num* : The number of port to query its GID table. *gid_index* : The index of the GID table entry to query. ## *entry* Argument : An ibv_gid_entry struct, as defined in . ```c struct ibv_gid_entry { union ibv_gid gid; uint32_t gid_index; uint32_t port_num; uint32_t gid_type; uint32_t ndev_ifindex; }; ``` *gid* : The GID entry. *gid_index* : The GID table index of this entry. *port_num* : The port number that this GID belongs to. *gid_type* : enum ibv_gid_type, can be one of IBV_GID_TYPE_IB, IBV_GID_TYPE_ROCE_V1 or IBV_GID_TYPE_ROCE_V2. *ndev_ifindex* : The interface index of the net device associated with this GID. It is 0 if there is no net device associated with it. *flags* : Extra fields to query post *ndev_ifindex*, for now must be 0. # RETURN VALUE **ibv_query_gid_ex()** returns 0 on success or errno value on error. # ERRORS ENODATA : *gid_index* is within the GID table size of port *port_num* but there is no data in this index. # SEE ALSO **ibv_open_device**(3), **ibv_query_device**(3), **ibv_query_pkey**(3), **ibv_query_port**(3), **ibv_query_gid_table**(3) # AUTHOR Parav Pandit rdma-core-56.1/libibverbs/man/ibv_query_gid_table.3.md000066400000000000000000000037631477342711600227230ustar00rootroot00000000000000--- date: 2020-04-24 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_QUERY_GID_TABLE --- # NAME ibv_query_gid_table - query an InfiniBand device's GID table # SYNOPSIS ```c #include ssize_t ibv_query_gid_table(struct ibv_context *context, struct ibv_gid_entry *entries, size_t max_entries, uint32_t flags); ``` # DESCRIPTION **ibv_query_gid_table()** returns the valid GID table entries of the RDMA device context *context* at the pointer *entries*. A caller must allocate *entries* array for the GID table entries it desires to query. This API returns only valid GID table entries. A caller must pass non zero number of entries at *max_entries* that corresponds to the size of *entries* array. *entries* array must be allocated such that it can contain all the valid GID table entries of the device. If there are more valid GID entries than the provided value of *max_entries* and *entries* array, the call will fail. For example, if an RDMA device *context* has a total of 10 valid GID entries, *entries* should be allocated for at least 10 entries, and *max_entries* should be set appropriately. # ARGUMENTS *context* : The context of the device to query. *entries* : Array of ibv_gid_entry structs where the GID entries are returned. 
Please see **ibv_query_gid_ex**(3) man page for *ibv_gid_entry*. *max_entries* : Maximum number of entries that can be returned. *flags* : Extra fields to query post *entries->ndev_ifindex*, for now must be 0. # RETURN VALUE **ibv_query_gid_table()** returns the number of entries that were read on success or negative errno value on error. Number of entries returned is <= max_entries. # SEE ALSO **ibv_open_device**(3), **ibv_query_device**(3), **ibv_query_port**(3), **ibv_query_gid_ex**(3) # AUTHOR Parav Pandit rdma-core-56.1/libibverbs/man/ibv_query_pkey.3.md000066400000000000000000000015741477342711600217570ustar00rootroot00000000000000--- date: 2006-10-31 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_QUERY_PKEY --- # NAME ibv_query_pkey - query an InfiniBand port's P_Key table # SYNOPSIS ```c #include int ibv_query_pkey(struct ibv_context *context, uint8_t port_num, int index, uint16_t *pkey); ``` # DESCRIPTION **ibv_query_pkey()** returns the P_Key value (in network byte order) in entry *index* of port *port_num* for device context *context* through the pointer *pkey*. # RETURN VALUE **ibv_query_pkey()** returns 0 on success, and -1 on error. # SEE ALSO **ibv_open_device**(3), **ibv_query_device**(3), **ibv_query_gid**(3), **ibv_query_port**(3) # AUTHOR Dotan Barak rdma-core-56.1/libibverbs/man/ibv_query_port.3000066400000000000000000000055321477342711600213720ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .\" .TH IBV_QUERY_PORT 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual" .SH "NAME" ibv_query_port \- query an RDMA port's attributes .SH "SYNOPSIS" .nf .B #include .sp .BI "int ibv_query_port(struct ibv_context " "*context" ", uint8_t " "port_num" , .BI " struct ibv_port_attr " "*port_attr" "); .fi .SH "DESCRIPTION" .B ibv_query_port() returns the attributes of port .I port_num for device context .I context through the pointer .I port_attr\fR. The argument .I port_attr is an ibv_port_attr struct, as defined in . 
.PP
.nf
struct ibv_port_attr {
.in +8
enum ibv_port_state     state;          /* Logical port state */
enum ibv_mtu            max_mtu;        /* Max MTU supported by port */
enum ibv_mtu            active_mtu;     /* Actual MTU */
int                     gid_tbl_len;    /* Length of source GID table */
uint32_t                port_cap_flags; /* Port capabilities */
uint32_t                max_msg_sz;     /* Maximum message size */
uint32_t                bad_pkey_cntr;  /* Bad P_Key counter */
uint32_t                qkey_viol_cntr; /* Q_Key violation counter */
uint16_t                pkey_tbl_len;   /* Length of partition table */
uint16_t                lid;            /* Base port LID */
uint16_t                sm_lid;         /* SM LID */
uint8_t                 lmc;            /* LMC of LID */
uint8_t                 max_vl_num;     /* Maximum number of VLs */
uint8_t                 sm_sl;          /* SM service level */
uint8_t                 subnet_timeout; /* Subnet propagation delay */
uint8_t                 init_type_reply; /* Type of initialization performed by SM */
uint8_t                 active_width;   /* Currently active link width */
uint8_t                 active_speed;   /* Currently active link speed if speed
rdma-core-56.1/libibverbs/man/ibv_query_qp.3000066400000000000000000000102441477342711600210220ustar00rootroot00000000000000.\" -*- nroff -*-
.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md
.\"
.TH IBV_QUERY_QP 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual"
.SH "NAME"
ibv_query_qp \- get the attributes of a queue pair (QP)
.SH "SYNOPSIS"
.nf
.B #include <infiniband/verbs.h>
.sp
.BI "int ibv_query_qp(struct ibv_qp " "*qp" ", struct ibv_qp_attr " "*attr" ,
.BI "                 int " "attr_mask" ,
.BI "                 struct ibv_qp_init_attr " "*init_attr" );
.fi
.SH "DESCRIPTION"
.B ibv_query_qp()
gets the attributes specified in
.I attr_mask
for the QP
.I qp
and returns them through the pointers
.I attr
and
.I init_attr\fR.
The argument
.I attr
is an ibv_qp_attr struct, as defined in <infiniband/verbs.h>.
.PP
.nf
struct ibv_qp_attr {
.in +8
enum ibv_qp_state       qp_state;        /* Current QP state */
enum ibv_qp_state       cur_qp_state;    /* Current QP state - irrelevant for ibv_query_qp */
enum ibv_mtu            path_mtu;        /* Path MTU (valid only for RC/UC QPs) */
enum ibv_mig_state      path_mig_state;  /* Path migration state (valid if HCA supports APM) */
uint32_t                qkey;            /* Q_Key of the QP (valid only for UD QPs) */
uint32_t                rq_psn;          /* PSN for receive queue (valid only for RC/UC QPs) */
uint32_t                sq_psn;          /* PSN for send queue */
uint32_t                dest_qp_num;     /* Destination QP number (valid only for RC/UC QPs) */
unsigned int            qp_access_flags; /* Mask of enabled remote access operations (valid only for RC/UC QPs) */
struct ibv_qp_cap       cap;             /* QP capabilities */
struct ibv_ah_attr      ah_attr;         /* Primary path address vector (valid only for RC/UC QPs) */
struct ibv_ah_attr      alt_ah_attr;     /* Alternate path address vector (valid only for RC/UC QPs) */
uint16_t                pkey_index;      /* Primary P_Key index */
uint16_t                alt_pkey_index;  /* Alternate P_Key index */
uint8_t                 en_sqd_async_notify; /* Enable SQD.drained async notification - irrelevant for ibv_query_qp */
uint8_t                 sq_draining;     /* Is the QP draining?
                                            (Valid only if qp_state is SQD) */
uint8_t                 max_rd_atomic;   /* Number of outstanding RDMA reads & atomic operations on the destination QP (valid only for RC QPs) */
uint8_t                 max_dest_rd_atomic; /* Number of responder resources for handling incoming RDMA reads & atomic operations (valid only for RC QPs) */
uint8_t                 min_rnr_timer;   /* Minimum RNR NAK timer (valid only for RC QPs) */
uint8_t                 port_num;        /* Primary port number */
uint8_t                 timeout;         /* Local ack timeout for primary path (valid only for RC QPs) */
uint8_t                 retry_cnt;       /* Retry count (valid only for RC QPs) */
uint8_t                 rnr_retry;       /* RNR retry (valid only for RC QPs) */
uint8_t                 alt_port_num;    /* Alternate port number */
uint8_t                 alt_timeout;     /* Local ack timeout for alternate path (valid only for RC QPs) */
.in -8
};
.fi
.PP
For details on struct ibv_qp_cap see the description of
.B ibv_create_qp()\fR.
For details on struct ibv_ah_attr see the description of
.B ibv_create_ah()\fR.
.SH "RETURN VALUE"
.B ibv_query_qp()
returns 0 on success, or the value of errno on failure (which indicates the failure reason).
.SH "NOTES"
The argument
.I attr_mask
is a hint that specifies the minimum list of attributes to retrieve. Some RDMA devices may return extra attributes not requested, for example if the value can be returned cheaply. This has the same form as in
.B ibv_modify_qp()\fR.
.PP
Attribute values are valid if they have been set using
.B ibv_modify_qp()\fR.
The exact list of valid attributes depends on the QP state.
.PP
Multiple calls to
.B ibv_query_qp()
may yield some differences in the values returned for the following attributes: qp_state, path_mig_state, sq_draining, ah_attr (if APM is enabled).
.SH "SEE ALSO"
.BR ibv_create_qp (3),
.BR ibv_destroy_qp (3),
.BR ibv_modify_qp (3),
.BR ibv_create_ah (3)
.SH "AUTHORS"
.TP
Dotan Barak

rdma-core-56.1/libibverbs/man/ibv_query_qp_data_in_order.3.md000066400000000000000000000050161477342711600242740ustar00rootroot00000000000000---
date: 2020-3-3
footer: libibverbs
header: "Libibverbs Programmer's Manual"
layout: page
license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
section: 3
title: ibv_query_qp_data_in_order
---

# NAME

ibv_query_qp_data_in_order - check if QP data is guaranteed to be written in order.

# SYNOPSIS

```c
#include <infiniband/verbs.h>

int ibv_query_qp_data_in_order(struct ibv_qp *qp, enum ibv_wr_opcode op,
                               uint32_t flags);
```

# DESCRIPTION

**ibv_query_qp_data_in_order()** checks whether WQE data is guaranteed to be written in order, so that a reader may poll for data instead of polling for a completion. This function indicates that data is written in order within each WQE; it cannot be used to determine ordering between separate WQEs. This function describes ordering at the receiving side of the QP, not the sending side.

# ARGUMENTS

*qp*
:	The local queue pair (QP) to query.

*op*
:	The operation type to query about. Different operation types may write data in a different order.

	For RDMA read operations: describes ordering of RDMA reads posted on this local QP.
	For RDMA write operations: describes ordering of remote RDMA writes being done into this local QP.
	For RDMA send operations: describes ordering of remote RDMA sends being done into this local QP.

	This function should not be used to determine ordering of other operation types.

*flags*
:	Flags are used to select a query type. Supported values:

	IBV_QUERY_QP_DATA_IN_ORDER_RETURN_CAPS - Query for supported capabilities and return a capabilities vector.
	Passing 0 is equivalent to using IBV_QUERY_QP_DATA_IN_ORDER_RETURN_CAPS and checking for IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG support.

# RETURN VALUE

**ibv_query_qp_data_in_order()** returns a value determined by *flags*. For each capability bit, 1 is returned if the data is guaranteed to be written in order for the selected operation and type, 0 otherwise.

If the IBV_QUERY_QP_DATA_IN_ORDER_RETURN_CAPS flag is used, the return value can consist of the following capability bits:

IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG - All data is written in order.

IBV_QUERY_QP_DATA_IN_ORDER_ALIGNED_128_BYTES - Each 128-byte-aligned block is written in order.

If *flags* is 0, the function returns 1 if IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG is supported and 0 otherwise.

# NOTES

The return value is valid only when the data is read by the CPU and a relaxed-ordering MR is not the target of the transfer.

# SEE ALSO

**ibv_query_qp**(3)

# AUTHOR

Patrisious Haddad

Yochai Cohen

rdma-core-56.1/libibverbs/man/ibv_query_rt_values_ex.3000066400000000000000000000027321477342711600231050ustar00rootroot00000000000000.\" -*- nroff -*-
.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md
.\"
.TH IBV_QUERY_RT_VALUES_EX 3 2016-2-20 libibverbs "Libibverbs Programmer's Manual"
.SH "NAME"
ibv_query_rt_values_ex \- query an RDMA device for some real time values
.SH "SYNOPSIS"
.nf
.B #include <infiniband/verbs.h>
.sp
.BI "int ibv_query_rt_values_ex(struct ibv_context " "*context",
.BI "                           struct ibv_values_ex " "*values" );
.fi
.SH "DESCRIPTION"
.B ibv_query_rt_values_ex()
returns certain real time values of a device
.I context\fR.
The argument
.I values
is a pointer to an ibv_values_ex struct, as defined in <infiniband/verbs.h>.
.PP
.nf
struct ibv_values_ex {
.in +8
uint32_t        comp_mask;   /* Compatibility mask that defines the query/queried fields [in/out] */
struct timespec raw_clock;   /* HW raw clock */
.in -8
};

enum ibv_values_mask {
        IBV_VALUES_MASK_RAW_CLOCK = 1 << 0, /* HW raw clock */
};
.fi
.SH "RETURN VALUE"
.B ibv_query_rt_values_ex()
returns 0 on success, or the value of errno on failure (which indicates the failure reason).
.SH "NOTES"
This extension verb only calls the provider; the provider has to query these values somehow and mark the queried values in the comp_mask field.
.SH "SEE ALSO"
.BR ibv_query_device (3),
.BR ibv_open_device (3),
.BR ibv_query_port (3),
.BR ibv_query_pkey (3),
.BR ibv_query_gid (3)
.SH "AUTHORS"
.TP
Matan Barak
.TP
Yishai Hadas

rdma-core-56.1/libibverbs/man/ibv_query_srq.3000066400000000000000000000026061477342711600212120ustar00rootroot00000000000000.\" -*- nroff -*-
.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md
.\"
.TH IBV_QUERY_SRQ 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual"
.SH "NAME"
ibv_query_srq \- get the attributes of a shared receive queue (SRQ)
.SH "SYNOPSIS"
.nf
.B #include <infiniband/verbs.h>
.sp
.BI "int ibv_query_srq(struct ibv_srq " "*srq" ", struct ibv_srq_attr " "*srq_attr" );
.fi
.SH "DESCRIPTION"
.B ibv_query_srq()
gets the attributes of the SRQ
.I srq
and returns them through the pointer
.I srq_attr\fR.
The argument
.I srq_attr
is an ibv_srq_attr struct, as defined in <infiniband/verbs.h>.
.PP
.nf
struct ibv_srq_attr {
.in +8
uint32_t        max_wr;    /* maximum number of outstanding work requests (WRs) in the SRQ */
uint32_t        max_sge;   /* maximum number of scatter elements per WR */
uint32_t        srq_limit; /* the limit value of the SRQ */
.in -8
};
.fi
.SH "RETURN VALUE"
.B ibv_query_srq()
returns 0 on success, or the value of errno on failure (which indicates the failure reason).
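.SH "EXAMPLE"
A minimal sketch (it assumes
.I srq
was created earlier with
.B ibv_create_srq()\fR)
that reads back the SRQ attributes and reports whether the limit event is still armed:
.PP
.nf
struct ibv_srq_attr attr;
int ret = ibv_query_srq(srq, &attr);

if (ret)
        fprintf(stderr, "ibv_query_srq: %s\en", strerror(ret));
else
        printf("max_wr %u, max_sge %u, limit event %s\en",
               attr.max_wr, attr.max_sge,
               attr.srq_limit ? "armed" : "not armed");
.fi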
.SH "NOTES" If the value returned for srq_limit is 0, then the SRQ limit reached ("low watermark") event is not (or no longer) armed, and no asynchronous events will be generated until the event is rearmed. .SH "SEE ALSO" .BR ibv_create_srq (3), .BR ibv_destroy_srq (3), .BR ibv_modify_srq (3) .SH "AUTHORS" .TP Dotan Barak rdma-core-56.1/libibverbs/man/ibv_rate_to_mbps.3.md000066400000000000000000000022031477342711600222260ustar00rootroot00000000000000--- date: 2012-03-31 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_RATE_TO_MBPS --- # NAME ibv_rate_to_mbps - convert IB rate enumeration to Mbit/sec mbps_to_ibv_rate - convert Mbit/sec to an IB rate enumeration # SYNOPSIS ```c #include int ibv_rate_to_mbps(enum ibv_rate rate); enum ibv_rate mbps_to_ibv_rate(int mbps); ``` # DESCRIPTION **ibv_rate_to_mbps()** converts the IB transmission rate enumeration *rate* to a number of Mbit/sec. For example, if *rate* is **IBV_RATE_5_GBPS**, the value 5000 will be returned (5 Gbit/sec = 5000 Mbit/sec). **mbps_to_ibv_rate()** converts the number of Mbit/sec *mult* to an IB transmission rate enumeration. For example, if *mult* is 5000, the rate enumeration **IBV_RATE_5_GBPS** will be returned. # RETURN VALUE **ibv_rate_to_mbps()** returns the number of Mbit/sec. **mbps_to_ibv_rate()** returns the enumeration representing the IB transmission rate. # SEE ALSO **ibv_query_port**(3) # AUTHOR Dotan Barak rdma-core-56.1/libibverbs/man/ibv_rate_to_mult.3.md000066400000000000000000000023301477342711600222470ustar00rootroot00000000000000--- date: 2006-10-31 footer: libibverbs header: "Libibverbs Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: IBV_RATE_TO_MULT --- # NAME ibv_rate_to_mult - convert IB rate enumeration to multiplier of 2.5 Gbit/sec mult_to_ibv_rate - convert multiplier of 2.5 Gbit/sec to an IB rate enumeration # SYNOPSIS ```c #include int ibv_rate_to_mult(enum ibv_rate rate); enum ibv_rate mult_to_ibv_rate(int mult); ``` # DESCRIPTION **ibv_rate_to_mult()** converts the IB transmission rate enumeration *rate* to a multiple of 2.5 Gbit/sec (the base rate). For example, if *rate* is **IBV_RATE_5_GBPS**, the value 2 will be returned (5 Gbit/sec = 2 * 2.5 Gbit/sec). **mult_to_ibv_rate()** converts the multiplier value (of 2.5 Gbit/sec) *mult* to an IB transmission rate enumeration. For example, if *mult* is 2, the rate enumeration **IBV_RATE_5_GBPS** will be returned. # RETURN VALUE **ibv_rate_to_mult()** returns the multiplier of the base rate 2.5 Gbit/sec. **mult_to_ibv_rate()** returns the enumeration representing the IB transmission rate. 
# SEE ALSO

**ibv_query_port**(3)

# AUTHOR

Dotan Barak

rdma-core-56.1/libibverbs/man/ibv_rc_pingpong.1000066400000000000000000000046011477342711600214600ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md
.TH IBV_RC_PINGPONG 1 "August 30, 2005" "libibverbs" "USER COMMANDS"
.SH NAME
ibv_rc_pingpong \- simple InfiniBand RC transport test
.SH SYNOPSIS
.B ibv_rc_pingpong
[\-p port] [\-d device] [\-i ib port] [\-s size] [\-m size] [\-r rx depth]
[\-n iters] [\-l sl] [\-e] [\-g gid index] [\-o] [\-P] [\-t] [\-j] [\-N]
\fBHOSTNAME\fR

.B ibv_rc_pingpong
[\-p port] [\-d device] [\-i ib port] [\-s size] [\-m size] [\-r rx depth]
[\-n iters] [\-l sl] [\-e] [\-g gid index] [\-o] [\-P] [\-t] [\-j] [\-N]
.SH DESCRIPTION
.PP
Run a simple ping-pong test over InfiniBand via the reliable connected (RC) transport.
.SH OPTIONS
.PP
.TP
\fB\-p\fR, \fB\-\-port\fR=\fIPORT\fR
use TCP port \fIPORT\fR for initial synchronization (default 18515)
.TP
\fB\-d\fR, \fB\-\-ib\-dev\fR=\fIDEVICE\fR
use IB device \fIDEVICE\fR (default first device found)
.TP
\fB\-i\fR, \fB\-\-ib\-port\fR=\fIPORT\fR
use IB port \fIPORT\fR (default port 1)
.TP
\fB\-s\fR, \fB\-\-size\fR=\fISIZE\fR
ping-pong messages of size \fISIZE\fR (default 4096)
.TP
\fB\-m\fR, \fB\-\-mtu\fR=\fISIZE\fR
path MTU \fISIZE\fR (default 1024)
.TP
\fB\-r\fR, \fB\-\-rx\-depth\fR=\fIDEPTH\fR
post \fIDEPTH\fR receives at a time (default 1000)
.TP
\fB\-n\fR, \fB\-\-iters\fR=\fIITERS\fR
perform \fIITERS\fR message exchanges (default 1000)
.TP
\fB\-l\fR, \fB\-\-sl\fR=\fISL\fR
use \fISL\fR as the service level value of the QP (default 0)
.TP
\fB\-e\fR, \fB\-\-events\fR
sleep while waiting for work completion events (default is to poll for completions)
.TP
\fB\-g\fR, \fB\-\-gid-idx\fR=\fIGIDINDEX\fR
local port \fIGIDINDEX\fR
.TP
\fB\-o\fR, \fB\-\-odp\fR
use on demand paging
.TP
\fB\-P\fR, \fB\-\-prefetch\fR
prefetch an ODP MR
.TP
\fB\-t\fR, \fB\-\-ts\fR
get CQE with timestamp
.TP
\fB\-c\fR, \fB\-\-chk\fR
validate received buffer
.TP
\fB\-j\fR, \fB\-\-dm\fR
use device memory
.TP
\fB\-N\fR, \fB\-\-new_send\fR
use new post send WR API
.SH SEE ALSO
.BR ibv_uc_pingpong (1),
.BR ibv_ud_pingpong (1),
.BR ibv_srq_pingpong (1),
.BR ibv_xsrq_pingpong (1)
.SH AUTHORS
.TP
Roland Dreier
.RI < rolandd@cisco.com >
.SH BUGS
The network synchronization between client and server instances is weak, and does not prevent incompatible options from being used on the two instances. The method used for retrieving work completions is not strictly correct, and race conditions may cause failures on some systems.

rdma-core-56.1/libibverbs/man/ibv_read_counters.3.md000066400000000000000000000115751477342711600224140ustar00rootroot00000000000000---
date: 2018-04-02
footer: libibverbs
header: "Libibverbs Programmer's Manual"
layout: page
license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
section: 3
title: ibv_read_counters
---

# NAME

**ibv_read_counters** - Read counter values

# SYNOPSIS

```c
#include <infiniband/verbs.h>

int ibv_read_counters(struct ibv_counters *counters,
                      uint64_t *counters_value,
                      uint32_t ncounters,
                      uint32_t flags);
```

# DESCRIPTION

**ibv_read_counters**() returns the values of the chosen counters into the *counters_value* array, which can hold up to *ncounters* values. The values are filled according to the configuration defined by the user in the **ibv_attach_counters_point_xxx** functions.

# ARGUMENTS

*counters*
:	Counters object to read.

*counters_value*
:	Input buffer to hold the read result.
*ncounters*
:	Number of counters to fill.

*flags*
:	Use enum ibv_read_counters_flags.

## *flags* Argument

IBV_READ_COUNTERS_ATTR_PREFER_CACHED
:	Prefer reading the values from the driver cache; otherwise a volatile hardware access is performed, which is the default.

# RETURN VALUE

**ibv_read_counters**() returns 0 on success, or the value of errno on failure (which indicates the failure reason)

# EXAMPLE

Example: Statically attach counters to a new flow

This example demonstrates the use of counters which are attached statically with the creation of a new flow. The counters are read from hardware periodically, and finally all resources are released.

```c
/* create counters object and define its counters points */
/* create simple L2 flow with hardcoded MAC, and a count action */
/* read counters periodically, every 1sec, until loop ends */
/* assumes user prepared a RAW_PACKET QP as input */
/* only limited error checking in run time for code simplicity */

#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <infiniband/verbs.h>

/* the below MAC should be replaced by user */
#define FLOW_SPEC_ETH_MAC_VAL { \
	.dst_mac = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05}, \
	.src_mac = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}, \
	.ether_type = 0, .vlan_tag = 0, }
#define FLOW_SPEC_ETH_MAC_MASK { \
	.dst_mac = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}, \
	.src_mac = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}, \
	.ether_type = 0, .vlan_tag = 0, }

void example_create_flow_with_counters_on_raw_qp(struct ibv_qp *qp) {
	int idx = 0;
	int loop = 10;
	int ret;
	struct ibv_flow *flow = NULL;
	struct ibv_counters *counters = NULL;
	struct ibv_counters_init_attr init_attr = {0};
	struct ibv_counter_attach_attr attach_attr = {0};

	/* create single counters handle */
	counters = ibv_create_counters(qp->context, &init_attr);

	/* define counters points */
	attach_attr.counter_desc = IBV_COUNTER_PACKETS;
	attach_attr.index = idx++;
	ret = ibv_attach_counters_point_flow(counters, &attach_attr, NULL);
	if (ret == ENOTSUP) {
		fprintf(stderr, "Attaching IBV_COUNTER_PACKETS to flow is not supported\n");
		exit(1);
	}
	attach_attr.counter_desc = IBV_COUNTER_BYTES;
	attach_attr.index = idx++;
	ret = ibv_attach_counters_point_flow(counters, &attach_attr, NULL);
	if (ret == ENOTSUP) {
		fprintf(stderr, "Attaching IBV_COUNTER_BYTES to flow is not supported\n");
		exit(1);
	}

	/* define a new flow attr that includes the counters handle */
	struct raw_eth_flow_attr {
		struct ibv_flow_attr                attr;
		struct ibv_flow_spec_eth            spec_eth;
		struct ibv_flow_spec_counter_action spec_count;
	} flow_attr = {
		.attr = {
			.comp_mask = 0,
			.type = IBV_FLOW_ATTR_NORMAL,
			.size = sizeof(flow_attr),
			.priority = 0,
			.num_of_specs = 2, /* ETH + COUNT */
			.port = 1,
			.flags = 0,
		},
		.spec_eth = {
			.type = IBV_FLOW_SPEC_ETH,
			.size = sizeof(struct ibv_flow_spec_eth),
			.val = FLOW_SPEC_ETH_MAC_VAL,
			.mask = FLOW_SPEC_ETH_MAC_MASK,
		},
		.spec_count = {
			.type = IBV_FLOW_SPEC_ACTION_COUNT,
			.size = sizeof(struct ibv_flow_spec_counter_action),
			.counters = counters, /* attached this counters handle to the newly created ibv_flow */
		}
	};

	/* create the flow */
	flow = ibv_create_flow(qp, &flow_attr.attr);

	/* allocate array for counters value reading */
	uint64_t *counters_value = malloc(sizeof(uint64_t) * idx);

	/* periodical read and print of flow counters */
	while (--loop) {
		sleep(1);

		/* read hardware counters values */
		ibv_read_counters(counters, counters_value, idx,
				  IBV_READ_COUNTERS_ATTR_PREFER_CACHED);

		printf("PACKETS = %" PRIu64 ", BYTES = %" PRIu64 "\n",
		       counters_value[0], counters_value[1]);
	}

	/* all done, release all */
	free(counters_value);

	/* destroy flow and detach counters */
	ibv_destroy_flow(flow);
	/* destroy counters handle */
	ibv_destroy_counters(counters);

	return;
}
```

# SEE ALSO

**ibv_create_counters**(3), **ibv_destroy_counters**(3),
**ibv_attach_counters_point_flow**(3), **ibv_create_flow**(3)

# AUTHORS

Raed Salem

Alex Rosenbaum

rdma-core-56.1/libibverbs/man/ibv_reg_mr.3000066400000000000000000000126061477342711600204340ustar00rootroot00000000000000.\" -*- nroff -*-
.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md
.\"
.TH IBV_REG_MR 3 2006-10-31 libibverbs "Libibverbs Programmer's Manual"
.SH "NAME"
ibv_reg_mr, ibv_reg_mr_iova, ibv_reg_dmabuf_mr, ibv_dereg_mr \- register or deregister a memory region (MR)
.SH "SYNOPSIS"
.nf
.B #include <infiniband/verbs.h>
.sp
.BI "struct ibv_mr *ibv_reg_mr(struct ibv_pd " "*pd" ", void " "*addr" ,
.BI "                          size_t " "length" ", int " "access" );
.sp
.BI "struct ibv_mr *ibv_reg_mr_iova(struct ibv_pd " "*pd" ", void " "*addr" ,
.BI "                               size_t " "length" ", uint64_t " "hca_va" ,
.BI "                               int " "access" );
.sp
.BI "struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd " "*pd" ", uint64_t " "offset" ,
.BI "                                 size_t " "length" ", uint64_t " "iova" ,
.BI "                                 int " "fd" ", int " "access" );
.sp
.BI "int ibv_dereg_mr(struct ibv_mr " "*mr" );
.fi
.SH "DESCRIPTION"
.B ibv_reg_mr()
registers a memory region (MR) associated with the protection domain
.I pd\fR.
The MR's starting address is
.I addr
and its size is
.I length\fR.
The argument
.I access
describes the desired memory protection attributes; it is either 0 or the bitwise OR of one or more of the following flags:
.PP
.TP
.B IBV_ACCESS_LOCAL_WRITE \fR
Enable Local Write Access
.TP
.B IBV_ACCESS_REMOTE_WRITE \fR
Enable Remote Write Access
.TP
.B IBV_ACCESS_REMOTE_READ\fR
Enable Remote Read Access
.TP
.B IBV_ACCESS_REMOTE_ATOMIC\fR
Enable Remote Atomic Operation Access (if supported)
.TP
.B IBV_ACCESS_FLUSH_GLOBAL\fR
Enable Remote Flush Operation with global visibility placement type (if supported)
.TP
.B IBV_ACCESS_FLUSH_PERSISTENT\fR
Enable Remote Flush Operation with persistence placement type (if supported)
.TP
.B IBV_ACCESS_MW_BIND\fR
Enable Memory Window Binding
.TP
.B IBV_ACCESS_ZERO_BASED\fR
Use byte offset from beginning of MR to access this MR, instead of a pointer address
.TP
.B IBV_ACCESS_ON_DEMAND\fR
Create an on-demand paging MR
.TP
.B IBV_ACCESS_HUGETLB\fR
Huge pages are guaranteed to be used for this MR, applicable with IBV_ACCESS_ON_DEMAND in explicit mode only
.TP
.B IBV_ACCESS_RELAXED_ORDERING\fR
This setting allows the NIC to relax the order that data is transferred between the network and the target memory region. Relaxed ordering allows network initiated writes (such as incoming message send or RDMA write operations) to reach memory in an arbitrary order. This can improve the performance of some applications. However, relaxed ordering has the following impact: RDMA write-after-write message order is no longer guaranteed. (Send messages will still match posted receive buffers in order.) Back-to-back network writes that target the same memory region leave the region in an unknown state. Relaxed ordering does not change completion semantics, such as data visibility. That is, a completion still ensures that all data is visible, including data from prior transfers. Relaxed ordered operations will also not bypass atomic operations.
.PP
If
.B IBV_ACCESS_REMOTE_WRITE
or
.B IBV_ACCESS_REMOTE_ATOMIC
is set, then
.B IBV_ACCESS_LOCAL_WRITE
must be set too.
.PP
Local read access is always enabled for the MR.
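.PP
For example, a buffer may be registered for local use and as a remote RDMA write target as follows (a minimal sketch; it assumes
.I pd
was obtained earlier from
.B ibv_alloc_pd()\fR):
.PP
.nf
void *buf = calloc(1, 4096);
struct ibv_mr *mr = ibv_reg_mr(pd, buf, 4096,
                               IBV_ACCESS_LOCAL_WRITE |
                               IBV_ACCESS_REMOTE_WRITE);

if (!mr)
        perror("ibv_reg_mr");
.fi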
.PP
To create an implicit ODP MR, IBV_ACCESS_ON_DEMAND should be set, addr should be 0 and length should be SIZE_MAX.
.PP
If
.B IBV_ACCESS_HUGETLB
is set, the application is aware that all pages of this MR are huge pages, and it must promise never to do anything that breaks huge pages.
.PP
.B ibv_reg_mr_iova()
is the same as the normal reg_mr, except that the user is allowed to specify the virtual base address of the MR when accessed through a lkey or rkey. The offset in the memory region is computed as 'addr + (iova - hca_va)'. Specifying 0 for hca_va has the same effect as IBV_ACCESS_ZERO_BASED.
.PP
.B ibv_reg_dmabuf_mr()
registers a dma-buf based memory region (MR) associated with the protection domain
.I pd\fR.
The MR starts at
.I offset
of the dma-buf and its size is
.I length\fR.
The dma-buf is identified by the file descriptor
.I fd\fR.
The argument
.I iova
specifies the virtual base address of the MR when accessed through a lkey or rkey. It must have the same page offset as
.I offset\fR.
The argument
.I access
describes the desired memory protection attributes; it is similar to the ibv_reg_mr case except that only the following flags are supported:
.B IBV_ACCESS_LOCAL_WRITE, IBV_ACCESS_REMOTE_WRITE, IBV_ACCESS_REMOTE_READ, IBV_ACCESS_REMOTE_ATOMIC, IBV_ACCESS_RELAXED_ORDERING.
.PP
.B ibv_dereg_mr()
deregisters the MR
.I mr\fR.
.SH "RETURN VALUE"
.B ibv_reg_mr() / ibv_reg_mr_iova() / ibv_reg_dmabuf_mr()
returns a pointer to the registered MR, or NULL if the request fails. The local key (\fBL_Key\fR) field
.B lkey
is used as the lkey field of struct ibv_sge when posting buffers with ibv_post_* verbs, and the remote key (\fBR_Key\fR) field
.B rkey
is used by remote processes to perform Atomic and RDMA operations. The remote process places this
.B rkey
as the rkey field of struct ibv_send_wr passed to the ibv_post_send function.
.PP
.B ibv_dereg_mr()
returns 0 on success, or the value of errno on failure (which indicates the failure reason).
.SH "NOTES"
.B ibv_dereg_mr()
fails if any memory window is still bound to this MR.
.SH "SEE ALSO"
.BR ibv_alloc_pd (3),
.BR ibv_post_send (3),
.BR ibv_post_recv (3),
.BR ibv_post_srq_recv (3)
.SH "AUTHORS"
.TP
Dotan Barak

rdma-core-56.1/libibverbs/man/ibv_req_notify_cq.3.md000066400000000000000000000027541477342711600224250ustar00rootroot00000000000000---
date: 2006-10-31
footer: libibverbs
header: "Libibverbs Programmer's Manual"
layout: page
license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
section: 3
title: IBV_REQ_NOTIFY_CQ
---

# NAME

ibv_req_notify_cq - request completion notification on a completion queue (CQ)

# SYNOPSIS

```c
#include <infiniband/verbs.h>

int ibv_req_notify_cq(struct ibv_cq *cq, int solicited_only);
```

# DESCRIPTION

**ibv_req_notify_cq()** requests a completion notification on the completion queue (CQ) *cq*.

Upon the addition of a new CQ entry (CQE) to *cq*, a completion event will be added to the completion channel associated with the CQ. If the argument *solicited_only* is zero, a completion event is generated for any new CQE. If *solicited_only* is non-zero, an event is generated only for a new CQE that is considered "solicited." A CQE is solicited if it is a receive completion for a message with the Solicited Event header bit set, or if the status is not successful. All other successful receive completions, and all successful send completions, are unsolicited.

# RETURN VALUE

**ibv_req_notify_cq()** returns 0 on success, or the value of errno on failure (which indicates the failure reason).
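# EXAMPLE

The following is a minimal sketch of the usual arm-then-drain event loop (it assumes *cq* was created with the completion channel *channel*, and **process_wc()** is a hypothetical application handler):

```c
struct ibv_cq *ev_cq;
void *ev_ctx;
struct ibv_wc wc;

if (ibv_req_notify_cq(cq, 0))		/* arm before waiting */
	return -1;

if (ibv_get_cq_event(channel, &ev_cq, &ev_ctx))
	return -1;
ibv_ack_cq_events(ev_cq, 1);

if (ibv_req_notify_cq(ev_cq, 0))	/* re-arm: requests are one shot */
	return -1;

while (ibv_poll_cq(ev_cq, 1, &wc) > 0)	/* drain to avoid missed CQEs */
	process_wc(&wc);
```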
# NOTES

The request for notification is "one shot." Only one completion event will be generated for each call to **ibv_req_notify_cq()**.

# SEE ALSO

**ibv_create_comp_channel**(3), **ibv_create_cq**(3), **ibv_get_cq_event**(3)

# AUTHOR

Dotan Barak

rdma-core-56.1/libibverbs/man/ibv_rereg_mr.3.md000066400000000000000000000046141477342711600213620ustar00rootroot00000000000000---
date: 2016-03-13
footer: libibverbs
header: "Libibverbs Programmer's Manual"
layout: page
license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
section: 3
title: IBV_REREG_MR
---

# NAME

ibv_rereg_mr - re-register a memory region (MR)

# SYNOPSIS

```c
#include <infiniband/verbs.h>

int ibv_rereg_mr(struct ibv_mr *mr, int flags,
                 struct ibv_pd *pd, void *addr,
                 size_t length, int access);
```

# DESCRIPTION

**ibv_rereg_mr()** modifies the attributes of an existing memory region (MR) *mr*. Conceptually, this call performs the functions of deregistering a memory region followed by registering a memory region. Where possible, resources are reused instead of deallocated and reallocated.

*flags* is a bit-mask used to indicate which of the following properties of the memory region are being modified. Flags should be a combination (bit field) of:

**IBV_REREG_MR_CHANGE_TRANSLATION**
:	Change translation (location and length)

**IBV_REREG_MR_CHANGE_PD**
:	Change protection domain

**IBV_REREG_MR_CHANGE_ACCESS**
:	Change access flags

When **IBV_REREG_MR_CHANGE_PD** is used, *pd* represents the new PD this MR should be registered to.

When **IBV_REREG_MR_CHANGE_TRANSLATION** is used, *addr* represents the virtual address (user-space pointer) of the new MR, while *length* represents its length.

The access and other flags are represented in the field *access*. This field describes the desired memory protection attributes; it is either 0 or the bitwise OR of one or more of ibv_access_flags.

# RETURN VALUE

**ibv_rereg_mr()** returns 0 on success, otherwise an error has occurred and *enum ibv_rereg_mr_err_code* represents the error, as described below.

IBV_REREG_MR_ERR_INPUT - Old MR is valid, an input error was detected by libibverbs.

IBV_REREG_MR_ERR_DONT_FORK_NEW - Old MR is valid, failed via don't fork on new address range.

IBV_REREG_MR_ERR_DO_FORK_OLD - New MR is valid, failed via do fork on old address range.

IBV_REREG_MR_ERR_CMD - MR shouldn't be used, command error.

IBV_REREG_MR_ERR_CMD_AND_DO_FORK_NEW - MR shouldn't be used, command error, invalid fork state on new address range.

# NOTES

Even on a failure, the user still needs to call ibv_dereg_mr on this MR.

# SEE ALSO

**ibv_dereg_mr**(3), **ibv_reg_mr**(3)

# AUTHORS

Matan Barak, Yishai Hadas

rdma-core-56.1/libibverbs/man/ibv_resize_cq.3.md000066400000000000000000000021631477342711600215410ustar00rootroot00000000000000---
date: 2006-10-31
footer: libibverbs
header: "Libibverbs Programmer's Manual"
layout: page
license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
section: 3
title: IBV_RESIZE_CQ
---

# NAME

ibv_resize_cq - resize a completion queue (CQ)

# SYNOPSIS

```c
#include <infiniband/verbs.h>

int ibv_resize_cq(struct ibv_cq *cq, int cqe);
```

# DESCRIPTION

**ibv_resize_cq()** resizes the completion queue (CQ) *cq* to have at least *cqe* entries. *cqe* must be at least the number of unpolled entries in the CQ *cq*. If *cqe* is a valid value less than the current CQ size, **ibv_resize_cq()** may not do anything, since this function is only guaranteed to resize the CQ to a size at least as big as the requested size.
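For example (a minimal sketch; *cq* is an existing CQ):

```c
int ret = ibv_resize_cq(cq, 256);

if (ret)
	fprintf(stderr, "ibv_resize_cq: %s\n", strerror(ret));
else
	printf("CQ can now hold %d CQEs\n", cq->cqe);	/* may exceed 256 */
```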
# RETURN VALUE

**ibv_resize_cq()** returns 0 on success, or the value of errno on failure (which indicates the failure reason).

# NOTES

**ibv_resize_cq()** may assign a CQ size greater than or equal to the requested size. The cqe member of *cq* will be updated to the actual size.

# SEE ALSO

**ibv_create_cq**(3), **ibv_destroy_cq**(3)

# AUTHOR

Dotan Barak

rdma-core-56.1/libibverbs/man/ibv_set_ece.3.md000066400000000000000000000031461477342711600211660ustar00rootroot00000000000000---
date: 2020-01-22
footer: libibverbs
header: "Libibverbs Programmer's Manual"
layout: page
license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
section: 3
title: IBV_SET_ECE
---

# NAME

ibv_set_ece - set ECE options to be used during the QP configuration stage.

# SYNOPSIS

```c
#include <infiniband/verbs.h>

int ibv_set_ece(struct ibv_qp *qp, struct ibv_ece *ece);
```

# DESCRIPTION

**ibv_set_ece()** sets ECE options to be used during the QP configuration stage. The desired ECE options will be applied during the various modify-QP stages, based on the options supported in the relevant QP state.

# ARGUMENTS

*qp*
:	The queue pair (QP) associated with the ECE options.

## *ece* Argument
:	The requested ECE values. This is an IN/OUT field; the accepted options are returned in it.

```c
struct ibv_ece {
	uint32_t vendor_id;
	uint32_t options;
	uint32_t comp_mask;
};
```

*vendor_id*
:	Unique identifier of the provider vendor on the network. Providers set their IEEE OUI here to distinguish themselves in a non-homogeneous network.

*options*
:	Provider-specific attributes that are supported or need to be enabled by ECE users.

*comp_mask*
:	Bitmask specifying what fields in the structure are valid.

# RETURN VALUE

**ibv_set_ece()** returns 0 when the call was successful, or the errno value which indicates the failure reason.

*EOPNOTSUPP*
:	libibverbs or the provider driver doesn't support the ibv_set_ece() verb.

*EINVAL*
:	In one of the following: o The QP is invalid. o The ECE options are invalid.

# SEE ALSO

**ibv_query_ece**(3)

# AUTHOR

Leon Romanovsky

rdma-core-56.1/libibverbs/man/ibv_srq_pingpong.1000066400000000000000000000044161477342711600216650ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md
.TH IBV_SRQ_PINGPONG 1 "August 30, 2005" "libibverbs" "USER COMMANDS"
.SH NAME
ibv_srq_pingpong \- simple InfiniBand shared receive queue test
.SH SYNOPSIS
.B ibv_srq_pingpong
[\-p port] [\-d device] [\-i ib port] [\-s size] [\-m size] [\-q num QPs]
[\-r rx depth] [\-n iters] [\-l sl] [\-e] [\-g gid index]
\fBHOSTNAME\fR

.B ibv_srq_pingpong
[\-p port] [\-d device] [\-i ib port] [\-s size] [\-m size] [\-q num QPs]
[\-r rx depth] [\-n iters] [\-l sl] [\-e] [\-g gid index]
.SH DESCRIPTION
.PP
Run a simple ping-pong test over InfiniBand via the reliable connected (RC) transport, using multiple queue pairs (QPs) and a single shared receive queue (SRQ).
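.PP
For example (the host name below is a placeholder), start the server side, which takes no HOSTNAME argument, with:
.PP
.nf
ibv_srq_pingpong \-q 32
.fi
.PP
and then the client side with:
.PP
.nf
ibv_srq_pingpong \-q 32 server\-host
.fi
.PP
The two instances must be started with compatible options (see BUGS below).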
.SH OPTIONS .PP .TP \fB\-p\fR, \fB\-\-port\fR=\fIPORT\fR use TCP port \fIPORT\fR for initial synchronization (default 18515) .TP \fB\-d\fR, \fB\-\-ib\-dev\fR=\fIDEVICE\fR use IB device \fIDEVICE\fR (default first device found) .TP \fB\-i\fR, \fB\-\-ib\-port\fR=\fIPORT\fR use IB port \fIPORT\fR (default port 1) .TP \fB\-s\fR, \fB\-\-size\fR=\fISIZE\fR ping-pong messages of size \fISIZE\fR (default 4096) .TP \fB\-m\fR, \fB\-\-mtu\fR=\fISIZE\fR path MTU \fISIZE\fR (default 1024) .TP \fB\-q\fR, \fB\-\-num\-qp\fR=\fINUM\fR use \fINUM\fR queue pairs for test (default 16) .TP \fB\-r\fR, \fB\-\-rx\-depth\fR=\fIDEPTH\fR post \fIDEPTH\fR receives at a time (default 1000) .TP \fB\-n\fR, \fB\-\-iters\fR=\fIITERS\fR perform \fIITERS\fR message exchanges (default 1000) .TP \fB\-l\fR, \fB\-\-sl\fR=\fISL\fR use \fISL\fR as the service level value of the QPs (default 0) .TP \fB\-e\fR, \fB\-\-events\fR sleep while waiting for work completion events (default is to poll for completions) .TP \fB\-g\fR, \fB\-\-gid-idx\fR=\fIGIDINDEX\fR local port \fIGIDINDEX\fR .TP \fB\-c\fR, \fB\-\-chk\fR validate received buffer .SH SEE ALSO .BR ibv_rc_pingpong (1), .BR ibv_uc_pingpong (1), .BR ibv_ud_pingpong (1), .BR ibv_xsrq_pingpong (1) .SH AUTHORS .TP Roland Dreier .RI < rolandd@cisco.com > .SH BUGS The network synchronization between client and server instances is weak, and does not prevent incompatible options from being used on the two instances. The method used for retrieving work completions is not strictly correct, and race conditions may cause failures on some systems. rdma-core-56.1/libibverbs/man/ibv_uc_pingpong.1000066400000000000000000000041031477342711600214600ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH IBV_UC_PINGPONG 1 "August 30, 2005" "libibverbs" "USER COMMANDS" .SH NAME ibv_uc_pingpong \- simple InfiniBand UC transport test .SH SYNOPSIS .B ibv_uc_pingpong [\-p port] [\-d device] [\-i ib port] [\-s size] [\-m size] [\-r rx depth] [\-n iters] [\-l sl] [\-e] [\-g gid index] \fBHOSTNAME\fR .B ibv_uc_pingpong [\-p port] [\-d device] [\-i ib port] [\-s size] [\-m size] [\-r rx depth] [\-n iters] [\-l sl] [\-e] [\-g gid index] .SH DESCRIPTION .PP Run a simple ping-pong test over InfiniBand via the unreliable connected (UC) transport. 
.SH OPTIONS .PP .TP \fB\-p\fR, \fB\-\-port\fR=\fIPORT\fR use TCP port \fIPORT\fR for initial synchronization (default 18515) .TP \fB\-d\fR, \fB\-\-ib\-dev\fR=\fIDEVICE\fR use IB device \fIDEVICE\fR (default first device found) .TP \fB\-i\fR, \fB\-\-ib\-port\fR=\fIPORT\fR use IB port \fIPORT\fR (default port 1) .TP \fB\-s\fR, \fB\-\-size\fR=\fISIZE\fR ping-pong messages of size \fISIZE\fR (default 4096) .TP \fB\-m\fR, \fB\-\-mtu\fR=\fISIZE\fR path MTU \fISIZE\fR (default 1024) .TP \fB\-r\fR, \fB\-\-rx\-depth\fR=\fIDEPTH\fR post \fIDEPTH\fR receives at a time (default 1000) .TP \fB\-n\fR, \fB\-\-iters\fR=\fIITERS\fR perform \fIITERS\fR message exchanges (default 1000) .TP \fB\-l\fR, \fB\-\-sl\fR=\fISL\fR use \fISL\fR as the service level value of the QP (default 0) .TP \fB\-e\fR, \fB\-\-events\fR sleep while waiting for work completion events (default is to poll for completions) .TP \fB\-g\fR, \fB\-\-gid-idx\fR=\fIGIDINDEX\fR local port \fIGIDINDEX\fR .TP \fB\-c\fR, \fB\-\-chk\fR validate received buffer .SH SEE ALSO .BR ibv_rc_pingpong (1), .BR ibv_ud_pingpong (1), .BR ibv_srq_pingpong (1), .BR ibv_xsrq_pingpong (1) .SH AUTHORS .TP Roland Dreier .RI < rolandd@cisco.com > .SH BUGS The network synchronization between client and server instances is weak, and does not prevent incompatible options from being used on the two instances. The method used for retrieving work completions is not strictly correct, and race conditions may cause failures on some systems. rdma-core-56.1/libibverbs/man/ibv_ud_pingpong.1000066400000000000000000000037301477342711600214660ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH IBV_UD_PINGPONG 1 "August 30, 2005" "libibverbs" "USER COMMANDS" .SH NAME ibv_ud_pingpong \- simple InfiniBand UD transport test .SH SYNOPSIS .B ibv_ud_pingpong [\-p port] [\-d device] [\-i ib port] [\-s size] [\-r rx depth] [\-n iters] [\-l sl] [\-e] [\-g gid index] \fBHOSTNAME\fR .B ibv_ud_pingpong [\-p port] [\-d device] [\-i ib port] [\-s size] [\-r rx depth] [\-n iters] [\-l sl] [\-e] [\-g gid index] .SH DESCRIPTION .PP Run a simple ping-pong test over InfiniBand via the unreliable datagram (UD) transport. .SH OPTIONS .PP .TP \fB\-p\fR, \fB\-\-port\fR=\fIPORT\fR use TCP port \fIPORT\fR for initial synchronization (default 18515) .TP \fB\-d\fR, \fB\-\-ib\-dev\fR=\fIDEVICE\fR use IB device \fIDEVICE\fR (default first device found) .TP \fB\-i\fR, \fB\-\-ib\-port\fR=\fIPORT\fR use IB port \fIPORT\fR (default port 1) .TP \fB\-s\fR, \fB\-\-size\fR=\fISIZE\fR ping-pong messages of size \fISIZE\fR (default 2048) .TP \fB\-r\fR, \fB\-\-rx\-depth\fR=\fIDEPTH\fR post \fIDEPTH\fR receives at a time (default 500) .TP \fB\-n\fR, \fB\-\-iters\fR=\fIITERS\fR perform \fIITERS\fR message exchanges (default 1000) .TP \fB\-l\fR, \fB\-\-sl\fR=\fISL\fR send messages with service level \fISL\fR (default 0) .TP \fB\-e\fR, \fB\-\-events\fR sleep while waiting for work completion events (default is to poll for completions) .TP \fB\-g\fR, \fB\-\-gid-idx\fR=\fIGIDINDEX\fR local port \fIGIDINDEX\fR .TP \fB\-c\fR, \fB\-\-chk\fR validate received buffer .SH SEE ALSO .BR ibv_rc_pingpong (1), .BR ibv_uc_pingpong (1), .BR ibv_srq_pingpong (1), .BR ibv_xsrq_pingpong (1) .SH AUTHORS .TP Roland Dreier .RI < rolandd@cisco.com > .SH BUGS The network synchronization between client and server instances is weak, and does not prevent incompatible options from being used on the two instances. 
The method used for retrieving work completions is not strictly correct, and race conditions may cause failures on some systems.

rdma-core-56.1/libibverbs/man/ibv_wr_post.3.md000066400000000000000000000316331477342711600212560ustar00rootroot00000000000000---
date: 2018-11-27
footer: libibverbs
header: "Libibverbs Programmer's Manual"
layout: page
license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
section: 3
title: IBV_WR API
---

# NAME

ibv_wr_abort, ibv_wr_complete, ibv_wr_start - Manage regions allowed to post work

ibv_wr_atomic_cmp_swp, ibv_wr_atomic_fetch_add - Post remote atomic operation work requests

ibv_wr_bind_mw, ibv_wr_local_inv - Post work requests for memory windows

ibv_wr_rdma_read, ibv_wr_rdma_write, ibv_wr_rdma_write_imm, ibv_wr_flush - Post RDMA work requests

ibv_wr_send, ibv_wr_send_imm, ibv_wr_send_inv - Post send work requests

ibv_wr_send_tso - Post segmentation offload work requests

ibv_wr_set_inline_data, ibv_wr_set_inline_data_list - Attach inline data to the last work request

ibv_wr_set_sge, ibv_wr_set_sge_list - Attach data to the last work request

ibv_wr_set_ud_addr - Attach UD addressing info to the last work request

ibv_wr_set_xrc_srqn - Attach an XRC SRQN to the last work request

# SYNOPSIS

```c
#include <infiniband/verbs.h>

void ibv_wr_abort(struct ibv_qp_ex *qp);
int ibv_wr_complete(struct ibv_qp_ex *qp);
void ibv_wr_start(struct ibv_qp_ex *qp);

void ibv_wr_atomic_cmp_swp(struct ibv_qp_ex *qp, uint32_t rkey,
                           uint64_t remote_addr, uint64_t compare,
                           uint64_t swap);
void ibv_wr_atomic_fetch_add(struct ibv_qp_ex *qp, uint32_t rkey,
                             uint64_t remote_addr, uint64_t add);

void ibv_wr_bind_mw(struct ibv_qp_ex *qp, struct ibv_mw *mw, uint32_t rkey,
                    const struct ibv_mw_bind_info *bind_info);
void ibv_wr_local_inv(struct ibv_qp_ex *qp, uint32_t invalidate_rkey);

void ibv_wr_rdma_read(struct ibv_qp_ex *qp, uint32_t rkey,
                      uint64_t remote_addr);
void ibv_wr_rdma_write(struct ibv_qp_ex *qp, uint32_t rkey,
                       uint64_t remote_addr);
void ibv_wr_rdma_write_imm(struct ibv_qp_ex *qp, uint32_t rkey,
                           uint64_t remote_addr, __be32 imm_data);

void ibv_wr_send(struct ibv_qp_ex *qp);
void ibv_wr_send_imm(struct ibv_qp_ex *qp, __be32 imm_data);
void ibv_wr_send_inv(struct ibv_qp_ex *qp, uint32_t invalidate_rkey);
void ibv_wr_send_tso(struct ibv_qp_ex *qp, void *hdr, uint16_t hdr_sz,
                     uint16_t mss);

void ibv_wr_set_inline_data(struct ibv_qp_ex *qp, void *addr, size_t length);
void ibv_wr_set_inline_data_list(struct ibv_qp_ex *qp, size_t num_buf,
                                 const struct ibv_data_buf *buf_list);
void ibv_wr_set_sge(struct ibv_qp_ex *qp, uint32_t lkey, uint64_t addr,
                    uint32_t length);
void ibv_wr_set_sge_list(struct ibv_qp_ex *qp, size_t num_sge,
                         const struct ibv_sge *sg_list);

void ibv_wr_set_ud_addr(struct ibv_qp_ex *qp, struct ibv_ah *ah,
                        uint32_t remote_qpn, uint32_t remote_qkey);

void ibv_wr_set_xrc_srqn(struct ibv_qp_ex *qp, uint32_t remote_srqn);

void ibv_wr_flush(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr,
                  size_t len, uint8_t type, uint8_t level);
```

# DESCRIPTION

The verbs work request API (ibv_wr_\*) allows efficient posting of work to a send queue using function calls instead of the struct based *ibv_post_send()* scheme. This approach is designed to minimize CPU branching and locking during the posting process.

This API is intended to be used to access additional functionality beyond what is provided by *ibv_post_send()*.
Batches of work requests posted with *ibv_post_send()* and batches posted with this API may be interleaved, provided they are not posted within each other's critical region. (A critical region in this API is formed by *ibv_wr_start()* and *ibv_wr_complete()*/*ibv_wr_abort()*.)

# USAGE

To use these APIs the QP must be created using ibv_create_qp_ex() which allows setting the **IBV_QP_INIT_ATTR_SEND_OPS_FLAGS** in *comp_mask*. The *send_ops_flags* should be set to the OR of the work request types that will be posted to the QP.

If the QP does not support all the requested work request types then QP creation will fail.

Posting work requests to the QP is done within the critical region formed by *ibv_wr_start()* and *ibv_wr_complete()*/*ibv_wr_abort()* (see CONCURRENCY below).

Each work request is created by calling a WR builder function (see the table column WR builder below) to start creating the work request, followed by allowed/required setter functions described below.

The WR builder and setter combination can be called multiple times to efficiently post multiple work requests within a single critical region.

Each WR builder will use the *wr_id* member of *struct ibv_qp_ex* to set the value to be returned in the completion. Some operations will also use the *wr_flags* member to influence operation (see Flags below). These values should be set before invoking the WR builder function.

For example a simple send could be formed as follows:

```C
qpx->wr_id = 1;
ibv_wr_send(qpx);
ibv_wr_set_sge(qpx, lkey, &data, sizeof(data));
```

The section WORK REQUESTS describes the various WR builders and setters in detail.

Posting work is completed by calling *ibv_wr_complete()* or *ibv_wr_abort()*. No work is submitted to the queue until *ibv_wr_complete()* returns success. *ibv_wr_abort()* will discard all work prepared since *ibv_wr_start()*.

# WORK REQUESTS

Many of the operations match the opcodes available for *ibv_post_send()*. Each operation has a WR builder function, a list of allowed setters, and a flag bit to request the operation with *send_ops_flags* in *struct ibv_qp_init_attr_ex* (see the EXAMPLE below).

| Operation            | WR builder                | QP Type Supported                | setters  |
|----------------------|---------------------------|----------------------------------|----------|
| ATOMIC_CMP_AND_SWP   | ibv_wr_atomic_cmp_swp()   | RC, XRC_SEND                     | DATA, QP |
| ATOMIC_FETCH_AND_ADD | ibv_wr_atomic_fetch_add() | RC, XRC_SEND                     | DATA, QP |
| BIND_MW              | ibv_wr_bind_mw()          | UC, RC, XRC_SEND                 | NONE     |
| LOCAL_INV            | ibv_wr_local_inv()        | UC, RC, XRC_SEND                 | NONE     |
| RDMA_READ            | ibv_wr_rdma_read()        | RC, XRC_SEND                     | DATA, QP |
| RDMA_WRITE           | ibv_wr_rdma_write()       | UC, RC, XRC_SEND                 | DATA, QP |
| FLUSH                | ibv_wr_flush()            | RC, RD, XRC_SEND                 | DATA, QP |
| RDMA_WRITE_WITH_IMM  | ibv_wr_rdma_write_imm()   | UC, RC, XRC_SEND                 | DATA, QP |
| SEND                 | ibv_wr_send()             | UD, UC, RC, XRC_SEND, RAW_PACKET | DATA, QP |
| SEND_WITH_IMM        | ibv_wr_send_imm()         | UD, UC, RC, XRC_SEND             | DATA, QP |
| SEND_WITH_INV        | ibv_wr_send_inv()         | UC, RC, XRC_SEND                 | DATA, QP |
| TSO                  | ibv_wr_send_tso()         | UD, RAW_PACKET                   | DATA, QP |

## Atomic operations

Atomic operations are only atomic so long as all writes to memory go only through the same RDMA hardware. It is not atomic with writes performed by the CPU, or by other RDMA hardware in the system.

*ibv_wr_atomic_cmp_swp()*
:	If the remote 64 bit memory location specified by *rkey* and *remote_addr* equals *compare* then set it to *swap*.

*ibv_wr_atomic_fetch_add()*
:	Add *add* to the 64 bit memory location specified by *rkey* and *remote_addr*.
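For example, a remote 64-bit counter could be incremented as follows (a minimal sketch; *qpx*, *rkey*, *remote_addr*, *lkey* and an 8-byte *local_buf* are assumed to exist, and the QP must have been created with **IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD** in *send_ops_flags*):

```c
ibv_wr_start(qpx);
qpx->wr_id = my_wr_id;
qpx->wr_flags = IBV_SEND_SIGNALED;
ibv_wr_atomic_fetch_add(qpx, rkey, remote_addr, 1);
/* the prior remote value is returned into the 8-byte local buffer */
ibv_wr_set_sge(qpx, lkey, (uintptr_t)local_buf, 8);
if (ibv_wr_complete(qpx))
	return -1;	/* the whole batch was discarded */
```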
## Memory Windows

Memory window type 2 operations (see the man page for ibv_alloc_mw).

*ibv_wr_bind_mw()*
:	Bind a MW type 2 specified by **mw**, set a new **rkey** and set its properties by **bind_info**.

*ibv_wr_local_inv()*
:	Invalidate a MW type 2 which is associated with **rkey**.

## RDMA

*ibv_wr_rdma_read()*
:	Read from the remote memory location specified by *rkey* and *remote_addr*. The number of bytes to read, and the local location to store the data, is determined by the DATA buffers set after this call.

*ibv_wr_rdma_write()*, *ibv_wr_rdma_write_imm()*
:	Write to the remote memory location specified by *rkey* and *remote_addr*. The number of bytes to write, and the local location to get the data, is determined by the DATA buffers set after this call.

	The _imm version causes the remote side to get an IBV_WC_RECV_RDMA_WITH_IMM completion containing the 32 bits of immediate data.

## Message Send

*ibv_wr_send()*, *ibv_wr_send_imm()*
:	Send a message. The number of bytes to send, and the local location to get the data, is determined by the DATA buffers set after this call.

	The _imm version causes the remote side to get an IBV_WC_RECV completion with the IBV_WC_WITH_IMM flag set, carrying the 32 bits of immediate data.

*ibv_wr_send_inv()*
:	The data transfer is the same as for *ibv_wr_send()*, however the remote side will invalidate the MR specified by *invalidate_rkey* before delivering a completion.

*ibv_wr_send_tso()*
:	Produce multiple SEND messages using TCP Segmentation Offload. The SGE points to a TCP stream buffer which will be segmented into MSS-sized SENDs. The *hdr* buffer includes all network headers up to and including the TCP header, and is prefixed to each segment.

## QP Specific setters

Certain QP types require each post to be accompanied by additional setters; these setters are mandatory for any operation listing a QP setter in the above table.

*UD* QPs
:	*ibv_wr_set_ud_addr()* must be called to set the destination address of the work.

*XRC_SEND* QPs
:	*ibv_wr_set_xrc_srqn()* must be called to set the destination SRQN field.

## DATA transfer setters

For work that requires a data transfer, one of the following setters should be called once after the WR builder:

*ibv_wr_set_sge()*
:	Transfer data to/from a single buffer given by the lkey, addr and length. This is equivalent to *ibv_wr_set_sge_list()* with a single element.

*ibv_wr_set_sge_list()*
:	Transfer data to/from a list of buffers, logically concatenated together. Each buffer is specified by an element in an array of *struct ibv_sge*.

Inline setters will copy the send data during the setter and allow the caller to immediately re-use the buffer. This behavior is identical to the IBV_SEND_INLINE flag. Generally this copy is done in a way that optimizes SEND latency and is suitable for small messages. The provider will limit the amount of data it can support in a single operation. This limit is requested in the *max_inline_data* member of *struct ibv_qp_init_attr*. Valid only for SEND and RDMA_WRITE.

*ibv_wr_set_inline_data()*
:	Copy send data from a single buffer given by the addr and length. This is equivalent to *ibv_wr_set_inline_data_list()* with a single element.

*ibv_wr_set_inline_data_list()*
:	Copy send data from a list of buffers, logically concatenated together. Each buffer is specified by an element in an array of *struct ibv_data_buf*.

## Flags

A bit mask of flags may be specified in *wr_flags* to control the behavior of the work request.

**IBV_SEND_FENCE**
:	Do not start this work request until prior work has completed.
**IBV_SEND_IP_CSUM**
:	Offload the IPv4 and TCP/UDP checksum calculation.

**IBV_SEND_SIGNALED**
:	A completion will be generated in the completion queue for the operation.

**IBV_SEND_SOLICITED**
:	Set the solicited bit in the RDMA packet. This informs the other side to generate a completion event upon receiving the RDMA operation.

# CONCURRENCY

The provider will provide locking to ensure that *ibv_wr_start()* and *ibv_wr_complete()/abort()* form a per-QP critical section where no other threads can enter.

If an *ibv_td* is provided during QP creation then no locking will be performed and it is up to the caller to ensure that only one thread can be within the critical region at a time.

# RETURN VALUE

Applications should use this API in a way that does not create failures. The individual APIs do not return a failure indication to avoid branching.

If a failure is detected during operation, for instance due to an invalid argument, then *ibv_wr_complete()* will return failure and the entire posting will be aborted.

# EXAMPLE

```c
/* create RC QP type and specify the required send opcodes */
qp_init_attr_ex.qp_type = IBV_QPT_RC;
qp_init_attr_ex.comp_mask |= IBV_QP_INIT_ATTR_SEND_OPS_FLAGS;
qp_init_attr_ex.send_ops_flags |= IBV_QP_EX_WITH_RDMA_WRITE;
qp_init_attr_ex.send_ops_flags |= IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM;

struct ibv_qp *qp = ibv_create_qp_ex(ctx, &qp_init_attr_ex);
struct ibv_qp_ex *qpx = ibv_qp_to_qp_ex(qp);

ibv_wr_start(qpx);

/* create 1st WRITE WR entry */
qpx->wr_id = my_wr_id_1;
ibv_wr_rdma_write(qpx, rkey, remote_addr_1);
ibv_wr_set_sge(qpx, lkey, local_addr_1, length_1);

/* create 2nd WRITE_WITH_IMM WR entry */
qpx->wr_id = my_wr_id_2;
qpx->wr_flags = IBV_SEND_SIGNALED;
ibv_wr_rdma_write_imm(qpx, rkey, remote_addr_2, htonl(0x1234));
ibv_wr_set_sge(qpx, lkey, local_addr_2, length_2);

/* Begin processing WRs */
ret = ibv_wr_complete(qpx);
```
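If a problem is detected while building a batch, the whole batch can be dropped without posting anything (a minimal sketch continuing the example above; *my_wr_id_3*, *remote_addr_3*, *local_addr_3* and *length_3* are placeholders):

```c
ibv_wr_start(qpx);
qpx->wr_id = my_wr_id_3;
ibv_wr_rdma_write(qpx, rkey, remote_addr_3);
ibv_wr_set_sge(qpx, lkey, local_addr_3, length_3);
/* ... an application-level error is detected ... */
ibv_wr_abort(qpx);	/* discards everything since ibv_wr_start() */
```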
.SH OPTIONS .PP .TP \fB\-p\fR, \fB\-\-port\fR=\fIPORT\fR use TCP port \fIPORT\fR for initial synchronization (default 18515) .TP \fB\-d\fR, \fB\-\-ib\-dev\fR=\fIDEVICE\fR use IB device \fIDEVICE\fR (default first device found) .TP \fB\-i\fR, \fB\-\-ib\-port\fR=\fIPORT\fR use IB port \fIPORT\fR (default port 1) .TP \fB\-s\fR, \fB\-\-size\fR=\fISIZE\fR ping-pong messages of size \fISIZE\fR (default 4096) .TP \fB\-m\fR, \fB\-\-mtu\fR=\fIMTU\fR use path mtu of size \fIMTU\fR (default 2048) .TP \fB\-c\fR, \fB\-\-clients\fR=\fICLIENTS\fR number of clients \fICLIENTS\fR (on server only, default 1) .TP \fB\-n\fR, \fB\-\-num\-tests\fR=\fINUM_TESTS\fR perform \fINUM_TESTS\fR tests per client (default 5) .TP \fB\-l\fR, \fB\-\-sl\fR=\fISL\fR use \fISL\fR as the service level value (default 0) .TP \fB\-e\fR, \fB\-\-events\fR sleep while waiting for work completion events (default is to poll for completions) .TP \fB\-g\fR, \fB\-\-gid-idx\fR=\fIGIDINDEX\fR local port \fIGIDINDEX\fR .SH SEE ALSO .BR ibv_rc_pingpong (1), .BR ibv_uc_pingpong (1), .BR ibv_ud_pingpong (1) .BR ibv_srq_pingpong (1) .SH AUTHORS .TP Roland Dreier .RI < roland@purestorage.com > .TP Jarod Wilson .RI < jarod@redhat.com > .SH BUGS The network synchronization between client and server instances is weak, and does not prevent incompatible options from being used on the two instances. The method used for retrieving work completions is not strictly correct, and race conditions may cause failures on some systems. rdma-core-56.1/libibverbs/marshall.c000066400000000000000000000114541477342711600174310ustar00rootroot00000000000000/* * Copyright (c) 2005 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include void ibv_copy_ah_attr_from_kern(struct ibv_ah_attr *dst, struct ib_uverbs_ah_attr *src) { memcpy(dst->grh.dgid.raw, src->grh.dgid, sizeof dst->grh.dgid); dst->grh.flow_label = src->grh.flow_label; dst->grh.sgid_index = src->grh.sgid_index; dst->grh.hop_limit = src->grh.hop_limit; dst->grh.traffic_class = src->grh.traffic_class; dst->dlid = src->dlid; dst->sl = src->sl; dst->src_path_bits = src->src_path_bits; dst->static_rate = src->static_rate; dst->is_global = src->is_global; dst->port_num = src->port_num; } void ibv_copy_qp_attr_from_kern(struct ibv_qp_attr *dst, struct ib_uverbs_qp_attr *src) { dst->cur_qp_state = src->cur_qp_state; dst->path_mtu = src->path_mtu; dst->path_mig_state = src->path_mig_state; dst->qkey = src->qkey; dst->rq_psn = src->rq_psn; dst->sq_psn = src->sq_psn; dst->dest_qp_num = src->dest_qp_num; dst->qp_access_flags = src->qp_access_flags; dst->cap.max_send_wr = src->max_send_wr; dst->cap.max_recv_wr = src->max_recv_wr; dst->cap.max_send_sge = src->max_send_sge; dst->cap.max_recv_sge = src->max_recv_sge; dst->cap.max_inline_data = src->max_inline_data; ibv_copy_ah_attr_from_kern(&dst->ah_attr, &src->ah_attr); ibv_copy_ah_attr_from_kern(&dst->alt_ah_attr, &src->alt_ah_attr); dst->pkey_index = src->pkey_index; dst->alt_pkey_index = src->alt_pkey_index; dst->en_sqd_async_notify = src->en_sqd_async_notify; dst->sq_draining = src->sq_draining; dst->max_rd_atomic = src->max_rd_atomic; dst->max_dest_rd_atomic = src->max_dest_rd_atomic; dst->min_rnr_timer = src->min_rnr_timer; dst->port_num = src->port_num; dst->timeout = src->timeout; dst->retry_cnt = src->retry_cnt; dst->rnr_retry = src->rnr_retry; dst->alt_port_num = src->alt_port_num; dst->alt_timeout = src->alt_timeout; } void ibv_copy_path_rec_from_kern(struct ibv_sa_path_rec *dst, struct ib_user_path_rec *src) { memcpy(dst->dgid.raw, src->dgid, sizeof dst->dgid); memcpy(dst->sgid.raw, src->sgid, sizeof dst->sgid); dst->dlid = src->dlid; dst->slid = src->slid; dst->raw_traffic = src->raw_traffic; dst->flow_label = src->flow_label; dst->hop_limit = src->hop_limit; dst->traffic_class = src->traffic_class; dst->reversible = src->reversible; dst->numb_path = src->numb_path; dst->pkey = src->pkey; dst->sl = src->sl; dst->mtu_selector = src->mtu_selector; dst->mtu = src->mtu; dst->rate_selector = src->rate_selector; dst->rate = src->rate; dst->packet_life_time = src->packet_life_time; dst->preference = src->preference; dst->packet_life_time_selector = src->packet_life_time_selector; } void ibv_copy_path_rec_to_kern(struct ib_user_path_rec *dst, struct ibv_sa_path_rec *src) { memcpy(dst->dgid, src->dgid.raw, sizeof src->dgid); memcpy(dst->sgid, src->sgid.raw, sizeof src->sgid); dst->dlid = src->dlid; dst->slid = src->slid; dst->raw_traffic = src->raw_traffic; dst->flow_label = src->flow_label; dst->hop_limit = src->hop_limit; dst->traffic_class = src->traffic_class; dst->reversible = src->reversible; dst->numb_path = src->numb_path; dst->pkey = src->pkey; dst->sl = src->sl; dst->mtu_selector = src->mtu_selector; dst->mtu = src->mtu; dst->rate_selector = src->rate_selector; dst->rate = src->rate; dst->packet_life_time = src->packet_life_time; dst->preference = src->preference; dst->packet_life_time_selector = src->packet_life_time_selector; } rdma-core-56.1/libibverbs/marshall.h000066400000000000000000000040711477342711600174330ustar00rootroot00000000000000/* * Copyright (c) 2005 Intel Corporation. All rights reserved. 
* * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef INFINIBAND_MARSHALL_H #define INFINIBAND_MARSHALL_H #include #include #include #include #ifdef __cplusplus extern "C" { #endif void ibv_copy_qp_attr_from_kern(struct ibv_qp_attr *dst, struct ib_uverbs_qp_attr *src); void ibv_copy_ah_attr_from_kern(struct ibv_ah_attr *dst, struct ib_uverbs_ah_attr *src); void ibv_copy_path_rec_from_kern(struct ibv_sa_path_rec *dst, struct ib_user_path_rec *src); void ibv_copy_path_rec_to_kern(struct ib_user_path_rec *dst, struct ibv_sa_path_rec *src); #ifdef __cplusplus } #endif #endif /* INFINIBAND_MARSHALL_H */ rdma-core-56.1/libibverbs/memory.c000066400000000000000000000355351477342711600171440ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved. * Copyright (c) 2006 Cisco Systems, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include #include #include #include "ibverbs.h" #include "util/rdma_nl.h" struct ibv_mem_node { enum { IBV_RED, IBV_BLACK } color; struct ibv_mem_node *parent; struct ibv_mem_node *left, *right; uintptr_t start, end; int refcnt; }; static struct ibv_mem_node *mm_root; static pthread_mutex_t mm_mutex = PTHREAD_MUTEX_INITIALIZER; static int page_size; static int huge_page_enabled; static int too_late; static unsigned long smaps_page_size(FILE *file) { int n; unsigned long size = page_size; char buf[1024]; while (fgets(buf, sizeof(buf), file) != NULL) { if (!strstr(buf, "KernelPageSize:")) continue; n = sscanf(buf, "%*s %lu", &size); if (n < 1) continue; /* page size is printed in Kb */ size = size * 1024; break; } return size; } static unsigned long get_page_size(void *base) { unsigned long ret = page_size; pid_t pid; FILE *file; char buf[1024]; pid = getpid(); snprintf(buf, sizeof(buf), "/proc/%d/smaps", pid); file = fopen(buf, "r" STREAM_CLOEXEC); if (!file) goto out; while (fgets(buf, sizeof(buf), file) != NULL) { int n; uintptr_t range_start, range_end; n = sscanf(buf, "%" SCNxPTR "-%" SCNxPTR, &range_start, &range_end); if (n < 2) continue; if ((uintptr_t) base >= range_start && (uintptr_t) base < range_end) { ret = smaps_page_size(file); break; } } fclose(file); out: return ret; } int ibv_fork_init(void) { void *tmp, *tmp_aligned; int ret; unsigned long size; if (getenv("RDMAV_HUGEPAGES_SAFE")) huge_page_enabled = 1; if (mm_root) return 0; if (ibv_is_fork_initialized() == IBV_FORK_UNNEEDED) return 0; if (too_late) return EINVAL; page_size = sysconf(_SC_PAGESIZE); if (page_size < 0) return errno; if (posix_memalign(&tmp, page_size, page_size)) return ENOMEM; if (huge_page_enabled) { size = get_page_size(tmp); tmp_aligned = (void *) ((uintptr_t) tmp & ~(size - 1)); } else { size = page_size; tmp_aligned = tmp; } ret = madvise(tmp_aligned, size, MADV_DONTFORK) || madvise(tmp_aligned, size, MADV_DOFORK); free(tmp); if (ret) return ENOSYS; mm_root = malloc(sizeof *mm_root); if (!mm_root) return ENOMEM; mm_root->parent = NULL; mm_root->left = NULL; mm_root->right = NULL; mm_root->color = IBV_BLACK; mm_root->start = 0; mm_root->end = UINTPTR_MAX; mm_root->refcnt = 0; return 0; } enum ibv_fork_status ibv_is_fork_initialized(void) { if (get_copy_on_fork()) return IBV_FORK_UNNEEDED; return mm_root ? 
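/* mm_root is allocated only by a successful ibv_fork_init(), so its
 * presence is what tells IBV_FORK_ENABLED apart from IBV_FORK_DISABLED. */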
IBV_FORK_ENABLED : IBV_FORK_DISABLED; } static struct ibv_mem_node *__mm_prev(struct ibv_mem_node *node) { if (node->left) { node = node->left; while (node->right) node = node->right; } else { while (node->parent && node == node->parent->left) node = node->parent; node = node->parent; } return node; } static struct ibv_mem_node *__mm_next(struct ibv_mem_node *node) { if (node->right) { node = node->right; while (node->left) node = node->left; } else { while (node->parent && node == node->parent->right) node = node->parent; node = node->parent; } return node; } static void __mm_rotate_right(struct ibv_mem_node *node) { struct ibv_mem_node *tmp; tmp = node->left; node->left = tmp->right; if (node->left) node->left->parent = node; if (node->parent) { if (node->parent->right == node) node->parent->right = tmp; else node->parent->left = tmp; } else mm_root = tmp; tmp->parent = node->parent; tmp->right = node; node->parent = tmp; } static void __mm_rotate_left(struct ibv_mem_node *node) { struct ibv_mem_node *tmp; tmp = node->right; node->right = tmp->left; if (node->right) node->right->parent = node; if (node->parent) { if (node->parent->right == node) node->parent->right = tmp; else node->parent->left = tmp; } else mm_root = tmp; tmp->parent = node->parent; tmp->left = node; node->parent = tmp; } #if 0 static int verify(struct ibv_mem_node *node) { int hl, hr; if (!node) return 1; hl = verify(node->left); hr = verify(node->left); if (!hl || !hr) return 0; if (hl != hr) return 0; if (node->color == IBV_RED) { if (node->left && node->left->color != IBV_BLACK) return 0; if (node->right && node->right->color != IBV_BLACK) return 0; return hl; } return hl + 1; } #endif static void __mm_add_rebalance(struct ibv_mem_node *node) { struct ibv_mem_node *parent, *gp, *uncle; while (node->parent && node->parent->color == IBV_RED) { parent = node->parent; gp = node->parent->parent; if (parent == gp->left) { uncle = gp->right; if (uncle && uncle->color == IBV_RED) { parent->color = IBV_BLACK; uncle->color = IBV_BLACK; gp->color = IBV_RED; node = gp; } else { if (node == parent->right) { __mm_rotate_left(parent); node = parent; parent = node->parent; } parent->color = IBV_BLACK; gp->color = IBV_RED; __mm_rotate_right(gp); } } else { uncle = gp->left; if (uncle && uncle->color == IBV_RED) { parent->color = IBV_BLACK; uncle->color = IBV_BLACK; gp->color = IBV_RED; node = gp; } else { if (node == parent->left) { __mm_rotate_right(parent); node = parent; parent = node->parent; } parent->color = IBV_BLACK; gp->color = IBV_RED; __mm_rotate_left(gp); } } } mm_root->color = IBV_BLACK; } static void __mm_add(struct ibv_mem_node *new) { struct ibv_mem_node *node, *parent = NULL; node = mm_root; while (node) { parent = node; if (node->start < new->start) node = node->right; else node = node->left; } if (parent->start < new->start) parent->right = new; else parent->left = new; new->parent = parent; new->left = NULL; new->right = NULL; new->color = IBV_RED; __mm_add_rebalance(new); } static void __mm_remove(struct ibv_mem_node *node) { struct ibv_mem_node *child, *parent, *sib, *tmp; int nodecol; if (node->left && node->right) { tmp = node->left; while (tmp->right) tmp = tmp->right; nodecol = tmp->color; child = tmp->left; tmp->color = node->color; if (tmp->parent != node) { parent = tmp->parent; parent->right = tmp->left; if (tmp->left) tmp->left->parent = parent; tmp->left = node->left; node->left->parent = tmp; } else parent = tmp; tmp->right = node->right; node->right->parent = tmp; tmp->parent = node->parent; if 
(node->parent) { if (node->parent->left == node) node->parent->left = tmp; else node->parent->right = tmp; } else mm_root = tmp; } else { nodecol = node->color; child = node->left ? node->left : node->right; parent = node->parent; if (child) child->parent = parent; if (parent) { if (parent->left == node) parent->left = child; else parent->right = child; } else mm_root = child; } free(node); if (nodecol == IBV_RED) return; while ((!child || child->color == IBV_BLACK) && child != mm_root) { if (parent->left == child) { sib = parent->right; if (sib->color == IBV_RED) { parent->color = IBV_RED; sib->color = IBV_BLACK; __mm_rotate_left(parent); sib = parent->right; } if ((!sib->left || sib->left->color == IBV_BLACK) && (!sib->right || sib->right->color == IBV_BLACK)) { sib->color = IBV_RED; child = parent; parent = child->parent; } else { if (!sib->right || sib->right->color == IBV_BLACK) { if (sib->left) sib->left->color = IBV_BLACK; sib->color = IBV_RED; __mm_rotate_right(sib); sib = parent->right; } sib->color = parent->color; parent->color = IBV_BLACK; if (sib->right) sib->right->color = IBV_BLACK; __mm_rotate_left(parent); child = mm_root; break; } } else { sib = parent->left; if (sib->color == IBV_RED) { parent->color = IBV_RED; sib->color = IBV_BLACK; __mm_rotate_right(parent); sib = parent->left; } if ((!sib->left || sib->left->color == IBV_BLACK) && (!sib->right || sib->right->color == IBV_BLACK)) { sib->color = IBV_RED; child = parent; parent = child->parent; } else { if (!sib->left || sib->left->color == IBV_BLACK) { if (sib->right) sib->right->color = IBV_BLACK; sib->color = IBV_RED; __mm_rotate_left(sib); sib = parent->left; } sib->color = parent->color; parent->color = IBV_BLACK; if (sib->left) sib->left->color = IBV_BLACK; __mm_rotate_right(parent); child = mm_root; break; } } } if (child) child->color = IBV_BLACK; } static struct ibv_mem_node *__mm_find_start(uintptr_t start, uintptr_t end) { struct ibv_mem_node *node = mm_root; while (node) { if (node->start <= start && node->end >= start) break; if (node->start < start) node = node->right; else node = node->left; } return node; } static struct ibv_mem_node *merge_ranges(struct ibv_mem_node *node, struct ibv_mem_node *prev) { prev->end = node->end; prev->refcnt = node->refcnt; __mm_remove(node); return prev; } static struct ibv_mem_node *split_range(struct ibv_mem_node *node, uintptr_t cut_line) { struct ibv_mem_node *new_node = NULL; new_node = malloc(sizeof *new_node); if (!new_node) return NULL; new_node->start = cut_line; new_node->end = node->end; new_node->refcnt = node->refcnt; node->end = cut_line - 1; __mm_add(new_node); return new_node; } static struct ibv_mem_node *get_start_node(uintptr_t start, uintptr_t end, int inc) { struct ibv_mem_node *node, *tmp = NULL; node = __mm_find_start(start, end); if (node->start < start) node = split_range(node, start); else { tmp = __mm_prev(node); if (tmp && tmp->refcnt == node->refcnt + inc) node = merge_ranges(node, tmp); } return node; } /* * This function is called if madvise() fails to undo merging/splitting * operations performed on the node. */ static struct ibv_mem_node *undo_node(struct ibv_mem_node *node, uintptr_t start, int inc) { struct ibv_mem_node *tmp = NULL; /* * This condition can be true only if we merged this * node with the previous one, so we need to split them. 
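 * (split_range() recreates the boundary that merge_ranges() erased, and
 * the refcnt adjustment just below restores the first half to its
 * original count before the walk continues on the second half.)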
*/ if (start > node->start) { tmp = split_range(node, start); if (tmp) { node->refcnt += inc; node = tmp; } else return NULL; } tmp = __mm_prev(node); if (tmp && tmp->refcnt == node->refcnt) node = merge_ranges(node, tmp); tmp = __mm_next(node); if (tmp && tmp->refcnt == node->refcnt) node = merge_ranges(tmp, node); return node; } static int do_madvise(void *addr, size_t length, int advice, unsigned long range_page_size) { int ret; void *p; ret = madvise(addr, length, advice); if (!ret || advice == MADV_DONTFORK) return ret; if (length > range_page_size) { /* if MADV_DOFORK failed we will try to remove VM_DONTCOPY * flag from each page */ for (p = addr; p < addr + length; p += range_page_size) madvise(p, range_page_size, MADV_DOFORK); } return 0; } static int ibv_madvise_range(void *base, size_t size, int advice) { uintptr_t start, end; struct ibv_mem_node *node, *tmp; int inc; int rolling_back = 0; int ret = 0; unsigned long range_page_size; if (!size || !base) return 0; if (huge_page_enabled) range_page_size = get_page_size(base); else range_page_size = page_size; start = (uintptr_t) base & ~(range_page_size - 1); end = ((uintptr_t) (base + size + range_page_size - 1) & ~(range_page_size - 1)) - 1; pthread_mutex_lock(&mm_mutex); again: inc = advice == MADV_DONTFORK ? 1 : -1; node = get_start_node(start, end, inc); if (!node) { ret = -1; goto out; } while (node && node->start <= end) { if (node->end > end) { if (!split_range(node, end + 1)) { ret = -1; goto out; } } if ((inc == -1 && node->refcnt == 1) || (inc == 1 && node->refcnt == 0)) { /* * If this is the first time through the loop, * and we merged this node with the previous * one, then we only want to do the madvise() * on start ... node->end (rather than * starting at node->start). * * Otherwise we end up doing madvise() on * bigger region than we're being asked to, * and that may lead to a spurious failure. */ if (start > node->start) ret = do_madvise((void *) start, node->end - start + 1, advice, range_page_size); else ret = do_madvise((void *) node->start, node->end - node->start + 1, advice, range_page_size); if (ret) { node = undo_node(node, start, inc); if (rolling_back || !node) goto out; /* madvise failed, roll back previous changes */ rolling_back = 1; advice = advice == MADV_DONTFORK ? 
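/* invert the advice so that the 'again' pass walks the ranges that were
 * already updated and applies the opposite madvise to undo them */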
MADV_DOFORK : MADV_DONTFORK; end = node->end; goto again; } } node->refcnt += inc; node = __mm_next(node); } if (node) { tmp = __mm_prev(node); if (tmp && node->refcnt == tmp->refcnt) node = merge_ranges(node, tmp); } out: if (rolling_back) ret = -1; pthread_mutex_unlock(&mm_mutex); return ret; } int ibv_dontfork_range(void *base, size_t size) { if (mm_root) return ibv_madvise_range(base, size, MADV_DONTFORK); else { too_late = 1; return 0; } } int ibv_dofork_range(void *base, size_t size) { if (mm_root) return ibv_madvise_range(base, size, MADV_DOFORK); else { too_late = 1; return 0; } } rdma-core-56.1/libibverbs/neigh.c000066400000000000000000000454731477342711600167300ustar00rootroot00000000000000/* Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md */ #include "config.h" #include #include #include #include #include #include #if HAVE_WORKING_IF_H #include #endif #include #include #include #include #include #include #include #include #include #include #include #include #if !HAVE_WORKING_IF_H /* We need this decl from net/if.h but old systems do not let use co-include net/if.h and netlink/route/link.h */ extern unsigned int if_nametoindex(__const char *__ifname); #endif /* for PFX */ #include "ibverbs.h" #include #include "neigh.h" #include union sktaddr { struct sockaddr s; struct sockaddr_in s4; struct sockaddr_in6 s6; }; struct skt { union sktaddr sktaddr; socklen_t len; }; static int set_link_port(union sktaddr *s, __be16 port, int oif) { switch (s->s.sa_family) { case AF_INET: s->s4.sin_port = port; break; case AF_INET6: s->s6.sin6_port = port; s->s6.sin6_scope_id = oif; break; default: return -EINVAL; } return 0; } static bool cmp_address(const struct sockaddr *s1, const struct sockaddr *s2) { if (s1->sa_family != s2->sa_family) return false; switch (s1->sa_family) { case AF_INET: return ((struct sockaddr_in *)s1)->sin_addr.s_addr == ((struct sockaddr_in *)s2)->sin_addr.s_addr; case AF_INET6: return !memcmp( ((struct sockaddr_in6 *)s1)->sin6_addr.s6_addr, ((struct sockaddr_in6 *)s2)->sin6_addr.s6_addr, sizeof(((struct sockaddr_in6 *)s1)->sin6_addr.s6_addr)); default: return false; } } static int get_ifindex(const struct sockaddr *s) { struct ifaddrs *ifaddr, *ifa; int name2index = -ENODEV; if (-1 == getifaddrs(&ifaddr)) return errno; for (ifa = ifaddr; ifa != NULL; ifa = ifa->ifa_next) { if (ifa->ifa_addr == NULL) continue; if (cmp_address(ifa->ifa_addr, s)) { name2index = if_nametoindex(ifa->ifa_name); break; } } freeifaddrs(ifaddr); return name2index; } static struct nl_addr *get_neigh_mac(struct get_neigh_handler *neigh_handler) { struct rtnl_neigh *neigh; struct nl_addr *ll_addr = NULL; /* future optimization - if link local address - parse address and * return mac now instead of doing so after the routing CB. 
This * is of course referred to GIDs */ neigh = rtnl_neigh_get(neigh_handler->neigh_cache, neigh_handler->oif, neigh_handler->dst); if (neigh == NULL) return NULL; ll_addr = rtnl_neigh_get_lladdr(neigh); if (NULL != ll_addr) ll_addr = nl_addr_clone(ll_addr); rtnl_neigh_put(neigh); return ll_addr; } static void get_neigh_cb_event(struct nl_object *obj, void *arg) { struct get_neigh_handler *neigh_handler = (struct get_neigh_handler *)arg; /* assumed serilized callback (no parallel execution of function) */ if (nl_object_match_filter( obj, (struct nl_object *)neigh_handler->filter_neigh)) { struct rtnl_neigh *neigh = (struct rtnl_neigh *)obj; /* check that we didn't set it already */ if (neigh_handler->found_ll_addr == NULL) { if (rtnl_neigh_get_lladdr(neigh) == NULL) return; neigh_handler->found_ll_addr = nl_addr_clone(rtnl_neigh_get_lladdr(neigh)); } } } static int get_neigh_cb(struct nl_msg *msg, void *arg) { struct get_neigh_handler *neigh_handler = (struct get_neigh_handler *)arg; if (nl_msg_parse(msg, &get_neigh_cb_event, neigh_handler) < 0) errno = ENOMSG; return NL_OK; } static void set_neigh_filter(struct get_neigh_handler *neigh_handler, struct rtnl_neigh *filter) { neigh_handler->filter_neigh = filter; } static struct rtnl_neigh *create_filter_neigh_for_dst(struct nl_addr *dst_addr, int oif) { struct rtnl_neigh *filter_neigh; filter_neigh = rtnl_neigh_alloc(); if (filter_neigh == NULL) return NULL; rtnl_neigh_set_ifindex(filter_neigh, oif); rtnl_neigh_set_dst(filter_neigh, dst_addr); return filter_neigh; } #define PORT_DISCARD htobe16(9) #define SEND_PAYLOAD "H" static int create_socket(struct get_neigh_handler *neigh_handler, struct skt *addr_dst, int *psock_fd) { int err; struct skt addr_src; int sock_fd; memset(addr_dst, 0, sizeof(*addr_dst)); memset(&addr_src, 0, sizeof(addr_src)); addr_src.len = sizeof(addr_src.sktaddr); err = nl_addr_fill_sockaddr(neigh_handler->src, &addr_src.sktaddr.s, &addr_src.len); if (err) { errno = EADDRNOTAVAIL; return -1; } addr_dst->len = sizeof(addr_dst->sktaddr); err = nl_addr_fill_sockaddr(neigh_handler->dst, &addr_dst->sktaddr.s, &addr_dst->len); if (err) { errno = EADDRNOTAVAIL; return -1; } err = set_link_port(&addr_dst->sktaddr, PORT_DISCARD, neigh_handler->oif); if (err) return -1; sock_fd = socket(addr_dst->sktaddr.s.sa_family, SOCK_DGRAM | SOCK_CLOEXEC, 0); if (sock_fd == -1) return -1; err = bind(sock_fd, &addr_src.sktaddr.s, addr_src.len); if (err) { close(sock_fd); return -1; } *psock_fd = sock_fd; return 0; } #define NUM_OF_RETRIES 10 #define NUM_OF_TRIES ((NUM_OF_RETRIES) + 1) #if NUM_OF_TRIES < 1 #error "neigh: invalid value of NUM_OF_RETRIES" #endif static int create_timer(struct get_neigh_handler *neigh_handler) { int user_timeout = neigh_handler->timeout/NUM_OF_TRIES; struct timespec timeout = { .tv_sec = user_timeout / 1000, .tv_nsec = (user_timeout % 1000) * 1000000 }; struct itimerspec timer_time = {.it_value = timeout}; int timer_fd; timer_fd = timerfd_create(CLOCK_MONOTONIC, TFD_NONBLOCK | TFD_CLOEXEC); if (timer_fd == -1) return timer_fd; if (neigh_handler->timeout) { if (NUM_OF_TRIES <= 1) bzero(&timer_time.it_interval, sizeof(timer_time.it_interval)); else timer_time.it_interval = timeout; if (timerfd_settime(timer_fd, 0, &timer_time, NULL)) { close(timer_fd); return -1; } } return timer_fd; } #define UDP_SOCKET_MAX_SENDTO 100000ULL static int try_send_to(int sock_fd, void *buff, size_t buf_size, struct skt *addr_dst) { uint64_t max_count = UDP_SOCKET_MAX_SENDTO; int err; do { err = sendto(sock_fd, buff, buf_size, 0, 
&addr_dst->sktaddr.s, addr_dst->len); if (err > 0) err = 0; } while (-1 == err && EADDRNOTAVAIL == errno && --max_count); return err; } static struct nl_addr *process_get_neigh_mac( struct get_neigh_handler *neigh_handler) { int err; struct nl_addr *ll_addr = get_neigh_mac(neigh_handler); struct rtnl_neigh *neigh_filter; fd_set fdset; int sock_fd; int fd; int nfds; int timer_fd; int ret; struct skt addr_dst; char buff[sizeof(SEND_PAYLOAD)] = SEND_PAYLOAD; int retries = 0; if (NULL != ll_addr) return ll_addr; err = nl_socket_add_membership(neigh_handler->sock, RTNLGRP_NEIGH); if (err < 0) return NULL; neigh_filter = create_filter_neigh_for_dst(neigh_handler->dst, neigh_handler->oif); if (neigh_filter == NULL) return NULL; set_neigh_filter(neigh_handler, neigh_filter); nl_socket_disable_seq_check(neigh_handler->sock); nl_socket_modify_cb(neigh_handler->sock, NL_CB_VALID, NL_CB_CUSTOM, &get_neigh_cb, neigh_handler); fd = nl_socket_get_fd(neigh_handler->sock); err = create_socket(neigh_handler, &addr_dst, &sock_fd); if (err) return NULL; err = try_send_to(sock_fd, buff, sizeof(buff), &addr_dst); if (err) goto close_socket; timer_fd = create_timer(neigh_handler); if (timer_fd < 0) goto close_socket; nfds = max(fd, timer_fd) + 1; while (1) { FD_ZERO(&fdset); FD_SET(fd, &fdset); FD_SET(timer_fd, &fdset); /* wait for an incoming message on the netlink socket */ ret = select(nfds, &fdset, NULL, NULL, NULL); if (ret == -1) { goto select_err; } else if (ret) { if (FD_ISSET(fd, &fdset)) { nl_recvmsgs_default(neigh_handler->sock); if (neigh_handler->found_ll_addr) break; } else { nl_cache_refill(neigh_handler->sock, neigh_handler->neigh_cache); ll_addr = get_neigh_mac(neigh_handler); if (NULL != ll_addr) { break; } else if (FD_ISSET(timer_fd, &fdset) && retries < NUM_OF_RETRIES) { try_send_to(sock_fd, buff, sizeof(buff), &addr_dst); } } if (FD_ISSET(timer_fd, &fdset)) { uint64_t read_val; ssize_t __attribute__((unused)) rc; rc = read(timer_fd, &read_val, sizeof(read_val)); assert(rc == sizeof(read_val)); if (++retries >= NUM_OF_TRIES) { if (!errno) errno = EDESTADDRREQ; break; } } } } select_err: close(timer_fd); close_socket: close(sock_fd); return ll_addr ? ll_addr : neigh_handler->found_ll_addr; } static int get_mcast_mac_ipv4(struct nl_addr *dst, struct nl_addr **ll_addr) { uint8_t mac_addr[6] = {0x01, 0x00, 0x5E}; uint32_t addr = be32toh(*(__be32 *)nl_addr_get_binary_addr(dst)); mac_addr[5] = addr & 0xFF; addr >>= 8; mac_addr[4] = addr & 0xFF; addr >>= 8; mac_addr[3] = addr & 0x7F; *ll_addr = nl_addr_build(AF_LLC, mac_addr, sizeof(mac_addr)); return *ll_addr == NULL ? -EINVAL : 0; } static int get_mcast_mac_ipv6(struct nl_addr *dst, struct nl_addr **ll_addr) { uint8_t mac_addr[6] = {0x33, 0x33}; memcpy(mac_addr + 2, (uint8_t *)nl_addr_get_binary_addr(dst) + 12, 4); *ll_addr = nl_addr_build(AF_LLC, mac_addr, sizeof(mac_addr)); return *ll_addr == NULL ? -EINVAL : 0; } static int get_link_local_mac_ipv6(struct nl_addr *dst, struct nl_addr **ll_addr) { uint8_t mac_addr[6]; memcpy(mac_addr + 3, (uint8_t *)nl_addr_get_binary_addr(dst) + 13, 3); memcpy(mac_addr, (uint8_t *)nl_addr_get_binary_addr(dst) + 8, 3); mac_addr[0] ^= 2; *ll_addr = nl_addr_build(AF_LLC, mac_addr, sizeof(mac_addr)); return *ll_addr == NULL ? 
-EINVAL : 0; } static const struct encoded_l3_addr { short family; uint8_t prefix_bits; const uint8_t data[16]; int (*getter)(struct nl_addr *dst, struct nl_addr **ll_addr); } encoded_prefixes[] = { {.family = AF_INET, .prefix_bits = 4, .data = {0xe0}, .getter = &get_mcast_mac_ipv4}, {.family = AF_INET6, .prefix_bits = 8, .data = {0xff}, .getter = &get_mcast_mac_ipv6}, {.family = AF_INET6, .prefix_bits = 64, .data = {0xfe, 0x80}, .getter = get_link_local_mac_ipv6}, }; static int nl_addr_cmp_prefix_msb(void *addr1, int len1, void *addr2, int len2) { int len = min(len1, len2); int bytes = len / 8; int d = memcmp(addr1, addr2, bytes); if (d == 0) { int mask = ((1UL << (len % 8)) - 1UL) << (8 - len); d = (((uint8_t *)addr1)[bytes] & mask) - (((uint8_t *)addr2)[bytes] & mask); } return d; } static int handle_encoded_mac(struct nl_addr *dst, struct nl_addr **ll_addr) { uint32_t family = nl_addr_get_family(dst); struct nl_addr *prefix = NULL; int i; int ret = 1; for (i = 0; i < sizeof(encoded_prefixes)/sizeof(encoded_prefixes[0]) && ret; prefix = NULL, i++) { if (encoded_prefixes[i].family != family) continue; prefix = nl_addr_build( family, (void *)encoded_prefixes[i].data, min_t(size_t, encoded_prefixes[i].prefix_bits / 8 + !!(encoded_prefixes[i].prefix_bits % 8), sizeof(encoded_prefixes[i].data))); if (prefix == NULL) return -ENOMEM; nl_addr_set_prefixlen(prefix, encoded_prefixes[i].prefix_bits); if (nl_addr_cmp_prefix_msb(nl_addr_get_binary_addr(dst), nl_addr_get_prefixlen(dst), nl_addr_get_binary_addr(prefix), nl_addr_get_prefixlen(prefix))) continue; ret = encoded_prefixes[i].getter(dst, ll_addr); nl_addr_put(prefix); } return ret; } static void get_route_cb_parser(struct nl_object *obj, void *arg) { struct get_neigh_handler *neigh_handler = (struct get_neigh_handler *)arg; struct rtnl_route *route = (struct rtnl_route *)obj; struct nl_addr *gateway = NULL; struct nl_addr *src = rtnl_route_get_pref_src(route); int oif; int type = rtnl_route_get_type(route); struct rtnl_link *link; struct rtnl_nexthop *nh = rtnl_route_nexthop_n(route, 0); if (nh != NULL) gateway = rtnl_route_nh_get_gateway(nh); oif = rtnl_route_nh_get_ifindex(nh); if (gateway) { nl_addr_put(neigh_handler->dst); neigh_handler->dst = nl_addr_clone(gateway); } if (RTN_BLACKHOLE == type || RTN_UNREACHABLE == type || RTN_PROHIBIT == type || RTN_THROW == type) { errno = ENETUNREACH; goto err; } if (!neigh_handler->src && src) neigh_handler->src = nl_addr_clone(src); if (neigh_handler->oif < 0 && oif > 0) neigh_handler->oif = oif; /* Link Local */ if (RTN_LOCAL == type) { struct nl_addr *lladdr; link = rtnl_link_get(neigh_handler->link_cache, neigh_handler->oif); if (link == NULL) goto err; lladdr = rtnl_link_get_addr(link); if (lladdr == NULL) goto err_link; neigh_handler->found_ll_addr = nl_addr_clone(lladdr); rtnl_link_put(link); } else { handle_encoded_mac( neigh_handler->dst, &neigh_handler->found_ll_addr); } return; err_link: rtnl_link_put(link); err: if (neigh_handler->src) { nl_addr_put(neigh_handler->src); neigh_handler->src = NULL; } } static int get_route_cb(struct nl_msg *msg, void *arg) { struct get_neigh_handler *neigh_handler = (struct get_neigh_handler *)arg; int err; err = nl_msg_parse(msg, &get_route_cb_parser, neigh_handler); if (err < 0) { errno = ENOMSG; return err; } if (!neigh_handler->dst || !neigh_handler->src || neigh_handler->oif <= 0) { errno = EINVAL; return -1; } if (NULL != neigh_handler->found_ll_addr) goto found; neigh_handler->found_ll_addr = process_get_neigh_mac(neigh_handler); found: return 
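/* 0 only if either the route itself (local link or encoded multicast)
 * or the subsequent neighbour probe produced an L2 address */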
neigh_handler->found_ll_addr ? 0 : -1; } int neigh_get_oif_from_src(struct get_neigh_handler *neigh_handler) { int oif = -ENODEV; struct addrinfo *src_info; int err; err = nl_addr_info(neigh_handler->src, &src_info); if (err) { if (!errno) errno = ENXIO; return oif; } oif = get_ifindex(src_info->ai_addr); if (oif <= 0) goto free; free: freeaddrinfo(src_info); return oif; } int neigh_init_resources(struct get_neigh_handler *neigh_handler, int timeout) { int err; neigh_handler->sock = nl_socket_alloc(); if (neigh_handler->sock == NULL) { errno = ENOSYS; return -ENOSYS; } err = nl_connect(neigh_handler->sock, NETLINK_ROUTE); if (err < 0) goto free_socket; err = rtnl_link_alloc_cache(neigh_handler->sock, AF_UNSPEC, &neigh_handler->link_cache); if (err) { err = -1; errno = ENOMEM; goto free_socket; } nl_cache_mngt_provide(neigh_handler->link_cache); err = rtnl_route_alloc_cache(neigh_handler->sock, AF_UNSPEC, 0, &neigh_handler->route_cache); if (err) { err = -1; errno = ENOMEM; goto free_link_cache; } nl_cache_mngt_provide(neigh_handler->route_cache); err = rtnl_neigh_alloc_cache(neigh_handler->sock, &neigh_handler->neigh_cache); if (err) { err = -ENOMEM; goto free_route_cache; } nl_cache_mngt_provide(neigh_handler->neigh_cache); /* init structure */ neigh_handler->timeout = timeout; neigh_handler->oif = -1; neigh_handler->filter_neigh = NULL; neigh_handler->found_ll_addr = NULL; neigh_handler->dst = NULL; neigh_handler->src = NULL; neigh_handler->vid = -1; return 0; free_route_cache: nl_cache_mngt_unprovide(neigh_handler->route_cache); nl_cache_free(neigh_handler->route_cache); neigh_handler->route_cache = NULL; free_link_cache: nl_cache_mngt_unprovide(neigh_handler->link_cache); nl_cache_free(neigh_handler->link_cache); neigh_handler->link_cache = NULL; free_socket: nl_socket_free(neigh_handler->sock); neigh_handler->sock = NULL; return err; } uint16_t neigh_get_vlan_id_from_dev(struct get_neigh_handler *neigh_handler) { struct rtnl_link *link; int vid = 0xffff; link = rtnl_link_get(neigh_handler->link_cache, neigh_handler->oif); if (link == NULL) { errno = EINVAL; return vid; } if (rtnl_link_is_vlan(link)) vid = rtnl_link_vlan_get_id(link); rtnl_link_put(link); return vid >= 0 && vid <= 0xfff ? 
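/* 0xffff is the "no VLAN" sentinel; valid 802.1Q IDs are 0..0xfff */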
vid : 0xffff; } void neigh_set_vlan_id(struct get_neigh_handler *neigh_handler, uint16_t vid) { if (vid <= 0xfff) neigh_handler->vid = vid; } int neigh_set_dst(struct get_neigh_handler *neigh_handler, int family, void *buf, size_t size) { neigh_handler->dst = nl_addr_build(family, buf, size); return neigh_handler->dst == NULL; } int neigh_set_src(struct get_neigh_handler *neigh_handler, int family, void *buf, size_t size) { neigh_handler->src = nl_addr_build(family, buf, size); return neigh_handler->src == NULL; } void neigh_set_oif(struct get_neigh_handler *neigh_handler, int oif) { neigh_handler->oif = oif; } int neigh_get_ll(struct get_neigh_handler *neigh_handler, void *addr_buff, int addr_size) { int neigh_len; if (neigh_handler->found_ll_addr == NULL) return -EINVAL; neigh_len = nl_addr_get_len(neigh_handler->found_ll_addr); if (neigh_len > addr_size) return -EINVAL; memcpy(addr_buff, nl_addr_get_binary_addr(neigh_handler->found_ll_addr), neigh_len); return neigh_len; } void neigh_free_resources(struct get_neigh_handler *neigh_handler) { /* Should be released first because it's holding a reference to dst */ if (neigh_handler->filter_neigh != NULL) { rtnl_neigh_put(neigh_handler->filter_neigh); neigh_handler->filter_neigh = NULL; } if (neigh_handler->src != NULL) { nl_addr_put(neigh_handler->src); neigh_handler->src = NULL; } if (neigh_handler->dst != NULL) { nl_addr_put(neigh_handler->dst); neigh_handler->dst = NULL; } if (neigh_handler->found_ll_addr != NULL) { nl_addr_put(neigh_handler->found_ll_addr); neigh_handler->found_ll_addr = NULL; } if (neigh_handler->neigh_cache != NULL) { nl_cache_mngt_unprovide(neigh_handler->neigh_cache); nl_cache_free(neigh_handler->neigh_cache); neigh_handler->neigh_cache = NULL; } if (neigh_handler->route_cache != NULL) { nl_cache_mngt_unprovide(neigh_handler->route_cache); nl_cache_free(neigh_handler->route_cache); neigh_handler->route_cache = NULL; } if (neigh_handler->link_cache != NULL) { nl_cache_mngt_unprovide(neigh_handler->link_cache); nl_cache_free(neigh_handler->link_cache); neigh_handler->link_cache = NULL; } if (neigh_handler->sock != NULL) { nl_socket_free(neigh_handler->sock); neigh_handler->sock = NULL; } } int process_get_neigh(struct get_neigh_handler *neigh_handler) { struct nl_msg *m; struct rtmsg rmsg = { .rtm_family = nl_addr_get_family(neigh_handler->dst), .rtm_dst_len = nl_addr_get_prefixlen(neigh_handler->dst), }; int err; m = nlmsg_alloc_simple(RTM_GETROUTE, 0); if (m == NULL) return -ENOMEM; nlmsg_append(m, &rmsg, sizeof(rmsg), NLMSG_ALIGNTO); NLA_PUT_ADDR(m, RTA_DST, neigh_handler->dst); if (neigh_handler->oif > 0) NLA_PUT_U32(m, RTA_OIF, neigh_handler->oif); err = nl_send_auto(neigh_handler->sock, m); nlmsg_free(m); if (err < 0) return err; nl_socket_modify_cb(neigh_handler->sock, NL_CB_VALID, NL_CB_CUSTOM, &get_route_cb, neigh_handler); err = nl_recvmsgs_default(neigh_handler->sock); return err; nla_put_failure: nlmsg_free(m); return -ENOMEM; } rdma-core-56.1/libibverbs/neigh.h000066400000000000000000000024541477342711600167250ustar00rootroot00000000000000/* Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md */ #ifndef _NEIGH_H_ #define _NEIGH_H_ #include #include #include "config.h" #include struct get_neigh_handler { struct nl_sock *sock; struct nl_cache *link_cache; struct nl_cache *neigh_cache; struct nl_cache *route_cache; int32_t oif; int vid; struct rtnl_neigh *filter_neigh; struct nl_addr *found_ll_addr; struct nl_addr *dst; struct nl_addr *src; uint64_t timeout; }; int 
process_get_neigh(struct get_neigh_handler *neigh_handler); void neigh_free_resources(struct get_neigh_handler *neigh_handler); void neigh_set_vlan_id(struct get_neigh_handler *neigh_handler, uint16_t vid); uint16_t neigh_get_vlan_id_from_dev(struct get_neigh_handler *neigh_handler); int neigh_init_resources(struct get_neigh_handler *neigh_handler, int timeout); int neigh_set_src(struct get_neigh_handler *neigh_handler, int family, void *buf, size_t size); void neigh_set_oif(struct get_neigh_handler *neigh_handler, int oif); int neigh_set_dst(struct get_neigh_handler *neigh_handler, int family, void *buf, size_t size); int neigh_get_oif_from_src(struct get_neigh_handler *neigh_handler); int neigh_get_ll(struct get_neigh_handler *neigh_handler, void *addr_buf, int addr_size); #endif rdma-core-56.1/libibverbs/opcode.h000066400000000000000000000130761477342711600171060ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef INFINIBAND_OPCODE_H #define INFINIBAND_OPCODE_H /* * This macro cleans up the definitions of constants for BTH opcodes. * It is used to define constants such as IBV_OPCODE_UD_SEND_ONLY, * which becomes IBV_OPCODE_UD + IBV_OPCODE_SEND_ONLY, and this gives * the correct value. * * In short, user code should use the constants defined using the * macro rather than worrying about adding together other constants. 
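 *
 * For example, IBV_OPCODE(UD, SEND_ONLY) defines IBV_OPCODE_UD_SEND_ONLY
 * as IBV_OPCODE_UD + IBV_OPCODE_SEND_ONLY, i.e. 0x60 + 0x04 = 0x64, which
 * is the BTH opcode carried on the wire for a UD send.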
*/ #define IBV_OPCODE(transport, op) \ IBV_OPCODE_ ## transport ## _ ## op = \ IBV_OPCODE_ ## transport + IBV_OPCODE_ ## op enum { /* transport types -- just used to define real constants */ IBV_OPCODE_RC = 0x00, IBV_OPCODE_UC = 0x20, IBV_OPCODE_RD = 0x40, IBV_OPCODE_UD = 0x60, /* operations -- just used to define real constants */ IBV_OPCODE_SEND_FIRST = 0x00, IBV_OPCODE_SEND_MIDDLE = 0x01, IBV_OPCODE_SEND_LAST = 0x02, IBV_OPCODE_SEND_LAST_WITH_IMMEDIATE = 0x03, IBV_OPCODE_SEND_ONLY = 0x04, IBV_OPCODE_SEND_ONLY_WITH_IMMEDIATE = 0x05, IBV_OPCODE_RDMA_WRITE_FIRST = 0x06, IBV_OPCODE_RDMA_WRITE_MIDDLE = 0x07, IBV_OPCODE_RDMA_WRITE_LAST = 0x08, IBV_OPCODE_RDMA_WRITE_LAST_WITH_IMMEDIATE = 0x09, IBV_OPCODE_RDMA_WRITE_ONLY = 0x0a, IBV_OPCODE_RDMA_WRITE_ONLY_WITH_IMMEDIATE = 0x0b, IBV_OPCODE_RDMA_READ_REQUEST = 0x0c, IBV_OPCODE_RDMA_READ_RESPONSE_FIRST = 0x0d, IBV_OPCODE_RDMA_READ_RESPONSE_MIDDLE = 0x0e, IBV_OPCODE_RDMA_READ_RESPONSE_LAST = 0x0f, IBV_OPCODE_RDMA_READ_RESPONSE_ONLY = 0x10, IBV_OPCODE_ACKNOWLEDGE = 0x11, IBV_OPCODE_ATOMIC_ACKNOWLEDGE = 0x12, IBV_OPCODE_COMPARE_SWAP = 0x13, IBV_OPCODE_FETCH_ADD = 0x14, /* real constants follow -- see comment about above IBV_OPCODE() macro for more details */ /* RC */ IBV_OPCODE(RC, SEND_FIRST), IBV_OPCODE(RC, SEND_MIDDLE), IBV_OPCODE(RC, SEND_LAST), IBV_OPCODE(RC, SEND_LAST_WITH_IMMEDIATE), IBV_OPCODE(RC, SEND_ONLY), IBV_OPCODE(RC, SEND_ONLY_WITH_IMMEDIATE), IBV_OPCODE(RC, RDMA_WRITE_FIRST), IBV_OPCODE(RC, RDMA_WRITE_MIDDLE), IBV_OPCODE(RC, RDMA_WRITE_LAST), IBV_OPCODE(RC, RDMA_WRITE_LAST_WITH_IMMEDIATE), IBV_OPCODE(RC, RDMA_WRITE_ONLY), IBV_OPCODE(RC, RDMA_WRITE_ONLY_WITH_IMMEDIATE), IBV_OPCODE(RC, RDMA_READ_REQUEST), IBV_OPCODE(RC, RDMA_READ_RESPONSE_FIRST), IBV_OPCODE(RC, RDMA_READ_RESPONSE_MIDDLE), IBV_OPCODE(RC, RDMA_READ_RESPONSE_LAST), IBV_OPCODE(RC, RDMA_READ_RESPONSE_ONLY), IBV_OPCODE(RC, ACKNOWLEDGE), IBV_OPCODE(RC, ATOMIC_ACKNOWLEDGE), IBV_OPCODE(RC, COMPARE_SWAP), IBV_OPCODE(RC, FETCH_ADD), /* UC */ IBV_OPCODE(UC, SEND_FIRST), IBV_OPCODE(UC, SEND_MIDDLE), IBV_OPCODE(UC, SEND_LAST), IBV_OPCODE(UC, SEND_LAST_WITH_IMMEDIATE), IBV_OPCODE(UC, SEND_ONLY), IBV_OPCODE(UC, SEND_ONLY_WITH_IMMEDIATE), IBV_OPCODE(UC, RDMA_WRITE_FIRST), IBV_OPCODE(UC, RDMA_WRITE_MIDDLE), IBV_OPCODE(UC, RDMA_WRITE_LAST), IBV_OPCODE(UC, RDMA_WRITE_LAST_WITH_IMMEDIATE), IBV_OPCODE(UC, RDMA_WRITE_ONLY), IBV_OPCODE(UC, RDMA_WRITE_ONLY_WITH_IMMEDIATE), /* RD */ IBV_OPCODE(RD, SEND_FIRST), IBV_OPCODE(RD, SEND_MIDDLE), IBV_OPCODE(RD, SEND_LAST), IBV_OPCODE(RD, SEND_LAST_WITH_IMMEDIATE), IBV_OPCODE(RD, SEND_ONLY), IBV_OPCODE(RD, SEND_ONLY_WITH_IMMEDIATE), IBV_OPCODE(RD, RDMA_WRITE_FIRST), IBV_OPCODE(RD, RDMA_WRITE_MIDDLE), IBV_OPCODE(RD, RDMA_WRITE_LAST), IBV_OPCODE(RD, RDMA_WRITE_LAST_WITH_IMMEDIATE), IBV_OPCODE(RD, RDMA_WRITE_ONLY), IBV_OPCODE(RD, RDMA_WRITE_ONLY_WITH_IMMEDIATE), IBV_OPCODE(RD, RDMA_READ_REQUEST), IBV_OPCODE(RD, RDMA_READ_RESPONSE_FIRST), IBV_OPCODE(RD, RDMA_READ_RESPONSE_MIDDLE), IBV_OPCODE(RD, RDMA_READ_RESPONSE_LAST), IBV_OPCODE(RD, RDMA_READ_RESPONSE_ONLY), IBV_OPCODE(RD, ACKNOWLEDGE), IBV_OPCODE(RD, ATOMIC_ACKNOWLEDGE), IBV_OPCODE(RD, COMPARE_SWAP), IBV_OPCODE(RD, FETCH_ADD), /* UD */ IBV_OPCODE(UD, SEND_ONLY), IBV_OPCODE(UD, SEND_ONLY_WITH_IMMEDIATE) }; #endif /* INFINIBAND_OPCODE_H */ rdma-core-56.1/libibverbs/sa-kern-abi.h000066400000000000000000000031261477342711600177210ustar00rootroot00000000000000/* * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef INFINIBAND_SA_KERN_ABI_H #define INFINIBAND_SA_KERN_ABI_H #warning "This header is obsolete, use rdma/ib_user_sa.h instead" #include #define ib_kern_path_rec ib_user_path_rec #define ibv_kern_path_rec ib_user_path_rec #endif rdma-core-56.1/libibverbs/sa.h000066400000000000000000000077521477342711600162440ustar00rootroot00000000000000/* * Copyright (c) 2004 Topspin Communications. All rights reserved. * Copyright (c) 2005 Voltaire, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef INFINIBAND_SA_H #define INFINIBAND_SA_H #include #include struct ibv_sa_path_rec { /* reserved */ /* reserved */ union ibv_gid dgid; union ibv_gid sgid; __be16 dlid; __be16 slid; int raw_traffic; /* reserved */ __be32 flow_label; uint8_t hop_limit; uint8_t traffic_class; int reversible; uint8_t numb_path; __be16 pkey; /* reserved */ uint8_t sl; uint8_t mtu_selector; uint8_t mtu; uint8_t rate_selector; uint8_t rate; uint8_t packet_life_time_selector; uint8_t packet_life_time; uint8_t preference; }; struct ibv_sa_mcmember_rec { union ibv_gid mgid; union ibv_gid port_gid; uint32_t qkey; uint16_t mlid; uint8_t mtu_selector; uint8_t mtu; uint8_t traffic_class; uint16_t pkey; uint8_t rate_selector; uint8_t rate; uint8_t packet_life_time_selector; uint8_t packet_life_time; uint8_t sl; uint32_t flow_label; uint8_t hop_limit; uint8_t scope; uint8_t join_state; int proxy_join; }; struct ibv_sa_service_rec { uint64_t id; union ibv_gid gid; uint16_t pkey; /* uint16_t resv; */ uint32_t lease; uint8_t key[16]; uint8_t name[64]; uint8_t data8[16]; uint16_t data16[8]; uint32_t data32[4]; uint64_t data64[2]; }; #define IBV_PATH_RECORD_REVERSIBLE 0x80 struct ibv_path_record { __be64 service_id; union ibv_gid dgid; union ibv_gid sgid; __be16 dlid; __be16 slid; __be32 flowlabel_hoplimit; /* resv-31:28 flow label-27:8 hop limit-7:0*/ uint8_t tclass; uint8_t reversible_numpath; /* reversible-7:7 num path-6:0 */ __be16 pkey; __be16 qosclass_sl; /* qos class-15:4 sl-3:0 */ uint8_t mtu; /* mtu selector-7:6 mtu-5:0 */ uint8_t rate; /* rate selector-7:6 rate-5:0 */ uint8_t packetlifetime; /* lifetime selector-7:6 lifetime-5:0 */ uint8_t preference; uint8_t reserved[6]; }; #define IBV_PATH_FLAG_GMP (1<<0) #define IBV_PATH_FLAG_PRIMARY (1<<1) #define IBV_PATH_FLAG_ALTERNATE (1<<2) #define IBV_PATH_FLAG_OUTBOUND (1<<3) #define IBV_PATH_FLAG_INBOUND (1<<4) #define IBV_PATH_FLAG_INBOUND_REVERSE (1<<5) #define IBV_PATH_FLAG_BIDIRECTIONAL (IBV_PATH_FLAG_OUTBOUND | \ IBV_PATH_FLAG_INBOUND_REVERSE) struct ibv_path_data { uint32_t flags; uint32_t reserved; struct ibv_path_record path; }; #endif /* INFINIBAND_SA_H */ rdma-core-56.1/libibverbs/static_driver.c000066400000000000000000000040111477342711600204570ustar00rootroot00000000000000/* * Copyright (c) 2018 Mellanox Technologies, Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifdef _STATIC_LIBRARY_BUILD_ #define RDMA_STATIC_PROVIDERS none #include #include const struct verbs_device_ops verbs_provider_none; void ibv_static_providers(void *unused, ...) { /* * We do not need to do anything with the VA_ARGs since we continue to * rely on the constructor attribute and simply referencing the * verbs_provider_X symbol will be enough to trigger the constructor. * * This would need to actually check and do the registration for * specialty cases like LTO or section-gc which may not work with the * constructor scheme. */ } #endif rdma-core-56.1/libibverbs/sysfs.c000066400000000000000000000066371477342711600170040ustar00rootroot00000000000000/* * Copyright (c) 2006 Cisco Systems, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include "ibverbs.h" static const char *sysfs_path; const char *ibv_get_sysfs_path(void) { const char *env = NULL; if (sysfs_path) return sysfs_path; /* * Only follow use path passed in through the calling user's * environment if we're not running SUID. */ if (getuid() == geteuid()) env = getenv("SYSFS_PATH"); if (env) { int len; char *dup; sysfs_path = dup = strndup(env, IBV_SYSFS_PATH_MAX); len = strlen(dup); while (len > 0 && dup[len - 1] == '/') { --len; dup[len] = '\0'; } } else sysfs_path = "/sys"; return sysfs_path; } int ibv_read_sysfs_file_at(int dirfd, const char *file, char *buf, size_t size) { ssize_t len; int fd; fd = openat(dirfd, file, O_RDONLY | O_CLOEXEC); if (fd < 0) return -1; len = read(fd, buf, size); close(fd); if (len > 0) { if (buf[len - 1] == '\n') buf[--len] = '\0'; else if (len < size) buf[len] = '\0'; else /* We would have to truncate the contents to NULL * terminate, so we are going to fail no matter * what we do, either right now or later when * we pass around an unterminated string. Fail now. 
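 * This branch is reached only when the file filled the buffer exactly
 * and its last byte was not a newline.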
*/ return -1; } return len; } int ibv_read_sysfs_file(const char *dir, const char *file, char *buf, size_t size) { char *path; int res; if (asprintf(&path, "%s/%s", dir, file) < 0) return -1; res = ibv_read_sysfs_file_at(AT_FDCWD, path, buf, size); free(path); return res; } int ibv_read_ibdev_sysfs_file(char *buf, size_t size, struct verbs_sysfs_dev *sysfs_dev, const char *fnfmt, ...) { char *path; va_list va; int res; if (!sysfs_dev) { errno = EINVAL; return -1; } va_start(va, fnfmt); if (vasprintf(&path, fnfmt, va) < 0) { va_end(va); return -1; } va_end(va); res = ibv_read_sysfs_file(sysfs_dev->ibdev_path, path, buf, size); free(path); return res; } rdma-core-56.1/libibverbs/tm_types.h000066400000000000000000000036771477342711600175070ustar00rootroot00000000000000/* * Copyright (c) 2017 Mellanox Technologies Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #ifndef _TM_TYPES_H #define _TM_TYPES_H #include #include #ifdef __cplusplus extern "C" { #endif enum ibv_tmh_op { IBV_TMH_NO_TAG = 0, IBV_TMH_RNDV = 1, IBV_TMH_FIN = 2, IBV_TMH_EAGER = 3, }; struct ibv_tmh { uint8_t opcode; /* from enum ibv_tmh_op */ uint8_t reserved[3]; /* must be zero */ __be32 app_ctx; /* opaque user data */ __be64 tag; }; struct ibv_rvh { __be64 va; __be32 rkey; __be32 len; }; #ifdef __cplusplus } #endif #endif /* _TM_TYPES_H */ rdma-core-56.1/libibverbs/verbs.c000066400000000000000000000656521477342711600167600ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved. * Copyright (c) 2020 Intel Corperation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include "ibverbs.h" #include #include #include "neigh.h" #undef ibv_query_port int __attribute__((const)) ibv_rate_to_mult(enum ibv_rate rate) { switch (rate) { case IBV_RATE_2_5_GBPS: return 1; case IBV_RATE_5_GBPS: return 2; case IBV_RATE_10_GBPS: return 4; case IBV_RATE_20_GBPS: return 8; case IBV_RATE_30_GBPS: return 12; case IBV_RATE_40_GBPS: return 16; case IBV_RATE_60_GBPS: return 24; case IBV_RATE_80_GBPS: return 32; case IBV_RATE_120_GBPS: return 48; case IBV_RATE_28_GBPS: return 11; case IBV_RATE_50_GBPS: return 20; case IBV_RATE_400_GBPS: return 160; case IBV_RATE_600_GBPS: return 240; case IBV_RATE_800_GBPS: return 320; case IBV_RATE_1200_GBPS: return 480; default: return -1; } } enum ibv_rate __attribute__((const)) mult_to_ibv_rate(int mult) { switch (mult) { case 1: return IBV_RATE_2_5_GBPS; case 2: return IBV_RATE_5_GBPS; case 4: return IBV_RATE_10_GBPS; case 8: return IBV_RATE_20_GBPS; case 12: return IBV_RATE_30_GBPS; case 16: return IBV_RATE_40_GBPS; case 24: return IBV_RATE_60_GBPS; case 32: return IBV_RATE_80_GBPS; case 48: return IBV_RATE_120_GBPS; case 11: return IBV_RATE_28_GBPS; case 20: return IBV_RATE_50_GBPS; case 160: return IBV_RATE_400_GBPS; case 240: return IBV_RATE_600_GBPS; case 320: return IBV_RATE_800_GBPS; case 480: return IBV_RATE_1200_GBPS; default: return IBV_RATE_MAX; } } int __attribute__((const)) ibv_rate_to_mbps(enum ibv_rate rate) { switch (rate) { case IBV_RATE_2_5_GBPS: return 2500; case IBV_RATE_5_GBPS: return 5000; case IBV_RATE_10_GBPS: return 10000; case IBV_RATE_20_GBPS: return 20000; case IBV_RATE_30_GBPS: return 30000; case IBV_RATE_40_GBPS: return 40000; case IBV_RATE_60_GBPS: return 60000; case IBV_RATE_80_GBPS: return 80000; case IBV_RATE_120_GBPS: return 120000; case IBV_RATE_14_GBPS: return 14062; case IBV_RATE_56_GBPS: return 56250; case IBV_RATE_112_GBPS: return 112500; case IBV_RATE_168_GBPS: return 168750; case IBV_RATE_25_GBPS: return 25781; case IBV_RATE_100_GBPS: return 103125; case IBV_RATE_200_GBPS: return 206250; case IBV_RATE_300_GBPS: return 309375; case IBV_RATE_28_GBPS: return 28125; case IBV_RATE_50_GBPS: return 53125; case IBV_RATE_400_GBPS: return 425000; case IBV_RATE_600_GBPS: return 637500; case IBV_RATE_800_GBPS: return 850000; case IBV_RATE_1200_GBPS: return 1275000; default: return -1; } } enum ibv_rate __attribute__((const)) mbps_to_ibv_rate(int mbps) { switch (mbps) { case 2500: return IBV_RATE_2_5_GBPS; case 5000: return IBV_RATE_5_GBPS; case 10000: return IBV_RATE_10_GBPS; case 20000: return IBV_RATE_20_GBPS; case 30000: return IBV_RATE_30_GBPS; case 40000: return IBV_RATE_40_GBPS; case 60000: return IBV_RATE_60_GBPS; case 80000: return IBV_RATE_80_GBPS; case 120000: return 
IBV_RATE_120_GBPS; case 14062: return IBV_RATE_14_GBPS; case 56250: return IBV_RATE_56_GBPS; case 112500: return IBV_RATE_112_GBPS; case 168750: return IBV_RATE_168_GBPS; case 25781: return IBV_RATE_25_GBPS; case 103125: return IBV_RATE_100_GBPS; case 206250: return IBV_RATE_200_GBPS; case 309375: return IBV_RATE_300_GBPS; case 28125: return IBV_RATE_28_GBPS; case 53125: return IBV_RATE_50_GBPS; case 425000: return IBV_RATE_400_GBPS; case 637500: return IBV_RATE_600_GBPS; case 850000: return IBV_RATE_800_GBPS; case 1275000: return IBV_RATE_1200_GBPS; default: return IBV_RATE_MAX; } } LATEST_SYMVER_FUNC(ibv_query_device, 1_1, "IBVERBS_1.1", int, struct ibv_context *context, struct ibv_device_attr *device_attr) { return get_ops(context)->query_device_ex( context, NULL, container_of(device_attr, struct ibv_device_attr_ex, orig_attr), sizeof(*device_attr)); } int __lib_query_port(struct ibv_context *context, uint8_t port_num, struct ibv_port_attr *port_attr, size_t port_attr_len) { /* Don't expose this mess to the provider, provide a large enough * temporary buffer if the user buffer is too small. */ if (port_attr_len < sizeof(struct ibv_port_attr)) { struct ibv_port_attr tmp_attr = {}; int rc; rc = get_ops(context)->query_port(context, port_num, &tmp_attr); if (rc) return rc; memcpy(port_attr, &tmp_attr, port_attr_len); return 0; } memset(port_attr, 0, port_attr_len); return get_ops(context)->query_port(context, port_num, port_attr); } struct _compat_ibv_port_attr { enum ibv_port_state state; enum ibv_mtu max_mtu; enum ibv_mtu active_mtu; int gid_tbl_len; uint32_t port_cap_flags; uint32_t max_msg_sz; uint32_t bad_pkey_cntr; uint32_t qkey_viol_cntr; uint16_t pkey_tbl_len; uint16_t lid; uint16_t sm_lid; uint8_t lmc; uint8_t max_vl_num; uint8_t sm_sl; uint8_t subnet_timeout; uint8_t init_type_reply; uint8_t active_width; uint8_t active_speed; uint8_t phys_state; uint8_t link_layer; uint8_t flags; }; LATEST_SYMVER_FUNC(ibv_query_port, 1_1, "IBVERBS_1.1", int, struct ibv_context *context, uint8_t port_num, struct _compat_ibv_port_attr *port_attr) { return __lib_query_port(context, port_num, (struct ibv_port_attr *)port_attr, sizeof(*port_attr)); } LATEST_SYMVER_FUNC(ibv_query_gid, 1_1, "IBVERBS_1.1", int, struct ibv_context *context, uint8_t port_num, int index, union ibv_gid *gid) { struct ibv_gid_entry entry = {}; int ret; ret = __ibv_query_gid_ex(context, port_num, index, &entry, 0, sizeof(entry), VERBS_QUERY_GID_ATTR_GID); /* Preserve API behavior for empty GID */ if (ret == ENODATA) { memset(gid, 0, sizeof(*gid)); return 0; } if (ret) return -1; memcpy(gid, &entry.gid, sizeof(entry.gid)); return 0; } LATEST_SYMVER_FUNC(ibv_query_pkey, 1_1, "IBVERBS_1.1", int, struct ibv_context *context, uint8_t port_num, int index, __be16 *pkey) { struct verbs_device *verbs_device = verbs_get_device(context->device); char attr[8]; uint16_t val; if (ibv_read_ibdev_sysfs_file(attr, sizeof(attr), verbs_device->sysfs, "ports/%d/pkeys/%d", port_num, index) < 0) return -1; if (sscanf(attr, "%hx", &val) != 1) return -1; *pkey = htobe16(val); return 0; } LATEST_SYMVER_FUNC(ibv_get_pkey_index, 1_5, "IBVERBS_1.5", int, struct ibv_context *context, uint8_t port_num, __be16 pkey) { __be16 pkey_i; int i, ret; for (i = 0; ; i++) { ret = ibv_query_pkey(context, port_num, i, &pkey_i); if (ret < 0) return ret; if (pkey == pkey_i) return i; } } LATEST_SYMVER_FUNC(ibv_alloc_pd, 1_1, "IBVERBS_1.1", struct ibv_pd *, struct ibv_context *context) { struct ibv_pd *pd; pd = get_ops(context)->alloc_pd(context); if (pd) pd->context = 
context; return pd; } LATEST_SYMVER_FUNC(ibv_dealloc_pd, 1_1, "IBVERBS_1.1", int, struct ibv_pd *pd) { return get_ops(pd->context)->dealloc_pd(pd); } struct ibv_mr *ibv_reg_mr_iova2(struct ibv_pd *pd, void *addr, size_t length, uint64_t iova, unsigned int access) { struct verbs_device *device = verbs_get_device(pd->context->device); bool odp_mr = access & IBV_ACCESS_ON_DEMAND; struct ibv_mr *mr; if (!(device->core_support & IB_UVERBS_CORE_SUPPORT_OPTIONAL_MR_ACCESS)) access &= ~IBV_ACCESS_OPTIONAL_RANGE; if (!odp_mr && ibv_dontfork_range(addr, length)) return NULL; mr = get_ops(pd->context)->reg_mr(pd, addr, length, iova, access); if (mr) { mr->context = pd->context; mr->pd = pd; mr->addr = addr; mr->length = length; } else { if (!odp_mr) ibv_dofork_range(addr, length); } return mr; } #undef ibv_reg_mr LATEST_SYMVER_FUNC(ibv_reg_mr, 1_1, "IBVERBS_1.1", struct ibv_mr *, struct ibv_pd *pd, void *addr, size_t length, int access) { return ibv_reg_mr_iova2(pd, addr, length, (uintptr_t)addr, access); } #undef ibv_reg_mr_iova struct ibv_mr *ibv_reg_mr_iova(struct ibv_pd *pd, void *addr, size_t length, uint64_t iova, int access) { return ibv_reg_mr_iova2(pd, addr, length, iova, access); } struct ibv_pd *ibv_import_pd(struct ibv_context *context, uint32_t pd_handle) { return get_ops(context)->import_pd(context, pd_handle); } void ibv_unimport_pd(struct ibv_pd *pd) { get_ops(pd->context)->unimport_pd(pd); } /** * ibv_import_mr - Import a memory region */ struct ibv_mr *ibv_import_mr(struct ibv_pd *pd, uint32_t mr_handle) { return get_ops(pd->context)->import_mr(pd, mr_handle); } /** * ibv_unimport_mr - Unimport a memory region */ void ibv_unimport_mr(struct ibv_mr *mr) { get_ops(mr->context)->unimport_mr(mr); } /** * ibv_import_dm - Import a device memory */ struct ibv_dm *ibv_import_dm(struct ibv_context *context, uint32_t dm_handle) { return get_ops(context)->import_dm(context, dm_handle); } /** * ibv_unimport_dm - Unimport a device memory */ void ibv_unimport_dm(struct ibv_dm *dm) { get_ops(dm->context)->unimport_dm(dm); } struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access) { struct ibv_mr *mr; mr = get_ops(pd->context)->reg_dmabuf_mr(pd, offset, length, iova, fd, access); if (!mr) return NULL; mr->context = pd->context; mr->pd = pd; mr->addr = (void *)(uintptr_t)offset; mr->length = length; return mr; } LATEST_SYMVER_FUNC(ibv_rereg_mr, 1_1, "IBVERBS_1.1", int, struct ibv_mr *mr, int flags, struct ibv_pd *pd, void *addr, size_t length, int access) { int dofork_onfail = 0; int err; void *old_addr; size_t old_len; if (verbs_get_mr(mr)->mr_type != IBV_MR_TYPE_MR) { errno = EINVAL; return IBV_REREG_MR_ERR_INPUT; } if (flags & ~IBV_REREG_MR_FLAGS_SUPPORTED) { errno = EINVAL; return IBV_REREG_MR_ERR_INPUT; } if ((flags & IBV_REREG_MR_CHANGE_TRANSLATION) && (!length || !addr)) { errno = EINVAL; return IBV_REREG_MR_ERR_INPUT; } if (access && !(flags & IBV_REREG_MR_CHANGE_ACCESS)) { errno = EINVAL; return IBV_REREG_MR_ERR_INPUT; } if (flags & IBV_REREG_MR_CHANGE_TRANSLATION) { err = ibv_dontfork_range(addr, length); if (err) return IBV_REREG_MR_ERR_DONT_FORK_NEW; dofork_onfail = 1; } old_addr = mr->addr; old_len = mr->length; err = get_ops(mr->context)->rereg_mr(verbs_get_mr(mr), flags, pd, addr, length, access); if (!err) { if (flags & IBV_REREG_MR_CHANGE_PD) mr->pd = pd; if (flags & IBV_REREG_MR_CHANGE_TRANSLATION) { mr->addr = addr; mr->length = length; err = ibv_dofork_range(old_addr, old_len); if (err) return 
IBV_REREG_MR_ERR_DO_FORK_OLD; } } else { err = IBV_REREG_MR_ERR_CMD; if (dofork_onfail) { if (ibv_dofork_range(addr, length)) err = IBV_REREG_MR_ERR_CMD_AND_DO_FORK_NEW; } } return err; } LATEST_SYMVER_FUNC(ibv_dereg_mr, 1_1, "IBVERBS_1.1", int, struct ibv_mr *mr) { int ret; void *addr = mr->addr; size_t length = mr->length; enum ibv_mr_type type = verbs_get_mr(mr)->mr_type; int access = verbs_get_mr(mr)->access; ret = get_ops(mr->context)->dereg_mr(verbs_get_mr(mr)); if (!ret && type == IBV_MR_TYPE_MR && !(access & IBV_ACCESS_ON_DEMAND)) ibv_dofork_range(addr, length); return ret; } struct ibv_comp_channel *ibv_create_comp_channel(struct ibv_context *context) { struct ibv_create_comp_channel req; struct ib_uverbs_create_comp_channel_resp resp = {}; struct ibv_comp_channel *channel; channel = malloc(sizeof *channel); if (!channel) return NULL; req.core_payload = (struct ib_uverbs_create_comp_channel){}; if (execute_cmd_write(context, IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL, &req, sizeof(req), &resp, sizeof(resp))) { free(channel); return NULL; } channel->context = context; channel->fd = resp.fd; channel->refcnt = 0; return channel; } int ibv_destroy_comp_channel(struct ibv_comp_channel *channel) { struct ibv_context *context; int ret; context = channel->context; pthread_mutex_lock(&context->mutex); if (channel->refcnt) { ret = EBUSY; goto out; } close(channel->fd); free(channel); ret = 0; out: pthread_mutex_unlock(&context->mutex); return ret; } LATEST_SYMVER_FUNC(ibv_create_cq, 1_1, "IBVERBS_1.1", struct ibv_cq *, struct ibv_context *context, int cqe, void *cq_context, struct ibv_comp_channel *channel, int comp_vector) { struct ibv_cq *cq; cq = get_ops(context)->create_cq(context, cqe, channel, comp_vector); if (cq) verbs_init_cq(cq, context, channel, cq_context); return cq; } LATEST_SYMVER_FUNC(ibv_resize_cq, 1_1, "IBVERBS_1.1", int, struct ibv_cq *cq, int cqe) { return get_ops(cq->context)->resize_cq(cq, cqe); } LATEST_SYMVER_FUNC(ibv_destroy_cq, 1_1, "IBVERBS_1.1", int, struct ibv_cq *cq) { struct ibv_comp_channel *channel = cq->channel; int ret; ret = get_ops(cq->context)->destroy_cq(cq); if (channel) { if (!ret) { pthread_mutex_lock(&channel->context->mutex); --channel->refcnt; pthread_mutex_unlock(&channel->context->mutex); } } return ret; } LATEST_SYMVER_FUNC(ibv_get_cq_event, 1_1, "IBVERBS_1.1", int, struct ibv_comp_channel *channel, struct ibv_cq **cq, void **cq_context) { struct ib_uverbs_comp_event_desc ev; if (read(channel->fd, &ev, sizeof ev) != sizeof ev) return -1; *cq = (struct ibv_cq *) (uintptr_t) ev.cq_handle; *cq_context = (*cq)->cq_context; get_ops((*cq)->context)->cq_event(*cq); return 0; } LATEST_SYMVER_FUNC(ibv_ack_cq_events, 1_1, "IBVERBS_1.1", void, struct ibv_cq *cq, unsigned int nevents) { pthread_mutex_lock(&cq->mutex); cq->comp_events_completed += nevents; pthread_cond_signal(&cq->cond); pthread_mutex_unlock(&cq->mutex); } LATEST_SYMVER_FUNC(ibv_create_srq, 1_1, "IBVERBS_1.1", struct ibv_srq *, struct ibv_pd *pd, struct ibv_srq_init_attr *srq_init_attr) { struct ibv_srq *srq; srq = get_ops(pd->context)->create_srq(pd, srq_init_attr); if (srq) { srq->context = pd->context; srq->srq_context = srq_init_attr->srq_context; srq->pd = pd; srq->events_completed = 0; pthread_mutex_init(&srq->mutex, NULL); pthread_cond_init(&srq->cond, NULL); } return srq; } LATEST_SYMVER_FUNC(ibv_modify_srq, 1_1, "IBVERBS_1.1", int, struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, int srq_attr_mask) { return get_ops(srq->context)->modify_srq(srq, srq_attr, srq_attr_mask); } 
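/*
 * Usage sketch (an editor's illustration, not an upstream symbol): the typical
 * consumer loop for the completion-channel verbs above. The name
 * example_handle_cq_event is hypothetical; error handling is minimal.
 */
static int example_handle_cq_event(struct ibv_comp_channel *channel)
{
	struct ibv_cq *ev_cq;
	void *ev_ctx;

	/* Blocks on channel->fd until the kernel queues a completion event */
	if (ibv_get_cq_event(channel, &ev_cq, &ev_ctx))
		return -1;

	/* Unacknowledged events make ibv_destroy_cq() wait; ack promptly */
	ibv_ack_cq_events(ev_cq, 1);

	/* Re-arm so the next completion on this CQ generates another event */
	return ibv_req_notify_cq(ev_cq, 0);
}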
LATEST_SYMVER_FUNC(ibv_query_srq, 1_1, "IBVERBS_1.1", int, struct ibv_srq *srq, struct ibv_srq_attr *srq_attr) { return get_ops(srq->context)->query_srq(srq, srq_attr); } LATEST_SYMVER_FUNC(ibv_destroy_srq, 1_1, "IBVERBS_1.1", int, struct ibv_srq *srq) { return get_ops(srq->context)->destroy_srq(srq); } LATEST_SYMVER_FUNC(ibv_create_qp, 1_1, "IBVERBS_1.1", struct ibv_qp *, struct ibv_pd *pd, struct ibv_qp_init_attr *qp_init_attr) { struct ibv_qp *qp = get_ops(pd->context)->create_qp(pd, qp_init_attr); return qp; } struct ibv_qp_ex *ibv_qp_to_qp_ex(struct ibv_qp *qp) { struct verbs_qp *vqp = (struct verbs_qp *)qp; if (vqp->comp_mask & VERBS_QP_EX) return &vqp->qp_ex; return NULL; } LATEST_SYMVER_FUNC(ibv_query_qp, 1_1, "IBVERBS_1.1", int, struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { int ret; ret = get_ops(qp->context)->query_qp(qp, attr, attr_mask, init_attr); if (ret) return ret; if (attr_mask & IBV_QP_STATE) qp->state = attr->qp_state; return 0; } int ibv_query_qp_data_in_order(struct ibv_qp *qp, enum ibv_wr_opcode op, uint32_t flags) { #if !defined(__i386__) && !defined(__x86_64__) /* Currently this API is only supported for x86 architectures since most * non-x86 platforms are known to be OOO and need to do a per-platform study. */ return 0; #else int result; if (!check_comp_mask(flags, IBV_QUERY_QP_DATA_IN_ORDER_RETURN_CAPS)) return 0; result = get_ops(qp->context)->query_qp_data_in_order(qp, op, flags); if (result & IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG) result |= IBV_QUERY_QP_DATA_IN_ORDER_ALIGNED_128_BYTES; return flags ? result : !!(result & IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG); #endif } LATEST_SYMVER_FUNC(ibv_modify_qp, 1_1, "IBVERBS_1.1", int, struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask) { int ret; ret = get_ops(qp->context)->modify_qp(qp, attr, attr_mask); if (ret) return ret; if (attr_mask & IBV_QP_STATE) qp->state = attr->qp_state; return 0; } LATEST_SYMVER_FUNC(ibv_destroy_qp, 1_1, "IBVERBS_1.1", int, struct ibv_qp *qp) { return get_ops(qp->context)->destroy_qp(qp); } LATEST_SYMVER_FUNC(ibv_create_ah, 1_1, "IBVERBS_1.1", struct ibv_ah *, struct ibv_pd *pd, struct ibv_ah_attr *attr) { struct ibv_ah *ah = get_ops(pd->context)->create_ah(pd, attr); if (ah) { ah->context = pd->context; ah->pd = pd; } return ah; } int ibv_query_gid_type(struct ibv_context *context, uint8_t port_num, unsigned int index, enum ibv_gid_type_sysfs *type) { struct ibv_gid_entry entry = {}; int ret; ret = __ibv_query_gid_ex(context, port_num, index, &entry, 0, sizeof(entry), VERBS_QUERY_GID_ATTR_TYPE); /* Preserve API behavior for empty GID */ if (ret == ENODATA) { *type = IBV_GID_TYPE_SYSFS_IB_ROCE_V1; return 0; } if (ret) return -1; if (entry.gid_type == IBV_GID_TYPE_IB || entry.gid_type == IBV_GID_TYPE_ROCE_V1) *type = IBV_GID_TYPE_SYSFS_IB_ROCE_V1; else *type = IBV_GID_TYPE_SYSFS_ROCE_V2; return 0; } static int ibv_find_gid_index(struct ibv_context *context, uint8_t port_num, union ibv_gid *gid, enum ibv_gid_type_sysfs gid_type) { enum ibv_gid_type_sysfs sgid_type = 0; union ibv_gid sgid; int i = 0, ret; do { ret = ibv_query_gid(context, port_num, i, &sgid); if (!ret) { ret = ibv_query_gid_type(context, port_num, i, &sgid_type); } i++; } while (!ret && (memcmp(&sgid, gid, sizeof(*gid)) || (gid_type != sgid_type))); return ret ? 
ret : i - 1; } static inline void map_ipv4_addr_to_ipv6(__be32 ipv4, struct in6_addr *ipv6) { ipv6->s6_addr32[0] = 0; ipv6->s6_addr32[1] = 0; ipv6->s6_addr32[2] = htobe32(0x0000FFFF); ipv6->s6_addr32[3] = ipv4; } static inline __sum16 ipv4_calc_hdr_csum(uint16_t *data, unsigned int num_hwords) { unsigned int i = 0; uint32_t sum = 0; for (i = 0; i < num_hwords; i++) sum += *(data++); sum = (sum & 0xffff) + (sum >> 16); return (__force __sum16)~sum; } static inline int get_grh_header_version(struct ibv_grh *grh) { int ip6h_version = (be32toh(grh->version_tclass_flow) >> 28) & 0xf; struct iphdr *ip4h = (struct iphdr *)((void *)grh + 20); struct iphdr ip4h_checked; if (ip6h_version != 6) { if (ip4h->version == 4) return 4; errno = EPROTONOSUPPORT; return -1; } /* version may be 6 or 4 */ if (ip4h->ihl != 5) /* IPv4 header length must be 5 for RoCE v2. */ return 6; /* * Verify checksum. * We can't write on scattered buffers so we have to copy to temp * buffer. */ memcpy(&ip4h_checked, ip4h, sizeof(ip4h_checked)); /* Need to set the checksum field (check) to 0 before re-calculating * the checksum. */ ip4h_checked.check = 0; ip4h_checked.check = ipv4_calc_hdr_csum((uint16_t *)&ip4h_checked, 10); /* if IPv4 header checksum is OK, believe it */ if (ip4h->check == ip4h_checked.check) return 4; return 6; } static inline void set_ah_attr_generic_fields(struct ibv_ah_attr *ah_attr, struct ibv_wc *wc, struct ibv_grh *grh, uint8_t port_num) { uint32_t flow_class; flow_class = be32toh(grh->version_tclass_flow); ah_attr->grh.flow_label = flow_class & 0xFFFFF; ah_attr->dlid = wc->slid; ah_attr->sl = wc->sl; ah_attr->src_path_bits = wc->dlid_path_bits; ah_attr->port_num = port_num; } static inline int set_ah_attr_by_ipv4(struct ibv_context *context, struct ibv_ah_attr *ah_attr, struct iphdr *ip4h, uint8_t port_num) { union ibv_gid sgid; int ret; /* No point searching multicast GIDs in GID table */ if (IN_CLASSD(be32toh(ip4h->daddr))) { errno = EINVAL; return -1; } map_ipv4_addr_to_ipv6(ip4h->daddr, (struct in6_addr *)&sgid); ret = ibv_find_gid_index(context, port_num, &sgid, IBV_GID_TYPE_SYSFS_ROCE_V2); if (ret < 0) return ret; map_ipv4_addr_to_ipv6(ip4h->saddr, (struct in6_addr *)&ah_attr->grh.dgid); ah_attr->grh.sgid_index = (uint8_t) ret; ah_attr->grh.hop_limit = ip4h->ttl; ah_attr->grh.traffic_class = ip4h->tos; return 0; } #define IB_NEXT_HDR 0x1b static inline int set_ah_attr_by_ipv6(struct ibv_context *context, struct ibv_ah_attr *ah_attr, struct ibv_grh *grh, uint8_t port_num) { uint32_t flow_class; uint32_t sgid_type; int ret; /* No point searching multicast GIDs in GID table */ if (grh->dgid.raw[0] == 0xFF) { errno = EINVAL; return -1; } ah_attr->grh.dgid = grh->sgid; if (grh->next_hdr == IPPROTO_UDP) { sgid_type = IBV_GID_TYPE_SYSFS_ROCE_V2; } else if (grh->next_hdr == IB_NEXT_HDR) { sgid_type = IBV_GID_TYPE_SYSFS_IB_ROCE_V1; } else { errno = EPROTONOSUPPORT; return -1; } ret = ibv_find_gid_index(context, port_num, &grh->dgid, sgid_type); if (ret < 0) return ret; ah_attr->grh.sgid_index = (uint8_t) ret; flow_class = be32toh(grh->version_tclass_flow); ah_attr->grh.hop_limit = grh->hop_limit; ah_attr->grh.traffic_class = (flow_class >> 20) & 0xFF; return 0; } int ibv_init_ah_from_wc(struct ibv_context *context, uint8_t port_num, struct ibv_wc *wc, struct ibv_grh *grh, struct ibv_ah_attr *ah_attr) { int version; int ret = 0; memset(ah_attr, 0, sizeof *ah_attr); set_ah_attr_generic_fields(ah_attr, wc, grh, port_num); if (wc->wc_flags & IBV_WC_GRH) { ah_attr->is_global = 1; version = 
get_grh_header_version(grh); if (version == 4) ret = set_ah_attr_by_ipv4(context, ah_attr, (struct iphdr *)((void *)grh + 20), port_num); else if (version == 6) ret = set_ah_attr_by_ipv6(context, ah_attr, grh, port_num); else ret = -1; } return ret; } struct ibv_ah *ibv_create_ah_from_wc(struct ibv_pd *pd, struct ibv_wc *wc, struct ibv_grh *grh, uint8_t port_num) { struct ibv_ah_attr ah_attr; int ret; ret = ibv_init_ah_from_wc(pd->context, port_num, wc, grh, &ah_attr); if (ret) return NULL; return ibv_create_ah(pd, &ah_attr); } LATEST_SYMVER_FUNC(ibv_destroy_ah, 1_1, "IBVERBS_1.1", int, struct ibv_ah *ah) { return get_ops(ah->context)->destroy_ah(ah); } LATEST_SYMVER_FUNC(ibv_attach_mcast, 1_1, "IBVERBS_1.1", int, struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid) { return get_ops(qp->context)->attach_mcast(qp, gid, lid); } LATEST_SYMVER_FUNC(ibv_detach_mcast, 1_1, "IBVERBS_1.1", int, struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid) { return get_ops(qp->context)->detach_mcast(qp, gid, lid); } static inline int ipv6_addr_v4mapped(const struct in6_addr *a) { return IN6_IS_ADDR_V4MAPPED(&a->s6_addr32) || /* IPv4 encoded multicast addresses */ (a->s6_addr32[0] == htobe32(0xff0e0000) && ((a->s6_addr32[1] | (a->s6_addr32[2] ^ htobe32(0x0000ffff))) == 0UL)); } struct peer_address { void *address; uint32_t size; }; static inline int create_peer_from_gid(int family, void *raw_gid, struct peer_address *peer_address) { switch (family) { case AF_INET: peer_address->address = raw_gid + 12; peer_address->size = 4; break; case AF_INET6: peer_address->address = raw_gid; peer_address->size = 16; break; default: return -1; } return 0; } #define NEIGH_GET_DEFAULT_TIMEOUT_MS 3000 int ibv_resolve_eth_l2_from_gid(struct ibv_context *context, struct ibv_ah_attr *attr, uint8_t eth_mac[ETHERNET_LL_SIZE], uint16_t *vid) { int dst_family; int src_family; int oif; struct get_neigh_handler neigh_handler; union ibv_gid sgid; int ether_len; struct peer_address src; struct peer_address dst; int ret = -EINVAL; int err; err = ibv_query_gid(context, attr->port_num, attr->grh.sgid_index, &sgid); if (err) return err; err = neigh_init_resources(&neigh_handler, NEIGH_GET_DEFAULT_TIMEOUT_MS); if (err) return err; dst_family = ipv6_addr_v4mapped((struct in6_addr *)attr->grh.dgid.raw) ? AF_INET : AF_INET6; src_family = ipv6_addr_v4mapped((struct in6_addr *)sgid.raw) ? 
AF_INET : AF_INET6; if (create_peer_from_gid(dst_family, attr->grh.dgid.raw, &dst)) goto free_resources; if (create_peer_from_gid(src_family, &sgid.raw, &src)) goto free_resources; if (neigh_set_dst(&neigh_handler, dst_family, dst.address, dst.size)) goto free_resources; if (neigh_set_src(&neigh_handler, src_family, src.address, src.size)) goto free_resources; oif = neigh_get_oif_from_src(&neigh_handler); if (oif > 0) neigh_set_oif(&neigh_handler, oif); else goto free_resources; ret = -EHOSTUNREACH; /* blocking call */ if (process_get_neigh(&neigh_handler)) goto free_resources; if (vid) { uint16_t ret_vid = neigh_get_vlan_id_from_dev(&neigh_handler); if (ret_vid <= 0xfff) neigh_set_vlan_id(&neigh_handler, ret_vid); *vid = ret_vid; } /* We are using only Ethernet here */ ether_len = neigh_get_ll(&neigh_handler, eth_mac, sizeof(uint8_t) * ETHERNET_LL_SIZE); if (ether_len <= 0) goto free_resources; ret = 0; free_resources: neigh_free_resources(&neigh_handler); return ret; } int ibv_set_ece(struct ibv_qp *qp, struct ibv_ece *ece) { if (!ece->vendor_id) { errno = EOPNOTSUPP; return errno; } return get_ops(qp->context)->set_ece(qp, ece); } int ibv_query_ece(struct ibv_qp *qp, struct ibv_ece *ece) { return get_ops(qp->context)->query_ece(qp, ece); } rdma-core-56.1/libibverbs/verbs.h000066400000000000000000002742551477342711600167660ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved. * Copyright (c) 2004, 2011-2012 Intel Corporation. All rights reserved. * Copyright (c) 2005, 2006, 2007 Cisco Systems, Inc. All rights reserved. * Copyright (c) 2005 PathScale, Inc. All rights reserved. * Copyright (c) 2020 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef INFINIBAND_VERBS_H #define INFINIBAND_VERBS_H #include <stdint.h> #include <pthread.h> #include <stddef.h> #include <errno.h> #include <string.h> #include <linux/types.h> #include <stdint.h> #include <sys/types.h> #include <infiniband/verbs_api.h> #ifdef __cplusplus #include <limits> #endif #if __GNUC__ >= 3 # define __attribute_const __attribute__((const)) #else # define __attribute_const #endif #ifdef __cplusplus extern "C" { #endif union ibv_gid { uint8_t raw[16]; struct { __be64 subnet_prefix; __be64 interface_id; } global; }; enum ibv_gid_type { IBV_GID_TYPE_IB, IBV_GID_TYPE_ROCE_V1, IBV_GID_TYPE_ROCE_V2, }; struct ibv_gid_entry { union ibv_gid gid; uint32_t gid_index; uint32_t port_num; uint32_t gid_type; /* enum ibv_gid_type */ uint32_t ndev_ifindex; }; #define vext_field_avail(type, fld, sz) (offsetof(type, fld) < (sz)) #ifdef __cplusplus #define __VERBS_ABI_IS_EXTENDED ((void *)std::numeric_limits<uintptr_t>::max()) #else #define __VERBS_ABI_IS_EXTENDED ((void *)UINTPTR_MAX) #endif enum ibv_node_type { IBV_NODE_UNKNOWN = -1, IBV_NODE_CA = 1, IBV_NODE_SWITCH, IBV_NODE_ROUTER, IBV_NODE_RNIC, IBV_NODE_USNIC, IBV_NODE_USNIC_UDP, IBV_NODE_UNSPECIFIED, }; enum ibv_transport_type { IBV_TRANSPORT_UNKNOWN = -1, IBV_TRANSPORT_IB = 0, IBV_TRANSPORT_IWARP, IBV_TRANSPORT_USNIC, IBV_TRANSPORT_USNIC_UDP, IBV_TRANSPORT_UNSPECIFIED, }; enum ibv_device_cap_flags { IBV_DEVICE_RESIZE_MAX_WR = 1, IBV_DEVICE_BAD_PKEY_CNTR = 1 << 1, IBV_DEVICE_BAD_QKEY_CNTR = 1 << 2, IBV_DEVICE_RAW_MULTI = 1 << 3, IBV_DEVICE_AUTO_PATH_MIG = 1 << 4, IBV_DEVICE_CHANGE_PHY_PORT = 1 << 5, IBV_DEVICE_UD_AV_PORT_ENFORCE = 1 << 6, IBV_DEVICE_CURR_QP_STATE_MOD = 1 << 7, IBV_DEVICE_SHUTDOWN_PORT = 1 << 8, IBV_DEVICE_INIT_TYPE = 1 << 9, IBV_DEVICE_PORT_ACTIVE_EVENT = 1 << 10, IBV_DEVICE_SYS_IMAGE_GUID = 1 << 11, IBV_DEVICE_RC_RNR_NAK_GEN = 1 << 12, IBV_DEVICE_SRQ_RESIZE = 1 << 13, IBV_DEVICE_N_NOTIFY_CQ = 1 << 14, IBV_DEVICE_MEM_WINDOW = 1 << 17, IBV_DEVICE_UD_IP_CSUM = 1 << 18, IBV_DEVICE_XRC = 1 << 20, IBV_DEVICE_MEM_MGT_EXTENSIONS = 1 << 21, IBV_DEVICE_MEM_WINDOW_TYPE_2A = 1 << 23, IBV_DEVICE_MEM_WINDOW_TYPE_2B = 1 << 24, IBV_DEVICE_RC_IP_CSUM = 1 << 25, IBV_DEVICE_RAW_IP_CSUM = 1 << 26, IBV_DEVICE_MANAGED_FLOW_STEERING = 1 << 29 }; enum ibv_fork_status { IBV_FORK_DISABLED, IBV_FORK_ENABLED, IBV_FORK_UNNEEDED, }; /* * Can't extend the ibv_device_cap_flags enum above, as on some systems/compilers * the enum range is limited to 4 bytes.
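* Capability bits that don't fit in those 4 bytes, such as the two defines * below, are instead reported through the 64-bit device_cap_flags_ex field * of struct ibv_device_attr_ex.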
*/ #define IBV_DEVICE_RAW_SCATTER_FCS (1ULL << 34) #define IBV_DEVICE_PCI_WRITE_END_PADDING (1ULL << 36) enum ibv_atomic_cap { IBV_ATOMIC_NONE, IBV_ATOMIC_HCA, IBV_ATOMIC_GLOB }; struct ibv_alloc_dm_attr { size_t length; uint32_t log_align_req; uint32_t comp_mask; }; enum ibv_dm_mask { IBV_DM_MASK_HANDLE = 1 << 0, }; struct ibv_dm { struct ibv_context *context; int (*memcpy_to_dm)(struct ibv_dm *dm, uint64_t dm_offset, const void *host_addr, size_t length); int (*memcpy_from_dm)(void *host_addr, struct ibv_dm *dm, uint64_t dm_offset, size_t length); uint32_t comp_mask; uint32_t handle; }; struct ibv_device_attr { char fw_ver[64]; __be64 node_guid; __be64 sys_image_guid; uint64_t max_mr_size; uint64_t page_size_cap; uint32_t vendor_id; uint32_t vendor_part_id; uint32_t hw_ver; int max_qp; int max_qp_wr; unsigned int device_cap_flags; int max_sge; int max_sge_rd; int max_cq; int max_cqe; int max_mr; int max_pd; int max_qp_rd_atom; int max_ee_rd_atom; int max_res_rd_atom; int max_qp_init_rd_atom; int max_ee_init_rd_atom; enum ibv_atomic_cap atomic_cap; int max_ee; int max_rdd; int max_mw; int max_raw_ipv6_qp; int max_raw_ethy_qp; int max_mcast_grp; int max_mcast_qp_attach; int max_total_mcast_qp_attach; int max_ah; int max_fmr; int max_map_per_fmr; int max_srq; int max_srq_wr; int max_srq_sge; uint16_t max_pkeys; uint8_t local_ca_ack_delay; uint8_t phys_port_cnt; }; /* An extensible input struct for possible future extensions of the * ibv_query_device_ex verb. */ struct ibv_query_device_ex_input { uint32_t comp_mask; }; enum ibv_odp_transport_cap_bits { IBV_ODP_SUPPORT_SEND = 1 << 0, IBV_ODP_SUPPORT_RECV = 1 << 1, IBV_ODP_SUPPORT_WRITE = 1 << 2, IBV_ODP_SUPPORT_READ = 1 << 3, IBV_ODP_SUPPORT_ATOMIC = 1 << 4, IBV_ODP_SUPPORT_SRQ_RECV = 1 << 5, }; struct ibv_odp_caps { uint64_t general_caps; struct { uint32_t rc_odp_caps; uint32_t uc_odp_caps; uint32_t ud_odp_caps; } per_transport_caps; }; enum ibv_odp_general_caps { IBV_ODP_SUPPORT = 1 << 0, IBV_ODP_SUPPORT_IMPLICIT = 1 << 1, }; struct ibv_tso_caps { uint32_t max_tso; uint32_t supported_qpts; }; /* RX Hash function flags */ enum ibv_rx_hash_function_flags { IBV_RX_HASH_FUNC_TOEPLITZ = 1 << 0, }; /* * The RX Hash fields select which fields of an incoming packet * participate in the RX Hash calculation. Each flag represents one packet * field; when a flag is set, that field is included in the hash. * Note: *IPV4 and *IPV6 flags can't be enabled together on the same QP, * and *TCP and *UDP flags can't be enabled together on the same QP.
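* For example, to hash incoming TCP/IPv4 flows over both addresses and both * ports, a consumer sets rx_hash_fields_mask (in struct ibv_rx_hash_conf) to * IBV_RX_HASH_SRC_IPV4 | IBV_RX_HASH_DST_IPV4 | IBV_RX_HASH_SRC_PORT_TCP | * IBV_RX_HASH_DST_PORT_TCP.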
*/ enum ibv_rx_hash_fields { IBV_RX_HASH_SRC_IPV4 = 1 << 0, IBV_RX_HASH_DST_IPV4 = 1 << 1, IBV_RX_HASH_SRC_IPV6 = 1 << 2, IBV_RX_HASH_DST_IPV6 = 1 << 3, IBV_RX_HASH_SRC_PORT_TCP = 1 << 4, IBV_RX_HASH_DST_PORT_TCP = 1 << 5, IBV_RX_HASH_SRC_PORT_UDP = 1 << 6, IBV_RX_HASH_DST_PORT_UDP = 1 << 7, IBV_RX_HASH_IPSEC_SPI = 1 << 8, IBV_RX_HASH_INNER = (1UL << 31), }; struct ibv_rss_caps { uint32_t supported_qpts; uint32_t max_rwq_indirection_tables; uint32_t max_rwq_indirection_table_size; uint64_t rx_hash_fields_mask; /* enum ibv_rx_hash_fields */ uint8_t rx_hash_function; /* enum ibv_rx_hash_function_flags */ }; struct ibv_packet_pacing_caps { uint32_t qp_rate_limit_min; uint32_t qp_rate_limit_max; /* In kbps */ uint32_t supported_qpts; }; enum ibv_raw_packet_caps { IBV_RAW_PACKET_CAP_CVLAN_STRIPPING = 1 << 0, IBV_RAW_PACKET_CAP_SCATTER_FCS = 1 << 1, IBV_RAW_PACKET_CAP_IP_CSUM = 1 << 2, IBV_RAW_PACKET_CAP_DELAY_DROP = 1 << 3, }; enum ibv_tm_cap_flags { IBV_TM_CAP_RC = 1 << 0, }; struct ibv_tm_caps { /* Max size of rendezvous request header */ uint32_t max_rndv_hdr_size; /* Max number of tagged buffers in a TM-SRQ matching list */ uint32_t max_num_tags; /* From enum ibv_tm_cap_flags */ uint32_t flags; /* Max number of outstanding list operations */ uint32_t max_ops; /* Max number of SGEs in a tagged buffer */ uint32_t max_sge; }; struct ibv_cq_moderation_caps { uint16_t max_cq_count; uint16_t max_cq_period; /* in micro seconds */ }; enum ibv_pci_atomic_op_size { IBV_PCI_ATOMIC_OPERATION_4_BYTE_SIZE_SUP = 1 << 0, IBV_PCI_ATOMIC_OPERATION_8_BYTE_SIZE_SUP = 1 << 1, IBV_PCI_ATOMIC_OPERATION_16_BYTE_SIZE_SUP = 1 << 2, }; /* * Bitmask for supported operation sizes * Use enum ibv_pci_atomic_op_size */ struct ibv_pci_atomic_caps { uint16_t fetch_add; uint16_t swap; uint16_t compare_swap; }; struct ibv_device_attr_ex { struct ibv_device_attr orig_attr; uint32_t comp_mask; struct ibv_odp_caps odp_caps; uint64_t completion_timestamp_mask; uint64_t hca_core_clock; uint64_t device_cap_flags_ex; struct ibv_tso_caps tso_caps; struct ibv_rss_caps rss_caps; uint32_t max_wq_type_rq; struct ibv_packet_pacing_caps packet_pacing_caps; uint32_t raw_packet_caps; /* Use ibv_raw_packet_caps */ struct ibv_tm_caps tm_caps; struct ibv_cq_moderation_caps cq_mod_caps; uint64_t max_dm_size; struct ibv_pci_atomic_caps pci_atomic_caps; uint32_t xrc_odp_caps; uint32_t phys_port_cnt_ex; }; enum ibv_mtu { IBV_MTU_256 = 1, IBV_MTU_512 = 2, IBV_MTU_1024 = 3, IBV_MTU_2048 = 4, IBV_MTU_4096 = 5 }; enum ibv_port_state { IBV_PORT_NOP = 0, IBV_PORT_DOWN = 1, IBV_PORT_INIT = 2, IBV_PORT_ARMED = 3, IBV_PORT_ACTIVE = 4, IBV_PORT_ACTIVE_DEFER = 5 }; enum { IBV_LINK_LAYER_UNSPECIFIED, IBV_LINK_LAYER_INFINIBAND, IBV_LINK_LAYER_ETHERNET, }; enum ibv_port_cap_flags { IBV_PORT_SM = 1 << 1, IBV_PORT_NOTICE_SUP = 1 << 2, IBV_PORT_TRAP_SUP = 1 << 3, IBV_PORT_OPT_IPD_SUP = 1 << 4, IBV_PORT_AUTO_MIGR_SUP = 1 << 5, IBV_PORT_SL_MAP_SUP = 1 << 6, IBV_PORT_MKEY_NVRAM = 1 << 7, IBV_PORT_PKEY_NVRAM = 1 << 8, IBV_PORT_LED_INFO_SUP = 1 << 9, IBV_PORT_SYS_IMAGE_GUID_SUP = 1 << 11, IBV_PORT_PKEY_SW_EXT_PORT_TRAP_SUP = 1 << 12, IBV_PORT_EXTENDED_SPEEDS_SUP = 1 << 14, IBV_PORT_CAP_MASK2_SUP = 1 << 15, IBV_PORT_CM_SUP = 1 << 16, IBV_PORT_SNMP_TUNNEL_SUP = 1 << 17, IBV_PORT_REINIT_SUP = 1 << 18, IBV_PORT_DEVICE_MGMT_SUP = 1 << 19, IBV_PORT_VENDOR_CLASS_SUP = 1 << 20, IBV_PORT_DR_NOTICE_SUP = 1 << 21, IBV_PORT_CAP_MASK_NOTICE_SUP = 1 << 22, IBV_PORT_BOOT_MGMT_SUP = 1 << 23, IBV_PORT_LINK_LATENCY_SUP = 1 << 24, IBV_PORT_CLIENT_REG_SUP = 1 << 25, IBV_PORT_IP_BASED_GIDS 
= 1 << 26 }; enum ibv_port_cap_flags2 { IBV_PORT_SET_NODE_DESC_SUP = 1 << 0, IBV_PORT_INFO_EXT_SUP = 1 << 1, IBV_PORT_VIRT_SUP = 1 << 2, IBV_PORT_SWITCH_PORT_STATE_TABLE_SUP = 1 << 3, IBV_PORT_LINK_WIDTH_2X_SUP = 1 << 4, IBV_PORT_LINK_SPEED_HDR_SUP = 1 << 5, IBV_PORT_LINK_SPEED_NDR_SUP = 1 << 10, IBV_PORT_LINK_SPEED_XDR_SUP = 1 << 12, }; struct ibv_port_attr { enum ibv_port_state state; enum ibv_mtu max_mtu; enum ibv_mtu active_mtu; int gid_tbl_len; uint32_t port_cap_flags; uint32_t max_msg_sz; uint32_t bad_pkey_cntr; uint32_t qkey_viol_cntr; uint16_t pkey_tbl_len; uint16_t lid; uint16_t sm_lid; uint8_t lmc; uint8_t max_vl_num; uint8_t sm_sl; uint8_t subnet_timeout; uint8_t init_type_reply; uint8_t active_width; uint8_t active_speed; uint8_t phys_state; uint8_t link_layer; uint8_t flags; uint16_t port_cap_flags2; uint32_t active_speed_ex; }; enum ibv_event_type { IBV_EVENT_CQ_ERR, IBV_EVENT_QP_FATAL, IBV_EVENT_QP_REQ_ERR, IBV_EVENT_QP_ACCESS_ERR, IBV_EVENT_COMM_EST, IBV_EVENT_SQ_DRAINED, IBV_EVENT_PATH_MIG, IBV_EVENT_PATH_MIG_ERR, IBV_EVENT_DEVICE_FATAL, IBV_EVENT_PORT_ACTIVE, IBV_EVENT_PORT_ERR, IBV_EVENT_LID_CHANGE, IBV_EVENT_PKEY_CHANGE, IBV_EVENT_SM_CHANGE, IBV_EVENT_SRQ_ERR, IBV_EVENT_SRQ_LIMIT_REACHED, IBV_EVENT_QP_LAST_WQE_REACHED, IBV_EVENT_CLIENT_REREGISTER, IBV_EVENT_GID_CHANGE, IBV_EVENT_WQ_FATAL, }; struct ibv_async_event { union { struct ibv_cq *cq; struct ibv_qp *qp; struct ibv_srq *srq; struct ibv_wq *wq; int port_num; } element; enum ibv_event_type event_type; }; enum ibv_wc_status { IBV_WC_SUCCESS, IBV_WC_LOC_LEN_ERR, IBV_WC_LOC_QP_OP_ERR, IBV_WC_LOC_EEC_OP_ERR, IBV_WC_LOC_PROT_ERR, IBV_WC_WR_FLUSH_ERR, IBV_WC_MW_BIND_ERR, IBV_WC_BAD_RESP_ERR, IBV_WC_LOC_ACCESS_ERR, IBV_WC_REM_INV_REQ_ERR, IBV_WC_REM_ACCESS_ERR, IBV_WC_REM_OP_ERR, IBV_WC_RETRY_EXC_ERR, IBV_WC_RNR_RETRY_EXC_ERR, IBV_WC_LOC_RDD_VIOL_ERR, IBV_WC_REM_INV_RD_REQ_ERR, IBV_WC_REM_ABORT_ERR, IBV_WC_INV_EECN_ERR, IBV_WC_INV_EEC_STATE_ERR, IBV_WC_FATAL_ERR, IBV_WC_RESP_TIMEOUT_ERR, IBV_WC_GENERAL_ERR, IBV_WC_TM_ERR, IBV_WC_TM_RNDV_INCOMPLETE, }; const char *ibv_wc_status_str(enum ibv_wc_status status); enum ibv_wc_opcode { IBV_WC_SEND, IBV_WC_RDMA_WRITE, IBV_WC_RDMA_READ, IBV_WC_COMP_SWAP, IBV_WC_FETCH_ADD, IBV_WC_BIND_MW, IBV_WC_LOCAL_INV, IBV_WC_TSO, IBV_WC_FLUSH, IBV_WC_ATOMIC_WRITE = 9, /* * Set value of IBV_WC_RECV so consumers can test if a completion is a * receive by testing (opcode & IBV_WC_RECV). 
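* For example, IBV_WC_RECV_RDMA_WITH_IMM, defined right after IBV_WC_RECV, * has the IBV_WC_RECV bit set, so the test matches it as well.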
*/ IBV_WC_RECV = 1 << 7, IBV_WC_RECV_RDMA_WITH_IMM, IBV_WC_TM_ADD, IBV_WC_TM_DEL, IBV_WC_TM_SYNC, IBV_WC_TM_RECV, IBV_WC_TM_NO_TAG, IBV_WC_DRIVER1, IBV_WC_DRIVER2, IBV_WC_DRIVER3, }; enum { IBV_WC_IP_CSUM_OK_SHIFT = 2 }; enum ibv_create_cq_wc_flags { IBV_WC_EX_WITH_BYTE_LEN = 1 << 0, IBV_WC_EX_WITH_IMM = 1 << 1, IBV_WC_EX_WITH_QP_NUM = 1 << 2, IBV_WC_EX_WITH_SRC_QP = 1 << 3, IBV_WC_EX_WITH_SLID = 1 << 4, IBV_WC_EX_WITH_SL = 1 << 5, IBV_WC_EX_WITH_DLID_PATH_BITS = 1 << 6, IBV_WC_EX_WITH_COMPLETION_TIMESTAMP = 1 << 7, IBV_WC_EX_WITH_CVLAN = 1 << 8, IBV_WC_EX_WITH_FLOW_TAG = 1 << 9, IBV_WC_EX_WITH_TM_INFO = 1 << 10, IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK = 1 << 11, }; enum { IBV_WC_STANDARD_FLAGS = IBV_WC_EX_WITH_BYTE_LEN | IBV_WC_EX_WITH_IMM | IBV_WC_EX_WITH_QP_NUM | IBV_WC_EX_WITH_SRC_QP | IBV_WC_EX_WITH_SLID | IBV_WC_EX_WITH_SL | IBV_WC_EX_WITH_DLID_PATH_BITS }; enum { IBV_CREATE_CQ_SUP_WC_FLAGS = IBV_WC_STANDARD_FLAGS | IBV_WC_EX_WITH_COMPLETION_TIMESTAMP | IBV_WC_EX_WITH_CVLAN | IBV_WC_EX_WITH_FLOW_TAG | IBV_WC_EX_WITH_TM_INFO | IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK }; enum ibv_wc_flags { IBV_WC_GRH = 1 << 0, IBV_WC_WITH_IMM = 1 << 1, IBV_WC_IP_CSUM_OK = 1 << IBV_WC_IP_CSUM_OK_SHIFT, IBV_WC_WITH_INV = 1 << 3, IBV_WC_TM_SYNC_REQ = 1 << 4, IBV_WC_TM_MATCH = 1 << 5, IBV_WC_TM_DATA_VALID = 1 << 6, }; struct ibv_wc { uint64_t wr_id; enum ibv_wc_status status; enum ibv_wc_opcode opcode; uint32_t vendor_err; uint32_t byte_len; /* When (wc_flags & IBV_WC_WITH_IMM): Immediate data in network byte order. * When (wc_flags & IBV_WC_WITH_INV): Stores the invalidated rkey. */ union { __be32 imm_data; uint32_t invalidated_rkey; }; uint32_t qp_num; uint32_t src_qp; unsigned int wc_flags; uint16_t pkey_index; uint16_t slid; uint8_t sl; uint8_t dlid_path_bits; }; enum ibv_access_flags { IBV_ACCESS_LOCAL_WRITE = 1, IBV_ACCESS_REMOTE_WRITE = (1<<1), IBV_ACCESS_REMOTE_READ = (1<<2), IBV_ACCESS_REMOTE_ATOMIC = (1<<3), IBV_ACCESS_MW_BIND = (1<<4), IBV_ACCESS_ZERO_BASED = (1<<5), IBV_ACCESS_ON_DEMAND = (1<<6), IBV_ACCESS_HUGETLB = (1<<7), IBV_ACCESS_FLUSH_GLOBAL = (1 << 8), IBV_ACCESS_FLUSH_PERSISTENT = (1 << 9), IBV_ACCESS_RELAXED_ORDERING = IBV_ACCESS_OPTIONAL_FIRST, }; struct ibv_mw_bind_info { struct ibv_mr *mr; uint64_t addr; uint64_t length; unsigned int mw_access_flags; /* use ibv_access_flags */ }; struct ibv_pd { struct ibv_context *context; uint32_t handle; }; struct ibv_td_init_attr { uint32_t comp_mask; }; struct ibv_td { struct ibv_context *context; }; enum ibv_xrcd_init_attr_mask { IBV_XRCD_INIT_ATTR_FD = 1 << 0, IBV_XRCD_INIT_ATTR_OFLAGS = 1 << 1, IBV_XRCD_INIT_ATTR_RESERVED = 1 << 2 }; struct ibv_xrcd_init_attr { uint32_t comp_mask; int fd; int oflags; }; struct ibv_xrcd { struct ibv_context *context; }; enum ibv_rereg_mr_flags { IBV_REREG_MR_CHANGE_TRANSLATION = (1 << 0), IBV_REREG_MR_CHANGE_PD = (1 << 1), IBV_REREG_MR_CHANGE_ACCESS = (1 << 2), IBV_REREG_MR_FLAGS_SUPPORTED = ((IBV_REREG_MR_CHANGE_ACCESS << 1) - 1) }; struct ibv_mr { struct ibv_context *context; struct ibv_pd *pd; void *addr; size_t length; uint32_t handle; uint32_t lkey; uint32_t rkey; }; enum ibv_mw_type { IBV_MW_TYPE_1 = 1, IBV_MW_TYPE_2 = 2 }; struct ibv_mw { struct ibv_context *context; struct ibv_pd *pd; uint32_t rkey; uint32_t handle; enum ibv_mw_type type; }; struct ibv_global_route { union ibv_gid dgid; uint32_t flow_label; uint8_t sgid_index; uint8_t hop_limit; uint8_t traffic_class; }; struct ibv_grh { __be32 version_tclass_flow; __be16 paylen; uint8_t next_hdr; uint8_t hop_limit; union ibv_gid sgid; 
union ibv_gid dgid; }; enum ibv_rate { IBV_RATE_MAX = 0, IBV_RATE_2_5_GBPS = 2, IBV_RATE_5_GBPS = 5, IBV_RATE_10_GBPS = 3, IBV_RATE_20_GBPS = 6, IBV_RATE_30_GBPS = 4, IBV_RATE_40_GBPS = 7, IBV_RATE_60_GBPS = 8, IBV_RATE_80_GBPS = 9, IBV_RATE_120_GBPS = 10, IBV_RATE_14_GBPS = 11, IBV_RATE_56_GBPS = 12, IBV_RATE_112_GBPS = 13, IBV_RATE_168_GBPS = 14, IBV_RATE_25_GBPS = 15, IBV_RATE_100_GBPS = 16, IBV_RATE_200_GBPS = 17, IBV_RATE_300_GBPS = 18, IBV_RATE_28_GBPS = 19, IBV_RATE_50_GBPS = 20, IBV_RATE_400_GBPS = 21, IBV_RATE_600_GBPS = 22, IBV_RATE_800_GBPS = 23, IBV_RATE_1200_GBPS = 24, }; /** * ibv_rate_to_mult - Convert the IB rate enum to a multiple of the * base rate of 2.5 Gbit/sec. For example, IBV_RATE_5_GBPS will be * converted to 2, since 5 Gbit/sec is 2 * 2.5 Gbit/sec. * @rate: rate to convert. */ int __attribute_const ibv_rate_to_mult(enum ibv_rate rate); /** * mult_to_ibv_rate - Convert a multiple of 2.5 Gbit/sec to an IB rate enum. * @mult: multiple to convert. */ enum ibv_rate __attribute_const mult_to_ibv_rate(int mult); /** * ibv_rate_to_mbps - Convert the IB rate enum to Mbit/sec. * For example, IBV_RATE_5_GBPS will return the value 5000. * @rate: rate to convert. */ int __attribute_const ibv_rate_to_mbps(enum ibv_rate rate); /** * mbps_to_ibv_rate - Convert a Mbit/sec value to an IB rate enum. * @mbps: value to convert. */ enum ibv_rate __attribute_const mbps_to_ibv_rate(int mbps); struct ibv_ah_attr { struct ibv_global_route grh; uint16_t dlid; uint8_t sl; uint8_t src_path_bits; uint8_t static_rate; uint8_t is_global; uint8_t port_num; }; enum ibv_srq_attr_mask { IBV_SRQ_MAX_WR = 1 << 0, IBV_SRQ_LIMIT = 1 << 1 }; struct ibv_srq_attr { uint32_t max_wr; uint32_t max_sge; uint32_t srq_limit; }; struct ibv_srq_init_attr { void *srq_context; struct ibv_srq_attr attr; }; enum ibv_srq_type { IBV_SRQT_BASIC, IBV_SRQT_XRC, IBV_SRQT_TM, }; enum ibv_srq_init_attr_mask { IBV_SRQ_INIT_ATTR_TYPE = 1 << 0, IBV_SRQ_INIT_ATTR_PD = 1 << 1, IBV_SRQ_INIT_ATTR_XRCD = 1 << 2, IBV_SRQ_INIT_ATTR_CQ = 1 << 3, IBV_SRQ_INIT_ATTR_TM = 1 << 4, IBV_SRQ_INIT_ATTR_RESERVED = 1 << 5, }; struct ibv_tm_cap { uint32_t max_num_tags; uint32_t max_ops; }; struct ibv_srq_init_attr_ex { void *srq_context; struct ibv_srq_attr attr; uint32_t comp_mask; enum ibv_srq_type srq_type; struct ibv_pd *pd; struct ibv_xrcd *xrcd; struct ibv_cq *cq; struct ibv_tm_cap tm_cap; }; enum ibv_wq_type { IBV_WQT_RQ }; enum ibv_wq_init_attr_mask { IBV_WQ_INIT_ATTR_FLAGS = 1 << 0, IBV_WQ_INIT_ATTR_RESERVED = 1 << 1, }; enum ibv_wq_flags { IBV_WQ_FLAGS_CVLAN_STRIPPING = 1 << 0, IBV_WQ_FLAGS_SCATTER_FCS = 1 << 1, IBV_WQ_FLAGS_DELAY_DROP = 1 << 2, IBV_WQ_FLAGS_PCI_WRITE_END_PADDING = 1 << 3, IBV_WQ_FLAGS_RESERVED = 1 << 4, }; struct ibv_wq_init_attr { void *wq_context; enum ibv_wq_type wq_type; uint32_t max_wr; uint32_t max_sge; struct ibv_pd *pd; struct ibv_cq *cq; uint32_t comp_mask; /* Use ibv_wq_init_attr_mask */ uint32_t create_flags; /* use ibv_wq_flags */ }; enum ibv_wq_state { IBV_WQS_RESET, IBV_WQS_RDY, IBV_WQS_ERR, IBV_WQS_UNKNOWN }; enum ibv_wq_attr_mask { IBV_WQ_ATTR_STATE = 1 << 0, IBV_WQ_ATTR_CURR_STATE = 1 << 1, IBV_WQ_ATTR_FLAGS = 1 << 2, IBV_WQ_ATTR_RESERVED = 1 << 3, }; struct ibv_wq_attr { /* enum ibv_wq_attr_mask */ uint32_t attr_mask; /* Move the WQ to this state */ enum ibv_wq_state wq_state; /* Assume this is the current WQ state */ enum ibv_wq_state curr_wq_state; uint32_t flags; /* Use ibv_wq_flags */ uint32_t flags_mask; /* Use ibv_wq_flags */ }; /* * Receive Work Queue Indirection Table. 
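* An indirection table holds 1 << log_ind_tbl_size entries (see * struct ibv_rwq_ind_table_init_attr below).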
* It is used to distribute incoming packets between different * Receive Work Queues. Associating Receive WQs with different CPU cores * allows the traffic load to be spread across those cores. * The Indirection Table can contain only WQs of type IBV_WQT_RQ. */ struct ibv_rwq_ind_table { struct ibv_context *context; int ind_tbl_handle; int ind_tbl_num; uint32_t comp_mask; }; enum ibv_ind_table_init_attr_mask { IBV_CREATE_IND_TABLE_RESERVED = (1 << 0) }; /* * Receive Work Queue Indirection Table attributes */ struct ibv_rwq_ind_table_init_attr { uint32_t log_ind_tbl_size; /* Each entry is a pointer to a Receive Work Queue */ struct ibv_wq **ind_tbl; uint32_t comp_mask; }; enum ibv_qp_type { IBV_QPT_RC = 2, IBV_QPT_UC, IBV_QPT_UD, IBV_QPT_RAW_PACKET = 8, IBV_QPT_XRC_SEND = 9, IBV_QPT_XRC_RECV, IBV_QPT_DRIVER = 0xff, }; struct ibv_qp_cap { uint32_t max_send_wr; uint32_t max_recv_wr; uint32_t max_send_sge; uint32_t max_recv_sge; uint32_t max_inline_data; }; struct ibv_qp_init_attr { void *qp_context; struct ibv_cq *send_cq; struct ibv_cq *recv_cq; struct ibv_srq *srq; struct ibv_qp_cap cap; enum ibv_qp_type qp_type; int sq_sig_all; }; enum ibv_qp_init_attr_mask { IBV_QP_INIT_ATTR_PD = 1 << 0, IBV_QP_INIT_ATTR_XRCD = 1 << 1, IBV_QP_INIT_ATTR_CREATE_FLAGS = 1 << 2, IBV_QP_INIT_ATTR_MAX_TSO_HEADER = 1 << 3, IBV_QP_INIT_ATTR_IND_TABLE = 1 << 4, IBV_QP_INIT_ATTR_RX_HASH = 1 << 5, IBV_QP_INIT_ATTR_SEND_OPS_FLAGS = 1 << 6, }; enum ibv_qp_create_flags { IBV_QP_CREATE_BLOCK_SELF_MCAST_LB = 1 << 1, IBV_QP_CREATE_SCATTER_FCS = 1 << 8, IBV_QP_CREATE_CVLAN_STRIPPING = 1 << 9, IBV_QP_CREATE_SOURCE_QPN = 1 << 10, IBV_QP_CREATE_PCI_WRITE_END_PADDING = 1 << 11, }; enum ibv_qp_create_send_ops_flags { IBV_QP_EX_WITH_RDMA_WRITE = 1 << 0, IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM = 1 << 1, IBV_QP_EX_WITH_SEND = 1 << 2, IBV_QP_EX_WITH_SEND_WITH_IMM = 1 << 3, IBV_QP_EX_WITH_RDMA_READ = 1 << 4, IBV_QP_EX_WITH_ATOMIC_CMP_AND_SWP = 1 << 5, IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD = 1 << 6, IBV_QP_EX_WITH_LOCAL_INV = 1 << 7, IBV_QP_EX_WITH_BIND_MW = 1 << 8, IBV_QP_EX_WITH_SEND_WITH_INV = 1 << 9, IBV_QP_EX_WITH_TSO = 1 << 10, IBV_QP_EX_WITH_FLUSH = 1 << 11, IBV_QP_EX_WITH_ATOMIC_WRITE = 1 << 12, }; struct ibv_rx_hash_conf { /* enum ibv_rx_hash_function_flags */ uint8_t rx_hash_function; uint8_t rx_hash_key_len; uint8_t *rx_hash_key; /* enum ibv_rx_hash_fields */ uint64_t rx_hash_fields_mask; }; struct ibv_qp_init_attr_ex { void *qp_context; struct ibv_cq *send_cq; struct ibv_cq *recv_cq; struct ibv_srq *srq; struct ibv_qp_cap cap; enum ibv_qp_type qp_type; int sq_sig_all; uint32_t comp_mask; struct ibv_pd *pd; struct ibv_xrcd *xrcd; uint32_t create_flags; uint16_t max_tso_header; struct ibv_rwq_ind_table *rwq_ind_tbl; struct ibv_rx_hash_conf rx_hash_conf; uint32_t source_qpn; /* See enum ibv_qp_create_send_ops_flags */ uint64_t send_ops_flags; }; enum ibv_qp_open_attr_mask { IBV_QP_OPEN_ATTR_NUM = 1 << 0, IBV_QP_OPEN_ATTR_XRCD = 1 << 1, IBV_QP_OPEN_ATTR_CONTEXT = 1 << 2, IBV_QP_OPEN_ATTR_TYPE = 1 << 3, IBV_QP_OPEN_ATTR_RESERVED = 1 << 4 }; struct ibv_qp_open_attr { uint32_t comp_mask; uint32_t qp_num; struct ibv_xrcd *xrcd; void *qp_context; enum ibv_qp_type qp_type; }; enum ibv_qp_attr_mask { IBV_QP_STATE = 1 << 0, IBV_QP_CUR_STATE = 1 << 1, IBV_QP_EN_SQD_ASYNC_NOTIFY = 1 << 2, IBV_QP_ACCESS_FLAGS = 1 << 3, IBV_QP_PKEY_INDEX = 1 << 4, IBV_QP_PORT = 1 << 5, IBV_QP_QKEY = 1 << 6, IBV_QP_AV = 1 << 7, IBV_QP_PATH_MTU = 1 << 8, IBV_QP_TIMEOUT = 1 << 9, IBV_QP_RETRY_CNT = 1 << 10, IBV_QP_RNR_RETRY = 1 << 11, IBV_QP_RQ_PSN =
1 << 12, IBV_QP_MAX_QP_RD_ATOMIC = 1 << 13, IBV_QP_ALT_PATH = 1 << 14, IBV_QP_MIN_RNR_TIMER = 1 << 15, IBV_QP_SQ_PSN = 1 << 16, IBV_QP_MAX_DEST_RD_ATOMIC = 1 << 17, IBV_QP_PATH_MIG_STATE = 1 << 18, IBV_QP_CAP = 1 << 19, IBV_QP_DEST_QPN = 1 << 20, /* These bits were supported on older kernels, but never exposed from libibverbs: _IBV_QP_SMAC = 1 << 21, _IBV_QP_ALT_SMAC = 1 << 22, _IBV_QP_VID = 1 << 23, _IBV_QP_ALT_VID = 1 << 24, */ IBV_QP_RATE_LIMIT = 1 << 25, }; enum ibv_query_qp_data_in_order_flags { IBV_QUERY_QP_DATA_IN_ORDER_RETURN_CAPS = 1 << 0, }; enum ibv_query_qp_data_in_order_caps { IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG = 1 << 0, IBV_QUERY_QP_DATA_IN_ORDER_ALIGNED_128_BYTES = 1 << 1, }; enum ibv_qp_state { IBV_QPS_RESET, IBV_QPS_INIT, IBV_QPS_RTR, IBV_QPS_RTS, IBV_QPS_SQD, IBV_QPS_SQE, IBV_QPS_ERR, IBV_QPS_UNKNOWN }; enum ibv_mig_state { IBV_MIG_MIGRATED, IBV_MIG_REARM, IBV_MIG_ARMED }; struct ibv_qp_attr { enum ibv_qp_state qp_state; enum ibv_qp_state cur_qp_state; enum ibv_mtu path_mtu; enum ibv_mig_state path_mig_state; uint32_t qkey; uint32_t rq_psn; uint32_t sq_psn; uint32_t dest_qp_num; unsigned int qp_access_flags; struct ibv_qp_cap cap; struct ibv_ah_attr ah_attr; struct ibv_ah_attr alt_ah_attr; uint16_t pkey_index; uint16_t alt_pkey_index; uint8_t en_sqd_async_notify; uint8_t sq_draining; uint8_t max_rd_atomic; uint8_t max_dest_rd_atomic; uint8_t min_rnr_timer; uint8_t port_num; uint8_t timeout; uint8_t retry_cnt; uint8_t rnr_retry; uint8_t alt_port_num; uint8_t alt_timeout; uint32_t rate_limit; }; struct ibv_qp_rate_limit_attr { uint32_t rate_limit; /* in kbps */ uint32_t max_burst_sz; /* total burst size in bytes */ uint16_t typical_pkt_sz; /* typical send packet size in bytes */ uint32_t comp_mask; }; enum ibv_wr_opcode { IBV_WR_RDMA_WRITE, IBV_WR_RDMA_WRITE_WITH_IMM, IBV_WR_SEND, IBV_WR_SEND_WITH_IMM, IBV_WR_RDMA_READ, IBV_WR_ATOMIC_CMP_AND_SWP, IBV_WR_ATOMIC_FETCH_AND_ADD, IBV_WR_LOCAL_INV, IBV_WR_BIND_MW, IBV_WR_SEND_WITH_INV, IBV_WR_TSO, IBV_WR_DRIVER1, IBV_WR_FLUSH = 14, IBV_WR_ATOMIC_WRITE = 15, }; const char *ibv_wr_opcode_str(enum ibv_wr_opcode opcode); enum ibv_send_flags { IBV_SEND_FENCE = 1 << 0, IBV_SEND_SIGNALED = 1 << 1, IBV_SEND_SOLICITED = 1 << 2, IBV_SEND_INLINE = 1 << 3, IBV_SEND_IP_CSUM = 1 << 4 }; enum ibv_placement_type { IBV_FLUSH_GLOBAL = 1U << 0, IBV_FLUSH_PERSISTENT = 1U << 1, }; enum ibv_selectivity_level { IBV_FLUSH_RANGE = 0, IBV_FLUSH_MR, }; struct ibv_data_buf { void *addr; size_t length; }; struct ibv_sge { uint64_t addr; uint32_t length; uint32_t lkey; }; struct ibv_send_wr { uint64_t wr_id; struct ibv_send_wr *next; struct ibv_sge *sg_list; int num_sge; enum ibv_wr_opcode opcode; unsigned int send_flags; /* When opcode is *_WITH_IMM: Immediate data in network byte order. 
* When opcode is *_INV: Stores the rkey to invalidate */ union { __be32 imm_data; uint32_t invalidate_rkey; }; union { struct { uint64_t remote_addr; uint32_t rkey; } rdma; struct { uint64_t remote_addr; uint64_t compare_add; uint64_t swap; uint32_t rkey; } atomic; struct { struct ibv_ah *ah; uint32_t remote_qpn; uint32_t remote_qkey; } ud; } wr; union { struct { uint32_t remote_srqn; } xrc; } qp_type; union { struct { struct ibv_mw *mw; uint32_t rkey; struct ibv_mw_bind_info bind_info; } bind_mw; struct { void *hdr; uint16_t hdr_sz; uint16_t mss; } tso; }; }; struct ibv_recv_wr { uint64_t wr_id; struct ibv_recv_wr *next; struct ibv_sge *sg_list; int num_sge; }; enum ibv_ops_wr_opcode { IBV_WR_TAG_ADD, IBV_WR_TAG_DEL, IBV_WR_TAG_SYNC, }; enum ibv_ops_flags { IBV_OPS_SIGNALED = 1 << 0, IBV_OPS_TM_SYNC = 1 << 1, }; struct ibv_ops_wr { uint64_t wr_id; struct ibv_ops_wr *next; enum ibv_ops_wr_opcode opcode; int flags; struct { uint32_t unexpected_cnt; uint32_t handle; struct { uint64_t recv_wr_id; struct ibv_sge *sg_list; int num_sge; uint64_t tag; uint64_t mask; } add; } tm; }; struct ibv_mw_bind { uint64_t wr_id; unsigned int send_flags; struct ibv_mw_bind_info bind_info; }; struct ibv_srq { struct ibv_context *context; void *srq_context; struct ibv_pd *pd; uint32_t handle; pthread_mutex_t mutex; pthread_cond_t cond; uint32_t events_completed; }; /* * Work Queue. A QP can be created without internal WQs "packaged" inside it; * such a QP can be configured to use an "external" WQ object as its * receive/send queue. * A WQ is associated (many to one) with a Completion Queue and owns its own * WQ properties (PD, WQ size, etc.). * WQ of type IBV_WQT_RQ: * - Contains receive WQEs; in this case its PD serves for scatter as well. * - Exposes a post-receive function used to post a list of work * requests (WRs) to its receive queue.
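* An RQ-type WQ is typically consumed by grouping WQs into an * ibv_rwq_ind_table and creating a QP with IBV_QP_INIT_ATTR_IND_TABLE and * IBV_QP_INIT_ATTR_RX_HASH set in struct ibv_qp_init_attr_ex.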
*/ struct ibv_wq { struct ibv_context *context; void *wq_context; struct ibv_pd *pd; struct ibv_cq *cq; uint32_t wq_num; uint32_t handle; enum ibv_wq_state state; enum ibv_wq_type wq_type; int (*post_recv)(struct ibv_wq *current, struct ibv_recv_wr *recv_wr, struct ibv_recv_wr **bad_recv_wr); pthread_mutex_t mutex; pthread_cond_t cond; uint32_t events_completed; uint32_t comp_mask; }; struct ibv_qp { struct ibv_context *context; void *qp_context; struct ibv_pd *pd; struct ibv_cq *send_cq; struct ibv_cq *recv_cq; struct ibv_srq *srq; uint32_t handle; uint32_t qp_num; enum ibv_qp_state state; enum ibv_qp_type qp_type; pthread_mutex_t mutex; pthread_cond_t cond; uint32_t events_completed; }; struct ibv_qp_ex { struct ibv_qp qp_base; uint64_t comp_mask; uint64_t wr_id; /* bitmask from enum ibv_send_flags */ unsigned int wr_flags; void (*wr_atomic_cmp_swp)(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, uint64_t compare, uint64_t swap); void (*wr_atomic_fetch_add)(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, uint64_t add); void (*wr_bind_mw)(struct ibv_qp_ex *qp, struct ibv_mw *mw, uint32_t rkey, const struct ibv_mw_bind_info *bind_info); void (*wr_local_inv)(struct ibv_qp_ex *qp, uint32_t invalidate_rkey); void (*wr_rdma_read)(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr); void (*wr_rdma_write)(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr); void (*wr_rdma_write_imm)(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, __be32 imm_data); void (*wr_send)(struct ibv_qp_ex *qp); void (*wr_send_imm)(struct ibv_qp_ex *qp, __be32 imm_data); void (*wr_send_inv)(struct ibv_qp_ex *qp, uint32_t invalidate_rkey); void (*wr_send_tso)(struct ibv_qp_ex *qp, void *hdr, uint16_t hdr_sz, uint16_t mss); void (*wr_set_ud_addr)(struct ibv_qp_ex *qp, struct ibv_ah *ah, uint32_t remote_qpn, uint32_t remote_qkey); void (*wr_set_xrc_srqn)(struct ibv_qp_ex *qp, uint32_t remote_srqn); void (*wr_set_inline_data)(struct ibv_qp_ex *qp, void *addr, size_t length); void (*wr_set_inline_data_list)(struct ibv_qp_ex *qp, size_t num_buf, const struct ibv_data_buf *buf_list); void (*wr_set_sge)(struct ibv_qp_ex *qp, uint32_t lkey, uint64_t addr, uint32_t length); void (*wr_set_sge_list)(struct ibv_qp_ex *qp, size_t num_sge, const struct ibv_sge *sg_list); void (*wr_start)(struct ibv_qp_ex *qp); int (*wr_complete)(struct ibv_qp_ex *qp); void (*wr_abort)(struct ibv_qp_ex *qp); void (*wr_atomic_write)(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, const void *atomic_wr); void (*wr_flush)(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, size_t len, uint8_t type, uint8_t level); }; struct ibv_qp_ex *ibv_qp_to_qp_ex(struct ibv_qp *qp); static inline void ibv_wr_atomic_cmp_swp(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, uint64_t compare, uint64_t swap) { qp->wr_atomic_cmp_swp(qp, rkey, remote_addr, compare, swap); } static inline void ibv_wr_atomic_fetch_add(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, uint64_t add) { qp->wr_atomic_fetch_add(qp, rkey, remote_addr, add); } static inline void ibv_wr_bind_mw(struct ibv_qp_ex *qp, struct ibv_mw *mw, uint32_t rkey, const struct ibv_mw_bind_info *bind_info) { qp->wr_bind_mw(qp, mw, rkey, bind_info); } static inline void ibv_wr_local_inv(struct ibv_qp_ex *qp, uint32_t invalidate_rkey) { qp->wr_local_inv(qp, invalidate_rkey); } static inline void ibv_wr_rdma_read(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr) { qp->wr_rdma_read(qp, rkey, remote_addr); } static inline void 
ibv_wr_rdma_write(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr) { qp->wr_rdma_write(qp, rkey, remote_addr); } static inline void ibv_wr_flush(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, size_t len, uint8_t type, uint8_t level) { qp->wr_flush(qp, rkey, remote_addr, len, type, level); } static inline void ibv_wr_rdma_write_imm(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, __be32 imm_data) { qp->wr_rdma_write_imm(qp, rkey, remote_addr, imm_data); } static inline void ibv_wr_send(struct ibv_qp_ex *qp) { qp->wr_send(qp); } static inline void ibv_wr_send_imm(struct ibv_qp_ex *qp, __be32 imm_data) { qp->wr_send_imm(qp, imm_data); } static inline void ibv_wr_send_inv(struct ibv_qp_ex *qp, uint32_t invalidate_rkey) { qp->wr_send_inv(qp, invalidate_rkey); } static inline void ibv_wr_send_tso(struct ibv_qp_ex *qp, void *hdr, uint16_t hdr_sz, uint16_t mss) { qp->wr_send_tso(qp, hdr, hdr_sz, mss); } static inline void ibv_wr_set_ud_addr(struct ibv_qp_ex *qp, struct ibv_ah *ah, uint32_t remote_qpn, uint32_t remote_qkey) { qp->wr_set_ud_addr(qp, ah, remote_qpn, remote_qkey); } static inline void ibv_wr_set_xrc_srqn(struct ibv_qp_ex *qp, uint32_t remote_srqn) { qp->wr_set_xrc_srqn(qp, remote_srqn); } static inline void ibv_wr_set_inline_data(struct ibv_qp_ex *qp, void *addr, size_t length) { qp->wr_set_inline_data(qp, addr, length); } static inline void ibv_wr_set_inline_data_list(struct ibv_qp_ex *qp, size_t num_buf, const struct ibv_data_buf *buf_list) { qp->wr_set_inline_data_list(qp, num_buf, buf_list); } static inline void ibv_wr_set_sge(struct ibv_qp_ex *qp, uint32_t lkey, uint64_t addr, uint32_t length) { qp->wr_set_sge(qp, lkey, addr, length); } static inline void ibv_wr_set_sge_list(struct ibv_qp_ex *qp, size_t num_sge, const struct ibv_sge *sg_list) { qp->wr_set_sge_list(qp, num_sge, sg_list); } static inline void ibv_wr_start(struct ibv_qp_ex *qp) { qp->wr_start(qp); } static inline int ibv_wr_complete(struct ibv_qp_ex *qp) { return qp->wr_complete(qp); } static inline void ibv_wr_abort(struct ibv_qp_ex *qp) { qp->wr_abort(qp); } static inline void ibv_wr_atomic_write(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, const void *atomic_wr) { qp->wr_atomic_write(qp, rkey, remote_addr, atomic_wr); } struct ibv_ece { /* * Unique identifier of the provider vendor on the network. * Providers set their IEEE OUI here to distinguish * themselves in a non-homogeneous network. */ uint32_t vendor_id; /* * Provider-specific attributes which are supported or * needed to be enabled by ECE users.
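* These options are negotiated between the two sides of a connection; * applications read and apply them with ibv_query_ece() and ibv_set_ece() * (defined in verbs.c above).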
*/ uint32_t options; uint32_t comp_mask; }; struct ibv_comp_channel { struct ibv_context *context; int fd; int refcnt; }; struct ibv_cq { struct ibv_context *context; struct ibv_comp_channel *channel; void *cq_context; uint32_t handle; int cqe; pthread_mutex_t mutex; pthread_cond_t cond; uint32_t comp_events_completed; uint32_t async_events_completed; }; struct ibv_poll_cq_attr { uint32_t comp_mask; }; struct ibv_wc_tm_info { uint64_t tag; /* tag from TMH */ uint32_t priv; /* opaque user data from TMH */ }; struct ibv_cq_ex { struct ibv_context *context; struct ibv_comp_channel *channel; void *cq_context; uint32_t handle; int cqe; pthread_mutex_t mutex; pthread_cond_t cond; uint32_t comp_events_completed; uint32_t async_events_completed; uint32_t comp_mask; enum ibv_wc_status status; uint64_t wr_id; int (*start_poll)(struct ibv_cq_ex *current, struct ibv_poll_cq_attr *attr); int (*next_poll)(struct ibv_cq_ex *current); void (*end_poll)(struct ibv_cq_ex *current); enum ibv_wc_opcode (*read_opcode)(struct ibv_cq_ex *current); uint32_t (*read_vendor_err)(struct ibv_cq_ex *current); uint32_t (*read_byte_len)(struct ibv_cq_ex *current); __be32 (*read_imm_data)(struct ibv_cq_ex *current); uint32_t (*read_qp_num)(struct ibv_cq_ex *current); uint32_t (*read_src_qp)(struct ibv_cq_ex *current); unsigned int (*read_wc_flags)(struct ibv_cq_ex *current); uint32_t (*read_slid)(struct ibv_cq_ex *current); uint8_t (*read_sl)(struct ibv_cq_ex *current); uint8_t (*read_dlid_path_bits)(struct ibv_cq_ex *current); uint64_t (*read_completion_ts)(struct ibv_cq_ex *current); uint16_t (*read_cvlan)(struct ibv_cq_ex *current); uint32_t (*read_flow_tag)(struct ibv_cq_ex *current); void (*read_tm_info)(struct ibv_cq_ex *current, struct ibv_wc_tm_info *tm_info); uint64_t (*read_completion_wallclock_ns)(struct ibv_cq_ex *current); }; static inline struct ibv_cq *ibv_cq_ex_to_cq(struct ibv_cq_ex *cq) { return (struct ibv_cq *)cq; } enum ibv_cq_attr_mask { IBV_CQ_ATTR_MODERATE = 1 << 0, IBV_CQ_ATTR_RESERVED = 1 << 1, }; struct ibv_moderate_cq { uint16_t cq_count; uint16_t cq_period; /* in micro seconds */ }; struct ibv_modify_cq_attr { uint32_t attr_mask; struct ibv_moderate_cq moderate; }; static inline int ibv_start_poll(struct ibv_cq_ex *cq, struct ibv_poll_cq_attr *attr) { return cq->start_poll(cq, attr); } static inline int ibv_next_poll(struct ibv_cq_ex *cq) { return cq->next_poll(cq); } static inline void ibv_end_poll(struct ibv_cq_ex *cq) { cq->end_poll(cq); } static inline enum ibv_wc_opcode ibv_wc_read_opcode(struct ibv_cq_ex *cq) { return cq->read_opcode(cq); } static inline uint32_t ibv_wc_read_vendor_err(struct ibv_cq_ex *cq) { return cq->read_vendor_err(cq); } static inline uint32_t ibv_wc_read_byte_len(struct ibv_cq_ex *cq) { return cq->read_byte_len(cq); } static inline __be32 ibv_wc_read_imm_data(struct ibv_cq_ex *cq) { return cq->read_imm_data(cq); } static inline uint32_t ibv_wc_read_invalidated_rkey(struct ibv_cq_ex *cq) { #ifdef __CHECKER__ return (__attribute__((force)) uint32_t)cq->read_imm_data(cq); #else return cq->read_imm_data(cq); #endif } static inline uint32_t ibv_wc_read_qp_num(struct ibv_cq_ex *cq) { return cq->read_qp_num(cq); } static inline uint32_t ibv_wc_read_src_qp(struct ibv_cq_ex *cq) { return cq->read_src_qp(cq); } static inline unsigned int ibv_wc_read_wc_flags(struct ibv_cq_ex *cq) { return cq->read_wc_flags(cq); } static inline uint32_t ibv_wc_read_slid(struct ibv_cq_ex *cq) { return cq->read_slid(cq); } static inline uint8_t ibv_wc_read_sl(struct ibv_cq_ex *cq) { return 
cq->read_sl(cq); } static inline uint8_t ibv_wc_read_dlid_path_bits(struct ibv_cq_ex *cq) { return cq->read_dlid_path_bits(cq); } static inline uint64_t ibv_wc_read_completion_ts(struct ibv_cq_ex *cq) { return cq->read_completion_ts(cq); } static inline uint64_t ibv_wc_read_completion_wallclock_ns(struct ibv_cq_ex *cq) { return cq->read_completion_wallclock_ns(cq); } static inline uint16_t ibv_wc_read_cvlan(struct ibv_cq_ex *cq) { return cq->read_cvlan(cq); } static inline uint32_t ibv_wc_read_flow_tag(struct ibv_cq_ex *cq) { return cq->read_flow_tag(cq); } static inline void ibv_wc_read_tm_info(struct ibv_cq_ex *cq, struct ibv_wc_tm_info *tm_info) { cq->read_tm_info(cq, tm_info); } static inline int ibv_post_wq_recv(struct ibv_wq *wq, struct ibv_recv_wr *recv_wr, struct ibv_recv_wr **bad_recv_wr) { return wq->post_recv(wq, recv_wr, bad_recv_wr); } struct ibv_ah { struct ibv_context *context; struct ibv_pd *pd; uint32_t handle; }; enum ibv_flow_flags { /* First bit is deprecated and can't be used */ IBV_FLOW_ATTR_FLAGS_DONT_TRAP = 1 << 1, IBV_FLOW_ATTR_FLAGS_EGRESS = 1 << 2, }; enum ibv_flow_attr_type { /* steering according to rule specifications */ IBV_FLOW_ATTR_NORMAL = 0x0, /* default unicast and multicast rule - * receive all Eth traffic which isn't steered to any QP */ IBV_FLOW_ATTR_ALL_DEFAULT = 0x1, /* default multicast rule - * receive all Eth multicast traffic which isn't steered to any QP */ IBV_FLOW_ATTR_MC_DEFAULT = 0x2, /* sniffer rule - receive all port traffic */ IBV_FLOW_ATTR_SNIFFER = 0x3, }; enum ibv_flow_spec_type { IBV_FLOW_SPEC_ETH = 0x20, IBV_FLOW_SPEC_IPV4 = 0x30, IBV_FLOW_SPEC_IPV6 = 0x31, IBV_FLOW_SPEC_IPV4_EXT = 0x32, IBV_FLOW_SPEC_ESP = 0x34, IBV_FLOW_SPEC_TCP = 0x40, IBV_FLOW_SPEC_UDP = 0x41, IBV_FLOW_SPEC_VXLAN_TUNNEL = 0x50, IBV_FLOW_SPEC_GRE = 0x51, IBV_FLOW_SPEC_MPLS = 0x60, IBV_FLOW_SPEC_INNER = 0x100, IBV_FLOW_SPEC_ACTION_TAG = 0x1000, IBV_FLOW_SPEC_ACTION_DROP = 0x1001, IBV_FLOW_SPEC_ACTION_HANDLE = 0x1002, IBV_FLOW_SPEC_ACTION_COUNT = 0x1003, }; #define ETHERNET_LL_SIZE ETH_ALEN struct ibv_flow_eth_filter { uint8_t dst_mac[ETHERNET_LL_SIZE]; uint8_t src_mac[ETHERNET_LL_SIZE]; uint16_t ether_type; /* * same layout as 802.1q: prio 3, cfi 1, vlan id 12 */ uint16_t vlan_tag; }; struct ibv_flow_spec_eth { enum ibv_flow_spec_type type; uint16_t size; struct ibv_flow_eth_filter val; struct ibv_flow_eth_filter mask; }; struct ibv_flow_ipv4_filter { uint32_t src_ip; uint32_t dst_ip; }; struct ibv_flow_spec_ipv4 { enum ibv_flow_spec_type type; uint16_t size; struct ibv_flow_ipv4_filter val; struct ibv_flow_ipv4_filter mask; }; struct ibv_flow_ipv4_ext_filter { uint32_t src_ip; uint32_t dst_ip; uint8_t proto; uint8_t tos; uint8_t ttl; uint8_t flags; }; struct ibv_flow_spec_ipv4_ext { enum ibv_flow_spec_type type; uint16_t size; struct ibv_flow_ipv4_ext_filter val; struct ibv_flow_ipv4_ext_filter mask; }; struct ibv_flow_ipv6_filter { uint8_t src_ip[16]; uint8_t dst_ip[16]; uint32_t flow_label; uint8_t next_hdr; uint8_t traffic_class; uint8_t hop_limit; }; struct ibv_flow_spec_ipv6 { enum ibv_flow_spec_type type; uint16_t size; struct ibv_flow_ipv6_filter val; struct ibv_flow_ipv6_filter mask; }; struct ibv_flow_esp_filter { uint32_t spi; uint32_t seq; }; struct ibv_flow_spec_esp { enum ibv_flow_spec_type type; uint16_t size; struct ibv_flow_esp_filter val; struct ibv_flow_esp_filter mask; }; struct ibv_flow_tcp_udp_filter { uint16_t dst_port; uint16_t src_port; }; struct ibv_flow_spec_tcp_udp { enum ibv_flow_spec_type type; uint16_t size; struct 
ibv_flow_tcp_udp_filter val; struct ibv_flow_tcp_udp_filter mask; }; struct ibv_flow_gre_filter { /* c_ks_res0_ver field is bits 0-15 in offset 0 of a standard GRE header: * bit 0 - checksum present bit. * bit 1 - reserved. set to 0. * bit 2 - key present bit. * bit 3 - sequence number present bit. * bits 4:12 - reserved. set to 0. * bits 13:15 - GRE version. */ uint16_t c_ks_res0_ver; uint16_t protocol; uint32_t key; }; struct ibv_flow_spec_gre { enum ibv_flow_spec_type type; uint16_t size; struct ibv_flow_gre_filter val; struct ibv_flow_gre_filter mask; }; struct ibv_flow_mpls_filter { /* The field includes the entire MPLS label: * bits 0:19 - label value field. * bits 20:22 - traffic class field. * bits 23 - bottom of stack bit. * bits 24:31 - ttl field. */ uint32_t label; }; struct ibv_flow_spec_mpls { enum ibv_flow_spec_type type; uint16_t size; struct ibv_flow_mpls_filter val; struct ibv_flow_mpls_filter mask; }; struct ibv_flow_tunnel_filter { uint32_t tunnel_id; }; struct ibv_flow_spec_tunnel { enum ibv_flow_spec_type type; uint16_t size; struct ibv_flow_tunnel_filter val; struct ibv_flow_tunnel_filter mask; }; struct ibv_flow_spec_action_tag { enum ibv_flow_spec_type type; uint16_t size; uint32_t tag_id; }; struct ibv_flow_spec_action_drop { enum ibv_flow_spec_type type; uint16_t size; }; struct ibv_flow_spec_action_handle { enum ibv_flow_spec_type type; uint16_t size; const struct ibv_flow_action *action; }; struct ibv_flow_spec_counter_action { enum ibv_flow_spec_type type; uint16_t size; struct ibv_counters *counters; }; struct ibv_flow_spec { union { struct { enum ibv_flow_spec_type type; uint16_t size; } hdr; struct ibv_flow_spec_eth eth; struct ibv_flow_spec_ipv4 ipv4; struct ibv_flow_spec_tcp_udp tcp_udp; struct ibv_flow_spec_ipv4_ext ipv4_ext; struct ibv_flow_spec_ipv6 ipv6; struct ibv_flow_spec_esp esp; struct ibv_flow_spec_tunnel tunnel; struct ibv_flow_spec_gre gre; struct ibv_flow_spec_mpls mpls; struct ibv_flow_spec_action_tag flow_tag; struct ibv_flow_spec_action_drop drop; struct ibv_flow_spec_action_handle handle; struct ibv_flow_spec_counter_action flow_count; }; }; struct ibv_flow_attr { uint32_t comp_mask; enum ibv_flow_attr_type type; uint16_t size; uint16_t priority; uint8_t num_of_specs; uint8_t port; uint32_t flags; /* Following are the optional layers according to user request * struct ibv_flow_spec_xxx [L2] * struct ibv_flow_spec_yyy [L3/L4] */ }; struct ibv_flow { uint32_t comp_mask; struct ibv_context *context; uint32_t handle; }; struct ibv_flow_action { struct ibv_context *context; }; enum ibv_flow_action_esp_mask { IBV_FLOW_ACTION_ESP_MASK_ESN = 1UL << 0, }; struct ibv_flow_action_esp_attr { struct ibv_flow_action_esp *esp_attr; enum ibv_flow_action_esp_keymat keymat_proto; uint16_t keymat_len; void *keymat_ptr; enum ibv_flow_action_esp_replay replay_proto; uint16_t replay_len; void *replay_ptr; struct ibv_flow_action_esp_encap *esp_encap; uint32_t comp_mask; /* Use enum ibv_flow_action_esp_mask */ uint32_t esn; }; struct ibv_device; struct ibv_context; /* Obsolete, never used, do not touch */ struct _ibv_device_ops { struct ibv_context * (*_dummy1)(struct ibv_device *device, int cmd_fd); void (*_dummy2)(struct ibv_context *context); }; enum { IBV_SYSFS_NAME_MAX = 64, IBV_SYSFS_PATH_MAX = 256 }; struct ibv_device { struct _ibv_device_ops _ops; enum ibv_node_type node_type; enum ibv_transport_type transport_type; /* Name of underlying kernel IB device, eg "mthca0" */ char name[IBV_SYSFS_NAME_MAX]; /* Name of uverbs device, eg "uverbs0" */ char 
dev_name[IBV_SYSFS_NAME_MAX]; /* Path to infiniband_verbs class device in sysfs */ char dev_path[IBV_SYSFS_PATH_MAX]; /* Path to infiniband class device in sysfs */ char ibdev_path[IBV_SYSFS_PATH_MAX]; }; struct _compat_ibv_port_attr; struct ibv_context_ops { int (*_compat_query_device)(struct ibv_context *context, struct ibv_device_attr *device_attr); int (*_compat_query_port)(struct ibv_context *context, uint8_t port_num, struct _compat_ibv_port_attr *port_attr); void *(*_compat_alloc_pd)(void); void *(*_compat_dealloc_pd)(void); void *(*_compat_reg_mr)(void); void *(*_compat_rereg_mr)(void); void *(*_compat_dereg_mr)(void); struct ibv_mw * (*alloc_mw)(struct ibv_pd *pd, enum ibv_mw_type type); int (*bind_mw)(struct ibv_qp *qp, struct ibv_mw *mw, struct ibv_mw_bind *mw_bind); int (*dealloc_mw)(struct ibv_mw *mw); void *(*_compat_create_cq)(void); int (*poll_cq)(struct ibv_cq *cq, int num_entries, struct ibv_wc *wc); int (*req_notify_cq)(struct ibv_cq *cq, int solicited_only); void *(*_compat_cq_event)(void); void *(*_compat_resize_cq)(void); void *(*_compat_destroy_cq)(void); void *(*_compat_create_srq)(void); void *(*_compat_modify_srq)(void); void *(*_compat_query_srq)(void); void *(*_compat_destroy_srq)(void); int (*post_srq_recv)(struct ibv_srq *srq, struct ibv_recv_wr *recv_wr, struct ibv_recv_wr **bad_recv_wr); void *(*_compat_create_qp)(void); void *(*_compat_query_qp)(void); void *(*_compat_modify_qp)(void); void *(*_compat_destroy_qp)(void); int (*post_send)(struct ibv_qp *qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr); int (*post_recv)(struct ibv_qp *qp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); void *(*_compat_create_ah)(void); void *(*_compat_destroy_ah)(void); void *(*_compat_attach_mcast)(void); void *(*_compat_detach_mcast)(void); void *(*_compat_async_event)(void); }; struct ibv_context { struct ibv_device *device; struct ibv_context_ops ops; int cmd_fd; int async_fd; int num_comp_vectors; pthread_mutex_t mutex; void *abi_compat; }; enum ibv_cq_init_attr_mask { IBV_CQ_INIT_ATTR_MASK_FLAGS = 1 << 0, IBV_CQ_INIT_ATTR_MASK_PD = 1 << 1, }; enum ibv_create_cq_attr_flags { IBV_CREATE_CQ_ATTR_SINGLE_THREADED = 1 << 0, IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN = 1 << 1, }; struct ibv_cq_init_attr_ex { /* Minimum number of entries required for CQ */ uint32_t cqe; /* Consumer-supplied context returned for completion events */ void *cq_context; /* Completion channel where completion events will be queued. * May be NULL if completion events will not be used. */ struct ibv_comp_channel *channel; /* Completion vector used to signal completion events. * Must be < context->num_comp_vectors. */ uint32_t comp_vector; /* Or'ed bit of enum ibv_create_cq_wc_flags. */ uint64_t wc_flags; /* compatibility mask (extended verb). 
Or'd flags of * enum ibv_cq_init_attr_mask */ uint32_t comp_mask; /* create cq attr flags - one or more flags from * enum ibv_create_cq_attr_flags */ uint32_t flags; struct ibv_pd *parent_domain; }; enum ibv_parent_domain_init_attr_mask { IBV_PARENT_DOMAIN_INIT_ATTR_ALLOCATORS = 1 << 0, IBV_PARENT_DOMAIN_INIT_ATTR_PD_CONTEXT = 1 << 1, }; #define IBV_ALLOCATOR_USE_DEFAULT ((void *)-1) struct ibv_parent_domain_init_attr { struct ibv_pd *pd; /* reference to a protection domain object, can't be NULL */ struct ibv_td *td; /* reference to a thread domain object, or NULL */ uint32_t comp_mask; void *(*alloc)(struct ibv_pd *pd, void *pd_context, size_t size, size_t alignment, uint64_t resource_type); void (*free)(struct ibv_pd *pd, void *pd_context, void *ptr, uint64_t resource_type); void *pd_context; }; struct ibv_counters_init_attr { uint32_t comp_mask; }; struct ibv_counters { struct ibv_context *context; }; enum ibv_counter_description { IBV_COUNTER_PACKETS, IBV_COUNTER_BYTES, }; struct ibv_counter_attach_attr { enum ibv_counter_description counter_desc; uint32_t index; /* Desired location index of the counter at the counters object */ uint32_t comp_mask; }; enum ibv_read_counters_flags { IBV_READ_COUNTERS_ATTR_PREFER_CACHED = 1 << 0, }; enum ibv_values_mask { IBV_VALUES_MASK_RAW_CLOCK = 1 << 0, IBV_VALUES_MASK_RESERVED = 1 << 1 }; struct ibv_values_ex { uint32_t comp_mask; struct timespec raw_clock; }; struct verbs_context { /* "grows up" - new fields go here */ int (*query_port)(struct ibv_context *context, uint8_t port_num, struct ibv_port_attr *port_attr, size_t port_attr_len); int (*advise_mr)(struct ibv_pd *pd, enum ibv_advise_mr_advice advice, uint32_t flags, struct ibv_sge *sg_list, uint32_t num_sges); struct ibv_mr *(*alloc_null_mr)(struct ibv_pd *pd); int (*read_counters)(struct ibv_counters *counters, uint64_t *counters_value, uint32_t ncounters, uint32_t flags); int (*attach_counters_point_flow)(struct ibv_counters *counters, struct ibv_counter_attach_attr *attr, struct ibv_flow *flow); struct ibv_counters *(*create_counters)(struct ibv_context *context, struct ibv_counters_init_attr *init_attr); int (*destroy_counters)(struct ibv_counters *counters); struct ibv_mr *(*reg_dm_mr)(struct ibv_pd *pd, struct ibv_dm *dm, uint64_t dm_offset, size_t length, unsigned int access); struct ibv_dm *(*alloc_dm)(struct ibv_context *context, struct ibv_alloc_dm_attr *attr); int (*free_dm)(struct ibv_dm *dm); int (*modify_flow_action_esp)(struct ibv_flow_action *action, struct ibv_flow_action_esp_attr *attr); int (*destroy_flow_action)(struct ibv_flow_action *action); struct ibv_flow_action *(*create_flow_action_esp)(struct ibv_context *context, struct ibv_flow_action_esp_attr *attr); int (*modify_qp_rate_limit)(struct ibv_qp *qp, struct ibv_qp_rate_limit_attr *attr); struct ibv_pd *(*alloc_parent_domain)(struct ibv_context *context, struct ibv_parent_domain_init_attr *attr); int (*dealloc_td)(struct ibv_td *td); struct ibv_td *(*alloc_td)(struct ibv_context *context, struct ibv_td_init_attr *init_attr); int (*modify_cq)(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr); int (*post_srq_ops)(struct ibv_srq *srq, struct ibv_ops_wr *op, struct ibv_ops_wr **bad_op); int (*destroy_rwq_ind_table)(struct ibv_rwq_ind_table *rwq_ind_table); struct ibv_rwq_ind_table *(*create_rwq_ind_table)(struct ibv_context *context, struct ibv_rwq_ind_table_init_attr *init_attr); int (*destroy_wq)(struct ibv_wq *wq); int (*modify_wq)(struct ibv_wq *wq, struct ibv_wq_attr *wq_attr); struct ibv_wq * (*create_wq)(struct 
ibv_context *context, struct ibv_wq_init_attr *wq_init_attr); int (*query_rt_values)(struct ibv_context *context, struct ibv_values_ex *values); struct ibv_cq_ex *(*create_cq_ex)(struct ibv_context *context, struct ibv_cq_init_attr_ex *init_attr); struct verbs_ex_private *priv; int (*query_device_ex)(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); int (*ibv_destroy_flow) (struct ibv_flow *flow); void (*ABI_placeholder2) (void); /* DO NOT COPY THIS GARBAGE */ struct ibv_flow * (*ibv_create_flow) (struct ibv_qp *qp, struct ibv_flow_attr *flow_attr); void (*ABI_placeholder1) (void); /* DO NOT COPY THIS GARBAGE */ struct ibv_qp *(*open_qp)(struct ibv_context *context, struct ibv_qp_open_attr *attr); struct ibv_qp *(*create_qp_ex)(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_init_attr_ex); int (*get_srq_num)(struct ibv_srq *srq, uint32_t *srq_num); struct ibv_srq * (*create_srq_ex)(struct ibv_context *context, struct ibv_srq_init_attr_ex *srq_init_attr_ex); struct ibv_xrcd * (*open_xrcd)(struct ibv_context *context, struct ibv_xrcd_init_attr *xrcd_init_attr); int (*close_xrcd)(struct ibv_xrcd *xrcd); uint64_t _ABI_placeholder3; size_t sz; /* Must be immediately before struct ibv_context */ struct ibv_context context; /* Must be last field in the struct */ }; static inline struct verbs_context *verbs_get_ctx(struct ibv_context *ctx) { if (ctx->abi_compat != __VERBS_ABI_IS_EXTENDED) return NULL; /* open code container_of to not pollute the global namespace */ return (struct verbs_context *)(((uintptr_t)ctx) - offsetof(struct verbs_context, context)); } #define verbs_get_ctx_op(ctx, op) ({ \ struct verbs_context *__vctx = verbs_get_ctx(ctx); \ (!__vctx || (__vctx->sz < sizeof(*__vctx) - offsetof(struct verbs_context, op)) || \ !__vctx->op) ? NULL : __vctx; }) /** * ibv_get_device_list - Get list of IB devices currently available * @num_devices: optional. if non-NULL, set to the number of devices * returned in the array. * * Return a NULL-terminated array of IB devices. The array can be * released with ibv_free_device_list(). */ struct ibv_device **ibv_get_device_list(int *num_devices); /* * When statically linking the user can set RDMA_STATIC_PROVIDERS to a comma * separated list of provider names to include in the static link, and this * machinery will cause those providers to be included statically. * * Linking will fail if this is set for dynamic linking. */ #ifdef RDMA_STATIC_PROVIDERS #define _RDMA_STATIC_PREFIX_(_1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, \ _12, _13, _14, _15, _16, _17, _18, _19, ...) 
\ &verbs_provider_##_1, &verbs_provider_##_2, &verbs_provider_##_3, \ &verbs_provider_##_4, &verbs_provider_##_5, \ &verbs_provider_##_6, &verbs_provider_##_7, \ &verbs_provider_##_8, &verbs_provider_##_9, \ &verbs_provider_##_10, &verbs_provider_##_11, \ &verbs_provider_##_12, &verbs_provider_##_13, \ &verbs_provider_##_14, &verbs_provider_##_15, \ &verbs_provider_##_16, &verbs_provider_##_17, \ &verbs_provider_##_18, &verbs_provider_##_19 #define _RDMA_STATIC_PREFIX(arg) \ _RDMA_STATIC_PREFIX_(arg, none, none, none, none, none, none, none, \ none, none, none, none, none, none, none, none, \ none, none, none) struct verbs_devices_ops; extern const struct verbs_device_ops verbs_provider_bnxt_re; extern const struct verbs_device_ops verbs_provider_cxgb4; extern const struct verbs_device_ops verbs_provider_efa; extern const struct verbs_device_ops verbs_provider_erdma; extern const struct verbs_device_ops verbs_provider_hfi1verbs; extern const struct verbs_device_ops verbs_provider_hns; extern const struct verbs_device_ops verbs_provider_ipathverbs; extern const struct verbs_device_ops verbs_provider_irdma; extern const struct verbs_device_ops verbs_provider_mana; extern const struct verbs_device_ops verbs_provider_mlx4; extern const struct verbs_device_ops verbs_provider_mlx5; extern const struct verbs_device_ops verbs_provider_mthca; extern const struct verbs_device_ops verbs_provider_ocrdma; extern const struct verbs_device_ops verbs_provider_qedr; extern const struct verbs_device_ops verbs_provider_rxe; extern const struct verbs_device_ops verbs_provider_siw; extern const struct verbs_device_ops verbs_provider_vmw_pvrdma; extern const struct verbs_device_ops verbs_provider_all; extern const struct verbs_device_ops verbs_provider_none; void ibv_static_providers(void *unused, ...); static inline struct ibv_device **__ibv_get_device_list(int *num_devices) { ibv_static_providers(NULL, _RDMA_STATIC_PREFIX(RDMA_STATIC_PROVIDERS), NULL); return ibv_get_device_list(num_devices); } #define ibv_get_device_list(num_devices) __ibv_get_device_list(num_devices) #endif /** * ibv_free_device_list - Free list from ibv_get_device_list() * * Free an array of devices returned from ibv_get_device_list(). Once * the array is freed, pointers to devices that were not opened with * ibv_open_device() are no longer valid. Client code must open all * devices it intends to use before calling ibv_free_device_list(). */ void ibv_free_device_list(struct ibv_device **list); /** * ibv_get_device_name - Return kernel device name */ const char *ibv_get_device_name(struct ibv_device *device); /** * ibv_get_device_index - Return kernel device index * * Available when the kernel supports querying IB devices over the * netlink interface. On kernels without that support, -1 is returned. 
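 *
 * Illustrative sketch (not part of the upstream documentation): open-coded
 * device enumeration using only functions declared in this header, plus
 * printf() from stdio; error handling is elided:
 *
 *	int num_devices;
 *	struct ibv_device **list = ibv_get_device_list(&num_devices);
 *	if (list) {
 *		for (int i = 0; i < num_devices; i++)
 *			printf("%s: index %d\n",
 *			       ibv_get_device_name(list[i]),
 *			       ibv_get_device_index(list[i]));
 *		ibv_free_device_list(list);
 *	}
 *
 * (When statically linking with RDMA_STATIC_PROVIDERS defined before this
 * header is included, the ibv_get_device_list() call above is redirected
 * through __ibv_get_device_list() so the selected providers are linked in.)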
*/ int ibv_get_device_index(struct ibv_device *device); /** * ibv_get_device_guid - Return device's node GUID */ __be64 ibv_get_device_guid(struct ibv_device *device); /** * ibv_open_device - Initialize device for use */ struct ibv_context *ibv_open_device(struct ibv_device *device); /** * ibv_close_device - Release device */ int ibv_close_device(struct ibv_context *context); /** * ibv_import_device - Import device */ struct ibv_context *ibv_import_device(int cmd_fd); /** * ibv_import_pd - Import a protection domain */ struct ibv_pd *ibv_import_pd(struct ibv_context *context, uint32_t pd_handle); /** * ibv_unimport_pd - Unimport a protection domain */ void ibv_unimport_pd(struct ibv_pd *pd); /** * ibv_import_mr - Import a memory region */ struct ibv_mr *ibv_import_mr(struct ibv_pd *pd, uint32_t mr_handle); /** * ibv_unimport_mr - Unimport a memory region */ void ibv_unimport_mr(struct ibv_mr *mr); /** * ibv_import_dm - Import device memory */ struct ibv_dm *ibv_import_dm(struct ibv_context *context, uint32_t dm_handle); /** * ibv_unimport_dm - Unimport device memory */ void ibv_unimport_dm(struct ibv_dm *dm); /** * ibv_get_async_event - Get next async event * @event: Pointer to use to return async event * * All async events returned by ibv_get_async_event() must eventually * be acknowledged with ibv_ack_async_event(). */ int ibv_get_async_event(struct ibv_context *context, struct ibv_async_event *event); /** * ibv_ack_async_event - Acknowledge an async event * @event: Event to be acknowledged. * * All async events which are returned by ibv_get_async_event() must * be acknowledged. To avoid races, destroying an object (CQ, SRQ or * QP) will wait for all affiliated events to be acknowledged, so * there should be a one-to-one correspondence between acks and * successful gets. 
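 *
 * Sketch of a minimal event loop (illustrative only; "context" is an open
 * ibv_context, and shutdown handling is elided):
 *
 *	struct ibv_async_event event;
 *	while (ibv_get_async_event(context, &event) == 0) {
 *		switch (event.event_type) {
 *		case IBV_EVENT_PORT_ACTIVE:
 *		case IBV_EVENT_PORT_ERR:
 *			... react to the port state change ...
 *			break;
 *		default:
 *			break;
 *		}
 *		ibv_ack_async_event(&event);
 *	}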
*/ void ibv_ack_async_event(struct ibv_async_event *event); /** * ibv_query_device - Get device properties */ int ibv_query_device(struct ibv_context *context, struct ibv_device_attr *device_attr); /** * ibv_query_port - Get port properties */ int ibv_query_port(struct ibv_context *context, uint8_t port_num, struct _compat_ibv_port_attr *port_attr); static inline int ___ibv_query_port(struct ibv_context *context, uint8_t port_num, struct ibv_port_attr *port_attr) { struct verbs_context *vctx = verbs_get_ctx_op(context, query_port); if (!vctx) { int rc; memset(port_attr, 0, sizeof(*port_attr)); rc = ibv_query_port(context, port_num, (struct _compat_ibv_port_attr *)port_attr); return rc; } return vctx->query_port(context, port_num, port_attr, sizeof(*port_attr)); } #define ibv_query_port(context, port_num, port_attr) \ ___ibv_query_port(context, port_num, port_attr) /** * ibv_query_gid - Get a GID table entry */ int ibv_query_gid(struct ibv_context *context, uint8_t port_num, int index, union ibv_gid *gid); int _ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num, uint32_t gid_index, struct ibv_gid_entry *entry, uint32_t flags, size_t entry_size); /** * ibv_query_gid_ex - Read a GID table entry */ static inline int ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num, uint32_t gid_index, struct ibv_gid_entry *entry, uint32_t flags) { return _ibv_query_gid_ex(context, port_num, gid_index, entry, flags, sizeof(*entry)); } ssize_t _ibv_query_gid_table(struct ibv_context *context, struct ibv_gid_entry *entries, size_t max_entries, uint32_t flags, size_t entry_size); /* * ibv_query_gid_table - Get all valid GID table entries */ static inline ssize_t ibv_query_gid_table(struct ibv_context *context, struct ibv_gid_entry *entries, size_t max_entries, uint32_t flags) { return _ibv_query_gid_table(context, entries, max_entries, flags, sizeof(*entries)); } /** * ibv_query_pkey - Get a P_Key table entry */ int ibv_query_pkey(struct ibv_context *context, uint8_t port_num, int index, __be16 *pkey); /** * ibv_get_pkey_index - Translate a P_Key into a P_Key index */ int ibv_get_pkey_index(struct ibv_context *context, uint8_t port_num, __be16 pkey); /** * ibv_alloc_pd - Allocate a protection domain */ struct ibv_pd *ibv_alloc_pd(struct ibv_context *context); /** * ibv_dealloc_pd - Free a protection domain */ int ibv_dealloc_pd(struct ibv_pd *pd); static inline struct ibv_flow *ibv_create_flow(struct ibv_qp *qp, struct ibv_flow_attr *flow) { struct verbs_context *vctx = verbs_get_ctx_op(qp->context, ibv_create_flow); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->ibv_create_flow(qp, flow); } static inline int ibv_destroy_flow(struct ibv_flow *flow_id) { struct verbs_context *vctx = verbs_get_ctx_op(flow_id->context, ibv_destroy_flow); if (!vctx) return EOPNOTSUPP; return vctx->ibv_destroy_flow(flow_id); } static inline struct ibv_flow_action * ibv_create_flow_action_esp(struct ibv_context *ctx, struct ibv_flow_action_esp_attr *esp) { struct verbs_context *vctx = verbs_get_ctx_op(ctx, create_flow_action_esp); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->create_flow_action_esp(ctx, esp); } static inline int ibv_modify_flow_action_esp(struct ibv_flow_action *action, struct ibv_flow_action_esp_attr *esp) { struct verbs_context *vctx = verbs_get_ctx_op(action->context, modify_flow_action_esp); if (!vctx) return EOPNOTSUPP; return vctx->modify_flow_action_esp(action, esp); } static inline int ibv_destroy_flow_action(struct ibv_flow_action *action) { struct 
verbs_context *vctx = verbs_get_ctx_op(action->context, destroy_flow_action); if (!vctx) return EOPNOTSUPP; return vctx->destroy_flow_action(action); } /** * ibv_open_xrcd - Open an extended connection domain */ static inline struct ibv_xrcd * ibv_open_xrcd(struct ibv_context *context, struct ibv_xrcd_init_attr *xrcd_init_attr) { struct verbs_context *vctx = verbs_get_ctx_op(context, open_xrcd); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->open_xrcd(context, xrcd_init_attr); } /** * ibv_close_xrcd - Close an extended connection domain */ static inline int ibv_close_xrcd(struct ibv_xrcd *xrcd) { struct verbs_context *vctx = verbs_get_ctx(xrcd->context); return vctx->close_xrcd(xrcd); } /** * ibv_reg_mr_iova2 - Register memory region with a virtual offset address * * This version will be called if ibv_reg_mr or ibv_reg_mr_iova were called * with at least one optional access flag from the IBV_ACCESS_OPTIONAL_RANGE * bits flag range. The optional access flags will be masked if running over * kernel that does not support passing them. */ struct ibv_mr *ibv_reg_mr_iova2(struct ibv_pd *pd, void *addr, size_t length, uint64_t iova, unsigned int access); /** * ibv_reg_mr - Register a memory region */ struct ibv_mr *ibv_reg_mr(struct ibv_pd *pd, void *addr, size_t length, int access); /* use new ibv_reg_mr version only if access flags that require it are used */ __attribute__((__always_inline__)) static inline struct ibv_mr * __ibv_reg_mr(struct ibv_pd *pd, void *addr, size_t length, unsigned int access, int is_access_const) { if (is_access_const && (access & IBV_ACCESS_OPTIONAL_RANGE) == 0) return ibv_reg_mr(pd, addr, length, (int)access); else return ibv_reg_mr_iova2(pd, addr, length, (uintptr_t)addr, access); } #define ibv_reg_mr(pd, addr, length, access) \ __ibv_reg_mr(pd, addr, length, access, \ __builtin_constant_p( \ ((int)(access) & IBV_ACCESS_OPTIONAL_RANGE) == 0)) /** * ibv_reg_mr_iova - Register a memory region with a virtual offset * address */ struct ibv_mr *ibv_reg_mr_iova(struct ibv_pd *pd, void *addr, size_t length, uint64_t iova, int access); /* use new ibv_reg_mr version only if access flags that require it are used */ __attribute__((__always_inline__)) static inline struct ibv_mr * __ibv_reg_mr_iova(struct ibv_pd *pd, void *addr, size_t length, uint64_t iova, unsigned int access, int is_access_const) { if (is_access_const && (access & IBV_ACCESS_OPTIONAL_RANGE) == 0) return ibv_reg_mr_iova(pd, addr, length, iova, (int)access); else return ibv_reg_mr_iova2(pd, addr, length, iova, access); } #define ibv_reg_mr_iova(pd, addr, length, iova, access) \ __ibv_reg_mr_iova(pd, addr, length, iova, access, \ __builtin_constant_p( \ ((access) & IBV_ACCESS_OPTIONAL_RANGE) == 0)) /** * ibv_reg_dmabuf_mr - Register a dmabuf-based memory region */ struct ibv_mr *ibv_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access); enum ibv_rereg_mr_err_code { /* Old MR is valid, invalid input */ IBV_REREG_MR_ERR_INPUT = -1, /* Old MR is valid, failed via don't fork on new address range */ IBV_REREG_MR_ERR_DONT_FORK_NEW = -2, /* New MR is valid, failed via do fork on old address range */ IBV_REREG_MR_ERR_DO_FORK_OLD = -3, /* MR shouldn't be used, command error */ IBV_REREG_MR_ERR_CMD = -4, /* MR shouldn't be used, command error, invalid fork state on new address range */ IBV_REREG_MR_ERR_CMD_AND_DO_FORK_NEW = -5, }; /** * ibv_rereg_mr - Re-Register a memory region */ int ibv_rereg_mr(struct ibv_mr *mr, int flags, struct ibv_pd *pd, void *addr, 
size_t length, int access); /** * ibv_dereg_mr - Deregister a memory region */ int ibv_dereg_mr(struct ibv_mr *mr); /** * ibv_alloc_mw - Allocate a memory window */ static inline struct ibv_mw *ibv_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type) { struct ibv_mw *mw; if (!pd->context->ops.alloc_mw) { errno = EOPNOTSUPP; return NULL; } mw = pd->context->ops.alloc_mw(pd, type); return mw; } /** * ibv_dealloc_mw - Free a memory window */ static inline int ibv_dealloc_mw(struct ibv_mw *mw) { return mw->context->ops.dealloc_mw(mw); } /** * ibv_inc_rkey - Increase the 8 lsb in the given rkey */ static inline uint32_t ibv_inc_rkey(uint32_t rkey) { const uint32_t mask = 0x000000ff; uint8_t newtag = (uint8_t)((rkey + 1) & mask); return (rkey & ~mask) | newtag; } /** * ibv_bind_mw - Bind a memory window to a region */ static inline int ibv_bind_mw(struct ibv_qp *qp, struct ibv_mw *mw, struct ibv_mw_bind *mw_bind) { struct ibv_mw_bind_info *bind_info = &mw_bind->bind_info; if (mw->type != IBV_MW_TYPE_1) return EINVAL; if (!bind_info->mr && (bind_info->addr || bind_info->length)) return EINVAL; if (bind_info->mr && (mw->pd != bind_info->mr->pd)) return EPERM; return mw->context->ops.bind_mw(qp, mw, mw_bind); } /** * ibv_create_comp_channel - Create a completion event channel */ struct ibv_comp_channel *ibv_create_comp_channel(struct ibv_context *context); /** * ibv_destroy_comp_channel - Destroy a completion event channel */ int ibv_destroy_comp_channel(struct ibv_comp_channel *channel); /** * ibv_advise_mr - Gives advice about an address range in MRs * @pd - protection domain of all MRs to which the advice applies * @advice - type of advice * @flags - advice modifiers * @sg_list - an array of memory ranges * @num_sge - number of elements in the array */ static inline int ibv_advise_mr(struct ibv_pd *pd, enum ibv_advise_mr_advice advice, uint32_t flags, struct ibv_sge *sg_list, uint32_t num_sge) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(pd->context, advise_mr); if (!vctx) return EOPNOTSUPP; return vctx->advise_mr(pd, advice, flags, sg_list, num_sge); } /** * ibv_alloc_dm - Allocate device memory * @context - Context DM will be attached to * @attr - Attributes to allocate the DM with */ static inline struct ibv_dm *ibv_alloc_dm(struct ibv_context *context, struct ibv_alloc_dm_attr *attr) { struct verbs_context *vctx = verbs_get_ctx_op(context, alloc_dm); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->alloc_dm(context, attr); } /** * ibv_free_dm - Free device allocated memory * @dm - The DM to free */ static inline int ibv_free_dm(struct ibv_dm *dm) { struct verbs_context *vctx = verbs_get_ctx_op(dm->context, free_dm); if (!vctx) return EOPNOTSUPP; return vctx->free_dm(dm); } /** * ibv_memcpy_to/from_dm - copy to/from device allocated memory * @dm - The DM to copy to/from * @dm_offset - Offset in bytes from beginning of DM to start copy to/from * @host_addr - Host memory address to copy to/from * @length - Number of bytes to copy */ static inline int ibv_memcpy_to_dm(struct ibv_dm *dm, uint64_t dm_offset, const void *host_addr, size_t length) { return dm->memcpy_to_dm(dm, dm_offset, host_addr, length); } static inline int ibv_memcpy_from_dm(void *host_addr, struct ibv_dm *dm, uint64_t dm_offset, size_t length) { return dm->memcpy_from_dm(host_addr, dm, dm_offset, length); } /* * ibv_alloc_null_mr - Allocate a null memory region. * @pd - The protection domain associated with the MR. 
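 *
 * Sketch (illustrative; the exact data-sink semantics are provider
 * specific, and "pd" and "length" are assumed to exist in the caller):
 *
 *	struct ibv_mr *null_mr = ibv_alloc_null_mr(pd);
 *	if (null_mr) {
 *		struct ibv_sge sge = {
 *			.addr = 0,
 *			.length = length,
 *			.lkey = null_mr->lkey,
 *		};
 *		... use the SGE where the payload should be discarded,
 *		    then release the MR with ibv_dereg_mr(null_mr) ...
 *	}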
*/ static inline struct ibv_mr *ibv_alloc_null_mr(struct ibv_pd *pd) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(pd->context, alloc_null_mr); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->alloc_null_mr(pd); } /** * ibv_reg_dm_mr - Register device memory as a memory region * @pd - The PD to associated this MR with * @dm - The DM to register * @dm_offset - Offset in bytes from beginning of DM to start registration from * @length - Number of bytes to register * @access - memory region access flags */ static inline struct ibv_mr *ibv_reg_dm_mr(struct ibv_pd *pd, struct ibv_dm *dm, uint64_t dm_offset, size_t length, unsigned int access) { struct verbs_context *vctx = verbs_get_ctx_op(pd->context, reg_dm_mr); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->reg_dm_mr(pd, dm, dm_offset, length, access); } /** * ibv_create_cq - Create a completion queue * @context - Context CQ will be attached to * @cqe - Minimum number of entries required for CQ * @cq_context - Consumer-supplied context returned for completion events * @channel - Completion channel where completion events will be queued. * May be NULL if completion events will not be used. * @comp_vector - Completion vector used to signal completion events. * Must be >= 0 and < context->num_comp_vectors. */ struct ibv_cq *ibv_create_cq(struct ibv_context *context, int cqe, void *cq_context, struct ibv_comp_channel *channel, int comp_vector); /** * ibv_create_cq_ex - Create a completion queue * @context - Context CQ will be attached to * @cq_attr - Attributes to create the CQ with */ static inline struct ibv_cq_ex *ibv_create_cq_ex(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr) { struct verbs_context *vctx = verbs_get_ctx_op(context, create_cq_ex); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->create_cq_ex(context, cq_attr); } /** * ibv_resize_cq - Modifies the capacity of the CQ. * @cq: The CQ to resize. * @cqe: The minimum size of the CQ. * * Users can examine the cq structure to determine the actual CQ size. */ int ibv_resize_cq(struct ibv_cq *cq, int cqe); /** * ibv_destroy_cq - Destroy a completion queue */ int ibv_destroy_cq(struct ibv_cq *cq); /** * ibv_get_cq_event - Read next CQ event * @channel: Channel to get next event from. * @cq: Used to return pointer to CQ. * @cq_context: Used to return consumer-supplied CQ context. * * All completion events returned by ibv_get_cq_event() must * eventually be acknowledged with ibv_ack_cq_events(). */ int ibv_get_cq_event(struct ibv_comp_channel *channel, struct ibv_cq **cq, void **cq_context); /** * ibv_ack_cq_events - Acknowledge CQ completion events * @cq: CQ to acknowledge events for * @nevents: Number of events to acknowledge. * * All completion events which are returned by ibv_get_cq_event() must * be acknowledged. To avoid races, ibv_destroy_cq() will wait for * all completion events to be acknowledged, so there should be a * one-to-one correspondence between acks and successful gets. An * application may accumulate multiple completion events and * acknowledge them in a single call to ibv_ack_cq_events() by passing * the number of events to ack in @nevents. */ void ibv_ack_cq_events(struct ibv_cq *cq, unsigned int nevents); /** * ibv_poll_cq - Poll a CQ for work completions * @cq:the CQ being polled * @num_entries:maximum number of completions to return * @wc:array of at least @num_entries of &struct ibv_wc where completions * will be returned * * Poll a CQ for (possibly multiple) completions. 
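 *
 * Illustrative polling sketch (not upstream documentation; "cq" is a CQ
 * created with ibv_create_cq() and completion processing is elided):
 *
 *	struct ibv_wc wc[16];
 *	int n = ibv_poll_cq(cq, 16, wc);
 *	for (int i = 0; i < n; i++)
 *		if (wc[i].status != IBV_WC_SUCCESS)
 *			... handle the failed completion in wc[i] ...
 *
 * A CQ created with ibv_create_cq_ex() is drained instead with
 * ibv_start_poll()/ibv_next_poll()/ibv_end_poll() and the ibv_wc_read_*()
 * accessors declared earlier in this header, for example:
 *
 *	struct ibv_poll_cq_attr pattr = {};
 *	if (ibv_start_poll(cq_ex, &pattr) == 0) {
 *		do {
 *			if (cq_ex->status == IBV_WC_SUCCESS)
 *				... read fields via ibv_wc_read_opcode(cq_ex),
 *				    ibv_wc_read_byte_len(cq_ex), etc. ...
 *		} while (ibv_next_poll(cq_ex) == 0);
 *		ibv_end_poll(cq_ex);
 *	}
 *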
If the return value of ibv_poll_cq() is < 0, an error occurred. If the return value is >= 0, it is the * number of completions returned. If the return value is * non-negative and strictly less than num_entries, then the CQ was * emptied. */ static inline int ibv_poll_cq(struct ibv_cq *cq, int num_entries, struct ibv_wc *wc) { return cq->context->ops.poll_cq(cq, num_entries, wc); } /** * ibv_req_notify_cq - Request completion notification on a CQ. An * event will be added to the completion channel associated with the * CQ when an entry is added to the CQ. * @cq: The completion queue to request notification for. * @solicited_only: If non-zero, an event will be generated only for * the next solicited CQ entry. If zero, any CQ entry, solicited or * not, will generate an event. */ static inline int ibv_req_notify_cq(struct ibv_cq *cq, int solicited_only) { return cq->context->ops.req_notify_cq(cq, solicited_only); } static inline int ibv_modify_cq(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr) { struct verbs_context *vctx = verbs_get_ctx_op(cq->context, modify_cq); if (!vctx) return EOPNOTSUPP; return vctx->modify_cq(cq, attr); } /** * ibv_create_srq - Creates an SRQ associated with the specified protection * domain. * @pd: The protection domain associated with the SRQ. * @srq_init_attr: A list of initial attributes required to create the SRQ. * * srq_attr->max_wr and srq_attr->max_sge are read to determine the * requested size of the SRQ, and are set to the actual values allocated * on return. If ibv_create_srq() succeeds, then max_wr and max_sge * will always be at least as large as the requested values. */ struct ibv_srq *ibv_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *srq_init_attr); static inline struct ibv_srq * ibv_create_srq_ex(struct ibv_context *context, struct ibv_srq_init_attr_ex *srq_init_attr_ex) { struct verbs_context *vctx; uint32_t mask = srq_init_attr_ex->comp_mask; if (!(mask & ~(uint32_t)(IBV_SRQ_INIT_ATTR_PD | IBV_SRQ_INIT_ATTR_TYPE)) && (mask & IBV_SRQ_INIT_ATTR_PD) && (!(mask & IBV_SRQ_INIT_ATTR_TYPE) || (srq_init_attr_ex->srq_type == IBV_SRQT_BASIC))) return ibv_create_srq(srq_init_attr_ex->pd, (struct ibv_srq_init_attr *)srq_init_attr_ex); vctx = verbs_get_ctx_op(context, create_srq_ex); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->create_srq_ex(context, srq_init_attr_ex); } /** * ibv_modify_srq - Modifies the attributes for the specified SRQ. * @srq: The SRQ to modify. * @srq_attr: On input, specifies the SRQ attributes to modify. On output, * the current values of selected SRQ attributes are returned. * @srq_attr_mask: A bit-mask used to specify which attributes of the SRQ * are being modified. * * The mask may contain IBV_SRQ_MAX_WR to resize the SRQ and/or * IBV_SRQ_LIMIT to set the SRQ's limit and request notification when * the number of receives queued drops below the limit. */ int ibv_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, int srq_attr_mask); /** * ibv_query_srq - Returns the attribute list and current values for the * specified SRQ. * @srq: The SRQ to query. * @srq_attr: The attributes of the specified SRQ. */ int ibv_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr); static inline int ibv_get_srq_num(struct ibv_srq *srq, uint32_t *srq_num) { struct verbs_context *vctx = verbs_get_ctx_op(srq->context, get_srq_num); if (!vctx) return EOPNOTSUPP; return vctx->get_srq_num(srq, srq_num); } /** * ibv_destroy_srq - Destroys the specified SRQ. * @srq: The SRQ to destroy. 
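 *
 * Lifecycle sketch (illustrative; "pd", the receive WR chain and all
 * error handling are assumed):
 *
 *	struct ibv_srq_init_attr init = {
 *		.attr = { .max_wr = 64, .max_sge = 1 },
 *	};
 *	struct ibv_srq *srq = ibv_create_srq(pd, &init);
 *	... post buffers with ibv_post_srq_recv(srq, &wr, &bad_wr),
 *	    attach the SRQ to QPs, and finally: ...
 *	ibv_destroy_srq(srq);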
*/ int ibv_destroy_srq(struct ibv_srq *srq); /** * ibv_post_srq_recv - Posts a list of work requests to the specified SRQ. * @srq: The SRQ to post the work request on. * @recv_wr: A list of work requests to post on the receive queue. * @bad_recv_wr: On an immediate failure, this parameter will reference * the work request that failed to be posted on the QP. */ static inline int ibv_post_srq_recv(struct ibv_srq *srq, struct ibv_recv_wr *recv_wr, struct ibv_recv_wr **bad_recv_wr) { return srq->context->ops.post_srq_recv(srq, recv_wr, bad_recv_wr); } static inline int ibv_post_srq_ops(struct ibv_srq *srq, struct ibv_ops_wr *op, struct ibv_ops_wr **bad_op) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(srq->context, post_srq_ops); if (!vctx) { *bad_op = op; return EOPNOTSUPP; } return vctx->post_srq_ops(srq, op, bad_op); } /** * ibv_create_qp - Create a queue pair. */ struct ibv_qp *ibv_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *qp_init_attr); static inline struct ibv_qp * ibv_create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_init_attr_ex) { struct verbs_context *vctx; uint32_t mask = qp_init_attr_ex->comp_mask; if (mask == IBV_QP_INIT_ATTR_PD) return ibv_create_qp(qp_init_attr_ex->pd, (struct ibv_qp_init_attr *)qp_init_attr_ex); vctx = verbs_get_ctx_op(context, create_qp_ex); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->create_qp_ex(context, qp_init_attr_ex); } /** * ibv_alloc_td - Allocate a thread domain */ static inline struct ibv_td *ibv_alloc_td(struct ibv_context *context, struct ibv_td_init_attr *init_attr) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(context, alloc_td); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->alloc_td(context, init_attr); } /** * ibv_dealloc_td - Free a thread domain */ static inline int ibv_dealloc_td(struct ibv_td *td) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(td->context, dealloc_td); if (!vctx) return EOPNOTSUPP; return vctx->dealloc_td(td); } /** * ibv_alloc_parent_domain - Allocate a parent domain */ static inline struct ibv_pd * ibv_alloc_parent_domain(struct ibv_context *context, struct ibv_parent_domain_init_attr *attr) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(context, alloc_parent_domain); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->alloc_parent_domain(context, attr); } /** * ibv_query_rt_values_ex - Get current real time @values of a device. * @values - in/out - defines the attributes we need to query/queried. * (Or's bits of enum ibv_values_mask on values->comp_mask field) */ static inline int ibv_query_rt_values_ex(struct ibv_context *context, struct ibv_values_ex *values) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(context, query_rt_values); if (!vctx) return EOPNOTSUPP; return vctx->query_rt_values(context, values); } /** * ibv_query_device_ex - Get extended device properties */ static inline int ibv_query_device_ex(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr) { struct verbs_context *vctx; int ret; if (input && input->comp_mask) return EINVAL; vctx = verbs_get_ctx_op(context, query_device_ex); if (!vctx) goto legacy; ret = vctx->query_device_ex(context, input, attr, sizeof(*attr)); if (ret == EOPNOTSUPP || ret == ENOSYS) goto legacy; return ret; legacy: memset(attr, 0, sizeof(*attr)); ret = ibv_query_device(context, &attr->orig_attr); return ret; } /** * ibv_open_qp - Open a shareable queue pair. 
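 *
 * Sketch (illustrative; "xrcd" and "qp_num" are assumed to describe an
 * XRC QP created elsewhere, e.g. by another process sharing the XRCD):
 *
 *	struct ibv_qp_open_attr oattr = {
 *		.comp_mask = IBV_QP_OPEN_ATTR_NUM | IBV_QP_OPEN_ATTR_XRCD |
 *			     IBV_QP_OPEN_ATTR_TYPE,
 *		.qp_num = qp_num,
 *		.xrcd = xrcd,
 *		.qp_type = IBV_QPT_XRC_RECV,
 *	};
 *	struct ibv_qp *qp = ibv_open_qp(context, &oattr);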
*/ static inline struct ibv_qp * ibv_open_qp(struct ibv_context *context, struct ibv_qp_open_attr *qp_open_attr) { struct verbs_context *vctx = verbs_get_ctx_op(context, open_qp); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->open_qp(context, qp_open_attr); } /** * ibv_modify_qp - Modify a queue pair. */ int ibv_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask); /** * ibv_modify_qp_rate_limit - Modify a queue pair rate limit values * @qp - QP object to modify * @attr - Attributes to configure the rate limiting values of the QP */ static inline int ibv_modify_qp_rate_limit(struct ibv_qp *qp, struct ibv_qp_rate_limit_attr *attr) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(qp->context, modify_qp_rate_limit); if (!vctx) return EOPNOTSUPP; return vctx->modify_qp_rate_limit(qp, attr); } /** * ibv_query_qp_data_in_order - Checks whether the data is guaranteed to be * written in-order. * @qp: The QP to query. * @op: Operation type. * @flags: Flags are used to select a query type. * For IBV_QUERY_QP_DATA_IN_ORDER_RETURN_CAPS, the function will return a * capabilities vector. If 0, will query for IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG * support and return 0/1 result. * * Return Value * ibv_query_qp_data_in_order() return value is determined by flags. * For each capability bit, 1 is returned if the data is guaranteed to be * written in-order for selected operation and type, 0 otherwise. */ int ibv_query_qp_data_in_order(struct ibv_qp *qp, enum ibv_wr_opcode op, uint32_t flags); /** * ibv_query_qp - Returns the attribute list and current values for the * specified QP. * @qp: The QP to query. * @attr: The attributes of the specified QP. * @attr_mask: A bit-mask used to select specific attributes to query. * @init_attr: Additional attributes of the selected QP. * * The qp_attr_mask may be used to limit the query to gathering only the * selected attributes. */ int ibv_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); /** * ibv_destroy_qp - Destroy a queue pair. */ int ibv_destroy_qp(struct ibv_qp *qp); /* * ibv_create_wq - Creates a WQ associated with the specified protection * domain. * @context: ibv_context. * @wq_init_attr: A list of initial attributes required to create the * WQ. If WQ creation succeeds, then the attributes are updated to * the actual capabilities of the created WQ. * * wq_init_attr->max_wr and wq_init_attr->max_sge determine * the requested size of the WQ, and set to the actual values allocated * on return. * If ibv_create_wq() succeeds, then max_wr and max_sge will always be * at least as large as the requested values. * * Return Value * ibv_create_wq() returns a pointer to the created WQ, or NULL if the request * fails. */ static inline struct ibv_wq *ibv_create_wq(struct ibv_context *context, struct ibv_wq_init_attr *wq_init_attr) { struct verbs_context *vctx = verbs_get_ctx_op(context, create_wq); struct ibv_wq *wq; if (!vctx) { errno = EOPNOTSUPP; return NULL; } wq = vctx->create_wq(context, wq_init_attr); if (wq) { wq->wq_context = wq_init_attr->wq_context; wq->events_completed = 0; pthread_mutex_init(&wq->mutex, NULL); pthread_cond_init(&wq->cond, NULL); } return wq; } /* * ibv_modify_wq - Modifies the attributes for the specified WQ. * @wq: The WQ to modify. * @wq_attr: On input, specifies the WQ attributes to modify. * wq_attr->attr_mask: A bit-mask used to specify which attributes of the WQ * are being modified. 
* On output, the current values of selected WQ attributes are returned. * * Return Value * ibv_modify_wq() returns 0 on success, or the value of errno * on failure (which indicates the failure reason). * */ static inline int ibv_modify_wq(struct ibv_wq *wq, struct ibv_wq_attr *wq_attr) { struct verbs_context *vctx = verbs_get_ctx_op(wq->context, modify_wq); if (!vctx) return EOPNOTSUPP; return vctx->modify_wq(wq, wq_attr); } /* * ibv_destroy_wq - Destroys the specified WQ. * @ibv_wq: The WQ to destroy. * Return Value * ibv_destroy_wq() returns 0 on success, or the value of errno * on failure (which indicates the failure reason). */ static inline int ibv_destroy_wq(struct ibv_wq *wq) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(wq->context, destroy_wq); if (!vctx) return EOPNOTSUPP; return vctx->destroy_wq(wq); } /* * ibv_create_rwq_ind_table - Creates a receive work queue Indirection Table * @context: ibv_context. * @init_attr: A list of initial attributes required to create the Indirection Table. * Return Value * ibv_create_rwq_ind_table returns a pointer to the created * Indirection Table, or NULL if the request fails. */ static inline struct ibv_rwq_ind_table *ibv_create_rwq_ind_table(struct ibv_context *context, struct ibv_rwq_ind_table_init_attr *init_attr) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(context, create_rwq_ind_table); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->create_rwq_ind_table(context, init_attr); } /* * ibv_destroy_rwq_ind_table - Destroys the specified Indirection Table. * @rwq_ind_table: The Indirection Table to destroy. * Return Value * ibv_destroy_rwq_ind_table() returns 0 on success, or the value of errno * on failure (which indicates the failure reason). */ static inline int ibv_destroy_rwq_ind_table(struct ibv_rwq_ind_table *rwq_ind_table) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(rwq_ind_table->context, destroy_rwq_ind_table); if (!vctx) return EOPNOTSUPP; return vctx->destroy_rwq_ind_table(rwq_ind_table); } /** * ibv_post_send - Post a list of work requests to a send queue. * * If IBV_SEND_INLINE flag is set, the data buffers can be reused * immediately after the call returns. */ static inline int ibv_post_send(struct ibv_qp *qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { return qp->context->ops.post_send(qp, wr, bad_wr); } /** * ibv_post_recv - Post a list of work requests to a receive queue. */ static inline int ibv_post_recv(struct ibv_qp *qp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { return qp->context->ops.post_recv(qp, wr, bad_wr); } /** * ibv_create_ah - Create an address handle. */ struct ibv_ah *ibv_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr); /** * ibv_init_ah_from_wc - Initializes address handle attributes from a * work completion. * @context: Device context on which the received message arrived. * @port_num: Port on which the received message arrived. * @wc: Work completion associated with the received message. * @grh: References the received global route header. This parameter is * ignored unless the work completion indicates that the GRH is valid. * @ah_attr: Returned attributes that can be used when creating an address * handle for replying to the message. */ int ibv_init_ah_from_wc(struct ibv_context *context, uint8_t port_num, struct ibv_wc *wc, struct ibv_grh *grh, struct ibv_ah_attr *ah_attr); /** * ibv_create_ah_from_wc - Creates an address handle associated with the * sender of the specified work completion. 
* @pd: The protection domain associated with the address handle. * @wc: Work completion information associated with a received message. * @grh: References the received global route header. This parameter is * ignored unless the work completion indicates that the GRH is valid. * @port_num: The outbound port number to associate with the address. * * The address handle is used to reference a local or global destination * in all UD QP post sends. */ struct ibv_ah *ibv_create_ah_from_wc(struct ibv_pd *pd, struct ibv_wc *wc, struct ibv_grh *grh, uint8_t port_num); /** * ibv_destroy_ah - Destroy an address handle. */ int ibv_destroy_ah(struct ibv_ah *ah); /** * ibv_attach_mcast - Attaches the specified QP to a multicast group. * @qp: QP to attach to the multicast group. The QP must be a UD QP. * @gid: Multicast group GID. * @lid: Multicast group LID in host byte order. * * In order to route multicast packets correctly, subnet * administration must have created the multicast group and configured * the fabric appropriately. The port associated with the specified * QP must also be a member of the multicast group. */ int ibv_attach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); /** * ibv_detach_mcast - Detaches the specified QP from a multicast group. * @qp: QP to detach from the multicast group. * @gid: Multicast group GID. * @lid: Multicast group LID in host byte order. */ int ibv_detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); /** * ibv_fork_init - Prepare data structures so that fork() may be used * safely. If this function is not called or returns a non-zero * status, then libibverbs data structures are not fork()-safe and the * effect of an application calling fork() is undefined. */ int ibv_fork_init(void); /** * ibv_is_fork_initialized - Check if fork support * (ibv_fork_init) was enabled. 
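 *
 * Sketch (illustrative): call ibv_fork_init() once, before opening any
 * device, when the process may fork():
 *
 *	if (ibv_fork_init() != 0)
 *		... continue, but fork() is not safe in this process ...
 *	else if (ibv_is_fork_initialized() == IBV_FORK_UNNEEDED)
 *		... the kernel provides copy-on-fork, so no madvise()
 *		    based workaround is needed ...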
*/ enum ibv_fork_status ibv_is_fork_initialized(void); /** * ibv_node_type_str - Return string describing node_type enum value */ const char *ibv_node_type_str(enum ibv_node_type node_type); /** * ibv_port_state_str - Return string describing port_state enum value */ const char *ibv_port_state_str(enum ibv_port_state port_state); /** * ibv_event_type_str - Return string describing event_type enum value */ const char *ibv_event_type_str(enum ibv_event_type event); int ibv_resolve_eth_l2_from_gid(struct ibv_context *context, struct ibv_ah_attr *attr, uint8_t eth_mac[ETHERNET_LL_SIZE], uint16_t *vid); static inline int ibv_is_qpt_supported(uint32_t caps, enum ibv_qp_type qpt) { return !!(caps & (1 << qpt)); } static inline struct ibv_counters *ibv_create_counters(struct ibv_context *context, struct ibv_counters_init_attr *init_attr) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(context, create_counters); if (!vctx) { errno = EOPNOTSUPP; return NULL; } return vctx->create_counters(context, init_attr); } static inline int ibv_destroy_counters(struct ibv_counters *counters) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(counters->context, destroy_counters); if (!vctx) return EOPNOTSUPP; return vctx->destroy_counters(counters); } static inline int ibv_attach_counters_point_flow(struct ibv_counters *counters, struct ibv_counter_attach_attr *attr, struct ibv_flow *flow) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(counters->context, attach_counters_point_flow); if (!vctx) return EOPNOTSUPP; return vctx->attach_counters_point_flow(counters, attr, flow); } static inline int ibv_read_counters(struct ibv_counters *counters, uint64_t *counters_value, uint32_t ncounters, uint32_t flags) { struct verbs_context *vctx; vctx = verbs_get_ctx_op(counters->context, read_counters); if (!vctx) return EOPNOTSUPP; return vctx->read_counters(counters, counters_value, ncounters, flags); } #define IB_ROCE_UDP_ENCAP_VALID_PORT_MIN (0xC000) #define IB_ROCE_UDP_ENCAP_VALID_PORT_MAX (0xFFFF) #define IB_GRH_FLOWLABEL_MASK (0x000FFFFF) static inline uint16_t ibv_flow_label_to_udp_sport(uint32_t fl) { uint32_t fl_low = fl & 0x03FFF, fl_high = fl & 0xFC000; fl_low ^= fl_high >> 14; return (uint16_t)(fl_low | IB_ROCE_UDP_ENCAP_VALID_PORT_MIN); } /** * ibv_set_ece - Set ECE options */ int ibv_set_ece(struct ibv_qp *qp, struct ibv_ece *ece); /** * ibv_query_ece - Get accepted ECE options */ int ibv_query_ece(struct ibv_qp *qp, struct ibv_ece *ece); #ifdef __cplusplus } #endif # undef __attribute_const #endif /* INFINIBAND_VERBS_H */ rdma-core-56.1/libibverbs/verbs_api.h000066400000000000000000000124061477342711600176030ustar00rootroot00000000000000/* * Copyright (c) 2017, Mellanox Technologies inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef VERBS_API_H #define VERBS_API_H #if UINTPTR_MAX == UINT32_MAX #if __BYTE_ORDER == __LITTLE_ENDIAN #define RDMA_UAPI_PTR(_type, _name) \ union { \ struct { \ _type _name; \ __u32 _name##_reserved; \ }; \ __aligned_u64 _name##_data_u64; \ } #else #define RDMA_UAPI_PTR(_type, _name) \ union { \ struct { \ __u32 _name##_reserved; \ _type _name; \ }; \ __aligned_u64 _name##_data_u64; \ } #endif #elif UINTPTR_MAX == UINT64_MAX #define RDMA_UAPI_PTR(_type, _name) \ union { \ _type _name; \ __aligned_u64 _name##_data_u64; \ } #else #error "Pointer size not supported" #endif #include <rdma/ib_user_ioctl_verbs.h> #define ibv_flow_action_esp_keymat ib_uverbs_flow_action_esp_keymat #define IBV_FLOW_ACTION_ESP_KEYMAT_AES_GCM IB_UVERBS_FLOW_ACTION_ESP_KEYMAT_AES_GCM #define ibv_flow_action_esp_keymat_aes_gcm_iv_algo ib_uverbs_flow_action_esp_keymat_aes_gcm_iv_algo #define IBV_FLOW_ACTION_IV_ALGO_SEQ IB_UVERBS_FLOW_ACTION_IV_ALGO_SEQ #define ibv_flow_action_esp_keymat_aes_gcm ib_uverbs_flow_action_esp_keymat_aes_gcm #define ibv_flow_action_esp_replay ib_uverbs_flow_action_esp_replay #define IBV_FLOW_ACTION_ESP_REPLAY_NONE IB_UVERBS_FLOW_ACTION_ESP_REPLAY_NONE #define IBV_FLOW_ACTION_ESP_REPLAY_BMP IB_UVERBS_FLOW_ACTION_ESP_REPLAY_BMP #define ibv_flow_action_esp_replay_bmp ib_uverbs_flow_action_esp_replay_bmp #define ibv_flow_action_esp_flags ib_uverbs_flow_action_esp_flags #define IBV_FLOW_ACTION_ESP_FLAGS_INLINE_CRYPTO IB_UVERBS_FLOW_ACTION_ESP_FLAGS_INLINE_CRYPTO #define IBV_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD #define IBV_FLOW_ACTION_ESP_FLAGS_TUNNEL IB_UVERBS_FLOW_ACTION_ESP_FLAGS_TUNNEL #define IBV_FLOW_ACTION_ESP_FLAGS_TRANSPORT IB_UVERBS_FLOW_ACTION_ESP_FLAGS_TRANSPORT #define IBV_FLOW_ACTION_ESP_FLAGS_DECRYPT IB_UVERBS_FLOW_ACTION_ESP_FLAGS_DECRYPT #define IBV_FLOW_ACTION_ESP_FLAGS_ENCRYPT IB_UVERBS_FLOW_ACTION_ESP_FLAGS_ENCRYPT #define IBV_FLOW_ACTION_ESP_FLAGS_ESN_NEW_WINDOW IB_UVERBS_FLOW_ACTION_ESP_FLAGS_ESN_NEW_WINDOW #define ibv_flow_action_esp_encap ib_uverbs_flow_action_esp_encap #define ibv_flow_action_esp ib_uverbs_flow_action_esp #define ibv_advise_mr_advice ib_uverbs_advise_mr_advice #define IBV_ADVISE_MR_ADVICE_PREFETCH IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH #define IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_WRITE #define IBV_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT #define IBV_ADVISE_MR_FLAG_FLUSH IB_UVERBS_ADVISE_MR_FLAG_FLUSH #define IBV_QPF_GRH_REQUIRED IB_UVERBS_QPF_GRH_REQUIRED #define IBV_ACCESS_OPTIONAL_RANGE IB_UVERBS_ACCESS_OPTIONAL_RANGE #define IBV_ACCESS_OPTIONAL_FIRST IB_UVERBS_ACCESS_OPTIONAL_FIRST #endif rdma-core-56.1/librdmacm/000077500000000000000000000000001477342711600152645ustar00rootroot00000000000000rdma-core-56.1/librdmacm/CMakeLists.txt000066400000000000000000000030761477342711600200320ustar00rootroot00000000000000publish_headers(rdma rdma_cma.h rdma_cma_abi.h rdma_verbs.h rsocket.h ) publish_headers(infiniband acm.h ib.h ) rdma_library(rdmacm librdmacm.map # See Documentation/versioning.md 1
1.3.${PACKAGE_VERSION} acm.c addrinfo.c cma.c indexer.c rsocket.c ) target_link_libraries(rdmacm LINK_PUBLIC ibverbs) target_link_libraries(rdmacm LINK_PRIVATE ${NL_LIBRARIES} ${CMAKE_THREAD_LIBS_INIT} ${RT_LIBRARIES} ) # The preload library is a bit special; it needs to be open coded. # Since it is an LD_PRELOAD library it has no soname, and it is installed in a subdirectory. add_library(rspreload MODULE preload.c indexer.c ) # Even though this is a module, we still want to use -Wl,--no-undefined set_target_properties(rspreload PROPERTIES LINK_FLAGS ${CMAKE_SHARED_LINKER_FLAGS}) set_target_properties(rspreload PROPERTIES LIBRARY_OUTPUT_DIRECTORY "${BUILD_LIB}") rdma_set_library_map(rspreload librspreload.map) target_link_libraries(rspreload LINK_PRIVATE rdmacm ${CMAKE_THREAD_LIBS_INIT} ${CMAKE_DL_LIBS} ) install(TARGETS rspreload DESTINATION "${CMAKE_INSTALL_LIBDIR}/rsocket/") # These are for compat with old packaging; these names should not be used. # FIXME: Maybe we can get rid of them? rdma_install_symlink("librspreload.so" "${CMAKE_INSTALL_LIBDIR}/rsocket/librspreload.so.1") rdma_install_symlink("librspreload.so" "${CMAKE_INSTALL_LIBDIR}/rsocket/librspreload.so.1.0.0") if (ENABLE_STATIC) if (NOT NL_KIND EQUAL 0) set(REQUIRES "libnl-3.0, libnl-route-3.0, ") endif() endif() rdma_pkg_config("rdmacm" "${REQUIRES}libibverbs" "${CMAKE_THREAD_LIBS_INIT}") rdma-core-56.1/librdmacm/acm.c000066400000000000000000000243311477342711600161730ustar00rootroot00000000000000/* * Copyright (c) 2010-2012 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include "cma.h" #include "acm.h" #include #include #include static pthread_mutex_t acm_lock = PTHREAD_MUTEX_INITIALIZER; static int sock = -1; static uint16_t server_port; static int ucma_set_server_port(void) { FILE *f; if ((f = fopen(IBACM_PORT_FILE, "r" STREAM_CLOEXEC))) { if (fscanf(f, "%" SCNu16, &server_port) != 1) server_port = 0; fclose(f); } else server_port = 0; return server_port; } void ucma_ib_init(void) { union { struct sockaddr any; struct sockaddr_in inet; struct sockaddr_un unx; } addr; static int init; int ret; if (init) return; pthread_mutex_lock(&acm_lock); if (init) goto unlock; if (ucma_set_server_port()) { sock = socket(AF_INET, SOCK_STREAM | SOCK_CLOEXEC, IPPROTO_TCP); if (sock < 0) goto out; memset(&addr, 0, sizeof(addr)); addr.any.sa_family = AF_INET; addr.inet.sin_addr.s_addr = htobe32(INADDR_LOOPBACK); addr.inet.sin_port = htobe16(server_port); ret = connect(sock, &addr.any, sizeof(addr.inet)); if (ret) { close(sock); sock = -1; } } else { sock = socket(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0); if (sock < 0) goto out; memset(&addr, 0, sizeof(addr)); addr.any.sa_family = AF_UNIX; BUILD_ASSERT(sizeof(IBACM_SERVER_PATH) <= sizeof(addr.unx.sun_path)); strcpy(addr.unx.sun_path, IBACM_SERVER_PATH); ret = connect(sock, &addr.any, sizeof(addr.unx)); if (ret) { close(sock); sock = -1; } } out: init = 1; unlock: pthread_mutex_unlock(&acm_lock); } void ucma_ib_cleanup(void) { if (sock >= 0) { shutdown(sock, SHUT_RDWR); close(sock); } } static int ucma_ib_set_addr(struct rdma_addrinfo *ib_rai, struct rdma_addrinfo *rai) { struct sockaddr_ib *src, *dst; struct ibv_path_record *path; src = calloc(1, sizeof(*src)); if (!src) return ERR(ENOMEM); dst = calloc(1, sizeof(*dst)); if (!dst) { free(src); return ERR(ENOMEM); } path = &((struct ibv_path_data *) ib_rai->ai_route)->path; src->sib_family = AF_IB; src->sib_pkey = path->pkey; src->sib_flowinfo = htobe32(be32toh(path->flowlabel_hoplimit) >> 8); memcpy(&src->sib_addr, &path->sgid, 16); ucma_set_sid(ib_rai->ai_port_space, rai->ai_src_addr, src); dst->sib_family = AF_IB; dst->sib_pkey = path->pkey; dst->sib_flowinfo = htobe32(be32toh(path->flowlabel_hoplimit) >> 8); memcpy(&dst->sib_addr, &path->dgid, 16); ucma_set_sid(ib_rai->ai_port_space, rai->ai_dst_addr, dst); ib_rai->ai_src_addr = (struct sockaddr *) src; ib_rai->ai_src_len = sizeof(*src); ib_rai->ai_dst_addr = (struct sockaddr *) dst; ib_rai->ai_dst_len = sizeof(*dst); return 0; } static int ucma_ib_set_connect(struct rdma_addrinfo *ib_rai, struct rdma_addrinfo *rai) { struct ib_connect_hdr *hdr; if (rai->ai_family == AF_IB) return 0; hdr = calloc(1, sizeof(*hdr)); if (!hdr) return ERR(ENOMEM); if (rai->ai_family == AF_INET) { hdr->ip_version = 4 << 4; memcpy(&hdr->cma_src_ip4, &((struct sockaddr_in *) rai->ai_src_addr)->sin_addr, 4); memcpy(&hdr->cma_dst_ip4, &((struct sockaddr_in *) rai->ai_dst_addr)->sin_addr, 4); } else { hdr->ip_version = 6 << 4; memcpy(&hdr->cma_src_ip6, &((struct sockaddr_in6 *) rai->ai_src_addr)->sin6_addr, 16); memcpy(&hdr->cma_dst_ip6, &((struct sockaddr_in6 *) rai->ai_dst_addr)->sin6_addr, 16); } ib_rai->ai_connect = hdr; ib_rai->ai_connect_len = sizeof(*hdr); return 0; } static void ucma_resolve_af_ib(struct rdma_addrinfo **rai) { struct rdma_addrinfo *ib_rai; ib_rai = calloc(1, sizeof(*ib_rai)); if (!ib_rai) return; ib_rai->ai_flags = (*rai)->ai_flags; ib_rai->ai_family = AF_IB; ib_rai->ai_qp_type = (*rai)->ai_qp_type; ib_rai->ai_port_space = (*rai)->ai_port_space; 
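/* The synthesized AF_IB entry inherits the resolved entry's flags, QP type, and port space; the route buffer and canonical names are deep-copied below so that rdma_freeaddrinfo() can release the two entries independently. */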
ib_rai->ai_route = calloc(1, (*rai)->ai_route_len); if (!ib_rai->ai_route) goto err; memcpy(ib_rai->ai_route, (*rai)->ai_route, (*rai)->ai_route_len); ib_rai->ai_route_len = (*rai)->ai_route_len; if ((*rai)->ai_src_canonname) { ib_rai->ai_src_canonname = strdup((*rai)->ai_src_canonname); if (!ib_rai->ai_src_canonname) goto err; } if ((*rai)->ai_dst_canonname) { ib_rai->ai_dst_canonname = strdup((*rai)->ai_dst_canonname); if (!ib_rai->ai_dst_canonname) goto err; } if (ucma_ib_set_connect(ib_rai, *rai)) goto err; if (ucma_ib_set_addr(ib_rai, *rai)) goto err; ib_rai->ai_next = *rai; *rai = ib_rai; return; err: rdma_freeaddrinfo(ib_rai); } static void ucma_ib_save_resp(struct rdma_addrinfo *rai, struct acm_msg *msg) { struct acm_ep_addr_data *ep_data; struct ibv_path_data *path_data = NULL; struct sockaddr_in *sin; struct sockaddr_in6 *sin6; int i, cnt, path_cnt = 0; cnt = (msg->hdr.length - ACM_MSG_HDR_LENGTH) / ACM_MSG_EP_LENGTH; for (i = 0; i < cnt; i++) { ep_data = &msg->resolve_data[i]; switch (ep_data->type) { case ACM_EP_INFO_PATH: ep_data->type = 0; if (!path_data) path_data = (struct ibv_path_data *) ep_data; path_cnt++; break; case ACM_EP_INFO_ADDRESS_IP: if (!(ep_data->flags & ACM_EP_FLAG_SOURCE) || rai->ai_src_len) break; sin = calloc(1, sizeof(*sin)); if (!sin) break; sin->sin_family = AF_INET; memcpy(&sin->sin_addr, &ep_data->info.addr, 4); rai->ai_src_len = sizeof(*sin); rai->ai_src_addr = (struct sockaddr *) sin; break; case ACM_EP_INFO_ADDRESS_IP6: if (!(ep_data->flags & ACM_EP_FLAG_SOURCE) || rai->ai_src_len) break; sin6 = calloc(1, sizeof(*sin6)); if (!sin6) break; sin6->sin6_family = AF_INET6; memcpy(&sin6->sin6_addr, &ep_data->info.addr, 16); rai->ai_src_len = sizeof(*sin6); rai->ai_src_addr = (struct sockaddr *) sin6; break; default: break; } } rai->ai_route = calloc(path_cnt, sizeof(*path_data)); if (rai->ai_route) { memcpy(rai->ai_route, path_data, path_cnt * sizeof(*path_data)); rai->ai_route_len = path_cnt * sizeof(*path_data); } } static void ucma_set_ep_addr(struct acm_ep_addr_data *data, struct sockaddr *addr) { if (addr->sa_family == AF_INET) { data->type = ACM_EP_INFO_ADDRESS_IP; memcpy(data->info.addr, &((struct sockaddr_in *) addr)->sin_addr, 4); } else { data->type = ACM_EP_INFO_ADDRESS_IP6; memcpy(data->info.addr, &((struct sockaddr_in6 *) addr)->sin6_addr, 16); } } static int ucma_inet_addr(struct sockaddr *addr, socklen_t len) { return len && addr && (addr->sa_family == AF_INET || addr->sa_family == AF_INET6); } static int ucma_ib_addr(struct sockaddr *addr, socklen_t len) { return len && addr && (addr->sa_family == AF_IB); } void ucma_ib_resolve(struct rdma_addrinfo **rai, const struct rdma_addrinfo *hints) { struct acm_msg msg; struct acm_ep_addr_data *data; int ret; ucma_ib_init(); if (sock < 0) return; memset(&msg, 0, sizeof msg); msg.hdr.version = ACM_VERSION; msg.hdr.opcode = ACM_OP_RESOLVE; msg.hdr.length = ACM_MSG_HDR_LENGTH; data = &msg.resolve_data[0]; if (ucma_inet_addr((*rai)->ai_src_addr, (*rai)->ai_src_len)) { data->flags = ACM_EP_FLAG_SOURCE; ucma_set_ep_addr(data, (*rai)->ai_src_addr); data++; msg.hdr.length += ACM_MSG_EP_LENGTH; } if (ucma_inet_addr((*rai)->ai_dst_addr, (*rai)->ai_dst_len)) { data->flags = ACM_EP_FLAG_DEST; if (hints->ai_flags & (RAI_NUMERICHOST | RAI_NOROUTE)) data->flags |= ACM_FLAGS_NODELAY; ucma_set_ep_addr(data, (*rai)->ai_dst_addr); data++; msg.hdr.length += ACM_MSG_EP_LENGTH; } if (hints->ai_route_len || ucma_ib_addr((*rai)->ai_src_addr, (*rai)->ai_src_len) || ucma_ib_addr((*rai)->ai_dst_addr, (*rai)->ai_dst_len)) { 
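/* A path record is appended when the caller supplied one via hints->ai_route or when either endpoint is an AF_IB address; in the latter case the SGID/DGID are taken directly from the sockaddr_ib addresses below. */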
struct ibv_path_record *path; if (hints->ai_route_len == sizeof(struct ibv_path_record)) path = (struct ibv_path_record *) hints->ai_route; else if (hints->ai_route_len == sizeof(struct ibv_path_data)) path = &((struct ibv_path_data *) hints->ai_route)->path; else path = NULL; if (path) memcpy(&data->info.path, path, sizeof(*path)); if (ucma_ib_addr((*rai)->ai_src_addr, (*rai)->ai_src_len)) { memcpy(&data->info.path.sgid, &((struct sockaddr_ib *) (*rai)->ai_src_addr)->sib_addr, 16); } if (ucma_ib_addr((*rai)->ai_dst_addr, (*rai)->ai_dst_len)) { memcpy(&data->info.path.dgid, &((struct sockaddr_ib *) (*rai)->ai_dst_addr)->sib_addr, 16); } data->type = ACM_EP_INFO_PATH; data++; msg.hdr.length += ACM_MSG_EP_LENGTH; } pthread_mutex_lock(&acm_lock); ret = send(sock, (char *) &msg, msg.hdr.length, 0); if (ret != msg.hdr.length) { pthread_mutex_unlock(&acm_lock); return; } ret = recv(sock, (char *) &msg, sizeof msg, 0); pthread_mutex_unlock(&acm_lock); if (ret < ACM_MSG_HDR_LENGTH || ret != msg.hdr.length || msg.hdr.status) return; ucma_ib_save_resp(*rai, &msg); if (af_ib_support && !(hints->ai_flags & RAI_ROUTEONLY) && (*rai)->ai_route_len) ucma_resolve_af_ib(rai); } rdma-core-56.1/librdmacm/acm.h000066400000000000000000000110161477342711600161740ustar00rootroot00000000000000/* * Copyright (c) 2009 Intel Corporation. All rights reserved. * * This software is available to you under the OpenFabrics.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #if !defined(ACM_H) #define ACM_H #include #include #ifdef __cplusplus extern "C" { #endif #define ACM_VERSION 1 #define ACM_OP_MASK 0x0F #define ACM_OP_RESOLVE 0x01 #define ACM_OP_PERF_QUERY 0x02 #define ACM_OP_EP_QUERY 0x03 #define ACM_OP_ACK 0x80 #define ACM_STATUS_SUCCESS 0 #define ACM_STATUS_ENOMEM 1 #define ACM_STATUS_EINVAL 2 #define ACM_STATUS_ENODATA 3 #define ACM_STATUS_ENOTCONN 5 #define ACM_STATUS_ETIMEDOUT 6 #define ACM_STATUS_ESRCADDR 7 #define ACM_STATUS_ESRCTYPE 8 #define ACM_STATUS_EDESTADDR 9 #define ACM_STATUS_EDESTTYPE 10 #define ACM_FLAGS_QUERY_SA (1<<31) #define ACM_FLAGS_NODELAY (1<<30) #define ACM_MSG_HDR_LENGTH 16 #define ACM_MAX_ADDRESS 64 #define ACM_MSG_EP_LENGTH 72 #define ACM_MAX_PROV_NAME 64 /* * Support up to 6 path records (primary and alternate CM paths, * inbound and outbound primary and alternate data paths), plus CM data.
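* (Hence ACM_MSG_DATA_LENGTH below reserves eight ACM_MSG_EP_LENGTH slots: room for the six potential path records plus the CM data.)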
*/ #define ACM_MSG_DATA_LENGTH (ACM_MSG_EP_LENGTH * 8) #define src_out data[0] #define src_index data[1] #define dst_index data[2] struct acm_hdr { uint8_t version; uint8_t opcode; uint8_t status; uint8_t data[3]; uint16_t length; uint64_t tid; }; #define ACM_EP_INFO_NAME 0x0001 #define ACM_EP_INFO_ADDRESS_IP 0x0002 #define ACM_EP_INFO_ADDRESS_IP6 0x0003 #define ACM_EP_INFO_PATH 0x0010 union acm_ep_info { uint8_t addr[ACM_MAX_ADDRESS]; uint8_t name[ACM_MAX_ADDRESS]; struct ibv_path_record path; }; #define ACM_EP_FLAG_SOURCE (1<<0) #define ACM_EP_FLAG_DEST (1<<1) struct acm_ep_addr_data { uint32_t flags; uint16_t type; uint16_t reserved; union acm_ep_info info; }; /* * Resolve messages with the opcode set to ACM_OP_RESOLVE are only * used to communicate with the local ib_acm service. Message fields * in this case are not byte swapped, but note that the acm_ep_info * data is in network order. */ struct acm_resolve_msg { struct acm_hdr hdr; struct acm_ep_addr_data data[]; }; enum { ACM_CNTR_ERROR, ACM_CNTR_RESOLVE, ACM_CNTR_NODATA, ACM_CNTR_ADDR_QUERY, ACM_CNTR_ADDR_CACHE, ACM_CNTR_ROUTE_QUERY, ACM_CNTR_ROUTE_CACHE, ACM_MAX_COUNTER }; /* * Performance messages are sent/received in network byte order. */ struct acm_perf_msg { struct acm_hdr hdr; uint64_t data[]; }; /* * Endpoint query messages are sent/received in network byte order. */ struct acm_ep_config_data { uint64_t dev_guid; uint8_t port_num; uint8_t phys_port_cnt; uint8_t rsvd[2]; uint16_t pkey; uint16_t addr_cnt; uint8_t prov_name[ACM_MAX_PROV_NAME]; union acm_ep_info addrs[]; }; struct acm_ep_query_msg { struct acm_hdr hdr; struct acm_ep_config_data data[]; }; struct acm_msg { struct acm_hdr hdr; union{ uint8_t data[ACM_MSG_DATA_LENGTH]; struct acm_ep_addr_data resolve_data[0]; uint64_t perf_data[0]; struct acm_ep_config_data ep_data; }; }; #ifdef __cplusplus } #endif #endif /* ACM_H */ rdma-core-56.1/librdmacm/addrinfo.c000066400000000000000000000176021477342711600172240ustar00rootroot00000000000000/* * Copyright (c) 2010-2014 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* * $Id: cm.c 3453 2005-09-15 21:43:21Z sean.hefty $ */ #include #include #include #include #include #include "cma.h" #include #include static struct rdma_addrinfo nohints; static void ucma_convert_to_ai(struct addrinfo *ai, const struct rdma_addrinfo *rai) { memset(ai, 0, sizeof(*ai)); if (rai->ai_flags & RAI_PASSIVE) ai->ai_flags = AI_PASSIVE; if (rai->ai_flags & RAI_NUMERICHOST) ai->ai_flags |= AI_NUMERICHOST; if (rai->ai_family != AF_IB) ai->ai_family = rai->ai_family; switch (rai->ai_qp_type) { case IBV_QPT_RC: case IBV_QPT_UC: case IBV_QPT_XRC_SEND: case IBV_QPT_XRC_RECV: ai->ai_socktype = SOCK_STREAM; break; case IBV_QPT_UD: ai->ai_socktype = SOCK_DGRAM; break; } switch (rai->ai_port_space) { case RDMA_PS_TCP: ai->ai_protocol = IPPROTO_TCP; break; case RDMA_PS_IPOIB: case RDMA_PS_UDP: ai->ai_protocol = IPPROTO_UDP; break; case RDMA_PS_IB: if (ai->ai_socktype == SOCK_STREAM) ai->ai_protocol = IPPROTO_TCP; else if (ai->ai_socktype == SOCK_DGRAM) ai->ai_protocol = IPPROTO_UDP; break; } if (rai->ai_flags & RAI_PASSIVE) { ai->ai_addrlen = rai->ai_src_len; ai->ai_addr = rai->ai_src_addr; } else { ai->ai_addrlen = rai->ai_dst_len; ai->ai_addr = rai->ai_dst_addr; } ai->ai_canonname = rai->ai_dst_canonname; ai->ai_next = NULL; } static int ucma_copy_addr(struct sockaddr **dst, socklen_t *dst_len, struct sockaddr *src, socklen_t src_len) { *dst = malloc(src_len); if (!(*dst)) return ERR(ENOMEM); memcpy(*dst, src, src_len); *dst_len = src_len; return 0; } void ucma_set_sid(enum rdma_port_space ps, struct sockaddr *addr, struct sockaddr_ib *sib) { __be16 port; port = addr ? ucma_get_port(addr) : 0; sib->sib_sid = htobe64(((uint64_t) ps << 16) + be16toh(port)); if (ps) sib->sib_sid_mask = htobe64(RDMA_IB_IP_PS_MASK); if (port) sib->sib_sid_mask |= htobe64(RDMA_IB_IP_PORT_MASK); } static int ucma_convert_in6(int ps, struct sockaddr_ib **dst, socklen_t *dst_len, struct sockaddr_in6 *src, socklen_t src_len) { *dst = calloc(1, sizeof(struct sockaddr_ib)); if (!(*dst)) return ERR(ENOMEM); (*dst)->sib_family = AF_IB; (*dst)->sib_pkey = htobe16(0xFFFF); (*dst)->sib_flowinfo = src->sin6_flowinfo; ib_addr_set(&(*dst)->sib_addr, src->sin6_addr.s6_addr32[0], src->sin6_addr.s6_addr32[1], src->sin6_addr.s6_addr32[2], src->sin6_addr.s6_addr32[3]); ucma_set_sid(ps, (struct sockaddr *) src, *dst); (*dst)->sib_scope_id = src->sin6_scope_id; *dst_len = sizeof(struct sockaddr_ib); return 0; } static int ucma_convert_to_rai(struct rdma_addrinfo *rai, const struct rdma_addrinfo *hints, const struct addrinfo *ai) { int ret; if (hints->ai_qp_type) { rai->ai_qp_type = hints->ai_qp_type; } else { switch (ai->ai_socktype) { case SOCK_STREAM: rai->ai_qp_type = IBV_QPT_RC; break; case SOCK_DGRAM: rai->ai_qp_type = IBV_QPT_UD; break; } } if (hints->ai_port_space) { rai->ai_port_space = hints->ai_port_space; } else { switch (ai->ai_protocol) { case IPPROTO_TCP: rai->ai_port_space = RDMA_PS_TCP; break; case IPPROTO_UDP: rai->ai_port_space = RDMA_PS_UDP; break; } } if (ai->ai_flags & AI_PASSIVE) { rai->ai_flags = RAI_PASSIVE; if (ai->ai_canonname) rai->ai_src_canonname = strdup(ai->ai_canonname); if ((hints->ai_flags & RAI_FAMILY) && (hints->ai_family == AF_IB) && (hints->ai_flags & RAI_NUMERICHOST)) { rai->ai_family = AF_IB; ret = ucma_convert_in6(rai->ai_port_space, (struct sockaddr_ib **) &rai->ai_src_addr, &rai->ai_src_len, (struct sockaddr_in6 *) ai->ai_addr, ai->ai_addrlen); } else { rai->ai_family = ai->ai_family; ret = ucma_copy_addr(&rai->ai_src_addr, &rai->ai_src_len, ai->ai_addr, ai->ai_addrlen); } } else { if 
(ai->ai_canonname) rai->ai_dst_canonname = strdup(ai->ai_canonname); if ((hints->ai_flags & RAI_FAMILY) && (hints->ai_family == AF_IB) && (hints->ai_flags & RAI_NUMERICHOST)) { rai->ai_family = AF_IB; ret = ucma_convert_in6(rai->ai_port_space, (struct sockaddr_ib **) &rai->ai_dst_addr, &rai->ai_dst_len, (struct sockaddr_in6 *) ai->ai_addr, ai->ai_addrlen); } else { rai->ai_family = ai->ai_family; ret = ucma_copy_addr(&rai->ai_dst_addr, &rai->ai_dst_len, ai->ai_addr, ai->ai_addrlen); } } return ret; } static int ucma_getaddrinfo(const char *node, const char *service, const struct rdma_addrinfo *hints, struct rdma_addrinfo *rai) { struct addrinfo ai_hints; struct addrinfo *ai; int ret; if (hints != &nohints) { ucma_convert_to_ai(&ai_hints, hints); ret = getaddrinfo(node, service, &ai_hints, &ai); } else { ret = getaddrinfo(node, service, NULL, &ai); } if (ret) return ret; ret = ucma_convert_to_rai(rai, hints, ai); freeaddrinfo(ai); return ret; } int rdma_getaddrinfo(const char *node, const char *service, const struct rdma_addrinfo *hints, struct rdma_addrinfo **res) { struct rdma_addrinfo *rai; int ret; if (!service && !node && !hints) return ERR(EINVAL); ret = ucma_init(); if (ret) return ret; rai = calloc(1, sizeof(*rai)); if (!rai) return ERR(ENOMEM); if (!hints) hints = &nohints; if (node || service) { ret = ucma_getaddrinfo(node, service, hints, rai); } else { rai->ai_flags = hints->ai_flags; rai->ai_family = hints->ai_family; rai->ai_qp_type = hints->ai_qp_type; rai->ai_port_space = hints->ai_port_space; if (hints->ai_dst_len) { ret = ucma_copy_addr(&rai->ai_dst_addr, &rai->ai_dst_len, hints->ai_dst_addr, hints->ai_dst_len); } } if (ret) goto err; if (!rai->ai_src_len && hints->ai_src_len) { ret = ucma_copy_addr(&rai->ai_src_addr, &rai->ai_src_len, hints->ai_src_addr, hints->ai_src_len); if (ret) goto err; } if (!(rai->ai_flags & RAI_PASSIVE)) ucma_ib_resolve(&rai, hints); *res = rai; return 0; err: rdma_freeaddrinfo(rai); return ret; } void rdma_freeaddrinfo(struct rdma_addrinfo *res) { struct rdma_addrinfo *rai; while (res) { rai = res; res = res->ai_next; if (rai->ai_connect) free(rai->ai_connect); if (rai->ai_route) free(rai->ai_route); if (rai->ai_src_canonname) free(rai->ai_src_canonname); if (rai->ai_dst_canonname) free(rai->ai_dst_canonname); if (rai->ai_src_addr) free(rai->ai_src_addr); if (rai->ai_dst_addr) free(rai->ai_dst_addr); free(rai); } } rdma-core-56.1/librdmacm/cma.c000066400000000000000000002154371477342711600162040ustar00rootroot00000000000000/* * Copyright (c) 2005-2014 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "cma.h" #include "indexer.h" #include #include #include #include #include #include #include #include #include #define CMA_INIT_CMD(req, req_size, op) \ do { \ memset(req, 0, req_size); \ (req)->cmd = UCMA_CMD_##op; \ (req)->in = req_size - sizeof(struct ucma_abi_cmd_hdr); \ } while (0) #define CMA_INIT_CMD_RESP(req, req_size, op, resp, resp_size) \ do { \ CMA_INIT_CMD(req, req_size, op); \ (req)->out = resp_size; \ (req)->response = (uintptr_t) (resp); \ } while (0) #define UCMA_INVALID_IB_INDEX -1 struct cma_port { uint8_t link_layer; }; struct cma_device { struct ibv_device *dev; struct list_node entry; struct ibv_context *verbs; struct ibv_pd *pd; struct ibv_xrcd *xrcd; struct cma_port *port; __be64 guid; int port_cnt; int refcnt; int max_qpsize; uint8_t max_initiator_depth; uint8_t max_responder_resources; int ibv_idx; uint8_t is_device_dead : 1; }; struct cma_id_private { struct rdma_cm_id id; struct cma_device *cma_dev; void *connect; size_t connect_len; int events_completed; int connect_error; int sync; pthread_cond_t cond; pthread_mutex_t mut; uint32_t handle; struct cma_multicast *mc_list; struct ibv_qp_init_attr *qp_init_attr; uint8_t initiator_depth; uint8_t responder_resources; struct ibv_ece local_ece; struct ibv_ece remote_ece; }; struct cma_multicast { struct cma_multicast *next; struct cma_id_private *id_priv; void *context; int events_completed; pthread_cond_t cond; uint32_t handle; union ibv_gid mgid; uint16_t mlid; uint16_t join_flags; struct sockaddr_storage addr; }; struct cma_event { struct rdma_cm_event event; uint8_t private_data[RDMA_MAX_PRIVATE_DATA]; struct cma_id_private *id_priv; struct cma_multicast *mc; }; static LIST_HEAD(cma_dev_list); /* sorted based on index or guid, depends on kernel support */ static struct ibv_device **dev_list; static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER; static int abi_ver = -1; static char dev_name[64] = "rdma_cm"; static dev_t dev_cdev; int af_ib_support; static struct index_map ucma_idm; static fastlock_t idm_lock; static int check_abi_version_nl_cb(struct nl_msg *msg, void *data) { struct nlattr *tb[RDMA_NLDEV_ATTR_MAX]; uint64_t cdev64; int ret; ret = nlmsg_parse(nlmsg_hdr(msg), 0, tb, RDMA_NLDEV_ATTR_MAX - 1, rdmanl_policy); if (ret < 0) return ret; if (!tb[RDMA_NLDEV_ATTR_CHARDEV] || !tb[RDMA_NLDEV_ATTR_CHARDEV_ABI] || !tb[RDMA_NLDEV_ATTR_CHARDEV_NAME]) return NLE_PARSE_ERR; /* Convert from huge_encode_dev to whatever glibc uses */ cdev64 = nla_get_u64(tb[RDMA_NLDEV_ATTR_CHARDEV]); dev_cdev = makedev((cdev64 & 0xfff00) >> 8, (cdev64 & 0xff) | ((cdev64 >> 12) & 0xfff00)); if (!check_snprintf(dev_name, sizeof(dev_name), "%s", nla_get_string(tb[RDMA_NLDEV_ATTR_CHARDEV_NAME]))) return NLE_PARSE_ERR; /* * The top 32 bits of CHARDEV_ABI are reserved for future use; * current kernels set them to 0 */ abi_ver = (uint32_t)nla_get_u64(tb[RDMA_NLDEV_ATTR_CHARDEV_ABI]); return 0; } /* Ask the
kernel for the uverbs char device information */ static int check_abi_version_nl(void) { struct nl_sock *nl; nl = rdmanl_socket_alloc(); if (!nl) return -1; if (rdmanl_get_chardev(nl, -1, "rdma_cm", check_abi_version_nl_cb, NULL)) goto err_socket; if (abi_ver == -1) goto err_socket; nl_socket_free(nl); return 0; err_socket: nl_socket_free(nl); return -1; } static void check_abi_version_sysfs(void) { char value[8]; if ((ibv_read_sysfs_file(ibv_get_sysfs_path(), "class/misc/rdma_cm/abi_version", value, sizeof value) < 0) && (ibv_read_sysfs_file(ibv_get_sysfs_path(), "class/infiniband_ucma/abi_version", value, sizeof value) < 0)) { /* * Older versions of Linux do not have class/misc. To support * backports, assume the most recent version of the ABI. If * we're wrong, we'll simply fail later when calling the ABI. */ abi_ver = RDMA_USER_CM_MAX_ABI_VERSION; return; } abi_ver = strtol(value, NULL, 10); dev_cdev = 0; } static int check_abi_version(void) { if (abi_ver == -1) { if (check_abi_version_nl()) check_abi_version_sysfs(); } if (abi_ver < RDMA_USER_CM_MIN_ABI_VERSION || abi_ver > RDMA_USER_CM_MAX_ABI_VERSION) return -1; return 0; } /* * This function is called while holding the mutex lock. * cma_dev_list must not be empty before calling this function, to * ensure that the lock is not acquired recursively. */ static void ucma_set_af_ib_support(void) { struct rdma_cm_id *id; struct sockaddr_ib sib; int ret; ret = rdma_create_id(NULL, &id, NULL, RDMA_PS_IB); if (ret) return; memset(&sib, 0, sizeof sib); sib.sib_family = AF_IB; sib.sib_sid = htobe64(RDMA_IB_IP_PS_TCP); sib.sib_sid_mask = htobe64(RDMA_IB_IP_PS_MASK); af_ib_support = 1; ret = rdma_bind_addr(id, (struct sockaddr *) &sib); af_ib_support = !ret; rdma_destroy_id(id); } static struct cma_device *insert_cma_dev(struct ibv_device *dev) { struct cma_device *cma_dev, *p; cma_dev = calloc(1, sizeof(struct cma_device)); if (!cma_dev) return NULL; cma_dev->guid = ibv_get_device_guid(dev); cma_dev->ibv_idx = ibv_get_device_index(dev); cma_dev->dev = dev; /* reverse iteration, optimized for ibv_idx, which is growing */ list_for_each_rev(&cma_dev_list, p, entry) { if (cma_dev->ibv_idx == UCMA_INVALID_IB_INDEX) { /* index not available, sort by guid */ if (be64toh(p->guid) < be64toh(cma_dev->guid)) break; } else { if (p->ibv_idx < cma_dev->ibv_idx) break; } } list_add_after(&cma_dev_list, &p->entry, &cma_dev->entry); return cma_dev; } static void remove_cma_dev(struct cma_device *cma_dev) { if (cma_dev->refcnt) { /* we were asked to be deleted by sync_devices_list() */ cma_dev->is_device_dead = true; return; } if (cma_dev->xrcd) ibv_close_xrcd(cma_dev->xrcd); if (cma_dev->pd) ibv_dealloc_pd(cma_dev->pd); if (cma_dev->verbs) ibv_close_device(cma_dev->verbs); free(cma_dev->port); list_del_from(&cma_dev_list, &cma_dev->entry); free(cma_dev); } static int dev_cmp(const void *a, const void *b) { return (*(uintptr_t *)a > *(uintptr_t *)b) - (*(uintptr_t *)a < *(uintptr_t *)b); } static int sync_devices_list(void) { struct ibv_device **new_list; int i, j, numb_dev; new_list = ibv_get_device_list(&numb_dev); if (!new_list) return ERR(ENODEV); if (!numb_dev) { ibv_free_device_list(new_list); return ERR(ENODEV); } qsort(new_list, numb_dev, sizeof(struct ibv_device *), dev_cmp); if (unlikely(!dev_list)) { /* first sync */ for (j = 0; new_list[j]; j++) insert_cma_dev(new_list[j]); goto out; } for (i = 0, j = 0; dev_list[i] || new_list[j];) { if (dev_list[i] == new_list[j]) { i++; j++; continue; } /* * The device list is sorted by pointer address, * so we need to
compare the new list with the old one. * * 1. If the device exists in the new list, but doesn't exist in * the old list, we will add that device to the list. * 2. If the device exists in the old list, but doesn't exist in * the new list, we should delete it. */ if ((dev_list[i] > new_list[j] && new_list[j]) || (!dev_list[i] && new_list[j])) { insert_cma_dev(new_list[j++]); continue; } if ((dev_list[i] < new_list[j] && dev_list[i]) || (!new_list[j] && dev_list[i])) { /* * We will try our best to remove the entry, * but if some process holds it, we will remove it * later, when rdma-cm puts this resource back. */ struct cma_device *c, *t; list_for_each_safe(&cma_dev_list, c, t, entry) { if (c->dev == dev_list[i]) remove_cma_dev(c); } i++; } } ibv_free_device_list(dev_list); out: dev_list = new_list; return 0; } int ucma_init(void) { int ret; /* * ucma_set_af_ib_support() below recursively calls this function * again under the &mut lock, so do this fast check and return * immediately. */ if (!list_empty(&cma_dev_list)) return 0; pthread_mutex_lock(&mut); if (!list_empty(&cma_dev_list)) { pthread_mutex_unlock(&mut); return 0; } fastlock_init(&idm_lock); ret = check_abi_version(); if (ret) { ret = ERR(EPERM); goto err1; } ret = sync_devices_list(); if (ret) goto err1; ucma_set_af_ib_support(); pthread_mutex_unlock(&mut); return 0; err1: fastlock_destroy(&idm_lock); pthread_mutex_unlock(&mut); return ret; } static bool match(struct cma_device *cma_dev, __be64 guid, uint32_t idx) { if ((idx == UCMA_INVALID_IB_INDEX) || (cma_dev->ibv_idx == UCMA_INVALID_IB_INDEX)) return cma_dev->guid == guid; return cma_dev->ibv_idx == idx && cma_dev->guid == guid; } static int ucma_init_device(struct cma_device *cma_dev) { struct ibv_port_attr port_attr; struct ibv_device_attr attr; int i, ret; if (cma_dev->verbs) return 0; cma_dev->verbs = ibv_open_device(cma_dev->dev); if (!cma_dev->verbs) return ERR(ENODEV); ret = ibv_query_device(cma_dev->verbs, &attr); if (ret) { ret = ERR(ret); goto err; } cma_dev->port = malloc(sizeof(*cma_dev->port) * attr.phys_port_cnt); if (!cma_dev->port) { ret = ERR(ENOMEM); goto err; } for (i = 1; i <= attr.phys_port_cnt; i++) { if (ibv_query_port(cma_dev->verbs, i, &port_attr)) cma_dev->port[i - 1].link_layer = IBV_LINK_LAYER_UNSPECIFIED; else cma_dev->port[i - 1].link_layer = port_attr.link_layer; } cma_dev->port_cnt = attr.phys_port_cnt; cma_dev->max_qpsize = attr.max_qp_wr; cma_dev->max_initiator_depth = (uint8_t) attr.max_qp_init_rd_atom; cma_dev->max_responder_resources = (uint8_t) attr.max_qp_rd_atom; return 0; err: ibv_close_device(cma_dev->verbs); cma_dev->verbs = NULL; return ret; } static int ucma_init_all(void) { struct cma_device *dev; int ret = 0; ret = ucma_init(); if (ret) return ret; pthread_mutex_lock(&mut); list_for_each(&cma_dev_list, dev, entry) { if (dev->is_device_dead) continue; if (ucma_init_device(dev)) { /* Couldn't initialize the device: mark it dead and continue */ dev->is_device_dead = true; } } pthread_mutex_unlock(&mut); return 0; } struct ibv_context **rdma_get_devices(int *num_devices) { struct ibv_context **devs = NULL; struct cma_device *dev; int cma_dev_cnt = 0; int i = 0; if (ucma_init()) goto err_init; pthread_mutex_lock(&mut); if (sync_devices_list()) goto out; list_for_each(&cma_dev_list, dev, entry) { if (dev->is_device_dead) continue; /* reinit newly added devices */ if (ucma_init_device(dev)) { /* Couldn't initialize the device: mark it dead and continue */ dev->is_device_dead = true; continue; } cma_dev_cnt++; } devs = malloc(sizeof(*devs) *
(cma_dev_cnt + 1)); if (!devs) goto out; list_for_each(&cma_dev_list, dev, entry) { if (dev->is_device_dead) continue; devs[i++] = dev->verbs; dev->refcnt++; } devs[i] = NULL; out: pthread_mutex_unlock(&mut); err_init: if (num_devices) *num_devices = devs ? cma_dev_cnt : 0; return devs; } void rdma_free_devices(struct ibv_context **list) { struct cma_device *c, *tmp; int i; pthread_mutex_lock(&mut); list_for_each_safe(&cma_dev_list, c, tmp, entry) { for (i = 0; list[i]; i++) { if (list[i] != c->verbs) /* * Skip devices that were added after * the user received the list. */ continue; c->refcnt--; if (c->is_device_dead) /* try to remove */ remove_cma_dev(c); } } pthread_mutex_unlock(&mut); free(list); } struct rdma_event_channel *rdma_create_event_channel(void) { struct rdma_event_channel *channel; if (ucma_init()) return NULL; channel = malloc(sizeof(*channel)); if (!channel) return NULL; channel->fd = open_cdev(dev_name, dev_cdev); if (channel->fd < 0) { goto err; } return channel; err: free(channel); return NULL; } void rdma_destroy_event_channel(struct rdma_event_channel *channel) { close(channel->fd); free(channel); } static struct cma_device *ucma_get_cma_device(__be64 guid, uint32_t idx) { struct cma_device *cma_dev; list_for_each(&cma_dev_list, cma_dev, entry) if (!cma_dev->is_device_dead && match(cma_dev, guid, idx)) goto match; if (sync_devices_list()) return NULL; /* * The kernel informed us that we have a new device, and it must * be in the global dev_list[]; let's find the right one. */ list_for_each(&cma_dev_list, cma_dev, entry) if (!cma_dev->is_device_dead && match(cma_dev, guid, idx)) goto match; cma_dev = NULL; match: if (cma_dev) cma_dev->refcnt++; return cma_dev; } static int ucma_get_device(struct cma_id_private *id_priv, __be64 guid, uint32_t idx) { struct cma_device *cma_dev; int ret; pthread_mutex_lock(&mut); cma_dev = ucma_get_cma_device(guid, idx); if (!cma_dev) { pthread_mutex_unlock(&mut); return ERR(ENODEV); } ret = ucma_init_device(cma_dev); if (ret) goto out; if (!cma_dev->pd) cma_dev->pd = ibv_alloc_pd(cma_dev->verbs); if (!cma_dev->pd) { ret = -1; goto out; } id_priv->cma_dev = cma_dev; id_priv->id.verbs = cma_dev->verbs; id_priv->id.pd = cma_dev->pd; out: if (ret) cma_dev->refcnt--; pthread_mutex_unlock(&mut); return ret; } static void ucma_put_device(struct cma_device *cma_dev) { pthread_mutex_lock(&mut); if (!--cma_dev->refcnt) { ibv_dealloc_pd(cma_dev->pd); if (cma_dev->xrcd) ibv_close_xrcd(cma_dev->xrcd); cma_dev->pd = NULL; cma_dev->xrcd = NULL; if (cma_dev->is_device_dead) remove_cma_dev(cma_dev); } pthread_mutex_unlock(&mut); } static struct ibv_xrcd *ucma_get_xrcd(struct cma_device *cma_dev) { struct ibv_xrcd_init_attr attr; pthread_mutex_lock(&mut); if (!cma_dev->xrcd) { memset(&attr, 0, sizeof attr); attr.comp_mask = IBV_XRCD_INIT_ATTR_FD | IBV_XRCD_INIT_ATTR_OFLAGS; attr.fd = -1; attr.oflags = O_CREAT; cma_dev->xrcd = ibv_open_xrcd(cma_dev->verbs, &attr); } pthread_mutex_unlock(&mut); return cma_dev->xrcd; } static void ucma_insert_id(struct cma_id_private *id_priv) { fastlock_acquire(&idm_lock); idm_set(&ucma_idm, id_priv->handle, id_priv); fastlock_release(&idm_lock); } static void ucma_remove_id(struct cma_id_private *id_priv) { if (id_priv->handle <= IDX_MAX_INDEX) idm_clear(&ucma_idm, id_priv->handle); } static struct cma_id_private *ucma_lookup_id(int handle) { return idm_lookup(&ucma_idm, handle); } static void ucma_free_id(struct cma_id_private *id_priv) { ucma_remove_id(id_priv); if (id_priv->cma_dev) ucma_put_device(id_priv->cma_dev);
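/* Release the remaining per-id state: the condition variable and mutex, any cached path records, the private event channel created for synchronous ids, and the saved connect private-data buffer. */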
pthread_cond_destroy(&id_priv->cond); pthread_mutex_destroy(&id_priv->mut); if (id_priv->id.route.path_rec) free(id_priv->id.route.path_rec); if (id_priv->sync) rdma_destroy_event_channel(id_priv->id.channel); if (id_priv->connect_len) free(id_priv->connect); free(id_priv); } static struct cma_id_private *ucma_alloc_id(struct rdma_event_channel *channel, void *context, enum rdma_port_space ps, enum ibv_qp_type qp_type) { struct cma_id_private *id_priv; id_priv = calloc(1, sizeof(*id_priv)); if (!id_priv) return NULL; id_priv->id.context = context; id_priv->id.ps = ps; id_priv->id.qp_type = qp_type; id_priv->handle = 0xFFFFFFFF; if (!channel) { id_priv->id.channel = rdma_create_event_channel(); if (!id_priv->id.channel) goto err; id_priv->sync = 1; } else { id_priv->id.channel = channel; } pthread_mutex_init(&id_priv->mut, NULL); if (pthread_cond_init(&id_priv->cond, NULL)) goto err; return id_priv; err: ucma_free_id(id_priv); return NULL; } static int rdma_create_id2(struct rdma_event_channel *channel, struct rdma_cm_id **id, void *context, enum rdma_port_space ps, enum ibv_qp_type qp_type) { struct ucma_abi_create_id_resp resp; struct ucma_abi_create_id cmd; struct cma_id_private *id_priv; int ret; ret = ucma_init(); if (ret) return ret; id_priv = ucma_alloc_id(channel, context, ps, qp_type); if (!id_priv) return ERR(ENOMEM); CMA_INIT_CMD_RESP(&cmd, sizeof cmd, CREATE_ID, &resp, sizeof resp); cmd.uid = (uintptr_t) id_priv; cmd.ps = ps; cmd.qp_type = qp_type; ret = write(id_priv->id.channel->fd, &cmd, sizeof cmd); if (ret != sizeof(cmd)) { ret = (ret >= 0) ? ERR(ENODATA) : -1; goto err; } VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); id_priv->handle = resp.id; ucma_insert_id(id_priv); *id = &id_priv->id; return 0; err: ucma_free_id(id_priv); return ret; } int rdma_create_id(struct rdma_event_channel *channel, struct rdma_cm_id **id, void *context, enum rdma_port_space ps) { enum ibv_qp_type qp_type; qp_type = (ps == RDMA_PS_IPOIB || ps == RDMA_PS_UDP) ? IBV_QPT_UD : IBV_QPT_RC; return rdma_create_id2(channel, id, context, ps, qp_type); } static int ucma_destroy_kern_id(int fd, uint32_t handle) { struct ucma_abi_destroy_id_resp resp; struct ucma_abi_destroy_id cmd; int ret; CMA_INIT_CMD_RESP(&cmd, sizeof cmd, DESTROY_ID, &resp, sizeof resp); cmd.id = handle; ret = write(fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? ERR(ENODATA) : -1; VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); return resp.events_reported; } int rdma_destroy_id(struct rdma_cm_id *id) { struct cma_id_private *id_priv; int ret; id_priv = container_of(id, struct cma_id_private, id); ret = ucma_destroy_kern_id(id->channel->fd, id_priv->handle); if (ret < 0) return ret; if (id_priv->id.event) rdma_ack_cm_event(id_priv->id.event); pthread_mutex_lock(&id_priv->mut); while (id_priv->events_completed < ret) pthread_cond_wait(&id_priv->cond, &id_priv->mut); pthread_mutex_unlock(&id_priv->mut); ucma_free_id(id_priv); return 0; } int ucma_addrlen(struct sockaddr *addr) { if (!addr) return 0; switch (addr->sa_family) { case PF_INET: return sizeof(struct sockaddr_in); case PF_INET6: return sizeof(struct sockaddr_in6); case PF_IB: return af_ib_support ? 
sizeof(struct sockaddr_ib) : 0; default: return 0; } } static int ucma_query_addr(struct rdma_cm_id *id) { struct ucma_abi_query_addr_resp resp; struct ucma_abi_query cmd; struct cma_id_private *id_priv; int ret; CMA_INIT_CMD_RESP(&cmd, sizeof cmd, QUERY, &resp, sizeof resp); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; cmd.option = UCMA_QUERY_ADDR; /* * If kernel doesn't support ibdev_index, this field will * be left as is by the kernel. */ resp.ibdev_index = UCMA_INVALID_IB_INDEX; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? ERR(ENODATA) : -1; VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); memcpy(&id->route.addr.src_addr, &resp.src_addr, resp.src_size); memcpy(&id->route.addr.dst_addr, &resp.dst_addr, resp.dst_size); if (!id_priv->cma_dev && resp.node_guid) { ret = ucma_get_device(id_priv, resp.node_guid, resp.ibdev_index); if (ret) return ret; id->port_num = resp.port_num; id->route.addr.addr.ibaddr.pkey = resp.pkey; } return 0; } static int ucma_query_gid(struct rdma_cm_id *id) { struct ucma_abi_query_addr_resp resp; struct ucma_abi_query cmd; struct cma_id_private *id_priv; struct sockaddr_ib *sib; int ret; CMA_INIT_CMD_RESP(&cmd, sizeof cmd, QUERY, &resp, sizeof resp); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; cmd.option = UCMA_QUERY_GID; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? ERR(ENODATA) : -1; VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); sib = (struct sockaddr_ib *) &resp.src_addr; memcpy(id->route.addr.addr.ibaddr.sgid.raw, sib->sib_addr.sib_raw, sizeof id->route.addr.addr.ibaddr.sgid); sib = (struct sockaddr_ib *) &resp.dst_addr; memcpy(id->route.addr.addr.ibaddr.dgid.raw, sib->sib_addr.sib_raw, sizeof id->route.addr.addr.ibaddr.dgid); return 0; } static void ucma_convert_path(struct ibv_path_data *path_data, struct ibv_sa_path_rec *sa_path) { uint32_t fl_hop; sa_path->dgid = path_data->path.dgid; sa_path->sgid = path_data->path.sgid; sa_path->dlid = path_data->path.dlid; sa_path->slid = path_data->path.slid; sa_path->raw_traffic = 0; fl_hop = be32toh(path_data->path.flowlabel_hoplimit); sa_path->flow_label = htobe32(fl_hop >> 8); sa_path->hop_limit = (uint8_t) fl_hop; sa_path->traffic_class = path_data->path.tclass; sa_path->reversible = path_data->path.reversible_numpath >> 7; sa_path->numb_path = 1; sa_path->pkey = path_data->path.pkey; sa_path->sl = be16toh(path_data->path.qosclass_sl) & 0xF; sa_path->mtu_selector = 2; /* exactly */ sa_path->mtu = path_data->path.mtu & 0x1F; sa_path->rate_selector = 2; sa_path->rate = path_data->path.rate & 0x1F; sa_path->packet_life_time_selector = 2; sa_path->packet_life_time = path_data->path.packetlifetime & 0x1F; sa_path->preference = (uint8_t) path_data->flags; } static int ucma_query_path(struct rdma_cm_id *id) { struct ucma_abi_query_path_resp *resp; struct ucma_abi_query cmd; struct cma_id_private *id_priv; int ret, i, size; size = sizeof(*resp) + sizeof(struct ibv_path_data) * 6; resp = alloca(size); CMA_INIT_CMD_RESP(&cmd, sizeof cmd, QUERY, resp, size); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; cmd.option = UCMA_QUERY_PATH; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? 
ERR(ENODATA) : -1; VALGRIND_MAKE_MEM_DEFINED(resp, size); if (resp->num_paths) { id->route.path_rec = malloc(sizeof(*id->route.path_rec) * resp->num_paths); if (!id->route.path_rec) return ERR(ENOMEM); id->route.num_paths = resp->num_paths; for (i = 0; i < resp->num_paths; i++) ucma_convert_path(&resp->path_data[i], &id->route.path_rec[i]); } return 0; } static int ucma_query_route(struct rdma_cm_id *id) { struct ucma_abi_query_route_resp resp; struct ucma_abi_query cmd; struct cma_id_private *id_priv; int ret, i; CMA_INIT_CMD_RESP(&cmd, sizeof cmd, QUERY_ROUTE, &resp, sizeof resp); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; /* * If kernel doesn't support ibdev_index, this field will * be left as is by the kernel. */ resp.ibdev_index = UCMA_INVALID_IB_INDEX; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? ERR(ENODATA) : -1; VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); if (resp.num_paths) { id->route.path_rec = malloc(sizeof(*id->route.path_rec) * resp.num_paths); if (!id->route.path_rec) return ERR(ENOMEM); id->route.num_paths = resp.num_paths; for (i = 0; i < resp.num_paths; i++) ibv_copy_path_rec_from_kern(&id->route.path_rec[i], &resp.ib_route[i]); } memcpy(id->route.addr.addr.ibaddr.sgid.raw, resp.ib_route[0].sgid, sizeof id->route.addr.addr.ibaddr.sgid); memcpy(id->route.addr.addr.ibaddr.dgid.raw, resp.ib_route[0].dgid, sizeof id->route.addr.addr.ibaddr.dgid); id->route.addr.addr.ibaddr.pkey = resp.ib_route[0].pkey; memcpy(&id->route.addr.src_addr, &resp.src_addr, sizeof resp.src_addr); memcpy(&id->route.addr.dst_addr, &resp.dst_addr, sizeof resp.dst_addr); if (!id_priv->cma_dev && resp.node_guid) { ret = ucma_get_device(id_priv, resp.node_guid, resp.ibdev_index); if (ret) return ret; id_priv->id.port_num = resp.port_num; } return 0; } static int rdma_bind_addr2(struct rdma_cm_id *id, struct sockaddr *addr, socklen_t addrlen) { struct ucma_abi_bind cmd; struct cma_id_private *id_priv; int ret; CMA_INIT_CMD(&cmd, sizeof cmd, BIND); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; cmd.addr_size = addrlen; memcpy(&cmd.addr, addr, addrlen); ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? ERR(ENODATA) : -1; ret = ucma_query_addr(id); if (!ret) ret = ucma_query_gid(id); return ret; } int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr) { struct ucma_abi_bind_ip cmd; struct cma_id_private *id_priv; int ret, addrlen; addrlen = ucma_addrlen(addr); if (!addrlen) return ERR(EINVAL); if (af_ib_support) return rdma_bind_addr2(id, addr, addrlen); CMA_INIT_CMD(&cmd, sizeof cmd, BIND_IP); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; memcpy(&cmd.addr, addr, addrlen); ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? 
ERR(ENODATA) : -1; return ucma_query_route(id); } int ucma_complete(struct rdma_cm_id *id) { struct cma_id_private *id_priv; int ret; id_priv = container_of(id, struct cma_id_private, id); if (!id_priv->sync) return 0; if (id_priv->id.event) { rdma_ack_cm_event(id_priv->id.event); id_priv->id.event = NULL; } ret = rdma_get_cm_event(id_priv->id.channel, &id_priv->id.event); if (ret) return ret; if (id_priv->id.event->status) { if (id_priv->id.event->event == RDMA_CM_EVENT_REJECTED) ret = ERR(ECONNREFUSED); else if (id_priv->id.event->status < 0) ret = ERR(-id_priv->id.event->status); else ret = ERR(id_priv->id.event->status); } return ret; } static int rdma_resolve_addr2(struct rdma_cm_id *id, struct sockaddr *src_addr, socklen_t src_len, struct sockaddr *dst_addr, socklen_t dst_len, int timeout_ms) { struct ucma_abi_resolve_addr cmd; struct cma_id_private *id_priv; int ret; CMA_INIT_CMD(&cmd, sizeof cmd, RESOLVE_ADDR); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; cmd.src_size = src_len; memcpy(&cmd.src_addr, src_addr, src_len); memcpy(&cmd.dst_addr, dst_addr, dst_len); cmd.dst_size = dst_len; cmd.timeout_ms = timeout_ms; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? ERR(ENODATA) : -1; memcpy(&id->route.addr.dst_addr, dst_addr, dst_len); return ucma_complete(id); } int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr, struct sockaddr *dst_addr, int timeout_ms) { struct ucma_abi_resolve_ip cmd; struct cma_id_private *id_priv; int ret, dst_len, src_len; dst_len = ucma_addrlen(dst_addr); if (!dst_len) return ERR(EINVAL); src_len = ucma_addrlen(src_addr); if (src_addr && !src_len) return ERR(EINVAL); if (af_ib_support) return rdma_resolve_addr2(id, src_addr, src_len, dst_addr, dst_len, timeout_ms); CMA_INIT_CMD(&cmd, sizeof cmd, RESOLVE_IP); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; if (src_addr) memcpy(&cmd.src_addr, src_addr, src_len); memcpy(&cmd.dst_addr, dst_addr, dst_len); cmd.timeout_ms = timeout_ms; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? ERR(ENODATA) : -1; memcpy(&id->route.addr.dst_storage, dst_addr, dst_len); return ucma_complete(id); } static int ucma_set_ib_route(struct rdma_cm_id *id) { struct rdma_addrinfo hint, *rai; int ret; memset(&hint, 0, sizeof hint); hint.ai_flags = RAI_ROUTEONLY; hint.ai_family = id->route.addr.src_addr.sa_family; hint.ai_src_len = ucma_addrlen((struct sockaddr *) &id->route.addr.src_addr); hint.ai_src_addr = &id->route.addr.src_addr; hint.ai_dst_len = ucma_addrlen((struct sockaddr *) &id->route.addr.dst_addr); hint.ai_dst_addr = &id->route.addr.dst_addr; ret = rdma_getaddrinfo(NULL, NULL, &hint, &rai); if (ret) return ret; if (rai->ai_route_len) ret = rdma_set_option(id, RDMA_OPTION_IB, RDMA_OPTION_IB_PATH, rai->ai_route, rai->ai_route_len); else ret = -1; rdma_freeaddrinfo(rai); return ret; } int rdma_resolve_route(struct rdma_cm_id *id, int timeout_ms) { struct ucma_abi_resolve_route cmd; struct cma_id_private *id_priv; int ret; id_priv = container_of(id, struct cma_id_private, id); if (id->verbs->device->transport_type == IBV_TRANSPORT_IB) { ret = ucma_set_ib_route(id); if (!ret) goto out; } CMA_INIT_CMD(&cmd, sizeof cmd, RESOLVE_ROUTE); cmd.id = id_priv->handle; cmd.timeout_ms = timeout_ms; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? 
ERR(ENODATA) : -1; out: return ucma_complete(id); } static int ucma_is_ud_qp(enum ibv_qp_type qp_type) { return (qp_type == IBV_QPT_UD); } int rdma_init_qp_attr(struct rdma_cm_id *id, struct ibv_qp_attr *qp_attr, int *qp_attr_mask) { struct ucma_abi_init_qp_attr cmd; struct ib_uverbs_qp_attr resp; struct cma_id_private *id_priv; int ret; CMA_INIT_CMD_RESP(&cmd, sizeof cmd, INIT_QP_ATTR, &resp, sizeof resp); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; cmd.qp_state = qp_attr->qp_state; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? ERR(ENODATA) : -1; VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); ibv_copy_qp_attr_from_kern(qp_attr, &resp); *qp_attr_mask = resp.qp_attr_mask; return 0; } static int ucma_modify_qp_rtr(struct rdma_cm_id *id, uint8_t resp_res) { struct cma_id_private *id_priv; struct ibv_qp_attr qp_attr; int qp_attr_mask, ret; uint8_t link_layer; if (!id->qp) return 0; /* Need to update QP attributes from default values. */ qp_attr.qp_state = IBV_QPS_INIT; ret = rdma_init_qp_attr(id, &qp_attr, &qp_attr_mask); if (ret) return ret; ret = ibv_modify_qp(id->qp, &qp_attr, qp_attr_mask); if (ret) return ERR(ret); qp_attr.qp_state = IBV_QPS_RTR; ret = rdma_init_qp_attr(id, &qp_attr, &qp_attr_mask); if (ret) return ret; /* * Workaround for rdma_ucm kernel bug: * mask off qp_attr_mask bits 21-24 which are used for RoCE */ id_priv = container_of(id, struct cma_id_private, id); link_layer = id_priv->cma_dev->port[id->port_num - 1].link_layer; if (link_layer == IBV_LINK_LAYER_INFINIBAND) qp_attr_mask &= UINT_MAX ^ 0xe00000; if (resp_res != RDMA_MAX_RESP_RES) qp_attr.max_dest_rd_atomic = resp_res; return rdma_seterrno(ibv_modify_qp(id->qp, &qp_attr, qp_attr_mask)); } static int ucma_modify_qp_rts(struct rdma_cm_id *id, uint8_t init_depth) { struct ibv_qp_attr qp_attr; int qp_attr_mask, ret; if (!id->qp) return 0; qp_attr.qp_state = IBV_QPS_RTS; ret = rdma_init_qp_attr(id, &qp_attr, &qp_attr_mask); if (ret) return ret; if (init_depth != RDMA_MAX_INIT_DEPTH) qp_attr.max_rd_atomic = init_depth; return rdma_seterrno(ibv_modify_qp(id->qp, &qp_attr, qp_attr_mask)); } static int ucma_modify_qp_sqd(struct rdma_cm_id *id) { struct ibv_qp_attr qp_attr; if (!id->qp) return 0; qp_attr.qp_state = IBV_QPS_SQD; return rdma_seterrno(ibv_modify_qp(id->qp, &qp_attr, IBV_QP_STATE)); } static int ucma_modify_qp_err(struct rdma_cm_id *id) { struct ibv_qp_attr qp_attr; if (!id->qp) return 0; qp_attr.qp_state = IBV_QPS_ERR; return rdma_seterrno(ibv_modify_qp(id->qp, &qp_attr, IBV_QP_STATE)); } static int ucma_init_conn_qp3(struct cma_id_private *id_priv, struct ibv_qp *qp) { struct ibv_qp_attr qp_attr; int ret; ret = ibv_get_pkey_index(id_priv->cma_dev->verbs, id_priv->id.port_num, id_priv->id.route.addr.addr.ibaddr.pkey); if (ret < 0) return ERR(EINVAL); qp_attr.pkey_index = ret; qp_attr.port_num = id_priv->id.port_num; qp_attr.qp_state = IBV_QPS_INIT; qp_attr.qp_access_flags = 0; ret = ibv_modify_qp(qp, &qp_attr, IBV_QP_STATE | IBV_QP_ACCESS_FLAGS | IBV_QP_PKEY_INDEX | IBV_QP_PORT); return rdma_seterrno(ret); } static int ucma_init_conn_qp(struct cma_id_private *id_priv, struct ibv_qp *qp) { struct ibv_qp_attr qp_attr; int qp_attr_mask, ret; if (abi_ver == 3) return ucma_init_conn_qp3(id_priv, qp); qp_attr.qp_state = IBV_QPS_INIT; ret = rdma_init_qp_attr(&id_priv->id, &qp_attr, &qp_attr_mask); if (ret) return ret; return rdma_seterrno(ibv_modify_qp(qp, &qp_attr, qp_attr_mask)); } static int ucma_init_ud_qp3(struct cma_id_private 
*id_priv, struct ibv_qp *qp) { struct ibv_qp_attr qp_attr; int ret; ret = ibv_get_pkey_index(id_priv->cma_dev->verbs, id_priv->id.port_num, id_priv->id.route.addr.addr.ibaddr.pkey); if (ret < 0) return ERR(EINVAL); qp_attr.pkey_index = ret; qp_attr.port_num = id_priv->id.port_num; qp_attr.qp_state = IBV_QPS_INIT; qp_attr.qkey = RDMA_UDP_QKEY; ret = ibv_modify_qp(qp, &qp_attr, IBV_QP_STATE | IBV_QP_QKEY | IBV_QP_PKEY_INDEX | IBV_QP_PORT); if (ret) return ERR(ret); qp_attr.qp_state = IBV_QPS_RTR; ret = ibv_modify_qp(qp, &qp_attr, IBV_QP_STATE); if (ret) return ERR(ret); qp_attr.qp_state = IBV_QPS_RTS; qp_attr.sq_psn = 0; ret = ibv_modify_qp(qp, &qp_attr, IBV_QP_STATE | IBV_QP_SQ_PSN); return rdma_seterrno(ret); } static int ucma_init_ud_qp(struct cma_id_private *id_priv, struct ibv_qp *qp) { struct ibv_qp_attr qp_attr; int qp_attr_mask, ret; if (abi_ver == 3) return ucma_init_ud_qp3(id_priv, qp); qp_attr.qp_state = IBV_QPS_INIT; ret = rdma_init_qp_attr(&id_priv->id, &qp_attr, &qp_attr_mask); if (ret) return ret; ret = ibv_modify_qp(qp, &qp_attr, qp_attr_mask); if (ret) return ERR(ret); qp_attr.qp_state = IBV_QPS_RTR; ret = ibv_modify_qp(qp, &qp_attr, IBV_QP_STATE); if (ret) return ERR(ret); qp_attr.qp_state = IBV_QPS_RTS; qp_attr.sq_psn = 0; ret = ibv_modify_qp(qp, &qp_attr, IBV_QP_STATE | IBV_QP_SQ_PSN); return rdma_seterrno(ret); } static void ucma_destroy_cqs(struct rdma_cm_id *id) { if (id->qp_type == IBV_QPT_XRC_RECV && id->srq) return; if (id->recv_cq) { ibv_destroy_cq(id->recv_cq); if (id->send_cq && (id->send_cq != id->recv_cq)) { ibv_destroy_cq(id->send_cq); id->send_cq = NULL; } id->recv_cq = NULL; } if (id->recv_cq_channel) { ibv_destroy_comp_channel(id->recv_cq_channel); if (id->send_cq_channel && (id->send_cq_channel != id->recv_cq_channel)) { ibv_destroy_comp_channel(id->send_cq_channel); id->send_cq_channel = NULL; } id->recv_cq_channel = NULL; } } static int ucma_create_cqs(struct rdma_cm_id *id, uint32_t send_size, uint32_t recv_size) { if (recv_size) { id->recv_cq_channel = ibv_create_comp_channel(id->verbs); if (!id->recv_cq_channel) goto err; id->recv_cq = ibv_create_cq(id->verbs, recv_size, id, id->recv_cq_channel, 0); if (!id->recv_cq) goto err; } if (send_size) { id->send_cq_channel = ibv_create_comp_channel(id->verbs); if (!id->send_cq_channel) goto err; id->send_cq = ibv_create_cq(id->verbs, send_size, id, id->send_cq_channel, 0); if (!id->send_cq) goto err; } return 0; err: ucma_destroy_cqs(id); return -1; } int rdma_create_srq_ex(struct rdma_cm_id *id, struct ibv_srq_init_attr_ex *attr) { struct cma_id_private *id_priv; struct ibv_srq *srq; int ret; id_priv = container_of(id, struct cma_id_private, id); if (!(attr->comp_mask & IBV_SRQ_INIT_ATTR_TYPE)) return ERR(EINVAL); if (!(attr->comp_mask & IBV_SRQ_INIT_ATTR_PD) || !attr->pd) { attr->pd = id->pd; attr->comp_mask |= IBV_SRQ_INIT_ATTR_PD; } if (attr->srq_type == IBV_SRQT_XRC) { if (!(attr->comp_mask & IBV_SRQ_INIT_ATTR_XRCD) || !attr->xrcd) { attr->xrcd = ucma_get_xrcd(id_priv->cma_dev); if (!attr->xrcd) return -1; } if (!(attr->comp_mask & IBV_SRQ_INIT_ATTR_CQ) || !attr->cq) { ret = ucma_create_cqs(id, 0, attr->attr.max_wr); if (ret) return ret; attr->cq = id->recv_cq; } attr->comp_mask |= IBV_SRQ_INIT_ATTR_XRCD | IBV_SRQ_INIT_ATTR_CQ; } srq = ibv_create_srq_ex(id->verbs, attr); if (!srq) { ret = -1; goto err; } if (!id->pd) id->pd = attr->pd; id->srq = srq; return 0; err: ucma_destroy_cqs(id); return ret; } int rdma_create_srq(struct rdma_cm_id *id, struct ibv_pd *pd, struct ibv_srq_init_attr *attr) { struct 
ibv_srq_init_attr_ex attr_ex; int ret; memcpy(&attr_ex, attr, sizeof(*attr)); attr_ex.comp_mask = IBV_SRQ_INIT_ATTR_TYPE | IBV_SRQ_INIT_ATTR_PD; if (id->qp_type == IBV_QPT_XRC_RECV) { attr_ex.srq_type = IBV_SRQT_XRC; } else { attr_ex.srq_type = IBV_SRQT_BASIC; } attr_ex.pd = pd; ret = rdma_create_srq_ex(id, &attr_ex); memcpy(attr, &attr_ex, sizeof(*attr)); return ret; } void rdma_destroy_srq(struct rdma_cm_id *id) { ibv_destroy_srq(id->srq); id->srq = NULL; ucma_destroy_cqs(id); } static int init_ece(struct rdma_cm_id *id, struct ibv_qp *qp) { struct cma_id_private *id_priv = container_of(id, struct cma_id_private, id); struct ibv_ece ece = {}; int ret; ret = ibv_query_ece(qp, &ece); if (ret && ret != EOPNOTSUPP) return ERR(ret); id_priv->local_ece.vendor_id = ece.vendor_id; id_priv->local_ece.options = ece.options; if (!id_priv->remote_ece.vendor_id) /* * This QP was created explicitly, and we don't need to do * anything beyond setting the local_ece values. */ return 0; /* This QP was created due to a REQ event */ if (id_priv->remote_ece.vendor_id != id_priv->local_ece.vendor_id) { /* * Signal to the provider that the other ECE node is a * different vendor, and clear the ECE options. */ ece.vendor_id = id_priv->local_ece.vendor_id; ece.options = 0; } else { ece.vendor_id = id_priv->remote_ece.vendor_id; ece.options = id_priv->remote_ece.options; } ret = ibv_set_ece(qp, &ece); return (ret && ret != EOPNOTSUPP) ? ERR(ret) : 0; } static int set_local_ece(struct rdma_cm_id *id, struct ibv_qp *qp) { struct cma_id_private *id_priv = container_of(id, struct cma_id_private, id); struct ibv_ece ece = {}; int ret; if (!id_priv->remote_ece.vendor_id) return 0; ret = ibv_query_ece(qp, &ece); if (ret && ret != EOPNOTSUPP) return ERR(ret); id_priv->local_ece.options = ece.options; return 0; } int rdma_create_qp_ex(struct rdma_cm_id *id, struct ibv_qp_init_attr_ex *attr) { struct cma_id_private *id_priv; struct ibv_qp *qp; int ret; if (id->qp) return ERR(EINVAL); id_priv = container_of(id, struct cma_id_private, id); if (!(attr->comp_mask & IBV_QP_INIT_ATTR_PD) || !attr->pd) { attr->comp_mask |= IBV_QP_INIT_ATTR_PD; attr->pd = id->pd; } else if (id->verbs != attr->pd->context) return ERR(EINVAL); if ((id->recv_cq && attr->recv_cq && id->recv_cq != attr->recv_cq) || (id->send_cq && attr->send_cq && id->send_cq != attr->send_cq)) return ERR(EINVAL); if (id->qp_type == IBV_QPT_XRC_RECV) { if (!(attr->comp_mask & IBV_QP_INIT_ATTR_XRCD) || !attr->xrcd) { attr->xrcd = ucma_get_xrcd(id_priv->cma_dev); if (!attr->xrcd) return -1; attr->comp_mask |= IBV_QP_INIT_ATTR_XRCD; } } ret = ucma_create_cqs(id, attr->send_cq || id->send_cq ? 0 : attr->cap.max_send_wr, attr->recv_cq || id->recv_cq ?
0 : attr->cap.max_recv_wr); if (ret) return ret; if (!attr->send_cq) attr->send_cq = id->send_cq; if (!attr->recv_cq) attr->recv_cq = id->recv_cq; if (id->srq && !attr->srq) attr->srq = id->srq; qp = ibv_create_qp_ex(id->verbs, attr); if (!qp) { ret = -1; goto err1; } ret = init_ece(id, qp); if (ret) goto err2; if (ucma_is_ud_qp(id->qp_type)) ret = ucma_init_ud_qp(id_priv, qp); else ret = ucma_init_conn_qp(id_priv, qp); if (ret) goto err2; ret = set_local_ece(id, qp); if (ret) goto err2; id->pd = qp->pd; id->qp = qp; return 0; err2: ibv_destroy_qp(qp); err1: ucma_destroy_cqs(id); return ret; } int rdma_create_qp(struct rdma_cm_id *id, struct ibv_pd *pd, struct ibv_qp_init_attr *qp_init_attr) { struct ibv_qp_init_attr_ex attr_ex; int ret; memcpy(&attr_ex, qp_init_attr, sizeof(*qp_init_attr)); attr_ex.comp_mask = IBV_QP_INIT_ATTR_PD; attr_ex.pd = pd ? pd : id->pd; ret = rdma_create_qp_ex(id, &attr_ex); memcpy(qp_init_attr, &attr_ex, sizeof(*qp_init_attr)); return ret; } void rdma_destroy_qp(struct rdma_cm_id *id) { ibv_destroy_qp(id->qp); id->qp = NULL; ucma_destroy_cqs(id); } static int ucma_valid_param(struct cma_id_private *id_priv, struct rdma_conn_param *param) { if (id_priv->id.ps != RDMA_PS_TCP) return 0; if (!id_priv->id.qp && !param) goto err; if (!param) return 0; if ((param->responder_resources != RDMA_MAX_RESP_RES) && (param->responder_resources > id_priv->cma_dev->max_responder_resources)) goto err; if ((param->initiator_depth != RDMA_MAX_INIT_DEPTH) && (param->initiator_depth > id_priv->cma_dev->max_initiator_depth)) goto err; return 0; err: return ERR(EINVAL); } static void ucma_copy_conn_param_to_kern(struct cma_id_private *id_priv, struct ucma_abi_conn_param *dst, struct rdma_conn_param *src, uint32_t qp_num, uint8_t srq) { dst->qp_num = qp_num; dst->srq = srq; dst->responder_resources = id_priv->responder_resources; dst->initiator_depth = id_priv->initiator_depth; dst->valid = 1; if (id_priv->connect_len) { memcpy(dst->private_data, id_priv->connect, id_priv->connect_len); dst->private_data_len = id_priv->connect_len; } if (src) { dst->flow_control = src->flow_control; dst->retry_count = src->retry_count; dst->rnr_retry_count = src->rnr_retry_count; if (src->private_data && src->private_data_len) { memcpy(dst->private_data + dst->private_data_len, src->private_data, src->private_data_len); dst->private_data_len += src->private_data_len; } } else { dst->retry_count = 7; dst->rnr_retry_count = 7; } } static void ucma_copy_ece_param_to_kern_req(struct cma_id_private *id_priv, struct ucma_abi_ece *dst) { dst->vendor_id = id_priv->local_ece.vendor_id; dst->attr_mod = id_priv->local_ece.options; } int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param) { uint32_t qp_num = conn_param ? conn_param->qp_num : 0; uint8_t srq = conn_param ? 
conn_param->srq : 0; struct ucma_abi_connect cmd; struct cma_id_private *id_priv; int ret; id_priv = container_of(id, struct cma_id_private, id); ret = ucma_valid_param(id_priv, conn_param); if (ret) return ret; if (conn_param && conn_param->initiator_depth != RDMA_MAX_INIT_DEPTH) id_priv->initiator_depth = conn_param->initiator_depth; else id_priv->initiator_depth = id_priv->cma_dev->max_initiator_depth; if (conn_param && conn_param->responder_resources != RDMA_MAX_RESP_RES) id_priv->responder_resources = conn_param->responder_resources; else id_priv->responder_resources = id_priv->cma_dev->max_responder_resources; CMA_INIT_CMD(&cmd, sizeof cmd, CONNECT); cmd.id = id_priv->handle; if (id->qp) { qp_num = id->qp->qp_num; srq = !!id->qp->srq; } ucma_copy_conn_param_to_kern(id_priv, &cmd.conn_param, conn_param, qp_num, srq); ucma_copy_ece_param_to_kern_req(id_priv, &cmd.ece); ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? ERR(ENODATA) : -1; if (id_priv->connect_len) { free(id_priv->connect); id_priv->connect_len = 0; } return ucma_complete(id); } int rdma_listen(struct rdma_cm_id *id, int backlog) { struct ucma_abi_listen cmd; struct cma_id_private *id_priv; int ret; CMA_INIT_CMD(&cmd, sizeof cmd, LISTEN); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; cmd.backlog = backlog; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? ERR(ENODATA) : -1; if (af_ib_support) return ucma_query_addr(id); else return ucma_query_route(id); } int rdma_get_request(struct rdma_cm_id *listen, struct rdma_cm_id **id) { struct cma_id_private *id_priv; struct rdma_cm_event *event; int ret; id_priv = container_of(listen, struct cma_id_private, id); if (!id_priv->sync) return ERR(EINVAL); if (listen->event) { rdma_ack_cm_event(listen->event); listen->event = NULL; } ret = rdma_get_cm_event(listen->channel, &event); if (ret) return ret; if (event->event == RDMA_CM_EVENT_REJECTED) { ret = ERR(ECONNREFUSED); goto err; } if (event->status) { ret = ERR(-event->status); goto err; } if (event->event != RDMA_CM_EVENT_CONNECT_REQUEST) { ret = ERR(EINVAL); goto err; } if (id_priv->qp_init_attr) { struct ibv_qp_init_attr attr; attr = *id_priv->qp_init_attr; ret = rdma_create_qp(event->id, listen->pd, &attr); if (ret) goto err; } *id = event->id; (*id)->event = event; return 0; err: listen->event = event; return ret; } static void ucma_copy_ece_param_to_kern_rep(struct cma_id_private *id_priv, struct ucma_abi_ece *dst) { /* Return result with same ID as received. */ dst->vendor_id = id_priv->remote_ece.vendor_id; dst->attr_mod = id_priv->local_ece.options; } int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param) { uint32_t qp_num = id->qp ? id->qp->qp_num : conn_param->qp_num; uint8_t srq = id->qp ? 
!!id->qp->srq : conn_param->srq; struct ucma_abi_accept cmd; struct cma_id_private *id_priv; int ret; id_priv = container_of(id, struct cma_id_private, id); ret = ucma_valid_param(id_priv, conn_param); if (ret) return ret; if (!conn_param || conn_param->initiator_depth == RDMA_MAX_INIT_DEPTH) { id_priv->initiator_depth = min(id_priv->initiator_depth, id_priv->cma_dev->max_initiator_depth); } else { id_priv->initiator_depth = conn_param->initiator_depth; } if (!conn_param || conn_param->responder_resources == RDMA_MAX_RESP_RES) { id_priv->responder_resources = min(id_priv->responder_resources, id_priv->cma_dev->max_responder_resources); } else { id_priv->responder_resources = conn_param->responder_resources; } if (!ucma_is_ud_qp(id->qp_type)) { ret = ucma_modify_qp_rtr(id, id_priv->responder_resources); if (ret) return ret; ret = ucma_modify_qp_rts(id, id_priv->initiator_depth); if (ret) return ret; } CMA_INIT_CMD(&cmd, sizeof cmd, ACCEPT); cmd.id = id_priv->handle; cmd.uid = (uintptr_t) id_priv; ucma_copy_conn_param_to_kern(id_priv, &cmd.conn_param, conn_param, qp_num, srq); ucma_copy_ece_param_to_kern_rep(id_priv, &cmd.ece); ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) { ucma_modify_qp_err(id); return (ret >= 0) ? ERR(ENODATA) : -1; } if (ucma_is_ud_qp(id->qp_type)) { if (id_priv->sync && id_priv->id.event) { rdma_ack_cm_event(id_priv->id.event); id_priv->id.event = NULL; } return 0; } return ucma_complete(id); } static int reject_with_reason(struct rdma_cm_id *id, const void *private_data, uint8_t private_data_len, uint8_t reason) { struct ucma_abi_reject cmd; struct cma_id_private *id_priv; int ret; CMA_INIT_CMD(&cmd, sizeof cmd, REJECT); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; if (private_data && private_data_len) { memcpy(cmd.private_data, private_data, private_data_len); cmd.private_data_len = private_data_len; } cmd.reason = reason; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? ERR(ENODATA) : -1; return 0; } int rdma_reject(struct rdma_cm_id *id, const void *private_data, uint8_t private_data_len) { return reject_with_reason(id, private_data, private_data_len, 0); } int rdma_reject_ece(struct rdma_cm_id *id, const void *private_data, uint8_t private_data_len) { /* IBTA defines CM_REJ_VENDOR_OPTION_NOT_SUPPORTED as 35 */ return reject_with_reason(id, private_data, private_data_len, 35); } int rdma_notify(struct rdma_cm_id *id, enum ibv_event_type event) { struct ucma_abi_notify cmd; struct cma_id_private *id_priv; int ret; CMA_INIT_CMD(&cmd, sizeof cmd, NOTIFY); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; cmd.event = event; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? ERR(ENODATA) : -1; return 0; } int ucma_shutdown(struct rdma_cm_id *id) { if (!id->verbs || !id->verbs->device) return ERR(EINVAL); switch (id->verbs->device->transport_type) { case IBV_TRANSPORT_IB: return ucma_modify_qp_err(id); case IBV_TRANSPORT_IWARP: return ucma_modify_qp_sqd(id); default: return ERR(EINVAL); } } int rdma_disconnect(struct rdma_cm_id *id) { struct ucma_abi_disconnect cmd; struct cma_id_private *id_priv; int ret; ret = ucma_shutdown(id); if (ret) return ret; CMA_INIT_CMD(&cmd, sizeof cmd, DISCONNECT); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? 
ERR(ENODATA) : -1; return ucma_complete(id); } static int rdma_join_multicast2(struct rdma_cm_id *id, struct sockaddr *addr, socklen_t addrlen, uint16_t join_flags, void *context) { struct ucma_abi_create_id_resp resp; struct cma_id_private *id_priv; struct cma_multicast *mc, **pos; int ret; id_priv = container_of(id, struct cma_id_private, id); mc = calloc(1, sizeof(*mc)); if (!mc) return ERR(ENOMEM); mc->context = context; mc->id_priv = id_priv; mc->join_flags = join_flags; memcpy(&mc->addr, addr, addrlen); if (pthread_cond_init(&mc->cond, NULL)) { ret = -1; goto err1; } pthread_mutex_lock(&id_priv->mut); mc->next = id_priv->mc_list; id_priv->mc_list = mc; pthread_mutex_unlock(&id_priv->mut); if (af_ib_support) { struct ucma_abi_join_mcast cmd; CMA_INIT_CMD_RESP(&cmd, sizeof cmd, JOIN_MCAST, &resp, sizeof resp); cmd.id = id_priv->handle; memcpy(&cmd.addr, addr, addrlen); cmd.addr_size = addrlen; cmd.uid = (uintptr_t) mc; cmd.join_flags = join_flags; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) { ret = (ret >= 0) ? ERR(ENODATA) : -1; goto err2; } } else { struct ucma_abi_join_ip_mcast cmd; CMA_INIT_CMD_RESP(&cmd, sizeof cmd, JOIN_IP_MCAST, &resp, sizeof resp); cmd.id = id_priv->handle; memcpy(&cmd.addr, addr, addrlen); cmd.uid = (uintptr_t) mc; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) { ret = (ret >= 0) ? ERR(ENODATA) : -1; goto err2; } } VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); mc->handle = resp.id; return ucma_complete(id); err2: pthread_mutex_lock(&id_priv->mut); for (pos = &id_priv->mc_list; *pos != mc; pos = &(*pos)->next) ; *pos = mc->next; pthread_mutex_unlock(&id_priv->mut); err1: free(mc); return ret; } int rdma_join_multicast_ex(struct rdma_cm_id *id, struct rdma_cm_join_mc_attr_ex *mc_join_attr, void *context) { int addrlen; if (mc_join_attr->comp_mask >= RDMA_CM_JOIN_MC_ATTR_RESERVED) return ERR(ENOTSUP); if (!(mc_join_attr->comp_mask & RDMA_CM_JOIN_MC_ATTR_ADDRESS)) return ERR(EINVAL); if (!(mc_join_attr->comp_mask & RDMA_CM_JOIN_MC_ATTR_JOIN_FLAGS) || (mc_join_attr->join_flags >= RDMA_MC_JOIN_FLAG_RESERVED)) return ERR(EINVAL); addrlen = ucma_addrlen(mc_join_attr->addr); if (!addrlen) return ERR(EINVAL); return rdma_join_multicast2(id, mc_join_attr->addr, addrlen, mc_join_attr->join_flags, context); } int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr, void *context) { int addrlen; addrlen = ucma_addrlen(addr); if (!addrlen) return ERR(EINVAL); return rdma_join_multicast2(id, addr, addrlen, RDMA_MC_JOIN_FLAG_FULLMEMBER, context); } int rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr) { struct ucma_abi_destroy_id cmd; struct ucma_abi_destroy_id_resp resp; struct cma_id_private *id_priv; struct cma_multicast *mc, **pos; int ret, addrlen; addrlen = ucma_addrlen(addr); if (!addrlen) return ERR(EINVAL); id_priv = container_of(id, struct cma_id_private, id); pthread_mutex_lock(&id_priv->mut); for (pos = &id_priv->mc_list; *pos; pos = &(*pos)->next) if (!memcmp(&(*pos)->addr, addr, addrlen)) break; mc = *pos; if (*pos) *pos = mc->next; pthread_mutex_unlock(&id_priv->mut); if (!mc) return ERR(EADDRNOTAVAIL); if (id->qp && (mc->join_flags != RDMA_MC_JOIN_FLAG_SENDONLY_FULLMEMBER)) ibv_detach_mcast(id->qp, &mc->mgid, mc->mlid); CMA_INIT_CMD_RESP(&cmd, sizeof cmd, LEAVE_MCAST, &resp, sizeof resp); cmd.id = mc->handle; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) { ret = (ret >= 0) ? 
ERR(ENODATA) : -1; goto free; } VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); pthread_mutex_lock(&id_priv->mut); while (mc->events_completed < resp.events_reported) pthread_cond_wait(&mc->cond, &id_priv->mut); pthread_mutex_unlock(&id_priv->mut); ret = 0; free: free(mc); return ret; } static void ucma_complete_event(struct cma_id_private *id_priv) { pthread_mutex_lock(&id_priv->mut); id_priv->events_completed++; pthread_cond_signal(&id_priv->cond); pthread_mutex_unlock(&id_priv->mut); } static void ucma_complete_mc_event(struct cma_multicast *mc) { pthread_mutex_lock(&mc->id_priv->mut); mc->events_completed++; pthread_cond_signal(&mc->cond); mc->id_priv->events_completed++; pthread_cond_signal(&mc->id_priv->cond); pthread_mutex_unlock(&mc->id_priv->mut); } int rdma_ack_cm_event(struct rdma_cm_event *event) { struct cma_event *evt; if (!event) return ERR(EINVAL); evt = container_of(event, struct cma_event, event); if (evt->mc) ucma_complete_mc_event(evt->mc); else ucma_complete_event(evt->id_priv); free(evt); return 0; } static void ucma_process_addr_resolved(struct cma_event *evt) { struct rdma_cm_id *id = &evt->id_priv->id; if (af_ib_support) { evt->event.status = ucma_query_addr(id); if (!evt->event.status && !id->verbs) goto err_dev; if (!evt->event.status && id->verbs->device->transport_type == IBV_TRANSPORT_IB) { evt->event.status = ucma_query_gid(id); } } else { evt->event.status = ucma_query_route(id); if (!evt->event.status && !id->verbs) goto err_dev; } if (evt->event.status) evt->event.event = RDMA_CM_EVENT_ADDR_ERROR; return; err_dev: evt->event.status = ERR(ENODEV); evt->event.event = RDMA_CM_EVENT_ADDR_ERROR; } static void ucma_process_route_resolved(struct cma_event *evt) { if (evt->id_priv->id.verbs->device->transport_type != IBV_TRANSPORT_IB) return; if (af_ib_support) evt->event.status = ucma_query_path(&evt->id_priv->id); else evt->event.status = ucma_query_route(&evt->id_priv->id); if (evt->event.status) evt->event.event = RDMA_CM_EVENT_ROUTE_ERROR; } static int ucma_query_req_info(struct rdma_cm_id *id) { int ret; if (!af_ib_support) return ucma_query_route(id); ret = ucma_query_addr(id); if (ret) return ret; ret = ucma_query_gid(id); if (ret) return ret; ret = ucma_query_path(id); if (ret) return ret; return 0; } static int ucma_process_conn_req(struct cma_event *evt, uint32_t handle, struct ucma_abi_ece *ece) { struct cma_id_private *id_priv; int ret; id_priv = ucma_alloc_id(evt->id_priv->id.channel, evt->id_priv->id.context, evt->id_priv->id.ps, evt->id_priv->id.qp_type); if (!id_priv) { ucma_destroy_kern_id(evt->id_priv->id.channel->fd, handle); ret = ERR(ENOMEM); goto err1; } evt->event.listen_id = &evt->id_priv->id; evt->event.id = &id_priv->id; id_priv->handle = handle; ucma_insert_id(id_priv); id_priv->initiator_depth = evt->event.param.conn.initiator_depth; id_priv->responder_resources = evt->event.param.conn.responder_resources; id_priv->remote_ece.vendor_id = ece->vendor_id; id_priv->remote_ece.options = ece->attr_mod; if (evt->id_priv->sync) { ret = rdma_migrate_id(&id_priv->id, NULL); if (ret) goto err2; } ret = ucma_query_req_info(&id_priv->id); if (ret) goto err2; return 0; err2: rdma_destroy_id(&id_priv->id); err1: ucma_complete_event(evt->id_priv); return ret; } static int ucma_process_conn_resp(struct cma_id_private *id_priv) { struct ucma_abi_accept cmd; int ret; ret = ucma_modify_qp_rtr(&id_priv->id, RDMA_MAX_RESP_RES); if (ret) goto err; ret = ucma_modify_qp_rts(&id_priv->id, RDMA_MAX_INIT_DEPTH); if (ret) goto err; CMA_INIT_CMD(&cmd, sizeof cmd, 
ACCEPT); cmd.id = id_priv->handle; ret = write(id_priv->id.channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) { ret = (ret >= 0) ? ERR(ENODATA) : -1; goto err; } return 0; err: ucma_modify_qp_err(&id_priv->id); return ret; } static int ucma_process_conn_resp_ece(struct cma_id_private *id_priv, struct ucma_abi_ece *ece) { struct ibv_ece ibv_ece = { .vendor_id = ece->vendor_id, .options = ece->attr_mod }; int ret; /* This is response handler */ if (!ece->vendor_id) { /* * Kernel or user-space doesn't support ECE transfer, * clear everything. */ ibv_ece.vendor_id = id_priv->local_ece.vendor_id; ibv_ece.options = 0; } else if (ece->vendor_id != id_priv->local_ece.vendor_id) { /* * At this point remote vendor_id should be the same * as the local one, or something bad happened in * ECE handshake implementation. */ ucma_modify_qp_err(&id_priv->id); return ERR(EINVAL); } id_priv->remote_ece.vendor_id = ece->vendor_id; ret = ibv_set_ece(id_priv->id.qp, &ibv_ece); if (ret && ret != EOPNOTSUPP) return ret; ret = ucma_process_conn_resp(id_priv); if (ret) return ret; ret = ibv_query_ece(id_priv->id.qp, &ibv_ece); if (ret && ret != EOPNOTSUPP) { ucma_modify_qp_err(&id_priv->id); return ret; } id_priv->local_ece.options = (ret == EOPNOTSUPP) ? 0 : ibv_ece.options; return 0; } static int ucma_process_join(struct cma_event *evt) { evt->mc->mgid = evt->event.param.ud.ah_attr.grh.dgid; evt->mc->mlid = evt->event.param.ud.ah_attr.dlid; if (!evt->id_priv->id.qp) return 0; /* Don't attach QP to multicast if joined as send only full member */ if (evt->mc->join_flags == RDMA_MC_JOIN_FLAG_SENDONLY_FULLMEMBER) return 0; return rdma_seterrno(ibv_attach_mcast(evt->id_priv->id.qp, &evt->mc->mgid, evt->mc->mlid)); } static void ucma_copy_conn_event(struct cma_event *event, struct ucma_abi_conn_param *src) { struct rdma_conn_param *dst = &event->event.param.conn; dst->private_data_len = src->private_data_len; if (src->private_data_len) { dst->private_data = &event->private_data; memcpy(&event->private_data, src->private_data, src->private_data_len); } dst->responder_resources = src->responder_resources; dst->initiator_depth = src->initiator_depth; dst->flow_control = src->flow_control; dst->retry_count = src->retry_count; dst->rnr_retry_count = src->rnr_retry_count; dst->srq = src->srq; dst->qp_num = src->qp_num; } static void ucma_copy_ud_event(struct cma_event *event, struct ucma_abi_ud_param *src) { struct rdma_ud_param *dst = &event->event.param.ud; dst->private_data_len = src->private_data_len; if (src->private_data_len) { dst->private_data = &event->private_data; memcpy(&event->private_data, src->private_data, src->private_data_len); } ibv_copy_ah_attr_from_kern(&dst->ah_attr, &src->ah_attr); dst->qp_num = src->qp_num; dst->qkey = src->qkey; } int rdma_establish(struct rdma_cm_id *id) { if (id->qp) return ERR(EINVAL); /* id->qp is NULL, so ucma_process_conn_resp() will only send ACCEPT to * the passive side, and will not attempt to modify the QP. 
*/ return ucma_process_conn_resp(container_of(id, struct cma_id_private, id)); } int rdma_get_cm_event(struct rdma_event_channel *channel, struct rdma_cm_event **event) { struct ucma_abi_event_resp resp = {}; struct ucma_abi_get_event cmd; struct cma_event *evt; int ret; ret = ucma_init(); if (ret) return ret; if (!event) return ERR(EINVAL); evt = malloc(sizeof(*evt)); if (!evt) return ERR(ENOMEM); retry: memset(evt, 0, sizeof(*evt)); CMA_INIT_CMD_RESP(&cmd, sizeof cmd, GET_EVENT, &resp, sizeof resp); ret = write(channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) { free(evt); return (ret >= 0) ? ERR(ENODATA) : -1; } VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); evt->event.event = resp.event; /* * We should have a non-zero uid, except for connection requests. * But a bug in older kernels can report a uid 0. Work-around this * issue by looking up the cma_id based on the kernel's id when the * uid is 0 and we're processing a connection established event. * In all other cases, if the uid is 0, we discard the event, like * the kernel should have done. */ if (resp.uid) { evt->id_priv = (void *) (uintptr_t) resp.uid; } else { evt->id_priv = ucma_lookup_id(resp.id); if (!evt->id_priv) { syslog(LOG_WARNING, PFX "Warning: discarding unmatched " "event - rdma_destroy_id may hang.\n"); goto retry; } if (resp.event != RDMA_CM_EVENT_ESTABLISHED) { ucma_complete_event(evt->id_priv); goto retry; } } evt->event.id = &evt->id_priv->id; evt->event.status = resp.status; switch (resp.event) { case RDMA_CM_EVENT_ADDR_RESOLVED: ucma_process_addr_resolved(evt); break; case RDMA_CM_EVENT_ROUTE_RESOLVED: ucma_process_route_resolved(evt); break; case RDMA_CM_EVENT_CONNECT_REQUEST: evt->id_priv = (void *) (uintptr_t) resp.uid; if (ucma_is_ud_qp(evt->id_priv->id.qp_type)) ucma_copy_ud_event(evt, &resp.param.ud); else ucma_copy_conn_event(evt, &resp.param.conn); ret = ucma_process_conn_req(evt, resp.id, &resp.ece); if (ret) goto retry; break; case RDMA_CM_EVENT_CONNECT_RESPONSE: ucma_copy_conn_event(evt, &resp.param.conn); if (!evt->id_priv->id.qp) { evt->event.event = RDMA_CM_EVENT_CONNECT_RESPONSE; evt->id_priv->remote_ece.vendor_id = resp.ece.vendor_id; evt->id_priv->remote_ece.options = resp.ece.attr_mod; } else { evt->event.status = ucma_process_conn_resp_ece( evt->id_priv, &resp.ece); if (!evt->event.status) evt->event.event = RDMA_CM_EVENT_ESTABLISHED; else { evt->event.event = RDMA_CM_EVENT_CONNECT_ERROR; evt->id_priv->connect_error = 1; } } break; case RDMA_CM_EVENT_ESTABLISHED: if (ucma_is_ud_qp(evt->id_priv->id.qp_type)) { ucma_copy_ud_event(evt, &resp.param.ud); break; } ucma_copy_conn_event(evt, &resp.param.conn); break; case RDMA_CM_EVENT_REJECTED: if (evt->id_priv->connect_error) { ucma_complete_event(evt->id_priv); goto retry; } ucma_copy_conn_event(evt, &resp.param.conn); ucma_modify_qp_err(evt->event.id); break; case RDMA_CM_EVENT_DISCONNECTED: if (evt->id_priv->connect_error) { ucma_complete_event(evt->id_priv); goto retry; } ucma_copy_conn_event(evt, &resp.param.conn); break; case RDMA_CM_EVENT_MULTICAST_JOIN: evt->mc = (void *) (uintptr_t) resp.uid; evt->id_priv = evt->mc->id_priv; evt->event.id = &evt->id_priv->id; ucma_copy_ud_event(evt, &resp.param.ud); evt->event.param.ud.private_data = evt->mc->context; evt->event.status = ucma_process_join(evt); if (evt->event.status) evt->event.event = RDMA_CM_EVENT_MULTICAST_ERROR; break; case RDMA_CM_EVENT_MULTICAST_ERROR: evt->mc = (void *) (uintptr_t) resp.uid; evt->id_priv = evt->mc->id_priv; evt->event.id = &evt->id_priv->id; 
evt->event.param.ud.private_data = evt->mc->context; break; default: evt->id_priv = (void *) (uintptr_t) resp.uid; evt->event.id = &evt->id_priv->id; evt->event.status = resp.status; if (ucma_is_ud_qp(evt->id_priv->id.qp_type)) ucma_copy_ud_event(evt, &resp.param.ud); else ucma_copy_conn_event(evt, &resp.param.conn); break; } *event = &evt->event; return 0; } const char *rdma_event_str(enum rdma_cm_event_type event) { switch (event) { case RDMA_CM_EVENT_ADDR_RESOLVED: return "RDMA_CM_EVENT_ADDR_RESOLVED"; case RDMA_CM_EVENT_ADDR_ERROR: return "RDMA_CM_EVENT_ADDR_ERROR"; case RDMA_CM_EVENT_ROUTE_RESOLVED: return "RDMA_CM_EVENT_ROUTE_RESOLVED"; case RDMA_CM_EVENT_ROUTE_ERROR: return "RDMA_CM_EVENT_ROUTE_ERROR"; case RDMA_CM_EVENT_CONNECT_REQUEST: return "RDMA_CM_EVENT_CONNECT_REQUEST"; case RDMA_CM_EVENT_CONNECT_RESPONSE: return "RDMA_CM_EVENT_CONNECT_RESPONSE"; case RDMA_CM_EVENT_CONNECT_ERROR: return "RDMA_CM_EVENT_CONNECT_ERROR"; case RDMA_CM_EVENT_UNREACHABLE: return "RDMA_CM_EVENT_UNREACHABLE"; case RDMA_CM_EVENT_REJECTED: return "RDMA_CM_EVENT_REJECTED"; case RDMA_CM_EVENT_ESTABLISHED: return "RDMA_CM_EVENT_ESTABLISHED"; case RDMA_CM_EVENT_DISCONNECTED: return "RDMA_CM_EVENT_DISCONNECTED"; case RDMA_CM_EVENT_DEVICE_REMOVAL: return "RDMA_CM_EVENT_DEVICE_REMOVAL"; case RDMA_CM_EVENT_MULTICAST_JOIN: return "RDMA_CM_EVENT_MULTICAST_JOIN"; case RDMA_CM_EVENT_MULTICAST_ERROR: return "RDMA_CM_EVENT_MULTICAST_ERROR"; case RDMA_CM_EVENT_ADDR_CHANGE: return "RDMA_CM_EVENT_ADDR_CHANGE"; case RDMA_CM_EVENT_TIMEWAIT_EXIT: return "RDMA_CM_EVENT_TIMEWAIT_EXIT"; default: return "UNKNOWN EVENT"; } } int rdma_set_option(struct rdma_cm_id *id, int level, int optname, void *optval, size_t optlen) { struct ucma_abi_set_option cmd; struct cma_id_private *id_priv; int ret; CMA_INIT_CMD(&cmd, sizeof cmd, SET_OPTION); id_priv = container_of(id, struct cma_id_private, id); cmd.id = id_priv->handle; cmd.optval = (uintptr_t) optval; cmd.level = level; cmd.optname = optname; cmd.optlen = optlen; ret = write(id->channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) return (ret >= 0) ? ERR(ENODATA) : -1; return 0; } int rdma_migrate_id(struct rdma_cm_id *id, struct rdma_event_channel *channel) { struct ucma_abi_migrate_resp resp; struct ucma_abi_migrate_id cmd; struct cma_id_private *id_priv; int ret, sync; id_priv = container_of(id, struct cma_id_private, id); if (id_priv->sync && !channel) return ERR(EINVAL); if ((sync = (channel == NULL))) { channel = rdma_create_event_channel(); if (!channel) return -1; } CMA_INIT_CMD_RESP(&cmd, sizeof cmd, MIGRATE_ID, &resp, sizeof resp); cmd.id = id_priv->handle; cmd.fd = id->channel->fd; ret = write(channel->fd, &cmd, sizeof cmd); if (ret != sizeof cmd) { if (sync) rdma_destroy_event_channel(channel); return (ret >= 0) ? ERR(ENODATA) : -1; } VALGRIND_MAKE_MEM_DEFINED(&resp, sizeof resp); if (id_priv->sync) { if (id->event) { rdma_ack_cm_event(id->event); id->event = NULL; } rdma_destroy_event_channel(id->channel); } /* * Eventually if we want to support migrating channels while events are * being processed on the current channel, we need to block here while * there are any outstanding events on the current channel for this id * to prevent the user from processing events for this id on the old * channel after this call returns. 
*/ pthread_mutex_lock(&id_priv->mut); id_priv->sync = sync; id->channel = channel; while (id_priv->events_completed < resp.events_reported) pthread_cond_wait(&id_priv->cond, &id_priv->mut); pthread_mutex_unlock(&id_priv->mut); return 0; } static int ucma_passive_ep(struct rdma_cm_id *id, struct rdma_addrinfo *res, struct ibv_pd *pd, struct ibv_qp_init_attr *qp_init_attr) { struct cma_id_private *id_priv; int ret; if (af_ib_support) ret = rdma_bind_addr2(id, res->ai_src_addr, res->ai_src_len); else ret = rdma_bind_addr(id, res->ai_src_addr); if (ret) return ret; id_priv = container_of(id, struct cma_id_private, id); if (pd) id->pd = pd; if (qp_init_attr) { id_priv->qp_init_attr = malloc(sizeof(*qp_init_attr)); if (!id_priv->qp_init_attr) return ERR(ENOMEM); *id_priv->qp_init_attr = *qp_init_attr; id_priv->qp_init_attr->qp_type = res->ai_qp_type; } return 0; } int rdma_create_ep(struct rdma_cm_id **id, struct rdma_addrinfo *res, struct ibv_pd *pd, struct ibv_qp_init_attr *qp_init_attr) { struct rdma_cm_id *cm_id; struct cma_id_private *id_priv; int ret; ret = rdma_create_id2(NULL, &cm_id, NULL, res->ai_port_space, res->ai_qp_type); if (ret) return ret; if (res->ai_flags & RAI_PASSIVE) { ret = ucma_passive_ep(cm_id, res, pd, qp_init_attr); if (ret) goto err; goto out; } if (af_ib_support) ret = rdma_resolve_addr2(cm_id, res->ai_src_addr, res->ai_src_len, res->ai_dst_addr, res->ai_dst_len, 2000); else ret = rdma_resolve_addr(cm_id, res->ai_src_addr, res->ai_dst_addr, 2000); if (ret) goto err; if (res->ai_route_len) { ret = rdma_set_option(cm_id, RDMA_OPTION_IB, RDMA_OPTION_IB_PATH, res->ai_route, res->ai_route_len); if (!ret) ret = ucma_complete(cm_id); } else { ret = rdma_resolve_route(cm_id, 2000); } if (ret) goto err; if (qp_init_attr) { qp_init_attr->qp_type = res->ai_qp_type; ret = rdma_create_qp(cm_id, pd, qp_init_attr); if (ret) goto err; } if (res->ai_connect_len) { id_priv = container_of(cm_id, struct cma_id_private, id); id_priv->connect = malloc(res->ai_connect_len); if (!id_priv->connect) { ret = ERR(ENOMEM); goto err; } memcpy(id_priv->connect, res->ai_connect, res->ai_connect_len); id_priv->connect_len = res->ai_connect_len; } out: *id = cm_id; return 0; err: rdma_destroy_ep(cm_id); return ret; } void rdma_destroy_ep(struct rdma_cm_id *id) { struct cma_id_private *id_priv; if (id->qp) rdma_destroy_qp(id); if (id->srq) rdma_destroy_srq(id); id_priv = container_of(id, struct cma_id_private, id); if (id_priv->qp_init_attr) free(id_priv->qp_init_attr); rdma_destroy_id(id); } int ucma_max_qpsize(struct rdma_cm_id *id) { struct cma_id_private *id_priv; struct cma_device *dev; int max_size = 0; id_priv = container_of(id, struct cma_id_private, id); if (id && id_priv->cma_dev) { max_size = id_priv->cma_dev->max_qpsize; } else { ucma_init_all(); pthread_mutex_lock(&mut); list_for_each(&cma_dev_list, dev, entry) if (!dev->is_device_dead && (!max_size || max_size > dev->max_qpsize)) max_size = dev->max_qpsize; pthread_mutex_unlock(&mut); } return max_size; } __be16 ucma_get_port(struct sockaddr *addr) { switch (addr->sa_family) { case AF_INET: return ((struct sockaddr_in *) addr)->sin_port; case AF_INET6: return ((struct sockaddr_in6 *) addr)->sin6_port; case AF_IB: return htobe16((uint16_t) be64toh(((struct sockaddr_ib *) addr)->sib_sid)); default: return 0; } } __be16 rdma_get_src_port(struct rdma_cm_id *id) { return ucma_get_port(&id->route.addr.src_addr); } __be16 rdma_get_dst_port(struct rdma_cm_id *id) { return ucma_get_port(&id->route.addr.dst_addr); } int rdma_set_local_ece(struct 
rdma_cm_id *id, struct ibv_ece *ece) { struct cma_id_private *id_priv; if (!id || id->qp || !ece || !ece->vendor_id || ece->comp_mask) return ERR(EINVAL); id_priv = container_of(id, struct cma_id_private, id); id_priv->local_ece.vendor_id = ece->vendor_id; id_priv->local_ece.options = ece->options; return 0; } int rdma_get_remote_ece(struct rdma_cm_id *id, struct ibv_ece *ece) { struct cma_id_private *id_priv; if (!id || id->qp || !ece) return ERR(EINVAL); id_priv = container_of(id, struct cma_id_private, id); ece->vendor_id = id_priv->remote_ece.vendor_id; ece->options = id_priv->remote_ece.options; ece->comp_mask = 0; return 0; } rdma-core-56.1/librdmacm/cma.h000066400000000000000000000062171477342711600162030ustar00rootroot00000000000000/* * Copyright (c) 2005-2014 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #if !defined(CMA_H) #define CMA_H #include #include #include #include #include #include #include #include #include #include #define PFX "librdmacm: " /* * Fast synchronization for low contention locking. 
*/ typedef struct { sem_t sem; _Atomic(int) cnt; } fastlock_t; static inline void fastlock_init(fastlock_t *lock) { sem_init(&lock->sem, 0, 0); atomic_store(&lock->cnt, 0); } static inline void fastlock_destroy(fastlock_t *lock) { sem_destroy(&lock->sem); } static inline void fastlock_acquire(fastlock_t *lock) { if (atomic_fetch_add(&lock->cnt, 1) > 0) sem_wait(&lock->sem); } static inline void fastlock_release(fastlock_t *lock) { if (atomic_fetch_sub(&lock->cnt, 1) > 1) sem_post(&lock->sem); } __be16 ucma_get_port(struct sockaddr *addr); int ucma_addrlen(struct sockaddr *addr); void ucma_set_sid(enum rdma_port_space ps, struct sockaddr *addr, struct sockaddr_ib *sib); int ucma_max_qpsize(struct rdma_cm_id *id); int ucma_complete(struct rdma_cm_id *id); int ucma_shutdown(struct rdma_cm_id *id); static inline int ERR(int err) { errno = err; return -1; } int ucma_init(void); extern int af_ib_support; #define RAI_ROUTEONLY 0x01000000 void ucma_ib_init(void); void ucma_ib_cleanup(void); void ucma_ib_resolve(struct rdma_addrinfo **rai, const struct rdma_addrinfo *hints); struct ib_connect_hdr { uint8_t cma_version; uint8_t ip_version; /* IP version: 7:4 */ uint16_t port; uint32_t src_addr[4]; uint32_t dst_addr[4]; #define cma_src_ip4 src_addr[3] #define cma_src_ip6 src_addr[0] #define cma_dst_ip4 dst_addr[3] #define cma_dst_ip6 dst_addr[0] }; #endif /* CMA_H */ rdma-core-56.1/librdmacm/docs/000077500000000000000000000000001477342711600162145ustar00rootroot00000000000000rdma-core-56.1/librdmacm/docs/rsocket000066400000000000000000000263371477342711600176240ustar00rootroot00000000000000.. Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md rsocket Protocol and Design Guide 11/11/2012 Data Streaming (TCP) Overview ----------------------------- Rsockets is a protocol over RDMA that supports a socket-level API for applications. For details on the current state of the implementation, readers should refer to the rsocket man page. This document describes the rsocket protocol, general design, and some implementation details. Rsockets exchanges data by performing RDMA write operations into exposed data buffers. In addition to RDMA write data, rsockets uses small, 32-bit messages for internal communication. RDMA writes are used to transfer application data into remote data buffers and to notify the peer when new target data buffers are available. The following figure highlights the operation. host A host B remote SGL target SGL <------------- [ ] [ ] ------ [ ] -- ------ receive buffer(s) -- -----> +--+ -- | | -- | | -- | | -- +--+ -- ---> +--+ | | | | +--+ The remote SGL contains the address, size, and rkey of the target SGL. As receive buffers become available on host B, rsockets will issue an RDMA write against one of the entries in the target SGL on host A. The updated entry will reference an available receive buffer. Immediate data included with the RDMA write will indicate to host A that a target SGE has been updated. When host A has data to send, it will check its target SGL. The current target SGE will contain the address, size, and rkey of the next receive buffer on host B. If the data transfer is smaller than the size of the remote receive buffer, host A will update its target SGE to reflect the remaining size of the receive buffer. That is, once a receive buffer has been published to a remote peer, it will be fully consumed before a second buffer is used. 
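To make the send-side bookkeeping concrete, here is a minimal sketch in C of how a sender could consume its current target SGE, assuming the rs_sge layout shown under Connection Establishment below and standard libibverbs calls. It is illustrative only, not the rsocket implementation: the function name, the signaling choice, and the unconditional host-to-network conversion of the immediate data (which the real protocol makes conditional on RS_CONN_FLAG_NET) are assumptions.

#include <stdint.h>
#include <endian.h>
#include <infiniband/verbs.h>

struct rs_sge {
	uint64_t addr;
	uint32_t key;
	uint32_t length;
};

/* Sketch: RDMA write one chunk into the peer's current receive buffer.
 * Returns the number of bytes posted, or -1 on error. */
static int rs_send_chunk(struct ibv_qp *qp, struct rs_sge *target,
			 void *buf, uint32_t lkey, uint32_t len)
{
	struct ibv_send_wr wr = {}, *bad_wr;
	struct ibv_sge sge;
	uint32_t xfer = len < target->length ? len : target->length;

	sge.addr = (uintptr_t) buf;
	sge.length = xfer;
	sge.lkey = lkey;

	wr.opcode = IBV_WR_RDMA_WRITE_WITH_IMM;
	wr.send_flags = IBV_SEND_SIGNALED;
	/* Data transfer message: type bits 31:29 are 000, so the
	 * immediate data is just the byte count (byte order assumed). */
	wr.imm_data = htobe32(xfer);
	wr.sg_list = &sge;
	wr.num_sge = 1;
	wr.wr.rdma.remote_addr = target->addr;
	wr.wr.rdma.rkey = target->key;

	if (ibv_post_send(qp, &wr, &bad_wr))
		return -1;

	/* A published receive buffer is fully consumed before the next
	 * target SGE is used, so track the space remaining in it. */
	target->addr += xfer;
	target->length -= xfer;
	return (int) xfer;
}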
Rsockets relies on immediate data to notify the remote peer when data has been transferred or when a target SGL has been updated. Because immediate data requires that the remote QP have a posted receive, rsockets also uses a credit based flow control mechanism. The number of credits is based on the size of the receive queue, with initial credits exchanged during connection setup. In order to transfer data, rsockets requires both available receive buffers (published via the target SGL) and data credits. Since immediate data is limited to 32-bits, messages may either indicate the arrival of application data or may be an internal message, but not both. To avoid credit deadlock, rsockets reserves a small number of available credits for control messages only, with the protocol relying on RNR NAKs and retries to make forward progress. Connection Establishment ------------------------ rsockets uses the RDMA CM for connection establishment. Struct rs_conn_data is exchanged during the connection exchange as private data in the request and reply messages. struct rs_sge { uint64_t addr; uint32_t key; uint32_t length; }; #define RS_CONN_FLAG_NET 1 struct rs_conn_data { uint8_t version; uint8_t flags; uint16_t credits; uint32_t reserved2; struct rs_sge target_sgl; struct rs_sge data_buf; }; Version - current version is 1 Flags RS_CONN_FLAG_NET - Set to 1 if host is big Endian. Determines byte ordering for RDMA write messages Credits - number of initial receive credits Reserved2 - set to 0 Target SGL - Address, size (# entries), and rkey of target SGL. Remote side will copy this into their remote SGL. Data Buffer - Initial receive buffer address, size (in bytes), and rkey. Remote side will copy this into their first target SGE. Message Format -------------- Rsocket uses RDMA writes with immediate data for all message exchanges. RDMA writes of 0 length are used if no additional data beyond the message needs to be exchanged. Immediate data is limited to 32-bits. Rsockets defines the following format for messages. The upper 3 bits are used to define the type of message being exchanged, with the meaning of the lower 29 bits determined by the upper bits. Bits Message Meaning of 31:29 Type Bits 28:0 000 Data Transfer bytes transferred 001 reserved 010 reserved - used internally, available for future use 011 reserved 100 Credit Update received credits granted 101 reserved 110 Iomap Updated index of updated entry 111 Control control message type Data Transfer Indicates that application data has been written into the next available receive buffer. The size of the transfer, in bytes, is carried in the lower bits of the message. Credit Update Used to indicate that additional receive buffers and credits are available. The number of available credits is carried in the lower bits of the message. A credit update message is also used to indicate that a target SGE has been updated, in which case the number of additional credits may be 0. The receiver of a credit update message must check for updates to the target SGL by inspecting the contents of the SGL. The rsocket implementation must take care not to modify a remote target SGL while it may be in use. This is done by tracking when a receive buffer referenced by a remote target SGL has been filled. Iomap Updated Used to indicate that a remote iomap entry was updated. The updated entry contains the offset value associated with an address, length, and rkey. 
Once an iomap has been updated, the local application can issue directed IO transfers against the corresponding remote buffer. Control Message - DISCONNECT Indicates that the rsocket connection has been fully disconnected and will no longer send or receive data. Data received before the disconnect message was processed may still be available for reading. Control Message - SHUTDOWN Indicates that the remote rsocket has shut down the send side of its connection. The recipient of a shutdown message will no longer accept incoming data, but may still transfer outbound data. Iomapped Buffers ---------------- Rsockets allows for zero-copy transfers using what it refers to as iomapped buffers. Iomapping and direct data placement (zero-copy) transfers are done using rsocket-specific extensions. The general operation is similar to that used for normal data transfers described above. host A host B remote iomap target iomap <----------- [ ] [ ] ------ [ ] -- ------ iomapped buffer(s) -- -----> +--+ -- | | -- | | -- | | -- +--+ -- ---> +--+ | | | | +--+ The remote iomap contains the address, size, and rkey of the target iomap. As the application on host B maps buffers to a given rsocket, rsockets will issue an RDMA write against one of the entries in the target iomap on host A. The updated entry will reference an available iomapped buffer. Immediate data included with the RDMA write will indicate to host A that a target iomap has been updated. When host A wishes to transfer directly into an iomapped buffer, it will check its target iomap for an offset corresponding to a remotely mapped buffer. A matching iomap entry will contain the address, size, and rkey of the target buffer on host B. Host A will then issue an RDMA operation against the registered remote data buffer. From host A's perspective, the transfer appears as a normal send/write operation, with the data stream redirected directly into the receiving application's buffer. Datagram Overview ----------------- The rsocket API supports datagram sockets. Datagram support is handled through an entirely different protocol and internal implementation. Unlike connected rsockets, datagram rsockets are not necessarily bound to a network (IP) address. A datagram socket may use any number of network (IP) addresses, including those which map to different RDMA devices. As a result, a single datagram rsocket must support using multiple RDMA devices and ports, and a datagram rsocket references a single UDP socket, plus zero or more UD QPs. Rsockets uses headers inserted before user data sent over UDP sockets to resolve remote UD QP numbers. When a user first attempts to send a datagram to a remote address (IP and UDP port), rsockets will take the following steps (sketched in code after this overview): 1. Store the destination address into a lookup table. 2. Resolve which local network address should be used when sending to the specified destination. 3. Allocate a UD QP on the RDMA device associated with the local address. 4. Send the user's datagram to the remote UDP socket. A header is inserted before the user's datagram. The header specifies the UD QP number associated with the local network address (IP and UDP port) of the send. A service thread is used to process messages received on the UDP socket. This thread updates the rsocket lookup tables with the remote QPN and path record data. The service thread forwards data received on the UDP socket to an rsocket QP. After the remote QPN and path records have been resolved, datagram communication between two nodes is done over the UD QP.
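The first-send slow path above can be sketched as follows. This is not the rsocket source: dest_map_insert(), resolve_local_addr(), and get_or_alloc_ud_qp() are hypothetical helpers standing in for steps 1-3; the RS_OP_DATA value, the byte ordering, and the qpn encoding (shifted past the reserved low 8 bits) are assumptions; and ds_udp_header is the structure defined in the next section.

#include <stdint.h>
#include <endian.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/socket.h>
#include <netinet/in.h>

#define DS_UDP_TAG 0x55555555

/* ds_udp_header as defined in the UDP Message Format section below. */
struct ds_udp_header {
	uint32_t tag;
	uint8_t version;
	uint8_t op;
	uint8_t length;
	uint8_t reserved;
	uint32_t qpn;		/* lower 8-bits reserved */
	union {
		uint32_t ipv4;
		uint8_t ipv6[16];
	} addr;
};

/* Hypothetical helpers for steps 1-3 of the overview above. */
int dest_map_insert(const struct sockaddr_in *dest);
int resolve_local_addr(const struct sockaddr_in *dest,
		       struct sockaddr_in *local);
uint32_t get_or_alloc_ud_qp(const struct sockaddr_in *local);

static ssize_t ds_first_send(int udp_sock, const void *buf, size_t len,
			     const struct sockaddr_in *dest)
{
	struct ds_udp_header hdr = {};
	struct iovec iov[2];
	struct msghdr msg = {};
	struct sockaddr_in local;

	if (dest_map_insert(dest))		/* step 1 */
		return -1;
	if (resolve_local_addr(dest, &local))	/* step 2 */
		return -1;

	hdr.tag = htobe32(DS_UDP_TAG);
	hdr.version = 4;
	hdr.op = 0;		/* RS_OP_DATA; actual value is an assumption */
	hdr.length = sizeof(hdr);
	/* step 3: QPN of the UD QP tied to the local address; the shift
	 * past the reserved low 8 bits is an assumed encoding. */
	hdr.qpn = htobe32(get_or_alloc_ud_qp(&local) << 8);
	hdr.addr.ipv4 = dest->sin_addr.s_addr;

	iov[0].iov_base = &hdr;			/* prepended header */
	iov[0].iov_len = sizeof(hdr);
	iov[1].iov_base = (void *) buf;		/* user's datagram */
	iov[1].iov_len = len;
	msg.msg_name = (void *) dest;
	msg.msg_namelen = sizeof(*dest);
	msg.msg_iov = iov;
	msg.msg_iovlen = 2;

	return sendmsg(udp_sock, &msg, 0);	/* step 4 */
}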
UDP Message Format ------------------ Rsockets uses messages exchanged over UDP sockets to resolve remote QP numbers. If a user sends a datagram to a remote service and the local rsocket is not yet configured to send directly to a remote UD QP, the user data is sent over a UDP socket with the following header inserted before the user data. struct ds_udp_header { uint32_t tag; uint8_t version; uint8_t op; uint8_t length; uint8_t reserved; uint32_t qpn; /* lower 8-bits reserved */ union { uint32_t ipv4; uint8_t ipv6[16]; } addr; }; Tag - Marker used to help identify that the UDP header is present. #define DS_UDP_TAG 0x55555555 Version - IP address version, either 4 or 6 Op - Indicates message type, used to control the receiver's operation. Valid operations are RS_OP_DATA and RS_OP_CTRL. Data messages carry user data, while control messages are used to reply with the local QP number. Length - Size of the UDP header. QPN - UD QP number associated with sender's IP address and port. The sender's address and port are extracted from the received UDP datagram. Addr - Target IP address of the sent datagram. Once the remote QP information has been resolved, data is sent directly between UD QPs. The following header is inserted before any user data that is transferred over a UD QP. struct ds_header { uint8_t version; uint8_t length; uint16_t port; union { uint32_t ipv4; struct { uint32_t flowinfo; uint8_t addr[16]; } ipv6; } addr; }; Version - IP address version Length - Size of the header Port - Associated source address UDP port Addr - Associated source IP address rdma-core-56.1/librdmacm/examples/000077500000000000000000000000001477342711600171025ustar00rootroot00000000000000rdma-core-56.1/librdmacm/examples/CMakeLists.txt000066400000000000000000000027101477342711600216420ustar00rootroot00000000000000# Shared example files add_library(rdmacm_tools STATIC common.c ) target_link_libraries(rdmacm_tools LINK_PRIVATE ${CMAKE_THREAD_LIBS_INIT}) rdma_executable(cmtime cmtime.c) target_link_libraries(cmtime LINK_PRIVATE rdmacm rdmacm_tools) rdma_executable(mckey mckey.c) target_link_libraries(mckey LINK_PRIVATE rdmacm ${CMAKE_THREAD_LIBS_INIT} rdmacm_tools) rdma_executable(rcopy rcopy.c) target_link_libraries(rcopy LINK_PRIVATE rdmacm rdmacm_tools) rdma_executable(rdma_client rdma_client.c) target_link_libraries(rdma_client LINK_PRIVATE rdmacm) rdma_executable(rdma_server rdma_server.c) target_link_libraries(rdma_server LINK_PRIVATE rdmacm) rdma_executable(rdma_xclient rdma_xclient.c) target_link_libraries(rdma_xclient LINK_PRIVATE rdmacm) rdma_executable(rdma_xserver rdma_xserver.c) target_link_libraries(rdma_xserver LINK_PRIVATE rdmacm) rdma_executable(riostream riostream.c) target_link_libraries(riostream LINK_PRIVATE rdmacm rdmacm_tools) rdma_executable(rping rping.c) target_link_libraries(rping LINK_PRIVATE rdmacm ${CMAKE_THREAD_LIBS_INIT} rdmacm_tools) rdma_executable(rstream rstream.c) target_link_libraries(rstream LINK_PRIVATE rdmacm rdmacm_tools) rdma_executable(ucmatose cmatose.c) target_link_libraries(ucmatose LINK_PRIVATE rdmacm rdmacm_tools) rdma_executable(udaddy udaddy.c) target_link_libraries(udaddy LINK_PRIVATE rdmacm rdmacm_tools) rdma_executable(udpong udpong.c) target_link_libraries(udpong LINK_PRIVATE rdmacm rdmacm_tools) rdma-core-56.1/librdmacm/examples/cmatose.c000066400000000000000000000372121477342711600207060ustar00rootroot00000000000000/* * Copyright (c) 2005-2006,2011-2012 Intel Corporation. All rights reserved.
* * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * * $Id$ */ #include #include #include #include #include #include #include #include #include #include "common.h" struct cmatest_node { int id; struct rdma_cm_id *cma_id; int connected; struct ibv_pd *pd; struct ibv_cq *cq[2]; struct ibv_mr *mr; void *mem; }; enum CQ_INDEX { SEND_CQ_INDEX, RECV_CQ_INDEX }; struct cmatest { struct rdma_event_channel *channel; struct cmatest_node *nodes; int conn_index; int connects_left; int disconnects_left; struct rdma_addrinfo *rai; }; static struct cmatest test; static int connections = 1; static int message_size = 100; static int message_count = 10; static const char *port = "7471"; static uint8_t set_tos = 0; static uint8_t tos; static uint8_t migrate = 0; static char *dst_addr; static char *src_addr; static struct rdma_addrinfo hints; static uint8_t set_timeout; static uint8_t timeout; static int create_message(struct cmatest_node *node) { if (!message_size) message_count = 0; if (!message_count) return 0; node->mem = malloc(message_size); if (!node->mem) { printf("failed message allocation\n"); return -1; } node->mr = ibv_reg_mr(node->pd, node->mem, message_size, IBV_ACCESS_LOCAL_WRITE); if (!node->mr) { printf("failed to reg MR\n"); goto err; } return 0; err: free(node->mem); return -1; } static int init_node(struct cmatest_node *node) { struct ibv_qp_init_attr init_qp_attr; int cqe, ret; node->pd = ibv_alloc_pd(node->cma_id->verbs); if (!node->pd) { ret = -ENOMEM; printf("cmatose: unable to allocate PD\n"); goto out; } cqe = message_count ? 
message_count : 1; node->cq[SEND_CQ_INDEX] = ibv_create_cq(node->cma_id->verbs, cqe, node, NULL, 0); node->cq[RECV_CQ_INDEX] = ibv_create_cq(node->cma_id->verbs, cqe, node, NULL, 0); if (!node->cq[SEND_CQ_INDEX] || !node->cq[RECV_CQ_INDEX]) { ret = -ENOMEM; printf("cmatose: unable to create CQ\n"); goto out; } memset(&init_qp_attr, 0, sizeof init_qp_attr); init_qp_attr.cap.max_send_wr = cqe; init_qp_attr.cap.max_recv_wr = cqe; init_qp_attr.cap.max_send_sge = 1; init_qp_attr.cap.max_recv_sge = 1; init_qp_attr.qp_context = node; init_qp_attr.sq_sig_all = 1; init_qp_attr.qp_type = IBV_QPT_RC; init_qp_attr.send_cq = node->cq[SEND_CQ_INDEX]; init_qp_attr.recv_cq = node->cq[RECV_CQ_INDEX]; ret = rdma_create_qp(node->cma_id, node->pd, &init_qp_attr); if (ret) { perror("cmatose: unable to create QP"); goto out; } ret = create_message(node); if (ret) { printf("cmatose: failed to create messages: %d\n", ret); goto out; } out: return ret; } static int post_recvs(struct cmatest_node *node) { struct ibv_recv_wr recv_wr, *recv_failure; struct ibv_sge sge; int i, ret = 0; if (!message_count) return 0; recv_wr.next = NULL; recv_wr.sg_list = &sge; recv_wr.num_sge = 1; recv_wr.wr_id = (uintptr_t) node; sge.length = message_size; sge.lkey = node->mr->lkey; sge.addr = (uintptr_t) node->mem; for (i = 0; i < message_count && !ret; i++ ) { ret = ibv_post_recv(node->cma_id->qp, &recv_wr, &recv_failure); if (ret) { printf("failed to post receives: %d\n", ret); break; } } return ret; } static int post_sends(struct cmatest_node *node) { struct ibv_send_wr send_wr, *bad_send_wr; struct ibv_sge sge; int i, ret = 0; if (!node->connected || !message_count) return 0; send_wr.next = NULL; send_wr.sg_list = &sge; send_wr.num_sge = 1; send_wr.opcode = IBV_WR_SEND; send_wr.send_flags = 0; send_wr.wr_id = (unsigned long)node; sge.length = message_size; sge.lkey = node->mr->lkey; sge.addr = (uintptr_t) node->mem; for (i = 0; i < message_count && !ret; i++) { ret = ibv_post_send(node->cma_id->qp, &send_wr, &bad_send_wr); if (ret) printf("failed to post sends: %d\n", ret); } return ret; } static void connect_error(void) { test.connects_left--; } static int addr_handler(struct cmatest_node *node) { int ret; if (set_tos) { ret = rdma_set_option(node->cma_id, RDMA_OPTION_ID, RDMA_OPTION_ID_TOS, &tos, sizeof tos); if (ret) perror("cmatose: set TOS option failed"); } if (set_timeout) { ret = rdma_set_option(node->cma_id, RDMA_OPTION_ID, RDMA_OPTION_ID_ACK_TIMEOUT, &timeout, sizeof(timeout)); if (ret) perror("cmatose: set ack timeout option failed"); } ret = rdma_resolve_route(node->cma_id, 2000); if (ret) { perror("cmatose: resolve route failed"); connect_error(); } return ret; } static int route_handler(struct cmatest_node *node) { struct rdma_conn_param conn_param; int ret; ret = init_node(node); if (ret) goto err; ret = post_recvs(node); if (ret) goto err; memset(&conn_param, 0, sizeof conn_param); conn_param.responder_resources = 1; conn_param.initiator_depth = 1; conn_param.retry_count = 5; conn_param.private_data = test.rai->ai_connect; conn_param.private_data_len = test.rai->ai_connect_len; ret = rdma_connect(node->cma_id, &conn_param); if (ret) { perror("cmatose: failure connecting"); goto err; } return 0; err: connect_error(); return ret; } static int connect_handler(struct rdma_cm_id *cma_id) { struct cmatest_node *node; int ret; if (test.conn_index == connections) { ret = -ENOMEM; goto err1; } node = &test.nodes[test.conn_index++]; node->cma_id = cma_id; cma_id->context = node; ret = init_node(node); if (ret) goto err2; if 
(set_timeout) { ret = rdma_set_option(node->cma_id, RDMA_OPTION_ID, RDMA_OPTION_ID_ACK_TIMEOUT, &timeout, sizeof(timeout)); if (ret) perror("cmatose: set ack timeout option failed"); } ret = post_recvs(node); if (ret) goto err2; ret = rdma_accept(node->cma_id, NULL); if (ret) { perror("cmatose: failure accepting"); goto err2; } return 0; err2: node->cma_id = NULL; connect_error(); err1: printf("cmatose: failing connection request\n"); rdma_reject(cma_id, NULL, 0); return ret; } static int cma_handler(struct rdma_cm_id *cma_id, struct rdma_cm_event *event) { int ret = 0; switch (event->event) { case RDMA_CM_EVENT_ADDR_RESOLVED: ret = addr_handler(cma_id->context); break; case RDMA_CM_EVENT_ROUTE_RESOLVED: ret = route_handler(cma_id->context); break; case RDMA_CM_EVENT_CONNECT_REQUEST: ret = connect_handler(cma_id); break; case RDMA_CM_EVENT_ESTABLISHED: ((struct cmatest_node *) cma_id->context)->connected = 1; test.connects_left--; test.disconnects_left++; break; case RDMA_CM_EVENT_ADDR_ERROR: case RDMA_CM_EVENT_ROUTE_ERROR: case RDMA_CM_EVENT_CONNECT_ERROR: case RDMA_CM_EVENT_UNREACHABLE: case RDMA_CM_EVENT_REJECTED: printf("cmatose: event: %s, error: %d\n", rdma_event_str(event->event), event->status); connect_error(); ret = event->status; break; case RDMA_CM_EVENT_DISCONNECTED: rdma_disconnect(cma_id); test.disconnects_left--; break; case RDMA_CM_EVENT_DEVICE_REMOVAL: /* Cleanup will occur after test completes. */ break; default: break; } return ret; } static void destroy_node(struct cmatest_node *node) { if (!node->cma_id) return; if (node->cma_id->qp) rdma_destroy_qp(node->cma_id); if (node->cq[SEND_CQ_INDEX]) ibv_destroy_cq(node->cq[SEND_CQ_INDEX]); if (node->cq[RECV_CQ_INDEX]) ibv_destroy_cq(node->cq[RECV_CQ_INDEX]); if (node->mem) { ibv_dereg_mr(node->mr); free(node->mem); } if (node->pd) ibv_dealloc_pd(node->pd); /* Destroy the RDMA ID after all device resources */ rdma_destroy_id(node->cma_id); } static int alloc_nodes(void) { int ret, i; test.nodes = malloc(sizeof *test.nodes * connections); if (!test.nodes) { printf("cmatose: unable to allocate memory for test nodes\n"); return -ENOMEM; } memset(test.nodes, 0, sizeof *test.nodes * connections); for (i = 0; i < connections; i++) { test.nodes[i].id = i; if (dst_addr) { ret = rdma_create_id(test.channel, &test.nodes[i].cma_id, &test.nodes[i], hints.ai_port_space); if (ret) goto err; } } return 0; err: while (--i >= 0) rdma_destroy_id(test.nodes[i].cma_id); free(test.nodes); return ret; } static void destroy_nodes(void) { int i; for (i = 0; i < connections; i++) destroy_node(&test.nodes[i]); free(test.nodes); } static int poll_cqs(enum CQ_INDEX index) { struct ibv_wc wc[8]; int done, i, ret; for (i = 0; i < connections; i++) { if (!test.nodes[i].connected) continue; for (done = 0; done < message_count; done += ret) { ret = ibv_poll_cq(test.nodes[i].cq[index], 8, wc); if (ret < 0) { printf("cmatose: failed polling CQ: %d\n", ret); return ret; } } } return 0; } static int connect_events(void) { struct rdma_cm_event *event; int ret = 0; while (test.connects_left && !ret) { ret = rdma_get_cm_event(test.channel, &event); if (!ret) { ret = cma_handler(event->id, event); rdma_ack_cm_event(event); } else { perror("cmatose: failure in rdma_get_cm_event in connect events"); ret = errno; } } return ret; } static int disconnect_events(void) { struct rdma_cm_event *event; int ret = 0; while (test.disconnects_left && !ret) { ret = rdma_get_cm_event(test.channel, &event); if (!ret) { ret = cma_handler(event->id, event); rdma_ack_cm_event(event); } 
else { perror("cmatose: failure in rdma_get_cm_event in disconnect events"); ret = errno; } } return ret; } static int migrate_channel(struct rdma_cm_id *listen_id) { struct rdma_event_channel *channel; int i, ret; printf("migrating to new event channel\n"); channel = create_event_channel(); if (!channel) return -1; ret = 0; if (listen_id) ret = rdma_migrate_id(listen_id, channel); for (i = 0; i < connections && !ret; i++) ret = rdma_migrate_id(test.nodes[i].cma_id, channel); if (!ret) { rdma_destroy_event_channel(test.channel); test.channel = channel; } else perror("cmatose: failure migrating to channel"); return ret; } static int run_server(void) { struct rdma_cm_id *listen_id; int i, ret; printf("cmatose: starting server\n"); ret = rdma_create_id(test.channel, &listen_id, &test, hints.ai_port_space); if (ret) { perror("cmatose: listen request failed"); return ret; } ret = get_rdma_addr(src_addr, dst_addr, port, &hints, &test.rai); if (ret) goto out; ret = rdma_bind_addr(listen_id, test.rai->ai_src_addr); if (ret) { perror("cmatose: bind address failed"); goto out; } ret = rdma_listen(listen_id, 0); if (ret) { perror("cmatose: failure trying to listen"); goto out; } ret = connect_events(); if (ret) goto out; if (message_count) { printf("initiating data transfers\n"); for (i = 0; i < connections; i++) { ret = post_sends(&test.nodes[i]); if (ret) goto out; } printf("completing sends\n"); ret = poll_cqs(SEND_CQ_INDEX); if (ret) goto out; printf("receiving data transfers\n"); ret = poll_cqs(RECV_CQ_INDEX); if (ret) goto out; printf("data transfers complete\n"); } if (migrate) { ret = migrate_channel(listen_id); if (ret) goto out; } printf("cmatose: disconnecting\n"); for (i = 0; i < connections; i++) { if (!test.nodes[i].connected) continue; test.nodes[i].connected = 0; rdma_disconnect(test.nodes[i].cma_id); } ret = disconnect_events(); printf("disconnected\n"); out: rdma_destroy_id(listen_id); return ret; } static int run_client(void) { int i, ret, ret2; printf("cmatose: starting client\n"); ret = get_rdma_addr(src_addr, dst_addr, port, &hints, &test.rai); if (ret) return ret; printf("cmatose: connecting\n"); for (i = 0; i < connections; i++) { ret = rdma_resolve_addr(test.nodes[i].cma_id, test.rai->ai_src_addr, test.rai->ai_dst_addr, 2000); if (ret) { perror("cmatose: failure getting addr"); connect_error(); return ret; } } ret = connect_events(); if (ret) goto disc; if (message_count) { printf("receiving data transfers\n"); ret = poll_cqs(RECV_CQ_INDEX); if (ret) goto disc; printf("sending replies\n"); for (i = 0; i < connections; i++) { ret = post_sends(&test.nodes[i]); if (ret) goto disc; } printf("data transfers complete\n"); } ret = 0; if (migrate) { ret = migrate_channel(NULL); if (ret) goto out; } disc: ret2 = disconnect_events(); if (ret2) ret = ret2; out: return ret; } int main(int argc, char **argv) { int op, ret; hints.ai_port_space = RDMA_PS_TCP; while ((op = getopt(argc, argv, "s:b:f:P:c:C:S:t:p:a:m")) != -1) { switch (op) { case 's': dst_addr = optarg; break; case 'b': src_addr = optarg; break; case 'f': if (!strncasecmp("ip", optarg, 2)) { hints.ai_flags = RAI_NUMERICHOST; } else if (!strncasecmp("gid", optarg, 3)) { hints.ai_flags = RAI_NUMERICHOST | RAI_FAMILY; hints.ai_family = AF_IB; } else if (strncasecmp("name", optarg, 4)) { fprintf(stderr, "Warning: unknown address format\n"); } break; case 'P': if (!strncasecmp("ib", optarg, 2)) { hints.ai_port_space = RDMA_PS_IB; } else if (strncasecmp("tcp", optarg, 3)) { fprintf(stderr, "Warning: unknown port space format\n"); } 
break; case 'c': connections = atoi(optarg); break; case 'C': message_count = atoi(optarg); break; case 'S': message_size = atoi(optarg); break; case 't': set_tos = 1; tos = (uint8_t) strtoul(optarg, NULL, 0); break; case 'p': port = optarg; break; case 'm': migrate = 1; break; case 'a': set_timeout = 1; timeout = (uint8_t) strtoul(optarg, NULL, 0); break; default: printf("usage: %s\n", argv[0]); printf("\t[-s server_address]\n"); printf("\t[-b bind_address]\n"); printf("\t[-f address_format]\n"); printf("\t name, ip, ipv6, or gid\n"); printf("\t[-P port_space]\n"); printf("\t tcp or ib\n"); printf("\t[-c connections]\n"); printf("\t[-C message_count]\n"); printf("\t[-S message_size]\n"); printf("\t[-t type_of_service]\n"); printf("\t[-p port_number]\n"); printf("\t[-m(igrate)]\n"); printf("\t[-a ack_timeout]\n"); exit(1); } } test.connects_left = connections; test.channel = create_event_channel(); if (!test.channel) { exit(1); } if (alloc_nodes()) exit(1); if (dst_addr) { ret = run_client(); } else { hints.ai_flags |= RAI_PASSIVE; ret = run_server(); } printf("test complete\n"); destroy_nodes(); rdma_destroy_event_channel(test.channel); rdma_freeaddrinfo(test.rai); printf("return status %d\n", ret); return ret; } rdma-core-56.1/librdmacm/examples/cmtime.c000066400000000000000000000544261477342711600205370ustar00rootroot00000000000000/* * Copyright (c) 2013 Intel Corporation. All rights reserved. * Copyright (c) Nvidia Corporation. All rights reserved. * * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE.
*/ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "common.h" static struct rdma_addrinfo hints, *rai; static struct addrinfo *ai; static struct rdma_event_channel *channel; static int oob_sock = -1; static const char *port = "7471"; static char *dst_addr; static char *src_addr; static int timeout = 2000; static int retries = 2; static uint32_t base_qpn = 1000; static _Atomic(uint32_t) cur_qpn; static uint32_t mimic_qp_delay; static bool mimic; enum step { STEP_FULL_CONNECT, STEP_CREATE_ID, STEP_BIND, STEP_RESOLVE_ADDR, STEP_RESOLVE_ROUTE, STEP_CREATE_QP, STEP_INIT_QP_ATTR, STEP_INIT_QP, STEP_RTR_QP_ATTR, STEP_RTR_QP, STEP_RTS_QP_ATTR, STEP_RTS_QP, STEP_CONNECT, STEP_ESTABLISH, STEP_DISCONNECT, STEP_DESTROY_ID, STEP_DESTROY_QP, STEP_CNT }; static const char *step_str[] = { "full connect", "create id", "bind addr", "resolve addr", "resolve route", "create qp", "init qp attr", "init qp", "rtr qp attr", "rtr qp", "rts qp attr", "rts qp", "cm connect", "establish", "disconnect", "destroy id", "destroy qp" }; struct node { struct work_item work; struct rdma_cm_id *id; int sock; struct ibv_qp *qp; enum ibv_qp_state next_qps; enum step next_step; uint64_t times[STEP_CNT][2]; int retries; }; static struct work_queue wq; static struct node *nodes; static int node_index; static uint64_t times[STEP_CNT][2]; static int connections; static int num_threads = 1; static _Atomic(int) disc_events; static _Atomic(int) completed[STEP_CNT]; static struct ibv_pd *pd; static struct ibv_cq *cq; #define start_perf(n, s) do { (n)->times[s][0] = gettime_us(); } while (0) #define end_perf(n, s) do { (n)->times[s][1] = gettime_us(); } while (0) #define start_time(s) do { times[s][0] = gettime_us(); } while (0) #define end_time(s) do { times[s][1] = gettime_us(); } while (0) static inline bool is_client(void) { return dst_addr != NULL; } static void show_perf(int iter) { uint32_t diff, max[STEP_CNT], min[STEP_CNT], sum[STEP_CNT]; int i, c; for (i = 0; i < STEP_CNT; i++) { sum[i] = 0; max[i] = 0; min[i] = UINT32_MAX; for (c = 0; c < iter; c++) { if (nodes[c].times[i][0] && nodes[c].times[i][1]) { diff = (uint32_t) (nodes[c].times[i][1] - nodes[c].times[i][0]); sum[i] += diff; if (diff > max[i]) max[i] = diff; if (diff < min[i]) min[i] = diff; } } /* Print 0 if we have no data */ if (min[i] == UINT32_MAX) min[i] = 0; } /* Reporting the 'sum' of the full connect is meaningless */ sum[STEP_FULL_CONNECT] = 0; if (atomic_load(&cur_qpn) == 0) printf("qp_conn %10d\n", iter); else printf("cm_conn %10d\n", iter); printf("threads %10d\n", num_threads); printf("step avg/iter total(us) us/conn sum(us) max(us) min(us)\n"); for (i = 0; i < STEP_CNT; i++) { diff = (uint32_t) (times[i][1] - times[i][0]); printf("%-13s %10u %10u %10u %10u %10d %10u\n", step_str[i], diff / iter, diff, sum[i] / iter, sum[i], max[i], min[i]); } } static void sock_listen(int *listen_sock, int backlog) { struct addrinfo aih = {}; int optval = 1; int ret; aih.ai_family = AF_INET; aih.ai_socktype = SOCK_STREAM; aih.ai_flags = AI_PASSIVE; ret = getaddrinfo(src_addr, port, &aih, &ai); if (ret) { perror("getaddrinfo"); exit(EXIT_FAILURE); } *listen_sock = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol); if (*listen_sock < 0) { perror("socket"); exit(EXIT_FAILURE); } ret = setsockopt(*listen_sock, SOL_SOCKET, SO_REUSEADDR, (char *) &optval, sizeof(optval)); if (ret) { perror("setsockopt"); exit(EXIT_FAILURE); } ret = bind(*listen_sock, 
ai->ai_addr, ai->ai_addrlen); if (ret) { perror("bind"); exit(EXIT_FAILURE); } ret = listen(*listen_sock, backlog); if (ret) { perror("listen"); exit(EXIT_FAILURE); } freeaddrinfo(ai); } static void sock_server(int iter) { int listen_sock, i; printf("Server baseline socket setup\n"); sock_listen(&listen_sock, iter); printf("Accept sockets\n"); for (i = 0; i < iter; i++) { nodes[i].sock = accept(listen_sock, NULL, NULL); if (nodes[i].sock < 0) { perror("accept"); exit(EXIT_FAILURE); } if (i == 0) start_time(STEP_FULL_CONNECT); } end_time(STEP_FULL_CONNECT); printf("Closing sockets\n"); start_time(STEP_DESTROY_ID); for (i = 0; i < iter; i++) close(nodes[i].sock); end_time(STEP_DESTROY_ID); close(listen_sock); printf("Server baseline socket results:\n"); show_perf(iter); } static void create_sock(struct work_item *item) { struct node *n = container_of(item, struct node, work); start_perf(n, STEP_CREATE_ID); n->sock = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol); if (n->sock < 0) { perror("socket"); exit(EXIT_FAILURE); } end_perf(n, STEP_CREATE_ID); atomic_fetch_add(&completed[STEP_CREATE_ID], 1); } static void connect_sock(struct work_item *item) { struct node *n = container_of(item, struct node, work); int ret; start_perf(n, STEP_CONNECT); ret = connect(n->sock, ai->ai_addr, ai->ai_addrlen); if (ret) { perror("connect"); exit(EXIT_FAILURE); } end_perf(n, STEP_CONNECT); atomic_fetch_add(&completed[STEP_CONNECT], 1); } static void sock_client(int iter) { int i, ret; printf("Client baseline socket setup\n"); ret = getaddrinfo(dst_addr, port, NULL, &ai); if (ret) { perror("getaddrinfo"); exit(EXIT_FAILURE); } start_time(STEP_FULL_CONNECT); printf("Creating sockets\n"); start_time(STEP_CREATE_ID); for (i = 0; i < iter; i++) wq_insert(&wq, &nodes[i].work, create_sock); while (atomic_load(&completed[STEP_CREATE_ID]) < iter) sched_yield(); end_time(STEP_CREATE_ID); printf("Connecting sockets\n"); start_time(STEP_CONNECT); for (i = 0; i < iter; i++) wq_insert(&wq, &nodes[i].work, connect_sock); while (atomic_load(&completed[STEP_CONNECT]) < iter) sched_yield(); end_time(STEP_CONNECT); end_time(STEP_FULL_CONNECT); printf("Closing sockets\n"); start_time(STEP_DESTROY_ID); for (i = 0; i < iter; i++) close(nodes[i].sock); end_time(STEP_DESTROY_ID); freeaddrinfo(ai); printf("Client baseline socket results:\n"); show_perf(iter); } static inline bool need_verbs(void) { return pd == NULL; } static void open_verbs(struct rdma_cm_id *id) { printf("\tAllocating verbs resources\n"); pd = ibv_alloc_pd(id->verbs); if (!pd) { perror("ibv_alloc_pd"); exit(EXIT_FAILURE); } cq = ibv_create_cq(id->verbs, 1, NULL, NULL, 0); if (!cq) { perror("ibv_create_cq"); exit(EXIT_FAILURE); } } static void create_qp(struct work_item *item) { struct node *n = container_of(item, struct node, work); struct ibv_qp_init_attr attr; if (need_verbs()) open_verbs(n->id); attr.qp_context = n; attr.send_cq = cq; attr.recv_cq = cq; attr.srq = NULL; attr.qp_type = IBV_QPT_RC; attr.sq_sig_all = 1; attr.cap.max_send_wr = 1; attr.cap.max_recv_wr = 1; attr.cap.max_send_sge = 1; attr.cap.max_recv_sge = 1; attr.cap.max_inline_data = 0; start_perf(n, STEP_CREATE_QP); if (atomic_load(&cur_qpn) == 0) { n->qp = ibv_create_qp(pd, &attr); if (!n->qp) { perror("ibv_create_qp"); exit(EXIT_FAILURE); } } else { sleep_us(mimic_qp_delay); } end_perf(n, STEP_CREATE_QP); atomic_fetch_add(&completed[STEP_CREATE_QP], 1); } static void modify_qp(struct node *n, enum ibv_qp_state state, enum step attr_step) { struct ibv_qp_attr attr; int mask, ret; 
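/*
 * The statements below perform one leg of the INIT -> RTR -> RTS
 * transition: rdma_init_qp_attr() fills in the attributes and mask bits
 * the CM requires for the requested state, and ibv_modify_qp() applies
 * them.  As a hedged, minimal sketch, a full manual transition for a
 * connected rdma_cm_id "id" with QP "qp" (illustrative names, error
 * handling omitted) looks like:
 *
 *	struct ibv_qp_attr a;
 *	int mask;
 *
 *	a.qp_state = IBV_QPS_INIT;
 *	rdma_init_qp_attr(id, &a, &mask);
 *	ibv_modify_qp(qp, &a, mask);
 *	a.qp_state = IBV_QPS_RTR;
 *	rdma_init_qp_attr(id, &a, &mask);
 *	ibv_modify_qp(qp, &a, mask);
 *	a.qp_state = IBV_QPS_RTS;
 *	rdma_init_qp_attr(id, &a, &mask);
 *	ibv_modify_qp(qp, &a, mask);
 */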
attr.qp_state = state; start_perf(n, attr_step); ret = rdma_init_qp_attr(n->id, &attr, &mask); if (ret) { perror("rdma_init_qp_attr"); exit(EXIT_FAILURE); } end_perf(n, attr_step++); start_perf(n, attr_step); if (n->qp) { ret = ibv_modify_qp(n->qp, &attr, mask); if (ret) { perror("ibv_modify_qp"); exit(EXIT_FAILURE); } } else { sleep_us(mimic_qp_delay); } end_perf(n, attr_step); atomic_fetch_add(&completed[attr_step], 1); } static void modify_qp_work(struct work_item *item) { struct node *n = container_of(item, struct node, work); modify_qp(n, n->next_qps, n->next_step); } static void init_conn_param(struct node *n, struct rdma_conn_param *param) { param->private_data = rai->ai_connect; param->private_data_len = rai->ai_connect_len; param->responder_resources = 1; param->initiator_depth = 1; param->flow_control = 0; param->retry_count = 0; param->rnr_retry_count = 0; param->srq = 0; param->qp_num = n->qp ? n->qp->qp_num : atomic_fetch_add(&cur_qpn, 1); } static void connect_qp(struct node *n) { struct rdma_conn_param conn_param; int ret; init_conn_param(n, &conn_param); start_perf(n, STEP_CONNECT); ret = rdma_connect(n->id, &conn_param); if (ret) { perror("rdma_connect"); exit(EXIT_FAILURE); } } static void resolve_addr(struct work_item *item) { struct node *n = container_of(item, struct node, work); int ret; n->retries = retries; start_perf(n, STEP_RESOLVE_ADDR); ret = rdma_resolve_addr(n->id, rai->ai_src_addr, rai->ai_dst_addr, timeout); if (ret) { perror("rdma_resolve_addr"); exit(EXIT_FAILURE); } } static void resolve_route(struct work_item *item) { struct node *n = container_of(item, struct node, work); int ret; n->retries = retries; start_perf(n, STEP_RESOLVE_ROUTE); ret = rdma_resolve_route(n->id, timeout); if (ret) { perror("rdma_resolve_route"); exit(EXIT_FAILURE); } } static void connect_response(struct work_item *item) { struct node *n = container_of(item, struct node, work); modify_qp(n, IBV_QPS_RTR, STEP_RTR_QP_ATTR); modify_qp(n, IBV_QPS_RTS, STEP_RTS_QP_ATTR); start_perf(n, STEP_ESTABLISH); rdma_establish(n->id); end_perf(n, STEP_ESTABLISH); end_perf(n, STEP_CONNECT); end_perf(n, STEP_FULL_CONNECT); atomic_fetch_add(&completed[STEP_CONNECT], 1); } static void req_handler(struct work_item *item) { struct node *n = container_of(item, struct node, work); struct rdma_conn_param conn_param; int ret; create_qp(&n->work); modify_qp(n, IBV_QPS_INIT, STEP_INIT_QP_ATTR); modify_qp(n, IBV_QPS_RTR, STEP_RTR_QP_ATTR); modify_qp(n, IBV_QPS_RTS, STEP_RTS_QP_ATTR); init_conn_param(n, &conn_param); ret = rdma_accept(n->id, &conn_param); if (ret) { perror("failure accepting"); exit(EXIT_FAILURE); } } static void client_disconnect(struct work_item *item) { struct node *n = container_of(item, struct node, work); start_perf(n, STEP_DISCONNECT); rdma_disconnect(n->id); end_perf(n, STEP_DISCONNECT); atomic_fetch_add(&completed[STEP_DISCONNECT], 1); } static void server_disconnect(struct work_item *item) { struct node *n = container_of(item, struct node, work); start_perf(n, STEP_DISCONNECT); rdma_disconnect(n->id); end_perf(n, STEP_DISCONNECT); if (atomic_load(&disc_events) >= connections) end_time(STEP_DISCONNECT); atomic_fetch_add(&completed[STEP_DISCONNECT], 1); } static void cma_handler(struct rdma_cm_id *id, struct rdma_cm_event *event) { struct node *n = id->context; switch (event->event) { case RDMA_CM_EVENT_ADDR_RESOLVED: end_perf(n, STEP_RESOLVE_ADDR); atomic_fetch_add(&completed[STEP_RESOLVE_ADDR], 1); break; case RDMA_CM_EVENT_ROUTE_RESOLVED: end_perf(n, STEP_RESOLVE_ROUTE); 
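/*
 * Note on the handshake handled by this event loop: because the QPs here
 * are created outside the CM (the rdma_cm_id has no attached QP), the
 * active side of rdma_connect() receives RDMA_CM_EVENT_CONNECT_RESPONSE
 * rather than RDMA_CM_EVENT_ESTABLISHED, and must move its QP to RTR and
 * RTS itself before calling rdma_establish() - see connect_response()
 * above.  A hedged sketch of that active-side sequence (error handling
 * omitted):
 *
 *	rdma_connect(id, &conn_param);	// conn_param.qp_num set manually
 *	// ...wait for RDMA_CM_EVENT_CONNECT_RESPONSE...
 *	// move the QP to RTR, then RTS, via rdma_init_qp_attr()
 *	rdma_establish(id);
 */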
atomic_fetch_add(&completed[STEP_RESOLVE_ROUTE], 1); break; case RDMA_CM_EVENT_CONNECT_REQUEST: if (node_index == 0) { printf("\tAccepting\n"); start_time(STEP_CONNECT); } n = &nodes[node_index++]; n->id = id; id->context = n; wq_insert(&wq, &n->work, req_handler); break; case RDMA_CM_EVENT_CONNECT_RESPONSE: wq_insert(&wq, &n->work, connect_response); break; case RDMA_CM_EVENT_ESTABLISHED: if (atomic_fetch_add(&completed[STEP_CONNECT], 1) >= connections - 1) end_time(STEP_CONNECT); break; case RDMA_CM_EVENT_ADDR_ERROR: if (n->retries--) { if (!rdma_resolve_addr(n->id, rai->ai_src_addr, rai->ai_dst_addr, timeout)) break; } printf("RDMA_CM_EVENT_ADDR_ERROR, error: %d\n", event->status); exit(EXIT_FAILURE); break; case RDMA_CM_EVENT_ROUTE_ERROR: if (n->retries--) { if (!rdma_resolve_route(n->id, timeout)) break; } printf("RDMA_CM_EVENT_ROUTE_ERROR, error: %d\n", event->status); exit(EXIT_FAILURE); break; case RDMA_CM_EVENT_CONNECT_ERROR: case RDMA_CM_EVENT_UNREACHABLE: case RDMA_CM_EVENT_REJECTED: printf("event: %s, error: %d\n", rdma_event_str(event->event), event->status); exit(EXIT_FAILURE); break; case RDMA_CM_EVENT_DISCONNECTED: if (is_client()) { /* To fix an issue where DREQs are not responded * to, the client completes its disconnect phase * as soon as it calls rdma_disconnect and does * not wait for a response from the server. The * OOB sync handles that coordination. */ end_perf(n, STEP_DISCONNECT); atomic_fetch_add(&completed[STEP_DISCONNECT], 1); } else { if (atomic_fetch_add(&disc_events, 1) == 0) { printf("\tDisconnecting\n"); start_time(STEP_DISCONNECT); } wq_insert(&wq, &n->work, server_disconnect); } break; case RDMA_CM_EVENT_TIMEWAIT_EXIT: break; default: printf("Unhandled event: %d (%s)\n", event->event, rdma_event_str(event->event)); exit(EXIT_FAILURE); break; } rdma_ack_cm_event(event); } static void create_ids(int iter) { int ret, i; printf("\tCreating IDs\n"); start_time(STEP_CREATE_ID); for (i = 0; i < iter; i++) { start_perf(&nodes[i], STEP_FULL_CONNECT); start_perf(&nodes[i], STEP_CREATE_ID); ret = rdma_create_id(channel, &nodes[i].id, &nodes[i], hints.ai_port_space); if (ret) { perror("rdma_create_id"); exit(EXIT_FAILURE); } end_perf(&nodes[i], STEP_CREATE_ID); } end_time(STEP_CREATE_ID); } static void destroy_ids(int iter) { int i; start_time(STEP_DESTROY_ID); for (i = 0; i < iter; i++) { start_perf(&nodes[i], STEP_DESTROY_ID); if (nodes[i].id) rdma_destroy_id(nodes[i].id); end_perf(&nodes[i], STEP_DESTROY_ID); } end_time(STEP_DESTROY_ID); } static void destroy_qps(int iter) { int i; start_time(STEP_DESTROY_QP); for (i = 0; i < iter; i++) { start_perf(&nodes[i], STEP_DESTROY_QP); if (nodes[i].qp) ibv_destroy_qp(nodes[i].qp); end_perf(&nodes[i], STEP_DESTROY_QP); } end_time(STEP_DESTROY_QP); } static void *process_events(void *arg) { struct rdma_cm_event *event; int ret; while (1) { ret = rdma_get_cm_event(channel, &event); if (!ret) { cma_handler(event->id, event); } else { perror("rdma_get_cm_event"); exit(EXIT_FAILURE); } } return NULL; } static void server_listen(struct rdma_cm_id **listen_id) { int ret; ret = rdma_create_id(channel, listen_id, NULL, hints.ai_port_space); if (ret) { perror("rdma_create_id"); exit(EXIT_FAILURE); } ret = rdma_bind_addr(*listen_id, rai->ai_src_addr); if (ret) { perror("rdma_bind_addr"); exit(EXIT_FAILURE); } ret = rdma_listen(*listen_id, 0); if (ret) { perror("rdma_listen"); exit(EXIT_FAILURE); } } static void reset_test(int iter) { int i; node_index = 0; atomic_store(&disc_events, 0); connections = iter; memset(times, 0, sizeof
times); memset(nodes, 0, sizeof(*nodes) * iter); for (i = 0; i < STEP_CNT; i++) atomic_store(&completed[i], 0); if (is_client()) oob_sendrecv(oob_sock, 0); else oob_recvsend(oob_sock, 0); } static void server_connect(int iter) { reset_test(iter); while (atomic_load(&completed[STEP_CONNECT]) < iter) sched_yield(); oob_recvsend(oob_sock, STEP_CONNECT); while (atomic_load(&completed[STEP_DISCONNECT]) < iter) sched_yield(); oob_recvsend(oob_sock, STEP_DISCONNECT); destroy_qps(iter); destroy_ids(iter); } static void client_connect(int iter) { int i, ret; reset_test(iter); start_time(STEP_FULL_CONNECT); create_ids(iter); if (src_addr) { printf("\tBinding addresses\n"); start_time(STEP_BIND); for (i = 0; i < iter; i++) { start_perf(&nodes[i], STEP_BIND); ret = rdma_bind_addr(nodes[i].id, rai->ai_src_addr); if (ret) { perror("rdma_bind_addr"); exit(EXIT_FAILURE); } end_perf(&nodes[i], STEP_BIND); } end_time(STEP_BIND); } printf("\tResolving addresses\n"); start_time(STEP_RESOLVE_ADDR); for (i = 0; i < iter; i++) wq_insert(&wq, &nodes[i].work, resolve_addr); while (atomic_load(&completed[STEP_RESOLVE_ADDR]) < iter) sched_yield(); end_time(STEP_RESOLVE_ADDR); printf("\tResolving routes\n"); start_time(STEP_RESOLVE_ROUTE); for (i = 0; i < iter; i++) wq_insert(&wq, &nodes[i].work, resolve_route); while (atomic_load(&completed[STEP_RESOLVE_ROUTE]) < iter) sched_yield(); end_time(STEP_RESOLVE_ROUTE); printf("\tCreating QPs\n"); start_time(STEP_CREATE_QP); for (i = 0; i < iter; i++) wq_insert(&wq, &nodes[i].work, create_qp); while (atomic_load(&completed[STEP_CREATE_QP]) < iter) sched_yield(); end_time(STEP_CREATE_QP); printf("\tModify QPs to INIT\n"); start_time(STEP_INIT_QP); for (i = 0; i < iter; i++) { nodes[i].next_qps = IBV_QPS_INIT; nodes[i].next_step = STEP_INIT_QP_ATTR; wq_insert(&wq, &nodes[i].work, modify_qp_work); } while (atomic_load(&completed[STEP_INIT_QP]) < iter) sched_yield(); end_time(STEP_INIT_QP); printf("\tConnecting\n"); start_time(STEP_CONNECT); for (i = 0; i < iter; i++) connect_qp(&nodes[i]); while (atomic_load(&completed[STEP_CONNECT]) < iter) sched_yield(); end_time(STEP_CONNECT); end_time(STEP_FULL_CONNECT); oob_sendrecv(oob_sock, STEP_CONNECT); printf("\tDisconnecting\n"); start_time(STEP_DISCONNECT); for (i = 0; i < iter; i++) wq_insert(&wq, &nodes[i].work, client_disconnect); while (atomic_load(&completed[STEP_DISCONNECT]) < iter) sched_yield(); end_time(STEP_DISCONNECT); oob_sendrecv(oob_sock, STEP_DISCONNECT); /* Wait for event threads to exit before destroying resources */ printf("\tDestroying QPs\n"); destroy_qps(iter); printf("\tDestroying IDs\n"); destroy_ids(iter); } static void run_client(int iter) { int ret; ret = oob_client_setup(dst_addr, port, &oob_sock); if (ret) exit(EXIT_FAILURE); printf("Client warmup\n"); client_connect(1); if (!mimic) { printf("Connect (%d) QPs test\n", iter); } else { printf("Connect (%d) simulated QPs test (delay %d us)\n", iter, mimic_qp_delay); atomic_store(&cur_qpn, base_qpn); } client_connect(iter); show_perf(iter); printf("Connect (%d) test - no QPs\n", iter); atomic_store(&cur_qpn, base_qpn); mimic_qp_delay = 0; client_connect(iter); show_perf(iter); close(oob_sock); } static void run_server(int iter) { struct rdma_cm_id *listen_id; int ret; /* Make sure we're ready for RDMA prior to any OOB sync */ server_listen(&listen_id); ret = oob_server_setup(src_addr, port, &oob_sock); if (ret) exit(EXIT_FAILURE); printf("Server warmup\n"); server_connect(1); if (!mimic) { printf("Accept (%d) QPs test\n", iter); } else { printf("Accept (%d) 
simulated QPs test (delay %d us)\n", iter, mimic_qp_delay); atomic_store(&cur_qpn, base_qpn); } server_connect(iter); show_perf(iter); printf("Accept (%d) test - no QPs\n", iter); atomic_store(&cur_qpn, base_qpn); mimic_qp_delay = 0; server_connect(iter); show_perf(iter); close(oob_sock); rdma_destroy_id(listen_id); } int main(int argc, char **argv) { pthread_t event_thread; bool socktest = false; int iter = 100; int op, ret; hints.ai_port_space = RDMA_PS_TCP; hints.ai_qp_type = IBV_QPT_RC; while ((op = getopt(argc, argv, "s:b:c:m:n:p:q:r:St:")) != -1) { switch (op) { case 's': dst_addr = optarg; break; case 'b': src_addr = optarg; break; case 'c': iter = atoi(optarg); break; case 'p': port = optarg; break; case 'q': base_qpn = (uint32_t) atoi(optarg); break; case 'm': mimic_qp_delay = (uint32_t) atoi(optarg); mimic = true; break; case 'n': num_threads = (uint32_t) atoi(optarg); break; case 'r': retries = atoi(optarg); break; case 'S': socktest = true; atomic_store(&cur_qpn, 1); break; case 't': timeout = atoi(optarg); break; default: printf("usage: %s\n", argv[0]); printf("\t[-S] (run socket baseline test)\n"); printf("\t[-s server_address]\n"); printf("\t[-b bind_address]\n"); printf("\t[-c connections]\n"); printf("\t[-p port_number]\n"); printf("\t[-q base_qpn]\n"); printf("\t[-m mimic_qp_delay_us]\n"); printf("\t[-n num_threads]\n"); printf("\t[-r retries]\n"); printf("\t[-t timeout_ms]\n"); exit(EXIT_FAILURE); } } if (!is_client()) hints.ai_flags |= RAI_PASSIVE; ret = get_rdma_addr(src_addr, dst_addr, port, &hints, &rai); if (ret) { perror("get_rdma_addr"); exit(EXIT_FAILURE); } channel = create_event_channel(); if (!channel) { perror("create_event_channel"); exit(EXIT_FAILURE); } ret = pthread_create(&event_thread, NULL, process_events, NULL); if (ret) { perror("pthread_create"); exit(EXIT_FAILURE); } nodes = calloc(iter, sizeof *nodes); if (!nodes) { perror("calloc"); exit(EXIT_FAILURE); } ret = wq_init(&wq, num_threads); if (ret) goto free; if (is_client()) { if (socktest) sock_client(iter); else run_client(iter); } else { if (socktest) sock_server(iter); else run_server(iter); } wq_cleanup(&wq); free: free(nodes); rdma_destroy_event_channel(channel); rdma_freeaddrinfo(rai); return 0; } rdma-core-56.1/librdmacm/examples/common.c000066400000000000000000000210361477342711600205400ustar00rootroot00000000000000/* * Copyright (c) 2005-2006,2012 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * * $Id$ */ #include #include #include #include #include #include #include #include #include #include #include #include #include "common.h" int use_rs = 1; int get_rdma_addr(const char *src, const char *dst, const char *port, struct rdma_addrinfo *hints, struct rdma_addrinfo **rai) { struct rdma_addrinfo rai_hints, *res; int ret; if (hints->ai_flags & RAI_PASSIVE) { ret = rdma_getaddrinfo(src, port, hints, rai); goto out; } rai_hints = *hints; if (src) { rai_hints.ai_flags |= RAI_PASSIVE; ret = rdma_getaddrinfo(src, NULL, &rai_hints, &res); if (ret) goto out; rai_hints.ai_src_addr = res->ai_src_addr; rai_hints.ai_src_len = res->ai_src_len; rai_hints.ai_flags &= ~RAI_PASSIVE; } ret = rdma_getaddrinfo(dst, port, &rai_hints, rai); if (src) rdma_freeaddrinfo(res); out: if (ret) printf("rdma_getaddrinfo error: %s\n", gai_strerror(ret)); return ret; } void size_str(char *str, size_t ssize, long long size) { long long base, fraction = 0; char mag; if (size >= (1 << 30)) { base = 1 << 30; mag = 'g'; } else if (size >= (1 << 20)) { base = 1 << 20; mag = 'm'; } else if (size >= (1 << 10)) { base = 1 << 10; mag = 'k'; } else { base = 1; mag = '\0'; } if (size / base < 10) fraction = (size % base) * 10 / base; if (fraction) { snprintf(str, ssize, "%lld.%lld%c", size / base, fraction, mag); } else { snprintf(str, ssize, "%lld%c", size / base, mag); } } void cnt_str(char *str, size_t ssize, long long cnt) { if (cnt >= 1000000000) snprintf(str, ssize, "%lldb", cnt / 1000000000); else if (cnt >= 1000000) snprintf(str, ssize, "%lldm", cnt / 1000000); else if (cnt >= 1000) snprintf(str, ssize, "%lldk", cnt / 1000); else snprintf(str, ssize, "%lld", cnt); } int size_to_count(int size) { if (size >= (1 << 20)) return 100; else if (size >= (1 << 16)) return 1000; else if (size >= (1 << 10)) return 10000; else return 100000; } void format_buf(void *buf, int size) { uint8_t *array = buf; static uint8_t data; int i; for (i = 0; i < size; i++) array[i] = data++; } int verify_buf(void *buf, int size) { static long long total_bytes; uint8_t *array = buf; static uint8_t data; int i; for (i = 0; i < size; i++, total_bytes++) { if (array[i] != data++) { printf("data verification failed byte %lld\n", total_bytes); return -1; } } return 0; } int do_poll(struct pollfd *fds, int timeout) { int ret; do { ret = rs_poll(fds, 1, timeout); } while (!ret); return ret == 1 ? 
(fds->revents & (POLLERR | POLLHUP)) : ret; } struct rdma_event_channel *create_event_channel(void) { struct rdma_event_channel *channel; channel = rdma_create_event_channel(); if (!channel) { if (errno == ENODEV) fprintf(stderr, "No RDMA devices were detected\n"); else perror("failed to create RDMA CM event channel"); } return channel; } int oob_server_setup(const char *src_addr, const char *port, int *sock) { struct addrinfo hint = {}, *ai; int listen_sock; int optval = 1; int ret; hint.ai_flags = AI_PASSIVE; hint.ai_family = AF_INET; hint.ai_socktype = SOCK_STREAM; ret = getaddrinfo(src_addr, port, &hint, &ai); if (ret) { printf("getaddrinfo error: %s\n", gai_strerror(ret)); return ret; } listen_sock = socket(ai->ai_family, ai->ai_socktype, 0); if (listen_sock == -1) { ret = -errno; goto free; } setsockopt(listen_sock, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval)); ret = bind(listen_sock, ai->ai_addr, ai->ai_addrlen); if (ret) { ret = -errno; goto close; } ret = listen(listen_sock, 1); if (ret) { ret = -errno; goto close; } *sock = accept(listen_sock, NULL, NULL); if (*sock == -1) ret = -errno; setsockopt(*sock, IPPROTO_TCP, TCP_NODELAY, &optval, sizeof(optval)); close: close(listen_sock); free: freeaddrinfo(ai); return ret; } int oob_client_setup(const char *dst_addr, const char *port, int *sock) { struct addrinfo hint = {}, *ai; int nodelay = 1; int ret; hint.ai_family = AF_INET; hint.ai_socktype = SOCK_STREAM; ret = getaddrinfo(dst_addr, port, &hint, &ai); if (ret) { printf("getaddrinfo error: %s\n", gai_strerror(ret)); return ret; } *sock = socket(ai->ai_family, ai->ai_socktype, 0); if (*sock == -1) { ret = -errno; goto out; } setsockopt(*sock, IPPROTO_TCP, TCP_NODELAY, &nodelay, sizeof(nodelay)); ret = connect(*sock, ai->ai_addr, ai->ai_addrlen); out: freeaddrinfo(ai); return ret; } int oob_sendrecv(int sock, char val) { char c = val; ssize_t ret; ret = send(sock, (void *) &c, sizeof(c), 0); if (ret != sizeof(c)) return -errno; ret = recv(sock, (void *) &c, sizeof(c), 0); if (ret != sizeof(c)) return -errno; if (c != val) return -EINVAL; return 0; } int oob_recvsend(int sock, char val) { char c = 0; ssize_t ret; ret = recv(sock, (void *) &c, sizeof(c), 0); if (ret != sizeof(c)) return -errno; if (c != val) return -EINVAL; ret = send(sock, (void *) &c, sizeof(c), 0); if (ret != sizeof(c)) return -errno; return 0; } static void *wq_handler(void *arg); int wq_init(struct work_queue *wq, int thread_cnt) { int ret, i; wq->head = NULL; wq->tail = NULL; ret = pthread_mutex_init(&wq->lock, NULL); if (ret) { perror("pthread_mutex_init"); return ret; } ret = pthread_cond_init(&wq->cond, NULL); if (ret) { perror("pthread_cond_init"); return ret; } wq->thread_cnt = thread_cnt; wq->thread = calloc(thread_cnt, sizeof(*wq->thread)); if (!wq->thread) return -ENOMEM; wq->running = true; for (i = 0; i < thread_cnt; i++) { ret = pthread_create(&wq->thread[i], NULL, wq_handler, wq); if (ret) { perror("pthread_create"); return ret; } } return 0; } void wq_cleanup(struct work_queue *wq) { int i; pthread_mutex_lock(&wq->lock); wq->running = false; pthread_cond_broadcast(&wq->cond); pthread_mutex_unlock(&wq->lock); for (i = 0; i < wq->thread_cnt; i++) pthread_join(wq->thread[i], NULL); pthread_cond_destroy(&wq->cond); pthread_mutex_destroy(&wq->lock); } void wq_insert(struct work_queue *wq, struct work_item *item, void (*work_handler)(struct work_item *item)) { bool empty; item->next = NULL; item->work_handler = work_handler; pthread_mutex_lock(&wq->lock); if (wq->head) { wq->tail->next = item; empty 
= false; } else { wq->head = item; empty = true; } wq->tail = item; pthread_mutex_unlock(&wq->lock); if (empty) pthread_cond_signal(&wq->cond); } struct work_item *wq_remove(struct work_queue *wq) { struct work_item *item; item = wq->head; wq->head = wq->head->next; item->next = NULL; return item; } static void *wq_handler(void *arg) { struct work_queue *wq = arg; struct work_item *item; pthread_mutex_lock(&wq->lock); while (wq->running) { while (!wq->head) { pthread_cond_wait(&wq->cond, &wq->lock); if (!wq->running) goto out; } item = wq_remove(wq); if (wq->head) pthread_cond_signal(&wq->cond); pthread_mutex_unlock(&wq->lock); item->work_handler(item); pthread_mutex_lock(&wq->lock); } out: if (wq->head) pthread_cond_signal(&wq->cond); pthread_mutex_unlock(&wq->lock); return NULL; } rdma-core-56.1/librdmacm/examples/common.h000066400000000000000000000115001477342711600205400ustar00rootroot00000000000000/* * Copyright (c) 2005-2012 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * * $Id$ */ #include #include #include #include #include #include #include #include #include #include /* Defined in common.c; used in all rsocket demos to determine whether to use * rsocket calls or standard socket calls. */ extern int use_rs; static inline int rs_socket(int f, int t, int p) { int fd; if (!use_rs) return socket(f, t, p); fd = rsocket(f, t, p); if (fd < 0) { if (t == SOCK_STREAM && errno == ENODEV) fprintf(stderr, "No RDMA devices were detected\n"); else perror("rsocket failed"); } return fd; } #define rs_bind(s,a,l) use_rs ? rbind(s,a,l) : bind(s,a,l) #define rs_listen(s,b) use_rs ? rlisten(s,b) : listen(s,b) #define rs_connect(s,a,l) use_rs ? rconnect(s,a,l) : connect(s,a,l) #define rs_accept(s,a,l) use_rs ? raccept(s,a,l) : accept(s,a,l) #define rs_shutdown(s,h) use_rs ? rshutdown(s,h) : shutdown(s,h) #define rs_close(s) use_rs ? rclose(s) : close(s) #define rs_recv(s,b,l,f) use_rs ? rrecv(s,b,l,f) : recv(s,b,l,f) #define rs_send(s,b,l,f) use_rs ? rsend(s,b,l,f) : send(s,b,l,f) #define rs_recvfrom(s,b,l,f,a,al) \ use_rs ? rrecvfrom(s,b,l,f,a,al) : recvfrom(s,b,l,f,a,al) #define rs_sendto(s,b,l,f,a,al) \ use_rs ? rsendto(s,b,l,f,a,al) : sendto(s,b,l,f,a,al) #define rs_poll(f,n,t) use_rs ? rpoll(f,n,t) : poll(f,n,t) #define rs_fcntl(s,c,p) use_rs ? 
rfcntl(s,c,p) : fcntl(s,c,p) #define rs_setsockopt(s,l,n,v,ol) \ use_rs ? rsetsockopt(s,l,n,v,ol) : setsockopt(s,l,n,v,ol) #define rs_getsockopt(s,l,n,v,ol) \ use_rs ? rgetsockopt(s,l,n,v,ol) : getsockopt(s,l,n,v,ol) union socket_addr { struct sockaddr sa; struct sockaddr_in sin; struct sockaddr_in6 sin6; }; enum rs_optimization { opt_mixed, opt_latency, opt_bandwidth }; int get_rdma_addr(const char *src, const char *dst, const char *port, struct rdma_addrinfo *hints, struct rdma_addrinfo **rai); int oob_server_setup(const char *src_addr, const char *port, int *sock); int oob_client_setup(const char *dst_addr, const char *port, int *sock); int oob_sendrecv(int sock, char val); int oob_recvsend(int sock, char val); void size_str(char *str, size_t ssize, long long size); void cnt_str(char *str, size_t ssize, long long cnt); int size_to_count(int size); void format_buf(void *buf, int size); int verify_buf(void *buf, int size); int do_poll(struct pollfd *fds, int timeout); struct rdma_event_channel *create_event_channel(void); static inline uint64_t gettime_ns(void) { struct timespec now; clock_gettime(CLOCK_MONOTONIC, &now); return now.tv_sec * 1000000000 + now.tv_nsec; } static inline uint64_t gettime_us(void) { return gettime_ns() / 1000; } static inline int sleep_us(unsigned int time_us) { struct timespec spec; if (!time_us) return 0; spec.tv_sec = 0; spec.tv_nsec = time_us * 1000; return nanosleep(&spec, NULL); } struct work_item { struct work_item *next; void (*work_handler)(struct work_item *item); }; struct work_queue { pthread_mutex_t lock; pthread_cond_t cond; pthread_t *thread; int thread_cnt; bool running; struct work_item *head; struct work_item *tail; }; int wq_init(struct work_queue *wq, int thread_cnt); void wq_cleanup(struct work_queue *wq); void wq_insert(struct work_queue *wq, struct work_item *item, void (*work_handler)(struct work_item *item)); struct work_item *wq_remove(struct work_queue *wq); rdma-core-56.1/librdmacm/examples/mckey.c000066400000000000000000000354211477342711600203630ustar00rootroot00000000000000/* * Copyright (c) 2005-2007 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* * $Id$ */ #include #include #include #include #include #include #include #include #include #include #include #include #include "common.h" struct cmatest_node { int id; struct rdma_cm_id *cma_id; int connected; struct ibv_pd *pd; struct ibv_cq *cq; struct ibv_mr *mr; struct ibv_ah *ah; uint32_t remote_qpn; uint32_t remote_qkey; void *mem; }; struct cmatest { struct rdma_event_channel *channel; pthread_t cmathread; struct cmatest_node *nodes; int conn_index; int connects_left; struct sockaddr_storage dst_in; struct sockaddr *dst_addr; struct sockaddr_storage src_in; struct sockaddr *src_addr; }; static struct cmatest test; static int connections = 1; static int message_size = 100; static int message_count = 10; static int is_sender; static int send_only; static int loopback = 1; static int unmapped_addr; static char *dst_addr; static char *src_addr; static enum rdma_port_space port_space = RDMA_PS_UDP; static int create_message(struct cmatest_node *node) { if (!message_size) message_count = 0; if (!message_count) return 0; node->mem = malloc(message_size + sizeof(struct ibv_grh)); if (!node->mem) { printf("failed message allocation\n"); return -1; } node->mr = ibv_reg_mr(node->pd, node->mem, message_size + sizeof(struct ibv_grh), IBV_ACCESS_LOCAL_WRITE); if (!node->mr) { printf("failed to reg MR\n"); goto err; } return 0; err: free(node->mem); return -1; } static int verify_test_params(struct cmatest_node *node) { struct ibv_port_attr port_attr; int ret; ret = ibv_query_port(node->cma_id->verbs, node->cma_id->port_num, &port_attr); if (ret) return ret; if (message_count && message_size > (1 << (port_attr.active_mtu + 7))) { printf("mckey: message_size %d is larger than active mtu %d\n", message_size, 1 << (port_attr.active_mtu + 7)); return -EINVAL; } return 0; } static int init_node(struct cmatest_node *node) { struct ibv_qp_init_attr_ex init_qp_attr_ex; struct ibv_qp_init_attr init_qp_attr; int cqe, ret = 0; node->pd = ibv_alloc_pd(node->cma_id->verbs); if (!node->pd) { ret = -ENOMEM; printf("mckey: unable to allocate PD\n"); goto out; } cqe = message_count ? message_count * 2 : 2; node->cq = ibv_create_cq(node->cma_id->verbs, cqe, node, NULL, 0); if (!node->cq) { ret = -ENOMEM; printf("mckey: unable to create CQ\n"); goto out; } memset(&init_qp_attr, 0, sizeof init_qp_attr); init_qp_attr.cap.max_send_wr = message_count ? message_count : 1; init_qp_attr.cap.max_recv_wr = message_count ? message_count : 1; init_qp_attr.cap.max_send_sge = 1; init_qp_attr.cap.max_recv_sge = 1; init_qp_attr.qp_context = node; init_qp_attr.sq_sig_all = 0; init_qp_attr.qp_type = IBV_QPT_UD; init_qp_attr.send_cq = node->cq; init_qp_attr.recv_cq = node->cq; if (!loopback) { memset(&init_qp_attr_ex, 0, sizeof(init_qp_attr_ex)); init_qp_attr_ex.cap.max_send_wr = message_count ? message_count : 1; init_qp_attr_ex.cap.max_recv_wr = message_count ? 
message_count : 1; init_qp_attr_ex.cap.max_send_sge = 1; init_qp_attr_ex.cap.max_recv_sge = 1; init_qp_attr_ex.qp_context = node; init_qp_attr_ex.sq_sig_all = 0; init_qp_attr_ex.qp_type = IBV_QPT_UD; init_qp_attr_ex.send_cq = node->cq; init_qp_attr_ex.recv_cq = node->cq; init_qp_attr_ex.comp_mask = IBV_QP_INIT_ATTR_CREATE_FLAGS|IBV_QP_INIT_ATTR_PD; init_qp_attr_ex.pd = node->pd; init_qp_attr_ex.create_flags = IBV_QP_CREATE_BLOCK_SELF_MCAST_LB; ret = rdma_create_qp_ex(node->cma_id, &init_qp_attr_ex); } else { ret = rdma_create_qp(node->cma_id, node->pd, &init_qp_attr); } if (ret) { perror("mckey: unable to create QP"); goto out; } ret = create_message(node); if (ret) { printf("mckey: failed to create messages: %d\n", ret); goto out; } out: return ret; } static int post_recvs(struct cmatest_node *node) { struct ibv_recv_wr recv_wr, *recv_failure; struct ibv_sge sge; int i, ret = 0; if (!message_count) return 0; recv_wr.next = NULL; recv_wr.sg_list = &sge; recv_wr.num_sge = 1; recv_wr.wr_id = (uintptr_t) node; sge.length = message_size + sizeof(struct ibv_grh); sge.lkey = node->mr->lkey; sge.addr = (uintptr_t) node->mem; for (i = 0; i < message_count && !ret; i++ ) { ret = ibv_post_recv(node->cma_id->qp, &recv_wr, &recv_failure); if (ret) { printf("failed to post receives: %d\n", ret); break; } } return ret; } static int post_sends(struct cmatest_node *node, int signal_flag) { struct ibv_send_wr send_wr, *bad_send_wr; struct ibv_sge sge; int i, ret = 0; if (!node->connected || !message_count) return 0; send_wr.next = NULL; send_wr.sg_list = &sge; send_wr.num_sge = 1; send_wr.opcode = IBV_WR_SEND_WITH_IMM; send_wr.send_flags = signal_flag; send_wr.wr_id = (unsigned long)node; send_wr.imm_data = htobe32(node->cma_id->qp->qp_num); send_wr.wr.ud.ah = node->ah; send_wr.wr.ud.remote_qpn = node->remote_qpn; send_wr.wr.ud.remote_qkey = node->remote_qkey; sge.length = message_size; sge.lkey = node->mr->lkey; sge.addr = (uintptr_t) node->mem; for (i = 0; i < message_count && !ret; i++) { ret = ibv_post_send(node->cma_id->qp, &send_wr, &bad_send_wr); if (ret) printf("failed to post sends: %d\n", ret); } return ret; } static void connect_error(void) { test.connects_left--; } static int addr_handler(struct cmatest_node *node) { int ret; struct rdma_cm_join_mc_attr_ex mc_attr; ret = verify_test_params(node); if (ret) goto err; ret = init_node(node); if (ret) goto err; if (!is_sender) { ret = post_recvs(node); if (ret) goto err; } mc_attr.comp_mask = RDMA_CM_JOIN_MC_ATTR_ADDRESS | RDMA_CM_JOIN_MC_ATTR_JOIN_FLAGS; mc_attr.addr = test.dst_addr; mc_attr.join_flags = send_only ? 
RDMA_MC_JOIN_FLAG_SENDONLY_FULLMEMBER : RDMA_MC_JOIN_FLAG_FULLMEMBER; ret = rdma_join_multicast_ex(node->cma_id, &mc_attr, node); if (ret) { perror("mckey: failure joining"); goto err; } return 0; err: connect_error(); return ret; } static int join_handler(struct cmatest_node *node, struct rdma_ud_param *param) { char buf[40]; inet_ntop(AF_INET6, param->ah_attr.grh.dgid.raw, buf, 40); printf("mckey: joined dgid: %s mlid 0x%x sl %d\n", buf, param->ah_attr.dlid, param->ah_attr.sl); node->remote_qpn = param->qp_num; node->remote_qkey = param->qkey; node->ah = ibv_create_ah(node->pd, ¶m->ah_attr); if (!node->ah) { printf("mckey: failure creating address handle\n"); goto err; } node->connected = 1; test.connects_left--; return 0; err: connect_error(); return -1; } static int cma_handler(struct rdma_cm_id *cma_id, struct rdma_cm_event *event) { int ret = 0; switch (event->event) { case RDMA_CM_EVENT_ADDR_RESOLVED: ret = addr_handler(cma_id->context); break; case RDMA_CM_EVENT_MULTICAST_JOIN: ret = join_handler(cma_id->context, &event->param.ud); break; case RDMA_CM_EVENT_ADDR_ERROR: case RDMA_CM_EVENT_ROUTE_ERROR: case RDMA_CM_EVENT_MULTICAST_ERROR: printf("mckey: event: %s, error: %d\n", rdma_event_str(event->event), event->status); connect_error(); ret = event->status; break; case RDMA_CM_EVENT_DEVICE_REMOVAL: /* Cleanup will occur after test completes. */ break; default: break; } return ret; } static void *cma_thread(void *arg) { struct rdma_cm_event *event; int ret; while (1) { ret = rdma_get_cm_event(test.channel, &event); if (ret) { perror("rdma_get_cm_event"); break; } switch (event->event) { case RDMA_CM_EVENT_MULTICAST_ERROR: case RDMA_CM_EVENT_ADDR_CHANGE: printf("mckey: event: %s, status: %d\n", rdma_event_str(event->event), event->status); break; default: break; } rdma_ack_cm_event(event); } return NULL; } static void destroy_node(struct cmatest_node *node) { if (!node->cma_id) return; if (node->ah) ibv_destroy_ah(node->ah); if (node->cma_id->qp) rdma_destroy_qp(node->cma_id); if (node->cq) ibv_destroy_cq(node->cq); if (node->mem) { ibv_dereg_mr(node->mr); free(node->mem); } if (node->pd) ibv_dealloc_pd(node->pd); /* Destroy the RDMA ID after all device resources */ rdma_destroy_id(node->cma_id); } static int alloc_nodes(void) { int ret, i; test.nodes = malloc(sizeof *test.nodes * connections); if (!test.nodes) { printf("mckey: unable to allocate memory for test nodes\n"); return -ENOMEM; } memset(test.nodes, 0, sizeof *test.nodes * connections); for (i = 0; i < connections; i++) { test.nodes[i].id = i; ret = rdma_create_id(test.channel, &test.nodes[i].cma_id, &test.nodes[i], port_space); if (ret) goto err; } return 0; err: while (--i >= 0) rdma_destroy_id(test.nodes[i].cma_id); free(test.nodes); return ret; } static void destroy_nodes(void) { int i; for (i = 0; i < connections; i++) destroy_node(&test.nodes[i]); free(test.nodes); } static int poll_cqs(void) { struct ibv_wc wc[8]; int done, i, ret; for (i = 0; i < connections; i++) { if (!test.nodes[i].connected) continue; for (done = 0; done < message_count; done += ret) { ret = ibv_poll_cq(test.nodes[i].cq, 8, wc); if (ret < 0) { printf("mckey: failed polling CQ: %d\n", ret); return ret; } } } return 0; } static int connect_events(void) { struct rdma_cm_event *event; int ret = 0; while (test.connects_left && !ret) { ret = rdma_get_cm_event(test.channel, &event); if (!ret) { ret = cma_handler(event->id, event); rdma_ack_cm_event(event); } } return ret; } static int get_addr(char *dst, struct sockaddr *addr) { struct addrinfo *res; 
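/*
 * Hedged aside on join_handler() above: RDMA_CM_EVENT_MULTICAST_JOIN
 * delivers a struct rdma_ud_param whose ah_attr, qp_num and qkey describe
 * the group, and the handler turns that into an address handle for UD
 * sends.  A minimal sketch of the same pattern (error handling omitted;
 * "pd" is an illustrative protection domain):
 *
 *	struct rdma_ud_param *p = &event->param.ud;
 *	struct ibv_ah *ah = ibv_create_ah(pd, &p->ah_attr);
 *	// subsequent sends target p->qp_num / p->qkey via wr.wr.ud
 */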
int ret; ret = getaddrinfo(dst, NULL, NULL, &res); if (ret) { printf("getaddrinfo failed (%s) - invalid hostname or IP address\n", gai_strerror(ret)); return ret; } memcpy(addr, res->ai_addr, res->ai_addrlen); freeaddrinfo(res); return ret; } static int get_dst_addr(char *dst, struct sockaddr *addr) { struct sockaddr_ib *sib; if (!unmapped_addr) return get_addr(dst, addr); sib = (struct sockaddr_ib *) addr; memset(sib, 0, sizeof *sib); sib->sib_family = AF_IB; inet_pton(AF_INET6, dst, &sib->sib_addr); return 0; } static int run(void) { int i, ret; printf("mckey: starting %s\n", is_sender ? "client" : "server"); if (src_addr) { ret = get_addr(src_addr, (struct sockaddr *) &test.src_in); if (ret) return ret; } ret = get_dst_addr(dst_addr, (struct sockaddr *) &test.dst_in); if (ret) return ret; printf("mckey: joining\n"); for (i = 0; i < connections; i++) { if (src_addr) { ret = rdma_bind_addr(test.nodes[i].cma_id, test.src_addr); if (ret) { perror("mckey: addr bind failure"); connect_error(); return ret; } } if (unmapped_addr) ret = addr_handler(&test.nodes[i]); else ret = rdma_resolve_addr(test.nodes[i].cma_id, test.src_addr, test.dst_addr, 2000); if (ret) { perror("mckey: resolve addr failure"); connect_error(); return ret; } } ret = connect_events(); if (ret) goto out; pthread_create(&test.cmathread, NULL, cma_thread, NULL); /* * Pause to give SM chance to configure switches. We don't want to * handle reliability issue in this simple test program. */ sleep(3); if (message_count) { if (is_sender) { printf("initiating data transfers\n"); for (i = 0; i < connections; i++) { ret = post_sends(&test.nodes[i], 0); if (ret) goto out; } } else { printf("receiving data transfers\n"); ret = poll_cqs(); if (ret) goto out; } printf("data transfers complete\n"); } out: for (i = 0; i < connections; i++) { ret = rdma_leave_multicast(test.nodes[i].cma_id, test.dst_addr); if (ret) perror("mckey: failure leaving"); } return ret; } int main(int argc, char **argv) { int op, ret; while ((op = getopt(argc, argv, "m:M:sb:c:C:S:p:ol")) != -1) { switch (op) { case 'm': dst_addr = optarg; break; case 'M': unmapped_addr = 1; dst_addr = optarg; break; case 's': is_sender = 1; break; case 'b': src_addr = optarg; test.src_addr = (struct sockaddr *) &test.src_in; break; case 'c': connections = atoi(optarg); break; case 'C': message_count = atoi(optarg); break; case 'S': message_size = atoi(optarg); break; case 'p': port_space = strtol(optarg, NULL, 0); break; case 'o': send_only = 1; break; case 'l': loopback = 0; break; default: printf("usage: %s\n", argv[0]); printf("\t-m multicast_address\n"); printf("\t[-M unmapped_multicast_address]\n" "\t replaces -m and requires -b\n"); printf("\t[-s(ender)]\n"); printf("\t[-b bind_address]\n"); printf("\t[-c connections]\n"); printf("\t[-C message_count]\n"); printf("\t[-S message_size]\n"); printf("\t[-p port_space - %#x for UDP (default), " "%#x for IPOIB]\n", RDMA_PS_UDP, RDMA_PS_IPOIB); printf("\t[-o join as a send-only full-member]\n"); printf("\t[-l join without multicast loopback]\n"); exit(1); } } if (unmapped_addr && !src_addr) { printf("unmapped multicast address requires binding " "to source address\n"); exit(1); } test.dst_addr = (struct sockaddr *) &test.dst_in; test.connects_left = connections; test.channel = create_event_channel(); if (!test.channel) { exit(1); } if (alloc_nodes()) exit(1); ret = run(); printf("test complete\n"); destroy_nodes(); rdma_destroy_event_channel(test.channel); printf("return status %d\n", ret); return ret; } 
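/*
 * Hedged note on get_dst_addr() in mckey.c above: with -M the program
 * skips address mapping and builds a struct sockaddr_ib directly, treating
 * the argument as a literal IPv6-formatted MGID.  A minimal sketch of that
 * construction (error checks omitted; "ff0e::1" is just an example group):
 *
 *	struct sockaddr_ib sib = { .sib_family = AF_IB };
 *	inet_pton(AF_INET6, "ff0e::1", &sib.sib_addr);
 */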
rdma-core-56.1/librdmacm/examples/rcopy.c000066400000000000000000000264211477342711600204070ustar00rootroot00000000000000/* * Copyright (c) 2011 Intel Corporation. All rights reserved. * * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "common.h" union rsocket_address { struct sockaddr sa; struct sockaddr_in sin; struct sockaddr_in6 sin6; struct sockaddr_storage storage; }; static const char *port = "7427"; static char *dst_addr; static char *dst_file; static char *src_file; static struct timeval start, end; //static void buf[1024 * 1024]; static uint64_t bytes; static int fd; static void *file_addr; enum { CMD_NOOP, CMD_OPEN, CMD_CLOSE, CMD_WRITE, CMD_RESP = 0x80, }; /* TODO: handle byte swapping */ struct msg_hdr { uint8_t version; uint8_t command; uint16_t len; uint32_t data; uint64_t id; }; struct msg_open { struct msg_hdr hdr; char path[0]; }; struct msg_write { struct msg_hdr hdr; uint64_t size; }; static void show_perf(void) { float usec; usec = (end.tv_sec - start.tv_sec) * 1000000 + (end.tv_usec - start.tv_usec); printf("%lld bytes in %.2f seconds = %.2f Gb/sec\n", (long long) bytes, usec / 1000000., (bytes * 8) / (1000.
* usec)); } static char *_ntop(union rsocket_address *rsa) { static char addr[32]; switch (rsa->sa.sa_family) { case AF_INET: inet_ntop(AF_INET, &rsa->sin.sin_addr, addr, sizeof addr); break; case AF_INET6: inet_ntop(AF_INET6, &rsa->sin6.sin6_addr, addr, sizeof addr); break; default: addr[0] = '\0'; break; } return addr; } static size_t _recv(int rs, char *msg, size_t len) { size_t ret, offset; for (offset = 0; offset < len; offset += ret) { ret = rrecv(rs, msg + offset, len - offset, 0); if (ret <= 0) return ret; } return len; } static int msg_recv_hdr(int rs, struct msg_hdr *hdr) { int ret; ret = _recv(rs, (char *) hdr, sizeof *hdr); if (ret != sizeof *hdr) return -1; if (hdr->version || hdr->len < sizeof *hdr) { printf("invalid version %d or length %d\n", hdr->version, hdr->len); return -1; } return sizeof *hdr; } static int msg_get_resp(int rs, struct msg_hdr *msg, uint8_t cmd) { int ret; ret = msg_recv_hdr(rs, msg); if (ret != sizeof *msg) return ret; if ((msg->len != sizeof *msg) || (msg->command != (cmd | CMD_RESP))) { printf("invalid length %d or bad command response %x:%x\n", msg->len, msg->command, cmd | CMD_RESP); return -1; } return msg->data; } static void msg_send_resp(int rs, struct msg_hdr *msg, uint32_t status) { struct msg_hdr resp; resp.version = 0; resp.command = msg->command | CMD_RESP; resp.len = sizeof resp; resp.data = status; resp.id = msg->id; rsend(rs, (char *) &resp, sizeof resp, 0); } static int server_listen(void) { struct addrinfo hints, *res; int ret, rs; memset(&hints, 0, sizeof hints); hints.ai_flags = RAI_PASSIVE; ret = getaddrinfo(NULL, port, &hints, &res); if (ret) { printf("getaddrinfo failed: %s\n", gai_strerror(ret)); return ret; } rs = rs_socket(res->ai_family, res->ai_socktype, res->ai_protocol); if (rs < 0) { ret = rs; goto free; } ret = 1; ret = rsetsockopt(rs, SOL_SOCKET, SO_REUSEADDR, &ret, sizeof ret); if (ret) { perror("rsetsockopt failed"); goto close; } ret = rbind(rs, res->ai_addr, res->ai_addrlen); if (ret) { perror("rbind failed"); goto close; } ret = rlisten(rs, 1); if (ret) { perror("rlisten failed"); goto close; } ret = rs; goto free; close: rclose(rs); free: freeaddrinfo(res); return ret; } static int server_open(int rs, struct msg_hdr *msg) { char *path = NULL; int ret, len; printf("opening: "); fflush(NULL); if (file_addr || fd > 0) { printf("cannot open another file\n"); ret = EBUSY; goto out; } len = msg->len - sizeof *msg; path = malloc(len); if (!path) { printf("cannot allocate path name\n"); ret = ENOMEM; goto out; } ret = _recv(rs, path, len); if (ret != len) { printf("error receiving path\n"); goto out; } printf("%s, ", path); fflush(NULL); fd = open(path, O_RDWR | O_CREAT | O_TRUNC, msg->data); if (fd < 0) { printf("unable to open destination file\n"); ret = errno; goto out; } ret = 0; out: if (path) free(path); msg_send_resp(rs, msg, ret); return ret; } static void server_close(int rs, struct msg_hdr *msg) { printf("closing..."); fflush(NULL); msg_send_resp(rs, msg, 0); if (file_addr) { munmap(file_addr, bytes); file_addr = NULL; } if (fd > 0) { close(fd); fd = 0; } printf("done\n"); } static int server_write(int rs, struct msg_hdr *msg) { size_t len; int ret; printf("transferring"); fflush(NULL); if (fd <= 0) { printf("...file not opened\n"); ret = EINVAL; goto out; } if (msg->len != sizeof(struct msg_write)) { printf("...invalid message length %d\n", msg->len); ret = EINVAL; goto out; } ret = _recv(rs, (char *) &bytes, sizeof bytes); if (ret != sizeof bytes) goto out; ret = ftruncate(fd, bytes); if (ret) goto out; 
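/*
 * The receive path below avoids an intermediate copy buffer: the
 * destination file has just been grown to the full transfer size with
 * ftruncate(), it is then mapped writable, and _recv() loops over rrecv()
 * to land the payload directly in the mapping.  A hedged sketch of the
 * shape of that pattern (error handling omitted):
 *
 *	ftruncate(fd, bytes);
 *	void *dst = mmap(NULL, bytes, PROT_WRITE, MAP_SHARED, fd, 0);
 *	_recv(rs, dst, bytes);
 *	munmap(dst, bytes);
 */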
file_addr = mmap(NULL, bytes, PROT_WRITE, MAP_SHARED, fd, 0); if (file_addr == (void *) -1) { printf("...error mapping file\n"); ret = errno; goto out; } printf("...%lld bytes...", (long long) bytes); fflush(NULL); len = _recv(rs, file_addr, bytes); if (len != bytes) { printf("...error receiving data\n"); ret = (int) len; } out: msg_send_resp(rs, msg, ret); return ret; } static void server_process(int rs) { struct msg_hdr msg; int ret; do { ret = msg_recv_hdr(rs, &msg); if (ret != sizeof msg) break; switch (msg.command) { case CMD_OPEN: ret = server_open(rs, &msg); break; case CMD_CLOSE: server_close(rs, &msg); ret = 0; break; case CMD_WRITE: ret = server_write(rs, &msg); break; default: msg_send_resp(rs, &msg, EINVAL); ret = -1; break; } } while (!ret); } static int server_run(void) { int lrs, rs; union rsocket_address rsa; socklen_t len; lrs = server_listen(); if (lrs < 0) return lrs; while (1) { len = sizeof rsa; printf("waiting for connection..."); fflush(NULL); rs = raccept(lrs, &rsa.sa, &len); printf("client: %s\n", _ntop(&rsa)); server_process(rs); rshutdown(rs, SHUT_RDWR); rclose(rs); } return 0; } static int client_connect(void) { struct addrinfo *res; int ret, rs; ret = getaddrinfo(dst_addr, port, NULL, &res); if (ret) { printf("getaddrinfo failed: %s\n", gai_strerror(ret)); return ret; } rs = rs_socket(res->ai_family, res->ai_socktype, res->ai_protocol); if (rs < 0) { goto free; } ret = rconnect(rs, res->ai_addr, res->ai_addrlen); if (ret) { perror("rconnect failed\n"); rclose(rs); rs = ret; } free: freeaddrinfo(res); return rs; } static int client_open(int rs) { struct msg_open *msg; struct stat stats; uint32_t len; int ret; printf("opening..."); fflush(NULL); fd = open(src_file, O_RDONLY); if (fd < 0) return fd; ret = fstat(fd, &stats); if (ret < 0) goto err1; bytes = (uint64_t) stats.st_size; file_addr = mmap(NULL, bytes, PROT_READ, MAP_SHARED, fd, 0); if (file_addr == (void *) -1) { ret = errno; goto err1; } len = (((uint32_t) strlen(dst_file)) + 8) & 0xFFFFFFF8; msg = calloc(1, sizeof(*msg) + len); if (!msg) { ret = -1; goto err2; } msg->hdr.command = CMD_OPEN; msg->hdr.len = sizeof(*msg) + len; msg->hdr.data = (uint32_t) stats.st_mode; strcpy(msg->path, dst_file); ret = rsend(rs, msg, msg->hdr.len, 0); if (ret != msg->hdr.len) goto err3; ret = msg_get_resp(rs, &msg->hdr, CMD_OPEN); if (ret) goto err3; return 0; err3: free(msg); err2: munmap(file_addr, bytes); err1: close(fd); return ret; } static int client_start_write(int rs) { struct msg_write msg; int ret; printf("transferring"); fflush(NULL); memset(&msg, 0, sizeof msg); msg.hdr.command = CMD_WRITE; msg.hdr.len = sizeof(msg); msg.size = bytes; ret = rsend(rs, &msg, sizeof msg, 0); if (ret != msg.hdr.len) return ret; return 0; } static int client_close(int rs) { struct msg_hdr msg; int ret; printf("closing..."); fflush(NULL); memset(&msg, 0, sizeof msg); msg.command = CMD_CLOSE; msg.len = sizeof msg; ret = rsend(rs, (char *) &msg, msg.len, 0); if (ret != msg.len) goto out; ret = msg_get_resp(rs, &msg, CMD_CLOSE); if (ret) goto out; printf("done\n"); out: munmap(file_addr, bytes); close(fd); return ret; } static int client_run(void) { struct msg_hdr ack; int ret, rs; size_t len; rs = client_connect(); if (rs < 0) return rs; ret = client_open(rs); if (ret) goto shutdown; ret = client_start_write(rs); if (ret) goto close; printf("..."); fflush(NULL); gettimeofday(&start, NULL); len = rsend(rs, file_addr, bytes, 0); if (len == bytes) ret = msg_get_resp(rs, &ack, CMD_WRITE); else ret = (int) len; gettimeofday(&end, NULL); 
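/*
 * Worked example of the show_perf() arithmetic applied to this timing
 * pair: with usec = end - start in microseconds, the reported rate is
 * (bytes * 8) / (1000 * usec) Gb/sec.  For instance, 1 GiB
 * (1073741824 bytes) in 2.0 seconds (usec = 2000000) gives
 * 8589934592 / 2000000000, i.e. about 4.29 Gb/sec.
 */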
close: client_close(rs); shutdown: rshutdown(rs, SHUT_RDWR); rclose(rs); if (!ret) show_perf(); return ret; } static void show_usage(char *program) { printf("usage 1: %s [options]\n", program); printf("\t starts the server application\n"); printf("\t[-p port_number]\n"); printf("usage 2: %s source server[:destination] [options]\n", program); printf("\t source - file name and path\n"); printf("\t server - name or address\n"); printf("\t destination - file name and path\n"); printf("\t[-p port_number]\n"); exit(1); } static void server_opts(int argc, char **argv) { int op; while ((op = getopt(argc, argv, "p:")) != -1) { switch (op) { case 'p': port = optarg; break; default: show_usage(argv[0]); } } } static void client_opts(int argc, char **argv) { int op; if (argc < 3) show_usage(argv[0]); src_file = argv[1]; dst_addr = argv[2]; dst_file = strchr(dst_addr, ':'); if (dst_file) { *dst_file = '\0'; dst_file++; } if (!dst_file) dst_file = src_file; while ((op = getopt(argc, argv, "p:")) != -1) { switch (op) { case 'p': port = optarg; break; default: show_usage(argv[0]); } } } int main(int argc, char **argv) { int ret; if (argc == 1 || argv[1][0] == '-') { server_opts(argc, argv); ret = server_run(); } else { client_opts(argc, argv); ret = client_run(); } return ret; } rdma-core-56.1/librdmacm/examples/rdma_client.c000066400000000000000000000100041477342711600215240ustar00rootroot00000000000000/* * Copyright (c) 2010 Intel Corporation. All rights reserved. * * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE.
*/ #include #include #include #include #include #include #include #include static const char *server = "127.0.0.1"; static const char *port = "7471"; static struct rdma_cm_id *id; static struct ibv_mr *mr, *send_mr; static int send_flags; static uint8_t send_msg[16]; static uint8_t recv_msg[16]; static int run(void) { struct rdma_addrinfo hints, *res; struct ibv_qp_init_attr attr; struct ibv_wc wc; int ret; memset(&hints, 0, sizeof hints); hints.ai_port_space = RDMA_PS_TCP; ret = rdma_getaddrinfo(server, port, &hints, &res); if (ret) { printf("rdma_getaddrinfo: %s\n", gai_strerror(ret)); goto out; } memset(&attr, 0, sizeof attr); attr.cap.max_send_wr = attr.cap.max_recv_wr = 1; attr.cap.max_send_sge = attr.cap.max_recv_sge = 1; attr.cap.max_inline_data = 16; attr.qp_context = id; attr.sq_sig_all = 1; ret = rdma_create_ep(&id, res, NULL, &attr); // Check to see if we got inline data allowed or not if (attr.cap.max_inline_data >= 16) send_flags = IBV_SEND_INLINE; else printf("rdma_client: device doesn't support IBV_SEND_INLINE, " "using sge sends\n"); if (ret) { perror("rdma_create_ep"); goto out_free_addrinfo; } mr = rdma_reg_msgs(id, recv_msg, 16); if (!mr) { perror("rdma_reg_msgs for recv_msg"); ret = -1; goto out_destroy_ep; } if ((send_flags & IBV_SEND_INLINE) == 0) { send_mr = rdma_reg_msgs(id, send_msg, 16); if (!send_mr) { perror("rdma_reg_msgs for send_msg"); ret = -1; goto out_dereg_recv; } } ret = rdma_post_recv(id, NULL, recv_msg, 16, mr); if (ret) { perror("rdma_post_recv"); goto out_dereg_send; } ret = rdma_connect(id, NULL); if (ret) { perror("rdma_connect"); goto out_dereg_send; } ret = rdma_post_send(id, NULL, send_msg, 16, send_mr, send_flags); if (ret) { perror("rdma_post_send"); goto out_disconnect; } while ((ret = rdma_get_send_comp(id, &wc)) == 0); if (ret < 0) { perror("rdma_get_send_comp"); goto out_disconnect; } while ((ret = rdma_get_recv_comp(id, &wc)) == 0); if (ret < 0) perror("rdma_get_recv_comp"); else ret = 0; out_disconnect: rdma_disconnect(id); out_dereg_send: if ((send_flags & IBV_SEND_INLINE) == 0) rdma_dereg_mr(send_mr); out_dereg_recv: rdma_dereg_mr(mr); out_destroy_ep: rdma_destroy_ep(id); out_free_addrinfo: rdma_freeaddrinfo(res); out: return ret; } int main(int argc, char **argv) { int op, ret; while ((op = getopt(argc, argv, "s:p:")) != -1) { switch (op) { case 's': server = optarg; break; case 'p': port = optarg; break; default: printf("usage: %s\n", argv[0]); printf("\t[-s server_address]\n"); printf("\t[-p port_number]\n"); exit(1); } } printf("rdma_client: start\n"); ret = run(); printf("rdma_client: end %d\n", ret); return ret; } rdma-core-56.1/librdmacm/examples/rdma_server.c000066400000000000000000000110671477342711600215640ustar00rootroot00000000000000/* * Copyright (c) 2005-2009 Intel Corporation. All rights reserved. * * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include static const char *server = "0.0.0.0"; static const char *port = "7471"; static struct rdma_cm_id *listen_id, *id; static struct ibv_mr *mr, *send_mr; static int send_flags; static uint8_t send_msg[16]; static uint8_t recv_msg[16]; static int run(void) { struct rdma_addrinfo hints, *res; struct ibv_qp_init_attr init_attr; struct ibv_qp_attr qp_attr; struct ibv_wc wc; int ret; memset(&hints, 0, sizeof hints); hints.ai_flags = RAI_PASSIVE; hints.ai_port_space = RDMA_PS_TCP; ret = rdma_getaddrinfo(server, port, &hints, &res); if (ret) { printf("rdma_getaddrinfo: %s\n", gai_strerror(ret)); return ret; } memset(&init_attr, 0, sizeof init_attr); init_attr.cap.max_send_wr = init_attr.cap.max_recv_wr = 1; init_attr.cap.max_send_sge = init_attr.cap.max_recv_sge = 1; init_attr.cap.max_inline_data = 16; init_attr.sq_sig_all = 1; ret = rdma_create_ep(&listen_id, res, NULL, &init_attr); if (ret) { perror("rdma_create_ep"); goto out_free_addrinfo; } ret = rdma_listen(listen_id, 0); if (ret) { perror("rdma_listen"); goto out_destroy_listen_ep; } ret = rdma_get_request(listen_id, &id); if (ret) { perror("rdma_get_request"); goto out_destroy_listen_ep; } memset(&qp_attr, 0, sizeof qp_attr); memset(&init_attr, 0, sizeof init_attr); ret = ibv_query_qp(id->qp, &qp_attr, IBV_QP_CAP, &init_attr); if (ret) { perror("ibv_query_qp"); goto out_destroy_accept_ep; } if (init_attr.cap.max_inline_data >= 16) send_flags = IBV_SEND_INLINE; else printf("rdma_server: device doesn't support IBV_SEND_INLINE, " "using sge sends\n"); mr = rdma_reg_msgs(id, recv_msg, 16); if (!mr) { ret = -1; perror("rdma_reg_msgs for recv_msg"); goto out_destroy_accept_ep; } if ((send_flags & IBV_SEND_INLINE) == 0) { send_mr = rdma_reg_msgs(id, send_msg, 16); if (!send_mr) { ret = -1; perror("rdma_reg_msgs for send_msg"); goto out_dereg_recv; } } ret = rdma_post_recv(id, NULL, recv_msg, 16, mr); if (ret) { perror("rdma_post_recv"); goto out_dereg_send; } ret = rdma_accept(id, NULL); if (ret) { perror("rdma_accept"); goto out_dereg_send; } while ((ret = rdma_get_recv_comp(id, &wc)) == 0); if (ret < 0) { perror("rdma_get_recv_comp"); goto out_disconnect; } ret = rdma_post_send(id, NULL, send_msg, 16, send_mr, send_flags); if (ret) { perror("rdma_post_send"); goto out_disconnect; } while ((ret = rdma_get_send_comp(id, &wc)) == 0); if (ret < 0) perror("rdma_get_send_comp"); else ret = 0; out_disconnect: rdma_disconnect(id); out_dereg_send: if ((send_flags & IBV_SEND_INLINE) == 0) rdma_dereg_mr(send_mr); out_dereg_recv: rdma_dereg_mr(mr); out_destroy_accept_ep: rdma_destroy_ep(id); out_destroy_listen_ep: rdma_destroy_ep(listen_id); out_free_addrinfo: rdma_freeaddrinfo(res); return ret; } int main(int argc, char **argv) { int op, ret; while ((op = getopt(argc, argv, "s:p:")) != -1) { switch (op) { case 's': server = optarg; break; case 'p': port = optarg; break; default: printf("usage: %s\n", argv[0]); printf("\t[-s server_address]\n"); printf("\t[-p port_number]\n"); exit(1); } } printf("rdma_server:
start\n"); ret = run(); printf("rdma_server: end %d\n", ret); return ret; } rdma-core-56.1/librdmacm/examples/rdma_xclient.c000066400000000000000000000101741477342711600217220ustar00rootroot00000000000000/* * Copyright (c) 2010-2014 Intel Corporation. All rights reserved. * * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AWV * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include static const char *server = "127.0.0.1"; static char port[6] = "7471"; static struct rdma_cm_id *id; static struct ibv_mr *mr; static struct rdma_addrinfo hints; static uint8_t send_msg[16]; static uint32_t srqn; static int post_send(void) { struct ibv_send_wr wr, *bad; struct ibv_sge sge; int ret; sge.addr = (uint64_t) (uintptr_t) send_msg; sge.length = (uint32_t) sizeof send_msg; sge.lkey = 0; wr.wr_id = (uintptr_t) NULL; wr.next = NULL; wr.sg_list = &sge; wr.num_sge = 1; wr.opcode = IBV_WR_SEND; wr.send_flags = IBV_SEND_INLINE; if (hints.ai_qp_type == IBV_QPT_XRC_SEND) wr.qp_type.xrc.remote_srqn = srqn; ret = ibv_post_send(id->qp, &wr, &bad); if (ret) perror("rdma_post_send"); return ret; } static int test(void) { struct rdma_addrinfo *res; struct ibv_qp_init_attr attr; struct ibv_wc wc; int ret; ret = rdma_getaddrinfo(server, port, &hints, &res); if (ret) { printf("rdma_getaddrinfo: %s\n", gai_strerror(ret)); return ret; } memset(&attr, 0, sizeof attr); attr.cap.max_send_wr = 1; attr.cap.max_send_sge = 1; if (hints.ai_qp_type != IBV_QPT_XRC_SEND) { attr.cap.max_recv_wr = 1; attr.cap.max_recv_sge = 1; } attr.sq_sig_all = 1; ret = rdma_create_ep(&id, res, NULL, &attr); rdma_freeaddrinfo(res); if (ret) { perror("rdma_create_ep"); return ret; } mr = rdma_reg_msgs(id, send_msg, sizeof send_msg); if (!mr) { perror("rdma_reg_msgs"); return ret; } ret = rdma_connect(id, NULL); if (ret) { perror("rdma_connect"); return ret; } if (hints.ai_qp_type == IBV_QPT_XRC_SEND) srqn = be32toh(*(__be32 *) id->event->param.conn.private_data); ret = post_send(); if (ret) { perror("post_send"); return ret; } ret = rdma_get_send_comp(id, &wc); if (ret <= 0) { perror("rdma_get_recv_comp"); return ret; } rdma_disconnect(id); rdma_dereg_mr(mr); rdma_destroy_ep(id); return 0; } int main(int argc, char **argv) { int op, ret; hints.ai_port_space = RDMA_PS_TCP; hints.ai_qp_type = IBV_QPT_RC; while ((op = getopt(argc, argv, "s:p:c:")) != -1) { switch (op) { case 's': server = optarg; break; case 'p': strncpy(port, optarg, sizeof port - 1); break; case 'c': switch (tolower(optarg[0])) { case 'r': break; case 'x': 
hints.ai_port_space = RDMA_PS_IB; hints.ai_qp_type = IBV_QPT_XRC_SEND; break; default: goto err; } break; default: goto err; } } printf("%s: start\n", argv[0]); ret = test(); printf("%s: end %d\n", argv[0], ret); return ret; err: printf("usage: %s\n", argv[0]); printf("\t[-s server]\n"); printf("\t[-p port_number]\n"); printf("\t[-c communication type]\n"); printf("\t r - RC: reliable-connected (default)\n"); printf("\t x - XRC: extended-reliable-connected\n"); exit(1); } rdma-core-56.1/librdmacm/examples/rdma_xserver.c000066400000000000000000000104071477342711600217510ustar00rootroot00000000000000/* * Copyright (c) 2005-2014 Intel Corporation. All rights reserved. * * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include static const char *port = "7471"; static struct rdma_cm_id *listen_id, *id; static struct ibv_mr *mr; static struct rdma_addrinfo hints; static uint8_t recv_msg[16]; static __be32 srqn; static int create_srq(void) { struct ibv_srq_init_attr attr; int ret; uint32_t tmp_srqn; attr.attr.max_wr = 1; attr.attr.max_sge = 1; attr.attr.srq_limit = 0; attr.srq_context = id; ret = rdma_create_srq(id, NULL, &attr); if (ret) perror("rdma_create_srq:"); if (id->srq) { ibv_get_srq_num(id->srq, &tmp_srqn); srqn = htobe32(tmp_srqn); } return ret; } static int test(void) { struct rdma_addrinfo *res; struct ibv_qp_init_attr attr; struct rdma_conn_param param; struct ibv_wc wc; int ret; ret = rdma_getaddrinfo(NULL, port, &hints, &res); if (ret) { printf("rdma_getaddrinfo: %s\n", gai_strerror(ret)); return ret; } memset(&attr, 0, sizeof attr); attr.cap.max_recv_wr = 1; attr.cap.max_recv_sge = 1; if (hints.ai_qp_type != IBV_QPT_XRC_RECV) { attr.cap.max_send_wr = 1; attr.cap.max_send_sge = 1; } ret = rdma_create_ep(&listen_id, res, NULL, &attr); rdma_freeaddrinfo(res); if (ret) { perror("rdma_create_ep"); return ret; } ret = rdma_listen(listen_id, 0); if (ret) { perror("rdma_listen"); return ret; } ret = rdma_get_request(listen_id, &id); if (ret) { perror("rdma_get_request"); return ret; } if (hints.ai_qp_type == IBV_QPT_XRC_RECV) { ret = create_srq(); if (ret) return ret; } mr = rdma_reg_msgs(id, recv_msg, sizeof recv_msg); if (!mr) { perror("rdma_reg_msgs"); return ret; } ret = rdma_post_recv(id, NULL, recv_msg, sizeof recv_msg, mr); if (ret) { perror("rdma_post_recv"); return ret; } memset(&param, 0, sizeof param); param.private_data = &srqn; param.private_data_len = sizeof srqn; ret = rdma_accept(id, &param); if (ret) {
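/* Accept failed: the SRQ number staged in param.private_data never
 * reached the client, which would otherwise have used it as the
 * remote_srqn for its XRC sends. */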
perror("rdma_accept"); return ret; } ret = rdma_get_recv_comp(id, &wc); if (ret <= 0) { perror("rdma_get_recv_comp"); return ret; } rdma_disconnect(id); rdma_dereg_mr(mr); rdma_destroy_ep(id); rdma_destroy_ep(listen_id); return 0; } int main(int argc, char **argv) { int op, ret; hints.ai_flags = RAI_PASSIVE; hints.ai_port_space = RDMA_PS_TCP; hints.ai_qp_type = IBV_QPT_RC; while ((op = getopt(argc, argv, "p:c:")) != -1) { switch (op) { case 'p': port = optarg; break; case 'c': switch (tolower(optarg[0])) { case 'r': break; case 'x': hints.ai_port_space = RDMA_PS_IB; hints.ai_qp_type = IBV_QPT_XRC_RECV; break; default: goto err; } break; default: goto err; } } printf("%s: start\n", argv[0]); ret = test(); printf("%s: end %d\n", argv[0], ret); return ret; err: printf("usage: %s\n", argv[0]); printf("\t[-p port_number]\n"); printf("\t[-c communication type]\n"); printf("\t r - RC: reliable-connected (default)\n"); printf("\t x - XRC: extended-reliable-connected\n"); exit(1); } rdma-core-56.1/librdmacm/examples/riostream.c000066400000000000000000000345541477342711600212660ustar00rootroot00000000000000/* * Copyright (c) 2011-2012 Intel Corporation. All rights reserved. * Copyright (c) 2014 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AWV * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "common.h" struct test_size_param { int size; int option; }; static struct test_size_param test_size[] = { { 1 << 6, 0 }, { 1 << 7, 1 }, { (1 << 7) + (1 << 6), 1}, { 1 << 8, 1 }, { (1 << 8) + (1 << 7), 1}, { 1 << 9, 1 }, { (1 << 9) + (1 << 8), 1}, { 1 << 10, 1 }, { (1 << 10) + (1 << 9), 1}, { 1 << 11, 1 }, { (1 << 11) + (1 << 10), 1}, { 1 << 12, 0 }, { (1 << 12) + (1 << 11), 1}, { 1 << 13, 1 }, { (1 << 13) + (1 << 12), 1}, { 1 << 14, 1 }, { (1 << 14) + (1 << 13), 1}, { 1 << 15, 1 }, { (1 << 15) + (1 << 14), 1}, { 1 << 16, 0 }, { (1 << 16) + (1 << 15), 1}, { 1 << 17, 1 }, { (1 << 17) + (1 << 16), 1}, { 1 << 18, 1 }, { (1 << 18) + (1 << 17), 1}, { 1 << 19, 1 }, { (1 << 19) + (1 << 18), 1}, { 1 << 20, 0 }, { (1 << 20) + (1 << 19), 1}, { 1 << 21, 1 }, { (1 << 21) + (1 << 20), 1}, { 1 << 22, 1 }, { (1 << 22) + (1 << 21), 1}, }; #define TEST_CNT (sizeof test_size / sizeof test_size[0]) static int rs, lrs; static int use_async; static int use_rgai; static int verify; static int flags = MSG_DONTWAIT; static int poll_timeout = 0; static int custom; static enum rs_optimization optimization; static int size_option; static int iterations = 1; static int transfer_size = 1000; static int transfer_count = 1000; static int buffer_size, inline_size = 64; static char test_name[10] = "custom"; static const char *port = "7471"; static char *dst_addr; static char *src_addr; static struct timeval start, end; static void *buf; static volatile uint8_t *poll_byte; static struct rdma_addrinfo rai_hints; static struct addrinfo ai_hints; static void show_perf(void) { char str[32]; float usec; long long bytes; usec = (end.tv_sec - start.tv_sec) * 1000000 + (end.tv_usec - start.tv_usec); bytes = (long long) iterations * transfer_count * transfer_size * 2; /* name size transfers iterations bytes seconds Gb/sec usec/xfer */ printf("%-10s", test_name); size_str(str, sizeof str, transfer_size); printf("%-8s", str); cnt_str(str, sizeof str, transfer_count); printf("%-8s", str); cnt_str(str, sizeof str, iterations); printf("%-8s", str); size_str(str, sizeof str, bytes); printf("%-8s", str); printf("%8.2fs%10.2f%11.2f\n", usec / 1000000., (bytes * 8) / (1000. 
* usec), (usec / iterations) / (transfer_count * 2)); } static void init_latency_test(int size) { char sstr[5]; size_str(sstr, sizeof sstr, size); snprintf(test_name, sizeof test_name, "%s_lat", sstr); transfer_count = 1; transfer_size = size; iterations = size_to_count(transfer_size); } static void init_bandwidth_test(int size) { char sstr[5]; size_str(sstr, sizeof sstr, size); snprintf(test_name, sizeof test_name, "%s_bw", sstr); iterations = 1; transfer_size = size; transfer_count = size_to_count(transfer_size); } static int send_msg(int size) { struct pollfd fds; int offset, ret; if (use_async) { fds.fd = rs; fds.events = POLLOUT; } for (offset = 0; offset < size; ) { if (use_async) { ret = do_poll(&fds, poll_timeout); if (ret) return ret; } ret = rsend(rs, buf + offset, size - offset, flags); if (ret > 0) { offset += ret; } else if (errno != EWOULDBLOCK && errno != EAGAIN) { perror("rsend"); return ret; } } return 0; } static int send_xfer(int size) { struct pollfd fds; int offset, ret; if (use_async) { fds.fd = rs; fds.events = POLLOUT; } for (offset = 0; offset < size; ) { if (use_async) { ret = do_poll(&fds, poll_timeout); if (ret) return ret; } ret = riowrite(rs, buf + offset, size - offset, offset, flags); if (ret > 0) { offset += ret; } else if (errno != EWOULDBLOCK && errno != EAGAIN) { perror("riowrite"); return ret; } } return 0; } static int recv_msg(int size) { struct pollfd fds; int offset, ret; if (use_async) { fds.fd = rs; fds.events = POLLIN; } for (offset = 0; offset < size; ) { if (use_async) { ret = do_poll(&fds, poll_timeout); if (ret) return ret; } ret = rrecv(rs, buf + offset, size - offset, flags); if (ret > 0) { offset += ret; } else if (errno != EWOULDBLOCK && errno != EAGAIN) { perror("rrecv"); return ret; } } return 0; } static int recv_xfer(int size, uint8_t marker) { int ret; while (*poll_byte != marker) ; if (verify) { ret = verify_buf(buf, size - 1); if (ret) return ret; } return 0; } static int sync_test(void) { int ret; ret = dst_addr ? send_msg(16) : recv_msg(16); if (ret) return ret; return dst_addr ? 
recv_msg(16) : send_msg(16); } static int run_test(void) { int ret, i, t; off_t offset; uint8_t marker = 0; poll_byte = buf + transfer_size - 1; *poll_byte = -1; offset = riomap(rs, buf, transfer_size, PROT_WRITE, 0, 0); if (offset == -1) { perror("riomap"); ret = -1; goto out; } ret = sync_test(); if (ret) goto out; gettimeofday(&start, NULL); for (i = 0; i < iterations; i++) { if (dst_addr) { for (t = 0; t < transfer_count - 1; t++) { ret = send_xfer(transfer_size); if (ret) goto out; } *poll_byte = (uint8_t) marker++; if (verify) format_buf(buf, transfer_size - 1); ret = send_xfer(transfer_size); if (ret) goto out; ret = recv_xfer(transfer_size, marker++); } else { ret = recv_xfer(transfer_size, marker++); if (ret) goto out; for (t = 0; t < transfer_count - 1; t++) { ret = send_xfer(transfer_size); if (ret) goto out; } *poll_byte = (uint8_t) marker++; if (verify) format_buf(buf, transfer_size - 1); ret = send_xfer(transfer_size); } if (ret) goto out; } gettimeofday(&end, NULL); show_perf(); ret = riounmap(rs, buf, transfer_size); out: return ret; } static void set_options(int fd) { int val; if (buffer_size) { rsetsockopt(fd, SOL_SOCKET, SO_SNDBUF, (void *) &buffer_size, sizeof buffer_size); rsetsockopt(fd, SOL_SOCKET, SO_RCVBUF, (void *) &buffer_size, sizeof buffer_size); } else { val = 1 << 19; rsetsockopt(fd, SOL_SOCKET, SO_SNDBUF, (void *) &val, sizeof val); rsetsockopt(fd, SOL_SOCKET, SO_RCVBUF, (void *) &val, sizeof val); } val = 1; rsetsockopt(fd, IPPROTO_TCP, TCP_NODELAY, (void *) &val, sizeof(val)); rsetsockopt(fd, SOL_RDMA, RDMA_IOMAPSIZE, (void *) &val, sizeof val); if (flags & MSG_DONTWAIT) rfcntl(fd, F_SETFL, O_NONBLOCK); /* Inline size based on experimental data */ if (optimization == opt_latency) { rsetsockopt(fd, SOL_RDMA, RDMA_INLINE, &inline_size, sizeof inline_size); } else if (optimization == opt_bandwidth) { val = 0; rsetsockopt(fd, SOL_RDMA, RDMA_INLINE, &val, sizeof val); } } static int server_listen(void) { struct rdma_addrinfo *rai = NULL; struct addrinfo *ai; int val, ret; if (use_rgai) { rai_hints.ai_flags |= RAI_PASSIVE; ret = rdma_getaddrinfo(src_addr, port, &rai_hints, &rai); } else { ai_hints.ai_flags |= AI_PASSIVE; ret = getaddrinfo(src_addr, port, &ai_hints, &ai); } if (ret) { printf("getaddrinfo: %s\n", gai_strerror(ret)); return ret; } lrs = rai ? rs_socket(rai->ai_family, SOCK_STREAM, 0) : rs_socket(ai->ai_family, SOCK_STREAM, 0); if (lrs < 0) { ret = lrs; goto free; } val = 1; ret = rsetsockopt(lrs, SOL_SOCKET, SO_REUSEADDR, &val, sizeof val); if (ret) { perror("rsetsockopt SO_REUSEADDR"); goto close; } ret = rai ? rbind(lrs, rai->ai_src_addr, rai->ai_src_len) : rbind(lrs, ai->ai_addr, ai->ai_addrlen); if (ret) { perror("rbind"); goto close; } ret = rlisten(lrs, 1); if (ret) perror("rlisten"); close: if (ret) rclose(lrs); free: if (rai) rdma_freeaddrinfo(rai); else freeaddrinfo(ai); return ret; } static int server_connect(void) { struct pollfd fds; int ret = 0; set_options(lrs); do { if (use_async) { fds.fd = lrs; fds.events = POLLIN; ret = do_poll(&fds, poll_timeout); if (ret) { perror("rpoll"); return ret; } } rs = raccept(lrs, NULL, NULL); } while (rs < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)); if (rs < 0) { perror("raccept"); return rs; } set_options(rs); return ret; } static int client_connect(void) { struct rdma_addrinfo *rai = NULL; struct addrinfo *ai; struct pollfd fds; int ret, err; socklen_t len; ret = use_rgai ? 
rdma_getaddrinfo(dst_addr, port, &rai_hints, &rai) : getaddrinfo(dst_addr, port, &ai_hints, &ai); if (ret) { printf("getaddrinfo: %s\n", gai_strerror(ret)); return ret; } rs = rai ? rs_socket(rai->ai_family, SOCK_STREAM, 0) : rs_socket(ai->ai_family, SOCK_STREAM, 0); if (rs < 0) { ret = rs; goto free; } set_options(rs); /* TODO: bind client to src_addr */ ret = rai ? rconnect(rs, rai->ai_dst_addr, rai->ai_dst_len) : rconnect(rs, ai->ai_addr, ai->ai_addrlen); if (ret && (errno != EINPROGRESS)) { perror("rconnect"); goto close; } if (ret && (errno == EINPROGRESS)) { fds.fd = rs; fds.events = POLLOUT; ret = do_poll(&fds, poll_timeout); if (ret) { perror("rpoll"); goto close; } len = sizeof err; ret = rgetsockopt(rs, SOL_SOCKET, SO_ERROR, &err, &len); if (ret) goto close; if (err) { ret = -1; errno = err; perror("async rconnect"); } } close: if (ret) rclose(rs); free: if (rai) rdma_freeaddrinfo(rai); else freeaddrinfo(ai); return ret; } static int run(void) { int i, ret = 0; buf = malloc(!custom ? test_size[TEST_CNT - 1].size : transfer_size); if (!buf) { perror("malloc"); return -1; } if (!dst_addr) { ret = server_listen(); if (ret) goto free; } printf("%-10s%-8s%-8s%-8s%-8s%8s %10s%13s\n", "name", "bytes", "xfers", "iters", "total", "time", "Gb/sec", "usec/xfer"); if (!custom) { optimization = opt_latency; ret = dst_addr ? client_connect() : server_connect(); if (ret) goto free; for (i = 0; i < TEST_CNT; i++) { if (test_size[i].option > size_option) continue; init_latency_test(test_size[i].size); run_test(); } rshutdown(rs, SHUT_RDWR); rclose(rs); optimization = opt_bandwidth; ret = dst_addr ? client_connect() : server_connect(); if (ret) goto free; for (i = 0; i < TEST_CNT; i++) { if (test_size[i].option > size_option) continue; init_bandwidth_test(test_size[i].size); run_test(); } } else { ret = dst_addr ? 
client_connect() : server_connect(); if (ret) goto free; ret = run_test(); } rshutdown(rs, SHUT_RDWR); rclose(rs); free: free(buf); return ret; } static int set_test_opt(const char *arg) { if (strlen(arg) == 1) { switch (arg[0]) { case 'a': use_async = 1; break; case 'b': flags = (flags & ~MSG_DONTWAIT) | MSG_WAITALL; break; case 'n': flags |= MSG_DONTWAIT; break; case 'v': verify = 1; break; default: return -1; } } else { if (!strncasecmp("async", arg, 5)) { use_async = 1; } else if (!strncasecmp("block", arg, 5)) { flags = (flags & ~MSG_DONTWAIT) | MSG_WAITALL; } else if (!strncasecmp("nonblock", arg, 8)) { flags |= MSG_DONTWAIT; } else if (!strncasecmp("verify", arg, 6)) { verify = 1; } else { return -1; } } return 0; } int main(int argc, char **argv) { int op, ret; ai_hints.ai_socktype = SOCK_STREAM; rai_hints.ai_port_space = RDMA_PS_TCP; while ((op = getopt(argc, argv, "s:b:f:B:i:I:C:S:p:T:")) != -1) { switch (op) { case 's': dst_addr = optarg; break; case 'b': src_addr = optarg; break; case 'f': if (!strncasecmp("ip", optarg, 2)) { ai_hints.ai_flags = AI_NUMERICHOST; } else if (!strncasecmp("gid", optarg, 3)) { rai_hints.ai_flags = RAI_NUMERICHOST | RAI_FAMILY; rai_hints.ai_family = AF_IB; use_rgai = 1; } else { fprintf(stderr, "Warning: unknown address format\n"); } break; case 'B': buffer_size = atoi(optarg); break; case 'i': inline_size = atoi(optarg); break; case 'I': custom = 1; iterations = atoi(optarg); break; case 'C': custom = 1; transfer_count = atoi(optarg); break; case 'S': if (!strncasecmp("all", optarg, 3)) { size_option = 1; } else { custom = 1; transfer_size = atoi(optarg); } break; case 'p': port = optarg; break; case 'T': if (!set_test_opt(optarg)) break; /* invalid option - fall through */ SWITCH_FALLTHROUGH; default: printf("usage: %s\n", argv[0]); printf("\t[-s server_address]\n"); printf("\t[-b bind_address]\n"); printf("\t[-f address_format]\n"); printf("\t name, ip, ipv6, or gid\n"); printf("\t[-B buffer_size]\n"); printf("\t[-i inline_size]\n"); printf("\t[-I iterations]\n"); printf("\t[-C transfer_count]\n"); printf("\t[-S transfer_size or all]\n"); printf("\t[-p port_number]\n"); printf("\t[-T test_option]\n"); printf("\t a|async - asynchronous operation (use poll)\n"); printf("\t b|blocking - use blocking calls\n"); printf("\t n|nonblocking - use nonblocking calls\n"); printf("\t v|verify - verify data\n"); exit(1); } } if (!(flags & MSG_DONTWAIT)) poll_timeout = -1; ret = run(); return ret; } rdma-core-56.1/librdmacm/examples/rping.c000066400000000000000000000761541477342711600204020ustar00rootroot00000000000000/* * Copyright (c) 2005 Ammasso, Inc. All rights reserved. * Copyright (c) 2006 Open Grid Computing, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include "common.h" static int debug = 0; #define DEBUG_LOG if (debug) printf /* * rping "ping/pong" loop: * client sends source rkey/addr/len * server receives source rkey/add/len * server rdma reads "ping" data from source * server sends "go ahead" on rdma read completion * client sends sink rkey/addr/len * server receives sink rkey/addr/len * server rdma writes "pong" data to sink * server sends "go ahead" on rdma write completion * */ /* * These states are used to signal events between the completion handler * and the main client or server thread. * * Once CONNECTED, they cycle through RDMA_READ_ADV, RDMA_WRITE_ADV, * and RDMA_WRITE_COMPLETE for each ping. */ enum test_state { IDLE = 1, CONNECT_REQUEST, ADDR_RESOLVED, ROUTE_RESOLVED, CONNECTED, RDMA_READ_ADV, RDMA_READ_COMPLETE, RDMA_WRITE_ADV, RDMA_WRITE_COMPLETE, DISCONNECTED, ERROR }; struct rping_rdma_info { __be64 buf; __be32 rkey; __be32 size; }; /* * Default max buffer size for IO... */ #define RPING_BUFSIZE 64*1024 #define RPING_SQ_DEPTH 16 /* Default string for print data and * minimum buffer size */ #define _stringify( _x ) # _x #define stringify( _x ) _stringify( _x ) #define RPING_MSG_FMT "rdma-ping-%d: " #define RPING_MIN_BUFSIZE sizeof(stringify(INT_MAX)) + sizeof(RPING_MSG_FMT) /* * Control block struct. */ struct rping_cb { int server; /* 0 iff client */ pthread_t cqthread; pthread_t persistent_server_thread; struct ibv_comp_channel *channel; struct ibv_cq *cq; struct ibv_pd *pd; struct ibv_qp *qp; struct ibv_recv_wr rq_wr; /* recv work request record */ struct ibv_sge recv_sgl; /* recv single SGE */ struct rping_rdma_info recv_buf;/* malloc'd buffer */ struct ibv_mr *recv_mr; /* MR associated with this buffer */ struct ibv_send_wr sq_wr; /* send work request record */ struct ibv_sge send_sgl; struct rping_rdma_info send_buf;/* single send buf */ struct ibv_mr *send_mr; struct ibv_send_wr rdma_sq_wr; /* rdma work request record */ struct ibv_sge rdma_sgl; /* rdma single SGE */ char *rdma_buf; /* used as rdma sink */ struct ibv_mr *rdma_mr; uint32_t remote_rkey; /* remote guys RKEY */ uint64_t remote_addr; /* remote guys TO */ uint32_t remote_len; /* remote guys LEN */ char *start_buf; /* rdma read src */ struct ibv_mr *start_mr; enum test_state state; /* used for cond/signalling */ sem_t sem; sem_t accept_ready; /* Ready for another conn req */ struct sockaddr_storage sin; struct sockaddr_storage ssource; __be16 port; /* dst port in NBO */ int verbose; /* verbose logging */ int self_create_qp; /* Create QP not via cma */ int count; /* ping count */ int size; /* ping data size */ int validate; /* validate ping data */ /* CM stuff */ pthread_t cmthread; struct rdma_event_channel *cm_channel; struct rdma_cm_id *cm_id; /* connection on client side,*/ /* listener on service side. 
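 * On the server, each RDMA_CM_EVENT_CONNECT_REQUEST donates its cma_id
 * to child_cm_id below; the listening cm_id itself never owns a QP.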
*/ struct rdma_cm_id *child_cm_id; /* connection on server side */ }; static int rping_cma_event_handler(struct rdma_cm_id *cma_id, struct rdma_cm_event *event) { int ret = 0; struct rping_cb *cb = cma_id->context; DEBUG_LOG("cma_event type %s cma_id %p (%s)\n", rdma_event_str(event->event), cma_id, (cma_id == cb->cm_id) ? "parent" : "child"); switch (event->event) { case RDMA_CM_EVENT_ADDR_RESOLVED: cb->state = ADDR_RESOLVED; ret = rdma_resolve_route(cma_id, 2000); if (ret) { cb->state = ERROR; perror("rdma_resolve_route"); sem_post(&cb->sem); } break; case RDMA_CM_EVENT_ROUTE_RESOLVED: cb->state = ROUTE_RESOLVED; sem_post(&cb->sem); break; case RDMA_CM_EVENT_CONNECT_REQUEST: sem_wait(&cb->accept_ready); cb->state = CONNECT_REQUEST; cb->child_cm_id = cma_id; DEBUG_LOG("child cma %p\n", cb->child_cm_id); sem_post(&cb->sem); break; case RDMA_CM_EVENT_CONNECT_RESPONSE: DEBUG_LOG("CONNECT_RESPONSE\n"); cb->state = CONNECTED; sem_post(&cb->sem); break; case RDMA_CM_EVENT_ESTABLISHED: DEBUG_LOG("ESTABLISHED\n"); /* * Server will wake up when first RECV completes. */ if (!cb->server) { cb->state = CONNECTED; } sem_post(&cb->sem); break; case RDMA_CM_EVENT_ADDR_ERROR: case RDMA_CM_EVENT_ROUTE_ERROR: case RDMA_CM_EVENT_CONNECT_ERROR: case RDMA_CM_EVENT_UNREACHABLE: case RDMA_CM_EVENT_REJECTED: fprintf(stderr, "cma event %s, error %d\n", rdma_event_str(event->event), event->status); sem_post(&cb->sem); ret = -1; break; case RDMA_CM_EVENT_DISCONNECTED: fprintf(stderr, "%s DISCONNECT EVENT...\n", cb->server ? "server" : "client"); cb->state = DISCONNECTED; sem_post(&cb->sem); break; case RDMA_CM_EVENT_DEVICE_REMOVAL: fprintf(stderr, "cma detected device removal!!!!\n"); cb->state = ERROR; sem_post(&cb->sem); ret = -1; break; default: fprintf(stderr, "unhandled event: %s, ignoring\n", rdma_event_str(event->event)); break; } return ret; } static int server_recv(struct rping_cb *cb, struct ibv_wc *wc) { if (wc->byte_len != sizeof(cb->recv_buf)) { fprintf(stderr, "Received bogus data, size %d\n", wc->byte_len); return -1; } cb->remote_rkey = be32toh(cb->recv_buf.rkey); cb->remote_addr = be64toh(cb->recv_buf.buf); cb->remote_len = be32toh(cb->recv_buf.size); DEBUG_LOG("Received rkey %x addr %" PRIx64 " len %d from peer\n", cb->remote_rkey, cb->remote_addr, cb->remote_len); if (cb->state <= CONNECTED || cb->state == RDMA_WRITE_COMPLETE) cb->state = RDMA_READ_ADV; else cb->state = RDMA_WRITE_ADV; return 0; } static int client_recv(struct rping_cb *cb, struct ibv_wc *wc) { if (wc->byte_len != sizeof(cb->recv_buf)) { fprintf(stderr, "Received bogus data, size %d\n", wc->byte_len); return -1; } if (cb->state == RDMA_READ_ADV) cb->state = RDMA_WRITE_ADV; else cb->state = RDMA_WRITE_COMPLETE; return 0; } static int rping_cq_event_handler(struct rping_cb *cb) { struct ibv_wc wc; struct ibv_recv_wr *bad_wr; int ret; int flushed = 0; while ((ret = ibv_poll_cq(cb->cq, 1, &wc)) == 1) { ret = 0; if (wc.status) { if (wc.status == IBV_WC_WR_FLUSH_ERR) { flushed = 1; continue; } fprintf(stderr, "cq completion failed status %d\n", wc.status); ret = -1; goto error; } switch (wc.opcode) { case IBV_WC_SEND: DEBUG_LOG("send completion\n"); break; case IBV_WC_RDMA_WRITE: DEBUG_LOG("rdma write completion\n"); cb->state = RDMA_WRITE_COMPLETE; sem_post(&cb->sem); break; case IBV_WC_RDMA_READ: DEBUG_LOG("rdma read completion\n"); cb->state = RDMA_READ_COMPLETE; sem_post(&cb->sem); break; case IBV_WC_RECV: DEBUG_LOG("recv completion\n"); ret = cb->server ? 
server_recv(cb, &wc) : client_recv(cb, &wc); if (ret) { fprintf(stderr, "recv wc error: %d\n", ret); goto error; } ret = ibv_post_recv(cb->qp, &cb->rq_wr, &bad_wr); if (ret) { fprintf(stderr, "post recv error: %d\n", ret); goto error; } sem_post(&cb->sem); break; default: DEBUG_LOG("unknown!!!!! completion\n"); ret = -1; goto error; } } if (ret) { fprintf(stderr, "poll error %d\n", ret); goto error; } return flushed; error: cb->state = ERROR; sem_post(&cb->sem); return ret; } static void rping_init_conn_param(struct rping_cb *cb, struct rdma_conn_param *conn_param) { memset(conn_param, 0, sizeof(*conn_param)); conn_param->responder_resources = 1; conn_param->initiator_depth = 1; conn_param->retry_count = 7; conn_param->rnr_retry_count = 7; if (cb->self_create_qp) conn_param->qp_num = cb->qp->qp_num; } static int rping_self_modify_qp(struct rping_cb *cb, struct rdma_cm_id *id) { struct ibv_qp_attr qp_attr; int qp_attr_mask, ret; qp_attr.qp_state = IBV_QPS_INIT; ret = rdma_init_qp_attr(id, &qp_attr, &qp_attr_mask); if (ret) return ret; ret = ibv_modify_qp(cb->qp, &qp_attr, qp_attr_mask); if (ret) return ret; qp_attr.qp_state = IBV_QPS_RTR; ret = rdma_init_qp_attr(id, &qp_attr, &qp_attr_mask); if (ret) return ret; ret = ibv_modify_qp(cb->qp, &qp_attr, qp_attr_mask); if (ret) return ret; qp_attr.qp_state = IBV_QPS_RTS; ret = rdma_init_qp_attr(id, &qp_attr, &qp_attr_mask); if (ret) return ret; return ibv_modify_qp(cb->qp, &qp_attr, qp_attr_mask); } static int rping_accept(struct rping_cb *cb) { struct rdma_conn_param conn_param; int ret; DEBUG_LOG("accepting client connection request\n"); if (cb->self_create_qp) { ret = rping_self_modify_qp(cb, cb->child_cm_id); if (ret) return ret; rping_init_conn_param(cb, &conn_param); ret = rdma_accept(cb->child_cm_id, &conn_param); } else { ret = rdma_accept(cb->child_cm_id, NULL); } if (ret) { perror("rdma_accept"); return ret; } sem_wait(&cb->sem); if (cb->state == ERROR) { fprintf(stderr, "wait for CONNECTED state %d\n", cb->state); return -1; } return 0; } static int rping_disconnect(struct rping_cb *cb, struct rdma_cm_id *id) { struct ibv_qp_attr qp_attr = {}; int err = 0; if (cb->self_create_qp) { qp_attr.qp_state = IBV_QPS_ERR; err = ibv_modify_qp(cb->qp, &qp_attr, IBV_QP_STATE); if (err) return err; } return rdma_disconnect(id); } static void rping_setup_wr(struct rping_cb *cb) { cb->recv_sgl.addr = (uint64_t) (unsigned long) &cb->recv_buf; cb->recv_sgl.length = sizeof cb->recv_buf; cb->recv_sgl.lkey = cb->recv_mr->lkey; cb->rq_wr.sg_list = &cb->recv_sgl; cb->rq_wr.num_sge = 1; cb->send_sgl.addr = (uint64_t) (unsigned long) &cb->send_buf; cb->send_sgl.length = sizeof cb->send_buf; cb->send_sgl.lkey = cb->send_mr->lkey; cb->sq_wr.opcode = IBV_WR_SEND; cb->sq_wr.send_flags = IBV_SEND_SIGNALED; cb->sq_wr.sg_list = &cb->send_sgl; cb->sq_wr.num_sge = 1; cb->rdma_sgl.addr = (uint64_t) (unsigned long) cb->rdma_buf; cb->rdma_sgl.lkey = cb->rdma_mr->lkey; cb->rdma_sq_wr.send_flags = IBV_SEND_SIGNALED; cb->rdma_sq_wr.sg_list = &cb->rdma_sgl; cb->rdma_sq_wr.num_sge = 1; } static int rping_setup_buffers(struct rping_cb *cb) { int ret; DEBUG_LOG("rping_setup_buffers called on cb %p\n", cb); cb->recv_mr = ibv_reg_mr(cb->pd, &cb->recv_buf, sizeof cb->recv_buf, IBV_ACCESS_LOCAL_WRITE); if (!cb->recv_mr) { fprintf(stderr, "recv_buf reg_mr failed\n"); return errno; } cb->send_mr = ibv_reg_mr(cb->pd, &cb->send_buf, sizeof cb->send_buf, 0); if (!cb->send_mr) { fprintf(stderr, "send_buf reg_mr failed\n"); ret = errno; goto err1; } cb->rdma_buf = malloc(cb->size); if 
(!cb->rdma_buf) { fprintf(stderr, "rdma_buf malloc failed\n"); ret = -ENOMEM; goto err2; } cb->rdma_mr = ibv_reg_mr(cb->pd, cb->rdma_buf, cb->size, IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_READ | IBV_ACCESS_REMOTE_WRITE); if (!cb->rdma_mr) { fprintf(stderr, "rdma_buf reg_mr failed\n"); ret = errno; goto err3; } if (!cb->server) { cb->start_buf = malloc(cb->size); if (!cb->start_buf) { fprintf(stderr, "start_buf malloc failed\n"); ret = -ENOMEM; goto err4; } cb->start_mr = ibv_reg_mr(cb->pd, cb->start_buf, cb->size, IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_READ | IBV_ACCESS_REMOTE_WRITE); if (!cb->start_mr) { fprintf(stderr, "start_buf reg_mr failed\n"); ret = errno; goto err5; } } rping_setup_wr(cb); DEBUG_LOG("allocated & registered buffers...\n"); return 0; err5: free(cb->start_buf); err4: ibv_dereg_mr(cb->rdma_mr); err3: free(cb->rdma_buf); err2: ibv_dereg_mr(cb->send_mr); err1: ibv_dereg_mr(cb->recv_mr); return ret; } static void rping_free_buffers(struct rping_cb *cb) { DEBUG_LOG("rping_free_buffers called on cb %p\n", cb); ibv_dereg_mr(cb->recv_mr); ibv_dereg_mr(cb->send_mr); ibv_dereg_mr(cb->rdma_mr); free(cb->rdma_buf); if (!cb->server) { ibv_dereg_mr(cb->start_mr); free(cb->start_buf); } } static int rping_create_qp(struct rping_cb *cb) { struct ibv_qp_init_attr init_attr; struct rdma_cm_id *id; int ret; memset(&init_attr, 0, sizeof(init_attr)); init_attr.cap.max_send_wr = RPING_SQ_DEPTH; init_attr.cap.max_recv_wr = 2; init_attr.cap.max_recv_sge = 1; init_attr.cap.max_send_sge = 1; init_attr.qp_type = IBV_QPT_RC; init_attr.send_cq = cb->cq; init_attr.recv_cq = cb->cq; id = cb->server ? cb->child_cm_id : cb->cm_id; if (cb->self_create_qp) { cb->qp = ibv_create_qp(cb->pd, &init_attr); if (!cb->qp) { perror("ibv_create_qp"); return -1; } struct ibv_qp_attr attr = { .qp_state = IBV_QPS_INIT, .pkey_index = 0, .port_num = id->port_num, .qp_access_flags = 0, }; ret = ibv_modify_qp(cb->qp, &attr, IBV_QP_STATE | IBV_QP_PKEY_INDEX | IBV_QP_PORT | IBV_QP_ACCESS_FLAGS); if (ret) { perror("ibv_modify_qp"); ibv_destroy_qp(cb->qp); } return ret ? 
-1 : 0; } ret = rdma_create_qp(id, cb->pd, &init_attr); if (!ret) cb->qp = id->qp; else perror("rdma_create_qp"); return ret; } static void rping_free_qp(struct rping_cb *cb) { ibv_destroy_qp(cb->qp); ibv_destroy_cq(cb->cq); ibv_destroy_comp_channel(cb->channel); ibv_dealloc_pd(cb->pd); } static int rping_setup_qp(struct rping_cb *cb, struct rdma_cm_id *cm_id) { int ret; cb->pd = ibv_alloc_pd(cm_id->verbs); if (!cb->pd) { fprintf(stderr, "ibv_alloc_pd failed\n"); return errno; } DEBUG_LOG("created pd %p\n", cb->pd); cb->channel = ibv_create_comp_channel(cm_id->verbs); if (!cb->channel) { fprintf(stderr, "ibv_create_comp_channel failed\n"); ret = errno; goto err1; } DEBUG_LOG("created channel %p\n", cb->channel); cb->cq = ibv_create_cq(cm_id->verbs, RPING_SQ_DEPTH * 2, cb, cb->channel, 0); if (!cb->cq) { fprintf(stderr, "ibv_create_cq failed\n"); ret = errno; goto err2; } DEBUG_LOG("created cq %p\n", cb->cq); ret = ibv_req_notify_cq(cb->cq, 0); if (ret) { fprintf(stderr, "ibv_create_cq failed\n"); ret = errno; goto err3; } ret = rping_create_qp(cb); if (ret) { goto err3; } DEBUG_LOG("created qp %p\n", cb->qp); return 0; err3: ibv_destroy_cq(cb->cq); err2: ibv_destroy_comp_channel(cb->channel); err1: ibv_dealloc_pd(cb->pd); return ret; } static void *cm_thread(void *arg) { struct rping_cb *cb = arg; struct rdma_cm_event *event; int ret; while (1) { ret = rdma_get_cm_event(cb->cm_channel, &event); if (ret) { perror("rdma_get_cm_event"); exit(ret); } ret = rping_cma_event_handler(event->id, event); rdma_ack_cm_event(event); if (ret) exit(ret); } } static void *cq_thread(void *arg) { struct rping_cb *cb = arg; struct ibv_cq *ev_cq; void *ev_ctx; int ret; DEBUG_LOG("cq_thread started.\n"); while (1) { pthread_testcancel(); ret = ibv_get_cq_event(cb->channel, &ev_cq, &ev_ctx); if (ret) { fprintf(stderr, "Failed to get cq event!\n"); pthread_exit(NULL); } if (ev_cq != cb->cq) { fprintf(stderr, "Unknown CQ!\n"); pthread_exit(NULL); } ret = ibv_req_notify_cq(cb->cq, 0); if (ret) { fprintf(stderr, "Failed to set notify!\n"); pthread_exit(NULL); } ret = rping_cq_event_handler(cb); ibv_ack_cq_events(cb->cq, 1); if (ret) pthread_exit(NULL); } } static void rping_format_send(struct rping_cb *cb, char *buf, struct ibv_mr *mr) { struct rping_rdma_info *info = &cb->send_buf; info->buf = htobe64((uint64_t) (unsigned long) buf); info->rkey = htobe32(mr->rkey); info->size = htobe32(cb->size); DEBUG_LOG("RDMA addr %" PRIx64" rkey %x len %d\n", be64toh(info->buf), be32toh(info->rkey), be32toh(info->size)); } static int rping_test_server(struct rping_cb *cb) { struct ibv_send_wr *bad_wr; int ret; while (1) { /* Wait for client's Start STAG/TO/Len */ sem_wait(&cb->sem); if (cb->state != RDMA_READ_ADV) { fprintf(stderr, "wait for RDMA_READ_ADV state %d\n", cb->state); ret = -1; break; } DEBUG_LOG("server received sink adv\n"); /* Issue RDMA Read. 
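 * The client advertised its source buffer as rkey/addr/len in the send
 * just received; the server pulls the ping payload with an RDMA read
 * against that rkey and only then posts the "go ahead" send.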
*/ cb->rdma_sq_wr.opcode = IBV_WR_RDMA_READ; cb->rdma_sq_wr.wr.rdma.rkey = cb->remote_rkey; cb->rdma_sq_wr.wr.rdma.remote_addr = cb->remote_addr; cb->rdma_sq_wr.sg_list->length = cb->remote_len; ret = ibv_post_send(cb->qp, &cb->rdma_sq_wr, &bad_wr); if (ret) { fprintf(stderr, "post send error %d\n", ret); break; } DEBUG_LOG("server posted rdma read req \n"); /* Wait for read completion */ sem_wait(&cb->sem); if (cb->state != RDMA_READ_COMPLETE) { fprintf(stderr, "wait for RDMA_READ_COMPLETE state %d\n", cb->state); ret = -1; break; } DEBUG_LOG("server received read complete\n"); /* Display data in recv buf */ if (cb->verbose) printf("server ping data: %s\n", cb->rdma_buf); /* Tell client to continue */ ret = ibv_post_send(cb->qp, &cb->sq_wr, &bad_wr); if (ret) { fprintf(stderr, "post send error %d\n", ret); break; } DEBUG_LOG("server posted go ahead\n"); /* Wait for client's RDMA STAG/TO/Len */ sem_wait(&cb->sem); if (cb->state != RDMA_WRITE_ADV) { fprintf(stderr, "wait for RDMA_WRITE_ADV state %d\n", cb->state); ret = -1; break; } DEBUG_LOG("server received sink adv\n"); /* RDMA Write echo data */ cb->rdma_sq_wr.opcode = IBV_WR_RDMA_WRITE; cb->rdma_sq_wr.wr.rdma.rkey = cb->remote_rkey; cb->rdma_sq_wr.wr.rdma.remote_addr = cb->remote_addr; cb->rdma_sq_wr.sg_list->length = strlen(cb->rdma_buf) + 1; DEBUG_LOG("rdma write from lkey %x laddr %" PRIx64 " len %d\n", cb->rdma_sq_wr.sg_list->lkey, cb->rdma_sq_wr.sg_list->addr, cb->rdma_sq_wr.sg_list->length); ret = ibv_post_send(cb->qp, &cb->rdma_sq_wr, &bad_wr); if (ret) { fprintf(stderr, "post send error %d\n", ret); break; } /* Wait for completion */ ret = sem_wait(&cb->sem); if (cb->state != RDMA_WRITE_COMPLETE) { fprintf(stderr, "wait for RDMA_WRITE_COMPLETE state %d\n", cb->state); ret = -1; break; } DEBUG_LOG("server rdma write complete \n"); /* Tell client to begin again */ ret = ibv_post_send(cb->qp, &cb->sq_wr, &bad_wr); if (ret) { fprintf(stderr, "post send error %d\n", ret); break; } DEBUG_LOG("server posted go ahead\n"); } return (cb->state == DISCONNECTED) ? 
0 : ret; } static int rping_bind_server(struct rping_cb *cb) { int ret; if (cb->sin.ss_family == AF_INET) ((struct sockaddr_in *) &cb->sin)->sin_port = cb->port; else ((struct sockaddr_in6 *) &cb->sin)->sin6_port = cb->port; ret = rdma_bind_addr(cb->cm_id, (struct sockaddr *) &cb->sin); if (ret) { perror("rdma_bind_addr"); return ret; } DEBUG_LOG("rdma_bind_addr successful\n"); DEBUG_LOG("rdma_listen\n"); ret = rdma_listen(cb->cm_id, 3); if (ret) { perror("rdma_listen"); return ret; } return 0; } static struct rping_cb *clone_cb(struct rping_cb *listening_cb) { struct rping_cb *cb = malloc(sizeof *cb); if (!cb) return NULL; memset(cb, 0, sizeof *cb); *cb = *listening_cb; cb->child_cm_id->context = cb; return cb; } static void free_cb(struct rping_cb *cb) { free(cb); } static void *rping_persistent_server_thread(void *arg) { struct rping_cb *cb = arg; struct ibv_recv_wr *bad_wr; int ret; ret = rping_setup_qp(cb, cb->child_cm_id); if (ret) { fprintf(stderr, "setup_qp failed: %d\n", ret); goto err0; } ret = rping_setup_buffers(cb); if (ret) { fprintf(stderr, "rping_setup_buffers failed: %d\n", ret); goto err1; } ret = ibv_post_recv(cb->qp, &cb->rq_wr, &bad_wr); if (ret) { fprintf(stderr, "ibv_post_recv failed: %d\n", ret); goto err2; } ret = pthread_create(&cb->cqthread, NULL, cq_thread, cb); if (ret) { perror("pthread_create"); goto err2; } ret = rping_accept(cb); if (ret) { fprintf(stderr, "connect error %d\n", ret); goto err3; } rping_test_server(cb); rping_disconnect(cb, cb->child_cm_id); pthread_join(cb->cqthread, NULL); rping_free_buffers(cb); rping_free_qp(cb); rdma_destroy_id(cb->child_cm_id); free_cb(cb); return NULL; err3: pthread_cancel(cb->cqthread); pthread_join(cb->cqthread, NULL); err2: rping_free_buffers(cb); err1: rping_free_qp(cb); err0: free_cb(cb); return NULL; } static int rping_run_persistent_server(struct rping_cb *listening_cb) { int ret; struct rping_cb *cb; pthread_attr_t attr; ret = rping_bind_server(listening_cb); if (ret) return ret; /* * Set persistent server threads to DETACHED state so * they release all their resources when they exit.
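 * A detached thread can never be joined, so each connection's cleanup
 * (buffers, QP, child cm_id, control block) is done at the end of
 * rping_persistent_server_thread() itself.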
*/ ret = pthread_attr_init(&attr); if (ret) { perror("pthread_attr_init"); return ret; } ret = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED); if (ret) { perror("pthread_attr_setdetachstate"); return ret; } while (1) { sem_wait(&listening_cb->sem); if (listening_cb->state != CONNECT_REQUEST) { fprintf(stderr, "wait for CONNECT_REQUEST state %d\n", listening_cb->state); return -1; } cb = clone_cb(listening_cb); if (!cb) return -1; sem_post(&listening_cb->accept_ready); ret = pthread_create(&cb->persistent_server_thread, &attr, rping_persistent_server_thread, cb); if (ret) { perror("pthread_create"); return ret; } } return 0; } static int rping_run_server(struct rping_cb *cb) { struct ibv_recv_wr *bad_wr; int ret; ret = rping_bind_server(cb); if (ret) return ret; sem_wait(&cb->sem); if (cb->state != CONNECT_REQUEST) { fprintf(stderr, "wait for CONNECT_REQUEST state %d\n", cb->state); return -1; } ret = rping_setup_qp(cb, cb->child_cm_id); if (ret) { fprintf(stderr, "setup_qp failed: %d\n", ret); return ret; } ret = rping_setup_buffers(cb); if (ret) { fprintf(stderr, "rping_setup_buffers failed: %d\n", ret); goto err1; } ret = ibv_post_recv(cb->qp, &cb->rq_wr, &bad_wr); if (ret) { fprintf(stderr, "ibv_post_recv failed: %d\n", ret); goto err2; } ret = pthread_create(&cb->cqthread, NULL, cq_thread, cb); if (ret) { perror("pthread_create"); goto err2; } ret = rping_accept(cb); if (ret) { fprintf(stderr, "connect error %d\n", ret); goto err2; } ret = rping_test_server(cb); if (ret) { fprintf(stderr, "rping server failed: %d\n", ret); goto err3; } ret = 0; err3: rping_disconnect(cb, cb->child_cm_id); pthread_join(cb->cqthread, NULL); rdma_destroy_id(cb->child_cm_id); err2: rping_free_buffers(cb); err1: rping_free_qp(cb); return ret; } static int rping_test_client(struct rping_cb *cb) { int ping, start, cc, i, ret = 0; struct ibv_send_wr *bad_wr; unsigned char c; start = 65; for (ping = 0; !cb->count || ping < cb->count; ping++) { cb->state = RDMA_READ_ADV; /* Put some ascii text in the buffer. */ cc = snprintf(cb->start_buf, cb->size, RPING_MSG_FMT, ping); for (i = cc, c = start; i < cb->size; i++) { cb->start_buf[i] = c; c++; if (c > 122) c = 65; } start++; if (start > 122) start = 65; cb->start_buf[cb->size - 1] = 0; rping_format_send(cb, cb->start_buf, cb->start_mr); ret = ibv_post_send(cb->qp, &cb->sq_wr, &bad_wr); if (ret) { fprintf(stderr, "post send error %d\n", ret); break; } /* Wait for server to ACK */ sem_wait(&cb->sem); if (cb->state != RDMA_WRITE_ADV) { fprintf(stderr, "wait for RDMA_WRITE_ADV state %d\n", cb->state); ret = -1; break; } rping_format_send(cb, cb->rdma_buf, cb->rdma_mr); ret = ibv_post_send(cb->qp, &cb->sq_wr, &bad_wr); if (ret) { fprintf(stderr, "post send error %d\n", ret); break; } /* Wait for the server to say the RDMA Write is complete. */ sem_wait(&cb->sem); if (cb->state != RDMA_WRITE_COMPLETE) { fprintf(stderr, "wait for RDMA_WRITE_COMPLETE state %d\n", cb->state); ret = -1; break; } if (cb->validate) if (memcmp(cb->start_buf, cb->rdma_buf, cb->size)) { fprintf(stderr, "data mismatch!\n"); ret = -1; break; } if (cb->verbose) printf("ping data: %s\n", cb->rdma_buf); } return (cb->state == DISCONNECTED) ? 
0 : ret; } static int rping_connect_client(struct rping_cb *cb) { struct rdma_conn_param conn_param; int ret; rping_init_conn_param(cb, &conn_param); ret = rdma_connect(cb->cm_id, &conn_param); if (ret) { perror("rdma_connect"); return ret; } sem_wait(&cb->sem); if (cb->state != CONNECTED) { fprintf(stderr, "wait for CONNECTED state %d\n", cb->state); return -1; } if (cb->self_create_qp) { ret = rping_self_modify_qp(cb, cb->cm_id); if (ret) { perror("rping_modify_qp"); return ret; } ret = rdma_establish(cb->cm_id); if (ret) { perror("rdma_establish"); return ret; } } DEBUG_LOG("rdma_connect successful\n"); return 0; } static int rping_bind_client(struct rping_cb *cb) { int ret; if (cb->sin.ss_family == AF_INET) ((struct sockaddr_in *) &cb->sin)->sin_port = cb->port; else ((struct sockaddr_in6 *) &cb->sin)->sin6_port = cb->port; if (cb->ssource.ss_family) ret = rdma_resolve_addr(cb->cm_id, (struct sockaddr *) &cb->ssource, (struct sockaddr *) &cb->sin, 2000); else ret = rdma_resolve_addr(cb->cm_id, NULL, (struct sockaddr *) &cb->sin, 2000); if (ret) { perror("rdma_resolve_addr"); return ret; } sem_wait(&cb->sem); if (cb->state != ROUTE_RESOLVED) { fprintf(stderr, "waiting for addr/route resolution state %d\n", cb->state); return -1; } DEBUG_LOG("rdma_resolve_addr - rdma_resolve_route successful\n"); return 0; } static int rping_run_client(struct rping_cb *cb) { struct ibv_recv_wr *bad_wr; int ret; ret = rping_bind_client(cb); if (ret) return ret; ret = rping_setup_qp(cb, cb->cm_id); if (ret) { fprintf(stderr, "setup_qp failed: %d\n", ret); return ret; } ret = rping_setup_buffers(cb); if (ret) { fprintf(stderr, "rping_setup_buffers failed: %d\n", ret); goto err1; } ret = ibv_post_recv(cb->qp, &cb->rq_wr, &bad_wr); if (ret) { fprintf(stderr, "ibv_post_recv failed: %d\n", ret); goto err2; } ret = pthread_create(&cb->cqthread, NULL, cq_thread, cb); if (ret) { perror("pthread_create"); goto err2; } ret = rping_connect_client(cb); if (ret) { fprintf(stderr, "connect error %d\n", ret); goto err3; } ret = rping_test_client(cb); if (ret) { fprintf(stderr, "rping client failed: %d\n", ret); goto err4; } ret = 0; err4: rping_disconnect(cb, cb->cm_id); err3: pthread_join(cb->cqthread, NULL); err2: rping_free_buffers(cb); err1: rping_free_qp(cb); return ret; } static int get_addr(char *dst, struct sockaddr *addr) { struct addrinfo *res; int ret; ret = getaddrinfo(dst, NULL, NULL, &res); if (ret) { printf("getaddrinfo failed (%s) - invalid hostname or IP address\n", gai_strerror(ret)); return ret; } if (res->ai_family == PF_INET) memcpy(addr, res->ai_addr, sizeof(struct sockaddr_in)); else if (res->ai_family == PF_INET6) memcpy(addr, res->ai_addr, sizeof(struct sockaddr_in6)); else ret = -1; freeaddrinfo(res); return ret; } static void usage(const char *name) { printf("%s -s [-vVd] [-S size] [-C count] [-a addr] [-p port]\n", name); printf("%s -c [-vVd] [-S size] [-C count] [-I addr] -a addr [-p port]\n", name); printf("\t-c\t\tclient side\n"); printf("\t-I\t\tSource address to bind to for client.\n"); printf("\t-s\t\tserver side. 
To bind to any address with IPv6 use -a ::0\n"); printf("\t-v\t\tdisplay ping data to stdout\n"); printf("\t-V\t\tvalidate ping data\n"); printf("\t-d\t\tdebug printfs\n"); printf("\t-S size \tping data size\n"); printf("\t-C count\tping count times\n"); printf("\t-a addr\t\taddress\n"); printf("\t-p port\t\tport\n"); printf("\t-P\t\tpersistent server mode allowing multiple connections\n"); printf("\t-q\t\tuse self-created, self-modified QP\n"); } int main(int argc, char *argv[]) { struct rping_cb *cb; int op; int ret = 0; int persistent_server = 0; cb = malloc(sizeof(*cb)); if (!cb) return -ENOMEM; memset(cb, 0, sizeof(*cb)); cb->server = -1; cb->state = IDLE; cb->size = 64; cb->sin.ss_family = PF_INET; cb->port = htobe16(7174); sem_init(&cb->sem, 0, 0); sem_init(&cb->accept_ready, 0, 1); opterr = 0; while ((op = getopt(argc, argv, "a:I:Pp:C:S:t:scvVdq")) != -1) { switch (op) { case 'a': ret = get_addr(optarg, (struct sockaddr *) &cb->sin); break; case 'I': ret = get_addr(optarg, (struct sockaddr *) &cb->ssource); break; case 'P': persistent_server = 1; break; case 'p': cb->port = htobe16(atoi(optarg)); DEBUG_LOG("port %d\n", (int) atoi(optarg)); break; case 's': cb->server = 1; DEBUG_LOG("server\n"); break; case 'c': cb->server = 0; DEBUG_LOG("client\n"); break; case 'S': cb->size = atoi(optarg); if ((cb->size < RPING_MIN_BUFSIZE) || (cb->size > (RPING_BUFSIZE - 1))) { fprintf(stderr, "Invalid size %d " "(valid range is %zd to %d)\n", cb->size, RPING_MIN_BUFSIZE, RPING_BUFSIZE); ret = EINVAL; } else DEBUG_LOG("size %d\n", (int) atoi(optarg)); break; case 'C': cb->count = atoi(optarg); if (cb->count < 0) { fprintf(stderr, "Invalid count %d\n", cb->count); ret = EINVAL; } else DEBUG_LOG("count %d\n", (int) cb->count); break; case 'v': cb->verbose++; DEBUG_LOG("verbose\n"); break; case 'V': cb->validate++; DEBUG_LOG("validate data\n"); break; case 'd': debug++; break; case 'q': cb->self_create_qp = 1; break; default: usage("rping"); ret = EINVAL; goto out; } } if (ret) goto out; if (cb->server == -1) { usage("rping"); ret = EINVAL; goto out; } cb->cm_channel = create_event_channel(); if (!cb->cm_channel) { ret = errno; goto out; } ret = rdma_create_id(cb->cm_channel, &cb->cm_id, cb, RDMA_PS_TCP); if (ret) { perror("rdma_create_id"); goto out2; } DEBUG_LOG("created cm_id %p\n", cb->cm_id); ret = pthread_create(&cb->cmthread, NULL, cm_thread, cb); if (ret) { perror("pthread_create"); goto out2; } if (cb->server) { if (persistent_server) ret = rping_run_persistent_server(cb); else ret = rping_run_server(cb); } else { ret = rping_run_client(cb); } DEBUG_LOG("destroy cm_id %p\n", cb->cm_id); rdma_destroy_id(cb->cm_id); out2: rdma_destroy_event_channel(cb->cm_channel); out: free(cb); return ret; } rdma-core-56.1/librdmacm/examples/rstream.c000066400000000000000000000367531477342711600207410ustar00rootroot00000000000000/* * Copyright (c) 2011-2012 Intel Corporation. All rights reserved. * Copyright (c) 2014-2015 Mellanox Technologies LTD. All rights reserved. * * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "common.h" struct test_size_param { int size; int option; }; static struct test_size_param test_size[] = { { 1 << 6, 0 }, { 1 << 7, 1 }, { (1 << 7) + (1 << 6), 1}, { 1 << 8, 1 }, { (1 << 8) + (1 << 7), 1}, { 1 << 9, 1 }, { (1 << 9) + (1 << 8), 1}, { 1 << 10, 1 }, { (1 << 10) + (1 << 9), 1}, { 1 << 11, 1 }, { (1 << 11) + (1 << 10), 1}, { 1 << 12, 0 }, { (1 << 12) + (1 << 11), 1}, { 1 << 13, 1 }, { (1 << 13) + (1 << 12), 1}, { 1 << 14, 1 }, { (1 << 14) + (1 << 13), 1}, { 1 << 15, 1 }, { (1 << 15) + (1 << 14), 1}, { 1 << 16, 0 }, { (1 << 16) + (1 << 15), 1}, { 1 << 17, 1 }, { (1 << 17) + (1 << 16), 1}, { 1 << 18, 1 }, { (1 << 18) + (1 << 17), 1}, { 1 << 19, 1 }, { (1 << 19) + (1 << 18), 1}, { 1 << 20, 0 }, { (1 << 20) + (1 << 19), 1}, { 1 << 21, 1 }, { (1 << 21) + (1 << 20), 1}, { 1 << 22, 1 }, { (1 << 22) + (1 << 21), 1}, }; #define TEST_CNT (sizeof test_size / sizeof test_size[0]) static int rs, lrs; static int use_async; static int use_rgai; static int verify; static int flags = MSG_DONTWAIT; static int poll_timeout = 0; static int custom; static int use_fork; static pid_t fork_pid; static enum rs_optimization optimization; static int size_option; static int iterations = 1; static int transfer_size = 1000; static int transfer_count = 1000; static int buffer_size, inline_size = 64; static char test_name[10] = "custom"; static const char *port = "7471"; static int keepalive; static char *dst_addr; static char *src_addr; static struct timeval start, end; static void *buf; static struct rdma_addrinfo rai_hints; static struct addrinfo ai_hints; static void show_perf(void) { char str[32]; float usec; long long bytes; usec = (end.tv_sec - start.tv_sec) * 1000000 + (end.tv_usec - start.tv_usec); bytes = (long long) iterations * transfer_count * transfer_size * 2; /* name size transfers iterations bytes seconds Gb/sec usec/xfer */ printf("%-10s", test_name); size_str(str, sizeof str, transfer_size); printf("%-8s", str); cnt_str(str, sizeof str, transfer_count); printf("%-8s", str); cnt_str(str, sizeof str, iterations); printf("%-8s", str); size_str(str, sizeof str, bytes); printf("%-8s", str); printf("%8.2fs%10.2f%11.2f\n", usec / 1000000., (bytes * 8) / (1000.
* usec), (usec / iterations) / (transfer_count * 2)); } static void init_latency_test(int size) { char sstr[5]; size_str(sstr, sizeof sstr, size); snprintf(test_name, sizeof test_name, "%s_lat", sstr); transfer_count = 1; transfer_size = size; iterations = size_to_count(transfer_size); } static void init_bandwidth_test(int size) { char sstr[5]; size_str(sstr, sizeof sstr, size); snprintf(test_name, sizeof test_name, "%s_bw", sstr); iterations = 1; transfer_size = size; transfer_count = size_to_count(transfer_size); } static int send_xfer(int size) { struct pollfd fds; int offset, ret; if (verify) format_buf(buf, size); if (use_async) { fds.fd = rs; fds.events = POLLOUT; } for (offset = 0; offset < size; ) { if (use_async) { ret = do_poll(&fds, poll_timeout); if (ret) return ret; } ret = rs_send(rs, buf + offset, size - offset, flags); if (ret > 0) { offset += ret; } else if (errno != EWOULDBLOCK && errno != EAGAIN) { perror("rsend"); return ret; } } return 0; } static int recv_xfer(int size) { struct pollfd fds; int offset, ret; if (use_async) { fds.fd = rs; fds.events = POLLIN; } for (offset = 0; offset < size; ) { if (use_async) { ret = do_poll(&fds, poll_timeout); if (ret) return ret; } ret = rs_recv(rs, buf + offset, size - offset, flags); if (ret > 0) { offset += ret; } else if (errno != EWOULDBLOCK && errno != EAGAIN) { perror("rrecv"); return ret; } } if (verify) { ret = verify_buf(buf, size); if (ret) return ret; } return 0; } static int sync_test(void) { int ret; ret = dst_addr ? send_xfer(16) : recv_xfer(16); if (ret) return ret; return dst_addr ? recv_xfer(16) : send_xfer(16); } static int run_test(void) { int ret, i, t; ret = sync_test(); if (ret) goto out; gettimeofday(&start, NULL); for (i = 0; i < iterations; i++) { for (t = 0; t < transfer_count; t++) { ret = dst_addr ? send_xfer(transfer_size) : recv_xfer(transfer_size); if (ret) goto out; } for (t = 0; t < transfer_count; t++) { ret = dst_addr ? recv_xfer(transfer_size) : send_xfer(transfer_size); if (ret) goto out; } } gettimeofday(&end, NULL); show_perf(); ret = 0; out: return ret; } static void set_keepalive(int fd) { int optval; socklen_t optlen = sizeof(optlen); optval = 1; if (rs_setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &optval, optlen)) { perror("rsetsockopt SO_KEEPALIVE"); return; } optval = keepalive; if (rs_setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &optval, optlen)) perror("rsetsockopt TCP_KEEPIDLE"); if (!(rs_getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &optval, &optlen))) printf("Keepalive: %s\n", (optval ? 
"ON" : "OFF")); if (!(rs_getsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &optval, &optlen))) printf(" time: %i\n", optval); } static void set_options(int fd) { int val; if (buffer_size) { rs_setsockopt(fd, SOL_SOCKET, SO_SNDBUF, (void *) &buffer_size, sizeof buffer_size); rs_setsockopt(fd, SOL_SOCKET, SO_RCVBUF, (void *) &buffer_size, sizeof buffer_size); } else { val = 1 << 19; rs_setsockopt(fd, SOL_SOCKET, SO_SNDBUF, (void *) &val, sizeof val); rs_setsockopt(fd, SOL_SOCKET, SO_RCVBUF, (void *) &val, sizeof val); } val = 1; rs_setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, (void *) &val, sizeof(val)); if (flags & MSG_DONTWAIT) rs_fcntl(fd, F_SETFL, O_NONBLOCK); if (use_rs) { /* Inline size based on experimental data */ if (optimization == opt_latency) { rs_setsockopt(fd, SOL_RDMA, RDMA_INLINE, &inline_size, sizeof inline_size); } else if (optimization == opt_bandwidth) { val = 0; rs_setsockopt(fd, SOL_RDMA, RDMA_INLINE, &val, sizeof val); } } if (keepalive) set_keepalive(fd); } static int server_listen(void) { struct rdma_addrinfo *rai = NULL; struct addrinfo *ai; int val, ret; if (use_rgai) { rai_hints.ai_flags |= RAI_PASSIVE; ret = rdma_getaddrinfo(src_addr, port, &rai_hints, &rai); } else { ai_hints.ai_flags |= AI_PASSIVE; ret = getaddrinfo(src_addr, port, &ai_hints, &ai); } if (ret) { printf("getaddrinfo: %s\n", gai_strerror(ret)); return ret; } lrs = rai ? rs_socket(rai->ai_family, SOCK_STREAM, 0) : rs_socket(ai->ai_family, SOCK_STREAM, 0); if (lrs < 0) { ret = lrs; goto free; } val = 1; ret = rs_setsockopt(lrs, SOL_SOCKET, SO_REUSEADDR, &val, sizeof val); if (ret) { perror("rsetsockopt SO_REUSEADDR"); goto close; } ret = rai ? rs_bind(lrs, rai->ai_src_addr, rai->ai_src_len) : rs_bind(lrs, ai->ai_addr, ai->ai_addrlen); if (ret) { perror("rbind"); goto close; } ret = rs_listen(lrs, 1); if (ret) perror("rlisten"); close: if (ret) rs_close(lrs); free: if (rai) rdma_freeaddrinfo(rai); else freeaddrinfo(ai); return ret; } static int server_connect(void) { struct pollfd fds; int ret = 0; set_options(lrs); do { if (use_async) { fds.fd = lrs; fds.events = POLLIN; ret = do_poll(&fds, poll_timeout); if (ret) { perror("rpoll"); return ret; } } rs = rs_accept(lrs, NULL, NULL); } while (rs < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)); if (rs < 0) { perror("raccept"); return rs; } if (use_fork) fork_pid = fork(); if (!fork_pid) set_options(rs); return ret; } static int client_connect(void) { struct rdma_addrinfo *rai = NULL, *rai_src = NULL; struct addrinfo *ai = NULL, *ai_src = NULL; struct pollfd fds; int ret, err; socklen_t len; ret = use_rgai ? rdma_getaddrinfo(dst_addr, port, &rai_hints, &rai) : getaddrinfo(dst_addr, port, &ai_hints, &ai); if (ret) { printf("getaddrinfo: %s\n", gai_strerror(ret)); return ret; } if (src_addr) { if (use_rgai) { rai_hints.ai_flags |= RAI_PASSIVE; ret = rdma_getaddrinfo(src_addr, port, &rai_hints, &rai_src); } else { ai_hints.ai_flags |= AI_PASSIVE; ret = getaddrinfo(src_addr, port, &ai_hints, &ai_src); } if (ret) { printf("getaddrinfo src_addr: %s\n", gai_strerror(ret)); goto free; } } rs = rai ? rs_socket(rai->ai_family, SOCK_STREAM, 0) : rs_socket(ai->ai_family, SOCK_STREAM, 0); if (rs < 0) { ret = rs; goto free; } set_options(rs); if (src_addr) { ret = rai ? 
rs_bind(rs, rai_src->ai_src_addr, rai_src->ai_src_len) : rs_bind(rs, ai_src->ai_addr, ai_src->ai_addrlen); if (ret) { perror("rbind"); goto close; } } if (rai && rai->ai_route) { ret = rs_setsockopt(rs, SOL_RDMA, RDMA_ROUTE, rai->ai_route, rai->ai_route_len); if (ret) { perror("rsetsockopt RDMA_ROUTE"); goto close; } } ret = rai ? rs_connect(rs, rai->ai_dst_addr, rai->ai_dst_len) : rs_connect(rs, ai->ai_addr, ai->ai_addrlen); if (ret && (errno != EINPROGRESS)) { perror("rconnect"); goto close; } if (ret && (errno == EINPROGRESS)) { fds.fd = rs; fds.events = POLLOUT; ret = do_poll(&fds, poll_timeout); if (ret) { perror("rpoll"); goto close; } len = sizeof err; ret = rs_getsockopt(rs, SOL_SOCKET, SO_ERROR, &err, &len); if (ret) goto close; if (err) { ret = -1; errno = err; perror("async rconnect"); } } close: if (ret) rs_close(rs); free: rdma_freeaddrinfo(rai); if (ai) freeaddrinfo(ai); rdma_freeaddrinfo(rai_src); if (ai_src) freeaddrinfo(ai_src); return ret; } static int run(void) { int i, ret = 0; buf = malloc(!custom ? test_size[TEST_CNT - 1].size : transfer_size); if (!buf) { perror("malloc"); return -1; } if (!dst_addr) { ret = server_listen(); if (ret) goto free; } printf("%-10s%-8s%-8s%-8s%-8s%8s %10s%13s\n", "name", "bytes", "xfers", "iters", "total", "time", "Gb/sec", "usec/xfer"); if (!custom) { optimization = opt_latency; ret = dst_addr ? client_connect() : server_connect(); if (ret) goto free; for (i = 0; i < TEST_CNT && !fork_pid; i++) { if (test_size[i].option > size_option) continue; init_latency_test(test_size[i].size); run_test(); } if (fork_pid) waitpid(fork_pid, NULL, 0); else rs_shutdown(rs, SHUT_RDWR); rs_close(rs); if (!dst_addr && use_fork && !fork_pid) goto free; optimization = opt_bandwidth; ret = dst_addr ? client_connect() : server_connect(); if (ret) goto free; for (i = 0; i < TEST_CNT && !fork_pid; i++) { if (test_size[i].option > size_option) continue; init_bandwidth_test(test_size[i].size); run_test(); } } else { ret = dst_addr ? 
client_connect() : server_connect(); if (ret) goto free; if (!fork_pid) ret = run_test(); } if (fork_pid) waitpid(fork_pid, NULL, 0); else rs_shutdown(rs, SHUT_RDWR); rs_close(rs); free: free(buf); return ret; } static int set_test_opt(const char *arg) { if (strlen(arg) == 1) { switch (arg[0]) { case 's': use_rs = 0; break; case 'a': use_async = 1; break; case 'b': flags = (flags & ~MSG_DONTWAIT) | MSG_WAITALL; break; case 'f': use_fork = 1; use_rs = 0; break; case 'n': flags |= MSG_DONTWAIT; break; case 'r': use_rgai = 1; break; case 'v': verify = 1; break; default: return -1; } } else { if (!strncasecmp("socket", arg, 6)) { use_rs = 0; } else if (!strncasecmp("async", arg, 5)) { use_async = 1; } else if (!strncasecmp("block", arg, 5)) { flags = (flags & ~MSG_DONTWAIT) | MSG_WAITALL; } else if (!strncasecmp("nonblock", arg, 8)) { flags |= MSG_DONTWAIT; } else if (!strncasecmp("resolve", arg, 7)) { use_rgai = 1; } else if (!strncasecmp("verify", arg, 6)) { verify = 1; } else if (!strncasecmp("fork", arg, 4)) { use_fork = 1; use_rs = 0; } else { return -1; } } return 0; } int main(int argc, char **argv) { int op, ret; ai_hints.ai_socktype = SOCK_STREAM; rai_hints.ai_port_space = RDMA_PS_TCP; while ((op = getopt(argc, argv, "s:b:f:B:i:I:C:S:p:k:T:")) != -1) { switch (op) { case 's': dst_addr = optarg; break; case 'b': src_addr = optarg; break; case 'f': if (!strncasecmp("ip", optarg, 2)) { ai_hints.ai_flags = AI_NUMERICHOST; } else if (!strncasecmp("gid", optarg, 3)) { rai_hints.ai_flags = RAI_NUMERICHOST | RAI_FAMILY; rai_hints.ai_family = AF_IB; use_rgai = 1; } else { fprintf(stderr, "Warning: unknown address format\n"); } break; case 'B': buffer_size = atoi(optarg); break; case 'i': inline_size = atoi(optarg); break; case 'I': custom = 1; iterations = atoi(optarg); break; case 'C': custom = 1; transfer_count = atoi(optarg); break; case 'S': if (!strncasecmp("all", optarg, 3)) { size_option = 1; } else { custom = 1; transfer_size = atoi(optarg); } break; case 'p': port = optarg; break; case 'k': keepalive = atoi(optarg); break; case 'T': if (!set_test_opt(optarg)) break; /* invalid option - fall through */ SWITCH_FALLTHROUGH; default: printf("usage: %s\n", argv[0]); printf("\t[-s server_address]\n"); printf("\t[-b bind_address]\n"); printf("\t[-f address_format]\n"); printf("\t name, ip, ipv6, or gid\n"); printf("\t[-B buffer_size]\n"); printf("\t[-i inline_size]\n"); printf("\t[-I iterations]\n"); printf("\t[-C transfer_count]\n"); printf("\t[-S transfer_size or all]\n"); printf("\t[-p port_number]\n"); printf("\t[-k keepalive_time]\n"); printf("\t[-T test_option]\n"); printf("\t s|sockets - use standard tcp/ip sockets\n"); printf("\t a|async - asynchronous operation (use poll)\n"); printf("\t b|blocking - use blocking calls\n"); printf("\t f|fork - fork server processing\n"); printf("\t n|nonblocking - use nonblocking calls\n"); printf("\t r|resolve - use rdma cm to resolve address\n"); printf("\t v|verify - verify data\n"); exit(1); } } if (!(flags & MSG_DONTWAIT)) poll_timeout = -1; ret = run(); return ret; } rdma-core-56.1/librdmacm/examples/udaddy.c000066400000000000000000000361401477342711600205240ustar00rootroot00000000000000/* * Copyright (c) 2005-2006 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * * $Id$ */ #include #include #include #include #include #include #include #include #include #include "common.h" struct cmatest_node { int id; struct rdma_cm_id *cma_id; int connected; struct ibv_pd *pd; struct ibv_cq *cq; struct ibv_mr *mr; struct ibv_ah *ah; uint32_t remote_qpn; uint32_t remote_qkey; void *mem; }; struct cmatest { struct rdma_event_channel *channel; struct cmatest_node *nodes; int conn_index; int connects_left; struct rdma_addrinfo *rai; }; static struct cmatest test; static int connections = 1; static int message_size = 100; static int message_count = 10; static const char *port = "7174"; static uint8_t set_tos = 0; static uint8_t tos; static char *dst_addr; static char *src_addr; static struct rdma_addrinfo hints; static int create_message(struct cmatest_node *node) { if (!message_size) message_count = 0; if (!message_count) return 0; node->mem = malloc(message_size + sizeof(struct ibv_grh)); if (!node->mem) { printf("failed message allocation\n"); return -1; } node->mr = ibv_reg_mr(node->pd, node->mem, message_size + sizeof(struct ibv_grh), IBV_ACCESS_LOCAL_WRITE); if (!node->mr) { printf("failed to reg MR\n"); goto err; } return 0; err: free(node->mem); return -1; } static int verify_test_params(struct cmatest_node *node) { struct ibv_port_attr port_attr; int ret; ret = ibv_query_port(node->cma_id->verbs, node->cma_id->port_num, &port_attr); if (ret) return ret; if (message_count && message_size > (1 << (port_attr.active_mtu + 7))) { printf("udaddy: message_size %d is larger than active mtu %d\n", message_size, 1 << (port_attr.active_mtu + 7)); return -EINVAL; } return 0; } static int init_node(struct cmatest_node *node) { struct ibv_qp_init_attr init_qp_attr; int cqe, ret; node->pd = ibv_alloc_pd(node->cma_id->verbs); if (!node->pd) { ret = -ENOMEM; printf("udaddy: unable to allocate PD\n"); goto out; } cqe = message_count ? message_count * 2 : 2; node->cq = ibv_create_cq(node->cma_id->verbs, cqe, node, NULL, 0); if (!node->cq) { ret = -ENOMEM; printf("udaddy: unable to create CQ\n"); goto out; } memset(&init_qp_attr, 0, sizeof init_qp_attr); init_qp_attr.cap.max_send_wr = message_count ? message_count : 1; init_qp_attr.cap.max_recv_wr = message_count ? 
message_count : 1; init_qp_attr.cap.max_send_sge = 1; init_qp_attr.cap.max_recv_sge = 1; init_qp_attr.qp_context = node; init_qp_attr.sq_sig_all = 0; init_qp_attr.qp_type = IBV_QPT_UD; init_qp_attr.send_cq = node->cq; init_qp_attr.recv_cq = node->cq; ret = rdma_create_qp(node->cma_id, node->pd, &init_qp_attr); if (ret) { perror("udaddy: unable to create QP"); goto out; } ret = create_message(node); if (ret) { printf("udaddy: failed to create messages: %d\n", ret); goto out; } out: return ret; } static int post_recvs(struct cmatest_node *node) { struct ibv_recv_wr recv_wr, *recv_failure; struct ibv_sge sge; int i, ret = 0; if (!message_count) return 0; recv_wr.next = NULL; recv_wr.sg_list = &sge; recv_wr.num_sge = 1; recv_wr.wr_id = (uintptr_t) node; sge.length = message_size + sizeof(struct ibv_grh); sge.lkey = node->mr->lkey; sge.addr = (uintptr_t) node->mem; for (i = 0; i < message_count && !ret; i++ ) { ret = ibv_post_recv(node->cma_id->qp, &recv_wr, &recv_failure); if (ret) { printf("failed to post receives: %d\n", ret); break; } } return ret; } static int post_sends(struct cmatest_node *node, int signal_flag) { struct ibv_send_wr send_wr, *bad_send_wr; struct ibv_sge sge; int i, ret = 0; if (!node->connected || !message_count) return 0; send_wr.next = NULL; send_wr.sg_list = &sge; send_wr.num_sge = 1; send_wr.opcode = IBV_WR_SEND_WITH_IMM; send_wr.send_flags = signal_flag; send_wr.wr_id = (unsigned long)node; send_wr.imm_data = htobe32(node->cma_id->qp->qp_num); send_wr.wr.ud.ah = node->ah; send_wr.wr.ud.remote_qpn = node->remote_qpn; send_wr.wr.ud.remote_qkey = node->remote_qkey; sge.length = message_size; sge.lkey = node->mr->lkey; sge.addr = (uintptr_t) node->mem; for (i = 0; i < message_count && !ret; i++) { ret = ibv_post_send(node->cma_id->qp, &send_wr, &bad_send_wr); if (ret) printf("failed to post sends: %d\n", ret); } return ret; } static void connect_error(void) { test.connects_left--; } static int addr_handler(struct cmatest_node *node) { int ret; if (set_tos) { ret = rdma_set_option(node->cma_id, RDMA_OPTION_ID, RDMA_OPTION_ID_TOS, &tos, sizeof tos); if (ret) perror("udaddy: set TOS option failed"); } ret = rdma_resolve_route(node->cma_id, 2000); if (ret) { perror("udaddy: resolve route failed"); connect_error(); } return ret; } static int route_handler(struct cmatest_node *node) { struct rdma_conn_param conn_param; int ret; ret = verify_test_params(node); if (ret) goto err; ret = init_node(node); if (ret) goto err; ret = post_recvs(node); if (ret) goto err; memset(&conn_param, 0, sizeof conn_param); conn_param.private_data = test.rai->ai_connect; conn_param.private_data_len = test.rai->ai_connect_len; ret = rdma_connect(node->cma_id, &conn_param); if (ret) { perror("udaddy: failure connecting"); goto err; } return 0; err: connect_error(); return ret; } static int connect_handler(struct rdma_cm_id *cma_id) { struct cmatest_node *node; struct rdma_conn_param conn_param; int ret; if (test.conn_index == connections) { ret = -ENOMEM; goto err1; } node = &test.nodes[test.conn_index++]; node->cma_id = cma_id; cma_id->context = node; ret = verify_test_params(node); if (ret) goto err2; ret = init_node(node); if (ret) goto err2; ret = post_recvs(node); if (ret) goto err2; memset(&conn_param, 0, sizeof conn_param); conn_param.qp_num = node->cma_id->qp->qp_num; ret = rdma_accept(node->cma_id, &conn_param); if (ret) { perror("udaddy: failure accepting"); goto err2; } node->connected = 1; test.connects_left--; return 0; err2: node->cma_id = NULL; connect_error(); err1: printf("udaddy: 
failing connection request\n"); rdma_reject(cma_id, NULL, 0); return ret; } static int resolved_handler(struct cmatest_node *node, struct rdma_cm_event *event) { node->remote_qpn = event->param.ud.qp_num; node->remote_qkey = event->param.ud.qkey; node->ah = ibv_create_ah(node->pd, &event->param.ud.ah_attr); if (!node->ah) { printf("udaddy: failure creating address handle\n"); goto err; } node->connected = 1; test.connects_left--; return 0; err: connect_error(); return -1; } static int cma_handler(struct rdma_cm_id *cma_id, struct rdma_cm_event *event) { int ret = 0; switch (event->event) { case RDMA_CM_EVENT_ADDR_RESOLVED: ret = addr_handler(cma_id->context); break; case RDMA_CM_EVENT_ROUTE_RESOLVED: ret = route_handler(cma_id->context); break; case RDMA_CM_EVENT_CONNECT_REQUEST: ret = connect_handler(cma_id); break; case RDMA_CM_EVENT_ESTABLISHED: ret = resolved_handler(cma_id->context, event); break; case RDMA_CM_EVENT_ADDR_ERROR: case RDMA_CM_EVENT_ROUTE_ERROR: case RDMA_CM_EVENT_CONNECT_ERROR: case RDMA_CM_EVENT_UNREACHABLE: case RDMA_CM_EVENT_REJECTED: printf("udaddy: event: %s, error: %d\n", rdma_event_str(event->event), event->status); connect_error(); ret = event->status; break; case RDMA_CM_EVENT_DEVICE_REMOVAL: /* Cleanup will occur after test completes. */ break; default: break; } return ret; } static void destroy_node(struct cmatest_node *node) { if (!node->cma_id) return; if (node->ah) ibv_destroy_ah(node->ah); if (node->cma_id->qp) rdma_destroy_qp(node->cma_id); if (node->cq) ibv_destroy_cq(node->cq); if (node->mem) { ibv_dereg_mr(node->mr); free(node->mem); } if (node->pd) ibv_dealloc_pd(node->pd); /* Destroy the RDMA ID after all device resources */ rdma_destroy_id(node->cma_id); } static int alloc_nodes(void) { int ret, i; test.nodes = malloc(sizeof *test.nodes * connections); if (!test.nodes) { printf("udaddy: unable to allocate memory for test nodes\n"); return -ENOMEM; } memset(test.nodes, 0, sizeof *test.nodes * connections); for (i = 0; i < connections; i++) { test.nodes[i].id = i; if (dst_addr) { ret = rdma_create_id(test.channel, &test.nodes[i].cma_id, &test.nodes[i], hints.ai_port_space); if (ret) goto err; } } return 0; err: while (--i >= 0) rdma_destroy_id(test.nodes[i].cma_id); free(test.nodes); return ret; } static void destroy_nodes(void) { int i; for (i = 0; i < connections; i++) destroy_node(&test.nodes[i]); free(test.nodes); } static int create_reply_ah(struct cmatest_node *node, struct ibv_wc *wc) { struct ibv_qp_attr attr; struct ibv_qp_init_attr init_attr; node->ah = ibv_create_ah_from_wc(node->pd, wc, node->mem, node->cma_id->port_num); if (!node->ah) return -1; node->remote_qpn = be32toh(wc->imm_data); if (ibv_query_qp(node->cma_id->qp, &attr, IBV_QP_QKEY, &init_attr)) return -1; node->remote_qkey = attr.qkey; return 0; } static int poll_cqs(void) { struct ibv_wc wc[8]; int done, i, ret, rc; for (i = 0; i < connections; i++) { if (!test.nodes[i].connected) continue; for (done = 0; done < message_count; done += ret) { ret = ibv_poll_cq(test.nodes[i].cq, 8, wc); if (ret < 0) { printf("udaddy: failed polling CQ: %d\n", ret); return ret; } if (ret && !test.nodes[i].ah) { rc = create_reply_ah(&test.nodes[i], wc); if (rc) { printf("udaddy: failed to create reply AH\n"); return rc; } } } } return 0; } static int connect_events(void) { struct rdma_cm_event *event; int ret = 0; while (test.connects_left && !ret) { ret = rdma_get_cm_event(test.channel, &event); if (!ret) { ret = cma_handler(event->id, event); rdma_ack_cm_event(event); } } return ret; } static 
int run_server(void) { struct rdma_cm_id *listen_id; int i, ret; printf("udaddy: starting server\n"); ret = rdma_create_id(test.channel, &listen_id, &test, hints.ai_port_space); if (ret) { perror("udaddy: listen request failed"); return ret; } ret = get_rdma_addr(src_addr, dst_addr, port, &hints, &test.rai); if (ret) goto out; ret = rdma_bind_addr(listen_id, test.rai->ai_src_addr); if (ret) { perror("udaddy: bind address failed"); goto out; } ret = rdma_listen(listen_id, 0); if (ret) { perror("udaddy: failure trying to listen"); goto out; } connect_events(); if (message_count) { printf("receiving data transfers\n"); ret = poll_cqs(); if (ret) goto out; printf("sending replies\n"); for (i = 0; i < connections; i++) { ret = post_sends(&test.nodes[i], IBV_SEND_SIGNALED); if (ret) goto out; } ret = poll_cqs(); if (ret) goto out; printf("data transfers complete\n"); } out: rdma_destroy_id(listen_id); return ret; } static int run_client(void) { int i, ret; printf("udaddy: starting client\n"); ret = get_rdma_addr(src_addr, dst_addr, port, &hints, &test.rai); if (ret) return ret; printf("udaddy: connecting\n"); for (i = 0; i < connections; i++) { ret = rdma_resolve_addr(test.nodes[i].cma_id, test.rai->ai_src_addr, test.rai->ai_dst_addr, 2000); if (ret) { perror("udaddy: failure getting addr"); connect_error(); return ret; } } ret = connect_events(); if (ret) goto out; if (message_count) { printf("initiating data transfers\n"); for (i = 0; i < connections; i++) { ret = post_sends(&test.nodes[i], 0); if (ret) goto out; } printf("receiving data transfers\n"); ret = poll_cqs(); if (ret) goto out; printf("data transfers complete\n"); } out: return ret; } int main(int argc, char **argv) { int op, ret; hints.ai_port_space = RDMA_PS_UDP; while ((op = getopt(argc, argv, "s:b:c:C:S:t:p:P:f:")) != -1) { switch (op) { case 's': dst_addr = optarg; break; case 'b': src_addr = optarg; break; case 'c': connections = atoi(optarg); break; case 'C': message_count = atoi(optarg); break; case 'S': message_size = atoi(optarg); break; case 't': set_tos = 1; tos = (uint8_t) strtoul(optarg, NULL, 0); break; case 'p': /* for backwards compatibility - use -P */ hints.ai_port_space = strtol(optarg, NULL, 0); break; case 'f': if (!strncasecmp("ip", optarg, 2)) { hints.ai_flags = RAI_NUMERICHOST; } else if (!strncasecmp("gid", optarg, 3)) { hints.ai_flags = RAI_NUMERICHOST | RAI_FAMILY; hints.ai_family = AF_IB; } else if (strncasecmp("name", optarg, 4)) { fprintf(stderr, "Warning: unknown address format\n"); } break; case 'P': if (!strncasecmp("ipoib", optarg, 5)) { hints.ai_port_space = RDMA_PS_IPOIB; } else if (strncasecmp("udp", optarg, 3)) { fprintf(stderr, "Warning: unknown port space format\n"); } break; default: printf("usage: %s\n", argv[0]); printf("\t[-s server_address]\n"); printf("\t[-b bind_address]\n"); printf("\t[-f address_format]\n"); printf("\t name, ip, ipv6, or gid\n"); printf("\t[-P port_space]\n"); printf("\t udp or ipoib\n"); printf("\t[-c connections]\n"); printf("\t[-C message_count]\n"); printf("\t[-S message_size]\n"); printf("\t[-t type_of_service]\n"); printf("\t[-p port_space - %#x for UDP (default), " "%#x for IPOIB]\n", RDMA_PS_UDP, RDMA_PS_IPOIB); exit(1); } } test.connects_left = connections; test.channel = create_event_channel(); if (!test.channel) { exit(1); } if (alloc_nodes()) exit(1); if (dst_addr) { ret = run_client(); } else { hints.ai_flags |= RAI_PASSIVE; ret = run_server(); } printf("test complete\n"); destroy_nodes(); rdma_destroy_event_channel(test.channel); 
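/* The rdma_addrinfo list (test.rai) was allocated by get_rdma_addr() in run_server() or run_client(). It is not tied to the event channel, so it can be released here, after all CM resources have been torn down. */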
rdma_freeaddrinfo(test.rai); printf("return status %d\n", ret); return ret; } rdma-core-56.1/librdmacm/examples/udpong.c000066400000000000000000000273071477342711600205530ustar00rootroot00000000000000/* * Copyright (c) 2012 Intel Corporation. All rights reserved. * * This software is available to you under the OpenIB.org BSD license * below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "common.h" static int test_size[] = { (1 << 6), (1 << 7), ((1 << 7) + (1 << 6)), (1 << 8), ((1 << 8) + (1 << 7)), (1 << 9), ((1 << 9) + (1 << 8)), (1 << 10), ((1 << 10) + (1 << 9)), }; #define TEST_CNT (sizeof test_size / sizeof test_size[0]) enum { msg_op_login, msg_op_start, msg_op_data, msg_op_echo, msg_op_end }; struct message { uint8_t op; uint8_t id; uint8_t seqno; uint8_t reserved; __be32 data; uint8_t buf[2048]; }; #define CTRL_MSG_SIZE 16 struct client { uint64_t recvcnt; }; static struct client clients[256]; static uint8_t id; static int rs; static int use_async; static int flags = MSG_DONTWAIT; static int poll_timeout; static int custom; static int echo; static int transfer_size = 1000; static int transfer_count = 1000; static int buffer_size; static char test_name[10] = "custom"; static const char *port = "7174"; static char *dst_addr; static char *src_addr; static union socket_addr g_addr; static socklen_t g_addrlen; static struct timeval start, end; static struct message g_msg; static void show_perf(void) { char str[32]; float usec; long long bytes; int transfers; usec = (end.tv_sec - start.tv_sec) * 1000000 + (end.tv_usec - start.tv_usec); transfers = echo ? transfer_count * 2 : be32toh(g_msg.data); bytes = (long long) transfers * transfer_size; /* name size transfers bytes seconds Gb/sec usec/xfer */ printf("%-10s", test_name); size_str(str, sizeof str, transfer_size); printf("%-8s", str); cnt_str(str, sizeof str, transfers); printf("%-8s", str); size_str(str, sizeof str, bytes); printf("%-8s", str); printf("%8.2fs%10.2f%11.2f\n", usec / 1000000., (bytes * 8) / (1000.
* usec), (usec / transfers)); } static void init_latency_test(int size) { char sstr[5]; size_str(sstr, sizeof sstr, size); snprintf(test_name, sizeof test_name, "%s_lat", sstr); transfer_size = size; transfer_count = size_to_count(transfer_size) / 10; echo = 1; } static void init_bandwidth_test(int size) { char sstr[5]; size_str(sstr, sizeof sstr, size); snprintf(test_name, sizeof test_name, "%s_bw", sstr); transfer_size = size; transfer_count = size_to_count(transfer_size); echo = 0; } static void set_options(int fd) { int val; if (buffer_size) { rs_setsockopt(fd, SOL_SOCKET, SO_SNDBUF, (void *) &buffer_size, sizeof buffer_size); rs_setsockopt(fd, SOL_SOCKET, SO_RCVBUF, (void *) &buffer_size, sizeof buffer_size); } else { val = 1 << 19; rs_setsockopt(fd, SOL_SOCKET, SO_SNDBUF, (void *) &val, sizeof val); rs_setsockopt(fd, SOL_SOCKET, SO_RCVBUF, (void *) &val, sizeof val); } if (flags & MSG_DONTWAIT) rs_fcntl(fd, F_SETFL, O_NONBLOCK); } static ssize_t svr_send(struct message *msg, size_t size, union socket_addr *addr, socklen_t addrlen) { struct pollfd fds; ssize_t ret; if (use_async) { fds.fd = rs; fds.events = POLLOUT; } do { if (use_async) { ret = do_poll(&fds, poll_timeout); if (ret) return ret; } ret = rs_sendto(rs, msg, size, flags, &addr->sa, addrlen); } while (ret < 0 && (errno == EWOULDBLOCK || errno == EAGAIN)); if (ret < 0) perror("rsend"); return ret; } static ssize_t svr_recv(struct message *msg, size_t size, union socket_addr *addr, socklen_t *addrlen) { struct pollfd fds; ssize_t ret; if (use_async) { fds.fd = rs; fds.events = POLLIN; } do { if (use_async) { ret = do_poll(&fds, poll_timeout); if (ret) return ret; } ret = rs_recvfrom(rs, msg, size, flags, &addr->sa, addrlen); } while (ret < 0 && (errno == EWOULDBLOCK || errno == EAGAIN)); if (ret < 0) perror("rrecv"); return ret; } static int svr_process(struct message *msg, size_t size, union socket_addr *addr, socklen_t addrlen) { char str[64]; ssize_t ret; switch (msg->op) { case msg_op_login: if (addr->sa.sa_family == AF_INET) { printf("client login from %s\n", inet_ntop(AF_INET, &addr->sin.sin_addr.s_addr, str, sizeof str)); } else { printf("client login from %s\n", inet_ntop(AF_INET6, &addr->sin6.sin6_addr.s6_addr, str, sizeof str)); } msg->id = id++; /* fall through */ case msg_op_start: memset(&clients[msg->id], 0, sizeof clients[msg->id]); break; case msg_op_echo: clients[msg->id].recvcnt++; break; case msg_op_end: msg->data = htobe32(clients[msg->id].recvcnt); break; default: clients[msg->id].recvcnt++; return 0; } ret = svr_send(msg, size, addr, addrlen); return (ret == size) ? 
0 : (int) ret; } static int svr_bind(void) { struct addrinfo hints, *res; int ret; memset(&hints, 0, sizeof hints); hints.ai_socktype = SOCK_DGRAM; ret = getaddrinfo(src_addr, port, &hints, &res); if (ret) { printf("getaddrinfo: %s\n", gai_strerror(ret)); return ret; } rs = rs_socket(res->ai_family, res->ai_socktype, res->ai_protocol); if (rs < 0) { ret = rs; goto out; } set_options(rs); ret = rs_bind(rs, res->ai_addr, res->ai_addrlen); if (ret) { perror("rbind"); rs_close(rs); } out: free(res); return ret; } static int svr_run(void) { ssize_t len; int ret; ret = svr_bind(); while (!ret) { g_addrlen = sizeof g_addr; len = svr_recv(&g_msg, sizeof g_msg, &g_addr, &g_addrlen); if (len < 0) return len; ret = svr_process(&g_msg, len, &g_addr, g_addrlen); } return ret; } static ssize_t client_send(struct message *msg, size_t size) { struct pollfd fds; int ret; if (use_async) { fds.fd = rs; fds.events = POLLOUT; } do { if (use_async) { ret = do_poll(&fds, poll_timeout); if (ret) return ret; } ret = rs_send(rs, msg, size, flags); } while (ret < 0 && (errno == EWOULDBLOCK || errno == EAGAIN)); if (ret < 0) perror("rsend"); return ret; } static ssize_t client_recv(struct message *msg, size_t size, int timeout) { struct pollfd fds; int ret; if (timeout) { fds.fd = rs; fds.events = POLLIN; ret = rs_poll(&fds, 1, timeout); if (ret <= 0) return ret; } ret = rs_recv(rs, msg, size, flags | MSG_DONTWAIT); if (ret < 0 && errno != EWOULDBLOCK && errno != EAGAIN) perror("rrecv"); return ret; } static int client_send_recv(struct message *msg, size_t size, int timeout) { static uint8_t seqno; int ret; msg->seqno = seqno; do { ret = client_send(msg, size); if (ret != size) return ret; ret = client_recv(msg, size, timeout); } while (ret <= 0 || msg->seqno != seqno); seqno++; return ret; } static int run_test(void) { int ret, i; g_msg.op = msg_op_start; ret = client_send_recv(&g_msg, CTRL_MSG_SIZE, 1000); if (ret != CTRL_MSG_SIZE) goto out; g_msg.op = echo ? msg_op_echo : msg_op_data; gettimeofday(&start, NULL); for (i = 0; i < transfer_count; i++) { ret = echo ? 
client_send_recv(&g_msg, transfer_size, 1) : client_send(&g_msg, transfer_size); if (ret != transfer_size) goto out; } g_msg.op = msg_op_end; ret = client_send_recv(&g_msg, CTRL_MSG_SIZE, 1); if (ret != CTRL_MSG_SIZE) goto out; gettimeofday(&end, NULL); show_perf(); ret = 0; out: return ret; } static int client_connect(void) { struct addrinfo hints, *res; int ret; memset(&hints, 0, sizeof hints); hints.ai_socktype = SOCK_DGRAM; ret = getaddrinfo(dst_addr, port, &hints, &res); if (ret) { printf("getaddrinfo: %s\n", gai_strerror(ret)); return ret; } rs = rs_socket(res->ai_family, res->ai_socktype, res->ai_protocol); if (rs < 0) { ret = rs; goto out; } set_options(rs); ret = rs_connect(rs, res->ai_addr, res->ai_addrlen); if (ret) { if (errno == ENODEV) fprintf(stderr, "No RDMA devices were detected\n"); else perror("rconnect"); rs_close(rs); goto out; } g_msg.op = msg_op_login; ret = client_send_recv(&g_msg, CTRL_MSG_SIZE, 1000); if (ret == CTRL_MSG_SIZE) ret = 0; out: freeaddrinfo(res); return ret; } static int client_run(void) { int i, ret; printf("%-10s%-8s%-8s%-8s%8s %10s%13s\n", "name", "bytes", "xfers", "total", "time", "Gb/sec", "usec/xfer"); ret = client_connect(); if (ret) return ret; if (!custom) { for (i = 0; i < TEST_CNT; i++) { init_latency_test(test_size[i]); run_test(); } for (i = 0; i < TEST_CNT; i++) { init_bandwidth_test(test_size[i]); run_test(); } } else { run_test(); } rs_close(rs); return ret; } static int set_test_opt(const char *arg) { if (strlen(arg) == 1) { switch (arg[0]) { case 's': use_rs = 0; break; case 'a': use_async = 1; break; case 'b': flags = 0; break; case 'n': flags = MSG_DONTWAIT; break; case 'e': echo = 1; break; default: return -1; } } else { if (!strncasecmp("socket", arg, 6)) { use_rs = 0; } else if (!strncasecmp("async", arg, 5)) { use_async = 1; } else if (!strncasecmp("block", arg, 5)) { flags = 0; } else if (!strncasecmp("nonblock", arg, 8)) { flags = MSG_DONTWAIT; } else if (!strncasecmp("echo", arg, 4)) { echo = 1; } else { return -1; } } return 0; } int main(int argc, char **argv) { int op, ret; while ((op = getopt(argc, argv, "s:b:B:C:S:p:T:")) != -1) { switch (op) { case 's': dst_addr = optarg; break; case 'b': src_addr = optarg; break; case 'B': buffer_size = atoi(optarg); break; case 'C': custom = 1; transfer_count = atoi(optarg); break; case 'S': custom = 1; transfer_size = atoi(optarg); if (transfer_size < CTRL_MSG_SIZE) { printf("size must be at least %d bytes\n", CTRL_MSG_SIZE); exit(1); } break; case 'p': port = optarg; break; case 'T': if (!set_test_opt(optarg)) break; /* invalid option - fall through */ SWITCH_FALLTHROUGH; default: printf("usage: %s\n", argv[0]); printf("\t[-s server_address]\n"); printf("\t[-b bind_address]\n"); printf("\t[-B buffer_size]\n"); printf("\t[-C transfer_count]\n"); printf("\t[-S transfer_size]\n"); printf("\t[-p port_number]\n"); printf("\t[-T test_option]\n"); printf("\t s|sockets - use standard tcp/ip sockets\n"); printf("\t a|async - asynchronous operation (use poll)\n"); printf("\t b|blocking - use blocking calls\n"); printf("\t n|nonblocking - use nonblocking calls\n"); printf("\t e|echo - server echoes all messages\n"); exit(1); } } if (flags) poll_timeout = -1; ret = dst_addr ? client_run() : svr_run(); return ret; } rdma-core-56.1/librdmacm/ib.h000066400000000000000000000055211477342711600160320ustar00rootroot00000000000000/* * Copyright (c) 2010 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #if !defined(_RDMA_IB_H) #define _RDMA_IB_H #include #include #include #ifndef AF_IB #define AF_IB 27 #endif #ifndef PF_IB #define PF_IB AF_IB #endif struct ib_addr { union { __u8 uib_addr8[16]; __be16 uib_addr16[8]; __be32 uib_addr32[4]; __be64 uib_addr64[2]; } ib_u; #define sib_addr8 ib_u.uib_addr8 #define sib_addr16 ib_u.uib_addr16 #define sib_addr32 ib_u.uib_addr32 #define sib_addr64 ib_u.uib_addr64 #define sib_raw ib_u.uib_addr8 #define sib_subnet_prefix ib_u.uib_addr64[0] #define sib_interface_id ib_u.uib_addr64[1] }; static inline int ib_addr_any(const struct ib_addr *a) { return ((a->sib_addr64[0] | a->sib_addr64[1]) == 0); } static inline int ib_addr_loopback(const struct ib_addr *a) { return ((a->sib_addr32[0] | a->sib_addr32[1] | a->sib_addr32[2] | (a->sib_addr32[3] ^ htobe32(1))) == 0); } static inline void ib_addr_set(struct ib_addr *addr, __be32 w1, __be32 w2, __be32 w3, __be32 w4) { addr->sib_addr32[0] = w1; addr->sib_addr32[1] = w2; addr->sib_addr32[2] = w3; addr->sib_addr32[3] = w4; } static inline int ib_addr_cmp(const struct ib_addr *a1, const struct ib_addr *a2) { return memcmp(a1, a2, sizeof(struct ib_addr)); } struct sockaddr_ib { unsigned short int sib_family; /* AF_IB */ __be16 sib_pkey; __be32 sib_flowinfo; struct ib_addr sib_addr; __be64 sib_sid; __be64 sib_sid_mask; __u64 sib_scope_id; }; #endif /* _RDMA_IB_H */ rdma-core-56.1/librdmacm/indexer.c000066400000000000000000000102621477342711600170670ustar00rootroot00000000000000/* * Copyright (c) 2011 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include "indexer.h" /* * Indexer - to find a structure given an index * * We store pointers using a double lookup and return an index to the * user which is then used to retrieve the pointer. The upper bits of * the index are itself an index into an array of memory allocations. * The lower bits specify the offset into the allocated memory where * the pointer is stored. * * This allows us to adjust the number of pointers stored by the index * list without taking a lock during data lookups. */ static int idx_grow(struct indexer *idx) { union idx_entry *entry; int i, start_index; if (idx->size >= IDX_ARRAY_SIZE) goto nomem; idx->array[idx->size] = calloc(IDX_ENTRY_SIZE, sizeof(union idx_entry)); if (!idx->array[idx->size]) goto nomem; entry = idx->array[idx->size]; start_index = idx->size << IDX_ENTRY_BITS; entry[IDX_ENTRY_SIZE - 1].next = idx->free_list; for (i = IDX_ENTRY_SIZE - 2; i >= 0; i--) entry[i].next = start_index + i + 1; /* Index 0 is reserved */ if (start_index == 0) start_index++; idx->free_list = start_index; idx->size++; return start_index; nomem: errno = ENOMEM; return -1; } int idx_insert(struct indexer *idx, void *item) { union idx_entry *entry; int index; if ((index = idx->free_list) == 0) { if ((index = idx_grow(idx)) <= 0) return index; } entry = idx->array[idx_array_index(index)]; idx->free_list = entry[idx_entry_index(index)].next; entry[idx_entry_index(index)].item = item; return index; } void *idx_remove(struct indexer *idx, int index) { union idx_entry *entry; void *item; entry = idx->array[idx_array_index(index)]; item = entry[idx_entry_index(index)].item; entry[idx_entry_index(index)].next = idx->free_list; idx->free_list = index; return item; } void idx_replace(struct indexer *idx, int index, void *item) { union idx_entry *entry; entry = idx->array[idx_array_index(index)]; entry[idx_entry_index(index)].item = item; } static int idm_grow(struct index_map *idm, int index) { idm->array[idx_array_index(index)] = calloc(IDX_ENTRY_SIZE, sizeof(void *)); if (!idm->array[idx_array_index(index)]) goto nomem; return index; nomem: errno = ENOMEM; return -1; } int idm_set(struct index_map *idm, int index, void *item) { void **entry; if (index > IDX_MAX_INDEX) { errno = ENOMEM; return -1; } if (!idm->array[idx_array_index(index)]) { if (idm_grow(idm, index) < 0) return -1; } entry = idm->array[idx_array_index(index)]; entry[idx_entry_index(index)] = item; return index; } void *idm_clear(struct index_map *idm, int index) { void **entry; void *item; entry = idm->array[idx_array_index(index)]; item = entry[idx_entry_index(index)]; entry[idx_entry_index(index)] = NULL; return item; } rdma-core-56.1/librdmacm/indexer.h000066400000000000000000000076231477342711600171030ustar00rootroot00000000000000/* * Copyright (c) 2011 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #if !defined(INDEXER_H) #define INDEXER_H #include #include #include /* * Indexer - to find a structure given an index. Synchronization * must be provided by the caller. Caller must initialize the * indexer by setting free_list and size to 0. */ union idx_entry { void *item; int next; }; #define IDX_INDEX_BITS 16 #define IDX_ENTRY_BITS 10 #define IDX_ENTRY_SIZE (1 << IDX_ENTRY_BITS) #define IDX_ARRAY_SIZE (1 << (IDX_INDEX_BITS - IDX_ENTRY_BITS)) #define IDX_MAX_INDEX ((1 << IDX_INDEX_BITS) - 1) struct indexer { union idx_entry *array[IDX_ARRAY_SIZE]; int free_list; int size; }; #define idx_array_index(index) (index >> IDX_ENTRY_BITS) #define idx_entry_index(index) (index & (IDX_ENTRY_SIZE - 1)) int idx_insert(struct indexer *idx, void *item); void *idx_remove(struct indexer *idx, int index); void idx_replace(struct indexer *idx, int index, void *item); static inline void *idx_at(struct indexer *idx, int index) { return (idx->array[idx_array_index(index)] + idx_entry_index(index))->item; } /* * Index map - associates a structure with an index. Synchronization * must be provided by the caller. Caller must initialize the * index map by setting it to 0. */ struct index_map { void **array[IDX_ARRAY_SIZE]; }; int idm_set(struct index_map *idm, int index, void *item); void *idm_clear(struct index_map *idm, int index); static inline void *idm_at(struct index_map *idm, int index) { void **entry; entry = idm->array[idx_array_index(index)]; return entry[idx_entry_index(index)]; } static inline void *idm_lookup(struct index_map *idm, int index) { return ((index <= IDX_MAX_INDEX) && idm->array[idx_array_index(index)]) ? 
idm_at(idm, index) : NULL; } typedef struct _dlist_entry { struct _dlist_entry *next; struct _dlist_entry *prev; } dlist_entry; static inline void dlist_init(dlist_entry *head) { head->next = head; head->prev = head; } static inline int dlist_empty(dlist_entry *head) { return head->next == head; } static inline void dlist_insert_after(dlist_entry *item, dlist_entry *head) { item->next = head->next; item->prev = head; head->next->prev = item; head->next = item; } static inline void dlist_insert_before(dlist_entry *item, dlist_entry *head) { dlist_insert_after(item, head->prev); } #define dlist_insert_head dlist_insert_after #define dlist_insert_tail dlist_insert_before static inline void dlist_remove(dlist_entry *item) { item->prev->next = item->next; item->next->prev = item->prev; } #endif /* INDEXER_H */ rdma-core-56.1/librdmacm/librdmacm.map000066400000000000000000000026121477342711600177160ustar00rootroot00000000000000/* Do not change this file without reading Documentation/versioning.md */ RDMACM_1.0 { global: rdma_create_event_channel; rdma_destroy_event_channel; rdma_create_id; rdma_destroy_id; rdma_bind_addr; rdma_resolve_addr; rdma_resolve_route; rdma_create_qp; rdma_destroy_qp; rdma_connect; rdma_listen; rdma_accept; rdma_reject; rdma_notify; rdma_disconnect; rdma_get_cm_event; rdma_ack_cm_event; rdma_get_src_port; rdma_get_dst_port; rdma_join_multicast; rdma_leave_multicast; rdma_get_devices; rdma_free_devices; rdma_event_str; rdma_set_option; rdma_migrate_id; rdma_getaddrinfo; rdma_freeaddrinfo; rdma_get_request; rdma_create_ep; rdma_destroy_ep; rdma_create_srq; rdma_destroy_srq; rsocket; rbind; rlisten; raccept; rconnect; rshutdown; rclose; rrecv; rrecvfrom; rrecvmsg; rsend; rsendto; rsendmsg; rread; rreadv; rwrite; rwritev; rpoll; rselect; rgetpeername; rgetsockname; rsetsockopt; rgetsockopt; rfcntl; rpoll; rselect; rdma_get_src_port; rdma_get_dst_port; riomap; riounmap; riowrite; rdma_create_srq_ex; rdma_create_qp_ex; local: *; }; RDMACM_1.1 { global: rdma_join_multicast_ex; } RDMACM_1.0; RDMACM_1.2 { global: rdma_establish; rdma_init_qp_attr; } RDMACM_1.1; RDMACM_1.3 { global: rdma_get_remote_ece; rdma_reject_ece; rdma_set_local_ece; } RDMACM_1.2; rdma-core-56.1/librdmacm/librspreload.map000066400000000000000000000007461477342711600204540ustar00rootroot00000000000000{ /* FIXME: It is probably not a great idea to not tag these with the proper symbol version from glibc, at least if glibc ever changes the signature this will go sideways.. 
*/ global: accept; bind; close; connect; dup2; fcntl; getpeername; getsockname; getsockopt; listen; poll; read; readv; recv; recvfrom; recvmsg; select; send; sendfile; sendmsg; sendto; setsockopt; shutdown; socket; write; writev; local: *; }; rdma-core-56.1/librdmacm/man/000077500000000000000000000000001477342711600160375ustar00rootroot00000000000000rdma-core-56.1/librdmacm/man/CMakeLists.txt000066400000000000000000000025461477342711600206060ustar00rootroot00000000000000rdma_man_pages( cmtime.1 mckey.1 rcopy.1 rdma_accept.3 rdma_ack_cm_event.3 rdma_bind_addr.3 rdma_client.1 rdma_cm.7 rdma_connect.3 rdma_create_ep.3 rdma_create_event_channel.3 rdma_create_id.3 rdma_create_qp.3 rdma_create_srq.3 rdma_dereg_mr.3 rdma_destroy_ep.3 rdma_destroy_event_channel.3 rdma_destroy_id.3 rdma_destroy_qp.3 rdma_destroy_srq.3 rdma_disconnect.3 rdma_establish.3.md rdma_event_str.3 rdma_free_devices.3 rdma_freeaddrinfo.3.in.rst rdma_get_cm_event.3 rdma_get_devices.3 rdma_get_dst_port.3 rdma_get_local_addr.3 rdma_get_peer_addr.3 rdma_get_recv_comp.3 rdma_get_remote_ece.3.md rdma_get_request.3 rdma_get_send_comp.3 rdma_get_src_port.3 rdma_getaddrinfo.3 rdma_init_qp_attr.3.md rdma_join_multicast.3 rdma_join_multicast_ex.3 rdma_leave_multicast.3 rdma_listen.3 rdma_migrate_id.3 rdma_notify.3 rdma_post_read.3 rdma_post_readv.3 rdma_post_recv.3 rdma_post_recvv.3 rdma_post_send.3 rdma_post_sendv.3 rdma_post_ud_send.3 rdma_post_write.3 rdma_post_writev.3 rdma_reg_msgs.3 rdma_reg_read.3 rdma_reg_write.3 rdma_reject.3 rdma_resolve_addr.3 rdma_resolve_route.3 rdma_server.1 rdma_set_local_ece.3.md rdma_set_option.3 rdma_xclient.1 rdma_xserver.1 riostream.1 rping.1 rsocket.7.in rstream.1 ucmatose.1 udaddy.1 udpong.1 ) rdma-core-56.1/librdmacm/man/cmtime.1000066400000000000000000000066251477342711600174060ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "CMTIME" 1 "2017-04-28" "librdmacm" "librdmacm" librdmacm .SH NAME cmtime \- RDMA CM connection steps timing test. .SH SYNOPSIS .sp .nf \fIcmtime\fR [-s server_address] [-b bind_address] [-c connections] [-p port_number] [-q base_qpn] [-r retries] [-t timeout_ms] .fi .SH "DESCRIPTION" Determines min, max, and average times for various "steps" in RDMA CM connection setup and teardown between a client and server application. "Steps" that are timed are: create ID, bind address, resolve address, resolve route, create QP, modify QP to INIT, modify QP to RTR, modify QP to RTS, CM connect, client establish, disconnect, destroy QP, and destroy ID. Many operations are asynchronous, allowing progress on multiple connections simultaneously. The 'sum' output adds the time that all connections took for a given step. The average 'us/conn' is the sum divided by the number of connections. This is useful to identify steps which take a significant amount of time. The min and max values are the smallest and largest times that any single connection took to complete a given step. The 'total' and 'avg/iter' times measure the time to complete a given step for all connections. These two values take into account asynchronous operations. For steps which are serial, the total and sum values will be roughly the same. For asynchronous steps, the total may be significantly lower than the sum, as multiple connections will be in progress simultaneously. The avg/iter is the total time divided by the number of connections. In many cases, times may not be available, or may be available only on the client. In such situations, the output will show 0.
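.P As a hypothetical illustration: with 100 connections, a step reporting a sum of 50000 us (500 us/conn) but a total of only 5000 us (50 us avg/iter) overlapped heavily, since the elapsed time for the whole step is a tenth of the accumulated per-connection time.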
.SH "OPTIONS" .TP \-s server_address The network name or IP address of the server system listening for connections. The used name or address must route over an RDMA device. This option must be specified by the client. .TP \-b bind_address The local network address to bind to. .TP \-c connections The number of connections to establish between the client and server. (default 100) .TP \-p port_number The server's port number. .TP \-q base_qpn The first QP number to use when creating connections without allocating hardware QPs. The test will use the values between base_qpn and base_qpn plus connections when connecting. (default 1000) .TP \-n num_threads Sets the number of threads to spawn used to process connection events and hardware operations. (default 1) .TP \-m mimic_qp_delay_us "Simulates" QP creation and modify calls by replacing them with a simple sleep function instead. This allows testing the CM at larger scale than would be practical, or even possible given system configuration settings, if HW resources needed to be allocated. .TP \-r retries Number of retries when resolving address or route. (default 2) .TP \-S Run connection rate test using sockets. This provides a baseline comparison of RDMA connections versus TCP connections. Sockets are set to blocking mode. .TP \-t timeout_ms Timeout in millseconds (ms) when resolving address or route. (default 2000 - 2 seconds) .SH "NOTES" Basic usage is to start cmtime on a server system, then run cmtime -s server_name on a client system. .P Because this test maps RDMA resources to userspace, users must ensure that they have available system resources and permissions. See the libibverbs README file for additional details. .SH "SEE ALSO" rdma_cm(7) rdma-core-56.1/librdmacm/man/mckey.1000066400000000000000000000050771477342711600172420ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "MCKEY" 1 "2007-05-15" "librdmacm" "librdmacm" librdmacm .SH NAME mckey \- RDMA CM multicast setup and simple data transfer test. .SH SYNOPSIS .sp .nf \fImckey\fR -m multicast_address [-s] [-b bind_address] [-c connections] [-C message_count] [-S message_size] [-p port_space] \fImckey\fR -m multicast_address -s [-b bind_address] [-c connections] [-C message_count] [-S message_size] [-p port_space] \fImckey\fR -M unmapped_multicast_address -b bind_address [-s] [-c connections] [-C message_count] [-S message_size] [-p port_space] .fi .SH "DESCRIPTION" Establishes a set of RDMA multicast communication paths between nodes using the librdmacm, optionally transfers datagrams to receiving nodes, then tears down the communication. .SH "OPTIONS" .TP \-m multicast_address IP multicast address to join. .TP \-M unmapped_multicast_address RDMA transport specific multicast address to join. .TP \-s Send datagrams to the multicast group. .TP \-b bind_address The local network address to bind to. .TP \-c connections The number of QPs to join the multicast group. (default 1) .TP \-C message_count The number of messages to transfer over each connection. (default 10) .TP \-S message_size The size of each message transferred, in bytes. This value must be smaller than the MTU of the underlying RDMA transport, or an error will occur. (default 100) .TP \-o Join the multicast group as a send-only full-member. Otherwise the group is joined as a full-member. .TP .TP \-l Prevent multicast message loopback. Other receivers on the local system will not receive the multicast messages. 
Otherwise all multicast messages are also sent to the host they originated from and local listeners (and probably the sending process itself) will receive the messages. .TP \-p port_space The port space of the datagram communication. May be either the RDMA UDP (0x0111) or IPoIB (0x0002) port space. (default RDMA_PS_UDP) .SH "NOTES" Basic usage is to start mckey -m multicast_address on a server system, then run mckey -m multicast_address -s on a client system. .P Unique InfiniBand SA-assigned multicast GIDs can be retrieved by invoking mckey with a zero MGID or IP address. (Example, -M 0 or -m 0.0.0.0). The assigned address will be displayed to allow mckey clients to join the created group. .P Because this test maps RDMA resources to userspace, users must ensure that they have available system resources and permissions. See the libibverbs README file for additional details. .SH "SEE ALSO" rdma_cm(7), ucmatose(1), udaddy(1), rping(1) rdma-core-56.1/librdmacm/man/rcopy.1000066400000000000000000000021331477342711600172540ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RCOPY" 1 "2011-12-2" "librdmacm" "librdmacm" librdmacm .SH NAME rcopy \- simple file copy over RDMA. .SH SYNOPSIS .sp .nf \fIrcopy\fR source server[:destination] [-p port] \fIrcopy\fR [-p port] .fi .SH "DESCRIPTION" Uses sockets over an RDMA interface to copy a source file to the specified destination. .SH "OPTIONS" .TP source The name and path of the source file to copy. .TP server The name or address of the destination server. .TP :destination An optional destination filename and path. If not given, the destination filename will match that of the source. .TP \-p server_port The server's port number. .SH "NOTES" Basic usage is to start rcopy on a server system, then run rcopy sourcefile servername. The server application will continue to run after copying the file, but is currently single-threaded. .P Because this test maps RDMA resources to userspace, users must ensure that they have available system resources and permissions. See the libibverbs README file for additional details. .SH "SEE ALSO" rdma_cm(7) rdma-core-56.1/librdmacm/man/rdma_accept.3000066400000000000000000000112431477342711600203660ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_ACCEPT" 3 "2014-05-27" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_accept \- Called to accept a connection request. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_accept .BI "(struct rdma_cm_id *" id "," .BI "struct rdma_conn_param *" conn_param ");" .SH ARGUMENTS .IP "id" 12 Connection identifier associated with the request. .IP "conn_param" 12 Information needed to establish the connection. See CONNECTION PROPERTIES below for details. .SH "DESCRIPTION" Called from the listening side to accept a connection or datagram service lookup request. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" Unlike the socket accept routine, rdma_accept is not called on a listening rdma_cm_id. Instead, after calling rdma_listen, the user waits for an RDMA_CM_EVENT_CONNECT_REQUEST event to occur. Connection request events give the user a newly created rdma_cm_id, similar to a new socket, but the rdma_cm_id is bound to a specific RDMA device. rdma_accept is called on the new rdma_cm_id.
.SH "CONNECTION PROPERTIES" The following properties are used to configure the communication and specified by the conn_param parameter when accepting a connection or datagram communication request. Users should use the rdma_conn_param values reported in the connection request event to determine appropriate values for these fields when accepting. Users may reference the rdma_conn_param structure in the connection event directly, or can reference their own structure. If the rdma_conn_param structure from an event is referenced, the event must not be acked until after this call returns. .P If the conn_param parameter is NULL, the values reported in the connection request event are used, adjusted down based on local hardware restrictions. .IP private_data References a user-controlled data buffer. The contents of the buffer are copied and transparently passed to the remote side as part of the communication request. May be NULL if private_data is not required. .IP private_data_len Specifies the size of the user-controlled data buffer. Note that the actual amount of data transferred to the remote side is transport dependent and may be larger than that requested. .IP responder_resources The maximum number of outstanding RDMA read and atomic operations that the local side will accept from the remote side. Applies only to RDMA_PS_TCP. This value must be less than or equal to the local RDMA device attribute max_qp_rd_atom, but preferably greater than or equal to the responder_resources value reported in the connect request event. .IP initiator_depth The maximum number of outstanding RDMA read and atomic operations that the local side will have to the remote side. Applies only to RDMA_PS_TCP. This value must be less than or equal to the local RDMA device attribute max_qp_init_rd_atom and the initiator_depth value reported in the connect request event. .IP flow_control Specifies if hardware flow control is available. This value is exchanged with the remote peer and is not used to configure the QP. Applies only to RDMA_PS_TCP. .IP retry_count This value is ignored. .IP rnr_retry_count The maximum number of times that a send operation from the remote peer should be retried on a connection after receiving a receiver not ready (RNR) error. RNR errors are generated when a send request arrives before a buffer has been posted to receive the incoming data. Applies only to RDMA_PS_TCP. .IP srq Specifies if the QP associated with the connection is using a shared receive queue. This field is ignored by the library if a QP has been created on the rdma_cm_id. Applies only to RDMA_PS_TCP. .IP qp_num Specifies the QP number associated with the connection. This field is ignored by the library if a QP has been created on the rdma_cm_id. .SH "INFINIBAND SPECIFIC" In addition to the connection properties defined above, InfiniBand QPs are configured with minimum RNR NAK timer and local ACK timeout values. The minimum RNR NAK timer value is set to 0, for a delay of 655 ms. The local ACK timeout is calculated based on the packet lifetime and local HCA ACK delay. The packet lifetime is determined by the InfiniBand Subnet Administrator and is part of the route (path record) information obtained by the active side of the connection. The HCA ACK delay is a property of the locally used HCA. .P The RNR retry count is a 3-bit value. .P The length of the private data provided by the user is limited to 196 bytes for RDMA_PS_TCP, or 136 bytes for RDMA_PS_UDP. 
.SH "SEE ALSO" rdma_listen(3), rdma_reject(3), rdma_get_cm_event(3) rdma-core-56.1/librdmacm/man/rdma_ack_cm_event.3000066400000000000000000000014651477342711600215520ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_ACK_CM_EVENT" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_ack_cm_event \- Free a communication event. .SH SYNOPSIS .B "#include " .P .B "int" rdma_ack_cm_event .BI "(struct rdma_cm_event *" event ");" .SH ARGUMENTS .IP "event" 12 Event to be released. .SH "DESCRIPTION" All events which are allocated by rdma_get_cm_event must be released, there should be a one-to-one correspondence between successful gets and acks. This call frees the event structure and any memory that it references. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "SEE ALSO" rdma_get_cm_event(3), rdma_destroy_id(3) rdma-core-56.1/librdmacm/man/rdma_bind_addr.3000066400000000000000000000024441477342711600210400ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_BIND_ADDR" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_bind_addr \- Bind an RDMA identifier to a source address. .SH SYNOPSIS .B "#include " .P .B "int" rdma_bind_addr .BI "(struct rdma_cm_id *" id "," .BI "struct sockaddr *" addr ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .IP "addr" 12 Local address information. Wildcard values are permitted. .SH "DESCRIPTION" Associates a source address with an rdma_cm_id. The address may be wildcarded. If binding to a specific local address, the rdma_cm_id will also be bound to a local RDMA device. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" Typically, this routine is called before calling rdma_listen to bind to a specific port number, but it may also be called on the active side of a connection before calling rdma_resolve_addr to bind to a specific address. .P If used to bind to port 0, the rdma_cm will select an available port, which can be retrieved with rdma_get_src_port(3). .SH "SEE ALSO" rdma_create_id(3), rdma_listen(3), rdma_resolve_addr(3), rdma_create_qp(3), rdma_get_local_addr(3), rdma_get_src_port(3) rdma-core-56.1/librdmacm/man/rdma_client.1000066400000000000000000000022071477342711600204030ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_CLIENT" 1 "2010-07-19" "librdmacm" "librdmacm" librdmacm .SH NAME rdma_client \- simple RDMA CM connection and ping-pong test. .SH SYNOPSIS .sp .nf \fIrdma_client\fR [-s server_address] [-p server_port] .fi .SH "DESCRIPTION" Uses synchronous librdmam calls to establish an RDMA connection between two nodes. This example is intended to provide a very simple coding example of how to use RDMA. .SH "OPTIONS" .TP \-s server_address Specifies the address of the system that the rdma_server is running on. By default, the client will attempt to connect to the server using 127.0.0.1. .TP \-p server_port Specifies the port number that the server listens on. By default the server listens on port 7471. .SH "NOTES" Basic usage is to start rdma_server, then connect to the server using the rdma_client program. 
.P Because this test maps RDMA resources to userspace, users must ensure that they have available system resources and permissions. See the libibverbs README file for additional details. .SH "SEE ALSO" rdma_cm(7), udaddy(1), mckey(1), rping(1), rdma_server(1) rdma-core-56.1/librdmacm/man/rdma_cm.7000066400000000000000000000200301477342711600175260ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_CM" 7 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_cm \- RDMA communication manager. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .SH "DESCRIPTION" Used to establish communication over RDMA transports. .SH "NOTES" The RDMA CM is a communication manager used to set up reliable, connected and unreliable datagram data transfers. It provides an RDMA transport neutral interface for establishing connections. The API concepts are based on sockets, but adapted for queue pair (QP) based semantics: communication must be over a specific RDMA device, and data transfers are message based. .P The RDMA CM can control both the QP and communication management (connection setup / teardown) portions of an RDMA API, or only the communication management piece. It works in conjunction with the verbs API defined by the libibverbs library. The libibverbs library provides the underlying interfaces needed to send and receive data. .P The RDMA CM can operate asynchronously or synchronously. The mode of operation is controlled by the user through the use of the rdma_cm event channel parameter in specific calls. If an event channel is provided, an rdma_cm identifier will report its event data (results of connecting, for example) on that channel. If a channel is not provided, then all rdma_cm operations for the selected rdma_cm identifier will block until they complete. .P The RDMA CM gives different libibverbs providers the option to advertise and use various provider-specific QP configuration options. This functionality is called ECE (enhanced connection establishment). .SH "RDMA VERBS" The rdma_cm supports the full range of verbs available through the libibverbs library and interfaces. However, it also provides wrapper functions for some of the more commonly used verbs functionality. The full set of abstracted verb calls is: .P rdma_reg_msgs - register an array of buffers for sending and receiving .P rdma_reg_read - registers a buffer for RDMA read operations .P rdma_reg_write - registers a buffer for RDMA write operations .P rdma_dereg_mr - deregisters a memory region .P rdma_post_recv - post a buffer to receive a message .P rdma_post_send - post a buffer to send a message .P rdma_post_read - post an RDMA to read data into a buffer .P rdma_post_write - post an RDMA to send data from a buffer .P rdma_post_recvv - post a vector of buffers to receive a message .P rdma_post_sendv - post a vector of buffers to send a message .P rdma_post_readv - post a vector of buffers to receive an RDMA read .P rdma_post_writev - post a vector of buffers to send an RDMA write .P rdma_post_ud_send - post a buffer to send a message on a UD QP .P rdma_get_send_comp - get completion status for a send or RDMA operation .P rdma_get_recv_comp - get information about a completed receive .SH "CLIENT OPERATION" This section provides a general overview of the basic operation for the active, or client, side of communication. This flow assumes asynchronous operation, with low-level call details shown.
For synchronous operation, calls to rdma_create_event_channel, rdma_get_cm_event, rdma_ack_cm_event, and rdma_destroy_event_channel would be eliminated. Abstracted calls, such as rdma_create_ep, encapsulate several of these calls under a single API. Users may also refer to the example applications for code samples. A general connection flow would be: .IP rdma_getaddrinfo retrieve address information of the destination .IP rdma_create_event_channel create channel to receive events .IP rdma_create_id allocate an rdma_cm_id; this is conceptually similar to a socket .IP rdma_resolve_addr obtain a local RDMA device to reach the remote address .IP rdma_get_cm_event wait for RDMA_CM_EVENT_ADDR_RESOLVED event .IP rdma_ack_cm_event ack event .IP rdma_create_qp allocate a QP for the communication .IP rdma_resolve_route determine the route to the remote address .IP rdma_get_cm_event wait for RDMA_CM_EVENT_ROUTE_RESOLVED event .IP rdma_ack_cm_event ack event .IP rdma_connect connect to the remote server .IP rdma_get_cm_event wait for RDMA_CM_EVENT_ESTABLISHED event .IP rdma_ack_cm_event ack event .P Perform data transfers over connection .IP rdma_disconnect tear-down connection .IP rdma_get_cm_event wait for RDMA_CM_EVENT_DISCONNECTED event .IP rdma_ack_cm_event ack event .IP rdma_destroy_qp destroy the QP .IP rdma_destroy_id release the rdma_cm_id .IP rdma_destroy_event_channel release the event channel .IP rdma_freeaddrinfo release the list of rdma_addrinfo structures .IP rdma_set_local_ece set desired ECE options .P An almost identical process is used to set up unreliable datagram (UD) communication between nodes. No actual connection is formed between QPs, however, so disconnection is not needed. .P Although this example shows the client initiating the disconnect, either side of a connection may initiate the disconnect. .SH "SERVER OPERATION" This section provides a general overview of the basic operation for the passive, or server, side of communication. A general connection flow would be: .IP rdma_create_event_channel create channel to receive events .IP rdma_create_id allocate an rdma_cm_id; this is conceptually similar to a socket .IP rdma_bind_addr set the local port number to listen on .IP rdma_listen begin listening for connection requests .IP rdma_get_cm_event wait for RDMA_CM_EVENT_CONNECT_REQUEST event with a new rdma_cm_id .IP rdma_create_qp allocate a QP for the communication on the new rdma_cm_id .IP rdma_accept accept the connection request .IP rdma_ack_cm_event ack event .IP rdma_get_cm_event wait for RDMA_CM_EVENT_ESTABLISHED event .IP rdma_ack_cm_event ack event .P Perform data transfers over connection .IP rdma_get_cm_event wait for RDMA_CM_EVENT_DISCONNECTED event .IP rdma_ack_cm_event ack event .IP rdma_disconnect tear-down connection .IP rdma_destroy_qp destroy the QP .IP rdma_destroy_id release the connected rdma_cm_id .IP rdma_destroy_id release the listening rdma_cm_id .IP rdma_destroy_event_channel release the event channel .IP rdma_get_remote_ece get ECE options sent by the client .IP rdma_set_local_ece set desired ECE options .SH "RETURN CODES" .IP "= 0" success .IP "= -1" error - see errno for more details .P Most librdmacm functions return 0 to indicate success, and a -1 return value to indicate failure. If a function operates asynchronously, a return value of 0 means that the operation was successfully started. The operation could still complete in error; users should check the status of the related event.
If the return value is -1, then errno will contain additional information regarding the reason for the failure. .P Prior versions of the library would return -errno and not set errno for some cases related to ENOMEM, ENODEV, ENODATA, EINVAL, and EADDRNOTAVAIL codes. Applications that want to check these codes and have compatibility with prior library versions must manually set errno to the negative of the return code if it is < -1. .SH "SEE ALSO" rdma_accept(3), rdma_ack_cm_event(3), rdma_bind_addr(3), rdma_connect(3), rdma_create_ep(3), rdma_create_event_channel(3), rdma_create_id(3), rdma_create_qp(3), rdma_dereg_mr(3), rdma_destroy_ep(3), rdma_destroy_event_channel(3), rdma_destroy_id(3), rdma_destroy_qp(3), rdma_disconnect(3), rdma_event_str(3), rdma_free_devices(3), rdma_freeaddrinfo(3), rdma_getaddrinfo(3), rdma_get_cm_event(3), rdma_get_devices(3), rdma_get_dst_port(3), rdma_get_local_addr(3), rdma_get_peer_addr(3), rdma_get_recv_comp(3), rdma_get_remote_ece(3), rdma_get_request(3), rdma_get_send_comp(3), rdma_get_src_port(3), rdma_join_multicast(3), rdma_leave_multicast(3), rdma_listen(3), rdma_migrate_id(3), rdma_notify(3), rdma_post_read(3), rdma_post_readv(3), rdma_post_recv(3), rdma_post_recvv(3), rdma_post_send(3), rdma_post_sendv(3), rdma_post_ud_send(3), rdma_post_write(3), rdma_post_writev(3), rdma_reg_msgs(3), rdma_reg_read(3), rdma_reg_write(3), rdma_reject(3), rdma_resolve_addr(3), rdma_resolve_route(3), rdma_get_remote_ece(3), rdma_set_option(3), mckey(1), rdma_client(1), rdma_server(1), rping(1), ucmatose(1), udaddy(1) rdma-core-56.1/librdmacm/man/rdma_connect.3000066400000000000000000000107211477342711600205600ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_CONNECT" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_connect \- Initiate an active connection request. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_connect .BI "(struct rdma_cm_id *" id "," .BI "struct rdma_conn_param *" conn_param ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .IP "conn_param" 12 Connection parameters. See CONNECTION PROPERTIES below for details. .SH "DESCRIPTION" For an rdma_cm_id of type RDMA_PS_TCP, this call initiates a connection request to a remote destination. For an rdma_cm_id of type RDMA_PS_UDP, it initiates a lookup of the remote QP providing the datagram service. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" Users must have resolved a route to the destination address by having called rdma_resolve_route or rdma_create_ep before calling this routine. .SH "CONNECTION PROPERTIES" The following properties are used to configure the communication and specified by the conn_param parameter when connecting or establishing datagram communication. .IP private_data References a user-controlled data buffer. The contents of the buffer are copied and transparently passed to the remote side as part of the communication request. May be NULL if private_data is not required. .IP private_data_len Specifies the size of the user-controlled data buffer. Note that the actual amount of data transferred to the remote side is transport dependent and may be larger than that requested. .IP responder_resources The maximum number of outstanding RDMA read and atomic operations that the local side will accept from the remote side. Applies only to RDMA_PS_TCP.
This value must be less than or equal to the local RDMA device attribute max_qp_rd_atom and remote RDMA device attribute max_qp_init_rd_atom. The remote endpoint can adjust this value when accepting the connection. .IP initiator_depth The maximum number of outstanding RDMA read and atomic operations that the local side will have to the remote side. Applies only to RDMA_PS_TCP. This value must be less than or equal to the local RDMA device attribute max_qp_init_rd_atom and remote RDMA device attribute max_qp_rd_atom. The remote endpoint can adjust this value when accepting the connection. .IP flow_control Specifies if hardware flow control is available. This value is exchanged with the remote peer and is not used to configure the QP. Applies only to RDMA_PS_TCP. .IP retry_count The maximum number of times that a data transfer operation should be retried on the connection when an error occurs. This setting controls the number of times to retry send, RDMA, and atomic operations when timeouts occur. Applies only to RDMA_PS_TCP. .IP rnr_retry_count The maximum number of times that a send operation from the remote peer should be retried on a connection after receiving a receiver not ready (RNR) error. RNR errors are generated when a send request arrives before a buffer has been posted to receive the incoming data. Applies only to RDMA_PS_TCP. .IP srq Specifies if the QP associated with the connection is using a shared receive queue. This field is ignored by the library if a QP has been created on the rdma_cm_id. Applies only to RDMA_PS_TCP. .IP qp_num Specifies the QP number associated with the connection. This field is ignored by the library if a QP has been created on the rdma_cm_id. Applies only to RDMA_PS_TCP. .SH "INFINIBAND SPECIFIC" In addition to the connection properties defined above, InfiniBand QPs are configured with minimum RNR NAK timer and local ACK timeout values. The minimum RNR NAK timer value is set to 0, for a delay of 655 ms. The local ACK timeout is calculated based on the packet lifetime and local HCA ACK delay. The packet lifetime is determined by the InfiniBand Subnet Administrator and is part of the resolved route (path record) information. The HCA ACK delay is a property of the locally used HCA. .P Retry count and RNR retry count values are 3-bit values. .P The length of the private data provided by the user is limited to 56 bytes for RDMA_PS_TCP, or 180 bytes for RDMA_PS_UDP. .SH "IWARP SPECIFIC" Connections established over iWarp RDMA devices currently require that the active side of the connection send the first message. .SH "SEE ALSO" rdma_cm(7), rdma_create_id(3), rdma_resolve_route(3), rdma_disconnect(3), rdma_listen(3), rdma_get_cm_event(3) rdma-core-56.1/librdmacm/man/rdma_create_ep.3000066400000000000000000000054431477342711600210630ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_CREATE_EP" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_create_ep \- Allocate a communication identifier and optional QP. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_create_ep .BI "(struct rdma_cm_id **" id "," .BI "struct rdma_addrinfo *" res "," .BI "struct ibv_pd *" pd "," .BI "struct ibv_qp_init_attr *" qp_init_attr ");" .SH ARGUMENTS .IP "id" 12 A reference where the allocated communication identifier will be returned. .IP "res" 12 Address information associated with the rdma_cm_id returned from rdma_getaddrinfo.
.IP "pd" 12 Optional protection domain if a QP is associated with the rdma_cm_id. .IP "qp_init_attr" 12 Optional initial QP attributes. .SH "DESCRIPTION" Creates an identifier that is used to track communication information. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" After resolving address information using rdma_getaddrinfo, a user may use this call to allocate an rdma_cm_id based on the results. .P If the rdma_cm_id will be used on the active side of a connection, meaning that res->ai_flag does not have RAI_PASSIVE set, rdma_create_ep will automatically create a QP on the rdma_cm_id if qp_init_attr is not NULL. The QP will be associated with the specified protection domain, if provided, or a default protection domain if not. Users should see rdma_create_qp for details on the use of the pd and qp_init_attr parameters. After calling rdma_create_ep, the returned rdma_cm_id may be connected by calling rdma_connect. The active side calls rdma_resolve_addr and rdma_resolve_route are not necessary. .P If the rdma_cm_id will be used on the passive side of a connection, indicated by having res->ai_flag RAI_PASSIVE set, this call will save the provided pd and qp_init_attr parameters. When a new connection request is retrieved by calling rdma_get_request, the rdma_cm_id associated with the new connection will automatically be associated with a QP using the pd and qp_init_attr parameters. After calling rdma_create_ep, the returned rdma_cm_id may be placed into a listening state by immediately calling rdma_listen. The passive side call rdma_bind_addr is not necessary. Connection requests may then be retrieved by calling rdma_get_request. .P The newly created rdma_cm_id will be set to use synchronous operation. Users that wish asynchronous operation must migrate the rdma_cm_id to a user created event channel using rdma_migrate_id. .P Users must release the created rdma_cm_id by calling rdma_destroy_ep. .SH "SEE ALSO" rdma_cm(7), rdma_getaddrinfo(3), rdma_create_event_channel(3), rdma_connect(3), rdma_listen(3), rdma_destroy_ep(3), rdma_migrate_id(3) rdma-core-56.1/librdmacm/man/rdma_create_event_channel.3000066400000000000000000000026511477342711600232660ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_CREATE_EVENT_CHANNEL" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_create_event_channel \- Open a channel used to report communication events. .SH SYNOPSIS .B "#include " .P .B "struct rdma_event_channel *" rdma_create_event_channel .BI "(" void ");" .SH ARGUMENTS .IP "void" 12 no arguments .SH "DESCRIPTION" Asynchronous events are reported to users through event channels. .SH "RETURN VALUE" Returns a pointer to the created event channel, or NULL if the request fails. On failure, errno will be set to indicate the failure reason. .SH "NOTES" Event channels are used to direct all events on an rdma_cm_id. For many clients, a single event channel may be sufficient, however, when managing a large number of connections or cm_id's, users may find it useful to direct events for different cm_id's to different channels for processing. .P All created event channels must be destroyed by calling rdma_destroy_event_channel. Users should call rdma_get_cm_event to retrieve events on an event channel. .P Each event channel is mapped to a file descriptor. 
The associated file descriptor can be used and manipulated like any other fd to change its behavior. Users may make the fd non-blocking, poll or select the fd, etc. .SH "SEE ALSO" rdma_cm(7), rdma_get_cm_event(3), rdma_destroy_event_channel(3) rdma-core-56.1/librdmacm/man/rdma_create_id.3000066400000000000000000000045541477342711600210550ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_CREATE_ID" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_create_id \- Allocate a communication identifier. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_create_id .BI "(struct rdma_event_channel *" channel "," .BI "struct rdma_cm_id **" id "," .BI "void *" context "," .BI "enum rdma_port_space " ps ");" .SH ARGUMENTS .IP "channel" 12 The communication channel that events associated with the allocated rdma_cm_id will be reported on. This may be NULL. .IP "id" 12 A reference where the allocated communication identifier will be returned. .IP "context" 12 User specified context associated with the rdma_cm_id. .IP "ps" 12 RDMA port space. .SH "DESCRIPTION" Creates an identifier that is used to track communication information. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" Rdma_cm_id's are conceptually equivalent to a socket for RDMA communication. The difference is that RDMA communication requires explicitly binding to a specified RDMA device before communication can occur, and most operations are asynchronous in nature. Asynchronous communication events on an rdma_cm_id are reported through the associated event channel. If the channel parameter is NULL, the rdma_cm_id will be placed into synchronous operation. While operating synchronously, calls that result in an event will block until the operation completes. The event will be returned to the user through the rdma_cm_id structure, and be available for access until another rdma_cm call is made. .P Users must release the rdma_cm_id by calling rdma_destroy_id. .SH "PORT SPACE" Details of the services provided by the different port spaces are outlined below. .IP RDMA_PS_TCP Provides reliable, connection-oriented QP communication. Unlike TCP, the RDMA port space provides message, not stream, based communication. .IP RDMA_PS_UDP Provides unreliable, connectionless QP communication. Supports both datagram and multicast communication. .IP RDMA_PS_IB Provides for any IB services (UD, UC, RC, XRC, etc.). .SH "SEE ALSO" rdma_cm(7), rdma_create_event_channel(3), rdma_destroy_id(3), rdma_get_devices(3), rdma_bind_addr(3), rdma_resolve_addr(3), rdma_connect(3), rdma_listen(3), rdma_set_option(3) rdma-core-56.1/librdmacm/man/rdma_create_qp.3000066400000000000000000000040311477342711600210700ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_CREATE_QP" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_create_qp \- Allocate a QP. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_create_qp .BI "(struct rdma_cm_id *" id "," .BI "struct ibv_pd *" pd "," .BI "struct ibv_qp_init_attr *" qp_init_attr ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .IP "pd" 12 Optional protection domain for the QP. .IP "qp_init_attr" 12 Initial QP attributes. .SH "DESCRIPTION" Allocate a QP associated with the specified rdma_cm_id and transition it for sending and receiving.
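.P
The following sketch allocates an RC QP on a bound rdma_cm_id using the default protection domain and library-allocated CQs; the attribute values are illustrative and error handling is omitted.
.sp
.nf
struct ibv_qp_init_attr attr;

memset(&attr, 0, sizeof attr);
attr.cap.max_send_wr = attr.cap.max_recv_wr = 4;
attr.cap.max_send_sge = attr.cap.max_recv_sge = 1;
attr.qp_type = IBV_QPT_RC;

/* NULL pd: use the default protection domain for the device */
if (rdma_create_qp(id, NULL, &attr))
	exit(1);	/* errno indicates the failure reason */
.fi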
.SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" The rdma_cm_id must be bound to a local RDMA device before calling this function, and the protection domain must be for that same device. QPs allocated to an rdma_cm_id are automatically transitioned by the librdmacm through their states. After being allocated, the QP will be ready to handle posting of receives. If the QP is unconnected, it will be ready to post sends. .P If a protection domain is not given - pd parameter is NULL - then the rdma_cm_id will be created using a default protection domain. One default protection domain is allocated per RDMA device. .P The initial QP attributes are specified by the qp_init_attr parameter. The send_cq and recv_cq fields in the ibv_qp_init_attr are optional. If a send or receive completion queue is not specified, then a CQ will be allocated by the rdma_cm for the QP, along with corresponding completion channels. Completion channels and CQ data created by the rdma_cm are exposed to the user through the rdma_cm_id structure. .P The actual capabilities and properties of the created QP will be returned to the user through the qp_init_attr parameter. An rdma_cm_id may only be associated with a single QP. .SH "SEE ALSO" rdma_bind_addr(3), rdma_resolve_addr(3), rdma_destroy_qp(3), ibv_create_qp(3), ibv_modify_qp(3) rdma-core-56.1/librdmacm/man/rdma_create_srq.3000066400000000000000000000035471477342711600212670ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_CREATE_SRQ" 3 "2011-06-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_create_srq \- Allocate a shared receive queue. .SH SYNOPSIS .B "#include " .P .B "int" rdma_create_srq .BI "(struct rdma_cm_id *" id "," .BI "struct ibv_pd *" pd "," .BI "struct ibv_srq_init_attr *" attr ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .IP "pd" 12 Optional protection domain for the SRQ. .IP "attr" 12 Initial SRQ attributes. .SH "DESCRIPTION" Allocate a SRQ associated with the specified rdma_cm_id. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" The rdma_cm_id must be bound to a local RDMA device before calling this function, and the protection domain, if provided, must be for that same device. After being allocated, the SRQ will be ready to handle posting of receives. .P If a protection domain is not given - pd parameter is NULL - then the rdma_cm_id will be created using a default protection domain. One default protection domain is allocated per RDMA device. .P The initial SRQ attributes are specified by the attr parameter. The ext.xrc.cq fields in the ibv_srq_init_attr is optional. If a completion queue is not specified for an XRC SRQ, then a CQ will be allocated by the rdma_cm for the SRQ, along with corresponding completion channels. Completion channels and CQ data created by the rdma_cm are exposed to the user through the rdma_cm_id structure. .P The actual capabilities and properties of the created SRQ will be returned to the user through the attr parameter. An rdma_cm_id may only be associated with a single SRQ. 
.SH "SEE ALSO" rdma_bind_addr(3), rdma_resolve_addr(3), rdma_create_ep(3), rdma_destroy_srq(3), ibv_create_srq(3), ibv_create_xsrq(3) rdma-core-56.1/librdmacm/man/rdma_dereg_mr.3000066400000000000000000000022151477342711600207120ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_DEREG_MR" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_dereg_mr \- deregisters a registered memory region. .SH SYNOPSIS .B "#include " .P .B "int" rdma_dereg_mr .BI "(struct ibv_mr *" mr ");" .SH ARGUMENTS .IP "mr" 12 A reference to a registered memory buffer. .SH "DESCRIPTION" Deregisters a memory buffer that had been registered for RDMA or message operations. A user should call rdma_dereg_mr for all registered memory associated with an rdma_cm_id before destroying the rdma_cm_id. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" All memory registered with an rdma_cm_id is associated with the protection domain associated with the id. Users must deregister all registered memory before the protection domain can be destroyed. .SH "SEE ALSO" rdma_cm(7), rdma_create_id(3), rdma_create_ep(3), rdma_destroy_id(3), rdma_destroy_ep(3), rdma_reg_msgs(3), rdma_reg_read(3), rdma_reg_write(3), ibv_reg_mr(3), ibv_dereg_mr(3) rdma-core-56.1/librdmacm/man/rdma_destroy_ep.3000066400000000000000000000011651477342711600213060ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_DESTROY_EP" 3 "2011-06-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_destroy_ep \- Release a communication identifier. .SH SYNOPSIS .B "#include " .P .B "void" rdma_destroy_ep .BI "(struct rdma_cm_id *" id ");" .SH ARGUMENTS .IP "id" 12 The communication identifier to destroy. .SH "DESCRIPTION" Destroys the specified rdma_cm_id and all associated resources .SH "NOTES" rdma_destroy_ep will automatically destroy any QP and SRQ associated with the rdma_cm_id. .SH "SEE ALSO" rdma_create_ep(3) rdma-core-56.1/librdmacm/man/rdma_destroy_event_channel.3000066400000000000000000000015101477342711600235050ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_DESTROY_EVENT_CHANNEL" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_destroy_event_channel \- Close an event communication channel. .SH SYNOPSIS .B "#include " .P .B "void" rdma_destroy_event_channel .BI "(struct rdma_event_channel *" channel ");" .SH ARGUMENTS .IP "channel" 12 The communication channel to destroy. .SH "DESCRIPTION" Release all resources associated with an event channel and closes the associated file descriptor. .SH "RETURN VALUE" None .SH "NOTES" All rdma_cm_id's associated with the event channel must be destroyed, and all returned events must be acked before calling this function. .SH "SEE ALSO" rdma_create_event_channel(3), rdma_get_cm_event(3), rdma_ack_cm_event(3) rdma-core-56.1/librdmacm/man/rdma_destroy_id.3000066400000000000000000000015101477342711600212700ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_DESTROY_ID" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_destroy_id \- Release a communication identifier. 
.SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_destroy_id .BI "(struct rdma_cm_id *" id ");" .SH ARGUMENTS .IP "id" 12 The communication identifier to destroy. .SH "DESCRIPTION" Destroys the specified rdma_cm_id and cancels any outstanding asynchronous operation. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" Users must free any QP associated with the rdma_cm_id before calling this routine, and ack all related events. .SH "SEE ALSO" rdma_create_id(3), rdma_destroy_qp(3), rdma_ack_cm_event(3) rdma-core-56.1/librdmacm/man/rdma_destroy_qp.3000066400000000000000000000011231477342711600213140ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_DESTROY_QP" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_destroy_qp \- Deallocate a QP. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "void" rdma_destroy_qp .BI "(struct rdma_cm_id *" id ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .SH "DESCRIPTION" Destroy a QP allocated on the rdma_cm_id. .SH "NOTES" Users must destroy any QP associated with an rdma_cm_id before destroying the ID. .SH "SEE ALSO" rdma_create_qp(3), rdma_destroy_id(3), ibv_destroy_qp(3) rdma-core-56.1/librdmacm/man/rdma_destroy_srq.3000066400000000000000000000011701477342711600215030ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_DESTROY_SRQ" 3 "2011-06-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_destroy_srq \- Deallocate an SRQ. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "void" rdma_destroy_srq .BI "(struct rdma_cm_id *" id ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .SH "DESCRIPTION" Destroy an SRQ allocated on the rdma_cm_id. .SH "RETURN VALUE" None .SH "NOTES" Users should destroy any SRQ associated with an rdma_cm_id before destroying the ID. .SH "SEE ALSO" rdma_create_srq(3), rdma_destroy_id(3), ibv_destroy_srq(3) rdma-core-56.1/librdmacm/man/rdma_disconnect.3000066400000000000000000000017141477342711600212620ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_DISCONNECT" 3 "2008-01-02" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_disconnect \- This function disconnects a connection. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_disconnect .BI "(struct rdma_cm_id *" id ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .SH "DESCRIPTION" Disconnects a connection and transitions any associated QP to the error state, which will flush any posted work requests to the completion queue. This routine should be called by both the client and server side of a connection. After successfully disconnecting, an RDMA_CM_EVENT_DISCONNECTED event will be generated on both sides of the connection. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "SEE ALSO" rdma_connect(3), rdma_listen(3), rdma_accept(3), rdma_get_cm_event(3) rdma-core-56.1/librdmacm/man/rdma_establish.3.md000066400000000000000000000022471477342711600215100ustar00rootroot00000000000000--- date: 2019-01-16 footer: librdmacm header: "Librdmacm Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: RDMA_ESTABLISH --- # NAME rdma_establish - Complete an active connection request.
# SYNOPSIS ```c #include <rdma/rdma_cma.h> int rdma_establish(struct rdma_cm_id *id); ``` # DESCRIPTION **rdma_establish()** acknowledges an incoming connection response event and completes the connection establishment. Notes: If a QP has not been created on the rdma_cm_id, this function should be called by the active side to complete the connection, after receiving the connect response event. This will trigger a connection established event on the passive side. This function should not be used on an rdma_cm_id on which a QP has been created. # ARGUMENTS *id* : RDMA identifier. # RETURN VALUE **rdma_establish()** returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. # SEE ALSO **rdma_connect**(3), **rdma_disconnect**(3), **rdma_get_cm_event**(3) # AUTHORS Danit Goldberg Yossi Itigin rdma-core-56.1/librdmacm/man/rdma_event_str.3000066400000000000000000000011611477342711600211360ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_EVENT_STR" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_event_str \- Returns a string representation of an rdma cm event. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "char *" rdma_event_str .BI "("enum rdma_cm_event_type " event ");" .SH ARGUMENTS .IP "event" 12 Asynchronous event. .SH "DESCRIPTION" Returns a string representation of an asynchronous event. .SH "RETURN VALUE" Returns a pointer to a static character string corresponding to the event. .SH "SEE ALSO" rdma_get_cm_event(3) rdma-core-56.1/librdmacm/man/rdma_free_devices.3000066400000000000000000000011051477342711600215460ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_FREE_DEVICES" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_free_devices \- Frees the list of devices returned by rdma_get_devices. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "void" rdma_free_devices .BI "(struct ibv_context **" list ");" .SH ARGUMENTS .IP "list" 12 List of devices returned from rdma_get_devices. .SH "DESCRIPTION" Frees the device array returned by rdma_get_devices. .SH "RETURN VALUE" None .SH "SEE ALSO" rdma_get_devices(3) rdma-core-56.1/librdmacm/man/rdma_freeaddrinfo.3.in.rst000066400000000000000000000013771477342711600230020ustar00rootroot00000000000000================= RDMA_FREEADDRINFO ================= ----------------------------------------------------------------------- Frees the list of rdma_addrinfo structures returned by rdma_getaddrinfo ----------------------------------------------------------------------- :Date: 2025-02-03 :Manual section: 3 :Manual group: Librdmacm Programmer's Manual SYNOPSIS ======== #include <rdma/rdma_cma.h> void rdma_freeaddrinfo (struct rdma_addrinfo \*res); ARGUMENTS ========= res List of rdma_addrinfo structures returned by rdma_getaddrinfo. DESCRIPTION =========== Frees the list of rdma_addrinfo structures returned by rdma_getaddrinfo. RETURN VALUE ============ None SEE ALSO ======== rdma_getaddrinfo(3) AUTHOR ====== Mark Zhang rdma-core-56.1/librdmacm/man/rdma_get_cm_event.3000066400000000000000000000202711477342711600215670ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_GET_CM_EVENT" 3 "2007-10-31" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_get_cm_event \- Retrieves the next pending communication event.
.SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_get_cm_event .BI "(struct rdma_event_channel *" channel "," .BI "struct rdma_cm_event **" event ");" .SH ARGUMENTS .IP "channel" 12 Event channel to check for events. .IP "event" 12 Allocated information about the next communication event. .SH "DESCRIPTION" Retrieves a communication event. If no events are pending, by default, the call will block until an event is received. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" The default synchronous behavior of this routine can be changed by modifying the file descriptor associated with the given channel. All events that are reported must be acknowledged by calling rdma_ack_cm_event. Destruction of an rdma_cm_id will block until related events have been acknowledged. .SH "EVENT DATA" Communication event details are returned in the rdma_cm_event structure. This structure is allocated by the rdma_cm and released by the rdma_ack_cm_event routine. Details of the rdma_cm_event structure are given below. .IP "id" 12 The rdma_cm identifier associated with the event. If the event type is RDMA_CM_EVENT_CONNECT_REQUEST, then this references a new id for that communication. .IP "listen_id" 12 For RDMA_CM_EVENT_CONNECT_REQUEST event types, this references the corresponding listening request identifier. .IP "event" 12 Specifies the type of communication event which occurred. See EVENT TYPES below. .IP "status" 12 Returns any asynchronous error information associated with an event. The status is zero if the operation was successful; otherwise the status value is non-zero and is either set to a negative errno or a transport specific value. For details on transport specific status values, see the event type information below. .IP "param" 12 Provides additional details based on the type of event. Users should select the conn or ud subfields based on the rdma_port_space of the rdma_cm_id associated with the event. See UD EVENT DATA and CONN EVENT DATA below. .SH "UD EVENT DATA" Event parameters related to unreliable datagram (UD) services: RDMA_PS_UDP and RDMA_PS_IPOIB. The UD event data is valid for RDMA_CM_EVENT_ESTABLISHED and RDMA_CM_EVENT_MULTICAST_JOIN events, unless stated otherwise. .IP "private_data" 12 References any user-specified data associated with RDMA_CM_EVENT_CONNECT_REQUEST or RDMA_CM_EVENT_ESTABLISHED events. The data referenced by this field matches that specified by the remote side when calling rdma_connect or rdma_accept. This field is NULL if the event does not include private data. The buffer referenced by this pointer is deallocated when calling rdma_ack_cm_event. .IP "private_data_len" 12 The size of the private data buffer. Users should note that the size of the private data buffer may be larger than the amount of private data sent by the remote side. Any additional space in the buffer will be zeroed out. .IP "ah_attr" 12 Address information needed to send data to the remote endpoint(s). Users should use this structure when allocating their address handle. .IP "qp_num" 12 QP number of the remote endpoint or multicast group. .IP "qkey" 12 QKey needed to send data to the remote endpoint(s). .SH "CONN EVENT DATA" Event parameters related to connected QP services: RDMA_PS_TCP. The connection related event data is valid for RDMA_CM_EVENT_CONNECT_REQUEST and RDMA_CM_EVENT_ESTABLISHED events, unless stated otherwise. .IP "private_data" 12 References any user-specified data associated with the event.
The data referenced by this field matches that specified by the remote side when calling rdma_connect or rdma_accept. This field is NULL if the event does not include private data. The buffer referenced by this pointer is deallocated when calling rdma_ack_cm_event. .IP "private_data_len" 12 The size of the private data buffer. Users should note that the size of the private data buffer may be larger than the amount of private data sent by the remote side. Any additional space in the buffer will be zeroed out. .IP "responder_resources" 12 The number of responder resources requested of the recipient. This field matches the initiator depth specified by the remote node when calling rdma_connect and rdma_accept. .IP "initiator_depth" 12 The maximum number of outstanding RDMA read/atomic operations that the recipient may have outstanding. This field matches the responder resources specified by the remote node when calling rdma_connect and rdma_accept. .IP "flow_control" 12 Indicates if hardware level flow control is provided by the sender. .IP "retry_count" 12 For RDMA_CM_EVENT_CONNECT_REQUEST events only, indicates the number of times that the recipient should retry send operations. .IP "rnr_retry_count" 12 The number of times that the recipient should retry receiver not ready (RNR) NACK errors. .IP "srq" 12 Specifies if the sender is using a shared-receive queue. .IP "qp_num" 12 Indicates the remote QP number for the connection. .SH "EVENT TYPES" The following types of communication events may be reported. .IP RDMA_CM_EVENT_ADDR_RESOLVED Address resolution (rdma_resolve_addr) completed successfully. .IP RDMA_CM_EVENT_ADDR_ERROR Address resolution (rdma_resolve_addr) failed. .IP RDMA_CM_EVENT_ROUTE_RESOLVED Route resolution (rdma_resolve_route) completed successfully. .IP RDMA_CM_EVENT_ROUTE_ERROR Route resolution (rdma_resolve_route) failed. .IP RDMA_CM_EVENT_CONNECT_REQUEST Generated on the passive side to notify the user of a new connection request. .IP RDMA_CM_EVENT_CONNECT_RESPONSE Generated on the active side to notify the user of a successful response to a connection request. It is only generated on rdma_cm_id's that do not have a QP associated with them. .IP RDMA_CM_EVENT_CONNECT_ERROR Indicates that an error has occurred trying to establish a connection. May be generated on the active or passive side of a connection. .IP RDMA_CM_EVENT_UNREACHABLE Generated on the active side to notify the user that the remote server is not reachable or unable to respond to a connection request. If this event is generated in response to a UD QP resolution request over InfiniBand, the event status field will contain an errno, if negative, or the status result carried in the IB CM SIDR REP message. .IP RDMA_CM_EVENT_REJECTED Indicates that a connection request or response was rejected by the remote end point. The event status field will contain the transport specific reject reason if available. Under InfiniBand, this is the reject reason carried in the IB CM REJ message. .IP RDMA_CM_EVENT_ESTABLISHED Indicates that a connection has been established with the remote end point. .IP RDMA_CM_EVENT_DISCONNECTED The connection has been disconnected. .IP RDMA_CM_EVENT_DEVICE_REMOVAL The local RDMA device associated with the rdma_cm_id has been removed. Upon receiving this event, the user must destroy the related rdma_cm_id. .IP RDMA_CM_EVENT_MULTICAST_JOIN The multicast join operation (rdma_join_multicast) completed successfully.
.IP RDMA_CM_EVENT_MULTICAST_ERROR An error occurred either when joining a multicast group or, if the group had already been joined, on an existing group. The specified multicast group is no longer accessible and should be rejoined, if desired. .IP RDMA_CM_EVENT_ADDR_CHANGE The network device associated with this ID through address resolution changed its HW address, e.g., following a bonding failover. This event can serve as a hint for applications that want the links used for their RDMA sessions to align with the network stack. .IP RDMA_CM_EVENT_TIMEWAIT_EXIT The QP associated with a connection has exited its timewait state and is now ready to be re-used. After a QP has been disconnected, it is maintained in a timewait state to allow any in flight packets to exit the network. After the timewait state has completed, the rdma_cm will report this event. .SH "SEE ALSO" rdma_ack_cm_event(3), rdma_create_event_channel(3), rdma_resolve_addr(3), rdma_resolve_route(3), rdma_connect(3), rdma_listen(3), rdma_join_multicast(3), rdma_destroy_id(3), rdma_event_str(3) rdma-core-56.1/librdmacm/man/rdma_get_devices.3000066400000000000000000000017271477342711600214160ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_GET_DEVICES" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_get_devices \- Get a list of RDMA devices currently available. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "struct ibv_context **" rdma_get_devices .BI "(int *" num_devices ");" .SH ARGUMENTS .IP "num_devices" 12 If non-NULL, set to the number of devices returned. .SH "DESCRIPTION" Return a NULL-terminated array of opened RDMA devices. Callers can use this routine to allocate resources on specific RDMA devices that will be shared across multiple rdma_cm_id's. .SH "RETURN VALUE" Returns an array of available RDMA devices, or NULL if the request fails. On failure, errno will be set to indicate the failure reason. .SH "NOTES" The returned array must be released by calling rdma_free_devices. Devices remain opened while the librdmacm is loaded. .SH "SEE ALSO" rdma_free_devices(3) rdma-core-56.1/librdmacm/man/rdma_get_dst_port.3000066400000000000000000000014551477342711600216300ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_GET_DST_PORT" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_get_dst_port \- Returns the remote port number of a bound rdma_cm_id. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "uint16_t" rdma_get_dst_port .BI "(struct rdma_cm_id *" id ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .SH "DESCRIPTION" Returns the remote port number for an rdma_cm_id that has been bound to a remote address. .SH "RETURN VALUE" Returns the 16-bit port identifier associated with the peer endpoint. If the rdma_cm_id is not connected, the returned value is 0. .SH "SEE ALSO" rdma_connect(3), rdma_accept(3), rdma_get_cm_event(3), rdma_get_src_port(3), rdma_get_local_addr(3), rdma_get_peer_addr(3) rdma-core-56.1/librdmacm/man/rdma_get_local_addr.3000066400000000000000000000015261477342711600220550ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_GET_LOCAL_ADDR" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_get_local_addr \- Returns the local IP address of a bound rdma_cm_id.
.SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "struct sockaddr *" rdma_get_local_addr .BI "(struct rdma_cm_id *" id ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .SH "DESCRIPTION" Returns the local IP address for an rdma_cm_id that has been bound to a local device. .SH "RETURN VALUE" Returns a pointer to the local sockaddr address of the rdma_cm_id. If the rdma_cm_id is not bound to an address, the contents of the sockaddr structure will be set to all zeroes. .SH "SEE ALSO" rdma_bind_addr(3), rdma_resolve_addr(3), rdma_get_src_port(3), rdma_get_dst_port(3), rdma_get_peer_addr(3) rdma-core-56.1/librdmacm/man/rdma_get_peer_addr.3000066400000000000000000000014351477342711600217150ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_GET_PEER_ADDR" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_get_peer_addr \- Returns the remote IP address of a bound rdma_cm_id. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "struct sockaddr *" rdma_get_peer_addr .BI "(struct rdma_cm_id *" id ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .SH "DESCRIPTION" Returns the remote IP address associated with an rdma_cm_id. .SH "RETURN VALUE" Returns a pointer to the sockaddr address of the connected peer. If the rdma_cm_id is not connected, the contents of the sockaddr structure will be set to all zeroes. .SH "SEE ALSO" rdma_resolve_addr(3), rdma_get_src_port(3), rdma_get_dst_port(3), rdma_get_local_addr(3) rdma-core-56.1/librdmacm/man/rdma_get_recv_comp.3000066400000000000000000000026231477342711600217450ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_GET_RECV_COMP" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_get_recv_comp \- retrieves a completed receive request. .SH SYNOPSIS .B "#include <rdma/rdma_verbs.h>" .P .B "int" rdma_get_recv_comp .BI "(struct rdma_cm_id *" id "," .BI "struct ibv_wc *" wc ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier to check for completions. .IP "wc" 12 A reference to a work completion structure to fill in. .SH "DESCRIPTION" Retrieves a completed work request for a receive operation. Information about the completed request is returned through the wc parameter, with the wr_id set to the context of the request. For details on the work completion structure, see ibv_poll_cq. .SH "RETURN VALUE" Returns the number of returned completions (0 or 1) on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" This call polls the receive completion queue associated with an rdma_cm_id. If a completion is not found, the call blocks until a request completes. This call should only be used on rdma_cm_id's that do not share CQs with other rdma_cm_id's, and that maintain separate CQs for send and receive completions. .SH "SEE ALSO" rdma_cm(7), ibv_poll_cq(3), rdma_get_send_comp(3), rdma_post_send(3), rdma_post_read(3), rdma_post_write(3) rdma-core-56.1/librdmacm/man/rdma_get_remote_ece.3.md000066400000000000000000000031361477342711600224760ustar00rootroot00000000000000--- date: 2020-02-02 footer: librdmacm header: "Librdmacm Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: RDMA_GET_REMOTE_ECE --- # NAME rdma_get_remote_ece - Get remote ECE parameters as received from the peer.
# SYNOPSIS ```c #include <rdma/rdma_cma.h> int rdma_get_remote_ece(struct rdma_cm_id *id, struct ibv_ece *ece); ``` # DESCRIPTION **rdma_get_remote_ece()** gets the ECE parameters as received from the communication peer. This function is supposed to be used by the users of external QPs. The call needs to be performed before replying to the peer, and it allows the passive side to learn the ECE options of the other side. Because the QP is external and RDMA_CM does not manage it, the peer needs to call the libibverbs API by itself. Usual flow for the passive side will be: * ibv_create_qp() <- create data QP. * ece = rdma_get_remote_ece() <- get ECE options from remote peer * ibv_set_ece(ece) <- set local ECE options with data received from the peer. * ibv_modify_qp() <- enable data QP. * rdma_set_local_ece(ece) <- set desired ECE options after the respective libibverbs provider has masked unsupported options. * rdma_accept()/rdma_establish()/rdma_reject_ece() # ARGUMENTS *id* : RDMA communication identifier. *ece* : ECE struct to be filled. # RETURN VALUE **rdma_get_remote_ece()** returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. # SEE ALSO **rdma_cm**(7), rdma_set_local_ece(3) # AUTHOR Leon Romanovsky rdma-core-56.1/librdmacm/man/rdma_get_request.3000066400000000000000000000026471477342711600214660ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_GET_REQUEST" 3 "2007-10-31" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_get_request \- Retrieves the next pending connection request event. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_get_request .BI "(struct rdma_cm_id *" listen "," .BI "struct rdma_cm_id **" id ");" .SH ARGUMENTS .IP "listen" 12 Listening rdma_cm_id. .IP "id" 12 rdma_cm_id associated with the new connection. .SH "DESCRIPTION" Retrieves a connection request event. If no requests are pending, the call will block until an event is received. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" This call may only be used on listening rdma_cm_id's operating synchronously. On success, a new rdma_cm_id representing the connection request will be returned to the user. The new rdma_cm_id will reference event information associated with the request until the user calls rdma_reject, rdma_accept, or rdma_destroy_id on the newly created identifier. For a description of the event data, see rdma_get_cm_event. .P If QP attributes are associated with the listening endpoint, the returned rdma_cm_id will also reference an allocated QP. .SH "SEE ALSO" rdma_get_cm_event(3), rdma_accept(3), rdma_reject(3), rdma_connect(3), rdma_listen(3), rdma_destroy_id(3) rdma-core-56.1/librdmacm/man/rdma_get_send_comp.3000066400000000000000000000026641477342711600217440ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_GET_SEND_COMP" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_get_send_comp \- retrieves a completed send, read, or write request. .SH SYNOPSIS .B "#include <rdma/rdma_verbs.h>" .P .B "int" rdma_get_send_comp .BI "(struct rdma_cm_id *" id "," .BI "struct ibv_wc *" wc ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier to check for completions. .IP "wc" 12 A reference to a work completion structure to fill in.
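.P
As a brief illustration, a blocking wait for a single send completion might look like the following sketch (illustrative only; it assumes \fIid\fR is a connected rdma_cm_id that maintains separate send and receive CQs):
.P
.nf
#include <stdio.h>
#include <rdma/rdma_verbs.h>

/* ... */
struct ibv_wc wc;

/* Blocks until one send, RDMA read, or RDMA write completes. */
if (rdma_get_send_comp(id, &wc) < 0)
        perror("rdma_get_send_comp");
else if (wc.status != IBV_WC_SUCCESS)
        fprintf(stderr, "completion error: %s\en",
                ibv_wc_status_str(wc.status));
.fi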
.SH "DESCRIPTION" Retrieves a completed work request for a send, RDMA read, or RDMA write operation. Information about the completed request is returned through the wc parameter, with the wr_id set to the context of the request. For details on the work completion structure, see ibv_poll_cq. .SH "RETURN VALUE" Returns the number of returned completions (0 or 1) on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" This calls polls the send completion queue associated with an rdma_cm_id. If a completion is not found, the call blocks until a request completes. This call should only be used on rdma_cm_id's that do not share CQs with other rdma_cm_id's, and maintain separate CQs for sends and receive completions. .SH "SEE ALSO" rdma_cm(7), ibv_poll_cq(3), rdma_get_recv_comp(3), rdma_post_send(3), rdma_post_read(3), rdma_post_write(3) rdma-core-56.1/librdmacm/man/rdma_get_src_port.3000066400000000000000000000014431477342711600216220ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_GET_SRC_PORT" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_get_src_port \- Returns the local port number of a bound rdma_cm_id. .SH SYNOPSIS .B "#include " .P .B "uint16_t" rdma_get_src_port .BI "(struct rdma_cm_id *" id ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .SH "DESCRIPTION" Returns the local port number for an rdma_cm_id that has been bound to a local address. .SH "RETURN VALUE" Returns the 16-bit port identifier associated with the local endpoint. If the rdma_cm_id is not bound to a port, the returned value is 0. .SH "SEE ALSO" rdma_bind_addr(3), rdma_resolve_addr(3), rdma_get_dst_port(3), rdma_get_local_addr(3), rdma_get_peer_addr(3) rdma-core-56.1/librdmacm/man/rdma_getaddrinfo.3000066400000000000000000000135021477342711600214150ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_GETADDRINFO" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_getaddrinfo \- Provides transport independent address translation. .SH SYNOPSIS .B "#include " .P .B "int" rdma_getaddrinfo .BI "(const char *" node "," .BI "const char *" service "," .BI "const struct rdma_addrinfo *" hints "," .BI "struct rdma_addrinfo **" res ");" .SH ARGUMENTS .IP "node" 12 Optional, name, dotted-decimal IPv4, or IPv6 hex address to resolve. .IP "service" 12 Service name or port number of address. .IP "hints" 12 Reference to an rdma_addrinfo structure containing hints about the type of service the caller supports. .IP "res" 12 A pointer to a linked list of rdma_addrinfo structures containing response information. .SH "DESCRIPTION" Resolves the destination node and service address and returns information needed to establish communication. Provides the RDMA functional equivalent to getaddrinfo. .SH "RETURN VALUE" Returns 0 on success, or -1 on error (errno will be set to indicate the failure reason), or one of the following nonzero error codes: .IP "EAI_ADDRFAMILY" 12 The specified network host does not have any network addresses in the requested address family. .IP "EAI_AGAIN" 12 The name server returned a temporary failure indication. Try again later. .IP "EAI_BADFLAGS" 12 hints.ai_flags contains invalid flags. .IP "EAI_FAIL" 12 The name server returned a permanent failure indication. .IP "EAI_FAMILY" 12 The requested address family is not supported. .IP "EAI_MEMORY" 12 Out of memory. 
.IP "EAI_NODATA" 12 The specified network host exists, but does not have any network addresses defined. .IP "EAI_NONAME" 12 The node or service is not known; or both node and service are NULL. .IP "EAI_SERVICE" 12 The requested service is not available for the requested QP type. It may be available through another QP type. .IP "EAI_QPTYPE" 12 The requested socket type is not supported. This could occur, for example, if hints.ai_qptype and hints.ai_port_space are inconsistent (e.g., IBV_QPT_UD and RDMA_PS_TCP, respectively). .IP "EAI_SYSTEM" 12 Other system error, check errno for details. The gai_strerror() function translates these error codes to a human readable string, suitable for error reporting. .SH "NOTES" Either node, service, or hints must be provided. If hints are provided, the operation will be controlled by hints.ai_flags. If RAI_PASSIVE is specified, the call will resolve address information for use on the passive side of a connection. If node is provided, rdma_getaddrinfo will attempt to resolve the RDMA address, route, and connection data to the given node. The hints parameter, if provided, may be used to control the resulting output as indicated below. If node is not given, rdma_getaddrinfo will attempt to resolve the RDMA addressing information based on the hints.ai_src_addr, hints.ai_dst_addr, or hints.ai_route. .SH "rdma_addrinfo" .IP "ai_flags" 12 Hint flags that control the operation. Supported flags are: .IP "RAI_PASSIVE" 12 Indicates that the results will be used on the passive/listening side of a connection. .IP "RAI_NUMERICHOST" 12 If specified, then the node parameter, if provided, must be a numerical network address. This flag suppresses any lengthy address resolution. .IP "RAI_NOROUTE" 12 If set, this flag suppresses any lengthy route resolution. .IP "RAI_FAMILY" 12 If set, the ai_family setting should be used as an input hint for interpretting the node parameter. .IP "ai_family" 12 Address family for the source and destination address. Supported families are: AF_INET, AF_INET6, and AF_IB. .IP "ai_qp_type" 12 Indicates the type of RDMA QP used for communication. Supported types are: IBV_QPT_UD (unreliable datagram) and IBV_QPT_RC (reliable connected). .IP "ai_port_space" 12 RDMA port space in use. Supported values are: RDMA_PS_UDP, RDMA_PS_TCP, and RDMA_PS_IB. .IP "ai_src_len" 12 The length of the source address referenced by ai_src_addr. This will be 0 if an appropriate source address could not be discovered for a given destination. .IP "ai_dst_len" 12 The length of the destination address referenced by ai_dst_addr. This will be 0 if the RAI_PASSIVE flag was specified as part of the hints. .IP "ai_src_addr" 12 If provided, the address for the local RDMA device. .IP "ai_dst_addr" 12 If provided, the address for the destination RDMA device. .IP "ai_src_canonname" 12 The canonical for the source. .IP "ai_dst_canonname" 12 The canonical for the destination. .IP "ai_route_len" 12 Size of the routing information buffer referenced by ai_route. This will be 0 if the underlying transport does not require routing data, or none could be resolved. .IP "ai_route" 12 Routing information for RDMA transports that require routing data as part of connection establishment. The format of the routing data depends on the underlying transport. If Infiniband transports are used, ai_route will reference an array of struct ibv_path_data on output, if routing data is available. Routing paths may be restricted by setting desired routing data fields on input to rdma_getaddrinfo. 
For InfiniBand, hints.ai_route may reference an array of struct ibv_path_record or struct ibv_path_data on input. .IP "ai_connect_len" 12 Size of connection information referenced by ai_connect. This will be 0 if the underlying transport does not require additional connection information. .IP "ai_connect" 12 Data exchanged as part of the connection establishment process. If provided, ai_connect data must be transferred as private data, with any user supplied private data following it. .IP "ai_next" 12 Pointer to the next rdma_addrinfo structure in the list. Will be NULL if no more structures exist. .SH "SEE ALSO" rdma_create_id(3), rdma_resolve_route(3), rdma_connect(3), rdma_create_qp(3), rdma_bind_addr(3), rdma_create_ep(3), rdma_freeaddrinfo(3) rdma-core-56.1/librdmacm/man/rdma_init_qp_attr.3.md000066400000000000000000000022271477342711600222250ustar00rootroot00000000000000--- date: 2018-12-31 footer: librdmacm header: "Librdmacm Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: RDMA_INIT_QP_ATTR --- # NAME rdma_init_qp_attr - Returns qp attributes of an rdma_cm_id. # SYNOPSIS ```c #include <rdma/rdma_cma.h> int rdma_init_qp_attr(struct rdma_cm_id *id, struct ibv_qp_attr *qp_attr, int *qp_attr_mask); ``` # DESCRIPTION **rdma_init_qp_attr()** returns qp attributes of an rdma_cm_id. Information about qp attributes and qp attributes mask is returned through the *qp_attr* and *qp_attr_mask* parameters. For details on the qp_attr structure, see ibv_modify_qp. # ARGUMENTS *id* : RDMA identifier. *qp_attr* : A reference to a qp attributes struct containing response information. *qp_attr_mask* : A reference to a qp attributes mask containing response information. # RETURN VALUE **rdma_init_qp_attr()** returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. # SEE ALSO **rdma_cm**(7), **ibv_modify_qp**(3) # AUTHOR Danit Goldberg rdma-core-56.1/librdmacm/man/rdma_join_multicast.3000066400000000000000000000033141477342711600221530ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_JOIN_MULTICAST" 3 "2008-01-02" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_join_multicast \- Joins a multicast group. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_join_multicast .BI "(struct rdma_cm_id *" id "," .BI "struct sockaddr *" addr "," .BI "void *" context ");" .SH ARGUMENTS .IP "id" 12 Communication identifier associated with the request. .IP "addr" 12 Multicast address identifying the group to join. .IP "context" 12 User-defined context associated with the join request. .SH "DESCRIPTION" Joins a multicast group and attaches an associated QP to the group. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" Before joining a multicast group, the rdma_cm_id must be bound to an RDMA device by calling rdma_bind_addr or rdma_resolve_addr. Use of rdma_resolve_addr requires the local routing tables to resolve the multicast address to an RDMA device, unless a specific source address is provided. The user must call rdma_leave_multicast to leave the multicast group and release any multicast resources. After the join operation completes, if a QP is associated with the rdma_cm_id, it is automatically attached to the multicast group when the multicast event is retrieved by the user.
Otherwise, the user is responsible for calling ibv_attach_mcast to bind the QP to the multicast group. The join context is returned to the user through the private_data field in the rdma_cm_event. .SH "SEE ALSO" rdma_leave_multicast(3), rdma_bind_addr(3), rdma_resolve_addr(3), rdma_create_qp(3), rdma_get_cm_event(3) rdma-core-56.1/librdmacm/man/rdma_join_multicast_ex.3000066400000000000000000000054451477342711600226560ustar00rootroot00000000000000.TH "RDMA_JOIN_MULTICAST_EX" 3 "2017-11-17" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_join_multicast_ex \- Joins a multicast group with extended options. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_join_multicast_ex .BI "(struct rdma_cm_id *" id "," .BI "struct rdma_cm_join_mc_attr_ex *" mc_join_attr "," .BI "void *" context ");" .SH ARGUMENTS .IP "id" 20 Communication identifier associated with the request. .IP "mc_join_attr" 20 Is an rdma_cm_join_mc_attr_ex struct, as defined in <rdma/rdma_cma.h>. .IP "context" 20 User-defined context associated with the join request. .SH "DESCRIPTION" Joins a multicast group (MCG) with extended options. It currently supports an MC join with a specified join flag. .P .nf struct rdma_cm_join_mc_attr_ex { .in +8 uint32_t comp_mask; /* Bitwise OR between "rdma_cm_join_mc_attr_mask" enum */ uint32_t join_flags; /* Use a single flag from "rdma_cm_mc_join_flags" enum */ struct sockaddr *addr; /* Multicast address identifying the group to join */ .in -8 }; .fi .P The supported join flags are: .P .B RDMA_MC_JOIN_FLAG_FULLMEMBER - Create multicast group, Send multicast messages to MCG, Receive multicast messages from MCG. .P .B RDMA_MC_JOIN_FLAG_SENDONLY_FULLMEMBER - Create multicast group, Send multicast messages to MCG, Don't receive multicast messages from MCG (send-only). .P Initiating an MC join as "Send Only Full Member" on InfiniBand requires SM support, otherwise joining will fail. .P Initiating an MC join as "Send Only Full Member" on RoCEv2/ETH will not send any IGMP messages, unlike a Full Member MC join. When "Send Only Full Member" is used the QP will not be attached to the MCG. .P .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" Before joining a multicast group, the rdma_cm_id must be bound to an RDMA device by calling rdma_bind_addr or rdma_resolve_addr. Use of rdma_resolve_addr requires the local routing tables to resolve the multicast address to an RDMA device, unless a specific source address is provided. The user must call rdma_leave_multicast to leave the multicast group and release any multicast resources. After the join operation completes, if a QP is associated with the rdma_cm_id, it is automatically attached to the multicast group when the multicast event is retrieved by the user. Otherwise, the user is responsible for calling ibv_attach_mcast to bind the QP to the multicast group. The join context is returned to the user through the private_data field in the rdma_cm_event. .SH "SEE ALSO" rdma_join_multicast(3), rdma_leave_multicast(3), rdma_bind_addr(3), rdma_resolve_addr(3), rdma_create_qp(3), rdma_get_cm_event(3) .SH "AUTHORS" .TP Alex Vesker rdma-core-56.1/librdmacm/man/rdma_leave_multicast.3000066400000000000000000000021751477342711600223140ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_LEAVE_MULTICAST" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_leave_multicast \- Leaves a multicast group.
.SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_leave_multicast .BI "(struct rdma_cm_id *" id "," .BI "struct sockaddr *" addr ");" .SH ARGUMENTS .IP "id" 12 Communication identifier associated with the request. .IP "addr" 12 Multicast address identifying the group to leave. .SH "DESCRIPTION" Leaves a multicast group and detaches an associated QP from the group. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" Calling this function before a group has been fully joined results in canceling the join operation. Users should be aware that messages received from the multicast group may still be queued for completion processing immediately after leaving a multicast group. Destroying an rdma_cm_id will automatically leave all multicast groups. .SH "SEE ALSO" rdma_join_multicast(3), rdma_destroy_qp(3) rdma-core-56.1/librdmacm/man/rdma_listen.3000066400000000000000000000023161477342711600204260ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_LISTEN" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_listen \- Listen for incoming connection requests. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_listen .BI "(struct rdma_cm_id *" id "," .BI "int " backlog ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .IP "backlog" 12 Backlog of incoming connection requests. .SH "DESCRIPTION" Initiates a listen for incoming connection requests or datagram service lookup. The listen will be restricted to the locally bound source address. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" Users must have bound the rdma_cm_id to a local address by calling rdma_bind_addr before calling this routine. If the rdma_cm_id is bound to a specific IP address, the listen will be restricted to that address and the associated RDMA device. If the rdma_cm_id is bound to an RDMA port number only, the listen will occur across all RDMA devices. .SH "SEE ALSO" rdma_cm(7), rdma_bind_addr(3), rdma_connect(3), rdma_accept(3), rdma_reject(3), rdma_get_cm_event(3) rdma-core-56.1/librdmacm/man/rdma_migrate_id.3000066400000000000000000000027631477342711600212400ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_MIGRATE_ID" 3 "2007-11-13" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_migrate_id \- Move a communication identifier to a different event channel. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_migrate_id .BI "(struct rdma_cm_id *" id "," .BI "struct rdma_event_channel *" channel ");" .SH ARGUMENTS .IP "id" 12 An existing communication identifier to migrate. .IP "channel" 12 The communication channel that events associated with the allocated rdma_cm_id will be reported on. May be NULL. .SH "DESCRIPTION" Migrates a communication identifier to a different event channel. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" This routine migrates a communication identifier to the specified event channel and moves any pending events associated with the rdma_cm_id to the new channel. Users should not poll for events on the rdma_cm_id's current event channel or invoke other routines on the rdma_cm_id while migrating between channels.
This call will block while there are any unacknowledged events on the current event channel. .P If the channel parameter is NULL, the specified rdma_cm_id will be placed into synchronous operation mode. All calls on the id will block until the operation completes. .SH "SEE ALSO" rdma_cm(7), rdma_create_event_channel(3), rdma_create_id(3), rdma_get_cm_event(3) rdma-core-56.1/librdmacm/man/rdma_notify.3000066400000000000000000000033401477342711600204360ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_NOTIFY" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_notify \- Notifies the librdmacm of an asynchronous event. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_notify .BI "(struct rdma_cm_id *" id "," .BI "enum ibv_event_type " event ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .IP "event" 12 Asynchronous event. .SH "DESCRIPTION" Used to notify the librdmacm of asynchronous events that have occurred on a QP associated with the rdma_cm_id. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. If errno is set to EISCONN (transport endpoint is already connected), this indicates that the underlying communication manager established the connection before the call to rdma_notify could be processed. In this case, the error may safely be ignored. .SH "NOTES" Asynchronous events that occur on a QP are reported through the user's device event handler. This routine is used to notify the librdmacm of communication events. In most cases, use of this routine is not necessary, however if connection establishment is done out of band (such as done through InfiniBand), it's possible to receive data on a QP that is not yet considered connected. This routine forces the connection into an established state in this case in order to handle the rare situation where the connection never forms on its own. Calling this routine ensures the delivery of the RDMA_CM_EVENT_ESTABLISHED event to the application. Events that should be reported to the CM are: IB_EVENT_COMM_EST. .SH "SEE ALSO" rdma_connect(3), rdma_accept(3), rdma_listen(3) rdma-core-56.1/librdmacm/man/rdma_post_read.3000066400000000000000000000037751477342711600211200ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_POST_READ" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_post_read \- post an RDMA read work request. .SH SYNOPSIS .B "#include <rdma/rdma_verbs.h>" .P .B "int" rdma_post_read .BI "(struct rdma_cm_id *" id "," .BI "void *" context "," .BI "void *" addr "," .BI "size_t " length "," .BI "struct ibv_mr *" mr "," .BI "int " flags "," .BI "uint64_t " remote_addr "," .BI "uint32_t " rkey ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier where the request will be posted. .IP "context" 12 User-defined context associated with the request. .IP "addr" 12 The address of the local destination of the read request. .IP "length" 12 The length of the read operation. .IP "mr" 12 Registered memory region associated with the local buffer. .IP "flags" 12 Optional flags used to control the read operation. .IP "remote_addr" 12 The address of the remote registered memory to read from. .IP "rkey" 12 The registered memory key associated with the remote address. .SH "DESCRIPTION" Posts a work request to the send queue of the queue pair associated with the rdma_cm_id.
The contents of the remote memory region will be read into the local data buffer. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" For a list of supported flags, see ibv_post_send. Both the remote and local data buffers must have been registered before the read is issued, and the buffers must remain registered until the read completes. .P Read operations may not be posted to an rdma_cm_id or the corresponding queue pair until it has been connected. .P The user-defined context associated with the read request will be returned to the user through the work completion wr_id, work request identifier, field. .SH "SEE ALSO" rdma_cm(7), rdma_connect(3), rdma_accept(3), ibv_post_send(3), rdma_post_readv(3), rdma_reg_read(3), rdma_reg_msgs(3) rdma-core-56.1/librdmacm/man/rdma_post_readv.3000066400000000000000000000036531477342711600213030ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_POST_READV" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_post_readv \- post an RDMA read work request. .SH SYNOPSIS .B "#include <rdma/rdma_verbs.h>" .P .B "int" rdma_post_readv .BI "(struct rdma_cm_id *" id "," .BI "void *" context "," .BI "struct ibv_sge *" sgl "," .BI "int " nsge "," .BI "int " flags "," .BI "uint64_t " remote_addr "," .BI "uint32_t " rkey ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier where the request will be posted. .IP "context" 12 User-defined context associated with the request. .IP "sgl" 12 A scatter-gather list of the destination buffers of the read. .IP "nsge" 12 The number of scatter-gather array entries. .IP "flags" 12 Optional flags used to control the read operation. .IP "remote_addr" 12 The address of the remote registered memory to read from. .IP "rkey" 12 The registered memory key associated with the remote address. .SH "DESCRIPTION" Posts a work request to the send queue of the queue pair associated with the rdma_cm_id. The contents of the remote memory region will be read into the local data buffers. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" For a list of supported flags, see ibv_post_send. Both the remote and local data buffers must have been registered before the read is issued, and the buffers must remain registered until the read completes. .P Read operations may not be posted to an rdma_cm_id or the corresponding queue pair until it has been connected. .P The user-defined context associated with the read request will be returned to the user through the work completion wr_id, work request identifier, field. .SH "SEE ALSO" rdma_cm(7), rdma_connect(3), rdma_accept(3), ibv_post_send(3), rdma_post_read(3), rdma_reg_read(3), rdma_reg_msgs(3) rdma-core-56.1/librdmacm/man/rdma_post_recv.3000066400000000000000000000040271477342711600211350ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_POST_RECV" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_post_recv \- post a work request to receive an incoming message.
.SH SYNOPSIS .B "#include <rdma/rdma_verbs.h>" .P .B "int" rdma_post_recv .BI "(struct rdma_cm_id *" id "," .BI "void *" context "," .BI "void *" addr "," .BI "size_t " length "," .BI "struct ibv_mr *" mr ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier where the message buffer will be posted. .IP "context" 12 User-defined context associated with the request. .IP "addr" 12 The address of the memory buffer to post. .IP "length" 12 The length of the memory buffer. .IP "mr" 12 A registered memory region associated with the posted buffer. .SH "DESCRIPTION" Posts a work request to the receive queue of the queue pair associated with the rdma_cm_id. The posted buffer will be queued to receive an incoming message sent by the remote peer. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" The user is responsible for ensuring that a receive buffer is posted and large enough to contain all sent data before the peer posts the corresponding send message. The message buffer must have been registered before being posted, with the mr parameter referencing the registration. The buffer must remain registered until the receive completes. .P Messages may be posted to an rdma_cm_id only after a queue pair has been associated with it. A queue pair is bound to an rdma_cm_id after calling rdma_create_ep or rdma_create_qp, if the rdma_cm_id is allocated using rdma_create_id. .P The user-defined context associated with the receive request will be returned to the user through the work completion wr_id, work request identifier, field. .SH "SEE ALSO" rdma_cm(7), rdma_create_id(3), rdma_create_ep(3), rdma_create_qp(3), rdma_reg_read(3), ibv_reg_mr(3), ibv_dereg_mr(3), rdma_post_recvv(3), rdma_post_send(3) rdma-core-56.1/librdmacm/man/rdma_post_recvv.3000066400000000000000000000037241477342711600213260ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_POST_RECVV" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_post_recvv \- post a work request to receive incoming messages. .SH SYNOPSIS .B "#include <rdma/rdma_verbs.h>" .P .B "int" rdma_post_recvv .BI "(struct rdma_cm_id *" id "," .BI "void *" context "," .BI "struct ibv_sge *" sgl "," .BI "int " nsge ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier where the message buffer(s) will be posted. .IP "context" 12 User-defined context associated with the request. .IP "sgl" 12 A scatter-gather list of memory buffers posted as a single request. .IP "nsge" 12 The number of scatter-gather entries in the sgl array. .SH "DESCRIPTION" Posts a single work request to the receive queue of the queue pair associated with the rdma_cm_id. The posted buffers will be queued to receive an incoming message sent by the remote peer. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" The user is responsible for ensuring that the receive is posted, and the total buffer space is large enough to contain all sent data before the peer posts the corresponding send message. The message buffers must have been registered before being posted, and the buffers must remain registered until the receive completes. .P Messages may be posted to an rdma_cm_id only after a queue pair has been associated with it.
A queue pair is bound to an rdma_cm_id after calling rdma_create_ep or rdma_create_qp, if the rdma_cm_id is allocated using rdma_create_id. .P The user-defined context associated with the receive request will be returned to the user through the work completion wr_id, work request identifier, field. .SH "SEE ALSO" rdma_cm(7), rdma_create_id(3), rdma_create_ep(3), rdma_create_qp(3), rdma_reg_read(3), ibv_reg_mr(3), ibv_dereg_mr(3), rdma_post_recv(3), rdma_post_send(3) rdma-core-56.1/librdmacm/man/rdma_post_send.3000066400000000000000000000037361477342711600211330ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_POST_SEND" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_post_send \- post a work request to send a message. .SH SYNOPSIS .B "#include <rdma/rdma_verbs.h>" .P .B "int" rdma_post_send .BI "(struct rdma_cm_id *" id "," .BI "void *" context "," .BI "void *" addr "," .BI "size_t " length "," .BI "struct ibv_mr *" mr "," .BI "int " flags ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier where the message buffer will be posted. .IP "context" 12 User-defined context associated with the request. .IP "addr" 12 The address of the memory buffer to post. .IP "length" 12 The length of the memory buffer. .IP "mr" 12 Optional registered memory region associated with the posted buffer. .IP "flags" 12 Optional flags used to control the send operation. .SH "DESCRIPTION" Posts a work request to the send queue of the queue pair associated with the rdma_cm_id. The contents of the posted buffer will be sent to the remote peer of a connection. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" The user is responsible for ensuring that the remote peer has queued a receive request before issuing the send operations. For a list of supported flags, see ibv_post_send. Unless the send request is using inline data, the message buffer must have been registered before being posted, with the mr parameter referencing the registration. The buffer must remain registered until the send completes. .P Send operations may not be posted to an rdma_cm_id or the corresponding queue pair until it has been connected. .P The user-defined context associated with the send request will be returned to the user through the work completion wr_id, work request identifier, field. .SH "SEE ALSO" rdma_cm(7), rdma_connect(3), rdma_accept(3), ibv_post_send(3), rdma_post_sendv(3), rdma_post_recv(3) rdma-core-56.1/librdmacm/man/rdma_post_sendv.3000066400000000000000000000035661477342711600213240ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_POST_SENDV" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_post_sendv \- post a work request to send a message. .SH SYNOPSIS .B "#include <rdma/rdma_verbs.h>" .P .B "int" rdma_post_sendv .BI "(struct rdma_cm_id *" id "," .BI "void *" context "," .BI "struct ibv_sge *" sgl "," .BI "int " nsge "," .BI "int " flags ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier where the message buffer will be posted. .IP "context" 12 User-defined context associated with the request. .IP "sgl" 12 A scatter-gather list of memory buffers posted as a single request. .IP "nsge" 12 The number of scatter-gather entries in the sgl array. .IP "flags" 12 Optional flags used to control the send operation.
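.P
As a brief illustration, a two-entry gather list might be built as follows (a sketch only; \fIhdr\fR, \fIpayload\fR, \fIpayload_len\fR and the memory regions \fImr_hdr\fR and \fImr_payload\fR are assumed to exist and to have been registered, e.g. with rdma_reg_msgs):
.P
.nf
struct ibv_sge sgl[2];

/* First entry: the message header. */
sgl[0].addr   = (uint64_t) (uintptr_t) hdr;
sgl[0].length = sizeof(*hdr);
sgl[0].lkey   = mr_hdr->lkey;

/* Second entry: the payload. */
sgl[1].addr   = (uint64_t) (uintptr_t) payload;
sgl[1].length = payload_len;
sgl[1].lkey   = mr_payload->lkey;

/* Both buffers are sent as a single message. */
if (rdma_post_sendv(id, NULL, sgl, 2, 0))
        return -1;
.fi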
.SH "DESCRIPTION" Posts a work request to the send queue of the queue pair associated with the rdma_cm_id. The contents of the posted buffers will be sent to the remote peer of a connection. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" The user is responsible for ensuring that the remote peer has queued a receive request before issuing the send operations. For a list of supported flags, see ibv_post_send. Unless the send request is using inline data, the message buffers must have been registered before being posted, and the buffers must remain registered until the send completes. .P Send operations may not be posted to an rdma_cm_id or the corresponding queue pair until it has been connected. .P The user-defined context associated with the send request will be returned to the user through the work completion wr_id, work request identifier, field. .SH "SEE ALSO" rdma_cm(7), rdma_connect(3), rdma_accept(3), ibv_post_send(3), rdma_post_send(3), rdma_post_recv(3) rdma-core-56.1/librdmacm/man/rdma_post_ud_send.3000066400000000000000000000041051477342711600216140ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_POST_UD_SEND" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_post_ud_send \- post a work request to send a datagram. .SH SYNOPSIS .B "#include " .P .B "int" rdma_post_ud_send .BI "(struct rdma_cm_id *" id "," .BI "void *" context "," .BI "void *" addr "," .BI "size_t " length "," .BI "struct ibv_mr *" mr "," .BI "int " flags "," .BI "struct ibv_ah *" ah "," .BI "uint32_t " remote_qpn ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier where the message buffer will be posted. .IP "context" 12 User-defined context associated with the request. .IP "addr" 12 The address of the memory buffer to post. .IP "length" 12 The length of the memory buffer. .IP "mr" 12 Optional registered memory region associated with the posted buffer. .IP "flags" 12 Optional flags used to control the send operation. .IP "ah" 12 An address handle describing the address of the remote node. .IP "remote_qpn" 12 The number of the destination queue pair. .SH "DESCRIPTION" Posts a work request to the send queue of the queue pair associated with the rdma_cm_id. The contents of the posted buffer will be sent to the specified destination queue pair. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" The user is responsible for ensuring that the destination queue pair has queued a receive request before issuing the send operations. For a list of supported flags, see ibv_post_send. Unless the send request is using inline data, the message buffer must have been registered before being posted, with the mr parameter referencing the registration. The buffer must remain registered until the send completes. .P The user-defined context associated with the send request will be returned to the user through the work completion wr_id, work request identifier, field. 
.SH "SEE ALSO" rdma_cm(7), rdma_connect(3), rdma_accept(3), rdma_reg_msgs(3) ibv_post_send(3), rdma_post_recv(3) rdma-core-56.1/librdmacm/man/rdma_post_write.3000066400000000000000000000041041477342711600213240ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_POST_WRITE" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_post_write \- post an RDMA write work request. .SH SYNOPSIS .B "#include " .P .B "int" rdma_post_write .BI "(struct rdma_cm_id *" id "," .BI "void *" context "," .BI "void *" addr "," .BI "size_t " length "," .BI "struct ibv_mr *" mr "," .BI "int " flags "," .BI "uint64_t " remote_addr "," .BI "uint32_t " rkey ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier where the request will be posted. .IP "context" 12 User-defined context associated with the request. .IP "addr" 12 The local address of the source of the write request. .IP "length" 12 The length of the write operation. .IP "mr" 12 Optional memory region associated with the local buffer. .IP "flags" 12 Optional flags used to control the write operation. .IP "remote_addr" 12 The address of the remote registered memory to write into. .IP "rkey" 12 The registered memory key associated with the remote address. .SH "DESCRIPTION" Posts a work request to the send queue of the queue pair associated with the rdma_cm_id. The contents of the local data buffer will be written into the remote memory region. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" For a list of supported flags, see ibv_post_send. Unless inline data is specified, the local data buffer must have been registered before the write is issued, and the buffer must remain registered until the write completes. The remote buffer must always be registered. .P Write operations may not be posted to an rdma_cm_id or the corresponding queue pair until it has been connected. .P The user-defined context associated with the write request will be returned to the user through the work completion wr_id, work request identifier, field. .SH "SEE ALSO" rdma_cm(7), rdma_connect(3), rdma_accept(3), ibv_post_send(3), rdma_post_writev(3), rdma_reg_write(3), rdma_reg_msgs(3) rdma-core-56.1/librdmacm/man/rdma_post_writev.3000066400000000000000000000037661477342711600215270ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_POST_WRITEV" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_post_writev \- post an RDMA write work request. .SH SYNOPSIS .B "#include " .P .B "int" rdma_post_writev .BI "(struct rdma_cm_id *" id "," .BI "void *" context "," .BI "struct ibv_sge *" sgl "," .BI "int " nsge "," .BI "int " flags "," .BI "uint64_t " remote_addr "," .BI "uint32_t " rkey ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier where the request will be posted. .IP "context" 12 User-defined context associated with the request. .IP "sgl" 12 A scatter-gather list of the source buffers of the write. .IP "nsge" 12 The number of scatter-gather array entries. .IP "flags" 12 Optional flags used to control the write operation. .IP "remote_addr" 12 The address of the remote registered memory to write into. .IP "rkey" 12 The registered memory key associated with the remote address. 
.SH "DESCRIPTION" Posts a work request to the send queue of the queue pair associated with the rdma_cm_id. The contents of the local data buffers will be written into the remote memory region. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" For a list of supported flags, see ibv_post_send. Unless inline data is specified, the local data buffers must have been registered before the write is issued, and the buffers must remain registered until the write completes. The remote buffers must always be registered. .P Write operations may not be posted to an rdma_cm_id or the corresponding queue pair until it has been connected. .P The user-defined context associated with the write request will be returned to the user through the work completion wr_id, work request identifier, field. .SH "SEE ALSO" rdma_cm(7), rdma_connect(3), rdma_accept(3), ibv_post_send(3), rdma_post_write(3), rdma_reg_write(3), rdma_reg_msgs(3) rdma-core-56.1/librdmacm/man/rdma_reg_msgs.3000066400000000000000000000036271477342711600207440ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_REG_MSGS" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_reg_msgs \- register data buffer(s) for sending or receiving messages. .SH SYNOPSIS .B "#include " .P .B "struct ibv_mr *" rdma_reg_msgs .BI "(struct rdma_cm_id *" id "," .BI "void *" addr "," .BI "size_t " length ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier where the message buffer(s) will be used. .IP "addr" 12 The address of the memory buffer(s) to register. .IP "length" 12 The total length of the memory to register. .SH "DESCRIPTION" Registers an array of memory buffers used for sending and receiving messages or for RDMA operations. Memory buffers registered using rdma_reg_msgs may be posted to an rdma_cm_id using rdma_post_send or rdma_post_recv, or specified as the target of an RDMA read operation or the source of an RDMA write request. .SH "RETURN VALUE" Returns a reference to the registered memory region on success, or NULL on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" rdma_reg_msgs is used to register an array of data buffers that will be used send and/or receive messages on a queue pair associated with an rdma_cm_id. The memory buffer is registered with the proteection domain associated with the idenfier. The start of the data buffer array is specified through the addr parameter, and the total size of the array is given by length. .P All data buffers should be registered before being posted as a work request. Users must deregister all registered memory by calling rdma_dereg_mr. .SH "SEE ALSO" rdma_cm(7), rdma_create_id(3), rdma_create_ep(3), rdma_reg_read(3), rdma_reg_write(3), ibv_reg_mr(3), ibv_dereg_mr(3), rdma_post_send(3), rdma_post_recv(3), rdma_post_read(3), rdma_post_readv(3), rdma_post_write(3), rdma_post_writev(3) rdma-core-56.1/librdmacm/man/rdma_reg_read.3000066400000000000000000000034261477342711600207030ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_REG_READ" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_reg_read \- register data buffer(s) for remote RDMA read access. 
.SH SYNOPSIS .B "#include <rdma/rdma_verbs.h>" .P .B "struct ibv_mr *" rdma_reg_read .BI "(struct rdma_cm_id *" id "," .BI "void *" addr "," .BI "size_t " length ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier where the message buffer(s) will be used. .IP "addr" 12 The address of the memory buffer(s) to register. .IP "length" 12 The total length of the memory to register. .SH "DESCRIPTION" Registers a memory buffer that will be accessed by a remote RDMA read operation. Memory buffers registered using rdma_reg_read may be targeted in an RDMA read request, allowing the buffer to be specified on the remote side of an RDMA connection as the remote_addr of rdma_post_read, or similar call. .SH "RETURN VALUE" Returns a reference to the registered memory region on success, or NULL on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" rdma_reg_read is used to register a data buffer that will be the target of an RDMA read operation on a queue pair associated with an rdma_cm_id. The memory buffer is registered with the protection domain associated with the identifier. The start of the data buffer is specified through the addr parameter, and the total size of the buffer is given by length. .P All data buffers should be registered before being posted as a work request. Users must deregister all registered memory by calling rdma_dereg_mr. .SH "SEE ALSO" rdma_cm(7), rdma_create_id(3), rdma_create_ep(3), rdma_reg_msgs(3), rdma_reg_write(3), ibv_reg_mr(3), ibv_dereg_mr(3), rdma_post_read(3) rdma-core-56.1/librdmacm/man/rdma_reg_write.3000066400000000000000000000034411477342711600211170ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_REG_WRITE" 3 "2010-07-19" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_reg_write \- register data buffer(s) for remote RDMA write access. .SH SYNOPSIS .B "#include <rdma/rdma_verbs.h>" .P .B "struct ibv_mr *" rdma_reg_write .BI "(struct rdma_cm_id *" id "," .BI "void *" addr "," .BI "size_t " length ");" .SH ARGUMENTS .IP "id" 12 A reference to a communication identifier where the message buffer(s) will be used. .IP "addr" 12 The address of the memory buffer(s) to register. .IP "length" 12 The total length of the memory to register. .SH "DESCRIPTION" Registers a memory buffer that will be accessed by a remote RDMA write operation. Memory buffers registered using rdma_reg_write may be targeted in an RDMA write request, allowing the buffer to be specified on the remote side of an RDMA connection as the remote_addr of rdma_post_write, or similar call. .SH "RETURN VALUE" Returns a reference to the registered memory region on success, or NULL on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" rdma_reg_write is used to register a data buffer that will be the target of an RDMA write operation on a queue pair associated with an rdma_cm_id. The memory buffer is registered with the protection domain associated with the identifier. The start of the data buffer is specified through the addr parameter, and the total size of the buffer is given by length. .P All data buffers should be registered before being posted as a work request. Users must deregister all registered memory by calling rdma_dereg_mr.
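.SH "EXAMPLE"
A minimal sketch (assuming a connected \fIid\fR; the routine used to advertise the region to the peer is hypothetical, since the exchange mechanism is left to the application):
.P
.nf
static char data[4096];
struct ibv_mr *mr;

mr = rdma_reg_write(id, data, sizeof data);
if (!mr)
        return -1;

/* The peer needs the buffer address and rkey before it can
 * target this region with rdma_post_write(); advertise_to_peer()
 * is a placeholder for an application-defined exchange. */
advertise_to_peer(id, (uint64_t) (uintptr_t) data, mr->rkey);
.fi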
.SH "SEE ALSO" rdma_cm(7), rdma_create_id(3), rdma_create_ep(3), rdma_reg_msgs(3), rdma_reg_read(3), ibv_reg_mr(3), ibv_dereg_mr(3), rdma_post_write(3) rdma-core-56.1/librdmacm/man/rdma_reject.3000066400000000000000000000024441477342711600204060ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_REJECT" 3 "2007-05-15" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_reject \- Called to reject a connection request. .SH SYNOPSIS .B "#include " .P .B "int" rdma_reject .BI "(struct rdma_cm_id *" id "," .BI "const void *" private_data "," .BI "uint8_t " private_data_len ");" .SH ARGUMENTS .IP "id" 12 Connection identifier associated with the request. .IP "private_data" 12 Optional private data to send with the reject message. .IP "private_data_len" 12 Specifies the size of the user-controlled data buffer. Note that the actual amount of data transferred to the remote side is transport dependent and may be larger than that requested. .SH "DESCRIPTION" Called from the listening side to reject a connection or datagram service lookup request. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" After receiving a connection request event, a user may call rdma_reject to reject the request. If the underlying RDMA transport supports private data in the reject message, the specified data will be passed to the remote side. .SH "SEE ALSO" rdma_listen(3), rdma_accept(3), rdma_get_cm_event(3) rdma-core-56.1/librdmacm/man/rdma_resolve_addr.3000066400000000000000000000040271477342711600216020ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_RESOLVE_ADDR" 3 "2007-10-31" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_resolve_addr \- Resolve destination and optional source addresses. .SH SYNOPSIS .B "#include " .P .B "int" rdma_resolve_addr .BI "(struct rdma_cm_id *" id "," .BI "struct sockaddr *" src_addr "," .BI "struct sockaddr *" dst_addr "," .BI "int " timeout_ms ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .IP "src_addr" 12 Source address information. This parameter may be NULL. .IP "dst_addr" 12 Destination address information. .IP "timeout_ms" 12 Time to wait for resolution to complete. .SH "DESCRIPTION" Resolve destination and optional source addresses from IP addresses to an RDMA address. If successful, the specified rdma_cm_id will be bound to a local device. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" This call is used to map a given destination IP address to a usable RDMA address. The IP to RDMA address mapping is done using the local routing tables, or via ARP. If a source address is given, the rdma_cm_id is bound to that address, the same as if rdma_bind_addr were called. If no source address is given, and the rdma_cm_id has not yet been bound to a device, then the rdma_cm_id will be bound to a source address based on the local routing tables. After this call, the rdma_cm_id will be bound to an RDMA device. This call is typically made from the active side of a connection before calling rdma_resolve_route and rdma_connect. .SH "INFINIBAND SPECIFIC" This call maps the destination and, if given, source IP addresses to GIDs. In order to perform the mapping, IPoIB must be running on both the local and remote nodes. 
.SH "SEE ALSO" rdma_create_id(3), rdma_resolve_route(3), rdma_connect(3), rdma_create_qp(3), rdma_get_cm_event(3), rdma_bind_addr(3), rdma_get_src_port(3), rdma_get_dst_port(3), rdma_get_local_addr(3), rdma_get_peer_addr(3) rdma-core-56.1/librdmacm/man/rdma_resolve_route.3000066400000000000000000000021431477342711600220230ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_RESOLVE_ROUTE" 3 "2007-10-31" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_resolve_route \- Resolve the route information needed to establish a connection. .SH SYNOPSIS .B "#include " .P .B "int" rdma_resolve_route .BI "(struct rdma_cm_id *" id "," .BI "int " timeout_ms ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .IP "timeout_ms" 12 Time to wait for resolution to complete. .SH "DESCRIPTION" Resolves an RDMA route to the destination address in order to establish a connection. The destination address must have already been resolved by calling rdma_resolve_addr. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" This is called on the client side of a connection after calling rdma_resolve_addr, but before calling rdma_connect. .SH "INFINIBAND SPECIFIC" This call obtains a path record that is used by the connection. .SH "SEE ALSO" rdma_resolve_addr(3), rdma_connect(3), rdma_get_cm_event(3) rdma-core-56.1/librdmacm/man/rdma_server.1000066400000000000000000000020751477342711600204360ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_SERVER" 1 "2010-07-19" "librdmacm" "librdmacm" librdmacm .SH NAME rdma_server \- simple RDMA CM connection and ping-pong test. .SH SYNOPSIS .sp .nf \fIrdma_server\fR [-p port] .fi .SH "DESCRIPTION" Uses synchronous librdmam calls to establish an RDMA connections between two nodes. This example is intended to provide a very simple coding example of how to use RDMA. .SH "OPTIONS" .TP \-s server_address Specifies the address that the rdma_server listens on. By default the server listens on any address(0.0.0.0). .TP \-p port Changes the port number that the server listens on. By default the server listens on port 7471. .SH "NOTES" Basic usage is to start rdma_server, then connect to the server using the rdma_client program. .P Because this test maps RDMA resources to userspace, users must ensure that they have available system resources and permissions. See the libibverbs README file for additional details. .SH "SEE ALSO" rdma_cm(7), udaddy(1), mckey(1), rping(1), rdma_client(1) rdma-core-56.1/librdmacm/man/rdma_set_local_ece.3.md000066400000000000000000000031111477342711600223020ustar00rootroot00000000000000--- date: 2020-02-02 footer: librdmacm header: "Librdmacm Programmer's Manual" layout: page license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' section: 3 title: RDMA_SET_LOCAL_ECE --- # NAME rdma_set_local_ece - Set local ECE paraemters to be used for REQ/REP communication. # SYNOPSIS ```c #include int rdma_set_local_ece(struct rdma_cm_id *id, struct ibv_ece *ece); ``` # DESCRIPTION **rdma_set_local_ece()** set local ECE parameters. This function is suppose to be used by the users of external QPs. The call needs to be performed before replying to the peer and needed to configure RDMA_CM with desired ECE options. 
Because the QP is external and RDMA_CM does not manage it, the application needs to call the libibverbs API by itself. Usual flow for the passive side will be: * ibv_create_qp() <- create data QP. * ece = ibv_query_ece() <- get ECE from the libibverbs provider. * rdma_set_local_ece(ece) <- set desired ECE options. * rdma_connect() <- send connection request. * ece = rdma_get_remote_ece() <- get ECE options from the remote peer. * ibv_set_ece(ece) <- set local ECE options with data received from the peer. * ibv_modify_qp() <- enable data QP. * rdma_accept()/rdma_establish()/rdma_reject_ece() # ARGUMENTS *id* : RDMA communication identifier. *ece* : ECE parameters. # RETURN VALUE **rdma_set_local_ece()** returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. # SEE ALSO **rdma_cm**(7), **rdma_get_remote_ece**(3) # AUTHOR Leon Romanovsky rdma-core-56.1/librdmacm/man/rdma_set_option.3000066400000000000000000000033501477342711600213120ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_SET_OPTION" 3 "2007-08-06" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rdma_set_option \- Set communication options for an rdma_cm_id. .SH SYNOPSIS .B "#include <rdma/rdma_cma.h>" .P .B "int" rdma_set_option .BI "(struct rdma_cm_id *" id "," .BI "int " level "," .BI "int " optname "," .BI "void *" optval "," .BI "size_t " optlen ");" .SH ARGUMENTS .IP "id" 12 RDMA identifier. .IP "level" 12 Protocol level of the option to set. .IP "optname" 12 Name of the option, relative to the level, to set. .IP "optval" 12 Reference to the option data. The data is dependent on the level and optname. .IP "optlen" 12 The size of the optval buffer. .SH "DESCRIPTION" Sets communication options for an rdma_cm_id. This call is used to override the default system settings. .IP "optname can be one of" 12 .IP "RDMA_OPTION_ID_TOS" 12 Specify the quality of service provided by a connection. The expected optlen is size of uint8_t. .IP "RDMA_OPTION_ID_REUSEADDR" 12 Bind the rdma_cm_id to a reusable address. This will allow other users to bind to that same address. The expected optlen is size of int. .IP "RDMA_OPTION_ID_AFONLY" 12 Set the IPV6_V6ONLY socket option. The expected optlen is size of int. .IP "RDMA_OPTION_IB_PATH" 12 Set IB path record data. The expected optlen is size of struct ibv_path_data[]. .IP "RDMA_OPTION_ID_ACK_TIMEOUT" 12 Set the QP ACK timeout. The value is calculated according to the formula 4.096 * 2^(ack_timeout) usec. .SH "RETURN VALUE" Returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH "NOTES" Option details may be found in the relevant header files. .SH "SEE ALSO" rdma_create_id(3) rdma-core-56.1/librdmacm/man/rdma_xclient.1000066400000000000000000000025671477342711600206020ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_XCLIENT" 1 "2011-06-15" "librdmacm" "librdmacm" librdmacm .SH NAME rdma_xclient \- RDMA CM communication client test program .SH SYNOPSIS .sp .nf \fIrdma_xclient\fR [-s server_address] [-p server_port] [-c comm_type] .fi .SH "DESCRIPTION" Uses synchronous librdmacm calls to establish an RDMA connection between two nodes. This example is intended to provide a very simple coding example of how to use RDMA. .SH "OPTIONS" .TP \-s server_address Specifies the address of the system that the rdma_xserver is running on.
By default, the client will attempt to connect to the server using 127.0.0.1. .TP \-p server_port Specifies the port number that the server listens on. By default the server listens on port 7471. .TP \-c communication type Specifies the type of communication established with the server program. 'r' results in using a reliable-connected QP (the default). 'x' uses extended reliable-connected XRC QPs. .SH "NOTES" Basic usage is to start rdma_xserver, then connect to the server using the rdma_xclient program. .P Because this test maps RDMA resources to userspace, users must ensure that they have available system resources and permissions. See the libibverbs README file for additional details. .SH "SEE ALSO" rdma_cm(7), udaddy(1), mckey(1), rping(1), rdma_xserver(1), rdma_client(1) rdma-core-56.1/librdmacm/man/rdma_xserver.1000066400000000000000000000021201477342711600206170ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RDMA_XSERVER" 1 "2011-06-15" "librdmacm" "librdmacm" librdmacm .SH NAME rdma_xserver \- RDMA CM communication server test program .SH SYNOPSIS .sp .nf \fIrdma_xserver\fR [-p port] [-c comm_type] .fi .SH "DESCRIPTION" Uses the librdmacm to establish various forms of communication and exchange data. .SH "OPTIONS" .TP \-p port Changes the port number that the server listens on. By default the server listens on port 7471. .TP \-c communication type Specifies the type of communication established with the client program. 'r' results in using a reliable-connected QP (the default). 'x' uses extended reliable-connected XRC QPs. .SH "NOTES" Basic usage is to start rdma_xserver, then connect to the server using the rdma_xclient program. .P Because this test maps RDMA resources to userspace, users must ensure that they have available system resources and permissions. See the libibverbs README file for additional details. .SH "SEE ALSO" rdma_cm(7), udaddy(1), mckey(1), rping(1), rdma_server(1), rdma_xclient(1) rdma-core-56.1/librdmacm/man/riostream.1000066400000000000000000000042541477342711600201330ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RIOSTREAM" 1 "2012-10-24" "librdmacm" "librdmacm" librdmacm .SH NAME riostream \- zero-copy streaming over RDMA ping-pong test. .SH SYNOPSIS .sp .nf \fIriostream\fR [-s server_address] [-b bind_address] [-B buffer_size] [-I iterations] [-C transfer_count] [-S transfer_size] [-p server_port] [-T test_option] .fi .SH "DESCRIPTION" Uses the streaming over RDMA protocol (rsocket) to connect and exchange data between a client and server application. .SH "OPTIONS" .TP \-s server_address The network name or IP address of the server system listening for connections. The used name or address must route over an RDMA device. This option must be specified by the client. .TP \-b bind_address The local network address to bind to. .TP \-B buffer_size Indicates the size of the send and receive network buffers. .TP \-I iterations The number of times that the specified number of messages will be exchanged between the client and server. (default 1000) .TP \-C transfer_count The number of messages to transfer from the client to the server and back again on each iteration. (default 1) .TP \-S transfer_size The size of each send transfer, in bytes. (default 1000) If 'all' is specified, riostream will run a series of tests of various sizes. .TP \-p server_port The server's port number. .TP \-T test_option Specifies test parameters.
Available options are: .P a | async - uses asynchronous operation (e.g. select / poll) .P b | blocking - uses blocking calls .P n | nonblocking - uses non-blocking calls .P v | verify - verifies data transfers .SH "NOTES" Basic usage is to start riostream on a server system, then run riostream -s server_name on a client system. By default, riostream will run a series of latency and bandwidth performance tests. Specifying different iterations, transfer_count, or transfer_size values will run a user-customized test using default values where none have been specified. .P Because this test maps RDMA resources to userspace, users must ensure that they have available system resources and permissions. See the libibverbs README file for additional details. .SH "SEE ALSO" rdma_cm(7) rstream(1) rdma-core-56.1/librdmacm/man/rping.1000066400000000000000000000034511477342711600172430ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RPING" 1 "2007-05-15" "librdmacm" "librdmacm" librdmacm .SH NAME rping \- RDMA CM connection and RDMA ping-pong test. .SH SYNOPSIS .sp .nf \fIrping\fR -s [-v] [-V] [-d] [-P] [-a address] [-p port] [-C message_count] [-S message_size] \fIrping\fR -c [-v] [-V] [-d] [-I address] -a address [-p port] [-C message_count] [-S message_size] .fi .SH "DESCRIPTION" Establishes a reliable RDMA connection between two nodes using the librdmacm, optionally performs RDMA transfers between the nodes, then disconnects. .SH "OPTIONS" .TP \-s Run as the server. .TP \-c Run as the client. .TP \-a address On the server, specifies the network address to bind the connection to. To bind to any address with IPv6 use -a ::0 . On the client, specifies the server address to connect to. .TP \-I address The address to bind to as the source IP address to use. This is useful if you have multiple addresses on the same network or complex routing. .TP \-p Port number for listening server. .TP \-v Display ping data. .TP \-V Validate ping data. .TP \-d Display debug information. .TP \-C message_count The number of messages to transfer over each connection. (default infinite) .TP \-S message_size The size of each message transferred, in bytes. (default 100) .TP \-P Run the server in persistent mode. This allows multiple rping clients to connect to a single server instance. The server will run until killed. .TP \-q Control QP Creation/Modification directly from the application, instead of rdma_cm. .SH "NOTES" Because this test maps RDMA resources to userspace, users must ensure that they have available system resources and permissions. See the libibverbs README file for additional details. .SH "SEE ALSO" rdma_cm(7), ucmatose(1), udaddy(1), mckey(1) rdma-core-56.1/librdmacm/man/rsocket.7.in000066400000000000000000000145171477342711600202140ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RSOCKET" 7 "2019-04-16" "librdmacm" "Librdmacm Programmer's Manual" librdmacm .SH NAME rsocket \- RDMA socket API .SH SYNOPSIS .B "#include <rdma/rsocket.h>" .SH "DESCRIPTION" RDMA socket API and protocol .SH "NOTES" Rsockets is a protocol over RDMA that supports a socket-level API for applications. Rsocket APIs are intended to match the behavior of corresponding socket calls, except where noted. Rsocket functions match the name and function signature of socket calls, with the exception that all function calls are prefixed with an 'r'.
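.P
As an illustrative sketch (assuming \fIaddr\fR holds a resolved server address and \fIbuf\fR a data buffer, with error checking omitted), a conventional TCP client maps onto rsockets by substituting the 'r' calls:
.nf
int fd = rsocket(AF_INET, SOCK_STREAM, 0);

rconnect(fd, (struct sockaddr *) &addr, sizeof(addr));
rsend(fd, buf, sizeof(buf), 0);   /* same semantics as send() */
rrecv(fd, buf, sizeof(buf), 0);   /* same semantics as recv() */
rclose(fd);
.fi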
.P The following functions are defined: .P rsocket .P rbind, rlisten, raccept, rconnect .P rshutdown, rclose .P rrecv, rrecvfrom, rrecvmsg, rread, rreadv .P rsend, rsendto, rsendmsg, rwrite, rwritev .P rpoll, rselect .P rgetpeername, rgetsockname .P rsetsockopt, rgetsockopt, rfcntl .P Functions take the same parameters as those used for sockets. The following capabilities and flags are supported at this time: .P PF_INET, PF_INET6, SOCK_STREAM, SOCK_DGRAM .P SOL_SOCKET - SO_ERROR, SO_KEEPALIVE (flag supported, but ignored), SO_LINGER, SO_OOBINLINE, SO_RCVBUF, SO_REUSEADDR, SO_SNDBUF .P IPPROTO_TCP - TCP_NODELAY, TCP_MAXSEG .P IPPROTO_IPV6 - IPV6_V6ONLY .P MSG_DONTWAIT, MSG_PEEK, O_NONBLOCK .P Rsockets provides extensions beyond normal socket routines that allow for direct placement of data into an application's buffer. This is also known as zero-copy support, since data is sent and received directly, bypassing copies into network controlled buffers. The following calls and options support direct data placement. .P riomap, riounmap, riowrite .TP off_t riomap(int socket, void *buf, size_t len, int prot, int flags, off_t offset) .TP Riomap registers an application buffer with the RDMA hardware associated with an rsocket. The buffer is registered either for local only access (PROT_NONE) or for remote write access (PROT_WRITE). When registered for remote access, the buffer is mapped to a given offset. The offset is either provided by the user, or if the user selects -1 for the offset, rsockets selects one. The remote peer may access an iomapped buffer directly by specifying the correct offset. The mapping is not guaranteed to be available until after the remote peer receives a data transfer initiated after riomap has completed. .PP In order to enable the use of remote IO mapping calls on an rsocket, an application must set the number of IO mappings that are available to the remote peer. This may be done using the rsetsockopt RDMA_IOMAPSIZE option. By default, an rsocket does not support remote IO mappings. riounmap .TP int riounmap(int socket, void *buf, size_t len) .TP Riounmap removes the mapping between a buffer and an rsocket. .P riowrite .TP size_t riowrite(int socket, const void *buf, size_t count, off_t offset, int flags) .TP Riowrite allows an application to transfer data over an rsocket directly into a remotely iomapped buffer. The remote buffer is specified through an offset parameter, which corresponds to a remote iomapped buffer. From the sender's perspective, riowrite behaves similarly to rwrite. From a receiver's view, riowrite transfers are silently redirected into a predetermined data buffer. Data is received automatically, and the receiver is not informed of the transfer. However, iowrite data is still considered part of the data stream, such that iowrite data will be written before a subsequent transfer is received. A message sent immediately after initiating an iowrite may be used to notify the receiver of the iowrite. .P In addition to standard socket options, rsockets supports options specific to RDMA devices and protocols. These options are accessible through rsetsockopt using SOL_RDMA option level. .TP RDMA_SQSIZE - Integer size of the underlying send queue. .TP RDMA_RQSIZE - Integer size of the underlying receive queue. .TP RDMA_INLINE - Integer size of inline data. .TP RDMA_IOMAPSIZE - Integer number of remote IO mappings supported .TP RDMA_ROUTE - struct ibv_path_data of path record for connection. .P Note that rsockets fd's cannot be passed into non-rsocket calls.
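.P
As a sketch of the direct data placement calls described above (error handling omitted; \fIfd\fR is a connected rsocket, and an offset of 0 is assumed to be agreed upon between the peers beforehand):
.nf
/* receiver: permit one remote mapping, then expose buf for remote writes */
int one = 1;
rsetsockopt(fd, SOL_RDMA, RDMA_IOMAPSIZE, &one, sizeof(one));
off_t off = riomap(fd, buf, len, PROT_WRITE, 0, 0);

/* sender: place data directly into the peer's buffer, then send a
   separate message, since the receiver is not told about the riowrite */
riowrite(fd, data, len, 0, 0);
rsend(fd, &note, sizeof(note), 0);
.fi
.P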
For applications which must mix rsocket fd's with standard socket fd's or opened files, rpoll and rselect support polling both rsockets and normal fd's. .P Existing applications can make use of rsockets through the use of a preload library. Because rsockets implements an end-to-end protocol, both sides of a connection must use rsockets. The rdma_cm library provides such a preload library, librspreload. To reduce the chance of the preload library intercepting calls without the user's explicit knowledge, the librspreload library is installed into the %libdir%/rsocket subdirectory. .P The preload library can be used by setting LD_PRELOAD when running. Note that not all applications will work with rsockets. Support is limited based on the socket options used by the application. Support for fork() is limited, but available. To use rsockets with the preload library for applications that call fork, users must set the environment variable RDMAV_FORK_SAFE=1 on both the client and server side of the connection. In general, fork is supportable for server applications that accept a connection, then fork off a process to handle the new connection. .P rsockets uses configuration files that give an administrator control over the default settings used by rsockets. Use files under @CMAKE_INSTALL_FULL_SYSCONFDIR@/rdma/rsocket as shown: .P mem_default - default size of receive buffer(s) .P wmem_default - default size of send buffer(s) .P sqsize_default - default size of send queue .P rqsize_default - default size of receive queue .P inline_default - default size of inline data .P iomap_size - default size of remote iomapping table .P polling_time - default number of microseconds to poll for data before waiting .P wake_up_interval - maximum number of milliseconds to block in poll. This value is used to safeguard against potential application hangs in rpoll(). .P All configuration files should contain a single integer value. Values may be set by issuing a command similar to the following example. .P echo 1000000 > @CMAKE_INSTALL_FULL_SYSCONFDIR@/rdma/rsocket/mem_default .P If configuration files are not available, rsockets uses internal defaults. Applications can override default values programmatically through the rsetsockopt routine. .SH "SEE ALSO" rdma_cm(7) rdma-core-56.1/librdmacm/man/rstream.1000066400000000000000000000046261477342711600176060ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "RSTREAM" 1 "2011-11-16" "librdmacm" "librdmacm" librdmacm .SH NAME rstream \- streaming over RDMA ping-pong test. .SH SYNOPSIS .sp .nf \fIrstream\fR [-s server_address] [-b bind_address] [-f address_format] [-B buffer_size] [-I iterations] [-C transfer_count] [-S transfer_size] [-p server_port] [-T test_option] .fi .SH "DESCRIPTION" Uses the streaming over RDMA protocol (rsocket) to connect and exchange data between a client and server application. .SH "OPTIONS" .TP \-s server_address The network name or IP address of the server system listening for connections. The used name or address must route over an RDMA device. This option must be specified by the client. .TP \-b bind_address The local network address to bind to. .TP \-f address_format Supported address formats are ip, ipv6, gid, or name. .TP \-B buffer_size Indicates the size of the send and receive network buffers. .TP \-I iterations The number of times that the specified number of messages will be exchanged between the client and server.
(default 1000) .TP \-C transfer_count The number of messages to transfer from the client to the server and back again on each iteration. (default 1000) .TP \-S transfer_size The size of each send transfer, in bytes. (default 1000) If 'all' is specified, rstream will run a series of tests of various sizes. .TP \-p server_port The server's port number. .TP \-T test_option Specifies test parameters. Available options are: .P s | socket - uses standard socket calls to transfer data .P a | async - uses asynchronous operation (e.g. select / poll) .P b | blocking - uses blocking calls .P f | fork - fork server processing (forces -T s option) .P n | nonblocking - uses non-blocking calls .P r | resolve - use rdma cm to resolve address .P v | verify - verifies data transfers .SH "NOTES" Basic usage is to start rstream on a server system, then run rstream -s server_name on a client system. By default, rstream will run a series of latency and bandwidth performance tests. Specifying different iterations, transfer_count, or transfer_size values will run a user-customized test using default values where none have been specified. .P Because this test maps RDMA resources to userspace, users must ensure that they have available system resources and permissions. See the libibverbs README file for additional details. .SH "SEE ALSO" rdma_cm(7) rdma-core-56.1/librdmacm/man/ucmatose.1000066400000000000000000000052351477342711600177460ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "UCMATOSE" 1 "2007-05-15" "librdmacm" "librdmacm" librdmacm .SH NAME ucmatose \- RDMA CM connection and simple ping-pong test. .SH SYNOPSIS .sp .nf \fIucmatose\fR [-s server_address] [-b bind_address] [-f address_format] [-P port_space] [-c connections] [-C message_count] [-S message_size] [-a ack_timeout] \fIucmatose\fR -s server_address [-b bind_address] [-f address_format] [-P port_space] [-c connections] [-C message_count] [-S message_size] [-t tos] [-a ack_timeout] .fi .SH "DESCRIPTION" Establishes a set of reliable RDMA connections between two nodes using the librdmacm, optionally transfers data between the nodes, then disconnects. .SH "OPTIONS" .TP \-s server_address The network name or IP address of the server system listening for connections. The used name or address must route over an RDMA device. This option must be specified by the client. .TP \-b bind_address The local network address to bind to. To bind to any address with IPv6 use -b ::0 . .TP \-f address_format Specifies the format of the server and bind address. By default, the format is determined by getaddrinfo() as either being a hostname, an IPv4 address, or an IPv6 address. This option may be used to indicate that a specific address format has been provided. Supported address_format values are: name, ip, ipv6, and gid. .TP \-P port_space Specifies the port space for the connection. By default, the port space is the RDMA TCP port space. (Note that the RDMA port space may be separate from that used for IP.) Supported port_space values are: tcp and ib. .TP \-c connections The number of connections to establish between the client and server. (default 1) .TP \-C message_count The number of messages to transfer over each connection. (default 10) .TP \-S message_size The size of each message transferred, in bytes. (default 100) .TP \-t tos Indicates the type of service used for the communication. Type of service is implementation dependent based on subnet configuration.
.TP \-a ack_timeout Indicates the QP ACK timeout value that should be used. The value is calculated according to the formula 4.096 * 2^(ack_timeout) usec. .TP \-m Tests event channel migration. Migrates all communication identifiers to a different event channel for disconnect events. .SH "NOTES" Basic usage is to start ucmatose on a server system, then run ucmatose -s server_name on a client system. .P Because this test maps RDMA resources to userspace, users must ensure that they have available system resources and permissions. See the libibverbs README file for additional details. .SH "SEE ALSO" rdma_cm(7), udaddy(1), mckey(1), rping(1) rdma-core-56.1/librdmacm/man/udaddy.1000066400000000000000000000041561477342711600174010ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "UDADDY" 1 "2007-05-15" "librdmacm" "librdmacm" librdmacm .SH NAME udaddy \- RDMA CM datagram setup and simple ping-pong test. .SH SYNOPSIS .sp .nf \fIudaddy\fR [-s server_address] [-b bind_address] [-c connections] [-C message_count] [-S message_size] [-p port_space] \fIudaddy\fR -s server_address [-b bind_address] [-c connections] [-C message_count] [-S message_size] [-t tos] [-p port_space] .fi .SH "DESCRIPTION" Establishes a set of unreliable RDMA datagram communication paths between two nodes using the librdmacm, optionally transfers datagrams between the nodes, then tears down the communication. .SH "OPTIONS" .TP \-s server_address The network name or IP address of the server system listening for communication. The used name or address must route over an RDMA device. This option must be specified by the client. .TP \-b bind_address The local network address to bind to. To bind to any address with IPv6 use -b ::0 . .TP \-c connections The number of communication paths to establish between the client and server. The test uses unreliable datagram communication, so no actual connections are formed. (default 1) .TP \-C message_count The number of messages to transfer over each connection. (default 10) .TP \-S message_size The size of each message transferred, in bytes. This value must be smaller than the MTU of the underlying RDMA transport, or an error will occur. (default 100) .TP \-t tos Indicates the type of service used for the communication. Type of service is implementation dependent based on subnet configuration. .TP \-p port_space The port space of the datagram communication. May be either the RDMA UDP (0x0111) or IPoIB (0x0002) port space. (default RDMA_PS_UDP) .SH "NOTES" Basic usage is to start udaddy on a server system, then run udaddy -s server_name on a client system. .P Because this test maps RDMA resources to userspace, users must ensure that they have available system resources and permissions. See the libibverbs README file for additional details. .SH "SEE ALSO" rdma_cm(7), ucmatose(1), mckey(1), rping(1) rdma-core-56.1/librdmacm/man/udpong.1000066400000000000000000000037461477342711600174270ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH "UDPONG" 1 "2017-04-28" "librdmacm" "librdmacm" librdmacm .SH NAME udpong \- unreliable datagram streaming over RDMA ping-pong test.
.SH SYNOPSIS .sp .nf \fIudpong\fR [-s server_address] [-b bind_address] [-B buffer_size] [-C transfer_count] [-S transfer_size] [-p server_port] [-T test_option] .fi .SH "DESCRIPTION" Uses unreliable datagram streaming over RDMA protocol (rsocket) to connect and exchange data between a client and server application. .SH "OPTIONS" .TP \-s server_address The network name or IP address of the server system listening for connections. The used name or address must route over an RDMA device. This option must be specified by the client. .TP \-b bind_address The local network address to bind to. .TP \-B buffer_size Indicates the size of the send and receive network buffers. .TP \-C transfer_count The number of messages to transfer from the client to the server and back again on each iteration. (default 1000) .TP \-S transfer_size The size of each send transfer, in bytes. (default 1000) .TP \-p server_port The server's port number. .TP \-T test_option Specifies test parameters. Available options are: .P s | socket - uses standard socket calls to transfer data .P a | async - uses asynchronous operation (e.g. select / poll) .P b | blocking - uses blocking calls .P n | nonblocking - uses non-blocking calls .P e | echo - server echoes all messages .SH "NOTES" Basic usage is to start udpong on a server system, then run udpong -s server_name on a client system. udpong will run a series of latency and bandwidth performance tests. Specifying a different transfer_count or transfer_size will run a user customized test using default values where none have been specified. .P Because this test maps RDMA resources to userspace, users must ensure that they have available system resources and permissions. See the libibverbs README file for additional details. .SH "SEE ALSO" rdma_cm(7) rdma-core-56.1/librdmacm/preload.c000066400000000000000000000656241477342711600170730ustar00rootroot00000000000000/* * Copyright (c) 2011-2012 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #define _GNU_SOURCE #include <sys/types.h> #include <sys/socket.h> #include <sys/stat.h> #include <sys/mman.h> #include <sys/sendfile.h> #include <sys/uio.h> #include <sys/select.h> #include <poll.h> #include <netinet/in.h> #include <netinet/tcp.h> #include <netdb.h> #include <unistd.h> #include <fcntl.h> #include <errno.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <stdarg.h> #include <stdatomic.h> #include <dlfcn.h> #include <pthread.h> #include <semaphore.h> #include <rdma/rdma_cma.h> #include <rdma/rsocket.h> #include "cma.h" #include "indexer.h" struct socket_calls { int (*socket)(int domain, int type, int protocol); int (*bind)(int socket, const struct sockaddr *addr, socklen_t addrlen); int (*listen)(int socket, int backlog); int (*accept)(int socket, struct sockaddr *addr, socklen_t *addrlen); int (*connect)(int socket, const struct sockaddr *addr, socklen_t addrlen); ssize_t (*recv)(int socket, void *buf, size_t len, int flags); ssize_t (*recvfrom)(int socket, void *buf, size_t len, int flags, struct sockaddr *src_addr, socklen_t *addrlen); ssize_t (*recvmsg)(int socket, struct msghdr *msg, int flags); ssize_t (*read)(int socket, void *buf, size_t count); ssize_t (*readv)(int socket, const struct iovec *iov, int iovcnt); ssize_t (*send)(int socket, const void *buf, size_t len, int flags); ssize_t (*sendto)(int socket, const void *buf, size_t len, int flags, const struct sockaddr *dest_addr, socklen_t addrlen); ssize_t (*sendmsg)(int socket, const struct msghdr *msg, int flags); ssize_t (*write)(int socket, const void *buf, size_t count); ssize_t (*writev)(int socket, const struct iovec *iov, int iovcnt); int (*poll)(struct pollfd *fds, nfds_t nfds, int timeout); int (*shutdown)(int socket, int how); int (*close)(int socket); int (*getpeername)(int socket, struct sockaddr *addr, socklen_t *addrlen); int (*getsockname)(int socket, struct sockaddr *addr, socklen_t *addrlen); int (*setsockopt)(int socket, int level, int optname, const void *optval, socklen_t optlen); int (*getsockopt)(int socket, int level, int optname, void *optval, socklen_t *optlen); int (*fcntl)(int socket, int cmd, ...
/* arg */); int (*dup2)(int oldfd, int newfd); ssize_t (*sendfile)(int out_fd, int in_fd, off_t *offset, size_t count); int (*fxstat)(int ver, int fd, struct stat *buf); }; static struct socket_calls real; static struct socket_calls rs; static struct index_map idm; static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER; static int sq_size; static int rq_size; static int sq_inline; static int fork_support; enum fd_type { fd_normal, fd_rsocket }; enum fd_fork_state { fd_ready, fd_fork, fd_fork_listen, fd_fork_active, fd_fork_passive }; struct fd_info { enum fd_type type; enum fd_fork_state state; int fd; int dupfd; _Atomic(int) refcnt; }; struct config_entry { char *name; int domain; int type; int protocol; }; static struct config_entry *config; static int config_cnt; static void free_config(void) { while (config_cnt) free(config[--config_cnt].name); free(config); } /* * Config file format: * # Starting '#' indicates comment * # wild card values are supported using '*' * # domain - *, INET, INET6, IB * # type - *, STREAM, DGRAM * # protocol - *, TCP, UDP * program_name domain type protocol */ static void scan_config(void) { struct config_entry *new_config; FILE *fp; char line[120], prog[64], dom[16], type[16], proto[16]; fp = fopen(RS_CONF_DIR "/preload_config", "r"); if (!fp) return; while (fgets(line, sizeof(line), fp)) { if (line[0] == '#') continue; if (sscanf(line, "%63s%15s%15s%15s", prog, dom, type, proto) != 4) continue; new_config = realloc(config, (config_cnt + 1) * sizeof(struct config_entry)); if (!new_config) break; config = new_config; memset(&config[config_cnt], 0, sizeof(struct config_entry)); if (!strcasecmp(dom, "INET") || !strcasecmp(dom, "AF_INET") || !strcasecmp(dom, "PF_INET")) { config[config_cnt].domain = AF_INET; } else if (!strcasecmp(dom, "INET6") || !strcasecmp(dom, "AF_INET6") || !strcasecmp(dom, "PF_INET6")) { config[config_cnt].domain = AF_INET6; } else if (!strcasecmp(dom, "IB") || !strcasecmp(dom, "AF_IB") || !strcasecmp(dom, "PF_IB")) { config[config_cnt].domain = AF_IB; } else if (strcmp(dom, "*")) { continue; } if (!strcasecmp(type, "STREAM") || !strcasecmp(type, "SOCK_STREAM")) { config[config_cnt].type = SOCK_STREAM; } else if (!strcasecmp(type, "DGRAM") || !strcasecmp(type, "SOCK_DGRAM")) { config[config_cnt].type = SOCK_DGRAM; } else if (strcmp(type, "*")) { continue; } if (!strcasecmp(proto, "TCP") || !strcasecmp(proto, "IPPROTO_TCP")) { config[config_cnt].protocol = IPPROTO_TCP; } else if (!strcasecmp(proto, "UDP") || !strcasecmp(proto, "IPPROTO_UDP")) { config[config_cnt].protocol = IPPROTO_UDP; } else if (strcmp(proto, "*")) { continue; } if (strcmp(prog, "*")) { if (!(config[config_cnt].name = strdup(prog))) continue; } config_cnt++; } fclose(fp); if (config_cnt) atexit(free_config); } static int intercept_socket(int domain, int type, int protocol) { int i; if (!config_cnt) return 1; if (!protocol) { if (type == SOCK_STREAM) protocol = IPPROTO_TCP; else if (type == SOCK_DGRAM) protocol = IPPROTO_UDP; } for (i = 0; i < config_cnt; i++) { if ((!config[i].name || !strncasecmp(config[i].name, program_invocation_short_name, strlen(config[i].name))) && (!config[i].domain || config[i].domain == domain) && (!config[i].type || config[i].type == type) && (!config[i].protocol || config[i].protocol == protocol)) return 1; } return 0; } static int fd_open(void) { struct fd_info *fdi; int ret, index; fdi = calloc(1, sizeof(*fdi)); if (!fdi) return ERR(ENOMEM); index = open("/dev/null", O_RDONLY); if (index < 0) { ret = index; goto err1; } fdi->dupfd = -1; 
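/* not duplicated yet; refcnt counts the dup2() aliases sharing this fd_info */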
atomic_store(&fdi->refcnt, 1); pthread_mutex_lock(&mut); ret = idm_set(&idm, index, fdi); pthread_mutex_unlock(&mut); if (ret < 0) goto err2; return index; err2: real.close(index); err1: free(fdi); return ret; } static void fd_store(int index, int fd, enum fd_type type, enum fd_fork_state state) { struct fd_info *fdi; fdi = idm_at(&idm, index); fdi->fd = fd; fdi->type = type; fdi->state = state; } static inline enum fd_type fd_get(int index, int *fd) { struct fd_info *fdi; fdi = idm_lookup(&idm, index); if (fdi) { *fd = fdi->fd; return fdi->type; } else { *fd = index; return fd_normal; } } static inline int fd_getd(int index) { struct fd_info *fdi; fdi = idm_lookup(&idm, index); return fdi ? fdi->fd : index; } static inline enum fd_fork_state fd_gets(int index) { struct fd_info *fdi; fdi = idm_lookup(&idm, index); return fdi ? fdi->state : fd_ready; } static inline enum fd_type fd_gett(int index) { struct fd_info *fdi; fdi = idm_lookup(&idm, index); return fdi ? fdi->type : fd_normal; } static enum fd_type fd_close(int index, int *fd) { struct fd_info *fdi; enum fd_type type; fdi = idm_lookup(&idm, index); if (fdi) { idm_clear(&idm, index); *fd = fdi->fd; type = fdi->type; real.close(index); free(fdi); } else { *fd = index; type = fd_normal; } return type; } static void getenv_options(void) { char *var; var = getenv("RS_SQ_SIZE"); if (var) sq_size = atoi(var); var = getenv("RS_RQ_SIZE"); if (var) rq_size = atoi(var); var = getenv("RS_INLINE"); if (var) sq_inline = atoi(var); var = getenv("RDMAV_FORK_SAFE"); if (var) fork_support = atoi(var); } static void init_preload(void) { static int init; /* Quick check without lock */ if (init) return; pthread_mutex_lock(&mut); if (init) goto out; real.socket = dlsym(RTLD_NEXT, "socket"); real.bind = dlsym(RTLD_NEXT, "bind"); real.listen = dlsym(RTLD_NEXT, "listen"); real.accept = dlsym(RTLD_NEXT, "accept"); real.connect = dlsym(RTLD_NEXT, "connect"); real.recv = dlsym(RTLD_NEXT, "recv"); real.recvfrom = dlsym(RTLD_NEXT, "recvfrom"); real.recvmsg = dlsym(RTLD_NEXT, "recvmsg"); real.read = dlsym(RTLD_NEXT, "read"); real.readv = dlsym(RTLD_NEXT, "readv"); real.send = dlsym(RTLD_NEXT, "send"); real.sendto = dlsym(RTLD_NEXT, "sendto"); real.sendmsg = dlsym(RTLD_NEXT, "sendmsg"); real.write = dlsym(RTLD_NEXT, "write"); real.writev = dlsym(RTLD_NEXT, "writev"); real.poll = dlsym(RTLD_NEXT, "poll"); real.shutdown = dlsym(RTLD_NEXT, "shutdown"); real.close = dlsym(RTLD_NEXT, "close"); real.getpeername = dlsym(RTLD_NEXT, "getpeername"); real.getsockname = dlsym(RTLD_NEXT, "getsockname"); real.setsockopt = dlsym(RTLD_NEXT, "setsockopt"); real.getsockopt = dlsym(RTLD_NEXT, "getsockopt"); real.fcntl = dlsym(RTLD_NEXT, "fcntl"); real.dup2 = dlsym(RTLD_NEXT, "dup2"); real.sendfile = dlsym(RTLD_NEXT, "sendfile"); real.fxstat = dlsym(RTLD_NEXT, "__fxstat"); rs.socket = dlsym(RTLD_DEFAULT, "rsocket"); rs.bind = dlsym(RTLD_DEFAULT, "rbind"); rs.listen = dlsym(RTLD_DEFAULT, "rlisten"); rs.accept = dlsym(RTLD_DEFAULT, "raccept"); rs.connect = dlsym(RTLD_DEFAULT, "rconnect"); rs.recv = dlsym(RTLD_DEFAULT, "rrecv"); rs.recvfrom = dlsym(RTLD_DEFAULT, "rrecvfrom"); rs.recvmsg = dlsym(RTLD_DEFAULT, "rrecvmsg"); rs.read = dlsym(RTLD_DEFAULT, "rread"); rs.readv = dlsym(RTLD_DEFAULT, "rreadv"); rs.send = dlsym(RTLD_DEFAULT, "rsend"); rs.sendto = dlsym(RTLD_DEFAULT, "rsendto"); rs.sendmsg = dlsym(RTLD_DEFAULT, "rsendmsg"); rs.write = dlsym(RTLD_DEFAULT, "rwrite"); rs.writev = dlsym(RTLD_DEFAULT, "rwritev"); rs.poll = dlsym(RTLD_DEFAULT, "rpoll"); rs.shutdown = dlsym(RTLD_DEFAULT, 
"rshutdown"); rs.close = dlsym(RTLD_DEFAULT, "rclose"); rs.getpeername = dlsym(RTLD_DEFAULT, "rgetpeername"); rs.getsockname = dlsym(RTLD_DEFAULT, "rgetsockname"); rs.setsockopt = dlsym(RTLD_DEFAULT, "rsetsockopt"); rs.getsockopt = dlsym(RTLD_DEFAULT, "rgetsockopt"); rs.fcntl = dlsym(RTLD_DEFAULT, "rfcntl"); getenv_options(); scan_config(); init = 1; out: pthread_mutex_unlock(&mut); } /* * We currently only handle copying a few common values. */ static int copysockopts(int dfd, int sfd, struct socket_calls *dapi, struct socket_calls *sapi) { socklen_t len; int param, ret; ret = sapi->fcntl(sfd, F_GETFL); if (ret > 0) ret = dapi->fcntl(dfd, F_SETFL, ret); if (ret) return ret; len = sizeof param; ret = sapi->getsockopt(sfd, SOL_SOCKET, SO_REUSEADDR, ¶m, &len); if (param && !ret) ret = dapi->setsockopt(dfd, SOL_SOCKET, SO_REUSEADDR, ¶m, len); if (ret) return ret; len = sizeof param; ret = sapi->getsockopt(sfd, IPPROTO_TCP, TCP_NODELAY, ¶m, &len); if (param && !ret) ret = dapi->setsockopt(dfd, IPPROTO_TCP, TCP_NODELAY, ¶m, len); if (ret) return ret; return 0; } /* * Convert between an rsocket and a normal socket. */ static int transpose_socket(int socket, enum fd_type new_type) { socklen_t len = 0; int sfd, dfd, param, ret; struct socket_calls *sapi, *dapi; sfd = fd_getd(socket); if (new_type == fd_rsocket) { dapi = &rs; sapi = ℜ } else { dapi = ℜ sapi = &rs; } ret = sapi->getsockname(sfd, NULL, &len); if (ret) return ret; param = (len == sizeof(struct sockaddr_in6)) ? PF_INET6 : PF_INET; dfd = dapi->socket(param, SOCK_STREAM, 0); if (dfd < 0) return dfd; ret = copysockopts(dfd, sfd, dapi, sapi); if (ret) goto err; fd_store(socket, dfd, new_type, fd_ready); return dfd; err: dapi->close(dfd); return ret; } /* * Use defaults on failure. */ static void set_rsocket_options(int rsocket) { if (sq_size) rsetsockopt(rsocket, SOL_RDMA, RDMA_SQSIZE, &sq_size, sizeof sq_size); if (rq_size) rsetsockopt(rsocket, SOL_RDMA, RDMA_RQSIZE, &rq_size, sizeof rq_size); if (sq_inline) rsetsockopt(rsocket, SOL_RDMA, RDMA_INLINE, &sq_inline, sizeof sq_inline); } int socket(int domain, int type, int protocol) { static __thread int recursive; int index, ret; init_preload(); if (recursive || !intercept_socket(domain, type, protocol)) goto real; index = fd_open(); if (index < 0) return index; if (fork_support && (domain == PF_INET || domain == PF_INET6) && (type == SOCK_STREAM) && (!protocol || protocol == IPPROTO_TCP)) { ret = real.socket(domain, type, protocol); if (ret < 0) return ret; fd_store(index, ret, fd_normal, fd_fork); return index; } recursive = 1; ret = rsocket(domain, type, protocol); recursive = 0; if (ret >= 0) { fd_store(index, ret, fd_rsocket, fd_ready); set_rsocket_options(ret); return index; } fd_close(index, &ret); real: return real.socket(domain, type, protocol); } int bind(int socket, const struct sockaddr *addr, socklen_t addrlen) { int fd; return (fd_get(socket, &fd) == fd_rsocket) ? 
rbind(fd, addr, addrlen) : real.bind(fd, addr, addrlen); } int listen(int socket, int backlog) { int fd, ret; if (fd_get(socket, &fd) == fd_rsocket) { ret = rlisten(fd, backlog); } else { ret = real.listen(fd, backlog); if (!ret && fd_gets(socket) == fd_fork) fd_store(socket, fd, fd_normal, fd_fork_listen); } return ret; } int accept(int socket, struct sockaddr *addr, socklen_t *addrlen) { int fd, index, ret; if (fd_get(socket, &fd) == fd_rsocket) { index = fd_open(); if (index < 0) return index; ret = raccept(fd, addr, addrlen); if (ret < 0) { fd_close(index, &fd); return ret; } fd_store(index, ret, fd_rsocket, fd_ready); return index; } else if (fd_gets(socket) == fd_fork_listen) { index = fd_open(); if (index < 0) return index; ret = real.accept(fd, addr, addrlen); if (ret < 0) { fd_close(index, &fd); return ret; } fd_store(index, ret, fd_normal, fd_fork_passive); return index; } else { return real.accept(fd, addr, addrlen); } } /* * We can't fork RDMA connections and pass them from the parent to the child * process. Instead, we need to establish the RDMA connection after calling * fork. To do this, we delay establishing the RDMA connection until we try * to send/receive on the server side. */ static void fork_active(int socket) { struct sockaddr_storage addr; int sfd, dfd, ret; socklen_t len; uint32_t msg; long flags; sfd = fd_getd(socket); flags = real.fcntl(sfd, F_GETFL); real.fcntl(sfd, F_SETFL, 0); ret = real.recv(sfd, &msg, sizeof msg, MSG_PEEK); real.fcntl(sfd, F_SETFL, flags); if ((ret != sizeof msg) || msg) goto err1; len = sizeof addr; ret = real.getpeername(sfd, (struct sockaddr *) &addr, &len); if (ret) goto err1; dfd = rsocket(addr.ss_family, SOCK_STREAM, 0); if (dfd < 0) goto err1; ret = rconnect(dfd, (struct sockaddr *) &addr, len); if (ret) goto err2; set_rsocket_options(dfd); copysockopts(dfd, sfd, &rs, &real); real.shutdown(sfd, SHUT_RDWR); real.close(sfd); fd_store(socket, dfd, fd_rsocket, fd_ready); return; err2: rclose(dfd); err1: fd_store(socket, sfd, fd_normal, fd_ready); } /* * The server will start listening for the new connection, then send a * message to the active side when the listen is ready. This does leave * fork unsupported in the following case: the server is nonblocking and * calls select/poll waiting to receive data from the client. 
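* A named semaphore ("/rsocket_fork") serializes the temporary
* bind/listen/accept sequence below, so that concurrent processes
* converting their connections do not race with one another.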
*/ static void fork_passive(int socket) { struct sockaddr_in6 sin6; sem_t *sem; int lfd, sfd, dfd, ret, param; socklen_t len; uint32_t msg; sfd = fd_getd(socket); len = sizeof sin6; ret = real.getsockname(sfd, (struct sockaddr *) &sin6, &len); if (ret) goto out; sin6.sin6_flowinfo = 0; sin6.sin6_scope_id = 0; memset(&sin6.sin6_addr, 0, sizeof sin6.sin6_addr); sem = sem_open("/rsocket_fork", O_CREAT | O_RDWR, S_IRWXU | S_IRWXG, 1); if (sem == SEM_FAILED) { ret = -1; goto out; } lfd = rsocket(sin6.sin6_family, SOCK_STREAM, 0); if (lfd < 0) { ret = lfd; goto sclose; } param = 1; rsetsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &param, sizeof param); sem_wait(sem); ret = rbind(lfd, (struct sockaddr *) &sin6, sizeof sin6); if (ret) goto lclose; ret = rlisten(lfd, 1); if (ret) goto lclose; msg = 0; len = real.write(sfd, &msg, sizeof msg); if (len != sizeof msg) goto lclose; dfd = raccept(lfd, NULL, NULL); if (dfd < 0) { ret = dfd; goto lclose; } set_rsocket_options(dfd); copysockopts(dfd, sfd, &rs, &real); real.shutdown(sfd, SHUT_RDWR); real.close(sfd); fd_store(socket, dfd, fd_rsocket, fd_ready); lclose: rclose(lfd); sem_post(sem); sclose: sem_close(sem); out: if (ret) fd_store(socket, sfd, fd_normal, fd_ready); } static inline enum fd_type fd_fork_get(int index, int *fd) { struct fd_info *fdi; fdi = idm_lookup(&idm, index); if (fdi) { if (fdi->state == fd_fork_passive) fork_passive(index); else if (fdi->state == fd_fork_active) fork_active(index); *fd = fdi->fd; return fdi->type; } else { *fd = index; return fd_normal; } } int connect(int socket, const struct sockaddr *addr, socklen_t addrlen) { int fd, ret; if (fd_get(socket, &fd) == fd_rsocket) { ret = rconnect(fd, addr, addrlen); if (!ret || errno == EINPROGRESS) return ret; ret = transpose_socket(socket, fd_normal); if (ret < 0) return ret; rclose(fd); fd = ret; } else if (fd_gets(socket) == fd_fork) { fd_store(socket, fd, fd_normal, fd_fork_active); } return real.connect(fd, addr, addrlen); } ssize_t recv(int socket, void *buf, size_t len, int flags) { int fd; return (fd_fork_get(socket, &fd) == fd_rsocket) ? rrecv(fd, buf, len, flags) : real.recv(fd, buf, len, flags); } ssize_t recvfrom(int socket, void *buf, size_t len, int flags, struct sockaddr *src_addr, socklen_t *addrlen) { int fd; return (fd_fork_get(socket, &fd) == fd_rsocket) ? rrecvfrom(fd, buf, len, flags, src_addr, addrlen) : real.recvfrom(fd, buf, len, flags, src_addr, addrlen); } ssize_t recvmsg(int socket, struct msghdr *msg, int flags) { int fd; return (fd_fork_get(socket, &fd) == fd_rsocket) ? rrecvmsg(fd, msg, flags) : real.recvmsg(fd, msg, flags); } ssize_t read(int socket, void *buf, size_t count) { int fd; init_preload(); return (fd_fork_get(socket, &fd) == fd_rsocket) ? rread(fd, buf, count) : real.read(fd, buf, count); } ssize_t readv(int socket, const struct iovec *iov, int iovcnt) { int fd; init_preload(); return (fd_fork_get(socket, &fd) == fd_rsocket) ? rreadv(fd, iov, iovcnt) : real.readv(fd, iov, iovcnt); } ssize_t send(int socket, const void *buf, size_t len, int flags) { int fd; return (fd_fork_get(socket, &fd) == fd_rsocket) ? rsend(fd, buf, len, flags) : real.send(fd, buf, len, flags); } ssize_t sendto(int socket, const void *buf, size_t len, int flags, const struct sockaddr *dest_addr, socklen_t addrlen) { int fd; return (fd_fork_get(socket, &fd) == fd_rsocket) ?
rsendto(fd, buf, len, flags, dest_addr, addrlen) : real.sendto(fd, buf, len, flags, dest_addr, addrlen); } ssize_t sendmsg(int socket, const struct msghdr *msg, int flags) { int fd; return (fd_fork_get(socket, &fd) == fd_rsocket) ? rsendmsg(fd, msg, flags) : real.sendmsg(fd, msg, flags); } ssize_t write(int socket, const void *buf, size_t count) { int fd; init_preload(); return (fd_fork_get(socket, &fd) == fd_rsocket) ? rwrite(fd, buf, count) : real.write(fd, buf, count); } ssize_t writev(int socket, const struct iovec *iov, int iovcnt) { int fd; init_preload(); return (fd_fork_get(socket, &fd) == fd_rsocket) ? rwritev(fd, iov, iovcnt) : real.writev(fd, iov, iovcnt); } static struct pollfd *fds_alloc(nfds_t nfds) { static __thread struct pollfd *rfds; static __thread nfds_t rnfds; if (nfds > rnfds) { if (rfds) free(rfds); rfds = malloc(sizeof(*rfds) * nfds); rnfds = rfds ? nfds : 0; } return rfds; } int poll(struct pollfd *fds, nfds_t nfds, int timeout) { struct pollfd *rfds; int i, ret; init_preload(); for (i = 0; i < nfds; i++) { if (fd_gett(fds[i].fd) == fd_rsocket) goto use_rpoll; } return real.poll(fds, nfds, timeout); use_rpoll: rfds = fds_alloc(nfds); if (!rfds) return ERR(ENOMEM); for (i = 0; i < nfds; i++) { rfds[i].fd = fd_getd(fds[i].fd); rfds[i].events = fds[i].events; rfds[i].revents = 0; } ret = rpoll(rfds, nfds, timeout); for (i = 0; i < nfds; i++) fds[i].revents = rfds[i].revents; return ret; } static void select_to_rpoll(struct pollfd *fds, int *nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds) { int fd, events, i = 0; for (fd = 0; fd < *nfds; fd++) { events = (readfds && FD_ISSET(fd, readfds)) ? POLLIN : 0; if (writefds && FD_ISSET(fd, writefds)) events |= POLLOUT; if (events || (exceptfds && FD_ISSET(fd, exceptfds))) { fds[i].fd = fd_getd(fd); fds[i++].events = events; } } *nfds = i; } static int rpoll_to_select(struct pollfd *fds, int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds) { int fd, rfd, i, cnt = 0; for (i = 0, fd = 0; i < nfds; fd++) { rfd = fd_getd(fd); if (rfd != fds[i].fd) continue; if (readfds && (fds[i].revents & POLLIN)) { FD_SET(fd, readfds); cnt++; } if (writefds && (fds[i].revents & POLLOUT)) { FD_SET(fd, writefds); cnt++; } if (exceptfds && (fds[i].revents & ~(POLLIN | POLLOUT))) { FD_SET(fd, exceptfds); cnt++; } i++; } return cnt; } static int rs_convert_timeout(struct timeval *timeout) { return !timeout ? -1 : timeout->tv_sec * 1000 + timeout->tv_usec / 1000; } int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout) { struct pollfd *fds; int ret; fds = fds_alloc(nfds); if (!fds) return ERR(ENOMEM); select_to_rpoll(fds, &nfds, readfds, writefds, exceptfds); ret = rpoll(fds, nfds, rs_convert_timeout(timeout)); if (readfds) FD_ZERO(readfds); if (writefds) FD_ZERO(writefds); if (exceptfds) FD_ZERO(exceptfds); if (ret > 0) ret = rpoll_to_select(fds, nfds, readfds, writefds, exceptfds); return ret; } int shutdown(int socket, int how) { int fd; return (fd_get(socket, &fd) == fd_rsocket) ? rshutdown(fd, how) : real.shutdown(fd, how); } int close(int socket) { struct fd_info *fdi; int ret; init_preload(); fdi = idm_lookup(&idm, socket); if (!fdi) return real.close(socket); if (fdi->dupfd != -1) { ret = close(fdi->dupfd); if (ret) return ret; } if (atomic_fetch_sub(&fdi->refcnt, 1) != 1) return 0; idm_clear(&idm, socket); real.close(socket); ret = (fdi->type == fd_rsocket) ? 
rclose(fdi->fd) : real.close(fdi->fd); free(fdi); return ret; } int getpeername(int socket, struct sockaddr *addr, socklen_t *addrlen) { int fd; return (fd_get(socket, &fd) == fd_rsocket) ? rgetpeername(fd, addr, addrlen) : real.getpeername(fd, addr, addrlen); } int getsockname(int socket, struct sockaddr *addr, socklen_t *addrlen) { int fd; init_preload(); return (fd_get(socket, &fd) == fd_rsocket) ? rgetsockname(fd, addr, addrlen) : real.getsockname(fd, addr, addrlen); } int setsockopt(int socket, int level, int optname, const void *optval, socklen_t optlen) { int fd; return (fd_get(socket, &fd) == fd_rsocket) ? rsetsockopt(fd, level, optname, optval, optlen) : real.setsockopt(fd, level, optname, optval, optlen); } int getsockopt(int socket, int level, int optname, void *optval, socklen_t *optlen) { int fd; return (fd_get(socket, &fd) == fd_rsocket) ? rgetsockopt(fd, level, optname, optval, optlen) : real.getsockopt(fd, level, optname, optval, optlen); } int fcntl(int socket, int cmd, ... /* arg */) { va_list args; long lparam; void *pparam; int fd, ret; init_preload(); va_start(args, cmd); switch (cmd) { case F_GETFD: case F_GETFL: case F_GETOWN: case F_GETSIG: case F_GETLEASE: ret = (fd_get(socket, &fd) == fd_rsocket) ? rfcntl(fd, cmd) : real.fcntl(fd, cmd); break; case F_DUPFD: /*case F_DUPFD_CLOEXEC:*/ case F_SETFD: case F_SETFL: case F_SETOWN: case F_SETSIG: case F_SETLEASE: case F_NOTIFY: lparam = va_arg(args, long); ret = (fd_get(socket, &fd) == fd_rsocket) ? rfcntl(fd, cmd, lparam) : real.fcntl(fd, cmd, lparam); break; default: pparam = va_arg(args, void *); ret = (fd_get(socket, &fd) == fd_rsocket) ? rfcntl(fd, cmd, pparam) : real.fcntl(fd, cmd, pparam); break; } va_end(args); return ret; } /* * dup2 is not thread safe */ int dup2(int oldfd, int newfd) { struct fd_info *oldfdi, *newfdi; int ret; init_preload(); oldfdi = idm_lookup(&idm, oldfd); if (oldfdi) { if (oldfdi->state == fd_fork_passive) fork_passive(oldfd); else if (oldfdi->state == fd_fork_active) fork_active(oldfd); } newfdi = idm_lookup(&idm, newfd); if (newfdi) { /* newfd cannot have been dup'ed directly */ if (atomic_load(&newfdi->refcnt) > 1) return ERR(EBUSY); close(newfd); } ret = real.dup2(oldfd, newfd); if (!oldfdi || ret != newfd) return ret; newfdi = calloc(1, sizeof(*newfdi)); if (!newfdi) { close(newfd); return ERR(ENOMEM); } pthread_mutex_lock(&mut); idm_set(&idm, newfd, newfdi); pthread_mutex_unlock(&mut); newfdi->fd = oldfdi->fd; newfdi->type = oldfdi->type; if (oldfdi->dupfd != -1) { newfdi->dupfd = oldfdi->dupfd; oldfdi = idm_lookup(&idm, oldfdi->dupfd); } else { newfdi->dupfd = oldfd; } atomic_store(&newfdi->refcnt, 1); atomic_fetch_add(&oldfdi->refcnt, 1); return newfd; } ssize_t sendfile(int out_fd, int in_fd, off_t *offset, size_t count) { void *file_addr; int fd; size_t ret; if (fd_get(out_fd, &fd) != fd_rsocket) return real.sendfile(fd, in_fd, offset, count); file_addr = mmap(NULL, count, PROT_READ, 0, in_fd, offset ? 
*offset : 0); if (file_addr == (void *) -1) return -1; ret = rwrite(fd, file_addr, count); if ((ret > 0) && offset) lseek(in_fd, ret, SEEK_CUR); munmap(file_addr, count); return ret; } int __fxstat(int ver, int socket, struct stat *buf) { int fd, ret; init_preload(); if (fd_get(socket, &fd) == fd_rsocket) { ret = real.fxstat(ver, socket, buf); if (!ret) buf->st_mode = (buf->st_mode & ~S_IFMT) | S_IFSOCK; } else { ret = real.fxstat(ver, fd, buf); } return ret; } rdma-core-56.1/librdmacm/rdma_cma.h000066400000000000000000000670241477342711600172070ustar00rootroot00000000000000/* * Copyright (c) 2005 Voltaire Inc. All rights reserved. * Copyright (c) 2005-2014 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #if !defined(RDMA_CMA_H) #define RDMA_CMA_H #include <netinet/in.h> #include <sys/socket.h> #include <infiniband/verbs.h> #include <infiniband/sa.h> #ifdef __cplusplus extern "C" { #endif /* * Upon receiving a device removal event, users must destroy the associated * RDMA identifier and release all resources allocated with the device. */ enum rdma_cm_event_type { RDMA_CM_EVENT_ADDR_RESOLVED, RDMA_CM_EVENT_ADDR_ERROR, RDMA_CM_EVENT_ROUTE_RESOLVED, RDMA_CM_EVENT_ROUTE_ERROR, RDMA_CM_EVENT_CONNECT_REQUEST, RDMA_CM_EVENT_CONNECT_RESPONSE, RDMA_CM_EVENT_CONNECT_ERROR, RDMA_CM_EVENT_UNREACHABLE, RDMA_CM_EVENT_REJECTED, RDMA_CM_EVENT_ESTABLISHED, RDMA_CM_EVENT_DISCONNECTED, RDMA_CM_EVENT_DEVICE_REMOVAL, RDMA_CM_EVENT_MULTICAST_JOIN, RDMA_CM_EVENT_MULTICAST_ERROR, RDMA_CM_EVENT_ADDR_CHANGE, RDMA_CM_EVENT_TIMEWAIT_EXIT }; enum rdma_port_space { RDMA_PS_IPOIB = 0x0002, RDMA_PS_TCP = 0x0106, RDMA_PS_UDP = 0x0111, RDMA_PS_IB = 0x013F, }; #define RDMA_IB_IP_PS_MASK 0xFFFFFFFFFFFF0000ULL #define RDMA_IB_IP_PORT_MASK 0x000000000000FFFFULL #define RDMA_IB_IP_PS_TCP 0x0000000001060000ULL #define RDMA_IB_IP_PS_UDP 0x0000000001110000ULL #define RDMA_IB_PS_IB 0x00000000013F0000ULL /* * Global qkey value for UDP QPs and multicast groups created via the * RDMA CM.
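* Both sides of a UD QP set up through the RDMA CM use this value; the
* qkey to use is also reported to applications in struct rdma_ud_param.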
*/ #define RDMA_UDP_QKEY 0x01234567 struct rdma_ib_addr { union ibv_gid sgid; union ibv_gid dgid; __be16 pkey; }; struct rdma_addr { union { struct sockaddr src_addr; struct sockaddr_in src_sin; struct sockaddr_in6 src_sin6; struct sockaddr_storage src_storage; }; union { struct sockaddr dst_addr; struct sockaddr_in dst_sin; struct sockaddr_in6 dst_sin6; struct sockaddr_storage dst_storage; }; union { struct rdma_ib_addr ibaddr; } addr; }; struct rdma_route { struct rdma_addr addr; struct ibv_sa_path_rec *path_rec; int num_paths; }; struct rdma_event_channel { int fd; }; struct rdma_cm_id { struct ibv_context *verbs; struct rdma_event_channel *channel; void *context; struct ibv_qp *qp; struct rdma_route route; enum rdma_port_space ps; uint8_t port_num; struct rdma_cm_event *event; struct ibv_comp_channel *send_cq_channel; struct ibv_cq *send_cq; struct ibv_comp_channel *recv_cq_channel; struct ibv_cq *recv_cq; struct ibv_srq *srq; struct ibv_pd *pd; enum ibv_qp_type qp_type; }; enum { RDMA_MAX_RESP_RES = 0xFF, RDMA_MAX_INIT_DEPTH = 0xFF }; struct rdma_conn_param { const void *private_data; uint8_t private_data_len; uint8_t responder_resources; uint8_t initiator_depth; uint8_t flow_control; uint8_t retry_count; /* ignored when accepting */ uint8_t rnr_retry_count; /* Fields below ignored if a QP is created on the rdma_cm_id. */ uint8_t srq; uint32_t qp_num; }; struct rdma_ud_param { const void *private_data; uint8_t private_data_len; struct ibv_ah_attr ah_attr; uint32_t qp_num; uint32_t qkey; }; struct rdma_cm_event { struct rdma_cm_id *id; struct rdma_cm_id *listen_id; enum rdma_cm_event_type event; int status; union { struct rdma_conn_param conn; struct rdma_ud_param ud; } param; }; #define RAI_PASSIVE 0x00000001 #define RAI_NUMERICHOST 0x00000002 #define RAI_NOROUTE 0x00000004 #define RAI_FAMILY 0x00000008 struct rdma_addrinfo { int ai_flags; int ai_family; int ai_qp_type; int ai_port_space; socklen_t ai_src_len; socklen_t ai_dst_len; struct sockaddr *ai_src_addr; struct sockaddr *ai_dst_addr; char *ai_src_canonname; char *ai_dst_canonname; size_t ai_route_len; void *ai_route; size_t ai_connect_len; void *ai_connect; struct rdma_addrinfo *ai_next; }; /* Multicast join compatibility mask attributes */ enum rdma_cm_join_mc_attr_mask { RDMA_CM_JOIN_MC_ATTR_ADDRESS = 1 << 0, RDMA_CM_JOIN_MC_ATTR_JOIN_FLAGS = 1 << 1, RDMA_CM_JOIN_MC_ATTR_RESERVED = 1 << 2, }; /* Multicast join flags */ enum rdma_cm_mc_join_flags { RDMA_MC_JOIN_FLAG_FULLMEMBER, RDMA_MC_JOIN_FLAG_SENDONLY_FULLMEMBER, RDMA_MC_JOIN_FLAG_RESERVED, }; struct rdma_cm_join_mc_attr_ex { /* Bitwise OR between "rdma_cm_join_mc_attr_mask" enum */ uint32_t comp_mask; /* Use a flag from "rdma_cm_mc_join_flags" enum */ uint32_t join_flags; /* Multicast address identifying the group to join */ struct sockaddr *addr; }; /** * rdma_create_event_channel - Open a channel used to report communication events. * Description: * Asynchronous events are reported to users through event channels. Each * event channel maps to a file descriptor. * Notes: * All created event channels must be destroyed by calling * rdma_destroy_event_channel. Users should call rdma_get_cm_event to * retrieve events on an event channel. * See also: * rdma_get_cm_event, rdma_destroy_event_channel */ struct rdma_event_channel *rdma_create_event_channel(void); /** * rdma_destroy_event_channel - Close an event communication channel. * @channel: The communication channel to destroy. 
 * Description:
 *   Releases all resources associated with an event channel and closes the
 *   associated file descriptor.
 * Notes:
 *   All rdma_cm_id's associated with the event channel must be destroyed,
 *   and all returned events must be acked before calling this function.
 * See also:
 *   rdma_create_event_channel, rdma_get_cm_event, rdma_ack_cm_event
 */
void rdma_destroy_event_channel(struct rdma_event_channel *channel);

/**
 * rdma_create_id - Allocate a communication identifier.
 * @channel: The communication channel that events associated with the
 *   allocated rdma_cm_id will be reported on.
 * @id: A reference where the allocated communication identifier will be
 *   returned.
 * @context: User specified context associated with the rdma_cm_id.
 * @ps: RDMA port space.
 * Description:
 *   Creates an identifier that is used to track communication information.
 * Notes:
 *   Rdma_cm_id's are conceptually equivalent to a socket for RDMA
 *   communication. The difference is that RDMA communication requires
 *   explicitly binding to a specified RDMA device before communication
 *   can occur, and most operations are asynchronous in nature. Communication
 *   events on an rdma_cm_id are reported through the associated event
 *   channel. Users must release the rdma_cm_id by calling rdma_destroy_id.
 * See also:
 *   rdma_create_event_channel, rdma_destroy_id, rdma_get_devices,
 *   rdma_bind_addr, rdma_resolve_addr, rdma_connect, rdma_listen
 */
int rdma_create_id(struct rdma_event_channel *channel,
		   struct rdma_cm_id **id, void *context,
		   enum rdma_port_space ps);

/**
 * rdma_create_ep - Allocate a communication identifier and qp.
 * @id: A reference where the allocated communication identifier will be
 *   returned.
 * @res: Result from rdma_getaddrinfo, which specifies the source and
 *   destination addresses, plus optional routing and connection information.
 * @pd: Optional protection domain. This parameter is ignored if qp_init_attr
 *   is NULL.
 * @qp_init_attr: Optional attributes for a QP created on the rdma_cm_id.
 * Description:
 *   Creates an identifier and optional QP used for communication.
 * Notes:
 *   If qp_init_attr is provided, then a queue pair will be allocated and
 *   associated with the rdma_cm_id. If a pd is provided, the QP will be
 *   created on that PD. Otherwise, the QP will be allocated on a default
 *   PD.
 *   The rdma_cm_id will be set to use synchronous operations (connect,
 *   listen, and get_request). To convert to asynchronous operation, the
 *   rdma_cm_id should be migrated to a user-allocated event channel.
 * See also:
 *   rdma_create_id, rdma_create_qp, rdma_migrate_id, rdma_connect,
 *   rdma_listen
 */
int rdma_create_ep(struct rdma_cm_id **id, struct rdma_addrinfo *res,
		   struct ibv_pd *pd, struct ibv_qp_init_attr *qp_init_attr);

/**
 * rdma_destroy_ep - Deallocates a communication identifier and qp.
 * @id: The communication identifier to destroy.
 * Description:
 *   Destroys the specified rdma_cm_id and any associated QP created
 *   on that id.
 * See also:
 *   rdma_create_ep
 */
void rdma_destroy_ep(struct rdma_cm_id *id);

/**
 * rdma_destroy_id - Release a communication identifier.
 * @id: The communication identifier to destroy.
 * Description:
 *   Destroys the specified rdma_cm_id and cancels any outstanding
 *   asynchronous operation.
 * Notes:
 *   Users must free any QP associated with the rdma_cm_id before
 *   calling this routine and ack any related events.
 * See also:
 *   rdma_create_id, rdma_destroy_qp, rdma_ack_cm_event
 */
int rdma_destroy_id(struct rdma_cm_id *id);

/**
 * rdma_bind_addr - Bind an RDMA identifier to a source address.
* @id: RDMA identifier. * @addr: Local address information. Wildcard values are permitted. * Description: * Associates a source address with an rdma_cm_id. The address may be * wildcarded. If binding to a specific local address, the rdma_cm_id * will also be bound to a local RDMA device. * Notes: * Typically, this routine is called before calling rdma_listen to bind * to a specific port number, but it may also be called on the active side * of a connection before calling rdma_resolve_addr to bind to a specific * address. * See also: * rdma_create_id, rdma_listen, rdma_resolve_addr, rdma_create_qp */ int rdma_bind_addr(struct rdma_cm_id *id, struct sockaddr *addr); /** * rdma_resolve_addr - Resolve destination and optional source addresses. * @id: RDMA identifier. * @src_addr: Source address information. This parameter may be NULL. * @dst_addr: Destination address information. * @timeout_ms: Time to wait for resolution to complete. * Description: * Resolve destination and optional source addresses from IP addresses * to an RDMA address. If successful, the specified rdma_cm_id will * be bound to a local device. * Notes: * This call is used to map a given destination IP address to a usable RDMA * address. If a source address is given, the rdma_cm_id is bound to that * address, the same as if rdma_bind_addr were called. If no source * address is given, and the rdma_cm_id has not yet been bound to a device, * then the rdma_cm_id will be bound to a source address based on the * local routing tables. After this call, the rdma_cm_id will be bound to * an RDMA device. This call is typically made from the active side of a * connection before calling rdma_resolve_route and rdma_connect. * See also: * rdma_create_id, rdma_resolve_route, rdma_connect, rdma_create_qp, * rdma_get_cm_event, rdma_bind_addr */ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr, struct sockaddr *dst_addr, int timeout_ms); /** * rdma_resolve_route - Resolve the route information needed to establish a connection. * @id: RDMA identifier. * @timeout_ms: Time to wait for resolution to complete. * Description: * Resolves an RDMA route to the destination address in order to establish * a connection. The destination address must have already been resolved * by calling rdma_resolve_addr. * Notes: * This is called on the client side of a connection after calling * rdma_resolve_addr, but before calling rdma_connect. * See also: * rdma_resolve_addr, rdma_connect, rdma_get_cm_event */ int rdma_resolve_route(struct rdma_cm_id *id, int timeout_ms); /** * rdma_create_qp - Allocate a QP. * @id: RDMA identifier. * @pd: Optional protection domain for the QP. * @qp_init_attr: initial QP attributes. * Description: * Allocate a QP associated with the specified rdma_cm_id and transition it * for sending and receiving. * Notes: * The rdma_cm_id must be bound to a local RDMA device before calling this * function, and the protection domain must be for that same device. * QPs allocated to an rdma_cm_id are automatically transitioned by the * librdmacm through their states. After being allocated, the QP will be * ready to handle posting of receives. If the QP is unconnected, it will * be ready to post sends. * If pd is NULL, then the QP will be allocated using a default protection * domain associated with the underlying RDMA device. 
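 *
 *   Illustrative active-side sketch (added for exposition; error handling
 *   omitted, and dst_addr, qp_init_attr, and conn_param are caller-supplied):
 *
 *       rdma_resolve_addr(id, NULL, dst_addr, 2000);
 *       ...wait for RDMA_CM_EVENT_ADDR_RESOLVED...
 *       rdma_resolve_route(id, 2000);
 *       ...wait for RDMA_CM_EVENT_ROUTE_RESOLVED...
 *       rdma_create_qp(id, NULL, &qp_init_attr);
 *       rdma_connect(id, &conn_param);
 *       ...wait for RDMA_CM_EVENT_ESTABLISHED...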
* See also: * rdma_bind_addr, rdma_resolve_addr, rdma_destroy_qp, ibv_create_qp, * ibv_modify_qp */ int rdma_create_qp(struct rdma_cm_id *id, struct ibv_pd *pd, struct ibv_qp_init_attr *qp_init_attr); int rdma_create_qp_ex(struct rdma_cm_id *id, struct ibv_qp_init_attr_ex *qp_init_attr); /** * rdma_destroy_qp - Deallocate a QP. * @id: RDMA identifier. * Description: * Destroy a QP allocated on the rdma_cm_id. * Notes: * Users must destroy any QP associated with an rdma_cm_id before * destroying the ID. * See also: * rdma_create_qp, rdma_destroy_id, ibv_destroy_qp */ void rdma_destroy_qp(struct rdma_cm_id *id); /** * rdma_connect - Initiate an active connection request. * @id: RDMA identifier. * @conn_param: optional connection parameters. * Description: * For a connected rdma_cm_id, this call initiates a connection request * to a remote destination. For an unconnected rdma_cm_id, it initiates * a lookup of the remote QP providing the datagram service. * Notes: * Users must have resolved a route to the destination address * by having called rdma_resolve_route before calling this routine. * A user may override the default connection parameters and exchange * private data as part of the connection by using the conn_param parameter. * See also: * rdma_resolve_route, rdma_disconnect, rdma_listen, rdma_get_cm_event */ int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param); /** * rdma_establish - Complete an active connection request. * @id: RDMA identifier. * Description: * Acknowledge an incoming connection response event and complete the * connection establishment. * Notes: * If a QP has not been created on the rdma_cm_id, this function should be * called by the active side to complete the connection, after getting connect * response event. This will trigger a connection established event on the * passive side. * This function should not be used on an rdma_cm_id on which a QP has been * created. * See also: * rdma_connect, rdma_disconnect, rdma_get_cm_event */ int rdma_establish(struct rdma_cm_id *id); /** * rdma_listen - Listen for incoming connection requests. * @id: RDMA identifier. * @backlog: backlog of incoming connection requests. * Description: * Initiates a listen for incoming connection requests or datagram service * lookup. The listen will be restricted to the locally bound source * address. * Notes: * Users must have bound the rdma_cm_id to a local address by calling * rdma_bind_addr before calling this routine. If the rdma_cm_id is * bound to a specific IP address, the listen will be restricted to that * address and the associated RDMA device. If the rdma_cm_id is bound * to an RDMA port number only, the listen will occur across all RDMA * devices. * See also: * rdma_bind_addr, rdma_connect, rdma_accept, rdma_reject, rdma_get_cm_event */ int rdma_listen(struct rdma_cm_id *id, int backlog); /** * rdma_get_request */ int rdma_get_request(struct rdma_cm_id *listen, struct rdma_cm_id **id); /** * rdma_accept - Called to accept a connection request. * @id: Connection identifier associated with the request. * @conn_param: Optional information needed to establish the connection. * Description: * Called from the listening side to accept a connection or datagram * service lookup request. * Notes: * Unlike the socket accept routine, rdma_accept is not called on a * listening rdma_cm_id. Instead, after calling rdma_listen, the user * waits for a connection request event to occur. 
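 *   For example (illustrative sketch; error handling omitted, and src_addr,
 *   qp_init_attr, and conn_param are caller-supplied):
 *
 *       rdma_bind_addr(listen_id, src_addr);
 *       rdma_listen(listen_id, 8);
 *       rdma_get_cm_event(channel, &event);   ...RDMA_CM_EVENT_CONNECT_REQUEST...
 *       id = event->id;
 *       rdma_create_qp(id, NULL, &qp_init_attr);
 *       rdma_accept(id, &conn_param);
 *       rdma_ack_cm_event(event);
 *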
Connection request * events give the user a newly created rdma_cm_id, similar to a new * socket, but the rdma_cm_id is bound to a specific RDMA device. * rdma_accept is called on the new rdma_cm_id. * A user may override the default connection parameters and exchange * private data as part of the connection by using the conn_param parameter. * See also: * rdma_listen, rdma_reject, rdma_get_cm_event */ int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param); /** * rdma_reject - Called to reject a connection request. * @id: Connection identifier associated with the request. * @private_data: Optional private data to send with the reject message. * @private_data_len: Size of the private_data to send, in bytes. * Description: * Called from the listening side to reject a connection or datagram * service lookup request. * Notes: * After receiving a connection request event, a user may call rdma_reject * to reject the request. If the underlying RDMA transport supports * private data in the reject message, the specified data will be passed to * the remote side. * See also: * rdma_listen, rdma_accept, rdma_get_cm_event */ int rdma_reject(struct rdma_cm_id *id, const void *private_data, uint8_t private_data_len); /** * rdma_reject_ece - Called to reject a connection request with ECE * rejected reason. * The same as rdma_reject() */ int rdma_reject_ece(struct rdma_cm_id *id, const void *private_data, uint8_t private_data_len); /** * rdma_notify - Notifies the librdmacm of an asynchronous event. * @id: RDMA identifier. * @event: Asynchronous event. * Description: * Used to notify the librdmacm of asynchronous events that have occurred * on a QP associated with the rdma_cm_id. * Notes: * Asynchronous events that occur on a QP are reported through the user's * device event handler. This routine is used to notify the librdmacm of * communication events. In most cases, use of this routine is not * necessary, however if connection establishment is done out of band * (such as done through Infiniband), it's possible to receive data on a * QP that is not yet considered connected. This routine forces the * connection into an established state in this case in order to handle * the rare situation where the connection never forms on its own. * Events that should be reported to the CM are: IB_EVENT_COMM_EST. * See also: * rdma_connect, rdma_accept, rdma_listen */ int rdma_notify(struct rdma_cm_id *id, enum ibv_event_type event); /** * rdma_disconnect - This function disconnects a connection. * @id: RDMA identifier. * Description: * Disconnects a connection and transitions any associated QP to the * error state. * See also: * rdma_connect, rdma_listen, rdma_accept */ int rdma_disconnect(struct rdma_cm_id *id); /** * rdma_join_multicast - Joins a multicast group. * @id: Communication identifier associated with the request. * @addr: Multicast address identifying the group to join. * @context: User-defined context associated with the join request. * Description: * Joins a multicast group and attaches an associated QP to the group. * Notes: * Before joining a multicast group, the rdma_cm_id must be bound to * an RDMA device by calling rdma_bind_addr or rdma_resolve_addr. Use of * rdma_resolve_addr requires the local routing tables to resolve the * multicast address to an RDMA device. The user must call * rdma_leave_multicast to leave the multicast group and release any * multicast resources. The context is returned to the user through * the private_data field in the rdma_cm_event. 
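 *
 *   Illustrative sketch (added for exposition; mcast_addr, qp_init_attr, and
 *   app_ctx are caller-supplied, and the QP is a UD QP):
 *
 *       rdma_resolve_addr(id, NULL, mcast_addr, 2000);
 *       rdma_create_qp(id, NULL, &qp_init_attr);
 *       rdma_join_multicast(id, mcast_addr, app_ctx);
 *       ...wait for RDMA_CM_EVENT_MULTICAST_JOIN...
 *       rdma_leave_multicast(id, mcast_addr);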
 * See also:
 *   rdma_leave_multicast, rdma_bind_addr, rdma_resolve_addr, rdma_create_qp
 */
int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr,
			void *context);

/**
 * rdma_leave_multicast - Leaves a multicast group.
 * @id: Communication identifier associated with the request.
 * @addr: Multicast address identifying the group to leave.
 * Description:
 *   Leaves a multicast group and detaches an associated QP from the group.
 * Notes:
 *   Calling this function before a group has been fully joined results in
 *   canceling the join operation. Users should be aware that messages
 *   received from the multicast group may still be queued for
 *   completion processing immediately after leaving a multicast group.
 *   Destroying an rdma_cm_id will automatically leave all multicast groups.
 * See also:
 *   rdma_join_multicast, rdma_destroy_qp
 */
int rdma_leave_multicast(struct rdma_cm_id *id, struct sockaddr *addr);

/**
 * rdma_join_multicast_ex - Joins a multicast group with options.
 * @id: Communication identifier associated with the request.
 * @mc_join_attr: Extended struct containing multicast join parameters.
 * @context: User-defined context associated with the join request.
 * Description:
 *   Joins a multicast group with options. Currently only MC join flags are
 *   supported. The QP will be attached and the join message sent according
 *   to the given join flag.
 * Notes:
 *   Before joining a multicast group, the rdma_cm_id must be bound to
 *   an RDMA device by calling rdma_bind_addr or rdma_resolve_addr. Use of
 *   rdma_resolve_addr requires the local routing tables to resolve the
 *   multicast address to an RDMA device. The user must call
 *   rdma_leave_multicast to leave the multicast group and release any
 *   multicast resources. The context is returned to the user through
 *   the private_data field in the rdma_cm_event.
 * See also:
 *   rdma_leave_multicast, rdma_bind_addr, rdma_resolve_addr, rdma_create_qp
 */
int rdma_join_multicast_ex(struct rdma_cm_id *id,
			   struct rdma_cm_join_mc_attr_ex *mc_join_attr,
			   void *context);

/**
 * rdma_get_cm_event - Retrieves the next pending communication event.
 * @channel: Event channel to check for events.
 * @event: Allocated information about the next communication event.
 * Description:
 *   Retrieves a communication event. If no events are pending, by default,
 *   the call will block until an event is received.
 * Notes:
 *   The default blocking behavior of this routine can be changed by
 *   modifying the file descriptor associated with the given channel. All
 *   events that are reported must be acknowledged by calling rdma_ack_cm_event.
 *   Destruction of an rdma_cm_id will block until related events have been
 *   acknowledged.
 * See also:
 *   rdma_ack_cm_event, rdma_create_event_channel, rdma_event_str
 */
int rdma_get_cm_event(struct rdma_event_channel *channel,
		      struct rdma_cm_event **event);

/**
 * rdma_ack_cm_event - Free a communication event.
 * @event: Event to be released.
 * Description:
 *   All events which are allocated by rdma_get_cm_event must be released;
 *   there should be a one-to-one correspondence between successful gets
 *   and acks.
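 *
 *   A minimal loop that preserves this pairing (an illustrative sketch;
 *   handle_event() is a hypothetical application handler):
 *
 *       while (rdma_get_cm_event(channel, &event) == 0) {
 *               handle_event(event);
 *               rdma_ack_cm_event(event);
 *       }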
* See also: * rdma_get_cm_event, rdma_destroy_id */ int rdma_ack_cm_event(struct rdma_cm_event *event); __be16 rdma_get_src_port(struct rdma_cm_id *id); __be16 rdma_get_dst_port(struct rdma_cm_id *id); static inline struct sockaddr *rdma_get_local_addr(struct rdma_cm_id *id) { return &id->route.addr.src_addr; } static inline struct sockaddr *rdma_get_peer_addr(struct rdma_cm_id *id) { return &id->route.addr.dst_addr; } /** * rdma_get_devices - Get list of RDMA devices currently available. * @num_devices: If non-NULL, set to the number of devices returned. * Description: * Return a NULL-terminated array of opened RDMA devices. Callers can use * this routine to allocate resources on specific RDMA devices that will be * shared across multiple rdma_cm_id's. * Notes: * The returned array must be released by calling rdma_free_devices. Devices * remain opened while the librdmacm is loaded. * See also: * rdma_free_devices */ struct ibv_context **rdma_get_devices(int *num_devices); /** * rdma_free_devices - Frees the list of devices returned by rdma_get_devices. * @list: List of devices returned from rdma_get_devices. * Description: * Frees the device array returned by rdma_get_devices. * See also: * rdma_get_devices */ void rdma_free_devices(struct ibv_context **list); /** * rdma_event_str - Returns a string representation of an rdma cm event. * @event: Asynchronous event. * Description: * Returns a string representation of an asynchronous event. * See also: * rdma_get_cm_event */ const char *rdma_event_str(enum rdma_cm_event_type event); /* Option levels */ enum { RDMA_OPTION_ID = 0, RDMA_OPTION_IB = 1 }; /* Option details */ enum { RDMA_OPTION_ID_TOS = 0, /* uint8_t: RFC 2474 */ RDMA_OPTION_ID_REUSEADDR = 1, /* int: ~SO_REUSEADDR */ RDMA_OPTION_ID_AFONLY = 2, /* int: ~IPV6_V6ONLY */ RDMA_OPTION_ID_ACK_TIMEOUT = 3 /* uint8_t */ }; enum { RDMA_OPTION_IB_PATH = 1 /* struct ibv_path_data[] */ }; /** * rdma_set_option - Set options for an rdma_cm_id. * @id: Communication identifier to set option for. * @level: Protocol level of the option to set. * @optname: Name of the option to set. * @optval: Reference to the option data. * @optlen: The size of the %optval buffer. */ int rdma_set_option(struct rdma_cm_id *id, int level, int optname, void *optval, size_t optlen); /** * rdma_migrate_id - Move an rdma_cm_id to a new event channel. * @id: Communication identifier to migrate. * @channel: New event channel for rdma_cm_id events. */ int rdma_migrate_id(struct rdma_cm_id *id, struct rdma_event_channel *channel); /** * rdma_getaddrinfo - RDMA address and route resolution service. */ int rdma_getaddrinfo(const char *node, const char *service, const struct rdma_addrinfo *hints, struct rdma_addrinfo **res); void rdma_freeaddrinfo(struct rdma_addrinfo *res); /** * rdma_init_qp_attr - Returns QP attributes. * @id: Communication identifier. * @qp_attr: A reference to a QP attributes struct containing * response information. * @qp_attr_mask: A reference to a QP attributes mask containing * response information. */ int rdma_init_qp_attr(struct rdma_cm_id *id, struct ibv_qp_attr *qp_attr, int *qp_attr_mask); /** * rdma_set_local_ece - Set local ECE options to be used for REQ/REP * communication. In use to implement ECE handshake in external QP. * @id: Communication identifier to establish connection * @ece: ECE parameters */ int rdma_set_local_ece(struct rdma_cm_id *id, struct ibv_ece *ece); /** * rdma_get_remote_ece - Provide remote ECE parameters as received * in REQ/REP events. 
In use to implement ECE handshake in external QP. * @id: Communication identifier to establish connection * @ece: ECE parameters */ int rdma_get_remote_ece(struct rdma_cm_id *id, struct ibv_ece *ece); #ifdef __cplusplus } #endif #endif /* RDMA_CMA_H */ rdma-core-56.1/librdmacm/rdma_cma_abi.h000066400000000000000000000150531477342711600200170ustar00rootroot00000000000000/* * Copyright (c) 2005-2011 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef RDMA_CMA_ABI_H #define RDMA_CMA_ABI_H #include #include #include #include /* * This file must be kept in sync with the kernel's version of rdma_user_cm.h */ #define RDMA_USER_CM_MIN_ABI_VERSION 3 #define RDMA_USER_CM_MAX_ABI_VERSION 4 #define RDMA_MAX_PRIVATE_DATA 256 enum { UCMA_CMD_CREATE_ID, UCMA_CMD_DESTROY_ID, UCMA_CMD_BIND_IP, UCMA_CMD_RESOLVE_IP, UCMA_CMD_RESOLVE_ROUTE, UCMA_CMD_QUERY_ROUTE, UCMA_CMD_CONNECT, UCMA_CMD_LISTEN, UCMA_CMD_ACCEPT, UCMA_CMD_REJECT, UCMA_CMD_DISCONNECT, UCMA_CMD_INIT_QP_ATTR, UCMA_CMD_GET_EVENT, UCMA_CMD_GET_OPTION, UCMA_CMD_SET_OPTION, UCMA_CMD_NOTIFY, UCMA_CMD_JOIN_IP_MCAST, UCMA_CMD_LEAVE_MCAST, UCMA_CMD_MIGRATE_ID, UCMA_CMD_QUERY, UCMA_CMD_BIND, UCMA_CMD_RESOLVE_ADDR, UCMA_CMD_JOIN_MCAST }; struct ucma_abi_cmd_hdr { __u32 cmd; __u16 in; __u16 out; }; struct ucma_abi_create_id { __u32 cmd; __u16 in; __u16 out; __u64 uid; __u64 response; __u16 ps; __u8 qp_type; __u8 reserved[5]; }; struct ucma_abi_create_id_resp { __u32 id; }; struct ucma_abi_destroy_id { __u32 cmd; __u16 in; __u16 out; __u64 response; __u32 id; __u32 reserved; }; struct ucma_abi_destroy_id_resp { __u32 events_reported; }; struct ucma_abi_bind_ip { __u32 cmd; __u16 in; __u16 out; __u64 response; struct sockaddr_in6 addr; __u32 id; }; struct ucma_abi_bind { __u32 cmd; __u16 in; __u16 out; __u32 id; __u16 addr_size; __u16 reserved; struct sockaddr_storage addr; }; struct ucma_abi_resolve_ip { __u32 cmd; __u16 in; __u16 out; struct sockaddr_in6 src_addr; struct sockaddr_in6 dst_addr; __u32 id; __u32 timeout_ms; }; struct ucma_abi_resolve_addr { __u32 cmd; __u16 in; __u16 out; __u32 id; __u32 timeout_ms; __u16 src_size; __u16 dst_size; __u32 reserved; struct sockaddr_storage src_addr; struct sockaddr_storage dst_addr; }; struct ucma_abi_resolve_route { __u32 cmd; __u16 in; __u16 out; __u32 id; __u32 
timeout_ms; }; enum { UCMA_QUERY_ADDR, UCMA_QUERY_PATH, UCMA_QUERY_GID }; struct ucma_abi_query { __u32 cmd; __u16 in; __u16 out; __u64 response; __u32 id; __u32 option; }; struct ucma_abi_query_route_resp { __be64 node_guid; struct ib_user_path_rec ib_route[2]; struct sockaddr_in6 src_addr; struct sockaddr_in6 dst_addr; __u32 num_paths; __u8 port_num; __u8 reserved[3]; __u32 ibdev_index; __u32 reserved1; }; struct ucma_abi_query_addr_resp { __be64 node_guid; __u8 port_num; __u8 reserved; __be16 pkey; __u16 src_size; __u16 dst_size; struct sockaddr_storage src_addr; struct sockaddr_storage dst_addr; __u32 ibdev_index; __u32 reserved1; }; struct ucma_abi_query_path_resp { __u32 num_paths; __u32 reserved; struct ibv_path_data path_data[0]; }; struct ucma_abi_conn_param { __u32 qp_num; __u32 reserved; __u8 private_data[RDMA_MAX_PRIVATE_DATA]; __u8 private_data_len; __u8 srq; __u8 responder_resources; __u8 initiator_depth; __u8 flow_control; __u8 retry_count; __u8 rnr_retry_count; __u8 valid; }; struct ucma_abi_ud_param { __u32 qp_num; __u32 qkey; struct ib_uverbs_ah_attr ah_attr; __u8 private_data[RDMA_MAX_PRIVATE_DATA]; __u8 private_data_len; __u8 reserved[7]; __u8 reserved2[4]; /* Round to 8-byte boundary to support 32/64 */ }; struct ucma_abi_ece { __u32 vendor_id; __u32 attr_mod; }; struct ucma_abi_connect { __u32 cmd; __u16 in; __u16 out; struct ucma_abi_conn_param conn_param; __u32 id; __u32 reserved; struct ucma_abi_ece ece; }; struct ucma_abi_listen { __u32 cmd; __u16 in; __u16 out; __u32 id; __u32 backlog; }; struct ucma_abi_accept { __u32 cmd; __u16 in; __u16 out; __u64 uid; struct ucma_abi_conn_param conn_param; __u32 id; __u32 reserved; struct ucma_abi_ece ece; }; struct ucma_abi_reject { __u32 cmd; __u16 in; __u16 out; __u32 id; __u8 private_data_len; __u8 reason; __u8 reserved[2]; __u8 private_data[RDMA_MAX_PRIVATE_DATA]; }; struct ucma_abi_disconnect { __u32 cmd; __u16 in; __u16 out; __u32 id; }; struct ucma_abi_init_qp_attr { __u32 cmd; __u16 in; __u16 out; __u64 response; __u32 id; __u32 qp_state; }; struct ucma_abi_notify { __u32 cmd; __u16 in; __u16 out; __u32 id; __u32 event; }; struct ucma_abi_join_ip_mcast { __u32 cmd; __u16 in; __u16 out; __u64 response; /* ucma_abi_create_id_resp */ __u64 uid; struct sockaddr_in6 addr; __u32 id; }; struct ucma_abi_join_mcast { __u32 cmd; __u16 in; __u16 out; __u64 response; /* rdma_ucma_create_id_resp */ __u64 uid; __u32 id; __u16 addr_size; __u16 join_flags; struct sockaddr_storage addr; }; struct ucma_abi_get_event { __u32 cmd; __u16 in; __u16 out; __u64 response; }; struct ucma_abi_event_resp { __u64 uid; __u32 id; __u32 event; __u32 status; union { struct ucma_abi_conn_param conn; struct ucma_abi_ud_param ud; } param; struct ucma_abi_ece ece; }; struct ucma_abi_set_option { __u32 cmd; __u16 in; __u16 out; __u64 optval; __u32 id; __u32 level; __u32 optname; __u32 optlen; }; struct ucma_abi_migrate_id { __u32 cmd; __u16 in; __u16 out; __u64 response; __u32 id; __u32 fd; }; struct ucma_abi_migrate_resp { __u32 events_reported; }; #endif /* RDMA_CMA_ABI_H */ rdma-core-56.1/librdmacm/rdma_verbs.h000066400000000000000000000171531477342711600175700ustar00rootroot00000000000000/* * Copyright (c) 2010-2014 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #if !defined(RDMA_VERBS_H) #define RDMA_VERBS_H #include #include #include #include #ifdef __cplusplus extern "C" { #endif static inline int rdma_seterrno(int ret) { if (ret) { errno = ret; ret = -1; } return ret; } /* * Shared receive queues. */ int rdma_create_srq(struct rdma_cm_id *id, struct ibv_pd *pd, struct ibv_srq_init_attr *attr); int rdma_create_srq_ex(struct rdma_cm_id *id, struct ibv_srq_init_attr_ex *attr); void rdma_destroy_srq(struct rdma_cm_id *id); /* * Memory registration helpers. */ static inline struct ibv_mr * rdma_reg_msgs(struct rdma_cm_id *id, void *addr, size_t length) { return ibv_reg_mr(id->pd, addr, length, IBV_ACCESS_LOCAL_WRITE); } static inline struct ibv_mr * rdma_reg_read(struct rdma_cm_id *id, void *addr, size_t length) { return ibv_reg_mr(id->pd, addr, length, IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_READ); } static inline struct ibv_mr * rdma_reg_write(struct rdma_cm_id *id, void *addr, size_t length) { return ibv_reg_mr(id->pd, addr, length, IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE); } static inline int rdma_dereg_mr(struct ibv_mr *mr) { return rdma_seterrno(ibv_dereg_mr(mr)); } /* * Vectored send, receive, and RDMA operations. * Support multiple scatter-gather entries. 
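 *
 * Illustrative sketch (added for exposition; hdr, hdr_len, payload, and
 * data_len are caller-supplied, and both buffers are assumed to lie within
 * the region registered for mr):
 *
 *	struct ibv_sge sgl[2] = {
 *		{ .addr = (uintptr_t) hdr,     .length = hdr_len,  .lkey = mr->lkey },
 *		{ .addr = (uintptr_t) payload, .length = data_len, .lkey = mr->lkey },
 *	};
 *	rdma_post_sendv(id, ctx, sgl, 2, IBV_SEND_SIGNALED);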
*/ static inline int rdma_post_recvv(struct rdma_cm_id *id, void *context, struct ibv_sge *sgl, int nsge) { struct ibv_recv_wr wr, *bad; wr.wr_id = (uintptr_t) context; wr.next = NULL; wr.sg_list = sgl; wr.num_sge = nsge; if (id->srq) return rdma_seterrno(ibv_post_srq_recv(id->srq, &wr, &bad)); else return rdma_seterrno(ibv_post_recv(id->qp, &wr, &bad)); } static inline int rdma_post_sendv(struct rdma_cm_id *id, void *context, struct ibv_sge *sgl, int nsge, int flags) { struct ibv_send_wr wr, *bad; wr.wr_id = (uintptr_t) context; wr.next = NULL; wr.sg_list = sgl; wr.num_sge = nsge; wr.opcode = IBV_WR_SEND; wr.send_flags = flags; return rdma_seterrno(ibv_post_send(id->qp, &wr, &bad)); } static inline int rdma_post_readv(struct rdma_cm_id *id, void *context, struct ibv_sge *sgl, int nsge, int flags, uint64_t remote_addr, uint32_t rkey) { struct ibv_send_wr wr, *bad; wr.wr_id = (uintptr_t) context; wr.next = NULL; wr.sg_list = sgl; wr.num_sge = nsge; wr.opcode = IBV_WR_RDMA_READ; wr.send_flags = flags; wr.wr.rdma.remote_addr = remote_addr; wr.wr.rdma.rkey = rkey; return rdma_seterrno(ibv_post_send(id->qp, &wr, &bad)); } static inline int rdma_post_writev(struct rdma_cm_id *id, void *context, struct ibv_sge *sgl, int nsge, int flags, uint64_t remote_addr, uint32_t rkey) { struct ibv_send_wr wr, *bad; wr.wr_id = (uintptr_t) context; wr.next = NULL; wr.sg_list = sgl; wr.num_sge = nsge; wr.opcode = IBV_WR_RDMA_WRITE; wr.send_flags = flags; wr.wr.rdma.remote_addr = remote_addr; wr.wr.rdma.rkey = rkey; return rdma_seterrno(ibv_post_send(id->qp, &wr, &bad)); } /* * Simple send, receive, and RDMA calls. */ static inline int rdma_post_recv(struct rdma_cm_id *id, void *context, void *addr, size_t length, struct ibv_mr *mr) { struct ibv_sge sge; assert((addr >= mr->addr) && (((uint8_t *) addr + length) <= ((uint8_t *) mr->addr + mr->length))); sge.addr = (uint64_t) (uintptr_t) addr; sge.length = (uint32_t) length; sge.lkey = mr->lkey; return rdma_post_recvv(id, context, &sge, 1); } static inline int rdma_post_send(struct rdma_cm_id *id, void *context, void *addr, size_t length, struct ibv_mr *mr, int flags) { struct ibv_sge sge; sge.addr = (uint64_t) (uintptr_t) addr; sge.length = (uint32_t) length; sge.lkey = mr ? mr->lkey : 0; return rdma_post_sendv(id, context, &sge, 1, flags); } static inline int rdma_post_read(struct rdma_cm_id *id, void *context, void *addr, size_t length, struct ibv_mr *mr, int flags, uint64_t remote_addr, uint32_t rkey) { struct ibv_sge sge; sge.addr = (uint64_t) (uintptr_t) addr; sge.length = (uint32_t) length; sge.lkey = mr->lkey; return rdma_post_readv(id, context, &sge, 1, flags, remote_addr, rkey); } static inline int rdma_post_write(struct rdma_cm_id *id, void *context, void *addr, size_t length, struct ibv_mr *mr, int flags, uint64_t remote_addr, uint32_t rkey) { struct ibv_sge sge; sge.addr = (uint64_t) (uintptr_t) addr; sge.length = (uint32_t) length; sge.lkey = mr ? mr->lkey : 0; return rdma_post_writev(id, context, &sge, 1, flags, remote_addr, rkey); } static inline int rdma_post_ud_send(struct rdma_cm_id *id, void *context, void *addr, size_t length, struct ibv_mr *mr, int flags, struct ibv_ah *ah, uint32_t remote_qpn) { struct ibv_send_wr wr, *bad; struct ibv_sge sge; sge.addr = (uint64_t) (uintptr_t) addr; sge.length = (uint32_t) length; sge.lkey = mr ? 
mr->lkey : 0; wr.wr_id = (uintptr_t) context; wr.next = NULL; wr.sg_list = &sge; wr.num_sge = 1; wr.opcode = IBV_WR_SEND; wr.send_flags = flags; wr.wr.ud.ah = ah; wr.wr.ud.remote_qpn = remote_qpn; wr.wr.ud.remote_qkey = RDMA_UDP_QKEY; return rdma_seterrno(ibv_post_send(id->qp, &wr, &bad)); } static inline int rdma_get_send_comp(struct rdma_cm_id *id, struct ibv_wc *wc) { struct ibv_cq *cq; void *context; int ret; do { ret = ibv_poll_cq(id->send_cq, 1, wc); if (ret) break; ret = ibv_req_notify_cq(id->send_cq, 0); if (ret) return rdma_seterrno(ret); ret = ibv_poll_cq(id->send_cq, 1, wc); if (ret) break; ret = ibv_get_cq_event(id->send_cq_channel, &cq, &context); if (ret) return ret; assert(cq == id->send_cq && context == id); ibv_ack_cq_events(id->send_cq, 1); } while (1); return (ret < 0) ? rdma_seterrno(ret) : ret; } static inline int rdma_get_recv_comp(struct rdma_cm_id *id, struct ibv_wc *wc) { struct ibv_cq *cq; void *context; int ret; do { ret = ibv_poll_cq(id->recv_cq, 1, wc); if (ret) break; ret = ibv_req_notify_cq(id->recv_cq, 0); if (ret) return rdma_seterrno(ret); ret = ibv_poll_cq(id->recv_cq, 1, wc); if (ret) break; ret = ibv_get_cq_event(id->recv_cq_channel, &cq, &context); if (ret) return ret; assert(cq == id->recv_cq && context == id); ibv_ack_cq_events(id->recv_cq, 1); } while (1); return (ret < 0) ? rdma_seterrno(ret) : ret; } #ifdef __cplusplus } #endif #endif /* RDMA_CMA_H */ rdma-core-56.1/librdmacm/rsocket.c000066400000000000000000003305061477342711600171110ustar00rootroot00000000000000/* * Copyright (c) 2008-2019 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "cma.h" #include "indexer.h" #define RS_OLAP_START_SIZE 2048 #define RS_MAX_TRANSFER 65536 #define RS_SNDLOWAT 2048 #define RS_QP_MIN_SIZE 16 #define RS_QP_MAX_SIZE 0xFFFE #define RS_QP_CTRL_SIZE 4 /* must be power of 2 */ #define RS_CONN_RETRIES 6 #define RS_SGL_SIZE 2 static struct index_map idm; static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER; static pthread_mutex_t svc_mut = PTHREAD_MUTEX_INITIALIZER; struct rsocket; enum { RS_SVC_NOOP, RS_SVC_ADD_DGRAM, RS_SVC_REM_DGRAM, RS_SVC_ADD_KEEPALIVE, RS_SVC_REM_KEEPALIVE, RS_SVC_MOD_KEEPALIVE, RS_SVC_ADD_CM, RS_SVC_REM_CM, }; struct rs_svc_msg { uint32_t cmd; uint32_t status; struct rsocket *rs; }; struct rs_svc { pthread_t id; int sock[2]; int cnt; int size; int context_size; void *(*run)(void *svc); struct rsocket **rss; void *contexts; }; static struct pollfd *udp_svc_fds; static void *udp_svc_run(void *arg); static struct rs_svc udp_svc = { .context_size = sizeof(*udp_svc_fds), .run = udp_svc_run }; static uint64_t *tcp_svc_timeouts; static void *tcp_svc_run(void *arg); static struct rs_svc tcp_svc = { .context_size = sizeof(*tcp_svc_timeouts), .run = tcp_svc_run }; static void *cm_svc_run(void *arg); static struct rs_svc listen_svc = { .context_size = sizeof(struct pollfd), .run = cm_svc_run }; static struct rs_svc connect_svc = { .context_size = sizeof(struct pollfd), .run = cm_svc_run }; static uint32_t pollcnt; static bool suspendpoll; static int pollsignal = -1; static uint16_t def_iomap_size = 0; static uint16_t def_inline = 64; static uint16_t def_sqsize = 384; static uint16_t def_rqsize = 384; static uint32_t def_mem = (1 << 17); static uint32_t def_wmem = (1 << 17); static uint32_t polling_time = 10; static int wake_up_interval = 5000; /* * Immediate data format is determined by the upper bits * bit 31: message type, 0 - data, 1 - control * bit 30: buffers updated, 0 - target, 1 - direct-receive * bit 29: more data, 0 - end of transfer, 1 - more data available * * for data transfers: * bits [28:0]: bytes transferred * for control messages: * SGL, CTRL * bits [28-0]: receive credits granted * IOMAP_SGL * bits [28-16]: reserved, bits [15-0]: index */ enum { RS_OP_DATA, RS_OP_RSVD_DATA_MORE, RS_OP_WRITE, /* opcode is not transmitted over the network */ RS_OP_RSVD_DRA_MORE, RS_OP_SGL, RS_OP_RSVD, RS_OP_IOMAP_SGL, RS_OP_CTRL }; #define rs_msg_set(op, data) ((op << 29) | (uint32_t) (data)) #define rs_msg_op(imm_data) (imm_data >> 29) #define rs_msg_data(imm_data) (imm_data & 0x1FFFFFFF) #define RS_MSG_SIZE sizeof(uint32_t) #define RS_WR_ID_FLAG_RECV (((uint64_t) 1) << 63) #define RS_WR_ID_FLAG_MSG_SEND (((uint64_t) 1) << 62) /* See RS_OPT_MSG_SEND */ #define rs_send_wr_id(data) ((uint64_t) data) #define rs_recv_wr_id(data) (RS_WR_ID_FLAG_RECV | (uint64_t) data) #define rs_wr_is_recv(wr_id) (wr_id & RS_WR_ID_FLAG_RECV) #define rs_wr_is_msg_send(wr_id) (wr_id & RS_WR_ID_FLAG_MSG_SEND) #define rs_wr_data(wr_id) ((uint32_t) wr_id) enum { RS_CTRL_DISCONNECT, RS_CTRL_KEEPALIVE, RS_CTRL_SHUTDOWN }; struct rs_msg { uint32_t op; uint32_t data; }; struct ds_qp; struct ds_rmsg { struct ds_qp *qp; uint32_t offset; uint32_t length; }; struct ds_smsg { struct ds_smsg *next; }; struct rs_sge { uint64_t addr; uint32_t key; uint32_t length; }; struct rs_iomap { uint64_t offset; struct rs_sge sge; }; 
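/*
 * Illustrative sketch (added for exposition; not part of the original
 * source): rs_msg_set()/rs_msg_op()/rs_msg_data() above pack a 3-bit
 * opcode and 29 bits of data into the 32-bit immediate word, e.g.:
 *
 *	uint32_t imm = rs_msg_set(RS_OP_DATA, 4096);
 *	rs_msg_op(imm)   == RS_OP_DATA   (opcode in bits [31:29])
 *	rs_msg_data(imm) == 4096         (byte count in bits [28:0])
 */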
struct rs_iomap_mr { uint64_t offset; struct ibv_mr *mr; dlist_entry entry; _Atomic(int) refcnt; int index; /* -1 if mapping is local and not in iomap_list */ }; #define RS_MAX_CTRL_MSG (sizeof(struct rs_sge)) #define rs_host_is_net() (__BYTE_ORDER == __BIG_ENDIAN) #define RS_CONN_FLAG_NET (1 << 0) #define RS_CONN_FLAG_IOMAP (1 << 1) struct rs_conn_data { uint8_t version; uint8_t flags; __be16 credits; uint8_t reserved[3]; uint8_t target_iomap_size; struct rs_sge target_sgl; struct rs_sge data_buf; }; struct rs_conn_private_data { union { struct rs_conn_data conn_data; struct { struct ib_connect_hdr ib_hdr; struct rs_conn_data conn_data; } af_ib; }; }; /* * rsocket states are ordered as passive, connecting, connected, disconnected. */ enum rs_state { rs_init, rs_bound = 0x0001, rs_listening = 0x0002, rs_opening = 0x0004, rs_resolving_addr = rs_opening | 0x0010, rs_resolving_route = rs_opening | 0x0020, rs_connecting = rs_opening | 0x0040, rs_accepting = rs_opening | 0x0080, rs_connected = 0x0100, rs_writable = 0x0200, rs_readable = 0x0400, rs_connect_rdwr = rs_connected | rs_readable | rs_writable, rs_connect_error = 0x0800, rs_disconnected = 0x1000, rs_error = 0x2000, }; #define RS_OPT_SWAP_SGL (1 << 0) /* * iWarp does not support RDMA write with immediate data. For iWarp, we * transfer rsocket messages as inline sends. */ #define RS_OPT_MSG_SEND (1 << 1) #define RS_OPT_UDP_SVC (1 << 2) #define RS_OPT_KEEPALIVE (1 << 3) #define RS_OPT_CM_SVC (1 << 4) union socket_addr { struct sockaddr sa; struct sockaddr_in sin; struct sockaddr_in6 sin6; }; struct ds_header { uint8_t version; uint8_t length; __be16 port; union { __be32 ipv4; struct { __be32 flowinfo; uint8_t addr[16]; } ipv6; } addr; }; #define DS_IPV4_HDR_LEN 8 #define DS_IPV6_HDR_LEN 24 struct ds_dest { union socket_addr addr; /* must be first */ struct ds_qp *qp; struct ibv_ah *ah; uint32_t qpn; }; struct ds_qp { dlist_entry list; struct rsocket *rs; struct rdma_cm_id *cm_id; struct ds_header hdr; struct ds_dest dest; struct ibv_mr *smr; struct ibv_mr *rmr; uint8_t *rbuf; int cq_armed; }; struct rsocket { int type; int index; fastlock_t slock; fastlock_t rlock; fastlock_t cq_lock; fastlock_t cq_wait_lock; fastlock_t map_lock; /* acquire slock first if needed */ union { /* data stream */ struct { struct rdma_cm_id *cm_id; uint64_t tcp_opts; unsigned int keepalive_time; int accept_queue[2]; unsigned int ctrl_seqno; unsigned int ctrl_max_seqno; uint16_t sseq_no; uint16_t sseq_comp; uint16_t rseq_no; uint16_t rseq_comp; int remote_sge; struct rs_sge remote_sgl; struct rs_sge remote_iomap; struct ibv_mr *target_mr; int target_sge; int target_iomap_size; void *target_buffer_list; volatile struct rs_sge *target_sgl; struct rs_iomap *target_iomap; int rbuf_msg_index; int rbuf_bytes_avail; int rbuf_free_offset; int rbuf_offset; struct ibv_mr *rmr; uint8_t *rbuf; int sbuf_bytes_avail; struct ibv_mr *smr; struct ibv_sge ssgl[2]; }; /* datagram */ struct { struct ds_qp *qp_list; void *dest_map; struct ds_dest *conn_dest; int udp_sock; int epfd; int rqe_avail; struct ds_smsg *smsg_free; }; }; int opts; int fd_flags; uint64_t so_opts; uint64_t ipv6_opts; void *optval; size_t optlen; int state; int cq_armed; int retries; int err; int sqe_avail; uint32_t sbuf_size; uint16_t sq_size; uint16_t sq_inline; uint32_t rbuf_size; uint16_t rq_size; int rmsg_head; int rmsg_tail; union { struct rs_msg *rmsg; struct ds_rmsg *dmsg; }; uint8_t *sbuf; struct rs_iomap_mr *remote_iomappings; dlist_entry iomap_list; dlist_entry iomap_queue; int iomap_pending; int 
unack_cqe; }; #define DS_UDP_TAG 0x55555555 struct ds_udp_header { __be32 tag; uint8_t version; uint8_t op; uint8_t length; uint8_t reserved; __be32 qpn; /* lower 8-bits reserved */ union { __be32 ipv4; uint8_t ipv6[16]; } addr; }; #define DS_UDP_IPV4_HDR_LEN 16 #define DS_UDP_IPV6_HDR_LEN 28 #define ds_next_qp(qp) container_of((qp)->list.next, struct ds_qp, list) static void write_all(int fd, const void *msg, size_t len) { // FIXME: if fd is a socket this really needs to handle EINTR and other conditions. ssize_t __attribute__((unused)) rc = write(fd, msg, len); assert(rc == len); } static void read_all(int fd, void *msg, size_t len) { // FIXME: if fd is a socket this really needs to handle EINTR and other conditions. ssize_t __attribute__((unused)) rc = read(fd, msg, len); assert(rc == len); } /** * Allocates entire pages for registered allocations so that MADV_DONTFORK will * not unmap valid memory in the child process when IBV_FORKSAFE is enabled. */ static void *forksafe_alloc(size_t len) { long pagesize = sysconf(_SC_PAGESIZE); void *ptr; int ret; len = ((len + pagesize - 1) / pagesize) * pagesize; ret = posix_memalign(&ptr, pagesize, len); if (ret) return NULL; memset(ptr, 0, len); return ptr; } static uint64_t rs_time_us(void) { struct timespec now; clock_gettime(CLOCK_MONOTONIC, &now); return now.tv_sec * 1000000 + now.tv_nsec / 1000; } static void ds_insert_qp(struct rsocket *rs, struct ds_qp *qp) { if (!rs->qp_list) dlist_init(&qp->list); else dlist_insert_head(&qp->list, &rs->qp_list->list); rs->qp_list = qp; } static void ds_remove_qp(struct rsocket *rs, struct ds_qp *qp) { if (qp->list.next != &qp->list) { rs->qp_list = ds_next_qp(qp); dlist_remove(&qp->list); } else { rs->qp_list = NULL; } } static int rs_notify_svc(struct rs_svc *svc, struct rsocket *rs, int cmd) { struct rs_svc_msg msg; int ret; pthread_mutex_lock(&svc_mut); if (!svc->cnt) { ret = socketpair(AF_UNIX, SOCK_STREAM, 0, svc->sock); if (ret) goto unlock; ret = pthread_create(&svc->id, NULL, svc->run, svc); if (ret) { ret = ERR(ret); goto closepair; } } msg.cmd = cmd; msg.status = EINVAL; msg.rs = rs; write_all(svc->sock[0], &msg, sizeof(msg)); read_all(svc->sock[0], &msg, sizeof(msg)); ret = rdma_seterrno(msg.status); if (svc->cnt) goto unlock; pthread_join(svc->id, NULL); closepair: close(svc->sock[0]); close(svc->sock[1]); unlock: pthread_mutex_unlock(&svc_mut); return ret; } static int ds_compare_addr(const void *dst1, const void *dst2) { const struct sockaddr *sa1, *sa2; size_t len; sa1 = (const struct sockaddr *) dst1; sa2 = (const struct sockaddr *) dst2; len = (sa1->sa_family == AF_INET6 && sa2->sa_family == AF_INET6) ? sizeof(struct sockaddr_in6) : sizeof(struct sockaddr_in); return memcmp(dst1, dst2, len); } static int rs_value_to_scale(int value, int bits) { return value <= (1 << (bits - 1)) ? value : (1 << (bits - 1)) | (value >> bits); } static int rs_scale_to_value(int value, int bits) { return value <= (1 << (bits - 1)) ? value : (value & ~(1 << (bits - 1))) << bits; } /* gcc > ~5 will not allow (void)fscanf to suppress -Wunused-result, but this will do it. In this case ignoring the result is OK (but horribly unfriendly to user) since the library has a sane default. */ #define failable_fscanf(f, fmt, ...) 
\ { \ int rc = fscanf(f, fmt, __VA_ARGS__); \ (void) rc; \ } static void rs_configure(void) { FILE *f; static int init; if (init) return; pthread_mutex_lock(&mut); if (init) goto out; if (ucma_init()) goto out; ucma_ib_init(); if ((f = fopen(RS_CONF_DIR "/polling_time", "r"))) { failable_fscanf(f, "%u", &polling_time); fclose(f); } f = fopen(RS_CONF_DIR "/wake_up_interval", "r"); if (f) { failable_fscanf(f, "%d", &wake_up_interval); fclose(f); } if ((f = fopen(RS_CONF_DIR "/inline_default", "r"))) { failable_fscanf(f, "%hu", &def_inline); fclose(f); } if ((f = fopen(RS_CONF_DIR "/sqsize_default", "r"))) { failable_fscanf(f, "%hu", &def_sqsize); fclose(f); } if ((f = fopen(RS_CONF_DIR "/rqsize_default", "r"))) { failable_fscanf(f, "%hu", &def_rqsize); fclose(f); } if ((f = fopen(RS_CONF_DIR "/mem_default", "r"))) { failable_fscanf(f, "%u", &def_mem); fclose(f); if (def_mem < 1) def_mem = 1; } if ((f = fopen(RS_CONF_DIR "/wmem_default", "r"))) { failable_fscanf(f, "%u", &def_wmem); fclose(f); if (def_wmem < RS_SNDLOWAT) def_wmem = RS_SNDLOWAT << 1; } if ((f = fopen(RS_CONF_DIR "/iomap_size", "r"))) { failable_fscanf(f, "%hu", &def_iomap_size); fclose(f); /* round to supported values */ def_iomap_size = (uint8_t) rs_value_to_scale( (uint16_t) rs_scale_to_value(def_iomap_size, 8), 8); } init = 1; out: pthread_mutex_unlock(&mut); } static int rs_insert(struct rsocket *rs, int index) { pthread_mutex_lock(&mut); rs->index = idm_set(&idm, index, rs); pthread_mutex_unlock(&mut); return rs->index; } static void rs_remove(struct rsocket *rs) { pthread_mutex_lock(&mut); idm_clear(&idm, rs->index); pthread_mutex_unlock(&mut); } /* We only inherit from listening sockets */ static struct rsocket *rs_alloc(struct rsocket *inherited_rs, int type) { struct rsocket *rs; rs = calloc(1, sizeof(*rs)); if (!rs) return NULL; rs->type = type; rs->index = -1; if (type == SOCK_DGRAM) { rs->udp_sock = -1; rs->epfd = -1; } if (inherited_rs) { rs->sbuf_size = inherited_rs->sbuf_size; rs->rbuf_size = inherited_rs->rbuf_size; rs->sq_inline = inherited_rs->sq_inline; rs->sq_size = inherited_rs->sq_size; rs->rq_size = inherited_rs->rq_size; if (type == SOCK_STREAM) { rs->ctrl_max_seqno = inherited_rs->ctrl_max_seqno; rs->target_iomap_size = inherited_rs->target_iomap_size; } } else { rs->sbuf_size = def_wmem; rs->rbuf_size = def_mem; rs->sq_inline = def_inline; rs->sq_size = def_sqsize; rs->rq_size = def_rqsize; if (type == SOCK_STREAM) { rs->ctrl_max_seqno = RS_QP_CTRL_SIZE; rs->target_iomap_size = def_iomap_size; } } fastlock_init(&rs->slock); fastlock_init(&rs->rlock); fastlock_init(&rs->cq_lock); fastlock_init(&rs->cq_wait_lock); fastlock_init(&rs->map_lock); dlist_init(&rs->iomap_list); dlist_init(&rs->iomap_queue); return rs; } static int rs_set_nonblocking(struct rsocket *rs, int arg) { struct ds_qp *qp; int ret = 0; if (rs->type == SOCK_STREAM) { if (rs->cm_id->recv_cq_channel) ret = fcntl(rs->cm_id->recv_cq_channel->fd, F_SETFL, arg); if (rs->state == rs_listening) ret = fcntl(rs->accept_queue[0], F_SETFL, arg); else if (!ret && rs->state < rs_connected) ret = fcntl(rs->cm_id->channel->fd, F_SETFL, arg); } else { ret = fcntl(rs->epfd, F_SETFL, arg); if (!ret && rs->qp_list) { qp = rs->qp_list; do { ret = fcntl(qp->cm_id->recv_cq_channel->fd, F_SETFL, arg); qp = ds_next_qp(qp); } while (qp != rs->qp_list && !ret); } } return ret; } static void rs_set_qp_size(struct rsocket *rs) { uint16_t max_size; max_size = min(ucma_max_qpsize(rs->cm_id), RS_QP_MAX_SIZE); if (rs->sq_size > max_size) rs->sq_size = max_size; else 
if (rs->sq_size < RS_QP_MIN_SIZE) rs->sq_size = RS_QP_MIN_SIZE; if (rs->rq_size > max_size) rs->rq_size = max_size; else if (rs->rq_size < RS_QP_MIN_SIZE) rs->rq_size = RS_QP_MIN_SIZE; } static void ds_set_qp_size(struct rsocket *rs) { uint16_t max_size; max_size = min(ucma_max_qpsize(NULL), RS_QP_MAX_SIZE); if (rs->sq_size > max_size) rs->sq_size = max_size; if (rs->rq_size > max_size) rs->rq_size = max_size; if (rs->rq_size > (rs->rbuf_size / RS_SNDLOWAT)) rs->rq_size = rs->rbuf_size / RS_SNDLOWAT; else rs->rbuf_size = rs->rq_size * RS_SNDLOWAT; if (rs->sq_size > (rs->sbuf_size / RS_SNDLOWAT)) rs->sq_size = rs->sbuf_size / RS_SNDLOWAT; else rs->sbuf_size = rs->sq_size * RS_SNDLOWAT; } static int rs_init_bufs(struct rsocket *rs) { uint32_t total_rbuf_size, total_sbuf_size; size_t len; rs->rmsg = calloc(rs->rq_size + 1, sizeof(*rs->rmsg)); if (!rs->rmsg) return ERR(ENOMEM); total_sbuf_size = rs->sbuf_size; if (rs->sq_inline < RS_MAX_CTRL_MSG) total_sbuf_size += RS_MAX_CTRL_MSG * RS_QP_CTRL_SIZE; rs->sbuf = forksafe_alloc(total_sbuf_size); if (!rs->sbuf) return ERR(ENOMEM); rs->smr = rdma_reg_msgs(rs->cm_id, rs->sbuf, total_sbuf_size); if (!rs->smr) return -1; len = sizeof(*rs->target_sgl) * RS_SGL_SIZE + sizeof(*rs->target_iomap) * rs->target_iomap_size; rs->target_buffer_list = forksafe_alloc(len); if (!rs->target_buffer_list) return ERR(ENOMEM); rs->target_mr = rdma_reg_write(rs->cm_id, rs->target_buffer_list, len); if (!rs->target_mr) return -1; rs->target_sgl = rs->target_buffer_list; if (rs->target_iomap_size) rs->target_iomap = (struct rs_iomap *) (rs->target_sgl + RS_SGL_SIZE); total_rbuf_size = rs->rbuf_size; if (rs->opts & RS_OPT_MSG_SEND) total_rbuf_size += rs->rq_size * RS_MSG_SIZE; rs->rbuf = forksafe_alloc(total_rbuf_size); if (!rs->rbuf) return ERR(ENOMEM); rs->rmr = rdma_reg_write(rs->cm_id, rs->rbuf, total_rbuf_size); if (!rs->rmr) return -1; rs->ssgl[0].addr = rs->ssgl[1].addr = (uintptr_t) rs->sbuf; rs->sbuf_bytes_avail = rs->sbuf_size; rs->ssgl[0].lkey = rs->ssgl[1].lkey = rs->smr->lkey; rs->rbuf_free_offset = rs->rbuf_size >> 1; rs->rbuf_bytes_avail = rs->rbuf_size >> 1; rs->sqe_avail = rs->sq_size - rs->ctrl_max_seqno; rs->rseq_comp = rs->rq_size >> 1; return 0; } static int ds_init_bufs(struct ds_qp *qp) { qp->rbuf = forksafe_alloc(qp->rs->rbuf_size + sizeof(struct ibv_grh)); if (!qp->rbuf) return ERR(ENOMEM); qp->smr = rdma_reg_msgs(qp->cm_id, qp->rs->sbuf, qp->rs->sbuf_size); if (!qp->smr) return -1; qp->rmr = rdma_reg_msgs(qp->cm_id, qp->rbuf, qp->rs->rbuf_size + sizeof(struct ibv_grh)); if (!qp->rmr) return -1; return 0; } /* * If a user is waiting on a datagram rsocket through poll or select, then * we need the first completion to generate an event on the related epoll fd * in order to signal the user. 
We arm the CQ on creation for this purpose */ static int rs_create_cq(struct rsocket *rs, struct rdma_cm_id *cm_id) { cm_id->recv_cq_channel = ibv_create_comp_channel(cm_id->verbs); if (!cm_id->recv_cq_channel) return -1; cm_id->recv_cq = ibv_create_cq(cm_id->verbs, rs->sq_size + rs->rq_size, cm_id, cm_id->recv_cq_channel, 0); if (!cm_id->recv_cq) goto err1; if (rs->fd_flags & O_NONBLOCK) { if (set_fd_nonblock(cm_id->recv_cq_channel->fd, true)) goto err2; } ibv_req_notify_cq(cm_id->recv_cq, 0); cm_id->send_cq_channel = cm_id->recv_cq_channel; cm_id->send_cq = cm_id->recv_cq; return 0; err2: ibv_destroy_cq(cm_id->recv_cq); cm_id->recv_cq = NULL; err1: ibv_destroy_comp_channel(cm_id->recv_cq_channel); cm_id->recv_cq_channel = NULL; return -1; } static inline int rs_post_recv(struct rsocket *rs) { struct ibv_recv_wr wr, *bad; struct ibv_sge sge; wr.next = NULL; if (!(rs->opts & RS_OPT_MSG_SEND)) { wr.wr_id = rs_recv_wr_id(0); wr.sg_list = NULL; wr.num_sge = 0; } else { wr.wr_id = rs_recv_wr_id(rs->rbuf_msg_index); sge.addr = (uintptr_t) rs->rbuf + rs->rbuf_size + (rs->rbuf_msg_index * RS_MSG_SIZE); sge.length = RS_MSG_SIZE; sge.lkey = rs->rmr->lkey; wr.sg_list = &sge; wr.num_sge = 1; if(++rs->rbuf_msg_index == rs->rq_size) rs->rbuf_msg_index = 0; } return rdma_seterrno(ibv_post_recv(rs->cm_id->qp, &wr, &bad)); } static inline int ds_post_recv(struct rsocket *rs, struct ds_qp *qp, uint32_t offset) { struct ibv_recv_wr wr, *bad; struct ibv_sge sge[2]; sge[0].addr = (uintptr_t) qp->rbuf + rs->rbuf_size; sge[0].length = sizeof(struct ibv_grh); sge[0].lkey = qp->rmr->lkey; sge[1].addr = (uintptr_t) qp->rbuf + offset; sge[1].length = RS_SNDLOWAT; sge[1].lkey = qp->rmr->lkey; wr.wr_id = rs_recv_wr_id(offset); wr.next = NULL; wr.sg_list = sge; wr.num_sge = 2; return rdma_seterrno(ibv_post_recv(qp->cm_id->qp, &wr, &bad)); } static int rs_create_ep(struct rsocket *rs) { struct ibv_qp_init_attr qp_attr; int i, ret; rs_set_qp_size(rs); if (rs->cm_id->verbs->device->transport_type == IBV_TRANSPORT_IWARP) { rs->opts |= RS_OPT_MSG_SEND; if (rs->sq_inline < RS_MSG_SIZE) rs->sq_inline = RS_MSG_SIZE; } ret = rs_create_cq(rs, rs->cm_id); if (ret) return ret; memset(&qp_attr, 0, sizeof qp_attr); qp_attr.qp_context = rs; qp_attr.send_cq = rs->cm_id->send_cq; qp_attr.recv_cq = rs->cm_id->recv_cq; qp_attr.qp_type = IBV_QPT_RC; qp_attr.sq_sig_all = 1; qp_attr.cap.max_send_wr = rs->sq_size; qp_attr.cap.max_recv_wr = rs->rq_size; qp_attr.cap.max_send_sge = 2; qp_attr.cap.max_recv_sge = 1; qp_attr.cap.max_inline_data = rs->sq_inline; ret = rdma_create_qp(rs->cm_id, NULL, &qp_attr); if (ret) return ret; rs->sq_inline = qp_attr.cap.max_inline_data; if ((rs->opts & RS_OPT_MSG_SEND) && (rs->sq_inline < RS_MSG_SIZE)) return ERR(ENOTSUP); ret = rs_init_bufs(rs); if (ret) return ret; for (i = 0; i < rs->rq_size; i++) { ret = rs_post_recv(rs); if (ret) return ret; } return 0; } static void rs_release_iomap_mr(struct rs_iomap_mr *iomr) { if (atomic_fetch_sub(&iomr->refcnt, 1) != 1) return; dlist_remove(&iomr->entry); ibv_dereg_mr(iomr->mr); if (iomr->index >= 0) iomr->mr = NULL; else free(iomr); } static void rs_free_iomappings(struct rsocket *rs) { struct rs_iomap_mr *iomr; while (!dlist_empty(&rs->iomap_list)) { iomr = container_of(rs->iomap_list.next, struct rs_iomap_mr, entry); riounmap(rs->index, iomr->mr->addr, iomr->mr->length); } while (!dlist_empty(&rs->iomap_queue)) { iomr = container_of(rs->iomap_queue.next, struct rs_iomap_mr, entry); riounmap(rs->index, iomr->mr->addr, iomr->mr->length); } } static void 
ds_free_qp(struct ds_qp *qp) { if (qp->smr) rdma_dereg_mr(qp->smr); if (qp->rbuf) { if (qp->rmr) rdma_dereg_mr(qp->rmr); free(qp->rbuf); } if (qp->cm_id) { if (qp->cm_id->qp) { tdelete(&qp->dest.addr, &qp->rs->dest_map, ds_compare_addr); epoll_ctl(qp->rs->epfd, EPOLL_CTL_DEL, qp->cm_id->recv_cq_channel->fd, NULL); rdma_destroy_qp(qp->cm_id); } rdma_destroy_id(qp->cm_id); } free(qp); } static void ds_free(struct rsocket *rs) { struct ds_qp *qp; if (rs->udp_sock >= 0) close(rs->udp_sock); if (rs->index >= 0) rs_remove(rs); if (rs->dmsg) free(rs->dmsg); while ((qp = rs->qp_list)) { ds_remove_qp(rs, qp); ds_free_qp(qp); } if (rs->epfd >= 0) close(rs->epfd); if (rs->sbuf) free(rs->sbuf); tdestroy(rs->dest_map, free); fastlock_destroy(&rs->map_lock); fastlock_destroy(&rs->cq_wait_lock); fastlock_destroy(&rs->cq_lock); fastlock_destroy(&rs->rlock); fastlock_destroy(&rs->slock); free(rs); } static void rs_free(struct rsocket *rs) { if (rs->type == SOCK_DGRAM) { ds_free(rs); return; } if (rs->rmsg) free(rs->rmsg); if (rs->sbuf) { if (rs->smr) rdma_dereg_mr(rs->smr); free(rs->sbuf); } if (rs->rbuf) { if (rs->rmr) rdma_dereg_mr(rs->rmr); free(rs->rbuf); } if (rs->target_buffer_list) { if (rs->target_mr) rdma_dereg_mr(rs->target_mr); free(rs->target_buffer_list); } if (rs->index >= 0) rs_remove(rs); if (rs->cm_id) { rs_free_iomappings(rs); if (rs->cm_id->qp) { ibv_ack_cq_events(rs->cm_id->recv_cq, rs->unack_cqe); rdma_destroy_qp(rs->cm_id); } rdma_destroy_id(rs->cm_id); } if (rs->accept_queue[0] > 0 || rs->accept_queue[1] > 0) { close(rs->accept_queue[0]); close(rs->accept_queue[1]); } fastlock_destroy(&rs->map_lock); fastlock_destroy(&rs->cq_wait_lock); fastlock_destroy(&rs->cq_lock); fastlock_destroy(&rs->rlock); fastlock_destroy(&rs->slock); free(rs); } static size_t rs_conn_data_offset(struct rsocket *rs) { return (rs->cm_id->route.addr.src_addr.sa_family == AF_IB) ? sizeof(struct ib_connect_hdr) : 0; } static void rs_format_conn_data(struct rsocket *rs, struct rs_conn_data *conn) { conn->version = 1; conn->flags = RS_CONN_FLAG_IOMAP | (rs_host_is_net() ? 
static void rs_format_conn_data(struct rsocket *rs, struct rs_conn_data *conn)
{
	conn->version = 1;
	conn->flags = RS_CONN_FLAG_IOMAP |
		      (rs_host_is_net() ? RS_CONN_FLAG_NET : 0);
	conn->credits = htobe16(rs->rq_size);
	memset(conn->reserved, 0, sizeof conn->reserved);
	conn->target_iomap_size = (uint8_t) rs_value_to_scale(rs->target_iomap_size, 8);

	conn->target_sgl.addr = (__force uint64_t)htobe64((uintptr_t) rs->target_sgl);
	conn->target_sgl.length = (__force uint32_t)htobe32(RS_SGL_SIZE);
	conn->target_sgl.key = (__force uint32_t)htobe32(rs->target_mr->rkey);

	conn->data_buf.addr = (__force uint64_t)htobe64((uintptr_t) rs->rbuf);
	conn->data_buf.length = (__force uint32_t)htobe32(rs->rbuf_size >> 1);
	conn->data_buf.key = (__force uint32_t)htobe32(rs->rmr->rkey);
}

static void rs_save_conn_data(struct rsocket *rs, struct rs_conn_data *conn)
{
	rs->remote_sgl.addr = be64toh((__force __be64)conn->target_sgl.addr);
	rs->remote_sgl.length = be32toh((__force __be32)conn->target_sgl.length);
	rs->remote_sgl.key = be32toh((__force __be32)conn->target_sgl.key);
	rs->remote_sge = 1;

	if ((rs_host_is_net() && !(conn->flags & RS_CONN_FLAG_NET)) ||
	    (!rs_host_is_net() && (conn->flags & RS_CONN_FLAG_NET)))
		rs->opts = RS_OPT_SWAP_SGL;

	if (conn->flags & RS_CONN_FLAG_IOMAP) {
		rs->remote_iomap.addr = rs->remote_sgl.addr +
					sizeof(rs->remote_sgl) * rs->remote_sgl.length;
		rs->remote_iomap.length = rs_scale_to_value(conn->target_iomap_size, 8);
		rs->remote_iomap.key = rs->remote_sgl.key;
	}

	rs->target_sgl[0].addr = be64toh((__force __be64)conn->data_buf.addr);
	rs->target_sgl[0].length = be32toh((__force __be32)conn->data_buf.length);
	rs->target_sgl[0].key = be32toh((__force __be32)conn->data_buf.key);

	rs->sseq_comp = be16toh(conn->credits);
}

static int ds_init(struct rsocket *rs, int domain)
{
	rs->udp_sock = socket(domain, SOCK_DGRAM, 0);
	if (rs->udp_sock < 0)
		return rs->udp_sock;

	rs->epfd = epoll_create(2);
	if (rs->epfd < 0)
		return rs->epfd;

	return 0;
}

static int ds_init_ep(struct rsocket *rs)
{
	struct ds_smsg *msg;
	int i, ret;

	ds_set_qp_size(rs);

	rs->sbuf = calloc(rs->sq_size, RS_SNDLOWAT);
	if (!rs->sbuf)
		return ERR(ENOMEM);

	rs->dmsg = calloc(rs->rq_size + 1, sizeof(*rs->dmsg));
	if (!rs->dmsg)
		return ERR(ENOMEM);

	rs->sqe_avail = rs->sq_size;
	rs->rqe_avail = rs->rq_size;

	rs->smsg_free = (struct ds_smsg *) rs->sbuf;
	msg = rs->smsg_free;
	for (i = 0; i < rs->sq_size - 1; i++) {
		msg->next = (void *) msg + RS_SNDLOWAT;
		msg = msg->next;
	}
	msg->next = NULL;

	ret = rs_notify_svc(&udp_svc, rs, RS_SVC_ADD_DGRAM);
	if (ret)
		return ret;

	rs->state = rs_readable | rs_writable;
	return 0;
}

int rsocket(int domain, int type, int protocol)
{
	struct rsocket *rs;
	int index, ret;

	if ((domain != AF_INET && domain != AF_INET6 && domain != AF_IB) ||
	    ((type != SOCK_STREAM) && (type != SOCK_DGRAM)) ||
	    (type == SOCK_STREAM && protocol && protocol != IPPROTO_TCP) ||
	    (type == SOCK_DGRAM && protocol && protocol != IPPROTO_UDP))
		return ERR(ENOTSUP);

	rs_configure();
	rs = rs_alloc(NULL, type);
	if (!rs)
		return ERR(ENOMEM);

	if (type == SOCK_STREAM) {
		ret = rdma_create_id(NULL, &rs->cm_id, rs, RDMA_PS_TCP);
		if (ret)
			goto err;

		rs->cm_id->route.addr.src_addr.sa_family = domain;
		index = rs->cm_id->channel->fd;
	} else {
		ret = ds_init(rs, domain);
		if (ret)
			goto err;

		index = rs->udp_sock;
	}

	ret = rs_insert(rs, index);
	if (ret < 0)
		goto err;

	return rs->index;

err:
	rs_free(rs);
	return ret;
}
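/*
 * Illustrative sketch (not part of the library build): a minimal
 * TCP-style client built on the rsockets calls defined in this file and
 * declared in <rdma/rsocket.h>.  Names are placeholders and error
 * handling is trimmed to the bare minimum.
 */
#if 0
static int example_client(const struct sockaddr_in *server)
{
	char buf[64] = "hello";
	int fd, ret;

	fd = rsocket(AF_INET, SOCK_STREAM, 0);
	if (fd < 0)
		return fd;

	ret = rconnect(fd, (const struct sockaddr *) server, sizeof(*server));
	if (ret) {
		rclose(fd);
		return ret;
	}

	/* rsend/rrecv mirror send(2)/recv(2) semantics */
	rsend(fd, buf, sizeof(buf), 0);
	rrecv(fd, buf, sizeof(buf), 0);
	return rclose(fd);
}
#endif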
int rbind(int socket, const struct sockaddr *addr, socklen_t addrlen)
{
	struct rsocket *rs;
	int ret;

	rs = idm_lookup(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	if (rs->type == SOCK_STREAM) {
		ret = rdma_bind_addr(rs->cm_id, (struct sockaddr *) addr);
		if (!ret)
			rs->state = rs_bound;
	} else {
		if (rs->state == rs_init) {
			ret = ds_init_ep(rs);
			if (ret)
				return ret;
		}
		ret = bind(rs->udp_sock, addr, addrlen);
	}
	return ret;
}

int rlisten(int socket, int backlog)
{
	struct rsocket *rs;
	int ret;

	rs = idm_lookup(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	if (rs->state == rs_listening)
		return 0;

	ret = rdma_listen(rs->cm_id, backlog);
	if (ret)
		return ret;

	ret = socketpair(AF_UNIX, SOCK_STREAM, 0, rs->accept_queue);
	if (ret)
		return ret;

	if (rs->fd_flags & O_NONBLOCK) {
		ret = set_fd_nonblock(rs->accept_queue[0], true);
		if (ret)
			return ret;
	}

	ret = set_fd_nonblock(rs->cm_id->channel->fd, true);
	if (ret)
		return ret;

	ret = rs_notify_svc(&listen_svc, rs, RS_SVC_ADD_CM);
	if (ret)
		return ret;

	rs->state = rs_listening;
	return 0;
}

/* Accepting new connection requests is currently a blocking operation */
static void rs_accept(struct rsocket *rs)
{
	struct rsocket *new_rs;
	struct rdma_conn_param param;
	struct rs_conn_data *creq, cresp;
	struct rdma_cm_id *cm_id;
	int ret;

	ret = rdma_get_request(rs->cm_id, &cm_id);
	if (ret)
		return;

	new_rs = rs_alloc(rs, rs->type);
	if (!new_rs)
		goto err;

	new_rs->cm_id = cm_id;
	ret = rs_insert(new_rs, new_rs->cm_id->channel->fd);
	if (ret < 0)
		goto err;

	creq = (struct rs_conn_data *)
	       (new_rs->cm_id->event->param.conn.private_data + rs_conn_data_offset(rs));
	if (creq->version != 1)
		goto err;

	ret = rs_create_ep(new_rs);
	if (ret)
		goto err;

	rs_save_conn_data(new_rs, creq);
	param = new_rs->cm_id->event->param.conn;
	rs_format_conn_data(new_rs, &cresp);
	param.private_data = &cresp;
	param.private_data_len = sizeof cresp;
	ret = rdma_accept(new_rs->cm_id, &param);
	if (!ret)
		new_rs->state = rs_connect_rdwr;
	else if (errno == EAGAIN || errno == EWOULDBLOCK)
		new_rs->state = rs_accepting;
	else
		goto err;

	write_all(rs->accept_queue[1], &new_rs, sizeof(new_rs));
	return;

err:
	rdma_reject(cm_id, NULL, 0);
	if (new_rs)
		rs_free(new_rs);
}

int raccept(int socket, struct sockaddr *addr, socklen_t *addrlen)
{
	struct rsocket *rs, *new_rs;
	int ret;

	rs = idm_lookup(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	if (rs->state != rs_listening)
		return ERR(EBADF);

	ret = read(rs->accept_queue[0], &new_rs, sizeof(new_rs));
	if (ret != sizeof(new_rs))
		return ret;

	if (addr && addrlen)
		rgetpeername(new_rs->index, addr, addrlen);

	/* The app can still drive the CM state on failure */
	int save_errno = errno;
	rs_notify_svc(&connect_svc, new_rs, RS_SVC_ADD_CM);
	errno = save_errno;

	return new_rs->index;
}
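/*
 * Illustrative sketch (not part of the library build): the matching
 * passive-side flow for the client sketch above.  Names are placeholders
 * and error handling is trimmed.
 */
#if 0
static int example_server(const struct sockaddr_in *bind_addr)
{
	char buf[64];
	int lfd, cfd;

	lfd = rsocket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0)
		return lfd;

	if (rbind(lfd, (const struct sockaddr *) bind_addr,
		  sizeof(*bind_addr)) || rlisten(lfd, 1)) {
		rclose(lfd);
		return -1;
	}

	/* Blocks until a connection request has been queued */
	cfd = raccept(lfd, NULL, NULL);
	if (cfd >= 0) {
		rrecv(cfd, buf, sizeof(buf), 0);
		rclose(cfd);
	}
	rclose(lfd);
	return 0;
}
#endif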
static int rs_do_connect(struct rsocket *rs)
{
	struct rdma_conn_param param;
	struct rs_conn_private_data cdata;
	struct rs_conn_data *creq, *cresp;
	int to, ret;

	fastlock_acquire(&rs->slock);
	switch (rs->state) {
	case rs_init:
	case rs_bound:
resolve_addr:
		to = 1000 << rs->retries++;
		ret = rdma_resolve_addr(rs->cm_id, NULL,
					&rs->cm_id->route.addr.dst_addr, to);
		if (!ret)
			goto resolve_route;
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			rs->state = rs_resolving_addr;
		break;
	case rs_resolving_addr:
		ret = ucma_complete(rs->cm_id);
		if (ret) {
			if (errno == ETIMEDOUT && rs->retries <= RS_CONN_RETRIES)
				goto resolve_addr;
			break;
		}

		rs->retries = 0;
resolve_route:
		to = 1000 << rs->retries++;
		if (rs->optval) {
			ret = rdma_set_option(rs->cm_id, RDMA_OPTION_IB,
					      RDMA_OPTION_IB_PATH, rs->optval,
					      rs->optlen);
			free(rs->optval);
			rs->optval = NULL;
			if (!ret) {
				rs->state = rs_resolving_route;
				goto resolving_route;
			}
		} else {
			ret = rdma_resolve_route(rs->cm_id, to);
			if (!ret)
				goto do_connect;
		}
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			rs->state = rs_resolving_route;
		break;
	case rs_resolving_route:
resolving_route:
		ret = ucma_complete(rs->cm_id);
		if (ret) {
			if (errno == ETIMEDOUT && rs->retries <= RS_CONN_RETRIES)
				goto resolve_route;
			break;
		}
do_connect:
		ret = rs_create_ep(rs);
		if (ret)
			break;

		memset(&param, 0, sizeof param);
		creq = (void *) &cdata + rs_conn_data_offset(rs);
		rs_format_conn_data(rs, creq);
		param.private_data = (void *) creq - rs_conn_data_offset(rs);
		param.private_data_len = sizeof(*creq) + rs_conn_data_offset(rs);
		param.flow_control = 1;
		param.retry_count = 7;
		param.rnr_retry_count = 7;
		/* work-around: iWarp issues RDMA read during connection */
		if (rs->opts & RS_OPT_MSG_SEND)
			param.initiator_depth = 1;
		rs->retries = 0;

		ret = rdma_connect(rs->cm_id, &param);
		if (!ret)
			goto connected;
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			rs->state = rs_connecting;
		break;
	case rs_connecting:
		ret = ucma_complete(rs->cm_id);
		if (ret)
			break;
connected:
		cresp = (struct rs_conn_data *) rs->cm_id->event->param.conn.private_data;
		if (cresp->version != 1) {
			ret = ERR(ENOTSUP);
			break;
		}

		rs_save_conn_data(rs, cresp);
		rs->state = rs_connect_rdwr;
		break;
	case rs_accepting:
		if (!(rs->fd_flags & O_NONBLOCK))
			set_fd_nonblock(rs->cm_id->channel->fd, true);

		ret = ucma_complete(rs->cm_id);
		if (ret)
			break;

		rs->state = rs_connect_rdwr;
		break;
	case rs_connect_error:
	case rs_disconnected:
	case rs_error:
		ret = ERR(ENOTCONN);
		goto unlock;
	default:
		ret = (rs->state & rs_connected) ? 0 : ERR(EINVAL);
		goto unlock;
	}

	if (ret) {
		if (errno == EAGAIN || errno == EWOULDBLOCK) {
			errno = EINPROGRESS;
		} else {
			rs->state = rs_connect_error;
			rs->err = errno;
		}
	}
unlock:
	fastlock_release(&rs->slock);
	return ret;
}

static int rs_any_addr(const union socket_addr *addr)
{
	if (addr->sa.sa_family == AF_INET) {
		return (addr->sin.sin_addr.s_addr == htobe32(INADDR_ANY) ||
			addr->sin.sin_addr.s_addr == htobe32(INADDR_LOOPBACK));
	} else {
		return (!memcmp(&addr->sin6.sin6_addr, &in6addr_any, 16) ||
			!memcmp(&addr->sin6.sin6_addr, &in6addr_loopback, 16));
	}
}

static int ds_get_src_addr(struct rsocket *rs,
			   const struct sockaddr *dest_addr, socklen_t dest_len,
			   union socket_addr *src_addr, socklen_t *src_len)
{
	int sock, ret;
	__be16 port;

	*src_len = sizeof(*src_addr);
	ret = getsockname(rs->udp_sock, &src_addr->sa, src_len);
	if (ret || !rs_any_addr(src_addr))
		return ret;

	port = src_addr->sin.sin_port;
	sock = socket(dest_addr->sa_family, SOCK_DGRAM, 0);
	if (sock < 0)
		return sock;

	ret = connect(sock, dest_addr, dest_len);
	if (ret)
		goto out;

	*src_len = sizeof(*src_addr);
	ret = getsockname(sock, &src_addr->sa, src_len);
	src_addr->sin.sin_port = port;
out:
	close(sock);
	return ret;
}

static void ds_format_hdr(struct ds_header *hdr, union socket_addr *addr)
{
	if (addr->sa.sa_family == AF_INET) {
		hdr->version = 4;
		hdr->length = DS_IPV4_HDR_LEN;
		hdr->port = addr->sin.sin_port;
		hdr->addr.ipv4 = addr->sin.sin_addr.s_addr;
	} else {
		hdr->version = 6;
		hdr->length = DS_IPV6_HDR_LEN;
		hdr->port = addr->sin6.sin6_port;
		hdr->addr.ipv6.flowinfo = addr->sin6.sin6_flowinfo;
		memcpy(&hdr->addr.ipv6.addr, &addr->sin6.sin6_addr, 16);
	}
}

static int ds_add_qp_dest(struct ds_qp *qp, union socket_addr *addr,
			  socklen_t addrlen)
{
	struct ibv_port_attr port_attr;
	struct ibv_ah_attr attr;
	int ret;

	memcpy(&qp->dest.addr, addr, addrlen);
	qp->dest.qp = qp;
	qp->dest.qpn = qp->cm_id->qp->qp_num;

	ret = ibv_query_port(qp->cm_id->verbs, qp->cm_id->port_num, &port_attr);
	if (ret)
		return ret;

	memset(&attr, 0, sizeof attr);
	attr.dlid = port_attr.lid;
	attr.port_num = qp->cm_id->port_num;
	qp->dest.ah = ibv_create_ah(qp->cm_id->pd, &attr);
	if (!qp->dest.ah)
		return ERR(ENOMEM);

	tsearch(&qp->dest.addr, &qp->rs->dest_map, ds_compare_addr);
	return 0;
}
static int ds_create_qp(struct rsocket *rs, union socket_addr *src_addr,
			socklen_t addrlen, struct ds_qp **new_qp)
{
	struct ds_qp *qp;
	struct ibv_qp_init_attr qp_attr;
	struct epoll_event event;
	int i, ret;

	qp = calloc(1, sizeof(*qp));
	if (!qp)
		return ERR(ENOMEM);

	qp->rs = rs;
	ret = rdma_create_id(NULL, &qp->cm_id, qp, RDMA_PS_UDP);
	if (ret)
		goto err;

	ds_format_hdr(&qp->hdr, src_addr);
	ret = rdma_bind_addr(qp->cm_id, &src_addr->sa);
	if (ret)
		goto err;

	ret = ds_init_bufs(qp);
	if (ret)
		goto err;

	ret = rs_create_cq(rs, qp->cm_id);
	if (ret)
		goto err;

	memset(&qp_attr, 0, sizeof qp_attr);
	qp_attr.qp_context = qp;
	qp_attr.send_cq = qp->cm_id->send_cq;
	qp_attr.recv_cq = qp->cm_id->recv_cq;
	qp_attr.qp_type = IBV_QPT_UD;
	qp_attr.sq_sig_all = 1;
	qp_attr.cap.max_send_wr = rs->sq_size;
	qp_attr.cap.max_recv_wr = rs->rq_size;
	qp_attr.cap.max_send_sge = 1;
	qp_attr.cap.max_recv_sge = 2;
	qp_attr.cap.max_inline_data = rs->sq_inline;
	ret = rdma_create_qp(qp->cm_id, NULL, &qp_attr);
	if (ret)
		goto err;

	rs->sq_inline = qp_attr.cap.max_inline_data;
	ret = ds_add_qp_dest(qp, src_addr, addrlen);
	if (ret)
		goto err;

	event.events = EPOLLIN;
	event.data.ptr = qp;
	ret = epoll_ctl(rs->epfd, EPOLL_CTL_ADD,
			qp->cm_id->recv_cq_channel->fd, &event);
	if (ret)
		goto err;

	for (i = 0; i < rs->rq_size; i++) {
		ret = ds_post_recv(rs, qp, i * RS_SNDLOWAT);
		if (ret)
			goto err;
	}

	ds_insert_qp(rs, qp);
	*new_qp = qp;
	return 0;
err:
	ds_free_qp(qp);
	return ret;
}

static int ds_get_qp(struct rsocket *rs, union socket_addr *src_addr,
		     socklen_t addrlen, struct ds_qp **qp)
{
	if (rs->qp_list) {
		*qp = rs->qp_list;
		do {
			if (!ds_compare_addr(rdma_get_local_addr((*qp)->cm_id),
					     src_addr))
				return 0;

			*qp = ds_next_qp(*qp);
		} while (*qp != rs->qp_list);
	}

	return ds_create_qp(rs, src_addr, addrlen, qp);
}

static int ds_get_dest(struct rsocket *rs, const struct sockaddr *addr,
		       socklen_t addrlen, struct ds_dest **dest)
{
	union socket_addr src_addr;
	socklen_t src_len;
	struct ds_qp *qp;
	struct ds_dest **tdest, *new_dest;
	int ret = 0;

	fastlock_acquire(&rs->map_lock);
	tdest = tfind(addr, &rs->dest_map, ds_compare_addr);
	if (tdest)
		goto found;

	ret = ds_get_src_addr(rs, addr, addrlen, &src_addr, &src_len);
	if (ret)
		goto out;

	ret = ds_get_qp(rs, &src_addr, src_len, &qp);
	if (ret)
		goto out;

	tdest = tfind(addr, &rs->dest_map, ds_compare_addr);
	if (!tdest) {
		new_dest = calloc(1, sizeof(*new_dest));
		if (!new_dest) {
			ret = ERR(ENOMEM);
			goto out;
		}

		memcpy(&new_dest->addr, addr, addrlen);
		new_dest->qp = qp;
		tdest = tsearch(&new_dest->addr, &rs->dest_map, ds_compare_addr);
	}

found:
	*dest = *tdest;
out:
	fastlock_release(&rs->map_lock);
	return ret;
}

int rconnect(int socket, const struct sockaddr *addr, socklen_t addrlen)
{
	struct rsocket *rs;
	int ret, save_errno;

	rs = idm_lookup(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	if (rs->type == SOCK_STREAM) {
		memcpy(&rs->cm_id->route.addr.dst_addr, addr, addrlen);
		ret = rs_do_connect(rs);
		if (ret == -1 && errno == EINPROGRESS) {
			save_errno = errno;
			/* The app can still drive the CM state on failure */
			rs_notify_svc(&connect_svc, rs, RS_SVC_ADD_CM);
			errno = save_errno;
		}
	} else {
		if (rs->state == rs_init) {
			ret = ds_init_ep(rs);
			if (ret)
				return ret;
		}

		fastlock_acquire(&rs->slock);
		ret = connect(rs->udp_sock, addr, addrlen);
		if (!ret)
			ret = ds_get_dest(rs, addr, addrlen, &rs->conn_dest);
		fastlock_release(&rs->slock);
	}
	return ret;
}
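/*
 * Illustrative sketch (not part of the library build): rconnect() above
 * supports the usual nonblocking handshake.  With O_NONBLOCK set it fails
 * with EINPROGRESS, rpoll() reports POLLOUT once the connection resolves,
 * and SO_ERROR distinguishes success from failure.  Names are
 * placeholders; error handling is trimmed.
 */
#if 0
static int example_nonblocking_connect(int fd, const struct sockaddr *addr,
				       socklen_t addrlen)
{
	struct pollfd pfd;
	socklen_t len = sizeof(int);
	int err = 0;

	rfcntl(fd, F_SETFL, O_NONBLOCK);
	if (rconnect(fd, addr, addrlen) && errno != EINPROGRESS)
		return -1;

	pfd.fd = fd;
	pfd.events = POLLOUT;
	if (rpoll(&pfd, 1, -1) <= 0)
		return -1;

	rgetsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len);
	return err ? -1 : 0;
}
#endif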
static void *rs_get_ctrl_buf(struct rsocket *rs)
{
	return rs->sbuf + rs->sbuf_size +
		RS_MAX_CTRL_MSG * (rs->ctrl_seqno & (RS_QP_CTRL_SIZE - 1));
}

static int rs_post_msg(struct rsocket *rs, uint32_t msg)
{
	struct ibv_send_wr wr, *bad;
	struct ibv_sge sge;

	wr.wr_id = rs_send_wr_id(msg);
	wr.next = NULL;
	if (!(rs->opts & RS_OPT_MSG_SEND)) {
		wr.sg_list = NULL;
		wr.num_sge = 0;
		wr.opcode = IBV_WR_RDMA_WRITE_WITH_IMM;
		wr.send_flags = 0;
		wr.imm_data = htobe32(msg);
	} else {
		sge.addr = (uintptr_t) &msg;
		sge.lkey = 0;
		sge.length = sizeof msg;
		wr.sg_list = &sge;
		wr.num_sge = 1;
		wr.opcode = IBV_WR_SEND;
		wr.send_flags = IBV_SEND_INLINE;
	}

	return rdma_seterrno(ibv_post_send(rs->cm_id->qp, &wr, &bad));
}

static int rs_post_write(struct rsocket *rs,
			 struct ibv_sge *sgl, int nsge,
			 uint32_t wr_data, int flags,
			 uint64_t addr, uint32_t rkey)
{
	struct ibv_send_wr wr, *bad;

	wr.wr_id = rs_send_wr_id(wr_data);
	wr.next = NULL;
	wr.sg_list = sgl;
	wr.num_sge = nsge;
	wr.opcode = IBV_WR_RDMA_WRITE;
	wr.send_flags = flags;
	wr.wr.rdma.remote_addr = addr;
	wr.wr.rdma.rkey = rkey;

	return rdma_seterrno(ibv_post_send(rs->cm_id->qp, &wr, &bad));
}

static int rs_post_write_msg(struct rsocket *rs,
			     struct ibv_sge *sgl, int nsge,
			     uint32_t msg, int flags,
			     uint64_t addr, uint32_t rkey)
{
	struct ibv_send_wr wr, *bad;
	struct ibv_sge sge;
	int ret;

	wr.next = NULL;
	if (!(rs->opts & RS_OPT_MSG_SEND)) {
		wr.wr_id = rs_send_wr_id(msg);
		wr.sg_list = sgl;
		wr.num_sge = nsge;
		wr.opcode = IBV_WR_RDMA_WRITE_WITH_IMM;
		wr.send_flags = flags;
		wr.imm_data = htobe32(msg);
		wr.wr.rdma.remote_addr = addr;
		wr.wr.rdma.rkey = rkey;

		return rdma_seterrno(ibv_post_send(rs->cm_id->qp, &wr, &bad));
	} else {
		ret = rs_post_write(rs, sgl, nsge, msg, flags, addr, rkey);
		if (!ret) {
			wr.wr_id = rs_send_wr_id(rs_msg_set(rs_msg_op(msg), 0)) |
				   RS_WR_ID_FLAG_MSG_SEND;
			sge.addr = (uintptr_t) &msg;
			sge.lkey = 0;
			sge.length = sizeof msg;
			wr.sg_list = &sge;
			wr.num_sge = 1;
			wr.opcode = IBV_WR_SEND;
			wr.send_flags = IBV_SEND_INLINE;

			ret = rdma_seterrno(ibv_post_send(rs->cm_id->qp, &wr, &bad));
		}
		return ret;
	}
}

static int ds_post_send(struct rsocket *rs, struct ibv_sge *sge,
			uint32_t wr_data)
{
	struct ibv_send_wr wr, *bad;

	wr.wr_id = rs_send_wr_id(wr_data);
	wr.next = NULL;
	wr.sg_list = sge;
	wr.num_sge = 1;
	wr.opcode = IBV_WR_SEND;
	wr.send_flags = (sge->length <= rs->sq_inline) ? IBV_SEND_INLINE : 0;
	wr.wr.ud.ah = rs->conn_dest->ah;
	wr.wr.ud.remote_qpn = rs->conn_dest->qpn;
	wr.wr.ud.remote_qkey = RDMA_UDP_QKEY;

	return rdma_seterrno(ibv_post_send(rs->conn_dest->qp->cm_id->qp, &wr, &bad));
}
/*
 * Update target SGE before sending data.  Otherwise the remote side may
 * update the entry before we do.
 */
static int rs_write_data(struct rsocket *rs,
			 struct ibv_sge *sgl, int nsge,
			 uint32_t length, int flags)
{
	uint64_t addr;
	uint32_t rkey;

	rs->sseq_no++;
	rs->sqe_avail--;
	if (rs->opts & RS_OPT_MSG_SEND)
		rs->sqe_avail--;
	rs->sbuf_bytes_avail -= length;

	addr = rs->target_sgl[rs->target_sge].addr;
	rkey = rs->target_sgl[rs->target_sge].key;

	rs->target_sgl[rs->target_sge].addr += length;
	rs->target_sgl[rs->target_sge].length -= length;

	if (!rs->target_sgl[rs->target_sge].length) {
		if (++rs->target_sge == RS_SGL_SIZE)
			rs->target_sge = 0;
	}

	return rs_post_write_msg(rs, sgl, nsge, rs_msg_set(RS_OP_DATA, length),
				 flags, addr, rkey);
}

static int rs_write_direct(struct rsocket *rs, struct rs_iomap *iom,
			   uint64_t offset, struct ibv_sge *sgl, int nsge,
			   uint32_t length, int flags)
{
	uint64_t addr;

	rs->sqe_avail--;
	rs->sbuf_bytes_avail -= length;

	addr = iom->sge.addr + offset - iom->offset;
	return rs_post_write(rs, sgl, nsge, rs_msg_set(RS_OP_WRITE, length),
			     flags, addr, iom->sge.key);
}

static int rs_write_iomap(struct rsocket *rs, struct rs_iomap_mr *iomr,
			  struct ibv_sge *sgl, int nsge, int flags)
{
	uint64_t addr;

	rs->sseq_no++;
	rs->sqe_avail--;
	if (rs->opts & RS_OPT_MSG_SEND)
		rs->sqe_avail--;
	rs->sbuf_bytes_avail -= sizeof(struct rs_iomap);

	addr = rs->remote_iomap.addr + iomr->index * sizeof(struct rs_iomap);
	return rs_post_write_msg(rs, sgl, nsge,
				 rs_msg_set(RS_OP_IOMAP_SGL, iomr->index),
				 flags, addr, rs->remote_iomap.key);
}

static uint32_t rs_sbuf_left(struct rsocket *rs)
{
	return (uint32_t) (((uint64_t) (uintptr_t) &rs->sbuf[rs->sbuf_size]) -
			   rs->ssgl[0].addr);
}

static void rs_send_credits(struct rsocket *rs)
{
	struct ibv_sge ibsge;
	struct rs_sge sge, *sge_buf;
	int flags;

	rs->ctrl_seqno++;
	rs->rseq_comp = rs->rseq_no + (rs->rq_size >> 1);
	if (rs->rbuf_bytes_avail >= (rs->rbuf_size >> 1)) {
		if (rs->opts & RS_OPT_MSG_SEND)
			rs->ctrl_seqno++;

		if (!(rs->opts & RS_OPT_SWAP_SGL)) {
			sge.addr = (uintptr_t) &rs->rbuf[rs->rbuf_free_offset];
			sge.key = rs->rmr->rkey;
			sge.length = rs->rbuf_size >> 1;
		} else {
			sge.addr = bswap_64((uintptr_t) &rs->rbuf[rs->rbuf_free_offset]);
			sge.key = bswap_32(rs->rmr->rkey);
			sge.length = bswap_32(rs->rbuf_size >> 1);
		}

		if (rs->sq_inline < sizeof sge) {
			sge_buf = rs_get_ctrl_buf(rs);
			memcpy(sge_buf, &sge, sizeof sge);
			ibsge.addr = (uintptr_t) sge_buf;
			ibsge.lkey = rs->smr->lkey;
			flags = 0;
		} else {
			ibsge.addr = (uintptr_t) &sge;
			ibsge.lkey = 0;
			flags = IBV_SEND_INLINE;
		}
		ibsge.length = sizeof(sge);

		rs_post_write_msg(rs, &ibsge, 1,
				  rs_msg_set(RS_OP_SGL, rs->rseq_no + rs->rq_size),
				  flags, rs->remote_sgl.addr +
				  rs->remote_sge * sizeof(struct rs_sge),
				  rs->remote_sgl.key);

		rs->rbuf_bytes_avail -= rs->rbuf_size >> 1;
		rs->rbuf_free_offset += rs->rbuf_size >> 1;
		if (rs->rbuf_free_offset >= rs->rbuf_size)
			rs->rbuf_free_offset = 0;
		if (++rs->remote_sge == rs->remote_sgl.length)
			rs->remote_sge = 0;
	} else {
		rs_post_msg(rs, rs_msg_set(RS_OP_SGL, rs->rseq_no + rs->rq_size));
	}
}

static inline int rs_ctrl_avail(struct rsocket *rs)
{
	return rs->ctrl_seqno != rs->ctrl_max_seqno;
}

/* Protocols that do not support RDMA write with immediate may require 2 msgs */
static inline int rs_2ctrl_avail(struct rsocket *rs)
{
	return (int)((rs->ctrl_seqno + 1) - rs->ctrl_max_seqno) < 0;
}
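/*
 * Illustrative sketch (not part of the library build): rs_2ctrl_avail()
 * above relies on serial-number arithmetic, so the comparison stays
 * correct even when the unsigned sequence counters wrap, as long as the
 * two counters never drift apart by more than INT_MAX.  A minimal check
 * (assumes <assert.h>):
 */
#if 0
static void example_seqno_wraparound(void)
{
	uint32_t seqno = 0xFFFFFFFF;	/* about to wrap */
	uint32_t max_seqno = seqno + 4;	/* wraps to 3 */

	/* Four slots free: (seqno + 1) - max_seqno == -3 < 0 */
	assert((int)((seqno + 1) - max_seqno) < 0);

	seqno += 3;	/* one slot left: a 2-message update no longer fits */
	assert((int)((seqno + 1) - max_seqno) == 0);
}
#endif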
static int rs_give_credits(struct rsocket *rs)
{
	if (!(rs->opts & RS_OPT_MSG_SEND)) {
		return ((rs->rbuf_bytes_avail >= (rs->rbuf_size >> 1)) ||
			((short) ((short) rs->rseq_no - (short) rs->rseq_comp) >= 0)) &&
		       rs_ctrl_avail(rs) && (rs->state & rs_connected);
	} else {
		return ((rs->rbuf_bytes_avail >= (rs->rbuf_size >> 1)) ||
			((short) ((short) rs->rseq_no - (short) rs->rseq_comp) >= 0)) &&
		       rs_2ctrl_avail(rs) && (rs->state & rs_connected);
	}
}

static void rs_update_credits(struct rsocket *rs)
{
	if (rs_give_credits(rs))
		rs_send_credits(rs);
}

static int rs_poll_cq(struct rsocket *rs)
{
	struct ibv_wc wc;
	uint32_t msg;
	int ret, rcnt = 0;

	while ((ret = ibv_poll_cq(rs->cm_id->recv_cq, 1, &wc)) > 0) {
		if (rs_wr_is_recv(wc.wr_id)) {
			if (wc.status != IBV_WC_SUCCESS)
				continue;
			rcnt++;

			if (wc.wc_flags & IBV_WC_WITH_IMM) {
				msg = be32toh(wc.imm_data);
			} else {
				msg = ((uint32_t *) (rs->rbuf + rs->rbuf_size))
					[rs_wr_data(wc.wr_id)];
			}
			switch (rs_msg_op(msg)) {
			case RS_OP_SGL:
				rs->sseq_comp = (uint16_t) rs_msg_data(msg);
				break;
			case RS_OP_IOMAP_SGL:
				/* The iomap was updated, that's nice to know. */
				break;
			case RS_OP_CTRL:
				if (rs_msg_data(msg) == RS_CTRL_DISCONNECT) {
					rs->state = rs_disconnected;
					return 0;
				} else if (rs_msg_data(msg) == RS_CTRL_SHUTDOWN) {
					if (rs->state & rs_writable) {
						rs->state &= ~rs_readable;
					} else {
						rs->state = rs_disconnected;
						return 0;
					}
				}
				break;
			case RS_OP_WRITE:
				/* We really shouldn't be here. */
				break;
			default:
				rs->rmsg[rs->rmsg_tail].op = rs_msg_op(msg);
				rs->rmsg[rs->rmsg_tail].data = rs_msg_data(msg);
				if (++rs->rmsg_tail == rs->rq_size + 1)
					rs->rmsg_tail = 0;
				break;
			}
		} else {
			switch (rs_msg_op(rs_wr_data(wc.wr_id))) {
			case RS_OP_SGL:
				rs->ctrl_max_seqno++;
				break;
			case RS_OP_CTRL:
				rs->ctrl_max_seqno++;
				if (rs_msg_data(rs_wr_data(wc.wr_id)) == RS_CTRL_DISCONNECT)
					rs->state = rs_disconnected;
				break;
			case RS_OP_IOMAP_SGL:
				rs->sqe_avail++;
				if (!rs_wr_is_msg_send(wc.wr_id))
					rs->sbuf_bytes_avail += sizeof(struct rs_iomap);
				break;
			default:
				rs->sqe_avail++;
				rs->sbuf_bytes_avail += rs_msg_data(rs_wr_data(wc.wr_id));
				break;
			}
			if (wc.status != IBV_WC_SUCCESS && (rs->state & rs_connected)) {
				rs->state = rs_error;
				rs->err = EIO;
			}
		}
	}

	if (rs->state & rs_connected) {
		while (!ret && rcnt--)
			ret = rs_post_recv(rs);

		if (ret) {
			rs->state = rs_error;
			rs->err = errno;
		}
	}
	return ret;
}

static int rs_get_cq_event(struct rsocket *rs)
{
	struct ibv_cq *cq;
	void *context;
	int ret;

	if (!rs->cq_armed)
		return 0;

	ret = ibv_get_cq_event(rs->cm_id->recv_cq_channel, &cq, &context);
	if (!ret) {
		if (++rs->unack_cqe >= rs->sq_size + rs->rq_size) {
			ibv_ack_cq_events(rs->cm_id->recv_cq, rs->unack_cqe);
			rs->unack_cqe = 0;
		}
		rs->cq_armed = 0;
	} else if (!(errno == EAGAIN || errno == EINTR)) {
		rs->state = rs_error;
	}

	return ret;
}

/*
 * Although we serialize rsend and rrecv calls with respect to themselves,
 * both calls may run simultaneously and need to poll the CQ for completions.
 * We need to serialize access to the CQ, but rsend and rrecv need to
 * allow each other to make forward progress.
 *
 * For example, rsend may need to wait for credits from the remote side,
 * which could be stalled until the remote process calls rrecv.  This should
 * not block rrecv from receiving data from the remote side however.
 *
 * We handle this by using two locks.  The cq_lock protects against polling
 * the CQ and processing completions.  The cq_wait_lock serializes access to
 * waiting on the CQ.
 */
static int rs_process_cq(struct rsocket *rs, int nonblock,
			 int (*test)(struct rsocket *rs))
{
	int ret;

	fastlock_acquire(&rs->cq_lock);
	do {
		rs_update_credits(rs);
		ret = rs_poll_cq(rs);
		if (test(rs)) {
			ret = 0;
			break;
		} else if (ret) {
			break;
		} else if (nonblock) {
			ret = ERR(EWOULDBLOCK);
		} else if (!rs->cq_armed) {
			ibv_req_notify_cq(rs->cm_id->recv_cq, 0);
			rs->cq_armed = 1;
		} else {
			rs_update_credits(rs);
			fastlock_acquire(&rs->cq_wait_lock);
			fastlock_release(&rs->cq_lock);

			ret = rs_get_cq_event(rs);
			fastlock_release(&rs->cq_wait_lock);
			fastlock_acquire(&rs->cq_lock);
		}
	} while (!ret);

	rs_update_credits(rs);
	fastlock_release(&rs->cq_lock);
	return ret;
}
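/*
 * Illustrative sketch (not part of the library build): the lock hand-off
 * used by rs_process_cq() above, reduced to plain pthread mutexes.  The
 * processing lock is dropped while a thread sleeps on the event channel,
 * so a concurrent rsend/rrecv can keep polling; the wait lock ensures
 * only one thread blocks on the channel at a time.  Names and the
 * example_wait_for_event() stub are placeholders.
 */
#if 0
static pthread_mutex_t process_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t wait_lock = PTHREAD_MUTEX_INITIALIZER;

static void example_wait_for_event(void)
{
	/* stands in for rs_get_cq_event(): block for a CQ event */
}

static void example_lock_handoff(void)
{
	pthread_mutex_lock(&process_lock);
	/* ... poll for completions while holding process_lock ... */

	pthread_mutex_lock(&wait_lock);		/* claim the right to sleep */
	pthread_mutex_unlock(&process_lock);	/* let other threads poll */

	example_wait_for_event();

	pthread_mutex_unlock(&wait_lock);
	pthread_mutex_lock(&process_lock);
	/* ... re-check state, continue processing ... */
	pthread_mutex_unlock(&process_lock);
}
#endif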
static int rs_get_comp(struct rsocket *rs, int nonblock,
		       int (*test)(struct rsocket *rs))
{
	uint64_t start_time = 0;
	uint32_t poll_time;
	int ret;

	do {
		ret = rs_process_cq(rs, 1, test);
		if (!ret || nonblock || errno != EWOULDBLOCK)
			return ret;

		if (!start_time)
			start_time = rs_time_us();

		poll_time = (uint32_t) (rs_time_us() - start_time);
	} while (poll_time <= polling_time);

	ret = rs_process_cq(rs, 0, test);
	return ret;
}

static int ds_valid_recv(struct ds_qp *qp, struct ibv_wc *wc)
{
	struct ds_header *hdr;

	hdr = (struct ds_header *) (qp->rbuf + rs_wr_data(wc->wr_id));
	return ((wc->byte_len >= sizeof(struct ibv_grh) + DS_IPV4_HDR_LEN) &&
		((hdr->version == 4 && hdr->length == DS_IPV4_HDR_LEN) ||
		 (hdr->version == 6 && hdr->length == DS_IPV6_HDR_LEN)));
}

/*
 * Poll all CQs associated with a datagram rsocket.  We need to drop any
 * received messages that we do not have room to store.  To limit drops,
 * we only poll if we have room to store the receive or we need a send
 * buffer.  To ensure fairness, we poll the CQs round robin, remembering
 * where we left off.
 */
static void ds_poll_cqs(struct rsocket *rs)
{
	struct ds_qp *qp;
	struct ds_smsg *smsg;
	struct ds_rmsg *rmsg;
	struct ibv_wc wc;
	int ret, cnt;

	if (!(qp = rs->qp_list))
		return;

	do {
		cnt = 0;
		do {
			ret = ibv_poll_cq(qp->cm_id->recv_cq, 1, &wc);
			if (ret <= 0) {
				qp = ds_next_qp(qp);
				continue;
			}

			if (rs_wr_is_recv(wc.wr_id)) {
				if (rs->rqe_avail && wc.status == IBV_WC_SUCCESS &&
				    ds_valid_recv(qp, &wc)) {
					rs->rqe_avail--;
					rmsg = &rs->dmsg[rs->rmsg_tail];
					rmsg->qp = qp;
					rmsg->offset = rs_wr_data(wc.wr_id);
					rmsg->length = wc.byte_len - sizeof(struct ibv_grh);
					if (++rs->rmsg_tail == rs->rq_size + 1)
						rs->rmsg_tail = 0;
				} else {
					ds_post_recv(rs, qp, rs_wr_data(wc.wr_id));
				}
			} else {
				smsg = (struct ds_smsg *) (rs->sbuf + rs_wr_data(wc.wr_id));
				smsg->next = rs->smsg_free;
				rs->smsg_free = smsg;
				rs->sqe_avail++;
			}

			qp = ds_next_qp(qp);
			if (!rs->rqe_avail && rs->sqe_avail) {
				rs->qp_list = qp;
				return;
			}
			cnt++;
		} while (qp != rs->qp_list);
	} while (cnt);
}

static void ds_req_notify_cqs(struct rsocket *rs)
{
	struct ds_qp *qp;

	if (!(qp = rs->qp_list))
		return;

	do {
		if (!qp->cq_armed) {
			ibv_req_notify_cq(qp->cm_id->recv_cq, 0);
			qp->cq_armed = 1;
		}
		qp = ds_next_qp(qp);
	} while (qp != rs->qp_list);
}

static int ds_get_cq_event(struct rsocket *rs)
{
	struct epoll_event event;
	struct ds_qp *qp;
	struct ibv_cq *cq;
	void *context;
	int ret;

	if (!rs->cq_armed)
		return 0;

	ret = epoll_wait(rs->epfd, &event, 1, -1);
	if (ret <= 0)
		return ret;

	qp = event.data.ptr;
	ret = ibv_get_cq_event(qp->cm_id->recv_cq_channel, &cq, &context);
	if (!ret) {
		ibv_ack_cq_events(qp->cm_id->recv_cq, 1);
		qp->cq_armed = 0;
		rs->cq_armed = 0;
	}

	return ret;
}
static int ds_process_cqs(struct rsocket *rs, int nonblock,
			  int (*test)(struct rsocket *rs))
{
	int ret = 0;

	fastlock_acquire(&rs->cq_lock);
	do {
		ds_poll_cqs(rs);
		if (test(rs)) {
			ret = 0;
			break;
		} else if (nonblock) {
			ret = ERR(EWOULDBLOCK);
		} else if (!rs->cq_armed) {
			ds_req_notify_cqs(rs);
			rs->cq_armed = 1;
		} else {
			fastlock_acquire(&rs->cq_wait_lock);
			fastlock_release(&rs->cq_lock);

			ret = ds_get_cq_event(rs);
			fastlock_release(&rs->cq_wait_lock);
			fastlock_acquire(&rs->cq_lock);
		}
	} while (!ret);

	fastlock_release(&rs->cq_lock);
	return ret;
}

static int ds_get_comp(struct rsocket *rs, int nonblock,
		       int (*test)(struct rsocket *rs))
{
	uint64_t start_time = 0;
	uint32_t poll_time;
	int ret;

	do {
		ret = ds_process_cqs(rs, 1, test);
		if (!ret || nonblock || errno != EWOULDBLOCK)
			return ret;

		if (!start_time)
			start_time = rs_time_us();

		poll_time = (uint32_t) (rs_time_us() - start_time);
	} while (poll_time <= polling_time);

	ret = ds_process_cqs(rs, 0, test);
	return ret;
}

static int rs_nonblocking(struct rsocket *rs, int flags)
{
	return (rs->fd_flags & O_NONBLOCK) || (flags & MSG_DONTWAIT);
}

static int rs_is_cq_armed(struct rsocket *rs)
{
	return rs->cq_armed;
}

static int rs_poll_all(struct rsocket *rs)
{
	return 1;
}

/*
 * We use hardware flow control to prevent over running the remote
 * receive queue.  However, data transfers still require space in
 * the remote rmsg queue, or we risk losing notification that data
 * has been transferred.
 *
 * Be careful with race conditions in the check below.  The target SGL
 * may be updated by a remote RDMA write.
 */
static int rs_can_send(struct rsocket *rs)
{
	if (!(rs->opts & RS_OPT_MSG_SEND)) {
		return rs->sqe_avail && (rs->sbuf_bytes_avail >= RS_SNDLOWAT) &&
		       (rs->sseq_no != rs->sseq_comp) &&
		       (rs->target_sgl[rs->target_sge].length != 0);
	} else {
		return (rs->sqe_avail >= 2) && (rs->sbuf_bytes_avail >= RS_SNDLOWAT) &&
		       (rs->sseq_no != rs->sseq_comp) &&
		       (rs->target_sgl[rs->target_sge].length != 0);
	}
}

static int ds_can_send(struct rsocket *rs)
{
	return rs->sqe_avail;
}

static int ds_all_sends_done(struct rsocket *rs)
{
	return rs->sqe_avail == rs->sq_size;
}

static int rs_conn_can_send(struct rsocket *rs)
{
	return rs_can_send(rs) || !(rs->state & rs_writable);
}

static int rs_conn_can_send_ctrl(struct rsocket *rs)
{
	return rs_ctrl_avail(rs) || !(rs->state & rs_connected);
}

static int rs_have_rdata(struct rsocket *rs)
{
	return (rs->rmsg_head != rs->rmsg_tail);
}

static int rs_conn_have_rdata(struct rsocket *rs)
{
	return rs_have_rdata(rs) || !(rs->state & rs_readable);
}

static int rs_conn_all_sends_done(struct rsocket *rs)
{
	return ((((int) rs->ctrl_max_seqno) - ((int) rs->ctrl_seqno)) +
		rs->sqe_avail == rs->sq_size) || !(rs->state & rs_connected);
}
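/*
 * Illustrative sketch (not part of the library build): the readiness
 * tests above are what make POLLIN/POLLOUT from rpoll() meaningful.  A
 * nonblocking sender can therefore loop on rpoll() until rs_can_send()
 * would succeed.  Placeholder names; error handling trimmed.
 */
#if 0
static ssize_t example_send_all(int fd, const void *buf, size_t len)
{
	struct pollfd pfd = { .fd = fd, .events = POLLOUT };
	size_t off = 0;
	ssize_t n;

	while (off < len) {
		n = rsend(fd, (const char *) buf + off, len - off,
			  MSG_DONTWAIT);
		if (n > 0) {
			off += n;
		} else if (errno == EWOULDBLOCK || errno == EAGAIN) {
			if (rpoll(&pfd, 1, -1) < 0)	/* wait for credits */
				return -1;
		} else {
			return -1;
		}
	}
	return off;
}
#endif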
static void ds_set_src(struct sockaddr *addr, socklen_t *addrlen,
		       struct ds_header *hdr)
{
	union socket_addr sa;

	memset(&sa, 0, sizeof sa);
	if (hdr->version == 4) {
		if (*addrlen > sizeof(sa.sin))
			*addrlen = sizeof(sa.sin);

		sa.sin.sin_family = AF_INET;
		sa.sin.sin_port = hdr->port;
		sa.sin.sin_addr.s_addr = hdr->addr.ipv4;
	} else {
		if (*addrlen > sizeof(sa.sin6))
			*addrlen = sizeof(sa.sin6);

		sa.sin6.sin6_family = AF_INET6;
		sa.sin6.sin6_port = hdr->port;
		sa.sin6.sin6_flowinfo = hdr->addr.ipv6.flowinfo;
		memcpy(&sa.sin6.sin6_addr, &hdr->addr.ipv6.addr, 16);
	}
	memcpy(addr, &sa, *addrlen);
}

static ssize_t ds_recvfrom(struct rsocket *rs, void *buf, size_t len, int flags,
			   struct sockaddr *src_addr, socklen_t *addrlen)
{
	struct ds_rmsg *rmsg;
	struct ds_header *hdr;
	int ret;

	if (!(rs->state & rs_readable))
		return ERR(EINVAL);

	if (!rs_have_rdata(rs)) {
		ret = ds_get_comp(rs, rs_nonblocking(rs, flags), rs_have_rdata);
		if (ret)
			return ret;
	}

	rmsg = &rs->dmsg[rs->rmsg_head];
	hdr = (struct ds_header *) (rmsg->qp->rbuf + rmsg->offset);
	if (len > rmsg->length - hdr->length)
		len = rmsg->length - hdr->length;

	memcpy(buf, (void *) hdr + hdr->length, len);
	if (addrlen)
		ds_set_src(src_addr, addrlen, hdr);

	if (!(flags & MSG_PEEK)) {
		ds_post_recv(rs, rmsg->qp, rmsg->offset);
		if (++rs->rmsg_head == rs->rq_size + 1)
			rs->rmsg_head = 0;
		rs->rqe_avail++;
	}

	return len;
}

static ssize_t rs_peek(struct rsocket *rs, void *buf, size_t len)
{
	size_t left = len;
	uint32_t end_size, rsize;
	int rmsg_head, rbuf_offset;

	rmsg_head = rs->rmsg_head;
	rbuf_offset = rs->rbuf_offset;

	for (; left && (rmsg_head != rs->rmsg_tail); left -= rsize) {
		if (left < rs->rmsg[rmsg_head].data) {
			rsize = left;
		} else {
			rsize = rs->rmsg[rmsg_head].data;
			if (++rmsg_head == rs->rq_size + 1)
				rmsg_head = 0;
		}

		end_size = rs->rbuf_size - rbuf_offset;
		if (rsize > end_size) {
			memcpy(buf, &rs->rbuf[rbuf_offset], end_size);
			rbuf_offset = 0;
			buf += end_size;
			rsize -= end_size;
			left -= end_size;
		}
		memcpy(buf, &rs->rbuf[rbuf_offset], rsize);
		rbuf_offset += rsize;
		buf += rsize;
	}

	return len - left;
}

/*
 * Continue to receive any queued data even if the remote side has
 * disconnected.
 */
ssize_t rrecv(int socket, void *buf, size_t len, int flags)
{
	struct rsocket *rs;
	size_t left = len;
	uint32_t end_size, rsize;
	int ret = 0;

	rs = idm_at(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	if (rs->type == SOCK_DGRAM) {
		fastlock_acquire(&rs->rlock);
		ret = ds_recvfrom(rs, buf, len, flags, NULL, NULL);
		fastlock_release(&rs->rlock);
		return ret;
	}

	if (rs->state & rs_opening) {
		ret = rs_do_connect(rs);
		if (ret) {
			if (errno == EINPROGRESS)
				errno = EAGAIN;
			return ret;
		}
	}
	fastlock_acquire(&rs->rlock);
	do {
		if (!rs_have_rdata(rs)) {
			ret = rs_get_comp(rs, rs_nonblocking(rs, flags),
					  rs_conn_have_rdata);
			if (ret)
				break;
		}

		if (flags & MSG_PEEK) {
			left = len - rs_peek(rs, buf, left);
			break;
		}

		for (; left && rs_have_rdata(rs); left -= rsize) {
			if (left < rs->rmsg[rs->rmsg_head].data) {
				rsize = left;
				rs->rmsg[rs->rmsg_head].data -= left;
			} else {
				rs->rseq_no++;
				rsize = rs->rmsg[rs->rmsg_head].data;
				if (++rs->rmsg_head == rs->rq_size + 1)
					rs->rmsg_head = 0;
			}

			end_size = rs->rbuf_size - rs->rbuf_offset;
			if (rsize > end_size) {
				memcpy(buf, &rs->rbuf[rs->rbuf_offset], end_size);
				rs->rbuf_offset = 0;
				buf += end_size;
				rsize -= end_size;
				left -= end_size;
				rs->rbuf_bytes_avail += end_size;
			}
			memcpy(buf, &rs->rbuf[rs->rbuf_offset], rsize);
			rs->rbuf_offset += rsize;
			buf += rsize;
			rs->rbuf_bytes_avail += rsize;
		}

	} while (left && (flags & MSG_WAITALL) && (rs->state & rs_readable));

	fastlock_release(&rs->rlock);
	return (ret && left == len) ? ret : len - left;
}

ssize_t rrecvfrom(int socket, void *buf, size_t len, int flags,
		  struct sockaddr *src_addr, socklen_t *addrlen)
{
	struct rsocket *rs;
	int ret;

	rs = idm_at(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	if (rs->type == SOCK_DGRAM) {
		fastlock_acquire(&rs->rlock);
		ret = ds_recvfrom(rs, buf, len, flags, src_addr, addrlen);
		fastlock_release(&rs->rlock);
		return ret;
	}

	ret = rrecv(socket, buf, len, flags);
	if (ret > 0 && src_addr)
		rgetpeername(socket, src_addr, addrlen);

	return ret;
}
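/*
 * Illustrative sketch (not part of the library build): rrecv() above
 * honors MSG_PEEK and MSG_WAITALL, so a fixed-length record can be read
 * in one call on a blocking rsocket.  Placeholder names; error handling
 * trimmed.
 */
#if 0
static int example_read_record(int fd, void *record, size_t record_len)
{
	/* Returns 0 once the full record has arrived; -1 on error or if
	 * the peer disconnects before the record completes. */
	ssize_t n = rrecv(fd, record, record_len, MSG_WAITALL);

	return (n == (ssize_t) record_len) ? 0 : -1;
}
#endif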
/*
 * Simple, straightforward implementation for now that only tries to fill
 * in the first vector.
 */
static ssize_t rrecvv(int socket, const struct iovec *iov, int iovcnt, int flags)
{
	return rrecv(socket, iov[0].iov_base, iov[0].iov_len, flags);
}

ssize_t rrecvmsg(int socket, struct msghdr *msg, int flags)
{
	if (msg->msg_control && msg->msg_controllen)
		return ERR(ENOTSUP);

	return rrecvv(socket, msg->msg_iov, (int) msg->msg_iovlen, msg->msg_flags);
}

ssize_t rread(int socket, void *buf, size_t count)
{
	return rrecv(socket, buf, count, 0);
}

ssize_t rreadv(int socket, const struct iovec *iov, int iovcnt)
{
	return rrecvv(socket, iov, iovcnt, 0);
}

static int rs_send_iomaps(struct rsocket *rs, int flags)
{
	struct rs_iomap_mr *iomr;
	struct ibv_sge sge;
	struct rs_iomap iom;
	int ret;

	fastlock_acquire(&rs->map_lock);
	while (!dlist_empty(&rs->iomap_queue)) {
		if (!rs_can_send(rs)) {
			ret = rs_get_comp(rs, rs_nonblocking(rs, flags),
					  rs_conn_can_send);
			if (ret)
				break;
			if (!(rs->state & rs_writable)) {
				ret = ERR(ECONNRESET);
				break;
			}
		}

		iomr = container_of(rs->iomap_queue.next, struct rs_iomap_mr, entry);
		if (!(rs->opts & RS_OPT_SWAP_SGL)) {
			iom.offset = iomr->offset;
			iom.sge.addr = (uintptr_t) iomr->mr->addr;
			iom.sge.length = iomr->mr->length;
			iom.sge.key = iomr->mr->rkey;
		} else {
			iom.offset = bswap_64(iomr->offset);
			iom.sge.addr = bswap_64((uintptr_t) iomr->mr->addr);
			iom.sge.length = bswap_32(iomr->mr->length);
			iom.sge.key = bswap_32(iomr->mr->rkey);
		}

		if (rs->sq_inline >= sizeof iom) {
			sge.addr = (uintptr_t) &iom;
			sge.length = sizeof iom;
			sge.lkey = 0;
			ret = rs_write_iomap(rs, iomr, &sge, 1, IBV_SEND_INLINE);
		} else if (rs_sbuf_left(rs) >= sizeof iom) {
			memcpy((void *) (uintptr_t) rs->ssgl[0].addr, &iom, sizeof iom);
			rs->ssgl[0].length = sizeof iom;
			ret = rs_write_iomap(rs, iomr, rs->ssgl, 1, 0);
			if (rs_sbuf_left(rs) > sizeof iom)
				rs->ssgl[0].addr += sizeof iom;
			else
				rs->ssgl[0].addr = (uintptr_t) rs->sbuf;
		} else {
			rs->ssgl[0].length = rs_sbuf_left(rs);
			memcpy((void *) (uintptr_t) rs->ssgl[0].addr, &iom,
			       rs->ssgl[0].length);
			rs->ssgl[1].length = sizeof iom - rs->ssgl[0].length;
			memcpy(rs->sbuf, ((void *) &iom) + rs->ssgl[0].length,
			       rs->ssgl[1].length);
			ret = rs_write_iomap(rs, iomr, rs->ssgl, 2, 0);
			rs->ssgl[0].addr = (uintptr_t) rs->sbuf + rs->ssgl[1].length;
		}
		dlist_remove(&iomr->entry);
		dlist_insert_tail(&iomr->entry, &rs->iomap_list);
		if (ret)
			break;
	}

	rs->iomap_pending = !dlist_empty(&rs->iomap_queue);
	fastlock_release(&rs->map_lock);
	return ret;
}
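/*
 * Illustrative sketch (not part of the library build): the iomap
 * machinery above backs the riomap()/riowrite() zero-copy interface
 * defined later in this file.  The receiver exposes a buffer at a chosen
 * offset; the sender then writes directly into it.  The offset value is
 * an application-level agreement between the two sides; the one below is
 * an arbitrary placeholder, and error handling is trimmed.
 */
#if 0
#define EXAMPLE_IOMAP_OFFSET	0x1000	/* arbitrary agreed-upon offset */

static off_t example_receiver_expose(int fd, void *buf, size_t len)
{
	/* Remote side may now riowrite() into buf at this offset */
	return riomap(fd, buf, len, PROT_WRITE, 0, EXAMPLE_IOMAP_OFFSET);
}

static size_t example_sender_write(int fd, const void *data, size_t len)
{
	return riowrite(fd, data, len, EXAMPLE_IOMAP_OFFSET, 0);
}
#endif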
static ssize_t ds_sendv_udp(struct rsocket *rs, const struct iovec *iov,
			    int iovcnt, int flags, uint8_t op)
{
	struct ds_udp_header hdr;
	struct msghdr msg;
	struct iovec miov[8];
	ssize_t ret;

	if (iovcnt > 8)
		return ERR(ENOTSUP);

	hdr.tag = htobe32(DS_UDP_TAG);
	hdr.version = rs->conn_dest->qp->hdr.version;
	hdr.op = op;
	hdr.reserved = 0;
	hdr.qpn = htobe32(rs->conn_dest->qp->cm_id->qp->qp_num & 0xFFFFFF);
	if (rs->conn_dest->qp->hdr.version == 4) {
		hdr.length = DS_UDP_IPV4_HDR_LEN;
		hdr.addr.ipv4 = rs->conn_dest->qp->hdr.addr.ipv4;
	} else {
		hdr.length = DS_UDP_IPV6_HDR_LEN;
		memcpy(hdr.addr.ipv6, &rs->conn_dest->qp->hdr.addr.ipv6, 16);
	}

	miov[0].iov_base = &hdr;
	miov[0].iov_len = hdr.length;
	if (iov && iovcnt)
		memcpy(&miov[1], iov, sizeof(*iov) * iovcnt);

	memset(&msg, 0, sizeof msg);
	msg.msg_name = &rs->conn_dest->addr;
	msg.msg_namelen = ucma_addrlen(&rs->conn_dest->addr.sa);
	msg.msg_iov = miov;
	msg.msg_iovlen = iovcnt + 1;
	ret = sendmsg(rs->udp_sock, &msg, flags);
	return ret > 0 ? ret - hdr.length : ret;
}

static ssize_t ds_send_udp(struct rsocket *rs, const void *buf, size_t len,
			   int flags, uint8_t op)
{
	struct iovec iov;

	if (buf && len) {
		iov.iov_base = (void *) buf;
		iov.iov_len = len;
		return ds_sendv_udp(rs, &iov, 1, flags, op);
	} else {
		return ds_sendv_udp(rs, NULL, 0, flags, op);
	}
}

static ssize_t dsend(struct rsocket *rs, const void *buf, size_t len, int flags)
{
	struct ds_smsg *msg;
	struct ibv_sge sge;
	uint64_t offset;
	int ret = 0;

	if (!rs->conn_dest->ah)
		return ds_send_udp(rs, buf, len, flags, RS_OP_DATA);

	if (!ds_can_send(rs)) {
		ret = ds_get_comp(rs, rs_nonblocking(rs, flags), ds_can_send);
		if (ret)
			return ret;
	}

	msg = rs->smsg_free;
	rs->smsg_free = msg->next;
	rs->sqe_avail--;

	memcpy((void *) msg, &rs->conn_dest->qp->hdr, rs->conn_dest->qp->hdr.length);
	memcpy((void *) msg + rs->conn_dest->qp->hdr.length, buf, len);
	sge.addr = (uintptr_t) msg;
	sge.length = rs->conn_dest->qp->hdr.length + len;
	sge.lkey = rs->conn_dest->qp->smr->lkey;
	offset = (uint8_t *) msg - rs->sbuf;

	ret = ds_post_send(rs, &sge, offset);
	return ret ? ret : len;
}
/*
 * We overlap sending the data, by posting a small work request immediately,
 * then increasing the size of the send on each iteration.
 */
ssize_t rsend(int socket, const void *buf, size_t len, int flags)
{
	struct rsocket *rs;
	struct ibv_sge sge;
	size_t left = len;
	uint32_t xfer_size, olen = RS_OLAP_START_SIZE;
	int ret = 0;

	rs = idm_at(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	if (rs->type == SOCK_DGRAM) {
		fastlock_acquire(&rs->slock);
		ret = dsend(rs, buf, len, flags);
		fastlock_release(&rs->slock);
		return ret;
	}

	if (rs->state & rs_opening) {
		ret = rs_do_connect(rs);
		if (ret) {
			if (errno == EINPROGRESS)
				errno = EAGAIN;
			return ret;
		}
	}

	fastlock_acquire(&rs->slock);
	if (rs->iomap_pending) {
		ret = rs_send_iomaps(rs, flags);
		if (ret)
			goto out;
	}
	for (; left; left -= xfer_size, buf += xfer_size) {
		if (!rs_can_send(rs)) {
			ret = rs_get_comp(rs, rs_nonblocking(rs, flags),
					  rs_conn_can_send);
			if (ret)
				break;
			if (!(rs->state & rs_writable)) {
				ret = ERR(ECONNRESET);
				break;
			}
		}

		if (olen < left) {
			xfer_size = olen;
			if (olen < RS_MAX_TRANSFER)
				olen <<= 1;
		} else {
			xfer_size = left;
		}

		if (xfer_size > rs->sbuf_bytes_avail)
			xfer_size = rs->sbuf_bytes_avail;
		if (xfer_size > rs->target_sgl[rs->target_sge].length)
			xfer_size = rs->target_sgl[rs->target_sge].length;

		if (xfer_size <= rs->sq_inline) {
			sge.addr = (uintptr_t) buf;
			sge.length = xfer_size;
			sge.lkey = 0;
			ret = rs_write_data(rs, &sge, 1, xfer_size, IBV_SEND_INLINE);
		} else if (xfer_size <= rs_sbuf_left(rs)) {
			memcpy((void *) (uintptr_t) rs->ssgl[0].addr, buf, xfer_size);
			rs->ssgl[0].length = xfer_size;
			ret = rs_write_data(rs, rs->ssgl, 1, xfer_size, 0);
			if (xfer_size < rs_sbuf_left(rs))
				rs->ssgl[0].addr += xfer_size;
			else
				rs->ssgl[0].addr = (uintptr_t) rs->sbuf;
		} else {
			rs->ssgl[0].length = rs_sbuf_left(rs);
			memcpy((void *) (uintptr_t) rs->ssgl[0].addr, buf,
			       rs->ssgl[0].length);
			rs->ssgl[1].length = xfer_size - rs->ssgl[0].length;
			memcpy(rs->sbuf, buf + rs->ssgl[0].length, rs->ssgl[1].length);
			ret = rs_write_data(rs, rs->ssgl, 2, xfer_size, 0);
			rs->ssgl[0].addr = (uintptr_t) rs->sbuf + rs->ssgl[1].length;
		}
		if (ret)
			break;
	}
out:
	fastlock_release(&rs->slock);

	return (ret && left == len) ? ret : len - left;
}

ssize_t rsendto(int socket, const void *buf, size_t len, int flags,
		const struct sockaddr *dest_addr, socklen_t addrlen)
{
	struct rsocket *rs;
	int ret;

	rs = idm_at(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	if (rs->type == SOCK_STREAM) {
		if (dest_addr || addrlen)
			return ERR(EISCONN);

		return rsend(socket, buf, len, flags);
	}

	if (rs->state == rs_init) {
		ret = ds_init_ep(rs);
		if (ret)
			return ret;
	}

	fastlock_acquire(&rs->slock);
	if (!rs->conn_dest || ds_compare_addr(dest_addr, &rs->conn_dest->addr)) {
		ret = ds_get_dest(rs, dest_addr, addrlen, &rs->conn_dest);
		if (ret)
			goto out;
	}

	ret = dsend(rs, buf, len, flags);
out:
	fastlock_release(&rs->slock);
	return ret;
}

static void rs_copy_iov(void *dst, const struct iovec **iov, size_t *offset,
			size_t len)
{
	size_t size;

	while (len) {
		size = (*iov)->iov_len - *offset;
		if (size > len) {
			memcpy(dst, (*iov)->iov_base + *offset, len);
			*offset += len;
			break;
		}

		memcpy(dst, (*iov)->iov_base + *offset, size);
		len -= size;
		dst += size;
		(*iov)++;
		*offset = 0;
	}
}

static ssize_t rsendv(int socket, const struct iovec *iov, int iovcnt,
		      int flags)
{
	struct rsocket *rs;
	const struct iovec *cur_iov;
	size_t left, len, offset = 0;
	uint32_t xfer_size, olen = RS_OLAP_START_SIZE;
	int i, ret = 0;

	rs = idm_at(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	if (rs->state & rs_opening) {
		ret = rs_do_connect(rs);
		if (ret) {
			if (errno == EINPROGRESS)
				errno = EAGAIN;
			return ret;
		}
	}

	cur_iov = iov;
	len = iov[0].iov_len;
	for (i = 1; i < iovcnt; i++)
		len += iov[i].iov_len;
	left = len;

	fastlock_acquire(&rs->slock);
	if (rs->iomap_pending) {
		ret = rs_send_iomaps(rs, flags);
		if (ret)
			goto out;
	}
	for (; left; left -= xfer_size) {
		if (!rs_can_send(rs)) {
			ret = rs_get_comp(rs, rs_nonblocking(rs, flags),
					  rs_conn_can_send);
			if (ret)
				break;
			if (!(rs->state & rs_writable)) {
				ret = ERR(ECONNRESET);
				break;
			}
		}

		if (olen < left) {
			xfer_size = olen;
			if (olen < RS_MAX_TRANSFER)
				olen <<= 1;
		} else {
			xfer_size = left;
		}

		if (xfer_size > rs->sbuf_bytes_avail)
			xfer_size = rs->sbuf_bytes_avail;
		if (xfer_size > rs->target_sgl[rs->target_sge].length)
			xfer_size = rs->target_sgl[rs->target_sge].length;

		if (xfer_size <= rs_sbuf_left(rs)) {
			rs_copy_iov((void *) (uintptr_t) rs->ssgl[0].addr,
				    &cur_iov, &offset, xfer_size);
			rs->ssgl[0].length = xfer_size;
			ret = rs_write_data(rs, rs->ssgl, 1, xfer_size,
					    xfer_size <= rs->sq_inline ? IBV_SEND_INLINE : 0);
			if (xfer_size < rs_sbuf_left(rs))
				rs->ssgl[0].addr += xfer_size;
			else
				rs->ssgl[0].addr = (uintptr_t) rs->sbuf;
		} else {
			rs->ssgl[0].length = rs_sbuf_left(rs);
			rs_copy_iov((void *) (uintptr_t) rs->ssgl[0].addr, &cur_iov,
				    &offset, rs->ssgl[0].length);
			rs->ssgl[1].length = xfer_size - rs->ssgl[0].length;
			rs_copy_iov(rs->sbuf, &cur_iov, &offset, rs->ssgl[1].length);
			ret = rs_write_data(rs, rs->ssgl, 2, xfer_size,
					    xfer_size <= rs->sq_inline ? IBV_SEND_INLINE : 0);
			rs->ssgl[0].addr = (uintptr_t) rs->sbuf + rs->ssgl[1].length;
		}
		if (ret)
			break;
	}
out:
	fastlock_release(&rs->slock);

	return (ret && left == len) ? ret : len - left;
}

ssize_t rsendmsg(int socket, const struct msghdr *msg, int flags)
{
	if (msg->msg_control && msg->msg_controllen)
		return ERR(ENOTSUP);

	return rsendv(socket, msg->msg_iov, (int) msg->msg_iovlen, flags);
}

ssize_t rwrite(int socket, const void *buf, size_t count)
{
	return rsend(socket, buf, count, 0);
}

ssize_t rwritev(int socket, const struct iovec *iov, int iovcnt)
{
	return rsendv(socket, iov, iovcnt, 0);
}
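/*
 * Illustrative sketch (not part of the library build): the doubling
 * schedule used by rsend()/rsendv() above.  The start and cap values
 * below are placeholders standing in for RS_OLAP_START_SIZE and
 * RS_MAX_TRANSFER (both defined earlier in this file; the actual values
 * may differ).  Posting the small first chunk immediately lets the RDMA
 * write start while the CPU is still copying the rest into the send
 * buffer.
 */
#if 0
static void example_olap_schedule(size_t total)
{
	size_t olen = 2048;	/* stand-in for RS_OLAP_START_SIZE */
	size_t cap = 65536;	/* stand-in for RS_MAX_TRANSFER */
	size_t xfer;

	/* e.g. 2048, 4096, 8192, ... doubling until the cap, then
	 * fixed-size chunks, with a final short chunk. */
	for (; total; total -= xfer) {
		xfer = (olen < total) ? olen : total;
		if (olen < cap)
			olen <<= 1;
		printf("chunk: %zu\n", xfer);
	}
}
#endif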
/*
 * When mapping rpoll to poll, the events reported on the RDMA
 * fd are independent from the events rpoll may be looking for.
 * To avoid threads hanging in poll, whenever any event occurs,
 * we need to wakeup all threads in poll, so that they can check
 * if there has been a change on the rsockets they are monitoring.
 * To support this, we 'gate' threads entering and leaving rpoll.
 */
static int rs_pollinit(void)
{
	int ret = 0;

	pthread_mutex_lock(&mut);
	if (pollsignal >= 0)
		goto unlock;

	pollsignal = eventfd(0, EFD_NONBLOCK | EFD_SEMAPHORE);
	if (pollsignal < 0)
		ret = -errno;

unlock:
	pthread_mutex_unlock(&mut);
	return ret;
}

/*
 * When an event occurs, we must wait until the state of all rsockets
 * has settled.  Then we need to re-check the rsocket state prior to
 * blocking on poll().
 */
static int rs_poll_enter(void)
{
	pthread_mutex_lock(&mut);
	if (suspendpoll) {
		pthread_mutex_unlock(&mut);
		sched_yield();
		return -EBUSY;
	}

	pollcnt++;
	pthread_mutex_unlock(&mut);
	return 0;
}

static void rs_poll_exit(void)
{
	uint64_t c;
	int save_errno;
	ssize_t ret;

	pthread_mutex_lock(&mut);
	if (!--pollcnt) {
		/* Keep errno value from poll() call.  We try to clear
		 * a single signal.  But there's no guarantee that we'll
		 * find one.  Additional signals indicate that a change
		 * occurred on an rsocket, which requires all threads to
		 * re-check before blocking on poll.
		 */
		save_errno = errno;
		ret = read(pollsignal, &c, sizeof(c));
		if (ret != sizeof(c))
			errno = save_errno;
		suspendpoll = 0;
	}
	pthread_mutex_unlock(&mut);
}

/*
 * When an event occurs, it's possible for a single thread blocked in
 * poll to return from the kernel, read the event, and update the state
 * of an rsocket.  However, that can leave threads blocked in the kernel
 * on poll (trying to read the CQ fd), which have had their rsocket
 * state set.  To avoid those threads remaining blocked in the kernel,
 * we must wake them up and ensure that they all return to user space,
 * in order to re-check the state of their rsockets.
 *
 * Because poll is racy wrt updating the rsocket states, we need to
 * signal state checks whenever a thread updates the state of a
 * monitored rsocket, independent of whether that thread actually
 * reads an event from an fd.  In other words, we must wake up all
 * polling threads whenever poll() indicates that there is a new
 * completion to process, and when rpoll() will return a successful
 * value after having blocked.
 */
static void rs_poll_stop(void)
{
	uint64_t c;
	int save_errno;
	ssize_t ret;

	/* See comment in rs_poll_exit */
	save_errno = errno;

	pthread_mutex_lock(&mut);
	if (!--pollcnt) {
		ret = read(pollsignal, &c, sizeof(c));
		suspendpoll = 0;
	} else if (!suspendpoll) {
		suspendpoll = 1;
		c = 1;
		ret = write(pollsignal, &c, sizeof(c));
	} else {
		ret = sizeof(c);
	}
	pthread_mutex_unlock(&mut);

	if (ret != sizeof(c))
		errno = save_errno;
}

static int rs_poll_signal(void)
{
	uint64_t c;
	ssize_t ret;

	pthread_mutex_lock(&mut);
	if (pollcnt && !suspendpoll) {
		suspendpoll = 1;
		c = 1;
		ret = write(pollsignal, &c, sizeof(c));
		if (ret == sizeof(c))
			ret = 0;
	} else {
		ret = 0;
	}
	pthread_mutex_unlock(&mut);
	return ret;
}
/*
 * We always add the pollsignal read fd to the poll fd set, so
 * that we can signal any blocked threads.
 */
static struct pollfd *rs_fds_alloc(nfds_t nfds)
{
	static __thread struct pollfd *rfds;
	static __thread nfds_t rnfds;

	if (nfds + 1 > rnfds) {
		if (rfds)
			free(rfds);
		else if (rs_pollinit())
			return NULL;

		rfds = malloc(sizeof(*rfds) * (nfds + 1));
		rnfds = rfds ? nfds + 1 : 0;
	}

	if (rfds) {
		rfds[nfds].fd = pollsignal;
		rfds[nfds].events = POLLIN;
	}
	return rfds;
}

static int rs_poll_rs(struct rsocket *rs, int events,
		      int nonblock, int (*test)(struct rsocket *rs))
{
	struct pollfd fds;
	short revents;
	int ret;

check_cq:
	if ((rs->type == SOCK_STREAM) && ((rs->state & rs_connected) ||
	    (rs->state == rs_disconnected) || (rs->state & rs_error))) {
		rs_process_cq(rs, nonblock, test);

		revents = 0;
		if ((events & POLLIN) && rs_conn_have_rdata(rs))
			revents |= POLLIN;
		if ((events & POLLOUT) && rs_can_send(rs))
			revents |= POLLOUT;
		if (!(rs->state & rs_connected)) {
			if (rs->state == rs_disconnected)
				revents |= POLLHUP;
			else
				revents |= POLLERR;
		}

		return revents;
	} else if (rs->type == SOCK_DGRAM) {
		ds_process_cqs(rs, nonblock, test);

		revents = 0;
		if ((events & POLLIN) && rs_have_rdata(rs))
			revents |= POLLIN;
		if ((events & POLLOUT) && ds_can_send(rs))
			revents |= POLLOUT;

		return revents;
	}

	if (rs->state == rs_listening) {
		fds.fd = rs->accept_queue[0];
		fds.events = events;
		fds.revents = 0;
		poll(&fds, 1, 0);
		return fds.revents;
	}

	if (rs->state & rs_opening) {
		ret = rs_do_connect(rs);
		if (ret && (errno == EINPROGRESS)) {
			errno = 0;
		} else {
			goto check_cq;
		}
	}

	if (rs->state == rs_connect_error) {
		revents = 0;
		if (events & POLLOUT)
			revents |= POLLOUT;
		if (events & POLLIN)
			revents |= POLLIN;
		revents |= POLLERR;
		return revents;
	}

	return 0;
}

static int rs_poll_check(struct pollfd *fds, nfds_t nfds)
{
	struct rsocket *rs;
	int i, cnt = 0;

	for (i = 0; i < nfds; i++) {
		rs = idm_lookup(&idm, fds[i].fd);
		if (rs)
			fds[i].revents = rs_poll_rs(rs, fds[i].events, 1, rs_poll_all);
		else
			poll(&fds[i], 1, 0);

		if (fds[i].revents)
			cnt++;
	}
	return cnt;
}

static int rs_poll_arm(struct pollfd *rfds, struct pollfd *fds, nfds_t nfds)
{
	struct rsocket *rs;
	int i;

	for (i = 0; i < nfds; i++) {
		rs = idm_lookup(&idm, fds[i].fd);
		if (rs) {
			fds[i].revents = rs_poll_rs(rs, fds[i].events, 0, rs_is_cq_armed);
			if (fds[i].revents)
				return 1;

			if (rs->type == SOCK_STREAM) {
				if (rs->state >= rs_connected)
					rfds[i].fd = rs->cm_id->recv_cq_channel->fd;
				else
					rfds[i].fd = rs->cm_id->channel->fd;
			} else {
				rfds[i].fd = rs->epfd;
			}
			rfds[i].events = POLLIN;
		} else {
			rfds[i].fd = fds[i].fd;
			rfds[i].events = fds[i].events;
		}
		rfds[i].revents = 0;
	}
	return 0;
}

static int rs_poll_events(struct pollfd *rfds, struct pollfd *fds, nfds_t nfds)
{
	struct rsocket *rs;
	int i, cnt = 0;

	for (i = 0; i < nfds; i++) {
		rs = idm_lookup(&idm, fds[i].fd);
		if (rs) {
			if (rfds[i].revents) {
				fastlock_acquire(&rs->cq_wait_lock);
				if (rs->type == SOCK_STREAM)
					rs_get_cq_event(rs);
				else
					ds_get_cq_event(rs);
				fastlock_release(&rs->cq_wait_lock);
			}

			fds[i].revents = rs_poll_rs(rs, fds[i].events, 1, rs_poll_all);
		} else {
			fds[i].revents = rfds[i].revents;
		}
		if (fds[i].revents)
			cnt++;
	}
	return cnt;
}
/*
 * We need to poll *all* fd's that the user specifies at least once.
 * Note that we may receive events on an rsocket that may not be reported
 * to the user (e.g. connection events or credit updates).  Process those
 * events, then return to polling until we find ones of interest.
 */
int rpoll(struct pollfd *fds, nfds_t nfds, int timeout)
{
	struct pollfd *rfds;
	uint64_t start_time = 0;
	uint32_t poll_time;
	int pollsleep, ret;

	do {
		ret = rs_poll_check(fds, nfds);
		if (ret || !timeout)
			return ret;

		if (!start_time)
			start_time = rs_time_us();

		poll_time = (uint32_t) (rs_time_us() - start_time);
	} while (poll_time <= polling_time);

	rfds = rs_fds_alloc(nfds);
	if (!rfds)
		return ERR(ENOMEM);

	do {
		ret = rs_poll_arm(rfds, fds, nfds);
		if (ret)
			break;

		if (rs_poll_enter())
			continue;

		if (timeout >= 0) {
			timeout -= (int) ((rs_time_us() - start_time) / 1000);
			if (timeout <= 0)
				return 0;
			pollsleep = min(timeout, wake_up_interval);
		} else {
			pollsleep = wake_up_interval;
		}

		ret = poll(rfds, nfds + 1, pollsleep);
		if (ret < 0) {
			rs_poll_exit();
			break;
		}

		ret = rs_poll_events(rfds, fds, nfds);
		rs_poll_stop();
	} while (!ret);

	return ret;
}

static struct pollfd *rs_select_to_poll(int *nfds, fd_set *readfds,
					fd_set *writefds, fd_set *exceptfds)
{
	struct pollfd *fds;
	int fd, i = 0;

	fds = calloc(*nfds, sizeof(*fds));
	if (!fds)
		return NULL;

	for (fd = 0; fd < *nfds; fd++) {
		if (readfds && FD_ISSET(fd, readfds)) {
			fds[i].fd = fd;
			fds[i].events = POLLIN;
		}

		if (writefds && FD_ISSET(fd, writefds)) {
			fds[i].fd = fd;
			fds[i].events |= POLLOUT;
		}

		if (exceptfds && FD_ISSET(fd, exceptfds))
			fds[i].fd = fd;

		if (fds[i].fd)
			i++;
	}

	*nfds = i;
	return fds;
}

static int rs_poll_to_select(int nfds, struct pollfd *fds, fd_set *readfds,
			     fd_set *writefds, fd_set *exceptfds)
{
	int i, cnt = 0;

	for (i = 0; i < nfds; i++) {
		if (readfds && (fds[i].revents & (POLLIN | POLLHUP))) {
			FD_SET(fds[i].fd, readfds);
			cnt++;
		}

		if (writefds && (fds[i].revents & POLLOUT)) {
			FD_SET(fds[i].fd, writefds);
			cnt++;
		}

		if (exceptfds && (fds[i].revents & ~(POLLIN | POLLOUT))) {
			FD_SET(fds[i].fd, exceptfds);
			cnt++;
		}
	}
	return cnt;
}

static int rs_convert_timeout(struct timeval *timeout)
{
	return !timeout ? -1 :
		timeout->tv_sec * 1000 + timeout->tv_usec / 1000;
}

int rselect(int nfds, fd_set *readfds, fd_set *writefds,
	    fd_set *exceptfds, struct timeval *timeout)
{
	struct pollfd *fds;
	int ret;

	fds = rs_select_to_poll(&nfds, readfds, writefds, exceptfds);
	if (!fds)
		return ERR(ENOMEM);

	ret = rpoll(fds, nfds, rs_convert_timeout(timeout));

	if (readfds)
		FD_ZERO(readfds);
	if (writefds)
		FD_ZERO(writefds);
	if (exceptfds)
		FD_ZERO(exceptfds);

	if (ret > 0)
		ret = rs_poll_to_select(nfds, fds, readfds, writefds, exceptfds);

	free(fds);
	return ret;
}
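/*
 * Illustrative sketch (not part of the library build): rpoll() above
 * follows poll(2) semantics, so a simple echo loop can multiplex a
 * listening rsocket and a connected one.  Placeholder names; error
 * handling trimmed.
 */
#if 0
static void example_event_loop(int listen_fd, int conn_fd)
{
	struct pollfd fds[2];
	char buf[512];
	ssize_t n;

	fds[0].fd = listen_fd;
	fds[0].events = POLLIN;
	fds[1].fd = conn_fd;
	fds[1].events = POLLIN;

	while (rpoll(fds, 2, -1) > 0) {
		if (fds[0].revents & POLLIN)
			raccept(listen_fd, NULL, NULL);
		if (fds[1].revents & POLLIN) {
			n = rrecv(conn_fd, buf, sizeof(buf), 0);
			if (n <= 0)
				break;
			rsend(conn_fd, buf, n, 0);
		}
		if (fds[1].revents & (POLLHUP | POLLERR))
			break;
	}
}
#endif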
/*
 * For graceful disconnect, notify the remote side that we're
 * disconnecting and wait until all outstanding sends complete, provided
 * that the remote side has not sent a disconnect message.
 */
int rshutdown(int socket, int how)
{
	struct rsocket *rs;
	int ctrl, ret = 0;

	rs = idm_lookup(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	if (rs->opts & RS_OPT_KEEPALIVE)
		rs_notify_svc(&tcp_svc, rs, RS_SVC_REM_KEEPALIVE);

	if (rs->fd_flags & O_NONBLOCK)
		rs_set_nonblocking(rs, 0);

	if (rs->state & rs_connected) {
		if (how == SHUT_RDWR) {
			ctrl = RS_CTRL_DISCONNECT;
			rs->state &= ~(rs_readable | rs_writable);
		} else if (how == SHUT_WR) {
			rs->state &= ~rs_writable;
			ctrl = (rs->state & rs_readable) ?
				RS_CTRL_SHUTDOWN : RS_CTRL_DISCONNECT;
		} else {
			rs->state &= ~rs_readable;
			if (rs->state & rs_writable)
				goto out;
			ctrl = RS_CTRL_DISCONNECT;
		}

		if (!rs_ctrl_avail(rs)) {
			ret = rs_process_cq(rs, 0, rs_conn_can_send_ctrl);
			if (ret)
				goto out;
		}

		if ((rs->state & rs_connected) && rs_ctrl_avail(rs)) {
			rs->ctrl_seqno++;
			ret = rs_post_msg(rs, rs_msg_set(RS_OP_CTRL, ctrl));
		}
	}

	if (rs->state & rs_connected)
		rs_process_cq(rs, 0, rs_conn_all_sends_done);

out:
	if ((rs->fd_flags & O_NONBLOCK) && (rs->state & rs_connected))
		rs_set_nonblocking(rs, rs->fd_flags);

	if (rs->state & rs_disconnected) {
		/* Generate event by flushing receives to unblock rpoll */
		ibv_req_notify_cq(rs->cm_id->recv_cq, 0);
		ucma_shutdown(rs->cm_id);
	}

	return ret;
}

static void ds_shutdown(struct rsocket *rs)
{
	if (rs->opts & RS_OPT_UDP_SVC)
		rs_notify_svc(&udp_svc, rs, RS_SVC_REM_DGRAM);

	if (rs->fd_flags & O_NONBLOCK)
		rs_set_nonblocking(rs, 0);

	rs->state &= ~(rs_readable | rs_writable);
	ds_process_cqs(rs, 0, ds_all_sends_done);

	if (rs->fd_flags & O_NONBLOCK)
		rs_set_nonblocking(rs, rs->fd_flags);
}

int rclose(int socket)
{
	struct rsocket *rs;

	rs = idm_lookup(&idm, socket);
	if (!rs)
		return EBADF;

	if (rs->type == SOCK_STREAM) {
		if (rs->state & rs_connected)
			rshutdown(socket, SHUT_RDWR);

		if (rs->opts & RS_OPT_KEEPALIVE)
			rs_notify_svc(&tcp_svc, rs, RS_SVC_REM_KEEPALIVE);

		if (rs->opts & RS_OPT_CM_SVC && rs->state == rs_listening)
			rs_notify_svc(&listen_svc, rs, RS_SVC_REM_CM);

		if (rs->opts & RS_OPT_CM_SVC)
			rs_notify_svc(&connect_svc, rs, RS_SVC_REM_CM);
	} else {
		ds_shutdown(rs);
	}

	rs_free(rs);
	return 0;
}

static void rs_copy_addr(struct sockaddr *dst, struct sockaddr *src,
			 socklen_t *len)
{
	socklen_t size;

	if (src->sa_family == AF_INET) {
		size = min_t(socklen_t, *len, sizeof(struct sockaddr_in));
		*len = sizeof(struct sockaddr_in);
	} else {
		size = min_t(socklen_t, *len, sizeof(struct sockaddr_in6));
		*len = sizeof(struct sockaddr_in6);
	}
	memcpy(dst, src, size);
}

int rgetpeername(int socket, struct sockaddr *addr, socklen_t *addrlen)
{
	struct rsocket *rs;

	rs = idm_lookup(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	if (rs->type == SOCK_STREAM) {
		rs_copy_addr(addr, rdma_get_peer_addr(rs->cm_id), addrlen);
		return 0;
	} else {
		return getpeername(rs->udp_sock, addr, addrlen);
	}
}

int rgetsockname(int socket, struct sockaddr *addr, socklen_t *addrlen)
{
	struct rsocket *rs;

	rs = idm_lookup(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	if (rs->type == SOCK_STREAM) {
		rs_copy_addr(addr, rdma_get_local_addr(rs->cm_id), addrlen);
		return 0;
	} else {
		return getsockname(rs->udp_sock, addr, addrlen);
	}
}

static int rs_set_keepalive(struct rsocket *rs, int on)
{
	FILE *f;
	int ret;

	if ((on && (rs->opts & RS_OPT_KEEPALIVE)) ||
	    (!on && !(rs->opts & RS_OPT_KEEPALIVE)))
		return 0;

	if (on) {
		if (!rs->keepalive_time) {
			if ((f = fopen("/proc/sys/net/ipv4/tcp_keepalive_time", "r"))) {
				if (fscanf(f, "%u", &rs->keepalive_time) != 1)
					rs->keepalive_time = 7200;
				fclose(f);
			} else {
				rs->keepalive_time = 7200;
			}
		}
		ret = rs_notify_svc(&tcp_svc, rs, RS_SVC_ADD_KEEPALIVE);
	} else {
		ret = rs_notify_svc(&tcp_svc, rs, RS_SVC_REM_KEEPALIVE);
	}

	return ret;
}
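/*
 * Illustrative sketch (not part of the library build): rsetsockopt()
 * below accepts rsocket-specific options at the SOL_RDMA level before a
 * connection is established.  Values are placeholders; error handling
 * trimmed.
 */
#if 0
static int example_tune_rsocket(int fd)
{
	uint32_t sq_size = 256;		/* requested send queue depth */
	uint32_t inline_size = 64;	/* preferred inline data size */

	if (rsetsockopt(fd, SOL_RDMA, RDMA_SQSIZE,
			&sq_size, sizeof(sq_size)))
		return -1;

	/* The library may clamp the value when the QP is created */
	return rsetsockopt(fd, SOL_RDMA, RDMA_INLINE,
			   &inline_size, sizeof(inline_size));
}
#endif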
int rsetsockopt(int socket, int level, int optname,
		const void *optval, socklen_t optlen)
{
	struct rsocket *rs;
	int ret, opt_on = 0;
	uint64_t *opts = NULL;

	ret = ERR(ENOTSUP);
	rs = idm_lookup(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	if (rs->type == SOCK_DGRAM && level != SOL_RDMA) {
		ret = setsockopt(rs->udp_sock, level, optname, optval, optlen);
		if (ret)
			return ret;
	}

	switch (level) {
	case SOL_SOCKET:
		opts = &rs->so_opts;
		switch (optname) {
		case SO_REUSEADDR:
			if (rs->type == SOCK_STREAM) {
				ret = rdma_set_option(rs->cm_id, RDMA_OPTION_ID,
						      RDMA_OPTION_ID_REUSEADDR,
						      (void *) optval, optlen);
				if (ret && ((errno == ENOSYS) || ((rs->state != rs_init) &&
				    rs->cm_id->context &&
				    (rs->cm_id->verbs->device->transport_type ==
				     IBV_TRANSPORT_IB))))
					ret = 0;
			}
			opt_on = *(int *) optval;
			break;
		case SO_RCVBUF:
			if ((rs->type == SOCK_STREAM && !rs->rbuf) ||
			    (rs->type == SOCK_DGRAM && !rs->qp_list))
				rs->rbuf_size = (*(uint32_t *) optval) << 1;
			ret = 0;
			break;
		case SO_SNDBUF:
			if (!rs->sbuf)
				rs->sbuf_size = (*(uint32_t *) optval) << 1;
			if (rs->sbuf_size < RS_SNDLOWAT)
				rs->sbuf_size = RS_SNDLOWAT << 1;
			ret = 0;
			break;
		case SO_LINGER:
			/* Invert value so default so_opt = 0 is on */
			opt_on = !((struct linger *) optval)->l_onoff;
			ret = 0;
			break;
		case SO_KEEPALIVE:
			ret = rs_set_keepalive(rs, *(int *) optval);
			opt_on = rs->opts & RS_OPT_KEEPALIVE;
			break;
		case SO_OOBINLINE:
			opt_on = *(int *) optval;
			ret = 0;
			break;
		default:
			break;
		}
		break;
	case IPPROTO_TCP:
		opts = &rs->tcp_opts;
		switch (optname) {
		case TCP_KEEPCNT:
		case TCP_KEEPINTVL:
			ret = 0;	/* N/A - we're using a reliable connection */
			break;
		case TCP_KEEPIDLE:
			if (*(int *) optval <= 0) {
				ret = ERR(EINVAL);
				break;
			}
			rs->keepalive_time = *(int *) optval;
			ret = (rs->opts & RS_OPT_KEEPALIVE) ?
			      rs_notify_svc(&tcp_svc, rs, RS_SVC_MOD_KEEPALIVE) : 0;
			break;
		case TCP_NODELAY:
			opt_on = *(int *) optval;
			ret = 0;
			break;
		case TCP_MAXSEG:
			ret = 0;
			break;
		default:
			break;
		}
		break;
	case IPPROTO_IPV6:
		opts = &rs->ipv6_opts;
		switch (optname) {
		case IPV6_V6ONLY:
			if (rs->type == SOCK_STREAM) {
				ret = rdma_set_option(rs->cm_id, RDMA_OPTION_ID,
						      RDMA_OPTION_ID_AFONLY,
						      (void *) optval, optlen);
			}
			opt_on = *(int *) optval;
			break;
		default:
			break;
		}
		break;
	case SOL_RDMA:
		if (rs->state >= rs_opening) {
			ret = ERR(EINVAL);
			break;
		}

		switch (optname) {
		case RDMA_SQSIZE:
			rs->sq_size = min_t(uint32_t, (*(uint32_t *)optval),
					    RS_QP_MAX_SIZE);
			ret = 0;
			break;
		case RDMA_RQSIZE:
			rs->rq_size = min_t(uint32_t, (*(uint32_t *)optval),
					    RS_QP_MAX_SIZE);
			ret = 0;
			break;
		case RDMA_INLINE:
			rs->sq_inline = min_t(uint32_t, *(uint32_t *)optval,
					      RS_QP_MAX_SIZE);
			ret = 0;
			break;
		case RDMA_IOMAPSIZE:
			rs->target_iomap_size = (uint16_t) rs_scale_to_value(
				(uint8_t) rs_value_to_scale(*(int *) optval, 8), 8);
			ret = 0;
			break;
		case RDMA_ROUTE:
			if ((rs->optval = malloc(optlen))) {
				memcpy(rs->optval, optval, optlen);
				rs->optlen = optlen;
				ret = 0;
			} else {
				ret = ERR(ENOMEM);
			}
			break;
		default:
			break;
		}
		break;
	default:
		break;
	}

	if (!ret && opts) {
		if (opt_on)
			*opts |= (1 << optname);
		else
			*opts &= ~(1 << optname);
	}

	return ret;
}

static void rs_convert_sa_path(struct ibv_sa_path_rec *sa_path,
			       struct ibv_path_data *path_data)
{
	uint32_t fl_hop;

	memset(path_data, 0, sizeof(*path_data));
	path_data->path.dgid = sa_path->dgid;
	path_data->path.sgid = sa_path->sgid;
	path_data->path.dlid = sa_path->dlid;
	path_data->path.slid = sa_path->slid;
	fl_hop = be32toh(sa_path->flow_label) << 8;
	path_data->path.flowlabel_hoplimit = htobe32(fl_hop | sa_path->hop_limit);
	path_data->path.tclass = sa_path->traffic_class;
	path_data->path.reversible_numpath = sa_path->reversible << 7 | 1;
	path_data->path.pkey = sa_path->pkey;
	path_data->path.qosclass_sl = htobe16(sa_path->sl);
	path_data->path.mtu = sa_path->mtu | 2 << 6;	/* exactly */
	path_data->path.rate = sa_path->rate | 2 << 6;
	path_data->path.packetlifetime = sa_path->packet_life_time | 2 << 6;
	path_data->flags = sa_path->preference;
}
	struct rsocket *rs;
	void *opt;
	struct ibv_sa_path_rec *path_rec;
	struct ibv_path_data path_data;
	socklen_t len;
	int ret = 0;
	int num_paths;

	rs = idm_lookup(&idm, socket);
	if (!rs)
		return ERR(EBADF);
	switch (level) {
	case SOL_SOCKET:
		switch (optname) {
		case SO_REUSEADDR:
		case SO_KEEPALIVE:
		case SO_OOBINLINE:
			*((int *) optval) = !!(rs->so_opts & (1 << optname));
			*optlen = sizeof(int);
			break;
		case SO_RCVBUF:
			*((int *) optval) = rs->rbuf_size;
			*optlen = sizeof(int);
			break;
		case SO_SNDBUF:
			*((int *) optval) = rs->sbuf_size;
			*optlen = sizeof(int);
			break;
		case SO_LINGER:
			/* Value is inverted so default so_opt = 0 is on */
			((struct linger *) optval)->l_onoff =
					!(rs->so_opts & (1 << optname));
			((struct linger *) optval)->l_linger = 0;
			*optlen = sizeof(struct linger);
			break;
		case SO_ERROR:
			*((int *) optval) = rs->err;
			*optlen = sizeof(int);
			rs->err = 0;
			break;
		default:
			ret = ENOTSUP;
			break;
		}
		break;
	case IPPROTO_TCP:
		switch (optname) {
		case TCP_KEEPCNT:
		case TCP_KEEPINTVL:
			*((int *) optval) = 1;   /* N/A */
			*optlen = sizeof(int);
			break;
		case TCP_KEEPIDLE:
			*((int *) optval) = (int) rs->keepalive_time;
			*optlen = sizeof(int);
			break;
		case TCP_NODELAY:
			*((int *) optval) = !!(rs->tcp_opts & (1 << optname));
			*optlen = sizeof(int);
			break;
		case TCP_MAXSEG:
			*((int *) optval) = (rs->cm_id && rs->cm_id->route.num_paths) ?
					    1 << (7 + rs->cm_id->route.path_rec->mtu) :
					    2048;
			*optlen = sizeof(int);
			break;
		default:
			ret = ENOTSUP;
			break;
		}
		break;
	case IPPROTO_IPV6:
		switch (optname) {
		case IPV6_V6ONLY:
			*((int *) optval) = !!(rs->ipv6_opts & (1 << optname));
			*optlen = sizeof(int);
			break;
		default:
			ret = ENOTSUP;
			break;
		}
		break;
	case SOL_RDMA:
		switch (optname) {
		case RDMA_SQSIZE:
			*((int *) optval) = rs->sq_size;
			*optlen = sizeof(int);
			break;
		case RDMA_RQSIZE:
			*((int *) optval) = rs->rq_size;
			*optlen = sizeof(int);
			break;
		case RDMA_INLINE:
			*((int *) optval) = rs->sq_inline;
			*optlen = sizeof(int);
			break;
		case RDMA_IOMAPSIZE:
			*((int *) optval) = rs->target_iomap_size;
			*optlen = sizeof(int);
			break;
		case RDMA_ROUTE:
			if (rs->optval) {
				if (*optlen < rs->optlen) {
					ret = EINVAL;
				} else {
					/* Copy the stored route out to the
					 * caller, not the other way around.
					 */
					memcpy(optval, rs->optval, rs->optlen);
					*optlen = rs->optlen;
				}
			} else {
				if (*optlen < sizeof(path_data)) {
					ret = EINVAL;
				} else {
					len = 0;
					opt = optval;
					path_rec = rs->cm_id->route.path_rec;
					num_paths = 0;
					while (len + sizeof(path_data) <= *optlen &&
					       num_paths < rs->cm_id->route.num_paths) {
						rs_convert_sa_path(path_rec, &path_data);
						memcpy(opt, &path_data, sizeof(path_data));
						len += sizeof(path_data);
						opt += sizeof(path_data);
						path_rec++;
						num_paths++;
					}
					*optlen = len;
					ret = 0;
				}
			}
			break;
		default:
			ret = ENOTSUP;
			break;
		}
		break;
	default:
		ret = ENOTSUP;
		break;
	}

	return rdma_seterrno(ret);
}

int rfcntl(int socket, int cmd, ...
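
/*
 * RDMA_ROUTE above returns one struct ibv_path_data per path in the
 * resolved route, so callers size the buffer for however many paths
 * they will accept.  A sketch, assuming "fd" is already connected:
 *
 *	struct ibv_path_data paths[2];
 *	socklen_t plen = sizeof(paths);
 *	if (!rgetsockopt(fd, SOL_RDMA, RDMA_ROUTE, paths, &plen))
 *		size_t num = plen / sizeof(paths[0]);	// paths returned
 */
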
/* arg */ )
{
	struct rsocket *rs;
	va_list args;
	int param;
	int ret = 0;

	rs = idm_lookup(&idm, socket);
	if (!rs)
		return ERR(EBADF);
	va_start(args, cmd);
	switch (cmd) {
	case F_GETFL:
		ret = rs->fd_flags;
		break;
	case F_SETFL:
		param = va_arg(args, int);
		if ((rs->fd_flags & O_NONBLOCK) != (param & O_NONBLOCK))
			ret = rs_set_nonblocking(rs, param & O_NONBLOCK);
		if (!ret)
			rs->fd_flags = param;
		break;
	default:
		ret = ERR(ENOTSUP);
		break;
	}
	va_end(args);
	return ret;
}

static struct rs_iomap_mr *rs_get_iomap_mr(struct rsocket *rs)
{
	int i;

	if (!rs->remote_iomappings) {
		rs->remote_iomappings = calloc(rs->remote_iomap.length,
					       sizeof(*rs->remote_iomappings));
		if (!rs->remote_iomappings)
			return NULL;
		for (i = 0; i < rs->remote_iomap.length; i++)
			rs->remote_iomappings[i].index = i;
	}

	for (i = 0; i < rs->remote_iomap.length; i++) {
		if (!rs->remote_iomappings[i].mr)
			return &rs->remote_iomappings[i];
	}
	return NULL;
}

/*
 * If an offset is given, we map to it. If offset is -1, then we map the
 * offset to the address of buf. We do not check for conflicts, which must
 * be fixed at some point.
 */
off_t riomap(int socket, void *buf, size_t len, int prot, int flags, off_t offset)
{
	struct rsocket *rs;
	struct rs_iomap_mr *iomr;
	int access = IBV_ACCESS_LOCAL_WRITE;

	rs = idm_at(&idm, socket);
	if (!rs)
		return ERR(EBADF);
	if (!rs->cm_id->pd || (prot & ~(PROT_WRITE | PROT_NONE)))
		return ERR(EINVAL);

	fastlock_acquire(&rs->map_lock);
	if (prot & PROT_WRITE) {
		iomr = rs_get_iomap_mr(rs);
		access |= IBV_ACCESS_REMOTE_WRITE;
	} else {
		iomr = calloc(1, sizeof(*iomr));
		if (iomr)	/* don't dereference a failed allocation */
			iomr->index = -1;
	}
	if (!iomr) {
		offset = ERR(ENOMEM);
		goto out;
	}

	iomr->mr = ibv_reg_mr(rs->cm_id->pd, buf, len, access);
	if (!iomr->mr) {
		if (iomr->index < 0)
			free(iomr);
		offset = -1;
		goto out;
	}

	if (offset == -1)
		offset = (uintptr_t) buf;
	iomr->offset = offset;
	atomic_store(&iomr->refcnt, 1);

	if (iomr->index >= 0) {
		dlist_insert_tail(&iomr->entry, &rs->iomap_queue);
		rs->iomap_pending = 1;
	} else {
		dlist_insert_tail(&iomr->entry, &rs->iomap_list);
	}
out:
	fastlock_release(&rs->map_lock);
	return offset;
}

int riounmap(int socket, void *buf, size_t len)
{
	struct rsocket *rs;
	struct rs_iomap_mr *iomr;
	dlist_entry *entry;
	int ret = 0;

	rs = idm_at(&idm, socket);
	if (!rs)
		return ERR(EBADF);

	fastlock_acquire(&rs->map_lock);
	for (entry = rs->iomap_list.next; entry != &rs->iomap_list;
	     entry = entry->next) {
		iomr = container_of(entry, struct rs_iomap_mr, entry);
		if (iomr->mr->addr == buf && iomr->mr->length == len) {
			rs_release_iomap_mr(iomr);
			goto out;
		}
	}
	for (entry = rs->iomap_queue.next; entry != &rs->iomap_queue;
	     entry = entry->next) {
		iomr = container_of(entry, struct rs_iomap_mr, entry);
		if (iomr->mr->addr == buf && iomr->mr->length == len) {
			rs_release_iomap_mr(iomr);
			goto out;
		}
	}
	ret = ERR(EINVAL);
out:
	fastlock_release(&rs->map_lock);
	return ret;
}

static struct rs_iomap *rs_find_iomap(struct rsocket *rs, off_t offset)
{
	int i;

	for (i = 0; i < rs->target_iomap_size; i++) {
		if (offset >= rs->target_iomap[i].offset &&
		    offset < rs->target_iomap[i].offset + rs->target_iomap[i].sge.length)
			return &rs->target_iomap[i];
	}
	return NULL;
}

size_t riowrite(int socket, const void *buf, size_t count, off_t offset, int flags)
{
	struct rsocket *rs;
	struct rs_iomap *iom = NULL;
	struct ibv_sge sge;
	size_t left = count;
	uint32_t xfer_size, olen = RS_OLAP_START_SIZE;
	int ret = 0;

	rs = idm_at(&idm, socket);
	if (!rs)
		return ERR(EBADF);
	fastlock_acquire(&rs->slock);
	if (rs->iomap_pending) {
		ret = rs_send_iomaps(rs, flags);
		if (ret)
			goto out;
	}
	for (; left; left -= xfer_size, buf +=
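
/*
 * riomap()/riowrite() above and below implement the zero-copy iomap
 * extension: the owner of a buffer registers it and advertises an
 * offset, after which the peer can riowrite() straight into it.  A
 * hedged sketch of the two sides (hypothetical variables,
 * RDMA_IOMAPSIZE assumed to have been enabled before connecting, no
 * error handling):
 *
 *	// buffer owner: expose "buf", using its address as the offset
 *	off_t off = riomap(fd, buf, len, PROT_WRITE, 0, -1);
 *
 *	// peer (after learning "off"): push data into the mapping
 *	riowrite(fd, data, dlen, off, 0);
 *
 *	riounmap(fd, buf, len);		// owner-side teardown
 */
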
xfer_size, offset += xfer_size) { if (!iom || offset > iom->offset + iom->sge.length) { iom = rs_find_iomap(rs, offset); if (!iom) break; } if (!rs_can_send(rs)) { ret = rs_get_comp(rs, rs_nonblocking(rs, flags), rs_conn_can_send); if (ret) break; if (!(rs->state & rs_writable)) { ret = ERR(ECONNRESET); break; } } if (olen < left) { xfer_size = olen; if (olen < RS_MAX_TRANSFER) olen <<= 1; } else { xfer_size = left; } if (xfer_size > rs->sbuf_bytes_avail) xfer_size = rs->sbuf_bytes_avail; if (xfer_size > iom->offset + iom->sge.length - offset) xfer_size = iom->offset + iom->sge.length - offset; if (xfer_size <= rs->sq_inline) { sge.addr = (uintptr_t) buf; sge.length = xfer_size; sge.lkey = 0; ret = rs_write_direct(rs, iom, offset, &sge, 1, xfer_size, IBV_SEND_INLINE); } else if (xfer_size <= rs_sbuf_left(rs)) { memcpy((void *) (uintptr_t) rs->ssgl[0].addr, buf, xfer_size); rs->ssgl[0].length = xfer_size; ret = rs_write_direct(rs, iom, offset, rs->ssgl, 1, xfer_size, 0); if (xfer_size < rs_sbuf_left(rs)) rs->ssgl[0].addr += xfer_size; else rs->ssgl[0].addr = (uintptr_t) rs->sbuf; } else { rs->ssgl[0].length = rs_sbuf_left(rs); memcpy((void *) (uintptr_t) rs->ssgl[0].addr, buf, rs->ssgl[0].length); rs->ssgl[1].length = xfer_size - rs->ssgl[0].length; memcpy(rs->sbuf, buf + rs->ssgl[0].length, rs->ssgl[1].length); ret = rs_write_direct(rs, iom, offset, rs->ssgl, 2, xfer_size, 0); rs->ssgl[0].addr = (uintptr_t) rs->sbuf + rs->ssgl[1].length; } if (ret) break; } out: fastlock_release(&rs->slock); return (ret && left == count) ? ret : count - left; } /**************************************************************************** * Service Processing Threads ****************************************************************************/ static int rs_svc_grow_sets(struct rs_svc *svc, int grow_size) { struct rsocket **rss; void *set, *contexts; set = calloc(svc->size + grow_size, sizeof(*rss) + svc->context_size); if (!set) return ENOMEM; svc->size += grow_size; rss = set; contexts = set + sizeof(*rss) * svc->size; if (svc->cnt) { memcpy(rss, svc->rss, sizeof(*rss) * (svc->cnt + 1)); memcpy(contexts, svc->contexts, svc->context_size * (svc->cnt + 1)); } free(svc->rss); svc->rss = rss; svc->contexts = contexts; return 0; } /* * Index 0 is reserved for the service's communication socket. 
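 *
 * The service sets are two parallel arrays that grow in lockstep:
 * rss[] holds the rsocket pointers and contexts[] holds context_size
 * bytes of per-socket state at the same index (pollfds for the UDP
 * service, timeouts for the keepalive service).  A sketch of the
 * lookup pattern, assuming "i" is a valid set index:
 *
 *	struct rsocket *s = svc->rss[i];
 *	void *ctx = svc->contexts + i * svc->context_size;
 *
 * This reserved zeroth slot is also why rs_svc_grow_sets() copies
 * cnt + 1 elements.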
*/ static int rs_svc_add_rs(struct rs_svc *svc, struct rsocket *rs) { int ret; if (svc->cnt >= svc->size - 1) { ret = rs_svc_grow_sets(svc, 4); if (ret) return ret; } svc->rss[++svc->cnt] = rs; return 0; } static int rs_svc_index(struct rs_svc *svc, struct rsocket *rs) { int i; for (i = 1; i <= svc->cnt; i++) { if (svc->rss[i] == rs) return i; } return -1; } static int rs_svc_rm_rs(struct rs_svc *svc, struct rsocket *rs) { int i; if ((i = rs_svc_index(svc, rs)) >= 0) { svc->rss[i] = svc->rss[svc->cnt]; memcpy(svc->contexts + i * svc->context_size, svc->contexts + svc->cnt * svc->context_size, svc->context_size); svc->cnt--; return 0; } return EBADF; } static void udp_svc_process_sock(struct rs_svc *svc) { struct rs_svc_msg msg; read_all(svc->sock[1], &msg, sizeof msg); switch (msg.cmd) { case RS_SVC_ADD_DGRAM: msg.status = rs_svc_add_rs(svc, msg.rs); if (!msg.status) { msg.rs->opts |= RS_OPT_UDP_SVC; udp_svc_fds = svc->contexts; udp_svc_fds[svc->cnt].fd = msg.rs->udp_sock; udp_svc_fds[svc->cnt].events = POLLIN; udp_svc_fds[svc->cnt].revents = 0; } break; case RS_SVC_REM_DGRAM: msg.status = rs_svc_rm_rs(svc, msg.rs); if (!msg.status) msg.rs->opts &= ~RS_OPT_UDP_SVC; break; case RS_SVC_NOOP: msg.status = 0; break; default: break; } write_all(svc->sock[1], &msg, sizeof msg); } static uint8_t udp_svc_sgid_index(struct ds_dest *dest, union ibv_gid *sgid) { union ibv_gid gid; int i; for (i = 0; i < 16; i++) { ibv_query_gid(dest->qp->cm_id->verbs, dest->qp->cm_id->port_num, i, &gid); if (!memcmp(sgid, &gid, sizeof gid)) return i; } return 0; } static uint8_t udp_svc_path_bits(struct ds_dest *dest) { struct ibv_port_attr attr; if (!ibv_query_port(dest->qp->cm_id->verbs, dest->qp->cm_id->port_num, &attr)) return (uint8_t) ((1 << attr.lmc) - 1); return 0x7f; } static void udp_svc_create_ah(struct rsocket *rs, struct ds_dest *dest, uint32_t qpn) { union socket_addr saddr; struct rdma_cm_id *id; struct ibv_ah_attr attr; int ret; if (dest->ah) { fastlock_acquire(&rs->slock); ibv_destroy_ah(dest->ah); dest->ah = NULL; fastlock_release(&rs->slock); } ret = rdma_create_id(NULL, &id, NULL, dest->qp->cm_id->ps); if (ret) return; memcpy(&saddr, rdma_get_local_addr(dest->qp->cm_id), ucma_addrlen(rdma_get_local_addr(dest->qp->cm_id))); if (saddr.sa.sa_family == AF_INET) saddr.sin.sin_port = 0; else saddr.sin6.sin6_port = 0; ret = rdma_resolve_addr(id, &saddr.sa, &dest->addr.sa, 2000); if (ret) goto out; ret = rdma_resolve_route(id, 2000); if (ret) goto out; memset(&attr, 0, sizeof attr); if (id->route.path_rec->hop_limit > 1) { attr.is_global = 1; attr.grh.dgid = id->route.path_rec->dgid; attr.grh.flow_label = be32toh(id->route.path_rec->flow_label); attr.grh.sgid_index = udp_svc_sgid_index(dest, &id->route.path_rec->sgid); attr.grh.hop_limit = id->route.path_rec->hop_limit; attr.grh.traffic_class = id->route.path_rec->traffic_class; } attr.dlid = be16toh(id->route.path_rec->dlid); attr.sl = id->route.path_rec->sl; attr.src_path_bits = be16toh(id->route.path_rec->slid) & udp_svc_path_bits(dest); attr.static_rate = id->route.path_rec->rate; attr.port_num = id->port_num; fastlock_acquire(&rs->slock); dest->qpn = qpn; dest->ah = ibv_create_ah(dest->qp->cm_id->pd, &attr); fastlock_release(&rs->slock); out: rdma_destroy_id(id); } static int udp_svc_valid_udp_hdr(struct ds_udp_header *udp_hdr, union socket_addr *addr) { return (udp_hdr->tag == htobe32(DS_UDP_TAG)) && ((udp_hdr->version == 4 && addr->sa.sa_family == AF_INET && udp_hdr->length == DS_UDP_IPV4_HDR_LEN) || (udp_hdr->version == 6 && addr->sa.sa_family == 
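
/*
 * udp_svc_create_ah() above resolves a throwaway rdma_cm_id purely to
 * obtain a path record, then hand-builds the address handle from it.
 * The core of that conversion shown standalone (a sketch; "rec", "pd"
 * and "port" are assumed to be already resolved):
 *
 *	struct ibv_ah_attr ah = { .port_num = port };
 *	ah.dlid = be16toh(rec->dlid);
 *	ah.sl = rec->sl;
 *	if (rec->hop_limit > 1) {	// off-subnet: GRH required
 *		ah.is_global = 1;
 *		ah.grh.dgid = rec->dgid;
 *		ah.grh.hop_limit = rec->hop_limit;
 *	}
 *	struct ibv_ah *h = ibv_create_ah(pd, &ah);
 */
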
AF_INET6 && udp_hdr->length == DS_UDP_IPV6_HDR_LEN)); } static void udp_svc_forward(struct rsocket *rs, void *buf, size_t len, union socket_addr *src) { struct ds_header hdr; struct ds_smsg *msg; struct ibv_sge sge; uint64_t offset; if (!ds_can_send(rs)) { if (ds_get_comp(rs, 0, ds_can_send)) return; } msg = rs->smsg_free; rs->smsg_free = msg->next; rs->sqe_avail--; ds_format_hdr(&hdr, src); memcpy((void *) msg, &hdr, hdr.length); memcpy((void *) msg + hdr.length, buf, len); sge.addr = (uintptr_t) msg; sge.length = hdr.length + len; sge.lkey = rs->conn_dest->qp->smr->lkey; offset = (uint8_t *) msg - rs->sbuf; ds_post_send(rs, &sge, offset); } static void udp_svc_process_rs(struct rsocket *rs) { static uint8_t buf[RS_SNDLOWAT]; struct ds_dest *dest, *cur_dest; struct ds_udp_header *udp_hdr; union socket_addr addr; socklen_t addrlen = sizeof addr; int len, ret; uint32_t qpn; ret = recvfrom(rs->udp_sock, buf, sizeof buf, 0, &addr.sa, &addrlen); if (ret < DS_UDP_IPV4_HDR_LEN) return; udp_hdr = (struct ds_udp_header *) buf; if (!udp_svc_valid_udp_hdr(udp_hdr, &addr)) return; len = ret - udp_hdr->length; qpn = be32toh(udp_hdr->qpn) & 0xFFFFFF; udp_hdr->tag = (__force __be32)be32toh(udp_hdr->tag); udp_hdr->qpn = (__force __be32)qpn; ret = ds_get_dest(rs, &addr.sa, addrlen, &dest); if (ret) return; if (udp_hdr->op == RS_OP_DATA) { fastlock_acquire(&rs->slock); cur_dest = rs->conn_dest; rs->conn_dest = dest; ds_send_udp(rs, NULL, 0, 0, RS_OP_CTRL); rs->conn_dest = cur_dest; fastlock_release(&rs->slock); } if (!dest->ah || (dest->qpn != qpn)) udp_svc_create_ah(rs, dest, qpn); /* to do: handle when dest local ip address doesn't match udp ip */ if (udp_hdr->op == RS_OP_DATA) { fastlock_acquire(&rs->slock); cur_dest = rs->conn_dest; rs->conn_dest = &dest->qp->dest; udp_svc_forward(rs, buf + udp_hdr->length, len, &addr); rs->conn_dest = cur_dest; fastlock_release(&rs->slock); } } static void *udp_svc_run(void *arg) { struct rs_svc *svc = arg; struct rs_svc_msg msg; int i, ret; ret = rs_svc_grow_sets(svc, 4); if (ret) { msg.status = ret; write_all(svc->sock[1], &msg, sizeof msg); return (void *) (uintptr_t) ret; } udp_svc_fds = svc->contexts; udp_svc_fds[0].fd = svc->sock[1]; udp_svc_fds[0].events = POLLIN; do { for (i = 0; i <= svc->cnt; i++) udp_svc_fds[i].revents = 0; poll(udp_svc_fds, svc->cnt + 1, -1); if (udp_svc_fds[0].revents) udp_svc_process_sock(svc); for (i = 1; i <= svc->cnt; i++) { if (udp_svc_fds[i].revents) udp_svc_process_rs(svc->rss[i]); } } while (svc->cnt >= 1); return NULL; } static uint64_t rs_get_time(void) { return rs_time_us() / 1000000; } static void tcp_svc_process_sock(struct rs_svc *svc) { struct rs_svc_msg msg; int i; read_all(svc->sock[1], &msg, sizeof msg); switch (msg.cmd) { case RS_SVC_ADD_KEEPALIVE: msg.status = rs_svc_add_rs(svc, msg.rs); if (!msg.status) { msg.rs->opts |= RS_OPT_KEEPALIVE; tcp_svc_timeouts = svc->contexts; tcp_svc_timeouts[svc->cnt] = rs_get_time() + msg.rs->keepalive_time; } break; case RS_SVC_REM_KEEPALIVE: msg.status = rs_svc_rm_rs(svc, msg.rs); if (!msg.status) msg.rs->opts &= ~RS_OPT_KEEPALIVE; break; case RS_SVC_MOD_KEEPALIVE: i = rs_svc_index(svc, msg.rs); if (i >= 0) { tcp_svc_timeouts[i] = rs_get_time() + msg.rs->keepalive_time; msg.status = 0; } else { msg.status = EBADF; } break; case RS_SVC_NOOP: msg.status = 0; break; default: break; } write_all(svc->sock[1], &msg, sizeof msg); } /* * Send a 0 byte RDMA write with immediate as keep-alive message. * This avoids the need for the receive side to do any acknowledgment. 
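 *
 * "No acknowledgment" here means no rsocket-level reply is needed: the
 * sender leans on the reliable transport itself, so if the peer has
 * died the write eventually completes in error (retry exceeded), the
 * QP transitions to the error state, and the stale connection is
 * detected that way.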
*/ static void tcp_svc_send_keepalive(struct rsocket *rs) { fastlock_acquire(&rs->cq_lock); if (rs_ctrl_avail(rs) && (rs->state & rs_connected)) { rs->ctrl_seqno++; rs_post_write(rs, NULL, 0, rs_msg_set(RS_OP_CTRL, RS_CTRL_KEEPALIVE), 0, (uintptr_t) NULL, (uintptr_t) NULL); } fastlock_release(&rs->cq_lock); } static void *tcp_svc_run(void *arg) { struct rs_svc *svc = arg; struct rs_svc_msg msg; struct pollfd fds; uint64_t now, next_timeout; int i, ret, timeout; ret = rs_svc_grow_sets(svc, 16); if (ret) { msg.status = ret; write_all(svc->sock[1], &msg, sizeof msg); return (void *) (uintptr_t) ret; } tcp_svc_timeouts = svc->contexts; fds.fd = svc->sock[1]; fds.events = POLLIN; timeout = -1; do { poll(&fds, 1, timeout * 1000); if (fds.revents) tcp_svc_process_sock(svc); now = rs_get_time(); next_timeout = ~0; for (i = 1; i <= svc->cnt; i++) { if (tcp_svc_timeouts[i] <= now) { tcp_svc_send_keepalive(svc->rss[i]); tcp_svc_timeouts[i] = now + svc->rss[i]->keepalive_time; } if (tcp_svc_timeouts[i] < next_timeout) next_timeout = tcp_svc_timeouts[i]; } timeout = (int) (next_timeout - now); } while (svc->cnt >= 1); return NULL; } static void rs_handle_cm_event(struct rsocket *rs) { int ret; if (rs->state & rs_opening) { rs_do_connect(rs); } else { ret = ucma_complete(rs->cm_id); if (!ret && rs->cm_id->event && (rs->state & rs_connected) && (rs->cm_id->event->event == RDMA_CM_EVENT_DISCONNECTED)) rs->state = rs_disconnected; } if (!(rs->state & rs_opening)) rs_poll_signal(); } static void cm_svc_process_sock(struct rs_svc *svc) { struct rs_svc_msg msg; struct pollfd *fds; read_all(svc->sock[1], &msg, sizeof(msg)); switch (msg.cmd) { case RS_SVC_ADD_CM: msg.status = rs_svc_add_rs(svc, msg.rs); if (!msg.status) { msg.rs->opts |= RS_OPT_CM_SVC; fds = svc->contexts; fds[svc->cnt].fd = msg.rs->cm_id->channel->fd; fds[svc->cnt].events = POLLIN; fds[svc->cnt].revents = 0; } break; case RS_SVC_REM_CM: msg.status = rs_svc_rm_rs(svc, msg.rs); if (!msg.status) msg.rs->opts &= ~RS_OPT_CM_SVC; break; case RS_SVC_NOOP: msg.status = 0; break; default: break; } write_all(svc->sock[1], &msg, sizeof(msg)); } static void *cm_svc_run(void *arg) { struct rs_svc *svc = arg; struct pollfd *fds; struct rs_svc_msg msg; int i, ret; ret = rs_svc_grow_sets(svc, 4); if (ret) { msg.status = ret; write_all(svc->sock[1], &msg, sizeof(msg)); return (void *) (uintptr_t) ret; } fds = svc->contexts; fds[0].fd = svc->sock[1]; fds[0].events = POLLIN; do { for (i = 0; i <= svc->cnt; i++) fds[i].revents = 0; poll(fds, svc->cnt + 1, -1); if (fds[0].revents) { cm_svc_process_sock(svc); /* svc->contexts may have been reallocated, so need to assign again */ fds = svc->contexts; } for (i = 1; i <= svc->cnt; i++) { if (!fds[i].revents) continue; if (svc == &listen_svc) rs_accept(svc->rss[i]); else rs_handle_cm_event(svc->rss[i]); } } while (svc->cnt >= 1); return NULL; } rdma-core-56.1/librdmacm/rsocket.h000066400000000000000000000071711477342711600171150ustar00rootroot00000000000000/* * Copyright (c) 2011-2012 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #if !defined(RSOCKET_H) #define RSOCKET_H #include #include #include #include #include #include #include #ifdef __cplusplus extern "C" { #endif int rsocket(int domain, int type, int protocol); int rbind(int socket, const struct sockaddr *addr, socklen_t addrlen); int rlisten(int socket, int backlog); int raccept(int socket, struct sockaddr *addr, socklen_t *addrlen); int rconnect(int socket, const struct sockaddr *addr, socklen_t addrlen); int rshutdown(int socket, int how); int rclose(int socket); ssize_t rrecv(int socket, void *buf, size_t len, int flags); ssize_t rrecvfrom(int socket, void *buf, size_t len, int flags, struct sockaddr *src_addr, socklen_t *addrlen); ssize_t rrecvmsg(int socket, struct msghdr *msg, int flags); ssize_t rsend(int socket, const void *buf, size_t len, int flags); ssize_t rsendto(int socket, const void *buf, size_t len, int flags, const struct sockaddr *dest_addr, socklen_t addrlen); ssize_t rsendmsg(int socket, const struct msghdr *msg, int flags); ssize_t rread(int socket, void *buf, size_t count); ssize_t rreadv(int socket, const struct iovec *iov, int iovcnt); ssize_t rwrite(int socket, const void *buf, size_t count); ssize_t rwritev(int socket, const struct iovec *iov, int iovcnt); int rpoll(struct pollfd *fds, nfds_t nfds, int timeout); int rselect(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout); int rgetpeername(int socket, struct sockaddr *addr, socklen_t *addrlen); int rgetsockname(int socket, struct sockaddr *addr, socklen_t *addrlen); #define SOL_RDMA 0x10000 enum { RDMA_SQSIZE, RDMA_RQSIZE, RDMA_INLINE, RDMA_IOMAPSIZE, RDMA_ROUTE }; int rsetsockopt(int socket, int level, int optname, const void *optval, socklen_t optlen); int rgetsockopt(int socket, int level, int optname, void *optval, socklen_t *optlen); int rfcntl(int socket, int cmd, ... 
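
/*
 * The declarations above form a drop-in replacement for the BSD socket
 * calls, so a server follows the familiar pattern with an "r" prefix.
 * A minimal sketch ("addr" prepared as usual, error handling omitted):
 *
 *	int l = rsocket(AF_INET, SOCK_STREAM, 0);
 *	rbind(l, (struct sockaddr *) &addr, sizeof(addr));
 *	rlisten(l, 1);
 *	int c = raccept(l, NULL, NULL);
 *	rrecv(c, buf, sizeof(buf), 0);
 *	rclose(c);
 *	rclose(l);
 */
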
/* arg */ ); off_t riomap(int socket, void *buf, size_t len, int prot, int flags, off_t offset); int riounmap(int socket, void *buf, size_t len); size_t riowrite(int socket, const void *buf, size_t count, off_t offset, int flags); #ifdef __cplusplus } #endif #endif /* RSOCKET_H */ rdma-core-56.1/providers/000077500000000000000000000000001477342711600153475ustar00rootroot00000000000000rdma-core-56.1/providers/bnxt_re/000077500000000000000000000000001477342711600170105ustar00rootroot00000000000000rdma-core-56.1/providers/bnxt_re/CMakeLists.txt000066400000000000000000000000711477342711600215460ustar00rootroot00000000000000rdma_provider(bnxt_re db.c main.c memory.c verbs.c ) rdma-core-56.1/providers/bnxt_re/bnxt_re-abi.h000066400000000000000000000232611477342711600213570ustar00rootroot00000000000000/* * Broadcom NetXtreme-E User Space RoCE driver * * Copyright (c) 2015-2017, Broadcom. All rights reserved. The term * Broadcom refers to Broadcom Limited and/or its subsidiaries. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
* * Description: ABI data structure definition */ #ifndef __BNXT_RE_ABI_H__ #define __BNXT_RE_ABI_H__ #include #include #include #define BNXT_RE_FULL_FLAG_DELTA 0x80 DECLARE_DRV_CMD(ubnxt_re_pd, IB_USER_VERBS_CMD_ALLOC_PD, empty, bnxt_re_pd_resp); DECLARE_DRV_CMD(ubnxt_re_cq, IB_USER_VERBS_CMD_CREATE_CQ, bnxt_re_cq_req, bnxt_re_cq_resp); DECLARE_DRV_CMD(ubnxt_re_resize_cq, IB_USER_VERBS_CMD_RESIZE_CQ, bnxt_re_resize_cq_req, empty); DECLARE_DRV_CMD(ubnxt_re_qp, IB_USER_VERBS_CMD_CREATE_QP, bnxt_re_qp_req, bnxt_re_qp_resp); DECLARE_DRV_CMD(ubnxt_re_cntx, IB_USER_VERBS_CMD_GET_CONTEXT, bnxt_re_uctx_req, bnxt_re_uctx_resp); DECLARE_DRV_CMD(ubnxt_re_mr, IB_USER_VERBS_CMD_REG_MR, empty, empty); DECLARE_DRV_CMD(ubnxt_re_srq, IB_USER_VERBS_CMD_CREATE_SRQ, bnxt_re_srq_req, bnxt_re_srq_resp); enum bnxt_re_wr_opcode { BNXT_RE_WR_OPCD_SEND = 0x00, BNXT_RE_WR_OPCD_SEND_IMM = 0x01, BNXT_RE_WR_OPCD_SEND_INVAL = 0x02, BNXT_RE_WR_OPCD_RDMA_WRITE = 0x04, BNXT_RE_WR_OPCD_RDMA_WRITE_IMM = 0x05, BNXT_RE_WR_OPCD_RDMA_READ = 0x06, BNXT_RE_WR_OPCD_ATOMIC_CS = 0x08, BNXT_RE_WR_OPCD_ATOMIC_FA = 0x0B, BNXT_RE_WR_OPCD_LOC_INVAL = 0x0C, BNXT_RE_WR_OPCD_BIND = 0x0E, BNXT_RE_WR_OPCD_RECV = 0x80, BNXT_RE_WR_OPCD_INVAL = 0xFF }; enum bnxt_re_wr_flags { BNXT_RE_WR_FLAGS_INLINE = 0x10, BNXT_RE_WR_FLAGS_SE = 0x08, BNXT_RE_WR_FLAGS_UC_FENCE = 0x04, BNXT_RE_WR_FLAGS_RD_FENCE = 0x02, BNXT_RE_WR_FLAGS_SIGNALED = 0x01 }; enum bnxt_re_wc_type { BNXT_RE_WC_TYPE_SEND = 0x00, BNXT_RE_WC_TYPE_RECV_RC = 0x01, BNXT_RE_WC_TYPE_RECV_UD = 0x02, BNXT_RE_WC_TYPE_RECV_RAW = 0x03, BNXT_RE_WC_TYPE_TERM = 0x0E, BNXT_RE_WC_TYPE_COFF = 0x0F }; #define BNXT_RE_WC_OPCD_RECV 0x80 enum bnxt_re_req_wc_status { BNXT_RE_REQ_ST_OK = 0x00, BNXT_RE_REQ_ST_BAD_RESP = 0x01, BNXT_RE_REQ_ST_LOC_LEN = 0x02, BNXT_RE_REQ_ST_LOC_QP_OP = 0x03, BNXT_RE_REQ_ST_PROT = 0x04, BNXT_RE_REQ_ST_MEM_OP = 0x05, BNXT_RE_REQ_ST_REM_INVAL = 0x06, BNXT_RE_REQ_ST_REM_ACC = 0x07, BNXT_RE_REQ_ST_REM_OP = 0x08, BNXT_RE_REQ_ST_RNR_NAK_XCED = 0x09, BNXT_RE_REQ_ST_TRNSP_XCED = 0x0A, BNXT_RE_REQ_ST_WR_FLUSH = 0x0B }; enum bnxt_re_rsp_wc_status { BNXT_RE_RSP_ST_OK = 0x00, BNXT_RE_RSP_ST_LOC_ACC = 0x01, BNXT_RE_RSP_ST_LOC_LEN = 0x02, BNXT_RE_RSP_ST_LOC_PROT = 0x03, BNXT_RE_RSP_ST_LOC_QP_OP = 0x04, BNXT_RE_RSP_ST_MEM_OP = 0x05, BNXT_RE_RSP_ST_REM_INVAL = 0x06, BNXT_RE_RSP_ST_WR_FLUSH = 0x07, BNXT_RE_RSP_ST_HW_FLUSH = 0x08 }; enum bnxt_re_hdr_offset { BNXT_RE_HDR_WT_MASK = 0xFF, BNXT_RE_HDR_FLAGS_MASK = 0xFF, BNXT_RE_HDR_FLAGS_SHIFT = 0x08, BNXT_RE_HDR_WS_MASK = 0xFF, BNXT_RE_HDR_WS_SHIFT = 0x10 }; enum bnxt_re_db_que_type { BNXT_RE_QUE_TYPE_SQ = 0x00, BNXT_RE_QUE_TYPE_RQ = 0x01, BNXT_RE_QUE_TYPE_SRQ = 0x02, BNXT_RE_QUE_TYPE_SRQ_ARM = 0x03, BNXT_RE_QUE_TYPE_CQ = 0x04, BNXT_RE_QUE_TYPE_CQ_ARMSE = 0x05, BNXT_RE_QUE_TYPE_CQ_ARMALL = 0x06, BNXT_RE_QUE_TYPE_CQ_ARMENA = 0x07, BNXT_RE_QUE_TYPE_SRQ_ARMENA = 0x08, BNXT_RE_QUE_TYPE_CQ_CUT_ACK = 0x09, BNXT_RE_PUSH_TYPE_START = 0x0C, BNXT_RE_PUSH_TYPE_END = 0x0D, BNXT_RE_QUE_TYPE_NULL = 0x0F }; enum bnxt_re_db_mask { BNXT_RE_DB_INDX_MASK = 0xFFFFFFUL, BNXT_RE_DB_PILO_MASK = 0x0FFUL, BNXT_RE_DB_PILO_SHIFT = 0x18, BNXT_RE_DB_QID_MASK = 0xFFFFFUL, BNXT_RE_DB_PIHI_MASK = 0xF00UL, BNXT_RE_DB_PIHI_SHIFT = 0x0C, /* Because mask is 0xF00 */ BNXT_RE_DB_TYP_MASK = 0x0FUL, BNXT_RE_DB_TYP_SHIFT = 0x1C, BNXT_RE_DB_VALID_SHIFT = 0x1A, BNXT_RE_DB_EPOCH_SHIFT = 0x18, BNXT_RE_DB_TOGGLE_SHIFT = 0x19, }; enum bnxt_re_psns_mask { BNXT_RE_PSNS_SPSN_MASK = 0xFFFFFF, BNXT_RE_PSNS_OPCD_MASK = 0xFF, BNXT_RE_PSNS_OPCD_SHIFT = 0x18, BNXT_RE_PSNS_NPSN_MASK = 0xFFFFFF, 
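
/*
 * The *_MASK and *_SHIFT pairs in this file all follow the same
 * packing convention: shift down first, then mask.  For example,
 * decoding the opcode/start-PSN word using the values defined just
 * above (a sketch, "psns" being a CPU-endian copy of opc_spsn):
 *
 *	uint8_t opcode = (psns >> BNXT_RE_PSNS_OPCD_SHIFT) &
 *			 BNXT_RE_PSNS_OPCD_MASK;
 *	uint32_t spsn = psns & BNXT_RE_PSNS_SPSN_MASK;
 */
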
BNXT_RE_PSNS_FLAGS_MASK = 0xFF, BNXT_RE_PSNS_FLAGS_SHIFT = 0x18 }; enum bnxt_re_msns_mask { BNXT_RE_SQ_MSN_SEARCH_START_PSN_MASK = 0xFFFFFFUL, BNXT_RE_SQ_MSN_SEARCH_START_PSN_SHIFT = 0, BNXT_RE_SQ_MSN_SEARCH_NEXT_PSN_MASK = 0xFFFFFF000000ULL, BNXT_RE_SQ_MSN_SEARCH_NEXT_PSN_SHIFT = 0x18, BNXT_RE_SQ_MSN_SEARCH_START_IDX_MASK = 0xFFFF000000000000ULL, BNXT_RE_SQ_MSN_SEARCH_START_IDX_SHIFT = 0x30 }; enum bnxt_re_bcqe_mask { BNXT_RE_BCQE_PH_MASK = 0x01, BNXT_RE_BCQE_TYPE_MASK = 0x0F, BNXT_RE_BCQE_TYPE_SHIFT = 0x01, BNXT_RE_BCQE_RESIZE_TOG_MASK = 0x03, BNXT_RE_BCQE_RESIZE_TOG_SHIFT = 0x05, BNXT_RE_BCQE_STATUS_MASK = 0xFF, BNXT_RE_BCQE_STATUS_SHIFT = 0x08, BNXT_RE_BCQE_FLAGS_MASK = 0xFFFFU, BNXT_RE_BCQE_FLAGS_SHIFT = 0x10, BNXT_RE_BCQE_RWRID_MASK = 0xFFFFFU, BNXT_RE_BCQE_SRCQP_MASK = 0xFF, BNXT_RE_BCQE_SRCQP_SHIFT = 0x18 }; enum bnxt_re_rc_flags_mask { BNXT_RE_RC_FLAGS_SRQ_RQ_MASK = 0x01, BNXT_RE_RC_FLAGS_IMM_MASK = 0x02, BNXT_RE_RC_FLAGS_IMM_SHIFT = 0x01, BNXT_RE_RC_FLAGS_INV_MASK = 0x04, BNXT_RE_RC_FLAGS_INV_SHIFT = 0x02, BNXT_RE_RC_FLAGS_RDMA_MASK = 0x08, BNXT_RE_RC_FLAGS_RDMA_SHIFT = 0x03 }; enum bnxt_re_ud_flags_mask { BNXT_RE_UD_FLAGS_SRQ_RQ_SFT = 0x00, BNXT_RE_UD_FLAGS_SRQ_RQ_MASK = 0x01, BNXT_RE_UD_FLAGS_IMM_MASK = 0x02, BNXT_RE_UD_FLAGS_IMM_SFT = 0x01, BNXT_RE_UD_FLAGS_IP_VER_MASK = 0x30, BNXT_RE_UD_FLAGS_IP_VER_SFT = 0x4, BNXT_RE_UD_FLAGS_META_MASK = 0x3C0, BNXT_RE_UD_FLAGS_META_SFT = 0x6, BNXT_RE_UD_FLAGS_EXT_META_MASK = 0xC00, BNXT_RE_UD_FLAGS_EXT_META_SFT = 0x10, }; enum bnxt_re_ud_cqe_mask { BNXT_RE_UD_CQE_MAC_MASK = 0xFFFFFFFFFFFFULL, BNXT_RE_UD_CQE_SRCQPLO_MASK = 0xFFFF, BNXT_RE_UD_CQE_SRCQPLO_SHIFT = 0x30, BNXT_RE_UD_CQE_LEN_MASK = 0x3FFFU, }; enum { BNXT_RE_COMP_MASK_UCNTX_WC_DPI_ENABLED = 0x01, BNXT_RE_COMP_MASK_UCNTX_DBR_PACING_ENABLED = 0x02, BNXT_RE_COMP_MASK_UCNTX_POW2_DISABLED = 0x04, BNXT_RE_COMP_MASK_UCNTX_MSN_TABLE_ENABLED = 0x08, }; enum bnxt_re_que_flags_mask { BNXT_RE_FLAG_EPOCH_TAIL_SHIFT = 0x0UL, BNXT_RE_FLAG_EPOCH_HEAD_SHIFT = 0x1UL, BNXT_RE_FLAG_EPOCH_TAIL_MASK = 0x1UL, BNXT_RE_FLAG_EPOCH_HEAD_MASK = 0x2UL, }; enum bnxt_re_db_epoch_flag_shift { BNXT_RE_DB_EPOCH_TAIL_SHIFT = BNXT_RE_DB_EPOCH_SHIFT, BNXT_RE_DB_EPOCH_HEAD_SHIFT = (BNXT_RE_DB_EPOCH_SHIFT - 1) }; #define BNXT_RE_STATIC_WQE_MAX_SGE 0x06 enum bnxt_re_modes { BNXT_RE_WQE_MODE_STATIC = 0x00, BNXT_RE_WQE_MODE_VARIABLE = 0x01 }; struct bnxt_re_db_hdr { __le32 indx; __le32 typ_qid; /* typ: 4, qid:20*/ }; struct bnxt_re_bcqe { __le32 flg_st_typ_ph; __le32 qphi_rwrid; }; struct bnxt_re_req_cqe { __le64 qp_handle; __le32 con_indx; /* 16 bits valid. 
*/ __le32 rsvd1; __le64 rsvd2; }; struct bnxt_re_rc_cqe { __le32 length; __le32 imm_key; __le64 qp_handle; __le64 mr_handle; }; struct bnxt_re_ud_cqe { __le32 length; /* 14 bits */ __le32 immd; __le64 qp_handle; __le64 qplo_mac; /* 16:48*/ }; struct bnxt_re_term_cqe { __le64 qp_handle; __le32 rq_sq_cidx; __le32 rsvd; __le64 rsvd1; }; union lower_shdr { __le64 qkey_len; __le64 lkey_plkey; __le64 rva; }; struct bnxt_re_bsqe { __le32 rsv_ws_fl_wt; union { __be32 imm_data; __le32 key_immd; }; union lower_shdr lhdr; }; struct bnxt_re_psns { __le32 opc_spsn; __le32 flg_npsn; }; struct bnxt_re_psns_ext { __u32 opc_spsn; __u32 flg_npsn; __u16 st_slot_idx; __u16 rsvd0; __u32 rsvd1; }; /* sq_msn_search (size:64b/8B) */ struct bnxt_re_msns { __le64 start_idx_next_psn_start_psn; }; struct bnxt_re_sge { __le64 pa; __le32 lkey; __le32 length; }; /* Cu+ max inline data */ #define BNXT_RE_MAX_INLINE_SIZE 0x60 struct bnxt_re_send { __le32 dst_qp; __le32 avid; __le64 rsvd; }; struct bnxt_re_raw { __le32 cfa_meta; __le32 rsvd2; __le64 rsvd3; }; struct bnxt_re_rdma { __le64 rva; __le32 rkey; __le32 rsvd2; }; struct bnxt_re_atomic { __le64 swp_dt; __le64 cmp_dt; }; struct bnxt_re_inval { __le64 rsvd[2]; }; struct bnxt_re_bind { __le64 va; __le64 len; /* only 40 bits are valid */ }; struct bnxt_re_brqe { __le32 rsv_ws_fl_wt; __le32 rsvd; __le32 wrid; __le32 rsvd1; }; struct bnxt_re_rqe { __le64 rsvd[2]; }; struct bnxt_re_srqe { __le64 rsvd[2]; }; struct bnxt_re_push_wqe { __u64 addr[32]; }; #endif rdma-core-56.1/providers/bnxt_re/db.c000066400000000000000000000250351477342711600175460ustar00rootroot00000000000000/* * Broadcom NetXtreme-E User Space RoCE driver * * Copyright (c) 2015-2017, Broadcom. All rights reserved. The term * Broadcom refers to Broadcom Limited and/or its subsidiaries. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Description: Doorbell handling functions. 
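 *
 * Every doorbell rung in this file is a single 8-byte MMIO write built
 * from struct bnxt_re_db_hdr: the low word carries the producer index
 * plus the epoch/toggle bits, the high word carries the queue id, the
 * queue type and a valid bit.  In outline (mirroring
 * bnxt_re_init_db_hdr() below):
 *
 *	hdr.indx    = htole32(indx | toggle << BNXT_RE_DB_TOGGLE_SHIFT);
 *	hdr.typ_qid = htole32((qid & BNXT_RE_DB_QID_MASK) |
 *			      ((typ & BNXT_RE_DB_TYP_MASK) <<
 *			       BNXT_RE_DB_TYP_SHIFT) |
 *			      (0x1UL << BNXT_RE_DB_VALID_SHIFT));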
*/ #include #include #include #include "main.h" static uint16_t rnd(struct xorshift32_state *state, uint16_t range) { /* range must be a power of 2 - 1 */ return (xorshift32(state) & range); } static int calculate_fifo_occupancy(struct bnxt_re_context *cntx) { struct bnxt_re_pacing_data *pacing_data = (struct bnxt_re_pacing_data *)cntx->dbr_page; struct bnxt_re_dev *rdev = cntx->rdev; uint32_t read_val, fifo_occup; uint64_t fifo_reg_off; uint32_t *dbr_map; fifo_reg_off = pacing_data->grc_reg_offset & ~(BNXT_RE_PAGE_MASK(rdev->pg_size)); dbr_map = cntx->bar_map + fifo_reg_off; read_val = *dbr_map; fifo_occup = pacing_data->fifo_max_depth - ((read_val & pacing_data->fifo_room_mask) >> pacing_data->fifo_room_shift); return fifo_occup; } static void bnxt_re_do_pacing(struct bnxt_re_context *cntx, struct xorshift32_state *state) { struct bnxt_re_pacing_data *pacing_data = (struct bnxt_re_pacing_data *)cntx->dbr_page; uint32_t fifo_occup; int wait_time = 1; if (!pacing_data) return; if (rnd(state, BNXT_RE_MAX_DO_PACING) < pacing_data->do_pacing) { while ((fifo_occup = calculate_fifo_occupancy(cntx)) > pacing_data->pacing_th) { uint32_t usec_wait; if (pacing_data->alarm_th && fifo_occup > pacing_data->alarm_th) bnxt_re_notify_drv(&cntx->ibvctx.context); usec_wait = rnd(state, wait_time - 1); if (usec_wait) bnxt_re_sub_sec_busy_wait(usec_wait * 1000); /* wait time capped at 128 us */ wait_time = min(wait_time * 2, 128); } } } static void bnxt_re_ring_db(struct bnxt_re_dpi *dpi, struct bnxt_re_db_hdr *hdr) { __le64 *dbval; dbval = (__le64 *)&hdr->indx; mmio_wc_start(); mmio_write64_le(dpi->dbpage, *dbval); mmio_flush_writes(); } static void bnxt_re_init_db_hdr(struct bnxt_re_db_hdr *hdr, uint32_t indx, uint32_t qid, uint32_t toggle, uint32_t typ) { hdr->indx = htole32(indx | toggle << BNXT_RE_DB_TOGGLE_SHIFT); hdr->typ_qid = htole32(qid & BNXT_RE_DB_QID_MASK); hdr->typ_qid |= htole32(((typ & BNXT_RE_DB_TYP_MASK) << BNXT_RE_DB_TYP_SHIFT) | (0x1UL << BNXT_RE_DB_VALID_SHIFT)); } void bnxt_re_ring_rq_db(struct bnxt_re_qp *qp) { struct bnxt_re_db_hdr hdr; uint32_t epoch; uint32_t tail; bnxt_re_do_pacing(qp->cntx, &qp->rand); tail = *qp->jrqq->hwque->dbtail; epoch = (qp->jrqq->hwque->flags & BNXT_RE_FLAG_EPOCH_TAIL_MASK) << BNXT_RE_DB_EPOCH_TAIL_SHIFT; bnxt_re_init_db_hdr(&hdr, tail | epoch, qp->qpid, 0, BNXT_RE_QUE_TYPE_RQ); bnxt_re_ring_db(qp->udpi, &hdr); } void bnxt_re_ring_sq_db(struct bnxt_re_qp *qp) { struct bnxt_re_db_hdr hdr; uint32_t epoch; uint32_t tail; bnxt_re_do_pacing(qp->cntx, &qp->rand); tail = *qp->jsqq->hwque->dbtail; epoch = (qp->jsqq->hwque->flags & BNXT_RE_FLAG_EPOCH_TAIL_MASK) << BNXT_RE_DB_EPOCH_TAIL_SHIFT; bnxt_re_init_db_hdr(&hdr, tail | epoch, qp->qpid, 0, BNXT_RE_QUE_TYPE_SQ); bnxt_re_ring_db(qp->udpi, &hdr); } void bnxt_re_ring_srq_db(struct bnxt_re_srq *srq) { struct bnxt_re_db_hdr hdr; uint32_t epoch; bnxt_re_do_pacing(srq->cntx, &srq->rand); epoch = (srq->srqq->flags & BNXT_RE_FLAG_EPOCH_TAIL_MASK) << BNXT_RE_DB_EPOCH_TAIL_SHIFT; bnxt_re_init_db_hdr(&hdr, srq->srqq->tail | epoch, srq->srqid, 0, BNXT_RE_QUE_TYPE_SRQ); bnxt_re_ring_db(srq->udpi, &hdr); } void bnxt_re_ring_srq_arm(struct bnxt_re_srq *srq) { uint32_t *pgptr, toggle = 0; struct bnxt_re_db_hdr hdr; pgptr = (uint32_t *)srq->toggle_map; if (pgptr) toggle = *pgptr; bnxt_re_do_pacing(srq->cntx, &srq->rand); bnxt_re_init_db_hdr(&hdr, srq->cap.srq_limit, srq->srqid, toggle, BNXT_RE_QUE_TYPE_SRQ_ARM); bnxt_re_ring_db(srq->udpi, &hdr); } /* * During CQ resize, it is expected that the epoch needs to be maintained when * 
switching from the old CQ to the new resized CQ.
 *
 * On the first CQ DB executed on the new CQ, we need to check if the index we
 * are writing is less than the last index written for the old CQ. If that is
 * the case, we need to flip the epoch so the ASIC does not get confused and
 * think the CQ DB is out of order and therefore drop the DB (note the logic
 * in the ASIC that checks CQ DB ordering is not aware of the CQ resize).
 */
static void bnxt_re_cq_resize_check(struct bnxt_re_queue *cqq)
{
	if (unlikely(cqq->cq_resized)) {
		if (cqq->head < cqq->old_head)
			cqq->flags ^= 1UL << BNXT_RE_FLAG_EPOCH_HEAD_SHIFT;
		cqq->cq_resized = false;
	}
}

void bnxt_re_ring_cq_db(struct bnxt_re_cq *cq)
{
	struct bnxt_re_db_hdr hdr;
	uint32_t epoch;

	bnxt_re_do_pacing(cq->cntx, &cq->rand);
	bnxt_re_cq_resize_check(cq->cqq);
	epoch = (cq->cqq->flags & BNXT_RE_FLAG_EPOCH_HEAD_MASK) <<
		BNXT_RE_DB_EPOCH_HEAD_SHIFT;
	bnxt_re_init_db_hdr(&hdr, cq->cqq->head | epoch, cq->cqid, 0,
			    BNXT_RE_QUE_TYPE_CQ);
	bnxt_re_ring_db(cq->udpi, &hdr);
}

void bnxt_re_ring_cq_arm_db(struct bnxt_re_cq *cq, uint8_t aflag)
{
	uint32_t epoch, toggle = 0;
	struct bnxt_re_db_hdr hdr;
	uint32_t *pgptr;

	if (aflag == BNXT_RE_QUE_TYPE_CQ_CUT_ACK) {
		toggle = cq->resize_tog;
	} else {
		pgptr = (uint32_t *)cq->toggle_map;
		if (pgptr)
			toggle = *pgptr;
	}
	bnxt_re_do_pacing(cq->cntx, &cq->rand);
	bnxt_re_cq_resize_check(cq->cqq);
	epoch = (cq->cqq->flags & BNXT_RE_FLAG_EPOCH_HEAD_MASK) <<
		BNXT_RE_DB_EPOCH_HEAD_SHIFT;
	bnxt_re_init_db_hdr(&hdr, cq->cqq->head | epoch, cq->cqid, toggle, aflag);
	bnxt_re_ring_db(cq->udpi, &hdr);
}

void bnxt_re_ring_pstart_db(struct bnxt_re_qp *qp,
			    struct bnxt_re_push_buffer *pbuf)
{
	uint64_t key;

	bnxt_re_do_pacing(qp->cntx, &qp->rand);
	key = ((((pbuf->wcdpi & BNXT_RE_DB_PIHI_MASK) <<
		 BNXT_RE_DB_PIHI_SHIFT) |
		(pbuf->qpid & BNXT_RE_DB_QID_MASK)) |
	       ((BNXT_RE_PUSH_TYPE_START & BNXT_RE_DB_TYP_MASK) <<
		BNXT_RE_DB_TYP_SHIFT) |
	       (0x1UL << BNXT_RE_DB_VALID_SHIFT));
	key <<= 32;
	key |= ((((__u32)pbuf->wcdpi & BNXT_RE_DB_PILO_MASK) <<
		 BNXT_RE_DB_PILO_SHIFT) |
		(pbuf->st_idx & BNXT_RE_DB_INDX_MASK));
	udma_to_device_barrier();
	mmio_write64((uintptr_t *)pbuf->ucdb, key);
}

void bnxt_re_ring_pend_db(struct bnxt_re_qp *qp,
			  struct bnxt_re_push_buffer *pbuf)
{
	uint64_t key;

	bnxt_re_do_pacing(qp->cntx, &qp->rand);
	key = ((((pbuf->wcdpi & BNXT_RE_DB_PIHI_MASK) <<
		 BNXT_RE_DB_PIHI_SHIFT) |
		(pbuf->qpid & BNXT_RE_DB_QID_MASK)) |
	       ((BNXT_RE_PUSH_TYPE_END & BNXT_RE_DB_TYP_MASK) <<
		BNXT_RE_DB_TYP_SHIFT) |
	       (0x1UL << BNXT_RE_DB_VALID_SHIFT));
	key <<= 32;
	key |= ((((__u32)pbuf->wcdpi & BNXT_RE_DB_PILO_MASK) <<
		 BNXT_RE_DB_PILO_SHIFT) |
		(pbuf->tail & BNXT_RE_DB_INDX_MASK));
	udma_to_device_barrier();
	mmio_write64((uintptr_t *)pbuf->ucdb, key);
}

void bnxt_re_fill_push_wcb(struct bnxt_re_qp *qp,
			   struct bnxt_re_push_buffer *pbuf, uint32_t idx)
{
	bnxt_re_ring_pstart_db(qp, pbuf);
	mmio_wc_start();
	bnxt_re_copy_data_to_pb(pbuf, 0, idx);
	/* Flush WQE write before push end db.
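	 * The ordering here matters: mmio_wc_start() opened the
	 * write-combining sequence before the WQE copy above, and the
	 * flush below guarantees the pushed WQE bytes reach the device
	 * before the end doorbell that tells it to consume them.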
*/ mmio_flush_writes(); bnxt_re_ring_pend_db(qp, pbuf); } int bnxt_re_init_pbuf_list(struct bnxt_re_context *ucntx) { struct bnxt_re_push_buffer *pbuf; int indx, wqesz; int size, offt; uint64_t wcpage; uint64_t dbpage; void *base; size = (sizeof(*ucntx->pbrec) + 16 * (sizeof(*ucntx->pbrec->pbuf) + sizeof(struct bnxt_re_push_wqe))); ucntx->pbrec = calloc(1, size); if (!ucntx->pbrec) goto out; offt = sizeof(*ucntx->pbrec); base = ucntx->pbrec; ucntx->pbrec->pbuf = (base + offt); ucntx->pbrec->pbmap = ~0x00; ucntx->pbrec->pbmap &= ~0x7fff; /* 15 bits */ ucntx->pbrec->udpi = &ucntx->udpi; wqesz = sizeof(struct bnxt_re_push_wqe); wcpage = (uintptr_t)(ucntx->udpi.wcdbpg); dbpage = (uintptr_t)(ucntx->udpi.dbpage); offt = sizeof(*ucntx->pbrec->pbuf) * 16; base = (char *)ucntx->pbrec->pbuf + offt; for (indx = 0; indx < 16; indx++) { pbuf = &ucntx->pbrec->pbuf[indx]; pbuf->wqe = base + indx * wqesz; pbuf->pbuf = (uintptr_t)(wcpage + indx * wqesz); pbuf->ucdb = (uintptr_t)(dbpage + (indx + 1) * sizeof(uint64_t)); pbuf->wcdpi = ucntx->udpi.wcdpi; } return 0; out: return -ENOMEM; } struct bnxt_re_push_buffer *bnxt_re_get_pbuf(uint8_t *push_st_en, struct bnxt_re_context *cntx) { struct bnxt_re_push_buffer *pbuf = NULL; __u32 old; int bit; old = cntx->pbrec->pbmap; while ((bit = __builtin_ffs(~cntx->pbrec->pbmap)) != 0) { if (__sync_bool_compare_and_swap (&cntx->pbrec->pbmap, old, (old | 0x01 << (bit - 1)))) break; old = cntx->pbrec->pbmap; } if (bit) { pbuf = &cntx->pbrec->pbuf[bit]; pbuf->nbit = bit; } return pbuf; } void bnxt_re_put_pbuf(struct bnxt_re_context *cntx, struct bnxt_re_push_buffer *pbuf) { struct bnxt_re_push_rec *pbrec; __u32 old; int bit; pbrec = cntx->pbrec; if (pbuf->nbit) { bit = pbuf->nbit; pbuf->nbit = 0; old = pbrec->pbmap; while (!__sync_bool_compare_and_swap(&pbrec->pbmap, old, (old & (~(0x01 << (bit - 1)))))) old = pbrec->pbmap; } } void bnxt_re_destroy_pbuf_list(struct bnxt_re_context *cntx) { free(cntx->pbrec); } rdma-core-56.1/providers/bnxt_re/flush.h000066400000000000000000000053601477342711600203060ustar00rootroot00000000000000/* * Broadcom NetXtreme-E User Space RoCE driver * * Copyright (c) 2015-2017, Broadcom. All rights reserved. The term * Broadcom refers to Broadcom Limited and/or its subsidiaries. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Description: A few wrappers for flush queue management */ #ifndef __FLUSH_H__ #define __FLUSH_H__ #include struct bnxt_re_fque_node { uint8_t valid; struct list_node list; }; static inline void fque_init_node(struct bnxt_re_fque_node *node) { list_node_init(&node->list); node->valid = false; } static inline void fque_add_node_tail(struct list_head *head, struct bnxt_re_fque_node *new) { list_add_tail(head, &new->list); new->valid = true; } static inline void fque_del_node(struct bnxt_re_fque_node *entry) { entry->valid = false; list_del(&entry->list); } static inline uint8_t _fque_node_valid(struct bnxt_re_fque_node *node) { return node->valid; } static inline void bnxt_re_fque_add_node(struct list_head *head, struct bnxt_re_fque_node *node) { if (!_fque_node_valid(node)) fque_add_node_tail(head, node); } static inline void bnxt_re_fque_del_node(struct bnxt_re_fque_node *node) { if (_fque_node_valid(node)) fque_del_node(node); } #endif /* __FLUSH_H__ */ rdma-core-56.1/providers/bnxt_re/main.c000066400000000000000000000242251477342711600201050ustar00rootroot00000000000000/* * Broadcom NetXtreme-E User Space RoCE driver * * Copyright (c) 2015-2017, Broadcom. All rights reserved. The term * Broadcom refers to Broadcom Limited and/or its subsidiaries. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
*
 * Description: Device detection and initialization
 */
#include #include #include #include #include #include #include #include #include #include #include "main.h" #include "verbs.h"

static void bnxt_re_free_context(struct ibv_context *ibvctx);

#define PCI_VENDOR_ID_BROADCOM 0x14E4
#define CNA(v, d) VERBS_PCI_MATCH(PCI_VENDOR_ID_##v, d, NULL)
static const struct verbs_match_ent cna_table[] = {
	VERBS_DRIVER_ID(RDMA_DRIVER_BNXT_RE),
	CNA(BROADCOM, 0x1605),	/* BCM57454 NPAR */
	CNA(BROADCOM, 0x1606),	/* BCM57454 VF */
	CNA(BROADCOM, 0x1614),	/* BCM57454 */
	CNA(BROADCOM, 0x16C0),	/* BCM57417 NPAR */
	CNA(BROADCOM, 0x16C1),	/* BCM57414 VF */
	CNA(BROADCOM, 0x16CE),	/* BCM57311 */
	CNA(BROADCOM, 0x16CF),	/* BCM57312 */
	CNA(BROADCOM, 0x16D6),	/* BCM57412 */
	CNA(BROADCOM, 0x16D7),	/* BCM57414 */
	CNA(BROADCOM, 0x16D8),	/* BCM57416 Cu */
	CNA(BROADCOM, 0x16D9),	/* BCM57417 Cu */
	CNA(BROADCOM, 0x16DF),	/* BCM57314 */
	CNA(BROADCOM, 0x16E2),	/* BCM57417 */
	CNA(BROADCOM, 0x16E3),	/* BCM57416 */
	CNA(BROADCOM, 0x16E5),	/* BCM57314 VF */
	CNA(BROADCOM, 0x16ED),	/* BCM57414 NPAR */
	CNA(BROADCOM, 0x16EB),	/* BCM57412 NPAR */
	CNA(BROADCOM, 0x16EF),	/* BCM57416 NPAR */
	CNA(BROADCOM, 0x16F0),	/* BCM58730 */
	CNA(BROADCOM, 0x16F1),	/* BCM57452 */
	CNA(BROADCOM, 0x1750),	/* BCM57508 */
	CNA(BROADCOM, 0x1751),	/* BCM57504 */
	CNA(BROADCOM, 0x1752),	/* BCM57502 */
	CNA(BROADCOM, 0x1803),	/* BCM57508 NPAR */
	CNA(BROADCOM, 0x1804),	/* BCM57504 NPAR */
	CNA(BROADCOM, 0x1805),	/* BCM57502 NPAR */
	CNA(BROADCOM, 0x1807),	/* BCM5750x VF */
	CNA(BROADCOM, 0x1809),	/* BCM5750x Gen P5 VF HV */
	CNA(BROADCOM, 0xD800),	/* BCM880xx VF */
	CNA(BROADCOM, 0xD802),	/* BCM58802 */
	CNA(BROADCOM, 0xD804),	/* BCM8804 SR */
	{}
};

static const struct verbs_context_ops bnxt_re_cntx_ops = {
	.query_device_ex = bnxt_re_query_device,
	.query_port = bnxt_re_query_port,
	.alloc_pd = bnxt_re_alloc_pd,
	.dealloc_pd = bnxt_re_free_pd,
	.reg_mr = bnxt_re_reg_mr,
	.reg_dmabuf_mr = bnxt_re_reg_dmabuf_mr,
	.dereg_mr = bnxt_re_dereg_mr,
	.create_cq = bnxt_re_create_cq,
	.poll_cq = bnxt_re_poll_cq,
	.req_notify_cq = bnxt_re_arm_cq,
	.resize_cq = bnxt_re_resize_cq,
	.destroy_cq = bnxt_re_destroy_cq,
	.create_srq = bnxt_re_create_srq,
	.modify_srq = bnxt_re_modify_srq,
	.query_srq = bnxt_re_query_srq,
	.destroy_srq = bnxt_re_destroy_srq,
	.post_srq_recv = bnxt_re_post_srq_recv,
	.create_qp = bnxt_re_create_qp,
	.query_qp = bnxt_re_query_qp,
	.modify_qp = bnxt_re_modify_qp,
	.destroy_qp = bnxt_re_destroy_qp,
	.post_send = bnxt_re_post_send,
	.post_recv = bnxt_re_post_recv,
	.async_event = bnxt_re_async_event,
	.create_ah = bnxt_re_create_ah,
	.destroy_ah = bnxt_re_destroy_ah,
	.free_context = bnxt_re_free_context,
	.create_qp_ex = bnxt_re_create_qp_ex,
};

static inline bool bnxt_re_is_chip_gen_p7(struct bnxt_re_chip_ctx *cctx)
{
	return (cctx->chip_num == CHIP_NUM_58818 ||
		cctx->chip_num == CHIP_NUM_57608);
}

static bool bnxt_re_is_chip_gen_p5(struct bnxt_re_chip_ctx *cctx)
{
	return (cctx->chip_num == CHIP_NUM_57508 ||
		cctx->chip_num == CHIP_NUM_57504 ||
		cctx->chip_num == CHIP_NUM_57502);
}

static inline bool bnxt_re_is_chip_gen_p5_p7(struct bnxt_re_chip_ctx *cctx)
{
	return bnxt_re_is_chip_gen_p5(cctx) || bnxt_re_is_chip_gen_p7(cctx);
}

static int bnxt_re_alloc_map_dbr_page(struct ibv_context *ibvctx)
{
	struct bnxt_re_context *cntx = to_bnxt_re_context(ibvctx);
	struct bnxt_re_mmap_info minfo = {};
	int ret;

	minfo.type = BNXT_RE_ALLOC_DBR_PAGE;
	ret = bnxt_re_alloc_page(ibvctx, &minfo, NULL);
	if (ret)
		return ret;

	cntx->dbr_page = mmap(NULL, minfo.alloc_size, PROT_READ,
			      MAP_SHARED, ibvctx->cmd_fd,
minfo.alloc_offset); if (cntx->dbr_page == MAP_FAILED) return -ENOMEM; return 0; } static int bnxt_re_alloc_map_dbr_bar_page(struct ibv_context *ibvctx) { struct bnxt_re_context *cntx = to_bnxt_re_context(ibvctx); struct bnxt_re_mmap_info minfo = {}; int ret; minfo.type = BNXT_RE_ALLOC_DBR_BAR_PAGE; ret = bnxt_re_alloc_page(ibvctx, &minfo, NULL); if (ret) return ret; cntx->bar_map = mmap(NULL, minfo.alloc_size, PROT_WRITE, MAP_SHARED, ibvctx->cmd_fd, minfo.alloc_offset); if (cntx->bar_map == MAP_FAILED) return -ENOMEM; return 0; } /* Context Init functions */ static struct verbs_context *bnxt_re_alloc_context(struct ibv_device *vdev, int cmd_fd, void *private_data) { struct bnxt_re_dev *rdev = to_bnxt_re_dev(vdev); struct ubnxt_re_cntx_resp resp = {}; struct ubnxt_re_cntx req = {}; struct bnxt_re_context *cntx; int ret; cntx = verbs_init_and_alloc_context(vdev, cmd_fd, cntx, ibvctx, RDMA_DRIVER_BNXT_RE); if (!cntx) return NULL; req.comp_mask |= BNXT_RE_COMP_MASK_REQ_UCNTX_POW2_SUPPORT; req.comp_mask |= BNXT_RE_COMP_MASK_REQ_UCNTX_VAR_WQE_SUPPORT; if (ibv_cmd_get_context(&cntx->ibvctx, &req.ibv_cmd, sizeof(req), &resp.ibv_resp, sizeof(resp))) goto failed; cntx->dev_id = resp.dev_id; cntx->max_qp = resp.max_qp; rdev->pg_size = resp.pg_size; rdev->cqe_size = resp.cqe_sz; rdev->max_cq_depth = resp.max_cqd; if (resp.comp_mask & BNXT_RE_UCNTX_CMASK_HAVE_CCTX) { cntx->cctx.chip_num = resp.chip_id0 & 0xFFFF; cntx->cctx.chip_rev = (resp.chip_id0 >> BNXT_RE_CHIP_ID0_CHIP_REV_SFT) & 0xFF; cntx->cctx.chip_metal = (resp.chip_id0 >> BNXT_RE_CHIP_ID0_CHIP_MET_SFT) & 0xFF; cntx->cctx.gen_p5_p7 = bnxt_re_is_chip_gen_p5_p7(&cntx->cctx); } if (resp.comp_mask & BNXT_RE_UCNTX_CMASK_HAVE_MODE) cntx->wqe_mode = resp.mode; if (resp.comp_mask & BNXT_RE_UCNTX_CMASK_WC_DPI_ENABLED) cntx->comp_mask |= BNXT_RE_COMP_MASK_UCNTX_WC_DPI_ENABLED; if (resp.comp_mask & BNXT_RE_UCNTX_CMASK_DBR_PACING_ENABLED) cntx->comp_mask |= BNXT_RE_COMP_MASK_UCNTX_DBR_PACING_ENABLED; if (resp.comp_mask & BNXT_RE_UCNTX_CMASK_POW2_DISABLED) cntx->comp_mask |= BNXT_RE_COMP_MASK_UCNTX_POW2_DISABLED; if (resp.comp_mask & BNXT_RE_UCNTX_CMASK_MSN_TABLE_ENABLED) cntx->comp_mask |= BNXT_RE_COMP_MASK_UCNTX_MSN_TABLE_ENABLED; /* mmap shared page. */ cntx->shpg = mmap(NULL, rdev->pg_size, PROT_READ | PROT_WRITE, MAP_SHARED, cmd_fd, 0); if (cntx->shpg == MAP_FAILED) { cntx->shpg = NULL; goto failed; } if (cntx->comp_mask & BNXT_RE_COMP_MASK_UCNTX_DBR_PACING_ENABLED) { if (bnxt_re_alloc_map_dbr_page(&cntx->ibvctx.context)) { munmap(cntx->shpg, rdev->pg_size); cntx->shpg = NULL; goto failed; } if (bnxt_re_alloc_map_dbr_bar_page(&cntx->ibvctx.context)) { munmap(cntx->shpg, rdev->pg_size); cntx->shpg = NULL; munmap(cntx->dbr_page, rdev->pg_size); cntx->dbr_page = NULL; goto failed; } } pthread_mutex_init(&cntx->shlock, NULL); verbs_set_ops(&cntx->ibvctx, &bnxt_re_cntx_ops); cntx->rdev = rdev; ret = ibv_query_device(&cntx->ibvctx.context, &rdev->devattr); if (ret) goto failed; return &cntx->ibvctx; failed: verbs_uninit_context(&cntx->ibvctx); free(cntx); return NULL; } static void bnxt_re_free_context(struct ibv_context *ibvctx) { struct bnxt_re_context *cntx = to_bnxt_re_context(ibvctx); struct bnxt_re_dev *rdev = to_bnxt_re_dev(ibvctx->device); /* Unmap if anything device specific was mapped in init_context. */ pthread_mutex_destroy(&cntx->shlock); if (cntx->shpg) munmap(cntx->shpg, rdev->pg_size); /* Un-map DPI only for the first PD that was * allocated in this context. 
*/ if (cntx->udpi.wcdbpg && cntx->udpi.wcdbpg != MAP_FAILED) { munmap(cntx->udpi.wcdbpg, rdev->pg_size); cntx->udpi.wcdbpg = NULL; } if (cntx->udpi.dbpage && cntx->udpi.dbpage != MAP_FAILED) { munmap(cntx->udpi.dbpage, rdev->pg_size); cntx->udpi.dbpage = NULL; } if (cntx->comp_mask & BNXT_RE_COMP_MASK_UCNTX_DBR_PACING_ENABLED) { munmap(cntx->dbr_page, rdev->pg_size); cntx->dbr_page = NULL; munmap(cntx->bar_map, rdev->pg_size); cntx->bar_map = NULL; } verbs_uninit_context(&cntx->ibvctx); free(cntx); } static struct verbs_device * bnxt_re_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct bnxt_re_dev *rdev; rdev = calloc(1, sizeof(*rdev)); if (!rdev) return NULL; return &rdev->vdev; } static const struct verbs_device_ops bnxt_re_dev_ops = { .name = "bnxt_re", .match_min_abi_version = BNXT_RE_ABI_VERSION, .match_max_abi_version = BNXT_RE_ABI_VERSION, .match_table = cna_table, .alloc_device = bnxt_re_device_alloc, .alloc_context = bnxt_re_alloc_context, }; PROVIDER_DRIVER(bnxt_re, bnxt_re_dev_ops); rdma-core-56.1/providers/bnxt_re/main.h000066400000000000000000000364371477342711600201220ustar00rootroot00000000000000/* * Broadcom NetXtreme-E User Space RoCE driver * * Copyright (c) 2015-2017, Broadcom. All rights reserved. The term * Broadcom refers to Broadcom Limited and/or its subsidiaries. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
* * Description: Basic device data structures needed for book-keeping */ #ifndef __MAIN_H__ #define __MAIN_H__ #include #include #include #include #include #include #include #include #include #include #include "bnxt_re-abi.h" #include "memory.h" #include "flush.h" #define DEV "bnxt_re : " #define BNXT_RE_UD_QP_HW_STALL 0x400000 #define CHIP_NUM_57508 0x1750 #define CHIP_NUM_57504 0x1751 #define CHIP_NUM_57502 0x1752 #define CHIP_NUM_58818 0xd818 #define CHIP_NUM_57608 0x1760 #define BNXT_RE_MAX_DO_PACING 0xFFFF #define BNXT_NSEC_PER_SEC 1000000000UL #define BNXT_RE_PAGE_MASK(pg_size) (~((__u64)(pg_size) - 1)) struct bnxt_re_chip_ctx { __u16 chip_num; __u8 chip_rev; __u8 chip_metal; __u8 gen_p5_p7; __u8 gen_p7; }; struct bnxt_re_dpi { __u32 dpindx; __u32 wcdpi; __u64 *dbpage; __u64 *wcdbpg; }; struct bnxt_re_pd { struct ibv_pd ibvpd; uint32_t pdid; }; struct bnxt_re_cq { struct ibv_cq ibvcq; uint32_t cqid; struct bnxt_re_context *cntx; struct bnxt_re_queue *cqq; struct bnxt_re_dpi *udpi; struct bnxt_re_mem *mem; struct bnxt_re_mem *resize_mem; struct list_head sfhead; struct list_head rfhead; struct list_head prev_cq_head; uint32_t cqe_size; uint8_t phase; struct xorshift32_state rand; uint32_t mem_handle; void *toggle_map; uint32_t toggle_size; uint8_t resize_tog; bool deffered_db_sup; uint32_t hw_cqes; }; struct bnxt_re_push_buffer { uintptr_t pbuf; /*push wc buffer */ uintptr_t *wqe; /* hwqe addresses */ uintptr_t ucdb; __u32 st_idx; __u32 qpid; __u16 wcdpi; __u16 nbit; __u32 tail; }; enum bnxt_re_push_info_mask { BNXT_RE_PUSH_SIZE_MASK = 0x1FUL, BNXT_RE_PUSH_SIZE_SHIFT = 0x18UL }; struct bnxt_re_db_ppp_hdr { struct bnxt_re_db_hdr db_hdr; __u64 rsv_psz_pidx; }; struct bnxt_re_push_rec { struct bnxt_re_dpi *udpi; struct bnxt_re_push_buffer *pbuf; __u32 pbmap; /* only 16 bits in use */ }; struct bnxt_re_wrid { struct bnxt_re_psns_ext *psns_ext; struct bnxt_re_psns *psns; uint64_t wrid; uint32_t bytes; int next_idx; uint32_t st_slot_idx; uint8_t slots; uint8_t sig; uint8_t wc_opcd; }; struct bnxt_re_qpcap { uint32_t max_swr; uint32_t max_rwr; uint32_t max_ssge; uint32_t max_rsge; uint32_t max_inline; uint8_t sqsig; uint8_t is_atomic_cap; }; struct bnxt_re_srq { struct ibv_srq ibvsrq; struct ibv_srq_attr cap; struct bnxt_re_context *cntx; struct bnxt_re_queue *srqq; struct bnxt_re_wrid *srwrid; struct bnxt_re_dpi *udpi; struct xorshift32_state rand; struct bnxt_re_mem *mem; uint32_t srqid; int start_idx; int last_idx; bool arm_req; uint32_t mem_handle; uint32_t toggle_size; void *toggle_map; }; struct bnxt_re_joint_queue { struct bnxt_re_context *cntx; struct bnxt_re_queue *hwque; struct bnxt_re_wrid *swque; uint32_t start_idx; uint32_t last_idx; }; /* WR API post send data */ struct bnxt_re_wr_send_qp { struct bnxt_re_bsqe *cur_hdr; struct bnxt_re_send *cur_sqe; uint32_t cur_wqe_cnt; uint32_t cur_slot_cnt; uint32_t cur_swq_idx; uint8_t cur_opcode; bool cur_push_wqe; unsigned int cur_push_size; int error; }; #define STATIC_WQE_NUM_SLOTS 8 #define SEND_SGE_MIN_SLOTS 3 #define MSG_LEN_ADJ_TO_BYTES 15 #define SLOTS_RSH_TO_NUM_WQE 4 struct bnxt_re_qp { struct verbs_qp vqp; struct ibv_qp *ibvqp; struct bnxt_re_chip_ctx *cctx; struct bnxt_re_context *cntx; struct xorshift32_state rand; struct bnxt_re_joint_queue *jsqq; struct bnxt_re_joint_queue *jrqq; struct bnxt_re_srq *srq; struct bnxt_re_cq *scq; struct bnxt_re_cq *rcq; struct bnxt_re_dpi *udpi; struct bnxt_re_qpcap cap; struct bnxt_re_fque_node snode; struct bnxt_re_fque_node rnode; uint32_t qpid; uint32_t tbl_indx; uint32_t sq_psn; uint32_t 
pending_db; void *pbuf; uint64_t wqe_cnt; uint16_t mtu; uint16_t qpst; uint32_t qpmode; uint8_t push_st_en; uint16_t max_push_sz; uint8_t qptyp; struct bnxt_re_mem *mem; struct bnxt_re_wr_send_qp wr_sq; }; struct bnxt_re_mr { struct verbs_mr vmr; }; struct bnxt_re_ah { struct ibv_ah ibvah; uint32_t avid; }; struct bnxt_re_dev { struct verbs_device vdev; uint8_t abi_version; uint32_t pg_size; uint32_t cqe_size; uint32_t max_cq_depth; struct ibv_device_attr devattr; }; struct bnxt_re_context { struct verbs_context ibvctx; struct bnxt_re_dev *rdev; uint32_t dev_id; uint32_t max_qp; struct bnxt_re_chip_ctx cctx; uint64_t comp_mask; uint32_t max_srq; struct bnxt_re_dpi udpi; void *shpg; uint32_t wqe_mode; pthread_mutex_t shlock; struct bnxt_re_push_rec *pbrec; uint32_t wc_handle; void *dbr_page; void *bar_map; }; struct bnxt_re_pacing_data { uint32_t do_pacing; uint32_t pacing_th; uint32_t alarm_th; uint32_t fifo_max_depth; uint32_t fifo_room_mask; uint32_t fifo_room_shift; uint32_t grc_reg_offset; }; struct bnxt_re_mmap_info { __u32 type; __u32 dpi; __u64 alloc_offset; __u32 alloc_size; __u32 pg_offset; __u32 res_id; }; /* DB ring functions used internally*/ void bnxt_re_ring_rq_db(struct bnxt_re_qp *qp); void bnxt_re_ring_sq_db(struct bnxt_re_qp *qp); void bnxt_re_ring_srq_arm(struct bnxt_re_srq *srq); void bnxt_re_ring_srq_db(struct bnxt_re_srq *srq); void bnxt_re_ring_cq_db(struct bnxt_re_cq *cq); void bnxt_re_ring_cq_arm_db(struct bnxt_re_cq *cq, uint8_t aflag); void bnxt_re_ring_pstart_db(struct bnxt_re_qp *qp, struct bnxt_re_push_buffer *pbuf); void bnxt_re_ring_pend_db(struct bnxt_re_qp *qp, struct bnxt_re_push_buffer *pbuf); void bnxt_re_fill_push_wcb(struct bnxt_re_qp *qp, struct bnxt_re_push_buffer *pbuf, uint32_t idx); int bnxt_re_init_pbuf_list(struct bnxt_re_context *cntx); void bnxt_re_destroy_pbuf_list(struct bnxt_re_context *cntx); struct bnxt_re_push_buffer *bnxt_re_get_pbuf(uint8_t *push_st_en, struct bnxt_re_context *cntx); void bnxt_re_put_pbuf(struct bnxt_re_context *cntx, struct bnxt_re_push_buffer *pbuf); int bnxt_re_alloc_page(struct ibv_context *ibvctx, struct bnxt_re_mmap_info *minfo, uint32_t *page_handle); int bnxt_re_notify_drv(struct ibv_context *ibvctx); int bnxt_re_get_toggle_mem(struct ibv_context *ibvctx, struct bnxt_re_mmap_info *minfo, uint32_t *page_handle); /* pointer conversion functions*/ static inline struct bnxt_re_dev *to_bnxt_re_dev(struct ibv_device *ibvdev) { return container_of(ibvdev, struct bnxt_re_dev, vdev.device); } static inline struct bnxt_re_context *to_bnxt_re_context( struct ibv_context *ibvctx) { return container_of(ibvctx, struct bnxt_re_context, ibvctx.context); } static inline struct bnxt_re_pd *to_bnxt_re_pd(struct ibv_pd *ibvpd) { return container_of(ibvpd, struct bnxt_re_pd, ibvpd); } static inline struct bnxt_re_cq *to_bnxt_re_cq(struct ibv_cq *ibvcq) { return container_of(ibvcq, struct bnxt_re_cq, ibvcq); } static inline struct bnxt_re_qp *to_bnxt_re_qp(struct ibv_qp *ibvqp) { struct verbs_qp *vqp = (struct verbs_qp *)ibvqp; return container_of(vqp, struct bnxt_re_qp, vqp); } static inline struct bnxt_re_srq *to_bnxt_re_srq(struct ibv_srq *ibvsrq) { return container_of(ibvsrq, struct bnxt_re_srq, ibvsrq); } static inline struct bnxt_re_ah *to_bnxt_re_ah(struct ibv_ah *ibvah) { return container_of(ibvah, struct bnxt_re_ah, ibvah); } static inline uint32_t bnxt_re_get_sqe_sz(void) { return sizeof(struct bnxt_re_bsqe) + sizeof(struct bnxt_re_send) + BNXT_RE_MAX_INLINE_SIZE; } static inline uint32_t bnxt_re_get_sqe_hdr_sz(void) { 
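/*
 * Editorial note (illustrative only): every to_bnxt_re_*() helper above is
 * the same container_of() pattern: subtract the offset of the embedded
 * ibv_* member to recover the enclosing provider object. A self-contained
 * sketch of the idiom:
 *
 *	#include <stddef.h>
 *
 *	#define demo_container_of(ptr, type, member) \
 *		((type *)((char *)(ptr) - offsetof(type, member)))
 *
 *	struct demo_pd {
 *		int pdid;
 *		struct ibv_pd ibvpd;	// embedded uverbs object
 *	};
 *
 *	// given 'p' pointing at a demo_pd's ibvpd member:
 *	// struct demo_pd *pd = demo_container_of(p, struct demo_pd, ibvpd);
 */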
return sizeof(struct bnxt_re_bsqe) + sizeof(struct bnxt_re_send);
}

static inline uint32_t bnxt_re_get_rqe_sz(void)
{
	return sizeof(struct bnxt_re_brqe) +
	       sizeof(struct bnxt_re_rqe) +
	       BNXT_RE_MAX_INLINE_SIZE;
}

static inline uint32_t bnxt_re_get_rqe_hdr_sz(void)
{
	return sizeof(struct bnxt_re_brqe) + sizeof(struct bnxt_re_rqe);
}

static inline uint32_t bnxt_re_get_srqe_sz(void)
{
	return sizeof(struct bnxt_re_brqe) +
	       sizeof(struct bnxt_re_srqe) +
	       BNXT_RE_MAX_INLINE_SIZE;
}

static inline uint32_t bnxt_re_get_srqe_hdr_sz(void)
{
	return sizeof(struct bnxt_re_brqe) + sizeof(struct bnxt_re_srqe);
}

static inline uint32_t bnxt_re_get_cqe_sz(void)
{
	return sizeof(struct bnxt_re_req_cqe) + sizeof(struct bnxt_re_bcqe);
}

static inline uint8_t bnxt_re_ibv_to_bnxt_wr_opcd(uint8_t ibv_opcd)
{
	uint8_t bnxt_opcd;

	switch (ibv_opcd) {
	case IBV_WR_SEND:
		bnxt_opcd = BNXT_RE_WR_OPCD_SEND;
		break;
	case IBV_WR_SEND_WITH_IMM:
		bnxt_opcd = BNXT_RE_WR_OPCD_SEND_IMM;
		break;
	case IBV_WR_RDMA_WRITE:
		bnxt_opcd = BNXT_RE_WR_OPCD_RDMA_WRITE;
		break;
	case IBV_WR_RDMA_WRITE_WITH_IMM:
		bnxt_opcd = BNXT_RE_WR_OPCD_RDMA_WRITE_IMM;
		break;
	case IBV_WR_RDMA_READ:
		bnxt_opcd = BNXT_RE_WR_OPCD_RDMA_READ;
		break;
	case IBV_WR_ATOMIC_CMP_AND_SWP:
		bnxt_opcd = BNXT_RE_WR_OPCD_ATOMIC_CS;
		break;
	case IBV_WR_ATOMIC_FETCH_AND_ADD:
		bnxt_opcd = BNXT_RE_WR_OPCD_ATOMIC_FA;
		break;
	/* TODO: Add other opcodes */
	default:
		bnxt_opcd = BNXT_RE_WR_OPCD_INVAL;
		break;
	}

	return bnxt_opcd;
}

static inline uint8_t bnxt_re_ibv_wr_to_wc_opcd(uint8_t wr_opcd)
{
	uint8_t wc_opcd;

	switch (wr_opcd) {
	case IBV_WR_SEND_WITH_IMM:
	case IBV_WR_SEND:
		wc_opcd = IBV_WC_SEND;
		break;
	case IBV_WR_RDMA_WRITE_WITH_IMM:
	case IBV_WR_RDMA_WRITE:
		wc_opcd = IBV_WC_RDMA_WRITE;
		break;
	case IBV_WR_RDMA_READ:
		wc_opcd = IBV_WC_RDMA_READ;
		break;
	case IBV_WR_ATOMIC_CMP_AND_SWP:
		wc_opcd = IBV_WC_COMP_SWAP;
		break;
	case IBV_WR_ATOMIC_FETCH_AND_ADD:
		wc_opcd = IBV_WC_FETCH_ADD;
		break;
	default:
		wc_opcd = 0xFF;
		break;
	}

	return wc_opcd;
}

static inline uint8_t bnxt_re_to_ibv_wc_status(uint8_t bnxt_wcst,
					       uint8_t is_req)
{
	uint8_t ibv_wcst;

	if (is_req) {
		switch (bnxt_wcst) {
		case BNXT_RE_REQ_ST_BAD_RESP:
			ibv_wcst = IBV_WC_BAD_RESP_ERR;
			break;
		case BNXT_RE_REQ_ST_LOC_LEN:
			ibv_wcst = IBV_WC_LOC_LEN_ERR;
			break;
		case BNXT_RE_REQ_ST_LOC_QP_OP:
			ibv_wcst = IBV_WC_LOC_QP_OP_ERR;
			break;
		case BNXT_RE_REQ_ST_PROT:
			ibv_wcst = IBV_WC_LOC_PROT_ERR;
			break;
		case BNXT_RE_REQ_ST_MEM_OP:
			ibv_wcst = IBV_WC_MW_BIND_ERR;
			break;
		case BNXT_RE_REQ_ST_REM_INVAL:
			ibv_wcst = IBV_WC_REM_INV_REQ_ERR;
			break;
		case BNXT_RE_REQ_ST_REM_ACC:
			ibv_wcst = IBV_WC_REM_ACCESS_ERR;
			break;
		case BNXT_RE_REQ_ST_REM_OP:
			ibv_wcst = IBV_WC_REM_OP_ERR;
			break;
		case BNXT_RE_REQ_ST_RNR_NAK_XCED:
			ibv_wcst = IBV_WC_RNR_RETRY_EXC_ERR;
			break;
		case BNXT_RE_REQ_ST_TRNSP_XCED:
			ibv_wcst = IBV_WC_RETRY_EXC_ERR;
			break;
		case BNXT_RE_REQ_ST_WR_FLUSH:
			ibv_wcst = IBV_WC_WR_FLUSH_ERR;
			break;
		default:
			ibv_wcst = IBV_WC_GENERAL_ERR;
			break;
		}
	} else {
		switch (bnxt_wcst) {
		case BNXT_RE_RSP_ST_LOC_ACC:
			ibv_wcst = IBV_WC_LOC_ACCESS_ERR;
			break;
		case BNXT_RE_RSP_ST_LOC_LEN:
			ibv_wcst = IBV_WC_LOC_LEN_ERR;
			break;
		case BNXT_RE_RSP_ST_LOC_PROT:
			ibv_wcst = IBV_WC_LOC_PROT_ERR;
			break;
		case BNXT_RE_RSP_ST_LOC_QP_OP:
			ibv_wcst = IBV_WC_LOC_QP_OP_ERR;
			break;
		case BNXT_RE_RSP_ST_MEM_OP:
			ibv_wcst = IBV_WC_MW_BIND_ERR;
			break;
		case BNXT_RE_RSP_ST_REM_INVAL:
			ibv_wcst = IBV_WC_REM_INV_REQ_ERR;
			break;
		case BNXT_RE_RSP_ST_WR_FLUSH:
			ibv_wcst = IBV_WC_WR_FLUSH_ERR;
			break;
		case BNXT_RE_RSP_ST_HW_FLUSH:
			ibv_wcst = IBV_WC_FATAL_ERR;
			break;
		default:
			ibv_wcst = IBV_WC_GENERAL_ERR;
			break;
		}
	}

	return
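/*
 * Editorial note (illustrative only): the two switch blocks above are a
 * pure status translation; the poll path additionally preserves the raw
 * device status in ibvwc->vendor_err. Usage sketch, assuming a
 * requester-side CQE that reported an exceeded RNR retry count:
 *
 *	uint8_t st = bnxt_re_to_ibv_wc_status(BNXT_RE_REQ_ST_RNR_NAK_XCED,
 *					      true);
 *	// st == IBV_WC_RNR_RETRY_EXC_ERR
 */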
ibv_wcst; } static inline uint8_t bnxt_re_is_cqe_valid(struct bnxt_re_cq *cq, struct bnxt_re_bcqe *hdr) { uint8_t valid = 0; valid = ((le32toh(hdr->flg_st_typ_ph) & BNXT_RE_BCQE_PH_MASK) == cq->phase); udma_from_device_barrier(); return valid; } static inline void bnxt_re_change_cq_phase(struct bnxt_re_cq *cq) { if (!cq->cqq->head) cq->phase = (~cq->phase & BNXT_RE_BCQE_PH_MASK); } static inline void *bnxt_re_get_swqe(struct bnxt_re_joint_queue *jqq, uint32_t *wqe_idx) { if (wqe_idx) *wqe_idx = jqq->start_idx; return &jqq->swque[jqq->start_idx]; } static inline void bnxt_re_jqq_mod_start(struct bnxt_re_joint_queue *jqq, uint32_t idx) { jqq->start_idx = jqq->swque[idx].next_idx; } static inline void bnxt_re_jqq_mod_last(struct bnxt_re_joint_queue *jqq, uint32_t idx) { jqq->last_idx = jqq->swque[idx].next_idx; } static inline uint32_t bnxt_re_init_depth(uint32_t ent, uint64_t cmask) { return cmask & BNXT_RE_COMP_MASK_UCNTX_POW2_DISABLED ? ent : roundup_pow_of_two(ent); } /* Helper function to copy to push buffers */ static inline void bnxt_re_copy_data_to_pb(struct bnxt_re_push_buffer *pbuf, uint8_t offset, uint32_t idx) { uintptr_t *src; uintptr_t *dst; int indx; for (indx = 0; indx < idx; indx++) { dst = (uintptr_t *)(pbuf->pbuf) + 2 * indx + offset; src = (uintptr_t *)(pbuf->wqe[indx]); mmio_write64(dst, *src); dst++; src++; mmio_write64(dst, *src); } } static void timespec_sub(const struct timespec *a, const struct timespec *b, struct timespec *res) { res->tv_sec = a->tv_sec - b->tv_sec; res->tv_nsec = a->tv_nsec - b->tv_nsec; if (res->tv_nsec < 0) { res->tv_sec--; res->tv_nsec += BNXT_NSEC_PER_SEC; } } /* * Function waits in a busy loop for a given nano seconds * The maximum wait period allowed is less than one second */ static inline void bnxt_re_sub_sec_busy_wait(uint32_t nsec) { struct timespec start, cur, res; if (nsec >= BNXT_NSEC_PER_SEC) return; if (clock_gettime(CLOCK_REALTIME, &start)) return; while (1) { if (clock_gettime(CLOCK_REALTIME, &cur)) return; timespec_sub(&cur, &start, &res); if (res.tv_nsec >= nsec) break; } } #define BNXT_RE_MSN_TBL_EN(a) ((a)->comp_mask & BNXT_RE_COMP_MASK_UCNTX_MSN_TABLE_ENABLED) #endif rdma-core-56.1/providers/bnxt_re/memory.c000066400000000000000000000061251477342711600204700ustar00rootroot00000000000000/* * Broadcom NetXtreme-E User Space RoCE driver * * Copyright (c) 2015-2017, Broadcom. All rights reserved. The term * Broadcom refers to Broadcom Limited and/or its subsidiaries. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Description: Implements method to allocate page-aligned memory * buffers. */ #include #include #include #include #include "main.h" void bnxt_re_free_mem(struct bnxt_re_mem *mem) { if (mem->va_head) { ibv_dofork_range(mem->va_head, mem->size); munmap(mem->va_head, mem->size); } free(mem); } void *bnxt_re_alloc_mem(size_t size, uint32_t pg_size) { struct bnxt_re_mem *mem; mem = calloc(1, sizeof(*mem)); if (!mem) return NULL; size = align(size, pg_size); mem->size = size; mem->va_head = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); if (mem->va_head == MAP_FAILED) goto bail; if (ibv_dontfork_range(mem->va_head, size)) goto unmap; mem->head = 0; mem->tail = 0; mem->va_tail = (void *)((char *)mem->va_head + size); return mem; unmap: munmap(mem->va_head, size); bail: free(mem); return NULL; } void *bnxt_re_get_obj(struct bnxt_re_mem *mem, size_t req) { void *va; if ((mem->size - mem->tail - req) < mem->head) return NULL; mem->tail += req; va = (void *)((char *)mem->va_tail - mem->tail); return va; } void *bnxt_re_get_ring(struct bnxt_re_mem *mem, size_t req) { void *va; if ((mem->head + req) > (mem->size - mem->tail)) return NULL; va = (void *)((char *)mem->va_head + mem->head); mem->head += req; return va; } rdma-core-56.1/providers/bnxt_re/memory.h000066400000000000000000000112361477342711600204740ustar00rootroot00000000000000/* * Broadcom NetXtreme-E User Space RoCE driver * * Copyright (c) 2015-2017, Broadcom. All rights reserved. The term * Broadcom refers to Broadcom Limited and/or its subsidiaries. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Description: Implements data-struture to allocate page-aligned * memory buffer. */ #ifndef __MEMORY_H__ #define __MEMORY_H__ #include #include "main.h" struct bnxt_re_mem { void *va_head; void *va_tail; uint32_t head; uint32_t tail; uint32_t size; uint32_t pad; }; #define BNXT_RE_QATTR_SQ_INDX 0 #define BNXT_RE_QATTR_RQ_INDX 1 struct bnxt_re_qattr { uint32_t esize; uint32_t slots; uint32_t nwr; uint32_t sz_ring; uint32_t sz_shad; uint32_t sw_nwr; }; struct bnxt_re_queue { void *va; uint32_t flags; uint32_t *dbtail; uint32_t bytes; /* for munmap */ uint32_t depth; /* no. of entries */ uint32_t head; uint32_t tail; uint32_t stride; void *pad; /* to hold the padding area */ uint32_t pad_stride_log2; /* Represents the difference between the real queue depth allocated in * HW and the user requested queue depth and is used to correctly flag * queue full condition based on user supplied queue depth. * This value can vary depending on the type of queue and any HW * requirements that mandate keeping a fixed gap between the producer * and the consumer indices in the queue */ uint32_t diff; uint32_t esize; uint32_t max_slots; pthread_spinlock_t qlock; uint32_t msn; uint32_t msn_tbl_sz; /* * This flag is set when CQ is resized. It will be cleared after the * first CQE is received on the newly resized CQ */ bool cq_resized; /* this tracks the 'head' index on the old CQ before resizing */ uint32_t old_head; }; /* Basic queue operation */ static inline void *bnxt_re_get_hwqe(struct bnxt_re_queue *que, uint32_t idx) { idx += que->tail; if (idx >= que->depth) idx -= que->depth; return (void *)(que->va + (idx << 4)); } static inline void *bnxt_re_get_hwqe_hdr(struct bnxt_re_queue *que) { return (void *)(que->va + ((que->tail) << 4)); } static inline uint32_t bnxt_re_is_que_full(struct bnxt_re_queue *que, uint32_t slots) { int32_t avail, head, tail; head = que->head; tail = que->tail; avail = head - tail; if (head <= tail) avail += que->depth; return avail <= (slots + que->diff); } static inline uint32_t bnxt_re_is_que_empty(struct bnxt_re_queue *que) { return que->tail == que->head; } static inline void bnxt_re_incr_tail(struct bnxt_re_queue *que, uint8_t cnt) { que->tail += cnt; if (que->tail >= que->depth) { que->tail %= que->depth; /* Rolled over, Toggle Tail bit in epoch flags */ que->flags ^= 1UL << BNXT_RE_FLAG_EPOCH_TAIL_SHIFT; } } static inline void bnxt_re_incr_head(struct bnxt_re_queue *que, uint8_t cnt) { que->head += cnt; if (que->head >= que->depth) { que->head %= que->depth; /* Rolled over, Toggle HEAD bit in epoch flags */ que->flags ^= 1UL << BNXT_RE_FLAG_EPOCH_HEAD_SHIFT; } } void bnxt_re_free_mem(struct bnxt_re_mem *mem); void *bnxt_re_alloc_mem(size_t size, uint32_t pg_size); void *bnxt_re_get_obj(struct bnxt_re_mem *mem, size_t req); void *bnxt_re_get_ring(struct bnxt_re_mem *mem, size_t req); #endif rdma-core-56.1/providers/bnxt_re/verbs.c000066400000000000000000002321341477342711600203020ustar00rootroot00000000000000/* * Broadcom NetXtreme-E User Space RoCE driver * * 
Copyright (c) 2015-2017, Broadcom. All rights reserved. The term * Broadcom refers to Broadcom Limited and/or its subsidiaries. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Description: User IB-Verbs implementation */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "main.h" #include "verbs.h" static int bnxt_re_poll_one(struct bnxt_re_cq *cq, int nwc, struct ibv_wc *wc, uint32_t *resize); int bnxt_re_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct ib_uverbs_ex_query_device_resp resp; size_t resp_size = sizeof(resp); uint8_t fw_ver[8]; int err; err = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp, &resp_size); if (err) return err; memcpy(fw_ver, &resp.base.fw_ver, sizeof(resp.base.fw_ver)); snprintf(attr->orig_attr.fw_ver, 64, "%d.%d.%d.%d", fw_ver[0], fw_ver[1], fw_ver[2], fw_ver[3]); return 0; } int bnxt_re_query_port(struct ibv_context *ibvctx, uint8_t port, struct ibv_port_attr *port_attr) { struct ibv_query_port cmd; return ibv_cmd_query_port(ibvctx, port, port_attr, &cmd, sizeof(cmd)); } static inline bool bnxt_re_is_wcdpi_enabled(struct bnxt_re_context *cntx) { return cntx->comp_mask & BNXT_RE_COMP_MASK_UCNTX_WC_DPI_ENABLED; } static int bnxt_re_map_db_page(struct ibv_context *ibvctx, uint64_t dbr, uint32_t dpi) { struct bnxt_re_context *cntx = to_bnxt_re_context(ibvctx); struct bnxt_re_dev *dev = to_bnxt_re_dev(ibvctx->device); cntx->udpi.dpindx = dpi; cntx->udpi.dbpage = mmap(NULL, dev->pg_size, PROT_WRITE, MAP_SHARED, ibvctx->cmd_fd, dbr); if (cntx->udpi.dbpage == MAP_FAILED) return -ENOMEM; return 0; } int bnxt_re_get_toggle_mem(struct ibv_context *ibvctx, struct bnxt_re_mmap_info *minfo, uint32_t *page_handle) { DECLARE_COMMAND_BUFFER(cmd, BNXT_RE_OBJECT_GET_TOGGLE_MEM, BNXT_RE_METHOD_GET_TOGGLE_MEM, 4); struct ib_uverbs_attr *handle; int ret; handle = fill_attr_out_obj(cmd, BNXT_RE_TOGGLE_MEM_HANDLE); 
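/*
 * Editorial note (illustrative only): this is rdma-core's DIRECT_VERBS
 * ioctl pattern: DECLARE_COMMAND_BUFFER() reserves a command buffer for
 * one method and its attribute count, the fill_attr_*() calls below
 * attach inputs and outputs, and execute_ioctl() submits the bundle in a
 * single system call, after which the output attributes hold the
 * kernel-written values. A minimal sketch of the same shape, with
 * hypothetical object/method/attribute IDs:
 *
 *	DECLARE_COMMAND_BUFFER(cmd, DEMO_OBJECT, DEMO_METHOD, 2);
 *	uint64_t out_val;
 *	int ret;
 *
 *	fill_attr_in(cmd, DEMO_ATTR_IN, &in_val, sizeof(in_val));
 *	fill_attr_out_ptr(cmd, DEMO_ATTR_OUT, &out_val);
 *	ret = execute_ioctl(ibvctx, cmd);
 *	if (ret)
 *		return ret;
 */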
fill_attr_const_in(cmd, BNXT_RE_TOGGLE_MEM_TYPE, minfo->type); fill_attr_in(cmd, BNXT_RE_TOGGLE_MEM_RES_ID, &minfo->res_id, sizeof(minfo->res_id)); fill_attr_out_ptr(cmd, BNXT_RE_TOGGLE_MEM_MMAP_PAGE, &minfo->alloc_offset); fill_attr_out_ptr(cmd, BNXT_RE_TOGGLE_MEM_MMAP_LENGTH, &minfo->alloc_size); fill_attr_out_ptr(cmd, BNXT_RE_TOGGLE_MEM_MMAP_OFFSET, &minfo->pg_offset); ret = execute_ioctl(ibvctx, cmd); if (ret) return ret; if (page_handle) *page_handle = read_attr_obj(BNXT_RE_TOGGLE_MEM_HANDLE, handle); return 0; } int bnxt_re_notify_drv(struct ibv_context *ibvctx) { DECLARE_COMMAND_BUFFER(cmd, BNXT_RE_OBJECT_NOTIFY_DRV, BNXT_RE_METHOD_NOTIFY_DRV, 0); return execute_ioctl(ibvctx, cmd); } int bnxt_re_alloc_page(struct ibv_context *ibvctx, struct bnxt_re_mmap_info *minfo, uint32_t *page_handle) { DECLARE_COMMAND_BUFFER(cmd, BNXT_RE_OBJECT_ALLOC_PAGE, BNXT_RE_METHOD_ALLOC_PAGE, 4); struct ib_uverbs_attr *handle; int ret; handle = fill_attr_out_obj(cmd, BNXT_RE_ALLOC_PAGE_HANDLE); fill_attr_const_in(cmd, BNXT_RE_ALLOC_PAGE_TYPE, minfo->type); fill_attr_out_ptr(cmd, BNXT_RE_ALLOC_PAGE_MMAP_OFFSET, &minfo->alloc_offset); fill_attr_out_ptr(cmd, BNXT_RE_ALLOC_PAGE_MMAP_LENGTH, &minfo->alloc_size); fill_attr_out_ptr(cmd, BNXT_RE_ALLOC_PAGE_DPI, &minfo->dpi); ret = execute_ioctl(ibvctx, cmd); if (ret) return ret; if (page_handle) *page_handle = read_attr_obj(BNXT_RE_ALLOC_PAGE_HANDLE, handle); return 0; } static int bnxt_re_alloc_map_push_page(struct ibv_context *ibvctx) { struct bnxt_re_context *cntx = to_bnxt_re_context(ibvctx); struct bnxt_re_mmap_info minfo = {}; int ret; minfo.type = BNXT_RE_ALLOC_WC_PAGE; ret = bnxt_re_alloc_page(ibvctx, &minfo, &cntx->wc_handle); if (ret) return ret; cntx->udpi.wcdbpg = mmap(NULL, minfo.alloc_size, PROT_WRITE, MAP_SHARED, ibvctx->cmd_fd, minfo.alloc_offset); if (cntx->udpi.wcdbpg == MAP_FAILED) return -ENOMEM; cntx->udpi.wcdpi = minfo.dpi; return 0; } struct ibv_pd *bnxt_re_alloc_pd(struct ibv_context *ibvctx) { struct ibv_alloc_pd cmd; struct ubnxt_re_pd_resp resp; struct bnxt_re_context *cntx = to_bnxt_re_context(ibvctx); struct bnxt_re_pd *pd; uint64_t dbr = 0; pd = calloc(1, sizeof(*pd)); if (!pd) return NULL; memset(&resp, 0, sizeof(resp)); if (ibv_cmd_alloc_pd(ibvctx, &pd->ibvpd, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) goto out; pd->pdid = resp.pdid; dbr = resp.dbr; static_assert(offsetof(struct ubnxt_re_pd_resp, dbr) == 4 * 3, "Bad dbr placement"); /* Map DB page now. 
*/ if (!cntx->udpi.dbpage) { if (bnxt_re_map_db_page(ibvctx, dbr, resp.dpi)) goto fail; if (bnxt_re_is_wcdpi_enabled(cntx)) { bnxt_re_alloc_map_push_page(ibvctx); if (cntx->cctx.gen_p5_p7 && cntx->udpi.wcdpi) bnxt_re_init_pbuf_list(cntx); } } return &pd->ibvpd; fail: (void)ibv_cmd_dealloc_pd(&pd->ibvpd); out: free(pd); return NULL; } int bnxt_re_free_pd(struct ibv_pd *ibvpd) { struct bnxt_re_pd *pd = to_bnxt_re_pd(ibvpd); int status; status = ibv_cmd_dealloc_pd(ibvpd); if (status) return status; /* DPI un-mapping will be during uninit_ucontext */ free(pd); return 0; } struct ibv_mr *bnxt_re_reg_mr(struct ibv_pd *ibvpd, void *sva, size_t len, uint64_t hca_va, int access) { struct bnxt_re_mr *mr; struct ibv_reg_mr cmd; struct ubnxt_re_mr_resp resp; mr = calloc(1, sizeof(*mr)); if (!mr) return NULL; if (ibv_cmd_reg_mr(ibvpd, sva, len, hca_va, access, &mr->vmr, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) { free(mr); return NULL; } return &mr->vmr.ibv_mr; } struct ibv_mr *bnxt_re_reg_dmabuf_mr(struct ibv_pd *ibvpd, uint64_t start, size_t len, uint64_t iova, int fd, int access) { struct bnxt_re_mr *mr; mr = calloc(1, sizeof(*mr)); if (!mr) return NULL; if (ibv_cmd_reg_dmabuf_mr(ibvpd, start, len, iova, fd, access, &mr->vmr, NULL)) { free(mr); return NULL; } return &mr->vmr.ibv_mr; } int bnxt_re_dereg_mr(struct verbs_mr *vmr) { struct bnxt_re_mr *mr = (struct bnxt_re_mr *)vmr; int status; status = ibv_cmd_dereg_mr(vmr); if (status) return status; free(mr); return 0; } static void *bnxt_re_alloc_cqslab(struct bnxt_re_context *cntx, uint32_t ncqe, uint32_t cur) { struct bnxt_re_mem *mem; uint32_t depth, sz; depth = bnxt_re_init_depth(ncqe + 1, cntx->comp_mask); if (depth > cntx->rdev->max_cq_depth + 1) depth = cntx->rdev->max_cq_depth + 1; if (depth == cur) return NULL; sz = align((depth * cntx->rdev->cqe_size), cntx->rdev->pg_size); mem = bnxt_re_alloc_mem(sz, cntx->rdev->pg_size); if (mem) mem->pad = depth; return mem; } struct ibv_cq *bnxt_re_create_cq(struct ibv_context *ibvctx, int ncqe, struct ibv_comp_channel *channel, int vec) { struct bnxt_re_cq *cq; struct ubnxt_re_cq cmd; struct ubnxt_re_cq_resp resp; struct bnxt_re_mmap_info minfo = {}; int ret; struct bnxt_re_context *cntx = to_bnxt_re_context(ibvctx); struct bnxt_re_dev *dev = to_bnxt_re_dev(ibvctx->device); if (ncqe > dev->max_cq_depth) { errno = EINVAL; return NULL; } cq = calloc(1, (sizeof(*cq) + sizeof(struct bnxt_re_queue))); if (!cq) return NULL; /* Enable deferred DB mode for CQ if the CQ is small */ if (ncqe * 2 < dev->max_cq_depth) { cq->deffered_db_sup = true; ncqe = 2 * ncqe; } cq->cqq = (void *)((char *)cq + sizeof(*cq)); if (!cq->cqq) goto fail; cq->mem = bnxt_re_alloc_cqslab(cntx, ncqe, 0); if (!cq->mem) goto fail; cq->cqq->depth = cq->mem->pad; cq->cqq->stride = dev->cqe_size; /* As an exception no need to call get_ring api we know * this is the only consumer */ cq->cqq->va = cq->mem->va_head; if (!cq->cqq->va) goto cmdfail; pthread_spin_init(&cq->cqq->qlock, PTHREAD_PROCESS_PRIVATE); cmd.cq_va = (uintptr_t)cq->cqq->va; cmd.cq_handle = (uintptr_t)cq; memset(&resp, 0, sizeof(resp)); if (ibv_cmd_create_cq(ibvctx, ncqe, channel, vec, &cq->ibvcq, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) goto cmdfail; cq->cqid = resp.cqid; cq->phase = resp.phase; cq->cqq->tail = resp.tail; cq->udpi = &cntx->udpi; cq->cntx = cntx; cq->rand.seed = cq->cqid; if (resp.comp_mask & BNXT_RE_CQ_TOGGLE_PAGE_SUPPORT) { minfo.type = BNXT_RE_CQ_TOGGLE_MEM; minfo.res_id = resp.cqid; ret = bnxt_re_get_toggle_mem(ibvctx, &minfo, 
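/*
 * Editorial note (illustrative only): when the kernel advertises
 * BNXT_RE_CQ_TOGGLE_PAGE_SUPPORT, user space fetches a driver-owned
 * toggle page through the GET_TOGGLE_MEM method invoked here and then
 * mmap()s it read-only through the uverbs fd, exactly as the code that
 * follows does:
 *
 *	map = mmap(NULL, minfo.alloc_size, PROT_READ, MAP_SHARED,
 *		   ibvctx->cmd_fd, minfo.alloc_offset);
 *	if (map == MAP_FAILED)
 *		goto cmdfail;	// same unwind path as the surrounding code
 */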
&cq->mem_handle); if (ret) goto cmdfail; cq->toggle_map = mmap(NULL, minfo.alloc_size, PROT_READ, MAP_SHARED, ibvctx->cmd_fd, minfo.alloc_offset); if (cq->toggle_map == MAP_FAILED) goto cmdfail; cq->toggle_size = minfo.alloc_size; } list_head_init(&cq->sfhead); list_head_init(&cq->rfhead); list_head_init(&cq->prev_cq_head); return &cq->ibvcq; cmdfail: bnxt_re_free_mem(cq->mem); fail: free(cq); return NULL; } #define BNXT_RE_QUEUE_START_PHASE 0x01 /* * Function to complete the last steps in CQ resize. Invoke poll function * in the kernel driver; this serves as a signal to the driver to complete CQ * resize steps required. Free memory mapped for the original CQ and switch * over to the memory mapped for CQ with the new size. Finally Ack the Cutoff * CQE. This function must be called under cq->cqq.lock. */ static void bnxt_re_resize_cq_complete(struct bnxt_re_cq *cq) { struct bnxt_re_context *cntx = to_bnxt_re_context(cq->ibvcq.context); struct ibv_wc tmp_wc; ibv_cmd_poll_cq(&cq->ibvcq, 1, &tmp_wc); bnxt_re_free_mem(cq->mem); cq->mem = cq->resize_mem; cq->resize_mem = NULL; cq->cqq->va = cq->mem->va_head; /* mark the CQ resize flag and save the old head index */ cq->cqq->cq_resized = true; cq->cqq->old_head = cq->cqq->head; cq->cqq->depth = cq->mem->pad; cq->cqq->stride = cntx->rdev->cqe_size; cq->cqq->head = 0; cq->cqq->tail = 0; cq->phase = BNXT_RE_QUEUE_START_PHASE; /* Reset epoch portion of the flags */ cq->cqq->flags &= ~(BNXT_RE_FLAG_EPOCH_TAIL_MASK); bnxt_re_ring_cq_arm_db(cq, BNXT_RE_QUE_TYPE_CQ_CUT_ACK); } int bnxt_re_resize_cq(struct ibv_cq *ibvcq, int ncqe) { struct bnxt_re_context *cntx = to_bnxt_re_context(ibvcq->context); struct bnxt_re_dev *dev = to_bnxt_re_dev(ibvcq->context->device); struct bnxt_re_cq *cq = to_bnxt_re_cq(ibvcq); struct ib_uverbs_resize_cq_resp resp = {}; struct ubnxt_re_resize_cq cmd = {}; uint16_t msec_wait = 100; uint16_t exit_cnt = 20; int rc = 0; if (ncqe > dev->max_cq_depth) return -EINVAL; /* Check if we can be in defered DB mode with the * newer size of CQE. 
*/ if (2 * ncqe > dev->max_cq_depth) { cq->deffered_db_sup = false; } else { ncqe = 2 * ncqe; cq->deffered_db_sup = true; } pthread_spin_lock(&cq->cqq->qlock); cq->resize_mem = bnxt_re_alloc_cqslab(cntx, ncqe, cq->cqq->depth); if (unlikely(!cq->resize_mem)) { rc = -ENOMEM; goto done; } /* As an exception no need to call get_ring api we know * this is the only consumer */ cmd.cq_va = (uintptr_t)cq->resize_mem->va_head; rc = ibv_cmd_resize_cq(ibvcq, ncqe, &cmd.ibv_cmd, sizeof(cmd), &resp, sizeof(resp)); if (rc) { bnxt_re_free_mem(cq->mem); goto done; } while (true) { struct bnxt_re_work_compl *compl = NULL; struct ibv_wc tmp_wc = {}; uint32_t resize = 0; int dqed = 0; dqed = bnxt_re_poll_one(cq, 1, &tmp_wc, &resize); if (resize) break; if (dqed) { compl = calloc(1, sizeof(*compl)); if (!compl) break; memcpy(&compl->wc, &tmp_wc, sizeof(tmp_wc)); list_add_tail(&cq->prev_cq_head, &compl->list); compl = NULL; memset(&tmp_wc, 0, sizeof(tmp_wc)); } else { exit_cnt--; if (unlikely(!exit_cnt)) { rc = -EIO; break; } bnxt_re_sub_sec_busy_wait(msec_wait * 1000000); } } done: pthread_spin_unlock(&cq->cqq->qlock); return rc; } static void bnxt_re_destroy_resize_cq_list(struct bnxt_re_cq *cq) { struct bnxt_re_work_compl *compl, *tmp; if (list_empty(&cq->prev_cq_head)) return; list_for_each_safe(&cq->prev_cq_head, compl, tmp, list) { list_del(&compl->list); free(compl); } } int bnxt_re_destroy_cq(struct ibv_cq *ibvcq) { int status; struct bnxt_re_cq *cq = to_bnxt_re_cq(ibvcq); if (cq->toggle_map) munmap(cq->toggle_map, cq->toggle_size); status = ibv_cmd_destroy_cq(ibvcq); if (status) return status; bnxt_re_destroy_resize_cq_list(cq); bnxt_re_free_mem(cq->mem); free(cq); return 0; } static uint8_t bnxt_re_poll_err_scqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc, struct bnxt_re_bcqe *hdr, struct bnxt_re_req_cqe *scqe, int *cnt) { struct bnxt_re_queue *sq = qp->jsqq->hwque; struct bnxt_re_wrid *swrid; struct bnxt_re_cq *scq; uint8_t status; uint32_t head; scq = to_bnxt_re_cq(qp->ibvqp->send_cq); head = qp->jsqq->last_idx; swrid = &qp->jsqq->swque[head]; *cnt = 1; status = (le32toh(hdr->flg_st_typ_ph) >> BNXT_RE_BCQE_STATUS_SHIFT) & BNXT_RE_BCQE_STATUS_MASK; ibvwc->status = bnxt_re_to_ibv_wc_status(status, true); ibvwc->vendor_err = status; ibvwc->wc_flags = 0; ibvwc->wr_id = swrid->wrid; ibvwc->qp_num = qp->qpid; ibvwc->opcode = swrid->wc_opcd; ibvwc->byte_len = 0; bnxt_re_incr_head(sq, swrid->slots); bnxt_re_jqq_mod_last(qp->jsqq, head); if (qp->qpst != IBV_QPS_ERR) qp->qpst = IBV_QPS_ERR; bnxt_re_fque_add_node(&scq->sfhead, &qp->snode); return false; } static uint8_t bnxt_re_poll_success_scqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc, struct bnxt_re_bcqe *hdr, struct bnxt_re_req_cqe *scqe, int *cnt) { struct bnxt_re_queue *sq = qp->jsqq->hwque; struct bnxt_re_wrid *swrid; uint32_t cindx; uint32_t head; head = qp->jsqq->last_idx; swrid = &qp->jsqq->swque[head]; cindx = le32toh(scqe->con_indx) % qp->cap.max_swr; if (!(swrid->sig & IBV_SEND_SIGNALED)) { *cnt = 0; } else { ibvwc->status = IBV_WC_SUCCESS; ibvwc->wc_flags = 0; ibvwc->qp_num = qp->qpid; ibvwc->wr_id = swrid->wrid; ibvwc->opcode = swrid->wc_opcd; if (ibvwc->opcode == IBV_WC_RDMA_READ || ibvwc->opcode == IBV_WC_COMP_SWAP || ibvwc->opcode == IBV_WC_FETCH_ADD) ibvwc->byte_len = swrid->bytes; *cnt = 1; } bnxt_re_incr_head(sq, swrid->slots); bnxt_re_jqq_mod_last(qp->jsqq, head); if (qp->jsqq->last_idx != cindx) return true; return false; } static uint8_t bnxt_re_poll_scqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc, void *cqe, int *cnt) { struct 
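/*
 * Editorial note (illustrative only): a send CQE is laid out as the
 * request-specific words (struct bnxt_re_req_cqe) followed by the common
 * header (struct bnxt_re_bcqe), which is why the code below locates the
 * header at a fixed offset and then extracts the status field from it:
 *
 *	hdr = cqe + sizeof(struct bnxt_re_req_cqe);
 *	status = (le32toh(hdr->flg_st_typ_ph) >> BNXT_RE_BCQE_STATUS_SHIFT) &
 *		 BNXT_RE_BCQE_STATUS_MASK;
 *
 * On success, bnxt_re_poll_success_scqe() above reports a work completion
 * only for WRs posted with IBV_SEND_SIGNALED (or on an all-signaled SQ)
 * and silently retires the slots of unsignaled WRs by advancing the
 * queue head.
 */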
bnxt_re_req_cqe *scqe; struct bnxt_re_bcqe *hdr; uint8_t status; scqe = cqe; hdr = cqe + sizeof(struct bnxt_re_req_cqe); status = (le32toh(hdr->flg_st_typ_ph) >> BNXT_RE_BCQE_STATUS_SHIFT) & BNXT_RE_BCQE_STATUS_MASK; if (likely(status == BNXT_RE_REQ_ST_OK)) return bnxt_re_poll_success_scqe(qp, ibvwc, hdr, scqe, cnt); else return bnxt_re_poll_err_scqe(qp, ibvwc, hdr, scqe, cnt); } static void bnxt_re_release_srqe(struct bnxt_re_srq *srq, int tag) { pthread_spin_lock(&srq->srqq->qlock); srq->srwrid[srq->last_idx].next_idx = tag; srq->last_idx = tag; srq->srwrid[srq->last_idx].next_idx = -1; pthread_spin_unlock(&srq->srqq->qlock); } static int bnxt_re_poll_err_rcqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc, struct bnxt_re_bcqe *hdr, void *cqe) { struct bnxt_re_wrid *swque; struct bnxt_re_queue *rq; uint8_t status, cnt = 0; struct bnxt_re_cq *rcq; uint32_t head = 0; rcq = to_bnxt_re_cq(qp->ibvqp->recv_cq); if (!qp->srq) { rq = qp->jrqq->hwque; head = qp->jrqq->last_idx; swque = &qp->jrqq->swque[head]; ibvwc->wr_id = swque->wrid; cnt = swque->slots; } else { struct bnxt_re_srq *srq; int tag; srq = qp->srq; rq = srq->srqq; cnt = 1; tag = le32toh(hdr->qphi_rwrid) & BNXT_RE_BCQE_RWRID_MASK; ibvwc->wr_id = srq->srwrid[tag].wrid; bnxt_re_release_srqe(srq, tag); } status = (le32toh(hdr->flg_st_typ_ph) >> BNXT_RE_BCQE_STATUS_SHIFT) & BNXT_RE_BCQE_STATUS_MASK; /* skip h/w flush errors */ if (status == BNXT_RE_RSP_ST_HW_FLUSH) return 0; ibvwc->status = bnxt_re_to_ibv_wc_status(status, false); ibvwc->vendor_err = status; ibvwc->qp_num = qp->qpid; ibvwc->opcode = IBV_WC_RECV; ibvwc->byte_len = 0; ibvwc->wc_flags = 0; if (qp->qptyp == IBV_QPT_UD) ibvwc->src_qp = 0; if (!qp->srq) bnxt_re_jqq_mod_last(qp->jrqq, head); bnxt_re_incr_head(rq, cnt); if (!qp->srq) bnxt_re_fque_add_node(&rcq->rfhead, &qp->rnode); return 1; } static void bnxt_re_fill_ud_cqe(struct ibv_wc *ibvwc, struct bnxt_re_bcqe *hdr, void *cqe, uint8_t flags) { struct bnxt_re_ud_cqe *ucqe = cqe; uint32_t qpid; qpid = ((le32toh(hdr->qphi_rwrid) >> BNXT_RE_BCQE_SRCQP_SHIFT) & BNXT_RE_BCQE_SRCQP_SHIFT) << 0x10; /* higher 8 bits of 24 */ qpid |= (le64toh(ucqe->qplo_mac) >> BNXT_RE_UD_CQE_SRCQPLO_SHIFT) & BNXT_RE_UD_CQE_SRCQPLO_MASK; /*lower 16 of 24 */ ibvwc->src_qp = qpid; ibvwc->wc_flags |= IBV_WC_GRH; ibvwc->sl = (flags & BNXT_RE_UD_FLAGS_IP_VER_MASK) >> BNXT_RE_UD_FLAGS_IP_VER_SFT; /*IB-stack ABI in user do not ask for MAC to be reported. */ } static void bnxt_re_poll_success_rcqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc, struct bnxt_re_bcqe *hdr, void *cqe) { uint8_t flags, is_imm, is_rdma; struct bnxt_re_rc_cqe *rcqe; struct bnxt_re_wrid *swque; struct bnxt_re_queue *rq; uint32_t rcqe_len; uint32_t head = 0; uint8_t cnt = 0; rcqe = cqe; if (!qp->srq) { rq = qp->jrqq->hwque; head = qp->jrqq->last_idx; swque = &qp->jrqq->swque[head]; ibvwc->wr_id = swque->wrid; cnt = swque->slots; } else { struct bnxt_re_srq *srq; int tag; srq = qp->srq; rq = srq->srqq; tag = le32toh(hdr->qphi_rwrid) & BNXT_RE_BCQE_RWRID_MASK; ibvwc->wr_id = srq->srwrid[tag].wrid; cnt = 1; bnxt_re_release_srqe(srq, tag); } ibvwc->status = IBV_WC_SUCCESS; ibvwc->qp_num = qp->qpid; rcqe_len = le32toh(rcqe->length); ibvwc->byte_len = (qp->qptyp == IBV_QPT_UD) ? 
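/*
 * Editorial note (illustrative only): on UD QPs the hardware packs
 * metadata into the upper bits of the receive length, so only the bits
 * under BNXT_RE_UD_CQE_LEN_MASK are payload length; the RC branch of
 * this conditional uses the field unmasked. bnxt_re_fill_ud_cqe() above
 * likewise assembles the 24-bit source QP number from two CQE words,
 * roughly:
 *
 *	src_qp = (upper 8 bits, from hdr->qphi_rwrid) << 16 |
 *		 (lower 16 bits, from ucqe->qplo_mac);
 */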
rcqe_len & BNXT_RE_UD_CQE_LEN_MASK: rcqe_len ; ibvwc->opcode = IBV_WC_RECV; flags = (le32toh(hdr->flg_st_typ_ph) >> BNXT_RE_BCQE_FLAGS_SHIFT) & BNXT_RE_BCQE_FLAGS_MASK; is_imm = (flags & BNXT_RE_RC_FLAGS_IMM_MASK) >> BNXT_RE_RC_FLAGS_IMM_SHIFT; is_rdma = (flags & BNXT_RE_RC_FLAGS_RDMA_MASK) >> BNXT_RE_RC_FLAGS_RDMA_SHIFT; ibvwc->wc_flags = 0; if (is_imm) { ibvwc->wc_flags |= IBV_WC_WITH_IMM; /* Completion reports the raw-data in LE format, While * user expects it in BE format. Thus, swapping on outgoing * data is needed. On a BE platform le32toh will do the swap * while on LE platform htobe32 will do the job. */ ibvwc->imm_data = htobe32(le32toh(rcqe->imm_key)); if (is_rdma) ibvwc->opcode = IBV_WC_RECV_RDMA_WITH_IMM; } if (qp->qptyp == IBV_QPT_UD) bnxt_re_fill_ud_cqe(ibvwc, hdr, cqe, flags); if (!qp->srq) bnxt_re_jqq_mod_last(qp->jrqq, head); bnxt_re_incr_head(rq, cnt); } static uint8_t bnxt_re_poll_rcqe(struct bnxt_re_qp *qp, struct ibv_wc *ibvwc, void *cqe, int *cnt) { struct bnxt_re_bcqe *hdr; uint8_t status, pcqe = false; hdr = cqe + sizeof(struct bnxt_re_rc_cqe); status = (le32toh(hdr->flg_st_typ_ph) >> BNXT_RE_BCQE_STATUS_SHIFT) & BNXT_RE_BCQE_STATUS_MASK; *cnt = 1; if (likely(status == BNXT_RE_RSP_ST_OK)) bnxt_re_poll_success_rcqe(qp, ibvwc, hdr, cqe); else *cnt = bnxt_re_poll_err_rcqe(qp, ibvwc, hdr, cqe); return pcqe; } static void bnxt_re_qp_move_flush_err(struct bnxt_re_qp *qp) { struct bnxt_re_cq *scq, *rcq; scq = to_bnxt_re_cq(qp->ibvqp->send_cq); rcq = to_bnxt_re_cq(qp->ibvqp->recv_cq); if (qp->qpst != IBV_QPS_ERR) qp->qpst = IBV_QPS_ERR; bnxt_re_fque_add_node(&rcq->rfhead, &qp->rnode); bnxt_re_fque_add_node(&scq->sfhead, &qp->snode); } static uint8_t bnxt_re_poll_term_cqe(struct bnxt_re_qp *qp, int *cnt) { /* For now just add the QP to flush list without * considering the index reported in the CQE. * Continue reporting flush completions until the * SQ and RQ are empty. */ *cnt = 0; if (qp->qpst != IBV_QPS_RESET) bnxt_re_qp_move_flush_err(qp); return 0; } static inline void bnxt_re_check_and_ring_cq_db(struct bnxt_re_cq *cq, int *hw_polled) { /* Ring doorbell only if the CQ is at * least half when deferred db mode is active */ if (cq->deffered_db_sup) { if (cq->hw_cqes < cq->cqq->depth / 2) return; *hw_polled = 0; cq->hw_cqes = 0; } bnxt_re_ring_cq_db(cq); } static int bnxt_re_poll_one(struct bnxt_re_cq *cq, int nwc, struct ibv_wc *wc, uint32_t *resize) { int type, cnt = 0, dqed = 0, hw_polled = 0; struct bnxt_re_queue *cqq = cq->cqq; struct bnxt_re_req_cqe *scqe; struct bnxt_re_ud_cqe *rcqe; uint64_t *qp_handle = NULL; struct bnxt_re_bcqe *hdr; struct bnxt_re_qp *qp; uint8_t pcqe = false; uint32_t flg_val; void *cqe; while (nwc) { cqe = cqq->va + cqq->head * bnxt_re_get_cqe_sz(); hdr = cqe + sizeof(struct bnxt_re_req_cqe); if (!bnxt_re_is_cqe_valid(cq, hdr)) break; flg_val = le32toh(hdr->flg_st_typ_ph); type = (flg_val >> BNXT_RE_BCQE_TYPE_SHIFT) & BNXT_RE_BCQE_TYPE_MASK; switch (type) { case BNXT_RE_WC_TYPE_SEND: scqe = cqe; qp_handle = (uint64_t *)&scqe->qp_handle; qp = (struct bnxt_re_qp *) (uintptr_t)le64toh(scqe->qp_handle); if (!qp) break; /*stale cqe. should be rung.*/ pcqe = bnxt_re_poll_scqe(qp, wc, cqe, &cnt); break; case BNXT_RE_WC_TYPE_RECV_RC: case BNXT_RE_WC_TYPE_RECV_UD: rcqe = cqe; qp_handle = (uint64_t *)&rcqe->qp_handle; qp = (struct bnxt_re_qp *) (uintptr_t)le64toh(rcqe->qp_handle); if (!qp) break; /*stale cqe. 
should be rung.*/ pcqe = bnxt_re_poll_rcqe(qp, wc, cqe, &cnt); break; case BNXT_RE_WC_TYPE_RECV_RAW: break; case BNXT_RE_WC_TYPE_TERM: scqe = cqe; qp_handle = (uint64_t *)&scqe->qp_handle; qp = (struct bnxt_re_qp *) (uintptr_t)le64toh(scqe->qp_handle); if (!qp) break; pcqe = bnxt_re_poll_term_cqe(qp, &cnt); break; case BNXT_RE_WC_TYPE_COFF: /* Stop further processing and return */ cq->resize_tog = (flg_val >> BNXT_RE_BCQE_RESIZE_TOG_SHIFT) & BNXT_RE_BCQE_RESIZE_TOG_MASK; bnxt_re_resize_cq_complete(cq); if (resize) *resize = 1; return dqed; default: break; }; if (pcqe) goto skipp_real; hw_polled++; cq->hw_cqes++; if (qp_handle) { *qp_handle = 0x0ULL; /* mark cqe as read */ qp_handle = NULL; } bnxt_re_incr_head(cq->cqq, 1); bnxt_re_change_cq_phase(cq); skipp_real: if (cnt) { cnt = 0; dqed++; nwc--; wc++; } /* Extra check required to avoid CQ full */ if (cq->deffered_db_sup) bnxt_re_check_and_ring_cq_db(cq, &hw_polled); } if (likely(hw_polled)) bnxt_re_check_and_ring_cq_db(cq, &hw_polled); return dqed; } static int bnxt_re_poll_flush_wcs(struct bnxt_re_joint_queue *jqq, struct ibv_wc *ibvwc, uint32_t qpid, int nwc) { uint8_t opcode = IBV_WC_RECV; struct bnxt_re_queue *que; struct bnxt_re_wrid *wrid; struct bnxt_re_psns *psns; uint32_t cnt = 0; que = jqq->hwque; while (nwc) { if (bnxt_re_is_que_empty(que)) break; wrid = &jqq->swque[jqq->last_idx]; if (wrid->psns) { psns = wrid->psns; opcode = (le32toh(psns->opc_spsn) >> BNXT_RE_PSNS_OPCD_SHIFT) & BNXT_RE_PSNS_OPCD_MASK; } ibvwc->status = IBV_WC_WR_FLUSH_ERR; ibvwc->opcode = opcode; ibvwc->wr_id = wrid->wrid; ibvwc->qp_num = qpid; ibvwc->byte_len = 0; ibvwc->wc_flags = 0; bnxt_re_jqq_mod_last(jqq, jqq->last_idx); bnxt_re_incr_head(que, wrid->slots); nwc--; cnt++; ibvwc++; } return cnt; } static int bnxt_re_poll_flush_wqes(struct bnxt_re_cq *cq, struct list_head *lhead, struct ibv_wc *ibvwc, int32_t nwc) { struct bnxt_re_fque_node *cur, *tmp; struct bnxt_re_joint_queue *jqq; struct bnxt_re_qp *qp; bool sq_list = false; uint32_t polled = 0; sq_list = (lhead == &cq->sfhead) ? true : false; if (!list_empty(lhead)) { list_for_each_safe(lhead, cur, tmp, list) { if (sq_list) { qp = container_of(cur, struct bnxt_re_qp, snode); jqq = qp->jsqq; } else { qp = container_of(cur, struct bnxt_re_qp, rnode); jqq = qp->jrqq; } if (bnxt_re_is_que_empty(jqq->hwque)) continue; polled += bnxt_re_poll_flush_wcs(jqq, ibvwc + polled, qp->qpid, nwc - polled); if (!(nwc - polled)) break; } } return polled; } static int bnxt_re_poll_flush_lists(struct bnxt_re_cq *cq, uint32_t nwc, struct ibv_wc *ibvwc) { int left, polled = 0; /* Check if flush Qs are empty */ if (list_empty(&cq->sfhead) && list_empty(&cq->rfhead)) return 0; polled = bnxt_re_poll_flush_wqes(cq, &cq->sfhead, ibvwc, nwc); left = nwc - polled; if (!left) return polled; polled += bnxt_re_poll_flush_wqes(cq, &cq->rfhead, ibvwc + polled, left); return polled; } static int bnxt_re_poll_resize_cq_list(struct bnxt_re_cq *cq, uint32_t nwc, struct ibv_wc *ibvwc) { struct bnxt_re_work_compl *compl, *tmp; int left; left = nwc; list_for_each_safe(&cq->prev_cq_head, compl, tmp, list) { if (!left) break; memcpy(ibvwc, &compl->wc, sizeof(*ibvwc)); ibvwc++; left--; list_del(&compl->list); free(compl); } return nwc - left; } int bnxt_re_poll_cq(struct ibv_cq *ibvcq, int nwc, struct ibv_wc *wc) { struct bnxt_re_cq *cq = to_bnxt_re_cq(ibvcq); int dqed = 0, left = 0; uint32_t resize = 0; pthread_spin_lock(&cq->cqq->qlock); left = nwc; /* Check whether we have anything to be completed * from prev cq context. 
*/ if (unlikely(!list_empty(&cq->prev_cq_head))) { dqed = bnxt_re_poll_resize_cq_list(cq, nwc, wc); left = nwc - dqed; if (!left) { pthread_spin_unlock(&cq->cqq->qlock); return dqed; } } dqed += bnxt_re_poll_one(cq, left, wc + dqed, &resize); left = nwc - dqed; if (unlikely(left && (!list_empty(&cq->sfhead) || !list_empty(&cq->rfhead)))) /* Check if anything is there to flush. */ dqed += bnxt_re_poll_flush_lists(cq, left, (wc + dqed)); pthread_spin_unlock(&cq->cqq->qlock); return dqed; } static void bnxt_re_cleanup_cq(struct bnxt_re_qp *qp, struct bnxt_re_cq *cq) { struct bnxt_re_queue *que = cq->cqq; struct bnxt_re_bcqe *hdr; struct bnxt_re_req_cqe *scqe; struct bnxt_re_rc_cqe *rcqe; void *cqe; int indx, type; pthread_spin_lock(&que->qlock); for (indx = 0; indx < que->depth; indx++) { cqe = que->va + indx * bnxt_re_get_cqe_sz(); hdr = cqe + sizeof(struct bnxt_re_req_cqe); type = (le32toh(hdr->flg_st_typ_ph) >> BNXT_RE_BCQE_TYPE_SHIFT) & BNXT_RE_BCQE_TYPE_MASK; if (type == BNXT_RE_WC_TYPE_COFF) continue; if (type == BNXT_RE_WC_TYPE_SEND || type == BNXT_RE_WC_TYPE_TERM) { scqe = cqe; if (le64toh(scqe->qp_handle) == (uintptr_t)qp) scqe->qp_handle = 0ULL; } else { rcqe = cqe; if (le64toh(rcqe->qp_handle) == (uintptr_t)qp) rcqe->qp_handle = 0ULL; } } bnxt_re_fque_del_node(&qp->snode); bnxt_re_fque_del_node(&qp->rnode); pthread_spin_unlock(&que->qlock); } int bnxt_re_arm_cq(struct ibv_cq *ibvcq, int flags) { struct bnxt_re_cq *cq = to_bnxt_re_cq(ibvcq); pthread_spin_lock(&cq->cqq->qlock); flags = !flags ? BNXT_RE_QUE_TYPE_CQ_ARMALL : BNXT_RE_QUE_TYPE_CQ_ARMSE; bnxt_re_ring_cq_arm_db(cq, flags); pthread_spin_unlock(&cq->cqq->qlock); return 0; } static int bnxt_re_check_qp_limits(struct bnxt_re_context *cntx, struct ibv_qp_init_attr_ex *attr) { struct ibv_device_attr *devattr; struct bnxt_re_dev *rdev; rdev = cntx->rdev; devattr = &rdev->devattr; if (attr->cap.max_send_sge > devattr->max_sge) return EINVAL; if (attr->cap.max_recv_sge > devattr->max_sge) return EINVAL; if (attr->cap.max_inline_data > BNXT_RE_MAX_INLINE_SIZE) return EINVAL; if (attr->cap.max_send_wr > devattr->max_qp_wr) return EINVAL; if (attr->cap.max_recv_wr > devattr->max_qp_wr) return EINVAL; return 0; } static int bnxt_re_calc_wqe_sz(int nsge) { /* This is used for both sq and rq. In case hdr size differs * in future move to individual functions. 
*/ return sizeof(struct bnxt_re_sge) * nsge + bnxt_re_get_sqe_hdr_sz(); } static int bnxt_re_get_rq_slots(struct bnxt_re_dev *rdev, uint8_t qpmode, uint32_t nrwr, uint32_t nsge, uint32_t *esz) { uint32_t max_wqesz; uint32_t wqe_size; uint32_t stride; uint32_t slots; stride = sizeof(struct bnxt_re_sge); max_wqesz = bnxt_re_calc_wqe_sz(rdev->devattr.max_sge); if (qpmode == BNXT_RE_WQE_MODE_STATIC) nsge = BNXT_RE_STATIC_WQE_MAX_SGE; wqe_size = bnxt_re_calc_wqe_sz(nsge); if (wqe_size > max_wqesz) return -EINVAL; if (esz) *esz = wqe_size; slots = (nrwr * wqe_size) / stride; return slots; } #define BNXT_VAR_MAX_SLOT_ALIGN 256 static int bnxt_re_get_sq_slots(struct bnxt_re_dev *rdev, uint8_t qpmode, uint32_t nswr, uint32_t nsge, uint32_t ils, uint32_t *esize) { uint32_t align_bytes; uint32_t max_wqesz; uint32_t wqe_size; uint32_t cal_ils; uint32_t stride; uint32_t ilsize; uint32_t hdr_sz; uint32_t slots; hdr_sz = bnxt_re_get_sqe_hdr_sz(); stride = sizeof(struct bnxt_re_sge); align_bytes = hdr_sz; if (qpmode == BNXT_RE_WQE_MODE_VARIABLE) align_bytes = stride; max_wqesz = bnxt_re_calc_wqe_sz(rdev->devattr.max_sge); ilsize = align(ils, align_bytes); wqe_size = bnxt_re_calc_wqe_sz(nsge); if (ilsize) { cal_ils = hdr_sz + ilsize; wqe_size = MAX(cal_ils, wqe_size); wqe_size = align(wqe_size, hdr_sz); } if (wqe_size > max_wqesz) return -EINVAL; if (qpmode == BNXT_RE_WQE_MODE_STATIC) wqe_size = bnxt_re_calc_wqe_sz(6); if (esize) *esize = wqe_size; slots = (nswr * wqe_size) / stride; if (qpmode == BNXT_RE_WQE_MODE_VARIABLE) slots = align(slots, BNXT_VAR_MAX_SLOT_ALIGN); return slots; } static int bnxt_re_get_sqmem_size(struct bnxt_re_context *cntx, struct ibv_qp_init_attr_ex *attr, struct bnxt_re_qattr *qattr) { uint32_t nsge, nswr, diff = 0; size_t bytes = 0; uint32_t npsn; uint32_t ils; uint8_t mode; uint32_t esz; int nslots; mode = cntx->wqe_mode & BNXT_RE_WQE_MODE_VARIABLE; nsge = attr->cap.max_send_sge; diff = BNXT_RE_FULL_FLAG_DELTA; nswr = attr->cap.max_send_wr + 1 + diff; nswr = bnxt_re_init_depth(nswr, cntx->comp_mask); ils = attr->cap.max_inline_data; nslots = bnxt_re_get_sq_slots(cntx->rdev, mode, nswr, nsge, ils, &esz); if (nslots < 0) return nslots; npsn = bnxt_re_get_npsn(mode, nswr, nslots); if (BNXT_RE_MSN_TBL_EN(cntx)) npsn = roundup_pow_of_two(npsn); qattr->nwr = nswr; qattr->slots = nslots; qattr->esize = esz; if (mode) qattr->sw_nwr = nslots; else qattr->sw_nwr = nswr; bytes = nslots * sizeof(struct bnxt_re_sge); /* ring */ bytes += npsn * bnxt_re_get_psne_size(cntx); /* psn */ qattr->sz_ring = align(bytes, cntx->rdev->pg_size); qattr->sz_shad = qattr->sw_nwr * sizeof(struct bnxt_re_wrid); /* shadow */ return 0; } static int bnxt_re_get_rqmem_size(struct bnxt_re_context *cntx, struct ibv_qp_init_attr_ex *attr, struct bnxt_re_qattr *qattr) { uint32_t nrwr, nsge; size_t bytes = 0; uint32_t esz; int nslots; nsge = attr->cap.max_recv_sge; nrwr = attr->cap.max_recv_wr + 1; nrwr = bnxt_re_init_depth(nrwr, cntx->comp_mask); nslots = bnxt_re_get_rq_slots(cntx->rdev, cntx->wqe_mode, nrwr, nsge, &esz); if (nslots < 0) return nslots; qattr->nwr = nrwr; qattr->slots = nslots; qattr->esize = esz; qattr->sw_nwr = nrwr; bytes = nslots * sizeof(struct bnxt_re_sge); qattr->sz_ring = align(bytes, cntx->rdev->pg_size); qattr->sz_shad = nrwr * sizeof(struct bnxt_re_wrid); return 0; } static int bnxt_re_get_qpmem_size(struct bnxt_re_context *cntx, struct ibv_qp_init_attr_ex *attr, struct bnxt_re_qattr *qattr) { int size = 0; int tmp; int rc; size = sizeof(struct bnxt_re_qp); tmp = sizeof(struct 
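/*
 * Editorial worked example (the byte counts are an assumption for
 * illustration, not taken from the ABI headers): the sizing functions
 * above account queue memory in 16-byte slots, where one slot is
 * sizeof(struct bnxt_re_sge). In static WQE mode the WQE is padded out
 * to the full 6-SGE layout, so assuming a 32-byte header (struct
 * bnxt_re_bsqe plus struct bnxt_re_send at 16 bytes each):
 *
 *	wqe_size = 32 + 6 * 16 = 128 bytes
 *	slots    = nswr * 128 / 16 = nswr * 8	// matches STATIC_WQE_NUM_SLOTS
 */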
bnxt_re_joint_queue); tmp += sizeof(struct bnxt_re_queue); size += tmp; rc = bnxt_re_get_sqmem_size(cntx, attr, &qattr[BNXT_RE_QATTR_SQ_INDX]); if (rc < 0) return -EINVAL; size += qattr[BNXT_RE_QATTR_SQ_INDX].sz_ring; size += qattr[BNXT_RE_QATTR_SQ_INDX].sz_shad; if (!attr->srq) { tmp = sizeof(struct bnxt_re_joint_queue); tmp += sizeof(struct bnxt_re_queue); size += tmp; rc = bnxt_re_get_rqmem_size(cntx, attr, &qattr[BNXT_RE_QATTR_RQ_INDX]); if (rc < 0) return -EINVAL; size += qattr[BNXT_RE_QATTR_RQ_INDX].sz_ring; size += qattr[BNXT_RE_QATTR_RQ_INDX].sz_shad; } return size; } static void *bnxt_re_alloc_qpslab(struct bnxt_re_context *cntx, struct ibv_qp_init_attr_ex *attr, struct bnxt_re_qattr *qattr) { int bytes; bytes = bnxt_re_get_qpmem_size(cntx, attr, qattr); if (bytes < 0) return NULL; return bnxt_re_alloc_mem(bytes, cntx->rdev->pg_size); } static int bnxt_re_alloc_queue_ptr(struct bnxt_re_qp *qp, struct ibv_qp_init_attr_ex *attr) { int rc = -ENOMEM; int jqsz, qsz; jqsz = sizeof(struct bnxt_re_joint_queue); qsz = sizeof(struct bnxt_re_queue); qp->jsqq = bnxt_re_get_obj(qp->mem, jqsz); if (!qp->jsqq) return rc; qp->jsqq->hwque = bnxt_re_get_obj(qp->mem, qsz); if (!qp->jsqq->hwque) goto fail; if (!attr->srq) { qp->jrqq = bnxt_re_get_obj(qp->mem, jqsz); if (!qp->jrqq) goto fail; qp->jrqq->hwque = bnxt_re_get_obj(qp->mem, qsz); if (!qp->jrqq->hwque) goto fail; } return 0; fail: return rc; } static int bnxt_re_alloc_init_swque(struct bnxt_re_joint_queue *jqq, struct bnxt_re_mem *mem, struct bnxt_re_qattr *qattr) { int indx; jqq->swque = bnxt_re_get_obj(mem, qattr->sz_shad); if (!jqq->swque) return -ENOMEM; jqq->start_idx = 0; jqq->last_idx = qattr->sw_nwr - 1; for (indx = 0; indx < qattr->sw_nwr; indx++) jqq->swque[indx].next_idx = indx + 1; jqq->swque[jqq->last_idx].next_idx = 0; jqq->last_idx = 0; return 0; } static int bnxt_re_alloc_queues(struct bnxt_re_qp *qp, struct ibv_qp_init_attr_ex *attr, struct bnxt_re_qattr *qattr) { struct bnxt_re_queue *que; uint32_t psn_size; uint8_t indx; int ret; indx = BNXT_RE_QATTR_SQ_INDX; que = qp->jsqq->hwque; que->stride = sizeof(struct bnxt_re_sge); que->depth = qattr[indx].slots; que->diff = (BNXT_RE_FULL_FLAG_DELTA * qattr[indx].esize) / que->stride; que->va = bnxt_re_get_ring(qp->mem, qattr[indx].sz_ring); if (!que->va) return -ENOMEM; /* PSN-search memory is allocated without checking for * QP-Type. Kernel driver do not map this memory if it * is UD-qp. UD-qp use this memory to maintain WC-opcode. * See definition of bnxt_re_fill_psns() for the use case. */ que->pad = (que->va + que->depth * que->stride); psn_size = bnxt_re_get_psne_size(qp->cntx); que->pad_stride_log2 = ilog32(psn_size - 1); ret = bnxt_re_alloc_init_swque(qp->jsqq, qp->mem, &qattr[indx]); if (ret) goto fail; qp->cap.max_swr = qattr[indx].sw_nwr; qp->jsqq->cntx = qp->cntx; que->dbtail = (qp->qpmode == BNXT_RE_WQE_MODE_VARIABLE) ? 
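/*
 * Editorial note (illustrative only): the QP slab is carved from both
 * ends -- bnxt_re_get_obj() hands out bookkeeping objects descending from
 * the tail while bnxt_re_get_ring() hands out hardware rings ascending
 * from the head (see memory.c above). The shadow queue built by
 * bnxt_re_alloc_init_swque() is a circular list threaded through
 * next_idx, so indices advance entry-to-entry without modulo arithmetic:
 *
 *	for (i = 0; i < n; i++)
 *		swque[i].next_idx = i + 1;
 *	swque[n - 1].next_idx = 0;	// close the ring
 *
 * The doorbell tail selected by this conditional differs by mode:
 * variable-WQE mode rings with the raw slot tail, static mode with the
 * shadow-queue start index.
 */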
&que->tail : &qp->jsqq->start_idx; /* Init and adjust MSN table size according to qp mode */ if (!BNXT_RE_MSN_TBL_EN(qp->cntx)) goto skip_msn; que->msn = 0; que->msn_tbl_sz = 0; if (qp->qpmode & BNXT_RE_WQE_MODE_VARIABLE) que->msn_tbl_sz = roundup_pow_of_two(qattr->slots) / 2; else que->msn_tbl_sz = roundup_pow_of_two(qattr->nwr); skip_msn: pthread_spin_init(&que->qlock, PTHREAD_PROCESS_PRIVATE); if (qp->jrqq) { indx = BNXT_RE_QATTR_RQ_INDX; que = qp->jrqq->hwque; que->stride = sizeof(struct bnxt_re_sge); que->depth = qattr[indx].slots; que->max_slots = qattr[indx].esize / que->stride; que->dbtail = &qp->jrqq->start_idx; que->va = bnxt_re_get_ring(qp->mem, qattr[indx].sz_ring); if (!que->va) return -ENOMEM; /* For RQ only bnxt_re_wri.wrid is used. */ ret = bnxt_re_alloc_init_swque(qp->jrqq, qp->mem, &qattr[indx]); if (ret) goto fail; pthread_spin_init(&que->qlock, PTHREAD_PROCESS_PRIVATE); qp->cap.max_rwr = qattr[indx].nwr; qp->jrqq->cntx = qp->cntx; } return 0; fail: return ret; } void bnxt_re_async_event(struct ibv_context *context, struct ibv_async_event *event) { struct ibv_qp *ibvqp; struct bnxt_re_qp *qp; switch (event->event_type) { case IBV_EVENT_CQ_ERR: break; case IBV_EVENT_SRQ_ERR: case IBV_EVENT_QP_FATAL: case IBV_EVENT_QP_REQ_ERR: case IBV_EVENT_QP_ACCESS_ERR: case IBV_EVENT_PATH_MIG_ERR: { ibvqp = event->element.qp; qp = to_bnxt_re_qp(ibvqp); bnxt_re_qp_move_flush_err(qp); break; } case IBV_EVENT_SQ_DRAINED: case IBV_EVENT_PATH_MIG: case IBV_EVENT_COMM_EST: case IBV_EVENT_QP_LAST_WQE_REACHED: case IBV_EVENT_SRQ_LIMIT_REACHED: case IBV_EVENT_PORT_ACTIVE: case IBV_EVENT_PORT_ERR: default: break; } } static void *bnxt_re_pull_psn_buff(struct bnxt_re_queue *que, bool hw_retx) { if (hw_retx) return (void *)(que->pad + ((que->msn) << que->pad_stride_log2)); return (void *)(que->pad + ((*que->dbtail) << que->pad_stride_log2)); } static void bnxt_re_fill_psns_for_msntbl(struct bnxt_re_qp *qp, uint32_t len, uint32_t st_idx, uint8_t opcode) { uint32_t npsn = 0, start_psn = 0, next_psn = 0; struct bnxt_re_msns *msns; uint32_t pkt_cnt = 0; msns = bnxt_re_pull_psn_buff(qp->jsqq->hwque, true); msns->start_idx_next_psn_start_psn = 0; if (qp->qptyp == IBV_QPT_RC) { start_psn = qp->sq_psn; pkt_cnt = (len / qp->mtu); if (len % qp->mtu) pkt_cnt++; /* Increment the psn even for 0 len packets * e.g. 
for opcode rdma-write-with-imm-data * with length field = 0 */ if (len == 0) pkt_cnt = 1; /* make it 24 bit */ next_psn = qp->sq_psn + pkt_cnt; npsn = next_psn; qp->sq_psn = next_psn; msns->start_idx_next_psn_start_psn |= bnxt_re_update_msn_tbl(st_idx, npsn, start_psn); qp->jsqq->hwque->msn++; qp->jsqq->hwque->msn %= qp->jsqq->hwque->msn_tbl_sz; } } static void bnxt_re_fill_psns(struct bnxt_re_qp *qp, uint32_t len, uint32_t st_idx, uint8_t opcode) { uint32_t opc_spsn = 0, flg_npsn = 0; struct bnxt_re_psns_ext *psns_ext; uint32_t pkt_cnt = 0, nxt_psn = 0; struct bnxt_re_psns *psns; psns = bnxt_re_pull_psn_buff(qp->jsqq->hwque, false); psns_ext = (struct bnxt_re_psns_ext *)psns; if (qp->qptyp == IBV_QPT_RC) { opc_spsn = qp->sq_psn & BNXT_RE_PSNS_SPSN_MASK; pkt_cnt = (len / qp->mtu); if (len % qp->mtu) pkt_cnt++; if (len == 0) pkt_cnt = 1; nxt_psn = ((qp->sq_psn + pkt_cnt) & BNXT_RE_PSNS_NPSN_MASK); flg_npsn = nxt_psn; qp->sq_psn = nxt_psn; } opc_spsn |= (((uint32_t)opcode & BNXT_RE_PSNS_OPCD_MASK) << BNXT_RE_PSNS_OPCD_SHIFT); memset(psns, 0, sizeof(*psns)); psns->opc_spsn = htole32(opc_spsn); psns->flg_npsn = htole32(flg_npsn); if (qp->cctx->gen_p5_p7) psns_ext->st_slot_idx = st_idx; } static inline void bnxt_re_set_wr_hdr_flags(struct bnxt_re_qp *qp, unsigned int send_flags) { uint32_t hdrval = 0; uint8_t opcd; if (send_flags & IBV_SEND_SIGNALED || qp->cap.sqsig) hdrval |= ((BNXT_RE_WR_FLAGS_SIGNALED & BNXT_RE_HDR_FLAGS_MASK) << BNXT_RE_HDR_FLAGS_SHIFT); if (send_flags & IBV_SEND_FENCE) /*TODO: See when RD fence can be used. */ hdrval |= ((BNXT_RE_WR_FLAGS_UC_FENCE & BNXT_RE_HDR_FLAGS_MASK) << BNXT_RE_HDR_FLAGS_SHIFT); if (send_flags & IBV_SEND_SOLICITED) hdrval |= ((BNXT_RE_WR_FLAGS_SE & BNXT_RE_HDR_FLAGS_MASK) << BNXT_RE_HDR_FLAGS_SHIFT); if (send_flags & IBV_SEND_INLINE) hdrval |= ((BNXT_RE_WR_FLAGS_INLINE & BNXT_RE_HDR_FLAGS_MASK) << BNXT_RE_HDR_FLAGS_SHIFT); hdrval |= ((qp->wr_sq.cur_slot_cnt) & BNXT_RE_HDR_WS_MASK) << BNXT_RE_HDR_WS_SHIFT; opcd = bnxt_re_ibv_to_bnxt_wr_opcd(qp->wr_sq.cur_opcode); hdrval |= (opcd & BNXT_RE_HDR_WT_MASK); qp->wr_sq.cur_hdr->rsv_ws_fl_wt = htole32(hdrval); } static inline void *bnxt_re_get_wr_swqe(struct bnxt_re_joint_queue *jqq, uint32_t cnt) { return &jqq->swque[jqq->start_idx + cnt]; } static uint16_t bnxt_re_put_wr_inline(struct bnxt_re_queue *que, uint32_t *idx, struct bnxt_re_push_buffer *pbuf, size_t num_buf, const struct ibv_data_buf *buf_list, size_t *msg_len) { int len, t_len, offt = 0; int t_cplen = 0, cplen; bool pull_dst = true; int alsize, indx; void *il_dst; void *il_src; t_len = 0; alsize = sizeof(struct bnxt_re_sge); for (indx = 0; indx < num_buf; indx++) { len = buf_list[indx].length; il_src = (void *)buf_list[indx].addr; t_len += len; while (len) { if (pull_dst) { pull_dst = false; il_dst = bnxt_re_get_hwqe(que, (*idx)++); if (pbuf) pbuf->wqe[*idx - 1] = (uintptr_t)il_dst; t_cplen = 0; offt = 0; } cplen = MIN(len, alsize); cplen = MIN(cplen, (alsize - offt)); memcpy(il_dst, il_src, cplen); t_cplen += cplen; il_src += cplen; il_dst += cplen; offt += cplen; len -= cplen; if (t_cplen == alsize) pull_dst = true; } } return t_len; } static inline void bnxt_re_update_wr_common_hdr(struct bnxt_re_qp *qp, uint8_t opcode) { struct bnxt_re_queue *sq = qp->jsqq->hwque; qp->wr_sq.cur_hdr = bnxt_re_get_hwqe(sq, qp->wr_sq.cur_slot_cnt++); qp->wr_sq.cur_sqe = bnxt_re_get_hwqe(sq, qp->wr_sq.cur_slot_cnt++); qp->wr_sq.cur_opcode = opcode; } static inline void bnxt_re_update_sge(struct bnxt_re_sge *sge, uint32_t lkey, uint64_t addr, uint32_t length) { 
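/*
 * Editorial worked example (illustrative only): the PSN bookkeeping in
 * bnxt_re_fill_psns()/bnxt_re_fill_psns_for_msntbl() above advances the
 * send PSN by one per MTU-sized packet, and a zero-length WR (e.g. an
 * RDMA write with immediate and no payload) still consumes one PSN:
 *
 *	len = 8192, mtu = 4096  ->  pkt_cnt = 2
 *	len = 8193, mtu = 4096  ->  pkt_cnt = 3	// partial last packet
 *	len = 0                 ->  pkt_cnt = 1
 *
 *	next_psn = (sq_psn + pkt_cnt) & BNXT_RE_PSNS_NPSN_MASK;  // 24-bit PSN
 */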
sge->pa = htole64(addr); sge->lkey = htole32(lkey); sge->length = htole32(length); } static inline void bnxt_re_update_swqe(struct ibv_qp_ex *ibvqp, struct bnxt_re_qp *qp, uint32_t length) { struct bnxt_re_wrid *wrid; wrid = bnxt_re_get_wr_swqe(qp->jsqq, qp->wr_sq.cur_wqe_cnt); wrid->wrid = ibvqp->wr_id; wrid->bytes = length; wrid->slots = (qp->qpmode == BNXT_RE_WQE_MODE_STATIC) ? STATIC_WQE_NUM_SLOTS : qp->wr_sq.cur_slot_cnt; wrid->sig = (ibvqp->wr_flags & IBV_SEND_SIGNALED || qp->cap.sqsig) ? IBV_SEND_SIGNALED : 0; wrid->wc_opcd = bnxt_re_ibv_wr_to_wc_opcd(qp->wr_sq.cur_opcode); } static void bnxt_re_send_wr_start(struct ibv_qp_ex *ibvqp) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; pthread_spin_lock(&sq->qlock); qp->wr_sq.cur_hdr = NULL; qp->wr_sq.cur_sqe = NULL; qp->wr_sq.cur_slot_cnt = 0; qp->wr_sq.cur_wqe_cnt = 0; qp->wr_sq.cur_opcode = 0xff; qp->wr_sq.cur_push_wqe = false; qp->wr_sq.cur_push_size = 0; qp->wr_sq.cur_swq_idx = qp->jsqq->start_idx; } static int bnxt_re_send_wr_complete(struct ibv_qp_ex *ibvqp) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; int err = qp->wr_sq.error; uint8_t slots; if (unlikely(err)) goto exit; bnxt_re_set_wr_hdr_flags(qp, ibvqp->wr_flags); qp->wqe_cnt += qp->wr_sq.cur_wqe_cnt; slots = (qp->qpmode == BNXT_RE_WQE_MODE_STATIC) ? STATIC_WQE_NUM_SLOTS : qp->wr_sq.cur_slot_cnt; bnxt_re_incr_tail(sq, slots); bnxt_re_jqq_mod_start(qp->jsqq, qp->wr_sq.cur_swq_idx + qp->wr_sq.cur_wqe_cnt - 1); if (!qp->wr_sq.cur_push_wqe) { bnxt_re_ring_sq_db(qp); } else { struct bnxt_re_push_buffer *pushb; pushb = (struct bnxt_re_push_buffer *)qp->pbuf; pushb->wqe[0] = (uintptr_t)qp->wr_sq.cur_hdr; pushb->wqe[1] = (uintptr_t)qp->wr_sq.cur_sqe; pushb->tail = *sq->dbtail; bnxt_re_fill_push_wcb(qp, pushb, qp->wr_sq.cur_slot_cnt); } exit: pthread_spin_unlock(&sq->qlock); return err; } static void bnxt_re_send_wr_abort(struct ibv_qp_ex *ibvqp) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; pthread_spin_unlock(&sq->qlock); } static void bnxt_re_send_wr_set_sge(struct ibv_qp_ex *ibvqp, uint32_t lkey, uint64_t addr, uint32_t length) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; struct bnxt_re_sge *sge; sge = bnxt_re_get_hwqe(sq, qp->wr_sq.cur_slot_cnt++); bnxt_re_update_sge(sge, lkey, addr, length); if (qp->qptyp == IBV_QPT_UD) { qp->wr_sq.cur_hdr->lhdr.qkey_len |= htole64(length); } else { if ((qp->wr_sq.cur_opcode != IBV_WR_ATOMIC_FETCH_AND_ADD) && (qp->wr_sq.cur_opcode != IBV_WR_ATOMIC_CMP_AND_SWP)) qp->wr_sq.cur_hdr->lhdr.qkey_len = htole64(length); } if (BNXT_RE_MSN_TBL_EN(qp->cntx)) bnxt_re_fill_psns_for_msntbl(qp, length, *sq->dbtail, qp->wr_sq.cur_opcode); else bnxt_re_fill_psns(qp, length, *sq->dbtail, qp->wr_sq.cur_opcode); bnxt_re_update_swqe(ibvqp, qp, length); qp->wr_sq.cur_wqe_cnt++; } static void bnxt_re_send_wr_set_sge_list(struct ibv_qp_ex *ibvqp, size_t nsge, const struct ibv_sge *sgl) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; struct bnxt_re_sge *sge; uint32_t i, len = 0; if ((qp->wr_sq.cur_opcode == IBV_WR_ATOMIC_FETCH_AND_ADD) || (qp->wr_sq.cur_opcode == IBV_WR_ATOMIC_CMP_AND_SWP)) { qp->wr_sq.error = -EINVAL; return; } /* check the queue full including header slots */ if (bnxt_re_is_que_full(sq, nsge)) { qp->wr_sq.error = ENOMEM; return; } for (i = 0; i 
< nsge; i++) { sge = bnxt_re_get_hwqe(sq, qp->wr_sq.cur_slot_cnt++); bnxt_re_update_sge(sge, sgl[i].lkey, sgl[i].addr, sgl[i].length); len += sgl[i].length; } if (qp->qptyp == IBV_QPT_UD) { qp->wr_sq.cur_hdr->lhdr.qkey_len |= htole64(len); } else { if ((qp->wr_sq.cur_opcode != IBV_WR_ATOMIC_FETCH_AND_ADD) && (qp->wr_sq.cur_opcode != IBV_WR_ATOMIC_CMP_AND_SWP)) qp->wr_sq.cur_hdr->lhdr.qkey_len = htole64(len); } if (BNXT_RE_MSN_TBL_EN(qp->cntx)) bnxt_re_fill_psns_for_msntbl(qp, len, *sq->dbtail, qp->wr_sq.cur_opcode); else bnxt_re_fill_psns(qp, len, *sq->dbtail, qp->wr_sq.cur_opcode); bnxt_re_update_swqe(ibvqp, qp, len); qp->wr_sq.cur_wqe_cnt++; } static void bnxt_re_send_wr_set_inline_data(struct ibv_qp_ex *ibvqp, void *addr, size_t length) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; struct bnxt_re_push_buffer *pushb = NULL; struct ibv_data_buf ibv_buf; uint32_t len = 0; if (unlikely(qp->wr_sq.error)) return; if (qp->push_st_en && length < qp->max_push_sz) { pushb = (struct bnxt_re_push_buffer *)qp->pbuf; pushb->qpid = qp->qpid; pushb->st_idx = *sq->dbtail; qp->wr_sq.cur_push_wqe = true; } ibv_buf.addr = addr; ibv_buf.length = length; len = bnxt_re_put_wr_inline(sq, &qp->wr_sq.cur_slot_cnt, pushb, 1, &ibv_buf, &length); if (qp->qptyp == IBV_QPT_UD) { qp->wr_sq.cur_hdr->lhdr.qkey_len |= htole64(len); } else { if ((qp->wr_sq.cur_opcode != IBV_WR_ATOMIC_FETCH_AND_ADD) && (qp->wr_sq.cur_opcode != IBV_WR_ATOMIC_CMP_AND_SWP)) qp->wr_sq.cur_hdr->lhdr.qkey_len = htole64(len); } if (BNXT_RE_MSN_TBL_EN(qp->cntx)) bnxt_re_fill_psns_for_msntbl(qp, len, *sq->dbtail, qp->wr_sq.cur_opcode); else bnxt_re_fill_psns(qp, len, *sq->dbtail, qp->wr_sq.cur_opcode); bnxt_re_update_swqe(ibvqp, qp, len); qp->wr_sq.cur_wqe_cnt++; qp->wr_sq.cur_push_size += length; } static void bnxt_re_send_wr_set_inline_data_list(struct ibv_qp_ex *ibvqp, size_t num_buf, const struct ibv_data_buf *buf_list) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; struct bnxt_re_push_buffer *pushb = NULL; uint32_t i, num, len = 0; size_t msg_len = 0; /* Get the total message length */ for (i = 0; i < num_buf; i++) msg_len += buf_list[i].length; if (qp->push_st_en && msg_len < qp->max_push_sz) { pushb = (struct bnxt_re_push_buffer *)qp->pbuf; pushb->qpid = qp->qpid; pushb->st_idx = *sq->dbtail; qp->wr_sq.cur_push_wqe = true; } num = (msg_len + MSG_LEN_ADJ_TO_BYTES) >> SLOTS_RSH_TO_NUM_WQE; /* check the queue full including header slots */ if (bnxt_re_is_que_full(sq, num + 2)) { qp->wr_sq.error = ENOMEM; return; } len = bnxt_re_put_wr_inline(sq, &qp->wr_sq.cur_slot_cnt, pushb, num_buf, buf_list, &msg_len); if (qp->qptyp == IBV_QPT_UD) { qp->wr_sq.cur_hdr->lhdr.qkey_len |= htole64(len); } else { if ((qp->wr_sq.cur_opcode != IBV_WR_ATOMIC_FETCH_AND_ADD) && (qp->wr_sq.cur_opcode != IBV_WR_ATOMIC_CMP_AND_SWP)) qp->wr_sq.cur_hdr->lhdr.qkey_len = htole64(len); } if (BNXT_RE_MSN_TBL_EN(qp->cntx)) bnxt_re_fill_psns_for_msntbl(qp, len, *sq->dbtail, qp->wr_sq.cur_opcode); else bnxt_re_fill_psns(qp, len, *sq->dbtail, qp->wr_sq.cur_opcode); bnxt_re_update_swqe(ibvqp, qp, len); qp->wr_sq.cur_wqe_cnt++; qp->wr_sq.cur_push_size += msg_len; } static void bnxt_re_send_wr_set_ud_addr(struct ibv_qp_ex *ibvqp, struct ibv_ah *ibah, uint32_t remote_qpn, uint32_t remote_qkey) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_ah *ah; uint64_t qkey; if (unlikely(!ibah)) { qp->wr_sq.error = -EINVAL; return; }
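	/*
	 * For UD WQEs the 64-bit lhdr.qkey_len field carries the remote
	 * qkey in its upper 32 bits; the lower 32 bits are OR-ed in later
	 * with the payload length by the set_sge/set_inline handlers.
	 */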
ah = to_bnxt_re_ah(ibah); qkey = remote_qkey; qp->wr_sq.cur_hdr->lhdr.qkey_len |= htole64(qkey << 32); qp->wr_sq.cur_sqe->dst_qp = htole32(remote_qpn); qp->wr_sq.cur_sqe->avid = htole32(ah->avid & 0xFFFFF); } static void bnxt_re_send_wr_send(struct ibv_qp_ex *ibvqp) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; if (bnxt_re_is_que_full(sq, SEND_SGE_MIN_SLOTS)) { qp->wr_sq.error = ENOMEM; return; } bnxt_re_update_wr_common_hdr(qp, IBV_WR_SEND); } static void bnxt_re_send_wr_send_imm(struct ibv_qp_ex *ibvqp, __be32 imm_data) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; if (bnxt_re_is_que_full(sq, SEND_SGE_MIN_SLOTS)) { qp->wr_sq.error = ENOMEM; return; } bnxt_re_update_wr_common_hdr(qp, IBV_WR_SEND_WITH_IMM); qp->wr_sq.cur_hdr->key_immd = htole32(be32toh(imm_data)); } static void bnxt_re_send_wr_rdma_read(struct ibv_qp_ex *ibvqp, uint32_t rkey, uint64_t raddr) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; struct bnxt_re_rdma *rsqe; if (bnxt_re_is_que_full(sq, SEND_SGE_MIN_SLOTS)) { qp->wr_sq.error = ENOMEM; return; } bnxt_re_update_wr_common_hdr(qp, IBV_WR_RDMA_READ); rsqe = (struct bnxt_re_rdma *)qp->wr_sq.cur_sqe; rsqe->rva = htole64(raddr); rsqe->rkey = htole32(rkey); } static void bnxt_re_send_wr_rdma_write(struct ibv_qp_ex *ibvqp, uint32_t rkey, uint64_t raddr) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; struct bnxt_re_rdma *rsqe; if (bnxt_re_is_que_full(sq, SEND_SGE_MIN_SLOTS)) { qp->wr_sq.error = ENOMEM; return; } bnxt_re_update_wr_common_hdr(qp, IBV_WR_RDMA_WRITE); rsqe = (struct bnxt_re_rdma *)qp->wr_sq.cur_sqe; rsqe->rva = htole64(raddr); rsqe->rkey = htole32(rkey); } static void bnxt_re_send_wr_rdma_write_imm(struct ibv_qp_ex *ibvqp, uint32_t rkey, uint64_t raddr, __be32 imm_data) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; struct bnxt_re_rdma *rsqe; if (bnxt_re_is_que_full(sq, SEND_SGE_MIN_SLOTS)) { qp->wr_sq.error = ENOMEM; return; } bnxt_re_update_wr_common_hdr(qp, IBV_WR_RDMA_WRITE_WITH_IMM); qp->wr_sq.cur_hdr->key_immd = htole32(be32toh(imm_data)); rsqe = (struct bnxt_re_rdma *)qp->wr_sq.cur_sqe; rsqe->rva = htole64(raddr); rsqe->rkey = htole32(rkey); } static void bnxt_re_send_wr_atomic_cmp_swp(struct ibv_qp_ex *ibvqp, uint32_t rkey, uint64_t raddr, uint64_t compare, uint64_t swap) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; struct bnxt_re_atomic *sqe; if (bnxt_re_is_que_full(sq, SEND_SGE_MIN_SLOTS)) { qp->wr_sq.error = ENOMEM; return; } bnxt_re_update_wr_common_hdr(qp, IBV_WR_ATOMIC_CMP_AND_SWP); qp->wr_sq.cur_hdr->key_immd = htole32(rkey); qp->wr_sq.cur_hdr->lhdr.rva = htole64(raddr); sqe = (struct bnxt_re_atomic *)qp->wr_sq.cur_sqe; sqe->cmp_dt = htole64(compare); sqe->swp_dt = htole64(swap); } static void bnxt_re_send_wr_atomic_fetch_add(struct ibv_qp_ex *ibvqp, uint32_t rkey, uint64_t raddr, uint64_t add) { struct bnxt_re_qp *qp = to_bnxt_re_qp((struct ibv_qp *)ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; struct bnxt_re_atomic *sqe; if (unlikely(!qp->cap.is_atomic_cap)) { qp->wr_sq.error = -EINVAL; return; } if (bnxt_re_is_que_full(sq, SEND_SGE_MIN_SLOTS)) { qp->wr_sq.error = ENOMEM; return; } bnxt_re_update_wr_common_hdr(qp, IBV_WR_ATOMIC_FETCH_AND_ADD); qp->wr_sq.cur_hdr->key_immd = 
htole32(rkey); qp->wr_sq.cur_hdr->lhdr.rva = htole64(raddr); sqe = (struct bnxt_re_atomic *)qp->wr_sq.cur_sqe; sqe->swp_dt = htole64(add); } static void bnxt_re_set_qp_ex_ops(struct bnxt_re_qp *qp, uint64_t ops_flags) { struct ibv_qp_ex *ibqp = &qp->vqp.qp_ex; if (ops_flags & IBV_QP_EX_WITH_RDMA_WRITE) ibqp->wr_rdma_write = bnxt_re_send_wr_rdma_write; if (ops_flags & IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM) ibqp->wr_rdma_write_imm = bnxt_re_send_wr_rdma_write_imm; if (ops_flags & IBV_QP_EX_WITH_SEND) ibqp->wr_send = bnxt_re_send_wr_send; if (ops_flags & IBV_QP_EX_WITH_SEND_WITH_IMM) ibqp->wr_send_imm = bnxt_re_send_wr_send_imm; if (ops_flags & IBV_QP_EX_WITH_RDMA_READ) ibqp->wr_rdma_read = bnxt_re_send_wr_rdma_read; if (ops_flags & IBV_QP_EX_WITH_ATOMIC_CMP_AND_SWP) ibqp->wr_atomic_cmp_swp = bnxt_re_send_wr_atomic_cmp_swp; if (ops_flags & IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD) ibqp->wr_atomic_fetch_add = bnxt_re_send_wr_atomic_fetch_add; ibqp->wr_set_sge = bnxt_re_send_wr_set_sge; ibqp->wr_set_sge_list = bnxt_re_send_wr_set_sge_list; ibqp->wr_set_inline_data = bnxt_re_send_wr_set_inline_data; ibqp->wr_set_inline_data_list = bnxt_re_send_wr_set_inline_data_list; ibqp->wr_set_ud_addr = bnxt_re_send_wr_set_ud_addr; ibqp->wr_start = bnxt_re_send_wr_start; ibqp->wr_complete = bnxt_re_send_wr_complete; ibqp->wr_abort = bnxt_re_send_wr_abort; } static struct ibv_qp *__bnxt_re_create_qp(struct ibv_context *ibvctx, struct ibv_qp_init_attr_ex *attr) { struct bnxt_re_context *cntx = to_bnxt_re_context(ibvctx); struct bnxt_re_dev *dev = to_bnxt_re_dev(cntx->ibvctx.context.device); struct ubnxt_re_qp_resp resp = {}; struct bnxt_re_qattr qattr[2]; struct bnxt_re_qpcap *cap; struct ubnxt_re_qp req; struct bnxt_re_qp *qp; void *mem; if (bnxt_re_check_qp_limits(cntx, attr)) return NULL; memset(qattr, 0, (2 * sizeof(*qattr))); mem = bnxt_re_alloc_qpslab(cntx, attr, qattr); if (!mem) return NULL; qp = bnxt_re_get_obj(mem, sizeof(*qp)); if (!qp) goto fail; qp->ibvqp = &qp->vqp.qp; qp->mem = mem; qp->cctx = &cntx->cctx; qp->cntx = cntx; qp->qpmode = cntx->wqe_mode & BNXT_RE_WQE_MODE_VARIABLE; /* alloc queue pointers */ if (bnxt_re_alloc_queue_ptr(qp, attr)) goto fail; /* alloc queues */ if (bnxt_re_alloc_queues(qp, attr, qattr)) goto fail; /* Fill ibv_cmd */ cap = &qp->cap; req.qpsva = (uintptr_t)qp->jsqq->hwque->va; req.qprva = qp->jrqq ? (uintptr_t)qp->jrqq->hwque->va : 0; req.qp_handle = (uintptr_t)qp; if (qp->qpmode == BNXT_RE_WQE_MODE_VARIABLE) req.sq_slots = qattr[BNXT_RE_QATTR_SQ_INDX].slots; if (ibv_cmd_create_qp_ex(ibvctx, &qp->vqp, attr, &req.ibv_cmd, sizeof(req), &resp.ibv_resp, sizeof(resp))) goto fail; if (attr->comp_mask & IBV_QP_INIT_ATTR_SEND_OPS_FLAGS) { bnxt_re_set_qp_ex_ops(qp, attr->send_ops_flags); qp->vqp.comp_mask |= VERBS_QP_EX; } qp->qpid = resp.qpid; qp->qptyp = attr->qp_type; qp->qpst = IBV_QPS_RESET; qp->scq = to_bnxt_re_cq(attr->send_cq); qp->rcq = to_bnxt_re_cq(attr->recv_cq); if (attr->srq) qp->srq = to_bnxt_re_srq(attr->srq); qp->udpi = &cntx->udpi; qp->rand.seed = qp->qpid; /* Save/return the altered Caps. 
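 * The kernel may adjust the requested capabilities while creating the
 * QP, so the (possibly updated) values in attr are reported back to
 * the caller here.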
*/ cap->max_ssge = attr->cap.max_send_sge; cap->max_rsge = attr->cap.max_recv_sge; cap->max_inline = attr->cap.max_inline_data; cap->sqsig = attr->sq_sig_all; cap->is_atomic_cap = dev->devattr.atomic_cap; fque_init_node(&qp->snode); fque_init_node(&qp->rnode); if (qp->cctx->gen_p5_p7 && cntx->udpi.wcdpi) { qp->push_st_en = 1; qp->max_push_sz = BNXT_RE_MAX_INLINE_SIZE; qp->pbuf = bnxt_re_get_pbuf(&qp->push_st_en, cntx); } return qp->ibvqp; fail: bnxt_re_free_mem(mem); return NULL; } struct ibv_qp *bnxt_re_create_qp_ex(struct ibv_context *ibvctx, struct ibv_qp_init_attr_ex *attr) { return __bnxt_re_create_qp(ibvctx, attr); } struct ibv_qp *bnxt_re_create_qp(struct ibv_pd *ibvpd, struct ibv_qp_init_attr *attr) { struct ibv_qp_init_attr_ex attr_ex; struct ibv_qp *qp; memset(&attr_ex, 0, sizeof(attr_ex)); memcpy(&attr_ex, attr, sizeof(*attr)); attr_ex.comp_mask = IBV_QP_INIT_ATTR_PD; attr_ex.pd = ibvpd; qp = __bnxt_re_create_qp(ibvpd->context, &attr_ex); if (qp) memcpy(attr, &attr_ex, sizeof(*attr)); return qp; } int bnxt_re_modify_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr, int attr_mask) { struct ibv_modify_qp cmd = {}; struct bnxt_re_qp *qp = to_bnxt_re_qp(ibvqp); int rc; rc = ibv_cmd_modify_qp(ibvqp, attr, attr_mask, &cmd, sizeof(cmd)); if (!rc) { if (attr_mask & IBV_QP_STATE) { qp->qpst = attr->qp_state; /* transition to reset */ if (qp->qpst == IBV_QPS_RESET) { qp->jsqq->hwque->head = 0; qp->jsqq->hwque->tail = 0; bnxt_re_cleanup_cq(qp, qp->scq); qp->jsqq->start_idx = 0; qp->jsqq->last_idx = 0; if (qp->jrqq) { qp->jrqq->hwque->head = 0; qp->jrqq->hwque->tail = 0; bnxt_re_cleanup_cq(qp, qp->rcq); qp->jrqq->start_idx = 0; qp->jrqq->last_idx = 0; } } } if (attr_mask & IBV_QP_SQ_PSN) qp->sq_psn = attr->sq_psn; if (attr_mask & IBV_QP_PATH_MTU) qp->mtu = (0x80 << attr->path_mtu); } return rc; } int bnxt_re_query_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd; struct bnxt_re_qp *qp = to_bnxt_re_qp(ibvqp); int rc; rc = ibv_cmd_query_qp(ibvqp, attr, attr_mask, init_attr, &cmd, sizeof(cmd)); if (!rc) qp->qpst = ibvqp->state; return rc; } int bnxt_re_destroy_qp(struct ibv_qp *ibvqp) { struct bnxt_re_qp *qp = to_bnxt_re_qp(ibvqp); struct bnxt_re_mem *mem; int status; qp->qpst = IBV_QPS_RESET; status = ibv_cmd_destroy_qp(ibvqp); if (status) return status; if (qp->pbuf) { bnxt_re_put_pbuf(qp->cntx, qp->pbuf); qp->pbuf = NULL; } bnxt_re_cleanup_cq(qp, qp->rcq); bnxt_re_cleanup_cq(qp, qp->scq); mem = qp->mem; bnxt_re_free_mem(mem); return 0; } static void bnxt_re_put_rx_sge(struct bnxt_re_queue *que, uint32_t *idx, struct ibv_sge *sgl, int nsg) { struct bnxt_re_sge *sge; int indx; for (indx = 0; indx < nsg; indx++) { sge = bnxt_re_get_hwqe(que, (*idx)++); sge->pa = htole64(sgl[indx].addr); sge->lkey = htole32(sgl[indx].lkey); sge->length = htole32(sgl[indx].length); } } static int bnxt_re_put_tx_sge(struct bnxt_re_queue *que, uint32_t *idx, struct ibv_sge *sgl, int nsg) { struct bnxt_re_sge *sge; int indx; int len; len = 0; for (indx = 0; indx < nsg; indx++) { sge = bnxt_re_get_hwqe(que, (*idx)++); sge->pa = htole64(sgl[indx].addr); sge->lkey = htole32(sgl[indx].lkey); sge->length = htole32(sgl[indx].length); len += sgl[indx].length; } return len; } static inline int bnxt_re_calc_inline_len(struct ibv_send_wr *swr) { int illen, indx; illen = 0; for (indx = 0; indx < swr->num_sge; indx++) illen += swr->sg_list[indx].length; return illen; } static int bnxt_re_put_inline(struct bnxt_re_queue *que, uint32_t *idx, struct
bnxt_re_push_buffer *pbuf, struct ibv_sge *sgl, uint32_t nsg, uint16_t max_ils) { int len, t_len, offt = 0; int t_cplen = 0, cplen; bool pull_dst = true; void *il_dst = NULL; void *il_src = NULL; int alsize; int indx; alsize = sizeof(struct bnxt_re_sge); t_len = 0; for (indx = 0; indx < nsg; indx++) { len = sgl[indx].length; il_src = (void *)(uintptr_t)(sgl[indx].addr); t_len += len; if (t_len > max_ils) goto bad; while (len) { if (pull_dst) { pull_dst = false; il_dst = bnxt_re_get_hwqe(que, (*idx)++); if (pbuf) pbuf->wqe[*idx - 1] = (uintptr_t)il_dst; t_cplen = 0; offt = 0; } cplen = MIN(len, alsize); cplen = MIN(cplen, (alsize - offt)); memcpy(il_dst, il_src, cplen); t_cplen += cplen; il_src += cplen; il_dst += cplen; offt += cplen; len -= cplen; if (t_cplen == alsize) pull_dst = true; } } return t_len; bad: return -ENOMEM; } static int bnxt_re_required_slots(struct bnxt_re_qp *qp, struct ibv_send_wr *wr, uint32_t *wqe_sz, void **pbuf) { uint32_t wqe_byte; int ilsize; if (wr->send_flags & IBV_SEND_INLINE) { ilsize = bnxt_re_calc_inline_len(wr); if (ilsize > qp->cap.max_inline) return -EINVAL; ilsize = align(ilsize, sizeof(struct bnxt_re_sge)); if (qp->push_st_en && ilsize <= qp->max_push_sz) *pbuf = qp->pbuf; wqe_byte = (ilsize + bnxt_re_get_sqe_hdr_sz()); } else { wqe_byte = bnxt_re_calc_wqe_sz(wr->num_sge); } /* que->stride is always 2^4 = 16, thus using hard-coding */ *wqe_sz = wqe_byte >> 4; if (qp->qpmode == BNXT_RE_WQE_MODE_STATIC) return 8; return *wqe_sz; } static inline void bnxt_re_set_hdr_flags(struct bnxt_re_bsqe *hdr, struct ibv_send_wr *wr, uint32_t slots, uint8_t sqsig) { uint32_t send_flags; uint32_t hdrval = 0; uint8_t opcd; send_flags = wr->send_flags; if (send_flags & IBV_SEND_SIGNALED || sqsig) hdrval |= ((BNXT_RE_WR_FLAGS_SIGNALED & BNXT_RE_HDR_FLAGS_MASK) << BNXT_RE_HDR_FLAGS_SHIFT); if (send_flags & IBV_SEND_FENCE) /*TODO: See when RD fence can be used. 
*/ hdrval |= ((BNXT_RE_WR_FLAGS_UC_FENCE & BNXT_RE_HDR_FLAGS_MASK) << BNXT_RE_HDR_FLAGS_SHIFT); if (send_flags & IBV_SEND_SOLICITED) hdrval |= ((BNXT_RE_WR_FLAGS_SE & BNXT_RE_HDR_FLAGS_MASK) << BNXT_RE_HDR_FLAGS_SHIFT); if (send_flags & IBV_SEND_INLINE) hdrval |= ((BNXT_RE_WR_FLAGS_INLINE & BNXT_RE_HDR_FLAGS_MASK) << BNXT_RE_HDR_FLAGS_SHIFT); hdrval |= (slots & BNXT_RE_HDR_WS_MASK) << BNXT_RE_HDR_WS_SHIFT; /* Fill opcode */ opcd = bnxt_re_ibv_to_bnxt_wr_opcd(wr->opcode); hdrval |= (opcd & BNXT_RE_HDR_WT_MASK); hdr->rsv_ws_fl_wt = htole32(hdrval); } static int bnxt_re_build_tx_sge(struct bnxt_re_queue *que, uint32_t *idx, struct bnxt_re_push_buffer *pbuf, struct ibv_send_wr *wr, uint16_t max_il) { if (wr->send_flags & IBV_SEND_INLINE) return bnxt_re_put_inline(que, idx, pbuf, wr->sg_list, wr->num_sge, max_il); return bnxt_re_put_tx_sge(que, idx, wr->sg_list, wr->num_sge); } static void bnxt_re_fill_wrid(struct bnxt_re_wrid *wrid, uint64_t wr_id, uint32_t len, uint8_t sqsig, uint32_t st_idx, uint8_t slots) { wrid->wrid = wr_id; wrid->bytes = len; wrid->sig = 0; if (sqsig) wrid->sig = IBV_SEND_SIGNALED; wrid->st_slot_idx = st_idx; wrid->slots = slots; } static int bnxt_re_build_ud_sqe(struct ibv_send_wr *wr, struct bnxt_re_bsqe *hdr, struct bnxt_re_send *sqe) { struct bnxt_re_ah *ah; uint64_t qkey; ah = to_bnxt_re_ah(wr->wr.ud.ah); if (!wr->wr.ud.ah) return -EINVAL; qkey = wr->wr.ud.remote_qkey; hdr->lhdr.qkey_len |= htole64(qkey << 32); sqe->dst_qp = htole32(wr->wr.ud.remote_qpn); sqe->avid = htole32(ah->avid & 0xFFFFF); return 0; } static bool __atomic_not_supported(struct bnxt_re_qp *qp, struct ibv_send_wr *wr) { /* Atomic capability disabled or the request has more than 1 SGE */ return (!qp->cap.is_atomic_cap || wr->num_sge > 1); } static void bnxt_re_build_cns_sqe(struct ibv_send_wr *wr, struct bnxt_re_bsqe *hdr, void *hdr2) { struct bnxt_re_atomic *sqe = hdr2; hdr->key_immd = htole32(wr->wr.atomic.rkey); hdr->lhdr.rva = htole64(wr->wr.atomic.remote_addr); sqe->cmp_dt = htole64(wr->wr.atomic.compare_add); sqe->swp_dt = htole64(wr->wr.atomic.swap); } static void bnxt_re_build_fna_sqe(struct ibv_send_wr *wr, struct bnxt_re_bsqe *hdr, void *hdr2) { struct bnxt_re_atomic *sqe = hdr2; hdr->key_immd = htole32(wr->wr.atomic.rkey); hdr->lhdr.rva = htole64(wr->wr.atomic.remote_addr); sqe->swp_dt = htole64(wr->wr.atomic.compare_add); } static int bnxt_re_build_atomic_sqe(struct bnxt_re_qp *qp, struct ibv_send_wr *wr, struct bnxt_re_bsqe *hdr, void *hdr2) { if (__atomic_not_supported(qp, wr)) return -EINVAL; switch (wr->opcode) { case IBV_WR_ATOMIC_CMP_AND_SWP: bnxt_re_build_cns_sqe(wr, hdr, hdr2); return 0; case IBV_WR_ATOMIC_FETCH_AND_ADD: bnxt_re_build_fna_sqe(wr, hdr, hdr2); return 0; default: return -EINVAL; } } static void bnxt_re_force_rts2rts(struct bnxt_re_qp *qp) { struct ibv_qp_attr attr; int attr_mask; attr_mask = IBV_QP_STATE; attr.qp_state = IBV_QPS_RTS; bnxt_re_modify_qp(qp->ibvqp, &attr, attr_mask); qp->wqe_cnt = 0; } int bnxt_re_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad) { struct bnxt_re_qp *qp = to_bnxt_re_qp(ibvqp); struct bnxt_re_queue *sq = qp->jsqq->hwque; struct bnxt_re_push_buffer *pbuf = NULL; struct bnxt_re_wrid *wrid; struct bnxt_re_rdma *rsqe; struct bnxt_re_send *sqe; struct bnxt_re_bsqe *hdr; uint32_t swq_idx, slots; int ret = 0, bytes = 0; uint32_t wqe_size = 0; bool ring_db = false; uint8_t sig = 0; uint32_t idx; pthread_spin_lock(&sq->qlock); while (wr) { pbuf = NULL; slots = bnxt_re_required_slots(qp, wr, &wqe_size, (void 
**)&pbuf); if (bnxt_re_is_que_full(sq, slots) || wr->num_sge > qp->cap.max_ssge) { *bad = wr; ret = ENOMEM; goto bad_wr; } idx = 2; bytes = 0; hdr = bnxt_re_get_hwqe(sq, 0); sqe = bnxt_re_get_hwqe(sq, 1); /* populate push buffer */ if (pbuf) { pbuf->qpid = qp->qpid; pbuf->wqe[0] = (uintptr_t)hdr; pbuf->wqe[1] = (uintptr_t)sqe; pbuf->st_idx = *sq->dbtail; } if (wr->num_sge) { bytes = bnxt_re_build_tx_sge(sq, &idx, pbuf, wr, qp->cap.max_inline); if (unlikely(bytes < 0)) { ret = ENOMEM; *bad = wr; goto bad_wr; } } hdr->lhdr.qkey_len = htole64((uint64_t)bytes); bnxt_re_set_hdr_flags(hdr, wr, wqe_size, qp->cap.sqsig); switch (wr->opcode) { case IBV_WR_SEND_WITH_IMM: case IBV_WR_SEND_WITH_INV: /* Since our h/w is LE and for send_with_imm user supplies * raw-data in BE format. Swapping on incoming data is needed. * On a BE platform htole32 will do the swap while on * LE platform be32toh will do the job. * For send_with_inv, send the data as BE. */ if (wr->opcode == IBV_WR_SEND_WITH_INV) hdr->imm_data = wr->imm_data; else hdr->key_immd = htole32(be32toh(wr->imm_data)); SWITCH_FALLTHROUGH; case IBV_WR_SEND: if (qp->qptyp == IBV_QPT_UD) bytes = bnxt_re_build_ud_sqe(wr, hdr, sqe); break; case IBV_WR_RDMA_WRITE_WITH_IMM: hdr->key_immd = htole32(be32toh(wr->imm_data)); SWITCH_FALLTHROUGH; case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_READ: rsqe = (struct bnxt_re_rdma *)sqe; rsqe->rva = htole64(wr->wr.rdma.remote_addr); rsqe->rkey = htole32(wr->wr.rdma.rkey); break; case IBV_WR_ATOMIC_CMP_AND_SWP: case IBV_WR_ATOMIC_FETCH_AND_ADD: if (bnxt_re_build_atomic_sqe(qp, wr, hdr, sqe)) { ret = EINVAL; *bad = wr; goto bad_wr; } break; default: ret = -EINVAL; *bad = wr; goto bad_wr; } wrid = bnxt_re_get_swqe(qp->jsqq, &swq_idx); sig = ((wr->send_flags & IBV_SEND_SIGNALED) || qp->cap.sqsig); bnxt_re_fill_wrid(wrid, wr->wr_id, bytes, sig, sq->tail, slots); wrid->wc_opcd = bnxt_re_ibv_wr_to_wc_opcd(wr->opcode); if (BNXT_RE_MSN_TBL_EN(qp->cntx)) bnxt_re_fill_psns_for_msntbl(qp, bytes, *sq->dbtail, wr->opcode); else bnxt_re_fill_psns(qp, bytes, *sq->dbtail, wr->opcode); bnxt_re_jqq_mod_start(qp->jsqq, swq_idx); bnxt_re_incr_tail(sq, slots); ring_db = true; if (pbuf) { ring_db = false; pbuf->tail = *sq->dbtail; bnxt_re_fill_push_wcb(qp, pbuf, idx); pbuf = NULL; } qp->wqe_cnt++; wr = wr->next; if (unlikely(!qp->cntx->cctx.gen_p5_p7 && qp->wqe_cnt == BNXT_RE_UD_QP_HW_STALL && qp->qptyp == IBV_QPT_UD)) bnxt_re_force_rts2rts(qp); } bad_wr: if (ring_db) bnxt_re_ring_sq_db(qp); pthread_spin_unlock(&sq->qlock); return ret; } int bnxt_re_post_recv(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad) { struct bnxt_re_qp *qp = to_bnxt_re_qp(ibvqp); struct bnxt_re_queue *rq = qp->jrqq->hwque; struct bnxt_re_wrid *swque; struct bnxt_re_brqe *hdr; struct bnxt_re_sge *sge; bool ring_db = false; uint32_t hdrval = 0; uint32_t idx = 0; uint32_t swq_idx; int rc = 0; pthread_spin_lock(&rq->qlock); while (wr) { if (unlikely(bnxt_re_is_que_full(rq, rq->max_slots) || wr->num_sge > qp->cap.max_rsge)) { *bad = wr; rc = ENOMEM; break; } swque = bnxt_re_get_swqe(qp->jrqq, &swq_idx); /* * Initialize idx to 2 since the length of header wqe is 32 bytes * i.e. sizeof(struct bnxt_re_brqe) + sizeof(struct bnxt_re_send) */ idx = 2; hdr = bnxt_re_get_hwqe_hdr(rq); if (unlikely(!wr->num_sge)) { /* * HW needs at least one SGE for RQ Entries. * Create an entry if num_sge = 0, * update the idx and set length of sge to 0. 
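 * The zero-length SGE still occupies one slot, so idx stays
 * consistent with the WQE size encoded into the header below.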
*/ sge = bnxt_re_get_hwqe(rq, idx++); sge->length = 0; } else { /* Fill SGEs */ bnxt_re_put_rx_sge(rq, &idx, wr->sg_list, wr->num_sge); } hdrval = BNXT_RE_WR_OPCD_RECV; hdrval |= ((idx & BNXT_RE_HDR_WS_MASK) << BNXT_RE_HDR_WS_SHIFT); hdr->rsv_ws_fl_wt = htole32(hdrval); hdr->wrid = htole32(swq_idx); swque->wrid = wr->wr_id; swque->slots = rq->max_slots; swque->wc_opcd = BNXT_RE_WC_OPCD_RECV; bnxt_re_jqq_mod_start(qp->jrqq, swq_idx); bnxt_re_incr_tail(rq, rq->max_slots); ring_db = true; wr = wr->next; } if (ring_db) bnxt_re_ring_rq_db(qp); pthread_spin_unlock(&rq->qlock); return rc; } static size_t bnxt_re_get_srqmem_size(struct bnxt_re_context *cntx, struct ibv_srq_init_attr *attr, struct bnxt_re_qattr *qattr) { uint32_t stride, nswr; size_t size = 0; size = sizeof(struct bnxt_re_srq); size += sizeof(struct bnxt_re_queue); /* allocate 1 extra to determine full condition */ nswr = attr->attr.max_wr + 1; nswr = bnxt_re_init_depth(nswr, cntx->comp_mask); stride = bnxt_re_get_srqe_sz(); qattr->nwr = nswr; qattr->slots = nswr; qattr->esize = stride; qattr->sz_ring = align((nswr * stride), cntx->rdev->pg_size); qattr->sz_shad = nswr * sizeof(struct bnxt_re_wrid); /* shadow */ size += qattr->sz_ring; size += qattr->sz_shad; return size; } static void *bnxt_re_alloc_srqslab(struct bnxt_re_context *cntx, struct ibv_srq_init_attr *attr, struct bnxt_re_qattr *qattr) { size_t bytes; bytes = bnxt_re_get_srqmem_size(cntx, attr, qattr); return bnxt_re_alloc_mem(bytes, cntx->rdev->pg_size); } static struct bnxt_re_srq *bnxt_re_srq_alloc_queue_ptr(struct bnxt_re_mem *mem) { struct bnxt_re_srq *srq; srq = bnxt_re_get_obj(mem, sizeof(*srq)); if (!srq) return NULL; srq->srqq = bnxt_re_get_obj(mem, sizeof(struct bnxt_re_queue)); if (!srq->srqq) return NULL; return srq; } static int bnxt_re_srq_alloc_queue(struct bnxt_re_srq *srq, struct ibv_srq_init_attr *attr, struct bnxt_re_qattr *qattr) { struct bnxt_re_queue *que; int ret = -ENOMEM; int idx; que = srq->srqq; que->depth = qattr->slots; que->stride = qattr->esize; que->va = bnxt_re_get_ring(srq->mem, qattr->sz_ring); if (!que->va) return -ENOMEM; pthread_spin_init(&que->qlock, PTHREAD_PROCESS_PRIVATE); /* For SRQ only bnxt_re_wrid.wrid is used. */ srq->srwrid = bnxt_re_get_obj(srq->mem, qattr->sz_shad); if (!srq->srwrid) goto bail; srq->start_idx = 0; srq->last_idx = que->depth - 1; for (idx = 0; idx < que->depth; idx++) srq->srwrid[idx].next_idx = idx + 1; srq->srwrid[srq->last_idx].next_idx = -1; /*TODO: update actual max depth.
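 *
 * Note: the srwrid entries initialized above form a free list chained
 * through next_idx (start_idx .. last_idx, terminated by -1), from
 * which bnxt_re_post_srq_recv() takes one entry per posted receive.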
*/ return 0; bail: pthread_spin_destroy(&srq->srqq->qlock); return ret; } struct ibv_srq *bnxt_re_create_srq(struct ibv_pd *ibvpd, struct ibv_srq_init_attr *attr) { struct bnxt_re_context *cntx = to_bnxt_re_context(ibvpd->context); struct bnxt_re_mmap_info minfo = {}; struct ubnxt_re_srq_resp resp = {}; struct bnxt_re_qattr qattr = {}; struct ubnxt_re_srq req; struct bnxt_re_srq *srq; void *mem; int ret; mem = bnxt_re_alloc_srqslab(cntx, attr, &qattr); if (!mem) return NULL; srq = bnxt_re_srq_alloc_queue_ptr(mem); if (!srq) goto fail; srq->cntx = cntx; srq->mem = mem; if (bnxt_re_srq_alloc_queue(srq, attr, &qattr)) goto fail; req.srqva = (uintptr_t)srq->srqq->va; req.srq_handle = (uintptr_t)srq; ret = ibv_cmd_create_srq(ibvpd, &srq->ibvsrq, attr, &req.ibv_cmd, sizeof(req), &resp.ibv_resp, sizeof(resp)); if (ret) goto fail; srq->srqid = resp.srqid; srq->cntx = cntx; srq->udpi = &cntx->udpi; srq->rand.seed = srq->srqid; srq->cap.max_wr = srq->srqq->depth; srq->cap.max_sge = attr->attr.max_sge; srq->cap.srq_limit = attr->attr.srq_limit; srq->arm_req = false; if (resp.comp_mask & BNXT_RE_SRQ_TOGGLE_PAGE_SUPPORT) { minfo.type = BNXT_RE_SRQ_TOGGLE_MEM; minfo.res_id = resp.srqid; ret = bnxt_re_get_toggle_mem(ibvpd->context, &minfo, &srq->mem_handle); if (ret) goto fail; srq->toggle_map = mmap(NULL, minfo.alloc_size, PROT_READ, MAP_SHARED, ibvpd->context->cmd_fd, minfo.alloc_offset); if (srq->toggle_map == MAP_FAILED) goto fail; srq->toggle_size = minfo.alloc_size; } return &srq->ibvsrq; fail: bnxt_re_free_mem(mem); return NULL; } int bnxt_re_modify_srq(struct ibv_srq *ibvsrq, struct ibv_srq_attr *attr, int attr_mask) { struct bnxt_re_srq *srq = to_bnxt_re_srq(ibvsrq); struct ibv_modify_srq cmd; int status = 0; status = ibv_cmd_modify_srq(ibvsrq, attr, attr_mask, &cmd, sizeof(cmd)); if (!status && ((attr_mask & IBV_SRQ_LIMIT) && (srq->cap.srq_limit != attr->srq_limit))) { srq->cap.srq_limit = attr->srq_limit; } srq->arm_req = true; return status; } int bnxt_re_destroy_srq(struct ibv_srq *ibvsrq) { struct bnxt_re_srq *srq = to_bnxt_re_srq(ibvsrq); struct bnxt_re_mem *mem; int ret; ret = ibv_cmd_destroy_srq(ibvsrq); if (ret) return ret; if (srq->toggle_map) munmap(srq->toggle_map, srq->toggle_size); mem = srq->mem; bnxt_re_free_mem(mem); return 0; } int bnxt_re_query_srq(struct ibv_srq *ibvsrq, struct ibv_srq_attr *attr) { struct ibv_query_srq cmd; return ibv_cmd_query_srq(ibvsrq, attr, &cmd, sizeof(cmd)); } static void bnxt_re_build_srqe(struct bnxt_re_srq *srq, struct ibv_recv_wr *wr, void *srqe) { struct bnxt_re_brqe *hdr = srqe; struct bnxt_re_wrid *wrid; struct bnxt_re_sge *sge; int wqe_sz, len, next; uint32_t hdrval = 0; int indx; sge = (srqe + bnxt_re_get_srqe_hdr_sz()); next = srq->start_idx; wrid = &srq->srwrid[next]; len = 0; for (indx = 0; indx < wr->num_sge; indx++, sge++) { sge->pa = htole64(wr->sg_list[indx].addr); sge->lkey = htole32(wr->sg_list[indx].lkey); sge->length = htole32(wr->sg_list[indx].length); len += wr->sg_list[indx].length; } hdrval = BNXT_RE_WR_OPCD_RECV; wqe_sz = wr->num_sge + (bnxt_re_get_srqe_hdr_sz() >> 4); /* 16B align */ hdrval |= ((wqe_sz & BNXT_RE_HDR_WS_MASK) << BNXT_RE_HDR_WS_SHIFT); hdr->rsv_ws_fl_wt = htole32(hdrval); hdr->wrid = htole32((uint32_t)next); /* Fill wrid */ wrid->wrid = wr->wr_id; wrid->bytes = len; /* N.A. for RQE */ wrid->sig = 0; /* N.A. 
for RQE */ } int bnxt_re_post_srq_recv(struct ibv_srq *ibvsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad) { struct bnxt_re_srq *srq = to_bnxt_re_srq(ibvsrq); struct bnxt_re_queue *rq = srq->srqq; int count = 0, rc = 0; bool ring_db = false; void *srqe; pthread_spin_lock(&rq->qlock); count = rq->tail > rq->head ? rq->tail - rq->head : rq->depth - rq->head + rq->tail; while (wr) { if (srq->start_idx == srq->last_idx || wr->num_sge > srq->cap.max_sge) { *bad = wr; rc = ENOMEM; goto exit; } srqe = (void *) (rq->va + (rq->tail * rq->stride)); memset(srqe, 0, bnxt_re_get_srqe_sz()); bnxt_re_build_srqe(srq, wr, srqe); srq->start_idx = srq->srwrid[srq->start_idx].next_idx; bnxt_re_incr_tail(rq, 1); ring_db = true; wr = wr->next; count++; if (srq->arm_req == true && count > srq->cap.srq_limit) { srq->arm_req = false; ring_db = false; bnxt_re_ring_srq_db(srq); bnxt_re_ring_srq_arm(srq); } } exit: if (ring_db) bnxt_re_ring_srq_db(srq); pthread_spin_unlock(&rq->qlock); return rc; } struct ibv_ah *bnxt_re_create_ah(struct ibv_pd *ibvpd, struct ibv_ah_attr *attr) { struct bnxt_re_context *uctx; struct bnxt_re_ah *ah; struct ib_uverbs_create_ah_resp resp; int status; uctx = to_bnxt_re_context(ibvpd->context); ah = calloc(1, sizeof(*ah)); if (!ah) goto failed; pthread_mutex_lock(&uctx->shlock); memset(&resp, 0, sizeof(resp)); status = ibv_cmd_create_ah(ibvpd, &ah->ibvah, attr, &resp, sizeof(resp)); if (status) { pthread_mutex_unlock(&uctx->shlock); free(ah); goto failed; } /* read AV ID now. */ ah->avid = *(uint32_t *)(uctx->shpg + BNXT_RE_AVID_OFFT); pthread_mutex_unlock(&uctx->shlock); return &ah->ibvah; failed: return NULL; } int bnxt_re_destroy_ah(struct ibv_ah *ibvah) { struct bnxt_re_ah *ah; int status; ah = to_bnxt_re_ah(ibvah); status = ibv_cmd_destroy_ah(ibvah); if (status) return status; free(ah); return 0; } rdma-core-56.1/providers/bnxt_re/verbs.h000066400000000000000000000126301477342711600203040ustar00rootroot00000000000000/* * Broadcom NetXtreme-E User Space RoCE driver * * Copyright (c) 2015-2017, Broadcom. All rights reserved. The term * Broadcom refers to Broadcom Limited and/or its subsidiaries. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in * the documentation and/or other materials provided with the * distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR * PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. * * Description: Internal IB-verbs function declaration */ #ifndef __VERBS_H__ #define __VERBS_H__ #include #include #include #include #include #include #include #include #include #include #include #include #include #include struct bnxt_re_work_compl { struct list_node list; struct ibv_wc wc; }; static inline uint8_t bnxt_re_get_psne_size(struct bnxt_re_context *cntx) { return (BNXT_RE_MSN_TBL_EN(cntx)) ? sizeof(struct bnxt_re_msns) : (cntx->cctx.gen_p5_p7) ? sizeof(struct bnxt_re_psns_ext) : sizeof(struct bnxt_re_psns); } static inline uint32_t bnxt_re_get_npsn(uint8_t mode, uint32_t nwr, uint32_t slots) { return mode == BNXT_RE_WQE_MODE_VARIABLE ? slots : nwr; } int bnxt_re_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); int bnxt_re_query_port(struct ibv_context *uctx, uint8_t port, struct ibv_port_attr *attr); struct ibv_pd *bnxt_re_alloc_pd(struct ibv_context *uctx); int bnxt_re_free_pd(struct ibv_pd *ibvpd); struct ibv_mr *bnxt_re_reg_mr(struct ibv_pd *ibvpd, void *buf, size_t len, uint64_t hca_va, int ibv_access_flags); struct ibv_mr *bnxt_re_reg_dmabuf_mr(struct ibv_pd *, uint64_t start, size_t len, uint64_t iova, int fd, int access); int bnxt_re_dereg_mr(struct verbs_mr *vmr); struct ibv_cq *bnxt_re_create_cq(struct ibv_context *uctx, int ncqe, struct ibv_comp_channel *ch, int vec); int bnxt_re_resize_cq(struct ibv_cq *ibvcq, int ncqe); int bnxt_re_destroy_cq(struct ibv_cq *ibvcq); int bnxt_re_poll_cq(struct ibv_cq *ibvcq, int nwc, struct ibv_wc *wc); int bnxt_re_arm_cq(struct ibv_cq *ibvcq, int flags); struct ibv_qp *bnxt_re_create_qp(struct ibv_pd *ibvpd, struct ibv_qp_init_attr *attr); struct ibv_qp *bnxt_re_create_qp_ex(struct ibv_context *cntx, struct ibv_qp_init_attr_ex *attr); int bnxt_re_modify_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr, int ibv_qp_attr_mask); int bnxt_re_query_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); int bnxt_re_destroy_qp(struct ibv_qp *ibvqp); int bnxt_re_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad); int bnxt_re_post_recv(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad); struct ibv_srq *bnxt_re_create_srq(struct ibv_pd *ibvpd, struct ibv_srq_init_attr *attr); int bnxt_re_modify_srq(struct ibv_srq *ibvsrq, struct ibv_srq_attr *attr, int mask); int bnxt_re_destroy_srq(struct ibv_srq *ibvsrq); int bnxt_re_query_srq(struct ibv_srq *ibvsrq, struct ibv_srq_attr *attr); int bnxt_re_post_srq_recv(struct ibv_srq *ibvsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad); struct ibv_ah *bnxt_re_create_ah(struct ibv_pd *ibvpd, struct ibv_ah_attr *attr); int bnxt_re_destroy_ah(struct ibv_ah *ibvah); void bnxt_re_async_event(struct ibv_context *context, struct ibv_async_event *event); static inline __le64 bnxt_re_update_msn_tbl(uint32_t st_idx, uint32_t npsn, uint32_t start_psn) { /* Adjust the field values to their respective ofsets */ 
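	/*
	 * Each MSN-table entry packs three fields into a single
	 * little-endian 64-bit word: the SQ start slot index, the next
	 * PSN and the start PSN, positioned by the
	 * BNXT_RE_SQ_MSN_SEARCH_* shift/mask definitions.
	 */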
return htole64((((uint64_t)(st_idx) << BNXT_RE_SQ_MSN_SEARCH_START_IDX_SHIFT) & BNXT_RE_SQ_MSN_SEARCH_START_IDX_MASK) | (((uint64_t)(npsn) << BNXT_RE_SQ_MSN_SEARCH_NEXT_PSN_SHIFT) & BNXT_RE_SQ_MSN_SEARCH_NEXT_PSN_MASK) | (((start_psn) << BNXT_RE_SQ_MSN_SEARCH_START_PSN_SHIFT) & BNXT_RE_SQ_MSN_SEARCH_START_PSN_MASK)); } #endif /* __BNXT_RE_VERBS_H__ */ rdma-core-56.1/providers/cxgb4/000077500000000000000000000000001477342711600163565ustar00rootroot00000000000000rdma-core-56.1/providers/cxgb4/CMakeLists.txt000066400000000000000000000001711477342711600211150ustar00rootroot00000000000000set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${NO_STRICT_ALIASING_FLAGS}") rdma_provider(cxgb4 cq.c dev.c qp.c verbs.c ) rdma-core-56.1/providers/cxgb4/cq.c000066400000000000000000000566041477342711600171400ustar00rootroot00000000000000/* * Copyright (c) 2006-2016 Chelsio, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include #include "libcxgb4.h" #include "cxgb4-abi.h" static void insert_recv_cqe(struct t4_wq *wq, struct t4_cq *cq, u32 srqidx) { union t4_cqe cqe = {}; __be64 *gen = GEN_ADDR(&cqe); PDBG("%s wq %p cq %p sw_cidx %u sw_pidx %u\n", __func__, wq, cq, cq->sw_cidx, cq->sw_pidx); cqe.com.header = htobe32(V_CQE_STATUS(T4_ERR_SWFLUSH) | V_CQE_OPCODE(FW_RI_SEND) | V_CQE_TYPE(0) | V_CQE_SWCQE(1) | V_CQE_QPID(wq->sq.qid)); *gen = htobe64(V_CQE_GENBIT((u64)cq->gen)); if (srqidx) cqe.b64.u.srcqe.abs_rqe_idx = htobe32(srqidx); memcpy(Q_ENTRY(cq->sw_queue, cq->sw_pidx), &cqe, CQE_SIZE(&cqe)); t4_swcq_produce(cq); } int c4iw_flush_rq(struct t4_wq *wq, struct t4_cq *cq, int count) { int flushed = 0; int in_use = wq->rq.in_use - count; BUG_ON(in_use < 0); PDBG("%s wq %p cq %p rq.in_use %u skip count %u\n", __func__, wq, cq, wq->rq.in_use, count); while (in_use--) { insert_recv_cqe(wq, cq, 0); flushed++; } return flushed; } static void insert_sq_cqe(struct t4_wq *wq, struct t4_cq *cq, struct t4_swsqe *swcqe) { union t4_cqe cqe = {}; __be64 *gen = GEN_ADDR(&cqe); PDBG("%s wq %p cq %p sw_cidx %u sw_pidx %u\n", __func__, wq, cq, cq->sw_cidx, cq->sw_pidx); cqe.com.header = htobe32(V_CQE_STATUS(T4_ERR_SWFLUSH) | V_CQE_OPCODE(swcqe->opcode) | V_CQE_TYPE(1) | V_CQE_SWCQE(1) | V_CQE_QPID(wq->sq.qid)); CQE_WRID_SQ_IDX(&cqe.com) = swcqe->idx; *gen = htobe64(V_CQE_GENBIT((u64)cq->gen)); memcpy(Q_ENTRY(cq->sw_queue, cq->sw_pidx), &cqe, CQE_SIZE(&cqe)); t4_swcq_produce(cq); } static void advance_oldest_read(struct t4_wq *wq); void c4iw_flush_sq(struct c4iw_qp *qhp) { unsigned short flushed = 0; struct t4_wq *wq = &qhp->wq; struct c4iw_cq *chp = to_c4iw_cq(qhp->ibv_qp.send_cq); struct t4_cq *cq = &chp->cq; int idx; struct t4_swsqe *swsqe; if (wq->sq.flush_cidx == -1) wq->sq.flush_cidx = wq->sq.cidx; idx = wq->sq.flush_cidx; BUG_ON(idx >= wq->sq.size); while (idx != wq->sq.pidx) { swsqe = &wq->sq.sw_sq[idx]; BUG_ON(swsqe->flushed); swsqe->flushed = 1; insert_sq_cqe(wq, cq, swsqe); if (wq->sq.oldest_read == swsqe) { BUG_ON(swsqe->opcode != FW_RI_READ_REQ); advance_oldest_read(wq); } flushed++; if (++idx == wq->sq.size) idx = 0; } wq->sq.flush_cidx += flushed; if (wq->sq.flush_cidx >= wq->sq.size) wq->sq.flush_cidx -= wq->sq.size; } static void flush_completed_wrs(struct t4_wq *wq, struct t4_cq *cq) { struct t4_swsqe *swsqe; unsigned short cidx; if (wq->sq.flush_cidx == -1) wq->sq.flush_cidx = wq->sq.cidx; cidx = wq->sq.flush_cidx; BUG_ON(cidx >= wq->sq.size); while (cidx != wq->sq.pidx) { swsqe = &wq->sq.sw_sq[cidx]; if (!swsqe->signaled) { if (++cidx == wq->sq.size) cidx = 0; } else if (swsqe->complete) { BUG_ON(swsqe->flushed); /* * Insert this completed cqe into the swcq. 
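 * The CQE is copied (not moved) and marked with V_CQE_SWCQE so that
 * poll_cq treats it as a software CQE.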
*/ PDBG("%s moving cqe into swcq sq idx %u cq idx %u\n", __func__, cidx, cq->sw_pidx); swsqe->cqe.com.header |= htobe32(V_CQE_SWCQE(1)); memcpy(Q_ENTRY(cq->sw_queue, cq->sw_pidx), &swsqe->cqe, CQE_SIZE(&swsqe->cqe)); t4_swcq_produce(cq); swsqe->flushed = 1; if (++cidx == wq->sq.size) cidx = 0; wq->sq.flush_cidx = cidx; } else break; } } static void create_read_req_cqe(struct t4_wq *wq, union t4_cqe *hw_cqe, union t4_cqe *read_cqe) { __be64 *gen = GEN_ADDR(read_cqe); memset(read_cqe, 0, sizeof(*read_cqe)); read_cqe->com.u.scqe.cidx = wq->sq.oldest_read->idx; read_cqe->com.len = be32toh(wq->sq.oldest_read->read_len); read_cqe->com.header = htobe32(V_CQE_QPID(CQE_QPID(&hw_cqe->com)) | V_CQE_SWCQE(SW_CQE(&hw_cqe->com)) | V_CQE_OPCODE(FW_RI_READ_REQ) | V_CQE_TYPE(1)); *gen = GEN_BIT(hw_cqe); } static void advance_oldest_read(struct t4_wq *wq) { u32 rptr = wq->sq.oldest_read - wq->sq.sw_sq + 1; if (rptr == wq->sq.size) rptr = 0; while (rptr != wq->sq.pidx) { wq->sq.oldest_read = &wq->sq.sw_sq[rptr]; if (wq->sq.oldest_read->opcode == FW_RI_READ_REQ) return; if (++rptr == wq->sq.size) rptr = 0; } wq->sq.oldest_read = NULL; } /* * Move all CQEs from the HWCQ into the SWCQ. * Deal with out-of-order and/or completions that complete * prior unsignalled WRs. */ void c4iw_flush_hw_cq(struct c4iw_cq *chp, struct c4iw_qp *flush_qhp) { union t4_cqe *hw_cqe, *swcqe, read_cqe; struct t4_cqe_common *com; struct c4iw_qp *qhp; struct t4_swsqe *swsqe; int ret; PDBG("%s cqid 0x%x\n", __func__, chp->cq.cqid); ret = t4_next_hw_cqe(&chp->cq, &hw_cqe); com = &hw_cqe->com; /* * This logic is similar to poll_cq(), but not quite the same * unfortunately. Need to move pertinent HW CQEs to the SW CQ but * also do any translation magic that poll_cq() normally does. */ while (!ret) { qhp = get_qhp(chp->rhp, CQE_QPID(com)); /* * drop CQEs with no associated QP */ if (qhp == NULL) goto next_cqe; if (flush_qhp != qhp) { pthread_spin_lock(&qhp->lock); if (qhp->wq.flushed == 1) { goto next_cqe; } } if (CQE_OPCODE(com) == FW_RI_TERMINATE) goto next_cqe; if (CQE_OPCODE(com) == FW_RI_READ_RESP) { /* * If we have reached here because of async * event or other error, and have egress error * then drop */ if (CQE_TYPE(com) == 1) { syslog(LOG_CRIT, "%s: got egress error in \ read-response, dropping!\n", __func__); goto next_cqe; } /* * drop peer2peer RTR reads. */ if (CQE_WRID_STAG(com) == 1) goto next_cqe; /* * Eat completions for unsignaled read WRs. */ if (!qhp->wq.sq.oldest_read->signaled) { advance_oldest_read(&qhp->wq); goto next_cqe; } /* * Don't write to the HWCQ, create a new read req CQE * in local memory and move it into the swcq. */ create_read_req_cqe(&qhp->wq, hw_cqe, &read_cqe); hw_cqe = &read_cqe; com = &hw_cqe->com; advance_oldest_read(&qhp->wq); } /* if its a SQ completion, then do the magic to move all the * unsignaled and now in-order completions into the swcq. 
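 * (the walk over unsignaled WRs happens in flush_completed_wrs())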
*/ if (SQ_TYPE(com)) { int idx = CQE_WRID_SQ_IDX(com); BUG_ON(idx >= qhp->wq.sq.size); swsqe = &qhp->wq.sq.sw_sq[idx]; swsqe->cqe = *hw_cqe; swsqe->complete = 1; flush_completed_wrs(&qhp->wq, &chp->cq); } else { swcqe = Q_ENTRY(chp->cq.sw_queue, chp->cq.sw_pidx); memcpy(swcqe, hw_cqe, CQE_SIZE(hw_cqe)); swcqe->com.header |= htobe32(V_CQE_SWCQE(1)); t4_swcq_produce(&chp->cq); } next_cqe: t4_hwcq_consume(&chp->cq); ret = t4_next_hw_cqe(&chp->cq, &hw_cqe); com = &hw_cqe->com; if (qhp && flush_qhp != qhp) pthread_spin_unlock(&qhp->lock); } } static int cqe_completes_wr(union t4_cqe *cqe, struct t4_wq *wq) { struct t4_cqe_common *com = &cqe->com; if (CQE_OPCODE(com) == FW_RI_TERMINATE) return 0; if ((CQE_OPCODE(com) == FW_RI_RDMA_WRITE) && RQ_TYPE(com)) return 0; if ((CQE_OPCODE(com) == FW_RI_READ_RESP) && SQ_TYPE(com)) return 0; if (CQE_SEND_OPCODE(com) && RQ_TYPE(com) && t4_rq_empty(wq)) return 0; return 1; } void c4iw_count_rcqes(struct t4_cq *cq, struct t4_wq *wq, int *count) { struct t4_cqe_common *com; union t4_cqe *cqe; u32 ptr; *count = 0; ptr = cq->sw_cidx; BUG_ON(ptr >= cq->size); while (ptr != cq->sw_pidx) { cqe = Q_ENTRY(cq->sw_queue, ptr); com = &cqe->com; if (RQ_TYPE(com) && (CQE_OPCODE(com) != FW_RI_READ_RESP) && (CQE_QPID(com) == wq->sq.qid) && cqe_completes_wr(cqe, wq)) (*count)++; if (++ptr == cq->size) ptr = 0; } PDBG("%s cq %p count %d\n", __func__, cq, *count); } static void dump_cqe(void *arg) { u64 *p = arg; syslog(LOG_NOTICE, "cxgb4 err cqe %016llx %016llx %016llx %016llx\n", (long long)be64toh(p[0]), (long long)be64toh(p[1]), (long long)be64toh(p[2]), (long long)be64toh(p[3])); if (is_64b_cqe) syslog(LOG_NOTICE, "cxgb4 err cqe %016llx %016llx %016llx %016llx\n", (long long)be64toh(p[4]), (long long)be64toh(p[5]), (long long)be64toh(p[6]), (long long)be64toh(p[7])); } static void post_pending_srq_wrs(struct t4_srq *srq) { struct t4_srq_pending_wr *pwr; u16 idx = 0; while (srq->pending_in_use) { assert(!srq->sw_rq[srq->pidx].valid); pwr = &srq->pending_wrs[srq->pending_cidx]; srq->sw_rq[srq->pidx].wr_id = pwr->wr_id; srq->sw_rq[srq->pidx].valid = 1; PDBG("%s posting pending cidx %u pidx %u wq_pidx %u in_use %u rq_size %u wr_id %llx\n", __func__, srq->cidx, srq->pidx, srq->wq_pidx, srq->in_use, srq->size, (unsigned long long)pwr->wr_id); c4iw_copy_wr_to_srq(srq, &pwr->wqe, pwr->len16); t4_srq_consume_pending_wr(srq); t4_srq_produce(srq, pwr->len16); idx += DIV_ROUND_UP(pwr->len16*16, T4_EQ_ENTRY_SIZE); } if (idx) { t4_ring_srq_db(srq, idx, pwr->len16, &pwr->wqe); srq->queue[srq->size].status.host_wq_pidx = srq->wq_pidx; } } static u64 reap_srq_cqe(union t4_cqe *hw_cqe, struct t4_srq *srq) { int rel_idx = CQE_ABS_RQE_IDX(&hw_cqe->b64) - srq->rqt_abs_idx; u64 wr_id; BUG_ON(rel_idx >= srq->size); assert(srq->sw_rq[rel_idx].valid); srq->sw_rq[rel_idx].valid = 0; wr_id = srq->sw_rq[rel_idx].wr_id; if (rel_idx == srq->cidx) { PDBG("%s in order cqe rel_idx %u cidx %u pidx %u wq_pidx %u in_use %u rq_size %u wr_id %llx\n", __func__, rel_idx, srq->cidx, srq->pidx, srq->wq_pidx, srq->in_use, srq->size, (unsigned long long)srq->sw_rq[rel_idx].wr_id); t4_srq_consume(srq); while (srq->ooo_count && !srq->sw_rq[srq->cidx].valid) { PDBG("%s eat ooo cidx %u pidx %u wq_pidx %u in_use %u rq_size %u ooo_count %u wr_id %llx\n", __func__, srq->cidx, srq->pidx, srq->wq_pidx, srq->in_use, srq->size, srq->ooo_count, (unsigned long long)srq->sw_rq[srq->cidx].wr_id); t4_srq_consume_ooo(srq); } if (srq->ooo_count == 0 && srq->pending_in_use) post_pending_srq_wrs(srq); } else { BUG_ON(srq->in_use == 
0); PDBG("%s ooo cqe rel_idx %u cidx %u pidx %u wq_pidx %u in_use %u rq_size %u ooo_count %u wr_id %llx\n", __func__, rel_idx, srq->cidx, srq->pidx, srq->wq_pidx, srq->in_use, srq->size, srq->ooo_count, (unsigned long long)srq->sw_rq[rel_idx].wr_id); t4_srq_produce_ooo(srq); } return wr_id; } /* * poll_cq * * Caller must: * check the validity of the first CQE, * supply the wq assicated with the qpid. * * credit: cq credit to return to sge. * cqe_flushed: 1 iff the CQE is flushed. * cqe: copy of the polled CQE. * * return value: * 0 CQE returned ok. * -EAGAIN CQE skipped, try again. * -EOVERFLOW CQ overflow detected. */ static int poll_cq(struct t4_wq *wq, struct t4_cq *cq, union t4_cqe *cqe, u8 *cqe_flushed, u64 *cookie, u32 *credit, struct t4_srq *srq) { int ret = 0; union t4_cqe *hw_cqe, read_cqe; struct t4_cqe_common *com; *cqe_flushed = 0; *credit = 0; ret = t4_next_cqe(cq, &hw_cqe); if (ret) return ret; com = &hw_cqe->com; PDBG("%s CQE OVF %u qpid 0x%0x genbit %u type %u status 0x%0x" " opcode 0x%0x len 0x%0x wrid_hi_stag 0x%x wrid_low_msn 0x%x\n", __func__, is_64b_cqe ? CQE_OVFBIT(&hw_cqe->b64) : CQE_OVFBIT(&hw_cqe->b32), CQE_QPID(com), is_64b_cqe ? CQE_GENBIT(&hw_cqe->b64) : CQE_GENBIT(&hw_cqe->b32), CQE_TYPE(com), CQE_STATUS(com), CQE_OPCODE(com), CQE_LEN(com), CQE_WRID_HI(com), CQE_WRID_LOW(com)); /* * skip cqe's not affiliated with a QP. */ if (wq == NULL) { ret = -EAGAIN; goto skip_cqe; } /* * skip HW cqe's if wq is already flushed. */ if (wq->flushed && !SW_CQE(com)) { ret = -EAGAIN; goto skip_cqe; } /* * Gotta tweak READ completions: * 1) the cqe doesn't contain the sq_wptr from the wr. * 2) opcode not reflected from the wr. * 3) read_len not reflected from the wr. * 4) T4 HW (for now) inserts target read response failures which * need to be skipped. */ if (CQE_OPCODE(com) == FW_RI_READ_RESP) { /* * If we have reached here because of async * event or other error, and have egress error * then drop */ if (CQE_TYPE(com) == 1) { syslog(LOG_CRIT, "%s: got egress error in \ read-response, dropping!\n", __func__); if (CQE_STATUS(com)) t4_set_wq_in_error(wq); ret = -EAGAIN; goto skip_cqe; } /* * If this is an unsolicited read response, then the read * was generated by the kernel driver as part of peer-2-peer * connection setup, or a target read response failure. * So skip the completion. */ if (CQE_WRID_STAG(com) == 1) { if (CQE_STATUS(com)) t4_set_wq_in_error(wq); ret = -EAGAIN; goto skip_cqe; } /* * Eat completions for unsignaled read WRs. */ if (!wq->sq.oldest_read->signaled) { advance_oldest_read(wq); ret = -EAGAIN; goto skip_cqe; } /* * Don't write to the HWCQ, so create a new read req CQE * in local memory. */ create_read_req_cqe(wq, hw_cqe, &read_cqe); hw_cqe = &read_cqe; com = &hw_cqe->com; advance_oldest_read(wq); } if (CQE_OPCODE(com) == FW_RI_TERMINATE) { ret = -EAGAIN; goto skip_cqe; } if (CQE_STATUS(com) || t4_wq_in_error(wq)) { *cqe_flushed = (CQE_STATUS(com) == T4_ERR_SWFLUSH); wq->error = 1; if (!*cqe_flushed && CQE_STATUS(com)) dump_cqe(hw_cqe); assert(!((*cqe_flushed == 0) && !SW_CQE(com))); goto proc_cqe; } /* * RECV completion. */ if (RQ_TYPE(com)) { /* * HW only validates 4 bits of MSN. So we must validate that * the MSN in the SEND is the next expected MSN. If its not, * then we complete this with T4_ERR_MSN and mark the wq in * error. */ if (srq ? 
t4_srq_empty(srq) : t4_rq_empty(wq)) { t4_set_wq_in_error(wq); ret = -EAGAIN; goto skip_cqe; } if (unlikely((CQE_WRID_MSN(com) != (wq->rq.msn)))) { t4_set_wq_in_error(wq); hw_cqe->com.header |= htobe32(V_CQE_STATUS(T4_ERR_MSN)); goto proc_cqe; } goto proc_cqe; } /* * If we get here its a send completion. * * Handle out of order completion. These get stuffed * in the SW SQ. Then the SW SQ is walked to move any * now in-order completions into the SW CQ. This handles * 2 cases: * 1) reaping unsignaled WRs when the first subsequent * signaled WR is completed. * 2) out of order read completions. */ if (!SW_CQE(com) && (CQE_WRID_SQ_IDX(com) != wq->sq.cidx)) { struct t4_swsqe *swsqe; int idx = CQE_WRID_SQ_IDX(com); PDBG("%s out of order completion going in sw_sq at idx %u\n", __func__, idx); BUG_ON(idx >= wq->sq.size); swsqe = &wq->sq.sw_sq[idx]; swsqe->cqe = *hw_cqe; swsqe->complete = 1; ret = -EAGAIN; goto flush_wq; } proc_cqe: *cqe = *hw_cqe; /* * Reap the associated WR(s) that are freed up with this * completion. */ if (SQ_TYPE(com)) { int idx = CQE_WRID_SQ_IDX(com); BUG_ON(idx >= wq->sq.size); /* * Account for any unsignaled completions completed by * this signaled completion. In this case, cidx points * to the first unsignaled one, and idx points to the * signaled one. So adjust in_use based on this delta. * if this is not completing any unsigned wrs, then the * delta will be 0. Handle wrapping also! */ if (idx < wq->sq.cidx) wq->sq.in_use -= wq->sq.size + idx - wq->sq.cidx; else wq->sq.in_use -= idx - wq->sq.cidx; BUG_ON(wq->sq.in_use <= 0 || wq->sq.in_use >= wq->sq.size); wq->sq.cidx = (u16)idx; PDBG("%s completing sq idx %u\n", __func__, wq->sq.cidx); *cookie = wq->sq.sw_sq[wq->sq.cidx].wr_id; t4_sq_consume(wq); } else { if (!srq) { PDBG("%s completing rq idx %u\n", __func__, wq->rq.cidx); BUG_ON(wq->rq.cidx >= wq->rq.size); *cookie = wq->rq.sw_rq[wq->rq.cidx].wr_id; BUG_ON(t4_rq_empty(wq)); t4_rq_consume(wq); } else *cookie = reap_srq_cqe(hw_cqe, srq); wq->rq.msn++; goto skip_cqe; } flush_wq: /* * Flush any completed cqes that are now in-order. */ flush_completed_wrs(wq, cq); skip_cqe: if (SW_CQE(com)) { PDBG("%s cq %p cqid 0x%x skip sw cqe cidx %u\n", __func__, cq, cq->cqid, cq->sw_cidx); t4_swcq_consume(cq); } else { PDBG("%s cq %p cqid 0x%x skip hw cqe cidx %u\n", __func__, cq, cq->cqid, cq->cidx); t4_hwcq_consume(cq); } return ret; } static void generate_srq_limit_event(struct c4iw_srq *srq) { struct ibv_modify_srq cmd; struct ibv_srq_attr attr = {}; int ret; srq->armed = 0; ret = ibv_cmd_modify_srq(&srq->ibv_srq, &attr, 0, &cmd, sizeof(cmd)); if (ret) fprintf(stderr, "Failure to send srq_limit event - ret %d errno %d\n", ret, errno); } /* * Get one cq entry from c4iw and map it to openib. 
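 *
 * Called with the CQ lock held by c4iw_poll_cq(); the QP lock (and the
 * SRQ lock, when present) is taken internally.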
* * Returns: * 0 cqe returned * -ENODATA EMPTY; * -EAGAIN caller must try again * any other -errno fatal error */ static int c4iw_poll_cq_one(struct c4iw_cq *chp, struct ibv_wc *wc) { struct c4iw_qp *qhp = NULL; struct c4iw_srq *srq = NULL; struct t4_cqe_common *com; union t4_cqe uninitialized_var(cqe), *rd_cqe; struct t4_wq *wq; u32 credit = 0; u8 cqe_flushed; u64 cookie = 0; int ret; ret = t4_next_cqe(&chp->cq, &rd_cqe); if (ret) { #ifdef STALL_DETECTION if (ret == -ENODATA && stall_to && !chp->dumped) { struct timeval t; gettimeofday(&t, NULL); if ((t.tv_sec - chp->time.tv_sec) > stall_to) { dump_state(); chp->dumped = 1; } } #endif return ret; } #ifdef STALL_DETECTION gettimeofday(&chp->time, NULL); #endif qhp = get_qhp(chp->rhp, CQE_QPID(&rd_cqe->com)); if (!qhp) wq = NULL; else { pthread_spin_lock(&qhp->lock); wq = &(qhp->wq); srq = qhp->srq; if (srq) pthread_spin_lock(&srq->lock); } ret = poll_cq(wq, &(chp->cq), &cqe, &cqe_flushed, &cookie, &credit, srq ? &srq->wq : NULL); if (ret) goto out; com = &cqe.com; INC_STAT(cqe); wc->wr_id = cookie; wc->qp_num = qhp->wq.sq.qid; wc->vendor_err = CQE_STATUS(com); wc->wc_flags = 0; /* * Simulate a SRQ_LIMIT_REACHED HW notification if required. */ if (srq && !(srq->flags & T4_SRQ_LIMIT_SUPPORT) && srq->armed && srq->wq.in_use < srq->srq_limit) generate_srq_limit_event(srq); PDBG("%s qpid 0x%x type %d opcode %d status 0x%x wrid hi 0x%x " "lo 0x%x cookie 0x%llx\n", __func__, CQE_QPID(com), CQE_TYPE(com), CQE_OPCODE(com), CQE_STATUS(com), CQE_WRID_HI(com), CQE_WRID_LOW(com), (unsigned long long)cookie); if (CQE_TYPE(com) == 0) { if (!CQE_STATUS(com)) wc->byte_len = CQE_LEN(com); else wc->byte_len = 0; switch (CQE_OPCODE(com)) { case FW_RI_SEND: wc->opcode = IBV_WC_RECV; break; case FW_RI_SEND_WITH_INV: case FW_RI_SEND_WITH_SE_INV: wc->opcode = IBV_WC_RECV; wc->wc_flags |= IBV_WC_WITH_INV; wc->invalidated_rkey = CQE_WRID_STAG(com); break; case FW_RI_WRITE_IMMEDIATE: wc->opcode = IBV_WC_RECV_RDMA_WITH_IMM; wc->imm_data = CQE_IMM_DATA(&cqe.b64); wc->wc_flags |= IBV_WC_WITH_IMM; break; default: PDBG("Unexpected opcode %d in the CQE received for QPID=0x%0x\n", CQE_OPCODE(com), CQE_QPID(com)); ret = -EINVAL; goto out; } } else { switch (CQE_OPCODE(com)) { case FW_RI_RDMA_WRITE: case FW_RI_WRITE_IMMEDIATE: wc->opcode = IBV_WC_RDMA_WRITE; break; case FW_RI_READ_REQ: wc->opcode = IBV_WC_RDMA_READ; wc->byte_len = CQE_LEN(com); break; case FW_RI_SEND: case FW_RI_SEND_WITH_SE: wc->opcode = IBV_WC_SEND; break; case FW_RI_SEND_WITH_INV: case FW_RI_SEND_WITH_SE_INV: wc->wc_flags |= IBV_WC_WITH_INV; wc->opcode = IBV_WC_SEND; break; case FW_RI_BIND_MW: wc->opcode = IBV_WC_BIND_MW; break; default: PDBG("Unexpected opcode %d " "in the CQE received for QPID=0x%0x\n", CQE_OPCODE(com), CQE_QPID(com)); ret = -EINVAL; goto out; } } if (cqe_flushed) wc->status = IBV_WC_WR_FLUSH_ERR; else { switch (CQE_STATUS(com)) { case T4_ERR_SUCCESS: wc->status = IBV_WC_SUCCESS; break; case T4_ERR_STAG: wc->status = IBV_WC_LOC_ACCESS_ERR; break; case T4_ERR_PDID: wc->status = IBV_WC_LOC_PROT_ERR; break; case T4_ERR_QPID: case T4_ERR_ACCESS: wc->status = IBV_WC_LOC_ACCESS_ERR; break; case T4_ERR_WRAP: wc->status = IBV_WC_GENERAL_ERR; break; case T4_ERR_BOUND: wc->status = IBV_WC_LOC_LEN_ERR; break; case T4_ERR_INVALIDATE_SHARED_MR: case T4_ERR_INVALIDATE_MR_WITH_MW_BOUND: wc->status = IBV_WC_MW_BIND_ERR; break; case T4_ERR_CRC: case T4_ERR_MARKER: case T4_ERR_PDU_LEN_ERR: case T4_ERR_OUT_OF_RQE: case T4_ERR_DDP_VERSION: case T4_ERR_RDMA_VERSION: case T4_ERR_DDP_QUEUE_NUM: case 
T4_ERR_MSN: case T4_ERR_TBIT: case T4_ERR_MO: case T4_ERR_MSN_RANGE: case T4_ERR_IRD_OVERFLOW: case T4_ERR_OPCODE: case T4_ERR_INTERNAL_ERR: wc->status = IBV_WC_FATAL_ERR; break; case T4_ERR_SWFLUSH: wc->status = IBV_WC_WR_FLUSH_ERR; break; default: PDBG("Unexpected cqe_status 0x%x for QPID=0x%0x\n", CQE_STATUS(com), CQE_QPID(com)); wc->status = IBV_WC_FATAL_ERR; } } if (wc->status && wc->status != IBV_WC_WR_FLUSH_ERR) syslog(LOG_NOTICE, "cxgb4 app err cqid %u qpid %u " "type %u opcode %u status 0x%x\n", chp->cq.cqid, CQE_QPID(com), CQE_TYPE(com), CQE_OPCODE(com), CQE_STATUS(com)); out: if (wq) { pthread_spin_unlock(&qhp->lock); if (srq) pthread_spin_unlock(&srq->lock); } return ret; } int c4iw_poll_cq(struct ibv_cq *ibcq, int num_entries, struct ibv_wc *wc) { struct c4iw_cq *chp; int npolled; int err = 0; chp = to_c4iw_cq(ibcq); if (t4_cq_in_error(&chp->cq)) { t4_reset_cq_in_error(&chp->cq); c4iw_flush_qps(chp->rhp); } if (!num_entries) return t4_cq_notempty(&chp->cq); pthread_spin_lock(&chp->lock); for (npolled = 0; npolled < num_entries; ++npolled) { do { err = c4iw_poll_cq_one(chp, wc + npolled); } while (err == -EAGAIN); if (err) break; } pthread_spin_unlock(&chp->lock); return !err || err == -ENODATA ? npolled : err; } int c4iw_arm_cq(struct ibv_cq *ibcq, int solicited) { struct c4iw_cq *chp; int ret; INC_STAT(arm); chp = to_c4iw_cq(ibcq); pthread_spin_lock(&chp->lock); ret = t4_arm_cq(&chp->cq, solicited); pthread_spin_unlock(&chp->lock); return ret; } void c4iw_flush_srqidx(struct c4iw_qp *qhp, u32 srqidx) { struct c4iw_cq *rchp = to_c4iw_cq(qhp->ibv_qp.recv_cq); /* create a SRQ RECV CQE for srqidx */ insert_recv_cqe(&qhp->wq, &rchp->cq, srqidx); } rdma-core-56.1/providers/cxgb4/cxgb4-abi.h000066400000000000000000000062621477342711600202750ustar00rootroot00000000000000/* * Copyright (c) 2006-2016 Chelsio, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef IWCH_ABI_H #define IWCH_ABI_H #include #include #include #include /* compat for ABI version 0 */ #define _c4iw_create_qp_resp_v0 \ { \ __u64 sq_key; \ __u64 rq_key; \ __u64 sq_db_gts_key; \ __u64 rq_db_gts_key; \ __u64 sq_memsize; \ __u64 rq_memsize; \ __u32 sqid; \ __u32 rqid; \ __u32 sq_size; \ __u32 rq_size; \ __u32 qid_mask; \ }; struct c4iw_create_qp_resp_v0 _c4iw_create_qp_resp_v0; #define _STRUCT_c4iw_create_qp_resp_v0 struct _c4iw_create_qp_resp_v0 DECLARE_DRV_CMD(uc4iw_alloc_pd, IB_USER_VERBS_CMD_ALLOC_PD, empty, c4iw_alloc_pd_resp); DECLARE_DRV_CMD(uc4iw_create_cq, IB_USER_VERBS_CMD_CREATE_CQ, c4iw_create_cq, c4iw_create_cq_resp); DECLARE_DRV_CMD(uc4iw_create_srq, IB_USER_VERBS_CMD_CREATE_SRQ, empty, c4iw_create_srq_resp); DECLARE_DRV_CMD(uc4iw_create_qp, IB_USER_VERBS_CMD_CREATE_QP, empty, c4iw_create_qp_resp); DECLARE_DRV_CMD(uc4iw_create_qp_v0, IB_USER_VERBS_CMD_CREATE_QP, empty, c4iw_create_qp_resp_v0); DECLARE_DRV_CMD(uc4iw_alloc_ucontext, IB_USER_VERBS_CMD_GET_CONTEXT, empty, c4iw_alloc_ucontext_resp); #endif /* IWCH_ABI_H */ rdma-core-56.1/providers/cxgb4/dev.c000066400000000000000000000314541477342711600173070ustar00rootroot00000000000000/* * Copyright (c) 2006-2016 Chelsio, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include "libcxgb4.h" #include "cxgb4-abi.h" static void c4iw_free_context(struct ibv_context *ibctx); #define PCI_VENDOR_ID_CHELSIO 0x1425 /* * Macros needed to support the PCI Device ID Table ... 
*/ #define CH_PCI_DEVICE_ID_TABLE_DEFINE_BEGIN \ static const struct verbs_match_ent hca_table[] = { \ VERBS_DRIVER_ID(RDMA_DRIVER_CXGB4), #define CH_PCI_DEVICE_ID_FUNCTION \ 0x4 #define CH_PCI_ID_TABLE_ENTRY(__DeviceID) \ VERBS_PCI_MATCH(PCI_VENDOR_ID_CHELSIO, __DeviceID, NULL) #define CH_PCI_DEVICE_ID_TABLE_DEFINE_END \ {} } #include "t4_chip_type.h" #include "t4_pci_id_tbl.h" unsigned long c4iw_page_size; unsigned long c4iw_page_shift; unsigned long c4iw_page_mask; int ma_wr; int t5_en_wc = 1; static LIST_HEAD(devices); static const struct verbs_context_ops c4iw_ctx_common_ops = { .query_device_ex = c4iw_query_device, .query_port = c4iw_query_port, .alloc_pd = c4iw_alloc_pd, .dealloc_pd = c4iw_free_pd, .reg_mr = c4iw_reg_mr, .dereg_mr = c4iw_dereg_mr, .create_cq = c4iw_create_cq, .destroy_cq = c4iw_destroy_cq, .create_srq = c4iw_create_srq, .modify_srq = c4iw_modify_srq, .destroy_srq = c4iw_destroy_srq, .query_srq = c4iw_query_srq, .create_qp = c4iw_create_qp, .modify_qp = c4iw_modify_qp, .destroy_qp = c4iw_destroy_qp, .query_qp = c4iw_query_qp, .attach_mcast = c4iw_attach_mcast, .detach_mcast = c4iw_detach_mcast, .post_srq_recv = c4iw_post_srq_recv, .req_notify_cq = c4iw_arm_cq, .free_context = c4iw_free_context, }; static const struct verbs_context_ops c4iw_ctx_t4_ops = { .async_event = c4iw_async_event, .poll_cq = c4iw_poll_cq, .post_recv = c4iw_post_receive, .post_send = c4iw_post_send, .req_notify_cq = c4iw_arm_cq, }; static struct verbs_context *c4iw_alloc_context(struct ibv_device *ibdev, int cmd_fd, void *private_data) { struct c4iw_context *context; struct ibv_get_context cmd; struct uc4iw_alloc_ucontext_resp resp; struct c4iw_dev *rhp = to_c4iw_dev(ibdev); struct ibv_device_attr attr; context = verbs_init_and_alloc_context(ibdev, cmd_fd, context, ibv_ctx, RDMA_DRIVER_CXGB4); if (!context) return NULL; resp.status_page_size = 0; resp.reserved = 0; if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof cmd, &resp.ibv_resp, sizeof resp)) goto err_free; if (resp.reserved) PDBG("%s c4iw_alloc_ucontext_resp reserved field modified by kernel\n", __FUNCTION__); context->status_page_size = resp.status_page_size; if (resp.status_page_size) { context->status_page = mmap(NULL, resp.status_page_size, PROT_READ, MAP_SHARED, cmd_fd, resp.status_page_key); if (context->status_page == MAP_FAILED) goto err_free; } verbs_set_ops(&context->ibv_ctx, &c4iw_ctx_common_ops); if (c4iw_query_device(&context->ibv_ctx.context, NULL, container_of(&attr, struct ibv_device_attr_ex, orig_attr), sizeof(attr))) goto err_unmap; if (!rhp->mmid2ptr) { rhp->max_mr = attr.max_mr; rhp->mmid2ptr = calloc(attr.max_mr, sizeof(void *)); if (!rhp->mmid2ptr) { goto err_unmap; } if (rhp->abi_version < 3) { fprintf(stderr, "Warning: iw_cxgb4 driver is of older version" " than libcxgb4:: %d\n", rhp->abi_version); rhp->max_qp = T4_QID_BASE + attr.max_qp; } else { rhp->max_qp = context->status_page->qp_start + context->status_page->qp_size; } rhp->qpid2ptr = calloc(rhp->max_qp, sizeof(void *)); if (!rhp->qpid2ptr) { goto err_unmap; } if (rhp->abi_version < 3) rhp->max_cq = T4_QID_BASE + attr.max_cq; else rhp->max_cq = context->status_page->cq_start + context->status_page->cq_size; rhp->cqid2ptr = calloc(rhp->max_cq, sizeof(void *)); if (!rhp->cqid2ptr) goto err_unmap; rhp->write_cmpl_supported = context->status_page->write_cmpl_supported; } rhp->chip_version = CHELSIO_CHIP_VERSION(attr.vendor_part_id >> 8); switch (rhp->chip_version) { case CHELSIO_T6: PDBG("%s T6/T5/T4 device\n", __func__); case CHELSIO_T5: PDBG("%s T5/T4 
device\n", __func__); case CHELSIO_T4: PDBG("%s T4 device\n", __func__); verbs_set_ops(&context->ibv_ctx, &c4iw_ctx_t4_ops); break; default: PDBG("%s unknown hca type %d\n", __func__, rhp->chip_version); goto err_unmap; } return &context->ibv_ctx; err_unmap: munmap(context->status_page, context->status_page_size); err_free: if (rhp->cqid2ptr) free(rhp->cqid2ptr); if (rhp->qpid2ptr) free(rhp->qpid2ptr); if (rhp->mmid2ptr) free(rhp->mmid2ptr); verbs_uninit_context(&context->ibv_ctx); free(context); return NULL; } static void c4iw_free_context(struct ibv_context *ibctx) { struct c4iw_context *context = to_c4iw_context(ibctx); if (context->status_page_size) munmap(context->status_page, context->status_page_size); verbs_uninit_context(&context->ibv_ctx); free(context); } static void c4iw_uninit_device(struct verbs_device *verbs_device) { struct c4iw_dev *dev = to_c4iw_dev(&verbs_device->device); free(dev); } #ifdef STALL_DETECTION int stall_to; static void dump_cq(struct c4iw_cq *chp) { int i; fprintf(stderr, "CQ: %p id %u queue %p cidx 0x%08x sw_queue %p sw_cidx %d sw_pidx %d sw_in_use %d depth %u error %u gen %d " "cidx_inc %d bits_type_ts %016" PRIx64 " notempty %d\n", chp, chp->cq.cqid, chp->cq.queue, chp->cq.cidx, chp->cq.sw_queue, chp->cq.sw_cidx, chp->cq.sw_pidx, chp->cq.sw_in_use, chp->cq.size, chp->cq.error, chp->cq.gen, chp->cq.cidx_inc, be64toh(chp->cq.bits_type_ts), t4_cq_notempty(&chp->cq)); for (i=0; i < chp->cq.size; i++) { u64 *p = (u64 *)(chp->cq.queue + i); fprintf(stderr, "%02x: %016" PRIx64 " %016" PRIx64, i, be64toh(p[0]), be64toh(p[1])); if (i == chp->cq.cidx) fprintf(stderr, " <-- cidx\n"); else fprintf(stderr, "\n"); p+= 2; fprintf(stderr, "%02x: %016" PRIx64 " %016" PRIx64 "\n", i, be64toh(p[0]), be64toh(p[1])); p+= 2; fprintf(stderr, "%02x: %016" PRIx64 " %016" PRIx64 "\n", i, be64toh(p[0]), be64toh(p[1])); p+= 2; fprintf(stderr, "%02x: %016" PRIx64 " %016" PRIx64 "\n", i, be64toh(p[0]), be64toh(p[1])); p+= 2; } } static void dump_qp(struct c4iw_qp *qhp) { int i; int j; struct t4_swsqe *swsqe; struct t4_swrqe *swrqe; u16 cidx, pidx; u64 *p; fprintf(stderr, "QP: %p id %u error %d flushed %d qid_mask 0x%x\n" " SQ: id %u queue %p sw_queue %p cidx %u pidx %u in_use %u wq_pidx %u depth %u flags 0x%x flush_cidx %d\n" " RQ: id %u queue %p sw_queue %p cidx %u pidx %u in_use %u depth %u\n", qhp, qhp->wq.sq.qid, qhp->wq.error, qhp->wq.flushed, qhp->wq.qid_mask, qhp->wq.sq.qid, qhp->wq.sq.queue, qhp->wq.sq.sw_sq, qhp->wq.sq.cidx, qhp->wq.sq.pidx, qhp->wq.sq.in_use, qhp->wq.sq.wq_pidx, qhp->wq.sq.size, qhp->wq.sq.flags, qhp->wq.sq.flush_cidx, qhp->wq.rq.qid, qhp->wq.rq.queue, qhp->wq.rq.sw_rq, qhp->wq.rq.cidx, qhp->wq.rq.pidx, qhp->wq.rq.in_use, qhp->wq.rq.size); cidx = qhp->wq.sq.cidx; pidx = qhp->wq.sq.pidx; if (cidx != pidx) fprintf(stderr, "SQ: \n"); while (cidx != pidx) { swsqe = &qhp->wq.sq.sw_sq[cidx]; fprintf(stderr, "%04u: wr_id %016" PRIx64 " sq_wptr %08x read_len %u opcode 0x%x " "complete %u signaled %u cqe %016" PRIx64 " %016" PRIx64 " %016" PRIx64 " %016" PRIx64 "\n", cidx, swsqe->wr_id, swsqe->idx, swsqe->read_len, swsqe->opcode, swsqe->complete, swsqe->signaled, htobe64(((uint64_t *)&swsqe->cqe)[0]), htobe64(((uint64_t *)&swsqe->cqe)[1]), htobe64(((uint64_t *)&swsqe->cqe)[2]), htobe64(((uint64_t *)&swsqe->cqe)[3])); if (++cidx == qhp->wq.sq.size) cidx = 0; } fprintf(stderr, "SQ WQ: \n"); p = (u64 *)qhp->wq.sq.queue; for (i=0; i < qhp->wq.sq.size * T4_SQ_NUM_SLOTS; i++) { for (j=0; j < T4_EQ_ENTRY_SIZE / 16; j++) { fprintf(stderr, "%04u %016" PRIx64 " %016" PRIx64 " 
", i, be64toh(p[0]), be64toh(p[1])); if (j == 0 && i == qhp->wq.sq.wq_pidx) fprintf(stderr, " <-- pidx"); fprintf(stderr, "\n"); p += 2; } } cidx = qhp->wq.rq.cidx; pidx = qhp->wq.rq.pidx; if (cidx != pidx) fprintf(stderr, "RQ: \n"); while (cidx != pidx) { swrqe = &qhp->wq.rq.sw_rq[cidx]; fprintf(stderr, "%04u: wr_id %016" PRIx64 "\n", cidx, swrqe->wr_id ); if (++cidx == qhp->wq.rq.size) cidx = 0; } fprintf(stderr, "RQ WQ: \n"); p = (u64 *)qhp->wq.rq.queue; for (i=0; i < qhp->wq.rq.size * T4_RQ_NUM_SLOTS; i++) { for (j=0; j < T4_EQ_ENTRY_SIZE / 16; j++) { fprintf(stderr, "%04u %016" PRIx64 " %016" PRIx64 " ", i, be64toh(p[0]), be64toh(p[1])); if (j == 0 && i == qhp->wq.rq.pidx) fprintf(stderr, " <-- pidx"); if (j == 0 && i == qhp->wq.rq.cidx) fprintf(stderr, " <-- cidx"); fprintf(stderr, "\n"); p+=2; } } } void dump_state(void) { struct c4iw_dev *dev; int i; fprintf(stderr, "STALL DETECTED:\n"); list_for_each(&devices, dev, list) { //pthread_spin_lock(&dev->lock); fprintf(stderr, "Device %s\n", dev->ibv_dev.name); for (i=0; i < dev->max_cq; i++) { if (dev->cqid2ptr[i]) { struct c4iw_cq *chp = dev->cqid2ptr[i]; //pthread_spin_lock(&chp->lock); dump_cq(chp); //pthread_spin_unlock(&chp->lock); } } for (i=0; i < dev->max_qp; i++) { if (dev->qpid2ptr[i]) { struct c4iw_qp *qhp = dev->qpid2ptr[i]; //pthread_spin_lock(&qhp->lock); dump_qp(qhp); //pthread_spin_unlock(&qhp->lock); } } //pthread_spin_unlock(&dev->lock); } fprintf(stderr, "DUMP COMPLETE:\n"); fflush(stderr); } #endif /* end of STALL_DETECTION */ /* * c4iw_abi_version is used to store ABI for iw_cxgb4 so the user mode library * can know if the driver supports the kernel mode db ringing. */ int c4iw_abi_version = 1; static struct verbs_device *c4iw_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct c4iw_dev *dev; c4iw_page_size = sysconf(_SC_PAGESIZE); c4iw_page_shift = long_log2(c4iw_page_size); c4iw_page_mask = ~(c4iw_page_size - 1); dev = calloc(1, sizeof *dev); if (!dev) return NULL; pthread_spin_init(&dev->lock, PTHREAD_PROCESS_PRIVATE); c4iw_abi_version = sysfs_dev->abi_ver; dev->abi_version = sysfs_dev->abi_ver; list_node_init(&dev->list); list_head_init(&dev->srq_list); PDBG("%s device claimed\n", __FUNCTION__); list_add_tail(&devices, &dev->list); #ifdef STALL_DETECTION { char *c = getenv("CXGB4_STALL_TIMEOUT"); if (c) { stall_to = strtol(c, NULL, 0); if (errno || stall_to < 0) stall_to = 0; } } #endif { char *c = getenv("CXGB4_MA_WR"); if (c) { ma_wr = strtol(c, NULL, 0); if (ma_wr != 1) ma_wr = 0; } } { char *c = getenv("T5_ENABLE_WC"); if (c) { t5_en_wc = strtol(c, NULL, 0); if (t5_en_wc != 1) t5_en_wc = 0; } } return &dev->ibv_dev; } static const struct verbs_device_ops c4iw_dev_ops = { .name = "cxgb4", .match_min_abi_version = 0, .match_max_abi_version = INT_MAX, .match_table = hca_table, .alloc_device = c4iw_device_alloc, .uninit_device = c4iw_uninit_device, .alloc_context = c4iw_alloc_context, }; PROVIDER_DRIVER(cxgb4, c4iw_dev_ops); #ifdef STATS void __attribute__ ((destructor)) cs_fini(void); void __attribute__ ((destructor)) cs_fini(void) { syslog(LOG_NOTICE, "cxgb4 stats - sends %lu recv %lu read %lu " "write %lu arm %lu cqe %lu mr %lu qp %lu cq %lu\n", c4iw_stats.send, c4iw_stats.recv, c4iw_stats.read, c4iw_stats.write, c4iw_stats.arm, c4iw_stats.cqe, c4iw_stats.mr, c4iw_stats.qp, c4iw_stats.cq); } #endif rdma-core-56.1/providers/cxgb4/libcxgb4.h000066400000000000000000000166411477342711600202350ustar00rootroot00000000000000/* * Copyright (c) 2006-2016 Chelsio, Inc. All rights reserved. 
* * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef IWCH_H #define IWCH_H #include #include #include #include #include #include #include #include #include #include #include "t4.h" extern unsigned long c4iw_page_size; extern unsigned long c4iw_page_shift; extern unsigned long c4iw_page_mask; struct c4iw_mr; struct c4iw_dev { struct verbs_device ibv_dev; unsigned chip_version; int max_mr; struct c4iw_mr **mmid2ptr; int max_qp; struct c4iw_qp **qpid2ptr; int max_cq; struct c4iw_cq **cqid2ptr; struct list_head srq_list; pthread_spinlock_t lock; struct list_node list; int abi_version; bool write_cmpl_supported; }; static inline int dev_is_t6(struct c4iw_dev *dev) { return dev->chip_version == CHELSIO_T6; } static inline int dev_is_t5(struct c4iw_dev *dev) { return dev->chip_version == CHELSIO_T5; } static inline int dev_is_t4(struct c4iw_dev *dev) { return dev->chip_version == CHELSIO_T4; } struct c4iw_context { struct verbs_context ibv_ctx; struct t4_dev_status_page *status_page; int status_page_size; }; struct c4iw_pd { struct ibv_pd ibv_pd; }; struct c4iw_mr { struct verbs_mr vmr; uint64_t va_fbo; uint32_t len; }; static inline u32 c4iw_mmid(u32 stag) { return (stag >> 8); } struct c4iw_cq { struct ibv_cq ibv_cq; struct c4iw_dev *rhp; struct t4_cq cq; pthread_spinlock_t lock; #ifdef STALL_DETECTION struct timeval time; int dumped; #endif }; struct c4iw_qp { struct ibv_qp ibv_qp; struct c4iw_dev *rhp; struct t4_wq wq; pthread_spinlock_t lock; int sq_sig_all; struct c4iw_srq *srq; }; #define to_c4iw_xxx(xxx, type) \ container_of(ib##xxx, struct c4iw_##type, ibv_##xxx) struct c4iw_srq { struct ibv_srq ibv_srq; int type; /* must be 2nd in this struct */ struct c4iw_dev *rhp; struct t4_srq wq; struct list_node list; pthread_spinlock_t lock; uint32_t srq_limit; int armed; __u32 flags; }; static inline struct c4iw_srq *to_c4iw_srq(struct ibv_srq *ibsrq) { return to_c4iw_xxx(srq, srq); } static inline struct c4iw_dev *to_c4iw_dev(struct ibv_device *ibdev) { return container_of(ibdev, struct c4iw_dev, ibv_dev.device); } static inline struct c4iw_context *to_c4iw_context(struct ibv_context *ibctx) { return container_of(ibctx, struct c4iw_context, ibv_ctx.context); } static inline struct c4iw_pd *to_c4iw_pd(struct ibv_pd *ibpd) { return to_c4iw_xxx(pd, pd); } static inline struct c4iw_cq 
*to_c4iw_cq(struct ibv_cq *ibcq) { return to_c4iw_xxx(cq, cq); } static inline struct c4iw_qp *to_c4iw_qp(struct ibv_qp *ibqp) { return to_c4iw_xxx(qp, qp); } static inline struct c4iw_mr *to_c4iw_mr(struct verbs_mr *vmr) { return container_of(vmr, struct c4iw_mr, vmr); } static inline struct c4iw_qp *get_qhp(struct c4iw_dev *rhp, u32 qid) { return rhp->qpid2ptr[qid]; } static inline struct c4iw_cq *get_chp(struct c4iw_dev *rhp, u32 qid) { return rhp->cqid2ptr[qid]; } static inline unsigned long_log2(unsigned long x) { unsigned r = 0; for (x >>= 1; x > 0; x >>= 1) r++; return r; } int c4iw_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); int c4iw_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr); struct ibv_pd *c4iw_alloc_pd(struct ibv_context *context); int c4iw_free_pd(struct ibv_pd *pd); struct ibv_mr *c4iw_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access); int c4iw_dereg_mr(struct verbs_mr *vmr); struct ibv_cq *c4iw_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); int c4iw_destroy_cq(struct ibv_cq *cq); int c4iw_poll_cq(struct ibv_cq *cq, int ne, struct ibv_wc *wc); int c4iw_arm_cq(struct ibv_cq *cq, int solicited); void c4iw_cq_event(struct ibv_cq *cq); void c4iw_init_cq_buf(struct c4iw_cq *cq, int nent); struct ibv_srq *c4iw_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr); int c4iw_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, int mask); int c4iw_destroy_srq(struct ibv_srq *srq); int c4iw_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); int c4iw_query_srq(struct ibv_srq *ibsrq, struct ibv_srq_attr *attr); struct ibv_qp *c4iw_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); int c4iw_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask); int c4iw_destroy_qp(struct ibv_qp *qp); int c4iw_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); void c4iw_flush_qp(struct c4iw_qp *qhp); void c4iw_flush_qps(struct c4iw_dev *dev); int c4iw_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr); int c4iw_post_receive(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); int c4iw_attach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); int c4iw_detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); void c4iw_async_event(struct ibv_context *context, struct ibv_async_event *event); void c4iw_flush_hw_cq(struct c4iw_cq *chp, struct c4iw_qp *flush_qhp); int c4iw_flush_rq(struct t4_wq *wq, struct t4_cq *cq, int count); void c4iw_flush_sq(struct c4iw_qp *qhp); void c4iw_count_rcqes(struct t4_cq *cq, struct t4_wq *wq, int *count); void c4iw_copy_wr_to_srq(struct t4_srq *srq, union t4_recv_wr *wqe, u8 len16); void c4iw_flush_srqidx(struct c4iw_qp *qhp, u32 srqidx); #define FW_MAJ 0 #define FW_MIN 0 #ifdef STATS #define INC_STAT(a) { c4iw_stats.a++; } struct c4iw_stats { unsigned long send; unsigned long recv; unsigned long read; unsigned long write; unsigned long arm; unsigned long cqe; unsigned long mr; unsigned long qp; unsigned long cq; }; extern struct c4iw_stats c4iw_stats; #else #define INC_STAT(a) #endif #ifdef STALL_DETECTION void dump_state(void); extern int stall_to; #endif #endif /* IWCH_H */ 
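/*
 * Editorial sketch, not part of rdma-core: a minimal application-side
 * polling loop built only on the public verbs API, shown to illustrate
 * the return convention implemented by c4iw_poll_cq() above -- -EAGAIN is
 * retried inside the provider, -ENODATA is folded into the completion
 * count, so callers only ever see a count >= 0 or a fatal negative errno.
 * The function name example_drain_cq and the `budget` parameter are
 * hypothetical; only ibv_poll_cq() and ibv_wc_status_str() are real API.
 */
#if 0	/* illustrative only; compiles standalone against libibverbs */
#include <stdio.h>
#include <infiniband/verbs.h>

/* Drain up to `budget` completions from `cq`; returns the number of
 * completions seen, or a negative errno propagated from the provider. */
static int example_drain_cq(struct ibv_cq *cq, int budget)
{
	struct ibv_wc wc[16];
	int total = 0;

	while (total < budget) {
		int n = ibv_poll_cq(cq, 16, wc);

		if (n < 0)
			return n;	/* fatal provider error */
		if (n == 0)
			break;		/* CQ empty for now */
		for (int i = 0; i < n; i++)
			if (wc[i].status != IBV_WC_SUCCESS)
				fprintf(stderr, "wr_id %llu failed: %s\n",
					(unsigned long long)wc[i].wr_id,
					ibv_wc_status_str(wc[i].status));
		total += n;
	}
	return total;
}
#endif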
rdma-core-56.1/providers/cxgb4/qp.c000066400000000000000000000545331477342711600171540ustar00rootroot00000000000000/* * Copyright (c) 2006-2016 Chelsio, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include "libcxgb4.h" #ifdef STATS struct c4iw_stats c4iw_stats; #endif static void copy_wr_to_sq(struct t4_wq *wq, union t4_wr *wqe, u8 len16) { u64 *src, *dst; src = (u64 *)wqe; dst = (u64 *)((u8 *)wq->sq.queue + wq->sq.wq_pidx * T4_EQ_ENTRY_SIZE); if (t4_sq_onchip(wq)) { len16 = align(len16, 4); /* In onchip mode the copy below will be made to WC memory and * could trigger DMA. 
In offchip mode the copy below only * queues the WQE, DMA cannot start until t4_ring_sq_db * happens */ mmio_wc_start(); } while (len16) { *dst++ = *src++; if (dst == (u64 *)&wq->sq.queue[wq->sq.size]) dst = (u64 *)wq->sq.queue; *dst++ = *src++; if (dst == (u64 *)&wq->sq.queue[wq->sq.size]) dst = (u64 *)wq->sq.queue; len16--; /* NOTE len16 cannot be large enough to write to the same sq.queue memory twice in this loop */ } if (t4_sq_onchip(wq)) mmio_flush_writes(); } static void copy_wr_to_rq(struct t4_wq *wq, union t4_recv_wr *wqe, u8 len16) { u64 *src, *dst; src = (u64 *)wqe; dst = (u64 *)((u8 *)wq->rq.queue + wq->rq.wq_pidx * T4_EQ_ENTRY_SIZE); while (len16) { *dst++ = *src++; if (dst >= (u64 *)&wq->rq.queue[wq->rq.size]) dst = (u64 *)wq->rq.queue; *dst++ = *src++; if (dst >= (u64 *)&wq->rq.queue[wq->rq.size]) dst = (u64 *)wq->rq.queue; len16--; } } void c4iw_copy_wr_to_srq(struct t4_srq *srq, union t4_recv_wr *wqe, u8 len16) { u64 *src, *dst; src = (u64 *)wqe; dst = (u64 *)((u8 *)srq->queue + srq->wq_pidx * T4_EQ_ENTRY_SIZE); while (len16) { *dst++ = *src++; if (dst >= (u64 *)&srq->queue[srq->size]) dst = (u64 *)srq->queue; *dst++ = *src++; if (dst >= (u64 *)&srq->queue[srq->size]) dst = (u64 *)srq->queue; len16--; } } static int build_immd(struct t4_sq *sq, struct fw_ri_immd *immdp, struct ibv_send_wr *wr, int max, u32 *plenp) { u8 *dstp, *srcp; u32 plen = 0; int i; int len; dstp = (u8 *)immdp->data; for (i = 0; i < wr->num_sge; i++) { if ((plen + wr->sg_list[i].length) > max) return -EMSGSIZE; srcp = (u8 *)(unsigned long)wr->sg_list[i].addr; plen += wr->sg_list[i].length; len = wr->sg_list[i].length; memcpy(dstp, srcp, len); dstp += len; srcp += len; } len = ROUND_UP(plen + 8, 16) - (plen + 8); if (len) memset(dstp, 0, len); immdp->op = FW_RI_DATA_IMMD; immdp->r1 = 0; immdp->r2 = 0; immdp->immdlen = htobe32(plen); *plenp = plen; return 0; } static int build_isgl(__be64 *queue_start, __be64 *queue_end, struct fw_ri_isgl *isglp, struct ibv_sge *sg_list, int num_sge, u32 *plenp) { int i; u32 plen = 0; __be64 *flitp; if ((__be64 *)isglp == queue_end) isglp = (struct fw_ri_isgl *)queue_start; flitp = (__be64 *)isglp->sge; for (i = 0; i < num_sge; i++) { if ((plen + sg_list[i].length) < plen) return -EMSGSIZE; plen += sg_list[i].length; *flitp = htobe64(((u64)sg_list[i].lkey << 32) | sg_list[i].length); if (++flitp == queue_end) flitp = queue_start; *flitp = htobe64(sg_list[i].addr); if (++flitp == queue_end) flitp = queue_start; } *flitp = 0; isglp->op = FW_RI_DATA_ISGL; isglp->r1 = 0; isglp->nsge = htobe16(num_sge); isglp->r2 = 0; if (plenp) *plenp = plen; return 0; } static int build_rdma_send(struct t4_sq *sq, union t4_wr *wqe, struct ibv_send_wr *wr, u8 *len16) { u32 plen; int size; int ret; if (wr->num_sge > T4_MAX_SEND_SGE) return -EINVAL; switch (wr->opcode) { case IBV_WR_SEND: if (wr->send_flags & IBV_SEND_SOLICITED) wqe->send.sendop_pkd = htobe32(FW_RI_SEND_WR_SENDOP_V(FW_RI_SEND_WITH_SE)); else wqe->send.sendop_pkd = htobe32(FW_RI_SEND_WR_SENDOP_V(FW_RI_SEND)); wqe->send.stag_inv = 0; break; case IBV_WR_SEND_WITH_INV: if (wr->send_flags & IBV_SEND_SOLICITED) wqe->send.sendop_pkd = htobe32(FW_RI_SEND_WR_SENDOP_V(FW_RI_SEND_WITH_SE_INV)); else wqe->send.sendop_pkd = htobe32(FW_RI_SEND_WR_SENDOP_V(FW_RI_SEND_WITH_INV)); wqe->send.stag_inv = htobe32(wr->invalidate_rkey); break; default: return -EINVAL; } wqe->send.r3 = 0; wqe->send.r4 = 0; plen = 0; if (wr->num_sge) { if (wr->send_flags & IBV_SEND_INLINE) { ret = build_immd(sq, wqe->send.u.immd_src, wr, T4_MAX_SEND_INLINE, &plen); if 
(ret) return ret; size = sizeof wqe->send + sizeof(struct fw_ri_immd) + plen; } else { ret = build_isgl((__be64 *)sq->queue, (__be64 *)&sq->queue[sq->size], wqe->send.u.isgl_src, wr->sg_list, wr->num_sge, &plen); if (ret) return ret; size = sizeof wqe->send + sizeof(struct fw_ri_isgl) + wr->num_sge * sizeof (struct fw_ri_sge); } } else { wqe->send.u.immd_src[0].op = FW_RI_DATA_IMMD; wqe->send.u.immd_src[0].r1 = 0; wqe->send.u.immd_src[0].r2 = 0; wqe->send.u.immd_src[0].immdlen = 0; size = sizeof wqe->send + sizeof(struct fw_ri_immd); plen = 0; } *len16 = DIV_ROUND_UP(size, 16); wqe->send.plen = htobe32(plen); return 0; } static int build_rdma_write(struct t4_sq *sq, union t4_wr *wqe, struct ibv_send_wr *wr, u8 *len16) { u32 plen; int size; int ret; if (wr->num_sge > T4_MAX_SEND_SGE) return -EINVAL; if (wr->opcode == IBV_WR_RDMA_WRITE_WITH_IMM) wqe->write.iw_imm_data.ib_imm_data.imm_data32 = wr->imm_data; else wqe->write.iw_imm_data.ib_imm_data.imm_data32 = 0; wqe->write.stag_sink = htobe32(wr->wr.rdma.rkey); wqe->write.to_sink = htobe64(wr->wr.rdma.remote_addr); if (wr->num_sge) { if (wr->send_flags & IBV_SEND_INLINE) { ret = build_immd(sq, wqe->write.u.immd_src, wr, T4_MAX_WRITE_INLINE, &plen); if (ret) return ret; size = sizeof wqe->write + sizeof(struct fw_ri_immd) + plen; } else { ret = build_isgl((__be64 *)sq->queue, (__be64 *)&sq->queue[sq->size], wqe->write.u.isgl_src, wr->sg_list, wr->num_sge, &plen); if (ret) return ret; size = sizeof wqe->write + sizeof(struct fw_ri_isgl) + wr->num_sge * sizeof (struct fw_ri_sge); } } else { wqe->write.u.immd_src[0].op = FW_RI_DATA_IMMD; wqe->write.u.immd_src[0].r1 = 0; wqe->write.u.immd_src[0].r2 = 0; wqe->write.u.immd_src[0].immdlen = 0; size = sizeof wqe->write + sizeof(struct fw_ri_immd); plen = 0; } *len16 = DIV_ROUND_UP(size, 16); wqe->write.plen = htobe32(plen); return 0; } static void build_immd_cmpl(struct t4_sq *sq, struct fw_ri_immd_cmpl *immdp, struct ibv_send_wr *wr) { memcpy((u8 *)immdp->data, (u8 *)(uintptr_t)wr->sg_list->addr, 16); memset(immdp->r1, 0, 6); immdp->op = FW_RI_DATA_IMMD; immdp->immdlen = 16; } static void build_rdma_write_cmpl(struct t4_sq *sq, struct fw_ri_rdma_write_cmpl_wr *wcwr, struct ibv_send_wr *wr, u8 *len16) { u32 plen; int size; /* * This code assumes the struct fields preceding the write isgl fit * in one 64B WR slot. This is because the WQE is built directly in * the dma queue, and wrapping is only handled by the code buildling * sgls. IE the "fixed part" of the wr structs must all fit in 64B. * The WQE build code should probably be redesigned to avoid this * restriction, but for now just add a static_assert() to catch if * this WQE struct gets too big. 
*/ static_assert(offsetof(struct fw_ri_rdma_write_cmpl_wr, u) <= 64, "WQE structure too BIG!"); wcwr->stag_sink = htobe32(wr->wr.rdma.rkey); wcwr->to_sink = htobe64(wr->wr.rdma.remote_addr); if (wr->next->opcode == IBV_WR_SEND) wcwr->stag_inv = 0; else wcwr->stag_inv = htobe32(wr->next->invalidate_rkey); wcwr->r2 = 0; wcwr->r3 = 0; /* SEND_INV SGL */ if (wr->next->send_flags & IBV_SEND_INLINE) build_immd_cmpl(sq, &wcwr->u_cmpl.immd_src, wr->next); else build_isgl((__be64 *)sq->queue, (__be64 *)&sq->queue[sq->size], &wcwr->u_cmpl.isgl_src, wr->next->sg_list, 1, NULL); /* WRITE SGL */ build_isgl((__be64 *)sq->queue, (__be64 *)&sq->queue[sq->size], wcwr->u.isgl_src, wr->sg_list, wr->num_sge, &plen); size = sizeof(*wcwr) + sizeof(struct fw_ri_isgl) + wr->num_sge * sizeof(struct fw_ri_sge); wcwr->plen = htobe32(plen); *len16 = DIV_ROUND_UP(size, 16); } static int build_rdma_read(union t4_wr *wqe, struct ibv_send_wr *wr, u8 *len16) { if (wr->num_sge > 1) return -EINVAL; if (wr->num_sge) { wqe->read.stag_src = htobe32(wr->wr.rdma.rkey); wqe->read.to_src_hi = htobe32((u32)(wr->wr.rdma.remote_addr >>32)); wqe->read.to_src_lo = htobe32((u32)wr->wr.rdma.remote_addr); wqe->read.stag_sink = htobe32(wr->sg_list[0].lkey); wqe->read.plen = htobe32(wr->sg_list[0].length); wqe->read.to_sink_hi = htobe32((u32)(wr->sg_list[0].addr >> 32)); wqe->read.to_sink_lo = htobe32((u32)(wr->sg_list[0].addr)); } else { wqe->read.stag_src = htobe32(2); wqe->read.to_src_hi = 0; wqe->read.to_src_lo = 0; wqe->read.stag_sink = htobe32(2); wqe->read.plen = 0; wqe->read.to_sink_hi = 0; wqe->read.to_sink_lo = 0; } wqe->read.r2 = 0; wqe->read.r5 = 0; *len16 = DIV_ROUND_UP(sizeof wqe->read, 16); return 0; } static int build_rdma_recv(struct t4_rq *rq, union t4_recv_wr *wqe, struct ibv_recv_wr *wr, u8 *len16) { int ret; ret = build_isgl((__be64 *)rq->queue, (__be64 *)&rq->queue[rq->size], &wqe->recv.isgl, wr->sg_list, wr->num_sge, NULL); if (ret) return ret; *len16 = DIV_ROUND_UP(sizeof wqe->recv + wr->num_sge * sizeof(struct fw_ri_sge), 16); return 0; } static int build_srq_recv(union t4_recv_wr *wqe, struct ibv_recv_wr *wr, u8 *len16) { int ret; ret = build_isgl((__be64 *)wqe, (__be64 *)(wqe + 1), &wqe->recv.isgl, wr->sg_list, wr->num_sge, NULL); if (ret) return ret; *len16 = DIV_ROUND_UP(sizeof(wqe->recv) + wr->num_sge * sizeof(struct fw_ri_sge), 16); return 0; } static void ring_kernel_db(struct c4iw_qp *qhp, u32 qid, u16 idx) { struct ibv_modify_qp cmd = {}; struct ibv_qp_attr attr; int mask; int __attribute__((unused)) ret; /* FIXME: Why do we need this barrier if the kernel is going to trigger the DMA? */ udma_to_device_barrier(); if (qid == qhp->wq.sq.qid) { attr.sq_psn = idx; mask = IBV_QP_SQ_PSN; } else { attr.rq_psn = idx; mask = IBV_QP_RQ_PSN; } ret = ibv_cmd_modify_qp(&qhp->ibv_qp, &attr, mask, &cmd, sizeof cmd); assert(!ret); } static void post_write_cmpl(struct c4iw_qp *qhp, struct ibv_send_wr *wr) { bool send_signaled = (wr->next->send_flags & IBV_SEND_SIGNALED) || qhp->sq_sig_all; bool write_signaled = (wr->send_flags & IBV_SEND_SIGNALED) || qhp->sq_sig_all; struct t4_swsqe *swsqe; union t4_wr *wqe; u16 write_wrid; u8 len16; u16 idx; /* * The sw_sq entries still look like a WRITE and a SEND and consume * 2 slots. The FW WR, however, will be a single uber-WR. 
	 */
	wqe = (union t4_wr *)((u8 *)qhp->wq.sq.queue +
			      qhp->wq.sq.wq_pidx * T4_EQ_ENTRY_SIZE);
	build_rdma_write_cmpl(&qhp->wq.sq, &wqe->write_cmpl, wr, &len16);

	/* WRITE swsqe */
	swsqe = &qhp->wq.sq.sw_sq[qhp->wq.sq.pidx];
	swsqe->opcode = FW_RI_RDMA_WRITE;
	swsqe->idx = qhp->wq.sq.pidx;
	swsqe->complete = 0;
	swsqe->signaled = write_signaled;
	swsqe->flushed = 0;
	swsqe->wr_id = wr->wr_id;

	write_wrid = qhp->wq.sq.pidx;

	/* just bump the sw_sq */
	qhp->wq.sq.in_use++;
	if (++qhp->wq.sq.pidx == qhp->wq.sq.size)
		qhp->wq.sq.pidx = 0;

	/* SEND swsqe */
	swsqe = &qhp->wq.sq.sw_sq[qhp->wq.sq.pidx];
	if (wr->next->opcode == IBV_WR_SEND)
		swsqe->opcode = FW_RI_SEND;
	else
		swsqe->opcode = FW_RI_SEND_WITH_INV;
	swsqe->idx = qhp->wq.sq.pidx;
	swsqe->complete = 0;
	swsqe->signaled = send_signaled;
	swsqe->flushed = 0;
	swsqe->wr_id = wr->next->wr_id;

	wqe->write_cmpl.flags_send = send_signaled ? FW_RI_COMPLETION_FLAG : 0;
	wqe->write_cmpl.wrid_send = qhp->wq.sq.pidx;

	init_wr_hdr(wqe, write_wrid, FW_RI_RDMA_WRITE_CMPL_WR,
		    write_signaled ? FW_RI_COMPLETION_FLAG : 0, len16);
	t4_sq_produce(&qhp->wq, len16);

	idx = DIV_ROUND_UP(len16 * 16, T4_EQ_ENTRY_SIZE);
	t4_ring_sq_db(&qhp->wq, idx, dev_is_t4(qhp->rhp), len16, wqe);
}

int c4iw_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
		   struct ibv_send_wr **bad_wr)
{
	int err = 0;
	u8 uninitialized_var(len16);
	enum fw_wr_opcodes fw_opcode;
	enum fw_ri_wr_flags fw_flags;
	struct c4iw_qp *qhp;
	union t4_wr *wqe, lwqe;
	u32 num_wrs;
	struct t4_swsqe *swsqe;
	u16 idx = 0;

	qhp = to_c4iw_qp(ibqp);
	pthread_spin_lock(&qhp->lock);
	if (t4_wq_in_error(&qhp->wq)) {
		pthread_spin_unlock(&qhp->lock);
		*bad_wr = wr;
		return -EINVAL;
	}
	num_wrs = t4_sq_avail(&qhp->wq);
	if (num_wrs == 0) {
		pthread_spin_unlock(&qhp->lock);
		*bad_wr = wr;
		return -ENOMEM;
	}

	/*
	 * Fastpath for NVMe-oF target WRITE + SEND_WITH_INV wr chain which is
	 * the response for small NVMe-oF READ requests. If the chain is
	 * exactly a WRITE->SEND_WITH_INV or a WRITE->SEND and the sgl depths
	 * and lengths meet the requirements of the fw_ri_write_cmpl_wr work
	 * request, then build and post the write_cmpl WR. If any of the tests
	 * below are not true, then we continue on with the traditional WRITE
	 * and SEND WRs.
*/ if (qhp->rhp->write_cmpl_supported && qhp->rhp->chip_version >= CHELSIO_T5 && wr && wr->next && !wr->next->next && wr->opcode == IBV_WR_RDMA_WRITE && wr->sg_list[0].length && wr->num_sge <= T4_WRITE_CMPL_MAX_SGL && (wr->next->opcode == IBV_WR_SEND_WITH_INV || wr->next->opcode == IBV_WR_SEND) && wr->next->sg_list[0].length == T4_WRITE_CMPL_MAX_CQE && wr->next->num_sge == 1 && num_wrs >= 2) { post_write_cmpl(qhp, wr); pthread_spin_unlock(&qhp->lock); return 0; } while (wr) { if (num_wrs == 0) { err = -ENOMEM; *bad_wr = wr; break; } wqe = &lwqe; fw_flags = 0; if (wr->send_flags & IBV_SEND_SOLICITED) fw_flags |= FW_RI_SOLICITED_EVENT_FLAG; if (wr->send_flags & IBV_SEND_SIGNALED || qhp->sq_sig_all) fw_flags |= FW_RI_COMPLETION_FLAG; swsqe = &qhp->wq.sq.sw_sq[qhp->wq.sq.pidx]; switch (wr->opcode) { case IBV_WR_SEND_WITH_INV: case IBV_WR_SEND: INC_STAT(send); if (wr->send_flags & IBV_SEND_FENCE) fw_flags |= FW_RI_READ_FENCE_FLAG; fw_opcode = FW_RI_SEND_WR; if (wr->opcode == IBV_WR_SEND) swsqe->opcode = FW_RI_SEND; else swsqe->opcode = FW_RI_SEND_WITH_INV; err = build_rdma_send(&qhp->wq.sq, wqe, wr, &len16); break; case IBV_WR_RDMA_WRITE_WITH_IMM: if (unlikely(!(qhp->wq.sq.flags & T4_SQ_WRITE_W_IMM))) { err = -EINVAL; break; } fw_flags |= FW_RI_RDMA_WRITE_WITH_IMMEDIATE; /*FALLTHROUGH*/ case IBV_WR_RDMA_WRITE: INC_STAT(write); fw_opcode = FW_RI_RDMA_WRITE_WR; swsqe->opcode = FW_RI_RDMA_WRITE; err = build_rdma_write(&qhp->wq.sq, wqe, wr, &len16); break; case IBV_WR_RDMA_READ: INC_STAT(read); fw_opcode = FW_RI_RDMA_READ_WR; swsqe->opcode = FW_RI_READ_REQ; fw_flags = 0; err = build_rdma_read(wqe, wr, &len16); if (err) break; swsqe->read_len = wr->sg_list ? wr->sg_list[0].length : 0; if (!qhp->wq.sq.oldest_read) qhp->wq.sq.oldest_read = swsqe; break; default: PDBG("%s post of type=%d TBD!\n", __func__, wr->opcode); err = -EINVAL; } if (err) { *bad_wr = wr; break; } swsqe->idx = qhp->wq.sq.pidx; swsqe->complete = 0; swsqe->signaled = (wr->send_flags & IBV_SEND_SIGNALED) || qhp->sq_sig_all; swsqe->flushed = 0; swsqe->wr_id = wr->wr_id; init_wr_hdr(wqe, qhp->wq.sq.pidx, fw_opcode, fw_flags, len16); PDBG("%s cookie 0x%llx pidx 0x%x opcode 0x%x\n", __func__, (unsigned long long)wr->wr_id, qhp->wq.sq.pidx, swsqe->opcode); wr = wr->next; num_wrs--; copy_wr_to_sq(&qhp->wq, wqe, len16); t4_sq_produce(&qhp->wq, len16); idx += DIV_ROUND_UP(len16*16, T4_EQ_ENTRY_SIZE); } if (t4_wq_db_enabled(&qhp->wq)) { t4_ring_sq_db(&qhp->wq, idx, dev_is_t4(qhp->rhp), len16, wqe); } else ring_kernel_db(qhp, qhp->wq.sq.qid, idx); /* This write is only for debugging, the value does not matter for DMA */ qhp->wq.sq.queue[qhp->wq.sq.size].status.host_wq_pidx = \ (qhp->wq.sq.wq_pidx); pthread_spin_unlock(&qhp->lock); return err; } static void defer_srq_wr(struct t4_srq *srq, union t4_recv_wr *wqe, uint64_t wr_id, u8 len16) { struct t4_srq_pending_wr *pwr = &srq->pending_wrs[srq->pending_pidx]; PDBG("%s cidx %u pidx %u wq_pidx %u in_use %u ooo_count %u wr_id 0x%llx pending_cidx %u pending_pidx %u pending_in_use %u\n", __func__, srq->cidx, srq->pidx, srq->wq_pidx, srq->in_use, srq->ooo_count, (unsigned long long)wr_id, srq->pending_cidx, srq->pending_pidx, srq->pending_in_use); pwr->wr_id = wr_id; pwr->len16 = len16; memcpy(&pwr->wqe, wqe, len16*16); t4_srq_produce_pending_wr(srq); } int c4iw_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { int err = 0; struct c4iw_srq *srq; union t4_recv_wr *wqe, lwqe; u32 num_wrs; u8 len16 = 0; u16 idx = 0; srq = to_c4iw_srq(ibsrq); 
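	/*
	 * Editorial note: the receive below is built into a local WQE first
	 * and only then committed to the SRQ ring. If out-of-order
	 * completions are outstanding (ooo_count), earlier WRs are still
	 * parked (pending_in_use), or the target slot is still valid, the WR
	 * is deferred into pending_wrs[] and replayed in order later;
	 * otherwise it is copied straight into the hardware queue and the
	 * doorbell increment is accumulated in `idx`.
	 */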
	pthread_spin_lock(&srq->lock);
	INC_STAT(srq_recv);
	num_wrs = t4_srq_avail(&srq->wq);
	if (num_wrs == 0) {
		pthread_spin_unlock(&srq->lock);
		return -ENOMEM;
	}
	while (wr) {
		if (wr->num_sge > T4_MAX_RECV_SGE) {
			err = -EINVAL;
			*bad_wr = wr;
			break;
		}
		wqe = &lwqe;
		if (num_wrs)
			err = build_srq_recv(wqe, wr, &len16);
		else
			err = -ENOMEM;
		if (err) {
			*bad_wr = wr;
			break;
		}

		wqe->recv.opcode = FW_RI_RECV_WR;
		wqe->recv.r1 = 0;
		wqe->recv.wrid = srq->wq.pidx;
		wqe->recv.r2[0] = 0;
		wqe->recv.r2[1] = 0;
		wqe->recv.r2[2] = 0;
		wqe->recv.len16 = len16;

		if (srq->wq.ooo_count || srq->wq.pending_in_use ||
		    srq->wq.sw_rq[srq->wq.pidx].valid)
			defer_srq_wr(&srq->wq, wqe, wr->wr_id, len16);
		else {
			srq->wq.sw_rq[srq->wq.pidx].wr_id = wr->wr_id;
			srq->wq.sw_rq[srq->wq.pidx].valid = 1;
			c4iw_copy_wr_to_srq(&srq->wq, wqe, len16);
			PDBG("%s cidx %u pidx %u wq_pidx %u in_use %u wr_id 0x%llx\n",
			     __func__, srq->wq.cidx, srq->wq.pidx,
			     srq->wq.wq_pidx, srq->wq.in_use,
			     (unsigned long long)wr->wr_id);
			t4_srq_produce(&srq->wq, len16);
			idx += DIV_ROUND_UP(len16*16, T4_EQ_ENTRY_SIZE);
		}
		wr = wr->next;
		num_wrs--;
	}

	if (idx) {
		t4_ring_srq_db(&srq->wq, idx, len16, wqe);
		srq->wq.queue[srq->wq.size].status.host_wq_pidx =
			srq->wq.wq_pidx;
	}
	pthread_spin_unlock(&srq->lock);
	return err;
}

int c4iw_post_receive(struct ibv_qp *ibqp, struct ibv_recv_wr *wr,
		      struct ibv_recv_wr **bad_wr)
{
	int err = 0;
	struct c4iw_qp *qhp;
	union t4_recv_wr *wqe, lwqe;
	u32 num_wrs;
	u8 len16 = 0;
	u16 idx = 0;

	qhp = to_c4iw_qp(ibqp);
	pthread_spin_lock(&qhp->lock);
	if (t4_wq_in_error(&qhp->wq)) {
		pthread_spin_unlock(&qhp->lock);
		*bad_wr = wr;
		return -EINVAL;
	}
	INC_STAT(recv);
	num_wrs = t4_rq_avail(&qhp->wq);
	if (num_wrs == 0) {
		pthread_spin_unlock(&qhp->lock);
		*bad_wr = wr;
		return -ENOMEM;
	}
	while (wr) {
		if (wr->num_sge > T4_MAX_RECV_SGE) {
			err = -EINVAL;
			*bad_wr = wr;
			break;
		}
		wqe = &lwqe;
		if (num_wrs)
			err = build_rdma_recv(&qhp->wq.rq, wqe, wr, &len16);
		else
			err = -ENOMEM;
		if (err) {
			*bad_wr = wr;
			break;
		}

		qhp->wq.rq.sw_rq[qhp->wq.rq.pidx].wr_id = wr->wr_id;

		wqe->recv.opcode = FW_RI_RECV_WR;
		wqe->recv.r1 = 0;
		wqe->recv.wrid = qhp->wq.rq.pidx;
		wqe->recv.r2[0] = 0;
		wqe->recv.r2[1] = 0;
		wqe->recv.r2[2] = 0;
		wqe->recv.len16 = len16;
		PDBG("%s cookie 0x%llx pidx %u\n", __func__,
		     (unsigned long long)wr->wr_id, qhp->wq.rq.pidx);
		copy_wr_to_rq(&qhp->wq, wqe, len16);
		t4_rq_produce(&qhp->wq, len16);
		idx += DIV_ROUND_UP(len16*16, T4_EQ_ENTRY_SIZE);
		wr = wr->next;
		num_wrs--;
	}

	if (t4_wq_db_enabled(&qhp->wq))
		t4_ring_rq_db(&qhp->wq, idx, dev_is_t4(qhp->rhp),
			      len16, wqe);
	else
		ring_kernel_db(qhp, qhp->wq.rq.qid, idx);
	qhp->wq.rq.queue[qhp->wq.rq.size].status.host_wq_pidx = \
			(qhp->wq.rq.wq_pidx);
	pthread_spin_unlock(&qhp->lock);
	return err;
}

void c4iw_flush_qp(struct c4iw_qp *qhp)
{
	struct c4iw_cq *rchp, *schp;
	u32 srqidx;
	int count;

	srqidx = t4_wq_srqidx(&qhp->wq);
	rchp = to_c4iw_cq(qhp->ibv_qp.recv_cq);
	schp = to_c4iw_cq(qhp->ibv_qp.send_cq);

	PDBG("%s qhp %p rchp %p schp %p\n", __func__, qhp, rchp, schp);

	/* locking hierarchy: cq lock first, then qp lock.
*/ pthread_spin_lock(&rchp->lock); if (schp != rchp) pthread_spin_lock(&schp->lock); pthread_spin_lock(&qhp->lock); if (qhp->wq.flushed) { pthread_spin_unlock(&qhp->lock); if (rchp != schp) pthread_spin_unlock(&schp->lock); pthread_spin_unlock(&rchp->lock); return; } qhp->wq.flushed = 1; t4_set_wq_in_error(&qhp->wq); if (qhp->srq) pthread_spin_lock(&qhp->srq->lock); if (srqidx) c4iw_flush_srqidx(qhp, srqidx); qhp->ibv_qp.state = IBV_QPS_ERR; c4iw_flush_hw_cq(rchp, qhp); if (!qhp->srq) { c4iw_count_rcqes(&rchp->cq, &qhp->wq, &count); c4iw_flush_rq(&qhp->wq, &rchp->cq, count); } if (schp != rchp) c4iw_flush_hw_cq(schp, qhp); c4iw_flush_sq(qhp); if (qhp->srq) pthread_spin_unlock(&qhp->srq->lock); pthread_spin_unlock(&qhp->lock); if (schp != rchp) pthread_spin_unlock(&schp->lock); pthread_spin_unlock(&rchp->lock); } void c4iw_flush_qps(struct c4iw_dev *dev) { int i; pthread_spin_lock(&dev->lock); for (i=0; i < dev->max_qp; i++) { struct c4iw_qp *qhp = dev->qpid2ptr[i]; if (qhp) { if (!qhp->wq.flushed && t4_wq_in_error(&qhp->wq)) { c4iw_flush_qp(qhp); } } } pthread_spin_unlock(&dev->lock); } rdma-core-56.1/providers/cxgb4/t4.h000066400000000000000000000576511477342711600170740ustar00rootroot00000000000000/* * Copyright (c) 2006-2016 Chelsio, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __T4_H__ #define __T4_H__ #include #include #include #include #include #include #include #include #include #include /* * Try and minimize the changes from the kernel code that is pull in * here for kernel bypass ops. */ #define u8 uint8_t #define u16 uint16_t #define u32 uint32_t #define u64 uint64_t #define DECLARE_PCI_UNMAP_ADDR(a) #define __iomem #define BUG_ON(c) assert(!(c)) #define ROUND_UP(x, n) (((x) + (n) - 1u) & ~((n) - 1u)) /* FIXME: Move me to a generic PCI mmio accessor */ #define cpu_to_pci32(val) htole32(val) #define writel(v, a) do { *((volatile u32 *)(a)) = cpu_to_pci32(v); } while (0) #include "t4_regs.h" #include "t4_chip_type.h" #include "t4fw_api.h" #include "t4fw_ri_api.h" extern bool is_64b_cqe; #ifdef DEBUG #define DBGLOG(s) #define PDBG(fmt, args...) do {syslog(LOG_DEBUG, fmt, ##args); } while (0) #else #define DBGLOG(s) #define PDBG(fmt, args...) 
do {} while (0) #endif #define A_PCIE_MA_SYNC 0x30b4 #define T4_MAX_READ_DEPTH 16 #define T4_QID_BASE 1024 #define T4_MAX_QIDS 256 #define T4_MAX_NUM_PD 65536 #define T4_EQ_STATUS_ENTRIES (L1_CACHE_BYTES > 64 ? 2 : 1) #define T4_MAX_EQ_SIZE (65520 - T4_EQ_STATUS_ENTRIES) #define T4_MAX_IQ_SIZE (65520 - 1) #define T4_MAX_RQ_SIZE (8192 - T4_EQ_STATUS_ENTRIES) #define T4_MAX_SQ_SIZE (T4_MAX_EQ_SIZE - 1) #define T4_MAX_QP_DEPTH (T4_MAX_RQ_SIZE - 1) #define T4_MAX_CQ_DEPTH (T4_MAX_IQ_SIZE - 1) #define T4_MAX_NUM_STAG (1<<15) #define T4_MAX_MR_SIZE (~0ULL - 1) #define T4_PAGESIZE_MASK 0xffff000 /* 4KB-128MB */ #define T4_STAG_UNSET 0xffffffff #define T4_FW_MAJ 0 struct t4_status_page { __be32 rsvd1; /* flit 0 - hw owns */ __be16 rsvd2; __be16 qid; __be16 cidx; __be16 pidx; u8 qp_err; /* flit 1 - sw owns */ u8 db_off; u8 pad[2]; u16 host_wq_pidx; u16 host_cidx; u16 host_pidx; u16 pad2; u32 srqidx; }; #define T4_EQ_ENTRY_SIZE 64 #define T4_SQ_NUM_SLOTS 5 #define T4_SQ_NUM_BYTES (T4_EQ_ENTRY_SIZE * T4_SQ_NUM_SLOTS) #define T4_MAX_SEND_SGE ((T4_SQ_NUM_BYTES - sizeof(struct fw_ri_send_wr) - sizeof(struct fw_ri_isgl)) / sizeof (struct fw_ri_sge)) #define T4_MAX_SEND_INLINE ((T4_SQ_NUM_BYTES - sizeof(struct fw_ri_send_wr) - sizeof(struct fw_ri_immd))) #define T4_MAX_WRITE_INLINE ((T4_SQ_NUM_BYTES - sizeof(struct fw_ri_rdma_write_wr) - sizeof(struct fw_ri_immd))) #define T4_MAX_WRITE_SGE ((T4_SQ_NUM_BYTES - sizeof(struct fw_ri_rdma_write_wr) - sizeof(struct fw_ri_isgl)) / sizeof (struct fw_ri_sge)) #define T4_MAX_FR_IMMD ((T4_SQ_NUM_BYTES - sizeof(struct fw_ri_fr_nsmr_wr) - sizeof(struct fw_ri_immd))) #define T4_MAX_FR_DEPTH 255 #define T4_RQ_NUM_SLOTS 2 #define T4_RQ_NUM_BYTES (T4_EQ_ENTRY_SIZE * T4_RQ_NUM_SLOTS) #define T4_MAX_RECV_SGE 4 #define T4_WRITE_CMPL_MAX_SGL 4 #define T4_WRITE_CMPL_MAX_CQE 16 union t4_wr { struct fw_ri_res_wr res; struct fw_ri_wr init; struct fw_ri_rdma_write_wr write; struct fw_ri_send_wr send; struct fw_ri_rdma_read_wr read; struct fw_ri_bind_mw_wr bind; struct fw_ri_fr_nsmr_wr fr; struct fw_ri_inv_lstag_wr inv; struct fw_ri_rdma_write_cmpl_wr write_cmpl; struct t4_status_page status; __be64 flits[T4_EQ_ENTRY_SIZE / sizeof(__be64) * T4_SQ_NUM_SLOTS]; }; union t4_recv_wr { struct fw_ri_recv_wr recv; struct t4_status_page status; __be64 flits[T4_EQ_ENTRY_SIZE / sizeof(__be64) * T4_RQ_NUM_SLOTS]; }; static inline void init_wr_hdr(union t4_wr *wqe, u16 wrid, enum fw_wr_opcodes opcode, u8 flags, u8 len16) { wqe->send.opcode = (u8)opcode; wqe->send.flags = flags; wqe->send.wrid = wrid; wqe->send.r1[0] = 0; wqe->send.r1[1] = 0; wqe->send.r1[2] = 0; wqe->send.len16 = len16; } /* CQE/AE status codes */ #define T4_ERR_SUCCESS 0x0 #define T4_ERR_STAG 0x1 /* STAG invalid: either the */ /* STAG is offlimt, being 0, */ /* or STAG_key mismatch */ #define T4_ERR_PDID 0x2 /* PDID mismatch */ #define T4_ERR_QPID 0x3 /* QPID mismatch */ #define T4_ERR_ACCESS 0x4 /* Invalid access right */ #define T4_ERR_WRAP 0x5 /* Wrap error */ #define T4_ERR_BOUND 0x6 /* base and bounds voilation */ #define T4_ERR_INVALIDATE_SHARED_MR 0x7 /* attempt to invalidate a */ /* shared memory region */ #define T4_ERR_INVALIDATE_MR_WITH_MW_BOUND 0x8 /* attempt to invalidate a */ /* shared memory region */ #define T4_ERR_ECC 0x9 /* ECC error detected */ #define T4_ERR_ECC_PSTAG 0xA /* ECC error detected when */ /* reading PSTAG for a MW */ /* Invalidate */ #define T4_ERR_PBL_ADDR_BOUND 0xB /* pbl addr out of bounds: */ /* software error */ #define T4_ERR_SWFLUSH 0xC /* SW FLUSHED */ #define T4_ERR_CRC 0x10 /* CRC 
error */ #define T4_ERR_MARKER 0x11 /* Marker error */ #define T4_ERR_PDU_LEN_ERR 0x12 /* invalid PDU length */ #define T4_ERR_OUT_OF_RQE 0x13 /* out of RQE */ #define T4_ERR_DDP_VERSION 0x14 /* wrong DDP version */ #define T4_ERR_RDMA_VERSION 0x15 /* wrong RDMA version */ #define T4_ERR_OPCODE 0x16 /* invalid rdma opcode */ #define T4_ERR_DDP_QUEUE_NUM 0x17 /* invalid ddp queue number */ #define T4_ERR_MSN 0x18 /* MSN error */ #define T4_ERR_TBIT 0x19 /* tag bit not set correctly */ #define T4_ERR_MO 0x1A /* MO not 0 for TERMINATE */ /* or READ_REQ */ #define T4_ERR_MSN_GAP 0x1B #define T4_ERR_MSN_RANGE 0x1C #define T4_ERR_IRD_OVERFLOW 0x1D #define T4_ERR_RQE_ADDR_BOUND 0x1E /* RQE addr out of bounds: */ /* software error */ #define T4_ERR_INTERNAL_ERR 0x1F /* internal error (opcode */ /* mismatch) */ /* * CQE defs */ struct t4_cqe_common { __be32 header; __be32 len; union { struct { __be32 stag; __be32 msn; } rcqe; struct { __be32 stag; u16 nada2; u16 cidx; } scqe; struct { __be32 wrid_hi; __be32 wrid_low; } gen; struct { __be32 stag; __be32 msn; } srcqe; struct { __be32 mo; __be32 msn; } imm_data_rcqe; u64 drain_cookie; } u; }; struct t4_cqe_b32 { struct t4_cqe_common com; __be64 reserved; __be64 bits_type_ts; }; struct t4_cqe_b64 { struct t4_cqe_common com; union { struct { __be32 reserved; __be32 abs_rqe_idx; } srcqe; union { struct { __be32 imm_data32; u32 reserved; } ib_imm_data; __be64 imm_data64; } imm_data_rcqe; __be64 flits[3]; } u; __be64 reserved[2]; __be64 bits_type_ts; }; union t4_cqe { struct t4_cqe_common com; struct t4_cqe_b32 b32; struct t4_cqe_b64 b64; }; /* macros for flit 0 of the cqe */ #define S_CQE_QPID 12 #define M_CQE_QPID 0xFFFFF #define G_CQE_QPID(x) ((((x) >> S_CQE_QPID)) & M_CQE_QPID) #define V_CQE_QPID(x) ((x)<> S_CQE_SWCQE)) & M_CQE_SWCQE) #define V_CQE_SWCQE(x) ((x)<> S_CQE_STATUS)) & M_CQE_STATUS) #define V_CQE_STATUS(x) ((x)<> S_CQE_TYPE)) & M_CQE_TYPE) #define V_CQE_TYPE(x) ((x)<> S_CQE_OPCODE)) & M_CQE_OPCODE) #define V_CQE_OPCODE(x) ((x)<header))) #define CQE_QPID(x) (G_CQE_QPID(be32toh((x)->header))) #define CQE_TYPE(x) (G_CQE_TYPE(be32toh((x)->header))) #define SQ_TYPE(x) (CQE_TYPE((x))) #define RQ_TYPE(x) (!CQE_TYPE((x))) #define CQE_STATUS(x) (G_CQE_STATUS(be32toh((x)->header))) #define CQE_OPCODE(x) (G_CQE_OPCODE(be32toh((x)->header))) #define CQE_SEND_OPCODE(x)( \ (G_CQE_OPCODE(be32toh((x)->header)) == FW_RI_SEND) || \ (G_CQE_OPCODE(be32toh((x)->header)) == FW_RI_SEND_WITH_SE) || \ (G_CQE_OPCODE(be32toh((x)->header)) == FW_RI_SEND_WITH_INV) || \ (G_CQE_OPCODE(be32toh((x)->header)) == FW_RI_SEND_WITH_SE_INV)) #define CQE_LEN(x) (be32toh((x)->len)) /* used for RQ completion processing */ #define CQE_WRID_STAG(x) (be32toh((x)->u.rcqe.stag)) #define CQE_WRID_MSN(x) (be32toh((x)->u.rcqe.msn)) #define CQE_ABS_RQE_IDX(x) (be32toh((x)->u.srcqe.abs_rqe_idx)) #define CQE_IMM_DATA(x) ((x)->u.imm_data_rcqe.ib_imm_data.imm_data32) /* used for SQ completion processing */ #define CQE_WRID_SQ_IDX(x) (x)->u.scqe.cidx /* generic accessor macros */ #define CQE_WRID_HI(x) ((x)->u.gen.wrid_hi) #define CQE_WRID_LOW(x) ((x)->u.gen.wrid_low) /* macros for flit 3 of the cqe */ #define S_CQE_GENBIT 63 #define M_CQE_GENBIT 0x1 #define G_CQE_GENBIT(x) (((x) >> S_CQE_GENBIT) & M_CQE_GENBIT) #define V_CQE_GENBIT(x) ((x)<> S_CQE_OVFBIT)) & M_CQE_OVFBIT) #define S_CQE_IQTYPE 60 #define M_CQE_IQTYPE 0x3 #define G_CQE_IQTYPE(x) ((((x) >> S_CQE_IQTYPE)) & M_CQE_IQTYPE) #define M_CQE_TS 0x0fffffffffffffffULL #define G_CQE_TS(x) ((x) & M_CQE_TS) #define CQE_OVFBIT(x) 
((unsigned)G_CQE_OVFBIT(be64toh((x)->bits_type_ts))) #define CQE_GENBIT(x) ((unsigned)G_CQE_GENBIT(be64toh((x)->bits_type_ts))) #define CQE_TS(x) (G_CQE_TS(be64toh((x)->bits_type_ts))) #define CQE_SIZE(x) (is_64b_cqe ? sizeof(*(x)) : sizeof(*(x))/2) #define Q_ENTRY(x, y) ((union t4_cqe *)(((u8 *)x) + ((CQE_SIZE(x))*y))) #define GEN_BIT(x) (is_64b_cqe ? \ ((x)->b64.bits_type_ts) : ((x)->b32.bits_type_ts)) #define GEN_ADDR(x) (is_64b_cqe ? \ (&((x)->b64.bits_type_ts)) : (&((x)->b32.bits_type_ts))) struct t4_swsqe { u64 wr_id; union t4_cqe cqe; __be32 read_len; int opcode; int complete; int signaled; u16 idx; int flushed; }; enum { T4_SQ_ONCHIP = (1 << 0), T4_SQ_WRITE_W_IMM = (1 << 1) }; struct t4_sq { /* queue is either host memory or WC MMIO memory if * t4_sq_onchip(). */ union t4_wr *queue; struct t4_swsqe *sw_sq; struct t4_swsqe *oldest_read; /* udb is either UC or WC MMIO memory depending on device version. */ volatile u32 *udb; size_t memsize; u32 qid; u32 bar2_qid; void *ma_sync; u16 in_use; u16 size; u16 cidx; u16 pidx; u16 wq_pidx; u16 flags; short flush_cidx; int wc_reg_available; }; struct t4_swrqe { u64 wr_id; int valid; }; struct t4_rq { union t4_recv_wr *queue; struct t4_swrqe *sw_rq; volatile u32 *udb; size_t memsize; u32 qid; u32 bar2_qid; u32 msn; u32 rqt_hwaddr; u16 rqt_size; u16 in_use; u16 size; u16 cidx; u16 pidx; u16 wq_pidx; int wc_reg_available; }; struct t4_wq { struct t4_sq sq; struct t4_rq rq; struct c4iw_rdev *rdev; u32 qid_mask; int error; int flushed; u8 *db_offp; u8 *qp_errp; u32 *srqidxp; }; static inline int t4_rqes_posted(struct t4_wq *wq) { return wq->rq.in_use; } static inline int t4_rq_empty(struct t4_wq *wq) { return wq->rq.in_use == 0; } static inline int t4_rq_full(struct t4_wq *wq) { return wq->rq.in_use == (wq->rq.size - 1); } static inline u32 t4_rq_avail(struct t4_wq *wq) { return wq->rq.size - 1 - wq->rq.in_use; } static inline void t4_rq_produce(struct t4_wq *wq, u8 len16) { wq->rq.in_use++; if (++wq->rq.pidx == wq->rq.size) wq->rq.pidx = 0; wq->rq.wq_pidx += DIV_ROUND_UP(len16*16, T4_EQ_ENTRY_SIZE); if (wq->rq.wq_pidx >= wq->rq.size * T4_RQ_NUM_SLOTS) wq->rq.wq_pidx %= wq->rq.size * T4_RQ_NUM_SLOTS; if (!wq->error) wq->rq.queue[wq->rq.size].status.host_pidx = wq->rq.pidx; } static inline void t4_rq_consume(struct t4_wq *wq) { wq->rq.in_use--; if (++wq->rq.cidx == wq->rq.size) wq->rq.cidx = 0; assert((wq->rq.cidx != wq->rq.pidx) || wq->rq.in_use == 0); if (!wq->error) wq->rq.queue[wq->rq.size].status.host_cidx = wq->rq.cidx; } struct t4_srq_pending_wr { u64 wr_id; union t4_recv_wr wqe; u8 len16; }; struct t4_srq { union t4_recv_wr *queue; struct t4_swrqe *sw_rq; u32 *udb; size_t memsize; u32 qid; u32 bar2_qid; u32 msn; u32 rqt_hwaddr; u32 rqt_abs_idx; u16 in_use; u16 size; u16 cidx; u16 pidx; u16 wq_pidx; int wc_reg_available; struct t4_srq_pending_wr *pending_wrs; u16 pending_cidx; u16 pending_pidx; u16 pending_in_use; u16 ooo_count; }; static inline u32 t4_srq_avail(struct t4_srq *srq) { return srq->size - 1 - srq->in_use; } static inline int t4_srq_empty(struct t4_srq *srq) { return srq->in_use == 0; } static inline int t4_srq_cidx_at_end(struct t4_srq *srq) { assert(srq->cidx != srq->pidx); if (srq->cidx < srq->pidx) return srq->cidx == (srq->pidx - 1); else return srq->cidx == (srq->size - 1) && srq->pidx == 0; } static inline int t4_srq_wrs_pending(struct t4_srq *srq) { return srq->pending_cidx != srq->pending_pidx; } static inline void t4_srq_produce(struct t4_srq *srq, u8 len16) { srq->in_use++; assert(srq->in_use < srq->size); if 
(++srq->pidx == srq->size) srq->pidx = 0; assert(srq->cidx != srq->pidx); /* overflow */ srq->wq_pidx += DIV_ROUND_UP(len16*16, T4_EQ_ENTRY_SIZE); if (srq->wq_pidx >= srq->size * T4_RQ_NUM_SLOTS) srq->wq_pidx %= srq->size * T4_RQ_NUM_SLOTS; srq->queue[srq->size].status.host_pidx = srq->pidx; } static inline void t4_srq_produce_pending_wr(struct t4_srq *srq) { srq->pending_in_use++; srq->in_use++; assert(srq->pending_in_use < srq->size); assert(srq->in_use < srq->size); assert(srq->pending_pidx < srq->size); if (++srq->pending_pidx == srq->size) srq->pending_pidx = 0; } static inline void t4_srq_consume_pending_wr(struct t4_srq *srq) { assert(srq->pending_in_use > 0); srq->pending_in_use--; assert(srq->in_use > 0); srq->in_use--; if (++srq->pending_cidx == srq->size) srq->pending_cidx = 0; assert((srq->pending_cidx != srq->pending_pidx) || srq->pending_in_use == 0); } static inline void t4_srq_produce_ooo(struct t4_srq *srq) { assert(srq->in_use > 0); srq->in_use--; srq->ooo_count++; assert(srq->ooo_count < srq->size); } static inline void t4_srq_consume_ooo(struct t4_srq *srq) { srq->cidx++; if (srq->cidx == srq->size) srq->cidx = 0; srq->queue[srq->size].status.host_cidx = srq->cidx; assert(srq->ooo_count > 0); srq->ooo_count--; } static inline void t4_srq_consume(struct t4_srq *srq) { assert(srq->in_use > 0); srq->in_use--; if (++srq->cidx == srq->size) srq->cidx = 0; assert((srq->cidx != srq->pidx) || srq->in_use == 0); srq->queue[srq->size].status.host_cidx = srq->cidx; } static inline int t4_wq_in_error(struct t4_wq *wq) { return wq->error || *wq->qp_errp; } static inline u32 t4_wq_srqidx(struct t4_wq *wq) { u32 srqidx; if (!wq->srqidxp) return 0; srqidx = *wq->srqidxp; wq->srqidxp = 0; return srqidx; } static inline int t4_sq_empty(struct t4_wq *wq) { return wq->sq.in_use == 0; } static inline int t4_sq_full(struct t4_wq *wq) { return wq->sq.in_use == (wq->sq.size - 1); } static inline u32 t4_sq_avail(struct t4_wq *wq) { return wq->sq.size - 1 - wq->sq.in_use; } static inline int t4_sq_onchip(struct t4_wq *wq) { return wq->sq.flags & T4_SQ_ONCHIP; } static inline void t4_sq_produce(struct t4_wq *wq, u8 len16) { wq->sq.in_use++; if (++wq->sq.pidx == wq->sq.size) wq->sq.pidx = 0; wq->sq.wq_pidx += DIV_ROUND_UP(len16*16, T4_EQ_ENTRY_SIZE); if (wq->sq.wq_pidx >= wq->sq.size * T4_SQ_NUM_SLOTS) wq->sq.wq_pidx %= wq->sq.size * T4_SQ_NUM_SLOTS; if (!wq->error) { /* This write is only for debugging, the value does not matter * for DMA */ wq->sq.queue[wq->sq.size].status.host_pidx = (wq->sq.pidx); } } static inline void t4_sq_consume(struct t4_wq *wq) { assert(wq->sq.in_use >= 1); if (wq->sq.cidx == wq->sq.flush_cidx) wq->sq.flush_cidx = -1; wq->sq.in_use--; if (++wq->sq.cidx == wq->sq.size) wq->sq.cidx = 0; assert((wq->sq.cidx != wq->sq.pidx) || wq->sq.in_use == 0); if (!wq->error){ /* This write is only for debugging, the value does not matter * for DMA */ wq->sq.queue[wq->sq.size].status.host_cidx = wq->sq.cidx; } } /* Copies to WC MMIO memory */ static void copy_wqe_to_udb(volatile u32 *udb_offset, void *wqe) { u64 *src, *dst; int len16 = 4; src = (u64 *)wqe; dst = (u64 *)udb_offset; while (len16) { *dst++ = *src++; *dst++ = *src++; len16--; } } extern int ma_wr; extern int t5_en_wc; static inline void t4_ring_sq_db(struct t4_wq *wq, u16 inc, u8 t4, u8 len16, union t4_wr *wqe) { if (!t4) { mmio_wc_start(); if (t5_en_wc && inc == 1 && wq->sq.wc_reg_available) { PDBG("%s: WC wq->sq.pidx = %d; len16=%d\n", __func__, wq->sq.pidx, len16); copy_wqe_to_udb(wq->sq.udb + 14, wqe); } else { 
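			/*
			 * Editor's note: this is the plain doorbell path --
			 * only the QID/PIDX increment is written to the BAR2
			 * doorbell register, instead of pushing the whole WQE
			 * through the write-combining window above.  For
			 * reference, a compiled-out sketch of a hypothetical
			 * caller (names are illustrative, not part of this
			 * header):
			 */
#if 0
			/* WQE already built in wq->sq.queue[wq->sq.pidx] */
			t4_sq_produce(wq, len16);	/* advance pidx/in_use */
			t4_ring_sq_db(wq, 1, is_t4_chip, len16, wqe);
#endif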
PDBG("%s: DB wq->sq.pidx = %d; len16=%d\n", __func__, wq->sq.pidx, len16); writel(QID_V(wq->sq.bar2_qid) | PIDX_T5_V(inc), wq->sq.udb); } /* udb is WC for > t4 devices */ mmio_flush_writes(); return; } udma_to_device_barrier(); if (ma_wr) { if (t4_sq_onchip(wq)) { int i; mmio_wc_start(); for (i = 0; i < 16; i++) *(volatile u32 *)&wq->sq.queue[wq->sq.size].flits[2+i] = i; mmio_flush_writes(); } } else { if (t4_sq_onchip(wq)) { int i; mmio_wc_start(); for (i = 0; i < 16; i++) /* FIXME: What is this supposed to be doing? * Writing to the same address multiple times * with WC memory is not guarenteed to * generate any more than one TLP. Why isn't * writing to WC memory marked volatile? */ *(u32 *)&wq->sq.queue[wq->sq.size].flits[2] = i; mmio_flush_writes(); } } /* udb is UC for t4 devices */ writel(QID_V(wq->sq.qid & wq->qid_mask) | PIDX_V(inc), wq->sq.udb); } static inline void t4_ring_rq_db(struct t4_wq *wq, u16 inc, u8 t4, u8 len16, union t4_recv_wr *wqe) { if (!t4) { mmio_wc_start(); if (t5_en_wc && inc == 1 && wq->sq.wc_reg_available) { PDBG("%s: WC wq->rq.pidx = %d; len16=%d\n", __func__, wq->rq.pidx, len16); copy_wqe_to_udb(wq->rq.udb + 14, wqe); } else { PDBG("%s: DB wq->rq.pidx = %d; len16=%d\n", __func__, wq->rq.pidx, len16); writel(QID_V(wq->rq.bar2_qid) | PIDX_T5_V(inc), wq->rq.udb); } /* udb is WC for > t4 devices */ mmio_flush_writes(); return; } /* udb is UC for t4 devices */ udma_to_device_barrier(); writel(QID_V(wq->rq.qid & wq->qid_mask) | PIDX_V(inc), wq->rq.udb); } static inline void t4_ring_srq_db(struct t4_srq *srq, u16 inc, u8 len16, union t4_recv_wr *wqe) { mmio_wc_start(); if (t5_en_wc && inc == 1 && srq->wc_reg_available) { PDBG("%s: WC srq->pidx = %d; len16=%d\n", __func__, srq->pidx, len16); copy_wqe_to_udb(srq->udb + 14, wqe); } else { PDBG("%s: DB srq->pidx = %d; len16=%d\n", __func__, srq->pidx, len16); writel(QID_V(srq->bar2_qid) | PIDX_T5_V(inc), srq->udb); } mmio_flush_writes(); return; } static inline void t4_set_wq_in_error(struct t4_wq *wq) { *wq->qp_errp = 1; } extern int c4iw_abi_version; static inline int t4_wq_db_enabled(struct t4_wq *wq) { /* * If iw_cxgb4 driver supports door bell drop recovery then its * c4iw_abi_version would be greater than or equal to 2. In such * case return the status of db_off flag to ring the kernel mode * DB from user mode library. */ if ( c4iw_abi_version >= 2 ) return ! 
*wq->db_offp; else return 1; } struct t4_cq { union t4_cqe *queue; union t4_cqe *sw_queue; struct c4iw_rdev *rdev; volatile u32 *ugts; size_t memsize; u64 bits_type_ts; u32 cqid; u32 qid_mask; u16 size; /* including status page */ u16 cidx; u16 sw_pidx; u16 sw_cidx; u16 sw_in_use; u16 cidx_inc; u8 gen; u8 error; u8 *qp_errp; }; static inline int t4_arm_cq(struct t4_cq *cq, int se) { u32 val; while (cq->cidx_inc > CIDXINC_M) { val = SEINTARM_V(0) | CIDXINC_V(CIDXINC_M) | TIMERREG_V(7) | INGRESSQID_V(cq->cqid & cq->qid_mask); writel(val, cq->ugts); cq->cidx_inc -= CIDXINC_M; } val = SEINTARM_V(se) | CIDXINC_V(cq->cidx_inc) | TIMERREG_V(6) | INGRESSQID_V(cq->cqid & cq->qid_mask); writel(val, cq->ugts); cq->cidx_inc = 0; return 0; } static inline void t4_swcq_produce(struct t4_cq *cq) { cq->sw_in_use++; if (cq->sw_in_use == cq->size) { syslog(LOG_NOTICE, "cxgb4 sw cq overflow cqid %u\n", cq->cqid); cq->error = 1; assert(0); } if (++cq->sw_pidx == cq->size) cq->sw_pidx = 0; } static inline void t4_swcq_consume(struct t4_cq *cq) { assert(cq->sw_in_use >= 1); cq->sw_in_use--; if (++cq->sw_cidx == cq->size) cq->sw_cidx = 0; } static inline void t4_hwcq_consume(struct t4_cq *cq) { cq->bits_type_ts = GEN_BIT(Q_ENTRY(cq->queue, cq->cidx)); if (++cq->cidx_inc == (cq->size >> 4) || cq->cidx_inc == CIDXINC_M) { uint32_t val; val = SEINTARM_V(0) | CIDXINC_V(cq->cidx_inc) | TIMERREG_V(7) | INGRESSQID_V(cq->cqid & cq->qid_mask); writel(val, cq->ugts); cq->cidx_inc = 0; } if (++cq->cidx == cq->size) { cq->cidx = 0; cq->gen ^= 1; } ((struct t4_status_page *)Q_ENTRY(cq->queue, cq->size))->host_cidx = cq->cidx; } static inline int t4_valid_cqe(struct t4_cq *cq, union t4_cqe *cqe) { return (is_64b_cqe ? CQE_GENBIT(&cqe->b64) : (CQE_GENBIT(&cqe->b32))) == cq->gen; } static inline int t4_next_hw_cqe(struct t4_cq *cq, union t4_cqe **cqe) { int ret; u16 prev_cidx; if (cq->cidx == 0) prev_cidx = cq->size - 1; else prev_cidx = cq->cidx - 1; if (GEN_BIT(Q_ENTRY(cq->queue, prev_cidx)) != cq->bits_type_ts) { ret = -EOVERFLOW; syslog(LOG_NOTICE, "cxgb4 cq overflow cqid %u\n", cq->cqid); cq->error = 1; assert(0); } else if (t4_valid_cqe(cq, Q_ENTRY(cq->queue, cq->cidx))) { udma_from_device_barrier(); *cqe = Q_ENTRY(cq->queue, cq->cidx); ret = 0; } else ret = -ENODATA; return ret; } static inline union t4_cqe *t4_next_sw_cqe(struct t4_cq *cq) { if (cq->sw_in_use == cq->size) { syslog(LOG_NOTICE, "cxgb4 sw cq overflow cqid %u\n", cq->cqid); cq->error = 1; assert(0); return NULL; } if (cq->sw_in_use) return Q_ENTRY(cq->sw_queue, cq->sw_cidx); return NULL; } static inline int t4_cq_notempty(struct t4_cq *cq) { return cq->sw_in_use || t4_valid_cqe(cq, Q_ENTRY(cq->queue, cq->cidx)); } static inline int t4_next_cqe(struct t4_cq *cq, union t4_cqe **cqe) { int ret = 0; if (cq->error) ret = -ENODATA; else if (cq->sw_in_use) *cqe = Q_ENTRY(cq->sw_queue, cq->sw_cidx); else ret = t4_next_hw_cqe(cq, cqe); return ret; } static inline int t4_cq_in_error(struct t4_cq *cq) { return *cq->qp_errp; } static inline void t4_set_cq_in_error(struct t4_cq *cq) { *cq->qp_errp = 1; } static inline void t4_reset_cq_in_error(struct t4_cq *cq) { *cq->qp_errp = 0; } struct t4_dev_status_page { u8 db_off; u8 write_cmpl_supported; u16 pad2; u32 pad3; u64 qp_start; u64 qp_size; u64 cq_start; u64 cq_size; }; #endif rdma-core-56.1/providers/cxgb4/t4_chip_type.h000066400000000000000000000053001477342711600211200ustar00rootroot00000000000000/* * This file is part of the Chelsio T4 Ethernet driver for Linux. 
* * Copyright (c) 2003-2015 Chelsio Communications, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __T4_CHIP_TYPE_H__ #define __T4_CHIP_TYPE_H__ #define CHELSIO_T4 0x4 #define CHELSIO_T5 0x5 #define CHELSIO_T6 0x6 /* We code the Chelsio T4 Family "Chip Code" as a tuple: * * (Chip Version, Chip Revision) * * where: * * Chip Version: is T4, T5, etc. * Chip Revision: is the FAB "spin" of the Chip Version. */ #define CHELSIO_CHIP_CODE(version, revision) (((version) << 4) | (revision)) #define CHELSIO_CHIP_VERSION(code) (((code) >> 4) & 0xf) #define CHELSIO_CHIP_RELEASE(code) ((code) & 0xf) enum chip_type { T4_A1 = CHELSIO_CHIP_CODE(CHELSIO_T4, 1), T4_A2 = CHELSIO_CHIP_CODE(CHELSIO_T4, 2), T4_FIRST_REV = T4_A1, T4_LAST_REV = T4_A2, T5_A0 = CHELSIO_CHIP_CODE(CHELSIO_T5, 0), T5_A1 = CHELSIO_CHIP_CODE(CHELSIO_T5, 1), T5_FIRST_REV = T5_A0, T5_LAST_REV = T5_A1, T6_A0 = CHELSIO_CHIP_CODE(CHELSIO_T6, 0), T6_FIRST_REV = T6_A0, T6_LAST_REV = T6_A0, }; static inline int is_t4(enum chip_type chip) { return (CHELSIO_CHIP_VERSION(chip) == CHELSIO_T4); } static inline int is_t5(enum chip_type chip) { return (CHELSIO_CHIP_VERSION(chip) == CHELSIO_T5); } static inline int is_t6(enum chip_type chip) { return (CHELSIO_CHIP_VERSION(chip) == CHELSIO_T6); } #endif /* __T4_CHIP_TYPE_H__ */ rdma-core-56.1/providers/cxgb4/t4_pci_id_tbl.h000066400000000000000000000220441477342711600212300ustar00rootroot00000000000000/* * This file is part of the Chelsio T4/T5 Ethernet driver for Linux. * * Copyright (c) 2003-2014 Chelsio Communications, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials
 *   provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

#ifndef __T4_PCI_ID_TBL_H__
#define __T4_PCI_ID_TBL_H__

/* The code can define cpp macros for creating a PCI Device ID Table.  This is
 * useful because it allows the PCI ID Table to be maintained in a single
 * place.
 *
 * The macros are:
 *
 * CH_PCI_DEVICE_ID_TABLE_DEFINE_BEGIN
 *   -- Used to start the definition of the PCI ID Table.
 *
 * CH_PCI_DEVICE_ID_FUNCTION
 *   -- The PCI Function Number to use in the PCI Device ID Table.  "0"
 *   -- for drivers attaching to PF0-3, "4" for drivers attaching to PF4,
 *   -- "8" for drivers attaching to SR-IOV Virtual Functions, etc.
 *
 * CH_PCI_DEVICE_ID_FUNCTION2 [optional]
 *   -- If defined, create a PCI Device ID Table with both
 *   -- CH_PCI_DEVICE_ID_FUNCTION and CH_PCI_DEVICE_ID_FUNCTION2 populated.
 *
 * CH_PCI_ID_TABLE_ENTRY(DeviceID)
 *   -- Used for the individual PCI Device ID entries.  Note that we will
 *   -- be adding a trailing comma (",") after all of the entries (and
 *   -- between the pairs of entries if CH_PCI_DEVICE_ID_FUNCTION2 is
 *   -- defined).
 *
 * CH_PCI_DEVICE_ID_TABLE_DEFINE_END
 *   -- Used to finish the definition of the PCI ID Table.  Note that we
 *   -- will be adding a trailing semi-colon (";") here.
 */

#ifndef CH_PCI_DEVICE_ID_FUNCTION
#error CH_PCI_DEVICE_ID_FUNCTION not defined!
#endif
#ifndef CH_PCI_ID_TABLE_ENTRY
#error CH_PCI_ID_TABLE_ENTRY not defined!
#endif
#ifndef CH_PCI_DEVICE_ID_TABLE_DEFINE_END
#error CH_PCI_DEVICE_ID_TABLE_DEFINE_END not defined!
#endif

/* T4 and later ASICs use a PCI Device ID scheme of 0xVFPP where:
 *
 *   V  = "4" for T4; "5" for T5, etc.
 *   F  = "0" for PF 0..3; "4".."7" for PF4..7; and "8" for VFs
 *   PP = adapter product designation
 *
 * We use this consistency in order to create the proper PCI Device IDs
 * for the specified CH_PCI_DEVICE_ID_FUNCTION.
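 *
 * As a worked illustration (editor's sketch, patterned after the way the
 * Linux cxgb4 driver instantiates this header; the table name below is
 * hypothetical), a PF4 driver would define the hooks and then include
 * this file:
 *
 *	#define CH_PCI_DEVICE_ID_TABLE_DEFINE_BEGIN \
 *		static const struct pci_device_id example_pci_tbl[] = {
 *	#define CH_PCI_DEVICE_ID_FUNCTION 0x4
 *	#define CH_PCI_ID_TABLE_ENTRY(devid) \
 *		{ PCI_VDEVICE(CHELSIO, (devid)), 0 }
 *	#define CH_PCI_DEVICE_ID_TABLE_DEFINE_END \
 *		{ 0, } \
 *		}
 *	#include "t4_pci_id_tbl.h"
 *
 * which expands to a zero-terminated table whose Device IDs all carry
 * function number 4 in bits 11:8.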
*/ #ifndef CH_PCI_DEVICE_ID_FUNCTION2 #define CH_PCI_ID_TABLE_FENTRY(devid) \ CH_PCI_ID_TABLE_ENTRY((devid) | \ ((CH_PCI_DEVICE_ID_FUNCTION) << 8)) #else #define CH_PCI_ID_TABLE_FENTRY(devid) \ CH_PCI_ID_TABLE_ENTRY((devid) | \ ((CH_PCI_DEVICE_ID_FUNCTION) << 8)), \ CH_PCI_ID_TABLE_ENTRY((devid) | \ ((CH_PCI_DEVICE_ID_FUNCTION2) << 8)) #endif CH_PCI_DEVICE_ID_TABLE_DEFINE_BEGIN /* T4 adapters: */ CH_PCI_ID_TABLE_FENTRY(0x4000), /* T440-dbg */ CH_PCI_ID_TABLE_FENTRY(0x4001), /* T420-cr */ CH_PCI_ID_TABLE_FENTRY(0x4002), /* T422-cr */ CH_PCI_ID_TABLE_FENTRY(0x4003), /* T440-cr */ CH_PCI_ID_TABLE_FENTRY(0x4004), /* T420-bch */ CH_PCI_ID_TABLE_FENTRY(0x4005), /* T440-bch */ CH_PCI_ID_TABLE_FENTRY(0x4006), /* T440-ch */ CH_PCI_ID_TABLE_FENTRY(0x4007), /* T420-so */ CH_PCI_ID_TABLE_FENTRY(0x4008), /* T420-cx */ CH_PCI_ID_TABLE_FENTRY(0x4009), /* T420-bt */ CH_PCI_ID_TABLE_FENTRY(0x400a), /* T404-bt */ CH_PCI_ID_TABLE_FENTRY(0x400b), /* B420-sr */ CH_PCI_ID_TABLE_FENTRY(0x400c), /* B404-bt */ CH_PCI_ID_TABLE_FENTRY(0x400d), /* T480-cr */ CH_PCI_ID_TABLE_FENTRY(0x400e), /* T440-LP-cr */ CH_PCI_ID_TABLE_FENTRY(0x4080), /* Custom T480-cr */ CH_PCI_ID_TABLE_FENTRY(0x4081), /* Custom T440-cr */ CH_PCI_ID_TABLE_FENTRY(0x4082), /* Custom T420-cr */ CH_PCI_ID_TABLE_FENTRY(0x4083), /* Custom T420-xaui */ CH_PCI_ID_TABLE_FENTRY(0x4084), /* Custom T440-cr */ CH_PCI_ID_TABLE_FENTRY(0x4085), /* Custom T420-cr */ CH_PCI_ID_TABLE_FENTRY(0x4086), /* Custom T440-bt */ CH_PCI_ID_TABLE_FENTRY(0x4087), /* Custom T440-cr */ CH_PCI_ID_TABLE_FENTRY(0x4088), /* Custom T440 2-xaui, 2-xfi */ /* T5 adapters: */ CH_PCI_ID_TABLE_FENTRY(0x5000), /* T580-dbg */ CH_PCI_ID_TABLE_FENTRY(0x5001), /* T520-cr */ CH_PCI_ID_TABLE_FENTRY(0x5002), /* T522-cr */ CH_PCI_ID_TABLE_FENTRY(0x5003), /* T540-cr */ CH_PCI_ID_TABLE_FENTRY(0x5004), /* T520-bch */ CH_PCI_ID_TABLE_FENTRY(0x5005), /* T540-bch */ CH_PCI_ID_TABLE_FENTRY(0x5006), /* T540-ch */ CH_PCI_ID_TABLE_FENTRY(0x5007), /* T520-so */ CH_PCI_ID_TABLE_FENTRY(0x5008), /* T520-cx */ CH_PCI_ID_TABLE_FENTRY(0x5009), /* T520-bt */ CH_PCI_ID_TABLE_FENTRY(0x500a), /* T504-bt */ CH_PCI_ID_TABLE_FENTRY(0x500b), /* B520-sr */ CH_PCI_ID_TABLE_FENTRY(0x500c), /* B504-bt */ CH_PCI_ID_TABLE_FENTRY(0x500d), /* T580-cr */ CH_PCI_ID_TABLE_FENTRY(0x500e), /* T540-LP-cr */ CH_PCI_ID_TABLE_FENTRY(0x5010), /* T580-LP-cr */ CH_PCI_ID_TABLE_FENTRY(0x5011), /* T520-LL-cr */ CH_PCI_ID_TABLE_FENTRY(0x5012), /* T560-cr */ CH_PCI_ID_TABLE_FENTRY(0x5013), /* T580-chr */ CH_PCI_ID_TABLE_FENTRY(0x5014), /* T580-so */ CH_PCI_ID_TABLE_FENTRY(0x5015), /* T502-bt */ CH_PCI_ID_TABLE_FENTRY(0x5016), /* T580-OCP-SO */ CH_PCI_ID_TABLE_FENTRY(0x5017), /* T520-OCP-SO */ CH_PCI_ID_TABLE_FENTRY(0x5018), /* T540-BT */ CH_PCI_ID_TABLE_FENTRY(0x5080), /* Custom T540-cr */ CH_PCI_ID_TABLE_FENTRY(0x5081), /* Custom T540-LL-cr */ CH_PCI_ID_TABLE_FENTRY(0x5082), /* Custom T504-cr */ CH_PCI_ID_TABLE_FENTRY(0x5083), /* Custom T540-LP-CR */ CH_PCI_ID_TABLE_FENTRY(0x5084), /* Custom T580-cr */ CH_PCI_ID_TABLE_FENTRY(0x5085), /* Custom 3x T580-CR */ CH_PCI_ID_TABLE_FENTRY(0x5086), /* Custom 2x T580-CR */ CH_PCI_ID_TABLE_FENTRY(0x5087), /* Custom T580-CR */ CH_PCI_ID_TABLE_FENTRY(0x5088), /* Custom T570-CR */ CH_PCI_ID_TABLE_FENTRY(0x5089), /* Custom T520-CR */ CH_PCI_ID_TABLE_FENTRY(0x5090), /* Custom T540-CR */ CH_PCI_ID_TABLE_FENTRY(0x5091), /* Custom T522-CR */ CH_PCI_ID_TABLE_FENTRY(0x5092), /* Custom T520-CR */ CH_PCI_ID_TABLE_FENTRY(0x5093), /* Custom T580-LP-CR */ CH_PCI_ID_TABLE_FENTRY(0x5094), /* Custom T540-CR */ 
CH_PCI_ID_TABLE_FENTRY(0x5095), /* Custom T540-CR-SO */ CH_PCI_ID_TABLE_FENTRY(0x5096), /* Custom T580-CR */ CH_PCI_ID_TABLE_FENTRY(0x5097), /* Custom T520-KR */ CH_PCI_ID_TABLE_FENTRY(0x5098), /* Custom 2x40G QSFP */ CH_PCI_ID_TABLE_FENTRY(0x5099), /* Custom 2x40G QSFP */ CH_PCI_ID_TABLE_FENTRY(0x509a), /* Custom T520-CR */ CH_PCI_ID_TABLE_FENTRY(0x509b), /* Custom T540-CR LOM */ CH_PCI_ID_TABLE_FENTRY(0x509c), /* Custom T520-CR*/ CH_PCI_ID_TABLE_FENTRY(0x509d), /* Custom T540-CR*/ CH_PCI_ID_TABLE_FENTRY(0x509e), /* Custom T520-CR */ CH_PCI_ID_TABLE_FENTRY(0x509f), /* Custom T540-CR */ CH_PCI_ID_TABLE_FENTRY(0x50a0), /* Custom T540-CR */ CH_PCI_ID_TABLE_FENTRY(0x50a1), /* Custom T540-CR */ CH_PCI_ID_TABLE_FENTRY(0x50a2), /* Custom T540-KR4 */ CH_PCI_ID_TABLE_FENTRY(0x50a3), /* Custom T580-KR4 */ CH_PCI_ID_TABLE_FENTRY(0x50a4), /* Custom 2x T540-CR */ CH_PCI_ID_TABLE_FENTRY(0x50a5), /* Custom T522-BT */ CH_PCI_ID_TABLE_FENTRY(0x50a6), /* Custom T522-BT-SO */ CH_PCI_ID_TABLE_FENTRY(0x50a7), /* Custom T580-CR */ CH_PCI_ID_TABLE_FENTRY(0x50a8), /* Custom T580-KR */ CH_PCI_ID_TABLE_FENTRY(0x50a9), /* Custom T580-KR */ CH_PCI_ID_TABLE_FENTRY(0x50aa), /* Custom T580-CR */ CH_PCI_ID_TABLE_FENTRY(0x50ab), /* Custom T520-CR */ CH_PCI_ID_TABLE_FENTRY(0x50ac), /* Custom T540-BT */ /* T6 adapters: */ CH_PCI_ID_TABLE_FENTRY(0x6001), CH_PCI_ID_TABLE_FENTRY(0x6002), CH_PCI_ID_TABLE_FENTRY(0x6003), CH_PCI_ID_TABLE_FENTRY(0x6004), CH_PCI_ID_TABLE_FENTRY(0x6005), CH_PCI_ID_TABLE_FENTRY(0x6006), CH_PCI_ID_TABLE_FENTRY(0x6007), CH_PCI_ID_TABLE_FENTRY(0x6008), CH_PCI_ID_TABLE_FENTRY(0x6009), CH_PCI_ID_TABLE_FENTRY(0x600d), CH_PCI_ID_TABLE_FENTRY(0x6010), CH_PCI_ID_TABLE_FENTRY(0x6011), CH_PCI_ID_TABLE_FENTRY(0x6014), CH_PCI_ID_TABLE_FENTRY(0x6015), CH_PCI_ID_TABLE_FENTRY(0x6080), CH_PCI_ID_TABLE_FENTRY(0x6081), CH_PCI_ID_TABLE_FENTRY(0x6082), /* Custom T6225-CR SFP28 */ CH_PCI_ID_TABLE_FENTRY(0x6083), /* Custom T62100-CR QSFP28 */ CH_PCI_ID_TABLE_FENTRY(0x6084), /* Custom T64100-CR QSFP28 */ CH_PCI_ID_TABLE_FENTRY(0x6085), /* Custom T6240-SO */ CH_PCI_ID_TABLE_FENTRY(0x6086), /* Custom T6225-SO-CR */ CH_PCI_ID_TABLE_FENTRY(0x6087), /* Custom T6225-CR */ CH_PCI_DEVICE_ID_TABLE_DEFINE_END; #endif /* __T4_PCI_ID_TBL_H__ */ rdma-core-56.1/providers/cxgb4/t4_regs.h000066400000000000000000002677321477342711600201170ustar00rootroot00000000000000/* * This file is part of the Chelsio T4 Ethernet driver for Linux. * * Copyright (c) 2003-2014 Chelsio Communications, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __T4_REGS_H #define __T4_REGS_H #define MYPF_BASE 0x1b000 #define MYPF_REG(reg_addr) (MYPF_BASE + (reg_addr)) #define PF0_BASE 0x1e000 #define PF0_REG(reg_addr) (PF0_BASE + (reg_addr)) #define PF_STRIDE 0x400 #define PF_BASE(idx) (PF0_BASE + (idx) * PF_STRIDE) #define PF_REG(idx, reg) (PF_BASE(idx) + (reg)) #define MYPORT_BASE 0x1c000 #define MYPORT_REG(reg_addr) (MYPORT_BASE + (reg_addr)) #define PORT0_BASE 0x20000 #define PORT0_REG(reg_addr) (PORT0_BASE + (reg_addr)) #define PORT_STRIDE 0x2000 #define PORT_BASE(idx) (PORT0_BASE + (idx) * PORT_STRIDE) #define PORT_REG(idx, reg) (PORT_BASE(idx) + (reg)) #define EDC_STRIDE (EDC_1_BASE_ADDR - EDC_0_BASE_ADDR) #define EDC_REG(reg, idx) (reg + EDC_STRIDE * idx) #define PCIE_MEM_ACCESS_REG(reg_addr, idx) ((reg_addr) + (idx) * 8) #define PCIE_MAILBOX_REG(reg_addr, idx) ((reg_addr) + (idx) * 8) #define MC_BIST_STATUS_REG(reg_addr, idx) ((reg_addr) + (idx) * 4) #define EDC_BIST_STATUS_REG(reg_addr, idx) ((reg_addr) + (idx) * 4) #define PCIE_FW_REG(reg_addr, idx) ((reg_addr) + (idx) * 4) #define SGE_PF_KDOORBELL_A 0x0 #define QID_S 15 #define QID_V(x) ((x) << QID_S) #define DBPRIO_S 14 #define DBPRIO_V(x) ((x) << DBPRIO_S) #define DBPRIO_F DBPRIO_V(1U) #define PIDX_S 0 #define PIDX_V(x) ((x) << PIDX_S) #define SGE_VF_KDOORBELL_A 0x0 #define DBTYPE_S 13 #define DBTYPE_V(x) ((x) << DBTYPE_S) #define DBTYPE_F DBTYPE_V(1U) #define PIDX_T5_S 0 #define PIDX_T5_M 0x1fffU #define PIDX_T5_V(x) ((x) << PIDX_T5_S) #define PIDX_T5_G(x) (((x) >> PIDX_T5_S) & PIDX_T5_M) #define SGE_PF_GTS_A 0x4 #define INGRESSQID_S 16 #define INGRESSQID_V(x) ((x) << INGRESSQID_S) #define TIMERREG_S 13 #define TIMERREG_V(x) ((x) << TIMERREG_S) #define SEINTARM_S 12 #define SEINTARM_V(x) ((x) << SEINTARM_S) #define CIDXINC_S 0 #define CIDXINC_M 0xfffU #define CIDXINC_V(x) ((x) << CIDXINC_S) #define SGE_CONTROL_A 0x1008 #define SGE_CONTROL2_A 0x1124 #define RXPKTCPLMODE_S 18 #define RXPKTCPLMODE_V(x) ((x) << RXPKTCPLMODE_S) #define RXPKTCPLMODE_F RXPKTCPLMODE_V(1U) #define EGRSTATUSPAGESIZE_S 17 #define EGRSTATUSPAGESIZE_V(x) ((x) << EGRSTATUSPAGESIZE_S) #define EGRSTATUSPAGESIZE_F EGRSTATUSPAGESIZE_V(1U) #define PKTSHIFT_S 10 #define PKTSHIFT_M 0x7U #define PKTSHIFT_V(x) ((x) << PKTSHIFT_S) #define PKTSHIFT_G(x) (((x) >> PKTSHIFT_S) & PKTSHIFT_M) #define INGPCIEBOUNDARY_S 7 #define INGPCIEBOUNDARY_V(x) ((x) << INGPCIEBOUNDARY_S) #define INGPADBOUNDARY_S 4 #define INGPADBOUNDARY_M 0x7U #define INGPADBOUNDARY_V(x) ((x) << INGPADBOUNDARY_S) #define INGPADBOUNDARY_G(x) (((x) >> INGPADBOUNDARY_S) & INGPADBOUNDARY_M) #define EGRPCIEBOUNDARY_S 1 #define EGRPCIEBOUNDARY_V(x) ((x) << EGRPCIEBOUNDARY_S) #define INGPACKBOUNDARY_S 16 #define INGPACKBOUNDARY_M 0x7U #define INGPACKBOUNDARY_V(x) ((x) << INGPACKBOUNDARY_S) #define INGPACKBOUNDARY_G(x) (((x) >> INGPACKBOUNDARY_S) \ & INGPACKBOUNDARY_M) #define VFIFO_ENABLE_S 10 #define VFIFO_ENABLE_V(x) ((x) << VFIFO_ENABLE_S) #define VFIFO_ENABLE_F VFIFO_ENABLE_V(1U) #define SGE_DBVFIFO_BADDR_A 0x1138 #define DBVFIFO_SIZE_S 6 #define DBVFIFO_SIZE_M 0xfffU #define DBVFIFO_SIZE_G(x) (((x) >> DBVFIFO_SIZE_S) & DBVFIFO_SIZE_M) #define T6_DBVFIFO_SIZE_S 0 #define T6_DBVFIFO_SIZE_M 0x1fffU #define T6_DBVFIFO_SIZE_G(x) (((x) >> T6_DBVFIFO_SIZE_S) & T6_DBVFIFO_SIZE_M) #define 
GLOBALENABLE_S 0 #define GLOBALENABLE_V(x) ((x) << GLOBALENABLE_S) #define GLOBALENABLE_F GLOBALENABLE_V(1U) #define SGE_HOST_PAGE_SIZE_A 0x100c #define HOSTPAGESIZEPF7_S 28 #define HOSTPAGESIZEPF7_M 0xfU #define HOSTPAGESIZEPF7_V(x) ((x) << HOSTPAGESIZEPF7_S) #define HOSTPAGESIZEPF7_G(x) (((x) >> HOSTPAGESIZEPF7_S) & HOSTPAGESIZEPF7_M) #define HOSTPAGESIZEPF6_S 24 #define HOSTPAGESIZEPF6_M 0xfU #define HOSTPAGESIZEPF6_V(x) ((x) << HOSTPAGESIZEPF6_S) #define HOSTPAGESIZEPF6_G(x) (((x) >> HOSTPAGESIZEPF6_S) & HOSTPAGESIZEPF6_M) #define HOSTPAGESIZEPF5_S 20 #define HOSTPAGESIZEPF5_M 0xfU #define HOSTPAGESIZEPF5_V(x) ((x) << HOSTPAGESIZEPF5_S) #define HOSTPAGESIZEPF5_G(x) (((x) >> HOSTPAGESIZEPF5_S) & HOSTPAGESIZEPF5_M) #define HOSTPAGESIZEPF4_S 16 #define HOSTPAGESIZEPF4_M 0xfU #define HOSTPAGESIZEPF4_V(x) ((x) << HOSTPAGESIZEPF4_S) #define HOSTPAGESIZEPF4_G(x) (((x) >> HOSTPAGESIZEPF4_S) & HOSTPAGESIZEPF4_M) #define HOSTPAGESIZEPF3_S 12 #define HOSTPAGESIZEPF3_M 0xfU #define HOSTPAGESIZEPF3_V(x) ((x) << HOSTPAGESIZEPF3_S) #define HOSTPAGESIZEPF3_G(x) (((x) >> HOSTPAGESIZEPF3_S) & HOSTPAGESIZEPF3_M) #define HOSTPAGESIZEPF2_S 8 #define HOSTPAGESIZEPF2_M 0xfU #define HOSTPAGESIZEPF2_V(x) ((x) << HOSTPAGESIZEPF2_S) #define HOSTPAGESIZEPF2_G(x) (((x) >> HOSTPAGESIZEPF2_S) & HOSTPAGESIZEPF2_M) #define HOSTPAGESIZEPF1_S 4 #define HOSTPAGESIZEPF1_M 0xfU #define HOSTPAGESIZEPF1_V(x) ((x) << HOSTPAGESIZEPF1_S) #define HOSTPAGESIZEPF1_G(x) (((x) >> HOSTPAGESIZEPF1_S) & HOSTPAGESIZEPF1_M) #define HOSTPAGESIZEPF0_S 0 #define HOSTPAGESIZEPF0_M 0xfU #define HOSTPAGESIZEPF0_V(x) ((x) << HOSTPAGESIZEPF0_S) #define HOSTPAGESIZEPF0_G(x) (((x) >> HOSTPAGESIZEPF0_S) & HOSTPAGESIZEPF0_M) #define SGE_EGRESS_QUEUES_PER_PAGE_PF_A 0x1010 #define SGE_EGRESS_QUEUES_PER_PAGE_VF_A 0x1014 #define QUEUESPERPAGEPF1_S 4 #define QUEUESPERPAGEPF0_S 0 #define QUEUESPERPAGEPF0_M 0xfU #define QUEUESPERPAGEPF0_V(x) ((x) << QUEUESPERPAGEPF0_S) #define QUEUESPERPAGEPF0_G(x) (((x) >> QUEUESPERPAGEPF0_S) & QUEUESPERPAGEPF0_M) #define SGE_INT_CAUSE1_A 0x1024 #define SGE_INT_CAUSE2_A 0x1030 #define SGE_INT_CAUSE3_A 0x103c #define ERR_FLM_DBP_S 31 #define ERR_FLM_DBP_V(x) ((x) << ERR_FLM_DBP_S) #define ERR_FLM_DBP_F ERR_FLM_DBP_V(1U) #define ERR_FLM_IDMA1_S 30 #define ERR_FLM_IDMA1_V(x) ((x) << ERR_FLM_IDMA1_S) #define ERR_FLM_IDMA1_F ERR_FLM_IDMA1_V(1U) #define ERR_FLM_IDMA0_S 29 #define ERR_FLM_IDMA0_V(x) ((x) << ERR_FLM_IDMA0_S) #define ERR_FLM_IDMA0_F ERR_FLM_IDMA0_V(1U) #define ERR_FLM_HINT_S 28 #define ERR_FLM_HINT_V(x) ((x) << ERR_FLM_HINT_S) #define ERR_FLM_HINT_F ERR_FLM_HINT_V(1U) #define ERR_PCIE_ERROR3_S 27 #define ERR_PCIE_ERROR3_V(x) ((x) << ERR_PCIE_ERROR3_S) #define ERR_PCIE_ERROR3_F ERR_PCIE_ERROR3_V(1U) #define ERR_PCIE_ERROR2_S 26 #define ERR_PCIE_ERROR2_V(x) ((x) << ERR_PCIE_ERROR2_S) #define ERR_PCIE_ERROR2_F ERR_PCIE_ERROR2_V(1U) #define ERR_PCIE_ERROR1_S 25 #define ERR_PCIE_ERROR1_V(x) ((x) << ERR_PCIE_ERROR1_S) #define ERR_PCIE_ERROR1_F ERR_PCIE_ERROR1_V(1U) #define ERR_PCIE_ERROR0_S 24 #define ERR_PCIE_ERROR0_V(x) ((x) << ERR_PCIE_ERROR0_S) #define ERR_PCIE_ERROR0_F ERR_PCIE_ERROR0_V(1U) #define ERR_CPL_EXCEED_IQE_SIZE_S 22 #define ERR_CPL_EXCEED_IQE_SIZE_V(x) ((x) << ERR_CPL_EXCEED_IQE_SIZE_S) #define ERR_CPL_EXCEED_IQE_SIZE_F ERR_CPL_EXCEED_IQE_SIZE_V(1U) #define ERR_INVALID_CIDX_INC_S 21 #define ERR_INVALID_CIDX_INC_V(x) ((x) << ERR_INVALID_CIDX_INC_S) #define ERR_INVALID_CIDX_INC_F ERR_INVALID_CIDX_INC_V(1U) #define ERR_CPL_OPCODE_0_S 19 #define ERR_CPL_OPCODE_0_V(x) ((x) << ERR_CPL_OPCODE_0_S) #define 
ERR_CPL_OPCODE_0_F ERR_CPL_OPCODE_0_V(1U) #define ERR_DROPPED_DB_S 18 #define ERR_DROPPED_DB_V(x) ((x) << ERR_DROPPED_DB_S) #define ERR_DROPPED_DB_F ERR_DROPPED_DB_V(1U) #define ERR_DATA_CPL_ON_HIGH_QID1_S 17 #define ERR_DATA_CPL_ON_HIGH_QID1_V(x) ((x) << ERR_DATA_CPL_ON_HIGH_QID1_S) #define ERR_DATA_CPL_ON_HIGH_QID1_F ERR_DATA_CPL_ON_HIGH_QID1_V(1U) #define ERR_DATA_CPL_ON_HIGH_QID0_S 16 #define ERR_DATA_CPL_ON_HIGH_QID0_V(x) ((x) << ERR_DATA_CPL_ON_HIGH_QID0_S) #define ERR_DATA_CPL_ON_HIGH_QID0_F ERR_DATA_CPL_ON_HIGH_QID0_V(1U) #define ERR_BAD_DB_PIDX3_S 15 #define ERR_BAD_DB_PIDX3_V(x) ((x) << ERR_BAD_DB_PIDX3_S) #define ERR_BAD_DB_PIDX3_F ERR_BAD_DB_PIDX3_V(1U) #define ERR_BAD_DB_PIDX2_S 14 #define ERR_BAD_DB_PIDX2_V(x) ((x) << ERR_BAD_DB_PIDX2_S) #define ERR_BAD_DB_PIDX2_F ERR_BAD_DB_PIDX2_V(1U) #define ERR_BAD_DB_PIDX1_S 13 #define ERR_BAD_DB_PIDX1_V(x) ((x) << ERR_BAD_DB_PIDX1_S) #define ERR_BAD_DB_PIDX1_F ERR_BAD_DB_PIDX1_V(1U) #define ERR_BAD_DB_PIDX0_S 12 #define ERR_BAD_DB_PIDX0_V(x) ((x) << ERR_BAD_DB_PIDX0_S) #define ERR_BAD_DB_PIDX0_F ERR_BAD_DB_PIDX0_V(1U) #define ERR_ING_CTXT_PRIO_S 10 #define ERR_ING_CTXT_PRIO_V(x) ((x) << ERR_ING_CTXT_PRIO_S) #define ERR_ING_CTXT_PRIO_F ERR_ING_CTXT_PRIO_V(1U) #define ERR_EGR_CTXT_PRIO_S 9 #define ERR_EGR_CTXT_PRIO_V(x) ((x) << ERR_EGR_CTXT_PRIO_S) #define ERR_EGR_CTXT_PRIO_F ERR_EGR_CTXT_PRIO_V(1U) #define DBFIFO_HP_INT_S 8 #define DBFIFO_HP_INT_V(x) ((x) << DBFIFO_HP_INT_S) #define DBFIFO_HP_INT_F DBFIFO_HP_INT_V(1U) #define DBFIFO_LP_INT_S 7 #define DBFIFO_LP_INT_V(x) ((x) << DBFIFO_LP_INT_S) #define DBFIFO_LP_INT_F DBFIFO_LP_INT_V(1U) #define INGRESS_SIZE_ERR_S 5 #define INGRESS_SIZE_ERR_V(x) ((x) << INGRESS_SIZE_ERR_S) #define INGRESS_SIZE_ERR_F INGRESS_SIZE_ERR_V(1U) #define EGRESS_SIZE_ERR_S 4 #define EGRESS_SIZE_ERR_V(x) ((x) << EGRESS_SIZE_ERR_S) #define EGRESS_SIZE_ERR_F EGRESS_SIZE_ERR_V(1U) #define SGE_INT_ENABLE3_A 0x1040 #define SGE_FL_BUFFER_SIZE0_A 0x1044 #define SGE_FL_BUFFER_SIZE1_A 0x1048 #define SGE_FL_BUFFER_SIZE2_A 0x104c #define SGE_FL_BUFFER_SIZE3_A 0x1050 #define SGE_FL_BUFFER_SIZE4_A 0x1054 #define SGE_FL_BUFFER_SIZE5_A 0x1058 #define SGE_FL_BUFFER_SIZE6_A 0x105c #define SGE_FL_BUFFER_SIZE7_A 0x1060 #define SGE_FL_BUFFER_SIZE8_A 0x1064 #define SGE_IMSG_CTXT_BADDR_A 0x1088 #define SGE_FLM_CACHE_BADDR_A 0x108c #define SGE_INGRESS_RX_THRESHOLD_A 0x10a0 #define THRESHOLD_0_S 24 #define THRESHOLD_0_M 0x3fU #define THRESHOLD_0_V(x) ((x) << THRESHOLD_0_S) #define THRESHOLD_0_G(x) (((x) >> THRESHOLD_0_S) & THRESHOLD_0_M) #define THRESHOLD_1_S 16 #define THRESHOLD_1_M 0x3fU #define THRESHOLD_1_V(x) ((x) << THRESHOLD_1_S) #define THRESHOLD_1_G(x) (((x) >> THRESHOLD_1_S) & THRESHOLD_1_M) #define THRESHOLD_2_S 8 #define THRESHOLD_2_M 0x3fU #define THRESHOLD_2_V(x) ((x) << THRESHOLD_2_S) #define THRESHOLD_2_G(x) (((x) >> THRESHOLD_2_S) & THRESHOLD_2_M) #define THRESHOLD_3_S 0 #define THRESHOLD_3_M 0x3fU #define THRESHOLD_3_V(x) ((x) << THRESHOLD_3_S) #define THRESHOLD_3_G(x) (((x) >> THRESHOLD_3_S) & THRESHOLD_3_M) #define SGE_CONM_CTRL_A 0x1094 #define EGRTHRESHOLD_S 8 #define EGRTHRESHOLD_M 0x3fU #define EGRTHRESHOLD_V(x) ((x) << EGRTHRESHOLD_S) #define EGRTHRESHOLD_G(x) (((x) >> EGRTHRESHOLD_S) & EGRTHRESHOLD_M) #define EGRTHRESHOLDPACKING_S 14 #define EGRTHRESHOLDPACKING_M 0x3fU #define EGRTHRESHOLDPACKING_V(x) ((x) << EGRTHRESHOLDPACKING_S) #define EGRTHRESHOLDPACKING_G(x) \ (((x) >> EGRTHRESHOLDPACKING_S) & EGRTHRESHOLDPACKING_M) #define T6_EGRTHRESHOLDPACKING_S 16 #define T6_EGRTHRESHOLDPACKING_M 0xffU #define 
T6_EGRTHRESHOLDPACKING_G(x) \ (((x) >> T6_EGRTHRESHOLDPACKING_S) & T6_EGRTHRESHOLDPACKING_M) #define SGE_TIMESTAMP_LO_A 0x1098 #define SGE_TIMESTAMP_HI_A 0x109c #define TSOP_S 28 #define TSOP_M 0x3U #define TSOP_V(x) ((x) << TSOP_S) #define TSOP_G(x) (((x) >> TSOP_S) & TSOP_M) #define TSVAL_S 0 #define TSVAL_M 0xfffffffU #define TSVAL_V(x) ((x) << TSVAL_S) #define TSVAL_G(x) (((x) >> TSVAL_S) & TSVAL_M) #define SGE_DBFIFO_STATUS_A 0x10a4 #define SGE_DBVFIFO_SIZE_A 0x113c #define HP_INT_THRESH_S 28 #define HP_INT_THRESH_M 0xfU #define HP_INT_THRESH_V(x) ((x) << HP_INT_THRESH_S) #define LP_INT_THRESH_S 12 #define LP_INT_THRESH_M 0xfU #define LP_INT_THRESH_V(x) ((x) << LP_INT_THRESH_S) #define SGE_DOORBELL_CONTROL_A 0x10a8 #define NOCOALESCE_S 26 #define NOCOALESCE_V(x) ((x) << NOCOALESCE_S) #define NOCOALESCE_F NOCOALESCE_V(1U) #define ENABLE_DROP_S 13 #define ENABLE_DROP_V(x) ((x) << ENABLE_DROP_S) #define ENABLE_DROP_F ENABLE_DROP_V(1U) #define SGE_TIMER_VALUE_0_AND_1_A 0x10b8 #define TIMERVALUE0_S 16 #define TIMERVALUE0_M 0xffffU #define TIMERVALUE0_V(x) ((x) << TIMERVALUE0_S) #define TIMERVALUE0_G(x) (((x) >> TIMERVALUE0_S) & TIMERVALUE0_M) #define TIMERVALUE1_S 0 #define TIMERVALUE1_M 0xffffU #define TIMERVALUE1_V(x) ((x) << TIMERVALUE1_S) #define TIMERVALUE1_G(x) (((x) >> TIMERVALUE1_S) & TIMERVALUE1_M) #define SGE_TIMER_VALUE_2_AND_3_A 0x10bc #define TIMERVALUE2_S 16 #define TIMERVALUE2_M 0xffffU #define TIMERVALUE2_V(x) ((x) << TIMERVALUE2_S) #define TIMERVALUE2_G(x) (((x) >> TIMERVALUE2_S) & TIMERVALUE2_M) #define TIMERVALUE3_S 0 #define TIMERVALUE3_M 0xffffU #define TIMERVALUE3_V(x) ((x) << TIMERVALUE3_S) #define TIMERVALUE3_G(x) (((x) >> TIMERVALUE3_S) & TIMERVALUE3_M) #define SGE_TIMER_VALUE_4_AND_5_A 0x10c0 #define TIMERVALUE4_S 16 #define TIMERVALUE4_M 0xffffU #define TIMERVALUE4_V(x) ((x) << TIMERVALUE4_S) #define TIMERVALUE4_G(x) (((x) >> TIMERVALUE4_S) & TIMERVALUE4_M) #define TIMERVALUE5_S 0 #define TIMERVALUE5_M 0xffffU #define TIMERVALUE5_V(x) ((x) << TIMERVALUE5_S) #define TIMERVALUE5_G(x) (((x) >> TIMERVALUE5_S) & TIMERVALUE5_M) #define SGE_DEBUG_INDEX_A 0x10cc #define SGE_DEBUG_DATA_HIGH_A 0x10d0 #define SGE_DEBUG_DATA_LOW_A 0x10d4 #define SGE_DEBUG_DATA_LOW_INDEX_2_A 0x12c8 #define SGE_DEBUG_DATA_LOW_INDEX_3_A 0x12cc #define SGE_DEBUG_DATA_HIGH_INDEX_10_A 0x12a8 #define SGE_INGRESS_QUEUES_PER_PAGE_PF_A 0x10f4 #define SGE_INGRESS_QUEUES_PER_PAGE_VF_A 0x10f8 #define SGE_ERROR_STATS_A 0x1100 #define UNCAPTURED_ERROR_S 18 #define UNCAPTURED_ERROR_V(x) ((x) << UNCAPTURED_ERROR_S) #define UNCAPTURED_ERROR_F UNCAPTURED_ERROR_V(1U) #define ERROR_QID_VALID_S 17 #define ERROR_QID_VALID_V(x) ((x) << ERROR_QID_VALID_S) #define ERROR_QID_VALID_F ERROR_QID_VALID_V(1U) #define ERROR_QID_S 0 #define ERROR_QID_M 0x1ffffU #define ERROR_QID_G(x) (((x) >> ERROR_QID_S) & ERROR_QID_M) #define HP_INT_THRESH_S 28 #define HP_INT_THRESH_M 0xfU #define HP_INT_THRESH_V(x) ((x) << HP_INT_THRESH_S) #define HP_COUNT_S 16 #define HP_COUNT_M 0x7ffU #define HP_COUNT_G(x) (((x) >> HP_COUNT_S) & HP_COUNT_M) #define LP_INT_THRESH_S 12 #define LP_INT_THRESH_M 0xfU #define LP_INT_THRESH_V(x) ((x) << LP_INT_THRESH_S) #define LP_COUNT_S 0 #define LP_COUNT_M 0x7ffU #define LP_COUNT_G(x) (((x) >> LP_COUNT_S) & LP_COUNT_M) #define LP_INT_THRESH_T5_S 18 #define LP_INT_THRESH_T5_M 0xfffU #define LP_INT_THRESH_T5_V(x) ((x) << LP_INT_THRESH_T5_S) #define LP_COUNT_T5_S 0 #define LP_COUNT_T5_M 0x3ffffU #define LP_COUNT_T5_G(x) (((x) >> LP_COUNT_T5_S) & LP_COUNT_T5_M) #define SGE_DOORBELL_CONTROL_A 0x10a8 #define 
SGE_STAT_TOTAL_A 0x10e4 #define SGE_STAT_MATCH_A 0x10e8 #define SGE_STAT_CFG_A 0x10ec #define STATMODE_S 2 #define STATMODE_V(x) ((x) << STATMODE_S) #define STATSOURCE_T5_S 9 #define STATSOURCE_T5_M 0xfU #define STATSOURCE_T5_V(x) ((x) << STATSOURCE_T5_S) #define STATSOURCE_T5_G(x) (((x) >> STATSOURCE_T5_S) & STATSOURCE_T5_M) #define T6_STATMODE_S 0 #define T6_STATMODE_V(x) ((x) << T6_STATMODE_S) #define SGE_DBFIFO_STATUS2_A 0x1118 #define HP_INT_THRESH_T5_S 10 #define HP_INT_THRESH_T5_M 0xfU #define HP_INT_THRESH_T5_V(x) ((x) << HP_INT_THRESH_T5_S) #define HP_COUNT_T5_S 0 #define HP_COUNT_T5_M 0x3ffU #define HP_COUNT_T5_G(x) (((x) >> HP_COUNT_T5_S) & HP_COUNT_T5_M) #define ENABLE_DROP_S 13 #define ENABLE_DROP_V(x) ((x) << ENABLE_DROP_S) #define ENABLE_DROP_F ENABLE_DROP_V(1U) #define DROPPED_DB_S 0 #define DROPPED_DB_V(x) ((x) << DROPPED_DB_S) #define DROPPED_DB_F DROPPED_DB_V(1U) #define SGE_CTXT_CMD_A 0x11fc #define SGE_DBQ_CTXT_BADDR_A 0x1084 /* registers for module PCIE */ #define PCIE_PF_CFG_A 0x40 #define AIVEC_S 4 #define AIVEC_M 0x3ffU #define AIVEC_V(x) ((x) << AIVEC_S) #define PCIE_PF_CLI_A 0x44 #define PCIE_INT_CAUSE_A 0x3004 #define UNXSPLCPLERR_S 29 #define UNXSPLCPLERR_V(x) ((x) << UNXSPLCPLERR_S) #define UNXSPLCPLERR_F UNXSPLCPLERR_V(1U) #define PCIEPINT_S 28 #define PCIEPINT_V(x) ((x) << PCIEPINT_S) #define PCIEPINT_F PCIEPINT_V(1U) #define PCIESINT_S 27 #define PCIESINT_V(x) ((x) << PCIESINT_S) #define PCIESINT_F PCIESINT_V(1U) #define RPLPERR_S 26 #define RPLPERR_V(x) ((x) << RPLPERR_S) #define RPLPERR_F RPLPERR_V(1U) #define RXWRPERR_S 25 #define RXWRPERR_V(x) ((x) << RXWRPERR_S) #define RXWRPERR_F RXWRPERR_V(1U) #define RXCPLPERR_S 24 #define RXCPLPERR_V(x) ((x) << RXCPLPERR_S) #define RXCPLPERR_F RXCPLPERR_V(1U) #define PIOTAGPERR_S 23 #define PIOTAGPERR_V(x) ((x) << PIOTAGPERR_S) #define PIOTAGPERR_F PIOTAGPERR_V(1U) #define MATAGPERR_S 22 #define MATAGPERR_V(x) ((x) << MATAGPERR_S) #define MATAGPERR_F MATAGPERR_V(1U) #define INTXCLRPERR_S 21 #define INTXCLRPERR_V(x) ((x) << INTXCLRPERR_S) #define INTXCLRPERR_F INTXCLRPERR_V(1U) #define FIDPERR_S 20 #define FIDPERR_V(x) ((x) << FIDPERR_S) #define FIDPERR_F FIDPERR_V(1U) #define CFGSNPPERR_S 19 #define CFGSNPPERR_V(x) ((x) << CFGSNPPERR_S) #define CFGSNPPERR_F CFGSNPPERR_V(1U) #define HRSPPERR_S 18 #define HRSPPERR_V(x) ((x) << HRSPPERR_S) #define HRSPPERR_F HRSPPERR_V(1U) #define HREQPERR_S 17 #define HREQPERR_V(x) ((x) << HREQPERR_S) #define HREQPERR_F HREQPERR_V(1U) #define HCNTPERR_S 16 #define HCNTPERR_V(x) ((x) << HCNTPERR_S) #define HCNTPERR_F HCNTPERR_V(1U) #define DRSPPERR_S 15 #define DRSPPERR_V(x) ((x) << DRSPPERR_S) #define DRSPPERR_F DRSPPERR_V(1U) #define DREQPERR_S 14 #define DREQPERR_V(x) ((x) << DREQPERR_S) #define DREQPERR_F DREQPERR_V(1U) #define DCNTPERR_S 13 #define DCNTPERR_V(x) ((x) << DCNTPERR_S) #define DCNTPERR_F DCNTPERR_V(1U) #define CRSPPERR_S 12 #define CRSPPERR_V(x) ((x) << CRSPPERR_S) #define CRSPPERR_F CRSPPERR_V(1U) #define CREQPERR_S 11 #define CREQPERR_V(x) ((x) << CREQPERR_S) #define CREQPERR_F CREQPERR_V(1U) #define CCNTPERR_S 10 #define CCNTPERR_V(x) ((x) << CCNTPERR_S) #define CCNTPERR_F CCNTPERR_V(1U) #define TARTAGPERR_S 9 #define TARTAGPERR_V(x) ((x) << TARTAGPERR_S) #define TARTAGPERR_F TARTAGPERR_V(1U) #define PIOREQPERR_S 8 #define PIOREQPERR_V(x) ((x) << PIOREQPERR_S) #define PIOREQPERR_F PIOREQPERR_V(1U) #define PIOCPLPERR_S 7 #define PIOCPLPERR_V(x) ((x) << PIOCPLPERR_S) #define PIOCPLPERR_F PIOCPLPERR_V(1U) #define MSIXDIPERR_S 6 #define MSIXDIPERR_V(x) ((x) << 
MSIXDIPERR_S) #define MSIXDIPERR_F MSIXDIPERR_V(1U) #define MSIXDATAPERR_S 5 #define MSIXDATAPERR_V(x) ((x) << MSIXDATAPERR_S) #define MSIXDATAPERR_F MSIXDATAPERR_V(1U) #define MSIXADDRHPERR_S 4 #define MSIXADDRHPERR_V(x) ((x) << MSIXADDRHPERR_S) #define MSIXADDRHPERR_F MSIXADDRHPERR_V(1U) #define MSIXADDRLPERR_S 3 #define MSIXADDRLPERR_V(x) ((x) << MSIXADDRLPERR_S) #define MSIXADDRLPERR_F MSIXADDRLPERR_V(1U) #define MSIDATAPERR_S 2 #define MSIDATAPERR_V(x) ((x) << MSIDATAPERR_S) #define MSIDATAPERR_F MSIDATAPERR_V(1U) #define MSIADDRHPERR_S 1 #define MSIADDRHPERR_V(x) ((x) << MSIADDRHPERR_S) #define MSIADDRHPERR_F MSIADDRHPERR_V(1U) #define MSIADDRLPERR_S 0 #define MSIADDRLPERR_V(x) ((x) << MSIADDRLPERR_S) #define MSIADDRLPERR_F MSIADDRLPERR_V(1U) #define READRSPERR_S 29 #define READRSPERR_V(x) ((x) << READRSPERR_S) #define READRSPERR_F READRSPERR_V(1U) #define TRGT1GRPPERR_S 28 #define TRGT1GRPPERR_V(x) ((x) << TRGT1GRPPERR_S) #define TRGT1GRPPERR_F TRGT1GRPPERR_V(1U) #define IPSOTPERR_S 27 #define IPSOTPERR_V(x) ((x) << IPSOTPERR_S) #define IPSOTPERR_F IPSOTPERR_V(1U) #define IPRETRYPERR_S 26 #define IPRETRYPERR_V(x) ((x) << IPRETRYPERR_S) #define IPRETRYPERR_F IPRETRYPERR_V(1U) #define IPRXDATAGRPPERR_S 25 #define IPRXDATAGRPPERR_V(x) ((x) << IPRXDATAGRPPERR_S) #define IPRXDATAGRPPERR_F IPRXDATAGRPPERR_V(1U) #define IPRXHDRGRPPERR_S 24 #define IPRXHDRGRPPERR_V(x) ((x) << IPRXHDRGRPPERR_S) #define IPRXHDRGRPPERR_F IPRXHDRGRPPERR_V(1U) #define MAGRPPERR_S 22 #define MAGRPPERR_V(x) ((x) << MAGRPPERR_S) #define MAGRPPERR_F MAGRPPERR_V(1U) #define VFIDPERR_S 21 #define VFIDPERR_V(x) ((x) << VFIDPERR_S) #define VFIDPERR_F VFIDPERR_V(1U) #define HREQWRPERR_S 16 #define HREQWRPERR_V(x) ((x) << HREQWRPERR_S) #define HREQWRPERR_F HREQWRPERR_V(1U) #define DREQWRPERR_S 13 #define DREQWRPERR_V(x) ((x) << DREQWRPERR_S) #define DREQWRPERR_F DREQWRPERR_V(1U) #define CREQRDPERR_S 11 #define CREQRDPERR_V(x) ((x) << CREQRDPERR_S) #define CREQRDPERR_F CREQRDPERR_V(1U) #define MSTTAGQPERR_S 10 #define MSTTAGQPERR_V(x) ((x) << MSTTAGQPERR_S) #define MSTTAGQPERR_F MSTTAGQPERR_V(1U) #define PIOREQGRPPERR_S 8 #define PIOREQGRPPERR_V(x) ((x) << PIOREQGRPPERR_S) #define PIOREQGRPPERR_F PIOREQGRPPERR_V(1U) #define PIOCPLGRPPERR_S 7 #define PIOCPLGRPPERR_V(x) ((x) << PIOCPLGRPPERR_S) #define PIOCPLGRPPERR_F PIOCPLGRPPERR_V(1U) #define MSIXSTIPERR_S 2 #define MSIXSTIPERR_V(x) ((x) << MSIXSTIPERR_S) #define MSIXSTIPERR_F MSIXSTIPERR_V(1U) #define MSTTIMEOUTPERR_S 1 #define MSTTIMEOUTPERR_V(x) ((x) << MSTTIMEOUTPERR_S) #define MSTTIMEOUTPERR_F MSTTIMEOUTPERR_V(1U) #define MSTGRPPERR_S 0 #define MSTGRPPERR_V(x) ((x) << MSTGRPPERR_S) #define MSTGRPPERR_F MSTGRPPERR_V(1U) #define PCIE_NONFAT_ERR_A 0x3010 #define PCIE_CFG_SPACE_REQ_A 0x3060 #define PCIE_CFG_SPACE_DATA_A 0x3064 #define PCIE_MEM_ACCESS_BASE_WIN_A 0x3068 #define PCIEOFST_S 10 #define PCIEOFST_M 0x3fffffU #define PCIEOFST_G(x) (((x) >> PCIEOFST_S) & PCIEOFST_M) #define BIR_S 8 #define BIR_M 0x3U #define BIR_V(x) ((x) << BIR_S) #define BIR_G(x) (((x) >> BIR_S) & BIR_M) #define WINDOW_S 0 #define WINDOW_M 0xffU #define WINDOW_V(x) ((x) << WINDOW_S) #define WINDOW_G(x) (((x) >> WINDOW_S) & WINDOW_M) #define PCIE_MEM_ACCESS_OFFSET_A 0x306c #define ENABLE_S 30 #define ENABLE_V(x) ((x) << ENABLE_S) #define ENABLE_F ENABLE_V(1U) #define LOCALCFG_S 28 #define LOCALCFG_V(x) ((x) << LOCALCFG_S) #define LOCALCFG_F LOCALCFG_V(1U) #define FUNCTION_S 12 #define FUNCTION_V(x) ((x) << FUNCTION_S) #define REGISTER_S 0 #define REGISTER_V(x) ((x) << REGISTER_S) #define 
T6_ENABLE_S 31 #define T6_ENABLE_V(x) ((x) << T6_ENABLE_S) #define T6_ENABLE_F T6_ENABLE_V(1U) #define PFNUM_S 0 #define PFNUM_V(x) ((x) << PFNUM_S) #define PCIE_FW_A 0x30b8 #define PCIE_FW_PF_A 0x30bc #define PCIE_CORE_UTL_SYSTEM_BUS_AGENT_STATUS_A 0x5908 #define RNPP_S 31 #define RNPP_V(x) ((x) << RNPP_S) #define RNPP_F RNPP_V(1U) #define RPCP_S 29 #define RPCP_V(x) ((x) << RPCP_S) #define RPCP_F RPCP_V(1U) #define RCIP_S 27 #define RCIP_V(x) ((x) << RCIP_S) #define RCIP_F RCIP_V(1U) #define RCCP_S 26 #define RCCP_V(x) ((x) << RCCP_S) #define RCCP_F RCCP_V(1U) #define RFTP_S 23 #define RFTP_V(x) ((x) << RFTP_S) #define RFTP_F RFTP_V(1U) #define PTRP_S 20 #define PTRP_V(x) ((x) << PTRP_S) #define PTRP_F PTRP_V(1U) #define PCIE_CORE_UTL_PCI_EXPRESS_PORT_STATUS_A 0x59a4 #define TPCP_S 30 #define TPCP_V(x) ((x) << TPCP_S) #define TPCP_F TPCP_V(1U) #define TNPP_S 29 #define TNPP_V(x) ((x) << TNPP_S) #define TNPP_F TNPP_V(1U) #define TFTP_S 28 #define TFTP_V(x) ((x) << TFTP_S) #define TFTP_F TFTP_V(1U) #define TCAP_S 27 #define TCAP_V(x) ((x) << TCAP_S) #define TCAP_F TCAP_V(1U) #define TCIP_S 26 #define TCIP_V(x) ((x) << TCIP_S) #define TCIP_F TCIP_V(1U) #define RCAP_S 25 #define RCAP_V(x) ((x) << RCAP_S) #define RCAP_F RCAP_V(1U) #define PLUP_S 23 #define PLUP_V(x) ((x) << PLUP_S) #define PLUP_F PLUP_V(1U) #define PLDN_S 22 #define PLDN_V(x) ((x) << PLDN_S) #define PLDN_F PLDN_V(1U) #define OTDD_S 21 #define OTDD_V(x) ((x) << OTDD_S) #define OTDD_F OTDD_V(1U) #define GTRP_S 20 #define GTRP_V(x) ((x) << GTRP_S) #define GTRP_F GTRP_V(1U) #define RDPE_S 18 #define RDPE_V(x) ((x) << RDPE_S) #define RDPE_F RDPE_V(1U) #define TDCE_S 17 #define TDCE_V(x) ((x) << TDCE_S) #define TDCE_F TDCE_V(1U) #define TDUE_S 16 #define TDUE_V(x) ((x) << TDUE_S) #define TDUE_F TDUE_V(1U) /* registers for module MC */ #define MC_INT_CAUSE_A 0x7518 #define MC_P_INT_CAUSE_A 0x41318 #define ECC_UE_INT_CAUSE_S 2 #define ECC_UE_INT_CAUSE_V(x) ((x) << ECC_UE_INT_CAUSE_S) #define ECC_UE_INT_CAUSE_F ECC_UE_INT_CAUSE_V(1U) #define ECC_CE_INT_CAUSE_S 1 #define ECC_CE_INT_CAUSE_V(x) ((x) << ECC_CE_INT_CAUSE_S) #define ECC_CE_INT_CAUSE_F ECC_CE_INT_CAUSE_V(1U) #define PERR_INT_CAUSE_S 0 #define PERR_INT_CAUSE_V(x) ((x) << PERR_INT_CAUSE_S) #define PERR_INT_CAUSE_F PERR_INT_CAUSE_V(1U) #define MC_ECC_STATUS_A 0x751c #define MC_P_ECC_STATUS_A 0x4131c #define ECC_CECNT_S 16 #define ECC_CECNT_M 0xffffU #define ECC_CECNT_V(x) ((x) << ECC_CECNT_S) #define ECC_CECNT_G(x) (((x) >> ECC_CECNT_S) & ECC_CECNT_M) #define ECC_UECNT_S 0 #define ECC_UECNT_M 0xffffU #define ECC_UECNT_V(x) ((x) << ECC_UECNT_S) #define ECC_UECNT_G(x) (((x) >> ECC_UECNT_S) & ECC_UECNT_M) #define MC_BIST_CMD_A 0x7600 #define START_BIST_S 31 #define START_BIST_V(x) ((x) << START_BIST_S) #define START_BIST_F START_BIST_V(1U) #define BIST_CMD_GAP_S 8 #define BIST_CMD_GAP_V(x) ((x) << BIST_CMD_GAP_S) #define BIST_OPCODE_S 0 #define BIST_OPCODE_V(x) ((x) << BIST_OPCODE_S) #define MC_BIST_CMD_ADDR_A 0x7604 #define MC_BIST_CMD_LEN_A 0x7608 #define MC_BIST_DATA_PATTERN_A 0x760c #define MC_BIST_STATUS_RDATA_A 0x7688 /* registers for module MA */ #define MA_EDRAM0_BAR_A 0x77c0 #define EDRAM0_BASE_S 16 #define EDRAM0_BASE_M 0xfffU #define EDRAM0_BASE_G(x) (((x) >> EDRAM0_BASE_S) & EDRAM0_BASE_M) #define EDRAM0_SIZE_S 0 #define EDRAM0_SIZE_M 0xfffU #define EDRAM0_SIZE_V(x) ((x) << EDRAM0_SIZE_S) #define EDRAM0_SIZE_G(x) (((x) >> EDRAM0_SIZE_S) & EDRAM0_SIZE_M) #define MA_EDRAM1_BAR_A 0x77c4 #define EDRAM1_BASE_S 16 #define EDRAM1_BASE_M 0xfffU #define EDRAM1_BASE_G(x) (((x) 
>> EDRAM1_BASE_S) & EDRAM1_BASE_M) #define EDRAM1_SIZE_S 0 #define EDRAM1_SIZE_M 0xfffU #define EDRAM1_SIZE_V(x) ((x) << EDRAM1_SIZE_S) #define EDRAM1_SIZE_G(x) (((x) >> EDRAM1_SIZE_S) & EDRAM1_SIZE_M) #define MA_EXT_MEMORY_BAR_A 0x77c8 #define EXT_MEM_BASE_S 16 #define EXT_MEM_BASE_M 0xfffU #define EXT_MEM_BASE_V(x) ((x) << EXT_MEM_BASE_S) #define EXT_MEM_BASE_G(x) (((x) >> EXT_MEM_BASE_S) & EXT_MEM_BASE_M) #define EXT_MEM_SIZE_S 0 #define EXT_MEM_SIZE_M 0xfffU #define EXT_MEM_SIZE_V(x) ((x) << EXT_MEM_SIZE_S) #define EXT_MEM_SIZE_G(x) (((x) >> EXT_MEM_SIZE_S) & EXT_MEM_SIZE_M) #define MA_EXT_MEMORY1_BAR_A 0x7808 #define EXT_MEM1_BASE_S 16 #define EXT_MEM1_BASE_M 0xfffU #define EXT_MEM1_BASE_G(x) (((x) >> EXT_MEM1_BASE_S) & EXT_MEM1_BASE_M) #define EXT_MEM1_SIZE_S 0 #define EXT_MEM1_SIZE_M 0xfffU #define EXT_MEM1_SIZE_V(x) ((x) << EXT_MEM1_SIZE_S) #define EXT_MEM1_SIZE_G(x) (((x) >> EXT_MEM1_SIZE_S) & EXT_MEM1_SIZE_M) #define MA_EXT_MEMORY0_BAR_A 0x77c8 #define EXT_MEM0_BASE_S 16 #define EXT_MEM0_BASE_M 0xfffU #define EXT_MEM0_BASE_G(x) (((x) >> EXT_MEM0_BASE_S) & EXT_MEM0_BASE_M) #define EXT_MEM0_SIZE_S 0 #define EXT_MEM0_SIZE_M 0xfffU #define EXT_MEM0_SIZE_V(x) ((x) << EXT_MEM0_SIZE_S) #define EXT_MEM0_SIZE_G(x) (((x) >> EXT_MEM0_SIZE_S) & EXT_MEM0_SIZE_M) #define MA_TARGET_MEM_ENABLE_A 0x77d8 #define EXT_MEM_ENABLE_S 2 #define EXT_MEM_ENABLE_V(x) ((x) << EXT_MEM_ENABLE_S) #define EXT_MEM_ENABLE_F EXT_MEM_ENABLE_V(1U) #define EDRAM1_ENABLE_S 1 #define EDRAM1_ENABLE_V(x) ((x) << EDRAM1_ENABLE_S) #define EDRAM1_ENABLE_F EDRAM1_ENABLE_V(1U) #define EDRAM0_ENABLE_S 0 #define EDRAM0_ENABLE_V(x) ((x) << EDRAM0_ENABLE_S) #define EDRAM0_ENABLE_F EDRAM0_ENABLE_V(1U) #define EXT_MEM1_ENABLE_S 4 #define EXT_MEM1_ENABLE_V(x) ((x) << EXT_MEM1_ENABLE_S) #define EXT_MEM1_ENABLE_F EXT_MEM1_ENABLE_V(1U) #define EXT_MEM0_ENABLE_S 2 #define EXT_MEM0_ENABLE_V(x) ((x) << EXT_MEM0_ENABLE_S) #define EXT_MEM0_ENABLE_F EXT_MEM0_ENABLE_V(1U) #define MA_INT_CAUSE_A 0x77e0 #define MEM_PERR_INT_CAUSE_S 1 #define MEM_PERR_INT_CAUSE_V(x) ((x) << MEM_PERR_INT_CAUSE_S) #define MEM_PERR_INT_CAUSE_F MEM_PERR_INT_CAUSE_V(1U) #define MEM_WRAP_INT_CAUSE_S 0 #define MEM_WRAP_INT_CAUSE_V(x) ((x) << MEM_WRAP_INT_CAUSE_S) #define MEM_WRAP_INT_CAUSE_F MEM_WRAP_INT_CAUSE_V(1U) #define MA_INT_WRAP_STATUS_A 0x77e4 #define MEM_WRAP_ADDRESS_S 4 #define MEM_WRAP_ADDRESS_M 0xfffffffU #define MEM_WRAP_ADDRESS_G(x) (((x) >> MEM_WRAP_ADDRESS_S) & MEM_WRAP_ADDRESS_M) #define MEM_WRAP_CLIENT_NUM_S 0 #define MEM_WRAP_CLIENT_NUM_M 0xfU #define MEM_WRAP_CLIENT_NUM_G(x) \ (((x) >> MEM_WRAP_CLIENT_NUM_S) & MEM_WRAP_CLIENT_NUM_M) #define MA_PARITY_ERROR_STATUS_A 0x77f4 #define MA_PARITY_ERROR_STATUS1_A 0x77f4 #define MA_PARITY_ERROR_STATUS2_A 0x7804 /* registers for module EDC_0 */ #define EDC_0_BASE_ADDR 0x7900 #define EDC_BIST_CMD_A 0x7904 #define EDC_BIST_CMD_ADDR_A 0x7908 #define EDC_BIST_CMD_LEN_A 0x790c #define EDC_BIST_DATA_PATTERN_A 0x7910 #define EDC_BIST_STATUS_RDATA_A 0x7928 #define EDC_INT_CAUSE_A 0x7978 #define ECC_UE_PAR_S 5 #define ECC_UE_PAR_V(x) ((x) << ECC_UE_PAR_S) #define ECC_UE_PAR_F ECC_UE_PAR_V(1U) #define ECC_CE_PAR_S 4 #define ECC_CE_PAR_V(x) ((x) << ECC_CE_PAR_S) #define ECC_CE_PAR_F ECC_CE_PAR_V(1U) #define PERR_PAR_CAUSE_S 3 #define PERR_PAR_CAUSE_V(x) ((x) << PERR_PAR_CAUSE_S) #define PERR_PAR_CAUSE_F PERR_PAR_CAUSE_V(1U) #define EDC_ECC_STATUS_A 0x797c /* registers for module EDC_1 */ #define EDC_1_BASE_ADDR 0x7980 /* registers for module CIM */ #define CIM_BOOT_CFG_A 0x7b00 #define CIM_SDRAM_BASE_ADDR_A 0x7b14 
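/*
 * Editor's note: register fields in this file follow a single naming
 * convention: FIELD_S is the bit offset, FIELD_M the unshifted mask,
 * FIELD_V(x) packs a value into place and FIELD_G(x) extracts it again,
 * so FIELD_G(FIELD_V(v)) == v for any in-range v.  A compiled-out sketch
 * using the MA_EXT_MEMORY_BAR_A fields defined above:
 */
#if 0
static inline u32 example_ext_mem_base_roundtrip(void)
{
	/* pack base 0xab and size 0x40 into one register image ... */
	u32 reg = EXT_MEM_BASE_V(0xab) | EXT_MEM_SIZE_V(0x40);

	/* ... then pull the base field back out: yields 0xab */
	return EXT_MEM_BASE_G(reg);
}
#endif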
#define CIM_SDRAM_ADDR_SIZE_A 0x7b18 #define CIM_EXTMEM2_BASE_ADDR_A 0x7b1c #define CIM_EXTMEM2_ADDR_SIZE_A 0x7b20 #define CIM_PF_MAILBOX_CTRL_SHADOW_COPY_A 0x290 #define BOOTADDR_M 0xffffff00U #define UPCRST_S 0 #define UPCRST_V(x) ((x) << UPCRST_S) #define UPCRST_F UPCRST_V(1U) #define CIM_PF_MAILBOX_DATA_A 0x240 #define CIM_PF_MAILBOX_CTRL_A 0x280 #define MBMSGVALID_S 3 #define MBMSGVALID_V(x) ((x) << MBMSGVALID_S) #define MBMSGVALID_F MBMSGVALID_V(1U) #define MBINTREQ_S 2 #define MBINTREQ_V(x) ((x) << MBINTREQ_S) #define MBINTREQ_F MBINTREQ_V(1U) #define MBOWNER_S 0 #define MBOWNER_M 0x3U #define MBOWNER_V(x) ((x) << MBOWNER_S) #define MBOWNER_G(x) (((x) >> MBOWNER_S) & MBOWNER_M) #define CIM_PF_HOST_INT_ENABLE_A 0x288 #define MBMSGRDYINTEN_S 19 #define MBMSGRDYINTEN_V(x) ((x) << MBMSGRDYINTEN_S) #define MBMSGRDYINTEN_F MBMSGRDYINTEN_V(1U) #define CIM_PF_HOST_INT_CAUSE_A 0x28c #define MBMSGRDYINT_S 19 #define MBMSGRDYINT_V(x) ((x) << MBMSGRDYINT_S) #define MBMSGRDYINT_F MBMSGRDYINT_V(1U) #define CIM_HOST_INT_CAUSE_A 0x7b2c #define TIEQOUTPARERRINT_S 20 #define TIEQOUTPARERRINT_V(x) ((x) << TIEQOUTPARERRINT_S) #define TIEQOUTPARERRINT_F TIEQOUTPARERRINT_V(1U) #define TIEQINPARERRINT_S 19 #define TIEQINPARERRINT_V(x) ((x) << TIEQINPARERRINT_S) #define TIEQINPARERRINT_F TIEQINPARERRINT_V(1U) #define PREFDROPINT_S 1 #define PREFDROPINT_V(x) ((x) << PREFDROPINT_S) #define PREFDROPINT_F PREFDROPINT_V(1U) #define UPACCNONZERO_S 0 #define UPACCNONZERO_V(x) ((x) << UPACCNONZERO_S) #define UPACCNONZERO_F UPACCNONZERO_V(1U) #define MBHOSTPARERR_S 18 #define MBHOSTPARERR_V(x) ((x) << MBHOSTPARERR_S) #define MBHOSTPARERR_F MBHOSTPARERR_V(1U) #define MBUPPARERR_S 17 #define MBUPPARERR_V(x) ((x) << MBUPPARERR_S) #define MBUPPARERR_F MBUPPARERR_V(1U) #define IBQTP0PARERR_S 16 #define IBQTP0PARERR_V(x) ((x) << IBQTP0PARERR_S) #define IBQTP0PARERR_F IBQTP0PARERR_V(1U) #define IBQTP1PARERR_S 15 #define IBQTP1PARERR_V(x) ((x) << IBQTP1PARERR_S) #define IBQTP1PARERR_F IBQTP1PARERR_V(1U) #define IBQULPPARERR_S 14 #define IBQULPPARERR_V(x) ((x) << IBQULPPARERR_S) #define IBQULPPARERR_F IBQULPPARERR_V(1U) #define IBQSGELOPARERR_S 13 #define IBQSGELOPARERR_V(x) ((x) << IBQSGELOPARERR_S) #define IBQSGELOPARERR_F IBQSGELOPARERR_V(1U) #define IBQSGEHIPARERR_S 12 #define IBQSGEHIPARERR_V(x) ((x) << IBQSGEHIPARERR_S) #define IBQSGEHIPARERR_F IBQSGEHIPARERR_V(1U) #define IBQNCSIPARERR_S 11 #define IBQNCSIPARERR_V(x) ((x) << IBQNCSIPARERR_S) #define IBQNCSIPARERR_F IBQNCSIPARERR_V(1U) #define OBQULP0PARERR_S 10 #define OBQULP0PARERR_V(x) ((x) << OBQULP0PARERR_S) #define OBQULP0PARERR_F OBQULP0PARERR_V(1U) #define OBQULP1PARERR_S 9 #define OBQULP1PARERR_V(x) ((x) << OBQULP1PARERR_S) #define OBQULP1PARERR_F OBQULP1PARERR_V(1U) #define OBQULP2PARERR_S 8 #define OBQULP2PARERR_V(x) ((x) << OBQULP2PARERR_S) #define OBQULP2PARERR_F OBQULP2PARERR_V(1U) #define OBQULP3PARERR_S 7 #define OBQULP3PARERR_V(x) ((x) << OBQULP3PARERR_S) #define OBQULP3PARERR_F OBQULP3PARERR_V(1U) #define OBQSGEPARERR_S 6 #define OBQSGEPARERR_V(x) ((x) << OBQSGEPARERR_S) #define OBQSGEPARERR_F OBQSGEPARERR_V(1U) #define OBQNCSIPARERR_S 5 #define OBQNCSIPARERR_V(x) ((x) << OBQNCSIPARERR_S) #define OBQNCSIPARERR_F OBQNCSIPARERR_V(1U) #define CIM_HOST_UPACC_INT_CAUSE_A 0x7b34 #define EEPROMWRINT_S 30 #define EEPROMWRINT_V(x) ((x) << EEPROMWRINT_S) #define EEPROMWRINT_F EEPROMWRINT_V(1U) #define TIMEOUTMAINT_S 29 #define TIMEOUTMAINT_V(x) ((x) << TIMEOUTMAINT_S) #define TIMEOUTMAINT_F TIMEOUTMAINT_V(1U) #define TIMEOUTINT_S 28 #define TIMEOUTINT_V(x) 
((x) << TIMEOUTINT_S) #define TIMEOUTINT_F TIMEOUTINT_V(1U) #define RSPOVRLOOKUPINT_S 27 #define RSPOVRLOOKUPINT_V(x) ((x) << RSPOVRLOOKUPINT_S) #define RSPOVRLOOKUPINT_F RSPOVRLOOKUPINT_V(1U) #define REQOVRLOOKUPINT_S 26 #define REQOVRLOOKUPINT_V(x) ((x) << REQOVRLOOKUPINT_S) #define REQOVRLOOKUPINT_F REQOVRLOOKUPINT_V(1U) #define BLKWRPLINT_S 25 #define BLKWRPLINT_V(x) ((x) << BLKWRPLINT_S) #define BLKWRPLINT_F BLKWRPLINT_V(1U) #define BLKRDPLINT_S 24 #define BLKRDPLINT_V(x) ((x) << BLKRDPLINT_S) #define BLKRDPLINT_F BLKRDPLINT_V(1U) #define SGLWRPLINT_S 23 #define SGLWRPLINT_V(x) ((x) << SGLWRPLINT_S) #define SGLWRPLINT_F SGLWRPLINT_V(1U) #define SGLRDPLINT_S 22 #define SGLRDPLINT_V(x) ((x) << SGLRDPLINT_S) #define SGLRDPLINT_F SGLRDPLINT_V(1U) #define BLKWRCTLINT_S 21 #define BLKWRCTLINT_V(x) ((x) << BLKWRCTLINT_S) #define BLKWRCTLINT_F BLKWRCTLINT_V(1U) #define BLKRDCTLINT_S 20 #define BLKRDCTLINT_V(x) ((x) << BLKRDCTLINT_S) #define BLKRDCTLINT_F BLKRDCTLINT_V(1U) #define SGLWRCTLINT_S 19 #define SGLWRCTLINT_V(x) ((x) << SGLWRCTLINT_S) #define SGLWRCTLINT_F SGLWRCTLINT_V(1U) #define SGLRDCTLINT_S 18 #define SGLRDCTLINT_V(x) ((x) << SGLRDCTLINT_S) #define SGLRDCTLINT_F SGLRDCTLINT_V(1U) #define BLKWREEPROMINT_S 17 #define BLKWREEPROMINT_V(x) ((x) << BLKWREEPROMINT_S) #define BLKWREEPROMINT_F BLKWREEPROMINT_V(1U) #define BLKRDEEPROMINT_S 16 #define BLKRDEEPROMINT_V(x) ((x) << BLKRDEEPROMINT_S) #define BLKRDEEPROMINT_F BLKRDEEPROMINT_V(1U) #define SGLWREEPROMINT_S 15 #define SGLWREEPROMINT_V(x) ((x) << SGLWREEPROMINT_S) #define SGLWREEPROMINT_F SGLWREEPROMINT_V(1U) #define SGLRDEEPROMINT_S 14 #define SGLRDEEPROMINT_V(x) ((x) << SGLRDEEPROMINT_S) #define SGLRDEEPROMINT_F SGLRDEEPROMINT_V(1U) #define BLKWRFLASHINT_S 13 #define BLKWRFLASHINT_V(x) ((x) << BLKWRFLASHINT_S) #define BLKWRFLASHINT_F BLKWRFLASHINT_V(1U) #define BLKRDFLASHINT_S 12 #define BLKRDFLASHINT_V(x) ((x) << BLKRDFLASHINT_S) #define BLKRDFLASHINT_F BLKRDFLASHINT_V(1U) #define SGLWRFLASHINT_S 11 #define SGLWRFLASHINT_V(x) ((x) << SGLWRFLASHINT_S) #define SGLWRFLASHINT_F SGLWRFLASHINT_V(1U) #define SGLRDFLASHINT_S 10 #define SGLRDFLASHINT_V(x) ((x) << SGLRDFLASHINT_S) #define SGLRDFLASHINT_F SGLRDFLASHINT_V(1U) #define BLKWRBOOTINT_S 9 #define BLKWRBOOTINT_V(x) ((x) << BLKWRBOOTINT_S) #define BLKWRBOOTINT_F BLKWRBOOTINT_V(1U) #define BLKRDBOOTINT_S 8 #define BLKRDBOOTINT_V(x) ((x) << BLKRDBOOTINT_S) #define BLKRDBOOTINT_F BLKRDBOOTINT_V(1U) #define SGLWRBOOTINT_S 7 #define SGLWRBOOTINT_V(x) ((x) << SGLWRBOOTINT_S) #define SGLWRBOOTINT_F SGLWRBOOTINT_V(1U) #define SGLRDBOOTINT_S 6 #define SGLRDBOOTINT_V(x) ((x) << SGLRDBOOTINT_S) #define SGLRDBOOTINT_F SGLRDBOOTINT_V(1U) #define ILLWRBEINT_S 5 #define ILLWRBEINT_V(x) ((x) << ILLWRBEINT_S) #define ILLWRBEINT_F ILLWRBEINT_V(1U) #define ILLRDBEINT_S 4 #define ILLRDBEINT_V(x) ((x) << ILLRDBEINT_S) #define ILLRDBEINT_F ILLRDBEINT_V(1U) #define ILLRDINT_S 3 #define ILLRDINT_V(x) ((x) << ILLRDINT_S) #define ILLRDINT_F ILLRDINT_V(1U) #define ILLWRINT_S 2 #define ILLWRINT_V(x) ((x) << ILLWRINT_S) #define ILLWRINT_F ILLWRINT_V(1U) #define ILLTRANSINT_S 1 #define ILLTRANSINT_V(x) ((x) << ILLTRANSINT_S) #define ILLTRANSINT_F ILLTRANSINT_V(1U) #define RSVDSPACEINT_S 0 #define RSVDSPACEINT_V(x) ((x) << RSVDSPACEINT_S) #define RSVDSPACEINT_F RSVDSPACEINT_V(1U) /* registers for module TP */ #define DBGLAWHLF_S 23 #define DBGLAWHLF_V(x) ((x) << DBGLAWHLF_S) #define DBGLAWHLF_F DBGLAWHLF_V(1U) #define DBGLAWPTR_S 16 #define DBGLAWPTR_M 0x7fU #define DBGLAWPTR_G(x) (((x) >> DBGLAWPTR_S) & 
DBGLAWPTR_M) #define DBGLAENABLE_S 12 #define DBGLAENABLE_V(x) ((x) << DBGLAENABLE_S) #define DBGLAENABLE_F DBGLAENABLE_V(1U) #define DBGLARPTR_S 0 #define DBGLARPTR_M 0x7fU #define DBGLARPTR_V(x) ((x) << DBGLARPTR_S) #define TP_DBG_LA_DATAL_A 0x7ed8 #define TP_DBG_LA_CONFIG_A 0x7ed4 #define TP_OUT_CONFIG_A 0x7d04 #define TP_GLOBAL_CONFIG_A 0x7d08 #define TP_CMM_TCB_BASE_A 0x7d10 #define TP_CMM_MM_BASE_A 0x7d14 #define TP_CMM_TIMER_BASE_A 0x7d18 #define TP_PMM_TX_BASE_A 0x7d20 #define TP_PMM_RX_BASE_A 0x7d28 #define TP_PMM_RX_PAGE_SIZE_A 0x7d2c #define TP_PMM_RX_MAX_PAGE_A 0x7d30 #define TP_PMM_TX_PAGE_SIZE_A 0x7d34 #define TP_PMM_TX_MAX_PAGE_A 0x7d38 #define TP_CMM_MM_MAX_PSTRUCT_A 0x7e6c #define PMRXNUMCHN_S 31 #define PMRXNUMCHN_V(x) ((x) << PMRXNUMCHN_S) #define PMRXNUMCHN_F PMRXNUMCHN_V(1U) #define PMTXNUMCHN_S 30 #define PMTXNUMCHN_M 0x3U #define PMTXNUMCHN_G(x) (((x) >> PMTXNUMCHN_S) & PMTXNUMCHN_M) #define PMTXMAXPAGE_S 0 #define PMTXMAXPAGE_M 0x1fffffU #define PMTXMAXPAGE_G(x) (((x) >> PMTXMAXPAGE_S) & PMTXMAXPAGE_M) #define PMRXMAXPAGE_S 0 #define PMRXMAXPAGE_M 0x1fffffU #define PMRXMAXPAGE_G(x) (((x) >> PMRXMAXPAGE_S) & PMRXMAXPAGE_M) #define DBGLAMODE_S 14 #define DBGLAMODE_M 0x3U #define DBGLAMODE_G(x) (((x) >> DBGLAMODE_S) & DBGLAMODE_M) #define FIVETUPLELOOKUP_S 17 #define FIVETUPLELOOKUP_M 0x3U #define FIVETUPLELOOKUP_V(x) ((x) << FIVETUPLELOOKUP_S) #define FIVETUPLELOOKUP_G(x) (((x) >> FIVETUPLELOOKUP_S) & FIVETUPLELOOKUP_M) #define TP_PARA_REG2_A 0x7d68 #define MAXRXDATA_S 16 #define MAXRXDATA_M 0xffffU #define MAXRXDATA_G(x) (((x) >> MAXRXDATA_S) & MAXRXDATA_M) #define TP_TIMER_RESOLUTION_A 0x7d90 #define TIMERRESOLUTION_S 16 #define TIMERRESOLUTION_M 0xffU #define TIMERRESOLUTION_G(x) (((x) >> TIMERRESOLUTION_S) & TIMERRESOLUTION_M) #define TIMESTAMPRESOLUTION_S 8 #define TIMESTAMPRESOLUTION_M 0xffU #define TIMESTAMPRESOLUTION_G(x) \ (((x) >> TIMESTAMPRESOLUTION_S) & TIMESTAMPRESOLUTION_M) #define DELAYEDACKRESOLUTION_S 0 #define DELAYEDACKRESOLUTION_M 0xffU #define DELAYEDACKRESOLUTION_G(x) \ (((x) >> DELAYEDACKRESOLUTION_S) & DELAYEDACKRESOLUTION_M) #define TP_SHIFT_CNT_A 0x7dc0 #define TP_RXT_MIN_A 0x7d98 #define TP_RXT_MAX_A 0x7d9c #define TP_PERS_MIN_A 0x7da0 #define TP_PERS_MAX_A 0x7da4 #define TP_KEEP_IDLE_A 0x7da8 #define TP_KEEP_INTVL_A 0x7dac #define TP_INIT_SRTT_A 0x7db0 #define TP_DACK_TIMER_A 0x7db4 #define TP_FINWAIT2_TIMER_A 0x7db8 #define INITSRTT_S 0 #define INITSRTT_M 0xffffU #define INITSRTT_G(x) (((x) >> INITSRTT_S) & INITSRTT_M) #define PERSMAX_S 0 #define PERSMAX_M 0x3fffffffU #define PERSMAX_V(x) ((x) << PERSMAX_S) #define PERSMAX_G(x) (((x) >> PERSMAX_S) & PERSMAX_M) #define SYNSHIFTMAX_S 24 #define SYNSHIFTMAX_M 0xffU #define SYNSHIFTMAX_V(x) ((x) << SYNSHIFTMAX_S) #define SYNSHIFTMAX_G(x) (((x) >> SYNSHIFTMAX_S) & SYNSHIFTMAX_M) #define RXTSHIFTMAXR1_S 20 #define RXTSHIFTMAXR1_M 0xfU #define RXTSHIFTMAXR1_V(x) ((x) << RXTSHIFTMAXR1_S) #define RXTSHIFTMAXR1_G(x) (((x) >> RXTSHIFTMAXR1_S) & RXTSHIFTMAXR1_M) #define RXTSHIFTMAXR2_S 16 #define RXTSHIFTMAXR2_M 0xfU #define RXTSHIFTMAXR2_V(x) ((x) << RXTSHIFTMAXR2_S) #define RXTSHIFTMAXR2_G(x) (((x) >> RXTSHIFTMAXR2_S) & RXTSHIFTMAXR2_M) #define PERSHIFTBACKOFFMAX_S 12 #define PERSHIFTBACKOFFMAX_M 0xfU #define PERSHIFTBACKOFFMAX_V(x) ((x) << PERSHIFTBACKOFFMAX_S) #define PERSHIFTBACKOFFMAX_G(x) \ (((x) >> PERSHIFTBACKOFFMAX_S) & PERSHIFTBACKOFFMAX_M) #define PERSHIFTMAX_S 8 #define PERSHIFTMAX_M 0xfU #define PERSHIFTMAX_V(x) ((x) << PERSHIFTMAX_S) #define PERSHIFTMAX_G(x) (((x) >> PERSHIFTMAX_S) & 
PERSHIFTMAX_M) #define KEEPALIVEMAXR1_S 4 #define KEEPALIVEMAXR1_M 0xfU #define KEEPALIVEMAXR1_V(x) ((x) << KEEPALIVEMAXR1_S) #define KEEPALIVEMAXR1_G(x) (((x) >> KEEPALIVEMAXR1_S) & KEEPALIVEMAXR1_M) #define KEEPALIVEMAXR2_S 0 #define KEEPALIVEMAXR2_M 0xfU #define KEEPALIVEMAXR2_V(x) ((x) << KEEPALIVEMAXR2_S) #define KEEPALIVEMAXR2_G(x) (((x) >> KEEPALIVEMAXR2_S) & KEEPALIVEMAXR2_M) #define ROWINDEX_S 16 #define ROWINDEX_V(x) ((x) << ROWINDEX_S) #define TP_CCTRL_TABLE_A 0x7ddc #define TP_MTU_TABLE_A 0x7de4 #define MTUINDEX_S 24 #define MTUINDEX_V(x) ((x) << MTUINDEX_S) #define MTUWIDTH_S 16 #define MTUWIDTH_M 0xfU #define MTUWIDTH_V(x) ((x) << MTUWIDTH_S) #define MTUWIDTH_G(x) (((x) >> MTUWIDTH_S) & MTUWIDTH_M) #define MTUVALUE_S 0 #define MTUVALUE_M 0x3fffU #define MTUVALUE_V(x) ((x) << MTUVALUE_S) #define MTUVALUE_G(x) (((x) >> MTUVALUE_S) & MTUVALUE_M) #define TP_RSS_LKP_TABLE_A 0x7dec #define TP_CMM_MM_RX_FLST_BASE_A 0x7e60 #define TP_CMM_MM_TX_FLST_BASE_A 0x7e64 #define TP_CMM_MM_PS_FLST_BASE_A 0x7e68 #define LKPTBLROWVLD_S 31 #define LKPTBLROWVLD_V(x) ((x) << LKPTBLROWVLD_S) #define LKPTBLROWVLD_F LKPTBLROWVLD_V(1U) #define LKPTBLQUEUE1_S 10 #define LKPTBLQUEUE1_M 0x3ffU #define LKPTBLQUEUE1_G(x) (((x) >> LKPTBLQUEUE1_S) & LKPTBLQUEUE1_M) #define LKPTBLQUEUE0_S 0 #define LKPTBLQUEUE0_M 0x3ffU #define LKPTBLQUEUE0_G(x) (((x) >> LKPTBLQUEUE0_S) & LKPTBLQUEUE0_M) #define TP_PIO_ADDR_A 0x7e40 #define TP_PIO_DATA_A 0x7e44 #define TP_MIB_INDEX_A 0x7e50 #define TP_MIB_DATA_A 0x7e54 #define TP_INT_CAUSE_A 0x7e74 #define SRQTABLEPERR_S 1 #define SRQTABLEPERR_V(x) ((x) << SRQTABLEPERR_S) #define SRQTABLEPERR_F SRQTABLEPERR_V(1U) #define FLMTXFLSTEMPTY_S 30 #define FLMTXFLSTEMPTY_V(x) ((x) << FLMTXFLSTEMPTY_S) #define FLMTXFLSTEMPTY_F FLMTXFLSTEMPTY_V(1U) #define TP_TX_ORATE_A 0x7ebc #define OFDRATE3_S 24 #define OFDRATE3_M 0xffU #define OFDRATE3_G(x) (((x) >> OFDRATE3_S) & OFDRATE3_M) #define OFDRATE2_S 16 #define OFDRATE2_M 0xffU #define OFDRATE2_G(x) (((x) >> OFDRATE2_S) & OFDRATE2_M) #define OFDRATE1_S 8 #define OFDRATE1_M 0xffU #define OFDRATE1_G(x) (((x) >> OFDRATE1_S) & OFDRATE1_M) #define OFDRATE0_S 0 #define OFDRATE0_M 0xffU #define OFDRATE0_G(x) (((x) >> OFDRATE0_S) & OFDRATE0_M) #define TP_TX_TRATE_A 0x7ed0 #define TNLRATE3_S 24 #define TNLRATE3_M 0xffU #define TNLRATE3_G(x) (((x) >> TNLRATE3_S) & TNLRATE3_M) #define TNLRATE2_S 16 #define TNLRATE2_M 0xffU #define TNLRATE2_G(x) (((x) >> TNLRATE2_S) & TNLRATE2_M) #define TNLRATE1_S 8 #define TNLRATE1_M 0xffU #define TNLRATE1_G(x) (((x) >> TNLRATE1_S) & TNLRATE1_M) #define TNLRATE0_S 0 #define TNLRATE0_M 0xffU #define TNLRATE0_G(x) (((x) >> TNLRATE0_S) & TNLRATE0_M) #define TP_VLAN_PRI_MAP_A 0x140 #define FRAGMENTATION_S 9 #define FRAGMENTATION_V(x) ((x) << FRAGMENTATION_S) #define FRAGMENTATION_F FRAGMENTATION_V(1U) #define MPSHITTYPE_S 8 #define MPSHITTYPE_V(x) ((x) << MPSHITTYPE_S) #define MPSHITTYPE_F MPSHITTYPE_V(1U) #define MACMATCH_S 7 #define MACMATCH_V(x) ((x) << MACMATCH_S) #define MACMATCH_F MACMATCH_V(1U) #define ETHERTYPE_S 6 #define ETHERTYPE_V(x) ((x) << ETHERTYPE_S) #define ETHERTYPE_F ETHERTYPE_V(1U) #define PROTOCOL_S 5 #define PROTOCOL_V(x) ((x) << PROTOCOL_S) #define PROTOCOL_F PROTOCOL_V(1U) #define TOS_S 4 #define TOS_V(x) ((x) << TOS_S) #define TOS_F TOS_V(1U) #define VLAN_S 3 #define VLAN_V(x) ((x) << VLAN_S) #define VLAN_F VLAN_V(1U) #define VNIC_ID_S 2 #define VNIC_ID_V(x) ((x) << VNIC_ID_S) #define VNIC_ID_F VNIC_ID_V(1U) #define PORT_S 1 #define PORT_V(x) ((x) << PORT_S) #define PORT_F PORT_V(1U) #define 
FCOE_S 0 #define FCOE_V(x) ((x) << FCOE_S) #define FCOE_F FCOE_V(1U) #define FILTERMODE_S 15 #define FILTERMODE_V(x) ((x) << FILTERMODE_S) #define FILTERMODE_F FILTERMODE_V(1U) #define FCOEMASK_S 14 #define FCOEMASK_V(x) ((x) << FCOEMASK_S) #define FCOEMASK_F FCOEMASK_V(1U) #define TP_INGRESS_CONFIG_A 0x141 #define VNIC_S 11 #define VNIC_V(x) ((x) << VNIC_S) #define VNIC_F VNIC_V(1U) #define CSUM_HAS_PSEUDO_HDR_S 10 #define CSUM_HAS_PSEUDO_HDR_V(x) ((x) << CSUM_HAS_PSEUDO_HDR_S) #define CSUM_HAS_PSEUDO_HDR_F CSUM_HAS_PSEUDO_HDR_V(1U) #define TP_MIB_MAC_IN_ERR_0_A 0x0 #define TP_MIB_HDR_IN_ERR_0_A 0x4 #define TP_MIB_TCP_IN_ERR_0_A 0x8 #define TP_MIB_TCP_OUT_RST_A 0xc #define TP_MIB_TCP_IN_SEG_HI_A 0x10 #define TP_MIB_TCP_IN_SEG_LO_A 0x11 #define TP_MIB_TCP_OUT_SEG_HI_A 0x12 #define TP_MIB_TCP_OUT_SEG_LO_A 0x13 #define TP_MIB_TCP_RXT_SEG_HI_A 0x14 #define TP_MIB_TCP_RXT_SEG_LO_A 0x15 #define TP_MIB_TNL_CNG_DROP_0_A 0x18 #define TP_MIB_OFD_CHN_DROP_0_A 0x1c #define TP_MIB_TCP_V6IN_ERR_0_A 0x28 #define TP_MIB_TCP_V6OUT_RST_A 0x2c #define TP_MIB_OFD_ARP_DROP_A 0x36 #define TP_MIB_CPL_IN_REQ_0_A 0x38 #define TP_MIB_CPL_OUT_RSP_0_A 0x3c #define TP_MIB_TNL_DROP_0_A 0x44 #define TP_MIB_FCOE_DDP_0_A 0x48 #define TP_MIB_FCOE_DROP_0_A 0x4c #define TP_MIB_FCOE_BYTE_0_HI_A 0x50 #define TP_MIB_OFD_VLN_DROP_0_A 0x58 #define TP_MIB_USM_PKTS_A 0x5c #define TP_MIB_RQE_DFR_PKT_A 0x64 #define ULP_TX_INT_CAUSE_A 0x8dcc #define ULP_TX_TPT_LLIMIT_A 0x8dd4 #define ULP_TX_TPT_ULIMIT_A 0x8dd8 #define ULP_TX_PBL_LLIMIT_A 0x8ddc #define ULP_TX_PBL_ULIMIT_A 0x8de0 #define ULP_TX_ERR_TABLE_BASE_A 0x8e04 #define PBL_BOUND_ERR_CH3_S 31 #define PBL_BOUND_ERR_CH3_V(x) ((x) << PBL_BOUND_ERR_CH3_S) #define PBL_BOUND_ERR_CH3_F PBL_BOUND_ERR_CH3_V(1U) #define PBL_BOUND_ERR_CH2_S 30 #define PBL_BOUND_ERR_CH2_V(x) ((x) << PBL_BOUND_ERR_CH2_S) #define PBL_BOUND_ERR_CH2_F PBL_BOUND_ERR_CH2_V(1U) #define PBL_BOUND_ERR_CH1_S 29 #define PBL_BOUND_ERR_CH1_V(x) ((x) << PBL_BOUND_ERR_CH1_S) #define PBL_BOUND_ERR_CH1_F PBL_BOUND_ERR_CH1_V(1U) #define PBL_BOUND_ERR_CH0_S 28 #define PBL_BOUND_ERR_CH0_V(x) ((x) << PBL_BOUND_ERR_CH0_S) #define PBL_BOUND_ERR_CH0_F PBL_BOUND_ERR_CH0_V(1U) #define PM_RX_INT_CAUSE_A 0x8fdc #define PM_RX_STAT_CONFIG_A 0x8fc8 #define PM_RX_STAT_COUNT_A 0x8fcc #define PM_RX_STAT_LSB_A 0x8fd0 #define PM_RX_DBG_CTRL_A 0x8fd0 #define PM_RX_DBG_DATA_A 0x8fd4 #define PM_RX_DBG_STAT_MSB_A 0x10013 #define PMRX_FRAMING_ERROR_F 0x003ffff0U #define ZERO_E_CMD_ERROR_S 22 #define ZERO_E_CMD_ERROR_V(x) ((x) << ZERO_E_CMD_ERROR_S) #define ZERO_E_CMD_ERROR_F ZERO_E_CMD_ERROR_V(1U) #define OCSPI_PAR_ERROR_S 3 #define OCSPI_PAR_ERROR_V(x) ((x) << OCSPI_PAR_ERROR_S) #define OCSPI_PAR_ERROR_F OCSPI_PAR_ERROR_V(1U) #define DB_OPTIONS_PAR_ERROR_S 2 #define DB_OPTIONS_PAR_ERROR_V(x) ((x) << DB_OPTIONS_PAR_ERROR_S) #define DB_OPTIONS_PAR_ERROR_F DB_OPTIONS_PAR_ERROR_V(1U) #define IESPI_PAR_ERROR_S 1 #define IESPI_PAR_ERROR_V(x) ((x) << IESPI_PAR_ERROR_S) #define IESPI_PAR_ERROR_F IESPI_PAR_ERROR_V(1U) #define PMRX_E_PCMD_PAR_ERROR_S 0 #define PMRX_E_PCMD_PAR_ERROR_V(x) ((x) << PMRX_E_PCMD_PAR_ERROR_S) #define PMRX_E_PCMD_PAR_ERROR_F PMRX_E_PCMD_PAR_ERROR_V(1U) #define PM_TX_INT_CAUSE_A 0x8ffc #define PM_TX_STAT_CONFIG_A 0x8fe8 #define PM_TX_STAT_COUNT_A 0x8fec #define PM_TX_STAT_LSB_A 0x8ff0 #define PM_TX_DBG_CTRL_A 0x8ff0 #define PM_TX_DBG_DATA_A 0x8ff4 #define PM_TX_DBG_STAT_MSB_A 0x1001a #define PCMD_LEN_OVFL0_S 31 #define PCMD_LEN_OVFL0_V(x) ((x) << PCMD_LEN_OVFL0_S) #define PCMD_LEN_OVFL0_F PCMD_LEN_OVFL0_V(1U) #define 
PCMD_LEN_OVFL1_S 30 #define PCMD_LEN_OVFL1_V(x) ((x) << PCMD_LEN_OVFL1_S) #define PCMD_LEN_OVFL1_F PCMD_LEN_OVFL1_V(1U) #define PCMD_LEN_OVFL2_S 29 #define PCMD_LEN_OVFL2_V(x) ((x) << PCMD_LEN_OVFL2_S) #define PCMD_LEN_OVFL2_F PCMD_LEN_OVFL2_V(1U) #define ZERO_C_CMD_ERROR_S 28 #define ZERO_C_CMD_ERROR_V(x) ((x) << ZERO_C_CMD_ERROR_S) #define ZERO_C_CMD_ERROR_F ZERO_C_CMD_ERROR_V(1U) #define PMTX_FRAMING_ERROR_F 0x0ffffff0U #define OESPI_PAR_ERROR_S 3 #define OESPI_PAR_ERROR_V(x) ((x) << OESPI_PAR_ERROR_S) #define OESPI_PAR_ERROR_F OESPI_PAR_ERROR_V(1U) #define ICSPI_PAR_ERROR_S 1 #define ICSPI_PAR_ERROR_V(x) ((x) << ICSPI_PAR_ERROR_S) #define ICSPI_PAR_ERROR_F ICSPI_PAR_ERROR_V(1U) #define PMTX_C_PCMD_PAR_ERROR_S 0 #define PMTX_C_PCMD_PAR_ERROR_V(x) ((x) << PMTX_C_PCMD_PAR_ERROR_S) #define PMTX_C_PCMD_PAR_ERROR_F PMTX_C_PCMD_PAR_ERROR_V(1U) #define MPS_PORT_STAT_TX_PORT_BYTES_L 0x400 #define MPS_PORT_STAT_TX_PORT_BYTES_H 0x404 #define MPS_PORT_STAT_TX_PORT_FRAMES_L 0x408 #define MPS_PORT_STAT_TX_PORT_FRAMES_H 0x40c #define MPS_PORT_STAT_TX_PORT_BCAST_L 0x410 #define MPS_PORT_STAT_TX_PORT_BCAST_H 0x414 #define MPS_PORT_STAT_TX_PORT_MCAST_L 0x418 #define MPS_PORT_STAT_TX_PORT_MCAST_H 0x41c #define MPS_PORT_STAT_TX_PORT_UCAST_L 0x420 #define MPS_PORT_STAT_TX_PORT_UCAST_H 0x424 #define MPS_PORT_STAT_TX_PORT_ERROR_L 0x428 #define MPS_PORT_STAT_TX_PORT_ERROR_H 0x42c #define MPS_PORT_STAT_TX_PORT_64B_L 0x430 #define MPS_PORT_STAT_TX_PORT_64B_H 0x434 #define MPS_PORT_STAT_TX_PORT_65B_127B_L 0x438 #define MPS_PORT_STAT_TX_PORT_65B_127B_H 0x43c #define MPS_PORT_STAT_TX_PORT_128B_255B_L 0x440 #define MPS_PORT_STAT_TX_PORT_128B_255B_H 0x444 #define MPS_PORT_STAT_TX_PORT_256B_511B_L 0x448 #define MPS_PORT_STAT_TX_PORT_256B_511B_H 0x44c #define MPS_PORT_STAT_TX_PORT_512B_1023B_L 0x450 #define MPS_PORT_STAT_TX_PORT_512B_1023B_H 0x454 #define MPS_PORT_STAT_TX_PORT_1024B_1518B_L 0x458 #define MPS_PORT_STAT_TX_PORT_1024B_1518B_H 0x45c #define MPS_PORT_STAT_TX_PORT_1519B_MAX_L 0x460 #define MPS_PORT_STAT_TX_PORT_1519B_MAX_H 0x464 #define MPS_PORT_STAT_TX_PORT_DROP_L 0x468 #define MPS_PORT_STAT_TX_PORT_DROP_H 0x46c #define MPS_PORT_STAT_TX_PORT_PAUSE_L 0x470 #define MPS_PORT_STAT_TX_PORT_PAUSE_H 0x474 #define MPS_PORT_STAT_TX_PORT_PPP0_L 0x478 #define MPS_PORT_STAT_TX_PORT_PPP0_H 0x47c #define MPS_PORT_STAT_TX_PORT_PPP1_L 0x480 #define MPS_PORT_STAT_TX_PORT_PPP1_H 0x484 #define MPS_PORT_STAT_TX_PORT_PPP2_L 0x488 #define MPS_PORT_STAT_TX_PORT_PPP2_H 0x48c #define MPS_PORT_STAT_TX_PORT_PPP3_L 0x490 #define MPS_PORT_STAT_TX_PORT_PPP3_H 0x494 #define MPS_PORT_STAT_TX_PORT_PPP4_L 0x498 #define MPS_PORT_STAT_TX_PORT_PPP4_H 0x49c #define MPS_PORT_STAT_TX_PORT_PPP5_L 0x4a0 #define MPS_PORT_STAT_TX_PORT_PPP5_H 0x4a4 #define MPS_PORT_STAT_TX_PORT_PPP6_L 0x4a8 #define MPS_PORT_STAT_TX_PORT_PPP6_H 0x4ac #define MPS_PORT_STAT_TX_PORT_PPP7_L 0x4b0 #define MPS_PORT_STAT_TX_PORT_PPP7_H 0x4b4 #define MPS_PORT_STAT_LB_PORT_BYTES_L 0x4c0 #define MPS_PORT_STAT_LB_PORT_BYTES_H 0x4c4 #define MPS_PORT_STAT_LB_PORT_FRAMES_L 0x4c8 #define MPS_PORT_STAT_LB_PORT_FRAMES_H 0x4cc #define MPS_PORT_STAT_LB_PORT_BCAST_L 0x4d0 #define MPS_PORT_STAT_LB_PORT_BCAST_H 0x4d4 #define MPS_PORT_STAT_LB_PORT_MCAST_L 0x4d8 #define MPS_PORT_STAT_LB_PORT_MCAST_H 0x4dc #define MPS_PORT_STAT_LB_PORT_UCAST_L 0x4e0 #define MPS_PORT_STAT_LB_PORT_UCAST_H 0x4e4 #define MPS_PORT_STAT_LB_PORT_ERROR_L 0x4e8 #define MPS_PORT_STAT_LB_PORT_ERROR_H 0x4ec #define MPS_PORT_STAT_LB_PORT_64B_L 0x4f0 #define MPS_PORT_STAT_LB_PORT_64B_H 0x4f4 #define 
MPS_PORT_STAT_LB_PORT_65B_127B_L 0x4f8 #define MPS_PORT_STAT_LB_PORT_65B_127B_H 0x4fc #define MPS_PORT_STAT_LB_PORT_128B_255B_L 0x500 #define MPS_PORT_STAT_LB_PORT_128B_255B_H 0x504 #define MPS_PORT_STAT_LB_PORT_256B_511B_L 0x508 #define MPS_PORT_STAT_LB_PORT_256B_511B_H 0x50c #define MPS_PORT_STAT_LB_PORT_512B_1023B_L 0x510 #define MPS_PORT_STAT_LB_PORT_512B_1023B_H 0x514 #define MPS_PORT_STAT_LB_PORT_1024B_1518B_L 0x518 #define MPS_PORT_STAT_LB_PORT_1024B_1518B_H 0x51c #define MPS_PORT_STAT_LB_PORT_1519B_MAX_L 0x520 #define MPS_PORT_STAT_LB_PORT_1519B_MAX_H 0x524 #define MPS_PORT_STAT_LB_PORT_DROP_FRAMES 0x528 #define MPS_PORT_STAT_LB_PORT_DROP_FRAMES_L 0x528 #define MPS_PORT_STAT_RX_PORT_BYTES_L 0x540 #define MPS_PORT_STAT_RX_PORT_BYTES_H 0x544 #define MPS_PORT_STAT_RX_PORT_FRAMES_L 0x548 #define MPS_PORT_STAT_RX_PORT_FRAMES_H 0x54c #define MPS_PORT_STAT_RX_PORT_BCAST_L 0x550 #define MPS_PORT_STAT_RX_PORT_BCAST_H 0x554 #define MPS_PORT_STAT_RX_PORT_MCAST_L 0x558 #define MPS_PORT_STAT_RX_PORT_MCAST_H 0x55c #define MPS_PORT_STAT_RX_PORT_UCAST_L 0x560 #define MPS_PORT_STAT_RX_PORT_UCAST_H 0x564 #define MPS_PORT_STAT_RX_PORT_MTU_ERROR_L 0x568 #define MPS_PORT_STAT_RX_PORT_MTU_ERROR_H 0x56c #define MPS_PORT_STAT_RX_PORT_MTU_CRC_ERROR_L 0x570 #define MPS_PORT_STAT_RX_PORT_MTU_CRC_ERROR_H 0x574 #define MPS_PORT_STAT_RX_PORT_CRC_ERROR_L 0x578 #define MPS_PORT_STAT_RX_PORT_CRC_ERROR_H 0x57c #define MPS_PORT_STAT_RX_PORT_LEN_ERROR_L 0x580 #define MPS_PORT_STAT_RX_PORT_LEN_ERROR_H 0x584 #define MPS_PORT_STAT_RX_PORT_SYM_ERROR_L 0x588 #define MPS_PORT_STAT_RX_PORT_SYM_ERROR_H 0x58c #define MPS_PORT_STAT_RX_PORT_64B_L 0x590 #define MPS_PORT_STAT_RX_PORT_64B_H 0x594 #define MPS_PORT_STAT_RX_PORT_65B_127B_L 0x598 #define MPS_PORT_STAT_RX_PORT_65B_127B_H 0x59c #define MPS_PORT_STAT_RX_PORT_128B_255B_L 0x5a0 #define MPS_PORT_STAT_RX_PORT_128B_255B_H 0x5a4 #define MPS_PORT_STAT_RX_PORT_256B_511B_L 0x5a8 #define MPS_PORT_STAT_RX_PORT_256B_511B_H 0x5ac #define MPS_PORT_STAT_RX_PORT_512B_1023B_L 0x5b0 #define MPS_PORT_STAT_RX_PORT_512B_1023B_H 0x5b4 #define MPS_PORT_STAT_RX_PORT_1024B_1518B_L 0x5b8 #define MPS_PORT_STAT_RX_PORT_1024B_1518B_H 0x5bc #define MPS_PORT_STAT_RX_PORT_1519B_MAX_L 0x5c0 #define MPS_PORT_STAT_RX_PORT_1519B_MAX_H 0x5c4 #define MPS_PORT_STAT_RX_PORT_PAUSE_L 0x5c8 #define MPS_PORT_STAT_RX_PORT_PAUSE_H 0x5cc #define MPS_PORT_STAT_RX_PORT_PPP0_L 0x5d0 #define MPS_PORT_STAT_RX_PORT_PPP0_H 0x5d4 #define MPS_PORT_STAT_RX_PORT_PPP1_L 0x5d8 #define MPS_PORT_STAT_RX_PORT_PPP1_H 0x5dc #define MPS_PORT_STAT_RX_PORT_PPP2_L 0x5e0 #define MPS_PORT_STAT_RX_PORT_PPP2_H 0x5e4 #define MPS_PORT_STAT_RX_PORT_PPP3_L 0x5e8 #define MPS_PORT_STAT_RX_PORT_PPP3_H 0x5ec #define MPS_PORT_STAT_RX_PORT_PPP4_L 0x5f0 #define MPS_PORT_STAT_RX_PORT_PPP4_H 0x5f4 #define MPS_PORT_STAT_RX_PORT_PPP5_L 0x5f8 #define MPS_PORT_STAT_RX_PORT_PPP5_H 0x5fc #define MPS_PORT_STAT_RX_PORT_PPP6_L 0x600 #define MPS_PORT_STAT_RX_PORT_PPP6_H 0x604 #define MPS_PORT_STAT_RX_PORT_PPP7_L 0x608 #define MPS_PORT_STAT_RX_PORT_PPP7_H 0x60c #define MPS_PORT_STAT_RX_PORT_LESS_64B_L 0x610 #define MPS_PORT_STAT_RX_PORT_LESS_64B_H 0x614 #define MAC_PORT_MAGIC_MACID_LO 0x824 #define MAC_PORT_MAGIC_MACID_HI 0x828 #define MAC_PORT_EPIO_DATA0_A 0x8c0 #define MAC_PORT_EPIO_DATA1_A 0x8c4 #define MAC_PORT_EPIO_DATA2_A 0x8c8 #define MAC_PORT_EPIO_DATA3_A 0x8cc #define MAC_PORT_EPIO_OP_A 0x8d0 #define MAC_PORT_CFG2_A 0x818 #define MPS_CMN_CTL_A 0x9000 #define NUMPORTS_S 0 #define NUMPORTS_M 0x3U #define NUMPORTS_G(x) (((x) >> NUMPORTS_S) & NUMPORTS_M) 
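/*
 * Illustrative usage sketch (not part of the original header): the
 * register field macros in this file follow one naming convention --
 * FOO_S is the field's bit shift, FOO_M its unshifted mask, FOO_V(x)
 * places a value into the field, FOO_G(x) extracts it, and FOO_F is
 * the single-bit flag form.  Assuming a hypothetical 32-bit register
 * accessor read_reg(), reading the port count out of MPS_CMN_CTL_A
 * would look like:
 *
 *	u32 v = read_reg(MPS_CMN_CTL_A);
 *	unsigned int nports = NUMPORTS_G(v);	-- (v >> 0) & 0x3
 */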
#define MPS_INT_CAUSE_A 0x9008 #define MPS_TX_INT_CAUSE_A 0x9408 #define FRMERR_S 15 #define FRMERR_V(x) ((x) << FRMERR_S) #define FRMERR_F FRMERR_V(1U) #define SECNTERR_S 14 #define SECNTERR_V(x) ((x) << SECNTERR_S) #define SECNTERR_F SECNTERR_V(1U) #define BUBBLE_S 13 #define BUBBLE_V(x) ((x) << BUBBLE_S) #define BUBBLE_F BUBBLE_V(1U) #define TXDESCFIFO_S 9 #define TXDESCFIFO_M 0xfU #define TXDESCFIFO_V(x) ((x) << TXDESCFIFO_S) #define TXDATAFIFO_S 5 #define TXDATAFIFO_M 0xfU #define TXDATAFIFO_V(x) ((x) << TXDATAFIFO_S) #define NCSIFIFO_S 4 #define NCSIFIFO_V(x) ((x) << NCSIFIFO_S) #define NCSIFIFO_F NCSIFIFO_V(1U) #define TPFIFO_S 0 #define TPFIFO_M 0xfU #define TPFIFO_V(x) ((x) << TPFIFO_S) #define MPS_STAT_PERR_INT_CAUSE_SRAM_A 0x9614 #define MPS_STAT_PERR_INT_CAUSE_TX_FIFO_A 0x9620 #define MPS_STAT_PERR_INT_CAUSE_RX_FIFO_A 0x962c #define MPS_STAT_RX_BG_0_MAC_DROP_FRAME_L 0x9640 #define MPS_STAT_RX_BG_0_MAC_DROP_FRAME_H 0x9644 #define MPS_STAT_RX_BG_1_MAC_DROP_FRAME_L 0x9648 #define MPS_STAT_RX_BG_1_MAC_DROP_FRAME_H 0x964c #define MPS_STAT_RX_BG_2_MAC_DROP_FRAME_L 0x9650 #define MPS_STAT_RX_BG_2_MAC_DROP_FRAME_H 0x9654 #define MPS_STAT_RX_BG_3_MAC_DROP_FRAME_L 0x9658 #define MPS_STAT_RX_BG_3_MAC_DROP_FRAME_H 0x965c #define MPS_STAT_RX_BG_0_LB_DROP_FRAME_L 0x9660 #define MPS_STAT_RX_BG_0_LB_DROP_FRAME_H 0x9664 #define MPS_STAT_RX_BG_1_LB_DROP_FRAME_L 0x9668 #define MPS_STAT_RX_BG_1_LB_DROP_FRAME_H 0x966c #define MPS_STAT_RX_BG_2_LB_DROP_FRAME_L 0x9670 #define MPS_STAT_RX_BG_2_LB_DROP_FRAME_H 0x9674 #define MPS_STAT_RX_BG_3_LB_DROP_FRAME_L 0x9678 #define MPS_STAT_RX_BG_3_LB_DROP_FRAME_H 0x967c #define MPS_STAT_RX_BG_0_MAC_TRUNC_FRAME_L 0x9680 #define MPS_STAT_RX_BG_0_MAC_TRUNC_FRAME_H 0x9684 #define MPS_STAT_RX_BG_1_MAC_TRUNC_FRAME_L 0x9688 #define MPS_STAT_RX_BG_1_MAC_TRUNC_FRAME_H 0x968c #define MPS_STAT_RX_BG_2_MAC_TRUNC_FRAME_L 0x9690 #define MPS_STAT_RX_BG_2_MAC_TRUNC_FRAME_H 0x9694 #define MPS_STAT_RX_BG_3_MAC_TRUNC_FRAME_L 0x9698 #define MPS_STAT_RX_BG_3_MAC_TRUNC_FRAME_H 0x969c #define MPS_STAT_RX_BG_0_LB_TRUNC_FRAME_L 0x96a0 #define MPS_STAT_RX_BG_0_LB_TRUNC_FRAME_H 0x96a4 #define MPS_STAT_RX_BG_1_LB_TRUNC_FRAME_L 0x96a8 #define MPS_STAT_RX_BG_1_LB_TRUNC_FRAME_H 0x96ac #define MPS_STAT_RX_BG_2_LB_TRUNC_FRAME_L 0x96b0 #define MPS_STAT_RX_BG_2_LB_TRUNC_FRAME_H 0x96b4 #define MPS_STAT_RX_BG_3_LB_TRUNC_FRAME_L 0x96b8 #define MPS_STAT_RX_BG_3_LB_TRUNC_FRAME_H 0x96bc #define MPS_TRC_CFG_A 0x9800 #define TRCFIFOEMPTY_S 4 #define TRCFIFOEMPTY_V(x) ((x) << TRCFIFOEMPTY_S) #define TRCFIFOEMPTY_F TRCFIFOEMPTY_V(1U) #define TRCIGNOREDROPINPUT_S 3 #define TRCIGNOREDROPINPUT_V(x) ((x) << TRCIGNOREDROPINPUT_S) #define TRCIGNOREDROPINPUT_F TRCIGNOREDROPINPUT_V(1U) #define TRCKEEPDUPLICATES_S 2 #define TRCKEEPDUPLICATES_V(x) ((x) << TRCKEEPDUPLICATES_S) #define TRCKEEPDUPLICATES_F TRCKEEPDUPLICATES_V(1U) #define TRCEN_S 1 #define TRCEN_V(x) ((x) << TRCEN_S) #define TRCEN_F TRCEN_V(1U) #define TRCMULTIFILTER_S 0 #define TRCMULTIFILTER_V(x) ((x) << TRCMULTIFILTER_S) #define TRCMULTIFILTER_F TRCMULTIFILTER_V(1U) #define MPS_TRC_RSS_CONTROL_A 0x9808 #define MPS_TRC_FILTER1_RSS_CONTROL_A 0x9ff4 #define MPS_TRC_FILTER2_RSS_CONTROL_A 0x9ffc #define MPS_TRC_FILTER3_RSS_CONTROL_A 0xa004 #define MPS_T5_TRC_RSS_CONTROL_A 0xa00c #define RSSCONTROL_S 16 #define RSSCONTROL_V(x) ((x) << RSSCONTROL_S) #define QUEUENUMBER_S 0 #define QUEUENUMBER_V(x) ((x) << QUEUENUMBER_S) #define TFINVERTMATCH_S 24 #define TFINVERTMATCH_V(x) ((x) << TFINVERTMATCH_S) #define TFINVERTMATCH_F TFINVERTMATCH_V(1U) #define TFEN_S 
22 #define TFEN_V(x) ((x) << TFEN_S) #define TFEN_F TFEN_V(1U) #define TFPORT_S 18 #define TFPORT_M 0xfU #define TFPORT_V(x) ((x) << TFPORT_S) #define TFPORT_G(x) (((x) >> TFPORT_S) & TFPORT_M) #define TFLENGTH_S 8 #define TFLENGTH_M 0x1fU #define TFLENGTH_V(x) ((x) << TFLENGTH_S) #define TFLENGTH_G(x) (((x) >> TFLENGTH_S) & TFLENGTH_M) #define TFOFFSET_S 0 #define TFOFFSET_M 0x1fU #define TFOFFSET_V(x) ((x) << TFOFFSET_S) #define TFOFFSET_G(x) (((x) >> TFOFFSET_S) & TFOFFSET_M) #define T5_TFINVERTMATCH_S 25 #define T5_TFINVERTMATCH_V(x) ((x) << T5_TFINVERTMATCH_S) #define T5_TFINVERTMATCH_F T5_TFINVERTMATCH_V(1U) #define T5_TFEN_S 23 #define T5_TFEN_V(x) ((x) << T5_TFEN_S) #define T5_TFEN_F T5_TFEN_V(1U) #define T5_TFPORT_S 18 #define T5_TFPORT_M 0x1fU #define T5_TFPORT_V(x) ((x) << T5_TFPORT_S) #define T5_TFPORT_G(x) (((x) >> T5_TFPORT_S) & T5_TFPORT_M) #define MPS_TRC_FILTER_MATCH_CTL_A_A 0x9810 #define MPS_TRC_FILTER_MATCH_CTL_B_A 0x9820 #define TFMINPKTSIZE_S 16 #define TFMINPKTSIZE_M 0x1ffU #define TFMINPKTSIZE_V(x) ((x) << TFMINPKTSIZE_S) #define TFMINPKTSIZE_G(x) (((x) >> TFMINPKTSIZE_S) & TFMINPKTSIZE_M) #define TFCAPTUREMAX_S 0 #define TFCAPTUREMAX_M 0x3fffU #define TFCAPTUREMAX_V(x) ((x) << TFCAPTUREMAX_S) #define TFCAPTUREMAX_G(x) (((x) >> TFCAPTUREMAX_S) & TFCAPTUREMAX_M) #define MPS_TRC_FILTER0_MATCH_A 0x9c00 #define MPS_TRC_FILTER0_DONT_CARE_A 0x9c80 #define MPS_TRC_FILTER1_MATCH_A 0x9d00 #define TP_RSS_CONFIG_A 0x7df0 #define TNL4TUPENIPV6_S 31 #define TNL4TUPENIPV6_V(x) ((x) << TNL4TUPENIPV6_S) #define TNL4TUPENIPV6_F TNL4TUPENIPV6_V(1U) #define TNL2TUPENIPV6_S 30 #define TNL2TUPENIPV6_V(x) ((x) << TNL2TUPENIPV6_S) #define TNL2TUPENIPV6_F TNL2TUPENIPV6_V(1U) #define TNL4TUPENIPV4_S 29 #define TNL4TUPENIPV4_V(x) ((x) << TNL4TUPENIPV4_S) #define TNL4TUPENIPV4_F TNL4TUPENIPV4_V(1U) #define TNL2TUPENIPV4_S 28 #define TNL2TUPENIPV4_V(x) ((x) << TNL2TUPENIPV4_S) #define TNL2TUPENIPV4_F TNL2TUPENIPV4_V(1U) #define TNLTCPSEL_S 27 #define TNLTCPSEL_V(x) ((x) << TNLTCPSEL_S) #define TNLTCPSEL_F TNLTCPSEL_V(1U) #define TNLIP6SEL_S 26 #define TNLIP6SEL_V(x) ((x) << TNLIP6SEL_S) #define TNLIP6SEL_F TNLIP6SEL_V(1U) #define TNLVRTSEL_S 25 #define TNLVRTSEL_V(x) ((x) << TNLVRTSEL_S) #define TNLVRTSEL_F TNLVRTSEL_V(1U) #define TNLMAPEN_S 24 #define TNLMAPEN_V(x) ((x) << TNLMAPEN_S) #define TNLMAPEN_F TNLMAPEN_V(1U) #define OFDHASHSAVE_S 19 #define OFDHASHSAVE_V(x) ((x) << OFDHASHSAVE_S) #define OFDHASHSAVE_F OFDHASHSAVE_V(1U) #define OFDVRTSEL_S 18 #define OFDVRTSEL_V(x) ((x) << OFDVRTSEL_S) #define OFDVRTSEL_F OFDVRTSEL_V(1U) #define OFDMAPEN_S 17 #define OFDMAPEN_V(x) ((x) << OFDMAPEN_S) #define OFDMAPEN_F OFDMAPEN_V(1U) #define OFDLKPEN_S 16 #define OFDLKPEN_V(x) ((x) << OFDLKPEN_S) #define OFDLKPEN_F OFDLKPEN_V(1U) #define SYN4TUPENIPV6_S 15 #define SYN4TUPENIPV6_V(x) ((x) << SYN4TUPENIPV6_S) #define SYN4TUPENIPV6_F SYN4TUPENIPV6_V(1U) #define SYN2TUPENIPV6_S 14 #define SYN2TUPENIPV6_V(x) ((x) << SYN2TUPENIPV6_S) #define SYN2TUPENIPV6_F SYN2TUPENIPV6_V(1U) #define SYN4TUPENIPV4_S 13 #define SYN4TUPENIPV4_V(x) ((x) << SYN4TUPENIPV4_S) #define SYN4TUPENIPV4_F SYN4TUPENIPV4_V(1U) #define SYN2TUPENIPV4_S 12 #define SYN2TUPENIPV4_V(x) ((x) << SYN2TUPENIPV4_S) #define SYN2TUPENIPV4_F SYN2TUPENIPV4_V(1U) #define SYNIP6SEL_S 11 #define SYNIP6SEL_V(x) ((x) << SYNIP6SEL_S) #define SYNIP6SEL_F SYNIP6SEL_V(1U) #define SYNVRTSEL_S 10 #define SYNVRTSEL_V(x) ((x) << SYNVRTSEL_S) #define SYNVRTSEL_F SYNVRTSEL_V(1U) #define SYNMAPEN_S 9 #define SYNMAPEN_V(x) ((x) << SYNMAPEN_S) #define SYNMAPEN_F 
SYNMAPEN_V(1U) #define SYNLKPEN_S 8 #define SYNLKPEN_V(x) ((x) << SYNLKPEN_S) #define SYNLKPEN_F SYNLKPEN_V(1U) #define CHANNELENABLE_S 7 #define CHANNELENABLE_V(x) ((x) << CHANNELENABLE_S) #define CHANNELENABLE_F CHANNELENABLE_V(1U) #define PORTENABLE_S 6 #define PORTENABLE_V(x) ((x) << PORTENABLE_S) #define PORTENABLE_F PORTENABLE_V(1U) #define TNLALLLOOKUP_S 5 #define TNLALLLOOKUP_V(x) ((x) << TNLALLLOOKUP_S) #define TNLALLLOOKUP_F TNLALLLOOKUP_V(1U) #define VIRTENABLE_S 4 #define VIRTENABLE_V(x) ((x) << VIRTENABLE_S) #define VIRTENABLE_F VIRTENABLE_V(1U) #define CONGESTIONENABLE_S 3 #define CONGESTIONENABLE_V(x) ((x) << CONGESTIONENABLE_S) #define CONGESTIONENABLE_F CONGESTIONENABLE_V(1U) #define HASHTOEPLITZ_S 2 #define HASHTOEPLITZ_V(x) ((x) << HASHTOEPLITZ_S) #define HASHTOEPLITZ_F HASHTOEPLITZ_V(1U) #define UDPENABLE_S 1 #define UDPENABLE_V(x) ((x) << UDPENABLE_S) #define UDPENABLE_F UDPENABLE_V(1U) #define DISABLE_S 0 #define DISABLE_V(x) ((x) << DISABLE_S) #define DISABLE_F DISABLE_V(1U) #define TP_RSS_CONFIG_TNL_A 0x7df4 #define MASKSIZE_S 28 #define MASKSIZE_M 0xfU #define MASKSIZE_V(x) ((x) << MASKSIZE_S) #define MASKSIZE_G(x) (((x) >> MASKSIZE_S) & MASKSIZE_M) #define MASKFILTER_S 16 #define MASKFILTER_M 0x7ffU #define MASKFILTER_V(x) ((x) << MASKFILTER_S) #define MASKFILTER_G(x) (((x) >> MASKFILTER_S) & MASKFILTER_M) #define USEWIRECH_S 0 #define USEWIRECH_V(x) ((x) << USEWIRECH_S) #define USEWIRECH_F USEWIRECH_V(1U) #define HASHALL_S 2 #define HASHALL_V(x) ((x) << HASHALL_S) #define HASHALL_F HASHALL_V(1U) #define HASHETH_S 1 #define HASHETH_V(x) ((x) << HASHETH_S) #define HASHETH_F HASHETH_V(1U) #define TP_RSS_CONFIG_OFD_A 0x7df8 #define RRCPLMAPEN_S 20 #define RRCPLMAPEN_V(x) ((x) << RRCPLMAPEN_S) #define RRCPLMAPEN_F RRCPLMAPEN_V(1U) #define RRCPLQUEWIDTH_S 16 #define RRCPLQUEWIDTH_M 0xfU #define RRCPLQUEWIDTH_V(x) ((x) << RRCPLQUEWIDTH_S) #define RRCPLQUEWIDTH_G(x) (((x) >> RRCPLQUEWIDTH_S) & RRCPLQUEWIDTH_M) #define TP_RSS_CONFIG_SYN_A 0x7dfc #define TP_RSS_CONFIG_VRT_A 0x7e00 #define VFRDRG_S 25 #define VFRDRG_V(x) ((x) << VFRDRG_S) #define VFRDRG_F VFRDRG_V(1U) #define VFRDEN_S 24 #define VFRDEN_V(x) ((x) << VFRDEN_S) #define VFRDEN_F VFRDEN_V(1U) #define VFPERREN_S 23 #define VFPERREN_V(x) ((x) << VFPERREN_S) #define VFPERREN_F VFPERREN_V(1U) #define KEYPERREN_S 22 #define KEYPERREN_V(x) ((x) << KEYPERREN_S) #define KEYPERREN_F KEYPERREN_V(1U) #define DISABLEVLAN_S 21 #define DISABLEVLAN_V(x) ((x) << DISABLEVLAN_S) #define DISABLEVLAN_F DISABLEVLAN_V(1U) #define ENABLEUP0_S 20 #define ENABLEUP0_V(x) ((x) << ENABLEUP0_S) #define ENABLEUP0_F ENABLEUP0_V(1U) #define HASHDELAY_S 16 #define HASHDELAY_M 0xfU #define HASHDELAY_V(x) ((x) << HASHDELAY_S) #define HASHDELAY_G(x) (((x) >> HASHDELAY_S) & HASHDELAY_M) #define VFWRADDR_S 8 #define VFWRADDR_M 0x7fU #define VFWRADDR_V(x) ((x) << VFWRADDR_S) #define VFWRADDR_G(x) (((x) >> VFWRADDR_S) & VFWRADDR_M) #define KEYMODE_S 6 #define KEYMODE_M 0x3U #define KEYMODE_V(x) ((x) << KEYMODE_S) #define KEYMODE_G(x) (((x) >> KEYMODE_S) & KEYMODE_M) #define VFWREN_S 5 #define VFWREN_V(x) ((x) << VFWREN_S) #define VFWREN_F VFWREN_V(1U) #define KEYWREN_S 4 #define KEYWREN_V(x) ((x) << KEYWREN_S) #define KEYWREN_F KEYWREN_V(1U) #define KEYWRADDR_S 0 #define KEYWRADDR_M 0xfU #define KEYWRADDR_V(x) ((x) << KEYWRADDR_S) #define KEYWRADDR_G(x) (((x) >> KEYWRADDR_S) & KEYWRADDR_M) #define KEYWRADDRX_S 30 #define KEYWRADDRX_M 0x3U #define KEYWRADDRX_V(x) ((x) << KEYWRADDRX_S) #define KEYWRADDRX_G(x) (((x) >> KEYWRADDRX_S) & KEYWRADDRX_M) 
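/*
 * Illustrative sketch (not part of the original header): _V() values
 * and _F flags are meant to be OR-ed together when composing a
 * register write.  For example, with a hypothetical write_reg()
 * accessor, kicking off a VF RSS configuration write through
 * TP_RSS_CONFIG_VRT_A could look like:
 *
 *	u32 v = VFWREN_F | KEYWREN_F |
 *		VFWRADDR_V(vf_idx) | KEYWRADDR_V(key_idx);
 *	write_reg(TP_RSS_CONFIG_VRT_A, v);
 *
 * where vf_idx and key_idx are caller-chosen table indices.
 */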
#define KEYEXTEND_S 26 #define KEYEXTEND_V(x) ((x) << KEYEXTEND_S) #define KEYEXTEND_F KEYEXTEND_V(1U) #define LKPIDXSIZE_S 24 #define LKPIDXSIZE_M 0x3U #define LKPIDXSIZE_V(x) ((x) << LKPIDXSIZE_S) #define LKPIDXSIZE_G(x) (((x) >> LKPIDXSIZE_S) & LKPIDXSIZE_M) #define TP_RSS_VFL_CONFIG_A 0x3a #define TP_RSS_VFH_CONFIG_A 0x3b #define ENABLEUDPHASH_S 31 #define ENABLEUDPHASH_V(x) ((x) << ENABLEUDPHASH_S) #define ENABLEUDPHASH_F ENABLEUDPHASH_V(1U) #define VFUPEN_S 30 #define VFUPEN_V(x) ((x) << VFUPEN_S) #define VFUPEN_F VFUPEN_V(1U) #define VFVLNEX_S 28 #define VFVLNEX_V(x) ((x) << VFVLNEX_S) #define VFVLNEX_F VFVLNEX_V(1U) #define VFPRTEN_S 27 #define VFPRTEN_V(x) ((x) << VFPRTEN_S) #define VFPRTEN_F VFPRTEN_V(1U) #define VFCHNEN_S 26 #define VFCHNEN_V(x) ((x) << VFCHNEN_S) #define VFCHNEN_F VFCHNEN_V(1U) #define DEFAULTQUEUE_S 16 #define DEFAULTQUEUE_M 0x3ffU #define DEFAULTQUEUE_G(x) (((x) >> DEFAULTQUEUE_S) & DEFAULTQUEUE_M) #define VFIP6TWOTUPEN_S 6 #define VFIP6TWOTUPEN_V(x) ((x) << VFIP6TWOTUPEN_S) #define VFIP6TWOTUPEN_F VFIP6TWOTUPEN_V(1U) #define VFIP4FOURTUPEN_S 5 #define VFIP4FOURTUPEN_V(x) ((x) << VFIP4FOURTUPEN_S) #define VFIP4FOURTUPEN_F VFIP4FOURTUPEN_V(1U) #define VFIP4TWOTUPEN_S 4 #define VFIP4TWOTUPEN_V(x) ((x) << VFIP4TWOTUPEN_S) #define VFIP4TWOTUPEN_F VFIP4TWOTUPEN_V(1U) #define KEYINDEX_S 0 #define KEYINDEX_M 0xfU #define KEYINDEX_G(x) (((x) >> KEYINDEX_S) & KEYINDEX_M) #define MAPENABLE_S 31 #define MAPENABLE_V(x) ((x) << MAPENABLE_S) #define MAPENABLE_F MAPENABLE_V(1U) #define CHNENABLE_S 30 #define CHNENABLE_V(x) ((x) << CHNENABLE_S) #define CHNENABLE_F CHNENABLE_V(1U) #define PRTENABLE_S 29 #define PRTENABLE_V(x) ((x) << PRTENABLE_S) #define PRTENABLE_F PRTENABLE_V(1U) #define UDPFOURTUPEN_S 28 #define UDPFOURTUPEN_V(x) ((x) << UDPFOURTUPEN_S) #define UDPFOURTUPEN_F UDPFOURTUPEN_V(1U) #define IP6FOURTUPEN_S 27 #define IP6FOURTUPEN_V(x) ((x) << IP6FOURTUPEN_S) #define IP6FOURTUPEN_F IP6FOURTUPEN_V(1U) #define IP6TWOTUPEN_S 26 #define IP6TWOTUPEN_V(x) ((x) << IP6TWOTUPEN_S) #define IP6TWOTUPEN_F IP6TWOTUPEN_V(1U) #define IP4FOURTUPEN_S 25 #define IP4FOURTUPEN_V(x) ((x) << IP4FOURTUPEN_S) #define IP4FOURTUPEN_F IP4FOURTUPEN_V(1U) #define IP4TWOTUPEN_S 24 #define IP4TWOTUPEN_V(x) ((x) << IP4TWOTUPEN_S) #define IP4TWOTUPEN_F IP4TWOTUPEN_V(1U) #define IVFWIDTH_S 20 #define IVFWIDTH_M 0xfU #define IVFWIDTH_V(x) ((x) << IVFWIDTH_S) #define IVFWIDTH_G(x) (((x) >> IVFWIDTH_S) & IVFWIDTH_M) #define CH1DEFAULTQUEUE_S 10 #define CH1DEFAULTQUEUE_M 0x3ffU #define CH1DEFAULTQUEUE_V(x) ((x) << CH1DEFAULTQUEUE_S) #define CH1DEFAULTQUEUE_G(x) (((x) >> CH1DEFAULTQUEUE_S) & CH1DEFAULTQUEUE_M) #define CH0DEFAULTQUEUE_S 0 #define CH0DEFAULTQUEUE_M 0x3ffU #define CH0DEFAULTQUEUE_V(x) ((x) << CH0DEFAULTQUEUE_S) #define CH0DEFAULTQUEUE_G(x) (((x) >> CH0DEFAULTQUEUE_S) & CH0DEFAULTQUEUE_M) #define VFLKPIDX_S 8 #define VFLKPIDX_M 0xffU #define VFLKPIDX_G(x) (((x) >> VFLKPIDX_S) & VFLKPIDX_M) #define T6_VFWRADDR_S 8 #define T6_VFWRADDR_M 0xffU #define T6_VFWRADDR_V(x) ((x) << T6_VFWRADDR_S) #define T6_VFWRADDR_G(x) (((x) >> T6_VFWRADDR_S) & T6_VFWRADDR_M) #define TP_RSS_CONFIG_CNG_A 0x7e04 #define TP_RSS_SECRET_KEY0_A 0x40 #define TP_RSS_PF0_CONFIG_A 0x30 #define TP_RSS_PF_MAP_A 0x38 #define TP_RSS_PF_MSK_A 0x39 #define PF1LKPIDX_S 3 #define PF0LKPIDX_M 0x7U #define PF1MSKSIZE_S 4 #define PF1MSKSIZE_M 0xfU #define CHNCOUNT3_S 31 #define CHNCOUNT3_V(x) ((x) << CHNCOUNT3_S) #define CHNCOUNT3_F CHNCOUNT3_V(1U) #define CHNCOUNT2_S 30 #define CHNCOUNT2_V(x) ((x) << CHNCOUNT2_S) #define 
CHNCOUNT2_F CHNCOUNT2_V(1U) #define CHNCOUNT1_S 29 #define CHNCOUNT1_V(x) ((x) << CHNCOUNT1_S) #define CHNCOUNT1_F CHNCOUNT1_V(1U) #define CHNCOUNT0_S 28 #define CHNCOUNT0_V(x) ((x) << CHNCOUNT0_S) #define CHNCOUNT0_F CHNCOUNT0_V(1U) #define CHNUNDFLOW3_S 27 #define CHNUNDFLOW3_V(x) ((x) << CHNUNDFLOW3_S) #define CHNUNDFLOW3_F CHNUNDFLOW3_V(1U) #define CHNUNDFLOW2_S 26 #define CHNUNDFLOW2_V(x) ((x) << CHNUNDFLOW2_S) #define CHNUNDFLOW2_F CHNUNDFLOW2_V(1U) #define CHNUNDFLOW1_S 25 #define CHNUNDFLOW1_V(x) ((x) << CHNUNDFLOW1_S) #define CHNUNDFLOW1_F CHNUNDFLOW1_V(1U) #define CHNUNDFLOW0_S 24 #define CHNUNDFLOW0_V(x) ((x) << CHNUNDFLOW0_S) #define CHNUNDFLOW0_F CHNUNDFLOW0_V(1U) #define RSTCHN3_S 19 #define RSTCHN3_V(x) ((x) << RSTCHN3_S) #define RSTCHN3_F RSTCHN3_V(1U) #define RSTCHN2_S 18 #define RSTCHN2_V(x) ((x) << RSTCHN2_S) #define RSTCHN2_F RSTCHN2_V(1U) #define RSTCHN1_S 17 #define RSTCHN1_V(x) ((x) << RSTCHN1_S) #define RSTCHN1_F RSTCHN1_V(1U) #define RSTCHN0_S 16 #define RSTCHN0_V(x) ((x) << RSTCHN0_S) #define RSTCHN0_F RSTCHN0_V(1U) #define UPDVLD_S 15 #define UPDVLD_V(x) ((x) << UPDVLD_S) #define UPDVLD_F UPDVLD_V(1U) #define XOFF_S 14 #define XOFF_V(x) ((x) << XOFF_S) #define XOFF_F XOFF_V(1U) #define UPDCHN3_S 13 #define UPDCHN3_V(x) ((x) << UPDCHN3_S) #define UPDCHN3_F UPDCHN3_V(1U) #define UPDCHN2_S 12 #define UPDCHN2_V(x) ((x) << UPDCHN2_S) #define UPDCHN2_F UPDCHN2_V(1U) #define UPDCHN1_S 11 #define UPDCHN1_V(x) ((x) << UPDCHN1_S) #define UPDCHN1_F UPDCHN1_V(1U) #define UPDCHN0_S 10 #define UPDCHN0_V(x) ((x) << UPDCHN0_S) #define UPDCHN0_F UPDCHN0_V(1U) #define QUEUE_S 0 #define QUEUE_M 0x3ffU #define QUEUE_V(x) ((x) << QUEUE_S) #define QUEUE_G(x) (((x) >> QUEUE_S) & QUEUE_M) #define MPS_TRC_INT_CAUSE_A 0x985c #define MISCPERR_S 8 #define MISCPERR_V(x) ((x) << MISCPERR_S) #define MISCPERR_F MISCPERR_V(1U) #define PKTFIFO_S 4 #define PKTFIFO_M 0xfU #define PKTFIFO_V(x) ((x) << PKTFIFO_S) #define FILTMEM_S 0 #define FILTMEM_M 0xfU #define FILTMEM_V(x) ((x) << FILTMEM_S) #define MPS_CLS_INT_CAUSE_A 0xd028 #define HASHSRAM_S 2 #define HASHSRAM_V(x) ((x) << HASHSRAM_S) #define HASHSRAM_F HASHSRAM_V(1U) #define MATCHTCAM_S 1 #define MATCHTCAM_V(x) ((x) << MATCHTCAM_S) #define MATCHTCAM_F MATCHTCAM_V(1U) #define MATCHSRAM_S 0 #define MATCHSRAM_V(x) ((x) << MATCHSRAM_S) #define MATCHSRAM_F MATCHSRAM_V(1U) #define MPS_RX_PG_RSV0_A 0x11010 #define MPS_RX_PG_RSV4_A 0x11020 #define MPS_RX_PERR_INT_CAUSE_A 0x11074 #define MPS_RX_MAC_BG_PG_CNT0_A 0x11208 #define MPS_RX_LPBK_BG_PG_CNT0_A 0x11218 #define MPS_CLS_TCAM_Y_L_A 0xf000 #define MPS_CLS_TCAM_DATA0_A 0xf000 #define MPS_CLS_TCAM_DATA1_A 0xf004 #define VIDL_S 16 #define VIDL_M 0xffffU #define VIDL_G(x) (((x) >> VIDL_S) & VIDL_M) #define DATALKPTYPE_S 10 #define DATALKPTYPE_M 0x3U #define DATALKPTYPE_G(x) (((x) >> DATALKPTYPE_S) & DATALKPTYPE_M) #define DATAPORTNUM_S 12 #define DATAPORTNUM_M 0xfU #define DATAPORTNUM_G(x) (((x) >> DATAPORTNUM_S) & DATAPORTNUM_M) #define DATADIPHIT_S 8 #define DATADIPHIT_V(x) ((x) << DATADIPHIT_S) #define DATADIPHIT_F DATADIPHIT_V(1U) #define DATAVIDH2_S 7 #define DATAVIDH2_V(x) ((x) << DATAVIDH2_S) #define DATAVIDH2_F DATAVIDH2_V(1U) #define DATAVIDH1_S 0 #define DATAVIDH1_M 0x7fU #define DATAVIDH1_G(x) (((x) >> DATAVIDH1_S) & DATAVIDH1_M) #define USED_S 16 #define USED_M 0x7ffU #define USED_G(x) (((x) >> USED_S) & USED_M) #define ALLOC_S 0 #define ALLOC_M 0x7ffU #define ALLOC_G(x) (((x) >> ALLOC_S) & ALLOC_M) #define T5_USED_S 16 #define T5_USED_M 0xfffU #define T5_USED_G(x) (((x) >> T5_USED_S) & 
T5_USED_M) #define T5_ALLOC_S 0 #define T5_ALLOC_M 0xfffU #define T5_ALLOC_G(x) (((x) >> T5_ALLOC_S) & T5_ALLOC_M) #define DMACH_S 0 #define DMACH_M 0xffffU #define DMACH_G(x) (((x) >> DMACH_S) & DMACH_M) #define MPS_CLS_TCAM_X_L_A 0xf008 #define MPS_CLS_TCAM_DATA2_CTL_A 0xf008 #define CTLCMDTYPE_S 31 #define CTLCMDTYPE_V(x) ((x) << CTLCMDTYPE_S) #define CTLCMDTYPE_F CTLCMDTYPE_V(1U) #define CTLTCAMSEL_S 25 #define CTLTCAMSEL_V(x) ((x) << CTLTCAMSEL_S) #define CTLTCAMINDEX_S 17 #define CTLTCAMINDEX_V(x) ((x) << CTLTCAMINDEX_S) #define CTLXYBITSEL_S 16 #define CTLXYBITSEL_V(x) ((x) << CTLXYBITSEL_S) #define MPS_CLS_TCAM_Y_L(idx) (MPS_CLS_TCAM_Y_L_A + (idx) * 16) #define NUM_MPS_CLS_TCAM_Y_L_INSTANCES 512 #define MPS_CLS_TCAM_X_L(idx) (MPS_CLS_TCAM_X_L_A + (idx) * 16) #define NUM_MPS_CLS_TCAM_X_L_INSTANCES 512 #define MPS_CLS_SRAM_L_A 0xe000 #define T6_MULTILISTEN0_S 26 #define T6_SRAM_PRIO3_S 23 #define T6_SRAM_PRIO3_M 0x7U #define T6_SRAM_PRIO3_G(x) (((x) >> T6_SRAM_PRIO3_S) & T6_SRAM_PRIO3_M) #define T6_SRAM_PRIO2_S 20 #define T6_SRAM_PRIO2_M 0x7U #define T6_SRAM_PRIO2_G(x) (((x) >> T6_SRAM_PRIO2_S) & T6_SRAM_PRIO2_M) #define T6_SRAM_PRIO1_S 17 #define T6_SRAM_PRIO1_M 0x7U #define T6_SRAM_PRIO1_G(x) (((x) >> T6_SRAM_PRIO1_S) & T6_SRAM_PRIO1_M) #define T6_SRAM_PRIO0_S 14 #define T6_SRAM_PRIO0_M 0x7U #define T6_SRAM_PRIO0_G(x) (((x) >> T6_SRAM_PRIO0_S) & T6_SRAM_PRIO0_M) #define T6_SRAM_VLD_S 13 #define T6_SRAM_VLD_V(x) ((x) << T6_SRAM_VLD_S) #define T6_SRAM_VLD_F T6_SRAM_VLD_V(1U) #define T6_REPLICATE_S 12 #define T6_REPLICATE_V(x) ((x) << T6_REPLICATE_S) #define T6_REPLICATE_F T6_REPLICATE_V(1U) #define T6_PF_S 9 #define T6_PF_M 0x7U #define T6_PF_G(x) (((x) >> T6_PF_S) & T6_PF_M) #define T6_VF_VALID_S 8 #define T6_VF_VALID_V(x) ((x) << T6_VF_VALID_S) #define T6_VF_VALID_F T6_VF_VALID_V(1U) #define T6_VF_S 0 #define T6_VF_M 0xffU #define T6_VF_G(x) (((x) >> T6_VF_S) & T6_VF_M) #define MPS_CLS_SRAM_H_A 0xe004 #define MPS_CLS_SRAM_L(idx) (MPS_CLS_SRAM_L_A + (idx) * 8) #define NUM_MPS_CLS_SRAM_L_INSTANCES 336 #define MPS_CLS_SRAM_H(idx) (MPS_CLS_SRAM_H_A + (idx) * 8) #define NUM_MPS_CLS_SRAM_H_INSTANCES 336 #define MULTILISTEN0_S 25 #define REPLICATE_S 11 #define REPLICATE_V(x) ((x) << REPLICATE_S) #define REPLICATE_F REPLICATE_V(1U) #define PF_S 8 #define PF_M 0x7U #define PF_G(x) (((x) >> PF_S) & PF_M) #define VF_VALID_S 7 #define VF_VALID_V(x) ((x) << VF_VALID_S) #define VF_VALID_F VF_VALID_V(1U) #define VF_S 0 #define VF_M 0x7fU #define VF_G(x) (((x) >> VF_S) & VF_M) #define SRAM_PRIO3_S 22 #define SRAM_PRIO3_M 0x7U #define SRAM_PRIO3_G(x) (((x) >> SRAM_PRIO3_S) & SRAM_PRIO3_M) #define SRAM_PRIO2_S 19 #define SRAM_PRIO2_M 0x7U #define SRAM_PRIO2_G(x) (((x) >> SRAM_PRIO2_S) & SRAM_PRIO2_M) #define SRAM_PRIO1_S 16 #define SRAM_PRIO1_M 0x7U #define SRAM_PRIO1_G(x) (((x) >> SRAM_PRIO1_S) & SRAM_PRIO1_M) #define SRAM_PRIO0_S 13 #define SRAM_PRIO0_M 0x7U #define SRAM_PRIO0_G(x) (((x) >> SRAM_PRIO0_S) & SRAM_PRIO0_M) #define SRAM_VLD_S 12 #define SRAM_VLD_V(x) ((x) << SRAM_VLD_S) #define SRAM_VLD_F SRAM_VLD_V(1U) #define PORTMAP_S 0 #define PORTMAP_M 0xfU #define PORTMAP_G(x) (((x) >> PORTMAP_S) & PORTMAP_M) #define CPL_INTR_CAUSE_A 0x19054 #define CIM_OP_MAP_PERR_S 5 #define CIM_OP_MAP_PERR_V(x) ((x) << CIM_OP_MAP_PERR_S) #define CIM_OP_MAP_PERR_F CIM_OP_MAP_PERR_V(1U) #define CIM_OVFL_ERROR_S 4 #define CIM_OVFL_ERROR_V(x) ((x) << CIM_OVFL_ERROR_S) #define CIM_OVFL_ERROR_F CIM_OVFL_ERROR_V(1U) #define TP_FRAMING_ERROR_S 3 #define TP_FRAMING_ERROR_V(x) ((x) << TP_FRAMING_ERROR_S) #define 
TP_FRAMING_ERROR_F TP_FRAMING_ERROR_V(1U) #define SGE_FRAMING_ERROR_S 2 #define SGE_FRAMING_ERROR_V(x) ((x) << SGE_FRAMING_ERROR_S) #define SGE_FRAMING_ERROR_F SGE_FRAMING_ERROR_V(1U) #define CIM_FRAMING_ERROR_S 1 #define CIM_FRAMING_ERROR_V(x) ((x) << CIM_FRAMING_ERROR_S) #define CIM_FRAMING_ERROR_F CIM_FRAMING_ERROR_V(1U) #define ZERO_SWITCH_ERROR_S 0 #define ZERO_SWITCH_ERROR_V(x) ((x) << ZERO_SWITCH_ERROR_S) #define ZERO_SWITCH_ERROR_F ZERO_SWITCH_ERROR_V(1U) #define SMB_INT_CAUSE_A 0x19090 #define MSTTXFIFOPARINT_S 21 #define MSTTXFIFOPARINT_V(x) ((x) << MSTTXFIFOPARINT_S) #define MSTTXFIFOPARINT_F MSTTXFIFOPARINT_V(1U) #define MSTRXFIFOPARINT_S 20 #define MSTRXFIFOPARINT_V(x) ((x) << MSTRXFIFOPARINT_S) #define MSTRXFIFOPARINT_F MSTRXFIFOPARINT_V(1U) #define SLVFIFOPARINT_S 19 #define SLVFIFOPARINT_V(x) ((x) << SLVFIFOPARINT_S) #define SLVFIFOPARINT_F SLVFIFOPARINT_V(1U) #define ULP_RX_INT_CAUSE_A 0x19158 #define ULP_RX_ISCSI_LLIMIT_A 0x1915c #define ULP_RX_ISCSI_ULIMIT_A 0x19160 #define ULP_RX_ISCSI_TAGMASK_A 0x19164 #define ULP_RX_ISCSI_PSZ_A 0x19168 #define ULP_RX_TDDP_LLIMIT_A 0x1916c #define ULP_RX_TDDP_ULIMIT_A 0x19170 #define ULP_RX_STAG_LLIMIT_A 0x1917c #define ULP_RX_STAG_ULIMIT_A 0x19180 #define ULP_RX_RQ_LLIMIT_A 0x19184 #define ULP_RX_RQ_ULIMIT_A 0x19188 #define ULP_RX_PBL_LLIMIT_A 0x1918c #define ULP_RX_PBL_ULIMIT_A 0x19190 #define ULP_RX_CTX_BASE_A 0x19194 #define ULP_RX_RQUDP_LLIMIT_A 0x191a4 #define ULP_RX_RQUDP_ULIMIT_A 0x191a8 #define ULP_RX_LA_CTL_A 0x1923c #define ULP_RX_LA_RDPTR_A 0x19240 #define ULP_RX_LA_RDDATA_A 0x19244 #define ULP_RX_LA_WRPTR_A 0x19248 #define HPZ3_S 24 #define HPZ3_V(x) ((x) << HPZ3_S) #define HPZ2_S 16 #define HPZ2_V(x) ((x) << HPZ2_S) #define HPZ1_S 8 #define HPZ1_V(x) ((x) << HPZ1_S) #define HPZ0_S 0 #define HPZ0_V(x) ((x) << HPZ0_S) #define ULP_RX_TDDP_PSZ_A 0x19178 /* registers for module SF */ #define SF_DATA_A 0x193f8 #define SF_OP_A 0x193fc #define SF_BUSY_S 31 #define SF_BUSY_V(x) ((x) << SF_BUSY_S) #define SF_BUSY_F SF_BUSY_V(1U) #define SF_LOCK_S 4 #define SF_LOCK_V(x) ((x) << SF_LOCK_S) #define SF_LOCK_F SF_LOCK_V(1U) #define SF_CONT_S 3 #define SF_CONT_V(x) ((x) << SF_CONT_S) #define SF_CONT_F SF_CONT_V(1U) #define BYTECNT_S 1 #define BYTECNT_V(x) ((x) << BYTECNT_S) #define OP_S 0 #define OP_V(x) ((x) << OP_S) #define OP_F OP_V(1U) #define PL_PF_INT_CAUSE_A 0x3c0 #define PFSW_S 3 #define PFSW_V(x) ((x) << PFSW_S) #define PFSW_F PFSW_V(1U) #define PFCIM_S 1 #define PFCIM_V(x) ((x) << PFCIM_S) #define PFCIM_F PFCIM_V(1U) #define PL_PF_INT_ENABLE_A 0x3c4 #define PL_PF_CTL_A 0x3c8 #define PL_WHOAMI_A 0x19400 #define SOURCEPF_S 8 #define SOURCEPF_M 0x7U #define SOURCEPF_G(x) (((x) >> SOURCEPF_S) & SOURCEPF_M) #define T6_SOURCEPF_S 9 #define T6_SOURCEPF_M 0x7U #define T6_SOURCEPF_G(x) (((x) >> T6_SOURCEPF_S) & T6_SOURCEPF_M) #define PL_INT_CAUSE_A 0x1940c #define ULP_TX_S 27 #define ULP_TX_V(x) ((x) << ULP_TX_S) #define ULP_TX_F ULP_TX_V(1U) #define SGE_S 26 #define SGE_V(x) ((x) << SGE_S) #define SGE_F SGE_V(1U) #define CPL_SWITCH_S 24 #define CPL_SWITCH_V(x) ((x) << CPL_SWITCH_S) #define CPL_SWITCH_F CPL_SWITCH_V(1U) #define ULP_RX_S 23 #define ULP_RX_V(x) ((x) << ULP_RX_S) #define ULP_RX_F ULP_RX_V(1U) #define PM_RX_S 22 #define PM_RX_V(x) ((x) << PM_RX_S) #define PM_RX_F PM_RX_V(1U) #define PM_TX_S 21 #define PM_TX_V(x) ((x) << PM_TX_S) #define PM_TX_F PM_TX_V(1U) #define MA_S 20 #define MA_V(x) ((x) << MA_S) #define MA_F MA_V(1U) #define TP_S 19 #define TP_V(x) ((x) << TP_S) #define TP_F TP_V(1U) #define LE_S 18 #define LE_V(x) 
((x) << LE_S) #define LE_F LE_V(1U) #define EDC1_S 17 #define EDC1_V(x) ((x) << EDC1_S) #define EDC1_F EDC1_V(1U) #define EDC0_S 16 #define EDC0_V(x) ((x) << EDC0_S) #define EDC0_F EDC0_V(1U) #define MC_S 15 #define MC_V(x) ((x) << MC_S) #define MC_F MC_V(1U) #define PCIE_S 14 #define PCIE_V(x) ((x) << PCIE_S) #define PCIE_F PCIE_V(1U) #define XGMAC_KR1_S 12 #define XGMAC_KR1_V(x) ((x) << XGMAC_KR1_S) #define XGMAC_KR1_F XGMAC_KR1_V(1U) #define XGMAC_KR0_S 11 #define XGMAC_KR0_V(x) ((x) << XGMAC_KR0_S) #define XGMAC_KR0_F XGMAC_KR0_V(1U) #define XGMAC1_S 10 #define XGMAC1_V(x) ((x) << XGMAC1_S) #define XGMAC1_F XGMAC1_V(1U) #define XGMAC0_S 9 #define XGMAC0_V(x) ((x) << XGMAC0_S) #define XGMAC0_F XGMAC0_V(1U) #define SMB_S 8 #define SMB_V(x) ((x) << SMB_S) #define SMB_F SMB_V(1U) #define SF_S 7 #define SF_V(x) ((x) << SF_S) #define SF_F SF_V(1U) #define PL_S 6 #define PL_V(x) ((x) << PL_S) #define PL_F PL_V(1U) #define NCSI_S 5 #define NCSI_V(x) ((x) << NCSI_S) #define NCSI_F NCSI_V(1U) #define MPS_S 4 #define MPS_V(x) ((x) << MPS_S) #define MPS_F MPS_V(1U) #define CIM_S 0 #define CIM_V(x) ((x) << CIM_S) #define CIM_F CIM_V(1U) #define MC1_S 31 #define MC1_V(x) ((x) << MC1_S) #define MC1_F MC1_V(1U) #define PL_INT_ENABLE_A 0x19410 #define PL_INT_MAP0_A 0x19414 #define PL_RST_A 0x19428 #define PIORST_S 1 #define PIORST_V(x) ((x) << PIORST_S) #define PIORST_F PIORST_V(1U) #define PIORSTMODE_S 0 #define PIORSTMODE_V(x) ((x) << PIORSTMODE_S) #define PIORSTMODE_F PIORSTMODE_V(1U) #define PL_PL_INT_CAUSE_A 0x19430 #define FATALPERR_S 4 #define FATALPERR_V(x) ((x) << FATALPERR_S) #define FATALPERR_F FATALPERR_V(1U) #define PERRVFID_S 0 #define PERRVFID_V(x) ((x) << PERRVFID_S) #define PERRVFID_F PERRVFID_V(1U) #define PL_REV_A 0x1943c #define REV_S 0 #define REV_M 0xfU #define REV_V(x) ((x) << REV_S) #define REV_G(x) (((x) >> REV_S) & REV_M) #define T6_UNKNOWNCMD_S 3 #define T6_UNKNOWNCMD_V(x) ((x) << T6_UNKNOWNCMD_S) #define T6_UNKNOWNCMD_F T6_UNKNOWNCMD_V(1U) #define T6_LIP0_S 2 #define T6_LIP0_V(x) ((x) << T6_LIP0_S) #define T6_LIP0_F T6_LIP0_V(1U) #define T6_LIPMISS_S 1 #define T6_LIPMISS_V(x) ((x) << T6_LIPMISS_S) #define T6_LIPMISS_F T6_LIPMISS_V(1U) #define LE_DB_CONFIG_A 0x19c04 #define LE_DB_SERVER_INDEX_A 0x19c18 #define LE_DB_SRVR_START_INDEX_A 0x19c18 #define LE_DB_ACT_CNT_IPV4_A 0x19c20 #define LE_DB_ACT_CNT_IPV6_A 0x19c24 #define LE_DB_HASH_TID_BASE_A 0x19c30 #define LE_DB_HASH_TBL_BASE_ADDR_A 0x19c30 #define LE_DB_INT_CAUSE_A 0x19c3c #define LE_DB_TID_HASHBASE_A 0x19df8 #define T6_LE_DB_HASH_TID_BASE_A 0x19df8 #define HASHEN_S 20 #define HASHEN_V(x) ((x) << HASHEN_S) #define HASHEN_F HASHEN_V(1U) #define ASLIPCOMPEN_S 17 #define ASLIPCOMPEN_V(x) ((x) << ASLIPCOMPEN_S) #define ASLIPCOMPEN_F ASLIPCOMPEN_V(1U) #define REQQPARERR_S 16 #define REQQPARERR_V(x) ((x) << REQQPARERR_S) #define REQQPARERR_F REQQPARERR_V(1U) #define UNKNOWNCMD_S 15 #define UNKNOWNCMD_V(x) ((x) << UNKNOWNCMD_S) #define UNKNOWNCMD_F UNKNOWNCMD_V(1U) #define PARITYERR_S 6 #define PARITYERR_V(x) ((x) << PARITYERR_S) #define PARITYERR_F PARITYERR_V(1U) #define LIPMISS_S 5 #define LIPMISS_V(x) ((x) << LIPMISS_S) #define LIPMISS_F LIPMISS_V(1U) #define LIP0_S 4 #define LIP0_V(x) ((x) << LIP0_S) #define LIP0_F LIP0_V(1U) #define BASEADDR_S 3 #define BASEADDR_M 0x1fffffffU #define BASEADDR_G(x) (((x) >> BASEADDR_S) & BASEADDR_M) #define TCAMINTPERR_S 13 #define TCAMINTPERR_V(x) ((x) << TCAMINTPERR_S) #define TCAMINTPERR_F TCAMINTPERR_V(1U) #define SSRAMINTPERR_S 10 #define SSRAMINTPERR_V(x) ((x) << SSRAMINTPERR_S) 
#define SSRAMINTPERR_F SSRAMINTPERR_V(1U) #define NCSI_INT_CAUSE_A 0x1a0d8 #define CIM_DM_PRTY_ERR_S 8 #define CIM_DM_PRTY_ERR_V(x) ((x) << CIM_DM_PRTY_ERR_S) #define CIM_DM_PRTY_ERR_F CIM_DM_PRTY_ERR_V(1U) #define MPS_DM_PRTY_ERR_S 7 #define MPS_DM_PRTY_ERR_V(x) ((x) << MPS_DM_PRTY_ERR_S) #define MPS_DM_PRTY_ERR_F MPS_DM_PRTY_ERR_V(1U) #define TXFIFO_PRTY_ERR_S 1 #define TXFIFO_PRTY_ERR_V(x) ((x) << TXFIFO_PRTY_ERR_S) #define TXFIFO_PRTY_ERR_F TXFIFO_PRTY_ERR_V(1U) #define RXFIFO_PRTY_ERR_S 0 #define RXFIFO_PRTY_ERR_V(x) ((x) << RXFIFO_PRTY_ERR_S) #define RXFIFO_PRTY_ERR_F RXFIFO_PRTY_ERR_V(1U) #define XGMAC_PORT_CFG2_A 0x1018 #define PATEN_S 18 #define PATEN_V(x) ((x) << PATEN_S) #define PATEN_F PATEN_V(1U) #define MAGICEN_S 17 #define MAGICEN_V(x) ((x) << MAGICEN_S) #define MAGICEN_F MAGICEN_V(1U) #define XGMAC_PORT_MAGIC_MACID_LO 0x1024 #define XGMAC_PORT_MAGIC_MACID_HI 0x1028 #define XGMAC_PORT_EPIO_DATA0_A 0x10c0 #define XGMAC_PORT_EPIO_DATA1_A 0x10c4 #define XGMAC_PORT_EPIO_DATA2_A 0x10c8 #define XGMAC_PORT_EPIO_DATA3_A 0x10cc #define XGMAC_PORT_EPIO_OP_A 0x10d0 #define EPIOWR_S 8 #define EPIOWR_V(x) ((x) << EPIOWR_S) #define EPIOWR_F EPIOWR_V(1U) #define ADDRESS_S 0 #define ADDRESS_V(x) ((x) << ADDRESS_S) #define MAC_PORT_INT_CAUSE_A 0x8dc #define XGMAC_PORT_INT_CAUSE_A 0x10dc #define TP_TX_MOD_QUEUE_REQ_MAP_A 0x7e28 #define TP_TX_MOD_QUEUE_WEIGHT0_A 0x7e30 #define TP_TX_MOD_CHANNEL_WEIGHT_A 0x7e34 #define TX_MOD_QUEUE_REQ_MAP_S 0 #define TX_MOD_QUEUE_REQ_MAP_V(x) ((x) << TX_MOD_QUEUE_REQ_MAP_S) #define TX_MODQ_WEIGHT3_S 24 #define TX_MODQ_WEIGHT3_V(x) ((x) << TX_MODQ_WEIGHT3_S) #define TX_MODQ_WEIGHT2_S 16 #define TX_MODQ_WEIGHT2_V(x) ((x) << TX_MODQ_WEIGHT2_S) #define TX_MODQ_WEIGHT1_S 8 #define TX_MODQ_WEIGHT1_V(x) ((x) << TX_MODQ_WEIGHT1_S) #define TX_MODQ_WEIGHT0_S 0 #define TX_MODQ_WEIGHT0_V(x) ((x) << TX_MODQ_WEIGHT0_S) #define TP_TX_SCHED_HDR_A 0x23 #define TP_TX_SCHED_FIFO_A 0x24 #define TP_TX_SCHED_PCMD_A 0x25 #define NUM_MPS_CLS_SRAM_L_INSTANCES 336 #define NUM_MPS_T5_CLS_SRAM_L_INSTANCES 512 #define T5_PORT0_BASE 0x30000 #define T5_PORT_STRIDE 0x4000 #define T5_PORT_BASE(idx) (T5_PORT0_BASE + (idx) * T5_PORT_STRIDE) #define T5_PORT_REG(idx, reg) (T5_PORT_BASE(idx) + (reg)) #define MC_0_BASE_ADDR 0x40000 #define MC_1_BASE_ADDR 0x48000 #define MC_STRIDE (MC_1_BASE_ADDR - MC_0_BASE_ADDR) #define MC_REG(reg, idx) (reg + MC_STRIDE * idx) #define MC_P_BIST_CMD_A 0x41400 #define MC_P_BIST_CMD_ADDR_A 0x41404 #define MC_P_BIST_CMD_LEN_A 0x41408 #define MC_P_BIST_DATA_PATTERN_A 0x4140c #define MC_P_BIST_STATUS_RDATA_A 0x41488 #define EDC_T50_BASE_ADDR 0x50000 #define EDC_H_BIST_CMD_A 0x50004 #define EDC_H_BIST_CMD_ADDR_A 0x50008 #define EDC_H_BIST_CMD_LEN_A 0x5000c #define EDC_H_BIST_DATA_PATTERN_A 0x50010 #define EDC_H_BIST_STATUS_RDATA_A 0x50028 #define EDC_H_ECC_ERR_ADDR_A 0x50084 #define EDC_T51_BASE_ADDR 0x50800 #define EDC_T5_STRIDE (EDC_T51_BASE_ADDR - EDC_T50_BASE_ADDR) #define EDC_T5_REG(reg, idx) (reg + EDC_T5_STRIDE * idx) #define PL_VF_REV_A 0x4 #define PL_VF_WHOAMI_A 0x0 #define PL_VF_REVISION_A 0x8 /* registers for module CIM */ #define CIM_HOST_ACC_CTRL_A 0x7b50 #define CIM_HOST_ACC_DATA_A 0x7b54 #define UP_UP_DBG_LA_CFG_A 0x140 #define UP_UP_DBG_LA_DATA_A 0x144 #define HOSTBUSY_S 17 #define HOSTBUSY_V(x) ((x) << HOSTBUSY_S) #define HOSTBUSY_F HOSTBUSY_V(1U) #define HOSTWRITE_S 16 #define HOSTWRITE_V(x) ((x) << HOSTWRITE_S) #define HOSTWRITE_F HOSTWRITE_V(1U) #define CIM_IBQ_DBG_CFG_A 0x7b60 #define IBQDBGADDR_S 16 #define IBQDBGADDR_M 0xfffU #define 
IBQDBGADDR_V(x) ((x) << IBQDBGADDR_S) #define IBQDBGADDR_G(x) (((x) >> IBQDBGADDR_S) & IBQDBGADDR_M) #define IBQDBGBUSY_S 1 #define IBQDBGBUSY_V(x) ((x) << IBQDBGBUSY_S) #define IBQDBGBUSY_F IBQDBGBUSY_V(1U) #define IBQDBGEN_S 0 #define IBQDBGEN_V(x) ((x) << IBQDBGEN_S) #define IBQDBGEN_F IBQDBGEN_V(1U) #define CIM_OBQ_DBG_CFG_A 0x7b64 #define OBQDBGADDR_S 16 #define OBQDBGADDR_M 0xfffU #define OBQDBGADDR_V(x) ((x) << OBQDBGADDR_S) #define OBQDBGADDR_G(x) (((x) >> OBQDBGADDR_S) & OBQDBGADDR_M) #define OBQDBGBUSY_S 1 #define OBQDBGBUSY_V(x) ((x) << OBQDBGBUSY_S) #define OBQDBGBUSY_F OBQDBGBUSY_V(1U) #define OBQDBGEN_S 0 #define OBQDBGEN_V(x) ((x) << OBQDBGEN_S) #define OBQDBGEN_F OBQDBGEN_V(1U) #define CIM_IBQ_DBG_DATA_A 0x7b68 #define CIM_OBQ_DBG_DATA_A 0x7b6c #define CIM_DEBUGCFG_A 0x7b70 #define CIM_DEBUGSTS_A 0x7b74 #define POLADBGRDPTR_S 23 #define POLADBGRDPTR_M 0x1ffU #define POLADBGRDPTR_V(x) ((x) << POLADBGRDPTR_S) #define POLADBGWRPTR_S 16 #define POLADBGWRPTR_M 0x1ffU #define POLADBGWRPTR_G(x) (((x) >> POLADBGWRPTR_S) & POLADBGWRPTR_M) #define PILADBGRDPTR_S 14 #define PILADBGRDPTR_M 0x1ffU #define PILADBGRDPTR_V(x) ((x) << PILADBGRDPTR_S) #define PILADBGWRPTR_S 0 #define PILADBGWRPTR_M 0x1ffU #define PILADBGWRPTR_G(x) (((x) >> PILADBGWRPTR_S) & PILADBGWRPTR_M) #define LADBGEN_S 12 #define LADBGEN_V(x) ((x) << LADBGEN_S) #define LADBGEN_F LADBGEN_V(1U) #define CIM_PO_LA_DEBUGDATA_A 0x7b78 #define CIM_PI_LA_DEBUGDATA_A 0x7b7c #define CIM_PO_LA_MADEBUGDATA_A 0x7b80 #define CIM_PI_LA_MADEBUGDATA_A 0x7b84 #define UPDBGLARDEN_S 1 #define UPDBGLARDEN_V(x) ((x) << UPDBGLARDEN_S) #define UPDBGLARDEN_F UPDBGLARDEN_V(1U) #define UPDBGLAEN_S 0 #define UPDBGLAEN_V(x) ((x) << UPDBGLAEN_S) #define UPDBGLAEN_F UPDBGLAEN_V(1U) #define UPDBGLARDPTR_S 2 #define UPDBGLARDPTR_M 0xfffU #define UPDBGLARDPTR_V(x) ((x) << UPDBGLARDPTR_S) #define UPDBGLAWRPTR_S 16 #define UPDBGLAWRPTR_M 0xfffU #define UPDBGLAWRPTR_G(x) (((x) >> UPDBGLAWRPTR_S) & UPDBGLAWRPTR_M) #define UPDBGLACAPTPCONLY_S 30 #define UPDBGLACAPTPCONLY_V(x) ((x) << UPDBGLACAPTPCONLY_S) #define UPDBGLACAPTPCONLY_F UPDBGLACAPTPCONLY_V(1U) #define CIM_QUEUE_CONFIG_REF_A 0x7b48 #define CIM_QUEUE_CONFIG_CTRL_A 0x7b4c #define CIMQSIZE_S 24 #define CIMQSIZE_M 0x3fU #define CIMQSIZE_G(x) (((x) >> CIMQSIZE_S) & CIMQSIZE_M) #define CIMQBASE_S 16 #define CIMQBASE_M 0x3fU #define CIMQBASE_G(x) (((x) >> CIMQBASE_S) & CIMQBASE_M) #define QUEFULLTHRSH_S 0 #define QUEFULLTHRSH_M 0x1ffU #define QUEFULLTHRSH_G(x) (((x) >> QUEFULLTHRSH_S) & QUEFULLTHRSH_M) #define UP_IBQ_0_RDADDR_A 0x10 #define UP_IBQ_0_SHADOW_RDADDR_A 0x280 #define UP_OBQ_0_REALADDR_A 0x104 #define UP_OBQ_0_SHADOW_REALADDR_A 0x394 #define IBQRDADDR_S 0 #define IBQRDADDR_M 0x1fffU #define IBQRDADDR_G(x) (((x) >> IBQRDADDR_S) & IBQRDADDR_M) #define IBQWRADDR_S 0 #define IBQWRADDR_M 0x1fffU #define IBQWRADDR_G(x) (((x) >> IBQWRADDR_S) & IBQWRADDR_M) #define QUERDADDR_S 0 #define QUERDADDR_M 0x7fffU #define QUERDADDR_G(x) (((x) >> QUERDADDR_S) & QUERDADDR_M) #define QUEREMFLITS_S 0 #define QUEREMFLITS_M 0x7ffU #define QUEREMFLITS_G(x) (((x) >> QUEREMFLITS_S) & QUEREMFLITS_M) #define QUEEOPCNT_S 16 #define QUEEOPCNT_M 0xfffU #define QUEEOPCNT_G(x) (((x) >> QUEEOPCNT_S) & QUEEOPCNT_M) #define QUESOPCNT_S 0 #define QUESOPCNT_M 0xfffU #define QUESOPCNT_G(x) (((x) >> QUESOPCNT_S) & QUESOPCNT_M) #define OBQSELECT_S 4 #define OBQSELECT_V(x) ((x) << OBQSELECT_S) #define OBQSELECT_F OBQSELECT_V(1U) #define IBQSELECT_S 3 #define IBQSELECT_V(x) ((x) << IBQSELECT_S) #define IBQSELECT_F IBQSELECT_V(1U) 
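/*
 * Illustrative sketch (not part of the original header): the CIM IBQ
 * debug interface above is driven by programming CIM_IBQ_DBG_CFG_A,
 * waiting for the busy bit to clear, then reading the captured entry
 * from CIM_IBQ_DBG_DATA_A.  With hypothetical read_reg()/write_reg()
 * accessors:
 *
 *	write_reg(CIM_IBQ_DBG_CFG_A, IBQDBGADDR_V(addr) | IBQDBGEN_F);
 *	while (read_reg(CIM_IBQ_DBG_CFG_A) & IBQDBGBUSY_F)
 *		;	-- spin until the debug read completes
 *	data = read_reg(CIM_IBQ_DBG_DATA_A);
 */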
#define QUENUMSELECT_S    0
#define QUENUMSELECT_V(x) ((x) << QUENUMSELECT_S)

#endif /* __T4_REGS_H */
rdma-core-56.1/providers/cxgb4/t4fw_api.h
/*
 * This file is part of the Chelsio T4 Ethernet driver for Linux.
 *
 * Copyright (c) 2009-2014 Chelsio Communications, Inc. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses. You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

#ifndef _T4FW_INTERFACE_H_
#define _T4FW_INTERFACE_H_

#include <linux/types.h>

enum fw_retval {
	FW_SUCCESS		= 0,	/* completed successfully */
	FW_EPERM		= 1,	/* operation not permitted */
	FW_ENOENT		= 2,	/* no such file or directory */
	FW_EIO			= 5,	/* input/output error; hw bad */
	FW_ENOEXEC		= 8,	/* exec format error; inv microcode */
	FW_EAGAIN		= 11,	/* try again */
	FW_ENOMEM		= 12,	/* out of memory */
	FW_EFAULT		= 14,	/* bad address; fw bad */
	FW_EBUSY		= 16,	/* resource busy */
	FW_EEXIST		= 17,	/* file exists */
	FW_ENODEV		= 19,	/* no such device */
	FW_EINVAL		= 22,	/* invalid argument */
	FW_ENOSPC		= 28,	/* no space left on device */
	FW_ENOSYS		= 38,	/* functionality not implemented */
	FW_ENODATA		= 61,	/* no data available */
	FW_EPROTO		= 71,	/* protocol error */
	FW_EADDRINUSE		= 98,	/* address already in use */
	FW_EADDRNOTAVAIL	= 99,	/* cannot assign requested address */
	FW_ENETDOWN		= 100,	/* network is down */
	FW_ENETUNREACH		= 101,	/* network is unreachable */
	FW_ENOBUFS		= 105,	/* no buffer space available */
	FW_ETIMEDOUT		= 110,	/* timeout */
	FW_EINPROGRESS		= 115,	/* fw internal */
	FW_SCSI_ABORT_REQUESTED	= 128,	/* */
	FW_SCSI_ABORT_TIMEDOUT	= 129,	/* */
	FW_SCSI_ABORTED		= 130,	/* */
	FW_SCSI_CLOSE_REQUESTED	= 131,	/* */
	FW_ERR_LINK_DOWN	= 132,	/* */
	FW_RDEV_NOT_READY	= 133,	/* */
	FW_ERR_RDEV_LOST	= 134,	/* */
	FW_ERR_RDEV_LOGO	= 135,	/* */
	FW_FCOE_NO_XCHG		= 136,	/* */
	FW_SCSI_RSP_ERR		= 137,	/* */
	FW_ERR_RDEV_IMPL_LOGO	= 138,	/* */
	FW_SCSI_UNDER_FLOW_ERR	= 139,	/* */
	FW_SCSI_OVER_FLOW_ERR	= 140,	/* */
	FW_SCSI_DDP_ERR		= 141,	/* DDP error */
	FW_SCSI_TASK_ERR	= 142,	/* No SCSI tasks available */
};

#define FW_T4VF_SGE_BASE_ADDR		0x0000
#define FW_T4VF_MPS_BASE_ADDR		0x0100
#define FW_T4VF_PL_BASE_ADDR		0x0200
#define FW_T4VF_MBDATA_BASE_ADDR	0x0240
#define FW_T4VF_CIM_BASE_ADDR		0x0300

enum fw_wr_opcodes {
	FW_FILTER_WR			= 0x02,
	FW_ULPTX_WR			= 0x04,
	FW_TP_WR			= 0x05,
	FW_ETH_TX_PKT_WR		= 0x08,
	FW_OFLD_CONNECTION_WR		= 0x2f,
	FW_FLOWC_WR			= 0x0a,
	FW_OFLD_TX_DATA_WR		= 0x0b,
	FW_CMD_WR			= 0x10,
	FW_ETH_TX_PKT_VM_WR		= 0x11,
	FW_RI_RES_WR			= 0x0c,
	FW_RI_INIT_WR			= 0x0d,
	FW_RI_RDMA_WRITE_WR		= 0x14,
	FW_RI_SEND_WR			= 0x15,
	FW_RI_RDMA_READ_WR		= 0x16,
	FW_RI_RECV_WR			= 0x17,
	FW_RI_BIND_MW_WR		= 0x18,
	FW_RI_FR_NSMR_WR		= 0x19,
	FW_RI_RDMA_WRITE_CMPL_WR	= 0x21,
	FW_RI_INV_LSTAG_WR		= 0x1a,
	FW_ISCSI_TX_DATA_WR		= 0x45,
	FW_LASTC2E_WR			= 0x70
};

struct fw_wr_hdr {
	__be32 hi;
	__be32 lo;
};

/* work request opcode (hi) */
#define FW_WR_OP_S	24
#define FW_WR_OP_M	0xff
#define FW_WR_OP_V(x)	((x) << FW_WR_OP_S)
#define FW_WR_OP_G(x)	(((x) >> FW_WR_OP_S) & FW_WR_OP_M)

/* atomic flag (hi) - firmware encapsulates CPLs in CPL_BARRIER */
#define FW_WR_ATOMIC_S		23
#define FW_WR_ATOMIC_V(x)	((x) << FW_WR_ATOMIC_S)

/* flush flag (hi) - firmware flushes flushable work request buffered
 * in the flow context.
 */
#define FW_WR_FLUSH_S		22
#define FW_WR_FLUSH_V(x)	((x) << FW_WR_FLUSH_S)

/* completion flag (hi) - firmware generates a cpl_fw6_ack */
#define FW_WR_COMPL_S		21
#define FW_WR_COMPL_V(x)	((x) << FW_WR_COMPL_S)
#define FW_WR_COMPL_F		FW_WR_COMPL_V(1U)

/* work request immediate data length (hi) */
#define FW_WR_IMMDLEN_S		0
#define FW_WR_IMMDLEN_M		0xff
#define FW_WR_IMMDLEN_V(x)	((x) << FW_WR_IMMDLEN_S)

/* egress queue status update to associated ingress queue entry (lo) */
#define FW_WR_EQUIQ_S		31
#define FW_WR_EQUIQ_V(x)	((x) << FW_WR_EQUIQ_S)
#define FW_WR_EQUIQ_F		FW_WR_EQUIQ_V(1U)

/* egress queue status update to egress queue status entry (lo) */
#define FW_WR_EQUEQ_S		30
#define FW_WR_EQUEQ_V(x)	((x) << FW_WR_EQUEQ_S)
#define FW_WR_EQUEQ_F		FW_WR_EQUEQ_V(1U)

/* flow context identifier (lo) */
#define FW_WR_FLOWID_S		8
#define FW_WR_FLOWID_V(x)	((x) << FW_WR_FLOWID_S)

/* length in units of 16-bytes (lo) */
#define FW_WR_LEN16_S		0
#define FW_WR_LEN16_V(x)	((x) << FW_WR_LEN16_S)

#define HW_TPL_FR_MT_PR_IV_P_FC		0X32B
#define HW_TPL_FR_MT_PR_OV_P_FC		0X327

/* filter wr reply code in cookie in CPL_SET_TCB_RPL */
enum fw_filter_wr_cookie {
	FW_FILTER_WR_SUCCESS,
	FW_FILTER_WR_FLT_ADDED,
	FW_FILTER_WR_FLT_DELETED,
	FW_FILTER_WR_SMT_TBL_FULL,
	FW_FILTER_WR_EINVAL,
};

struct fw_filter_wr {
	__be32 op_pkd;
	__be32 len16_pkd;
	__be64 r3;
	__be32 tid_to_iq;
	__be32 del_filter_to_l2tix;
	__be16 ethtype;
	__be16 ethtypem;
	__u8   frag_to_ovlan_vldm;
	__u8   smac_sel;
	__be16 rx_chan_rx_rpl_iq;
	__be32 maci_to_matchtypem;
	__u8   ptcl;
	__u8   ptclm;
	__u8   ttyp;
	__u8   ttypm;
	__be16 ivlan;
	__be16 ivlanm;
	__be16 ovlan;
	__be16 ovlanm;
	__u8   lip[16];
	__u8   lipm[16];
	__u8   fip[16];
	__u8   fipm[16];
	__be16 lp;
	__be16 lpm;
	__be16 fp;
	__be16 fpm;
	__be16 r7;
	__u8   sma[6];
};

#define FW_FILTER_WR_TID_S	12
#define FW_FILTER_WR_TID_M	0xfffff
#define FW_FILTER_WR_TID_V(x)	((x) << FW_FILTER_WR_TID_S)
#define FW_FILTER_WR_TID_G(x)	\
	(((x) >> FW_FILTER_WR_TID_S) & FW_FILTER_WR_TID_M)

#define FW_FILTER_WR_RQTYPE_S		11
#define FW_FILTER_WR_RQTYPE_M		0x1
#define FW_FILTER_WR_RQTYPE_V(x)	((x) << FW_FILTER_WR_RQTYPE_S)
#define FW_FILTER_WR_RQTYPE_G(x)	\
	(((x) >> FW_FILTER_WR_RQTYPE_S) & FW_FILTER_WR_RQTYPE_M)
#define FW_FILTER_WR_RQTYPE_F	FW_FILTER_WR_RQTYPE_V(1U)

#define FW_FILTER_WR_NOREPLY_S		10
#define FW_FILTER_WR_NOREPLY_M		0x1
#define FW_FILTER_WR_NOREPLY_V(x)	((x) << FW_FILTER_WR_NOREPLY_S)
#define FW_FILTER_WR_NOREPLY_G(x)	\
	(((x) >> FW_FILTER_WR_NOREPLY_S) & FW_FILTER_WR_NOREPLY_M)
#define FW_FILTER_WR_NOREPLY_F	FW_FILTER_WR_NOREPLY_V(1U)

#define FW_FILTER_WR_IQ_S	0
#define FW_FILTER_WR_IQ_M	0x3ff
#define FW_FILTER_WR_IQ_V(x)	((x) << FW_FILTER_WR_IQ_S)
#define FW_FILTER_WR_IQ_G(x)	\
	(((x) >> FW_FILTER_WR_IQ_S) & FW_FILTER_WR_IQ_M)

#define FW_FILTER_WR_DEL_FILTER_S	31
#define FW_FILTER_WR_DEL_FILTER_M	0x1
#define FW_FILTER_WR_DEL_FILTER_V(x)	((x) << FW_FILTER_WR_DEL_FILTER_S)
#define FW_FILTER_WR_DEL_FILTER_G(x)	\
	(((x) >> FW_FILTER_WR_DEL_FILTER_S) & FW_FILTER_WR_DEL_FILTER_M)
#define FW_FILTER_WR_DEL_FILTER_F	FW_FILTER_WR_DEL_FILTER_V(1U)

#define FW_FILTER_WR_RPTTID_S		25
#define FW_FILTER_WR_RPTTID_M		0x1
#define FW_FILTER_WR_RPTTID_V(x)	((x) << FW_FILTER_WR_RPTTID_S)
#define FW_FILTER_WR_RPTTID_G(x)	\
	(((x) >> FW_FILTER_WR_RPTTID_S) & FW_FILTER_WR_RPTTID_M)
#define FW_FILTER_WR_RPTTID_F	FW_FILTER_WR_RPTTID_V(1U)

#define FW_FILTER_WR_DROP_S	24
#define FW_FILTER_WR_DROP_M	0x1
#define FW_FILTER_WR_DROP_V(x)	((x) << FW_FILTER_WR_DROP_S)
#define FW_FILTER_WR_DROP_G(x)	\
	(((x) >> FW_FILTER_WR_DROP_S) & FW_FILTER_WR_DROP_M)
#define FW_FILTER_WR_DROP_F	FW_FILTER_WR_DROP_V(1U)

#define FW_FILTER_WR_DIRSTEER_S		23
#define FW_FILTER_WR_DIRSTEER_M		0x1
#define FW_FILTER_WR_DIRSTEER_V(x)	((x) << FW_FILTER_WR_DIRSTEER_S)
#define FW_FILTER_WR_DIRSTEER_G(x)	\
	(((x) >> FW_FILTER_WR_DIRSTEER_S) & FW_FILTER_WR_DIRSTEER_M)
#define FW_FILTER_WR_DIRSTEER_F	FW_FILTER_WR_DIRSTEER_V(1U)

#define FW_FILTER_WR_MASKHASH_S		22
#define FW_FILTER_WR_MASKHASH_M		0x1
#define FW_FILTER_WR_MASKHASH_V(x)	((x) << FW_FILTER_WR_MASKHASH_S)
#define FW_FILTER_WR_MASKHASH_G(x)	\
	(((x) >> FW_FILTER_WR_MASKHASH_S) & FW_FILTER_WR_MASKHASH_M)
#define FW_FILTER_WR_MASKHASH_F	FW_FILTER_WR_MASKHASH_V(1U)

#define FW_FILTER_WR_DIRSTEERHASH_S	21
#define FW_FILTER_WR_DIRSTEERHASH_M	0x1
#define FW_FILTER_WR_DIRSTEERHASH_V(x)	((x) << FW_FILTER_WR_DIRSTEERHASH_S)
#define FW_FILTER_WR_DIRSTEERHASH_G(x)	\
	(((x) >> FW_FILTER_WR_DIRSTEERHASH_S) & FW_FILTER_WR_DIRSTEERHASH_M)
#define FW_FILTER_WR_DIRSTEERHASH_F	FW_FILTER_WR_DIRSTEERHASH_V(1U)

#define FW_FILTER_WR_LPBK_S	20
#define FW_FILTER_WR_LPBK_M	0x1
#define FW_FILTER_WR_LPBK_V(x)	((x) << FW_FILTER_WR_LPBK_S)
#define FW_FILTER_WR_LPBK_G(x)	\
	(((x) >> FW_FILTER_WR_LPBK_S) & FW_FILTER_WR_LPBK_M)
#define FW_FILTER_WR_LPBK_F	FW_FILTER_WR_LPBK_V(1U)

#define FW_FILTER_WR_DMAC_S	19
#define FW_FILTER_WR_DMAC_M	0x1
#define FW_FILTER_WR_DMAC_V(x)	((x) << FW_FILTER_WR_DMAC_S)
#define FW_FILTER_WR_DMAC_G(x)	\
	(((x) >> FW_FILTER_WR_DMAC_S) & FW_FILTER_WR_DMAC_M)
#define FW_FILTER_WR_DMAC_F	FW_FILTER_WR_DMAC_V(1U)

#define FW_FILTER_WR_SMAC_S	18
#define FW_FILTER_WR_SMAC_M	0x1
#define FW_FILTER_WR_SMAC_V(x)	((x) << FW_FILTER_WR_SMAC_S)
#define FW_FILTER_WR_SMAC_G(x)	\
	(((x) >> FW_FILTER_WR_SMAC_S) & FW_FILTER_WR_SMAC_M)
#define FW_FILTER_WR_SMAC_F	FW_FILTER_WR_SMAC_V(1U)

#define FW_FILTER_WR_INSVLAN_S		17
#define FW_FILTER_WR_INSVLAN_M		0x1
#define FW_FILTER_WR_INSVLAN_V(x)	((x) << FW_FILTER_WR_INSVLAN_S)
#define FW_FILTER_WR_INSVLAN_G(x)	\
	(((x) >> FW_FILTER_WR_INSVLAN_S) & FW_FILTER_WR_INSVLAN_M)
#define FW_FILTER_WR_INSVLAN_F	FW_FILTER_WR_INSVLAN_V(1U)

#define FW_FILTER_WR_RMVLAN_S		16
#define FW_FILTER_WR_RMVLAN_M		0x1
#define FW_FILTER_WR_RMVLAN_V(x)	((x) << FW_FILTER_WR_RMVLAN_S)
#define FW_FILTER_WR_RMVLAN_G(x)	\
	(((x) >> FW_FILTER_WR_RMVLAN_S) & FW_FILTER_WR_RMVLAN_M)
#define FW_FILTER_WR_RMVLAN_F	FW_FILTER_WR_RMVLAN_V(1U)

#define FW_FILTER_WR_HITCNTS_S		15
#define FW_FILTER_WR_HITCNTS_M		0x1
#define FW_FILTER_WR_HITCNTS_V(x)	((x) << FW_FILTER_WR_HITCNTS_S)
#define FW_FILTER_WR_HITCNTS_G(x)	\
	(((x) >> FW_FILTER_WR_HITCNTS_S) & FW_FILTER_WR_HITCNTS_M)
#define FW_FILTER_WR_HITCNTS_F	FW_FILTER_WR_HITCNTS_V(1U)

#define FW_FILTER_WR_TXCHAN_S		13
#define FW_FILTER_WR_TXCHAN_M		0x3
#define FW_FILTER_WR_TXCHAN_V(x)	((x) << FW_FILTER_WR_TXCHAN_S)
#define FW_FILTER_WR_TXCHAN_G(x)	\
	(((x) >> FW_FILTER_WR_TXCHAN_S) & FW_FILTER_WR_TXCHAN_M)

#define FW_FILTER_WR_PRIO_S	12
#define FW_FILTER_WR_PRIO_M	0x1
#define FW_FILTER_WR_PRIO_V(x)	((x) << FW_FILTER_WR_PRIO_S)
#define FW_FILTER_WR_PRIO_G(x)	\
	(((x) >> FW_FILTER_WR_PRIO_S) & FW_FILTER_WR_PRIO_M)
#define FW_FILTER_WR_PRIO_F	FW_FILTER_WR_PRIO_V(1U)

#define FW_FILTER_WR_L2TIX_S	0
#define FW_FILTER_WR_L2TIX_M	0xfff
#define FW_FILTER_WR_L2TIX_V(x)	((x) << FW_FILTER_WR_L2TIX_S)
#define FW_FILTER_WR_L2TIX_G(x)	\
	(((x) >> FW_FILTER_WR_L2TIX_S) & FW_FILTER_WR_L2TIX_M)

#define FW_FILTER_WR_FRAG_S	7
#define FW_FILTER_WR_FRAG_M	0x1
#define FW_FILTER_WR_FRAG_V(x)	((x) << FW_FILTER_WR_FRAG_S)
#define FW_FILTER_WR_FRAG_G(x)	\
	(((x) >> FW_FILTER_WR_FRAG_S) & FW_FILTER_WR_FRAG_M)
#define FW_FILTER_WR_FRAG_F	FW_FILTER_WR_FRAG_V(1U)

#define FW_FILTER_WR_FRAGM_S	6
#define FW_FILTER_WR_FRAGM_M	0x1
#define FW_FILTER_WR_FRAGM_V(x)	((x) << FW_FILTER_WR_FRAGM_S)
#define FW_FILTER_WR_FRAGM_G(x)	\
	(((x) >> FW_FILTER_WR_FRAGM_S) & FW_FILTER_WR_FRAGM_M)
#define FW_FILTER_WR_FRAGM_F	FW_FILTER_WR_FRAGM_V(1U)

#define FW_FILTER_WR_IVLAN_VLD_S	5
#define FW_FILTER_WR_IVLAN_VLD_M	0x1
#define FW_FILTER_WR_IVLAN_VLD_V(x)	((x) << FW_FILTER_WR_IVLAN_VLD_S)
#define FW_FILTER_WR_IVLAN_VLD_G(x)	\
	(((x) >> FW_FILTER_WR_IVLAN_VLD_S) & FW_FILTER_WR_IVLAN_VLD_M)
#define FW_FILTER_WR_IVLAN_VLD_F	FW_FILTER_WR_IVLAN_VLD_V(1U)

#define FW_FILTER_WR_OVLAN_VLD_S	4
#define FW_FILTER_WR_OVLAN_VLD_M	0x1
#define FW_FILTER_WR_OVLAN_VLD_V(x)	((x) << FW_FILTER_WR_OVLAN_VLD_S)
#define FW_FILTER_WR_OVLAN_VLD_G(x)	\
	(((x) >> FW_FILTER_WR_OVLAN_VLD_S) & FW_FILTER_WR_OVLAN_VLD_M)
#define FW_FILTER_WR_OVLAN_VLD_F	FW_FILTER_WR_OVLAN_VLD_V(1U)

#define FW_FILTER_WR_IVLAN_VLDM_S	3
#define FW_FILTER_WR_IVLAN_VLDM_M	0x1
#define FW_FILTER_WR_IVLAN_VLDM_V(x)	((x) << FW_FILTER_WR_IVLAN_VLDM_S)
#define FW_FILTER_WR_IVLAN_VLDM_G(x)	\
	(((x) >> FW_FILTER_WR_IVLAN_VLDM_S) & FW_FILTER_WR_IVLAN_VLDM_M)
#define FW_FILTER_WR_IVLAN_VLDM_F	FW_FILTER_WR_IVLAN_VLDM_V(1U)

#define FW_FILTER_WR_OVLAN_VLDM_S	2
#define FW_FILTER_WR_OVLAN_VLDM_M	0x1
#define FW_FILTER_WR_OVLAN_VLDM_V(x)	((x) << FW_FILTER_WR_OVLAN_VLDM_S)
#define FW_FILTER_WR_OVLAN_VLDM_G(x)	\
	(((x) >> FW_FILTER_WR_OVLAN_VLDM_S) & FW_FILTER_WR_OVLAN_VLDM_M)
#define FW_FILTER_WR_OVLAN_VLDM_F	FW_FILTER_WR_OVLAN_VLDM_V(1U)

#define FW_FILTER_WR_RX_CHAN_S		15
#define FW_FILTER_WR_RX_CHAN_M		0x1
#define FW_FILTER_WR_RX_CHAN_V(x)	((x) << FW_FILTER_WR_RX_CHAN_S)
#define FW_FILTER_WR_RX_CHAN_G(x)	\
	(((x) >> FW_FILTER_WR_RX_CHAN_S) & FW_FILTER_WR_RX_CHAN_M)
#define FW_FILTER_WR_RX_CHAN_F	FW_FILTER_WR_RX_CHAN_V(1U)

#define FW_FILTER_WR_RX_RPL_IQ_S	0
#define FW_FILTER_WR_RX_RPL_IQ_M	0x3ff
#define FW_FILTER_WR_RX_RPL_IQ_V(x)	((x) << FW_FILTER_WR_RX_RPL_IQ_S)
#define FW_FILTER_WR_RX_RPL_IQ_G(x)	\
	(((x) >> FW_FILTER_WR_RX_RPL_IQ_S) & FW_FILTER_WR_RX_RPL_IQ_M)

#define FW_FILTER_WR_MACI_S	23
#define FW_FILTER_WR_MACI_M	0x1ff
#define FW_FILTER_WR_MACI_V(x)	((x) << FW_FILTER_WR_MACI_S)
#define FW_FILTER_WR_MACI_G(x)	\
	(((x) >> FW_FILTER_WR_MACI_S) & FW_FILTER_WR_MACI_M)

#define FW_FILTER_WR_MACIM_S	14
#define FW_FILTER_WR_MACIM_M	0x1ff
#define FW_FILTER_WR_MACIM_V(x)	((x) << FW_FILTER_WR_MACIM_S)
#define FW_FILTER_WR_MACIM_G(x)	\
	(((x) >> FW_FILTER_WR_MACIM_S) & FW_FILTER_WR_MACIM_M)

#define FW_FILTER_WR_FCOE_S	13
#define FW_FILTER_WR_FCOE_M	0x1
#define FW_FILTER_WR_FCOE_V(x)	((x) << FW_FILTER_WR_FCOE_S)
#define FW_FILTER_WR_FCOE_G(x)	\
	(((x) >> FW_FILTER_WR_FCOE_S) & FW_FILTER_WR_FCOE_M)
#define
FW_FILTER_WR_FCOE_F FW_FILTER_WR_FCOE_V(1U) #define FW_FILTER_WR_FCOEM_S 12 #define FW_FILTER_WR_FCOEM_M 0x1 #define FW_FILTER_WR_FCOEM_V(x) ((x) << FW_FILTER_WR_FCOEM_S) #define FW_FILTER_WR_FCOEM_G(x) \ (((x) >> FW_FILTER_WR_FCOEM_S) & FW_FILTER_WR_FCOEM_M) #define FW_FILTER_WR_FCOEM_F FW_FILTER_WR_FCOEM_V(1U) #define FW_FILTER_WR_PORT_S 9 #define FW_FILTER_WR_PORT_M 0x7 #define FW_FILTER_WR_PORT_V(x) ((x) << FW_FILTER_WR_PORT_S) #define FW_FILTER_WR_PORT_G(x) \ (((x) >> FW_FILTER_WR_PORT_S) & FW_FILTER_WR_PORT_M) #define FW_FILTER_WR_PORTM_S 6 #define FW_FILTER_WR_PORTM_M 0x7 #define FW_FILTER_WR_PORTM_V(x) ((x) << FW_FILTER_WR_PORTM_S) #define FW_FILTER_WR_PORTM_G(x) \ (((x) >> FW_FILTER_WR_PORTM_S) & FW_FILTER_WR_PORTM_M) #define FW_FILTER_WR_MATCHTYPE_S 3 #define FW_FILTER_WR_MATCHTYPE_M 0x7 #define FW_FILTER_WR_MATCHTYPE_V(x) ((x) << FW_FILTER_WR_MATCHTYPE_S) #define FW_FILTER_WR_MATCHTYPE_G(x) \ (((x) >> FW_FILTER_WR_MATCHTYPE_S) & FW_FILTER_WR_MATCHTYPE_M) #define FW_FILTER_WR_MATCHTYPEM_S 0 #define FW_FILTER_WR_MATCHTYPEM_M 0x7 #define FW_FILTER_WR_MATCHTYPEM_V(x) ((x) << FW_FILTER_WR_MATCHTYPEM_S) #define FW_FILTER_WR_MATCHTYPEM_G(x) \ (((x) >> FW_FILTER_WR_MATCHTYPEM_S) & FW_FILTER_WR_MATCHTYPEM_M) struct fw_ulptx_wr { __be32 op_to_compl; __be32 flowid_len16; u64 cookie; }; struct fw_tp_wr { __be32 op_to_immdlen; __be32 flowid_len16; u64 cookie; }; struct fw_eth_tx_pkt_wr { __be32 op_immdlen; __be32 equiq_to_len16; __be64 r3; }; struct fw_ofld_connection_wr { __be32 op_compl; __be32 len16_pkd; __u64 cookie; __be64 r2; __be64 r3; struct fw_ofld_connection_le { __be32 version_cpl; __be32 filter; __be32 r1; __be16 lport; __be16 pport; union fw_ofld_connection_leip { struct fw_ofld_connection_le_ipv4 { __be32 pip; __be32 lip; __be64 r0; __be64 r1; __be64 r2; } ipv4; struct fw_ofld_connection_le_ipv6 { __be64 pip_hi; __be64 pip_lo; __be64 lip_hi; __be64 lip_lo; } ipv6; } u; } le; struct fw_ofld_connection_tcb { __be32 t_state_to_astid; __be16 cplrxdataack_cplpassacceptrpl; __be16 rcv_adv; __be32 rcv_nxt; __be32 tx_max; __be64 opt0; __be32 opt2; __be32 r1; __be64 r2; __be64 r3; } tcb; }; #define FW_OFLD_CONNECTION_WR_VERSION_S 31 #define FW_OFLD_CONNECTION_WR_VERSION_M 0x1 #define FW_OFLD_CONNECTION_WR_VERSION_V(x) \ ((x) << FW_OFLD_CONNECTION_WR_VERSION_S) #define FW_OFLD_CONNECTION_WR_VERSION_G(x) \ (((x) >> FW_OFLD_CONNECTION_WR_VERSION_S) & \ FW_OFLD_CONNECTION_WR_VERSION_M) #define FW_OFLD_CONNECTION_WR_VERSION_F \ FW_OFLD_CONNECTION_WR_VERSION_V(1U) #define FW_OFLD_CONNECTION_WR_CPL_S 30 #define FW_OFLD_CONNECTION_WR_CPL_M 0x1 #define FW_OFLD_CONNECTION_WR_CPL_V(x) ((x) << FW_OFLD_CONNECTION_WR_CPL_S) #define FW_OFLD_CONNECTION_WR_CPL_G(x) \ (((x) >> FW_OFLD_CONNECTION_WR_CPL_S) & FW_OFLD_CONNECTION_WR_CPL_M) #define FW_OFLD_CONNECTION_WR_CPL_F FW_OFLD_CONNECTION_WR_CPL_V(1U) #define FW_OFLD_CONNECTION_WR_T_STATE_S 28 #define FW_OFLD_CONNECTION_WR_T_STATE_M 0xf #define FW_OFLD_CONNECTION_WR_T_STATE_V(x) \ ((x) << FW_OFLD_CONNECTION_WR_T_STATE_S) #define FW_OFLD_CONNECTION_WR_T_STATE_G(x) \ (((x) >> FW_OFLD_CONNECTION_WR_T_STATE_S) & \ FW_OFLD_CONNECTION_WR_T_STATE_M) #define FW_OFLD_CONNECTION_WR_RCV_SCALE_S 24 #define FW_OFLD_CONNECTION_WR_RCV_SCALE_M 0xf #define FW_OFLD_CONNECTION_WR_RCV_SCALE_V(x) \ ((x) << FW_OFLD_CONNECTION_WR_RCV_SCALE_S) #define FW_OFLD_CONNECTION_WR_RCV_SCALE_G(x) \ (((x) >> FW_OFLD_CONNECTION_WR_RCV_SCALE_S) & \ FW_OFLD_CONNECTION_WR_RCV_SCALE_M) #define FW_OFLD_CONNECTION_WR_ASTID_S 0 #define FW_OFLD_CONNECTION_WR_ASTID_M 0xffffff #define 
FW_OFLD_CONNECTION_WR_ASTID_V(x) \ ((x) << FW_OFLD_CONNECTION_WR_ASTID_S) #define FW_OFLD_CONNECTION_WR_ASTID_G(x) \ (((x) >> FW_OFLD_CONNECTION_WR_ASTID_S) & FW_OFLD_CONNECTION_WR_ASTID_M) #define FW_OFLD_CONNECTION_WR_CPLRXDATAACK_S 15 #define FW_OFLD_CONNECTION_WR_CPLRXDATAACK_M 0x1 #define FW_OFLD_CONNECTION_WR_CPLRXDATAACK_V(x) \ ((x) << FW_OFLD_CONNECTION_WR_CPLRXDATAACK_S) #define FW_OFLD_CONNECTION_WR_CPLRXDATAACK_G(x) \ (((x) >> FW_OFLD_CONNECTION_WR_CPLRXDATAACK_S) & \ FW_OFLD_CONNECTION_WR_CPLRXDATAACK_M) #define FW_OFLD_CONNECTION_WR_CPLRXDATAACK_F \ FW_OFLD_CONNECTION_WR_CPLRXDATAACK_V(1U) #define FW_OFLD_CONNECTION_WR_CPLPASSACCEPTRPL_S 14 #define FW_OFLD_CONNECTION_WR_CPLPASSACCEPTRPL_M 0x1 #define FW_OFLD_CONNECTION_WR_CPLPASSACCEPTRPL_V(x) \ ((x) << FW_OFLD_CONNECTION_WR_CPLPASSACCEPTRPL_S) #define FW_OFLD_CONNECTION_WR_CPLPASSACCEPTRPL_G(x) \ (((x) >> FW_OFLD_CONNECTION_WR_CPLPASSACCEPTRPL_S) & \ FW_OFLD_CONNECTION_WR_CPLPASSACCEPTRPL_M) #define FW_OFLD_CONNECTION_WR_CPLPASSACCEPTRPL_F \ FW_OFLD_CONNECTION_WR_CPLPASSACCEPTRPL_V(1U) enum fw_flowc_mnem { FW_FLOWC_MNEM_PFNVFN, /* PFN [15:8] VFN [7:0] */ FW_FLOWC_MNEM_CH, FW_FLOWC_MNEM_PORT, FW_FLOWC_MNEM_IQID, FW_FLOWC_MNEM_SNDNXT, FW_FLOWC_MNEM_RCVNXT, FW_FLOWC_MNEM_SNDBUF, FW_FLOWC_MNEM_MSS, FW_FLOWC_MNEM_TXDATAPLEN_MAX, FW_FLOWC_MNEM_TCPSTATE, FW_FLOWC_MNEM_EOSTATE, FW_FLOWC_MNEM_SCHEDCLASS, FW_FLOWC_MNEM_DCBPRIO, FW_FLOWC_MNEM_SND_SCALE, FW_FLOWC_MNEM_RCV_SCALE, }; struct fw_flowc_mnemval { u8 mnemonic; u8 r4[3]; __be32 val; }; struct fw_flowc_wr { __be32 op_to_nparams; __be32 flowid_len16; struct fw_flowc_mnemval mnemval[0]; }; #define FW_FLOWC_WR_NPARAMS_S 0 #define FW_FLOWC_WR_NPARAMS_V(x) ((x) << FW_FLOWC_WR_NPARAMS_S) struct fw_ofld_tx_data_wr { __be32 op_to_immdlen; __be32 flowid_len16; __be32 plen; __be32 tunnel_to_proxy; }; #define FW_OFLD_TX_DATA_WR_TUNNEL_S 19 #define FW_OFLD_TX_DATA_WR_TUNNEL_V(x) ((x) << FW_OFLD_TX_DATA_WR_TUNNEL_S) #define FW_OFLD_TX_DATA_WR_SAVE_S 18 #define FW_OFLD_TX_DATA_WR_SAVE_V(x) ((x) << FW_OFLD_TX_DATA_WR_SAVE_S) #define FW_OFLD_TX_DATA_WR_FLUSH_S 17 #define FW_OFLD_TX_DATA_WR_FLUSH_V(x) ((x) << FW_OFLD_TX_DATA_WR_FLUSH_S) #define FW_OFLD_TX_DATA_WR_FLUSH_F FW_OFLD_TX_DATA_WR_FLUSH_V(1U) #define FW_OFLD_TX_DATA_WR_URGENT_S 16 #define FW_OFLD_TX_DATA_WR_URGENT_V(x) ((x) << FW_OFLD_TX_DATA_WR_URGENT_S) #define FW_OFLD_TX_DATA_WR_MORE_S 15 #define FW_OFLD_TX_DATA_WR_MORE_V(x) ((x) << FW_OFLD_TX_DATA_WR_MORE_S) #define FW_OFLD_TX_DATA_WR_SHOVE_S 14 #define FW_OFLD_TX_DATA_WR_SHOVE_V(x) ((x) << FW_OFLD_TX_DATA_WR_SHOVE_S) #define FW_OFLD_TX_DATA_WR_SHOVE_F FW_OFLD_TX_DATA_WR_SHOVE_V(1U) #define FW_OFLD_TX_DATA_WR_ULPMODE_S 10 #define FW_OFLD_TX_DATA_WR_ULPMODE_V(x) ((x) << FW_OFLD_TX_DATA_WR_ULPMODE_S) #define FW_OFLD_TX_DATA_WR_ULPSUBMODE_S 6 #define FW_OFLD_TX_DATA_WR_ULPSUBMODE_V(x) \ ((x) << FW_OFLD_TX_DATA_WR_ULPSUBMODE_S) struct fw_cmd_wr { __be32 op_dma; __be32 len16_pkd; __be64 cookie_daddr; }; #define FW_CMD_WR_DMA_S 17 #define FW_CMD_WR_DMA_V(x) ((x) << FW_CMD_WR_DMA_S) struct fw_eth_tx_pkt_vm_wr { __be32 op_immdlen; __be32 equiq_to_len16; __be32 r3[2]; u8 ethmacdst[6]; u8 ethmacsrc[6]; __be16 ethtype; __be16 vlantci; }; #define FW_CMD_MAX_TIMEOUT 10000 /* * If a host driver does a HELLO and discovers that there's already a MASTER * selected, we may have to wait for that MASTER to finish issuing RESET, * configuration and INITIALIZE commands. 
Also, there's a possibility that * our own HELLO may get lost if it happens right as the MASTER is issuing a * RESET command, so we need to be willing to make a few retries of our HELLO. */ #define FW_CMD_HELLO_TIMEOUT (3 * FW_CMD_MAX_TIMEOUT) #define FW_CMD_HELLO_RETRIES 3 enum fw_cmd_opcodes { FW_LDST_CMD = 0x01, FW_RESET_CMD = 0x03, FW_HELLO_CMD = 0x04, FW_BYE_CMD = 0x05, FW_INITIALIZE_CMD = 0x06, FW_CAPS_CONFIG_CMD = 0x07, FW_PARAMS_CMD = 0x08, FW_PFVF_CMD = 0x09, FW_IQ_CMD = 0x10, FW_EQ_MNGT_CMD = 0x11, FW_EQ_ETH_CMD = 0x12, FW_EQ_CTRL_CMD = 0x13, FW_EQ_OFLD_CMD = 0x21, FW_VI_CMD = 0x14, FW_VI_MAC_CMD = 0x15, FW_VI_RXMODE_CMD = 0x16, FW_VI_ENABLE_CMD = 0x17, FW_ACL_MAC_CMD = 0x18, FW_ACL_VLAN_CMD = 0x19, FW_VI_STATS_CMD = 0x1a, FW_PORT_CMD = 0x1b, FW_PORT_STATS_CMD = 0x1c, FW_PORT_LB_STATS_CMD = 0x1d, FW_PORT_TRACE_CMD = 0x1e, FW_PORT_TRACE_MMAP_CMD = 0x1f, FW_RSS_IND_TBL_CMD = 0x20, FW_RSS_GLB_CONFIG_CMD = 0x22, FW_RSS_VI_CONFIG_CMD = 0x23, FW_DEVLOG_CMD = 0x25, FW_CLIP_CMD = 0x28, FW_LASTC2E_CMD = 0x40, FW_ERROR_CMD = 0x80, FW_DEBUG_CMD = 0x81, }; enum fw_cmd_cap { FW_CMD_CAP_PF = 0x01, FW_CMD_CAP_DMAQ = 0x02, FW_CMD_CAP_PORT = 0x04, FW_CMD_CAP_PORTPROMISC = 0x08, FW_CMD_CAP_PORTSTATS = 0x10, FW_CMD_CAP_VF = 0x80, }; /* * Generic command header flit0 */ struct fw_cmd_hdr { __be32 hi; __be32 lo; }; #define FW_CMD_OP_S 24 #define FW_CMD_OP_M 0xff #define FW_CMD_OP_V(x) ((x) << FW_CMD_OP_S) #define FW_CMD_OP_G(x) (((x) >> FW_CMD_OP_S) & FW_CMD_OP_M) #define FW_CMD_REQUEST_S 23 #define FW_CMD_REQUEST_V(x) ((x) << FW_CMD_REQUEST_S) #define FW_CMD_REQUEST_F FW_CMD_REQUEST_V(1U) #define FW_CMD_READ_S 22 #define FW_CMD_READ_V(x) ((x) << FW_CMD_READ_S) #define FW_CMD_READ_F FW_CMD_READ_V(1U) #define FW_CMD_WRITE_S 21 #define FW_CMD_WRITE_V(x) ((x) << FW_CMD_WRITE_S) #define FW_CMD_WRITE_F FW_CMD_WRITE_V(1U) #define FW_CMD_EXEC_S 20 #define FW_CMD_EXEC_V(x) ((x) << FW_CMD_EXEC_S) #define FW_CMD_EXEC_F FW_CMD_EXEC_V(1U) #define FW_CMD_RAMASK_S 20 #define FW_CMD_RAMASK_V(x) ((x) << FW_CMD_RAMASK_S) #define FW_CMD_RETVAL_S 8 #define FW_CMD_RETVAL_M 0xff #define FW_CMD_RETVAL_V(x) ((x) << FW_CMD_RETVAL_S) #define FW_CMD_RETVAL_G(x) (((x) >> FW_CMD_RETVAL_S) & FW_CMD_RETVAL_M) #define FW_CMD_LEN16_S 0 #define FW_CMD_LEN16_V(x) ((x) << FW_CMD_LEN16_S) #define FW_LEN16(fw_struct) FW_CMD_LEN16_V(sizeof(fw_struct) / 16) enum fw_ldst_addrspc { FW_LDST_ADDRSPC_FIRMWARE = 0x0001, FW_LDST_ADDRSPC_SGE_EGRC = 0x0008, FW_LDST_ADDRSPC_SGE_INGC = 0x0009, FW_LDST_ADDRSPC_SGE_FLMC = 0x000a, FW_LDST_ADDRSPC_SGE_CONMC = 0x000b, FW_LDST_ADDRSPC_TP_PIO = 0x0010, FW_LDST_ADDRSPC_TP_TM_PIO = 0x0011, FW_LDST_ADDRSPC_TP_MIB = 0x0012, FW_LDST_ADDRSPC_MDIO = 0x0018, FW_LDST_ADDRSPC_MPS = 0x0020, FW_LDST_ADDRSPC_FUNC = 0x0028, FW_LDST_ADDRSPC_FUNC_PCIE = 0x0029, }; enum fw_ldst_mps_fid { FW_LDST_MPS_ATRB, FW_LDST_MPS_RPLC }; enum fw_ldst_func_access_ctl { FW_LDST_FUNC_ACC_CTL_VIID, FW_LDST_FUNC_ACC_CTL_FID }; enum fw_ldst_func_mod_index { FW_LDST_FUNC_MPS }; struct fw_ldst_cmd { __be32 op_to_addrspace; __be32 cycles_to_len16; union fw_ldst { struct fw_ldst_addrval { __be32 addr; __be32 val; } addrval; struct fw_ldst_idctxt { __be32 physid; __be32 msg_ctxtflush; __be32 ctxt_data7; __be32 ctxt_data6; __be32 ctxt_data5; __be32 ctxt_data4; __be32 ctxt_data3; __be32 ctxt_data2; __be32 ctxt_data1; __be32 ctxt_data0; } idctxt; struct fw_ldst_mdio { __be16 paddr_mmd; __be16 raddr; __be16 vctl; __be16 rval; } mdio; struct fw_ldst_cim_rq { u8 req_first64[8]; u8 req_second64[8]; u8 resp_first64[8]; u8 resp_second64[8]; __be32 r3[2];
} cim_rq; union fw_ldst_mps { struct fw_ldst_mps_rplc { __be16 fid_idx; __be16 rplcpf_pkd; __be32 rplc255_224; __be32 rplc223_192; __be32 rplc191_160; __be32 rplc159_128; __be32 rplc127_96; __be32 rplc95_64; __be32 rplc63_32; __be32 rplc31_0; } rplc; struct fw_ldst_mps_atrb { __be16 fid_mpsid; __be16 r2[3]; __be32 r3[2]; __be32 r4; __be32 atrb; __be16 vlan[16]; } atrb; } mps; struct fw_ldst_func { u8 access_ctl; u8 mod_index; __be16 ctl_id; __be32 offset; __be64 data0; __be64 data1; } func; struct fw_ldst_pcie { u8 ctrl_to_fn; u8 bnum; u8 r; u8 ext_r; u8 select_naccess; u8 pcie_fn; __be16 nset_pkd; __be32 data[12]; } pcie; struct fw_ldst_i2c_deprecated { u8 pid_pkd; u8 base; u8 boffset; u8 data; __be32 r9; } i2c_deprecated; struct fw_ldst_i2c { u8 pid; u8 did; u8 boffset; u8 blen; __be32 r9; __u8 data[48]; } i2c; struct fw_ldst_le { __be32 index; __be32 r9; u8 val[33]; u8 r11[7]; } le; } u; }; #define FW_LDST_CMD_ADDRSPACE_S 0 #define FW_LDST_CMD_ADDRSPACE_V(x) ((x) << FW_LDST_CMD_ADDRSPACE_S) #define FW_LDST_CMD_MSG_S 31 #define FW_LDST_CMD_MSG_V(x) ((x) << FW_LDST_CMD_MSG_S) #define FW_LDST_CMD_CTXTFLUSH_S 30 #define FW_LDST_CMD_CTXTFLUSH_V(x) ((x) << FW_LDST_CMD_CTXTFLUSH_S) #define FW_LDST_CMD_CTXTFLUSH_F FW_LDST_CMD_CTXTFLUSH_V(1U) #define FW_LDST_CMD_PADDR_S 8 #define FW_LDST_CMD_PADDR_V(x) ((x) << FW_LDST_CMD_PADDR_S) #define FW_LDST_CMD_MMD_S 0 #define FW_LDST_CMD_MMD_V(x) ((x) << FW_LDST_CMD_MMD_S) #define FW_LDST_CMD_FID_S 15 #define FW_LDST_CMD_FID_V(x) ((x) << FW_LDST_CMD_FID_S) #define FW_LDST_CMD_IDX_S 0 #define FW_LDST_CMD_IDX_V(x) ((x) << FW_LDST_CMD_IDX_S) #define FW_LDST_CMD_RPLCPF_S 0 #define FW_LDST_CMD_RPLCPF_V(x) ((x) << FW_LDST_CMD_RPLCPF_S) #define FW_LDST_CMD_LC_S 4 #define FW_LDST_CMD_LC_V(x) ((x) << FW_LDST_CMD_LC_S) #define FW_LDST_CMD_LC_F FW_LDST_CMD_LC_V(1U) #define FW_LDST_CMD_FN_S 0 #define FW_LDST_CMD_FN_V(x) ((x) << FW_LDST_CMD_FN_S) #define FW_LDST_CMD_NACCESS_S 0 #define FW_LDST_CMD_NACCESS_V(x) ((x) << FW_LDST_CMD_NACCESS_S) struct fw_reset_cmd { __be32 op_to_write; __be32 retval_len16; __be32 val; __be32 halt_pkd; }; #define FW_RESET_CMD_HALT_S 31 #define FW_RESET_CMD_HALT_M 0x1 #define FW_RESET_CMD_HALT_V(x) ((x) << FW_RESET_CMD_HALT_S) #define FW_RESET_CMD_HALT_G(x) \ (((x) >> FW_RESET_CMD_HALT_S) & FW_RESET_CMD_HALT_M) #define FW_RESET_CMD_HALT_F FW_RESET_CMD_HALT_V(1U) enum fw_hellow_cmd { fw_hello_cmd_stage_os = 0x0 }; struct fw_hello_cmd { __be32 op_to_write; __be32 retval_len16; __be32 err_to_clearinit; __be32 fwrev; }; #define FW_HELLO_CMD_ERR_S 31 #define FW_HELLO_CMD_ERR_V(x) ((x) << FW_HELLO_CMD_ERR_S) #define FW_HELLO_CMD_ERR_F FW_HELLO_CMD_ERR_V(1U) #define FW_HELLO_CMD_INIT_S 30 #define FW_HELLO_CMD_INIT_V(x) ((x) << FW_HELLO_CMD_INIT_S) #define FW_HELLO_CMD_INIT_F FW_HELLO_CMD_INIT_V(1U) #define FW_HELLO_CMD_MASTERDIS_S 29 #define FW_HELLO_CMD_MASTERDIS_V(x) ((x) << FW_HELLO_CMD_MASTERDIS_S) #define FW_HELLO_CMD_MASTERFORCE_S 28 #define FW_HELLO_CMD_MASTERFORCE_V(x) ((x) << FW_HELLO_CMD_MASTERFORCE_S) #define FW_HELLO_CMD_MBMASTER_S 24 #define FW_HELLO_CMD_MBMASTER_M 0xfU #define FW_HELLO_CMD_MBMASTER_V(x) ((x) << FW_HELLO_CMD_MBMASTER_S) #define FW_HELLO_CMD_MBMASTER_G(x) \ (((x) >> FW_HELLO_CMD_MBMASTER_S) & FW_HELLO_CMD_MBMASTER_M) #define FW_HELLO_CMD_MBASYNCNOTINT_S 23 #define FW_HELLO_CMD_MBASYNCNOTINT_V(x) ((x) << FW_HELLO_CMD_MBASYNCNOTINT_S) #define FW_HELLO_CMD_MBASYNCNOT_S 20 #define FW_HELLO_CMD_MBASYNCNOT_V(x) ((x) << FW_HELLO_CMD_MBASYNCNOT_S) #define FW_HELLO_CMD_STAGE_S 17 #define FW_HELLO_CMD_STAGE_V(x) ((x) << 
FW_HELLO_CMD_STAGE_S) #define FW_HELLO_CMD_CLEARINIT_S 16 #define FW_HELLO_CMD_CLEARINIT_V(x) ((x) << FW_HELLO_CMD_CLEARINIT_S) #define FW_HELLO_CMD_CLEARINIT_F FW_HELLO_CMD_CLEARINIT_V(1U) struct fw_bye_cmd { __be32 op_to_write; __be32 retval_len16; __be64 r3; }; struct fw_initialize_cmd { __be32 op_to_write; __be32 retval_len16; __be64 r3; }; enum fw_caps_config_hm { FW_CAPS_CONFIG_HM_PCIE = 0x00000001, FW_CAPS_CONFIG_HM_PL = 0x00000002, FW_CAPS_CONFIG_HM_SGE = 0x00000004, FW_CAPS_CONFIG_HM_CIM = 0x00000008, FW_CAPS_CONFIG_HM_ULPTX = 0x00000010, FW_CAPS_CONFIG_HM_TP = 0x00000020, FW_CAPS_CONFIG_HM_ULPRX = 0x00000040, FW_CAPS_CONFIG_HM_PMRX = 0x00000080, FW_CAPS_CONFIG_HM_PMTX = 0x00000100, FW_CAPS_CONFIG_HM_MC = 0x00000200, FW_CAPS_CONFIG_HM_LE = 0x00000400, FW_CAPS_CONFIG_HM_MPS = 0x00000800, FW_CAPS_CONFIG_HM_XGMAC = 0x00001000, FW_CAPS_CONFIG_HM_CPLSWITCH = 0x00002000, FW_CAPS_CONFIG_HM_T4DBG = 0x00004000, FW_CAPS_CONFIG_HM_MI = 0x00008000, FW_CAPS_CONFIG_HM_I2CM = 0x00010000, FW_CAPS_CONFIG_HM_NCSI = 0x00020000, FW_CAPS_CONFIG_HM_SMB = 0x00040000, FW_CAPS_CONFIG_HM_MA = 0x00080000, FW_CAPS_CONFIG_HM_EDRAM = 0x00100000, FW_CAPS_CONFIG_HM_PMU = 0x00200000, FW_CAPS_CONFIG_HM_UART = 0x00400000, FW_CAPS_CONFIG_HM_SF = 0x00800000, }; enum fw_caps_config_nbm { FW_CAPS_CONFIG_NBM_IPMI = 0x00000001, FW_CAPS_CONFIG_NBM_NCSI = 0x00000002, }; enum fw_caps_config_link { FW_CAPS_CONFIG_LINK_PPP = 0x00000001, FW_CAPS_CONFIG_LINK_QFC = 0x00000002, FW_CAPS_CONFIG_LINK_DCBX = 0x00000004, }; enum fw_caps_config_switch { FW_CAPS_CONFIG_SWITCH_INGRESS = 0x00000001, FW_CAPS_CONFIG_SWITCH_EGRESS = 0x00000002, }; enum fw_caps_config_nic { FW_CAPS_CONFIG_NIC = 0x00000001, FW_CAPS_CONFIG_NIC_VM = 0x00000002, }; enum fw_caps_config_ofld { FW_CAPS_CONFIG_OFLD = 0x00000001, }; enum fw_caps_config_rdma { FW_CAPS_CONFIG_RDMA_RDDP = 0x00000001, FW_CAPS_CONFIG_RDMA_RDMAC = 0x00000002, }; enum fw_caps_config_iscsi { FW_CAPS_CONFIG_ISCSI_INITIATOR_PDU = 0x00000001, FW_CAPS_CONFIG_ISCSI_TARGET_PDU = 0x00000002, FW_CAPS_CONFIG_ISCSI_INITIATOR_CNXOFLD = 0x00000004, FW_CAPS_CONFIG_ISCSI_TARGET_CNXOFLD = 0x00000008, }; enum fw_caps_config_fcoe { FW_CAPS_CONFIG_FCOE_INITIATOR = 0x00000001, FW_CAPS_CONFIG_FCOE_TARGET = 0x00000002, FW_CAPS_CONFIG_FCOE_CTRL_OFLD = 0x00000004, }; enum fw_memtype_cf { FW_MEMTYPE_CF_EDC0 = 0x0, FW_MEMTYPE_CF_EDC1 = 0x1, FW_MEMTYPE_CF_EXTMEM = 0x2, FW_MEMTYPE_CF_FLASH = 0x4, FW_MEMTYPE_CF_INTERNAL = 0x5, FW_MEMTYPE_CF_EXTMEM1 = 0x6, }; struct fw_caps_config_cmd { __be32 op_to_write; __be32 cfvalid_to_len16; __be32 r2; __be32 hwmbitmap; __be16 nbmcaps; __be16 linkcaps; __be16 switchcaps; __be16 r3; __be16 niccaps; __be16 ofldcaps; __be16 rdmacaps; __be16 r4; __be16 iscsicaps; __be16 fcoecaps; __be32 cfcsum; __be32 finiver; __be32 finicsum; }; #define FW_CAPS_CONFIG_CMD_CFVALID_S 27 #define FW_CAPS_CONFIG_CMD_CFVALID_V(x) ((x) << FW_CAPS_CONFIG_CMD_CFVALID_S) #define FW_CAPS_CONFIG_CMD_CFVALID_F FW_CAPS_CONFIG_CMD_CFVALID_V(1U) #define FW_CAPS_CONFIG_CMD_MEMTYPE_CF_S 24 #define FW_CAPS_CONFIG_CMD_MEMTYPE_CF_V(x) \ ((x) << FW_CAPS_CONFIG_CMD_MEMTYPE_CF_S) #define FW_CAPS_CONFIG_CMD_MEMADDR64K_CF_S 16 #define FW_CAPS_CONFIG_CMD_MEMADDR64K_CF_V(x) \ ((x) << FW_CAPS_CONFIG_CMD_MEMADDR64K_CF_S) /* * params command mnemonics */ enum fw_params_mnem { FW_PARAMS_MNEM_DEV = 1, /* device params */ FW_PARAMS_MNEM_PFVF = 2, /* function params */ FW_PARAMS_MNEM_REG = 3, /* limited register access */ FW_PARAMS_MNEM_DMAQ = 4, /* dma queue params */ FW_PARAMS_MNEM_CHNET = 5, /* chnet params */ FW_PARAMS_MNEM_LAST }; 
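/*
 * Illustrative helpers (an editorial sketch, not part of the firmware
 * ABI): every command starts with the generic header flit defined
 * earlier (struct fw_cmd_hdr).  The host composes the "hi" word from
 * the opcode plus the request and read/write flags, and recovers the
 * firmware's 8-bit return value from the second word of the reply with
 * FW_CMD_RETVAL_G.  htobe32()/be32toh() from <endian.h> are assumed to
 * be visible to the includer; the helper names are hypothetical.
 */
static inline __be32 fw_cmd_hdr_mkhi(unsigned int opcode, int rd)
{
	/* e.g. fw_cmd_hdr_mkhi(FW_HELLO_CMD, 0) for a HELLO write */
	return htobe32(FW_CMD_OP_V(opcode) | FW_CMD_REQUEST_F |
		       (rd ? FW_CMD_READ_F : FW_CMD_WRITE_F));
}

static inline unsigned int fw_cmd_hdr_retval(const struct fw_cmd_hdr *hdr)
{
	/* retval lives in the retval_len16 (lo) word of the reply */
	return FW_CMD_RETVAL_G(be32toh(hdr->lo));
}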
/* * device parameters */ enum fw_params_param_dev { FW_PARAMS_PARAM_DEV_CCLK = 0x00, /* chip core clock in khz */ FW_PARAMS_PARAM_DEV_PORTVEC = 0x01, /* the port vector */ FW_PARAMS_PARAM_DEV_NTID = 0x02, /* reads the number of TIDs * allocated by the device's * Lookup Engine */ FW_PARAMS_PARAM_DEV_FLOWC_BUFFIFO_SZ = 0x03, FW_PARAMS_PARAM_DEV_INTVER_NIC = 0x04, FW_PARAMS_PARAM_DEV_INTVER_VNIC = 0x05, FW_PARAMS_PARAM_DEV_INTVER_OFLD = 0x06, FW_PARAMS_PARAM_DEV_INTVER_RI = 0x07, FW_PARAMS_PARAM_DEV_INTVER_ISCSIPDU = 0x08, FW_PARAMS_PARAM_DEV_INTVER_ISCSI = 0x09, FW_PARAMS_PARAM_DEV_INTVER_FCOE = 0x0A, FW_PARAMS_PARAM_DEV_FWREV = 0x0B, FW_PARAMS_PARAM_DEV_TPREV = 0x0C, FW_PARAMS_PARAM_DEV_CF = 0x0D, FW_PARAMS_PARAM_DEV_PHYFW = 0x0F, FW_PARAMS_PARAM_DEV_DIAG = 0x11, FW_PARAMS_PARAM_DEV_MAXORDIRD_QP = 0x13, /* max supported QP IRD/ORD */ FW_PARAMS_PARAM_DEV_MAXIRD_ADAPTER = 0x14, /* max supported adap IRD */ FW_PARAMS_PARAM_DEV_ULPTX_MEMWRITE_DSGL = 0x17, FW_PARAMS_PARAM_DEV_FWCACHE = 0x18, }; /* * physical and virtual function parameters */ enum fw_params_param_pfvf { FW_PARAMS_PARAM_PFVF_RWXCAPS = 0x00, FW_PARAMS_PARAM_PFVF_ROUTE_START = 0x01, FW_PARAMS_PARAM_PFVF_ROUTE_END = 0x02, FW_PARAMS_PARAM_PFVF_CLIP_START = 0x03, FW_PARAMS_PARAM_PFVF_CLIP_END = 0x04, FW_PARAMS_PARAM_PFVF_FILTER_START = 0x05, FW_PARAMS_PARAM_PFVF_FILTER_END = 0x06, FW_PARAMS_PARAM_PFVF_SERVER_START = 0x07, FW_PARAMS_PARAM_PFVF_SERVER_END = 0x08, FW_PARAMS_PARAM_PFVF_TDDP_START = 0x09, FW_PARAMS_PARAM_PFVF_TDDP_END = 0x0A, FW_PARAMS_PARAM_PFVF_ISCSI_START = 0x0B, FW_PARAMS_PARAM_PFVF_ISCSI_END = 0x0C, FW_PARAMS_PARAM_PFVF_STAG_START = 0x0D, FW_PARAMS_PARAM_PFVF_STAG_END = 0x0E, FW_PARAMS_PARAM_PFVF_RQ_START = 0x1F, FW_PARAMS_PARAM_PFVF_RQ_END = 0x10, FW_PARAMS_PARAM_PFVF_PBL_START = 0x11, FW_PARAMS_PARAM_PFVF_PBL_END = 0x12, FW_PARAMS_PARAM_PFVF_L2T_START = 0x13, FW_PARAMS_PARAM_PFVF_L2T_END = 0x14, FW_PARAMS_PARAM_PFVF_SQRQ_START = 0x15, FW_PARAMS_PARAM_PFVF_SQRQ_END = 0x16, FW_PARAMS_PARAM_PFVF_CQ_START = 0x17, FW_PARAMS_PARAM_PFVF_CQ_END = 0x18, FW_PARAMS_PARAM_PFVF_SRQ_START = 0x19, FW_PARAMS_PARAM_PFVF_SRQ_END = 0x1A, FW_PARAMS_PARAM_PFVF_SCHEDCLASS_ETH = 0x20, FW_PARAMS_PARAM_PFVF_VIID = 0x24, FW_PARAMS_PARAM_PFVF_CPMASK = 0x25, FW_PARAMS_PARAM_PFVF_OCQ_START = 0x26, FW_PARAMS_PARAM_PFVF_OCQ_END = 0x27, FW_PARAMS_PARAM_PFVF_CONM_MAP = 0x28, FW_PARAMS_PARAM_PFVF_IQFLINT_START = 0x29, FW_PARAMS_PARAM_PFVF_IQFLINT_END = 0x2A, FW_PARAMS_PARAM_PFVF_EQ_START = 0x2B, FW_PARAMS_PARAM_PFVF_EQ_END = 0x2C, FW_PARAMS_PARAM_PFVF_ACTIVE_FILTER_START = 0x2D, FW_PARAMS_PARAM_PFVF_ACTIVE_FILTER_END = 0x2E, FW_PARAMS_PARAM_PFVF_ETHOFLD_END = 0x30, FW_PARAMS_PARAM_PFVF_CPLFW4MSG_ENCAP = 0x31 }; /* * dma queue parameters */ enum fw_params_param_dmaq { FW_PARAMS_PARAM_DMAQ_IQ_DCAEN_DCACPU = 0x00, FW_PARAMS_PARAM_DMAQ_IQ_INTCNTTHRESH = 0x01, FW_PARAMS_PARAM_DMAQ_EQ_CMPLIQID_MNGT = 0x10, FW_PARAMS_PARAM_DMAQ_EQ_CMPLIQID_CTRL = 0x11, FW_PARAMS_PARAM_DMAQ_EQ_SCHEDCLASS_ETH = 0x12, FW_PARAMS_PARAM_DMAQ_EQ_DCBPRIO_ETH = 0x13, FW_PARAMS_PARAM_DMAQ_CONM_CTXT = 0x20, }; enum fw_params_param_dev_phyfw { FW_PARAMS_PARAM_DEV_PHYFW_DOWNLOAD = 0x00, FW_PARAMS_PARAM_DEV_PHYFW_VERSION = 0x01, }; enum fw_params_param_dev_diag { FW_PARAM_DEV_DIAG_TMP = 0x00, FW_PARAM_DEV_DIAG_VDD = 0x01, }; enum fw_params_param_dev_fwcache { FW_PARAM_DEV_FWCACHE_FLUSH = 0x00, FW_PARAM_DEV_FWCACHE_FLUSHINV = 0x01, }; #define FW_PARAMS_MNEM_S 24 #define FW_PARAMS_MNEM_V(x) ((x) << FW_PARAMS_MNEM_S) #define FW_PARAMS_PARAM_X_S 16 #define FW_PARAMS_PARAM_X_V(x) ((x) << 
FW_PARAMS_PARAM_X_S) #define FW_PARAMS_PARAM_Y_S 8 #define FW_PARAMS_PARAM_Y_M 0xffU #define FW_PARAMS_PARAM_Y_V(x) ((x) << FW_PARAMS_PARAM_Y_S) #define FW_PARAMS_PARAM_Y_G(x) (((x) >> FW_PARAMS_PARAM_Y_S) &\ FW_PARAMS_PARAM_Y_M) #define FW_PARAMS_PARAM_Z_S 0 #define FW_PARAMS_PARAM_Z_M 0xffu #define FW_PARAMS_PARAM_Z_V(x) ((x) << FW_PARAMS_PARAM_Z_S) #define FW_PARAMS_PARAM_Z_G(x) (((x) >> FW_PARAMS_PARAM_Z_S) &\ FW_PARAMS_PARAM_Z_M) #define FW_PARAMS_PARAM_XYZ_S 0 #define FW_PARAMS_PARAM_XYZ_V(x) ((x) << FW_PARAMS_PARAM_XYZ_S) #define FW_PARAMS_PARAM_YZ_S 0 #define FW_PARAMS_PARAM_YZ_V(x) ((x) << FW_PARAMS_PARAM_YZ_S) struct fw_params_cmd { __be32 op_to_vfn; __be32 retval_len16; struct fw_params_param { __be32 mnem; __be32 val; } param[7]; }; #define FW_PARAMS_CMD_PFN_S 8 #define FW_PARAMS_CMD_PFN_V(x) ((x) << FW_PARAMS_CMD_PFN_S) #define FW_PARAMS_CMD_VFN_S 0 #define FW_PARAMS_CMD_VFN_V(x) ((x) << FW_PARAMS_CMD_VFN_S) struct fw_pfvf_cmd { __be32 op_to_vfn; __be32 retval_len16; __be32 niqflint_niq; __be32 type_to_neq; __be32 tc_to_nexactf; __be32 r_caps_to_nethctrl; __be16 nricq; __be16 nriqp; __be32 r4; }; #define FW_PFVF_CMD_PFN_S 8 #define FW_PFVF_CMD_PFN_V(x) ((x) << FW_PFVF_CMD_PFN_S) #define FW_PFVF_CMD_VFN_S 0 #define FW_PFVF_CMD_VFN_V(x) ((x) << FW_PFVF_CMD_VFN_S) #define FW_PFVF_CMD_NIQFLINT_S 20 #define FW_PFVF_CMD_NIQFLINT_M 0xfff #define FW_PFVF_CMD_NIQFLINT_V(x) ((x) << FW_PFVF_CMD_NIQFLINT_S) #define FW_PFVF_CMD_NIQFLINT_G(x) \ (((x) >> FW_PFVF_CMD_NIQFLINT_S) & FW_PFVF_CMD_NIQFLINT_M) #define FW_PFVF_CMD_NIQ_S 0 #define FW_PFVF_CMD_NIQ_M 0xfffff #define FW_PFVF_CMD_NIQ_V(x) ((x) << FW_PFVF_CMD_NIQ_S) #define FW_PFVF_CMD_NIQ_G(x) \ (((x) >> FW_PFVF_CMD_NIQ_S) & FW_PFVF_CMD_NIQ_M) #define FW_PFVF_CMD_TYPE_S 31 #define FW_PFVF_CMD_TYPE_M 0x1 #define FW_PFVF_CMD_TYPE_V(x) ((x) << FW_PFVF_CMD_TYPE_S) #define FW_PFVF_CMD_TYPE_G(x) \ (((x) >> FW_PFVF_CMD_TYPE_S) & FW_PFVF_CMD_TYPE_M) #define FW_PFVF_CMD_TYPE_F FW_PFVF_CMD_TYPE_V(1U) #define FW_PFVF_CMD_CMASK_S 24 #define FW_PFVF_CMD_CMASK_M 0xf #define FW_PFVF_CMD_CMASK_V(x) ((x) << FW_PFVF_CMD_CMASK_S) #define FW_PFVF_CMD_CMASK_G(x) \ (((x) >> FW_PFVF_CMD_CMASK_S) & FW_PFVF_CMD_CMASK_M) #define FW_PFVF_CMD_PMASK_S 20 #define FW_PFVF_CMD_PMASK_M 0xf #define FW_PFVF_CMD_PMASK_V(x) ((x) << FW_PFVF_CMD_PMASK_S) #define FW_PFVF_CMD_PMASK_G(x) \ (((x) >> FW_PFVF_CMD_PMASK_S) & FW_PFVF_CMD_PMASK_M) #define FW_PFVF_CMD_NEQ_S 0 #define FW_PFVF_CMD_NEQ_M 0xfffff #define FW_PFVF_CMD_NEQ_V(x) ((x) << FW_PFVF_CMD_NEQ_S) #define FW_PFVF_CMD_NEQ_G(x) \ (((x) >> FW_PFVF_CMD_NEQ_S) & FW_PFVF_CMD_NEQ_M) #define FW_PFVF_CMD_TC_S 24 #define FW_PFVF_CMD_TC_M 0xff #define FW_PFVF_CMD_TC_V(x) ((x) << FW_PFVF_CMD_TC_S) #define FW_PFVF_CMD_TC_G(x) (((x) >> FW_PFVF_CMD_TC_S) & FW_PFVF_CMD_TC_M) #define FW_PFVF_CMD_NVI_S 16 #define FW_PFVF_CMD_NVI_M 0xff #define FW_PFVF_CMD_NVI_V(x) ((x) << FW_PFVF_CMD_NVI_S) #define FW_PFVF_CMD_NVI_G(x) (((x) >> FW_PFVF_CMD_NVI_S) & FW_PFVF_CMD_NVI_M) #define FW_PFVF_CMD_NEXACTF_S 0 #define FW_PFVF_CMD_NEXACTF_M 0xffff #define FW_PFVF_CMD_NEXACTF_V(x) ((x) << FW_PFVF_CMD_NEXACTF_S) #define FW_PFVF_CMD_NEXACTF_G(x) \ (((x) >> FW_PFVF_CMD_NEXACTF_S) & FW_PFVF_CMD_NEXACTF_M) #define FW_PFVF_CMD_R_CAPS_S 24 #define FW_PFVF_CMD_R_CAPS_M 0xff #define FW_PFVF_CMD_R_CAPS_V(x) ((x) << FW_PFVF_CMD_R_CAPS_S) #define FW_PFVF_CMD_R_CAPS_G(x) \ (((x) >> FW_PFVF_CMD_R_CAPS_S) & FW_PFVF_CMD_R_CAPS_M) #define FW_PFVF_CMD_WX_CAPS_S 16 #define FW_PFVF_CMD_WX_CAPS_M 0xff #define FW_PFVF_CMD_WX_CAPS_V(x) ((x) << FW_PFVF_CMD_WX_CAPS_S) 
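/*
 * Illustrative sketch (editorial, not part of the firmware ABI): a
 * FW_PARAMS_CMD names each parameter by packing a mnemonic and the
 * X/Y/Z selectors into one 32-bit identifier in fw_params_param.mnem.
 * The example below builds the identifier for the device port-vector
 * parameter (FW_PARAMS_MNEM_DEV / FW_PARAMS_PARAM_DEV_PORTVEC);
 * htobe32() from <endian.h> is assumed, and the helper name is
 * hypothetical.
 */
static inline __be32 fw_params_portvec_param(void)
{
	return htobe32(FW_PARAMS_MNEM_V(FW_PARAMS_MNEM_DEV) |
		       FW_PARAMS_PARAM_X_V(FW_PARAMS_PARAM_DEV_PORTVEC));
}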
#define FW_PFVF_CMD_WX_CAPS_G(x) \ (((x) >> FW_PFVF_CMD_WX_CAPS_S) & FW_PFVF_CMD_WX_CAPS_M) #define FW_PFVF_CMD_NETHCTRL_S 0 #define FW_PFVF_CMD_NETHCTRL_M 0xffff #define FW_PFVF_CMD_NETHCTRL_V(x) ((x) << FW_PFVF_CMD_NETHCTRL_S) #define FW_PFVF_CMD_NETHCTRL_G(x) \ (((x) >> FW_PFVF_CMD_NETHCTRL_S) & FW_PFVF_CMD_NETHCTRL_M) enum fw_iq_type { FW_IQ_TYPE_FL_INT_CAP, FW_IQ_TYPE_NO_FL_INT_CAP }; struct fw_iq_cmd { __be32 op_to_vfn; __be32 alloc_to_len16; __be16 physiqid; __be16 iqid; __be16 fl0id; __be16 fl1id; __be32 type_to_iqandstindex; __be16 iqdroprss_to_iqesize; __be16 iqsize; __be64 iqaddr; __be32 iqns_to_fl0congen; __be16 fl0dcaen_to_fl0cidxfthresh; __be16 fl0size; __be64 fl0addr; __be32 fl1cngchmap_to_fl1congen; __be16 fl1dcaen_to_fl1cidxfthresh; __be16 fl1size; __be64 fl1addr; }; #define FW_IQ_CMD_PFN_S 8 #define FW_IQ_CMD_PFN_V(x) ((x) << FW_IQ_CMD_PFN_S) #define FW_IQ_CMD_VFN_S 0 #define FW_IQ_CMD_VFN_V(x) ((x) << FW_IQ_CMD_VFN_S) #define FW_IQ_CMD_ALLOC_S 31 #define FW_IQ_CMD_ALLOC_V(x) ((x) << FW_IQ_CMD_ALLOC_S) #define FW_IQ_CMD_ALLOC_F FW_IQ_CMD_ALLOC_V(1U) #define FW_IQ_CMD_FREE_S 30 #define FW_IQ_CMD_FREE_V(x) ((x) << FW_IQ_CMD_FREE_S) #define FW_IQ_CMD_FREE_F FW_IQ_CMD_FREE_V(1U) #define FW_IQ_CMD_MODIFY_S 29 #define FW_IQ_CMD_MODIFY_V(x) ((x) << FW_IQ_CMD_MODIFY_S) #define FW_IQ_CMD_MODIFY_F FW_IQ_CMD_MODIFY_V(1U) #define FW_IQ_CMD_IQSTART_S 28 #define FW_IQ_CMD_IQSTART_V(x) ((x) << FW_IQ_CMD_IQSTART_S) #define FW_IQ_CMD_IQSTART_F FW_IQ_CMD_IQSTART_V(1U) #define FW_IQ_CMD_IQSTOP_S 27 #define FW_IQ_CMD_IQSTOP_V(x) ((x) << FW_IQ_CMD_IQSTOP_S) #define FW_IQ_CMD_IQSTOP_F FW_IQ_CMD_IQSTOP_V(1U) #define FW_IQ_CMD_TYPE_S 29 #define FW_IQ_CMD_TYPE_V(x) ((x) << FW_IQ_CMD_TYPE_S) #define FW_IQ_CMD_IQASYNCH_S 28 #define FW_IQ_CMD_IQASYNCH_V(x) ((x) << FW_IQ_CMD_IQASYNCH_S) #define FW_IQ_CMD_VIID_S 16 #define FW_IQ_CMD_VIID_V(x) ((x) << FW_IQ_CMD_VIID_S) #define FW_IQ_CMD_IQANDST_S 15 #define FW_IQ_CMD_IQANDST_V(x) ((x) << FW_IQ_CMD_IQANDST_S) #define FW_IQ_CMD_IQANUS_S 14 #define FW_IQ_CMD_IQANUS_V(x) ((x) << FW_IQ_CMD_IQANUS_S) #define FW_IQ_CMD_IQANUD_S 12 #define FW_IQ_CMD_IQANUD_V(x) ((x) << FW_IQ_CMD_IQANUD_S) #define FW_IQ_CMD_IQANDSTINDEX_S 0 #define FW_IQ_CMD_IQANDSTINDEX_V(x) ((x) << FW_IQ_CMD_IQANDSTINDEX_S) #define FW_IQ_CMD_IQDROPRSS_S 15 #define FW_IQ_CMD_IQDROPRSS_V(x) ((x) << FW_IQ_CMD_IQDROPRSS_S) #define FW_IQ_CMD_IQDROPRSS_F FW_IQ_CMD_IQDROPRSS_V(1U) #define FW_IQ_CMD_IQGTSMODE_S 14 #define FW_IQ_CMD_IQGTSMODE_V(x) ((x) << FW_IQ_CMD_IQGTSMODE_S) #define FW_IQ_CMD_IQGTSMODE_F FW_IQ_CMD_IQGTSMODE_V(1U) #define FW_IQ_CMD_IQPCIECH_S 12 #define FW_IQ_CMD_IQPCIECH_V(x) ((x) << FW_IQ_CMD_IQPCIECH_S) #define FW_IQ_CMD_IQDCAEN_S 11 #define FW_IQ_CMD_IQDCAEN_V(x) ((x) << FW_IQ_CMD_IQDCAEN_S) #define FW_IQ_CMD_IQDCACPU_S 6 #define FW_IQ_CMD_IQDCACPU_V(x) ((x) << FW_IQ_CMD_IQDCACPU_S) #define FW_IQ_CMD_IQINTCNTTHRESH_S 4 #define FW_IQ_CMD_IQINTCNTTHRESH_V(x) ((x) << FW_IQ_CMD_IQINTCNTTHRESH_S) #define FW_IQ_CMD_IQO_S 3 #define FW_IQ_CMD_IQO_V(x) ((x) << FW_IQ_CMD_IQO_S) #define FW_IQ_CMD_IQO_F FW_IQ_CMD_IQO_V(1U) #define FW_IQ_CMD_IQCPRIO_S 2 #define FW_IQ_CMD_IQCPRIO_V(x) ((x) << FW_IQ_CMD_IQCPRIO_S) #define FW_IQ_CMD_IQESIZE_S 0 #define FW_IQ_CMD_IQESIZE_V(x) ((x) << FW_IQ_CMD_IQESIZE_S) #define FW_IQ_CMD_IQNS_S 31 #define FW_IQ_CMD_IQNS_V(x) ((x) << FW_IQ_CMD_IQNS_S) #define FW_IQ_CMD_IQRO_S 30 #define FW_IQ_CMD_IQRO_V(x) ((x) << FW_IQ_CMD_IQRO_S) #define FW_IQ_CMD_IQFLINTIQHSEN_S 28 #define FW_IQ_CMD_IQFLINTIQHSEN_V(x) ((x) << FW_IQ_CMD_IQFLINTIQHSEN_S) #define 
FW_IQ_CMD_IQFLINTCONGEN_S 27 #define FW_IQ_CMD_IQFLINTCONGEN_V(x) ((x) << FW_IQ_CMD_IQFLINTCONGEN_S) #define FW_IQ_CMD_IQFLINTCONGEN_F FW_IQ_CMD_IQFLINTCONGEN_V(1U) #define FW_IQ_CMD_IQFLINTISCSIC_S 26 #define FW_IQ_CMD_IQFLINTISCSIC_V(x) ((x) << FW_IQ_CMD_IQFLINTISCSIC_S) #define FW_IQ_CMD_FL0CNGCHMAP_S 20 #define FW_IQ_CMD_FL0CNGCHMAP_V(x) ((x) << FW_IQ_CMD_FL0CNGCHMAP_S) #define FW_IQ_CMD_FL0CACHELOCK_S 15 #define FW_IQ_CMD_FL0CACHELOCK_V(x) ((x) << FW_IQ_CMD_FL0CACHELOCK_S) #define FW_IQ_CMD_FL0DBP_S 14 #define FW_IQ_CMD_FL0DBP_V(x) ((x) << FW_IQ_CMD_FL0DBP_S) #define FW_IQ_CMD_FL0DATANS_S 13 #define FW_IQ_CMD_FL0DATANS_V(x) ((x) << FW_IQ_CMD_FL0DATANS_S) #define FW_IQ_CMD_FL0DATARO_S 12 #define FW_IQ_CMD_FL0DATARO_V(x) ((x) << FW_IQ_CMD_FL0DATARO_S) #define FW_IQ_CMD_FL0DATARO_F FW_IQ_CMD_FL0DATARO_V(1U) #define FW_IQ_CMD_FL0CONGCIF_S 11 #define FW_IQ_CMD_FL0CONGCIF_V(x) ((x) << FW_IQ_CMD_FL0CONGCIF_S) #define FW_IQ_CMD_FL0CONGCIF_F FW_IQ_CMD_FL0CONGCIF_V(1U) #define FW_IQ_CMD_FL0ONCHIP_S 10 #define FW_IQ_CMD_FL0ONCHIP_V(x) ((x) << FW_IQ_CMD_FL0ONCHIP_S) #define FW_IQ_CMD_FL0STATUSPGNS_S 9 #define FW_IQ_CMD_FL0STATUSPGNS_V(x) ((x) << FW_IQ_CMD_FL0STATUSPGNS_S) #define FW_IQ_CMD_FL0STATUSPGRO_S 8 #define FW_IQ_CMD_FL0STATUSPGRO_V(x) ((x) << FW_IQ_CMD_FL0STATUSPGRO_S) #define FW_IQ_CMD_FL0FETCHNS_S 7 #define FW_IQ_CMD_FL0FETCHNS_V(x) ((x) << FW_IQ_CMD_FL0FETCHNS_S) #define FW_IQ_CMD_FL0FETCHRO_S 6 #define FW_IQ_CMD_FL0FETCHRO_V(x) ((x) << FW_IQ_CMD_FL0FETCHRO_S) #define FW_IQ_CMD_FL0FETCHRO_F FW_IQ_CMD_FL0FETCHRO_V(1U) #define FW_IQ_CMD_FL0HOSTFCMODE_S 4 #define FW_IQ_CMD_FL0HOSTFCMODE_V(x) ((x) << FW_IQ_CMD_FL0HOSTFCMODE_S) #define FW_IQ_CMD_FL0CPRIO_S 3 #define FW_IQ_CMD_FL0CPRIO_V(x) ((x) << FW_IQ_CMD_FL0CPRIO_S) #define FW_IQ_CMD_FL0PADEN_S 2 #define FW_IQ_CMD_FL0PADEN_V(x) ((x) << FW_IQ_CMD_FL0PADEN_S) #define FW_IQ_CMD_FL0PADEN_F FW_IQ_CMD_FL0PADEN_V(1U) #define FW_IQ_CMD_FL0PACKEN_S 1 #define FW_IQ_CMD_FL0PACKEN_V(x) ((x) << FW_IQ_CMD_FL0PACKEN_S) #define FW_IQ_CMD_FL0PACKEN_F FW_IQ_CMD_FL0PACKEN_V(1U) #define FW_IQ_CMD_FL0CONGEN_S 0 #define FW_IQ_CMD_FL0CONGEN_V(x) ((x) << FW_IQ_CMD_FL0CONGEN_S) #define FW_IQ_CMD_FL0CONGEN_F FW_IQ_CMD_FL0CONGEN_V(1U) #define FW_IQ_CMD_FL0DCAEN_S 15 #define FW_IQ_CMD_FL0DCAEN_V(x) ((x) << FW_IQ_CMD_FL0DCAEN_S) #define FW_IQ_CMD_FL0DCACPU_S 10 #define FW_IQ_CMD_FL0DCACPU_V(x) ((x) << FW_IQ_CMD_FL0DCACPU_S) #define FW_IQ_CMD_FL0FBMIN_S 7 #define FW_IQ_CMD_FL0FBMIN_V(x) ((x) << FW_IQ_CMD_FL0FBMIN_S) #define FW_IQ_CMD_FL0FBMAX_S 4 #define FW_IQ_CMD_FL0FBMAX_V(x) ((x) << FW_IQ_CMD_FL0FBMAX_S) #define FW_IQ_CMD_FL0CIDXFTHRESHO_S 3 #define FW_IQ_CMD_FL0CIDXFTHRESHO_V(x) ((x) << FW_IQ_CMD_FL0CIDXFTHRESHO_S) #define FW_IQ_CMD_FL0CIDXFTHRESHO_F FW_IQ_CMD_FL0CIDXFTHRESHO_V(1U) #define FW_IQ_CMD_FL0CIDXFTHRESH_S 0 #define FW_IQ_CMD_FL0CIDXFTHRESH_V(x) ((x) << FW_IQ_CMD_FL0CIDXFTHRESH_S) #define FW_IQ_CMD_FL1CNGCHMAP_S 20 #define FW_IQ_CMD_FL1CNGCHMAP_V(x) ((x) << FW_IQ_CMD_FL1CNGCHMAP_S) #define FW_IQ_CMD_FL1CACHELOCK_S 15 #define FW_IQ_CMD_FL1CACHELOCK_V(x) ((x) << FW_IQ_CMD_FL1CACHELOCK_S) #define FW_IQ_CMD_FL1DBP_S 14 #define FW_IQ_CMD_FL1DBP_V(x) ((x) << FW_IQ_CMD_FL1DBP_S) #define FW_IQ_CMD_FL1DATANS_S 13 #define FW_IQ_CMD_FL1DATANS_V(x) ((x) << FW_IQ_CMD_FL1DATANS_S) #define FW_IQ_CMD_FL1DATARO_S 12 #define FW_IQ_CMD_FL1DATARO_V(x) ((x) << FW_IQ_CMD_FL1DATARO_S) #define FW_IQ_CMD_FL1CONGCIF_S 11 #define FW_IQ_CMD_FL1CONGCIF_V(x) ((x) << FW_IQ_CMD_FL1CONGCIF_S) #define FW_IQ_CMD_FL1ONCHIP_S 10 #define FW_IQ_CMD_FL1ONCHIP_V(x) ((x) << 
FW_IQ_CMD_FL1ONCHIP_S) #define FW_IQ_CMD_FL1STATUSPGNS_S 9 #define FW_IQ_CMD_FL1STATUSPGNS_V(x) ((x) << FW_IQ_CMD_FL1STATUSPGNS_S) #define FW_IQ_CMD_FL1STATUSPGRO_S 8 #define FW_IQ_CMD_FL1STATUSPGRO_V(x) ((x) << FW_IQ_CMD_FL1STATUSPGRO_S) #define FW_IQ_CMD_FL1FETCHNS_S 7 #define FW_IQ_CMD_FL1FETCHNS_V(x) ((x) << FW_IQ_CMD_FL1FETCHNS_S) #define FW_IQ_CMD_FL1FETCHRO_S 6 #define FW_IQ_CMD_FL1FETCHRO_V(x) ((x) << FW_IQ_CMD_FL1FETCHRO_S) #define FW_IQ_CMD_FL1HOSTFCMODE_S 4 #define FW_IQ_CMD_FL1HOSTFCMODE_V(x) ((x) << FW_IQ_CMD_FL1HOSTFCMODE_S) #define FW_IQ_CMD_FL1CPRIO_S 3 #define FW_IQ_CMD_FL1CPRIO_V(x) ((x) << FW_IQ_CMD_FL1CPRIO_S) #define FW_IQ_CMD_FL1PADEN_S 2 #define FW_IQ_CMD_FL1PADEN_V(x) ((x) << FW_IQ_CMD_FL1PADEN_S) #define FW_IQ_CMD_FL1PADEN_F FW_IQ_CMD_FL1PADEN_V(1U) #define FW_IQ_CMD_FL1PACKEN_S 1 #define FW_IQ_CMD_FL1PACKEN_V(x) ((x) << FW_IQ_CMD_FL1PACKEN_S) #define FW_IQ_CMD_FL1PACKEN_F FW_IQ_CMD_FL1PACKEN_V(1U) #define FW_IQ_CMD_FL1CONGEN_S 0 #define FW_IQ_CMD_FL1CONGEN_V(x) ((x) << FW_IQ_CMD_FL1CONGEN_S) #define FW_IQ_CMD_FL1CONGEN_F FW_IQ_CMD_FL1CONGEN_V(1U) #define FW_IQ_CMD_FL1DCAEN_S 15 #define FW_IQ_CMD_FL1DCAEN_V(x) ((x) << FW_IQ_CMD_FL1DCAEN_S) #define FW_IQ_CMD_FL1DCACPU_S 10 #define FW_IQ_CMD_FL1DCACPU_V(x) ((x) << FW_IQ_CMD_FL1DCACPU_S) #define FW_IQ_CMD_FL1FBMIN_S 7 #define FW_IQ_CMD_FL1FBMIN_V(x) ((x) << FW_IQ_CMD_FL1FBMIN_S) #define FW_IQ_CMD_FL1FBMAX_S 4 #define FW_IQ_CMD_FL1FBMAX_V(x) ((x) << FW_IQ_CMD_FL1FBMAX_S) #define FW_IQ_CMD_FL1CIDXFTHRESHO_S 3 #define FW_IQ_CMD_FL1CIDXFTHRESHO_V(x) ((x) << FW_IQ_CMD_FL1CIDXFTHRESHO_S) #define FW_IQ_CMD_FL1CIDXFTHRESHO_F FW_IQ_CMD_FL1CIDXFTHRESHO_V(1U) #define FW_IQ_CMD_FL1CIDXFTHRESH_S 0 #define FW_IQ_CMD_FL1CIDXFTHRESH_V(x) ((x) << FW_IQ_CMD_FL1CIDXFTHRESH_S) struct fw_eq_eth_cmd { __be32 op_to_vfn; __be32 alloc_to_len16; __be32 eqid_pkd; __be32 physeqid_pkd; __be32 fetchszm_to_iqid; __be32 dcaen_to_eqsize; __be64 eqaddr; __be32 viid_pkd; __be32 r8_lo; __be64 r9; }; #define FW_EQ_ETH_CMD_PFN_S 8 #define FW_EQ_ETH_CMD_PFN_V(x) ((x) << FW_EQ_ETH_CMD_PFN_S) #define FW_EQ_ETH_CMD_VFN_S 0 #define FW_EQ_ETH_CMD_VFN_V(x) ((x) << FW_EQ_ETH_CMD_VFN_S) #define FW_EQ_ETH_CMD_ALLOC_S 31 #define FW_EQ_ETH_CMD_ALLOC_V(x) ((x) << FW_EQ_ETH_CMD_ALLOC_S) #define FW_EQ_ETH_CMD_ALLOC_F FW_EQ_ETH_CMD_ALLOC_V(1U) #define FW_EQ_ETH_CMD_FREE_S 30 #define FW_EQ_ETH_CMD_FREE_V(x) ((x) << FW_EQ_ETH_CMD_FREE_S) #define FW_EQ_ETH_CMD_FREE_F FW_EQ_ETH_CMD_FREE_V(1U) #define FW_EQ_ETH_CMD_MODIFY_S 29 #define FW_EQ_ETH_CMD_MODIFY_V(x) ((x) << FW_EQ_ETH_CMD_MODIFY_S) #define FW_EQ_ETH_CMD_MODIFY_F FW_EQ_ETH_CMD_MODIFY_V(1U) #define FW_EQ_ETH_CMD_EQSTART_S 28 #define FW_EQ_ETH_CMD_EQSTART_V(x) ((x) << FW_EQ_ETH_CMD_EQSTART_S) #define FW_EQ_ETH_CMD_EQSTART_F FW_EQ_ETH_CMD_EQSTART_V(1U) #define FW_EQ_ETH_CMD_EQSTOP_S 27 #define FW_EQ_ETH_CMD_EQSTOP_V(x) ((x) << FW_EQ_ETH_CMD_EQSTOP_S) #define FW_EQ_ETH_CMD_EQSTOP_F FW_EQ_ETH_CMD_EQSTOP_V(1U) #define FW_EQ_ETH_CMD_EQID_S 0 #define FW_EQ_ETH_CMD_EQID_M 0xfffff #define FW_EQ_ETH_CMD_EQID_V(x) ((x) << FW_EQ_ETH_CMD_EQID_S) #define FW_EQ_ETH_CMD_EQID_G(x) \ (((x) >> FW_EQ_ETH_CMD_EQID_S) & FW_EQ_ETH_CMD_EQID_M) #define FW_EQ_ETH_CMD_PHYSEQID_S 0 #define FW_EQ_ETH_CMD_PHYSEQID_M 0xfffff #define FW_EQ_ETH_CMD_PHYSEQID_V(x) ((x) << FW_EQ_ETH_CMD_PHYSEQID_S) #define FW_EQ_ETH_CMD_PHYSEQID_G(x) \ (((x) >> FW_EQ_ETH_CMD_PHYSEQID_S) & FW_EQ_ETH_CMD_PHYSEQID_M) #define FW_EQ_ETH_CMD_FETCHSZM_S 26 #define FW_EQ_ETH_CMD_FETCHSZM_V(x) ((x) << FW_EQ_ETH_CMD_FETCHSZM_S) #define FW_EQ_ETH_CMD_FETCHSZM_F 
FW_EQ_ETH_CMD_FETCHSZM_V(1U) #define FW_EQ_ETH_CMD_STATUSPGNS_S 25 #define FW_EQ_ETH_CMD_STATUSPGNS_V(x) ((x) << FW_EQ_ETH_CMD_STATUSPGNS_S) #define FW_EQ_ETH_CMD_STATUSPGRO_S 24 #define FW_EQ_ETH_CMD_STATUSPGRO_V(x) ((x) << FW_EQ_ETH_CMD_STATUSPGRO_S) #define FW_EQ_ETH_CMD_FETCHNS_S 23 #define FW_EQ_ETH_CMD_FETCHNS_V(x) ((x) << FW_EQ_ETH_CMD_FETCHNS_S) #define FW_EQ_ETH_CMD_FETCHRO_S 22 #define FW_EQ_ETH_CMD_FETCHRO_V(x) ((x) << FW_EQ_ETH_CMD_FETCHRO_S) #define FW_EQ_ETH_CMD_FETCHRO_F FW_EQ_ETH_CMD_FETCHRO_V(1U) #define FW_EQ_ETH_CMD_HOSTFCMODE_S 20 #define FW_EQ_ETH_CMD_HOSTFCMODE_V(x) ((x) << FW_EQ_ETH_CMD_HOSTFCMODE_S) #define FW_EQ_ETH_CMD_CPRIO_S 19 #define FW_EQ_ETH_CMD_CPRIO_V(x) ((x) << FW_EQ_ETH_CMD_CPRIO_S) #define FW_EQ_ETH_CMD_ONCHIP_S 18 #define FW_EQ_ETH_CMD_ONCHIP_V(x) ((x) << FW_EQ_ETH_CMD_ONCHIP_S) #define FW_EQ_ETH_CMD_PCIECHN_S 16 #define FW_EQ_ETH_CMD_PCIECHN_V(x) ((x) << FW_EQ_ETH_CMD_PCIECHN_S) #define FW_EQ_ETH_CMD_IQID_S 0 #define FW_EQ_ETH_CMD_IQID_V(x) ((x) << FW_EQ_ETH_CMD_IQID_S) #define FW_EQ_ETH_CMD_DCAEN_S 31 #define FW_EQ_ETH_CMD_DCAEN_V(x) ((x) << FW_EQ_ETH_CMD_DCAEN_S) #define FW_EQ_ETH_CMD_DCACPU_S 26 #define FW_EQ_ETH_CMD_DCACPU_V(x) ((x) << FW_EQ_ETH_CMD_DCACPU_S) #define FW_EQ_ETH_CMD_FBMIN_S 23 #define FW_EQ_ETH_CMD_FBMIN_V(x) ((x) << FW_EQ_ETH_CMD_FBMIN_S) #define FW_EQ_ETH_CMD_FBMAX_S 20 #define FW_EQ_ETH_CMD_FBMAX_V(x) ((x) << FW_EQ_ETH_CMD_FBMAX_S) #define FW_EQ_ETH_CMD_CIDXFTHRESHO_S 19 #define FW_EQ_ETH_CMD_CIDXFTHRESHO_V(x) ((x) << FW_EQ_ETH_CMD_CIDXFTHRESHO_S) #define FW_EQ_ETH_CMD_CIDXFTHRESH_S 16 #define FW_EQ_ETH_CMD_CIDXFTHRESH_V(x) ((x) << FW_EQ_ETH_CMD_CIDXFTHRESH_S) #define FW_EQ_ETH_CMD_EQSIZE_S 0 #define FW_EQ_ETH_CMD_EQSIZE_V(x) ((x) << FW_EQ_ETH_CMD_EQSIZE_S) #define FW_EQ_ETH_CMD_AUTOEQUEQE_S 30 #define FW_EQ_ETH_CMD_AUTOEQUEQE_V(x) ((x) << FW_EQ_ETH_CMD_AUTOEQUEQE_S) #define FW_EQ_ETH_CMD_AUTOEQUEQE_F FW_EQ_ETH_CMD_AUTOEQUEQE_V(1U) #define FW_EQ_ETH_CMD_VIID_S 16 #define FW_EQ_ETH_CMD_VIID_V(x) ((x) << FW_EQ_ETH_CMD_VIID_S) struct fw_eq_ctrl_cmd { __be32 op_to_vfn; __be32 alloc_to_len16; __be32 cmpliqid_eqid; __be32 physeqid_pkd; __be32 fetchszm_to_iqid; __be32 dcaen_to_eqsize; __be64 eqaddr; }; #define FW_EQ_CTRL_CMD_PFN_S 8 #define FW_EQ_CTRL_CMD_PFN_V(x) ((x) << FW_EQ_CTRL_CMD_PFN_S) #define FW_EQ_CTRL_CMD_VFN_S 0 #define FW_EQ_CTRL_CMD_VFN_V(x) ((x) << FW_EQ_CTRL_CMD_VFN_S) #define FW_EQ_CTRL_CMD_ALLOC_S 31 #define FW_EQ_CTRL_CMD_ALLOC_V(x) ((x) << FW_EQ_CTRL_CMD_ALLOC_S) #define FW_EQ_CTRL_CMD_ALLOC_F FW_EQ_CTRL_CMD_ALLOC_V(1U) #define FW_EQ_CTRL_CMD_FREE_S 30 #define FW_EQ_CTRL_CMD_FREE_V(x) ((x) << FW_EQ_CTRL_CMD_FREE_S) #define FW_EQ_CTRL_CMD_FREE_F FW_EQ_CTRL_CMD_FREE_V(1U) #define FW_EQ_CTRL_CMD_MODIFY_S 29 #define FW_EQ_CTRL_CMD_MODIFY_V(x) ((x) << FW_EQ_CTRL_CMD_MODIFY_S) #define FW_EQ_CTRL_CMD_MODIFY_F FW_EQ_CTRL_CMD_MODIFY_V(1U) #define FW_EQ_CTRL_CMD_EQSTART_S 28 #define FW_EQ_CTRL_CMD_EQSTART_V(x) ((x) << FW_EQ_CTRL_CMD_EQSTART_S) #define FW_EQ_CTRL_CMD_EQSTART_F FW_EQ_CTRL_CMD_EQSTART_V(1U) #define FW_EQ_CTRL_CMD_EQSTOP_S 27 #define FW_EQ_CTRL_CMD_EQSTOP_V(x) ((x) << FW_EQ_CTRL_CMD_EQSTOP_S) #define FW_EQ_CTRL_CMD_EQSTOP_F FW_EQ_CTRL_CMD_EQSTOP_V(1U) #define FW_EQ_CTRL_CMD_CMPLIQID_S 20 #define FW_EQ_CTRL_CMD_CMPLIQID_V(x) ((x) << FW_EQ_CTRL_CMD_CMPLIQID_S) #define FW_EQ_CTRL_CMD_EQID_S 0 #define FW_EQ_CTRL_CMD_EQID_M 0xfffff #define FW_EQ_CTRL_CMD_EQID_V(x) ((x) << FW_EQ_CTRL_CMD_EQID_S) #define FW_EQ_CTRL_CMD_EQID_G(x) \ (((x) >> FW_EQ_CTRL_CMD_EQID_S) & FW_EQ_CTRL_CMD_EQID_M) #define FW_EQ_CTRL_CMD_PHYSEQID_S 
0 #define FW_EQ_CTRL_CMD_PHYSEQID_M 0xfffff #define FW_EQ_CTRL_CMD_PHYSEQID_G(x) \ (((x) >> FW_EQ_CTRL_CMD_PHYSEQID_S) & FW_EQ_CTRL_CMD_PHYSEQID_M) #define FW_EQ_CTRL_CMD_FETCHSZM_S 26 #define FW_EQ_CTRL_CMD_FETCHSZM_V(x) ((x) << FW_EQ_CTRL_CMD_FETCHSZM_S) #define FW_EQ_CTRL_CMD_FETCHSZM_F FW_EQ_CTRL_CMD_FETCHSZM_V(1U) #define FW_EQ_CTRL_CMD_STATUSPGNS_S 25 #define FW_EQ_CTRL_CMD_STATUSPGNS_V(x) ((x) << FW_EQ_CTRL_CMD_STATUSPGNS_S) #define FW_EQ_CTRL_CMD_STATUSPGNS_F FW_EQ_CTRL_CMD_STATUSPGNS_V(1U) #define FW_EQ_CTRL_CMD_STATUSPGRO_S 24 #define FW_EQ_CTRL_CMD_STATUSPGRO_V(x) ((x) << FW_EQ_CTRL_CMD_STATUSPGRO_S) #define FW_EQ_CTRL_CMD_STATUSPGRO_F FW_EQ_CTRL_CMD_STATUSPGRO_V(1U) #define FW_EQ_CTRL_CMD_FETCHNS_S 23 #define FW_EQ_CTRL_CMD_FETCHNS_V(x) ((x) << FW_EQ_CTRL_CMD_FETCHNS_S) #define FW_EQ_CTRL_CMD_FETCHNS_F FW_EQ_CTRL_CMD_FETCHNS_V(1U) #define FW_EQ_CTRL_CMD_FETCHRO_S 22 #define FW_EQ_CTRL_CMD_FETCHRO_V(x) ((x) << FW_EQ_CTRL_CMD_FETCHRO_S) #define FW_EQ_CTRL_CMD_FETCHRO_F FW_EQ_CTRL_CMD_FETCHRO_V(1U) #define FW_EQ_CTRL_CMD_HOSTFCMODE_S 20 #define FW_EQ_CTRL_CMD_HOSTFCMODE_V(x) ((x) << FW_EQ_CTRL_CMD_HOSTFCMODE_S) #define FW_EQ_CTRL_CMD_CPRIO_S 19 #define FW_EQ_CTRL_CMD_CPRIO_V(x) ((x) << FW_EQ_CTRL_CMD_CPRIO_S) #define FW_EQ_CTRL_CMD_ONCHIP_S 18 #define FW_EQ_CTRL_CMD_ONCHIP_V(x) ((x) << FW_EQ_CTRL_CMD_ONCHIP_S) #define FW_EQ_CTRL_CMD_PCIECHN_S 16 #define FW_EQ_CTRL_CMD_PCIECHN_V(x) ((x) << FW_EQ_CTRL_CMD_PCIECHN_S) #define FW_EQ_CTRL_CMD_IQID_S 0 #define FW_EQ_CTRL_CMD_IQID_V(x) ((x) << FW_EQ_CTRL_CMD_IQID_S) #define FW_EQ_CTRL_CMD_DCAEN_S 31 #define FW_EQ_CTRL_CMD_DCAEN_V(x) ((x) << FW_EQ_CTRL_CMD_DCAEN_S) #define FW_EQ_CTRL_CMD_DCACPU_S 26 #define FW_EQ_CTRL_CMD_DCACPU_V(x) ((x) << FW_EQ_CTRL_CMD_DCACPU_S) #define FW_EQ_CTRL_CMD_FBMIN_S 23 #define FW_EQ_CTRL_CMD_FBMIN_V(x) ((x) << FW_EQ_CTRL_CMD_FBMIN_S) #define FW_EQ_CTRL_CMD_FBMAX_S 20 #define FW_EQ_CTRL_CMD_FBMAX_V(x) ((x) << FW_EQ_CTRL_CMD_FBMAX_S) #define FW_EQ_CTRL_CMD_CIDXFTHRESHO_S 19 #define FW_EQ_CTRL_CMD_CIDXFTHRESHO_V(x) \ ((x) << FW_EQ_CTRL_CMD_CIDXFTHRESHO_S) #define FW_EQ_CTRL_CMD_CIDXFTHRESH_S 16 #define FW_EQ_CTRL_CMD_CIDXFTHRESH_V(x) ((x) << FW_EQ_CTRL_CMD_CIDXFTHRESH_S) #define FW_EQ_CTRL_CMD_EQSIZE_S 0 #define FW_EQ_CTRL_CMD_EQSIZE_V(x) ((x) << FW_EQ_CTRL_CMD_EQSIZE_S) struct fw_eq_ofld_cmd { __be32 op_to_vfn; __be32 alloc_to_len16; __be32 eqid_pkd; __be32 physeqid_pkd; __be32 fetchszm_to_iqid; __be32 dcaen_to_eqsize; __be64 eqaddr; }; #define FW_EQ_OFLD_CMD_PFN_S 8 #define FW_EQ_OFLD_CMD_PFN_V(x) ((x) << FW_EQ_OFLD_CMD_PFN_S) #define FW_EQ_OFLD_CMD_VFN_S 0 #define FW_EQ_OFLD_CMD_VFN_V(x) ((x) << FW_EQ_OFLD_CMD_VFN_S) #define FW_EQ_OFLD_CMD_ALLOC_S 31 #define FW_EQ_OFLD_CMD_ALLOC_V(x) ((x) << FW_EQ_OFLD_CMD_ALLOC_S) #define FW_EQ_OFLD_CMD_ALLOC_F FW_EQ_OFLD_CMD_ALLOC_V(1U) #define FW_EQ_OFLD_CMD_FREE_S 30 #define FW_EQ_OFLD_CMD_FREE_V(x) ((x) << FW_EQ_OFLD_CMD_FREE_S) #define FW_EQ_OFLD_CMD_FREE_F FW_EQ_OFLD_CMD_FREE_V(1U) #define FW_EQ_OFLD_CMD_MODIFY_S 29 #define FW_EQ_OFLD_CMD_MODIFY_V(x) ((x) << FW_EQ_OFLD_CMD_MODIFY_S) #define FW_EQ_OFLD_CMD_MODIFY_F FW_EQ_OFLD_CMD_MODIFY_V(1U) #define FW_EQ_OFLD_CMD_EQSTART_S 28 #define FW_EQ_OFLD_CMD_EQSTART_V(x) ((x) << FW_EQ_OFLD_CMD_EQSTART_S) #define FW_EQ_OFLD_CMD_EQSTART_F FW_EQ_OFLD_CMD_EQSTART_V(1U) #define FW_EQ_OFLD_CMD_EQSTOP_S 27 #define FW_EQ_OFLD_CMD_EQSTOP_V(x) ((x) << FW_EQ_OFLD_CMD_EQSTOP_S) #define FW_EQ_OFLD_CMD_EQSTOP_F FW_EQ_OFLD_CMD_EQSTOP_V(1U) #define FW_EQ_OFLD_CMD_EQID_S 0 #define FW_EQ_OFLD_CMD_EQID_M 0xfffff #define FW_EQ_OFLD_CMD_EQID_V(x) 
((x) << FW_EQ_OFLD_CMD_EQID_S) #define FW_EQ_OFLD_CMD_EQID_G(x) \ (((x) >> FW_EQ_OFLD_CMD_EQID_S) & FW_EQ_OFLD_CMD_EQID_M) #define FW_EQ_OFLD_CMD_PHYSEQID_S 0 #define FW_EQ_OFLD_CMD_PHYSEQID_M 0xfffff #define FW_EQ_OFLD_CMD_PHYSEQID_G(x) \ (((x) >> FW_EQ_OFLD_CMD_PHYSEQID_S) & FW_EQ_OFLD_CMD_PHYSEQID_M) #define FW_EQ_OFLD_CMD_FETCHSZM_S 26 #define FW_EQ_OFLD_CMD_FETCHSZM_V(x) ((x) << FW_EQ_OFLD_CMD_FETCHSZM_S) #define FW_EQ_OFLD_CMD_STATUSPGNS_S 25 #define FW_EQ_OFLD_CMD_STATUSPGNS_V(x) ((x) << FW_EQ_OFLD_CMD_STATUSPGNS_S) #define FW_EQ_OFLD_CMD_STATUSPGRO_S 24 #define FW_EQ_OFLD_CMD_STATUSPGRO_V(x) ((x) << FW_EQ_OFLD_CMD_STATUSPGRO_S) #define FW_EQ_OFLD_CMD_FETCHNS_S 23 #define FW_EQ_OFLD_CMD_FETCHNS_V(x) ((x) << FW_EQ_OFLD_CMD_FETCHNS_S) #define FW_EQ_OFLD_CMD_FETCHRO_S 22 #define FW_EQ_OFLD_CMD_FETCHRO_V(x) ((x) << FW_EQ_OFLD_CMD_FETCHRO_S) #define FW_EQ_OFLD_CMD_FETCHRO_F FW_EQ_OFLD_CMD_FETCHRO_V(1U) #define FW_EQ_OFLD_CMD_HOSTFCMODE_S 20 #define FW_EQ_OFLD_CMD_HOSTFCMODE_V(x) ((x) << FW_EQ_OFLD_CMD_HOSTFCMODE_S) #define FW_EQ_OFLD_CMD_CPRIO_S 19 #define FW_EQ_OFLD_CMD_CPRIO_V(x) ((x) << FW_EQ_OFLD_CMD_CPRIO_S) #define FW_EQ_OFLD_CMD_ONCHIP_S 18 #define FW_EQ_OFLD_CMD_ONCHIP_V(x) ((x) << FW_EQ_OFLD_CMD_ONCHIP_S) #define FW_EQ_OFLD_CMD_PCIECHN_S 16 #define FW_EQ_OFLD_CMD_PCIECHN_V(x) ((x) << FW_EQ_OFLD_CMD_PCIECHN_S) #define FW_EQ_OFLD_CMD_IQID_S 0 #define FW_EQ_OFLD_CMD_IQID_V(x) ((x) << FW_EQ_OFLD_CMD_IQID_S) #define FW_EQ_OFLD_CMD_DCAEN_S 31 #define FW_EQ_OFLD_CMD_DCAEN_V(x) ((x) << FW_EQ_OFLD_CMD_DCAEN_S) #define FW_EQ_OFLD_CMD_DCACPU_S 26 #define FW_EQ_OFLD_CMD_DCACPU_V(x) ((x) << FW_EQ_OFLD_CMD_DCACPU_S) #define FW_EQ_OFLD_CMD_FBMIN_S 23 #define FW_EQ_OFLD_CMD_FBMIN_V(x) ((x) << FW_EQ_OFLD_CMD_FBMIN_S) #define FW_EQ_OFLD_CMD_FBMAX_S 20 #define FW_EQ_OFLD_CMD_FBMAX_V(x) ((x) << FW_EQ_OFLD_CMD_FBMAX_S) #define FW_EQ_OFLD_CMD_CIDXFTHRESHO_S 19 #define FW_EQ_OFLD_CMD_CIDXFTHRESHO_V(x) \ ((x) << FW_EQ_OFLD_CMD_CIDXFTHRESHO_S) #define FW_EQ_OFLD_CMD_CIDXFTHRESH_S 16 #define FW_EQ_OFLD_CMD_CIDXFTHRESH_V(x) ((x) << FW_EQ_OFLD_CMD_CIDXFTHRESH_S) #define FW_EQ_OFLD_CMD_EQSIZE_S 0 #define FW_EQ_OFLD_CMD_EQSIZE_V(x) ((x) << FW_EQ_OFLD_CMD_EQSIZE_S) /* * Macros for VIID parsing: * VIID - [10:8] PFN, [7] VI Valid, [6:0] VI number */ #define FW_VIID_PFN_S 8 #define FW_VIID_PFN_M 0x7 #define FW_VIID_PFN_G(x) (((x) >> FW_VIID_PFN_S) & FW_VIID_PFN_M) #define FW_VIID_VIVLD_S 7 #define FW_VIID_VIVLD_M 0x1 #define FW_VIID_VIVLD_G(x) (((x) >> FW_VIID_VIVLD_S) & FW_VIID_VIVLD_M) #define FW_VIID_VIN_S 0 #define FW_VIID_VIN_M 0x7F #define FW_VIID_VIN_G(x) (((x) >> FW_VIID_VIN_S) & FW_VIID_VIN_M) struct fw_vi_cmd { __be32 op_to_vfn; __be32 alloc_to_len16; __be16 type_viid; u8 mac[6]; u8 portid_pkd; u8 nmac; u8 nmac0[6]; __be16 rsssize_pkd; u8 nmac1[6]; __be16 idsiiq_pkd; u8 nmac2[6]; __be16 idseiq_pkd; u8 nmac3[6]; __be64 r9; __be64 r10; }; #define FW_VI_CMD_PFN_S 8 #define FW_VI_CMD_PFN_V(x) ((x) << FW_VI_CMD_PFN_S) #define FW_VI_CMD_VFN_S 0 #define FW_VI_CMD_VFN_V(x) ((x) << FW_VI_CMD_VFN_S) #define FW_VI_CMD_ALLOC_S 31 #define FW_VI_CMD_ALLOC_V(x) ((x) << FW_VI_CMD_ALLOC_S) #define FW_VI_CMD_ALLOC_F FW_VI_CMD_ALLOC_V(1U) #define FW_VI_CMD_FREE_S 30 #define FW_VI_CMD_FREE_V(x) ((x) << FW_VI_CMD_FREE_S) #define FW_VI_CMD_FREE_F FW_VI_CMD_FREE_V(1U) #define FW_VI_CMD_VIID_S 0 #define FW_VI_CMD_VIID_M 0xfff #define FW_VI_CMD_VIID_V(x) ((x) << FW_VI_CMD_VIID_S) #define FW_VI_CMD_VIID_G(x) (((x) >> FW_VI_CMD_VIID_S) & FW_VI_CMD_VIID_M) #define FW_VI_CMD_PORTID_S 4 #define FW_VI_CMD_PORTID_M 0xf #define 
FW_VI_CMD_PORTID_V(x) ((x) << FW_VI_CMD_PORTID_S) #define FW_VI_CMD_PORTID_G(x) \ (((x) >> FW_VI_CMD_PORTID_S) & FW_VI_CMD_PORTID_M) #define FW_VI_CMD_RSSSIZE_S 0 #define FW_VI_CMD_RSSSIZE_M 0x7ff #define FW_VI_CMD_RSSSIZE_G(x) \ (((x) >> FW_VI_CMD_RSSSIZE_S) & FW_VI_CMD_RSSSIZE_M) /* Special VI_MAC command index ids */ #define FW_VI_MAC_ADD_MAC 0x3FF #define FW_VI_MAC_ADD_PERSIST_MAC 0x3FE #define FW_VI_MAC_MAC_BASED_FREE 0x3FD #define FW_CLS_TCAM_NUM_ENTRIES 336 enum fw_vi_mac_smac { FW_VI_MAC_MPS_TCAM_ENTRY, FW_VI_MAC_MPS_TCAM_ONLY, FW_VI_MAC_SMT_ONLY, FW_VI_MAC_SMT_AND_MPSTCAM }; enum fw_vi_mac_result { FW_VI_MAC_R_SUCCESS, FW_VI_MAC_R_F_NONEXISTENT_NOMEM, FW_VI_MAC_R_SMAC_FAIL, FW_VI_MAC_R_F_ACL_CHECK }; struct fw_vi_mac_cmd { __be32 op_to_viid; __be32 freemacs_to_len16; union fw_vi_mac { struct fw_vi_mac_exact { __be16 valid_to_idx; u8 macaddr[6]; } exact[7]; struct fw_vi_mac_hash { __be64 hashvec; } hash; } u; }; #define FW_VI_MAC_CMD_VIID_S 0 #define FW_VI_MAC_CMD_VIID_V(x) ((x) << FW_VI_MAC_CMD_VIID_S) #define FW_VI_MAC_CMD_FREEMACS_S 31 #define FW_VI_MAC_CMD_FREEMACS_V(x) ((x) << FW_VI_MAC_CMD_FREEMACS_S) #define FW_VI_MAC_CMD_HASHVECEN_S 23 #define FW_VI_MAC_CMD_HASHVECEN_V(x) ((x) << FW_VI_MAC_CMD_HASHVECEN_S) #define FW_VI_MAC_CMD_HASHVECEN_F FW_VI_MAC_CMD_HASHVECEN_V(1U) #define FW_VI_MAC_CMD_HASHUNIEN_S 22 #define FW_VI_MAC_CMD_HASHUNIEN_V(x) ((x) << FW_VI_MAC_CMD_HASHUNIEN_S) #define FW_VI_MAC_CMD_VALID_S 15 #define FW_VI_MAC_CMD_VALID_V(x) ((x) << FW_VI_MAC_CMD_VALID_S) #define FW_VI_MAC_CMD_VALID_F FW_VI_MAC_CMD_VALID_V(1U) #define FW_VI_MAC_CMD_PRIO_S 12 #define FW_VI_MAC_CMD_PRIO_V(x) ((x) << FW_VI_MAC_CMD_PRIO_S) #define FW_VI_MAC_CMD_SMAC_RESULT_S 10 #define FW_VI_MAC_CMD_SMAC_RESULT_M 0x3 #define FW_VI_MAC_CMD_SMAC_RESULT_V(x) ((x) << FW_VI_MAC_CMD_SMAC_RESULT_S) #define FW_VI_MAC_CMD_SMAC_RESULT_G(x) \ (((x) >> FW_VI_MAC_CMD_SMAC_RESULT_S) & FW_VI_MAC_CMD_SMAC_RESULT_M) #define FW_VI_MAC_CMD_IDX_S 0 #define FW_VI_MAC_CMD_IDX_M 0x3ff #define FW_VI_MAC_CMD_IDX_V(x) ((x) << FW_VI_MAC_CMD_IDX_S) #define FW_VI_MAC_CMD_IDX_G(x) \ (((x) >> FW_VI_MAC_CMD_IDX_S) & FW_VI_MAC_CMD_IDX_M) #define FW_RXMODE_MTU_NO_CHG 65535 struct fw_vi_rxmode_cmd { __be32 op_to_viid; __be32 retval_len16; __be32 mtu_to_vlanexen; __be32 r4_lo; }; #define FW_VI_RXMODE_CMD_VIID_S 0 #define FW_VI_RXMODE_CMD_VIID_V(x) ((x) << FW_VI_RXMODE_CMD_VIID_S) #define FW_VI_RXMODE_CMD_MTU_S 16 #define FW_VI_RXMODE_CMD_MTU_M 0xffff #define FW_VI_RXMODE_CMD_MTU_V(x) ((x) << FW_VI_RXMODE_CMD_MTU_S) #define FW_VI_RXMODE_CMD_PROMISCEN_S 14 #define FW_VI_RXMODE_CMD_PROMISCEN_M 0x3 #define FW_VI_RXMODE_CMD_PROMISCEN_V(x) ((x) << FW_VI_RXMODE_CMD_PROMISCEN_S) #define FW_VI_RXMODE_CMD_ALLMULTIEN_S 12 #define FW_VI_RXMODE_CMD_ALLMULTIEN_M 0x3 #define FW_VI_RXMODE_CMD_ALLMULTIEN_V(x) \ ((x) << FW_VI_RXMODE_CMD_ALLMULTIEN_S) #define FW_VI_RXMODE_CMD_BROADCASTEN_S 10 #define FW_VI_RXMODE_CMD_BROADCASTEN_M 0x3 #define FW_VI_RXMODE_CMD_BROADCASTEN_V(x) \ ((x) << FW_VI_RXMODE_CMD_BROADCASTEN_S) #define FW_VI_RXMODE_CMD_VLANEXEN_S 8 #define FW_VI_RXMODE_CMD_VLANEXEN_M 0x3 #define FW_VI_RXMODE_CMD_VLANEXEN_V(x) ((x) << FW_VI_RXMODE_CMD_VLANEXEN_S) struct fw_vi_enable_cmd { __be32 op_to_viid; __be32 ien_to_len16; __be16 blinkdur; __be16 r3; __be32 r4; }; #define FW_VI_ENABLE_CMD_VIID_S 0 #define FW_VI_ENABLE_CMD_VIID_V(x) ((x) << FW_VI_ENABLE_CMD_VIID_S) #define FW_VI_ENABLE_CMD_IEN_S 31 #define FW_VI_ENABLE_CMD_IEN_V(x) ((x) << FW_VI_ENABLE_CMD_IEN_S) #define FW_VI_ENABLE_CMD_EEN_S 30 #define FW_VI_ENABLE_CMD_EEN_V(x) ((x) << 
FW_VI_ENABLE_CMD_EEN_S) #define FW_VI_ENABLE_CMD_LED_S 29 #define FW_VI_ENABLE_CMD_LED_V(x) ((x) << FW_VI_ENABLE_CMD_LED_S) #define FW_VI_ENABLE_CMD_LED_F FW_VI_ENABLE_CMD_LED_V(1U) #define FW_VI_ENABLE_CMD_DCB_INFO_S 28 #define FW_VI_ENABLE_CMD_DCB_INFO_V(x) ((x) << FW_VI_ENABLE_CMD_DCB_INFO_S) /* VI VF stats offset definitions */ #define VI_VF_NUM_STATS 16 enum fw_vi_stats_vf_index { FW_VI_VF_STAT_TX_BCAST_BYTES_IX, FW_VI_VF_STAT_TX_BCAST_FRAMES_IX, FW_VI_VF_STAT_TX_MCAST_BYTES_IX, FW_VI_VF_STAT_TX_MCAST_FRAMES_IX, FW_VI_VF_STAT_TX_UCAST_BYTES_IX, FW_VI_VF_STAT_TX_UCAST_FRAMES_IX, FW_VI_VF_STAT_TX_DROP_FRAMES_IX, FW_VI_VF_STAT_TX_OFLD_BYTES_IX, FW_VI_VF_STAT_TX_OFLD_FRAMES_IX, FW_VI_VF_STAT_RX_BCAST_BYTES_IX, FW_VI_VF_STAT_RX_BCAST_FRAMES_IX, FW_VI_VF_STAT_RX_MCAST_BYTES_IX, FW_VI_VF_STAT_RX_MCAST_FRAMES_IX, FW_VI_VF_STAT_RX_UCAST_BYTES_IX, FW_VI_VF_STAT_RX_UCAST_FRAMES_IX, FW_VI_VF_STAT_RX_ERR_FRAMES_IX }; /* VI PF stats offset definitions */ #define VI_PF_NUM_STATS 17 enum fw_vi_stats_pf_index { FW_VI_PF_STAT_TX_BCAST_BYTES_IX, FW_VI_PF_STAT_TX_BCAST_FRAMES_IX, FW_VI_PF_STAT_TX_MCAST_BYTES_IX, FW_VI_PF_STAT_TX_MCAST_FRAMES_IX, FW_VI_PF_STAT_TX_UCAST_BYTES_IX, FW_VI_PF_STAT_TX_UCAST_FRAMES_IX, FW_VI_PF_STAT_TX_OFLD_BYTES_IX, FW_VI_PF_STAT_TX_OFLD_FRAMES_IX, FW_VI_PF_STAT_RX_BYTES_IX, FW_VI_PF_STAT_RX_FRAMES_IX, FW_VI_PF_STAT_RX_BCAST_BYTES_IX, FW_VI_PF_STAT_RX_BCAST_FRAMES_IX, FW_VI_PF_STAT_RX_MCAST_BYTES_IX, FW_VI_PF_STAT_RX_MCAST_FRAMES_IX, FW_VI_PF_STAT_RX_UCAST_BYTES_IX, FW_VI_PF_STAT_RX_UCAST_FRAMES_IX, FW_VI_PF_STAT_RX_ERR_FRAMES_IX }; struct fw_vi_stats_cmd { __be32 op_to_viid; __be32 retval_len16; union fw_vi_stats { struct fw_vi_stats_ctl { __be16 nstats_ix; __be16 r6; __be32 r7; __be64 stat0; __be64 stat1; __be64 stat2; __be64 stat3; __be64 stat4; __be64 stat5; } ctl; struct fw_vi_stats_pf { __be64 tx_bcast_bytes; __be64 tx_bcast_frames; __be64 tx_mcast_bytes; __be64 tx_mcast_frames; __be64 tx_ucast_bytes; __be64 tx_ucast_frames; __be64 tx_offload_bytes; __be64 tx_offload_frames; __be64 rx_pf_bytes; __be64 rx_pf_frames; __be64 rx_bcast_bytes; __be64 rx_bcast_frames; __be64 rx_mcast_bytes; __be64 rx_mcast_frames; __be64 rx_ucast_bytes; __be64 rx_ucast_frames; __be64 rx_err_frames; } pf; struct fw_vi_stats_vf { __be64 tx_bcast_bytes; __be64 tx_bcast_frames; __be64 tx_mcast_bytes; __be64 tx_mcast_frames; __be64 tx_ucast_bytes; __be64 tx_ucast_frames; __be64 tx_drop_frames; __be64 tx_offload_bytes; __be64 tx_offload_frames; __be64 rx_bcast_bytes; __be64 rx_bcast_frames; __be64 rx_mcast_bytes; __be64 rx_mcast_frames; __be64 rx_ucast_bytes; __be64 rx_ucast_frames; __be64 rx_err_frames; } vf; } u; }; #define FW_VI_STATS_CMD_VIID_S 0 #define FW_VI_STATS_CMD_VIID_V(x) ((x) << FW_VI_STATS_CMD_VIID_S) #define FW_VI_STATS_CMD_NSTATS_S 12 #define FW_VI_STATS_CMD_NSTATS_V(x) ((x) << FW_VI_STATS_CMD_NSTATS_S) #define FW_VI_STATS_CMD_IX_S 0 #define FW_VI_STATS_CMD_IX_V(x) ((x) << FW_VI_STATS_CMD_IX_S) struct fw_acl_mac_cmd { __be32 op_to_vfn; __be32 en_to_len16; u8 nmac; u8 r3[7]; __be16 r4; u8 macaddr0[6]; __be16 r5; u8 macaddr1[6]; __be16 r6; u8 macaddr2[6]; __be16 r7; u8 macaddr3[6]; }; #define FW_ACL_MAC_CMD_PFN_S 8 #define FW_ACL_MAC_CMD_PFN_V(x) ((x) << FW_ACL_MAC_CMD_PFN_S) #define FW_ACL_MAC_CMD_VFN_S 0 #define FW_ACL_MAC_CMD_VFN_V(x) ((x) << FW_ACL_MAC_CMD_VFN_S) #define FW_ACL_MAC_CMD_EN_S 31 #define FW_ACL_MAC_CMD_EN_V(x) ((x) << FW_ACL_MAC_CMD_EN_S) struct fw_acl_vlan_cmd { __be32 op_to_vfn; __be32 en_to_len16; u8 nvlan; u8 dropnovlan_fm; u8 r3_lo[6]; __be16 vlanid[16]; }; #define 
FW_ACL_VLAN_CMD_PFN_S 8 #define FW_ACL_VLAN_CMD_PFN_V(x) ((x) << FW_ACL_VLAN_CMD_PFN_S) #define FW_ACL_VLAN_CMD_VFN_S 0 #define FW_ACL_VLAN_CMD_VFN_V(x) ((x) << FW_ACL_VLAN_CMD_VFN_S) #define FW_ACL_VLAN_CMD_EN_S 31 #define FW_ACL_VLAN_CMD_EN_V(x) ((x) << FW_ACL_VLAN_CMD_EN_S) #define FW_ACL_VLAN_CMD_DROPNOVLAN_S 7 #define FW_ACL_VLAN_CMD_DROPNOVLAN_V(x) ((x) << FW_ACL_VLAN_CMD_DROPNOVLAN_S) #define FW_ACL_VLAN_CMD_FM_S 6 #define FW_ACL_VLAN_CMD_FM_V(x) ((x) << FW_ACL_VLAN_CMD_FM_S) enum fw_port_cap { FW_PORT_CAP_SPEED_100M = 0x0001, FW_PORT_CAP_SPEED_1G = 0x0002, FW_PORT_CAP_SPEED_25G = 0x0004, FW_PORT_CAP_SPEED_10G = 0x0008, FW_PORT_CAP_SPEED_40G = 0x0010, FW_PORT_CAP_SPEED_100G = 0x0020, FW_PORT_CAP_FC_RX = 0x0040, FW_PORT_CAP_FC_TX = 0x0080, FW_PORT_CAP_ANEG = 0x0100, FW_PORT_CAP_MDIX = 0x0200, FW_PORT_CAP_MDIAUTO = 0x0400, FW_PORT_CAP_FEC = 0x0800, FW_PORT_CAP_TECHKR = 0x1000, FW_PORT_CAP_TECHKX4 = 0x2000, FW_PORT_CAP_802_3_PAUSE = 0x4000, FW_PORT_CAP_802_3_ASM_DIR = 0x8000, }; #define FW_PORT_CAP_SPEED_S 0 #define FW_PORT_CAP_SPEED_M 0x3f #define FW_PORT_CAP_SPEED_V(x) ((x) << FW_PORT_CAP_SPEED_S) #define FW_PORT_CAP_SPEED_G(x) \ (((x) >> FW_PORT_CAP_SPEED_S) & FW_PORT_CAP_SPEED_M) enum fw_port_mdi { FW_PORT_CAP_MDI_UNCHANGED, FW_PORT_CAP_MDI_AUTO, FW_PORT_CAP_MDI_F_STRAIGHT, FW_PORT_CAP_MDI_F_CROSSOVER }; #define FW_PORT_CAP_MDI_S 9 #define FW_PORT_CAP_MDI_V(x) ((x) << FW_PORT_CAP_MDI_S) enum fw_port_action { FW_PORT_ACTION_L1_CFG = 0x0001, FW_PORT_ACTION_L2_CFG = 0x0002, FW_PORT_ACTION_GET_PORT_INFO = 0x0003, FW_PORT_ACTION_L2_PPP_CFG = 0x0004, FW_PORT_ACTION_L2_DCB_CFG = 0x0005, FW_PORT_ACTION_DCB_READ_TRANS = 0x0006, FW_PORT_ACTION_DCB_READ_RECV = 0x0007, FW_PORT_ACTION_DCB_READ_DET = 0x0008, FW_PORT_ACTION_LOW_PWR_TO_NORMAL = 0x0010, FW_PORT_ACTION_L1_LOW_PWR_EN = 0x0011, FW_PORT_ACTION_L2_WOL_MODE_EN = 0x0012, FW_PORT_ACTION_LPBK_TO_NORMAL = 0x0020, FW_PORT_ACTION_L1_LPBK = 0x0021, FW_PORT_ACTION_L1_PMA_LPBK = 0x0022, FW_PORT_ACTION_L1_PCS_LPBK = 0x0023, FW_PORT_ACTION_L1_PHYXS_CSIDE_LPBK = 0x0024, FW_PORT_ACTION_L1_PHYXS_ESIDE_LPBK = 0x0025, FW_PORT_ACTION_PHY_RESET = 0x0040, FW_PORT_ACTION_PMA_RESET = 0x0041, FW_PORT_ACTION_PCS_RESET = 0x0042, FW_PORT_ACTION_PHYXS_RESET = 0x0043, FW_PORT_ACTION_DTEXS_REEST = 0x0044, FW_PORT_ACTION_AN_RESET = 0x0045 }; enum fw_port_l2cfg_ctlbf { FW_PORT_L2_CTLBF_OVLAN0 = 0x01, FW_PORT_L2_CTLBF_OVLAN1 = 0x02, FW_PORT_L2_CTLBF_OVLAN2 = 0x04, FW_PORT_L2_CTLBF_OVLAN3 = 0x08, FW_PORT_L2_CTLBF_IVLAN = 0x10, FW_PORT_L2_CTLBF_TXIPG = 0x20 }; enum fw_port_dcb_versions { FW_PORT_DCB_VER_UNKNOWN, FW_PORT_DCB_VER_CEE1D0, FW_PORT_DCB_VER_CEE1D01, FW_PORT_DCB_VER_IEEE, FW_PORT_DCB_VER_AUTO = 7 }; enum fw_port_dcb_cfg { FW_PORT_DCB_CFG_PG = 0x01, FW_PORT_DCB_CFG_PFC = 0x02, FW_PORT_DCB_CFG_APPL = 0x04 }; enum fw_port_dcb_cfg_rc { FW_PORT_DCB_CFG_SUCCESS = 0x0, FW_PORT_DCB_CFG_ERROR = 0x1 }; enum fw_port_dcb_type { FW_PORT_DCB_TYPE_PGID = 0x00, FW_PORT_DCB_TYPE_PGRATE = 0x01, FW_PORT_DCB_TYPE_PRIORATE = 0x02, FW_PORT_DCB_TYPE_PFC = 0x03, FW_PORT_DCB_TYPE_APP_ID = 0x04, FW_PORT_DCB_TYPE_CONTROL = 0x05, }; enum fw_port_dcb_feature_state { FW_PORT_DCB_FEATURE_STATE_PENDING = 0x0, FW_PORT_DCB_FEATURE_STATE_SUCCESS = 0x1, FW_PORT_DCB_FEATURE_STATE_ERROR = 0x2, FW_PORT_DCB_FEATURE_STATE_TIMEOUT = 0x3, }; struct fw_port_cmd { __be32 op_to_portid; __be32 action_to_len16; union fw_port { struct fw_port_l1cfg { __be32 rcap; __be32 r; } l1cfg; struct fw_port_l2cfg { __u8 ctlbf; __u8 ovlan3_to_ivlan0; __be16 ivlantype; __be16 txipg_force_pinfo; __be16 mtu; __be16 
ovlan0mask; __be16 ovlan0type; __be16 ovlan1mask; __be16 ovlan1type; __be16 ovlan2mask; __be16 ovlan2type; __be16 ovlan3mask; __be16 ovlan3type; } l2cfg; struct fw_port_info { __be32 lstatus_to_modtype; __be16 pcap; __be16 acap; __be16 mtu; __u8 cbllen; __u8 auxlinfo; __u8 dcbxdis_pkd; __u8 r8_lo; __be16 lpacap; __be64 r9; } info; struct fw_port_diags { __u8 diagop; __u8 r[3]; __be32 diagval; } diags; union fw_port_dcb { struct fw_port_dcb_pgid { __u8 type; __u8 apply_pkd; __u8 r10_lo[2]; __be32 pgid; __be64 r11; } pgid; struct fw_port_dcb_pgrate { __u8 type; __u8 apply_pkd; __u8 r10_lo[5]; __u8 num_tcs_supported; __u8 pgrate[8]; __u8 tsa[8]; } pgrate; struct fw_port_dcb_priorate { __u8 type; __u8 apply_pkd; __u8 r10_lo[6]; __u8 strict_priorate[8]; } priorate; struct fw_port_dcb_pfc { __u8 type; __u8 pfcen; __u8 r10[5]; __u8 max_pfc_tcs; __be64 r11; } pfc; struct fw_port_app_priority { __u8 type; __u8 r10[2]; __u8 idx; __u8 user_prio_map; __u8 sel_field; __be16 protocolid; __be64 r12; } app_priority; struct fw_port_dcb_control { __u8 type; __u8 all_syncd_pkd; __be16 dcb_version_to_app_state; __be32 r11; __be64 r12; } control; } dcb; } u; }; #define FW_PORT_CMD_READ_S 22 #define FW_PORT_CMD_READ_V(x) ((x) << FW_PORT_CMD_READ_S) #define FW_PORT_CMD_READ_F FW_PORT_CMD_READ_V(1U) #define FW_PORT_CMD_PORTID_S 0 #define FW_PORT_CMD_PORTID_M 0xf #define FW_PORT_CMD_PORTID_V(x) ((x) << FW_PORT_CMD_PORTID_S) #define FW_PORT_CMD_PORTID_G(x) \ (((x) >> FW_PORT_CMD_PORTID_S) & FW_PORT_CMD_PORTID_M) #define FW_PORT_CMD_ACTION_S 16 #define FW_PORT_CMD_ACTION_M 0xffff #define FW_PORT_CMD_ACTION_V(x) ((x) << FW_PORT_CMD_ACTION_S) #define FW_PORT_CMD_ACTION_G(x) \ (((x) >> FW_PORT_CMD_ACTION_S) & FW_PORT_CMD_ACTION_M) #define FW_PORT_CMD_OVLAN3_S 7 #define FW_PORT_CMD_OVLAN3_V(x) ((x) << FW_PORT_CMD_OVLAN3_S) #define FW_PORT_CMD_OVLAN2_S 6 #define FW_PORT_CMD_OVLAN2_V(x) ((x) << FW_PORT_CMD_OVLAN2_S) #define FW_PORT_CMD_OVLAN1_S 5 #define FW_PORT_CMD_OVLAN1_V(x) ((x) << FW_PORT_CMD_OVLAN1_S) #define FW_PORT_CMD_OVLAN0_S 4 #define FW_PORT_CMD_OVLAN0_V(x) ((x) << FW_PORT_CMD_OVLAN0_S) #define FW_PORT_CMD_IVLAN0_S 3 #define FW_PORT_CMD_IVLAN0_V(x) ((x) << FW_PORT_CMD_IVLAN0_S) #define FW_PORT_CMD_TXIPG_S 3 #define FW_PORT_CMD_TXIPG_V(x) ((x) << FW_PORT_CMD_TXIPG_S) #define FW_PORT_CMD_LSTATUS_S 31 #define FW_PORT_CMD_LSTATUS_M 0x1 #define FW_PORT_CMD_LSTATUS_V(x) ((x) << FW_PORT_CMD_LSTATUS_S) #define FW_PORT_CMD_LSTATUS_G(x) \ (((x) >> FW_PORT_CMD_LSTATUS_S) & FW_PORT_CMD_LSTATUS_M) #define FW_PORT_CMD_LSTATUS_F FW_PORT_CMD_LSTATUS_V(1U) #define FW_PORT_CMD_LSPEED_S 24 #define FW_PORT_CMD_LSPEED_M 0x3f #define FW_PORT_CMD_LSPEED_V(x) ((x) << FW_PORT_CMD_LSPEED_S) #define FW_PORT_CMD_LSPEED_G(x) \ (((x) >> FW_PORT_CMD_LSPEED_S) & FW_PORT_CMD_LSPEED_M) #define FW_PORT_CMD_TXPAUSE_S 23 #define FW_PORT_CMD_TXPAUSE_V(x) ((x) << FW_PORT_CMD_TXPAUSE_S) #define FW_PORT_CMD_TXPAUSE_F FW_PORT_CMD_TXPAUSE_V(1U) #define FW_PORT_CMD_RXPAUSE_S 22 #define FW_PORT_CMD_RXPAUSE_V(x) ((x) << FW_PORT_CMD_RXPAUSE_S) #define FW_PORT_CMD_RXPAUSE_F FW_PORT_CMD_RXPAUSE_V(1U) #define FW_PORT_CMD_MDIOCAP_S 21 #define FW_PORT_CMD_MDIOCAP_V(x) ((x) << FW_PORT_CMD_MDIOCAP_S) #define FW_PORT_CMD_MDIOCAP_F FW_PORT_CMD_MDIOCAP_V(1U) #define FW_PORT_CMD_MDIOADDR_S 16 #define FW_PORT_CMD_MDIOADDR_M 0x1f #define FW_PORT_CMD_MDIOADDR_G(x) \ (((x) >> FW_PORT_CMD_MDIOADDR_S) & FW_PORT_CMD_MDIOADDR_M) #define FW_PORT_CMD_LPTXPAUSE_S 15 #define FW_PORT_CMD_LPTXPAUSE_V(x) ((x) << FW_PORT_CMD_LPTXPAUSE_S) #define FW_PORT_CMD_LPTXPAUSE_F 
FW_PORT_CMD_LPTXPAUSE_V(1U)

#define FW_PORT_CMD_LPRXPAUSE_S		14
#define FW_PORT_CMD_LPRXPAUSE_V(x)	((x) << FW_PORT_CMD_LPRXPAUSE_S)
#define FW_PORT_CMD_LPRXPAUSE_F	FW_PORT_CMD_LPRXPAUSE_V(1U)

#define FW_PORT_CMD_PTYPE_S	8
#define FW_PORT_CMD_PTYPE_M	0x1f
#define FW_PORT_CMD_PTYPE_G(x)	\
	(((x) >> FW_PORT_CMD_PTYPE_S) & FW_PORT_CMD_PTYPE_M)

#define FW_PORT_CMD_LINKDNRC_S	5
#define FW_PORT_CMD_LINKDNRC_M	0x7
#define FW_PORT_CMD_LINKDNRC_G(x)	\
	(((x) >> FW_PORT_CMD_LINKDNRC_S) & FW_PORT_CMD_LINKDNRC_M)

#define FW_PORT_CMD_MODTYPE_S	0
#define FW_PORT_CMD_MODTYPE_M	0x1f
#define FW_PORT_CMD_MODTYPE_V(x)	((x) << FW_PORT_CMD_MODTYPE_S)
#define FW_PORT_CMD_MODTYPE_G(x)	\
	(((x) >> FW_PORT_CMD_MODTYPE_S) & FW_PORT_CMD_MODTYPE_M)

#define FW_PORT_CMD_DCBXDIS_S	7
#define FW_PORT_CMD_DCBXDIS_V(x)	((x) << FW_PORT_CMD_DCBXDIS_S)
#define FW_PORT_CMD_DCBXDIS_F	FW_PORT_CMD_DCBXDIS_V(1U)

#define FW_PORT_CMD_APPLY_S	7
#define FW_PORT_CMD_APPLY_V(x)	((x) << FW_PORT_CMD_APPLY_S)
#define FW_PORT_CMD_APPLY_F	FW_PORT_CMD_APPLY_V(1U)

#define FW_PORT_CMD_ALL_SYNCD_S	7
#define FW_PORT_CMD_ALL_SYNCD_V(x)	((x) << FW_PORT_CMD_ALL_SYNCD_S)
#define FW_PORT_CMD_ALL_SYNCD_F	FW_PORT_CMD_ALL_SYNCD_V(1U)

#define FW_PORT_CMD_DCB_VERSION_S	12
#define FW_PORT_CMD_DCB_VERSION_M	0x7
#define FW_PORT_CMD_DCB_VERSION_G(x)	\
	(((x) >> FW_PORT_CMD_DCB_VERSION_S) & FW_PORT_CMD_DCB_VERSION_M)

enum fw_port_type {
	FW_PORT_TYPE_FIBER_XFI,
	FW_PORT_TYPE_FIBER_XAUI,
	FW_PORT_TYPE_BT_SGMII,
	FW_PORT_TYPE_BT_XFI,
	FW_PORT_TYPE_BT_XAUI,
	FW_PORT_TYPE_KX4,
	FW_PORT_TYPE_CX4,
	FW_PORT_TYPE_KX,
	FW_PORT_TYPE_KR,
	FW_PORT_TYPE_SFP,
	FW_PORT_TYPE_BP_AP,
	FW_PORT_TYPE_BP4_AP,
	FW_PORT_TYPE_QSFP_10G,
	FW_PORT_TYPE_QSA,
	FW_PORT_TYPE_QSFP,
	FW_PORT_TYPE_BP40_BA,
	FW_PORT_TYPE_KR4_100G,
	FW_PORT_TYPE_CR4_QSFP,
	FW_PORT_TYPE_CR_QSFP,
	FW_PORT_TYPE_CR2_QSFP,
	FW_PORT_TYPE_SFP28,

	FW_PORT_TYPE_NONE = FW_PORT_CMD_PTYPE_M
};

enum fw_port_module_type {
	FW_PORT_MOD_TYPE_NA,
	FW_PORT_MOD_TYPE_LR,
	FW_PORT_MOD_TYPE_SR,
	FW_PORT_MOD_TYPE_ER,
	FW_PORT_MOD_TYPE_TWINAX_PASSIVE,
	FW_PORT_MOD_TYPE_TWINAX_ACTIVE,
	FW_PORT_MOD_TYPE_LRM,
	FW_PORT_MOD_TYPE_ERROR = FW_PORT_CMD_MODTYPE_M - 3,
	FW_PORT_MOD_TYPE_UNKNOWN = FW_PORT_CMD_MODTYPE_M - 2,
	FW_PORT_MOD_TYPE_NOTSUPPORTED = FW_PORT_CMD_MODTYPE_M - 1,
	FW_PORT_MOD_TYPE_NONE = FW_PORT_CMD_MODTYPE_M
};

enum fw_port_mod_sub_type {
	FW_PORT_MOD_SUB_TYPE_NA,
	FW_PORT_MOD_SUB_TYPE_MV88E114X = 0x1,
	FW_PORT_MOD_SUB_TYPE_TN8022 = 0x2,
	FW_PORT_MOD_SUB_TYPE_AQ1202 = 0x3,
	FW_PORT_MOD_SUB_TYPE_88x3120 = 0x4,
	FW_PORT_MOD_SUB_TYPE_BCM84834 = 0x5,
	FW_PORT_MOD_SUB_TYPE_BT_VSC8634 = 0x8,

	/* The following will never be in the VPD.  They are TWINAX cable
	 * lengths decoded from SFP+ module i2c PROMs.  These should
	 * almost certainly go somewhere else ...
*/ FW_PORT_MOD_SUB_TYPE_TWINAX_1 = 0x9, FW_PORT_MOD_SUB_TYPE_TWINAX_3 = 0xA, FW_PORT_MOD_SUB_TYPE_TWINAX_5 = 0xB, FW_PORT_MOD_SUB_TYPE_TWINAX_7 = 0xC, }; enum fw_port_stats_tx_index { FW_STAT_TX_PORT_BYTES_IX = 0, FW_STAT_TX_PORT_FRAMES_IX, FW_STAT_TX_PORT_BCAST_IX, FW_STAT_TX_PORT_MCAST_IX, FW_STAT_TX_PORT_UCAST_IX, FW_STAT_TX_PORT_ERROR_IX, FW_STAT_TX_PORT_64B_IX, FW_STAT_TX_PORT_65B_127B_IX, FW_STAT_TX_PORT_128B_255B_IX, FW_STAT_TX_PORT_256B_511B_IX, FW_STAT_TX_PORT_512B_1023B_IX, FW_STAT_TX_PORT_1024B_1518B_IX, FW_STAT_TX_PORT_1519B_MAX_IX, FW_STAT_TX_PORT_DROP_IX, FW_STAT_TX_PORT_PAUSE_IX, FW_STAT_TX_PORT_PPP0_IX, FW_STAT_TX_PORT_PPP1_IX, FW_STAT_TX_PORT_PPP2_IX, FW_STAT_TX_PORT_PPP3_IX, FW_STAT_TX_PORT_PPP4_IX, FW_STAT_TX_PORT_PPP5_IX, FW_STAT_TX_PORT_PPP6_IX, FW_STAT_TX_PORT_PPP7_IX, FW_NUM_PORT_TX_STATS }; enum fw_port_stat_rx_index { FW_STAT_RX_PORT_BYTES_IX = 0, FW_STAT_RX_PORT_FRAMES_IX, FW_STAT_RX_PORT_BCAST_IX, FW_STAT_RX_PORT_MCAST_IX, FW_STAT_RX_PORT_UCAST_IX, FW_STAT_RX_PORT_MTU_ERROR_IX, FW_STAT_RX_PORT_MTU_CRC_ERROR_IX, FW_STAT_RX_PORT_CRC_ERROR_IX, FW_STAT_RX_PORT_LEN_ERROR_IX, FW_STAT_RX_PORT_SYM_ERROR_IX, FW_STAT_RX_PORT_64B_IX, FW_STAT_RX_PORT_65B_127B_IX, FW_STAT_RX_PORT_128B_255B_IX, FW_STAT_RX_PORT_256B_511B_IX, FW_STAT_RX_PORT_512B_1023B_IX, FW_STAT_RX_PORT_1024B_1518B_IX, FW_STAT_RX_PORT_1519B_MAX_IX, FW_STAT_RX_PORT_PAUSE_IX, FW_STAT_RX_PORT_PPP0_IX, FW_STAT_RX_PORT_PPP1_IX, FW_STAT_RX_PORT_PPP2_IX, FW_STAT_RX_PORT_PPP3_IX, FW_STAT_RX_PORT_PPP4_IX, FW_STAT_RX_PORT_PPP5_IX, FW_STAT_RX_PORT_PPP6_IX, FW_STAT_RX_PORT_PPP7_IX, FW_STAT_RX_PORT_LESS_64B_IX, FW_STAT_RX_PORT_MAC_ERROR_IX, FW_NUM_PORT_RX_STATS }; /* port stats */ #define FW_NUM_PORT_STATS (FW_NUM_PORT_TX_STATS + FW_NUM_PORT_RX_STATS) struct fw_port_stats_cmd { __be32 op_to_portid; __be32 retval_len16; union fw_port_stats { struct fw_port_stats_ctl { u8 nstats_bg_bm; u8 tx_ix; __be16 r6; __be32 r7; __be64 stat0; __be64 stat1; __be64 stat2; __be64 stat3; __be64 stat4; __be64 stat5; } ctl; struct fw_port_stats_all { __be64 tx_bytes; __be64 tx_frames; __be64 tx_bcast; __be64 tx_mcast; __be64 tx_ucast; __be64 tx_error; __be64 tx_64b; __be64 tx_65b_127b; __be64 tx_128b_255b; __be64 tx_256b_511b; __be64 tx_512b_1023b; __be64 tx_1024b_1518b; __be64 tx_1519b_max; __be64 tx_drop; __be64 tx_pause; __be64 tx_ppp0; __be64 tx_ppp1; __be64 tx_ppp2; __be64 tx_ppp3; __be64 tx_ppp4; __be64 tx_ppp5; __be64 tx_ppp6; __be64 tx_ppp7; __be64 rx_bytes; __be64 rx_frames; __be64 rx_bcast; __be64 rx_mcast; __be64 rx_ucast; __be64 rx_mtu_error; __be64 rx_mtu_crc_error; __be64 rx_crc_error; __be64 rx_len_error; __be64 rx_sym_error; __be64 rx_64b; __be64 rx_65b_127b; __be64 rx_128b_255b; __be64 rx_256b_511b; __be64 rx_512b_1023b; __be64 rx_1024b_1518b; __be64 rx_1519b_max; __be64 rx_pause; __be64 rx_ppp0; __be64 rx_ppp1; __be64 rx_ppp2; __be64 rx_ppp3; __be64 rx_ppp4; __be64 rx_ppp5; __be64 rx_ppp6; __be64 rx_ppp7; __be64 rx_less_64b; __be64 rx_bg_drop; __be64 rx_bg_trunc; } all; } u; }; /* port loopback stats */ #define FW_NUM_LB_STATS 16 enum fw_port_lb_stats_index { FW_STAT_LB_PORT_BYTES_IX, FW_STAT_LB_PORT_FRAMES_IX, FW_STAT_LB_PORT_BCAST_IX, FW_STAT_LB_PORT_MCAST_IX, FW_STAT_LB_PORT_UCAST_IX, FW_STAT_LB_PORT_ERROR_IX, FW_STAT_LB_PORT_64B_IX, FW_STAT_LB_PORT_65B_127B_IX, FW_STAT_LB_PORT_128B_255B_IX, FW_STAT_LB_PORT_256B_511B_IX, FW_STAT_LB_PORT_512B_1023B_IX, FW_STAT_LB_PORT_1024B_1518B_IX, FW_STAT_LB_PORT_1519B_MAX_IX, FW_STAT_LB_PORT_DROP_FRAMES_IX }; struct fw_port_lb_stats_cmd { __be32 op_to_lbport; __be32 retval_len16; union 
fw_port_lb_stats { struct fw_port_lb_stats_ctl { u8 nstats_bg_bm; u8 ix_pkd; __be16 r6; __be32 r7; __be64 stat0; __be64 stat1; __be64 stat2; __be64 stat3; __be64 stat4; __be64 stat5; } ctl; struct fw_port_lb_stats_all { __be64 tx_bytes; __be64 tx_frames; __be64 tx_bcast; __be64 tx_mcast; __be64 tx_ucast; __be64 tx_error; __be64 tx_64b; __be64 tx_65b_127b; __be64 tx_128b_255b; __be64 tx_256b_511b; __be64 tx_512b_1023b; __be64 tx_1024b_1518b; __be64 tx_1519b_max; __be64 rx_lb_drop; __be64 rx_lb_trunc; } all; } u; }; struct fw_rss_ind_tbl_cmd { __be32 op_to_viid; __be32 retval_len16; __be16 niqid; __be16 startidx; __be32 r3; __be32 iq0_to_iq2; __be32 iq3_to_iq5; __be32 iq6_to_iq8; __be32 iq9_to_iq11; __be32 iq12_to_iq14; __be32 iq15_to_iq17; __be32 iq18_to_iq20; __be32 iq21_to_iq23; __be32 iq24_to_iq26; __be32 iq27_to_iq29; __be32 iq30_iq31; __be32 r15_lo; }; #define FW_RSS_IND_TBL_CMD_VIID_S 0 #define FW_RSS_IND_TBL_CMD_VIID_V(x) ((x) << FW_RSS_IND_TBL_CMD_VIID_S) #define FW_RSS_IND_TBL_CMD_IQ0_S 20 #define FW_RSS_IND_TBL_CMD_IQ0_V(x) ((x) << FW_RSS_IND_TBL_CMD_IQ0_S) #define FW_RSS_IND_TBL_CMD_IQ1_S 10 #define FW_RSS_IND_TBL_CMD_IQ1_V(x) ((x) << FW_RSS_IND_TBL_CMD_IQ1_S) #define FW_RSS_IND_TBL_CMD_IQ2_S 0 #define FW_RSS_IND_TBL_CMD_IQ2_V(x) ((x) << FW_RSS_IND_TBL_CMD_IQ2_S) struct fw_rss_glb_config_cmd { __be32 op_to_write; __be32 retval_len16; union fw_rss_glb_config { struct fw_rss_glb_config_manual { __be32 mode_pkd; __be32 r3; __be64 r4; __be64 r5; } manual; struct fw_rss_glb_config_basicvirtual { __be32 mode_pkd; __be32 synmapen_to_hashtoeplitz; __be64 r8; __be64 r9; } basicvirtual; } u; }; #define FW_RSS_GLB_CONFIG_CMD_MODE_S 28 #define FW_RSS_GLB_CONFIG_CMD_MODE_M 0xf #define FW_RSS_GLB_CONFIG_CMD_MODE_V(x) ((x) << FW_RSS_GLB_CONFIG_CMD_MODE_S) #define FW_RSS_GLB_CONFIG_CMD_MODE_G(x) \ (((x) >> FW_RSS_GLB_CONFIG_CMD_MODE_S) & FW_RSS_GLB_CONFIG_CMD_MODE_M) #define FW_RSS_GLB_CONFIG_CMD_MODE_MANUAL 0 #define FW_RSS_GLB_CONFIG_CMD_MODE_BASICVIRTUAL 1 #define FW_RSS_GLB_CONFIG_CMD_SYNMAPEN_S 8 #define FW_RSS_GLB_CONFIG_CMD_SYNMAPEN_V(x) \ ((x) << FW_RSS_GLB_CONFIG_CMD_SYNMAPEN_S) #define FW_RSS_GLB_CONFIG_CMD_SYNMAPEN_F \ FW_RSS_GLB_CONFIG_CMD_SYNMAPEN_V(1U) #define FW_RSS_GLB_CONFIG_CMD_SYN4TUPENIPV6_S 7 #define FW_RSS_GLB_CONFIG_CMD_SYN4TUPENIPV6_V(x) \ ((x) << FW_RSS_GLB_CONFIG_CMD_SYN4TUPENIPV6_S) #define FW_RSS_GLB_CONFIG_CMD_SYN4TUPENIPV6_F \ FW_RSS_GLB_CONFIG_CMD_SYN4TUPENIPV6_V(1U) #define FW_RSS_GLB_CONFIG_CMD_SYN2TUPENIPV6_S 6 #define FW_RSS_GLB_CONFIG_CMD_SYN2TUPENIPV6_V(x) \ ((x) << FW_RSS_GLB_CONFIG_CMD_SYN2TUPENIPV6_S) #define FW_RSS_GLB_CONFIG_CMD_SYN2TUPENIPV6_F \ FW_RSS_GLB_CONFIG_CMD_SYN2TUPENIPV6_V(1U) #define FW_RSS_GLB_CONFIG_CMD_SYN4TUPENIPV4_S 5 #define FW_RSS_GLB_CONFIG_CMD_SYN4TUPENIPV4_V(x) \ ((x) << FW_RSS_GLB_CONFIG_CMD_SYN4TUPENIPV4_S) #define FW_RSS_GLB_CONFIG_CMD_SYN4TUPENIPV4_F \ FW_RSS_GLB_CONFIG_CMD_SYN4TUPENIPV4_V(1U) #define FW_RSS_GLB_CONFIG_CMD_SYN2TUPENIPV4_S 4 #define FW_RSS_GLB_CONFIG_CMD_SYN2TUPENIPV4_V(x) \ ((x) << FW_RSS_GLB_CONFIG_CMD_SYN2TUPENIPV4_S) #define FW_RSS_GLB_CONFIG_CMD_SYN2TUPENIPV4_F \ FW_RSS_GLB_CONFIG_CMD_SYN2TUPENIPV4_V(1U) #define FW_RSS_GLB_CONFIG_CMD_OFDMAPEN_S 3 #define FW_RSS_GLB_CONFIG_CMD_OFDMAPEN_V(x) \ ((x) << FW_RSS_GLB_CONFIG_CMD_OFDMAPEN_S) #define FW_RSS_GLB_CONFIG_CMD_OFDMAPEN_F \ FW_RSS_GLB_CONFIG_CMD_OFDMAPEN_V(1U) #define FW_RSS_GLB_CONFIG_CMD_TNLMAPEN_S 2 #define FW_RSS_GLB_CONFIG_CMD_TNLMAPEN_V(x) \ ((x) << FW_RSS_GLB_CONFIG_CMD_TNLMAPEN_S) #define FW_RSS_GLB_CONFIG_CMD_TNLMAPEN_F \ 
FW_RSS_GLB_CONFIG_CMD_TNLMAPEN_V(1U) #define FW_RSS_GLB_CONFIG_CMD_TNLALLLKP_S 1 #define FW_RSS_GLB_CONFIG_CMD_TNLALLLKP_V(x) \ ((x) << FW_RSS_GLB_CONFIG_CMD_TNLALLLKP_S) #define FW_RSS_GLB_CONFIG_CMD_TNLALLLKP_F \ FW_RSS_GLB_CONFIG_CMD_TNLALLLKP_V(1U) #define FW_RSS_GLB_CONFIG_CMD_HASHTOEPLITZ_S 0 #define FW_RSS_GLB_CONFIG_CMD_HASHTOEPLITZ_V(x) \ ((x) << FW_RSS_GLB_CONFIG_CMD_HASHTOEPLITZ_S) #define FW_RSS_GLB_CONFIG_CMD_HASHTOEPLITZ_F \ FW_RSS_GLB_CONFIG_CMD_HASHTOEPLITZ_V(1U) struct fw_rss_vi_config_cmd { __be32 op_to_viid; #define FW_RSS_VI_CONFIG_CMD_VIID(x) ((x) << 0) __be32 retval_len16; union fw_rss_vi_config { struct fw_rss_vi_config_manual { __be64 r3; __be64 r4; __be64 r5; } manual; struct fw_rss_vi_config_basicvirtual { __be32 r6; __be32 defaultq_to_udpen; __be64 r9; __be64 r10; } basicvirtual; } u; }; #define FW_RSS_VI_CONFIG_CMD_VIID_S 0 #define FW_RSS_VI_CONFIG_CMD_VIID_V(x) ((x) << FW_RSS_VI_CONFIG_CMD_VIID_S) #define FW_RSS_VI_CONFIG_CMD_DEFAULTQ_S 16 #define FW_RSS_VI_CONFIG_CMD_DEFAULTQ_M 0x3ff #define FW_RSS_VI_CONFIG_CMD_DEFAULTQ_V(x) \ ((x) << FW_RSS_VI_CONFIG_CMD_DEFAULTQ_S) #define FW_RSS_VI_CONFIG_CMD_DEFAULTQ_G(x) \ (((x) >> FW_RSS_VI_CONFIG_CMD_DEFAULTQ_S) & \ FW_RSS_VI_CONFIG_CMD_DEFAULTQ_M) #define FW_RSS_VI_CONFIG_CMD_IP6FOURTUPEN_S 4 #define FW_RSS_VI_CONFIG_CMD_IP6FOURTUPEN_V(x) \ ((x) << FW_RSS_VI_CONFIG_CMD_IP6FOURTUPEN_S) #define FW_RSS_VI_CONFIG_CMD_IP6FOURTUPEN_F \ FW_RSS_VI_CONFIG_CMD_IP6FOURTUPEN_V(1U) #define FW_RSS_VI_CONFIG_CMD_IP6TWOTUPEN_S 3 #define FW_RSS_VI_CONFIG_CMD_IP6TWOTUPEN_V(x) \ ((x) << FW_RSS_VI_CONFIG_CMD_IP6TWOTUPEN_S) #define FW_RSS_VI_CONFIG_CMD_IP6TWOTUPEN_F \ FW_RSS_VI_CONFIG_CMD_IP6TWOTUPEN_V(1U) #define FW_RSS_VI_CONFIG_CMD_IP4FOURTUPEN_S 2 #define FW_RSS_VI_CONFIG_CMD_IP4FOURTUPEN_V(x) \ ((x) << FW_RSS_VI_CONFIG_CMD_IP4FOURTUPEN_S) #define FW_RSS_VI_CONFIG_CMD_IP4FOURTUPEN_F \ FW_RSS_VI_CONFIG_CMD_IP4FOURTUPEN_V(1U) #define FW_RSS_VI_CONFIG_CMD_IP4TWOTUPEN_S 1 #define FW_RSS_VI_CONFIG_CMD_IP4TWOTUPEN_V(x) \ ((x) << FW_RSS_VI_CONFIG_CMD_IP4TWOTUPEN_S) #define FW_RSS_VI_CONFIG_CMD_IP4TWOTUPEN_F \ FW_RSS_VI_CONFIG_CMD_IP4TWOTUPEN_V(1U) #define FW_RSS_VI_CONFIG_CMD_UDPEN_S 0 #define FW_RSS_VI_CONFIG_CMD_UDPEN_V(x) ((x) << FW_RSS_VI_CONFIG_CMD_UDPEN_S) #define FW_RSS_VI_CONFIG_CMD_UDPEN_F FW_RSS_VI_CONFIG_CMD_UDPEN_V(1U) struct fw_clip_cmd { __be32 op_to_write; __be32 alloc_to_len16; __be64 ip_hi; __be64 ip_lo; __be32 r4[2]; }; #define FW_CLIP_CMD_ALLOC_S 31 #define FW_CLIP_CMD_ALLOC_V(x) ((x) << FW_CLIP_CMD_ALLOC_S) #define FW_CLIP_CMD_ALLOC_F FW_CLIP_CMD_ALLOC_V(1U) #define FW_CLIP_CMD_FREE_S 30 #define FW_CLIP_CMD_FREE_V(x) ((x) << FW_CLIP_CMD_FREE_S) #define FW_CLIP_CMD_FREE_F FW_CLIP_CMD_FREE_V(1U) enum fw_error_type { FW_ERROR_TYPE_EXCEPTION = 0x0, FW_ERROR_TYPE_HWMODULE = 0x1, FW_ERROR_TYPE_WR = 0x2, FW_ERROR_TYPE_ACL = 0x3, }; struct fw_error_cmd { __be32 op_to_type; __be32 len16_pkd; union fw_error { struct fw_error_exception { __be32 info[6]; } exception; struct fw_error_hwmodule { __be32 regaddr; __be32 regval; } hwmodule; struct fw_error_wr { __be16 cidx; __be16 pfn_vfn; __be32 eqid; u8 wrhdr[16]; } wr; struct fw_error_acl { __be16 cidx; __be16 pfn_vfn; __be32 eqid; __be16 mv_pkd; u8 val[6]; __be64 r4; } acl; } u; }; struct fw_debug_cmd { __be32 op_type; __be32 len16_pkd; union fw_debug { struct fw_debug_assert { __be32 fcid; __be32 line; __be32 x; __be32 y; u8 filename_0_7[8]; u8 filename_8_15[8]; __be64 r3; } assert; struct fw_debug_prt { __be16 dprtstridx; __be16 r3[3]; __be32 dprtstrparam0; __be32 dprtstrparam1; 
__be32 dprtstrparam2; __be32 dprtstrparam3; } prt; } u; }; #define FW_DEBUG_CMD_TYPE_S 0 #define FW_DEBUG_CMD_TYPE_M 0xff #define FW_DEBUG_CMD_TYPE_G(x) \ (((x) >> FW_DEBUG_CMD_TYPE_S) & FW_DEBUG_CMD_TYPE_M) #define PCIE_FW_ERR_S 31 #define PCIE_FW_ERR_V(x) ((x) << PCIE_FW_ERR_S) #define PCIE_FW_ERR_F PCIE_FW_ERR_V(1U) #define PCIE_FW_INIT_S 30 #define PCIE_FW_INIT_V(x) ((x) << PCIE_FW_INIT_S) #define PCIE_FW_INIT_F PCIE_FW_INIT_V(1U) #define PCIE_FW_HALT_S 29 #define PCIE_FW_HALT_V(x) ((x) << PCIE_FW_HALT_S) #define PCIE_FW_HALT_F PCIE_FW_HALT_V(1U) #define PCIE_FW_EVAL_S 24 #define PCIE_FW_EVAL_M 0x7 #define PCIE_FW_EVAL_G(x) (((x) >> PCIE_FW_EVAL_S) & PCIE_FW_EVAL_M) #define PCIE_FW_MASTER_VLD_S 15 #define PCIE_FW_MASTER_VLD_V(x) ((x) << PCIE_FW_MASTER_VLD_S) #define PCIE_FW_MASTER_VLD_F PCIE_FW_MASTER_VLD_V(1U) #define PCIE_FW_MASTER_S 12 #define PCIE_FW_MASTER_M 0x7 #define PCIE_FW_MASTER_V(x) ((x) << PCIE_FW_MASTER_S) #define PCIE_FW_MASTER_G(x) (((x) >> PCIE_FW_MASTER_S) & PCIE_FW_MASTER_M) struct fw_hdr { u8 ver; u8 chip; /* terminator chip type */ __be16 len512; /* bin length in units of 512-bytes */ __be32 fw_ver; /* firmware version */ __be32 tp_microcode_ver; u8 intfver_nic; u8 intfver_vnic; u8 intfver_ofld; u8 intfver_ri; u8 intfver_iscsipdu; u8 intfver_iscsi; u8 intfver_fcoepdu; u8 intfver_fcoe; __u32 reserved2; __u32 reserved3; __u32 reserved4; __be32 flags; __be32 reserved6[23]; }; enum fw_hdr_chip { FW_HDR_CHIP_T4, FW_HDR_CHIP_T5, FW_HDR_CHIP_T6 }; #define FW_HDR_FW_VER_MAJOR_S 24 #define FW_HDR_FW_VER_MAJOR_M 0xff #define FW_HDR_FW_VER_MAJOR_V(x) \ ((x) << FW_HDR_FW_VER_MAJOR_S) #define FW_HDR_FW_VER_MAJOR_G(x) \ (((x) >> FW_HDR_FW_VER_MAJOR_S) & FW_HDR_FW_VER_MAJOR_M) #define FW_HDR_FW_VER_MINOR_S 16 #define FW_HDR_FW_VER_MINOR_M 0xff #define FW_HDR_FW_VER_MINOR_V(x) \ ((x) << FW_HDR_FW_VER_MINOR_S) #define FW_HDR_FW_VER_MINOR_G(x) \ (((x) >> FW_HDR_FW_VER_MINOR_S) & FW_HDR_FW_VER_MINOR_M) #define FW_HDR_FW_VER_MICRO_S 8 #define FW_HDR_FW_VER_MICRO_M 0xff #define FW_HDR_FW_VER_MICRO_V(x) \ ((x) << FW_HDR_FW_VER_MICRO_S) #define FW_HDR_FW_VER_MICRO_G(x) \ (((x) >> FW_HDR_FW_VER_MICRO_S) & FW_HDR_FW_VER_MICRO_M) #define FW_HDR_FW_VER_BUILD_S 0 #define FW_HDR_FW_VER_BUILD_M 0xff #define FW_HDR_FW_VER_BUILD_V(x) \ ((x) << FW_HDR_FW_VER_BUILD_S) #define FW_HDR_FW_VER_BUILD_G(x) \ (((x) >> FW_HDR_FW_VER_BUILD_S) & FW_HDR_FW_VER_BUILD_M) enum fw_hdr_intfver { FW_HDR_INTFVER_NIC = 0x00, FW_HDR_INTFVER_VNIC = 0x00, FW_HDR_INTFVER_OFLD = 0x00, FW_HDR_INTFVER_RI = 0x00, FW_HDR_INTFVER_ISCSIPDU = 0x00, FW_HDR_INTFVER_ISCSI = 0x00, FW_HDR_INTFVER_FCOEPDU = 0x00, FW_HDR_INTFVER_FCOE = 0x00, }; enum fw_hdr_flags { FW_HDR_FLAGS_RESET_HALT = 0x00000001, }; /* length of the formatting string */ #define FW_DEVLOG_FMT_LEN 192 /* maximum number of the formatting string parameters */ #define FW_DEVLOG_FMT_PARAMS_NUM 8 /* priority levels */ enum fw_devlog_level { FW_DEVLOG_LEVEL_EMERG = 0x0, FW_DEVLOG_LEVEL_CRIT = 0x1, FW_DEVLOG_LEVEL_ERR = 0x2, FW_DEVLOG_LEVEL_NOTICE = 0x3, FW_DEVLOG_LEVEL_INFO = 0x4, FW_DEVLOG_LEVEL_DEBUG = 0x5, FW_DEVLOG_LEVEL_MAX = 0x5, }; /* facilities that may send a log message */ enum fw_devlog_facility { FW_DEVLOG_FACILITY_CORE = 0x00, FW_DEVLOG_FACILITY_CF = 0x01, FW_DEVLOG_FACILITY_SCHED = 0x02, FW_DEVLOG_FACILITY_TIMER = 0x04, FW_DEVLOG_FACILITY_RES = 0x06, FW_DEVLOG_FACILITY_HW = 0x08, FW_DEVLOG_FACILITY_FLR = 0x10, FW_DEVLOG_FACILITY_DMAQ = 0x12, FW_DEVLOG_FACILITY_PHY = 0x14, FW_DEVLOG_FACILITY_MAC = 0x16, FW_DEVLOG_FACILITY_PORT = 0x18, FW_DEVLOG_FACILITY_VI = 
0x1A,
	FW_DEVLOG_FACILITY_FILTER = 0x1C,
	FW_DEVLOG_FACILITY_ACL = 0x1E,
	FW_DEVLOG_FACILITY_TM = 0x20,
	FW_DEVLOG_FACILITY_QFC = 0x22,
	FW_DEVLOG_FACILITY_DCB = 0x24,
	FW_DEVLOG_FACILITY_ETH = 0x26,
	FW_DEVLOG_FACILITY_OFLD = 0x28,
	FW_DEVLOG_FACILITY_RI = 0x2A,
	FW_DEVLOG_FACILITY_ISCSI = 0x2C,
	FW_DEVLOG_FACILITY_FCOE = 0x2E,
	FW_DEVLOG_FACILITY_FOISCSI = 0x30,
	FW_DEVLOG_FACILITY_FOFCOE = 0x32,
	FW_DEVLOG_FACILITY_CHNET = 0x34,
	FW_DEVLOG_FACILITY_MAX = 0x34,
};

/* log message format */
struct fw_devlog_e {
	__be64 timestamp;
	__be32 seqno;
	__be16 reserved1;
	__u8 level;
	__u8 facility;
	__u8 fmt[FW_DEVLOG_FMT_LEN];
	__be32 params[FW_DEVLOG_FMT_PARAMS_NUM];
	__be32 reserved3[4];
};

struct fw_devlog_cmd {
	__be32 op_to_write;
	__be32 retval_len16;
	__u8 level;
	__u8 r2[7];
	__be32 memtype_devlog_memaddr16_devlog;
	__be32 memsize_devlog;
	__be32 r3[2];
};

#define FW_DEVLOG_CMD_MEMTYPE_DEVLOG_S		28
#define FW_DEVLOG_CMD_MEMTYPE_DEVLOG_M		0xf
#define FW_DEVLOG_CMD_MEMTYPE_DEVLOG_G(x)	\
	(((x) >> FW_DEVLOG_CMD_MEMTYPE_DEVLOG_S) & \
	 FW_DEVLOG_CMD_MEMTYPE_DEVLOG_M)

#define FW_DEVLOG_CMD_MEMADDR16_DEVLOG_S	0
#define FW_DEVLOG_CMD_MEMADDR16_DEVLOG_M	0xfffffff
#define FW_DEVLOG_CMD_MEMADDR16_DEVLOG_G(x)	\
	(((x) >> FW_DEVLOG_CMD_MEMADDR16_DEVLOG_S) & \
	 FW_DEVLOG_CMD_MEMADDR16_DEVLOG_M)

/* P C I E   F W   P F 7   R E G I S T E R */

/* PF7 stores the Firmware Device Log parameters which allow Host Drivers to
 * access the "devlog" without needing to contact firmware.  The encoding is
 * mostly the same as that returned by the DEVLOG command except for the
 * size, which is encoded here as the number of entries in multiples of 128,
 * minus one, rather than as the memory size as is done in the DEVLOG
 * command.  Thus, 0 means 128 entries and 15 means 2048 entries.  This of
 * course in turn constrains the allowed values for the devlog size ...
 */
#define PCIE_FW_PF_DEVLOG		7

#define PCIE_FW_PF_DEVLOG_NENTRIES128_S	28
#define PCIE_FW_PF_DEVLOG_NENTRIES128_M	0xf
#define PCIE_FW_PF_DEVLOG_NENTRIES128_V(x) \
	((x) << PCIE_FW_PF_DEVLOG_NENTRIES128_S)
#define PCIE_FW_PF_DEVLOG_NENTRIES128_G(x) \
	(((x) >> PCIE_FW_PF_DEVLOG_NENTRIES128_S) & \
	 PCIE_FW_PF_DEVLOG_NENTRIES128_M)

#define PCIE_FW_PF_DEVLOG_ADDR16_S	4
#define PCIE_FW_PF_DEVLOG_ADDR16_M	0xffffff
#define PCIE_FW_PF_DEVLOG_ADDR16_V(x)	((x) << PCIE_FW_PF_DEVLOG_ADDR16_S)
#define PCIE_FW_PF_DEVLOG_ADDR16_G(x) \
	(((x) >> PCIE_FW_PF_DEVLOG_ADDR16_S) & PCIE_FW_PF_DEVLOG_ADDR16_M)

#define PCIE_FW_PF_DEVLOG_MEMTYPE_S	0
#define PCIE_FW_PF_DEVLOG_MEMTYPE_M	0xf
#define PCIE_FW_PF_DEVLOG_MEMTYPE_V(x)	((x) << PCIE_FW_PF_DEVLOG_MEMTYPE_S)
#define PCIE_FW_PF_DEVLOG_MEMTYPE_G(x) \
	(((x) >> PCIE_FW_PF_DEVLOG_MEMTYPE_S) & PCIE_FW_PF_DEVLOG_MEMTYPE_M)

#endif /* _T4FW_INTERFACE_H_ */
rdma-core-56.1/providers/cxgb4/t4fw_ri_api.h000066400000000000000000000561251477342711600207470ustar00rootroot00000000000000/*
 * Copyright (c) 2009-2010 Chelsio, Inc. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses. You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
* - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef _T4FW_RI_API_H_ #define _T4FW_RI_API_H_ #include "t4fw_api.h" enum fw_ri_wr_opcode { FW_RI_RDMA_WRITE = 0x0, /* IETF RDMAP v1.0 ... */ FW_RI_READ_REQ = 0x1, FW_RI_READ_RESP = 0x2, FW_RI_SEND = 0x3, FW_RI_SEND_WITH_INV = 0x4, FW_RI_SEND_WITH_SE = 0x5, FW_RI_SEND_WITH_SE_INV = 0x6, FW_RI_TERMINATE = 0x7, FW_RI_RDMA_INIT = 0x8, /* CHELSIO RI specific ... */ FW_RI_BIND_MW = 0x9, FW_RI_FAST_REGISTER = 0xa, FW_RI_LOCAL_INV = 0xb, FW_RI_QP_MODIFY = 0xc, FW_RI_BYPASS = 0xd, FW_RI_RECEIVE = 0xe, FW_RI_SGE_EC_CR_RETURN = 0xf, FW_RI_WRITE_IMMEDIATE = FW_RI_RDMA_INIT, }; enum fw_ri_wr_flags { FW_RI_COMPLETION_FLAG = 0x01, FW_RI_NOTIFICATION_FLAG = 0x02, FW_RI_SOLICITED_EVENT_FLAG = 0x04, FW_RI_READ_FENCE_FLAG = 0x08, FW_RI_LOCAL_FENCE_FLAG = 0x10, FW_RI_RDMA_READ_INVALIDATE = 0x20, FW_RI_RDMA_WRITE_WITH_IMMEDIATE = 0x40, }; enum fw_ri_mpa_attrs { FW_RI_MPA_RX_MARKER_ENABLE = 0x01, FW_RI_MPA_TX_MARKER_ENABLE = 0x02, FW_RI_MPA_CRC_ENABLE = 0x04, FW_RI_MPA_IETF_ENABLE = 0x08 }; enum fw_ri_qp_caps { FW_RI_QP_RDMA_READ_ENABLE = 0x01, FW_RI_QP_RDMA_WRITE_ENABLE = 0x02, FW_RI_QP_BIND_ENABLE = 0x04, FW_RI_QP_FAST_REGISTER_ENABLE = 0x08, FW_RI_QP_STAG0_ENABLE = 0x10 }; enum fw_ri_addr_type { FW_RI_ZERO_BASED_TO = 0x00, FW_RI_VA_BASED_TO = 0x01 }; enum fw_ri_mem_perms { FW_RI_MEM_ACCESS_REM_WRITE = 0x01, FW_RI_MEM_ACCESS_REM_READ = 0x02, FW_RI_MEM_ACCESS_REM = 0x03, FW_RI_MEM_ACCESS_LOCAL_WRITE = 0x04, FW_RI_MEM_ACCESS_LOCAL_READ = 0x08, FW_RI_MEM_ACCESS_LOCAL = 0x0C }; enum fw_ri_stag_type { FW_RI_STAG_NSMR = 0x00, FW_RI_STAG_SMR = 0x01, FW_RI_STAG_MW = 0x02, FW_RI_STAG_MW_RELAXED = 0x03 }; enum fw_ri_data_op { FW_RI_DATA_IMMD = 0x81, FW_RI_DATA_DSGL = 0x82, FW_RI_DATA_ISGL = 0x83 }; enum fw_ri_sgl_depth { FW_RI_SGL_DEPTH_MAX_SQ = 16, FW_RI_SGL_DEPTH_MAX_RQ = 4 }; struct fw_ri_dsge_pair { __be32 len[2]; __be64 addr[2]; }; struct fw_ri_dsgl { __u8 op; __u8 r1; __be16 nsge; __be32 len0; __be64 addr0; #ifndef C99_NOT_SUPPORTED struct fw_ri_dsge_pair sge[0]; #endif }; struct fw_ri_sge { __be32 stag; __be32 len; __be64 to; }; struct fw_ri_isgl { __u8 op; __u8 r1; __be16 nsge; __be32 r2; #ifndef C99_NOT_SUPPORTED struct fw_ri_sge sge[0]; #endif }; struct fw_ri_immd { __u8 op; __u8 r1; __be16 r2; __be32 immdlen; #ifndef C99_NOT_SUPPORTED __u8 data[0]; #endif }; struct fw_ri_tpte { __be32 valid_to_pdid; __be32 locread_to_qpid; __be32 nosnoop_pbladdr; __be32 len_lo; __be32 va_hi; __be32 va_lo_fbo; __be32 dca_mwbcnt_pstag; __be32 len_hi; }; #define FW_RI_TPTE_VALID_S 31 #define FW_RI_TPTE_VALID_M 0x1 #define FW_RI_TPTE_VALID_V(x) ((x) << FW_RI_TPTE_VALID_S) #define FW_RI_TPTE_VALID_G(x) \ (((x) >> FW_RI_TPTE_VALID_S) & FW_RI_TPTE_VALID_M) #define FW_RI_TPTE_VALID_F FW_RI_TPTE_VALID_V(1U) #define FW_RI_TPTE_STAGKEY_S 23 #define FW_RI_TPTE_STAGKEY_M 0xff #define FW_RI_TPTE_STAGKEY_V(x) ((x) << FW_RI_TPTE_STAGKEY_S) #define 
FW_RI_TPTE_STAGKEY_G(x) \ (((x) >> FW_RI_TPTE_STAGKEY_S) & FW_RI_TPTE_STAGKEY_M) #define FW_RI_TPTE_STAGSTATE_S 22 #define FW_RI_TPTE_STAGSTATE_M 0x1 #define FW_RI_TPTE_STAGSTATE_V(x) ((x) << FW_RI_TPTE_STAGSTATE_S) #define FW_RI_TPTE_STAGSTATE_G(x) \ (((x) >> FW_RI_TPTE_STAGSTATE_S) & FW_RI_TPTE_STAGSTATE_M) #define FW_RI_TPTE_STAGSTATE_F FW_RI_TPTE_STAGSTATE_V(1U) #define FW_RI_TPTE_STAGTYPE_S 20 #define FW_RI_TPTE_STAGTYPE_M 0x3 #define FW_RI_TPTE_STAGTYPE_V(x) ((x) << FW_RI_TPTE_STAGTYPE_S) #define FW_RI_TPTE_STAGTYPE_G(x) \ (((x) >> FW_RI_TPTE_STAGTYPE_S) & FW_RI_TPTE_STAGTYPE_M) #define FW_RI_TPTE_PDID_S 0 #define FW_RI_TPTE_PDID_M 0xfffff #define FW_RI_TPTE_PDID_V(x) ((x) << FW_RI_TPTE_PDID_S) #define FW_RI_TPTE_PDID_G(x) \ (((x) >> FW_RI_TPTE_PDID_S) & FW_RI_TPTE_PDID_M) #define FW_RI_TPTE_PERM_S 28 #define FW_RI_TPTE_PERM_M 0xf #define FW_RI_TPTE_PERM_V(x) ((x) << FW_RI_TPTE_PERM_S) #define FW_RI_TPTE_PERM_G(x) \ (((x) >> FW_RI_TPTE_PERM_S) & FW_RI_TPTE_PERM_M) #define FW_RI_TPTE_REMINVDIS_S 27 #define FW_RI_TPTE_REMINVDIS_M 0x1 #define FW_RI_TPTE_REMINVDIS_V(x) ((x) << FW_RI_TPTE_REMINVDIS_S) #define FW_RI_TPTE_REMINVDIS_G(x) \ (((x) >> FW_RI_TPTE_REMINVDIS_S) & FW_RI_TPTE_REMINVDIS_M) #define FW_RI_TPTE_REMINVDIS_F FW_RI_TPTE_REMINVDIS_V(1U) #define FW_RI_TPTE_ADDRTYPE_S 26 #define FW_RI_TPTE_ADDRTYPE_M 1 #define FW_RI_TPTE_ADDRTYPE_V(x) ((x) << FW_RI_TPTE_ADDRTYPE_S) #define FW_RI_TPTE_ADDRTYPE_G(x) \ (((x) >> FW_RI_TPTE_ADDRTYPE_S) & FW_RI_TPTE_ADDRTYPE_M) #define FW_RI_TPTE_ADDRTYPE_F FW_RI_TPTE_ADDRTYPE_V(1U) #define FW_RI_TPTE_MWBINDEN_S 25 #define FW_RI_TPTE_MWBINDEN_M 0x1 #define FW_RI_TPTE_MWBINDEN_V(x) ((x) << FW_RI_TPTE_MWBINDEN_S) #define FW_RI_TPTE_MWBINDEN_G(x) \ (((x) >> FW_RI_TPTE_MWBINDEN_S) & FW_RI_TPTE_MWBINDEN_M) #define FW_RI_TPTE_MWBINDEN_F FW_RI_TPTE_MWBINDEN_V(1U) #define FW_RI_TPTE_PS_S 20 #define FW_RI_TPTE_PS_M 0x1f #define FW_RI_TPTE_PS_V(x) ((x) << FW_RI_TPTE_PS_S) #define FW_RI_TPTE_PS_G(x) \ (((x) >> FW_RI_TPTE_PS_S) & FW_RI_TPTE_PS_M) #define FW_RI_TPTE_QPID_S 0 #define FW_RI_TPTE_QPID_M 0xfffff #define FW_RI_TPTE_QPID_V(x) ((x) << FW_RI_TPTE_QPID_S) #define FW_RI_TPTE_QPID_G(x) \ (((x) >> FW_RI_TPTE_QPID_S) & FW_RI_TPTE_QPID_M) #define FW_RI_TPTE_NOSNOOP_S 30 #define FW_RI_TPTE_NOSNOOP_M 0x1 #define FW_RI_TPTE_NOSNOOP_V(x) ((x) << FW_RI_TPTE_NOSNOOP_S) #define FW_RI_TPTE_NOSNOOP_G(x) \ (((x) >> FW_RI_TPTE_NOSNOOP_S) & FW_RI_TPTE_NOSNOOP_M) #define FW_RI_TPTE_NOSNOOP_F FW_RI_TPTE_NOSNOOP_V(1U) #define FW_RI_TPTE_PBLADDR_S 0 #define FW_RI_TPTE_PBLADDR_M 0x1fffffff #define FW_RI_TPTE_PBLADDR_V(x) ((x) << FW_RI_TPTE_PBLADDR_S) #define FW_RI_TPTE_PBLADDR_G(x) \ (((x) >> FW_RI_TPTE_PBLADDR_S) & FW_RI_TPTE_PBLADDR_M) #define FW_RI_TPTE_DCA_S 24 #define FW_RI_TPTE_DCA_M 0x1f #define FW_RI_TPTE_DCA_V(x) ((x) << FW_RI_TPTE_DCA_S) #define FW_RI_TPTE_DCA_G(x) \ (((x) >> FW_RI_TPTE_DCA_S) & FW_RI_TPTE_DCA_M) #define FW_RI_TPTE_MWBCNT_PSTAG_S 0 #define FW_RI_TPTE_MWBCNT_PSTAG_M 0xffffff #define FW_RI_TPTE_MWBCNT_PSTAT_V(x) \ ((x) << FW_RI_TPTE_MWBCNT_PSTAG_S) #define FW_RI_TPTE_MWBCNT_PSTAG_G(x) \ (((x) >> FW_RI_TPTE_MWBCNT_PSTAG_S) & FW_RI_TPTE_MWBCNT_PSTAG_M) enum fw_ri_res_type { FW_RI_RES_TYPE_SQ, FW_RI_RES_TYPE_RQ, FW_RI_RES_TYPE_CQ, FW_RI_RES_TYPE_SRQ, }; enum fw_ri_res_op { FW_RI_RES_OP_WRITE, FW_RI_RES_OP_RESET, }; struct fw_ri_res { union fw_ri_restype { struct fw_ri_res_sqrq { __u8 restype; __u8 op; __be16 r3; __be32 eqid; __be32 r4[2]; __be32 fetchszm_to_iqid; __be32 dcaen_to_eqsize; __be64 eqaddr; } sqrq; struct fw_ri_res_cq { __u8 restype; __u8 op; 
__be16 r3; __be32 iqid; __be32 r4[2]; __be32 iqandst_to_iqandstindex; __be16 iqdroprss_to_iqesize; __be16 iqsize; __be64 iqaddr; __be32 iqns_iqro; __be32 r6_lo; __be64 r7; } cq; struct fw_ri_res_srq { __u8 restype; __u8 op; __be16 r3; __be32 eqid; __be32 r4[2]; __be32 fetchszm_to_iqid; __be32 dcaen_to_eqsize; __be64 eqaddr; __be32 srqid; __be32 pdid; __be32 hwsrqsize; __be32 hwsrqaddr; } srq; } u; }; struct fw_ri_res_wr { __be32 op_nres; __be32 len16_pkd; __u64 cookie; #ifndef C99_NOT_SUPPORTED struct fw_ri_res res[0]; #endif }; #define FW_RI_RES_WR_NRES_S 0 #define FW_RI_RES_WR_NRES_M 0xff #define FW_RI_RES_WR_NRES_V(x) ((x) << FW_RI_RES_WR_NRES_S) #define FW_RI_RES_WR_NRES_G(x) \ (((x) >> FW_RI_RES_WR_NRES_S) & FW_RI_RES_WR_NRES_M) #define FW_RI_RES_WR_FETCHSZM_S 26 #define FW_RI_RES_WR_FETCHSZM_M 0x1 #define FW_RI_RES_WR_FETCHSZM_V(x) ((x) << FW_RI_RES_WR_FETCHSZM_S) #define FW_RI_RES_WR_FETCHSZM_G(x) \ (((x) >> FW_RI_RES_WR_FETCHSZM_S) & FW_RI_RES_WR_FETCHSZM_M) #define FW_RI_RES_WR_FETCHSZM_F FW_RI_RES_WR_FETCHSZM_V(1U) #define FW_RI_RES_WR_STATUSPGNS_S 25 #define FW_RI_RES_WR_STATUSPGNS_M 0x1 #define FW_RI_RES_WR_STATUSPGNS_V(x) ((x) << FW_RI_RES_WR_STATUSPGNS_S) #define FW_RI_RES_WR_STATUSPGNS_G(x) \ (((x) >> FW_RI_RES_WR_STATUSPGNS_S) & FW_RI_RES_WR_STATUSPGNS_M) #define FW_RI_RES_WR_STATUSPGNS_F FW_RI_RES_WR_STATUSPGNS_V(1U) #define FW_RI_RES_WR_STATUSPGRO_S 24 #define FW_RI_RES_WR_STATUSPGRO_M 0x1 #define FW_RI_RES_WR_STATUSPGRO_V(x) ((x) << FW_RI_RES_WR_STATUSPGRO_S) #define FW_RI_RES_WR_STATUSPGRO_G(x) \ (((x) >> FW_RI_RES_WR_STATUSPGRO_S) & FW_RI_RES_WR_STATUSPGRO_M) #define FW_RI_RES_WR_STATUSPGRO_F FW_RI_RES_WR_STATUSPGRO_V(1U) #define FW_RI_RES_WR_FETCHNS_S 23 #define FW_RI_RES_WR_FETCHNS_M 0x1 #define FW_RI_RES_WR_FETCHNS_V(x) ((x) << FW_RI_RES_WR_FETCHNS_S) #define FW_RI_RES_WR_FETCHNS_G(x) \ (((x) >> FW_RI_RES_WR_FETCHNS_S) & FW_RI_RES_WR_FETCHNS_M) #define FW_RI_RES_WR_FETCHNS_F FW_RI_RES_WR_FETCHNS_V(1U) #define FW_RI_RES_WR_FETCHRO_S 22 #define FW_RI_RES_WR_FETCHRO_M 0x1 #define FW_RI_RES_WR_FETCHRO_V(x) ((x) << FW_RI_RES_WR_FETCHRO_S) #define FW_RI_RES_WR_FETCHRO_G(x) \ (((x) >> FW_RI_RES_WR_FETCHRO_S) & FW_RI_RES_WR_FETCHRO_M) #define FW_RI_RES_WR_FETCHRO_F FW_RI_RES_WR_FETCHRO_V(1U) #define FW_RI_RES_WR_HOSTFCMODE_S 20 #define FW_RI_RES_WR_HOSTFCMODE_M 0x3 #define FW_RI_RES_WR_HOSTFCMODE_V(x) ((x) << FW_RI_RES_WR_HOSTFCMODE_S) #define FW_RI_RES_WR_HOSTFCMODE_G(x) \ (((x) >> FW_RI_RES_WR_HOSTFCMODE_S) & FW_RI_RES_WR_HOSTFCMODE_M) #define FW_RI_RES_WR_CPRIO_S 19 #define FW_RI_RES_WR_CPRIO_M 0x1 #define FW_RI_RES_WR_CPRIO_V(x) ((x) << FW_RI_RES_WR_CPRIO_S) #define FW_RI_RES_WR_CPRIO_G(x) \ (((x) >> FW_RI_RES_WR_CPRIO_S) & FW_RI_RES_WR_CPRIO_M) #define FW_RI_RES_WR_CPRIO_F FW_RI_RES_WR_CPRIO_V(1U) #define FW_RI_RES_WR_ONCHIP_S 18 #define FW_RI_RES_WR_ONCHIP_M 0x1 #define FW_RI_RES_WR_ONCHIP_V(x) ((x) << FW_RI_RES_WR_ONCHIP_S) #define FW_RI_RES_WR_ONCHIP_G(x) \ (((x) >> FW_RI_RES_WR_ONCHIP_S) & FW_RI_RES_WR_ONCHIP_M) #define FW_RI_RES_WR_ONCHIP_F FW_RI_RES_WR_ONCHIP_V(1U) #define FW_RI_RES_WR_PCIECHN_S 16 #define FW_RI_RES_WR_PCIECHN_M 0x3 #define FW_RI_RES_WR_PCIECHN_V(x) ((x) << FW_RI_RES_WR_PCIECHN_S) #define FW_RI_RES_WR_PCIECHN_G(x) \ (((x) >> FW_RI_RES_WR_PCIECHN_S) & FW_RI_RES_WR_PCIECHN_M) #define FW_RI_RES_WR_IQID_S 0 #define FW_RI_RES_WR_IQID_M 0xffff #define FW_RI_RES_WR_IQID_V(x) ((x) << FW_RI_RES_WR_IQID_S) #define FW_RI_RES_WR_IQID_G(x) \ (((x) >> FW_RI_RES_WR_IQID_S) & FW_RI_RES_WR_IQID_M) #define FW_RI_RES_WR_DCAEN_S 31 #define FW_RI_RES_WR_DCAEN_M 0x1 
#define FW_RI_RES_WR_DCAEN_V(x) ((x) << FW_RI_RES_WR_DCAEN_S) #define FW_RI_RES_WR_DCAEN_G(x) \ (((x) >> FW_RI_RES_WR_DCAEN_S) & FW_RI_RES_WR_DCAEN_M) #define FW_RI_RES_WR_DCAEN_F FW_RI_RES_WR_DCAEN_V(1U) #define FW_RI_RES_WR_DCACPU_S 26 #define FW_RI_RES_WR_DCACPU_M 0x1f #define FW_RI_RES_WR_DCACPU_V(x) ((x) << FW_RI_RES_WR_DCACPU_S) #define FW_RI_RES_WR_DCACPU_G(x) \ (((x) >> FW_RI_RES_WR_DCACPU_S) & FW_RI_RES_WR_DCACPU_M) #define FW_RI_RES_WR_FBMIN_S 23 #define FW_RI_RES_WR_FBMIN_M 0x7 #define FW_RI_RES_WR_FBMIN_V(x) ((x) << FW_RI_RES_WR_FBMIN_S) #define FW_RI_RES_WR_FBMIN_G(x) \ (((x) >> FW_RI_RES_WR_FBMIN_S) & FW_RI_RES_WR_FBMIN_M) #define FW_RI_RES_WR_FBMAX_S 20 #define FW_RI_RES_WR_FBMAX_M 0x7 #define FW_RI_RES_WR_FBMAX_V(x) ((x) << FW_RI_RES_WR_FBMAX_S) #define FW_RI_RES_WR_FBMAX_G(x) \ (((x) >> FW_RI_RES_WR_FBMAX_S) & FW_RI_RES_WR_FBMAX_M) #define FW_RI_RES_WR_CIDXFTHRESHO_S 19 #define FW_RI_RES_WR_CIDXFTHRESHO_M 0x1 #define FW_RI_RES_WR_CIDXFTHRESHO_V(x) ((x) << FW_RI_RES_WR_CIDXFTHRESHO_S) #define FW_RI_RES_WR_CIDXFTHRESHO_G(x) \ (((x) >> FW_RI_RES_WR_CIDXFTHRESHO_S) & FW_RI_RES_WR_CIDXFTHRESHO_M) #define FW_RI_RES_WR_CIDXFTHRESHO_F FW_RI_RES_WR_CIDXFTHRESHO_V(1U) #define FW_RI_RES_WR_CIDXFTHRESH_S 16 #define FW_RI_RES_WR_CIDXFTHRESH_M 0x7 #define FW_RI_RES_WR_CIDXFTHRESH_V(x) ((x) << FW_RI_RES_WR_CIDXFTHRESH_S) #define FW_RI_RES_WR_CIDXFTHRESH_G(x) \ (((x) >> FW_RI_RES_WR_CIDXFTHRESH_S) & FW_RI_RES_WR_CIDXFTHRESH_M) #define FW_RI_RES_WR_EQSIZE_S 0 #define FW_RI_RES_WR_EQSIZE_M 0xffff #define FW_RI_RES_WR_EQSIZE_V(x) ((x) << FW_RI_RES_WR_EQSIZE_S) #define FW_RI_RES_WR_EQSIZE_G(x) \ (((x) >> FW_RI_RES_WR_EQSIZE_S) & FW_RI_RES_WR_EQSIZE_M) #define FW_RI_RES_WR_IQANDST_S 15 #define FW_RI_RES_WR_IQANDST_M 0x1 #define FW_RI_RES_WR_IQANDST_V(x) ((x) << FW_RI_RES_WR_IQANDST_S) #define FW_RI_RES_WR_IQANDST_G(x) \ (((x) >> FW_RI_RES_WR_IQANDST_S) & FW_RI_RES_WR_IQANDST_M) #define FW_RI_RES_WR_IQANDST_F FW_RI_RES_WR_IQANDST_V(1U) #define FW_RI_RES_WR_IQANUS_S 14 #define FW_RI_RES_WR_IQANUS_M 0x1 #define FW_RI_RES_WR_IQANUS_V(x) ((x) << FW_RI_RES_WR_IQANUS_S) #define FW_RI_RES_WR_IQANUS_G(x) \ (((x) >> FW_RI_RES_WR_IQANUS_S) & FW_RI_RES_WR_IQANUS_M) #define FW_RI_RES_WR_IQANUS_F FW_RI_RES_WR_IQANUS_V(1U) #define FW_RI_RES_WR_IQANUD_S 12 #define FW_RI_RES_WR_IQANUD_M 0x3 #define FW_RI_RES_WR_IQANUD_V(x) ((x) << FW_RI_RES_WR_IQANUD_S) #define FW_RI_RES_WR_IQANUD_G(x) \ (((x) >> FW_RI_RES_WR_IQANUD_S) & FW_RI_RES_WR_IQANUD_M) #define FW_RI_RES_WR_IQANDSTINDEX_S 0 #define FW_RI_RES_WR_IQANDSTINDEX_M 0xfff #define FW_RI_RES_WR_IQANDSTINDEX_V(x) ((x) << FW_RI_RES_WR_IQANDSTINDEX_S) #define FW_RI_RES_WR_IQANDSTINDEX_G(x) \ (((x) >> FW_RI_RES_WR_IQANDSTINDEX_S) & FW_RI_RES_WR_IQANDSTINDEX_M) #define FW_RI_RES_WR_IQDROPRSS_S 15 #define FW_RI_RES_WR_IQDROPRSS_M 0x1 #define FW_RI_RES_WR_IQDROPRSS_V(x) ((x) << FW_RI_RES_WR_IQDROPRSS_S) #define FW_RI_RES_WR_IQDROPRSS_G(x) \ (((x) >> FW_RI_RES_WR_IQDROPRSS_S) & FW_RI_RES_WR_IQDROPRSS_M) #define FW_RI_RES_WR_IQDROPRSS_F FW_RI_RES_WR_IQDROPRSS_V(1U) #define FW_RI_RES_WR_IQGTSMODE_S 14 #define FW_RI_RES_WR_IQGTSMODE_M 0x1 #define FW_RI_RES_WR_IQGTSMODE_V(x) ((x) << FW_RI_RES_WR_IQGTSMODE_S) #define FW_RI_RES_WR_IQGTSMODE_G(x) \ (((x) >> FW_RI_RES_WR_IQGTSMODE_S) & FW_RI_RES_WR_IQGTSMODE_M) #define FW_RI_RES_WR_IQGTSMODE_F FW_RI_RES_WR_IQGTSMODE_V(1U) #define FW_RI_RES_WR_IQPCIECH_S 12 #define FW_RI_RES_WR_IQPCIECH_M 0x3 #define FW_RI_RES_WR_IQPCIECH_V(x) ((x) << FW_RI_RES_WR_IQPCIECH_S) #define FW_RI_RES_WR_IQPCIECH_G(x) \ (((x) >> FW_RI_RES_WR_IQPCIECH_S) 
& FW_RI_RES_WR_IQPCIECH_M) #define FW_RI_RES_WR_IQDCAEN_S 11 #define FW_RI_RES_WR_IQDCAEN_M 0x1 #define FW_RI_RES_WR_IQDCAEN_V(x) ((x) << FW_RI_RES_WR_IQDCAEN_S) #define FW_RI_RES_WR_IQDCAEN_G(x) \ (((x) >> FW_RI_RES_WR_IQDCAEN_S) & FW_RI_RES_WR_IQDCAEN_M) #define FW_RI_RES_WR_IQDCAEN_F FW_RI_RES_WR_IQDCAEN_V(1U) #define FW_RI_RES_WR_IQDCACPU_S 6 #define FW_RI_RES_WR_IQDCACPU_M 0x1f #define FW_RI_RES_WR_IQDCACPU_V(x) ((x) << FW_RI_RES_WR_IQDCACPU_S) #define FW_RI_RES_WR_IQDCACPU_G(x) \ (((x) >> FW_RI_RES_WR_IQDCACPU_S) & FW_RI_RES_WR_IQDCACPU_M) #define FW_RI_RES_WR_IQINTCNTTHRESH_S 4 #define FW_RI_RES_WR_IQINTCNTTHRESH_M 0x3 #define FW_RI_RES_WR_IQINTCNTTHRESH_V(x) \ ((x) << FW_RI_RES_WR_IQINTCNTTHRESH_S) #define FW_RI_RES_WR_IQINTCNTTHRESH_G(x) \ (((x) >> FW_RI_RES_WR_IQINTCNTTHRESH_S) & FW_RI_RES_WR_IQINTCNTTHRESH_M) #define FW_RI_RES_WR_IQO_S 3 #define FW_RI_RES_WR_IQO_M 0x1 #define FW_RI_RES_WR_IQO_V(x) ((x) << FW_RI_RES_WR_IQO_S) #define FW_RI_RES_WR_IQO_G(x) \ (((x) >> FW_RI_RES_WR_IQO_S) & FW_RI_RES_WR_IQO_M) #define FW_RI_RES_WR_IQO_F FW_RI_RES_WR_IQO_V(1U) #define FW_RI_RES_WR_IQCPRIO_S 2 #define FW_RI_RES_WR_IQCPRIO_M 0x1 #define FW_RI_RES_WR_IQCPRIO_V(x) ((x) << FW_RI_RES_WR_IQCPRIO_S) #define FW_RI_RES_WR_IQCPRIO_G(x) \ (((x) >> FW_RI_RES_WR_IQCPRIO_S) & FW_RI_RES_WR_IQCPRIO_M) #define FW_RI_RES_WR_IQCPRIO_F FW_RI_RES_WR_IQCPRIO_V(1U) #define FW_RI_RES_WR_IQESIZE_S 0 #define FW_RI_RES_WR_IQESIZE_M 0x3 #define FW_RI_RES_WR_IQESIZE_V(x) ((x) << FW_RI_RES_WR_IQESIZE_S) #define FW_RI_RES_WR_IQESIZE_G(x) \ (((x) >> FW_RI_RES_WR_IQESIZE_S) & FW_RI_RES_WR_IQESIZE_M) #define FW_RI_RES_WR_IQNS_S 31 #define FW_RI_RES_WR_IQNS_M 0x1 #define FW_RI_RES_WR_IQNS_V(x) ((x) << FW_RI_RES_WR_IQNS_S) #define FW_RI_RES_WR_IQNS_G(x) \ (((x) >> FW_RI_RES_WR_IQNS_S) & FW_RI_RES_WR_IQNS_M) #define FW_RI_RES_WR_IQNS_F FW_RI_RES_WR_IQNS_V(1U) #define FW_RI_RES_WR_IQRO_S 30 #define FW_RI_RES_WR_IQRO_M 0x1 #define FW_RI_RES_WR_IQRO_V(x) ((x) << FW_RI_RES_WR_IQRO_S) #define FW_RI_RES_WR_IQRO_G(x) \ (((x) >> FW_RI_RES_WR_IQRO_S) & FW_RI_RES_WR_IQRO_M) #define FW_RI_RES_WR_IQRO_F FW_RI_RES_WR_IQRO_V(1U) struct fw_ri_rdma_write_wr { __u8 opcode; __u8 flags; __u16 wrid; __u8 r1[3]; __u8 len16; union { struct { __be32 imm_data32; u32 reserved; } ib_imm_data; __be64 imm_data64; } iw_imm_data; __be32 plen; __be32 stag_sink; __be64 to_sink; #ifndef C99_NOT_SUPPORTED union { struct fw_ri_immd immd_src[0]; struct fw_ri_isgl isgl_src[0]; } u; #endif }; struct fw_ri_send_wr { __u8 opcode; __u8 flags; __u16 wrid; __u8 r1[3]; __u8 len16; __be32 sendop_pkd; __be32 stag_inv; __be32 plen; __be32 r3; __be64 r4; #ifndef C99_NOT_SUPPORTED union { struct fw_ri_immd immd_src[0]; struct fw_ri_isgl isgl_src[0]; } u; #endif }; #define FW_RI_SEND_WR_SENDOP_S 0 #define FW_RI_SEND_WR_SENDOP_M 0xf #define FW_RI_SEND_WR_SENDOP_V(x) ((x) << FW_RI_SEND_WR_SENDOP_S) #define FW_RI_SEND_WR_SENDOP_G(x) \ (((x) >> FW_RI_SEND_WR_SENDOP_S) & FW_RI_SEND_WR_SENDOP_M) struct fw_ri_rdma_write_cmpl_wr { __u8 opcode; __u8 flags; __u16 wrid; __u8 r1[3]; __u8 len16; __u8 r2; __u8 flags_send; __u16 wrid_send; __be32 stag_inv; __be32 plen; __be32 stag_sink; __be64 to_sink; union fw_ri_cmpl { struct fw_ri_immd_cmpl { __u8 op; __u8 r1[6]; __u8 immdlen; __u8 data[16]; } immd_src; struct fw_ri_isgl isgl_src; } u_cmpl; __be64 r3; #ifndef C99_NOT_SUPPORTED union fw_ri_write { struct fw_ri_immd immd_src[0]; struct fw_ri_isgl isgl_src[0]; } u; #endif }; struct fw_ri_rdma_read_wr { __u8 opcode; __u8 flags; __u16 wrid; __u8 r1[3]; __u8 len16; __be64 r2; __be32 
stag_sink; __be32 to_sink_hi; __be32 to_sink_lo; __be32 plen; __be32 stag_src; __be32 to_src_hi; __be32 to_src_lo; __be32 r5; }; struct fw_ri_recv_wr { __u8 opcode; __u8 r1; __u16 wrid; __u8 r2[3]; __u8 len16; struct fw_ri_isgl isgl; }; struct fw_ri_bind_mw_wr { __u8 opcode; __u8 flags; __u16 wrid; __u8 r1[3]; __u8 len16; __u8 qpbinde_to_dcacpu; __u8 pgsz_shift; __u8 addr_type; __u8 mem_perms; __be32 stag_mr; __be32 stag_mw; __be32 r3; __be64 len_mw; __be64 va_fbo; __be64 r4; }; #define FW_RI_BIND_MW_WR_QPBINDE_S 6 #define FW_RI_BIND_MW_WR_QPBINDE_M 0x1 #define FW_RI_BIND_MW_WR_QPBINDE_V(x) ((x) << FW_RI_BIND_MW_WR_QPBINDE_S) #define FW_RI_BIND_MW_WR_QPBINDE_G(x) \ (((x) >> FW_RI_BIND_MW_WR_QPBINDE_S) & FW_RI_BIND_MW_WR_QPBINDE_M) #define FW_RI_BIND_MW_WR_QPBINDE_F FW_RI_BIND_MW_WR_QPBINDE_V(1U) #define FW_RI_BIND_MW_WR_NS_S 5 #define FW_RI_BIND_MW_WR_NS_M 0x1 #define FW_RI_BIND_MW_WR_NS_V(x) ((x) << FW_RI_BIND_MW_WR_NS_S) #define FW_RI_BIND_MW_WR_NS_G(x) \ (((x) >> FW_RI_BIND_MW_WR_NS_S) & FW_RI_BIND_MW_WR_NS_M) #define FW_RI_BIND_MW_WR_NS_F FW_RI_BIND_MW_WR_NS_V(1U) #define FW_RI_BIND_MW_WR_DCACPU_S 0 #define FW_RI_BIND_MW_WR_DCACPU_M 0x1f #define FW_RI_BIND_MW_WR_DCACPU_V(x) ((x) << FW_RI_BIND_MW_WR_DCACPU_S) #define FW_RI_BIND_MW_WR_DCACPU_G(x) \ (((x) >> FW_RI_BIND_MW_WR_DCACPU_S) & FW_RI_BIND_MW_WR_DCACPU_M) struct fw_ri_fr_nsmr_wr { __u8 opcode; __u8 flags; __u16 wrid; __u8 r1[3]; __u8 len16; __u8 qpbinde_to_dcacpu; __u8 pgsz_shift; __u8 addr_type; __u8 mem_perms; __be32 stag; __be32 len_hi; __be32 len_lo; __be32 va_hi; __be32 va_lo_fbo; }; #define FW_RI_FR_NSMR_WR_QPBINDE_S 6 #define FW_RI_FR_NSMR_WR_QPBINDE_M 0x1 #define FW_RI_FR_NSMR_WR_QPBINDE_V(x) ((x) << FW_RI_FR_NSMR_WR_QPBINDE_S) #define FW_RI_FR_NSMR_WR_QPBINDE_G(x) \ (((x) >> FW_RI_FR_NSMR_WR_QPBINDE_S) & FW_RI_FR_NSMR_WR_QPBINDE_M) #define FW_RI_FR_NSMR_WR_QPBINDE_F FW_RI_FR_NSMR_WR_QPBINDE_V(1U) #define FW_RI_FR_NSMR_WR_NS_S 5 #define FW_RI_FR_NSMR_WR_NS_M 0x1 #define FW_RI_FR_NSMR_WR_NS_V(x) ((x) << FW_RI_FR_NSMR_WR_NS_S) #define FW_RI_FR_NSMR_WR_NS_G(x) \ (((x) >> FW_RI_FR_NSMR_WR_NS_S) & FW_RI_FR_NSMR_WR_NS_M) #define FW_RI_FR_NSMR_WR_NS_F FW_RI_FR_NSMR_WR_NS_V(1U) #define FW_RI_FR_NSMR_WR_DCACPU_S 0 #define FW_RI_FR_NSMR_WR_DCACPU_M 0x1f #define FW_RI_FR_NSMR_WR_DCACPU_V(x) ((x) << FW_RI_FR_NSMR_WR_DCACPU_S) #define FW_RI_FR_NSMR_WR_DCACPU_G(x) \ (((x) >> FW_RI_FR_NSMR_WR_DCACPU_S) & FW_RI_FR_NSMR_WR_DCACPU_M) struct fw_ri_inv_lstag_wr { __u8 opcode; __u8 flags; __u16 wrid; __u8 r1[3]; __u8 len16; __be32 r2; __be32 stag_inv; }; enum fw_ri_type { FW_RI_TYPE_INIT, FW_RI_TYPE_FINI, FW_RI_TYPE_TERMINATE }; enum fw_ri_init_p2ptype { FW_RI_INIT_P2PTYPE_RDMA_WRITE = FW_RI_RDMA_WRITE, FW_RI_INIT_P2PTYPE_READ_REQ = FW_RI_READ_REQ, FW_RI_INIT_P2PTYPE_SEND = FW_RI_SEND, FW_RI_INIT_P2PTYPE_SEND_WITH_INV = FW_RI_SEND_WITH_INV, FW_RI_INIT_P2PTYPE_SEND_WITH_SE = FW_RI_SEND_WITH_SE, FW_RI_INIT_P2PTYPE_SEND_WITH_SE_INV = FW_RI_SEND_WITH_SE_INV, FW_RI_INIT_P2PTYPE_DISABLED = 0xf, }; enum fw_ri_init_rqeqid_srq { FW_RI_INIT_RQEQID_SRQ = 1 << 31, }; struct fw_ri_wr { __be32 op_compl; __be32 flowid_len16; __u64 cookie; union fw_ri { struct fw_ri_init { __u8 type; __u8 mpareqbit_p2ptype; __u8 r4[2]; __u8 mpa_attrs; __u8 qp_caps; __be16 nrqe; __be32 pdid; __be32 qpid; __be32 sq_eqid; __be32 rq_eqid; __be32 scqid; __be32 rcqid; __be32 ord_max; __be32 ird_max; __be32 iss; __be32 irs; __be32 hwrqsize; __be32 hwrqaddr; __be64 r5; union fw_ri_init_p2p { struct fw_ri_rdma_write_wr write; struct fw_ri_rdma_read_wr read; struct fw_ri_send_wr send; 
} u; } init; struct fw_ri_fini { __u8 type; __u8 r3[7]; __be64 r4; } fini; struct fw_ri_terminate { __u8 type; __u8 r3[3]; __be32 immdlen; __u8 termmsg[40]; } terminate; } u; }; #define FW_RI_WR_MPAREQBIT_S 7 #define FW_RI_WR_MPAREQBIT_M 0x1 #define FW_RI_WR_MPAREQBIT_V(x) ((x) << FW_RI_WR_MPAREQBIT_S) #define FW_RI_WR_MPAREQBIT_G(x) \ (((x) >> FW_RI_WR_MPAREQBIT_S) & FW_RI_WR_MPAREQBIT_M) #define FW_RI_WR_MPAREQBIT_F FW_RI_WR_MPAREQBIT_V(1U) #define FW_RI_WR_P2PTYPE_S 0 #define FW_RI_WR_P2PTYPE_M 0xf #define FW_RI_WR_P2PTYPE_V(x) ((x) << FW_RI_WR_P2PTYPE_S) #define FW_RI_WR_P2PTYPE_G(x) \ (((x) >> FW_RI_WR_P2PTYPE_S) & FW_RI_WR_P2PTYPE_M) #endif /* _T4FW_RI_API_H_ */ rdma-core-56.1/providers/cxgb4/verbs.c000066400000000000000000000526071477342711600176550ustar00rootroot00000000000000/* * Copyright (c) 2006-2016 Chelsio, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
 */

/* Header list reconstructed from what this file actually uses (the
 * original #include targets were lost in extraction): memory allocation,
 * stdio, string ops, pthread spinlocks, errno, mmap, gettimeofday and
 * PRIx64; <config.h> follows the usual rdma-core convention.
 */
#include <config.h>

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <pthread.h>
#include <errno.h>
#include <sys/mman.h>
#include <sys/time.h>
#include <inttypes.h>

#include "libcxgb4.h"
#include "cxgb4-abi.h"

bool is_64b_cqe;

#define MASKED(x) (void *)((unsigned long)(x) & c4iw_page_mask)

int c4iw_query_device(struct ibv_context *context,
		      const struct ibv_query_device_ex_input *input,
		      struct ibv_device_attr_ex *attr, size_t attr_size)
{
	struct ib_uverbs_ex_query_device_resp resp;
	size_t resp_size = sizeof(resp);
	uint64_t raw_fw_ver;
	u8 major, minor, sub_minor, build;
	int ret;

	ret = ibv_cmd_query_device_any(context, input, attr, attr_size,
				       &resp, &resp_size);
	if (ret)
		return ret;

	raw_fw_ver = resp.base.fw_ver;
	major = (raw_fw_ver >> 24) & 0xff;
	minor = (raw_fw_ver >> 16) & 0xff;
	sub_minor = (raw_fw_ver >> 8) & 0xff;
	build = raw_fw_ver & 0xff;
	snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver),
		 "%d.%d.%d.%d", major, minor, sub_minor, build);

	return 0;
}

int c4iw_query_port(struct ibv_context *context, uint8_t port,
		    struct ibv_port_attr *attr)
{
	struct ibv_query_port cmd;

	return ibv_cmd_query_port(context, port, attr, &cmd, sizeof cmd);
}

struct ibv_pd *c4iw_alloc_pd(struct ibv_context *context)
{
	struct ibv_alloc_pd cmd;
	struct uc4iw_alloc_pd_resp resp;
	struct c4iw_pd *pd;

	pd = malloc(sizeof *pd);
	if (!pd)
		return NULL;

	if (ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd, sizeof cmd,
			     &resp.ibv_resp, sizeof resp)) {
		free(pd);
		return NULL;
	}

	return &pd->ibv_pd;
}

int c4iw_free_pd(struct ibv_pd *pd)
{
	int ret;

	ret = ibv_cmd_dealloc_pd(pd);
	if (ret)
		return ret;

	free(pd);
	return 0;
}

struct ibv_mr *c4iw_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
			   uint64_t hca_va, int access)
{
	struct c4iw_mr *mhp;
	struct ibv_reg_mr cmd;
	struct ib_uverbs_reg_mr_resp resp;
	struct c4iw_dev *dev = to_c4iw_dev(pd->context->device);

	PDBG("%s addr %p length %ld hca_va %p\n", __func__, addr, length,
	     hca_va);

	mhp = malloc(sizeof *mhp);
	if (!mhp)
		return NULL;

	if (ibv_cmd_reg_mr(pd, addr, length, hca_va, access, &mhp->vmr,
			   &cmd, sizeof(cmd), &resp, sizeof resp)) {
		free(mhp);
		return NULL;
	}

	mhp->va_fbo = hca_va;
	mhp->len = length;

	PDBG("%s stag 0x%x va_fbo 0x%" PRIx64 " len %d\n", __func__,
	     mhp->vmr.ibv_mr.rkey, mhp->va_fbo, mhp->len);

	pthread_spin_lock(&dev->lock);
	dev->mmid2ptr[c4iw_mmid(mhp->vmr.ibv_mr.lkey)] = mhp;
	pthread_spin_unlock(&dev->lock);
	INC_STAT(mr);
	return &mhp->vmr.ibv_mr;
}

int c4iw_dereg_mr(struct verbs_mr *vmr)
{
	int ret;
	struct c4iw_dev *dev = to_c4iw_dev(vmr->ibv_mr.pd->context->device);

	ret = ibv_cmd_dereg_mr(vmr);
	if (ret)
		return ret;

	pthread_spin_lock(&dev->lock);
	dev->mmid2ptr[c4iw_mmid(vmr->ibv_mr.lkey)] = NULL;
	pthread_spin_unlock(&dev->lock);
	free(to_c4iw_mr(vmr));
	return 0;
}

struct ibv_cq *c4iw_create_cq(struct ibv_context *context, int cqe,
			      struct ibv_comp_channel *channel,
			      int comp_vector)
{
	struct uc4iw_create_cq cmd = {};
	struct uc4iw_create_cq_resp resp;
	struct c4iw_cq *chp;
	struct c4iw_dev *dev = to_c4iw_dev(context->device);
	int ret;

	if (!cqe || cqe > T4_MAX_CQ_DEPTH) {
		errno = EINVAL;
		return NULL;
	}

	chp = calloc(1, sizeof *chp);
	if (!chp) {
		return NULL;
	}

	resp.flags = 0;
	cmd.flags = C4IW_64B_CQE;
	ret = ibv_cmd_create_cq(context, cqe, channel, comp_vector,
				&chp->ibv_cq, &cmd.ibv_cmd, sizeof(cmd),
				&resp.ibv_resp, sizeof resp);
	if (ret)
		goto err1;

	if (resp.flags & C4IW_64B_CQE)
		is_64b_cqe = true;

	pthread_spin_init(&chp->lock, PTHREAD_PROCESS_PRIVATE);
#ifdef STALL_DETECTION
	gettimeofday(&chp->time, NULL);
#endif
	chp->rhp = dev;
	chp->cq.qid_mask = resp.qid_mask;
	chp->cq.cqid = resp.cqid;
	chp->cq.size = resp.size;
	chp->cq.memsize = resp.memsize;
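	/*
	 * Descriptive note (not in the original source): the setup that
	 * follows mmap()s the hardware CQ ring (resp.key) and the GTS
	 * doorbell page (resp.gts_key) exported by the kernel driver in
	 * the create_cq response, then allocates the software shadow
	 * queue used when polling this CQ.
	 */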
chp->cq.gen = 1; chp->cq.queue = mmap(NULL, chp->cq.memsize, PROT_READ|PROT_WRITE, MAP_SHARED, context->cmd_fd, resp.key); if (chp->cq.queue == MAP_FAILED) goto err2; chp->cq.qp_errp = &((struct t4_status_page *) Q_ENTRY(chp->cq.queue, chp->cq.size))->qp_err; chp->cq.ugts = mmap(NULL, c4iw_page_size, PROT_WRITE, MAP_SHARED, context->cmd_fd, resp.gts_key); if (chp->cq.ugts == MAP_FAILED) goto err3; if (dev_is_t4(chp->rhp)) chp->cq.ugts += 1; else chp->cq.ugts += 5; chp->cq.sw_queue = calloc(chp->cq.size, CQE_SIZE(chp->cq.queue)); if (!chp->cq.sw_queue) goto err4; PDBG("%s cqid 0x%x key %" PRIx64 " va %p memsize %lu gts_key %" PRIx64 " va %p qid_mask 0x%x\n", __func__, chp->cq.cqid, resp.key, chp->cq.queue, chp->cq.memsize, resp.gts_key, chp->cq.ugts, chp->cq.qid_mask); pthread_spin_lock(&dev->lock); dev->cqid2ptr[chp->cq.cqid] = chp; pthread_spin_unlock(&dev->lock); INC_STAT(cq); return &chp->ibv_cq; err4: munmap(MASKED(chp->cq.ugts), c4iw_page_size); err3: munmap(chp->cq.queue, chp->cq.memsize); err2: (void)ibv_cmd_destroy_cq(&chp->ibv_cq); err1: free(chp); return NULL; } int c4iw_destroy_cq(struct ibv_cq *ibcq) { int ret; struct c4iw_cq *chp = to_c4iw_cq(ibcq); struct c4iw_dev *dev = to_c4iw_dev(ibcq->context->device); chp->cq.error = 1; ret = ibv_cmd_destroy_cq(ibcq); if (ret) { return ret; } munmap(MASKED(chp->cq.ugts), c4iw_page_size); munmap(chp->cq.queue, chp->cq.memsize); pthread_spin_lock(&dev->lock); dev->cqid2ptr[chp->cq.cqid] = NULL; pthread_spin_unlock(&dev->lock); free(chp->cq.sw_queue); free(chp); return 0; } struct ibv_srq *c4iw_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr) { struct c4iw_dev *dev = to_c4iw_dev(pd->context->device); struct uc4iw_create_srq_resp resp; unsigned long segment_offset; struct ibv_create_srq cmd; struct c4iw_srq *srq; void *dbva; int ret; PDBG("%s enter\n", __func__); srq = calloc(1, sizeof(*srq)); if (!srq) goto err; memset(&resp, 0, sizeof(resp)); ret = ibv_cmd_create_srq(pd, &srq->ibv_srq, attr, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (ret) goto err_free_srq_mem; PDBG("%s srq id 0x%x srq key %" PRIx64 " srq db/gts key %" PRIx64 " qid_mask 0x%x\n", __func__, resp.srqid, resp.srq_key, resp.srq_db_gts_key, resp.qid_mask); srq->rhp = dev; srq->wq.qid = resp.srqid; srq->wq.size = resp.srq_size; srq->wq.memsize = resp.srq_memsize; srq->wq.rqt_abs_idx = resp.rqt_abs_idx; srq->flags = resp.flags; pthread_spin_init(&srq->lock, PTHREAD_PROCESS_PRIVATE); dbva = mmap(NULL, c4iw_page_size, PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.srq_db_gts_key); if (dbva == MAP_FAILED) goto err_destroy_srq; srq->wq.udb = dbva; segment_offset = 128 * (srq->wq.qid & resp.qid_mask); if (segment_offset < c4iw_page_size) { srq->wq.udb += segment_offset / 4; srq->wq.wc_reg_available = 1; } else srq->wq.bar2_qid = srq->wq.qid & resp.qid_mask; srq->wq.udb += 2; srq->wq.queue = mmap(NULL, srq->wq.memsize, PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.srq_key); if (srq->wq.queue == MAP_FAILED) goto err_unmap_udb; srq->wq.sw_rq = calloc(srq->wq.size, sizeof(struct t4_swrqe)); if (!srq->wq.sw_rq) goto err_unmap_queue; srq->wq.pending_wrs = calloc(srq->wq.size, sizeof(*srq->wq.pending_wrs)); if (!srq->wq.pending_wrs) goto err_free_sw_rq; pthread_spin_lock(&dev->lock); list_add_tail(&dev->srq_list, &srq->list); pthread_spin_unlock(&dev->lock); PDBG("%s srq dbva %p srq qva %p srq depth %u srq memsize %lu\n", __func__, srq->wq.udb, srq->wq.queue, srq->wq.size, srq->wq.memsize); INC_STAT(srq); return &srq->ibv_srq; err_free_sw_rq: 
free(srq->wq.sw_rq); err_unmap_queue: munmap((void *)srq->wq.queue, srq->wq.memsize); err_unmap_udb: munmap(MASKED(srq->wq.udb), c4iw_page_size); err_destroy_srq: (void)ibv_cmd_destroy_srq(&srq->ibv_srq); err_free_srq_mem: free(srq); err: return NULL; } int c4iw_modify_srq(struct ibv_srq *ibsrq, struct ibv_srq_attr *attr, int attr_mask) { struct c4iw_srq *srq = to_c4iw_srq(ibsrq); struct ibv_modify_srq cmd; int ret; /* XXX no support for this yet */ if (attr_mask & IBV_SRQ_MAX_WR) return EINVAL; ret = ibv_cmd_modify_srq(ibsrq, attr, attr_mask, &cmd, sizeof(cmd)); if (!ret) { if (attr_mask & IBV_SRQ_LIMIT) { srq->armed = 1; srq->srq_limit = attr->srq_limit; } } return ret; } int c4iw_destroy_srq(struct ibv_srq *ibsrq) { int ret; struct c4iw_srq *srq = to_c4iw_srq(ibsrq); PDBG("%s enter qp %p\n", __func__, ibsrq); ret = ibv_cmd_destroy_srq(ibsrq); if (ret) return ret; pthread_spin_lock(&srq->rhp->lock); list_del(&srq->list); pthread_spin_unlock(&srq->rhp->lock); munmap(MASKED(srq->wq.udb), c4iw_page_size); munmap(srq->wq.queue, srq->wq.memsize); free(srq->wq.pending_wrs); free(srq->wq.sw_rq); free(srq); return 0; } int c4iw_query_srq(struct ibv_srq *ibsrq, struct ibv_srq_attr *attr) { struct ibv_query_srq cmd; return ibv_cmd_query_srq(ibsrq, attr, &cmd, sizeof(cmd)); } static struct ibv_qp *create_qp_v0(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct ibv_create_qp cmd; struct uc4iw_create_qp_v0_resp resp; struct c4iw_qp *qhp; struct c4iw_dev *dev = to_c4iw_dev(pd->context->device); int ret; void *dbva; PDBG("%s enter qp\n", __func__); qhp = calloc(1, sizeof *qhp); if (!qhp) goto err1; memset(&resp, 0, sizeof(resp)); ret = ibv_cmd_create_qp(pd, &qhp->ibv_qp, attr, &cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (ret) goto err2; PDBG("%s sqid 0x%x sq key %" PRIx64 " sq db/gts key %" PRIx64 " rqid 0x%x rq key %" PRIx64 " rq db/gts key %" PRIx64 " qid_mask 0x%x\n", __func__, resp.sqid, resp.sq_key, resp.sq_db_gts_key, resp.rqid, resp.rq_key, resp.rq_db_gts_key, resp.qid_mask); qhp->wq.qid_mask = resp.qid_mask; qhp->rhp = dev; qhp->wq.sq.qid = resp.sqid; qhp->wq.sq.size = resp.sq_size; qhp->wq.sq.memsize = resp.sq_memsize; qhp->wq.sq.flags = 0; qhp->wq.rq.msn = 1; qhp->wq.rq.qid = resp.rqid; qhp->wq.rq.size = resp.rq_size; qhp->wq.rq.memsize = resp.rq_memsize; pthread_spin_init(&qhp->lock, PTHREAD_PROCESS_PRIVATE); dbva = mmap(NULL, c4iw_page_size, PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.sq_db_gts_key); if (dbva == MAP_FAILED) goto err3; qhp->wq.sq.udb = dbva; qhp->wq.sq.queue = mmap(NULL, qhp->wq.sq.memsize, PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.sq_key); if (qhp->wq.sq.queue == MAP_FAILED) goto err4; dbva = mmap(NULL, c4iw_page_size, PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.rq_db_gts_key); if (dbva == MAP_FAILED) goto err5; qhp->wq.rq.udb = dbva; qhp->wq.rq.queue = mmap(NULL, qhp->wq.rq.memsize, PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.rq_key); if (qhp->wq.rq.queue == MAP_FAILED) goto err6; qhp->wq.sq.sw_sq = calloc(qhp->wq.sq.size, sizeof (struct t4_swsqe)); if (!qhp->wq.sq.sw_sq) goto err7; qhp->wq.rq.sw_rq = calloc(qhp->wq.rq.size, sizeof(struct t4_swrqe)); if (!qhp->wq.rq.sw_rq) goto err8; PDBG("%s sq dbva %p sq qva %p sq depth %u sq memsize %lu " " rq dbva %p rq qva %p rq depth %u rq memsize %lu\n", __func__, qhp->wq.sq.udb, qhp->wq.sq.queue, qhp->wq.sq.size, qhp->wq.sq.memsize, qhp->wq.rq.udb, qhp->wq.rq.queue, qhp->wq.rq.size, qhp->wq.rq.memsize); qhp->sq_sig_all = attr->sq_sig_all; pthread_spin_lock(&dev->lock); 
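/* Publish the new QP in the device's qpid-to-QP lookup table under the device lock. */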
dev->qpid2ptr[qhp->wq.sq.qid] = qhp; pthread_spin_unlock(&dev->lock); INC_STAT(qp); return &qhp->ibv_qp; err8: free(qhp->wq.sq.sw_sq); err7: munmap((void *)qhp->wq.rq.queue, qhp->wq.rq.memsize); err6: munmap(MASKED(qhp->wq.rq.udb), c4iw_page_size); err5: munmap((void *)qhp->wq.sq.queue, qhp->wq.sq.memsize); err4: munmap(MASKED(qhp->wq.sq.udb), c4iw_page_size); err3: (void)ibv_cmd_destroy_qp(&qhp->ibv_qp); err2: free(qhp); err1: return NULL; } static struct ibv_qp *create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct ibv_create_qp cmd; struct uc4iw_create_qp_resp resp; struct c4iw_qp *qhp; struct c4iw_dev *dev = to_c4iw_dev(pd->context->device); struct c4iw_context *ctx = to_c4iw_context(pd->context); int ret; void *dbva; PDBG("%s enter qp\n", __func__); qhp = calloc(1, sizeof *qhp); if (!qhp) goto err1; memset(&resp, 0, sizeof(resp)); ret = ibv_cmd_create_qp(pd, &qhp->ibv_qp, attr, &cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (ret) goto err2; PDBG("%s sqid 0x%x sq key %" PRIx64 " sq db/gts key %" PRIx64 " rqid 0x%x rq key %" PRIx64 " rq db/gts key %" PRIx64 " qid_mask 0x%x\n", __func__, resp.sqid, resp.sq_key, resp.sq_db_gts_key, resp.rqid, resp.rq_key, resp.rq_db_gts_key, resp.qid_mask); qhp->wq.qid_mask = resp.qid_mask; qhp->rhp = dev; qhp->wq.sq.qid = resp.sqid; qhp->wq.sq.size = resp.sq_size; qhp->wq.sq.memsize = resp.sq_memsize; qhp->wq.sq.flags = resp.flags & C4IW_QPF_ONCHIP ? T4_SQ_ONCHIP : 0; if (resp.flags & C4IW_QPF_WRITE_W_IMM) qhp->wq.sq.flags |= T4_SQ_WRITE_W_IMM; qhp->wq.sq.flush_cidx = -1; qhp->wq.rq.msn = 1; qhp->srq = to_c4iw_srq(attr->srq); if (!attr->srq) { qhp->wq.rq.qid = resp.rqid; qhp->wq.rq.size = resp.rq_size; qhp->wq.rq.memsize = resp.rq_memsize; } if (ma_wr && resp.sq_memsize < (resp.sq_size + 1) * sizeof *qhp->wq.sq.queue + 16*sizeof(__be64) ) { ma_wr = 0; fprintf(stderr, "libcxgb4 warning - downlevel iw_cxgb4 driver. 
" "MA workaround disabled.\n"); } pthread_spin_init(&qhp->lock, PTHREAD_PROCESS_PRIVATE); dbva = mmap(NULL, c4iw_page_size, PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.sq_db_gts_key); if (dbva == MAP_FAILED) goto err3; qhp->wq.sq.udb = dbva; if (!dev_is_t4(qhp->rhp)) { unsigned long segment_offset = 128 * (qhp->wq.sq.qid & qhp->wq.qid_mask); if (segment_offset < c4iw_page_size) { qhp->wq.sq.udb += segment_offset / 4; qhp->wq.sq.wc_reg_available = 1; } else qhp->wq.sq.bar2_qid = qhp->wq.sq.qid & qhp->wq.qid_mask; qhp->wq.sq.udb += 2; } qhp->wq.sq.queue = mmap(NULL, qhp->wq.sq.memsize, PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.sq_key); if (qhp->wq.sq.queue == MAP_FAILED) goto err4; if (!attr->srq) { dbva = mmap(NULL, c4iw_page_size, PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.rq_db_gts_key); if (dbva == MAP_FAILED) goto err5; qhp->wq.rq.udb = dbva; if (!dev_is_t4(qhp->rhp)) { unsigned long segment_offset = 128 * (qhp->wq.rq.qid & qhp->wq.qid_mask); if (segment_offset < c4iw_page_size) { qhp->wq.rq.udb += segment_offset / 4; qhp->wq.rq.wc_reg_available = 1; } else qhp->wq.rq.bar2_qid = qhp->wq.rq.qid & qhp->wq.qid_mask; qhp->wq.rq.udb += 2; } qhp->wq.rq.queue = mmap(NULL, qhp->wq.rq.memsize, PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.rq_key); if (qhp->wq.rq.queue == MAP_FAILED) goto err6; } qhp->wq.sq.sw_sq = calloc(qhp->wq.sq.size, sizeof (struct t4_swsqe)); if (!qhp->wq.sq.sw_sq) goto err7; if (!attr->srq) { qhp->wq.rq.sw_rq = calloc(qhp->wq.rq.size, sizeof(struct t4_swrqe)); if (!qhp->wq.rq.sw_rq) goto err8; } if (t4_sq_onchip(&qhp->wq)) { qhp->wq.sq.ma_sync = mmap(NULL, c4iw_page_size, PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.ma_sync_key); if (qhp->wq.sq.ma_sync == MAP_FAILED) goto err9; qhp->wq.sq.ma_sync += (A_PCIE_MA_SYNC & (c4iw_page_size - 1)); } if (ctx->status_page_size) { qhp->wq.db_offp = &ctx->status_page->db_off; } else if (!attr->srq) { qhp->wq.db_offp = &qhp->wq.rq.queue[qhp->wq.rq.size].status.db_off; } if (!attr->srq) qhp->wq.qp_errp = &qhp->wq.rq.queue[qhp->wq.rq.size].status.qp_err; else { qhp->wq.qp_errp = &qhp->wq.sq.queue[qhp->wq.sq.size].status.qp_err; qhp->wq.srqidxp = &qhp->wq.sq.queue[qhp->wq.sq.size].status.srqidx; } PDBG("%s sq dbva %p sq qva %p sq depth %u sq memsize %lu " " rq dbva %p rq qva %p rq depth %u rq memsize %lu\n", __func__, qhp->wq.sq.udb, qhp->wq.sq.queue, qhp->wq.sq.size, qhp->wq.sq.memsize, qhp->wq.rq.udb, qhp->wq.rq.queue, qhp->wq.rq.size, qhp->wq.rq.memsize); qhp->sq_sig_all = attr->sq_sig_all; pthread_spin_lock(&dev->lock); dev->qpid2ptr[qhp->wq.sq.qid] = qhp; pthread_spin_unlock(&dev->lock); INC_STAT(qp); return &qhp->ibv_qp; err9: if (!attr->srq) free(qhp->wq.rq.sw_rq); err8: free(qhp->wq.sq.sw_sq); err7: if (!attr->srq) munmap((void *)qhp->wq.rq.queue, qhp->wq.rq.memsize); err6: if (!attr->srq) munmap(MASKED(qhp->wq.rq.udb), c4iw_page_size); err5: munmap((void *)qhp->wq.sq.queue, qhp->wq.sq.memsize); err4: munmap(MASKED(qhp->wq.sq.udb), c4iw_page_size); err3: (void)ibv_cmd_destroy_qp(&qhp->ibv_qp); err2: free(qhp); err1: return NULL; } struct ibv_qp *c4iw_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct c4iw_dev *dev = to_c4iw_dev(pd->context->device); if (dev->abi_version == 0) return create_qp_v0(pd, attr); return create_qp(pd, attr); } static void reset_qp(struct c4iw_qp *qhp) { PDBG("%s enter qp %p\n", __func__, qhp); qhp->wq.sq.cidx = 0; qhp->wq.sq.wq_pidx = qhp->wq.sq.pidx = qhp->wq.sq.in_use = 0; qhp->wq.rq.cidx = qhp->wq.rq.pidx = qhp->wq.rq.in_use = 0; qhp->wq.sq.oldest_read = 
NULL; memset(qhp->wq.sq.queue, 0, qhp->wq.sq.memsize); if (t4_sq_onchip(&qhp->wq)) mmio_flush_writes(); memset(qhp->wq.rq.queue, 0, qhp->wq.rq.memsize); } int c4iw_modify_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask) { struct ibv_modify_qp cmd = {}; struct c4iw_qp *qhp = to_c4iw_qp(ibqp); int ret; PDBG("%s enter qp %p new state %d\n", __func__, ibqp, attr_mask & IBV_QP_STATE ? attr->qp_state : -1); if (t4_wq_in_error(&qhp->wq)) c4iw_flush_qp(qhp); pthread_spin_lock(&qhp->lock); ret = ibv_cmd_modify_qp(ibqp, attr, attr_mask, &cmd, sizeof cmd); if (!ret && (attr_mask & IBV_QP_STATE) && attr->qp_state == IBV_QPS_RESET) reset_qp(qhp); pthread_spin_unlock(&qhp->lock); return ret; } int c4iw_destroy_qp(struct ibv_qp *ibqp) { int ret; struct c4iw_qp *qhp = to_c4iw_qp(ibqp); struct c4iw_dev *dev = to_c4iw_dev(ibqp->context->device); PDBG("%s enter qp %p\n", __func__, ibqp); c4iw_flush_qp(qhp); ret = ibv_cmd_destroy_qp(ibqp); if (ret) { return ret; } if (t4_sq_onchip(&qhp->wq)) { qhp->wq.sq.ma_sync -= (A_PCIE_MA_SYNC & (c4iw_page_size - 1)); munmap((void *)qhp->wq.sq.ma_sync, c4iw_page_size); } munmap(MASKED(qhp->wq.sq.udb), c4iw_page_size); munmap(qhp->wq.sq.queue, qhp->wq.sq.memsize); if (!qhp->srq) { munmap(MASKED(qhp->wq.rq.udb), c4iw_page_size); munmap(qhp->wq.rq.queue, qhp->wq.rq.memsize); } pthread_spin_lock(&dev->lock); dev->qpid2ptr[qhp->wq.sq.qid] = NULL; pthread_spin_unlock(&dev->lock); if (!qhp->srq) free(qhp->wq.rq.sw_rq); free(qhp->wq.sq.sw_sq); free(qhp); return 0; } int c4iw_query_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd; struct c4iw_qp *qhp = to_c4iw_qp(ibqp); int ret; if (t4_wq_in_error(&qhp->wq)) c4iw_flush_qp(qhp); pthread_spin_lock(&qhp->lock); ret = ibv_cmd_query_qp(ibqp, attr, attr_mask, init_attr, &cmd, sizeof cmd); pthread_spin_unlock(&qhp->lock); return ret; } int c4iw_attach_mcast(struct ibv_qp *ibqp, const union ibv_gid *gid, uint16_t lid) { struct c4iw_qp *qhp = to_c4iw_qp(ibqp); int ret; if (t4_wq_in_error(&qhp->wq)) c4iw_flush_qp(qhp); pthread_spin_lock(&qhp->lock); ret = ibv_cmd_attach_mcast(ibqp, gid, lid); pthread_spin_unlock(&qhp->lock); return ret; } int c4iw_detach_mcast(struct ibv_qp *ibqp, const union ibv_gid *gid, uint16_t lid) { struct c4iw_qp *qhp = to_c4iw_qp(ibqp); int ret; if (t4_wq_in_error(&qhp->wq)) c4iw_flush_qp(qhp); pthread_spin_lock(&qhp->lock); ret = ibv_cmd_detach_mcast(ibqp, gid, lid); pthread_spin_unlock(&qhp->lock); return ret; } void c4iw_async_event(struct ibv_context *context, struct ibv_async_event *event) { PDBG("%s type %d obj %p\n", __func__, event->event_type, event->element.cq); switch (event->event_type) { case IBV_EVENT_CQ_ERR: break; case IBV_EVENT_QP_FATAL: case IBV_EVENT_QP_REQ_ERR: case IBV_EVENT_QP_ACCESS_ERR: case IBV_EVENT_PATH_MIG_ERR: { struct c4iw_qp *qhp = to_c4iw_qp(event->element.qp); c4iw_flush_qp(qhp); break; } case IBV_EVENT_SQ_DRAINED: case IBV_EVENT_PATH_MIG: case IBV_EVENT_COMM_EST: case IBV_EVENT_QP_LAST_WQE_REACHED: default: break; } } rdma-core-56.1/providers/efa/000077500000000000000000000000001477342711600161025ustar00rootroot00000000000000rdma-core-56.1/providers/efa/CMakeLists.txt000066400000000000000000000006411477342711600206430ustar00rootroot00000000000000if (ENABLE_LTTNG AND LTTNGUST_FOUND) set(TRACE_FILE efa_trace.c) endif() rdma_shared_provider(efa libefa.map 1 1.3.${PACKAGE_VERSION} ${TRACE_FILE} efa.c verbs.c ) publish_headers(infiniband efadv.h ) rdma_pkg_config("efa" "libibverbs" 
"${CMAKE_THREAD_LIBS_INIT}") if (ENABLE_LTTNG AND LTTNGUST_FOUND) target_include_directories(efa PUBLIC ".") target_link_libraries(efa LINK_PRIVATE LTTng::UST) endif() rdma-core-56.1/providers/efa/efa-abi.h000066400000000000000000000017161477342711600175440ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ /* * Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All rights reserved. */ #ifndef __EFA_ABI_H__ #define __EFA_ABI_H__ #include #include #include #define EFA_ABI_VERSION 1 DECLARE_DRV_CMD(efa_alloc_ucontext, IB_USER_VERBS_CMD_GET_CONTEXT, efa_ibv_alloc_ucontext_cmd, efa_ibv_alloc_ucontext_resp); DECLARE_DRV_CMD(efa_alloc_pd, IB_USER_VERBS_CMD_ALLOC_PD, empty, efa_ibv_alloc_pd_resp); DECLARE_DRV_CMD(efa_create_cq, IB_USER_VERBS_EX_CMD_CREATE_CQ, efa_ibv_create_cq, efa_ibv_create_cq_resp); DECLARE_DRV_CMD(efa_create_qp, IB_USER_VERBS_CMD_CREATE_QP, efa_ibv_create_qp, efa_ibv_create_qp_resp); DECLARE_DRV_CMD(efa_create_ah, IB_USER_VERBS_CMD_CREATE_AH, empty, efa_ibv_create_ah_resp); DECLARE_DRV_CMD(efa_query_device_ex, IB_USER_VERBS_EX_CMD_QUERY_DEVICE, empty, efa_ibv_ex_query_device_resp); #endif /* __EFA_ABI_H__ */ rdma-core-56.1/providers/efa/efa.c000066400000000000000000000101201477342711600167730ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause /* * Copyright 2019-2024 Amazon.com, Inc. or its affiliates. All rights reserved. */ #include #include #include #include #include #include #include "efa.h" #include "verbs.h" static void efa_free_context(struct ibv_context *ibvctx); #define PCI_VENDOR_ID_AMAZON 0x1d0f static const struct verbs_match_ent efa_table[] = { VERBS_DRIVER_ID(RDMA_DRIVER_EFA), VERBS_PCI_MATCH(PCI_VENDOR_ID_AMAZON, 0xefa0, NULL), VERBS_PCI_MATCH(PCI_VENDOR_ID_AMAZON, 0xefa1, NULL), VERBS_PCI_MATCH(PCI_VENDOR_ID_AMAZON, 0xefa2, NULL), VERBS_PCI_MATCH(PCI_VENDOR_ID_AMAZON, 0xefa3, NULL), {} }; static const struct verbs_context_ops efa_ctx_ops = { .alloc_pd = efa_alloc_pd, .create_ah = efa_create_ah, .create_cq = efa_create_cq, .create_cq_ex = efa_create_cq_ex, .create_qp = efa_create_qp, .create_qp_ex = efa_create_qp_ex, .cq_event = efa_cq_event, .dealloc_pd = efa_dealloc_pd, .dereg_mr = efa_dereg_mr, .destroy_ah = efa_destroy_ah, .destroy_cq = efa_destroy_cq, .destroy_qp = efa_destroy_qp, .modify_qp = efa_modify_qp, .poll_cq = efa_poll_cq, .post_recv = efa_post_recv, .post_send = efa_post_send, .query_device_ex = efa_query_device_ex, .query_port = efa_query_port, .query_qp = efa_query_qp, .query_qp_data_in_order = efa_query_qp_data_in_order, .reg_dmabuf_mr = efa_reg_dmabuf_mr, .reg_mr = efa_reg_mr, .req_notify_cq = efa_arm_cq, .free_context = efa_free_context, }; static struct verbs_context *efa_alloc_context(struct ibv_device *vdev, int cmd_fd, void *private_data) { struct efa_alloc_ucontext_resp resp = {}; struct efa_alloc_ucontext cmd = {}; struct efa_context *ctx; cmd.comp_mask |= EFA_ALLOC_UCONTEXT_CMD_COMP_TX_BATCH; cmd.comp_mask |= EFA_ALLOC_UCONTEXT_CMD_COMP_MIN_SQ_WR; ctx = verbs_init_and_alloc_context(vdev, cmd_fd, ctx, ibvctx, RDMA_DRIVER_EFA); if (!ctx) return NULL; if (ibv_cmd_get_context(&ctx->ibvctx, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) { verbs_err(&ctx->ibvctx, "ibv_cmd_get_context failed\n"); goto err_free_ctx; } ctx->sub_cqs_per_cq = resp.sub_cqs_per_cq; ctx->cmds_supp_udata_mask = resp.cmds_supp_udata_mask; ctx->cqe_size = sizeof(struct efa_io_rx_cdesc); ctx->ex_cqe_size = sizeof(struct efa_io_rx_cdesc_ex); ctx->inline_buf_size = resp.inline_buf_size; 
ctx->max_llq_size = resp.max_llq_size; ctx->max_tx_batch = resp.max_tx_batch; ctx->min_sq_wr = resp.min_sq_wr; pthread_spin_init(&ctx->qp_table_lock, PTHREAD_PROCESS_PRIVATE); /* ah udata is mandatory for ah number retrieval */ if (!(ctx->cmds_supp_udata_mask & EFA_USER_CMDS_SUPP_UDATA_CREATE_AH)) { verbs_err(&ctx->ibvctx, "Kernel does not support AH udata\n"); goto err_free_spinlock; } verbs_set_ops(&ctx->ibvctx, &efa_ctx_ops); if (efa_query_device_ctx(ctx)) goto err_free_spinlock; return &ctx->ibvctx; err_free_spinlock: pthread_spin_destroy(&ctx->qp_table_lock); err_free_ctx: verbs_uninit_context(&ctx->ibvctx); free(ctx); return NULL; } static void efa_free_context(struct ibv_context *ibvctx) { struct efa_context *ctx = to_efa_context(ibvctx); free(ctx->qp_table); pthread_spin_destroy(&ctx->qp_table_lock); verbs_uninit_context(&ctx->ibvctx); free(ctx); } static struct verbs_device *efa_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct efa_dev *dev; dev = calloc(1, sizeof(*dev)); if (!dev) return NULL; dev->pg_sz = sysconf(_SC_PAGESIZE); return &dev->vdev; } static void efa_uninit_device(struct verbs_device *verbs_device) { struct efa_dev *dev = to_efa_dev(&verbs_device->device); free(dev); } static const struct verbs_device_ops efa_dev_ops = { .name = "efa", .match_min_abi_version = EFA_ABI_VERSION, .match_max_abi_version = EFA_ABI_VERSION, .match_table = efa_table, .alloc_device = efa_device_alloc, .uninit_device = efa_uninit_device, .alloc_context = efa_alloc_context, }; bool is_efa_dev(struct ibv_device *device) { struct verbs_device *verbs_device = verbs_get_device(device); return verbs_device->ops == &efa_dev_ops; } PROVIDER_DRIVER(efa, efa_dev_ops); rdma-core-56.1/providers/efa/efa.h000066400000000000000000000112261477342711600170100ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ /* * Copyright 2019-2025 Amazon.com, Inc. or its affiliates. All rights reserved. */ #ifndef __EFA_H__ #define __EFA_H__ #include #include #include #include #include #include "efa-abi.h" #include "efa_io_defs.h" #include "efadv.h" #define EFA_GET(ptr, mask) FIELD_GET(mask##_MASK, *(ptr)) #define EFA_SET(ptr, mask, value) \ ({ \ typeof(ptr) _ptr = ptr; \ *_ptr = (*_ptr & ~(mask##_MASK)) | \ FIELD_PREP(mask##_MASK, value); \ }) struct efa_context { struct verbs_context ibvctx; uint32_t cmds_supp_udata_mask; uint16_t sub_cqs_per_cq; uint16_t inline_buf_size; uint32_t max_llq_size; uint32_t device_caps; uint32_t max_sq_wr; uint32_t max_rq_wr; uint16_t max_sq_sge; uint16_t max_rq_sge; uint32_t max_rdma_size; uint16_t max_wr_rdma_sge; uint16_t max_tx_batch; uint16_t min_sq_wr; size_t cqe_size; size_t ex_cqe_size; struct efa_qp **qp_table; unsigned int qp_table_sz_m1; pthread_spinlock_t qp_table_lock; }; struct efa_pd { struct ibv_pd ibvpd; uint16_t pdn; }; struct efa_sub_cq { uint16_t consumed_cnt; int phase; uint8_t *buf; int qmask; int cqe_size; uint32_t ref_cnt; }; struct efa_cq { struct verbs_cq verbs_cq; struct efadv_cq dv_cq; uint32_t cqn; size_t cqe_size; uint8_t *buf; size_t buf_size; uint32_t *db; uint8_t *db_mmap_addr; uint16_t cc; /* Consumer Counter */ uint8_t cmd_sn; uint16_t num_sub_cqs; /* Index of next sub cq idx to poll. 
This is used to guarantee fairness for sub cqs */ uint16_t next_poll_idx; pthread_spinlock_t lock; struct efa_wq *cur_wq; struct efa_io_cdesc_common *cur_cqe; struct ibv_device *dev; struct efa_sub_cq sub_cq_arr[]; }; struct efa_wq { uint64_t *wrid; /* wrid_idx_pool: Pool of free indexes in the wrid array, used to select the * wrid entry to be used to hold the next tx packet's context. * At init time, entry N will hold value N, as OOO tx-completions arrive, * the value stored in a given entry might not equal the entry's index. */ uint32_t *wrid_idx_pool; uint32_t wqe_cnt; uint32_t wqe_posted; uint32_t wqe_completed; uint16_t pc; /* Producer counter */ uint16_t desc_mask; /* wrid_idx_pool_next: Index of the next entry to use in wrid_idx_pool. */ uint16_t wrid_idx_pool_next; int max_sge; int phase; pthread_spinlock_t wqlock; uint32_t *db; uint16_t sub_cq_idx; }; struct efa_rq { struct efa_wq wq; uint8_t *buf; size_t buf_size; }; struct efa_sq { struct efa_wq wq; uint8_t *desc; uint32_t desc_offset; size_t desc_ring_mmap_size; size_t max_inline_data; size_t max_wr_rdma_sge; uint16_t max_batch_wr; /* Buffer for pending WR entries in the current session */ uint8_t *local_queue; /* Number of WR entries posted in the current session */ uint32_t num_wqe_pending; /* Phase before current session */ int phase_rb; /* Current wqe being built */ struct efa_io_tx_wqe *curr_tx_wqe; }; struct efa_qp { struct verbs_qp verbs_qp; struct efa_sq sq; struct efa_rq rq; int page_size; int sq_sig_all; int wr_session_err; struct ibv_device *dev; }; struct efa_mr { struct verbs_mr vmr; }; struct efa_ah { struct ibv_ah ibvah; uint16_t efa_ah; }; struct efa_dev { struct verbs_device vdev; uint32_t pg_sz; }; static inline struct efa_dev *to_efa_dev(struct ibv_device *ibvdev) { return container_of(ibvdev, struct efa_dev, vdev.device); } static inline struct efa_context *to_efa_context(struct ibv_context *ibvctx) { return container_of(ibvctx, struct efa_context, ibvctx.context); } static inline struct efa_pd *to_efa_pd(struct ibv_pd *ibvpd) { return container_of(ibvpd, struct efa_pd, ibvpd); } static inline struct efa_cq *to_efa_cq(struct ibv_cq *ibvcq) { return container_of(ibvcq, struct efa_cq, verbs_cq.cq); } static inline struct efa_cq *to_efa_cq_ex(struct ibv_cq_ex *ibvcqx) { return container_of(ibvcqx, struct efa_cq, verbs_cq.cq_ex); } static inline struct efa_cq *efadv_cq_to_efa_cq(struct efadv_cq *efadv_cq) { return container_of(efadv_cq, struct efa_cq, dv_cq); } static inline struct efa_qp *to_efa_qp(struct ibv_qp *ibvqp) { return container_of(ibvqp, struct efa_qp, verbs_qp.qp); } static inline struct efa_qp *to_efa_qp_ex(struct ibv_qp_ex *ibvqpx) { return container_of(ibvqpx, struct efa_qp, verbs_qp.qp_ex); } static inline struct efa_ah *to_efa_ah(struct ibv_ah *ibvah) { return container_of(ibvah, struct efa_ah, ibvah); } bool is_efa_dev(struct ibv_device *device); #endif /* __EFA_H__ */ rdma-core-56.1/providers/efa/efa_io_defs.h000066400000000000000000000201421477342711600204750ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ /* * Copyright 2018-2024 Amazon.com, Inc. or its affiliates. All rights reserved. 
*/ #ifndef _EFA_IO_H_ #define _EFA_IO_H_ #define EFA_IO_TX_DESC_NUM_BUFS 2 #define EFA_IO_TX_DESC_NUM_RDMA_BUFS 1 #define EFA_IO_TX_DESC_INLINE_MAX_SIZE 32 #define EFA_IO_TX_DESC_IMM_DATA_SIZE 4 enum efa_io_queue_type { /* send queue (of a QP) */ EFA_IO_SEND_QUEUE = 1, /* recv queue (of a QP) */ EFA_IO_RECV_QUEUE = 2, }; enum efa_io_send_op_type { /* send message */ EFA_IO_SEND = 0, /* RDMA read */ EFA_IO_RDMA_READ = 1, /* RDMA write */ EFA_IO_RDMA_WRITE = 2, }; enum efa_io_comp_status { /* Successful completion */ EFA_IO_COMP_STATUS_OK = 0, /* Flushed during QP destroy */ EFA_IO_COMP_STATUS_FLUSHED = 1, /* Internal QP error */ EFA_IO_COMP_STATUS_LOCAL_ERROR_QP_INTERNAL_ERROR = 2, /* Unsupported operation */ EFA_IO_COMP_STATUS_LOCAL_ERROR_UNSUPPORTED_OP = 3, /* Bad AH */ EFA_IO_COMP_STATUS_LOCAL_ERROR_INVALID_AH = 4, /* LKEY not registered or does not match IOVA */ EFA_IO_COMP_STATUS_LOCAL_ERROR_INVALID_LKEY = 5, /* Message too long */ EFA_IO_COMP_STATUS_LOCAL_ERROR_BAD_LENGTH = 6, /* RKEY not registered or does not match remote IOVA */ EFA_IO_COMP_STATUS_REMOTE_ERROR_BAD_ADDRESS = 7, /* Connection was reset by remote side */ EFA_IO_COMP_STATUS_REMOTE_ERROR_ABORT = 8, /* Bad dest QP number (QP does not exist or is in error state) */ EFA_IO_COMP_STATUS_REMOTE_ERROR_BAD_DEST_QPN = 9, /* Destination resource not ready (no WQEs posted on RQ) */ EFA_IO_COMP_STATUS_REMOTE_ERROR_RNR = 10, /* Receiver SGL too short */ EFA_IO_COMP_STATUS_REMOTE_ERROR_BAD_LENGTH = 11, /* Unexpected status returned by responder */ EFA_IO_COMP_STATUS_REMOTE_ERROR_BAD_STATUS = 12, /* Unresponsive remote - was previously responsive */ EFA_IO_COMP_STATUS_LOCAL_ERROR_UNRESP_REMOTE = 13, /* No valid AH at remote side (required for RDMA operations) */ EFA_IO_COMP_STATUS_REMOTE_ERROR_UNKNOWN_PEER = 14, /* Unreachable remote - never received a response */ EFA_IO_COMP_STATUS_LOCAL_ERROR_UNREACH_REMOTE = 15, }; struct efa_io_tx_meta_desc { /* Verbs-generated Request ID */ uint16_t req_id; /* * control flags * 3:0 : op_type - enum efa_io_send_op_type * 4 : has_imm - immediate_data field carries valid * data. * 5 : inline_msg - inline mode - inline message data * follows this descriptor (no buffer descriptors). * Note that it is different from immediate data * 6 : meta_extension - Extended metadata. MBZ * 7 : meta_desc - Indicates metadata descriptor. * Must be set. */ uint8_t ctrl1; /* * control flags * 0 : phase * 1 : reserved25 - MBZ * 2 : first - Indicates first descriptor in * transaction. Must be set. * 3 : last - Indicates last descriptor in * transaction. Must be set. * 4 : comp_req - Indicates whether completion should * be posted, after packet is transmitted. Valid only * for the first descriptor * 7:5 : reserved29 - MBZ */ uint8_t ctrl2; uint16_t dest_qp_num; /* * If inline_msg bit is set, length of inline message in bytes, * otherwise length of SGL (number of buffers). */ uint16_t length; /* * immediate data: if has_imm is set, then this field is included * within Tx message and reported in remote Rx completion. */ uint32_t immediate_data; uint16_t ah; uint16_t reserved; /* Queue key */ uint32_t qkey; uint8_t reserved2[12]; }; /* * Tx queue buffer descriptor, for any transport type. Preceded by metadata * descriptor. 
*/ struct efa_io_tx_buf_desc { /* length in bytes */ uint32_t length; /* * 23:0 : lkey - local memory translation key * 31:24 : reserved - MBZ */ uint32_t lkey; /* Buffer address bits[31:0] */ uint32_t buf_addr_lo; /* Buffer address bits[63:32] */ uint32_t buf_addr_hi; }; struct efa_io_remote_mem_addr { /* length in bytes */ uint32_t length; /* remote memory translation key */ uint32_t rkey; /* Buffer address bits[31:0] */ uint32_t buf_addr_lo; /* Buffer address bits[63:32] */ uint32_t buf_addr_hi; }; struct efa_io_rdma_req { /* Remote memory address */ struct efa_io_remote_mem_addr remote_mem; /* Local memory address */ struct efa_io_tx_buf_desc local_mem[1]; }; /* * Tx WQE, composed of tx meta descriptors followed by either tx buffer * descriptors or inline data */ struct efa_io_tx_wqe { /* TX meta */ struct efa_io_tx_meta_desc meta; union { /* Send buffer descriptors */ struct efa_io_tx_buf_desc sgl[2]; uint8_t inline_data[32]; /* RDMA local and remote memory addresses */ struct efa_io_rdma_req rdma_req; } data; }; /* * Rx buffer descriptor; RX WQE is composed of one or more RX buffer * descriptors. */ struct efa_io_rx_desc { /* Buffer address bits[31:0] */ uint32_t buf_addr_lo; /* Buffer Pointer[63:32] */ uint32_t buf_addr_hi; /* Verbs-generated request id. */ uint16_t req_id; /* Length in bytes. */ uint16_t length; /* * LKey and control flags * 23:0 : lkey * 29:24 : reserved - MBZ * 30 : first - Indicates first descriptor in WQE * 31 : last - Indicates last descriptor in WQE */ uint32_t lkey_ctrl; }; /* Common IO completion descriptor */ struct efa_io_cdesc_common { /* * verbs-generated request ID, as provided in the completed tx or rx * descriptor. */ uint16_t req_id; uint8_t status; /* * flags * 0 : phase - Phase bit * 2:1 : q_type - enum efa_io_queue_type: send/recv * 3 : has_imm - indicates that immediate data is * present - for RX completions only * 6:4 : op_type - enum efa_io_send_op_type * 7 : unsolicited - indicates that there is no * matching request - for RDMA with imm. RX only */ uint8_t flags; /* local QP number */ uint16_t qp_num; }; /* Tx completion descriptor */ struct efa_io_tx_cdesc { /* Common completion info */ struct efa_io_cdesc_common common; /* MBZ */ uint16_t reserved16; }; /* Rx Completion Descriptor */ struct efa_io_rx_cdesc { /* Common completion info */ struct efa_io_cdesc_common common; /* Transferred length bits[15:0] */ uint16_t length; /* Remote Address Handle FW index, 0xFFFF indicates invalid ah */ uint16_t ah; uint16_t src_qp_num; /* Immediate data */ uint32_t imm; }; /* Rx Completion Descriptor RDMA write info */ struct efa_io_rx_cdesc_rdma_write { /* Transferred length bits[31:16] */ uint16_t length_hi; }; /* Extended Rx Completion Descriptor */ struct efa_io_rx_cdesc_ex { /* Base RX completion info */ struct efa_io_rx_cdesc base; union { struct efa_io_rx_cdesc_rdma_write rdma_write; /* * Valid only in case of unknown AH (0xFFFF) and CQ * set_src_addr is enabled. 
*/ uint8_t src_addr[16]; } u; }; /* tx_meta_desc */ #define EFA_IO_TX_META_DESC_OP_TYPE_MASK GENMASK(3, 0) #define EFA_IO_TX_META_DESC_HAS_IMM_MASK BIT(4) #define EFA_IO_TX_META_DESC_INLINE_MSG_MASK BIT(5) #define EFA_IO_TX_META_DESC_META_EXTENSION_MASK BIT(6) #define EFA_IO_TX_META_DESC_META_DESC_MASK BIT(7) #define EFA_IO_TX_META_DESC_PHASE_MASK BIT(0) #define EFA_IO_TX_META_DESC_FIRST_MASK BIT(2) #define EFA_IO_TX_META_DESC_LAST_MASK BIT(3) #define EFA_IO_TX_META_DESC_COMP_REQ_MASK BIT(4) /* tx_buf_desc */ #define EFA_IO_TX_BUF_DESC_LKEY_MASK GENMASK(23, 0) /* rx_desc */ #define EFA_IO_RX_DESC_LKEY_MASK GENMASK(23, 0) #define EFA_IO_RX_DESC_FIRST_MASK BIT(30) #define EFA_IO_RX_DESC_LAST_MASK BIT(31) /* cdesc_common */ #define EFA_IO_CDESC_COMMON_PHASE_MASK BIT(0) #define EFA_IO_CDESC_COMMON_Q_TYPE_MASK GENMASK(2, 1) #define EFA_IO_CDESC_COMMON_HAS_IMM_MASK BIT(3) #define EFA_IO_CDESC_COMMON_OP_TYPE_MASK GENMASK(6, 4) #define EFA_IO_CDESC_COMMON_UNSOLICITED_MASK BIT(7) #endif /* _EFA_IO_H_ */ rdma-core-56.1/providers/efa/efa_io_regs_defs.h000066400000000000000000000006771477342711600215300ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ /* * Copyright 2021 Amazon.com, Inc. or its affiliates. All rights reserved. */ #ifndef _EFA_IO_REGS_H_ #define _EFA_IO_REGS_H_ /* cq_db register */ #define EFA_IO_REGS_CQ_DB_CONSUMER_INDEX_MASK 0xffff #define EFA_IO_REGS_CQ_DB_CMD_SN_MASK 0x60000000 #define EFA_IO_REGS_CQ_DB_ARM_MASK 0x80000000 #endif /* _EFA_IO_REGS_H_ */ rdma-core-56.1/providers/efa/efa_trace.c000066400000000000000000000003611477342711600201570ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ /* * Copyright 2023 Amazon.com, Inc. or its affiliates. All rights reserved. */ #define LTTNG_UST_TRACEPOINT_CREATE_PROBES #define LTTNG_UST_TRACEPOINT_DEFINE #include "efa_trace.h" rdma-core-56.1/providers/efa/efa_trace.h000066400000000000000000000052611477342711600201700ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ /* * Copyright 2023-2024 Amazon.com, Inc. or its affiliates. All rights reserved. 
*/ #if defined(LTTNG_ENABLED) #undef LTTNG_UST_TRACEPOINT_PROVIDER #define LTTNG_UST_TRACEPOINT_PROVIDER rdma_core_efa #undef LTTNG_UST_TRACEPOINT_INCLUDE #define LTTNG_UST_TRACEPOINT_INCLUDE "efa_trace.h" #if !defined(__EFA_TRACE_H__) || defined(LTTNG_UST_TRACEPOINT_HEADER_MULTI_READ) #define __EFA_TRACE_H__ #include #include LTTNG_UST_TRACEPOINT_EVENT( /* Tracepoint provider name */ rdma_core_efa, /* Tracepoint name */ post_recv, /* Input arguments */ LTTNG_UST_TP_ARGS( char *, dev_name, uint64_t, wr_id, uint32_t, qp_num, int, num_sge ), /* Output event fields */ LTTNG_UST_TP_FIELDS( lttng_ust_field_string(dev_name, dev_name) lttng_ust_field_integer(uint64_t, wr_id, wr_id) lttng_ust_field_integer(uint32_t, qp_num, qp_num) lttng_ust_field_integer(int, num_sge, num_sge) ) ) LTTNG_UST_TRACEPOINT_EVENT( /* Tracepoint provider name */ rdma_core_efa, /* Tracepoint name */ post_send, /* Input arguments */ LTTNG_UST_TP_ARGS( char *, dev_name, uint64_t, wr_id, uint8_t, op_type, uint32_t, src_qp_num, uint32_t, dst_qp_num, uint16_t, ah_num ), /* Output event fields */ LTTNG_UST_TP_FIELDS( lttng_ust_field_string(dev_name, dev_name) lttng_ust_field_integer(uint64_t, wr_id, wr_id) lttng_ust_field_integer(uint8_t, op_type, op_type) lttng_ust_field_integer(uint32_t, src_qp_num, src_qp_num) lttng_ust_field_integer(uint32_t, dst_qp_num, dst_qp_num) lttng_ust_field_integer(uint16_t, ah_num, ah_num) ) ) LTTNG_UST_TRACEPOINT_EVENT( /* Tracepoint provider name */ rdma_core_efa, /* Tracepoint name */ process_completion, /* Input arguments */ LTTNG_UST_TP_ARGS( char *, dev_name, uint64_t, wr_id, int, status, int, opcode, uint32_t, src_qp_num, uint32_t, dst_qp_num, uint16_t, ah_num, uint32_t, length ), /* Output event fields */ LTTNG_UST_TP_FIELDS( lttng_ust_field_string(dev_name, dev_name) lttng_ust_field_integer(uint64_t, wr_id, wr_id) lttng_ust_field_integer(int, status, status) lttng_ust_field_integer(int, opcode, opcode) lttng_ust_field_integer(uint32_t, src_qp_num, src_qp_num) lttng_ust_field_integer(uint32_t, dst_qp_num, dst_qp_num) lttng_ust_field_integer(uint16_t, ah_num, ah_num) lttng_ust_field_integer(uint32_t, length, length) ) ) #define rdma_tracepoint(arg...) lttng_ust_tracepoint(arg) #endif /* __EFA_TRACE_H__*/ #include #else #ifndef __EFA_TRACE_H__ #define __EFA_TRACE_H__ #define rdma_tracepoint(arg...) #endif /* __EFA_TRACE_H__*/ #endif /* defined(LTTNG_ENABLED) */ rdma-core-56.1/providers/efa/efadv.h000066400000000000000000000056371477342711600173530ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ /* * Copyright 2019-2024 Amazon.com, Inc. or its affiliates. All rights reserved. 
*/ #ifndef __EFADV_H__ #define __EFADV_H__ #include #include #include #include #ifdef __cplusplus extern "C" { #endif enum { /* Values must match the values in efa-abi.h */ EFADV_QP_DRIVER_TYPE_SRD = 0, }; struct ibv_qp *efadv_create_driver_qp(struct ibv_pd *ibvpd, struct ibv_qp_init_attr *attr, uint32_t driver_qp_type); enum { EFADV_QP_FLAGS_UNSOLICITED_WRITE_RECV = 1 << 0, }; struct efadv_qp_init_attr { uint64_t comp_mask; uint32_t driver_qp_type; uint16_t flags; uint8_t sl; uint8_t reserved[1]; }; struct ibv_qp *efadv_create_qp_ex(struct ibv_context *ibvctx, struct ibv_qp_init_attr_ex *attr_ex, struct efadv_qp_init_attr *efa_attr, uint32_t inlen); enum { EFADV_DEVICE_ATTR_CAPS_RDMA_READ = 1 << 0, EFADV_DEVICE_ATTR_CAPS_RNR_RETRY = 1 << 1, EFADV_DEVICE_ATTR_CAPS_CQ_WITH_SGID = 1 << 2, EFADV_DEVICE_ATTR_CAPS_RDMA_WRITE = 1 << 3, EFADV_DEVICE_ATTR_CAPS_UNSOLICITED_WRITE_RECV = 1 << 4, }; struct efadv_device_attr { uint64_t comp_mask; uint32_t max_sq_wr; uint32_t max_rq_wr; uint16_t max_sq_sge; uint16_t max_rq_sge; uint16_t inline_buf_size; uint8_t reserved[2]; uint32_t device_caps; uint32_t max_rdma_size; }; int efadv_query_device(struct ibv_context *ibvctx, struct efadv_device_attr *attr, uint32_t inlen); struct efadv_ah_attr { uint64_t comp_mask; uint16_t ahn; uint8_t reserved[6]; }; int efadv_query_ah(struct ibv_ah *ibvah, struct efadv_ah_attr *attr, uint32_t inlen); struct efadv_cq { uint64_t comp_mask; int (*wc_read_sgid)(struct efadv_cq *efadv_cq, union ibv_gid *sgid); bool (*wc_is_unsolicited)(struct efadv_cq *efadv_cq); }; enum { EFADV_WC_EX_WITH_SGID = 1 << 0, EFADV_WC_EX_WITH_IS_UNSOLICITED = 1 << 1, }; struct efadv_cq_init_attr { uint64_t comp_mask; uint64_t wc_flags; }; struct ibv_cq_ex *efadv_create_cq(struct ibv_context *ibvctx, struct ibv_cq_init_attr_ex *attr_ex, struct efadv_cq_init_attr *efa_attr, uint32_t inlen); struct efadv_cq *efadv_cq_from_ibv_cq_ex(struct ibv_cq_ex *ibvcqx); static inline int efadv_wc_read_sgid(struct efadv_cq *efadv_cq, union ibv_gid *sgid) { return efadv_cq->wc_read_sgid(efadv_cq, sgid); } static inline bool efadv_wc_is_unsolicited(struct efadv_cq *efadv_cq) { return efadv_cq->wc_is_unsolicited(efadv_cq); } enum { EFADV_MR_ATTR_VALIDITY_RECV_IC_ID = 1 << 0, EFADV_MR_ATTR_VALIDITY_RDMA_READ_IC_ID = 1 << 1, EFADV_MR_ATTR_VALIDITY_RDMA_RECV_IC_ID = 1 << 2, }; struct efadv_mr_attr { uint64_t comp_mask; uint16_t ic_id_validity; uint16_t recv_ic_id; uint16_t rdma_read_ic_id; uint16_t rdma_recv_ic_id; }; int efadv_query_mr(struct ibv_mr *ibvmr, struct efadv_mr_attr *attr, uint32_t inlen); #ifdef __cplusplus } #endif #endif /* __EFADV_H__ */ rdma-core-56.1/providers/efa/libefa.map000066400000000000000000000006151477342711600200250ustar00rootroot00000000000000/* Export symbols should be added below according to Documentation/versioning.md document. 
*/ EFA_1.0 { global: efadv_create_driver_qp; local: *; }; EFA_1.1 { global: efadv_create_qp_ex; efadv_query_ah; efadv_query_device; } EFA_1.0; EFA_1.2 { global: efadv_cq_from_ibv_cq_ex; efadv_create_cq; efadv_wc_read_sgid; } EFA_1.1; EFA_1.3 { global: efadv_query_mr; } EFA_1.2; rdma-core-56.1/providers/efa/man/000077500000000000000000000000001477342711600166555ustar00rootroot00000000000000rdma-core-56.1/providers/efa/man/CMakeLists.txt000066400000000000000000000002351477342711600214150ustar00rootroot00000000000000rdma_man_pages( efadv.7.md efadv_create_driver_qp.3.md efadv_create_qp_ex.3.md efadv_query_ah.3.md efadv_query_device.3.md efadv_query_mr.3.md ) rdma-core-56.1/providers/efa/man/efadv.7.md000066400000000000000000000016621477342711600204360ustar00rootroot00000000000000--- layout: page title: EFADV section: 7 tagline: Verbs date: 2019-01-19 header: "EFA Direct Verbs Manual" footer: efa --- # NAME efadv - Direct verbs for efa devices This provides low level access to efa devices to perform direct operations, without general branching performed by libibverbs. # DESCRIPTION The libibverbs API is an abstract one. It is agnostic to any underlying provider specific implementation. While this abstraction has the advantage of user applications portability, it has a performance penalty. For some applications optimizing performance is more important than portability. The efa direct verbs API is intended for such applications. It exposes efa specific low level operations, allowing the application to bypass the libibverbs API. The direct include of efadv.h together with linkage to efa library will allow usage of this new interface. # SEE ALSO **verbs**(7) # AUTHORS Gal Pressman rdma-core-56.1/providers/efa/man/efadv_create_cq.3.md000066400000000000000000000040751477342711600224410ustar00rootroot00000000000000--- layout: page title: EFADV_CREATE_CQ section: 3 tagline: Verbs date: 2021-01-04 header: "EFA Direct Verbs Manual" footer: efa --- # NAME efadv_create_cq - Create EFA specific Completion Queue (CQ) # SYNOPSIS ```c #include struct ibv_cq_ex *efadv_create_cq(struct ibv_context *context, struct ibv_cq_init_attr_ex *attr_ex, struct efadv_cq_init_attr *efa_attr, uint32_t inlen); static inline int efadv_wc_read_sgid(struct efadv_cq *efadv_cq, union ibv_gid *sgid); ``` # DESCRIPTION **efadv_create_cq()** creates a Completion Queue (CQ) with specific driver properties. The argument attr_ex is an ibv_cq_init_attr_ex struct, as defined in . The EFADV work completions APIs (efadv_wc_\*) is an extension for IBV work completions API (ibv_wc_\*) with efa specific features for polling fields in the completion. This may be used together with or without ibv_wc_* calls. Use efadv_cq_from_ibv_cq_ex() to get the efadv_cq for accessing the work completion interface. Compatibility is handled using the comp_mask and inlen fields. ```c struct efadv_cq_init_attr { uint64_t comp_mask; uint64_t wc_flags; }; ``` *inlen* : In: Size of struct efadv_cq_init_attr. *comp_mask* : Compatibility mask. *wc_flags* : A bitwise OR of the various values described below. EFADV_WC_EX_WITH_SGID: if source AH is unknown, require sgid in WC. EFADV_WC_EX_WITH_IS_UNSOLICITED: request for an option to check whether a receive WC is unsolicited. # Completion iterator functions *efadv_wc_read_sgid* : Get the source GID field from the current completion. If the AH is known, a negative error value is returned. *efadv_wc_is_unsolicited* : Check whether it's an unsolicited receive completion that has no matching work request. 
This function is available if the CQ was created with EFADV_WC_EX_WITH_IS_UNSOLICITED. # RETURN VALUE efadv_create_cq() returns a pointer to the created extended CQ, or NULL if the request fails. # SEE ALSO **efadv**(7), **ibv_create_cq_ex**(3) # AUTHORS Daniel Kranzdorf rdma-core-56.1/providers/efa/man/efadv_create_driver_qp.3.md000066400000000000000000000021331477342711600240220ustar00rootroot00000000000000--- layout: page title: EFADV_CREATE_DRIVER_QP section: 3 tagline: Verbs date: 2019-01-23 header: "EFA Direct Verbs Manual" footer: efa --- # NAME efadv_create_driver_qp - Create EFA specific Queue Pair # SYNOPSIS ```c #include struct ibv_qp *efadv_create_driver_qp(struct ibv_pd *ibvpd, struct ibv_qp_init_attr *attr, uint32_t driver_qp_type); ``` # DESCRIPTION **efadv_create_driver_qp()** Create device-specific Queue Pairs. Scalable Reliable Datagram (SRD) transport provides reliable out-of-order delivery, transparently utilizing multiple network paths to reduce network tail latency. Its interface is similar to UD, in particular it supports message size up to MTU, with error handling extended to support reliable communication. *driver_qp_type* : The type of QP to be created: EFADV_QP_DRIVER_TYPE_SRD: Create an SRD QP. # RETURN VALUE efadv_create_driver_qp() returns a pointer to the created QP, or NULL if the request fails. # SEE ALSO **efadv**(7) # AUTHORS Gal Pressman rdma-core-56.1/providers/efa/man/efadv_create_qp_ex.3.md000066400000000000000000000035741477342711600231550ustar00rootroot00000000000000--- layout: page title: EFADV_CREATE_QP_EX section: 3 tagline: Verbs date: 2019-08-06 header: "EFA Direct Verbs Manual" footer: efa --- # NAME efadv_create_qp_ex - Create EFA specific extended Queue Pair # SYNOPSIS ```c #include struct ibv_qp *efadv_create_qp_ex(struct ibv_context *ibvctx, struct ibv_qp_init_attr_ex *attr_ex, struct efadv_qp_init_attr *efa_attr, uint32_t inlen); ``` # DESCRIPTION **efadv_create_qp_ex()** creates device-specific extended Queue Pair. The argument attr_ex is an ibv_qp_init_attr_ex struct, as defined in . Use ibv_qp_to_qp_ex() to get the ibv_qp_ex for accessing the send ops iterator interface, when QP create attr IBV_QP_INIT_ATTR_SEND_OPS_FLAGS is used. Scalable Reliable Datagram (SRD) transport provides reliable out-of-order delivery, transparently utilizing multiple network paths to reduce network tail latency. Its interface is similar to UD, in particular it supports message size up to MTU, with error handling extended to support reliable communication. Compatibility is handled using the comp_mask and inlen fields. ```c struct efadv_qp_init_attr { uint64_t comp_mask; uint32_t driver_qp_type; uint16_t flags; uint8_t sl; uint8_t reserved[1]; }; ``` *inlen* : In: Size of struct efadv_qp_init_attr. *comp_mask* : Compatibility mask. *driver_qp_type* : The type of QP to be created: EFADV_QP_DRIVER_TYPE_SRD: Create an SRD QP. *flags* : A bitwise OR of the values described below. EFADV_QP_FLAGS_UNSOLICITED_WRITE_RECV: Receive WRs will not be consumed for RDMA write with imm. *sl* : Service Level - 0 value implies default level. # RETURN VALUE efadv_create_qp_ex() returns a pointer to the created QP, or NULL if the request fails. 
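# EXAMPLE

A minimal sketch of creating an SRD QP. The queue depths and SGE counts are illustrative assumptions, as is the hypothetical create_srd_qp() helper; only the efadv/ibverbs calls themselves are part of the API:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <infiniband/verbs.h>
#include <infiniband/efadv.h>

/* "ctx", "pd" and "cq" are assumed to be valid objects created elsewhere. */
static struct ibv_qp *create_srd_qp(struct ibv_context *ctx,
				    struct ibv_pd *pd, struct ibv_cq *cq)
{
	struct efadv_qp_init_attr efa_attr = {
		.driver_qp_type = EFADV_QP_DRIVER_TYPE_SRD,
	};
	struct ibv_qp_init_attr_ex attr_ex = {
		/* PD is mandatory for extended QP creation */
		.comp_mask = IBV_QP_INIT_ATTR_PD,
		.pd = pd,
		.qp_type = IBV_QPT_DRIVER,
		.send_cq = cq,
		.recv_cq = cq,
		.cap = {
			.max_send_wr = 64,
			.max_recv_wr = 64,
			.max_send_sge = 1,
			.max_recv_sge = 1,
		},
	};
	struct ibv_qp *qp;

	qp = efadv_create_qp_ex(ctx, &attr_ex, &efa_attr, sizeof(efa_attr));
	if (!qp)
		fprintf(stderr, "efadv_create_qp_ex: %s\n", strerror(errno));
	return qp;
}
```

The returned QP is then transitioned to a usable state with the usual **ibv_modify_qp**(3) calls before work requests are posted.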
# SEE ALSO **efadv**(7), **ibv_create_qp_ex**(3) # AUTHORS Gal Pressman Daniel Kranzdorf rdma-core-56.1/providers/efa/man/efadv_query_ah.3.md000066400000000000000000000017561477342711600223330ustar00rootroot00000000000000--- layout: page title: EFADV_QUERY_AH section: 3 tagline: Verbs date: 2019-05-19 header: "EFA Direct Verbs Manual" footer: efa --- # NAME efadv_query_ah - Query EFA specific Address Handle attributes # SYNOPSIS ```c #include int efadv_query_ah(struct ibv_ah *ibvah, struct efadv_ah_attr *attr, uint32_t inlen); ``` # DESCRIPTION **efadv_query_ah()** queries device-specific Address Handle attributes. Compatibility is handled using the comp_mask and inlen fields. ```c struct efadv_ah_attr { uint64_t comp_mask; uint16_t ahn; uint8_t reserved[6]; }; ``` *inlen* : In: Size of struct efadv_ah_attr. *comp_mask* : Compatibility mask. *ahn* : Device's Address Handle number. # RETURN VALUE **efadv_query_ah()** returns 0 on success, or the value of errno on failure (which indicates the failure reason). # SEE ALSO **efadv**(7) # NOTES * Compatibility mask (comp_mask) is an out field and currently has no values. # AUTHORS Gal Pressman rdma-core-56.1/providers/efa/man/efadv_query_device.3.md000066400000000000000000000043051477342711600231730ustar00rootroot00000000000000--- layout: page title: EFADV_QUERY_DEVICE section: 3 tagline: Verbs date: 2019-04-22 header: "EFA Direct Verbs Manual" footer: efa --- # NAME efadv_query_device - Query device capabilities # SYNOPSIS ```c #include int efadv_query_device(struct ibv_context *ibvctx, struct efadv_device_attr *attr, uint32_t inlen); ``` # DESCRIPTION **efadv_query_device()** Queries EFA device specific attributes. Compatibility is handled using the comp_mask and inlen fields. ```c struct efadv_device_attr { uint64_t comp_mask; uint32_t max_sq_wr; uint32_t max_rq_wr; uint16_t max_sq_sge; uint16_t max_rq_sge; uint16_t inline_buf_size; uint8_t reserved[2]; uint32_t device_caps; uint32_t max_rdma_size; }; ``` *inlen* : In: Size of struct efadv_device_attr. *comp_mask* : Compatibility mask. *max_sq_wr* : Maximum Send Queue (SQ) Work Requests (WRs). *max_rq_wr* : Maximum Receive Queue (RQ) Work Requests (WRs). *max_sq_sge* : Maximum Send Queue (SQ) Scatter Gather Elements (SGEs). *max_rq_sge* : Maximum Receive Queue (RQ) Scatter Gather Elements (SGEs). *inline_buf_size* : Maximum inline buffer size. *device_caps* : Bitmask of device capabilities: EFADV_DEVICE_ATTR_CAPS_RDMA_READ: RDMA read is supported. EFADV_DEVICE_ATTR_CAPS_RNR_RETRY: RNR retry is supported for SRD QPs. EFADV_DEVICE_ATTR_CAPS_CQ_WITH_SGID: Reading source address (SGID) from receive completion descriptors is supported. Valid only for unknown AH. EFADV_DEVICE_ATTR_CAPS_RDMA_WRITE: RDMA write is supported EFADV_DEVICE_ATTR_CAPS_UNSOLICITED_WRITE_RECV: Indicates the device has support for creating QPs that can receive unsolicited RDMA write with immediate. RQ with this feature enabled will not consume any work requests in order to receive RDMA write with immediate and a WC generated for such receive will be marked as unsolicited. *max_rdma_size* : Maximum RDMA transfer size in bytes. # RETURN VALUE **efadv_query_device()** returns 0 on success, or the value of errno on failure (which indicates the failure reason). # SEE ALSO **efadv**(7) # NOTES * Compatibility mask (comp_mask) is an out field and currently has no values. 
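# EXAMPLE

A short sketch of querying device capabilities; the hypothetical print_rdma_caps() helper is an illustration, not part of the API:

```c
#include <stdio.h>
#include <string.h>
#include <infiniband/efadv.h>

/* "ctx" is assumed to be an open ibv_context on an EFA device. */
static int print_rdma_caps(struct ibv_context *ctx)
{
	struct efadv_device_attr attr = {};
	int err;

	err = efadv_query_device(ctx, &attr, sizeof(attr));
	if (err) {
		fprintf(stderr, "efadv_query_device: %s\n", strerror(err));
		return err;
	}
	if (attr.device_caps & EFADV_DEVICE_ATTR_CAPS_RDMA_READ)
		printf("RDMA read supported, max transfer %u bytes\n",
		       attr.max_rdma_size);
	return 0;
}
```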
# AUTHORS Gal Pressman rdma-core-56.1/providers/efa/man/efadv_query_mr.3.md000066400000000000000000000031171477342711600223520ustar00rootroot00000000000000--- layout: page title: EFADV_QUERY_MR section: 3 tagline: Verbs date: 2023-11-13 header: "EFA Direct Verbs Manual" footer: efa --- # NAME efadv_query_mr - Query EFA specific Memory Region attributes # SYNOPSIS ```c #include int efadv_query_mr(struct ibv_mr *ibvmr, struct efadv_mr_attr *attr, uint32_t inlen); ``` # DESCRIPTION **efadv_query_mr()** queries device-specific Memory Region attributes. Compatibility is handled using the comp_mask and inlen fields. ```c struct efadv_mr_attr { uint64_t comp_mask; uint16_t ic_id_validity; uint16_t recv_ic_id; uint16_t rdma_read_ic_id; uint16_t rdma_recv_ic_id; }; ``` *inlen* : In: Size of struct efadv_mr_attr. *comp_mask* : Compatibility mask. *ic_id_validity* : Validity mask of interconnect id fields: EFADV_MR_ATTR_VALIDITY_RECV_IC_ID: recv_ic_id has a valid value. EFADV_MR_ATTR_VALIDITY_RDMA_READ_IC_ID: rdma_read_ic_id has a valid value. EFADV_MR_ATTR_VALIDITY_RDMA_RECV_IC_ID: rdma_recv_ic_id has a valid value. *recv_ic_id* : Physical interconnect used by the device to reach the MR for receive operation. *rdma_read_ic_id* : Physical interconnect used by the device to reach the MR for RDMA read operation. *rdma_recv_ic_id* : Physical interconnect used by the device to reach the MR for RDMA write receive. # RETURN VALUE **efadv_query_mr()** returns 0 on success, or the value of errno on failure (which indicates the failure reason). # SEE ALSO **efadv**(7) # NOTES * Compatibility mask (comp_mask) is an out field and currently has no values. # AUTHORS Michael Margolin rdma-core-56.1/providers/efa/verbs.c000066400000000000000000001776531477342711600174120ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause /* * Copyright 2019-2025 Amazon.com, Inc. or its affiliates. All rights reserved. */ #include #include #include #include #include #include #include #include #include #include #include #include #include "efa.h" #include "efa_io_regs_defs.h" #include "efadv.h" #include "verbs.h" #include "efa_trace.h" #define EFA_DEV_CAP(ctx, cap) \ ((ctx)->device_caps & EFA_QUERY_DEVICE_CAPS_##cap) static bool is_buf_cleared(void *buf, size_t len) { int i; for (i = 0; i < len; i++) { if (((uint8_t *)buf)[i]) return false; } return true; } #define min3(a, b, c) \ ({ \ typeof(a) _tmpmin = min(a, b); \ min(_tmpmin, c); \ }) #define is_ext_cleared(ptr, inlen) \ is_buf_cleared((uint8_t *)ptr + sizeof(*ptr), inlen - sizeof(*ptr)) #define is_reserved_cleared(reserved) is_buf_cleared(reserved, sizeof(reserved)) struct efa_wq_init_attr { uint64_t db_mmap_key; uint32_t db_off; int cmd_fd; int pgsz; uint16_t sub_cq_idx; }; int efa_query_port(struct ibv_context *ibvctx, uint8_t port, struct ibv_port_attr *port_attr) { struct ibv_query_port cmd; return ibv_cmd_query_port(ibvctx, port, port_attr, &cmd, sizeof(cmd)); } int efa_query_device_ex(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct efa_context *ctx = to_efa_context(context); struct ibv_device_attr *a = &attr->orig_attr; struct efa_query_device_ex_resp resp = {}; size_t resp_size = (ctx->cmds_supp_udata_mask & EFA_USER_CMDS_SUPP_UDATA_QUERY_DEVICE) ? 
sizeof(resp) : sizeof(resp.ibv_resp); uint8_t fw_ver[8]; int err; err = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp.ibv_resp, &resp_size); if (err) { verbs_err(verbs_get_ctx(context), "ibv_cmd_query_device_any failed\n"); return err; } a->max_qp_wr = min_t(int, a->max_qp_wr, ctx->max_llq_size / sizeof(struct efa_io_tx_wqe)); memcpy(fw_ver, &resp.ibv_resp.base.fw_ver, sizeof(resp.ibv_resp.base.fw_ver)); snprintf(a->fw_ver, sizeof(a->fw_ver), "%u.%u.%u.%u", fw_ver[0], fw_ver[1], fw_ver[2], fw_ver[3]); return 0; } int efa_query_device_ctx(struct efa_context *ctx) { struct efa_query_device_ex_resp resp = {}; struct ibv_device_attr_ex attr; size_t resp_size = sizeof(resp); unsigned int qp_table_sz; int err; if (ctx->cmds_supp_udata_mask & EFA_USER_CMDS_SUPP_UDATA_QUERY_DEVICE) { err = ibv_cmd_query_device_any(&ctx->ibvctx.context, NULL, &attr, sizeof(attr), &resp.ibv_resp, &resp_size); if (err) { verbs_err(&ctx->ibvctx, "ibv_cmd_query_device_any failed\n"); return err; } ctx->device_caps = resp.device_caps; ctx->max_sq_wr = resp.max_sq_wr; ctx->max_rq_wr = resp.max_rq_wr; ctx->max_sq_sge = resp.max_sq_sge; ctx->max_rq_sge = resp.max_rq_sge; ctx->max_rdma_size = resp.max_rdma_size; } else { err = ibv_cmd_query_device_any(&ctx->ibvctx.context, NULL, &attr, sizeof(attr.orig_attr), NULL, NULL); if (err) { verbs_err(&ctx->ibvctx, "ibv_cmd_query_device_any failed\n"); return err; } } ctx->max_wr_rdma_sge = attr.orig_attr.max_sge_rd; qp_table_sz = roundup_pow_of_two(attr.orig_attr.max_qp); ctx->qp_table_sz_m1 = qp_table_sz - 1; ctx->qp_table = calloc(qp_table_sz, sizeof(*ctx->qp_table)); if (!ctx->qp_table) return ENOMEM; return 0; } int efadv_query_device(struct ibv_context *ibvctx, struct efadv_device_attr *attr, uint32_t inlen) { struct efa_context *ctx = to_efa_context(ibvctx); uint64_t comp_mask_out = 0; if (!is_efa_dev(ibvctx->device)) { verbs_err(verbs_get_ctx(ibvctx), "Not an EFA device\n"); return EOPNOTSUPP; } if (!vext_field_avail(typeof(*attr), inline_buf_size, inlen)) { verbs_err(verbs_get_ctx(ibvctx), "Compatibility issues\n"); return EINVAL; } memset(attr, 0, inlen); attr->max_sq_wr = ctx->max_sq_wr; attr->max_rq_wr = ctx->max_rq_wr; attr->max_sq_sge = ctx->max_sq_sge; attr->max_rq_sge = ctx->max_rq_sge; attr->inline_buf_size = ctx->inline_buf_size; if (vext_field_avail(typeof(*attr), device_caps, inlen)) { if (EFA_DEV_CAP(ctx, RNR_RETRY)) attr->device_caps |= EFADV_DEVICE_ATTR_CAPS_RNR_RETRY; if (EFA_DEV_CAP(ctx, CQ_WITH_SGID)) attr->device_caps |= EFADV_DEVICE_ATTR_CAPS_CQ_WITH_SGID; if (EFA_DEV_CAP(ctx, UNSOLICITED_WRITE_RECV)) attr->device_caps |= EFADV_DEVICE_ATTR_CAPS_UNSOLICITED_WRITE_RECV; } if (vext_field_avail(typeof(*attr), max_rdma_size, inlen)) { attr->max_rdma_size = ctx->max_rdma_size; if (EFA_DEV_CAP(ctx, RDMA_READ)) attr->device_caps |= EFADV_DEVICE_ATTR_CAPS_RDMA_READ; if (EFA_DEV_CAP(ctx, RDMA_WRITE)) attr->device_caps |= EFADV_DEVICE_ATTR_CAPS_RDMA_WRITE; } attr->comp_mask = comp_mask_out; return 0; } struct ibv_pd *efa_alloc_pd(struct ibv_context *ibvctx) { struct efa_alloc_pd_resp resp = {}; struct ibv_alloc_pd cmd; struct efa_pd *pd; int err; pd = calloc(1, sizeof(*pd)); if (!pd) return NULL; err = ibv_cmd_alloc_pd(ibvctx, &pd->ibvpd, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (err) { verbs_err(verbs_get_ctx(ibvctx), "Failed to allocate PD\n"); goto out; } pd->pdn = resp.pdn; return &pd->ibvpd; out: free(pd); errno = err; return NULL; } int efa_dealloc_pd(struct ibv_pd *ibvpd) { struct efa_pd *pd = to_efa_pd(ibvpd); int err; err = 
ibv_cmd_dealloc_pd(ibvpd); if (err) { verbs_err(verbs_get_ctx(ibvpd->context), "Failed to deallocate PD\n"); return err; } free(pd); return 0; } struct ibv_mr *efa_reg_dmabuf_mr(struct ibv_pd *ibvpd, uint64_t offset, size_t length, uint64_t iova, int fd, int acc) { struct efa_mr *mr; int err; mr = calloc(1, sizeof(*mr)); if (!mr) return NULL; err = ibv_cmd_reg_dmabuf_mr(ibvpd, offset, length, iova, fd, acc, &mr->vmr, NULL); if (err) { free(mr); errno = err; return NULL; } return &mr->vmr.ibv_mr; } struct ibv_mr *efa_reg_mr(struct ibv_pd *ibvpd, void *sva, size_t len, uint64_t hca_va, int access) { struct ib_uverbs_reg_mr_resp resp; struct ibv_reg_mr cmd; struct efa_mr *mr; int err; mr = calloc(1, sizeof(*mr)); if (!mr) return NULL; err = ibv_cmd_reg_mr(ibvpd, sva, len, hca_va, access, &mr->vmr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (err) { verbs_err(verbs_get_ctx(ibvpd->context), "Failed to register MR\n"); free(mr); errno = err; return NULL; } return &mr->vmr.ibv_mr; } int efadv_query_mr(struct ibv_mr *ibvmr, struct efadv_mr_attr *attr, uint32_t inlen) { uint16_t rdma_read_ic_id = 0; uint16_t rdma_recv_ic_id = 0; uint16_t ic_id_validity = 0; uint16_t recv_ic_id = 0; int err; DECLARE_COMMAND_BUFFER(cmd, UVERBS_OBJECT_MR, EFA_IB_METHOD_MR_QUERY, 5); if (!is_efa_dev(ibvmr->context->device)) { verbs_err(verbs_get_ctx(ibvmr->context), "Not an EFA device\n"); return EOPNOTSUPP; } if (!vext_field_avail(typeof(*attr), rdma_recv_ic_id, inlen)) { verbs_err(verbs_get_ctx(ibvmr->context), "Compatibility issues\n"); return EINVAL; } memset(attr, 0, inlen); fill_attr_in_obj(cmd, EFA_IB_ATTR_QUERY_MR_HANDLE, ibvmr->handle); fill_attr_out(cmd, EFA_IB_ATTR_QUERY_MR_RESP_IC_ID_VALIDITY, &ic_id_validity, sizeof(ic_id_validity)); fill_attr_out(cmd, EFA_IB_ATTR_QUERY_MR_RESP_RECV_IC_ID, &recv_ic_id, sizeof(recv_ic_id)); fill_attr_out(cmd, EFA_IB_ATTR_QUERY_MR_RESP_RDMA_READ_IC_ID, &rdma_read_ic_id, sizeof(rdma_read_ic_id)); fill_attr_out(cmd, EFA_IB_ATTR_QUERY_MR_RESP_RDMA_RECV_IC_ID, &rdma_recv_ic_id, sizeof(rdma_recv_ic_id)); err = execute_ioctl(ibvmr->context, cmd); if (err) { verbs_err(verbs_get_ctx(ibvmr->context), "Failed to query MR\n"); return err; } if (ic_id_validity & EFA_QUERY_MR_VALIDITY_RECV_IC_ID) { attr->recv_ic_id = recv_ic_id; attr->ic_id_validity |= EFADV_MR_ATTR_VALIDITY_RECV_IC_ID; } if (ic_id_validity & EFA_QUERY_MR_VALIDITY_RDMA_READ_IC_ID) { attr->rdma_read_ic_id = rdma_read_ic_id; attr->ic_id_validity |= EFADV_MR_ATTR_VALIDITY_RDMA_READ_IC_ID; } if (ic_id_validity & EFA_QUERY_MR_VALIDITY_RDMA_RECV_IC_ID) { attr->rdma_recv_ic_id = rdma_recv_ic_id; attr->ic_id_validity |= EFADV_MR_ATTR_VALIDITY_RDMA_RECV_IC_ID; } return 0; } int efa_dereg_mr(struct verbs_mr *vmr) { struct efa_mr *mr = container_of(vmr, struct efa_mr, vmr); int err; err = ibv_cmd_dereg_mr(vmr); if (err) { verbs_err(verbs_get_ctx(vmr->ibv_mr.context), "Failed to deregister MR\n"); return err; } free(mr); return 0; } static uint32_t efa_wq_get_next_wrid_idx_locked(struct efa_wq *wq, uint64_t wr_id) { uint32_t wrid_idx; /* Get the next wrid to be used from the index pool */ wrid_idx = wq->wrid_idx_pool[wq->wrid_idx_pool_next]; wq->wrid[wrid_idx] = wr_id; /* Will never overlap, as validate function succeeded */ wq->wrid_idx_pool_next++; assert(wq->wrid_idx_pool_next <= wq->wqe_cnt); return wrid_idx; } static void efa_wq_put_wrid_idx_unlocked(struct efa_wq *wq, uint32_t wrid_idx) { pthread_spin_lock(&wq->wqlock); wq->wrid_idx_pool_next--; wq->wrid_idx_pool[wq->wrid_idx_pool_next] = wrid_idx; wq->wqe_completed++; 
pthread_spin_unlock(&wq->wqlock); } static uint32_t efa_sub_cq_get_current_index(struct efa_sub_cq *sub_cq) { return sub_cq->consumed_cnt & sub_cq->qmask; } static int efa_cqe_is_pending(struct efa_io_cdesc_common *cqe_common, int phase) { return EFA_GET(&cqe_common->flags, EFA_IO_CDESC_COMMON_PHASE) == phase; } static struct efa_io_cdesc_common * efa_sub_cq_get_cqe(struct efa_sub_cq *sub_cq, int entry) { return (struct efa_io_cdesc_common *)(sub_cq->buf + (entry * sub_cq->cqe_size)); } static void efa_update_cq_doorbell(struct efa_cq *cq, bool arm) { uint32_t db = 0; EFA_SET(&db, EFA_IO_REGS_CQ_DB_CONSUMER_INDEX, cq->cc); EFA_SET(&db, EFA_IO_REGS_CQ_DB_CMD_SN, cq->cmd_sn & 0x3); EFA_SET(&db, EFA_IO_REGS_CQ_DB_ARM, arm); mmio_write32(cq->db, db); } void efa_cq_event(struct ibv_cq *ibvcq) { to_efa_cq(ibvcq)->cmd_sn++; } int efa_arm_cq(struct ibv_cq *ibvcq, int solicited_only) { if (unlikely(solicited_only)) return EOPNOTSUPP; efa_update_cq_doorbell(to_efa_cq(ibvcq), true); return 0; } static struct efa_io_cdesc_common * cq_next_sub_cqe_get(struct efa_sub_cq *sub_cq) { struct efa_io_cdesc_common *cqe; uint32_t current_index; current_index = efa_sub_cq_get_current_index(sub_cq); cqe = efa_sub_cq_get_cqe(sub_cq, current_index); if (efa_cqe_is_pending(cqe, sub_cq->phase)) { /* Do not read the rest of the completion entry before the * phase bit has been validated. */ udma_from_device_barrier(); sub_cq->consumed_cnt++; if (!efa_sub_cq_get_current_index(sub_cq)) sub_cq->phase = 1 - sub_cq->phase; return cqe; } return NULL; } static enum ibv_wc_status to_ibv_status(enum efa_io_comp_status status) { switch (status) { case EFA_IO_COMP_STATUS_OK: return IBV_WC_SUCCESS; case EFA_IO_COMP_STATUS_FLUSHED: return IBV_WC_WR_FLUSH_ERR; case EFA_IO_COMP_STATUS_LOCAL_ERROR_QP_INTERNAL_ERROR: case EFA_IO_COMP_STATUS_LOCAL_ERROR_UNSUPPORTED_OP: case EFA_IO_COMP_STATUS_LOCAL_ERROR_INVALID_AH: return IBV_WC_LOC_QP_OP_ERR; case EFA_IO_COMP_STATUS_LOCAL_ERROR_INVALID_LKEY: return IBV_WC_LOC_PROT_ERR; case EFA_IO_COMP_STATUS_LOCAL_ERROR_BAD_LENGTH: return IBV_WC_LOC_LEN_ERR; case EFA_IO_COMP_STATUS_REMOTE_ERROR_ABORT: return IBV_WC_REM_ABORT_ERR; case EFA_IO_COMP_STATUS_REMOTE_ERROR_RNR: return IBV_WC_RNR_RETRY_EXC_ERR; case EFA_IO_COMP_STATUS_REMOTE_ERROR_BAD_DEST_QPN: return IBV_WC_REM_INV_RD_REQ_ERR; case EFA_IO_COMP_STATUS_REMOTE_ERROR_BAD_STATUS: return IBV_WC_BAD_RESP_ERR; case EFA_IO_COMP_STATUS_REMOTE_ERROR_BAD_LENGTH: return IBV_WC_REM_INV_REQ_ERR; case EFA_IO_COMP_STATUS_LOCAL_ERROR_UNRESP_REMOTE: case EFA_IO_COMP_STATUS_LOCAL_ERROR_UNREACH_REMOTE: return IBV_WC_RESP_TIMEOUT_ERR; case EFA_IO_COMP_STATUS_REMOTE_ERROR_BAD_ADDRESS: return IBV_WC_REM_ACCESS_ERR; case EFA_IO_COMP_STATUS_REMOTE_ERROR_UNKNOWN_PEER: return IBV_WC_REM_OP_ERR; default: return IBV_WC_GENERAL_ERR; } } static enum ibv_wc_opcode efa_wc_read_opcode(struct ibv_cq_ex *ibvcqx) { struct efa_cq *cq = to_efa_cq_ex(ibvcqx); enum efa_io_send_op_type op_type; struct efa_io_cdesc_common *cqe; cqe = cq->cur_cqe; op_type = EFA_GET(&cqe->flags, EFA_IO_CDESC_COMMON_OP_TYPE); if (EFA_GET(&cqe->flags, EFA_IO_CDESC_COMMON_Q_TYPE) == EFA_IO_SEND_QUEUE) { if (op_type == EFA_IO_RDMA_WRITE) return IBV_WC_RDMA_WRITE; return IBV_WC_SEND; } if (op_type == EFA_IO_RDMA_WRITE) return IBV_WC_RECV_RDMA_WITH_IMM; return IBV_WC_RECV; } static uint32_t efa_wc_read_vendor_err(struct ibv_cq_ex *ibvcqx) { struct efa_cq *cq = to_efa_cq_ex(ibvcqx); return cq->cur_cqe->status; } static unsigned int efa_wc_read_wc_flags(struct ibv_cq_ex *ibvcqx) { struct efa_cq *cq = 
to_efa_cq_ex(ibvcqx); unsigned int wc_flags = 0; if (EFA_GET(&cq->cur_cqe->flags, EFA_IO_CDESC_COMMON_HAS_IMM)) wc_flags |= IBV_WC_WITH_IMM; return wc_flags; } static uint32_t efa_wc_read_byte_len(struct ibv_cq_ex *ibvcqx) { struct efa_cq *cq = to_efa_cq_ex(ibvcqx); struct efa_io_cdesc_common *cqe; struct efa_io_rx_cdesc_ex *rcqe; uint32_t length; cqe = cq->cur_cqe; if (EFA_GET(&cqe->flags, EFA_IO_CDESC_COMMON_Q_TYPE) != EFA_IO_RECV_QUEUE) return 0; rcqe = container_of(cqe, struct efa_io_rx_cdesc_ex, base.common); length = rcqe->base.length; if (EFA_GET(&cqe->flags, EFA_IO_CDESC_COMMON_OP_TYPE) == EFA_IO_RDMA_WRITE) length |= ((uint32_t)rcqe->u.rdma_write.length_hi << 16); return length; } static __be32 efa_wc_read_imm_data(struct ibv_cq_ex *ibvcqx) { struct efa_cq *cq = to_efa_cq_ex(ibvcqx); struct efa_io_rx_cdesc *rcqe; rcqe = container_of(cq->cur_cqe, struct efa_io_rx_cdesc, common); return htobe32(rcqe->imm); } static uint32_t efa_wc_read_qp_num(struct ibv_cq_ex *ibvcqx) { struct efa_cq *cq = to_efa_cq_ex(ibvcqx); return cq->cur_cqe->qp_num; } static uint32_t efa_wc_read_src_qp(struct ibv_cq_ex *ibvcqx) { struct efa_cq *cq = to_efa_cq_ex(ibvcqx); struct efa_io_rx_cdesc *rcqe; rcqe = container_of(cq->cur_cqe, struct efa_io_rx_cdesc, common); return rcqe->src_qp_num; } static uint32_t efa_wc_read_slid(struct ibv_cq_ex *ibvcqx) { struct efa_cq *cq = to_efa_cq_ex(ibvcqx); struct efa_io_rx_cdesc *rcqe; rcqe = container_of(cq->cur_cqe, struct efa_io_rx_cdesc, common); return rcqe->ah; } static uint8_t efa_wc_read_sl(struct ibv_cq_ex *ibvcqx) { return 0; } static uint8_t efa_wc_read_dlid_path_bits(struct ibv_cq_ex *ibvcqx) { return 0; } static int efa_wc_read_sgid(struct efadv_cq *efadv_cq, union ibv_gid *sgid) { struct efa_cq *cq = efadv_cq_to_efa_cq(efadv_cq); struct efa_io_rx_cdesc_ex *rcqex; rcqex = container_of(cq->cur_cqe, struct efa_io_rx_cdesc_ex, base.common); if (rcqex->base.ah != 0xFFFF) { /* SGID is only available if AH is unknown. */ return -ENOENT; } memcpy(sgid->raw, rcqex->u.src_addr, sizeof(sgid->raw)); return 0; } static bool efa_wc_is_unsolicited(struct efadv_cq *efadv_cq) { struct efa_cq *cq = efadv_cq_to_efa_cq(efadv_cq); return EFA_GET(&cq->cur_cqe->flags, EFA_IO_CDESC_COMMON_UNSOLICITED); } static void efa_process_cqe(struct efa_cq *cq, struct ibv_wc *wc, struct efa_qp *qp) { struct efa_io_cdesc_common *cqe = cq->cur_cqe; enum efa_io_send_op_type op_type; uint32_t wrid_idx; wc->status = to_ibv_status(cqe->status); wc->vendor_err = cqe->status; wc->wc_flags = 0; wc->qp_num = cqe->qp_num; wrid_idx = cqe->req_id; op_type = EFA_GET(&cqe->flags, EFA_IO_CDESC_COMMON_OP_TYPE); if (EFA_GET(&cqe->flags, EFA_IO_CDESC_COMMON_Q_TYPE) == EFA_IO_SEND_QUEUE) { cq->cur_wq = &qp->sq.wq; if (op_type == EFA_IO_RDMA_WRITE) wc->opcode = IBV_WC_RDMA_WRITE; else wc->opcode = IBV_WC_SEND; /* We do not have to take the WQ lock here, * because this wrid index has not been freed yet, * so there is no contention on this index. 
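* The slot is only recycled afterwards, via efa_wq_put_wrid_idx_unlocked().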
*/ wc->wr_id = cq->cur_wq->wrid[wrid_idx]; rdma_tracepoint(rdma_core_efa, process_completion, cq->dev->name, wc->wr_id, wc->status, wc->opcode, wc->qp_num, UINT32_MAX, UINT16_MAX, wc->byte_len); } else { struct efa_io_rx_cdesc_ex *rcqe = container_of(cqe, struct efa_io_rx_cdesc_ex, base.common); cq->cur_wq = &qp->rq.wq; wc->byte_len = rcqe->base.length; if (op_type == EFA_IO_RDMA_WRITE) { wc->byte_len |= ((uint32_t)rcqe->u.rdma_write.length_hi << 16); wc->opcode = IBV_WC_RECV_RDMA_WITH_IMM; } else { wc->opcode = IBV_WC_RECV; } wc->src_qp = rcqe->base.src_qp_num; wc->sl = 0; wc->slid = rcqe->base.ah; if (EFA_GET(&cqe->flags, EFA_IO_CDESC_COMMON_HAS_IMM)) { wc->imm_data = htobe32(rcqe->base.imm); wc->wc_flags |= IBV_WC_WITH_IMM; } wc->wr_id = !EFA_GET(&cqe->flags, EFA_IO_CDESC_COMMON_UNSOLICITED) ? cq->cur_wq->wrid[wrid_idx] : 0; rdma_tracepoint(rdma_core_efa, process_completion, cq->dev->name, wc->wr_id, wc->status, wc->opcode, wc->src_qp, wc->qp_num, wc->slid, wc->byte_len); } } static void efa_process_ex_cqe(struct efa_cq *cq, struct efa_qp *qp) { struct ibv_cq_ex *ibvcqx = &cq->verbs_cq.cq_ex; struct efa_io_cdesc_common *cqe = cq->cur_cqe; uint32_t wrid_idx; wrid_idx = cqe->req_id; if (EFA_GET(&cqe->flags, EFA_IO_CDESC_COMMON_Q_TYPE) == EFA_IO_SEND_QUEUE) { cq->cur_wq = &qp->sq.wq; ibvcqx->wr_id = cq->cur_wq->wrid[wrid_idx]; ibvcqx->status = to_ibv_status(cqe->status); rdma_tracepoint(rdma_core_efa, process_completion, cq->dev->name, ibvcqx->wr_id, ibvcqx->status, efa_wc_read_opcode(ibvcqx), cqe->qp_num, UINT32_MAX, UINT16_MAX, efa_wc_read_byte_len(ibvcqx)); } else { cq->cur_wq = &qp->rq.wq; ibvcqx->wr_id = !EFA_GET(&cqe->flags, EFA_IO_CDESC_COMMON_UNSOLICITED) ? cq->cur_wq->wrid[wrid_idx] : 0; ibvcqx->status = to_ibv_status(cqe->status); rdma_tracepoint(rdma_core_efa, process_completion, cq->dev->name, ibvcqx->wr_id, ibvcqx->status, efa_wc_read_opcode(ibvcqx), efa_wc_read_src_qp(ibvcqx), cqe->qp_num, efa_wc_read_slid(ibvcqx), efa_wc_read_byte_len(ibvcqx)); } } static inline int efa_poll_sub_cq(struct efa_cq *cq, struct efa_sub_cq *sub_cq, struct efa_qp **cur_qp, struct ibv_wc *wc, bool extended) ALWAYS_INLINE; static inline int efa_poll_sub_cq(struct efa_cq *cq, struct efa_sub_cq *sub_cq, struct efa_qp **cur_qp, struct ibv_wc *wc, bool extended) { struct efa_context *ctx = to_efa_context(cq->verbs_cq.cq.context); uint32_t qpn; cq->cur_cqe = cq_next_sub_cqe_get(sub_cq); if (!cq->cur_cqe) return ENOENT; qpn = cq->cur_cqe->qp_num; if (!*cur_qp || qpn != (*cur_qp)->verbs_qp.qp.qp_num) { /* We do not have to take the QP table lock here, * because CQs will be locked while QPs are removed * from the table. 
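* (efa_destroy_qp() holds qp_table_lock and both CQ locks while it clears the entry.)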
*/ *cur_qp = ctx->qp_table[qpn & ctx->qp_table_sz_m1]; if (!*cur_qp) { verbs_err(&ctx->ibvctx, "QP[%u] does not exist in QP table\n", qpn); return EINVAL; } } if (extended) { efa_process_ex_cqe(cq, *cur_qp); } else { efa_process_cqe(cq, wc, *cur_qp); if (!EFA_GET(&cq->cur_cqe->flags, EFA_IO_CDESC_COMMON_UNSOLICITED)) efa_wq_put_wrid_idx_unlocked(cq->cur_wq, cq->cur_cqe->req_id); } return 0; } static inline int efa_poll_sub_cqs(struct efa_cq *cq, struct ibv_wc *wc, bool extended) ALWAYS_INLINE; static inline int efa_poll_sub_cqs(struct efa_cq *cq, struct ibv_wc *wc, bool extended) { uint16_t num_sub_cqs = cq->num_sub_cqs; struct efa_sub_cq *sub_cq; struct efa_qp *qp = NULL; uint16_t sub_cq_idx; int err = ENOENT; for (sub_cq_idx = 0; sub_cq_idx < num_sub_cqs; sub_cq_idx++) { sub_cq = &cq->sub_cq_arr[cq->next_poll_idx++]; cq->next_poll_idx %= num_sub_cqs; if (!sub_cq->ref_cnt) continue; err = efa_poll_sub_cq(cq, sub_cq, &qp, wc, extended); if (err != ENOENT) { cq->cc++; break; } } return err; } int efa_poll_cq(struct ibv_cq *ibvcq, int nwc, struct ibv_wc *wc) { struct efa_cq *cq = to_efa_cq(ibvcq); int ret = 0; int i; pthread_spin_lock(&cq->lock); for (i = 0; i < nwc; i++) { ret = efa_poll_sub_cqs(cq, &wc[i], false); if (ret) { if (ret == ENOENT) ret = 0; break; } } if (i && cq->db) efa_update_cq_doorbell(cq, false); pthread_spin_unlock(&cq->lock); return i ?: -ret; } static int efa_start_poll(struct ibv_cq_ex *ibvcqx, struct ibv_poll_cq_attr *attr) { struct efa_cq *cq = to_efa_cq_ex(ibvcqx); int ret; if (unlikely(attr->comp_mask)) { verbs_err(verbs_get_ctx(ibvcqx->context), "Invalid comp_mask %u\n", attr->comp_mask); return EINVAL; } pthread_spin_lock(&cq->lock); ret = efa_poll_sub_cqs(cq, NULL, true); if (ret) pthread_spin_unlock(&cq->lock); return ret; } static int efa_next_poll(struct ibv_cq_ex *ibvcqx) { struct efa_cq *cq = to_efa_cq_ex(ibvcqx); int ret; if (!EFA_GET(&cq->cur_cqe->flags, EFA_IO_CDESC_COMMON_UNSOLICITED)) efa_wq_put_wrid_idx_unlocked(cq->cur_wq, cq->cur_cqe->req_id); ret = efa_poll_sub_cqs(cq, NULL, true); return ret; } static void efa_end_poll(struct ibv_cq_ex *ibvcqx) { struct efa_cq *cq = to_efa_cq_ex(ibvcqx); if (cq->cur_cqe) { if (!EFA_GET(&cq->cur_cqe->flags, EFA_IO_CDESC_COMMON_UNSOLICITED)) efa_wq_put_wrid_idx_unlocked(cq->cur_wq, cq->cur_cqe->req_id); if (cq->db) efa_update_cq_doorbell(cq, false); } pthread_spin_unlock(&cq->lock); } static void efa_cq_fill_pfns(struct efa_cq *cq, struct ibv_cq_init_attr_ex *attr, struct efadv_cq_init_attr *efa_attr) { struct ibv_cq_ex *ibvcqx = &cq->verbs_cq.cq_ex; ibvcqx->start_poll = efa_start_poll; ibvcqx->end_poll = efa_end_poll; ibvcqx->next_poll = efa_next_poll; ibvcqx->read_opcode = efa_wc_read_opcode; ibvcqx->read_vendor_err = efa_wc_read_vendor_err; ibvcqx->read_wc_flags = efa_wc_read_wc_flags; if (attr->wc_flags & IBV_WC_EX_WITH_BYTE_LEN) ibvcqx->read_byte_len = efa_wc_read_byte_len; if (attr->wc_flags & IBV_WC_EX_WITH_IMM) ibvcqx->read_imm_data = efa_wc_read_imm_data; if (attr->wc_flags & IBV_WC_EX_WITH_QP_NUM) ibvcqx->read_qp_num = efa_wc_read_qp_num; if (attr->wc_flags & IBV_WC_EX_WITH_SRC_QP) ibvcqx->read_src_qp = efa_wc_read_src_qp; if (attr->wc_flags & IBV_WC_EX_WITH_SLID) ibvcqx->read_slid = efa_wc_read_slid; if (attr->wc_flags & IBV_WC_EX_WITH_SL) ibvcqx->read_sl = efa_wc_read_sl; if (attr->wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS) ibvcqx->read_dlid_path_bits = efa_wc_read_dlid_path_bits; if (efa_attr && (efa_attr->wc_flags & EFADV_WC_EX_WITH_SGID)) cq->dv_cq.wc_read_sgid = efa_wc_read_sgid; if (efa_attr && 
(efa_attr->wc_flags & EFADV_WC_EX_WITH_IS_UNSOLICITED)) cq->dv_cq.wc_is_unsolicited = efa_wc_is_unsolicited; } static void efa_sub_cq_initialize(struct efa_sub_cq *sub_cq, uint8_t *buf, int sub_cq_size, int cqe_size) { sub_cq->consumed_cnt = 0; sub_cq->phase = 1; sub_cq->buf = buf; sub_cq->qmask = sub_cq_size - 1; sub_cq->cqe_size = cqe_size; sub_cq->ref_cnt = 0; } static struct ibv_cq_ex *create_cq(struct ibv_context *ibvctx, struct ibv_cq_init_attr_ex *attr, struct efadv_cq_init_attr *efa_attr) { struct efa_context *ctx = to_efa_context(ibvctx); uint16_t cqe_size = ctx->ex_cqe_size; struct efa_create_cq_resp resp = {}; struct efa_create_cq cmd = {}; uint16_t num_sub_cqs; struct efa_cq *cq; int sub_buf_size; int sub_cq_size; uint8_t *buf; int err; int i; if (!check_comp_mask(attr->comp_mask, 0) || !check_comp_mask(attr->wc_flags, IBV_WC_STANDARD_FLAGS)) { verbs_err(verbs_get_ctx(ibvctx), "Invalid comp_mask or wc_flags\n"); errno = EOPNOTSUPP; return NULL; } if (attr->channel && !EFA_DEV_CAP(ctx, CQ_NOTIFICATIONS)) { errno = EOPNOTSUPP; return NULL; } cq = calloc(1, sizeof(*cq) + sizeof(*cq->sub_cq_arr) * ctx->sub_cqs_per_cq); if (!cq) return NULL; if (efa_attr && (efa_attr->wc_flags & EFADV_WC_EX_WITH_SGID)) cmd.flags |= EFA_CREATE_CQ_WITH_SGID; num_sub_cqs = ctx->sub_cqs_per_cq; cmd.num_sub_cqs = num_sub_cqs; cmd.cq_entry_size = cqe_size; if (attr->channel) cmd.flags |= EFA_CREATE_CQ_WITH_COMPLETION_CHANNEL; attr->cqe = roundup_pow_of_two(attr->cqe); err = ibv_cmd_create_cq_ex(ibvctx, attr, &cq->verbs_cq, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp), 0); if (err) { errno = err; goto err_free_cq; } sub_cq_size = cq->verbs_cq.cq.cqe; cq->cqn = resp.cq_idx; cq->buf_size = resp.q_mmap_size; cq->num_sub_cqs = num_sub_cqs; cq->cqe_size = cqe_size; cq->dev = ibvctx->device; cq->buf = mmap(NULL, cq->buf_size, PROT_READ, MAP_SHARED, ibvctx->cmd_fd, resp.q_mmap_key); if (cq->buf == MAP_FAILED) goto err_destroy_cq; buf = cq->buf; sub_buf_size = cq->cqe_size * sub_cq_size; for (i = 0; i < num_sub_cqs; i++) { efa_sub_cq_initialize(&cq->sub_cq_arr[i], buf, sub_cq_size, cq->cqe_size); buf += sub_buf_size; } if (resp.comp_mask & EFA_CREATE_CQ_RESP_DB_OFF) { cq->db_mmap_addr = mmap(NULL, to_efa_dev(ibvctx->device)->pg_sz, PROT_WRITE, MAP_SHARED, ibvctx->cmd_fd, resp.db_mmap_key); if (cq->db_mmap_addr == MAP_FAILED) goto err_unmap_cq; cq->db = (uint32_t *)(cq->db_mmap_addr + resp.db_off); } efa_cq_fill_pfns(cq, attr, efa_attr); pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE); return &cq->verbs_cq.cq_ex; err_unmap_cq: munmap(cq->buf, cq->buf_size); err_destroy_cq: ibv_cmd_destroy_cq(&cq->verbs_cq.cq); err_free_cq: free(cq); verbs_err(verbs_get_ctx(ibvctx), "Failed to create CQ\n"); return NULL; } struct ibv_cq *efa_create_cq(struct ibv_context *ibvctx, int ncqe, struct ibv_comp_channel *channel, int vec) { struct ibv_cq_init_attr_ex attr_ex = { .cqe = ncqe, .channel = channel, .comp_vector = vec }; struct ibv_cq_ex *ibvcqx; ibvcqx = create_cq(ibvctx, &attr_ex, NULL); return ibvcqx ? 
ibv_cq_ex_to_cq(ibvcqx) : NULL; } struct ibv_cq_ex *efa_create_cq_ex(struct ibv_context *ibvctx, struct ibv_cq_init_attr_ex *attr_ex) { return create_cq(ibvctx, attr_ex, NULL); } struct ibv_cq_ex *efadv_create_cq(struct ibv_context *ibvctx, struct ibv_cq_init_attr_ex *attr_ex, struct efadv_cq_init_attr *efa_attr, uint32_t inlen) { uint64_t supp_wc_flags = 0; struct efa_context *ctx; if (!is_efa_dev(ibvctx->device)) { verbs_err(verbs_get_ctx(ibvctx), "Not an EFA device\n"); errno = EOPNOTSUPP; return NULL; } if (!vext_field_avail(struct efadv_cq_init_attr, wc_flags, inlen) || efa_attr->comp_mask || (inlen > sizeof(*efa_attr) && !is_ext_cleared(efa_attr, inlen))) { verbs_err(verbs_get_ctx(ibvctx), "Compatibility issues\n"); errno = EINVAL; return NULL; } ctx = to_efa_context(ibvctx); if (EFA_DEV_CAP(ctx, CQ_WITH_SGID)) supp_wc_flags |= EFADV_WC_EX_WITH_SGID; if (EFA_DEV_CAP(ctx, UNSOLICITED_WRITE_RECV)) supp_wc_flags |= EFADV_WC_EX_WITH_IS_UNSOLICITED; if (!check_comp_mask(efa_attr->wc_flags, supp_wc_flags)) { verbs_err(verbs_get_ctx(ibvctx), "Invalid EFA wc_flags[%#lx]\n", efa_attr->wc_flags); errno = EOPNOTSUPP; return NULL; } return create_cq(ibvctx, attr_ex, efa_attr); } struct efadv_cq *efadv_cq_from_ibv_cq_ex(struct ibv_cq_ex *ibvcqx) { struct efa_cq *cq = to_efa_cq_ex(ibvcqx); return &cq->dv_cq; } int efa_destroy_cq(struct ibv_cq *ibvcq) { struct efa_cq *cq = to_efa_cq(ibvcq); int err; err = ibv_cmd_destroy_cq(ibvcq); if (err) { verbs_err(verbs_get_ctx(ibvcq->context), "Failed to destroy CQ[%u]\n", cq->cqn); return err; } munmap(cq->db_mmap_addr, to_efa_dev(cq->dev)->pg_sz); munmap(cq->buf, cq->buf_size); pthread_spin_destroy(&cq->lock); free(cq); return 0; } static void efa_cq_inc_ref_cnt(struct efa_cq *cq, uint8_t sub_cq_idx) { cq->sub_cq_arr[sub_cq_idx].ref_cnt++; } static void efa_cq_dec_ref_cnt(struct efa_cq *cq, uint8_t sub_cq_idx) { cq->sub_cq_arr[sub_cq_idx].ref_cnt--; } static void efa_wq_terminate(struct efa_wq *wq, int pgsz) { void *db_aligned; pthread_spin_destroy(&wq->wqlock); db_aligned = (void *)((uintptr_t)wq->db & ~(pgsz - 1)); munmap(db_aligned, pgsz); free(wq->wrid_idx_pool); free(wq->wrid); } static int efa_wq_initialize(struct efa_wq *wq, struct efa_wq_init_attr *attr) { uint8_t *db_base; int err; int i; wq->wrid = malloc(wq->wqe_cnt * sizeof(*wq->wrid)); if (!wq->wrid) return ENOMEM; wq->wrid_idx_pool = malloc(wq->wqe_cnt * sizeof(uint32_t)); if (!wq->wrid_idx_pool) { err = ENOMEM; goto err_free_wrid; } db_base = mmap(NULL, attr->pgsz, PROT_WRITE, MAP_SHARED, attr->cmd_fd, attr->db_mmap_key); if (db_base == MAP_FAILED) { err = errno; goto err_free_wrid_idx_pool; } wq->db = (uint32_t *)(db_base + attr->db_off); /* Initialize the wrid free indexes pool.
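* Every slot starts out holding its own index; efa_wq_get_next_wrid_idx_locked() hands indexes out and completion processing returns them.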
*/ for (i = 0; i < wq->wqe_cnt; i++) wq->wrid_idx_pool[i] = i; pthread_spin_init(&wq->wqlock, PTHREAD_PROCESS_PRIVATE); wq->sub_cq_idx = attr->sub_cq_idx; return 0; err_free_wrid_idx_pool: free(wq->wrid_idx_pool); err_free_wrid: free(wq->wrid); return err; } static void efa_sq_terminate(struct efa_qp *qp) { struct efa_sq *sq = &qp->sq; if (!sq->wq.wqe_cnt) return; munmap(sq->desc - sq->desc_offset, sq->desc_ring_mmap_size); free(sq->local_queue); efa_wq_terminate(&sq->wq, qp->page_size); } static int efa_sq_initialize(struct efa_qp *qp, const struct ibv_qp_init_attr_ex *attr, struct efa_create_qp_resp *resp) { struct efa_context *ctx = to_efa_context(qp->verbs_qp.qp.context); struct efa_wq_init_attr wq_attr; struct efa_sq *sq = &qp->sq; size_t desc_ring_size; int err; if (!sq->wq.wqe_cnt) return 0; wq_attr = (struct efa_wq_init_attr) { .db_mmap_key = resp->sq_db_mmap_key, .db_off = resp->sq_db_offset, .cmd_fd = qp->verbs_qp.qp.context->cmd_fd, .pgsz = qp->page_size, .sub_cq_idx = resp->send_sub_cq_idx, }; err = efa_wq_initialize(&qp->sq.wq, &wq_attr); if (err) { verbs_err(&ctx->ibvctx, "SQ[%u] efa_wq_initialize failed\n", qp->verbs_qp.qp.qp_num); return err; } sq->desc_offset = resp->llq_desc_offset; desc_ring_size = sq->wq.wqe_cnt * sizeof(struct efa_io_tx_wqe); sq->desc_ring_mmap_size = align(desc_ring_size + sq->desc_offset, qp->page_size); sq->max_inline_data = attr->cap.max_inline_data; sq->local_queue = malloc(desc_ring_size); if (!sq->local_queue) { err = ENOMEM; goto err_terminate_wq; } sq->desc = mmap(NULL, sq->desc_ring_mmap_size, PROT_WRITE, MAP_SHARED, qp->verbs_qp.qp.context->cmd_fd, resp->llq_desc_mmap_key); if (sq->desc == MAP_FAILED) { verbs_err(&ctx->ibvctx, "SQ buffer mmap failed\n"); err = errno; goto err_free_local_queue; } sq->desc += sq->desc_offset; sq->max_wr_rdma_sge = min_t(uint16_t, ctx->max_wr_rdma_sge, EFA_IO_TX_DESC_NUM_RDMA_BUFS); sq->max_batch_wr = ctx->max_tx_batch ? (ctx->max_tx_batch * 64) / sizeof(struct efa_io_tx_wqe) : UINT16_MAX; if (ctx->min_sq_wr) { /* The device can't accept a doorbell for the whole SQ at once, * set the max batch to at most (SQ size - 1).
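* (efa_post_send() and efa_send_wr_complete() split longer chains into multiple doorbells based on this limit.)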
*/ sq->max_batch_wr = min_t(uint32_t, sq->max_batch_wr, sq->wq.wqe_cnt - 1); } return 0; err_free_local_queue: free(sq->local_queue); err_terminate_wq: efa_wq_terminate(&sq->wq, qp->page_size); return err; } static void efa_rq_terminate(struct efa_qp *qp) { struct efa_rq *rq = &qp->rq; if (!rq->wq.wqe_cnt) return; munmap(rq->buf, rq->buf_size); efa_wq_terminate(&rq->wq, qp->page_size); } static int efa_rq_initialize(struct efa_qp *qp, struct efa_create_qp_resp *resp) { struct efa_wq_init_attr wq_attr; struct efa_rq *rq = &qp->rq; int err; if (!rq->wq.wqe_cnt) return 0; wq_attr = (struct efa_wq_init_attr) { .db_mmap_key = resp->rq_db_mmap_key, .db_off = resp->rq_db_offset, .cmd_fd = qp->verbs_qp.qp.context->cmd_fd, .pgsz = qp->page_size, .sub_cq_idx = resp->recv_sub_cq_idx, }; err = efa_wq_initialize(&qp->rq.wq, &wq_attr); if (err) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "RQ efa_wq_initialize failed\n"); return err; } rq->buf_size = resp->rq_mmap_size; rq->buf = mmap(NULL, rq->buf_size, PROT_WRITE, MAP_SHARED, qp->verbs_qp.qp.context->cmd_fd, resp->rq_mmap_key); if (rq->buf == MAP_FAILED) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "RQ buffer mmap failed\n"); err = errno; goto err_terminate_wq; } return 0; err_terminate_wq: efa_wq_terminate(&rq->wq, qp->page_size); return err; } static void efa_qp_init_indices(struct efa_qp *qp) { qp->sq.wq.wqe_posted = 0; qp->sq.wq.wqe_completed = 0; qp->sq.wq.pc = 0; qp->sq.wq.wrid_idx_pool_next = 0; qp->rq.wq.wqe_posted = 0; qp->rq.wq.wqe_completed = 0; qp->rq.wq.pc = 0; qp->rq.wq.wrid_idx_pool_next = 0; } static void efa_setup_qp(struct efa_context *ctx, struct efa_qp *qp, struct ibv_qp_cap *cap, size_t page_size) { uint16_t rq_desc_cnt; efa_qp_init_indices(qp); qp->sq.wq.wqe_cnt = roundup_pow_of_two(max_t(uint32_t, cap->max_send_wr, ctx->min_sq_wr)); qp->sq.wq.max_sge = cap->max_send_sge; qp->sq.wq.desc_mask = qp->sq.wq.wqe_cnt - 1; qp->rq.wq.max_sge = cap->max_recv_sge; rq_desc_cnt = roundup_pow_of_two(cap->max_recv_sge * cap->max_recv_wr); qp->rq.wq.desc_mask = rq_desc_cnt - 1; qp->rq.wq.wqe_cnt = rq_desc_cnt / qp->rq.wq.max_sge; qp->page_size = page_size; } static void efa_lock_cqs(struct ibv_qp *ibvqp) { struct efa_cq *send_cq = to_efa_cq(ibvqp->send_cq); struct efa_cq *recv_cq = to_efa_cq(ibvqp->recv_cq); if (recv_cq == send_cq) { pthread_spin_lock(&recv_cq->lock); } else { pthread_spin_lock(&recv_cq->lock); pthread_spin_lock(&send_cq->lock); } } static void efa_unlock_cqs(struct ibv_qp *ibvqp) { struct efa_cq *send_cq = to_efa_cq(ibvqp->send_cq); struct efa_cq *recv_cq = to_efa_cq(ibvqp->recv_cq); if (recv_cq == send_cq) { pthread_spin_unlock(&recv_cq->lock); } else { pthread_spin_unlock(&recv_cq->lock); pthread_spin_unlock(&send_cq->lock); } } static void efa_qp_fill_wr_pfns(struct ibv_qp_ex *ibvqpx, struct ibv_qp_init_attr_ex *attr_ex); static int efa_check_qp_attr(struct efa_context *ctx, struct ibv_qp_init_attr_ex *attr, struct efadv_qp_init_attr *efa_attr) { uint64_t supp_ud_send_ops_mask = IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_SEND_WITH_IMM; uint64_t supp_srd_send_ops_mask = IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_SEND_WITH_IMM; uint64_t supp_send_ops_mask; uint16_t supp_efa_flags = 0; if (EFA_DEV_CAP(ctx, RDMA_READ)) supp_srd_send_ops_mask |= IBV_QP_EX_WITH_RDMA_READ; if (EFA_DEV_CAP(ctx, RDMA_WRITE)) supp_srd_send_ops_mask |= IBV_QP_EX_WITH_RDMA_WRITE | IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM; if (EFA_DEV_CAP(ctx, UNSOLICITED_WRITE_RECV)) supp_efa_flags |= EFADV_QP_FLAGS_UNSOLICITED_WRITE_RECV; #define 
EFA_CREATE_QP_SUPP_ATTR_MASK \ (IBV_QP_INIT_ATTR_PD | IBV_QP_INIT_ATTR_SEND_OPS_FLAGS) if (attr->qp_type == IBV_QPT_DRIVER && efa_attr->driver_qp_type != EFADV_QP_DRIVER_TYPE_SRD) { verbs_err(&ctx->ibvctx, "Driver QP type must be SRD\n"); return EOPNOTSUPP; } if (!check_comp_mask(efa_attr->flags, supp_efa_flags)) { verbs_err(&ctx->ibvctx, "Unsupported EFA flags[%#x] supported[%#x]\n", efa_attr->flags, supp_efa_flags); return EOPNOTSUPP; } if (!check_comp_mask(attr->comp_mask, EFA_CREATE_QP_SUPP_ATTR_MASK)) { verbs_err(&ctx->ibvctx, "Unsupported comp_mask[%#x] supported[%#x]\n", attr->comp_mask, EFA_CREATE_QP_SUPP_ATTR_MASK); return EOPNOTSUPP; } if (!(attr->comp_mask & IBV_QP_INIT_ATTR_PD)) { verbs_err(&ctx->ibvctx, "Does not support PD in init attr\n"); return EINVAL; } if (attr->comp_mask & IBV_QP_INIT_ATTR_SEND_OPS_FLAGS) { switch (attr->qp_type) { case IBV_QPT_UD: supp_send_ops_mask = supp_ud_send_ops_mask; break; case IBV_QPT_DRIVER: supp_send_ops_mask = supp_srd_send_ops_mask; break; default: verbs_err(&ctx->ibvctx, "Invalid QP type %u\n", attr->qp_type); return EOPNOTSUPP; } if (!check_comp_mask(attr->send_ops_flags, supp_send_ops_mask)) { verbs_err(&ctx->ibvctx, "Unsupported send_ops_flags[%" PRIx64 "] supported [%" PRIx64 "]\n", attr->send_ops_flags, supp_send_ops_mask); return EOPNOTSUPP; } } if (!attr->recv_cq || !attr->send_cq) { verbs_err(&ctx->ibvctx, "Send/Receive CQ not provided\n"); return EINVAL; } if (attr->srq) { verbs_err(&ctx->ibvctx, "SRQ is not supported\n"); return EINVAL; } return 0; } static int efa_check_qp_limits(struct efa_context *ctx, struct ibv_qp_init_attr_ex *attr) { if (attr->cap.max_send_sge > ctx->max_sq_sge) { verbs_err(&ctx->ibvctx, "Max send SGE %u > %u\n", attr->cap.max_send_sge, ctx->max_sq_sge); return EINVAL; } if (attr->cap.max_recv_sge > ctx->max_rq_sge) { verbs_err(&ctx->ibvctx, "Max receive SGE %u > %u\n", attr->cap.max_recv_sge, ctx->max_rq_sge); return EINVAL; } if (attr->cap.max_send_wr > ctx->max_sq_wr) { verbs_err(&ctx->ibvctx, "Max send WR %u > %u\n", attr->cap.max_send_wr, ctx->max_sq_wr); return EINVAL; } if (attr->cap.max_recv_wr > ctx->max_rq_wr) { verbs_err(&ctx->ibvctx, "Max receive WR %u > %u\n", attr->cap.max_recv_wr, ctx->max_rq_wr); return EINVAL; } return 0; } static struct ibv_qp *create_qp(struct ibv_context *ibvctx, struct ibv_qp_init_attr_ex *attr, struct efadv_qp_init_attr *efa_attr) { struct efa_context *ctx = to_efa_context(ibvctx); struct efa_dev *dev = to_efa_dev(ibvctx->device); struct efa_create_qp_resp resp = {}; struct efa_create_qp req = {}; struct efa_cq *send_cq; struct efa_cq *recv_cq; struct ibv_qp *ibvqp; struct efa_qp *qp; int err; err = efa_check_qp_attr(ctx, attr, efa_attr); if (err) goto err_out; err = efa_check_qp_limits(ctx, attr); if (err) goto err_out; qp = calloc(1, sizeof(*qp)); if (!qp) { err = ENOMEM; goto err_out; } efa_setup_qp(ctx, qp, &attr->cap, dev->pg_sz); attr->cap.max_send_wr = qp->sq.wq.wqe_cnt; attr->cap.max_recv_wr = qp->rq.wq.wqe_cnt; req.rq_ring_size = (qp->rq.wq.desc_mask + 1) * sizeof(struct efa_io_rx_desc); req.sq_ring_size = (attr->cap.max_send_wr) * sizeof(struct efa_io_tx_wqe); if (attr->qp_type == IBV_QPT_DRIVER) req.driver_qp_type = efa_attr->driver_qp_type; if (efa_attr->flags & EFADV_QP_FLAGS_UNSOLICITED_WRITE_RECV) req.flags |= EFA_CREATE_QP_WITH_UNSOLICITED_WRITE_RECV; req.sl = efa_attr->sl; err = ibv_cmd_create_qp_ex(ibvctx, &qp->verbs_qp, attr, &req.ibv_cmd, sizeof(req), &resp.ibv_resp, sizeof(resp)); if (err) goto err_free_qp; ibvqp = &qp->verbs_qp.qp; ibvqp->state 
= IBV_QPS_RESET; qp->sq_sig_all = attr->sq_sig_all; qp->dev = ibvctx->device; err = efa_rq_initialize(qp, &resp); if (err) goto err_destroy_qp; err = efa_sq_initialize(qp, attr, &resp); if (err) goto err_terminate_rq; pthread_spin_lock(&ctx->qp_table_lock); ctx->qp_table[ibvqp->qp_num & ctx->qp_table_sz_m1] = qp; pthread_spin_unlock(&ctx->qp_table_lock); send_cq = to_efa_cq(attr->send_cq); pthread_spin_lock(&send_cq->lock); efa_cq_inc_ref_cnt(send_cq, resp.send_sub_cq_idx); pthread_spin_unlock(&send_cq->lock); recv_cq = to_efa_cq(attr->recv_cq); pthread_spin_lock(&recv_cq->lock); efa_cq_inc_ref_cnt(recv_cq, resp.recv_sub_cq_idx); pthread_spin_unlock(&recv_cq->lock); if (attr->comp_mask & IBV_QP_INIT_ATTR_SEND_OPS_FLAGS) { efa_qp_fill_wr_pfns(&qp->verbs_qp.qp_ex, attr); qp->verbs_qp.comp_mask |= VERBS_QP_EX; } return ibvqp; err_terminate_rq: efa_rq_terminate(qp); err_destroy_qp: ibv_cmd_destroy_qp(ibvqp); err_free_qp: free(qp); err_out: errno = err; verbs_err(verbs_get_ctx(ibvctx), "Failed to create QP\n"); return NULL; } struct ibv_qp *efa_create_qp(struct ibv_pd *ibvpd, struct ibv_qp_init_attr *attr) { struct ibv_qp_init_attr_ex attr_ex = {}; struct efadv_qp_init_attr efa_attr = {}; struct ibv_qp *ibvqp; if (attr->qp_type != IBV_QPT_UD) { verbs_err(verbs_get_ctx(ibvpd->context), "Unsupported QP type %d\n", attr->qp_type); errno = EOPNOTSUPP; return NULL; } memcpy(&attr_ex, attr, sizeof(*attr)); attr_ex.comp_mask = IBV_QP_INIT_ATTR_PD; attr_ex.pd = ibvpd; ibvqp = create_qp(ibvpd->context, &attr_ex, &efa_attr); if (ibvqp) memcpy(attr, &attr_ex, sizeof(*attr)); return ibvqp; } struct ibv_qp *efa_create_qp_ex(struct ibv_context *ibvctx, struct ibv_qp_init_attr_ex *attr_ex) { struct efadv_qp_init_attr efa_attr = {}; if (attr_ex->qp_type != IBV_QPT_UD) { verbs_err(verbs_get_ctx(ibvctx), "Unsupported QP type\n"); errno = EOPNOTSUPP; return NULL; } return create_qp(ibvctx, attr_ex, &efa_attr); } struct ibv_qp *efadv_create_driver_qp(struct ibv_pd *ibvpd, struct ibv_qp_init_attr *attr, uint32_t driver_qp_type) { struct ibv_qp_init_attr_ex attr_ex = {}; struct efadv_qp_init_attr efa_attr = {}; struct ibv_qp *ibvqp; if (!is_efa_dev(ibvpd->context->device)) { verbs_err(verbs_get_ctx(ibvpd->context), "Not an EFA device\n"); errno = EOPNOTSUPP; return NULL; } if (attr->qp_type != IBV_QPT_DRIVER) { verbs_err(verbs_get_ctx(ibvpd->context), "QP type not IBV_QPT_DRIVER\n"); errno = EINVAL; return NULL; } memcpy(&attr_ex, attr, sizeof(*attr)); attr_ex.comp_mask = IBV_QP_INIT_ATTR_PD; attr_ex.pd = ibvpd; efa_attr.driver_qp_type = driver_qp_type; ibvqp = create_qp(ibvpd->context, &attr_ex, &efa_attr); if (ibvqp) memcpy(attr, &attr_ex, sizeof(*attr)); return ibvqp; } struct ibv_qp *efadv_create_qp_ex(struct ibv_context *ibvctx, struct ibv_qp_init_attr_ex *attr_ex, struct efadv_qp_init_attr *efa_attr, uint32_t inlen) { struct efadv_qp_init_attr local_efa_attr = {}; if (!is_efa_dev(ibvctx->device)) { verbs_err(verbs_get_ctx(ibvctx), "Not an EFA device\n"); errno = EOPNOTSUPP; return NULL; } if (attr_ex->qp_type != IBV_QPT_DRIVER || !vext_field_avail(struct efadv_qp_init_attr, driver_qp_type, inlen) || efa_attr->comp_mask || !is_reserved_cleared(efa_attr->reserved) || (inlen > sizeof(*efa_attr) && !is_ext_cleared(efa_attr, inlen))) { verbs_err(verbs_get_ctx(ibvctx), "Compatibility issues\n"); errno = EINVAL; return NULL; } memcpy(&local_efa_attr, efa_attr, min_t(uint32_t, inlen, sizeof(local_efa_attr))); return create_qp(ibvctx, attr_ex, &local_efa_attr); } int efa_modify_qp(struct ibv_qp *ibvqp, struct 
ibv_qp_attr *attr, int attr_mask) { struct efa_qp *qp = to_efa_qp(ibvqp); struct ibv_modify_qp cmd = {}; int err; err = ibv_cmd_modify_qp(ibvqp, attr, attr_mask, &cmd, sizeof(cmd)); if (err) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "Failed to modify QP[%u]\n", qp->verbs_qp.qp.qp_num); return err; } if (attr_mask & IBV_QP_STATE) { qp->verbs_qp.qp.state = attr->qp_state; /* transition to reset */ if (qp->verbs_qp.qp.state == IBV_QPS_RESET) efa_qp_init_indices(qp); } return 0; } int efa_query_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd; return ibv_cmd_query_qp(ibvqp, attr, attr_mask, init_attr, &cmd, sizeof(cmd)); } int efa_query_qp_data_in_order(struct ibv_qp *ibvqp, enum ibv_wr_opcode op, uint32_t flags) { struct efa_context *ctx = to_efa_context(ibvqp->context); int caps = 0; if (EFA_DEV_CAP(ctx, DATA_POLLING_128)) caps |= IBV_QUERY_QP_DATA_IN_ORDER_ALIGNED_128_BYTES; return caps; } int efa_destroy_qp(struct ibv_qp *ibvqp) { struct efa_context *ctx = to_efa_context(ibvqp->context); struct efa_qp *qp = to_efa_qp(ibvqp); int err; err = ibv_cmd_destroy_qp(ibvqp); if (err) { verbs_err(&ctx->ibvctx, "Failed to destroy QP[%u]\n", ibvqp->qp_num); return err; } pthread_spin_lock(&ctx->qp_table_lock); efa_lock_cqs(ibvqp); efa_cq_dec_ref_cnt(to_efa_cq(ibvqp->send_cq), qp->sq.wq.sub_cq_idx); efa_cq_dec_ref_cnt(to_efa_cq(ibvqp->recv_cq), qp->rq.wq.sub_cq_idx); ctx->qp_table[ibvqp->qp_num & ctx->qp_table_sz_m1] = NULL; efa_unlock_cqs(ibvqp); pthread_spin_unlock(&ctx->qp_table_lock); efa_sq_terminate(qp); efa_rq_terminate(qp); free(qp); return 0; } static void efa_set_tx_buf(struct efa_io_tx_buf_desc *tx_buf, uint64_t addr, uint32_t lkey, uint32_t length) { tx_buf->length = length; EFA_SET(&tx_buf->lkey, EFA_IO_TX_BUF_DESC_LKEY, lkey); tx_buf->buf_addr_lo = addr & 0xffffffff; tx_buf->buf_addr_hi = addr >> 32; } static void efa_post_send_sgl(struct efa_io_tx_buf_desc *tx_bufs, const struct ibv_sge *sg_list, int num_sge) { const struct ibv_sge *sge; size_t i; for (i = 0; i < num_sge; i++) { sge = &sg_list[i]; efa_set_tx_buf(&tx_bufs[i], sge->addr, sge->lkey, sge->length); } } static void efa_post_send_inline_data(const struct ibv_send_wr *wr, struct efa_io_tx_wqe *tx_wqe) { const struct ibv_sge *sgl = wr->sg_list; uint32_t total_length = 0; uint32_t length; size_t i; for (i = 0; i < wr->num_sge; i++) { length = sgl[i].length; memcpy(tx_wqe->data.inline_data + total_length, (void *)(uintptr_t)sgl[i].addr, length); total_length += length; } EFA_SET(&tx_wqe->meta.ctrl1, EFA_IO_TX_META_DESC_INLINE_MSG, 1); tx_wqe->meta.length = total_length; } static size_t efa_sge_total_bytes(const struct ibv_sge *sg_list, int num_sge) { size_t bytes = 0; size_t i; for (i = 0; i < num_sge; i++) bytes += sg_list[i].length; return bytes; } static size_t efa_buf_list_total_bytes(const struct ibv_data_buf *buf_list, size_t num_buf) { size_t bytes = 0; size_t i; for (i = 0; i < num_buf; i++) bytes += buf_list[i].length; return bytes; } static void efa_sq_advance_post_idx(struct efa_sq *sq) { struct efa_wq *wq = &sq->wq; wq->wqe_posted++; wq->pc++; if (!(wq->pc & wq->desc_mask)) wq->phase++; } static inline void efa_rq_ring_doorbell(struct efa_rq *rq, uint16_t pc) { udma_to_device_barrier(); mmio_write32(rq->wq.db, pc); } static inline void efa_sq_ring_doorbell(struct efa_sq *sq, uint16_t pc) { mmio_write32(sq->wq.db, pc); } static void efa_set_common_ctrl_flags(struct efa_io_tx_meta_desc *desc, struct efa_sq *sq, enum 
efa_io_send_op_type op_type) { EFA_SET(&desc->ctrl1, EFA_IO_TX_META_DESC_META_DESC, 1); EFA_SET(&desc->ctrl1, EFA_IO_TX_META_DESC_OP_TYPE, op_type); EFA_SET(&desc->ctrl2, EFA_IO_TX_META_DESC_PHASE, sq->wq.phase); EFA_SET(&desc->ctrl2, EFA_IO_TX_META_DESC_FIRST, 1); EFA_SET(&desc->ctrl2, EFA_IO_TX_META_DESC_LAST, 1); EFA_SET(&desc->ctrl2, EFA_IO_TX_META_DESC_COMP_REQ, 1); } static int efa_post_send_validate(struct efa_qp *qp, unsigned int wr_flags) { if (unlikely(qp->verbs_qp.qp.state != IBV_QPS_RTS && qp->verbs_qp.qp.state != IBV_QPS_SQD)) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "SQ[%u] is in invalid state\n", qp->verbs_qp.qp.qp_num); return EINVAL; } if (unlikely(!(wr_flags & IBV_SEND_SIGNALED) && !qp->sq_sig_all)) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "SQ[%u] Non signaled WRs not supported\n", qp->verbs_qp.qp.qp_num); return EINVAL; } if (unlikely(wr_flags & ~(IBV_SEND_SIGNALED | IBV_SEND_INLINE))) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "SQ[%u] Unsupported wr_flags[%#x] supported[%#x]\n", qp->verbs_qp.qp.qp_num, wr_flags, ~(IBV_SEND_SIGNALED | IBV_SEND_INLINE)); return EINVAL; } if (unlikely(qp->sq.wq.wqe_posted - qp->sq.wq.wqe_completed == qp->sq.wq.wqe_cnt)) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "SQ[%u] is full wqe_posted[%u] wqe_completed[%u] wqe_cnt[%u]\n", qp->verbs_qp.qp.qp_num, qp->sq.wq.wqe_posted, qp->sq.wq.wqe_completed, qp->sq.wq.wqe_cnt); return ENOMEM; } return 0; } static int efa_post_send_validate_wr(struct efa_qp *qp, const struct ibv_send_wr *wr) { int err; err = efa_post_send_validate(qp, wr->send_flags); if (unlikely(err)) return err; if (unlikely(wr->opcode != IBV_WR_SEND && wr->opcode != IBV_WR_SEND_WITH_IMM)) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "SQ[%u] unsupported opcode %d\n", qp->verbs_qp.qp.qp_num, wr->opcode); return EINVAL; } if (wr->send_flags & IBV_SEND_INLINE) { if (unlikely(efa_sge_total_bytes(wr->sg_list, wr->num_sge) > qp->sq.max_inline_data)) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "SQ[%u] WR total bytes %zu > %zu\n", qp->verbs_qp.qp.qp_num, efa_sge_total_bytes(wr->sg_list, wr->num_sge), qp->sq.max_inline_data); return EINVAL; } } else { if (unlikely(wr->num_sge > qp->sq.wq.max_sge)) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "SQ[%u] WR num_sge %d > %d\n", qp->verbs_qp.qp.qp_num, wr->num_sge, qp->sq.wq.max_sge); return EINVAL; } } return 0; } int efa_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad) { struct efa_io_tx_meta_desc *meta_desc; struct efa_qp *qp = to_efa_qp(ibvqp); struct efa_io_tx_wqe tx_wqe; struct efa_sq *sq = &qp->sq; struct efa_wq *wq = &sq->wq; uint32_t sq_desc_offset; uint32_t curbatch = 0; struct efa_ah *ah; int err = 0; mmio_wc_spinlock(&wq->wqlock); while (wr) { err = efa_post_send_validate_wr(qp, wr); if (err) { *bad = wr; goto ring_db; } memset(&tx_wqe, 0, sizeof(tx_wqe)); meta_desc = &tx_wqe.meta; ah = to_efa_ah(wr->wr.ud.ah); if (wr->send_flags & IBV_SEND_INLINE) { efa_post_send_inline_data(wr, &tx_wqe); } else { meta_desc->length = wr->num_sge; efa_post_send_sgl(tx_wqe.data.sgl, wr->sg_list, wr->num_sge); } if (wr->opcode == IBV_WR_SEND_WITH_IMM) { meta_desc->immediate_data = be32toh(wr->imm_data); EFA_SET(&meta_desc->ctrl1, EFA_IO_TX_META_DESC_HAS_IMM, 1); } /* Set rest of the descriptor fields */ efa_set_common_ctrl_flags(meta_desc, sq, EFA_IO_SEND); meta_desc->req_id = efa_wq_get_next_wrid_idx_locked(wq, wr->wr_id); meta_desc->dest_qp_num = wr->wr.ud.remote_qpn; meta_desc->ah = ah->efa_ah; meta_desc->qkey 
= wr->wr.ud.remote_qkey; /* Copy descriptor */ sq_desc_offset = (wq->pc & wq->desc_mask) * sizeof(tx_wqe); mmio_memcpy_x64(sq->desc + sq_desc_offset, &tx_wqe, sizeof(tx_wqe)); /* advance index and change phase */ efa_sq_advance_post_idx(sq); curbatch++; if (curbatch == sq->max_batch_wr) { curbatch = 0; mmio_flush_writes(); efa_sq_ring_doorbell(sq, wq->pc); mmio_wc_start(); } rdma_tracepoint(rdma_core_efa, post_send, qp->dev->name, wr->wr_id, EFA_IO_SEND, ibvqp->qp_num, meta_desc->dest_qp_num, ah->efa_ah); wr = wr->next; } ring_db: if (curbatch) { mmio_flush_writes(); efa_sq_ring_doorbell(sq, wq->pc); } /* * Not using mmio_wc_spinunlock as the doorbell write should be done * inside the lock. */ pthread_spin_unlock(&wq->wqlock); return err; } static struct efa_io_tx_wqe *efa_send_wr_common(struct ibv_qp_ex *ibvqpx, enum efa_io_send_op_type op_type) { struct efa_qp *qp = to_efa_qp_ex(ibvqpx); struct efa_sq *sq = &qp->sq; struct efa_io_tx_meta_desc *meta_desc; int err; if (unlikely(qp->wr_session_err)) return NULL; err = efa_post_send_validate(qp, ibvqpx->wr_flags); if (unlikely(err)) { qp->wr_session_err = err; return NULL; } sq->curr_tx_wqe = (struct efa_io_tx_wqe *)sq->local_queue + sq->num_wqe_pending; memset(sq->curr_tx_wqe, 0, sizeof(*sq->curr_tx_wqe)); meta_desc = &sq->curr_tx_wqe->meta; efa_set_common_ctrl_flags(meta_desc, sq, op_type); meta_desc->req_id = efa_wq_get_next_wrid_idx_locked(&sq->wq, ibvqpx->wr_id); /* advance index and change phase */ efa_sq_advance_post_idx(sq); sq->num_wqe_pending++; return sq->curr_tx_wqe; } static void efa_send_wr_set_imm_data(struct efa_io_tx_wqe *tx_wqe, __be32 imm_data) { struct efa_io_tx_meta_desc *meta_desc; meta_desc = &tx_wqe->meta; meta_desc->immediate_data = be32toh(imm_data); EFA_SET(&meta_desc->ctrl1, EFA_IO_TX_META_DESC_HAS_IMM, 1); } static void efa_send_wr_set_rdma_addr(struct efa_io_tx_wqe *tx_wqe, uint32_t rkey, uint64_t remote_addr) { struct efa_io_remote_mem_addr *remote_mem; remote_mem = &tx_wqe->data.rdma_req.remote_mem; remote_mem->rkey = rkey; remote_mem->buf_addr_lo = remote_addr & 0xFFFFFFFF; remote_mem->buf_addr_hi = remote_addr >> 32; } static void efa_send_wr_send(struct ibv_qp_ex *ibvqpx) { efa_send_wr_common(ibvqpx, EFA_IO_SEND); } static void efa_send_wr_send_imm(struct ibv_qp_ex *ibvqpx, __be32 imm_data) { struct efa_io_tx_wqe *tx_wqe; tx_wqe = efa_send_wr_common(ibvqpx, EFA_IO_SEND); if (unlikely(!tx_wqe)) return; efa_send_wr_set_imm_data(tx_wqe, imm_data); } static void efa_send_wr_rdma_read(struct ibv_qp_ex *ibvqpx, uint32_t rkey, uint64_t remote_addr) { struct efa_io_tx_wqe *tx_wqe; tx_wqe = efa_send_wr_common(ibvqpx, EFA_IO_RDMA_READ); if (unlikely(!tx_wqe)) return; efa_send_wr_set_rdma_addr(tx_wqe, rkey, remote_addr); } static void efa_send_wr_rdma_write(struct ibv_qp_ex *ibvqpx, uint32_t rkey, uint64_t remote_addr) { struct efa_io_tx_wqe *tx_wqe; tx_wqe = efa_send_wr_common(ibvqpx, EFA_IO_RDMA_WRITE); if (unlikely(!tx_wqe)) return; efa_send_wr_set_rdma_addr(tx_wqe, rkey, remote_addr); } static void efa_send_wr_rdma_write_imm(struct ibv_qp_ex *ibvqpx, uint32_t rkey, uint64_t remote_addr, __be32 imm_data) { struct efa_io_tx_wqe *tx_wqe; tx_wqe = efa_send_wr_common(ibvqpx, EFA_IO_RDMA_WRITE); if (unlikely(!tx_wqe)) return; efa_send_wr_set_rdma_addr(tx_wqe, rkey, remote_addr); efa_send_wr_set_imm_data(tx_wqe, imm_data); } static void efa_send_wr_set_sge(struct ibv_qp_ex *ibvqpx, uint32_t lkey, uint64_t addr, uint32_t length) { struct efa_qp *qp = to_efa_qp_ex(ibvqpx); struct efa_io_tx_buf_desc *buf; struct 
efa_io_tx_wqe *tx_wqe; uint8_t op_type; if (unlikely(qp->wr_session_err)) return; tx_wqe = qp->sq.curr_tx_wqe; tx_wqe->meta.length = 1; op_type = EFA_GET(&tx_wqe->meta.ctrl1, EFA_IO_TX_META_DESC_OP_TYPE); switch (op_type) { case EFA_IO_SEND: buf = &tx_wqe->data.sgl[0]; break; case EFA_IO_RDMA_READ: case EFA_IO_RDMA_WRITE: tx_wqe->data.rdma_req.remote_mem.length = length; buf = &tx_wqe->data.rdma_req.local_mem[0]; break; default: return; } efa_set_tx_buf(buf, addr, lkey, length); } static void efa_send_wr_set_sge_list(struct ibv_qp_ex *ibvqpx, size_t num_sge, const struct ibv_sge *sg_list) { struct efa_qp *qp = to_efa_qp_ex(ibvqpx); struct efa_io_rdma_req *rdma_req; struct efa_io_tx_wqe *tx_wqe; struct efa_sq *sq = &qp->sq; uint8_t op_type; if (unlikely(qp->wr_session_err)) return; tx_wqe = sq->curr_tx_wqe; op_type = EFA_GET(&tx_wqe->meta.ctrl1, EFA_IO_TX_META_DESC_OP_TYPE); switch (op_type) { case EFA_IO_SEND: if (unlikely(num_sge > sq->wq.max_sge)) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "SQ[%u] num_sge[%zu] > max_sge[%u]\n", ibvqpx->qp_base.qp_num, num_sge, sq->wq.max_sge); qp->wr_session_err = EINVAL; return; } efa_post_send_sgl(tx_wqe->data.sgl, sg_list, num_sge); break; case EFA_IO_RDMA_READ: case EFA_IO_RDMA_WRITE: if (unlikely(num_sge > sq->max_wr_rdma_sge)) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "SQ[%u] num_sge[%zu] > max_rdma_sge[%zu]\n", ibvqpx->qp_base.qp_num, num_sge, sq->max_wr_rdma_sge); qp->wr_session_err = EINVAL; return; } rdma_req = &tx_wqe->data.rdma_req; rdma_req->remote_mem.length = efa_sge_total_bytes(sg_list, num_sge); efa_post_send_sgl(rdma_req->local_mem, sg_list, num_sge); break; default: return; } tx_wqe->meta.length = num_sge; } static void efa_send_wr_set_inline_data(struct ibv_qp_ex *ibvqpx, void *addr, size_t length) { struct efa_qp *qp = to_efa_qp_ex(ibvqpx); struct efa_io_tx_wqe *tx_wqe = qp->sq.curr_tx_wqe; if (unlikely(qp->wr_session_err)) return; if (unlikely(length > qp->sq.max_inline_data)) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "SQ[%u] WR inline length %zu > %zu\n", ibvqpx->qp_base.qp_num, length, qp->sq.max_inline_data); qp->wr_session_err = EINVAL; return; } EFA_SET(&tx_wqe->meta.ctrl1, EFA_IO_TX_META_DESC_INLINE_MSG, 1); memcpy(tx_wqe->data.inline_data, addr, length); tx_wqe->meta.length = length; } static void efa_send_wr_set_inline_data_list(struct ibv_qp_ex *ibvqpx, size_t num_buf, const struct ibv_data_buf *buf_list) { struct efa_qp *qp = to_efa_qp_ex(ibvqpx); struct efa_io_tx_wqe *tx_wqe = qp->sq.curr_tx_wqe; uint32_t total_length = 0; uint32_t length; size_t i; if (unlikely(qp->wr_session_err)) return; if (unlikely(efa_buf_list_total_bytes(buf_list, num_buf) > qp->sq.max_inline_data)) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "SQ[%u] WR inline length %zu > %zu\n", ibvqpx->qp_base.qp_num, efa_buf_list_total_bytes(buf_list, num_buf), qp->sq.max_inline_data); qp->wr_session_err = EINVAL; return; } for (i = 0; i < num_buf; i++) { length = buf_list[i].length; memcpy(tx_wqe->data.inline_data + total_length, buf_list[i].addr, length); total_length += length; } EFA_SET(&tx_wqe->meta.ctrl1, EFA_IO_TX_META_DESC_INLINE_MSG, 1); tx_wqe->meta.length = total_length; } static void efa_send_wr_set_addr(struct ibv_qp_ex *ibvqpx, struct ibv_ah *ibvah, uint32_t remote_qpn, uint32_t remote_qkey) { struct efa_qp *qp = to_efa_qp_ex(ibvqpx); struct efa_ah *ah = to_efa_ah(ibvah); struct efa_io_tx_wqe *tx_wqe; if (unlikely(qp->wr_session_err)) return; tx_wqe = qp->sq.curr_tx_wqe; tx_wqe->meta.dest_qp_num = remote_qpn; 
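/* ah->efa_ah is the device-side AH index returned by the kernel when the AH was created. */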
tx_wqe->meta.ah = ah->efa_ah; tx_wqe->meta.qkey = remote_qkey; rdma_tracepoint(rdma_core_efa, post_send, qp->dev->name, ibvqpx->wr_id, EFA_GET(&tx_wqe->meta.ctrl1, EFA_IO_TX_META_DESC_OP_TYPE), ibvqpx->qp_base.qp_num, remote_qpn, ah->efa_ah); } static void efa_send_wr_start(struct ibv_qp_ex *ibvqpx) { struct efa_qp *qp = to_efa_qp_ex(ibvqpx); struct efa_sq *sq = &qp->sq; mmio_wc_spinlock(&qp->sq.wq.wqlock); qp->wr_session_err = 0; sq->num_wqe_pending = 0; sq->phase_rb = qp->sq.wq.phase; } static inline void efa_sq_roll_back(struct efa_sq *sq) { struct efa_qp *qp = container_of(sq, struct efa_qp, sq); struct efa_wq *wq = &sq->wq; verbs_debug(verbs_get_ctx(qp->verbs_qp.qp.context), "SQ[%u] Rollback num_wqe_pending = %u\n", qp->verbs_qp.qp.qp_num, sq->num_wqe_pending); wq->wqe_posted -= sq->num_wqe_pending; wq->pc -= sq->num_wqe_pending; wq->wrid_idx_pool_next -= sq->num_wqe_pending; wq->phase = sq->phase_rb; } static int efa_send_wr_complete(struct ibv_qp_ex *ibvqpx) { struct efa_qp *qp = to_efa_qp_ex(ibvqpx); struct efa_sq *sq = &qp->sq; uint32_t max_txbatch = sq->max_batch_wr; uint32_t num_wqe_to_copy; uint16_t local_idx = 0; uint16_t curbatch = 0; uint16_t sq_desc_idx; uint16_t pc; if (unlikely(qp->wr_session_err)) { efa_sq_roll_back(sq); goto out; } /* * Copy local queue to device in chunks, handling wraparound and max * doorbell batch. */ pc = sq->wq.pc - sq->num_wqe_pending; sq_desc_idx = pc & sq->wq.desc_mask; /* mmio_wc_start() comes from efa_send_wr_start() */ while (sq->num_wqe_pending) { num_wqe_to_copy = min3(sq->num_wqe_pending, sq->wq.wqe_cnt - sq_desc_idx, max_txbatch - curbatch); mmio_memcpy_x64((struct efa_io_tx_wqe *)sq->desc + sq_desc_idx, (struct efa_io_tx_wqe *)sq->local_queue + local_idx, num_wqe_to_copy * sizeof(struct efa_io_tx_wqe)); sq->num_wqe_pending -= num_wqe_to_copy; local_idx += num_wqe_to_copy; curbatch += num_wqe_to_copy; pc += num_wqe_to_copy; sq_desc_idx = (sq_desc_idx + num_wqe_to_copy) & sq->wq.desc_mask; if (curbatch == max_txbatch) { mmio_flush_writes(); efa_sq_ring_doorbell(sq, pc); curbatch = 0; mmio_wc_start(); } } if (curbatch) { mmio_flush_writes(); efa_sq_ring_doorbell(sq, sq->wq.pc); } out: /* * Not using mmio_wc_spinunlock as the doorbell write should be done * inside the lock. 
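* Each doorbell above is already preceded by its own mmio_flush_writes().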
*/ pthread_spin_unlock(&sq->wq.wqlock); return qp->wr_session_err; } static void efa_send_wr_abort(struct ibv_qp_ex *ibvqpx) { struct efa_sq *sq = &to_efa_qp_ex(ibvqpx)->sq; efa_sq_roll_back(sq); pthread_spin_unlock(&sq->wq.wqlock); } static void efa_qp_fill_wr_pfns(struct ibv_qp_ex *ibvqpx, struct ibv_qp_init_attr_ex *attr_ex) { ibvqpx->wr_start = efa_send_wr_start; ibvqpx->wr_complete = efa_send_wr_complete; ibvqpx->wr_abort = efa_send_wr_abort; if (attr_ex->send_ops_flags & IBV_QP_EX_WITH_SEND) ibvqpx->wr_send = efa_send_wr_send; if (attr_ex->send_ops_flags & IBV_QP_EX_WITH_SEND_WITH_IMM) ibvqpx->wr_send_imm = efa_send_wr_send_imm; if (attr_ex->send_ops_flags & IBV_QP_EX_WITH_RDMA_READ) ibvqpx->wr_rdma_read = efa_send_wr_rdma_read; if (attr_ex->send_ops_flags & IBV_QP_EX_WITH_RDMA_WRITE) ibvqpx->wr_rdma_write = efa_send_wr_rdma_write; if (attr_ex->send_ops_flags & IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM) ibvqpx->wr_rdma_write_imm = efa_send_wr_rdma_write_imm; ibvqpx->wr_set_inline_data = efa_send_wr_set_inline_data; ibvqpx->wr_set_inline_data_list = efa_send_wr_set_inline_data_list; ibvqpx->wr_set_sge = efa_send_wr_set_sge; ibvqpx->wr_set_sge_list = efa_send_wr_set_sge_list; ibvqpx->wr_set_ud_addr = efa_send_wr_set_addr; } static int efa_post_recv_validate(struct efa_qp *qp, struct ibv_recv_wr *wr) { if (unlikely(qp->verbs_qp.qp.state == IBV_QPS_RESET || qp->verbs_qp.qp.state == IBV_QPS_ERR)) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "RQ[%u] Invalid QP state\n", qp->verbs_qp.qp.qp_num); return EINVAL; } if (unlikely(wr->num_sge > qp->rq.wq.max_sge)) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "RQ[%u] WR num_sge %d > %d\n", qp->verbs_qp.qp.qp_num, wr->num_sge, qp->rq.wq.max_sge); return EINVAL; } if (unlikely(qp->rq.wq.wqe_posted - qp->rq.wq.wqe_completed == qp->rq.wq.wqe_cnt)) { verbs_err(verbs_get_ctx(qp->verbs_qp.qp.context), "RQ[%u] is full wqe_posted[%u] wqe_completed[%u] wqe_cnt[%u]\n", qp->verbs_qp.qp.qp_num, qp->rq.wq.wqe_posted, qp->rq.wq.wqe_completed, qp->rq.wq.wqe_cnt); return ENOMEM; } return 0; } int efa_post_recv(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad) { struct efa_qp *qp = to_efa_qp(ibvqp); struct efa_wq *wq = &qp->rq.wq; struct efa_io_rx_desc rx_buf; uint32_t rq_desc_offset; uintptr_t addr; int err = 0; size_t i; pthread_spin_lock(&wq->wqlock); while (wr) { err = efa_post_recv_validate(qp, wr); if (err) { *bad = wr; goto ring_db; } memset(&rx_buf, 0, sizeof(rx_buf)); rx_buf.req_id = efa_wq_get_next_wrid_idx_locked(wq, wr->wr_id); wq->wqe_posted++; /* Default init of the rx buffer */ EFA_SET(&rx_buf.lkey_ctrl, EFA_IO_RX_DESC_FIRST, 1); EFA_SET(&rx_buf.lkey_ctrl, EFA_IO_RX_DESC_LAST, 0); for (i = 0; i < wr->num_sge; i++) { /* Set last indication if needed */ if (i == wr->num_sge - 1) EFA_SET(&rx_buf.lkey_ctrl, EFA_IO_RX_DESC_LAST, 1); addr = wr->sg_list[i].addr; /* Set RX buffer desc from SGE */ rx_buf.length = min_t(uint32_t, wr->sg_list[i].length, UINT16_MAX); EFA_SET(&rx_buf.lkey_ctrl, EFA_IO_RX_DESC_LKEY, wr->sg_list[i].lkey); rx_buf.buf_addr_lo = addr; rx_buf.buf_addr_hi = (uint64_t)addr >> 32; /* Copy descriptor to RX ring */ rq_desc_offset = (wq->pc & wq->desc_mask) * sizeof(rx_buf); memcpy(qp->rq.buf + rq_desc_offset, &rx_buf, sizeof(rx_buf)); /* Wrap rx descriptor index */ wq->pc++; if (!(wq->pc & wq->desc_mask)) wq->phase++; /* reset descriptor for next iov */ memset(&rx_buf, 0, sizeof(rx_buf)); } rdma_tracepoint(rdma_core_efa, post_recv, qp->dev->name, wr->wr_id, ibvqp->qp_num, wr->num_sge); wr = wr->next; } ring_db:
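/* A single doorbell write publishes every descriptor queued above. */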
efa_rq_ring_doorbell(&qp->rq, wq->pc); pthread_spin_unlock(&wq->wqlock); return err; } int efadv_query_ah(struct ibv_ah *ibvah, struct efadv_ah_attr *attr, uint32_t inlen) { uint64_t comp_mask_out = 0; if (!is_efa_dev(ibvah->context->device)) { verbs_err(verbs_get_ctx(ibvah->context), "Not an EFA device\n"); return EOPNOTSUPP; } if (!vext_field_avail(typeof(*attr), ahn, inlen)) { verbs_err(verbs_get_ctx(ibvah->context), "Compatibility issues\n"); return EINVAL; } memset(attr, 0, inlen); attr->ahn = to_efa_ah(ibvah)->efa_ah; attr->comp_mask = comp_mask_out; return 0; } struct ibv_ah *efa_create_ah(struct ibv_pd *ibvpd, struct ibv_ah_attr *attr) { struct efa_create_ah_resp resp = {}; struct efa_ah *ah; int err; ah = calloc(1, sizeof(*ah)); if (!ah) return NULL; err = ibv_cmd_create_ah(ibvpd, &ah->ibvah, attr, &resp.ibv_resp, sizeof(resp)); if (err) { verbs_err(verbs_get_ctx(ibvpd->context), "Failed to create AH\n"); free(ah); errno = err; return NULL; } ah->efa_ah = resp.efa_address_handle; return &ah->ibvah; } int efa_destroy_ah(struct ibv_ah *ibvah) { struct efa_ah *ah; int err; ah = to_efa_ah(ibvah); err = ibv_cmd_destroy_ah(ibvah); if (err) { verbs_err(verbs_get_ctx(ibvah->context), "Failed to destroy AH\n"); return err; } free(ah); return 0; } rdma-core-56.1/providers/efa/verbs.h000066400000000000000000000043131477342711600173750ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ /* * Copyright 2019-2023 Amazon.com, Inc. or its affiliates. All rights reserved. */ #ifndef __EFA_VERBS_H__ #define __EFA_VERBS_H__ #include #include int efa_query_device_ctx(struct efa_context *ctx); int efa_query_port(struct ibv_context *uctx, uint8_t port, struct ibv_port_attr *attr); int efa_query_device_ex(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); struct ibv_pd *efa_alloc_pd(struct ibv_context *uctx); int efa_dealloc_pd(struct ibv_pd *ibvpd); struct ibv_mr *efa_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int acc); struct ibv_mr *efa_reg_mr(struct ibv_pd *ibvpd, void *buf, size_t len, uint64_t hca_va, int ibv_access_flags); int efa_dereg_mr(struct verbs_mr *vmr); struct ibv_cq *efa_create_cq(struct ibv_context *uctx, int ncqe, struct ibv_comp_channel *ch, int vec); struct ibv_cq_ex *efa_create_cq_ex(struct ibv_context *uctx, struct ibv_cq_init_attr_ex *attr_ex); int efa_destroy_cq(struct ibv_cq *ibvcq); int efa_poll_cq(struct ibv_cq *ibvcq, int nwc, struct ibv_wc *wc); int efa_arm_cq(struct ibv_cq *ibvcq, int solicited_only); void efa_cq_event(struct ibv_cq *ibvcq); struct ibv_qp *efa_create_qp(struct ibv_pd *ibvpd, struct ibv_qp_init_attr *attr); struct ibv_qp *efa_create_qp_ex(struct ibv_context *ibvctx, struct ibv_qp_init_attr_ex *attr_ex); int efa_modify_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr, int ibv_qp_attr_mask); int efa_query_qp(struct ibv_qp *ibvqp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); int efa_query_qp_data_in_order(struct ibv_qp *ibvqp, enum ibv_wr_opcode op, uint32_t flags); int efa_destroy_qp(struct ibv_qp *ibvqp); int efa_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad); int efa_post_recv(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad); struct ibv_ah *efa_create_ah(struct ibv_pd *ibvpd, struct ibv_ah_attr *attr); int efa_destroy_ah(struct ibv_ah *ibvah); #endif /* __EFA_VERBS_H__ */ 
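/* A rough sketch of the usual data-path call order through these verbs (orientation only, not an API contract): efa_alloc_pd() -> efa_reg_mr() -> efa_create_cq() -> efa_create_qp() -> efa_create_ah() -> efa_post_send()/efa_post_recv() -> efa_poll_cq(). */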
rdma-core-56.1/providers/erdma/000077500000000000000000000000001477342711600164375ustar00rootroot00000000000000rdma-core-56.1/providers/erdma/CMakeLists.txt000066400000000000000000000000751477342711600212010ustar00rootroot00000000000000rdma_provider(erdma erdma.c erdma_db.c erdma_verbs.c ) rdma-core-56.1/providers/erdma/erdma.c000066400000000000000000000064771477342711600177110ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 or OpenIB.org BSD (MIT) See COPYING file // Authors: Cheng Xu // Copyright (c) 2020-2021, Alibaba Group. #include #include #include #include #include #include #include #include #include "erdma.h" #include "erdma_abi.h" #include "erdma_hw.h" #include "erdma_verbs.h" static const struct verbs_context_ops erdma_context_ops = { .alloc_pd = erdma_alloc_pd, .cq_event = erdma_cq_event, .create_cq = erdma_create_cq, .create_qp = erdma_create_qp, .dealloc_pd = erdma_free_pd, .dereg_mr = erdma_dereg_mr, .destroy_cq = erdma_destroy_cq, .destroy_qp = erdma_destroy_qp, .free_context = erdma_free_context, .modify_qp = erdma_modify_qp, .poll_cq = erdma_poll_cq, .post_recv = erdma_post_recv, .post_send = erdma_post_send, .query_device_ex = erdma_query_device, .query_port = erdma_query_port, .query_qp = erdma_query_qp, .reg_mr = erdma_reg_mr, .req_notify_cq = erdma_notify_cq, }; static struct verbs_context *erdma_alloc_context(struct ibv_device *device, int cmd_fd, void *private_data) { struct erdma_cmd_alloc_context_resp resp = {}; struct ibv_get_context cmd = {}; struct erdma_context *ctx; int i; ctx = verbs_init_and_alloc_context(device, cmd_fd, ctx, ibv_ctx, RDMA_DRIVER_ERDMA); if (!ctx) return NULL; pthread_mutex_init(&ctx->qp_table_mutex, NULL); for (i = 0; i < ERDMA_QP_TABLE_SIZE; ++i) ctx->qp_table[i].refcnt = 0; if (ibv_cmd_get_context(&ctx->ibv_ctx, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) goto err_out; verbs_set_ops(&ctx->ibv_ctx, &erdma_context_ops); ctx->dev_id = resp.dev_id; ctx->sdb_type = resp.sdb_type; ctx->sdb_offset = resp.sdb_offset; ctx->sdb = mmap(NULL, ERDMA_PAGE_SIZE, PROT_WRITE, MAP_SHARED, cmd_fd, resp.sdb); if (ctx->sdb == MAP_FAILED) goto err_out; ctx->rdb = mmap(NULL, ERDMA_PAGE_SIZE, PROT_WRITE, MAP_SHARED, cmd_fd, resp.rdb); if (ctx->rdb == MAP_FAILED) goto err_rdb_map; ctx->cdb = mmap(NULL, ERDMA_PAGE_SIZE, PROT_WRITE, MAP_SHARED, cmd_fd, resp.cdb); if (ctx->cdb == MAP_FAILED) goto err_cdb_map; ctx->page_size = ERDMA_PAGE_SIZE; list_head_init(&ctx->dbrecord_pages_list); pthread_mutex_init(&ctx->dbrecord_pages_mutex, NULL); return &ctx->ibv_ctx; err_cdb_map: munmap(ctx->rdb, ERDMA_PAGE_SIZE); err_rdb_map: munmap(ctx->sdb, ERDMA_PAGE_SIZE); err_out: verbs_uninit_context(&ctx->ibv_ctx); free(ctx); return NULL; } static struct verbs_device * erdma_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct erdma_device *dev; dev = calloc(1, sizeof(*dev)); if (!dev) return NULL; return &dev->ibv_dev; } static void erdma_device_free(struct verbs_device *vdev) { struct erdma_device *dev = container_of(vdev, struct erdma_device, ibv_dev); free(dev); } static const struct verbs_match_ent match_table[] = { VERBS_DRIVER_ID(RDMA_DRIVER_ERDMA), VERBS_PCI_MATCH(PCI_VENDOR_ID_ALIBABA, 0x107f, NULL), {}, }; static const struct verbs_device_ops erdma_dev_ops = { .name = "erdma", .match_min_abi_version = 0, .match_max_abi_version = ERDMA_ABI_VERSION, .match_table = match_table, .alloc_device = erdma_device_alloc, .uninit_device = erdma_device_free, .alloc_context = erdma_alloc_context, }; PROVIDER_DRIVER(erdma, erdma_dev_ops); 
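/* PROVIDER_DRIVER() registers erdma_dev_ops with libibverbs at load time; the match_table above determines which devices this provider binds to. */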
rdma-core-56.1/providers/erdma/erdma.h000066400000000000000000000022061477342711600177000ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or OpenIB.org BSD (MIT) See COPYING file */ /* * Authors: Cheng Xu * Copyright (c) 2020-2021, Alibaba Group. */ #ifndef __ERDMA_H__ #define __ERDMA_H__ #include #include #include #include #include #include #ifndef PCI_VENDOR_ID_ALIBABA #define PCI_VENDOR_ID_ALIBABA 0x1ded #endif #define ERDMA_PAGE_SIZE 4096 struct erdma_device { struct verbs_device ibv_dev; }; #define ERDMA_QP_TABLE_SIZE 4096 #define ERDMA_QP_TABLE_SHIFT 12 #define ERDMA_QP_TABLE_MASK 0xFFF struct erdma_context { struct verbs_context ibv_ctx; uint32_t dev_id; struct { struct erdma_qp **table; int refcnt; } qp_table[ERDMA_QP_TABLE_SIZE]; pthread_mutex_t qp_table_mutex; uint8_t sdb_type; uint32_t sdb_offset; void *sdb; void *rdb; void *cdb; uint32_t page_size; pthread_mutex_t dbrecord_pages_mutex; struct list_head dbrecord_pages_list; }; static inline struct erdma_context *to_ectx(struct ibv_context *base) { return container_of(base, struct erdma_context, ibv_ctx.context); } #endif rdma-core-56.1/providers/erdma/erdma_abi.h000066400000000000000000000012331477342711600205120ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or OpenIB.org BSD (MIT) See COPYING file */ /* * Authors: Cheng Xu * Copyright (c) 2020-2021, Alibaba Group. */ #ifndef __ERDMA_ABI_H__ #define __ERDMA_ABI_H__ #include #include #include DECLARE_DRV_CMD(erdma_cmd_alloc_context, IB_USER_VERBS_CMD_GET_CONTEXT, empty, erdma_uresp_alloc_ctx); DECLARE_DRV_CMD(erdma_cmd_create_cq, IB_USER_VERBS_CMD_CREATE_CQ, erdma_ureq_create_cq, erdma_uresp_create_cq); DECLARE_DRV_CMD(erdma_cmd_create_qp, IB_USER_VERBS_CMD_CREATE_QP, erdma_ureq_create_qp, erdma_uresp_create_qp); #endif rdma-core-56.1/providers/erdma/erdma_db.c000066400000000000000000000044071477342711600203450ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 or OpenIB.org BSD (MIT) See COPYING file // Authors: Cheng Xu // Copyright (c) 2020-2021, Alibaba Group. 
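/*
 * Doorbell records are 16-byte slots carved out of page-sized chunks: each
 * erdma_dbrecord_page tracks its free slots with a bitmap, pages sit on a
 * per-context list, and a page is released once its last record is freed.
 */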
#include #include #include #include #include #include "erdma.h" #include "erdma_db.h" #define ERDMA_DBREC_SIZE 16 struct erdma_dbrecord_page { struct list_node list; void *page_buf; uint32_t cnt; uint32_t used; unsigned long *bitmap; }; uint64_t *erdma_alloc_dbrecords(struct erdma_context *ctx) { struct erdma_dbrecord_page *page = NULL; uint32_t free_idx, dbrecords_per_page; uint64_t *db_records = NULL; int rv; pthread_mutex_lock(&ctx->dbrecord_pages_mutex); list_for_each(&ctx->dbrecord_pages_list, page, list) if (page->used < page->cnt) goto found; dbrecords_per_page = ctx->page_size / ERDMA_DBREC_SIZE; page = calloc(1, sizeof(*page)); if (!page) goto err_out; page->bitmap = bitmap_alloc1(dbrecords_per_page); if (!page->bitmap) goto err_bitmap; rv = posix_memalign(&page->page_buf, ctx->page_size, ctx->page_size); if (rv) goto err_alloc; page->cnt = dbrecords_per_page; page->used = 0; list_node_init(&page->list); list_add_tail(&ctx->dbrecord_pages_list, &page->list); found: ++page->used; free_idx = bitmap_find_first_bit(page->bitmap, 0, page->cnt); bitmap_clear_bit(page->bitmap, free_idx); db_records = page->page_buf + free_idx * ERDMA_DBREC_SIZE; pthread_mutex_unlock(&ctx->dbrecord_pages_mutex); return db_records; err_alloc: free(page->bitmap); err_bitmap: free(page); err_out: pthread_mutex_unlock(&ctx->dbrecord_pages_mutex); return NULL; } void erdma_dealloc_dbrecords(struct erdma_context *ctx, uint64_t *dbrec) { uint32_t page_mask = ~(ctx->page_size - 1); struct erdma_dbrecord_page *page; uint32_t idx; pthread_mutex_lock(&ctx->dbrecord_pages_mutex); list_for_each(&ctx->dbrecord_pages_list, page, list) if (((uintptr_t)dbrec & page_mask) == (uintptr_t)page->page_buf) goto found; goto out; found: idx = ((uintptr_t)dbrec - (uintptr_t)page->page_buf) / ERDMA_DBREC_SIZE; bitmap_set_bit(page->bitmap, idx); page->used--; if (!page->used) { list_del(&page->list); free(page->bitmap); free(page); } out: pthread_mutex_unlock(&ctx->dbrecord_pages_mutex); } rdma-core-56.1/providers/erdma/erdma_db.h000066400000000000000000000006411477342711600203460ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or OpenIB.org BSD (MIT) See COPYING file */ /* * Authors: Cheng Xu * Copyright (c) 2020-2021, Alibaba Group. */ #ifndef __ERDMA_DB_H__ #define __ERDMA_DB_H__ #include #include "erdma.h" uint64_t *erdma_alloc_dbrecords(struct erdma_context *ctx); void erdma_dealloc_dbrecords(struct erdma_context *ctx, uint64_t *dbrecords); #endif rdma-core-56.1/providers/erdma/erdma_hw.h000066400000000000000000000117661477342711600204110ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or OpenIB.org BSD (MIT) See COPYING file */ /* * Authors: Cheng Xu * Copyright (c) 2020-2021, Alibaba Group. */ #ifndef __ERDMA_HW_H__ #define __ERDMA_HW_H__ #include #define ERDMA_SDB_PAGE 0 #define ERDMA_SDB_ENTRY 1 #define ERDMA_SDB_SHARED 2 #define ERDMA_NSDB_PER_ENTRY 2 #define ERDMA_SDB_ALLOC_QPN_MASK 0x1f #define ERDMA_RDB_ALLOC_QPN_MASK 0x7f #define ERDMA_SQDB_SIZE 128 #define ERDMA_CQDB_SIZE 8 #define ERDMA_RQDB_SIZE 8 #define ERDMA_RQDB_SPACE_SIZE 32 /* WQE related. 
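 *
 * A send WQE is built from 32-byte basic blocks (SQEBBs): SQEBB_COUNT()
 * rounds a WQE byte size up to whole blocks, and one SQE may span at most
 * MAX_WQEBB_PER_SQE of them.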
*/ #define EQE_SIZE 16 #define EQE_SHIFT 4 #define RQE_SIZE 32 #define RQE_SHIFT 5 #define CQE_SIZE 32 #define CQE_SHIFT 5 #define SQEBB_SIZE 32 #define SQEBB_SHIFT 5 #define SQEBB_MASK (~(SQEBB_SIZE - 1)) #define SQEBB_ALIGN(size) ((size + SQEBB_SIZE - 1) & SQEBB_MASK) #define SQEBB_COUNT(size) (SQEBB_ALIGN(size) >> SQEBB_SHIFT) #define MAX_WQEBB_PER_SQE 4 enum erdma_opcode { ERDMA_OP_WRITE = 0, ERDMA_OP_READ = 1, ERDMA_OP_SEND = 2, ERDMA_OP_SEND_WITH_IMM = 3, ERDMA_OP_RECEIVE = 4, ERDMA_OP_RECV_IMM = 5, ERDMA_OP_RECV_INV = 6, ERDMA_OP_REQ_ERR = 7, ERDMA_OP_READ_RESPONSE = 8, ERDMA_OP_WRITE_WITH_IMM = 9, ERDMA_OP_RECV_ERR = 10, ERDMA_OP_INVALIDATE = 11, ERDMA_OP_RSP_SEND_IMM = 12, ERDMA_OP_SEND_WITH_INV = 13, ERDMA_OP_REG_MR = 14, ERDMA_OP_LOCAL_INV = 15, ERDMA_OP_READ_WITH_INV = 16, ERDMA_OP_ATOMIC_CAS = 17, ERDMA_OP_ATOMIC_FAD = 18, ERDMA_NUM_OPCODES = 19, ERDMA_OP_INVALID = ERDMA_NUM_OPCODES + 1 }; /* * Inline data are kept within the work request itself, occupying * the space of sge[1] .. sge[n]. Therefore, inline data cannot be * supported if ERDMA_MAX_SGE is below 2 elements. */ #define ERDMA_MAX_INLINE (sizeof(struct erdma_sge) * (ERDMA_MAX_SEND_SGE)) enum erdma_wc_status { ERDMA_WC_SUCCESS = 0, ERDMA_WC_GENERAL_ERR = 1, ERDMA_WC_RECV_WQE_FORMAT_ERR = 2, ERDMA_WC_RECV_STAG_INVALID_ERR = 3, ERDMA_WC_RECV_ADDR_VIOLATION_ERR = 4, ERDMA_WC_RECV_RIGHT_VIOLATION_ERR = 5, ERDMA_WC_RECV_PDID_ERR = 6, ERDMA_WC_RECV_WARRPING_ERR = 7, ERDMA_WC_SEND_WQE_FORMAT_ERR = 8, ERDMA_WC_SEND_WQE_ORD_EXCEED = 9, ERDMA_WC_SEND_STAG_INVALID_ERR = 10, ERDMA_WC_SEND_ADDR_VIOLATION_ERR = 11, ERDMA_WC_SEND_RIGHT_VIOLATION_ERR = 12, ERDMA_WC_SEND_PDID_ERR = 13, ERDMA_WC_SEND_WARRPING_ERR = 14, ERDMA_WC_FLUSH_ERR = 15, ERDMA_WC_RETRY_EXC_ERR = 16, ERDMA_NUM_WC_STATUS }; enum erdma_vendor_err { ERDMA_WC_VENDOR_NO_ERR = 0, ERDMA_WC_VENDOR_INVALID_RQE = 1, ERDMA_WC_VENDOR_RQE_INVALID_STAG = 2, ERDMA_WC_VENDOR_RQE_ADDR_VIOLATION = 3, ERDMA_WC_VENDOR_RQE_ACCESS_RIGHT_ERR = 4, ERDMA_WC_VENDOR_RQE_INVALID_PD = 5, ERDMA_WC_VENDOR_RQE_WRAP_ERR = 6, ERDMA_WC_VENDOR_INVALID_SQE = 0x20, ERDMA_WC_VENDOR_ZERO_ORD = 0x21, ERDMA_WC_VENDOR_SQE_INVALID_STAG = 0x30, ERDMA_WC_VENDOR_SQE_ADDR_VIOLATION = 0x31, ERDMA_WC_VENDOR_SQE_ACCESS_ERR = 0x32, ERDMA_WC_VENDOR_SQE_INVALID_PD = 0x33, ERDMA_WC_VENDOR_SQE_WARP_ERR = 0x34 }; /* Doorbell related.
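 *
 * A CQ doorbell is a single 64-bit write that packs the CQ number,
 * arm/solicited flags, command sequence number and consumer index using
 * the ERDMA_CQDB_* field masks below.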
*/ #define ERDMA_CQDB_IDX_MASK GENMASK_ULL(63, 56) #define ERDMA_CQDB_CQN_MASK GENMASK_ULL(55, 32) #define ERDMA_CQDB_ARM_MASK BIT_ULL(31) #define ERDMA_CQDB_SOL_MASK BIT_ULL(30) #define ERDMA_CQDB_CMDSN_MASK GENMASK_ULL(29, 28) #define ERDMA_CQDB_CI_MASK GENMASK_ULL(23, 0) #define ERDMA_CQE_QTYPE_SQ 0 #define ERDMA_CQE_QTYPE_RQ 1 #define ERDMA_CQE_QTYPE_CMDQ 2 /* CQE hdr */ #define ERDMA_CQE_HDR_OWNER_MASK BIT(31) #define ERDMA_CQE_HDR_OPCODE_MASK GENMASK(23, 16) #define ERDMA_CQE_HDR_QTYPE_MASK GENMASK(15, 8) #define ERDMA_CQE_HDR_SYNDROME_MASK GENMASK(7, 0) struct erdma_cqe { __be32 hdr; __be32 qe_idx; __be32 qpn; __le32 imm_data; __be32 size; __be32 rsvd[3]; }; struct erdma_sge { __aligned_le64 addr; __le32 length; __le32 key; }; /* Receive Queue Element */ struct erdma_rqe { __le16 qe_idx; __le16 rsvd; __le32 qpn; __le32 rsvd2; __le32 rsvd3; __le64 to; __le32 length; __le32 stag; }; /* SQE */ #define ERDMA_SQE_HDR_SGL_LEN_MASK GENMASK_ULL(63, 56) #define ERDMA_SQE_HDR_WQEBB_CNT_MASK GENMASK_ULL(54, 52) #define ERDMA_SQE_HDR_QPN_MASK GENMASK_ULL(51, 32) #define ERDMA_SQE_HDR_OPCODE_MASK GENMASK_ULL(31, 27) #define ERDMA_SQE_HDR_DWQE_MASK BIT_ULL(26) #define ERDMA_SQE_HDR_INLINE_MASK BIT_ULL(25) #define ERDMA_SQE_HDR_FENCE_MASK BIT_ULL(24) #define ERDMA_SQE_HDR_SE_MASK BIT_ULL(23) #define ERDMA_SQE_HDR_CE_MASK BIT_ULL(22) #define ERDMA_SQE_HDR_WQEBB_INDEX_MASK GENMASK_ULL(15, 0) struct erdma_write_sqe { __le64 hdr; __be32 imm_data; __le32 length; __le32 sink_stag; /* avoid sink_to not 8-byte aligned. */ __le32 sink_to_low; __le32 sink_to_high; __le32 rsvd; struct erdma_sge sgl[]; }; struct erdma_send_sqe { __le64 hdr; __be32 imm_data; __le32 length; struct erdma_sge sgl[]; }; struct erdma_readreq_sqe { __le64 hdr; __le32 invalid_stag; __le32 length; __le32 sink_stag; /* avoid sink_to not 8-byte aligned. */ __le32 sink_to_low; __le32 sink_to_high; __le32 rsvd; struct erdma_sge sgl; }; struct erdma_atomic_sqe { __le64 hdr; __le64 rsvd; __le64 fetchadd_swap_data; __le64 cmp_data; struct erdma_sge remote; struct erdma_sge sgl; }; #endif rdma-core-56.1/providers/erdma/erdma_verbs.c000066400000000000000000000603701477342711600211020ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 or BSD-3-Clause // Authors: Cheng Xu // Copyright (c) 2020-2021, Alibaba Group. 
// Authors: Bernard Metzler // Copyright (c) 2008-2019, IBM Corporation #include #include #include #include #include #include #include #include #include #include #include "erdma.h" #include "erdma_abi.h" #include "erdma_db.h" #include "erdma_hw.h" #include "erdma_verbs.h" int erdma_query_device(struct ibv_context *ctx, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct ib_uverbs_ex_query_device_resp resp; unsigned int major, minor, sub_minor; size_t resp_size = sizeof(resp); uint64_t raw_fw_ver; int rv; rv = ibv_cmd_query_device_any(ctx, input, attr, attr_size, &resp, &resp_size); if (rv) return rv; raw_fw_ver = resp.base.fw_ver; major = (raw_fw_ver >> 32) & 0xffff; minor = (raw_fw_ver >> 16) & 0xffff; sub_minor = raw_fw_ver & 0xffff; snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver), "%d.%d.%d", major, minor, sub_minor); return 0; } int erdma_query_port(struct ibv_context *ctx, uint8_t port, struct ibv_port_attr *attr) { struct ibv_query_port cmd = {}; return ibv_cmd_query_port(ctx, port, attr, &cmd, sizeof(cmd)); } int erdma_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd = {}; return ibv_cmd_query_qp(qp, attr, attr_mask, init_attr, &cmd, sizeof(cmd)); } struct ibv_pd *erdma_alloc_pd(struct ibv_context *ctx) { struct ib_uverbs_alloc_pd_resp resp; struct ibv_alloc_pd cmd = {}; struct ibv_pd *pd; pd = calloc(1, sizeof(*pd)); if (!pd) return NULL; if (ibv_cmd_alloc_pd(ctx, pd, &cmd, sizeof(cmd), &resp, sizeof(resp))) { free(pd); return NULL; } return pd; } int erdma_free_pd(struct ibv_pd *pd) { int rv; rv = ibv_cmd_dealloc_pd(pd); if (rv) return rv; free(pd); return 0; } struct ibv_mr *erdma_reg_mr(struct ibv_pd *pd, void *addr, size_t len, uint64_t hca_va, int access) { struct ib_uverbs_reg_mr_resp resp; struct ibv_reg_mr cmd; struct verbs_mr *vmr; int ret; vmr = calloc(1, sizeof(*vmr)); if (!vmr) return NULL; ret = ibv_cmd_reg_mr(pd, addr, len, hca_va, access, vmr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) { free(vmr); return NULL; } return &vmr->ibv_mr; } int erdma_dereg_mr(struct verbs_mr *vmr) { int ret; ret = ibv_cmd_dereg_mr(vmr); if (ret) return ret; free(vmr); return 0; } int erdma_notify_cq(struct ibv_cq *ibcq, int solicited) { struct erdma_cq *cq = to_ecq(ibcq); uint64_t db_data; int ret; ret = pthread_spin_lock(&cq->lock); if (ret) return ret; db_data = FIELD_PREP(ERDMA_CQDB_IDX_MASK, cq->db_index) | FIELD_PREP(ERDMA_CQDB_CQN_MASK, cq->id) | FIELD_PREP(ERDMA_CQDB_ARM_MASK, 1) | FIELD_PREP(ERDMA_CQDB_SOL_MASK, solicited) | FIELD_PREP(ERDMA_CQDB_CMDSN_MASK, cq->cmdsn) | FIELD_PREP(ERDMA_CQDB_CI_MASK, cq->ci); *(__le64 *)cq->db_record = htole64(db_data); cq->db_index++; udma_to_device_barrier(); mmio_write64_le(cq->db, htole64(db_data)); pthread_spin_unlock(&cq->lock); return ret; } struct ibv_cq *erdma_create_cq(struct ibv_context *ctx, int num_cqe, struct ibv_comp_channel *channel, int comp_vector) { struct erdma_context *ectx = to_ectx(ctx); struct erdma_cmd_create_cq_resp resp = {}; struct erdma_cmd_create_cq cmd = {}; uint64_t *db_records = NULL; struct erdma_cq *cq; size_t cq_size; int rv; cq = calloc(1, sizeof(*cq)); if (!cq) return NULL; if (num_cqe < 64) num_cqe = 64; num_cqe = roundup_pow_of_two(num_cqe); cq_size = align(num_cqe * sizeof(struct erdma_cqe), ERDMA_PAGE_SIZE); rv = posix_memalign((void **)&cq->queue, ERDMA_PAGE_SIZE, cq_size); if (rv) { errno = rv; free(cq); return NULL; } rv = 
ibv_dontfork_range(cq->queue, cq_size); if (rv) { free(cq->queue); cq->queue = NULL; goto error_alloc; } memset(cq->queue, 0, cq_size); db_records = erdma_alloc_dbrecords(ectx); if (!db_records) { errno = ENOMEM; goto error_alloc; } cmd.db_record_va = (uintptr_t)db_records; cmd.qbuf_va = (uintptr_t)cq->queue; cmd.qbuf_len = cq_size; rv = ibv_cmd_create_cq(ctx, num_cqe, channel, comp_vector, &cq->base_cq, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (rv) { errno = EIO; goto error_alloc; } pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE); *db_records = 0; cq->db_record = db_records; cq->id = resp.cq_id; cq->depth = resp.num_cqe; cq->db = ectx->cdb; cq->db_offset = (cq->id & (ERDMA_PAGE_SIZE / ERDMA_CQDB_SIZE - 1)) * ERDMA_CQDB_SIZE; cq->db += cq->db_offset; cq->comp_vector = comp_vector; return &cq->base_cq; error_alloc: if (db_records) erdma_dealloc_dbrecords(ectx, db_records); if (cq->queue) { ibv_dofork_range(cq->queue, cq_size); free(cq->queue); } free(cq); return NULL; } int erdma_destroy_cq(struct ibv_cq *base_cq) { struct erdma_context *ctx = to_ectx(base_cq->context); struct erdma_cq *cq = to_ecq(base_cq); int rv; pthread_spin_lock(&cq->lock); rv = ibv_cmd_destroy_cq(base_cq); if (rv) { pthread_spin_unlock(&cq->lock); errno = EIO; return rv; } pthread_spin_destroy(&cq->lock); if (cq->db_record) erdma_dealloc_dbrecords(ctx, cq->db_record); if (cq->queue) { ibv_dofork_range(cq->queue, cq->depth << CQE_SHIFT); free(cq->queue); } free(cq); return 0; } static void __erdma_alloc_dbs(struct erdma_qp *qp, struct erdma_context *ctx) { uint32_t qpn = qp->id; uint32_t db_offset; if (ctx->sdb_type == ERDMA_SDB_ENTRY) db_offset = ctx->sdb_offset * ERDMA_NSDB_PER_ENTRY * ERDMA_SQDB_SIZE; else db_offset = (qpn & ERDMA_SDB_ALLOC_QPN_MASK) * ERDMA_SQDB_SIZE; qp->sq.db = ctx->sdb + db_offset; /* qpn[6:0] as the index in this rq db page. 
*/ qp->rq.db = ctx->rdb + (qpn & ERDMA_RDB_ALLOC_QPN_MASK) * ERDMA_RQDB_SPACE_SIZE; } static int erdma_store_qp(struct erdma_context *ctx, struct erdma_qp *qp) { uint32_t tbl_idx, tbl_off; int rv = 0; pthread_mutex_lock(&ctx->qp_table_mutex); tbl_idx = qp->id >> ERDMA_QP_TABLE_SHIFT; tbl_off = qp->id & ERDMA_QP_TABLE_MASK; if (ctx->qp_table[tbl_idx].refcnt == 0) { ctx->qp_table[tbl_idx].table = calloc(ERDMA_QP_TABLE_SIZE, sizeof(struct erdma_qp *)); if (!ctx->qp_table[tbl_idx].table) { rv = -ENOMEM; goto out; } } /* exist qp */ if (ctx->qp_table[tbl_idx].table[tbl_off]) { rv = -EBUSY; goto out; } ctx->qp_table[tbl_idx].table[tbl_off] = qp; ctx->qp_table[tbl_idx].refcnt++; out: pthread_mutex_unlock(&ctx->qp_table_mutex); return rv; } static void erdma_clear_qp(struct erdma_context *ctx, struct erdma_qp *qp) { uint32_t tbl_idx, tbl_off; pthread_mutex_lock(&ctx->qp_table_mutex); tbl_idx = qp->id >> ERDMA_QP_TABLE_SHIFT; tbl_off = qp->id & ERDMA_QP_TABLE_MASK; ctx->qp_table[tbl_idx].table[tbl_off] = NULL; ctx->qp_table[tbl_idx].refcnt--; if (ctx->qp_table[tbl_idx].refcnt == 0) { free(ctx->qp_table[tbl_idx].table); ctx->qp_table[tbl_idx].table = NULL; } pthread_mutex_unlock(&ctx->qp_table_mutex); } static int erdma_alloc_qp_buf_and_db(struct erdma_context *ctx, struct erdma_qp *qp, struct ibv_qp_init_attr *attr) { size_t queue_size; uint32_t nwqebb; int rv; nwqebb = roundup_pow_of_two(attr->cap.max_send_wr * MAX_WQEBB_PER_SQE); queue_size = align(nwqebb << SQEBB_SHIFT, ctx->page_size); nwqebb = roundup_pow_of_two(attr->cap.max_recv_wr); queue_size += align(nwqebb << RQE_SHIFT, ctx->page_size); qp->qbuf_size = queue_size; rv = posix_memalign(&qp->qbuf, ctx->page_size, queue_size); if (rv) { errno = ENOMEM; return -1; } rv = ibv_dontfork_range(qp->qbuf, queue_size); if (rv) { errno = rv; goto err_dontfork; } /* doorbell record allocation. 
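 *
 * Two adjacent 64-bit records come from a shared page: db_records[0]
 * shadows the SQ doorbell and db_records[1] the RQ doorbell.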
*/ qp->db_records = erdma_alloc_dbrecords(ctx); if (!qp->db_records) { errno = ENOMEM; goto err_dbrec; } *qp->db_records = 0; *(qp->db_records + 1) = 0; qp->sq.db_record = qp->db_records; qp->rq.db_record = qp->db_records + 1; pthread_spin_init(&qp->sq_lock, PTHREAD_PROCESS_PRIVATE); pthread_spin_init(&qp->rq_lock, PTHREAD_PROCESS_PRIVATE); return 0; err_dbrec: ibv_dofork_range(qp->qbuf, queue_size); err_dontfork: free(qp->qbuf); return -1; } static void erdma_free_qp_buf_and_db(struct erdma_context *ctx, struct erdma_qp *qp) { pthread_spin_destroy(&qp->sq_lock); pthread_spin_destroy(&qp->rq_lock); if (qp->db_records) erdma_dealloc_dbrecords(ctx, qp->db_records); ibv_dofork_range(qp->qbuf, qp->qbuf_size); free(qp->qbuf); } static int erdma_alloc_wrid_tbl(struct erdma_qp *qp) { qp->rq.wr_tbl = calloc(qp->rq.depth, sizeof(uint64_t)); if (!qp->rq.wr_tbl) return -ENOMEM; qp->sq.wr_tbl = calloc(qp->sq.depth, sizeof(uint64_t)); if (!qp->sq.wr_tbl) { free(qp->rq.wr_tbl); return -ENOMEM; } return 0; } static void erdma_free_wrid_tbl(struct erdma_qp *qp) { free(qp->sq.wr_tbl); free(qp->rq.wr_tbl); } struct ibv_qp *erdma_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct erdma_context *ctx = to_ectx(pd->context); struct erdma_cmd_create_qp_resp resp = {}; struct erdma_cmd_create_qp cmd = {}; struct erdma_qp *qp; int rv; qp = calloc(1, sizeof(*qp)); if (!qp) return NULL; rv = erdma_alloc_qp_buf_and_db(ctx, qp, attr); if (rv) goto err; cmd.db_record_va = (uintptr_t)qp->db_records; cmd.qbuf_va = (uintptr_t)qp->qbuf; cmd.qbuf_len = (__u32)qp->qbuf_size; rv = ibv_cmd_create_qp(pd, &qp->base_qp, attr, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (rv) goto err_cmd; qp->id = resp.qp_id; qp->sq.qbuf = qp->qbuf; qp->rq.qbuf = qp->qbuf + resp.rq_offset; qp->sq.depth = resp.num_sqe; qp->rq.depth = resp.num_rqe; qp->sq_sig_all = attr->sq_sig_all; qp->sq.size = resp.num_sqe * SQEBB_SIZE; qp->rq.size = resp.num_rqe * sizeof(struct erdma_rqe); /* doorbell allocation. 
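 *
 * __erdma_alloc_dbs() points sq.db and rq.db at per-QP slots inside the
 * mmap'ed sdb/rdb pages, selected by the QP number.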
*/ __erdma_alloc_dbs(qp, ctx); rv = erdma_alloc_wrid_tbl(qp); if (rv) goto err_wrid_tbl; rv = erdma_store_qp(ctx, qp); if (rv) { errno = -rv; goto err_store; } return &qp->base_qp; err_store: erdma_free_wrid_tbl(qp); err_wrid_tbl: ibv_cmd_destroy_qp(&qp->base_qp); err_cmd: erdma_free_qp_buf_and_db(ctx, qp); err: free(qp); return NULL; } int erdma_modify_qp(struct ibv_qp *base_qp, struct ibv_qp_attr *attr, int attr_mask) { struct erdma_qp *qp = to_eqp(base_qp); struct ibv_modify_qp cmd = {}; int rv; pthread_spin_lock(&qp->sq_lock); pthread_spin_lock(&qp->rq_lock); rv = ibv_cmd_modify_qp(base_qp, attr, attr_mask, &cmd, sizeof(cmd)); pthread_spin_unlock(&qp->rq_lock); pthread_spin_unlock(&qp->sq_lock); return rv; } int erdma_destroy_qp(struct ibv_qp *base_qp) { struct ibv_context *base_ctx = base_qp->pd->context; struct erdma_context *ctx = to_ectx(base_ctx); struct erdma_qp *qp = to_eqp(base_qp); int rv; erdma_clear_qp(ctx, qp); rv = ibv_cmd_destroy_qp(base_qp); if (rv) return rv; erdma_free_wrid_tbl(qp); erdma_free_qp_buf_and_db(ctx, qp); free(qp); return 0; } static int erdma_push_one_sqe(struct erdma_qp *qp, struct ibv_send_wr *wr, uint16_t *sq_pi) { uint32_t i, bytes, sgl_off, sgl_idx, wqebb_cnt, opcode, wqe_size = 0; struct erdma_atomic_sqe *atomic_sqe; struct erdma_readreq_sqe *read_sqe; struct erdma_write_sqe *write_sqe; struct erdma_send_sqe *send_sqe; struct erdma_sge *sgl_base; uint16_t tmp_pi = *sq_pi; __le32 *length_field; uint64_t sqe_hdr; void *sqe; sqe = get_sq_wqebb(qp, tmp_pi); /* Clear the first 8Byte of the wqe hdr. */ *(uint64_t *)sqe = 0; qp->sq.wr_tbl[tmp_pi & (qp->sq.depth - 1)] = wr->wr_id; sqe_hdr = FIELD_PREP(ERDMA_SQE_HDR_QPN_MASK, qp->id) | FIELD_PREP(ERDMA_SQE_HDR_CE_MASK, wr->send_flags & IBV_SEND_SIGNALED ? 1 : 0) | FIELD_PREP(ERDMA_SQE_HDR_CE_MASK, qp->sq_sig_all) | FIELD_PREP(ERDMA_SQE_HDR_SE_MASK, wr->send_flags & IBV_SEND_SOLICITED ? 1 : 0) | FIELD_PREP(ERDMA_SQE_HDR_FENCE_MASK, wr->send_flags & IBV_SEND_FENCE ? 1 : 0) | FIELD_PREP(ERDMA_SQE_HDR_INLINE_MASK, wr->send_flags & IBV_SEND_INLINE ? 1 : 0); switch (wr->opcode) { case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_WRITE_WITH_IMM: if (wr->opcode == IBV_WR_RDMA_WRITE) opcode = ERDMA_OP_WRITE; else opcode = ERDMA_OP_WRITE_WITH_IMM; sqe_hdr |= FIELD_PREP(ERDMA_SQE_HDR_OPCODE_MASK, opcode); write_sqe = sqe; write_sqe->imm_data = wr->imm_data; write_sqe->sink_stag = htole32(wr->wr.rdma.rkey); write_sqe->sink_to_low = htole32(wr->wr.rdma.remote_addr & 0xFFFFFFFF); write_sqe->sink_to_high = htole32((wr->wr.rdma.remote_addr >> 32) & 0xFFFFFFFF); length_field = &write_sqe->length; /* sgl is at the start of next wqebb. 
*/ sgl_base = get_sq_wqebb(qp, tmp_pi + 1); sgl_off = 0; sgl_idx = tmp_pi + 1; wqe_size = sizeof(struct erdma_write_sqe); break; case IBV_WR_SEND: case IBV_WR_SEND_WITH_IMM: if (wr->opcode == IBV_WR_SEND) opcode = ERDMA_OP_SEND; else opcode = ERDMA_OP_SEND_WITH_IMM; sqe_hdr |= FIELD_PREP(ERDMA_SQE_HDR_OPCODE_MASK, opcode); send_sqe = sqe; send_sqe->imm_data = wr->imm_data; length_field = &send_sqe->length; /* sgl is in the half of current wqebb (offset 16Byte) */ sgl_base = sqe; sgl_off = 16; sgl_idx = tmp_pi; wqe_size = sizeof(struct erdma_send_sqe); break; case IBV_WR_RDMA_READ: sqe_hdr |= FIELD_PREP(ERDMA_SQE_HDR_OPCODE_MASK, ERDMA_OP_READ); read_sqe = sqe; read_sqe->sink_to_low = htole32(wr->sg_list->addr & 0xFFFFFFFF); read_sqe->sink_to_high = htole32((wr->sg_list->addr >> 32) & 0xFFFFFFFF); read_sqe->sink_stag = htole32(wr->sg_list->lkey); read_sqe->length = htole32(wr->sg_list->length); sgl_base = get_sq_wqebb(qp, tmp_pi + 1); sgl_base->addr = htole64(wr->wr.rdma.remote_addr); sgl_base->length = htole32(wr->sg_list->length); sgl_base->key = htole32(wr->wr.rdma.rkey); wqe_size = sizeof(struct erdma_readreq_sqe); goto out; case IBV_WR_ATOMIC_CMP_AND_SWP: case IBV_WR_ATOMIC_FETCH_AND_ADD: atomic_sqe = (struct erdma_atomic_sqe *)sqe; if (wr->opcode == IBV_WR_ATOMIC_CMP_AND_SWP) { sqe_hdr |= FIELD_PREP(ERDMA_SQE_HDR_OPCODE_MASK, ERDMA_OP_ATOMIC_CAS); atomic_sqe->fetchadd_swap_data = htole64(wr->wr.atomic.swap); atomic_sqe->cmp_data = htole64(wr->wr.atomic.compare_add); } else { sqe_hdr |= FIELD_PREP(ERDMA_SQE_HDR_OPCODE_MASK, ERDMA_OP_ATOMIC_FAD); atomic_sqe->fetchadd_swap_data = htole64(wr->wr.atomic.compare_add); } sgl_base = (struct erdma_sge *)get_sq_wqebb(qp, tmp_pi + 1); /* remote SGL fields */ sgl_base->addr = htole64(wr->wr.atomic.remote_addr); sgl_base->key = htole32(wr->wr.atomic.rkey); /* local SGL fields */ sgl_base++; sgl_base->addr = htole64(wr->sg_list[0].addr); sgl_base->length = htole32(wr->sg_list[0].length); sgl_base->key = htole32(wr->sg_list[0].lkey); wqe_size = sizeof(struct erdma_atomic_sqe); goto out; default: return -EINVAL; } if (wr->send_flags & IBV_SEND_INLINE) { char *data = (char *)sgl_base; uint32_t remain_size; uint32_t copy_size; uint32_t data_off; i = 0; bytes = 0; /* Allow more than ERDMA_MAX_SGE, since content copied here */ while (i < wr->num_sge) { bytes += wr->sg_list[i].length; if (bytes > (int)ERDMA_MAX_INLINE) return -EINVAL; remain_size = wr->sg_list[i].length; data_off = 0; while (1) { copy_size = min(remain_size, SQEBB_SIZE - sgl_off); memcpy(data + sgl_off, (void *)(uintptr_t)wr->sg_list[i].addr + data_off, copy_size); remain_size -= copy_size; /* Update sgl_offset. */ sgl_idx += ((sgl_off + copy_size) >> SQEBB_SHIFT); sgl_off = (sgl_off + copy_size) & (SQEBB_SIZE - 1); data_off += copy_size; data = get_sq_wqebb(qp, sgl_idx); if (!remain_size) break; }; i++; } *length_field = htole32(bytes); wqe_size += bytes; sqe_hdr |= FIELD_PREP(ERDMA_SQE_HDR_SGL_LEN_MASK, bytes); } else { char *sgl = (char *)sgl_base; if (wr->num_sge > ERDMA_MAX_SEND_SGE) return -EINVAL; i = 0; bytes = 0; while (i < wr->num_sge) { bytes += wr->sg_list[i].length; memcpy(sgl + sgl_off, &wr->sg_list[i], sizeof(struct ibv_sge)); if (sgl_off == 0) *(uint32_t *)(sgl + 28) = qp->id; sgl_idx += (sgl_off == sizeof(struct ibv_sge) ? 
1 : 0); sgl = get_sq_wqebb(qp, sgl_idx); sgl_off = sizeof(struct ibv_sge) - sgl_off; i++; } *length_field = htole32(bytes); sqe_hdr |= FIELD_PREP(ERDMA_SQE_HDR_SGL_LEN_MASK, wr->num_sge); wqe_size += wr->num_sge * sizeof(struct ibv_sge); } out: wqebb_cnt = SQEBB_COUNT(wqe_size); assert(wqebb_cnt <= MAX_WQEBB_PER_SQE); sqe_hdr |= FIELD_PREP(ERDMA_SQE_HDR_WQEBB_CNT_MASK, wqebb_cnt - 1); sqe_hdr |= FIELD_PREP(ERDMA_SQE_HDR_WQEBB_INDEX_MASK, tmp_pi + wqebb_cnt); *(__le64 *)sqe = htole64(sqe_hdr); *sq_pi = tmp_pi + wqebb_cnt; return 0; } int erdma_post_send(struct ibv_qp *base_qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { struct erdma_qp *qp = to_eqp(base_qp); int new_sqe = 0, rv = 0; uint16_t sq_pi; *bad_wr = NULL; if (base_qp->state == IBV_QPS_ERR) { *bad_wr = wr; return -EIO; } pthread_spin_lock(&qp->sq_lock); sq_pi = qp->sq.pi; while (wr) { if ((uint16_t)(sq_pi - qp->sq.ci) >= qp->sq.depth) { rv = -ENOMEM; *bad_wr = wr; break; } rv = erdma_push_one_sqe(qp, wr, &sq_pi); if (rv) { *bad_wr = wr; break; } new_sqe++; wr = wr->next; } if (new_sqe) { qp->sq.pi = sq_pi; __kick_sq_db(qp, sq_pi); /* normal doorbell. */ } pthread_spin_unlock(&qp->sq_lock); return rv; } static int push_recv_wqe(struct erdma_qp *qp, struct ibv_recv_wr *wr) { uint16_t rq_pi = qp->rq.pi; uint16_t idx = rq_pi & (qp->rq.depth - 1); struct erdma_rqe *rqe = (struct erdma_rqe *)qp->rq.qbuf + idx; if ((uint16_t)(rq_pi - qp->rq.ci) == qp->rq.depth) return -ENOMEM; rqe->qe_idx = htole16(rq_pi + 1); rqe->qpn = htole32(qp->id); qp->rq.wr_tbl[idx] = wr->wr_id; if (wr->num_sge == 0) { rqe->length = 0; } else if (wr->num_sge == 1) { rqe->stag = htole32(wr->sg_list[0].lkey); rqe->to = htole64(wr->sg_list[0].addr); rqe->length = htole32(wr->sg_list[0].length); } else { return -EINVAL; } *(__le64 *)qp->rq.db_record = *(__le64 *)rqe; udma_to_device_barrier(); mmio_write64_le(qp->rq.db, *(__le64 *)rqe); qp->rq.pi = rq_pi + 1; return 0; } int erdma_post_recv(struct ibv_qp *base_qp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct erdma_qp *qp = to_eqp(base_qp); int ret = 0; if (base_qp->state == IBV_QPS_ERR) { *bad_wr = wr; return -EIO; } pthread_spin_lock(&qp->rq_lock); while (wr) { ret = push_recv_wqe(qp, wr); if (ret) { *bad_wr = wr; break; } wr = wr->next; } pthread_spin_unlock(&qp->rq_lock); return ret; } void erdma_cq_event(struct ibv_cq *ibcq) { struct erdma_cq *cq = to_ecq(ibcq); cq->cmdsn++; } static void *get_next_valid_cqe(struct erdma_cq *cq) { struct erdma_cqe *cqe = cq->queue + (cq->ci & (cq->depth - 1)); uint32_t owner = FIELD_GET(ERDMA_CQE_HDR_OWNER_MASK, be32toh(cqe->hdr)); return owner ^ !!(cq->ci & cq->depth) ? 
cqe : NULL; } static const enum ibv_wc_opcode wc_mapping_table[ERDMA_NUM_OPCODES] = { [ERDMA_OP_WRITE] = IBV_WC_RDMA_WRITE, [ERDMA_OP_READ] = IBV_WC_RDMA_READ, [ERDMA_OP_SEND] = IBV_WC_SEND, [ERDMA_OP_SEND_WITH_IMM] = IBV_WC_SEND, [ERDMA_OP_RECEIVE] = IBV_WC_RECV, [ERDMA_OP_RECV_IMM] = IBV_WC_RECV_RDMA_WITH_IMM, [ERDMA_OP_RECV_INV] = IBV_WC_RECV, [ERDMA_OP_WRITE_WITH_IMM] = IBV_WC_RDMA_WRITE, [ERDMA_OP_INVALIDATE] = IBV_WC_LOCAL_INV, [ERDMA_OP_RSP_SEND_IMM] = IBV_WC_RECV, [ERDMA_OP_SEND_WITH_INV] = IBV_WC_SEND, [ERDMA_OP_READ_WITH_INV] = IBV_WC_RDMA_READ, [ERDMA_OP_ATOMIC_CAS] = IBV_WC_COMP_SWAP, [ERDMA_OP_ATOMIC_FAD] = IBV_WC_FETCH_ADD, }; static const struct { enum erdma_wc_status erdma; enum ibv_wc_status base; enum erdma_vendor_err vendor; } map_cqe_status[ERDMA_NUM_WC_STATUS] = { { ERDMA_WC_SUCCESS, IBV_WC_SUCCESS, ERDMA_WC_VENDOR_NO_ERR }, { ERDMA_WC_GENERAL_ERR, IBV_WC_GENERAL_ERR, ERDMA_WC_VENDOR_NO_ERR }, { ERDMA_WC_RECV_WQE_FORMAT_ERR, IBV_WC_GENERAL_ERR, ERDMA_WC_VENDOR_INVALID_RQE }, { ERDMA_WC_RECV_STAG_INVALID_ERR, IBV_WC_REM_ACCESS_ERR, ERDMA_WC_VENDOR_RQE_INVALID_STAG }, { ERDMA_WC_RECV_ADDR_VIOLATION_ERR, IBV_WC_REM_ACCESS_ERR, ERDMA_WC_VENDOR_RQE_ADDR_VIOLATION }, { ERDMA_WC_RECV_RIGHT_VIOLATION_ERR, IBV_WC_REM_ACCESS_ERR, ERDMA_WC_VENDOR_RQE_ACCESS_RIGHT_ERR }, { ERDMA_WC_RECV_PDID_ERR, IBV_WC_REM_ACCESS_ERR, ERDMA_WC_VENDOR_RQE_INVALID_PD }, { ERDMA_WC_RECV_WARRPING_ERR, IBV_WC_REM_ACCESS_ERR, ERDMA_WC_VENDOR_RQE_WRAP_ERR }, { ERDMA_WC_SEND_WQE_FORMAT_ERR, IBV_WC_LOC_QP_OP_ERR, ERDMA_WC_VENDOR_INVALID_SQE }, { ERDMA_WC_SEND_WQE_ORD_EXCEED, IBV_WC_GENERAL_ERR, ERDMA_WC_VENDOR_ZERO_ORD }, { ERDMA_WC_SEND_STAG_INVALID_ERR, IBV_WC_LOC_ACCESS_ERR, ERDMA_WC_VENDOR_SQE_INVALID_STAG }, { ERDMA_WC_SEND_ADDR_VIOLATION_ERR, IBV_WC_LOC_ACCESS_ERR, ERDMA_WC_VENDOR_SQE_ADDR_VIOLATION }, { ERDMA_WC_SEND_RIGHT_VIOLATION_ERR, IBV_WC_LOC_ACCESS_ERR, ERDMA_WC_VENDOR_SQE_ACCESS_ERR }, { ERDMA_WC_SEND_PDID_ERR, IBV_WC_LOC_ACCESS_ERR, ERDMA_WC_VENDOR_SQE_INVALID_PD }, { ERDMA_WC_SEND_WARRPING_ERR, IBV_WC_LOC_ACCESS_ERR, ERDMA_WC_VENDOR_SQE_WARP_ERR }, { ERDMA_WC_FLUSH_ERR, IBV_WC_WR_FLUSH_ERR, ERDMA_WC_VENDOR_NO_ERR }, { ERDMA_WC_RETRY_EXC_ERR, IBV_WC_RETRY_EXC_ERR, ERDMA_WC_VENDOR_NO_ERR }, }; #define ERDMA_POLLCQ_NO_QP (-1) #define ERDMA_POLLCQ_DUP_COMP (-2) #define ERDMA_POLLCQ_WRONG_IDX (-3) static int __erdma_poll_one_cqe(struct erdma_context *ctx, struct erdma_cq *cq, struct ibv_wc *wc) { uint32_t cqe_hdr, opcode, syndrome, qpn; uint16_t depth, wqe_idx, old_ci, new_ci; uint64_t *sqe_hdr, *qeidx2wrid; uint32_t tbl_idx, tbl_off; struct erdma_cqe *cqe; struct erdma_qp *qp; cqe = get_next_valid_cqe(cq); if (!cqe) return -EAGAIN; cq->ci++; udma_from_device_barrier(); cqe_hdr = be32toh(cqe->hdr); syndrome = FIELD_GET(ERDMA_CQE_HDR_SYNDROME_MASK, cqe_hdr); opcode = FIELD_GET(ERDMA_CQE_HDR_OPCODE_MASK, cqe_hdr); qpn = be32toh(cqe->qpn); wqe_idx = be32toh(cqe->qe_idx); tbl_idx = qpn >> ERDMA_QP_TABLE_SHIFT; tbl_off = qpn & ERDMA_QP_TABLE_MASK; if (!ctx->qp_table[tbl_idx].table || !ctx->qp_table[tbl_idx].table[tbl_off]) return ERDMA_POLLCQ_NO_QP; qp = ctx->qp_table[tbl_idx].table[tbl_off]; if (FIELD_GET(ERDMA_CQE_HDR_QTYPE_MASK, cqe_hdr) == ERDMA_CQE_QTYPE_SQ) { qeidx2wrid = qp->sq.wr_tbl; depth = qp->sq.depth; sqe_hdr = get_sq_wqebb(qp, wqe_idx); old_ci = qp->sq.ci; new_ci = wqe_idx + FIELD_GET(ERDMA_SQE_HDR_WQEBB_CNT_MASK, *sqe_hdr) + 1; if ((uint16_t)(new_ci - old_ci) > depth) return ERDMA_POLLCQ_WRONG_IDX; else if (new_ci == old_ci) return ERDMA_POLLCQ_DUP_COMP; qp->sq.ci = new_ci; } else { 
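		/* RQ completion: the consumer index advances one RQE per CQE. */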
qeidx2wrid = qp->rq.wr_tbl; depth = qp->rq.depth; qp->rq.ci++; } wc->wr_id = qeidx2wrid[wqe_idx & (depth - 1)]; wc->byte_len = be32toh(cqe->size); wc->wc_flags = 0; wc->opcode = wc_mapping_table[opcode]; if (opcode == ERDMA_OP_RECV_IMM || opcode == ERDMA_OP_RSP_SEND_IMM) { wc->imm_data = htobe32(le32toh(cqe->imm_data)); wc->wc_flags |= IBV_WC_WITH_IMM; } if (syndrome >= ERDMA_NUM_WC_STATUS) syndrome = ERDMA_WC_GENERAL_ERR; wc->status = map_cqe_status[syndrome].base; wc->vendor_err = map_cqe_status[syndrome].vendor; wc->qp_num = qpn; return 0; } int erdma_poll_cq(struct ibv_cq *ibcq, int num_entries, struct ibv_wc *wc) { struct erdma_context *ctx = to_ectx(ibcq->context); struct erdma_cq *cq = to_ecq(ibcq); int ret, npolled = 0; pthread_spin_lock(&cq->lock); while (npolled < num_entries) { ret = __erdma_poll_one_cqe(ctx, cq, wc + npolled); if (ret == -EAGAIN) /* CQ is empty, break the loop. */ break; else if (ret) /* We handle the polling error silently. */ continue; npolled++; } pthread_spin_unlock(&cq->lock); return npolled; } void erdma_free_context(struct ibv_context *ibv_ctx) { struct erdma_context *ctx = to_ectx(ibv_ctx); int i; munmap(ctx->sdb, ERDMA_PAGE_SIZE); munmap(ctx->rdb, ERDMA_PAGE_SIZE); munmap(ctx->cdb, ERDMA_PAGE_SIZE); pthread_mutex_lock(&ctx->qp_table_mutex); for (i = 0; i < ERDMA_QP_TABLE_SIZE; ++i) { if (ctx->qp_table[i].refcnt) free(ctx->qp_table[i].table); } pthread_mutex_unlock(&ctx->qp_table_mutex); pthread_mutex_destroy(&ctx->qp_table_mutex); verbs_uninit_context(&ctx->ibv_ctx); free(ctx); } rdma-core-56.1/providers/erdma/erdma_verbs.h000066400000000000000000000064651477342711600211140ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or OpenIB.org BSD (MIT) See COPYING file */ /* * Authors: Cheng Xu * Copyright (c) 2020-2021, Alibaba Group. 
*/ #ifndef __ERDMA_VERBS_H__ #define __ERDMA_VERBS_H__ #include #include #include #include "erdma.h" #include "erdma_hw.h" #define ERDMA_MAX_SEND_SGE 6 #define ERDMA_MAX_RECV_SGE 1 struct erdma_queue { void *qbuf; void *db; uint16_t rsvd0; uint16_t depth; uint32_t size; uint16_t pi; uint16_t ci; uint32_t rsvd1; uint64_t *wr_tbl; void *db_record; }; struct erdma_qp { struct ibv_qp base_qp; struct erdma_device *erdma_dev; uint32_t id; /* qpn */ pthread_spinlock_t sq_lock; pthread_spinlock_t rq_lock; int sq_sig_all; struct erdma_queue sq; struct erdma_queue rq; void *qbuf; size_t qbuf_size; uint64_t *db_records; }; struct erdma_cq { struct ibv_cq base_cq; struct erdma_device *erdma_dev; uint32_t id; uint32_t event_stats; uint32_t depth; uint32_t ci; struct erdma_cqe *queue; void *db; uint16_t db_offset; void *db_record; uint32_t cmdsn; uint32_t comp_vector; uint32_t db_index; pthread_spinlock_t lock; }; static inline struct erdma_qp *to_eqp(struct ibv_qp *base) { return container_of(base, struct erdma_qp, base_qp); } static inline struct erdma_cq *to_ecq(struct ibv_cq *base) { return container_of(base, struct erdma_cq, base_cq); } static inline void *get_sq_wqebb(struct erdma_qp *qp, uint16_t idx) { idx &= (qp->sq.depth - 1); return qp->sq.qbuf + (idx << SQEBB_SHIFT); } static inline void __kick_sq_db(struct erdma_qp *qp, uint16_t pi) { uint64_t db_data; db_data = FIELD_PREP(ERDMA_SQE_HDR_QPN_MASK, qp->id) | FIELD_PREP(ERDMA_SQE_HDR_WQEBB_INDEX_MASK, pi); *(__le64 *)qp->sq.db_record = htole64(db_data); udma_to_device_barrier(); mmio_write64_le(qp->sq.db, htole64(db_data)); } struct ibv_pd *erdma_alloc_pd(struct ibv_context *ctx); int erdma_free_pd(struct ibv_pd *pd); int erdma_query_device(struct ibv_context *ctx, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); int erdma_query_port(struct ibv_context *ctx, uint8_t port, struct ibv_port_attr *attr); struct ibv_mr *erdma_reg_mr(struct ibv_pd *pd, void *addr, size_t len, uint64_t hca_va, int access); int erdma_dereg_mr(struct verbs_mr *vmr); struct ibv_qp *erdma_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); int erdma_modify_qp(struct ibv_qp *base_qp, struct ibv_qp_attr *attr, int attr_mask); int erdma_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); int erdma_post_send(struct ibv_qp *base_qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr); int erdma_post_recv(struct ibv_qp *base_qp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); int erdma_destroy_qp(struct ibv_qp *base_qp); void erdma_free_context(struct ibv_context *ibv_ctx); struct ibv_cq *erdma_create_cq(struct ibv_context *ctx, int num_cqe, struct ibv_comp_channel *channel, int comp_vector); int erdma_destroy_cq(struct ibv_cq *base_cq); int erdma_notify_cq(struct ibv_cq *ibcq, int solicited); void erdma_cq_event(struct ibv_cq *ibcq); int erdma_poll_cq(struct ibv_cq *ibcq, int num_entries, struct ibv_wc *wc); #endif rdma-core-56.1/providers/hfi1verbs/000077500000000000000000000000001477342711600172405ustar00rootroot00000000000000rdma-core-56.1/providers/hfi1verbs/CMakeLists.txt000066400000000000000000000000631477342711600217770ustar00rootroot00000000000000rdma_provider(hfi1verbs hfiverbs.c verbs.c ) rdma-core-56.1/providers/hfi1verbs/hfi-abi.h000066400000000000000000000056441477342711600207210ustar00rootroot00000000000000/* This file is provided under a dual BSD/GPLv2 license. When using or redistributing this file, you may do so under either license. 
GPL LICENSE SUMMARY Copyright(c) 2015 Intel Corporation. This program is free software; you can redistribute it and/or modify it under the terms of version 2 of the GNU General Public License as published by the Free Software Foundation. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. Contact Information: Intel Corporation www.intel.com BSD LICENSE Copyright(c) 2015 Intel Corporation. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of Intel Corporation nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Copyright (C) 2006-2007 QLogic Corporation, All rights reserved. */ #ifndef HFI1_ABI_H #define HFI1_ABI_H #include struct hfi1_get_context_resp { struct ib_uverbs_get_context_resp ibv_resp; __u32 version; }; struct hfi1_create_cq_resp { struct ib_uverbs_create_cq_resp ibv_resp; __u64 offset; }; struct hfi1_resize_cq_resp { struct ib_uverbs_resize_cq_resp ibv_resp; __u64 offset; }; struct hfi1_create_qp_resp { struct ib_uverbs_create_qp_resp ibv_resp; __u64 offset; }; struct hfi1_create_srq_resp { struct ib_uverbs_create_srq_resp ibv_resp; __u64 offset; }; struct hfi1_modify_srq_cmd { struct ibv_modify_srq ibv_cmd; __u64 offset_addr; }; #endif /* HFI1_ABI_H */ rdma-core-56.1/providers/hfi1verbs/hfiverbs.c000066400000000000000000000137611477342711600212240ustar00rootroot00000000000000/* This file is provided under a dual BSD/GPLv2 license. When using or redistributing this file, you may do so under either license. GPL LICENSE SUMMARY Copyright(c) 2015 Intel Corporation. This program is free software; you can redistribute it and/or modify it under the terms of version 2 of the GNU General Public License as published by the Free Software Foundation. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. Contact Information: Intel Corporation www.intel.com BSD LICENSE Copyright(c) 2015 Intel Corporation. 
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of Intel Corporation nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Copyright (C) 2006-2007 QLogic Corporation, All rights reserved. Copyright (c) 2005. PathScale, Inc. All rights reserved. */ #include #include #include #include #include #include "hfiverbs.h" #include "hfi-abi.h" static void hfi1_free_context(struct ibv_context *ibctx); #ifndef PCI_VENDOR_ID_INTEL #define PCI_VENDOR_ID_INTEL 0x8086 #endif #ifndef PCI_DEVICE_ID_INTEL0 #define PCI_DEVICE_ID_HFI_INTEL0 0x24f0 #endif #ifndef PCI_DEVICE_ID_INTEL1 #define PCI_DEVICE_ID_HFI_INTEL1 0x24f1 #endif #define HFI(v, d) \ VERBS_PCI_MATCH(PCI_VENDOR_ID_##v, PCI_DEVICE_ID_HFI_##d, NULL) static const struct verbs_match_ent hca_table[] = { VERBS_DRIVER_ID(RDMA_DRIVER_HFI1), HFI(INTEL, INTEL0), HFI(INTEL, INTEL1), {} }; static const struct verbs_context_ops hfi1_ctx_common_ops = { .free_context = hfi1_free_context, .query_device_ex = hfi1_query_device, .query_port = hfi1_query_port, .alloc_pd = hfi1_alloc_pd, .dealloc_pd = hfi1_free_pd, .reg_mr = hfi1_reg_mr, .dereg_mr = hfi1_dereg_mr, .create_cq = hfi1_create_cq, .poll_cq = hfi1_poll_cq, .req_notify_cq = ibv_cmd_req_notify_cq, .resize_cq = hfi1_resize_cq, .destroy_cq = hfi1_destroy_cq, .create_srq = hfi1_create_srq, .modify_srq = hfi1_modify_srq, .query_srq = hfi1_query_srq, .destroy_srq = hfi1_destroy_srq, .post_srq_recv = hfi1_post_srq_recv, .create_qp = hfi1_create_qp, .query_qp = hfi1_query_qp, .modify_qp = hfi1_modify_qp, .destroy_qp = hfi1_destroy_qp, .post_send = hfi1_post_send, .post_recv = hfi1_post_recv, .create_ah = hfi1_create_ah, .destroy_ah = hfi1_destroy_ah, .attach_mcast = ibv_cmd_attach_mcast, .detach_mcast = ibv_cmd_detach_mcast }; static const struct verbs_context_ops hfi1_ctx_v1_ops = { .create_cq = hfi1_create_cq_v1, .create_qp = hfi1_create_qp_v1, .create_srq = hfi1_create_srq_v1, .destroy_cq = hfi1_destroy_cq_v1, .destroy_qp = hfi1_destroy_qp_v1, .destroy_srq = hfi1_destroy_srq_v1, .modify_srq = hfi1_modify_srq_v1, .poll_cq = ibv_cmd_poll_cq, .post_recv = ibv_cmd_post_recv, .post_srq_recv = ibv_cmd_post_srq_recv, .resize_cq = hfi1_resize_cq_v1, }; static struct verbs_context *hfi1_alloc_context(struct ibv_device *ibdev, int cmd_fd, void 
*private_data) { struct hfi1_context *context; struct ibv_get_context cmd; struct ib_uverbs_get_context_resp resp; struct hfi1_device *dev; context = verbs_init_and_alloc_context(ibdev, cmd_fd, context, ibv_ctx, RDMA_DRIVER_HFI1); if (!context) return NULL; if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof cmd, &resp, sizeof resp)) goto err_free; verbs_set_ops(&context->ibv_ctx, &hfi1_ctx_common_ops); dev = to_idev(ibdev); if (dev->abi_version == 1) verbs_set_ops(&context->ibv_ctx, &hfi1_ctx_v1_ops); return &context->ibv_ctx; err_free: verbs_uninit_context(&context->ibv_ctx); free(context); return NULL; } static void hfi1_free_context(struct ibv_context *ibctx) { struct hfi1_context *context = to_ictx(ibctx); verbs_uninit_context(&context->ibv_ctx); free(context); } static void hf11_uninit_device(struct verbs_device *verbs_device) { struct hfi1_device *dev = to_idev(&verbs_device->device); free(dev); } static struct verbs_device *hfi1_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct hfi1_device *dev; dev = calloc(1, sizeof(*dev)); if (!dev) return NULL; dev->abi_version = sysfs_dev->abi_ver; return &dev->ibv_dev; } static const struct verbs_device_ops hfi1_dev_ops = { .name = "hfi1verbs", .match_min_abi_version = 0, .match_max_abi_version = INT_MAX, .match_table = hca_table, .alloc_device = hfi1_device_alloc, .uninit_device = hf11_uninit_device, .alloc_context = hfi1_alloc_context, }; PROVIDER_DRIVER(hfi1verbs, hfi1_dev_ops); rdma-core-56.1/providers/hfi1verbs/hfiverbs.h000066400000000000000000000173721477342711600212330ustar00rootroot00000000000000/* This file is provided under a dual BSD/GPLv2 license. When using or redistributing this file, you may do so under either license. GPL LICENSE SUMMARY Copyright(c) 2015 Intel Corporation. This program is free software; you can redistribute it and/or modify it under the terms of version 2 of the GNU General Public License as published by the Free Software Foundation. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. Contact Information: Intel Corporation www.intel.com BSD LICENSE Copyright(c) 2015 Intel Corporation. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of Intel Corporation nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Copyright (C) 2006-2009 QLogic Corporation, All rights reserved. Copyright (c) 2005. PathScale, Inc. All rights reserved. */ #ifndef HFI1_H #define HFI1_H #include #include #include #include #include #include #define PFX "hfi1: " struct hfi1_device { struct verbs_device ibv_dev; int abi_version; }; struct hfi1_context { struct verbs_context ibv_ctx; }; /* * This structure needs to have the same size and offsets as * the kernel's ib_wc structure since it is memory mapped. */ struct hfi1_wc { uint64_t wr_id; enum ibv_wc_status status; enum ibv_wc_opcode opcode; uint32_t vendor_err; uint32_t byte_len; uint32_t imm_data; /* in network byte order */ uint32_t qp_num; uint32_t src_qp; enum ibv_wc_flags wc_flags; uint16_t pkey_index; uint16_t slid; uint8_t sl; uint8_t dlid_path_bits; uint8_t port_num; }; struct hfi1_cq_wc { _Atomic(uint32_t) head; _Atomic(uint32_t) tail; struct hfi1_wc queue[1]; }; struct hfi1_cq { struct ibv_cq ibv_cq; struct hfi1_cq_wc *queue; pthread_spinlock_t lock; }; /* * Receive work request queue entry. * The size of the sg_list is determined when the QP is created and stored * in qp->r_max_sge. */ struct hfi1_rwqe { uint64_t wr_id; uint8_t num_sge; uint8_t padding[7]; struct ibv_sge sg_list[0]; }; /* * This struture is used to contain the head pointer, tail pointer, * and receive work queue entries as a single memory allocation so * it can be mmap'ed into user space. * Note that the wq array elements are variable size so you can't * just index into the array to get the N'th element; * use get_rwqe_ptr() instead. */ struct hfi1_rwq { _Atomic(uint32_t) head; /* new requests posted to the head. */ _Atomic(uint32_t) tail; /* receives pull requests from here. */ struct hfi1_rwqe wq[0]; }; struct hfi1_rq { struct hfi1_rwq *rwq; pthread_spinlock_t lock; uint32_t size; uint32_t max_sge; }; struct hfi1_qp { struct ibv_qp ibv_qp; struct hfi1_rq rq; }; struct hfi1_srq { struct ibv_srq ibv_srq; struct hfi1_rq rq; }; #define to_ixxx(xxx, type) \ container_of(ib##xxx, struct hfi1_##type, ibv_##xxx) static inline struct hfi1_context *to_ictx(struct ibv_context *ibctx) { return container_of(ibctx, struct hfi1_context, ibv_ctx.context); } static inline struct hfi1_device *to_idev(struct ibv_device *ibdev) { return container_of(ibdev, struct hfi1_device, ibv_dev.device); } static inline struct hfi1_cq *to_icq(struct ibv_cq *ibcq) { return to_ixxx(cq, cq); } static inline struct hfi1_qp *to_iqp(struct ibv_qp *ibqp) { return to_ixxx(qp, qp); } static inline struct hfi1_srq *to_isrq(struct ibv_srq *ibsrq) { return to_ixxx(srq, srq); } /* * Since struct hfi1_rwqe is not a fixed size, we can't simply index into * struct hfi1_rq.wq. This function does the array index computation. 
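 * For example, with max_sge == 2 each entry occupies
 * sizeof(struct hfi1_rwqe) + 2 * sizeof(struct ibv_sge) = 16 + 32 = 48
 * bytes, so entry n starts 48 * n bytes into the wq array.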
*/ static inline struct hfi1_rwqe *get_rwqe_ptr(struct hfi1_rq *rq, unsigned n) { return (struct hfi1_rwqe *) ((char *) rq->rwq->wq + (sizeof(struct hfi1_rwqe) + rq->max_sge * sizeof(struct ibv_sge)) * n); } int hfi1_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); extern int hfi1_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr); struct ibv_pd *hfi1_alloc_pd(struct ibv_context *pd); int hfi1_free_pd(struct ibv_pd *pd); struct ibv_mr *hfi1_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access); int hfi1_dereg_mr(struct verbs_mr *vmr); struct ibv_cq *hfi1_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); struct ibv_cq *hfi1_create_cq_v1(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); int hfi1_resize_cq(struct ibv_cq *cq, int cqe); int hfi1_resize_cq_v1(struct ibv_cq *cq, int cqe); int hfi1_destroy_cq(struct ibv_cq *cq); int hfi1_destroy_cq_v1(struct ibv_cq *cq); int hfi1_poll_cq(struct ibv_cq *cq, int ne, struct ibv_wc *wc); struct ibv_qp *hfi1_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); struct ibv_qp *hfi1_create_qp_v1(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); int hfi1_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); int hfi1_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask); int hfi1_destroy_qp(struct ibv_qp *qp); int hfi1_destroy_qp_v1(struct ibv_qp *qp); int hfi1_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr); int hfi1_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); struct ibv_srq *hfi1_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr); struct ibv_srq *hfi1_create_srq_v1(struct ibv_pd *pd, struct ibv_srq_init_attr *attr); int hfi1_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, int attr_mask); int hfi1_modify_srq_v1(struct ibv_srq *srq, struct ibv_srq_attr *attr, int attr_mask); int hfi1_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr); int hfi1_destroy_srq(struct ibv_srq *srq); int hfi1_destroy_srq_v1(struct ibv_srq *srq); int hfi1_post_srq_recv(struct ibv_srq *srq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); struct ibv_ah *hfi1_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr); int hfi1_destroy_ah(struct ibv_ah *ah); #endif /* HFI1_H */ rdma-core-56.1/providers/hfi1verbs/verbs.c000066400000000000000000000375641477342711600205440ustar00rootroot00000000000000/* This file is provided under a dual BSD/GPLv2 license. When using or redistributing this file, you may do so under either license. GPL LICENSE SUMMARY Copyright(c) 2015 Intel Corporation. This program is free software; you can redistribute it and/or modify it under the terms of version 2 of the GNU General Public License as published by the Free Software Foundation. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. Contact Information: Intel Corporation www.intel.com BSD LICENSE Copyright(c) 2015 Intel Corporation. 
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of Intel Corporation nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Copyright (C) 2006-2009 QLogic Corporation, All rights reserved. Copyright (c) 2005. PathScale, Inc. All rights reserved. */ #include #include #include #include #include #include #include #include "hfiverbs.h" #include "hfi-abi.h" int hfi1_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct ib_uverbs_ex_query_device_resp resp; size_t resp_size = sizeof(resp); uint64_t raw_fw_ver; unsigned major, minor, sub_minor; int ret; ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp, &resp_size); if (ret) return ret; raw_fw_ver = resp.base.fw_ver; major = (raw_fw_ver >> 32) & 0xffff; minor = (raw_fw_ver >> 16) & 0xffff; sub_minor = raw_fw_ver & 0xffff; snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver), "%d.%d.%d", major, minor, sub_minor); return 0; } int hfi1_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr) { struct ibv_query_port cmd; return ibv_cmd_query_port(context, port, attr, &cmd, sizeof cmd); } struct ibv_pd *hfi1_alloc_pd(struct ibv_context *context) { struct ibv_alloc_pd cmd; struct ib_uverbs_alloc_pd_resp resp; struct ibv_pd *pd; pd = malloc(sizeof *pd); if (!pd) return NULL; if (ibv_cmd_alloc_pd(context, pd, &cmd, sizeof cmd, &resp, sizeof resp)) { free(pd); return NULL; } return pd; } int hfi1_free_pd(struct ibv_pd *pd) { int ret; ret = ibv_cmd_dealloc_pd(pd); if (ret) return ret; free(pd); return 0; } struct ibv_mr *hfi1_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access) { struct verbs_mr *vmr; struct ibv_reg_mr cmd; struct ib_uverbs_reg_mr_resp resp; int ret; vmr = malloc(sizeof(*vmr)); if (!vmr) return NULL; ret = ibv_cmd_reg_mr(pd, addr, length, hca_va, access, vmr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) { free(vmr); return NULL; } return &vmr->ibv_mr; } int hfi1_dereg_mr(struct verbs_mr *vmr) { int ret; ret = ibv_cmd_dereg_mr(vmr); if (ret) return ret; free(vmr); return 0; } struct ibv_cq *hfi1_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector) 
{ struct hfi1_cq *cq; struct hfi1_create_cq_resp resp; int ret; size_t size; memset(&resp, 0, sizeof(resp)); cq = malloc(sizeof *cq); if (!cq) return NULL; ret = ibv_cmd_create_cq(context, cqe, channel, comp_vector, &cq->ibv_cq, NULL, 0, &resp.ibv_resp, sizeof resp); if (ret) { free(cq); return NULL; } size = sizeof(struct hfi1_cq_wc) + sizeof(struct hfi1_wc) * cqe; cq->queue = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, context->cmd_fd, resp.offset); if ((void *) cq->queue == MAP_FAILED) { ibv_cmd_destroy_cq(&cq->ibv_cq); free(cq); return NULL; } pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE); return &cq->ibv_cq; } struct ibv_cq *hfi1_create_cq_v1(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector) { struct ibv_cq *cq; int ret; cq = malloc(sizeof *cq); if (!cq) return NULL; ret = ibv_cmd_create_cq(context, cqe, channel, comp_vector, cq, NULL, 0, NULL, 0); if (ret) { free(cq); return NULL; } return cq; } int hfi1_resize_cq(struct ibv_cq *ibcq, int cqe) { struct hfi1_cq *cq = to_icq(ibcq); struct ibv_resize_cq cmd; struct hfi1_resize_cq_resp resp; size_t size; int ret; memset(&resp, 0, sizeof(resp)); pthread_spin_lock(&cq->lock); /* Save the old size so we can unmmap the queue. */ size = sizeof(struct hfi1_cq_wc) + (sizeof(struct hfi1_wc) * cq->ibv_cq.cqe); ret = ibv_cmd_resize_cq(ibcq, cqe, &cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (ret) { pthread_spin_unlock(&cq->lock); return ret; } (void) munmap(cq->queue, size); size = sizeof(struct hfi1_cq_wc) + (sizeof(struct hfi1_wc) * cq->ibv_cq.cqe); cq->queue = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, ibcq->context->cmd_fd, resp.offset); ret = errno; pthread_spin_unlock(&cq->lock); if ((void *) cq->queue == MAP_FAILED) return ret; return 0; } int hfi1_resize_cq_v1(struct ibv_cq *ibcq, int cqe) { struct ibv_resize_cq cmd; struct ib_uverbs_resize_cq_resp resp; return ibv_cmd_resize_cq(ibcq, cqe, &cmd, sizeof cmd, &resp, sizeof resp); } int hfi1_destroy_cq(struct ibv_cq *ibcq) { struct hfi1_cq *cq = to_icq(ibcq); int ret; ret = ibv_cmd_destroy_cq(ibcq); if (ret) return ret; (void) munmap(cq->queue, sizeof(struct hfi1_cq_wc) + (sizeof(struct hfi1_wc) * cq->ibv_cq.cqe)); free(cq); return 0; } int hfi1_destroy_cq_v1(struct ibv_cq *ibcq) { int ret; ret = ibv_cmd_destroy_cq(ibcq); if (!ret) free(ibcq); return ret; } int hfi1_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc) { struct hfi1_cq *cq = to_icq(ibcq); struct hfi1_cq_wc *q; int npolled; uint32_t tail; pthread_spin_lock(&cq->lock); q = cq->queue; tail = atomic_load_explicit(&q->tail, memory_order_relaxed); for (npolled = 0; npolled < ne; ++npolled, ++wc) { if (tail == atomic_load(&q->head)) break; /* Make sure entry is read after head index is read. 
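 * The acquire fence below pairs with the producer's release store of
 * the head index, so the entry copied by memcpy() is guaranteed to be
 * at least as fresh as the head value that made it visible.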
*/ atomic_thread_fence(memory_order_acquire); memcpy(wc, &q->queue[tail], sizeof(*wc)); if (tail == cq->ibv_cq.cqe) tail = 0; else tail++; } atomic_store(&q->tail, tail); pthread_spin_unlock(&cq->lock); return npolled; } struct ibv_qp *hfi1_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct ibv_create_qp cmd; struct hfi1_create_qp_resp resp; struct hfi1_qp *qp; int ret; size_t size; memset(&resp, 0, sizeof(resp)); qp = malloc(sizeof *qp); if (!qp) return NULL; ret = ibv_cmd_create_qp(pd, &qp->ibv_qp, attr, &cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (ret) { free(qp); return NULL; } if (attr->srq) { qp->rq.size = 0; qp->rq.max_sge = 0; qp->rq.rwq = NULL; } else { qp->rq.size = attr->cap.max_recv_wr + 1; qp->rq.max_sge = attr->cap.max_recv_sge; size = sizeof(struct hfi1_rwq) + (sizeof(struct hfi1_rwqe) + (sizeof(struct ibv_sge) * qp->rq.max_sge)) * qp->rq.size; qp->rq.rwq = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.offset); if ((void *) qp->rq.rwq == MAP_FAILED) { ibv_cmd_destroy_qp(&qp->ibv_qp); free(qp); return NULL; } } pthread_spin_init(&qp->rq.lock, PTHREAD_PROCESS_PRIVATE); return &qp->ibv_qp; } struct ibv_qp *hfi1_create_qp_v1(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct ibv_create_qp cmd; struct ib_uverbs_create_qp_resp resp; struct ibv_qp *qp; int ret; qp = malloc(sizeof *qp); if (!qp) return NULL; ret = ibv_cmd_create_qp(pd, qp, attr, &cmd, sizeof cmd, &resp, sizeof resp); if (ret) { free(qp); return NULL; } return qp; } int hfi1_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd; return ibv_cmd_query_qp(qp, attr, attr_mask, init_attr, &cmd, sizeof cmd); } int hfi1_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask) { struct ibv_modify_qp cmd = {}; return ibv_cmd_modify_qp(qp, attr, attr_mask, &cmd, sizeof cmd); } int hfi1_destroy_qp(struct ibv_qp *ibqp) { struct hfi1_qp *qp = to_iqp(ibqp); int ret; ret = ibv_cmd_destroy_qp(ibqp); if (ret) return ret; if (qp->rq.rwq) { size_t size; size = sizeof(struct hfi1_rwq) + (sizeof(struct hfi1_rwqe) + (sizeof(struct ibv_sge) * qp->rq.max_sge)) * qp->rq.size; (void) munmap(qp->rq.rwq, size); } free(qp); return 0; } int hfi1_destroy_qp_v1(struct ibv_qp *ibqp) { int ret; ret = ibv_cmd_destroy_qp(ibqp); if (!ret) free(ibqp); return ret; } int hfi1_post_send(struct ibv_qp *qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { unsigned wr_count; struct ibv_send_wr *i; /* Sanity check the number of WRs being posted */ for (i = wr, wr_count = 0; i; i = i->next) if (++wr_count > 10) goto iter; return ibv_cmd_post_send(qp, wr, bad_wr); iter: do { struct ibv_send_wr *next; int ret; next = i->next; i->next = NULL; ret = ibv_cmd_post_send(qp, wr, bad_wr); i->next = next; if (ret) return ret; if (next == NULL) break; wr = next; for (i = wr, wr_count = 0; i->next; i = i->next) if (++wr_count > 2) break; } while (1); return 0; } static int post_recv(struct hfi1_rq *rq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct ibv_recv_wr *i; struct hfi1_rwq *rwq; struct hfi1_rwqe *wqe; uint32_t head; int n, ret; pthread_spin_lock(&rq->lock); rwq = rq->rwq; head = atomic_load_explicit(&rwq->head, memory_order_relaxed); for (i = wr; i; i = i->next) { if ((unsigned) i->num_sge > rq->max_sge) { ret = EINVAL; goto bad; } wqe = get_rwqe_ptr(rq, head); if (++head >= rq->size) head = 0; if (head == atomic_load(&rwq->tail)) { ret = ENOMEM; goto bad; } wqe->wr_id = i->wr_id; wqe->num_sge 
= i->num_sge; for (n = 0; n < wqe->num_sge; n++) wqe->sg_list[n] = i->sg_list[n]; /* Make sure queue entry is written before the head index. */ atomic_thread_fence(memory_order_release); atomic_store(&rwq->head, head); } ret = 0; goto done; bad: if (bad_wr) *bad_wr = i; done: pthread_spin_unlock(&rq->lock); return ret; } int hfi1_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct hfi1_qp *qp = to_iqp(ibqp); return post_recv(&qp->rq, wr, bad_wr); } struct ibv_srq *hfi1_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr) { struct hfi1_srq *srq; struct ibv_create_srq cmd; struct hfi1_create_srq_resp resp; int ret; size_t size; memset(&resp, 0, sizeof(resp)); srq = malloc(sizeof *srq); if (srq == NULL) return NULL; ret = ibv_cmd_create_srq(pd, &srq->ibv_srq, attr, &cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (ret) { free(srq); return NULL; } srq->rq.size = attr->attr.max_wr + 1; srq->rq.max_sge = attr->attr.max_sge; size = sizeof(struct hfi1_rwq) + (sizeof(struct hfi1_rwqe) + (sizeof(struct ibv_sge) * srq->rq.max_sge)) * srq->rq.size; srq->rq.rwq = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.offset); if ((void *) srq->rq.rwq == MAP_FAILED) { ibv_cmd_destroy_srq(&srq->ibv_srq); free(srq); return NULL; } pthread_spin_init(&srq->rq.lock, PTHREAD_PROCESS_PRIVATE); return &srq->ibv_srq; } struct ibv_srq *hfi1_create_srq_v1(struct ibv_pd *pd, struct ibv_srq_init_attr *attr) { struct ibv_srq *srq; struct ibv_create_srq cmd; struct ib_uverbs_create_srq_resp resp; int ret; srq = malloc(sizeof *srq); if (srq == NULL) return NULL; ret = ibv_cmd_create_srq(pd, srq, attr, &cmd, sizeof cmd, &resp, sizeof resp); if (ret) { free(srq); return NULL; } return srq; } int hfi1_modify_srq(struct ibv_srq *ibsrq, struct ibv_srq_attr *attr, int attr_mask) { struct hfi1_srq *srq = to_isrq(ibsrq); struct hfi1_modify_srq_cmd cmd; __u64 offset; size_t size = 0; /* Shut up gcc */ int ret; if (attr_mask & IBV_SRQ_MAX_WR) { pthread_spin_lock(&srq->rq.lock); /* Save the old size so we can unmmap the queue. */ size = sizeof(struct hfi1_rwq) + (sizeof(struct hfi1_rwqe) + (sizeof(struct ibv_sge) * srq->rq.max_sge)) * srq->rq.size; } cmd.offset_addr = (uintptr_t) &offset; ret = ibv_cmd_modify_srq(ibsrq, attr, attr_mask, &cmd.ibv_cmd, sizeof cmd); if (ret) { if (attr_mask & IBV_SRQ_MAX_WR) pthread_spin_unlock(&srq->rq.lock); return ret; } if (attr_mask & IBV_SRQ_MAX_WR) { (void) munmap(srq->rq.rwq, size); srq->rq.size = attr->max_wr + 1; size = sizeof(struct hfi1_rwq) + (sizeof(struct hfi1_rwqe) + (sizeof(struct ibv_sge) * srq->rq.max_sge)) * srq->rq.size; srq->rq.rwq = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, ibsrq->context->cmd_fd, offset); pthread_spin_unlock(&srq->rq.lock); /* XXX Now we have no receive queue. 
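 * The old ring was already unmapped above, so on mmap() failure
 * srq->rq.rwq is left equal to MAP_FAILED and the SRQ can post no
 * further receives; only errno is reported back to the caller.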
*/ if ((void *) srq->rq.rwq == MAP_FAILED) return errno; } return 0; } int hfi1_modify_srq_v1(struct ibv_srq *ibsrq, struct ibv_srq_attr *attr, int attr_mask) { struct ibv_modify_srq cmd; return ibv_cmd_modify_srq(ibsrq, attr, attr_mask, &cmd, sizeof cmd); } int hfi1_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr) { struct ibv_query_srq cmd; return ibv_cmd_query_srq(srq, attr, &cmd, sizeof cmd); } int hfi1_destroy_srq(struct ibv_srq *ibsrq) { struct hfi1_srq *srq = to_isrq(ibsrq); size_t size; int ret; ret = ibv_cmd_destroy_srq(ibsrq); if (ret) return ret; size = sizeof(struct hfi1_rwq) + (sizeof(struct hfi1_rwqe) + (sizeof(struct ibv_sge) * srq->rq.max_sge)) * srq->rq.size; (void) munmap(srq->rq.rwq, size); free(srq); return 0; } int hfi1_destroy_srq_v1(struct ibv_srq *ibsrq) { int ret; ret = ibv_cmd_destroy_srq(ibsrq); if (!ret) free(ibsrq); return ret; } int hfi1_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct hfi1_srq *srq = to_isrq(ibsrq); return post_recv(&srq->rq, wr, bad_wr); } struct ibv_ah *hfi1_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr) { struct ibv_ah *ah; struct ib_uverbs_create_ah_resp resp; ah = malloc(sizeof *ah); if (ah == NULL) return NULL; memset(&resp, 0, sizeof(resp)); if (ibv_cmd_create_ah(pd, ah, attr, &resp, sizeof(resp))) { free(ah); return NULL; } return ah; } int hfi1_destroy_ah(struct ibv_ah *ah) { int ret; ret = ibv_cmd_destroy_ah(ah); if (ret) return ret; free(ah); return 0; } rdma-core-56.1/providers/hns/000077500000000000000000000000001477342711600161375ustar00rootroot00000000000000rdma-core-56.1/providers/hns/CMakeLists.txt000066400000000000000000000004101477342711600206720ustar00rootroot00000000000000rdma_shared_provider(hns libhns.map 1 1.0.${PACKAGE_VERSION} hns_roce_u.c hns_roce_u_buf.c hns_roce_u_db.c hns_roce_u_hw_v2.c hns_roce_u_verbs.c ) publish_headers(infiniband hnsdv.h ) rdma_pkg_config("hns" "libibverbs" "${CMAKE_THREAD_LIBS_INIT}") rdma-core-56.1/providers/hns/hns_roce_u.c000066400000000000000000000213601477342711600204310ustar00rootroot00000000000000/* * Copyright (c) 2016-2017 Hisilicon Limited. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include "hns_roce_u.h" static void hns_roce_free_context(struct ibv_context *ibctx); #ifndef PCI_VENDOR_ID_HUAWEI #define PCI_VENDOR_ID_HUAWEI 0x19E5 #endif static const struct verbs_match_ent hca_table[] = { VERBS_PCI_MATCH(PCI_VENDOR_ID_HUAWEI, 0xA222, &hns_roce_u_hw_v2), VERBS_PCI_MATCH(PCI_VENDOR_ID_HUAWEI, 0xA223, &hns_roce_u_hw_v2), VERBS_PCI_MATCH(PCI_VENDOR_ID_HUAWEI, 0xA224, &hns_roce_u_hw_v2), VERBS_PCI_MATCH(PCI_VENDOR_ID_HUAWEI, 0xA225, &hns_roce_u_hw_v2), VERBS_PCI_MATCH(PCI_VENDOR_ID_HUAWEI, 0xA226, &hns_roce_u_hw_v2), VERBS_PCI_MATCH(PCI_VENDOR_ID_HUAWEI, 0xA227, &hns_roce_u_hw_v2), VERBS_PCI_MATCH(PCI_VENDOR_ID_HUAWEI, 0xA228, &hns_roce_u_hw_v2), VERBS_PCI_MATCH(PCI_VENDOR_ID_HUAWEI, 0xA22F, &hns_roce_u_hw_v2), {} }; static const struct verbs_context_ops hns_common_ops = { .alloc_mw = hns_roce_u_alloc_mw, .alloc_pd = hns_roce_u_alloc_pd, .bind_mw = hns_roce_u_bind_mw, .cq_event = hns_roce_u_cq_event, .create_cq = hns_roce_u_create_cq, .create_cq_ex = hns_roce_u_create_cq_ex, .create_qp = hns_roce_u_create_qp, .create_qp_ex = hns_roce_u_create_qp_ex, .dealloc_mw = hns_roce_u_dealloc_mw, .dealloc_pd = hns_roce_u_dealloc_pd, .dereg_mr = hns_roce_u_dereg_mr, .destroy_cq = hns_roce_u_destroy_cq, .modify_cq = hns_roce_u_modify_cq, .query_device_ex = hns_roce_u_query_device, .query_port = hns_roce_u_query_port, .query_qp = hns_roce_u_query_qp, .reg_mr = hns_roce_u_reg_mr, .rereg_mr = hns_roce_u_rereg_mr, .create_srq = hns_roce_u_create_srq, .create_srq_ex = hns_roce_u_create_srq_ex, .modify_srq = hns_roce_u_modify_srq, .query_srq = hns_roce_u_query_srq, .destroy_srq = hns_roce_u_destroy_srq, .free_context = hns_roce_free_context, .create_ah = hns_roce_u_create_ah, .destroy_ah = hns_roce_u_destroy_ah, .open_xrcd = hns_roce_u_open_xrcd, .close_xrcd = hns_roce_u_close_xrcd, .open_qp = hns_roce_u_open_qp, .get_srq_num = hns_roce_u_get_srq_num, .alloc_td = hns_roce_u_alloc_td, .dealloc_td = hns_roce_u_dealloc_td, .alloc_parent_domain = hns_roce_u_alloc_pad, }; static uint32_t calc_table_shift(uint32_t entry_count, uint32_t size_shift) { uint32_t count_shift = hr_ilog32(entry_count); return count_shift > size_shift ? 
count_shift - size_shift : 0; } static int set_context_attr(struct hns_roce_device *hr_dev, struct hns_roce_context *context, struct hns_roce_alloc_ucontext_resp *resp) { struct ibv_device_attr dev_attrs; int i; if (!resp->cqe_size) context->cqe_size = HNS_ROCE_CQE_SIZE; else if (resp->cqe_size <= HNS_ROCE_V3_CQE_SIZE) context->cqe_size = resp->cqe_size; else context->cqe_size = HNS_ROCE_V3_CQE_SIZE; context->config = resp->config; if (resp->config & HNS_ROCE_RSP_EXSGE_FLAGS) context->max_inline_data = resp->max_inline_data; context->qp_table_shift = calc_table_shift(resp->qp_tab_size, HNS_ROCE_QP_TABLE_BITS); context->qp_table_mask = (1 << context->qp_table_shift) - 1; for (i = 0; i < HNS_ROCE_QP_TABLE_SIZE; ++i) context->qp_table[i].refcnt = 0; context->srq_table_shift = calc_table_shift(resp->srq_tab_size, HNS_ROCE_SRQ_TABLE_BITS); context->srq_table_mask = (1 << context->srq_table_shift) - 1; for (i = 0; i < HNS_ROCE_SRQ_TABLE_SIZE; ++i) context->srq_table[i].refcnt = 0; if (hns_roce_u_query_device(&context->ibv_ctx.context, NULL, container_of(&dev_attrs, struct ibv_device_attr_ex, orig_attr), sizeof(dev_attrs))) return EIO; hr_dev->hw_version = dev_attrs.hw_ver; hr_dev->congest_cap = resp->congest_type; context->max_qp_wr = dev_attrs.max_qp_wr; context->max_sge = dev_attrs.max_sge; context->max_cqe = dev_attrs.max_cqe; context->max_srq_wr = dev_attrs.max_srq_wr; context->max_srq_sge = dev_attrs.max_srq_sge; return 0; } static int hns_roce_init_context_lock(struct hns_roce_context *context) { int ret; ret = pthread_spin_init(&context->uar_lock, PTHREAD_PROCESS_PRIVATE); if (ret) return ret; ret = pthread_mutex_init(&context->qp_table_mutex, NULL); if (ret) goto destroy_uar_lock; ret = pthread_mutex_init(&context->srq_table_mutex, NULL); if (ret) goto destroy_qp_mutex; ret = pthread_mutex_init(&context->db_list_mutex, NULL); if (ret) goto destroy_srq_mutex; return 0; destroy_srq_mutex: pthread_mutex_destroy(&context->srq_table_mutex); destroy_qp_mutex: pthread_mutex_destroy(&context->qp_table_mutex); destroy_uar_lock: pthread_spin_destroy(&context->uar_lock); return ret; } static void hns_roce_destroy_context_lock(struct hns_roce_context *context) { pthread_spin_destroy(&context->uar_lock); pthread_mutex_destroy(&context->qp_table_mutex); pthread_mutex_destroy(&context->srq_table_mutex); pthread_mutex_destroy(&context->db_list_mutex); } static struct verbs_context *hns_roce_alloc_context(struct ibv_device *ibdev, int cmd_fd, void *private_data) { struct hns_roce_device *hr_dev = to_hr_dev(ibdev); struct hns_roce_alloc_ucontext_resp resp = {}; struct hns_roce_alloc_ucontext cmd = {}; struct hns_roce_context *context; context = verbs_init_and_alloc_context(ibdev, cmd_fd, context, ibv_ctx, RDMA_DRIVER_HNS); if (!context) return NULL; cmd.config |= HNS_ROCE_EXSGE_FLAGS | HNS_ROCE_RQ_INLINE_FLAGS | HNS_ROCE_CQE_INLINE_FLAGS; if (ibv_cmd_get_context(&context->ibv_ctx, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) goto err_ibv_cmd; if (hns_roce_init_context_lock(context)) goto err_ibv_cmd; if (set_context_attr(hr_dev, context, &resp)) goto err_set_attr; context->uar = mmap(NULL, hr_dev->page_size, PROT_READ | PROT_WRITE, MAP_SHARED, cmd_fd, 0); if (context->uar == MAP_FAILED) { verbs_err(&context->ibv_ctx, "failed to mmap uar page.\n"); goto err_set_attr; } verbs_set_ops(&context->ibv_ctx, &hns_common_ops); verbs_set_ops(&context->ibv_ctx, &hr_dev->u_hw->hw_ops); return &context->ibv_ctx; err_set_attr: hns_roce_destroy_context_lock(context); err_ibv_cmd: 
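	/*
	 * Error unwind: the spinlock and mutexes are either not yet
	 * initialized or were destroyed on the err_set_attr path above,
	 * so all that remains here is releasing the verbs context state
	 * and the context allocation itself.
	 */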
verbs_uninit_context(&context->ibv_ctx); free(context); return NULL; } static void hns_roce_free_context(struct ibv_context *ibctx) { struct hns_roce_device *hr_dev = to_hr_dev(ibctx->device); struct hns_roce_context *context = to_hr_ctx(ibctx); munmap(context->uar, hr_dev->page_size); hns_roce_destroy_context_lock(context); verbs_uninit_context(&context->ibv_ctx); free(context); } static void hns_uninit_device(struct verbs_device *verbs_device) { struct hns_roce_device *dev = to_hr_dev(&verbs_device->device); free(dev); } static struct verbs_device *hns_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct hns_roce_device *dev; dev = calloc(1, sizeof(*dev)); if (!dev) return NULL; dev->u_hw = sysfs_dev->match->driver_data; dev->hw_version = dev->u_hw->hw_version; dev->page_size = sysconf(_SC_PAGESIZE); return &dev->ibv_dev; } static const struct verbs_device_ops hns_roce_dev_ops = { .name = "hns", .match_min_abi_version = 0, .match_max_abi_version = INT_MAX, .match_table = hca_table, .alloc_device = hns_device_alloc, .uninit_device = hns_uninit_device, .alloc_context = hns_roce_alloc_context, }; bool is_hns_dev(struct ibv_device *device) { struct verbs_device *verbs_device = verbs_get_device(device); return verbs_device->ops == &hns_roce_dev_ops; } bool hnsdv_is_supported(struct ibv_device *device) { return is_hns_dev(device); } PROVIDER_DRIVER(hns, hns_roce_dev_ops); rdma-core-56.1/providers/hns/hns_roce_u.h000066400000000000000000000404551477342711600204440ustar00rootroot00000000000000/* * Copyright (c) 2016-2017 Hisilicon Limited. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef _HNS_ROCE_U_H #define _HNS_ROCE_U_H #include #include #include #include #include #include #include #include #include #include #include #include "hns_roce_u_abi.h" #define HNS_ROCE_HW_VER2 0x100 #define HNS_ROCE_HW_VER3 0x130 #define PFX "hns: " /* The minimum page size is 4K for hardware */ #define HNS_HW_PAGE_SHIFT 12 #define HNS_HW_PAGE_SIZE (1 << HNS_HW_PAGE_SHIFT) #define HNS_ROCE_MAX_RC_INL_INN_SZ 32 #define HNS_ROCE_MAX_UD_INL_INN_SZ 8 #define HNS_ROCE_MIN_CQE_NUM 0x40 #define HNS_ROCE_V2_MIN_WQE_NUM 0x40 #define HNS_ROCE_MIN_SRQ_WQE_NUM 1 #define HNS_ROCE_CQE_SIZE 0x20 #define HNS_ROCE_V3_CQE_SIZE 0x40 #define HNS_ROCE_SQWQE_SHIFT 6 #define HNS_ROCE_SGE_IN_WQE 2 #define HNS_ROCE_SGE_SIZE 16 #define HNS_ROCE_SGE_SHIFT 4 #define HNS_ROCE_GID_SIZE 16 #define INVALID_SGE_LENGTH 0x80000000 #define HNS_ROCE_DWQE_PAGE_SIZE 65536 #define HNS_ROCE_ADDRESS_MASK 0xFFFFFFFF #define HNS_ROCE_ADDRESS_SHIFT 32 #define roce_get_field(origin, mask, shift) \ (((le32toh(origin)) & (mask)) >> (shift)) #define roce_get_bit(origin, shift) \ roce_get_field((origin), (1ul << (shift)), (shift)) #define roce_set_field(origin, mask, shift, val) \ do { \ (origin) &= ~htole32(mask); \ (origin) |= htole32(((unsigned int)(val) << (shift)) & (mask)); \ } while (0) #define roce_set_bit(origin, shift, val) \ roce_set_field((origin), (1ul << (shift)), (shift), (val)) #define FIELD_LOC(field_type, field_h, field_l) \ field_type, field_h, \ field_l + BUILD_ASSERT_OR_ZERO(((field_h) / 32) == \ ((field_l) / 32)) #define _hr_reg_enable(ptr, field_type, field_h, field_l) \ ({ \ const field_type *_ptr = ptr; \ BUILD_ASSERT((field_h) == (field_l)); \ *((__le32 *)_ptr + (field_h) / 32) |= \ htole32(BIT((field_l) % 32)); \ }) #define hr_reg_enable(ptr, field) _hr_reg_enable(ptr, field) #define _hr_reg_clear(ptr, field_type, field_h, field_l) \ ({ \ const field_type *_ptr = ptr; \ BUILD_ASSERT((field_h) >= (field_l)); \ *((__le32 *)_ptr + (field_h) / 32) &= \ ~htole32(GENMASK((field_h) % 32, (field_l) % 32)); \ }) #define hr_reg_clear(ptr, field) _hr_reg_clear(ptr, field) #define _hr_reg_write_bool(ptr, field_type, field_h, field_l, val) \ ({ \ (val) ? 
_hr_reg_enable(ptr, field_type, field_h, field_l) : \ _hr_reg_clear(ptr, field_type, field_h, field_l);\ }) #define hr_reg_write_bool(ptr, field, val) _hr_reg_write_bool(ptr, field, val) #define _hr_reg_write(ptr, field_type, field_h, field_l, val) \ ({ \ const uint32_t _val = val; \ _hr_reg_clear(ptr, field_type, field_h, field_l); \ *((__le32 *)ptr + (field_h) / 32) |= htole32(FIELD_PREP( \ GENMASK((field_h) % 32, (field_l) % 32), _val)); \ }) #define hr_reg_write(ptr, field, val) _hr_reg_write(ptr, field, val) #define _hr_reg_read(ptr, field_type, field_h, field_l) \ ({ \ const field_type *_ptr = ptr; \ BUILD_ASSERT((field_h) >= (field_l)); \ FIELD_GET(GENMASK((field_h) % 32, (field_l) % 32), \ le32toh(*((__le32 *)_ptr + (field_h) / 32))); \ }) #define hr_reg_read(ptr, field) _hr_reg_read(ptr, field) #define HNS_ROCE_QP_TABLE_BITS 8 #define HNS_ROCE_QP_TABLE_SIZE BIT(HNS_ROCE_QP_TABLE_BITS) #define HNS_ROCE_SRQ_TABLE_BITS 8 #define HNS_ROCE_SRQ_TABLE_SIZE BIT(HNS_ROCE_SRQ_TABLE_BITS) struct hns_roce_device { struct verbs_device ibv_dev; int page_size; const struct hns_roce_u_hw *u_hw; int hw_version; uint8_t congest_cap; }; struct hns_roce_buf { void *buf; unsigned int length; }; #define BIT_CNT_PER_BYTE 8 #define BIT_CNT_PER_LONG (BIT_CNT_PER_BYTE * sizeof(unsigned long)) /* the sw doorbell type; */ enum hns_roce_db_type { HNS_ROCE_QP_TYPE_DB, HNS_ROCE_CQ_TYPE_DB, HNS_ROCE_SRQ_TYPE_DB, HNS_ROCE_DB_TYPE_NUM }; enum hns_roce_pktype { HNS_ROCE_PKTYPE_ROCE_V1, HNS_ROCE_PKTYPE_ROCE_V2_IPV6, HNS_ROCE_PKTYPE_ROCE_V2_IPV4, }; enum hns_roce_tc_map_mode { HNS_ROCE_TC_MAP_MODE_PRIO, HNS_ROCE_TC_MAP_MODE_DSCP, }; struct hns_roce_db_page { struct hns_roce_db_page *prev, *next; struct hns_roce_buf buf; unsigned int num_db; unsigned int use_cnt; unsigned long *bitmap; }; struct hns_roce_spinlock { pthread_spinlock_t lock; int need_lock; }; struct hns_roce_context { struct verbs_context ibv_ctx; void *uar; pthread_spinlock_t uar_lock; struct { struct hns_roce_qp **table; int refcnt; } qp_table[HNS_ROCE_QP_TABLE_SIZE]; pthread_mutex_t qp_table_mutex; uint32_t qp_table_shift; uint32_t qp_table_mask; struct { struct hns_roce_srq **table; int refcnt; } srq_table[HNS_ROCE_SRQ_TABLE_SIZE]; pthread_mutex_t srq_table_mutex; uint32_t srq_table_shift; uint32_t srq_table_mask; struct hns_roce_db_page *db_list[HNS_ROCE_DB_TYPE_NUM]; pthread_mutex_t db_list_mutex; unsigned int max_qp_wr; unsigned int max_sge; unsigned int max_srq_wr; unsigned int max_srq_sge; int max_cqe; unsigned int cqe_size; uint32_t config; unsigned int max_inline_data; }; struct hns_roce_td { struct ibv_td ibv_td; atomic_int refcount; }; struct hns_roce_pd { struct ibv_pd ibv_pd; unsigned int pdn; atomic_int refcount; struct hns_roce_pd *protection_domain; }; struct hns_roce_pad { struct hns_roce_pd pd; struct hns_roce_td *td; }; struct hns_roce_cq { struct verbs_cq verbs_cq; struct hns_roce_buf buf; struct hns_roce_spinlock hr_lock; unsigned int cqn; unsigned int cq_depth; unsigned int cons_index; unsigned int *db; unsigned int *arm_db; int arm_sn; unsigned long flags; unsigned int cqe_size; struct hns_roce_v2_cqe *cqe; struct ibv_pd *parent_domain; }; struct hns_roce_idx_que { struct hns_roce_buf buf; unsigned int entry_shift; unsigned long *bitmap; int bitmap_cnt; unsigned int head; unsigned int tail; }; struct hns_roce_rinl_wqe { struct ibv_sge *sg_list; unsigned int sge_cnt; }; struct hns_roce_rinl_buf { struct hns_roce_rinl_wqe *wqe_list; unsigned int wqe_cnt; }; struct hns_roce_srq { struct verbs_srq verbs_srq; struct hns_roce_idx_que 
idx_que; struct hns_roce_buf wqe_buf; struct hns_roce_spinlock hr_lock; unsigned long *wrid; unsigned int srqn; unsigned int wqe_cnt; unsigned int max_gs; unsigned int rsv_sge; unsigned int wqe_shift; unsigned int *rdb; unsigned int cap_flags; unsigned short counter; }; struct hns_roce_wq { unsigned long *wrid; struct hns_roce_spinlock hr_lock; unsigned int wqe_cnt; int max_post; unsigned int head; unsigned int tail; unsigned int max_gs; unsigned int ext_sge_cnt; unsigned int rsv_sge; unsigned int wqe_shift; unsigned int shift; /* wq size is 2^shift */ int offset; void *db_reg; }; /* record the result of sge process */ struct hns_roce_sge_info { unsigned int valid_num; /* sge length is not 0 */ unsigned int start_idx; /* start position of extend sge */ unsigned int total_len; /* total length of valid sges */ }; struct hns_roce_sge_ex { int offset; unsigned int sge_cnt; unsigned int sge_shift; }; struct hns_roce_qp { struct verbs_qp verbs_qp; struct hns_roce_buf buf; int max_inline_data; int buf_size; unsigned int sq_signal_bits; struct hns_roce_wq sq; struct hns_roce_wq rq; unsigned int *rdb; unsigned int *sdb; struct hns_roce_sge_ex ex_sge; unsigned int next_sge; int port_num; uint8_t sl; uint8_t tc_mode; uint8_t priority; unsigned int qkey; enum ibv_mtu path_mtu; struct hns_roce_rinl_buf rq_rinl_buf; unsigned long flags; int refcnt; /* specially used for XRC */ void *dwqe_page; /* specific fields for the new post send APIs */ int err; void *cur_wqe; unsigned int rb_sq_head; /* roll back sq head */ struct hns_roce_sge_info sge_info; }; struct hns_roce_av { uint8_t port; uint8_t gid_index; uint8_t hop_limit; uint32_t flowlabel; uint16_t udp_sport; uint8_t sl; uint8_t tclass; uint8_t dgid[HNS_ROCE_GID_SIZE]; uint8_t mac[ETH_ALEN]; }; struct hns_roce_ah { struct ibv_ah ibv_ah; struct hns_roce_av av; }; struct hns_roce_u_hw { uint32_t hw_version; struct verbs_context_ops hw_ops; }; /* * The entries's buffer should be aligned to a multiple of the hardware's * minimum page size. */ #define hr_hw_page_align(x) align(x, HNS_HW_PAGE_SIZE) static inline unsigned int to_hr_hem_entries_size(int count, int buf_shift) { return hr_hw_page_align(count << buf_shift); } static inline unsigned int hr_ilog32(unsigned int count) { return ilog32(count - 1); } static inline uint32_t to_hr_qp_table_index(uint32_t qpn, struct hns_roce_context *ctx) { return (qpn >> ctx->qp_table_shift) & (HNS_ROCE_QP_TABLE_SIZE - 1); } static inline uint32_t to_hr_srq_table_index(uint32_t srqn, struct hns_roce_context *ctx) { return (srqn >> ctx->srq_table_shift) & (HNS_ROCE_SRQ_TABLE_SIZE - 1); } static inline struct hns_roce_device *to_hr_dev(struct ibv_device *ibv_dev) { return container_of(ibv_dev, struct hns_roce_device, ibv_dev.device); } static inline struct hns_roce_context *to_hr_ctx(struct ibv_context *ibv_ctx) { return container_of(ibv_ctx, struct hns_roce_context, ibv_ctx.context); } static inline struct hns_roce_td *to_hr_td(struct ibv_td *ibv_td) { return container_of(ibv_td, struct hns_roce_td, ibv_td); } /* to_hr_pd always returns the real hns_roce_pd obj. */ static inline struct hns_roce_pd *to_hr_pd(struct ibv_pd *ibv_pd) { struct hns_roce_pd *pd = container_of(ibv_pd, struct hns_roce_pd, ibv_pd); if (pd->protection_domain) return pd->protection_domain; return pd; } static inline struct hns_roce_pad *to_hr_pad(struct ibv_pd *ibv_pd) { struct hns_roce_pad *pad = ibv_pd ? 
container_of(ibv_pd, struct hns_roce_pad, pd.ibv_pd) : NULL; if (pad && pad->pd.protection_domain) return pad; /* Otherwise ibv_pd isn't a parent_domain */ return NULL; } static inline struct hns_roce_cq *to_hr_cq(struct ibv_cq *ibv_cq) { return container_of(ibv_cq, struct hns_roce_cq, verbs_cq.cq); } static inline struct hns_roce_srq *to_hr_srq(struct ibv_srq *ibv_srq) { return container_of(ibv_srq, struct hns_roce_srq, verbs_srq.srq); } static inline struct hns_roce_qp *to_hr_qp(struct ibv_qp *ibv_qp) { return container_of(ibv_qp, struct hns_roce_qp, verbs_qp.qp); } static inline struct hns_roce_ah *to_hr_ah(struct ibv_ah *ibv_ah) { return container_of(ibv_ah, struct hns_roce_ah, ibv_ah); } static inline int hns_roce_spin_lock(struct hns_roce_spinlock *hr_lock) { if (hr_lock->need_lock) return pthread_spin_lock(&hr_lock->lock); return 0; } static inline int hns_roce_spin_unlock(struct hns_roce_spinlock *hr_lock) { if (hr_lock->need_lock) return pthread_spin_unlock(&hr_lock->lock); return 0; } int hns_roce_u_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); int hns_roce_u_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr); struct ibv_td *hns_roce_u_alloc_td(struct ibv_context *context, struct ibv_td_init_attr *attr); int hns_roce_u_dealloc_td(struct ibv_td *ibv_td); struct ibv_pd *hns_roce_u_alloc_pad(struct ibv_context *context, struct ibv_parent_domain_init_attr *attr); struct ibv_pd *hns_roce_u_alloc_pd(struct ibv_context *context); int hns_roce_u_dealloc_pd(struct ibv_pd *pd); struct ibv_mr *hns_roce_u_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access); int hns_roce_u_rereg_mr(struct verbs_mr *vmr, int flags, struct ibv_pd *pd, void *addr, size_t length, int access); int hns_roce_u_dereg_mr(struct verbs_mr *vmr); struct ibv_mw *hns_roce_u_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type); int hns_roce_u_dealloc_mw(struct ibv_mw *mw); int hns_roce_u_bind_mw(struct ibv_qp *qp, struct ibv_mw *mw, struct ibv_mw_bind *mw_bind); struct ibv_cq *hns_roce_u_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); struct ibv_cq_ex *hns_roce_u_create_cq_ex(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr); int hns_roce_u_modify_cq(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr); int hns_roce_u_destroy_cq(struct ibv_cq *cq); void hns_roce_u_cq_event(struct ibv_cq *cq); struct ibv_srq *hns_roce_u_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *srq_init_attr); struct ibv_srq *hns_roce_u_create_srq_ex(struct ibv_context *context, struct ibv_srq_init_attr_ex *attr); int hns_roce_u_get_srq_num(struct ibv_srq *ibv_srq, uint32_t *srq_num); int hns_roce_u_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, int srq_attr_mask); int hns_roce_u_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr); struct hns_roce_srq *hns_roce_find_srq(struct hns_roce_context *ctx, uint32_t srqn); int hns_roce_u_destroy_srq(struct ibv_srq *ibv_srq); struct ibv_qp *hns_roce_u_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); struct ibv_qp * hns_roce_u_create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_init_attr_ex); struct ibv_qp *hns_roce_u_open_qp(struct ibv_context *context, struct ibv_qp_open_attr *attr); int hns_roce_u_query_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); struct ibv_ah 
*hns_roce_u_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr); int hns_roce_u_destroy_ah(struct ibv_ah *ah); struct ibv_xrcd * hns_roce_u_open_xrcd(struct ibv_context *context, struct ibv_xrcd_init_attr *xrcd_init_attr); int hns_roce_u_close_xrcd(struct ibv_xrcd *ibv_xrcd); int hns_roce_alloc_buf(struct hns_roce_buf *buf, unsigned int size, int page_size); void hns_roce_free_buf(struct hns_roce_buf *buf); void hns_roce_qp_spinlock_destroy(struct hns_roce_qp *qp); void hns_roce_free_qp_buf(struct hns_roce_qp *qp, struct hns_roce_context *ctx); void hns_roce_init_qp_indices(struct hns_roce_qp *qp); bool is_hns_dev(struct ibv_device *device); extern const struct hns_roce_u_hw hns_roce_u_hw_v2; #endif /* _HNS_ROCE_U_H */ rdma-core-56.1/providers/hns/hns_roce_u_abi.h000066400000000000000000000053601477342711600212530ustar00rootroot00000000000000/* * Copyright (c) 2016 Hisilicon Limited. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef _HNS_ROCE_U_ABI_H #define _HNS_ROCE_U_ABI_H #include #include #include #include "hnsdv.h" DECLARE_DRV_CMD(hns_roce_alloc_pd, IB_USER_VERBS_CMD_ALLOC_PD, empty, hns_roce_ib_alloc_pd_resp); DECLARE_DRV_CMD(hns_roce_create_cq, IB_USER_VERBS_CMD_CREATE_CQ, hns_roce_ib_create_cq, hns_roce_ib_create_cq_resp); DECLARE_DRV_CMD(hns_roce_create_cq_ex, IB_USER_VERBS_EX_CMD_CREATE_CQ, hns_roce_ib_create_cq, hns_roce_ib_create_cq_resp); DECLARE_DRV_CMD(hns_roce_alloc_ucontext, IB_USER_VERBS_CMD_GET_CONTEXT, hns_roce_ib_alloc_ucontext, hns_roce_ib_alloc_ucontext_resp); DECLARE_DRV_CMD(hns_roce_create_qp, IB_USER_VERBS_CMD_CREATE_QP, hns_roce_ib_create_qp, hns_roce_ib_create_qp_resp); DECLARE_DRV_CMD(hns_roce_create_qp_ex, IB_USER_VERBS_EX_CMD_CREATE_QP, hns_roce_ib_create_qp, hns_roce_ib_create_qp_resp); DECLARE_DRV_CMD(hns_roce_create_srq, IB_USER_VERBS_CMD_CREATE_SRQ, hns_roce_ib_create_srq, hns_roce_ib_create_srq_resp); DECLARE_DRV_CMD(hns_roce_create_srq_ex, IB_USER_VERBS_CMD_CREATE_XSRQ, hns_roce_ib_create_srq, hns_roce_ib_create_srq_resp); DECLARE_DRV_CMD(hns_roce_create_ah, IB_USER_VERBS_CMD_CREATE_AH, empty, hns_roce_ib_create_ah_resp); DECLARE_DRV_CMD(hns_roce_modify_qp_ex, IB_USER_VERBS_EX_CMD_MODIFY_QP, empty, hns_roce_ib_modify_qp_resp); #endif /* _HNS_ROCE_U_ABI_H */ rdma-core-56.1/providers/hns/hns_roce_u_buf.c000066400000000000000000000037531477342711600212730ustar00rootroot00000000000000/* * Copyright (c) 2016 Hisilicon Limited. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include "hns_roce_u.h" int hns_roce_alloc_buf(struct hns_roce_buf *buf, unsigned int size, int page_size) { int ret; buf->length = align(size, page_size); buf->buf = mmap(NULL, buf->length, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); if (buf->buf == MAP_FAILED) return errno; ret = ibv_dontfork_range(buf->buf, buf->length); if (ret) munmap(buf->buf, buf->length); return ret; } void hns_roce_free_buf(struct hns_roce_buf *buf) { ibv_dofork_range(buf->buf, buf->length); munmap(buf->buf, buf->length); } rdma-core-56.1/providers/hns/hns_roce_u_db.c000066400000000000000000000103541477342711600210770ustar00rootroot00000000000000/* * Copyright (c) 2017 Hisilicon Limited. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include "hns_roce_u.h" #include "hns_roce_u_db.h" /* the sw db length, on behalf of the qp/cq/srq length from left to right */ static const unsigned int db_size[] = { [HNS_ROCE_QP_TYPE_DB] = 4, [HNS_ROCE_CQ_TYPE_DB] = 4, [HNS_ROCE_SRQ_TYPE_DB] = 4, }; static struct hns_roce_db_page *hns_roce_add_db_page( struct hns_roce_context *ctx, enum hns_roce_db_type type) { struct hns_roce_db_page *page; int page_size; page_size = to_hr_dev(ctx->ibv_ctx.context.device)->page_size; page = calloc(1, sizeof(*page)); if (!page) goto err_page; /* allocate bitmap space for sw db and init all bitmap to 1 */ page->num_db = page_size / db_size[type]; page->use_cnt = 0; page->bitmap = bitmap_alloc1(page->num_db); if (!page->bitmap) goto err_map; if (hns_roce_alloc_buf(&(page->buf), page_size, page_size)) goto err; /* add the set ctx->db_list */ page->prev = NULL; page->next = ctx->db_list[type]; ctx->db_list[type] = page; if (page->next) page->next->prev = page; return page; err: free(page->bitmap); err_map: free(page); err_page: return NULL; } static void hns_roce_clear_db_page(struct hns_roce_db_page *page) { free(page->bitmap); hns_roce_free_buf(&(page->buf)); } void *hns_roce_alloc_db(struct hns_roce_context *ctx, enum hns_roce_db_type type) { struct hns_roce_db_page *page; void *db = NULL; uint32_t npos; pthread_mutex_lock((pthread_mutex_t *)&ctx->db_list_mutex); for (page = ctx->db_list[type]; page != NULL; page = page->next) if (page->use_cnt < page->num_db) goto found; page = hns_roce_add_db_page(ctx, type); if (!page) goto out; found: ++page->use_cnt; npos = bitmap_find_first_bit(page->bitmap, 0, page->num_db); bitmap_clear_bit(page->bitmap, npos); db = page->buf.buf + npos * db_size[type]; out: pthread_mutex_unlock((pthread_mutex_t *)&ctx->db_list_mutex); if (db) *((unsigned int *)db) = 0; return db; } void hns_roce_free_db(struct hns_roce_context *ctx, unsigned int *db, enum hns_roce_db_type type) { struct hns_roce_db_page *page; uint32_t npos; uint32_t page_size; pthread_mutex_lock((pthread_mutex_t *)&ctx->db_list_mutex); page_size = to_hr_dev(ctx->ibv_ctx.context.device)->page_size; for (page = ctx->db_list[type]; page != NULL; page = page->next) if (((uintptr_t)db & (~((uintptr_t)page_size - 1))) == (uintptr_t)(page->buf.buf)) goto found; goto out; found: --page->use_cnt; if 
(!page->use_cnt) { if (page->prev) page->prev->next = page->next; else ctx->db_list[type] = page->next; if (page->next) page->next->prev = page->prev; hns_roce_clear_db_page(page); free(page); goto out; } npos = ((uintptr_t)db - (uintptr_t)page->buf.buf) / db_size[type]; bitmap_set_bit(page->bitmap, npos); out: pthread_mutex_unlock((pthread_mutex_t *)&ctx->db_list_mutex); } rdma-core-56.1/providers/hns/hns_roce_u_db.h000066400000000000000000000035671477342711600211140ustar00rootroot00000000000000/* * Copyright (c) 2016 Hisilicon Limited. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef _HNS_ROCE_U_DB_H #define _HNS_ROCE_U_DB_H #include #include #include "hns_roce_u.h" #define HNS_ROCE_WORD_NUM 2 static inline void hns_roce_write64(void *dest, __le32 val[HNS_ROCE_WORD_NUM]) { mmio_write64_le(dest, *(__le64 *)val); } void *hns_roce_alloc_db(struct hns_roce_context *ctx, enum hns_roce_db_type type); void hns_roce_free_db(struct hns_roce_context *ctx, unsigned int *db, enum hns_roce_db_type type); #endif /* _HNS_ROCE_U_DB_H */ rdma-core-56.1/providers/hns/hns_roce_u_hw_v2.c000066400000000000000000002113241477342711600215370ustar00rootroot00000000000000/* * Copyright (c) 2016-2017 Hisilicon Limited. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define _GNU_SOURCE #include #include #include #include #include "hns_roce_u.h" #include "hns_roce_u_db.h" #include "hns_roce_u_hw_v2.h" #define HR_IBV_OPC_MAP(ib_key, hr_key) \ [IBV_WR_ ## ib_key] = HNS_ROCE_WQE_OP_ ## hr_key static const uint32_t hns_roce_opcode[] = { HR_IBV_OPC_MAP(RDMA_WRITE, RDMA_WRITE), HR_IBV_OPC_MAP(RDMA_WRITE_WITH_IMM, RDMA_WRITE_WITH_IMM), HR_IBV_OPC_MAP(SEND, SEND), HR_IBV_OPC_MAP(SEND_WITH_IMM, SEND_WITH_IMM), HR_IBV_OPC_MAP(RDMA_READ, RDMA_READ), HR_IBV_OPC_MAP(ATOMIC_CMP_AND_SWP, ATOMIC_COM_AND_SWAP), HR_IBV_OPC_MAP(ATOMIC_FETCH_AND_ADD, ATOMIC_FETCH_AND_ADD), HR_IBV_OPC_MAP(BIND_MW, BIND_MW_TYPE), HR_IBV_OPC_MAP(SEND_WITH_INV, SEND_WITH_INV), }; static inline uint32_t to_hr_opcode(enum ibv_wr_opcode ibv_opcode) { if (ibv_opcode >= ARRAY_SIZE(hns_roce_opcode)) return HNS_ROCE_WQE_OP_MASK; return hns_roce_opcode[ibv_opcode]; } static const unsigned int hns_roce_mtu[] = { [IBV_MTU_256] = 256, [IBV_MTU_512] = 512, [IBV_MTU_1024] = 1024, [IBV_MTU_2048] = 2048, [IBV_MTU_4096] = 4096, }; static inline unsigned int mtu_enum_to_int(enum ibv_mtu mtu) { return hns_roce_mtu[mtu]; } static void *get_send_sge_ex(struct hns_roce_qp *qp, unsigned int n); static inline void set_data_seg_v2(struct hns_roce_v2_wqe_data_seg *dseg, const struct ibv_sge *sg) { dseg->lkey = htole32(sg->lkey); dseg->addr = htole64(sg->addr); dseg->len = htole32(sg->length); } static void set_extend_atomic_seg(struct hns_roce_qp *qp, unsigned int sge_cnt, struct hns_roce_sge_info *sge_info, void *buf) { unsigned int sge_mask = qp->ex_sge.sge_cnt - 1; unsigned int i; for (i = 0; i < sge_cnt; i++, sge_info->start_idx++) memcpy(get_send_sge_ex(qp, sge_info->start_idx & sge_mask), buf + i * HNS_ROCE_SGE_SIZE, HNS_ROCE_SGE_SIZE); } static int set_atomic_seg(struct hns_roce_qp *qp, struct ibv_send_wr *wr, void *dseg, struct hns_roce_sge_info *sge_info) { struct hns_roce_wqe_atomic_seg *aseg = dseg; unsigned int data_len = sge_info->total_len; uint8_t tmp[ATOMIC_DATA_LEN_MAX] = {}; void *buf[ATOMIC_BUF_NUM_MAX]; unsigned int buf_sge_num; /* There is only one sge in atomic wr, and data_len is the data length * in the first sge */ if (is_std_atomic(data_len)) { if (wr->opcode == IBV_WR_ATOMIC_CMP_AND_SWP) { aseg->fetchadd_swap_data = htole64(wr->wr.atomic.swap); aseg->cmp_data = htole64(wr->wr.atomic.compare_add); } else { aseg->fetchadd_swap_data = htole64(wr->wr.atomic.compare_add); aseg->cmp_data = 0; } return 0; } if (!is_ext_atomic(data_len)) return EINVAL; buf_sge_num = data_len >> HNS_ROCE_SGE_SHIFT; aseg->fetchadd_swap_data = 0; aseg->cmp_data = 0; /* both ext CAS and ext FAA need 2 bufs */ if ((buf_sge_num << 1) + HNS_ROCE_SGE_IN_WQE > qp->sq.max_gs) return EINVAL; if (wr->opcode == IBV_WR_ATOMIC_CMP_AND_SWP) { buf[0] = (void *)(uintptr_t)wr->wr.atomic.swap; buf[1] = (void *)(uintptr_t)wr->wr.atomic.compare_add; } else { buf[0] = (void *)(uintptr_t)wr->wr.atomic.compare_add; buf[1] = (void *)(uintptr_t)tmp; /* HW needs all 0 SGEs */ } if (!buf[0] || !buf[1]) return EINVAL; set_extend_atomic_seg(qp, buf_sge_num, sge_info, buf[0]); set_extend_atomic_seg(qp, buf_sge_num, sge_info, buf[1]); return 0; } static enum ibv_wc_status get_wc_status(uint8_t status) { static const struct { unsigned int cqe_status; enum ibv_wc_status wc_status; } 
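	/* translation table from hardware CQE status codes to verbs
	 * wc_status values; codes with no entry fall back to
	 * IBV_WC_GENERAL_ERR in the lookup loop below
	 */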
map[] = { { HNS_ROCE_V2_CQE_SUCCESS, IBV_WC_SUCCESS }, { HNS_ROCE_V2_CQE_LOCAL_LENGTH_ERR, IBV_WC_LOC_LEN_ERR }, { HNS_ROCE_V2_CQE_LOCAL_QP_OP_ERR, IBV_WC_LOC_QP_OP_ERR }, { HNS_ROCE_V2_CQE_LOCAL_PROT_ERR, IBV_WC_LOC_PROT_ERR }, { HNS_ROCE_V2_CQE_WR_FLUSH_ERR, IBV_WC_WR_FLUSH_ERR }, { HNS_ROCE_V2_CQE_MEM_MANAGERENT_OP_ERR, IBV_WC_MW_BIND_ERR }, { HNS_ROCE_V2_CQE_BAD_RESP_ERR, IBV_WC_BAD_RESP_ERR }, { HNS_ROCE_V2_CQE_LOCAL_ACCESS_ERR, IBV_WC_LOC_ACCESS_ERR }, { HNS_ROCE_V2_CQE_REMOTE_INVAL_REQ_ERR, IBV_WC_REM_INV_REQ_ERR }, { HNS_ROCE_V2_CQE_REMOTE_ACCESS_ERR, IBV_WC_REM_ACCESS_ERR }, { HNS_ROCE_V2_CQE_REMOTE_OP_ERR, IBV_WC_REM_OP_ERR }, { HNS_ROCE_V2_CQE_TRANSPORT_RETRY_EXC_ERR, IBV_WC_RETRY_EXC_ERR }, { HNS_ROCE_V2_CQE_RNR_RETRY_EXC_ERR, IBV_WC_RNR_RETRY_EXC_ERR }, { HNS_ROCE_V2_CQE_REMOTE_ABORTED_ERR, IBV_WC_REM_ABORT_ERR }, { HNS_ROCE_V2_CQE_GENERAL_ERR, IBV_WC_GENERAL_ERR }, { HNS_ROCE_V2_CQE_XRC_VIOLATION_ERR, IBV_WC_REM_INV_RD_REQ_ERR }, }; for (int i = 0; i < ARRAY_SIZE(map); i++) { if (status == map[i].cqe_status) return map[i].wc_status; } return IBV_WC_GENERAL_ERR; } static struct hns_roce_v2_cqe *get_cqe_v2(struct hns_roce_cq *cq, int entry) { return cq->buf.buf + entry * cq->cqe_size; } static void *get_sw_cqe_v2(struct hns_roce_cq *cq, int n) { struct hns_roce_v2_cqe *cqe = get_cqe_v2(cq, n & cq->verbs_cq.cq.cqe); return (hr_reg_read(cqe, CQE_OWNER) ^ !!(n & (cq->verbs_cq.cq.cqe + 1))) ? cqe : NULL; } static struct hns_roce_v2_cqe *next_cqe_sw_v2(struct hns_roce_cq *cq) { return get_sw_cqe_v2(cq, cq->cons_index); } static void *get_recv_wqe_v2(struct hns_roce_qp *qp, unsigned int n) { return qp->buf.buf + qp->rq.offset + (n << qp->rq.wqe_shift); } static void *get_send_wqe(struct hns_roce_qp *qp, unsigned int n) { return qp->buf.buf + qp->sq.offset + (n << qp->sq.wqe_shift); } static void *get_send_sge_ex(struct hns_roce_qp *qp, unsigned int n) { return qp->buf.buf + qp->ex_sge.offset + (n << qp->ex_sge.sge_shift); } static void *get_srq_wqe(struct hns_roce_srq *srq, unsigned int n) { return srq->wqe_buf.buf + (n << srq->wqe_shift); } static void *get_idx_buf(struct hns_roce_idx_que *idx_que, unsigned int n) { return idx_que->buf.buf + (n << idx_que->entry_shift); } static void hns_roce_free_srq_wqe(struct hns_roce_srq *srq, uint16_t ind) { uint32_t bitmap_num; int bit_num; hns_roce_spin_lock(&srq->hr_lock); bitmap_num = ind / BIT_CNT_PER_LONG; bit_num = ind % BIT_CNT_PER_LONG; srq->idx_que.bitmap[bitmap_num] |= (1ULL << bit_num); srq->idx_que.tail++; hns_roce_spin_unlock(&srq->hr_lock); } static int get_srq_from_cqe(struct hns_roce_v2_cqe *cqe, struct hns_roce_context *ctx, struct hns_roce_qp *hr_qp, struct hns_roce_srq **srq) { uint32_t srqn; if (hr_qp->verbs_qp.qp.qp_type == IBV_QPT_XRC_RECV) { srqn = hr_reg_read(cqe, CQE_XRC_SRQN); *srq = hns_roce_find_srq(ctx, srqn); if (!*srq) return EINVAL; } else if (hr_qp->verbs_qp.qp.srq) { *srq = to_hr_srq(hr_qp->verbs_qp.qp.srq); } return 0; } static int hns_roce_v2_wq_overflow(struct hns_roce_wq *wq, unsigned int nreq, struct hns_roce_cq *cq) { unsigned int cur; cur = wq->head - wq->tail; if (cur + nreq < wq->max_post) return 0; hns_roce_spin_lock(&cq->hr_lock); cur = wq->head - wq->tail; hns_roce_spin_unlock(&cq->hr_lock); return cur + nreq >= wq->max_post; } static void hns_roce_update_rq_db(struct hns_roce_context *ctx, unsigned int qpn, unsigned int rq_head) { struct hns_roce_db rq_db = {}; hr_reg_write(&rq_db, DB_TAG, qpn); hr_reg_write(&rq_db, DB_CMD, HNS_ROCE_V2_RQ_DB); hr_reg_write(&rq_db, DB_PI, rq_head); 
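	/*
	 * Publish the new producer index with a single 64-bit MMIO write
	 * to the doorbell register in the UAR page (hns_roce_write64()
	 * wraps mmio_write64_le()), so the hardware observes all doorbell
	 * fields atomically.
	 */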
hns_roce_write64(ctx->uar + ROCEE_VF_DB_CFG0_OFFSET, (__le32 *)&rq_db); } static void hns_roce_update_sq_db(struct hns_roce_context *ctx, struct hns_roce_qp *qp) { struct hns_roce_db sq_db = {}; hr_reg_write(&sq_db, DB_TAG, qp->verbs_qp.qp.qp_num); hr_reg_write(&sq_db, DB_CMD, HNS_ROCE_V2_SQ_DB); hr_reg_write(&sq_db, DB_PI, qp->sq.head); hr_reg_write(&sq_db, DB_SL, qp->sl); hns_roce_write64(qp->sq.db_reg, (__le32 *)&sq_db); } static void hns_roce_write512(uint64_t *dest, uint64_t *val) { mmio_memcpy_x64(dest, val, sizeof(struct hns_roce_rc_sq_wqe)); } static void hns_roce_write_dwqe(struct hns_roce_qp *qp, void *wqe) { struct hns_roce_rc_sq_wqe *rc_sq_wqe = wqe; /* All kinds of DirectWQE have the same header field layout */ hr_reg_enable(rc_sq_wqe, RCWQE_FLAG); hr_reg_write(rc_sq_wqe, RCWQE_DB_SL_L, qp->sl); hr_reg_write(rc_sq_wqe, RCWQE_DB_SL_H, qp->sl >> HNS_ROCE_SL_SHIFT); hr_reg_write(rc_sq_wqe, RCWQE_WQE_IDX, qp->sq.head); hns_roce_write512(qp->sq.db_reg, wqe); } static void update_cq_db(struct hns_roce_context *ctx, struct hns_roce_cq *cq) { struct hns_roce_db cq_db = {}; hr_reg_write(&cq_db, DB_TAG, cq->cqn); hr_reg_write(&cq_db, DB_CMD, HNS_ROCE_V2_CQ_DB_PTR); hr_reg_write(&cq_db, DB_CQ_CI, cq->cons_index); hr_reg_write(&cq_db, DB_CQ_CMD_SN, 1); hns_roce_write64(ctx->uar + ROCEE_VF_DB_CFG0_OFFSET, (__le32 *)&cq_db); } static struct hns_roce_qp *hns_roce_v2_find_qp(struct hns_roce_context *ctx, uint32_t qpn) { uint32_t tind = to_hr_qp_table_index(qpn, ctx); if (ctx->qp_table[tind].refcnt) return ctx->qp_table[tind].table[qpn & ctx->qp_table_mask]; else return NULL; } void hns_roce_v2_clear_qp(struct hns_roce_context *ctx, struct hns_roce_qp *qp) { uint32_t qpn = qp->verbs_qp.qp.qp_num; uint32_t tind = to_hr_qp_table_index(qpn, ctx); pthread_mutex_lock(&ctx->qp_table_mutex); if (!--ctx->qp_table[tind].refcnt) free(ctx->qp_table[tind].table); else if (!--qp->refcnt) ctx->qp_table[tind].table[qpn & ctx->qp_table_mask] = NULL; pthread_mutex_unlock(&ctx->qp_table_mutex); } static int hns_roce_u_v2_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask); static int hns_roce_flush_cqe(struct hns_roce_qp *hr_qp, uint8_t status) { struct ibv_qp_attr attr = {}; int attr_mask; if (status != HNS_ROCE_V2_CQE_WR_FLUSH_ERR) { attr_mask = IBV_QP_STATE; attr.qp_state = IBV_QPS_ERR; hns_roce_u_v2_modify_qp(&hr_qp->verbs_qp.qp, &attr, attr_mask); hr_qp->verbs_qp.qp.state = IBV_QPS_ERR; } return V2_CQ_OK; } static const unsigned int wc_send_op_map[] = { [HNS_ROCE_SQ_OP_SEND] = IBV_WC_SEND, [HNS_ROCE_SQ_OP_SEND_WITH_INV] = IBV_WC_SEND, [HNS_ROCE_SQ_OP_SEND_WITH_IMM] = IBV_WC_SEND, [HNS_ROCE_SQ_OP_RDMA_WRITE] = IBV_WC_RDMA_WRITE, [HNS_ROCE_SQ_OP_RDMA_WRITE_WITH_IMM] = IBV_WC_RDMA_WRITE, [HNS_ROCE_SQ_OP_RDMA_READ] = IBV_WC_RDMA_READ, [HNS_ROCE_SQ_OP_ATOMIC_COMP_AND_SWAP] = IBV_WC_COMP_SWAP, [HNS_ROCE_SQ_OP_ATOMIC_FETCH_AND_ADD] = IBV_WC_FETCH_ADD, [HNS_ROCE_SQ_OP_BIND_MW] = IBV_WC_BIND_MW, }; static const unsigned int wc_rcv_op_map[] = { [HNS_ROCE_RECV_OP_RDMA_WRITE_IMM] = IBV_WC_RECV_RDMA_WITH_IMM, [HNS_ROCE_RECV_OP_SEND] = IBV_WC_RECV, [HNS_ROCE_RECV_OP_SEND_WITH_IMM] = IBV_WC_RECV, [HNS_ROCE_RECV_OP_SEND_WITH_INV] = IBV_WC_RECV, }; static void get_opcode_for_resp(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc, uint32_t opcode) { switch (opcode) { case HNS_ROCE_RECV_OP_SEND: wc->wc_flags = 0; break; case HNS_ROCE_RECV_OP_SEND_WITH_INV: wc->wc_flags = IBV_WC_WITH_INV; wc->invalidated_rkey = le32toh(cqe->rkey); break; case HNS_ROCE_RECV_OP_RDMA_WRITE_IMM: case HNS_ROCE_RECV_OP_SEND_WITH_IMM: 
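		/*
		 * Immediate data is carried little-endian in the CQE while
		 * the verbs API exposes wc->imm_data in big-endian network
		 * order, hence the le32toh()/htobe32() round trip below.
		 */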
wc->wc_flags = IBV_WC_WITH_IMM; wc->imm_data = htobe32(le32toh(cqe->immtdata)); break; default: return; } wc->opcode = wc_rcv_op_map[opcode]; } static void handle_recv_inl_data(struct hns_roce_v2_cqe *cqe, struct hns_roce_rinl_buf *rinl_buf, uint32_t wr_cnt, uint8_t *buf) { struct ibv_sge *sge_list; uint32_t sge_num, data_len; uint32_t sge_cnt, size; sge_list = rinl_buf->wqe_list[wr_cnt].sg_list; sge_num = rinl_buf->wqe_list[wr_cnt].sge_cnt; data_len = le32toh(cqe->byte_cnt); for (sge_cnt = 0; (sge_cnt < sge_num) && (data_len); sge_cnt++) { size = min(sge_list[sge_cnt].length, data_len); memcpy((void *)(uintptr_t)sge_list[sge_cnt].addr, (void *)buf, size); data_len -= size; buf += size; } if (data_len) hr_reg_write(cqe, CQE_STATUS, HNS_ROCE_V2_CQE_LOCAL_LENGTH_ERR); } static void handle_recv_cqe_inl_from_rq(struct hns_roce_v2_cqe *cqe, struct hns_roce_qp *cur_qp) { uint32_t wr_num; wr_num = hr_reg_read(cqe, CQE_WQE_IDX) & (cur_qp->rq.wqe_cnt - 1); handle_recv_inl_data(cqe, &cur_qp->rq_rinl_buf, wr_num, (uint8_t *)cqe->payload); } static void handle_recv_cqe_inl_from_srq(struct hns_roce_v2_cqe *cqe, struct hns_roce_srq *srq) { uint8_t *buf = (uint8_t *)cqe->payload; struct hns_roce_v2_wqe_data_seg *dseg; uint32_t data_len, size; uint32_t wqe_index; uint32_t cnt = 0; uint32_t max_sge; max_sge = srq->max_gs - srq->rsv_sge; wqe_index = hr_reg_read(cqe, CQE_WQE_IDX) & (srq->wqe_cnt - 1); data_len = le32toh(cqe->byte_cnt); dseg = (struct hns_roce_v2_wqe_data_seg *)get_srq_wqe(srq, wqe_index); for (; cnt < max_sge && dseg->addr && data_len; dseg++, cnt++) { size = min(le32toh(dseg->len), data_len); memcpy((void *)(uintptr_t)le64toh(dseg->addr), (void *)buf, size); data_len -= size; buf += size; } if (data_len) hr_reg_write(cqe, CQE_STATUS, HNS_ROCE_V2_CQE_LOCAL_LENGTH_ERR); } static void handle_recv_rq_inl(struct hns_roce_v2_cqe *cqe, struct hns_roce_qp *cur_qp) { uint8_t *wqe_buf; uint32_t wr_num; wr_num = hr_reg_read(cqe, CQE_WQE_IDX) & (cur_qp->rq.wqe_cnt - 1); wqe_buf = (uint8_t *)get_recv_wqe_v2(cur_qp, wr_num); handle_recv_inl_data(cqe, &cur_qp->rq_rinl_buf, wr_num, wqe_buf); } static const uint8_t pktype_for_ud[] = { HNS_ROCE_PKTYPE_ROCE_V1, HNS_ROCE_PKTYPE_ROCE_V2_IPV4, HNS_ROCE_PKTYPE_ROCE_V2_IPV6 }; static void parse_for_ud_qp(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc) { uint8_t port_type = hr_reg_read(cqe, CQE_PORT_TYPE); wc->sl = pktype_for_ud[port_type]; wc->src_qp = hr_reg_read(cqe, CQE_RMT_QPN); wc->slid = 0; wc->wc_flags |= hr_reg_read(cqe, CQE_GRH) ? 
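/* CQE_GRH is set when the received packet carried a global routing header */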
IBV_WC_GRH : 0; wc->pkey_index = 0; } static void parse_cqe_for_srq(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc, struct hns_roce_srq *srq, struct hns_roce_qp *hr_qp) { uint32_t wqe_idx; if (hr_reg_read(cqe, CQE_CQE_INLINE)) handle_recv_cqe_inl_from_srq(cqe, srq); else if (hr_qp->verbs_qp.qp.qp_type == IBV_QPT_UD) parse_for_ud_qp(cqe, wc); wqe_idx = hr_reg_read(cqe, CQE_WQE_IDX); wc->wr_id = srq->wrid[wqe_idx & (srq->wqe_cnt - 1)]; hns_roce_free_srq_wqe(srq, wqe_idx); } static void parse_cqe_for_resp(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc, struct hns_roce_qp *hr_qp) { struct hns_roce_wq *wq; wq = &hr_qp->rq; wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)]; ++wq->tail; if (hr_reg_read(cqe, CQE_CQE_INLINE)) handle_recv_cqe_inl_from_rq(cqe, hr_qp); else if (hr_reg_read(cqe, CQE_RQ_INLINE)) handle_recv_rq_inl(cqe, hr_qp); else if (hr_qp->verbs_qp.qp.qp_type == IBV_QPT_UD) parse_for_ud_qp(cqe, wc); } static void parse_cqe_for_req(struct hns_roce_v2_cqe *cqe, struct ibv_wc *wc, struct hns_roce_qp *hr_qp, uint8_t opcode) { struct hns_roce_wq *wq; uint32_t wqe_idx; wq = &hr_qp->sq; /* * in case of signalling, the tail pointer needs to be updated * according to the wqe idx in the current cqe first */ if (hr_qp->sq_signal_bits) { wqe_idx = hr_reg_read(cqe, CQE_WQE_IDX); /* get the processed wqes num since last signalling */ wq->tail += (wqe_idx - wq->tail) & (wq->wqe_cnt - 1); } /* write the wr_id of wq into the wc */ wc->wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)]; ++wq->tail; switch (opcode) { case HNS_ROCE_SQ_OP_SEND: case HNS_ROCE_SQ_OP_SEND_WITH_INV: case HNS_ROCE_SQ_OP_RDMA_WRITE: case HNS_ROCE_SQ_OP_BIND_MW: wc->wc_flags = 0; break; case HNS_ROCE_SQ_OP_SEND_WITH_IMM: case HNS_ROCE_SQ_OP_RDMA_WRITE_WITH_IMM: wc->wc_flags = IBV_WC_WITH_IMM; break; case HNS_ROCE_SQ_OP_RDMA_READ: case HNS_ROCE_SQ_OP_ATOMIC_COMP_AND_SWAP: case HNS_ROCE_SQ_OP_ATOMIC_FETCH_AND_ADD: wc->wc_flags = 0; wc->byte_len = le32toh(cqe->byte_cnt); break; default: wc->wc_flags = 0; return; } wc->opcode = wc_send_op_map[opcode]; } static void cqe_proc_sq(struct hns_roce_qp *hr_qp, uint32_t wqe_idx, struct hns_roce_cq *cq) { struct hns_roce_wq *wq = &hr_qp->sq; if (hr_qp->sq_signal_bits) wq->tail += (wqe_idx - wq->tail) & (wq->wqe_cnt - 1); cq->verbs_cq.cq_ex.wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)]; ++wq->tail; } static void cqe_proc_srq(struct hns_roce_srq *srq, uint32_t wqe_idx, struct hns_roce_cq *cq) { if (hr_reg_read(cq->cqe, CQE_CQE_INLINE)) handle_recv_cqe_inl_from_srq(cq->cqe, srq); cq->verbs_cq.cq_ex.wr_id = srq->wrid[wqe_idx & (srq->wqe_cnt - 1)]; hns_roce_free_srq_wqe(srq, wqe_idx); } static void cqe_proc_rq(struct hns_roce_qp *hr_qp, struct hns_roce_cq *cq) { struct hns_roce_wq *wq = &hr_qp->rq; cq->verbs_cq.cq_ex.wr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)]; ++wq->tail; if (hr_reg_read(cq->cqe, CQE_CQE_INLINE)) handle_recv_cqe_inl_from_rq(cq->cqe, hr_qp); else if (hr_reg_read(cq->cqe, CQE_RQ_INLINE)) handle_recv_rq_inl(cq->cqe, hr_qp); } static int cqe_proc_wq(struct hns_roce_context *ctx, struct hns_roce_qp *qp, struct hns_roce_cq *cq) { struct hns_roce_v2_cqe *cqe = cq->cqe; struct hns_roce_srq *srq = NULL; uint32_t wqe_idx; wqe_idx = hr_reg_read(cqe, CQE_WQE_IDX); if (hr_reg_read(cqe, CQE_S_R) == CQE_FOR_SQ) { cqe_proc_sq(qp, wqe_idx, cq); } else { if (get_srq_from_cqe(cqe, ctx, qp, &srq)) return V2_CQ_POLL_ERR; if (srq) cqe_proc_srq(srq, wqe_idx, cq); else cqe_proc_rq(qp, cq); } return 0; } static int parse_cqe_for_cq(struct hns_roce_context *ctx, struct hns_roce_cq *cq, struct 
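/*
 * parse_cqe_for_cq(): a NULL wc selects the extended-CQ path handled by
 * cqe_proc_wq(); otherwise the classic struct ibv_wc is filled in for
 * SQ, RQ or SRQ completions.
 */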
hns_roce_qp *cur_qp, struct ibv_wc *wc)
{
        struct hns_roce_v2_cqe *cqe = cq->cqe;
        struct hns_roce_srq *srq = NULL;
        uint8_t opcode;

        if (!wc) {
                if (cqe_proc_wq(ctx, cur_qp, cq))
                        return V2_CQ_POLL_ERR;

                return 0;
        }

        opcode = hr_reg_read(cqe, CQE_OPCODE);
        if (hr_reg_read(cqe, CQE_S_R) == CQE_FOR_SQ) {
                parse_cqe_for_req(cqe, wc, cur_qp, opcode);
        } else {
                wc->byte_len = le32toh(cqe->byte_cnt);
                get_opcode_for_resp(cqe, wc, opcode);

                if (get_srq_from_cqe(cqe, ctx, cur_qp, &srq))
                        return V2_CQ_POLL_ERR;

                if (srq)
                        parse_cqe_for_srq(cqe, wc, srq, cur_qp);
                else
                        parse_cqe_for_resp(cqe, wc, cur_qp);
        }

        return 0;
}

static int hns_roce_poll_one(struct hns_roce_context *ctx,
                             struct hns_roce_qp **cur_qp,
                             struct hns_roce_cq *cq, struct ibv_wc *wc)
{
        struct hns_roce_v2_cqe *cqe;
        uint8_t status, wc_status;
        uint32_t qpn;

        cqe = next_cqe_sw_v2(cq);
        if (!cqe)
                return wc ? V2_CQ_EMPTY : ENOENT;

        cq->cqe = cqe;
        ++cq->cons_index;

        udma_from_device_barrier();

        qpn = hr_reg_read(cqe, CQE_LCL_QPN);

        /* If cur_qp is NULL or stale, the cached pointer cannot yield the
         * correct qpn, so look the QP up again by the qpn in this CQE.
         */
        if (!*cur_qp || qpn != (*cur_qp)->verbs_qp.qp.qp_num) {
                *cur_qp = hns_roce_v2_find_qp(ctx, qpn);
                if (!*cur_qp)
                        return V2_CQ_POLL_ERR;
        }

        if (parse_cqe_for_cq(ctx, cq, *cur_qp, wc))
                return V2_CQ_POLL_ERR;

        status = hr_reg_read(cqe, CQE_STATUS);
        wc_status = get_wc_status(status);

        if (wc) {
                wc->status = wc_status;
                wc->vendor_err = hr_reg_read(cqe, CQE_SUB_STATUS);
                wc->qp_num = qpn;
        } else {
                cq->verbs_cq.cq_ex.status = wc_status;
        }

        if (status == HNS_ROCE_V2_CQE_SUCCESS ||
            status == HNS_ROCE_V2_CQE_GENERAL_ERR)
                return V2_CQ_OK;

        /*
         * Once a CQE is in error status, the driver needs to help the HW
         * generate flushed CQEs for all subsequent WQEs.
         */
        return hns_roce_flush_cqe(*cur_qp, status);
}

static int hns_roce_u_v2_poll_cq(struct ibv_cq *ibvcq, int ne,
                                 struct ibv_wc *wc)
{
        struct hns_roce_context *ctx = to_hr_ctx(ibvcq->context);
        struct hns_roce_cq *cq = to_hr_cq(ibvcq);
        struct hns_roce_qp *qp = NULL;
        int err = V2_CQ_OK;
        int npolled;

        hns_roce_spin_lock(&cq->hr_lock);

        for (npolled = 0; npolled < ne; ++npolled) {
                err = hns_roce_poll_one(ctx, &qp, cq, wc + npolled);
                if (err != V2_CQ_OK)
                        break;
        }

        if (npolled || err == V2_CQ_POLL_ERR) {
                if (cq->flags & HNS_ROCE_CQ_FLAG_RECORD_DB)
                        *cq->db = cq->cons_index & RECORD_DB_CI_MASK;
                else
                        update_cq_db(ctx, cq);
        }

        hns_roce_spin_unlock(&cq->hr_lock);

        return err == V2_CQ_POLL_ERR ? err : npolled;
}

static int hns_roce_u_v2_arm_cq(struct ibv_cq *ibvcq, int solicited)
{
        struct hns_roce_context *ctx = to_hr_ctx(ibvcq->context);
        struct hns_roce_cq *cq = to_hr_cq(ibvcq);
        struct hns_roce_db cq_db = {};
        uint32_t solicited_flag;
        uint32_t ci;

        ci = cq->cons_index & ((cq->cq_depth << 1) - 1);
        solicited_flag = solicited ?
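/* request notification for solicited completions only, or for the next one */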
HNS_ROCE_V2_CQ_DB_REQ_SOL : HNS_ROCE_V2_CQ_DB_REQ_NEXT; hr_reg_write(&cq_db, DB_TAG, cq->cqn); hr_reg_write(&cq_db, DB_CMD, HNS_ROCE_V2_CQ_DB_NTR); hr_reg_write(&cq_db, DB_CQ_CI, ci); hr_reg_write(&cq_db, DB_CQ_CMD_SN, cq->arm_sn); hr_reg_write(&cq_db, DB_CQ_NOTIFY, solicited_flag); hns_roce_write64(ctx->uar + ROCEE_VF_DB_CFG0_OFFSET, (__le32 *)&cq_db); return 0; } static inline int check_qp_send(struct ibv_qp *qp) { if (unlikely(qp->state == IBV_QPS_RESET || qp->state == IBV_QPS_INIT || qp->state == IBV_QPS_RTR)) return EINVAL; return 0; } static void set_rc_sge(struct hns_roce_v2_wqe_data_seg *dseg, struct hns_roce_qp *qp, struct ibv_send_wr *wr, struct hns_roce_sge_info *sge_info) { uint32_t mask = qp->ex_sge.sge_cnt - 1; uint32_t index = sge_info->start_idx; struct ibv_sge *sge = wr->sg_list; int total_sge = wr->num_sge; bool flag = false; uint32_t len = 0; uint32_t cnt = 0; int i; if (wr->opcode == IBV_WR_ATOMIC_FETCH_AND_ADD || wr->opcode == IBV_WR_ATOMIC_CMP_AND_SWP) total_sge = 1; else flag = !!(wr->send_flags & IBV_SEND_INLINE); for (i = 0; i < total_sge; i++, sge++) { if (unlikely(!sge->length)) continue; len += sge->length; cnt++; if (flag) continue; if (cnt <= HNS_ROCE_SGE_IN_WQE) { set_data_seg_v2(dseg, sge); dseg++; } else { dseg = get_send_sge_ex(qp, index & mask); set_data_seg_v2(dseg, sge); index++; } } sge_info->start_idx = index; sge_info->valid_num = cnt; sge_info->total_len = len; } static void set_ud_sge(struct hns_roce_v2_wqe_data_seg *dseg, struct hns_roce_qp *qp, struct ibv_send_wr *wr, struct hns_roce_sge_info *sge_info) { int flag = wr->send_flags & IBV_SEND_INLINE; uint32_t mask = qp->ex_sge.sge_cnt - 1; uint32_t index = sge_info->start_idx; struct ibv_sge *sge = wr->sg_list; uint32_t len = 0; uint32_t cnt = 0; int i; for (i = 0; i < wr->num_sge; i++, sge++) { if (unlikely(!sge->length)) continue; len += sge->length; cnt++; if (flag) continue; /* No inner sge in UD wqe */ dseg = get_send_sge_ex(qp, index & mask); set_data_seg_v2(dseg, sge); index++; } sge_info->start_idx = index; sge_info->valid_num = cnt; sge_info->total_len = len; } static void get_src_buf_info(void **src_addr, uint32_t *src_len, const void *buf_list, int buf_idx, enum hns_roce_wr_buf_type type) { if (type == WR_BUF_TYPE_POST_SEND) { const struct ibv_sge *sg_list = buf_list; *src_addr = (void *)(uintptr_t)sg_list[buf_idx].addr; *src_len = sg_list[buf_idx].length; } else { const struct ibv_data_buf *bf_list = buf_list; *src_addr = bf_list[buf_idx].addr; *src_len = bf_list[buf_idx].length; } } static int fill_ext_sge_inl_data(struct hns_roce_qp *qp, struct hns_roce_sge_info *sge_info, const void *buf_list, uint32_t num_buf, enum hns_roce_wr_buf_type buf_type) { unsigned int sge_mask = qp->ex_sge.sge_cnt - 1; void *dst_addr, *src_addr, *tail_bound_addr; uint32_t src_len, tail_len; int i; if (sge_info->total_len > qp->sq.ext_sge_cnt * HNS_ROCE_SGE_SIZE) return EINVAL; dst_addr = get_send_sge_ex(qp, sge_info->start_idx & sge_mask); tail_bound_addr = get_send_sge_ex(qp, qp->ex_sge.sge_cnt); for (i = 0; i < num_buf; i++) { tail_len = (uintptr_t)tail_bound_addr - (uintptr_t)dst_addr; get_src_buf_info(&src_addr, &src_len, buf_list, i, buf_type); if (src_len < tail_len) { memcpy(dst_addr, src_addr, src_len); dst_addr += src_len; } else if (src_len == tail_len) { memcpy(dst_addr, src_addr, src_len); dst_addr = get_send_sge_ex(qp, 0); } else { memcpy(dst_addr, src_addr, tail_len); dst_addr = get_send_sge_ex(qp, 0); src_addr += tail_len; src_len -= tail_len; memcpy(dst_addr, src_addr, src_len); dst_addr += 
src_len; } } sge_info->valid_num = DIV_ROUND_UP(sge_info->total_len, HNS_ROCE_SGE_SIZE); sge_info->start_idx += sge_info->valid_num; return 0; } static void set_ud_inl_seg(struct hns_roce_ud_sq_wqe *ud_sq_wqe, uint8_t *data) { uint32_t *loc = (uint32_t *)data; uint32_t tmp_data; hr_reg_write(ud_sq_wqe, UDWQE_INLINE_DATA_15_0, *loc & 0xffff); hr_reg_write(ud_sq_wqe, UDWQE_INLINE_DATA_23_16, (*loc >> 16) & 0xff); tmp_data = *loc >> 24; loc++; tmp_data |= ((*loc & 0xffff) << 8); hr_reg_write(ud_sq_wqe, UDWQE_INLINE_DATA_47_24, tmp_data); hr_reg_write(ud_sq_wqe, UDWQE_INLINE_DATA_63_48, *loc >> 16); } static void fill_ud_inn_inl_data(const struct ibv_send_wr *wr, struct hns_roce_ud_sq_wqe *ud_sq_wqe) { uint8_t data[HNS_ROCE_MAX_UD_INL_INN_SZ] = {}; void *tmp = data; int i; for (i = 0; i < wr->num_sge; i++) { memcpy(tmp, (void *)(uintptr_t)wr->sg_list[i].addr, wr->sg_list[i].length); tmp += wr->sg_list[i].length; } set_ud_inl_seg(ud_sq_wqe, data); } static bool check_inl_data_len(struct hns_roce_qp *qp, unsigned int len) { int mtu = mtu_enum_to_int(qp->path_mtu); return (len <= qp->max_inline_data && len <= mtu); } static int set_ud_inl(struct hns_roce_qp *qp, const struct ibv_send_wr *wr, struct hns_roce_ud_sq_wqe *ud_sq_wqe, struct hns_roce_sge_info *sge_info) { int ret; if (!check_inl_data_len(qp, sge_info->total_len)) return EINVAL; if (sge_info->total_len <= HNS_ROCE_MAX_UD_INL_INN_SZ) { hr_reg_clear(ud_sq_wqe, UDWQE_INLINE_TYPE); fill_ud_inn_inl_data(wr, ud_sq_wqe); } else { hr_reg_enable(ud_sq_wqe, UDWQE_INLINE_TYPE); ret = fill_ext_sge_inl_data(qp, sge_info, wr->sg_list, wr->num_sge, WR_BUF_TYPE_POST_SEND); if (ret) return ret; hr_reg_write(ud_sq_wqe, UDWQE_SGE_NUM, sge_info->valid_num); } return 0; } static __le32 get_immtdata(enum ibv_wr_opcode opcode, const struct ibv_send_wr *wr) { switch (opcode) { case IBV_WR_SEND_WITH_IMM: case IBV_WR_RDMA_WRITE_WITH_IMM: return htole32(be32toh(wr->imm_data)); default: return 0; } } static int check_ud_opcode(struct hns_roce_ud_sq_wqe *ud_sq_wqe, const struct ibv_send_wr *wr) { uint32_t ib_op = wr->opcode; if (ib_op != IBV_WR_SEND && ib_op != IBV_WR_SEND_WITH_IMM) return EINVAL; ud_sq_wqe->immtdata = get_immtdata(ib_op, wr); hr_reg_write(ud_sq_wqe, UDWQE_OPCODE, to_hr_opcode(ib_op)); return 0; } static int fill_ud_av(struct hns_roce_ud_sq_wqe *ud_sq_wqe, struct hns_roce_ah *ah) { if (unlikely(ah->av.sl > MAX_SERVICE_LEVEL)) return EINVAL; hr_reg_write(ud_sq_wqe, UDWQE_SL, ah->av.sl); hr_reg_write(ud_sq_wqe, UDWQE_PD, to_hr_pd(ah->ibv_ah.pd)->pdn); hr_reg_write(ud_sq_wqe, UDWQE_TCLASS, ah->av.tclass); hr_reg_write(ud_sq_wqe, UDWQE_HOPLIMIT, ah->av.hop_limit); hr_reg_write(ud_sq_wqe, UDWQE_FLOW_LABEL, ah->av.flowlabel); hr_reg_write(ud_sq_wqe, UDWQE_UDPSPN, ah->av.udp_sport); memcpy(ud_sq_wqe->dmac, ah->av.mac, ETH_ALEN); ud_sq_wqe->sgid_index = ah->av.gid_index; memcpy(ud_sq_wqe->dgid, ah->av.dgid, HNS_ROCE_GID_SIZE); return 0; } static int fill_ud_data_seg(struct hns_roce_ud_sq_wqe *ud_sq_wqe, struct hns_roce_qp *qp, struct ibv_send_wr *wr, struct hns_roce_sge_info *sge_info) { int ret = 0; hr_reg_write(ud_sq_wqe, UDWQE_MSG_START_SGE_IDX, sge_info->start_idx & (qp->ex_sge.sge_cnt - 1)); set_ud_sge((struct hns_roce_v2_wqe_data_seg *)ud_sq_wqe, qp, wr, sge_info); ud_sq_wqe->msg_len = htole32(sge_info->total_len); hr_reg_write(ud_sq_wqe, UDWQE_SGE_NUM, sge_info->valid_num); if (wr->send_flags & IBV_SEND_INLINE) ret = set_ud_inl(qp, wr, ud_sq_wqe, sge_info); return ret; } static inline void enable_wqe(struct hns_roce_qp *qp, void *sq_wqe, unsigned 
int index)
{
        struct hns_roce_rc_sq_wqe *wqe = sq_wqe;

        /*
         * The pipeline can sequentially post all valid WQEs in the wq buf,
         * including new WQEs still waiting for a doorbell to update the PI.
         * Therefore, the valid bit of a WQE MUST be updated only after all of
         * its fields and extended SGEs have been written to DDR, not merely
         * to the cache.
         */
        if (qp->flags & HNS_ROCE_QP_CAP_OWNER_DB)
                udma_to_device_barrier();

        hr_reg_write_bool(wqe, RCWQE_OWNER, !(index & BIT(qp->sq.shift)));
}

static int set_ud_wqe(void *wqe, struct hns_roce_qp *qp,
                      struct ibv_send_wr *wr, unsigned int nreq,
                      struct hns_roce_sge_info *sge_info)
{
        struct hns_roce_ah *ah = to_hr_ah(wr->wr.ud.ah);
        struct hns_roce_ud_sq_wqe *ud_sq_wqe = wqe;
        int ret = 0;

        hr_reg_write_bool(ud_sq_wqe, UDWQE_CQE,
                          !!(wr->send_flags & IBV_SEND_SIGNALED));
        hr_reg_write_bool(ud_sq_wqe, UDWQE_SE,
                          !!(wr->send_flags & IBV_SEND_SOLICITED));
        hr_reg_write_bool(ud_sq_wqe, UDWQE_INLINE,
                          !!(wr->send_flags & IBV_SEND_INLINE));

        ret = check_ud_opcode(ud_sq_wqe, wr);
        if (ret)
                return ret;

        ud_sq_wqe->qkey = htole32(wr->wr.ud.remote_qkey & 0x80000000 ?
                                  qp->qkey : wr->wr.ud.remote_qkey);

        hr_reg_write(ud_sq_wqe, UDWQE_DQPN, wr->wr.ud.remote_qpn);

        ret = fill_ud_av(ud_sq_wqe, ah);
        if (ret)
                return ret;

        ret = fill_ud_data_seg(ud_sq_wqe, qp, wr, sge_info);
        if (ret)
                return ret;

        enable_wqe(qp, ud_sq_wqe, qp->sq.head + nreq);

        return ret;
}

static int set_rc_inl(struct hns_roce_qp *qp, const struct ibv_send_wr *wr,
                      struct hns_roce_rc_sq_wqe *rc_sq_wqe,
                      struct hns_roce_sge_info *sge_info)
{
        void *dseg = rc_sq_wqe;
        int ret;
        int i;

        if (wr->opcode == IBV_WR_RDMA_READ)
                return EINVAL;

        if (!check_inl_data_len(qp, sge_info->total_len))
                return EINVAL;

        dseg += sizeof(struct hns_roce_rc_sq_wqe);

        if (sge_info->total_len <= HNS_ROCE_MAX_RC_INL_INN_SZ) {
                hr_reg_clear(rc_sq_wqe, RCWQE_INLINE_TYPE);

                for (i = 0; i < wr->num_sge; i++) {
                        memcpy(dseg, (void *)(uintptr_t)(wr->sg_list[i].addr),
                               wr->sg_list[i].length);
                        dseg += wr->sg_list[i].length;
                }
        } else {
                hr_reg_enable(rc_sq_wqe, RCWQE_INLINE_TYPE);

                ret = fill_ext_sge_inl_data(qp, sge_info, wr->sg_list,
                                            wr->num_sge,
                                            WR_BUF_TYPE_POST_SEND);
                if (ret)
                        return ret;

                hr_reg_write(rc_sq_wqe, RCWQE_SGE_NUM, sge_info->valid_num);
        }

        return 0;
}

static void set_bind_mw_seg(struct hns_roce_rc_sq_wqe *wqe,
                            const struct ibv_send_wr *wr)
{
        unsigned int access = wr->bind_mw.bind_info.mw_access_flags;

        hr_reg_write_bool(wqe, RCWQE_MW_TYPE, wr->bind_mw.mw->type - 1);
        hr_reg_write_bool(wqe, RCWQE_MW_RA_EN,
                          !!(access & IBV_ACCESS_REMOTE_ATOMIC));
        hr_reg_write_bool(wqe, RCWQE_MW_RR_EN,
                          !!(access & IBV_ACCESS_REMOTE_READ));
        hr_reg_write_bool(wqe, RCWQE_MW_RW_EN,
                          !!(access & IBV_ACCESS_REMOTE_WRITE));

        wqe->new_rkey = htole32(wr->bind_mw.rkey);
        wqe->byte_16 = htole32(wr->bind_mw.bind_info.length &
                               HNS_ROCE_ADDRESS_MASK);
        wqe->byte_20 = htole32(wr->bind_mw.bind_info.length >>
                               HNS_ROCE_ADDRESS_SHIFT);
        wqe->rkey = htole32(wr->bind_mw.bind_info.mr->rkey);
        wqe->va = htole64(wr->bind_mw.bind_info.addr);
}

static int check_rc_opcode(struct hns_roce_rc_sq_wqe *wqe,
                           const struct ibv_send_wr *wr)
{
        int ret = 0;

        wqe->immtdata = get_immtdata(wr->opcode, wr);

        switch (wr->opcode) {
        case IBV_WR_RDMA_READ:
        case IBV_WR_RDMA_WRITE:
        case IBV_WR_RDMA_WRITE_WITH_IMM:
                wqe->va = htole64(wr->wr.rdma.remote_addr);
                wqe->rkey = htole32(wr->wr.rdma.rkey);
                break;
        case IBV_WR_SEND:
        case IBV_WR_SEND_WITH_IMM:
                break;
        case IBV_WR_ATOMIC_CMP_AND_SWP:
        case IBV_WR_ATOMIC_FETCH_AND_ADD:
                wqe->rkey = htole32(wr->wr.atomic.rkey);
                wqe->va = htole64(wr->wr.atomic.remote_addr);
                break;
        case IBV_WR_SEND_WITH_INV:
                wqe->inv_key =
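/* SEND_WITH_INV carries the rkey to invalidate instead of immediate data */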
htole32(wr->invalidate_rkey); break; case IBV_WR_BIND_MW: set_bind_mw_seg(wqe, wr); break; default: ret = EINVAL; break; } hr_reg_write(wqe, RCWQE_OPCODE, to_hr_opcode(wr->opcode)); return ret; } static int set_rc_wqe(void *wqe, struct hns_roce_qp *qp, struct ibv_send_wr *wr, unsigned int nreq, struct hns_roce_sge_info *sge_info) { struct hns_roce_rc_sq_wqe *rc_sq_wqe = wqe; struct hns_roce_v2_wqe_data_seg *dseg; int ret; hr_reg_write_bool(wqe, RCWQE_CQE, !!(wr->send_flags & IBV_SEND_SIGNALED)); hr_reg_write_bool(wqe, RCWQE_SO, !!(wr->send_flags & IBV_SEND_FENCE)); hr_reg_write_bool(wqe, RCWQE_SE, !!(wr->send_flags & IBV_SEND_SOLICITED)); hr_reg_write_bool(wqe, RCWQE_INLINE, !!(wr->send_flags & IBV_SEND_INLINE)); ret = check_rc_opcode(rc_sq_wqe, wr); if (ret) return ret; hr_reg_write(rc_sq_wqe, RCWQE_MSG_START_SGE_IDX, sge_info->start_idx & (qp->ex_sge.sge_cnt - 1)); if (wr->opcode == IBV_WR_BIND_MW) goto wqe_valid; wqe += sizeof(struct hns_roce_rc_sq_wqe); dseg = wqe; set_rc_sge(dseg, qp, wr, sge_info); rc_sq_wqe->msg_len = htole32(sge_info->total_len); hr_reg_write(rc_sq_wqe, RCWQE_SGE_NUM, sge_info->valid_num); if (wr->opcode == IBV_WR_ATOMIC_FETCH_AND_ADD || wr->opcode == IBV_WR_ATOMIC_CMP_AND_SWP) { dseg++; ret = set_atomic_seg(qp, wr, dseg, sge_info); } else if (wr->send_flags & IBV_SEND_INLINE) { ret = set_rc_inl(qp, wr, rc_sq_wqe, sge_info); } if (ret) return ret; wqe_valid: enable_wqe(qp, rc_sq_wqe, qp->sq.head + nreq); return 0; } int hns_roce_u_v2_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { struct hns_roce_context *ctx = to_hr_ctx(ibvqp->context); struct hns_roce_qp *qp = to_hr_qp(ibvqp); struct hns_roce_sge_info sge_info = {}; struct hns_roce_rc_sq_wqe *wqe = NULL; struct ibv_qp_attr attr = {}; unsigned int wqe_idx, nreq; int ret; ret = check_qp_send(ibvqp); if (unlikely(ret)) { *bad_wr = wr; return ret; } hns_roce_spin_lock(&qp->sq.hr_lock); sge_info.start_idx = qp->next_sge; /* start index of extend sge */ for (nreq = 0; wr; ++nreq, wr = wr->next) { if (wr->num_sge > (int)qp->sq.max_gs) { ret = qp->sq.max_gs > 0 ? 
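/* too many SGEs for this SQ; a QP created with no send SGEs cannot post at all */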
EINVAL : EOPNOTSUPP; *bad_wr = wr; goto out; } if (hns_roce_v2_wq_overflow(&qp->sq, nreq, to_hr_cq(qp->verbs_qp.qp.send_cq))) { ret = ENOMEM; *bad_wr = wr; goto out; } wqe_idx = (qp->sq.head + nreq) & (qp->sq.wqe_cnt - 1); wqe = get_send_wqe(qp, wqe_idx); qp->sq.wrid[wqe_idx] = wr->wr_id; switch (ibvqp->qp_type) { case IBV_QPT_XRC_SEND: hr_reg_write(wqe, RCWQE_XRC_SRQN, wr->qp_type.xrc.remote_srqn); SWITCH_FALLTHROUGH; case IBV_QPT_RC: ret = set_rc_wqe(wqe, qp, wr, nreq, &sge_info); break; case IBV_QPT_UD: ret = set_ud_wqe(wqe, qp, wr, nreq, &sge_info); qp->sl = to_hr_ah(wr->wr.ud.ah)->av.sl; break; default: ret = EINVAL; } if (ret) { *bad_wr = wr; goto out; } } out: if (likely(nreq)) { qp->sq.head += nreq; qp->next_sge = sge_info.start_idx; udma_to_device_barrier(); if (nreq == 1 && !ret && (qp->flags & HNS_ROCE_QP_CAP_DIRECT_WQE)) hns_roce_write_dwqe(qp, wqe); else hns_roce_update_sq_db(ctx, qp); if (qp->flags & HNS_ROCE_QP_CAP_SQ_RECORD_DB) *(qp->sdb) = qp->sq.head & 0xffff; } hns_roce_spin_unlock(&qp->sq.hr_lock); if (ibvqp->state == IBV_QPS_ERR) { attr.qp_state = IBV_QPS_ERR; hns_roce_u_v2_modify_qp(ibvqp, &attr, IBV_QP_STATE); } return ret; } static inline int check_qp_recv(struct ibv_qp *qp) { if (qp->state == IBV_QPS_RESET) return EINVAL; return 0; } static void fill_recv_sge_to_wqe(struct ibv_recv_wr *wr, void *wqe, unsigned int max_sge, bool rsv) { struct hns_roce_v2_wqe_data_seg *dseg = wqe; unsigned int i, cnt; for (i = 0, cnt = 0; i < wr->num_sge; i++) { /* Skip zero-length sge */ if (!wr->sg_list[i].length) continue; set_data_seg_v2(dseg + cnt, wr->sg_list + i); cnt++; } /* Fill a reserved sge to make ROCEE stop reading remaining segments */ if (rsv) { dseg[cnt].lkey = 0; dseg[cnt].addr = 0; dseg[cnt].len = htole32(INVALID_SGE_LENGTH); } else { /* Clear remaining segments to make ROCEE ignore sges */ if (cnt < max_sge) memset(dseg + cnt, 0, (max_sge - cnt) * HNS_ROCE_SGE_SIZE); } } static void fill_recv_inl_buf(struct hns_roce_rinl_buf *rinl_buf, unsigned int wqe_idx, struct ibv_recv_wr *wr) { struct ibv_sge *sge_list; unsigned int i; if (!rinl_buf->wqe_cnt) return; sge_list = rinl_buf->wqe_list[wqe_idx].sg_list; rinl_buf->wqe_list[wqe_idx].sge_cnt = (unsigned int)wr->num_sge; for (i = 0; i < wr->num_sge; i++) memcpy((void *)&sge_list[i], (void *)&wr->sg_list[i], sizeof(struct ibv_sge)); } static void fill_rq_wqe(struct hns_roce_qp *qp, struct ibv_recv_wr *wr, unsigned int wqe_idx, unsigned int max_sge) { void *wqe; wqe = get_recv_wqe_v2(qp, wqe_idx); fill_recv_sge_to_wqe(wr, wqe, max_sge, qp->rq.rsv_sge); fill_recv_inl_buf(&qp->rq_rinl_buf, wqe_idx, wr); } static int hns_roce_u_v2_post_recv(struct ibv_qp *ibvqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct hns_roce_context *ctx = to_hr_ctx(ibvqp->context); struct hns_roce_qp *qp = to_hr_qp(ibvqp); unsigned int wqe_idx, nreq, max_sge; struct ibv_qp_attr attr = {}; int ret; ret = check_qp_recv(ibvqp); if (unlikely(ret)) { *bad_wr = wr; return ret; } hns_roce_spin_lock(&qp->rq.hr_lock); max_sge = qp->rq.max_gs - qp->rq.rsv_sge; for (nreq = 0; wr; ++nreq, wr = wr->next) { if (wr->num_sge > max_sge) { ret = max_sge > 0 ? 
EINVAL : EOPNOTSUPP; *bad_wr = wr; goto out; } if (hns_roce_v2_wq_overflow(&qp->rq, nreq, to_hr_cq(qp->verbs_qp.qp.recv_cq))) { ret = ENOMEM; *bad_wr = wr; goto out; } wqe_idx = (qp->rq.head + nreq) & (qp->rq.wqe_cnt - 1); fill_rq_wqe(qp, wr, wqe_idx, max_sge); qp->rq.wrid[wqe_idx] = wr->wr_id; } out: if (nreq) { qp->rq.head += nreq; udma_to_device_barrier(); if (qp->flags & HNS_ROCE_QP_CAP_RQ_RECORD_DB) *qp->rdb = qp->rq.head & 0xffff; else hns_roce_update_rq_db(ctx, ibvqp->qp_num, qp->rq.head); } hns_roce_spin_unlock(&qp->rq.hr_lock); if (ibvqp->state == IBV_QPS_ERR) { attr.qp_state = IBV_QPS_ERR; hns_roce_u_v2_modify_qp(ibvqp, &attr, IBV_QP_STATE); } return ret; } static void __hns_roce_v2_cq_clean(struct hns_roce_cq *cq, uint32_t qpn, struct hns_roce_srq *srq) { struct hns_roce_context *ctx = to_hr_ctx(cq->verbs_cq.cq.context); uint64_t cons_index = cq->cons_index; uint64_t prod_index = cq->cons_index; struct hns_roce_v2_cqe *cqe, *dest; uint16_t wqe_index; uint8_t owner_bit; bool is_recv_cqe; int nfreed = 0; for (; get_sw_cqe_v2(cq, prod_index); ++prod_index) if (prod_index > cons_index + cq->verbs_cq.cq.cqe) break; while (prod_index - cons_index > 0) { prod_index--; cqe = get_cqe_v2(cq, prod_index & cq->verbs_cq.cq.cqe); if (hr_reg_read(cqe, CQE_LCL_QPN) == qpn) { is_recv_cqe = hr_reg_read(cqe, CQE_S_R); if (srq && is_recv_cqe) { wqe_index = hr_reg_read(cqe, CQE_WQE_IDX); hns_roce_free_srq_wqe(srq, wqe_index); } ++nfreed; } else if (nfreed) { dest = get_cqe_v2(cq, (prod_index + nfreed) & cq->verbs_cq.cq.cqe); owner_bit = hr_reg_read(dest, CQE_OWNER); memcpy(dest, cqe, cq->cqe_size); hr_reg_write_bool(dest, CQE_OWNER, owner_bit); } } if (nfreed) { cq->cons_index += nfreed; udma_to_device_barrier(); update_cq_db(ctx, cq); } } static void hns_roce_v2_cq_clean(struct hns_roce_cq *cq, unsigned int qpn, struct hns_roce_srq *srq) { hns_roce_spin_lock(&cq->hr_lock); __hns_roce_v2_cq_clean(cq, qpn, srq); hns_roce_spin_unlock(&cq->hr_lock); } static void record_qp_attr(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask) { struct hns_roce_qp *hr_qp = to_hr_qp(qp); if (attr_mask & IBV_QP_PORT) hr_qp->port_num = attr->port_num; if (hr_qp->tc_mode == HNS_ROCE_TC_MAP_MODE_DSCP) hr_qp->sl = hr_qp->priority; else if (attr_mask & IBV_QP_AV) hr_qp->sl = attr->ah_attr.sl; if (attr_mask & IBV_QP_QKEY) hr_qp->qkey = attr->qkey; if (qp->qp_type == IBV_QPT_UD) hr_qp->path_mtu = IBV_MTU_4096; else if (attr_mask & IBV_QP_PATH_MTU) hr_qp->path_mtu = attr->path_mtu; } static int hns_roce_u_v2_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask) { struct hns_roce_modify_qp_ex_resp resp_ex = {}; struct hns_roce_modify_qp_ex cmd_ex = {}; struct hns_roce_qp *hr_qp = to_hr_qp(qp); bool flag = false; /* modify qp to error */ int ret; if ((attr_mask & IBV_QP_STATE) && (attr->qp_state == IBV_QPS_ERR)) { hns_roce_spin_lock(&hr_qp->sq.hr_lock); hns_roce_spin_lock(&hr_qp->rq.hr_lock); flag = true; } ret = ibv_cmd_modify_qp_ex(qp, attr, attr_mask, &cmd_ex.ibv_cmd, sizeof(cmd_ex), &resp_ex.ibv_resp, sizeof(resp_ex)); if (flag) { if (!ret) qp->state = IBV_QPS_ERR; hns_roce_spin_unlock(&hr_qp->sq.hr_lock); hns_roce_spin_unlock(&hr_qp->rq.hr_lock); } if (ret) return ret; if (attr_mask & IBV_QP_STATE) { qp->state = attr->qp_state; if (attr->qp_state == IBV_QPS_RTR) { hr_qp->tc_mode = resp_ex.drv_payload.tc_mode; hr_qp->priority = resp_ex.drv_payload.priority; } } if ((attr_mask & IBV_QP_STATE) && attr->qp_state == IBV_QPS_RESET) { if (qp->recv_cq) hns_roce_v2_cq_clean(to_hr_cq(qp->recv_cq), qp->qp_num, 
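/* moving to RESET: purge this QP's stale CQEs (and any SRQ WQEs) from both CQs */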
qp->srq ? to_hr_srq(qp->srq) : NULL); if (qp->send_cq && qp->send_cq != qp->recv_cq) hns_roce_v2_cq_clean(to_hr_cq(qp->send_cq), qp->qp_num, NULL); hns_roce_init_qp_indices(to_hr_qp(qp)); } record_qp_attr(qp, attr, attr_mask); return ret; } static void hns_roce_lock_cqs(struct ibv_qp *qp) { struct hns_roce_cq *send_cq = to_hr_cq(qp->send_cq); struct hns_roce_cq *recv_cq = to_hr_cq(qp->recv_cq); if (send_cq && recv_cq) { if (send_cq == recv_cq) { hns_roce_spin_lock(&send_cq->hr_lock); } else if (send_cq->cqn < recv_cq->cqn) { hns_roce_spin_lock(&send_cq->hr_lock); hns_roce_spin_lock(&recv_cq->hr_lock); } else { hns_roce_spin_lock(&recv_cq->hr_lock); hns_roce_spin_lock(&send_cq->hr_lock); } } else if (send_cq) { hns_roce_spin_lock(&send_cq->hr_lock); } else if (recv_cq) { hns_roce_spin_lock(&recv_cq->hr_lock); } } static void hns_roce_unlock_cqs(struct ibv_qp *qp) { struct hns_roce_cq *send_cq = to_hr_cq(qp->send_cq); struct hns_roce_cq *recv_cq = to_hr_cq(qp->recv_cq); if (send_cq && recv_cq) { if (send_cq == recv_cq) { hns_roce_spin_unlock(&send_cq->hr_lock); } else if (send_cq->cqn < recv_cq->cqn) { hns_roce_spin_unlock(&recv_cq->hr_lock); hns_roce_spin_unlock(&send_cq->hr_lock); } else { hns_roce_spin_unlock(&send_cq->hr_lock); hns_roce_spin_unlock(&recv_cq->hr_lock); } } else if (send_cq) { hns_roce_spin_unlock(&send_cq->hr_lock); } else if (recv_cq) { hns_roce_spin_unlock(&recv_cq->hr_lock); } } static int hns_roce_u_v2_destroy_qp(struct ibv_qp *ibqp) { struct hns_roce_context *ctx = to_hr_ctx(ibqp->context); struct hns_roce_pad *pad = to_hr_pad(ibqp->pd); struct hns_roce_qp *qp = to_hr_qp(ibqp); int ret; ret = ibv_cmd_destroy_qp(ibqp); if (ret) return ret; if (qp->flags & HNS_ROCE_QP_CAP_DIRECT_WQE) munmap(qp->dwqe_page, HNS_ROCE_DWQE_PAGE_SIZE); hns_roce_v2_clear_qp(ctx, qp); hns_roce_lock_cqs(ibqp); if (ibqp->recv_cq) __hns_roce_v2_cq_clean(to_hr_cq(ibqp->recv_cq), ibqp->qp_num, ibqp->srq ? to_hr_srq(ibqp->srq) : NULL); if (ibqp->send_cq && ibqp->send_cq != ibqp->recv_cq) __hns_roce_v2_cq_clean(to_hr_cq(ibqp->send_cq), ibqp->qp_num, NULL); hns_roce_unlock_cqs(ibqp); hns_roce_free_qp_buf(qp, ctx); hns_roce_qp_spinlock_destroy(qp); if (pad) atomic_fetch_sub(&pad->pd.refcount, 1); free(qp); return ret; } static int hns_roce_v2_srqwq_overflow(struct hns_roce_srq *srq) { struct hns_roce_idx_que *idx_que = &srq->idx_que; return idx_que->head - idx_que->tail >= srq->wqe_cnt; } static int check_post_srq_valid(struct hns_roce_srq *srq, struct ibv_recv_wr *wr, unsigned int max_sge) { if (hns_roce_v2_srqwq_overflow(srq)) return ENOMEM; if (wr->num_sge > max_sge) return EINVAL; return 0; } static int get_wqe_idx(struct hns_roce_srq *srq, unsigned int *wqe_idx) { struct hns_roce_idx_que *idx_que = &srq->idx_que; int bit_num; int i; /* bitmap[i] is set zero if all bits are allocated */ for (i = 0; i < idx_que->bitmap_cnt && idx_que->bitmap[i] == 0; ++i) ; if (i == idx_que->bitmap_cnt) return ENOMEM; bit_num = ffsl(idx_que->bitmap[i]); idx_que->bitmap[i] &= ~(1ULL << (bit_num - 1)); *wqe_idx = i * BIT_CNT_PER_LONG + (bit_num - 1); /* If wqe_cnt is less than BIT_CNT_PER_LONG, wqe_idx may be greater * than wqe_cnt. 
*/ if (*wqe_idx >= srq->wqe_cnt) return ENOMEM; return 0; } static void fill_wqe_idx(struct hns_roce_srq *srq, unsigned int wqe_idx) { struct hns_roce_idx_que *idx_que = &srq->idx_que; unsigned int head; __le32 *idx_buf; head = idx_que->head & (srq->wqe_cnt - 1); idx_buf = get_idx_buf(idx_que, head); *idx_buf = htole32(wqe_idx); idx_que->head++; } static void update_srq_db(struct hns_roce_context *ctx, struct hns_roce_db *db, struct hns_roce_srq *srq) { hr_reg_write(db, DB_TAG, srq->srqn); hr_reg_write(db, DB_CMD, HNS_ROCE_V2_SRQ_DB); hr_reg_write(db, DB_PI, srq->idx_que.head); hns_roce_write64(ctx->uar + ROCEE_VF_DB_CFG0_OFFSET, (__le32 *)db); } static int hns_roce_u_v2_post_srq_recv(struct ibv_srq *ib_srq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct hns_roce_context *ctx = to_hr_ctx(ib_srq->context); struct hns_roce_srq *srq = to_hr_srq(ib_srq); unsigned int wqe_idx, max_sge, nreq; struct hns_roce_db srq_db; int ret = 0; void *wqe; hns_roce_spin_lock(&srq->hr_lock); max_sge = srq->max_gs - srq->rsv_sge; for (nreq = 0; wr; ++nreq, wr = wr->next) { ret = check_post_srq_valid(srq, wr, max_sge); if (ret) { *bad_wr = wr; break; } ret = get_wqe_idx(srq, &wqe_idx); if (ret) { *bad_wr = wr; break; } wqe = get_srq_wqe(srq, wqe_idx); fill_recv_sge_to_wqe(wr, wqe, max_sge, srq->rsv_sge); fill_wqe_idx(srq, wqe_idx); srq->wrid[wqe_idx] = wr->wr_id; } if (nreq) { /* * Make sure that descriptors are written before * we write doorbell record. */ udma_to_device_barrier(); if (srq->cap_flags & HNS_ROCE_RSP_SRQ_CAP_RECORD_DB) *srq->rdb = srq->idx_que.head & 0xffff; else update_srq_db(ctx, &srq_db, srq); } hns_roce_spin_unlock(&srq->hr_lock); return ret; } static int wc_start_poll_cq(struct ibv_cq_ex *current, struct ibv_poll_cq_attr *attr) { struct hns_roce_cq *cq = to_hr_cq(ibv_cq_ex_to_cq(current)); struct hns_roce_context *ctx = to_hr_ctx(current->context); struct hns_roce_qp *qp = NULL; int err; if (attr->comp_mask) return EINVAL; hns_roce_spin_lock(&cq->hr_lock); err = hns_roce_poll_one(ctx, &qp, cq, NULL); if (err != V2_CQ_OK) hns_roce_spin_unlock(&cq->hr_lock); return err; } static int wc_next_poll_cq(struct ibv_cq_ex *current) { struct hns_roce_cq *cq = to_hr_cq(ibv_cq_ex_to_cq(current)); struct hns_roce_context *ctx = to_hr_ctx(current->context); struct hns_roce_qp *qp = NULL; int err; err = hns_roce_poll_one(ctx, &qp, cq, NULL); if (err != V2_CQ_OK) return err; if (cq->flags & HNS_ROCE_CQ_FLAG_RECORD_DB) *cq->db = cq->cons_index & RECORD_DB_CI_MASK; else update_cq_db(ctx, cq); return 0; } static void wc_end_poll_cq(struct ibv_cq_ex *current) { struct hns_roce_cq *cq = to_hr_cq(ibv_cq_ex_to_cq(current)); struct hns_roce_context *ctx = to_hr_ctx(current->context); if (cq->flags & HNS_ROCE_CQ_FLAG_RECORD_DB) *cq->db = cq->cons_index & RECORD_DB_CI_MASK; else update_cq_db(ctx, cq); hns_roce_spin_unlock(&cq->hr_lock); } static enum ibv_wc_opcode wc_read_opcode(struct ibv_cq_ex *current) { struct hns_roce_cq *cq = to_hr_cq(ibv_cq_ex_to_cq(current)); uint8_t opcode = hr_reg_read(cq->cqe, CQE_OPCODE); if (hr_reg_read(cq->cqe, CQE_S_R) == CQE_FOR_SQ) return wc_send_op_map[opcode]; else return wc_rcv_op_map[opcode]; } static uint32_t wc_read_vendor_err(struct ibv_cq_ex *current) { struct hns_roce_cq *cq = to_hr_cq(ibv_cq_ex_to_cq(current)); return hr_reg_read(cq->cqe, CQE_SUB_STATUS); } static uint32_t wc_read_byte_len(struct ibv_cq_ex *current) { struct hns_roce_cq *cq = to_hr_cq(ibv_cq_ex_to_cq(current)); return le32toh(cq->cqe->byte_cnt); } static __be32 wc_read_imm_data(struct 
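/*
 * The wc_read_*() helpers back the extended-CQ read ops; each one decodes
 * a field of the CQE cached in cq->cqe by start_poll()/next_poll().
 */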
ibv_cq_ex *current) { struct hns_roce_cq *cq = to_hr_cq(ibv_cq_ex_to_cq(current)); if (hr_reg_read(cq->cqe, CQE_OPCODE) == HNS_ROCE_RECV_OP_SEND_WITH_INV) /* This is returning invalidate_rkey which is in host order, see * ibv_wc_read_invalidated_rkey. */ return (__force __be32)le32toh(cq->cqe->rkey); return htobe32(le32toh(cq->cqe->immtdata)); } static uint32_t wc_read_qp_num(struct ibv_cq_ex *current) { struct hns_roce_cq *cq = to_hr_cq(ibv_cq_ex_to_cq(current)); return hr_reg_read(cq->cqe, CQE_LCL_QPN); } static uint32_t wc_read_src_qp(struct ibv_cq_ex *current) { struct hns_roce_cq *cq = to_hr_cq(ibv_cq_ex_to_cq(current)); return hr_reg_read(cq->cqe, CQE_RMT_QPN); } static unsigned int get_wc_flags_for_sq(uint8_t opcode) { switch (opcode) { case HNS_ROCE_SQ_OP_SEND_WITH_IMM: case HNS_ROCE_SQ_OP_RDMA_WRITE_WITH_IMM: return IBV_WC_WITH_IMM; default: return 0; } } static unsigned int get_wc_flags_for_rq(uint8_t opcode) { switch (opcode) { case HNS_ROCE_RECV_OP_RDMA_WRITE_IMM: case HNS_ROCE_RECV_OP_SEND_WITH_IMM: return IBV_WC_WITH_IMM; case HNS_ROCE_RECV_OP_SEND_WITH_INV: return IBV_WC_WITH_INV; default: return 0; } } static unsigned int wc_read_wc_flags(struct ibv_cq_ex *current) { struct hns_roce_cq *cq = to_hr_cq(ibv_cq_ex_to_cq(current)); uint8_t opcode = hr_reg_read(cq->cqe, CQE_OPCODE); unsigned int wc_flags; if (hr_reg_read(cq->cqe, CQE_S_R) == CQE_FOR_SQ) { wc_flags = get_wc_flags_for_sq(opcode); } else { wc_flags = get_wc_flags_for_rq(opcode); wc_flags |= hr_reg_read(cq->cqe, CQE_GRH) ? IBV_WC_GRH : 0; } return wc_flags; } static uint32_t wc_read_slid(struct ibv_cq_ex *current) { return 0; } static uint8_t wc_read_sl(struct ibv_cq_ex *current) { struct hns_roce_cq *cq = to_hr_cq(ibv_cq_ex_to_cq(current)); uint8_t port_type; port_type = hr_reg_read(cq->cqe, CQE_PORT_TYPE); return pktype_for_ud[port_type]; } static uint8_t wc_read_dlid_path_bits(struct ibv_cq_ex *current) { return 0; } static uint16_t wc_read_cvlan(struct ibv_cq_ex *current) { struct hns_roce_cq *cq = to_hr_cq(ibv_cq_ex_to_cq(current)); return hr_reg_read(cq->cqe, CQE_VID_VLD) ? 
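/* the VLAN ID field is only meaningful when the CQE's VID_VLD bit is set */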
hr_reg_read(cq->cqe, CQE_VID) : 0; } void hns_roce_attach_cq_ex_ops(struct ibv_cq_ex *cq_ex, uint64_t wc_flags) { cq_ex->start_poll = wc_start_poll_cq; cq_ex->next_poll = wc_next_poll_cq; cq_ex->end_poll = wc_end_poll_cq; cq_ex->read_opcode = wc_read_opcode; cq_ex->read_vendor_err = wc_read_vendor_err; cq_ex->read_wc_flags = wc_read_wc_flags; if (wc_flags & IBV_WC_EX_WITH_BYTE_LEN) cq_ex->read_byte_len = wc_read_byte_len; if (wc_flags & IBV_WC_EX_WITH_IMM) cq_ex->read_imm_data = wc_read_imm_data; if (wc_flags & IBV_WC_EX_WITH_QP_NUM) cq_ex->read_qp_num = wc_read_qp_num; if (wc_flags & IBV_WC_EX_WITH_SRC_QP) cq_ex->read_src_qp = wc_read_src_qp; if (wc_flags & IBV_WC_EX_WITH_SLID) cq_ex->read_slid = wc_read_slid; if (wc_flags & IBV_WC_EX_WITH_SL) cq_ex->read_sl = wc_read_sl; if (wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS) cq_ex->read_dlid_path_bits = wc_read_dlid_path_bits; if (wc_flags & IBV_WC_EX_WITH_CVLAN) cq_ex->read_cvlan = wc_read_cvlan; } static struct hns_roce_rc_sq_wqe * init_rc_wqe(struct hns_roce_qp *qp, uint64_t wr_id, unsigned int opcode) { unsigned int send_flags = qp->verbs_qp.qp_ex.wr_flags; struct hns_roce_rc_sq_wqe *wqe; unsigned int wqe_idx; if (hns_roce_v2_wq_overflow(&qp->sq, 0, to_hr_cq(qp->verbs_qp.qp.send_cq))) { qp->cur_wqe = NULL; qp->err = ENOMEM; return NULL; } wqe_idx = qp->sq.head & (qp->sq.wqe_cnt - 1); wqe = get_send_wqe(qp, wqe_idx); hr_reg_write(wqe, RCWQE_OPCODE, opcode); hr_reg_write_bool(wqe, RCWQE_CQE, send_flags & IBV_SEND_SIGNALED); hr_reg_write_bool(wqe, RCWQE_FENCE, send_flags & IBV_SEND_FENCE); hr_reg_write_bool(wqe, RCWQE_SE, send_flags & IBV_SEND_SOLICITED); hr_reg_clear(wqe, RCWQE_INLINE); qp->sq.wrid[wqe_idx] = wr_id; qp->cur_wqe = wqe; enable_wqe(qp, wqe, qp->sq.head); qp->sq.head++; return wqe; } static void wr_set_sge_rc(struct ibv_qp_ex *ibv_qp, uint32_t lkey, uint64_t addr, uint32_t length) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); struct hns_roce_rc_sq_wqe *wqe = qp->cur_wqe; if (!wqe) return; hr_reg_write(wqe, RCWQE_LKEY0, lkey); hr_reg_write(wqe, RCWQE_VA0_L, addr); hr_reg_write(wqe, RCWQE_VA0_H, addr >> 32); wqe->msg_len = htole32(length); hr_reg_write(wqe, RCWQE_LEN0, length); hr_reg_write(wqe, RCWQE_SGE_NUM, !!length); } static void set_sgl_rc(struct hns_roce_v2_wqe_data_seg *dseg, struct hns_roce_qp *qp, const struct ibv_sge *sge, size_t num_sge) { unsigned int index = qp->sge_info.start_idx; unsigned int mask = qp->ex_sge.sge_cnt - 1; unsigned int msg_len = 0; unsigned int cnt = 0; int i; for (i = 0; i < num_sge; i++) { if (!sge[i].length) continue; msg_len += sge[i].length; cnt++; if (cnt <= HNS_ROCE_SGE_IN_WQE) { set_data_seg_v2(dseg, &sge[i]); dseg++; } else { dseg = get_send_sge_ex(qp, index & mask); set_data_seg_v2(dseg, &sge[i]); index++; } } qp->sge_info.start_idx = index; qp->sge_info.valid_num = cnt; qp->sge_info.total_len = msg_len; } static void wr_set_sge_list_rc(struct ibv_qp_ex *ibv_qp, size_t num_sge, const struct ibv_sge *sg_list) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); struct hns_roce_rc_sq_wqe *wqe = qp->cur_wqe; struct hns_roce_v2_wqe_data_seg *dseg; uint32_t opcode; if (!wqe) return; if (num_sge > qp->sq.max_gs) { qp->err = EINVAL; return; } hr_reg_write(wqe, RCWQE_MSG_START_SGE_IDX, qp->sge_info.start_idx & (qp->ex_sge.sge_cnt - 1)); opcode = hr_reg_read(wqe, RCWQE_OPCODE); if (opcode == HNS_ROCE_WQE_OP_ATOMIC_COM_AND_SWAP || opcode == HNS_ROCE_WQE_OP_ATOMIC_FETCH_AND_ADD) num_sge = 1; dseg = (void *)(wqe + 1); set_sgl_rc(dseg, qp, sg_list, num_sge); wqe->msg_len = 
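/* msg_len covers only the bytes of non-zero-length SGEs counted by set_sgl_rc() */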
htole32(qp->sge_info.total_len); hr_reg_write(wqe, RCWQE_SGE_NUM, qp->sge_info.valid_num); } static void wr_send_rc(struct ibv_qp_ex *ibv_qp) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); init_rc_wqe(qp, ibv_qp->wr_id, HNS_ROCE_WQE_OP_SEND); } static void wr_send_imm_rc(struct ibv_qp_ex *ibv_qp, __be32 imm_data) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); struct hns_roce_rc_sq_wqe *wqe; wqe = init_rc_wqe(qp, ibv_qp->wr_id, HNS_ROCE_WQE_OP_SEND_WITH_IMM); if (!wqe) return; wqe->immtdata = htole32(be32toh(imm_data)); } static void wr_send_inv_rc(struct ibv_qp_ex *ibv_qp, uint32_t invalidate_rkey) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); struct hns_roce_rc_sq_wqe *wqe; wqe = init_rc_wqe(qp, ibv_qp->wr_id, HNS_ROCE_WQE_OP_SEND_WITH_INV); if (!wqe) return; wqe->inv_key = htole32(invalidate_rkey); } static void wr_set_xrc_srqn(struct ibv_qp_ex *ibv_qp, uint32_t remote_srqn) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); struct hns_roce_rc_sq_wqe *wqe = qp->cur_wqe; if (!wqe) return; hr_reg_write(wqe, RCWQE_XRC_SRQN, remote_srqn); } static void wr_rdma_read(struct ibv_qp_ex *ibv_qp, uint32_t rkey, uint64_t remote_addr) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); struct hns_roce_rc_sq_wqe *wqe; wqe = init_rc_wqe(qp, ibv_qp->wr_id, HNS_ROCE_WQE_OP_RDMA_READ); if (!wqe) return; wqe->va = htole64(remote_addr); wqe->rkey = htole32(rkey); } static void wr_rdma_write(struct ibv_qp_ex *ibv_qp, uint32_t rkey, uint64_t remote_addr) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); struct hns_roce_rc_sq_wqe *wqe; wqe = init_rc_wqe(qp, ibv_qp->wr_id, HNS_ROCE_WQE_OP_RDMA_WRITE); if (!wqe) return; wqe->va = htole64(remote_addr); wqe->rkey = htole32(rkey); } static void wr_rdma_write_imm(struct ibv_qp_ex *ibv_qp, uint32_t rkey, uint64_t remote_addr, __be32 imm_data) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); struct hns_roce_rc_sq_wqe *wqe; wqe = init_rc_wqe(qp, ibv_qp->wr_id, HNS_ROCE_WQE_OP_RDMA_WRITE_WITH_IMM); if (!wqe) return; wqe->va = htole64(remote_addr); wqe->rkey = htole32(rkey); wqe->immtdata = htole32(be32toh(imm_data)); } static void set_wr_atomic(struct ibv_qp_ex *ibv_qp, uint32_t rkey, uint64_t remote_addr, uint64_t compare_add, uint64_t swap, uint32_t opcode) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); struct hns_roce_v2_wqe_data_seg *dseg; struct hns_roce_wqe_atomic_seg *aseg; struct hns_roce_rc_sq_wqe *wqe; wqe = init_rc_wqe(qp, ibv_qp->wr_id, opcode); if (!wqe) return; wqe->va = htole64(remote_addr); wqe->rkey = htole32(rkey); dseg = (void *)(wqe + 1); aseg = (void *)(dseg + 1); if (opcode == HNS_ROCE_WQE_OP_ATOMIC_COM_AND_SWAP) { aseg->fetchadd_swap_data = htole64(swap); aseg->cmp_data = htole64(compare_add); } else { aseg->fetchadd_swap_data = htole64(compare_add); aseg->cmp_data = 0; } } static void wr_atomic_cmp_swp(struct ibv_qp_ex *ibv_qp, uint32_t rkey, uint64_t remote_addr, uint64_t compare, uint64_t swap) { set_wr_atomic(ibv_qp, rkey, remote_addr, compare, swap, HNS_ROCE_WQE_OP_ATOMIC_COM_AND_SWAP); } static void wr_atomic_fetch_add(struct ibv_qp_ex *ibv_qp, uint32_t rkey, uint64_t remote_addr, uint64_t add) { set_wr_atomic(ibv_qp, rkey, remote_addr, add, 0, HNS_ROCE_WQE_OP_ATOMIC_FETCH_AND_ADD); } static void set_inline_data_list_rc(struct hns_roce_qp *qp, struct hns_roce_rc_sq_wqe *wqe, size_t num_buf, const struct ibv_data_buf *buf_list) { unsigned int msg_len = qp->sge_info.total_len; void *dseg; size_t i; int ret; hr_reg_enable(wqe, RCWQE_INLINE); wqe->msg_len = htole32(msg_len); if (msg_len <= 
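/* small payloads are inlined right after the WQE header; larger ones spill into extended SGE space */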
HNS_ROCE_MAX_RC_INL_INN_SZ) {
                hr_reg_clear(wqe, RCWQE_INLINE_TYPE);
                /* ignore ex sge start index */

                dseg = wqe + 1;
                for (i = 0; i < num_buf; i++) {
                        memcpy(dseg, buf_list[i].addr, buf_list[i].length);
                        dseg += buf_list[i].length;
                }
                /* ignore sge num */
        } else {
                if (!check_inl_data_len(qp, msg_len)) {
                        qp->err = EINVAL;
                        return;
                }

                hr_reg_enable(wqe, RCWQE_INLINE_TYPE);
                hr_reg_write(wqe, RCWQE_MSG_START_SGE_IDX,
                             qp->sge_info.start_idx &
                             (qp->ex_sge.sge_cnt - 1));

                ret = fill_ext_sge_inl_data(qp, &qp->sge_info, buf_list,
                                            num_buf, WR_BUF_TYPE_SEND_WR_OPS);
                if (ret) {
                        qp->err = EINVAL;
                        return;
                }

                hr_reg_write(wqe, RCWQE_SGE_NUM, qp->sge_info.valid_num);
        }
}

static void wr_set_inline_data_rc(struct ibv_qp_ex *ibv_qp, void *addr,
                                  size_t length)
{
        struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base);
        struct hns_roce_rc_sq_wqe *wqe = qp->cur_wqe;
        struct ibv_data_buf buff;

        if (!wqe)
                return;

        buff.addr = addr;
        buff.length = length;
        qp->sge_info.total_len = length;
        set_inline_data_list_rc(qp, wqe, 1, &buff);
}

static void wr_set_inline_data_list_rc(struct ibv_qp_ex *ibv_qp,
                                       size_t num_buf,
                                       const struct ibv_data_buf *buf_list)
{
        struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base);
        struct hns_roce_rc_sq_wqe *wqe = qp->cur_wqe;
        size_t i;

        if (!wqe)
                return;

        qp->sge_info.total_len = 0;
        for (i = 0; i < num_buf; i++)
                qp->sge_info.total_len += buf_list[i].length;

        set_inline_data_list_rc(qp, wqe, num_buf, buf_list);
}

static struct hns_roce_ud_sq_wqe *
init_ud_wqe(struct hns_roce_qp *qp, uint64_t wr_id, unsigned int opcode)
{
        unsigned int send_flags = qp->verbs_qp.qp_ex.wr_flags;
        struct hns_roce_ud_sq_wqe *wqe;
        unsigned int wqe_idx;

        if (hns_roce_v2_wq_overflow(&qp->sq, 0,
                                    to_hr_cq(qp->verbs_qp.qp.send_cq))) {
                qp->cur_wqe = NULL;
                qp->err = ENOMEM;
                return NULL;
        }

        wqe_idx = qp->sq.head & (qp->sq.wqe_cnt - 1);
        wqe = get_send_wqe(qp, wqe_idx);

        hr_reg_write(wqe, UDWQE_OPCODE, opcode);
        hr_reg_write_bool(wqe, UDWQE_CQE, send_flags & IBV_SEND_SIGNALED);
        hr_reg_write_bool(wqe, UDWQE_SE, send_flags & IBV_SEND_SOLICITED);
        hr_reg_clear(wqe, UDWQE_INLINE);

        qp->sq.wrid[wqe_idx] = wr_id;
        qp->cur_wqe = wqe;

        enable_wqe(qp, wqe, qp->sq.head);

        qp->sq.head++;

        return wqe;
}

static void wr_send_ud(struct ibv_qp_ex *ibv_qp)
{
        struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base);

        init_ud_wqe(qp, ibv_qp->wr_id, HNS_ROCE_WQE_OP_SEND);
}

static void wr_send_imm_ud(struct ibv_qp_ex *ibv_qp, __be32 imm_data)
{
        struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base);
        struct hns_roce_ud_sq_wqe *wqe;

        wqe = init_ud_wqe(qp, ibv_qp->wr_id, HNS_ROCE_WQE_OP_SEND_WITH_IMM);
        if (!wqe)
                return;

        wqe->immtdata = htole32(be32toh(imm_data));
}

static void wr_set_ud_addr(struct ibv_qp_ex *ibv_qp, struct ibv_ah *ah,
                           uint32_t remote_qpn, uint32_t remote_qkey)
{
        struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base);
        struct hns_roce_ud_sq_wqe *wqe = qp->cur_wqe;
        struct hns_roce_ah *hr_ah = to_hr_ah(ah);
        int ret;

        if (!wqe)
                return;

        wqe->qkey = htole32(remote_qkey & 0x80000000 ?
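/* per the IB verbs convention, a remote qkey with the high bit set means "use the QP's own qkey" */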
qp->qkey : remote_qkey); hr_reg_write(wqe, UDWQE_DQPN, remote_qpn); ret = fill_ud_av(wqe, hr_ah); if (ret) qp->err = ret; qp->sl = hr_ah->av.sl; } static void wr_set_sge_ud(struct ibv_qp_ex *ibv_qp, uint32_t lkey, uint64_t addr, uint32_t length) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); struct hns_roce_ud_sq_wqe *wqe = qp->cur_wqe; struct hns_roce_v2_wqe_data_seg *dseg; int sge_idx; if (!wqe) return; wqe->msg_len = htole32(length); hr_reg_write(wqe, UDWQE_SGE_NUM, 1); sge_idx = qp->sge_info.start_idx & (qp->ex_sge.sge_cnt - 1); hr_reg_write(wqe, UDWQE_MSG_START_SGE_IDX, sge_idx); dseg = get_send_sge_ex(qp, sge_idx); dseg->lkey = htole32(lkey); dseg->addr = htole64(addr); dseg->len = htole32(length); qp->sge_info.start_idx++; } static void wr_set_sge_list_ud(struct ibv_qp_ex *ibv_qp, size_t num_sge, const struct ibv_sge *sg_list) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); unsigned int sge_idx = qp->sge_info.start_idx; struct hns_roce_ud_sq_wqe *wqe = qp->cur_wqe; unsigned int mask = qp->ex_sge.sge_cnt - 1; struct hns_roce_v2_wqe_data_seg *dseg; unsigned int msg_len = 0; unsigned int cnt = 0; if (!wqe) return; if (num_sge > qp->sq.max_gs) { qp->err = EINVAL; return; } hr_reg_write(wqe, UDWQE_MSG_START_SGE_IDX, sge_idx & mask); for (size_t i = 0; i < num_sge; i++) { if (!sg_list[i].length) continue; dseg = get_send_sge_ex(qp, sge_idx & mask); set_data_seg_v2(dseg, &sg_list[i]); msg_len += sg_list[i].length; cnt++; sge_idx++; } wqe->msg_len = htole32(msg_len); hr_reg_write(wqe, UDWQE_SGE_NUM, cnt); qp->sge_info.start_idx += cnt; } static void set_inline_data_list_ud(struct hns_roce_qp *qp, struct hns_roce_ud_sq_wqe *wqe, size_t num_buf, const struct ibv_data_buf *buf_list) { uint8_t data[HNS_ROCE_MAX_UD_INL_INN_SZ] = {}; unsigned int msg_len = qp->sge_info.total_len; void *tmp; size_t i; int ret; if (!check_inl_data_len(qp, msg_len)) { qp->err = EINVAL; return; } hr_reg_enable(wqe, UDWQE_INLINE); wqe->msg_len = htole32(msg_len); if (msg_len <= HNS_ROCE_MAX_UD_INL_INN_SZ) { hr_reg_clear(wqe, UDWQE_INLINE_TYPE); /* ignore ex sge start index */ tmp = data; for (i = 0; i < num_buf; i++) { memcpy(tmp, buf_list[i].addr, buf_list[i].length); tmp += buf_list[i].length; } set_ud_inl_seg(wqe, data); /* ignore sge num */ } else { hr_reg_enable(wqe, UDWQE_INLINE_TYPE); hr_reg_write(wqe, UDWQE_MSG_START_SGE_IDX, qp->sge_info.start_idx & (qp->ex_sge.sge_cnt - 1)); ret = fill_ext_sge_inl_data(qp, &qp->sge_info, buf_list, num_buf, WR_BUF_TYPE_SEND_WR_OPS); if (ret) { qp->err = EINVAL; return; } hr_reg_write(wqe, UDWQE_SGE_NUM, qp->sge_info.valid_num); } } static void wr_set_inline_data_ud(struct ibv_qp_ex *ibv_qp, void *addr, size_t length) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); struct hns_roce_ud_sq_wqe *wqe = qp->cur_wqe; struct ibv_data_buf buff; if (!wqe) return; buff.addr = addr; buff.length = length; qp->sge_info.total_len = length; set_inline_data_list_ud(qp, wqe, 1, &buff); } static void wr_set_inline_data_list_ud(struct ibv_qp_ex *ibv_qp, size_t num_buf, const struct ibv_data_buf *buf_list) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); struct hns_roce_ud_sq_wqe *wqe = qp->cur_wqe; size_t i; if (!wqe) return; qp->sge_info.total_len = 0; for (i = 0; i < num_buf; i++) qp->sge_info.total_len += buf_list[i].length; set_inline_data_list_ud(qp, wqe, num_buf, buf_list); } static void wr_start(struct ibv_qp_ex *ibv_qp) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); enum ibv_qp_state state = ibv_qp->qp_base.state; if (state == IBV_QPS_RESET || state == 
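/* the SQ lock is taken on both paths below; wr_complete()/wr_abort() drop it */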
IBV_QPS_INIT || state == IBV_QPS_RTR) { hns_roce_spin_lock(&qp->sq.hr_lock); qp->err = EINVAL; return; } hns_roce_spin_lock(&qp->sq.hr_lock); qp->sge_info.start_idx = qp->next_sge; qp->rb_sq_head = qp->sq.head; qp->err = 0; } static int wr_complete(struct ibv_qp_ex *ibv_qp) { struct hns_roce_context *ctx = to_hr_ctx(ibv_qp->qp_base.context); struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); unsigned int nreq = qp->sq.head - qp->rb_sq_head; struct ibv_qp_attr attr = {}; int err = qp->err; if (err) { qp->sq.head = qp->rb_sq_head; goto out; } if (nreq) { qp->next_sge = qp->sge_info.start_idx; udma_to_device_barrier(); if (nreq == 1 && (qp->flags & HNS_ROCE_QP_CAP_DIRECT_WQE)) hns_roce_write_dwqe(qp, qp->cur_wqe); else hns_roce_update_sq_db(ctx, qp); if (qp->flags & HNS_ROCE_QP_CAP_SQ_RECORD_DB) *(qp->sdb) = qp->sq.head & 0xffff; } out: hns_roce_spin_unlock(&qp->sq.hr_lock); if (ibv_qp->qp_base.state == IBV_QPS_ERR) { attr.qp_state = IBV_QPS_ERR; hns_roce_u_v2_modify_qp(&ibv_qp->qp_base, &attr, IBV_QP_STATE); } return err; } static void wr_abort(struct ibv_qp_ex *ibv_qp) { struct hns_roce_qp *qp = to_hr_qp(&ibv_qp->qp_base); qp->sq.head = qp->rb_sq_head; hns_roce_spin_unlock(&qp->sq.hr_lock); } enum { HNS_SUPPORTED_SEND_OPS_FLAGS_RC_XRC = IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_SEND_WITH_INV | IBV_QP_EX_WITH_SEND_WITH_IMM | IBV_QP_EX_WITH_RDMA_WRITE | IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM | IBV_QP_EX_WITH_RDMA_READ | IBV_QP_EX_WITH_ATOMIC_CMP_AND_SWP | IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD, HNS_SUPPORTED_SEND_OPS_FLAGS_UD = IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_SEND_WITH_IMM, }; static void fill_send_wr_ops_rc_xrc(struct ibv_qp_ex *qp_ex) { qp_ex->wr_send = wr_send_rc; qp_ex->wr_send_imm = wr_send_imm_rc; qp_ex->wr_send_inv = wr_send_inv_rc; qp_ex->wr_rdma_read = wr_rdma_read; qp_ex->wr_rdma_write = wr_rdma_write; qp_ex->wr_rdma_write_imm = wr_rdma_write_imm; qp_ex->wr_set_inline_data = wr_set_inline_data_rc; qp_ex->wr_set_inline_data_list = wr_set_inline_data_list_rc; qp_ex->wr_atomic_cmp_swp = wr_atomic_cmp_swp; qp_ex->wr_atomic_fetch_add = wr_atomic_fetch_add; qp_ex->wr_set_sge = wr_set_sge_rc; qp_ex->wr_set_sge_list = wr_set_sge_list_rc; } static void fill_send_wr_ops_ud(struct ibv_qp_ex *qp_ex) { qp_ex->wr_send = wr_send_ud; qp_ex->wr_send_imm = wr_send_imm_ud; qp_ex->wr_set_ud_addr = wr_set_ud_addr; qp_ex->wr_set_inline_data = wr_set_inline_data_ud; qp_ex->wr_set_inline_data_list = wr_set_inline_data_list_ud; qp_ex->wr_set_sge = wr_set_sge_ud; qp_ex->wr_set_sge_list = wr_set_sge_list_ud; } static int fill_send_wr_ops(const struct ibv_qp_init_attr_ex *attr, struct ibv_qp_ex *qp_ex) { uint64_t ops = attr->send_ops_flags; qp_ex->wr_start = wr_start; qp_ex->wr_complete = wr_complete; qp_ex->wr_abort = wr_abort; switch (attr->qp_type) { case IBV_QPT_XRC_SEND: qp_ex->wr_set_xrc_srqn = wr_set_xrc_srqn; SWITCH_FALLTHROUGH; case IBV_QPT_RC: if (ops & ~HNS_SUPPORTED_SEND_OPS_FLAGS_RC_XRC) return -EOPNOTSUPP; fill_send_wr_ops_rc_xrc(qp_ex); break; case IBV_QPT_UD: if (ops & ~HNS_SUPPORTED_SEND_OPS_FLAGS_UD) return -EOPNOTSUPP; fill_send_wr_ops_ud(qp_ex); break; default: verbs_err(verbs_get_ctx(qp_ex->qp_base.context), "QP type %d not supported for qp_ex send ops.\n", attr->qp_type); return -EOPNOTSUPP; } return 0; } int hns_roce_attach_qp_ex_ops(struct ibv_qp_init_attr_ex *attr, struct hns_roce_qp *qp) { if (attr->comp_mask & IBV_QP_INIT_ATTR_SEND_OPS_FLAGS) { if (fill_send_wr_ops(attr, &qp->verbs_qp.qp_ex)) return -EOPNOTSUPP; qp->verbs_qp.comp_mask |= VERBS_QP_EX; } return 0; } const struct 
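/* HW version 2 ops table, registered with the generic hns provider layer */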
hns_roce_u_hw hns_roce_u_hw_v2 = { .hw_version = HNS_ROCE_HW_VER2, .hw_ops = { .poll_cq = hns_roce_u_v2_poll_cq, .req_notify_cq = hns_roce_u_v2_arm_cq, .post_send = hns_roce_u_v2_post_send, .post_recv = hns_roce_u_v2_post_recv, .modify_qp = hns_roce_u_v2_modify_qp, .destroy_qp = hns_roce_u_v2_destroy_qp, .post_srq_recv = hns_roce_u_v2_post_srq_recv, }, }; rdma-core-56.1/providers/hns/hns_roce_u_hw_v2.h000066400000000000000000000252321477342711600215450ustar00rootroot00000000000000/* * Copyright (c) 2016-2017 Hisilicon Limited. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef _HNS_ROCE_U_HW_V2_H #define _HNS_ROCE_U_HW_V2_H enum { CQE_FOR_SQ, CQE_FOR_RQ, }; #define HNS_ROCE_V2_CQ_DB_REQ_SOL 1 #define HNS_ROCE_V2_CQ_DB_REQ_NEXT 0 #define HNS_ROCE_SL_SHIFT 2 /* V2 REG DEFINITION */ #define ROCEE_VF_DB_CFG0_OFFSET 0x0230 #define HNS_ROCE_IDX_QUE_ENTRY_SZ 4 enum { HNS_ROCE_WQE_OP_SEND = 0x0, HNS_ROCE_WQE_OP_SEND_WITH_INV = 0x1, HNS_ROCE_WQE_OP_SEND_WITH_IMM = 0x2, HNS_ROCE_WQE_OP_RDMA_WRITE = 0x3, HNS_ROCE_WQE_OP_RDMA_WRITE_WITH_IMM = 0x4, HNS_ROCE_WQE_OP_RDMA_READ = 0x5, HNS_ROCE_WQE_OP_ATOMIC_COM_AND_SWAP = 0x6, HNS_ROCE_WQE_OP_ATOMIC_FETCH_AND_ADD = 0x7, HNS_ROCE_WQE_OP_ATOMIC_MASK_COMP_AND_SWAP = 0x8, HNS_ROCE_WQE_OP_ATOMIC_MASK_FETCH_AND_ADD = 0x9, HNS_ROCE_WQE_OP_FAST_REG_PMR = 0xa, HNS_ROCE_WQE_OP_BIND_MW_TYPE = 0xc, HNS_ROCE_WQE_OP_MASK = 0x1f }; enum { /* rq operations */ HNS_ROCE_RECV_OP_RDMA_WRITE_IMM = 0x0, HNS_ROCE_RECV_OP_SEND = 0x1, HNS_ROCE_RECV_OP_SEND_WITH_IMM = 0x2, HNS_ROCE_RECV_OP_SEND_WITH_INV = 0x3, }; enum { HNS_ROCE_SQ_OP_SEND = 0x0, HNS_ROCE_SQ_OP_SEND_WITH_INV = 0x1, HNS_ROCE_SQ_OP_SEND_WITH_IMM = 0x2, HNS_ROCE_SQ_OP_RDMA_WRITE = 0x3, HNS_ROCE_SQ_OP_RDMA_WRITE_WITH_IMM = 0x4, HNS_ROCE_SQ_OP_RDMA_READ = 0x5, HNS_ROCE_SQ_OP_ATOMIC_COMP_AND_SWAP = 0x6, HNS_ROCE_SQ_OP_ATOMIC_FETCH_AND_ADD = 0x7, HNS_ROCE_SQ_OP_ATOMIC_MASK_COMP_AND_SWAP = 0x8, HNS_ROCE_SQ_OP_ATOMIC_MASK_FETCH_AND_ADD = 0x9, HNS_ROCE_SQ_OP_FAST_REG_PMR = 0xa, HNS_ROCE_SQ_OP_BIND_MW = 0xc, }; enum { V2_CQ_OK = 0, V2_CQ_EMPTY = -1, V2_CQ_POLL_ERR = -2, }; enum { HNS_ROCE_V2_CQE_SUCCESS = 0x00, HNS_ROCE_V2_CQE_LOCAL_LENGTH_ERR = 0x01, HNS_ROCE_V2_CQE_LOCAL_QP_OP_ERR = 0x02, HNS_ROCE_V2_CQE_LOCAL_PROT_ERR = 0x04, HNS_ROCE_V2_CQE_WR_FLUSH_ERR = 0x05, HNS_ROCE_V2_CQE_MEM_MANAGERENT_OP_ERR = 0x06, HNS_ROCE_V2_CQE_BAD_RESP_ERR = 0x10, HNS_ROCE_V2_CQE_LOCAL_ACCESS_ERR = 0x11, HNS_ROCE_V2_CQE_REMOTE_INVAL_REQ_ERR = 0x12, HNS_ROCE_V2_CQE_REMOTE_ACCESS_ERR = 0x13, HNS_ROCE_V2_CQE_REMOTE_OP_ERR = 0x14, HNS_ROCE_V2_CQE_TRANSPORT_RETRY_EXC_ERR = 0x15, HNS_ROCE_V2_CQE_RNR_RETRY_EXC_ERR = 0x16, HNS_ROCE_V2_CQE_REMOTE_ABORTED_ERR = 0x22, HNS_ROCE_V2_CQE_GENERAL_ERR = 0x23, HNS_ROCE_V2_CQE_XRC_VIOLATION_ERR = 0x24, }; enum { HNS_ROCE_V2_SQ_DB, HNS_ROCE_V2_RQ_DB, HNS_ROCE_V2_SRQ_DB, HNS_ROCE_V2_CQ_DB_PTR, HNS_ROCE_V2_CQ_DB_NTR, }; enum hns_roce_wr_buf_type { WR_BUF_TYPE_POST_SEND, WR_BUF_TYPE_SEND_WR_OPS, }; struct hns_roce_db { __le32 byte_4; __le32 parameter; }; #define DB_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_db, h, l) #define DB_TAG DB_FIELD_LOC(23, 0) #define DB_CMD DB_FIELD_LOC(27, 24) #define DB_FLAG DB_FIELD_LOC(31, 31) #define DB_PI DB_FIELD_LOC(47, 32) #define DB_SL DB_FIELD_LOC(50, 48) #define DB_CQ_CI DB_FIELD_LOC(55, 32) #define DB_CQ_NOTIFY DB_FIELD_LOC(56, 56) #define DB_CQ_CMD_SN DB_FIELD_LOC(58, 57) #define RECORD_DB_CI_MASK GENMASK(23, 0) struct hns_roce_v2_cqe { __le32 byte_4; union { __le32 rkey; __le32 immtdata; }; __le32 byte_12; __le32 byte_16; __le32 byte_cnt; __le32 smac; __le32 byte_28; __le32 byte_32; __le32 payload[8]; }; #define CQE_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_v2_cqe, h, l) #define CQE_OPCODE CQE_FIELD_LOC(4, 0) #define CQE_RQ_INLINE CQE_FIELD_LOC(5, 5) #define CQE_S_R CQE_FIELD_LOC(6, 6) #define CQE_OWNER CQE_FIELD_LOC(7, 7) #define CQE_STATUS CQE_FIELD_LOC(15, 8) #define CQE_WQE_IDX CQE_FIELD_LOC(31, 16) #define CQE_RKEY_IMMTDATA CQE_FIELD_LOC(63, 32) #define CQE_XRC_SRQN CQE_FIELD_LOC(87, 64) #define CQE_CQE_INLINE CQE_FIELD_LOC(89, 88) #define CQE_LCL_QPN CQE_FIELD_LOC(119, 96) #define CQE_SUB_STATUS CQE_FIELD_LOC(127, 120) #define 
CQE_BYTE_CNT CQE_FIELD_LOC(159, 128) #define CQE_SMAC CQE_FIELD_LOC(207, 160) #define CQE_PORT_TYPE CQE_FIELD_LOC(209, 208) #define CQE_VID CQE_FIELD_LOC(221, 210) #define CQE_VID_VLD CQE_FIELD_LOC(222, 222) #define CQE_RSV2 CQE_FIELD_LOC(223, 223) #define CQE_RMT_QPN CQE_FIELD_LOC(247, 224) #define CQE_SL CQE_FIELD_LOC(250, 248) #define CQE_PORTN CQE_FIELD_LOC(253, 251) #define CQE_GRH CQE_FIELD_LOC(254, 254) #define CQE_LPK CQE_FIELD_LOC(255, 255) #define CQE_RSV3 CQE_FIELD_LOC(511, 256) struct hns_roce_rc_sq_wqe { __le32 byte_4; __le32 msg_len; union { __le32 inv_key; __le32 immtdata; __le32 new_rkey; }; __le32 byte_16; __le32 byte_20; __le32 rkey; __le64 va; }; #define RCWQE_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_rc_sq_wqe, h, l) #define RCWQE_OPCODE RCWQE_FIELD_LOC(4, 0) #define RCWQE_DB_SL_L RCWQE_FIELD_LOC(6, 5) #define RCWQE_SQPN_L RCWQE_FIELD_LOC(6, 5) #define RCWQE_OWNER RCWQE_FIELD_LOC(7, 7) #define RCWQE_CQE RCWQE_FIELD_LOC(8, 8) #define RCWQE_FENCE RCWQE_FIELD_LOC(9, 9) #define RCWQE_SO RCWQE_FIELD_LOC(10, 10) #define RCWQE_SE RCWQE_FIELD_LOC(11, 11) #define RCWQE_INLINE RCWQE_FIELD_LOC(12, 12) #define RCWQE_DB_SL_H RCWQE_FIELD_LOC(14, 13) #define RCWQE_WQE_IDX RCWQE_FIELD_LOC(30, 15) #define RCWQE_SQPN_H RCWQE_FIELD_LOC(30, 13) #define RCWQE_FLAG RCWQE_FIELD_LOC(31, 31) #define RCWQE_MSG_LEN RCWQE_FIELD_LOC(63, 32) #define RCWQE_INV_KEY_IMMTDATA RCWQE_FIELD_LOC(95, 64) #define RCWQE_XRC_SRQN RCWQE_FIELD_LOC(119, 96) #define RCWQE_SGE_NUM RCWQE_FIELD_LOC(127, 120) #define RCWQE_MSG_START_SGE_IDX RCWQE_FIELD_LOC(151, 128) #define RCWQE_REDUCE_CODE RCWQE_FIELD_LOC(158, 152) #define RCWQE_INLINE_TYPE RCWQE_FIELD_LOC(159, 159) #define RCWQE_RKEY RCWQE_FIELD_LOC(191, 160) #define RCWQE_VA_L RCWQE_FIELD_LOC(223, 192) #define RCWQE_VA_H RCWQE_FIELD_LOC(255, 224) #define RCWQE_LEN0 RCWQE_FIELD_LOC(287, 256) #define RCWQE_LKEY0 RCWQE_FIELD_LOC(319, 288) #define RCWQE_VA0_L RCWQE_FIELD_LOC(351, 320) #define RCWQE_VA0_H RCWQE_FIELD_LOC(383, 352) #define RCWQE_LEN1 RCWQE_FIELD_LOC(415, 384) #define RCWQE_LKEY1 RCWQE_FIELD_LOC(447, 416) #define RCWQE_VA1_L RCWQE_FIELD_LOC(479, 448) #define RCWQE_VA1_H RCWQE_FIELD_LOC(511, 480) #define RCWQE_MW_TYPE RCWQE_FIELD_LOC(256, 256) #define RCWQE_MW_RA_EN RCWQE_FIELD_LOC(258, 258) #define RCWQE_MW_RR_EN RCWQE_FIELD_LOC(259, 259) #define RCWQE_MW_RW_EN RCWQE_FIELD_LOC(260, 260) struct hns_roce_v2_wqe_data_seg { __le32 len; __le32 lkey; __le64 addr; }; struct hns_roce_v2_wqe_raddr_seg { __le32 rkey; __le32 len; __le64 raddr; }; struct hns_roce_wqe_atomic_seg { __le64 fetchadd_swap_data; __le64 cmp_data; }; int hns_roce_u_v2_post_send(struct ibv_qp *ibvqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr); static inline unsigned int is_std_atomic(unsigned int len) { return len == 8; } static inline unsigned int is_ext_atomic(unsigned int len) { return len == 16 || len == 32 || len == 64; } static inline unsigned int is_atomic(unsigned int len) { return is_std_atomic(len) || is_ext_atomic(len); } #define ATOMIC_DATA_LEN_MAX 64 #define ATOMIC_BUF_NUM_MAX 2 struct hns_roce_ud_sq_wqe { __le32 rsv_opcode; __le32 msg_len; __le32 immtdata; __le32 sge_num_pd; __le32 rsv_msg_start_sge_idx; __le32 udpspn_rsv; __le32 qkey; __le32 rsv_dqpn; __le32 tclass_vlan; __le32 lbi_flow_label; uint8_t dmac[ETH_ALEN]; uint8_t sgid_index; uint8_t smac_index; uint8_t dgid[HNS_ROCE_GID_SIZE]; }; #define UDWQE_FIELD_LOC(h, l) FIELD_LOC(struct hns_roce_ud_sq_wqe, h, l) #define UDWQE_OPCODE UDWQE_FIELD_LOC(4, 0) #define UDWQE_DB_SL_L UDWQE_FIELD_LOC(6, 5) #define UDWQE_OWNER 
UDWQE_FIELD_LOC(7, 7) #define UDWQE_CQE UDWQE_FIELD_LOC(8, 8) #define UDWQE_RSVD1 UDWQE_FIELD_LOC(10, 9) #define UDWQE_SE UDWQE_FIELD_LOC(11, 11) #define UDWQE_INLINE UDWQE_FIELD_LOC(12, 12) #define UDWQE_DB_SL_H UDWQE_FIELD_LOC(14, 13) #define UDWQE_WQE_IDX UDWQE_FIELD_LOC(30, 15) #define UDWQE_FLAG UDWQE_FIELD_LOC(31, 31) #define UDWQE_MSG_LEN UDWQE_FIELD_LOC(63, 32) #define UDWQE_IMMTDATA UDWQE_FIELD_LOC(95, 64) #define UDWQE_PD UDWQE_FIELD_LOC(119, 96) #define UDWQE_SGE_NUM UDWQE_FIELD_LOC(127, 120) #define UDWQE_MSG_START_SGE_IDX UDWQE_FIELD_LOC(151, 128) #define UDWQE_RSVD3 UDWQE_FIELD_LOC(158, 152) #define UDWQE_INLINE_TYPE UDWQE_FIELD_LOC(159, 159) #define UDWQE_RSVD4 UDWQE_FIELD_LOC(175, 160) #define UDWQE_UDPSPN UDWQE_FIELD_LOC(191, 176) #define UDWQE_QKEY UDWQE_FIELD_LOC(223, 192) #define UDWQE_DQPN UDWQE_FIELD_LOC(247, 224) #define UDWQE_RSVD5 UDWQE_FIELD_LOC(255, 248) #define UDWQE_VLAN UDWQE_FIELD_LOC(271, 256) #define UDWQE_HOPLIMIT UDWQE_FIELD_LOC(279, 272) #define UDWQE_TCLASS UDWQE_FIELD_LOC(287, 280) #define UDWQE_FLOW_LABEL UDWQE_FIELD_LOC(307, 288) #define UDWQE_SL UDWQE_FIELD_LOC(311, 308) #define UDWQE_PORTN UDWQE_FIELD_LOC(314, 312) #define UDWQE_RSVD6 UDWQE_FIELD_LOC(317, 315) #define UDWQE_UD_VLAN_EN UDWQE_FIELD_LOC(318, 318) #define UDWQE_LBI UDWQE_FIELD_LOC(319, 319) #define UDWQE_DMAC_L UDWQE_FIELD_LOC(351, 320) #define UDWQE_DMAC_H UDWQE_FIELD_LOC(367, 352) #define UDWQE_GMV_IDX UDWQE_FIELD_LOC(383, 368) #define UDWQE_DGID0 UDWQE_FIELD_LOC(415, 384) #define UDWQE_DGID1 UDWQE_FIELD_LOC(447, 416) #define UDWQE_DGID2 UDWQE_FIELD_LOC(479, 448) #define UDWQE_DGID3 UDWQE_FIELD_LOC(511, 480) #define UDWQE_INLINE_DATA_15_0 UDWQE_FIELD_LOC(63, 48) #define UDWQE_INLINE_DATA_23_16 UDWQE_FIELD_LOC(127, 120) #define UDWQE_INLINE_DATA_47_24 UDWQE_FIELD_LOC(151, 128) #define UDWQE_INLINE_DATA_63_48 UDWQE_FIELD_LOC(175, 160) #define MAX_SERVICE_LEVEL 0x7 void hns_roce_v2_clear_qp(struct hns_roce_context *ctx, struct hns_roce_qp *qp); void hns_roce_attach_cq_ex_ops(struct ibv_cq_ex *cq_ex, uint64_t wc_flags); int hns_roce_attach_qp_ex_ops(struct ibv_qp_init_attr_ex *attr, struct hns_roce_qp *qp); #endif /* _HNS_ROCE_U_HW_V2_H */ rdma-core-56.1/providers/hns/hns_roce_u_verbs.c000066400000000000000000001226421477342711600216370ustar00rootroot00000000000000/* * Copyright (c) 2016-2017 Hisilicon Limited. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include "hns_roce_u.h" #include "hns_roce_u_db.h" #include "hns_roce_u_hw_v2.h" static bool hns_roce_whether_need_lock(struct ibv_pd *pd) { struct hns_roce_pad *pad = to_hr_pad(pd); return !(pad && pad->td); } static int hns_roce_spinlock_init(struct hns_roce_spinlock *hr_lock, bool need_lock) { hr_lock->need_lock = need_lock; if (need_lock) return pthread_spin_init(&hr_lock->lock, PTHREAD_PROCESS_PRIVATE); return 0; } static int hns_roce_spinlock_destroy(struct hns_roce_spinlock *hr_lock) { if (hr_lock->need_lock) return pthread_spin_destroy(&hr_lock->lock); return 0; } void hns_roce_init_qp_indices(struct hns_roce_qp *qp) { qp->sq.head = 0; qp->sq.tail = 0; qp->rq.head = 0; qp->rq.tail = 0; qp->next_sge = 0; } int hns_roce_u_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct ib_uverbs_ex_query_device_resp resp; size_t resp_size = sizeof(resp); int ret; uint64_t raw_fw_ver; unsigned int major, minor, sub_minor; ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp, &resp_size); if (ret) return ret; raw_fw_ver = resp.base.fw_ver; major = (raw_fw_ver >> 32) & 0xffff; minor = (raw_fw_ver >> 16) & 0xffff; sub_minor = raw_fw_ver & 0xffff; snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver), "%u.%u.%03u", major, minor, sub_minor); return 0; } int hns_roce_u_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr) { struct ibv_query_port cmd; return ibv_cmd_query_port(context, port, attr, &cmd, sizeof(cmd)); } struct ibv_td *hns_roce_u_alloc_td(struct ibv_context *context, struct ibv_td_init_attr *attr) { struct hns_roce_td *td; if (attr->comp_mask) { errno = EOPNOTSUPP; return NULL; } td = calloc(1, sizeof(*td)); if (!td) { errno = ENOMEM; return NULL; } td->ibv_td.context = context; atomic_init(&td->refcount, 1); return &td->ibv_td; } int hns_roce_u_dealloc_td(struct ibv_td *ibv_td) { struct hns_roce_td *td; td = to_hr_td(ibv_td); if (atomic_load(&td->refcount) > 1) return EBUSY; free(td); return 0; } struct ibv_pd *hns_roce_u_alloc_pd(struct ibv_context *context) { struct hns_roce_alloc_pd_resp resp = {}; struct ibv_alloc_pd cmd; struct hns_roce_pd *pd; pd = calloc(1, sizeof(*pd)); if (!pd) { errno = ENOMEM; return NULL; } errno = ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (errno) goto err; atomic_init(&pd->refcount, 1); pd->pdn = resp.pdn; return &pd->ibv_pd; err: free(pd); return NULL; } struct ibv_pd *hns_roce_u_alloc_pad(struct ibv_context *context, struct ibv_parent_domain_init_attr *attr) { struct hns_roce_pad *pad; if (ibv_check_alloc_parent_domain(attr)) return NULL; if (attr->comp_mask) { errno = EOPNOTSUPP; return NULL; } pad = calloc(1, sizeof(*pad)); if (!pad) { errno = ENOMEM; return NULL; } if (attr->td) { pad->td = to_hr_td(attr->td); atomic_fetch_add(&pad->td->refcount, 1); } pad->pd.protection_domain = to_hr_pd(attr->pd); atomic_fetch_add(&pad->pd.protection_domain->refcount, 1); atomic_init(&pad->pd.refcount, 1); ibv_initialize_parent_domain(&pad->pd.ibv_pd, &pad->pd.protection_domain->ibv_pd); return &pad->pd.ibv_pd; } static void 
hns_roce_free_pad(struct hns_roce_pad *pad) { atomic_fetch_sub(&pad->pd.protection_domain->refcount, 1); if (pad->td) atomic_fetch_sub(&pad->td->refcount, 1); free(pad); } static int hns_roce_free_pd(struct hns_roce_pd *pd) { int ret; if (atomic_load(&pd->refcount) > 1) return EBUSY; ret = ibv_cmd_dealloc_pd(&pd->ibv_pd); if (ret) return ret; free(pd); return 0; } int hns_roce_u_dealloc_pd(struct ibv_pd *ibv_pd) { struct hns_roce_pad *pad = to_hr_pad(ibv_pd); struct hns_roce_pd *pd = to_hr_pd(ibv_pd); if (pad) { hns_roce_free_pad(pad); return 0; } return hns_roce_free_pd(pd); } struct ibv_xrcd *hns_roce_u_open_xrcd(struct ibv_context *context, struct ibv_xrcd_init_attr *xrcd_init_attr) { struct ib_uverbs_open_xrcd_resp resp = {}; struct ibv_open_xrcd cmd = {}; struct verbs_xrcd *xrcd; int ret; xrcd = calloc(1, sizeof(*xrcd)); if (!xrcd) return NULL; ret = ibv_cmd_open_xrcd(context, xrcd, sizeof(*xrcd), xrcd_init_attr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) { free(xrcd); return NULL; } return &xrcd->xrcd; } int hns_roce_u_close_xrcd(struct ibv_xrcd *ibv_xrcd) { struct verbs_xrcd *xrcd = container_of(ibv_xrcd, struct verbs_xrcd, xrcd); int ret; ret = ibv_cmd_close_xrcd(xrcd); if (!ret) free(xrcd); return ret; } struct ibv_mr *hns_roce_u_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access) { int ret; struct verbs_mr *vmr; struct ibv_reg_mr cmd; struct ib_uverbs_reg_mr_resp resp; if (!addr) { verbs_err(verbs_get_ctx(pd->context), "2nd parm addr is NULL!\n"); return NULL; } if (!length) { verbs_err(verbs_get_ctx(pd->context), "3st parm length is 0!\n"); return NULL; } vmr = malloc(sizeof(*vmr)); if (!vmr) return NULL; ret = ibv_cmd_reg_mr(pd, addr, length, hca_va, access, vmr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) { free(vmr); return NULL; } return &vmr->ibv_mr; } int hns_roce_u_rereg_mr(struct verbs_mr *vmr, int flags, struct ibv_pd *pd, void *addr, size_t length, int access) { struct ibv_rereg_mr cmd; struct ib_uverbs_rereg_mr_resp resp; return ibv_cmd_rereg_mr(vmr, flags, addr, length, (uintptr_t)addr, access, pd, &cmd, sizeof(cmd), &resp, sizeof(resp)); } int hns_roce_u_dereg_mr(struct verbs_mr *vmr) { int ret; ret = ibv_cmd_dereg_mr(vmr); if (ret) return ret; free(vmr); return ret; } int hns_roce_u_bind_mw(struct ibv_qp *qp, struct ibv_mw *mw, struct ibv_mw_bind *mw_bind) { struct ibv_mw_bind_info *bind_info = &mw_bind->bind_info; struct ibv_send_wr *bad_wr = NULL; struct ibv_send_wr wr = {}; int ret; if (bind_info->mw_access_flags & ~(IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_REMOTE_READ | IBV_ACCESS_REMOTE_ATOMIC)) return EINVAL; wr.opcode = IBV_WR_BIND_MW; wr.next = NULL; wr.wr_id = mw_bind->wr_id; wr.send_flags = mw_bind->send_flags; wr.bind_mw.mw = mw; wr.bind_mw.rkey = ibv_inc_rkey(mw->rkey); wr.bind_mw.bind_info = mw_bind->bind_info; ret = hns_roce_u_v2_post_send(qp, &wr, &bad_wr); if (ret) return ret; mw->rkey = wr.bind_mw.rkey; return 0; } struct ibv_mw *hns_roce_u_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type) { struct ibv_mw *mw; struct ibv_alloc_mw cmd = {}; struct ib_uverbs_alloc_mw_resp resp = {}; mw = malloc(sizeof(*mw)); if (!mw) return NULL; if (ibv_cmd_alloc_mw(pd, type, mw, &cmd, sizeof(cmd), &resp, sizeof(resp))) { free(mw); return NULL; } return mw; } int hns_roce_u_dealloc_mw(struct ibv_mw *mw) { int ret; ret = ibv_cmd_dealloc_mw(mw); if (ret) return ret; free(mw); return 0; } enum { CREATE_CQ_SUPPORTED_COMP_MASK = IBV_CQ_INIT_ATTR_MASK_FLAGS | IBV_CQ_INIT_ATTR_MASK_PD, }; enum { CREATE_CQ_SUPPORTED_WC_FLAGS = 
IBV_WC_STANDARD_FLAGS | IBV_WC_EX_WITH_CVLAN, }; static int verify_cq_create_attr(struct ibv_cq_init_attr_ex *attr, struct hns_roce_context *context) { struct hns_roce_pad *pad = to_hr_pad(attr->parent_domain); if (!attr->cqe || attr->cqe > context->max_cqe) { verbs_err(&context->ibv_ctx, "unsupported cq depth %u.\n", attr->cqe); return EINVAL; } if (!check_comp_mask(attr->comp_mask, CREATE_CQ_SUPPORTED_COMP_MASK)) { verbs_err(&context->ibv_ctx, "unsupported cq comps 0x%x\n", attr->comp_mask); return EOPNOTSUPP; } if (!check_comp_mask(attr->wc_flags, CREATE_CQ_SUPPORTED_WC_FLAGS)) { verbs_err(&context->ibv_ctx, "unsupported wc flags 0x%llx.\n", attr->wc_flags); return EOPNOTSUPP; } if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD) { if (!pad) { verbs_err(&context->ibv_ctx, "failed to check the pad of cq.\n"); return EINVAL; } atomic_fetch_add(&pad->pd.refcount, 1); } attr->cqe = max_t(uint32_t, HNS_ROCE_MIN_CQE_NUM, roundup_pow_of_two(attr->cqe)); return 0; } static int hns_roce_cq_spinlock_init(struct hns_roce_cq *cq, struct ibv_cq_init_attr_ex *attr) { bool need_lock = hns_roce_whether_need_lock(attr->parent_domain); return hns_roce_spinlock_init(&cq->hr_lock, need_lock); } static int hns_roce_srq_spinlock_init(struct hns_roce_srq *srq, struct ibv_srq_init_attr_ex *attr) { bool need_lock = hns_roce_whether_need_lock(attr->pd); return hns_roce_spinlock_init(&srq->hr_lock, need_lock); } static int hns_roce_alloc_cq_buf(struct hns_roce_cq *cq) { int buf_size = hr_hw_page_align(cq->cq_depth * cq->cqe_size); if (hns_roce_alloc_buf(&cq->buf, buf_size, HNS_HW_PAGE_SIZE)) return -ENOMEM; return 0; } static int exec_cq_create_cmd(struct ibv_context *context, struct hns_roce_cq *cq, struct ibv_cq_init_attr_ex *attr) { struct hns_roce_create_cq_ex_resp resp_ex = {}; struct hns_roce_ib_create_cq_resp *resp_drv; struct hns_roce_create_cq_ex cmd_ex = {}; struct hns_roce_ib_create_cq *cmd_drv; int ret; cmd_drv = &cmd_ex.drv_payload; resp_drv = &resp_ex.drv_payload; cmd_drv->buf_addr = (uintptr_t)cq->buf.buf; cmd_drv->db_addr = (uintptr_t)cq->db; cmd_drv->cqe_size = (uintptr_t)cq->cqe_size; ret = ibv_cmd_create_cq_ex(context, attr, &cq->verbs_cq, &cmd_ex.ibv_cmd, sizeof(cmd_ex), &resp_ex.ibv_resp, sizeof(resp_ex), 0); if (ret) { verbs_err(verbs_get_ctx(context), "failed to exec create cq cmd, ret = %d.\n", ret); return ret; } cq->cqn = resp_drv->cqn; cq->flags = resp_drv->cap_flags; return 0; } static struct ibv_cq_ex *create_cq(struct ibv_context *context, struct ibv_cq_init_attr_ex *attr) { struct hns_roce_context *hr_ctx = to_hr_ctx(context); struct hns_roce_cq *cq; int ret; ret = verify_cq_create_attr(attr, hr_ctx); if (ret) goto err; cq = calloc(1, sizeof(*cq)); if (!cq) { errno = ENOMEM; goto err; } if (attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD) cq->parent_domain = attr->parent_domain; ret = hns_roce_cq_spinlock_init(cq, attr); if (ret) goto err_lock; cq->cq_depth = attr->cqe; cq->cqe_size = hr_ctx->cqe_size; ret = hns_roce_alloc_cq_buf(cq); if (ret) goto err_buf; cq->db = hns_roce_alloc_db(hr_ctx, HNS_ROCE_CQ_TYPE_DB); if (!cq->db) { ret = ENOMEM; goto err_db; } ret = exec_cq_create_cmd(context, cq, attr); if (ret) goto err_cmd; cq->arm_sn = 1; return &cq->verbs_cq.cq_ex; err_cmd: hns_roce_free_db(hr_ctx, cq->db, HNS_ROCE_CQ_TYPE_DB); err_db: hns_roce_free_buf(&cq->buf); err_buf: hns_roce_spinlock_destroy(&cq->hr_lock); err_lock: free(cq); err: if (ret < 0) ret = -ret; errno = ret; return NULL; } struct ibv_cq *hns_roce_u_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel 
*channel, int comp_vector) { struct ibv_cq_ex *cq; struct ibv_cq_init_attr_ex attr = { .cqe = cqe, .channel = channel, .comp_vector = comp_vector, }; cq = create_cq(context, &attr); return cq ? ibv_cq_ex_to_cq(cq) : NULL; } struct ibv_cq_ex *hns_roce_u_create_cq_ex(struct ibv_context *context, struct ibv_cq_init_attr_ex *attr) { struct ibv_cq_ex *cq; cq = create_cq(context, attr); if (cq) hns_roce_attach_cq_ex_ops(cq, attr->wc_flags); return cq; } void hns_roce_u_cq_event(struct ibv_cq *cq) { to_hr_cq(cq)->arm_sn++; } int hns_roce_u_modify_cq(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr) { struct ibv_modify_cq cmd = {}; return ibv_cmd_modify_cq(cq, attr, &cmd, sizeof(cmd)); } int hns_roce_u_destroy_cq(struct ibv_cq *cq) { struct hns_roce_cq *hr_cq = to_hr_cq(cq); struct hns_roce_pad *pad = to_hr_pad(hr_cq->parent_domain); int ret; ret = ibv_cmd_destroy_cq(cq); if (ret) return ret; hns_roce_free_db(to_hr_ctx(cq->context), hr_cq->db, HNS_ROCE_CQ_TYPE_DB); hns_roce_free_buf(&hr_cq->buf); hns_roce_spinlock_destroy(&hr_cq->hr_lock); if (pad) atomic_fetch_sub(&pad->pd.refcount, 1); free(hr_cq); return ret; } static int hns_roce_store_srq(struct hns_roce_context *ctx, struct hns_roce_srq *srq) { uint32_t tind = to_hr_srq_table_index(srq->srqn, ctx); pthread_mutex_lock(&ctx->srq_table_mutex); if (!ctx->srq_table[tind].refcnt) { ctx->srq_table[tind].table = calloc(ctx->srq_table_mask + 1, sizeof(struct hns_roce_srq *)); if (!ctx->srq_table[tind].table) { pthread_mutex_unlock(&ctx->srq_table_mutex); return -ENOMEM; } } ++ctx->srq_table[tind].refcnt; ctx->srq_table[tind].table[srq->srqn & ctx->srq_table_mask] = srq; pthread_mutex_unlock(&ctx->srq_table_mutex); return 0; } struct hns_roce_srq *hns_roce_find_srq(struct hns_roce_context *ctx, uint32_t srqn) { uint32_t tind = to_hr_srq_table_index(srqn, ctx); if (ctx->srq_table[tind].refcnt) return ctx->srq_table[tind].table[srqn & ctx->srq_table_mask]; else return NULL; } static void hns_roce_clear_srq(struct hns_roce_context *ctx, uint32_t srqn) { uint32_t tind = to_hr_srq_table_index(srqn, ctx); pthread_mutex_lock(&ctx->srq_table_mutex); if (!--ctx->srq_table[tind].refcnt) free(ctx->srq_table[tind].table); else ctx->srq_table[tind].table[srqn & ctx->srq_table_mask] = NULL; pthread_mutex_unlock(&ctx->srq_table_mutex); } static int verify_srq_create_attr(struct hns_roce_context *context, struct ibv_srq_init_attr_ex *attr) { if (attr->srq_type != IBV_SRQT_BASIC && attr->srq_type != IBV_SRQT_XRC) { verbs_err(&context->ibv_ctx, "unsupported srq type, type = %d.\n", attr->srq_type); return -EINVAL; } if (!attr->attr.max_sge || attr->attr.max_wr > context->max_srq_wr || attr->attr.max_sge > context->max_srq_sge) { verbs_err(&context->ibv_ctx, "invalid srq attr size, max_wr = %u, max_sge = %u.\n", attr->attr.max_wr, attr->attr.max_sge); return -EINVAL; } attr->attr.max_wr = max_t(uint32_t, attr->attr.max_wr, HNS_ROCE_MIN_SRQ_WQE_NUM); return 0; } static void set_srq_param(struct ibv_context *context, struct hns_roce_srq *srq, struct ibv_srq_init_attr_ex *attr) { if (to_hr_dev(context->device)->hw_version == HNS_ROCE_HW_VER2) srq->rsv_sge = 1; srq->wqe_cnt = roundup_pow_of_two(attr->attr.max_wr); srq->max_gs = roundup_pow_of_two(attr->attr.max_sge + srq->rsv_sge); srq->wqe_shift = hr_ilog32(roundup_pow_of_two(HNS_ROCE_SGE_SIZE * srq->max_gs)); attr->attr.max_sge = srq->max_gs; attr->attr.srq_limit = 0; } static int alloc_srq_idx_que(struct hns_roce_srq *srq) { struct hns_roce_idx_que *idx_que = &srq->idx_que; unsigned int buf_size; int i; 
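	/*
	 * The SRQ index queue holds one HNS_ROCE_IDX_QUE_ENTRY_SZ-byte entry
	 * per WQE and is shared with the hardware, while the bitmap tracks
	 * which indices are free from userspace: one bit per WQE, rounded up
	 * to whole unsigned longs, with every bit initially set (all free).
	 */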
idx_que->entry_shift = hr_ilog32(HNS_ROCE_IDX_QUE_ENTRY_SZ); idx_que->bitmap_cnt = align(srq->wqe_cnt, BIT_CNT_PER_LONG) / BIT_CNT_PER_LONG; idx_que->bitmap = calloc(idx_que->bitmap_cnt, sizeof(unsigned long)); if (!idx_que->bitmap) return -ENOMEM; buf_size = to_hr_hem_entries_size(srq->wqe_cnt, idx_que->entry_shift); if (hns_roce_alloc_buf(&idx_que->buf, buf_size, HNS_HW_PAGE_SIZE)) { free(idx_que->bitmap); idx_que->bitmap = NULL; return -ENOMEM; } /* init the idx_que bitmap */ for (i = 0; i < idx_que->bitmap_cnt; ++i) idx_que->bitmap[i] = ~(0UL); idx_que->head = 0; idx_que->tail = 0; return 0; } static int alloc_srq_wqe_buf(struct hns_roce_srq *srq) { int buf_size = to_hr_hem_entries_size(srq->wqe_cnt, srq->wqe_shift); return hns_roce_alloc_buf(&srq->wqe_buf, buf_size, HNS_HW_PAGE_SIZE); } static int alloc_recv_rinl_buf(uint32_t max_sge, struct hns_roce_rinl_buf *rinl_buf); static void free_recv_rinl_buf(struct hns_roce_rinl_buf *rinl_buf); static int alloc_srq_buf(struct hns_roce_srq *srq) { int ret; ret = alloc_srq_idx_que(srq); if (ret) return ret; ret = alloc_srq_wqe_buf(srq); if (ret) goto err_idx_que; srq->wrid = calloc(srq->wqe_cnt, sizeof(*srq->wrid)); if (!srq->wrid) { ret = -ENOMEM; goto err_wqe_buf; } return 0; err_wqe_buf: hns_roce_free_buf(&srq->wqe_buf); err_idx_que: hns_roce_free_buf(&srq->idx_que.buf); free(srq->idx_que.bitmap); return ret; } static void free_srq_buf(struct hns_roce_srq *srq) { free(srq->wrid); hns_roce_free_buf(&srq->wqe_buf); hns_roce_free_buf(&srq->idx_que.buf); free(srq->idx_que.bitmap); } static int exec_srq_create_cmd(struct ibv_context *context, struct hns_roce_srq *srq, struct ibv_srq_init_attr_ex *init_attr) { struct hns_roce_create_srq_ex_resp resp_ex = {}; struct hns_roce_create_srq_ex cmd_ex = {}; int ret; cmd_ex.buf_addr = (uintptr_t)srq->wqe_buf.buf; cmd_ex.que_addr = (uintptr_t)srq->idx_que.buf.buf; cmd_ex.db_addr = (uintptr_t)srq->rdb; cmd_ex.req_cap_flags |= HNS_ROCE_SRQ_CAP_RECORD_DB; ret = ibv_cmd_create_srq_ex(context, &srq->verbs_srq, init_attr, &cmd_ex.ibv_cmd, sizeof(cmd_ex), &resp_ex.ibv_resp, sizeof(resp_ex)); if (ret) { verbs_err(verbs_get_ctx(context), "failed to exec create srq cmd, ret = %d.\n", ret); return ret; } srq->srqn = resp_ex.srqn; srq->cap_flags = resp_ex.cap_flags; return 0; } static struct ibv_srq *create_srq(struct ibv_context *context, struct ibv_srq_init_attr_ex *init_attr) { struct hns_roce_context *hr_ctx = to_hr_ctx(context); struct hns_roce_pad *pad = to_hr_pad(init_attr->pd); struct hns_roce_srq *srq; int ret; ret = verify_srq_create_attr(hr_ctx, init_attr); if (ret) goto err; srq = calloc(1, sizeof(*srq)); if (!srq) { ret = -ENOMEM; goto err; } if (pad) atomic_fetch_add(&pad->pd.refcount, 1); if (hns_roce_srq_spinlock_init(srq, init_attr)) goto err_free_srq; set_srq_param(context, srq, init_attr); if (alloc_srq_buf(srq)) goto err_destroy_lock; srq->rdb = hns_roce_alloc_db(hr_ctx, HNS_ROCE_SRQ_TYPE_DB); if (!srq->rdb) goto err_srq_buf; ret = exec_srq_create_cmd(context, srq, init_attr); if (ret) goto err_srq_db; ret = hns_roce_store_srq(hr_ctx, srq); if (ret) goto err_destroy_srq; srq->max_gs = init_attr->attr.max_sge; init_attr->attr.max_sge = min(init_attr->attr.max_sge - srq->rsv_sge, hr_ctx->max_srq_sge); return &srq->verbs_srq.srq; err_destroy_srq: ibv_cmd_destroy_srq(&srq->verbs_srq.srq); err_srq_db: hns_roce_free_db(hr_ctx, srq->rdb, HNS_ROCE_SRQ_TYPE_DB); err_srq_buf: free_srq_buf(srq); err_destroy_lock: hns_roce_spinlock_destroy(&srq->hr_lock); err_free_srq: free(srq); err: if (ret < 0) ret = -ret; 
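	/*
	 * The verify/alloc helpers above return negative errno values while
	 * the command helpers return positive ones; the sign is folded here
	 * so that a plain positive errno is reported to the caller.
	 */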
errno = ret; return NULL; } struct ibv_srq *hns_roce_u_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr) { struct ibv_srq_init_attr_ex attrx = {}; struct ibv_srq *srq; memcpy(&attrx, attr, sizeof(*attr)); attrx.comp_mask = IBV_SRQ_INIT_ATTR_PD; attrx.pd = pd; srq = create_srq(pd->context, &attrx); if (srq) memcpy(attr, &attrx, sizeof(*attr)); return srq; } struct ibv_srq *hns_roce_u_create_srq_ex(struct ibv_context *context, struct ibv_srq_init_attr_ex *attr) { return create_srq(context, attr); } int hns_roce_u_get_srq_num(struct ibv_srq *ibv_srq, uint32_t *srq_num) { *srq_num = to_hr_srq(ibv_srq)->srqn; return 0; } int hns_roce_u_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr, int srq_attr_mask) { struct ibv_modify_srq cmd; return ibv_cmd_modify_srq(srq, srq_attr, srq_attr_mask, &cmd, sizeof(cmd)); } int hns_roce_u_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *srq_attr) { struct ibv_query_srq cmd; int ret; ret = ibv_cmd_query_srq(srq, srq_attr, &cmd, sizeof(cmd)); srq_attr->max_sge -= to_hr_srq(srq)->rsv_sge; return ret; } int hns_roce_u_destroy_srq(struct ibv_srq *ibv_srq) { struct hns_roce_context *ctx = to_hr_ctx(ibv_srq->context); struct hns_roce_pad *pad = to_hr_pad(ibv_srq->pd); struct hns_roce_srq *srq = to_hr_srq(ibv_srq); int ret; ret = ibv_cmd_destroy_srq(ibv_srq); if (ret) return ret; hns_roce_clear_srq(ctx, srq->srqn); hns_roce_free_db(ctx, srq->rdb, HNS_ROCE_SRQ_TYPE_DB); free_srq_buf(srq); hns_roce_spinlock_destroy(&srq->hr_lock); if (pad) atomic_fetch_sub(&pad->pd.refcount, 1); free(srq); return 0; } enum { HNSDV_QP_SUP_COMP_MASK = HNSDV_QP_INIT_ATTR_MASK_QP_CONGEST_TYPE, }; static int check_hnsdv_qp_attr(struct hns_roce_context *ctx, struct hnsdv_qp_init_attr *hns_attr) { if (!hns_attr) return 0; if (!check_comp_mask(hns_attr->comp_mask, HNSDV_QP_SUP_COMP_MASK)) { verbs_err(&ctx->ibv_ctx, "invalid hnsdv comp_mask 0x%x.\n", hns_attr->comp_mask); return EINVAL; } return 0; } enum { CREATE_QP_SUP_COMP_MASK = IBV_QP_INIT_ATTR_PD | IBV_QP_INIT_ATTR_XRCD | IBV_QP_INIT_ATTR_SEND_OPS_FLAGS, }; static int check_qp_create_mask(struct hns_roce_context *ctx, struct ibv_qp_init_attr_ex *attr) { struct hns_roce_device *hr_dev = to_hr_dev(ctx->ibv_ctx.context.device); int ret = 0; if (!check_comp_mask(attr->comp_mask, CREATE_QP_SUP_COMP_MASK)) { ret = EOPNOTSUPP; goto out; } switch (attr->qp_type) { case IBV_QPT_UD: if (hr_dev->hw_version == HNS_ROCE_HW_VER2) return EINVAL; SWITCH_FALLTHROUGH; case IBV_QPT_RC: case IBV_QPT_XRC_SEND: if (!(attr->comp_mask & IBV_QP_INIT_ATTR_PD)) ret = EINVAL; break; case IBV_QPT_XRC_RECV: if (!(attr->comp_mask & IBV_QP_INIT_ATTR_XRCD)) ret = EINVAL; break; default: return EOPNOTSUPP; } out: if (ret) verbs_err(&ctx->ibv_ctx, "invalid comp_mask 0x%x.\n", attr->comp_mask); return ret; } static int hns_roce_qp_has_rq(struct ibv_qp_init_attr_ex *attr) { if (attr->qp_type == IBV_QPT_XRC_SEND || attr->qp_type == IBV_QPT_XRC_RECV || attr->srq) return 0; return 1; } static int verify_qp_create_cap(struct hns_roce_context *ctx, struct ibv_qp_init_attr_ex *attr) { struct ibv_qp_cap *cap = &attr->cap; uint32_t min_wqe_num; int has_rq; if (!cap->max_send_wr && attr->qp_type != IBV_QPT_XRC_RECV) return -EINVAL; if (cap->max_send_wr > ctx->max_qp_wr || cap->max_recv_wr > ctx->max_qp_wr || cap->max_send_sge > ctx->max_sge || cap->max_recv_sge > ctx->max_sge) { verbs_err(&ctx->ibv_ctx, "invalid qp cap size, max_send/recv_wr = {%u, %u}, max_send/recv_sge = {%u, %u}.\n", cap->max_send_wr, cap->max_recv_wr, cap->max_send_sge, 
cap->max_recv_sge); return -EINVAL; } has_rq = hns_roce_qp_has_rq(attr); if (!has_rq) { cap->max_recv_wr = 0; cap->max_recv_sge = 0; } min_wqe_num = HNS_ROCE_V2_MIN_WQE_NUM; if (cap->max_send_wr < min_wqe_num) { verbs_debug(&ctx->ibv_ctx, "change sq depth from %u to minimum %u.\n", cap->max_send_wr, min_wqe_num); cap->max_send_wr = min_wqe_num; } if (cap->max_recv_wr) { if (cap->max_recv_wr < min_wqe_num) { verbs_debug(&ctx->ibv_ctx, "change rq depth from %u to minimum %u.\n", cap->max_recv_wr, min_wqe_num); cap->max_recv_wr = min_wqe_num; } if (!cap->max_recv_sge) return -EINVAL; } return 0; } static int verify_qp_create_attr(struct hns_roce_context *ctx, struct ibv_qp_init_attr_ex *attr, struct hnsdv_qp_init_attr *hns_attr) { int ret; ret = check_qp_create_mask(ctx, attr); if (ret) return ret; ret = check_hnsdv_qp_attr(ctx, hns_attr); if (ret) return ret; return verify_qp_create_cap(ctx, attr); } static int hns_roce_qp_spinlock_init(struct ibv_qp_init_attr_ex *attr, struct hns_roce_qp *qp) { bool need_lock = hns_roce_whether_need_lock(attr->pd); int ret; ret = hns_roce_spinlock_init(&qp->sq.hr_lock, need_lock); if (ret) return ret; ret = hns_roce_spinlock_init(&qp->rq.hr_lock, need_lock); if (ret) hns_roce_spinlock_destroy(&qp->sq.hr_lock); return ret; } void hns_roce_qp_spinlock_destroy(struct hns_roce_qp *qp) { hns_roce_spinlock_destroy(&qp->rq.hr_lock); hns_roce_spinlock_destroy(&qp->sq.hr_lock); } static int alloc_recv_rinl_buf(uint32_t max_sge, struct hns_roce_rinl_buf *rinl_buf) { unsigned int cnt; int i; cnt = rinl_buf->wqe_cnt; rinl_buf->wqe_list = calloc(cnt, sizeof(struct hns_roce_rinl_wqe)); if (!rinl_buf->wqe_list) return ENOMEM; rinl_buf->wqe_list[0].sg_list = calloc(cnt * max_sge, sizeof(struct ibv_sge)); if (!rinl_buf->wqe_list[0].sg_list) { free(rinl_buf->wqe_list); return ENOMEM; } for (i = 0; i < cnt; i++) { int wqe_size = i * max_sge; rinl_buf->wqe_list[i].sg_list = &rinl_buf->wqe_list[0].sg_list[wqe_size]; } return 0; } static void free_recv_rinl_buf(struct hns_roce_rinl_buf *rinl_buf) { if (rinl_buf->wqe_list) { if (rinl_buf->wqe_list[0].sg_list) { free(rinl_buf->wqe_list[0].sg_list); rinl_buf->wqe_list[0].sg_list = NULL; } free(rinl_buf->wqe_list); rinl_buf->wqe_list = NULL; } } static int calc_qp_buff_size(struct hns_roce_device *hr_dev, struct hns_roce_qp *qp) { struct hns_roce_wq *sq = &qp->sq; struct hns_roce_wq *rq = &qp->rq; unsigned int size; qp->buf_size = 0; /* SQ WQE */ sq->offset = 0; size = to_hr_hem_entries_size(sq->wqe_cnt, sq->wqe_shift); qp->buf_size += size; /* extend SGE WQE in SQ */ qp->ex_sge.offset = qp->buf_size; if (qp->ex_sge.sge_cnt > 0) { size = to_hr_hem_entries_size(qp->ex_sge.sge_cnt, qp->ex_sge.sge_shift); qp->buf_size += size; } /* RQ WQE */ rq->offset = qp->buf_size; size = to_hr_hem_entries_size(rq->wqe_cnt, rq->wqe_shift); qp->buf_size += size; if (qp->buf_size < 1) return EINVAL; return 0; } static void qp_free_wqe(struct hns_roce_qp *qp) { free_recv_rinl_buf(&qp->rq_rinl_buf); if (qp->sq.wqe_cnt) free(qp->sq.wrid); if (qp->rq.wqe_cnt) free(qp->rq.wrid); hns_roce_free_buf(&qp->buf); } static int qp_alloc_wqe(struct ibv_qp_cap *cap, struct hns_roce_qp *qp, struct hns_roce_context *ctx) { struct hns_roce_device *hr_dev = to_hr_dev(ctx->ibv_ctx.context.device); if (calc_qp_buff_size(hr_dev, qp)) return -EINVAL; qp->sq.wrid = malloc(qp->sq.wqe_cnt * sizeof(uint64_t)); if (!qp->sq.wrid) return -ENOMEM; if (qp->rq.wqe_cnt) { qp->rq.wrid = malloc(qp->rq.wqe_cnt * sizeof(uint64_t)); if (!qp->rq.wrid) goto err_alloc; } if 
(qp->rq_rinl_buf.wqe_cnt) { if (alloc_recv_rinl_buf(cap->max_recv_sge, &qp->rq_rinl_buf)) goto err_alloc; } if (hns_roce_alloc_buf(&qp->buf, qp->buf_size, HNS_HW_PAGE_SIZE)) goto err_alloc; return 0; err_alloc: free_recv_rinl_buf(&qp->rq_rinl_buf); if (qp->rq.wrid) free(qp->rq.wrid); if (qp->sq.wrid) free(qp->sq.wrid); return -ENOMEM; } /** * Calculated sge num according to attr's max_send_sge */ static unsigned int get_sge_num_from_max_send_sge(bool is_ud, uint32_t max_send_sge) { unsigned int std_sge_num; unsigned int min_sge; std_sge_num = is_ud ? 0 : HNS_ROCE_SGE_IN_WQE; min_sge = is_ud ? 1 : 0; return max_send_sge > std_sge_num ? (max_send_sge - std_sge_num) : min_sge; } /** * Calculated sge num according to attr's max_inline_data */ static unsigned int get_sge_num_from_max_inl_data(bool is_ud, uint32_t max_inline_data) { unsigned int inline_sge = 0; inline_sge = max_inline_data / HNS_ROCE_SGE_SIZE; /* * if max_inline_data less than * HNS_ROCE_SGE_IN_WQE * HNS_ROCE_SGE_SIZE, * In addition to ud's mode, no need to extend sge. */ if (!is_ud && inline_sge <= HNS_ROCE_SGE_IN_WQE) inline_sge = 0; return inline_sge; } static void set_ext_sge_param(struct hns_roce_context *ctx, struct ibv_qp_init_attr_ex *attr, struct hns_roce_qp *qp, unsigned int wr_cnt) { bool is_ud = (qp->verbs_qp.qp.qp_type == IBV_QPT_UD); unsigned int ext_wqe_sge_cnt; unsigned int inline_ext_sge; unsigned int total_sge_cnt; unsigned int std_sge_num; qp->ex_sge.sge_shift = HNS_ROCE_SGE_SHIFT; std_sge_num = is_ud ? 0 : HNS_ROCE_SGE_IN_WQE; ext_wqe_sge_cnt = get_sge_num_from_max_send_sge(is_ud, attr->cap.max_send_sge); if (ctx->config & HNS_ROCE_RSP_EXSGE_FLAGS) { attr->cap.max_inline_data = min_t(uint32_t, roundup_pow_of_two( attr->cap.max_inline_data), ctx->max_inline_data); inline_ext_sge = max(ext_wqe_sge_cnt, get_sge_num_from_max_inl_data(is_ud, attr->cap.max_inline_data)); qp->sq.ext_sge_cnt = inline_ext_sge ? roundup_pow_of_two(inline_ext_sge) : 0; qp->sq.max_gs = min((qp->sq.ext_sge_cnt + std_sge_num), ctx->max_sge); ext_wqe_sge_cnt = qp->sq.ext_sge_cnt; } else { qp->sq.max_gs = max(1U, attr->cap.max_send_sge); qp->sq.max_gs = min(qp->sq.max_gs, ctx->max_sge); qp->sq.ext_sge_cnt = qp->sq.max_gs; } /* If the number of extended sge is not zero, they MUST use the * space of HNS_HW_PAGE_SIZE at least. 
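	 * In other words, the sge_cnt computed below is clamped up to at
	 * least HNS_HW_PAGE_SIZE / HNS_ROCE_SGE_SIZE entries, i.e. one full
	 * hardware page of extended SGEs.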
*/ if (ext_wqe_sge_cnt) { total_sge_cnt = roundup_pow_of_two(wr_cnt * ext_wqe_sge_cnt); qp->ex_sge.sge_cnt = max(total_sge_cnt, (unsigned int)HNS_HW_PAGE_SIZE / HNS_ROCE_SGE_SIZE); } } static void hns_roce_set_qp_params(struct ibv_qp_init_attr_ex *attr, struct hns_roce_qp *qp, struct hns_roce_context *ctx) { struct hns_roce_device *hr_dev = to_hr_dev(ctx->ibv_ctx.context.device); unsigned int cnt; qp->verbs_qp.qp.qp_type = attr->qp_type; if (attr->cap.max_recv_wr) { if (hr_dev->hw_version == HNS_ROCE_HW_VER2) qp->rq.rsv_sge = 1; qp->rq.max_gs = roundup_pow_of_two(attr->cap.max_recv_sge + qp->rq.rsv_sge); qp->rq.wqe_shift = hr_ilog32(HNS_ROCE_SGE_SIZE * qp->rq.max_gs); cnt = roundup_pow_of_two(attr->cap.max_recv_wr); qp->rq.wqe_cnt = cnt; qp->rq.shift = hr_ilog32(cnt); if (ctx->config & (HNS_ROCE_RSP_RQ_INLINE_FLAGS | HNS_ROCE_RSP_CQE_INLINE_FLAGS)) qp->rq_rinl_buf.wqe_cnt = cnt; attr->cap.max_recv_wr = qp->rq.wqe_cnt; attr->cap.max_recv_sge = qp->rq.max_gs; } if (attr->cap.max_send_wr) { qp->sq.wqe_shift = HNS_ROCE_SQWQE_SHIFT; cnt = roundup_pow_of_two(attr->cap.max_send_wr); qp->sq.wqe_cnt = cnt; qp->sq.shift = hr_ilog32(cnt); set_ext_sge_param(ctx, attr, qp, cnt); qp->sq.max_post = min(ctx->max_qp_wr, cnt); qp->sq_signal_bits = attr->sq_sig_all ? 0 : 1; attr->cap.max_send_wr = qp->sq.max_post; } } static void qp_free_db(struct hns_roce_qp *qp, struct hns_roce_context *ctx) { if (qp->sdb) hns_roce_free_db(ctx, qp->sdb, HNS_ROCE_QP_TYPE_DB); if (qp->rdb) hns_roce_free_db(ctx, qp->rdb, HNS_ROCE_QP_TYPE_DB); } static int qp_alloc_db(struct ibv_qp_init_attr_ex *attr, struct hns_roce_qp *qp, struct hns_roce_context *ctx) { if (attr->cap.max_send_wr) { qp->sdb = hns_roce_alloc_db(ctx, HNS_ROCE_QP_TYPE_DB); if (!qp->sdb) return -ENOMEM; } if (attr->cap.max_recv_sge) { qp->rdb = hns_roce_alloc_db(ctx, HNS_ROCE_QP_TYPE_DB); if (!qp->rdb) { if (qp->sdb) hns_roce_free_db(ctx, qp->sdb, HNS_ROCE_QP_TYPE_DB); return -ENOMEM; } } return 0; } static int hns_roce_store_qp(struct hns_roce_context *ctx, struct hns_roce_qp *qp) { uint32_t qpn = qp->verbs_qp.qp.qp_num; uint32_t tind = to_hr_qp_table_index(qpn, ctx); pthread_mutex_lock(&ctx->qp_table_mutex); if (!ctx->qp_table[tind].refcnt) { ctx->qp_table[tind].table = calloc(ctx->qp_table_mask + 1, sizeof(struct hns_roce_qp *)); if (!ctx->qp_table[tind].table) { pthread_mutex_unlock(&ctx->qp_table_mutex); return -ENOMEM; } } ++qp->refcnt; ++ctx->qp_table[tind].refcnt; ctx->qp_table[tind].table[qpn & ctx->qp_table_mask] = qp; pthread_mutex_unlock(&ctx->qp_table_mutex); return 0; } static int to_cmd_cong_type(uint8_t cong_type, __u64 *cmd_cong_type) { switch (cong_type) { case HNSDV_QP_CREATE_ENABLE_DCQCN: *cmd_cong_type = HNS_ROCE_CREATE_QP_FLAGS_DCQCN; break; case HNSDV_QP_CREATE_ENABLE_LDCP: *cmd_cong_type = HNS_ROCE_CREATE_QP_FLAGS_LDCP; break; case HNSDV_QP_CREATE_ENABLE_HC3: *cmd_cong_type = HNS_ROCE_CREATE_QP_FLAGS_HC3; break; case HNSDV_QP_CREATE_ENABLE_DIP: *cmd_cong_type = HNS_ROCE_CREATE_QP_FLAGS_DIP; break; default: return EINVAL; } return 0; } static int qp_exec_create_cmd(struct ibv_qp_init_attr_ex *attr, struct hns_roce_qp *qp, struct hns_roce_context *ctx, uint64_t *dwqe_mmap_key, struct hnsdv_qp_init_attr *hns_attr) { struct hns_roce_create_qp_ex_resp resp_ex = {}; struct hns_roce_create_qp_ex cmd_ex = {}; int ret; cmd_ex.sdb_addr = (uintptr_t)qp->sdb; cmd_ex.db_addr = (uintptr_t)qp->rdb; cmd_ex.buf_addr = (uintptr_t)qp->buf.buf; cmd_ex.log_sq_stride = qp->sq.wqe_shift; cmd_ex.log_sq_bb_count = hr_ilog32(qp->sq.wqe_cnt); if (hns_attr && 
hns_attr->comp_mask & HNSDV_QP_INIT_ATTR_MASK_QP_CONGEST_TYPE) { ret = to_cmd_cong_type(hns_attr->congest_type, &cmd_ex.cong_type_flags); if (ret) return ret; cmd_ex.comp_mask |= HNS_ROCE_CREATE_QP_MASK_CONGEST_TYPE; } ret = ibv_cmd_create_qp_ex2(&ctx->ibv_ctx.context, &qp->verbs_qp, attr, &cmd_ex.ibv_cmd, sizeof(cmd_ex), &resp_ex.ibv_resp, sizeof(resp_ex)); if (ret) { verbs_err(&ctx->ibv_ctx, "failed to exec create qp cmd, ret = %d.\n", ret); return ret; } qp->flags = resp_ex.drv_payload.cap_flags; *dwqe_mmap_key = resp_ex.drv_payload.dwqe_mmap_key; return ret; } static void qp_setup_config(struct ibv_qp_init_attr_ex *attr, struct hns_roce_qp *qp, struct hns_roce_context *ctx) { hns_roce_init_qp_indices(qp); if (qp->rq.wqe_cnt) { qp->rq.wqe_cnt = attr->cap.max_recv_wr; qp->rq.max_gs = attr->cap.max_recv_sge; /* adjust the RQ's cap based on the reported device's cap */ attr->cap.max_recv_wr = min(ctx->max_qp_wr, attr->cap.max_recv_wr); attr->cap.max_recv_sge -= qp->rq.rsv_sge; qp->rq.max_post = attr->cap.max_recv_wr; } qp->max_inline_data = attr->cap.max_inline_data; if (qp->flags & HNS_ROCE_QP_CAP_DIRECT_WQE) qp->sq.db_reg = qp->dwqe_page; else qp->sq.db_reg = ctx->uar + ROCEE_VF_DB_CFG0_OFFSET; } void hns_roce_free_qp_buf(struct hns_roce_qp *qp, struct hns_roce_context *ctx) { qp_free_db(qp, ctx); qp_free_wqe(qp); } static int hns_roce_alloc_qp_buf(struct ibv_qp_init_attr_ex *attr, struct hns_roce_qp *qp, struct hns_roce_context *ctx) { int ret; ret = qp_alloc_wqe(&attr->cap, qp, ctx); if (ret) goto err_wqe; ret = qp_alloc_db(attr, qp, ctx); if (ret) goto err_db; return 0; err_db: qp_free_wqe(qp); err_wqe: return ret; } static int mmap_dwqe(struct ibv_context *ibv_ctx, struct hns_roce_qp *qp, uint64_t dwqe_mmap_key) { qp->dwqe_page = mmap(NULL, HNS_ROCE_DWQE_PAGE_SIZE, PROT_WRITE, MAP_SHARED, ibv_ctx->cmd_fd, dwqe_mmap_key); if (qp->dwqe_page == MAP_FAILED) { verbs_err(verbs_get_ctx(ibv_ctx), "failed to mmap direct wqe page, QPN = %u.\n", qp->verbs_qp.qp.qp_num); return -EINVAL; } return 0; } static struct ibv_qp *create_qp(struct ibv_context *ibv_ctx, struct ibv_qp_init_attr_ex *attr, struct hnsdv_qp_init_attr *hns_attr) { struct hns_roce_context *context = to_hr_ctx(ibv_ctx); struct hns_roce_pad *pad = to_hr_pad(attr->pd); struct hns_roce_qp *qp; uint64_t dwqe_mmap_key; int ret; ret = verify_qp_create_attr(context, attr, hns_attr); if (ret) goto err; qp = calloc(1, sizeof(*qp)); if (!qp) { ret = -ENOMEM; goto err; } hns_roce_set_qp_params(attr, qp, context); if (pad) atomic_fetch_add(&pad->pd.refcount, 1); ret = hns_roce_qp_spinlock_init(attr, qp); if (ret) goto err_spinlock; ret = hns_roce_alloc_qp_buf(attr, qp, context); if (ret) goto err_buf; ret = qp_exec_create_cmd(attr, qp, context, &dwqe_mmap_key, hns_attr); if (ret) goto err_cmd; ret = hns_roce_attach_qp_ex_ops(attr, qp); if (ret) goto err_ops; ret = hns_roce_store_qp(context, qp); if (ret) goto err_ops; if (qp->flags & HNS_ROCE_QP_CAP_DIRECT_WQE) { ret = mmap_dwqe(ibv_ctx, qp, dwqe_mmap_key); if (ret) goto err_dwqe; } qp_setup_config(attr, qp, context); return &qp->verbs_qp.qp; err_dwqe: hns_roce_v2_clear_qp(context, qp); err_ops: ibv_cmd_destroy_qp(&qp->verbs_qp.qp); err_cmd: hns_roce_free_qp_buf(qp, context); err_buf: hns_roce_qp_spinlock_destroy(qp); err_spinlock: free(qp); err: if (ret < 0) ret = -ret; errno = ret; return NULL; } struct ibv_qp *hns_roce_u_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct ibv_qp_init_attr_ex attrx = {}; struct ibv_qp *qp; memcpy(&attrx, attr, sizeof(*attr)); 
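	/*
	 * The legacy create path is implemented on top of the extended one:
	 * copy the caller's attributes into an ibv_qp_init_attr_ex, tag the
	 * PD, and copy the (possibly adjusted) capabilities back on success.
	 */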
attrx.comp_mask = IBV_QP_INIT_ATTR_PD; attrx.pd = pd; qp = create_qp(pd->context, &attrx, NULL); if (qp) memcpy(attr, &attrx, sizeof(*attr)); return qp; } struct ibv_qp *hns_roce_u_create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr) { return create_qp(context, attr, NULL); } struct ibv_qp *hnsdv_create_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_attr, struct hnsdv_qp_init_attr *hns_attr) { if (!context || !qp_attr) { errno = EINVAL; return NULL; } if (!is_hns_dev(context->device)) { errno = EOPNOTSUPP; return NULL; } return create_qp(context, qp_attr, hns_attr); } int hnsdv_query_device(struct ibv_context *context, struct hnsdv_context *attrs_out) { struct hns_roce_device *hr_dev; if (!context || !context->device || !attrs_out) return EINVAL; if (!is_hns_dev(context->device)) { verbs_err(verbs_get_ctx(context), "not a HNS RoCE device!\n"); return EOPNOTSUPP; } memset(attrs_out, 0, sizeof(*attrs_out)); hr_dev = to_hr_dev(context->device); attrs_out->comp_mask |= HNSDV_CONTEXT_MASK_CONGEST_TYPE; attrs_out->congest_type = hr_dev->congest_cap; return 0; } struct ibv_qp *hns_roce_u_open_qp(struct ibv_context *context, struct ibv_qp_open_attr *attr) { struct ib_uverbs_create_qp_resp resp; struct ibv_open_qp cmd; struct hns_roce_qp *qp; int ret; qp = calloc(1, sizeof(*qp)); if (!qp) return NULL; ret = ibv_cmd_open_qp(context, &qp->verbs_qp, sizeof(qp->verbs_qp), attr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) goto err_buf; ret = hns_roce_store_qp(to_hr_ctx(context), qp); if (ret) goto err_cmd; return &qp->verbs_qp.qp; err_cmd: ibv_cmd_destroy_qp(&qp->verbs_qp.qp); err_buf: free(qp); return NULL; } int hns_roce_u_query_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct hns_roce_qp *qp = to_hr_qp(ibqp); struct ibv_query_qp cmd; int ret; ret = ibv_cmd_query_qp(ibqp, attr, attr_mask, init_attr, &cmd, sizeof(cmd)); if (ret) return ret; init_attr->cap.max_send_wr = qp->sq.max_post; init_attr->cap.max_send_sge = qp->sq.max_gs; if (init_attr->cap.max_recv_wr) init_attr->cap.max_recv_sge -= qp->rq.rsv_sge; attr->cap = init_attr->cap; return ret; } static uint16_t get_ah_udp_sport(const struct ibv_ah_attr *attr) { uint32_t fl = attr->grh.flow_label & IB_GRH_FLOWLABEL_MASK; uint16_t sport; if (!fl) sport = get_random() % (IB_ROCE_UDP_ENCAP_VALID_PORT_MAX + 1 - IB_ROCE_UDP_ENCAP_VALID_PORT_MIN) + IB_ROCE_UDP_ENCAP_VALID_PORT_MIN; else sport = ibv_flow_label_to_udp_sport(fl); return sport; } static int get_tclass(struct ibv_context *context, struct ibv_ah_attr *attr, uint8_t *tclass) { #define DSCP_SHIFT 2 enum ibv_gid_type_sysfs gid_type; int ret; ret = ibv_query_gid_type(context, attr->port_num, attr->grh.sgid_index, &gid_type); if (ret) return ret; *tclass = gid_type == IBV_GID_TYPE_SYSFS_ROCE_V2 ? 
attr->grh.traffic_class >> DSCP_SHIFT : attr->grh.traffic_class; return ret; } struct ibv_ah *hns_roce_u_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr) { struct hns_roce_device *hr_dev = to_hr_dev(pd->context->device); struct hns_roce_create_ah_resp resp = {}; struct hns_roce_ah *ah; /* HIP08 don't support create ah */ if (hr_dev->hw_version == HNS_ROCE_HW_VER2) return NULL; ah = malloc(sizeof(*ah)); if (!ah) return NULL; memset(ah, 0, sizeof(*ah)); ah->av.port = attr->port_num; ah->av.sl = attr->sl; if (attr->is_global) { ah->av.gid_index = attr->grh.sgid_index; ah->av.hop_limit = attr->grh.hop_limit; if (get_tclass(pd->context, attr, &ah->av.tclass)) goto err; ah->av.flowlabel = attr->grh.flow_label; memcpy(ah->av.dgid, attr->grh.dgid.raw, ARRAY_SIZE(ah->av.dgid)); } if (ibv_cmd_create_ah(pd, &ah->ibv_ah, attr, &resp.ibv_resp, sizeof(resp))) goto err; if (memcmp(ah->av.mac, resp.dmac, ETH_ALEN)) memcpy(ah->av.mac, resp.dmac, ETH_ALEN); else if (ibv_resolve_eth_l2_from_gid(pd->context, attr, ah->av.mac, NULL)) goto err; if (resp.tc_mode == HNS_ROCE_TC_MAP_MODE_DSCP) ah->av.sl = resp.priority; ah->av.udp_sport = get_ah_udp_sport(attr); return &ah->ibv_ah; err: free(ah); return NULL; } int hns_roce_u_destroy_ah(struct ibv_ah *ah) { int ret; ret = ibv_cmd_destroy_ah(ah); if (ret) return ret; free(to_hr_ah(ah)); return 0; } rdma-core-56.1/providers/hns/hnsdv.h000066400000000000000000000025711477342711600174370ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ /* * Copyright (c) 2024 Hisilicon Limited. */ #ifndef __HNSDV_H__ #define __HNSDV_H__ #include #include #include #include #ifdef __cplusplus extern "C" { #endif enum hnsdv_qp_congest_ctrl_type { HNSDV_QP_CREATE_ENABLE_DCQCN = 1 << 0, HNSDV_QP_CREATE_ENABLE_LDCP = 1 << 1, HNSDV_QP_CREATE_ENABLE_HC3 = 1 << 2, HNSDV_QP_CREATE_ENABLE_DIP = 1 << 3, }; enum hnsdv_qp_init_attr_mask { HNSDV_QP_INIT_ATTR_MASK_QP_CONGEST_TYPE = 1 << 1, }; struct hnsdv_qp_init_attr { uint64_t comp_mask; /* Use enum hnsdv_qp_init_attr_mask */ uint32_t create_flags; uint8_t congest_type; /* Use enum hnsdv_qp_congest_ctrl_type */ uint8_t reserved[3]; }; enum hnsdv_query_context_comp_mask { HNSDV_CONTEXT_MASK_CONGEST_TYPE = 1 << 0, }; struct hnsdv_context { uint64_t comp_mask; /* Use enum hnsdv_query_context_comp_mask */ uint64_t flags; uint8_t congest_type; /* Use enum hnsdv_qp_congest_ctrl_type */ uint8_t reserved[7]; }; bool hnsdv_is_supported(struct ibv_device *device); int hnsdv_query_device(struct ibv_context *ctx_in, struct hnsdv_context *attrs_out); struct ibv_qp *hnsdv_create_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_attr, struct hnsdv_qp_init_attr *hns_qp_attr); #ifdef __cplusplus } #endif #endif /* __HNSDV_H__ */ rdma-core-56.1/providers/hns/libhns.map000066400000000000000000000003011477342711600201070ustar00rootroot00000000000000/* Export symbols should be added below according to Documentation/versioning.md document. 
*/ HNS_1.0 { global: hnsdv_is_supported; hnsdv_create_qp; hnsdv_query_device; local: *; }; rdma-core-56.1/providers/hns/man/000077500000000000000000000000001477342711600167125ustar00rootroot00000000000000rdma-core-56.1/providers/hns/man/CMakeLists.txt000066400000000000000000000001521477342711600214500ustar00rootroot00000000000000rdma_man_pages( hnsdv.7.md hnsdv_create_qp.3.md hnsdv_is_supported.3.md hnsdv_query_device.3.md ) rdma-core-56.1/providers/hns/man/hnsdv.7.md000066400000000000000000000021071477342711600205230ustar00rootroot00000000000000--- layout: page title: HNSDV section: 7 tagline: Verbs date: 2024-02-06 header: "HNS Direct Verbs Manual" footer: hns --- # NAME hnsdv \- Direct verbs for hns devices This provides low level access to hns devices to perform direct operations, without general branching performed by libibverbs. # DESCRIPTION The libibverbs API is an abstract one. It is agnostic to any underlying provider specific implementation. While this abstraction has the advantage of user applications portability it has a performance penalty. Besides, some provider specific features that are directly facing users are not available through libibverbs. For some applications these demands are more important than portability. The hns direct verbs API is intended for such applications. It exposes hns specific low level operations, allowing the application to bypass the libibverbs API and enable some hns specific features. The direct include of hnsdv.h together with linkage to hns library will allow usage of this new interface. # SEE ALSO **verbs**(7) # AUTHORS Junxian Huang rdma-core-56.1/providers/hns/man/hnsdv_create_qp.3.md000066400000000000000000000031451477342711600225450ustar00rootroot00000000000000--- layout: page title: HNSDV_CREATE_QP section: 3 tagline: Verbs date: 2024-02-06 header: "HNS Programmer's Manual" footer: hns --- # NAME hnsdv_create_qp - creates a HNS specific queue pair (QP) # SYNOPSIS ```c #include struct ibv_qp *hnsdv_create_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_attr, struct hnsdv_qp_init_attr *hns_attr); ``` # DESCRIPTION **hnsdv_create_qp()** creates a HNS specific queue pair (QP) with specific driver properties. # ARGUMENTS Please see *ibv_create_qp_ex(3)* man page for *context* and *qp_attr*. ## hns_attr ```c struct hnsdv_qp_init_attr { uint64_t comp_mask; uint32_t create_flags; uint8_t congest_type; uint8_t reserved[3]; }; ``` *comp_mask* : Bitmask specifying what fields in the structure are valid: ``` HNSDV_QP_INIT_ATTR_MASK_QP_CONGEST_TYPE: Valid values in congest_type. Allow setting a congestion control algorithm for QP. ``` *create_flags* : Enable the QP of a feature. *congest_type* : Type of congestion control algorithm: HNSDV_QP_CREATE_ENABLE_DCQCN: Data Center Quantized Congestion Notification HNSDV_QP_CREATE_ENABLE_LDCP: Low Delay Control Protocol HNSDV_QP_CREATE_ENABLE_HC3: Huawei Converged Congestion Control HNSDV_QP_CREATE_ENABLE_DIP: Destination IP based Quantized Congestion Notification # RETURN VALUE **hnsdv_create_qp()** returns a pointer to the created QP, on error NULL will be returned and errno will be set. 
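# EXAMPLE

A minimal sketch of creating an RC QP with the DCQCN congestion control
algorithm enabled. The device/PD/CQ setup, the capability sizes and the
helper function name are illustrative, not part of the hnsdv API:

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <infiniband/hnsdv.h>

/* ctx, pd and cq are assumed to have been set up earlier via
 * ibv_open_device(), ibv_alloc_pd() and ibv_create_cq(). */
struct ibv_qp *create_dcqcn_qp(struct ibv_context *ctx, struct ibv_pd *pd,
			       struct ibv_cq *cq)
{
	struct hnsdv_qp_init_attr hns_attr = {
		.comp_mask = HNSDV_QP_INIT_ATTR_MASK_QP_CONGEST_TYPE,
		.congest_type = HNSDV_QP_CREATE_ENABLE_DCQCN,
	};
	struct ibv_qp_init_attr_ex attr_ex = {
		.qp_type = IBV_QPT_RC,
		.comp_mask = IBV_QP_INIT_ATTR_PD, /* PD is mandatory for RC */
		.pd = pd,
		.send_cq = cq,
		.recv_cq = cq,
		.cap = {
			.max_send_wr = 64,
			.max_recv_wr = 64,
			.max_send_sge = 1,
			.max_recv_sge = 1,
		},
	};
	struct ibv_qp *qp;

	if (!hnsdv_is_supported(ctx->device))
		return NULL;

	qp = hnsdv_create_qp(ctx, &attr_ex, &hns_attr);
	if (!qp)
		fprintf(stderr, "hnsdv_create_qp: %s\n", strerror(errno));
	return qp;
}
```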
# SEE ALSO **ibv_create_qp_ex**(3) # AUTHOR Junxian Huang rdma-core-56.1/providers/hns/man/hnsdv_is_supported.3.md000066400000000000000000000011671477342711600233240ustar00rootroot00000000000000--- layout: page title: HNSDV_IS_SUPPORTED section: 3 tagline: Verbs date: 2024-02-06 header: "HNS Programmer's Manual" footer: hns --- # NAME hnsdv_is_supported - Check whether an RDMA device implemented by the hns provider # SYNOPSIS ```c #include bool hnsdv_is_supported(struct ibv_device *device); ``` # DESCRIPTION hnsdv functions may be called only if this function returns true for the RDMA device. # ARGUMENTS *device* : RDMA device to check. # RETURN VALUE Returns true if device is implemented by hns provider. # SEE ALSO *hnsdv(7)* # AUTHOR Junxian Huang rdma-core-56.1/providers/hns/man/hnsdv_query_device.3.md000066400000000000000000000027021477342711600232640ustar00rootroot00000000000000--- layout: page title: HNSDV_QUERY_DEVICE section: 3 tagline: Verbs date: 2024-02-06 header: "HNS Direct Verbs Manual" footer: hns --- # NAME hnsdv_query_device - Query hns device specific attributes # SYNOPSIS ```c #include int hnsdv_query_device(struct ibv_context *context, struct hnsdv_context *attrs_out); ``` # DESCRIPTION **hnsdv_query_device()** Queries hns device specific attributes. # ARGUMENTS Please see *ibv_query_device(3)* man page for *context*. ## attrs_out ```c struct hnsdv_context { uint64_t comp_mask; uint64_t flags; uint8_t congest_type; uint8_t reserved[7]; }; ``` *comp_mask* : Bitmask specifying what fields in the structure are valid: HNSDV_CONTEXT_MASK_CONGEST_TYPE: Congestion control algorithm is supported. *congest_type* : Bitmask of supported congestion control algorithms. HNSDV_QP_CREATE_ENABLE_DCQCN: Data Center Quantized Congestion Notification HNSDV_QP_CREATE_ENABLE_LDCP: Low Delay Control Protocol HNSDV_QP_CREATE_ENABLE_HC3: Huawei Converged Congestion Control HNSDV_QP_CREATE_ENABLE_DIP: Destination IP based Quantized Congestion Notification # RETURN VALUE **hnsdv_query_device()** returns 0 on success, or the value of errno on failure (which indicates the failure reason). # SEE ALSO **ibv_query_device**(3) # NOTES * *flags* is an out field and currently has no values. # AUTHORS Junxian Huang rdma-core-56.1/providers/ipathverbs/000077500000000000000000000000001477342711600175165ustar00rootroot00000000000000rdma-core-56.1/providers/ipathverbs/CMakeLists.txt000066400000000000000000000005431477342711600222600ustar00rootroot00000000000000rdma_provider(ipathverbs ipathverbs.c verbs.c ) rdma_subst_install(FILES "truescale.conf.in" DESTINATION "${CMAKE_INSTALL_MODPROBEDIR}/" RENAME "truescale.conf") install(FILES truescale-serdes.cmds DESTINATION "${CMAKE_INSTALL_LIBEXECDIR}" PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ OWNER_EXECUTE GROUP_EXECUTE WORLD_EXECUTE) rdma-core-56.1/providers/ipathverbs/COPYING000066400000000000000000000030721477342711600205530ustar00rootroot00000000000000Copyright (c) 2013. Intel Corporation. All rights reserved. Copyright (c) 2007. QLogic Corp. All rights reserved. Copyright (c) 2005. PathScale, Inc. All rights reserved. This software is available to you under a choice of one of two licenses. 
You may choose to be licensed under the terms of the GNU General Public License (GPL) Version 2, available from the file COPYING in the main directory of this source tree, or the OpenIB.org BSD license below: Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: - Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. - Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. Patent licenses, if any, provided herein do not apply to combinations of this program with other software, or any other product whatsoever. rdma-core-56.1/providers/ipathverbs/dracut_check000066400000000000000000000001551477342711600220610ustar00rootroot00000000000000#!/bin/bash if [ -n "$hostonly" ]; then lspci -n 2>/dev/null | grep -q -i "1077\|1fc1" exit $? fi exit 0 rdma-core-56.1/providers/ipathverbs/dracut_install000066400000000000000000000004341477342711600224520ustar00rootroot00000000000000#!/bin/bash inst /etc/modprobe.d/truescale.conf inst /usr/libexec/truescale-serdes.cmds # All files needed by truescale-serdes.cmds need to be present here inst /sbin/lspci inst /bin/grep inst /bin/sed inst /usr/bin/logger inst /usr/sbin/dmidecode inst /bin/readlink inst /bin/echo rdma-core-56.1/providers/ipathverbs/dracut_kmod000066400000000000000000000000361477342711600217340ustar00rootroot00000000000000#!/bin/bash instmods ib_qib rdma-core-56.1/providers/ipathverbs/ipath-abi.h000066400000000000000000000043101477342711600215230ustar00rootroot00000000000000/* * Copyright (c) 2006 QLogic, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* * Patent licenses, if any, provided herein do not apply to * combinations of this program with other software, or any other * product whatsoever. */ #ifndef IPATH_ABI_H #define IPATH_ABI_H #include struct ipath_get_context_resp { struct ib_uverbs_get_context_resp ibv_resp; __u32 version; }; struct ipath_create_cq_resp { struct ib_uverbs_create_cq_resp ibv_resp; __u64 offset; }; struct ipath_resize_cq_resp { struct ib_uverbs_resize_cq_resp ibv_resp; __u64 offset; }; struct ipath_create_qp_resp { struct ib_uverbs_create_qp_resp ibv_resp; __u64 offset; }; struct ipath_create_srq_resp { struct ib_uverbs_create_srq_resp ibv_resp; __u64 offset; }; struct ipath_modify_srq_cmd { struct ibv_modify_srq ibv_cmd; __u64 offset_addr; }; #endif /* IPATH_ABI_H */ rdma-core-56.1/providers/ipathverbs/ipathverbs.c000066400000000000000000000134301477342711600220320ustar00rootroot00000000000000/* * Copyright (C) 2006-2007 QLogic Corporation, All rights reserved. * Copyright (c) 2005. PathScale, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * * Patent licenses, if any, provided herein do not apply to * combinations of this program with other software, or any other * product whatsoever. 
*/ #include #include #include #include #include #include "ipathverbs.h" #include "ipath-abi.h" static void ipath_free_context(struct ibv_context *ibctx); #ifndef PCI_VENDOR_ID_PATHSCALE #define PCI_VENDOR_ID_PATHSCALE 0x1fc1 #endif #ifndef PCI_VENDOR_ID_QLOGIC #define PCI_VENDOR_ID_QLOGIC 0x1077 #endif #ifndef PCI_DEVICE_ID_INFINIPATH_HT #define PCI_DEVICE_ID_INFINIPATH_HT 0x000d #endif #ifndef PCI_DEVICE_ID_INFINIPATH_PE800 #define PCI_DEVICE_ID_INFINIPATH_PE800 0x0010 #endif #ifndef PCI_DEVICE_ID_INFINIPATH_6220 #define PCI_DEVICE_ID_INFINIPATH_6220 0x6220 #endif #ifndef PCI_DEVICE_ID_INFINIPATH_7220 #define PCI_DEVICE_ID_INFINIPATH_7220 0x7220 #endif #ifndef PCI_DEVICE_ID_INFINIPATH_7322 #define PCI_DEVICE_ID_INFINIPATH_7322 0x7322 #endif #define HCA(v, d) \ VERBS_PCI_MATCH(PCI_VENDOR_ID_##v, PCI_DEVICE_ID_INFINIPATH_##d, NULL) static const struct verbs_match_ent hca_table[] = { VERBS_DRIVER_ID(RDMA_DRIVER_QIB), HCA(PATHSCALE, HT), HCA(PATHSCALE, PE800), HCA(QLOGIC, 6220), HCA(QLOGIC, 7220), HCA(QLOGIC, 7322), {} }; static const struct verbs_context_ops ipath_ctx_common_ops = { .free_context = ipath_free_context, .query_device_ex = ipath_query_device, .query_port = ipath_query_port, .alloc_pd = ipath_alloc_pd, .dealloc_pd = ipath_free_pd, .reg_mr = ipath_reg_mr, .dereg_mr = ipath_dereg_mr, .create_cq = ipath_create_cq, .poll_cq = ipath_poll_cq, .req_notify_cq = ibv_cmd_req_notify_cq, .resize_cq = ipath_resize_cq, .destroy_cq = ipath_destroy_cq, .create_srq = ipath_create_srq, .modify_srq = ipath_modify_srq, .query_srq = ipath_query_srq, .destroy_srq = ipath_destroy_srq, .post_srq_recv = ipath_post_srq_recv, .create_qp = ipath_create_qp, .query_qp = ipath_query_qp, .modify_qp = ipath_modify_qp, .destroy_qp = ipath_destroy_qp, .post_send = ipath_post_send, .post_recv = ipath_post_recv, .create_ah = ipath_create_ah, .destroy_ah = ipath_destroy_ah, .attach_mcast = ibv_cmd_attach_mcast, .detach_mcast = ibv_cmd_detach_mcast }; static const struct verbs_context_ops ipath_ctx_v1_ops = { .create_cq = ipath_create_cq_v1, .poll_cq = ibv_cmd_poll_cq, .resize_cq = ipath_resize_cq_v1, .destroy_cq = ipath_destroy_cq_v1, .create_srq = ipath_create_srq_v1, .destroy_srq = ipath_destroy_srq_v1, .modify_srq = ipath_modify_srq_v1, .post_srq_recv = ibv_cmd_post_srq_recv, .create_qp = ipath_create_qp_v1, .destroy_qp = ipath_destroy_qp_v1, .post_recv = ibv_cmd_post_recv, }; static struct verbs_context *ipath_alloc_context(struct ibv_device *ibdev, int cmd_fd, void *private_data) { struct ipath_context *context; struct ibv_get_context cmd; struct ib_uverbs_get_context_resp resp; struct ipath_device *dev; context = verbs_init_and_alloc_context(ibdev, cmd_fd, context, ibv_ctx, RDMA_DRIVER_QIB); if (!context) return NULL; if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof cmd, &resp, sizeof resp)) goto err_free; verbs_set_ops(&context->ibv_ctx, &ipath_ctx_common_ops); dev = to_idev(ibdev); if (dev->abi_version == 1) verbs_set_ops(&context->ibv_ctx, &ipath_ctx_v1_ops); return &context->ibv_ctx; err_free: verbs_uninit_context(&context->ibv_ctx); free(context); return NULL; } static void ipath_free_context(struct ibv_context *ibctx) { struct ipath_context *context = to_ictx(ibctx); verbs_uninit_context(&context->ibv_ctx); free(context); } static void ipath_uninit_device(struct verbs_device *verbs_device) { struct ipath_device *dev = to_idev(&verbs_device->device); free(dev); } static struct verbs_device * ipath_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct ipath_device *dev; dev = calloc(1, 
sizeof(*dev)); if (!dev) return NULL; dev->abi_version = sysfs_dev->abi_ver; return &dev->ibv_dev; } static const struct verbs_device_ops ipath_dev_ops = { .name = "ipathverbs", .match_min_abi_version = 0, .match_max_abi_version = INT_MAX, .match_table = hca_table, .alloc_device = ipath_device_alloc, .uninit_device = ipath_uninit_device, .alloc_context = ipath_alloc_context, }; PROVIDER_DRIVER(ipathverbs, ipath_dev_ops); rdma-core-56.1/providers/ipathverbs/ipathverbs.h000066400000000000000000000161271477342711600220450ustar00rootroot00000000000000/* * Copyright (c) 2006-2009 QLogic Corp. All rights reserved. * Copyright (c) 2005. PathScale, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * * Patent licenses, if any, provided herein do not apply to * combinations of this program with other software, or any other * product whatsoever. */ #ifndef IPATH_H #define IPATH_H #include #include #include #include #include #include #define PFX "ipath: " struct ipath_device { struct verbs_device ibv_dev; int abi_version; }; struct ipath_context { struct verbs_context ibv_ctx; }; /* * This structure needs to have the same size and offsets as * the kernel's ib_wc structure since it is memory mapped. */ struct ipath_wc { uint64_t wr_id; enum ibv_wc_status status; enum ibv_wc_opcode opcode; uint32_t vendor_err; uint32_t byte_len; uint32_t imm_data; /* in network byte order */ uint32_t qp_num; uint32_t src_qp; enum ibv_wc_flags wc_flags; uint16_t pkey_index; uint16_t slid; uint8_t sl; uint8_t dlid_path_bits; uint8_t port_num; }; struct ipath_cq_wc { _Atomic(uint32_t) head; _Atomic(uint32_t) tail; struct ipath_wc queue[1]; }; struct ipath_cq { struct ibv_cq ibv_cq; struct ipath_cq_wc *queue; pthread_spinlock_t lock; }; /* * Receive work request queue entry. * The size of the sg_list is determined when the QP is created and stored * in qp->r_max_sge. */ struct ipath_rwqe { uint64_t wr_id; uint8_t num_sge; uint8_t padding[7]; struct ibv_sge sg_list[0]; }; /* * This structure is used to contain the head pointer, tail pointer, * and receive work queue entries as a single memory allocation so * it can be mmap'ed into user space. * Note that the wq array elements are variable size so you can't * just index into the array to get the N'th element; * use get_rwqe_ptr() instead.
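* For example (an illustrative layout only, assuming max_sge == 2): entry n begins at byte offset (sizeof(struct ipath_rwqe) + 2 * sizeof(struct ibv_sge)) * n from the start of wq[], which is exactly the computation get_rwqe_ptr() below performs.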
*/ struct ipath_rwq { _Atomic(uint32_t) head; /* new requests posted to the head. */ _Atomic(uint32_t) tail; /* receives pull requests from here. */ struct ipath_rwqe wq[0]; }; struct ipath_rq { struct ipath_rwq *rwq; pthread_spinlock_t lock; uint32_t size; uint32_t max_sge; }; struct ipath_qp { struct ibv_qp ibv_qp; struct ipath_rq rq; }; struct ipath_srq { struct ibv_srq ibv_srq; struct ipath_rq rq; }; #define to_ixxx(xxx, type) container_of(ib##xxx, struct ipath_##type, ibv_##xxx) static inline struct ipath_context *to_ictx(struct ibv_context *ibctx) { return container_of(ibctx, struct ipath_context, ibv_ctx.context); } static inline struct ipath_device *to_idev(struct ibv_device *ibdev) { return container_of(ibdev, struct ipath_device, ibv_dev.device); } static inline struct ipath_cq *to_icq(struct ibv_cq *ibcq) { return to_ixxx(cq, cq); } static inline struct ipath_qp *to_iqp(struct ibv_qp *ibqp) { return to_ixxx(qp, qp); } static inline struct ipath_srq *to_isrq(struct ibv_srq *ibsrq) { return to_ixxx(srq, srq); } /* * Since struct ipath_rwqe is not a fixed size, we can't simply index into * struct ipath_rq.wq. This function does the array index computation. */ static inline struct ipath_rwqe *get_rwqe_ptr(struct ipath_rq *rq, unsigned n) { return (struct ipath_rwqe *) ((char *) rq->rwq->wq + (sizeof(struct ipath_rwqe) + rq->max_sge * sizeof(struct ibv_sge)) * n); } int ipath_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); extern int ipath_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr); struct ibv_pd *ipath_alloc_pd(struct ibv_context *pd); int ipath_free_pd(struct ibv_pd *pd); struct ibv_mr *ipath_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access); int ipath_dereg_mr(struct verbs_mr *vmr); struct ibv_cq *ipath_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); struct ibv_cq *ipath_create_cq_v1(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); int ipath_resize_cq(struct ibv_cq *cq, int cqe); int ipath_resize_cq_v1(struct ibv_cq *cq, int cqe); int ipath_destroy_cq(struct ibv_cq *cq); int ipath_destroy_cq_v1(struct ibv_cq *cq); int ipath_poll_cq(struct ibv_cq *cq, int ne, struct ibv_wc *wc); struct ibv_qp *ipath_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); struct ibv_qp *ipath_create_qp_v1(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); int ipath_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); int ipath_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask); int ipath_destroy_qp(struct ibv_qp *qp); int ipath_destroy_qp_v1(struct ibv_qp *qp); int ipath_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr); int ipath_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); struct ibv_srq *ipath_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr); struct ibv_srq *ipath_create_srq_v1(struct ibv_pd *pd, struct ibv_srq_init_attr *attr); int ipath_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, int attr_mask); int ipath_modify_srq_v1(struct ibv_srq *srq, struct ibv_srq_attr *attr, int attr_mask); int ipath_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr); int ipath_destroy_srq(struct ibv_srq *srq); int ipath_destroy_srq_v1(struct ibv_srq *srq); int 
ipath_post_srq_recv(struct ibv_srq *srq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); struct ibv_ah *ipath_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr); int ipath_destroy_ah(struct ibv_ah *ah); #endif /* IPATH_H */ rdma-core-56.1/providers/ipathverbs/truescale-serdes.cmds000077500000000000000000000210021477342711600236360ustar00rootroot00000000000000#!/bin/bash # Copyright (c) 2013 Intel Corporation. All rights reserved. # Copyright (c) 2010 QLogic Corporation. # All rights reserved. # # This software is available to you under a choice of one of two # licenses. You may choose to be licensed under the terms of the GNU # General Public License (GPL) Version 2, available from the file # COPYING in the main directory of this source tree, or the # OpenIB.org BSD license below: # # Redistribution and use in source and binary forms, with or # without modification, are permitted provided that the following # conditions are met: # # - Redistributions of source code must retain the above # copyright notice, this list of conditions and the following # disclaimer. # # - Redistributions in binary form must reproduce the above # copyright notice, this list of conditions and the following # disclaimer in the documentation and/or other materials # provided with the distribution. # # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS # BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN # ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN # CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE # SOFTWARE. # This script does truescale (qib) adapter-specific actions, and is # sourced during boot after the ib_qib module is loaded. The stop # operation is deprecated. It isn't intended for standalone use. # base name in /sys/class PATH=/sbin:/bin:/usr/sbin:/usr/bin:$PATH export PATH qb=/sys/class/infiniband/qib serdes_parm=txselect if [ -r /etc/rdma/rdma.conf ]; then IB_CONFIG=/etc/rdma/rdma.conf else IB_CONFIG=/etc/infiniband/openib.conf fi if [ -f $IB_CONFIG ]; then . $IB_CONFIG fi # If the user has not specified an override, or the setting is omitted from the config file, # then default to the new back plane version. if [ -z $QIB_QME_BPVER ]; then QIB_QME_BPVER=1 fi warn_and_log() { echo "$0: $@" logger -t infinipath "$@" } setup_qmh() { local -i nunit=0 bay bl2xB=0 full=0 local parmf sysinfo bayinfo mez1bus mez2bus mez3bus=0 tbay local -a parm bay_h1 for parm in parameters/${serdes_parm} ${serdes_parm}; do if [ -e /sys/module/ib_qib/$parm ]; then parmf=/sys/module/ib_qib/$parm break; fi done if [ ! "$parmf" ]; then warn_and_log Unable to find ${serdes_parm} parameter return fi sysinfo="$(PATH=/sbin:/usr/sbin:$PATH; dmidecode -t system | \ sed -e '/^Handle/d' -e '/^[ \t]*$/d' -e 's/[ \t]*$//' )" if [ ! "$sysinfo" ]; then warn_and_log Unable to determine system type return fi bayinfo="$(PATH=/sbin:/usr/sbin:$PATH; dmidecode -t 204)" if [ !
"$bayinfo" ]; then warn_and_log Unable to determine bay return fi case "${bayinfo}" in *Server*Bay:*) tbay=$(PATH=/sbin:/usr/sbin:$PATH; dmidecode -t 204 | \ sed -n -e 's/[ \t]*$//' -e 's/[ \t]*Server Bay:[ \t]*//p') ;; *) tbay=$(PATH=/sbin:/usr/sbin:$PATH; dmidecode -t 204 | \ sed -n -e '1,/BladeSystem/d' -e 's/ *$//' -e 's/^\t\t*//' \ -e '/^[0-9][AB]*$/p' -e '/^[0-9][0-9][AB]*$/p') ;; esac read pbase < $parmf parm=($(echo ${qb}*)) nunit=${#parm[*]} # [0] is a dummy in these arrays, bay #'ing starts at 1 # H1 value, per bay (same for both ports) m1_bay_h1=(0 8 7 7 7 7 6 6 6 8 7 7 7 7 6 6 7) m2_bay_h1=(0 11 11 11 11 11 11 10 11 11 11 11 11 10 10 10 10) m3_bay_h1=(0 11 11 11 11 10 10 10 10) # tx serdes index per bay for mez1 (either port) mez1p1_idx=(0 2 2 17 17 17 1 1 1 2 1 17 17 16 2 18 16) # tx serdes setting for mez1 p2 (only used on full-height blades) mez1p2_idx=(0 4 4 3 3 3 2 4 4) # tx serdes index per bay for mez2 port 1 mez2p1_idx=(0 2 2 17 17 17 1 1 1 2 1 17 17 16 2 18 1) # tx serdes index per bay for mez2 port 2 mez2p2_idx=(0 2 2 19 1 1 1 1 1 2 1 18 17 1 19 1 1) # tx serdes index per bay for mez3 port 1 (mez3 only on full-height blades) mez3p1_idx=(0 2 1 18 17 1 19 1 1) # tx serdes index per bay for mez3 port 2 (mez3 only on full-height blades) mez3p2_idx=(0 2 1 17 17 16 2 18 1) case "${sysinfo}" in *BL280[cC]*) mez1bus=3 mez2bus=6 bay=$tbay ;; # both nodes on the 2x220 blade have bus 3, only one mez, but # they connect to different switches through different paths # so A and B have different parameters. They connect to # the switch as if they were the mez2 on other blade types, # with port 1 on mez2 for A node and port 2 on mez2 # for the B node *BL2x220[cC]*) mez1bus=3 mez2bus=3 bay=${tbay%[AB]} case "${tbay}" in *A) bl2xB=${mez2p1_idx[$bay]} ;; *B) bl2xB=${mez2p2_idx[$bay]} ;; esac ;; *BL460[cC]*) mez1bus=6 mez2bus=9 bay=$tbay ;; *BL465[cC]*) mez1bus=5 mez2bus=8 bay=$tbay ;; *BL490[cC]*) mez1bus=6 mez2bus=7 bay=$tbay ;; *BL685[cC]*) mez1bus=41 mez2bus=6 mez3bus=44 full=1 bay=$(($tbay % 9)) ;; *) warn_and_log Unknown blade type "$sysinfo" return ;; esac # mez1 only has port1 connected, mez2, mez3 can have both ports # If only one card, and two mez possible, we have to figure out which # mez we are plugged into. # On RHEL4U8, we look in the driver subdir, all others # in the device/driver subdir for the pcie bus. 
pciprefix="[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]:" if [ ${bl2xB} -ne 0 ]; then pbase="${pbase} 0,1=${bl2xB},${m2_bay_h1[$bay]}" else while [ $nunit -ne 0 ]; do (( nunit-- )) buspath=$(readlink -m ${qb}${nunit}/device) if [ -n "$(echo ${buspath} | grep "${pciprefix}$(printf "%02d" ${mez1bus}):")" ]; then pbase="${pbase} ${nunit},1=${mez1p1_idx[$bay]},${m1_bay_h1[$bay]}" if [ ${full} -eq 1 ]; then pbase="${pbase} ${nunit},2=${mez1p2_idx[$bay]},${m1_bay_h1[$bay]}" fi elif [ -n "$(echo ${buspath} | grep "${pciprefix}$(printf "%02d" ${mez2bus}):")" ]; then pbase="${pbase} ${nunit},1=${mez2p1_idx[$bay]},${m2_bay_h1[$bay]}" pbase="${pbase} ${nunit},2=${mez2p2_idx[$bay]},${m2_bay_h1[$bay]}" elif [ -n "$(echo ${buspath} | grep "${pciprefix}$(printf "%02d" ${mez3bus}):")" ]; then pbase="${pbase} ${nunit},1=${mez3p1_idx[$bay]},${m3_bay_h1[$bay]}" pbase="${pbase} ${nunit},2=${mez3p2_idx[$bay]},${m3_bay_h1[$bay]}" else warn_and_log Mismatch on mezbus ${mez1bus},${mez2bus},${mez3bus} \ and unit ${nunit}, no serdes setup fi done fi echo -n ${pbase} > $parmf } setup_qme() { local parm parmf sn pbase local -i nunit=0 bay idx bpver=${QIB_QME_BPVER:1} local -a bp0_idx bp1_idx set # tx settings for Dell Backplane v1.0 bp0_idx=( 0 22 23 24 25 26 24 27 28 22 23 24 25 26 24 27 28 ) # tx settings for Dell Backplane v1.1 bp1_idx=( 0 29 29 30 31 32 33 30 29 29 29 30 31 32 33 30 29 ) for parm in parameters/${serdes_parm} ${serdes_parm}; do if [ -e /sys/module/ib_qib/$parm ]; then parmf=/sys/module/ib_qib/$parm break; fi done if [ ! "$parmf" ]; then warn_and_log Unable to find ${serdes_parm} parameter return fi read pbase < $parmf parm=( $(echo ${qb}*) ) nunit=${#parm[*]} if [ -e /sys/module/ib_qib/parameters/qme_bp ]; then read bpver < /sys/module/ib_qib/parameters/qme_bp if [ ${bpver} -ne 0 -a ${bpver} -ne 1 ]; then warn_and_log "Invalid Dell backplane version (${bpver}). Defaulting to 1." bpver=1 fi fi eval 'set=( ${bp'${bpver}'_idx[@]} )' # we get two serial numbers normally, use 2nd if present, else first sn="$(dmidecode -t 2 | grep -i serial | tail -1)" case ${sn} in *[sS]erial\ [nN]umber*) bay="$(echo $sn | sed -e 's/\.$//' -e 's/.*\.0*//' -e 's/[abcd]$//')" if [ ${bay} -gt ${#set[@]} ]; then warn_and_log Unexpected QME7342 bay info: ${sn}, no Tx params return fi idx=${set[bay]} # H1 is same for all QME bays, so no need to specify. while [ $nunit -ne 0 ]; do (( nunit-- )) pbase="${pbase} ${nunit},1=${idx} ${nunit},2=${idx}" done echo -n ${pbase} > $parmf ;; *) warn_and_log No QME7342 bay information, no Tx params return;; esac } has_qib=$(lspci -n 2>/dev/null | grep -i "1077\|1fc1") if [ ! "${has_qib}" ]; then exit 0 fi case "$1" in start) has_qmh7342=$(grep QMH7342 ${qb}*/hca_type 2>/dev/null) if [ "${has_qmh7342}" ]; then setup_qmh else has_qme7342=$(grep QME7342 ${qb}*/hca_type 2>/dev/null) if [ "${has_qme7342}" ]; then setup_qme fi fi ;; stop) warn_and_log stop operation deprecated ;; esac rdma-core-56.1/providers/ipathverbs/truescale.conf.in000066400000000000000000000001571477342711600227640ustar00rootroot00000000000000install ib_qib modprobe -i ib_qib $CMDLINE_OPTS && @CMAKE_INSTALL_FULL_LIBEXECDIR@/truescale-serdes.cmds start rdma-core-56.1/providers/ipathverbs/verbs.c000066400000000000000000000361371477342711600210150ustar00rootroot00000000000000/* * Copyright (c) 2006-2009 QLogic Corp. All rights reserved. * Copyright (c) 2005. PathScale, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses.
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * * Patent licenses, if any, provided herein do not apply to * combinations of this program with other software, or any other * product whatsoever. */ #include #include #include #include #include #include #include #include "ipathverbs.h" #include "ipath-abi.h" int ipath_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct ib_uverbs_ex_query_device_resp resp; size_t resp_size = sizeof(resp); uint64_t raw_fw_ver; unsigned major, minor, sub_minor; int ret; ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp, &resp_size); if (ret) return ret; raw_fw_ver = resp.base.fw_ver; major = (raw_fw_ver >> 32) & 0xffff; minor = (raw_fw_ver >> 16) & 0xffff; sub_minor = raw_fw_ver & 0xffff; snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver), "%d.%d.%d", major, minor, sub_minor); return 0; } int ipath_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr) { struct ibv_query_port cmd; return ibv_cmd_query_port(context, port, attr, &cmd, sizeof cmd); } struct ibv_pd *ipath_alloc_pd(struct ibv_context *context) { struct ibv_alloc_pd cmd; struct ib_uverbs_alloc_pd_resp resp; struct ibv_pd *pd; pd = malloc(sizeof *pd); if (!pd) return NULL; if (ibv_cmd_alloc_pd(context, pd, &cmd, sizeof cmd, &resp, sizeof resp)) { free(pd); return NULL; } return pd; } int ipath_free_pd(struct ibv_pd *pd) { int ret; ret = ibv_cmd_dealloc_pd(pd); if (ret) return ret; free(pd); return 0; } struct ibv_mr *ipath_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access) { struct verbs_mr *vmr; struct ibv_reg_mr cmd; struct ib_uverbs_reg_mr_resp resp; int ret; vmr = malloc(sizeof(*vmr)); if (!vmr) return NULL; ret = ibv_cmd_reg_mr(pd, addr, length, hca_va, access, vmr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) { free(vmr); return NULL; } return &vmr->ibv_mr; } int ipath_dereg_mr(struct verbs_mr *vmr) { int ret; ret = ibv_cmd_dereg_mr(vmr); if (ret) return ret; free(vmr); return 0; } struct ibv_cq *ipath_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector) { struct ipath_cq *cq; struct ipath_create_cq_resp resp; int ret; size_t size; cq = malloc(sizeof *cq); if (!cq) return NULL; ret = ibv_cmd_create_cq(context, cqe, channel, 
comp_vector, &cq->ibv_cq, NULL, 0, &resp.ibv_resp, sizeof resp); if (ret) { free(cq); return NULL; } size = sizeof(struct ipath_cq_wc) + sizeof(struct ipath_wc) * cqe; cq->queue = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, context->cmd_fd, resp.offset); if ((void *) cq->queue == MAP_FAILED) { ibv_cmd_destroy_cq(&cq->ibv_cq); free(cq); return NULL; } pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE); return &cq->ibv_cq; } struct ibv_cq *ipath_create_cq_v1(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector) { struct ibv_cq *cq; int ret; cq = malloc(sizeof *cq); if (!cq) return NULL; ret = ibv_cmd_create_cq(context, cqe, channel, comp_vector, cq, NULL, 0, NULL, 0); if (ret) { free(cq); return NULL; } return cq; } int ipath_resize_cq(struct ibv_cq *ibcq, int cqe) { struct ipath_cq *cq = to_icq(ibcq); struct ibv_resize_cq cmd; struct ipath_resize_cq_resp resp; size_t size; int ret; pthread_spin_lock(&cq->lock); /* Save the old size so we can unmmap the queue. */ size = sizeof(struct ipath_cq_wc) + (sizeof(struct ipath_wc) * cq->ibv_cq.cqe); ret = ibv_cmd_resize_cq(ibcq, cqe, &cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (ret) { pthread_spin_unlock(&cq->lock); return ret; } (void) munmap(cq->queue, size); size = sizeof(struct ipath_cq_wc) + (sizeof(struct ipath_wc) * cq->ibv_cq.cqe); cq->queue = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, ibcq->context->cmd_fd, resp.offset); ret = errno; pthread_spin_unlock(&cq->lock); if ((void *) cq->queue == MAP_FAILED) return ret; return 0; } int ipath_resize_cq_v1(struct ibv_cq *ibcq, int cqe) { struct ibv_resize_cq cmd; struct ib_uverbs_resize_cq_resp resp; return ibv_cmd_resize_cq(ibcq, cqe, &cmd, sizeof cmd, &resp, sizeof resp); } int ipath_destroy_cq(struct ibv_cq *ibcq) { struct ipath_cq *cq = to_icq(ibcq); int ret; ret = ibv_cmd_destroy_cq(ibcq); if (ret) return ret; (void) munmap(cq->queue, sizeof(struct ipath_cq_wc) + (sizeof(struct ipath_wc) * cq->ibv_cq.cqe)); free(cq); return 0; } int ipath_destroy_cq_v1(struct ibv_cq *ibcq) { int ret; ret = ibv_cmd_destroy_cq(ibcq); if (!ret) free(ibcq); return ret; } int ipath_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc) { struct ipath_cq *cq = to_icq(ibcq); struct ipath_cq_wc *q; int npolled; uint32_t tail; pthread_spin_lock(&cq->lock); q = cq->queue; tail = atomic_load_explicit(&q->tail, memory_order_relaxed); for (npolled = 0; npolled < ne; ++npolled, ++wc) { if (tail == atomic_load(&q->head)) break; /* Make sure entry is read after head index is read. 
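* (This acquire fence pairs with the producer, which fills in the CQ entry before advancing the head index, so the memcpy below reads a fully written entry.)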
*/ atomic_thread_fence(memory_order_acquire); memcpy(wc, &q->queue[tail], sizeof(*wc)); if (tail == cq->ibv_cq.cqe) tail = 0; else tail++; } atomic_store(&q->tail, tail); pthread_spin_unlock(&cq->lock); return npolled; } struct ibv_qp *ipath_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct ibv_create_qp cmd; struct ipath_create_qp_resp resp; struct ipath_qp *qp; int ret; size_t size; qp = malloc(sizeof *qp); if (!qp) return NULL; ret = ibv_cmd_create_qp(pd, &qp->ibv_qp, attr, &cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (ret) { free(qp); return NULL; } if (attr->srq) { qp->rq.size = 0; qp->rq.max_sge = 0; qp->rq.rwq = NULL; } else { qp->rq.size = attr->cap.max_recv_wr + 1; qp->rq.max_sge = attr->cap.max_recv_sge; size = sizeof(struct ipath_rwq) + (sizeof(struct ipath_rwqe) + (sizeof(struct ibv_sge) * qp->rq.max_sge)) * qp->rq.size; qp->rq.rwq = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.offset); if ((void *) qp->rq.rwq == MAP_FAILED) { ibv_cmd_destroy_qp(&qp->ibv_qp); free(qp); return NULL; } } pthread_spin_init(&qp->rq.lock, PTHREAD_PROCESS_PRIVATE); return &qp->ibv_qp; } struct ibv_qp *ipath_create_qp_v1(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct ibv_create_qp cmd; struct ib_uverbs_create_qp_resp resp; struct ibv_qp *qp; int ret; qp = malloc(sizeof *qp); if (!qp) return NULL; ret = ibv_cmd_create_qp(pd, qp, attr, &cmd, sizeof cmd, &resp, sizeof resp); if (ret) { free(qp); return NULL; } return qp; } int ipath_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd; return ibv_cmd_query_qp(qp, attr, attr_mask, init_attr, &cmd, sizeof cmd); } int ipath_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask) { struct ibv_modify_qp cmd = {}; return ibv_cmd_modify_qp(qp, attr, attr_mask, &cmd, sizeof cmd); } int ipath_destroy_qp(struct ibv_qp *ibqp) { struct ipath_qp *qp = to_iqp(ibqp); int ret; ret = ibv_cmd_destroy_qp(ibqp); if (ret) return ret; if (qp->rq.rwq) { size_t size; size = sizeof(struct ipath_rwq) + (sizeof(struct ipath_rwqe) + (sizeof(struct ibv_sge) * qp->rq.max_sge)) * qp->rq.size; (void) munmap(qp->rq.rwq, size); } free(qp); return 0; } int ipath_destroy_qp_v1(struct ibv_qp *ibqp) { int ret; ret = ibv_cmd_destroy_qp(ibqp); if (!ret) free(ibqp); return ret; } int ipath_post_send(struct ibv_qp *qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { unsigned wr_count; struct ibv_send_wr *i; /* Sanity check the number of WRs being posted */ for (i = wr, wr_count = 0; i; i = i->next) if (++wr_count > 10) goto iter; return ibv_cmd_post_send(qp, wr, bad_wr); iter: do { struct ibv_send_wr *next; int ret; next = i->next; i->next = NULL; ret = ibv_cmd_post_send(qp, wr, bad_wr); i->next = next; if (ret) return ret; if (next == NULL) break; wr = next; for (i = wr, wr_count = 0; i->next; i = i->next) if (++wr_count > 2) break; } while (1); return 0; } static int post_recv(struct ipath_rq *rq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct ibv_recv_wr *i; struct ipath_rwq *rwq; struct ipath_rwqe *wqe; uint32_t head; int n, ret; pthread_spin_lock(&rq->lock); rwq = rq->rwq; head = atomic_load_explicit(&rwq->head, memory_order_relaxed); for (i = wr; i; i = i->next) { if ((unsigned) i->num_sge > rq->max_sge) { ret = EINVAL; goto bad; } wqe = get_rwqe_ptr(rq, head); if (++head >= rq->size) head = 0; if (head == atomic_load(&rwq->tail)) { ret = ENOMEM; goto bad; } wqe->wr_id = i->wr_id; wqe->num_sge = i->num_sge; 
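/* Copy the caller's SG list into the mmap'ed rwqe; the release fence below publishes the entry before the head index is advanced. */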
for (n = 0; n < wqe->num_sge; n++) wqe->sg_list[n] = i->sg_list[n]; /* Make sure queue entry is written before the head index. */ atomic_thread_fence(memory_order_release); atomic_store(&rwq->head, head); } ret = 0; goto done; bad: if (bad_wr) *bad_wr = i; done: pthread_spin_unlock(&rq->lock); return ret; } int ipath_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct ipath_qp *qp = to_iqp(ibqp); return post_recv(&qp->rq, wr, bad_wr); } struct ibv_srq *ipath_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr) { struct ipath_srq *srq; struct ibv_create_srq cmd; struct ipath_create_srq_resp resp; int ret; size_t size; srq = malloc(sizeof *srq); if (srq == NULL) return NULL; ret = ibv_cmd_create_srq(pd, &srq->ibv_srq, attr, &cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (ret) { free(srq); return NULL; } srq->rq.size = attr->attr.max_wr + 1; srq->rq.max_sge = attr->attr.max_sge; size = sizeof(struct ipath_rwq) + (sizeof(struct ipath_rwqe) + (sizeof(struct ibv_sge) * srq->rq.max_sge)) * srq->rq.size; srq->rq.rwq = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.offset); if ((void *) srq->rq.rwq == MAP_FAILED) { ibv_cmd_destroy_srq(&srq->ibv_srq); free(srq); return NULL; } pthread_spin_init(&srq->rq.lock, PTHREAD_PROCESS_PRIVATE); return &srq->ibv_srq; } struct ibv_srq *ipath_create_srq_v1(struct ibv_pd *pd, struct ibv_srq_init_attr *attr) { struct ibv_srq *srq; struct ibv_create_srq cmd; struct ib_uverbs_create_srq_resp resp; int ret; srq = malloc(sizeof *srq); if (srq == NULL) return NULL; ret = ibv_cmd_create_srq(pd, srq, attr, &cmd, sizeof cmd, &resp, sizeof resp); if (ret) { free(srq); return NULL; } return srq; } int ipath_modify_srq(struct ibv_srq *ibsrq, struct ibv_srq_attr *attr, int attr_mask) { struct ipath_srq *srq = to_isrq(ibsrq); struct ipath_modify_srq_cmd cmd; __u64 offset; size_t size = 0; /* Shut up gcc */ int ret; if (attr_mask & IBV_SRQ_MAX_WR) { pthread_spin_lock(&srq->rq.lock); /* Save the old size so we can unmmap the queue. */ size = sizeof(struct ipath_rwq) + (sizeof(struct ipath_rwqe) + (sizeof(struct ibv_sge) * srq->rq.max_sge)) * srq->rq.size; } cmd.offset_addr = (uintptr_t) &offset; ret = ibv_cmd_modify_srq(ibsrq, attr, attr_mask, &cmd.ibv_cmd, sizeof cmd); if (ret) { if (attr_mask & IBV_SRQ_MAX_WR) pthread_spin_unlock(&srq->rq.lock); return ret; } if (attr_mask & IBV_SRQ_MAX_WR) { (void) munmap(srq->rq.rwq, size); srq->rq.size = attr->max_wr + 1; size = sizeof(struct ipath_rwq) + (sizeof(struct ipath_rwqe) + (sizeof(struct ibv_sge) * srq->rq.max_sge)) * srq->rq.size; srq->rq.rwq = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, ibsrq->context->cmd_fd, offset); pthread_spin_unlock(&srq->rq.lock); /* XXX Now we have no receive queue. 
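* The old mapping was already unmapped above, so if this remap failed the SRQ is left without a usable receive queue even though the kernel accepted the resize; the caller only sees errno.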
*/ if ((void *) srq->rq.rwq == MAP_FAILED) return errno; } return 0; } int ipath_modify_srq_v1(struct ibv_srq *ibsrq, struct ibv_srq_attr *attr, int attr_mask) { struct ibv_modify_srq cmd; return ibv_cmd_modify_srq(ibsrq, attr, attr_mask, &cmd, sizeof cmd); } int ipath_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr) { struct ibv_query_srq cmd; return ibv_cmd_query_srq(srq, attr, &cmd, sizeof cmd); } int ipath_destroy_srq(struct ibv_srq *ibsrq) { struct ipath_srq *srq = to_isrq(ibsrq); size_t size; int ret; ret = ibv_cmd_destroy_srq(ibsrq); if (ret) return ret; size = sizeof(struct ipath_rwq) + (sizeof(struct ipath_rwqe) + (sizeof(struct ibv_sge) * srq->rq.max_sge)) * srq->rq.size; (void) munmap(srq->rq.rwq, size); free(srq); return 0; } int ipath_destroy_srq_v1(struct ibv_srq *ibsrq) { int ret; ret = ibv_cmd_destroy_srq(ibsrq); if (!ret) free(ibsrq); return ret; } int ipath_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct ipath_srq *srq = to_isrq(ibsrq); return post_recv(&srq->rq, wr, bad_wr); } struct ibv_ah *ipath_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr) { struct ibv_ah *ah; struct ib_uverbs_create_ah_resp resp; ah = malloc(sizeof *ah); if (ah == NULL) return NULL; memset(&resp, 0, sizeof(resp)); if (ibv_cmd_create_ah(pd, ah, attr, &resp, sizeof(resp))) { free(ah); return NULL; } return ah; } int ipath_destroy_ah(struct ibv_ah *ah) { int ret; ret = ibv_cmd_destroy_ah(ah); if (ret) return ret; free(ah); return 0; } rdma-core-56.1/providers/irdma/000077500000000000000000000000001477342711600164435ustar00rootroot00000000000000rdma-core-56.1/providers/irdma/CMakeLists.txt000066400000000000000000000002211477342711600211760ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Intel Corporation. 
rdma_provider(irdma uk.c umain.c uverbs.c ) rdma-core-56.1/providers/irdma/abi.h000066400000000000000000000025731477342711600173560ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */ /* Copyright (C) 2019 - 2020 Intel Corporation */ #ifndef PROVIDER_IRDMA_ABI_H #define PROVIDER_IRDMA_ABI_H #include "irdma.h" #include #include #include #define IRDMA_MIN_ABI_VERSION 0 #define IRDMA_MAX_ABI_VERSION 5 DECLARE_DRV_CMD(irdma_ualloc_pd, IB_USER_VERBS_CMD_ALLOC_PD, empty, irdma_alloc_pd_resp); DECLARE_DRV_CMD(irdma_ucreate_cq, IB_USER_VERBS_CMD_CREATE_CQ, irdma_create_cq_req, irdma_create_cq_resp); DECLARE_DRV_CMD(irdma_ucreate_cq_ex, IB_USER_VERBS_EX_CMD_CREATE_CQ, irdma_create_cq_req, irdma_create_cq_resp); DECLARE_DRV_CMD(irdma_uresize_cq, IB_USER_VERBS_CMD_RESIZE_CQ, irdma_resize_cq_req, empty); DECLARE_DRV_CMD(irdma_ucreate_qp, IB_USER_VERBS_CMD_CREATE_QP, irdma_create_qp_req, irdma_create_qp_resp); DECLARE_DRV_CMD(irdma_umodify_qp, IB_USER_VERBS_EX_CMD_MODIFY_QP, irdma_modify_qp_req, irdma_modify_qp_resp); DECLARE_DRV_CMD(irdma_get_context, IB_USER_VERBS_CMD_GET_CONTEXT, irdma_alloc_ucontext_req, irdma_alloc_ucontext_resp); DECLARE_DRV_CMD(irdma_ureg_mr, IB_USER_VERBS_CMD_REG_MR, irdma_mem_reg_req, empty); DECLARE_DRV_CMD(irdma_urereg_mr, IB_USER_VERBS_CMD_REREG_MR, irdma_mem_reg_req, empty); DECLARE_DRV_CMD(irdma_ucreate_ah, IB_USER_VERBS_CMD_CREATE_AH, empty, irdma_create_ah_resp); #endif /* PROVIDER_IRDMA_ABI_H */ rdma-core-56.1/providers/irdma/defs.h000066400000000000000000000227561477342711600175510ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */ /* Copyright (c) 2015 - 2023 Intel Corporation */ #ifndef IRDMA_DEFS_H #define IRDMA_DEFS_H #include "osdep.h" #define IRDMA_QP_TYPE_IWARP 1 #define IRDMA_QP_TYPE_UDA 2 #define IRDMA_QP_TYPE_ROCE_RC 3 #define IRDMA_QP_TYPE_ROCE_UD 4 #define IRDMA_HW_PAGE_SIZE 4096 #define IRDMA_HW_PAGE_SHIFT 12 #define IRDMA_CQE_QTYPE_RQ 0 #define IRDMA_CQE_QTYPE_SQ 1 #define IRDMA_QP_SW_MIN_WQSIZE 8u /* in WRs*/ #define IRDMA_QP_WQE_MIN_SIZE 32 #define IRDMA_QP_WQE_MAX_SIZE 256 #define IRDMA_QP_WQE_MIN_QUANTA 1 #define IRDMA_MAX_RQ_WQE_SHIFT_GEN1 2 #define IRDMA_MAX_RQ_WQE_SHIFT_GEN2 3 #define IRDMA_SQ_RSVD 258 #define IRDMA_RQ_RSVD 1 #define IRDMA_FEATURE_RTS_AE 1ULL #define IRDMA_FEATURE_CQ_RESIZE 2ULL #define IRDMAQP_OP_RDMA_WRITE 0x00 #define IRDMAQP_OP_RDMA_READ 0x01 #define IRDMAQP_OP_RDMA_SEND 0x03 #define IRDMAQP_OP_RDMA_SEND_INV 0x04 #define IRDMAQP_OP_RDMA_SEND_SOL_EVENT 0x05 #define IRDMAQP_OP_RDMA_SEND_SOL_EVENT_INV 0x06 #define IRDMAQP_OP_BIND_MW 0x08 #define IRDMAQP_OP_FAST_REGISTER 0x09 #define IRDMAQP_OP_LOCAL_INVALIDATE 0x0a #define IRDMAQP_OP_RDMA_READ_LOC_INV 0x0b #define IRDMAQP_OP_NOP 0x0c #define IRDMA_CQPHC_QPCTX GENMASK_ULL(63, 0) #define IRDMA_QP_DBSA_HW_SQ_TAIL GENMASK_ULL(14, 0) #define IRDMA_CQ_DBSA_CQEIDX GENMASK_ULL(19, 0) #define IRDMA_CQ_DBSA_SW_CQ_SELECT GENMASK_ULL(13, 0) #define IRDMA_CQ_DBSA_ARM_NEXT BIT_ULL(14) #define IRDMA_CQ_DBSA_ARM_NEXT_SE BIT_ULL(15) #define IRDMA_CQ_DBSA_ARM_SEQ_NUM GENMASK_ULL(17, 16) /* CQP and iWARP Completion Queue */ #define IRDMA_CQ_QPCTX IRDMA_CQPHC_QPCTX #define IRDMA_CQ_MINERR GENMASK_ULL(15, 0) #define IRDMA_CQ_MAJERR GENMASK_ULL(31, 16) #define IRDMA_CQ_WQEIDX GENMASK_ULL(46, 32) #define IRDMA_CQ_EXTCQE BIT_ULL(50) #define IRDMA_OOO_CMPL BIT_ULL(54) #define IRDMA_CQ_ERROR BIT_ULL(55) #define IRDMA_CQ_SQ BIT_ULL(62) #define IRDMA_CQ_VALID BIT_ULL(63) #define IRDMA_CQ_IMMVALID BIT_ULL(62) #define IRDMA_CQ_UDSMACVALID 
BIT_ULL(61) #define IRDMA_CQ_UDVLANVALID BIT_ULL(60) #define IRDMA_CQ_UDSMAC GENMASK_ULL(47, 0) #define IRDMA_CQ_UDVLAN GENMASK_ULL(63, 48) #define IRDMA_CQ_IMMDATA_S 0 #define IRDMA_CQ_IMMDATA_M (0xffffffffffffffffULL << IRDMA_CQ_IMMVALID_S) #define IRDMA_CQ_IMMDATALOW32 GENMASK_ULL(31, 0) #define IRDMA_CQ_IMMDATAUP32 GENMASK_ULL(63, 32) #define IRDMACQ_PAYLDLEN GENMASK_ULL(31, 0) #define IRDMACQ_TCPSEQNUMRTT GENMASK_ULL(63, 32) #define IRDMACQ_INVSTAG GENMASK_ULL(31, 0) #define IRDMACQ_QPID GENMASK_ULL(55, 32) #define IRDMACQ_UDSRCQPN GENMASK_ULL(31, 0) #define IRDMACQ_PSHDROP BIT_ULL(51) #define IRDMACQ_STAG BIT_ULL(53) #define IRDMACQ_IPV4 BIT_ULL(53) #define IRDMACQ_SOEVENT BIT_ULL(54) #define IRDMACQ_OP GENMASK_ULL(61, 56) /* Manage Push Page - MPP */ #define IRDMA_INVALID_PUSH_PAGE_INDEX_GEN_1 0xffff #define IRDMA_INVALID_PUSH_PAGE_INDEX 0xffffffff #define IRDMAQPSQ_OPCODE GENMASK_ULL(37, 32) #define IRDMAQPSQ_COPY_HOST_PBL BIT_ULL(43) #define IRDMAQPSQ_ADDFRAGCNT GENMASK_ULL(41, 38) #define IRDMAQPSQ_PUSHWQE BIT_ULL(56) #define IRDMAQPSQ_STREAMMODE BIT_ULL(58) #define IRDMAQPSQ_WAITFORRCVPDU BIT_ULL(59) #define IRDMAQPSQ_READFENCE BIT_ULL(60) #define IRDMAQPSQ_LOCALFENCE BIT_ULL(61) #define IRDMAQPSQ_UDPHEADER BIT_ULL(61) #define IRDMAQPSQ_L4LEN GENMASK_ULL(45, 42) #define IRDMAQPSQ_SIGCOMPL BIT_ULL(62) #define IRDMAQPSQ_VALID BIT_ULL(63) #define IRDMAQPSQ_FRAG_TO IRDMA_CQPHC_QPCTX #define IRDMAQPSQ_FRAG_VALID BIT_ULL(63) #define IRDMAQPSQ_FRAG_LEN GENMASK_ULL(62, 32) #define IRDMAQPSQ_FRAG_STAG GENMASK_ULL(31, 0) #define IRDMAQPSQ_GEN1_FRAG_LEN GENMASK_ULL(31, 0) #define IRDMAQPSQ_GEN1_FRAG_STAG GENMASK_ULL(63, 32) #define IRDMAQPSQ_REMSTAGINV GENMASK_ULL(31, 0) #define IRDMAQPSQ_DESTQKEY GENMASK_ULL(31, 0) #define IRDMAQPSQ_DESTQPN GENMASK_ULL(55, 32) #define IRDMAQPSQ_AHID GENMASK_ULL(16, 0) #define IRDMAQPSQ_INLINEDATAFLAG BIT_ULL(57) #define IRDMA_INLINE_VALID_S 7 #define IRDMAQPSQ_INLINEDATALEN GENMASK_ULL(55, 48) #define IRDMAQPSQ_IMMDATAFLAG BIT_ULL(47) #define IRDMAQPSQ_REPORTRTT BIT_ULL(46) #define IRDMAQPSQ_IMMDATA GENMASK_ULL(63, 0) #define IRDMAQPSQ_REMSTAG GENMASK_ULL(31, 0) #define IRDMAQPSQ_REMTO IRDMA_CQPHC_QPCTX #define IRDMAQPSQ_STAGRIGHTS GENMASK_ULL(52, 48) #define IRDMAQPSQ_VABASEDTO BIT_ULL(53) #define IRDMAQPSQ_MEMWINDOWTYPE BIT_ULL(54) #define IRDMAQPSQ_MWLEN IRDMA_CQPHC_QPCTX #define IRDMAQPSQ_PARENTMRSTAG GENMASK_ULL(63, 32) #define IRDMAQPSQ_MWSTAG GENMASK_ULL(31, 0) #define IRDMAQPSQ_BASEVA_TO_FBO IRDMA_CQPHC_QPCTX #define IRDMAQPSQ_LOCSTAG GENMASK_ULL(31, 0) /* iwarp QP RQ WQE common fields */ #define IRDMAQPRQ_ADDFRAGCNT IRDMAQPSQ_ADDFRAGCNT #define IRDMAQPRQ_VALID IRDMAQPSQ_VALID #define IRDMAQPRQ_COMPLCTX IRDMA_CQPHC_QPCTX #define IRDMAQPRQ_FRAG_LEN IRDMAQPSQ_FRAG_LEN #define IRDMAQPRQ_STAG IRDMAQPSQ_FRAG_STAG #define IRDMAQPRQ_TO IRDMAQPSQ_FRAG_TO #define IRDMAPFINT_OICR_HMC_ERR_M BIT(26) #define IRDMAPFINT_OICR_PE_PUSH_M BIT(27) #define IRDMAPFINT_OICR_PE_CRITERR_M BIT(28) #define IRDMA_CQP_INIT_WQE(wqe) memset(wqe, 0, 64) #define IRDMA_GET_CURRENT_CQ_ELEM(_cq) \ ( \ (_cq)->cq_base[IRDMA_RING_CURRENT_HEAD((_cq)->cq_ring)].buf \ ) #define IRDMA_GET_CURRENT_EXTENDED_CQ_ELEM(_cq) \ ( \ ((struct irdma_extended_cqe *) \ ((_cq)->cq_base))[IRDMA_RING_CURRENT_HEAD((_cq)->cq_ring)].buf \ ) #define IRDMA_RING_INIT(_ring, _size) \ { \ (_ring).head = 0; \ (_ring).tail = 0; \ (_ring).size = (_size); \ } #define IRDMA_RING_SIZE(_ring) ((_ring).size) #define IRDMA_RING_CURRENT_HEAD(_ring) ((_ring).head) #define IRDMA_RING_CURRENT_TAIL(_ring) ((_ring).tail) 
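/* A sketch of the arithmetic behind the ring helpers below (illustrative formulas, not new definitions): used = (head + size - tail) % size as in IRDMA_RING_USED_QUANTA, and free = size - used - 1 as in IRDMA_RING_FREE_QUANTA, so head == tail means empty and used == size - 1 means full. The SQ variants hold back an additional 256 quanta (32-byte WQE units), which is why IRDMA_SQ_RING_FULL_ERR triggers once used reaches size - 257. */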
#define IRDMA_RING_MOVE_HEAD(_ring, _retcode) \ { \ register __u32 size; \ size = (_ring).size; \ if (!IRDMA_RING_FULL_ERR(_ring)) { \ (_ring).head = ((_ring).head + 1) % size; \ (_retcode) = 0; \ } else { \ (_retcode) = ENOMEM; \ } \ } #define IRDMA_RING_MOVE_HEAD_BY_COUNT(_ring, _count, _retcode) \ { \ register __u32 size; \ size = (_ring).size; \ if ((IRDMA_RING_USED_QUANTA(_ring) + (_count)) < size) { \ (_ring).head = ((_ring).head + (_count)) % size; \ (_retcode) = 0; \ } else { \ (_retcode) = ENOMEM; \ } \ } #define IRDMA_SQ_RING_MOVE_HEAD(_ring, _retcode) \ { \ register __u32 size; \ size = (_ring).size; \ if (!IRDMA_SQ_RING_FULL_ERR(_ring)) { \ (_ring).head = ((_ring).head + 1) % size; \ (_retcode) = 0; \ } else { \ (_retcode) = ENOMEM; \ } \ } #define IRDMA_SQ_RING_MOVE_HEAD_BY_COUNT(_ring, _count, _retcode) \ { \ register __u32 size; \ size = (_ring).size; \ if ((IRDMA_RING_USED_QUANTA(_ring) + (_count)) < (size - 256)) { \ (_ring).head = ((_ring).head + (_count)) % size; \ (_retcode) = 0; \ } else { \ (_retcode) = ENOMEM; \ } \ } #define IRDMA_RING_MOVE_HEAD_BY_COUNT_NOCHECK(_ring, _count) \ (_ring).head = ((_ring).head + (_count)) % (_ring).size #define IRDMA_RING_MOVE_TAIL(_ring) \ (_ring).tail = ((_ring).tail + 1) % (_ring).size #define IRDMA_RING_MOVE_HEAD_NOCHECK(_ring) \ (_ring).head = ((_ring).head + 1) % (_ring).size #define IRDMA_RING_MOVE_TAIL_BY_COUNT(_ring, _count) \ (_ring).tail = ((_ring).tail + (_count)) % (_ring).size #define IRDMA_RING_SET_TAIL(_ring, _pos) \ (_ring).tail = (_pos) % (_ring).size #define IRDMA_RING_FULL_ERR(_ring) \ ( \ (IRDMA_RING_USED_QUANTA(_ring) == ((_ring).size - 1)) \ ) #define IRDMA_ERR_RING_FULL2(_ring) \ ( \ (IRDMA_RING_USED_QUANTA(_ring) == ((_ring).size - 2)) \ ) #define IRDMA_ERR_RING_FULL3(_ring) \ ( \ (IRDMA_RING_USED_QUANTA(_ring) == ((_ring).size - 3)) \ ) #define IRDMA_SQ_RING_FULL_ERR(_ring) \ ( \ (IRDMA_RING_USED_QUANTA(_ring) == ((_ring).size - 257)) \ ) #define IRDMA_ERR_SQ_RING_FULL2(_ring) \ ( \ (IRDMA_RING_USED_QUANTA(_ring) == ((_ring).size - 258)) \ ) #define IRDMA_ERR_SQ_RING_FULL3(_ring) \ ( \ (IRDMA_RING_USED_QUANTA(_ring) == ((_ring).size - 259)) \ ) #define IRDMA_RING_MORE_WORK(_ring) \ ( \ (IRDMA_RING_USED_QUANTA(_ring) != 0) \ ) #define IRDMA_RING_USED_QUANTA(_ring) \ ( \ (((_ring).head + (_ring).size - (_ring).tail) % (_ring).size) \ ) #define IRDMA_RING_FREE_QUANTA(_ring) \ ( \ ((_ring).size - IRDMA_RING_USED_QUANTA(_ring) - 1) \ ) #define IRDMA_SQ_RING_FREE_QUANTA(_ring) \ ( \ ((_ring).size - IRDMA_RING_USED_QUANTA(_ring) - 257) \ ) #define IRDMA_ATOMIC_RING_MOVE_HEAD(_ring, index, _retcode) \ { \ index = IRDMA_RING_CURRENT_HEAD(_ring); \ IRDMA_RING_MOVE_HEAD(_ring, _retcode); \ } enum irdma_qp_wqe_size { IRDMA_WQE_SIZE_32 = 32, IRDMA_WQE_SIZE_64 = 64, IRDMA_WQE_SIZE_96 = 96, IRDMA_WQE_SIZE_128 = 128, IRDMA_WQE_SIZE_256 = 256, }; /** * set_64bit_val - set 64 bit value to hw wqe * @wqe_words: wqe addr to write * @byte_index: index in wqe * @val: value to write **/ static inline void set_64bit_val(__le64 *wqe_words, __u32 byte_index, __u64 val) { wqe_words[byte_index >> 3] = htole64(val); } /** * set_32bit_val - set 32 bit value to hw wqe * @wqe_words: wqe addr to write * @byte_index: index in wqe * @val: value to write **/ static inline void set_32bit_val(__le32 *wqe_words, __u32 byte_index, __u32 val) { wqe_words[byte_index >> 2] = htole32(val); } /** * get_64bit_val - read 64 bit value from wqe * @wqe_words: wqe addr * @byte_index: index to read from * @val: read value **/ static inline void 
get_64bit_val(__le64 *wqe_words, __u32 byte_index, __u64 *val) { *val = le64toh(wqe_words[byte_index >> 3]); } /** * get_32bit_val - read 32 bit value from wqe * @wqe_words: wqe addr * @byte_index: index to read from * @val: return 32 bit value **/ static inline void get_32bit_val(__le32 *wqe_words, __u32 byte_index, __u32 *val) { *val = le32toh(wqe_words[byte_index >> 2]); } #endif /* IRDMA_DEFS_H */ rdma-core-56.1/providers/irdma/i40e_devids.h000066400000000000000000000022171477342711600207150ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */ /* Copyright (c) 2015 - 2019 Intel Corporation */ #ifndef I40E_DEVIDS_H #define I40E_DEVIDS_H /* Vendor ID */ #define I40E_INTEL_VENDOR_ID 0x8086 /* Device IDs */ #define I40E_DEV_ID_SFP_XL710 0x1572 #define I40E_DEV_ID_QEMU 0x1574 #define I40E_DEV_ID_KX_B 0x1580 #define I40E_DEV_ID_KX_C 0x1581 #define I40E_DEV_ID_QSFP_A 0x1583 #define I40E_DEV_ID_QSFP_B 0x1584 #define I40E_DEV_ID_QSFP_C 0x1585 #define I40E_DEV_ID_10G_BASE_T 0x1586 #define I40E_DEV_ID_20G_KR2 0x1587 #define I40E_DEV_ID_20G_KR2_A 0x1588 #define I40E_DEV_ID_10G_BASE_T4 0x1589 #define I40E_DEV_ID_25G_B 0x158A #define I40E_DEV_ID_25G_SFP28 0x158B #define I40E_DEV_ID_VF 0x154C #define I40E_DEV_ID_VF_HV 0x1571 #define I40E_DEV_ID_X722_A0 0x374C #define I40E_DEV_ID_X722_A0_VF 0x374D #define I40E_DEV_ID_KX_X722 0x37CE #define I40E_DEV_ID_QSFP_X722 0x37CF #define I40E_DEV_ID_SFP_X722 0x37D0 #define I40E_DEV_ID_1G_BASE_T_X722 0x37D1 #define I40E_DEV_ID_10G_BASE_T_X722 0x37D2 #define I40E_DEV_ID_SFP_I_X722 0x37D3 #define I40E_DEV_ID_X722_VF 0x37CD #define I40E_DEV_ID_X722_VF_HV 0x37D9 #endif /* I40E_DEVIDS_H */ rdma-core-56.1/providers/irdma/i40iw_hw.h000066400000000000000000000016361477342711600202540ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */ /* Copyright (c) 2015 - 2023 Intel Corporation */ #ifndef I40IW_HW_H #define I40IW_HW_H enum i40iw_device_caps_const { I40IW_MAX_WQ_FRAGMENT_COUNT = 3, I40IW_MAX_SGE_RD = 1, I40IW_MAX_PUSH_PAGE_COUNT = 0, I40IW_MAX_INLINE_DATA_SIZE = 48, I40IW_MAX_IRD_SIZE = 63, I40IW_MAX_ORD_SIZE = 127, I40IW_MAX_WQ_ENTRIES = 2048, I40IW_MAX_WQE_SIZE_RQ = 128, I40IW_MAX_PDS = 32768, I40IW_MAX_STATS_COUNT = 16, I40IW_MAX_CQ_SIZE = 1048575, I40IW_MAX_OUTBOUND_MSG_SIZE = 2147483647, I40IW_MAX_INBOUND_MSG_SIZE = 2147483647, I40IW_MIN_WQ_SIZE = 4 /* WQEs */, }; #define I40IW_QP_WQE_MIN_SIZE 32 #define I40IW_QP_WQE_MAX_SIZE 128 #define I40IW_MAX_RQ_WQE_SHIFT 2 #define I40IW_MAX_QUANTA_PER_WR 2 #define I40IW_QP_SW_MAX_SQ_QUANTA 2048 #define I40IW_QP_SW_MAX_RQ_QUANTA 16384 #define I40IW_QP_SW_MAX_WQ_QUANTA 2048 #endif /* I40IW_HW_H */ rdma-core-56.1/providers/irdma/ice_devids.h000066400000000000000000000052451477342711600207200ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */ /* Copyright (c) 2019 - 2020 Intel Corporation */ #ifndef ICE_DEVIDS_H #define ICE_DEVIDS_H #define PCI_VENDOR_ID_INTEL 0x8086 /* Device IDs */ /* Intel(R) Ethernet Connection E823-L for backplane */ #define ICE_DEV_ID_E823L_BACKPLANE 0x124C /* Intel(R) Ethernet Connection E823-L for SFP */ #define ICE_DEV_ID_E823L_SFP 0x124D /* Intel(R) Ethernet Connection E823-L/X557-AT 10GBASE-T */ #define ICE_DEV_ID_E823L_10G_BASE_T 0x124E /* Intel(R) Ethernet Connection E823-L 1GbE */ #define ICE_DEV_ID_E823L_1GBE 0x124F /* Intel(R) Ethernet Connection E823-L for QSFP */ #define ICE_DEV_ID_E823L_QSFP 0x151D /* Intel(R) Ethernet Controller E810-C for backplane */ #define ICE_DEV_ID_E810C_BACKPLANE 0x1591 /*
Intel(R) Ethernet Controller E810-C for QSFP */ #define ICE_DEV_ID_E810C_QSFP 0x1592 /* Intel(R) Ethernet Controller E810-C for SFP */ #define ICE_DEV_ID_E810C_SFP 0x1593 /* Intel(R) Ethernet Controller E810-XXV for backplane */ #define ICE_DEV_ID_E810_XXV_BACKPLANE 0x1599 /* Intel(R) Ethernet Controller E810-XXV for QSFP */ #define ICE_DEV_ID_E810_XXV_QSFP 0x159A /* Intel(R) Ethernet Controller E810-XXV for SFP */ #define ICE_DEV_ID_E810_XXV_SFP 0x159B /* Intel(R) Ethernet Connection E823-C for backplane */ #define ICE_DEV_ID_E823C_BACKPLANE 0x188A /* Intel(R) Ethernet Connection E823-C for QSFP */ #define ICE_DEV_ID_E823C_QSFP 0x188B /* Intel(R) Ethernet Connection E823-C for SFP */ #define ICE_DEV_ID_E823C_SFP 0x188C /* Intel(R) Ethernet Connection E823-C/X557-AT 10GBASE-T */ #define ICE_DEV_ID_E823C_10G_BASE_T 0x188D /* Intel(R) Ethernet Connection E823-C 1GbE */ #define ICE_DEV_ID_E823C_SGMII 0x188E /* Intel(R) Ethernet Connection C822N for backplane */ #define ICE_DEV_ID_C822N_BACKPLANE 0x1890 /* Intel(R) Ethernet Connection C822N for QSFP */ #define ICE_DEV_ID_C822N_QSFP 0x1891 /* Intel(R) Ethernet Connection C822N for SFP */ #define ICE_DEV_ID_C822N_SFP 0x1892 /* Intel(R) Ethernet Connection E822-C/X557-AT 10GBASE-T */ #define ICE_DEV_ID_E822C_10G_BASE_T 0x1893 /* Intel(R) Ethernet Connection E822-C 1GbE */ #define ICE_DEV_ID_E822C_SGMII 0x1894 /* Intel(R) Ethernet Connection E822-L for backplane */ #define ICE_DEV_ID_E822L_BACKPLANE 0x1897 /* Intel(R) Ethernet Connection E822-L for SFP */ #define ICE_DEV_ID_E822L_SFP 0x1898 /* Intel(R) Ethernet Connection E822-L/X557-AT 10GBASE-T */ #define ICE_DEV_ID_E822L_10G_BASE_T 0x1899 /* Intel(R) Ethernet Connection E822-L 1GbE */ #define ICE_DEV_ID_E822L_SGMII 0x189A #endif /* ICE_DEVIDS_H */ rdma-core-56.1/providers/irdma/irdma.h000066400000000000000000000022031477342711600177050ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */ /* Copyright (c) 2017 - 2023 Intel Corporation */ #ifndef IRDMA_H #define IRDMA_H #define IRDMA_WQEALLOC_WQE_DESC_INDEX GENMASK(31, 20) enum irdma_vers { IRDMA_GEN_RSVD, IRDMA_GEN_1, IRDMA_GEN_2, }; struct irdma_uk_attrs { __u64 feature_flags; __u32 max_hw_wq_frags; __u32 max_hw_read_sges; __u32 max_hw_inline; __u32 max_hw_rq_quanta; __u32 max_hw_wq_quanta; __u32 min_hw_cq_size; __u32 max_hw_cq_size; __u16 max_hw_sq_chunk; __u16 min_hw_wq_size; __u8 hw_rev; }; struct irdma_hw_attrs { struct irdma_uk_attrs uk_attrs; __u64 max_hw_outbound_msg_size; __u64 max_hw_inbound_msg_size; __u64 max_mr_size; __u32 min_hw_qp_id; __u32 min_hw_aeq_size; __u32 max_hw_aeq_size; __u32 min_hw_ceq_size; __u32 max_hw_ceq_size; __u32 max_hw_device_pages; __u32 max_hw_vf_fpm_id; __u32 first_hw_vf_fpm_id; __u32 max_hw_ird; __u32 max_hw_ord; __u32 max_hw_wqes; __u32 max_hw_pds; __u32 max_hw_ena_vf_count; __u32 max_qp_wr; __u32 max_pe_ready_count; __u32 max_done_count; __u32 max_sleep_count; __u32 max_cqp_compl_wait_time_ms; __u16 max_stat_inst; }; #endif /* IRDMA_H*/ rdma-core-56.1/providers/irdma/osdep.h000066400000000000000000000010621477342711600177250ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */ /* Copyright (c) 2015 - 2021 Intel Corporation */ #ifndef IRDMA_OSDEP_H #define IRDMA_OSDEP_H #include #include #include #include #include #include #include #include #include #include #include #include #include static inline void db_wr32(__u32 val, __u32 *wqe_word) { *wqe_word = val; } #endif /* IRDMA_OSDEP_H */ 
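/* Note: db_wr32() above is a plain 32-bit MMIO store with no implied ordering; callers fence first. A schematic of the pattern irdma_uk_qp_post_wr() in uk.c uses: udma_to_device_barrier(); then db_wr32(qp->qp_id, qp->wqe_alloc_db); so the WQE's valid bit is visible to the device before the doorbell rings. */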
rdma-core-56.1/providers/irdma/uk.c000066400000000000000000001337241477342711600172400ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB /* Copyright (c) 2015 - 2023 Intel Corporation */ #include #include "osdep.h" #include "defs.h" #include "user.h" #include "irdma.h" /** * irdma_set_fragment - set fragment in wqe * @wqe: wqe for setting fragment * @offset: offset value * @sge: sge length and stag * @valid: The wqe valid */ static void irdma_set_fragment(__le64 *wqe, __u32 offset, struct ibv_sge *sge, __u8 valid) { if (sge) { set_64bit_val(wqe, offset, FIELD_PREP(IRDMAQPSQ_FRAG_TO, sge->addr)); set_64bit_val(wqe, offset + 8, FIELD_PREP(IRDMAQPSQ_VALID, valid) | FIELD_PREP(IRDMAQPSQ_FRAG_LEN, sge->length) | FIELD_PREP(IRDMAQPSQ_FRAG_STAG, sge->lkey)); } else { set_64bit_val(wqe, offset, 0); set_64bit_val(wqe, offset + 8, FIELD_PREP(IRDMAQPSQ_VALID, valid)); } } /** * irdma_set_fragment_gen_1 - set fragment in wqe * @wqe: wqe for setting fragment * @offset: offset value * @sge: sge length and stag * @valid: wqe valid flag */ static void irdma_set_fragment_gen_1(__le64 *wqe, __u32 offset, struct ibv_sge *sge, __u8 valid) { if (sge) { set_64bit_val(wqe, offset, FIELD_PREP(IRDMAQPSQ_FRAG_TO, sge->addr)); set_64bit_val(wqe, offset + 8, FIELD_PREP(IRDMAQPSQ_GEN1_FRAG_LEN, sge->length) | FIELD_PREP(IRDMAQPSQ_GEN1_FRAG_STAG, sge->lkey)); } else { set_64bit_val(wqe, offset, 0); set_64bit_val(wqe, offset + 8, 0); } } /** * irdma_nop_1 - insert a NOP wqe * @qp: hw qp ptr */ static int irdma_nop_1(struct irdma_qp_uk *qp) { __u64 hdr; __le64 *wqe; __u32 wqe_idx; bool signaled = false; if (!qp->sq_ring.head) return EINVAL; wqe_idx = IRDMA_RING_CURRENT_HEAD(qp->sq_ring); wqe = qp->sq_base[wqe_idx].elem; qp->sq_wrtrk_array[wqe_idx].quanta = IRDMA_QP_WQE_MIN_QUANTA; set_64bit_val(wqe, 0, 0); set_64bit_val(wqe, 8, 0); set_64bit_val(wqe, 16, 0); hdr = FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMAQP_OP_NOP) | FIELD_PREP(IRDMAQPSQ_SIGCOMPL, signaled) | FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity); /* make sure WQE is written before valid bit is set */ udma_to_device_barrier(); set_64bit_val(wqe, 24, hdr); return 0; } /** * irdma_clr_wqes - clear next 128 sq entries * @qp: hw qp ptr * @qp_wqe_idx: wqe_idx */ void irdma_clr_wqes(struct irdma_qp_uk *qp, __u32 qp_wqe_idx) { __le64 *wqe; __u32 wqe_idx; if (!(qp_wqe_idx & 0x7F)) { wqe_idx = (qp_wqe_idx + 128) % qp->sq_ring.size; wqe = qp->sq_base[wqe_idx].elem; if (wqe_idx) memset(wqe, qp->swqe_polarity ? 0 : 0xFF, 0x1000); else memset(wqe, qp->swqe_polarity ? 
0xFF : 0, 0x1000); } } /** * irdma_uk_qp_post_wr - ring doorbell * @qp: hw qp ptr */ void irdma_uk_qp_post_wr(struct irdma_qp_uk *qp) { /* valid bit is written before ringing doorbell */ udma_to_device_barrier(); db_wr32(qp->qp_id, qp->wqe_alloc_db); qp->initial_ring.head = qp->sq_ring.head; } /** * irdma_qp_ring_push_db - ring qp doorbell * @qp: hw qp ptr * @wqe_idx: wqe index */ static void irdma_qp_ring_push_db(struct irdma_qp_uk *qp, __u32 wqe_idx) { set_32bit_val(qp->push_db, 0, FIELD_PREP(IRDMA_WQEALLOC_WQE_DESC_INDEX, wqe_idx >> 3) | qp->qp_id); qp->initial_ring.head = qp->sq_ring.head; qp->push_mode = true; qp->push_dropped = false; } void irdma_qp_push_wqe(struct irdma_qp_uk *qp, __le64 *wqe, __u16 quanta, __u32 wqe_idx, bool post_sq) { __le64 *push; if (IRDMA_RING_CURRENT_HEAD(qp->initial_ring) != IRDMA_RING_CURRENT_TAIL(qp->sq_ring) && !qp->push_mode) { if (post_sq) irdma_uk_qp_post_wr(qp); } else { push = (__le64 *)((uintptr_t)qp->push_wqe + (wqe_idx & 0x7) * 0x20); memcpy(push, wqe, quanta * IRDMA_QP_WQE_MIN_SIZE); irdma_qp_ring_push_db(qp, wqe_idx); } } /** * irdma_qp_get_next_send_wqe - pad with NOP if needed, return where next WR should go * @qp: hw qp ptr * @wqe_idx: return wqe index * @quanta: size of WR in quanta * @total_size: size of WR in bytes * @info: info on WR */ __le64 *irdma_qp_get_next_send_wqe(struct irdma_qp_uk *qp, __u32 *wqe_idx, __u16 quanta, __u32 total_size, struct irdma_post_sq_info *info) { __le64 *wqe; __le64 *wqe_0 = NULL; __u32 nop_wqe_idx; __u16 avail_quanta; __u16 i; avail_quanta = qp->uk_attrs->max_hw_sq_chunk - (IRDMA_RING_CURRENT_HEAD(qp->sq_ring) % qp->uk_attrs->max_hw_sq_chunk); if (quanta <= avail_quanta) { /* WR fits in current chunk */ if (quanta > IRDMA_SQ_RING_FREE_QUANTA(qp->sq_ring)) return NULL; } else { /* Need to pad with NOP */ if (quanta + avail_quanta > IRDMA_SQ_RING_FREE_QUANTA(qp->sq_ring)) return NULL; nop_wqe_idx = IRDMA_RING_CURRENT_HEAD(qp->sq_ring); for (i = 0; i < avail_quanta; i++) { irdma_nop_1(qp); IRDMA_RING_MOVE_HEAD_NOCHECK(qp->sq_ring); } if (qp->push_db && info->push_wqe) irdma_qp_push_wqe(qp, qp->sq_base[nop_wqe_idx].elem, avail_quanta, nop_wqe_idx, true); } *wqe_idx = IRDMA_RING_CURRENT_HEAD(qp->sq_ring); if (!*wqe_idx) qp->swqe_polarity = !qp->swqe_polarity; IRDMA_RING_MOVE_HEAD_BY_COUNT_NOCHECK(qp->sq_ring, quanta); wqe = qp->sq_base[*wqe_idx].elem; if (qp->uk_attrs->hw_rev == IRDMA_GEN_1 && quanta == 1 && (IRDMA_RING_CURRENT_HEAD(qp->sq_ring) & 1)) { wqe_0 = qp->sq_base[IRDMA_RING_CURRENT_HEAD(qp->sq_ring)].elem; wqe_0[3] = htole64(FIELD_PREP(IRDMAQPSQ_VALID, !qp->swqe_polarity)); } qp->sq_wrtrk_array[*wqe_idx].wrid = info->wr_id; qp->sq_wrtrk_array[*wqe_idx].wr_len = total_size; qp->sq_wrtrk_array[*wqe_idx].quanta = quanta; return wqe; } /** * irdma_qp_get_next_recv_wqe - get next qp's rcv wqe * @qp: hw qp ptr * @wqe_idx: return wqe index */ __le64 *irdma_qp_get_next_recv_wqe(struct irdma_qp_uk *qp, __u32 *wqe_idx) { __le64 *wqe; int ret_code; if (IRDMA_RING_FULL_ERR(qp->rq_ring)) return NULL; IRDMA_ATOMIC_RING_MOVE_HEAD(qp->rq_ring, *wqe_idx, ret_code); if (ret_code) return NULL; if (!*wqe_idx) qp->rwqe_polarity = !qp->rwqe_polarity; /* rq_wqe_size_multiplier is no of 32 byte quanta in one rq wqe */ wqe = qp->rq_base[*wqe_idx * qp->rq_wqe_size_multiplier].elem; return wqe; } /** * irdma_uk_rdma_write - rdma write operation * @qp: hw qp ptr * @info: post sq information * @post_sq: flag to post sq */ int irdma_uk_rdma_write(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool post_sq) { __u64 hdr; 
__le64 *wqe; struct irdma_rdma_write *op_info; __u32 i, wqe_idx; __u32 total_size = 0, byte_off; int ret_code; __u32 frag_cnt, addl_frag_cnt; bool read_fence = false; __u16 quanta; info->push_wqe = qp->push_db ? true : false; op_info = &info->op.rdma_write; if (op_info->num_lo_sges > qp->max_sq_frag_cnt) return EINVAL; for (i = 0; i < op_info->num_lo_sges; i++) total_size += op_info->lo_sg_list[i].length; read_fence |= info->read_fence; if (info->imm_data_valid) frag_cnt = op_info->num_lo_sges + 1; else frag_cnt = op_info->num_lo_sges; addl_frag_cnt = frag_cnt > 1 ? (frag_cnt - 1) : 0; ret_code = irdma_fragcnt_to_quanta_sq(frag_cnt, &quanta); if (ret_code) return ret_code; wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, quanta, total_size, info); if (!wqe) return ENOMEM; irdma_clr_wqes(qp, wqe_idx); set_64bit_val(wqe, 16, FIELD_PREP(IRDMAQPSQ_FRAG_TO, op_info->rem_addr.addr)); if (info->imm_data_valid) { set_64bit_val(wqe, 0, FIELD_PREP(IRDMAQPSQ_IMMDATA, info->imm_data)); i = 0; } else { qp->wqe_ops.iw_set_fragment(wqe, 0, op_info->lo_sg_list, qp->swqe_polarity); i = 1; } for (byte_off = 32; i < op_info->num_lo_sges; i++) { qp->wqe_ops.iw_set_fragment(wqe, byte_off, &op_info->lo_sg_list[i], qp->swqe_polarity); byte_off += 16; } /* if not an odd number set valid bit in next fragment */ if (qp->uk_attrs->hw_rev >= IRDMA_GEN_2 && !(frag_cnt & 0x01) && frag_cnt) { qp->wqe_ops.iw_set_fragment(wqe, byte_off, NULL, qp->swqe_polarity); if (qp->uk_attrs->hw_rev == IRDMA_GEN_2) ++addl_frag_cnt; } hdr = FIELD_PREP(IRDMAQPSQ_REMSTAG, op_info->rem_addr.lkey) | FIELD_PREP(IRDMAQPSQ_OPCODE, info->op_type) | FIELD_PREP(IRDMAQPSQ_IMMDATAFLAG, info->imm_data_valid) | FIELD_PREP(IRDMAQPSQ_REPORTRTT, info->report_rtt) | FIELD_PREP(IRDMAQPSQ_ADDFRAGCNT, addl_frag_cnt) | FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe) | FIELD_PREP(IRDMAQPSQ_READFENCE, read_fence) | FIELD_PREP(IRDMAQPSQ_LOCALFENCE, info->local_fence) | FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) | FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity); udma_to_device_barrier(); /* make sure WQE is populated before valid bit is set */ set_64bit_val(wqe, 24, hdr); if (info->push_wqe) { irdma_qp_push_wqe(qp, wqe, quanta, wqe_idx, post_sq); } else { if (post_sq) irdma_uk_qp_post_wr(qp); } return 0; } /** * irdma_uk_rdma_read - rdma read command * @qp: hw qp ptr * @info: post sq information * @inv_stag: flag for inv_stag * @post_sq: flag to post sq */ int irdma_uk_rdma_read(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool inv_stag, bool post_sq) { struct irdma_rdma_read *op_info; int ret_code; __u32 i, byte_off, total_size = 0; bool local_fence = false; __u32 addl_frag_cnt; __le64 *wqe; __u32 wqe_idx; __u16 quanta; __u64 hdr; info->push_wqe = qp->push_db ? true : false; op_info = &info->op.rdma_read; if (qp->max_sq_frag_cnt < op_info->num_lo_sges) return EINVAL; for (i = 0; i < op_info->num_lo_sges; i++) total_size += op_info->lo_sg_list[i].length; ret_code = irdma_fragcnt_to_quanta_sq(op_info->num_lo_sges, &quanta); if (ret_code) return ret_code; wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, quanta, total_size, info); if (!wqe) return ENOMEM; irdma_clr_wqes(qp, wqe_idx); addl_frag_cnt = op_info->num_lo_sges > 1 ? 
(op_info->num_lo_sges - 1) : 0; local_fence |= info->local_fence; qp->wqe_ops.iw_set_fragment(wqe, 0, op_info->lo_sg_list, qp->swqe_polarity); for (i = 1, byte_off = 32; i < op_info->num_lo_sges; ++i) { qp->wqe_ops.iw_set_fragment(wqe, byte_off, &op_info->lo_sg_list[i], qp->swqe_polarity); byte_off += 16; } /* if not an odd number set valid bit in next fragment */ if (qp->uk_attrs->hw_rev >= IRDMA_GEN_2 && !(op_info->num_lo_sges & 0x01) && op_info->num_lo_sges) { qp->wqe_ops.iw_set_fragment(wqe, byte_off, NULL, qp->swqe_polarity); if (qp->uk_attrs->hw_rev == IRDMA_GEN_2) ++addl_frag_cnt; } set_64bit_val(wqe, 16, FIELD_PREP(IRDMAQPSQ_FRAG_TO, op_info->rem_addr.addr)); hdr = FIELD_PREP(IRDMAQPSQ_REMSTAG, op_info->rem_addr.lkey) | FIELD_PREP(IRDMAQPSQ_REPORTRTT, (info->report_rtt ? 1 : 0)) | FIELD_PREP(IRDMAQPSQ_ADDFRAGCNT, addl_frag_cnt) | FIELD_PREP(IRDMAQPSQ_OPCODE, (inv_stag ? IRDMAQP_OP_RDMA_READ_LOC_INV : IRDMAQP_OP_RDMA_READ)) | FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe) | FIELD_PREP(IRDMAQPSQ_READFENCE, info->read_fence) | FIELD_PREP(IRDMAQPSQ_LOCALFENCE, local_fence) | FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) | FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity); udma_to_device_barrier(); /* make sure WQE is populated before valid bit is set */ set_64bit_val(wqe, 24, hdr); if (info->push_wqe) { irdma_qp_push_wqe(qp, wqe, quanta, wqe_idx, post_sq); } else { if (post_sq) irdma_uk_qp_post_wr(qp); } return 0; } /** * irdma_uk_send - rdma send command * @qp: hw qp ptr * @info: post sq information * @post_sq: flag to post sq */ int irdma_uk_send(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool post_sq) { __le64 *wqe; struct irdma_post_send *op_info; __u64 hdr; __u32 i, wqe_idx, total_size = 0, byte_off; int ret_code; __u32 frag_cnt, addl_frag_cnt; bool read_fence = false; __u16 quanta; info->push_wqe = qp->push_db ? true : false; op_info = &info->op.send; if (qp->max_sq_frag_cnt < op_info->num_sges) return EINVAL; for (i = 0; i < op_info->num_sges; i++) total_size += op_info->sg_list[i].length; if (info->imm_data_valid) frag_cnt = op_info->num_sges + 1; else frag_cnt = op_info->num_sges; ret_code = irdma_fragcnt_to_quanta_sq(frag_cnt, &quanta); if (ret_code) return ret_code; wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, quanta, total_size, info); if (!wqe) return ENOMEM; irdma_clr_wqes(qp, wqe_idx); read_fence |= info->read_fence; addl_frag_cnt = frag_cnt > 1 ? (frag_cnt - 1) : 0; if (info->imm_data_valid) { set_64bit_val(wqe, 0, FIELD_PREP(IRDMAQPSQ_IMMDATA, info->imm_data)); i = 0; } else { qp->wqe_ops.iw_set_fragment(wqe, 0, frag_cnt ? op_info->sg_list : NULL, qp->swqe_polarity); i = 1; } for (byte_off = 32; i < op_info->num_sges; i++) { qp->wqe_ops.iw_set_fragment(wqe, byte_off, &op_info->sg_list[i], qp->swqe_polarity); byte_off += 16; } /* if not an odd number set valid bit in next fragment */ if (qp->uk_attrs->hw_rev >= IRDMA_GEN_2 && !(frag_cnt & 0x01) && frag_cnt) { qp->wqe_ops.iw_set_fragment(wqe, byte_off, NULL, qp->swqe_polarity); if (qp->uk_attrs->hw_rev == IRDMA_GEN_2) ++addl_frag_cnt; } set_64bit_val(wqe, 16, FIELD_PREP(IRDMAQPSQ_DESTQKEY, op_info->qkey) | FIELD_PREP(IRDMAQPSQ_DESTQPN, op_info->dest_qp)); hdr = FIELD_PREP(IRDMAQPSQ_REMSTAG, info->stag_to_inv) | FIELD_PREP(IRDMAQPSQ_AHID, op_info->ah_id) | FIELD_PREP(IRDMAQPSQ_IMMDATAFLAG, (info->imm_data_valid ? 1 : 0)) | FIELD_PREP(IRDMAQPSQ_REPORTRTT, (info->report_rtt ? 
1 : 0)) | FIELD_PREP(IRDMAQPSQ_OPCODE, info->op_type) | FIELD_PREP(IRDMAQPSQ_ADDFRAGCNT, addl_frag_cnt) | FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe) | FIELD_PREP(IRDMAQPSQ_READFENCE, read_fence) | FIELD_PREP(IRDMAQPSQ_LOCALFENCE, info->local_fence) | FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) | FIELD_PREP(IRDMAQPSQ_UDPHEADER, info->udp_hdr) | FIELD_PREP(IRDMAQPSQ_L4LEN, info->l4len) | FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity); udma_to_device_barrier(); /* make sure WQE is populated before valid bit is set */ set_64bit_val(wqe, 24, hdr); if (info->push_wqe) { irdma_qp_push_wqe(qp, wqe, quanta, wqe_idx, post_sq); } else { if (post_sq) irdma_uk_qp_post_wr(qp); } return 0; } /** * irdma_set_mw_bind_wqe_gen_1 - set mw bind wqe * @wqe: wqe for setting fragment * @op_info: info for setting bind wqe values */ static void irdma_set_mw_bind_wqe_gen_1(__le64 *wqe, struct irdma_bind_window *op_info) { set_64bit_val(wqe, 0, (uintptr_t)op_info->va); set_64bit_val(wqe, 8, FIELD_PREP(IRDMAQPSQ_PARENTMRSTAG, op_info->mw_stag) | FIELD_PREP(IRDMAQPSQ_MWSTAG, op_info->mr_stag)); set_64bit_val(wqe, 16, op_info->bind_len); } /** * irdma_copy_inline_data_gen_1 - Copy inline data to wqe * @wqe: pointer to wqe * @sge_list: table of pointers to inline data * @num_sges: Total inline data length * @polarity: compatibility parameter */ static void irdma_copy_inline_data_gen_1(__u8 *wqe, struct ibv_sge *sge_list, __u32 num_sges, __u8 polarity) { __u32 quanta_bytes_remaining = 16; __u32 i; for (i = 0; i < num_sges; i++) { __u8 *cur_sge = (__u8 *)(uintptr_t)sge_list[i].addr; __u32 sge_len = sge_list[i].length; while (sge_len) { __u32 bytes_copied; bytes_copied = min(sge_len, quanta_bytes_remaining); memcpy(wqe, cur_sge, bytes_copied); wqe += bytes_copied; cur_sge += bytes_copied; quanta_bytes_remaining -= bytes_copied; sge_len -= bytes_copied; if (!quanta_bytes_remaining) { /* Remaining inline bytes reside after the hdr */ wqe += 16; quanta_bytes_remaining = 32; } } } } /** * irdma_inline_data_size_to_quanta_gen_1 - based on inline data, quanta * @data_size: data size for inline * * Gets the quanta based on inline and immediate data. */ static inline __u16 irdma_inline_data_size_to_quanta_gen_1(__u32 data_size) { return data_size <= 16 ? 
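	/*
	 * On GEN_1 the first quantum holds up to 16 bytes of inline data
	 * (the remainder carries the WQE header), and each additional
	 * quantum holds 32 bytes; anything above 16 bytes, up to the
	 * GEN_1 inline limit of 48 bytes on i40iw-class devices, takes
	 * two quanta.
	 */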
IRDMA_QP_WQE_MIN_QUANTA : 2; } /** * irdma_set_mw_bind_wqe - set mw bind in wqe * @wqe: wqe for setting mw bind * @op_info: info for setting wqe values */ static void irdma_set_mw_bind_wqe(__le64 *wqe, struct irdma_bind_window *op_info) { set_64bit_val(wqe, 0, (uintptr_t)op_info->va); set_64bit_val(wqe, 8, FIELD_PREP(IRDMAQPSQ_PARENTMRSTAG, op_info->mr_stag) | FIELD_PREP(IRDMAQPSQ_MWSTAG, op_info->mw_stag)); set_64bit_val(wqe, 16, op_info->bind_len); } /** * irdma_copy_inline_data - Copy inline data to wqe * @wqe: pointer to wqe * @sge_list: table of pointers to inline data * @num_sges: number of SGE's * @polarity: polarity of wqe valid bit */ static void irdma_copy_inline_data(__u8 *wqe, struct ibv_sge *sge_list, __u32 num_sges, __u8 polarity) { __u8 inline_valid = polarity << IRDMA_INLINE_VALID_S; __u32 quanta_bytes_remaining = 8; __u32 i; bool first_quanta = true; wqe += 8; for (i = 0; i < num_sges; i++) { __u8 *cur_sge = (__u8 *)(uintptr_t)sge_list[i].addr; __u32 sge_len = sge_list[i].length; while (sge_len) { __u32 bytes_copied; bytes_copied = min(sge_len, quanta_bytes_remaining); memcpy(wqe, cur_sge, bytes_copied); wqe += bytes_copied; cur_sge += bytes_copied; quanta_bytes_remaining -= bytes_copied; sge_len -= bytes_copied; if (!quanta_bytes_remaining) { quanta_bytes_remaining = 31; /* Remaining inline bytes reside after the hdr */ if (first_quanta) { first_quanta = false; wqe += 16; } else { *wqe = inline_valid; wqe++; } } } } if (!first_quanta && quanta_bytes_remaining < 31) *(wqe + quanta_bytes_remaining) = inline_valid; } /** * irdma_inline_data_size_to_quanta - based on inline data, quanta * @data_size: data size for inline * * Gets the quanta based on inline and immediate data. */ static __u16 irdma_inline_data_size_to_quanta(__u32 data_size) { if (data_size <= 8) return IRDMA_QP_WQE_MIN_QUANTA; else if (data_size <= 39) return 2; else if (data_size <= 70) return 3; else if (data_size <= 101) return 4; else if (data_size <= 132) return 5; else if (data_size <= 163) return 6; else if (data_size <= 194) return 7; else return 8; } /** * irdma_uk_inline_rdma_write - inline rdma write operation * @qp: hw qp ptr * @info: post sq information * @post_sq: flag to post sq */ int irdma_uk_inline_rdma_write(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool post_sq) { __le64 *wqe; struct irdma_rdma_write *op_info; __u64 hdr = 0; __u32 wqe_idx; bool read_fence = false; __u32 i, total_size = 0; __u16 quanta; info->push_wqe = qp->push_db ? true : false; op_info = &info->op.rdma_write; if (unlikely(qp->max_sq_frag_cnt < op_info->num_lo_sges)) return EINVAL; for (i = 0; i < op_info->num_lo_sges; i++) total_size += op_info->lo_sg_list[i].length; if (unlikely(total_size > qp->max_inline_data)) return EINVAL; quanta = qp->wqe_ops.iw_inline_data_size_to_quanta(total_size); wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, quanta, total_size, info); if (!wqe) return ENOMEM; irdma_clr_wqes(qp, wqe_idx); read_fence |= info->read_fence; set_64bit_val(wqe, 16, FIELD_PREP(IRDMAQPSQ_FRAG_TO, op_info->rem_addr.addr)); hdr = FIELD_PREP(IRDMAQPSQ_REMSTAG, op_info->rem_addr.lkey) | FIELD_PREP(IRDMAQPSQ_OPCODE, info->op_type) | FIELD_PREP(IRDMAQPSQ_INLINEDATALEN, total_size) | FIELD_PREP(IRDMAQPSQ_REPORTRTT, info->report_rtt ? 1 : 0) | FIELD_PREP(IRDMAQPSQ_INLINEDATAFLAG, 1) | FIELD_PREP(IRDMAQPSQ_IMMDATAFLAG, info->imm_data_valid ? 1 : 0) | FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe ? 
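	/*
	 * The thresholds in irdma_inline_data_size_to_quanta() above
	 * follow from the layout used by irdma_copy_inline_data(): the
	 * first quantum carries 8 data bytes (the rest is header), and
	 * every additional quantum carries 31 data bytes plus one valid
	 * byte, giving a limit of 8 + 31 * (n - 1) bytes for n quanta:
	 * 8, 39, 70, 101, ...
	 */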
1 : 0) | FIELD_PREP(IRDMAQPSQ_READFENCE, read_fence) | FIELD_PREP(IRDMAQPSQ_LOCALFENCE, info->local_fence) | FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) | FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity); if (info->imm_data_valid) set_64bit_val(wqe, 0, FIELD_PREP(IRDMAQPSQ_IMMDATA, info->imm_data)); qp->wqe_ops.iw_copy_inline_data((__u8 *)wqe, op_info->lo_sg_list, op_info->num_lo_sges, qp->swqe_polarity); udma_to_device_barrier(); /* make sure WQE is populated before valid bit is set */ set_64bit_val(wqe, 24, hdr); if (info->push_wqe) { irdma_qp_push_wqe(qp, wqe, quanta, wqe_idx, post_sq); } else { if (post_sq) irdma_uk_qp_post_wr(qp); } return 0; } /** * irdma_uk_inline_send - inline send operation * @qp: hw qp ptr * @info: post sq information * @post_sq: flag to post sq */ int irdma_uk_inline_send(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool post_sq) { __le64 *wqe; struct irdma_post_send *op_info; __u64 hdr; __u32 wqe_idx; bool read_fence = false; __u32 i, total_size = 0; __u16 quanta; info->push_wqe = qp->push_db ? true : false; op_info = &info->op.send; if (unlikely(qp->max_sq_frag_cnt < op_info->num_sges)) return EINVAL; for (i = 0; i < op_info->num_sges; i++) total_size += op_info->sg_list[i].length; if (unlikely(total_size > qp->max_inline_data)) return EINVAL; quanta = qp->wqe_ops.iw_inline_data_size_to_quanta(total_size); wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, quanta, total_size, info); if (!wqe) return ENOMEM; irdma_clr_wqes(qp, wqe_idx); set_64bit_val(wqe, 16, FIELD_PREP(IRDMAQPSQ_DESTQKEY, op_info->qkey) | FIELD_PREP(IRDMAQPSQ_DESTQPN, op_info->dest_qp)); read_fence |= info->read_fence; hdr = FIELD_PREP(IRDMAQPSQ_REMSTAG, info->stag_to_inv) | FIELD_PREP(IRDMAQPSQ_AHID, op_info->ah_id) | FIELD_PREP(IRDMAQPSQ_OPCODE, info->op_type) | FIELD_PREP(IRDMAQPSQ_INLINEDATALEN, total_size) | FIELD_PREP(IRDMAQPSQ_IMMDATAFLAG, (info->imm_data_valid ? 1 : 0)) | FIELD_PREP(IRDMAQPSQ_REPORTRTT, (info->report_rtt ? 1 : 0)) | FIELD_PREP(IRDMAQPSQ_INLINEDATAFLAG, 1) | FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe) | FIELD_PREP(IRDMAQPSQ_READFENCE, read_fence) | FIELD_PREP(IRDMAQPSQ_LOCALFENCE, info->local_fence) | FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) | FIELD_PREP(IRDMAQPSQ_UDPHEADER, info->udp_hdr) | FIELD_PREP(IRDMAQPSQ_L4LEN, info->l4len) | FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity); if (info->imm_data_valid) set_64bit_val(wqe, 0, FIELD_PREP(IRDMAQPSQ_IMMDATA, info->imm_data)); qp->wqe_ops.iw_copy_inline_data((__u8 *)wqe, op_info->sg_list, op_info->num_sges, qp->swqe_polarity); udma_to_device_barrier(); /* make sure WQE is populated before valid bit is set */ set_64bit_val(wqe, 24, hdr); if (info->push_wqe) { irdma_qp_push_wqe(qp, wqe, quanta, wqe_idx, post_sq); } else { if (post_sq) irdma_uk_qp_post_wr(qp); } return 0; } /** * irdma_uk_stag_local_invalidate - stag invalidate operation * @qp: hw qp ptr * @info: post sq information * @post_sq: flag to post sq */ int irdma_uk_stag_local_invalidate(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool post_sq) { __le64 *wqe; struct irdma_inv_local_stag *op_info; __u64 hdr; __u32 wqe_idx; bool local_fence = false; struct ibv_sge sge = {}; info->push_wqe = qp->push_db ? 
true : false; op_info = &info->op.inv_local_stag; local_fence = info->local_fence; wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, IRDMA_QP_WQE_MIN_QUANTA, 0, info); if (!wqe) return ENOMEM; irdma_clr_wqes(qp, wqe_idx); sge.lkey = op_info->target_stag; qp->wqe_ops.iw_set_fragment(wqe, 0, &sge, 0); set_64bit_val(wqe, 16, 0); hdr = FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMA_OP_TYPE_INV_STAG) | FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe) | FIELD_PREP(IRDMAQPSQ_READFENCE, info->read_fence) | FIELD_PREP(IRDMAQPSQ_LOCALFENCE, local_fence) | FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) | FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity); udma_to_device_barrier(); /* make sure WQE is populated before valid bit is set */ set_64bit_val(wqe, 24, hdr); if (info->push_wqe) { irdma_qp_push_wqe(qp, wqe, IRDMA_QP_WQE_MIN_QUANTA, wqe_idx, post_sq); } else { if (post_sq) irdma_uk_qp_post_wr(qp); } return 0; } /** * irdma_uk_mw_bind - bind Memory Window * @qp: hw qp ptr * @info: post sq information * @post_sq: flag to post sq */ int irdma_uk_mw_bind(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool post_sq) { __le64 *wqe; struct irdma_bind_window *op_info; __u64 hdr; __u32 wqe_idx; bool local_fence = false; info->push_wqe = qp->push_db ? true : false; op_info = &info->op.bind_window; local_fence |= info->local_fence; wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, IRDMA_QP_WQE_MIN_QUANTA, 0, info); if (!wqe) return ENOMEM; irdma_clr_wqes(qp, wqe_idx); qp->wqe_ops.iw_set_mw_bind_wqe(wqe, op_info); hdr = FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMA_OP_TYPE_BIND_MW) | FIELD_PREP(IRDMAQPSQ_STAGRIGHTS, ((op_info->ena_reads << 2) | (op_info->ena_writes << 3))) | FIELD_PREP(IRDMAQPSQ_VABASEDTO, (op_info->addressing_type == IRDMA_ADDR_TYPE_VA_BASED ? 1 : 0)) | FIELD_PREP(IRDMAQPSQ_MEMWINDOWTYPE, (op_info->mem_window_type_1 ? 1 : 0)) | FIELD_PREP(IRDMAQPSQ_PUSHWQE, info->push_wqe) | FIELD_PREP(IRDMAQPSQ_READFENCE, info->read_fence) | FIELD_PREP(IRDMAQPSQ_LOCALFENCE, local_fence) | FIELD_PREP(IRDMAQPSQ_SIGCOMPL, info->signaled) | FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity); udma_to_device_barrier(); /* make sure WQE is populated before valid bit is set */ set_64bit_val(wqe, 24, hdr); if (info->push_wqe) { irdma_qp_push_wqe(qp, wqe, IRDMA_QP_WQE_MIN_QUANTA, wqe_idx, post_sq); } else { if (post_sq) irdma_uk_qp_post_wr(qp); } return 0; } /** * irdma_uk_post_receive - post receive wqe * @qp: hw qp ptr * @info: post rq information */ int irdma_uk_post_receive(struct irdma_qp_uk *qp, struct irdma_post_rq_info *info) { __u32 wqe_idx, i, byte_off; __u32 addl_frag_cnt; __le64 *wqe; __u64 hdr; if (qp->max_rq_frag_cnt < info->num_sges) return EINVAL; wqe = irdma_qp_get_next_recv_wqe(qp, &wqe_idx); if (!wqe) return ENOMEM; qp->rq_wrid_array[wqe_idx] = info->wr_id; addl_frag_cnt = info->num_sges > 1 ? 
(info->num_sges - 1) : 0; qp->wqe_ops.iw_set_fragment(wqe, 0, info->sg_list, qp->rwqe_polarity); for (i = 1, byte_off = 32; i < info->num_sges; i++) { qp->wqe_ops.iw_set_fragment(wqe, byte_off, &info->sg_list[i], qp->rwqe_polarity); byte_off += 16; } /* if not an odd number set valid bit in next fragment */ if (qp->uk_attrs->hw_rev >= IRDMA_GEN_2 && !(info->num_sges & 0x01) && info->num_sges) { qp->wqe_ops.iw_set_fragment(wqe, byte_off, NULL, qp->rwqe_polarity); if (qp->uk_attrs->hw_rev == IRDMA_GEN_2) ++addl_frag_cnt; } set_64bit_val(wqe, 16, 0); hdr = FIELD_PREP(IRDMAQPSQ_ADDFRAGCNT, addl_frag_cnt) | FIELD_PREP(IRDMAQPSQ_VALID, qp->rwqe_polarity); udma_to_device_barrier(); /* make sure WQE is populated before valid bit is set */ set_64bit_val(wqe, 24, hdr); return 0; } /** * irdma_uk_cq_resize - reset the cq buffer info * @cq: cq to resize * @cq_base: new cq buffer addr * @cq_size: number of cqes */ void irdma_uk_cq_resize(struct irdma_cq_uk *cq, void *cq_base, int cq_size) { cq->cq_base = cq_base; cq->cq_size = cq_size; IRDMA_RING_INIT(cq->cq_ring, cq->cq_size); cq->polarity = 1; } /** * irdma_uk_cq_set_resized_cnt - record the count of the resized buffers * @cq: cq to resize * @cq_cnt: the count of the resized cq buffers */ void irdma_uk_cq_set_resized_cnt(struct irdma_cq_uk *cq, __u16 cq_cnt) { __u64 temp_val; __u16 sw_cq_sel; __u8 arm_next_se; __u8 arm_next; __u8 arm_seq_num; get_64bit_val(cq->shadow_area, 32, &temp_val); sw_cq_sel = (__u16)FIELD_GET(IRDMA_CQ_DBSA_SW_CQ_SELECT, temp_val); sw_cq_sel += cq_cnt; arm_seq_num = (__u8)FIELD_GET(IRDMA_CQ_DBSA_ARM_SEQ_NUM, temp_val); arm_next_se = (__u8)FIELD_GET(IRDMA_CQ_DBSA_ARM_NEXT_SE, temp_val); arm_next = (__u8)FIELD_GET(IRDMA_CQ_DBSA_ARM_NEXT, temp_val); temp_val = FIELD_PREP(IRDMA_CQ_DBSA_ARM_SEQ_NUM, arm_seq_num) | FIELD_PREP(IRDMA_CQ_DBSA_SW_CQ_SELECT, sw_cq_sel) | FIELD_PREP(IRDMA_CQ_DBSA_ARM_NEXT_SE, arm_next_se) | FIELD_PREP(IRDMA_CQ_DBSA_ARM_NEXT, arm_next); set_64bit_val(cq->shadow_area, 32, temp_val); } /** * irdma_uk_cq_request_notification - cq notification request (door bell) * @cq: hw cq * @cq_notify: notification type */ void irdma_uk_cq_request_notification(struct irdma_cq_uk *cq, enum irdma_cmpl_notify cq_notify) { __u64 temp_val; __u16 sw_cq_sel; __u8 arm_next_se = 0; __u8 arm_next = 0; __u8 arm_seq_num; get_64bit_val(cq->shadow_area, 32, &temp_val); arm_seq_num = (__u8)FIELD_GET(IRDMA_CQ_DBSA_ARM_SEQ_NUM, temp_val); arm_seq_num++; sw_cq_sel = (__u16)FIELD_GET(IRDMA_CQ_DBSA_SW_CQ_SELECT, temp_val); arm_next_se = (__u8)FIELD_GET(IRDMA_CQ_DBSA_ARM_NEXT_SE, temp_val); arm_next_se |= 1; if (cq_notify == IRDMA_CQ_COMPL_EVENT) arm_next = 1; temp_val = FIELD_PREP(IRDMA_CQ_DBSA_ARM_SEQ_NUM, arm_seq_num) | FIELD_PREP(IRDMA_CQ_DBSA_SW_CQ_SELECT, sw_cq_sel) | FIELD_PREP(IRDMA_CQ_DBSA_ARM_NEXT_SE, arm_next_se) | FIELD_PREP(IRDMA_CQ_DBSA_ARM_NEXT, arm_next); set_64bit_val(cq->shadow_area, 32, temp_val); udma_to_device_barrier(); /* make sure WQE is populated before valid bit is set */ db_wr32(cq->cq_id, cq->cqe_alloc_db); } /** * irdma_uk_cq_poll_cmpl - get cq completion info * @cq: hw cq * @info: cq poll information returned */ int irdma_uk_cq_poll_cmpl(struct irdma_cq_uk *cq, struct irdma_cq_poll_info *info) { __u64 comp_ctx, qword0, qword2, qword3; __le64 *cqe; struct irdma_qp_uk *qp; struct irdma_ring *pring = NULL; __u32 wqe_idx; int ret_code; bool move_cq_head = true; __u8 polarity; bool ext_valid; __le64 *ext_cqe; if (cq->avoid_mem_cflct) cqe = IRDMA_GET_CURRENT_EXTENDED_CQ_ELEM(cq); else cqe = 
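	/*
	 * With avoid_mem_cflct the CQ is laid out as 64-byte extended
	 * CQEs (IRDMA_EXTENDED_CQE_SIZE qwords), so the current element
	 * must be fetched with the extended stride; otherwise plain
	 * 32-byte CQEs are indexed directly.
	 */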
IRDMA_GET_CURRENT_CQ_ELEM(cq); get_64bit_val(cqe, 24, &qword3); polarity = (__u8)FIELD_GET(IRDMA_CQ_VALID, qword3); if (polarity != cq->polarity) return ENOENT; /* Ensure CQE contents are read after valid bit is checked */ udma_from_device_barrier(); ext_valid = (bool)FIELD_GET(IRDMA_CQ_EXTCQE, qword3); if (ext_valid) { __u64 qword6, qword7; __u32 peek_head; if (cq->avoid_mem_cflct) { ext_cqe = (__le64 *)((__u8 *)cqe + 32); get_64bit_val(ext_cqe, 24, &qword7); polarity = (__u8)FIELD_GET(IRDMA_CQ_VALID, qword7); } else { peek_head = (cq->cq_ring.head + 1) % cq->cq_ring.size; ext_cqe = cq->cq_base[peek_head].buf; get_64bit_val(ext_cqe, 24, &qword7); polarity = (__u8)FIELD_GET(IRDMA_CQ_VALID, qword7); if (!peek_head) polarity ^= 1; } if (polarity != cq->polarity) return ENOENT; /* Ensure ext CQE contents are read after ext valid bit is checked */ udma_from_device_barrier(); info->imm_valid = (bool)FIELD_GET(IRDMA_CQ_IMMVALID, qword7); if (info->imm_valid) { __u64 qword4; get_64bit_val(ext_cqe, 0, &qword4); info->imm_data = (__u32)FIELD_GET(IRDMA_CQ_IMMDATALOW32, qword4); } info->ud_smac_valid = (bool)FIELD_GET(IRDMA_CQ_UDSMACVALID, qword7); info->ud_vlan_valid = (bool)FIELD_GET(IRDMA_CQ_UDVLANVALID, qword7); if (info->ud_smac_valid || info->ud_vlan_valid) { get_64bit_val(ext_cqe, 16, &qword6); if (info->ud_vlan_valid) info->ud_vlan = (__u16)FIELD_GET(IRDMA_CQ_UDVLAN, qword6); if (info->ud_smac_valid) { info->ud_smac[5] = qword6 & 0xFF; info->ud_smac[4] = (qword6 >> 8) & 0xFF; info->ud_smac[3] = (qword6 >> 16) & 0xFF; info->ud_smac[2] = (qword6 >> 24) & 0xFF; info->ud_smac[1] = (qword6 >> 32) & 0xFF; info->ud_smac[0] = (qword6 >> 40) & 0xFF; } } } else { info->imm_valid = false; info->ud_smac_valid = false; info->ud_vlan_valid = false; } info->q_type = (__u8)FIELD_GET(IRDMA_CQ_SQ, qword3); info->error = (bool)FIELD_GET(IRDMA_CQ_ERROR, qword3); info->push_dropped = (bool)FIELD_GET(IRDMACQ_PSHDROP, qword3); info->ipv4 = (bool)FIELD_GET(IRDMACQ_IPV4, qword3); if (info->error) { info->major_err = FIELD_GET(IRDMA_CQ_MAJERR, qword3); info->minor_err = FIELD_GET(IRDMA_CQ_MINERR, qword3); if (info->major_err == IRDMA_FLUSH_MAJOR_ERR) { info->comp_status = IRDMA_COMPL_STATUS_FLUSHED; /* Set the min error to standard flush error code for remaining cqes */ if (info->minor_err != FLUSH_GENERAL_ERR) { qword3 &= ~IRDMA_CQ_MINERR; qword3 |= FIELD_PREP(IRDMA_CQ_MINERR, FLUSH_GENERAL_ERR); set_64bit_val(cqe, 24, qword3); } } else { info->comp_status = IRDMA_COMPL_STATUS_UNKNOWN; } } else { info->comp_status = IRDMA_COMPL_STATUS_SUCCESS; } get_64bit_val(cqe, 0, &qword0); get_64bit_val(cqe, 16, &qword2); info->tcp_seq_num_rtt = (__u32)FIELD_GET(IRDMACQ_TCPSEQNUMRTT, qword0); info->qp_id = (__u32)FIELD_GET(IRDMACQ_QPID, qword2); info->ud_src_qpn = (__u32)FIELD_GET(IRDMACQ_UDSRCQPN, qword2); get_64bit_val(cqe, 8, &comp_ctx); info->solicited_event = (bool)FIELD_GET(IRDMACQ_SOEVENT, qword3); qp = (struct irdma_qp_uk *)(uintptr_t)comp_ctx; if (!qp || qp->destroy_pending) { ret_code = EFAULT; goto exit; } wqe_idx = (__u32)FIELD_GET(IRDMA_CQ_WQEIDX, qword3); info->qp_handle = (irdma_qp_handle)(uintptr_t)qp; info->op_type = (__u8)FIELD_GET(IRDMACQ_OP, qword3); if (info->q_type == IRDMA_CQE_QTYPE_RQ) { __u32 array_idx; array_idx = wqe_idx / qp->rq_wqe_size_multiplier; if (info->comp_status == IRDMA_COMPL_STATUS_FLUSHED || info->comp_status == IRDMA_COMPL_STATUS_UNKNOWN) { if (!IRDMA_RING_MORE_WORK(qp->rq_ring)) { ret_code = ENOENT; goto exit; } info->wr_id = qp->rq_wrid_array[qp->rq_ring.tail]; array_idx = 
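	/*
	 * For flushed completions the wqe_idx reported in the CQE is
	 * not reliable, so the RQ is drained in ring order from the
	 * tail instead.
	 */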
qp->rq_ring.tail; } else { info->wr_id = qp->rq_wrid_array[array_idx]; } info->bytes_xfered = (__u32)FIELD_GET(IRDMACQ_PAYLDLEN, qword0); if (qword3 & IRDMACQ_STAG) { info->stag_invalid_set = true; info->inv_stag = (__u32)FIELD_GET(IRDMACQ_INVSTAG, qword2); } else { info->stag_invalid_set = false; } IRDMA_RING_SET_TAIL(qp->rq_ring, array_idx + 1); if (info->comp_status == IRDMA_COMPL_STATUS_FLUSHED) { qp->rq_flush_seen = true; if (!IRDMA_RING_MORE_WORK(qp->rq_ring)) qp->rq_flush_complete = true; else move_cq_head = false; } pring = &qp->rq_ring; } else { /* q_type is IRDMA_CQE_QTYPE_SQ */ if (qp->first_sq_wq) { if (wqe_idx + 1 >= qp->conn_wqes) qp->first_sq_wq = false; if (wqe_idx < qp->conn_wqes && qp->sq_ring.head == qp->sq_ring.tail) { IRDMA_RING_MOVE_HEAD_NOCHECK(cq->cq_ring); IRDMA_RING_MOVE_TAIL(cq->cq_ring); set_64bit_val(cq->shadow_area, 0, IRDMA_RING_CURRENT_HEAD(cq->cq_ring)); memset(info, 0, sizeof(struct irdma_cq_poll_info)); return irdma_uk_cq_poll_cmpl(cq, info); } } /*cease posting push mode on push drop*/ if (info->push_dropped) { qp->push_mode = false; qp->push_dropped = true; } if (info->comp_status != IRDMA_COMPL_STATUS_FLUSHED) { info->wr_id = qp->sq_wrtrk_array[wqe_idx].wrid; if (!info->comp_status) info->bytes_xfered = qp->sq_wrtrk_array[wqe_idx].wr_len; info->op_type = (__u8)FIELD_GET(IRDMACQ_OP, qword3); IRDMA_RING_SET_TAIL(qp->sq_ring, wqe_idx + qp->sq_wrtrk_array[wqe_idx].quanta); } else { if (!IRDMA_RING_MORE_WORK(qp->sq_ring)) { ret_code = ENOENT; goto exit; } do { __le64 *sw_wqe; __u64 wqe_qword; __u32 tail; tail = qp->sq_ring.tail; sw_wqe = qp->sq_base[tail].elem; get_64bit_val(sw_wqe, 24, &wqe_qword); info->op_type = (__u8)FIELD_GET(IRDMAQPSQ_OPCODE, wqe_qword); IRDMA_RING_SET_TAIL(qp->sq_ring, tail + qp->sq_wrtrk_array[tail].quanta); if (info->op_type != IRDMAQP_OP_NOP) { info->wr_id = qp->sq_wrtrk_array[tail].wrid; info->bytes_xfered = qp->sq_wrtrk_array[tail].wr_len; break; } } while (1); qp->sq_flush_seen = true; if (!IRDMA_RING_MORE_WORK(qp->sq_ring)) qp->sq_flush_complete = true; } pring = &qp->sq_ring; } ret_code = 0; exit: if (!ret_code && info->comp_status == IRDMA_COMPL_STATUS_FLUSHED) if (pring && IRDMA_RING_MORE_WORK(*pring)) move_cq_head = false; if (move_cq_head) { IRDMA_RING_MOVE_HEAD_NOCHECK(cq->cq_ring); if (!IRDMA_RING_CURRENT_HEAD(cq->cq_ring)) cq->polarity ^= 1; if (ext_valid && !cq->avoid_mem_cflct) { IRDMA_RING_MOVE_HEAD_NOCHECK(cq->cq_ring); if (!IRDMA_RING_CURRENT_HEAD(cq->cq_ring)) cq->polarity ^= 1; } IRDMA_RING_MOVE_TAIL(cq->cq_ring); if (!cq->avoid_mem_cflct && ext_valid) IRDMA_RING_MOVE_TAIL(cq->cq_ring); set_64bit_val(cq->shadow_area, 0, IRDMA_RING_CURRENT_HEAD(cq->cq_ring)); } else { qword3 &= ~IRDMA_CQ_WQEIDX; qword3 |= FIELD_PREP(IRDMA_CQ_WQEIDX, pring->tail); set_64bit_val(cqe, 24, qword3); } return ret_code; } /** * irdma_qp_round_up - return round up qp wq depth * @wqdepth: wq depth in quanta to round up */ static int irdma_qp_round_up(__u32 wqdepth) { int scount = 1; for (wqdepth--; scount <= 16; scount *= 2) wqdepth |= wqdepth >> scount; return ++wqdepth; } /** * irdma_get_wqe_shift - get shift count for maximum wqe size * @uk_attrs: qp HW attributes * @sge: Maximum Scatter Gather Elements wqe * @inline_data: Maximum inline data size * @shift: Returns the shift needed based on sge * * Shift can be used to left shift the wqe size based on number of SGEs and inlind data size. * For 1 SGE or inline data <= 8, shift = 0 (wqe size of 32 * bytes). For 2 or 3 SGEs or inline data <= 39, shift = 1 (wqe * size of 64 bytes). 
 * For 4 to 7 SGEs or inline data <= 101, shift = 2 (wqe size of
 * 128 bytes). Otherwise shift = 3 (wqe size of 256 bytes).
 */
void irdma_get_wqe_shift(struct irdma_uk_attrs *uk_attrs, __u32 sge,
			 __u32 inline_data, __u8 *shift)
{
	*shift = 0;
	if (uk_attrs->hw_rev >= IRDMA_GEN_2) {
		if (sge > 1 || inline_data > 8) {
			if (sge < 4 && inline_data <= 39)
				*shift = 1;
			else if (sge < 8 && inline_data <= 101)
				*shift = 2;
			else
				*shift = 3;
		}
	} else if (sge > 1 || inline_data > 16) {
		*shift = (sge < 4 && inline_data <= 48) ? 1 : 2;
	}
}

/*
 * irdma_get_sqdepth - get SQ depth (quanta)
 * @uk_attrs: qp HW attributes
 * @sq_size: SQ size
 * @shift: shift which determines size of WQE
 * @sqdepth: depth of SQ
 */
int irdma_get_sqdepth(struct irdma_uk_attrs *uk_attrs, __u32 sq_size,
		      __u8 shift, __u32 *sqdepth)
{
	__u32 min_size = (__u32)uk_attrs->min_hw_wq_size << shift;

	*sqdepth = irdma_qp_round_up((sq_size << shift) + IRDMA_SQ_RSVD);
	if (*sqdepth < min_size)
		*sqdepth = min_size;
	else if (*sqdepth > uk_attrs->max_hw_wq_quanta)
		return EINVAL;

	return 0;
}

/*
 * irdma_get_rqdepth - get RQ depth (quanta)
 * @uk_attrs: qp HW attributes
 * @rq_size: RQ size
 * @shift: shift which determines size of WQE
 * @rqdepth: depth of RQ
 */
int irdma_get_rqdepth(struct irdma_uk_attrs *uk_attrs, __u32 rq_size,
		      __u8 shift, __u32 *rqdepth)
{
	__u32 min_size = (__u32)uk_attrs->min_hw_wq_size << shift;

	*rqdepth = irdma_qp_round_up((rq_size << shift) + IRDMA_RQ_RSVD);
	if (*rqdepth < min_size)
		*rqdepth = min_size;
	else if (*rqdepth > uk_attrs->max_hw_rq_quanta)
		return EINVAL;

	return 0;
}

static const struct irdma_wqe_uk_ops iw_wqe_uk_ops = {
	.iw_copy_inline_data = irdma_copy_inline_data,
	.iw_inline_data_size_to_quanta = irdma_inline_data_size_to_quanta,
	.iw_set_fragment = irdma_set_fragment,
	.iw_set_mw_bind_wqe = irdma_set_mw_bind_wqe,
};

static const struct irdma_wqe_uk_ops iw_wqe_uk_ops_gen_1 = {
	.iw_copy_inline_data = irdma_copy_inline_data_gen_1,
	.iw_inline_data_size_to_quanta = irdma_inline_data_size_to_quanta_gen_1,
	.iw_set_fragment = irdma_set_fragment_gen_1,
	.iw_set_mw_bind_wqe = irdma_set_mw_bind_wqe_gen_1,
};

/**
 * irdma_setup_connection_wqes - setup WQEs necessary to complete
 * connection.
 * @qp: hw qp (user and kernel)
 * @info: qp initialization info
 */
static void irdma_setup_connection_wqes(struct irdma_qp_uk *qp,
					struct irdma_qp_uk_init_info *info)
{
	__u16 move_cnt = 1;

	if (!info->legacy_mode &&
	    (qp->uk_attrs->feature_flags & IRDMA_FEATURE_RTS_AE))
		move_cnt = 3;

	qp->conn_wqes = move_cnt;
	IRDMA_RING_MOVE_HEAD_BY_COUNT_NOCHECK(qp->sq_ring, move_cnt);
	IRDMA_RING_MOVE_TAIL_BY_COUNT(qp->sq_ring, move_cnt);
	IRDMA_RING_MOVE_HEAD_BY_COUNT_NOCHECK(qp->initial_ring, move_cnt);
}

/**
 * irdma_uk_calc_depth_shift_sq - calculate depth and shift for SQ size.
 * @ukinfo: qp initialization info
 * @sq_depth: Returns depth of SQ
 * @sq_shift: Returns shift of SQ
 */
int irdma_uk_calc_depth_shift_sq(struct irdma_qp_uk_init_info *ukinfo,
				 __u32 *sq_depth, __u8 *sq_shift)
{
	bool imm_support = ukinfo->uk_attrs->hw_rev >= IRDMA_GEN_2 ? true : false;
	int status;

	irdma_get_wqe_shift(ukinfo->uk_attrs,
			    imm_support ? ukinfo->max_sq_frag_cnt + 1 :
					  ukinfo->max_sq_frag_cnt,
			    ukinfo->max_inline_data, sq_shift);
	status = irdma_get_sqdepth(ukinfo->uk_attrs, ukinfo->sq_size,
				   *sq_shift, sq_depth);

	return status;
}
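/*
 * Illustrative sketch (hypothetical helper, not part of the irdma API):
 * the sizing chain above is shift -> quanta -> power-of-two depth. For
 * example, 4 SGEs with no inline data on GEN_2 hardware yields
 * shift = 2 (128-byte WQEs), so a requested SQ size of 20 WQEs needs
 * 20 << 2 = 80 quanta, plus the small IRDMA_SQ_RSVD reserve, rounded up
 * to 128 by irdma_qp_round_up(). The helper below mirrors that round-up
 * step.
 */
static inline __u32 example_next_pow2(__u32 wqdepth)
{
	/* same bit-smearing trick as irdma_qp_round_up() above */
	wqdepth--;
	wqdepth |= wqdepth >> 1;
	wqdepth |= wqdepth >> 2;
	wqdepth |= wqdepth >> 4;
	wqdepth |= wqdepth >> 8;
	wqdepth |= wqdepth >> 16;

	return wqdepth + 1;
}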
/**
 * irdma_uk_calc_depth_shift_rq - calculate depth and shift for RQ size.
 * @ukinfo: qp initialization info
 * @rq_depth: Returns depth of RQ
 * @rq_shift: Returns shift of RQ
 */
int irdma_uk_calc_depth_shift_rq(struct irdma_qp_uk_init_info *ukinfo,
				 __u32 *rq_depth, __u8 *rq_shift)
{
	int status;

	irdma_get_wqe_shift(ukinfo->uk_attrs, ukinfo->max_rq_frag_cnt, 0,
			    rq_shift);
	if (ukinfo->uk_attrs->hw_rev == IRDMA_GEN_1) {
		if (ukinfo->abi_ver > 4)
			*rq_shift = IRDMA_MAX_RQ_WQE_SHIFT_GEN1;
	}

	status = irdma_get_rqdepth(ukinfo->uk_attrs, ukinfo->rq_size,
				   *rq_shift, rq_depth);

	return status;
}

/**
 * irdma_uk_qp_init - initialize shared qp
 * @qp: hw qp (user and kernel)
 * @info: qp initialization info
 *
 * Initializes the variables used in both user and kernel mode.
 * The size of the WQE depends on the maximum number of fragments
 * allowed. The size of the WQE times the number of WQEs should equal
 * the amount of memory allocated for the SQ and RQ.
 */
int irdma_uk_qp_init(struct irdma_qp_uk *qp, struct irdma_qp_uk_init_info *info)
{
	int ret_code = 0;
	__u32 sq_ring_size;

	qp->uk_attrs = info->uk_attrs;
	if (info->max_sq_frag_cnt > qp->uk_attrs->max_hw_wq_frags ||
	    info->max_rq_frag_cnt > qp->uk_attrs->max_hw_wq_frags)
		return EINVAL;

	qp->qp_caps = info->qp_caps;
	qp->sq_base = info->sq;
	qp->rq_base = info->rq;
	qp->qp_type = info->type ? info->type : IRDMA_QP_TYPE_IWARP;
	qp->shadow_area = info->shadow_area;
	qp->sq_wrtrk_array = info->sq_wrtrk_array;
	qp->rq_wrid_array = info->rq_wrid_array;
	qp->wqe_alloc_db = info->wqe_alloc_db;
	qp->qp_id = info->qp_id;
	qp->sq_size = info->sq_size;
	qp->push_mode = false;
	qp->max_sq_frag_cnt = info->max_sq_frag_cnt;
	sq_ring_size = qp->sq_size << info->sq_shift;
	IRDMA_RING_INIT(qp->sq_ring, sq_ring_size);
	IRDMA_RING_INIT(qp->initial_ring, sq_ring_size);
	if (info->first_sq_wq) {
		irdma_setup_connection_wqes(qp, info);
		qp->swqe_polarity = 1;
		qp->first_sq_wq = true;
	} else {
		qp->swqe_polarity = 0;
	}
	qp->swqe_polarity_deferred = 1;
	qp->rwqe_polarity = 0;
	qp->rq_size = info->rq_size;
	qp->max_rq_frag_cnt = info->max_rq_frag_cnt;
	qp->max_inline_data = info->max_inline_data;
	qp->rq_wqe_size = info->rq_shift;
	IRDMA_RING_INIT(qp->rq_ring, qp->rq_size);
	qp->rq_wqe_size_multiplier = 1 << info->rq_shift;
	if (qp->uk_attrs->hw_rev == IRDMA_GEN_1)
		qp->wqe_ops = iw_wqe_uk_ops_gen_1;
	else
		qp->wqe_ops = iw_wqe_uk_ops;

	return ret_code;
}

/**
 * irdma_uk_cq_init - initialize shared cq (user and kernel)
 * @cq: hw cq
 * @info: hw cq initialization info
 */
int irdma_uk_cq_init(struct irdma_cq_uk *cq,
		     struct irdma_cq_uk_init_info *info)
{
	cq->cq_base = info->cq_base;
	cq->cq_id = info->cq_id;
	cq->cq_size = info->cq_size;
	cq->cqe_alloc_db = info->cqe_alloc_db;
	cq->cq_ack_db = info->cq_ack_db;
	cq->shadow_area = info->shadow_area;
	cq->avoid_mem_cflct = info->avoid_mem_cflct;
	IRDMA_RING_INIT(cq->cq_ring, cq->cq_size);
	cq->polarity = 1;

	return 0;
}

/**
 * irdma_uk_clean_cq - clean cq entries
 * @q: completion context
 * @cq: cq to clean
 */
void irdma_uk_clean_cq(void *q, struct irdma_cq_uk *cq)
{
	__le64 *cqe;
	__u64 qword3, comp_ctx;
	__u32 cq_head;
	__u8 polarity, temp;

	cq_head = cq->cq_ring.head;
	temp = cq->polarity;
	do {
		if (cq->avoid_mem_cflct)
			cqe = ((struct irdma_extended_cqe *)(cq->cq_base))[cq_head].buf;
		else
			cqe = cq->cq_base[cq_head].buf;
		get_64bit_val(cqe, 24, &qword3);
		polarity = (__u8)FIELD_GET(IRDMA_CQ_VALID, qword3);

		if (polarity != temp)
			break;

		get_64bit_val(cqe, 8, &comp_ctx);
		if ((void *)(uintptr_t)comp_ctx == q)
			set_64bit_val(cqe, 8, 0);

		cq_head = (cq_head + 1) % cq->cq_ring.size;
		if (!cq_head)
			temp ^= 1;
	} while (true);
}

/**
 * irdma_nop - post a nop
 * @qp: hw qp ptr
 * @wr_id: work request id
 *
@signaled: signaled for completion * @post_sq: ring doorbell */ int irdma_nop(struct irdma_qp_uk *qp, __u64 wr_id, bool signaled, bool post_sq) { __le64 *wqe; __u64 hdr; __u32 wqe_idx; struct irdma_post_sq_info info = {}; info.push_wqe = false; info.wr_id = wr_id; wqe = irdma_qp_get_next_send_wqe(qp, &wqe_idx, IRDMA_QP_WQE_MIN_QUANTA, 0, &info); if (!wqe) return ENOMEM; irdma_clr_wqes(qp, wqe_idx); set_64bit_val(wqe, 0, 0); set_64bit_val(wqe, 8, 0); set_64bit_val(wqe, 16, 0); hdr = FIELD_PREP(IRDMAQPSQ_OPCODE, IRDMAQP_OP_NOP) | FIELD_PREP(IRDMAQPSQ_SIGCOMPL, signaled) | FIELD_PREP(IRDMAQPSQ_VALID, qp->swqe_polarity); udma_to_device_barrier(); /* make sure WQE is populated before valid bit is set */ set_64bit_val(wqe, 24, hdr); if (post_sq) irdma_uk_qp_post_wr(qp); return 0; } /** * irdma_fragcnt_to_quanta_sq - calculate quanta based on fragment count for SQ * @frag_cnt: number of fragments * @quanta: quanta for frag_cnt */ int irdma_fragcnt_to_quanta_sq(__u32 frag_cnt, __u16 *quanta) { switch (frag_cnt) { case 0: case 1: *quanta = IRDMA_QP_WQE_MIN_QUANTA; break; case 2: case 3: *quanta = 2; break; case 4: case 5: *quanta = 3; break; case 6: case 7: *quanta = 4; break; case 8: case 9: *quanta = 5; break; case 10: case 11: *quanta = 6; break; case 12: case 13: *quanta = 7; break; case 14: case 15: /* when immediate data is present */ *quanta = 8; break; default: return EINVAL; } return 0; } /** * irdma_fragcnt_to_wqesize_rq - calculate wqe size based on fragment count for RQ * @frag_cnt: number of fragments * @wqe_size: size in bytes given frag_cnt */ int irdma_fragcnt_to_wqesize_rq(__u32 frag_cnt, __u16 *wqe_size) { switch (frag_cnt) { case 0: case 1: *wqe_size = 32; break; case 2: case 3: *wqe_size = 64; break; case 4: case 5: case 6: case 7: *wqe_size = 128; break; case 8: case 9: case 10: case 11: case 12: case 13: case 14: *wqe_size = 256; break; default: return EINVAL; } return 0; } rdma-core-56.1/providers/irdma/umain.c000066400000000000000000000200651477342711600177230ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB /* Copyright (C) 2019 - 2023 Intel Corporation */ #include #include #include #include #include #include #include #include #include #include #include "ice_devids.h" #include "i40e_devids.h" #include "umain.h" #include "abi.h" #define INTEL_HCA(v, d) VERBS_PCI_MATCH(v, d, NULL) static const struct verbs_match_ent hca_table[] = { VERBS_DRIVER_ID(RDMA_DRIVER_IRDMA), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E823L_BACKPLANE), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E823L_SFP), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E823L_10G_BASE_T), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E823L_1GBE), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E823L_QSFP), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E810C_BACKPLANE), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E810C_QSFP), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E810C_SFP), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E810_XXV_BACKPLANE), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E810_XXV_QSFP), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E810_XXV_SFP), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E823C_BACKPLANE), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E823C_QSFP), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E823C_SFP), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E823C_10G_BASE_T), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E823C_SGMII), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_C822N_BACKPLANE), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_C822N_QSFP), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_C822N_SFP), 
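	/*
	 * The E810/E82x device IDs in this table are GEN_2 parts driven
	 * natively by irdma; the X722 (i40e) IDs further down are GEN_1
	 * devices reached through the legacy i40iw ABI (see
	 * i40iw_set_hw_attrs() below).
	 */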
INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E822C_10G_BASE_T), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E822C_SGMII), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E822L_BACKPLANE), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E822L_SFP), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E822L_10G_BASE_T), INTEL_HCA(PCI_VENDOR_ID_INTEL, ICE_DEV_ID_E822L_SGMII), INTEL_HCA(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_X722_A0), INTEL_HCA(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_X722_A0_VF), INTEL_HCA(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_KX_X722), INTEL_HCA(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_QSFP_X722), INTEL_HCA(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_SFP_X722), INTEL_HCA(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_1G_BASE_T_X722), INTEL_HCA(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_10G_BASE_T_X722), INTEL_HCA(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_SFP_I_X722), INTEL_HCA(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_X722_VF), INTEL_HCA(I40E_INTEL_VENDOR_ID, I40E_DEV_ID_X722_VF_HV), {} }; /** * irdma_ufree_context - free context that was allocated * @ibctx: context allocated ptr */ static void irdma_ufree_context(struct ibv_context *ibctx) { struct irdma_uvcontext *iwvctx; iwvctx = container_of(ibctx, struct irdma_uvcontext, ibv_ctx.context); irdma_ufree_pd(&iwvctx->iwupd->ibv_pd); irdma_munmap(iwvctx->db); verbs_uninit_context(&iwvctx->ibv_ctx); free(iwvctx); } static const struct verbs_context_ops irdma_uctx_ops = { .alloc_mw = irdma_ualloc_mw, .alloc_pd = irdma_ualloc_pd, .attach_mcast = irdma_uattach_mcast, .bind_mw = irdma_ubind_mw, .cq_event = irdma_cq_event, .create_ah = irdma_ucreate_ah, .create_cq = irdma_ucreate_cq, .create_cq_ex = irdma_ucreate_cq_ex, .create_qp = irdma_ucreate_qp, .dealloc_mw = irdma_udealloc_mw, .dealloc_pd = irdma_ufree_pd, .dereg_mr = irdma_udereg_mr, .destroy_ah = irdma_udestroy_ah, .destroy_cq = irdma_udestroy_cq, .destroy_qp = irdma_udestroy_qp, .detach_mcast = irdma_udetach_mcast, .modify_qp = irdma_umodify_qp, .poll_cq = irdma_upoll_cq, .post_recv = irdma_upost_recv, .post_send = irdma_upost_send, .query_device_ex = irdma_uquery_device_ex, .query_port = irdma_uquery_port, .query_qp = irdma_uquery_qp, .reg_dmabuf_mr = irdma_ureg_mr_dmabuf, .reg_mr = irdma_ureg_mr, .rereg_mr = irdma_urereg_mr, .req_notify_cq = irdma_uarm_cq, .resize_cq = irdma_uresize_cq, .free_context = irdma_ufree_context, }; /** * i40iw_set_hw_attrs - set the hw attributes * @attrs: pointer to hw attributes * * Set the device attibutes to allow user mode to work with * driver on older ABI version. */ static void i40iw_set_hw_attrs(struct irdma_uk_attrs *attrs) { attrs->hw_rev = IRDMA_GEN_1; attrs->max_hw_wq_frags = I40IW_MAX_WQ_FRAGMENT_COUNT; attrs->max_hw_read_sges = I40IW_MAX_SGE_RD; attrs->max_hw_inline = I40IW_MAX_INLINE_DATA_SIZE; attrs->max_hw_rq_quanta = I40IW_QP_SW_MAX_RQ_QUANTA; attrs->max_hw_wq_quanta = I40IW_QP_SW_MAX_WQ_QUANTA; attrs->max_hw_sq_chunk = I40IW_MAX_QUANTA_PER_WR; attrs->max_hw_cq_size = I40IW_MAX_CQ_SIZE; attrs->min_hw_cq_size = IRDMA_MIN_CQ_SIZE; attrs->min_hw_wq_size = I40IW_MIN_WQ_SIZE; } /** * irdma_ualloc_context - allocate context for user app * @ibdev: ib device created during irdma_driver_init * @cmd_fd: save fd for the device * @private_data: device private data * * Returns callback routine table and calls driver for allocating * context and getting back resource information to return as ibv_context. 
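 * If the kernel rejects the requested ABI version, the call is retried
 * with legacy ABI version 4 so that older i40iw kernels keep working.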
*/ static struct verbs_context *irdma_ualloc_context(struct ibv_device *ibdev, int cmd_fd, void *private_data) { struct ibv_pd *ibv_pd; struct irdma_uvcontext *iwvctx; struct irdma_get_context cmd = {}; struct irdma_get_context_resp resp = {}; __u64 mmap_key; __u8 user_ver = IRDMA_ABI_VER; iwvctx = verbs_init_and_alloc_context(ibdev, cmd_fd, iwvctx, ibv_ctx, RDMA_DRIVER_IRDMA); if (!iwvctx) return NULL; cmd.comp_mask |= IRDMA_ALLOC_UCTX_USE_RAW_ATTR; cmd.userspace_ver = user_ver; if (ibv_cmd_get_context(&iwvctx->ibv_ctx, (struct ibv_get_context *)&cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) { cmd.userspace_ver = 4; if (ibv_cmd_get_context(&iwvctx->ibv_ctx, (struct ibv_get_context *)&cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) goto err_free; user_ver = cmd.userspace_ver; } verbs_set_ops(&iwvctx->ibv_ctx, &irdma_uctx_ops); /* Legacy i40iw does not populate hw_rev. The irdma driver always sets it */ if (!resp.hw_rev) { i40iw_set_hw_attrs(&iwvctx->uk_attrs); iwvctx->abi_ver = resp.kernel_ver; iwvctx->legacy_mode = true; mmap_key = 0; } else { iwvctx->uk_attrs.feature_flags = resp.feature_flags; iwvctx->uk_attrs.hw_rev = resp.hw_rev; iwvctx->uk_attrs.max_hw_wq_frags = resp.max_hw_wq_frags; iwvctx->uk_attrs.max_hw_read_sges = resp.max_hw_read_sges; iwvctx->uk_attrs.max_hw_inline = resp.max_hw_inline; iwvctx->uk_attrs.max_hw_rq_quanta = resp.max_hw_rq_quanta; iwvctx->uk_attrs.max_hw_wq_quanta = resp.max_hw_wq_quanta; iwvctx->uk_attrs.max_hw_sq_chunk = resp.max_hw_sq_chunk; iwvctx->uk_attrs.max_hw_cq_size = resp.max_hw_cq_size; iwvctx->uk_attrs.min_hw_cq_size = resp.min_hw_cq_size; iwvctx->abi_ver = user_ver; if (resp.comp_mask & IRDMA_ALLOC_UCTX_USE_RAW_ATTR) iwvctx->use_raw_attrs = true; if (resp.comp_mask & IRDMA_ALLOC_UCTX_MIN_HW_WQ_SIZE) iwvctx->uk_attrs.min_hw_wq_size = resp.min_hw_wq_size; else iwvctx->uk_attrs.min_hw_wq_size = IRDMA_QP_SW_MIN_WQSIZE; mmap_key = resp.db_mmap_key; } iwvctx->db = irdma_mmap(cmd_fd, mmap_key); if (iwvctx->db == MAP_FAILED) goto err_free; ibv_pd = irdma_ualloc_pd(&iwvctx->ibv_ctx.context); if (!ibv_pd) { irdma_munmap(iwvctx->db); goto err_free; } ibv_pd->context = &iwvctx->ibv_ctx.context; iwvctx->iwupd = container_of(ibv_pd, struct irdma_upd, ibv_pd); return &iwvctx->ibv_ctx; err_free: free(iwvctx); return NULL; } static void irdma_uninit_device(struct verbs_device *verbs_device) { struct irdma_udevice *dev; dev = container_of(&verbs_device->device, struct irdma_udevice, ibv_dev.device); free(dev); } static struct verbs_device *irdma_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct irdma_udevice *dev; dev = calloc(1, sizeof(*dev)); if (!dev) return NULL; return &dev->ibv_dev; } static const struct verbs_device_ops irdma_udev_ops = { .alloc_context = irdma_ualloc_context, .alloc_device = irdma_device_alloc, .match_max_abi_version = IRDMA_MAX_ABI_VERSION, .match_min_abi_version = IRDMA_MIN_ABI_VERSION, .match_table = hca_table, .name = "irdma", .uninit_device = irdma_uninit_device, }; PROVIDER_DRIVER(irdma, irdma_udev_ops); rdma-core-56.1/providers/irdma/umain.h000066400000000000000000000116151477342711600177310ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */ /* Copyright (C) 2019 - 2023 Intel Corporation */ #ifndef IRDMA_UMAIN_H #define IRDMA_UMAIN_H #include #include #include #include #include #include "osdep.h" #include "irdma.h" #include "defs.h" #include "i40iw_hw.h" #include "user.h" #define IRDMA_BASE_PUSH_PAGE 1 #define IRDMA_U_MINCQ_SIZE 4 #define IRDMA_DB_SHADOW_AREA_SIZE 64 #define 
IRDMA_DB_CQ_OFFSET 64 enum irdma_supported_wc_flags { IRDMA_CQ_SUPPORTED_WC_FLAGS = IBV_WC_EX_WITH_BYTE_LEN | IBV_WC_EX_WITH_IMM | IBV_WC_EX_WITH_QP_NUM | IBV_WC_EX_WITH_SRC_QP | IBV_WC_EX_WITH_SLID | IBV_WC_EX_WITH_SL | IBV_WC_EX_WITH_DLID_PATH_BITS | IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK | IBV_WC_EX_WITH_COMPLETION_TIMESTAMP, }; struct irdma_udevice { struct verbs_device ibv_dev; }; struct irdma_uah { struct ibv_ah ibv_ah; uint32_t ah_id; struct ibv_global_route grh; }; struct irdma_upd { struct ibv_pd ibv_pd; void *arm_cq_page; void *arm_cq; uint32_t pd_id; }; struct irdma_uvcontext { struct verbs_context ibv_ctx; struct irdma_upd *iwupd; struct irdma_uk_attrs uk_attrs; void *db; int abi_ver; bool legacy_mode:1; bool use_raw_attrs:1; }; struct irdma_uqp; struct irdma_cq_buf { struct list_node list; struct irdma_cq_uk cq; struct verbs_mr vmr; }; struct irdma_ucq { struct verbs_cq verbs_cq; struct verbs_mr vmr; struct verbs_mr vmr_shadow_area; pthread_spinlock_t lock; size_t buf_size; bool is_armed; bool skip_arm; bool arm_sol; bool skip_sol; int comp_vector; uint32_t report_rtt; struct irdma_uqp *uqp; struct irdma_cq_uk cq; struct list_head resize_list; /* for extended CQ completion fields */ struct irdma_cq_poll_info cur_cqe; }; struct irdma_uqp { struct ibv_qp ibv_qp; struct irdma_ucq *send_cq; struct irdma_ucq *recv_cq; struct verbs_mr vmr; size_t buf_size; uint32_t irdma_drv_opt; pthread_spinlock_t lock; uint16_t sq_sig_all; uint16_t qperr; uint16_t rsvd; uint32_t pending_rcvs; uint32_t wq_size; struct ibv_recv_wr *pend_rx_wr; struct irdma_qp_uk qp; enum ibv_qp_type qp_type; }; struct irdma_umr { struct verbs_mr vmr; uint32_t acc_flags; }; /* irdma_uverbs.c */ int irdma_uquery_device_ex(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); int irdma_uquery_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr); struct ibv_pd *irdma_ualloc_pd(struct ibv_context *context); int irdma_ufree_pd(struct ibv_pd *pd); struct ibv_mr *irdma_ureg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access); struct ibv_mr *irdma_ureg_mr_dmabuf(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access); int irdma_udereg_mr(struct verbs_mr *vmr); int irdma_urereg_mr(struct verbs_mr *mr, int flags, struct ibv_pd *pd, void *addr, size_t length, int access); struct ibv_mw *irdma_ualloc_mw(struct ibv_pd *pd, enum ibv_mw_type type); int irdma_ubind_mw(struct ibv_qp *qp, struct ibv_mw *mw, struct ibv_mw_bind *mw_bind); int irdma_udealloc_mw(struct ibv_mw *mw); struct ibv_cq *irdma_ucreate_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); struct ibv_cq_ex *irdma_ucreate_cq_ex(struct ibv_context *context, struct ibv_cq_init_attr_ex *attr_ex); void irdma_ibvcq_ex_fill_priv_funcs(struct irdma_ucq *iwucq, struct ibv_cq_init_attr_ex *attr_ex); int irdma_uresize_cq(struct ibv_cq *cq, int cqe); int irdma_udestroy_cq(struct ibv_cq *cq); int irdma_upoll_cq(struct ibv_cq *cq, int entries, struct ibv_wc *entry); int irdma_uarm_cq(struct ibv_cq *cq, int solicited); void irdma_cq_event(struct ibv_cq *cq); struct ibv_qp *irdma_ucreate_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); int irdma_uquery_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); int irdma_umodify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask); int irdma_udestroy_qp(struct ibv_qp *qp); int 
irdma_upost_send(struct ibv_qp *ib_qp, struct ibv_send_wr *ib_wr, struct ibv_send_wr **bad_wr); int irdma_upost_recv(struct ibv_qp *ib_qp, struct ibv_recv_wr *ib_wr, struct ibv_recv_wr **bad_wr); struct ibv_ah *irdma_ucreate_ah(struct ibv_pd *ibpd, struct ibv_ah_attr *attr); int irdma_udestroy_ah(struct ibv_ah *ibah); int irdma_uattach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); int irdma_udetach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); void irdma_async_event(struct ibv_context *context, struct ibv_async_event *event); void irdma_set_hw_attrs(struct irdma_hw_attrs *attrs); void *irdma_mmap(int fd, off_t offset); void irdma_munmap(void *map); #endif /* IRDMA_UMAIN_H */ rdma-core-56.1/providers/irdma/user.h000066400000000000000000000267061477342711600176050ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB */ /* Copyright (c) 2015 - 2023 Intel Corporation */ #ifndef IRDMA_USER_H #define IRDMA_USER_H #include "osdep.h" #define irdma_handle void * #define irdma_adapter_handle irdma_handle #define irdma_qp_handle irdma_handle #define irdma_cq_handle irdma_handle #define irdma_pd_id irdma_handle #define irdma_stag_handle irdma_handle #define irdma_stag_index __u32 #define irdma_stag __u32 #define irdma_stag_key __u8 #define irdma_tagged_offset __u64 #define irdma_access_privileges __u32 #define irdma_physical_fragment __u64 #define irdma_address_list __u64 * #define IRDMA_MAX_MR_SIZE 0x200000000000ULL #define IRDMA_ACCESS_FLAGS_LOCALREAD 0x01 #define IRDMA_ACCESS_FLAGS_LOCALWRITE 0x02 #define IRDMA_ACCESS_FLAGS_REMOTEREAD_ONLY 0x04 #define IRDMA_ACCESS_FLAGS_REMOTEREAD 0x05 #define IRDMA_ACCESS_FLAGS_REMOTEWRITE_ONLY 0x08 #define IRDMA_ACCESS_FLAGS_REMOTEWRITE 0x0a #define IRDMA_ACCESS_FLAGS_BIND_WINDOW 0x10 #define IRDMA_ACCESS_FLAGS_ZERO_BASED 0x20 #define IRDMA_ACCESS_FLAGS_ALL 0x3f #define IRDMA_OP_TYPE_RDMA_WRITE 0x00 #define IRDMA_OP_TYPE_RDMA_READ 0x01 #define IRDMA_OP_TYPE_SEND 0x03 #define IRDMA_OP_TYPE_SEND_INV 0x04 #define IRDMA_OP_TYPE_SEND_SOL 0x05 #define IRDMA_OP_TYPE_SEND_SOL_INV 0x06 #define IRDMA_OP_TYPE_RDMA_WRITE_SOL 0x0d #define IRDMA_OP_TYPE_BIND_MW 0x08 #define IRDMA_OP_TYPE_FAST_REG_NSMR 0x09 #define IRDMA_OP_TYPE_INV_STAG 0x0a #define IRDMA_OP_TYPE_RDMA_READ_INV_STAG 0x0b #define IRDMA_OP_TYPE_NOP 0x0c #define IRDMA_OP_TYPE_REC 0x3e #define IRDMA_OP_TYPE_REC_IMM 0x3f #define IRDMA_FLUSH_MAJOR_ERR 1 enum irdma_device_caps_const { IRDMA_WQE_SIZE = 4, IRDMA_CQP_WQE_SIZE = 8, IRDMA_CQE_SIZE = 4, IRDMA_EXTENDED_CQE_SIZE = 8, IRDMA_AEQE_SIZE = 2, IRDMA_CEQE_SIZE = 1, IRDMA_CQP_CTX_SIZE = 8, IRDMA_SHADOW_AREA_SIZE = 8, IRDMA_GATHER_STATS_BUF_SIZE = 1024, IRDMA_MIN_IW_QP_ID = 0, IRDMA_QUERY_FPM_BUF_SIZE = 176, IRDMA_COMMIT_FPM_BUF_SIZE = 176, IRDMA_MAX_IW_QP_ID = 262143, IRDMA_MIN_CEQID = 0, IRDMA_MAX_CEQID = 1023, IRDMA_CEQ_MAX_COUNT = IRDMA_MAX_CEQID + 1, IRDMA_MIN_CQID = 0, IRDMA_MAX_CQID = 524287, IRDMA_MIN_AEQ_ENTRIES = 1, IRDMA_MAX_AEQ_ENTRIES = 524287, IRDMA_MIN_CEQ_ENTRIES = 1, IRDMA_MAX_CEQ_ENTRIES = 262143, IRDMA_MIN_CQ_SIZE = 1, IRDMA_MAX_CQ_SIZE = 1048575, IRDMA_DB_ID_ZERO = 0, IRDMA_MAX_WQ_FRAGMENT_COUNT = 13, IRDMA_MAX_SGE_RD = 13, IRDMA_MAX_OUTBOUND_MSG_SIZE = 2147483647, IRDMA_MAX_INBOUND_MSG_SIZE = 2147483647, IRDMA_MAX_PUSH_PAGE_COUNT = 1024, IRDMA_MAX_PE_ENA_VF_COUNT = 32, IRDMA_MAX_VF_FPM_ID = 47, IRDMA_MAX_SQ_PAYLOAD_SIZE = 2145386496, IRDMA_MAX_INLINE_DATA_SIZE = 101, IRDMA_MAX_WQ_ENTRIES = 32768, IRDMA_Q2_BUF_SIZE = 256, IRDMA_QP_CTX_SIZE = 256, IRDMA_MAX_PDS = 262144, }; enum 
irdma_addressing_type { IRDMA_ADDR_TYPE_ZERO_BASED = 0, IRDMA_ADDR_TYPE_VA_BASED = 1, }; enum irdma_flush_opcode { FLUSH_INVALID = 0, FLUSH_GENERAL_ERR, FLUSH_PROT_ERR, FLUSH_REM_ACCESS_ERR, FLUSH_LOC_QP_OP_ERR, FLUSH_REM_OP_ERR, FLUSH_LOC_LEN_ERR, FLUSH_FATAL_ERR, FLUSH_RETRY_EXC_ERR, FLUSH_MW_BIND_ERR, FLUSH_REM_INV_REQ_ERR, }; enum irdma_cmpl_status { IRDMA_COMPL_STATUS_SUCCESS = 0, IRDMA_COMPL_STATUS_FLUSHED, IRDMA_COMPL_STATUS_INVALID_WQE, IRDMA_COMPL_STATUS_QP_CATASTROPHIC, IRDMA_COMPL_STATUS_REMOTE_TERMINATION, IRDMA_COMPL_STATUS_INVALID_STAG, IRDMA_COMPL_STATUS_BASE_BOUND_VIOLATION, IRDMA_COMPL_STATUS_ACCESS_VIOLATION, IRDMA_COMPL_STATUS_INVALID_PD_ID, IRDMA_COMPL_STATUS_WRAP_ERROR, IRDMA_COMPL_STATUS_STAG_INVALID_PDID, IRDMA_COMPL_STATUS_RDMA_READ_ZERO_ORD, IRDMA_COMPL_STATUS_QP_NOT_PRIVLEDGED, IRDMA_COMPL_STATUS_STAG_NOT_INVALID, IRDMA_COMPL_STATUS_INVALID_PHYS_BUF_SIZE, IRDMA_COMPL_STATUS_INVALID_PHYS_BUF_ENTRY, IRDMA_COMPL_STATUS_INVALID_FBO, IRDMA_COMPL_STATUS_INVALID_LEN, IRDMA_COMPL_STATUS_INVALID_ACCESS, IRDMA_COMPL_STATUS_PHYS_BUF_LIST_TOO_LONG, IRDMA_COMPL_STATUS_INVALID_VIRT_ADDRESS, IRDMA_COMPL_STATUS_INVALID_REGION, IRDMA_COMPL_STATUS_INVALID_WINDOW, IRDMA_COMPL_STATUS_INVALID_TOTAL_LEN, IRDMA_COMPL_STATUS_UNKNOWN, }; enum irdma_cmpl_notify { IRDMA_CQ_COMPL_EVENT = 0, IRDMA_CQ_COMPL_SOLICITED = 1, }; enum irdma_qp_caps { IRDMA_WRITE_WITH_IMM = 1, IRDMA_SEND_WITH_IMM = 2, IRDMA_ROCE = 4, IRDMA_PUSH_MODE = 8, }; struct irdma_qp_uk; struct irdma_cq_uk; struct irdma_qp_uk_init_info; struct irdma_cq_uk_init_info; struct irdma_ring { __u32 head; __u32 tail; __u32 size; }; struct irdma_cqe { __le64 buf[IRDMA_CQE_SIZE]; }; struct irdma_extended_cqe { __le64 buf[IRDMA_EXTENDED_CQE_SIZE]; }; struct irdma_post_send { struct ibv_sge *sg_list; __u32 num_sges; __u32 qkey; __u32 dest_qp; __u32 ah_id; }; struct irdma_post_rq_info { __u64 wr_id; struct ibv_sge *sg_list; __u32 num_sges; }; struct irdma_rdma_write { struct ibv_sge *lo_sg_list; __u32 num_lo_sges; struct ibv_sge rem_addr; }; struct irdma_rdma_read { struct ibv_sge *lo_sg_list; __u32 num_lo_sges; struct ibv_sge rem_addr; }; struct irdma_bind_window { irdma_stag mr_stag; __u64 bind_len; void *va; enum irdma_addressing_type addressing_type; bool ena_reads:1; bool ena_writes:1; irdma_stag mw_stag; bool mem_window_type_1:1; }; struct irdma_inv_local_stag { irdma_stag target_stag; }; struct irdma_post_sq_info { __u64 wr_id; __u8 op_type; __u8 l4len; bool signaled:1; bool read_fence:1; bool local_fence:1; bool inline_data:1; bool imm_data_valid:1; bool push_wqe:1; bool report_rtt:1; bool udp_hdr:1; bool defer_flag:1; __u32 imm_data; __u32 stag_to_inv; union { struct irdma_post_send send; struct irdma_rdma_write rdma_write; struct irdma_rdma_read rdma_read; struct irdma_bind_window bind_window; struct irdma_inv_local_stag inv_local_stag; } op; }; struct irdma_cq_poll_info { __u64 wr_id; irdma_qp_handle qp_handle; __u32 bytes_xfered; __u32 tcp_seq_num_rtt; __u32 qp_id; __u32 ud_src_qpn; __u32 imm_data; irdma_stag inv_stag; /* or L_R_Key */ enum irdma_cmpl_status comp_status; __u16 major_err; __u16 minor_err; __u16 ud_vlan; __u8 ud_smac[6]; __u8 op_type; __u8 q_type; bool stag_invalid_set:1; /* or L_R_Key set */ bool push_dropped:1; bool error:1; bool solicited_event:1; bool ipv4:1; bool ud_vlan_valid:1; bool ud_smac_valid:1; bool imm_valid:1; }; int irdma_uk_inline_rdma_write(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool post_sq); int irdma_uk_inline_send(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool 
post_sq); int irdma_uk_mw_bind(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool post_sq); int irdma_uk_post_nop(struct irdma_qp_uk *qp, __u64 wr_id, bool signaled, bool post_sq); int irdma_uk_post_receive(struct irdma_qp_uk *qp, struct irdma_post_rq_info *info); void irdma_uk_qp_post_wr(struct irdma_qp_uk *qp); int irdma_uk_rdma_read(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool inv_stag, bool post_sq); int irdma_uk_rdma_write(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool post_sq); int irdma_uk_send(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool post_sq); int irdma_uk_stag_local_invalidate(struct irdma_qp_uk *qp, struct irdma_post_sq_info *info, bool post_sq); struct irdma_wqe_uk_ops { void (*iw_copy_inline_data)(__u8 *dest, struct ibv_sge *sge_list, __u32 num_sges, __u8 polarity); __u16 (*iw_inline_data_size_to_quanta)(__u32 data_size); void (*iw_set_fragment)(__le64 *wqe, __u32 offset, struct ibv_sge *sge, __u8 valid); void (*iw_set_mw_bind_wqe)(__le64 *wqe, struct irdma_bind_window *op_info); }; int irdma_uk_cq_poll_cmpl(struct irdma_cq_uk *cq, struct irdma_cq_poll_info *info); void irdma_uk_cq_request_notification(struct irdma_cq_uk *cq, enum irdma_cmpl_notify cq_notify); void irdma_uk_cq_resize(struct irdma_cq_uk *cq, void *cq_base, int size); void irdma_uk_cq_set_resized_cnt(struct irdma_cq_uk *qp, __u16 cnt); int irdma_uk_cq_init(struct irdma_cq_uk *cq, struct irdma_cq_uk_init_info *info); int irdma_uk_qp_init(struct irdma_qp_uk *qp, struct irdma_qp_uk_init_info *info); int irdma_uk_calc_depth_shift_sq(struct irdma_qp_uk_init_info *ukinfo, __u32 *sq_depth, __u8 *sq_shift); int irdma_uk_calc_depth_shift_rq(struct irdma_qp_uk_init_info *ukinfo, __u32 *rq_depth, __u8 *rq_shift); struct irdma_sq_uk_wr_trk_info { __u64 wrid; __u32 wr_len; __u16 quanta; __u8 reserved[2]; }; struct irdma_qp_quanta { __le64 elem[IRDMA_WQE_SIZE]; }; struct irdma_qp_uk { struct irdma_qp_quanta *sq_base; struct irdma_qp_quanta *rq_base; struct irdma_uk_attrs *uk_attrs; __u32 *wqe_alloc_db; struct irdma_sq_uk_wr_trk_info *sq_wrtrk_array; __u64 *rq_wrid_array; __le64 *shadow_area; __le32 *push_db; __le64 *push_wqe; struct irdma_ring sq_ring; struct irdma_ring rq_ring; struct irdma_ring initial_ring; __u32 qp_id; __u32 qp_caps; __u32 sq_size; __u32 rq_size; __u32 max_sq_frag_cnt; __u32 max_rq_frag_cnt; __u32 max_inline_data; struct irdma_wqe_uk_ops wqe_ops; __u16 conn_wqes; __u8 qp_type; __u8 swqe_polarity; __u8 swqe_polarity_deferred; __u8 rwqe_polarity; __u8 rq_wqe_size; __u8 rq_wqe_size_multiplier; bool deferred_flag:1; bool push_mode:1; /* whether the last post wqe was pushed */ bool push_dropped:1; bool first_sq_wq:1; bool sq_flush_complete:1; /* Indicates flush was seen and SQ was empty after the flush */ bool rq_flush_complete:1; /* Indicates flush was seen and RQ was empty after the flush */ bool destroy_pending:1; /* Indicates the QP is being destroyed */ void *back_qp; pthread_spinlock_t *lock; __u8 dbg_rq_flushed; __u8 sq_flush_seen; __u8 rq_flush_seen; }; struct irdma_cq_uk { struct irdma_cqe *cq_base; __u32 *cqe_alloc_db; __u32 *cq_ack_db; __le64 *shadow_area; __u32 cq_id; __u32 cq_size; struct irdma_ring cq_ring; __u8 polarity; bool avoid_mem_cflct:1; }; struct irdma_qp_uk_init_info { struct irdma_qp_quanta *sq; struct irdma_qp_quanta *rq; struct irdma_uk_attrs *uk_attrs; __u32 *wqe_alloc_db; __le64 *shadow_area; struct irdma_sq_uk_wr_trk_info *sq_wrtrk_array; __u64 *rq_wrid_array; __u32 qp_id; __u32 qp_caps; __u32 sq_size; __u32 rq_size; 
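	/*
	 * Note: sq_size/rq_size above are in WQE slots while sq_depth/rq_depth
	 * below are in quanta; the create path derives one from the other as
	 * size = depth >> shift (see irdma_ucreate_qp in uverbs.c below).
	 */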
__u32 max_sq_frag_cnt;
	__u32 max_rq_frag_cnt;
	__u32 max_inline_data;
	__u32 sq_depth;
	__u32 rq_depth;
	__u8 first_sq_wq;
	__u8 type;
	__u8 sq_shift;
	__u8 rq_shift;
	int abi_ver;
	bool legacy_mode;
};

struct irdma_cq_uk_init_info {
	__u32 *cqe_alloc_db;
	__u32 *cq_ack_db;
	struct irdma_cqe *cq_base;
	__le64 *shadow_area;
	__u32 cq_size;
	__u32 cq_id;
	bool avoid_mem_cflct;
};

__le64 *irdma_qp_get_next_send_wqe(struct irdma_qp_uk *qp, __u32 *wqe_idx,
				   __u16 quanta, __u32 total_size,
				   struct irdma_post_sq_info *info);
__le64 *irdma_qp_get_next_recv_wqe(struct irdma_qp_uk *qp, __u32 *wqe_idx);
void irdma_uk_clean_cq(void *q, struct irdma_cq_uk *cq);
int irdma_nop(struct irdma_qp_uk *qp, __u64 wr_id, bool signaled, bool post_sq);
int irdma_fragcnt_to_quanta_sq(__u32 frag_cnt, __u16 *quanta);
int irdma_fragcnt_to_wqesize_rq(__u32 frag_cnt, __u16 *wqe_size);
void irdma_get_wqe_shift(struct irdma_uk_attrs *uk_attrs, __u32 sge,
			 __u32 inline_data, __u8 *shift);
int irdma_get_sqdepth(struct irdma_uk_attrs *uk_attrs, __u32 sq_size,
		      __u8 shift, __u32 *wqdepth);
int irdma_get_rqdepth(struct irdma_uk_attrs *uk_attrs, __u32 rq_size,
		      __u8 shift, __u32 *wqdepth);
void irdma_qp_push_wqe(struct irdma_qp_uk *qp, __le64 *wqe, __u16 quanta,
		       __u32 wqe_idx, bool post_sq);
void irdma_clr_wqes(struct irdma_qp_uk *qp, __u32 qp_wqe_idx);
#endif /* IRDMA_USER_H */

rdma-core-56.1/providers/irdma/uverbs.c

// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB
/* Copyright (C) 2019 - 2023 Intel Corporation */

/*
 * The system header names inside the original #include directives were lost
 * when this file was flattened; the list below is a plausible reconstruction
 * based on what the code uses, not the verbatim original.
 */
#include <config.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <stdbool.h>
#include <fcntl.h>
#include <malloc.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <infiniband/driver.h>
#include <infiniband/opcode.h>
#include "umain.h"
#include "abi.h"

static inline void print_fw_ver(uint64_t fw_ver, char *str, size_t len)
{
	uint16_t major, minor;

	major = fw_ver >> 32 & 0xffff;
	minor = fw_ver & 0xffff;

	snprintf(str, len, "%d.%d", major, minor);
}

/**
 * irdma_uquery_device_ex - query device attributes including extended properties
 * @context: user context for the device
 * @input: extensible input struct for ibv_query_device_ex verb
 * @attr: extended device attribute struct
 * @attr_size: size of extended device attribute struct
 **/
int irdma_uquery_device_ex(struct ibv_context *context,
			   const struct ibv_query_device_ex_input *input,
			   struct ibv_device_attr_ex *attr, size_t attr_size)
{
	struct ib_uverbs_ex_query_device_resp resp = {};
	size_t resp_size = sizeof(resp);
	int ret;

	ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp,
				       &resp_size);
	if (ret)
		return ret;

	print_fw_ver(resp.base.fw_ver, attr->orig_attr.fw_ver,
		     sizeof(attr->orig_attr.fw_ver));

	return 0;
}
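/*
 * Illustrative sketch (not part of the original file): how an application
 * observes the result of the verb above. ibv_query_device_ex() lands in
 * irdma_uquery_device_ex(), which formats the 16-bit firmware major/minor
 * fields into attr.orig_attr.fw_ver via print_fw_ver(). The function name
 * here is hypothetical.
 */
#if 0
static void example_show_fw_ver(struct ibv_context *ctx)
{
	struct ibv_device_attr_ex attr;

	if (!ibv_query_device_ex(ctx, NULL, &attr))
		printf("firmware version: %s\n", attr.orig_attr.fw_ver);
}
#endif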
/**
 * irdma_uquery_port - get port attributes (msg size, lnk, mtu...)
 * @context: user context of the device
 * @port: port for the attributes
 * @attr: to return port attributes
 **/
int irdma_uquery_port(struct ibv_context *context, uint8_t port,
		      struct ibv_port_attr *attr)
{
	struct ibv_query_port cmd;

	return ibv_cmd_query_port(context, port, attr, &cmd, sizeof(cmd));
}

/**
 * irdma_ualloc_pd - allocates protection domain and returns pd ptr
 * @context: user context of the device
 **/
struct ibv_pd *irdma_ualloc_pd(struct ibv_context *context)
{
	struct ibv_alloc_pd cmd;
	struct irdma_ualloc_pd_resp resp = {};
	struct irdma_upd *iwupd;
	int err;

	iwupd = malloc(sizeof(*iwupd));
	if (!iwupd)
		return NULL;

	err = ibv_cmd_alloc_pd(context, &iwupd->ibv_pd, &cmd, sizeof(cmd),
			       &resp.ibv_resp, sizeof(resp));
	if (err)
		goto err_free;

	iwupd->pd_id = resp.pd_id;

	return &iwupd->ibv_pd;

err_free:
	free(iwupd);
	errno = err;
	return NULL;
}

/**
 * irdma_ufree_pd - free pd resources
 * @pd: pd to free resources
 */
int irdma_ufree_pd(struct ibv_pd *pd)
{
	struct irdma_upd *iwupd;
	int ret;

	iwupd = container_of(pd, struct irdma_upd, ibv_pd);
	ret = ibv_cmd_dealloc_pd(pd);
	if (ret)
		return ret;

	free(iwupd);

	return 0;
}

/**
 * irdma_ureg_mr - register user memory region
 * @pd: pd for the mr
 * @addr: user address of the memory region
 * @length: length of the memory
 * @hca_va: virtual address at which the device should map the region
 * @access: access allowed on this mr
 */
struct ibv_mr *irdma_ureg_mr(struct ibv_pd *pd, void *addr, size_t length,
			     uint64_t hca_va, int access)
{
	struct irdma_umr *umr;
	struct irdma_ureg_mr cmd;
	struct ib_uverbs_reg_mr_resp resp;
	int err;

	umr = malloc(sizeof(*umr));
	if (!umr)
		return NULL;

	cmd.reg_type = IRDMA_MEMREG_TYPE_MEM;
	err = ibv_cmd_reg_mr(pd, addr, length, hca_va, access, &umr->vmr,
			     &cmd.ibv_cmd, sizeof(cmd), &resp, sizeof(resp));
	if (err) {
		free(umr);
		errno = err;
		return NULL;
	}
	umr->acc_flags = access;

	return &umr->vmr.ibv_mr;
}

struct ibv_mr *irdma_ureg_mr_dmabuf(struct ibv_pd *pd, uint64_t offset,
				    size_t length, uint64_t iova, int fd,
				    int access)
{
	struct irdma_umr *umr;
	int err;

	umr = calloc(1, sizeof(*umr));
	if (!umr)
		return NULL;

	err = ibv_cmd_reg_dmabuf_mr(pd, offset, length, iova, fd, access,
				    &umr->vmr, NULL);
	if (err) {
		free(umr);
		errno = err;
		return NULL;
	}

	return &umr->vmr.ibv_mr;
}

/**
 * irdma_urereg_mr - re-register memory region
 * @vmr: mr that was allocated
 * @flags: bit mask to indicate which of the attrs of the MR are modified
 * @pd: pd of the mr
 * @addr: user address of the memory region
 * @length: length of the memory
 * @access: access allowed on this mr
 */
int irdma_urereg_mr(struct verbs_mr *vmr, int flags, struct ibv_pd *pd,
		    void *addr, size_t length, int access)
{
	struct irdma_urereg_mr cmd = {};
	struct ib_uverbs_rereg_mr_resp resp;

	cmd.reg_type = IRDMA_MEMREG_TYPE_MEM;
	return ibv_cmd_rereg_mr(vmr, flags, addr, length, (uintptr_t)addr,
				access, pd, &cmd.ibv_cmd, sizeof(cmd), &resp,
				sizeof(resp));
}

/**
 * irdma_udereg_mr - deregister memory region
 * @vmr: mr that was allocated
 */
int irdma_udereg_mr(struct verbs_mr *vmr)
{
	int ret;

	ret = ibv_cmd_dereg_mr(vmr);
	if (ret)
		return ret;

	free(vmr);

	return 0;
}

/**
 * irdma_ualloc_mw - allocate memory window
 * @pd: protection domain
 * @type: memory window type
 */
struct ibv_mw *irdma_ualloc_mw(struct ibv_pd *pd, enum ibv_mw_type type)
{
	struct ibv_mw *mw;
	struct ibv_alloc_mw cmd;
	struct ib_uverbs_alloc_mw_resp resp;

	if (type != IBV_MW_TYPE_1) {
		errno = ENOTSUP;
		return NULL;
	}

	mw = calloc(1, sizeof(*mw));
	if (!mw)
		return NULL;

	if (ibv_cmd_alloc_mw(pd, type, mw, &cmd, sizeof(cmd), &resp,
			     sizeof(resp))) {
		free(mw);
		return NULL;
	}

	return mw;
}

/**
 *
irdma_ubind_mw - bind a memory window * @qp: qp to post WR * @mw: memory window to bind * @mw_bind: bind info */ int irdma_ubind_mw(struct ibv_qp *qp, struct ibv_mw *mw, struct ibv_mw_bind *mw_bind) { struct ibv_mw_bind_info *bind_info = &mw_bind->bind_info; struct verbs_mr *vmr = verbs_get_mr(bind_info->mr); struct irdma_umr *umr = container_of(vmr, struct irdma_umr, vmr); struct ibv_send_wr wr = {}; struct ibv_send_wr *bad_wr; int err; if (vmr->mr_type != IBV_MR_TYPE_MR) return ENOTSUP; if (umr->acc_flags & IBV_ACCESS_ZERO_BASED) return EINVAL; wr.opcode = IBV_WR_BIND_MW; wr.bind_mw.bind_info = mw_bind->bind_info; wr.bind_mw.mw = mw; wr.bind_mw.rkey = ibv_inc_rkey(mw->rkey); wr.wr_id = mw_bind->wr_id; wr.send_flags = mw_bind->send_flags; err = irdma_upost_send(qp, &wr, &bad_wr); if (!err) mw->rkey = wr.bind_mw.rkey; return err; } /** * irdma_udealloc_mw - deallocate memory window * @mw: memory window to dealloc */ int irdma_udealloc_mw(struct ibv_mw *mw) { int ret; ret = ibv_cmd_dealloc_mw(mw); if (ret) return ret; free(mw); return 0; } static void *irdma_calloc_hw_buf_sz(size_t size, size_t alignment) { void *buf; buf = memalign(alignment, size); if (!buf) return NULL; if (ibv_dontfork_range(buf, size)) { free(buf); return NULL; } memset(buf, 0, size); return buf; } static void *irdma_calloc_hw_buf(size_t size) { return irdma_calloc_hw_buf_sz(size, IRDMA_HW_PAGE_SIZE); } static void irdma_free_hw_buf(void *buf, size_t size) { ibv_dofork_range(buf, size); free(buf); } /** * get_cq_size - returns actual cqe needed by HW * @ncqe: minimum cqes requested by application * @hw_rev: HW generation */ static inline int get_cq_size(int ncqe, __u8 hw_rev) { ncqe++; /* Completions with immediate require 1 extra entry */ if (hw_rev > IRDMA_GEN_1) ncqe *= 2; if (ncqe < IRDMA_U_MINCQ_SIZE) ncqe = IRDMA_U_MINCQ_SIZE; return ncqe; } static inline size_t get_cq_total_bytes(__u32 cq_size) { return roundup(cq_size * sizeof(struct irdma_cqe), IRDMA_HW_PAGE_SIZE); } /** * ucreate_cq - irdma util function to create a CQ * @context: ibv context * @attr_ex: CQ init attributes * @ext_cq: flag to create an extendable or normal CQ */ static struct ibv_cq_ex *ucreate_cq(struct ibv_context *context, struct ibv_cq_init_attr_ex *attr_ex, bool ext_cq) { struct irdma_cq_uk_init_info info = {}; struct irdma_ureg_mr reg_mr_cmd = {}; struct irdma_ucreate_cq_ex cmd = {}; struct irdma_ucreate_cq_ex_resp resp = {}; struct ib_uverbs_reg_mr_resp reg_mr_resp = {}; struct irdma_ureg_mr reg_mr_shadow_cmd = {}; struct ib_uverbs_reg_mr_resp reg_mr_shadow_resp = {}; struct irdma_uk_attrs *uk_attrs; struct irdma_uvcontext *iwvctx; struct irdma_ucq *iwucq; size_t total_size; __u32 cq_pages; int ret, ncqe; __u8 hw_rev; iwvctx = container_of(context, struct irdma_uvcontext, ibv_ctx.context); uk_attrs = &iwvctx->uk_attrs; hw_rev = uk_attrs->hw_rev; if (ext_cq && hw_rev == IRDMA_GEN_1) { errno = EOPNOTSUPP; return NULL; } if (attr_ex->cqe < IRDMA_MIN_CQ_SIZE || attr_ex->cqe > uk_attrs->max_hw_cq_size - 1) { errno = EINVAL; return NULL; } /* save the cqe requested by application */ ncqe = attr_ex->cqe; iwucq = calloc(1, sizeof(*iwucq)); if (!iwucq) return NULL; if (pthread_spin_init(&iwucq->lock, PTHREAD_PROCESS_PRIVATE)) { free(iwucq); return NULL; } info.cq_size = get_cq_size(attr_ex->cqe, hw_rev); iwucq->comp_vector = attr_ex->comp_vector; list_head_init(&iwucq->resize_list); total_size = get_cq_total_bytes(info.cq_size); cq_pages = total_size >> IRDMA_HW_PAGE_SHIFT; if (!(uk_attrs->feature_flags & IRDMA_FEATURE_CQ_RESIZE)) total_size = 
(cq_pages << IRDMA_HW_PAGE_SHIFT) + IRDMA_DB_SHADOW_AREA_SIZE; iwucq->buf_size = total_size; info.cq_base = irdma_calloc_hw_buf(total_size); if (!info.cq_base) goto err_cq_base; reg_mr_cmd.reg_type = IRDMA_MEMREG_TYPE_CQ; reg_mr_cmd.cq_pages = cq_pages; ret = ibv_cmd_reg_mr(&iwvctx->iwupd->ibv_pd, info.cq_base, total_size, (uintptr_t)info.cq_base, IBV_ACCESS_LOCAL_WRITE, &iwucq->vmr, ®_mr_cmd.ibv_cmd, sizeof(reg_mr_cmd), ®_mr_resp, sizeof(reg_mr_resp)); if (ret) { errno = ret; goto err_dereg_mr; } iwucq->vmr.ibv_mr.pd = &iwvctx->iwupd->ibv_pd; if (uk_attrs->feature_flags & IRDMA_FEATURE_CQ_RESIZE) { info.shadow_area = irdma_calloc_hw_buf(IRDMA_DB_SHADOW_AREA_SIZE); if (!info.shadow_area) goto err_dereg_mr; reg_mr_shadow_cmd.reg_type = IRDMA_MEMREG_TYPE_CQ; reg_mr_shadow_cmd.cq_pages = 1; ret = ibv_cmd_reg_mr(&iwvctx->iwupd->ibv_pd, info.shadow_area, IRDMA_DB_SHADOW_AREA_SIZE, (uintptr_t)info.shadow_area, IBV_ACCESS_LOCAL_WRITE, &iwucq->vmr_shadow_area, ®_mr_shadow_cmd.ibv_cmd, sizeof(reg_mr_shadow_cmd), ®_mr_shadow_resp, sizeof(reg_mr_shadow_resp)); if (ret) { errno = ret; goto err_dereg_shadow; } iwucq->vmr_shadow_area.ibv_mr.pd = &iwvctx->iwupd->ibv_pd; } else { info.shadow_area = (__le64 *)((__u8 *)info.cq_base + (cq_pages << IRDMA_HW_PAGE_SHIFT)); } attr_ex->cqe = info.cq_size; cmd.user_cq_buf = (__u64)((uintptr_t)info.cq_base); cmd.user_shadow_area = (__u64)((uintptr_t)info.shadow_area); ret = ibv_cmd_create_cq_ex(context, attr_ex, &iwucq->verbs_cq, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp), 0); attr_ex->cqe = ncqe; if (ret) { errno = ret; goto err_dereg_shadow; } if (ext_cq) irdma_ibvcq_ex_fill_priv_funcs(iwucq, attr_ex); info.cq_id = resp.cq_id; /* Do not report the cqe's burned by HW */ iwucq->verbs_cq.cq.cqe = ncqe; info.cqe_alloc_db = (__u32 *)((__u8 *)iwvctx->db + IRDMA_DB_CQ_OFFSET); irdma_uk_cq_init(&iwucq->cq, &info); return &iwucq->verbs_cq.cq_ex; err_dereg_shadow: ibv_cmd_dereg_mr(&iwucq->vmr); if (iwucq->vmr_shadow_area.ibv_mr.handle) { ibv_cmd_dereg_mr(&iwucq->vmr_shadow_area); irdma_free_hw_buf(info.shadow_area, IRDMA_HW_PAGE_SIZE); } err_dereg_mr: irdma_free_hw_buf(info.cq_base, total_size); err_cq_base: pthread_spin_destroy(&iwucq->lock); free(iwucq); return NULL; } struct ibv_cq *irdma_ucreate_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector) { struct ibv_cq_init_attr_ex attr_ex = { .cqe = cqe, .channel = channel, .comp_vector = comp_vector, }; struct ibv_cq_ex *ibvcq_ex; ibvcq_ex = ucreate_cq(context, &attr_ex, false); return ibvcq_ex ? 
ibv_cq_ex_to_cq(ibvcq_ex) : NULL; } struct ibv_cq_ex *irdma_ucreate_cq_ex(struct ibv_context *context, struct ibv_cq_init_attr_ex *attr_ex) { if (attr_ex->wc_flags & ~IRDMA_CQ_SUPPORTED_WC_FLAGS) { errno = EOPNOTSUPP; return NULL; } return ucreate_cq(context, attr_ex, true); } /** * irdma_free_cq_buf - free memory for cq buffer * @cq_buf: cq buf to free */ static void irdma_free_cq_buf(struct irdma_cq_buf *cq_buf) { ibv_cmd_dereg_mr(&cq_buf->vmr); irdma_free_hw_buf(cq_buf->cq.cq_base, get_cq_total_bytes(cq_buf->cq.cq_size)); free(cq_buf); } /** * irdma_process_resize_list - process the cq list to remove buffers * @iwucq: cq which owns the list * @lcqe_buf: cq buf where the last cqe is found */ static int irdma_process_resize_list(struct irdma_ucq *iwucq, struct irdma_cq_buf *lcqe_buf) { struct irdma_cq_buf *cq_buf, *next; int cq_cnt = 0; list_for_each_safe(&iwucq->resize_list, cq_buf, next, list) { if (cq_buf == lcqe_buf) return cq_cnt; list_del(&cq_buf->list); irdma_free_cq_buf(cq_buf); cq_cnt++; } return cq_cnt; } /** * irdma_udestroy_cq - destroys cq * @cq: ptr to cq to be destroyed */ int irdma_udestroy_cq(struct ibv_cq *cq) { struct irdma_uk_attrs *uk_attrs; struct irdma_uvcontext *iwvctx; struct irdma_ucq *iwucq; int ret; iwucq = container_of(cq, struct irdma_ucq, verbs_cq.cq); iwvctx = container_of(cq->context, struct irdma_uvcontext, ibv_ctx.context); uk_attrs = &iwvctx->uk_attrs; ret = pthread_spin_destroy(&iwucq->lock); if (ret) goto err; irdma_process_resize_list(iwucq, NULL); ret = ibv_cmd_destroy_cq(cq); if (ret) goto err; ibv_cmd_dereg_mr(&iwucq->vmr); irdma_free_hw_buf(iwucq->cq.cq_base, iwucq->buf_size); if (uk_attrs->feature_flags & IRDMA_FEATURE_CQ_RESIZE) { ibv_cmd_dereg_mr(&iwucq->vmr_shadow_area); irdma_free_hw_buf(iwucq->cq.shadow_area, IRDMA_DB_SHADOW_AREA_SIZE); } free(iwucq); return 0; err: return ret; } static enum ibv_wc_status irdma_flush_err_to_ib_wc_status(enum irdma_flush_opcode opcode) { switch (opcode) { case FLUSH_PROT_ERR: return IBV_WC_LOC_PROT_ERR; case FLUSH_REM_ACCESS_ERR: return IBV_WC_REM_ACCESS_ERR; case FLUSH_LOC_QP_OP_ERR: return IBV_WC_LOC_QP_OP_ERR; case FLUSH_REM_OP_ERR: return IBV_WC_REM_OP_ERR; case FLUSH_LOC_LEN_ERR: return IBV_WC_LOC_LEN_ERR; case FLUSH_GENERAL_ERR: return IBV_WC_WR_FLUSH_ERR; case FLUSH_RETRY_EXC_ERR: return IBV_WC_RETRY_EXC_ERR; case FLUSH_MW_BIND_ERR: return IBV_WC_MW_BIND_ERR; case FLUSH_REM_INV_REQ_ERR: return IBV_WC_REM_INV_REQ_ERR; case FLUSH_FATAL_ERR: default: return IBV_WC_FATAL_ERR; } } static inline void set_ib_wc_op_sq(struct irdma_cq_poll_info *cur_cqe, struct ibv_wc *entry) { switch (cur_cqe->op_type) { case IRDMA_OP_TYPE_RDMA_WRITE: case IRDMA_OP_TYPE_RDMA_WRITE_SOL: entry->opcode = IBV_WC_RDMA_WRITE; break; case IRDMA_OP_TYPE_RDMA_READ: entry->opcode = IBV_WC_RDMA_READ; break; case IRDMA_OP_TYPE_SEND_SOL: case IRDMA_OP_TYPE_SEND_SOL_INV: case IRDMA_OP_TYPE_SEND_INV: case IRDMA_OP_TYPE_SEND: entry->opcode = IBV_WC_SEND; break; case IRDMA_OP_TYPE_BIND_MW: entry->opcode = IBV_WC_BIND_MW; break; case IRDMA_OP_TYPE_INV_STAG: entry->opcode = IBV_WC_LOCAL_INV; break; default: entry->status = IBV_WC_GENERAL_ERR; } } static inline void set_ib_wc_op_rq(struct irdma_cq_poll_info *cur_cqe, struct ibv_wc *entry, bool send_imm_support) { /** * iWARP does not support sendImm, so the presence of Imm data * must be WriteImm. */ if (!send_imm_support) { entry->opcode = cur_cqe->imm_valid ? 
IBV_WC_RECV_RDMA_WITH_IMM : IBV_WC_RECV; return; } switch (cur_cqe->op_type) { case IBV_OPCODE_RDMA_WRITE_ONLY_WITH_IMMEDIATE: case IBV_OPCODE_RDMA_WRITE_LAST_WITH_IMMEDIATE: entry->opcode = IBV_WC_RECV_RDMA_WITH_IMM; break; default: entry->opcode = IBV_WC_RECV; } } /** * irdma_process_cqe_ext - process current cqe for extended CQ * @cur_cqe - current cqe info */ static void irdma_process_cqe_ext(struct irdma_cq_poll_info *cur_cqe) { struct irdma_ucq *iwucq = container_of(cur_cqe, struct irdma_ucq, cur_cqe); struct ibv_cq_ex *ibvcq_ex = &iwucq->verbs_cq.cq_ex; ibvcq_ex->wr_id = cur_cqe->wr_id; if (cur_cqe->error) ibvcq_ex->status = (cur_cqe->comp_status == IRDMA_COMPL_STATUS_FLUSHED) ? irdma_flush_err_to_ib_wc_status(cur_cqe->minor_err) : IBV_WC_GENERAL_ERR; else ibvcq_ex->status = IBV_WC_SUCCESS; } /** * irdma_process_cqe - process current cqe info * @entry - ibv_wc object to fill in for non-extended CQ * @cur_cqe - current cqe info */ static void irdma_process_cqe(struct ibv_wc *entry, struct irdma_cq_poll_info *cur_cqe) { struct irdma_qp_uk *qp; struct ibv_qp *ib_qp; entry->wc_flags = 0; entry->wr_id = cur_cqe->wr_id; entry->qp_num = cur_cqe->qp_id; qp = cur_cqe->qp_handle; ib_qp = qp->back_qp; if (cur_cqe->error) { entry->status = (cur_cqe->comp_status == IRDMA_COMPL_STATUS_FLUSHED) ? irdma_flush_err_to_ib_wc_status(cur_cqe->minor_err) : IBV_WC_GENERAL_ERR; entry->vendor_err = cur_cqe->major_err << 16 | cur_cqe->minor_err; } else { entry->status = IBV_WC_SUCCESS; } if (cur_cqe->imm_valid) { entry->imm_data = htonl(cur_cqe->imm_data); entry->wc_flags |= IBV_WC_WITH_IMM; } if (cur_cqe->q_type == IRDMA_CQE_QTYPE_SQ) { set_ib_wc_op_sq(cur_cqe, entry); } else { set_ib_wc_op_rq(cur_cqe, entry, qp->qp_caps & IRDMA_SEND_WITH_IMM ? true : false); if (ib_qp->qp_type != IBV_QPT_UD && cur_cqe->stag_invalid_set) { entry->invalidated_rkey = cur_cqe->inv_stag; entry->wc_flags |= IBV_WC_WITH_INV; } } if (ib_qp->qp_type == IBV_QPT_UD) { entry->src_qp = cur_cqe->ud_src_qpn; entry->wc_flags |= IBV_WC_GRH; } else { entry->src_qp = cur_cqe->qp_id; } entry->byte_len = cur_cqe->bytes_xfered; } /** * irdma_poll_one - poll one entry of the CQ * @ukcq: ukcq to poll * @cur_cqe: current CQE info to be filled in * @entry: ibv_wc object to be filled for non-extended CQ or NULL for extended CQ * * Returns the internal irdma device error code or 0 on success */ static int irdma_poll_one(struct irdma_cq_uk *ukcq, struct irdma_cq_poll_info *cur_cqe, struct ibv_wc *entry) { int ret = irdma_uk_cq_poll_cmpl(ukcq, cur_cqe); if (ret) return ret; if (!entry) irdma_process_cqe_ext(cur_cqe); else irdma_process_cqe(entry, cur_cqe); return 0; } /** * __irdma_upoll_cq - irdma util function to poll device CQ * @iwucq: irdma cq to poll * @num_entries: max cq entries to poll * @entry: pointer to array of ibv_wc objects to be filled in for each completion or NULL if ext CQ * * Returns non-negative value equal to the number of completions * found. On failure, -EINVAL */ static int __irdma_upoll_cq(struct irdma_ucq *iwucq, int num_entries, struct ibv_wc *entry) { struct irdma_cq_buf *cq_buf, *next; struct irdma_cq_buf *last_buf = NULL; struct irdma_cq_poll_info *cur_cqe = &iwucq->cur_cqe; bool cq_new_cqe = false; int resized_bufs = 0; int npolled = 0; int ret; /* go through the list of previously resized CQ buffers */ list_for_each_safe(&iwucq->resize_list, cq_buf, next, list) { while (npolled < num_entries) { ret = irdma_poll_one(&cq_buf->cq, cur_cqe, entry ? 
entry + npolled : NULL);
			if (!ret) {
				++npolled;
				cq_new_cqe = true;
				continue;
			}
			if (ret == ENOENT)
				break;
			/* QP using the CQ is destroyed. Skip reporting this CQE */
			if (ret == EFAULT) {
				cq_new_cqe = true;
				continue;
			}
			goto error;
		}

		/* save the resized CQ buffer which received the last cqe */
		if (cq_new_cqe)
			last_buf = cq_buf;
		cq_new_cqe = false;
	}

	/* check the current CQ for new cqes */
	while (npolled < num_entries) {
		ret = irdma_poll_one(&iwucq->cq, cur_cqe,
				     entry ? entry + npolled : NULL);
		if (!ret) {
			++npolled;
			cq_new_cqe = true;
			continue;
		}
		if (ret == ENOENT)
			break;
		/* QP using the CQ is destroyed. Skip reporting this CQE */
		if (ret == EFAULT) {
			cq_new_cqe = true;
			continue;
		}
		goto error;
	}

	if (cq_new_cqe)
		/* all previous CQ resizes are complete */
		resized_bufs = irdma_process_resize_list(iwucq, NULL);
	else if (last_buf)
		/* only CQ resizes up to the last_buf are complete */
		resized_bufs = irdma_process_resize_list(iwucq, last_buf);
	if (resized_bufs)
		/* report to the HW the number of complete CQ resizes */
		irdma_uk_cq_set_resized_cnt(&iwucq->cq, resized_bufs);

	return npolled;

error:
	return -EINVAL;
}

/**
 * irdma_upoll_cq - verb API callback to poll device CQ
 * @cq: ibv_cq to poll
 * @num_entries: max cq entries to poll
 * @entry: pointer to array of ibv_wc objects to be filled in for each completion
 *
 * Returns a non-negative value equal to the number of completions
 * found, or a negative error code on failure
 */
int irdma_upoll_cq(struct ibv_cq *cq, int num_entries, struct ibv_wc *entry)
{
	struct irdma_ucq *iwucq;
	int ret;

	iwucq = container_of(cq, struct irdma_ucq, verbs_cq.cq);
	ret = pthread_spin_lock(&iwucq->lock);
	if (ret)
		return -ret;

	ret = __irdma_upoll_cq(iwucq, num_entries, entry);

	pthread_spin_unlock(&iwucq->lock);

	return ret;
}

/**
 * irdma_start_poll - verb_ex API callback to poll a batch of WCs
 * @ibvcq_ex: ibv extended CQ
 * @attr: attributes (not used)
 *
 * Start polling a batch of work completions. Return 0 on success, ENOENT when
 * no completions are available on the CQ, and an error code on errors
 */
static int irdma_start_poll(struct ibv_cq_ex *ibvcq_ex, struct ibv_poll_cq_attr *attr)
{
	struct irdma_ucq *iwucq;
	int ret;

	iwucq = container_of(ibvcq_ex, struct irdma_ucq, verbs_cq.cq_ex);
	ret = pthread_spin_lock(&iwucq->lock);
	if (ret)
		return ret;

	ret = __irdma_upoll_cq(iwucq, 1, NULL);
	if (ret == 1)
		return 0;

	/* No Completions on CQ */
	if (!ret)
		ret = ENOENT;

	pthread_spin_unlock(&iwucq->lock);

	return ret;
}

/**
 * irdma_next_poll - verb_ex API callback to get next WC
 * @ibvcq_ex: ibv extended CQ
 *
 * Return 0 on success, ENOENT when no completions are available on CQ.
* And an error code on errors */ static int irdma_next_poll(struct ibv_cq_ex *ibvcq_ex) { struct irdma_ucq *iwucq; int ret; iwucq = container_of(ibvcq_ex, struct irdma_ucq, verbs_cq.cq_ex); ret = __irdma_upoll_cq(iwucq, 1, NULL); if (ret == 1) return 0; /* No Completions on CQ */ if (!ret) ret = ENOENT; return ret; } /** * irdma_end_poll - verb_ex API callback to end polling of WC's * @ibvcq_ex: ibv extended CQ */ static void irdma_end_poll(struct ibv_cq_ex *ibvcq_ex) { struct irdma_ucq *iwucq = container_of(ibvcq_ex, struct irdma_ucq, verbs_cq.cq_ex); pthread_spin_unlock(&iwucq->lock); } /** * irdma_wc_read_completion_ts - Get completion timestamp * @ibvcq_ex: ibv extended CQ * * Get completion timestamp in HCA clock units */ static uint64_t irdma_wc_read_completion_ts(struct ibv_cq_ex *ibvcq_ex) { struct irdma_ucq *iwucq = container_of(ibvcq_ex, struct irdma_ucq, verbs_cq.cq_ex); #define HCA_CORE_CLOCK_800_MHZ 800 return iwucq->cur_cqe.tcp_seq_num_rtt / HCA_CORE_CLOCK_800_MHZ; } /** * irdma_wc_read_completion_wallclock_ns - Get completion timestamp in ns * @ibvcq_ex: ibv extended CQ * * Get completion timestamp from current completion in wall clock nanoseconds */ static uint64_t irdma_wc_read_completion_wallclock_ns(struct ibv_cq_ex *ibvcq_ex) { struct irdma_ucq *iwucq = container_of(ibvcq_ex, struct irdma_ucq, verbs_cq.cq_ex); /* RTT is in usec */ return iwucq->cur_cqe.tcp_seq_num_rtt * 1000; } static enum ibv_wc_opcode irdma_wc_read_opcode(struct ibv_cq_ex *ibvcq_ex) { struct irdma_ucq *iwucq = container_of(ibvcq_ex, struct irdma_ucq, verbs_cq.cq_ex); switch (iwucq->cur_cqe.op_type) { case IRDMA_OP_TYPE_RDMA_WRITE: case IRDMA_OP_TYPE_RDMA_WRITE_SOL: return IBV_WC_RDMA_WRITE; case IRDMA_OP_TYPE_RDMA_READ: return IBV_WC_RDMA_READ; case IRDMA_OP_TYPE_SEND_SOL: case IRDMA_OP_TYPE_SEND_SOL_INV: case IRDMA_OP_TYPE_SEND_INV: case IRDMA_OP_TYPE_SEND: return IBV_WC_SEND; case IRDMA_OP_TYPE_BIND_MW: return IBV_WC_BIND_MW; case IRDMA_OP_TYPE_REC: return IBV_WC_RECV; case IRDMA_OP_TYPE_REC_IMM: return IBV_WC_RECV_RDMA_WITH_IMM; case IRDMA_OP_TYPE_INV_STAG: return IBV_WC_LOCAL_INV; } return 0; } static uint32_t irdma_wc_read_vendor_err(struct ibv_cq_ex *ibvcq_ex) { struct irdma_cq_poll_info *cur_cqe; struct irdma_ucq *iwucq; iwucq = container_of(ibvcq_ex, struct irdma_ucq, verbs_cq.cq_ex); cur_cqe = &iwucq->cur_cqe; return cur_cqe->error ? cur_cqe->major_err << 16 | cur_cqe->minor_err : 0; } static unsigned int irdma_wc_read_wc_flags(struct ibv_cq_ex *ibvcq_ex) { struct irdma_cq_poll_info *cur_cqe; struct irdma_ucq *iwucq; struct irdma_qp_uk *qp; struct ibv_qp *ib_qp; unsigned int wc_flags = 0; iwucq = container_of(ibvcq_ex, struct irdma_ucq, verbs_cq.cq_ex); cur_cqe = &iwucq->cur_cqe; qp = cur_cqe->qp_handle; ib_qp = qp->back_qp; if (cur_cqe->imm_valid) wc_flags |= IBV_WC_WITH_IMM; if (ib_qp->qp_type == IBV_QPT_UD) { wc_flags |= IBV_WC_GRH; } else { if (cur_cqe->stag_invalid_set) { switch (cur_cqe->op_type) { case IRDMA_OP_TYPE_REC: wc_flags |= IBV_WC_WITH_INV; break; case IRDMA_OP_TYPE_REC_IMM: wc_flags |= IBV_WC_WITH_INV; break; } } } return wc_flags; } static uint32_t irdma_wc_read_byte_len(struct ibv_cq_ex *ibvcq_ex) { struct irdma_ucq *iwucq = container_of(ibvcq_ex, struct irdma_ucq, verbs_cq.cq_ex); return iwucq->cur_cqe.bytes_xfered; } static __be32 irdma_wc_read_imm_data(struct ibv_cq_ex *ibvcq_ex) { struct irdma_cq_poll_info *cur_cqe; struct irdma_ucq *iwucq; iwucq = container_of(ibvcq_ex, struct irdma_ucq, verbs_cq.cq_ex); cur_cqe = &iwucq->cur_cqe; return cur_cqe->imm_valid ? 
htonl(cur_cqe->imm_data) : 0; } static uint32_t irdma_wc_read_qp_num(struct ibv_cq_ex *ibvcq_ex) { struct irdma_ucq *iwucq = container_of(ibvcq_ex, struct irdma_ucq, verbs_cq.cq_ex); return iwucq->cur_cqe.qp_id; } static uint32_t irdma_wc_read_src_qp(struct ibv_cq_ex *ibvcq_ex) { struct irdma_cq_poll_info *cur_cqe; struct irdma_ucq *iwucq; struct irdma_qp_uk *qp; struct ibv_qp *ib_qp; iwucq = container_of(ibvcq_ex, struct irdma_ucq, verbs_cq.cq_ex); cur_cqe = &iwucq->cur_cqe; qp = cur_cqe->qp_handle; ib_qp = qp->back_qp; return ib_qp->qp_type == IBV_QPT_UD ? cur_cqe->ud_src_qpn : cur_cqe->qp_id; } static uint32_t irdma_wc_read_slid(struct ibv_cq_ex *ibvcq_ex) { return 0; } static uint8_t irdma_wc_read_sl(struct ibv_cq_ex *ibvcq_ex) { return 0; } static uint8_t irdma_wc_read_dlid_path_bits(struct ibv_cq_ex *ibvcq_ex) { return 0; } void irdma_ibvcq_ex_fill_priv_funcs(struct irdma_ucq *iwucq, struct ibv_cq_init_attr_ex *attr_ex) { struct ibv_cq_ex *ibvcq_ex = &iwucq->verbs_cq.cq_ex; ibvcq_ex->start_poll = irdma_start_poll; ibvcq_ex->end_poll = irdma_end_poll; ibvcq_ex->next_poll = irdma_next_poll; if (attr_ex->wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP) { ibvcq_ex->read_completion_ts = irdma_wc_read_completion_ts; iwucq->report_rtt = true; } if (attr_ex->wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK) { ibvcq_ex->read_completion_wallclock_ns = irdma_wc_read_completion_wallclock_ns; iwucq->report_rtt = true; } ibvcq_ex->read_opcode = irdma_wc_read_opcode; ibvcq_ex->read_vendor_err = irdma_wc_read_vendor_err; ibvcq_ex->read_wc_flags = irdma_wc_read_wc_flags; if (attr_ex->wc_flags & IBV_WC_EX_WITH_BYTE_LEN) ibvcq_ex->read_byte_len = irdma_wc_read_byte_len; if (attr_ex->wc_flags & IBV_WC_EX_WITH_IMM) ibvcq_ex->read_imm_data = irdma_wc_read_imm_data; if (attr_ex->wc_flags & IBV_WC_EX_WITH_QP_NUM) ibvcq_ex->read_qp_num = irdma_wc_read_qp_num; if (attr_ex->wc_flags & IBV_WC_EX_WITH_SRC_QP) ibvcq_ex->read_src_qp = irdma_wc_read_src_qp; if (attr_ex->wc_flags & IBV_WC_EX_WITH_SLID) ibvcq_ex->read_slid = irdma_wc_read_slid; if (attr_ex->wc_flags & IBV_WC_EX_WITH_SL) ibvcq_ex->read_sl = irdma_wc_read_sl; if (attr_ex->wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS) ibvcq_ex->read_dlid_path_bits = irdma_wc_read_dlid_path_bits; } /** * irdma_arm_cq - arm of cq * @iwucq: cq to which arm * @cq_notify: notification params */ static void irdma_arm_cq(struct irdma_ucq *iwucq, enum irdma_cmpl_notify cq_notify) { iwucq->is_armed = true; iwucq->arm_sol = true; iwucq->skip_arm = false; iwucq->skip_sol = true; irdma_uk_cq_request_notification(&iwucq->cq, cq_notify); } /** * irdma_uarm_cq - callback for arm of cq * @cq: cq to arm * @solicited: to get notify params */ int irdma_uarm_cq(struct ibv_cq *cq, int solicited) { struct irdma_ucq *iwucq; enum irdma_cmpl_notify cq_notify = IRDMA_CQ_COMPL_EVENT; int ret; iwucq = container_of(cq, struct irdma_ucq, verbs_cq.cq); if (solicited) cq_notify = IRDMA_CQ_COMPL_SOLICITED; ret = pthread_spin_lock(&iwucq->lock); if (ret) return ret; if (iwucq->is_armed) { if (iwucq->arm_sol && !solicited) { irdma_arm_cq(iwucq, cq_notify); } else { iwucq->skip_arm = true; iwucq->skip_sol = solicited ? 
true : false; } } else { irdma_arm_cq(iwucq, cq_notify); } pthread_spin_unlock(&iwucq->lock); return 0; } /** * irdma_cq_event - cq to do completion event * @cq: cq to arm */ void irdma_cq_event(struct ibv_cq *cq) { struct irdma_ucq *iwucq; iwucq = container_of(cq, struct irdma_ucq, verbs_cq.cq); if (pthread_spin_lock(&iwucq->lock)) return; if (iwucq->skip_arm) irdma_arm_cq(iwucq, IRDMA_CQ_COMPL_EVENT); else iwucq->is_armed = false; pthread_spin_unlock(&iwucq->lock); } void *irdma_mmap(int fd, off_t offset) { void *map; map = mmap(NULL, IRDMA_HW_PAGE_SIZE, PROT_WRITE | PROT_READ, MAP_SHARED, fd, offset); if (map == MAP_FAILED) return map; if (ibv_dontfork_range(map, IRDMA_HW_PAGE_SIZE)) { munmap(map, IRDMA_HW_PAGE_SIZE); return MAP_FAILED; } return map; } void irdma_munmap(void *map) { ibv_dofork_range(map, IRDMA_HW_PAGE_SIZE); munmap(map, IRDMA_HW_PAGE_SIZE); } /** * irdma_destroy_vmapped_qp - destroy resources for qp * @iwuqp: qp struct for resources */ static int irdma_destroy_vmapped_qp(struct irdma_uqp *iwuqp) { int ret; ret = ibv_cmd_destroy_qp(&iwuqp->ibv_qp); if (ret) return ret; if (iwuqp->qp.push_db) irdma_munmap(iwuqp->qp.push_db); if (iwuqp->qp.push_wqe) irdma_munmap(iwuqp->qp.push_wqe); ibv_cmd_dereg_mr(&iwuqp->vmr); return 0; } /** * irdma_vmapped_qp - create resources for qp * @iwuqp: qp struct for resources * @pd: pd for the qp * @attr: attributes of qp passed * @resp: response back from create qp * @info: info for initializing user level qp * @abi_ver: abi version of the create qp command */ static int irdma_vmapped_qp(struct irdma_uqp *iwuqp, struct ibv_pd *pd, struct ibv_qp_init_attr *attr, struct irdma_qp_uk_init_info *info, bool legacy_mode) { struct irdma_ucreate_qp cmd = {}; size_t sqsize, rqsize, totalqpsize; struct irdma_ucreate_qp_resp resp = {}; struct irdma_ureg_mr reg_mr_cmd = {}; struct ib_uverbs_reg_mr_resp reg_mr_resp = {}; long os_pgsz = IRDMA_HW_PAGE_SIZE; struct irdma_uvcontext *iwvctx; int ret; sqsize = roundup(info->sq_depth * IRDMA_QP_WQE_MIN_SIZE, IRDMA_HW_PAGE_SIZE); rqsize = roundup(info->rq_depth * IRDMA_QP_WQE_MIN_SIZE, IRDMA_HW_PAGE_SIZE); totalqpsize = rqsize + sqsize + IRDMA_DB_SHADOW_AREA_SIZE; iwvctx = container_of(pd->context, struct irdma_uvcontext, ibv_ctx.context); /* adjust alignment for iwarp */ if (iwvctx->ibv_ctx.context.device->transport_type == IBV_TRANSPORT_IWARP) { long pgsz = sysconf(_SC_PAGESIZE); if (pgsz > 0) os_pgsz = pgsz; } info->sq = irdma_calloc_hw_buf_sz(totalqpsize, os_pgsz); if (!info->sq) return ENOMEM; iwuqp->buf_size = totalqpsize; info->rq = &info->sq[sqsize / IRDMA_QP_WQE_MIN_SIZE]; info->shadow_area = info->rq[rqsize / IRDMA_QP_WQE_MIN_SIZE].elem; reg_mr_cmd.reg_type = IRDMA_MEMREG_TYPE_QP; reg_mr_cmd.sq_pages = sqsize >> IRDMA_HW_PAGE_SHIFT; reg_mr_cmd.rq_pages = rqsize >> IRDMA_HW_PAGE_SHIFT; ret = ibv_cmd_reg_mr(pd, info->sq, totalqpsize, (uintptr_t)info->sq, IBV_ACCESS_LOCAL_WRITE, &iwuqp->vmr, ®_mr_cmd.ibv_cmd, sizeof(reg_mr_cmd), ®_mr_resp, sizeof(reg_mr_resp)); if (ret) goto err_dereg_mr; cmd.user_wqe_bufs = (__u64)((uintptr_t)info->sq); cmd.user_compl_ctx = (__u64)(uintptr_t)&iwuqp->qp; ret = ibv_cmd_create_qp(pd, &iwuqp->ibv_qp, attr, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(struct irdma_ucreate_qp_resp)); if (ret) goto err_qp; info->sq_size = resp.actual_sq_size; info->rq_size = resp.actual_rq_size; info->first_sq_wq = legacy_mode ? 
1 : resp.lsmm; info->qp_caps = resp.qp_caps; info->qp_id = resp.qp_id; iwuqp->irdma_drv_opt = resp.irdma_drv_opt; iwuqp->ibv_qp.qp_num = resp.qp_id; iwuqp->send_cq = container_of(attr->send_cq, struct irdma_ucq, verbs_cq.cq); iwuqp->recv_cq = container_of(attr->recv_cq, struct irdma_ucq, verbs_cq.cq); iwuqp->send_cq->uqp = iwuqp; iwuqp->recv_cq->uqp = iwuqp; return 0; err_qp: ibv_cmd_dereg_mr(&iwuqp->vmr); err_dereg_mr: irdma_free_hw_buf(info->sq, iwuqp->buf_size); return ret; } /** * irdma_ucreate_qp - create qp on user app * @pd: pd for the qp * @attr: attributes of the qp to be created (sizes, sge, cq) */ struct ibv_qp *irdma_ucreate_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct irdma_qp_uk_init_info info = {}; struct irdma_uk_attrs *uk_attrs; struct irdma_uvcontext *iwvctx; struct irdma_uqp *iwuqp; int status; if (attr->qp_type != IBV_QPT_RC && attr->qp_type != IBV_QPT_UD) { errno = EOPNOTSUPP; return NULL; } iwvctx = container_of(pd->context, struct irdma_uvcontext, ibv_ctx.context); uk_attrs = &iwvctx->uk_attrs; if (attr->cap.max_send_sge > uk_attrs->max_hw_wq_frags || attr->cap.max_recv_sge > uk_attrs->max_hw_wq_frags || attr->cap.max_send_wr > uk_attrs->max_hw_wq_quanta || attr->cap.max_recv_wr > uk_attrs->max_hw_rq_quanta || attr->cap.max_inline_data > uk_attrs->max_hw_inline) { errno = EINVAL; return NULL; } info.uk_attrs = uk_attrs; info.sq_size = attr->cap.max_send_wr; info.rq_size = attr->cap.max_recv_wr; info.max_sq_frag_cnt = attr->cap.max_send_sge; info.max_rq_frag_cnt = attr->cap.max_recv_sge; info.max_inline_data = attr->cap.max_inline_data; info.abi_ver = iwvctx->abi_ver; status = irdma_uk_calc_depth_shift_sq(&info, &info.sq_depth, &info.sq_shift); if (status) { errno = status; return NULL; } status = irdma_uk_calc_depth_shift_rq(&info, &info.rq_depth, &info.rq_shift); if (status) { errno = status; return NULL; } iwuqp = memalign(1024, sizeof(*iwuqp)); if (!iwuqp) return NULL; memset(iwuqp, 0, sizeof(*iwuqp)); if (pthread_spin_init(&iwuqp->lock, PTHREAD_PROCESS_PRIVATE)) goto err_free_qp; info.sq_size = info.sq_depth >> info.sq_shift; info.rq_size = info.rq_depth >> info.rq_shift; /** * Maintain backward compatibility with older ABI which pass sq * and rq depth (in quanta) in cap.max_send_wr a cap.max_recv_wr */ if (!iwvctx->use_raw_attrs) { attr->cap.max_send_wr = info.sq_size; attr->cap.max_recv_wr = info.rq_size; } info.wqe_alloc_db = (__u32 *)iwvctx->db; info.legacy_mode = iwvctx->legacy_mode; info.sq_wrtrk_array = calloc(info.sq_depth, sizeof(*info.sq_wrtrk_array)); if (!info.sq_wrtrk_array) goto err_destroy_lock; info.rq_wrid_array = calloc(info.rq_depth, sizeof(*info.rq_wrid_array)); if (!info.rq_wrid_array) goto err_free_sq_wrtrk; iwuqp->sq_sig_all = attr->sq_sig_all; iwuqp->qp_type = attr->qp_type; status = irdma_vmapped_qp(iwuqp, pd, attr, &info, iwvctx->legacy_mode); if (status) { errno = status; goto err_free_rq_wrid; } iwuqp->qp.back_qp = iwuqp; iwuqp->qp.lock = &iwuqp->lock; status = irdma_uk_qp_init(&iwuqp->qp, &info); if (status) { errno = EINVAL; goto err_free_vmap_qp; } attr->cap.max_send_wr = (info.sq_depth - IRDMA_SQ_RSVD) >> info.sq_shift; attr->cap.max_recv_wr = (info.rq_depth - IRDMA_RQ_RSVD) >> info.rq_shift; return &iwuqp->ibv_qp; err_free_vmap_qp: irdma_destroy_vmapped_qp(iwuqp); irdma_free_hw_buf(info.sq, iwuqp->buf_size); err_free_rq_wrid: free(info.rq_wrid_array); err_free_sq_wrtrk: free(info.sq_wrtrk_array); err_destroy_lock: pthread_spin_destroy(&iwuqp->lock); err_free_qp: free(iwuqp); return NULL; } /** * irdma_uquery_qp - 
query qp for some attribute * @qp: qp for the attributes query * @attr: to return the attributes * @attr_mask: mask of what is query for * @init_attr: initial attributes during create_qp */ int irdma_uquery_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd; return ibv_cmd_query_qp(qp, attr, attr_mask, init_attr, &cmd, sizeof(cmd)); } /** * irdma_umodify_qp - send qp modify to driver * @qp: qp to modify * @attr: attribute to modify * @attr_mask: mask of the attribute */ int irdma_umodify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask) { struct irdma_umodify_qp_resp resp = {}; struct ibv_modify_qp cmd = {}; struct irdma_umodify_qp cmd_ex = {}; struct irdma_uvcontext *iwctx; struct irdma_uqp *iwuqp; iwuqp = container_of(qp, struct irdma_uqp, ibv_qp); iwctx = container_of(qp->context, struct irdma_uvcontext, ibv_ctx.context); if (iwuqp->qp.qp_caps & IRDMA_PUSH_MODE && attr_mask & IBV_QP_STATE && iwctx->uk_attrs.hw_rev > IRDMA_GEN_1) { __u64 offset; void *map; int ret; ret = ibv_cmd_modify_qp_ex(qp, attr, attr_mask, &cmd_ex.ibv_cmd, sizeof(cmd_ex), &resp.ibv_resp, sizeof(resp)); if (ret || !resp.push_valid) return ret; if (iwuqp->qp.push_wqe) return ret; offset = resp.push_wqe_mmap_key; map = irdma_mmap(qp->context->cmd_fd, offset); if (map == MAP_FAILED) return ret; iwuqp->qp.push_wqe = map; offset = resp.push_db_mmap_key; map = irdma_mmap(qp->context->cmd_fd, offset); if (map == MAP_FAILED) { irdma_munmap(iwuqp->qp.push_wqe); iwuqp->qp.push_wqe = NULL; return ret; } iwuqp->qp.push_wqe += resp.push_offset; iwuqp->qp.push_db = map + resp.push_offset; return ret; } else { return ibv_cmd_modify_qp(qp, attr, attr_mask, &cmd, sizeof(cmd)); } } static void irdma_issue_flush(struct ibv_qp *qp, bool sq_flush, bool rq_flush) { struct ib_uverbs_ex_modify_qp_resp resp = {}; struct irdma_umodify_qp cmd_ex = {}; struct ibv_qp_attr attr = {}; attr.qp_state = IBV_QPS_ERR; cmd_ex.sq_flush = sq_flush; cmd_ex.rq_flush = rq_flush; ibv_cmd_modify_qp_ex(qp, &attr, IBV_QP_STATE, &cmd_ex.ibv_cmd, sizeof(cmd_ex), &resp, sizeof(resp)); } /** * irdma_clean_cqes - clean cq entries for qp * @qp: qp for which completions are cleaned * @iwcq: cq to be cleaned */ static void irdma_clean_cqes(struct irdma_qp_uk *qp, struct irdma_ucq *iwucq) { struct irdma_cq_uk *ukcq = &iwucq->cq; int ret; ret = pthread_spin_lock(&iwucq->lock); if (ret) return; irdma_uk_clean_cq(qp, ukcq); pthread_spin_unlock(&iwucq->lock); } /** * irdma_udestroy_qp - destroy qp * @qp: qp to destroy */ int irdma_udestroy_qp(struct ibv_qp *qp) { struct irdma_uqp *iwuqp; int ret; iwuqp = container_of(qp, struct irdma_uqp, ibv_qp); ret = pthread_spin_destroy(&iwuqp->lock); if (ret) goto err; ret = irdma_destroy_vmapped_qp(iwuqp); if (ret) goto err; /* Clean any pending completions from the cq(s) */ if (iwuqp->send_cq) irdma_clean_cqes(&iwuqp->qp, iwuqp->send_cq); if (iwuqp->recv_cq && iwuqp->recv_cq != iwuqp->send_cq) irdma_clean_cqes(&iwuqp->qp, iwuqp->recv_cq); if (iwuqp->qp.sq_wrtrk_array) free(iwuqp->qp.sq_wrtrk_array); if (iwuqp->qp.rq_wrid_array) free(iwuqp->qp.rq_wrid_array); irdma_free_hw_buf(iwuqp->qp.sq_base, iwuqp->buf_size); free(iwuqp); return 0; err: return ret; } /** * irdma_post_send - post send wr for user application * @ib_qp: qp to post wr * @ib_wr: work request ptr * @bad_wr: return of bad wr if err */ int irdma_upost_send(struct ibv_qp *ib_qp, struct ibv_send_wr *ib_wr, struct ibv_send_wr **bad_wr) { struct irdma_post_sq_info info; struct 
irdma_uvcontext *iwvctx; struct irdma_uk_attrs *uk_attrs; struct irdma_uqp *iwuqp; bool reflush = false; int err; iwuqp = container_of(ib_qp, struct irdma_uqp, ibv_qp); iwvctx = container_of(ib_qp->context, struct irdma_uvcontext, ibv_ctx.context); uk_attrs = &iwvctx->uk_attrs; err = pthread_spin_lock(&iwuqp->lock); if (err) return err; if (!IRDMA_RING_MORE_WORK(iwuqp->qp.sq_ring) && ib_qp->state == IBV_QPS_ERR) reflush = true; while (ib_wr) { memset(&info, 0, sizeof(info)); info.wr_id = (__u64)(ib_wr->wr_id); if ((ib_wr->send_flags & IBV_SEND_SIGNALED) || iwuqp->sq_sig_all) info.signaled = true; if (ib_wr->send_flags & IBV_SEND_FENCE) info.read_fence = true; if (iwuqp->send_cq->report_rtt) info.report_rtt = true; switch (ib_wr->opcode) { case IBV_WR_SEND_WITH_IMM: if (iwuqp->qp.qp_caps & IRDMA_SEND_WITH_IMM) { info.imm_data_valid = true; info.imm_data = ntohl(ib_wr->imm_data); } else { err = EINVAL; break; } SWITCH_FALLTHROUGH; case IBV_WR_SEND: case IBV_WR_SEND_WITH_INV: if (ib_wr->opcode == IBV_WR_SEND || ib_wr->opcode == IBV_WR_SEND_WITH_IMM) { if (ib_wr->send_flags & IBV_SEND_SOLICITED) info.op_type = IRDMA_OP_TYPE_SEND_SOL; else info.op_type = IRDMA_OP_TYPE_SEND; } else { if (ib_wr->send_flags & IBV_SEND_SOLICITED) info.op_type = IRDMA_OP_TYPE_SEND_SOL_INV; else info.op_type = IRDMA_OP_TYPE_SEND_INV; info.stag_to_inv = ib_wr->invalidate_rkey; } info.op.send.num_sges = ib_wr->num_sge; info.op.send.sg_list = (struct ibv_sge *)ib_wr->sg_list; if (ib_qp->qp_type == IBV_QPT_UD) { struct irdma_uah *ah = container_of(ib_wr->wr.ud.ah, struct irdma_uah, ibv_ah); info.op.send.ah_id = ah->ah_id; info.op.send.qkey = ib_wr->wr.ud.remote_qkey; info.op.send.dest_qp = ib_wr->wr.ud.remote_qpn; } if (ib_wr->send_flags & IBV_SEND_INLINE) err = irdma_uk_inline_send(&iwuqp->qp, &info, false); else err = irdma_uk_send(&iwuqp->qp, &info, false); break; case IBV_WR_RDMA_WRITE_WITH_IMM: if (iwuqp->qp.qp_caps & IRDMA_WRITE_WITH_IMM) { info.imm_data_valid = true; info.imm_data = ntohl(ib_wr->imm_data); } else { err = EINVAL; break; } SWITCH_FALLTHROUGH; case IBV_WR_RDMA_WRITE: if (ib_wr->send_flags & IBV_SEND_SOLICITED) info.op_type = IRDMA_OP_TYPE_RDMA_WRITE_SOL; else info.op_type = IRDMA_OP_TYPE_RDMA_WRITE; info.op.rdma_write.num_lo_sges = ib_wr->num_sge; info.op.rdma_write.lo_sg_list = ib_wr->sg_list; info.op.rdma_write.rem_addr.addr = ib_wr->wr.rdma.remote_addr; info.op.rdma_write.rem_addr.lkey = ib_wr->wr.rdma.rkey; if (ib_wr->send_flags & IBV_SEND_INLINE) err = irdma_uk_inline_rdma_write(&iwuqp->qp, &info, false); else err = irdma_uk_rdma_write(&iwuqp->qp, &info, false); break; case IBV_WR_RDMA_READ: if (ib_wr->num_sge > uk_attrs->max_hw_read_sges) { err = EINVAL; break; } info.op_type = IRDMA_OP_TYPE_RDMA_READ; info.op.rdma_read.rem_addr.addr = ib_wr->wr.rdma.remote_addr; info.op.rdma_read.rem_addr.lkey = ib_wr->wr.rdma.rkey; info.op.rdma_read.lo_sg_list = ib_wr->sg_list; info.op.rdma_read.num_lo_sges = ib_wr->num_sge; err = irdma_uk_rdma_read(&iwuqp->qp, &info, false, false); break; case IBV_WR_BIND_MW: if (ib_qp->qp_type != IBV_QPT_RC) { err = EINVAL; break; } info.op_type = IRDMA_OP_TYPE_BIND_MW; info.op.bind_window.mr_stag = ib_wr->bind_mw.bind_info.mr->rkey; info.op.bind_window.mem_window_type_1 = true; info.op.bind_window.mw_stag = ib_wr->bind_mw.rkey; if (ib_wr->bind_mw.bind_info.mw_access_flags & IBV_ACCESS_ZERO_BASED) { info.op.bind_window.addressing_type = IRDMA_ADDR_TYPE_ZERO_BASED; info.op.bind_window.va = NULL; } else { info.op.bind_window.addressing_type = IRDMA_ADDR_TYPE_VA_BASED; 
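				/*
				 * With a VA-based window the remote peer
				 * addresses the window through full virtual
				 * addresses, so the bind programs
				 * bind_info.addr as the base va below; the
				 * zero-based branch above starts remote
				 * offsets at 0 and programs no base VA.
				 */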
info.op.bind_window.va = (void *)(uintptr_t)ib_wr->bind_mw.bind_info.addr; } info.op.bind_window.bind_len = ib_wr->bind_mw.bind_info.length; info.op.bind_window.ena_reads = (ib_wr->bind_mw.bind_info.mw_access_flags & IBV_ACCESS_REMOTE_READ) ? 1 : 0; info.op.bind_window.ena_writes = (ib_wr->bind_mw.bind_info.mw_access_flags & IBV_ACCESS_REMOTE_WRITE) ? 1 : 0; err = irdma_uk_mw_bind(&iwuqp->qp, &info, false); break; case IBV_WR_LOCAL_INV: info.op_type = IRDMA_OP_TYPE_INV_STAG; info.op.inv_local_stag.target_stag = ib_wr->invalidate_rkey; err = irdma_uk_stag_local_invalidate(&iwuqp->qp, &info, true); break; default: /* error */ err = EINVAL; break; } if (err) break; ib_wr = ib_wr->next; } if (err) *bad_wr = ib_wr; irdma_uk_qp_post_wr(&iwuqp->qp); if (reflush) irdma_issue_flush(ib_qp, 1, 0); pthread_spin_unlock(&iwuqp->lock); return err; } /** * irdma_post_recv - post receive wr for user application * @ib_wr: work request for receive * @bad_wr: bad wr caused an error */ int irdma_upost_recv(struct ibv_qp *ib_qp, struct ibv_recv_wr *ib_wr, struct ibv_recv_wr **bad_wr) { struct irdma_post_rq_info post_recv = {}; struct irdma_uqp *iwuqp; bool reflush = false; int err; iwuqp = container_of(ib_qp, struct irdma_uqp, ibv_qp); err = pthread_spin_lock(&iwuqp->lock); if (err) return err; if (!IRDMA_RING_MORE_WORK(iwuqp->qp.rq_ring) && ib_qp->state == IBV_QPS_ERR) reflush = true; while (ib_wr) { if (ib_wr->num_sge > iwuqp->qp.max_rq_frag_cnt) { *bad_wr = ib_wr; err = EINVAL; goto error; } post_recv.num_sges = ib_wr->num_sge; post_recv.wr_id = ib_wr->wr_id; post_recv.sg_list = ib_wr->sg_list; err = irdma_uk_post_receive(&iwuqp->qp, &post_recv); if (err) { *bad_wr = ib_wr; goto error; } if (reflush) irdma_issue_flush(ib_qp, 0, 1); ib_wr = ib_wr->next; } error: pthread_spin_unlock(&iwuqp->lock); return err; } /** * irdma_ucreate_ah - create address handle associated with a pd * @ibpd: pd for the address handle * @attr: attributes of address handle */ struct ibv_ah *irdma_ucreate_ah(struct ibv_pd *ibpd, struct ibv_ah_attr *attr) { struct irdma_uah *ah; union ibv_gid sgid; struct irdma_ucreate_ah_resp resp; int err; err = ibv_query_gid(ibpd->context, attr->port_num, attr->grh.sgid_index, &sgid); if (err) { errno = err; return NULL; } ah = calloc(1, sizeof(*ah)); if (!ah) return NULL; err = ibv_cmd_create_ah(ibpd, &ah->ibv_ah, attr, &resp.ibv_resp, sizeof(resp)); if (err) { free(ah); errno = err; return NULL; } ah->ah_id = resp.ah_id; return &ah->ibv_ah; } /** * irdma_udestroy_ah - destroy the address handle * @ibah: address handle */ int irdma_udestroy_ah(struct ibv_ah *ibah) { struct irdma_uah *ah; int ret; ah = container_of(ibah, struct irdma_uah, ibv_ah); ret = ibv_cmd_destroy_ah(ibah); if (ret) return ret; free(ah); return 0; } /** * irdma_uattach_mcast - Attach qp to multicast group implemented * @qp: The queue pair * @gid:The Global ID for multicast group * @lid: The Local ID */ int irdma_uattach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid) { return ibv_cmd_attach_mcast(qp, gid, lid); } /** * irdma_udetach_mcast - Detach qp from multicast group * @qp: The queue pair * @gid:The Global ID for multicast group * @lid: The Local ID */ int irdma_udetach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid) { return ibv_cmd_detach_mcast(qp, gid, lid); } /** * irdma_uresize_cq - resizes a cq * @cq: cq to resize * @cqe: the number of cqes of the new cq */ int irdma_uresize_cq(struct ibv_cq *cq, int cqe) { struct irdma_uvcontext *iwvctx; struct irdma_uk_attrs *uk_attrs; struct 
irdma_uresize_cq cmd = {}; struct ib_uverbs_resize_cq_resp resp = {}; struct irdma_ureg_mr reg_mr_cmd = {}; struct ib_uverbs_reg_mr_resp reg_mr_resp = {}; struct irdma_cq_buf *cq_buf = NULL; struct irdma_cqe *cq_base = NULL; struct verbs_mr new_mr = {}; struct irdma_ucq *iwucq; size_t cq_size; __u32 cq_pages; int cqe_needed; int ret = 0; iwucq = container_of(cq, struct irdma_ucq, verbs_cq.cq); iwvctx = container_of(cq->context, struct irdma_uvcontext, ibv_ctx.context); uk_attrs = &iwvctx->uk_attrs; if (!(uk_attrs->feature_flags & IRDMA_FEATURE_CQ_RESIZE)) return EOPNOTSUPP; if (cqe > IRDMA_MAX_CQ_SIZE) return EINVAL; cqe_needed = cqe + 1; if (uk_attrs->hw_rev > IRDMA_GEN_1) cqe_needed *= 2; if (cqe_needed < IRDMA_U_MINCQ_SIZE) cqe_needed = IRDMA_U_MINCQ_SIZE; if (cqe_needed == iwucq->cq.cq_size) return 0; cq_size = get_cq_total_bytes(cqe_needed); cq_pages = cq_size >> IRDMA_HW_PAGE_SHIFT; cq_base = irdma_calloc_hw_buf(cq_size); if (!cq_base) return ENOMEM; cq_buf = malloc(sizeof(*cq_buf)); if (!cq_buf) { ret = ENOMEM; goto err_buf; } new_mr.ibv_mr.pd = iwucq->vmr.ibv_mr.pd; reg_mr_cmd.reg_type = IRDMA_MEMREG_TYPE_CQ; reg_mr_cmd.cq_pages = cq_pages; ret = ibv_cmd_reg_mr(new_mr.ibv_mr.pd, cq_base, cq_size, (uintptr_t)cq_base, IBV_ACCESS_LOCAL_WRITE, &new_mr, ®_mr_cmd.ibv_cmd, sizeof(reg_mr_cmd), ®_mr_resp, sizeof(reg_mr_resp)); if (ret) goto err_dereg_mr; ret = pthread_spin_lock(&iwucq->lock); if (ret) goto err_lock; cmd.user_cq_buffer = (__u64)((uintptr_t)cq_base); ret = ibv_cmd_resize_cq(&iwucq->verbs_cq.cq, cqe_needed, &cmd.ibv_cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) goto err_resize; memcpy(&cq_buf->cq, &iwucq->cq, sizeof(cq_buf->cq)); cq_buf->vmr = iwucq->vmr; iwucq->vmr = new_mr; irdma_uk_cq_resize(&iwucq->cq, cq_base, cqe_needed); iwucq->verbs_cq.cq.cqe = cqe; list_add_tail(&iwucq->resize_list, &cq_buf->list); pthread_spin_unlock(&iwucq->lock); return ret; err_resize: pthread_spin_unlock(&iwucq->lock); err_lock: ibv_cmd_dereg_mr(&new_mr); err_dereg_mr: free(cq_buf); err_buf: irdma_free_hw_buf(cq_base, cq_size); return ret; } rdma-core-56.1/providers/mana/000077500000000000000000000000001477342711600162635ustar00rootroot00000000000000rdma-core-56.1/providers/mana/CMakeLists.txt000066400000000000000000000003361477342711600210250ustar00rootroot00000000000000rdma_shared_provider(mana libmana.map 1 1.0.${PACKAGE_VERSION} mana.c manadv.c qp.c wq.c cq.c wr.c ) publish_headers(infiniband manadv.h ) rdma_pkg_config("mana" "libibverbs" "${CMAKE_THREAD_LIBS_INIT}") rdma-core-56.1/providers/mana/cq.c000066400000000000000000000223601477342711600170350ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* * Copyright (c) 2024, Microsoft Corporation. All rights reserved. 
 */

/*
 * The system header names in the original #include block were lost when this
 * file was flattened; the headers below are a plausible reconstruction from
 * what the code uses, not the verbatim original list.
 */
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
#include <ccan/list.h>
#include <util/udma_barrier.h>
#include <infiniband/driver.h>
#include <infiniband/verbs.h>
#include "mana.h"
#include "gdma.h"
#include "doorbells.h"
#include "rollback.h"

#define INITIALIZED_OWNER_BIT(log2_num_entries) (1UL << (log2_num_entries))

DECLARE_DRV_CMD(mana_create_cq, IB_USER_VERBS_CMD_CREATE_CQ,
		mana_ib_create_cq, mana_ib_create_cq_resp);

struct ibv_cq *mana_create_cq(struct ibv_context *context, int cqe,
			      struct ibv_comp_channel *channel, int comp_vector)
{
	struct mana_context *ctx = to_mctx(context);
	struct mana_create_cq_resp resp = {};
	struct mana_ib_create_cq *cmd_drv;
	struct mana_create_cq cmd = {};
	struct mana_cq *cq;
	uint16_t flags = 0;
	size_t cq_size;
	int ret;

	cq = calloc(1, sizeof(*cq));
	if (!cq)
		return NULL;

	cq_size = align_hw_size(cqe * COMP_ENTRY_SIZE);
	cq->db_page = ctx->db_page;
	list_head_init(&cq->send_qp_list);
	list_head_init(&cq->recv_qp_list);
	pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE);

	cq->buf_external = ctx->extern_alloc.alloc && ctx->extern_alloc.free;
	if (!cq->buf_external)
		flags |= MANA_IB_CREATE_RNIC_CQ;

	if (cq->buf_external)
		cq->buf = ctx->extern_alloc.alloc(cq_size, ctx->extern_alloc.data);
	else
		cq->buf = mana_alloc_mem(cq_size);
	if (!cq->buf) {
		errno = ENOMEM;
		goto free_cq;
	}

	if (flags & MANA_IB_CREATE_RNIC_CQ)
		cq->cqe = cq_size / COMP_ENTRY_SIZE;
	else
		cq->cqe = cqe; // to preserve old behaviour for DPDK
	cq->head = INITIALIZED_OWNER_BIT(ilog32(cq->cqe) - 1);
	cq->last_armed_head = cq->head - 1;
	cq->ready_wcs = 0;

	cmd_drv = &cmd.drv_payload;
	cmd_drv->buf_addr = (uintptr_t)cq->buf;
	cmd_drv->flags = flags;
	resp.cqid = UINT32_MAX;

	ret = ibv_cmd_create_cq(context, cq->cqe, channel, comp_vector,
				&cq->ibcq, &cmd.ibv_cmd, sizeof(cmd),
				&resp.ibv_resp, sizeof(resp));
	if (ret) {
		verbs_err(verbs_get_ctx(context), "Failed to Create CQ\n");
		errno = ret;
		goto free_mem;
	}

	if (flags & MANA_IB_CREATE_RNIC_CQ) {
		cq->cqid = resp.cqid;
		if (cq->cqid == UINT32_MAX) {
			errno = ENODEV;
			goto destroy_cq;
		}
	}

	return &cq->ibcq;

destroy_cq:
	ibv_cmd_destroy_cq(&cq->ibcq);
free_mem:
	if (cq->buf_external)
		ctx->extern_alloc.free(cq->buf, ctx->extern_alloc.data);
	else
		munmap(cq->buf, cq_size);
free_cq:
	free(cq);
	return NULL;
}

int mana_destroy_cq(struct ibv_cq *ibcq)
{
	struct mana_cq *cq = container_of(ibcq, struct mana_cq, ibcq);
	struct mana_context *ctx = to_mctx(ibcq->context);
	int ret;

	pthread_spin_lock(&cq->lock);
	ret = ibv_cmd_destroy_cq(ibcq);
	if (ret) {
		verbs_err(verbs_get_ctx(ibcq->context), "Failed to Destroy CQ\n");
		pthread_spin_unlock(&cq->lock);
		return ret;
	}
	pthread_spin_destroy(&cq->lock);

	if (cq->buf_external)
		ctx->extern_alloc.free(cq->buf, ctx->extern_alloc.data);
	else
		munmap(cq->buf, cq->cqe * COMP_ENTRY_SIZE);

	free(cq);

	return ret;
}

int mana_arm_cq(struct ibv_cq *ibcq, int solicited)
{
	struct mana_cq *cq = container_of(ibcq, struct mana_cq, ibcq);

	if (solicited)
		return -EOPNOTSUPP;
	if (cq->cqid == UINT32_MAX)
		return -EINVAL;

	gdma_ring_cq_doorbell(cq);
	return 0;
}

static inline uint32_t handle_rc_requester_cqe(struct mana_qp *qp, struct gdma_cqe *cqe)
{
	struct mana_gdma_queue *recv_queue = &qp->rc_qp.queues[USER_RC_RECV_QUEUE_REQUESTER];
	struct mana_gdma_queue *send_queue = &qp->rc_qp.queues[USER_RC_SEND_QUEUE_REQUESTER];
	uint32_t syndrome = cqe->rdma_cqe.rc_armed_completion.syndrome;
	uint32_t psn = cqe->rdma_cqe.rc_armed_completion.psn;
	struct rc_sq_shadow_wqe *shadow_wqe;
	uint32_t wcs = 0;

	if (!IB_IS_ACK(syndrome))
		return 0;

	if (!PSN_GT(psn, qp->rc_qp.sq_highest_completed_psn))
		return 0;

	qp->rc_qp.sq_highest_completed_psn = psn;

	if
(!PSN_LT(psn, qp->rc_qp.sq_psn)) return 0; while ((shadow_wqe = (struct rc_sq_shadow_wqe *) shadow_queue_get_next_to_complete(&qp->shadow_sq)) != NULL) { if (PSN_LT(psn, shadow_wqe->end_psn)) break; send_queue->cons_idx += shadow_wqe->header.posted_wqe_size_in_bu; send_queue->cons_idx &= GDMA_QUEUE_OFFSET_MASK; recv_queue->cons_idx += shadow_wqe->read_posted_wqe_size_in_bu; recv_queue->cons_idx &= GDMA_QUEUE_OFFSET_MASK; uint32_t offset = shadow_wqe->header.unmasked_queue_offset + shadow_wqe->header.posted_wqe_size_in_bu; mana_ib_update_shared_mem_left_offset(qp, offset & GDMA_QUEUE_OFFSET_MASK); shadow_queue_advance_next_to_complete(&qp->shadow_sq); if (shadow_wqe->header.flags != MANA_NO_SIGNAL_WC) wcs++; } uint32_t prev_psn = PSN_DEC(qp->rc_qp.sq_psn); if (qp->rc_qp.sq_highest_completed_psn == prev_psn) gdma_arm_normal_cqe(recv_queue, qp->rc_qp.sq_psn); else gdma_arm_normal_cqe(recv_queue, prev_psn); return wcs; } static inline uint32_t handle_rc_responder_cqe(struct mana_qp *qp, struct gdma_cqe *cqe) { struct mana_gdma_queue *recv_queue = &qp->rc_qp.queues[USER_RC_RECV_QUEUE_RESPONDER]; struct rc_rq_shadow_wqe *shadow_wqe; shadow_wqe = (struct rc_rq_shadow_wqe *)shadow_queue_get_next_to_complete(&qp->shadow_rq); if (!shadow_wqe) return 0; uint32_t offset_cqe = cqe->rdma_cqe.rc_recv.rx_wqe_offset / GDMA_WQE_ALIGNMENT_UNIT_SIZE; uint32_t offset_wqe = shadow_wqe->header.unmasked_queue_offset & GDMA_QUEUE_OFFSET_MASK; if (offset_cqe != offset_wqe) return 0; shadow_wqe->byte_len = cqe->rdma_cqe.rc_recv.msg_len; shadow_wqe->imm_or_rkey = cqe->rdma_cqe.rc_recv.imm_data; switch (cqe->rdma_cqe.cqe_type) { case CQE_TYPE_RC_WRITE_IMM: shadow_wqe->header.opcode = IBV_WC_RECV_RDMA_WITH_IMM; SWITCH_FALLTHROUGH; case CQE_TYPE_RC_SEND_IMM: shadow_wqe->header.flags |= IBV_WC_WITH_IMM; break; case CQE_TYPE_RC_SEND_INV: shadow_wqe->header.flags |= IBV_WC_WITH_INV; break; default: break; } recv_queue->cons_idx += shadow_wqe->header.posted_wqe_size_in_bu; recv_queue->cons_idx &= GDMA_QUEUE_OFFSET_MASK; shadow_queue_advance_next_to_complete(&qp->shadow_rq); return 1; } static inline uint32_t mana_handle_cqe(struct mana_context *ctx, struct gdma_cqe *cqe) { struct mana_qp *qp; if (cqe->is_sq) // impossible for rc return 0; qp = mana_get_qp_from_rq(ctx, cqe->wqid); if (!qp) return 0; if (cqe->rdma_cqe.cqe_type == CQE_TYPE_ARMED_CMPL) return handle_rc_requester_cqe(qp, cqe); else return handle_rc_responder_cqe(qp, cqe); } static inline int gdma_read_cqe(struct mana_cq *cq, struct gdma_cqe *cqe) { uint32_t new_entry_owner_bits; uint32_t old_entry_owner_bits; struct gdma_cqe *current_cqe; uint32_t owner_bits; current_cqe = ((struct gdma_cqe *)cq->buf) + (cq->head % cq->cqe); new_entry_owner_bits = (cq->head / cq->cqe) & CQ_OWNER_MASK; old_entry_owner_bits = (cq->head / cq->cqe - 1) & CQ_OWNER_MASK; owner_bits = current_cqe->owner_bits; if (owner_bits == old_entry_owner_bits) return 0; /* no new entry */ if (owner_bits != new_entry_owner_bits) return -1; /*overflow detected*/ udma_from_device_barrier(); *cqe = *current_cqe; cq->head++; return 1; } static void fill_verbs_from_shadow_wqe(struct mana_qp *qp, struct ibv_wc *wc, const struct shadow_wqe_header *shadow_wqe) { const struct rc_rq_shadow_wqe *rc_wqe = (const struct rc_rq_shadow_wqe *)shadow_wqe; wc->wr_id = shadow_wqe->wr_id; wc->status = IBV_WC_SUCCESS; wc->opcode = shadow_wqe->opcode; wc->vendor_err = 0; wc->wc_flags = shadow_wqe->flags; wc->qp_num = qp->ibqp.qp.qp_num; wc->pkey_index = 0; if (shadow_wqe->opcode & IBV_WC_RECV) { wc->byte_len = 
rc_wqe->byte_len; wc->imm_data = htobe32(rc_wqe->imm_or_rkey); } } static int mana_process_completions(struct mana_cq *cq, int nwc, struct ibv_wc *wc) { struct shadow_wqe_header *shadow_wqe; struct mana_qp *qp; int wc_index = 0; /* process send shadow queue completions */ list_for_each(&cq->send_qp_list, qp, send_cq_node) { while ((shadow_wqe = shadow_queue_get_next_to_consume(&qp->shadow_sq)) != NULL) { if (wc_index >= nwc && shadow_wqe->flags != MANA_NO_SIGNAL_WC) goto out; if (shadow_wqe->flags != MANA_NO_SIGNAL_WC) { fill_verbs_from_shadow_wqe(qp, &wc[wc_index], shadow_wqe); wc_index++; } shadow_queue_advance_consumer(&qp->shadow_sq); } } /* process recv shadow queue completions */ list_for_each(&cq->recv_qp_list, qp, recv_cq_node) { while ((shadow_wqe = shadow_queue_get_next_to_consume(&qp->shadow_rq)) != NULL) { if (wc_index >= nwc) goto out; fill_verbs_from_shadow_wqe(qp, &wc[wc_index], shadow_wqe); wc_index++; shadow_queue_advance_consumer(&qp->shadow_rq); } } out: return wc_index; } int mana_poll_cq(struct ibv_cq *ibcq, int nwc, struct ibv_wc *wc) { struct mana_cq *cq = container_of(ibcq, struct mana_cq, ibcq); struct mana_context *ctx = to_mctx(ibcq->context); struct gdma_cqe gdma_cqe; int num_polled = 0; int ret; pthread_spin_lock(&cq->lock); while (cq->ready_wcs < nwc) { ret = gdma_read_cqe(cq, &gdma_cqe); if (ret < 0) { num_polled = -1; goto out; } if (ret == 0) break; cq->ready_wcs += mana_handle_cqe(ctx, &gdma_cqe); } num_polled = mana_process_completions(cq, nwc, wc); cq->ready_wcs -= num_polled; out: pthread_spin_unlock(&cq->lock); return num_polled; } rdma-core-56.1/providers/mana/doorbells.h000066400000000000000000000046661477342711600204350ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* * Copyright (c) 2024, Microsoft Corporation. All rights reserved. 
 */

#ifndef _DOORBELLS_H_
#define _DOORBELLS_H_

#include 
#include 
#include "mana.h"

#define GDMA_CQE_OWNER_BITS 3
#define CQ_OWNER_MASK ((1 << (GDMA_CQE_OWNER_BITS)) - 1)

#define DOORBELL_OFFSET_SQ		0x0
#define DOORBELL_OFFSET_RQ		0x400
#define DOORBELL_OFFSET_RQ_CLIENT	0x408
#define DOORBELL_OFFSET_CQ		0x800

union gdma_doorbell_entry {
	uint64_t as_uint64;
	struct {
		uint64_t id	  : 24;
		uint64_t reserved : 8;
		uint64_t prod_idx : 31;
		uint64_t arm	  : 1;
	} cq;
	struct {
		uint32_t id	  : 24;
		uint32_t wqe_cnt  : 8;
		uint32_t prod_idx;
	} rx;
	struct {
		uint32_t id	  : 24;
		uint32_t reserved : 8;
		uint32_t prod_idx;
	} tx;
	struct {
		uint64_t id	  : 24;
		uint64_t high	  : 8;
		uint64_t low	  : 32;
	} rqe_client;
}; /* HW DATA */

static inline void gdma_ring_recv_doorbell(struct mana_gdma_queue *wq, uint8_t wqe_cnt)
{
	union gdma_doorbell_entry e;

	e.as_uint64 = 0;
	e.rx.id = wq->id;
	e.rx.prod_idx = wq->prod_idx * GDMA_WQE_ALIGNMENT_UNIT_SIZE;
	e.rx.wqe_cnt = wqe_cnt;

	udma_to_device_barrier();
	mmio_write64(wq->db_page + DOORBELL_OFFSET_RQ, e.as_uint64);
	mmio_flush_writes();
}

static inline void gdma_ring_send_doorbell(struct mana_gdma_queue *wq)
{
	union gdma_doorbell_entry e;

	e.as_uint64 = 0;
	e.tx.id = wq->id;
	e.tx.prod_idx = wq->prod_idx * GDMA_WQE_ALIGNMENT_UNIT_SIZE;

	udma_to_device_barrier();
	mmio_write64(wq->db_page + DOORBELL_OFFSET_SQ, e.as_uint64);
	mmio_flush_writes();
}

static inline void gdma_arm_normal_cqe(struct mana_gdma_queue *wq, uint32_t psn)
{
	union gdma_doorbell_entry e;

	e.as_uint64 = 0;
	e.rqe_client.id = wq->id;
	e.rqe_client.high = 1;
	e.rqe_client.low = psn;

	udma_to_device_barrier();
	mmio_write64(wq->db_page + DOORBELL_OFFSET_RQ_CLIENT, e.as_uint64);
	mmio_flush_writes();
}

static inline void gdma_ring_cq_doorbell(struct mana_cq *cq)
{
	union gdma_doorbell_entry e;

	// To address the use-case of ibv that re-arms the CQ without polling
	if (cq->last_armed_head == cq->head)
		cq->last_armed_head = cq->head + 1;
	else
		cq->last_armed_head = cq->head;

	e.as_uint64 = 0;
	e.cq.id = cq->cqid;
	e.cq.prod_idx = cq->last_armed_head % (cq->cqe << GDMA_CQE_OWNER_BITS);
	e.cq.arm = 1;

	udma_to_device_barrier();
	mmio_write64(cq->db_page + DOORBELL_OFFSET_CQ, e.as_uint64);
	mmio_flush_writes();
}

#endif //_DOORBELLS_H_
rdma-core-56.1/providers/mana/gdma.h000066400000000000000000000117001477342711600173430ustar00rootroot00000000000000
/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
/*
 * Copyright (c) 2024, Microsoft Corporation. All rights reserved.
 */

#ifndef _GDMA_H_
#define _GDMA_H_

#include 
#include 
#include 
#include 
#include 
#include 

#define GDMA_QUEUE_OFFSET_WIDTH 27
#define GDMA_QUEUE_OFFSET_MASK ((1 << GDMA_QUEUE_OFFSET_WIDTH) - 1)

#define GDMA_COMP_DATA_SIZE 60

#define IB_SYNDROME_ACK(credits) (0x00 + (credits))
#define IB_SYNDROME_RNR_NAK(timer) (0x20 + (timer))
#define IB_SYNDROME_NAK(code) (0x60 + (code))
#define IB_IS_ACK(syndrome) (((syndrome) & 0xE0) == IB_SYNDROME_ACK(0))

enum gdma_work_req_flags {
	GDMA_WORK_REQ_NONE = 0,
	GDMA_WORK_REQ_OOB_IN_SGL = BIT(0),
	GDMA_WORK_REQ_SGL_DIRECT = BIT(1),
	GDMA_WORK_REQ_CONSUME_CREDIT = BIT(2),
	GDMA_WORK_REQ_FENCE = BIT(3),
	GDMA_WORK_REQ_CHECK_SN = BIT(4),
	GDMA_WORK_REQ_PAD_DATA_BY_FIRST_SGE_SIZE = BIT(5),
	GDMA_WORK_REQ_EXTRA_LARGE_OOB = BIT(6),
};

union gdma_oob {
	struct {
		uint32_t num_padding_sgls:5;
		uint32_t reserved1:19;
		uint32_t last_vbytes:8;
		uint32_t num_sgl_entries:8;
		uint32_t inline_client_oob_size:3;
		uint32_t client_oob_in_sgl:1;
		uint32_t consume_credit:1;
		uint32_t fence:1;
		uint32_t reserved2:2;
		uint32_t client_data_unit:14;
		uint32_t check_sn:1;
		uint32_t sgl_direct:1;
	} tx;
	struct {
		uint32_t reserved1;
		uint32_t num_sgl_entries:8;
		uint32_t inline_client_oob_size:3;
		uint32_t reserved2:19;
		uint32_t check_sn:1;
		uint32_t reserved3:1;
	} rx;
}; /* HW DATA */

/* The 16-byte struct is part of the GDMA work queue entry (WQE). */
struct gdma_sge {
	uint64_t address;
	uint32_t mem_key;
	uint32_t size;
}; /* HW DATA */

struct rdma_recv_oob {
	uint32_t psn_start:24;
	uint32_t reserved1:8;
	uint32_t psn_range:24;
	uint32_t reserved2:8;
}; /* HW DATA */

struct extra_large_wqe {
	__le32 immediate;
	uint32_t reserved;
	uint64_t padding;
}; /* HW DATA */

struct rdma_send_oob {
	uint32_t wqe_type:5;
	uint32_t fence:1;
	uint32_t signaled:1;
	uint32_t solicited:1;
	uint32_t psn:24;

	uint32_t ssn:24; // also remote_qpn
	uint32_t reserved1:8;
	union {
		uint32_t req_details[4];
		union {
			__le32 immediate;
			uint32_t invalidate_key;
		} send;
		struct {
			uint32_t address_hi;
			uint32_t address_low;
			uint32_t rkey;
			uint32_t dma_len;
		} rdma;
	};
}; /* HW DATA */

struct gdma_wqe {
	// in units of 32-byte blocks, masked by GDMA_QUEUE_OFFSET_MASK.
	uint32_t unmasked_wqe_index;
	uint32_t size_in_bu;

	// Client oob is either 8 bytes or 24 bytes, so DmaOob + ClientOob will never wrap.
	union gdma_oob *gdma_oob;
	void *client_oob;
	uint32_t client_oob_size;

	struct gdma_sge *sgl1;
	uint32_t num_sge1;
	// In case SGL wraps in the queue buffer.
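	/*
	 * When the scatter list crosses the end of the ring buffer, sgl1 and
	 * num_sge1 describe the entries up to the end of the buffer, while
	 * sgl2 and num_sge2 describe the remainder, wrapped back to the start
	 * of the buffer (see gdma_get_current_wqe() in wr.c).
	 */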
	struct gdma_sge *sgl2;
	uint32_t num_sge2;
};

enum wqe_opcode_types {
	WQE_TYPE_UD_SEND = 0,
	WQE_TYPE_UD_SEND_IMM = 1,
	WQE_TYPE_RC_SEND = 2,
	WQE_TYPE_RC_SEND_IMM = 3,
	WQE_TYPE_RC_SEND_INV = 4,
	WQE_TYPE_WRITE = 5,
	WQE_TYPE_WRITE_IMM = 6,
	WQE_TYPE_READ = 7,
	WQE_TYPE_UD_RECV = 8,
	WQE_TYPE_RC_RECV = 9,
	WQE_TYPE_LOCAL_INV = 10,
	WQE_TYPE_REG_MR = 11,
	WQE_TYPE_MAX,
}; /* HW DATA */

static inline enum wqe_opcode_types convert_wr_to_hw_opcode(enum ibv_wr_opcode opcode)
{
	switch (opcode) {
	case IBV_WR_RDMA_WRITE:
		return WQE_TYPE_WRITE;
	case IBV_WR_RDMA_WRITE_WITH_IMM:
		return WQE_TYPE_WRITE_IMM;
	case IBV_WR_SEND:
		return WQE_TYPE_RC_SEND;
	case IBV_WR_SEND_WITH_IMM:
		return WQE_TYPE_RC_SEND_IMM;
	case IBV_WR_RDMA_READ:
		return WQE_TYPE_READ;
	default:
		return WQE_TYPE_MAX;
	}
}

enum {
	CQE_TYPE_NOP = 0,
	CQE_TYPE_UD_SEND = 1,
	CQE_TYPE_UD_SEND_IMM = 2,
	CQE_TYPE_RC_SEND = 3,
	CQE_TYPE_RC_SEND_IMM = 4,
	CQE_TYPE_RC_SEND_INV = 5,
	CQE_TYPE_RC_WRITE_IMM = 6,
	CQE_TYPE_ARMED_CMPL = 7,
	CQE_TYPE_LWR = 8,
	CQE_TYPE_RC_FENCE = 9,
	CQE_TYPE_MAX
}; /* HW DATA */

struct mana_rdma_cqe {
	uint32_t cqe_type	: 8;
	uint32_t vendor_error	: 8;
	uint32_t reserved1	: 16;
	union {
		uint32_t data[GDMA_COMP_DATA_SIZE / sizeof(uint32_t) - 4];
		struct {
			uint32_t msg_len;
			uint32_t psn : 24;
			uint32_t reserved : 8;
			uint32_t imm_data;
			uint32_t rx_wqe_offset;
		} rc_recv;
		struct {
			uint32_t sge_offset : 5;
			uint32_t rx_wqe_offset : 27;
			uint32_t sge_byte_offset;
		} ud_send;
		struct {
			uint32_t msg_len;
			uint32_t src_qpn : 24;
			uint32_t reserved : 8;
			uint32_t imm_data;
			uint32_t rx_wqe_offset;
		} ud_recv;
		struct {
			uint32_t reserved1;
			uint32_t psn : 24;
			uint32_t reserved2 : 8;
			uint32_t imm_data;
			uint32_t rx_wqe_offset;
		} rc_write_with_imm;
		struct {
			uint32_t msn : 24;
			uint32_t syndrome : 8;
			uint32_t psn : 24;
			uint32_t reserved : 8;
			uint32_t read_resp_psn : 24;
		} rc_armed_completion;
	};
	uint32_t timestamp_hi;
	uint32_t timestamp_lo;
	uint32_t reserved3;
}; /* HW DATA */

struct gdma_cqe {
	union {
		uint8_t data[GDMA_COMP_DATA_SIZE];
		struct mana_rdma_cqe rdma_cqe;
	};
	uint32_t wqid	: 24;
	uint32_t is_sq	: 1;
	uint32_t reserved : 4;
	uint32_t owner_bits : 3;
}; /* HW DATA */

#endif //_GDMA_H_
rdma-core-56.1/providers/mana/libmana.map000066400000000000000000000002611477342711600203640ustar00rootroot00000000000000
/* Export symbols should be added below according to
   Documentation/versioning.md document. */
MANA_1.0 {
	global:
		manadv_set_context_attr;
		manadv_init_obj;
	local: *;
};
rdma-core-56.1/providers/mana/man/000077500000000000000000000000001477342711600170365ustar00rootroot00000000000000rdma-core-56.1/providers/mana/man/CMakeLists.txt000066400000000000000000000001261477342711600215750ustar00rootroot00000000000000
rdma_man_pages(
  manadv.7.md
  manadv_init_obj.3.md
  manadv_set_context_attr.3.md
  )
rdma-core-56.1/providers/mana/man/manadv.7.md000066400000000000000000000034631477342711600210010ustar00rootroot00000000000000
---
layout: page
title: MANADV
section: 7
tagline: Verbs
date: 2022-05-16
header: "MANA Direct Verbs Manual"
footer: mana
---

# NAME

manadv - Direct verbs for mana devices

This provides low level access to mana devices to perform direct operations,
without general branching performed by libibverbs.

# DESCRIPTION

The libibverbs API is an abstract one. It is agnostic to any underlying
provider specific implementation. While this abstraction has the advantage of
user application portability, it has a performance penalty. For some
applications optimizing performance is more important than portability. The
mana direct verbs API is intended for such applications. It exposes mana
specific low level operations, allowing the application to bypass the
libibverbs API.

This version of the driver supports one QP type: IBV_QPT_RAW_PACKET. To use
this QP type, the application is required to use manadv_set_context_attr() to
set external buffer allocators for allocating queues, and use
manadv_init_obj() to obtain all the queue information. The application
implements its own queue operations, bypassing the libibverbs API for
sending/receiving traffic over the queues.

At the hardware layer, an IBV_QPT_RAW_PACKET QP shares the same hardware
resource as the Ethernet port used in the kernel. The software checks for
exclusive use of the hardware Ethernet port, and will fail the QP creation if
the port is already in use. To create an IBV_QPT_RAW_PACKET QP on a specified
port, the user needs to configure the system in such a way that this port is
not used by any other software (including the kernel). If the port is used,
ibv_create_qp() will fail with errno set to EBUSY.

Directly including manadv.h, together with linkage to the mana library, will
allow usage of this new interface.

# SEE ALSO

**verbs**(7)

# AUTHORS

Long Li
rdma-core-56.1/providers/mana/man/manadv_init_obj.3.md000066400000000000000000000025631477342711600226520ustar00rootroot00000000000000
---
layout: page
title: manadv_init_obj
section: 3
tagline: Verbs
---

# NAME

manadv_init_obj \- Initialize mana direct verbs object from ibv_xxx structures

# SYNOPSIS

```c
#include <infiniband/manadv.h>

int manadv_init_obj(struct manadv_obj *obj, uint64_t obj_type);
```

# DESCRIPTION

manadv_init_obj() initializes the manadv_xxx structs based on the supplied
type. The information for initialization is taken from the ibv_xxx structs
supplied as part of the input.

# ARGUMENTS

*obj*
:	The manadv_xxx structs to be returned.

```c
struct manadv_qp {
	void		*sq_buf;
	uint32_t	sq_count;
	uint32_t	sq_size;
	uint32_t	sq_id;
	uint32_t	tx_vp_offset;
	void		*db_page;
};

struct manadv_cq {
	void		*buf;
	uint32_t	count;
	uint32_t	cq_id;
};

struct manadv_rwq {
	void		*buf;
	uint32_t	count;
	uint32_t	size;
	uint32_t	wq_id;
	void		*db_page;
};

struct manadv_obj {
	struct {
		struct ibv_qp		*in;
		struct manadv_qp	*out;
	} qp;

	struct {
		struct ibv_cq		*in;
		struct manadv_cq	*out;
	} cq;

	struct {
		struct ibv_wq		*in;
		struct manadv_rwq	*out;
	} rwq;
};
```

*obj_type*
:	The types of the manadv_xxx structs to be returned.

```c
enum manadv_obj_type {
	MANADV_OBJ_QP	= 1 << 0,
	MANADV_OBJ_CQ	= 1 << 1,
	MANADV_OBJ_RWQ	= 1 << 2,
};
```

# RETURN VALUE

0 on success or the value of errno on failure (which indicates the failure
reason).

# AUTHORS

Long Li
rdma-core-56.1/providers/mana/man/manadv_set_context_attr.3.md000066400000000000000000000024701477342711600244430ustar00rootroot00000000000000
---
layout: page
title: manadv_set_context_attr
section: 3
tagline: Verbs
---

# NAME

manadv_set_context_attr - Set context attributes

# SYNOPSIS

```c
#include <infiniband/manadv.h>

int manadv_set_context_attr(struct ibv_context *context,
			    enum manadv_set_ctx_attr_type attr_type,
			    void *attr);
```

# DESCRIPTION

manadv_set_context_attr gives the ability to set vendor specific attributes on
the RDMA context.

# ARGUMENTS

*context*
:	RDMA device context to work on.

*attr_type*
:	The type of the provided attribute.

*attr*
:	Pointer to the attribute to be set.

## attr_type

```c
enum manadv_set_ctx_attr_type {
	/* Attribute type uint8_t */
	MANADV_CTX_ATTR_BUF_ALLOCATORS = 0,
};
```

*MANADV_CTX_ATTR_BUF_ALLOCATORS*
:	Provide an external buffer allocator

```c
struct manadv_ctx_allocators {
	void *(*alloc)(size_t size, void *priv_data);
	void (*free)(void *ptr, void *priv_data);
	void *data;
};
```

*alloc*
:	Function used for buffer allocation instead of libmana internal method

*free*
:	Function used to free buffers allocated by the alloc function

*data*
:	Metadata that can be used by the alloc and free functions

# RETURN VALUE

Returns 0 on success, or the value of errno on failure (which indicates the
failure reason).

# AUTHOR

Long Li
rdma-core-56.1/providers/mana/mana.c000066400000000000000000000165371477342711600173550ustar00rootroot00000000000000
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
/*
 * Copyright (c) 2022, Microsoft Corporation. All rights reserved.
 */

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include "mana.h"

DECLARE_DRV_CMD(mana_alloc_ucontext, IB_USER_VERBS_CMD_GET_CONTEXT, empty, empty);

DECLARE_DRV_CMD(mana_alloc_pd, IB_USER_VERBS_CMD_ALLOC_PD, empty, empty);

static const struct verbs_match_ent hca_table[] = {
	VERBS_DRIVER_ID(RDMA_DRIVER_MANA),
	{},
};

struct mana_context *to_mctx(struct ibv_context *ibctx)
{
	return container_of(ibctx, struct mana_context, ibv_ctx.context);
}

void *mana_alloc_mem(uint32_t size)
{
	void *buf;

	buf = mmap(NULL, size, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return NULL;
	return buf;
}

int mana_query_device_ex(struct ibv_context *context,
			 const struct ibv_query_device_ex_input *input,
			 struct ibv_device_attr_ex *attr, size_t attr_size)
{
	struct ib_uverbs_ex_query_device_resp resp;
	size_t resp_size = sizeof(resp);
	int ret;

	ret = ibv_cmd_query_device_any(context, input, attr, attr_size,
				       &resp, &resp_size);

	verbs_debug(verbs_get_ctx(context),
		    "device attr max_qp %d max_qp_wr %d max_cqe %d\n",
		    attr->orig_attr.max_qp, attr->orig_attr.max_qp_wr,
		    attr->orig_attr.max_cqe);

	return ret;
}

int mana_query_port(struct ibv_context *context, uint8_t port,
		    struct ibv_port_attr *attr)
{
	struct ibv_query_port cmd;

	return ibv_cmd_query_port(context, port, attr, &cmd, sizeof(cmd));
}

struct ibv_pd *mana_alloc_pd(struct ibv_context *context)
{
	struct ibv_alloc_pd cmd;
	struct mana_alloc_pd_resp resp;
	struct mana_pd *pd;
	int ret;

	pd = calloc(1, sizeof(*pd));
	if (!pd)
		return NULL;

	ret = ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd, sizeof(cmd),
			       &resp.ibv_resp, sizeof(resp));
	if (ret) {
		verbs_err(verbs_get_ctx(context), "Failed to allocate PD\n");
		errno = ret;
		free(pd);
		return NULL;
	}

	return &pd->ibv_pd;
}

struct ibv_pd *
mana_alloc_parent_domain(struct ibv_context *context,
			 struct ibv_parent_domain_init_attr *attr)
{
	struct mana_parent_domain *mparent_domain;

	if (ibv_check_alloc_parent_domain(attr)) {
		errno = EINVAL;
		return NULL;
	}

	if (!check_comp_mask(attr->comp_mask, IBV_PARENT_DOMAIN_INIT_ATTR_PD_CONTEXT)) {
		verbs_err(verbs_get_ctx(context),
			  "This driver supports IBV_PARENT_DOMAIN_INIT_ATTR_PD_CONTEXT only\n");
		errno = EOPNOTSUPP;
		return NULL;
	}

	mparent_domain = calloc(1, sizeof(*mparent_domain));
	if (!mparent_domain) {
		errno = ENOMEM;
		return NULL;
	}

	mparent_domain->mpd.mprotection_domain =
		container_of(attr->pd, struct mana_pd, ibv_pd);
	ibv_initialize_parent_domain(&mparent_domain->mpd.ibv_pd, attr->pd);

	if (attr->comp_mask & IBV_PARENT_DOMAIN_INIT_ATTR_PD_CONTEXT)
		mparent_domain->pd_context = attr->pd_context;
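	/*
	 * For RAW_PACKET QPs the application stores the port number in
	 * pd_context; mana_create_qp_raw() and mana_create_qp_ex_raw() read
	 * it back from the parent domain when creating the QP.
	 */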
	return &mparent_domain->mpd.ibv_pd;
}

int mana_dealloc_pd(struct ibv_pd *ibpd)
{
	int ret;
	struct mana_pd *pd = container_of(ibpd, struct mana_pd, ibv_pd);

	if (pd->mprotection_domain) {
		struct mana_parent_domain *parent_domain =
			container_of(pd, struct mana_parent_domain, mpd);

		free(parent_domain);
		return 0;
	}

	ret = ibv_cmd_dealloc_pd(ibpd);
	if (ret) {
		verbs_err(verbs_get_ctx(ibpd->context),
			  "Failed to deallocate PD\n");
		return ret;
	}

	free(pd);
	return 0;
}

struct ibv_mr *mana_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
			   uint64_t hca_va, int access)
{
	struct verbs_mr *vmr;
	struct ibv_reg_mr cmd;
	struct ib_uverbs_reg_mr_resp resp;
	int ret;

	vmr = malloc(sizeof(*vmr));
	if (!vmr)
		return NULL;

	ret = ibv_cmd_reg_mr(pd, addr, length, hca_va, access, vmr, &cmd,
			     sizeof(cmd), &resp, sizeof(resp));
	if (ret) {
		verbs_err(verbs_get_ctx(pd->context),
			  "Failed to register MR\n");
		errno = ret;
		free(vmr);
		return NULL;
	}

	return &vmr->ibv_mr;
}

int mana_dereg_mr(struct verbs_mr *vmr)
{
	int ret;

	ret = ibv_cmd_dereg_mr(vmr);
	if (ret) {
		verbs_err(verbs_get_ctx(vmr->ibv_mr.context),
			  "Failed to deregister MR\n");
		return ret;
	}

	free(vmr);
	return 0;
}

static void mana_free_context(struct ibv_context *ibctx)
{
	struct mana_context *context = to_mctx(ibctx);
	int i;

	for (i = 0; i < MANA_QP_TABLE_SIZE; ++i) {
		if (context->qp_table[i].refcnt)
			free(context->qp_table[i].table);
	}
	pthread_mutex_destroy(&context->qp_table_mutex);

	munmap(context->db_page, DOORBELL_PAGE_SIZE);
	verbs_uninit_context(&context->ibv_ctx);
	free(context);
}

static const struct verbs_context_ops mana_ctx_ops = {
	.alloc_pd = mana_alloc_pd,
	.alloc_parent_domain = mana_alloc_parent_domain,
	.create_cq = mana_create_cq,
	.create_qp = mana_create_qp,
	.create_qp_ex = mana_create_qp_ex,
	.create_rwq_ind_table = mana_create_rwq_ind_table,
	.create_wq = mana_create_wq,
	.dealloc_pd = mana_dealloc_pd,
	.dereg_mr = mana_dereg_mr,
	.destroy_cq = mana_destroy_cq,
	.destroy_qp = mana_destroy_qp,
	.destroy_rwq_ind_table = mana_destroy_rwq_ind_table,
	.destroy_wq = mana_destroy_wq,
	.free_context = mana_free_context,
	.modify_wq = mana_modify_wq,
	.modify_qp = mana_modify_qp,
	.poll_cq = mana_poll_cq,
	.post_recv = mana_post_recv,
	.post_send = mana_post_send,
	.query_device_ex = mana_query_device_ex,
	.query_port = mana_query_port,
	.reg_mr = mana_reg_mr,
	.req_notify_cq = mana_arm_cq,
};

static struct verbs_device *mana_device_alloc(struct verbs_sysfs_dev *sysfs_dev)
{
	struct mana_device *dev;

	dev = calloc(1, sizeof(*dev));
	if (!dev)
		return NULL;

	return &dev->verbs_dev;
}

static void mana_uninit_device(struct verbs_device *verbs_device)
{
	struct mana_device *dev =
		container_of(verbs_device, struct mana_device, verbs_dev);

	free(dev);
}

static struct verbs_context *mana_alloc_context(struct ibv_device *ibdev,
						int cmd_fd, void *private_data)
{
	int ret, i;
	struct mana_context *context;
	struct mana_alloc_ucontext_resp resp;
	struct ibv_get_context cmd;

	context = verbs_init_and_alloc_context(ibdev, cmd_fd, context, ibv_ctx,
					       RDMA_DRIVER_MANA);
	if (!context)
		return NULL;

	ret = ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof(cmd),
				  &resp.ibv_resp, sizeof(resp));
	if (ret) {
		verbs_err(&context->ibv_ctx, "Failed to get ucontext\n");
		errno = ret;
		goto free_ctx;
	}

	verbs_set_ops(&context->ibv_ctx, &mana_ctx_ops);

	pthread_mutex_init(&context->qp_table_mutex, NULL);
	for (i = 0; i < MANA_QP_TABLE_SIZE; ++i)
		context->qp_table[i].refcnt = 0;

	context->db_page = mmap(NULL, DOORBELL_PAGE_SIZE, PROT_WRITE,
				MAP_SHARED, context->ibv_ctx.context.cmd_fd, 0);
	if (context->db_page == MAP_FAILED) {
		verbs_err(&context->ibv_ctx, "Failed to map doorbell page\n");
		errno = ENOENT;
		goto free_ctx;
	}
	verbs_debug(&context->ibv_ctx, "Mapped db_page=%p\n", context->db_page);

	return &context->ibv_ctx;

free_ctx:
	verbs_uninit_context(&context->ibv_ctx);
	free(context);
	return NULL;
}

static const struct verbs_device_ops mana_dev_ops = {
	.name = "mana",
	.match_min_abi_version = MANA_IB_UVERBS_ABI_VERSION,
	.match_max_abi_version = MANA_IB_UVERBS_ABI_VERSION,
	.match_table = hca_table,
	.alloc_device = mana_device_alloc,
	.uninit_device = mana_uninit_device,
	.alloc_context = mana_alloc_context,
};

PROVIDER_DRIVER(mana, mana_dev_ops);
rdma-core-56.1/providers/mana/mana.h000066400000000000000000000133131477342711600173510ustar00rootroot00000000000000
/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
/*
 * Copyright (c) 2022, Microsoft Corporation. All rights reserved.
 */

#ifndef _MANA_H_
#define _MANA_H_

#include "manadv.h"
#include 
#include "shadow_queue.h"

#define COMP_ENTRY_SIZE 64
#define MANA_IB_TOEPLITZ_HASH_KEY_SIZE_IN_BYTES 40

#define DMA_OOB_SIZE 8

#define INLINE_OOB_SMALL_SIZE 8
#define INLINE_OOB_LARGE_SIZE 24

#define GDMA_WQE_ALIGNMENT_UNIT_SIZE 32

/* The size of a SGE in WQE */
#define SGE_SIZE 16

#define DOORBELL_PAGE_SIZE 4096
#define MANA_PAGE_SIZE 4096

#define MANA_QP_TABLE_SIZE 4096
#define MANA_QP_TABLE_SHIFT 12
#define MANA_QP_TABLE_MASK (MANA_QP_TABLE_SIZE - 1)

/* PSN 24 bit arithmetic comparisons */
#define PSN_MASK 0xFFFFFF
#define PSN_SIGN_BIT 0x800000
#define PSN_GE(PSN1, PSN2) ((((PSN1) - (PSN2)) & PSN_SIGN_BIT) == 0)
#define PSN_GT(PSN1, PSN2) PSN_GE(PSN1, (PSN2) + 1)
#define PSN_LE(PSN1, PSN2) PSN_GE(PSN2, PSN1)
#define PSN_LT(PSN1, PSN2) PSN_GT(PSN2, PSN1)
#define MTU_SIZE(MTU) (1U << ((MTU) + 7))
#define PSN_DELTA(MSG_SIZE, MTU) max(1U, ((MSG_SIZE) + MTU_SIZE(MTU) - 1) >> (MTU + 7))
#define PSN_DEC(PSN) (((PSN) - 1) & PSN_MASK)
#define PSN_INC(PSN) (((PSN) + 1) & PSN_MASK)
#define PSN_ADD(PSN, DELTA) (((PSN) + (DELTA)) & PSN_MASK)

enum user_queue_types {
	USER_RC_SEND_QUEUE_REQUESTER = 0,
	USER_RC_SEND_QUEUE_RESPONDER = 1,
	USER_RC_RECV_QUEUE_REQUESTER = 2,
	USER_RC_RECV_QUEUE_RESPONDER = 3,
	USER_RC_QUEUE_TYPE_MAX = 4,
};

static inline uint32_t align_hw_size(uint32_t size)
{
	size = roundup_pow_of_two(size);
	return align(size, MANA_PAGE_SIZE);
}

static inline uint32_t get_wqe_size(uint32_t sge)
{
	uint32_t wqe_size = sge * SGE_SIZE + DMA_OOB_SIZE + INLINE_OOB_SMALL_SIZE;

	return align(wqe_size, GDMA_WQE_ALIGNMENT_UNIT_SIZE);
}

static inline uint32_t get_large_wqe_size(uint32_t sge)
{
	uint32_t wqe_size = sge * SGE_SIZE + DMA_OOB_SIZE + INLINE_OOB_LARGE_SIZE;

	return align(wqe_size, GDMA_WQE_ALIGNMENT_UNIT_SIZE);
}

struct mana_context {
	struct verbs_context ibv_ctx;
	struct {
		struct mana_qp **table;
		int refcnt;
	} qp_table[MANA_QP_TABLE_SIZE];
	pthread_mutex_t qp_table_mutex;

	struct manadv_ctx_allocators extern_alloc;
	void *db_page;
};

struct mana_rwq_ind_table {
	struct ibv_rwq_ind_table ib_ind_table;
	uint32_t ind_tbl_size;
	struct ibv_wq **ind_tbl;
};

struct mana_gdma_queue {
	uint32_t id;
	uint32_t size;
	uint32_t prod_idx;
	uint32_t cons_idx;
	void *db_page;
	void *buffer;
};

struct mana_ib_raw_qp {
	void *send_buf;
	uint32_t send_buf_size;
	int send_wqe_count;
	uint32_t sqid;
	uint32_t tx_vp_offset;
};

struct mana_ib_rc_qp {
	struct mana_gdma_queue queues[USER_RC_QUEUE_TYPE_MAX];

	uint32_t sq_ssn;
	uint32_t sq_psn;
	uint32_t sq_highest_completed_psn;
};

struct mana_qp {
	struct verbs_qp ibqp;
	pthread_spinlock_t sq_lock;
	pthread_spinlock_t rq_lock;

	union {
		struct mana_ib_raw_qp raw_qp;
		struct mana_ib_rc_qp rc_qp;
	};

	enum ibv_mtu mtu;

	struct shadow_queue shadow_rq;
	struct shadow_queue shadow_sq;

	struct list_node send_cq_node;
	struct list_node recv_cq_node;
};

struct mana_wq {
	struct ibv_wq ibwq;
	void *buf;
	uint32_t buf_size;
	uint32_t wqe;
	uint32_t sge;
	uint32_t wqid;
};

struct mana_cq {
	struct ibv_cq ibcq;
	uint32_t cqe;
	uint32_t cqid;
	void *buf;

	pthread_spinlock_t lock;
	uint32_t head;
	uint32_t last_armed_head;
	uint32_t ready_wcs;
	void *db_page;
	/* list of qp's that use this cq for send completions */
	struct list_head send_qp_list;
	/* list of qp's that use this cq for recv completions */
	struct list_head recv_qp_list;
	bool buf_external;
};

struct mana_device {
	struct verbs_device verbs_dev;
};

struct mana_pd {
	struct ibv_pd ibv_pd;
	struct mana_pd *mprotection_domain;
};

struct mana_parent_domain {
	struct mana_pd mpd;
	void *pd_context;
};

struct mana_context *to_mctx(struct ibv_context *ibctx);

void *mana_alloc_mem(uint32_t size);

int mana_query_device_ex(struct ibv_context *context,
			 const struct ibv_query_device_ex_input *input,
			 struct ibv_device_attr_ex *attr, size_t attr_size);

int mana_query_port(struct ibv_context *context, uint8_t port,
		    struct ibv_port_attr *attr);

struct ibv_pd *mana_alloc_pd(struct ibv_context *context);
struct ibv_pd *
mana_alloc_parent_domain(struct ibv_context *context,
			 struct ibv_parent_domain_init_attr *attr);

int mana_dealloc_pd(struct ibv_pd *pd);

struct ibv_mr *mana_reg_mr(struct ibv_pd *pd, void *addr, size_t length,
			   uint64_t hca_va, int access);

int mana_dereg_mr(struct verbs_mr *vmr);

struct ibv_cq *mana_create_cq(struct ibv_context *context, int cqe,
			      struct ibv_comp_channel *channel,
			      int comp_vector);

int mana_destroy_cq(struct ibv_cq *cq);

int mana_poll_cq(struct ibv_cq *ibcq, int nwc, struct ibv_wc *wc);

struct ibv_wq *mana_create_wq(struct ibv_context *context,
			      struct ibv_wq_init_attr *attr);

int mana_destroy_wq(struct ibv_wq *wq);
int mana_modify_wq(struct ibv_wq *ibwq, struct ibv_wq_attr *attr);

struct ibv_rwq_ind_table *
mana_create_rwq_ind_table(struct ibv_context *context,
			  struct ibv_rwq_ind_table_init_attr *init_attr);

int mana_destroy_rwq_ind_table(struct ibv_rwq_ind_table *rwq_ind_table);

struct ibv_qp *mana_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr);

struct ibv_qp *mana_create_qp_ex(struct ibv_context *context,
				 struct ibv_qp_init_attr_ex *attr);

int mana_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask);

int mana_destroy_qp(struct ibv_qp *ibqp);

int mana_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr,
		   struct ibv_recv_wr **bad);

int mana_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr,
		   struct ibv_send_wr **bad);

int mana_arm_cq(struct ibv_cq *ibcq, int solicited);

struct mana_qp *mana_get_qp_from_rq(struct mana_context *ctx, uint32_t qpn);

#endif
rdma-core-56.1/providers/mana/manadv.c000066400000000000000000000042521477342711600177000ustar00rootroot00000000000000
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
/*
 * Copyright (c) 2022, Microsoft Corporation. All rights reserved.
 */

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include "mana.h"

int manadv_set_context_attr(struct ibv_context *ibv_ctx,
			    enum manadv_set_ctx_attr_type type, void *attr)
{
	struct mana_context *ctx = to_mctx(ibv_ctx);
	int ret;

	switch (type) {
	case MANADV_CTX_ATTR_BUF_ALLOCATORS:
		ctx->extern_alloc = *((struct manadv_ctx_allocators *)attr);
		ret = 0;
		break;
	default:
		verbs_err(verbs_get_ctx(ibv_ctx),
			  "Unsupported context type %d\n", type);
		ret = EOPNOTSUPP;
	}

	return ret;
}

int manadv_init_obj(struct manadv_obj *obj, uint64_t obj_type)
{
	if (obj_type & ~(MANADV_OBJ_QP | MANADV_OBJ_CQ | MANADV_OBJ_RWQ))
		return EINVAL;

	if (obj_type & MANADV_OBJ_QP) {
		struct ibv_qp *ibqp = obj->qp.in;
		struct mana_qp *qp = container_of(ibqp, struct mana_qp, ibqp.qp);

		struct ibv_context *context = ibqp->context;
		struct mana_context *ctx = to_mctx(context);

		obj->qp.out->sq_buf = qp->raw_qp.send_buf;
		obj->qp.out->sq_count = qp->raw_qp.send_wqe_count;
		obj->qp.out->sq_size = qp->raw_qp.send_buf_size;
		obj->qp.out->sq_id = qp->raw_qp.sqid;
		obj->qp.out->tx_vp_offset = qp->raw_qp.tx_vp_offset;
		obj->qp.out->db_page = ctx->db_page;
	}

	if (obj_type & MANADV_OBJ_CQ) {
		struct ibv_cq *ibcq = obj->cq.in;
		struct mana_cq *cq = container_of(ibcq, struct mana_cq, ibcq);

		obj->cq.out->buf = cq->buf;
		obj->cq.out->count = cq->cqe;
		obj->cq.out->cq_id = cq->cqid;
	}

	if (obj_type & MANADV_OBJ_RWQ) {
		struct ibv_wq *ibwq = obj->rwq.in;
		struct mana_wq *wq = container_of(ibwq, struct mana_wq, ibwq);

		struct ibv_context *context = ibwq->context;
		struct mana_context *ctx = to_mctx(context);

		obj->rwq.out->buf = wq->buf;
		obj->rwq.out->count = wq->wqe;
		obj->rwq.out->size = wq->buf_size;
		obj->rwq.out->wq_id = wq->wqid;
		obj->rwq.out->db_page = ctx->db_page;
	}

	return 0;
}
rdma-core-56.1/providers/mana/manadv.h000066400000000000000000000025751477342711600177110ustar00rootroot00000000000000
/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
/*
 * Copyright (c) 2022, Microsoft Corporation. All rights reserved.
 */

#ifndef _MANA_DV_H_
#define _MANA_DV_H_

#include 
#include 
#include 
#include 

#ifdef __cplusplus
extern "C" {
#endif

enum manadv_set_ctx_attr_type {
	/* Attribute type uint8_t */
	MANADV_CTX_ATTR_BUF_ALLOCATORS = 0,
};

struct manadv_ctx_allocators {
	void *(*alloc)(size_t size, void *priv_data);
	void (*free)(void *ptr, void *priv_data);
	void *data;
};

int manadv_set_context_attr(struct ibv_context *ibv_ctx,
			    enum manadv_set_ctx_attr_type type, void *attr);

struct manadv_qp {
	void *sq_buf;
	uint32_t sq_count;
	uint32_t sq_size;
	uint32_t sq_id;
	uint32_t tx_vp_offset;
	void *db_page;
};

struct manadv_cq {
	void *buf;
	uint32_t count;
	uint32_t cq_id;
};

struct manadv_rwq {
	void *buf;
	uint32_t count;
	uint32_t size;
	uint32_t wq_id;
	void *db_page;
};

struct manadv_obj {
	struct {
		struct ibv_qp *in;
		struct manadv_qp *out;
	} qp;

	struct {
		struct ibv_cq *in;
		struct manadv_cq *out;
	} cq;

	struct {
		struct ibv_wq *in;
		struct manadv_rwq *out;
	} rwq;
};

enum manadv_obj_type {
	MANADV_OBJ_QP = 1 << 0,
	MANADV_OBJ_CQ = 1 << 1,
	MANADV_OBJ_RWQ = 1 << 2,
};

int manadv_init_obj(struct manadv_obj *obj, uint64_t obj_type);

#ifdef __cplusplus
}
#endif

#endif /* _MANA_DV_H_ */
rdma-core-56.1/providers/mana/qp.c000066400000000000000000000345221477342711600170550ustar00rootroot00000000000000
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
/*
 * Copyright (c) 2022, Microsoft Corporation. All rights reserved.
 */

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include "mana.h"
#include "rollback.h"
#include "doorbells.h"

DECLARE_DRV_CMD(mana_create_qp, IB_USER_VERBS_CMD_CREATE_QP,
		mana_ib_create_qp, mana_ib_create_qp_resp);

DECLARE_DRV_CMD(mana_create_qp_ex, IB_USER_VERBS_EX_CMD_CREATE_QP,
		mana_ib_create_qp_rss, mana_ib_create_qp_rss_resp);

DECLARE_DRV_CMD(mana_create_rc_qp, IB_USER_VERBS_CMD_CREATE_QP,
		mana_ib_create_rc_qp, mana_ib_create_rc_qp_resp);

static struct ibv_qp *mana_create_qp_raw(struct ibv_pd *ibpd,
					 struct ibv_qp_init_attr *attr)
{
	int ret;
	struct mana_cq *cq;
	struct mana_qp *qp;
	struct mana_pd *pd = container_of(ibpd, struct mana_pd, ibv_pd);
	struct mana_parent_domain *mpd;
	uint32_t port;

	struct mana_create_qp qp_cmd = {};
	struct mana_create_qp_resp qp_resp = {};

	struct mana_ib_create_qp *qp_cmd_drv;
	struct mana_ib_create_qp_resp *qp_resp_drv;

	struct mana_context *ctx = to_mctx(ibpd->context);

	/* This is a RAW QP, pd is a parent domain with port number */
	if (!pd->mprotection_domain) {
		verbs_err(verbs_get_ctx(ibpd->context),
			  "Create RAW QP should use parent domain\n");
		errno = EINVAL;
		return NULL;
	}

	mpd = container_of(pd, struct mana_parent_domain, mpd);
	port = (uint32_t)(uintptr_t)mpd->pd_context;

	cq = container_of(attr->send_cq, struct mana_cq, ibcq);

	if (!ctx->extern_alloc.alloc || !ctx->extern_alloc.free) {
		verbs_err(verbs_get_ctx(ibpd->context),
			  "RAW QP requires extern alloc for buffers\n");
		errno = EINVAL;
		return NULL;
	}

	qp = calloc(1, sizeof(*qp));
	if (!qp)
		return NULL;

	qp->raw_qp.send_buf_size =
		attr->cap.max_send_wr * get_wqe_size(attr->cap.max_send_sge);
	qp->raw_qp.send_buf_size = align_hw_size(qp->raw_qp.send_buf_size);

	qp->raw_qp.send_buf = ctx->extern_alloc.alloc(qp->raw_qp.send_buf_size,
						      ctx->extern_alloc.data);
	if (!qp->raw_qp.send_buf) {
		errno = ENOMEM;
		goto free_qp;
	}

	qp_cmd_drv = &qp_cmd.drv_payload;
	qp_resp_drv = &qp_resp.drv_payload;

	qp_cmd_drv->sq_buf_addr = (uintptr_t)qp->raw_qp.send_buf;
	qp_cmd_drv->sq_buf_size = qp->raw_qp.send_buf_size;
	qp_cmd_drv->port = port;

	ret = ibv_cmd_create_qp(ibpd, &qp->ibqp.qp, attr, &qp_cmd.ibv_cmd,
				sizeof(qp_cmd), &qp_resp.ibv_resp,
				sizeof(qp_resp));
	if (ret) {
		verbs_err(verbs_get_ctx(ibpd->context), "Create QP failed\n");
		ctx->extern_alloc.free(qp->raw_qp.send_buf,
				       ctx->extern_alloc.data);
		errno = ret;
		goto free_qp;
	}

	qp->raw_qp.sqid = qp_resp_drv->sqid;
	qp->raw_qp.tx_vp_offset = qp_resp_drv->tx_vp_offset;
	qp->raw_qp.send_wqe_count = attr->cap.max_send_wr;

	cq->cqid = qp_resp_drv->cqid;

	return &qp->ibqp.qp;

free_qp:
	free(qp);
	return NULL;
}

static int mana_store_qp(struct mana_context *ctx, struct mana_qp *qp, uint32_t qid)
{
	uint32_t tbl_idx, tbl_off;
	int ret = 0;

	pthread_mutex_lock(&ctx->qp_table_mutex);

	tbl_idx = qid >> MANA_QP_TABLE_SHIFT;
	tbl_off = qid & MANA_QP_TABLE_MASK;

	if (ctx->qp_table[tbl_idx].refcnt == 0) {
		ctx->qp_table[tbl_idx].table =
			calloc(MANA_QP_TABLE_SIZE, sizeof(struct mana_qp *));
		if (!ctx->qp_table[tbl_idx].table) {
			ret = ENOMEM;
			goto out;
		}
	}

	if (ctx->qp_table[tbl_idx].table[tbl_off]) {
		ret = EBUSY;
		goto out;
	}

	ctx->qp_table[tbl_idx].table[tbl_off] = qp;
	ctx->qp_table[tbl_idx].refcnt++;

out:
	pthread_mutex_unlock(&ctx->qp_table_mutex);
	return ret;
}

static void mana_remove_qp(struct mana_context *ctx, uint32_t qid)
{
	uint32_t tbl_idx, tbl_off;

	pthread_mutex_lock(&ctx->qp_table_mutex);
	tbl_idx = qid >> MANA_QP_TABLE_SHIFT;
	tbl_off = qid & MANA_QP_TABLE_MASK;

	ctx->qp_table[tbl_idx].table[tbl_off] = NULL;
	ctx->qp_table[tbl_idx].refcnt--;
	if (ctx->qp_table[tbl_idx].refcnt == 0) {
		free(ctx->qp_table[tbl_idx].table);
		ctx->qp_table[tbl_idx].table = NULL;
	}

	pthread_mutex_unlock(&ctx->qp_table_mutex);
}

struct mana_qp *mana_get_qp_from_rq(struct mana_context *ctx, uint32_t qid)
{
	uint32_t tbl_idx, tbl_off;

	tbl_idx = qid >> MANA_QP_TABLE_SHIFT;
	tbl_off = qid & MANA_QP_TABLE_MASK;

	if (!ctx->qp_table[tbl_idx].table)
		return NULL;

	return ctx->qp_table[tbl_idx].table[tbl_off];
}

static uint32_t get_queue_size(struct ibv_qp_init_attr *attr, enum user_queue_types type)
{
	uint32_t size = 0;
	uint32_t sges = 0;

	if (attr->qp_type == IBV_QPT_RC) {
		switch (type) {
		case USER_RC_SEND_QUEUE_REQUESTER:
			/* WQE must have at least one SGE */
			/* For write with imm we need one extra SGE */
			sges = max(1U, attr->cap.max_send_sge) + 1;
			size = attr->cap.max_send_wr * get_large_wqe_size(sges);
			break;
		case USER_RC_SEND_QUEUE_RESPONDER:
			size = MANA_PAGE_SIZE;
			break;
		case USER_RC_RECV_QUEUE_REQUESTER:
			size = MANA_PAGE_SIZE;
			break;
		case USER_RC_RECV_QUEUE_RESPONDER:
			/* WQE must have at least one SGE */
			sges = max(1U, attr->cap.max_recv_sge);
			size = attr->cap.max_recv_wr * get_wqe_size(sges);
			break;
		default:
			return 0;
		}
	}

	size = align_hw_size(size);

	if (attr->qp_type == IBV_QPT_RC && type == USER_RC_SEND_QUEUE_REQUESTER)
		size += sizeof(struct mana_ib_rollback_shared_mem);

	return size;
}

static struct ibv_qp *mana_create_qp_rc(struct ibv_pd *ibpd,
					struct ibv_qp_init_attr *attr)
{
	struct mana_cq *send_cq = container_of(attr->send_cq, struct mana_cq, ibcq);
	struct mana_cq *recv_cq = container_of(attr->recv_cq, struct mana_cq, ibcq);
	struct mana_context *ctx = to_mctx(ibpd->context);
	struct mana_ib_create_rc_qp_resp *qp_resp_drv;
	struct mana_create_rc_qp_resp qp_resp = {};
	struct mana_ib_create_rc_qp *qp_cmd_drv;
	struct mana_create_rc_qp qp_cmd = {};
	struct mana_qp *qp;
	int ret, i;

	qp = calloc(1, sizeof(*qp));
	if (!qp)
		return NULL;

	qp_cmd_drv = &qp_cmd.drv_payload;
	qp_resp_drv = &qp_resp.drv_payload;

	pthread_spin_init(&qp->sq_lock, PTHREAD_PROCESS_PRIVATE);
	pthread_spin_init(&qp->rq_lock, PTHREAD_PROCESS_PRIVATE);

	if (create_shadow_queue(&qp->shadow_sq, attr->cap.max_send_wr,
				sizeof(struct rc_sq_shadow_wqe))) {
		verbs_err(verbs_get_ctx(ibpd->context),
			  "Failed to alloc sq shadow queue\n");
		errno = ENOMEM;
		goto free_qp;
	}

	if (create_shadow_queue(&qp->shadow_rq, attr->cap.max_recv_wr,
				sizeof(struct rc_rq_shadow_wqe))) {
		verbs_err(verbs_get_ctx(ibpd->context),
			  "Failed to alloc rc shadow queue\n");
		errno = ENOMEM;
		goto destroy_shadow_sq;
	}

	for (i = 0; i < USER_RC_QUEUE_TYPE_MAX; ++i) {
		qp->rc_qp.queues[i].db_page = ctx->db_page;
		qp->rc_qp.queues[i].size = get_queue_size(attr, i);
		qp->rc_qp.queues[i].buffer = mana_alloc_mem(qp->rc_qp.queues[i].size);

		if (!qp->rc_qp.queues[i].buffer) {
			verbs_err(verbs_get_ctx(ibpd->context),
				  "Failed to allocate memory for RC queue %d\n", i);
			errno = ENOMEM;
			goto destroy_queues;
		}

		qp_cmd_drv->queue_buf[i] = (uintptr_t)qp->rc_qp.queues[i].buffer;
		qp_cmd_drv->queue_size[i] = qp->rc_qp.queues[i].size;
	}

	mana_ib_init_rb_shmem(qp);

	ret = ibv_cmd_create_qp(ibpd, &qp->ibqp.qp, attr, &qp_cmd.ibv_cmd,
				sizeof(qp_cmd), &qp_resp.ibv_resp,
				sizeof(qp_resp));
	if (ret) {
		verbs_err(verbs_get_ctx(ibpd->context), "Create QP failed\n");
		errno = ret;
		goto free_rb;
	}

	for (i = 0; i < USER_RC_QUEUE_TYPE_MAX; ++i)
		qp->rc_qp.queues[i].id = qp_resp_drv->queue_id[i];

	qp->ibqp.qp.qp_num = qp->rc_qp.queues[USER_RC_RECV_QUEUE_RESPONDER].id;

	ret = mana_store_qp(ctx, qp, qp->rc_qp.queues[USER_RC_RECV_QUEUE_REQUESTER].id);
	if (ret) {
		errno = ret;
		goto destroy_qp;
	}

	ret = mana_store_qp(ctx, qp, qp->rc_qp.queues[USER_RC_RECV_QUEUE_RESPONDER].id);
	if (ret) {
		errno = ret;
		goto remove_qp_req;
	}

	pthread_spin_lock(&send_cq->lock);
	list_add(&send_cq->send_qp_list, &qp->send_cq_node);
	pthread_spin_unlock(&send_cq->lock);

	pthread_spin_lock(&recv_cq->lock);
	list_add(&recv_cq->recv_qp_list, &qp->recv_cq_node);
	pthread_spin_unlock(&recv_cq->lock);

	return &qp->ibqp.qp;

remove_qp_req:
	mana_remove_qp(ctx, qp->rc_qp.queues[USER_RC_RECV_QUEUE_REQUESTER].id);
destroy_qp:
	ibv_cmd_destroy_qp(&qp->ibqp.qp);
free_rb:
	mana_ib_deinit_rb_shmem(qp);
destroy_queues:
	while (i-- > 0)
		munmap(qp->rc_qp.queues[i].buffer, qp->rc_qp.queues[i].size);
	destroy_shadow_queue(&qp->shadow_rq);
destroy_shadow_sq:
	destroy_shadow_queue(&qp->shadow_sq);
free_qp:
	free(qp);
	return NULL;
}

struct ibv_qp *mana_create_qp(struct ibv_pd *ibpd, struct ibv_qp_init_attr *attr)
{
	switch (attr->qp_type) {
	case IBV_QPT_RAW_PACKET:
		return mana_create_qp_raw(ibpd, attr);
	case IBV_QPT_RC:
		return mana_create_qp_rc(ibpd, attr);
	default:
		verbs_err(verbs_get_ctx(ibpd->context),
			  "QP type %u is not supported\n", attr->qp_type);
		errno = EOPNOTSUPP;
	}

	return NULL;
}

static void mana_ib_modify_rc_qp(struct mana_qp *qp, struct ibv_qp_attr *attr, int attr_mask)
{
	int i;

	if (attr_mask & IBV_QP_PATH_MTU)
		qp->mtu = attr->path_mtu;

	switch (attr->qp_state) {
	case IBV_QPS_RESET:
		for (i = 0; i < USER_RC_QUEUE_TYPE_MAX; ++i) {
			qp->rc_qp.queues[i].prod_idx = 0;
			qp->rc_qp.queues[i].cons_idx = 0;
		}
		mana_ib_reset_rb_shmem(qp);
		reset_shadow_queue(&qp->shadow_rq);
		reset_shadow_queue(&qp->shadow_sq);
	case IBV_QPS_INIT:
		break;
	case IBV_QPS_RTR:
		break;
	case IBV_QPS_RTS:
		if (attr_mask & IBV_QP_SQ_PSN) {
			qp->rc_qp.sq_ssn = 1;
			qp->rc_qp.sq_psn = attr->sq_psn;
			qp->rc_qp.sq_highest_completed_psn = PSN_DEC(attr->sq_psn);
			gdma_arm_normal_cqe(&qp->rc_qp.queues[USER_RC_RECV_QUEUE_REQUESTER],
					    attr->sq_psn);
		}
		break;
	default:
		break;
	}
}

int mana_modify_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask)
{
	struct mana_qp *qp = container_of(ibqp, struct mana_qp, ibqp.qp);
	struct ibv_modify_qp cmd = {};
	int err;

	if (ibqp->qp_type != IBV_QPT_RC)
		return EOPNOTSUPP;

	if (!(attr_mask & IBV_QP_STATE))
		return 0;

	err = ibv_cmd_modify_qp(ibqp, attr, attr_mask, &cmd, sizeof(cmd));
	if (err) {
		verbs_err(verbs_get_ctx(ibqp->context), "Failed to modify qp\n");
		return err;
	}

	mana_ib_modify_rc_qp(qp, attr, attr_mask);
	return 0;
}

static void mana_drain_cqes(struct mana_qp *qp)
{
	struct mana_cq *send_cq = container_of(qp->ibqp.qp.send_cq, struct mana_cq, ibcq);
	struct mana_cq *recv_cq = container_of(qp->ibqp.qp.recv_cq, struct mana_cq, ibcq);

	pthread_spin_lock(&send_cq->lock);
	while (shadow_queue_get_next_to_consume(&qp->shadow_sq)) {
		shadow_queue_advance_consumer(&qp->shadow_sq);
		send_cq->ready_wcs--;
	}
	list_del(&qp->send_cq_node);
	pthread_spin_unlock(&send_cq->lock);

	pthread_spin_lock(&recv_cq->lock);
	while (shadow_queue_get_next_to_consume(&qp->shadow_rq)) {
		shadow_queue_advance_consumer(&qp->shadow_rq);
		recv_cq->ready_wcs--;
	}
	list_del(&qp->recv_cq_node);
	pthread_spin_unlock(&recv_cq->lock);
}

int mana_destroy_qp(struct ibv_qp *ibqp)
{
	struct mana_qp *qp = container_of(ibqp, struct mana_qp, ibqp.qp);
	struct mana_context *ctx = to_mctx(ibqp->context);
	int ret, i;

	if (ibqp->qp_type == IBV_QPT_RC) {
		mana_remove_qp(ctx, qp->rc_qp.queues[USER_RC_RECV_QUEUE_REQUESTER].id);
		mana_remove_qp(ctx, qp->rc_qp.queues[USER_RC_RECV_QUEUE_RESPONDER].id);
		mana_drain_cqes(qp);
	}

	ret = ibv_cmd_destroy_qp(ibqp);
	if (ret) {
		verbs_err(verbs_get_ctx(ibqp->context), "Destroy QP failed\n");
		return ret;
	}

	switch (ibqp->qp_type) {
	case IBV_QPT_RAW_PACKET:
		ctx->extern_alloc.free(qp->raw_qp.send_buf, ctx->extern_alloc.data);
		break;
	case IBV_QPT_RC:
		pthread_spin_destroy(&qp->sq_lock);
		pthread_spin_destroy(&qp->rq_lock);
		destroy_shadow_queue(&qp->shadow_sq);
		destroy_shadow_queue(&qp->shadow_rq);
		mana_ib_deinit_rb_shmem(qp);
		for (i = 0; i < USER_RC_QUEUE_TYPE_MAX; ++i)
			munmap(qp->rc_qp.queues[i].buffer, qp->rc_qp.queues[i].size);
		break;
	default:
		verbs_err(verbs_get_ctx(ibqp->context),
			  "QP type %u is not supported\n", ibqp->qp_type);
		errno = EINVAL;
	}

	free(qp);
	return 0;
}

static struct ibv_qp *mana_create_qp_ex_raw(struct ibv_context *context,
					    struct ibv_qp_init_attr_ex *attr)
{
	struct mana_create_qp_ex cmd = {};
	struct mana_ib_create_qp_rss *cmd_drv;
	struct mana_create_qp_ex_resp resp = {};
	struct mana_ib_create_qp_rss_resp *cmd_resp;
	struct mana_qp *qp;
	struct mana_pd *pd = container_of(attr->pd, struct mana_pd, ibv_pd);
	struct mana_parent_domain *mpd;
	uint32_t port;
	int ret;

	cmd_drv = &cmd.drv_payload;
	cmd_resp = &resp.drv_payload;

	/* For a RAW QP, pd is a parent domain with port number */
	if (!pd->mprotection_domain) {
		verbs_err(verbs_get_ctx(context),
			  "RAW QP needs to be on a parent domain\n");
		errno = EINVAL;
		return NULL;
	}

	if (attr->rx_hash_conf.rx_hash_key_len != MANA_IB_TOEPLITZ_HASH_KEY_SIZE_IN_BYTES) {
		verbs_err(verbs_get_ctx(context), "Invalid RX hash key length\n");
		errno = EINVAL;
		return NULL;
	}

	mpd = container_of(pd, struct mana_parent_domain, mpd);
	port = (uint32_t)(uintptr_t)mpd->pd_context;

	qp = calloc(1, sizeof(*qp));
	if (!qp)
		return NULL;

	cmd_drv->rx_hash_fields_mask = attr->rx_hash_conf.rx_hash_fields_mask;
	cmd_drv->rx_hash_function = attr->rx_hash_conf.rx_hash_function;
	cmd_drv->rx_hash_key_len = attr->rx_hash_conf.rx_hash_key_len;
	if (cmd_drv->rx_hash_key_len)
		memcpy(cmd_drv->rx_hash_key, attr->rx_hash_conf.rx_hash_key,
		       cmd_drv->rx_hash_key_len);

	cmd_drv->port = port;

	ret = ibv_cmd_create_qp_ex2(context, &qp->ibqp, attr, &cmd.ibv_cmd,
				    sizeof(cmd), &resp.ibv_resp, sizeof(resp));
	if (ret) {
		verbs_err(verbs_get_ctx(context), "Create QP EX failed\n");
		free(qp);
		errno = ret;
		return NULL;
	}

	if (attr->rwq_ind_tbl) {
		struct mana_rwq_ind_table *ind_table =
			container_of(attr->rwq_ind_tbl,
				     struct mana_rwq_ind_table, ib_ind_table);

		for (int i = 0; i < ind_table->ind_tbl_size; i++) {
			struct mana_wq *wq = container_of(ind_table->ind_tbl[i],
							  struct mana_wq, ibwq);
			struct mana_cq *cq = container_of(wq->ibwq.cq,
							  struct mana_cq, ibcq);
			wq->wqid = cmd_resp->entries[i].wqid;
			cq->cqid = cmd_resp->entries[i].cqid;
		}
	}

	return &qp->ibqp.qp;
}

struct ibv_qp *mana_create_qp_ex(struct ibv_context *context,
				 struct ibv_qp_init_attr_ex *attr)
{
	switch (attr->qp_type) {
	case IBV_QPT_RAW_PACKET:
		return mana_create_qp_ex_raw(context, attr);
	default:
		verbs_err(verbs_get_ctx(context),
			  "QP type %u is not supported\n", attr->qp_type);
		errno = EOPNOTSUPP;
	}

	return NULL;
}
rdma-core-56.1/providers/mana/rollback.h000066400000000000000000000046351477342711600202330ustar00rootroot00000000000000
/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
/*
 * Copyright (c) 2024, Microsoft Corporation. All rights reserved.
 */

#ifndef _ROLLBACK_H_
#define _ROLLBACK_H_

#include 
#include 
#include 
#include 
#include 
#include "mana.h"

#define MAKE_TAG(a, b, c, d) (((uint32_t)(d) << 24) | ((c) << 16) | ((b) << 8) | (a))
#define RNIC_ROLLBACK_SHARED_MEM_SIG MAKE_TAG('R', 'L', 'B', 'K')

struct mana_ib_rollback_shared_mem {
	uint32_t signature;
	uint32_t size;

	_Atomic(uint32_t) left_offset;
	_Atomic(uint32_t) right_offset;
};

static inline struct mana_ib_rollback_shared_mem *
mana_ib_get_rollback_sh_mem(struct mana_qp *qp)
{
	struct mana_ib_rollback_shared_mem *rb_shmem;
	struct mana_gdma_queue *req_sq =
		&qp->rc_qp.queues[USER_RC_SEND_QUEUE_REQUESTER];

	rb_shmem = (struct mana_ib_rollback_shared_mem *)
			((uint8_t *)req_sq->buffer + req_sq->size);

	return rb_shmem;
}

static inline void mana_ib_init_rb_shmem(struct mana_qp *qp)
{
	// take some bytes for rollback memory
	struct mana_gdma_queue *req_sq =
		&qp->rc_qp.queues[USER_RC_SEND_QUEUE_REQUESTER];
	req_sq->size -= sizeof(struct mana_ib_rollback_shared_mem);

	struct mana_ib_rollback_shared_mem *rb_shmem =
		mana_ib_get_rollback_sh_mem(qp);

	memset(rb_shmem, 0, sizeof(*rb_shmem));
	rb_shmem->signature = RNIC_ROLLBACK_SHARED_MEM_SIG;
	rb_shmem->size = sizeof(struct mana_ib_rollback_shared_mem);
}

static inline void mana_ib_deinit_rb_shmem(struct mana_qp *qp)
{
	// return back bytes for rollback memory
	struct mana_gdma_queue *req_sq =
		&qp->rc_qp.queues[USER_RC_SEND_QUEUE_REQUESTER];
	req_sq->size += sizeof(struct mana_ib_rollback_shared_mem);
}

static inline void mana_ib_reset_rb_shmem(struct mana_qp *qp)
{
	struct mana_ib_rollback_shared_mem *rb_shmem =
		mana_ib_get_rollback_sh_mem(qp);

	atomic_store(&rb_shmem->right_offset, 0);
	atomic_store(&rb_shmem->left_offset, 0);
}

static inline void mana_ib_update_shared_mem_right_offset(struct mana_qp *qp,
							  uint32_t offset_in_bu)
{
	struct mana_ib_rollback_shared_mem *rb_shmem =
		mana_ib_get_rollback_sh_mem(qp);

	atomic_store(&rb_shmem->right_offset, offset_in_bu);
}

static inline void mana_ib_update_shared_mem_left_offset(struct mana_qp *qp,
							 uint32_t offset_in_bu)
{
	struct mana_ib_rollback_shared_mem *rb_shmem =
		mana_ib_get_rollback_sh_mem(qp);

	atomic_store(&rb_shmem->left_offset, offset_in_bu);
}

#endif //_ROLLBACK_H_
rdma-core-56.1/providers/mana/shadow_queue.h000066400000000000000000000067161477342711600211330ustar00rootroot00000000000000
/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
/*
 * Copyright (c) 2024, Microsoft Corporation. All rights reserved.
 */

#ifndef _SHADOW_QUEUE_H_
#define _SHADOW_QUEUE_H_

#include 
#include 
#include 
#include 
#include 
#include 

#define MANA_NO_SIGNAL_WC (0xff)

struct shadow_wqe_header {
	/* ibv_wc_opcode */
	uint8_t opcode;
	/* ibv_wc_flags or MANA_NO_SIGNAL_WC */
	uint8_t flags;
	/* ibv_wc_status */
	uint8_t vendor_error_code;
	uint8_t posted_wqe_size_in_bu;
	uint32_t unmasked_queue_offset;
	uint64_t wr_id;
};

struct rc_sq_shadow_wqe {
	struct shadow_wqe_header header;
	uint32_t end_psn;
	uint32_t read_posted_wqe_size_in_bu;
};

struct rc_rq_shadow_wqe {
	struct shadow_wqe_header header;
	uint32_t byte_len;
	uint32_t imm_or_rkey;
};

struct shadow_queue {
	uint64_t prod_idx;
	uint64_t cons_idx;
	uint64_t next_to_complete_idx;
	uint32_t length;
	uint32_t stride;
	void *buffer;
};

static inline void reset_shadow_queue(struct shadow_queue *queue)
{
	queue->prod_idx = 0;
	queue->cons_idx = 0;
	queue->next_to_complete_idx = 0;
}

static inline int create_shadow_queue(struct shadow_queue *queue, uint32_t length,
				      uint32_t stride)
{
	length = roundup_pow_of_two(length);
	stride = align(stride, 8);

	void *buffer = mmap(NULL, stride * length, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buffer == MAP_FAILED)
		return -1;

	queue->length = length;
	queue->stride = stride;
	reset_shadow_queue(queue);
	queue->buffer = buffer;

	return 0;
}

static inline void destroy_shadow_queue(struct shadow_queue *queue)
{
	if (queue->buffer) {
		munmap(queue->buffer, queue->stride * queue->length);
		queue->buffer = NULL;
	}
}

static inline struct shadow_wqe_header *
shadow_queue_get_element(const struct shadow_queue *queue, uint64_t unmasked_index)
{
	uint32_t index = unmasked_index & (queue->length - 1);

	return (struct shadow_wqe_header *)((uint8_t *)queue->buffer + index * queue->stride);
}

static inline bool shadow_queue_full(struct shadow_queue *queue)
{
	return (queue->prod_idx - queue->cons_idx) >= queue->length;
}

static inline struct shadow_wqe_header *
shadow_queue_producer_entry(struct shadow_queue *queue)
{
	return shadow_queue_get_element(queue, queue->prod_idx);
}

static inline void shadow_queue_advance_producer(struct shadow_queue *queue)
{
	queue->prod_idx++;
}

static inline void shadow_queue_retreat_producer(struct shadow_queue *queue)
{
	queue->prod_idx--;
}

static inline void shadow_queue_advance_consumer(struct shadow_queue *queue)
{
	queue->cons_idx++;
}

static inline bool shadow_queue_empty(struct shadow_queue *queue)
{
	return queue->prod_idx == queue->cons_idx;
}

static inline uint32_t shadow_queue_get_pending_wqe_count(struct shadow_queue *queue)
{
	return (uint32_t)(queue->prod_idx - queue->next_to_complete_idx);
}

static inline struct shadow_wqe_header *
shadow_queue_get_next_to_consume(const struct shadow_queue *queue)
{
	if (queue->cons_idx == queue->next_to_complete_idx)
		return NULL;

	return shadow_queue_get_element(queue, queue->cons_idx);
}

static inline struct shadow_wqe_header *
shadow_queue_get_next_to_complete(struct shadow_queue *queue)
{
	if (queue->next_to_complete_idx == queue->prod_idx)
		return NULL;

	return shadow_queue_get_element(queue, queue->next_to_complete_idx);
}

static inline void shadow_queue_advance_next_to_complete(struct shadow_queue *queue)
{
	queue->next_to_complete_idx++;
}

#endif //_SHADOW_QUEUE_H_
rdma-core-56.1/providers/mana/wq.c000066400000000000000000000076411477342711600170660ustar00rootroot00000000000000
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
/*
 * Copyright (c) 2022, Microsoft Corporation. All rights reserved.
 */

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include "mana.h"

DECLARE_DRV_CMD(mana_create_wq, IB_USER_VERBS_EX_CMD_CREATE_WQ,
		mana_ib_create_wq, empty);

DECLARE_DRV_CMD(mana_create_rwq_ind_table, IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL,
		empty, empty);

int mana_modify_wq(struct ibv_wq *ibwq, struct ibv_wq_attr *attr)
{
	return EOPNOTSUPP;
}

struct ibv_wq *mana_create_wq(struct ibv_context *context,
			      struct ibv_wq_init_attr *attr)
{
	int ret;
	struct mana_context *ctx = to_mctx(context);
	struct mana_wq *wq;
	struct mana_create_wq wq_cmd = {};
	struct mana_create_wq_resp wq_resp = {};
	struct mana_ib_create_wq *wq_cmd_drv;

	if (!ctx->extern_alloc.alloc || !ctx->extern_alloc.free) {
		verbs_err(verbs_get_ctx(context),
			  "WQ buffer needs to be externally allocated\n");
		errno = EINVAL;
		return NULL;
	}

	wq = calloc(1, sizeof(*wq));
	if (!wq)
		return NULL;

	wq->sge = attr->max_sge;
	wq->buf_size = attr->max_wr * get_wqe_size(attr->max_sge);
	wq->buf_size = align_hw_size(wq->buf_size);
	wq->buf = ctx->extern_alloc.alloc(wq->buf_size, ctx->extern_alloc.data);
	if (!wq->buf) {
		errno = ENOMEM;
		goto free_wq;
	}

	wq->wqe = attr->max_wr;

	wq_cmd_drv = &wq_cmd.drv_payload;
	wq_cmd_drv->wq_buf_addr = (uintptr_t)wq->buf;
	wq_cmd_drv->wq_buf_size = wq->buf_size;

	ret = ibv_cmd_create_wq(context, attr, &wq->ibwq, &wq_cmd.ibv_cmd,
				sizeof(wq_cmd), &wq_resp.ibv_resp, sizeof(wq_resp));
	if (ret) {
		verbs_err(verbs_get_ctx(context), "Failed to Create WQ\n");
		ctx->extern_alloc.free(wq->buf, ctx->extern_alloc.data);
		errno = ret;
		goto free_wq;
	}

	return &wq->ibwq;

free_wq:
	free(wq);
	return NULL;
}

int mana_destroy_wq(struct ibv_wq *ibwq)
{
	struct mana_wq *wq = container_of(ibwq, struct mana_wq, ibwq);
	struct mana_context *ctx = to_mctx(ibwq->context);
	int ret;

	if (!ctx->extern_alloc.free) {
		verbs_err(verbs_get_ctx(ibwq->context),
			  "WQ needs external alloc context\n");
		return EINVAL;
	}

	ret = ibv_cmd_destroy_wq(ibwq);
	if (ret) {
		verbs_err(verbs_get_ctx(ibwq->context),
			  "Failed to destroy WQ\n");
		return ret;
	}

	ctx->extern_alloc.free(wq->buf, ctx->extern_alloc.data);
	free(wq);

	return 0;
}

struct ibv_rwq_ind_table *
mana_create_rwq_ind_table(struct ibv_context *context,
			  struct ibv_rwq_ind_table_init_attr *init_attr)
{
	int ret;
	struct mana_rwq_ind_table *ind_table;
	struct mana_create_rwq_ind_table_resp resp = {};
	int i;

	ind_table = calloc(1, sizeof(*ind_table));
	if (!ind_table)
		return NULL;

	ret = ibv_cmd_create_rwq_ind_table(context, init_attr,
					   &ind_table->ib_ind_table,
					   &resp.ibv_resp, sizeof(resp));
	if (ret) {
		verbs_err(verbs_get_ctx(context),
			  "Failed to create RWQ IND table\n");
		errno = ret;
		goto free_ind_table;
	}

	ind_table->ind_tbl_size = 1 << init_attr->log_ind_tbl_size;
	ind_table->ind_tbl = calloc(ind_table->ind_tbl_size,
				    sizeof(struct ibv_wq *));
	if (!ind_table->ind_tbl) {
		errno = ENOMEM;
		goto free_ind_table;
	}
	for (i = 0; i < ind_table->ind_tbl_size; i++)
		ind_table->ind_tbl[i] = init_attr->ind_tbl[i];

	return &ind_table->ib_ind_table;

free_ind_table:
	free(ind_table);
	return NULL;
}

int mana_destroy_rwq_ind_table(struct ibv_rwq_ind_table *rwq_ind_table)
{
	struct mana_rwq_ind_table *ind_table = container_of(
		rwq_ind_table, struct mana_rwq_ind_table, ib_ind_table);
	int ret;

	ret = ibv_cmd_destroy_rwq_ind_table(&ind_table->ib_ind_table);
	if (ret) {
		verbs_err(verbs_get_ctx(rwq_ind_table->context),
			  "Failed to destroy RWQ IND table\n");
		goto fail;
	}

	free(ind_table->ind_tbl);
	free(ind_table);

fail:
	return ret;
}
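The external-allocator flow that wq.c above and mana_create_qp_raw() depend on is driven through the two entry points exported in libmana.map and documented in the man pages earlier in this tree. The following is a minimal usage sketch, not a file from the tree: `register_allocators()`, `query_raw_queues()`, `my_alloc()` and `my_free()` are hypothetical application-side names, and error handling plus the actual ibv_create_cq()/ibv_create_qp() calls are elided.

```c
#include <stdlib.h>
#include <infiniband/verbs.h>
#include <infiniband/manadv.h>

/* hypothetical allocator callbacks handed to libmana */
static void *my_alloc(size_t size, void *priv_data)
{
	/* queue sizes are already page-aligned by the provider */
	return aligned_alloc(4096, size);
}

static void my_free(void *ptr, void *priv_data)
{
	free(ptr);
}

/* Register external buffer allocators; must run before queues are created. */
int register_allocators(struct ibv_context *context)
{
	struct manadv_ctx_allocators allocs = {
		.alloc = my_alloc,
		.free = my_free,
		.data = NULL,
	};

	return manadv_set_context_attr(context, MANADV_CTX_ATTR_BUF_ALLOCATORS,
				       &allocs);
}

/* After creating an IBV_QPT_RAW_PACKET QP and its CQ, fetch the raw layout. */
int query_raw_queues(struct ibv_qp *qp, struct ibv_cq *cq,
		     struct manadv_qp *dv_qp, struct manadv_cq *dv_cq)
{
	struct manadv_obj obj = {
		.qp = { .in = qp, .out = dv_qp },
		.cq = { .in = cq, .out = dv_cq },
	};

	return manadv_init_obj(&obj, MANADV_OBJ_QP | MANADV_OBJ_CQ);
}
```

Registering the allocators first matters: mana_create_wq() and mana_create_qp_raw() fail with errno set to EINVAL unless both extern_alloc callbacks are populated on the context.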
rdma-core-56.1/providers/mana/wr.c000066400000000000000000000304411477342711600170610ustar00rootroot00000000000000
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
/*
 * Copyright (c) 2024, Microsoft Corporation. All rights reserved.
 */

#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include "mana.h"
#include "doorbells.h"
#include "rollback.h"
#include "gdma.h"

static inline void zero_wqe_content(struct gdma_wqe *wqe)
{
	memset(wqe->gdma_oob, 0, sizeof(union gdma_oob) + wqe->client_oob_size);
	memset(wqe->sgl1, 0, wqe->num_sge1 * sizeof(struct gdma_sge));
	if (wqe->sgl2)
		memset(wqe->sgl2, 0, wqe->num_sge2 * sizeof(struct gdma_sge));
}

static inline void gdma_advance_producer(struct mana_gdma_queue *wq, uint32_t size_in_bu)
{
	wq->prod_idx = (wq->prod_idx + size_in_bu) & GDMA_QUEUE_OFFSET_MASK;
}

static inline int gdma_get_current_wqe(struct mana_gdma_queue *wq,
				       uint32_t client_oob_size,
				       uint32_t wqe_size, struct gdma_wqe *wqe)
{
	uint32_t wq_size = wq->size;
	uint32_t used_entries = (wq->prod_idx - wq->cons_idx) & GDMA_QUEUE_OFFSET_MASK;
	uint32_t free_space = wq_size - (used_entries * GDMA_WQE_ALIGNMENT_UNIT_SIZE);

	if (wqe_size > free_space)
		return ENOMEM;

	uint32_t aligned_sgl_size = wqe_size - sizeof(union gdma_oob) - client_oob_size;
	uint32_t total_num_sges = aligned_sgl_size / sizeof(struct gdma_sge);
	uint32_t offset = (wq->prod_idx * GDMA_WQE_ALIGNMENT_UNIT_SIZE) & (wq_size - 1);

	wqe->unmasked_wqe_index = wq->prod_idx;
	wqe->size_in_bu = wqe_size / GDMA_WQE_ALIGNMENT_UNIT_SIZE;
	wqe->gdma_oob = (union gdma_oob *)((uint8_t *)wq->buffer + offset);
	wqe->client_oob = ((uint8_t *)wqe->gdma_oob) + sizeof(union gdma_oob);
	wqe->client_oob_size = client_oob_size;

	if (likely(wq_size - offset >= wqe_size)) {
		wqe->sgl1 = (struct gdma_sge *)((uint8_t *)wqe->client_oob + client_oob_size);
		wqe->num_sge1 = total_num_sges;
		wqe->sgl2 = NULL;
		wqe->num_sge2 = 0;
	} else {
		if (offset + sizeof(union gdma_oob) + client_oob_size == wq_size) {
			wqe->sgl1 = (struct gdma_sge *)wq->buffer;
			wqe->num_sge1 = total_num_sges;
			wqe->sgl2 = NULL;
			wqe->num_sge2 = 0;
		} else {
			wqe->sgl1 = (struct gdma_sge *)((uint8_t *)wqe->client_oob + client_oob_size);
			wqe->num_sge1 = (wq_size - offset - sizeof(union gdma_oob) - client_oob_size)
					/ sizeof(struct gdma_sge);
			wqe->sgl2 = (struct gdma_sge *)wq->buffer;
			wqe->num_sge2 = total_num_sges - wqe->num_sge1;
		}
	}

	zero_wqe_content(wqe);
	return 0;
}

static inline void gdma_write_sge(struct gdma_wqe *wqe, void *oob_sge,
				  struct ibv_sge *sge, uint32_t num_sge)
{
	struct gdma_sge *gdma_sgl = wqe->sgl1;
	uint32_t num_sge1 = wqe->num_sge1;
	uint32_t i;

	if (oob_sge) {
		memcpy(gdma_sgl, oob_sge, sizeof(*gdma_sgl));
		gdma_sgl++;
		num_sge1--;
	}

	for (i = 0; i < num_sge; ++i, ++sge, ++gdma_sgl) {
		if (i == num_sge1)
			gdma_sgl = wqe->sgl2;
		gdma_sgl->address = sge->addr;
		gdma_sgl->size = sge->length;
		gdma_sgl->mem_key = sge->lkey;
	}
}

static inline int gdma_post_rq_wqe(struct mana_gdma_queue *wq, struct ibv_sge *sgl,
				   struct rdma_recv_oob *recv_oob, uint32_t num_sge,
				   enum gdma_work_req_flags flags, struct gdma_wqe *wqe)
{
	struct ibv_sge dummy = {1, 0, 0};
	uint32_t wqe_size;
	int ret;

	if (num_sge == 0) {
		num_sge = 1;
		sgl = &dummy;
	}
	wqe_size = get_wqe_size(num_sge);

	ret = gdma_get_current_wqe(wq, INLINE_OOB_SMALL_SIZE, wqe_size, wqe);
	if (ret)
		return ret;

	wqe->gdma_oob->rx.num_sgl_entries = num_sge;
	wqe->gdma_oob->rx.inline_client_oob_size = INLINE_OOB_SMALL_SIZE / sizeof(uint32_t);
	wqe->gdma_oob->rx.check_sn = (flags & GDMA_WORK_REQ_CHECK_SN) != 0;
	if (recv_oob)
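		/*
		 * The rdma_recv_oob PSN window (psn_start/psn_range) goes into
		 * the inline client OOB; RDMA reads post such a WQE on the
		 * requester recv queue with psn_start set to the current
		 * sq_psn (see mana_ib_rc_post_send_request() below).
		 */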
memcpy(wqe->client_oob, recv_oob, INLINE_OOB_SMALL_SIZE); gdma_write_sge(wqe, NULL, sgl, num_sge); gdma_advance_producer(wq, wqe->size_in_bu); return 0; } static int mana_ib_rc_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct mana_context *mc = container_of(verbs_get_ctx(ibqp->context), struct mana_context, ibv_ctx); struct mana_qp *qp = container_of(ibqp, struct mana_qp, ibqp.qp); struct mana_gdma_queue *wq = &qp->rc_qp.queues[USER_RC_RECV_QUEUE_RESPONDER]; struct shadow_wqe_header *shadow_wqe; struct gdma_wqe wqe_info; uint8_t wqe_cnt = 0; int ret = 0; pthread_spin_lock(&qp->rq_lock); for (; wr; wr = wr->next) { if (shadow_queue_full(&qp->shadow_rq)) { verbs_err(&mc->ibv_ctx, "recv shadow queue full\n"); ret = ENOMEM; goto cleanup; } ret = gdma_post_rq_wqe(wq, wr->sg_list, NULL, wr->num_sge, GDMA_WORK_REQ_NONE, &wqe_info); if (ret) { verbs_err(&mc->ibv_ctx, "Failed to post RQ wqe , ret %d\n", ret); goto cleanup; } wqe_cnt++; shadow_wqe = shadow_queue_producer_entry(&qp->shadow_rq); memset(shadow_wqe, 0, sizeof(*shadow_wqe)); shadow_wqe->opcode = IBV_WC_RECV; shadow_wqe->wr_id = wr->wr_id; shadow_wqe->unmasked_queue_offset = wqe_info.unmasked_wqe_index; shadow_wqe->posted_wqe_size_in_bu = wqe_info.size_in_bu; shadow_queue_advance_producer(&qp->shadow_rq); } cleanup: if (wqe_cnt) gdma_ring_recv_doorbell(wq, wqe_cnt); pthread_spin_unlock(&qp->rq_lock); if (bad_wr && ret) *bad_wr = wr; return ret; } int mana_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad) { switch (ibqp->qp_type) { case IBV_QPT_RC: return mana_ib_rc_post_recv(ibqp, wr, bad); default: verbs_err(verbs_get_ctx(ibqp->context), "QPT not supported %d\n", ibqp->qp_type); return EOPNOTSUPP; } } static inline bool is_opcode_supported(enum ibv_wr_opcode opcode) { switch (opcode) { case IBV_WR_RDMA_READ: case IBV_WR_RDMA_WRITE: case IBV_WR_SEND: case IBV_WR_SEND_WITH_IMM: case IBV_WR_RDMA_WRITE_WITH_IMM: return true; default: return false; } } static inline enum ibv_wc_opcode convert_wr_to_wc(enum ibv_wr_opcode opcode) { switch (opcode) { case IBV_WR_SEND_WITH_IMM: case IBV_WR_SEND: return IBV_WC_SEND; case IBV_WR_RDMA_WRITE_WITH_IMM: case IBV_WR_RDMA_WRITE: return IBV_WC_RDMA_WRITE; case IBV_WR_RDMA_READ: return IBV_WC_RDMA_READ; case IBV_WR_ATOMIC_CMP_AND_SWP: return IBV_WC_COMP_SWAP; case IBV_WR_ATOMIC_FETCH_AND_ADD: return IBV_WC_FETCH_ADD; default: return 0xFF; } } static inline int gdma_post_sq_wqe(struct mana_gdma_queue *wq, struct ibv_sge *sgl, struct rdma_send_oob *send_oob, void *oob_sge, uint32_t num_sge, uint32_t mtu, enum gdma_work_req_flags flags, struct gdma_wqe *wqe) { struct ibv_sge dummy = {1, 0, 0}; uint32_t total_sge, wqe_size; int ret; if (num_sge == 0) { num_sge = 1; sgl = &dummy; } total_sge = num_sge + (oob_sge ? 
1 : 0); wqe_size = get_large_wqe_size(total_sge); ret = gdma_get_current_wqe(wq, INLINE_OOB_LARGE_SIZE, wqe_size, wqe); if (ret) return ret; wqe->gdma_oob->tx.num_padding_sgls = wqe->num_sge1 + wqe->num_sge2 - total_sge; wqe->gdma_oob->tx.num_sgl_entries = wqe->num_sge1 + wqe->num_sge2; wqe->gdma_oob->tx.inline_client_oob_size = INLINE_OOB_LARGE_SIZE / sizeof(uint32_t); if (flags & GDMA_WORK_REQ_EXTRA_LARGE_OOB) { /* the first SGE was a part of the extra large OOB */ wqe->gdma_oob->tx.num_sgl_entries -= 1; wqe->gdma_oob->tx.inline_client_oob_size += 1; } wqe->gdma_oob->tx.client_oob_in_sgl = (flags & GDMA_WORK_REQ_OOB_IN_SGL) != 0; wqe->gdma_oob->tx.consume_credit = (flags & GDMA_WORK_REQ_CONSUME_CREDIT) != 0; wqe->gdma_oob->tx.fence = (flags & GDMA_WORK_REQ_FENCE) != 0; wqe->gdma_oob->tx.client_data_unit = mtu; wqe->gdma_oob->tx.check_sn = (flags & GDMA_WORK_REQ_CHECK_SN) != 0; wqe->gdma_oob->tx.sgl_direct = (flags & GDMA_WORK_REQ_SGL_DIRECT) != 0; memcpy(wqe->client_oob, send_oob, INLINE_OOB_LARGE_SIZE); gdma_write_sge(wqe, oob_sge, sgl, num_sge); gdma_advance_producer(wq, wqe->size_in_bu); return 0; } static inline int mana_ib_rc_post_send_request(struct mana_qp *qp, struct ibv_send_wr *wr, struct rc_sq_shadow_wqe *shadow_wqe) { enum gdma_work_req_flags flags = GDMA_WORK_REQ_NONE; struct extra_large_wqe extra_wqe = {0}; struct rdma_send_oob send_oob = {0}; struct gdma_wqe gdma_wqe = {0}; uint32_t num_sge = wr->num_sge; void *oob_sge = NULL; uint32_t msg_sz = 0; int i, ret; for (i = 0; i < num_sge; i++) msg_sz += wr->sg_list[i].length; if (wr->opcode == IBV_WR_RDMA_READ) { struct rdma_recv_oob recv_oob = {0}; recv_oob.psn_start = qp->rc_qp.sq_psn; ret = gdma_post_rq_wqe(&qp->rc_qp.queues[USER_RC_RECV_QUEUE_REQUESTER], wr->sg_list, &recv_oob, num_sge, GDMA_WORK_REQ_CHECK_SN, &gdma_wqe); if (ret) { verbs_err(verbs_get_ctx(qp->ibqp.qp.context), "rc post Read data WQE error, ret %d\n", ret); goto cleanup; } shadow_wqe->read_posted_wqe_size_in_bu = gdma_wqe.size_in_bu; gdma_ring_recv_doorbell(&qp->rc_qp.queues[USER_RC_RECV_QUEUE_REQUESTER], 1); // for reads no sge to use dummy sgl num_sge = 0; } send_oob.wqe_type = convert_wr_to_hw_opcode(wr->opcode); send_oob.fence = (wr->send_flags & IBV_SEND_FENCE) != 0; send_oob.signaled = (wr->send_flags & IBV_SEND_SIGNALED) != 0; send_oob.solicited = (wr->send_flags & IBV_SEND_SOLICITED) != 0; send_oob.psn = qp->rc_qp.sq_psn; send_oob.ssn = qp->rc_qp.sq_ssn; switch (wr->opcode) { case IBV_WR_SEND_WITH_INV: flags |= GDMA_WORK_REQ_CHECK_SN; send_oob.send.invalidate_key = wr->invalidate_rkey; break; case IBV_WR_SEND_WITH_IMM: send_oob.send.immediate = htole32(be32toh(wr->imm_data)); SWITCH_FALLTHROUGH; case IBV_WR_SEND: flags |= GDMA_WORK_REQ_CHECK_SN; break; case IBV_WR_RDMA_WRITE_WITH_IMM: flags |= GDMA_WORK_REQ_CHECK_SN; flags |= GDMA_WORK_REQ_EXTRA_LARGE_OOB; extra_wqe.immediate = htole32(be32toh(wr->imm_data)); oob_sge = &extra_wqe; SWITCH_FALLTHROUGH; case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_READ: send_oob.rdma.address_hi = (uint32_t)(wr->wr.rdma.remote_addr >> 32); send_oob.rdma.address_low = (uint32_t)(wr->wr.rdma.remote_addr & 0xFFFFFFFF); send_oob.rdma.rkey = wr->wr.rdma.rkey; send_oob.rdma.dma_len = msg_sz; break; default: goto cleanup; } ret = gdma_post_sq_wqe(&qp->rc_qp.queues[USER_RC_SEND_QUEUE_REQUESTER], wr->sg_list, &send_oob, oob_sge, num_sge, MTU_SIZE(qp->mtu), flags, &gdma_wqe); if (ret) { verbs_err(verbs_get_ctx(qp->ibqp.qp.context), "rc post send error, ret %d\n", ret); goto cleanup; } qp->rc_qp.sq_psn = PSN_ADD(qp->rc_qp.sq_psn, 
PSN_DELTA(msg_sz, qp->mtu)); qp->rc_qp.sq_ssn = PSN_INC(qp->rc_qp.sq_ssn); shadow_wqe->header.wr_id = wr->wr_id; shadow_wqe->header.opcode = convert_wr_to_wc(wr->opcode); shadow_wqe->header.flags = (wr->send_flags & IBV_SEND_SIGNALED) ? 0 : MANA_NO_SIGNAL_WC; shadow_wqe->header.posted_wqe_size_in_bu = gdma_wqe.size_in_bu; shadow_wqe->header.unmasked_queue_offset = gdma_wqe.unmasked_wqe_index; shadow_wqe->end_psn = PSN_DEC(qp->rc_qp.sq_psn); return 0; cleanup: return EINVAL; } static int mana_ib_rc_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { struct mana_qp *qp = container_of(ibqp, struct mana_qp, ibqp.qp); int ret = 0; bool ring = false; pthread_spin_lock(&qp->sq_lock); for (; wr; wr = wr->next) { if ((wr->send_flags & IBV_SEND_SIGNALED) && shadow_queue_full(&qp->shadow_sq)) { verbs_err(verbs_get_ctx(ibqp->context), "shadow queue full\n"); ret = ENOMEM; goto cleanup; } if (!is_opcode_supported(wr->opcode)) { ret = EINVAL; goto cleanup; } /* Fill shadow queue data */ struct rc_sq_shadow_wqe *shadow_wqe = (struct rc_sq_shadow_wqe *) shadow_queue_producer_entry(&qp->shadow_sq); memset(shadow_wqe, 0, sizeof(struct rc_sq_shadow_wqe)); ret = mana_ib_rc_post_send_request(qp, wr, shadow_wqe); if (ret) { verbs_err(verbs_get_ctx(qp->ibqp.qp.context), "Failed to post send request ret %d\n", ret); goto cleanup; } ring = true; shadow_queue_advance_producer(&qp->shadow_sq); mana_ib_update_shared_mem_right_offset(qp, shadow_wqe->header.unmasked_queue_offset); } cleanup: if (ring) gdma_ring_send_doorbell(&qp->rc_qp.queues[USER_RC_SEND_QUEUE_REQUESTER]); pthread_spin_unlock(&qp->sq_lock); if (bad_wr && ret) *bad_wr = wr; return ret; } int mana_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad) { switch (ibqp->qp_type) { case IBV_QPT_RC: return mana_ib_rc_post_send(ibqp, wr, bad); default: verbs_err(verbs_get_ctx(ibqp->context), "QPT not supported %d\n", ibqp->qp_type); return EOPNOTSUPP; } } rdma-core-56.1/providers/mlx4/000077500000000000000000000000001477342711600162335ustar00rootroot00000000000000rdma-core-56.1/providers/mlx4/CMakeLists.txt000066400000000000000000000004611477342711600207740ustar00rootroot00000000000000rdma_shared_provider(mlx4 libmlx4.map 1 1.0.${PACKAGE_VERSION} buf.c cq.c dbrec.c mlx4.c qp.c srq.c verbs.c ) publish_headers(infiniband mlx4dv.h ) install(FILES "mlx4.conf" DESTINATION "${CMAKE_INSTALL_MODPROBEDIR}/") rdma_pkg_config("mlx4" "libibverbs" "${CMAKE_THREAD_LIBS_INIT}") rdma-core-56.1/providers/mlx4/buf.c000066400000000000000000000056531477342711600171640ustar00rootroot00000000000000/* * Copyright (c) 2006, 2007 Cisco, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include "mlx4.h" static void mlx4_free_buf_extern(struct mlx4_context *ctx, struct mlx4_buf *buf) { ibv_dofork_range(buf->buf, buf->length); ctx->extern_alloc.free(buf->buf, ctx->extern_alloc.data); } static int mlx4_alloc_buf_extern(struct mlx4_context *ctx, struct mlx4_buf *buf, size_t size) { void *addr; addr = ctx->extern_alloc.alloc(size, ctx->extern_alloc.data); if (addr || size == 0) { if (ibv_dontfork_range(addr, size)) { ctx->extern_alloc.free(addr, ctx->extern_alloc.data); return -1; } buf->buf = addr; buf->length = size; return 0; } return -1; } static bool mlx4_is_extern_alloc(struct mlx4_context *context) { return context->extern_alloc.alloc && context->extern_alloc.free; } int mlx4_alloc_buf(struct mlx4_context *ctx, struct mlx4_buf *buf, size_t size, int page_size) { int ret; if (mlx4_is_extern_alloc(ctx)) return mlx4_alloc_buf_extern(ctx, buf, size); buf->length = align(size, page_size); buf->buf = mmap(NULL, buf->length, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); if (buf->buf == MAP_FAILED) return errno; ret = ibv_dontfork_range(buf->buf, size); if (ret) munmap(buf->buf, buf->length); return ret; } void mlx4_free_buf(struct mlx4_context *context, struct mlx4_buf *buf) { if (mlx4_is_extern_alloc(context)) return mlx4_free_buf_extern(context, buf); if (buf->length) { ibv_dofork_range(buf->buf, buf->length); munmap(buf->buf, buf->length); } } rdma-core-56.1/providers/mlx4/cq.c000066400000000000000000000520311477342711600170030ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005 Mellanox Technologies Ltd. All rights reserved. * Copyright (c) 2006, 2007 Cisco Systems. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include "mlx4.h" enum { CQ_OK = 0, CQ_EMPTY = -1, CQ_POLL_ERR = -2 }; static struct mlx4_cqe *get_cqe(struct mlx4_cq *cq, int entry) { return cq->buf.buf + entry * cq->cqe_size; } static void *get_sw_cqe(struct mlx4_cq *cq, int n) { struct mlx4_cqe *cqe = get_cqe(cq, n & cq->verbs_cq.cq.cqe); struct mlx4_cqe *tcqe = cq->cqe_size == 64 ? cqe + 1 : cqe; return (!!(tcqe->owner_sr_opcode & MLX4_CQE_OWNER_MASK) ^ !!(n & (cq->verbs_cq.cq.cqe + 1))) ? NULL : cqe; } static struct mlx4_cqe *next_cqe_sw(struct mlx4_cq *cq) { return get_sw_cqe(cq, cq->cons_index); } static enum ibv_wc_status mlx4_handle_error_cqe(struct mlx4_err_cqe *cqe) { if (cqe->syndrome == MLX4_CQE_SYNDROME_LOCAL_QP_OP_ERR) printf(PFX "local QP operation err " "(QPN %06x, WQE index %x, vendor syndrome %02x, " "opcode = %02x)\n", htobe32(cqe->vlan_my_qpn), htobe32(cqe->wqe_index), cqe->vendor_err, cqe->owner_sr_opcode & ~MLX4_CQE_OWNER_MASK); switch (cqe->syndrome) { case MLX4_CQE_SYNDROME_LOCAL_LENGTH_ERR: return IBV_WC_LOC_LEN_ERR; case MLX4_CQE_SYNDROME_LOCAL_QP_OP_ERR: return IBV_WC_LOC_QP_OP_ERR; case MLX4_CQE_SYNDROME_LOCAL_PROT_ERR: return IBV_WC_LOC_PROT_ERR; case MLX4_CQE_SYNDROME_WR_FLUSH_ERR: return IBV_WC_WR_FLUSH_ERR; case MLX4_CQE_SYNDROME_MW_BIND_ERR: return IBV_WC_MW_BIND_ERR; case MLX4_CQE_SYNDROME_BAD_RESP_ERR: return IBV_WC_BAD_RESP_ERR; case MLX4_CQE_SYNDROME_LOCAL_ACCESS_ERR: return IBV_WC_LOC_ACCESS_ERR; case MLX4_CQE_SYNDROME_REMOTE_INVAL_REQ_ERR: return IBV_WC_REM_INV_REQ_ERR; case MLX4_CQE_SYNDROME_REMOTE_ACCESS_ERR: return IBV_WC_REM_ACCESS_ERR; case MLX4_CQE_SYNDROME_REMOTE_OP_ERR: return IBV_WC_REM_OP_ERR; case MLX4_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR: return IBV_WC_RETRY_EXC_ERR; case MLX4_CQE_SYNDROME_RNR_RETRY_EXC_ERR: return IBV_WC_RNR_RETRY_EXC_ERR; case MLX4_CQE_SYNDROME_REMOTE_ABORTED_ERR: return IBV_WC_REM_ABORT_ERR; default: return IBV_WC_GENERAL_ERR; } } static inline void handle_good_req(struct ibv_wc *wc, struct mlx4_cqe *cqe) { wc->wc_flags = 0; switch (mlx4dv_get_cqe_opcode(cqe)) { case MLX4_OPCODE_RDMA_WRITE_IMM: wc->wc_flags |= IBV_WC_WITH_IMM; SWITCH_FALLTHROUGH; case MLX4_OPCODE_RDMA_WRITE: wc->opcode = IBV_WC_RDMA_WRITE; break; case MLX4_OPCODE_SEND_IMM: wc->wc_flags |= IBV_WC_WITH_IMM; SWITCH_FALLTHROUGH; case MLX4_OPCODE_SEND: case MLX4_OPCODE_SEND_INVAL: wc->opcode = IBV_WC_SEND; break; case MLX4_OPCODE_RDMA_READ: wc->opcode = IBV_WC_RDMA_READ; wc->byte_len = be32toh(cqe->byte_cnt); break; case MLX4_OPCODE_ATOMIC_CS: wc->opcode = IBV_WC_COMP_SWAP; wc->byte_len = 8; break; case MLX4_OPCODE_ATOMIC_FA: wc->opcode = IBV_WC_FETCH_ADD; wc->byte_len = 8; break; case MLX4_OPCODE_LOCAL_INVAL: wc->opcode = IBV_WC_LOCAL_INV; break; case MLX4_OPCODE_BIND_MW: wc->opcode = IBV_WC_BIND_MW; break; default: /* assume it's a send completion */ wc->opcode = IBV_WC_SEND; break; } } static inline int mlx4_get_next_cqe(struct mlx4_cq *cq, struct mlx4_cqe **pcqe) ALWAYS_INLINE; static inline int mlx4_get_next_cqe(struct mlx4_cq *cq, struct mlx4_cqe **pcqe) { struct mlx4_cqe *cqe; cqe = next_cqe_sw(cq); if (!cqe) return CQ_EMPTY; if (cq->cqe_size == 64) ++cqe; ++cq->cons_index; VALGRIND_MAKE_MEM_DEFINED(cqe, sizeof *cqe); /* * Make sure we read CQ entry contents after we've checked the * ownership bit. 
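	 * Without this barrier, loads of the CQE payload could be
	 * satisfied speculatively before the ownership check, returning
	 * stale data from before the device wrote the entry.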
*/ udma_from_device_barrier(); *pcqe = cqe; return CQ_OK; } static inline int mlx4_parse_cqe(struct mlx4_cq *cq, struct mlx4_cqe *cqe, struct mlx4_qp **cur_qp, struct ibv_wc *wc, int lazy) ALWAYS_INLINE; static inline int mlx4_parse_cqe(struct mlx4_cq *cq, struct mlx4_cqe *cqe, struct mlx4_qp **cur_qp, struct ibv_wc *wc, int lazy) { struct mlx4_wq *wq; struct mlx4_srq *srq; uint32_t qpn; uint32_t g_mlpath_rqpn; uint64_t *pwr_id; uint16_t wqe_index; struct mlx4_err_cqe *ecqe; struct mlx4_context *mctx; int is_error; int is_send; enum ibv_wc_status *pstatus; mctx = to_mctx(cq->verbs_cq.cq.context); qpn = be32toh(cqe->vlan_my_qpn) & MLX4_CQE_QPN_MASK; if (lazy) { cq->cqe = cqe; cq->flags &= (~MLX4_CQ_FLAGS_RX_CSUM_VALID); } else wc->qp_num = qpn; is_send = cqe->owner_sr_opcode & MLX4_CQE_IS_SEND_MASK; is_error = (mlx4dv_get_cqe_opcode(cqe)) == MLX4_CQE_OPCODE_ERROR; if ((qpn & MLX4_XRC_QPN_BIT) && !is_send) { /* * We do not have to take the XSRQ table lock here, * because CQs will be locked while SRQs are removed * from the table. */ srq = mlx4_find_xsrq(&mctx->xsrq_table, be32toh(cqe->g_mlpath_rqpn) & MLX4_CQE_QPN_MASK); if (!srq) return CQ_POLL_ERR; } else { if (!*cur_qp || (qpn != (*cur_qp)->qpn_cache)) { /* * We do not have to take the QP table lock here, * because CQs will be locked while QPs are removed * from the table. */ *cur_qp = mlx4_find_qp(mctx, qpn); if (!*cur_qp) return CQ_POLL_ERR; } srq = ((*cur_qp)->type == MLX4_RSC_TYPE_SRQ) ? to_msrq((*cur_qp)->verbs_qp.qp.srq) : NULL; } pwr_id = lazy ? &cq->verbs_cq.cq_ex.wr_id : &wc->wr_id; if (is_send) { wq = &(*cur_qp)->sq; wqe_index = be16toh(cqe->wqe_index); wq->tail += (uint16_t) (wqe_index - (uint16_t) wq->tail); *pwr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)]; ++wq->tail; } else if (srq) { wqe_index = be16toh(cqe->wqe_index); *pwr_id = srq->wrid[wqe_index]; mlx4_free_srq_wqe(srq, wqe_index); } else { wq = &(*cur_qp)->rq; *pwr_id = wq->wrid[wq->tail & (wq->wqe_cnt - 1)]; ++wq->tail; } pstatus = lazy ? &cq->verbs_cq.cq_ex.status : &wc->status; if (is_error) { ecqe = (struct mlx4_err_cqe *)cqe; *pstatus = mlx4_handle_error_cqe(ecqe); if (!lazy) wc->vendor_err = ecqe->vendor_err; return CQ_OK; } *pstatus = IBV_WC_SUCCESS; if (lazy) { if (!is_send) if ((*cur_qp) && ((*cur_qp)->qp_cap_cache & MLX4_RX_CSUM_VALID)) cq->flags |= MLX4_CQ_FLAGS_RX_CSUM_VALID; } else if (is_send) { handle_good_req(wc, cqe); } else { wc->byte_len = be32toh(cqe->byte_cnt); switch (mlx4dv_get_cqe_opcode(cqe)) { case MLX4_RECV_OPCODE_RDMA_WRITE_IMM: wc->opcode = IBV_WC_RECV_RDMA_WITH_IMM; wc->wc_flags = IBV_WC_WITH_IMM; wc->imm_data = cqe->immed_rss_invalid; break; case MLX4_RECV_OPCODE_SEND_INVAL: wc->opcode = IBV_WC_RECV; wc->wc_flags |= IBV_WC_WITH_INV; wc->invalidated_rkey = be32toh(cqe->immed_rss_invalid); break; case MLX4_RECV_OPCODE_SEND: wc->opcode = IBV_WC_RECV; wc->wc_flags = 0; break; case MLX4_RECV_OPCODE_SEND_IMM: wc->opcode = IBV_WC_RECV; wc->wc_flags = IBV_WC_WITH_IMM; wc->imm_data = cqe->immed_rss_invalid; break; } wc->slid = be16toh(cqe->rlid); g_mlpath_rqpn = be32toh(cqe->g_mlpath_rqpn); wc->src_qp = g_mlpath_rqpn & 0xffffff; wc->dlid_path_bits = (g_mlpath_rqpn >> 24) & 0x7f; wc->wc_flags |= g_mlpath_rqpn & 0x80000000 ? IBV_WC_GRH : 0; wc->pkey_index = be32toh(cqe->immed_rss_invalid) & 0x7f; /* When working with xrc srqs, don't have qp to check link layer. * Using IB SL, should consider Roce. 
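	 * Note: on Ethernet the SL occupies the top 3 bits of sl_vid
	 * (hence the shift by 13 below), while on IB it occupies the
	 * top 4 bits (shift by 12).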
(TBD) */ if ((*cur_qp) && (*cur_qp)->link_layer == IBV_LINK_LAYER_ETHERNET) wc->sl = be16toh(cqe->sl_vid) >> 13; else wc->sl = be16toh(cqe->sl_vid) >> 12; if ((*cur_qp) && ((*cur_qp)->qp_cap_cache & MLX4_RX_CSUM_VALID)) { wc->wc_flags |= ((cqe->status & htobe32(MLX4_CQE_STATUS_IPV4_CSUM_OK)) == htobe32(MLX4_CQE_STATUS_IPV4_CSUM_OK)) << IBV_WC_IP_CSUM_OK_SHIFT; } } return CQ_OK; } static inline int mlx4_parse_lazy_cqe(struct mlx4_cq *cq, struct mlx4_cqe *cqe) ALWAYS_INLINE; static inline int mlx4_parse_lazy_cqe(struct mlx4_cq *cq, struct mlx4_cqe *cqe) { return mlx4_parse_cqe(cq, cqe, &cq->cur_qp, NULL, 1); } static inline int mlx4_poll_one(struct mlx4_cq *cq, struct mlx4_qp **cur_qp, struct ibv_wc *wc) ALWAYS_INLINE; static inline int mlx4_poll_one(struct mlx4_cq *cq, struct mlx4_qp **cur_qp, struct ibv_wc *wc) { struct mlx4_cqe *cqe; int err; err = mlx4_get_next_cqe(cq, &cqe); if (err == CQ_EMPTY) return err; return mlx4_parse_cqe(cq, cqe, cur_qp, wc, 0); } int mlx4_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc) { struct mlx4_cq *cq = to_mcq(ibcq); struct mlx4_qp *qp = NULL; int npolled; int err = CQ_OK; pthread_spin_lock(&cq->lock); for (npolled = 0; npolled < ne; ++npolled) { err = mlx4_poll_one(cq, &qp, wc + npolled); if (err != CQ_OK) break; } if (npolled || err == CQ_POLL_ERR) mlx4_update_cons_index(cq); pthread_spin_unlock(&cq->lock); return err == CQ_POLL_ERR ? err : npolled; } static inline void _mlx4_end_poll(struct ibv_cq_ex *ibcq, int lock) ALWAYS_INLINE; static inline void _mlx4_end_poll(struct ibv_cq_ex *ibcq, int lock) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); mlx4_update_cons_index(cq); if (lock) pthread_spin_unlock(&cq->lock); } static inline int _mlx4_start_poll(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr, int lock) ALWAYS_INLINE; static inline int _mlx4_start_poll(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr, int lock) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); struct mlx4_cqe *cqe; int err; if (unlikely(attr->comp_mask)) return EINVAL; if (lock) pthread_spin_lock(&cq->lock); cq->cur_qp = NULL; err = mlx4_get_next_cqe(cq, &cqe); if (err == CQ_EMPTY) { if (lock) pthread_spin_unlock(&cq->lock); return ENOENT; } err = mlx4_parse_lazy_cqe(cq, cqe); if (lock && err) pthread_spin_unlock(&cq->lock); return err; } static int mlx4_next_poll(struct ibv_cq_ex *ibcq) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); struct mlx4_cqe *cqe; int err; err = mlx4_get_next_cqe(cq, &cqe); if (err == CQ_EMPTY) return ENOENT; return mlx4_parse_lazy_cqe(cq, cqe); } static void mlx4_end_poll(struct ibv_cq_ex *ibcq) { _mlx4_end_poll(ibcq, 0); } static void mlx4_end_poll_lock(struct ibv_cq_ex *ibcq) { _mlx4_end_poll(ibcq, 1); } static int mlx4_start_poll(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return _mlx4_start_poll(ibcq, attr, 0); } static int mlx4_start_poll_lock(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return _mlx4_start_poll(ibcq, attr, 1); } static enum ibv_wc_opcode mlx4_cq_read_wc_opcode(struct ibv_cq_ex *ibcq) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); if (cq->cqe->owner_sr_opcode & MLX4_CQE_IS_SEND_MASK) { switch (mlx4dv_get_cqe_opcode(cq->cqe)) { case MLX4_OPCODE_RDMA_WRITE_IMM: case MLX4_OPCODE_RDMA_WRITE: return IBV_WC_RDMA_WRITE; case MLX4_OPCODE_SEND_INVAL: case MLX4_OPCODE_SEND_IMM: case MLX4_OPCODE_SEND: return IBV_WC_SEND; case MLX4_OPCODE_RDMA_READ: return IBV_WC_RDMA_READ; case MLX4_OPCODE_ATOMIC_CS: return IBV_WC_COMP_SWAP; case MLX4_OPCODE_ATOMIC_FA: return IBV_WC_FETCH_ADD; 
case MLX4_OPCODE_LOCAL_INVAL: return IBV_WC_LOCAL_INV; case MLX4_OPCODE_BIND_MW: return IBV_WC_BIND_MW; } } else { switch (mlx4dv_get_cqe_opcode(cq->cqe)) { case MLX4_RECV_OPCODE_RDMA_WRITE_IMM: return IBV_WC_RECV_RDMA_WITH_IMM; case MLX4_RECV_OPCODE_SEND_INVAL: case MLX4_RECV_OPCODE_SEND_IMM: case MLX4_RECV_OPCODE_SEND: return IBV_WC_RECV; } } return 0; } static uint32_t mlx4_cq_read_wc_qp_num(struct ibv_cq_ex *ibcq) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return be32toh(cq->cqe->vlan_my_qpn) & MLX4_CQE_QPN_MASK; } static unsigned int mlx4_cq_read_wc_flags(struct ibv_cq_ex *ibcq) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); int is_send = cq->cqe->owner_sr_opcode & MLX4_CQE_IS_SEND_MASK; int wc_flags = 0; if (is_send) { switch (mlx4dv_get_cqe_opcode(cq->cqe)) { case MLX4_OPCODE_RDMA_WRITE_IMM: case MLX4_OPCODE_SEND_IMM: wc_flags |= IBV_WC_WITH_IMM; break; } } else { if (cq->flags & MLX4_CQ_FLAGS_RX_CSUM_VALID) wc_flags |= ((cq->cqe->status & htobe32(MLX4_CQE_STATUS_IPV4_CSUM_OK)) == htobe32(MLX4_CQE_STATUS_IPV4_CSUM_OK)) << IBV_WC_IP_CSUM_OK_SHIFT; switch (mlx4dv_get_cqe_opcode(cq->cqe)) { case MLX4_RECV_OPCODE_RDMA_WRITE_IMM: case MLX4_RECV_OPCODE_SEND_IMM: wc_flags |= IBV_WC_WITH_IMM; break; case MLX4_RECV_OPCODE_SEND_INVAL: wc_flags |= IBV_WC_WITH_INV; break; } wc_flags |= (be32toh(cq->cqe->g_mlpath_rqpn) & 0x80000000) ? IBV_WC_GRH : 0; } return wc_flags; } static uint32_t mlx4_cq_read_wc_byte_len(struct ibv_cq_ex *ibcq) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return be32toh(cq->cqe->byte_cnt); } static uint32_t mlx4_cq_read_wc_vendor_err(struct ibv_cq_ex *ibcq) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); struct mlx4_err_cqe *ecqe = (struct mlx4_err_cqe *)cq->cqe; return ecqe->vendor_err; } static __be32 mlx4_cq_read_wc_imm_data(struct ibv_cq_ex *ibcq) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); switch (mlx4dv_get_cqe_opcode(cq->cqe)) { case MLX4_RECV_OPCODE_SEND_INVAL: /* This is returning invalidate_rkey which is in host order, see * ibv_wc_read_invalidated_rkey */ return (__force __be32)be32toh(cq->cqe->immed_rss_invalid); default: return cq->cqe->immed_rss_invalid; } } static uint32_t mlx4_cq_read_wc_slid(struct ibv_cq_ex *ibcq) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return (uint32_t)be16toh(cq->cqe->rlid); } static uint8_t mlx4_cq_read_wc_sl(struct ibv_cq_ex *ibcq) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); if ((cq->cur_qp) && (cq->cur_qp->link_layer == IBV_LINK_LAYER_ETHERNET)) return be16toh(cq->cqe->sl_vid) >> 13; else return be16toh(cq->cqe->sl_vid) >> 12; } static uint32_t mlx4_cq_read_wc_src_qp(struct ibv_cq_ex *ibcq) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return be32toh(cq->cqe->g_mlpath_rqpn) & 0xffffff; } static uint8_t mlx4_cq_read_wc_dlid_path_bits(struct ibv_cq_ex *ibcq) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return (be32toh(cq->cqe->g_mlpath_rqpn) >> 24) & 0x7f; } static uint64_t mlx4_cq_read_wc_completion_ts(struct ibv_cq_ex *ibcq) { struct mlx4_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return ((uint64_t)be32toh(cq->cqe->ts_47_16) << 16) | (cq->cqe->ts_15_8 << 8) | (cq->cqe->ts_7_0); } void mlx4_cq_fill_pfns(struct mlx4_cq *cq, const struct ibv_cq_init_attr_ex *cq_attr) { if (cq->flags & MLX4_CQ_FLAGS_SINGLE_THREADED) { cq->verbs_cq.cq_ex.start_poll = mlx4_start_poll; cq->verbs_cq.cq_ex.end_poll = mlx4_end_poll; } else { cq->verbs_cq.cq_ex.start_poll = mlx4_start_poll_lock; cq->verbs_cq.cq_ex.end_poll = mlx4_end_poll_lock; } cq->verbs_cq.cq_ex.next_poll = 
mlx4_next_poll; cq->verbs_cq.cq_ex.read_opcode = mlx4_cq_read_wc_opcode; cq->verbs_cq.cq_ex.read_vendor_err = mlx4_cq_read_wc_vendor_err; cq->verbs_cq.cq_ex.read_wc_flags = mlx4_cq_read_wc_flags; if (cq_attr->wc_flags & IBV_WC_EX_WITH_BYTE_LEN) cq->verbs_cq.cq_ex.read_byte_len = mlx4_cq_read_wc_byte_len; if (cq_attr->wc_flags & IBV_WC_EX_WITH_IMM) cq->verbs_cq.cq_ex.read_imm_data = mlx4_cq_read_wc_imm_data; if (cq_attr->wc_flags & IBV_WC_EX_WITH_QP_NUM) cq->verbs_cq.cq_ex.read_qp_num = mlx4_cq_read_wc_qp_num; if (cq_attr->wc_flags & IBV_WC_EX_WITH_SRC_QP) cq->verbs_cq.cq_ex.read_src_qp = mlx4_cq_read_wc_src_qp; if (cq_attr->wc_flags & IBV_WC_EX_WITH_SLID) cq->verbs_cq.cq_ex.read_slid = mlx4_cq_read_wc_slid; if (cq_attr->wc_flags & IBV_WC_EX_WITH_SL) cq->verbs_cq.cq_ex.read_sl = mlx4_cq_read_wc_sl; if (cq_attr->wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS) cq->verbs_cq.cq_ex.read_dlid_path_bits = mlx4_cq_read_wc_dlid_path_bits; if (cq_attr->wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP) cq->verbs_cq.cq_ex.read_completion_ts = mlx4_cq_read_wc_completion_ts; } int mlx4_arm_cq(struct ibv_cq *ibvcq, int solicited) { struct mlx4_cq *cq = to_mcq(ibvcq); uint64_t doorbell; uint32_t sn; uint32_t ci; uint32_t cmd; sn = cq->arm_sn & 3; ci = cq->cons_index & 0xffffff; cmd = solicited ? MLX4_CQ_DB_REQ_NOT_SOL : MLX4_CQ_DB_REQ_NOT; doorbell = sn << 28 | cmd | cq->cqn; doorbell <<= 32; doorbell |= ci; *cq->arm_db = htobe32(sn << 28 | cmd | ci); /* * Make sure that the doorbell record in host memory is * written before ringing the doorbell via PCI MMIO. */ udma_to_device_barrier(); mmio_write64_be(to_mctx(ibvcq->context)->uar + MLX4_CQ_DOORBELL, htobe64(doorbell)); return 0; } void mlx4_cq_event(struct ibv_cq *cq) { to_mcq(cq)->arm_sn++; } void __mlx4_cq_clean(struct mlx4_cq *cq, uint32_t qpn, struct mlx4_srq *srq) { struct mlx4_cqe *cqe, *dest; uint32_t prod_index; uint8_t owner_bit; int nfreed = 0; int cqe_inc = cq->cqe_size == 64 ? 1 : 0; if (!cq || cq->flags & MLX4_CQ_FLAGS_DV_OWNED) return; /* * First we need to find the current producer index, so we * know where to start cleaning from. It doesn't matter if HW * adds new entries after this loop -- the QP we're worried * about is already in RESET, so the new entries won't come * from our QP and therefore don't need to be checked. */ for (prod_index = cq->cons_index; get_sw_cqe(cq, prod_index); ++prod_index) if (prod_index == cq->cons_index + cq->verbs_cq.cq.cqe) break; /* * Now sweep backwards through the CQ, removing CQ entries * that match our QP by copying older entries on top of them. */ while ((int) --prod_index - (int) cq->cons_index >= 0) { cqe = get_cqe(cq, prod_index & cq->verbs_cq.cq.cqe); cqe += cqe_inc; if (srq && srq->ext_srq && (be32toh(cqe->g_mlpath_rqpn) & MLX4_CQE_QPN_MASK) == srq->verbs_srq.srq_num && !(cqe->owner_sr_opcode & MLX4_CQE_IS_SEND_MASK)) { mlx4_free_srq_wqe(srq, be16toh(cqe->wqe_index)); ++nfreed; } else if ((be32toh(cqe->vlan_my_qpn) & MLX4_CQE_QPN_MASK) == qpn) { if (srq && !(cqe->owner_sr_opcode & MLX4_CQE_IS_SEND_MASK)) mlx4_free_srq_wqe(srq, be16toh(cqe->wqe_index)); ++nfreed; } else if (nfreed) { dest = get_cqe(cq, (prod_index + nfreed) & cq->verbs_cq.cq.cqe); dest += cqe_inc; owner_bit = dest->owner_sr_opcode & MLX4_CQE_OWNER_MASK; memcpy(dest, cqe, sizeof *cqe); dest->owner_sr_opcode = owner_bit | (dest->owner_sr_opcode & ~MLX4_CQE_OWNER_MASK); } } if (nfreed) { cq->cons_index += nfreed; /* * Make sure update of buffer contents is done before * updating consumer index. 
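	 * Otherwise the device could observe the advanced consumer
	 * index and reuse entries whose relocated copies are not yet
	 * visible in memory.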
*/ udma_to_device_barrier(); mlx4_update_cons_index(cq); } } void mlx4_cq_clean(struct mlx4_cq *cq, uint32_t qpn, struct mlx4_srq *srq) { pthread_spin_lock(&cq->lock); __mlx4_cq_clean(cq, qpn, srq); pthread_spin_unlock(&cq->lock); } int mlx4_get_outstanding_cqes(struct mlx4_cq *cq) { uint32_t i; for (i = cq->cons_index; get_sw_cqe(cq, i); ++i) ; return i - cq->cons_index; } void mlx4_cq_resize_copy_cqes(struct mlx4_cq *cq, void *buf, int old_cqe) { struct mlx4_cqe *cqe; int i; int cqe_inc = cq->cqe_size == 64 ? 1 : 0; i = cq->cons_index; cqe = get_cqe(cq, (i & old_cqe)); cqe += cqe_inc; while ((mlx4dv_get_cqe_opcode(cqe)) != MLX4_CQE_OPCODE_RESIZE) { cqe->owner_sr_opcode = (cqe->owner_sr_opcode & ~MLX4_CQE_OWNER_MASK) | (((i + 1) & (cq->verbs_cq.cq.cqe + 1)) ? MLX4_CQE_OWNER_MASK : 0); memcpy(buf + ((i + 1) & cq->verbs_cq.cq.cqe) * cq->cqe_size, cqe - cqe_inc, cq->cqe_size); ++i; cqe = get_cqe(cq, (i & old_cqe)); cqe += cqe_inc; } ++cq->cons_index; } int mlx4_alloc_cq_buf(struct mlx4_device *dev, struct mlx4_context *ctx, struct mlx4_buf *buf, int nent, int entry_size) { if (mlx4_alloc_buf(ctx, buf, align(nent * entry_size, dev->page_size), dev->page_size)) return -1; memset(buf->buf, 0, nent * entry_size); return 0; } rdma-core-56.1/providers/mlx4/dbrec.c000066400000000000000000000075111477342711600174620ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #define _GNU_SOURCE #include #include #include #include #include "mlx4.h" struct mlx4_db_page { struct mlx4_db_page *prev, *next; struct mlx4_buf buf; int num_db; int use_cnt; unsigned long free[0]; }; static const int db_size[] = { [MLX4_DB_TYPE_CQ] = 8, [MLX4_DB_TYPE_RQ] = 4, }; static struct mlx4_db_page *__add_page(struct mlx4_context *context, enum mlx4_db_type type) { struct mlx4_db_page *page; int ps = to_mdev(context->ibv_ctx.context.device)->page_size; int pp; int i; pp = ps / db_size[type]; page = malloc(sizeof *page + pp / 8); if (!page) return NULL; if (mlx4_alloc_buf(context, &page->buf, ps, ps)) { free(page); return NULL; } page->num_db = pp; page->use_cnt = 0; for (i = 0; i < pp / (sizeof (long) * 8); ++i) page->free[i] = ~0; page->prev = NULL; page->next = context->db_list[type]; context->db_list[type] = page; if (page->next) page->next->prev = page; return page; } __be32 *mlx4_alloc_db(struct mlx4_context *context, enum mlx4_db_type type) { struct mlx4_db_page *page; __be32 *db = NULL; int i, j; pthread_mutex_lock(&context->db_list_mutex); for (page = context->db_list[type]; page; page = page->next) if (page->use_cnt < page->num_db) goto found; page = __add_page(context, type); if (!page) goto out; found: ++page->use_cnt; for (i = 0; !page->free[i]; ++i) /* nothing */; j = ffsl(page->free[i]); page->free[i] &= ~(1UL << (j - 1)); db = page->buf.buf + (i * 8 * sizeof (long) + (j - 1)) * db_size[type]; out: pthread_mutex_unlock(&context->db_list_mutex); return db; } void mlx4_free_db(struct mlx4_context *context, enum mlx4_db_type type, __be32 *db) { struct mlx4_db_page *page; uintptr_t ps = to_mdev(context->ibv_ctx.context.device)->page_size; int i; pthread_mutex_lock(&context->db_list_mutex); for (page = context->db_list[type]; page; page = page->next) if (((uintptr_t) db & ~(ps - 1)) == (uintptr_t) page->buf.buf) break; if (!page) goto out; i = ((void *) db - page->buf.buf) / db_size[type]; page->free[i / (8 * sizeof (long))] |= 1UL << (i % (8 * sizeof (long))); if (!--page->use_cnt) { if (page->prev) page->prev->next = page->next; else context->db_list[type] = page->next; if (page->next) page->next->prev = page->prev; mlx4_free_buf(context, &page->buf); free(page); } out: pthread_mutex_unlock(&context->db_list_mutex); } rdma-core-56.1/providers/mlx4/libmlx4.map000066400000000000000000000003341477342711600203050ustar00rootroot00000000000000/* Export symbols should be added below according to Documentation/versioning.md document. */ MLX4_1.0 { global: mlx4dv_init_obj; mlx4dv_query_device; mlx4dv_create_qp; mlx4dv_set_context_attr; local: *; }; rdma-core-56.1/providers/mlx4/man/000077500000000000000000000000001477342711600170065ustar00rootroot00000000000000rdma-core-56.1/providers/mlx4/man/CMakeLists.txt000066400000000000000000000001501477342711600215420ustar00rootroot00000000000000rdma_man_pages( mlx4dv_init_obj.3 mlx4dv_query_device.3 mlx4dv_set_context_attr.3.md mlx4dv.7 ) rdma-core-56.1/providers/mlx4/man/mlx4dv.7000066400000000000000000000027131477342711600203170ustar00rootroot00000000000000.\" -*- nroff -*- .\" Copyright (c) 2017 Mellanox Technologies, Inc. .\" Licensed under the OpenIB.org (MIT) - See COPYING.md .\" .TH MLX4DV 7 2017-04-19 1.0.0 .SH "NAME" mlx4dv \- Direct verbs for mlx4 devices .br This is low level access to mlx4 devices to perform data path operations, without general branching performed by \fBibv_post_send\fR(3). .SH "DESCRIPTION" The libibverbs API is an abstract one. It is agnostic to any underlying provider specific implementation. 
While this abstraction has the advantage of user applications portability it has a performance penalty. For some applications optimizing performance is more important than portability. The mlx4 direct verbs API is intended for such applications. It exposes mlx4 specific low level data path (send/receive/completion) operations, allowing the application to bypass the libibverbs data path API. This interface consists from one hardware specific header file with relevant inline functions and conversion logic from ibverbs structures to mlx4 specific structures. The direct include of mlx4dv.h together with linkage to mlx4 library will allow usage of this new interface. Once an application uses the direct flow the locking scheme is fully managed by itself. There is an expectation that no mixed flows in the data path for both direct/non-direct access will be by same application. .SH "NOTES" .SH "SEE ALSO" .BR ibv_post_send (3), .BR verbs (7) .SH "AUTHORS" .TP Maor Gottlieb rdma-core-56.1/providers/mlx4/man/mlx4dv_init_obj.3000066400000000000000000000053031477342711600221660ustar00rootroot00000000000000.\" -*- nroff -*- .\" Copyright (c) 2017 Mellanox Technologies, Inc. .\" Licensed under the OpenIB.org (MIT) - See COPYING.md .\" .TH MLX4DV_INIT_OBJ 3 2017-02-02 1.0.0 .SH "NAME" mlx4dv_init_obj \- Initialize mlx4 direct verbs object from ibv_xxx structures .SH "SYNOPSIS" .nf .B #include .sp .BI "int mlx4dv_init_obj(struct mlx4dv_obj *obj, uint64_t obj_type); .fi .SH "DESCRIPTION" .B mlx4dv_init_obj() This function will initialize mlx4dv_xxx structs based on supplied type. The information for initialization is taken from ibv_xx structs supplied as part of input. Request information of CQ marks its owned by direct verbs for all consumer index related actions. The initialization type can be combination of several types together. .PP .nf struct mlx4dv_qp { .in +8 uint32_t *rdb; uint32_t *sdb; struct { .in +8 uint32_t wqe_cnt; int wqe_shift; int offset; .in -8 } sq; struct { .in +8 uint32_t wqe_cnt; int wqe_shift; int offset; .in -8 } rq; struct { .in +8 void *buf; size_t length; .in -8 } buf; uint64_t comp_mask; /* Use enum mlx4dv_qp_comp_mask */ off_t uar_mmap_offset; /* If MLX4DV_QP_MASK_UAR_MMAP_OFFSET is set in comp_mask, this will contain the mmap offset of *sdb* */ .in -8 }; struct mlx4dv_cq { .in +8 struct { .in +8 void *buf; size_t length; .in -8 } buf; uint32_t cqe_cnt; uint32_t cqn; uint32_t *set_ci_db; uint32_t *arm_db; int arm_sn; int cqe_size; uint64_t comp_mask; /* Use enum mlx4dv_cq_comp_mask */ void *cq_uar; .in -8 }; struct mlx4dv_srq { .in +8 struct { .in +8 void *buf; size_t length; .in -8 } buf; int wqe_shift; int head; int tail; uint32_t *db; uint64_t comp_mask; .in -8 }; struct mlx4dv_rwq { .in +8 __be32 *rdb; struct { .in +8 uint32_t wqe_cnt; int wqe_shift; int offset; .in -8 } rq; struct { .in +8 void *buf; size_t length; .in -8 } buf; uint64_t comp_mask; .in -8 }; struct mlx4dv_obj { .in +8 struct { .in +8 struct ibv_qp *in; struct mlx4dv_qp *out; .in -8 } qp; struct { .in +8 struct ibv_cq *in; struct mlx4dv_cq *out; .in -8 } cq; .in -8 }; enum mlx4dv_obj_type { .in +8 MLX4DV_OBJ_QP = 1 << 0, MLX4DV_OBJ_CQ = 1 << 1, MLX4DV_OBJ_SRQ = 1 << 2, .in -8 }; .fi .SH "RETURN VALUE" 0 on success or the value of errno on failure (which indicates the failure reason). .SH "NOTES" * Compatibility masks (comp_mask) are in/out fields. 
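.SH "EXAMPLE"
A minimal sketch (error handling elided); \fIqp\fR and \fIcq\fR are assumed
to be objects previously created through the regular libibverbs API:
.PP
.nf
struct mlx4dv_qp dv_qp = {};
struct mlx4dv_cq dv_cq = {};
struct mlx4dv_obj obj = {};
int ret;

obj.qp.in = qp;
obj.qp.out = &dv_qp;
obj.cq.in = cq;
obj.cq.out = &dv_cq;

ret = mlx4dv_init_obj(&obj, MLX4DV_OBJ_QP | MLX4DV_OBJ_CQ);
if (ret)
        /* ret holds the errno value describing the failure */;
.fi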
.SH "SEE ALSO" .BR mlx4dv (7) .SH "AUTHORS" .TP Maor Gottlieb rdma-core-56.1/providers/mlx4/man/mlx4dv_query_device.3000066400000000000000000000023061477342711600230550ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org (MIT) - See COPYING.md .\" .TH MLX4DV_QUERY_DEVICE 3 2017-06-27 1.0.0 .SH "NAME" mlx4dv_query_device \- Query device capabilities specific to mlx4 .SH "SYNOPSIS" .nf .B #include .sp .BI "int mlx4dv_query_device(struct ibv_context *ctx_in, .BI " struct mlx4dv_context *attrs_out); .fi .SH "DESCRIPTION" .B mlx4dv_query_device() Query mlx4 specific device information that is usable via the direct verbs interface. .PP This function returns a version and compatibility mask. The version represents the format of the internal hardware structures that mlx4dv.h exposes. Future additions of new fields to the existing structures are handled by the comp_mask field. .PP .nf struct mlx4dv_context { .in +8 uint8_t version; uint32_t max_inl_recv_sz; /* Maximum supported size of inline receive */ uint64_t comp_mask; .in -8 }; .fi .SH "RETURN VALUE" 0 on success or the value of errno on failure (which indicates the failure reason). .SH "NOTES" * Compatibility mask (comp_mask) is an in/out field. .SH "SEE ALSO" .BR mlx4dv (7), .BR ibv_query_device (3) .SH "AUTHORS" .TP Maor Gottlieb rdma-core-56.1/providers/mlx4/man/mlx4dv_set_context_attr.3.md000066400000000000000000000026711477342711600243660ustar00rootroot00000000000000--- layout: page title: mlx4dv_set_context_attr section: 3 tagline: Verbs --- # NAME mlx4dv_set_context_attr - Set context attributes # SYNOPSIS ```c #include int mlx4dv_set_context_attr(struct ibv_context *context, enum mlx4dv_set_ctx_attr_type attr_type, void *attr); ``` # DESCRIPTION mlx4dv_set_context_attr gives the ability to set vendor specific attributes on the RDMA context. # ARGUMENTS *context* : RDMA device context to work on. *attr_type* : The type of the provided attribute. *attr* : Pointer to the attribute to be set. ## attr_type ```c enum mlx4dv_set_ctx_attr_type { /* Attribute type uint8_t */ MLX4DV_SET_CTX_ATTR_LOG_WQS_RANGE_SZ = 0, MLX4DV_SET_CTX_ATTR_BUF_ALLOCATORS = 1, }; ``` *MLX4DV_SET_CTX_ATTR_LOG_WQS_RANGE_SZ* : Change the LOG WQs Range size for RSS *MLX4DV_SET_CTX_ATTR_BUF_ALLOCATORS* : Provide an external buffer allocator ```c struct mlx4dv_ctx_allocators { void *(*alloc)(size_t size, void *priv_data); void (*free)(void *ptr, void *priv_data); void *data; }; ``` *alloc* : Function used for buffer allocation instead of libmlx4 internal method *free* : Function used to free buffers allocated by alloc function *data* : Metadata that can be used by alloc and free functions # RETURN VALUE Returns 0 on success, or the value of errno on failure (which indicates the failure reason). #AUTHOR Majd Dibbiny rdma-core-56.1/providers/mlx4/mlx4-abi.h000066400000000000000000000061111477342711600200200ustar00rootroot00000000000000/* * Copyright (c) 2007 Cisco, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef MLX4_ABI_H #define MLX4_ABI_H #include #include #include #define MLX4_UVERBS_MIN_ABI_VERSION 2 #define MLX4_UVERBS_MAX_ABI_VERSION 4 #define MLX4_UVERBS_NO_DEV_CAPS_ABI_VERSION 3 DECLARE_DRV_CMD(mlx4_alloc_pd, IB_USER_VERBS_CMD_ALLOC_PD, empty, mlx4_ib_alloc_pd_resp); DECLARE_DRV_CMD(mlx4_create_cq, IB_USER_VERBS_CMD_CREATE_CQ, mlx4_ib_create_cq, mlx4_ib_create_cq_resp); DECLARE_DRV_CMD(mlx4_create_cq_ex, IB_USER_VERBS_EX_CMD_CREATE_CQ, mlx4_ib_create_cq, mlx4_ib_create_cq_resp); DECLARE_DRV_CMD(mlx4_create_qp, IB_USER_VERBS_CMD_CREATE_QP, mlx4_ib_create_qp, empty); DECLARE_DRV_CMD(mlx4_create_qp_ex, IB_USER_VERBS_EX_CMD_CREATE_QP, mlx4_ib_create_qp, empty); DECLARE_DRV_CMD(mlx4_create_qp_ex_rss, IB_USER_VERBS_EX_CMD_CREATE_QP, mlx4_ib_create_qp_rss, empty); DECLARE_DRV_CMD(mlx4_create_srq, IB_USER_VERBS_CMD_CREATE_SRQ, mlx4_ib_create_srq, mlx4_ib_create_srq_resp); DECLARE_DRV_CMD(mlx4_create_wq, IB_USER_VERBS_EX_CMD_CREATE_WQ, mlx4_ib_create_wq, empty); DECLARE_DRV_CMD(mlx4_create_xsrq, IB_USER_VERBS_CMD_CREATE_XSRQ, mlx4_ib_create_srq, mlx4_ib_create_srq_resp); DECLARE_DRV_CMD(mlx4_alloc_ucontext_v3, IB_USER_VERBS_CMD_GET_CONTEXT, empty, mlx4_ib_alloc_ucontext_resp_v3); DECLARE_DRV_CMD(mlx4_alloc_ucontext, IB_USER_VERBS_CMD_GET_CONTEXT, empty, mlx4_ib_alloc_ucontext_resp); DECLARE_DRV_CMD(mlx4_modify_wq, IB_USER_VERBS_EX_CMD_MODIFY_WQ, mlx4_ib_modify_wq, empty); DECLARE_DRV_CMD(mlx4_query_device_ex, IB_USER_VERBS_EX_CMD_QUERY_DEVICE, empty, mlx4_uverbs_ex_query_device_resp); DECLARE_DRV_CMD(mlx4_resize_cq, IB_USER_VERBS_CMD_RESIZE_CQ, mlx4_ib_resize_cq, empty); #endif /* MLX4_ABI_H */ rdma-core-56.1/providers/mlx4/mlx4.c000066400000000000000000000305221477342711600172650ustar00rootroot00000000000000/* * Copyright (c) 2007 Cisco, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include #include "mlx4.h" #include "mlx4-abi.h" static void mlx4_free_context(struct ibv_context *ibv_ctx); #ifndef PCI_VENDOR_ID_MELLANOX #define PCI_VENDOR_ID_MELLANOX 0x15b3 #endif #define HCA(v, d) VERBS_PCI_MATCH(PCI_VENDOR_ID_##v, d, NULL) static const struct verbs_match_ent hca_table[] = { VERBS_DRIVER_ID(RDMA_DRIVER_MLX4), HCA(MELLANOX, 0x6340), /* MT25408 "Hermon" SDR */ HCA(MELLANOX, 0x634a), /* MT25408 "Hermon" DDR */ HCA(MELLANOX, 0x6354), /* MT25408 "Hermon" QDR */ HCA(MELLANOX, 0x6732), /* MT25408 "Hermon" DDR PCIe gen2 */ HCA(MELLANOX, 0x673c), /* MT25408 "Hermon" QDR PCIe gen2 */ HCA(MELLANOX, 0x6368), /* MT25408 "Hermon" EN 10GigE */ HCA(MELLANOX, 0x6750), /* MT25408 "Hermon" EN 10GigE PCIe gen2 */ HCA(MELLANOX, 0x6372), /* MT25458 ConnectX EN 10GBASE-T 10GigE */ HCA(MELLANOX, 0x675a), /* MT25458 ConnectX EN 10GBASE-T+Gen2 10GigE */ HCA(MELLANOX, 0x6764), /* MT26468 ConnectX EN 10GigE PCIe gen2*/ HCA(MELLANOX, 0x6746), /* MT26438 ConnectX EN 40GigE PCIe gen2 5GT/s */ HCA(MELLANOX, 0x676e), /* MT26478 ConnectX2 40GigE PCIe gen2 */ HCA(MELLANOX, 0x1002), /* MT25400 Family [ConnectX-2 Virtual Function] */ HCA(MELLANOX, 0x1003), /* MT27500 Family [ConnectX-3] */ HCA(MELLANOX, 0x1004), /* MT27500 Family [ConnectX-3 Virtual Function] */ HCA(MELLANOX, 0x1005), /* MT27510 Family */ HCA(MELLANOX, 0x1006), /* MT27511 Family */ HCA(MELLANOX, 0x1007), /* MT27520 Family */ HCA(MELLANOX, 0x1008), /* MT27521 Family */ HCA(MELLANOX, 0x1009), /* MT27530 Family */ HCA(MELLANOX, 0x100a), /* MT27531 Family */ HCA(MELLANOX, 0x100b), /* MT27540 Family */ HCA(MELLANOX, 0x100c), /* MT27541 Family */ HCA(MELLANOX, 0x100d), /* MT27550 Family */ HCA(MELLANOX, 0x100e), /* MT27551 Family */ HCA(MELLANOX, 0x100f), /* MT27560 Family */ HCA(MELLANOX, 0x1010), /* MT27561 Family */ VERBS_MODALIAS_MATCH("vmbus:3daf2e8ca732094bab99bd1f1c86b501", NULL), /* Microsoft Azure Network Direct */ {} }; static const struct verbs_context_ops mlx4_ctx_ops = { .query_port = mlx4_query_port, .alloc_pd = mlx4_alloc_pd, .dealloc_pd = mlx4_free_pd, .reg_mr = mlx4_reg_mr, .rereg_mr = mlx4_rereg_mr, .dereg_mr = mlx4_dereg_mr, .alloc_mw = mlx4_alloc_mw, .dealloc_mw = mlx4_dealloc_mw, .bind_mw = mlx4_bind_mw, .create_cq = mlx4_create_cq, .poll_cq = mlx4_poll_cq, .req_notify_cq = 
mlx4_arm_cq, .cq_event = mlx4_cq_event, .resize_cq = mlx4_resize_cq, .destroy_cq = mlx4_destroy_cq, .create_srq = mlx4_create_srq, .modify_srq = mlx4_modify_srq, .query_srq = mlx4_query_srq, .destroy_srq = mlx4_destroy_srq, .post_srq_recv = mlx4_post_srq_recv, .create_qp = mlx4_create_qp, .query_qp = mlx4_query_qp, .modify_qp = mlx4_modify_qp, .destroy_qp = mlx4_destroy_qp, .post_send = mlx4_post_send, .post_recv = mlx4_post_recv, .create_ah = mlx4_create_ah, .destroy_ah = mlx4_destroy_ah, .attach_mcast = ibv_cmd_attach_mcast, .detach_mcast = ibv_cmd_detach_mcast, .close_xrcd = mlx4_close_xrcd, .create_cq_ex = mlx4_create_cq_ex, .create_flow = mlx4_create_flow, .create_qp_ex = mlx4_create_qp_ex, .create_rwq_ind_table = mlx4_create_rwq_ind_table, .create_srq_ex = mlx4_create_srq_ex, .create_wq = mlx4_create_wq, .destroy_flow = mlx4_destroy_flow, .destroy_rwq_ind_table = mlx4_destroy_rwq_ind_table, .destroy_wq = mlx4_destroy_wq, .get_srq_num = mlx4_get_srq_num, .modify_cq = mlx4_modify_cq, .modify_wq = mlx4_modify_wq, .open_qp = mlx4_open_qp, .open_xrcd = mlx4_open_xrcd, .query_device_ex = mlx4_query_device_ex, .query_rt_values = mlx4_query_rt_values, .free_context = mlx4_free_context, }; static struct verbs_context *mlx4_alloc_context(struct ibv_device *ibdev, int cmd_fd, void *private_data) { struct mlx4_context *context; struct ibv_get_context cmd; struct mlx4_alloc_ucontext_resp resp = {}; int i; struct mlx4_alloc_ucontext_v3_resp resp_v3 = {}; __u16 bf_reg_size; struct mlx4_device *dev = to_mdev(ibdev); struct verbs_context *verbs_ctx; context = verbs_init_and_alloc_context(ibdev, cmd_fd, context, ibv_ctx, RDMA_DRIVER_MLX4); if (!context) return NULL; verbs_ctx = &context->ibv_ctx; if (dev->abi_version <= MLX4_UVERBS_NO_DEV_CAPS_ABI_VERSION) { if (ibv_cmd_get_context(verbs_ctx, &cmd, sizeof(cmd), &resp_v3.ibv_resp, sizeof(resp_v3))) goto failed; context->num_qps = resp_v3.qp_tab_size; bf_reg_size = resp_v3.bf_reg_size; context->cqe_size = sizeof (struct mlx4_cqe); } else { if (ibv_cmd_get_context(verbs_ctx, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) goto failed; context->num_qps = resp.qp_tab_size; bf_reg_size = resp.bf_reg_size; if (resp.dev_caps & MLX4_USER_DEV_CAP_LARGE_CQE) context->cqe_size = resp.cqe_size; else context->cqe_size = sizeof (struct mlx4_cqe); } context->qp_table_shift = ffs(context->num_qps) - 1 - MLX4_QP_TABLE_BITS; context->qp_table_mask = (1 << context->qp_table_shift) - 1; for (i = 0; i < MLX4_PORTS_NUM; ++i) context->port_query_cache[i].valid = 0; pthread_mutex_init(&context->qp_table_mutex, NULL); for (i = 0; i < MLX4_QP_TABLE_SIZE; ++i) context->qp_table[i].refcnt = 0; for (i = 0; i < MLX4_NUM_DB_TYPE; ++i) context->db_list[i] = NULL; mlx4_init_xsrq_table(&context->xsrq_table, context->num_qps); pthread_mutex_init(&context->db_list_mutex, NULL); context->uar_mmap_offset = 0; context->uar = mmap(NULL, dev->page_size, PROT_WRITE, MAP_SHARED, cmd_fd, context->uar_mmap_offset); if (context->uar == MAP_FAILED) goto failed; if (bf_reg_size) { context->bf_page = mmap(NULL, dev->page_size, PROT_WRITE, MAP_SHARED, cmd_fd, dev->page_size); if (context->bf_page == MAP_FAILED) { fprintf(stderr, PFX "Warning: BlueFlame available, " "but failed to mmap() BlueFlame page.\n"); context->bf_page = NULL; context->bf_buf_size = 0; } else { context->bf_buf_size = bf_reg_size / 2; context->bf_offset = 0; pthread_spin_init(&context->bf_lock, PTHREAD_PROCESS_PRIVATE); } } else { context->bf_page = NULL; context->bf_buf_size = 0; } verbs_set_ops(verbs_ctx, &mlx4_ctx_ops); 
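	/* cache device capabilities (e.g. max_qp_wr, max_sge) in the context */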
mlx4_query_device_ctx(dev, context); return verbs_ctx; failed: verbs_uninit_context(&context->ibv_ctx); free(context); return NULL; } static void mlx4_free_context(struct ibv_context *ibv_ctx) { struct mlx4_context *context = to_mctx(ibv_ctx); struct mlx4_device *mdev = to_mdev(ibv_ctx->device); munmap(context->uar, mdev->page_size); if (context->bf_page) munmap(context->bf_page, mdev->page_size); if (context->hca_core_clock) munmap(context->hca_core_clock - context->core_clock.offset, mdev->page_size); verbs_uninit_context(&context->ibv_ctx); free(context); } static void mlx4_uninit_device(struct verbs_device *verbs_device) { struct mlx4_device *dev = to_mdev(&verbs_device->device); free(dev); } static struct verbs_device *mlx4_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct mlx4_device *dev; dev = calloc(1, sizeof *dev); if (!dev) return NULL; dev->page_size = sysconf(_SC_PAGESIZE); dev->abi_version = sysfs_dev->abi_ver; return &dev->verbs_dev; } static const struct verbs_device_ops mlx4_dev_ops = { .name = "mlx4", .match_min_abi_version = MLX4_UVERBS_MIN_ABI_VERSION, .match_max_abi_version = MLX4_UVERBS_MAX_ABI_VERSION, .match_table = hca_table, .alloc_device = mlx4_device_alloc, .uninit_device = mlx4_uninit_device, .alloc_context = mlx4_alloc_context, }; PROVIDER_DRIVER(mlx4, mlx4_dev_ops); static int mlx4dv_get_qp(struct ibv_qp *qp_in, struct mlx4dv_qp *qp_out) { struct mlx4_qp *mqp = to_mqp(qp_in); struct mlx4_context *ctx = to_mctx(qp_in->context); uint64_t mask_out = 0; qp_out->buf.buf = mqp->buf.buf; qp_out->buf.length = mqp->buf.length; qp_out->rdb = mqp->db; qp_out->sdb = (uint32_t *) (ctx->uar + MLX4_SEND_DOORBELL); qp_out->doorbell_qpn = mqp->doorbell_qpn; qp_out->sq.wqe_cnt = mqp->sq.wqe_cnt; qp_out->sq.wqe_shift = mqp->sq.wqe_shift; qp_out->sq.offset = mqp->sq.offset; qp_out->rq.wqe_cnt = mqp->rq.wqe_cnt; qp_out->rq.wqe_shift = mqp->rq.wqe_shift; qp_out->rq.offset = mqp->rq.offset; if (qp_out->comp_mask & MLX4DV_QP_MASK_UAR_MMAP_OFFSET) { qp_out->uar_mmap_offset = ctx->uar_mmap_offset; mask_out |= MLX4DV_QP_MASK_UAR_MMAP_OFFSET; } qp_out->comp_mask = mask_out; return 0; } static int mlx4dv_get_cq(struct ibv_cq *cq_in, struct mlx4dv_cq *cq_out) { struct mlx4_cq *mcq = to_mcq(cq_in); struct mlx4_context *mctx = to_mctx(cq_in->context); uint64_t mask_out = 0; cq_out->buf.buf = mcq->buf.buf; cq_out->buf.length = mcq->buf.length; cq_out->cqn = mcq->cqn; cq_out->set_ci_db = mcq->set_ci_db; cq_out->arm_db = mcq->arm_db; cq_out->arm_sn = mcq->arm_sn; cq_out->cqe_size = mcq->cqe_size; cq_out->cqe_cnt = mcq->verbs_cq.cq.cqe + 1; mcq->flags |= MLX4_CQ_FLAGS_DV_OWNED; if (cq_out->comp_mask & MLX4DV_CQ_MASK_UAR) { cq_out->cq_uar = mctx->uar; mask_out |= MLX4DV_CQ_MASK_UAR; } cq_out->comp_mask = mask_out; return 0; } static int mlx4dv_get_srq(struct ibv_srq *srq_in, struct mlx4dv_srq *srq_out) { struct mlx4_srq *msrq = to_msrq(srq_in); srq_out->comp_mask = 0; srq_out->buf.buf = msrq->buf.buf; srq_out->buf.length = msrq->buf.length; srq_out->wqe_shift = msrq->wqe_shift; srq_out->head = msrq->head; srq_out->tail = msrq->tail; srq_out->db = msrq->db; return 0; } static int mlx4dv_get_rwq(struct ibv_wq *wq_in, struct mlx4dv_rwq *wq_out) { struct mlx4_qp *mqp = wq_to_mqp(wq_in); wq_out->comp_mask = 0; wq_out->buf.buf = mqp->buf.buf; wq_out->buf.length = mqp->buf.length; wq_out->rdb = mqp->db; wq_out->rq.wqe_cnt = mqp->rq.wqe_cnt; wq_out->rq.wqe_shift = mqp->rq.wqe_shift; wq_out->rq.offset = mqp->rq.offset; return 0; } int mlx4dv_init_obj(struct mlx4dv_obj *obj, uint64_t obj_type) { 
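	/*
	 * obj_type is a bitmask of enum mlx4dv_obj_type values; conversion
	 * stops at the first object type that fails to translate.
	 */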
int ret = 0; if (obj_type & MLX4DV_OBJ_QP) ret = mlx4dv_get_qp(obj->qp.in, obj->qp.out); if (!ret && (obj_type & MLX4DV_OBJ_CQ)) ret = mlx4dv_get_cq(obj->cq.in, obj->cq.out); if (!ret && (obj_type & MLX4DV_OBJ_SRQ)) ret = mlx4dv_get_srq(obj->srq.in, obj->srq.out); if (!ret && (obj_type & MLX4DV_OBJ_RWQ)) ret = mlx4dv_get_rwq(obj->rwq.in, obj->rwq.out); return ret; } int mlx4dv_query_device(struct ibv_context *ctx_in, struct mlx4dv_context *attrs_out) { struct mlx4_context *mctx = to_mctx(ctx_in); attrs_out->version = 0; attrs_out->comp_mask = 0; attrs_out->max_inl_recv_sz = mctx->max_inl_recv_sz; return 0; } int mlx4dv_set_context_attr(struct ibv_context *context, enum mlx4dv_set_ctx_attr_type attr_type, void *attr) { struct mlx4_context *ctx = to_mctx(context); switch (attr_type) { case MLX4DV_SET_CTX_ATTR_LOG_WQS_RANGE_SZ: ctx->log_wqs_range_sz = *((uint8_t *)attr); break; case MLX4DV_SET_CTX_ATTR_BUF_ALLOCATORS: ctx->extern_alloc = *((struct mlx4dv_ctx_allocators *)attr); break; default: return ENOTSUP; } return 0; } rdma-core-56.1/providers/mlx4/mlx4.conf000066400000000000000000000017541477342711600177730ustar00rootroot00000000000000# This file is intended for users to select the various module options # they need for the mlx4 driver. On upgrade of the rdma package, # any user-made changes to this file are preserved. Any changes made # to the libmlx4.conf file in this directory are overwritten on # package upgrade. # # Some sample options and what they would do # Enable debugging output, device managed flow steering, and disable SRIOV #options mlx4_core debug_level=1 log_num_mgm_entry_size=-1 probe_vf=0 num_vfs=0 # # Enable debugging output and create SRIOV devices, but don't attach any of # the child devices to the host, only the parent device #options mlx4_core debug_level=1 probe_vf=0 num_vfs=7 # # Enable debugging output, SRIOV, and attach one of the SRIOV child devices # in addition to the parent device to the host #options mlx4_core debug_level=1 probe_vf=1 num_vfs=7 # # Enable per priority flow control for send and receive, setting both priority # 1 and 2 as no-drop priorities #options mlx4_en pfctx=3 pfcrx=3 rdma-core-56.1/providers/mlx4/mlx4.h000066400000000000000000000312641477342711600172760ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005, 2006, 2007 Cisco Systems. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef MLX4_H #define MLX4_H #include #include #include #include #include #include #include #include "mlx4dv.h" #define MLX4_PORTS_NUM 2 #include #define PFX "mlx4: " enum { MLX4_STAT_RATE_OFFSET = 5 }; enum { MLX4_QP_TABLE_BITS = 8, MLX4_QP_TABLE_SIZE = 1 << MLX4_QP_TABLE_BITS, MLX4_QP_TABLE_MASK = MLX4_QP_TABLE_SIZE - 1 }; #define MLX4_REMOTE_SRQN_FLAGS(wr) htobe32(wr->qp_type.xrc.remote_srqn << 8) enum { MLX4_XSRQ_TABLE_BITS = 8, MLX4_XSRQ_TABLE_SIZE = 1 << MLX4_XSRQ_TABLE_BITS, MLX4_XSRQ_TABLE_MASK = MLX4_XSRQ_TABLE_SIZE - 1 }; struct mlx4_xsrq_table { struct { struct mlx4_srq **table; int refcnt; } xsrq_table[MLX4_XSRQ_TABLE_SIZE]; pthread_mutex_t mutex; int num_xsrq; int shift; int mask; }; enum { MLX4_XRC_QPN_BIT = (1 << 23) }; enum mlx4_db_type { MLX4_DB_TYPE_CQ, MLX4_DB_TYPE_RQ, MLX4_NUM_DB_TYPE }; struct mlx4_device { struct verbs_device verbs_dev; int page_size; int abi_version; }; struct mlx4_db_page; struct mlx4_context { struct verbs_context ibv_ctx; void *uar; off_t uar_mmap_offset; void *bf_page; int bf_buf_size; int bf_offset; pthread_spinlock_t bf_lock; struct { struct mlx4_qp **table; int refcnt; } qp_table[MLX4_QP_TABLE_SIZE]; pthread_mutex_t qp_table_mutex; int num_qps; int qp_table_shift; int qp_table_mask; int max_qp_wr; int max_sge; struct mlx4_db_page *db_list[MLX4_NUM_DB_TYPE]; pthread_mutex_t db_list_mutex; int cqe_size; struct mlx4_xsrq_table xsrq_table; struct { uint8_t valid; uint8_t link_layer; uint8_t flags; enum ibv_port_cap_flags caps; } port_query_cache[MLX4_PORTS_NUM]; struct { uint64_t offset; uint8_t offset_valid; } core_clock; void *hca_core_clock; uint32_t max_inl_recv_sz; uint8_t log_wqs_range_sz; struct mlx4dv_ctx_allocators extern_alloc; }; struct mlx4_buf { void *buf; size_t length; }; struct mlx4_pd { struct ibv_pd ibv_pd; uint32_t pdn; }; enum { MLX4_CQ_FLAGS_RX_CSUM_VALID = 1 << 0, MLX4_CQ_FLAGS_EXTENDED = 1 << 1, MLX4_CQ_FLAGS_SINGLE_THREADED = 1 << 2, MLX4_CQ_FLAGS_DV_OWNED = 1 << 3, }; struct mlx4_cq { struct verbs_cq verbs_cq; struct mlx4_buf buf; struct mlx4_buf resize_buf; pthread_spinlock_t lock; uint32_t cqn; uint32_t cons_index; __be32 *set_ci_db; __be32 *arm_db; int arm_sn; int cqe_size; struct mlx4_qp *cur_qp; struct mlx4_cqe *cqe; uint32_t flags; }; struct mlx4_srq { struct verbs_srq verbs_srq; struct mlx4_buf buf; pthread_spinlock_t lock; uint64_t *wrid; uint32_t srqn; int max; int max_gs; int wqe_shift; int head; int tail; __be32 *db; uint16_t counter; uint8_t ext_srq; }; struct mlx4_wq { uint64_t *wrid; pthread_spinlock_t lock; int wqe_cnt; int max_post; unsigned head; unsigned tail; int max_gs; int wqe_shift; int offset; }; enum mlx4_rsc_type { MLX4_RSC_TYPE_QP = 0, MLX4_RSC_TYPE_RSS_QP = 1, MLX4_RSC_TYPE_SRQ = 2, }; struct mlx4_qp { union { struct verbs_qp verbs_qp; struct ibv_wq wq; }; struct mlx4_buf buf; int max_inline_data; int buf_size; __be32 doorbell_qpn; __be32 sq_signal_bits; int sq_spare_wqes; struct mlx4_wq sq; __be32 *db; struct mlx4_wq rq; uint8_t link_layer; uint8_t type; /* enum mlx4_rsc_type */ uint32_t qp_cap_cache; uint32_t qpn_cache; }; struct mlx4_ah { struct ibv_ah ibv_ah; struct mlx4_av av; uint16_t vlan; uint8_t mac[ETHERNET_LL_SIZE]; }; enum { MLX4_CSUM_SUPPORT_UD_OVER_IB = (1 << 0), MLX4_CSUM_SUPPORT_RAW_OVER_ETH = (1 << 1), /* Only 
report rx checksum when the validation is valid */ MLX4_RX_CSUM_VALID = (1 << 16), }; #define to_mxxx(xxx, type) \ container_of(ib##xxx, struct mlx4_##type, ibv_##xxx) static inline struct mlx4_device *to_mdev(struct ibv_device *ibdev) { /* ibv_device is first field of verbs_device * see try_driver() in libibverbs. */ return container_of(ibdev, struct mlx4_device, verbs_dev.device); } static inline struct mlx4_context *to_mctx(struct ibv_context *ibctx) { return container_of(ibctx, struct mlx4_context, ibv_ctx.context); } static inline struct mlx4_pd *to_mpd(struct ibv_pd *ibpd) { return to_mxxx(pd, pd); } static inline struct mlx4_cq *to_mcq(struct ibv_cq *ibcq) { return container_of(ibcq, struct mlx4_cq, verbs_cq.cq); } static inline struct mlx4_srq *to_msrq(struct ibv_srq *ibsrq) { return container_of(ibsrq, struct mlx4_srq, verbs_srq.srq); } static inline struct mlx4_qp *to_mqp(struct ibv_qp *ibqp) { return container_of(ibqp, struct mlx4_qp, verbs_qp.qp); } static inline struct mlx4_qp *wq_to_mqp(struct ibv_wq *ibwq) { return container_of(ibwq, struct mlx4_qp, wq); } static inline struct mlx4_ah *to_mah(struct ibv_ah *ibah) { return to_mxxx(ah, ah); } static inline void mlx4_update_cons_index(struct mlx4_cq *cq) { *cq->set_ci_db = htobe32(cq->cons_index & 0xffffff); } int mlx4_alloc_buf(struct mlx4_context *ctx, struct mlx4_buf *buf, size_t size, int page_size); void mlx4_free_buf(struct mlx4_context *ctx, struct mlx4_buf *buf); __be32 *mlx4_alloc_db(struct mlx4_context *context, enum mlx4_db_type type); void mlx4_free_db(struct mlx4_context *context, enum mlx4_db_type type, __be32 *db); void mlx4_query_device_ctx(struct mlx4_device *mdev, struct mlx4_context *mctx); int mlx4_query_device_ex(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); int mlx4_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr); int mlx4_query_rt_values(struct ibv_context *context, struct ibv_values_ex *values); struct ibv_pd *mlx4_alloc_pd(struct ibv_context *context); int mlx4_free_pd(struct ibv_pd *pd); struct ibv_xrcd *mlx4_open_xrcd(struct ibv_context *context, struct ibv_xrcd_init_attr *attr); int mlx4_close_xrcd(struct ibv_xrcd *xrcd); int mlx4_get_srq_num(struct ibv_srq *srq, uint32_t *srq_num); struct ibv_mr *mlx4_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access); int mlx4_rereg_mr(struct verbs_mr *vmr, int flags, struct ibv_pd *pd, void *addr, size_t length, int access); int mlx4_dereg_mr(struct verbs_mr *vmr); struct ibv_mw *mlx4_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type); int mlx4_dealloc_mw(struct ibv_mw *mw); int mlx4_bind_mw(struct ibv_qp *qp, struct ibv_mw *mw, struct ibv_mw_bind *mw_bind); struct ibv_cq *mlx4_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); struct ibv_cq_ex *mlx4_create_cq_ex(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr); void mlx4_cq_fill_pfns(struct mlx4_cq *cq, const struct ibv_cq_init_attr_ex *cq_attr); int mlx4_alloc_cq_buf(struct mlx4_device *dev, struct mlx4_context *ctx, struct mlx4_buf *buf, int nent, int entry_size); int mlx4_resize_cq(struct ibv_cq *cq, int cqe); int mlx4_modify_cq(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr); int mlx4_destroy_cq(struct ibv_cq *cq); int mlx4_poll_cq(struct ibv_cq *cq, int ne, struct ibv_wc *wc); int mlx4_arm_cq(struct ibv_cq *cq, int solicited); void mlx4_cq_event(struct ibv_cq *cq); void __mlx4_cq_clean(struct 
mlx4_cq *cq, uint32_t qpn, struct mlx4_srq *srq); void mlx4_cq_clean(struct mlx4_cq *cq, uint32_t qpn, struct mlx4_srq *srq); int mlx4_get_outstanding_cqes(struct mlx4_cq *cq); void mlx4_cq_resize_copy_cqes(struct mlx4_cq *cq, void *buf, int new_cqe); struct ibv_srq *mlx4_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr); struct ibv_srq *mlx4_create_srq_ex(struct ibv_context *context, struct ibv_srq_init_attr_ex *attr_ex); struct ibv_srq *mlx4_create_xrc_srq(struct ibv_context *context, struct ibv_srq_init_attr_ex *attr_ex); int mlx4_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, int mask); int mlx4_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr); int mlx4_destroy_srq(struct ibv_srq *srq); int mlx4_destroy_xrc_srq(struct ibv_srq *srq); int mlx4_alloc_srq_buf(struct ibv_pd *pd, struct ibv_srq_attr *attr, struct mlx4_srq *srq); void mlx4_init_xsrq_table(struct mlx4_xsrq_table *xsrq_table, int size); struct mlx4_srq *mlx4_find_xsrq(struct mlx4_xsrq_table *xsrq_table, uint32_t srqn); int mlx4_store_xsrq(struct mlx4_xsrq_table *xsrq_table, uint32_t srqn, struct mlx4_srq *srq); void mlx4_clear_xsrq(struct mlx4_xsrq_table *xsrq_table, uint32_t srqn); void mlx4_free_srq_wqe(struct mlx4_srq *srq, int ind); int mlx4_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); struct ibv_qp *mlx4_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); struct ibv_qp *mlx4_create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr); struct ibv_qp *mlx4_open_qp(struct ibv_context *context, struct ibv_qp_open_attr *attr); int mlx4_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); int mlx4_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask); int mlx4_destroy_qp(struct ibv_qp *qp); void mlx4_init_qp_indices(struct mlx4_qp *qp); void mlx4_qp_init_sq_ownership(struct mlx4_qp *qp); int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr); int mlx4_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); void mlx4_calc_sq_wqe_size(struct ibv_qp_cap *cap, enum ibv_qp_type type, struct mlx4_qp *qp, struct ibv_qp_init_attr_ex *attr); int mlx4_alloc_qp_buf(struct ibv_context *context, uint32_t max_recv_sge, enum ibv_qp_type type, struct mlx4_qp *qp, struct mlx4dv_qp_init_attr *mlx4qp_attr); void mlx4_set_sq_sizes(struct mlx4_qp *qp, struct ibv_qp_cap *cap, enum ibv_qp_type type); struct mlx4_qp *mlx4_find_qp(struct mlx4_context *ctx, uint32_t qpn); int mlx4_store_qp(struct mlx4_context *ctx, uint32_t qpn, struct mlx4_qp *qp); void mlx4_clear_qp(struct mlx4_context *ctx, uint32_t qpn); struct ibv_ah *mlx4_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr); int mlx4_destroy_ah(struct ibv_ah *ah); int mlx4_alloc_av(struct mlx4_pd *pd, struct ibv_ah_attr *attr, struct mlx4_ah *ah); void mlx4_free_av(struct mlx4_ah *ah); struct ibv_wq *mlx4_create_wq(struct ibv_context *context, struct ibv_wq_init_attr *attr); int mlx4_modify_wq(struct ibv_wq *wq, struct ibv_wq_attr *attr); int mlx4_destroy_wq(struct ibv_wq *wq); struct ibv_rwq_ind_table *mlx4_create_rwq_ind_table(struct ibv_context *context, struct ibv_rwq_ind_table_init_attr *init_attr); int mlx4_destroy_rwq_ind_table(struct ibv_rwq_ind_table *rwq_ind_table); int mlx4_post_wq_recv(struct ibv_wq *ibwq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); struct ibv_flow *mlx4_create_flow(struct ibv_qp *qp, struct ibv_flow_attr *flow_attr); int 
mlx4_destroy_flow(struct ibv_flow *flow_id); #endif /* MLX4_H */ rdma-core-56.1/providers/mlx4/mlx4dv.h000066400000000000000000000321451477342711600176270ustar00rootroot00000000000000/* * Copyright (c) 2017 Mellanox Technologies, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef _MLX4DV_H_ #define _MLX4DV_H_ #include #include #include #include #ifdef __cplusplus extern "C" { #endif /* Always inline the functions */ #ifdef __GNUC__ #define MLX4DV_ALWAYS_INLINE inline __attribute__((always_inline)) #else #define MLX4DV_ALWAYS_INLINE inline #endif enum { MLX4_OPCODE_NOP = 0x00, MLX4_OPCODE_SEND_INVAL = 0x01, MLX4_OPCODE_RDMA_WRITE = 0x08, MLX4_OPCODE_RDMA_WRITE_IMM = 0x09, MLX4_OPCODE_SEND = 0x0a, MLX4_OPCODE_SEND_IMM = 0x0b, MLX4_OPCODE_LSO = 0x0e, MLX4_OPCODE_RDMA_READ = 0x10, MLX4_OPCODE_ATOMIC_CS = 0x11, MLX4_OPCODE_ATOMIC_FA = 0x12, MLX4_OPCODE_MASKED_ATOMIC_CS = 0x14, MLX4_OPCODE_MASKED_ATOMIC_FA = 0x15, MLX4_OPCODE_BIND_MW = 0x18, MLX4_OPCODE_FMR = 0x19, MLX4_OPCODE_LOCAL_INVAL = 0x1b, MLX4_OPCODE_CONFIG_CMD = 0x1f, MLX4_RECV_OPCODE_RDMA_WRITE_IMM = 0x00, MLX4_RECV_OPCODE_SEND = 0x01, MLX4_RECV_OPCODE_SEND_IMM = 0x02, MLX4_RECV_OPCODE_SEND_INVAL = 0x03, MLX4_CQE_OPCODE_ERROR = 0x1e, MLX4_CQE_OPCODE_RESIZE = 0x16, }; enum { MLX4_CQ_DOORBELL = 0x20 }; #define MLX4_CQ_DB_REQ_NOT_SOL (1 << 24) #define MLX4_CQ_DB_REQ_NOT (2 << 24) enum { MLX4_CQE_VLAN_PRESENT_MASK = 1 << 29, MLX4_CQE_QPN_MASK = 0xffffff, }; enum { MLX4_CQE_OWNER_MASK = 0x80, MLX4_CQE_IS_SEND_MASK = 0x40, MLX4_CQE_OPCODE_MASK = 0x1f }; enum { MLX4_CQE_SYNDROME_LOCAL_LENGTH_ERR = 0x01, MLX4_CQE_SYNDROME_LOCAL_QP_OP_ERR = 0x02, MLX4_CQE_SYNDROME_LOCAL_PROT_ERR = 0x04, MLX4_CQE_SYNDROME_WR_FLUSH_ERR = 0x05, MLX4_CQE_SYNDROME_MW_BIND_ERR = 0x06, MLX4_CQE_SYNDROME_BAD_RESP_ERR = 0x10, MLX4_CQE_SYNDROME_LOCAL_ACCESS_ERR = 0x11, MLX4_CQE_SYNDROME_REMOTE_INVAL_REQ_ERR = 0x12, MLX4_CQE_SYNDROME_REMOTE_ACCESS_ERR = 0x13, MLX4_CQE_SYNDROME_REMOTE_OP_ERR = 0x14, MLX4_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR = 0x15, MLX4_CQE_SYNDROME_RNR_RETRY_EXC_ERR = 0x16, MLX4_CQE_SYNDROME_REMOTE_ABORTED_ERR = 0x22, }; struct mlx4_err_cqe { uint32_t vlan_my_qpn; uint32_t reserved1[5]; uint16_t wqe_index; uint8_t vendor_err; uint8_t syndrome; uint8_t reserved2[3]; uint8_t owner_sr_opcode; }; enum mlx4_cqe_status { 
MLX4_CQE_STATUS_TCP_UDP_CSUM_OK = (1 << 2), MLX4_CQE_STATUS_IPV4_PKT = (1 << 22), MLX4_CQE_STATUS_IP_HDR_CSUM_OK = (1 << 28), MLX4_CQE_STATUS_IPV4_CSUM_OK = MLX4_CQE_STATUS_IPV4_PKT | MLX4_CQE_STATUS_IP_HDR_CSUM_OK | MLX4_CQE_STATUS_TCP_UDP_CSUM_OK }; struct mlx4_cqe { __be32 vlan_my_qpn; __be32 immed_rss_invalid; __be32 g_mlpath_rqpn; union { struct { __be16 sl_vid; __be16 rlid; }; __be32 ts_47_16; }; __be32 status; __be32 byte_cnt; __be16 wqe_index; __be16 checksum; uint8_t reserved3; uint8_t ts_15_8; uint8_t ts_7_0; uint8_t owner_sr_opcode; }; enum mlx4dv_qp_comp_mask { MLX4DV_QP_MASK_UAR_MMAP_OFFSET = 1 << 0, }; struct mlx4dv_qp { __be32 *rdb; uint32_t *sdb; __be32 doorbell_qpn; struct { uint32_t wqe_cnt; int wqe_shift; int offset; } sq; struct { uint32_t wqe_cnt; int wqe_shift; int offset; } rq; struct { void *buf; size_t length; } buf; uint64_t comp_mask; off_t uar_mmap_offset; }; enum mlx4dv_cq_comp_mask { MLX4DV_CQ_MASK_UAR = 1 << 0, }; struct mlx4dv_cq { struct { void *buf; size_t length; } buf; uint32_t cqe_cnt; uint32_t cqn; __be32 *set_ci_db; __be32 *arm_db; int arm_sn; int cqe_size; uint64_t comp_mask; void *cq_uar; }; struct mlx4dv_srq { struct { void *buf; size_t length; } buf; int wqe_shift; int head; int tail; __be32 *db; uint64_t comp_mask; }; struct mlx4dv_rwq { __be32 *rdb; struct { uint32_t wqe_cnt; int wqe_shift; int offset; } rq; struct { void *buf; size_t length; } buf; uint64_t comp_mask; }; struct mlx4dv_obj { struct { struct ibv_qp *in; struct mlx4dv_qp *out; } qp; struct { struct ibv_cq *in; struct mlx4dv_cq *out; } cq; struct { struct ibv_srq *in; struct mlx4dv_srq *out; } srq; struct { struct ibv_wq *in; struct mlx4dv_rwq *out; } rwq; }; enum mlx4dv_obj_type { MLX4DV_OBJ_QP = 1 << 0, MLX4DV_OBJ_CQ = 1 << 1, MLX4DV_OBJ_SRQ = 1 << 2, MLX4DV_OBJ_RWQ = 1 << 3, }; /* * This function initializes mlx4dv_xxx structs based on the supplied type. * The information for initialization is taken from the ibv_xx structs * supplied as input. * * Requesting information for a CQ marks it as owned by DV for all * consumer-index related actions. * * The initialization type can be a combination of several types. * * Return: 0 in case of success.
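 *
 * A minimal usage sketch (hypothetical names; assumes an ibv_qp *qp that
 * was created on an mlx4 device):
 *
 *	struct mlx4dv_qp dv_qp = {};
 *	struct mlx4dv_obj obj = {};
 *
 *	dv_qp.comp_mask = MLX4DV_QP_MASK_UAR_MMAP_OFFSET;
 *	obj.qp.in = qp;
 *	obj.qp.out = &dv_qp;
 *	if (mlx4dv_init_obj(&obj, MLX4DV_OBJ_QP))
 *		return;
 *
 * On success, dv_qp.buf, dv_qp.sdb and friends may be used directly.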
*/ int mlx4dv_init_obj(struct mlx4dv_obj *obj, uint64_t obj_type); static MLX4DV_ALWAYS_INLINE uint8_t mlx4dv_get_cqe_owner(struct mlx4_cqe *cqe) { return cqe->owner_sr_opcode & MLX4_CQE_OWNER_MASK; } static MLX4DV_ALWAYS_INLINE void mlx4dv_set_cqe_owner(struct mlx4_cqe *cqe, uint8_t val) { cqe->owner_sr_opcode = (val & MLX4_CQE_OWNER_MASK) | (cqe->owner_sr_opcode & ~MLX4_CQE_OWNER_MASK); } static MLX4DV_ALWAYS_INLINE uint8_t mlx4dv_get_cqe_opcode(struct mlx4_cqe *cqe) { return cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK; } /* * WQE related part */ enum { MLX4_SEND_DOORBELL = 0x14, }; enum { MLX4_WQE_CTRL_SOLICIT = 1 << 1, MLX4_WQE_CTRL_CQ_UPDATE = 3 << 2, MLX4_WQE_CTRL_IP_HDR_CSUM = 1 << 4, MLX4_WQE_CTRL_TCP_UDP_CSUM = 1 << 5, MLX4_WQE_CTRL_FENCE = 1 << 6, MLX4_WQE_CTRL_STRONG_ORDER = 1 << 7 }; enum { MLX4_WQE_BIND_TYPE_2 = (1UL<<31), MLX4_WQE_BIND_ZERO_BASED = (1<<30), }; enum { MLX4_INLINE_SEG = 1UL << 31, MLX4_INLINE_ALIGN = 64, }; enum { MLX4_INVALID_LKEY = 0x100, }; enum { MLX4_WQE_MW_REMOTE_READ = 1 << 29, MLX4_WQE_MW_REMOTE_WRITE = 1 << 30, MLX4_WQE_MW_ATOMIC = 1UL << 31 }; struct mlx4_wqe_local_inval_seg { uint64_t reserved1; __be32 mem_key; uint32_t reserved2; uint64_t reserved3[2]; }; struct mlx4_wqe_bind_seg { __be32 flags1; __be32 flags2; __be32 new_rkey; __be32 lkey; __be64 addr; __be64 length; }; struct mlx4_wqe_ctrl_seg { __be32 owner_opcode; union { struct { uint8_t reserved[3]; uint8_t fence_size; }; __be32 bf_qpn; }; /* * High 24 bits are SRC remote buffer; low 8 bits are flags: * [7] SO (strong ordering) * [5] TCP/UDP checksum * [4] IP checksum * [3:2] C (generate completion queue entry) * [1] SE (solicited event) * [0] FL (force loopback) */ union { __be32 srcrb_flags; __be16 srcrb_flags16[2]; }; /* * imm is immediate data for send/RDMA write w/ immediate; * also invalidation key for send with invalidate; input * modifier for WQEs on CCQs. */ __be32 imm; }; struct mlx4_av { __be32 port_pd; uint8_t reserved1; uint8_t g_slid; __be16 dlid; uint8_t reserved2; uint8_t gid_index; uint8_t stat_rate; uint8_t hop_limit; __be32 sl_tclass_flowlabel; uint8_t dgid[16]; }; struct mlx4_wqe_datagram_seg { struct mlx4_av av; __be32 dqpn; __be32 qkey; __be16 vlan; uint8_t mac[ETHERNET_LL_SIZE]; }; struct mlx4_wqe_data_seg { __be32 byte_count; __be32 lkey; __be64 addr; }; struct mlx4_wqe_inline_seg { __be32 byte_count; }; struct mlx4_wqe_srq_next_seg { uint16_t reserved1; __be16 next_wqe_index; uint32_t reserved2[3]; }; struct mlx4_wqe_raddr_seg { __be64 raddr; __be32 rkey; __be32 reserved; }; struct mlx4_wqe_lso_seg { __be32 mss_hdr_size; __be32 header[0]; }; struct mlx4_wqe_atomic_seg { __be64 swap_add; __be64 compare; }; enum mlx4dv_qp_init_attr_mask { MLX4DV_QP_INIT_ATTR_MASK_INL_RECV = 1 << 0, MLX4DV_QP_INIT_ATTR_MASK_RESERVED = 1 << 1, }; struct mlx4dv_qp_init_attr { uint64_t comp_mask; /* Use enum mlx4dv_qp_init_attr_mask */ uint32_t inl_recv_sz; }; struct ibv_qp *mlx4dv_create_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr, struct mlx4dv_qp_init_attr *mlx4_qp_attr); /* * Direct verbs device-specific attributes */ struct mlx4dv_context { uint8_t version; uint32_t max_inl_recv_sz; uint64_t comp_mask; }; /* * Control segment - contains some control information for the current WQE. * * Output: * seg - control segment to be filled * Input: * owner_opcode - Opcode of this WQE (Encodes the type of operation * to be executed on the QP) and owner bit. * wqe_cnt - Number of queue entries. * ind - WQEBB number of the first block of this WQE. 
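 * (wqe_cnt and ind are not passed separately; the caller folds them into
 * owner_opcode as the ownership bit, i.e. bit 31 is set when
 * ind & wqe_cnt is non-zero, exactly as mlx4_post_send() does.)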
* fence_size - Fence bit and WQE size in octowords. * srcrb_flags - High 24 bits are SRC remote buffer; low 8 bits are * flags which are described in the mlx4_wqe_ctrl_seg struct. * imm - Immediate data/Invalidation key. */ static MLX4DV_ALWAYS_INLINE void mlx4dv_set_ctrl_seg(struct mlx4_wqe_ctrl_seg *seg, uint32_t owner_opcode, uint8_t fence_size, uint32_t srcrb_flags, uint32_t imm) { seg->owner_opcode = htobe32(owner_opcode); seg->fence_size = fence_size; seg->srcrb_flags = htobe32(srcrb_flags); /* * The caller should prepare "imm" in advance based on WR opcode. * For IBV_WR_SEND_WITH_IMM and IBV_WR_RDMA_WRITE_WITH_IMM, * the "imm" should be assigned as is. * For the IBV_WR_SEND_WITH_INV, it should be htobe32(imm). */ seg->imm = imm; } /* * Datagram Segment - contains address information required in order * to form a datagram message. * * Output: * seg - datagram segment to be filled. * Input: * port_pd - Port number and protection domain. * g_slid - GRH and source LID for IB port only. * dlid - Remote LID. * gid_index - Index to port GID table. * stat_rate - Maximum static rate control. * hop_limit - IPv6 hop limit. * sl_tclass_flowlabel - Service Level, IPv6 TClass and flow label. * dgid - Remote GID for IB port only. * dqpn - Destination QP. * qkey - QKey. * vlan - VLAN for RAW ETHERNET QP only. * mac - Destination MAC for RAW ETHERNET QP only. */ static MLX4DV_ALWAYS_INLINE void mlx4dv_set_dgram_seg(struct mlx4_wqe_datagram_seg *seg, uint32_t port_pd, uint8_t g_slid, uint16_t dlid, uint8_t gid_index, uint8_t stat_rate, uint8_t hop_limit, uint32_t sl_tclass_flowlabel, uint8_t *dgid, uint32_t dqpn, uint32_t qkey, uint16_t vlan, uint8_t *mac) { seg->av.port_pd = htobe32(port_pd); seg->av.g_slid = g_slid; seg->av.dlid = htobe16(dlid); seg->av.gid_index = gid_index; seg->av.stat_rate = stat_rate; seg->av.hop_limit = hop_limit; seg->av.sl_tclass_flowlabel = htobe32(sl_tclass_flowlabel); memcpy(seg->av.dgid, dgid, 16); seg->dqpn = htobe32(dqpn); seg->qkey = htobe32(qkey); seg->vlan = htobe16(vlan); memcpy(seg->mac, mac, ETHERNET_LL_SIZE); } /* * Data Segments - contain pointers and a byte count for the scatter/gather list. * They can optionally contain data, which will save a memory read access for * gather Work Requests. */ static MLX4DV_ALWAYS_INLINE void mlx4dv_set_data_seg(struct mlx4_wqe_data_seg *seg, uint32_t length, uint32_t lkey, uintptr_t address) { seg->byte_count = htobe32(length); seg->lkey = htobe32(lkey); seg->addr = htobe64(address); } /* Most device capabilities are exported by ibv_query_device(...), * but there is HW device-specific information which is important * for the data path but isn't provided there. * * Return 0 on success. */ int mlx4dv_query_device(struct ibv_context *ctx_in, struct mlx4dv_context *attrs_out); enum mlx4dv_set_ctx_attr_type { /* Attribute type uint8_t */ MLX4DV_SET_CTX_ATTR_LOG_WQS_RANGE_SZ = 0, MLX4DV_SET_CTX_ATTR_BUF_ALLOCATORS = 1, }; struct mlx4dv_ctx_allocators { void *(*alloc)(size_t size, void *priv_data); void (*free)(void *ptr, void *priv_data); void *data; }; /* * Returns 0 on success, or the value of errno on failure * (which indicates the failure reason). */ int mlx4dv_set_context_attr(struct ibv_context *context, enum mlx4dv_set_ctx_attr_type attr_type, void *attr); #ifdef __cplusplus } #endif #endif /* _MLX4DV_H_ */ rdma-core-56.1/providers/mlx4/qp.c000066400000000000000000000526331477342711600170260ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005 Mellanox Technologies Ltd.
All rights reserved. * Copyright (c) 2007 Cisco, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include "mlx4.h" static const uint32_t mlx4_ib_opcode[] = { [IBV_WR_SEND] = MLX4_OPCODE_SEND, [IBV_WR_SEND_WITH_IMM] = MLX4_OPCODE_SEND_IMM, [IBV_WR_RDMA_WRITE] = MLX4_OPCODE_RDMA_WRITE, [IBV_WR_RDMA_WRITE_WITH_IMM] = MLX4_OPCODE_RDMA_WRITE_IMM, [IBV_WR_RDMA_READ] = MLX4_OPCODE_RDMA_READ, [IBV_WR_ATOMIC_CMP_AND_SWP] = MLX4_OPCODE_ATOMIC_CS, [IBV_WR_ATOMIC_FETCH_AND_ADD] = MLX4_OPCODE_ATOMIC_FA, [IBV_WR_LOCAL_INV] = MLX4_OPCODE_LOCAL_INVAL, [IBV_WR_BIND_MW] = MLX4_OPCODE_BIND_MW, [IBV_WR_SEND_WITH_INV] = MLX4_OPCODE_SEND_INVAL, }; static void *get_recv_wqe(struct mlx4_qp *qp, int n) { return qp->buf.buf + qp->rq.offset + (n << qp->rq.wqe_shift); } static void *get_send_wqe(struct mlx4_qp *qp, int n) { return qp->buf.buf + qp->sq.offset + (n << qp->sq.wqe_shift); } /* * Stamp a SQ WQE so that it is invalid if prefetched by marking the * first four bytes of every 64 byte chunk with 0xffffffff, except for * the very first chunk of the WQE. 
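 * (The very first chunk is left alone because it begins with the control
 * segment, whose ownership bit already tells the HCA whether the WQE is
 * valid.)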
*/ static void stamp_send_wqe(struct mlx4_qp *qp, int n) { uint32_t *wqe = get_send_wqe(qp, n); int i; int ds = (((struct mlx4_wqe_ctrl_seg *)wqe)->fence_size & 0x3f) << 2; for (i = 16; i < ds; i += 16) wqe[i] = 0xffffffff; } void mlx4_init_qp_indices(struct mlx4_qp *qp) { qp->sq.head = 0; qp->sq.tail = 0; qp->rq.head = 0; qp->rq.tail = 0; } void mlx4_qp_init_sq_ownership(struct mlx4_qp *qp) { struct mlx4_wqe_ctrl_seg *ctrl; int i; for (i = 0; i < qp->sq.wqe_cnt; ++i) { ctrl = get_send_wqe(qp, i); ctrl->owner_opcode = htobe32(1 << 31); ctrl->fence_size = 1 << (qp->sq.wqe_shift - 4); stamp_send_wqe(qp, i); } } static int wq_overflow(struct mlx4_wq *wq, int nreq, struct mlx4_cq *cq) { unsigned cur; cur = wq->head - wq->tail; if (cur + nreq < wq->max_post) return 0; pthread_spin_lock(&cq->lock); cur = wq->head - wq->tail; pthread_spin_unlock(&cq->lock); return cur + nreq >= wq->max_post; } static void set_bind_seg(struct mlx4_wqe_bind_seg *bseg, struct ibv_send_wr *wr) { int acc = wr->bind_mw.bind_info.mw_access_flags; bseg->flags1 = 0; if (acc & IBV_ACCESS_REMOTE_ATOMIC) bseg->flags1 |= htobe32(MLX4_WQE_MW_ATOMIC); if (acc & IBV_ACCESS_REMOTE_WRITE) bseg->flags1 |= htobe32(MLX4_WQE_MW_REMOTE_WRITE); if (acc & IBV_ACCESS_REMOTE_READ) bseg->flags1 |= htobe32(MLX4_WQE_MW_REMOTE_READ); bseg->flags2 = 0; if (((struct ibv_mw *)(wr->bind_mw.mw))->type == IBV_MW_TYPE_2) bseg->flags2 |= htobe32(MLX4_WQE_BIND_TYPE_2); if (acc & IBV_ACCESS_ZERO_BASED) bseg->flags2 |= htobe32(MLX4_WQE_BIND_ZERO_BASED); bseg->new_rkey = htobe32(wr->bind_mw.rkey); bseg->lkey = htobe32(wr->bind_mw.bind_info.mr->lkey); bseg->addr = htobe64((uint64_t) wr->bind_mw.bind_info.addr); bseg->length = htobe64(wr->bind_mw.bind_info.length); } static inline void set_local_inv_seg(struct mlx4_wqe_local_inval_seg *iseg, uint32_t rkey) { iseg->mem_key = htobe32(rkey); iseg->reserved1 = 0; iseg->reserved2 = 0; iseg->reserved3[0] = 0; iseg->reserved3[1] = 0; } static inline void set_raddr_seg(struct mlx4_wqe_raddr_seg *rseg, uint64_t remote_addr, uint32_t rkey) { rseg->raddr = htobe64(remote_addr); rseg->rkey = htobe32(rkey); rseg->reserved = 0; } static void set_atomic_seg(struct mlx4_wqe_atomic_seg *aseg, struct ibv_send_wr *wr) { if (wr->opcode == IBV_WR_ATOMIC_CMP_AND_SWP) { aseg->swap_add = htobe64(wr->wr.atomic.swap); aseg->compare = htobe64(wr->wr.atomic.compare_add); } else { aseg->swap_add = htobe64(wr->wr.atomic.compare_add); aseg->compare = 0; } } static void set_datagram_seg(struct mlx4_wqe_datagram_seg *dseg, struct ibv_send_wr *wr) { memcpy(&dseg->av, &to_mah(wr->wr.ud.ah)->av, sizeof (struct mlx4_av)); dseg->dqpn = htobe32(wr->wr.ud.remote_qpn); dseg->qkey = htobe32(wr->wr.ud.remote_qkey); dseg->vlan = htobe16(to_mah(wr->wr.ud.ah)->vlan); memcpy(dseg->mac, to_mah(wr->wr.ud.ah)->mac, ETHERNET_LL_SIZE); } static void __set_data_seg(struct mlx4_wqe_data_seg *dseg, struct ibv_sge *sg) { dseg->byte_count = htobe32(sg->length); dseg->lkey = htobe32(sg->lkey); dseg->addr = htobe64(sg->addr); } static void set_data_seg(struct mlx4_wqe_data_seg *dseg, struct ibv_sge *sg) { dseg->lkey = htobe32(sg->lkey); dseg->addr = htobe64(sg->addr); /* * Need a barrier here before writing the byte_count field to * make sure that all the data is visible before the * byte_count field is set. Otherwise, if the segment begins * a new cacheline, the HCA prefetcher could grab the 64-byte * chunk and get a valid (!= * 0xffffffff) byte count but * stale data, and end up sending the wrong data. 
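 *
 * A zero-length SGE is not written with a literal byte count of 0; it is
 * encoded below as 0x80000000, i.e. an inline segment carrying no data.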
*/ udma_to_device_barrier(); if (likely(sg->length)) dseg->byte_count = htobe32(sg->length); else dseg->byte_count = htobe32(0x80000000); } int mlx4_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { struct mlx4_context *ctx; struct mlx4_qp *qp = to_mqp(ibqp); void *wqe; struct mlx4_wqe_ctrl_seg *uninitialized_var(ctrl); int ind; int nreq; int inl = 0; int ret = 0; int size = 0; int i; pthread_spin_lock(&qp->sq.lock); /* XXX check that state is OK to post send */ ind = qp->sq.head; for (nreq = 0; wr; ++nreq, wr = wr->next) { if (wq_overflow(&qp->sq, nreq, to_mcq(ibqp->send_cq))) { ret = ENOMEM; *bad_wr = wr; goto out; } if (wr->num_sge > qp->sq.max_gs) { ret = ENOMEM; *bad_wr = wr; goto out; } if (wr->opcode >= sizeof mlx4_ib_opcode / sizeof mlx4_ib_opcode[0]) { ret = EINVAL; *bad_wr = wr; goto out; } ctrl = wqe = get_send_wqe(qp, ind & (qp->sq.wqe_cnt - 1)); qp->sq.wrid[ind & (qp->sq.wqe_cnt - 1)] = wr->wr_id; ctrl->srcrb_flags = (wr->send_flags & IBV_SEND_SIGNALED ? htobe32(MLX4_WQE_CTRL_CQ_UPDATE) : 0) | (wr->send_flags & IBV_SEND_SOLICITED ? htobe32(MLX4_WQE_CTRL_SOLICIT) : 0) | qp->sq_signal_bits; if (wr->opcode == IBV_WR_SEND_WITH_IMM || wr->opcode == IBV_WR_RDMA_WRITE_WITH_IMM) ctrl->imm = wr->imm_data; else ctrl->imm = 0; wqe += sizeof *ctrl; size = sizeof *ctrl / 16; switch (ibqp->qp_type) { case IBV_QPT_XRC_SEND: ctrl->srcrb_flags |= MLX4_REMOTE_SRQN_FLAGS(wr); /* fall through */ case IBV_QPT_RC: case IBV_QPT_UC: switch (wr->opcode) { case IBV_WR_ATOMIC_CMP_AND_SWP: case IBV_WR_ATOMIC_FETCH_AND_ADD: set_raddr_seg(wqe, wr->wr.atomic.remote_addr, wr->wr.atomic.rkey); wqe += sizeof (struct mlx4_wqe_raddr_seg); set_atomic_seg(wqe, wr); wqe += sizeof (struct mlx4_wqe_atomic_seg); size += (sizeof (struct mlx4_wqe_raddr_seg) + sizeof (struct mlx4_wqe_atomic_seg)) / 16; break; case IBV_WR_RDMA_READ: inl = 1; /* fall through */ case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_WRITE_WITH_IMM: if (!wr->num_sge) inl = 1; set_raddr_seg(wqe, wr->wr.rdma.remote_addr, wr->wr.rdma.rkey); wqe += sizeof (struct mlx4_wqe_raddr_seg); size += sizeof (struct mlx4_wqe_raddr_seg) / 16; break; case IBV_WR_LOCAL_INV: ctrl->srcrb_flags |= htobe32(MLX4_WQE_CTRL_STRONG_ORDER); set_local_inv_seg(wqe, wr->invalidate_rkey); wqe += sizeof (struct mlx4_wqe_local_inval_seg); size += sizeof (struct mlx4_wqe_local_inval_seg) / 16; break; case IBV_WR_BIND_MW: ctrl->srcrb_flags |= htobe32(MLX4_WQE_CTRL_STRONG_ORDER); set_bind_seg(wqe, wr); wqe += sizeof (struct mlx4_wqe_bind_seg); size += sizeof (struct mlx4_wqe_bind_seg) / 16; break; case IBV_WR_SEND_WITH_INV: ctrl->imm = htobe32(wr->invalidate_rkey); break; default: /* No extra segments required for sends */ break; } break; case IBV_QPT_UD: set_datagram_seg(wqe, wr); wqe += sizeof (struct mlx4_wqe_datagram_seg); size += sizeof (struct mlx4_wqe_datagram_seg) / 16; if (wr->send_flags & IBV_SEND_IP_CSUM) { if (!(qp->qp_cap_cache & MLX4_CSUM_SUPPORT_UD_OVER_IB)) { ret = EINVAL; *bad_wr = wr; goto out; } ctrl->srcrb_flags |= htobe32(MLX4_WQE_CTRL_IP_HDR_CSUM | MLX4_WQE_CTRL_TCP_UDP_CSUM); } break; case IBV_QPT_RAW_PACKET: /* For raw eth, the MLX4_WQE_CTRL_SOLICIT flag is used * to indicate that no icrc should be calculated */ ctrl->srcrb_flags |= htobe32(MLX4_WQE_CTRL_SOLICIT); if (wr->send_flags & IBV_SEND_IP_CSUM) { if (!(qp->qp_cap_cache & MLX4_CSUM_SUPPORT_RAW_OVER_ETH)) { ret = EINVAL; *bad_wr = wr; goto out; } ctrl->srcrb_flags |= htobe32(MLX4_WQE_CTRL_IP_HDR_CSUM | MLX4_WQE_CTRL_TCP_UDP_CSUM); } /* Take the dmac from the payload - needed 
for loopback */ if (qp->link_layer == IBV_LINK_LAYER_ETHERNET) { ctrl->srcrb_flags16[0] = *(__be16 *)(uintptr_t)wr->sg_list[0].addr; ctrl->imm = *(__be32 *)((uintptr_t)(wr->sg_list[0].addr) + 2); } break; default: break; } if (wr->send_flags & IBV_SEND_INLINE && wr->num_sge) { struct mlx4_wqe_inline_seg *seg; void *addr; int len, seg_len; int num_seg; int off, to_copy; inl = 0; seg = wqe; wqe += sizeof *seg; off = ((uintptr_t) wqe) & (MLX4_INLINE_ALIGN - 1); num_seg = 0; seg_len = 0; for (i = 0; i < wr->num_sge; ++i) { addr = (void *) (uintptr_t) wr->sg_list[i].addr; len = wr->sg_list[i].length; inl += len; if (inl > qp->max_inline_data) { inl = 0; ret = ENOMEM; *bad_wr = wr; goto out; } while (len >= MLX4_INLINE_ALIGN - off) { to_copy = MLX4_INLINE_ALIGN - off; memcpy(wqe, addr, to_copy); len -= to_copy; wqe += to_copy; addr += to_copy; seg_len += to_copy; udma_to_device_barrier(); /* see comment below */ seg->byte_count = htobe32(MLX4_INLINE_SEG | seg_len); seg_len = 0; seg = wqe; wqe += sizeof *seg; off = sizeof *seg; ++num_seg; } memcpy(wqe, addr, len); wqe += len; seg_len += len; off += len; } if (seg_len) { ++num_seg; /* * Need a barrier here to make sure * all the data is visible before the * byte_count field is set. Otherwise * the HCA prefetcher could grab the * 64-byte chunk with this inline * segment and get a valid (!= * 0xffffffff) byte count but stale * data, and end up sending the wrong * data. */ udma_to_device_barrier(); seg->byte_count = htobe32(MLX4_INLINE_SEG | seg_len); } size += (inl + num_seg * sizeof * seg + 15) / 16; } else { struct mlx4_wqe_data_seg *seg = wqe; for (i = wr->num_sge - 1; i >= 0 ; --i) set_data_seg(seg + i, wr->sg_list + i); size += wr->num_sge * (sizeof *seg / 16); } ctrl->fence_size = (wr->send_flags & IBV_SEND_FENCE ? MLX4_WQE_CTRL_FENCE : 0) | size; /* * Make sure descriptor is fully written before * setting ownership bit (because HW can start * executing as soon as we do). */ udma_to_device_barrier(); ctrl->owner_opcode = htobe32(mlx4_ib_opcode[wr->opcode]) | (ind & qp->sq.wqe_cnt ? htobe32(1 << 31) : 0); /* * We can improve latency by not stamping the last * send queue WQE until after ringing the doorbell, so * only stamp here if there are still more WQEs to post. */ if (wr->next) stamp_send_wqe(qp, (ind + qp->sq_spare_wqes) & (qp->sq.wqe_cnt - 1)); ++ind; } out: ctx = to_mctx(ibqp->context); if (nreq == 1 && inl && size > 1 && size <= ctx->bf_buf_size / 16) { ctrl->owner_opcode |= htobe32((qp->sq.head & 0xffff) << 8); ctrl->bf_qpn |= qp->doorbell_qpn; ++qp->sq.head; /* * Make sure that descriptor is written to memory * before writing to BlueFlame page. */ mmio_wc_spinlock(&ctx->bf_lock); mmio_memcpy_x64(ctx->bf_page + ctx->bf_offset, ctrl, align(size * 16, 64)); /* Flush before toggling bf_offset to be latency oriented */ mmio_flush_writes(); ctx->bf_offset ^= ctx->bf_buf_size; pthread_spin_unlock(&ctx->bf_lock); } else if (nreq) { qp->sq.head += nreq; /* * Make sure that descriptors are written before * doorbell record. 
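 * The doorbell itself is a single 32-bit MMIO store of doorbell_qpn to
 * the MLX4_SEND_DOORBELL offset of the UAR page, which kicks the HCA
 * into fetching the new WQEs.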
*/ udma_to_device_barrier(); mmio_write32_be(ctx->uar + MLX4_SEND_DOORBELL, qp->doorbell_qpn); } if (nreq) stamp_send_wqe(qp, (ind + qp->sq_spare_wqes - 1) & (qp->sq.wqe_cnt - 1)); pthread_spin_unlock(&qp->sq.lock); return ret; } static inline int _mlx4_post_recv(struct mlx4_qp *qp, struct mlx4_cq *cq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) ALWAYS_INLINE; static inline int _mlx4_post_recv(struct mlx4_qp *qp, struct mlx4_cq *cq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct mlx4_wqe_data_seg *scat; int ret = 0; int nreq; int ind; int i; pthread_spin_lock(&qp->rq.lock); /* XXX check that state is OK to post receive */ ind = qp->rq.head & (qp->rq.wqe_cnt - 1); for (nreq = 0; wr; ++nreq, wr = wr->next) { if (wq_overflow(&qp->rq, nreq, cq)) { ret = ENOMEM; *bad_wr = wr; goto out; } if (wr->num_sge > qp->rq.max_gs) { ret = ENOMEM; *bad_wr = wr; goto out; } scat = get_recv_wqe(qp, ind); for (i = 0; i < wr->num_sge; ++i) __set_data_seg(scat + i, wr->sg_list + i); if (i < qp->rq.max_gs) { scat[i].byte_count = 0; scat[i].lkey = htobe32(MLX4_INVALID_LKEY); scat[i].addr = 0; } qp->rq.wrid[ind] = wr->wr_id; ind = (ind + 1) & (qp->rq.wqe_cnt - 1); } out: if (nreq) { qp->rq.head += nreq; /* * Make sure that descriptors are written before * doorbell record. */ udma_to_device_barrier(); *qp->db = htobe32(qp->rq.head & 0xffff); } pthread_spin_unlock(&qp->rq.lock); return ret; } int mlx4_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct mlx4_qp *qp = to_mqp(ibqp); struct mlx4_cq *cq = to_mcq(ibqp->recv_cq); return _mlx4_post_recv(qp, cq, wr, bad_wr); } int mlx4_post_wq_recv(struct ibv_wq *ibwq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct mlx4_qp *qp = wq_to_mqp(ibwq); struct mlx4_cq *cq = to_mcq(ibwq->cq); return _mlx4_post_recv(qp, cq, wr, bad_wr); } static int num_inline_segs(int data, enum ibv_qp_type type) { /* * Inline data segments are not allowed to cross 64 byte * boundaries. For UD QPs, the data segments always start * aligned to 64 bytes (16 byte control segment + 48 byte * datagram segment); for other QPs, there will be a 16 byte * control segment and possibly a 16 byte remote address * segment, so in the worst case there will be only 32 bytes * available for the first data segment. */ if (type == IBV_QPT_UD) data += (sizeof (struct mlx4_wqe_ctrl_seg) + sizeof (struct mlx4_wqe_datagram_seg)) % MLX4_INLINE_ALIGN; else data += (sizeof (struct mlx4_wqe_ctrl_seg) + sizeof (struct mlx4_wqe_raddr_seg)) % MLX4_INLINE_ALIGN; return (data + MLX4_INLINE_ALIGN - sizeof (struct mlx4_wqe_inline_seg) - 1) / (MLX4_INLINE_ALIGN - sizeof (struct mlx4_wqe_inline_seg)); } void mlx4_calc_sq_wqe_size(struct ibv_qp_cap *cap, enum ibv_qp_type type, struct mlx4_qp *qp, struct ibv_qp_init_attr_ex *attr) { int size; int max_sq_sge; max_sq_sge = align(cap->max_inline_data + num_inline_segs(cap->max_inline_data, type) * sizeof (struct mlx4_wqe_inline_seg), sizeof (struct mlx4_wqe_data_seg)) / sizeof (struct mlx4_wqe_data_seg); if (max_sq_sge < cap->max_send_sge) max_sq_sge = cap->max_send_sge; size = max_sq_sge * sizeof (struct mlx4_wqe_data_seg); switch (type) { case IBV_QPT_UD: size += sizeof (struct mlx4_wqe_datagram_seg); break; case IBV_QPT_UC: size += sizeof (struct mlx4_wqe_raddr_seg); break; case IBV_QPT_XRC_SEND: case IBV_QPT_RC: size += sizeof (struct mlx4_wqe_raddr_seg); /* * An atomic op will require an atomic segment, a * remote address segment and one scatter entry. 
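 * (Each of those segments is 16 bytes, so at least 48 bytes are reserved
 * on top of the control segment.)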
*/ if (size < (sizeof (struct mlx4_wqe_atomic_seg) + sizeof (struct mlx4_wqe_raddr_seg) + sizeof (struct mlx4_wqe_data_seg))) size = (sizeof (struct mlx4_wqe_atomic_seg) + sizeof (struct mlx4_wqe_raddr_seg) + sizeof (struct mlx4_wqe_data_seg)); break; default: break; } /* Make sure that we have enough space for a bind request */ if (size < sizeof (struct mlx4_wqe_bind_seg)) size = sizeof (struct mlx4_wqe_bind_seg); size += sizeof (struct mlx4_wqe_ctrl_seg); if (attr->comp_mask & IBV_QP_INIT_ATTR_MAX_TSO_HEADER) size += align(sizeof (struct mlx4_wqe_lso_seg) + attr->max_tso_header, 16); for (qp->sq.wqe_shift = 6; 1 << qp->sq.wqe_shift < size; qp->sq.wqe_shift++) ; /* nothing */ } int mlx4_alloc_qp_buf(struct ibv_context *context, uint32_t max_recv_sge, enum ibv_qp_type type, struct mlx4_qp *qp, struct mlx4dv_qp_init_attr *mlx4qp_attr) { int wqe_size; qp->rq.max_gs = max_recv_sge; wqe_size = qp->rq.max_gs * sizeof(struct mlx4_wqe_data_seg); if (mlx4qp_attr && mlx4qp_attr->comp_mask & MLX4DV_QP_INIT_ATTR_MASK_INL_RECV && mlx4qp_attr->inl_recv_sz > wqe_size) wqe_size = mlx4qp_attr->inl_recv_sz; if (qp->sq.wqe_cnt) { qp->sq.wrid = malloc(qp->sq.wqe_cnt * sizeof (uint64_t)); if (!qp->sq.wrid) return -1; } if (qp->rq.wqe_cnt) { qp->rq.wrid = malloc(qp->rq.wqe_cnt * sizeof (uint64_t)); if (!qp->rq.wrid) { free(qp->sq.wrid); return -1; } } for (qp->rq.wqe_shift = 4; 1 << qp->rq.wqe_shift < wqe_size; qp->rq.wqe_shift++) ; /* nothing */ if (mlx4qp_attr) mlx4qp_attr->inl_recv_sz = 1 << qp->rq.wqe_shift; qp->buf_size = (qp->rq.wqe_cnt << qp->rq.wqe_shift) + (qp->sq.wqe_cnt << qp->sq.wqe_shift); if (qp->rq.wqe_shift > qp->sq.wqe_shift) { qp->rq.offset = 0; qp->sq.offset = qp->rq.wqe_cnt << qp->rq.wqe_shift; } else { qp->rq.offset = qp->sq.wqe_cnt << qp->sq.wqe_shift; qp->sq.offset = 0; } if (qp->buf_size) { if (mlx4_alloc_buf(to_mctx(context), &qp->buf, align(qp->buf_size, to_mdev(context->device)->page_size), to_mdev(context->device)->page_size)) { free(qp->sq.wrid); free(qp->rq.wrid); return -1; } memset(qp->buf.buf, 0, qp->buf_size); } else { qp->buf.buf = NULL; } return 0; } void mlx4_set_sq_sizes(struct mlx4_qp *qp, struct ibv_qp_cap *cap, enum ibv_qp_type type) { int wqe_size; wqe_size = (1 << qp->sq.wqe_shift) - sizeof (struct mlx4_wqe_ctrl_seg); switch (type) { case IBV_QPT_UD: wqe_size -= sizeof (struct mlx4_wqe_datagram_seg); break; case IBV_QPT_XRC_SEND: case IBV_QPT_UC: case IBV_QPT_RC: wqe_size -= sizeof (struct mlx4_wqe_raddr_seg); break; default: break; } qp->sq.max_gs = wqe_size / sizeof (struct mlx4_wqe_data_seg); cap->max_send_sge = qp->sq.max_gs; qp->sq.max_post = qp->sq.wqe_cnt - qp->sq_spare_wqes; cap->max_send_wr = qp->sq.max_post; /* * Inline data segments can't cross a 64 byte boundary. So * subtract off one segment header for each 64-byte chunk, * taking into account the fact that wqe_size will be 32 mod * 64 for non-UD QPs. 
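 *
 * For example, an RC QP with 128-byte WQEs has wqe_size = 96 here: the
 * WQE spans two 64-byte chunks, so max_inline_data = 96 - 2 * 4 = 88
 * bytes.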
*/ qp->max_inline_data = wqe_size - sizeof (struct mlx4_wqe_inline_seg) * (align(wqe_size, MLX4_INLINE_ALIGN) / MLX4_INLINE_ALIGN); cap->max_inline_data = qp->max_inline_data; } struct mlx4_qp *mlx4_find_qp(struct mlx4_context *ctx, uint32_t qpn) { int tind = (qpn & (ctx->num_qps - 1)) >> ctx->qp_table_shift; if (ctx->qp_table[tind].refcnt) return ctx->qp_table[tind].table[qpn & ctx->qp_table_mask]; else return NULL; } int mlx4_store_qp(struct mlx4_context *ctx, uint32_t qpn, struct mlx4_qp *qp) { int tind = (qpn & (ctx->num_qps - 1)) >> ctx->qp_table_shift; if (!ctx->qp_table[tind].refcnt) { ctx->qp_table[tind].table = calloc(ctx->qp_table_mask + 1, sizeof (struct mlx4_qp *)); if (!ctx->qp_table[tind].table) return -1; } ++ctx->qp_table[tind].refcnt; ctx->qp_table[tind].table[qpn & ctx->qp_table_mask] = qp; return 0; } void mlx4_clear_qp(struct mlx4_context *ctx, uint32_t qpn) { int tind = (qpn & (ctx->num_qps - 1)) >> ctx->qp_table_shift; if (!--ctx->qp_table[tind].refcnt) free(ctx->qp_table[tind].table); else ctx->qp_table[tind].table[qpn & ctx->qp_table_mask] = NULL; } rdma-core-56.1/providers/mlx4/srq.c000066400000000000000000000177241477342711600172170ustar00rootroot00000000000000/* * Copyright (c) 2007 Cisco, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include "mlx4.h" #include "mlx4-abi.h" static void *get_wqe(struct mlx4_srq *srq, int n) { return srq->buf.buf + (n << srq->wqe_shift); } void mlx4_free_srq_wqe(struct mlx4_srq *srq, int ind) { struct mlx4_wqe_srq_next_seg *next; pthread_spin_lock(&srq->lock); next = get_wqe(srq, srq->tail); next->next_wqe_index = htobe16(ind); srq->tail = ind; pthread_spin_unlock(&srq->lock); } int mlx4_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct mlx4_srq *srq = to_msrq(ibsrq); struct mlx4_wqe_srq_next_seg *next; struct mlx4_wqe_data_seg *scat; int err = 0; int nreq; int i; pthread_spin_lock(&srq->lock); for (nreq = 0; wr; ++nreq, wr = wr->next) { if (wr->num_sge > srq->max_gs) { err = -1; *bad_wr = wr; break; } if (srq->head == srq->tail) { /* SRQ is full*/ err = -1; *bad_wr = wr; break; } srq->wrid[srq->head] = wr->wr_id; next = get_wqe(srq, srq->head); srq->head = be16toh(next->next_wqe_index); scat = (struct mlx4_wqe_data_seg *) (next + 1); for (i = 0; i < wr->num_sge; ++i) { scat[i].byte_count = htobe32(wr->sg_list[i].length); scat[i].lkey = htobe32(wr->sg_list[i].lkey); scat[i].addr = htobe64(wr->sg_list[i].addr); } if (i < srq->max_gs) { scat[i].byte_count = 0; scat[i].lkey = htobe32(MLX4_INVALID_LKEY); scat[i].addr = 0; } } if (nreq) { srq->counter += nreq; /* * Make sure that descriptors are written before * we write doorbell record. */ udma_to_device_barrier(); *srq->db = htobe32(srq->counter); } pthread_spin_unlock(&srq->lock); return err; } int mlx4_alloc_srq_buf(struct ibv_pd *pd, struct ibv_srq_attr *attr, struct mlx4_srq *srq) { struct mlx4_wqe_srq_next_seg *next; struct mlx4_wqe_data_seg *scatter; int size; int buf_size; int i; srq->wrid = malloc(srq->max * sizeof (uint64_t)); if (!srq->wrid) return -1; size = sizeof (struct mlx4_wqe_srq_next_seg) + srq->max_gs * sizeof (struct mlx4_wqe_data_seg); for (srq->wqe_shift = 5; 1 << srq->wqe_shift < size; ++srq->wqe_shift) ; /* nothing */ buf_size = srq->max << srq->wqe_shift; if (mlx4_alloc_buf(to_mctx(pd->context), &srq->buf, buf_size, to_mdev(pd->context->device)->page_size)) { free(srq->wrid); return -1; } memset(srq->buf.buf, 0, buf_size); /* * Now initialize the SRQ buffer so that all of the WQEs are * linked into the list of free WQEs. 
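 * The free list is circular (the next index is computed modulo srq->max)
 * and every scatter entry starts out as MLX4_INVALID_LKEY, which acts as
 * a terminator for partially filled WQEs.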
*/ for (i = 0; i < srq->max; ++i) { next = get_wqe(srq, i); next->next_wqe_index = htobe16((i + 1) & (srq->max - 1)); for (scatter = (void *) (next + 1); (void *) scatter < (void *) next + (1 << srq->wqe_shift); ++scatter) scatter->lkey = htobe32(MLX4_INVALID_LKEY); } srq->head = 0; srq->tail = srq->max - 1; return 0; } void mlx4_init_xsrq_table(struct mlx4_xsrq_table *xsrq_table, int size) { memset(xsrq_table, 0, sizeof *xsrq_table); xsrq_table->num_xsrq = size; xsrq_table->shift = ffs(size) - 1 - MLX4_XSRQ_TABLE_BITS; xsrq_table->mask = (1 << xsrq_table->shift) - 1; pthread_mutex_init(&xsrq_table->mutex, NULL); } struct mlx4_srq *mlx4_find_xsrq(struct mlx4_xsrq_table *xsrq_table, uint32_t srqn) { int index; index = (srqn & (xsrq_table->num_xsrq - 1)) >> xsrq_table->shift; if (xsrq_table->xsrq_table[index].refcnt) return xsrq_table->xsrq_table[index].table[srqn & xsrq_table->mask]; return NULL; } int mlx4_store_xsrq(struct mlx4_xsrq_table *xsrq_table, uint32_t srqn, struct mlx4_srq *srq) { int index, ret = 0; index = (srqn & (xsrq_table->num_xsrq - 1)) >> xsrq_table->shift; pthread_mutex_lock(&xsrq_table->mutex); if (!xsrq_table->xsrq_table[index].refcnt) { xsrq_table->xsrq_table[index].table = calloc(xsrq_table->mask + 1, sizeof(struct mlx4_srq *)); if (!xsrq_table->xsrq_table[index].table) { ret = -1; goto out; } } xsrq_table->xsrq_table[index].refcnt++; xsrq_table->xsrq_table[index].table[srqn & xsrq_table->mask] = srq; out: pthread_mutex_unlock(&xsrq_table->mutex); return ret; } void mlx4_clear_xsrq(struct mlx4_xsrq_table *xsrq_table, uint32_t srqn) { int index; index = (srqn & (xsrq_table->num_xsrq - 1)) >> xsrq_table->shift; pthread_mutex_lock(&xsrq_table->mutex); if (--xsrq_table->xsrq_table[index].refcnt) xsrq_table->xsrq_table[index].table[srqn & xsrq_table->mask] = NULL; else free(xsrq_table->xsrq_table[index].table); pthread_mutex_unlock(&xsrq_table->mutex); } struct ibv_srq *mlx4_create_xrc_srq(struct ibv_context *context, struct ibv_srq_init_attr_ex *attr_ex) { struct mlx4_create_xsrq cmd; struct mlx4_create_xsrq_resp resp; struct mlx4_srq *srq; int ret; /* Sanity check SRQ size before proceeding */ if (attr_ex->attr.max_wr > 1 << 16 || attr_ex->attr.max_sge > 64) return NULL; srq = calloc(1, sizeof *srq); if (!srq) return NULL; if (pthread_spin_init(&srq->lock, PTHREAD_PROCESS_PRIVATE)) goto err; srq->max = roundup_pow_of_two(attr_ex->attr.max_wr + 1); srq->max_gs = attr_ex->attr.max_sge; srq->counter = 0; srq->ext_srq = 1; if (mlx4_alloc_srq_buf(attr_ex->pd, &attr_ex->attr, srq)) goto err; srq->db = mlx4_alloc_db(to_mctx(context), MLX4_DB_TYPE_RQ); if (!srq->db) goto err_free; *srq->db = 0; cmd.buf_addr = (uintptr_t) srq->buf.buf; cmd.db_addr = (uintptr_t) srq->db; ret = ibv_cmd_create_srq_ex(context, &srq->verbs_srq, attr_ex, &cmd.ibv_cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (ret) goto err_db; ret = mlx4_store_xsrq(&to_mctx(context)->xsrq_table, srq->verbs_srq.srq_num, srq); if (ret) goto err_destroy; return &srq->verbs_srq.srq; err_destroy: ibv_cmd_destroy_srq(&srq->verbs_srq.srq); err_db: mlx4_free_db(to_mctx(context), MLX4_DB_TYPE_RQ, srq->db); err_free: free(srq->wrid); mlx4_free_buf(to_mctx(context), &srq->buf); err: free(srq); return NULL; } int mlx4_destroy_xrc_srq(struct ibv_srq *srq) { struct mlx4_context *mctx = to_mctx(srq->context); struct mlx4_srq *msrq = to_msrq(srq); struct mlx4_cq *mcq; int ret; mcq = to_mcq(msrq->verbs_srq.cq); mlx4_cq_clean(mcq, 0, msrq); pthread_spin_lock(&mcq->lock); mlx4_clear_xsrq(&mctx->xsrq_table, msrq->verbs_srq.srq_num); 
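/*
 * The xsrq table update is done while holding the CQ lock so that a
 * concurrent mlx4_poll_cq() cannot look up this SRQN mid-teardown.
 */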
pthread_spin_unlock(&mcq->lock); ret = ibv_cmd_destroy_srq(srq); if (ret) { pthread_spin_lock(&mcq->lock); mlx4_store_xsrq(&mctx->xsrq_table, msrq->verbs_srq.srq_num, msrq); pthread_spin_unlock(&mcq->lock); return ret; } mlx4_free_db(mctx, MLX4_DB_TYPE_RQ, msrq->db); mlx4_free_buf(mctx, &msrq->buf); free(msrq->wrid); free(msrq); return 0; } rdma-core-56.1/providers/mlx4/verbs.c000066400000000000000000001143041477342711600175230ustar00rootroot00000000000000/* * Copyright (c) 2007 Cisco, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include #include "mlx4.h" #include "mlx4-abi.h" int mlx4_query_device_ex(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct mlx4_query_device_ex_resp resp = {}; size_t resp_size = sizeof(resp); uint64_t raw_fw_ver; unsigned sub_minor; unsigned major; unsigned minor; int err; err = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp.ibv_resp, &resp_size); if (err) return err; if (attr_size >= offsetofend(struct ibv_device_attr_ex, rss_caps)) { attr->rss_caps.rx_hash_fields_mask = resp.rss_caps.rx_hash_fields_mask; attr->rss_caps.rx_hash_function = resp.rss_caps.rx_hash_function; } if (attr_size >= offsetofend(struct ibv_device_attr_ex, tso_caps)) { attr->tso_caps.max_tso = resp.tso_caps.max_tso; attr->tso_caps.supported_qpts = resp.tso_caps.supported_qpts; } raw_fw_ver = resp.ibv_resp.base.fw_ver; major = (raw_fw_ver >> 32) & 0xffff; minor = (raw_fw_ver >> 16) & 0xffff; sub_minor = raw_fw_ver & 0xffff; snprintf(attr->orig_attr.fw_ver, sizeof attr->orig_attr.fw_ver, "%d.%d.%03d", major, minor, sub_minor); return 0; } void mlx4_query_device_ctx(struct mlx4_device *mdev, struct mlx4_context *mctx) { struct ibv_device_attr_ex device_attr; struct mlx4_query_device_ex_resp resp; size_t resp_size = sizeof(resp); if (ibv_cmd_query_device_any(&mctx->ibv_ctx.context, NULL, &device_attr, sizeof(device_attr), &resp.ibv_resp, &resp_size)) return; mctx->max_qp_wr = device_attr.orig_attr.max_qp_wr; mctx->max_sge = device_attr.orig_attr.max_sge; mctx->max_inl_recv_sz = resp.max_inl_recv_sz; if (resp.comp_mask & MLX4_IB_QUERY_DEV_RESP_MASK_CORE_CLOCK_OFFSET) { void *hca_clock_page; mctx->core_clock.offset = resp.hca_core_clock_offset; mctx->core_clock.offset_valid = 1; hca_clock_page = mmap(NULL, mdev->page_size, PROT_READ, MAP_SHARED, mctx->ibv_ctx.context.cmd_fd, mdev->page_size * 3); if (hca_clock_page != MAP_FAILED) mctx->hca_core_clock = hca_clock_page + (mctx->core_clock.offset & (mdev->page_size - 1)); else fprintf(stderr, PFX "Warning: Timestamp available,\n" "but failed to mmap() hca core clock page.\n"); } } static int mlx4_read_clock(struct ibv_context *context, uint64_t *cycles) { uint32_t clockhi, clocklo, clockhi1; int i; struct mlx4_context *ctx = to_mctx(context); if (!ctx->hca_core_clock) return EOPNOTSUPP; /* Handle wraparound */ for (i = 0; i < 2; i++) { clockhi = be32toh(mmio_read32_be(ctx->hca_core_clock)); clocklo = be32toh(mmio_read32_be(ctx->hca_core_clock + 4)); clockhi1 = be32toh(mmio_read32_be(ctx->hca_core_clock)); if (clockhi == clockhi1) break; } *cycles = (uint64_t)clockhi << 32 | (uint64_t)clocklo; return 0; } int mlx4_query_rt_values(struct ibv_context *context, struct ibv_values_ex *values) { uint32_t comp_mask = 0; int err = 0; if (!check_comp_mask(values->comp_mask, IBV_VALUES_MASK_RAW_CLOCK)) return EINVAL; if (values->comp_mask & IBV_VALUES_MASK_RAW_CLOCK) { uint64_t cycles; err = mlx4_read_clock(context, &cycles); if (!err) { values->raw_clock.tv_sec = 0; values->raw_clock.tv_nsec = cycles; comp_mask |= IBV_VALUES_MASK_RAW_CLOCK; } } values->comp_mask = comp_mask; return err; } int mlx4_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr) { struct ibv_query_port cmd; int err; err = ibv_cmd_query_port(context, port, attr, &cmd, sizeof(cmd)); if (!err && port <= MLX4_PORTS_NUM && port > 0) { struct mlx4_context *mctx = to_mctx(context); if (!mctx->port_query_cache[port - 
1].valid) { mctx->port_query_cache[port - 1].link_layer = attr->link_layer; mctx->port_query_cache[port - 1].caps = attr->port_cap_flags; mctx->port_query_cache[port - 1].flags = attr->flags; mctx->port_query_cache[port - 1].valid = 1; } } return err; } /* Only the fields in the port cache will be valid */ static int query_port_cache(struct ibv_context *context, uint8_t port_num, struct ibv_port_attr *port_attr) { struct mlx4_context *mctx = to_mctx(context); if (port_num <= 0 || port_num > MLX4_PORTS_NUM) return -EINVAL; if (mctx->port_query_cache[port_num - 1].valid) { port_attr->link_layer = mctx-> port_query_cache[port_num - 1]. link_layer; port_attr->port_cap_flags = mctx-> port_query_cache[port_num - 1]. caps; port_attr->flags = mctx-> port_query_cache[port_num - 1]. flags; return 0; } return mlx4_query_port(context, port_num, (struct ibv_port_attr *)port_attr); } struct ibv_pd *mlx4_alloc_pd(struct ibv_context *context) { struct ibv_alloc_pd cmd; struct mlx4_alloc_pd_resp resp; struct mlx4_pd *pd; pd = malloc(sizeof *pd); if (!pd) return NULL; if (ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd, sizeof cmd, &resp.ibv_resp, sizeof resp)) { free(pd); return NULL; } pd->pdn = resp.pdn; return &pd->ibv_pd; } int mlx4_free_pd(struct ibv_pd *pd) { int ret; ret = ibv_cmd_dealloc_pd(pd); if (ret) return ret; free(to_mpd(pd)); return 0; } struct ibv_xrcd *mlx4_open_xrcd(struct ibv_context *context, struct ibv_xrcd_init_attr *attr) { struct ibv_open_xrcd cmd; struct ib_uverbs_open_xrcd_resp resp; struct verbs_xrcd *xrcd; int ret; xrcd = calloc(1, sizeof *xrcd); if (!xrcd) return NULL; ret = ibv_cmd_open_xrcd(context, xrcd, sizeof(*xrcd), attr, &cmd, sizeof cmd, &resp, sizeof resp); if (ret) goto err; return &xrcd->xrcd; err: free(xrcd); return NULL; } int mlx4_close_xrcd(struct ibv_xrcd *ib_xrcd) { struct verbs_xrcd *xrcd = container_of(ib_xrcd, struct verbs_xrcd, xrcd); int ret; ret = ibv_cmd_close_xrcd(xrcd); if (ret) return ret; free(xrcd); return 0; } int mlx4_get_srq_num(struct ibv_srq *srq, uint32_t *srq_num) { struct mlx4_srq *msrq = container_of(srq, struct mlx4_srq, verbs_srq.srq); if (!msrq->verbs_srq.xrcd) return EOPNOTSUPP; *srq_num = msrq->verbs_srq.srq_num; return 0; } struct ibv_mr *mlx4_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access) { struct verbs_mr *vmr; struct ibv_reg_mr cmd; struct ib_uverbs_reg_mr_resp resp; int ret; vmr = malloc(sizeof(*vmr)); if (!vmr) return NULL; ret = ibv_cmd_reg_mr(pd, addr, length, hca_va, access, vmr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) { free(vmr); return NULL; } return &vmr->ibv_mr; } int mlx4_rereg_mr(struct verbs_mr *vmr, int flags, struct ibv_pd *pd, void *addr, size_t length, int access) { struct ibv_rereg_mr cmd; struct ib_uverbs_rereg_mr_resp resp; return ibv_cmd_rereg_mr(vmr, flags, addr, length, (uintptr_t)addr, access, pd, &cmd, sizeof(cmd), &resp, sizeof(resp)); } int mlx4_dereg_mr(struct verbs_mr *vmr) { int ret; ret = ibv_cmd_dereg_mr(vmr); if (ret) return ret; free(vmr); return 0; } struct ibv_mw *mlx4_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type) { struct ibv_mw *mw; struct ibv_alloc_mw cmd; struct ib_uverbs_alloc_mw_resp resp; int ret; mw = calloc(1, sizeof(*mw)); if (!mw) return NULL; ret = ibv_cmd_alloc_mw(pd, type, mw, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) { free(mw); return NULL; } return mw; } int mlx4_dealloc_mw(struct ibv_mw *mw) { int ret; ret = ibv_cmd_dealloc_mw(mw); if (ret) return ret; free(mw); return 0; } int mlx4_bind_mw(struct ibv_qp *qp, struct ibv_mw 
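/*
 * Editor's note: illustrative sketch, not part of the original source.
 * The PD/MR verbs above are thin wrappers: each allocates the user-space
 * shadow structure, issues the uverbs command, and frees the shadow on
 * failure or destroy. A minimal caller-side lifecycle, assuming `ctx` is
 * an opened struct ibv_context and `buf`/`len` describe valid memory:
 *
 *	struct ibv_pd *pd = ibv_alloc_pd(ctx);
 *	struct ibv_mr *mr = pd ? ibv_reg_mr(pd, buf, len,
 *					    IBV_ACCESS_LOCAL_WRITE |
 *					    IBV_ACCESS_REMOTE_READ) : NULL;
 *	// ... use mr->lkey / mr->rkey in work requests ...
 *	if (mr)
 *		ibv_dereg_mr(mr);	// must precede dealloc of the PD
 *	if (pd)
 *		ibv_dealloc_pd(pd);	// rejected while MRs still reference it
 */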
*mw, struct ibv_mw_bind *mw_bind) { struct ibv_send_wr *bad_wr = NULL; struct ibv_send_wr wr = { }; int ret; wr.opcode = IBV_WR_BIND_MW; wr.next = NULL; wr.wr_id = mw_bind->wr_id; wr.send_flags = mw_bind->send_flags; wr.bind_mw.mw = mw; wr.bind_mw.rkey = ibv_inc_rkey(mw->rkey); wr.bind_mw.bind_info = mw_bind->bind_info; ret = mlx4_post_send(qp, &wr, &bad_wr); if (ret) return ret; /* updating the mw with the latest rkey. */ mw->rkey = wr.bind_mw.rkey; return 0; } enum { CREATE_CQ_SUPPORTED_WC_FLAGS = IBV_WC_STANDARD_FLAGS | IBV_WC_EX_WITH_COMPLETION_TIMESTAMP }; enum { CREATE_CQ_SUPPORTED_COMP_MASK = IBV_CQ_INIT_ATTR_MASK_FLAGS }; enum { CREATE_CQ_SUPPORTED_FLAGS = IBV_CREATE_CQ_ATTR_SINGLE_THREADED }; static int mlx4_cmd_create_cq(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr, struct mlx4_cq *cq) { struct mlx4_create_cq cmd; struct mlx4_create_cq_resp resp; int ret; cmd.buf_addr = (uintptr_t) cq->buf.buf; cmd.db_addr = (uintptr_t) cq->set_ci_db; ret = ibv_cmd_create_cq(context, cq_attr->cqe, cq_attr->channel, cq_attr->comp_vector, &cq->verbs_cq.cq, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (!ret) cq->cqn = resp.cqn; return ret; } static int mlx4_cmd_create_cq_ex(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr, struct mlx4_cq *cq) { struct mlx4_create_cq_ex cmd; struct mlx4_create_cq_ex_resp resp = {}; int ret; cmd.buf_addr = (uintptr_t) cq->buf.buf; cmd.db_addr = (uintptr_t) cq->set_ci_db; ret = ibv_cmd_create_cq_ex(context, cq_attr, &cq->verbs_cq, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp), 0); if (!ret) cq->cqn = resp.cqn; return ret; } static struct ibv_cq_ex *create_cq(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr, int cq_alloc_flags) { struct mlx4_cq *cq; int ret; struct mlx4_context *mctx = to_mctx(context); /* Sanity check CQ size before proceeding */ if (cq_attr->cqe > 0x3fffff) { errno = EINVAL; return NULL; } if (cq_attr->comp_mask & ~CREATE_CQ_SUPPORTED_COMP_MASK) { errno = ENOTSUP; return NULL; } if (cq_attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_FLAGS && cq_attr->flags & ~CREATE_CQ_SUPPORTED_FLAGS) { errno = ENOTSUP; return NULL; } if (cq_attr->wc_flags & ~CREATE_CQ_SUPPORTED_WC_FLAGS) { errno = ENOTSUP; return NULL; } /* mlx4 devices don't support slid and sl in cqe when completion * timestamp is enabled in the CQ */ if ((cq_attr->wc_flags & (IBV_WC_EX_WITH_SLID | IBV_WC_EX_WITH_SL)) && (cq_attr->wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP)) { errno = ENOTSUP; return NULL; } cq = malloc(sizeof *cq); if (!cq) return NULL; cq->cons_index = 0; if (pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE)) goto err; cq_attr->cqe = roundup_pow_of_two(cq_attr->cqe + 1); if (mlx4_alloc_cq_buf(to_mdev(context->device), mctx, &cq->buf, cq_attr->cqe, mctx->cqe_size)) goto err; cq->cqe_size = mctx->cqe_size; cq->set_ci_db = mlx4_alloc_db(to_mctx(context), MLX4_DB_TYPE_CQ); if (!cq->set_ci_db) goto err_buf; cq->arm_db = cq->set_ci_db + 1; *cq->arm_db = 0; cq->arm_sn = 1; *cq->set_ci_db = 0; cq->flags = cq_alloc_flags; if (cq_attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_FLAGS && cq_attr->flags & IBV_CREATE_CQ_ATTR_SINGLE_THREADED) cq->flags |= MLX4_CQ_FLAGS_SINGLE_THREADED; --cq_attr->cqe; if (cq_alloc_flags & MLX4_CQ_FLAGS_EXTENDED) ret = mlx4_cmd_create_cq_ex(context, cq_attr, cq); else ret = mlx4_cmd_create_cq(context, cq_attr, cq); if (ret) goto err_db; if (cq_alloc_flags & MLX4_CQ_FLAGS_EXTENDED) mlx4_cq_fill_pfns(cq, cq_attr); return &cq->verbs_cq.cq_ex; err_db: mlx4_free_db(to_mctx(context), 
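/*
 * Editor's note: illustrative sketch, not part of the original source.
 * mlx4_bind_mw() above implements a memory-window bind as an ordinary
 * IBV_WR_BIND_MW work request on the QP's send queue, and commits the new
 * rkey to mw->rkey only after the post succeeds. ibv_inc_rkey() just bumps
 * the 8-bit variant part of the key:
 *
 *	uint32_t new_rkey = ibv_inc_rkey(mw->rkey);
 *	// same mkey index, new "tag" byte: stale rkeys now fail to match
 *
 * The application typically waits for the bind completion on the send CQ
 * before advertising new_rkey to a remote peer.
 */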
MLX4_DB_TYPE_CQ, cq->set_ci_db); err_buf: mlx4_free_buf(to_mctx(context), &cq->buf); err: free(cq); return NULL; } struct ibv_cq *mlx4_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector) { struct ibv_cq_ex *cq; struct ibv_cq_init_attr_ex cq_attr = {.cqe = cqe, .channel = channel, .comp_vector = comp_vector, .wc_flags = IBV_WC_STANDARD_FLAGS}; cq = create_cq(context, &cq_attr, 0); return cq ? ibv_cq_ex_to_cq(cq) : NULL; } struct ibv_cq_ex *mlx4_create_cq_ex(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr) { /* * Make local copy since some attributes might be adjusted * for internal use. */ struct ibv_cq_init_attr_ex cq_attr_c = {.cqe = cq_attr->cqe, .channel = cq_attr->channel, .comp_vector = cq_attr->comp_vector, .wc_flags = cq_attr->wc_flags, .comp_mask = cq_attr->comp_mask, .flags = cq_attr->flags}; if (!check_comp_mask(cq_attr_c.comp_mask, IBV_CQ_INIT_ATTR_MASK_FLAGS)) { errno = EINVAL; return NULL; } return create_cq(context, &cq_attr_c, MLX4_CQ_FLAGS_EXTENDED); } int mlx4_resize_cq(struct ibv_cq *ibcq, int cqe) { struct mlx4_cq *cq = to_mcq(ibcq); struct mlx4_resize_cq cmd; struct ib_uverbs_resize_cq_resp resp; struct mlx4_buf buf; int old_cqe, outst_cqe, ret; /* Sanity check CQ size before proceeding */ if (cqe > 0x3fffff) return EINVAL; pthread_spin_lock(&cq->lock); cqe = roundup_pow_of_two(cqe + 1); if (cqe == ibcq->cqe + 1) { ret = 0; goto out; } /* Can't be smaller then the number of outstanding CQEs */ outst_cqe = mlx4_get_outstanding_cqes(cq); if (cqe < outst_cqe + 1) { ret = EINVAL; goto out; } ret = mlx4_alloc_cq_buf(to_mdev(ibcq->context->device), to_mctx(ibcq->context), &buf, cqe, cq->cqe_size); if (ret) goto out; old_cqe = ibcq->cqe; cmd.buf_addr = (uintptr_t) buf.buf; ret = ibv_cmd_resize_cq(ibcq, cqe - 1, &cmd.ibv_cmd, sizeof cmd, &resp, sizeof resp); if (ret) { mlx4_free_buf(to_mctx(ibcq->context), &buf); goto out; } mlx4_cq_resize_copy_cqes(cq, buf.buf, old_cqe); mlx4_free_buf(to_mctx(ibcq->context), &cq->buf); cq->buf = buf; mlx4_update_cons_index(cq); out: pthread_spin_unlock(&cq->lock); return ret; } int mlx4_destroy_cq(struct ibv_cq *cq) { int ret; ret = ibv_cmd_destroy_cq(cq); if (ret) return ret; mlx4_free_db(to_mctx(cq->context), MLX4_DB_TYPE_CQ, to_mcq(cq)->set_ci_db); mlx4_free_buf(to_mctx(cq->context), &to_mcq(cq)->buf); free(to_mcq(cq)); return 0; } struct ibv_srq *mlx4_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr) { struct mlx4_create_srq cmd; struct mlx4_create_srq_resp resp; struct mlx4_srq *srq; int ret; /* Sanity check SRQ size before proceeding */ if (attr->attr.max_wr > 1 << 16 || attr->attr.max_sge > 64) { errno = EINVAL; return NULL; } srq = malloc(sizeof *srq); if (!srq) return NULL; if (pthread_spin_init(&srq->lock, PTHREAD_PROCESS_PRIVATE)) goto err; srq->max = roundup_pow_of_two(attr->attr.max_wr + 1); srq->max_gs = attr->attr.max_sge; srq->counter = 0; srq->ext_srq = 0; if (mlx4_alloc_srq_buf(pd, &attr->attr, srq)) goto err; srq->db = mlx4_alloc_db(to_mctx(pd->context), MLX4_DB_TYPE_RQ); if (!srq->db) goto err_free; *srq->db = 0; cmd.buf_addr = (uintptr_t) srq->buf.buf; cmd.db_addr = (uintptr_t) srq->db; ret = ibv_cmd_create_srq(pd, &srq->verbs_srq.srq, attr, &cmd.ibv_cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (ret) goto err_db; return &srq->verbs_srq.srq; err_db: mlx4_free_db(to_mctx(pd->context), MLX4_DB_TYPE_RQ, srq->db); err_free: free(srq->wrid); mlx4_free_buf(to_mctx(pd->context), &srq->buf); err: free(srq); return NULL; } struct ibv_srq 
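/*
 * Editor's note: worked example, not part of the original source.
 * mlx4_resize_cq() above over-allocates by one entry and rounds up to a
 * power of two, e.g. for a requested depth of 100 CQEs:
 *
 *	cqe = roundup_pow_of_two(100 + 1);	// 128 slots in the buffer
 *	ibv_cmd_resize_cq(ibcq, 128 - 1, ...);	// 127 usable entries
 *
 * The extra slot keeps a completely full ring distinguishable from an
 * empty one, and the resize fails with EINVAL if the new size cannot hold
 * the CQEs still outstanding in the old buffer.
 */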
*mlx4_create_srq_ex(struct ibv_context *context, struct ibv_srq_init_attr_ex *attr_ex) { if (!(attr_ex->comp_mask & IBV_SRQ_INIT_ATTR_TYPE) || (attr_ex->srq_type == IBV_SRQT_BASIC)) return mlx4_create_srq(attr_ex->pd, (struct ibv_srq_init_attr *) attr_ex); else if (attr_ex->srq_type == IBV_SRQT_XRC) return mlx4_create_xrc_srq(context, attr_ex); return NULL; } int mlx4_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, int attr_mask) { struct ibv_modify_srq cmd; return ibv_cmd_modify_srq(srq, attr, attr_mask, &cmd, sizeof cmd); } int mlx4_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr) { struct ibv_query_srq cmd; return ibv_cmd_query_srq(srq, attr, &cmd, sizeof cmd); } int mlx4_destroy_srq(struct ibv_srq *srq) { int ret; if (to_msrq(srq)->ext_srq) return mlx4_destroy_xrc_srq(srq); ret = ibv_cmd_destroy_srq(srq); if (ret) return ret; mlx4_free_db(to_mctx(srq->context), MLX4_DB_TYPE_RQ, to_msrq(srq)->db); mlx4_free_buf(to_mctx(srq->context), &to_msrq(srq)->buf); free(to_msrq(srq)->wrid); free(to_msrq(srq)); return 0; } static int mlx4_cmd_create_qp_ex_rss(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr, struct mlx4_create_qp *cmd, struct mlx4_qp *qp) { struct mlx4_create_qp_ex_rss cmd_ex = {}; struct mlx4_create_qp_ex_resp resp; int ret; if (attr->rx_hash_conf.rx_hash_key_len != sizeof(cmd_ex.rx_hash_key)) { errno = ENOTSUP; return errno; } cmd_ex.rx_hash_fields_mask = attr->rx_hash_conf.rx_hash_fields_mask; cmd_ex.rx_hash_function = attr->rx_hash_conf.rx_hash_function; memcpy(cmd_ex.rx_hash_key, attr->rx_hash_conf.rx_hash_key, sizeof(cmd_ex.rx_hash_key)); ret = ibv_cmd_create_qp_ex2(context, &qp->verbs_qp, attr, &cmd_ex.ibv_cmd, sizeof(cmd_ex), &resp.ibv_resp, sizeof(resp)); return ret; } static struct ibv_qp *_mlx4_create_qp_ex_rss(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr) { struct mlx4_create_qp cmd = {}; struct mlx4_qp *qp; int ret; if (!(attr->comp_mask & IBV_QP_INIT_ATTR_RX_HASH) || !(attr->comp_mask & IBV_QP_INIT_ATTR_IND_TABLE)) { errno = EINVAL; return NULL; } if (attr->qp_type != IBV_QPT_RAW_PACKET) { errno = EINVAL; return NULL; } qp = calloc(1, sizeof(*qp)); if (!qp) return NULL; if (pthread_spin_init(&qp->sq.lock, PTHREAD_PROCESS_PRIVATE) || pthread_spin_init(&qp->rq.lock, PTHREAD_PROCESS_PRIVATE)) goto err; ret = mlx4_cmd_create_qp_ex_rss(context, attr, &cmd, qp); if (ret) goto err; qp->type = MLX4_RSC_TYPE_RSS_QP; return &qp->verbs_qp.qp; err: free(qp); return NULL; } static int mlx4_cmd_create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr, struct mlx4_create_qp *cmd, struct mlx4_qp *qp) { struct mlx4_create_qp_ex cmd_ex; struct mlx4_create_qp_ex_resp resp; int ret; memset(&cmd_ex, 0, sizeof(cmd_ex)); *ibv_create_qp_ex_to_reg(&cmd_ex.ibv_cmd) = cmd->ibv_cmd.core_payload; cmd_ex.drv_payload = cmd->drv_payload; ret = ibv_cmd_create_qp_ex2(context, &qp->verbs_qp, attr, &cmd_ex.ibv_cmd, sizeof(cmd_ex), &resp.ibv_resp, sizeof(resp)); return ret; } enum { MLX4_CREATE_QP_SUP_COMP_MASK = (IBV_QP_INIT_ATTR_PD | IBV_QP_INIT_ATTR_XRCD | IBV_QP_INIT_ATTR_CREATE_FLAGS | IBV_QP_INIT_ATTR_MAX_TSO_HEADER), }; enum { MLX4_CREATE_QP_EX2_COMP_MASK = (IBV_QP_INIT_ATTR_CREATE_FLAGS | IBV_QP_INIT_ATTR_MAX_TSO_HEADER), }; static struct ibv_qp *create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr, struct mlx4dv_qp_init_attr *mlx4qp_attr) { struct mlx4_context *ctx = to_mctx(context); struct mlx4_create_qp cmd = {}; struct ib_uverbs_create_qp_resp resp = {}; struct mlx4_qp *qp; int ret; if (attr->comp_mask 
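/*
 * Editor's note: illustrative sketch, not part of the original source, and
 * the 40-byte Toeplitz key length is an assumption about the mlx4 ABI
 * (the code only requires it to equal sizeof(cmd_ex.rx_hash_key)).
 * _mlx4_create_qp_ex_rss() below accepts only a narrow attribute shape:
 * both IBV_QP_INIT_ATTR_RX_HASH and IBV_QP_INIT_ATTR_IND_TABLE set and
 * qp_type == IBV_QPT_RAW_PACKET. A caller would fill something like:
 *
 *	uint8_t key[40] = { ... };	// must match the ABI key field size
 *	struct ibv_qp_init_attr_ex a = {
 *		.qp_type   = IBV_QPT_RAW_PACKET,
 *		.comp_mask = IBV_QP_INIT_ATTR_PD |
 *			     IBV_QP_INIT_ATTR_RX_HASH |
 *			     IBV_QP_INIT_ATTR_IND_TABLE,
 *		.pd = pd,
 *		.rwq_ind_tbl = ind_tbl,
 *		.rx_hash_conf = {
 *			.rx_hash_function = IBV_RX_HASH_FUNC_TOEPLITZ,
 *			.rx_hash_key_len = sizeof(key),
 *			.rx_hash_key = key,
 *			.rx_hash_fields_mask = IBV_RX_HASH_SRC_IPV4 |
 *					       IBV_RX_HASH_DST_IPV4,
 *		},
 *	};
 *	struct ibv_qp *qp = ibv_create_qp_ex(ctx, &a);
 */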
& (IBV_QP_INIT_ATTR_RX_HASH | IBV_QP_INIT_ATTR_IND_TABLE)) { return _mlx4_create_qp_ex_rss(context, attr); } /* Sanity check QP size before proceeding */ if (ctx->max_qp_wr) { /* mlx4_query_device succeeded */ if (attr->cap.max_send_wr > ctx->max_qp_wr || attr->cap.max_recv_wr > ctx->max_qp_wr || attr->cap.max_send_sge > ctx->max_sge || attr->cap.max_recv_sge > ctx->max_sge) { errno = EINVAL; return NULL; } } else { if (attr->cap.max_send_wr > 65536 || attr->cap.max_recv_wr > 65536 || attr->cap.max_send_sge > 64 || attr->cap.max_recv_sge > 64) { errno = EINVAL; return NULL; } } if (attr->cap.max_inline_data > 1024) { errno = EINVAL; return NULL; } if (attr->comp_mask & ~MLX4_CREATE_QP_SUP_COMP_MASK) { errno = ENOTSUP; return NULL; } qp = calloc(1, sizeof *qp); if (!qp) return NULL; if (attr->qp_type == IBV_QPT_XRC_RECV) { attr->cap.max_send_wr = qp->sq.wqe_cnt = 0; } else { mlx4_calc_sq_wqe_size(&attr->cap, attr->qp_type, qp, attr); /* * We need to leave 2 KB + 1 WQE of headroom in the SQ to * allow HW to prefetch. */ qp->sq_spare_wqes = (2048 >> qp->sq.wqe_shift) + 1; qp->sq.wqe_cnt = roundup_pow_of_two(attr->cap.max_send_wr + qp->sq_spare_wqes); } if (attr->srq || attr->qp_type == IBV_QPT_XRC_SEND || attr->qp_type == IBV_QPT_XRC_RECV) { attr->cap.max_recv_wr = qp->rq.wqe_cnt = attr->cap.max_recv_sge = 0; } else { qp->rq.wqe_cnt = roundup_pow_of_two(attr->cap.max_recv_wr); if (attr->cap.max_recv_sge < 1) attr->cap.max_recv_sge = 1; if (attr->cap.max_recv_wr < 1) attr->cap.max_recv_wr = 1; } if (mlx4_alloc_qp_buf(context, attr->cap.max_recv_sge, attr->qp_type, qp, mlx4qp_attr)) goto err; mlx4_init_qp_indices(qp); if (pthread_spin_init(&qp->sq.lock, PTHREAD_PROCESS_PRIVATE) || pthread_spin_init(&qp->rq.lock, PTHREAD_PROCESS_PRIVATE)) goto err_free; if (mlx4qp_attr) { if (!check_comp_mask(mlx4qp_attr->comp_mask, MLX4DV_QP_INIT_ATTR_MASK_RESERVED - 1)) { errno = EINVAL; goto err_free; } if (mlx4qp_attr->comp_mask & MLX4DV_QP_INIT_ATTR_MASK_INL_RECV) cmd.inl_recv_sz = mlx4qp_attr->inl_recv_sz; } if (attr->cap.max_recv_sge) { qp->db = mlx4_alloc_db(to_mctx(context), MLX4_DB_TYPE_RQ); if (!qp->db) goto err_free; *qp->db = 0; cmd.db_addr = (uintptr_t) qp->db; } else { cmd.db_addr = 0; } cmd.buf_addr = (uintptr_t) qp->buf.buf; cmd.log_sq_stride = qp->sq.wqe_shift; for (cmd.log_sq_bb_count = 0; qp->sq.wqe_cnt > 1 << cmd.log_sq_bb_count; ++cmd.log_sq_bb_count) ; /* nothing */ cmd.sq_no_prefetch = 0; /* OK for ABI 2: just a reserved field */ pthread_mutex_lock(&to_mctx(context)->qp_table_mutex); if (attr->comp_mask & MLX4_CREATE_QP_EX2_COMP_MASK) ret = mlx4_cmd_create_qp_ex(context, attr, &cmd, qp); else ret = ibv_cmd_create_qp_ex(context, &qp->verbs_qp, attr, &cmd.ibv_cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) goto err_rq_db; if (qp->sq.wqe_cnt || qp->rq.wqe_cnt) { ret = mlx4_store_qp(to_mctx(context), qp->verbs_qp.qp.qp_num, qp); if (ret) goto err_destroy; } pthread_mutex_unlock(&to_mctx(context)->qp_table_mutex); qp->rq.wqe_cnt = qp->rq.max_post = attr->cap.max_recv_wr; qp->rq.max_gs = attr->cap.max_recv_sge; if (attr->qp_type != IBV_QPT_XRC_RECV) mlx4_set_sq_sizes(qp, &attr->cap, attr->qp_type); qp->doorbell_qpn = htobe32(qp->verbs_qp.qp.qp_num << 8); if (attr->sq_sig_all) qp->sq_signal_bits = htobe32(MLX4_WQE_CTRL_CQ_UPDATE); else qp->sq_signal_bits = 0; qp->qpn_cache = qp->verbs_qp.qp.qp_num; qp->type = attr->srq ? 
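/*
 * Editor's note: worked example, not part of the original source.
 * The "2 KB + 1 WQE" prefetch headroom above translates into ring sizing
 * as follows for 64-byte send WQEs (wqe_shift == 6) and a requested
 * max_send_wr of 100:
 *
 *	sq_spare_wqes = (2048 >> 6) + 1;		// 33
 *	sq.wqe_cnt = roundup_pow_of_two(100 + 33);	// 256
 *
 * so the hardware can prefetch past the producer index without ever
 * touching a WQE slot the application is still entitled to post into.
 */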
MLX4_RSC_TYPE_SRQ : MLX4_RSC_TYPE_QP; return &qp->verbs_qp.qp; err_destroy: ibv_cmd_destroy_qp(&qp->verbs_qp.qp); err_rq_db: pthread_mutex_unlock(&to_mctx(context)->qp_table_mutex); if (attr->cap.max_recv_sge) mlx4_free_db(to_mctx(context), MLX4_DB_TYPE_RQ, qp->db); err_free: free(qp->sq.wrid); if (qp->rq.wqe_cnt) free(qp->rq.wrid); mlx4_free_buf(ctx, &qp->buf); err: free(qp); return NULL; } struct ibv_qp *mlx4_create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr) { return create_qp_ex(context, attr, NULL); } struct ibv_qp *mlx4dv_create_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr, struct mlx4dv_qp_init_attr *mlx4_qp_attr) { return create_qp_ex(context, attr, mlx4_qp_attr); } struct ibv_qp *mlx4_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct ibv_qp_init_attr_ex attr_ex; struct ibv_qp *qp; memcpy(&attr_ex, attr, sizeof *attr); attr_ex.comp_mask = IBV_QP_INIT_ATTR_PD; attr_ex.pd = pd; qp = mlx4_create_qp_ex(pd->context, &attr_ex); if (qp) memcpy(attr, &attr_ex, sizeof *attr); return qp; } struct ibv_qp *mlx4_open_qp(struct ibv_context *context, struct ibv_qp_open_attr *attr) { struct ibv_open_qp cmd; struct ib_uverbs_create_qp_resp resp; struct mlx4_qp *qp; int ret; qp = calloc(1, sizeof *qp); if (!qp) return NULL; ret = ibv_cmd_open_qp(context, &qp->verbs_qp, sizeof(qp->verbs_qp), attr, &cmd, sizeof cmd, &resp, sizeof resp); if (ret) goto err; return &qp->verbs_qp.qp; err: free(qp); return NULL; } int mlx4_query_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd; struct mlx4_qp *qp = to_mqp(ibqp); int ret; if (qp->type == MLX4_RSC_TYPE_RSS_QP) return ENOTSUP; ret = ibv_cmd_query_qp(ibqp, attr, attr_mask, init_attr, &cmd, sizeof cmd); if (ret) return ret; init_attr->cap.max_send_wr = qp->sq.max_post; init_attr->cap.max_send_sge = qp->sq.max_gs; init_attr->cap.max_inline_data = qp->max_inline_data; attr->cap = init_attr->cap; return 0; } static int _mlx4_modify_qp_rss(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask) { struct ibv_modify_qp cmd = {}; if (attr_mask & ~(IBV_QP_STATE | IBV_QP_PORT)) return ENOTSUP; if (attr->qp_state > IBV_QPS_RTR) return ENOTSUP; return ibv_cmd_modify_qp(qp, attr, attr_mask, &cmd, sizeof(cmd)); } int mlx4_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask) { struct ibv_modify_qp cmd = {}; struct ibv_port_attr port_attr; struct mlx4_qp *mqp = to_mqp(qp); struct ibv_device_attr device_attr; int ret; if (mqp->type == MLX4_RSC_TYPE_RSS_QP) return _mlx4_modify_qp_rss(qp, attr, attr_mask); memset(&device_attr, 0, sizeof(device_attr)); if (attr_mask & IBV_QP_PORT) { ret = ibv_query_port(qp->context, attr->port_num, &port_attr); if (ret) return ret; mqp->link_layer = port_attr.link_layer; ret = ibv_query_device(qp->context, &device_attr); if (ret) return ret; switch(qp->qp_type) { case IBV_QPT_UD: if ((mqp->link_layer == IBV_LINK_LAYER_INFINIBAND) && (device_attr.device_cap_flags & IBV_DEVICE_UD_IP_CSUM)) mqp->qp_cap_cache |= MLX4_CSUM_SUPPORT_UD_OVER_IB | MLX4_RX_CSUM_VALID; break; case IBV_QPT_RAW_PACKET: if ((mqp->link_layer == IBV_LINK_LAYER_ETHERNET) && (device_attr.device_cap_flags & IBV_DEVICE_RAW_IP_CSUM)) mqp->qp_cap_cache |= MLX4_CSUM_SUPPORT_RAW_OVER_ETH | MLX4_RX_CSUM_VALID; break; default: break; } } if (qp->state == IBV_QPS_RESET && attr_mask & IBV_QP_STATE && attr->qp_state == IBV_QPS_INIT) { mlx4_qp_init_sq_ownership(to_mqp(qp)); } ret = ibv_cmd_modify_qp(qp, attr, attr_mask, &cmd, sizeof 
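/*
 * Editor's note: illustrative sketch, not part of the original source.
 * mlx4_modify_qp() above piggybacks two side effects on the generic
 * command: it caches the port's link layer and checksum capabilities when
 * IBV_QP_PORT is set, and re-initializes SQ ownership bits on the
 * RESET->INIT transition. The caller-side sequence is the usual verbs
 * state ladder:
 *
 *	struct ibv_qp_attr a = { .qp_state = IBV_QPS_INIT,
 *				 .pkey_index = 0, .port_num = 1,
 *				 .qp_access_flags = 0 };
 *	ibv_modify_qp(qp, &a, IBV_QP_STATE | IBV_QP_PKEY_INDEX |
 *			      IBV_QP_PORT | IBV_QP_ACCESS_FLAGS);
 *	// ... then INIT->RTR (path and remote QP info) and
 *	// RTR->RTS (timeouts, retry counts, PSN)
 */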
cmd); if (!ret && (attr_mask & IBV_QP_STATE) && attr->qp_state == IBV_QPS_RESET) { if (qp->recv_cq) mlx4_cq_clean(to_mcq(qp->recv_cq), qp->qp_num, qp->srq ? to_msrq(qp->srq) : NULL); if (qp->send_cq && qp->send_cq != qp->recv_cq) mlx4_cq_clean(to_mcq(qp->send_cq), qp->qp_num, NULL); mlx4_init_qp_indices(to_mqp(qp)); if (to_mqp(qp)->rq.wqe_cnt) *to_mqp(qp)->db = 0; } return ret; } static void mlx4_lock_cqs(struct ibv_qp *qp) { struct mlx4_cq *send_cq = to_mcq(qp->send_cq); struct mlx4_cq *recv_cq = to_mcq(qp->recv_cq); if (!qp->send_cq || !qp->recv_cq) { if (qp->send_cq) pthread_spin_lock(&send_cq->lock); else if (qp->recv_cq) pthread_spin_lock(&recv_cq->lock); } else if (send_cq == recv_cq) { pthread_spin_lock(&send_cq->lock); } else if (send_cq->cqn < recv_cq->cqn) { pthread_spin_lock(&send_cq->lock); pthread_spin_lock(&recv_cq->lock); } else { pthread_spin_lock(&recv_cq->lock); pthread_spin_lock(&send_cq->lock); } } static void mlx4_unlock_cqs(struct ibv_qp *qp) { struct mlx4_cq *send_cq = to_mcq(qp->send_cq); struct mlx4_cq *recv_cq = to_mcq(qp->recv_cq); if (!qp->send_cq || !qp->recv_cq) { if (qp->send_cq) pthread_spin_unlock(&send_cq->lock); else if (qp->recv_cq) pthread_spin_unlock(&recv_cq->lock); } else if (send_cq == recv_cq) { pthread_spin_unlock(&send_cq->lock); } else if (send_cq->cqn < recv_cq->cqn) { pthread_spin_unlock(&recv_cq->lock); pthread_spin_unlock(&send_cq->lock); } else { pthread_spin_unlock(&send_cq->lock); pthread_spin_unlock(&recv_cq->lock); } } static int _mlx4_destroy_qp_rss(struct ibv_qp *ibqp) { struct mlx4_qp *qp = to_mqp(ibqp); int ret; ret = ibv_cmd_destroy_qp(ibqp); if (ret) return ret; free(qp); return 0; } int mlx4_destroy_qp(struct ibv_qp *ibqp) { struct mlx4_qp *qp = to_mqp(ibqp); int ret; if (qp->type == MLX4_RSC_TYPE_RSS_QP) return _mlx4_destroy_qp_rss(ibqp); pthread_mutex_lock(&to_mctx(ibqp->context)->qp_table_mutex); ret = ibv_cmd_destroy_qp(ibqp); if (ret) { pthread_mutex_unlock(&to_mctx(ibqp->context)->qp_table_mutex); return ret; } mlx4_lock_cqs(ibqp); if (ibqp->recv_cq) __mlx4_cq_clean(to_mcq(ibqp->recv_cq), ibqp->qp_num, ibqp->srq ? to_msrq(ibqp->srq) : NULL); if (ibqp->send_cq && ibqp->send_cq != ibqp->recv_cq) __mlx4_cq_clean(to_mcq(ibqp->send_cq), ibqp->qp_num, NULL); if (qp->sq.wqe_cnt || qp->rq.wqe_cnt) mlx4_clear_qp(to_mctx(ibqp->context), ibqp->qp_num); mlx4_unlock_cqs(ibqp); pthread_mutex_unlock(&to_mctx(ibqp->context)->qp_table_mutex); if (qp->rq.wqe_cnt) { mlx4_free_db(to_mctx(ibqp->context), MLX4_DB_TYPE_RQ, qp->db); free(qp->rq.wrid); } if (qp->sq.wqe_cnt) free(qp->sq.wrid); mlx4_free_buf(to_mctx(ibqp->context), &qp->buf); free(qp); return 0; } static int link_local_gid(const union ibv_gid *gid) { return gid->global.subnet_prefix == htobe64(0xfe80000000000000ULL); } static int is_multicast_gid(const union ibv_gid *gid) { return gid->raw[0] == 0xff; } static uint16_t get_vlan_id(union ibv_gid *gid) { uint16_t vid; vid = gid->raw[11] << 8 | gid->raw[12]; return vid < 0x1000 ? 
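/*
 * Editor's note: illustrative sketch, not part of the original source.
 * mlx4_lock_cqs()/mlx4_unlock_cqs() below avoid an ABBA deadlock when a
 * QP has distinct send and recv CQs: both helpers order the two spinlocks
 * by CQ number, so any two threads tearing down QPs that share a CQ pair
 * acquire the locks in the same global order:
 *
 *	// thread A: send_cq->cqn == 3, recv_cq->cqn == 7 -> lock 3, then 7
 *	// thread B: send_cq->cqn == 7, recv_cq->cqn == 3 -> lock 3, then 7
 *
 * and release them in the reverse order of acquisition.
 */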
vid : 0xffff; } static int mlx4_resolve_grh_to_l2(struct ibv_pd *pd, struct mlx4_ah *ah, struct ibv_ah_attr *attr) { int err, i; uint16_t vid; union ibv_gid sgid; if (link_local_gid(&attr->grh.dgid)) { memcpy(ah->mac, &attr->grh.dgid.raw[8], 3); memcpy(ah->mac + 3, &attr->grh.dgid.raw[13], 3); ah->mac[0] ^= 2; vid = get_vlan_id(&attr->grh.dgid); } else if (is_multicast_gid(&attr->grh.dgid)) { ah->mac[0] = 0x33; ah->mac[1] = 0x33; for (i = 2; i < 6; ++i) ah->mac[i] = attr->grh.dgid.raw[i + 10]; err = ibv_query_gid(pd->context, attr->port_num, attr->grh.sgid_index, &sgid); if (err) return err; ah->av.dlid = htobe16(0xc000); ah->av.port_pd |= htobe32(1 << 31); vid = get_vlan_id(&sgid); } else return 1; if (vid != 0xffff) { ah->av.port_pd |= htobe32(1 << 29); ah->vlan = vid | ((attr->sl & 7) << 13); } return 0; } struct ibv_ah *mlx4_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr) { struct mlx4_ah *ah; struct ibv_port_attr port_attr; if (query_port_cache(pd->context, attr->port_num, &port_attr)) return NULL; if (port_attr.flags & IBV_QPF_GRH_REQUIRED && !attr->is_global) { errno = EINVAL; return NULL; } ah = malloc(sizeof *ah); if (!ah) return NULL; memset(&ah->av, 0, sizeof ah->av); ah->av.port_pd = htobe32(to_mpd(pd)->pdn | (attr->port_num << 24)); if (port_attr.link_layer != IBV_LINK_LAYER_ETHERNET) { ah->av.g_slid = attr->src_path_bits; ah->av.dlid = htobe16(attr->dlid); ah->av.sl_tclass_flowlabel = htobe32(attr->sl << 28); } else ah->av.sl_tclass_flowlabel = htobe32(attr->sl << 29); if (attr->static_rate) { ah->av.stat_rate = attr->static_rate + MLX4_STAT_RATE_OFFSET; /* XXX check rate cap? */ } if (attr->is_global) { ah->av.g_slid |= 0x80; ah->av.gid_index = attr->grh.sgid_index; ah->av.hop_limit = attr->grh.hop_limit; ah->av.sl_tclass_flowlabel |= htobe32((attr->grh.traffic_class << 20) | attr->grh.flow_label); memcpy(ah->av.dgid, attr->grh.dgid.raw, 16); } if (port_attr.link_layer == IBV_LINK_LAYER_ETHERNET) { if (port_attr.port_cap_flags & IBV_PORT_IP_BASED_GIDS) { uint16_t vid; if (ibv_resolve_eth_l2_from_gid(pd->context, attr, ah->mac, &vid)) { free(ah); return NULL; } if (vid <= 0xfff) { ah->av.port_pd |= htobe32(1 << 29); ah->vlan = vid | ((attr->sl & 7) << 13); } } else { if (mlx4_resolve_grh_to_l2(pd, ah, attr)) { free(ah); return NULL; } } } return &ah->ibv_ah; } int mlx4_destroy_ah(struct ibv_ah *ah) { free(to_mah(ah)); return 0; } struct ibv_wq *mlx4_create_wq(struct ibv_context *context, struct ibv_wq_init_attr *attr) { struct mlx4_context *ctx = to_mctx(context); struct mlx4_create_wq cmd = {}; struct ib_uverbs_ex_create_wq_resp resp = {}; struct mlx4_qp *qp; int ret; if (attr->wq_type != IBV_WQT_RQ) { errno = ENOTSUP; return NULL; } /* Sanity check QP size before proceeding */ if (ctx->max_qp_wr) { /* mlx4_query_device succeeded */ if (attr->max_wr > ctx->max_qp_wr || attr->max_sge > ctx->max_sge) { errno = EINVAL; return NULL; } } else { if (attr->max_wr > 65536 || attr->max_sge > 64) { errno = EINVAL; return NULL; } } if (!check_comp_mask(attr->comp_mask, IBV_WQ_INIT_ATTR_FLAGS)) { errno = ENOTSUP; return NULL; } if ((attr->comp_mask & IBV_WQ_INIT_ATTR_FLAGS) && (attr->create_flags & ~IBV_WQ_FLAGS_SCATTER_FCS)) { errno = ENOTSUP; return NULL; } qp = calloc(1, sizeof(*qp)); if (!qp) return NULL; if (attr->max_sge < 1) attr->max_sge = 1; if (attr->max_wr < 1) attr->max_wr = 1; /* Kernel driver requires a dummy SQ with minimum properties */ qp->sq.wqe_shift = 6; qp->sq.wqe_cnt = 1; qp->rq.wqe_cnt = roundup_pow_of_two(attr->max_wr); if (mlx4_alloc_qp_buf(context, 
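/*
 * Editor's note: worked example, not part of the original source.
 * For a link-local RoCE GID, mlx4_resolve_grh_to_l2() above inverts the
 * modified-EUI-64 mapping: GID bytes 8-10 and 13-15 are the two MAC
 * halves (skipping the ff:fe filler), and XORing bit 1 of the first byte
 * undoes the universal/local inversion. E.g. for the GID
 *
 *	fe80::a8bb:ccff:fedd:eeff
 *	       raw[8..15] = a8 bb cc ff fe dd ee ff
 *
 * the recovered station address is aa:bb:cc:dd:ee:ff. Multicast GIDs map
 * instead onto the IPv6 33:33:xx:xx:xx:xx multicast MAC range, as the
 * second branch shows.
 */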
attr->max_sge, IBV_QPT_RAW_PACKET, qp, NULL)) goto err; mlx4_init_qp_indices(qp); mlx4_qp_init_sq_ownership(qp); /* For dummy SQ */ if (pthread_spin_init(&qp->rq.lock, PTHREAD_PROCESS_PRIVATE)) goto err_free; qp->db = mlx4_alloc_db(to_mctx(context), MLX4_DB_TYPE_RQ); if (!qp->db) goto err_free; *qp->db = 0; cmd.db_addr = (uintptr_t)qp->db; cmd.buf_addr = (uintptr_t)qp->buf.buf; cmd.log_range_size = ctx->log_wqs_range_sz; pthread_mutex_lock(&to_mctx(context)->qp_table_mutex); ret = ibv_cmd_create_wq(context, attr, &qp->wq, &cmd.ibv_cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) goto err_rq_db; ret = mlx4_store_qp(to_mctx(context), qp->wq.wq_num, qp); if (ret) goto err_destroy; pthread_mutex_unlock(&to_mctx(context)->qp_table_mutex); ctx->log_wqs_range_sz = 0; qp->rq.max_post = attr->max_wr; qp->rq.wqe_cnt = attr->max_wr; qp->rq.max_gs = attr->max_sge; qp->wq.state = IBV_WQS_RESET; qp->wq.post_recv = mlx4_post_wq_recv; qp->qpn_cache = qp->wq.wq_num; return &qp->wq; err_destroy: ibv_cmd_destroy_wq(&qp->wq); err_rq_db: pthread_mutex_unlock(&to_mctx(context)->qp_table_mutex); mlx4_free_db(to_mctx(context), MLX4_DB_TYPE_RQ, qp->db); err_free: free(qp->rq.wrid); mlx4_free_buf(to_mctx(context), &qp->buf); err: free(qp); return NULL; } int mlx4_modify_wq(struct ibv_wq *ibwq, struct ibv_wq_attr *attr) { struct mlx4_qp *qp = wq_to_mqp(ibwq); struct mlx4_modify_wq cmd = {}; int ret; ret = ibv_cmd_modify_wq(ibwq, attr, &cmd.ibv_cmd, sizeof(cmd)); if (!ret && (attr->attr_mask & IBV_WQ_ATTR_STATE) && (ibwq->state == IBV_WQS_RESET)) { mlx4_cq_clean(to_mcq(ibwq->cq), ibwq->wq_num, NULL); mlx4_init_qp_indices(qp); *qp->db = 0; } return ret; } struct ibv_flow *mlx4_create_flow(struct ibv_qp *qp, struct ibv_flow_attr *flow_attr) { struct ibv_flow *flow_id; int ret; flow_id = calloc(1, sizeof *flow_id); if (!flow_id) return NULL; ret = ibv_cmd_create_flow(qp, flow_id, flow_attr, NULL, 0); if (!ret) return flow_id; free(flow_id); return NULL; } int mlx4_destroy_flow(struct ibv_flow *flow_id) { int ret; ret = ibv_cmd_destroy_flow(flow_id); if (ret) return ret; free(flow_id); return 0; } int mlx4_destroy_wq(struct ibv_wq *ibwq) { struct mlx4_context *mcontext = to_mctx(ibwq->context); struct mlx4_qp *qp = wq_to_mqp(ibwq); struct mlx4_cq *cq = NULL; int ret; pthread_mutex_lock(&mcontext->qp_table_mutex); ret = ibv_cmd_destroy_wq(ibwq); if (ret) { pthread_mutex_unlock(&mcontext->qp_table_mutex); return ret; } cq = to_mcq(ibwq->cq); pthread_spin_lock(&cq->lock); __mlx4_cq_clean(cq, ibwq->wq_num, NULL); mlx4_clear_qp(mcontext, ibwq->wq_num); pthread_spin_unlock(&cq->lock); pthread_mutex_unlock(&mcontext->qp_table_mutex); mlx4_free_db(mcontext, MLX4_DB_TYPE_RQ, qp->db); free(qp->rq.wrid); free(qp->sq.wrid); mlx4_free_buf(mcontext, &qp->buf); free(qp); return 0; } struct ibv_rwq_ind_table *mlx4_create_rwq_ind_table(struct ibv_context *context, struct ibv_rwq_ind_table_init_attr *init_attr) { struct ib_uverbs_ex_create_rwq_ind_table_resp resp = {}; struct ibv_rwq_ind_table *ind_table; int err; ind_table = calloc(1, sizeof(*ind_table)); if (!ind_table) return NULL; err = ibv_cmd_create_rwq_ind_table(context, init_attr, ind_table, &resp, sizeof(resp)); if (err) goto err; return ind_table; err: free(ind_table); return NULL; } int mlx4_destroy_rwq_ind_table(struct ibv_rwq_ind_table *rwq_ind_table) { int ret; ret = ibv_cmd_destroy_rwq_ind_table(rwq_ind_table); if (ret) return ret; free(rwq_ind_table); return 0; } int mlx4_modify_cq(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr) { struct ibv_modify_cq cmd = {}; return 
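/*
 * Editor's note: illustrative sketch, not part of the original source.
 * The WQ, indirection-table and flow verbs in this file combine into the
 * usual RSS receive pipeline, built bottom-up and torn down in reverse:
 *
 *	struct ibv_wq *wqs[4];			// from mlx4_create_wq(), x4
 *	struct ibv_rwq_ind_table_init_attr ia = {
 *		.log_ind_tbl_size = 2,		// 2^2 == 4 entries
 *		.ind_tbl = wqs,
 *	};
 *	struct ibv_rwq_ind_table *tbl =
 *		ibv_create_rwq_ind_table(ctx, &ia);
 *	// ... then the RSS QP referencing tbl, then ibv_create_flow()
 *	// on that QP to steer traffic into the WQ set
 *
 * Each WQ must be moved IBV_WQS_RESET -> IBV_WQS_RDY with ibv_modify_wq()
 * before it can receive.
 */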
ibv_cmd_modify_cq(cq, attr, &cmd, sizeof(cmd));
}
rdma-core-56.1/providers/mlx5/000077500000000000000000000000001477342711600162345ustar00rootroot00000000000000rdma-core-56.1/providers/mlx5/CMakeLists.txt000066400000000000000000000021621477342711600207750ustar00rootroot00000000000000set(MLX5_DEBUG "FALSE" CACHE BOOL
    "Enable expensive runtime logging options for the mlx5 verbs provider")
if (MLX5_DEBUG)
  add_definitions("-DMLX5_DEBUG")
endif()

set(MLX5_MW_DEBUG "FALSE" CACHE BOOL
    "Enable extra validation of memory windows for the mlx5 verbs provider")
if (MLX5_MW_DEBUG)
  add_definitions("-DMW_DEBUG")
endif()

if (ENABLE_LTTNG AND LTTNGUST_FOUND)
  set(TRACE_FILE mlx5_trace.c)
endif()

rdma_shared_provider(mlx5 libmlx5.map
  1 1.25.${PACKAGE_VERSION}
  ${TRACE_FILE}
  buf.c
  cq.c
  dbrec.c
  dr_action.c
  dr_buddy.c
  dr_crc32.c
  dr_dbg.c
  dr_devx.c
  dr_icm_pool.c
  dr_matcher.c
  dr_domain.c
  dr_rule.c
  dr_ste.c
  dr_ste_v0.c
  dr_ste_v1.c
  dr_ste_v2.c
  dr_ste_v3.c
  dr_table.c
  dr_send.c
  dr_vports.c
  dr_ptrn.c
  dr_arg.c
  mlx5.c
  mlx5_vfio.c
  qp.c
  srq.c
  verbs.c
)
publish_headers(infiniband
  ../../kernel-headers/rdma/mlx5_user_ioctl_verbs.h
  mlx5_api.h
  mlx5dv.h
)
rdma_pkg_config("mlx5" "libibverbs" "${CMAKE_THREAD_LIBS_INIT}")

if (ENABLE_LTTNG AND LTTNGUST_FOUND)
  target_include_directories(mlx5 PUBLIC ".")
  target_link_libraries(mlx5 LINK_PRIVATE LTTng::UST)
endif()
rdma-core-56.1/providers/mlx5/buf.c000066400000000000000000000332611477342711600171610ustar00rootroot00000000000000/*
 * Copyright (c) 2012 Mellanox Technologies, Inc. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses. You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
*/ #include #include #include #include #include #include #include #include #include #include "mlx5.h" /* Only ia64 requires this */ #ifdef __ia64__ #define MLX5_SHM_ADDR ((void *)0x8000000000000000UL) #define MLX5_SHMAT_FLAGS (SHM_RND) #else #define MLX5_SHM_ADDR NULL #define MLX5_SHMAT_FLAGS 0 #endif #ifndef HPAGE_SIZE #define HPAGE_SIZE (2UL * 1024 * 1024) #endif #define MLX5_SHM_LENGTH HPAGE_SIZE #define MLX5_Q_CHUNK_SIZE 32768 static void free_huge_mem(struct mlx5_hugetlb_mem *hmem) { if (hmem->bitmap) free(hmem->bitmap); if (shmdt(hmem->shmaddr) == -1) mlx5_dbg(stderr, MLX5_DBG_CONTIG, "%s\n", strerror(errno)); shmctl(hmem->shmid, IPC_RMID, NULL); free(hmem); } static struct mlx5_hugetlb_mem *alloc_huge_mem(size_t size) { struct mlx5_hugetlb_mem *hmem; size_t shm_len; hmem = malloc(sizeof(*hmem)); if (!hmem) return NULL; shm_len = align(size, MLX5_SHM_LENGTH); hmem->shmid = shmget(IPC_PRIVATE, shm_len, SHM_HUGETLB | SHM_R | SHM_W); if (hmem->shmid == -1) { mlx5_dbg(stderr, MLX5_DBG_CONTIG, "%s\n", strerror(errno)); goto out_free; } hmem->shmaddr = shmat(hmem->shmid, MLX5_SHM_ADDR, MLX5_SHMAT_FLAGS); if (hmem->shmaddr == (void *)-1) { mlx5_dbg(stderr, MLX5_DBG_CONTIG, "%s\n", strerror(errno)); goto out_rmid; } hmem->bitmap = bitmap_alloc0(shm_len / MLX5_Q_CHUNK_SIZE); if (!hmem->bitmap) { mlx5_dbg(stderr, MLX5_DBG_CONTIG, "%s\n", strerror(errno)); goto out_shmdt; } hmem->bmp_size = shm_len / MLX5_Q_CHUNK_SIZE; /* * Marked to be destroyed when process detaches from shmget segment */ shmctl(hmem->shmid, IPC_RMID, NULL); return hmem; out_shmdt: if (shmdt(hmem->shmaddr) == -1) mlx5_dbg(stderr, MLX5_DBG_CONTIG, "%s\n", strerror(errno)); out_rmid: shmctl(hmem->shmid, IPC_RMID, NULL); out_free: free(hmem); return NULL; } static int alloc_huge_buf(struct mlx5_context *mctx, struct mlx5_buf *buf, size_t size, int page_size) { int found = 0; int nchunk; struct mlx5_hugetlb_mem *hmem; int ret; buf->length = align(size, MLX5_Q_CHUNK_SIZE); nchunk = buf->length / MLX5_Q_CHUNK_SIZE; if (!nchunk) return 0; mlx5_spin_lock(&mctx->hugetlb_lock); list_for_each(&mctx->hugetlb_list, hmem, entry) { if (!bitmap_full(hmem->bitmap, hmem->bmp_size)) { buf->base = bitmap_find_free_region(hmem->bitmap, hmem->bmp_size, nchunk); if (buf->base != hmem->bmp_size) { bitmap_fill_region(hmem->bitmap, buf->base, buf->base + nchunk); buf->hmem = hmem; found = 1; break; } } } mlx5_spin_unlock(&mctx->hugetlb_lock); if (!found) { hmem = alloc_huge_mem(buf->length); if (!hmem) return -1; buf->base = 0; assert(nchunk <= hmem->bmp_size); bitmap_fill_region(hmem->bitmap, 0, nchunk); buf->hmem = hmem; mlx5_spin_lock(&mctx->hugetlb_lock); if (nchunk != hmem->bmp_size) list_add(&mctx->hugetlb_list, &hmem->entry); else list_add_tail(&mctx->hugetlb_list, &hmem->entry); mlx5_spin_unlock(&mctx->hugetlb_lock); } buf->buf = hmem->shmaddr + buf->base * MLX5_Q_CHUNK_SIZE; ret = ibv_dontfork_range(buf->buf, buf->length); if (ret) { mlx5_dbg(stderr, MLX5_DBG_CONTIG, "\n"); goto out_fork; } buf->type = MLX5_ALLOC_TYPE_HUGE; return 0; out_fork: mlx5_spin_lock(&mctx->hugetlb_lock); bitmap_zero_region(hmem->bitmap, buf->base, buf->base + nchunk); if (bitmap_empty(hmem->bitmap, hmem->bmp_size)) { list_del(&hmem->entry); mlx5_spin_unlock(&mctx->hugetlb_lock); free_huge_mem(hmem); } else mlx5_spin_unlock(&mctx->hugetlb_lock); return -1; } static void free_huge_buf(struct mlx5_context *ctx, struct mlx5_buf *buf) { int nchunk; nchunk = buf->length / MLX5_Q_CHUNK_SIZE; if (!nchunk) return; mlx5_spin_lock(&ctx->hugetlb_lock); 
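/*
 * Editor's note: illustrative sketch, not part of the original source.
 * The hugetlb pool above relies on a classic SysV-SHM idiom: the segment
 * is created with SHM_HUGETLB, attached, and immediately marked IPC_RMID,
 * so the kernel destroys it automatically once the last process detaches,
 * even if this process crashes. Stripped of the bitmap bookkeeping, the
 * pattern is:
 *
 *	int id = shmget(IPC_PRIVATE, len, SHM_HUGETLB | SHM_R | SHM_W);
 *	void *p = shmat(id, NULL, 0);
 *	shmctl(id, IPC_RMID, NULL);	// deferred destroy: no leak on crash
 *	// ... carve MLX5_Q_CHUNK_SIZE chunks out of p ...
 *	shmdt(p);			// last detach frees the huge pages
 */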
bitmap_zero_region(buf->hmem->bitmap, buf->base, buf->base + nchunk); if (bitmap_empty(buf->hmem->bitmap, buf->hmem->bmp_size)) { list_del(&buf->hmem->entry); mlx5_spin_unlock(&ctx->hugetlb_lock); free_huge_mem(buf->hmem); } else mlx5_spin_unlock(&ctx->hugetlb_lock); } void mlx5_free_buf_extern(struct mlx5_context *ctx, struct mlx5_buf *buf) { ibv_dofork_range(buf->buf, buf->length); ctx->extern_alloc.free(buf->buf, ctx->extern_alloc.data); } int mlx5_alloc_buf_extern(struct mlx5_context *ctx, struct mlx5_buf *buf, size_t size) { void *addr; addr = ctx->extern_alloc.alloc(size, ctx->extern_alloc.data); if (addr || size == 0) { if (ibv_dontfork_range(addr, size)) { mlx5_dbg(stderr, MLX5_DBG_CONTIG, "External mode dontfork_range failed\n"); ctx->extern_alloc.free(addr, ctx->extern_alloc.data); return -1; } buf->buf = addr; buf->length = size; buf->type = MLX5_ALLOC_TYPE_EXTERNAL; return 0; } mlx5_dbg(stderr, MLX5_DBG_CONTIG, "External alloc failed\n"); return -1; } static void mlx5_free_buf_custom(struct mlx5_context *ctx, struct mlx5_buf *buf) { struct mlx5_parent_domain *mparent_domain = buf->mparent_domain; mparent_domain->free(&mparent_domain->mpd.ibv_pd, mparent_domain->pd_context, buf->buf, buf->resource_type); } static int mlx5_alloc_buf_custom(struct mlx5_context *ctx, struct mlx5_buf *buf, size_t size) { struct mlx5_parent_domain *mparent_domain = buf->mparent_domain; void *addr; addr = mparent_domain->alloc(&mparent_domain->mpd.ibv_pd, mparent_domain->pd_context, size, buf->req_alignment, buf->resource_type); if (addr == IBV_ALLOCATOR_USE_DEFAULT) return 1; if (addr || size == 0) { buf->buf = addr; buf->length = size; buf->type = MLX5_ALLOC_TYPE_CUSTOM; return 0; } return -1; } int mlx5_alloc_prefered_buf(struct mlx5_context *mctx, struct mlx5_buf *buf, size_t size, int page_size, enum mlx5_alloc_type type, const char *component) { int ret; if (type == MLX5_ALLOC_TYPE_CUSTOM) { ret = mlx5_alloc_buf_custom(mctx, buf, size); if (ret <= 0) return ret; /* Fallback - default allocation is required */ } /* * Fallback mechanism priority: * huge pages * contig pages * default */ if (type == MLX5_ALLOC_TYPE_HUGE || type == MLX5_ALLOC_TYPE_PREFER_HUGE || type == MLX5_ALLOC_TYPE_ALL) { ret = alloc_huge_buf(mctx, buf, size, page_size); if (!ret) return 0; if (type == MLX5_ALLOC_TYPE_HUGE) return -1; mlx5_dbg(stderr, MLX5_DBG_CONTIG, "Huge mode allocation failed, fallback to %s mode\n", MLX5_ALLOC_TYPE_ALL ? "contig" : "default"); } if (type == MLX5_ALLOC_TYPE_CONTIG || type == MLX5_ALLOC_TYPE_PREFER_CONTIG || type == MLX5_ALLOC_TYPE_ALL) { ret = mlx5_alloc_buf_contig(mctx, buf, size, page_size, component); if (!ret) return 0; if (type == MLX5_ALLOC_TYPE_CONTIG) return -1; mlx5_dbg(stderr, MLX5_DBG_CONTIG, "Contig allocation failed, fallback to default mode\n"); } if (type == MLX5_ALLOC_TYPE_EXTERNAL) return mlx5_alloc_buf_extern(mctx, buf, size); return mlx5_alloc_buf(buf, size, page_size); } int mlx5_free_actual_buf(struct mlx5_context *ctx, struct mlx5_buf *buf) { int err = 0; switch (buf->type) { case MLX5_ALLOC_TYPE_ANON: mlx5_free_buf(buf); break; case MLX5_ALLOC_TYPE_HUGE: free_huge_buf(ctx, buf); break; case MLX5_ALLOC_TYPE_CONTIG: mlx5_free_buf_contig(ctx, buf); break; case MLX5_ALLOC_TYPE_EXTERNAL: mlx5_free_buf_extern(ctx, buf); break; case MLX5_ALLOC_TYPE_CUSTOM: mlx5_free_buf_custom(ctx, buf); break; default: mlx5_err(ctx->dbg_fp, "Bad allocation type\n"); } return err; } /* This function computes log2(v) rounded up. 
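   Editor's note (added worked example, not in the original comment): the
   routine returns ceil(log2(v)); e.g. mlx5_get_block_order(4096) == 12 but
   mlx5_get_block_order(4097) == 13, because the final
   "r += !!(input_val & ((1 << r) - 1))" step rounds up whenever any bit
   below the floor-log2 position is set.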
We don't want to have a dependency to libm which exposes ceil & log2 APIs. Code was written based on public domain code: URL: http://graphics.stanford.edu/~seander/bithacks.html#IntegerLog. */ static uint32_t mlx5_get_block_order(uint32_t v) { static const uint32_t bits_arr[] = {0x2, 0xC, 0xF0, 0xFF00, 0xFFFF0000}; static const uint32_t shift_arr[] = {1, 2, 4, 8, 16}; int i; uint32_t input_val = v; register uint32_t r = 0;/* result of log2(v) will go here */ for (i = 4; i >= 0; i--) { if (v & bits_arr[i]) { v >>= shift_arr[i]; r |= shift_arr[i]; } } /* Rounding up if required */ r += !!(input_val & ((1 << r) - 1)); return r; } bool mlx5_is_custom_alloc(struct ibv_pd *pd) { struct mlx5_parent_domain *mparent_domain = to_mparent_domain(pd); return (mparent_domain && mparent_domain->alloc && mparent_domain->free); } bool mlx5_is_extern_alloc(struct mlx5_context *context) { return context->extern_alloc.alloc && context->extern_alloc.free; } void mlx5_get_alloc_type(struct mlx5_context *context, struct ibv_pd *pd, const char *component, enum mlx5_alloc_type *alloc_type, enum mlx5_alloc_type default_type) { char *env_value; char name[128]; if (mlx5_is_custom_alloc(pd)) { *alloc_type = MLX5_ALLOC_TYPE_CUSTOM; return; } if (mlx5_is_extern_alloc(context)) { *alloc_type = MLX5_ALLOC_TYPE_EXTERNAL; return; } snprintf(name, sizeof(name), "%s_ALLOC_TYPE", component); *alloc_type = default_type; env_value = getenv(name); if (env_value) { if (!strcasecmp(env_value, "ANON")) *alloc_type = MLX5_ALLOC_TYPE_ANON; else if (!strcasecmp(env_value, "HUGE")) *alloc_type = MLX5_ALLOC_TYPE_HUGE; else if (!strcasecmp(env_value, "CONTIG")) *alloc_type = MLX5_ALLOC_TYPE_CONTIG; else if (!strcasecmp(env_value, "PREFER_CONTIG")) *alloc_type = MLX5_ALLOC_TYPE_PREFER_CONTIG; else if (!strcasecmp(env_value, "PREFER_HUGE")) *alloc_type = MLX5_ALLOC_TYPE_PREFER_HUGE; else if (!strcasecmp(env_value, "ALL")) *alloc_type = MLX5_ALLOC_TYPE_ALL; } } static void mlx5_alloc_get_env_info(struct mlx5_context *mctx, int *max_block_log, int *min_block_log, const char *component) { char *env; int value; char name[128]; /* First set defaults */ *max_block_log = MLX5_MAX_LOG2_CONTIG_BLOCK_SIZE; *min_block_log = MLX5_MIN_LOG2_CONTIG_BLOCK_SIZE; snprintf(name, sizeof(name), "%s_MAX_LOG2_CONTIG_BSIZE", component); env = getenv(name); if (env) { value = atoi(env); if (value <= MLX5_MAX_LOG2_CONTIG_BLOCK_SIZE && value >= MLX5_MIN_LOG2_CONTIG_BLOCK_SIZE) *max_block_log = value; else mlx5_err(mctx->dbg_fp, "Invalid value %d for %s\n", value, name); } sprintf(name, "%s_MIN_LOG2_CONTIG_BSIZE", component); env = getenv(name); if (env) { value = atoi(env); if (value >= MLX5_MIN_LOG2_CONTIG_BLOCK_SIZE && value <= *max_block_log) *min_block_log = value; else mlx5_err(mctx->dbg_fp, "Invalid value %d for %s\n", value, name); } } int mlx5_alloc_buf_contig(struct mlx5_context *mctx, struct mlx5_buf *buf, size_t size, int page_size, const char *component) { void *addr = MAP_FAILED; int block_size_exp; int max_block_log; int min_block_log; struct ibv_context *context = &mctx->ibv_ctx.context; off_t offset; mlx5_alloc_get_env_info(mctx, &max_block_log, &min_block_log, component); block_size_exp = mlx5_get_block_order(size); if (block_size_exp > max_block_log) block_size_exp = max_block_log; do { offset = 0; set_command(MLX5_IB_MMAP_GET_CONTIGUOUS_PAGES, &offset); set_order(block_size_exp, &offset); addr = mmap(NULL , size, PROT_WRITE | PROT_READ, MAP_SHARED, context->cmd_fd, page_size * offset); if (addr != MAP_FAILED) break; /* * The kernel returns EINVAL 
if not supported */ if (errno == EINVAL) return -1; block_size_exp -= 1; } while (block_size_exp >= min_block_log); mlx5_dbg(mctx->dbg_fp, MLX5_DBG_CONTIG, "block order %d, addr %p\n", block_size_exp, addr); if (addr == MAP_FAILED) return -1; if (ibv_dontfork_range(addr, size)) { munmap(addr, size); return -1; } buf->buf = addr; buf->length = size; buf->type = MLX5_ALLOC_TYPE_CONTIG; return 0; } void mlx5_free_buf_contig(struct mlx5_context *mctx, struct mlx5_buf *buf) { ibv_dofork_range(buf->buf, buf->length); munmap(buf->buf, buf->length); } int mlx5_alloc_buf(struct mlx5_buf *buf, size_t size, int page_size) { int ret; int al_size; al_size = align(size, page_size); ret = posix_memalign(&buf->buf, page_size, al_size); if (ret) return ret; ret = ibv_dontfork_range(buf->buf, al_size); if (ret) free(buf->buf); if (!ret) { buf->length = al_size; buf->type = MLX5_ALLOC_TYPE_ANON; } return ret; } void mlx5_free_buf(struct mlx5_buf *buf) { ibv_dofork_range(buf->buf, buf->length); free(buf->buf); } rdma-core-56.1/providers/mlx5/cq.c000066400000000000000000001545161477342711600170170ustar00rootroot00000000000000/* * Copyright (c) 2012 Mellanox Technologies, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include #include #include "mlx5.h" #include "wqe.h" enum { CQ_OK = 0, CQ_EMPTY = -1, CQ_POLL_ERR = -2, CQ_POLL_NODATA = ENOENT }; enum { MLX5_CQ_MODIFY_RESEIZE = 0, MLX5_CQ_MODIFY_MODER = 1, MLX5_CQ_MODIFY_MAPPING = 2, }; struct mlx5_sigerr_cqe { uint8_t rsvd0[16]; __be32 expected_trans_sig; __be32 actual_trans_sig; __be32 expected_ref_tag; __be32 actual_ref_tag; __be16 syndrome; uint8_t sig_type; uint8_t domain; __be32 mkey; __be64 sig_err_offset; uint8_t rsvd30[14]; uint8_t signature; uint8_t op_own; }; enum { MLX5_CQE_APP_TAG_MATCHING = 1, }; enum { MLX5_CQE_APP_OP_TM_CONSUMED = 0x1, MLX5_CQE_APP_OP_TM_EXPECTED = 0x2, MLX5_CQE_APP_OP_TM_UNEXPECTED = 0x3, MLX5_CQE_APP_OP_TM_NO_TAG = 0x4, MLX5_CQE_APP_OP_TM_APPEND = 0x5, MLX5_CQE_APP_OP_TM_REMOVE = 0x6, MLX5_CQE_APP_OP_TM_NOOP = 0x7, MLX5_CQE_APP_OP_TM_CONSUMED_SW_RDNV = 0x9, MLX5_CQE_APP_OP_TM_CONSUMED_MSG = 0xA, MLX5_CQE_APP_OP_TM_CONSUMED_MSG_SW_RDNV = 0xB, MLX5_CQE_APP_OP_TM_MSG_COMPLETION_CANCELED = 0xC, }; /* When larger messages or rendezvous transfers are involved, matching and * data transfer completion are distinct events that generate 2 completion * events for the same recv_wr_id. */ static inline bool mlx5_cqe_app_op_tm_is_complete(int op) { return op != MLX5_CQE_APP_OP_TM_CONSUMED && op != MLX5_CQE_APP_OP_TM_CONSUMED_SW_RDNV; } enum { MLX5_CQ_LAZY_FLAGS = MLX5_CQ_FLAGS_RX_CSUM_VALID | MLX5_CQ_FLAGS_TM_SYNC_REQ | MLX5_CQ_FLAGS_RAW_WQE }; int mlx5_stall_num_loop = 60; int mlx5_stall_cq_poll_min = 60; int mlx5_stall_cq_poll_max = 100000; int mlx5_stall_cq_inc_step = 100; int mlx5_stall_cq_dec_step = 10; enum { MLX5_TM_MAX_SYNC_DIFF = 0x3fff }; static inline uint8_t get_cqe_l3_hdr_type(struct mlx5_cqe64 *cqe) { return (cqe->l4_hdr_type_etc >> 2) & 0x3; } static void *get_buf_cqe(struct mlx5_buf *buf, int n, int cqe_sz) { return buf->buf + n * cqe_sz; } static void *get_cqe(struct mlx5_cq *cq, int n) { return cq->active_buf->buf + n * cq->cqe_sz; } static void *get_sw_cqe(struct mlx5_cq *cq, int n) { void *cqe = get_cqe(cq, n & cq->verbs_cq.cq.cqe); struct mlx5_cqe64 *cqe64; cqe64 = (cq->cqe_sz == 64) ? 
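/*
 * Editor's note: illustrative sketch, not part of the original source.
 * mlx5 CQEs can use a 64- or 128-byte stride; with the 128-byte stride the
 * meaningful struct mlx5_cqe64 occupies the *last* 64 bytes of the slot,
 * which is why the helpers below compute
 *
 *	cqe64 = (cq->cqe_sz == 64) ? cqe : cqe + 64;
 *
 * The software-ownership test then compares the CQE's owner bit against
 * the parity of the consumer index wrapping the ring, so consumed entries
 * never need to be zeroed before reuse.
 */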
cqe : cqe + 64; if (likely(mlx5dv_get_cqe_opcode(cqe64) != MLX5_CQE_INVALID) && !((cqe64->op_own & MLX5_CQE_OWNER_MASK) ^ !!(n & (cq->verbs_cq.cq.cqe + 1)))) { return cqe; } else { return NULL; } } static void *next_cqe_sw(struct mlx5_cq *cq) { return get_sw_cqe(cq, cq->cons_index); } static void update_cons_index(struct mlx5_cq *cq) { cq->dbrec[MLX5_CQ_SET_CI] = htobe32(cq->cons_index & 0xffffff); } static inline void handle_good_req(struct ibv_wc *wc, struct mlx5_cqe64 *cqe, struct mlx5_wq *wq, int idx) { switch (be32toh(cqe->sop_drop_qpn) >> 24) { case MLX5_OPCODE_RDMA_WRITE_IMM: wc->wc_flags |= IBV_WC_WITH_IMM; SWITCH_FALLTHROUGH; case MLX5_OPCODE_RDMA_WRITE: wc->opcode = IBV_WC_RDMA_WRITE; break; case MLX5_OPCODE_SEND_IMM: wc->wc_flags |= IBV_WC_WITH_IMM; SWITCH_FALLTHROUGH; case MLX5_OPCODE_SEND: case MLX5_OPCODE_SEND_INVAL: wc->opcode = IBV_WC_SEND; break; case MLX5_OPCODE_RDMA_READ: wc->opcode = IBV_WC_RDMA_READ; wc->byte_len = be32toh(cqe->byte_cnt); break; case MLX5_OPCODE_ATOMIC_CS: wc->opcode = IBV_WC_COMP_SWAP; wc->byte_len = 8; break; case MLX5_OPCODE_ATOMIC_FA: wc->opcode = IBV_WC_FETCH_ADD; wc->byte_len = 8; break; case MLX5_OPCODE_UMR: case MLX5_OPCODE_SET_PSV: case MLX5_OPCODE_NOP: case MLX5_OPCODE_MMO: wc->opcode = wq->wr_data[idx]; break; case MLX5_OPCODE_TSO: wc->opcode = IBV_WC_TSO; break; } if (unlikely(wq->wr_data[idx] == IBV_WC_DRIVER2)) /* raw WQE */ wc->opcode = IBV_WC_DRIVER2; } static inline int handle_responder_lazy(struct mlx5_cq *cq, struct mlx5_cqe64 *cqe, struct mlx5_resource *cur_rsc, struct mlx5_srq *srq) { uint16_t wqe_ctr; struct mlx5_wq *wq; struct mlx5_qp *qp = rsc_to_mqp(cur_rsc); int err = IBV_WC_SUCCESS; if (srq) { wqe_ctr = be16toh(cqe->wqe_counter); cq->verbs_cq.cq_ex.wr_id = srq->wrid[wqe_ctr]; mlx5_free_srq_wqe(srq, wqe_ctr); if (cqe->op_own & MLX5_INLINE_SCATTER_32) err = mlx5_copy_to_recv_srq(srq, wqe_ctr, cqe, be32toh(cqe->byte_cnt)); else if (cqe->op_own & MLX5_INLINE_SCATTER_64) err = mlx5_copy_to_recv_srq(srq, wqe_ctr, cqe - 1, be32toh(cqe->byte_cnt)); } else { if (likely(cur_rsc->type == MLX5_RSC_TYPE_QP)) { wq = &qp->rq; if (qp->qp_cap_cache & MLX5_RX_CSUM_VALID) cq->flags |= MLX5_CQ_FLAGS_RX_CSUM_VALID; } else { wq = &(rsc_to_mrwq(cur_rsc)->rq); } wqe_ctr = be16toh(cqe->wqe_counter) & (wq->wqe_cnt - 1); cq->verbs_cq.cq_ex.wr_id = wq->wrid[wqe_ctr]; ++wq->tail; if (cqe->op_own & MLX5_INLINE_SCATTER_32) err = mlx5_copy_to_recv_wqe(qp, wqe_ctr, cqe, be32toh(cqe->byte_cnt)); else if (cqe->op_own & MLX5_INLINE_SCATTER_64) err = mlx5_copy_to_recv_wqe(qp, wqe_ctr, cqe - 1, be32toh(cqe->byte_cnt)); } return err; } /* Returns IBV_WC_IP_CSUM_OK or 0 */ static inline int get_csum_ok(struct mlx5_cqe64 *cqe) { return (((cqe->hds_ip_ext & (MLX5_CQE_L4_OK | MLX5_CQE_L3_OK)) == (MLX5_CQE_L4_OK | MLX5_CQE_L3_OK)) & (get_cqe_l3_hdr_type(cqe) == MLX5_CQE_L3_HDR_TYPE_IPV4)) << IBV_WC_IP_CSUM_OK_SHIFT; } static inline int handle_responder(struct ibv_wc *wc, struct mlx5_cqe64 *cqe, struct mlx5_resource *cur_rsc, struct mlx5_srq *srq) { uint16_t wqe_ctr; struct mlx5_wq *wq; struct mlx5_qp *qp = rsc_to_mqp(cur_rsc); uint8_t g; int err = 0; wc->byte_len = be32toh(cqe->byte_cnt); if (srq) { wqe_ctr = be16toh(cqe->wqe_counter); wc->wr_id = srq->wrid[wqe_ctr]; mlx5_free_srq_wqe(srq, wqe_ctr); if (cqe->op_own & MLX5_INLINE_SCATTER_32) err = mlx5_copy_to_recv_srq(srq, wqe_ctr, cqe, wc->byte_len); else if (cqe->op_own & MLX5_INLINE_SCATTER_64) err = mlx5_copy_to_recv_srq(srq, wqe_ctr, cqe - 1, wc->byte_len); } else { if (likely(cur_rsc->type == MLX5_RSC_TYPE_QP)) { 
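/*
 * Editor's note: illustrative sketch, not part of the original source.
 * These requester/responder parsers sit underneath the generic polling
 * entry point; an application only ever sees the struct ibv_wc they fill
 * in:
 *
 *	struct ibv_wc wc[16];
 *	int n = ibv_poll_cq(cq, 16, wc);
 *	for (int i = 0; i < n; i++)
 *		if (wc[i].status != IBV_WC_SUCCESS)
 *			fprintf(stderr, "wr %llu failed: %s\n",
 *				(unsigned long long)wc[i].wr_id,
 *				ibv_wc_status_str(wc[i].status));
 */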
wq = &qp->rq; if (qp->qp_cap_cache & MLX5_RX_CSUM_VALID) wc->wc_flags |= get_csum_ok(cqe); } else { wq = &(rsc_to_mrwq(cur_rsc)->rq); } wqe_ctr = be16toh(cqe->wqe_counter) & (wq->wqe_cnt - 1); wc->wr_id = wq->wrid[wqe_ctr]; ++wq->tail; if (cqe->op_own & MLX5_INLINE_SCATTER_32) err = mlx5_copy_to_recv_wqe(qp, wqe_ctr, cqe, wc->byte_len); else if (cqe->op_own & MLX5_INLINE_SCATTER_64) err = mlx5_copy_to_recv_wqe(qp, wqe_ctr, cqe - 1, wc->byte_len); } if (err) return err; switch (cqe->op_own >> 4) { case MLX5_CQE_RESP_WR_IMM: wc->opcode = IBV_WC_RECV_RDMA_WITH_IMM; wc->wc_flags |= IBV_WC_WITH_IMM; wc->imm_data = cqe->imm_inval_pkey; break; case MLX5_CQE_RESP_SEND: wc->opcode = IBV_WC_RECV; break; case MLX5_CQE_RESP_SEND_IMM: wc->opcode = IBV_WC_RECV; wc->wc_flags |= IBV_WC_WITH_IMM; wc->imm_data = cqe->imm_inval_pkey; break; case MLX5_CQE_RESP_SEND_INV: wc->opcode = IBV_WC_RECV; wc->wc_flags |= IBV_WC_WITH_INV; wc->invalidated_rkey = be32toh(cqe->imm_inval_pkey); break; } wc->slid = be16toh(cqe->slid); wc->sl = (be32toh(cqe->flags_rqpn) >> 24) & 0xf; wc->src_qp = be32toh(cqe->flags_rqpn) & 0xffffff; wc->dlid_path_bits = cqe->ml_path & 0x7f; g = (be32toh(cqe->flags_rqpn) >> 28) & 3; wc->wc_flags |= g ? IBV_WC_GRH : 0; wc->pkey_index = be32toh(cqe->imm_inval_pkey) & 0xffff; return IBV_WC_SUCCESS; } static void dump_cqe(struct mlx5_context *mctx, void *buf) { __be32 *p = buf; int i; for (i = 0; i < 16; i += 4) mlx5_err(mctx->dbg_fp, "%08x %08x %08x %08x\n", be32toh(p[i]), be32toh(p[i + 1]), be32toh(p[i + 2]), be32toh(p[i + 3])); } static enum ibv_wc_status mlx5_handle_error_cqe(struct mlx5_err_cqe *cqe) { switch (cqe->syndrome) { case MLX5_CQE_SYNDROME_LOCAL_LENGTH_ERR: return IBV_WC_LOC_LEN_ERR; case MLX5_CQE_SYNDROME_LOCAL_QP_OP_ERR: return IBV_WC_LOC_QP_OP_ERR; case MLX5_CQE_SYNDROME_LOCAL_PROT_ERR: return IBV_WC_LOC_PROT_ERR; case MLX5_CQE_SYNDROME_WR_FLUSH_ERR: return IBV_WC_WR_FLUSH_ERR; case MLX5_CQE_SYNDROME_MW_BIND_ERR: return IBV_WC_MW_BIND_ERR; case MLX5_CQE_SYNDROME_BAD_RESP_ERR: return IBV_WC_BAD_RESP_ERR; case MLX5_CQE_SYNDROME_LOCAL_ACCESS_ERR: return IBV_WC_LOC_ACCESS_ERR; case MLX5_CQE_SYNDROME_REMOTE_INVAL_REQ_ERR: return IBV_WC_REM_INV_REQ_ERR; case MLX5_CQE_SYNDROME_REMOTE_ACCESS_ERR: return IBV_WC_REM_ACCESS_ERR; case MLX5_CQE_SYNDROME_REMOTE_OP_ERR: return IBV_WC_REM_OP_ERR; case MLX5_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR: return IBV_WC_RETRY_EXC_ERR; case MLX5_CQE_SYNDROME_RNR_RETRY_EXC_ERR: return IBV_WC_RNR_RETRY_EXC_ERR; case MLX5_CQE_SYNDROME_REMOTE_ABORTED_ERR: return IBV_WC_REM_ABORT_ERR; default: return IBV_WC_GENERAL_ERR; } } #if defined(__x86_64__) || defined (__i386__) static inline unsigned long get_cycles(void) { uint32_t low, high; uint64_t val; asm volatile ("rdtsc" : "=a" (low), "=d" (high)); val = high; val = (val << 32) | low; return val; } static void mlx5_stall_poll_cq(void) { int i; for (i = 0; i < mlx5_stall_num_loop; i++) (void)get_cycles(); } static void mlx5_stall_cycles_poll_cq(uint64_t cycles) { while (get_cycles() < cycles) ; /* Nothing */ } static void mlx5_get_cycles(uint64_t *cycles) { *cycles = get_cycles(); } #else static void mlx5_stall_poll_cq(void) { } static void mlx5_stall_cycles_poll_cq(uint64_t cycles) { } static void mlx5_get_cycles(uint64_t *cycles) { } #endif static inline struct mlx5_qp *get_req_context(struct mlx5_context *mctx, struct mlx5_resource **cur_rsc, uint32_t rsn, int cqe_ver) ALWAYS_INLINE; static inline struct mlx5_qp *get_req_context(struct mlx5_context *mctx, struct mlx5_resource **cur_rsc, uint32_t rsn, int cqe_ver) 
{ if (!*cur_rsc || (rsn != (*cur_rsc)->rsn)) *cur_rsc = cqe_ver ? mlx5_find_uidx(mctx, rsn) : (struct mlx5_resource *)mlx5_find_qp(mctx, rsn); return rsc_to_mqp(*cur_rsc); } static inline int get_resp_ctx_v1(struct mlx5_context *mctx, struct mlx5_resource **cur_rsc, struct mlx5_srq **cur_srq, uint32_t uidx, uint8_t *is_srq) ALWAYS_INLINE; static inline int get_resp_ctx_v1(struct mlx5_context *mctx, struct mlx5_resource **cur_rsc, struct mlx5_srq **cur_srq, uint32_t uidx, uint8_t *is_srq) { struct mlx5_qp *mqp; if (!*cur_rsc || (uidx != (*cur_rsc)->rsn)) { *cur_rsc = mlx5_find_uidx(mctx, uidx); if (unlikely(!*cur_rsc)) return CQ_POLL_ERR; } switch ((*cur_rsc)->type) { case MLX5_RSC_TYPE_QP: mqp = rsc_to_mqp(*cur_rsc); if (mqp->verbs_qp.qp.srq) { *cur_srq = to_msrq(mqp->verbs_qp.qp.srq); *is_srq = 1; } break; case MLX5_RSC_TYPE_XSRQ: *cur_srq = rsc_to_msrq(*cur_rsc); *is_srq = 1; break; case MLX5_RSC_TYPE_RWQ: break; default: return CQ_POLL_ERR; } return CQ_OK; } static inline int get_qp_ctx(struct mlx5_context *mctx, struct mlx5_resource **cur_rsc, uint32_t qpn) ALWAYS_INLINE; static inline int get_qp_ctx(struct mlx5_context *mctx, struct mlx5_resource **cur_rsc, uint32_t qpn) { if (!*cur_rsc || (qpn != (*cur_rsc)->rsn)) { /* * We do not have to take the QP table lock here, * because CQs will be locked while QPs are removed * from the table. */ *cur_rsc = (struct mlx5_resource *)mlx5_find_qp(mctx, qpn); if (unlikely(!*cur_rsc)) return CQ_POLL_ERR; } return CQ_OK; } static inline int get_srq_ctx(struct mlx5_context *mctx, struct mlx5_srq **cur_srq, uint32_t srqn_uidx) ALWAYS_INLINE; static inline int get_srq_ctx(struct mlx5_context *mctx, struct mlx5_srq **cur_srq, uint32_t srqn) { if (!*cur_srq || (srqn != (*cur_srq)->srqn)) { *cur_srq = mlx5_find_srq(mctx, srqn); if (unlikely(!*cur_srq)) return CQ_POLL_ERR; } return CQ_OK; } static inline int get_cur_rsc(struct mlx5_context *mctx, int cqe_ver, uint32_t qpn, uint32_t srqn_uidx, struct mlx5_resource **cur_rsc, struct mlx5_srq **cur_srq, uint8_t *is_srq) { int err; if (cqe_ver) { err = get_resp_ctx_v1(mctx, cur_rsc, cur_srq, srqn_uidx, is_srq); } else { if (srqn_uidx) { *is_srq = 1; err = get_srq_ctx(mctx, cur_srq, srqn_uidx); } else { err = get_qp_ctx(mctx, cur_rsc, qpn); } } return err; } static inline int mlx5_get_next_cqe(struct mlx5_cq *cq, struct mlx5_cqe64 **pcqe64, void **pcqe) ALWAYS_INLINE; static inline int mlx5_get_next_cqe(struct mlx5_cq *cq, struct mlx5_cqe64 **pcqe64, void **pcqe) { void *cqe; struct mlx5_cqe64 *cqe64; cqe = next_cqe_sw(cq); if (!cqe) return CQ_EMPTY; cqe64 = (cq->cqe_sz == 64) ? cqe : cqe + 64; ++cq->cons_index; VALGRIND_MAKE_MEM_DEFINED(cqe64, sizeof *cqe64); /* * Make sure we read CQ entry contents after we've checked the * ownership bit. 
	 */
	udma_from_device_barrier();

#ifdef MLX5_DEBUG
	{
		struct mlx5_context *mctx =
			to_mctx(cq->verbs_cq.cq_ex.context);

		if (mlx5_debug_mask & MLX5_DBG_CQ_CQE) {
			mlx5_dbg(mctx->dbg_fp, MLX5_DBG_CQ_CQE,
				 "dump cqe for cqn 0x%x:\n", cq->cqn);
			dump_cqe(mctx, cqe64);
		}
	}
#endif
	*pcqe64 = cqe64;
	*pcqe = cqe;

	return CQ_OK;
}

static int handle_tag_matching(struct mlx5_cq *cq,
			       struct mlx5_cqe64 *cqe64,
			       struct mlx5_srq *srq)
{
	FILE *fp = to_mctx(srq->vsrq.srq.context)->dbg_fp;
	struct mlx5_tag_entry *tag;
	struct mlx5_srq_op *op;
	uint16_t wqe_ctr;

	cq->verbs_cq.cq_ex.status = IBV_WC_SUCCESS;
	switch (cqe64->app_op) {
	case MLX5_CQE_APP_OP_TM_CONSUMED_MSG_SW_RDNV:
	case MLX5_CQE_APP_OP_TM_CONSUMED_SW_RDNV:
	case MLX5_CQE_APP_OP_TM_MSG_COMPLETION_CANCELED:
		cq->verbs_cq.cq_ex.status = IBV_WC_TM_RNDV_INCOMPLETE;
		SWITCH_FALLTHROUGH;

	case MLX5_CQE_APP_OP_TM_CONSUMED_MSG:
	case MLX5_CQE_APP_OP_TM_CONSUMED:
	case MLX5_CQE_APP_OP_TM_EXPECTED:
		mlx5_spin_lock(&srq->lock);

		tag = &srq->tm_list[be16toh(cqe64->app_info)];
		if (!tag->expect_cqe) {
			mlx5_dbg(fp, MLX5_DBG_CQ,
				 "got idx %d which wasn't added\n",
				 be16toh(cqe64->app_info));
			cq->verbs_cq.cq_ex.status = IBV_WC_GENERAL_ERR;
			mlx5_spin_unlock(&srq->lock);
			return CQ_OK;
		}

		cq->verbs_cq.cq_ex.wr_id = tag->wr_id;

		if (mlx5_cqe_app_op_tm_is_complete(cqe64->app_op))
			mlx5_tm_release_tag(srq, tag);

		/* inline scatter 32 not supported for TM */
		if (cqe64->op_own & MLX5_INLINE_SCATTER_64) {
			if (be32toh(cqe64->byte_cnt) > tag->size)
				cq->verbs_cq.cq_ex.status =
					IBV_WC_LOC_LEN_ERR;
			else
				memcpy(tag->ptr, cqe64 - 1,
				       be32toh(cqe64->byte_cnt));
		}

		mlx5_spin_unlock(&srq->lock);
		break;

	case MLX5_CQE_APP_OP_TM_REMOVE:
		if (!(be32toh(cqe64->tm_cqe.success) & MLX5_TMC_SUCCESS))
			cq->verbs_cq.cq_ex.status = IBV_WC_TM_ERR;
		SWITCH_FALLTHROUGH;

	case MLX5_CQE_APP_OP_TM_APPEND:
	case MLX5_CQE_APP_OP_TM_NOOP:
		mlx5_spin_lock(&srq->lock);
#ifdef MLX5_DEBUG
		if (srq->op_tail == srq->op_head) {
			mlx5_dbg(fp, MLX5_DBG_CQ,
				 "got unexpected list op CQE\n");
			cq->verbs_cq.cq_ex.status = IBV_WC_GENERAL_ERR;
			mlx5_spin_unlock(&srq->lock);
			return CQ_OK;
		}
#endif
		op = srq->op + (srq->op_head++ &
				(to_mqp(srq->cmd_qp)->sq.wqe_cnt - 1));
		if (op->tag) { /* APPEND or REMOVE */
			mlx5_tm_release_tag(srq, op->tag);
			if (cqe64->app_op == MLX5_CQE_APP_OP_TM_REMOVE &&
			    cq->verbs_cq.cq_ex.status == IBV_WC_SUCCESS)
				/*
				 * If tag entry was successfully removed we
				 * don't expect consumption completion for it
				 * anymore. Remove reports failure if tag was
				 * consumed meanwhile.
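				 * Tag entries are reference counted; the
				 * second release below balances the count
				 * held for the consumption completion that
				 * will no longer be delivered.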
*/ mlx5_tm_release_tag(srq, op->tag); if (be16toh(cqe64->tm_cqe.hw_phase_cnt) != op->tag->phase_cnt) cq->flags |= MLX5_CQ_FLAGS_TM_SYNC_REQ; } to_mqp(srq->cmd_qp)->sq.tail = op->wqe_head + 1; cq->verbs_cq.cq_ex.wr_id = op->wr_id; mlx5_spin_unlock(&srq->lock); break; case MLX5_CQE_APP_OP_TM_UNEXPECTED: srq->unexp_in++; if (srq->unexp_in - srq->unexp_out > MLX5_TM_MAX_SYNC_DIFF) cq->flags |= MLX5_CQ_FLAGS_TM_SYNC_REQ; SWITCH_FALLTHROUGH; case MLX5_CQE_APP_OP_TM_NO_TAG: wqe_ctr = be16toh(cqe64->wqe_counter); cq->verbs_cq.cq_ex.wr_id = srq->wrid[wqe_ctr]; mlx5_free_srq_wqe(srq, wqe_ctr); if (cqe64->op_own & MLX5_INLINE_SCATTER_32) return mlx5_copy_to_recv_srq(srq, wqe_ctr, cqe64, be32toh(cqe64->byte_cnt)); else if (cqe64->op_own & MLX5_INLINE_SCATTER_64) return mlx5_copy_to_recv_srq(srq, wqe_ctr, cqe64 - 1, be32toh(cqe64->byte_cnt)); break; #ifdef MLX5_DEBUG default: mlx5_dbg(fp, MLX5_DBG_CQ, "un-expected TM opcode in cqe\n"); #endif } return CQ_OK; } static inline void get_sig_err_info(struct mlx5_sigerr_cqe *cqe, struct mlx5_sig_err *err) { err->syndrome = be16toh(cqe->syndrome); err->expected = (uint64_t)be32toh(cqe->expected_trans_sig) << 32 | be32toh(cqe->expected_ref_tag); err->actual = (uint64_t)be32toh(cqe->actual_trans_sig) << 32 | be32toh(cqe->actual_ref_tag); err->offset = be64toh(cqe->sig_err_offset); err->sig_type = cqe->sig_type & 0x7; err->domain = cqe->domain & 0x7; } static inline int is_odp_pfault_err(struct mlx5_err_cqe *ecqe) { return ecqe->syndrome == MLX5_CQE_SYNDROME_REMOTE_ABORTED_ERR && ecqe->vendor_err_synd == MLX5_CQE_VENDOR_SYNDROME_ODP_PFAULT; } static inline int mlx5_parse_cqe(struct mlx5_cq *cq, struct mlx5_cqe64 *cqe64, void *cqe, struct mlx5_resource **cur_rsc, struct mlx5_srq **cur_srq, struct ibv_wc *wc, int cqe_ver, int lazy) ALWAYS_INLINE; static inline int mlx5_parse_cqe(struct mlx5_cq *cq, struct mlx5_cqe64 *cqe64, void *cqe, struct mlx5_resource **cur_rsc, struct mlx5_srq **cur_srq, struct ibv_wc *wc, int cqe_ver, int lazy) { struct mlx5_wq *wq; uint16_t wqe_ctr; uint32_t qpn; uint32_t srqn_uidx; int idx; uint8_t opcode; struct mlx5_err_cqe *ecqe; struct mlx5_sigerr_cqe *sigerr_cqe; struct mlx5_mkey *mkey; int err; struct mlx5_qp *mqp; struct mlx5_context *mctx; uint8_t is_srq; again: is_srq = 0; err = 0; mctx = to_mctx(cq->verbs_cq.cq.context); qpn = be32toh(cqe64->sop_drop_qpn) & 0xffffff; if (lazy) { cq->cqe64 = cqe64; cq->flags &= (~MLX5_CQ_LAZY_FLAGS); } else { wc->wc_flags = 0; wc->qp_num = qpn; } opcode = mlx5dv_get_cqe_opcode(cqe64); wqe_ctr = be16toh(cqe64->wqe_counter); switch (opcode) { case MLX5_CQE_REQ: { mqp = get_req_context(mctx, cur_rsc, (cqe_ver ? 
(be32toh(cqe64->srqn_uidx) & 0xffffff) : qpn), cqe_ver); if (unlikely(!mqp)) return CQ_POLL_ERR; wq = &mqp->sq; idx = wqe_ctr & (wq->wqe_cnt - 1); if (lazy) { uint32_t wc_byte_len; switch (be32toh(cqe64->sop_drop_qpn) >> 24) { case MLX5_OPCODE_UMR: case MLX5_OPCODE_SET_PSV: case MLX5_OPCODE_NOP: case MLX5_OPCODE_MMO: cq->cached_opcode = wq->wr_data[idx]; break; case MLX5_OPCODE_RDMA_READ: wc_byte_len = be32toh(cqe64->byte_cnt); goto scatter_out; case MLX5_OPCODE_ATOMIC_CS: case MLX5_OPCODE_ATOMIC_FA: wc_byte_len = 8; scatter_out: if (cqe64->op_own & MLX5_INLINE_SCATTER_32) err = mlx5_copy_to_send_wqe( mqp, wqe_ctr, cqe, wc_byte_len); else if (cqe64->op_own & MLX5_INLINE_SCATTER_64) err = mlx5_copy_to_send_wqe( mqp, wqe_ctr, cqe - 1, wc_byte_len); break; } cq->verbs_cq.cq_ex.wr_id = wq->wrid[idx]; cq->verbs_cq.cq_ex.status = err; if (unlikely(wq->wr_data[idx] == IBV_WC_DRIVER2)) cq->flags |= MLX5_CQ_FLAGS_RAW_WQE; } else { handle_good_req(wc, cqe64, wq, idx); if (cqe64->op_own & MLX5_INLINE_SCATTER_32) err = mlx5_copy_to_send_wqe(mqp, wqe_ctr, cqe, wc->byte_len); else if (cqe64->op_own & MLX5_INLINE_SCATTER_64) err = mlx5_copy_to_send_wqe( mqp, wqe_ctr, cqe - 1, wc->byte_len); wc->wr_id = wq->wrid[idx]; wc->status = err; } wq->tail = wq->wqe_head[idx] + 1; break; } case MLX5_CQE_RESP_WR_IMM: case MLX5_CQE_RESP_SEND: case MLX5_CQE_RESP_SEND_IMM: case MLX5_CQE_RESP_SEND_INV: srqn_uidx = be32toh(cqe64->srqn_uidx) & 0xffffff; err = get_cur_rsc(mctx, cqe_ver, qpn, srqn_uidx, cur_rsc, cur_srq, &is_srq); if (unlikely(err)) return CQ_POLL_ERR; if (lazy) { if (likely(cqe64->app != MLX5_CQE_APP_TAG_MATCHING)) { cq->verbs_cq.cq_ex.status = handle_responder_lazy (cq, cqe64, *cur_rsc, is_srq ? *cur_srq : NULL); } else { if (unlikely(!is_srq)) return CQ_POLL_ERR; err = handle_tag_matching(cq, cqe64, *cur_srq); if (unlikely(err)) return CQ_POLL_ERR; } } else { wc->status = handle_responder(wc, cqe64, *cur_rsc, is_srq ? *cur_srq : NULL); } break; case MLX5_CQE_NO_PACKET: if (unlikely(cqe64->app != MLX5_CQE_APP_TAG_MATCHING)) return CQ_POLL_ERR; srqn_uidx = be32toh(cqe64->srqn_uidx) & 0xffffff; err = get_cur_rsc(mctx, cqe_ver, qpn, srqn_uidx, cur_rsc, cur_srq, &is_srq); if (unlikely(err || !is_srq)) return CQ_POLL_ERR; err = handle_tag_matching(cq, cqe64, *cur_srq); if (unlikely(err)) return CQ_POLL_ERR; break; case MLX5_CQE_SIG_ERR: sigerr_cqe = (struct mlx5_sigerr_cqe *)cqe64; pthread_mutex_lock(&mctx->mkey_table_mutex); mkey = mlx5_find_mkey(mctx, be32toh(sigerr_cqe->mkey) >> 8); if (!mkey) { pthread_mutex_unlock(&mctx->mkey_table_mutex); return CQ_POLL_ERR; } mkey->sig->err_exists = true; mkey->sig->err_count++; mkey->sig->err_count_updated = true; get_sig_err_info(sigerr_cqe, &mkey->sig->err_info); pthread_mutex_unlock(&mctx->mkey_table_mutex); err = mlx5_get_next_cqe(cq, &cqe64, &cqe); /* * CQ_POLL_NODATA indicates that CQ was not empty but the polled * CQE was handled internally and should not processed by the * caller. */ if (err == CQ_EMPTY) return CQ_POLL_NODATA; goto again; case MLX5_CQE_RESIZE_CQ: break; case MLX5_CQE_REQ_ERR: case MLX5_CQE_RESP_ERR: srqn_uidx = be32toh(cqe64->srqn_uidx) & 0xffffff; ecqe = (struct mlx5_err_cqe *)cqe64; { enum ibv_wc_status *pstatus = lazy ? 
&cq->verbs_cq.cq_ex.status : &wc->status; *pstatus = mlx5_handle_error_cqe(ecqe); } if (!lazy) wc->vendor_err = ecqe->vendor_err_synd; if (unlikely(ecqe->syndrome != MLX5_CQE_SYNDROME_WR_FLUSH_ERR && ecqe->syndrome != MLX5_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR && !is_odp_pfault_err(ecqe))) { mlx5_err(mctx->dbg_fp, PFX "%s: got completion with error:\n", mctx->hostname); dump_cqe(mctx, ecqe); if (mlx5_freeze_on_error_cqe) { mlx5_err(mctx->dbg_fp, PFX "freezing at poll cq..."); while (1) sleep(10); } } if (opcode == MLX5_CQE_REQ_ERR) { mqp = get_req_context(mctx, cur_rsc, (cqe_ver ? srqn_uidx : qpn), cqe_ver); if (unlikely(!mqp)) return CQ_POLL_ERR; wq = &mqp->sq; idx = wqe_ctr & (wq->wqe_cnt - 1); if (lazy) cq->verbs_cq.cq_ex.wr_id = wq->wrid[idx]; else wc->wr_id = wq->wrid[idx]; wq->tail = wq->wqe_head[idx] + 1; } else { err = get_cur_rsc(mctx, cqe_ver, qpn, srqn_uidx, cur_rsc, cur_srq, &is_srq); if (unlikely(err)) return CQ_POLL_ERR; if (is_srq) { if (is_odp_pfault_err(ecqe)) { mlx5_complete_odp_fault(*cur_srq, wqe_ctr); err = mlx5_get_next_cqe(cq, &cqe64, &cqe); /* CQ_POLL_NODATA indicates that CQ was not empty but the polled CQE * was handled internally and should not processed by the caller. */ if (err == CQ_EMPTY) return CQ_POLL_NODATA; goto again; } if (lazy) cq->verbs_cq.cq_ex.wr_id = (*cur_srq)->wrid[wqe_ctr]; else wc->wr_id = (*cur_srq)->wrid[wqe_ctr]; mlx5_free_srq_wqe(*cur_srq, wqe_ctr); } else { switch ((*cur_rsc)->type) { case MLX5_RSC_TYPE_RWQ: wq = &(rsc_to_mrwq(*cur_rsc)->rq); break; default: wq = &(rsc_to_mqp(*cur_rsc)->rq); break; } idx = wqe_ctr & (wq->wqe_cnt - 1); if (lazy) cq->verbs_cq.cq_ex.wr_id = wq->wrid[idx]; else wc->wr_id = wq->wrid[idx]; ++wq->tail; } } break; } return CQ_OK; } static inline int mlx5_parse_lazy_cqe(struct mlx5_cq *cq, struct mlx5_cqe64 *cqe64, void *cqe, int cqe_ver) ALWAYS_INLINE; static inline int mlx5_parse_lazy_cqe(struct mlx5_cq *cq, struct mlx5_cqe64 *cqe64, void *cqe, int cqe_ver) { return mlx5_parse_cqe(cq, cqe64, cqe, &cq->cur_rsc, &cq->cur_srq, NULL, cqe_ver, 1); } static inline int mlx5_poll_one(struct mlx5_cq *cq, struct mlx5_resource **cur_rsc, struct mlx5_srq **cur_srq, struct ibv_wc *wc, int cqe_ver) ALWAYS_INLINE; static inline int mlx5_poll_one(struct mlx5_cq *cq, struct mlx5_resource **cur_rsc, struct mlx5_srq **cur_srq, struct ibv_wc *wc, int cqe_ver) { struct mlx5_cqe64 *cqe64; void *cqe; int err; err = mlx5_get_next_cqe(cq, &cqe64, &cqe); if (err == CQ_EMPTY) return err; return mlx5_parse_cqe(cq, cqe64, cqe, cur_rsc, cur_srq, wc, cqe_ver, 0); } static inline int poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc, int cqe_ver) ALWAYS_INLINE; static inline int poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc, int cqe_ver) { struct mlx5_cq *cq = to_mcq(ibcq); struct mlx5_resource *rsc = NULL; struct mlx5_srq *srq = NULL; int npolled; int err = CQ_OK; if (cq->stall_enable) { if (cq->stall_adaptive_enable) { if (cq->stall_last_count) mlx5_stall_cycles_poll_cq(cq->stall_last_count + cq->stall_cycles); } else if (cq->stall_next_poll) { cq->stall_next_poll = 0; mlx5_stall_poll_cq(); } } mlx5_spin_lock(&cq->lock); for (npolled = 0; npolled < ne; ++npolled) { err = mlx5_poll_one(cq, &rsc, &srq, wc + npolled, cqe_ver); if (err != CQ_OK) break; } update_cons_index(cq); mlx5_spin_unlock(&cq->lock); if (cq->stall_enable) { if (cq->stall_adaptive_enable) { if (npolled == 0) { cq->stall_cycles = max(cq->stall_cycles-mlx5_stall_cq_dec_step, mlx5_stall_cq_poll_min); mlx5_get_cycles(&cq->stall_last_count); } else if (npolled < ne) { 
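				/*
				 * Partially filled batch: CQEs are arriving
				 * more slowly than we drain them, so widen
				 * the stall window before the next poll.
				 */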
cq->stall_cycles = min(cq->stall_cycles+mlx5_stall_cq_inc_step, mlx5_stall_cq_poll_max); mlx5_get_cycles(&cq->stall_last_count); } else { cq->stall_cycles = max(cq->stall_cycles-mlx5_stall_cq_dec_step, mlx5_stall_cq_poll_min); cq->stall_last_count = 0; } } else if (err == CQ_EMPTY) { cq->stall_next_poll = 1; } } return err == CQ_POLL_ERR ? err : npolled; } enum polling_mode { POLLING_MODE_NO_STALL, POLLING_MODE_STALL, POLLING_MODE_STALL_ADAPTIVE }; static inline void _mlx5_end_poll(struct ibv_cq_ex *ibcq, int lock, enum polling_mode stall) ALWAYS_INLINE; static inline void _mlx5_end_poll(struct ibv_cq_ex *ibcq, int lock, enum polling_mode stall) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); update_cons_index(cq); if (lock) mlx5_spin_unlock(&cq->lock); if (stall) { if (stall == POLLING_MODE_STALL_ADAPTIVE) { if (!(cq->flags & MLX5_CQ_FLAGS_FOUND_CQES)) { cq->stall_cycles = max(cq->stall_cycles - mlx5_stall_cq_dec_step, mlx5_stall_cq_poll_min); mlx5_get_cycles(&cq->stall_last_count); } else if (cq->flags & MLX5_CQ_FLAGS_EMPTY_DURING_POLL) { cq->stall_cycles = min(cq->stall_cycles + mlx5_stall_cq_inc_step, mlx5_stall_cq_poll_max); mlx5_get_cycles(&cq->stall_last_count); } else { cq->stall_cycles = max(cq->stall_cycles - mlx5_stall_cq_dec_step, mlx5_stall_cq_poll_min); cq->stall_last_count = 0; } } else if (!(cq->flags & MLX5_CQ_FLAGS_FOUND_CQES)) { cq->stall_next_poll = 1; } cq->flags &= ~(MLX5_CQ_FLAGS_FOUND_CQES | MLX5_CQ_FLAGS_EMPTY_DURING_POLL); } } static inline int mlx5_start_poll(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr, int lock, enum polling_mode stall, int cqe_version, int clock_update) ALWAYS_INLINE; static inline int mlx5_start_poll(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr, int lock, enum polling_mode stall, int cqe_version, int clock_update) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); struct mlx5_cqe64 *cqe64; void *cqe; int err; if (unlikely(attr->comp_mask)) return EINVAL; if (stall) { if (stall == POLLING_MODE_STALL_ADAPTIVE) { if (cq->stall_last_count) mlx5_stall_cycles_poll_cq(cq->stall_last_count + cq->stall_cycles); } else if (cq->stall_next_poll) { cq->stall_next_poll = 0; mlx5_stall_poll_cq(); } } if (lock) mlx5_spin_lock(&cq->lock); cq->cur_rsc = NULL; cq->cur_srq = NULL; err = mlx5_get_next_cqe(cq, &cqe64, &cqe); if (err == CQ_EMPTY) { if (lock) mlx5_spin_unlock(&cq->lock); if (stall) { if (stall == POLLING_MODE_STALL_ADAPTIVE) { cq->stall_cycles = max(cq->stall_cycles - mlx5_stall_cq_dec_step, mlx5_stall_cq_poll_min); mlx5_get_cycles(&cq->stall_last_count); } else { cq->stall_next_poll = 1; } } return ENOENT; } if (stall) cq->flags |= MLX5_CQ_FLAGS_FOUND_CQES; err = mlx5_parse_lazy_cqe(cq, cqe64, cqe, cqe_version); if (lock && err) mlx5_spin_unlock(&cq->lock); if (stall && err == CQ_POLL_ERR) { if (stall == POLLING_MODE_STALL_ADAPTIVE) { cq->stall_cycles = max(cq->stall_cycles - mlx5_stall_cq_dec_step, mlx5_stall_cq_poll_min); cq->stall_last_count = 0; } cq->flags &= ~(MLX5_CQ_FLAGS_FOUND_CQES); goto out; } if (clock_update && !err) { err = mlx5dv_get_clock_info(ibcq->context, &cq->last_clock_info); if (lock && err) mlx5_spin_unlock(&cq->lock); } out: return err; } static inline int mlx5_next_poll(struct ibv_cq_ex *ibcq, enum polling_mode stall, int cqe_version) ALWAYS_INLINE; static inline int mlx5_next_poll(struct ibv_cq_ex *ibcq, enum polling_mode stall, int cqe_version) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); struct mlx5_cqe64 *cqe64; void *cqe; int err; err = mlx5_get_next_cqe(cq, &cqe64, &cqe); if (err == 
CQ_EMPTY) { if (stall == POLLING_MODE_STALL_ADAPTIVE) cq->flags |= MLX5_CQ_FLAGS_EMPTY_DURING_POLL; return ENOENT; } return mlx5_parse_lazy_cqe(cq, cqe64, cqe, cqe_version); } static inline int mlx5_next_poll_adaptive_v0(struct ibv_cq_ex *ibcq) { return mlx5_next_poll(ibcq, POLLING_MODE_STALL_ADAPTIVE, 0); } static inline int mlx5_next_poll_adaptive_v1(struct ibv_cq_ex *ibcq) { return mlx5_next_poll(ibcq, POLLING_MODE_STALL_ADAPTIVE, 1); } static inline int mlx5_next_poll_v0(struct ibv_cq_ex *ibcq) { return mlx5_next_poll(ibcq, 0, 0); } static inline int mlx5_next_poll_v1(struct ibv_cq_ex *ibcq) { return mlx5_next_poll(ibcq, 0, 1); } static inline int mlx5_start_poll_v0(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 0, 0, 0, 0); } static inline int mlx5_start_poll_v1(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 0, 0, 1, 0); } static inline int mlx5_start_poll_v0_lock(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 1, 0, 0, 0); } static inline int mlx5_start_poll_v1_lock(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 1, 0, 1, 0); } static inline int mlx5_start_poll_adaptive_stall_v0_lock(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 1, POLLING_MODE_STALL_ADAPTIVE, 0, 0); } static inline int mlx5_start_poll_stall_v0_lock(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 1, POLLING_MODE_STALL, 0, 0); } static inline int mlx5_start_poll_adaptive_stall_v1_lock(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 1, POLLING_MODE_STALL_ADAPTIVE, 1, 0); } static inline int mlx5_start_poll_stall_v1_lock(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 1, POLLING_MODE_STALL, 1, 0); } static inline int mlx5_start_poll_stall_v0(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 0, POLLING_MODE_STALL, 0, 0); } static inline int mlx5_start_poll_adaptive_stall_v0(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 0, POLLING_MODE_STALL_ADAPTIVE, 0, 0); } static inline int mlx5_start_poll_adaptive_stall_v1(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 0, POLLING_MODE_STALL_ADAPTIVE, 1, 0); } static inline int mlx5_start_poll_stall_v1(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 0, POLLING_MODE_STALL, 1, 0); } static inline int mlx5_start_poll_v0_lock_clock_update(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 1, 0, 0, 1); } static inline int mlx5_start_poll_v1_lock_clock_update(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 1, 0, 1, 1); } static inline int mlx5_start_poll_v1_clock_update(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 0, 0, 1, 1); } static inline int mlx5_start_poll_v0_clock_update(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 0, 0, 0, 1); } static inline int mlx5_start_poll_stall_v1_lock_clock_update(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 1, POLLING_MODE_STALL, 1, 1); } static inline int mlx5_start_poll_stall_v0_lock_clock_update(struct 
ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 1, POLLING_MODE_STALL, 0, 1); } static inline int mlx5_start_poll_stall_v1_clock_update(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 0, POLLING_MODE_STALL, 1, 1); } static inline int mlx5_start_poll_stall_v0_clock_update(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 0, POLLING_MODE_STALL, 0, 1); } static inline int mlx5_start_poll_adaptive_stall_v0_lock_clock_update(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 1, POLLING_MODE_STALL_ADAPTIVE, 0, 1); } static inline int mlx5_start_poll_adaptive_stall_v1_lock_clock_update(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 1, POLLING_MODE_STALL_ADAPTIVE, 1, 1); } static inline int mlx5_start_poll_adaptive_stall_v0_clock_update(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 0, POLLING_MODE_STALL_ADAPTIVE, 0, 1); } static inline int mlx5_start_poll_adaptive_stall_v1_clock_update(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr) { return mlx5_start_poll(ibcq, attr, 0, POLLING_MODE_STALL_ADAPTIVE, 1, 1); } static inline void mlx5_end_poll_adaptive_stall_lock(struct ibv_cq_ex *ibcq) { _mlx5_end_poll(ibcq, 1, POLLING_MODE_STALL_ADAPTIVE); } static inline void mlx5_end_poll_stall_lock(struct ibv_cq_ex *ibcq) { _mlx5_end_poll(ibcq, 1, POLLING_MODE_STALL); } static inline void mlx5_end_poll_adaptive_stall(struct ibv_cq_ex *ibcq) { _mlx5_end_poll(ibcq, 0, POLLING_MODE_STALL_ADAPTIVE); } static inline void mlx5_end_poll_stall(struct ibv_cq_ex *ibcq) { _mlx5_end_poll(ibcq, 0, POLLING_MODE_STALL); } static inline void mlx5_end_poll(struct ibv_cq_ex *ibcq) { _mlx5_end_poll(ibcq, 0, 0); } static inline void mlx5_end_poll_lock(struct ibv_cq_ex *ibcq) { _mlx5_end_poll(ibcq, 1, 0); } int mlx5_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc) { return poll_cq(ibcq, ne, wc, 0); } int mlx5_poll_cq_v1(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc) { return poll_cq(ibcq, ne, wc, 1); } static inline enum ibv_wc_opcode mlx5_cq_read_wc_opcode(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); switch (mlx5dv_get_cqe_opcode(cq->cqe64)) { case MLX5_CQE_RESP_WR_IMM: return IBV_WC_RECV_RDMA_WITH_IMM; case MLX5_CQE_RESP_SEND: case MLX5_CQE_RESP_SEND_IMM: case MLX5_CQE_RESP_SEND_INV: if (unlikely(cq->cqe64->app == MLX5_CQE_APP_TAG_MATCHING)) { switch (cq->cqe64->app_op) { case MLX5_CQE_APP_OP_TM_CONSUMED_MSG_SW_RDNV: case MLX5_CQE_APP_OP_TM_CONSUMED_MSG: case MLX5_CQE_APP_OP_TM_CONSUMED_SW_RDNV: case MLX5_CQE_APP_OP_TM_EXPECTED: case MLX5_CQE_APP_OP_TM_UNEXPECTED: return IBV_WC_TM_RECV; case MLX5_CQE_APP_OP_TM_NO_TAG: return IBV_WC_TM_NO_TAG; } } return IBV_WC_RECV; case MLX5_CQE_NO_PACKET: switch (cq->cqe64->app_op) { case MLX5_CQE_APP_OP_TM_REMOVE: return IBV_WC_TM_DEL; case MLX5_CQE_APP_OP_TM_APPEND: return IBV_WC_TM_ADD; case MLX5_CQE_APP_OP_TM_NOOP: return IBV_WC_TM_SYNC; case MLX5_CQE_APP_OP_TM_CONSUMED: return IBV_WC_TM_RECV; } break; case MLX5_CQE_REQ: if (unlikely(cq->flags & MLX5_CQ_FLAGS_RAW_WQE)) return IBV_WC_DRIVER2; switch (be32toh(cq->cqe64->sop_drop_qpn) >> 24) { case MLX5_OPCODE_RDMA_WRITE_IMM: case MLX5_OPCODE_RDMA_WRITE: return IBV_WC_RDMA_WRITE; case MLX5_OPCODE_SEND_IMM: case MLX5_OPCODE_SEND: case MLX5_OPCODE_SEND_INVAL: return IBV_WC_SEND; case MLX5_OPCODE_RDMA_READ: return IBV_WC_RDMA_READ; case 
MLX5_OPCODE_ATOMIC_CS: return IBV_WC_COMP_SWAP; case MLX5_OPCODE_ATOMIC_FA: return IBV_WC_FETCH_ADD; case MLX5_OPCODE_UMR: case MLX5_OPCODE_SET_PSV: case MLX5_OPCODE_NOP: case MLX5_OPCODE_MMO: return cq->cached_opcode; case MLX5_OPCODE_TSO: return IBV_WC_TSO; } } #ifdef MLX5_DEBUG { struct mlx5_context *ctx = to_mctx(ibcq->context); mlx5_dbg(ctx->dbg_fp, MLX5_DBG_CQ_CQE, "un-expected opcode in cqe\n"); } #endif return 0; } static inline uint32_t mlx5_cq_read_wc_qp_num(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return be32toh(cq->cqe64->sop_drop_qpn) & 0xffffff; } static inline unsigned int mlx5_cq_read_wc_flags(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); int wc_flags = 0; if (cq->flags & MLX5_CQ_FLAGS_RX_CSUM_VALID) wc_flags = get_csum_ok(cq->cqe64); switch (mlx5dv_get_cqe_opcode(cq->cqe64)) { case MLX5_CQE_RESP_WR_IMM: case MLX5_CQE_RESP_SEND_IMM: wc_flags |= IBV_WC_WITH_IMM; break; case MLX5_CQE_RESP_SEND_INV: wc_flags |= IBV_WC_WITH_INV; break; } if (cq->flags & MLX5_CQ_FLAGS_TM_SYNC_REQ) wc_flags |= IBV_WC_TM_SYNC_REQ; if (unlikely(cq->cqe64->app == MLX5_CQE_APP_TAG_MATCHING)) { switch (cq->cqe64->app_op) { case MLX5_CQE_APP_OP_TM_CONSUMED_MSG_SW_RDNV: case MLX5_CQE_APP_OP_TM_CONSUMED_MSG: case MLX5_CQE_APP_OP_TM_MSG_COMPLETION_CANCELED: /* Full completion */ wc_flags |= (IBV_WC_TM_MATCH | IBV_WC_TM_DATA_VALID); break; case MLX5_CQE_APP_OP_TM_CONSUMED_SW_RDNV: case MLX5_CQE_APP_OP_TM_CONSUMED: /* First completion */ wc_flags |= IBV_WC_TM_MATCH; break; case MLX5_CQE_APP_OP_TM_EXPECTED: /* Second completion */ wc_flags |= IBV_WC_TM_DATA_VALID; break; } } wc_flags |= ((be32toh(cq->cqe64->flags_rqpn) >> 28) & 3) ? IBV_WC_GRH : 0; return wc_flags; } static inline uint32_t mlx5_cq_read_wc_byte_len(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return be32toh(cq->cqe64->byte_cnt); } static inline uint32_t mlx5_cq_read_wc_vendor_err(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); struct mlx5_err_cqe *ecqe = (struct mlx5_err_cqe *)cq->cqe64; return ecqe->vendor_err_synd; } static inline __be32 mlx5_cq_read_wc_imm_data(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); switch (mlx5dv_get_cqe_opcode(cq->cqe64)) { case MLX5_CQE_RESP_SEND_INV: /* This is returning invalidate_rkey which is in host order, see * ibv_wc_read_invalidated_rkey */ return (__force __be32)be32toh(cq->cqe64->imm_inval_pkey); default: return cq->cqe64->imm_inval_pkey; } } static inline uint32_t mlx5_cq_read_wc_slid(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return (uint32_t)be16toh(cq->cqe64->slid); } static inline uint8_t mlx5_cq_read_wc_sl(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return (be32toh(cq->cqe64->flags_rqpn) >> 24) & 0xf; } static inline uint32_t mlx5_cq_read_wc_src_qp(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return be32toh(cq->cqe64->flags_rqpn) & 0xffffff; } static inline uint8_t mlx5_cq_read_wc_dlid_path_bits(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return cq->cqe64->ml_path & 0x7f; } static inline uint64_t mlx5_cq_read_wc_completion_ts(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return be64toh(cq->cqe64->timestamp); } static inline uint64_t mlx5_cq_read_wc_completion_wallclock_ns(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return 
mlx5dv_ts_to_ns(&cq->last_clock_info, mlx5_cq_read_wc_completion_ts(ibcq)); } static inline uint16_t mlx5_cq_read_wc_cvlan(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return be16toh(cq->cqe64->vlan_info); } static inline uint32_t mlx5_cq_read_flow_tag(struct ibv_cq_ex *ibcq) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); return be32toh(cq->cqe64->sop_drop_qpn) & MLX5_FLOW_TAG_MASK; } static inline void mlx5_cq_read_wc_tm_info(struct ibv_cq_ex *ibcq, struct ibv_wc_tm_info *tm_info) { struct mlx5_cq *cq = to_mcq(ibv_cq_ex_to_cq(ibcq)); tm_info->tag = be64toh(cq->cqe64->tmh.tag); tm_info->priv = be32toh(cq->cqe64->tmh.app_ctx); } #define SINGLE_THREADED BIT(0) #define STALL BIT(1) #define V1 BIT(2) #define ADAPTIVE BIT(3) #define CLOCK_UPDATE BIT(4) #define mlx5_start_poll_name(cqe_ver, lock, stall, adaptive, clock_update) \ mlx5_start_poll##adaptive##stall##cqe_ver##lock##clock_update #define mlx5_next_poll_name(cqe_ver, adaptive) \ mlx5_next_poll##adaptive##cqe_ver #define mlx5_end_poll_name(lock, stall, adaptive) \ mlx5_end_poll##adaptive##stall##lock #define POLL_FN_ENTRY(cqe_ver, lock, stall, adaptive, clock_update) { \ .start_poll = &mlx5_start_poll_name(cqe_ver, lock, stall, adaptive, clock_update), \ .next_poll = &mlx5_next_poll_name(cqe_ver, adaptive), \ .end_poll = &mlx5_end_poll_name(lock, stall, adaptive), \ } static const struct op { int (*start_poll)(struct ibv_cq_ex *ibcq, struct ibv_poll_cq_attr *attr); int (*next_poll)(struct ibv_cq_ex *ibcq); void (*end_poll)(struct ibv_cq_ex *ibcq); } ops[ADAPTIVE + V1 + STALL + SINGLE_THREADED + CLOCK_UPDATE + 1] = { [V1] = POLL_FN_ENTRY(_v1, _lock, , ,), [0] = POLL_FN_ENTRY(_v0, _lock, , ,), [V1 | SINGLE_THREADED] = POLL_FN_ENTRY(_v1, , , , ), [SINGLE_THREADED] = POLL_FN_ENTRY(_v0, , , , ), [V1 | STALL] = POLL_FN_ENTRY(_v1, _lock, _stall, , ), [STALL] = POLL_FN_ENTRY(_v0, _lock, _stall, , ), [V1 | SINGLE_THREADED | STALL] = POLL_FN_ENTRY(_v1, , _stall, , ), [SINGLE_THREADED | STALL] = POLL_FN_ENTRY(_v0, , _stall, , ), [V1 | STALL | ADAPTIVE] = POLL_FN_ENTRY(_v1, _lock, _stall, _adaptive, ), [STALL | ADAPTIVE] = POLL_FN_ENTRY(_v0, _lock, _stall, _adaptive, ), [V1 | SINGLE_THREADED | STALL | ADAPTIVE] = POLL_FN_ENTRY(_v1, , _stall, _adaptive, ), [SINGLE_THREADED | STALL | ADAPTIVE] = POLL_FN_ENTRY(_v0, , _stall, _adaptive, ), [V1 | CLOCK_UPDATE] = POLL_FN_ENTRY(_v1, _lock, , , _clock_update), [0 | CLOCK_UPDATE] = POLL_FN_ENTRY(_v0, _lock, , , _clock_update), [V1 | SINGLE_THREADED | CLOCK_UPDATE] = POLL_FN_ENTRY(_v1, , , , _clock_update), [SINGLE_THREADED | CLOCK_UPDATE] = POLL_FN_ENTRY(_v0, , , , _clock_update), [V1 | STALL | CLOCK_UPDATE] = POLL_FN_ENTRY(_v1, _lock, _stall, , _clock_update), [STALL | CLOCK_UPDATE] = POLL_FN_ENTRY(_v0, _lock, _stall, , _clock_update), [V1 | SINGLE_THREADED | STALL | CLOCK_UPDATE] = POLL_FN_ENTRY(_v1, , _stall, , _clock_update), [SINGLE_THREADED | STALL | CLOCK_UPDATE] = POLL_FN_ENTRY(_v0, , _stall, , _clock_update), [V1 | STALL | ADAPTIVE | CLOCK_UPDATE] = POLL_FN_ENTRY(_v1, _lock, _stall, _adaptive, _clock_update), [STALL | ADAPTIVE | CLOCK_UPDATE] = POLL_FN_ENTRY(_v0, _lock, _stall, _adaptive, _clock_update), [V1 | SINGLE_THREADED | STALL | ADAPTIVE | CLOCK_UPDATE] = POLL_FN_ENTRY(_v1, , _stall, _adaptive, _clock_update), [SINGLE_THREADED | STALL | ADAPTIVE | CLOCK_UPDATE] = POLL_FN_ENTRY(_v0, , _stall, _adaptive, _clock_update), }; int mlx5_cq_fill_pfns(struct mlx5_cq *cq, const struct ibv_cq_init_attr_ex *cq_attr, struct mlx5_context *mctx) { const struct op 
		*poll_ops = &ops[((cq->stall_enable &&
				   cq->stall_adaptive_enable) ?
				  ADAPTIVE : 0) |
				 (mctx->cqe_version ? V1 : 0) |
				 (cq->flags & MLX5_CQ_FLAGS_SINGLE_THREADED ?
				  SINGLE_THREADED : 0) |
				 (cq->stall_enable ? STALL : 0) |
				 ((cq_attr->wc_flags &
				   IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK) ?
				  CLOCK_UPDATE : 0)];

	cq->verbs_cq.cq_ex.start_poll = poll_ops->start_poll;
	cq->verbs_cq.cq_ex.next_poll = poll_ops->next_poll;
	cq->verbs_cq.cq_ex.end_poll = poll_ops->end_poll;

	cq->verbs_cq.cq_ex.read_opcode = mlx5_cq_read_wc_opcode;
	cq->verbs_cq.cq_ex.read_vendor_err = mlx5_cq_read_wc_vendor_err;
	cq->verbs_cq.cq_ex.read_wc_flags = mlx5_cq_read_wc_flags;
	if (cq_attr->wc_flags & IBV_WC_EX_WITH_BYTE_LEN)
		cq->verbs_cq.cq_ex.read_byte_len = mlx5_cq_read_wc_byte_len;
	if (cq_attr->wc_flags & IBV_WC_EX_WITH_IMM)
		cq->verbs_cq.cq_ex.read_imm_data = mlx5_cq_read_wc_imm_data;
	if (cq_attr->wc_flags & IBV_WC_EX_WITH_QP_NUM)
		cq->verbs_cq.cq_ex.read_qp_num = mlx5_cq_read_wc_qp_num;
	if (cq_attr->wc_flags & IBV_WC_EX_WITH_SRC_QP)
		cq->verbs_cq.cq_ex.read_src_qp = mlx5_cq_read_wc_src_qp;
	if (cq_attr->wc_flags & IBV_WC_EX_WITH_SLID)
		cq->verbs_cq.cq_ex.read_slid = mlx5_cq_read_wc_slid;
	if (cq_attr->wc_flags & IBV_WC_EX_WITH_SL)
		cq->verbs_cq.cq_ex.read_sl = mlx5_cq_read_wc_sl;
	if (cq_attr->wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS)
		cq->verbs_cq.cq_ex.read_dlid_path_bits =
			mlx5_cq_read_wc_dlid_path_bits;
	if (cq_attr->wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP)
		cq->verbs_cq.cq_ex.read_completion_ts =
			mlx5_cq_read_wc_completion_ts;
	if (cq_attr->wc_flags & IBV_WC_EX_WITH_CVLAN)
		cq->verbs_cq.cq_ex.read_cvlan = mlx5_cq_read_wc_cvlan;
	if (cq_attr->wc_flags & IBV_WC_EX_WITH_FLOW_TAG)
		cq->verbs_cq.cq_ex.read_flow_tag = mlx5_cq_read_flow_tag;
	if (cq_attr->wc_flags & IBV_WC_EX_WITH_TM_INFO)
		cq->verbs_cq.cq_ex.read_tm_info = mlx5_cq_read_wc_tm_info;
	if (cq_attr->wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK) {
		if (mctx->flags & MLX5_CTX_FLAGS_REAL_TIME_TS_SUPPORTED &&
		    !(cq_attr->wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP))
			cq->verbs_cq.cq_ex.read_completion_wallclock_ns =
				mlx5_cq_read_wc_completion_ts;
		else {
			if (!mctx->clock_info_page)
				return EOPNOTSUPP;
			cq->verbs_cq.cq_ex.read_completion_wallclock_ns =
				mlx5_cq_read_wc_completion_wallclock_ns;
		}
	}

	return 0;
}

int mlx5_arm_cq(struct ibv_cq *ibvcq, int solicited)
{
	struct mlx5_cq *cq = to_mcq(ibvcq);
	struct mlx5_context *ctx = to_mctx(ibvcq->context);
	uint64_t doorbell;
	uint32_t sn;
	uint32_t ci;
	uint32_t cmd;

	sn = cq->arm_sn & 3;
	ci = cq->cons_index & 0xffffff;
	cmd = solicited ? MLX5_CQ_DB_REQ_NOT_SOL : MLX5_CQ_DB_REQ_NOT;

	doorbell = sn << 28 | cmd | ci;
	doorbell <<= 32;
	doorbell |= cq->cqn;

	cq->dbrec[MLX5_CQ_ARM_DB] = htobe32(sn << 28 | cmd | ci);

	/*
	 * Make sure that the doorbell record in host memory is
	 * written before ringing the doorbell via PCI WC MMIO.
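	 * The upper 32 bits of the doorbell carry (sn << 28 | cmd | ci),
	 * matching the doorbell record above, and the lower 32 bits carry
	 * the CQ number; mmio_wc_start()/mmio_flush_writes() bracket the
	 * write-combining MMIO write to keep it ordered.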
	 */
	mmio_wc_start();
	mmio_write64_be(ctx->cq_uar_reg + MLX5_CQ_DOORBELL,
			htobe64(doorbell));
	mmio_flush_writes();

	return 0;
}

void mlx5_cq_event(struct ibv_cq *cq)
{
	to_mcq(cq)->arm_sn++;
}

static int is_equal_rsn(struct mlx5_cqe64 *cqe64, uint32_t rsn)
{
	return rsn == (be32toh(cqe64->sop_drop_qpn) & 0xffffff);
}

static inline int is_equal_uidx(struct mlx5_cqe64 *cqe64, uint32_t uidx)
{
	return uidx == (be32toh(cqe64->srqn_uidx) & 0xffffff);
}

static inline int is_responder(uint8_t opcode)
{
	switch (opcode) {
	case MLX5_CQE_RESP_WR_IMM:
	case MLX5_CQE_RESP_SEND:
	case MLX5_CQE_RESP_SEND_IMM:
	case MLX5_CQE_RESP_SEND_INV:
	case MLX5_CQE_RESP_ERR:
		return 1;
	}

	return 0;
}

static inline int free_res_cqe(struct mlx5_cqe64 *cqe64, uint32_t rsn,
			       struct mlx5_srq *srq, int cqe_version)
{
	if (cqe_version) {
		if (is_equal_uidx(cqe64, rsn)) {
			if (srq && is_responder(mlx5dv_get_cqe_opcode(cqe64)))
				mlx5_free_srq_wqe(srq,
						  be16toh(cqe64->wqe_counter));
			return 1;
		}
	} else {
		if (is_equal_rsn(cqe64, rsn)) {
			if (srq && (be32toh(cqe64->srqn_uidx) & 0xffffff))
				mlx5_free_srq_wqe(srq,
						  be16toh(cqe64->wqe_counter));
			return 1;
		}
	}

	return 0;
}

void __mlx5_cq_clean(struct mlx5_cq *cq, uint32_t rsn, struct mlx5_srq *srq)
{
	uint32_t prod_index;
	int nfreed = 0;
	struct mlx5_cqe64 *cqe64, *dest64;
	void *cqe, *dest;
	uint8_t owner_bit;
	int cqe_version;

	if (!cq || cq->flags & MLX5_CQ_FLAGS_DV_OWNED)
		return;

	/*
	 * For a CQ created in single threaded mode serving multiple
	 * QPs, if the user destroys a QP between ibv_start_poll()
	 * and ibv_end_poll(), then cq->cur_rsc should be invalidated
	 * since it may point to the QP that is being destroyed, which
	 * may cause a use-after-free in the next ibv_next_poll() call.
	 */
	if (unlikely(cq->cur_rsc && rsn == cq->cur_rsc->rsn))
		cq->cur_rsc = NULL;

	/*
	 * First we need to find the current producer index, so we
	 * know where to start cleaning from. It doesn't matter if HW
	 * adds new entries after this loop -- the QP we're worried
	 * about is already in RESET, so the new entries won't come
	 * from our QP and therefore don't need to be checked.
	 */
	for (prod_index = cq->cons_index; get_sw_cqe(cq, prod_index);
	     ++prod_index)
		if (prod_index == cq->cons_index + cq->verbs_cq.cq.cqe)
			break;

	/*
	 * Now sweep backwards through the CQ, removing CQ entries
	 * that match our QP by copying older entries on top of them.
	 */
	cqe_version = (to_mctx(cq->verbs_cq.cq.context))->cqe_version;
	while ((int) --prod_index - (int) cq->cons_index >= 0) {
		cqe = get_cqe(cq, prod_index & cq->verbs_cq.cq.cqe);
		cqe64 = (cq->cqe_sz == 64) ? cqe : cqe + 64;
		if (free_res_cqe(cqe64, rsn, srq, cqe_version)) {
			++nfreed;
		} else if (nfreed) {
			dest = get_cqe(cq, (prod_index + nfreed) &
				       cq->verbs_cq.cq.cqe);
			dest64 = (cq->cqe_sz == 64) ? dest : dest + 64;
			owner_bit = dest64->op_own & MLX5_CQE_OWNER_MASK;
			memcpy(dest, cqe, cq->cqe_sz);
			dest64->op_own = owner_bit |
				(dest64->op_own & ~MLX5_CQE_OWNER_MASK);
		}
	}

	if (nfreed) {
		cq->cons_index += nfreed;
		/*
		 * Make sure update of buffer contents is done before
		 * updating consumer index.
		 */
		udma_to_device_barrier();
		update_cons_index(cq);
	}
}

void mlx5_cq_clean(struct mlx5_cq *cq, uint32_t qpn, struct mlx5_srq *srq)
{
	mlx5_spin_lock(&cq->lock);
	__mlx5_cq_clean(cq, qpn, srq);
	mlx5_spin_unlock(&cq->lock);
}

static uint8_t sw_ownership_bit(int n, int nent)
{
	return (n & nent) ?
1 : 0; } static int is_hw(uint8_t own, int n, int mask) { return (own & MLX5_CQE_OWNER_MASK) ^ !!(n & (mask + 1)); } void mlx5_cq_resize_copy_cqes(struct mlx5_context *mctx, struct mlx5_cq *cq) { struct mlx5_cqe64 *scqe64; struct mlx5_cqe64 *dcqe64; void *start_cqe; void *scqe; void *dcqe; int ssize; int dsize; int i; uint8_t sw_own; ssize = cq->cqe_sz; dsize = cq->resize_cqe_sz; i = cq->cons_index; scqe = get_buf_cqe(cq->active_buf, i & cq->active_cqes, ssize); scqe64 = ssize == 64 ? scqe : scqe + 64; start_cqe = scqe; if (is_hw(scqe64->op_own, i, cq->active_cqes)) { mlx5_err(mctx->dbg_fp, "expected cqe in sw ownership\n"); return; } while ((scqe64->op_own >> 4) != MLX5_CQE_RESIZE_CQ) { dcqe = get_buf_cqe(cq->resize_buf, (i + 1) & (cq->resize_cqes - 1), dsize); dcqe64 = dsize == 64 ? dcqe : dcqe + 64; sw_own = sw_ownership_bit(i + 1, cq->resize_cqes); memcpy(dcqe, scqe, ssize); dcqe64->op_own = (dcqe64->op_own & ~MLX5_CQE_OWNER_MASK) | sw_own; ++i; scqe = get_buf_cqe(cq->active_buf, i & cq->active_cqes, ssize); scqe64 = ssize == 64 ? scqe : scqe + 64; if (is_hw(scqe64->op_own, i, cq->active_cqes)) { mlx5_err(mctx->dbg_fp, "expected cqe in sw ownership\n"); return; } if (scqe == start_cqe) { mlx5_err(mctx->dbg_fp, "resize CQ failed to get resize CQE\n"); return; } } ++cq->cons_index; } int mlx5_alloc_cq_buf(struct mlx5_context *mctx, struct mlx5_cq *cq, struct mlx5_buf *buf, int nent, int cqe_sz) { struct mlx5_cqe64 *cqe; int i; struct mlx5_device *dev = to_mdev(mctx->ibv_ctx.context.device); int ret; enum mlx5_alloc_type type; enum mlx5_alloc_type default_type = MLX5_ALLOC_TYPE_ANON; if (mlx5_use_huge("HUGE_CQ")) default_type = MLX5_ALLOC_TYPE_HUGE; mlx5_get_alloc_type(mctx, cq->parent_domain, MLX5_CQ_PREFIX, &type, default_type); if (type == MLX5_ALLOC_TYPE_CUSTOM) { buf->mparent_domain = to_mparent_domain(cq->parent_domain); buf->req_alignment = dev->page_size; buf->resource_type = MLX5DV_RES_TYPE_CQ; } ret = mlx5_alloc_prefered_buf(mctx, buf, align(nent * cqe_sz, dev->page_size), dev->page_size, type, MLX5_CQ_PREFIX); if (ret) return -1; if (buf->type != MLX5_ALLOC_TYPE_CUSTOM) memset(buf->buf, 0, nent * cqe_sz); for (i = 0; i < nent; ++i) { cqe = buf->buf + i * cqe_sz; cqe += cqe_sz == 128 ? 1 : 0; cqe->op_own = MLX5_CQE_INVALID << 4; } return 0; } int mlx5_free_cq_buf(struct mlx5_context *ctx, struct mlx5_buf *buf) { return mlx5_free_actual_buf(ctx, buf); } rdma-core-56.1/providers/mlx5/dbrec.c000066400000000000000000000114301477342711600174560ustar00rootroot00000000000000/* * Copyright (c) 2012 Mellanox Technologies, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define _GNU_SOURCE #include #include #include #include #include "mlx5.h" struct mlx5_db_page { cl_map_item_t cl_map; struct list_node available; struct mlx5_buf buf; int num_db; int use_cnt; unsigned long free[0]; }; static struct mlx5_db_page *__add_page(struct mlx5_context *context) { struct mlx5_db_page *page; int ps = to_mdev(context->ibv_ctx.context.device)->page_size; int pp; int i; int nlong; int ret; pp = ps / context->cache_line_size; nlong = (pp + 8 * sizeof(long) - 1) / (8 * sizeof(long)); page = malloc(sizeof *page + nlong * sizeof(long)); if (!page) return NULL; if (mlx5_is_extern_alloc(context)) ret = mlx5_alloc_buf_extern(context, &page->buf, ps); else ret = mlx5_alloc_buf(&page->buf, ps, ps); if (ret) { free(page); return NULL; } page->num_db = pp; page->use_cnt = 0; for (i = 0; i < nlong; ++i) page->free[i] = ~0; cl_qmap_insert(&context->dbr_map, (uintptr_t) page->buf.buf, &page->cl_map); list_add(&context->dbr_available_pages, &page->available); return page; } __be32 *mlx5_alloc_dbrec(struct mlx5_context *context, struct ibv_pd *pd, bool *custom_alloc) { struct mlx5_db_page *page; __be32 *db = NULL; int i, j; if (mlx5_is_custom_alloc(pd)) { struct mlx5_parent_domain *mparent_domain = to_mparent_domain(pd); db = mparent_domain->alloc(&mparent_domain->mpd.ibv_pd, mparent_domain->pd_context, 8, 8, MLX5DV_RES_TYPE_DBR); if (db == IBV_ALLOCATOR_USE_DEFAULT) goto default_alloc; if (!db) return NULL; *custom_alloc = true; return db; } default_alloc: pthread_mutex_lock(&context->dbr_map_mutex); page = list_top(&context->dbr_available_pages, struct mlx5_db_page, available); if (page) goto found; page = __add_page(context); if (!page) goto out; found: ++page->use_cnt; if (page->use_cnt == page->num_db) list_del(&page->available); for (i = 0; !page->free[i]; ++i) /* nothing */; j = ffsl(page->free[i]); --j; page->free[i] &= ~(1UL << j); db = page->buf.buf + (i * 8 * sizeof(long) + j) * context->cache_line_size; out: pthread_mutex_unlock(&context->dbr_map_mutex); return db; } void mlx5_free_db(struct mlx5_context *context, __be32 *db, struct ibv_pd *pd, bool custom_alloc) { struct mlx5_db_page *page; uintptr_t ps = to_mdev(context->ibv_ctx.context.device)->page_size; cl_map_item_t *item; int i; if (custom_alloc) { struct mlx5_parent_domain *mparent_domain = to_mparent_domain(pd); mparent_domain->free(&mparent_domain->mpd.ibv_pd, mparent_domain->pd_context, db, MLX5DV_RES_TYPE_DBR); return; } pthread_mutex_lock(&context->dbr_map_mutex); item = cl_qmap_get(&context->dbr_map, (uintptr_t) db & ~(ps - 1)); assert(item != cl_qmap_end(&context->dbr_map)); page = (container_of(item, struct mlx5_db_page, cl_map)); i = ((void *) db - page->buf.buf) / context->cache_line_size; page->free[i / (8 * sizeof(long))] |= 1UL << (i % (8 * sizeof(long))); if (page->use_cnt == page->num_db) list_add(&context->dbr_available_pages, &page->available); if (!--page->use_cnt) { cl_qmap_remove_item(&context->dbr_map, item); list_del(&page->available); if (page->buf.type == MLX5_ALLOC_TYPE_EXTERNAL) mlx5_free_buf_extern(context, 
&page->buf); else mlx5_free_buf(&page->buf); free(page); } pthread_mutex_unlock(&context->dbr_map_mutex); } rdma-core-56.1/providers/mlx5/dr_action.c000066400000000000000000002712631477342711600203550ustar00rootroot00000000000000/* * Copyright (c) 2019, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include "mlx5dv_dr.h" #include "dr_ste.h" enum dr_action_domain { DR_ACTION_DOMAIN_NIC_INGRESS, DR_ACTION_DOMAIN_NIC_EGRESS, DR_ACTION_DOMAIN_FDB_INGRESS, DR_ACTION_DOMAIN_FDB_EGRESS, DR_ACTION_DOMAIN_MAX, }; enum dr_action_valid_state { DR_ACTION_STATE_ERR, DR_ACTION_STATE_NO_ACTION, DR_ACTION_STATE_ENCAP, DR_ACTION_STATE_DECAP, DR_ACTION_STATE_MODIFY_HDR, DR_ACTION_STATE_POP_VLAN, DR_ACTION_STATE_PUSH_VLAN, DR_ACTION_STATE_NON_TERM, DR_ACTION_STATE_TERM, DR_ACTION_STATE_ASO, DR_ACTION_STATE_MAX, }; static const enum dr_action_valid_state next_action_state[DR_ACTION_DOMAIN_MAX] [DR_ACTION_STATE_MAX] [DR_ACTION_TYP_MAX] = { [DR_ACTION_DOMAIN_NIC_INGRESS] = { [DR_ACTION_STATE_NO_ACTION] = { [DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_TAG] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_TNL_L2_TO_L2] = DR_ACTION_STATE_DECAP, [DR_ACTION_TYP_TNL_L3_TO_L2] = DR_ACTION_STATE_DECAP, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_MISS] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_DECAP] = { [DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_TAG] = DR_ACTION_STATE_DECAP, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_DECAP, [DR_ACTION_TYP_ASO_FIRST_HIT] = 
DR_ACTION_STATE_DECAP, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_MODIFY_HDR] = { [DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_TAG] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_POP_VLAN] = { [DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_TAG] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_PUSH_VLAN] = { [DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_TAG] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_NON_TERM] = { [DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_TAG] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_TNL_L2_TO_L2] = DR_ACTION_STATE_DECAP, [DR_ACTION_TYP_TNL_L3_TO_L2] = DR_ACTION_STATE_DECAP, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_MISS] = DR_ACTION_STATE_TERM, 
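		/*
		 * Action types that are absent from a row keep their zero
		 * initializer, DR_ACTION_STATE_ERR, so an unlisted action
		 * order is rejected for the domain.
		 */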
[DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_ASO] = { [DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_ENCAP] = { [DR_ACTION_TYP_QP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_TAG] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_TERM] = { [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_TERM, }, }, [DR_ACTION_DOMAIN_NIC_EGRESS] = { [DR_ACTION_STATE_NO_ACTION] = { [DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_MISS] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_ENCAP] = { [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_MODIFY_HDR] = { [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_POP_VLAN] = { [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_PUSH_VLAN] = { [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_PUSH_VLAN] = 
DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_NON_TERM] = { [DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_MISS] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_ASO] = { [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_MISS] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_TERM] = { [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_TERM, }, }, [DR_ACTION_DOMAIN_FDB_INGRESS] = { [DR_ACTION_STATE_NO_ACTION] = { [DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_TNL_L2_TO_L2] = DR_ACTION_STATE_DECAP, [DR_ACTION_TYP_TNL_L3_TO_L2] = DR_ACTION_STATE_DECAP, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_MISS] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_DECAP] = { [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_DECAP, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_DECAP, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_ENCAP] = { [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = 
DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_MODIFY_HDR] = { [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_POP_VLAN] = { [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_PUSH_VLAN] = { [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, }, [DR_ACTION_STATE_NON_TERM] = { [DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_TNL_L2_TO_L2] = DR_ACTION_STATE_DECAP, [DR_ACTION_TYP_TNL_L3_TO_L2] = DR_ACTION_STATE_DECAP, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_MISS] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_ASO] = { [DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, 
[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_TERM] = { [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_TERM, }, }, [DR_ACTION_DOMAIN_FDB_EGRESS] = { [DR_ACTION_STATE_NO_ACTION] = { [DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_NON_TERM, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO, [DR_ACTION_TYP_MISS] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_ENCAP] = { [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_MODIFY_HDR] = { [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_POP_VLAN] = { [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM, }, [DR_ACTION_STATE_PUSH_VLAN] = { [DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_CTR] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_PUSH_VLAN, [DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP, [DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM, [DR_ACTION_TYP_VPORT] 
= DR_ACTION_STATE_TERM,
			[DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM,
		},
		[DR_ACTION_STATE_NON_TERM] = {
			[DR_ACTION_TYP_DROP] = DR_ACTION_STATE_TERM,
			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_NON_TERM,
			[DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_NON_TERM,
			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
			[DR_ACTION_TYP_METER] = DR_ACTION_STATE_TERM,
			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
			[DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM,
			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
			[DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN,
			[DR_ACTION_TYP_POP_VLAN] = DR_ACTION_STATE_POP_VLAN,
			[DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM,
			[DR_ACTION_TYP_MISS] = DR_ACTION_STATE_TERM,
			[DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO,
			[DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO,
			[DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM,
		},
		[DR_ACTION_STATE_ASO] = {
			[DR_ACTION_TYP_L2_TO_TNL_L2] = DR_ACTION_STATE_ENCAP,
			[DR_ACTION_TYP_L2_TO_TNL_L3] = DR_ACTION_STATE_ENCAP,
			[DR_ACTION_TYP_MODIFY_HDR] = DR_ACTION_STATE_MODIFY_HDR,
			[DR_ACTION_TYP_PUSH_VLAN] = DR_ACTION_STATE_PUSH_VLAN,
			[DR_ACTION_TYP_VPORT] = DR_ACTION_STATE_TERM,
			[DR_ACTION_TYP_FT] = DR_ACTION_STATE_TERM,
			[DR_ACTION_TYP_SAMPLER] = DR_ACTION_STATE_TERM,
			[DR_ACTION_TYP_DEST_ARRAY] = DR_ACTION_STATE_TERM,
			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_ASO,
			[DR_ACTION_TYP_ASO_FIRST_HIT] = DR_ACTION_STATE_ASO,
			[DR_ACTION_TYP_ASO_FLOW_METER] = DR_ACTION_STATE_ASO,
			[DR_ACTION_TYP_ASO_CT] = DR_ACTION_STATE_ASO,
			[DR_ACTION_TYP_ROOT_FT] = DR_ACTION_STATE_TERM,
		},
		[DR_ACTION_STATE_TERM] = {
			[DR_ACTION_TYP_CTR] = DR_ACTION_STATE_TERM,
		},
	},
};

static enum mlx5dv_flow_action_packet_reformat_type
dr_action_type_to_reformat_enum(enum dr_action_type action_type)
{
	switch (action_type) {
	case DR_ACTION_TYP_TNL_L2_TO_L2:
		return MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2;
	case DR_ACTION_TYP_L2_TO_TNL_L2:
		return MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL;
	case DR_ACTION_TYP_TNL_L3_TO_L2:
		return MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L3_TUNNEL_TO_L2;
	case DR_ACTION_TYP_L2_TO_TNL_L3:
		return MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL;
	default:
		assert(false);
		return 0;
	}
}

static enum dr_action_type
dr_action_reformat_to_action_type(enum mlx5dv_flow_action_packet_reformat_type type)
{
	switch (type) {
	case MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2:
		return DR_ACTION_TYP_TNL_L2_TO_L2;
	case MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL:
		return DR_ACTION_TYP_L2_TO_TNL_L2;
	case MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L3_TUNNEL_TO_L2:
		return DR_ACTION_TYP_TNL_L3_TO_L2;
	case MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL:
		return DR_ACTION_TYP_L2_TO_TNL_L3;
	default:
		assert(false);
		return 0;
	}
}

/* Apply the actions on the rule STE array starting from the last_ste.
 * Actions might require more than one STE, new_num_stes will return
 * the new size of the STEs array, rule with actions.
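 * The extra STEs, if any, are filled in by the STE-context callbacks
 * right after last_ste, so the caller only needs to account for the
 * grown array size reported through new_num_stes.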
*/ static void dr_actions_apply(struct mlx5dv_dr_domain *dmn, enum dr_domain_nic_type nic_type, uint8_t *action_type_set, uint8_t *last_ste, struct dr_ste_actions_attr *attr, uint32_t *new_num_stes) { struct dr_ste_ctx *ste_ctx = dmn->ste_ctx; uint32_t added_stes = 0; if (nic_type == DR_DOMAIN_NIC_TYPE_RX) dr_ste_set_actions_rx(ste_ctx, action_type_set, last_ste, attr, &added_stes); else dr_ste_set_actions_tx(ste_ctx, action_type_set, last_ste, attr, &added_stes); *new_num_stes += added_stes; } static enum dr_action_domain dr_action_get_action_domain(enum mlx5dv_dr_domain_type domain, enum dr_domain_nic_type nic_type) { if (domain == MLX5DV_DR_DOMAIN_TYPE_NIC_RX) { return DR_ACTION_DOMAIN_NIC_INGRESS; } else if (domain == MLX5DV_DR_DOMAIN_TYPE_NIC_TX) { return DR_ACTION_DOMAIN_NIC_EGRESS; } else { /* FDB domain */ if (nic_type == DR_DOMAIN_NIC_TYPE_RX) return DR_ACTION_DOMAIN_FDB_INGRESS; else return DR_ACTION_DOMAIN_FDB_EGRESS; } } static int dr_action_validate_and_get_next_state(enum dr_action_domain action_domain, uint32_t action_type, uint32_t *state) { uint32_t cur_state = *state; /* Check action state machine is valid */ *state = next_action_state[action_domain][cur_state][action_type]; if (*state == DR_ACTION_STATE_ERR) { errno = EOPNOTSUPP; return errno; } return 0; } static int dr_action_send_modify_header_args(struct mlx5dv_dr_action *action, uint8_t send_ring_idx) { int ret; if (!(action->rewrite.args_send_qp & (1 << send_ring_idx))) { ret = dr_send_postsend_args(action->rewrite.dmn, dr_arg_get_object_id(action->rewrite.ptrn_arg.arg), action->rewrite.param.num_of_actions, action->rewrite.param.data, send_ring_idx); if (ret) { dr_dbg(action->rewrite.dmn, "Failed writing args object\n"); return ret; } action->rewrite.args_send_qp |= 1 << send_ring_idx; } return 0; } #define WITH_VLAN_NUM_HW_ACTIONS 6 int dr_actions_build_ste_arr(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, struct mlx5dv_dr_action *actions[], uint32_t num_actions, uint8_t *ste_arr, uint32_t *new_hw_ste_arr_sz, struct cross_dmn_params *cross_dmn_p, uint8_t send_ring_idx) { struct dr_domain_rx_tx *nic_dmn = nic_matcher->nic_tbl->nic_dmn; bool rx_rule = nic_dmn->type == DR_DOMAIN_NIC_TYPE_RX; struct mlx5dv_dr_action *cross_dmn_action = NULL; struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; uint8_t action_type_set[DR_ACTION_TYP_MAX] = {}; uint32_t state = DR_ACTION_STATE_NO_ACTION; struct dr_ste_actions_attr attr = {}; enum dr_action_domain action_domain; uint8_t *last_ste; int i; attr.dmn = dmn; attr.gvmi = dmn->info.caps.gvmi; attr.hit_gvmi = dmn->info.caps.gvmi; attr.final_icm_addr = nic_dmn->default_icm_addr; action_domain = dr_action_get_action_domain(dmn->type, nic_dmn->type); attr.aso_ste_loc = -1; for (i = 0; i < num_actions; i++) { struct mlx5dv_dr_action *action; int max_actions_type = 1; uint32_t action_type; action = actions[i]; action_type = action->action_type; switch (action_type) { case DR_ACTION_TYP_DROP: attr.final_icm_addr = nic_dmn->drop_icm_addr; attr.hit_gvmi = nic_dmn->drop_icm_addr >> 48; break; case DR_ACTION_TYP_FT: { struct mlx5dv_dr_table *dest_tbl = action->dest_tbl; if (dest_tbl->dmn != dmn) { dr_dbg(dmn, "Destination table belongs to a different domain\n"); goto out_invalid_arg; } if (dest_tbl->level <= matcher->tbl->level) dr_dbg(dmn, "Destination table level not higher than source\n"); attr.final_icm_addr = rx_rule ? 
dr_icm_pool_get_chunk_icm_addr(dest_tbl->rx.s_anchor->chunk) : dr_icm_pool_get_chunk_icm_addr(dest_tbl->tx.s_anchor->chunk); break; } case DR_ACTION_TYP_ROOT_FT: if (action->root_tbl.tbl->dmn != dmn) { dr_dbg(dmn, "Destination anchor belongs to a different domain\n"); goto out_invalid_arg; } attr.final_icm_addr = rx_rule ? action->root_tbl.rx_icm_addr : action->root_tbl.tx_icm_addr; break; case DR_ACTION_TYP_QP: if (action->dest_qp.is_qp) attr.final_icm_addr = to_mqp(action->dest_qp.qp)->tir_icm_addr; else attr.final_icm_addr = action->dest_qp.devx_tir->rx_icm_addr; if (!attr.final_icm_addr) { dr_dbg(dmn, "Unsupported TIR/QP for action\n"); goto out_invalid_arg; } break; case DR_ACTION_TYP_CTR: attr.ctr_id = action->ctr.devx_obj->object_id + action->ctr.offset; break; case DR_ACTION_TYP_ASO_CT: if (dmn != action->aso.dmn) { if (!action->aso.devx_obj->priv) { dr_dbg(dmn, "ASO CT devx priv object is not initialized\n"); goto out_invalid_arg; } struct dr_aso_cross_dmn_arrays *cross_dmn_arrays = (struct dr_aso_cross_dmn_arrays *) action->aso.devx_obj->priv; if (atomic_fetch_add(&cross_dmn_arrays->rule_htbl[action->aso.offset]->ste_arr->refcount, 1) > 1) { dr_dbg(dmn, "ASO CT cross GVMI action is in use by another rule\n"); atomic_fetch_sub(&cross_dmn_arrays->rule_htbl[action->aso.offset]->ste_arr->refcount, 1); errno = EBUSY; goto out_errno; } dr_ste_get(cross_dmn_arrays->action_htbl[action->aso.offset]->ste_arr); cross_dmn_p->cross_dmn_action = action; cross_dmn_action = action; } attr.aso = &action->aso; break; case DR_ACTION_TYP_ASO_FLOW_METER: case DR_ACTION_TYP_ASO_FIRST_HIT: if (dmn->ctx != action->aso.devx_obj->context) { dr_dbg(dmn, "ASO belongs to a different IB ctx\n"); goto out_invalid_arg; } attr.aso = &action->aso; break; case DR_ACTION_TYP_TAG: attr.flow_tag = action->flow_tag; break; case DR_ACTION_TYP_MISS: case DR_ACTION_TYP_TNL_L2_TO_L2: break; case DR_ACTION_TYP_TNL_L3_TO_L2: if (action->rewrite.is_root_level) { dr_dbg(dmn, "Root decap L3 action cannot be used on current table\n"); goto out_invalid_arg; } if (action->rewrite.ptrn_arg.ptrn && action->rewrite.ptrn_arg.arg) { attr.decap_index = dr_arg_get_object_id(action->rewrite.ptrn_arg.arg); attr.decap_actions = action->rewrite.ptrn_arg.ptrn->rewrite_param.num_of_actions; attr.decap_pat_idx = action->rewrite.ptrn_arg.ptrn->rewrite_param.index; if (dmn->info.use_mqs) { if (dr_action_send_modify_header_args(action, send_ring_idx)) goto out_errno; } } else { attr.decap_index = action->rewrite.param.index; attr.decap_actions = action->rewrite.param.num_of_actions; attr.decap_with_vlan = attr.decap_actions == WITH_VLAN_NUM_HW_ACTIONS; attr.decap_pat_idx = DR_INVALID_PATTERN_INDEX; } break; case DR_ACTION_TYP_MODIFY_HDR: if (action->rewrite.is_root_level) { dr_dbg(dmn, "Root modify header action cannot be used on current table\n"); goto out_invalid_arg; } if (action->rewrite.single_action_opt) { attr.modify_actions = action->rewrite.param.num_of_actions; attr.single_modify_action = action->rewrite.param.data; } else { if (action->rewrite.ptrn_arg.ptrn && action->rewrite.ptrn_arg.arg) { attr.modify_index = dr_arg_get_object_id(action->rewrite.ptrn_arg.arg); attr.modify_pat_idx = action->rewrite.ptrn_arg.ptrn->rewrite_param.index; attr.modify_actions = action->rewrite.ptrn_arg.ptrn->rewrite_param. 
num_of_actions;
					if (dmn->info.use_mqs) {
						if (dr_action_send_modify_header_args(action,
										      send_ring_idx))
							goto out_errno;
					}
				} else {
					attr.modify_actions = action->rewrite.param.num_of_actions;
					attr.modify_index = action->rewrite.param.index;
					attr.modify_pat_idx = DR_INVALID_PATTERN_INDEX;
				}
			}
			break;
		case DR_ACTION_TYP_L2_TO_TNL_L2:
		case DR_ACTION_TYP_L2_TO_TNL_L3:
			if (action->reformat.is_root_level) {
				dr_dbg(dmn, "Root encap action cannot be used on current table\n");
				goto out_invalid_arg;
			}

			if (rx_rule &&
			    !(dmn->ste_ctx->actions_caps & DR_STE_CTX_ACTION_CAP_RX_ENCAP)) {
				dr_dbg(dmn, "Device doesn't support Encap on RX\n");
				goto out_invalid_arg;
			}

			attr.reformat_size = action->reformat.reformat_size;
			attr.reformat_id = dr_actions_reformat_get_id(action);
			attr.prio_tag_required = dmn->info.caps.prio_tag_required;
			break;
		case DR_ACTION_TYP_METER:
			if (action->meter.next_ft->dmn != dmn) {
				dr_dbg(dmn, "Next table belongs to a different domain\n");
				goto out_invalid_arg;
			}
			if (action->meter.next_ft->level <= matcher->tbl->level) {
				dr_dbg(dmn, "Next table level should be higher than source table\n");
				goto out_invalid_arg;
			}
			attr.final_icm_addr = rx_rule ?
					      action->meter.rx_icm_addr :
					      action->meter.tx_icm_addr;
			break;
		case DR_ACTION_TYP_SAMPLER:
			if (action->sampler.dmn != dmn) {
				dr_dbg(dmn, "Sampler belongs to a different domain\n");
				goto out_invalid_arg;
			}
			if (action->sampler.sampler_default->next_ft->level <=
			    matcher->tbl->level) {
				dr_dbg(dmn, "Sampler next table level should be higher than source table\n");
				goto out_invalid_arg;
			}

			if (rx_rule) {
				attr.final_icm_addr = action->sampler.sampler_default->rx_icm_addr;
			} else {
				attr.final_icm_addr = (action->sampler.sampler_restore) ?
						      action->sampler.sampler_restore->tx_icm_addr :
						      action->sampler.sampler_default->tx_icm_addr;
			}
			break;
		case DR_ACTION_TYP_VPORT:
			if (action->vport.dmn != dmn) {
				dr_dbg(dmn, "Destination vport belongs to a different domain\n");
				goto out_invalid_arg;
			}
			if (unlikely(rx_rule && action->vport.caps->num == WIRE_PORT)) {
				if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_RX) {
					dr_dbg(dmn, "Forwarding to uplink vport on RX is not allowed\n");
					goto out_invalid_arg;
				}
				/* silently drop the packets for RX side of FDB */
				attr.final_icm_addr = nic_dmn->drop_icm_addr;
				attr.hit_gvmi = nic_dmn->drop_icm_addr >> 48;
			} else {
				attr.hit_gvmi = action->vport.caps->vhca_gvmi;
				attr.final_icm_addr = rx_rule ?
						      action->vport.caps->icm_address_rx :
						      action->vport.caps->icm_address_tx;
			}
			break;
		case DR_ACTION_TYP_DEST_ARRAY:
			if (action->dest_array.dmn != dmn) {
				dr_dbg(dmn, "Destination array belongs to a different domain\n");
				goto out_invalid_arg;
			}
			attr.final_icm_addr = rx_rule ?
action->dest_array.rx_icm_addr : action->dest_array.tx_icm_addr; break; case DR_ACTION_TYP_POP_VLAN: if (!rx_rule && !(dmn->ste_ctx->actions_caps & DR_STE_CTX_ACTION_CAP_TX_POP)) { dr_dbg(dmn, "Device doesn't support POP VLAN action on TX\n"); goto out_invalid_arg; } max_actions_type = MAX_VLANS; attr.vlans.count_pop++; break; case DR_ACTION_TYP_PUSH_VLAN: if (rx_rule && !(dmn->ste_ctx->actions_caps & DR_STE_CTX_ACTION_CAP_RX_PUSH)) { dr_dbg(dmn, "Device doesn't support PUSH VLAN action on RX\n"); goto out_invalid_arg; } max_actions_type = MAX_VLANS; if (attr.vlans.count_push == MAX_VLANS) { errno = ENOTSUP; return ENOTSUP; } attr.vlans.headers[attr.vlans.count_push++] = action->push_vlan.vlan_hdr; break; default: goto out_invalid_arg; } /* Check action duplication */ if (++action_type_set[action_type] > max_actions_type) { dr_dbg(dmn, "Action type %d supports only max %d time(s)\n", action_type, max_actions_type); goto out_invalid_arg; } /* Check action state machine is valid */ if (dr_action_validate_and_get_next_state(action_domain, action_type, &state)) { dr_dbg(dmn, "Invalid action sequence provided\n"); goto out_errno; } } *new_hw_ste_arr_sz = nic_matcher->num_of_builders; last_ste = ste_arr + DR_STE_SIZE * (nic_matcher->num_of_builders - 1); dr_actions_apply(dmn, nic_dmn->type, action_type_set, last_ste, &attr, new_hw_ste_arr_sz); if (attr.aso_ste_loc != -1) cross_dmn_p->cross_dmn_loc = attr.aso_ste_loc; return 0; out_invalid_arg: errno = EINVAL; out_errno: if (cross_dmn_action) { struct dr_aso_cross_dmn_arrays *cross_dmn_arrays = (struct dr_aso_cross_dmn_arrays *) cross_dmn_action->aso.devx_obj->priv; atomic_fetch_sub(&cross_dmn_arrays->rule_htbl[cross_dmn_action->aso.offset]->ste_arr->refcount, 1); atomic_fetch_sub(&cross_dmn_arrays->action_htbl[cross_dmn_action->aso.offset]->ste_arr->refcount, 1); } return errno; } int dr_actions_build_attr(struct mlx5dv_dr_matcher *matcher, struct mlx5dv_dr_action *actions[], size_t num_actions, struct mlx5dv_flow_action_attr *attr, struct mlx5_flow_action_attr_aux *attr_aux) { struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; int i; for (i = 0; i < num_actions; i++) { switch (actions[i]->action_type) { case DR_ACTION_TYP_FT: if (actions[i]->dest_tbl->dmn != dmn) { dr_dbg(dmn, "Destination table belongs to a different domain\n"); errno = EINVAL; return errno; } attr[i].type = MLX5DV_FLOW_ACTION_DEST_DEVX; attr[i].obj = actions[i]->dest_tbl->devx_obj; break; case DR_ACTION_TYP_DEST_ARRAY: if (actions[i]->dest_array.dmn != dmn) { dr_dbg(dmn, "Destination array belongs to a different domain\n"); errno = EINVAL; return errno; } attr[i].type = MLX5DV_FLOW_ACTION_DEST_DEVX; attr[i].obj = actions[i]->dest_array.devx_tbl->ft_dvo; break; case DR_ACTION_TYP_TNL_L2_TO_L2: case DR_ACTION_TYP_L2_TO_TNL_L2: case DR_ACTION_TYP_TNL_L3_TO_L2: case DR_ACTION_TYP_L2_TO_TNL_L3: attr[i].type = MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION; attr[i].action = actions[i]->reformat.flow_action; break; case DR_ACTION_TYP_MODIFY_HDR: attr[i].type = MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION; attr[i].action = actions[i]->rewrite.flow_action; break; case DR_ACTION_TYP_QP: if (actions[i]->dest_qp.is_qp) { attr[i].type = MLX5DV_FLOW_ACTION_DEST_IBV_QP; attr[i].qp = actions[i]->dest_qp.qp; } else { attr[i].type = MLX5DV_FLOW_ACTION_DEST_DEVX; attr[i].obj = actions[i]->dest_qp.devx_tir; } break; case DR_ACTION_TYP_CTR: attr[i].type = MLX5DV_FLOW_ACTION_COUNTERS_DEVX; attr[i].obj = actions[i]->ctr.devx_obj; if (actions[i]->ctr.offset) { attr_aux[i].type = MLX5_FLOW_ACTION_COUNTER_OFFSET; 
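				/* There is no counter-offset field in
				 * mlx5dv_flow_action_attr, so the offset is
				 * passed out of band via the aux entry.
				 */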
attr_aux[i].offset = actions[i]->ctr.offset; } break; case DR_ACTION_TYP_TAG: attr[i].type = MLX5DV_FLOW_ACTION_TAG; attr[i].tag_value = actions[i]->flow_tag; break; case DR_ACTION_TYP_MISS: attr[i].type = MLX5DV_FLOW_ACTION_DEFAULT_MISS; break; case DR_ACTION_TYP_DROP: attr[i].type = MLX5DV_FLOW_ACTION_DROP; break; default: dr_dbg(dmn, "Found unsupported action type: %d\n", actions[i]->action_type); errno = ENOTSUP; return errno; } } return 0; } static struct mlx5dv_dr_action * dr_action_create_generic(enum dr_action_type action_type) { struct mlx5dv_dr_action *action; action = calloc(1, sizeof(struct mlx5dv_dr_action)); if (!action) { errno = ENOMEM; return NULL; } action->action_type = action_type; atomic_init(&action->refcount, 1); return action; } struct mlx5dv_dr_action *mlx5dv_dr_action_create_drop(void) { return dr_action_create_generic(DR_ACTION_TYP_DROP); } struct mlx5dv_dr_action *mlx5dv_dr_action_create_default_miss(void) { return dr_action_create_generic(DR_ACTION_TYP_MISS); } struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_ibv_qp(struct ibv_qp *ibqp) { struct mlx5dv_dr_action *action; if (ibqp->qp_type != IBV_QPT_RAW_PACKET) { errno = EINVAL; return NULL; } action = dr_action_create_generic(DR_ACTION_TYP_QP); if (!action) return NULL; action->dest_qp.is_qp = true; action->dest_qp.qp = ibqp; return action; } struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_devx_tir(struct mlx5dv_devx_obj *devx_obj) { struct mlx5dv_dr_action *action; if (devx_obj->type != MLX5_DEVX_TIR) { errno = EINVAL; return NULL; } action = dr_action_create_generic(DR_ACTION_TYP_QP); if (!action) return NULL; action->dest_qp.devx_tir = devx_obj; return action; } struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_table(struct mlx5dv_dr_table *tbl) { struct mlx5dv_dr_action *action; atomic_fetch_add(&tbl->refcount, 1); if (dr_is_root_table(tbl)) { dr_dbg(tbl->dmn, "Root table cannot be used as a destination\n"); errno = EINVAL; goto dec_ref; } action = dr_action_create_generic(DR_ACTION_TYP_FT); if (!action) goto dec_ref; action->dest_tbl = tbl; return action; dec_ref: atomic_fetch_sub(&tbl->refcount, 1); return NULL; } static int dr_action_create_dest_root_table(struct mlx5dv_dr_action *action) { struct mlx5dv_dr_domain *dmn = action->root_tbl.tbl->dmn; struct dr_devx_flow_dest_info dest_info = {}; struct dr_devx_flow_table_attr ft_attr = {}; struct dr_devx_flow_group_attr fg_attr = {}; struct dr_devx_flow_fte_attr fte_attr = {}; int ret; switch (dmn->type) { case MLX5DV_DR_DOMAIN_TYPE_FDB: ft_attr.type = FS_FT_FDB; break; case MLX5DV_DR_DOMAIN_TYPE_NIC_RX: ft_attr.type = FS_FT_NIC_RX; break; case MLX5DV_DR_DOMAIN_TYPE_NIC_TX: ft_attr.type = FS_FT_NIC_TX; break; default: errno = EOPNOTSUPP; return errno; } fte_attr.dest_arr = &dest_info; fte_attr.action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; fte_attr.dest_arr[0].type = MLX5_FLOW_DEST_TYPE_FT; fte_attr.dest_arr[0].ft_id = action->root_tbl.sa->id; fte_attr.dest_size = 1; action->root_tbl.devx_tbl = dr_devx_create_always_hit_ft(dmn->ctx, &ft_attr, &fg_attr, &fte_attr); if (!action->root_tbl.devx_tbl) return errno; ret = dr_devx_query_flow_table(action->root_tbl.devx_tbl->ft_dvo, ft_attr.type, &action->root_tbl.rx_icm_addr, &action->root_tbl.tx_icm_addr); if (ret) goto destroy_devx_tbl; return 0; destroy_devx_tbl: dr_devx_destroy_always_hit_ft(action->root_tbl.devx_tbl); return errno; } struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_root_table(struct mlx5dv_dr_table *tbl, uint16_t priority) { struct mlx5dv_steering_anchor_attr attr = {}; 
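	/* A steering anchor exposes a priority inside the FW-managed root
	 * table as a devx flow-table destination that non-root rules can
	 * then jump to.
	 */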
enum mlx5_ib_uapi_flow_table_type ft_type; struct mlx5dv_steering_anchor *sa; struct mlx5dv_dr_action *action; int err; if (!dr_is_root_table(tbl)) { dr_dbg(tbl->dmn, "Supported only on root flow tables\n"); errno = EINVAL; return NULL; } if (tbl->dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_RX) ft_type = MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_RX; else if (tbl->dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_TX) ft_type = MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_TX; else ft_type = MLX5_IB_UAPI_FLOW_TABLE_TYPE_FDB; attr.priority = priority; attr.ft_type = ft_type; sa = mlx5dv_create_steering_anchor(tbl->dmn->ctx, &attr); if (!sa) return NULL; action = dr_action_create_generic(DR_ACTION_TYP_ROOT_FT); if (!action) goto free_steering_anchor; action->root_tbl.sa = sa; action->root_tbl.tbl = tbl; err = dr_action_create_dest_root_table(action); if (err) goto free_action; atomic_fetch_add(&tbl->refcount, 1); return action; free_action: free(action); free_steering_anchor: mlx5dv_destroy_steering_anchor(sa); return NULL; } struct mlx5dv_dr_action * mlx5dv_dr_action_create_flow_counter(struct mlx5dv_devx_obj *devx_obj, uint32_t offset) { struct mlx5dv_dr_action *action; if (devx_obj->type != MLX5_DEVX_FLOW_COUNTER) { errno = EINVAL; return NULL; } action = dr_action_create_generic(DR_ACTION_TYP_CTR); if (!action) return NULL; action->ctr.devx_obj = devx_obj; action->ctr.offset = offset; return action; } static int dr_action_aso_first_hit_init(struct mlx5dv_dr_action *action, uint32_t offset, uint32_t flags, uint8_t return_reg_c) { if (!check_comp_mask(flags, MLX5DV_DR_ACTION_FLAGS_ASO_FIRST_HIT_SET)) { errno = EINVAL; return errno; } if ((offset / MLX5_ASO_FIRST_HIT_NUM_PER_OBJ) >= (1 << action->aso.devx_obj->log_obj_range)) { errno = EINVAL; return errno; } if ((return_reg_c > 5) || (return_reg_c % 2 == 0)) { errno = EINVAL; return errno; } action->aso.offset = offset; action->aso.first_hit.set = flags & MLX5DV_DR_ACTION_FLAGS_ASO_FIRST_HIT_SET; action->aso.dest_reg_id = return_reg_c; return 0; } static int dr_action_aso_flow_meter_init(struct mlx5dv_dr_action *action, uint32_t offset, uint32_t flags, uint8_t return_reg_c) { if (!flags || (flags > MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_UNDEFINED)) { errno = EINVAL; return errno; } if ((offset / MLX5_ASO_FLOW_METER_NUM_PER_OBJ) >= (1 << action->aso.devx_obj->log_obj_range)) { errno = EINVAL; return errno; } if ((return_reg_c > 5) || (return_reg_c % 2 == 0)) { errno = EINVAL; return errno; } switch (flags) { case MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_RED: action->aso.flow_meter.initial_color = MLX5_IFC_ASO_FLOW_METER_INITIAL_COLOR_RED; break; case MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_YELLOW: action->aso.flow_meter.initial_color = MLX5_IFC_ASO_FLOW_METER_INITIAL_COLOR_YELLOW; break; case MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_GREEN: action->aso.flow_meter.initial_color = MLX5_IFC_ASO_FLOW_METER_INITIAL_COLOR_GREEN; break; case MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_UNDEFINED: action->aso.flow_meter.initial_color = MLX5_IFC_ASO_FLOW_METER_INITIAL_COLOR_UNDEFINED; break; default: errno = EINVAL; return errno; } action->aso.offset = offset; action->aso.dest_reg_id = return_reg_c; return 0; } static int dr_action_aso_ct_init(struct mlx5dv_dr_action *action, uint32_t offset, uint32_t flags, uint8_t return_reg_c) { if (!flags || (flags > MLX5DV_DR_ACTION_FLAGS_ASO_CT_DIRECTION_RESPONDER)) goto err_invalid; if ((offset / MLX5_ASO_CT_NUM_PER_OBJ) >= (1 << action->aso.devx_obj->log_obj_range)) goto err_invalid; if ((return_reg_c > 5) || (return_reg_c % 2 == 0)) goto err_invalid; if (flags == 
MLX5DV_DR_ACTION_FLAGS_ASO_CT_DIRECTION_INITIATOR) action->aso.ct.direction = MLX5_IFC_ASO_CT_DIRECTION_INITIATOR; else action->aso.ct.direction = MLX5_IFC_ASO_CT_DIRECTION_RESPONDER; action->aso.offset = offset; action->aso.dest_reg_id = return_reg_c; return 0; err_invalid: errno = EINVAL; return errno; } struct mlx5dv_dr_action * mlx5dv_dr_action_create_aso(struct mlx5dv_dr_domain *dmn, struct mlx5dv_devx_obj *devx_obj, uint32_t offset, uint32_t flags, uint8_t return_reg_c) { struct mlx5dv_dr_action *action = NULL; if (!dmn->info.supp_sw_steering || dmn->info.caps.sw_format_ver == MLX5_HW_CONNECTX_5) { errno = EOPNOTSUPP; return NULL; } if (devx_obj->type == MLX5_DEVX_ASO_FIRST_HIT) { action = dr_action_create_generic(DR_ACTION_TYP_ASO_FIRST_HIT); if (!action) return NULL; action->aso.devx_obj = devx_obj; if (dr_action_aso_first_hit_init(action, offset, flags, return_reg_c)) goto out_free; } else if (devx_obj->type == MLX5_DEVX_ASO_FLOW_METER) { action = dr_action_create_generic(DR_ACTION_TYP_ASO_FLOW_METER); if (!action) return NULL; action->aso.devx_obj = devx_obj; if (dr_action_aso_flow_meter_init(action, offset, flags, return_reg_c)) goto out_free; } else if (devx_obj->type == MLX5_DEVX_ASO_CT) { action = dr_action_create_generic(DR_ACTION_TYP_ASO_CT); if (!action) return NULL; action->aso.devx_obj = devx_obj; if (dr_action_aso_ct_init(action, offset, flags, return_reg_c)) goto out_free; } else { errno = EOPNOTSUPP; return NULL; } action->aso.dmn = dmn; return action; out_free: free(action); return NULL; } static int dr_action_aso_ct_modify(struct mlx5dv_dr_action *action, uint32_t offset, uint32_t flags, uint8_t return_reg_c) { if (action->aso.devx_obj->priv == NULL) return dr_action_aso_ct_init(action, offset, flags, return_reg_c); if (action->aso.dest_reg_id != return_reg_c) { dr_dbg(action->aso.dmn, "Invalid parameters for a cross gvmi action\n"); errno = EOPNOTSUPP; return errno; } if (flags > MLX5DV_DR_ACTION_FLAGS_ASO_CT_DIRECTION_RESPONDER) { errno = EOPNOTSUPP; return errno; } if ((flags == MLX5DV_DR_ACTION_FLAGS_ASO_CT_DIRECTION_INITIATOR && action->aso.ct.direction != MLX5_IFC_ASO_CT_DIRECTION_INITIATOR) || (flags == MLX5DV_DR_ACTION_FLAGS_ASO_CT_DIRECTION_RESPONDER && action->aso.ct.direction != MLX5_IFC_ASO_CT_DIRECTION_RESPONDER)) { errno = EOPNOTSUPP; return errno; } action->aso.offset = offset; return 0; } int mlx5dv_dr_action_modify_aso(struct mlx5dv_dr_action *action, uint32_t offset, uint32_t flags, uint8_t return_reg_c) { if (action->action_type == DR_ACTION_TYP_ASO_FIRST_HIT) return dr_action_aso_first_hit_init(action, offset, flags, return_reg_c); else if (action->action_type == DR_ACTION_TYP_ASO_FLOW_METER) return dr_action_aso_flow_meter_init(action, offset, flags, return_reg_c); else if (action->action_type == DR_ACTION_TYP_ASO_CT) return dr_action_aso_ct_modify(action, offset, flags, return_reg_c); errno = EINVAL; return errno; } struct mlx5dv_dr_action *mlx5dv_dr_action_create_tag(uint32_t tag_value) { struct mlx5dv_dr_action *action; action = dr_action_create_generic(DR_ACTION_TYP_TAG); if (!action) return NULL; action->flow_tag = tag_value & 0xffffff; return action; } static int dr_action_create_reformat_action_root(struct mlx5dv_dr_domain *dmn, size_t data_sz, void *data, struct mlx5dv_dr_action *action) { enum mlx5dv_flow_action_packet_reformat_type reformat_type; struct ibv_flow_action *flow_action; enum mlx5dv_flow_table_type type; if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_RX) type = MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_RX; else if (dmn->type == 
MLX5DV_DR_DOMAIN_TYPE_NIC_TX)
		type = MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_TX;
	else
		type = MLX5_IB_UAPI_FLOW_TABLE_TYPE_FDB;

	reformat_type = dr_action_type_to_reformat_enum(action->action_type);
	flow_action = mlx5dv_create_flow_action_packet_reformat(dmn->ctx, data_sz,
								data, reformat_type,
								type);
	if (!flow_action)
		return errno;

	action->reformat.flow_action = flow_action;
	return 0;
}

static int
dr_action_verify_reformat_params(enum mlx5dv_flow_action_packet_reformat_type reformat_type,
				 struct mlx5dv_dr_domain *dmn,
				 size_t data_sz, void *data)
{
	if ((!data && data_sz) || (data && !data_sz) ||
	    (dr_domain_is_support_sw_encap(dmn) &&
	     (data_sz > dmn->info.caps.max_encap_size)) ||
	    reformat_type > MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL) {
		dr_dbg(dmn, "Invalid reformat parameter!\n");
		goto out_err;
	}

	if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_FDB)
		return 0;

	if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_RX) {
		if (reformat_type != MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2 &&
		    reformat_type != MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L3_TUNNEL_TO_L2) {
			dr_dbg(dmn, "Action reformat type not supported on RX domain\n");
			goto out_err;
		}
	} else if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_TX) {
		if (reformat_type != MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL &&
		    reformat_type != MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL) {
			dr_dbg(dmn, "Action reformat type not supported on TX domain\n");
			goto out_err;
		}
	}

	return 0;

out_err:
	errno = EINVAL;
	return errno;
}

static int dr_action_create_sw_reformat(struct mlx5dv_dr_domain *dmn,
					struct mlx5dv_dr_action *action,
					size_t data_sz, void *data)
{
	uint8_t *reformat_data;
	int ret;

	reformat_data = calloc(1, data_sz);
	if (!reformat_data) {
		errno = ENOMEM;
		return errno;
	}

	memcpy(reformat_data, data, data_sz);
	action->reformat.data = reformat_data;
	action->reformat.reformat_size = data_sz;

	ret = dr_ste_alloc_encap(action);
	if (ret)
		goto free_reformat_data;

	return 0;

free_reformat_data:
	free(reformat_data);
	action->reformat.data = NULL;
	return ret;
}

static void dr_action_destroy_sw_reformat(struct mlx5dv_dr_action *action)
{
	dr_ste_free_encap(action);
	free(action->reformat.data);
}

static int dr_action_create_devx_reformat(struct mlx5dv_dr_domain *dmn,
					  struct mlx5dv_dr_action *action,
					  size_t data_sz, void *data)
{
	struct mlx5dv_devx_obj *obj;
	enum reformat_type rt;

	if (action->action_type == DR_ACTION_TYP_L2_TO_TNL_L2)
		rt = MLX5_REFORMAT_TYPE_L2_TO_L2_TUNNEL;
	else
		rt = MLX5_REFORMAT_TYPE_L2_TO_L3_TUNNEL;

	obj = dr_devx_create_reformat_ctx(dmn->ctx, rt, data_sz, data);
	if (!obj)
		return errno;

	action->reformat.dvo = obj;
	action->reformat.reformat_size = data_sz;
	return 0;
}

static void dr_action_destroy_devx_reformat(struct mlx5dv_dr_action *action)
{
	mlx5dv_devx_obj_destroy(action->reformat.dvo);
}

static int dr_action_create_reformat_action(struct mlx5dv_dr_domain *dmn,
					    size_t data_sz, void *data,
					    struct mlx5dv_dr_action *action)
{
	uint8_t *hw_actions;

	switch (action->action_type) {
	case DR_ACTION_TYP_L2_TO_TNL_L2:
	case DR_ACTION_TYP_L2_TO_TNL_L3:
	{
		if (dr_domain_is_support_sw_encap(dmn) &&
		    !dr_action_create_sw_reformat(dmn, action, data_sz, data))
			return 0;

		/* When failed creating sw encap, fallback to
		 * use devx to try again.
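		 * The devx path creates a FW-owned reformat context
		 * instead of a SW-steering managed encap entry.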
*/ if (dr_action_create_devx_reformat(dmn, action, data_sz, data)) return errno; return 0; } case DR_ACTION_TYP_TNL_L2_TO_L2: { return 0; } case DR_ACTION_TYP_TNL_L3_TO_L2: { int ret; hw_actions = calloc(1, ACTION_CACHE_LINE_SIZE); if (!hw_actions) { errno = ENOMEM; return errno; } ret = dr_ste_set_action_decap_l3_list(dmn->ste_ctx, data, data_sz, hw_actions, ACTION_CACHE_LINE_SIZE, &action->rewrite.param.num_of_actions); if (ret) { dr_dbg(dmn, "Failed creating decap l3 action list\n"); goto free_hw_actions; } action->rewrite.param.data = hw_actions; action->rewrite.dmn = dmn; ret = dr_ste_alloc_modify_hdr(action); if (ret) { dr_dbg(dmn, "Failed prepare reformat data\n"); goto free_hw_actions; } return 0; } default: dr_dbg(dmn, "Reformat type is not supported %d\n", action->action_type); errno = ENOTSUP; return errno; } free_hw_actions: free(hw_actions); return errno; } struct mlx5dv_dr_action * mlx5dv_dr_action_create_packet_reformat(struct mlx5dv_dr_domain *dmn, uint32_t flags, enum mlx5dv_flow_action_packet_reformat_type reformat_type, size_t data_sz, void *data) { struct mlx5dv_dr_action *action; enum dr_action_type action_type; int ret; atomic_fetch_add(&dmn->refcount, 1); if (!check_comp_mask(flags, MLX5DV_DR_ACTION_FLAGS_ROOT_LEVEL)) { errno = EINVAL; goto dec_ref; } if (!dmn->info.supp_sw_steering && !(flags & MLX5DV_DR_ACTION_FLAGS_ROOT_LEVEL)) { dr_dbg(dmn, "Only root actions are supported on current domain\n"); errno = EOPNOTSUPP; goto dec_ref; } /* General checks */ ret = dr_action_verify_reformat_params(reformat_type, dmn, data_sz, data); if (ret) goto dec_ref; action_type = dr_action_reformat_to_action_type(reformat_type); action = dr_action_create_generic(action_type); if (!action) goto dec_ref; action->reformat.dmn = dmn; /* Create the action according to the table type */ if (flags & MLX5DV_DR_ACTION_FLAGS_ROOT_LEVEL) { action->reformat.is_root_level = true; ret = dr_action_create_reformat_action_root(dmn, data_sz, data, action); } else { action->reformat.is_root_level = false; ret = dr_action_create_reformat_action(dmn, data_sz, data, action); } if (ret) { dr_dbg(dmn, "Failed creating reformat action %d\n", ret); goto free_action; } return action; free_action: free(action); dec_ref: atomic_fetch_sub(&dmn->refcount, 1); return NULL; } struct mlx5dv_dr_action *mlx5dv_dr_action_create_pop_vlan(void) { return dr_action_create_generic(DR_ACTION_TYP_POP_VLAN); } struct mlx5dv_dr_action *mlx5dv_dr_action_create_push_vlan(struct mlx5dv_dr_domain *dmn, __be32 vlan_hdr) { uint32_t vlan_hdr_h = be32toh(vlan_hdr); uint16_t ethertype = vlan_hdr_h >> 16; struct mlx5dv_dr_action *action; if (ethertype != SVLAN_ETHERTYPE && ethertype != CVLAN_ETHERTYPE) { dr_dbg(dmn, "Invalid vlan ethertype\n"); errno = EINVAL; return NULL; } action = dr_action_create_generic(DR_ACTION_TYP_PUSH_VLAN); if (!action) return NULL; action->push_vlan.vlan_hdr = vlan_hdr_h; return action; } uint32_t dr_actions_reformat_get_id(struct mlx5dv_dr_action *action) { if (action->reformat.chunk) return action->reformat.index; return action->reformat.dvo->object_id; } static int dr_action_modify_sw_to_hw_add(struct mlx5dv_dr_domain *dmn, __be64 *sw_action, __be64 *hw_action, const struct dr_ste_action_modify_field **ret_hw_info) { const struct dr_ste_action_modify_field *hw_action_info; uint8_t max_length; uint16_t sw_field; uint32_t data; /* Get SW modify action data */ sw_field = DEVX_GET(set_action_in, sw_action, field); data = DEVX_GET(set_action_in, sw_action, data); /* Convert SW data to HW modify action format */ 
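	/* The STE context maps the PRM field id to the device-specific HW
	 * field descriptor (hw_field id plus start/end bit range).
	 */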
hw_action_info = dr_ste_conv_modify_hdr_sw_field(dmn->ste_ctx, &dmn->info.caps, sw_field); if (!hw_action_info) { dr_dbg(dmn, "Modify ADD action invalid field given\n"); errno = EINVAL; return errno; } max_length = hw_action_info->end - hw_action_info->start + 1; dr_ste_set_action_add(dmn->ste_ctx, hw_action, hw_action_info->hw_field, hw_action_info->start, max_length, data); *ret_hw_info = hw_action_info; return 0; } static int dr_action_modify_sw_to_hw_set(struct mlx5dv_dr_domain *dmn, __be64 *sw_action, __be64 *hw_action, const struct dr_ste_action_modify_field **ret_hw_info) { const struct dr_ste_action_modify_field *hw_action_info; uint8_t offset, length, max_length; uint16_t sw_field; uint32_t data; /* Get SW modify action data */ sw_field = DEVX_GET(set_action_in, sw_action, field); offset = DEVX_GET(set_action_in, sw_action, offset); length = DEVX_GET(set_action_in, sw_action, length); data = DEVX_GET(set_action_in, sw_action, data); /* Convert SW data to HW modify action format */ hw_action_info = dr_ste_conv_modify_hdr_sw_field(dmn->ste_ctx, &dmn->info.caps, sw_field); if (!hw_action_info) { dr_dbg(dmn, "Modify SET action invalid field given\n"); errno = EINVAL; return errno; } /* Based on device specification value of 0 means 32 */ length = length ? length : 32; max_length = hw_action_info->end - hw_action_info->start + 1; if (length + offset > max_length) { dr_dbg(dmn, "Modify action length + offset exceeds limit\n"); errno = EINVAL; return errno; } dr_ste_set_action_set(dmn->ste_ctx, hw_action, hw_action_info->hw_field, hw_action_info->start + offset, length, data); *ret_hw_info = hw_action_info; return 0; } static int dr_action_modify_sw_to_hw_copy(struct mlx5dv_dr_domain *dmn, __be64 *sw_action, __be64 *hw_action, const struct dr_ste_action_modify_field **ret_dst_hw_info, const struct dr_ste_action_modify_field **ret_src_hw_info) { uint8_t src_offset, dst_offset, src_max_length, dst_max_length, length; const struct dr_ste_action_modify_field *src_hw_action_info; const struct dr_ste_action_modify_field *dst_hw_action_info; uint16_t src_field, dst_field; /* Get SW modify action data */ src_field = DEVX_GET(copy_action_in, sw_action, src_field); dst_field = DEVX_GET(copy_action_in, sw_action, dst_field); src_offset = DEVX_GET(copy_action_in, sw_action, src_offset); dst_offset = DEVX_GET(copy_action_in, sw_action, dst_offset); length = DEVX_GET(copy_action_in, sw_action, length); /* Convert SW data to HW modify action format */ src_hw_action_info = dr_ste_conv_modify_hdr_sw_field(dmn->ste_ctx, &dmn->info.caps, src_field); dst_hw_action_info = dr_ste_conv_modify_hdr_sw_field(dmn->ste_ctx, &dmn->info.caps, dst_field); if (!src_hw_action_info || !dst_hw_action_info) { dr_dbg(dmn, "Modify COPY action invalid src/dst field given\n"); errno = EINVAL; return errno; } /* Based on device specification value of 0 means 32 */ length = length ? 
length : 32; src_max_length = src_hw_action_info->end - src_hw_action_info->start + 1; dst_max_length = dst_hw_action_info->end - dst_hw_action_info->start + 1; if (length + src_offset > src_max_length || length + dst_offset > dst_max_length) { dr_dbg(dmn, "Modify action length exceeds limit\n"); errno = EINVAL; return errno; } dr_ste_set_action_copy(dmn->ste_ctx, hw_action, dst_hw_action_info->hw_field, dst_hw_action_info->start + dst_offset, length, src_hw_action_info->hw_field, src_hw_action_info->start + src_offset); *ret_dst_hw_info = dst_hw_action_info; *ret_src_hw_info = src_hw_action_info; return 0; } static int dr_action_modify_sw_to_hw(struct mlx5dv_dr_domain *dmn, __be64 *sw_action, __be64 *hw_action, const struct dr_ste_action_modify_field **ret_dst_hw_info, const struct dr_ste_action_modify_field **ret_src_hw_info) { uint8_t action = DEVX_GET(set_action_in, sw_action, action_type); int ret = 0; *hw_action = 0; *ret_src_hw_info = NULL; switch (action) { case MLX5_ACTION_TYPE_SET: ret = dr_action_modify_sw_to_hw_set(dmn, sw_action, hw_action, ret_dst_hw_info); break; case MLX5_ACTION_TYPE_ADD: ret = dr_action_modify_sw_to_hw_add(dmn, sw_action, hw_action, ret_dst_hw_info); break; case MLX5_ACTION_TYPE_COPY: ret = dr_action_modify_sw_to_hw_copy(dmn, sw_action, hw_action, ret_dst_hw_info, ret_src_hw_info); break; default: dr_dbg(dmn, "Unsupported action type %d for modify action\n", action); errno = EOPNOTSUPP; ret = errno; break; } return ret; } static int dr_action_modify_check_field_limitation_set(struct mlx5dv_dr_action *action, const __be64 *sw_action) { uint16_t sw_field = DEVX_GET(set_action_in, sw_action, field); struct mlx5dv_dr_domain *dmn = action->rewrite.dmn; if (sw_field == MLX5_ACTION_IN_FIELD_OUT_METADATA_REGA) { action->rewrite.allow_rx = false; if (dmn->type != MLX5DV_DR_DOMAIN_TYPE_NIC_TX) { dr_dbg(dmn, "Unsupported field %d for RX/FDB set action\n", sw_field); errno = EINVAL; return errno; } } else if (sw_field == MLX5_ACTION_IN_FIELD_OUT_METADATA_REGB) { action->rewrite.allow_tx = false; if (dmn->type != MLX5DV_DR_DOMAIN_TYPE_NIC_RX) { dr_dbg(dmn, "Unsupported field %d for TX/FDB set action\n", sw_field); errno = EINVAL; return errno; } } if (!action->rewrite.allow_rx && !action->rewrite.allow_tx) { dr_dbg(dmn, "Modify SET actions not supported on both RX and TX\n"); errno = EINVAL; return errno; } return 0; } static int dr_action_modify_check_field_limitation_add(struct mlx5dv_dr_action *action, const __be64 *sw_action) { uint16_t sw_field = DEVX_GET(add_action_in, sw_action, field); if (sw_field != MLX5_ACTION_IN_FIELD_OUT_IP_TTL && sw_field != MLX5_ACTION_IN_FIELD_OUT_IPV6_HOPLIMIT && sw_field != MLX5_ACTION_IN_FIELD_OUT_TCP_SEQ_NUM && sw_field != MLX5_ACTION_IN_FIELD_OUT_TCP_ACK_NUM) { dr_dbg(action->rewrite.dmn, "Unsupported field %d for ADD action\n", sw_field); errno = EINVAL; return errno; } return 0; } static int dr_action_modify_check_field_limitation_copy(struct mlx5dv_dr_action *action, const __be64 *sw_action) { struct mlx5dv_dr_domain *dmn = action->rewrite.dmn; uint16_t sw_fields[2]; int i; sw_fields[0] = DEVX_GET(copy_action_in, sw_action, src_field); sw_fields[1] = DEVX_GET(copy_action_in, sw_action, dst_field); for (i = 0; i < 2; i++) { if (sw_fields[i] == MLX5_ACTION_IN_FIELD_OUT_METADATA_REGA) { action->rewrite.allow_rx = false; if (dmn->type != MLX5DV_DR_DOMAIN_TYPE_NIC_TX) { dr_dbg(dmn, "Unsupported field %d for RX/FDB COPY action\n", sw_fields[i]); errno = EINVAL; return errno; } } else if (sw_fields[i] == 
MLX5_ACTION_IN_FIELD_OUT_METADATA_REGB) {
			action->rewrite.allow_tx = false;
			if (dmn->type != MLX5DV_DR_DOMAIN_TYPE_NIC_RX) {
				dr_dbg(dmn, "Unsupported field %d for TX/FDB COPY action\n",
				       sw_fields[i]);
				errno = EINVAL;
				return errno;
			}
		}
	}

	if (!action->rewrite.allow_rx && !action->rewrite.allow_tx) {
		dr_dbg(dmn, "Modify actions combination is not supported on both RX and TX\n");
		errno = EINVAL;
		return errno;
	}

	return 0;
}

static int
dr_action_modify_check_field_limitation(struct mlx5dv_dr_action *action,
					const __be64 *sw_action)
{
	uint8_t action_type = DEVX_GET(set_action_in, sw_action, action_type);
	struct mlx5dv_dr_domain *dmn = action->rewrite.dmn;
	int ret;

	switch (action_type) {
	case MLX5_ACTION_TYPE_SET:
		ret = dr_action_modify_check_field_limitation_set(action, sw_action);
		break;
	case MLX5_ACTION_TYPE_ADD:
		ret = dr_action_modify_check_field_limitation_add(action, sw_action);
		break;
	case MLX5_ACTION_TYPE_COPY:
		ret = dr_action_modify_check_field_limitation_copy(action, sw_action);
		break;
	default:
		dr_dbg(dmn, "Unsupported modify action %d\n", action_type);
		errno = EOPNOTSUPP;
		ret = errno;
		break;
	}

	return ret;
}

static int dr_actions_convert_modify_header(struct mlx5dv_dr_action *action,
					    uint32_t max_hw_actions,
					    uint32_t num_sw_actions,
					    __be64 sw_actions[],
					    __be64 hw_actions[],
					    uint32_t *num_hw_actions)
{
	const struct dr_ste_action_modify_field *hw_dst_action_info;
	const struct dr_ste_action_modify_field *hw_src_action_info;
	struct mlx5dv_dr_domain *dmn = action->rewrite.dmn;
	int ret, i, hw_idx = 0;
	uint16_t hw_field = 0;
	uint32_t l3_type = 0;
	uint32_t l4_type = 0;
	__be64 *sw_action;
	__be64 hw_action;

	action->rewrite.allow_rx = true;
	action->rewrite.allow_tx = true;

	for (i = 0; i < num_sw_actions; i++) {
		sw_action = &sw_actions[i];

		ret = dr_action_modify_check_field_limitation(action, sw_action);
		if (ret)
			return ret;

		/* Convert SW action to HW action */
		ret = dr_action_modify_sw_to_hw(dmn, sw_action, &hw_action,
						&hw_dst_action_info,
						&hw_src_action_info);
		if (ret)
			return ret;

		/* Due to a HW limitation we cannot modify 2 different L3 types */
		if (l3_type && hw_dst_action_info->l3_type &&
		    (hw_dst_action_info->l3_type != l3_type)) {
			dr_dbg(dmn, "Action list can't support two different L3 types\n");
			errno = ENOTSUP;
			return errno;
		}
		if (hw_dst_action_info->l3_type)
			l3_type = hw_dst_action_info->l3_type;

		/* Due to a HW limitation we cannot modify two different L4 types */
		if (l4_type && hw_dst_action_info->l4_type &&
		    (hw_dst_action_info->l4_type != l4_type)) {
			dr_dbg(dmn, "Action list can't support two different L4 types\n");
			errno = EINVAL;
			return errno;
		}
		if (hw_dst_action_info->l4_type)
			l4_type = hw_dst_action_info->l4_type;

		/* HW reads and executes two actions at once; this means we
		 * need to create a gap if two actions access the same field
		 */
		if ((hw_idx % 2) &&
		    (hw_field == hw_dst_action_info->hw_field ||
		     (hw_src_action_info &&
		      hw_field == hw_src_action_info->hw_field))) {
			/* Check if after gap insertion the total number of HW
			 * modify actions doesn't exceed the limit
			 */
			hw_idx++;
			if ((num_sw_actions + hw_idx - i) >= max_hw_actions) {
				dr_dbg(dmn, "Modify header action number exceeds HW limit\n");
				errno = EINVAL;
				return errno;
			}
		}
		hw_field = hw_dst_action_info->hw_field;

		hw_actions[hw_idx] = hw_action;
		hw_idx++;
	}

	*num_hw_actions = hw_idx;

	return 0;
}

static int dr_action_create_modify_action_root(struct mlx5dv_dr_domain *dmn,
					       size_t actions_sz,
					       __be64 actions[],
					       struct mlx5dv_dr_action *action)
{
	struct ibv_flow_action *flow_action;
	enum mlx5dv_flow_table_type type;

	if (dmn->type ==
MLX5DV_DR_DOMAIN_TYPE_NIC_RX) type = MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_RX; else if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_TX) type = MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_TX; else type = MLX5_IB_UAPI_FLOW_TABLE_TYPE_FDB; flow_action = mlx5dv_create_flow_action_modify_header(dmn->ctx, actions_sz, (__force uint64_t *)actions, type); if (!flow_action) return errno; action->rewrite.flow_action = flow_action; return 0; } static int dr_action_create_modify_action(struct mlx5dv_dr_domain *dmn, size_t actions_sz, __be64 actions[], struct mlx5dv_dr_action *action) { uint32_t num_hw_actions; uint32_t num_sw_actions; __be64 *hw_actions; int ret; num_sw_actions = actions_sz / DR_MODIFY_ACTION_SIZE; if (num_sw_actions == 0) { dr_dbg(dmn, "Invalid number of actions %u\n", num_sw_actions); errno = EINVAL; return errno; } hw_actions = calloc(1, 2 * num_sw_actions * DR_MODIFY_ACTION_SIZE); if (!hw_actions) { errno = ENOMEM; return errno; } ret = dr_actions_convert_modify_header(action, 2 * num_sw_actions, num_sw_actions, actions, hw_actions, &num_hw_actions); if (ret) goto free_hw_actions; action->rewrite.param.data = (uint8_t *)hw_actions; action->rewrite.param.num_of_actions = num_hw_actions; if (num_hw_actions == 1 && (dmn->ste_ctx->actions_caps & DR_STE_CTX_ACTION_CAP_MODIFY_HDR_INLINE)) { action->rewrite.single_action_opt = true; return 0; } ret = dr_ste_alloc_modify_hdr(action); if (ret) goto free_hw_actions; return 0; free_hw_actions: free(hw_actions); return errno; } struct mlx5dv_dr_action * mlx5dv_dr_action_create_modify_header(struct mlx5dv_dr_domain *dmn, uint32_t flags, size_t actions_sz, __be64 actions[]) { struct mlx5dv_dr_action *action; int ret = 0; atomic_fetch_add(&dmn->refcount, 1); if (!check_comp_mask(flags, MLX5DV_DR_ACTION_FLAGS_ROOT_LEVEL)) { errno = EINVAL; goto dec_ref; } if (actions_sz % DR_MODIFY_ACTION_SIZE) { dr_dbg(dmn, "Invalid modify actions size provided\n"); errno = EINVAL; goto dec_ref; } if (!dmn->info.supp_sw_steering && !(flags & MLX5DV_DR_ACTION_FLAGS_ROOT_LEVEL)) { dr_dbg(dmn, "Only root actions are supported on current domain\n"); errno = EOPNOTSUPP; goto dec_ref; } action = dr_action_create_generic(DR_ACTION_TYP_MODIFY_HDR); if (!action) goto dec_ref; action->rewrite.dmn = dmn; /* Create the action according to the table type */ if (flags & MLX5DV_DR_ACTION_FLAGS_ROOT_LEVEL) { action->rewrite.is_root_level = true; ret = dr_action_create_modify_action_root(dmn, actions_sz, actions, action); } else { action->rewrite.is_root_level = false; ret = dr_action_create_modify_action(dmn, actions_sz, actions, action); } if (ret) { dr_dbg(dmn, "Failed creating modify header action %d\n", ret); goto free_action; } return action; free_action: free(action); dec_ref: atomic_fetch_sub(&dmn->refcount, 1); return NULL; } int mlx5dv_dr_action_modify_flow_meter(struct mlx5dv_dr_action *action, struct mlx5dv_dr_flow_meter_attr *attr, __be64 modify_field_select) { int ret; if (action->action_type != DR_ACTION_TYP_METER) { errno = EINVAL; return errno; } ret = dr_devx_modify_meter(action->meter.devx_obj, attr, modify_field_select); return ret; } struct mlx5dv_dr_action * mlx5dv_dr_action_create_flow_meter(struct mlx5dv_dr_flow_meter_attr *attr) { struct mlx5dv_dr_domain *dmn = attr->next_table->dmn; uint64_t rx_icm_addr = 0, tx_icm_addr = 0; struct mlx5dv_devx_obj *devx_obj; struct mlx5dv_dr_action *action; int ret; if (!dmn->info.supp_sw_steering) { dr_dbg(dmn, "Meter action is not supported on current domain\n"); errno = EOPNOTSUPP; return NULL; } if (dr_is_root_table(attr->next_table)) { 
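		/* A root (FW-managed) next table cannot be chained after a
		 * SW-steering meter.
		 */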
dr_dbg(dmn, "Next table cannot be root\n"); errno = EOPNOTSUPP; return NULL; } devx_obj = dr_devx_create_meter(dmn->ctx, attr); if (!devx_obj) return NULL; ret = dr_devx_query_meter(devx_obj, &rx_icm_addr, &tx_icm_addr); if (ret) goto destroy_obj; action = dr_action_create_generic(DR_ACTION_TYP_METER); if (!action) goto destroy_obj; action->meter.devx_obj = devx_obj; action->meter.next_ft = attr->next_table; action->meter.rx_icm_addr = rx_icm_addr; action->meter.tx_icm_addr = tx_icm_addr; atomic_fetch_add(&attr->next_table->refcount, 1); return action; destroy_obj: mlx5dv_devx_obj_destroy(devx_obj); return NULL; } struct mlx5dv_dr_action *mlx5dv_dr_action_create_dest_vport(struct mlx5dv_dr_domain *dmn, uint32_t vport) { struct mlx5dv_dr_action *action; struct dr_devx_vport_cap *vport_cap; if (!dmn->info.supp_sw_steering || dmn->type != MLX5DV_DR_DOMAIN_TYPE_FDB) { dr_dbg(dmn, "Domain doesn't support vport actions\n"); errno = EOPNOTSUPP; return NULL; } /* vport number is limited to 16 bit */ if (vport > WIRE_PORT) { dr_dbg(dmn, "The vport number is out of range\n"); errno = EINVAL; return NULL; } vport_cap = dr_vports_table_get_vport_cap(&dmn->info.caps, vport); if (!vport_cap) { dr_dbg(dmn, "Failed to get vport %d caps\n", vport); return NULL; } action = dr_action_create_generic(DR_ACTION_TYP_VPORT); if (!action) return NULL; action->vport.dmn = dmn; action->vport.caps = vport_cap; return action; } struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_ib_port(struct mlx5dv_dr_domain *dmn, uint32_t ib_port) { struct dr_devx_vport_cap *vport_cap; struct mlx5dv_dr_action *action; if (!dmn->info.supp_sw_steering || dmn->type != MLX5DV_DR_DOMAIN_TYPE_FDB) { dr_dbg(dmn, "Domain doesn't support ib_port actions\n"); errno = EOPNOTSUPP; return NULL; } vport_cap = dr_vports_table_get_ib_port_cap(&dmn->info.caps, ib_port); if (!vport_cap) { dr_dbg(dmn, "Failed to get ib_port %d caps\n", ib_port); errno = EINVAL; return NULL; } action = dr_action_create_generic(DR_ACTION_TYP_VPORT); if (!action) return NULL; action->vport.dmn = dmn; action->vport.caps = vport_cap; return action; } static int dr_action_convert_to_fte_dest(struct mlx5dv_dr_domain *dmn, struct mlx5dv_dr_action *dest, struct mlx5dv_dr_action *dest_reformat, struct dr_devx_flow_fte_attr *fte_attr) { struct dr_devx_flow_dest_info *dest_info = &fte_attr->dest_arr[fte_attr->dest_size]; switch (dest->action_type) { case DR_ACTION_TYP_MISS: if (dmn->type != MLX5DV_DR_DOMAIN_TYPE_FDB) goto err_exit; fte_attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; dest_info->type = MLX5_FLOW_DEST_TYPE_VPORT; if (dmn->info.caps.is_ecpf) dest_info->vport_num = ECPF_PORT; break; case DR_ACTION_TYP_VPORT: if (dmn->type != MLX5DV_DR_DOMAIN_TYPE_FDB) goto err_exit; fte_attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; dest_info->type = MLX5_FLOW_DEST_TYPE_VPORT; dest_info->vport_num = dest->vport.caps->num; break; case DR_ACTION_TYP_QP: fte_attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; dest_info->type = MLX5_FLOW_DEST_TYPE_TIR; if (dest->dest_qp.is_qp) dest_info->tir_num = to_mqp(dest->dest_qp.qp)->tirn; else dest_info->tir_num = dest->dest_qp.devx_tir->object_id; break; case DR_ACTION_TYP_CTR: fte_attr->action |= MLX5_FLOW_CONTEXT_ACTION_COUNT; dest_info->type = MLX5_FLOW_DEST_TYPE_COUNTER; dest_info->counter_id = dest->ctr.devx_obj->object_id + dest->ctr.offset; break; case DR_ACTION_TYP_FT: fte_attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; dest_info->type = MLX5_FLOW_DEST_TYPE_FT; dest_info->ft_id = dest->dest_tbl->devx_obj->object_id; break; 
default: goto err_exit; } if (dest_reformat) { int ret = 0; switch (dest_reformat->action_type) { case DR_ACTION_TYP_L2_TO_TNL_L2: case DR_ACTION_TYP_L2_TO_TNL_L3: if (dest_reformat->reformat.is_root_level) goto err_exit; dr_domain_lock(dmn); if (!dest_reformat->reformat.dvo) { ret = dr_action_create_devx_reformat(dmn, dest_reformat, dest_reformat->reformat.reformat_size, dest_reformat->reformat.data); } dr_domain_unlock(dmn); if (ret) goto err_exit; fte_attr->extended_dest = true; dest_info->has_reformat = true; dest_info->reformat_id = dest_reformat->reformat.dvo->object_id; break; default: goto err_exit; } } fte_attr->dest_size++; return 0; err_exit: errno = EOPNOTSUPP; return errno; } static struct dr_devx_tbl_with_refs * dr_action_create_sampler_term_tbl(struct mlx5dv_dr_domain *dmn, struct mlx5dv_dr_flow_sampler_attr *attr) { struct dr_devx_flow_table_attr ft_attr = {}; struct dr_devx_flow_group_attr fg_attr = {}; struct dr_devx_flow_fte_attr fte_attr = {}; struct dr_devx_flow_dest_info *dest_info; struct dr_devx_tbl_with_refs *term_tbl; struct mlx5dv_dr_action **ref_actions; uint32_t ref_index = 0; uint32_t tbl_type; uint32_t i; tbl_type = attr->default_next_table->table_type; dest_info = calloc(attr->num_sample_actions, sizeof(struct dr_devx_flow_dest_info)); if (!dest_info) { errno = ENOMEM; return NULL; } term_tbl = calloc(1, sizeof(struct dr_devx_tbl_with_refs)); if (!term_tbl) { errno = ENOMEM; goto free_dest_info; } ref_actions = calloc(attr->num_sample_actions, sizeof(struct mlx5dv_dr_action *)); if (!ref_actions) { errno = ENOMEM; goto free_term_tbl; } ft_attr.type = tbl_type; ft_attr.level = dmn->info.caps.max_ft_level - 1; ft_attr.term_tbl = true; fte_attr.dest_arr = dest_info; for (i = 0; i < attr->num_sample_actions; i++) { enum dr_action_type action_type = attr->sample_actions[i]->action_type; atomic_fetch_add(&attr->sample_actions[i]->refcount, 1); ref_actions[ref_index++] = attr->sample_actions[i]; switch (action_type) { case DR_ACTION_TYP_MISS: case DR_ACTION_TYP_VPORT: if (dr_action_convert_to_fte_dest(dmn, attr->sample_actions[i], NULL, &fte_attr)) goto free_ref_actions; break; case DR_ACTION_TYP_QP: case DR_ACTION_TYP_CTR: if (tbl_type != FS_FT_NIC_RX) { errno = EOPNOTSUPP; goto free_ref_actions; } if (dr_action_convert_to_fte_dest(dmn, attr->sample_actions[i], NULL, &fte_attr)) goto free_ref_actions; break; case DR_ACTION_TYP_TAG: if (tbl_type != FS_FT_NIC_RX) { errno = EOPNOTSUPP; goto free_ref_actions; } fte_attr.flow_tag = attr->sample_actions[i]->flow_tag; break; default: errno = EOPNOTSUPP; goto free_ref_actions; } } term_tbl->devx_tbl = dr_devx_create_always_hit_ft(dmn->ctx, &ft_attr, &fg_attr, &fte_attr); if (!term_tbl->devx_tbl) goto free_ref_actions; term_tbl->ref_actions = ref_actions; term_tbl->ref_actions_num = attr->num_sample_actions; free(dest_info); return term_tbl; free_ref_actions: for (i = 0; i < ref_index; i++) atomic_fetch_sub(&ref_actions[i]->refcount, 1); free(ref_actions); free_term_tbl: free(term_tbl); free_dest_info: free(dest_info); return NULL; } static void dr_action_destroy_sampler_term_tbl(struct dr_devx_tbl_with_refs *term_tbl) { uint32_t i; dr_devx_destroy_always_hit_ft(term_tbl->devx_tbl); for (i = 0; i < term_tbl->ref_actions_num; i++) atomic_fetch_sub(&term_tbl->ref_actions[i]->refcount, 1); free(term_tbl->ref_actions); free(term_tbl); } static struct dr_flow_sampler * dr_action_create_sampler(struct mlx5dv_dr_domain *dmn, struct mlx5dv_dr_flow_sampler_attr *attr, struct dr_devx_tbl_with_refs *term_tbl, struct 
dr_flow_sampler_restore_tbl *restore) { struct dr_devx_flow_sampler_attr sampler_attr = {}; struct dr_flow_sampler *sampler; uint64_t icm_rx = 0, icm_tx = 0; int ret; sampler = calloc(1, sizeof(struct dr_flow_sampler)); if (!sampler) { errno = ENOMEM; return NULL; } sampler->next_ft = restore ? restore->tbl : attr->default_next_table; atomic_fetch_add(&sampler->next_ft->refcount, 1); /* Sampler HW level equals to term_tbl HW level, need to set ignore level */ sampler_attr.ignore_flow_level = true; sampler_attr.sample_ratio = attr->sample_ratio; sampler_attr.table_type = term_tbl->devx_tbl->type; sampler_attr.level = term_tbl->devx_tbl->level; sampler_attr.sample_table_id = term_tbl->devx_tbl->ft_dvo->object_id; sampler_attr.default_next_table_id = sampler->next_ft->devx_obj->object_id; sampler->devx_obj = dr_devx_create_flow_sampler(dmn->ctx, &sampler_attr); if (!sampler->devx_obj) goto dec_next_ft_ref; ret = dr_devx_query_flow_sampler(sampler->devx_obj, &icm_rx, &icm_tx); if (ret) goto destroy_sampler_dvo; sampler->rx_icm_addr = icm_rx; sampler->tx_icm_addr = icm_tx; return sampler; destroy_sampler_dvo: mlx5dv_devx_obj_destroy(sampler->devx_obj); dec_next_ft_ref: atomic_fetch_sub(&sampler->next_ft->refcount, 1); free(sampler); return NULL; } static void dr_action_destroy_sampler(struct dr_flow_sampler *sampler) { mlx5dv_devx_obj_destroy(sampler->devx_obj); atomic_fetch_sub(&sampler->next_ft->refcount, 1); free(sampler); } static struct dr_flow_sampler_restore_tbl * dr_action_create_sampler_restore_tbl(struct mlx5dv_dr_domain *dmn, struct mlx5dv_dr_flow_sampler_attr *attr) { struct mlx5dv_flow_match_parameters *mask; struct dr_flow_sampler_restore_tbl *restore; uint32_t action_field; uint32_t action_type; uint32_t mask_size; action_type = DEVX_GET(set_action_in, &(attr->action), action_type); action_field = DEVX_GET(set_action_in, &(attr->action), field); /* Currently only support restore of setting Reg_C0 */ if (action_type != MLX5_ACTION_TYPE_SET || action_field != MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_0) { errno = EOPNOTSUPP; return NULL; } mask_size = sizeof(struct mlx5dv_flow_match_parameters) + sizeof(struct dr_match_param); mask = calloc(1, mask_size); if (!mask) { errno = ENOMEM; return NULL; } mask->match_sz = sizeof(struct dr_match_param); restore = calloc(1, sizeof(struct dr_flow_sampler_restore_tbl)); if (!restore) { errno = ENOMEM; goto free_mask; } restore->tbl = mlx5dv_dr_table_create(dmn, attr->default_next_table->level - 1); if (!restore->tbl) goto free_restore; restore->matcher = mlx5dv_dr_matcher_create(restore->tbl, 0, 0, mask); if (!restore->matcher) goto destroy_restore_tbl; restore->num_of_actions = 2; restore->actions = calloc(restore->num_of_actions, sizeof(struct mlx5dv_dr_action *)); if (!restore->actions) { errno = ENOMEM; goto destroy_restore_matcher; } restore->actions[0] = mlx5dv_dr_action_create_modify_header(dmn, 0, DR_MODIFY_ACTION_SIZE, &(attr->action)); if (!restore->actions[0]) goto free_action_list; restore->actions[1] = mlx5dv_dr_action_create_dest_table(attr->default_next_table); if (!restore->actions[1]) goto destroy_modify_hdr_action; restore->rule = mlx5dv_dr_rule_create(restore->matcher, mask, restore->num_of_actions, restore->actions); if (!restore->rule) goto destroy_dest_action; free(mask); return restore; destroy_dest_action: mlx5dv_dr_action_destroy(restore->actions[1]); destroy_modify_hdr_action: mlx5dv_dr_action_destroy(restore->actions[0]); free_action_list: free(restore->actions); destroy_restore_matcher: 
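/*
 * A sketch, not part of the original sources, of how a caller drives
 * mlx5dv_dr_action_create_flow_sampler() below, which is assembled from the
 * term_tbl/sampler/restore helpers in this file; "next_tbl" and
 * "counter_action" are assumed to exist already:
 *
 *        struct mlx5dv_dr_flow_sampler_attr attr = {};
 *        struct mlx5dv_dr_action *sample_actions[1];
 *        struct mlx5dv_dr_action *sampler;
 *
 *        sample_actions[0] = counter_action;     // e.g. a counter action
 *        attr.sample_ratio = 100;                // sample 1 of every 100 packets
 *        attr.default_next_table = next_tbl;     // where all packets continue
 *        attr.num_sample_actions = 1;
 *        attr.sample_actions = sample_actions;
 *
 *        sampler = mlx5dv_dr_action_create_flow_sampler(&attr);
 */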
mlx5dv_dr_matcher_destroy(restore->matcher); destroy_restore_tbl: mlx5dv_dr_table_destroy(restore->tbl); free_restore: free(restore); free_mask: free(mask); return NULL; } static void dr_action_destroy_sampler_restore_tbl(struct dr_flow_sampler_restore_tbl *restore) { uint32_t i; mlx5dv_dr_rule_destroy(restore->rule); for (i = 0; i < restore->num_of_actions; i++) mlx5dv_dr_action_destroy(restore->actions[i]); free(restore->actions); mlx5dv_dr_matcher_destroy(restore->matcher); mlx5dv_dr_table_destroy(restore->tbl); free(restore); } struct mlx5dv_dr_action * mlx5dv_dr_action_create_flow_sampler(struct mlx5dv_dr_flow_sampler_attr *attr) { struct mlx5dv_dr_action *action; struct mlx5dv_dr_domain *dmn; bool restore = false; if (!attr->default_next_table || attr->sample_ratio == 0 || !attr->sample_actions || attr->num_sample_actions == 0) { errno = EINVAL; return NULL; } dmn = attr->default_next_table->dmn; if (!dmn) { errno = EINVAL; return NULL; } if (dmn->type != MLX5DV_DR_DOMAIN_TYPE_NIC_RX && dmn->type != MLX5DV_DR_DOMAIN_TYPE_FDB) { errno = EOPNOTSUPP; return NULL; } if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_FDB && dmn->info.caps.sw_format_ver == MLX5_HW_CONNECTX_5) restore = true; atomic_fetch_add(&dmn->refcount, 1); action = dr_action_create_generic(DR_ACTION_TYP_SAMPLER); if (!action) goto dec_ref; action->sampler.dmn = dmn; action->sampler.term_tbl = dr_action_create_sampler_term_tbl(dmn, attr); if (!action->sampler.term_tbl) goto free_action; action->sampler.sampler_default = dr_action_create_sampler(dmn, attr, action->sampler.term_tbl, NULL); if (!action->sampler.sampler_default) goto destroy_term_tbl; if (restore) { struct dr_flow_sampler *sampler_restore; action->sampler.restore_tbl = dr_action_create_sampler_restore_tbl(dmn, attr); if (!action->sampler.restore_tbl) goto destroy_sampler_default; sampler_restore = dr_action_create_sampler(dmn, attr, action->sampler.term_tbl, action->sampler.restore_tbl); if (!sampler_restore) goto destroy_restore; action->sampler.sampler_restore = sampler_restore; } return action; destroy_restore: if (action->sampler.restore_tbl) dr_action_destroy_sampler_restore_tbl(action->sampler.restore_tbl); destroy_sampler_default: dr_action_destroy_sampler(action->sampler.sampler_default); destroy_term_tbl: dr_action_destroy_sampler_term_tbl(action->sampler.term_tbl); free_action: free(action); dec_ref: atomic_fetch_sub(&dmn->refcount, 1); return NULL; } static int dr_action_add_action_member(struct list_head *ref_list, struct mlx5dv_dr_action *action) { struct dr_rule_action_member *action_mem; action_mem = calloc(1, sizeof(*action_mem)); if (!action_mem) { errno = ENOMEM; return errno; } action_mem->action = action; list_node_init(&action_mem->list); list_add_tail(ref_list, &action_mem->list); atomic_fetch_add(&action_mem->action->refcount, 1); return 0; } static void dr_action_remove_action_members(struct list_head *ref_list) { struct dr_rule_action_member *action_mem; struct dr_rule_action_member *tmp; list_for_each_safe(ref_list, action_mem, tmp, list) { list_del(&action_mem->list); atomic_fetch_sub(&action_mem->action->refcount, 1); free(action_mem); } } static int dr_action_create_dest_array_tbl(struct mlx5dv_dr_action *action, size_t num_dest, struct mlx5dv_dr_action_dest_attr *dests[]) { struct mlx5dv_dr_domain *dmn = action->dest_array.dmn; struct dr_devx_flow_table_attr ft_attr = {}; struct dr_devx_flow_group_attr fg_attr = {}; struct dr_devx_flow_fte_attr fte_attr = {}; uint32_t i; int ret; switch (dmn->type) { case MLX5DV_DR_DOMAIN_TYPE_FDB: ft_attr.type = FS_FT_FDB; ft_attr.level =
dmn->info.caps.max_ft_level - 1; break; case MLX5DV_DR_DOMAIN_TYPE_NIC_RX: ft_attr.type = FS_FT_NIC_RX; ft_attr.level = MLX5_MULTI_PATH_FT_MAX_LEVEL - 1; break; default: errno = EOPNOTSUPP; return errno; } fte_attr.dest_arr = calloc(num_dest, sizeof(struct dr_devx_flow_dest_info)); if (!fte_attr.dest_arr) { errno = ENOMEM; return errno; } for (i = 0; i < num_dest; i++) { struct mlx5dv_dr_action *reformat_action; struct mlx5dv_dr_action *dest_action; switch (dests[i]->type) { case MLX5DV_DR_ACTION_DEST_REFORMAT: dest_action = dests[i]->dest_reformat->dest; reformat_action = dests[i]->dest_reformat->reformat; ft_attr.reformat_en = true; break; case MLX5DV_DR_ACTION_DEST: dest_action = dests[i]->dest; reformat_action = NULL; break; default: errno = EINVAL; goto clear_actions_list; } switch (dest_action->action_type) { case DR_ACTION_TYP_MISS: case DR_ACTION_TYP_VPORT: case DR_ACTION_TYP_QP: case DR_ACTION_TYP_CTR: case DR_ACTION_TYP_FT: if (dr_action_add_action_member(&action->dest_array.actions_list, dest_action)) goto clear_actions_list; break; default: errno = EOPNOTSUPP; goto clear_actions_list; } if (reformat_action) if (dr_action_add_action_member(&action->dest_array.actions_list, reformat_action)) goto clear_actions_list; if (dr_action_convert_to_fte_dest(dmn, dest_action, reformat_action, &fte_attr)) goto clear_actions_list; } action->dest_array.devx_tbl = dr_devx_create_always_hit_ft(dmn->ctx, &ft_attr, &fg_attr, &fte_attr); if (!action->dest_array.devx_tbl) goto clear_actions_list; ret = dr_devx_query_flow_table(action->dest_array.devx_tbl->ft_dvo, ft_attr.type, &action->dest_array.rx_icm_addr, &action->dest_array.tx_icm_addr); if (ret) goto destroy_devx_tbl; free(fte_attr.dest_arr); return 0; destroy_devx_tbl: dr_devx_destroy_always_hit_ft(action->dest_array.devx_tbl); clear_actions_list: dr_action_remove_action_members(&action->dest_array.actions_list); free(fte_attr.dest_arr); return errno; } struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_array(struct mlx5dv_dr_domain *dmn, size_t num_dest, struct mlx5dv_dr_action_dest_attr *dests[]) { struct mlx5dv_dr_action *action; if (num_dest <= 1) { errno = EINVAL; return NULL; } atomic_fetch_add(&dmn->refcount, 1); action = dr_action_create_generic(DR_ACTION_TYP_DEST_ARRAY); if (!action) goto dec_ref; action->dest_array.dmn = dmn; list_head_init(&action->dest_array.actions_list); if (dr_action_create_dest_array_tbl(action, num_dest, dests)) goto free_action; return action; free_action: free(action); dec_ref: atomic_fetch_sub(&dmn->refcount, 1); return NULL; } int mlx5dv_dr_action_destroy(struct mlx5dv_dr_action *action) { if (atomic_load(&action->refcount) > 1) return EBUSY; switch (action->action_type) { case DR_ACTION_TYP_FT: atomic_fetch_sub(&action->dest_tbl->refcount, 1); break; case DR_ACTION_TYP_TNL_L2_TO_L2: if (action->reformat.is_root_level) mlx5_destroy_flow_action(action->reformat.flow_action); atomic_fetch_sub(&action->reformat.dmn->refcount, 1); break; case DR_ACTION_TYP_TNL_L3_TO_L2: if (action->reformat.is_root_level) { mlx5_destroy_flow_action(action->reformat.flow_action); } else { dr_ste_free_modify_hdr(action); free(action->rewrite.param.data); } atomic_fetch_sub(&action->reformat.dmn->refcount, 1); break; case DR_ACTION_TYP_L2_TO_TNL_L2: case DR_ACTION_TYP_L2_TO_TNL_L3: if (action->reformat.is_root_level) { mlx5_destroy_flow_action(action->reformat.flow_action); } else { if (action->reformat.chunk) dr_action_destroy_sw_reformat(action); if (action->reformat.dvo) dr_action_destroy_devx_reformat(action); } 
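/*
 * A sketch, not part of the original sources, of building a
 * multi-destination action with mlx5dv_dr_action_create_dest_array() above;
 * "dest1" and "dest2" are assumed to be existing destination actions
 * (e.g. vport or table), and fewer than two destinations is rejected
 * with EINVAL:
 *
 *        struct mlx5dv_dr_action_dest_attr d0 = {
 *                .type = MLX5DV_DR_ACTION_DEST,
 *                .dest = dest1,
 *        };
 *        struct mlx5dv_dr_action_dest_attr d1 = {
 *                .type = MLX5DV_DR_ACTION_DEST,
 *                .dest = dest2,
 *        };
 *        struct mlx5dv_dr_action_dest_attr *dests[] = { &d0, &d1 };
 *        struct mlx5dv_dr_action *multi;
 *
 *        multi = mlx5dv_dr_action_create_dest_array(dmn, 2, dests);
 */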
atomic_fetch_sub(&action->reformat.dmn->refcount, 1); break; case DR_ACTION_TYP_MODIFY_HDR: if (action->rewrite.is_root_level) { mlx5_destroy_flow_action(action->rewrite.flow_action); } else { if (!action->rewrite.single_action_opt) dr_ste_free_modify_hdr(action); free(action->rewrite.param.data); } atomic_fetch_sub(&action->rewrite.dmn->refcount, 1); break; case DR_ACTION_TYP_METER: mlx5dv_devx_obj_destroy(action->meter.devx_obj); atomic_fetch_sub(&action->meter.next_ft->refcount, 1); break; case DR_ACTION_TYP_SAMPLER: if (action->sampler.sampler_restore) { dr_action_destroy_sampler(action->sampler.sampler_restore); dr_action_destroy_sampler_restore_tbl(action->sampler.restore_tbl); } dr_action_destroy_sampler(action->sampler.sampler_default); dr_action_destroy_sampler_term_tbl(action->sampler.term_tbl); atomic_fetch_sub(&action->sampler.dmn->refcount, 1); break; case DR_ACTION_TYP_DEST_ARRAY: dr_devx_destroy_always_hit_ft(action->dest_array.devx_tbl); dr_action_remove_action_members(&action->dest_array.actions_list); atomic_fetch_sub(&action->dest_array.dmn->refcount, 1); break; case DR_ACTION_TYP_ROOT_FT: dr_devx_destroy_always_hit_ft(action->root_tbl.devx_tbl); mlx5dv_destroy_steering_anchor(action->root_tbl.sa); atomic_fetch_sub(&action->root_tbl.tbl->refcount, 1); break; default: break; } free(action); return 0; } rdma-core-56.1/providers/mlx5/dr_arg.c000066400000000000000000000140331477342711600176370ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB // Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. #include "mlx5dv_dr.h" #define DR_ICM_MODIFY_HDR_GRANULARITY_4K 12 /* modify-header arg pool */ enum dr_arg_chunk_size { DR_ARG_CHUNK_SIZE_1, DR_ARG_CHUNK_SIZE_MIN = DR_ARG_CHUNK_SIZE_1, /* keep updated when changing */ DR_ARG_CHUNK_SIZE_2, DR_ARG_CHUNK_SIZE_3, DR_ARG_CHUNK_SIZE_4, DR_ARG_CHUNK_SIZE_MAX, }; /* argument pool area */ struct dr_arg_pool { enum dr_arg_chunk_size log_chunk_size; struct mlx5dv_dr_domain *dmn; struct list_head free_list; pthread_mutex_t mutex; }; struct dr_arg_mngr { struct mlx5dv_dr_domain *dmn; struct dr_arg_pool *pools[DR_ARG_CHUNK_SIZE_MAX]; }; static int dr_arg_pool_alloc_objs(struct dr_arg_pool *pool) { struct dr_arg_obj *arg_obj, *tmp_arg; struct mlx5dv_devx_obj *devx_obj; uint16_t object_range; LIST_HEAD(cur_list); int num_of_objects; int i; object_range = pool->dmn->info.caps.log_header_modify_argument_granularity; object_range = max_t(uint32_t, pool->dmn->info.caps.log_header_modify_argument_granularity, DR_ICM_MODIFY_HDR_GRANULARITY_4K); object_range = min_t(uint32_t, pool->dmn->info.caps.log_header_modify_argument_max_alloc, object_range); if (pool->log_chunk_size > object_range) { dr_dbg(pool->dmn, "Required chunk size (%d) is not supported\n", pool->log_chunk_size); errno = ENOMEM; return errno; } num_of_objects = (1 << (object_range - pool->log_chunk_size)); /* Only one devx object per range */ devx_obj = dr_devx_create_modify_header_arg(pool->dmn->ctx, object_range, pool->dmn->pd_num); if (!devx_obj) { dr_dbg(pool->dmn, "failed allocating object with range: %d:\n", object_range); return errno; } for (i = 0; i < num_of_objects; i++) { arg_obj = calloc(1, sizeof(struct dr_arg_obj)); if (!arg_obj) { errno = ENOMEM; goto clean_arg_obj; } arg_obj->log_chunk_size = pool->log_chunk_size; list_add_tail(&cur_list, &arg_obj->list_node); arg_obj->obj = devx_obj; arg_obj->obj_offset = i * (1 << pool->log_chunk_size); } list_append_list(&pool->free_list, &cur_list); return 0; clean_arg_obj: 
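/*
 * Worked example, not part of the original sources, of the sizing math in
 * dr_arg_pool_alloc_objs() above, assuming a device granularity of log 9
 * and a max alloc of log 16:
 *
 *        // the range is first raised to the 4K floor (log 12) ...
 *        object_range = max_t(uint32_t, 9, DR_ICM_MODIFY_HDR_GRANULARITY_4K); // 12
 *        // ... then capped by log_header_modify_argument_max_alloc
 *        object_range = min_t(uint32_t, 16, object_range);                    // 12
 *        // a DR_ARG_CHUNK_SIZE_2 pool (log 2, up to 16 modify actions) then
 *        // carves one devx object into 1 << (12 - 2) = 1024 pool entries,
 *        // entry i starting at obj_offset i * (1 << 2)
 *        num_of_objects = 1 << (object_range - pool->log_chunk_size);
 */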
mlx5dv_devx_obj_destroy(devx_obj); list_for_each_safe(&cur_list, arg_obj, tmp_arg, list_node) { list_del(&arg_obj->list_node); free(arg_obj); } return errno; } static struct dr_arg_obj *dr_arg_pool_get_arg_obj(struct dr_arg_pool *pool) { struct dr_arg_obj *arg_obj = NULL; int ret; pthread_mutex_lock(&pool->mutex); if (list_empty(&pool->free_list)) { ret = dr_arg_pool_alloc_objs(pool); if (ret) goto out; } arg_obj = list_pop(&pool->free_list, struct dr_arg_obj, list_node); if (!arg_obj) assert(false); out: pthread_mutex_unlock(&pool->mutex); return arg_obj; } static void dr_arg_pool_put_arg_obj(struct dr_arg_pool *pool, struct dr_arg_obj *arg_obj) { pthread_mutex_lock(&pool->mutex); list_add(&pool->free_list, &arg_obj->list_node); pthread_mutex_unlock(&pool->mutex); } static struct dr_arg_pool *dr_arg_pool_create(struct mlx5dv_dr_domain *dmn, enum dr_arg_chunk_size chunk_size) { struct dr_arg_pool *pool; pool = calloc(1, sizeof(struct dr_arg_pool)); if (!pool) { errno = ENOMEM; return NULL; } pool->dmn = dmn; list_head_init(&pool->free_list); pthread_mutex_init(&pool->mutex, NULL); pool->log_chunk_size = chunk_size; if (dr_arg_pool_alloc_objs(pool)) goto free_pool; return pool; free_pool: free(pool); return NULL; } static void dr_arg_pool_destroy(struct dr_arg_pool *pool) { struct dr_arg_obj *tmp_arg; struct dr_arg_obj *arg_obj; list_for_each_safe(&pool->free_list, arg_obj, tmp_arg, list_node) { list_del(&arg_obj->list_node); if (!arg_obj->obj_offset) /* the first in range */ mlx5dv_devx_obj_destroy(arg_obj->obj); free(arg_obj); } pthread_mutex_destroy(&pool->mutex); free(pool); } static enum dr_arg_chunk_size dr_arg_get_chunk_size(uint16_t num_of_actions) { if (num_of_actions <= 8) return DR_ARG_CHUNK_SIZE_1; if (num_of_actions <= 16) return DR_ARG_CHUNK_SIZE_2; if (num_of_actions <= 32) return DR_ARG_CHUNK_SIZE_3; if (num_of_actions <= 64) return DR_ARG_CHUNK_SIZE_4; errno = EINVAL; return DR_ARG_CHUNK_SIZE_MAX; } uint32_t dr_arg_get_object_id(struct dr_arg_obj *arg_obj) { return (arg_obj->obj->object_id + arg_obj->obj_offset); } struct dr_arg_obj *dr_arg_get_obj(struct dr_arg_mngr *mngr, uint16_t num_of_actions, uint8_t *data) { uint32_t size = dr_arg_get_chunk_size(num_of_actions); struct dr_arg_obj *arg_obj; int ret; if (size >= DR_ARG_CHUNK_SIZE_MAX) return NULL; arg_obj = dr_arg_pool_get_arg_obj(mngr->pools[size]); if (!arg_obj) { dr_dbg(mngr->dmn, "Failed allocating args object for modify header\n"); return NULL; } if (!mngr->dmn->info.use_mqs) { /* write it into the hw */ ret = dr_send_postsend_args(mngr->dmn, dr_arg_get_object_id(arg_obj), num_of_actions, data, 0); if (ret) { dr_dbg(mngr->dmn, "Failed writing args object\n"); goto put_obj; } } return arg_obj; put_obj: dr_arg_put_obj(mngr, arg_obj); return NULL; } void dr_arg_put_obj(struct dr_arg_mngr *mngr, struct dr_arg_obj *arg_obj) { dr_arg_pool_put_arg_obj(mngr->pools[arg_obj->log_chunk_size], arg_obj); } struct dr_arg_mngr* dr_arg_mngr_create(struct mlx5dv_dr_domain *dmn) { struct dr_arg_mngr *pool_mngr; int i; if (!dr_domain_is_support_modify_hdr_cache(dmn)) return NULL; pool_mngr = calloc(1, sizeof(struct dr_arg_mngr)); if (!pool_mngr) { errno = ENOMEM; return NULL; } pool_mngr->dmn = dmn; for (i = 0; i < DR_ARG_CHUNK_SIZE_MAX; i++) { pool_mngr->pools[i] = dr_arg_pool_create(dmn, i); if (!pool_mngr->pools[i]) goto clean_pools; } return pool_mngr; clean_pools: for (i--; i >= 0; i--) dr_arg_pool_destroy(pool_mngr->pools[i]); free(pool_mngr); return NULL; } void dr_arg_mngr_destroy(struct dr_arg_mngr *mngr) { struct 
dr_arg_pool **pools; int i; if (!mngr) return; pools = mngr->pools; for (i = 0; i < DR_ARG_CHUNK_SIZE_MAX; i++) dr_arg_pool_destroy(pools[i]); free(mngr); } rdma-core-56.1/providers/mlx5/dr_buddy.c000066400000000000000000000152131477342711600201760ustar00rootroot00000000000000/* * Copyright (c) 2004 Topspin Communications. All rights reserved. * Copyright (c) 2005, 2006, 2007, 2008 Mellanox Technologies. All rights reserved. * Copyright (c) 2006, 2007 Cisco Systems, Inc. All rights reserved. * Copyright (c) 2019, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include "mlx5dv_dr.h" struct dr_icm_pool; struct dr_icm_buddy_mem; static int dr_find_first_bit(const unsigned long *set_addr, const unsigned long *addr, unsigned int size) { unsigned int set_size = (size - 1) / BITS_PER_LONG + 1; unsigned long set_idx; /* find the first free in the first level */ set_idx = bitmap_find_first_bit(set_addr, 0, set_size); /* find the next level */ return bitmap_find_first_bit(addr, set_idx * BITS_PER_LONG, size); } int dr_buddy_init(struct dr_icm_buddy_mem *buddy, uint32_t max_order) { int i, s; buddy->max_order = max_order; list_node_init(&buddy->list_node); list_head_init(&buddy->used_list); list_head_init(&buddy->hot_list); buddy->bits = calloc(buddy->max_order + 1, sizeof(long *)); if (!buddy->bits) { errno = ENOMEM; return ENOMEM; } buddy->num_free = calloc(buddy->max_order + 1, sizeof(*buddy->num_free)); if (!buddy->num_free) goto err_out_free_bits; buddy->set_bit = calloc(buddy->max_order + 1, sizeof(long *)); if (!buddy->set_bit) goto err_out_free_num_free; /* Allocating max_order bitmaps, one for each order. * only the bitmap for the maximum size will be available for use and * the first bit there will be set. 
*/ for (i = 0; i <= buddy->max_order; ++i) { s = 1 << (buddy->max_order - i); buddy->bits[i] = bitmap_alloc0(s); if (!buddy->bits[i]) goto err_out_free_each_bit_per_order; } for (i = 0; i <= buddy->max_order; ++i) { s = BITS_TO_LONGS(1 << (buddy->max_order - i)); buddy->set_bit[i] = bitmap_alloc0(s); if (!buddy->set_bit[i]) goto err_out_free_set; } bitmap_set_bit(buddy->bits[buddy->max_order], 0); bitmap_set_bit(buddy->set_bit[buddy->max_order], 0); buddy->num_free[buddy->max_order] = 1; return 0; err_out_free_set: for (i = 0; i <= buddy->max_order; ++i) free(buddy->set_bit[i]); err_out_free_each_bit_per_order: free(buddy->set_bit); for (i = 0; i <= buddy->max_order; ++i) free(buddy->bits[i]); err_out_free_num_free: free(buddy->num_free); err_out_free_bits: free(buddy->bits); errno = ENOMEM; return ENOMEM; } void dr_buddy_cleanup(struct dr_icm_buddy_mem *buddy) { int i; list_del(&buddy->list_node); for (i = 0; i <= buddy->max_order; ++i) { free(buddy->bits[i]); free(buddy->set_bit[i]); } free(buddy->set_bit); free(buddy->num_free); free(buddy->bits); } /* * Find the borders (high and low) of specific seg (segment location) * of the lower level of the bitmap in order to mark the upper layer * of bitmap. */ static void dr_buddy_get_seg_borders(uint32_t seg, uint32_t *low, uint32_t *high) { *low = (seg / BITS_PER_LONG) * BITS_PER_LONG; *high = ((seg / BITS_PER_LONG) + 1) * BITS_PER_LONG; } /* * We have two layers of searching in the bitmaps, so when needed update the * second layer of search. */ static void dr_buddy_update_upper_bitmap(struct dr_icm_buddy_mem *buddy, uint32_t seg, int order) { uint32_t h, l, m; /* clear upper layer of search if needed */ dr_buddy_get_seg_borders(seg, &l, &h); m = bitmap_find_first_bit(buddy->bits[order], l, h); if (m == h) /* nothing in the long that includes seg */ bitmap_clear_bit(buddy->set_bit[order], seg / BITS_PER_LONG); } /* * This function finds the first free area of the memory managed by the buddy. * It uses the buddy-system data structures to find the first free area, * starting from the requested order up to the maximum order in the system. * The function returns the location (seg) within the whole buddy memory area; * this is the index of the memory segment to use. */ int dr_buddy_alloc_mem(struct dr_icm_buddy_mem *buddy, int order) { int seg; int o, m; for (o = order; o <= buddy->max_order; ++o) if (buddy->num_free[o]) { m = 1 << (buddy->max_order - o); seg = dr_find_first_bit(buddy->set_bit[o], buddy->bits[o], m); if (m <= seg) { /* no free segment was found, although the free counters indicate one */ assert(false); return -1; } goto found; } return -1; found: bitmap_clear_bit(buddy->bits[o], seg); /* clear upper layer of search if needed */ dr_buddy_update_upper_bitmap(buddy, seg, o); --buddy->num_free[o]; /* if we found free memory in some order that is bigger than the * required order, we need to divide each order between the required one * and the found one in two, and mark accordingly. */ while (o > order) { --o; seg <<= 1; bitmap_set_bit(buddy->bits[o], seg ^ 1); bitmap_set_bit(buddy->set_bit[o], (seg ^ 1) / BITS_PER_LONG); ++buddy->num_free[o]; } seg <<= order; return seg; } void dr_buddy_free_mem(struct dr_icm_buddy_mem *buddy, uint32_t seg, int order) { seg >>= order; /* whenever a segment is freed, the memory is merged back with the buddy that gave it */ while (bitmap_test_bit(buddy->bits[order], seg ^ 1)) { bitmap_clear_bit(buddy->bits[order], seg ^ 1); dr_buddy_update_upper_bitmap(buddy, seg ^ 1, order); --buddy->num_free[order]; seg >>= 1; ++order; } bitmap_set_bit(buddy->bits[order], seg); bitmap_set_bit(buddy->set_bit[order], seg / BITS_PER_LONG); ++buddy->num_free[order]; }
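/*
 * A self-contained sketch, not part of the original file, of how the buddy
 * interface above fits together: a buddy of max_order 4 manages sixteen
 * order-0 segments, an order-2 allocation returns an index aligned to 4,
 * and freeing it merges the split levels back together.
 */
static int example_buddy_roundtrip(void)
{
        struct dr_icm_buddy_mem buddy = {};
        int seg;

        if (dr_buddy_init(&buddy, 4))
                return -1;

        /* take one order-2 chunk (4 contiguous order-0 segments) */
        seg = dr_buddy_alloc_mem(&buddy, 2);
        if (seg < 0 || seg % 4) {
                dr_buddy_cleanup(&buddy);
                return -1;
        }

        /* freeing re-merges buddies until the whole range is free again */
        dr_buddy_free_mem(&buddy, seg, 2);
        dr_buddy_cleanup(&buddy);
        return 0;
}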
rdma-core-56.1/providers/mlx5/dr_crc32.c000066400000000000000000000105411477342711600200020ustar00rootroot00000000000000/* * Copyright (c) 2019, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ /* * Copyright (c) 2011-2015 Stephan Brumme. All rights reserved. * Slicing-by-16 contributed by Bulat Ziganshin * * This software is provided 'as-is', without any express or implied warranty. * In no event will the author be held liable for any damages arising from the * use of this software. * * Permission is granted to anyone to use this software for any purpose, * including commercial applications, and to alter it and redistribute it * freely, subject to the following restrictions: * * 1. The origin of this software must not be misrepresented; you must not * claim that you wrote the original software. * 2. If you use this software in a product, an acknowledgment in the product * documentation would be appreciated but is not required. * 3. Altered source versions must be plainly marked as such, and must not be * misrepresented as being the original software. * * Taken from http://create.stephan-brumme.com/crc32/ and adapted.
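 */

/*
 * A bit-at-a-time reference, not part of the original sources, over the same
 * polynomial, handy for sanity-checking the slice-by-8 routine below; both
 * start from a zero seed, apply no final XOR and byte-swap the result:
 *
 *        static uint32_t example_crc32_bitwise(const void *data, size_t length)
 *        {
 *                const uint8_t *p = data;
 *                uint32_t crc = 0;
 *                size_t i;
 *                int j;
 *
 *                for (i = 0; i < length; i++) {
 *                        crc ^= p[i];
 *                        for (j = 0; j < 8; j++)
 *                                crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320L : 0);
 *                }
 *
 *                // same output byte order as dr_crc32_slice8_calc()
 *                return ((crc >> 24) & 0xff) | ((crc << 8) & 0xff0000) |
 *                       ((crc >> 8) & 0xff00) | ((crc << 24) & 0xff000000);
 *        }
 *
 * With dr_crc32_init_table() called first, dr_crc32_slice8_calc() returns
 * the same value as this reference for any input buffer.
 */

/*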
*/ #include #include #include "mlx5dv_dr.h" #define DR_STE_CRC_POLY 0xEDB88320L static uint32_t dr_ste_crc_tab32[8][256]; static void dr_crc32_calc_lookup_entry(uint32_t (*tbl)[256], uint8_t i, uint8_t j) { tbl[i][j] = (tbl[i-1][j] >> 8) ^ tbl[0][tbl[i-1][j] & 0xff]; } void dr_crc32_init_table(void) { uint32_t crc, i, j; for (i = 0; i < 256; i++) { crc = i; for (j = 0; j < 8; j++) { if (crc & 0x00000001L) crc = (crc >> 1) ^ DR_STE_CRC_POLY; else crc = crc >> 1; } dr_ste_crc_tab32[0][i] = crc; } /* Init CRC lookup tables according to crc_slice_8 algorithm */ for (i = 0; i < 256; i++) { dr_crc32_calc_lookup_entry(dr_ste_crc_tab32, 1, i); dr_crc32_calc_lookup_entry(dr_ste_crc_tab32, 2, i); dr_crc32_calc_lookup_entry(dr_ste_crc_tab32, 3, i); dr_crc32_calc_lookup_entry(dr_ste_crc_tab32, 4, i); dr_crc32_calc_lookup_entry(dr_ste_crc_tab32, 5, i); dr_crc32_calc_lookup_entry(dr_ste_crc_tab32, 6, i); dr_crc32_calc_lookup_entry(dr_ste_crc_tab32, 7, i); } } /* Compute CRC32 (Slicing-by-8 algorithm) */ uint32_t dr_crc32_slice8_calc(const void *input_data, size_t length) { const uint32_t *current = (const uint32_t *)input_data; const uint8_t *current_char; uint32_t crc = 0, one, two; if (!input_data) return 0; /* Process eight bytes at once (Slicing-by-8) */ while (length >= 8) { one = *current++ ^ crc; two = *current++; crc = dr_ste_crc_tab32[0][(two >> 24) & 0xff] ^ dr_ste_crc_tab32[1][(two >> 16) & 0xff] ^ dr_ste_crc_tab32[2][(two >> 8) & 0xff] ^ dr_ste_crc_tab32[3][two & 0xff] ^ dr_ste_crc_tab32[4][(one >> 24) & 0xff] ^ dr_ste_crc_tab32[5][(one >> 16) & 0xff] ^ dr_ste_crc_tab32[6][(one >> 8) & 0xff] ^ dr_ste_crc_tab32[7][one & 0xff]; length -= 8; } current_char = (const uint8_t *)current; /* Remaining 1 to 7 bytes (standard algorithm) */ while (length-- != 0) crc = (crc >> 8) ^ dr_ste_crc_tab32[0][(crc & 0xff) ^ *current_char++]; return ((crc>>24) & 0xff) | ((crc<<8) & 0xff0000) | ((crc>>8) & 0xff00) | ((crc<<24) & 0xff000000); } rdma-core-56.1/providers/mlx5/dr_dbg.c000066400000000000000000000575361477342711600176410ustar00rootroot00000000000000/* * Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include "mlx5dv_dr.h" #define BUFF_SIZE 1024 enum dr_dump_rec_type { DR_DUMP_REC_TYPE_DOMAIN = 3000, DR_DUMP_REC_TYPE_DOMAIN_INFO_FLEX_PARSER = 3001, DR_DUMP_REC_TYPE_DOMAIN_INFO_DEV_ATTR = 3002, DR_DUMP_REC_TYPE_DOMAIN_INFO_VPORT = 3003, DR_DUMP_REC_TYPE_DOMAIN_INFO_CAPS = 3004, DR_DUMP_REC_TYPE_DOMAIN_SEND_RING = 3005, DR_DUMP_REC_TYPE_TABLE = 3100, DR_DUMP_REC_TYPE_TABLE_RX = 3101, DR_DUMP_REC_TYPE_TABLE_TX = 3102, DR_DUMP_REC_TYPE_MATCHER = 3200, DR_DUMP_REC_TYPE_MATCHER_MASK_DEPRECATED = 3201, DR_DUMP_REC_TYPE_MATCHER_RX = 3202, DR_DUMP_REC_TYPE_MATCHER_TX = 3203, DR_DUMP_REC_TYPE_MATCHER_BUILDER = 3204, DR_DUMP_REC_TYPE_MATCHER_MASK = 3205, DR_DUMP_REC_TYPE_RULE = 3300, DR_DUMP_REC_TYPE_RULE_RX_ENTRY_V0 = 3301, DR_DUMP_REC_TYPE_RULE_TX_ENTRY_V0 = 3302, DR_DUMP_REC_TYPE_RULE_RX_ENTRY_V1 = 3303, DR_DUMP_REC_TYPE_RULE_TX_ENTRY_V1 = 3304, DR_DUMP_REC_TYPE_ACTION_ENCAP_L2 = 3400, DR_DUMP_REC_TYPE_ACTION_ENCAP_L3 = 3401, DR_DUMP_REC_TYPE_ACTION_MODIFY_HDR = 3402, DR_DUMP_REC_TYPE_ACTION_DROP = 3403, DR_DUMP_REC_TYPE_ACTION_QP = 3404, DR_DUMP_REC_TYPE_ACTION_FT = 3405, DR_DUMP_REC_TYPE_ACTION_CTR = 3406, DR_DUMP_REC_TYPE_ACTION_TAG = 3407, DR_DUMP_REC_TYPE_ACTION_VPORT = 3408, DR_DUMP_REC_TYPE_ACTION_DECAP_L2 = 3409, DR_DUMP_REC_TYPE_ACTION_DECAP_L3 = 3410, DR_DUMP_REC_TYPE_ACTION_DEVX_TIR = 3411, DR_DUMP_REC_TYPE_ACTION_PUSH_VLAN = 3412, DR_DUMP_REC_TYPE_ACTION_POP_VLAN = 3413, DR_DUMP_REC_TYPE_ACTION_METER = 3414, DR_DUMP_REC_TYPE_ACTION_SAMPLER = 3415, DR_DUMP_REC_TYPE_ACTION_DEST_ARRAY = 3416, DR_DUMP_REC_TYPE_ACTION_ASO_FIRST_HIT = 3417, DR_DUMP_REC_TYPE_ACTION_ASO_FLOW_METER = 3418, DR_DUMP_REC_TYPE_ACTION_ASO_CT = 3419, DR_DUMP_REC_TYPE_ACTION_MISS = 3423, DR_DUMP_REC_TYPE_ACTION_ROOT_FT = 3424, }; static uint64_t dr_dump_icm_to_idx(uint64_t icm_addr) { return (icm_addr >> 6) & 0xffffffff; } static void dump_hex_print(char *dest, char *src, uint32_t size) { int i; for (i = 0; i < size; i++) sprintf(&dest[2 * i], "%02x", (uint8_t)src[i]); } static int dr_dump_rule_action(FILE *f, const uint64_t rule_id, struct mlx5dv_dr_action *action) { const uint64_t action_id = (uint64_t) (uintptr_t) action; int ret; switch (action->action_type) { case DR_ACTION_TYP_DROP: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 "\n", DR_DUMP_REC_TYPE_ACTION_DROP, action_id, rule_id); break; case DR_ACTION_TYP_FT: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x,0x%" PRIx64 "\n", DR_DUMP_REC_TYPE_ACTION_FT, action_id, rule_id, action->dest_tbl->devx_obj->object_id, (uint64_t)(uintptr_t)action->dest_tbl); break; case DR_ACTION_TYP_QP: if (action->dest_qp.is_qp) ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x\n", DR_DUMP_REC_TYPE_ACTION_QP, action_id, rule_id, action->dest_qp.qp->qp_num); else ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%" PRIx64 "\n", DR_DUMP_REC_TYPE_ACTION_DEVX_TIR, action_id, rule_id, action->dest_qp.devx_tir->rx_icm_addr); break; case DR_ACTION_TYP_CTR: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x\n", DR_DUMP_REC_TYPE_ACTION_CTR, action_id, rule_id, action->ctr.devx_obj->object_id + action->ctr.offset); break; case DR_ACTION_TYP_TAG: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x\n", DR_DUMP_REC_TYPE_ACTION_TAG, action_id, rule_id, action->flow_tag); break; case DR_ACTION_TYP_MODIFY_HDR: { struct dr_ptrn_obj *ptrn = action->rewrite.ptrn_arg.ptrn; struct dr_rewrite_param *param = &action->rewrite.param; struct dr_arg_obj *arg = action->rewrite.ptrn_arg.arg; bool ptrn_in_use = false; int i; if (!action->rewrite.single_action_opt && ptrn 
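/*
 * Worked example, not part of the original sources, of dump_hex_print()
 * above: every input byte becomes two lowercase hex characters, so the
 * destination must hold 2 * size characters plus a terminating NUL
 * (callers such as dr_dump_rule_mem() use BUFF_SIZE for this):
 *
 *        char in[4] = { 0x0a, 0x1b, 0x2c, 0x3d };
 *        char out[2 * sizeof(in) + 1] = {};
 *
 *        dump_hex_print(out, in, sizeof(in));    // out == "0a1b2c3d"
 */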
&& arg) ptrn_in_use = true; ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x,%d,0x%x,0x%" PRIx32 ",0x%" PRIx32, DR_DUMP_REC_TYPE_ACTION_MODIFY_HDR, action_id, rule_id, param->index, action->rewrite.single_action_opt, ptrn_in_use ? param->num_of_actions : 0, ptrn_in_use ? ptrn->rewrite_param.index : 0, ptrn_in_use ? dr_arg_get_object_id(arg) : 0); if (ret < 0) return ret; if (ptrn_in_use) { for (i = 0; i < param->num_of_actions; i++) { ret = fprintf(f, ",0x%016" PRIx64, be64toh(((__be64 *)param->data)[i])); if (ret < 0) return ret; } } ret = fprintf(f, "\n"); break; } case DR_ACTION_TYP_VPORT: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x\n", DR_DUMP_REC_TYPE_ACTION_VPORT, action_id, rule_id, action->vport.caps->num); break; case DR_ACTION_TYP_TNL_L2_TO_L2: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 "\n", DR_DUMP_REC_TYPE_ACTION_DECAP_L2, action_id, rule_id); break; case DR_ACTION_TYP_TNL_L3_TO_L2: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x\n", DR_DUMP_REC_TYPE_ACTION_DECAP_L3, action_id, rule_id, action->rewrite.param.index); break; case DR_ACTION_TYP_L2_TO_TNL_L2: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x\n", DR_DUMP_REC_TYPE_ACTION_ENCAP_L2, action_id, rule_id, dr_actions_reformat_get_id(action)); break; case DR_ACTION_TYP_L2_TO_TNL_L3: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x\n", DR_DUMP_REC_TYPE_ACTION_ENCAP_L3, action_id, rule_id, dr_actions_reformat_get_id(action)); break; case DR_ACTION_TYP_METER: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%" PRIx64 ",0x%x,0x%" PRIx64 ",0x%" PRIx64 "\n", DR_DUMP_REC_TYPE_ACTION_METER, action_id, rule_id, (uint64_t)(uintptr_t)action->meter.next_ft, action->meter.devx_obj->object_id, action->meter.rx_icm_addr, action->meter.tx_icm_addr); break; case DR_ACTION_TYP_SAMPLER: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%" PRIx64 ",0x%x,0x%x,0x%" PRIx64 ",0x%" PRIx64 "\n", DR_DUMP_REC_TYPE_ACTION_SAMPLER, action_id, rule_id, (uint64_t)(uintptr_t)action->sampler.sampler_default->next_ft, action->sampler.term_tbl->devx_tbl->ft_dvo->object_id, action->sampler.sampler_default->devx_obj->object_id, action->sampler.sampler_default->rx_icm_addr, (action->sampler.sampler_restore) ? 
action->sampler.sampler_restore->tx_icm_addr : action->sampler.sampler_default->tx_icm_addr); break; case DR_ACTION_TYP_DEST_ARRAY: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x,0x%" PRIx64 ",0x%" PRIx64 "\n", DR_DUMP_REC_TYPE_ACTION_DEST_ARRAY, action_id, rule_id, action->dest_array.devx_tbl->ft_dvo->object_id, action->dest_array.rx_icm_addr, action->dest_array.tx_icm_addr); break; case DR_ACTION_TYP_POP_VLAN: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 "\n", DR_DUMP_REC_TYPE_ACTION_POP_VLAN, action_id, rule_id); break; case DR_ACTION_TYP_PUSH_VLAN: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x\n", DR_DUMP_REC_TYPE_ACTION_PUSH_VLAN, action_id, rule_id, action->push_vlan.vlan_hdr); break; case DR_ACTION_TYP_ASO_FIRST_HIT: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x\n", DR_DUMP_REC_TYPE_ACTION_ASO_FIRST_HIT, action_id, rule_id, action->aso.devx_obj->object_id); break; case DR_ACTION_TYP_ASO_FLOW_METER: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x\n", DR_DUMP_REC_TYPE_ACTION_ASO_FLOW_METER, action_id, rule_id, action->aso.devx_obj->object_id); break; case DR_ACTION_TYP_ASO_CT: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x\n", DR_DUMP_REC_TYPE_ACTION_ASO_CT, action_id, rule_id, action->aso.devx_obj->object_id); break; case DR_ACTION_TYP_MISS: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 "\n", DR_DUMP_REC_TYPE_ACTION_MISS, action_id, rule_id); break; case DR_ACTION_TYP_ROOT_FT: ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x\n", DR_DUMP_REC_TYPE_ACTION_ROOT_FT, action_id, rule_id, action->root_tbl.devx_tbl->ft_dvo->object_id); break; default: return 0; } if (ret < 0) return ret; return 0; } static int dr_dump_rule_mem(FILE *f, struct dr_ste *ste, bool is_rx, const uint64_t rule_id, enum mlx5_ifc_steering_format_version format_ver) { char hw_ste_dump[BUFF_SIZE] = {}; enum dr_dump_rec_type mem_rec_type; int ret; if (format_ver == MLX5_HW_CONNECTX_5) { mem_rec_type = is_rx ? DR_DUMP_REC_TYPE_RULE_RX_ENTRY_V0 : DR_DUMP_REC_TYPE_RULE_TX_ENTRY_V0; } else { mem_rec_type = is_rx ? 
DR_DUMP_REC_TYPE_RULE_RX_ENTRY_V1 : DR_DUMP_REC_TYPE_RULE_TX_ENTRY_V1; } dump_hex_print(hw_ste_dump, (char *)ste->hw_ste, ste->size); ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",%s\n", mem_rec_type, dr_dump_icm_to_idx(dr_ste_get_icm_addr(ste)), rule_id, hw_ste_dump); if (ret < 0) return ret; return 0; } static int dr_dump_rule_rx_tx(FILE *f, struct dr_rule_rx_tx *nic_rule, bool is_rx, const uint64_t rule_id, enum mlx5_ifc_steering_format_version format_ver) { struct dr_ste *ste_arr[DR_RULE_MAX_STES + DR_ACTION_MAX_STES]; struct dr_ste *curr_ste = nic_rule->last_rule_ste; int ret, i; dr_rule_get_reverse_rule_members(ste_arr, curr_ste, &i); while (i--) { ret = dr_dump_rule_mem(f, ste_arr[i], is_rx, rule_id, format_ver); if (ret < 0) return ret; } return 0; } static int dr_dump_rule(FILE *f, struct mlx5dv_dr_rule *rule) { const uint64_t rule_id = (uint64_t) (uintptr_t) rule; enum mlx5_ifc_steering_format_version format_ver; struct dr_rule_rx_tx *rx = &rule->rx; struct dr_rule_rx_tx *tx = &rule->tx; int ret; int i; format_ver = rule->matcher->tbl->dmn->info.caps.sw_format_ver; ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 "\n", DR_DUMP_REC_TYPE_RULE, rule_id, (uint64_t) (uintptr_t) rule->matcher); if (ret < 0) return ret; if (!dr_is_root_table(rule->matcher->tbl)) { if (rx->nic_matcher) { ret = dr_dump_rule_rx_tx(f, rx, true, rule_id, format_ver); if (ret < 0) return ret; } if (tx->nic_matcher) { ret = dr_dump_rule_rx_tx(f, tx, false, rule_id, format_ver); if (ret < 0) return ret; } } for (i = 0; i < rule->num_actions; i++) { ret = dr_dump_rule_action(f, rule_id, rule->actions[i]); if (ret < 0) return ret; } return 0; } static int dr_dump_matcher_mask(FILE *f, struct dr_match_param *mask, uint8_t criteria, const uint64_t matcher_id) { char dump[BUFF_SIZE] = {}; int ret; ret = fprintf(f, "%d,0x%" PRIx64 ",", DR_DUMP_REC_TYPE_MATCHER_MASK, matcher_id); if (ret < 0) return ret; if (criteria & DR_MATCHER_CRITERIA_OUTER) { dump_hex_print(dump, (char *)&mask->outer, sizeof(mask->outer)); ret = fprintf(f, "%s,", dump); } else { ret = fprintf(f, ","); } if (ret < 0) return ret; if (criteria & DR_MATCHER_CRITERIA_INNER) { dump_hex_print(dump, (char *)&mask->inner, sizeof(mask->inner)); ret = fprintf(f, "%s,", dump); } else { ret = fprintf(f, ","); } if (ret < 0) return ret; if (criteria & DR_MATCHER_CRITERIA_MISC) { dump_hex_print(dump, (char *)&mask->misc, sizeof(mask->misc)); ret = fprintf(f, "%s,", dump); } else { ret = fprintf(f, ","); } if (ret < 0) return ret; if (criteria & DR_MATCHER_CRITERIA_MISC2) { dump_hex_print(dump, (char *)&mask->misc2, sizeof(mask->misc2)); ret = fprintf(f, "%s,", dump); } else { ret = fprintf(f, ","); } if (ret < 0) return ret; if (criteria & DR_MATCHER_CRITERIA_MISC3) { dump_hex_print(dump, (char *)&mask->misc3, sizeof(mask->misc3)); ret = fprintf(f, "%s,", dump); } else { ret = fprintf(f, ","); } if (criteria & DR_MATCHER_CRITERIA_MISC4) { dump_hex_print(dump, (char *)&mask->misc4, sizeof(mask->misc4)); ret = fprintf(f, "%s,", dump); } else { ret = fprintf(f, ","); } if (criteria & DR_MATCHER_CRITERIA_MISC5) { dump_hex_print(dump, (char *)&mask->misc5, sizeof(mask->misc5)); ret = fprintf(f, "%s\n", dump); } else { ret = fprintf(f, ",\n"); } if (ret < 0) return ret; return 0; } static int dr_dump_matcher_builder(FILE *f, struct dr_ste_build *builder, uint32_t index, bool is_rx, const uint64_t matcher_id) { bool is_match = builder->htbl_type == DR_STE_HTBL_TYPE_MATCH; int ret; ret = fprintf(f, "%d,0x%" PRIx64 ",%d,%d,0x%x,%d\n", DR_DUMP_REC_TYPE_MATCHER_BUILDER, 
matcher_id, index, is_rx, builder->lu_type, is_match ? builder->format_id : -1); if (ret < 0) return ret; return 0; } static int dr_dump_matcher_rx_tx(FILE *f, bool is_rx, struct dr_matcher_rx_tx *matcher_rx_tx, const uint64_t matcher_id) { enum dr_dump_rec_type rec_type; int i, ret; rec_type = is_rx ? DR_DUMP_REC_TYPE_MATCHER_RX : DR_DUMP_REC_TYPE_MATCHER_TX; ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",%d,0x%" PRIx64 ",0x%" PRIx64 ",%d\n", rec_type, (uint64_t) (uintptr_t) matcher_rx_tx, matcher_id, matcher_rx_tx->num_of_builders, dr_dump_icm_to_idx(dr_icm_pool_get_chunk_icm_addr(matcher_rx_tx->s_htbl->chunk)), dr_dump_icm_to_idx(dr_icm_pool_get_chunk_icm_addr(matcher_rx_tx->e_anchor->chunk)), matcher_rx_tx->fixed_size ? matcher_rx_tx->s_htbl->chunk_size : -1); if (ret < 0) return ret; for (i = 0; i < matcher_rx_tx->num_of_builders; i++) { ret = dr_dump_matcher_builder(f, &matcher_rx_tx->ste_builder[i], i, is_rx, matcher_id); if (ret < 0) return ret; } return 0; } static int dr_dump_matcher(FILE *f, struct mlx5dv_dr_matcher *matcher) { struct dr_matcher_rx_tx *rx = &matcher->rx; struct dr_matcher_rx_tx *tx = &matcher->tx; uint64_t matcher_id; int ret; matcher_id = (uint64_t) (uintptr_t) matcher; ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",%d\n", DR_DUMP_REC_TYPE_MATCHER, matcher_id, (uint64_t) (uintptr_t) matcher->tbl, matcher->prio); if (ret < 0) return ret; if (!dr_is_root_table(matcher->tbl)) { ret = dr_dump_matcher_mask(f, &matcher->mask, matcher->match_criteria, matcher_id); if (ret < 0) return ret; if (rx->nic_tbl) { ret = dr_dump_matcher_rx_tx(f, true, rx, matcher_id); if (ret < 0) return ret; } if (tx->nic_tbl) { ret = dr_dump_matcher_rx_tx(f, false, tx, matcher_id); if (ret < 0) return ret; } } return 0; } static int dr_dump_matcher_all(FILE *fout, struct mlx5dv_dr_matcher *matcher) { struct mlx5dv_dr_rule *rule; int ret; ret = dr_dump_matcher(fout, matcher); if (ret < 0) return ret; list_for_each(&matcher->rule_list, rule, rule_list) { ret = dr_dump_rule(fout, rule); if (ret < 0) return ret; } return 0; } static uint64_t dr_domain_id_calc(enum mlx5dv_dr_domain_type type) { return (getpid() << 8) | (type & 0xff); } static int dr_dump_table_rx_tx(FILE *f, bool is_rx, struct dr_table_rx_tx *table_rx_tx, const uint64_t table_id) { struct dr_icm_chunk *chunk = table_rx_tx->s_anchor->chunk; enum dr_dump_rec_type rec_type; int ret; rec_type = is_rx ? 
DR_DUMP_REC_TYPE_TABLE_RX : DR_DUMP_REC_TYPE_TABLE_TX; ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 "\n", rec_type, table_id, dr_dump_icm_to_idx(dr_icm_pool_get_chunk_icm_addr(chunk))); if (ret < 0) return ret; return 0; } static int dr_dump_table(FILE *f, struct mlx5dv_dr_table *table) { struct dr_table_rx_tx *rx = &table->rx; struct dr_table_rx_tx *tx = &table->tx; int ret; ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",%d,%d\n", DR_DUMP_REC_TYPE_TABLE, (uint64_t) (uintptr_t) table, dr_domain_id_calc(table->dmn->type), table->table_type, table->level); if (ret < 0) return ret; if (!dr_is_root_table(table)) { if (rx->nic_dmn) { ret = dr_dump_table_rx_tx(f, true, rx, (uint64_t) (uintptr_t) table); if (ret < 0) return ret; } if (tx->nic_dmn) { ret = dr_dump_table_rx_tx(f, false, tx, (uint64_t) (uintptr_t) table); if (ret < 0) return ret; } } return 0; } static int dr_dump_table_all(FILE *fout, struct mlx5dv_dr_table *tbl) { struct mlx5dv_dr_matcher *matcher; int ret; ret = dr_dump_table(fout, tbl); if (ret < 0) return ret; if (!dr_is_root_table(tbl)) { list_for_each(&tbl->matcher_list, matcher, matcher_list) { ret = dr_dump_matcher_all(fout, matcher); if (ret < 0) return ret; } } return 0; } static int dr_dump_send_ring(FILE *f, struct dr_send_ring *ring, const uint64_t domain_id) { int ret; ret = fprintf(f, "%d,0x%" PRIx64 ",0x%" PRIx64 ",0x%x,0x%x\n", DR_DUMP_REC_TYPE_DOMAIN_SEND_RING, (uint64_t) (uintptr_t) ring, domain_id, ring->cq.cqn, ring->qp->obj->object_id); if (ret < 0) return ret; return 0; } static int dr_dump_domain_info_flex_parser(FILE *f, const char *flex_parser_name, const uint8_t flex_parser_value, const uint64_t domain_id) { int ret; ret = fprintf(f, "%d,0x%" PRIx64 ",%s,0x%x\n", DR_DUMP_REC_TYPE_DOMAIN_INFO_FLEX_PARSER, domain_id, flex_parser_name, flex_parser_value); if (ret < 0) return ret; return 0; } static int dr_dump_vports_table(FILE *f, struct dr_vports_table *vports_tbl, const uint64_t domain_id) { struct dr_devx_vport_cap *vport_cap; int i, ret; if (!vports_tbl) return 0; for (i = 0; i < DR_VPORTS_BUCKETS; i++) { vport_cap = vports_tbl->buckets[i]; while (vport_cap) { ret = fprintf(f, "%d,0x%" PRIx64 ",%d,0x%x,0x%" PRIx64 ",0x%" PRIx64 "\n", DR_DUMP_REC_TYPE_DOMAIN_INFO_VPORT, domain_id, vport_cap->num, vport_cap->vport_gvmi, vport_cap->icm_address_rx, vport_cap->icm_address_tx); if (ret < 0) return ret; vport_cap = vport_cap->next; } } return 0; } static int dr_dump_domain_info_caps(FILE *f, struct dr_devx_caps *caps, const uint64_t domain_id) { int ret; ret = fprintf(f, "%d,0x%" PRIx64 ",0x%x,0x%" PRIx64 ",0x%" PRIx64 ",0x%x,%d,%d\n", DR_DUMP_REC_TYPE_DOMAIN_INFO_CAPS, domain_id, caps->gvmi, caps->nic_rx_drop_address, caps->nic_tx_drop_address, caps->flex_protocols, caps->vports.num_ports, caps->eswitch_manager); if (ret < 0) return ret; ret = dr_dump_vports_table(f, caps->vports.vports, domain_id); if (ret < 0) return ret; return 0; } static int dr_dump_domain_info_dev_attr(FILE *f, struct dr_domain_info *info, const uint64_t domain_id) { int ret; ret = fprintf(f, "%d,0x%" PRIx64 ",%u,%s,%d\n", DR_DUMP_REC_TYPE_DOMAIN_INFO_DEV_ATTR, domain_id, info->caps.vports.num_ports, info->attr.orig_attr.fw_ver, info->use_mqs); if (ret < 0) return ret; return 0; } static int dr_dump_domain_info(FILE *f, struct dr_domain_info *info, const uint64_t domain_id) { int ret; ret = dr_dump_domain_info_dev_attr(f, info, domain_id); if (ret < 0) return ret; ret = dr_dump_domain_info_caps(f, &info->caps, domain_id); if (ret < 0) return ret; ret = 
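/*
 * A sketch, not part of the original sources, of driving this dump logic
 * through the public entry points defined below; "dmn" is an existing
 * software-steering domain:
 *
 *        FILE *f = fopen("/tmp/dr_dump.csv", "w");
 *
 *        if (f) {
 *                // serializes the domain with its tables, matchers and
 *                // rules, one CSV record per line, under the debug lock
 *                mlx5dv_dump_dr_domain(f, dmn);
 *                fclose(f);
 *        }
 */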
dr_dump_domain_info_flex_parser(f, "icmp_dw0", info->caps.flex_parser_id_icmp_dw0, domain_id); if (ret < 0) return ret; ret = dr_dump_domain_info_flex_parser(f, "icmp_dw1", info->caps.flex_parser_id_icmp_dw1, domain_id); if (ret < 0) return ret; ret = dr_dump_domain_info_flex_parser(f, "icmpv6_dw0", info->caps.flex_parser_id_icmpv6_dw0, domain_id); if (ret < 0) return ret; ret = dr_dump_domain_info_flex_parser(f, "icmpv6_dw1", info->caps.flex_parser_id_icmpv6_dw1, domain_id); if (ret < 0) return ret; return 0; } static int dr_dump_domain(FILE *f, struct mlx5dv_dr_domain *dmn) { enum mlx5dv_dr_domain_type dmn_type = dmn->type; char *dev_name = dmn->ctx->device->dev_name; uint64_t domain_id; int ret, i; domain_id = dr_domain_id_calc(dmn_type); ret = fprintf(f, "%d,0x%" PRIx64 ",%d,0%x,%d,%s,%s,%u,%u,%u,%u,%u\n", DR_DUMP_REC_TYPE_DOMAIN, domain_id, dmn_type, dmn->info.caps.gvmi, dmn->info.supp_sw_steering, PACKAGE_VERSION, dev_name, dmn->flags, dmn->num_buddies[DR_ICM_TYPE_STE], dmn->num_buddies[DR_ICM_TYPE_MODIFY_ACTION], dmn->num_buddies[DR_ICM_TYPE_MODIFY_HDR_PTRN], dmn->info.caps.sw_format_ver); if (ret < 0) return ret; ret = dr_dump_domain_info(f, &dmn->info, domain_id); if (ret < 0) return ret; if (dmn->info.supp_sw_steering) { for (i = 0; i < DR_MAX_SEND_RINGS; i++) { ret = dr_dump_send_ring(f, dmn->send_ring[i], domain_id); if (ret < 0) return ret; } } return 0; } static int dr_dump_domain_all(FILE *fout, struct mlx5dv_dr_domain *dmn) { struct mlx5dv_dr_table *tbl; int ret; ret = dr_dump_domain(fout, dmn); if (ret < 0) return ret; list_for_each(&dmn->tbl_list, tbl, tbl_list) { ret = dr_dump_table_all(fout, tbl); if (ret < 0) return ret; } return 0; } int mlx5dv_dump_dr_domain(FILE *fout, struct mlx5dv_dr_domain *dmn) { int ret; if (!fout || !dmn) return -EINVAL; pthread_spin_lock(&dmn->debug_lock); dr_domain_lock(dmn); ret = dr_dump_domain_all(fout, dmn); dr_domain_unlock(dmn); pthread_spin_unlock(&dmn->debug_lock); return ret; } int mlx5dv_dump_dr_table(FILE *fout, struct mlx5dv_dr_table *tbl) { int ret; if (!fout || !tbl) return -EINVAL; pthread_spin_lock(&tbl->dmn->debug_lock); dr_domain_lock(tbl->dmn); ret = dr_dump_domain(fout, tbl->dmn); if (ret < 0) goto out; ret = dr_dump_table_all(fout, tbl); out: dr_domain_unlock(tbl->dmn); pthread_spin_unlock(&tbl->dmn->debug_lock); return ret; } int mlx5dv_dump_dr_matcher(FILE *fout, struct mlx5dv_dr_matcher *matcher) { int ret; if (!fout || !matcher) return -EINVAL; pthread_spin_lock(&matcher->tbl->dmn->debug_lock); dr_domain_lock(matcher->tbl->dmn); ret = dr_dump_domain(fout, matcher->tbl->dmn); if (ret < 0) goto out; ret = dr_dump_table(fout, matcher->tbl); if (ret < 0) goto out; ret = dr_dump_matcher_all(fout, matcher); out: dr_domain_unlock(matcher->tbl->dmn); pthread_spin_unlock(&matcher->tbl->dmn->debug_lock); return ret; } int mlx5dv_dump_dr_rule(FILE *fout, struct mlx5dv_dr_rule *rule) { int ret; if (!fout || !rule) return -EINVAL; pthread_spin_lock(&rule->matcher->tbl->dmn->debug_lock); dr_domain_lock(rule->matcher->tbl->dmn); ret = dr_dump_domain(fout, rule->matcher->tbl->dmn); if (ret < 0) goto out; ret = dr_dump_table(fout, rule->matcher->tbl); if (ret < 0) goto out; ret = dr_dump_matcher(fout, rule->matcher); if (ret < 0) goto out; ret = dr_dump_rule(fout, rule); out: dr_domain_unlock(rule->matcher->tbl->dmn); pthread_spin_unlock(&rule->matcher->tbl->dmn->debug_lock); return ret; } rdma-core-56.1/providers/mlx5/dr_devx.c000066400000000000000000001142041477342711600200350ustar00rootroot00000000000000/* * Copyright (c) 2019 
Mellanox Technologies, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include "mlx5dv_dr.h" int dr_devx_query_esw_vport_context(struct ibv_context *ctx, bool other_vport, uint16_t vport_number, uint64_t *icm_address_rx, uint64_t *icm_address_tx) { uint32_t out[DEVX_ST_SZ_DW(query_esw_vport_context_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(query_esw_vport_context_in)] = {}; int err; DEVX_SET(query_esw_vport_context_in, in, opcode, MLX5_CMD_OP_QUERY_ESW_VPORT_CONTEXT); DEVX_SET(query_esw_vport_context_in, in, other_vport, other_vport); DEVX_SET(query_esw_vport_context_in, in, vport_number, vport_number); err = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)); if (err) { err = mlx5_get_cmd_status_err(err, out); dr_dbg_ctx(ctx, "Query eswitch vport context failed %d\n", err); return err; } *icm_address_rx = DEVX_GET64(query_esw_vport_context_out, out, esw_vport_context.sw_steering_vport_icm_address_rx); *icm_address_tx = DEVX_GET64(query_esw_vport_context_out, out, esw_vport_context.sw_steering_vport_icm_address_tx); return 0; } static int dr_devx_query_nic_vport_context(struct ibv_context *ctx, bool *roce_en) { uint32_t out[DEVX_ST_SZ_DW(query_nic_vport_context_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(query_nic_vport_context_in)] = {}; int err; DEVX_SET(query_nic_vport_context_in, in, opcode, MLX5_CMD_OP_QUERY_NIC_VPORT_CONTEXT); err = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)); if (err) { err = mlx5_get_cmd_status_err(err, out); dr_dbg_ctx(ctx, "Query nic vport context failed %d\n", err); return err; } *roce_en = DEVX_GET(query_nic_vport_context_out, out, nic_vport_context.roce_en); return 0; } int dr_devx_query_gvmi(struct ibv_context *ctx, bool other_vport, uint16_t vport_number, uint16_t *gvmi) { uint32_t out[DEVX_ST_SZ_DW(query_hca_cap_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(query_hca_cap_in)] = {}; int err; DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP); DEVX_SET(query_hca_cap_in, in, other_function, other_vport); DEVX_SET(query_hca_cap_in, in, function_id, vport_number); DEVX_SET(query_hca_cap_in, in, op_mod, MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE | HCA_CAP_OPMOD_GET_CUR); err = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)); if (err) { err = 
mlx5_get_cmd_status_err(err, out); dr_dbg_ctx(ctx, "Query general failed %d\n", err); return err; } *gvmi = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.vhca_id); return 0; } static int dr_devx_query_esw_func(struct ibv_context *ctx, uint16_t max_sfs, bool *host_pf_vhca_id_valid, uint16_t *host_pf_vhca_id) { uint32_t in[DEVX_ST_SZ_DW(query_esw_functions_in)] = {}; size_t outsz; void *out; int err; outsz = DEVX_ST_SZ_BYTES(query_esw_functions_out) + (max_sfs - 1) * DEVX_FLD_SZ_BYTES(query_esw_functions_out, host_sf_enable); out = calloc(1, outsz); if (!out) { errno = ENOMEM; return errno; } DEVX_SET(query_esw_functions_in, in, opcode, MLX5_CMD_OP_QUERY_ESW_FUNCTIONS); err = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, outsz); if (err) { err = mlx5_get_cmd_status_err(err, out); dr_dbg_ctx(ctx, "Query esw func failed %d\n", err); free(out); return err; } *host_pf_vhca_id_valid = DEVX_GET(query_esw_functions_out, out, host_params_context.host_pf_vhca_id_valid); *host_pf_vhca_id = DEVX_GET(query_esw_functions_out, out, host_params_context.host_pf_vhca_id); free(out); return 0; } int dr_devx_query_esw_caps(struct ibv_context *ctx, struct dr_esw_caps *caps) { uint32_t out[DEVX_ST_SZ_DW(query_hca_cap_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(query_hca_cap_in)] = {}; void *esw_caps; int err; DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP); DEVX_SET(query_hca_cap_in, in, op_mod, MLX5_SET_HCA_CAP_OP_MOD_ESW_FLOW_TABLE | HCA_CAP_OPMOD_GET_CUR); err = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)); if (err) { err = mlx5_get_cmd_status_err(err, out); dr_dbg_ctx(ctx, "Query general failed %d\n", err); return err; } esw_caps = DEVX_ADDR_OF(query_hca_cap_out, out, capability.flow_table_eswitch_cap); caps->drop_icm_address_rx = DEVX_GET64(flow_table_eswitch_cap, esw_caps, sw_steering_fdb_action_drop_icm_address_rx); caps->drop_icm_address_tx = DEVX_GET64(flow_table_eswitch_cap, esw_caps, sw_steering_fdb_action_drop_icm_address_tx); caps->uplink_icm_address_rx = DEVX_GET64(flow_table_eswitch_cap, esw_caps, sw_steering_uplink_icm_address_rx); caps->uplink_icm_address_tx = DEVX_GET64(flow_table_eswitch_cap, esw_caps, sw_steering_uplink_icm_address_tx); caps->sw_owner_v2 = DEVX_GET(flow_table_eswitch_cap, esw_caps, flow_table_properties_nic_esw_fdb.sw_owner_v2); if (!caps->sw_owner_v2) caps->sw_owner = DEVX_GET(flow_table_eswitch_cap, esw_caps, flow_table_properties_nic_esw_fdb.sw_owner); return 0; } int dr_devx_query_device(struct ibv_context *ctx, struct dr_devx_caps *caps) { uint32_t out[DEVX_ST_SZ_DW(query_hca_cap_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(query_hca_cap_in)] = {}; bool host_pf_vhca_id_valid = false; uint16_t host_pf_vhca_id = 0; uint32_t max_sfs = 0; bool roce, sf_supp; int err; DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP); DEVX_SET(query_hca_cap_in, in, op_mod, MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE | HCA_CAP_OPMOD_GET_CUR); err = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)); if (err) { err = mlx5_get_cmd_status_err(err, out); dr_dbg_ctx(ctx, "Query general failed %d\n", err); return err; } caps->prio_tag_required = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.prio_tag_required); caps->eswitch_manager = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.eswitch_manager); caps->gvmi = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.vhca_id); caps->flex_protocols = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.flex_parser_protocols); caps->isolate_vl_tc = 
DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.isolate_vl_tc_new);
	caps->flex_parser_header_modify =
		DEVX_GET(query_hca_cap_out, out,
			 capability.cmd_hca_cap.flex_parser_header_modify);
	sf_supp = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.sf);
	caps->definer_format_sup =
		DEVX_GET64(query_hca_cap_out, out,
			   capability.cmd_hca_cap.match_definer_format_supported);
	roce = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.roce);
	caps->sw_format_ver =
		DEVX_GET(query_hca_cap_out, out,
			 capability.cmd_hca_cap.steering_format_version);
	caps->support_modify_argument =
		DEVX_GET64(query_hca_cap_out, out,
			   capability.cmd_hca_cap.general_obj_types) &
		(1LL << MLX5_OBJ_TYPE_HEADER_MODIFY_ARGUMENT);
	caps->roce_caps.fl_rc_qp_when_roce_disabled =
		DEVX_GET(query_hca_cap_out, out,
			 capability.cmd_hca_cap.fl_rc_qp_when_roce_disabled);
	caps->roce_caps.qp_ts_format =
		DEVX_GET(query_hca_cap_out, out,
			 capability.cmd_hca_cap.sq_ts_format);

	if (caps->support_modify_argument) {
		caps->log_header_modify_argument_granularity =
			DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.log_header_modify_argument_granularity);
		caps->log_header_modify_argument_max_alloc =
			DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.log_header_modify_argument_max_alloc);
	}

	if (caps->flex_protocols & MLX5_FLEX_PARSER_ICMP_V4_ENABLED) {
		caps->flex_parser_id_icmp_dw0 =
			DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.flex_parser_id_icmp_dw0);
		caps->flex_parser_id_icmp_dw1 =
			DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.flex_parser_id_icmp_dw1);
	}

	if (caps->flex_protocols & MLX5_FLEX_PARSER_ICMP_V6_ENABLED) {
		caps->flex_parser_id_icmpv6_dw0 =
			DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.flex_parser_id_icmpv6_dw0);
		caps->flex_parser_id_icmpv6_dw1 =
			DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.flex_parser_id_icmpv6_dw1);
	}

	if (caps->flex_protocols & MLX5_FLEX_PARSER_GENEVE_OPT_0_ENABLED)
		caps->flex_parser_id_geneve_opt_0 =
			DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.flex_parser_id_geneve_opt_0);

	if (caps->flex_protocols & MLX5_FLEX_PARSER_MPLS_OVER_GRE_ENABLED)
		caps->flex_parser_id_mpls_over_gre =
			DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.flex_parser_id_outer_first_mpls_over_gre);

	if (caps->flex_protocols & MLX5_FLEX_PARSER_MPLS_OVER_UDP_ENABLED)
		caps->flex_parser_id_mpls_over_udp =
			DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.flex_parser_id_outer_first_mpls_over_udp_label);

	if (caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_DW_0_ENABLED)
		caps->flex_parser_id_gtpu_dw_0 =
			DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.flex_parser_id_gtpu_dw_0);

	if (caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_TEID_ENABLED)
		caps->flex_parser_id_gtpu_teid =
			DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.flex_parser_id_gtpu_teid);

	if (caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_DW_2_ENABLED)
		caps->flex_parser_id_gtpu_dw_2 =
			DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.flex_parser_id_gtpu_dw_2);

	if (caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_FIRST_EXT_DW_0_ENABLED)
		caps->flex_parser_id_gtpu_first_ext_dw_0 =
			DEVX_GET(query_hca_cap_out, out,
				 capability.cmd_hca_cap.flex_parser_id_gtpu_first_ext_dw_0);

	DEVX_SET(query_hca_cap_in, in, op_mod,
		 MLX5_SET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE |
		 HCA_CAP_OPMOD_GET_CUR);

	err = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out));
	if (err) {
		err = mlx5_get_cmd_status_err(err, out);
		dr_dbg_ctx(ctx, "Query flow tables failed %d\n", err);
		return err;
	}

	caps->max_encap_size =
DEVX_GET(query_hca_cap_out, out, capability.flow_table_nic_cap.max_encap_header_size); caps->nic_rx_drop_address = DEVX_GET64(query_hca_cap_out, out, capability.flow_table_nic_cap. sw_steering_nic_rx_action_drop_icm_address); caps->nic_tx_drop_address = DEVX_GET64(query_hca_cap_out, out, capability.flow_table_nic_cap. sw_steering_nic_tx_action_drop_icm_address); caps->nic_tx_allow_address = DEVX_GET64(query_hca_cap_out, out, capability.flow_table_nic_cap. sw_steering_nic_tx_action_allow_icm_address); caps->rx_sw_owner_v2 = DEVX_GET(query_hca_cap_out, out, capability.flow_table_nic_cap. flow_table_properties_nic_receive.sw_owner_v2); caps->tx_sw_owner_v2 = DEVX_GET(query_hca_cap_out, out, capability.flow_table_nic_cap. flow_table_properties_nic_transmit.sw_owner_v2); if (!caps->rx_sw_owner_v2) caps->rx_sw_owner = DEVX_GET(query_hca_cap_out, out, capability.flow_table_nic_cap. flow_table_properties_nic_receive.sw_owner); if (!caps->tx_sw_owner_v2) caps->tx_sw_owner = DEVX_GET(query_hca_cap_out, out, capability.flow_table_nic_cap. flow_table_properties_nic_transmit.sw_owner); caps->max_ft_level = DEVX_GET(query_hca_cap_out, out, capability.flow_table_nic_cap. flow_table_properties_nic_receive.max_ft_level); /* l4_csum_ok is the indication for definer support csum and ok bits. * Since we don't have definer versions we rely on new field support */ caps->definer_supp_checksum = DEVX_GET(query_hca_cap_out, out, capability.flow_table_nic_cap. ft_field_bitmask_support_2_nic_receive. outer_l4_checksum_ok); /* geneve_tlv_option_0_exist is the indication for STE support for * lookup type flex_parser_ok. */ caps->flex_parser_ok_bits_supp = DEVX_GET(query_hca_cap_out, out, capability.flow_table_nic_cap. flow_table_properties_nic_receive. ft_field_support. geneve_tlv_option_0_exist); caps->support_full_tnl_hdr = (DEVX_GET(query_hca_cap_out, out, capability.flow_table_nic_cap. ft_field_bitmask_support_2_nic_receive. tunnel_header_0_1) && DEVX_GET(query_hca_cap_out, out, capability.flow_table_nic_cap. ft_field_bitmask_support_2_nic_receive. tunnel_header_2_3)); if (sf_supp && caps->eswitch_manager) { DEVX_SET(query_hca_cap_in, in, op_mod, MLX5_SET_HCA_CAP_OP_MOD_ESW | HCA_CAP_OPMOD_GET_CUR); err = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)); if (err) { err = mlx5_get_cmd_status_err(err, out); dr_dbg_ctx(ctx, "Query eswitch capabilities failed %d\n", err); return err; } max_sfs = 1 << DEVX_GET(query_hca_cap_out, out, capability.e_switch_cap.log_max_esw_sf); } if (caps->eswitch_manager) { /* Check if ECPF */ if (DEVX_GET(query_hca_cap_out, out, capability.e_switch_cap.esw_manager_vport_number_valid)) { if (DEVX_GET(query_hca_cap_out, out, capability.e_switch_cap.esw_manager_vport_number) == ECPF_PORT) caps->is_ecpf = true; } else { err = dr_devx_query_esw_func(ctx, max_sfs, &host_pf_vhca_id_valid, &host_pf_vhca_id); if (!err && host_pf_vhca_id_valid && host_pf_vhca_id != caps->gvmi) caps->is_ecpf = true; } } DEVX_SET(query_hca_cap_in, in, op_mod, MLX5_SET_HCA_CAP_OP_MOD_DEVICE_MEMORY | HCA_CAP_OPMOD_GET_CUR); err = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)); if (err) { err = mlx5_get_cmd_status_err(err, out); dr_dbg_ctx(ctx, "Query flow device memory caps failed %d\n", err); return err; } caps->log_icm_size = DEVX_GET(query_hca_cap_out, out, capability.device_mem_cap.log_steering_sw_icm_size); caps->hdr_modify_icm_addr = DEVX_GET64(query_hca_cap_out, out, capability.device_mem_cap. 
header_modify_sw_icm_start_address); caps->log_modify_hdr_icm_size = DEVX_GET(query_hca_cap_out, out, capability.device_mem_cap.log_header_modify_sw_icm_size); caps->log_modify_pattern_icm_size = DEVX_GET(query_hca_cap_out, out, capability.device_mem_cap.log_header_modify_pattern_sw_icm_size); caps->hdr_modify_pattern_icm_addr = DEVX_GET64(query_hca_cap_out, out, capability.device_mem_cap.header_modify_pattern_sw_icm_start_address); caps->log_sw_encap_icm_size = DEVX_GET(query_hca_cap_out, out, capability.device_mem_cap.log_indirect_encap_sw_icm_size); if (caps->log_sw_encap_icm_size) caps->indirect_encap_icm_base = DEVX_GET64(query_hca_cap_out, out, capability.device_mem_cap.indirect_encap_icm_base); /* RoCE caps */ if (roce) { err = dr_devx_query_nic_vport_context(ctx, &caps->roce_caps.roce_en); if (err) return err; DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP); DEVX_SET(query_hca_cap_in, in, op_mod, MLX5_SET_HCA_CAP_OP_MOD_ROCE | HCA_CAP_OPMOD_GET_CUR); err = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)); if (err) { err = mlx5_get_cmd_status_err(err, out); dr_dbg_ctx(ctx, "Query RoCE capabilities failed %d\n", err); return err; } caps->roce_caps.fl_rc_qp_when_roce_disabled |= DEVX_GET(query_hca_cap_out, out, capability.roce_caps.fl_rc_qp_when_roce_disabled); caps->roce_caps.fl_rc_qp_when_roce_enabled = DEVX_GET(query_hca_cap_out, out, capability.roce_caps.fl_rc_qp_when_roce_enabled); caps->roce_caps.qp_ts_format = DEVX_GET(query_hca_cap_out, out, capability.roce_caps.qp_ts_format); } return 0; } int dr_devx_sync_steering(struct ibv_context *ctx) { uint32_t out[DEVX_ST_SZ_DW(sync_steering_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(sync_steering_in)] = {}; int err; DEVX_SET(sync_steering_in, in, opcode, MLX5_CMD_OP_SYNC_STEERING); err = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)); if (err) { err = mlx5_get_cmd_status_err(err, out); dr_dbg_ctx(ctx, "Sync steering failed %d\n", err); } return err; } struct mlx5dv_devx_obj * dr_devx_create_flow_table(struct ibv_context *ctx, struct dr_devx_flow_table_attr *ft_attr) { uint32_t out[DEVX_ST_SZ_DW(create_flow_table_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(create_flow_table_in)] = {}; struct mlx5dv_devx_obj *obj; void *ft_ctx; DEVX_SET(create_flow_table_in, in, opcode, MLX5_CMD_OP_CREATE_FLOW_TABLE); DEVX_SET(create_flow_table_in, in, table_type, ft_attr->type); ft_ctx = DEVX_ADDR_OF(create_flow_table_in, in, flow_table_context); DEVX_SET(flow_table_context, ft_ctx, termination_table, ft_attr->term_tbl); DEVX_SET(flow_table_context, ft_ctx, sw_owner, ft_attr->sw_owner); DEVX_SET(flow_table_context, ft_ctx, level, ft_attr->level); DEVX_SET(flow_table_context, ft_ctx, reformat_en, ft_attr->reformat_en); if (ft_attr->sw_owner) { /* icm_addr_0 used for FDB RX / NIC TX / NIC_RX * icm_addr_1 used for FDB TX */ if (ft_attr->type == FS_FT_NIC_RX) { DEVX_SET64(flow_table_context, ft_ctx, sw_owner_icm_root_0, ft_attr->icm_addr_rx); } else if (ft_attr->type == FS_FT_NIC_TX) { DEVX_SET64(flow_table_context, ft_ctx, sw_owner_icm_root_0, ft_attr->icm_addr_tx); } else if (ft_attr->type == FS_FT_FDB) { DEVX_SET64(flow_table_context, ft_ctx, sw_owner_icm_root_0, ft_attr->icm_addr_rx); DEVX_SET64(flow_table_context, ft_ctx, sw_owner_icm_root_1, ft_attr->icm_addr_tx); } else { assert(false); } } obj = mlx5dv_devx_obj_create(ctx, in, sizeof(in), out, sizeof(out)); if (!obj) errno = mlx5_get_cmd_status_err(errno, out); return obj; } int dr_devx_query_flow_table(struct mlx5dv_devx_obj *obj, uint32_t type, uint64_t 
*rx_icm_addr, uint64_t *tx_icm_addr) { uint32_t out[DEVX_ST_SZ_DW(query_flow_table_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(query_flow_table_in)] = {}; int ret; DEVX_SET(query_flow_table_in, in, opcode, MLX5_CMD_OP_QUERY_FLOW_TABLE); DEVX_SET(query_flow_table_in, in, table_type, type); DEVX_SET(query_flow_table_in, in, table_id, obj->object_id); ret = mlx5dv_devx_obj_query(obj, in, sizeof(in), out, sizeof(out)); if (ret) { dr_dbg_ctx(obj->context, "Failed to query flow table id %u\n", obj->object_id); return mlx5_get_cmd_status_err(ret, out); } switch (type) { case FS_FT_NIC_TX: *tx_icm_addr = DEVX_GET64(query_flow_table_out, out, flow_table_context.sw_owner_icm_root_0); *rx_icm_addr = 0; break; case FS_FT_NIC_RX: *rx_icm_addr = DEVX_GET64(query_flow_table_out, out, flow_table_context.sw_owner_icm_root_0); *tx_icm_addr = 0; break; case FS_FT_FDB: *rx_icm_addr = DEVX_GET64(query_flow_table_out, out, flow_table_context.sw_owner_icm_root_0); *tx_icm_addr = DEVX_GET64(query_flow_table_out, out, flow_table_context.sw_owner_icm_root_1); break; default: errno = EINVAL; return errno; } return 0; } static struct mlx5dv_devx_obj * dr_devx_create_flow_group(struct ibv_context *ctx, struct dr_devx_flow_group_attr *fg_attr) { uint32_t out[DEVX_ST_SZ_DW(create_flow_group_out)] = {}; uint32_t inlen = DEVX_ST_SZ_BYTES(create_flow_group_in); struct mlx5dv_devx_obj *obj; uint32_t *in; in = calloc(1, inlen); if (!in) { errno = ENOMEM; return NULL; } DEVX_SET(create_flow_group_in, in, opcode, MLX5_CMD_OP_CREATE_FLOW_GROUP); DEVX_SET(create_flow_group_in, in, table_type, fg_attr->table_type); DEVX_SET(create_flow_group_in, in, table_id, fg_attr->table_id); obj = mlx5dv_devx_obj_create(ctx, in, inlen, out, sizeof(out)); if (!obj) errno = mlx5_get_cmd_status_err(errno, out); free(in); return obj; } static struct mlx5dv_devx_obj * dr_devx_set_fte(struct ibv_context *ctx, struct dr_devx_flow_fte_attr *fte_attr) { uint32_t out[DEVX_ST_SZ_DW(set_fte_out)] = {}; struct mlx5dv_devx_obj *obj; uint32_t dest_entry_size; void *in_flow_context; uint32_t list_size; uint8_t *in_dests; uint32_t inlen; uint32_t *in; uint32_t i; if (fte_attr->extended_dest) dest_entry_size = DEVX_ST_SZ_BYTES(extended_dest_format); else dest_entry_size = DEVX_ST_SZ_BYTES(dest_format); inlen = DEVX_ST_SZ_BYTES(set_fte_in) + fte_attr->dest_size * dest_entry_size; in = calloc(1, inlen); if (!in) { errno = ENOMEM; return NULL; } DEVX_SET(set_fte_in, in, opcode, MLX5_CMD_OP_SET_FLOW_TABLE_ENTRY); DEVX_SET(set_fte_in, in, table_type, fte_attr->table_type); DEVX_SET(set_fte_in, in, table_id, fte_attr->table_id); in_flow_context = DEVX_ADDR_OF(set_fte_in, in, flow_context); DEVX_SET(flow_context, in_flow_context, group_id, fte_attr->group_id); DEVX_SET(flow_context, in_flow_context, flow_tag, fte_attr->flow_tag); DEVX_SET(flow_context, in_flow_context, action, fte_attr->action); DEVX_SET(flow_context, in_flow_context, extended_destination, fte_attr->extended_dest); in_dests = DEVX_ADDR_OF(flow_context, in_flow_context, destination); if (fte_attr->action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) { list_size = 0; for (i = 0; i < fte_attr->dest_size; i++) { uint32_t id; uint32_t type = fte_attr->dest_arr[i].type; if (type == MLX5_FLOW_DEST_TYPE_COUNTER) continue; switch (type) { case MLX5_FLOW_DEST_TYPE_VPORT: id = fte_attr->dest_arr[i].vport_num; break; case MLX5_FLOW_DEST_TYPE_TIR: id = fte_attr->dest_arr[i].tir_num; break; case MLX5_FLOW_DEST_TYPE_FT: id = fte_attr->dest_arr[i].ft_id; break; default: errno = EOPNOTSUPP; goto err_out; } DEVX_SET(dest_format, 
in_dests, destination_type, type); DEVX_SET(dest_format, in_dests, destination_id, id); if (fte_attr->dest_arr[i].has_reformat) { if (!fte_attr->extended_dest) { errno = EINVAL; goto err_out; } DEVX_SET(dest_format, in_dests, packet_reformat, 1); DEVX_SET(extended_dest_format, in_dests, packet_reformat_id, fte_attr->dest_arr[i].reformat_id); } in_dests += dest_entry_size; list_size++; } DEVX_SET(flow_context, in_flow_context, destination_list_size, list_size); } if (fte_attr->action & MLX5_FLOW_CONTEXT_ACTION_COUNT) { list_size = 0; for (i = 0; i < fte_attr->dest_size; i++) { if (fte_attr->dest_arr[i].type != MLX5_FLOW_DEST_TYPE_COUNTER) continue; DEVX_SET(flow_counter_list, in_dests, flow_counter_id, fte_attr->dest_arr[i].counter_id); in_dests += dest_entry_size; list_size++; } DEVX_SET(flow_context, in_flow_context, flow_counter_list_size, list_size); } obj = mlx5dv_devx_obj_create(ctx, in, inlen, out, sizeof(out)); if (!obj) errno = mlx5_get_cmd_status_err(errno, out); free(in); return obj; err_out: free(in); return NULL; } struct dr_devx_tbl * dr_devx_create_always_hit_ft(struct ibv_context *ctx, struct dr_devx_flow_table_attr *ft_attr, struct dr_devx_flow_group_attr *fg_attr, struct dr_devx_flow_fte_attr *fte_attr) { struct mlx5dv_devx_obj *fte_dvo; struct mlx5dv_devx_obj *fg_dvo; struct mlx5dv_devx_obj *ft_dvo; struct dr_devx_tbl *tbl; tbl = calloc(1, sizeof(*tbl)); if (!tbl) { errno = ENOMEM; return NULL; } ft_dvo = dr_devx_create_flow_table(ctx, ft_attr); if (!ft_dvo) goto free_tbl; fg_attr->table_id = ft_dvo->object_id; fg_attr->table_type = ft_attr->type; fg_dvo = dr_devx_create_flow_group(ctx, fg_attr); if (!fg_dvo) goto free_ft_dvo; fte_attr->table_id = ft_dvo->object_id; fte_attr->table_type = ft_attr->type; fte_attr->group_id = fg_dvo->object_id; fte_dvo = dr_devx_set_fte(ctx, fte_attr); if (!fte_dvo) goto free_fg_dvo; tbl->type = ft_attr->type; tbl->level = ft_attr->level; tbl->ft_dvo = ft_dvo; tbl->fg_dvo = fg_dvo; tbl->fte_dvo = fte_dvo; return tbl; free_fg_dvo: mlx5dv_devx_obj_destroy(fg_dvo); free_ft_dvo: mlx5dv_devx_obj_destroy(ft_dvo); free_tbl: free(tbl); return NULL; } void dr_devx_destroy_always_hit_ft(struct dr_devx_tbl *devx_tbl) { mlx5dv_devx_obj_destroy(devx_tbl->fte_dvo); mlx5dv_devx_obj_destroy(devx_tbl->fg_dvo); mlx5dv_devx_obj_destroy(devx_tbl->ft_dvo); free(devx_tbl); } struct mlx5dv_devx_obj * dr_devx_create_flow_sampler(struct ibv_context *ctx, struct dr_devx_flow_sampler_attr *sampler_attr) { uint32_t out[DEVX_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; uint32_t in[DEVX_ST_SZ_DW(create_flow_sampler_in)] = {}; struct mlx5dv_devx_obj *obj; void *attr; attr = DEVX_ADDR_OF(create_flow_sampler_in, in, hdr); DEVX_SET(general_obj_in_cmd_hdr, attr, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); DEVX_SET(general_obj_in_cmd_hdr, attr, obj_type, MLX5_OBJ_TYPE_FLOW_SAMPLER); attr = DEVX_ADDR_OF(create_flow_sampler_in, in, sampler); DEVX_SET(flow_sampler, attr, table_type, sampler_attr->table_type); DEVX_SET(flow_sampler, attr, level, sampler_attr->level); DEVX_SET(flow_sampler, attr, sample_ratio, sampler_attr->sample_ratio); DEVX_SET(flow_sampler, attr, ignore_flow_level, sampler_attr->ignore_flow_level); DEVX_SET(flow_sampler, attr, default_table_id, sampler_attr->default_next_table_id); DEVX_SET(flow_sampler, attr, sample_table_id, sampler_attr->sample_table_id); obj = mlx5dv_devx_obj_create(ctx, in, sizeof(in), out, sizeof(out)); if (!obj) errno = mlx5_get_cmd_status_err(errno, out); return obj; } int dr_devx_query_flow_sampler(struct mlx5dv_devx_obj *obj, uint64_t 
*rx_icm_addr, uint64_t *tx_icm_addr) { uint32_t out[DEVX_ST_SZ_DW(query_flow_sampler_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(general_obj_in_cmd_hdr)] = {}; void *attr; int ret; DEVX_SET(general_obj_in_cmd_hdr, in, opcode, MLX5_CMD_OP_QUERY_GENERAL_OBJECT); DEVX_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_FLOW_SAMPLER); DEVX_SET(general_obj_in_cmd_hdr, in, obj_id, obj->object_id); ret = mlx5dv_devx_obj_query(obj, in, sizeof(in), out, sizeof(out)); if (ret) { dr_dbg_ctx(obj->context, "Failed to query flow sampler id %u\n", obj->object_id); return mlx5_get_cmd_status_err(ret, out); } attr = DEVX_ADDR_OF(query_flow_sampler_out, out, obj); *rx_icm_addr = DEVX_GET64(flow_sampler, attr, sw_steering_icm_address_rx); *tx_icm_addr = DEVX_GET64(flow_sampler, attr, sw_steering_icm_address_tx); return 0; } struct mlx5dv_devx_obj *dr_devx_create_definer(struct ibv_context *ctx, uint16_t format_id, uint8_t *match_mask) { uint32_t out[DEVX_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; uint32_t in[DEVX_ST_SZ_DW(create_definer_in)] = {}; struct mlx5dv_devx_obj *obj; void *ptr; DEVX_SET(general_obj_in_cmd_hdr, in, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); DEVX_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_MATCH_DEFINER); ptr = DEVX_ADDR_OF(create_definer_in, in, definer); DEVX_SET(definer, ptr, format_id, format_id); ptr = DEVX_ADDR_OF(definer, ptr, match_mask_dw_7_0); memcpy(ptr, match_mask, DEVX_FLD_SZ_BYTES(definer, match_mask_dw_7_0)); obj = mlx5dv_devx_obj_create(ctx, in, sizeof(in), out, sizeof(out)); if (!obj) errno = mlx5_get_cmd_status_err(errno, out); return obj; } struct mlx5dv_devx_obj *dr_devx_create_reformat_ctx(struct ibv_context *ctx, enum reformat_type rt, size_t reformat_size, void *reformat_data) { uint32_t out[DEVX_ST_SZ_DW(alloc_packet_reformat_context_out)] = {}; size_t insz, cmd_data_sz, cmd_total_sz; struct mlx5dv_devx_obj *obj; void *prctx; void *pdata; void *in; cmd_total_sz = DEVX_ST_SZ_BYTES(alloc_packet_reformat_context_in); cmd_data_sz = DEVX_FLD_SZ_BYTES(alloc_packet_reformat_context_in, packet_reformat_context.reformat_data); insz = align(cmd_total_sz + reformat_size - cmd_data_sz, 4); in = calloc(1, insz); if (!in) { errno = ENOMEM; return NULL; } DEVX_SET(alloc_packet_reformat_context_in, in, opcode, MLX5_CMD_OP_ALLOC_PACKET_REFORMAT_CONTEXT); prctx = DEVX_ADDR_OF(alloc_packet_reformat_context_in, in, packet_reformat_context); pdata = DEVX_ADDR_OF(packet_reformat_context_in, prctx, reformat_data); DEVX_SET(packet_reformat_context_in, prctx, reformat_type, rt); DEVX_SET(packet_reformat_context_in, prctx, reformat_data_size, reformat_size); memcpy(pdata, reformat_data, reformat_size); obj = mlx5dv_devx_obj_create(ctx, in, insz, out, sizeof(out)); if (!obj) errno = mlx5_get_cmd_status_err(errno, out); free(in); return obj; } struct mlx5dv_devx_obj *dr_devx_create_meter(struct ibv_context *ctx, struct mlx5dv_dr_flow_meter_attr *meter_attr) { uint32_t out[DEVX_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; uint32_t in[DEVX_ST_SZ_DW(create_flow_meter_in)] = {}; struct mlx5dv_devx_obj *obj; void *attr; if (meter_attr->flow_meter_parameter_sz > DEVX_FLD_SZ_BYTES(flow_meter, flow_meter_params)) { errno = EINVAL; return NULL; } attr = DEVX_ADDR_OF(create_flow_meter_in, in, hdr); DEVX_SET(general_obj_in_cmd_hdr, attr, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); DEVX_SET(general_obj_in_cmd_hdr, attr, obj_type, MLX5_OBJ_TYPE_FLOW_METER); attr = DEVX_ADDR_OF(create_flow_meter_in, in, meter); DEVX_SET(flow_meter, attr, active, meter_attr->active); DEVX_SET(flow_meter, attr, 
return_reg_id, meter_attr->reg_c_index); DEVX_SET(flow_meter, attr, table_type, meter_attr->next_table->table_type); DEVX_SET(flow_meter, attr, destination_table_id, meter_attr->next_table->devx_obj->object_id); attr = DEVX_ADDR_OF(flow_meter, attr, flow_meter_params); memcpy(attr, meter_attr->flow_meter_parameter, meter_attr->flow_meter_parameter_sz); obj = mlx5dv_devx_obj_create(ctx, in, sizeof(in), out, sizeof(out)); if (!obj) errno = mlx5_get_cmd_status_err(errno, out); return obj; } int dr_devx_query_meter(struct mlx5dv_devx_obj *obj, uint64_t *rx_icm_addr, uint64_t *tx_icm_addr) { uint32_t in[DEVX_ST_SZ_DW(general_obj_in_cmd_hdr)] = {}; uint32_t out[DEVX_ST_SZ_DW(query_flow_meter_out)] = {}; void *attr; int ret; DEVX_SET(general_obj_in_cmd_hdr, in, opcode, MLX5_CMD_OP_QUERY_GENERAL_OBJECT); DEVX_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_FLOW_METER); DEVX_SET(general_obj_in_cmd_hdr, in, obj_id, obj->object_id); ret = mlx5dv_devx_obj_query(obj, in, sizeof(in), out, sizeof(out)); if (ret) { dr_dbg_ctx(obj->context, "Failed to query flow meter id %u\n", obj->object_id); return mlx5_get_cmd_status_err(ret, out); } attr = DEVX_ADDR_OF(query_flow_meter_out, out, obj); *rx_icm_addr = DEVX_GET64(flow_meter, attr, sw_steering_icm_address_rx); *tx_icm_addr = DEVX_GET64(flow_meter, attr, sw_steering_icm_address_tx); return 0; } int dr_devx_modify_meter(struct mlx5dv_devx_obj *obj, struct mlx5dv_dr_flow_meter_attr *meter_attr, __be64 modify_bits) { uint32_t out[DEVX_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; uint32_t in[DEVX_ST_SZ_DW(create_flow_meter_in)] = {}; void *attr; int ret; if (meter_attr->flow_meter_parameter_sz > DEVX_FLD_SZ_BYTES(flow_meter, flow_meter_params)) { errno = EINVAL; return errno; } attr = DEVX_ADDR_OF(create_flow_meter_in, in, hdr); DEVX_SET(general_obj_in_cmd_hdr, attr, opcode, MLX5_CMD_OP_MODIFY_GENERAL_OBJECT); DEVX_SET(general_obj_in_cmd_hdr, attr, obj_type, MLX5_OBJ_TYPE_FLOW_METER); DEVX_SET(general_obj_in_cmd_hdr, in, obj_id, obj->object_id); attr = DEVX_ADDR_OF(create_flow_meter_in, in, meter); memcpy(DEVX_ADDR_OF(flow_meter, attr, modify_field_select), &modify_bits, sizeof(modify_bits)); DEVX_SET(flow_meter, attr, active, meter_attr->active); attr = DEVX_ADDR_OF(flow_meter, attr, flow_meter_params); memcpy(attr, meter_attr->flow_meter_parameter, meter_attr->flow_meter_parameter_sz); ret = mlx5dv_devx_obj_modify(obj, in, sizeof(in), out, sizeof(out)); return ret ? 
mlx5_get_cmd_status_err(ret, out) : 0; } struct mlx5dv_devx_obj *dr_devx_create_qp(struct ibv_context *ctx, struct dr_devx_qp_create_attr *attr) { uint32_t in[DEVX_ST_SZ_DW(create_qp_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(create_qp_out)] = {}; void *qpc = DEVX_ADDR_OF(create_qp_in, in, qpc); struct mlx5dv_devx_obj *obj; DEVX_SET(create_qp_in, in, opcode, MLX5_CMD_OP_CREATE_QP); DEVX_SET(qpc, qpc, st, attr->service_type); DEVX_SET(qpc, qpc, pm_state, attr->pm_state); DEVX_SET(qpc, qpc, pd, attr->pdn); DEVX_SET(qpc, qpc, uar_page, attr->page_id); DEVX_SET(qpc, qpc, cqn_snd, attr->cqn); DEVX_SET(qpc, qpc, cqn_rcv, attr->cqn); DEVX_SET(qpc, qpc, log_sq_size, ilog32(attr->sq_wqe_cnt - 1)); DEVX_SET(qpc, qpc, log_rq_stride, attr->rq_wqe_shift - 4); DEVX_SET(qpc, qpc, log_rq_size, ilog32(attr->rq_wqe_cnt - 1)); DEVX_SET(qpc, qpc, dbr_umem_id, attr->db_umem_id); DEVX_SET(qpc, qpc, isolate_vl_tc, attr->isolate_vl_tc); DEVX_SET(qpc, qpc, ts_format, attr->qp_ts_format); DEVX_SET(create_qp_in, in, wq_umem_id, attr->buff_umem_id); obj = mlx5dv_devx_obj_create(ctx, in, sizeof(in), out, sizeof(out)); if (!obj) errno = mlx5_get_cmd_status_err(errno, out); return obj; } int dr_devx_modify_qp_rst2init(struct ibv_context *ctx, struct mlx5dv_devx_obj *qp_obj, uint16_t port) { uint32_t in[DEVX_ST_SZ_DW(rst2init_qp_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(rst2init_qp_out)] = {}; void *qpc = DEVX_ADDR_OF(rst2init_qp_in, in, qpc); int ret; DEVX_SET(rst2init_qp_in, in, opcode, MLX5_CMD_OP_RST2INIT_QP); DEVX_SET(rst2init_qp_in, in, qpn, qp_obj->object_id); DEVX_SET(qpc, qpc, primary_address_path.vhca_port_num, port); DEVX_SET(qpc, qpc, pm_state, MLX5_QPC_PM_STATE_MIGRATED); DEVX_SET(qpc, qpc, rre, 1); DEVX_SET(qpc, qpc, rwe, 1); ret = mlx5dv_devx_obj_modify(qp_obj, in, sizeof(in), out, sizeof(out)); return ret ? mlx5_get_cmd_status_err(ret, out) : 0; } #define DR_DEVX_ICM_UDP_PORT 49999 int dr_devx_modify_qp_init2rtr(struct ibv_context *ctx, struct mlx5dv_devx_obj *qp_obj, struct dr_devx_qp_rtr_attr *attr) { uint32_t in[DEVX_ST_SZ_DW(init2rtr_qp_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(init2rtr_qp_out)] = {}; void *qpc = DEVX_ADDR_OF(init2rtr_qp_in, in, qpc); int ret; DEVX_SET(init2rtr_qp_in, in, opcode, MLX5_CMD_OP_INIT2RTR_QP); DEVX_SET(init2rtr_qp_in, in, qpn, qp_obj->object_id); DEVX_SET(qpc, qpc, mtu, attr->mtu); DEVX_SET(qpc, qpc, log_msg_max, DR_CHUNK_SIZE_MAX - 1); DEVX_SET(qpc, qpc, remote_qpn, attr->qp_num); if (attr->fl) { DEVX_SET(qpc, qpc, primary_address_path.fl, attr->fl); } else { memcpy(DEVX_ADDR_OF(qpc, qpc, primary_address_path.rmac_47_32), attr->dgid_attr.mac, sizeof(attr->dgid_attr.mac)); memcpy(DEVX_ADDR_OF(qpc, qpc, primary_address_path.rgid_rip), attr->dgid_attr.gid.raw, sizeof(attr->dgid_attr.gid.raw)); DEVX_SET(qpc, qpc, primary_address_path.src_addr_index, attr->sgid_index); if (attr->dgid_attr.roce_ver == MLX5_ROCE_VERSION_2) DEVX_SET(qpc, qpc, primary_address_path.udp_sport, DR_DEVX_ICM_UDP_PORT); } DEVX_SET(qpc, qpc, primary_address_path.vhca_port_num, attr->port_num); DEVX_SET(qpc, qpc, min_rnr_nak, 1); ret = mlx5dv_devx_obj_modify(qp_obj, in, sizeof(in), out, sizeof(out)); return ret ? 
mlx5_get_cmd_status_err(ret, out) : 0; } int dr_devx_modify_qp_rtr2rts(struct ibv_context *ctx, struct mlx5dv_devx_obj *qp_obj, struct dr_devx_qp_rts_attr *attr) { uint32_t in[DEVX_ST_SZ_DW(rtr2rts_qp_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(rtr2rts_qp_out)] = {}; void *qpc = DEVX_ADDR_OF(rtr2rts_qp_in, in, qpc); int ret; DEVX_SET(rtr2rts_qp_in, in, opcode, MLX5_CMD_OP_RTR2RTS_QP); DEVX_SET(rtr2rts_qp_in, in, qpn, qp_obj->object_id); DEVX_SET(qpc, qpc, log_ack_req_freq, 0); DEVX_SET(qpc, qpc, retry_count, attr->retry_cnt); DEVX_SET(qpc, qpc, rnr_retry, attr->rnr_retry); DEVX_SET(qpc, qpc, primary_address_path.ack_timeout, 0x8); /* ~1ms */ ret = mlx5dv_devx_obj_modify(qp_obj, in, sizeof(in), out, sizeof(out)); return ret ? mlx5_get_cmd_status_err(ret, out) : 0; } int dr_devx_query_gid(struct ibv_context *ctx, uint8_t vhca_port_num, uint16_t index, struct dr_gid_attr *attr) { uint32_t out[DEVX_ST_SZ_DW(query_roce_address_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(query_roce_address_in)] = {}; int ret; DEVX_SET(query_roce_address_in, in, opcode, MLX5_CMD_OP_QUERY_ROCE_ADDRESS); DEVX_SET(query_roce_address_in, in, roce_address_index, index); DEVX_SET(query_roce_address_in, in, vhca_port_num, vhca_port_num); ret = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)); if (ret) return mlx5_get_cmd_status_err(ret, out); memcpy(&attr->gid, DEVX_ADDR_OF(query_roce_address_out, out, roce_address.source_l3_address), sizeof(attr->gid)); memcpy(attr->mac, DEVX_ADDR_OF(query_roce_address_out, out, roce_address.source_mac_47_32), sizeof(attr->mac)); if (DEVX_GET(query_roce_address_out, out, roce_address.roce_version) == MLX5_ROCE_VERSION_2) attr->roce_ver = MLX5_ROCE_VERSION_2; else attr->roce_ver = MLX5_ROCE_VERSION_1; return 0; } struct mlx5dv_devx_obj *dr_devx_create_modify_header_arg(struct ibv_context *ctx, uint16_t log_obj_range, uint32_t pd) { uint32_t out[DEVX_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; uint32_t in[DEVX_ST_SZ_DW(create_modify_header_arg_in)] = {}; void *attr; attr = DEVX_ADDR_OF(create_modify_header_arg_in, in, hdr); DEVX_SET(general_obj_in_cmd_hdr, attr, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); DEVX_SET(general_obj_in_cmd_hdr, attr, obj_type, MLX5_OBJ_TYPE_HEADER_MODIFY_ARGUMENT); DEVX_SET(general_obj_in_cmd_hdr, attr, log_obj_range, log_obj_range); attr = DEVX_ADDR_OF(create_modify_header_arg_in, in, arg); DEVX_SET(modify_header_arg, attr, access_pd, pd); return mlx5dv_devx_obj_create(ctx, in, sizeof(in), out, sizeof(out)); } rdma-core-56.1/providers/mlx5/dr_domain.c000066400000000000000000000420661477342711600203440ustar00rootroot00000000000000/* * Copyright (c) 2019, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include "mlx5dv_dr.h" enum { MLX5DV_DR_DOMAIN_SYNC_SUP_FLAGS = (MLX5DV_DR_DOMAIN_SYNC_FLAGS_SW | MLX5DV_DR_DOMAIN_SYNC_FLAGS_HW | MLX5DV_DR_DOMAIN_SYNC_FLAGS_MEM), }; bool dr_domain_is_support_sw_encap(struct mlx5dv_dr_domain *dmn) { return !!dmn->info.caps.log_sw_encap_icm_size; } static int dr_domain_init_sw_encap_resources(struct mlx5dv_dr_domain *dmn) { if (!dr_domain_is_support_sw_encap(dmn)) return 0; dmn->encap_icm_pool = dr_icm_pool_create(dmn, DR_ICM_TYPE_ENCAP); if (!dmn->encap_icm_pool) { dr_dbg(dmn, "Couldn't get sw-encap icm memory for %s\n", ibv_get_device_name(dmn->ctx->device)); return errno; } return 0; } static void dr_domain_destroy_sw_encap_resources(struct mlx5dv_dr_domain *dmn) { if (!dr_domain_is_support_sw_encap(dmn)) return; dr_icm_pool_destroy(dmn->encap_icm_pool); } bool dr_domain_is_support_modify_hdr_cache(struct mlx5dv_dr_domain *dmn) { return dmn->info.caps.sw_format_ver >= MLX5_HW_CONNECTX_6DX && dmn->info.caps.support_modify_argument; } static int dr_domain_init_resources(struct mlx5dv_dr_domain *dmn) { struct mlx5dv_pd mlx5_pd = {}; struct mlx5dv_obj obj; int ret = -1; dmn->ste_ctx = dr_ste_get_ctx(dmn->info.caps.sw_format_ver); if (!dmn->ste_ctx) { dr_dbg(dmn, "Couldn't initialize STE context\n"); return errno; } dmn->pd = ibv_alloc_pd(dmn->ctx); if (!dmn->pd) { dr_dbg(dmn, "Couldn't allocate PD\n"); return ret; } obj.pd.in = dmn->pd; obj.pd.out = &mlx5_pd; ret = mlx5dv_init_obj(&obj, MLX5DV_OBJ_PD); if (ret) goto clean_pd; dmn->pd_num = mlx5_pd.pdn; dmn->uar = mlx5dv_devx_alloc_uar(dmn->ctx, MLX5_IB_UAPI_UAR_ALLOC_TYPE_NC); if (!dmn->uar) dmn->uar = mlx5dv_devx_alloc_uar(dmn->ctx, MLX5_IB_UAPI_UAR_ALLOC_TYPE_BF); if (!dmn->uar) { dr_dbg(dmn, "Can't allocate UAR\n"); goto clean_pd; } dmn->ste_icm_pool = dr_icm_pool_create(dmn, DR_ICM_TYPE_STE); if (!dmn->ste_icm_pool) { dr_dbg(dmn, "Couldn't get icm memory for %s\n", ibv_get_device_name(dmn->ctx->device)); goto clean_uar; } dmn->action_icm_pool = dr_icm_pool_create(dmn, DR_ICM_TYPE_MODIFY_ACTION); if (!dmn->action_icm_pool) { dr_dbg(dmn, "Couldn't get action icm memory for %s\n", ibv_get_device_name(dmn->ctx->device)); goto free_ste_icm_pool; } if (dr_domain_is_support_modify_hdr_cache(dmn)) { dmn->modify_header_ptrn_mngr = dr_ptrn_mngr_create(dmn); if (dmn->modify_header_ptrn_mngr) { dmn->modify_header_arg_mngr = dr_arg_mngr_create(dmn); if (!dmn->modify_header_arg_mngr) { dr_ptrn_mngr_destroy(dmn->modify_header_ptrn_mngr); dmn->modify_header_ptrn_mngr = NULL; } } } ret = dr_domain_init_sw_encap_resources(dmn); if (ret) { dr_dbg(dmn, "Couldn't create sw-encap resource for %s\n", ibv_get_device_name(dmn->ctx->device)); goto free_modify_header_ptrn_arg_mngr; } ret = dr_send_ring_alloc(dmn); if (ret) { dr_dbg(dmn, "Couldn't create send-ring for %s\n", ibv_get_device_name(dmn->ctx->device)); goto free_sw_encap_resources; } return 0; free_sw_encap_resources: dr_domain_destroy_sw_encap_resources(dmn); free_modify_header_ptrn_arg_mngr: dr_ptrn_mngr_destroy(dmn->modify_header_ptrn_mngr); 
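	/* Editor's note (inferred from the code, not stated upstream): this
	 * unwind label is reached when a later init step fails, and the
	 * pattern/argument managers may still be NULL here (the modify-header
	 * cache is optional), so the destroy helpers below are relied on to
	 * tolerate NULL - the same assumption dr_free_resources() makes by
	 * calling them unconditionally.
	 */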
dr_arg_mngr_destroy(dmn->modify_header_arg_mngr); dr_icm_pool_destroy(dmn->action_icm_pool); free_ste_icm_pool: dr_icm_pool_destroy(dmn->ste_icm_pool); clean_uar: mlx5dv_devx_free_uar(dmn->uar); clean_pd: ibv_dealloc_pd(dmn->pd); return ret; } static void dr_free_resources(struct mlx5dv_dr_domain *dmn) { dr_send_ring_free(dmn); dr_domain_destroy_sw_encap_resources(dmn); dr_ptrn_mngr_destroy(dmn->modify_header_ptrn_mngr); dr_arg_mngr_destroy(dmn->modify_header_arg_mngr); dr_icm_pool_destroy(dmn->action_icm_pool); dr_icm_pool_destroy(dmn->ste_icm_pool); mlx5dv_devx_free_uar(dmn->uar); ibv_dealloc_pd(dmn->pd); } static int dr_domain_vports_init(struct mlx5dv_dr_domain *dmn) { struct dr_devx_vports *vports = &dmn->info.caps.vports; int ret; ret = pthread_spin_init(&vports->lock, PTHREAD_PROCESS_PRIVATE); if (ret) { errno = ret; return ret; } vports->vports = dr_vports_table_create(dmn); if (!vports->vports) goto free_spin_lock; dr_vports_table_add_wire(vports); return 0; free_spin_lock: pthread_spin_destroy(&vports->lock); return errno; } static void dr_domain_vports_uninit(struct mlx5dv_dr_domain *dmn) { struct dr_devx_vports *vports = &dmn->info.caps.vports; if (vports->vports) { /* Wire port must be deleted before destroying vports table, * since it is not allocated dynamically but inserted to table * as such. */ dr_vports_table_del_wire(vports); dr_vports_table_destroy(vports->vports); vports->vports = NULL; } pthread_spin_destroy(&vports->lock); if (vports->ib_ports) free(vports->ib_ports); } static int dr_domain_query_esw_mgr(struct mlx5dv_dr_domain *dmn, struct dr_devx_vport_cap *esw_mngr) { int ret; /* Query E-Switch manager PF/ECPF */ ret = dr_devx_query_esw_vport_context(dmn->ctx, false, 0, &esw_mngr->icm_address_rx, &esw_mngr->icm_address_tx); if (ret) return ret; /* E-Switch manager gvmi and vhca id are the same */ esw_mngr->vhca_gvmi = dmn->info.caps.gvmi; esw_mngr->vport_gvmi = dmn->info.caps.gvmi; return 0; } static int dr_domain_query_and_set_ib_ports(struct mlx5dv_dr_domain *dmn) { struct dr_devx_vports *vports = &dmn->info.caps.vports; int i; vports->ib_ports = calloc(vports->num_ports, sizeof(struct dr_devx_vport_cap *)); if (!vports->ib_ports) { errno = ENOMEM; return errno; } /* Best effort to query available ib ports */ for (i = 1; i <= vports->num_ports; i++) dr_vports_table_get_ib_port_cap(&dmn->info.caps, i); return 0; } static int dr_domain_query_fdb_caps(struct ibv_context *ctx, struct mlx5dv_dr_domain *dmn) { struct dr_devx_vports *vports = &dmn->info.caps.vports; struct dr_esw_caps esw_caps = {}; int ret; if (!dmn->info.caps.eswitch_manager) return 0; ret = dr_domain_query_esw_mgr(dmn, &vports->esw_mngr); if (ret) return ret; ret = dr_devx_query_esw_caps(ctx, &esw_caps); if (ret) return ret; vports->num_ports = dmn->info.attr.phys_port_cnt_ex; /* Set uplink */ vports->wire.icm_address_rx = esw_caps.uplink_icm_address_rx; vports->wire.icm_address_tx = esw_caps.uplink_icm_address_tx; vports->wire.vhca_gvmi = vports->esw_mngr.vhca_gvmi; vports->wire.num = WIRE_PORT; /* Set FDB general caps */ dmn->info.caps.fdb_sw_owner = esw_caps.sw_owner; dmn->info.caps.fdb_sw_owner_v2 = esw_caps.sw_owner_v2; dmn->info.caps.esw_rx_drop_address = esw_caps.drop_icm_address_rx; dmn->info.caps.esw_tx_drop_address = esw_caps.drop_icm_address_tx; /* Query all ib ports if supported */ ret = dr_domain_query_and_set_ib_ports(dmn); if (ret) { dr_dbg(dmn, "Failed to query ib vports\n"); return ret; } return 0; } static bool dr_domain_caps_is_sw_owner_supported(bool sw_owner, bool sw_owner_v2, 
uint8_t sw_format_ver) { return sw_owner || (sw_owner_v2 && sw_format_ver <= MLX5_HW_CONNECTX_8); } static int dr_domain_caps_init(struct ibv_context *ctx, struct mlx5dv_dr_domain *dmn) { struct ibv_port_attr port_attr = {}; int ret; dmn->info.caps.dmn = dmn; ret = ibv_query_port(ctx, 1, &port_attr); if (ret) { dr_dbg(dmn, "Failed to query port\n"); return ret; } if (port_attr.link_layer != IBV_LINK_LAYER_ETHERNET) { dr_dbg(dmn, "Failed to allocate domain, bad link type\n"); errno = EOPNOTSUPP; return errno; } ret = ibv_query_device_ex(ctx, NULL, &dmn->info.attr); if (ret) return ret; ret = dr_devx_query_device(ctx, &dmn->info.caps); if (ret) /* Ignore devx query failure to allow steering on root level * tables in case devx is not supported over mlx5dv_dr API */ return 0; /* Non FDB type is supported over root table or when we can enable * force-loopback. */ if ((dmn->type != MLX5DV_DR_DOMAIN_TYPE_FDB) && !dr_send_allow_fl(&dmn->info.caps)) return 0; ret = dr_domain_vports_init(dmn); if (ret) return ret; ret = dr_domain_query_fdb_caps(ctx, dmn); if (ret) goto uninit_vports; switch (dmn->type) { case MLX5DV_DR_DOMAIN_TYPE_NIC_RX: if (!dr_domain_caps_is_sw_owner_supported(dmn->info.caps.rx_sw_owner, dmn->info.caps.rx_sw_owner_v2, dmn->info.caps.sw_format_ver)) return 0; dmn->info.supp_sw_steering = true; dmn->info.rx.type = DR_DOMAIN_NIC_TYPE_RX; dmn->info.rx.default_icm_addr = dmn->info.caps.nic_rx_drop_address; dmn->info.rx.drop_icm_addr = dmn->info.caps.nic_rx_drop_address; break; case MLX5DV_DR_DOMAIN_TYPE_NIC_TX: if (!dr_domain_caps_is_sw_owner_supported(dmn->info.caps.tx_sw_owner, dmn->info.caps.tx_sw_owner_v2, dmn->info.caps.sw_format_ver)) return 0; dmn->info.supp_sw_steering = true; dmn->info.tx.type = DR_DOMAIN_NIC_TYPE_TX; dmn->info.tx.default_icm_addr = dmn->info.caps.nic_tx_allow_address; dmn->info.tx.drop_icm_addr = dmn->info.caps.nic_tx_drop_address; break; case MLX5DV_DR_DOMAIN_TYPE_FDB: if (!dmn->info.caps.eswitch_manager) return 0; if (!dr_domain_caps_is_sw_owner_supported(dmn->info.caps.fdb_sw_owner, dmn->info.caps.fdb_sw_owner_v2, dmn->info.caps.sw_format_ver)) return 0; dmn->info.rx.type = DR_DOMAIN_NIC_TYPE_RX; dmn->info.tx.type = DR_DOMAIN_NIC_TYPE_TX; dmn->info.supp_sw_steering = true; dmn->info.tx.default_icm_addr = dmn->info.caps.vports.esw_mngr.icm_address_tx; dmn->info.rx.default_icm_addr = dmn->info.caps.vports.esw_mngr.icm_address_rx; dmn->info.rx.drop_icm_addr = dmn->info.caps.esw_rx_drop_address; dmn->info.tx.drop_icm_addr = dmn->info.caps.esw_tx_drop_address; break; default: dr_dbg(dmn, "Invalid domain\n"); ret = EINVAL; goto uninit_vports; } return ret; uninit_vports: dr_domain_vports_uninit(dmn); return ret; } static void dr_domain_caps_uninit(struct mlx5dv_dr_domain *dmn) { dr_domain_vports_uninit(dmn); } bool dr_domain_is_support_ste_icm_size(struct mlx5dv_dr_domain *dmn, uint32_t req_log_icm_sz) { if (dmn->info.caps.log_icm_size < req_log_icm_sz + DR_STE_LOG_SIZE) return false; return true; } bool dr_domain_set_max_ste_icm_size(struct mlx5dv_dr_domain *dmn, uint32_t req_log_icm_sz) { if (!dr_domain_is_support_ste_icm_size(dmn, req_log_icm_sz)) return false; if (dmn->info.max_log_sw_icm_sz < req_log_icm_sz) { dmn->info.max_log_sw_icm_sz = req_log_icm_sz; dr_icm_pool_set_pool_max_log_chunk_sz(dmn->ste_icm_pool, dmn->info.max_log_sw_icm_sz); } return true; } static int dr_domain_check_icm_memory_caps(struct mlx5dv_dr_domain *dmn) { uint32_t max_req_bytes_log, max_req_chunks_log; /* Check for minimum ICM log byte size requirements */ if 
(dmn->info.caps.log_modify_hdr_icm_size < DR_CHUNK_SIZE_4K + DR_MODIFY_ACTION_LOG_SIZE) { errno = ENOMEM; return errno; } if (dmn->info.caps.log_icm_size < DR_CHUNK_SIZE_1024K + DR_STE_LOG_SIZE) { errno = ENOMEM; return errno; } /* Current code tries to use large allocations to improve our internal * memory allocation (less DMs and less FW calls). * When creating multiple domains on the same PF, we want to make sure * we don't deplete all of the ICM resources on a single domain. * To provide some functionality with a limited resource we will use * up to 1/8 of the total available size allowing opening a domain * of each type. */ max_req_bytes_log = dmn->info.caps.log_modify_hdr_icm_size - 3; max_req_chunks_log = max_req_bytes_log - DR_MODIFY_ACTION_LOG_SIZE; dmn->info.max_log_action_icm_sz = min_t(uint32_t, DR_CHUNK_SIZE_1024K, max_req_chunks_log); max_req_bytes_log = dmn->info.caps.log_icm_size - 3; max_req_chunks_log = max_req_bytes_log - DR_STE_LOG_SIZE; dmn->info.max_log_sw_icm_sz = min_t(uint32_t, DR_CHUNK_SIZE_1024K, max_req_chunks_log); dmn->info.max_log_sw_icm_rehash_sz = dmn->info.max_log_sw_icm_sz; if (dmn->info.caps.sw_format_ver >= MLX5_HW_CONNECTX_6DX) { if (dmn->info.caps.log_modify_pattern_icm_size < DR_CHUNK_SIZE_4K + DR_MODIFY_ACTION_LOG_SIZE) { errno = ENOMEM; return errno; } dmn->info.max_log_modify_hdr_pattern_icm_sz = DR_CHUNK_SIZE_4K; } if (dr_domain_is_support_sw_encap(dmn)) { if (dmn->info.caps.log_sw_encap_icm_size < (DR_CHUNK_SIZE_4K + DR_SW_ENCAP_ENTRY_LOG_SIZE)) { errno = ENOMEM; return errno; } dmn->info.max_log_sw_encap_icm_sz = DR_CHUNK_SIZE_4K; } return 0; } struct mlx5dv_dr_domain * mlx5dv_dr_domain_create(struct ibv_context *ctx, enum mlx5dv_dr_domain_type type) { struct mlx5dv_dr_domain *dmn; int ret; if (type > MLX5DV_DR_DOMAIN_TYPE_FDB) { errno = EINVAL; return NULL; } dmn = calloc(1, sizeof(*dmn)); if (!dmn) { errno = ENOMEM; return NULL; } dmn->ctx = ctx; dmn->type = type; atomic_init(&dmn->refcount, 1); list_head_init(&dmn->tbl_list); ret = pthread_spin_init(&dmn->debug_lock, PTHREAD_PROCESS_PRIVATE); if (ret) { errno = ret; goto free_domain; } ret = dr_domain_nic_lock_init(&dmn->info.rx); if (ret) goto free_debug_lock; ret = dr_domain_nic_lock_init(&dmn->info.tx); if (ret) goto uninit_rx_locks; if (dr_domain_caps_init(ctx, dmn)) { dr_dbg(dmn, "Failed init domain, no caps\n"); goto uninit_tx_locks; } /* Allocate resources */ if (dmn->info.supp_sw_steering) { if (dr_domain_check_icm_memory_caps(dmn)) goto uninit_caps; ret = dr_domain_init_resources(dmn); if (ret) { dr_dbg(dmn, "Failed init domain resources for %s\n", ibv_get_device_name(ctx->device)); goto uninit_caps; } /* Init CRC table for htbl CRC calculation */ dr_crc32_init_table(); } return dmn; uninit_caps: dr_domain_caps_uninit(dmn); uninit_tx_locks: dr_domain_nic_lock_uninit(&dmn->info.tx); uninit_rx_locks: dr_domain_nic_lock_uninit(&dmn->info.rx); free_debug_lock: pthread_spin_destroy(&dmn->debug_lock); free_domain: free(dmn); return NULL; } /* * Assure synchronization of the device steering tables with updates made by SW * insertion. 
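 *
 * Accepted flags (may be OR-ed together; see MLX5DV_DR_DOMAIN_SYNC_SUP_FLAGS):
 *   MLX5DV_DR_DOMAIN_SYNC_FLAGS_SW  - drain the send ring so pending SW
 *                                     steering writes are flushed.
 *   MLX5DV_DR_DOMAIN_SYNC_FLAGS_HW  - issue a device SYNC_STEERING command.
 *   MLX5DV_DR_DOMAIN_SYNC_FLAGS_MEM - sync the STE/encap/action ICM pools and
 *                                     the modify-header pattern manager,
 *                                     releasing "hot" memory.
 *
 * e.g.: mlx5dv_dr_domain_sync(dmn, MLX5DV_DR_DOMAIN_SYNC_FLAGS_SW |
 *                                  MLX5DV_DR_DOMAIN_SYNC_FLAGS_HW);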
*/ int mlx5dv_dr_domain_sync(struct mlx5dv_dr_domain *dmn, uint32_t flags) { int ret = 0; if (!dmn->info.supp_sw_steering || !check_comp_mask(flags, MLX5DV_DR_DOMAIN_SYNC_SUP_FLAGS)) { errno = EOPNOTSUPP; return errno; } if (flags & MLX5DV_DR_DOMAIN_SYNC_FLAGS_SW) { ret = dr_send_ring_force_drain(dmn); if (ret) return ret; } if (flags & MLX5DV_DR_DOMAIN_SYNC_FLAGS_HW) { ret = dr_devx_sync_steering(dmn->ctx); if (ret) return ret; } if (flags & MLX5DV_DR_DOMAIN_SYNC_FLAGS_MEM) { if (dmn->ste_icm_pool) { ret = dr_icm_pool_sync_pool(dmn->ste_icm_pool); if (ret) return ret; } if (dmn->encap_icm_pool) { ret = dr_icm_pool_sync_pool(dmn->encap_icm_pool); if (ret) return ret; } if (dmn->action_icm_pool) { ret = dr_icm_pool_sync_pool(dmn->action_icm_pool); if (ret) return ret; } if (dmn->modify_header_ptrn_mngr) ret = dr_ptrn_sync_pool(dmn->modify_header_ptrn_mngr); } return ret; } void mlx5dv_dr_domain_set_reclaim_device_memory(struct mlx5dv_dr_domain *dmn, bool enable) { dr_domain_lock(dmn); if (enable) dmn->flags |= DR_DOMAIN_FLAG_MEMORY_RECLAIM; else dmn->flags &= ~DR_DOMAIN_FLAG_MEMORY_RECLAIM; dr_domain_unlock(dmn); } void mlx5dv_dr_domain_allow_duplicate_rules(struct mlx5dv_dr_domain *dmn, bool allow) { dr_domain_lock(dmn); if (allow) dmn->flags &= ~DR_DOMAIN_FLAG_DISABLE_DUPLICATE_RULES; else dmn->flags |= DR_DOMAIN_FLAG_DISABLE_DUPLICATE_RULES; dr_domain_unlock(dmn); } int mlx5dv_dr_domain_destroy(struct mlx5dv_dr_domain *dmn) { if (atomic_load(&dmn->refcount) > 1) return EBUSY; if (dmn->info.supp_sw_steering) { /* make sure resources are not used by the hardware */ dr_devx_sync_steering(dmn->ctx); dr_free_resources(dmn); } dr_domain_caps_uninit(dmn); dr_domain_nic_lock_uninit(&dmn->info.tx); dr_domain_nic_lock_uninit(&dmn->info.rx); pthread_spin_destroy(&dmn->debug_lock); free(dmn); return 0; } rdma-core-56.1/providers/mlx5/dr_icm_pool.c000066400000000000000000000371711477342711600206770ustar00rootroot00000000000000/* * Copyright (c) 2019, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include "mlx5dv_dr.h" #define DR_ICM_MODIFY_HDR_ALIGN_BASE 64 struct dr_icm_pool { enum dr_icm_type icm_type; struct mlx5dv_dr_domain *dmn; enum dr_icm_chunk_size max_log_chunk_sz; /* memory management */ pthread_spinlock_t lock; struct list_head buddy_mem_list; uint64_t hot_memory_size; bool syncing; size_t th; }; struct dr_icm_mr { struct ibv_mr *mr; struct ibv_dm *dm; uint64_t icm_start_addr; }; static int dr_icm_allocate_aligned_dm(struct dr_icm_pool *pool, struct dr_icm_mr *icm_mr, struct ibv_alloc_dm_attr *dm_attr, uint64_t *ofsset_in_dm) { struct mlx5dv_alloc_dm_attr mlx5_dm_attr = {}; size_t log_align_base = 0; bool fallback = false; struct mlx5_dm *dm; size_t size; /* create dm/mr for this pool */ size = dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz, pool->icm_type); switch (pool->icm_type) { case DR_ICM_TYPE_STE: mlx5_dm_attr.type = MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM; /* Align base is the biggest chunk size */ log_align_base = ilog32(size - 1); break; case DR_ICM_TYPE_MODIFY_ACTION: mlx5_dm_attr.type = MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM; /* Align base is 64B */ log_align_base = ilog32(DR_ICM_MODIFY_HDR_ALIGN_BASE - 1); break; case DR_ICM_TYPE_MODIFY_HDR_PTRN: mlx5_dm_attr.type = MLX5DV_DM_TYPE_HEADER_MODIFY_PATTERN_SW_ICM; /* Align base is 64B */ log_align_base = ilog32(DR_ICM_MODIFY_HDR_ALIGN_BASE - 1); break; case DR_ICM_TYPE_ENCAP: mlx5_dm_attr.type = MLX5_IB_UAPI_DM_TYPE_ENCAP_SW_ICM; log_align_base = DR_SW_ENCAP_ENTRY_LOG_SIZE; break; default: assert(false); errno = EINVAL; return errno; } dm_attr->length = size; *ofsset_in_dm = 0; alloc_dm: icm_mr->dm = mlx5dv_alloc_dm(pool->dmn->ctx, dm_attr, &mlx5_dm_attr); if (!icm_mr->dm) { dr_dbg(pool->dmn, "Failed allocating DM\n"); return errno; } dm = to_mdm(icm_mr->dm); icm_mr->icm_start_addr = dm->remote_va; if (icm_mr->icm_start_addr & ((1UL << log_align_base) - 1)) { uint64_t align_base; uint64_t align_diff; /* Fallback to previous implementation, ask for double size */ dr_dbg(pool->dmn, "Got not aligned memory: %zu last_try: %d\n", log_align_base, fallback); if (fallback) { align_base = 1UL << log_align_base; align_diff = icm_mr->icm_start_addr % align_base; /* increase the address to start from aligned size */ icm_mr->icm_start_addr = icm_mr->icm_start_addr + (align_base - align_diff); *ofsset_in_dm = align_base - align_diff; /* return the size to its original val */ dm_attr->length = size; return 0; } mlx5_free_dm(icm_mr->dm); /* retry to allocate, now double the size */ dm_attr->length = size * 2; fallback = true; goto alloc_dm; } return 0; } static struct dr_icm_mr * dr_icm_pool_mr_create(struct dr_icm_pool *pool) { struct ibv_alloc_dm_attr dm_attr = {}; uint64_t align_offset_in_dm; struct dr_icm_mr *icm_mr; icm_mr = calloc(1, sizeof(struct dr_icm_mr)); if (!icm_mr) { errno = ENOMEM; return NULL; } if (dr_icm_allocate_aligned_dm(pool, icm_mr, &dm_attr, &align_offset_in_dm)) goto free_icm_mr; /* Register device memory */ icm_mr->mr = ibv_reg_dm_mr(pool->dmn->pd, icm_mr->dm, align_offset_in_dm, dm_attr.length, IBV_ACCESS_ZERO_BASED | IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_READ); if (!icm_mr->mr) { dr_dbg(pool->dmn, "Failed DM registration\n"); goto free_dm; } return icm_mr; free_dm: mlx5_free_dm(icm_mr->dm); free_icm_mr: free(icm_mr); return NULL; } static void dr_icm_pool_mr_destroy(struct dr_icm_mr *icm_mr) { ibv_dereg_mr(icm_mr->mr); mlx5_free_dm(icm_mr->dm); free(icm_mr); } static enum dr_icm_type get_chunk_icm_type(struct dr_icm_chunk *chunk) { return 
chunk->buddy_mem->pool->icm_type; } static void dr_icm_chunk_ste_init(struct dr_icm_chunk *chunk, int offset) { struct dr_icm_buddy_mem *buddy = chunk->buddy_mem; int index = offset / DR_STE_SIZE; chunk->ste_arr = &buddy->ste_arr[index]; chunk->miss_list = &buddy->miss_list[index]; chunk->hw_ste_arr = buddy->hw_ste_arr + index * buddy->hw_ste_sz; } static void dr_icm_chunk_ste_cleanup(struct dr_icm_chunk *chunk) { struct dr_icm_buddy_mem *buddy = chunk->buddy_mem; memset(chunk->hw_ste_arr, 0, chunk->num_of_entries * buddy->hw_ste_sz); } static void dr_icm_chunk_destroy(struct dr_icm_chunk *chunk) { enum dr_icm_type icm_type = get_chunk_icm_type(chunk); list_del(&chunk->chunk_list); if (icm_type == DR_ICM_TYPE_STE) dr_icm_chunk_ste_cleanup(chunk); free(chunk); } static int dr_icm_buddy_init_ste_cache(struct dr_icm_buddy_mem *buddy) { struct dr_devx_caps *caps = &buddy->pool->dmn->info.caps; int num_of_entries = dr_icm_pool_chunk_size_to_entries(buddy->pool->max_log_chunk_sz); buddy->hw_ste_sz = caps->sw_format_ver == MLX5_HW_CONNECTX_5 ? DR_STE_SIZE_REDUCED : DR_STE_SIZE; buddy->ste_arr = calloc(num_of_entries, sizeof(struct dr_ste)); if (!buddy->ste_arr) { errno = ENOMEM; return ENOMEM; } buddy->hw_ste_arr = calloc(num_of_entries, buddy->hw_ste_sz); if (!buddy->hw_ste_arr) { errno = ENOMEM; goto free_ste_arr; } buddy->miss_list = malloc(num_of_entries * sizeof(struct list_head)); if (!buddy->miss_list) { errno = ENOMEM; goto free_hw_ste_arr; } return 0; free_hw_ste_arr: free(buddy->hw_ste_arr); free_ste_arr: free(buddy->ste_arr); return errno; } static void dr_icm_buddy_cleanup_ste_cache(struct dr_icm_buddy_mem *buddy) { free(buddy->ste_arr); free(buddy->hw_ste_arr); free(buddy->miss_list); } static int dr_icm_buddy_create(struct dr_icm_pool *pool) { struct dr_icm_buddy_mem *buddy; struct dr_icm_mr *icm_mr; icm_mr = dr_icm_pool_mr_create(pool); if (!icm_mr) return ENOMEM; buddy = calloc(1, sizeof(*buddy)); if (!buddy) { errno = ENOMEM; goto free_mr; } buddy->pool = pool; buddy->icm_mr = icm_mr; if (dr_buddy_init(buddy, pool->max_log_chunk_sz)) goto err_free_buddy; /* Reduce allocations by preallocating and reusing the STE structures */ if (pool->icm_type == DR_ICM_TYPE_STE && dr_icm_buddy_init_ste_cache(buddy)) goto err_cleanup_buddy; /* add it to the -start- of the list in order to search in it first */ list_add(&pool->buddy_mem_list, &buddy->list_node); pool->dmn->num_buddies[pool->icm_type]++; return 0; err_cleanup_buddy: dr_buddy_cleanup(buddy); err_free_buddy: free(buddy); free_mr: dr_icm_pool_mr_destroy(icm_mr); return errno; } static void dr_icm_buddy_destroy(struct dr_icm_buddy_mem *buddy) { struct dr_icm_chunk *chunk, *next; list_for_each_safe(&buddy->hot_list, chunk, next, chunk_list) dr_icm_chunk_destroy(chunk); list_for_each_safe(&buddy->used_list, chunk, next, chunk_list) dr_icm_chunk_destroy(chunk); dr_icm_pool_mr_destroy(buddy->icm_mr); dr_buddy_cleanup(buddy); buddy->pool->dmn->num_buddies[buddy->pool->icm_type]--; if (buddy->pool->icm_type == DR_ICM_TYPE_STE) dr_icm_buddy_cleanup_ste_cache(buddy); free(buddy); } static struct dr_icm_chunk * dr_icm_chunk_create(struct dr_icm_pool *pool, enum dr_icm_chunk_size chunk_size, struct dr_icm_buddy_mem *buddy_mem_pool, int seg) { struct dr_icm_chunk *chunk; int offset; chunk = calloc(1, sizeof(struct dr_icm_chunk)); if (!chunk) { errno = ENOMEM; return NULL; } offset = dr_icm_pool_dm_type_to_entry_size(pool->icm_type) * seg; chunk->buddy_mem = buddy_mem_pool; chunk->num_of_entries = dr_icm_pool_chunk_size_to_entries(chunk_size); 
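	/* Editor's note: chunk_size is a buddy order, i.e. the chunk covers
	 * 2^chunk_size entries starting at entry index 'seg' inside the buddy.
	 * The same order is recovered on free via
	 * ilog32(chunk->num_of_entries - 1) in dr_icm_pool_sync_pool_buddies().
	 */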
chunk->byte_size = dr_icm_pool_chunk_size_to_byte(chunk_size, pool->icm_type); chunk->seg = seg; if (pool->icm_type == DR_ICM_TYPE_STE) dr_icm_chunk_ste_init(chunk, offset); buddy_mem_pool->used_memory += chunk->byte_size; list_node_init(&chunk->chunk_list); /* chunk now is part of the used_list */ list_add_tail(&buddy_mem_pool->used_list, &chunk->chunk_list); return chunk; } static bool dr_icm_pool_is_sync_required(struct dr_icm_pool *pool) { if (pool->hot_memory_size >= pool->th) return true; return false; } /* In order to gain performance FW command is done out of the lock */ static int dr_icm_pool_sync_pool_buddies(struct dr_icm_pool *pool) { struct dr_icm_buddy_mem *buddy, *tmp_buddy; struct dr_icm_chunk *chunk, *tmp_chunk; struct list_head sync_list; bool need_reclaim = false; int err; list_head_init(&sync_list); list_for_each_safe(&pool->buddy_mem_list, buddy, tmp_buddy, list_node) list_append_list(&sync_list, &buddy->hot_list); pool->syncing = true; pthread_spin_unlock(&pool->lock); /* Avoid race between delete resource to its reuse on other QP */ dr_send_ring_force_drain(pool->dmn); if (pool->dmn->flags & DR_DOMAIN_FLAG_MEMORY_RECLAIM) need_reclaim = true; err = dr_devx_sync_steering(pool->dmn->ctx); if (err) /* Unexpected state, add debug note and continue */ dr_dbg(pool->dmn, "Failed devx sync hw\n"); pthread_spin_lock(&pool->lock); list_for_each_safe(&sync_list, chunk, tmp_chunk, chunk_list) { buddy = chunk->buddy_mem; dr_buddy_free_mem(buddy, chunk->seg, ilog32(chunk->num_of_entries - 1)); buddy->used_memory -= chunk->byte_size; pool->hot_memory_size -= chunk->byte_size; dr_icm_chunk_destroy(chunk); } if (need_reclaim) { list_for_each_safe(&pool->buddy_mem_list, buddy, tmp_buddy, list_node) if (!buddy->used_memory) dr_icm_buddy_destroy(buddy); } pool->syncing = false; return err; } int dr_icm_pool_sync_pool(struct dr_icm_pool *pool) { int ret = 0; pthread_spin_lock(&pool->lock); if (!pool->syncing) ret = dr_icm_pool_sync_pool_buddies(pool); pthread_spin_unlock(&pool->lock); return ret; } static int dr_icm_handle_buddies_get_mem(struct dr_icm_pool *pool, enum dr_icm_chunk_size chunk_size, struct dr_icm_buddy_mem **buddy, int *seg) { struct dr_icm_buddy_mem *buddy_mem_pool; bool new_mem = false; int err = 0; *seg = -1; /* find the next free place from the buddy list */ while (*seg == -1) { list_for_each(&pool->buddy_mem_list, buddy_mem_pool, list_node) { *seg = dr_buddy_alloc_mem(buddy_mem_pool, chunk_size); if (*seg != -1) goto found; if (new_mem) { /* We have new memory pool, first in the list */ assert(false); dr_dbg(pool->dmn, "No memory for order: %d\n", chunk_size); errno = ENOMEM; err = ENOMEM; goto out; } } /* no more available allocators in that pool, create new */ err = dr_icm_buddy_create(pool); if (err) goto out; /* mark we have new memory, first in list */ new_mem = true; } found: *buddy = buddy_mem_pool; out: return err; } /* Allocate an ICM chunk, each chunk holds a piece of ICM memory and * also memory used for HW STE management for optimisations. 
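 *
 * Takes pool->lock internally, so it must not be called with the lock already
 * held. Returns NULL with errno set on failure (EINVAL when the requested
 * order exceeds the pool's max_log_chunk_sz, ENOMEM when no buddy segment
 * could be allocated).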
*/ struct dr_icm_chunk *dr_icm_alloc_chunk(struct dr_icm_pool *pool, enum dr_icm_chunk_size chunk_size) { struct dr_icm_buddy_mem *buddy; struct dr_icm_chunk *chunk = NULL; int ret; int seg; pthread_spin_lock(&pool->lock); if (chunk_size > pool->max_log_chunk_sz) { errno = EINVAL; goto out; } /* find mem, get back the relevant buddy pool and seg in that mem */ ret = dr_icm_handle_buddies_get_mem(pool, chunk_size, &buddy, &seg); if (ret) goto out; chunk = dr_icm_chunk_create(pool, chunk_size, buddy, seg); if (!chunk) goto out_err; goto out; out_err: dr_buddy_free_mem(buddy, seg, chunk_size); out: pthread_spin_unlock(&pool->lock); return chunk; } void dr_icm_free_chunk(struct dr_icm_chunk *chunk) { struct dr_icm_buddy_mem *buddy = chunk->buddy_mem; struct dr_icm_pool *pool = buddy->pool; /* move the memory to the waiting list AKA "hot" */ pthread_spin_lock(&pool->lock); list_del_init(&chunk->chunk_list); list_add_tail(&buddy->hot_list, &chunk->chunk_list); buddy->pool->hot_memory_size += chunk->byte_size; /* Check if we have chunks that are waiting for sync-ste */ if (dr_icm_pool_is_sync_required(pool) && !pool->syncing) dr_icm_pool_sync_pool_buddies(buddy->pool); pthread_spin_unlock(&pool->lock); } void dr_icm_pool_set_pool_max_log_chunk_sz(struct dr_icm_pool *pool, enum dr_icm_chunk_size max_log_chunk_sz) { pthread_spin_lock(&pool->lock); pool->max_log_chunk_sz = max_log_chunk_sz; pthread_spin_unlock(&pool->lock); } uint64_t dr_icm_pool_get_chunk_icm_addr(struct dr_icm_chunk *chunk) { enum dr_icm_type icm_type = chunk->buddy_mem->pool->icm_type; int offset = dr_icm_pool_dm_type_to_entry_size(icm_type) * chunk->seg; return (uintptr_t)chunk->buddy_mem->icm_mr->icm_start_addr + offset; } uint64_t dr_icm_pool_get_chunk_mr_addr(struct dr_icm_chunk *chunk) { enum dr_icm_type icm_type = chunk->buddy_mem->pool->icm_type; int offset = dr_icm_pool_dm_type_to_entry_size(icm_type) * chunk->seg; return (uintptr_t)chunk->buddy_mem->icm_mr->mr->addr + offset; } uint32_t dr_icm_pool_get_chunk_rkey(struct dr_icm_chunk *chunk) { return chunk->buddy_mem->icm_mr->mr->rkey; } struct dr_icm_pool *dr_icm_pool_create(struct mlx5dv_dr_domain *dmn, enum dr_icm_type icm_type) { struct dr_icm_pool *pool; int ret; pool = calloc(1, sizeof(struct dr_icm_pool)); if (!pool) { errno = ENOMEM; return NULL; } pool->dmn = dmn; pool->icm_type = icm_type; switch (icm_type) { case DR_ICM_TYPE_STE: pool->max_log_chunk_sz = dmn->info.max_log_sw_icm_sz; pool->th = dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz, pool->icm_type) / 2; break; case DR_ICM_TYPE_MODIFY_ACTION: pool->max_log_chunk_sz = dmn->info.max_log_action_icm_sz; /* Use larger (0.9 instead of 0.5) TH to reduce sync on high rate insertion */ pool->th = dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz, pool->icm_type) * 0.9; break; case DR_ICM_TYPE_MODIFY_HDR_PTRN: pool->max_log_chunk_sz = dmn->info.max_log_modify_hdr_pattern_icm_sz; pool->th = dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz, pool->icm_type) / 2; break; case DR_ICM_TYPE_ENCAP: pool->max_log_chunk_sz = dmn->info.max_log_sw_encap_icm_sz; pool->th = dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz, pool->icm_type) / 2; break; default: assert(false); } list_head_init(&pool->buddy_mem_list); ret = pthread_spin_init(&pool->lock, PTHREAD_PROCESS_PRIVATE); if (ret) { errno = ret; goto free_pool; } return pool; free_pool: free(pool); return NULL; } void dr_icm_pool_destroy(struct dr_icm_pool *pool) { struct dr_icm_buddy_mem *buddy, *tmp_buddy; list_for_each_safe(&pool->buddy_mem_list, 
buddy, tmp_buddy, list_node) dr_icm_buddy_destroy(buddy); pthread_spin_destroy(&pool->lock); free(pool); } rdma-core-56.1/providers/mlx5/dr_matcher.c000066400000000000000000001347241477342711600205230ustar00rootroot00000000000000/* * Copyright (c) 2019, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include <stdlib.h> #include "mlx5dv_dr.h" #define DR_MASK_IPV4_ETHERTYPE 0x0800 #define DR_MASK_IPV6_ETHERTYPE 0x86DD #define DR_MASK_IP_VERSION_IPV4 0x4 #define DR_MASK_IP_VERSION_IPV6 0x6 static bool dr_mask_is_smac_set(struct dr_match_spec *spec) { return (spec->smac_47_16 || spec->smac_15_0); } static bool dr_mask_is_dmac_set(struct dr_match_spec *spec) { return (spec->dmac_47_16 || spec->dmac_15_0); } static bool dr_mask_is_src_addr_set(struct dr_match_spec *spec) { return (spec->src_ip_127_96 || spec->src_ip_95_64 || spec->src_ip_63_32 || spec->src_ip_31_0); } static bool dr_mask_is_dst_addr_set(struct dr_match_spec *spec) { return (spec->dst_ip_127_96 || spec->dst_ip_95_64 || spec->dst_ip_63_32 || spec->dst_ip_31_0); } static bool dr_mask_is_l3_base_set(struct dr_match_spec *spec) { return (spec->ip_protocol || spec->frag || spec->tcp_flags || spec->ip_ecn || spec->ip_dscp); } static bool dr_mask_is_tcp_udp_base_set(struct dr_match_spec *spec) { return (spec->tcp_sport || spec->tcp_dport || spec->udp_sport || spec->udp_dport); } static bool dr_mask_is_ipv4_set(struct dr_match_spec *spec) { return (spec->dst_ip_31_0 || spec->src_ip_31_0); } static bool dr_mask_is_ipv4_5_tuple_set(struct dr_match_spec *spec) { return (dr_mask_is_l3_base_set(spec) || dr_mask_is_tcp_udp_base_set(spec) || dr_mask_is_ipv4_set(spec)); } static bool dr_mask_is_eth_l2_tnl_set(struct dr_match_misc *misc) { return misc->vxlan_vni; } static bool dr_mask_is_ttl_set(struct dr_match_spec *spec) { return spec->ip_ttl_hoplimit; } static bool dr_mask_is_ipv4_ihl_set(struct dr_match_spec *spec) { return spec->ipv4_ihl; } #define DR_MASK_IS_L2_DST(_spec, _misc, _inner_outer) ((_spec).first_vid || \ (_spec).first_cfi || (_spec).first_prio || (_spec).cvlan_tag || \ (_spec).svlan_tag || (_spec).dmac_47_16 || (_spec).dmac_15_0 || \ (_spec).ethertype || (_spec).ip_version || \ (_misc)._inner_outer##_second_vid || \ (_misc)._inner_outer##_second_cfi || \ (_misc)._inner_outer##_second_prio ||
\ (_misc)._inner_outer##_second_cvlan_tag || \ (_misc)._inner_outer##_second_svlan_tag) #define DR_MASK_IS_ETH_L4_SET(_spec, _misc, _inner_outer) ( \ dr_mask_is_l3_base_set(&(_spec)) || \ dr_mask_is_tcp_udp_base_set(&(_spec)) || \ dr_mask_is_ttl_set(&(_spec)) || \ (_misc)._inner_outer##_ipv6_flow_label) #define DR_MASK_IS_ETH_L4_MISC_SET(_misc3, _inner_outer) ( \ (_misc3)._inner_outer##_tcp_seq_num || \ (_misc3)._inner_outer##_tcp_ack_num) #define DR_MASK_IS_FIRST_MPLS_SET(_misc2, _inner_outer) ( \ (_misc2)._inner_outer##_first_mpls_label || \ (_misc2)._inner_outer##_first_mpls_exp || \ (_misc2)._inner_outer##_first_mpls_s_bos || \ (_misc2)._inner_outer##_first_mpls_ttl) static bool dr_mask_is_tnl_gre_set(struct dr_match_misc *misc) { return (misc->gre_key_h || misc->gre_key_l || misc->gre_protocol || misc->gre_c_present || misc->gre_k_present || misc->gre_s_present); } #define DR_MASK_IS_OUTER_MPLS_OVER_GRE_SET(_misc) (\ (_misc)->outer_first_mpls_over_gre_label || \ (_misc)->outer_first_mpls_over_gre_exp || \ (_misc)->outer_first_mpls_over_gre_s_bos || \ (_misc)->outer_first_mpls_over_gre_ttl) #define DR_MASK_IS_OUTER_MPLS_OVER_UDP_SET(_misc) (\ (_misc)->outer_first_mpls_over_udp_label || \ (_misc)->outer_first_mpls_over_udp_exp || \ (_misc)->outer_first_mpls_over_udp_s_bos || \ (_misc)->outer_first_mpls_over_udp_ttl) static bool dr_mask_is_vxlan_gpe_set(struct dr_match_misc3 *misc3) { return misc3->outer_vxlan_gpe_vni || misc3->outer_vxlan_gpe_next_protocol || misc3->outer_vxlan_gpe_flags; } static bool dr_matcher_supp_vxlan_gpe(struct dr_devx_caps *caps) { return (caps->sw_format_ver >= MLX5_HW_CONNECTX_6DX) || (caps->flex_protocols & MLX5_FLEX_PARSER_VXLAN_GPE_ENABLED); } static bool dr_mask_is_tnl_vxlan_gpe(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { return dr_mask_is_vxlan_gpe_set(&mask->misc3) && dr_matcher_supp_vxlan_gpe(&dmn->info.caps); } static bool dr_mask_is_tnl_geneve_set(struct dr_match_misc *misc) { return misc->geneve_vni || misc->geneve_oam || misc->geneve_protocol_type || misc->geneve_opt_len; } static bool dr_mask_is_ib_l4_set(struct dr_match_misc *misc) { return misc->bth_opcode || misc->bth_dst_qp || misc->bth_a; } static int dr_matcher_supp_geneve_tlv_option(struct dr_devx_caps *caps) { return caps->flex_protocols & MLX5_FLEX_PARSER_GENEVE_OPT_0_ENABLED; } static bool dr_mask_is_tnl_geneve_tlv_opt(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { return mask->misc3.geneve_tlv_option_0_data && dr_matcher_supp_geneve_tlv_option(&dmn->info.caps); } static bool dr_matcher_supp_tnl_geneve(struct dr_devx_caps *caps) { return (caps->sw_format_ver >= MLX5_HW_CONNECTX_6DX) || (caps->flex_protocols & MLX5_FLEX_PARSER_GENEVE_ENABLED); } static bool dr_mask_is_tnl_geneve(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { return dr_mask_is_tnl_geneve_set(&mask->misc) && dr_matcher_supp_tnl_geneve(&dmn->info.caps); } static bool dr_mask_is_tnl_gtpu_set(struct dr_match_misc3 *misc3) { return misc3->gtpu_msg_flags || misc3->gtpu_msg_type || misc3->gtpu_teid; } static bool dr_matcher_supp_tnl_gtpu(struct dr_devx_caps *caps) { return caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_ENABLED; } static bool dr_mask_is_tnl_gtpu(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { return dr_mask_is_tnl_gtpu_set(&mask->misc3) && dr_matcher_supp_tnl_gtpu(&dmn->info.caps); } static int dr_matcher_supp_tnl_gtpu_dw_0(struct dr_devx_caps *caps) { return caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_DW_0_ENABLED; } static bool 
dr_mask_is_tnl_gtpu_dw_0(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { return mask->misc3.gtpu_dw_0 && dr_matcher_supp_tnl_gtpu_dw_0(&dmn->info.caps); } static int dr_matcher_supp_tnl_gtpu_teid(struct dr_devx_caps *caps) { return caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_TEID_ENABLED; } static bool dr_mask_is_tnl_gtpu_teid(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { return mask->misc3.gtpu_teid && dr_matcher_supp_tnl_gtpu_teid(&dmn->info.caps); } static int dr_matcher_supp_tnl_gtpu_dw_2(struct dr_devx_caps *caps) { return caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_DW_2_ENABLED; } static bool dr_mask_is_tnl_gtpu_dw_2(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { return mask->misc3.gtpu_dw_2 && dr_matcher_supp_tnl_gtpu_dw_2(&dmn->info.caps); } static int dr_matcher_supp_tnl_gtpu_first_ext(struct dr_devx_caps *caps) { return caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_FIRST_EXT_DW_0_ENABLED; } static bool dr_mask_is_tnl_gtpu_first_ext(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { return mask->misc3.gtpu_first_ext_dw_0 && dr_matcher_supp_tnl_gtpu_first_ext(&dmn->info.caps); } static bool dr_mask_is_tnl_gtpu_flex_parser_0(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { struct dr_devx_caps *caps = &dmn->info.caps; return ((caps->flex_parser_id_gtpu_dw_0 <= DR_STE_MAX_FLEX_0_ID) && dr_mask_is_tnl_gtpu_dw_0(mask, dmn)) || ((caps->flex_parser_id_gtpu_teid <= DR_STE_MAX_FLEX_0_ID) && dr_mask_is_tnl_gtpu_teid(mask, dmn)) || ((caps->flex_parser_id_gtpu_dw_2 <= DR_STE_MAX_FLEX_0_ID) && dr_mask_is_tnl_gtpu_dw_2(mask, dmn)) || ((caps->flex_parser_id_gtpu_first_ext_dw_0 <= DR_STE_MAX_FLEX_0_ID) && dr_mask_is_tnl_gtpu_first_ext(mask, dmn)); } static bool dr_mask_is_tnl_gtpu_flex_parser_1(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { struct dr_devx_caps *caps = &dmn->info.caps; return ((caps->flex_parser_id_gtpu_dw_0 > DR_STE_MAX_FLEX_0_ID) && dr_mask_is_tnl_gtpu_dw_0(mask, dmn)) || ((caps->flex_parser_id_gtpu_teid > DR_STE_MAX_FLEX_0_ID) && dr_mask_is_tnl_gtpu_teid(mask, dmn)) || ((caps->flex_parser_id_gtpu_dw_2 > DR_STE_MAX_FLEX_0_ID) && dr_mask_is_tnl_gtpu_dw_2(mask, dmn)) || ((caps->flex_parser_id_gtpu_first_ext_dw_0 > DR_STE_MAX_FLEX_0_ID) && dr_mask_is_tnl_gtpu_first_ext(mask, dmn)); } static bool dr_mask_is_tnl_gtpu_any(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { return dr_mask_is_tnl_gtpu_flex_parser_0(mask, dmn) || dr_mask_is_tnl_gtpu_flex_parser_1(mask, dmn) || dr_mask_is_tnl_gtpu(mask, dmn); } static inline int dr_matcher_supp_icmp_v4(struct dr_devx_caps *caps) { return (caps->sw_format_ver >= MLX5_HW_CONNECTX_6DX) || (caps->flex_protocols & MLX5_FLEX_PARSER_ICMP_V4_ENABLED); } static inline int dr_matcher_supp_icmp_v6(struct dr_devx_caps *caps) { return (caps->sw_format_ver >= MLX5_HW_CONNECTX_6DX) || (caps->flex_protocols & MLX5_FLEX_PARSER_ICMP_V6_ENABLED); } static bool dr_mask_is_icmpv6_set(struct dr_match_misc3 *misc3) { return (misc3->icmpv6_type || misc3->icmpv6_code || misc3->icmpv6_header_data); } static bool dr_mask_is_icmp(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { if (DR_MASK_IS_ICMPV4_SET(&mask->misc3)) return dr_matcher_supp_icmp_v4(&dmn->info.caps); else if (dr_mask_is_icmpv6_set(&mask->misc3)) return dr_matcher_supp_icmp_v6(&dmn->info.caps); return false; } static bool dr_mask_is_wqe_metadata_set(struct dr_match_misc2 *misc2) { return misc2->metadata_reg_a; } static bool dr_mask_is_reg_c_0_3_set(struct dr_match_misc2 *misc2) { return 
(misc2->metadata_reg_c_0 || misc2->metadata_reg_c_1 || misc2->metadata_reg_c_2 || misc2->metadata_reg_c_3); } static bool dr_mask_is_reg_c_4_7_set(struct dr_match_misc2 *misc2) { return (misc2->metadata_reg_c_4 || misc2->metadata_reg_c_5 || misc2->metadata_reg_c_6 || misc2->metadata_reg_c_7); } static bool dr_mask_is_gvmi_or_qpn_set(struct dr_match_misc *misc) { return (misc->source_sqn || misc->source_port); } static bool dr_mask_is_flex_parser_id_0_3_set(uint32_t flex_parser_id, uint32_t flex_parser_value) { if (flex_parser_id) return flex_parser_id <= DR_STE_MAX_FLEX_0_ID; /* Using flex_parser 0 means that id is zero, thus value must be set. */ return flex_parser_value; } static bool dr_mask_is_flex_parser_0_3_set(struct dr_match_misc4 *misc4) { return (dr_mask_is_flex_parser_id_0_3_set(misc4->prog_sample_field_id_0, misc4->prog_sample_field_value_0) || dr_mask_is_flex_parser_id_0_3_set(misc4->prog_sample_field_id_1, misc4->prog_sample_field_value_1) || dr_mask_is_flex_parser_id_0_3_set(misc4->prog_sample_field_id_2, misc4->prog_sample_field_value_2) || dr_mask_is_flex_parser_id_0_3_set(misc4->prog_sample_field_id_3, misc4->prog_sample_field_value_3) || dr_mask_is_flex_parser_id_0_3_set(misc4->prog_sample_field_id_4, misc4->prog_sample_field_value_4) || dr_mask_is_flex_parser_id_0_3_set(misc4->prog_sample_field_id_5, misc4->prog_sample_field_value_5) || dr_mask_is_flex_parser_id_0_3_set(misc4->prog_sample_field_id_6, misc4->prog_sample_field_value_6) || dr_mask_is_flex_parser_id_0_3_set(misc4->prog_sample_field_id_7, misc4->prog_sample_field_value_7)); } static bool dr_mask_is_flex_parser_id_4_7_set(uint32_t flex_parser_id) { return flex_parser_id > DR_STE_MAX_FLEX_0_ID && flex_parser_id <= DR_STE_MAX_FLEX_1_ID; } static bool dr_mask_is_flex_parser_4_7_set(struct dr_match_misc4 *misc4) { return (dr_mask_is_flex_parser_id_4_7_set(misc4->prog_sample_field_id_0) || dr_mask_is_flex_parser_id_4_7_set(misc4->prog_sample_field_id_1) || dr_mask_is_flex_parser_id_4_7_set(misc4->prog_sample_field_id_2) || dr_mask_is_flex_parser_id_4_7_set(misc4->prog_sample_field_id_3) || dr_mask_is_flex_parser_id_4_7_set(misc4->prog_sample_field_id_4) || dr_mask_is_flex_parser_id_4_7_set(misc4->prog_sample_field_id_5) || dr_mask_is_flex_parser_id_4_7_set(misc4->prog_sample_field_id_6) || dr_mask_is_flex_parser_id_4_7_set(misc4->prog_sample_field_id_7)); } static bool dr_matcher_supp_flex_parser_ok(struct dr_devx_caps *caps) { return caps->flex_parser_ok_bits_supp; } static bool dr_mask_is_tnl_geneve_tlv_opt_exist_set(struct dr_match_misc *misc, struct mlx5dv_dr_domain *dmn) { return dr_matcher_supp_flex_parser_ok(&dmn->info.caps) && misc->geneve_tlv_option_0_exist; } static bool dr_mask_is_tunnel_header_set(struct dr_match_misc5 *misc5) { return misc5->tunnel_header_0 || misc5->tunnel_header_1 || misc5->tunnel_header_2 || misc5->tunnel_header_3; } static int dr_matcher_supp_tnl_mpls_over_gre(struct dr_devx_caps *caps) { return caps->flex_protocols & MLX5_FLEX_PARSER_MPLS_OVER_GRE_ENABLED; } static bool dr_mask_is_tnl_mpls_over_gre(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { return DR_MASK_IS_OUTER_MPLS_OVER_GRE_SET(&mask->misc2) && dr_matcher_supp_tnl_mpls_over_gre(&dmn->info.caps); } static int dr_matcher_supp_tnl_mpls_over_udp(struct dr_devx_caps *caps) { return caps->flex_protocols & MLX5_FLEX_PARSER_MPLS_OVER_UDP_ENABLED; } static bool dr_mask_is_tnl_mpls_over_udp(struct dr_match_param *mask, struct mlx5dv_dr_domain *dmn) { return DR_MASK_IS_OUTER_MPLS_OVER_UDP_SET(&mask->misc2) &&
dr_matcher_supp_tnl_mpls_over_udp(&dmn->info.caps); } static bool dr_matcher_mask_is_all_zero(uint8_t *mask, uint32_t size) { return (*mask == 0) && memcmp(mask, mask + 1, size - 1) == 0; } static bool dr_matcher_is_mask_consumed(uint8_t *mask, uint8_t match_criteria) { /* Check that all mask data was consumed */ if (match_criteria & DR_MATCHER_CRITERIA_OUTER && !dr_matcher_mask_is_all_zero(mask, DEVX_ST_SZ_BYTES(dr_match_spec))) return false; mask += DEVX_ST_SZ_BYTES(dr_match_spec); if (match_criteria & DR_MATCHER_CRITERIA_MISC && !dr_matcher_mask_is_all_zero(mask, DEVX_ST_SZ_BYTES(dr_match_set_misc))) return false; mask += DEVX_ST_SZ_BYTES(dr_match_set_misc); if (match_criteria & DR_MATCHER_CRITERIA_INNER && !dr_matcher_mask_is_all_zero(mask, DEVX_ST_SZ_BYTES(dr_match_spec))) return false; mask += DEVX_ST_SZ_BYTES(dr_match_spec); if (match_criteria & DR_MATCHER_CRITERIA_MISC2 && !dr_matcher_mask_is_all_zero(mask, DEVX_ST_SZ_BYTES(dr_match_set_misc2))) return false; mask += DEVX_ST_SZ_BYTES(dr_match_set_misc2); if (match_criteria & DR_MATCHER_CRITERIA_MISC3 && !dr_matcher_mask_is_all_zero(mask, DEVX_ST_SZ_BYTES(dr_match_set_misc3))) return false; mask += DEVX_ST_SZ_BYTES(dr_match_set_misc3); if (match_criteria & DR_MATCHER_CRITERIA_MISC4 && !dr_matcher_mask_is_all_zero(mask, DEVX_ST_SZ_BYTES(dr_match_set_misc4))) return false; mask += DEVX_ST_SZ_BYTES(dr_match_set_misc4); if (match_criteria & DR_MATCHER_CRITERIA_MISC5 && !dr_matcher_mask_is_all_zero(mask, DEVX_ST_SZ_BYTES(dr_match_set_misc5))) return false; return true; } static void dr_matcher_copy_mask(struct dr_match_param *dst_mask, struct dr_match_param *src_mask, uint8_t match_criteria, bool optimize_rx) { if (match_criteria & DR_MATCHER_CRITERIA_OUTER) dst_mask->outer = src_mask->outer; if (match_criteria & DR_MATCHER_CRITERIA_MISC) dst_mask->misc = src_mask->misc; if (match_criteria & DR_MATCHER_CRITERIA_INNER) dst_mask->inner = src_mask->inner; if (match_criteria & DR_MATCHER_CRITERIA_MISC2) dst_mask->misc2 = src_mask->misc2; if (match_criteria & DR_MATCHER_CRITERIA_MISC3) dst_mask->misc3 = src_mask->misc3; if (match_criteria & DR_MATCHER_CRITERIA_MISC4) dst_mask->misc4 = src_mask->misc4; if (match_criteria & DR_MATCHER_CRITERIA_MISC5) dst_mask->misc5 = src_mask->misc5; /* Optimize RX pipe by reducing source port matching */ if (optimize_rx && dst_mask->misc.source_port) dst_mask->misc.source_port = 0; } static void dr_matcher_destroy_definer_objs(struct dr_ste_build *sb, uint8_t idx) { int i; for (i = 0; i < idx; i++) { mlx5dv_devx_obj_destroy(sb[i].definer_obj); sb[i].lu_type = 0; sb[i].htbl_type = 0; sb[i].definer_obj = NULL; } } static int dr_matcher_create_definer_objs(struct ibv_context *ctx, struct dr_ste_build *sb, uint8_t idx) { struct mlx5dv_devx_obj *devx_obj; int i; for (i = 0; i < idx; i++) { devx_obj = dr_devx_create_definer(ctx, sb[i].format_id, sb[i].match); if (!devx_obj) goto cleanup; /* The lu_type combines the definer and the entry type */ sb[i].lu_type |= devx_obj->object_id; sb[i].htbl_type = DR_STE_HTBL_TYPE_MATCH; sb[i].definer_obj = devx_obj; } return 0; cleanup: dr_matcher_destroy_definer_objs(sb, i); return errno; } static void dr_matcher_clear_definers_builders(struct dr_matcher_rx_tx *nic_matcher) { struct dr_ste_build *sb = nic_matcher->ste_builder; int i; for (i = 0; i < nic_matcher->num_of_builders; i++) memset(&sb[i], 0, sizeof(*sb)); nic_matcher->num_of_builders = 0; } static int dr_matcher_set_definer_builders(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher) { struct 
dr_domain_rx_tx *nic_dmn = nic_matcher->nic_tbl->nic_dmn; struct dr_ste_build *sb = nic_matcher->ste_builder; struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; bool rx = nic_dmn->type == DR_DOMAIN_NIC_TYPE_RX; struct dr_devx_caps *caps = &dmn->info.caps; struct dr_ste_ctx *ste_ctx = dmn->ste_ctx; struct dr_match_param mask = {}; bool src_ipv6, dst_ipv6; bool optimize_rx; uint8_t idx = 0; uint8_t ipv; int ret; ipv = matcher->mask.outer.ip_version; src_ipv6 = dr_mask_is_src_addr_set(&matcher->mask.outer); dst_ipv6 = dr_mask_is_dst_addr_set(&matcher->mask.outer); optimize_rx = (dmn->type == MLX5DV_DR_DOMAIN_TYPE_FDB && nic_dmn->type == DR_DOMAIN_NIC_TYPE_RX); if (caps->definer_format_sup & (1 << DR_MATCHER_DEFINER_0)) { dr_matcher_copy_mask(&mask, &matcher->mask, matcher->match_criteria, optimize_rx); ret = dr_ste_build_def0(ste_ctx, &sb[idx++], &mask, caps, false, rx); if (!ret && dr_matcher_is_mask_consumed((uint8_t *)&mask, matcher->match_criteria)) goto done; memset(sb, 0, sizeof(*sb)); idx = 0; } if (dmn->info.caps.definer_format_sup & (1 << DR_MATCHER_DEFINER_2)) { dr_matcher_copy_mask(&mask, &matcher->mask, matcher->match_criteria, optimize_rx); ret = dr_ste_build_def2(ste_ctx, &sb[idx++], &mask, caps, false, rx); if (!ret && dr_matcher_is_mask_consumed((uint8_t *)&mask, matcher->match_criteria)) goto done; memset(sb, 0, sizeof(*sb)); idx = 0; } if (caps->definer_format_sup & (1 << DR_MATCHER_DEFINER_16)) { dr_matcher_copy_mask(&mask, &matcher->mask, matcher->match_criteria, optimize_rx); ret = dr_ste_build_def16(ste_ctx, &sb[idx++], &mask, caps, false, rx); if (!ret && dr_matcher_is_mask_consumed((uint8_t *)&mask, matcher->match_criteria)) goto done; memset(sb, 0, sizeof(*sb)); idx = 0; } if (caps->definer_format_sup & (1 << DR_MATCHER_DEFINER_22)) { dr_matcher_copy_mask(&mask, &matcher->mask, matcher->match_criteria, optimize_rx); ret = dr_ste_build_def22(ste_ctx, &sb[idx++], &mask, false, rx); if (!ret && dr_matcher_is_mask_consumed((uint8_t *)&mask, matcher->match_criteria)) goto done; memset(sb, 0, sizeof(*sb)); idx = 0; } if (caps->definer_format_sup & (1 << DR_MATCHER_DEFINER_24)) { dr_matcher_copy_mask(&mask, &matcher->mask, matcher->match_criteria, optimize_rx); ret = dr_ste_build_def24(ste_ctx, &sb[idx++], &mask, false, rx); if (!ret && dr_matcher_is_mask_consumed((uint8_t *)&mask, matcher->match_criteria)) goto done; memset(sb, 0, sizeof(*sb)); idx = 0; } if (caps->definer_format_sup & (1 << DR_MATCHER_DEFINER_25)) { dr_matcher_copy_mask(&mask, &matcher->mask, matcher->match_criteria, optimize_rx); ret = dr_ste_build_def25(ste_ctx, &sb[idx++], &mask, false, rx); if (!ret && dr_matcher_is_mask_consumed((uint8_t *)&mask, matcher->match_criteria)) goto done; memset(sb, 0, sizeof(*sb)); idx = 0; } if ((ipv == DR_MASK_IP_VERSION_IPV6 && src_ipv6) && (caps->definer_format_sup & (1 << DR_MATCHER_DEFINER_6)) && (caps->definer_format_sup & (1 << DR_MATCHER_DEFINER_26))) { dr_matcher_copy_mask(&mask, &matcher->mask, matcher->match_criteria, optimize_rx); ret = dr_ste_build_def26(ste_ctx, &sb[idx++], &mask, false, rx); if (!ret && dst_ipv6) ret = dr_ste_build_def6(ste_ctx, &sb[idx++], &mask, false, rx); if (!ret && dr_matcher_is_mask_consumed((uint8_t *)&mask, matcher->match_criteria)) goto done; memset(&sb[0], 0, sizeof(*sb)); memset(&sb[1], 0, sizeof(*sb)); idx = 0; } if (dmn->info.caps.definer_format_sup & (1 << DR_MATCHER_DEFINER_28)) { dr_matcher_copy_mask(&mask, &matcher->mask, matcher->match_criteria, optimize_rx); ret = dr_ste_build_def28(ste_ctx, &sb[idx++], &mask, false, rx); 
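/* Same try-and-check pattern as for the definers above: a successful
 * build that consumed the whole mask selects this definer; otherwise the
 * builder is reset and the next supported definer format is tried.
 */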
if (!ret && dr_matcher_is_mask_consumed((uint8_t *)&mask, matcher->match_criteria)) goto done; memset(sb, 0, sizeof(struct dr_ste_build)); idx = 0; } if (dmn->info.caps.definer_format_sup & (1ULL << DR_MATCHER_DEFINER_33)) { dr_matcher_copy_mask(&mask, &matcher->mask, matcher->match_criteria, optimize_rx); ret = dr_ste_build_def33(ste_ctx, &sb[idx++], &mask, false, rx); if (!ret && dr_matcher_is_mask_consumed((uint8_t *)&mask, matcher->match_criteria)) goto done; memset(sb, 0, sizeof(*sb)); idx = 0; } return ENOTSUP; done: nic_matcher->num_of_builders = idx; return 0; } static bool dr_matcher_is_definer_support_mq(struct dr_matcher_rx_tx *nic_matcher) { /* ipv6 needs 2 definers and not supported yet */ if (nic_matcher->num_of_builders == 1 && nic_matcher->ste_builder->htbl_type == DR_STE_HTBL_TYPE_MATCH) return true; return false; } static int dr_matcher_set_large_ste_builders(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher) { struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; int ret; if (dmn->info.caps.sw_format_ver == MLX5_HW_CONNECTX_5 || !dmn->info.caps.definer_format_sup) return ENOTSUP; ret = dr_matcher_set_definer_builders(matcher, nic_matcher); if (ret) return ret; ret = dr_matcher_create_definer_objs(dmn->ctx, nic_matcher->ste_builder, nic_matcher->num_of_builders); if (ret) goto clear_definers_builders; return 0; clear_definers_builders: dr_matcher_clear_definers_builders(nic_matcher); return ret; } static void dr_matcher_clear_ste_builders(struct dr_matcher_rx_tx *nic_matcher) { if (nic_matcher->ste_builder->htbl_type == DR_STE_HTBL_TYPE_MATCH) dr_matcher_destroy_definer_objs(nic_matcher->ste_builder, nic_matcher->num_of_builders); } static int dr_matcher_set_ste_builders(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher) { struct dr_domain_rx_tx *nic_dmn = nic_matcher->nic_tbl->nic_dmn; struct dr_ste_build *sb = nic_matcher->ste_builder; struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; struct dr_ste_ctx *ste_ctx = dmn->ste_ctx; struct dr_match_param mask = {}; bool allow_empty_match = false; bool optimize_rx; bool inner, rx; uint8_t ipv; int idx = 0; int ret; ret = dr_ste_build_pre_check(dmn, matcher->match_criteria, &matcher->mask, NULL); if (ret) return ret; /* Use a large definers for matching if possible */ ret = dr_matcher_set_large_ste_builders(matcher, nic_matcher); if (!ret) return 0; rx = nic_dmn->type == DR_DOMAIN_NIC_TYPE_RX; optimize_rx = (dmn->type == MLX5DV_DR_DOMAIN_TYPE_FDB && rx); /* Create a temporary mask to track and clear used mask fields */ dr_matcher_copy_mask(&mask, &matcher->mask, matcher->match_criteria, optimize_rx); /* Allow empty match if source port was set in matcher->mask, * and now is cleared in mask, because it was optimized. 
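 * E.g. an FDB RX matcher that matches only on source_port reaches this
 * point with an all-zero mask and must still get a valid always-hit
 * builder instead of being rejected as an empty matcher.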
*/ if (optimize_rx && (matcher->match_criteria & DR_MATCHER_CRITERIA_MISC) && matcher->mask.misc.source_port) allow_empty_match = true; /* Outer */ if (matcher->match_criteria & (DR_MATCHER_CRITERIA_OUTER | DR_MATCHER_CRITERIA_MISC | DR_MATCHER_CRITERIA_MISC2 | DR_MATCHER_CRITERIA_MISC3 | DR_MATCHER_CRITERIA_MISC5)) { inner = false; ipv = mask.outer.ip_version; if (dr_mask_is_wqe_metadata_set(&mask.misc2)) dr_ste_build_general_purpose(ste_ctx, &sb[idx++], &mask, inner, rx); if (dr_mask_is_reg_c_0_3_set(&mask.misc2)) dr_ste_build_register_0(ste_ctx, &sb[idx++], &mask, inner, rx); if (dr_mask_is_reg_c_4_7_set(&mask.misc2)) dr_ste_build_register_1(ste_ctx, &sb[idx++], &mask, inner, rx); if (dr_mask_is_gvmi_or_qpn_set(&mask.misc) && (dmn->type == MLX5DV_DR_DOMAIN_TYPE_FDB || dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_RX)) dr_ste_build_src_gvmi_qpn(ste_ctx, &sb[idx++], &mask, &dmn->info.caps, inner, rx); if (dr_mask_is_smac_set(&mask.outer) && dr_mask_is_dmac_set(&mask.outer)) dr_ste_build_eth_l2_src_dst(ste_ctx, &sb[idx++], &mask, inner, rx); if (dr_mask_is_smac_set(&mask.outer)) dr_ste_build_eth_l2_src(ste_ctx, &sb[idx++], &mask, inner, rx); if (DR_MASK_IS_L2_DST(mask.outer, mask.misc, outer)) dr_ste_build_eth_l2_dst(ste_ctx, &sb[idx++], &mask, inner, rx); if (ipv == 4) { if (dr_mask_is_ttl_set(&mask.outer) || dr_mask_is_ipv4_ihl_set(&mask.outer)) dr_ste_build_eth_l3_ipv4_misc(ste_ctx, &sb[idx++], &mask, inner, rx); if (dr_mask_is_ipv4_5_tuple_set(&mask.outer)) dr_ste_build_eth_l3_ipv4_5_tuple(ste_ctx, &sb[idx++], &mask, inner, rx); } else if (ipv == 6) { if (dr_mask_is_dst_addr_set(&mask.outer)) dr_ste_build_eth_l3_ipv6_dst(ste_ctx, &sb[idx++], &mask, inner, rx); if (dr_mask_is_src_addr_set(&mask.outer)) dr_ste_build_eth_l3_ipv6_src(ste_ctx, &sb[idx++], &mask, inner, rx); if (DR_MASK_IS_ETH_L4_SET(mask.outer, mask.misc, outer)) dr_ste_build_eth_ipv6_l3_l4(ste_ctx, &sb[idx++], &mask, inner, rx); } if (dr_mask_is_tnl_vxlan_gpe(&mask, dmn)) { dr_ste_build_tnl_vxlan_gpe(ste_ctx, &sb[idx++], &mask, inner, rx); } else if (dr_mask_is_tnl_geneve(&mask, dmn) || dr_mask_is_tnl_geneve_tlv_opt(&mask, dmn)) { if (dr_mask_is_tnl_geneve(&mask, dmn)) dr_ste_build_tnl_geneve(ste_ctx, &sb[idx++], &mask, inner, rx); if (dr_mask_is_tnl_geneve_tlv_opt(&mask, dmn)) dr_ste_build_tnl_geneve_tlv_opt(ste_ctx, &sb[idx++], &mask, &dmn->info.caps, inner, rx); if (dr_mask_is_tnl_geneve_tlv_opt_exist_set(&mask.misc, dmn)) dr_ste_build_tnl_geneve_tlv_opt_exist(ste_ctx, &sb[idx++], &mask, &dmn->info.caps, inner, rx); } else if (dr_mask_is_tnl_gtpu_any(&mask, dmn)) { if (dr_mask_is_tnl_gtpu_flex_parser_0(&mask, dmn)) dr_ste_build_tnl_gtpu_flex_parser_0(ste_ctx, &sb[idx++], &mask, &dmn->info.caps, inner, rx); if (dr_mask_is_tnl_gtpu_flex_parser_1(&mask, dmn)) dr_ste_build_tnl_gtpu_flex_parser_1(ste_ctx, &sb[idx++], &mask, &dmn->info.caps, inner, rx); if (dr_mask_is_tnl_gtpu(&mask, dmn)) dr_ste_build_tnl_gtpu(ste_ctx, &sb[idx++], &mask, inner, rx); } else if (dr_mask_is_tunnel_header_set(&mask.misc5)) { dr_ste_build_tunnel_header(ste_ctx, &sb[idx++], &mask, &dmn->info.caps, false, rx); } if (DR_MASK_IS_ETH_L4_MISC_SET(mask.misc3, outer)) dr_ste_build_eth_l4_misc(ste_ctx, &sb[idx++], &mask, inner, rx); if (DR_MASK_IS_FIRST_MPLS_SET(mask.misc2, outer)) dr_ste_build_mpls(ste_ctx, &sb[idx++], &mask, inner, rx); if (dr_mask_is_tnl_mpls_over_gre(&mask, dmn)) dr_ste_build_tnl_mpls_over_gre(ste_ctx, &sb[idx++], &mask, &dmn->info.caps, inner, rx); else if (dr_mask_is_tnl_mpls_over_udp(&mask, dmn)) dr_ste_build_tnl_mpls_over_udp(ste_ctx, 
&sb[idx++], &mask, &dmn->info.caps, inner, rx); if (dr_mask_is_icmp(&mask, dmn)) dr_ste_build_icmp(ste_ctx, &sb[idx++], &mask, &dmn->info.caps, inner, rx); if (dr_mask_is_tnl_gre_set(&mask.misc)) dr_ste_build_tnl_gre(ste_ctx, &sb[idx++], &mask, inner, rx); } /* Inner */ if (matcher->match_criteria & (DR_MATCHER_CRITERIA_INNER | DR_MATCHER_CRITERIA_MISC | DR_MATCHER_CRITERIA_MISC2 | DR_MATCHER_CRITERIA_MISC3)) { inner = true; ipv = mask.inner.ip_version; if (dr_mask_is_eth_l2_tnl_set(&mask.misc)) dr_ste_build_eth_l2_tnl(ste_ctx, &sb[idx++], &mask, inner, rx); if (dr_mask_is_smac_set(&mask.inner) && dr_mask_is_dmac_set(&mask.inner)) dr_ste_build_eth_l2_src_dst(ste_ctx, &sb[idx++], &mask, inner, rx); if (dr_mask_is_smac_set(&mask.inner)) dr_ste_build_eth_l2_src(ste_ctx, &sb[idx++], &mask, inner, rx); if (DR_MASK_IS_L2_DST(mask.inner, mask.misc, inner)) dr_ste_build_eth_l2_dst(ste_ctx, &sb[idx++], &mask, inner, rx); if (ipv == 4) { if (dr_mask_is_ttl_set(&mask.inner) || dr_mask_is_ipv4_ihl_set(&mask.inner)) dr_ste_build_eth_l3_ipv4_misc(ste_ctx, &sb[idx++], &mask, inner, rx); if (dr_mask_is_ipv4_5_tuple_set(&mask.inner)) dr_ste_build_eth_l3_ipv4_5_tuple(ste_ctx, &sb[idx++], &mask, inner, rx); } else if (ipv == 6) { if (dr_mask_is_dst_addr_set(&mask.inner)) dr_ste_build_eth_l3_ipv6_dst(ste_ctx, &sb[idx++], &mask, inner, rx); if (dr_mask_is_src_addr_set(&mask.inner)) dr_ste_build_eth_l3_ipv6_src(ste_ctx, &sb[idx++], &mask, inner, rx); if (DR_MASK_IS_ETH_L4_SET(mask.inner, mask.misc, inner)) dr_ste_build_eth_ipv6_l3_l4(ste_ctx, &sb[idx++], &mask, inner, rx); } if (DR_MASK_IS_ETH_L4_MISC_SET(mask.misc3, inner)) dr_ste_build_eth_l4_misc(ste_ctx, &sb[idx++], &mask, inner, rx); if (DR_MASK_IS_FIRST_MPLS_SET(mask.misc2, inner)) dr_ste_build_mpls(ste_ctx, &sb[idx++], &mask, inner, rx); if (dr_mask_is_tnl_mpls_over_gre(&mask, dmn)) dr_ste_build_tnl_mpls_over_gre(ste_ctx, &sb[idx++], &mask, &dmn->info.caps, inner, rx); else if (dr_mask_is_tnl_mpls_over_udp(&mask, dmn)) dr_ste_build_tnl_mpls_over_udp(ste_ctx, &sb[idx++], &mask, &dmn->info.caps, inner, rx); } if (matcher->match_criteria & DR_MATCHER_CRITERIA_MISC4) { if (dr_mask_is_flex_parser_0_3_set(&mask.misc4)) dr_ste_build_flex_parser_0(ste_ctx, &sb[idx++], &mask, false, rx); if (dr_mask_is_flex_parser_4_7_set(&mask.misc4)) dr_ste_build_flex_parser_1(ste_ctx, &sb[idx++], &mask, false, rx); } if (matcher->match_criteria & DR_MATCHER_CRITERIA_MISC) { if (dr_mask_is_ib_l4_set(&mask.misc)) dr_ste_build_ib_l4(ste_ctx, &sb[idx++], &mask, inner, rx); } /* Empty matcher, takes all */ if ((!idx && allow_empty_match) || matcher->match_criteria == DR_MATCHER_CRITERIA_EMPTY) dr_ste_build_empty_always_hit(&sb[idx++], rx); if (idx == 0) { dr_dbg(dmn, "Cannot generate any valid rules from mask\n"); errno = EINVAL; return errno; } /* Check that all mask fields were consumed */ if (!dr_matcher_is_mask_consumed((uint8_t *)&mask, matcher->match_criteria)) { dr_dbg(dmn, "Mask contains unsupported parameters\n"); errno = EOPNOTSUPP; return errno; } nic_matcher->num_of_builders = idx; return 0; } static int dr_matcher_connect(struct mlx5dv_dr_domain *dmn, struct dr_matcher_rx_tx *curr_nic_matcher, struct dr_matcher_rx_tx *next_nic_matcher, struct dr_matcher_rx_tx *prev_nic_matcher) { struct dr_table_rx_tx *nic_tbl = curr_nic_matcher->nic_tbl; struct dr_domain_rx_tx *nic_dmn = nic_tbl->nic_dmn; struct dr_htbl_connect_info info; struct dr_ste_htbl *prev_htbl; int ret; /* Connect end anchor hash table to next_htbl or to the default address */ if (next_nic_matcher) { info.type 
= CONNECT_HIT; info.hit_next_htbl = next_nic_matcher->s_htbl; } else { info.type = CONNECT_MISS; info.miss_icm_addr = nic_dmn->default_icm_addr; } ret = dr_ste_htbl_init_and_postsend(dmn, nic_dmn, curr_nic_matcher->e_anchor, &info, info.type == CONNECT_HIT, 0); if (ret) return ret; /* Connect start hash table to end anchor */ info.type = CONNECT_MISS; info.miss_icm_addr = dr_icm_pool_get_chunk_icm_addr(curr_nic_matcher->e_anchor->chunk); ret = dr_ste_htbl_init_and_postsend(dmn, nic_dmn, curr_nic_matcher->s_htbl, &info, false, 0); if (ret) return ret; /* Connect previous hash table to matcher start hash table */ if (prev_nic_matcher) prev_htbl = prev_nic_matcher->e_anchor; else prev_htbl = nic_tbl->s_anchor; info.type = CONNECT_HIT; info.hit_next_htbl = curr_nic_matcher->s_htbl; ret = dr_ste_htbl_init_and_postsend(dmn, nic_dmn, prev_htbl, &info, true, 0); if (ret) return ret; /* Update the pointing ste and next hash table */ curr_nic_matcher->s_htbl->pointing_ste = prev_htbl->ste_arr; prev_htbl->ste_arr[0].next_htbl = curr_nic_matcher->s_htbl; if (next_nic_matcher) { next_nic_matcher->s_htbl->pointing_ste = curr_nic_matcher->e_anchor->ste_arr; curr_nic_matcher->e_anchor->ste_arr[0].next_htbl = next_nic_matcher->s_htbl; } return 0; } static int dr_matcher_add_to_tbl(struct mlx5dv_dr_matcher *matcher) { struct mlx5dv_dr_matcher *next_matcher, *prev_matcher, *tmp_matcher; struct mlx5dv_dr_table *tbl = matcher->tbl; struct mlx5dv_dr_domain *dmn = tbl->dmn; int ret; if (dr_is_root_table(matcher->tbl)) return 0; next_matcher = NULL; list_for_each(&tbl->matcher_list, tmp_matcher, matcher_list) if (tmp_matcher->prio >= matcher->prio) { next_matcher = tmp_matcher; break; } if (next_matcher) prev_matcher = list_prev(&tbl->matcher_list, next_matcher, matcher_list); else prev_matcher = list_tail(&tbl->matcher_list, struct mlx5dv_dr_matcher, matcher_list); if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_FDB || dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_RX) { ret = dr_matcher_connect(dmn, &matcher->rx, next_matcher ? &next_matcher->rx : NULL, prev_matcher ? &prev_matcher->rx : NULL); if (ret) return ret; } if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_FDB || dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_TX) { ret = dr_matcher_connect(dmn, &matcher->tx, next_matcher ? &next_matcher->tx : NULL, prev_matcher ? 
&prev_matcher->tx : NULL); if (ret) return ret; } if (prev_matcher) list_add_after(&tbl->matcher_list, &prev_matcher->matcher_list, &matcher->matcher_list); else if (next_matcher) list_add_before(&tbl->matcher_list, &next_matcher->matcher_list, &matcher->matcher_list); else list_add(&tbl->matcher_list, &matcher->matcher_list); return 0; } static void dr_matcher_uninit_nic(struct dr_matcher_rx_tx *nic_matcher) { dr_matcher_clear_ste_builders(nic_matcher); dr_htbl_put(nic_matcher->s_htbl); dr_htbl_put(nic_matcher->e_anchor); } static void dr_matcher_uninit_fdb(struct mlx5dv_dr_matcher *matcher) { dr_matcher_uninit_nic(&matcher->rx); dr_matcher_uninit_nic(&matcher->tx); } static int dr_matcher_uninit_root(struct mlx5dv_dr_matcher *matcher) { return mlx5dv_destroy_flow_matcher(matcher->dv_matcher); } static void dr_matcher_uninit(struct mlx5dv_dr_matcher *matcher) { struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; if (dr_is_root_table(matcher->tbl)) { dr_matcher_uninit_root(matcher); return; } switch (dmn->type) { case MLX5DV_DR_DOMAIN_TYPE_NIC_RX: dr_matcher_uninit_nic(&matcher->rx); break; case MLX5DV_DR_DOMAIN_TYPE_NIC_TX: dr_matcher_uninit_nic(&matcher->tx); break; case MLX5DV_DR_DOMAIN_TYPE_FDB: dr_matcher_uninit_fdb(matcher); break; default: assert(false); break; } } static int dr_matcher_init_nic(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher) { struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; int ret; ret = dr_matcher_set_ste_builders(matcher, nic_matcher); if (ret) return ret; nic_matcher->e_anchor = dr_ste_htbl_alloc(dmn->ste_icm_pool, DR_CHUNK_SIZE_1, DR_STE_HTBL_TYPE_LEGACY, DR_STE_LU_TYPE_DONT_CARE, 0); if (!nic_matcher->e_anchor) goto clear_ste_builders; nic_matcher->s_htbl = dr_ste_htbl_alloc(dmn->ste_icm_pool, DR_CHUNK_SIZE_1, nic_matcher->ste_builder->htbl_type, nic_matcher->ste_builder->lu_type, nic_matcher->ste_builder->byte_mask); if (!nic_matcher->s_htbl) goto free_e_htbl; /* make sure the tables exist while empty */ dr_htbl_get(nic_matcher->s_htbl); dr_htbl_get(nic_matcher->e_anchor); return 0; free_e_htbl: dr_ste_htbl_free(nic_matcher->e_anchor); clear_ste_builders: dr_matcher_clear_ste_builders(nic_matcher); return errno; } static int dr_matcher_init_fdb(struct mlx5dv_dr_matcher *matcher) { int ret; ret = dr_matcher_init_nic(matcher, &matcher->rx); if (ret) return ret; ret = dr_matcher_init_nic(matcher, &matcher->tx); if (ret) goto uninit_nic_rx; return 0; uninit_nic_rx: dr_matcher_uninit_nic(&matcher->rx); return ret; } static int dr_matcher_init_root(struct mlx5dv_dr_matcher *matcher, struct mlx5dv_flow_match_parameters *mask) { struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; struct mlx5dv_flow_matcher_attr attr = {}; enum mlx5dv_flow_table_type type; if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_RX) type = MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_RX; else if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_TX) type = MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_TX; else type = MLX5_IB_UAPI_FLOW_TABLE_TYPE_FDB; attr.match_mask = mask; attr.priority = matcher->prio; attr.type = IBV_FLOW_ATTR_NORMAL; attr.match_criteria_enable = matcher->match_criteria; attr.ft_type = type; attr.comp_mask = MLX5DV_FLOW_MATCHER_MASK_FT_TYPE; matcher->dv_matcher = mlx5dv_create_flow_matcher(dmn->ctx, &attr); if (!matcher->dv_matcher) return errno; return 0; } static bool dr_matcher_is_fixed_size(struct mlx5dv_dr_matcher *matcher) { return (matcher->rx.fixed_size || matcher->tx.fixed_size); } static int dr_matcher_copy_param(struct mlx5dv_dr_matcher *matcher, struct 
mlx5dv_flow_match_parameters *mask) { struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; uint8_t match_criteria = matcher->match_criteria; uint64_t *consumed_mask_buf; uint32_t max_mask_sz; int ret = 0; if (match_criteria >= DR_MATCHER_CRITERIA_MAX) { dr_dbg(dmn, "Invalid match criteria attribute\n"); errno = EINVAL; return errno; } if (mask) { max_mask_sz = DEVX_ST_SZ_BYTES(dr_match_param); if (mask->match_sz > max_mask_sz) { dr_dbg(dmn, "Invalid match size attribute\n"); errno = EINVAL; return errno; } consumed_mask_buf = calloc(1, max_mask_sz); if (!consumed_mask_buf) { errno = ENOMEM; return errno; } memcpy(consumed_mask_buf, mask->match_buf, mask->match_sz); dr_ste_copy_param(match_criteria, &matcher->mask, consumed_mask_buf, max_mask_sz, true); ret = dr_matcher_is_mask_consumed((uint8_t *)consumed_mask_buf, match_criteria); if (!ret) { dr_dbg(dmn, "Match param mask contains unsupported parameters\n"); errno = EOPNOTSUPP; ret = errno; } else { ret = 0; } free(consumed_mask_buf); } return ret; } static int dr_matcher_init(struct mlx5dv_dr_matcher *matcher, struct mlx5dv_flow_match_parameters *mask) { struct mlx5dv_dr_table *tbl = matcher->tbl; struct mlx5dv_dr_domain *dmn = tbl->dmn; int ret; if (dr_is_root_table(matcher->tbl)) return dr_matcher_init_root(matcher, mask); ret = dr_matcher_copy_param(matcher, mask); if (ret) return ret; switch (dmn->type) { case MLX5DV_DR_DOMAIN_TYPE_NIC_RX: matcher->rx.nic_tbl = &tbl->rx; ret = dr_matcher_init_nic(matcher, &matcher->rx); break; case MLX5DV_DR_DOMAIN_TYPE_NIC_TX: matcher->tx.nic_tbl = &tbl->tx; ret = dr_matcher_init_nic(matcher, &matcher->tx); break; case MLX5DV_DR_DOMAIN_TYPE_FDB: matcher->rx.nic_tbl = &tbl->rx; matcher->tx.nic_tbl = &tbl->tx; ret = dr_matcher_init_fdb(matcher); break; default: assert(false); errno = EINVAL; return errno; } /* Drain QP to resolve possible race between new multi QP rules * and matcher hash table initial creation.
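 * Only fixed-size matchers need this drain, since only they may be
 * written through multiple send rings concurrently.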
*/ if (dr_matcher_is_fixed_size(matcher)) dr_send_ring_force_drain(dmn); return ret; } static int dr_matcher_set_nic_matcher_layout(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, struct mlx5dv_dr_matcher_layout *matcher_layout) { struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; int ret = 0; if (!dr_matcher_is_definer_support_mq(nic_matcher)) { dr_dbg(dmn, "not supported not a definer\n"); errno = ENOTSUP; return ENOTSUP; } dr_domain_lock(dmn); if (matcher_layout->flags & MLX5DV_DR_MATCHER_LAYOUT_NUM_RULE) { /* if needed set dmn->info.max_log_sw_icm_sz and pool max_log_chunk_sz */ dr_domain_set_max_ste_icm_size(dmn, matcher_layout->log_num_of_rules_hint); ret = dr_rule_rehash_matcher_s_anchor(matcher, nic_matcher, matcher_layout->log_num_of_rules_hint); if (ret) { dr_dbg(dmn, "failed rehash with log-size: %d\n", matcher_layout->log_num_of_rules_hint); goto out; } } if (matcher_layout->flags & MLX5DV_DR_MATCHER_LAYOUT_RESIZABLE) { nic_matcher->fixed_size = false; } else { nic_matcher->fixed_size = true; dmn->info.use_mqs = true; } dr_send_ring_force_drain(dmn); out: dr_domain_unlock(dmn); return ret; } int mlx5dv_dr_matcher_set_layout(struct mlx5dv_dr_matcher *matcher, struct mlx5dv_dr_matcher_layout *matcher_layout) { struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; int ret = 0; if (dr_is_root_table(matcher->tbl)) { dr_dbg(dmn, "Not supported in root table\n"); errno = ENOTSUP; return ENOTSUP; } if (!check_comp_mask(matcher_layout->flags, MLX5DV_DR_MATCHER_LAYOUT_RESIZABLE | MLX5DV_DR_MATCHER_LAYOUT_NUM_RULE)) { dr_dbg(dmn, "Not supported flags 0x%x\n", matcher_layout->flags); errno = ENOTSUP; return ENOTSUP; } if ((matcher_layout->flags & MLX5DV_DR_MATCHER_LAYOUT_NUM_RULE) && !dr_domain_is_support_ste_icm_size(dmn, matcher_layout->log_num_of_rules_hint)) { dr_dbg(dmn, "the size is too big: %d\n", matcher_layout->log_num_of_rules_hint); errno = ENOTSUP; return ENOTSUP; } if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_RX || dmn->type == MLX5DV_DR_DOMAIN_TYPE_FDB) { ret = dr_matcher_set_nic_matcher_layout(matcher, &matcher->rx, matcher_layout); } if (!ret && (dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_TX || dmn->type == MLX5DV_DR_DOMAIN_TYPE_FDB)) { ret = dr_matcher_set_nic_matcher_layout(matcher, &matcher->tx, matcher_layout); } if (ret) { dr_dbg(dmn, "failed nic (%d) rehash with log-size: %d\n", dmn->type, matcher_layout->log_num_of_rules_hint); return ret; } return 0; } struct mlx5dv_dr_matcher * mlx5dv_dr_matcher_create(struct mlx5dv_dr_table *tbl, uint16_t priority, uint8_t match_criteria_enable, struct mlx5dv_flow_match_parameters *mask) { struct mlx5dv_dr_matcher *matcher; int ret; atomic_fetch_add(&tbl->refcount, 1); matcher = calloc(1, sizeof(*matcher)); if (!matcher) { errno = ENOMEM; goto dec_ref; } matcher->tbl = tbl; matcher->prio = priority; matcher->match_criteria = match_criteria_enable; atomic_init(&matcher->refcount, 1); list_node_init(&matcher->matcher_list); list_head_init(&matcher->rule_list); dr_domain_lock(tbl->dmn); ret = dr_matcher_init(matcher, mask); if (ret) goto free_matcher; ret = dr_matcher_add_to_tbl(matcher); if (ret) goto matcher_uninit; dr_domain_unlock(tbl->dmn); return matcher; matcher_uninit: dr_matcher_uninit(matcher); free_matcher: dr_domain_unlock(tbl->dmn); free(matcher); dec_ref: atomic_fetch_sub(&tbl->refcount, 1); return NULL; } static int dr_matcher_disconnect(struct mlx5dv_dr_domain *dmn, struct dr_table_rx_tx *nic_tbl, struct dr_matcher_rx_tx *next_nic_matcher, struct dr_matcher_rx_tx *prev_nic_matcher) { struct 
dr_domain_rx_tx *nic_dmn = nic_tbl->nic_dmn; struct dr_htbl_connect_info info; struct dr_ste_htbl *prev_anchor; if (prev_nic_matcher) prev_anchor = prev_nic_matcher->e_anchor; else prev_anchor = nic_tbl->s_anchor; /* Connect previous anchor hash table to next matcher or to the default address */ if (next_nic_matcher) { info.type = CONNECT_HIT; info.hit_next_htbl = next_nic_matcher->s_htbl; next_nic_matcher->s_htbl->pointing_ste = prev_anchor->ste_arr; prev_anchor->ste_arr[0].next_htbl = next_nic_matcher->s_htbl; } else { info.type = CONNECT_MISS; info.miss_icm_addr = nic_dmn->default_icm_addr; prev_anchor->ste_arr[0].next_htbl = NULL; } return dr_ste_htbl_init_and_postsend(dmn, nic_dmn, prev_anchor, &info, true, 0); } static int dr_matcher_remove_from_tbl(struct mlx5dv_dr_matcher *matcher) { struct mlx5dv_dr_matcher *prev_matcher, *next_matcher; struct mlx5dv_dr_table *tbl = matcher->tbl; struct mlx5dv_dr_domain *dmn = tbl->dmn; int ret = 0; if (dr_is_root_table(matcher->tbl)) return 0; prev_matcher = list_prev(&tbl->matcher_list, matcher, matcher_list); next_matcher = list_next(&tbl->matcher_list, matcher, matcher_list); if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_FDB || dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_RX) { ret = dr_matcher_disconnect(dmn, &tbl->rx, next_matcher ? &next_matcher->rx : NULL, prev_matcher ? &prev_matcher->rx : NULL); if (ret) return ret; } if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_FDB || dmn->type == MLX5DV_DR_DOMAIN_TYPE_NIC_TX) { ret = dr_matcher_disconnect(dmn, &tbl->tx, next_matcher ? &next_matcher->tx : NULL, prev_matcher ? &prev_matcher->tx : NULL); if (ret) return ret; } list_del(&matcher->matcher_list); return 0; } int mlx5dv_dr_matcher_destroy(struct mlx5dv_dr_matcher *matcher) { struct mlx5dv_dr_table *tbl = matcher->tbl; if (atomic_load(&matcher->refcount) > 1) return EBUSY; dr_domain_lock(tbl->dmn); dr_matcher_remove_from_tbl(matcher); dr_matcher_uninit(matcher); atomic_fetch_sub(&matcher->tbl->refcount, 1); dr_domain_unlock(tbl->dmn); free(matcher); return 0; } rdma-core-56.1/providers/mlx5/dr_ptrn.c000066400000000000000000000155101477342711600200520ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB // Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. 
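/* This file implements a reference-counted cache of modify-header patterns
 * held in pattern ICM memory. A usage sketch (illustrative only, using the
 * functions defined below; "mngr" stands for a manager returned by
 * dr_ptrn_mngr_create()):
 *
 *	struct dr_ptrn_obj *ptrn;
 *
 *	ptrn = dr_ptrn_cache_get_pattern(mngr, DR_PTRN_TYP_MODIFY_HDR,
 *					 num_of_actions, data);
 *	if (!ptrn)
 *		return errno;
 *	... program STEs using ptrn->rewrite_param.index ...
 *	dr_ptrn_cache_put_pattern(mngr, ptrn);
 */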
#include "mlx5dv_dr.h" #include "dr_ste.h" enum dr_ptrn_modify_hdr_action_id { DR_PTRN_MODIFY_HDR_ACTION_ID_NOP = 0x00, DR_PTRN_MODIFY_HDR_ACTION_ID_COPY = 0x05, DR_PTRN_MODIFY_HDR_ACTION_ID_SET = 0x06, DR_PTRN_MODIFY_HDR_ACTION_ID_ADD = 0x07, DR_PTRN_MODIFY_HDR_ACTION_ID_INSERT_INLINE = 0x0a, }; struct dr_ptrn_mngr { struct mlx5dv_dr_domain *dmn; struct dr_icm_pool *ptrn_icm_pool; /* cache for modify_header ptrn */ struct list_head ptrn_list; pthread_mutex_t modify_hdr_mutex; }; int dr_ptrn_sync_pool(struct dr_ptrn_mngr *ptrn_mngr) { return dr_icm_pool_sync_pool(ptrn_mngr->ptrn_icm_pool); } /* Cache structure and functions */ static bool dr_ptrn_compare_modify_hdr(size_t cur_num_of_actions, __be64 cur_hw_actions[], size_t num_of_actions, __be64 hw_actions[]) { int i; if (cur_num_of_actions != num_of_actions) return false; for (i = 0; i < num_of_actions; i++) { u8 action_id = DEVX_GET(ste_double_action_add_v1, &hw_actions[i], action_id); if (action_id == DR_PTRN_MODIFY_HDR_ACTION_ID_COPY) { if (hw_actions[i] != cur_hw_actions[i]) return false; } else { if ((__force __be32)hw_actions[i] != (__force __be32)cur_hw_actions[i]) return false; } } return true; } static bool dr_ptrn_compare_pattern(enum dr_ptrn_type type, enum dr_ptrn_type cur_type, size_t cur_num_of_actions, __be64 cur_hw_action[], size_t num_of_actions, __be64 hw_action[]) { if ((cur_num_of_actions != num_of_actions) || (cur_type != type)) return false; switch (type) { case DR_PTRN_TYP_MODIFY_HDR: return dr_ptrn_compare_modify_hdr(cur_num_of_actions, (__be64 *)cur_hw_action, num_of_actions, (__be64 *)hw_action); case DR_PTRN_TYP_TNL_L3_TO_L2: return true; default: assert(false); return false; } } static struct dr_ptrn_obj * dr_ptrn_find_cached_pattern(struct dr_ptrn_mngr *mngr, enum dr_ptrn_type type, size_t num_of_actions, __be64 hw_actions[]) { struct dr_ptrn_obj *tmp; struct dr_ptrn_obj *cached_pattern; list_for_each_safe(&mngr->ptrn_list, cached_pattern, tmp, list) { if (dr_ptrn_compare_pattern(type, cached_pattern->type, cached_pattern->rewrite_param.num_of_actions, (__be64 *)cached_pattern->rewrite_param.data, num_of_actions, hw_actions)) { list_del(&cached_pattern->list); list_add(&mngr->ptrn_list, &cached_pattern->list); return cached_pattern; } } return NULL; } static struct dr_ptrn_obj * dr_ptrn_alloc_pattern(struct dr_ptrn_mngr *mngr, uint16_t num_of_actions, uint8_t *data, enum dr_ptrn_type type) { struct dr_ptrn_obj *pattern; struct dr_icm_chunk *chunk; uint32_t chunck_size; uint32_t index; chunck_size = ilog32(num_of_actions - 1); /* HW modify action index granularity is at least 64B */ chunck_size = max_t(uint32_t, chunck_size, DR_CHUNK_SIZE_8); chunk = dr_icm_alloc_chunk(mngr->ptrn_icm_pool, chunck_size); if (!chunk) { errno = ENOMEM; return NULL; } index = (dr_icm_pool_get_chunk_icm_addr(chunk) - mngr->dmn->info.caps.hdr_modify_pattern_icm_addr) / ACTION_CACHE_LINE_SIZE; pattern = calloc(1, sizeof(struct dr_ptrn_obj)); if (!pattern) { errno = ENOMEM; goto free_chunk; } pattern->rewrite_param.data = calloc(1, num_of_actions * DR_MODIFY_ACTION_SIZE); if (!pattern->rewrite_param.data) { errno = ENOMEM; goto free_pattern; } pattern->type = type; memcpy(pattern->rewrite_param.data, data, num_of_actions * DR_MODIFY_ACTION_SIZE); pattern->rewrite_param.chunk = chunk; pattern->rewrite_param.index = index; pattern->rewrite_param.num_of_actions = num_of_actions; list_add(&mngr->ptrn_list, &pattern->list); atomic_init(&pattern->refcount, 0); return pattern; free_pattern: free(pattern); free_chunk: dr_icm_free_chunk(chunk); 
return NULL; } static void dr_ptrn_free_pattern(struct dr_ptrn_obj *pattern) { list_del(&pattern->list); dr_icm_free_chunk(pattern->rewrite_param.chunk); free(pattern->rewrite_param.data); free(pattern); } struct dr_ptrn_obj * dr_ptrn_cache_get_pattern(struct dr_ptrn_mngr *mngr, enum dr_ptrn_type type, uint16_t num_of_actions, uint8_t *data) { struct dr_ptrn_obj *pattern; uint64_t *hw_actions; uint8_t action_id; int i; pthread_mutex_lock(&mngr->modify_hdr_mutex); pattern = dr_ptrn_find_cached_pattern(mngr, type, num_of_actions, (__be64 *)data); if (!pattern) { /* Alloc and add new pattern to cache */ pattern = dr_ptrn_alloc_pattern(mngr, num_of_actions, data, type); if (!pattern) goto out_unlock; hw_actions = (uint64_t *)pattern->rewrite_param.data; /* Here we mask the pattern data to create a valid pattern * since we do an OR operation between the arg and pattern */ for (i = 0; i < num_of_actions; i++) { action_id = DR_STE_GET(double_action_add_v1, &hw_actions[i], action_id); if (action_id == DR_PTRN_MODIFY_HDR_ACTION_ID_SET || action_id == DR_PTRN_MODIFY_HDR_ACTION_ID_ADD || action_id == DR_PTRN_MODIFY_HDR_ACTION_ID_INSERT_INLINE) DR_STE_SET(double_action_set_v1, &hw_actions[i], inline_data, 0); } if (dr_send_postsend_pattern(mngr->dmn, pattern->rewrite_param.chunk, num_of_actions, pattern->rewrite_param.data)) goto free_pattern; } atomic_fetch_add(&pattern->refcount, 1); pthread_mutex_unlock(&mngr->modify_hdr_mutex); return pattern; free_pattern: dr_ptrn_free_pattern(pattern); out_unlock: pthread_mutex_unlock(&mngr->modify_hdr_mutex); return NULL; } void dr_ptrn_cache_put_pattern(struct dr_ptrn_mngr *mngr, struct dr_ptrn_obj *pattern) { pthread_mutex_lock(&mngr->modify_hdr_mutex); if (atomic_fetch_sub(&pattern->refcount, 1) != 1) goto out; dr_ptrn_free_pattern(pattern); out: pthread_mutex_unlock(&mngr->modify_hdr_mutex); } struct dr_ptrn_mngr * dr_ptrn_mngr_create(struct mlx5dv_dr_domain *dmn) { struct dr_ptrn_mngr *mngr; if (!dr_domain_is_support_modify_hdr_cache(dmn)) return NULL; mngr = calloc(1, sizeof(*mngr)); if (!mngr) { errno = ENOMEM; return NULL; } mngr->dmn = dmn; mngr->ptrn_icm_pool = dr_icm_pool_create(dmn, DR_ICM_TYPE_MODIFY_HDR_PTRN); if (!mngr->ptrn_icm_pool) { dr_dbg(dmn, "Couldn't get modify-header-pattern memory for %s\n", ibv_get_device_name(dmn->ctx->device)); goto free_mngr; } list_head_init(&mngr->ptrn_list); return mngr; free_mngr: free(mngr); return NULL; } void dr_ptrn_mngr_destroy(struct dr_ptrn_mngr *mngr) { struct dr_ptrn_obj *tmp; struct dr_ptrn_obj *pattern; if (!mngr) return; list_for_each_safe(&mngr->ptrn_list, pattern, tmp, list) { list_del(&pattern->list); free(pattern->rewrite_param.data); free(pattern); } dr_icm_pool_destroy(mngr->ptrn_icm_pool); free(mngr); } rdma-core-56.1/providers/mlx5/dr_rule.c000066400000000000000000001277751477342711600200570ustar00rootroot00000000000000/* * Copyright (c) 2019, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include <stdlib.h> #include <string.h> #include "mlx5dv_dr.h" /* +1 for the cross GVMI STE */ #define DR_RULE_MAX_STE_CHAIN (DR_RULE_MAX_STES + DR_ACTION_MAX_STES + 1) static int dr_rule_append_to_miss_list(struct dr_ste_ctx *ste_ctx, struct dr_ste *new_last_ste, struct list_head *miss_list, struct list_head *send_list) { struct dr_ste_send_info *ste_info_last; struct dr_ste *last_ste; /* The new entry will be inserted after the last */ last_ste = list_tail(miss_list, struct dr_ste, miss_list_node); assert(last_ste); ste_info_last = calloc(1, sizeof(*ste_info_last)); if (!ste_info_last) { errno = ENOMEM; return errno; } dr_ste_set_miss_addr(ste_ctx, last_ste->hw_ste, dr_ste_get_icm_addr(new_last_ste)); list_add_tail(miss_list, &new_last_ste->miss_list_node); dr_send_fill_and_append_ste_send_info(last_ste, DR_STE_SIZE_CTRL, 0, last_ste->hw_ste, ste_info_last, send_list, true); return 0; } static struct dr_ste *dr_rule_create_collision_htbl(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, uint8_t *hw_ste) { struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; struct dr_ste_ctx *ste_ctx = dmn->ste_ctx; struct dr_ste_htbl *new_htbl; struct dr_ste *ste; /* Create new table for miss entry */ new_htbl = dr_ste_htbl_alloc(dmn->ste_icm_pool, DR_CHUNK_SIZE_1, nic_matcher->ste_builder->htbl_type, DR_STE_LU_TYPE_DONT_CARE, 0); if (!new_htbl) { dr_dbg(dmn, "Failed allocating collision table\n"); return NULL; } /* One and only entry, never grows */ ste = new_htbl->ste_arr; dr_ste_set_miss_addr(ste_ctx, hw_ste, dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk)); dr_htbl_get(new_htbl); return ste; } static struct dr_ste *dr_rule_create_collision_entry(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, uint8_t *hw_ste, struct dr_ste *orig_ste, uint8_t send_ring_idx) { struct dr_ste *ste; ste = dr_rule_create_collision_htbl(matcher, nic_matcher, hw_ste); if (!ste) { dr_dbg(matcher->tbl->dmn, "Failed creating collision entry\n"); return NULL; } ste->ste_chain_location = orig_ste->ste_chain_location; ste->htbl->pointing_ste = orig_ste->htbl->pointing_ste; /* In collision entry, all members share the same miss_list_head */ ste->htbl->miss_list = dr_ste_get_miss_list(orig_ste); /* Next table */ if (dr_ste_create_next_htbl(matcher, nic_matcher, ste, hw_ste, DR_CHUNK_SIZE_1, send_ring_idx)) { dr_dbg(matcher->tbl->dmn, "Failed allocating table\n"); goto free_tbl; } return ste; free_tbl: dr_htbl_put(ste->htbl); return NULL; } static int dr_rule_handle_one_ste_in_update_list(struct dr_ste_send_info *ste_info, struct mlx5dv_dr_domain *dmn, uint8_t send_ring_idx) { int ret; list_del(&ste_info->send_list); /* Copy data to ste, only reduced size or control, the last 16B (mask) * is already written to the hw.
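 * A DR_STE_SIZE_CTRL update touches only the control segment; any other
 * size copies the whole reduced STE (ste->size bytes) before posting.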
*/ if (ste_info->size == DR_STE_SIZE_CTRL) memcpy(ste_info->ste->hw_ste, ste_info->data, DR_STE_SIZE_CTRL); else memcpy(ste_info->ste->hw_ste, ste_info->data, ste_info->ste->size); ret = dr_send_postsend_ste(dmn, ste_info->ste, ste_info->data, ste_info->size, ste_info->offset, send_ring_idx); if (ret) goto out; out: free(ste_info); return ret; } int dr_rule_send_update_list(struct list_head *send_ste_list, struct mlx5dv_dr_domain *dmn, bool is_reverse, uint8_t send_ring_idx) { struct dr_ste_send_info *ste_info, *tmp_ste_info; int ret; if (is_reverse) { list_for_each_rev_safe(send_ste_list, ste_info, tmp_ste_info, send_list) { ret = dr_rule_handle_one_ste_in_update_list(ste_info, dmn, send_ring_idx); if (ret) return ret; } } else { list_for_each_safe(send_ste_list, ste_info, tmp_ste_info, send_list) { ret = dr_rule_handle_one_ste_in_update_list(ste_info, dmn, send_ring_idx); if (ret) return ret; } } return 0; } static struct dr_ste *dr_rule_find_ste_in_miss_list(struct list_head *miss_list, uint8_t *hw_ste, uint8_t tag_size) { struct dr_ste *ste; /* Check if hw_ste is present in the list */ list_for_each(miss_list, ste, miss_list_node) if (dr_ste_equal_tag(ste->hw_ste, hw_ste, tag_size)) return ste; return NULL; } static struct dr_ste * dr_rule_rehash_handle_collision(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, struct list_head *update_list, struct dr_ste *col_ste, uint8_t *hw_ste) { struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; struct dr_ste *new_ste; int ret; new_ste = dr_rule_create_collision_htbl(matcher, nic_matcher, hw_ste); if (!new_ste) return NULL; /* Update collision pointing STE */ new_ste->htbl->pointing_ste = col_ste->htbl->pointing_ste; /* In collision entry, all members share the same miss_list_head */ new_ste->htbl->miss_list = dr_ste_get_miss_list(col_ste); /* Update the previous from the list */ ret = dr_rule_append_to_miss_list(dmn->ste_ctx, new_ste, dr_ste_get_miss_list(col_ste), update_list); if (ret) { dr_dbg(dmn, "Failed update dup entry\n"); goto err_exit; } return new_ste; err_exit: dr_htbl_put(new_ste->htbl); return NULL; } static void dr_rule_rehash_copy_ste_ctrl(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, struct dr_ste *cur_ste, struct dr_ste *new_ste) { new_ste->next_htbl = cur_ste->next_htbl; new_ste->ste_chain_location = cur_ste->ste_chain_location; if (new_ste->next_htbl) new_ste->next_htbl->pointing_ste = new_ste; /* * We need to copy the refcount since this ste * may have been traversed several times */ atomic_init(&new_ste->refcount, atomic_load(&cur_ste->refcount)); /* Link old STEs rule to the new ste */ dr_rule_set_last_member(cur_ste->rule_rx_tx, new_ste, false); } static struct dr_ste *dr_rule_rehash_copy_ste(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, struct dr_ste *cur_ste, struct dr_ste_htbl *new_htbl, struct list_head *update_list) { struct dr_ste_ctx *ste_ctx = matcher->tbl->dmn->ste_ctx; uint8_t hw_ste[DR_STE_SIZE] = {}; struct dr_ste_send_info *ste_info; bool use_update_list = false; struct dr_ste_build *sb; struct dr_ste *new_ste; uint8_t sb_idx; int new_idx; /* Copy STE mask from the matcher */ sb_idx = cur_ste->ste_chain_location - 1; sb = &nic_matcher->ste_builder[sb_idx]; /* Copy STE control, tag and mask on legacy STE */ memcpy(hw_ste, cur_ste->hw_ste, cur_ste->size); dr_ste_set_bit_mask(hw_ste, sb); dr_ste_set_miss_addr(ste_ctx, hw_ste, dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk)); new_idx = dr_ste_calc_hash_index(hw_ste, 
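/*
 * A note on the structure used above: every bucket of a steering hash
 * table is a "miss list". Entries that hash to the same index are
 * chained through their miss addresses, and a lookup walks the chain
 * comparing tags, which is what dr_rule_find_ste_in_miss_list() does.
 * A minimal, self-contained sketch of the idea (toy types, not the
 * provider's structures):
 */
#if 0
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct toy_ste {
	uint8_t tag[16];
	struct toy_ste *miss;	/* tried next on tag mismatch */
};

static struct toy_ste *toy_find_in_miss_chain(struct toy_ste *head,
					      const uint8_t *tag,
					      size_t tag_sz)
{
	for (; head; head = head->miss)
		if (!memcmp(head->tag, tag, tag_sz))
			return head;
	return NULL;
}
#endif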
new_htbl); new_ste = &new_htbl->ste_arr[new_idx]; if (dr_ste_is_not_used(new_ste)) { dr_htbl_get(new_htbl); list_add_tail(dr_ste_get_miss_list(new_ste), &new_ste->miss_list_node); } else { new_ste = dr_rule_rehash_handle_collision(matcher, nic_matcher, update_list, new_ste, hw_ste); if (!new_ste) { dr_dbg(matcher->tbl->dmn, "Failed adding collision entry, index: %d\n", new_idx); return NULL; } new_htbl->ctrl.num_of_collisions++; use_update_list = true; } memcpy(new_ste->hw_ste, hw_ste, new_ste->size); new_htbl->ctrl.num_of_valid_entries++; if (use_update_list) { ste_info = calloc(1, sizeof(*ste_info)); if (!ste_info) { dr_dbg(matcher->tbl->dmn, "Failed allocating ste_info\n"); errno = ENOMEM; dr_htbl_put(new_ste->htbl); return NULL; } dr_send_fill_and_append_ste_send_info(new_ste, DR_STE_SIZE, 0, hw_ste, ste_info, update_list, true); } dr_rule_rehash_copy_ste_ctrl(matcher, nic_matcher, cur_ste, new_ste); return new_ste; } static int dr_rule_rehash_copy_miss_list(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, struct list_head *cur_miss_list, struct dr_ste_htbl *new_htbl, struct list_head *update_list) { struct dr_ste *tmp_ste, *cur_ste, *new_ste; list_for_each_safe(cur_miss_list, cur_ste, tmp_ste, miss_list_node) { new_ste = dr_rule_rehash_copy_ste(matcher, nic_matcher, cur_ste, new_htbl, update_list); if (!new_ste) goto err_insert; list_del(&cur_ste->miss_list_node); dr_htbl_put(cur_ste->htbl); } return 0; err_insert: dr_dbg(matcher->tbl->dmn, "Fatal error during resize\n"); assert(false); return errno; } static int dr_rule_rehash_copy_htbl(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, struct dr_ste_htbl *cur_htbl, struct dr_ste_htbl *new_htbl, struct list_head *update_list, uint8_t lock_index) { struct dr_ste *cur_ste; int cur_entries; int err = 0; int i; cur_entries = dr_icm_pool_chunk_size_to_entries(cur_htbl->chunk_size); for (i = 0; i < cur_entries; i++) { cur_ste = &cur_htbl->ste_arr[i]; if (dr_ste_is_not_used(cur_ste)) /* Empty, nothing to copy */ continue; err = dr_rule_rehash_copy_miss_list(matcher, nic_matcher, dr_ste_get_miss_list(cur_ste), new_htbl, update_list); if (err) goto clean_copy; /* In order to decrease memory allocation of ste_info struct send * the current table raw now. 
*/ err = dr_rule_send_update_list(update_list, matcher->tbl->dmn, false, lock_index); if (err) { dr_dbg(matcher->tbl->dmn, "Failed updating table to HW\n"); goto clean_copy; } } clean_copy: return err; } static struct dr_ste_htbl *dr_rule_rehash_htbl_common(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, struct dr_ste_htbl *cur_htbl, uint8_t ste_location, struct list_head *update_list, enum dr_icm_chunk_size new_size, uint8_t lock_index) { struct dr_domain_rx_tx *nic_dmn = nic_matcher->nic_tbl->nic_dmn; struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; uint8_t formated_ste[DR_STE_SIZE] = {}; struct dr_ste_send_info *ste_info; struct dr_htbl_connect_info info; LIST_HEAD(rehash_table_send_list); struct dr_ste_htbl *new_htbl; struct dr_ste *ste_to_update; uint8_t *mask = NULL; int err; ste_info = calloc(1, sizeof(*ste_info)); if (!ste_info) { errno = ENOMEM; return NULL; } new_htbl = dr_ste_htbl_alloc(dmn->ste_icm_pool, new_size, cur_htbl->type, cur_htbl->lu_type, cur_htbl->byte_mask); if (!new_htbl) { dr_dbg(dmn, "Failed to allocate new hash table\n"); goto free_ste_info; } /* Write new table to HW */ info.type = CONNECT_MISS; info.miss_icm_addr = dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk); dr_ste_set_formated_ste(dmn->ste_ctx, dmn->info.caps.gvmi, nic_dmn->type, new_htbl, formated_ste, &info); new_htbl->pointing_ste = cur_htbl->pointing_ste; new_htbl->pointing_ste->next_htbl = new_htbl; err = dr_rule_rehash_copy_htbl(matcher, nic_matcher, cur_htbl, new_htbl, &rehash_table_send_list, lock_index); if (err) goto free_new_htbl; if (new_htbl->type == DR_STE_HTBL_TYPE_LEGACY) mask = nic_matcher->ste_builder[ste_location - 1].bit_mask; if (dr_send_postsend_htbl(dmn, new_htbl, formated_ste, mask, lock_index)) { dr_dbg(dmn, "Failed writing table to HW\n"); goto free_new_htbl; } /* Connect previous hash table to current */ if (ste_location == 1) { /* The previous table is an anchor, anchors size is always one STE */ struct dr_ste_htbl *prev_htbl = cur_htbl->pointing_ste->htbl; /* On matcher s_anchor we keep an extra refcount */ dr_htbl_get(new_htbl); dr_htbl_put(cur_htbl); nic_matcher->s_htbl = new_htbl; /* * It is safe to operate dr_ste_set_hit_addr on the hw_ste here * (48B len) which works only on first 32B */ dr_ste_set_hit_addr(dmn->ste_ctx, prev_htbl->ste_arr[0].hw_ste, dr_icm_pool_get_chunk_icm_addr(new_htbl->chunk), new_htbl->chunk->num_of_entries); ste_to_update = &prev_htbl->ste_arr[0]; } else { dr_ste_set_hit_addr_by_next_htbl(dmn->ste_ctx, cur_htbl->pointing_ste->hw_ste, new_htbl); ste_to_update = cur_htbl->pointing_ste; } dr_send_fill_and_append_ste_send_info(ste_to_update, DR_STE_SIZE_CTRL, 0, ste_to_update->hw_ste, ste_info, update_list, false); return new_htbl; free_new_htbl: dr_ste_htbl_free(new_htbl); free_ste_info: free(ste_info); return NULL; } static struct dr_ste_htbl *dr_rule_rehash_htbl(struct mlx5dv_dr_rule *rule, struct dr_rule_rx_tx *nic_rule, struct dr_ste_htbl *cur_htbl, uint8_t ste_location, struct list_head *update_list, enum dr_icm_chunk_size new_size) { struct dr_matcher_rx_tx *nic_matcher = nic_rule->nic_matcher; struct mlx5dv_dr_matcher *matcher = rule->matcher; return dr_rule_rehash_htbl_common(matcher, nic_matcher, cur_htbl, ste_location, update_list, new_size, nic_rule->lock_index); } int dr_rule_rehash_matcher_s_anchor(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, enum dr_icm_chunk_size new_size) { struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; struct dr_ste_htbl *new_htbl; 
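/*
 * dr_rule_rehash_htbl_common() above follows the classic grow-and-rehash
 * recipe: allocate the larger table, re-insert every used entry, write
 * the new table to HW, then repoint whatever referenced the old table.
 * Stripped of the HW details it looks roughly like this (toy types,
 * placeholder hash, purely illustrative):
 */
#if 0
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_entry {
	uint8_t tag[16];
	struct toy_entry *miss;
};

struct toy_tbl {
	struct toy_entry **buckets;
	unsigned int nbuckets;	/* power of two */
};

static unsigned int toy_hash(const uint8_t *tag)
{
	return tag[0];		/* placeholder hash */
}

static int toy_rehash(struct toy_tbl *tbl, unsigned int new_nbuckets)
{
	struct toy_entry **nb = calloc(new_nbuckets, sizeof(*nb));
	unsigned int i;

	if (!nb)
		return ENOMEM;
	for (i = 0; i < tbl->nbuckets; i++) {
		struct toy_entry *e = tbl->buckets[i];

		while (e) {
			struct toy_entry *next = e->miss;
			unsigned int idx = toy_hash(e->tag) &
					   (new_nbuckets - 1);

			/* push onto the new bucket's chain */
			e->miss = nb[idx];
			nb[idx] = e;
			e = next;
		}
	}
	free(tbl->buckets);
	tbl->buckets = nb;
	tbl->nbuckets = new_nbuckets;
	return 0;
}
#endif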
LIST_HEAD(update_list); int ret; if (nic_matcher->s_htbl->chunk_size == new_size) { dr_dbg(dmn, "both are with the same size, nothing to do\n"); return 0; } new_htbl = dr_rule_rehash_htbl_common(matcher, nic_matcher, nic_matcher->s_htbl, 1, &update_list, new_size, 0); if (!new_htbl) { dr_dbg(dmn, "Failed creating new matcher s_anchor\n"); goto err_out; } ret = dr_rule_send_update_list(&update_list, dmn, true, 0); if (ret) { dr_dbg(dmn, "Failed sending ste!\n"); goto free_new_htbl; } dr_ste_htbl_free(nic_matcher->s_htbl); nic_matcher->s_htbl = new_htbl; return 0; free_new_htbl: dr_ste_htbl_free(new_htbl); err_out: return ENOTSUP; } static struct dr_ste_htbl *dr_rule_rehash(struct mlx5dv_dr_rule *rule, struct dr_rule_rx_tx *nic_rule, struct dr_ste_htbl *cur_htbl, uint8_t ste_location, struct list_head *update_list) { struct mlx5dv_dr_domain *dmn = rule->matcher->tbl->dmn; enum dr_icm_chunk_size new_size; new_size = dr_icm_next_higher_chunk(cur_htbl->chunk_size); new_size = min_t(uint32_t, new_size, dmn->info.max_log_sw_icm_rehash_sz); if (new_size == cur_htbl->chunk_size) return NULL; /* Skip rehash, we already at the max size */ return dr_rule_rehash_htbl(rule, nic_rule, cur_htbl, ste_location, update_list, new_size); } static struct dr_ste *dr_rule_handle_collision(struct mlx5dv_dr_matcher *matcher, struct dr_rule_rx_tx *nic_rule, struct dr_ste *ste, uint8_t *hw_ste, struct list_head *miss_list, struct list_head *send_list) { struct dr_matcher_rx_tx *nic_matcher = nic_rule->nic_matcher; struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; struct dr_ste_ctx *ste_ctx = dmn->ste_ctx; struct dr_ste_send_info *ste_info; struct dr_ste *new_ste; ste_info = calloc(1, sizeof(*ste_info)); if (!ste_info) { dr_dbg(dmn, "Failed allocating ste_info\n"); errno = ENOMEM; return NULL; } new_ste = dr_rule_create_collision_entry(matcher, nic_matcher, hw_ste, ste, nic_rule->lock_index); if (!new_ste) { dr_dbg(dmn, "Failed creating collision entry\n"); goto free_send_info; } if (dr_rule_append_to_miss_list(ste_ctx, new_ste, miss_list, send_list)) { dr_dbg(dmn, "Failed to update prev miss_list\n"); goto err_exit; } dr_send_fill_and_append_ste_send_info(new_ste, DR_STE_SIZE, 0, hw_ste, ste_info, send_list, false); ste->htbl->ctrl.num_of_collisions++; ste->htbl->ctrl.num_of_valid_entries++; return new_ste; err_exit: dr_htbl_put(new_ste->htbl); free_send_info: free(ste_info); return NULL; } static void dr_rule_remove_action_members(struct mlx5dv_dr_rule *rule) { int i; for (i = 0; i < rule->num_actions; i++) atomic_fetch_sub(&rule->actions[i]->refcount, 1); free(rule->actions); } static int dr_rule_add_action_members(struct mlx5dv_dr_rule *rule, size_t num_actions, struct mlx5dv_dr_action *actions[]) { struct mlx5dv_dr_action *action; int i; rule->actions = calloc(num_actions, sizeof(action)); if (!rule->actions) { errno = ENOMEM; return errno; } rule->num_actions = num_actions; for (i = 0; i < num_actions; i++) { rule->actions[i] = actions[i]; atomic_fetch_add(&rule->actions[i]->refcount, 1); } return 0; } void dr_rule_set_last_member(struct dr_rule_rx_tx *nic_rule, struct dr_ste *ste, bool force) { /* Update rule member is usually done for the last STE or during rule * creation to recover from mid-creation failure (for this purpose the * force flag is used) */ if (ste->next_htbl && !force) return; /* Update is required since each rule keeps track of its last STE */ ste->rule_rx_tx = nic_rule; nic_rule->last_rule_ste = ste; } static struct dr_ste *dr_rule_get_pointed_ste(struct dr_ste *curr_ste) { struct dr_ste 
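/*
 * dr_rule_rehash() above grows a table one power-of-two step at a time
 * and gives up once the configured maximum is reached. The policy in
 * isolation (log-sized sketch, hypothetical names):
 */
#if 0
#include <stdbool.h>
#include <stdint.h>

static bool toy_next_log_size(uint32_t cur_log_sz, uint32_t max_log_sz,
			      uint32_t *new_log_sz)
{
	uint32_t next = cur_log_sz + 1;

	if (next > max_log_sz)
		next = max_log_sz;
	if (next == cur_log_sz)
		return false;	/* already at the cap, skip the rehash */
	*new_log_sz = next;
	return true;
}
#endif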
*first_ste = dr_ste_get_miss_list_top(curr_ste); return first_ste->htbl->pointing_ste; } void dr_rule_get_reverse_rule_members(struct dr_ste **ste_arr, struct dr_ste *curr_ste, int *num_of_stes) { bool first = false; *num_of_stes = 0; if (curr_ste == NULL) return; /* Iterate from last to first */ while (!first) { first = curr_ste->ste_chain_location == 1; ste_arr[*num_of_stes] = curr_ste; *num_of_stes += 1; curr_ste = dr_rule_get_pointed_ste(curr_ste); } } static void dr_rule_clean_rule_members(struct mlx5dv_dr_rule *rule, struct dr_rule_rx_tx *nic_rule) { struct dr_ste *ste_arr[DR_RULE_MAX_STES + DR_ACTION_MAX_STES + DR_ACTION_ASO_CROSS_GVMI_STES]; struct dr_ste *curr_ste = nic_rule->last_rule_ste; int i; dr_rule_get_reverse_rule_members(ste_arr, curr_ste, &i); while (i--) dr_ste_put(ste_arr[i], rule, nic_rule); } static void dr_rule_clean_cross_dmn_rule_members(struct mlx5dv_dr_rule *rule, struct dr_rule_rx_tx *nic_rule, struct list_head *send_ste_list, struct mlx5dv_dr_action *cross_dmn_action) { struct dr_aso_cross_dmn_arrays *cross_dmn_arrays = (struct dr_aso_cross_dmn_arrays *) cross_dmn_action->aso.devx_obj->priv; struct dr_ste_send_info *ste_info, *tmp_ste_info; dr_rule_clean_rule_members(rule, nic_rule); /* Clean all ste_info's */ list_for_each_safe(send_ste_list, ste_info, tmp_ste_info, send_list) { list_del(&ste_info->send_list); free(ste_info); } if (atomic_load(&cross_dmn_arrays->rule_htbl[cross_dmn_action->aso.offset]->ste_arr->refcount) > 1) { atomic_fetch_sub(&cross_dmn_arrays->rule_htbl[cross_dmn_action->aso.offset]->ste_arr->refcount, 1); atomic_fetch_sub(&cross_dmn_arrays->action_htbl[cross_dmn_action->aso.offset]->ste_arr->refcount, 1); } } static uint16_t dr_get_bits_per_mask(uint16_t byte_mask) { uint16_t bits = 0; while (byte_mask) { byte_mask = byte_mask & (byte_mask - 1); bits++; } return bits; } static int dr_rule_handle_cross_action_stes(struct mlx5dv_dr_rule *rule, struct dr_rule_rx_tx *nic_rule, struct list_head *send_ste_list, struct dr_ste *last_ste, uint8_t *hw_ste_arr, uint32_t new_hw_ste_arr_sz, uint32_t cross_dmn_loc, struct mlx5dv_dr_action *cross_dmn_action) { struct dr_matcher_rx_tx *nic_matcher = nic_rule->nic_matcher; struct dr_ste_send_info *ste_info_arr[DR_ACTION_MAX_STES + 1]; uint8_t num_of_builders = nic_matcher->num_of_builders; struct mlx5dv_dr_matcher *matcher = rule->matcher; struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; uint8_t *curr_hw_ste, *prev_hw_ste; struct dr_ste *action_ste, *cross_dmn_action_ste, *cross_dmn_rule_ste; bool is_ste_for_cross_dmn; int i, k; for (i = num_of_builders, k = 0; i < new_hw_ste_arr_sz; i++, k++) { curr_hw_ste = hw_ste_arr + i * DR_STE_SIZE; prev_hw_ste = (i == 0) ? 
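/*
 * dr_get_bits_per_mask() above is Kernighan's population count:
 * x & (x - 1) clears the lowest set bit, so the loop iterates once per
 * set bit. On GCC/clang the builtin below computes the same value:
 */
#if 0
#include <stdint.h>

static uint16_t popcount16(uint16_t x)
{
	return (uint16_t)__builtin_popcount(x);
}
#endif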
curr_hw_ste : hw_ste_arr + (i - 1) * DR_STE_SIZE; is_ste_for_cross_dmn = (cross_dmn_loc == k); if (!is_ste_for_cross_dmn) { action_ste = dr_rule_create_collision_htbl(matcher, nic_matcher, curr_hw_ste); if (!action_ste) return errno; dr_ste_get(action_ste); action_ste->htbl->pointing_ste = last_ste; last_ste->next_htbl = action_ste->htbl; last_ste = action_ste; /* While free ste we go over the miss list, so add this ste to the list */ list_add_tail(dr_ste_get_miss_list(action_ste), &action_ste->miss_list_node); /* Point current ste to the new action */ dr_ste_set_hit_addr_by_next_htbl(dmn->ste_ctx, prev_hw_ste, action_ste->htbl); dr_rule_set_last_member(nic_rule, action_ste, true); } else { struct dr_aso_cross_dmn_arrays *cross_dmn_arrays = (struct dr_aso_cross_dmn_arrays *) cross_dmn_action->aso.devx_obj->priv; cross_dmn_action_ste = cross_dmn_arrays->action_htbl[cross_dmn_action->aso.offset]->ste_arr; cross_dmn_rule_ste = cross_dmn_arrays->rule_htbl[cross_dmn_action->aso.offset]->ste_arr; /* Connect last ste to cross_dmn_action_ste */ cross_dmn_action_ste->htbl->pointing_ste = last_ste; last_ste->next_htbl = cross_dmn_action_ste->htbl; dr_ste_set_hit_addr_by_next_htbl(dmn->ste_ctx, prev_hw_ste, cross_dmn_action_ste->htbl); dr_ste_set_hit_gvmi(dmn->ste_ctx, prev_hw_ste, cross_dmn_action->aso.dmn->info.caps.gvmi); /* Point rule STE as last member */ dr_rule_set_last_member(nic_rule, cross_dmn_rule_ste, true); /* Point rule STE as last STE */ last_ste = cross_dmn_rule_ste; } ste_info_arr[k] = calloc(1, sizeof(struct dr_ste_send_info)); if (!ste_info_arr[k]) { dr_dbg(dmn, "Failed allocate ste_info, k: %d\n", k); errno = ENOMEM; return errno; } if (!is_ste_for_cross_dmn) { dr_send_fill_and_append_ste_send_info(action_ste, DR_STE_SIZE, 0, curr_hw_ste, ste_info_arr[k], send_ste_list, false); } else { memcpy(cross_dmn_rule_ste->hw_ste, curr_hw_ste, DR_STE_SIZE_REDUCED); dr_send_fill_and_append_ste_send_info(cross_dmn_rule_ste, DR_STE_SIZE, 0, curr_hw_ste, ste_info_arr[k], send_ste_list, false); } } last_ste->next_htbl = NULL; return 0; } static bool dr_rule_need_enlarge_hash(struct dr_ste_htbl *htbl, struct mlx5dv_dr_domain *dmn, struct dr_domain_rx_tx *nic_dmn) { struct dr_ste_htbl_ctrl *ctrl = &htbl->ctrl; int threshold; if (dmn->info.max_log_sw_icm_sz <= htbl->chunk_size) return false; if (!dr_ste_htbl_may_grow(htbl)) return false; if (htbl->type == DR_STE_HTBL_TYPE_LEGACY && dr_get_bits_per_mask(htbl->byte_mask) * CHAR_BIT <= htbl->chunk_size) return false; threshold = dr_ste_htbl_increase_threshold(htbl); if (ctrl->num_of_collisions >= threshold && (ctrl->num_of_valid_entries - ctrl->num_of_collisions) >= threshold) return true; return false; } static int dr_rule_handle_regular_action_stes(struct mlx5dv_dr_rule *rule, struct dr_rule_rx_tx *nic_rule, struct list_head *send_ste_list, struct dr_ste *last_ste, uint8_t *hw_ste_arr, uint32_t new_hw_ste_arr_sz) { struct dr_matcher_rx_tx *nic_matcher = nic_rule->nic_matcher; struct dr_ste_send_info *ste_info_arr[DR_ACTION_MAX_STES]; uint8_t num_of_builders = nic_matcher->num_of_builders; struct mlx5dv_dr_matcher *matcher = rule->matcher; struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; uint8_t *curr_hw_ste, *prev_hw_ste; struct dr_ste *action_ste; int i, k, ret; /* Two cases: * 1. num_of_builders is equal to new_hw_ste_arr_sz, the action in the ste * 2. num_of_builders is less then new_hw_ste_arr_sz, new ste was added * to support the action. 
*/ if (num_of_builders == new_hw_ste_arr_sz) { last_ste->next_htbl = NULL; return 0; } for (i = num_of_builders, k = 0; i < new_hw_ste_arr_sz; i++, k++) { curr_hw_ste = hw_ste_arr + i * DR_STE_SIZE; prev_hw_ste = (i == 0) ? curr_hw_ste : hw_ste_arr + ((i - 1) * DR_STE_SIZE); action_ste = dr_rule_create_collision_htbl(matcher, nic_matcher, curr_hw_ste); if (!action_ste) return errno; dr_ste_get(action_ste); action_ste->htbl->pointing_ste = last_ste; last_ste->next_htbl = action_ste->htbl; last_ste = action_ste; /* While free ste we go over the miss list, so add this ste to the list */ list_add_tail(dr_ste_get_miss_list(action_ste), &action_ste->miss_list_node); ste_info_arr[k] = calloc(1, sizeof(struct dr_ste_send_info)); if (!ste_info_arr[k]) { dr_dbg(dmn, "Failed allocate ste_info, k: %d\n", k); errno = ENOMEM; ret = errno; goto err_exit; } /* This is an always hit entry */ dr_ste_set_miss_addr(dmn->ste_ctx, curr_hw_ste, 0); /* Point current ste to the new action */ dr_ste_set_hit_addr_by_next_htbl(dmn->ste_ctx, prev_hw_ste, action_ste->htbl); dr_rule_set_last_member(nic_rule, action_ste, true); dr_send_fill_and_append_ste_send_info(action_ste, DR_STE_SIZE, 0, curr_hw_ste, ste_info_arr[k], send_ste_list, false); } return 0; err_exit: dr_ste_put(action_ste, rule, nic_rule); return ret; } static int dr_rule_handle_action_stes(struct mlx5dv_dr_rule *rule, struct dr_rule_rx_tx *nic_rule, struct list_head *send_ste_list, struct dr_ste *last_ste, uint8_t *hw_ste_arr, uint32_t new_hw_ste_arr_sz, struct cross_dmn_params *cross_dmn_p) { if (cross_dmn_p->cross_dmn_loc != -1) return dr_rule_handle_cross_action_stes(rule, nic_rule, send_ste_list, last_ste, hw_ste_arr, new_hw_ste_arr_sz, cross_dmn_p->cross_dmn_loc, cross_dmn_p->cross_dmn_action); return dr_rule_handle_regular_action_stes(rule, nic_rule, send_ste_list, last_ste, hw_ste_arr, new_hw_ste_arr_sz); } static int dr_rule_handle_empty_entry(struct mlx5dv_dr_matcher *matcher, struct dr_rule_rx_tx *nic_rule, struct dr_ste_htbl *cur_htbl, struct dr_ste *ste, uint8_t ste_location, uint8_t *hw_ste, struct list_head *miss_list, struct list_head *send_list) { struct dr_matcher_rx_tx *nic_matcher = nic_rule->nic_matcher; struct dr_ste_ctx *ste_ctx = matcher->tbl->dmn->ste_ctx; struct dr_ste_send_info *ste_info; /* Take ref on table, only on first time this ste is used */ dr_htbl_get(cur_htbl); /* new entry -> new branch */ list_add_tail(miss_list, &ste->miss_list_node); dr_ste_set_miss_addr(ste_ctx, hw_ste, dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk)); ste->ste_chain_location = ste_location; ste_info = calloc(1, sizeof(*ste_info)); if (!ste_info) { dr_dbg(matcher->tbl->dmn, "Failed allocating ste_info\n"); errno = ENOMEM; goto clean_ste_setting; } if (dr_ste_create_next_htbl(matcher, nic_matcher, ste, hw_ste, DR_CHUNK_SIZE_1, nic_rule->lock_index)) { dr_dbg(matcher->tbl->dmn, "Failed allocating table\n"); goto clean_ste_info; } cur_htbl->ctrl.num_of_valid_entries++; dr_send_fill_and_append_ste_send_info(ste, DR_STE_SIZE, 0, hw_ste, ste_info, send_list, false); return 0; clean_ste_info: free(ste_info); clean_ste_setting: list_del_init(&ste->miss_list_node); dr_htbl_put(cur_htbl); return ENOMEM; } static struct dr_ste *dr_rule_handle_ste_branch(struct mlx5dv_dr_rule *rule, struct dr_rule_rx_tx *nic_rule, struct list_head *send_ste_list, struct dr_ste_htbl *cur_htbl, uint8_t *hw_ste, uint8_t ste_location, struct dr_ste_htbl **put_htbl) { struct dr_matcher_rx_tx *nic_matcher = nic_rule->nic_matcher; struct dr_domain_rx_tx *nic_dmn = 
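/*
 * Shape of the chain built by dr_rule_handle_regular_action_stes()
 * above, reduced to a toy structure: each action entry is appended
 * behind the previous one, the previous entry's hit pointer is aimed
 * at it, and the tail terminates the chain and is remembered as the
 * rule's last member. Illustrative only (toy_node is hypothetical;
 * tail is assumed to point at the current last entry):
 */
#if 0
#include <stddef.h>

struct toy_node {
	struct toy_node *hit;		/* next entry in the chain */
	struct toy_node *pointing;	/* entry that points at us */
};

static void toy_append(struct toy_node **tail, struct toy_node *node)
{
	node->pointing = *tail;
	(*tail)->hit = node;
	node->hit = NULL;		/* tail terminates the chain */
	*tail = node;
}
#endif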
nic_matcher->nic_tbl->nic_dmn; struct mlx5dv_dr_matcher *matcher = rule->matcher; struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; struct dr_ste_htbl *new_htbl; struct list_head *miss_list; struct dr_ste *matched_ste; bool skip_rehash = nic_matcher->fixed_size; struct dr_ste *ste; int index; again: index = dr_ste_calc_hash_index(hw_ste, cur_htbl); miss_list = &cur_htbl->chunk->miss_list[index]; ste = &cur_htbl->ste_arr[index]; if (dr_ste_is_not_used(ste)) { if (dr_rule_handle_empty_entry(matcher, nic_rule, cur_htbl, ste, ste_location, hw_ste, miss_list, send_ste_list)) return NULL; } else { /* Hash table index in use, check if this ste is in the miss list */ matched_ste = dr_rule_find_ste_in_miss_list(miss_list, hw_ste, dr_ste_tag_sz(ste)); if (matched_ste) { /* * if it is last STE in the chain, and has the same tag * it means that all the previous stes are the same, * if so, this rule is duplicated. */ if (!dr_ste_is_last_in_rule(nic_matcher, ste_location)) return matched_ste; if (dmn->flags & DR_DOMAIN_FLAG_DISABLE_DUPLICATE_RULES) { dr_dbg(dmn, "Duplicate rules are not supported\n"); errno = EEXIST; return NULL; } dr_dbg(dmn, "Duplicate rule inserted\n"); } if (!skip_rehash && dr_rule_need_enlarge_hash(cur_htbl, dmn, nic_dmn)) { /* Hash table index in use, try to resize of the hash */ skip_rehash = true; /* * Hold the table till we update. * Release in dr_rule_create_rule_nr() */ *put_htbl = cur_htbl; dr_htbl_get(cur_htbl); new_htbl = dr_rule_rehash(rule, nic_rule, cur_htbl, ste_location, send_ste_list); if (!new_htbl) { dr_htbl_put(cur_htbl); dr_dbg(dmn, "Failed creating rehash table, htbl-log_size: %d\n", cur_htbl->chunk_size); } else { cur_htbl = new_htbl; } goto again; } else { /* Hash table index in use, add another collision (miss) */ ste = dr_rule_handle_collision(matcher, nic_rule, ste, hw_ste, miss_list, send_ste_list); if (!ste) { dr_dbg(dmn, "Failed adding collision entry, index: %d\n", index); return NULL; } } } return ste; } static bool dr_rule_cmp_value_to_mask(uint8_t *mask, uint8_t *value, uint32_t s_idx, uint32_t e_idx) { uint32_t i; for (i = s_idx; i < e_idx; i++) { if (value[i] & ~mask[i]) { errno = EINVAL; return false; } } return true; } static bool dr_rule_verify(struct mlx5dv_dr_matcher *matcher, struct mlx5dv_flow_match_parameters *value, struct dr_match_param *param) { uint8_t match_criteria = matcher->match_criteria; struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; uint8_t *mask_p = (uint8_t *)&matcher->mask; uint8_t *param_p = (uint8_t *)param; size_t value_size = value->match_sz; uint32_t s_idx, e_idx; if (!value_size) return true; if ((value_size > DEVX_ST_SZ_BYTES(dr_match_param) || (value_size % sizeof(uint32_t)))) { dr_dbg(dmn, "Rule parameters length is incorrect\n"); errno = EINVAL; return false; } dr_ste_copy_param(matcher->match_criteria, param, value->match_buf, value->match_sz, false); if (match_criteria & DR_MATCHER_CRITERIA_OUTER) { s_idx = offsetof(struct dr_match_param, outer); e_idx = min(s_idx + sizeof(param->outer), value_size); if (!dr_rule_cmp_value_to_mask(mask_p, param_p, s_idx, e_idx)) { dr_dbg(dmn, "Rule outer parameters contains a value not specified by mask\n"); return false; } } if (match_criteria & DR_MATCHER_CRITERIA_MISC) { s_idx = offsetof(struct dr_match_param, misc); e_idx = min(s_idx + sizeof(param->misc), value_size); if (!dr_rule_cmp_value_to_mask(mask_p, param_p, s_idx, e_idx)) { dr_dbg(dmn, "Rule misc parameters contains a value not specified by mask\n"); return false; } } if (match_criteria & DR_MATCHER_CRITERIA_INNER) 
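/*
 * dr_rule_cmp_value_to_mask() above enforces that a rule value never
 * sets a bit outside the matcher's mask; one stray bit would make the
 * rule unmatchable, so it is rejected with EINVAL. The byte-wise test
 * on its own:
 */
#if 0
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool value_fits_mask(const uint8_t *mask, const uint8_t *value,
			    size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (value[i] & ~mask[i])
			return false;	/* value bit not covered by mask */
	return true;
}
#endif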
{ s_idx = offsetof(struct dr_match_param, inner); e_idx = min(s_idx + sizeof(param->inner), value_size); if (!dr_rule_cmp_value_to_mask(mask_p, param_p, s_idx, e_idx)) { dr_dbg(dmn, "Rule inner parameters contains a value not specified by mask\n"); return false; } } if (match_criteria & DR_MATCHER_CRITERIA_MISC2) { s_idx = offsetof(struct dr_match_param, misc2); e_idx = min(s_idx + sizeof(param->misc2), value_size); if (!dr_rule_cmp_value_to_mask(mask_p, param_p, s_idx, e_idx)) { dr_dbg(dmn, "Rule misc2 parameters contains a value not specified by mask\n"); return false; } } if (match_criteria & DR_MATCHER_CRITERIA_MISC3) { s_idx = offsetof(struct dr_match_param, misc3); e_idx = min(s_idx + sizeof(param->misc3), value_size); if (!dr_rule_cmp_value_to_mask(mask_p, param_p, s_idx, e_idx)) { dr_dbg(dmn, "Rule misc3 parameters contains a value not specified by mask\n"); return false; } } if (match_criteria & DR_MATCHER_CRITERIA_MISC4) { s_idx = offsetof(struct dr_match_param, misc4); e_idx = min(s_idx + sizeof(param->misc4), value_size); if (!dr_rule_cmp_value_to_mask(mask_p, param_p, s_idx, e_idx)) { dr_dbg(dmn, "Rule misc4 parameters contains a value not specified by mask\n"); return false; } } if (match_criteria & DR_MATCHER_CRITERIA_MISC5) { s_idx = offsetof(struct dr_match_param, misc5); e_idx = min(s_idx + sizeof(param->misc5), value_size); if (!dr_rule_cmp_value_to_mask(mask_p, param_p, s_idx, e_idx)) { dr_dbg(dmn, "Rule misc5 parameters contains a value not specified by mask\n"); return false; } } return true; } static int dr_rule_destroy_rule_nic(struct mlx5dv_dr_rule *rule, struct dr_rule_rx_tx *nic_rule) { dr_rule_lock(nic_rule, NULL); dr_rule_clean_rule_members(rule, nic_rule); dr_rule_unlock(nic_rule); return 0; } static int dr_rule_destroy_rule_fdb(struct mlx5dv_dr_rule *rule) { dr_rule_destroy_rule_nic(rule, &rule->rx); dr_rule_destroy_rule_nic(rule, &rule->tx); return 0; } static int dr_rule_destroy_rule(struct mlx5dv_dr_rule *rule) { struct mlx5dv_dr_domain *dmn = rule->matcher->tbl->dmn; pthread_spin_lock(&dmn->debug_lock); list_del(&rule->rule_list); pthread_spin_unlock(&dmn->debug_lock); switch (dmn->type) { case MLX5DV_DR_DOMAIN_TYPE_NIC_RX: dr_rule_destroy_rule_nic(rule, &rule->rx); break; case MLX5DV_DR_DOMAIN_TYPE_NIC_TX: dr_rule_destroy_rule_nic(rule, &rule->tx); break; case MLX5DV_DR_DOMAIN_TYPE_FDB: dr_rule_destroy_rule_fdb(rule); break; default: assert(false); errno = EINVAL; return errno; } dr_rule_remove_action_members(rule); free(rule); return 0; } static int dr_rule_destroy_rule_root(struct mlx5dv_dr_rule *rule) { int ret; ret = ibv_destroy_flow(rule->flow); if (ret) return ret; dr_rule_remove_action_members(rule); free(rule); return 0; } static int dr_rule_skip(struct mlx5dv_dr_domain *dmn, enum dr_domain_nic_type nic_type, struct dr_match_param *mask, struct dr_match_param *value) { if (dmn->type == MLX5DV_DR_DOMAIN_TYPE_FDB) { if (mask->misc.source_port) { if (nic_type == DR_DOMAIN_NIC_TYPE_RX) if (value->misc.source_port != WIRE_PORT) return 1; if (nic_type == DR_DOMAIN_NIC_TYPE_TX) if (value->misc.source_port == WIRE_PORT) return 1; } /* Metadata C can be used to describe the source vport */ if (mask->misc2.metadata_reg_c_0) { struct dr_devx_vport_cap *wire; uint32_t vport_metadata_c; wire = &dmn->info.caps.vports.wire; /* No correlation between wire mask and regc0 mask */ if (!(wire->metadata_c_mask & mask->misc2.metadata_reg_c_0)) return 0; vport_metadata_c = value->misc2.metadata_reg_c_0 & wire->metadata_c_mask; if (nic_type == DR_DOMAIN_NIC_TYPE_RX) if 
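/*
 * The FDB direction filtering implemented by dr_rule_skip() (begun
 * above): a rule that matches on the wire port only needs to live on
 * one side, since RX sees traffic arriving from the wire and TX sees
 * traffic leaving towards it. A sketch of the source-port case; the
 * provider's actual WIRE_PORT value is assumed here:
 */
#if 0
#include <stdbool.h>
#include <stdint.h>

#define TOY_WIRE_PORT 0xffff	/* assumed wire vport number */

static bool toy_skip_nic_rule(bool is_rx, uint16_t source_port)
{
	if (is_rx)
		return source_port != TOY_WIRE_PORT;	/* RX: wire only */
	return source_port == TOY_WIRE_PORT;		/* TX: never wire */
}
#endif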
(vport_metadata_c != wire->metadata_c) return 1; if (nic_type == DR_DOMAIN_NIC_TYPE_TX) if (vport_metadata_c == wire->metadata_c) return 1; } } return 0; } static int dr_rule_create_rule_nic(struct mlx5dv_dr_rule *rule, struct dr_rule_rx_tx *nic_rule, struct dr_match_param *param, size_t num_actions, struct mlx5dv_dr_action *actions[]) { uint8_t hw_ste_arr[DR_RULE_MAX_STE_CHAIN * DR_STE_SIZE] = {}; struct dr_matcher_rx_tx *nic_matcher = nic_rule->nic_matcher; struct dr_domain_rx_tx *nic_dmn = nic_matcher->nic_tbl->nic_dmn; struct mlx5dv_dr_matcher *matcher = rule->matcher; struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; struct dr_ste_send_info *ste_info, *tmp_ste_info; struct dr_ste_htbl *htbl = NULL; struct dr_ste_htbl *cur_htbl; uint32_t new_hw_ste_arr_sz = 0; struct cross_dmn_params cross_dmn_p = {}; LIST_HEAD(send_ste_list); struct dr_ste *ste = NULL; /* Fix compilation warning */ int ret, i; cross_dmn_p.cross_dmn_loc = -1; if (dr_rule_skip(dmn, nic_dmn->type, &matcher->mask, param)) return 0; /* Set the tag values inside the ste array */ ret = dr_ste_build_ste_arr(matcher, nic_matcher, param, hw_ste_arr); if (ret) return ret; /* Set the lock index, and use the relative lock */ dr_rule_lock(nic_rule, hw_ste_arr); /* Set the actions values/addresses inside the ste array */ ret = dr_actions_build_ste_arr(matcher, nic_matcher, actions, num_actions, hw_ste_arr, &new_hw_ste_arr_sz, &cross_dmn_p, nic_rule->lock_index); if (ret) goto out_unlock; cur_htbl = nic_matcher->s_htbl; /* * Go over the array of STEs, and build dr_ste accordingly. * The loop is over only the builders which are equeal or less to the * number of stes, in case we have actions that lives in other stes. */ for (i = 0; i < nic_matcher->num_of_builders; i++) { /* Calculate CRC and keep new ste entry */ uint8_t *cur_hw_ste_ent = hw_ste_arr + (i * DR_STE_SIZE); ste = dr_rule_handle_ste_branch(rule, nic_rule, &send_ste_list, cur_htbl, cur_hw_ste_ent, i + 1, &htbl); if (!ste) { dr_dbg(dmn, "Failed creating next branch\n"); ret = errno; goto free_rule; } cur_htbl = ste->next_htbl; dr_ste_get(ste); dr_rule_set_last_member(nic_rule, ste, true); } /* Connect actions */ ret = dr_rule_handle_action_stes(rule, nic_rule, &send_ste_list, ste, hw_ste_arr, new_hw_ste_arr_sz, &cross_dmn_p); if (ret) { dr_dbg(dmn, "Failed apply actions\n"); goto free_rule; } ret = dr_rule_send_update_list(&send_ste_list, dmn, true, nic_rule->lock_index); if (ret) { dr_dbg(dmn, "Failed sending ste!\n"); goto free_rule; } if (htbl) dr_htbl_put(htbl); goto out_unlock; free_rule: if (cross_dmn_p.cross_dmn_action) { dr_rule_clean_cross_dmn_rule_members(rule, nic_rule, &send_ste_list, cross_dmn_p.cross_dmn_action); } else { dr_rule_clean_rule_members(rule, nic_rule); /* Clean all ste_info's */ list_for_each_safe(&send_ste_list, ste_info, tmp_ste_info, send_list) { list_del(&ste_info->send_list); free(ste_info); } } out_unlock: dr_rule_unlock(nic_rule); return ret; } static int dr_rule_create_rule_fdb(struct mlx5dv_dr_rule *rule, struct dr_match_param *param, size_t num_actions, struct mlx5dv_dr_action *actions[]) { struct dr_match_param copy_param = {}; int ret; /* * Copy match_param since they will be consumed during the first * nic_rule insertion. 
*/ memcpy(&copy_param, param, sizeof(struct dr_match_param)); ret = dr_rule_create_rule_nic(rule, &rule->rx, param, num_actions, actions); if (ret) return ret; ret = dr_rule_create_rule_nic(rule, &rule->tx, &copy_param, num_actions, actions); if (ret) goto destroy_rule_nic_rx; return 0; destroy_rule_nic_rx: dr_rule_destroy_rule_nic(rule, &rule->rx); return ret; } static struct mlx5dv_dr_rule * dr_rule_create_rule(struct mlx5dv_dr_matcher *matcher, struct mlx5dv_flow_match_parameters *value, size_t num_actions, struct mlx5dv_dr_action *actions[]) { struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; struct dr_match_param param = {}; struct mlx5dv_dr_rule *rule; int ret; if (!dr_rule_verify(matcher, value, &param)) return NULL; rule = calloc(1, sizeof(*rule)); if (!rule) { errno = ENOMEM; return NULL; } rule->matcher = matcher; list_node_init(&rule->rule_list); ret = dr_rule_add_action_members(rule, num_actions, actions); if (ret) goto free_rule; switch (dmn->type) { case MLX5DV_DR_DOMAIN_TYPE_NIC_RX: rule->rx.nic_matcher = &matcher->rx; ret = dr_rule_create_rule_nic(rule, &rule->rx, &param, num_actions, actions); break; case MLX5DV_DR_DOMAIN_TYPE_NIC_TX: rule->tx.nic_matcher = &matcher->tx; ret = dr_rule_create_rule_nic(rule, &rule->tx, &param, num_actions, actions); break; case MLX5DV_DR_DOMAIN_TYPE_FDB: rule->rx.nic_matcher = &matcher->rx; rule->tx.nic_matcher = &matcher->tx; ret = dr_rule_create_rule_fdb(rule, &param, num_actions, actions); break; default: ret = EINVAL; errno = ret; break; } if (ret) goto remove_action_members; pthread_spin_lock(&dmn->debug_lock); list_add_tail(&matcher->rule_list, &rule->rule_list); pthread_spin_unlock(&dmn->debug_lock); return rule; remove_action_members: dr_rule_remove_action_members(rule); free_rule: free(rule); return NULL; } static struct mlx5dv_dr_rule * dr_rule_create_rule_root(struct mlx5dv_dr_matcher *matcher, struct mlx5dv_flow_match_parameters *value, size_t num_actions, struct mlx5dv_dr_action *actions[]) { struct mlx5dv_flow_action_attr *attr; struct mlx5_flow_action_attr_aux *attr_aux; struct mlx5dv_dr_rule *rule; int ret; rule = calloc(1, sizeof(*rule)); if (!rule) { errno = ENOMEM; return NULL; } rule->matcher = matcher; attr = calloc(num_actions, sizeof(*attr)); if (!attr) { errno = ENOMEM; goto free_rule; } attr_aux = calloc(num_actions, sizeof(*attr_aux)); if (!attr_aux) { errno = ENOMEM; goto free_attr; } ret = dr_actions_build_attr(matcher, actions, num_actions, attr, attr_aux); if (ret) goto free_attr_aux; ret = dr_rule_add_action_members(rule, num_actions, actions); if (ret) goto free_attr_aux; rule->flow = _mlx5dv_create_flow(matcher->dv_matcher, value, num_actions, attr, attr_aux); if (!rule->flow) goto remove_action_members; free(attr); free(attr_aux); return rule; remove_action_members: dr_rule_remove_action_members(rule); free_attr_aux: free(attr_aux); free_attr: free(attr); free_rule: free(rule); return NULL; } struct mlx5dv_dr_rule *mlx5dv_dr_rule_create(struct mlx5dv_dr_matcher *matcher, struct mlx5dv_flow_match_parameters *value, size_t num_actions, struct mlx5dv_dr_action *actions[]) { struct mlx5dv_dr_rule *rule; atomic_fetch_add(&matcher->refcount, 1); if (dr_is_root_table(matcher->tbl)) rule = dr_rule_create_rule_root(matcher, value, num_actions, actions); else rule = dr_rule_create_rule(matcher, value, num_actions, actions); if (!rule) atomic_fetch_sub(&matcher->refcount, 1); return rule; } int mlx5dv_dr_rule_destroy(struct mlx5dv_dr_rule *rule) { struct mlx5dv_dr_matcher *matcher = rule->matcher; struct mlx5dv_dr_table *tbl =
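/*
 * Caller-side view of the create/destroy entry points implemented
 * above; the matcher, match value and action array are assumed to have
 * been created earlier, and error handling is trimmed. Matcher
 * refcounting is handled internally by the calls themselves:
 */
#if 0
struct mlx5dv_dr_rule *rule;

rule = mlx5dv_dr_rule_create(matcher, value, num_actions, actions);
if (!rule)
	return errno;	/* errno set by the provider */

/* ... packets are now steered by the rule ... */

if (mlx5dv_dr_rule_destroy(rule))
	return errno;
#endif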
rule->matcher->tbl; int ret; if (dr_is_root_table(tbl)) ret = dr_rule_destroy_rule_root(rule); else ret = dr_rule_destroy_rule(rule); if (!ret) atomic_fetch_sub(&matcher->refcount, 1); return ret; } rdma-core-56.1/providers/mlx5/dr_send.c000066400000000000000000001041201477342711600200140ustar00rootroot00000000000000/* * Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include "mlx5dv_dr.h" #include "wqe.h" #define QUEUE_SIZE 128 #define SIGNAL_PER_DIV_QUEUE 16 #define TH_NUMS_TO_DRAIN 2 enum { CQ_OK = 0, CQ_EMPTY = -1, CQ_POLL_ERR = -2 }; struct dr_qp_init_attr { uint32_t cqn; uint32_t pdn; struct mlx5dv_devx_uar *uar; struct ibv_qp_cap cap; bool isolate_vl_tc; uint8_t qp_ts_format; }; static void *dr_cq_get_cqe(struct dr_cq *dr_cq, int n) { return dr_cq->buf + n * dr_cq->cqe_sz; } static void *dr_cq_get_sw_cqe(struct dr_cq *dr_cq, int n) { void *cqe = dr_cq_get_cqe(dr_cq, n & (dr_cq->ncqe - 1)); struct mlx5_cqe64 *cqe64; cqe64 = (dr_cq->cqe_sz == 64) ? cqe : cqe + 64; if (likely(mlx5dv_get_cqe_opcode(cqe64) != MLX5_CQE_INVALID) && !((cqe64->op_own & MLX5_CQE_OWNER_MASK) ^ !!(n & dr_cq->ncqe))) return cqe64; else return NULL; } static int dr_get_next_cqe(struct dr_cq *dr_cq, struct mlx5_cqe64 **pcqe64) { struct mlx5_cqe64 *cqe64; cqe64 = dr_cq_get_sw_cqe(dr_cq, dr_cq->cons_index); if (!cqe64) return CQ_EMPTY; ++dr_cq->cons_index; /* * Make sure we read CQ entry contents after we've checked the * ownership bit. 
*/ udma_from_device_barrier(); *pcqe64 = cqe64; return CQ_OK; } static int dr_parse_cqe(struct dr_cq *dr_cq, struct mlx5_cqe64 *cqe64) { uint16_t wqe_ctr; uint8_t opcode; int idx; wqe_ctr = be16toh(cqe64->wqe_counter); opcode = mlx5dv_get_cqe_opcode(cqe64); if (opcode == MLX5_CQE_REQ_ERR) { idx = wqe_ctr & (dr_cq->qp->sq.wqe_cnt - 1); dr_cq->qp->sq.tail = dr_cq->qp->sq.wqe_head[idx] + 1; } else if (opcode == MLX5_CQE_RESP_ERR) { ++dr_cq->qp->sq.tail; } else { idx = wqe_ctr & (dr_cq->qp->sq.wqe_cnt - 1); dr_cq->qp->sq.tail = dr_cq->qp->sq.wqe_head[idx] + 1; return CQ_OK; } return CQ_POLL_ERR; } static int dr_cq_poll_one(struct dr_cq *dr_cq) { struct mlx5_cqe64 *cqe64; int err; err = dr_get_next_cqe(dr_cq, &cqe64); if (err == CQ_EMPTY) return err; return dr_parse_cqe(dr_cq, cqe64); } static int dr_poll_cq(struct dr_cq *dr_cq, int ne) { int npolled; int err = 0; for (npolled = 0; npolled < ne; ++npolled) { err = dr_cq_poll_one(dr_cq); if (err != CQ_OK) break; } dr_cq->db[MLX5_CQ_SET_CI] = htobe32(dr_cq->cons_index & 0xffffff); return err == CQ_POLL_ERR ? err : npolled; } static int dr_qp_get_args_update_send_wqe_size(struct dr_qp_init_attr *attr) { return roundup_pow_of_two(sizeof(struct mlx5_wqe_ctrl_seg) + sizeof(struct mlx5_wqe_flow_update_ctrl_seg) + sizeof(struct mlx5_wqe_header_modify_argument_update_seg)); } /* We calculate for specific RC QP with the required functionality */ static int dr_qp_calc_rc_send_wqe(struct dr_qp_init_attr *attr) { int size; int inl_size = 0; int update_arg_size; int tot_size; update_arg_size = dr_qp_get_args_update_send_wqe_size(attr); size = sizeof(struct mlx5_wqe_ctrl_seg) + sizeof(struct mlx5_wqe_raddr_seg); if (attr->cap.max_inline_data) inl_size = size + align(sizeof(struct mlx5_wqe_inl_data_seg) + attr->cap.max_inline_data, 16); size += attr->cap.max_send_sge * sizeof(struct mlx5_wqe_data_seg); size = max_int(size, update_arg_size); tot_size = max_int(size, inl_size); return align(tot_size, MLX5_SEND_WQE_BB); } static int dr_calc_sq_size(struct dr_qp *dr_qp, struct dr_qp_init_attr *attr) { int wqe_size; int wq_size; wqe_size = dr_qp_calc_rc_send_wqe(attr); dr_qp->max_inline_data = wqe_size - (sizeof(struct mlx5_wqe_ctrl_seg) + sizeof(struct mlx5_wqe_raddr_seg)) - sizeof(struct mlx5_wqe_inl_data_seg); wq_size = roundup_pow_of_two(attr->cap.max_send_wr * wqe_size); dr_qp->sq.wqe_cnt = wq_size / MLX5_SEND_WQE_BB; dr_qp->sq.wqe_shift = STATIC_ILOG_32(MLX5_SEND_WQE_BB) - 1; dr_qp->sq.max_gs = attr->cap.max_send_sge; dr_qp->sq.max_post = wq_size / wqe_size; return wq_size; } static int dr_qp_calc_recv_wqe(struct dr_qp_init_attr *attr) { uint32_t size; int num_scatter; num_scatter = max_t(uint32_t, attr->cap.max_recv_sge, 1); size = sizeof(struct mlx5_wqe_data_seg) * num_scatter; size = roundup_pow_of_two(size); return size; } static int dr_calc_rq_size(struct dr_qp *dr_qp, struct dr_qp_init_attr *attr) { int wqe_size; int wq_size; wqe_size = dr_qp_calc_recv_wqe(attr); wq_size = roundup_pow_of_two(attr->cap.max_recv_wr) * wqe_size; wq_size = max(wq_size, MLX5_SEND_WQE_BB); dr_qp->rq.wqe_cnt = wq_size / wqe_size; dr_qp->rq.wqe_shift = ilog32(wqe_size - 1); dr_qp->rq.max_post = 1 << ilog32(wq_size / wqe_size - 1); dr_qp->rq.max_gs = wqe_size / sizeof(struct mlx5_wqe_data_seg); return wq_size; } static int dr_calc_wq_size(struct dr_qp *dr_qp, struct dr_qp_init_attr *attr) { int result; int ret; result = dr_calc_sq_size(dr_qp, attr); ret = dr_calc_rq_size(dr_qp, attr); result += ret; dr_qp->sq.offset = ret; dr_qp->rq.offset = 0; return result; } static int 
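/*
 * The ownership test in dr_cq_get_sw_cqe() above, spelled out: HW
 * toggles the CQE owner bit on every pass over the ring, so a CQE
 * belongs to SW when its owner bit matches the parity of the pass the
 * consumer index is on. Toy form (the owner mask is assumed to be the
 * low bit, as in the provider's MLX5_CQE_OWNER_MASK):
 */
#if 0
#include <stdbool.h>
#include <stdint.h>

static bool toy_cqe_is_sw_owned(uint8_t op_own, uint32_t cons_index,
				uint32_t ncqe /* power of two */)
{
	bool hw_owner_bit = op_own & 1;
	bool pass_parity = !!(cons_index & ncqe);

	return hw_owner_bit == pass_parity;
}
#endif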
dr_qp_alloc_buf(struct dr_qp *dr_qp, int size) { int al_size; int ret; dr_qp->sq.wqe_head = malloc(dr_qp->sq.wqe_cnt * sizeof(*dr_qp->sq.wqe_head)); if (!dr_qp->sq.wqe_head) { errno = ENOMEM; return errno; } al_size = align(size, sysconf(_SC_PAGESIZE)); ret = posix_memalign(&dr_qp->buf.buf, sysconf(_SC_PAGESIZE), al_size); if (ret) { errno = ret; goto free_wqe_head; } dr_qp->buf.length = al_size; dr_qp->buf.type = MLX5_ALLOC_TYPE_ANON; memset(dr_qp->buf.buf, 0, dr_qp->buf.length); return 0; free_wqe_head: free(dr_qp->sq.wqe_head); return ret; } static struct dr_qp *dr_create_rc_qp(struct ibv_context *ctx, struct dr_qp_init_attr *attr) { struct dr_devx_qp_create_attr qp_create_attr; struct mlx5dv_devx_obj *obj; struct dr_qp *dr_qp; int size; int ret; dr_qp = calloc(1, sizeof(*dr_qp)); if (!dr_qp) { errno = ENOMEM; return NULL; } size = dr_calc_wq_size(dr_qp, attr); if (dr_qp_alloc_buf(dr_qp, size)) goto err_alloc_bufs; dr_qp->sq_start = dr_qp->buf.buf + dr_qp->sq.offset; dr_qp->sq.qend = dr_qp->buf.buf + dr_qp->sq.offset + (dr_qp->sq.wqe_cnt << dr_qp->sq.wqe_shift); dr_qp->rq.head = 0; dr_qp->rq.tail = 0; dr_qp->sq.cur_post = 0; ret = posix_memalign((void **)&dr_qp->db, 8, 8); if (ret) { errno = ret; goto err_db_alloc; } dr_qp->db[MLX5_RCV_DBR] = 0; dr_qp->db[MLX5_SND_DBR] = 0; dr_qp->db_umem = mlx5dv_devx_umem_reg(ctx, dr_qp->db, 8, IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_REMOTE_READ); if (!dr_qp->db_umem) goto err_db_umem; dr_qp->buf_umem = mlx5dv_devx_umem_reg(ctx, dr_qp->buf.buf, dr_qp->buf.length, IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_REMOTE_READ); if (!dr_qp->buf_umem) goto err_buf_umem; qp_create_attr.page_id = attr->uar->page_id; qp_create_attr.pdn = attr->pdn; qp_create_attr.cqn = attr->cqn; qp_create_attr.pm_state = MLX5_QPC_PM_STATE_MIGRATED; qp_create_attr.service_type = MLX5_QPC_ST_RC; qp_create_attr.buff_umem_id = dr_qp->buf_umem->umem_id; qp_create_attr.db_umem_id = dr_qp->db_umem->umem_id; qp_create_attr.sq_wqe_cnt = dr_qp->sq.wqe_cnt; qp_create_attr.rq_wqe_cnt = dr_qp->rq.wqe_cnt; qp_create_attr.rq_wqe_shift = dr_qp->rq.wqe_shift; qp_create_attr.isolate_vl_tc = attr->isolate_vl_tc; qp_create_attr.qp_ts_format = attr->qp_ts_format; obj = dr_devx_create_qp(ctx, &qp_create_attr); if (!obj) goto err_qp_create; dr_qp->uar = attr->uar; dr_qp->nc_uar = container_of(attr->uar, struct mlx5_bf, devx_uar.dv_devx_uar)->nc_mode; dr_qp->obj = obj; return dr_qp; err_qp_create: mlx5dv_devx_umem_dereg(dr_qp->buf_umem); err_buf_umem: mlx5dv_devx_umem_dereg(dr_qp->db_umem); err_db_umem: free(dr_qp->db); err_db_alloc: free(dr_qp->sq.wqe_head); free(dr_qp->buf.buf); err_alloc_bufs: free(dr_qp); return NULL; } static int dr_destroy_qp(struct dr_qp *dr_qp) { int ret; ret = mlx5dv_devx_obj_destroy(dr_qp->obj); if (ret) return ret; ret = mlx5dv_devx_umem_dereg(dr_qp->buf_umem); if (ret) return ret; ret = mlx5dv_devx_umem_dereg(dr_qp->db_umem); if (ret) return ret; free(dr_qp->db); free(dr_qp->sq.wqe_head); free(dr_qp->buf.buf); free(dr_qp); return 0; } static void dr_set_raddr_seg(struct mlx5_wqe_raddr_seg *rseg, uint64_t remote_addr, uint32_t rkey) { rseg->raddr = htobe64(remote_addr); rseg->rkey = htobe32(rkey); rseg->reserved = 0; } static void dr_set_header_modify_arg_update_seg(struct mlx5_wqe_header_modify_argument_update_seg *aseg, void *data, uint32_t data_size) { memcpy(&aseg->argument_list, data, data_size); } static void dr_post_send_db(struct dr_qp *dr_qp, void *ctrl) { /* * Make sure that descriptors are written before * updating doorbell 
record and ringing the doorbell */ udma_to_device_barrier(); dr_qp->db[MLX5_SND_DBR] = htobe32(dr_qp->sq.cur_post & 0xffff); if (dr_qp->nc_uar) { udma_to_device_barrier(); mmio_write64_be((uint8_t *)dr_qp->uar->reg_addr, *(__be64 *)ctrl); return; } /* Make sure that the doorbell write happens before the memcpy * to WC memory below */ mmio_wc_start(); mmio_write64_be((uint8_t *)dr_qp->uar->reg_addr, *(__be64 *)ctrl); mmio_flush_writes(); } static void dr_set_data_ptr_seg(struct mlx5_wqe_data_seg *dseg, struct dr_data_seg *data_seg) { dseg->byte_count = htobe32(data_seg->length); dseg->lkey = htobe32(data_seg->lkey); dseg->addr = htobe64(data_seg->addr); } static int dr_set_data_inl_seg(struct dr_qp *dr_qp, struct dr_data_seg *data_seg, void *wqe, uint32_t opcode, int *sz) { struct mlx5_wqe_inline_seg *seg; void *qend = dr_qp->sq.qend; int inl = 0; void *addr; int copy; int len; seg = wqe; wqe += sizeof(*seg); addr = (void *)(unsigned long)(data_seg->addr); len = data_seg->length; inl += len; if (unlikely(wqe + len > qend)) { copy = qend - wqe; memcpy(wqe, addr, copy); addr += copy; len -= copy; wqe = dr_qp->sq_start; } memcpy(wqe, addr, len); wqe += len; if (likely(inl)) { seg->byte_count = htobe32(inl | MLX5_INLINE_SEG); *sz = align(inl + sizeof(seg->byte_count), 16) / 16; } else { *sz = 0; } return 0; } static void dr_set_ctrl_seg(struct mlx5_wqe_ctrl_seg *ctrl, struct dr_data_seg *data_seg) { *(uint32_t *)((void *)ctrl + 8) = 0; ctrl->imm = 0; ctrl->fm_ce_se = data_seg->send_flags & IBV_SEND_SIGNALED ? MLX5_WQE_CTRL_CQ_UPDATE : 0; } static void dr_rdma_handle_flow_access_arg_segments(struct mlx5_wqe_ctrl_seg *ctrl, uint32_t remote_addr, struct dr_data_seg *data_seg, void *qend, void *qstart, int *opcod_mod, int *size, void **seg) { *opcod_mod = OPCODE_MOD_UPDATE_HEADER_MODIFY_ARGUMENT; /* general object id */ ctrl->imm = htobe32(remote_addr); if (unlikely(*seg == qend)) *seg = qstart; /* flow_update_ctrl all reserved */ memset(*seg, 0, sizeof(struct mlx5_wqe_flow_update_ctrl_seg)); *seg += sizeof(struct mlx5_wqe_flow_update_ctrl_seg); *size += sizeof(struct mlx5_wqe_flow_update_ctrl_seg) / 16; if (unlikely(*seg == qend)) *seg = qstart; dr_set_header_modify_arg_update_seg(*seg, (void *)(uintptr_t)data_seg->addr, data_seg->length); *size += sizeof(struct mlx5_wqe_header_modify_argument_update_seg) / 16; } static void dr_rdma_handle_icm_write_segments(struct dr_qp *dr_qp, uint64_t remote_addr, uint32_t rkey, struct dr_data_seg *data_seg, uint32_t opcode, void *qend, int *size, void **seg) { dr_set_raddr_seg(*seg, remote_addr, rkey); *seg += sizeof(struct mlx5_wqe_raddr_seg); *size += sizeof(struct mlx5_wqe_raddr_seg) / 16; if (data_seg->send_flags & IBV_SEND_INLINE) { int sz = 0; dr_set_data_inl_seg(dr_qp, data_seg, *seg, opcode, &sz); *size += sz; } else { if (unlikely(*seg == qend)) *seg = dr_qp->sq_start; dr_set_data_ptr_seg(*seg, data_seg); *size += sizeof(struct mlx5_wqe_data_seg) / 16; } } static void dr_rdma_segments(struct dr_qp *dr_qp, uint64_t remote_addr, uint32_t rkey, struct dr_data_seg *data_seg, uint32_t opcode, bool send_now) { struct mlx5_wqe_ctrl_seg *ctrl = NULL; void *qend = dr_qp->sq.qend; int opcode_mod = 0; unsigned idx; int size = 0; void *seg; idx = dr_qp->sq.cur_post & (dr_qp->sq.wqe_cnt - 1); ctrl = dr_qp->sq_start + (idx << MLX5_SEND_WQE_SHIFT); seg = ctrl; dr_set_ctrl_seg(ctrl, data_seg); seg += sizeof(*ctrl); size = sizeof(*ctrl) / 16; switch (opcode) { case MLX5_OPCODE_RDMA_READ: case MLX5_OPCODE_RDMA_WRITE: dr_rdma_handle_icm_write_segments(dr_qp, 
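/*
 * Ring arithmetic used by dr_rdma_segments() and dr_set_data_inl_seg()
 * above: the SQ is a power-of-two array of MLX5_SEND_WQE_BB slots, so
 * the producer index is reduced with a mask, and inline payload that
 * runs past the queue end wraps back to its start. Toy form
 * (hypothetical names):
 */
#if 0
#include <stdint.h>
#include <string.h>

struct toy_sq {
	uint8_t *start;
	uint8_t *end;		/* one past the last byte */
	unsigned int wqe_cnt;	/* power of two */
	unsigned int cur_post;
};

static void *toy_wqe_slot(struct toy_sq *sq, unsigned int wqe_shift)
{
	unsigned int idx = sq->cur_post & (sq->wqe_cnt - 1);

	return sq->start + ((size_t)idx << wqe_shift);
}

static void toy_copy_with_wrap(struct toy_sq *sq, uint8_t *dst,
			       const uint8_t *src, size_t len)
{
	if (dst + len > sq->end) {
		size_t head = sq->end - dst;

		memcpy(dst, src, head);
		src += head;
		len -= head;
		dst = sq->start;	/* wrap to queue start */
	}
	memcpy(dst, src, len);
}
#endif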
remote_addr, rkey, data_seg, opcode, qend, &size, &seg); break; case MLX5_OPCODE_FLOW_TBL_ACCESS: dr_rdma_handle_flow_access_arg_segments(ctrl, remote_addr, data_seg, qend, dr_qp->sq_start, &opcode_mod, &size, &seg); break; default: assert(false); break; } ctrl->opmod_idx_opcode = htobe32((opcode_mod << 24) | ((dr_qp->sq.cur_post & 0xffff) << 8) | opcode); ctrl->qpn_ds = htobe32(size | (dr_qp->obj->object_id << 8)); dr_qp->sq.wqe_head[idx] = dr_qp->sq.head; dr_qp->sq.cur_post += DIV_ROUND_UP(size * 16, MLX5_SEND_WQE_BB); /* head is ready for the next WQE */ dr_qp->sq.head += 1; if (send_now) dr_post_send_db(dr_qp, ctrl); } static void dr_post_send(struct dr_qp *dr_qp, struct postsend_info *send_info) { if (send_info->type == WRITE_ICM) { /* false, because we delay the post_send_db till the coming READ */ dr_rdma_segments(dr_qp, send_info->remote_addr, send_info->rkey, &send_info->write, MLX5_OPCODE_RDMA_WRITE, false); /* true, because we send WRITE + READ together */ dr_rdma_segments(dr_qp, send_info->remote_addr, send_info->rkey, &send_info->read, MLX5_OPCODE_RDMA_READ, true); } else { /* GTA_ARG */ dr_rdma_segments(dr_qp, send_info->remote_addr, send_info->rkey, &send_info->write, MLX5_OPCODE_FLOW_TBL_ACCESS, true); } } /* * dr_send_fill_and_append_ste_send_info: Add data to be sent with send_list * parameters: * @ste - The data that attached to this specific ste * @size - of data to write * @offset - of the data from start of the hw_ste entry * @data - data * @ste_info - ste to be sent with send_list * @send_list - to append into it * @copy_data - if true indicates that the data should be kept because it's not * backuped any where (like in re-hash). * if false, it lets the data to be updated after it was added to * the list. */ void dr_send_fill_and_append_ste_send_info(struct dr_ste *ste, uint16_t size, uint16_t offset, uint8_t *data, struct dr_ste_send_info *ste_info, struct list_head *send_list, bool copy_data) { ste_info->size = size; ste_info->ste = ste; ste_info->offset = offset; if (copy_data) { memcpy(ste_info->data_cont, data, size); ste_info->data = ste_info->data_cont; } else { ste_info->data = data; } list_add_tail(send_list, &ste_info->send_list); } static bool dr_is_device_fatal(struct mlx5dv_dr_domain *dmn) { struct mlx5_context *mlx5_ctx = to_mctx(dmn->ctx); if (mlx5_ctx->flags & MLX5_CTX_FLAGS_FATAL_STATE) return true; return false; } /* * The function tries to consume one wc each time, unless the queue is full, in * that case, which means that the hw is behind the sw in a full queue len * the function will drain the cq till it empty. 
*/ static int dr_handle_pending_wc(struct mlx5dv_dr_domain *dmn, struct dr_send_ring *send_ring) { bool is_drain = false; int ne; if (send_ring->pending_wqe >= send_ring->signal_th) { /* Queue is full start drain it */ if (send_ring->pending_wqe >= send_ring->signal_th * TH_NUMS_TO_DRAIN) is_drain = true; do { /* * On IBV_EVENT_DEVICE_FATAL a success is returned to * let the application free its resources successfully */ if (dr_is_device_fatal(dmn)) return 0; ne = dr_poll_cq(&send_ring->cq, 1); if (ne < 0) { dr_dbg(dmn, "poll CQ failed\n"); return ne; } else if (ne == 1) { send_ring->pending_wqe -= send_ring->signal_th; } } while (is_drain && send_ring->pending_wqe >= send_ring->signal_th); } return 0; } static void dr_fill_write_args_segs(struct dr_send_ring *send_ring, struct postsend_info *send_info) { send_ring->pending_wqe++; if (send_ring->pending_wqe % send_ring->signal_th == 0) send_info->write.send_flags |= IBV_SEND_SIGNALED; else send_info->write.send_flags = 0; } static void dr_fill_write_icm_segs(struct mlx5dv_dr_domain *dmn, struct dr_send_ring *send_ring, struct postsend_info *send_info) { unsigned int inline_flag; uint32_t buff_offset; if (send_info->write.length > send_ring->max_inline_size) { buff_offset = (send_ring->tx_head & (send_ring->signal_th - 1)) * dmn->info.max_send_size; /* Copy to ring mr */ memcpy(send_ring->buf + buff_offset, (void *)(uintptr_t)send_info->write.addr, send_info->write.length); send_info->write.addr = (uintptr_t)send_ring->buf + buff_offset; send_info->write.lkey = send_ring->mr->lkey; send_ring->tx_head++; } send_ring->pending_wqe++; if (!send_info->write.lkey) inline_flag = IBV_SEND_INLINE; else inline_flag = 0; send_info->write.send_flags = inline_flag; if (send_ring->pending_wqe % send_ring->signal_th == 0) send_info->write.send_flags |= IBV_SEND_SIGNALED; send_ring->pending_wqe++; send_info->read.length = send_info->write.length; /* Read into dedicated buffer */ send_info->read.addr = (uintptr_t)send_ring->sync_buff; send_info->read.lkey = send_ring->sync_mr->lkey; if (send_ring->pending_wqe % send_ring->signal_th == 0) send_info->read.send_flags = IBV_SEND_SIGNALED; else send_info->read.send_flags = 0; } static void dr_fill_data_segs(struct mlx5dv_dr_domain *dmn, struct dr_send_ring *send_ring, struct postsend_info *send_info) { if (send_info->type == WRITE_ICM) dr_fill_write_icm_segs(dmn, send_ring, send_info); else dr_fill_write_args_segs(send_ring, send_info); } static int dr_postsend_icm_data(struct mlx5dv_dr_domain *dmn, struct postsend_info *send_info, int ring_idx) { struct dr_send_ring *send_ring = dmn->send_ring[ring_idx % DR_MAX_SEND_RINGS]; int ret; pthread_spin_lock(&send_ring->lock); ret = dr_handle_pending_wc(dmn, send_ring); if (ret) goto out_unlock; dr_fill_data_segs(dmn, send_ring, send_info); dr_post_send(send_ring->qp, send_info); out_unlock: pthread_spin_unlock(&send_ring->lock); return ret; } static int dr_get_tbl_copy_details(struct mlx5dv_dr_domain *dmn, struct dr_ste_htbl *htbl, uint8_t **data, uint32_t *byte_size, int *iterations, int *num_stes) { int alloc_size; if (htbl->chunk->byte_size > dmn->info.max_send_size) { *iterations = htbl->chunk->byte_size / dmn->info.max_send_size; *byte_size = dmn->info.max_send_size; alloc_size = *byte_size; *num_stes = *byte_size / DR_STE_SIZE; } else { *iterations = 1; *num_stes = htbl->chunk->num_of_entries; alloc_size = *num_stes * DR_STE_SIZE; } *data = calloc(1, alloc_size); if (!*data) { errno = ENOMEM; return errno; } return 0; } /* * dr_postsend_ste: write size bytes 
into offset from the hw icm. * * Input: * dmn - Domain * ste - The ste struct that contains the data (at least part of it) * data - The real data to send * size - data size for writing. * offset - The offset from the icm mapped data to start write to. * this for write only part of the buffer. * * Return: 0 on success. */ int dr_send_postsend_ste(struct mlx5dv_dr_domain *dmn, struct dr_ste *ste, uint8_t *data, uint16_t size, uint16_t offset, uint8_t ring_idx) { struct postsend_info send_info = {}; dr_ste_prepare_for_postsend(dmn->ste_ctx, data, size); send_info.write.addr = (uintptr_t) data; send_info.write.length = size; send_info.write.lkey = 0; send_info.remote_addr = dr_ste_get_mr_addr(ste) + offset; send_info.rkey = dr_icm_pool_get_chunk_rkey(ste->htbl->chunk); return dr_postsend_icm_data(dmn, &send_info, ring_idx); } int dr_send_postsend_htbl(struct mlx5dv_dr_domain *dmn, struct dr_ste_htbl *htbl, uint8_t *formated_ste, uint8_t *mask, uint8_t send_ring_idx) { bool legacy_htbl = htbl->type == DR_STE_HTBL_TYPE_LEGACY; uint32_t byte_size = htbl->chunk->byte_size; int i, j, num_stes_per_iter, iterations; uint8_t ste_sz = htbl->ste_arr->size; uint8_t *data; int ret; ret = dr_get_tbl_copy_details(dmn, htbl, &data, &byte_size, &iterations, &num_stes_per_iter); if (ret) return ret; dr_ste_prepare_for_postsend(dmn->ste_ctx, formated_ste, DR_STE_SIZE); /* Send the data iteration times */ for (i = 0; i < iterations; i++) { uint32_t ste_index = i * (byte_size / DR_STE_SIZE); struct postsend_info send_info = {}; /* Copy all ste's on the data buffer, need to add the bit_mask */ for (j = 0; j < num_stes_per_iter; j++) { if (dr_ste_is_not_used(&htbl->ste_arr[ste_index + j])) { memcpy(data + (j * DR_STE_SIZE), formated_ste, DR_STE_SIZE); } else { /* Copy data */ memcpy(data + (j * DR_STE_SIZE), htbl->ste_arr[ste_index + j].hw_ste, ste_sz); /* Copy bit_mask on legacy tables */ if (legacy_htbl) memcpy(data + (j * DR_STE_SIZE) + ste_sz, mask, DR_STE_SIZE_MASK); /* Prepare STE to specific HW format */ dr_ste_prepare_for_postsend(dmn->ste_ctx, data + (j * DR_STE_SIZE), DR_STE_SIZE); } } send_info.write.addr = (uintptr_t) data; send_info.write.length = byte_size; send_info.write.lkey = 0; send_info.remote_addr = dr_ste_get_mr_addr(htbl->ste_arr + ste_index); send_info.rkey = dr_icm_pool_get_chunk_rkey(htbl->chunk); ret = dr_postsend_icm_data(dmn, &send_info, send_ring_idx); if (ret) goto out_free; } out_free: free(data); return ret; } /* Initialize htble with default STEs */ int dr_send_postsend_formated_htbl(struct mlx5dv_dr_domain *dmn, struct dr_ste_htbl *htbl, uint8_t *ste_init_data, bool update_hw_ste, uint8_t send_ring_idx) { uint32_t byte_size = htbl->chunk->byte_size; int i, num_stes, iterations, ret; uint8_t *copy_dst; uint8_t *data; ret = dr_get_tbl_copy_details(dmn, htbl, &data, &byte_size, &iterations, &num_stes); if (ret) return ret; if (update_hw_ste) { /* Copy the STE to hash table ste_arr */ for (i = 0; i < num_stes; i++) { copy_dst = htbl->hw_ste_arr + i * htbl->ste_arr->size; memcpy(copy_dst, ste_init_data, htbl->ste_arr->size); } } dr_ste_prepare_for_postsend(dmn->ste_ctx, ste_init_data, DR_STE_SIZE); /* Copy the same STE on the data buffer */ for (i = 0; i < num_stes; i++) { copy_dst = data + i * DR_STE_SIZE; memcpy(copy_dst, ste_init_data, DR_STE_SIZE); } /* Send the data iteration times */ for (i = 0; i < iterations; i++) { uint32_t ste_index = i * (byte_size / DR_STE_SIZE); struct postsend_info send_info = {}; send_info.write.addr = (uintptr_t) data; send_info.write.length = 
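/*
 * dr_get_tbl_copy_details() above sizes the staging buffer: a table
 * larger than the ring's maximum send size is written in equal slices,
 * each carrying byte_size / DR_STE_SIZE entries. The arithmetic in
 * isolation (allocation elided):
 */
#if 0
#include <stdint.h>

static void toy_copy_details(uint32_t tbl_bytes, uint32_t max_send,
			     uint32_t ste_sz, int *iterations,
			     uint32_t *slice_bytes)
{
	if (tbl_bytes > max_send) {
		*iterations = tbl_bytes / max_send;
		*slice_bytes = max_send;
	} else {
		*iterations = 1;
		*slice_bytes = tbl_bytes;
	}
	/* each slice then carries *slice_bytes / ste_sz entries */
}
#endif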
byte_size; send_info.write.lkey = 0; send_info.remote_addr = dr_ste_get_mr_addr(htbl->ste_arr + ste_index); send_info.rkey = dr_icm_pool_get_chunk_rkey(htbl->chunk); ret = dr_postsend_icm_data(dmn, &send_info, send_ring_idx); if (ret) goto out_free; } out_free: free(data); return ret; } int dr_send_postsend_action(struct mlx5dv_dr_domain *dmn, struct mlx5dv_dr_action *action) { struct postsend_info send_info = {}; int num_qps; int i, ret; num_qps = dmn->info.use_mqs ? DR_MAX_SEND_RINGS : 1; if (action->action_type == DR_ACTION_TYP_L2_TO_TNL_L2 || action->action_type == DR_ACTION_TYP_L2_TO_TNL_L3) { send_info.write.addr = (uintptr_t)action->reformat.data; send_info.write.length = action->reformat.reformat_size; send_info.remote_addr = dr_icm_pool_get_chunk_mr_addr(action->reformat.chunk); send_info.rkey = dr_icm_pool_get_chunk_rkey(action->reformat.chunk); } else { send_info.write.addr = (uintptr_t)action->rewrite.param.data; send_info.write.length = action->rewrite.param.num_of_actions * DR_MODIFY_ACTION_SIZE; send_info.remote_addr = dr_icm_pool_get_chunk_mr_addr(action->rewrite.param.chunk); send_info.rkey = dr_icm_pool_get_chunk_rkey(action->rewrite.param.chunk); } send_info.write.lkey = 0; /* To avoid race between action creation and its use in other QP * write it in all QP's. */ for (i = 0; i < num_qps; i++) { ret = dr_postsend_icm_data(dmn, &send_info, i); if (ret) return ret; } return 0; } int dr_send_postsend_pattern(struct mlx5dv_dr_domain *dmn, struct dr_icm_chunk *chunk, uint16_t num_of_actions, uint8_t *data) { struct postsend_info send_info = {}; int num_qps; int i, ret; num_qps = dmn->info.use_mqs ? DR_MAX_SEND_RINGS : 1; send_info.write.addr = (uintptr_t)data; send_info.write.length = num_of_actions * DR_MODIFY_ACTION_SIZE; send_info.remote_addr = dr_icm_pool_get_chunk_mr_addr(chunk); send_info.rkey = dr_icm_pool_get_chunk_rkey(chunk); /* To avoid race between action creation and its use in other QP * write it in all QP's. */ for (i = 0; i < num_qps; i++) { ret = dr_postsend_icm_data(dmn, &send_info, i); if (ret) { errno = ret; return ret; } } return 0; } int dr_send_postsend_args(struct mlx5dv_dr_domain *dmn, uint64_t arg_id, uint16_t num_of_actions, uint8_t *actions_data, uint32_t ring_index) { struct postsend_info send_info = {}; int data_len, iter = 0, cur_sent; uint64_t addr; int ret; addr = (uintptr_t)actions_data; data_len = num_of_actions * DR_MODIFY_ACTION_SIZE; do { send_info.type = GTA_ARG; send_info.write.addr = addr; cur_sent = min_t(uint32_t, data_len, ACTION_CACHE_LINE_SIZE); send_info.write.length = cur_sent; send_info.write.lkey = 0; send_info.remote_addr = arg_id + iter; ret = dr_postsend_icm_data(dmn, &send_info, ring_index); if (ret) { errno = ret; goto out; } iter++; addr += cur_sent; data_len -= cur_sent; } while (data_len > 0); out: return ret; } bool dr_send_allow_fl(struct dr_devx_caps *caps) { return ((caps->roce_caps.roce_en && caps->roce_caps.fl_rc_qp_when_roce_enabled) || (!caps->roce_caps.roce_en && caps->roce_caps.fl_rc_qp_when_roce_disabled)); } static int dr_send_get_qp_ts_format(struct dr_devx_caps *caps) { /* Set the default TS format in case TS format is supported */ return !caps->roce_caps.qp_ts_format ? 
MLX5_QPC_TIMESTAMP_FORMAT_FREE_RUNNING : MLX5_QPC_TIMESTAMP_FORMAT_DEFAULT; } static int dr_prepare_qp_to_rts(struct mlx5dv_dr_domain *dmn, struct dr_qp *dr_qp) { struct dr_devx_qp_rts_attr rts_attr = {}; struct dr_devx_qp_rtr_attr rtr_attr = {}; enum ibv_mtu mtu = IBV_MTU_1024; uint16_t gid_index = 0; int port = 1; int ret; /* Init */ ret = dr_devx_modify_qp_rst2init(dmn->ctx, dr_qp->obj, port); if (ret) { dr_dbg(dmn, "Failed to modify QP to INIT, ret: %d\n", ret); return ret; } /* RTR */ rtr_attr.mtu = mtu; rtr_attr.qp_num = dr_qp->obj->object_id; rtr_attr.min_rnr_timer = 12; rtr_attr.port_num = port; /* Enable force-loopback on the QP */ if (dr_send_allow_fl(&dmn->info.caps)) { rtr_attr.fl = true; } else { ret = dr_devx_query_gid(dmn->ctx, port, gid_index, &rtr_attr.dgid_attr); if (ret) { dr_dbg(dmn, "can't read sgid of index %d\n", gid_index); return ret; } rtr_attr.sgid_index = gid_index; } ret = dr_devx_modify_qp_init2rtr(dmn->ctx, dr_qp->obj, &rtr_attr); if (ret) { dr_dbg(dmn, "Failed to modify QP to RTR, ret: %d\n", ret); return ret; } /* RTS */ rts_attr.timeout = 14; rts_attr.retry_cnt = 7; rts_attr.rnr_retry = 7; ret = dr_devx_modify_qp_rtr2rts(dmn->ctx, dr_qp->obj, &rts_attr); if (ret) { dr_dbg(dmn, "Failed to modify QP to RTS, ret: %d\n", ret); return ret; } return 0; } static void dr_send_ring_free_one(struct dr_send_ring *send_ring) { dr_destroy_qp(send_ring->qp); ibv_destroy_cq(send_ring->cq.ibv_cq); ibv_dereg_mr(send_ring->sync_mr); ibv_dereg_mr(send_ring->mr); free(send_ring->buf); free(send_ring->sync_buff); free(send_ring); } void dr_send_ring_free(struct mlx5dv_dr_domain *dmn) { int i; for (i = 0; i < DR_MAX_SEND_RINGS; i++) dr_send_ring_free_one(dmn->send_ring[i]); } /* Each domain has its own ib resources */ static int dr_send_ring_alloc_one(struct mlx5dv_dr_domain *dmn, struct dr_send_ring **curr_send_ring) { struct dr_qp_init_attr init_attr = {}; struct dr_send_ring *send_ring; struct mlx5dv_cq mlx5_cq = {}; int cq_size, page_size; struct mlx5dv_obj obj; int size; int access_flags = IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_REMOTE_READ; int ret; send_ring = calloc(1, sizeof(*send_ring)); if (!send_ring) { dr_dbg(dmn, "Couldn't allocate send-ring\n"); errno = ENOMEM; return errno; } ret = pthread_spin_init(&send_ring->lock, PTHREAD_PROCESS_PRIVATE); if (ret) { errno = ret; goto free_send_ring; } cq_size = QUEUE_SIZE + 1; send_ring->cq.ibv_cq = ibv_create_cq(dmn->ctx, cq_size, NULL, NULL, 0); if (!send_ring->cq.ibv_cq) { dr_dbg(dmn, "Failed to create CQ with %u entries\n", cq_size); ret = ENODEV; errno = ENODEV; goto free_send_ring; } obj.cq.in = send_ring->cq.ibv_cq; obj.cq.out = &mlx5_cq; ret = mlx5dv_init_obj(&obj, MLX5DV_OBJ_CQ); if (ret) goto clean_cq; send_ring->cq.buf = mlx5_cq.buf; send_ring->cq.db = mlx5_cq.dbrec; send_ring->cq.ncqe = mlx5_cq.cqe_cnt; send_ring->cq.cqe_sz = mlx5_cq.cqe_size; init_attr.cqn = mlx5_cq.cqn; init_attr.pdn = dmn->pd_num; init_attr.uar = dmn->uar; init_attr.cap.max_send_wr = QUEUE_SIZE; init_attr.cap.max_recv_wr = 1; init_attr.cap.max_send_sge = 1; init_attr.cap.max_recv_sge = 1; init_attr.cap.max_inline_data = DR_STE_SIZE; init_attr.qp_ts_format = dr_send_get_qp_ts_format(&dmn->info.caps); /* Isolated VL is applicable only if force LB is supported */ if (dr_send_allow_fl(&dmn->info.caps)) init_attr.isolate_vl_tc = dmn->info.caps.isolate_vl_tc; send_ring->qp = dr_create_rc_qp(dmn->ctx, &init_attr); if (!send_ring->qp) { dr_dbg(dmn, "Couldn't create QP\n"); ret = errno; goto clean_cq; } send_ring->cq.qp = 
send_ring->qp;
	send_ring->max_inline_size = min(send_ring->qp->max_inline_data,
					 DR_STE_SIZE);
	send_ring->signal_th = QUEUE_SIZE / SIGNAL_PER_DIV_QUEUE;

	/* Prepare qp to be used */
	ret = dr_prepare_qp_to_rts(dmn, send_ring->qp);
	if (ret) {
		dr_dbg(dmn, "Couldn't prepare QP\n");
		goto clean_qp;
	}

	/* Allocate the max size as a buffer for writing */
	size = send_ring->signal_th * dmn->info.max_send_size;
	page_size = sysconf(_SC_PAGESIZE);
	ret = posix_memalign(&send_ring->buf, page_size, size);
	if (ret) {
		dr_dbg(dmn, "Couldn't allocate send-ring buf.\n");
		errno = ret;
		goto clean_qp;
	}
	memset(send_ring->buf, 0, size);

	send_ring->buf_size = size;
	send_ring->mr = ibv_reg_mr(dmn->pd, send_ring->buf, size, access_flags);
	if (!send_ring->mr) {
		dr_dbg(dmn, "Couldn't register send-ring MR\n");
		ret = errno;
		goto free_mem;
	}

	ret = posix_memalign(&send_ring->sync_buff, page_size,
			     dmn->info.max_send_size);
	if (ret) {
		dr_dbg(dmn, "Couldn't allocate send-ring sync_buf.\n");
		errno = ret;
		goto clean_mr;
	}

	send_ring->sync_mr = ibv_reg_mr(dmn->pd, send_ring->sync_buff,
					dmn->info.max_send_size,
					IBV_ACCESS_LOCAL_WRITE |
					IBV_ACCESS_REMOTE_READ |
					IBV_ACCESS_REMOTE_WRITE);
	if (!send_ring->sync_mr) {
		dr_dbg(dmn, "Couldn't register sync mr\n");
		ret = errno;
		goto clean_sync_buf;
	}

	*curr_send_ring = send_ring;

	return 0;

clean_sync_buf:
	free(send_ring->sync_buff);
clean_mr:
	ibv_dereg_mr(send_ring->mr);
free_mem:
	free(send_ring->buf);
clean_qp:
	dr_destroy_qp(send_ring->qp);
clean_cq:
	ibv_destroy_cq(send_ring->cq.ibv_cq);
free_send_ring:
	free(send_ring);

	return ret;
}

int dr_send_ring_alloc(struct mlx5dv_dr_domain *dmn)
{
	int i, ret;

	dmn->info.max_send_size =
		dr_icm_pool_chunk_size_to_byte(DR_CHUNK_SIZE_1K,
					       DR_ICM_TYPE_STE);

	for (i = 0; i < DR_MAX_SEND_RINGS; i++) {
		ret = dr_send_ring_alloc_one(dmn, &dmn->send_ring[i]);
		if (ret) {
			dr_dbg(dmn, "Couldn't allocate send-rings id[%d]\n", i);
			goto free_send_ring;
		}
	}

	return 0;

free_send_ring:
	for (; i > 0; i--)
		dr_send_ring_free_one(dmn->send_ring[i - 1]);

	return ret;
}

int dr_send_ring_force_drain(struct mlx5dv_dr_domain *dmn)
{
	struct dr_send_ring *send_ring = dmn->send_ring[0];
	struct postsend_info send_info = {};
	int i, j, num_of_sends_req;
	uint8_t data[DR_STE_SIZE];
	int num_qps;
	int ret;

	num_qps = dmn->info.use_mqs ? DR_MAX_SEND_RINGS : 1;

	/* Sending this many requests guarantees the queue is drained */
	num_of_sends_req = send_ring->signal_th * TH_NUMS_TO_DRAIN / 2;

	/* Send fake requests forcing the last to be signaled */
	send_info.write.addr = (uintptr_t) data;
	send_info.write.length = DR_STE_SIZE;
	send_info.write.lkey = 0;
	/* Using the sync_mr in order to write/read */
	send_info.remote_addr = (uintptr_t) send_ring->sync_mr->addr;
	send_info.rkey = send_ring->sync_mr->rkey;

	for (i = 0; i < num_of_sends_req; i++) {
		for (j = 0; j < num_qps; j++) {
			ret = dr_postsend_icm_data(dmn, &send_info, j);
			if (ret)
				return ret;
		}
	}

	return 0;
}
rdma-core-56.1/providers/mlx5/dr_ste.c000066400000000000000000001532751477342711600176730ustar00rootroot00000000000000/*
 * Copyright (c) 2019, Mellanox Technologies. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses.
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include "mlx5dv_dr.h" #include "dr_ste.h" struct dr_hw_ste_format { uint8_t ctrl[DR_STE_SIZE_CTRL]; uint8_t tag[DR_STE_SIZE_TAG]; uint8_t mask[DR_STE_SIZE_MASK]; }; uint32_t dr_ste_calc_hash_index(uint8_t *hw_ste_p, struct dr_ste_htbl *htbl) { struct dr_hw_ste_format *hw_ste = (struct dr_hw_ste_format *)hw_ste_p; uint8_t masked[DR_STE_SIZE_TAG] = {}; uint32_t crc32, index; uint8_t *p_masked; uint16_t bit; size_t len; int i; /* Don't calculate CRC if the result is predicted */ if (htbl->chunk->num_of_entries == 1) return 0; if (htbl->type == DR_STE_HTBL_TYPE_LEGACY) { if (htbl->byte_mask == 0) return 0; len = DR_STE_SIZE_TAG; /* Mask tag using byte mask, bit per byte */ bit = 1 << (DR_STE_SIZE_TAG - 1); for (i = 0; i < DR_STE_SIZE_TAG; i++) { if (htbl->byte_mask & bit) masked[i] = hw_ste->tag[i]; bit = bit >> 1; } p_masked = masked; } else { len = DR_STE_SIZE_MATCH_TAG; p_masked = hw_ste->tag; } crc32 = dr_crc32_slice8_calc(p_masked, len); index = crc32 % htbl->chunk->num_of_entries; return index; } uint16_t dr_ste_conv_bit_to_byte_mask(uint8_t *bit_mask) { uint16_t byte_mask = 0; int i; for (i = 0; i < DR_STE_SIZE_MASK; i++) { byte_mask = byte_mask << 1; if (bit_mask[i] == 0xff) byte_mask |= 1; } return byte_mask; } static uint8_t *dr_ste_get_tag(uint8_t *hw_ste_p) { struct dr_hw_ste_format *hw_ste = (struct dr_hw_ste_format *)hw_ste_p; return hw_ste->tag; } void dr_ste_set_bit_mask(uint8_t *hw_ste_p, struct dr_ste_build *sb) { struct dr_hw_ste_format *hw_ste = (struct dr_hw_ste_format *)hw_ste_p; if (sb->htbl_type == DR_STE_HTBL_TYPE_LEGACY) memcpy(hw_ste->mask, sb->bit_mask, DR_STE_SIZE_MASK); } static void dr_ste_set_always_hit(struct dr_hw_ste_format *hw_ste) { memset(&hw_ste->tag, 0, sizeof(hw_ste->tag)); memset(&hw_ste->mask, 0, sizeof(hw_ste->mask)); } static void dr_ste_set_always_miss(struct dr_hw_ste_format *hw_ste) { hw_ste->tag[0] = 0xdc; hw_ste->mask[0] = 0; } void dr_ste_set_miss_addr(struct dr_ste_ctx *ste_ctx, uint8_t *hw_ste_p, uint64_t miss_addr) { ste_ctx->set_miss_addr(hw_ste_p, miss_addr); } static void dr_ste_always_miss_addr(struct dr_ste_ctx *ste_ctx, struct dr_ste *ste, uint64_t miss_addr, uint16_t gvmi) { uint8_t *hw_ste_p = ste->hw_ste; ste_ctx->set_ctrl_always_miss(hw_ste_p, miss_addr, gvmi); dr_ste_set_always_miss((struct dr_hw_ste_format *)ste->hw_ste); } void 
dr_ste_set_hit_addr(struct dr_ste_ctx *ste_ctx, uint8_t *hw_ste_p,
		    uint64_t icm_addr, uint32_t ht_size)
{
	ste_ctx->set_hit_addr(hw_ste_p, icm_addr, ht_size);
}

void dr_ste_set_hit_gvmi(struct dr_ste_ctx *ste_ctx, uint8_t *hw_ste_p,
			 uint16_t gvmi)
{
	ste_ctx->set_hit_gvmi(hw_ste_p, gvmi);
}

uint64_t dr_ste_get_icm_addr(struct dr_ste *ste)
{
	uint32_t index = ste - ste->htbl->ste_arr;

	return dr_icm_pool_get_chunk_icm_addr(ste->htbl->chunk) +
	       DR_STE_SIZE * index;
}

uint64_t dr_ste_get_mr_addr(struct dr_ste *ste)
{
	uint32_t index = ste - ste->htbl->ste_arr;

	return dr_icm_pool_get_chunk_mr_addr(ste->htbl->chunk) +
	       DR_STE_SIZE * index;
}

struct list_head *dr_ste_get_miss_list(struct dr_ste *ste)
{
	uint32_t index = ste - ste->htbl->ste_arr;

	return &ste->htbl->miss_list[index];
}

struct dr_ste *dr_ste_get_miss_list_top(struct dr_ste *ste)
{
	/* Optimize miss list access (reduce cache misses) by checking
	 * if we actually need to jump to list_top:
	 * if the number of entries in the current hash table is more than
	 * one, this is not a collision entry.
	 */
	if (ste->htbl->chunk->num_of_entries > 1)
		return ste;
	else
		return list_top(dr_ste_get_miss_list(ste),
				struct dr_ste, miss_list_node);
}

static void dr_ste_always_hit_htbl(struct dr_ste_ctx *ste_ctx,
				   struct dr_ste *ste,
				   struct dr_ste_htbl *next_htbl,
				   uint16_t gvmi)
{
	struct dr_icm_chunk *chunk = next_htbl->chunk;
	uint8_t *hw_ste = ste->hw_ste;

	ste_ctx->set_ctrl_always_hit_htbl(hw_ste,
					  next_htbl->byte_mask,
					  next_htbl->lu_type,
					  dr_icm_pool_get_chunk_icm_addr(chunk),
					  chunk->num_of_entries,
					  gvmi);

	dr_ste_set_always_hit((struct dr_hw_ste_format *)ste->hw_ste);
}

bool dr_ste_is_last_in_rule(struct dr_matcher_rx_tx *nic_matcher,
			    uint8_t ste_location)
{
	return ste_location == nic_matcher->num_of_builders;
}

/*
 * Replace relevant fields, except for:
 *	htbl - keep the origin htbl
 *	miss_list + list - the src was already taken off the list.
 *	icm_addr/mr_addr - depend on the hosting table.
 *
 * Before:
 *	| a | -> | b | -> | c | ->
 *
 * After:
 *	| a | -> | c | ->
 * while the data that was in b is copied to a.
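 * (The rule's last-member pointer is then updated by the caller via
 *  dr_rule_set_last_member - see dr_ste_replace_head_ste below.)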
 */
static void dr_ste_replace(struct dr_ste *dst, struct dr_ste *src)
{
	memcpy(dst->hw_ste, src->hw_ste, dst->size);
	dst->next_htbl = src->next_htbl;
	if (dst->next_htbl)
		dst->next_htbl->pointing_ste = dst;

	atomic_init(&dst->refcount, atomic_load(&src->refcount));
}

/* Free ste which is the head and the only one in miss_list */
static void dr_ste_remove_head_ste(struct dr_ste_ctx *ste_ctx,
				   struct mlx5dv_dr_domain *dmn,
				   struct dr_ste *ste,
				   struct dr_matcher_rx_tx *nic_matcher,
				   struct dr_ste_send_info *ste_info_head,
				   struct list_head *send_ste_list,
				   struct dr_ste_htbl *stats_tbl)
{
	struct dr_domain_rx_tx *nic_dmn = nic_matcher->nic_tbl->nic_dmn;
	uint8_t formated_ste[DR_STE_SIZE] = {};
	struct dr_htbl_connect_info info;

	stats_tbl->ctrl.num_of_valid_entries--;

	/* Hash table will be deleted, no need to update STE */
	if (atomic_load(&ste->htbl->refcount) == 1)
		return;

	info.type = CONNECT_MISS;
	info.miss_icm_addr = dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk);

	dr_ste_set_formated_ste(ste_ctx, dmn->info.caps.gvmi, nic_dmn->type,
				ste->htbl, formated_ste, &info);

	memcpy(ste->hw_ste, formated_ste, ste->size);

	list_del_init(&ste->miss_list_node);

	/* Write full STE size in order to have "always_miss" */
	dr_send_fill_and_append_ste_send_info(ste, DR_STE_SIZE, 0, formated_ste,
					      ste_info_head, send_ste_list,
					      true /* Copy data */);
}

/*
 * Free ste which is the head but NOT the only one in miss_list:
 *	|_ste_| --> |_next_ste_| -->|__| -->|__| -->/0
 */
static void dr_ste_replace_head_ste(struct dr_matcher_rx_tx *nic_matcher,
				    struct dr_ste *ste,
				    struct dr_ste *next_ste,
				    struct dr_ste_send_info *ste_info_head,
				    struct list_head *send_ste_list,
				    struct dr_ste_htbl *stats_tbl)
{
	struct dr_ste_htbl *next_miss_htbl;
	uint8_t hw_ste[DR_STE_SIZE] = {};
	struct dr_ste_build *sb;
	int sb_idx;

	next_miss_htbl = next_ste->htbl;

	/* Remove next_ste from the miss_list before copying */
	list_del_init(&next_ste->miss_list_node);

	/* Move data from next into ste */
	dr_ste_replace(ste, next_ste);

	/* Update the rule on STE change */
	dr_rule_set_last_member(next_ste->rule_rx_tx, ste, false);

	sb_idx = ste->ste_chain_location - 1;
	sb = &nic_matcher->ste_builder[sb_idx];

	/* Copy all 64 hw_ste bytes */
	memcpy(hw_ste, ste->hw_ste, ste->size);
	dr_ste_set_bit_mask(hw_ste, sb);

	/*
	 * Delete the htbl that contains the next_ste.
	 * The origin htbl stays with the same number of entries.
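	 * (dr_htbl_put only drops next_miss_htbl's reference here; the
	 *  table and its ICM chunk are released once the last reference
	 *  is gone - see dr_ste_htbl_free.)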
*/ dr_htbl_put(next_miss_htbl); dr_send_fill_and_append_ste_send_info(ste, DR_STE_SIZE, 0, hw_ste, ste_info_head, send_ste_list, true /* Copy data */); stats_tbl->ctrl.num_of_collisions--; stats_tbl->ctrl.num_of_valid_entries--; } /* * Free ste that is located in the middle of the miss list: * |__| -->|_prev_ste_|->|_ste_|-->|_next_ste_| */ static void dr_ste_remove_middle_ste(struct dr_ste_ctx *ste_ctx, struct dr_ste *ste, struct dr_ste_send_info *ste_info, struct list_head *send_ste_list, struct dr_ste_htbl *stats_tbl) { struct dr_ste *prev_ste; uint64_t miss_addr; prev_ste = list_prev(dr_ste_get_miss_list(ste), ste, miss_list_node); assert(prev_ste); miss_addr = ste_ctx->get_miss_addr(ste->hw_ste); ste_ctx->set_miss_addr(prev_ste->hw_ste, miss_addr); dr_send_fill_and_append_ste_send_info(prev_ste, DR_STE_SIZE_CTRL, 0, prev_ste->hw_ste, ste_info, send_ste_list, true /* Copy data*/); list_del_init(&ste->miss_list_node); stats_tbl->ctrl.num_of_valid_entries--; stats_tbl->ctrl.num_of_collisions--; } void dr_ste_free(struct dr_ste *ste, struct mlx5dv_dr_rule *rule, struct dr_rule_rx_tx *nic_rule) { struct dr_matcher_rx_tx *nic_matcher = nic_rule->nic_matcher; struct dr_ste_send_info *cur_ste_info, *tmp_ste_info; struct mlx5dv_dr_matcher *matcher = rule->matcher; struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; struct dr_ste_ctx *ste_ctx = dmn->ste_ctx; struct dr_ste_send_info ste_info_head; struct dr_ste *next_ste, *first_ste; bool put_on_origin_table = true; struct dr_ste_htbl *stats_tbl; LIST_HEAD(send_ste_list); first_ste = dr_ste_get_miss_list_top(ste); stats_tbl = first_ste->htbl; /* * Two options: * 1. ste is head: * a. head ste is the only ste in the miss list * b. head ste is not the only ste in the miss-list * 2. ste is not head */ if (first_ste == ste) { /* Ste is the head */ next_ste = list_next(dr_ste_get_miss_list(ste), ste, miss_list_node); if (!next_ste) { /* One and only entry in the list */ dr_ste_remove_head_ste(ste_ctx, dmn, ste, nic_matcher, &ste_info_head, &send_ste_list, stats_tbl); } else { /* First but not only entry in the list */ dr_ste_replace_head_ste(nic_matcher, ste, next_ste, &ste_info_head, &send_ste_list, stats_tbl); put_on_origin_table = false; } } else { /* Ste in the middle of the list */ dr_ste_remove_middle_ste(ste_ctx, ste, &ste_info_head, &send_ste_list, stats_tbl); } /* Update HW */ list_for_each_safe(&send_ste_list, cur_ste_info, tmp_ste_info, send_list) { list_del(&cur_ste_info->send_list); dr_send_postsend_ste(dmn, cur_ste_info->ste, cur_ste_info->data, cur_ste_info->size, cur_ste_info->offset, nic_rule->lock_index); } if (put_on_origin_table) dr_htbl_put(ste->htbl); } bool dr_ste_equal_tag(void *src, void *dst, uint8_t tag_size) { struct dr_hw_ste_format *s_hw_ste = (struct dr_hw_ste_format *)src; struct dr_hw_ste_format *d_hw_ste = (struct dr_hw_ste_format *)dst; return !memcmp(s_hw_ste->tag, d_hw_ste->tag, tag_size); } void dr_ste_set_hit_addr_by_next_htbl(struct dr_ste_ctx *ste_ctx, uint8_t *hw_ste, struct dr_ste_htbl *next_htbl) { struct dr_icm_chunk *chunk = next_htbl->chunk; ste_ctx->set_hit_addr(hw_ste, dr_icm_pool_get_chunk_icm_addr(chunk), chunk->num_of_entries); } void dr_ste_prepare_for_postsend(struct dr_ste_ctx *ste_ctx, uint8_t *hw_ste_p, uint32_t ste_size) { if (ste_ctx->prepare_for_postsend) ste_ctx->prepare_for_postsend(hw_ste_p, ste_size); } /* Init one ste as a pattern for ste data array */ void dr_ste_set_formated_ste(struct dr_ste_ctx *ste_ctx, uint16_t gvmi, enum dr_domain_nic_type nic_type, struct dr_ste_htbl *htbl, uint8_t 
*formated_ste, struct dr_htbl_connect_info *connect_info) { bool is_rx = nic_type == DR_DOMAIN_NIC_TYPE_RX; struct dr_ste ste = {}; ste_ctx->ste_init(formated_ste, htbl->lu_type, is_rx, gvmi); ste.hw_ste = formated_ste; if (connect_info->type == CONNECT_HIT) dr_ste_always_hit_htbl(ste_ctx, &ste, connect_info->hit_next_htbl, gvmi); else dr_ste_always_miss_addr(ste_ctx, &ste, connect_info->miss_icm_addr, gvmi); } int dr_ste_htbl_init_and_postsend(struct mlx5dv_dr_domain *dmn, struct dr_domain_rx_tx *nic_dmn, struct dr_ste_htbl *htbl, struct dr_htbl_connect_info *connect_info, bool update_hw_ste, uint8_t send_ring_idx) { uint8_t formated_ste[DR_STE_SIZE] = {}; dr_ste_set_formated_ste(dmn->ste_ctx, dmn->info.caps.gvmi, nic_dmn->type, htbl, formated_ste, connect_info); return dr_send_postsend_formated_htbl(dmn, htbl, formated_ste, update_hw_ste, send_ring_idx); } int dr_ste_create_next_htbl(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, struct dr_ste *ste, uint8_t *cur_hw_ste, enum dr_icm_chunk_size log_table_size, uint8_t send_ring_idx) { struct dr_domain_rx_tx *nic_dmn = nic_matcher->nic_tbl->nic_dmn; struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; struct dr_ste_ctx *ste_ctx = dmn->ste_ctx; struct dr_htbl_connect_info info; struct dr_ste_htbl *next_htbl; if (!dr_ste_is_last_in_rule(nic_matcher, ste->ste_chain_location)) { uint16_t next_lu_type; uint16_t byte_mask; next_lu_type = ste_ctx->get_next_lu_type(cur_hw_ste); byte_mask = ste_ctx->get_byte_mask(cur_hw_ste); next_htbl = dr_ste_htbl_alloc(dmn->ste_icm_pool, log_table_size, ste->htbl->type, next_lu_type, byte_mask); if (!next_htbl) { dr_dbg(dmn, "Failed allocating next hash table\n"); return errno; } /* Write new table to HW */ info.type = CONNECT_MISS; info.miss_icm_addr = dr_icm_pool_get_chunk_icm_addr(nic_matcher->e_anchor->chunk); if (dr_ste_htbl_init_and_postsend(dmn, nic_dmn, next_htbl, &info, false, send_ring_idx)) { dr_dbg(dmn, "Failed writing table to HW\n"); goto free_table; } dr_ste_set_hit_addr_by_next_htbl(ste_ctx, cur_hw_ste, next_htbl); ste->next_htbl = next_htbl; next_htbl->pointing_ste = ste; } return 0; free_table: dr_ste_htbl_free(next_htbl); return ENOENT; } struct dr_ste_htbl *dr_ste_htbl_alloc(struct dr_icm_pool *pool, enum dr_icm_chunk_size chunk_size, enum dr_ste_htbl_type type, uint16_t lu_type, uint16_t byte_mask) { struct dr_icm_chunk *chunk; struct dr_ste_htbl *htbl; uint8_t ste_size; int i; htbl = calloc(1, sizeof(struct dr_ste_htbl)); if (!htbl) { errno = ENOMEM; return NULL; } chunk = dr_icm_alloc_chunk(pool, chunk_size); if (!chunk) goto out_free_htbl; if (type == DR_STE_HTBL_TYPE_LEGACY) ste_size = DR_STE_SIZE_REDUCED; else ste_size = DR_STE_SIZE; htbl->type = type; htbl->chunk = chunk; htbl->lu_type = lu_type; htbl->byte_mask = byte_mask; htbl->ste_arr = chunk->ste_arr; htbl->hw_ste_arr = chunk->hw_ste_arr; htbl->miss_list = chunk->miss_list; atomic_init(&htbl->refcount, 0); for (i = 0; i < chunk->num_of_entries; i++) { struct dr_ste *ste = &htbl->ste_arr[i]; ste->hw_ste = htbl->hw_ste_arr + i * ste_size; ste->htbl = htbl; ste->size = ste_size; atomic_init(&ste->refcount, 0); list_node_init(&ste->miss_list_node); list_head_init(&htbl->miss_list[i]); ste->next_htbl = NULL; ste->rule_rx_tx = NULL; ste->ste_chain_location = 0; } htbl->chunk_size = chunk_size; return htbl; out_free_htbl: free(htbl); return NULL; } int dr_ste_htbl_free(struct dr_ste_htbl *htbl) { if (atomic_load(&htbl->refcount)) return EBUSY; dr_icm_free_chunk(htbl->chunk); free(htbl); return 0; } void 
dr_ste_set_actions_tx(struct dr_ste_ctx *ste_ctx, uint8_t *action_type_set, uint8_t *hw_ste_arr, struct dr_ste_actions_attr *attr, uint32_t *added_stes) { ste_ctx->set_actions_tx(ste_ctx, action_type_set, ste_ctx->actions_caps, hw_ste_arr, attr, added_stes); } void dr_ste_set_actions_rx(struct dr_ste_ctx *ste_ctx, uint8_t *action_type_set, uint8_t *hw_ste_arr, struct dr_ste_actions_attr *attr, uint32_t *added_stes) { ste_ctx->set_actions_rx(ste_ctx, action_type_set, ste_ctx->actions_caps, hw_ste_arr, attr, added_stes); } const struct dr_ste_action_modify_field * dr_ste_conv_modify_hdr_sw_field(struct dr_ste_ctx *ste_ctx, struct dr_devx_caps *caps, uint16_t sw_field) { return ste_ctx->get_action_hw_field(ste_ctx, sw_field, caps); } void dr_ste_set_action_set(struct dr_ste_ctx *ste_ctx, __be64 *hw_action, uint8_t hw_field, uint8_t shifter, uint8_t length, uint32_t data) { ste_ctx->set_action_set((uint8_t *)hw_action, hw_field, shifter, length, data); } void dr_ste_set_action_add(struct dr_ste_ctx *ste_ctx, __be64 *hw_action, uint8_t hw_field, uint8_t shifter, uint8_t length, uint32_t data) { ste_ctx->set_action_add((uint8_t *)hw_action, hw_field, shifter, length, data); } void dr_ste_set_action_copy(struct dr_ste_ctx *ste_ctx, __be64 *hw_action, uint8_t dst_hw_field, uint8_t dst_shifter, uint8_t dst_len, uint8_t src_hw_field, uint8_t src_shifter) { ste_ctx->set_action_copy((uint8_t *)hw_action, dst_hw_field, dst_shifter, dst_len, src_hw_field, src_shifter); } int dr_ste_set_action_decap_l3_list(struct dr_ste_ctx *ste_ctx, void *data, uint32_t data_sz, uint8_t *hw_action, uint32_t hw_action_sz, uint16_t *used_hw_action_num) { /* Only Ethernet frame is supported, with VLAN (18) or without (14) */ if (data_sz != HDR_LEN_L2 && data_sz != HDR_LEN_L2_W_VLAN) { errno = EINVAL; return errno; } return ste_ctx->set_action_decap_l3_list(data, data_sz, hw_action, hw_action_sz, used_hw_action_num); } static int dr_ste_alloc_modify_hdr_chunk(struct mlx5dv_dr_action *action, uint32_t chunck_size) { int ret; action->rewrite.param.chunk = dr_icm_alloc_chunk(action->rewrite.dmn->action_icm_pool, chunck_size); if (!action->rewrite.param.chunk) return ENOMEM; action->rewrite.param.index = (dr_icm_pool_get_chunk_icm_addr(action->rewrite.param.chunk) - action->rewrite.dmn->info.caps.hdr_modify_icm_addr) / ACTION_CACHE_LINE_SIZE; ret = dr_send_postsend_action(action->rewrite.dmn, action); if (ret) goto free_chunk; return 0; free_chunk: dr_icm_free_chunk(action->rewrite.param.chunk); return ret; } static void dr_dealloc_modify_hdr_chunk(struct mlx5dv_dr_action *action) { dr_icm_free_chunk(action->rewrite.param.chunk); } int dr_ste_alloc_modify_hdr(struct mlx5dv_dr_action *action) { uint32_t dynamic_chunck_size; dynamic_chunck_size = ilog32(action->rewrite.param.num_of_actions - 1); /* HW modify action index granularity is at least 64B */ dynamic_chunck_size = max_t(uint32_t, dynamic_chunck_size, DR_CHUNK_SIZE_8); if (action->rewrite.dmn->modify_header_ptrn_mngr) return action->rewrite.dmn->ste_ctx->alloc_modify_hdr_chunk(action, dynamic_chunck_size); return dr_ste_alloc_modify_hdr_chunk(action, dynamic_chunck_size); } void dr_ste_free_modify_hdr(struct mlx5dv_dr_action *action) { if (action->rewrite.dmn->modify_header_ptrn_mngr) return action->rewrite.dmn->ste_ctx->dealloc_modify_hdr_chunk(action); return dr_dealloc_modify_hdr_chunk(action); } int dr_ste_alloc_encap(struct mlx5dv_dr_action *action) { struct mlx5dv_dr_domain *dmn = action->reformat.dmn; uint32_t dynamic_chunck_size; int ret; dynamic_chunck_size = 
ilog32((action->reformat.reformat_size - 1) / DR_SW_ENCAP_ENTRY_SIZE); action->reformat.chunk = dr_icm_alloc_chunk(dmn->encap_icm_pool, dynamic_chunck_size); if (!action->reformat.chunk) return errno; action->reformat.index = (dr_icm_pool_get_chunk_icm_addr(action->reformat.chunk) - dmn->info.caps.indirect_encap_icm_base) / DR_SW_ENCAP_ENTRY_SIZE; ret = dr_send_postsend_action(dmn, action); if (ret) goto postsend_err; return 0; postsend_err: dr_icm_free_chunk(action->reformat.chunk); action->reformat.chunk = NULL; action->reformat.index = 0; return ret; } void dr_ste_free_encap(struct mlx5dv_dr_action *action) { dr_icm_free_chunk(action->reformat.chunk); } static int dr_ste_build_pre_check_spec(struct mlx5dv_dr_domain *dmn, struct dr_match_spec *m_spec, struct dr_match_spec *v_spec) { if (m_spec->ip_version) { if (m_spec->ip_version != 4 && m_spec->ip_version != 6) { dr_dbg(dmn, "IP version must be specified v4 or v6\n"); errno = EOPNOTSUPP; return errno; } if (v_spec && (v_spec->ip_version != m_spec->ip_version)) { dr_dbg(dmn, "Mask and value IP version must be equal\n"); errno = EOPNOTSUPP; return errno; } } return 0; } int dr_ste_build_pre_check(struct mlx5dv_dr_domain *dmn, uint8_t match_criteria, struct dr_match_param *mask, struct dr_match_param *value) { int ret; if (match_criteria & DR_MATCHER_CRITERIA_OUTER) { ret = dr_ste_build_pre_check_spec(dmn, &mask->outer, value ? &value->outer : NULL); if (ret) return ret; } if (match_criteria & DR_MATCHER_CRITERIA_INNER) { ret = dr_ste_build_pre_check_spec(dmn, &mask->inner, value ? &value->inner : NULL); if (ret) return ret; } if (!value && (match_criteria & DR_MATCHER_CRITERIA_MISC)) { if (mask->misc.source_port && mask->misc.source_port != 0xffff) { dr_dbg(dmn, "Partial mask source_port is not supported\n"); errno = ENOTSUP; return errno; } } return 0; } int dr_ste_build_ste_arr(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, struct dr_match_param *value, uint8_t *ste_arr) { struct dr_domain_rx_tx *nic_dmn = nic_matcher->nic_tbl->nic_dmn; bool is_rx = nic_dmn->type == DR_DOMAIN_NIC_TYPE_RX; struct mlx5dv_dr_domain *dmn = matcher->tbl->dmn; struct dr_ste_ctx *ste_ctx = dmn->ste_ctx; struct dr_ste_build *sb; int ret, i; ret = dr_ste_build_pre_check(dmn, matcher->match_criteria, &matcher->mask, value); if (ret) return ret; sb = nic_matcher->ste_builder; for (i = 0; i < nic_matcher->num_of_builders; i++) { ste_ctx->ste_init(ste_arr, sb->lu_type, is_rx, dmn->info.caps.gvmi); dr_ste_set_bit_mask(ste_arr, sb); ret = sb->ste_build_tag_func(value, sb, dr_ste_get_tag(ste_arr)); if (ret) return ret; /* Connect the STEs */ if (i < (nic_matcher->num_of_builders - 1)) { /* Need the next builder for these fields, * not relevant for the last ste in the chain. 
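			 * (The control segment of each STE stores the lookup
			 *  type and byte mask of the *next* hash table in the
			 *  chain, so both are taken from the following
			 *  builder.)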
*/ sb++; ste_ctx->set_next_lu_type(ste_arr, sb->lu_type); ste_ctx->set_byte_mask(ste_arr, sb->byte_mask); } ste_arr += DR_STE_SIZE; } return 0; } static void dr_ste_copy_mask_misc(char *mask, struct dr_match_misc *spec, bool clear) { spec->gre_c_present = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, gre_c_present, clear); spec->bth_a = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, bth_a, clear); spec->gre_k_present = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, gre_k_present, clear); spec->gre_s_present = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, gre_s_present, clear); spec->source_vhca_port = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, source_vhca_port, clear); spec->source_sqn = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, source_sqn, clear); spec->source_eswitch_owner_vhca_id = DEVX_GET(dr_match_set_misc, mask, source_eswitch_owner_vhca_id); spec->source_port = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, source_port, clear); spec->outer_second_prio = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, outer_second_prio, clear); spec->outer_second_cfi = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, outer_second_cfi, clear); spec->outer_second_vid = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, outer_second_vid, clear); spec->inner_second_prio = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, inner_second_prio, clear); spec->inner_second_cfi = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, inner_second_cfi, clear); spec->inner_second_vid = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, inner_second_vid, clear); spec->outer_second_cvlan_tag = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, outer_second_cvlan_tag, clear); spec->inner_second_cvlan_tag = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, inner_second_cvlan_tag, clear); spec->outer_second_svlan_tag = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, outer_second_svlan_tag, clear); spec->inner_second_svlan_tag = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, inner_second_svlan_tag, clear); spec->outer_emd_tag = DEVX_GET(dr_match_set_misc, mask, outer_emd_tag); spec->reserved_at_65 = DEVX_GET(dr_match_set_misc, mask, reserved_at_65); spec->gre_protocol = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, gre_protocol, clear); spec->gre_key_h = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, gre_key_h, clear); spec->gre_key_l = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, gre_key_l, clear); spec->vxlan_vni = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, vxlan_vni, clear); spec->bth_opcode = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, bth_opcode, clear); spec->geneve_vni = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, geneve_vni, clear); spec->reserved_at_e4 = DEVX_GET(dr_match_set_misc, mask, reserved_at_e4); spec->geneve_oam = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, geneve_oam, clear); spec->reserved_at_ec = DEVX_GET(dr_match_set_misc, mask, reserved_at_ec); spec->geneve_tlv_option_0_exist = DEVX_GET(dr_match_set_misc, mask, geneve_tlv_option_0_exist); spec->outer_ipv6_flow_label = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, outer_ipv6_flow_label, clear); spec->reserved_at_100 = DEVX_GET(dr_match_set_misc, mask, reserved_at_100); spec->inner_ipv6_flow_label = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, inner_ipv6_flow_label, clear); spec->reserved_at_120 = DEVX_GET(dr_match_set_misc, mask, reserved_at_120); spec->geneve_opt_len = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, geneve_opt_len, clear); spec->geneve_protocol_type = DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, geneve_protocol_type, clear); spec->reserved_at_140 = DEVX_GET(dr_match_set_misc, mask, reserved_at_140); spec->bth_dst_qp = 
DR_DEVX_GET_CLEAR(dr_match_set_misc, mask, bth_dst_qp, clear); spec->inner_esp_spi = DEVX_GET(dr_match_set_misc, mask, inner_esp_spi); spec->outer_esp_spi = DEVX_GET(dr_match_set_misc, mask, outer_esp_spi); spec->reserved_at_1a0 = DEVX_GET(dr_match_set_misc, mask, reserved_at_1a0); spec->reserved_at_1c0 = DEVX_GET(dr_match_set_misc, mask, reserved_at_1c0); spec->reserved_at_1e0 = DEVX_GET(dr_match_set_misc, mask, reserved_at_1e0); } static void dr_ste_copy_mask_spec(char *mask, struct dr_match_spec *spec, bool clear) { spec->smac_47_16 = DR_DEVX_GET_CLEAR(dr_match_spec, mask, smac_47_16, clear); spec->smac_15_0 = DR_DEVX_GET_CLEAR(dr_match_spec, mask, smac_15_0, clear); spec->ethertype = DR_DEVX_GET_CLEAR(dr_match_spec, mask, ethertype, clear); spec->dmac_47_16 = DR_DEVX_GET_CLEAR(dr_match_spec, mask, dmac_47_16, clear); spec->dmac_15_0 = DR_DEVX_GET_CLEAR(dr_match_spec, mask, dmac_15_0, clear); spec->first_prio = DR_DEVX_GET_CLEAR(dr_match_spec, mask, first_prio, clear); spec->first_cfi = DR_DEVX_GET_CLEAR(dr_match_spec, mask, first_cfi, clear); spec->first_vid = DR_DEVX_GET_CLEAR(dr_match_spec, mask, first_vid, clear); spec->ip_protocol = DR_DEVX_GET_CLEAR(dr_match_spec, mask, ip_protocol, clear); spec->ip_dscp = DR_DEVX_GET_CLEAR(dr_match_spec, mask, ip_dscp, clear); spec->ip_ecn = DR_DEVX_GET_CLEAR(dr_match_spec, mask, ip_ecn, clear); spec->cvlan_tag = DR_DEVX_GET_CLEAR(dr_match_spec, mask, cvlan_tag, clear); spec->svlan_tag = DR_DEVX_GET_CLEAR(dr_match_spec, mask, svlan_tag, clear); spec->frag = DR_DEVX_GET_CLEAR(dr_match_spec, mask, frag, clear); spec->ip_version = DR_DEVX_GET_CLEAR(dr_match_spec, mask, ip_version, clear); spec->tcp_flags = DR_DEVX_GET_CLEAR(dr_match_spec, mask, tcp_flags, clear); spec->tcp_sport = DR_DEVX_GET_CLEAR(dr_match_spec, mask, tcp_sport, clear); spec->tcp_dport = DR_DEVX_GET_CLEAR(dr_match_spec, mask, tcp_dport, clear); spec->reserved_at_c0 = DEVX_GET(dr_match_spec, mask, reserved_at_c0); spec->ipv4_ihl = DR_DEVX_GET_CLEAR(dr_match_spec, mask, ipv4_ihl, clear); spec->l3_ok = DR_DEVX_GET_CLEAR(dr_match_spec, mask, l3_ok, clear); spec->l4_ok = DR_DEVX_GET_CLEAR(dr_match_spec, mask, l4_ok, clear); spec->ipv4_checksum_ok = DR_DEVX_GET_CLEAR(dr_match_spec, mask, ipv4_checksum_ok, clear); spec->l4_checksum_ok = DR_DEVX_GET_CLEAR(dr_match_spec, mask, l4_checksum_ok, clear); spec->ip_ttl_hoplimit = DR_DEVX_GET_CLEAR(dr_match_spec, mask, ip_ttl_hoplimit, clear); spec->udp_sport = DR_DEVX_GET_CLEAR(dr_match_spec, mask, udp_sport, clear); spec->udp_dport = DR_DEVX_GET_CLEAR(dr_match_spec, mask, udp_dport, clear); spec->src_ip_127_96 = DR_DEVX_GET_CLEAR(dr_match_spec, mask, src_ip_127_96, clear); spec->src_ip_95_64 = DR_DEVX_GET_CLEAR(dr_match_spec, mask, src_ip_95_64, clear); spec->src_ip_63_32 = DR_DEVX_GET_CLEAR(dr_match_spec, mask, src_ip_63_32, clear); spec->src_ip_31_0 = DR_DEVX_GET_CLEAR(dr_match_spec, mask, src_ip_31_0, clear); spec->dst_ip_127_96 = DR_DEVX_GET_CLEAR(dr_match_spec, mask, dst_ip_127_96, clear); spec->dst_ip_95_64 = DR_DEVX_GET_CLEAR(dr_match_spec, mask, dst_ip_95_64, clear); spec->dst_ip_63_32 = DR_DEVX_GET_CLEAR(dr_match_spec, mask, dst_ip_63_32, clear); spec->dst_ip_31_0 = DR_DEVX_GET_CLEAR(dr_match_spec, mask, dst_ip_31_0, clear); } static void dr_ste_copy_mask_misc2(char *mask, struct dr_match_misc2 *spec, bool clear) { spec->outer_first_mpls_label = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, outer_first_mpls_label, clear); spec->outer_first_mpls_exp = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, outer_first_mpls_exp, clear); 
spec->outer_first_mpls_s_bos = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, outer_first_mpls_s_bos, clear); spec->outer_first_mpls_ttl = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, outer_first_mpls_ttl, clear); spec->inner_first_mpls_label = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, inner_first_mpls_label, clear); spec->inner_first_mpls_exp = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, inner_first_mpls_exp, clear); spec->inner_first_mpls_s_bos = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, inner_first_mpls_s_bos, clear); spec->inner_first_mpls_ttl = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, inner_first_mpls_ttl, clear); spec->outer_first_mpls_over_gre_label = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, outer_first_mpls_over_gre_label, clear); spec->outer_first_mpls_over_gre_exp = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, outer_first_mpls_over_gre_exp, clear); spec->outer_first_mpls_over_gre_s_bos = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, outer_first_mpls_over_gre_s_bos, clear); spec->outer_first_mpls_over_gre_ttl = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, outer_first_mpls_over_gre_ttl, clear); spec->outer_first_mpls_over_udp_label = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, outer_first_mpls_over_udp_label, clear); spec->outer_first_mpls_over_udp_exp = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, outer_first_mpls_over_udp_exp, clear); spec->outer_first_mpls_over_udp_s_bos = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, outer_first_mpls_over_udp_s_bos, clear); spec->outer_first_mpls_over_udp_ttl = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, outer_first_mpls_over_udp_ttl, clear); spec->metadata_reg_c_7 = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, metadata_reg_c_7, clear); spec->metadata_reg_c_6 = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, metadata_reg_c_6, clear); spec->metadata_reg_c_5 = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, metadata_reg_c_5, clear); spec->metadata_reg_c_4 = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, metadata_reg_c_4, clear); spec->metadata_reg_c_3 = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, metadata_reg_c_3, clear); spec->metadata_reg_c_2 = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, metadata_reg_c_2, clear); spec->metadata_reg_c_1 = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, metadata_reg_c_1, clear); spec->metadata_reg_c_0 = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, metadata_reg_c_0, clear); spec->metadata_reg_a = DR_DEVX_GET_CLEAR(dr_match_set_misc2, mask, metadata_reg_a, clear); spec->reserved_at_1a0 = DEVX_GET(dr_match_set_misc2, mask, reserved_at_1a0); spec->reserved_at_1c0 = DEVX_GET(dr_match_set_misc2, mask, reserved_at_1c0); spec->reserved_at_1e0 = DEVX_GET(dr_match_set_misc2, mask, reserved_at_1e0); } static void dr_ste_copy_mask_misc3(char *mask, struct dr_match_misc3 *spec, bool clear) { spec->inner_tcp_seq_num = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, inner_tcp_seq_num, clear); spec->outer_tcp_seq_num = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, outer_tcp_seq_num, clear); spec->inner_tcp_ack_num = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, inner_tcp_ack_num, clear); spec->outer_tcp_ack_num = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, outer_tcp_ack_num, clear); spec->reserved_at_80 = DEVX_GET(dr_match_set_misc3, mask, reserved_at_80); spec->outer_vxlan_gpe_vni = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, outer_vxlan_gpe_vni, clear); spec->outer_vxlan_gpe_next_protocol = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, outer_vxlan_gpe_next_protocol, clear); spec->outer_vxlan_gpe_flags = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, 
outer_vxlan_gpe_flags, clear); spec->reserved_at_b0 = DEVX_GET(dr_match_set_misc3, mask, reserved_at_b0); spec->icmpv4_header_data = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, icmp_header_data, clear); spec->icmpv6_header_data = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, icmpv6_header_data, clear); spec->icmpv4_type = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, icmp_type, clear); spec->icmpv4_code = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, icmp_code, clear); spec->icmpv6_type = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, icmpv6_type, clear); spec->icmpv6_code = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, icmpv6_code, clear); spec->geneve_tlv_option_0_data = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, geneve_tlv_option_0_data, clear); spec->gtpu_teid = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, gtpu_teid, clear); spec->gtpu_msg_type = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, gtpu_msg_type, clear); spec->gtpu_msg_flags = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, gtpu_msg_flags, clear); spec->reserved_at_170 = DEVX_GET(dr_match_set_misc3, mask, reserved_at_170); spec->gtpu_dw_2 = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, gtpu_dw_2, clear); spec->gtpu_first_ext_dw_0 = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, gtpu_first_ext_dw_0, clear); spec->gtpu_dw_0 = DR_DEVX_GET_CLEAR(dr_match_set_misc3, mask, gtpu_dw_0, clear); spec->reserved_at_1e0 = DEVX_GET(dr_match_set_misc3, mask, reserved_at_1e0); } static void dr_ste_copy_mask_misc4(char *mask, struct dr_match_misc4 *spec, bool clear) { spec->prog_sample_field_id_0 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_id_0, clear); spec->prog_sample_field_value_0 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_value_0, clear); spec->prog_sample_field_id_1 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_id_1, clear); spec->prog_sample_field_value_1 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_value_1, clear); spec->prog_sample_field_id_2 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_id_2, clear); spec->prog_sample_field_value_2 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_value_2, clear); spec->prog_sample_field_id_3 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_id_3, clear); spec->prog_sample_field_value_3 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_value_3, clear); spec->prog_sample_field_id_4 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_id_4, clear); spec->prog_sample_field_value_4 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_value_4, clear); spec->prog_sample_field_id_5 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_id_5, clear); spec->prog_sample_field_value_5 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_value_5, clear); spec->prog_sample_field_id_6 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_id_6, clear); spec->prog_sample_field_value_6 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_value_6, clear); spec->prog_sample_field_id_7 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_id_7, clear); spec->prog_sample_field_value_7 = DR_DEVX_GET_CLEAR(dr_match_set_misc4, mask, prog_sample_field_value_7, clear); } static void dr_ste_copy_mask_misc5(char *mask, struct dr_match_misc5 *spec, bool clear) { spec->macsec_tag_0 = DR_DEVX_GET_CLEAR(dr_match_set_misc5, mask, macsec_tag_0, clear); spec->macsec_tag_1 = DR_DEVX_GET_CLEAR(dr_match_set_misc5, mask, macsec_tag_1, clear); 
spec->macsec_tag_2 = DR_DEVX_GET_CLEAR(dr_match_set_misc5, mask, macsec_tag_2, clear); spec->macsec_tag_3 = DR_DEVX_GET_CLEAR(dr_match_set_misc5, mask, macsec_tag_3, clear); spec->tunnel_header_0 = DR_DEVX_GET_CLEAR(dr_match_set_misc5, mask, tunnel_header_0, clear); spec->tunnel_header_1 = DR_DEVX_GET_CLEAR(dr_match_set_misc5, mask, tunnel_header_1, clear); spec->tunnel_header_2 = DR_DEVX_GET_CLEAR(dr_match_set_misc5, mask, tunnel_header_2, clear); spec->tunnel_header_3 = DR_DEVX_GET_CLEAR(dr_match_set_misc5, mask, tunnel_header_3, clear); spec->reserved_at_100 = DEVX_GET(dr_match_set_misc5, mask, reserved_at_100); spec->reserved_at_120 = DEVX_GET(dr_match_set_misc5, mask, reserved_at_120); spec->reserved_at_140 = DEVX_GET(dr_match_set_misc5, mask, reserved_at_140); spec->reserved_at_160 = DEVX_GET(dr_match_set_misc5, mask, reserved_at_160); spec->reserved_at_180 = DEVX_GET(dr_match_set_misc5, mask, reserved_at_180); spec->reserved_at_1a0 = DEVX_GET(dr_match_set_misc5, mask, reserved_at_1a0); spec->reserved_at_1c0 = DEVX_GET(dr_match_set_misc5, mask, reserved_at_1c0); spec->reserved_at_1e0 = DEVX_GET(dr_match_set_misc5, mask, reserved_at_1e0); } void dr_ste_copy_param(uint8_t match_criteria, struct dr_match_param *set_param, uint64_t *mask_buf, size_t mask_sz, bool clear) { uint8_t *data = (uint8_t *)mask_buf; char tail_param[DEVX_ST_SZ_BYTES(dr_match_param)] = {}; size_t param_location; void *buff; if (match_criteria & DR_MATCHER_CRITERIA_OUTER) { if (mask_sz < DEVX_ST_SZ_BYTES(dr_match_spec)) { memcpy(tail_param, data, mask_sz); buff = tail_param; } else { buff = mask_buf; } dr_ste_copy_mask_spec(buff, &set_param->outer, clear); } param_location = DEVX_ST_SZ_BYTES(dr_match_spec); if (match_criteria & DR_MATCHER_CRITERIA_MISC) { if (mask_sz < param_location + DEVX_ST_SZ_BYTES(dr_match_set_misc)) { memcpy(tail_param, data + param_location, mask_sz - param_location); buff = tail_param; } else { buff = data + param_location; } dr_ste_copy_mask_misc(buff, &set_param->misc, clear); } param_location += DEVX_ST_SZ_BYTES(dr_match_set_misc); if (match_criteria & DR_MATCHER_CRITERIA_INNER) { if (mask_sz < param_location + DEVX_ST_SZ_BYTES(dr_match_spec)) { memcpy(tail_param, data + param_location, mask_sz - param_location); buff = tail_param; } else { buff = data + param_location; } dr_ste_copy_mask_spec(buff, &set_param->inner, clear); } param_location += DEVX_ST_SZ_BYTES(dr_match_spec); if (match_criteria & DR_MATCHER_CRITERIA_MISC2) { if (mask_sz < param_location + DEVX_ST_SZ_BYTES(dr_match_set_misc2)) { memcpy(tail_param, data + param_location, mask_sz - param_location); buff = tail_param; } else { buff = data + param_location; } dr_ste_copy_mask_misc2(buff, &set_param->misc2, clear); } param_location += DEVX_ST_SZ_BYTES(dr_match_set_misc2); if (match_criteria & DR_MATCHER_CRITERIA_MISC3) { if (mask_sz < param_location + DEVX_ST_SZ_BYTES(dr_match_set_misc3)) { memcpy(tail_param, data + param_location, mask_sz - param_location); buff = tail_param; } else { buff = data + param_location; } dr_ste_copy_mask_misc3(buff, &set_param->misc3, clear); } param_location += DEVX_ST_SZ_BYTES(dr_match_set_misc3); if (match_criteria & DR_MATCHER_CRITERIA_MISC4) { if (mask_sz < param_location + DEVX_ST_SZ_BYTES(dr_match_set_misc4)) { memcpy(tail_param, data + param_location, mask_sz - param_location); buff = tail_param; } else { buff = data + param_location; } dr_ste_copy_mask_misc4(buff, &set_param->misc4, clear); } param_location += DEVX_ST_SZ_BYTES(dr_match_set_misc4); if (match_criteria & 
DR_MATCHER_CRITERIA_MISC5) { if (mask_sz < param_location + DEVX_ST_SZ_BYTES(dr_match_set_misc5)) { memcpy(tail_param, data + param_location, mask_sz - param_location); buff = tail_param; } else { buff = data + param_location; } dr_ste_copy_mask_misc5(buff, &set_param->misc5, clear); } } void dr_ste_build_eth_l2_src_dst(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_eth_l2_src_dst_init(sb, mask); } void dr_ste_build_eth_l3_ipv6_dst(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_eth_l3_ipv6_dst_init(sb, mask); } void dr_ste_build_eth_l3_ipv6_src(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_eth_l3_ipv6_src_init(sb, mask); } void dr_ste_build_eth_l3_ipv4_5_tuple(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_eth_l3_ipv4_5_tuple_init(sb, mask); } void dr_ste_build_eth_l2_src(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_eth_l2_src_init(sb, mask); } void dr_ste_build_eth_l2_dst(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_eth_l2_dst_init(sb, mask); } void dr_ste_build_eth_l2_tnl(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_eth_l2_tnl_init(sb, mask); } void dr_ste_build_eth_l3_ipv4_misc(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_eth_l3_ipv4_misc_init(sb, mask); } void dr_ste_build_eth_ipv6_l3_l4(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_eth_ipv6_l3_l4_init(sb, mask); } static int dr_ste_build_empty_always_hit_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { return 0; } void dr_ste_build_empty_always_hit(struct dr_ste_build *sb, bool rx) { sb->rx = rx; sb->lu_type = DR_STE_LU_TYPE_DONT_CARE; sb->byte_mask = 0; sb->ste_build_tag_func = &dr_ste_build_empty_always_hit_tag; } void dr_ste_build_mpls(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_mpls_init(sb, mask); } void dr_ste_build_tnl_gre(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_tnl_gre_init(sb, mask); } void dr_ste_build_tnl_mpls_over_gre(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; sb->caps = caps; ste_ctx->build_tnl_mpls_over_gre_init(sb, mask); } void dr_ste_build_tnl_mpls_over_udp(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; sb->caps = caps; ste_ctx->build_tnl_mpls_over_udp_init(sb, mask); } void 
dr_ste_build_tnl_geneve_tlv_opt_exist(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx) { if (!ste_ctx->build_tnl_geneve_tlv_opt_exist_init) return; sb->rx = rx; sb->inner = inner; sb->caps = caps; ste_ctx->build_tnl_geneve_tlv_opt_exist_init(sb, mask); } void dr_ste_build_icmp(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx) { sb->rx = rx; sb->caps = caps; sb->inner = inner; ste_ctx->build_icmp_init(sb, mask); } void dr_ste_build_general_purpose(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_general_purpose_init(sb, mask); } void dr_ste_build_eth_l4_misc(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_eth_l4_misc_init(sb, mask); } void dr_ste_build_tnl_vxlan_gpe(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_tnl_vxlan_gpe_init(sb, mask); } void dr_ste_build_tnl_geneve(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_tnl_geneve_init(sb, mask); } void dr_ste_build_tnl_geneve_tlv_opt(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx) { sb->rx = rx; sb->caps = caps; sb->inner = inner; ste_ctx->build_tnl_geneve_tlv_opt_init(sb, mask); } void dr_ste_build_tnl_gtpu(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_tnl_gtpu_init(sb, mask); } void dr_ste_build_tnl_gtpu_flex_parser_0(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx) { sb->rx = rx; sb->caps = caps; sb->inner = inner; ste_ctx->build_tnl_gtpu_flex_parser_0(sb, mask); } void dr_ste_build_tnl_gtpu_flex_parser_1(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx) { sb->rx = rx; sb->caps = caps; sb->inner = inner; ste_ctx->build_tnl_gtpu_flex_parser_1(sb, mask); } void dr_ste_build_register_0(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_register_0_init(sb, mask); } void dr_ste_build_register_1(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_register_1_init(sb, mask); } void dr_ste_build_src_gvmi_qpn(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx) { sb->rx = rx; sb->caps = caps; sb->inner = inner; ste_ctx->build_src_gvmi_qpn_init(sb, mask); } void dr_ste_build_flex_parser_0(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; ste_ctx->build_flex_parser_0_init(sb, mask); } void dr_ste_build_flex_parser_1(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; 
ste_ctx->build_flex_parser_1_init(sb, mask); } void dr_ste_build_tunnel_header(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx) { sb->rx = rx; sb->inner = inner; sb->caps = caps; ste_ctx->build_tunnel_header_init(sb, mask); } void dr_ste_build_ib_l4(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { if (!ste_ctx->build_ib_l4_init) return; sb->rx = rx; sb->inner = inner; ste_ctx->build_ib_l4_init(sb, mask); } int dr_ste_build_def0(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx) { if (!ste_ctx->build_def0_init) { errno = ENOTSUP; return errno; } sb->rx = rx; sb->caps = caps; sb->inner = inner; sb->format_id = DR_MATCHER_DEFINER_0; ste_ctx->build_def0_init(sb, mask); return 0; } int dr_ste_build_def2(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx) { if (!ste_ctx->build_def2_init) { errno = ENOTSUP; return errno; } sb->rx = rx; sb->caps = caps; sb->inner = inner; sb->format_id = DR_MATCHER_DEFINER_2; ste_ctx->build_def2_init(sb, mask); return 0; } int dr_ste_build_def6(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { if (!ste_ctx->build_def6_init) { errno = ENOTSUP; return errno; } sb->rx = rx; sb->inner = inner; sb->format_id = DR_MATCHER_DEFINER_6; ste_ctx->build_def6_init(sb, mask); return 0; } int dr_ste_build_def16(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx) { if (!ste_ctx->build_def16_init) { errno = ENOTSUP; return errno; } sb->rx = rx; sb->caps = caps; sb->inner = inner; sb->format_id = DR_MATCHER_DEFINER_16; ste_ctx->build_def16_init(sb, mask); return 0; } int dr_ste_build_def22(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { if (!ste_ctx->build_def22_init) { errno = ENOTSUP; return errno; } sb->rx = rx; sb->inner = inner; sb->format_id = DR_MATCHER_DEFINER_22; ste_ctx->build_def22_init(sb, mask); return 0; } int dr_ste_build_def24(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { if (!ste_ctx->build_def24_init) { errno = ENOTSUP; return errno; } sb->rx = rx; sb->inner = inner; sb->format_id = DR_MATCHER_DEFINER_24; ste_ctx->build_def24_init(sb, mask); return 0; } int dr_ste_build_def25(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { if (!ste_ctx->build_def25_init) { errno = ENOTSUP; return errno; } sb->rx = rx; sb->inner = inner; sb->format_id = DR_MATCHER_DEFINER_25; ste_ctx->build_def25_init(sb, mask); return 0; } int dr_ste_build_def26(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { if (!ste_ctx->build_def26_init) { errno = ENOTSUP; return errno; } sb->rx = rx; sb->inner = inner; sb->format_id = DR_MATCHER_DEFINER_26; ste_ctx->build_def26_init(sb, mask); return 0; } int dr_ste_build_def28(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { if (!ste_ctx->build_def28_init) { errno = ENOTSUP; return errno; } sb->rx = rx; sb->inner = inner; sb->format_id = DR_MATCHER_DEFINER_28; ste_ctx->build_def28_init(sb, mask); return 0; } int dr_ste_build_def33(struct dr_ste_ctx *ste_ctx, 
struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx) { if (!ste_ctx->build_def33_init) { errno = ENOTSUP; return errno; } sb->rx = rx; sb->inner = inner; sb->format_id = DR_MATCHER_DEFINER_33; ste_ctx->build_def33_init(sb, mask); return 0; } struct dr_ste_ctx *dr_ste_get_ctx(uint8_t version) { if (version == MLX5_HW_CONNECTX_5) return dr_ste_get_ctx_v0(); else if (version == MLX5_HW_CONNECTX_6DX) return dr_ste_get_ctx_v1(); else if (version == MLX5_HW_CONNECTX_7) return dr_ste_get_ctx_v2(); else if (version == MLX5_HW_CONNECTX_8) return dr_ste_get_ctx_v3(); errno = EOPNOTSUPP; return NULL; } int mlx5dv_dr_aso_other_domain_link(struct mlx5dv_devx_obj *devx_obj, struct mlx5dv_dr_domain *peer_dmn, struct mlx5dv_dr_domain *dmn, uint32_t flags, uint8_t return_reg_c) { struct dr_ste_ctx *ste_ctx = dmn->ste_ctx; if (devx_obj->type != MLX5_DEVX_ASO_CT) goto out; if (ste_ctx->aso_other_domain_link) return ste_ctx->aso_other_domain_link(devx_obj, peer_dmn, dmn, flags, return_reg_c); out: errno = EOPNOTSUPP; return errno; } int mlx5dv_dr_aso_other_domain_unlink(struct mlx5dv_devx_obj *devx_obj, struct mlx5dv_dr_domain *dmn) { struct dr_ste_ctx *ste_ctx = dmn->ste_ctx; if (ste_ctx->aso_other_domain_unlink) return ste_ctx->aso_other_domain_unlink(devx_obj); errno = EOPNOTSUPP; return errno; } rdma-core-56.1/providers/mlx5/dr_ste.h000066400000000000000000000255621477342711600176770ustar00rootroot00000000000000/* * Copyright (c) 2020, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef _DR_STE_ #define _DR_STE_ #include #include "mlx5dv_dr.h" #define IPV4_ETHERTYPE 0x0800 #define IPV6_ETHERTYPE 0x86DD #define STE_IPV4 0x1 #define STE_IPV6 0x2 #define STE_TCP 0x1 #define STE_UDP 0x2 #define STE_SPI 0x3 #define IP_VERSION_IPV4 0x4 #define IP_VERSION_IPV6 0x6 #define IP_PROTOCOL_UDP 0x11 #define IP_PROTOCOL_TCP 0x06 #define IP_PROTOCOL_IPSEC 0x33 #define HDR_LEN_L2_MACS 0xC #define HDR_LEN_L2_VLAN 0x4 #define HDR_LEN_L2_ETHER 0x2 #define HDR_LEN_L2 (HDR_LEN_L2_MACS + HDR_LEN_L2_ETHER) #define HDR_LEN_L2_W_VLAN (HDR_LEN_L2 + HDR_LEN_L2_VLAN) enum { HDR_MPLS_OFFSET_LABEL = 12, HDR_MPLS_OFFSET_EXP = 9, HDR_MPLS_OFFSET_S_BOS = 8, HDR_MPLS_OFFSET_TTL = 0, }; #define DR_DEVX_GET_CLEAR(typ, p, fld, clear) ({ \ uint32_t ___t = DEVX_GET(typ, p, fld); \ if (clear) \ DEVX_SET(typ, p, fld, 0); \ ___t; \ }) /* Read from layout struct */ #define DR_STE_GET(typ, p, fld) DEVX_GET(ste_##typ, p, fld) /* Write to layout a value */ #define DR_STE_SET(typ, p, fld, v) DEVX_SET(ste_##typ, p, fld, v) #define DR_STE_SET_BOOL(typ, p, fld, v) DEVX_SET(ste_##typ, p, fld, !!(v)) /* Set to STE a specific value using DR_STE_SET */ #define DR_STE_SET_VAL(lookup_type, tag, t_fname, spec, s_fname, value) do { \ if ((spec)->s_fname) { \ DR_STE_SET(lookup_type, tag, t_fname, value); \ (spec)->s_fname = 0; \ } \ } while (0) /* Set to STE spec->s_fname to tag->t_fname set spec->s_fname as used */ #define DR_STE_SET_TAG(lookup_type, tag, t_fname, spec, s_fname) \ DR_STE_SET_VAL(lookup_type, tag, t_fname, spec, s_fname, (spec)->s_fname) /* Set to STE -1 to tag->t_fname and set spec->s_fname as used */ #define DR_STE_SET_ONES(lookup_type, tag, t_fname, spec, s_fname) \ DR_STE_SET_VAL(lookup_type, tag, t_fname, spec, s_fname, -1) #define DR_STE_SET_TCP_FLAGS(lookup_type, tag, spec) do { \ DR_STE_SET_BOOL(lookup_type, tag, tcp_ns, (spec)->tcp_flags & (1 << 8)); \ DR_STE_SET_BOOL(lookup_type, tag, tcp_cwr, (spec)->tcp_flags & (1 << 7)); \ DR_STE_SET_BOOL(lookup_type, tag, tcp_ece, (spec)->tcp_flags & (1 << 6)); \ DR_STE_SET_BOOL(lookup_type, tag, tcp_urg, (spec)->tcp_flags & (1 << 5)); \ DR_STE_SET_BOOL(lookup_type, tag, tcp_ack, (spec)->tcp_flags & (1 << 4)); \ DR_STE_SET_BOOL(lookup_type, tag, tcp_psh, (spec)->tcp_flags & (1 << 3)); \ DR_STE_SET_BOOL(lookup_type, tag, tcp_rst, (spec)->tcp_flags & (1 << 2)); \ DR_STE_SET_BOOL(lookup_type, tag, tcp_syn, (spec)->tcp_flags & (1 << 1)); \ DR_STE_SET_BOOL(lookup_type, tag, tcp_fin, (spec)->tcp_flags & (1 << 0)); \ } while (0) #define DR_STE_SET_MPLS(lookup_type, mask, in_out, tag) do { \ DR_STE_SET_TAG(lookup_type, tag, mpls0_label, mask, \ in_out##_first_mpls_label);\ DR_STE_SET_TAG(lookup_type, tag, mpls0_s_bos, mask, \ in_out##_first_mpls_s_bos); \ DR_STE_SET_TAG(lookup_type, tag, mpls0_exp, mask, \ in_out##_first_mpls_exp); \ DR_STE_SET_TAG(lookup_type, tag, mpls0_ttl, mask, \ in_out##_first_mpls_ttl); \ } while (0) #define DR_STE_SET_FLEX_PARSER_FIELD(tag, fname, caps, spec) do { \ if ((spec)->fname) { \ uint8_t parser_id = caps->flex_parser_id_##fname; \ uint8_t *parser_ptr = dr_ste_calc_flex_parser_offset(tag, parser_id); \ *(__be32 *)parser_ptr = htobe32((spec)->fname);\ (spec)->fname = 0; \ } \ } while (0) enum dr_ste_action_modify_flags { DR_STE_ACTION_MODIFY_FLAG_REQ_FLEX = 1 << 0, }; enum dr_ste_action_modify_type_l3 { DR_STE_ACTION_MDFY_TYPE_L3_NONE = 0x0, DR_STE_ACTION_MDFY_TYPE_L3_IPV4 = 0x1, DR_STE_ACTION_MDFY_TYPE_L3_IPV6 = 0x2, }; enum dr_ste_action_modify_type_l4 { DR_STE_ACTION_MDFY_TYPE_L4_NONE = 0x0, DR_STE_ACTION_MDFY_TYPE_L4_TCP = 0x1, 
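/* A worked expansion of the tag-building macros above (illustrative):
 * DR_STE_SET_TAG(eth_l4, tag, dst_port, spec, tcp_dport) becomes
 *
 *	if ((spec)->tcp_dport) {
 *		DR_STE_SET(eth_l4, tag, dst_port, (spec)->tcp_dport);
 *		(spec)->tcp_dport = 0;
 *	}
 *
 * i.e. the value is copied into the STE tag and then cleared in the
 * match spec to mark the field as consumed.
 */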
DR_STE_ACTION_MDFY_TYPE_L4_UDP = 0x2, }; uint16_t dr_ste_conv_bit_to_byte_mask(uint8_t *bit_mask); static inline uint8_t * dr_ste_calc_flex_parser_offset(uint8_t *tag, uint8_t parser_id) { /* Calculate tag byte offset based on flex parser id */ return tag + 4 * (3 - (parser_id % 4)); } typedef void (*dr_ste_builder_void_init)(struct dr_ste_build *sb, struct dr_match_param *mask); struct dr_ste_ctx { /* Builders */ dr_ste_builder_void_init build_eth_l2_src_dst_init; dr_ste_builder_void_init build_eth_l3_ipv6_src_init; dr_ste_builder_void_init build_eth_l3_ipv6_dst_init; dr_ste_builder_void_init build_eth_l3_ipv4_5_tuple_init; dr_ste_builder_void_init build_eth_l2_src_init; dr_ste_builder_void_init build_eth_l2_dst_init; dr_ste_builder_void_init build_eth_l2_tnl_init; dr_ste_builder_void_init build_eth_l3_ipv4_misc_init; dr_ste_builder_void_init build_eth_ipv6_l3_l4_init; dr_ste_builder_void_init build_mpls_init; dr_ste_builder_void_init build_tnl_gre_init; dr_ste_builder_void_init build_tnl_mpls_over_gre_init; dr_ste_builder_void_init build_tnl_mpls_over_udp_init; dr_ste_builder_void_init build_icmp_init; dr_ste_builder_void_init build_general_purpose_init; dr_ste_builder_void_init build_eth_l4_misc_init; dr_ste_builder_void_init build_tnl_vxlan_gpe_init; dr_ste_builder_void_init build_tnl_geneve_init; dr_ste_builder_void_init build_tnl_geneve_tlv_opt_init; dr_ste_builder_void_init build_tnl_geneve_tlv_opt_exist_init; dr_ste_builder_void_init build_tnl_gtpu_init; dr_ste_builder_void_init build_tnl_gtpu_flex_parser_0; dr_ste_builder_void_init build_tnl_gtpu_flex_parser_1; dr_ste_builder_void_init build_register_0_init; dr_ste_builder_void_init build_register_1_init; dr_ste_builder_void_init build_src_gvmi_qpn_init; dr_ste_builder_void_init build_flex_parser_0_init; dr_ste_builder_void_init build_flex_parser_1_init; dr_ste_builder_void_init build_tunnel_header_init; dr_ste_builder_void_init build_ib_l4_init; dr_ste_builder_void_init build_def0_init; dr_ste_builder_void_init build_def2_init; dr_ste_builder_void_init build_def6_init; dr_ste_builder_void_init build_def16_init; dr_ste_builder_void_init build_def22_init; dr_ste_builder_void_init build_def24_init; dr_ste_builder_void_init build_def25_init; dr_ste_builder_void_init build_def26_init; dr_ste_builder_void_init build_def28_init; dr_ste_builder_void_init build_def33_init; int (*aso_other_domain_link)(struct mlx5dv_devx_obj *devx_obj, struct mlx5dv_dr_domain *peer_dmn, struct mlx5dv_dr_domain *dmn, uint32_t flags, uint8_t return_reg_c); int (*aso_other_domain_unlink)(struct mlx5dv_devx_obj *devx_obj); /* Getters and Setters */ void (*ste_init)(uint8_t *hw_ste_p, uint16_t lu_type, bool is_rx, uint16_t gvmi); void (*set_next_lu_type)(uint8_t *hw_ste_p, uint16_t lu_type); uint16_t (*get_next_lu_type)(uint8_t *hw_ste_p); void (*set_miss_addr)(uint8_t *hw_ste_p, uint64_t miss_addr); uint64_t (*get_miss_addr)(uint8_t *hw_ste_p); void (*set_hit_addr)(uint8_t *hw_ste_p, uint64_t icm_addr, uint32_t ht_size); void (*set_byte_mask)(uint8_t *hw_ste_p, uint16_t byte_mask); uint16_t (*get_byte_mask)(uint8_t *hw_ste_p); void (*set_ctrl_always_hit_htbl)(uint8_t *hw_ste, uint16_t byte_mask, uint16_t lu_type, uint64_t icm_addr, uint32_t num_of_entries, uint16_t gvmi); void (*set_ctrl_always_miss)(uint8_t *hw_ste, uint64_t miss_addr, uint16_t gvmi); void (*set_hit_gvmi)(uint8_t *hw_ste, uint16_t gvmi); /* Actions */ uint32_t actions_caps; const struct dr_ste_action_modify_field *action_modify_field_arr; size_t action_modify_field_arr_size; void 
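/* Worked example for dr_ste_calc_flex_parser_offset() above: flex
 * parser dwords are packed from the end of the tag, so
 *
 *	parser_id 0 -> tag + 12
 *	parser_id 1 -> tag + 8
 *	parser_id 2 -> tag + 4
 *	parser_id 3 -> tag + 0
 *
 * and ids 4-7 wrap to the same offsets via parser_id % 4 (they live
 * in the second flex parser lookup).
 */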
(*set_actions_rx)(struct dr_ste_ctx *ste_ctx, uint8_t *action_type_set, uint32_t actions_caps, uint8_t *hw_ste_arr, struct dr_ste_actions_attr *attr, uint32_t *added_stes); void (*set_actions_tx)(struct dr_ste_ctx *ste_ctx, uint8_t *action_type_set, uint32_t actions_caps, uint8_t *hw_ste_arr, struct dr_ste_actions_attr *attr, uint32_t *added_stes); void (*set_action_set)(uint8_t *hw_action, uint8_t hw_field, uint8_t shifter, uint8_t length, uint32_t data); void (*set_action_add)(uint8_t *hw_action, uint8_t hw_field, uint8_t shifter, uint8_t length, uint32_t data); void (*set_action_copy)(uint8_t *hw_action, uint8_t dst_hw_field, uint8_t dst_shifter, uint8_t dst_len, uint8_t src_hw_field, uint8_t src_shifter); const struct dr_ste_action_modify_field * (*get_action_hw_field)(struct dr_ste_ctx *ste_ctx, uint16_t sw_field, struct dr_devx_caps *caps); int (*set_action_decap_l3_list)(void *data, uint32_t data_sz, uint8_t *hw_action, uint32_t hw_action_sz, uint16_t *used_hw_action_num); void (*set_aso_ct_cross_dmn)(uint8_t *hw_ste, uint32_t object_id, uint32_t offset, uint8_t dest_reg_id, bool direction); int (*alloc_modify_hdr_chunk)(struct mlx5dv_dr_action *action, uint32_t chunck_size); void (*dealloc_modify_hdr_chunk)(struct mlx5dv_dr_action *action); /* Actions bit set */ void (*set_encap)(uint8_t *hw_ste_p, uint8_t *d_action, uint32_t reformat_id, int size); void (*set_push_vlan)(uint8_t *ste, uint8_t *d_action, uint32_t vlan_hdr); void (*set_pop_vlan)(uint8_t *hw_ste_p, uint8_t *s_action, uint8_t vlans_num); void (*set_rx_decap)(uint8_t *hw_ste_p, uint8_t *s_action); void (*set_encap_l3)(uint8_t *hw_ste_p, uint8_t *frst_s_action, uint8_t *scnd_d_action, uint32_t reformat_id, int size); /* Send */ void (*prepare_for_postsend)(uint8_t *hw_ste_p, uint32_t ste_size); }; struct dr_ste_ctx *dr_ste_get_ctx_v0(void); struct dr_ste_ctx *dr_ste_get_ctx_v1(void); struct dr_ste_ctx *dr_ste_get_ctx_v2(void); struct dr_ste_ctx *dr_ste_get_ctx_v3(void); #endif rdma-core-56.1/providers/mlx5/dr_ste_v0.c000066400000000000000000002053661477342711600203010ustar00rootroot00000000000000/* * Copyright (c) 2020, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include "dr_ste.h" #define DR_STE_ENABLE_FLOW_TAG (1 << 31) enum dr_ste_v0_entry_type { DR_STE_TYPE_TX = 1, DR_STE_TYPE_RX = 2, DR_STE_TYPE_MODIFY_PKT = 6, }; enum dr_ste_v0_action_tunl { DR_STE_TUNL_ACTION_NONE = 0, DR_STE_TUNL_ACTION_ENABLE = 1, DR_STE_TUNL_ACTION_DECAP = 2, DR_STE_TUNL_ACTION_L3_DECAP = 3, DR_STE_TUNL_ACTION_POP_VLAN = 4, }; enum dr_ste_v0_action_type { DR_STE_ACTION_TYPE_PUSH_VLAN = 1, DR_STE_ACTION_TYPE_ENCAP_L3 = 3, DR_STE_ACTION_TYPE_ENCAP = 4, }; enum dr_ste_v0_action_mdfy_op { DR_STE_ACTION_MDFY_OP_COPY = 0x1, DR_STE_ACTION_MDFY_OP_SET = 0x2, DR_STE_ACTION_MDFY_OP_ADD = 0x3, }; #define DR_STE_CALC_LU_TYPE(lookup_type, rx, inner) \ ((inner) ? DR_STE_V0_LU_TYPE_##lookup_type##_I : \ (rx) ? DR_STE_V0_LU_TYPE_##lookup_type##_D : \ DR_STE_V0_LU_TYPE_##lookup_type##_O) enum dr_ste_v0_lu_type { DR_STE_V0_LU_TYPE_NOP = 0x00, DR_STE_V0_LU_TYPE_SRC_GVMI_AND_QP = 0x05, DR_STE_V0_LU_TYPE_ETHL2_TUNNELING_I = 0x0a, DR_STE_V0_LU_TYPE_ETHL2_DST_O = 0x06, DR_STE_V0_LU_TYPE_ETHL2_DST_I = 0x07, DR_STE_V0_LU_TYPE_ETHL2_DST_D = 0x1b, DR_STE_V0_LU_TYPE_ETHL2_SRC_O = 0x08, DR_STE_V0_LU_TYPE_ETHL2_SRC_I = 0x09, DR_STE_V0_LU_TYPE_ETHL2_SRC_D = 0x1c, DR_STE_V0_LU_TYPE_ETHL2_SRC_DST_O = 0x36, DR_STE_V0_LU_TYPE_ETHL2_SRC_DST_I = 0x37, DR_STE_V0_LU_TYPE_ETHL2_SRC_DST_D = 0x38, DR_STE_V0_LU_TYPE_ETHL3_IPV6_DST_O = 0x0d, DR_STE_V0_LU_TYPE_ETHL3_IPV6_DST_I = 0x0e, DR_STE_V0_LU_TYPE_ETHL3_IPV6_DST_D = 0x1e, DR_STE_V0_LU_TYPE_ETHL3_IPV6_SRC_O = 0x0f, DR_STE_V0_LU_TYPE_ETHL3_IPV6_SRC_I = 0x10, DR_STE_V0_LU_TYPE_ETHL3_IPV6_SRC_D = 0x1f, DR_STE_V0_LU_TYPE_ETHL3_IPV4_5_TUPLE_O = 0x11, DR_STE_V0_LU_TYPE_ETHL3_IPV4_5_TUPLE_I = 0x12, DR_STE_V0_LU_TYPE_ETHL3_IPV4_5_TUPLE_D = 0x20, DR_STE_V0_LU_TYPE_ETHL3_IPV4_MISC_O = 0x29, DR_STE_V0_LU_TYPE_ETHL3_IPV4_MISC_I = 0x2a, DR_STE_V0_LU_TYPE_ETHL3_IPV4_MISC_D = 0x2b, DR_STE_V0_LU_TYPE_ETHL4_O = 0x13, DR_STE_V0_LU_TYPE_ETHL4_I = 0x14, DR_STE_V0_LU_TYPE_ETHL4_D = 0x21, DR_STE_V0_LU_TYPE_ETHL4_MISC_O = 0x2c, DR_STE_V0_LU_TYPE_ETHL4_MISC_I = 0x2d, DR_STE_V0_LU_TYPE_ETHL4_MISC_D = 0x2e, DR_STE_V0_LU_TYPE_MPLS_FIRST_O = 0x15, DR_STE_V0_LU_TYPE_MPLS_FIRST_I = 0x24, DR_STE_V0_LU_TYPE_MPLS_FIRST_D = 0x25, DR_STE_V0_LU_TYPE_GRE = 0x16, DR_STE_V0_LU_TYPE_FLEX_PARSER_0 = 0x22, DR_STE_V0_LU_TYPE_FLEX_PARSER_1 = 0x23, DR_STE_V0_LU_TYPE_FLEX_PARSER_TNL_HEADER = 0x19, DR_STE_V0_LU_TYPE_GENERAL_PURPOSE = 0x18, DR_STE_V0_LU_TYPE_STEERING_REGISTERS_0 = 0x2f, DR_STE_V0_LU_TYPE_STEERING_REGISTERS_1 = 0x30, DR_STE_V0_LU_TYPE_TUNNEL_HEADER = 0x34, DR_STE_V0_LU_TYPE_DONT_CARE = DR_STE_LU_TYPE_DONT_CARE, }; enum { DR_STE_V0_ACTION_MDFY_FLD_L2_0 = 0x00, DR_STE_V0_ACTION_MDFY_FLD_L2_1 = 0x01, DR_STE_V0_ACTION_MDFY_FLD_L2_2 = 0x02, DR_STE_V0_ACTION_MDFY_FLD_L3_0 = 0x03, DR_STE_V0_ACTION_MDFY_FLD_L3_1 = 0x04, DR_STE_V0_ACTION_MDFY_FLD_L3_2 = 0x05, DR_STE_V0_ACTION_MDFY_FLD_L3_3 = 0x06, DR_STE_V0_ACTION_MDFY_FLD_L3_4 = 0x07, DR_STE_V0_ACTION_MDFY_FLD_L4_0 = 0x08, DR_STE_V0_ACTION_MDFY_FLD_L4_1 = 0x09, DR_STE_V0_ACTION_MDFY_FLD_MPLS = 0x0a, DR_STE_V0_ACTION_MDFY_FLD_L2_TNL_0 = 0x0b, DR_STE_V0_ACTION_MDFY_FLD_REG_0 = 0x0c, DR_STE_V0_ACTION_MDFY_FLD_REG_1 = 0x0d, DR_STE_V0_ACTION_MDFY_FLD_REG_2 = 0x0e, DR_STE_V0_ACTION_MDFY_FLD_REG_3 = 0x0f, DR_STE_V0_ACTION_MDFY_FLD_L4_2 = 0x10, DR_STE_V0_ACTION_MDFY_FLD_FLEX_0 = 0x11, DR_STE_V0_ACTION_MDFY_FLD_FLEX_1 = 0x12, DR_STE_V0_ACTION_MDFY_FLD_FLEX_2 = 0x13, DR_STE_V0_ACTION_MDFY_FLD_FLEX_3 = 0x14, DR_STE_V0_ACTION_MDFY_FLD_L2_TNL_1 = 0x15, DR_STE_V0_ACTION_MDFY_FLD_METADATA = 0x16, DR_STE_V0_ACTION_MDFY_FLD_RESERVED = 0x17, }; static const struct 
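/* Worked expansion of DR_STE_CALC_LU_TYPE() above (illustrative):
 * DR_STE_CALC_LU_TYPE(ETHL2_SRC, rx, inner) selects
 *
 *	inner		-> DR_STE_V0_LU_TYPE_ETHL2_SRC_I (0x09)
 *	rx, !inner	-> DR_STE_V0_LU_TYPE_ETHL2_SRC_D (0x1c)
 *	!rx, !inner	-> DR_STE_V0_LU_TYPE_ETHL2_SRC_O (0x08)
 */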
dr_ste_action_modify_field dr_ste_v0_action_modify_field_arr[] = { [MLX5_ACTION_IN_FIELD_OUT_SMAC_47_16] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L2_1, .start = 16, .end = 47, }, [MLX5_ACTION_IN_FIELD_OUT_SMAC_15_0] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L2_1, .start = 0, .end = 15, }, [MLX5_ACTION_IN_FIELD_OUT_ETHERTYPE] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L2_2, .start = 32, .end = 47, }, [MLX5_ACTION_IN_FIELD_OUT_DMAC_47_16] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L2_0, .start = 16, .end = 47, }, [MLX5_ACTION_IN_FIELD_OUT_DMAC_15_0] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L2_0, .start = 0, .end = 15, }, [MLX5_ACTION_IN_FIELD_OUT_IP_DSCP] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_1, .start = 0, .end = 5, }, [MLX5_ACTION_IN_FIELD_OUT_IP_ECN] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_1, .start = 6, .end = 7, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_FLAGS] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L4_0, .start = 48, .end = 56, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_SPORT] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L4_0, .start = 0, .end = 15, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_DPORT] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L4_0, .start = 16, .end = 31, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, }, [MLX5_ACTION_IN_FIELD_OUT_IP_TTL] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_1, .start = 8, .end = 15, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, }, [MLX5_ACTION_IN_FIELD_OUT_IPV6_HOPLIMIT] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_1, .start = 8, .end = 15, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_UDP_SPORT] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L4_0, .start = 0, .end = 15, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_UDP, }, [MLX5_ACTION_IN_FIELD_OUT_UDP_DPORT] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L4_0, .start = 16, .end = 31, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_UDP, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV6_127_96] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_3, .start = 32, .end = 63, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV6_95_64] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_3, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV6_63_32] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_4, .start = 32, .end = 63, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV6_31_0] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_4, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV6_127_96] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_0, .start = 32, .end = 63, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV6_95_64] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_0, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV6_63_32] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_2, .start = 32, .end = 63, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV6_31_0] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_2, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV4] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_0, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV4] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L3_0, .start = 32, .end = 63, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGA] = { .hw_field = 
DR_STE_V0_ACTION_MDFY_FLD_METADATA, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGB] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_METADATA, .start = 32, .end = 63, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_0] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_REG_0, .start = 32, .end = 63, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_1] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_REG_0, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_2] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_REG_1, .start = 32, .end = 63, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_3] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_REG_1, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_4] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_REG_2, .start = 32, .end = 63, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_5] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_REG_2, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_SEQ_NUM] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L4_1, .start = 32, .end = 63, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_ACK_NUM] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L4_1, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_FIRST_VID] = { .hw_field = DR_STE_V0_ACTION_MDFY_FLD_L2_2, .start = 0, .end = 15, }, }; static void dr_ste_v0_set_entry_type(uint8_t *hw_ste_p, uint8_t entry_type) { DR_STE_SET(general, hw_ste_p, entry_type, entry_type); } static uint8_t dr_ste_v0_get_entry_type(uint8_t *hw_ste_p) { return DR_STE_GET(general, hw_ste_p, entry_type); } static void dr_ste_v0_set_hit_gvmi(uint8_t *hw_ste_p, uint16_t gvmi) { DR_STE_SET(general, hw_ste_p, next_table_base_63_48, gvmi); } static void dr_ste_v0_set_miss_addr(uint8_t *hw_ste_p, uint64_t miss_addr) { uint64_t index = miss_addr >> 6; /* Miss address for TX and RX STEs located in the same offsets */ DR_STE_SET(rx_steering_mult, hw_ste_p, miss_address_39_32, index >> 26); DR_STE_SET(rx_steering_mult, hw_ste_p, miss_address_31_6, index); } static uint64_t dr_ste_v0_get_miss_addr(uint8_t *hw_ste_p) { uint64_t index = (DR_STE_GET(rx_steering_mult, hw_ste_p, miss_address_31_6) | DR_STE_GET(rx_steering_mult, hw_ste_p, miss_address_39_32) << 26); return index << 6; } static void dr_ste_v0_set_byte_mask(uint8_t *hw_ste_p, uint16_t byte_mask) { DR_STE_SET(general, hw_ste_p, byte_mask, byte_mask); } static uint16_t dr_ste_v0_get_byte_mask(uint8_t *hw_ste_p) { return DR_STE_GET(general, hw_ste_p, byte_mask); } static void dr_ste_v0_set_lu_type(uint8_t *hw_ste_p, uint16_t lu_type) { DR_STE_SET(general, hw_ste_p, entry_sub_type, lu_type); } static void dr_ste_v0_set_next_lu_type(uint8_t *hw_ste_p, uint16_t lu_type) { DR_STE_SET(general, hw_ste_p, next_lu_type, lu_type); } static uint16_t dr_ste_v0_get_next_lu_type(uint8_t *hw_ste_p) { return DR_STE_GET(general, hw_ste_p, next_lu_type); } static void dr_ste_v0_set_hit_addr(uint8_t *hw_ste_p, uint64_t icm_addr, uint32_t ht_size) { uint64_t index = (icm_addr >> 5) | ht_size; DR_STE_SET(general, hw_ste_p, next_table_base_39_32_size, index >> 27); DR_STE_SET(general, hw_ste_p, next_table_base_31_5_size, index); } static void dr_ste_v0_init_full(uint8_t *hw_ste_p, uint16_t lu_type, enum dr_ste_v0_entry_type entry_type, uint16_t gvmi) { dr_ste_v0_set_entry_type(hw_ste_p, entry_type); dr_ste_v0_set_lu_type(hw_ste_p, lu_type); dr_ste_v0_set_next_lu_type(hw_ste_p, DR_STE_LU_TYPE_DONT_CARE); DR_STE_SET(rx_steering_mult, hw_ste_p, gvmi, gvmi); DR_STE_SET(rx_steering_mult, hw_ste_p, next_table_base_63_48, gvmi); DR_STE_SET(rx_steering_mult, hw_ste_p, miss_address_63_48, gvmi); } static void 
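/* Worked example of the address encodings above (illustrative):
 * dr_ste_v0_set_miss_addr() stores index = miss_addr >> 6 split into
 * a 26-bit low field and an 8-bit high field, e.g. miss_addr 0x1040
 * gives index 0x41, miss_address_31_6 = 0x41, miss_address_39_32 = 0,
 * and dr_ste_v0_get_miss_addr() rebuilds 0x41 << 6 == 0x1040.
 * dr_ste_v0_set_hit_addr() likewise packs (icm_addr >> 5) | ht_size,
 * with the top bits going into next_table_base_39_32_size.
 */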
dr_ste_v0_init(uint8_t *hw_ste_p, uint16_t lu_type,
	       bool is_rx, uint16_t gvmi)
{
	enum dr_ste_v0_entry_type entry_type;

	entry_type = is_rx ? DR_STE_TYPE_RX : DR_STE_TYPE_TX;
	dr_ste_v0_init_full(hw_ste_p, lu_type, entry_type, gvmi);
}

static void dr_ste_v0_set_ctrl_always_hit_htbl(uint8_t *hw_ste_p,
					       uint16_t byte_mask,
					       uint16_t lu_type,
					       uint64_t icm_addr,
					       uint32_t num_of_entries,
					       uint16_t gvmi)
{
	dr_ste_v0_set_next_lu_type(hw_ste_p, lu_type);
	dr_ste_v0_set_hit_addr(hw_ste_p, icm_addr, num_of_entries);
	dr_ste_v0_set_byte_mask(hw_ste_p, byte_mask);
}

static void dr_ste_v0_set_ctrl_always_miss(uint8_t *hw_ste_p,
					   uint64_t miss_addr,
					   uint16_t gvmi)
{
	dr_ste_v0_set_next_lu_type(hw_ste_p, DR_STE_LU_TYPE_DONT_CARE);
	dr_ste_v0_set_miss_addr(hw_ste_p, miss_addr);
}

static void dr_ste_v0_set_rx_flow_tag(uint8_t *hw_ste_p, uint32_t flow_tag)
{
	DR_STE_SET(rx_steering_mult, hw_ste_p, qp_list_pointer,
		   DR_STE_ENABLE_FLOW_TAG | flow_tag);
}

static void dr_ste_v0_set_counter_id(uint8_t *hw_ste_p, uint32_t ctr_id)
{
	/* This can be used for both rx_steering_mult and sx_transmit */
	DR_STE_SET(rx_steering_mult, hw_ste_p, counter_trigger_15_0, ctr_id);
	DR_STE_SET(rx_steering_mult, hw_ste_p, counter_trigger_23_16, ctr_id >> 16);
}

static void dr_ste_v0_set_tx_encap(void *hw_ste_p, uint32_t reformat_id,
				   int size, bool encap_l3)
{
	DR_STE_SET(sx_transmit, hw_ste_p, action_type,
		   encap_l3 ? DR_STE_ACTION_TYPE_ENCAP_L3 : DR_STE_ACTION_TYPE_ENCAP);
	/* The hardware expects the size here in words (2 bytes) */
	DR_STE_SET(sx_transmit, hw_ste_p, action_description, size / 2);
	DR_STE_SET(sx_transmit, hw_ste_p, encap_pointer_vlan_data, reformat_id);
}

static void dr_ste_v0_set_rx_decap(uint8_t *hw_ste_p)
{
	DR_STE_SET(rx_steering_mult, hw_ste_p, tunneling_action,
		   DR_STE_TUNL_ACTION_DECAP);
	DR_STE_SET(rx_steering_mult, hw_ste_p, fail_on_error, 1);
}

static void dr_ste_v0_set_go_back_bit(uint8_t *hw_ste_p)
{
	DR_STE_SET(sx_transmit, hw_ste_p, go_back, 1);
}

static void dr_ste_v0_set_tx_push_vlan(uint8_t *hw_ste_p, uint32_t vlan_hdr,
				       bool go_back)
{
	DR_STE_SET(sx_transmit, hw_ste_p, action_type,
		   DR_STE_ACTION_TYPE_PUSH_VLAN);
	DR_STE_SET(sx_transmit, hw_ste_p, encap_pointer_vlan_data, vlan_hdr);
	/* Due to a HW limitation we need to set this bit, otherwise reformat +
	 * push vlan will not work.
	 */
	if (go_back)
		dr_ste_v0_set_go_back_bit(hw_ste_p);
}

static void dr_ste_v0_set_rx_pop_vlan(uint8_t *hw_ste_p)
{
	DR_STE_SET(rx_steering_mult, hw_ste_p, tunneling_action,
		   DR_STE_TUNL_ACTION_POP_VLAN);
}

static void dr_ste_v0_set_rx_decap_l3(uint8_t *hw_ste_p, bool vlan)
{
	DR_STE_SET(rx_steering_mult, hw_ste_p, tunneling_action,
		   DR_STE_TUNL_ACTION_L3_DECAP);
	DR_STE_SET(modify_packet, hw_ste_p, action_description, vlan ? 1 : 0);
	DR_STE_SET(rx_steering_mult, hw_ste_p, fail_on_error, 1);
}

static void dr_ste_v0_set_rewrite_actions(uint8_t *hw_ste_p,
					  uint16_t num_of_actions,
					  uint32_t re_write_index)
{
	DR_STE_SET(modify_packet, hw_ste_p, number_of_re_write_actions,
		   num_of_actions);
	DR_STE_SET(modify_packet, hw_ste_p, header_re_write_actions_pointer,
		   re_write_index);
}

static inline void dr_ste_v0_arr_init_next(uint8_t **last_ste,
					   uint32_t *added_stes,
					   enum dr_ste_v0_entry_type entry_type,
					   uint16_t gvmi)
{
	(*added_stes)++;
	*last_ste += DR_STE_SIZE;
	dr_ste_v0_init_full(*last_ste, DR_STE_LU_TYPE_DONT_CARE, entry_type, gvmi);
}

static void dr_ste_v0_set_actions_tx(struct dr_ste_ctx *ste_ctx,
				     uint8_t *action_type_set,
				     uint32_t actions_caps,
				     uint8_t *last_ste,
				     struct dr_ste_actions_attr *attr,
				     uint32_t *added_stes)
{
	bool encap = action_type_set[DR_ACTION_TYP_L2_TO_TNL_L2] ||
		action_type_set[DR_ACTION_TYP_L2_TO_TNL_L3];

	/* Make sure the modify header comes before L2 encapsulation,
	 * since modify headers are supported for outer headers only.
	 */
	if (action_type_set[DR_ACTION_TYP_MODIFY_HDR]) {
		dr_ste_v0_set_entry_type(last_ste, DR_STE_TYPE_MODIFY_PKT);
		dr_ste_v0_set_rewrite_actions(last_ste, attr->modify_actions,
					      attr->modify_index);
	}

	if (action_type_set[DR_ACTION_TYP_PUSH_VLAN]) {
		int i;

		for (i = 0; i < attr->vlans.count_push; i++) {
			if (i || action_type_set[DR_ACTION_TYP_MODIFY_HDR])
				dr_ste_v0_arr_init_next(&last_ste, added_stes,
							DR_STE_TYPE_TX, attr->gvmi);

			dr_ste_v0_set_tx_push_vlan(last_ste,
						   attr->vlans.headers[i], encap);
		}
	}

	if (encap) {
		/* Modify header and encapsulation require different STEs,
		 * since the modify header STE format doesn't support the
		 * encapsulation tunneling_action. Encapsulation and push VLAN
		 * also cannot be set on the same STE.
		 */
		if (action_type_set[DR_ACTION_TYP_MODIFY_HDR] ||
		    action_type_set[DR_ACTION_TYP_PUSH_VLAN])
			dr_ste_v0_arr_init_next(&last_ste, added_stes,
						DR_STE_TYPE_TX, attr->gvmi);

		dr_ste_v0_set_tx_encap(last_ste, attr->reformat_id,
				       attr->reformat_size,
				       action_type_set[DR_ACTION_TYP_L2_TO_TNL_L3]);
		/* Whenever prio_tag_required is enabled, we can be sure that the
		 * previous table (ACL) already pushed a vlan onto our packet,
		 * and due to a HW limitation we need to set this bit; otherwise
		 * push vlan + reformat will not work.
*/ if (attr->prio_tag_required) dr_ste_v0_set_go_back_bit(last_ste); } if (action_type_set[DR_ACTION_TYP_CTR]) dr_ste_v0_set_counter_id(last_ste, attr->ctr_id); dr_ste_v0_set_hit_gvmi(last_ste, attr->hit_gvmi); dr_ste_v0_set_hit_addr(last_ste, attr->final_icm_addr, 1); } static void dr_ste_v0_set_actions_rx(struct dr_ste_ctx *ste_ctx, uint8_t *action_type_set, uint32_t actions_caps, uint8_t *last_ste, struct dr_ste_actions_attr *attr, uint32_t *added_stes) { if (action_type_set[DR_ACTION_TYP_CTR]) dr_ste_v0_set_counter_id(last_ste, attr->ctr_id); if (action_type_set[DR_ACTION_TYP_TNL_L3_TO_L2]) { dr_ste_v0_set_entry_type(last_ste, DR_STE_TYPE_MODIFY_PKT); dr_ste_v0_set_rx_decap_l3(last_ste, attr->decap_with_vlan); dr_ste_v0_set_rewrite_actions(last_ste, attr->decap_actions, attr->decap_index); } if (action_type_set[DR_ACTION_TYP_TNL_L2_TO_L2]) dr_ste_v0_set_rx_decap(last_ste); if (action_type_set[DR_ACTION_TYP_POP_VLAN]) { int i; for (i = 0; i < attr->vlans.count_pop; i++) { if (i || action_type_set[DR_ACTION_TYP_TNL_L2_TO_L2] || action_type_set[DR_ACTION_TYP_TNL_L3_TO_L2]) dr_ste_v0_arr_init_next(&last_ste, added_stes, DR_STE_TYPE_RX, attr->gvmi); dr_ste_v0_set_rx_pop_vlan(last_ste); } } if (action_type_set[DR_ACTION_TYP_MODIFY_HDR]) { if (dr_ste_v0_get_entry_type(last_ste) == DR_STE_TYPE_MODIFY_PKT) dr_ste_v0_arr_init_next(&last_ste, added_stes, DR_STE_TYPE_MODIFY_PKT, attr->gvmi); else dr_ste_v0_set_entry_type(last_ste, DR_STE_TYPE_MODIFY_PKT); dr_ste_v0_set_rewrite_actions(last_ste, attr->modify_actions, attr->modify_index); } if (action_type_set[DR_ACTION_TYP_TAG]) { if (dr_ste_v0_get_entry_type(last_ste) == DR_STE_TYPE_MODIFY_PKT) dr_ste_v0_arr_init_next(&last_ste, added_stes, DR_STE_TYPE_RX, attr->gvmi); dr_ste_v0_set_rx_flow_tag(last_ste, attr->flow_tag); } dr_ste_v0_set_hit_gvmi(last_ste, attr->hit_gvmi); dr_ste_v0_set_hit_addr(last_ste, attr->final_icm_addr, 1); } static void dr_ste_v0_set_action_set(uint8_t *hw_action, uint8_t hw_field, uint8_t shifter, uint8_t length, uint32_t data) { length = (length == 32) ? 0 : length; DEVX_SET(dr_action_hw_set, hw_action, opcode, DR_STE_ACTION_MDFY_OP_SET); DEVX_SET(dr_action_hw_set, hw_action, destination_field_code, hw_field); DEVX_SET(dr_action_hw_set, hw_action, destination_left_shifter, shifter); DEVX_SET(dr_action_hw_set, hw_action, destination_length, length); DEVX_SET(dr_action_hw_set, hw_action, inline_data, data); } static void dr_ste_v0_set_action_add(uint8_t *hw_action, uint8_t hw_field, uint8_t shifter, uint8_t length, uint32_t data) { length = (length == 32) ? 
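/* Note on the length encoding above and below (an assumption about the
 * hardware format, not stated in this file): destination_length appears
 * too narrow to hold the value 32, so a full 32-bit write is encoded as
 * length 0, which is why both set_action_set and set_action_add map
 * length 32 to 0.
 */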
0 : length; DEVX_SET(dr_action_hw_set, hw_action, opcode, DR_STE_ACTION_MDFY_OP_ADD); DEVX_SET(dr_action_hw_set, hw_action, destination_field_code, hw_field); DEVX_SET(dr_action_hw_set, hw_action, destination_left_shifter, shifter); DEVX_SET(dr_action_hw_set, hw_action, destination_length, length); DEVX_SET(dr_action_hw_set, hw_action, inline_data, data); } static void dr_ste_v0_set_action_copy(uint8_t *hw_action, uint8_t dst_hw_field, uint8_t dst_shifter, uint8_t dst_len, uint8_t src_hw_field, uint8_t src_shifter) { DEVX_SET(dr_action_hw_copy, hw_action, opcode, DR_STE_ACTION_MDFY_OP_COPY); DEVX_SET(dr_action_hw_copy, hw_action, destination_field_code, dst_hw_field); DEVX_SET(dr_action_hw_copy, hw_action, destination_left_shifter, dst_shifter); DEVX_SET(dr_action_hw_copy, hw_action, destination_length, dst_len); DEVX_SET(dr_action_hw_copy, hw_action, source_field_code, src_hw_field); DEVX_SET(dr_action_hw_copy, hw_action, source_left_shifter, src_shifter); } #define DR_STE_DECAP_L3_MIN_ACTION_NUM 5 static int dr_ste_v0_set_action_decap_l3_list(void *data, uint32_t data_sz, uint8_t *hw_action, uint32_t hw_action_sz, uint16_t *used_hw_action_num) { struct mlx5_ifc_l2_hdr_bits *l2_hdr = data; uint32_t hw_action_num; int required_actions; uint32_t hdr_fld_4b; uint16_t hdr_fld_2b; uint16_t vlan_type; bool vlan; vlan = (data_sz != HDR_LEN_L2); hw_action_num = hw_action_sz / DEVX_ST_SZ_BYTES(dr_action_hw_set); required_actions = DR_STE_DECAP_L3_MIN_ACTION_NUM + !!vlan; if (hw_action_num < required_actions) { errno = ENOMEM; return errno; } /* dmac_47_16 */ DEVX_SET(dr_action_hw_set, hw_action, opcode, DR_STE_ACTION_MDFY_OP_SET); DEVX_SET(dr_action_hw_set, hw_action, destination_length, 0); DEVX_SET(dr_action_hw_set, hw_action, destination_field_code, DR_STE_V0_ACTION_MDFY_FLD_L2_0); DEVX_SET(dr_action_hw_set, hw_action, destination_left_shifter, 16); hdr_fld_4b = DEVX_GET(l2_hdr, l2_hdr, dmac_47_16); DEVX_SET(dr_action_hw_set, hw_action, inline_data, hdr_fld_4b); hw_action += DEVX_ST_SZ_BYTES(dr_action_hw_set); /* smac_47_16 */ DEVX_SET(dr_action_hw_set, hw_action, opcode, DR_STE_ACTION_MDFY_OP_SET); DEVX_SET(dr_action_hw_set, hw_action, destination_length, 0); DEVX_SET(dr_action_hw_set, hw_action, destination_field_code, DR_STE_V0_ACTION_MDFY_FLD_L2_1); DEVX_SET(dr_action_hw_set, hw_action, destination_left_shifter, 16); hdr_fld_4b = (DEVX_GET(l2_hdr, l2_hdr, smac_31_0) >> 16 | DEVX_GET(l2_hdr, l2_hdr, smac_47_32) << 16); DEVX_SET(dr_action_hw_set, hw_action, inline_data, hdr_fld_4b); hw_action += DEVX_ST_SZ_BYTES(dr_action_hw_set); /* dmac_15_0 */ DEVX_SET(dr_action_hw_set, hw_action, opcode, DR_STE_ACTION_MDFY_OP_SET); DEVX_SET(dr_action_hw_set, hw_action, destination_length, 16); DEVX_SET(dr_action_hw_set, hw_action, destination_field_code, DR_STE_V0_ACTION_MDFY_FLD_L2_0); DEVX_SET(dr_action_hw_set, hw_action, destination_left_shifter, 0); hdr_fld_2b = DEVX_GET(l2_hdr, l2_hdr, dmac_15_0); DEVX_SET(dr_action_hw_set, hw_action, inline_data, hdr_fld_2b); hw_action += DEVX_ST_SZ_BYTES(dr_action_hw_set); /* ethertype + (optional) vlan */ DEVX_SET(dr_action_hw_set, hw_action, opcode, DR_STE_ACTION_MDFY_OP_SET); DEVX_SET(dr_action_hw_set, hw_action, destination_field_code, DR_STE_V0_ACTION_MDFY_FLD_L2_2); DEVX_SET(dr_action_hw_set, hw_action, destination_left_shifter, 32); if (!vlan) { hdr_fld_2b = DEVX_GET(l2_hdr, l2_hdr, ethertype); DEVX_SET(dr_action_hw_set, hw_action, inline_data, hdr_fld_2b); DEVX_SET(dr_action_hw_set, hw_action, destination_length, 16); } else { hdr_fld_2b = DEVX_GET(l2_hdr, 
l2_hdr, ethertype); vlan_type = hdr_fld_2b == SVLAN_ETHERTYPE ? DR_STE_SVLAN : DR_STE_CVLAN; hdr_fld_2b = DEVX_GET(l2_hdr, l2_hdr, vlan); hdr_fld_4b = (vlan_type << 16) | hdr_fld_2b; DEVX_SET(dr_action_hw_set, hw_action, inline_data, hdr_fld_4b); DEVX_SET(dr_action_hw_set, hw_action, destination_length, 18); } hw_action += DEVX_ST_SZ_BYTES(dr_action_hw_set); /* smac_15_0 */ DEVX_SET(dr_action_hw_set, hw_action, opcode, DR_STE_ACTION_MDFY_OP_SET); DEVX_SET(dr_action_hw_set, hw_action, destination_length, 16); DEVX_SET(dr_action_hw_set, hw_action, destination_field_code, DR_STE_V0_ACTION_MDFY_FLD_L2_1); DEVX_SET(dr_action_hw_set, hw_action, destination_left_shifter, 0); hdr_fld_2b = DEVX_GET(l2_hdr, l2_hdr, smac_31_0); DEVX_SET(dr_action_hw_set, hw_action, inline_data, hdr_fld_2b); hw_action += DEVX_ST_SZ_BYTES(dr_action_hw_set); if (vlan) { DEVX_SET(dr_action_hw_set, hw_action, opcode, DR_STE_ACTION_MDFY_OP_SET); hdr_fld_2b = DEVX_GET(l2_hdr, l2_hdr, vlan_type); DEVX_SET(dr_action_hw_set, hw_action, inline_data, hdr_fld_2b); DEVX_SET(dr_action_hw_set, hw_action, destination_length, 16); DEVX_SET(dr_action_hw_set, hw_action, destination_field_code, DR_STE_V0_ACTION_MDFY_FLD_L2_2); DEVX_SET(dr_action_hw_set, hw_action, destination_left_shifter, 0); } *used_hw_action_num = required_actions; return 0; } static const struct dr_ste_action_modify_field *dr_ste_v0_get_action_hw_field(struct dr_ste_ctx *ste_ctx, uint16_t sw_field, struct dr_devx_caps *caps) { const struct dr_ste_action_modify_field *hw_field; if (sw_field >= ste_ctx->action_modify_field_arr_size) goto not_found; hw_field = &ste_ctx->action_modify_field_arr[sw_field]; if (!hw_field->end && !hw_field->start) goto not_found; return hw_field; not_found: errno = EINVAL; return NULL; } static void dr_ste_v0_build_eth_l2_src_dst_bit_mask(struct dr_match_param *value, bool inner, uint8_t *bit_mask) { struct dr_match_spec *mask = inner ? &value->inner : &value->outer; DR_STE_SET_TAG(eth_l2_src_dst, bit_mask, dmac_47_16, mask, dmac_47_16); DR_STE_SET_TAG(eth_l2_src_dst, bit_mask, dmac_15_0, mask, dmac_15_0); if (mask->smac_47_16 || mask->smac_15_0) { DR_STE_SET(eth_l2_src_dst, bit_mask, smac_47_32, mask->smac_47_16 >> 16); DR_STE_SET(eth_l2_src_dst, bit_mask, smac_31_0, mask->smac_47_16 << 16 | mask->smac_15_0); mask->smac_47_16 = 0; mask->smac_15_0 = 0; } DR_STE_SET_TAG(eth_l2_src_dst, bit_mask, first_vlan_id, mask, first_vid); DR_STE_SET_TAG(eth_l2_src_dst, bit_mask, first_cfi, mask, first_cfi); DR_STE_SET_TAG(eth_l2_src_dst, bit_mask, first_priority, mask, first_prio); DR_STE_SET_ONES(eth_l2_src_dst, bit_mask, l3_type, mask, ip_version); if (mask->cvlan_tag || mask->svlan_tag) { DR_STE_SET(eth_l2_src_dst, bit_mask, first_vlan_qualifier, -1); mask->cvlan_tag = 0; mask->svlan_tag = 0; } } static int dr_ste_v0_build_eth_l2_src_dst_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_spec *spec = sb->inner ? 
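/* Worked example of the smac repacking above (illustrative): the match
 * spec carries smac_47_16/smac_15_0 while this STE layout uses
 * smac_47_32/smac_31_0, so for MAC 00:11:22:33:44:55:
 *
 *	smac_47_16 = 0x00112233, smac_15_0 = 0x4455
 *	smac_47_32 = 0x00112233 >> 16 = 0x0011
 *	smac_31_0  = (0x00112233 << 16) | 0x4455 = 0x22334455
 *
 * The same repacking is repeated in the tag builder below.
 */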
&value->inner : &value->outer; DR_STE_SET_TAG(eth_l2_src_dst, tag, dmac_47_16, spec, dmac_47_16); DR_STE_SET_TAG(eth_l2_src_dst, tag, dmac_15_0, spec, dmac_15_0); if (spec->smac_47_16 || spec->smac_15_0) { DR_STE_SET(eth_l2_src_dst, tag, smac_47_32, spec->smac_47_16 >> 16); DR_STE_SET(eth_l2_src_dst, tag, smac_31_0, spec->smac_47_16 << 16 | spec->smac_15_0); spec->smac_47_16 = 0; spec->smac_15_0 = 0; } if (spec->ip_version) { if (spec->ip_version == IP_VERSION_IPV4) { DR_STE_SET(eth_l2_src_dst, tag, l3_type, STE_IPV4); spec->ip_version = 0; } else if (spec->ip_version == IP_VERSION_IPV6) { DR_STE_SET(eth_l2_src_dst, tag, l3_type, STE_IPV6); spec->ip_version = 0; } else { errno = EINVAL; return errno; } } DR_STE_SET_TAG(eth_l2_src_dst, tag, first_vlan_id, spec, first_vid); DR_STE_SET_TAG(eth_l2_src_dst, tag, first_cfi, spec, first_cfi); DR_STE_SET_TAG(eth_l2_src_dst, tag, first_priority, spec, first_prio); if (spec->cvlan_tag) { DR_STE_SET(eth_l2_src_dst, tag, first_vlan_qualifier, DR_STE_CVLAN); spec->cvlan_tag = 0; } else if (spec->svlan_tag) { DR_STE_SET(eth_l2_src_dst, tag, first_vlan_qualifier, DR_STE_SVLAN); spec->svlan_tag = 0; } return 0; } static void dr_ste_v0_build_eth_l2_src_dst_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_eth_l2_src_dst_bit_mask(mask, sb->inner, sb->bit_mask); sb->lu_type = DR_STE_CALC_LU_TYPE(ETHL2_SRC_DST, sb->rx, sb->inner); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_eth_l2_src_dst_tag; } static int dr_ste_v0_build_eth_l3_ipv6_dst_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_spec *spec = sb->inner ? &value->inner : &value->outer; DR_STE_SET_TAG(eth_l3_ipv6_dst, tag, dst_ip_127_96, spec, dst_ip_127_96); DR_STE_SET_TAG(eth_l3_ipv6_dst, tag, dst_ip_95_64, spec, dst_ip_95_64); DR_STE_SET_TAG(eth_l3_ipv6_dst, tag, dst_ip_63_32, spec, dst_ip_63_32); DR_STE_SET_TAG(eth_l3_ipv6_dst, tag, dst_ip_31_0, spec, dst_ip_31_0); return 0; } static void dr_ste_v0_build_eth_l3_ipv6_dst_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_eth_l3_ipv6_dst_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_CALC_LU_TYPE(ETHL3_IPV6_DST, sb->rx, sb->inner); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_eth_l3_ipv6_dst_tag; } static int dr_ste_v0_build_eth_l3_ipv6_src_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_spec *spec = sb->inner ? &value->inner : &value->outer; DR_STE_SET_TAG(eth_l3_ipv6_src, tag, src_ip_127_96, spec, src_ip_127_96); DR_STE_SET_TAG(eth_l3_ipv6_src, tag, src_ip_95_64, spec, src_ip_95_64); DR_STE_SET_TAG(eth_l3_ipv6_src, tag, src_ip_63_32, spec, src_ip_63_32); DR_STE_SET_TAG(eth_l3_ipv6_src, tag, src_ip_31_0, spec, src_ip_31_0); return 0; } static void dr_ste_v0_build_eth_l3_ipv6_src_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_eth_l3_ipv6_src_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_CALC_LU_TYPE(ETHL3_IPV6_SRC, sb->rx, sb->inner); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_eth_l3_ipv6_src_tag; } static int dr_ste_v0_build_eth_l3_ipv4_5_tuple_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_spec *spec = sb->inner ? 
&value->inner : &value->outer; DR_STE_SET_TAG(eth_l3_ipv4_5_tuple, tag, destination_address, spec, dst_ip_31_0); DR_STE_SET_TAG(eth_l3_ipv4_5_tuple, tag, source_address, spec, src_ip_31_0); DR_STE_SET_TAG(eth_l3_ipv4_5_tuple, tag, destination_port, spec, tcp_dport); DR_STE_SET_TAG(eth_l3_ipv4_5_tuple, tag, destination_port, spec, udp_dport); DR_STE_SET_TAG(eth_l3_ipv4_5_tuple, tag, source_port, spec, tcp_sport); DR_STE_SET_TAG(eth_l3_ipv4_5_tuple, tag, source_port, spec, udp_sport); DR_STE_SET_TAG(eth_l3_ipv4_5_tuple, tag, protocol, spec, ip_protocol); DR_STE_SET_TAG(eth_l3_ipv4_5_tuple, tag, fragmented, spec, frag); DR_STE_SET_TAG(eth_l3_ipv4_5_tuple, tag, dscp, spec, ip_dscp); DR_STE_SET_TAG(eth_l3_ipv4_5_tuple, tag, ecn, spec, ip_ecn); if (spec->tcp_flags) { DR_STE_SET_TCP_FLAGS(eth_l3_ipv4_5_tuple, tag, spec); spec->tcp_flags = 0; } return 0; } static void dr_ste_v0_build_eth_l3_ipv4_5_tuple_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_eth_l3_ipv4_5_tuple_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_CALC_LU_TYPE(ETHL3_IPV4_5_TUPLE, sb->rx, sb->inner); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_eth_l3_ipv4_5_tuple_tag; } static void dr_ste_v0_build_eth_l2_src_or_dst_bit_mask(struct dr_match_param *value, bool inner, uint8_t *bit_mask) { struct dr_match_spec *mask = inner ? &value->inner : &value->outer; struct dr_match_misc *misc_mask = &value->misc; DR_STE_SET_TAG(eth_l2_src, bit_mask, first_vlan_id, mask, first_vid); DR_STE_SET_TAG(eth_l2_src, bit_mask, first_cfi, mask, first_cfi); DR_STE_SET_TAG(eth_l2_src, bit_mask, first_priority, mask, first_prio); DR_STE_SET_TAG(eth_l2_src, bit_mask, ip_fragmented, mask, frag); DR_STE_SET_TAG(eth_l2_src, bit_mask, l3_ethertype, mask, ethertype); DR_STE_SET_ONES(eth_l2_src, bit_mask, l3_type, mask, ip_version); if (mask->svlan_tag || mask->cvlan_tag) { DR_STE_SET(eth_l2_src, bit_mask, first_vlan_qualifier, -1); mask->cvlan_tag = 0; mask->svlan_tag = 0; } if (inner) { if (misc_mask->inner_second_cvlan_tag || misc_mask->inner_second_svlan_tag) { DR_STE_SET(eth_l2_src, bit_mask, second_vlan_qualifier, -1); misc_mask->inner_second_cvlan_tag = 0; misc_mask->inner_second_svlan_tag = 0; } DR_STE_SET_TAG(eth_l2_src, bit_mask, second_vlan_id, misc_mask, inner_second_vid); DR_STE_SET_TAG(eth_l2_src, bit_mask, second_cfi, misc_mask, inner_second_cfi); DR_STE_SET_TAG(eth_l2_src, bit_mask, second_priority, misc_mask, inner_second_prio); } else { if (misc_mask->outer_second_cvlan_tag || misc_mask->outer_second_svlan_tag) { DR_STE_SET(eth_l2_src, bit_mask, second_vlan_qualifier, -1); misc_mask->outer_second_cvlan_tag = 0; misc_mask->outer_second_svlan_tag = 0; } DR_STE_SET_TAG(eth_l2_src, bit_mask, second_vlan_id, misc_mask, outer_second_vid); DR_STE_SET_TAG(eth_l2_src, bit_mask, second_cfi, misc_mask, outer_second_cfi); DR_STE_SET_TAG(eth_l2_src, bit_mask, second_priority, misc_mask, outer_second_prio); } } static int dr_ste_v0_build_eth_l2_src_or_dst_tag(struct dr_match_param *value, bool inner, uint8_t *tag) { struct dr_match_spec *spec = inner ? 
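/* Note on the IPv4 5-tuple builder above (illustrative): TCP and UDP
 * ports share the same STE fields, so destination_port/source_port are
 * written from both the tcp_* and udp_* spec fields; the protocol
 * field is what typically disambiguates, e.g. matching udp_dport 4789
 * would set destination_port = 4789 with protocol = IP_PROTOCOL_UDP.
 */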
&value->inner : &value->outer; struct dr_match_misc *misc_spec = &value->misc; DR_STE_SET_TAG(eth_l2_src, tag, first_vlan_id, spec, first_vid); DR_STE_SET_TAG(eth_l2_src, tag, first_cfi, spec, first_cfi); DR_STE_SET_TAG(eth_l2_src, tag, first_priority, spec, first_prio); DR_STE_SET_TAG(eth_l2_src, tag, ip_fragmented, spec, frag); DR_STE_SET_TAG(eth_l2_src, tag, l3_ethertype, spec, ethertype); if (spec->ip_version) { if (spec->ip_version == IP_VERSION_IPV4) { DR_STE_SET(eth_l2_src, tag, l3_type, STE_IPV4); spec->ip_version = 0; } else if (spec->ip_version == IP_VERSION_IPV6) { DR_STE_SET(eth_l2_src, tag, l3_type, STE_IPV6); spec->ip_version = 0; } else { errno = EINVAL; return errno; } } if (spec->cvlan_tag) { DR_STE_SET(eth_l2_src, tag, first_vlan_qualifier, DR_STE_CVLAN); spec->cvlan_tag = 0; } else if (spec->svlan_tag) { DR_STE_SET(eth_l2_src, tag, first_vlan_qualifier, DR_STE_SVLAN); spec->svlan_tag = 0; } if (inner) { if (misc_spec->inner_second_cvlan_tag) { DR_STE_SET(eth_l2_src, tag, second_vlan_qualifier, DR_STE_CVLAN); misc_spec->inner_second_cvlan_tag = 0; } else if (misc_spec->inner_second_svlan_tag) { DR_STE_SET(eth_l2_src, tag, second_vlan_qualifier, DR_STE_SVLAN); misc_spec->inner_second_svlan_tag = 0; } DR_STE_SET_TAG(eth_l2_src, tag, second_vlan_id, misc_spec, inner_second_vid); DR_STE_SET_TAG(eth_l2_src, tag, second_cfi, misc_spec, inner_second_cfi); DR_STE_SET_TAG(eth_l2_src, tag, second_priority, misc_spec, inner_second_prio); } else { if (misc_spec->outer_second_cvlan_tag) { DR_STE_SET(eth_l2_src, tag, second_vlan_qualifier, DR_STE_CVLAN); misc_spec->outer_second_cvlan_tag = 0; } else if (misc_spec->outer_second_svlan_tag) { DR_STE_SET(eth_l2_src, tag, second_vlan_qualifier, DR_STE_SVLAN); misc_spec->outer_second_svlan_tag = 0; } DR_STE_SET_TAG(eth_l2_src, tag, second_vlan_id, misc_spec, outer_second_vid); DR_STE_SET_TAG(eth_l2_src, tag, second_cfi, misc_spec, outer_second_cfi); DR_STE_SET_TAG(eth_l2_src, tag, second_priority, misc_spec, outer_second_prio); } return 0; } static void dr_ste_v0_build_eth_l2_src_bit_mask(struct dr_match_param *value, bool inner, uint8_t *bit_mask) { struct dr_match_spec *mask = inner ? &value->inner : &value->outer; DR_STE_SET_TAG(eth_l2_src, bit_mask, smac_47_16, mask, smac_47_16); DR_STE_SET_TAG(eth_l2_src, bit_mask, smac_15_0, mask, smac_15_0); dr_ste_v0_build_eth_l2_src_or_dst_bit_mask(value, inner, bit_mask); } static int dr_ste_v0_build_eth_l2_src_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_spec *spec = sb->inner ? &value->inner : &value->outer; DR_STE_SET_TAG(eth_l2_src, tag, smac_47_16, spec, smac_47_16); DR_STE_SET_TAG(eth_l2_src, tag, smac_15_0, spec, smac_15_0); return dr_ste_v0_build_eth_l2_src_or_dst_tag(value, sb->inner, tag); } static void dr_ste_v0_build_eth_l2_src_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_eth_l2_src_bit_mask(mask, sb->inner, sb->bit_mask); sb->lu_type = DR_STE_CALC_LU_TYPE(ETHL2_SRC, sb->rx, sb->inner); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_eth_l2_src_tag; } static void dr_ste_v0_build_eth_l2_dst_bit_mask(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *bit_mask) { struct dr_match_spec *mask = sb->inner ? 
&value->inner : &value->outer; DR_STE_SET_TAG(eth_l2_dst, bit_mask, dmac_47_16, mask, dmac_47_16); DR_STE_SET_TAG(eth_l2_dst, bit_mask, dmac_15_0, mask, dmac_15_0); dr_ste_v0_build_eth_l2_src_or_dst_bit_mask(value, sb->inner, bit_mask); } static int dr_ste_v0_build_eth_l2_dst_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_spec *spec = sb->inner ? &value->inner : &value->outer; DR_STE_SET_TAG(eth_l2_dst, tag, dmac_47_16, spec, dmac_47_16); DR_STE_SET_TAG(eth_l2_dst, tag, dmac_15_0, spec, dmac_15_0); return dr_ste_v0_build_eth_l2_src_or_dst_tag(value, sb->inner, tag); } static void dr_ste_v0_build_eth_l2_dst_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_eth_l2_dst_bit_mask(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_CALC_LU_TYPE(ETHL2_DST, sb->rx, sb->inner); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_eth_l2_dst_tag; } static void dr_ste_v0_build_eth_l2_tnl_bit_mask(struct dr_match_param *value, bool inner, uint8_t *bit_mask) { struct dr_match_spec *mask = inner ? &value->inner : &value->outer; struct dr_match_misc *misc = &value->misc; DR_STE_SET_TAG(eth_l2_tnl, bit_mask, dmac_47_16, mask, dmac_47_16); DR_STE_SET_TAG(eth_l2_tnl, bit_mask, dmac_15_0, mask, dmac_15_0); DR_STE_SET_TAG(eth_l2_tnl, bit_mask, first_vlan_id, mask, first_vid); DR_STE_SET_TAG(eth_l2_tnl, bit_mask, first_cfi, mask, first_cfi); DR_STE_SET_TAG(eth_l2_tnl, bit_mask, first_priority, mask, first_prio); DR_STE_SET_TAG(eth_l2_tnl, bit_mask, ip_fragmented, mask, frag); DR_STE_SET_TAG(eth_l2_tnl, bit_mask, l3_ethertype, mask, ethertype); DR_STE_SET_ONES(eth_l2_tnl, bit_mask, l3_type, mask, ip_version); if (misc->vxlan_vni) { DR_STE_SET(eth_l2_tnl, bit_mask, l2_tunneling_network_id, (misc->vxlan_vni << 8)); misc->vxlan_vni = 0; } if (mask->svlan_tag || mask->cvlan_tag) { DR_STE_SET(eth_l2_tnl, bit_mask, first_vlan_qualifier, -1); mask->cvlan_tag = 0; mask->svlan_tag = 0; } } static int dr_ste_v0_build_eth_l2_tnl_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_spec *spec = sb->inner ? 
&value->inner : &value->outer; struct dr_match_misc *misc = &value->misc; DR_STE_SET_TAG(eth_l2_tnl, tag, dmac_47_16, spec, dmac_47_16); DR_STE_SET_TAG(eth_l2_tnl, tag, dmac_15_0, spec, dmac_15_0); DR_STE_SET_TAG(eth_l2_tnl, tag, first_vlan_id, spec, first_vid); DR_STE_SET_TAG(eth_l2_tnl, tag, first_cfi, spec, first_cfi); DR_STE_SET_TAG(eth_l2_tnl, tag, ip_fragmented, spec, frag); DR_STE_SET_TAG(eth_l2_tnl, tag, first_priority, spec, first_prio); DR_STE_SET_TAG(eth_l2_tnl, tag, l3_ethertype, spec, ethertype); if (misc->vxlan_vni) { DR_STE_SET(eth_l2_tnl, tag, l2_tunneling_network_id, (misc->vxlan_vni << 8)); misc->vxlan_vni = 0; } if (spec->cvlan_tag) { DR_STE_SET(eth_l2_tnl, tag, first_vlan_qualifier, DR_STE_CVLAN); spec->cvlan_tag = 0; } else if (spec->svlan_tag) { DR_STE_SET(eth_l2_tnl, tag, first_vlan_qualifier, DR_STE_SVLAN); spec->svlan_tag = 0; } if (spec->ip_version) { if (spec->ip_version == IP_VERSION_IPV4) { DR_STE_SET(eth_l2_tnl, tag, l3_type, STE_IPV4); spec->ip_version = 0; } else if (spec->ip_version == IP_VERSION_IPV6) { DR_STE_SET(eth_l2_tnl, tag, l3_type, STE_IPV6); spec->ip_version = 0; } else { errno = EINVAL; return errno; } } return 0; } static void dr_ste_v0_build_eth_l2_tnl_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_eth_l2_tnl_bit_mask(mask, sb->inner, sb->bit_mask); sb->lu_type = DR_STE_V0_LU_TYPE_ETHL2_TUNNELING_I; sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_eth_l2_tnl_tag; } static int dr_ste_v0_build_eth_l3_ipv4_misc_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_spec *spec = sb->inner ? &value->inner : &value->outer; DR_STE_SET_TAG(eth_l3_ipv4_misc, tag, time_to_live, spec, ip_ttl_hoplimit); DR_STE_SET_TAG(eth_l3_ipv4_misc, tag, ihl, spec, ipv4_ihl); return 0; } static void dr_ste_v0_build_eth_l3_ipv4_misc_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_eth_l3_ipv4_misc_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_CALC_LU_TYPE(ETHL3_IPV4_MISC, sb->rx, sb->inner); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_eth_l3_ipv4_misc_tag; } static int dr_ste_v0_build_eth_ipv6_l3_l4_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_spec *spec = sb->inner ? 
&value->inner : &value->outer; struct dr_match_misc *misc = &value->misc; DR_STE_SET_TAG(eth_l4, tag, dst_port, spec, tcp_dport); DR_STE_SET_TAG(eth_l4, tag, src_port, spec, tcp_sport); DR_STE_SET_TAG(eth_l4, tag, dst_port, spec, udp_dport); DR_STE_SET_TAG(eth_l4, tag, src_port, spec, udp_sport); DR_STE_SET_TAG(eth_l4, tag, protocol, spec, ip_protocol); DR_STE_SET_TAG(eth_l4, tag, fragmented, spec, frag); DR_STE_SET_TAG(eth_l4, tag, dscp, spec, ip_dscp); DR_STE_SET_TAG(eth_l4, tag, ecn, spec, ip_ecn); DR_STE_SET_TAG(eth_l4, tag, ipv6_hop_limit, spec, ip_ttl_hoplimit); if (sb->inner) DR_STE_SET_TAG(eth_l4, tag, flow_label, misc, inner_ipv6_flow_label); else DR_STE_SET_TAG(eth_l4, tag, flow_label, misc, outer_ipv6_flow_label); if (spec->tcp_flags) { DR_STE_SET_TCP_FLAGS(eth_l4, tag, spec); spec->tcp_flags = 0; } return 0; } static void dr_ste_v0_build_eth_ipv6_l3_l4_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_eth_ipv6_l3_l4_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_CALC_LU_TYPE(ETHL4, sb->rx, sb->inner); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_eth_ipv6_l3_l4_tag; } static int dr_ste_v0_build_mpls_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_misc2 *misc2 = &value->misc2; if (sb->inner) DR_STE_SET_MPLS(mpls, misc2, inner, tag); else DR_STE_SET_MPLS(mpls, misc2, outer, tag); return 0; } static void dr_ste_v0_build_mpls_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_mpls_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_CALC_LU_TYPE(MPLS_FIRST, sb->rx, sb->inner); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_mpls_tag; } static int dr_ste_v0_build_tnl_gre_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_misc *misc = &value->misc; DR_STE_SET_TAG(gre, tag, gre_protocol, misc, gre_protocol); DR_STE_SET_TAG(gre, tag, gre_k_present, misc, gre_k_present); DR_STE_SET_TAG(gre, tag, gre_key_h, misc, gre_key_h); DR_STE_SET_TAG(gre, tag, gre_key_l, misc, gre_key_l); DR_STE_SET_TAG(gre, tag, gre_c_present, misc, gre_c_present); DR_STE_SET_TAG(gre, tag, gre_s_present, misc, gre_s_present); return 0; } static void dr_ste_v0_build_tnl_gre_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_tnl_gre_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_V0_LU_TYPE_GRE; sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_tnl_gre_tag; } static int dr_ste_v0_build_tnl_mpls_over_udp_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_misc2 *misc2 = &value->misc2; uint8_t *parser_ptr; uint8_t parser_id; uint32_t mpls_hdr; mpls_hdr = misc2->outer_first_mpls_over_udp_label << HDR_MPLS_OFFSET_LABEL; misc2->outer_first_mpls_over_udp_label = 0; mpls_hdr |= misc2->outer_first_mpls_over_udp_exp << HDR_MPLS_OFFSET_EXP; misc2->outer_first_mpls_over_udp_exp = 0; mpls_hdr |= misc2->outer_first_mpls_over_udp_s_bos << HDR_MPLS_OFFSET_S_BOS; misc2->outer_first_mpls_over_udp_s_bos = 0; mpls_hdr |= misc2->outer_first_mpls_over_udp_ttl << HDR_MPLS_OFFSET_TTL; misc2->outer_first_mpls_over_udp_ttl = 0; parser_id = sb->caps->flex_parser_id_mpls_over_udp; parser_ptr = dr_ste_calc_flex_parser_offset(tag, parser_id); *(__be32 *)parser_ptr = htobe32(mpls_hdr); return 0; } static void dr_ste_v0_build_tnl_mpls_over_udp_init(struct dr_ste_build *sb, struct 
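/* Worked example of the MPLS repacking above (illustrative): the
 * label/exp/s_bos/ttl spec fields are packed back into wire layout
 * using the HDR_MPLS_OFFSET_* shifts, so label 16, exp 0, bos 1,
 * ttl 64 gives
 *
 *	mpls_hdr = (16 << 12) | (0 << 9) | (1 << 8) | 64 = 0x10140
 *
 * which is then written big-endian at the flex parser offset.
 */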
dr_match_param *mask)
{
	dr_ste_v0_build_tnl_mpls_over_udp_tag(mask, sb, sb->bit_mask);

	/* STEs with lookup type FLEX_PARSER_{0/1} include
	 * flex parsers_{0-3}/{4-7} respectively.
	 */
	sb->lu_type = sb->caps->flex_parser_id_mpls_over_udp <= DR_STE_MAX_FLEX_0_ID ?
		      DR_STE_V0_LU_TYPE_FLEX_PARSER_0 :
		      DR_STE_V0_LU_TYPE_FLEX_PARSER_1;

	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v0_build_tnl_mpls_over_udp_tag;
}

static int dr_ste_v0_build_tnl_mpls_over_gre_tag(struct dr_match_param *value,
						 struct dr_ste_build *sb,
						 uint8_t *tag)
{
	struct dr_match_misc2 *misc2 = &value->misc2;
	uint8_t *parser_ptr;
	uint8_t parser_id;
	uint32_t mpls_hdr;

	mpls_hdr = misc2->outer_first_mpls_over_gre_label << HDR_MPLS_OFFSET_LABEL;
	misc2->outer_first_mpls_over_gre_label = 0;
	mpls_hdr |= misc2->outer_first_mpls_over_gre_exp << HDR_MPLS_OFFSET_EXP;
	misc2->outer_first_mpls_over_gre_exp = 0;
	mpls_hdr |= misc2->outer_first_mpls_over_gre_s_bos << HDR_MPLS_OFFSET_S_BOS;
	misc2->outer_first_mpls_over_gre_s_bos = 0;
	mpls_hdr |= misc2->outer_first_mpls_over_gre_ttl << HDR_MPLS_OFFSET_TTL;
	misc2->outer_first_mpls_over_gre_ttl = 0;

	parser_id = sb->caps->flex_parser_id_mpls_over_gre;
	parser_ptr = dr_ste_calc_flex_parser_offset(tag, parser_id);
	*(__be32 *)parser_ptr = htobe32(mpls_hdr);

	return 0;
}

static void dr_ste_v0_build_tnl_mpls_over_gre_init(struct dr_ste_build *sb,
						   struct dr_match_param *mask)
{
	dr_ste_v0_build_tnl_mpls_over_gre_tag(mask, sb, sb->bit_mask);

	/* STEs with lookup type FLEX_PARSER_{0/1} include
	 * flex parsers_{0-3}/{4-7} respectively.
	 */
	sb->lu_type = sb->caps->flex_parser_id_mpls_over_gre <= DR_STE_MAX_FLEX_0_ID ?
		      DR_STE_V0_LU_TYPE_FLEX_PARSER_0 :
		      DR_STE_V0_LU_TYPE_FLEX_PARSER_1;

	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v0_build_tnl_mpls_over_gre_tag;
}

#define ICMP_TYPE_OFFSET_FIRST_DW 24
#define ICMP_CODE_OFFSET_FIRST_DW 16

static int dr_ste_v0_build_icmp_tag(struct dr_match_param *value,
				    struct dr_ste_build *sb,
				    uint8_t *tag)
{
	struct dr_match_misc3 *misc3 = &value->misc3;
	bool is_ipv4 = DR_MASK_IS_ICMPV4_SET(misc3);
	uint32_t *icmp_header_data;
	uint8_t *parser_ptr;
	uint8_t *icmp_type;
	uint8_t *icmp_code;
	uint32_t icmp_hdr;
	int dw0_location;
	int dw1_location;

	if (is_ipv4) {
		icmp_header_data = &misc3->icmpv4_header_data;
		icmp_type = &misc3->icmpv4_type;
		icmp_code = &misc3->icmpv4_code;
		dw0_location = sb->caps->flex_parser_id_icmp_dw0;
		dw1_location = sb->caps->flex_parser_id_icmp_dw1;
	} else {
		icmp_header_data = &misc3->icmpv6_header_data;
		icmp_type = &misc3->icmpv6_type;
		icmp_code = &misc3->icmpv6_code;
		dw0_location = sb->caps->flex_parser_id_icmpv6_dw0;
		dw1_location = sb->caps->flex_parser_id_icmpv6_dw1;
	}

	parser_ptr = dr_ste_calc_flex_parser_offset(tag, dw0_location);
	icmp_hdr = (*icmp_type << ICMP_TYPE_OFFSET_FIRST_DW) |
		   (*icmp_code << ICMP_CODE_OFFSET_FIRST_DW);
	*(__be32 *)parser_ptr = htobe32(icmp_hdr);
	*icmp_code = 0;
	*icmp_type = 0;

	parser_ptr = dr_ste_calc_flex_parser_offset(tag, dw1_location);
	*(__be32 *)parser_ptr = htobe32(*icmp_header_data);
	*icmp_header_data = 0;

	return 0;
}

static void dr_ste_v0_build_icmp_init(struct dr_ste_build *sb,
				      struct dr_match_param *mask)
{
	uint8_t parser_id;
	bool is_ipv4;

	dr_ste_v0_build_icmp_tag(mask, sb, sb->bit_mask);

	/* STEs with lookup type FLEX_PARSER_{0/1} include
	 * flex parsers_{0-3}/{4-7} respectively.
	 */
	is_ipv4 = DR_MASK_IS_ICMPV4_SET(&mask->misc3);
	parser_id = is_ipv4 ?
sb->caps->flex_parser_id_icmp_dw0 : sb->caps->flex_parser_id_icmpv6_dw0; sb->lu_type = parser_id <= DR_STE_MAX_FLEX_0_ID ? DR_STE_V0_LU_TYPE_FLEX_PARSER_0 : DR_STE_V0_LU_TYPE_FLEX_PARSER_1; sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_icmp_tag; } static int dr_ste_v0_build_general_purpose_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_misc2 *misc2 = &value->misc2; DR_STE_SET_TAG(general_purpose, tag, general_purpose_lookup_field, misc2, metadata_reg_a); return 0; } static void dr_ste_v0_build_general_purpose_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_general_purpose_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_V0_LU_TYPE_GENERAL_PURPOSE; sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_general_purpose_tag; } static int dr_ste_v0_build_eth_l4_misc_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_misc3 *misc3 = &value->misc3; if (sb->inner) { DR_STE_SET_TAG(eth_l4_misc, tag, seq_num, misc3, inner_tcp_seq_num); DR_STE_SET_TAG(eth_l4_misc, tag, ack_num, misc3, inner_tcp_ack_num); } else { DR_STE_SET_TAG(eth_l4_misc, tag, seq_num, misc3, outer_tcp_seq_num); DR_STE_SET_TAG(eth_l4_misc, tag, ack_num, misc3, outer_tcp_ack_num); } return 0; } static void dr_ste_v0_build_eth_l4_misc_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_eth_l4_misc_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_CALC_LU_TYPE(ETHL4_MISC, sb->rx, sb->inner); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_eth_l4_misc_tag; } static int dr_ste_v0_build_flex_parser_tnl_vxlan_gpe_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_misc3 *misc3 = &value->misc3; DR_STE_SET_TAG(flex_parser_tnl_vxlan_gpe, tag, outer_vxlan_gpe_flags, misc3, outer_vxlan_gpe_flags); DR_STE_SET_TAG(flex_parser_tnl_vxlan_gpe, tag, outer_vxlan_gpe_next_protocol, misc3, outer_vxlan_gpe_next_protocol); DR_STE_SET_TAG(flex_parser_tnl_vxlan_gpe, tag, outer_vxlan_gpe_vni, misc3, outer_vxlan_gpe_vni); return 0; } static void dr_ste_v0_build_flex_parser_tnl_vxlan_gpe_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_flex_parser_tnl_vxlan_gpe_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_V0_LU_TYPE_FLEX_PARSER_TNL_HEADER; sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_flex_parser_tnl_vxlan_gpe_tag; } static int dr_ste_v0_build_flex_parser_tnl_geneve_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_misc *misc = &value->misc; DR_STE_SET_TAG(flex_parser_tnl_geneve, tag, geneve_protocol_type, misc, geneve_protocol_type); DR_STE_SET_TAG(flex_parser_tnl_geneve, tag, geneve_oam, misc, geneve_oam); DR_STE_SET_TAG(flex_parser_tnl_geneve, tag, geneve_opt_len, misc, geneve_opt_len); DR_STE_SET_TAG(flex_parser_tnl_geneve, tag, geneve_vni, misc, geneve_vni); return 0; } static void dr_ste_v0_build_flex_parser_tnl_geneve_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_flex_parser_tnl_geneve_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_V0_LU_TYPE_FLEX_PARSER_TNL_HEADER; sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_flex_parser_tnl_geneve_tag; } static int dr_ste_v0_build_flex_parser_tnl_geneve_tlv_opt_tag(struct 
dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { uint8_t parser_id = sb->caps->flex_parser_id_geneve_opt_0; uint8_t *parser_ptr = dr_ste_calc_flex_parser_offset(tag, parser_id); struct dr_match_misc3 *misc3 = &value->misc3; *(__be32 *)parser_ptr = htobe32(misc3->geneve_tlv_option_0_data); misc3->geneve_tlv_option_0_data = 0; return 0; } static void dr_ste_v0_build_flex_parser_tnl_geneve_tlv_opt_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_flex_parser_tnl_geneve_tlv_opt_tag(mask, sb, sb->bit_mask); /* STEs with lookup type FLEX_PARSER_{0/1} includes * flex parsers_{0-3}/{4-7} respectively. */ sb->lu_type = sb->caps->flex_parser_id_geneve_opt_0 <= DR_STE_MAX_FLEX_0_ID ? DR_STE_V0_LU_TYPE_FLEX_PARSER_0 : DR_STE_V0_LU_TYPE_FLEX_PARSER_1; sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_flex_parser_tnl_geneve_tlv_opt_tag; } static int dr_ste_v0_build_flex_parser_tnl_gtpu_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_misc3 *misc3 = &value->misc3; DR_STE_SET_TAG(flex_parser_tnl_gtpu, tag, gtpu_msg_flags, misc3, gtpu_msg_flags); DR_STE_SET_TAG(flex_parser_tnl_gtpu, tag, gtpu_msg_type, misc3, gtpu_msg_type); DR_STE_SET_TAG(flex_parser_tnl_gtpu, tag, gtpu_teid, misc3, gtpu_teid); return 0; } static void dr_ste_v0_build_flex_parser_tnl_gtpu_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_flex_parser_tnl_gtpu_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_V0_LU_TYPE_FLEX_PARSER_TNL_HEADER; sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_flex_parser_tnl_gtpu_tag; } static int dr_ste_v0_build_tnl_gtpu_flex_parser_0_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { if (sb->caps->flex_parser_id_gtpu_dw_0 <= DR_STE_MAX_FLEX_0_ID) DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_dw_0, sb->caps, &value->misc3); if (sb->caps->flex_parser_id_gtpu_teid <= DR_STE_MAX_FLEX_0_ID) DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_teid, sb->caps, &value->misc3); if (sb->caps->flex_parser_id_gtpu_dw_2 <= DR_STE_MAX_FLEX_0_ID) DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_dw_2, sb->caps, &value->misc3); if (sb->caps->flex_parser_id_gtpu_first_ext_dw_0 <= DR_STE_MAX_FLEX_0_ID) DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_first_ext_dw_0, sb->caps, &value->misc3); return 0; } static void dr_ste_v0_build_tnl_gtpu_flex_parser_0_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_tnl_gtpu_flex_parser_0_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_V0_LU_TYPE_FLEX_PARSER_0; sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_tnl_gtpu_flex_parser_0_tag; } static int dr_ste_v0_build_tnl_gtpu_flex_parser_1_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { if (sb->caps->flex_parser_id_gtpu_dw_0 > DR_STE_MAX_FLEX_0_ID) DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_dw_0, sb->caps, &value->misc3); if (sb->caps->flex_parser_id_gtpu_teid > DR_STE_MAX_FLEX_0_ID) DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_teid, sb->caps, &value->misc3); if (sb->caps->flex_parser_id_gtpu_dw_2 > DR_STE_MAX_FLEX_0_ID) DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_dw_2, sb->caps, &value->misc3); if (sb->caps->flex_parser_id_gtpu_first_ext_dw_0 > DR_STE_MAX_FLEX_0_ID) DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_first_ext_dw_0, sb->caps, &value->misc3); return 0; } static void dr_ste_v0_build_tnl_gtpu_flex_parser_1_init(struct dr_ste_build *sb, struct 
dr_match_param *mask) { dr_ste_v0_build_tnl_gtpu_flex_parser_1_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_V0_LU_TYPE_FLEX_PARSER_1; sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_tnl_gtpu_flex_parser_1_tag; } static int dr_ste_v0_build_register_0_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_misc2 *misc2 = &value->misc2; DR_STE_SET_TAG(register_0, tag, register_0_h, misc2, metadata_reg_c_0); DR_STE_SET_TAG(register_0, tag, register_0_l, misc2, metadata_reg_c_1); DR_STE_SET_TAG(register_0, tag, register_1_h, misc2, metadata_reg_c_2); DR_STE_SET_TAG(register_0, tag, register_1_l, misc2, metadata_reg_c_3); return 0; } static void dr_ste_v0_build_register_0_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_register_0_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_V0_LU_TYPE_STEERING_REGISTERS_0; sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_register_0_tag; } static int dr_ste_v0_build_register_1_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_misc2 *misc2 = &value->misc2; DR_STE_SET_TAG(register_1, tag, register_2_h, misc2, metadata_reg_c_4); DR_STE_SET_TAG(register_1, tag, register_2_l, misc2, metadata_reg_c_5); DR_STE_SET_TAG(register_1, tag, register_3_h, misc2, metadata_reg_c_6); DR_STE_SET_TAG(register_1, tag, register_3_l, misc2, metadata_reg_c_7); return 0; } static void dr_ste_v0_build_register_1_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_register_1_tag(mask, sb, sb->bit_mask); sb->lu_type = DR_STE_V0_LU_TYPE_STEERING_REGISTERS_1; sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_register_1_tag; } static void dr_ste_v0_build_src_gvmi_qpn_bit_mask(struct dr_match_param *value, struct dr_ste_build *sb) { struct dr_match_misc *misc_mask = &value->misc; uint8_t *bit_mask = sb->bit_mask; if (sb->rx && misc_mask->source_port) DR_STE_SET(src_gvmi_qp, bit_mask, functional_lb, 1); DR_STE_SET_ONES(src_gvmi_qp, bit_mask, source_gvmi, misc_mask, source_port); DR_STE_SET_ONES(src_gvmi_qp, bit_mask, source_qp, misc_mask, source_sqn); } static int dr_ste_v0_build_src_gvmi_qpn_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_misc *misc = &value->misc; struct dr_devx_vport_cap *vport_cap; uint8_t *bit_mask = sb->bit_mask; bool source_gvmi_set; DR_STE_SET_TAG(src_gvmi_qp, tag, source_qp, misc, source_sqn); source_gvmi_set = DR_STE_GET(src_gvmi_qp, bit_mask, source_gvmi); if (source_gvmi_set) { vport_cap = dr_vports_table_get_vport_cap(sb->caps, misc->source_port); if (!vport_cap) return errno; if (vport_cap->vport_gvmi) DR_STE_SET(src_gvmi_qp, tag, source_gvmi, vport_cap->vport_gvmi); /* Make sure that this packet is not coming from the wire since * wire GVMI is set to 0 and can be aliased with another port */ if (sb->rx && misc->source_port != WIRE_PORT) DR_STE_SET(src_gvmi_qp, tag, functional_lb, 1); misc->source_port = 0; } return 0; } static void dr_ste_v0_build_src_gvmi_qpn_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v0_build_src_gvmi_qpn_bit_mask(mask, sb); sb->lu_type = DR_STE_V0_LU_TYPE_SRC_GVMI_AND_QP; sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_src_gvmi_qpn_tag; } static void dr_ste_set_flex_parser(uint16_t lu_type, uint32_t *misc4_field_id, uint32_t 
*misc4_field_value, bool *parser_is_used, uint8_t *tag) { uint32_t id = *misc4_field_id; uint8_t *parser_ptr; bool skip_parser; /* Since this is a shared function to set flex parsers, * we need to skip it if lookup type and parser ID doesn't match */ skip_parser = id <= DR_STE_MAX_FLEX_0_ID ? lu_type != DR_STE_V0_LU_TYPE_FLEX_PARSER_0 : lu_type != DR_STE_V0_LU_TYPE_FLEX_PARSER_1; skip_parser = skip_parser || (id >= NUM_OF_FLEX_PARSERS); if (skip_parser || parser_is_used[id]) return; parser_is_used[id] = true; parser_ptr = dr_ste_calc_flex_parser_offset(tag, id); *(__be32 *)parser_ptr = htobe32(*misc4_field_value); *misc4_field_id = 0; *misc4_field_value = 0; } static int dr_ste_v0_build_flex_parser_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_misc4 *misc_4_mask = &value->misc4; bool parser_is_used[NUM_OF_FLEX_PARSERS] = {}; dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_0, &misc_4_mask->prog_sample_field_value_0, parser_is_used, tag); dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_1, &misc_4_mask->prog_sample_field_value_1, parser_is_used, tag); dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_2, &misc_4_mask->prog_sample_field_value_2, parser_is_used, tag); dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_3, &misc_4_mask->prog_sample_field_value_3, parser_is_used, tag); dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_4, &misc_4_mask->prog_sample_field_value_4, parser_is_used, tag); dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_5, &misc_4_mask->prog_sample_field_value_5, parser_is_used, tag); dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_6, &misc_4_mask->prog_sample_field_value_6, parser_is_used, tag); dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_7, &misc_4_mask->prog_sample_field_value_7, parser_is_used, tag); return 0; } static void dr_ste_v0_build_flex_parser_0_init(struct dr_ste_build *sb, struct dr_match_param *mask) { sb->lu_type = DR_STE_V0_LU_TYPE_FLEX_PARSER_0; dr_ste_v0_build_flex_parser_tag(mask, sb, sb->bit_mask); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_flex_parser_tag; } static void dr_ste_v0_build_flex_parser_1_init(struct dr_ste_build *sb, struct dr_match_param *mask) { sb->lu_type = DR_STE_V0_LU_TYPE_FLEX_PARSER_1; dr_ste_v0_build_flex_parser_tag(mask, sb, sb->bit_mask); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_flex_parser_tag; } static int dr_ste_v0_build_tunnel_header_0_1_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_misc5 *misc5 = &value->misc5; DR_STE_SET_TAG(tunnel_header, tag, tunnel_header_dw0, misc5, tunnel_header_0); DR_STE_SET_TAG(tunnel_header, tag, tunnel_header_dw1, misc5, tunnel_header_1); return 0; } static void dr_ste_v0_build_tunnel_header_0_1_init(struct dr_ste_build *sb, struct dr_match_param *mask) { sb->lu_type = DR_STE_V0_LU_TYPE_TUNNEL_HEADER; dr_ste_v0_build_tunnel_header_0_1_tag(mask, sb, sb->bit_mask); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v0_build_tunnel_header_0_1_tag; } static struct dr_ste_ctx ste_ctx_v0 = { /* Builders */ .build_eth_l2_src_dst_init = &dr_ste_v0_build_eth_l2_src_dst_init, .build_eth_l3_ipv6_src_init = &dr_ste_v0_build_eth_l3_ipv6_src_init, .build_eth_l3_ipv6_dst_init = 
&dr_ste_v0_build_eth_l3_ipv6_dst_init, .build_eth_l3_ipv4_5_tuple_init = &dr_ste_v0_build_eth_l3_ipv4_5_tuple_init, .build_eth_l2_src_init = &dr_ste_v0_build_eth_l2_src_init, .build_eth_l2_dst_init = &dr_ste_v0_build_eth_l2_dst_init, .build_eth_l2_tnl_init = &dr_ste_v0_build_eth_l2_tnl_init, .build_eth_l3_ipv4_misc_init = &dr_ste_v0_build_eth_l3_ipv4_misc_init, .build_eth_ipv6_l3_l4_init = &dr_ste_v0_build_eth_ipv6_l3_l4_init, .build_mpls_init = &dr_ste_v0_build_mpls_init, .build_tnl_gre_init = &dr_ste_v0_build_tnl_gre_init, .build_tnl_mpls_over_udp_init = &dr_ste_v0_build_tnl_mpls_over_udp_init, .build_tnl_mpls_over_gre_init = &dr_ste_v0_build_tnl_mpls_over_gre_init, .build_icmp_init = &dr_ste_v0_build_icmp_init, .build_general_purpose_init = &dr_ste_v0_build_general_purpose_init, .build_eth_l4_misc_init = &dr_ste_v0_build_eth_l4_misc_init, .build_tnl_vxlan_gpe_init = &dr_ste_v0_build_flex_parser_tnl_vxlan_gpe_init, .build_tnl_geneve_init = &dr_ste_v0_build_flex_parser_tnl_geneve_init, .build_tnl_geneve_tlv_opt_init = &dr_ste_v0_build_flex_parser_tnl_geneve_tlv_opt_init, .build_tnl_gtpu_init = &dr_ste_v0_build_flex_parser_tnl_gtpu_init, .build_tnl_gtpu_flex_parser_0 = &dr_ste_v0_build_tnl_gtpu_flex_parser_0_init, .build_tnl_gtpu_flex_parser_1 = &dr_ste_v0_build_tnl_gtpu_flex_parser_1_init, .build_register_0_init = &dr_ste_v0_build_register_0_init, .build_register_1_init = &dr_ste_v0_build_register_1_init, .build_src_gvmi_qpn_init = &dr_ste_v0_build_src_gvmi_qpn_init, .build_flex_parser_0_init = &dr_ste_v0_build_flex_parser_0_init, .build_flex_parser_1_init = &dr_ste_v0_build_flex_parser_1_init, .build_tunnel_header_init = &dr_ste_v0_build_tunnel_header_0_1_init, /* Getters and Setters */ .ste_init = &dr_ste_v0_init, .set_next_lu_type = &dr_ste_v0_set_next_lu_type, .get_next_lu_type = &dr_ste_v0_get_next_lu_type, .set_miss_addr = &dr_ste_v0_set_miss_addr, .get_miss_addr = &dr_ste_v0_get_miss_addr, .set_hit_addr = &dr_ste_v0_set_hit_addr, .set_byte_mask = &dr_ste_v0_set_byte_mask, .get_byte_mask = &dr_ste_v0_get_byte_mask, .set_ctrl_always_hit_htbl = &dr_ste_v0_set_ctrl_always_hit_htbl, .set_ctrl_always_miss = &dr_ste_v0_set_ctrl_always_miss, .set_hit_gvmi = &dr_ste_v0_set_hit_gvmi, /* Actions */ .actions_caps = DR_STE_CTX_ACTION_CAP_NONE, .action_modify_field_arr = dr_ste_v0_action_modify_field_arr, .action_modify_field_arr_size = ARRAY_SIZE(dr_ste_v0_action_modify_field_arr), .set_actions_rx = &dr_ste_v0_set_actions_rx, .set_actions_tx = &dr_ste_v0_set_actions_tx, .set_action_set = &dr_ste_v0_set_action_set, .set_action_add = &dr_ste_v0_set_action_add, .set_action_copy = &dr_ste_v0_set_action_copy, .get_action_hw_field = &dr_ste_v0_get_action_hw_field, .set_action_decap_l3_list = &dr_ste_v0_set_action_decap_l3_list, }; struct dr_ste_ctx *dr_ste_get_ctx_v0(void) { return &ste_ctx_v0; } rdma-core-56.1/providers/mlx5/dr_ste_v1.c000066400000000000000000003540021477342711600202720ustar00rootroot00000000000000/* * Copyright (c) 2020, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include "dr_ste_v1.h" static const struct dr_ste_action_modify_field dr_ste_v1_action_modify_field_arr[] = { [MLX5_ACTION_IN_FIELD_OUT_SMAC_47_16] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_SRC_L2_OUT_0, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_SMAC_15_0] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_SRC_L2_OUT_1, .start = 16, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_ETHERTYPE] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_1, .start = 0, .end = 15, }, [MLX5_ACTION_IN_FIELD_OUT_DMAC_47_16] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_0, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_DMAC_15_0] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_1, .start = 16, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_IP_DSCP] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_L3_OUT_0, .start = 18, .end = 23, }, [MLX5_ACTION_IN_FIELD_OUT_IP_ECN] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_L3_OUT_0, .start = 16, .end = 17, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_FLAGS] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_L4_OUT_1, .start = 16, .end = 24, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_SPORT] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_L4_OUT_0, .start = 16, .end = 31, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_DPORT] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_L4_OUT_0, .start = 0, .end = 15, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, }, [MLX5_ACTION_IN_FIELD_OUT_IP_TTL] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_L3_OUT_0, .start = 8, .end = 15, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, }, [MLX5_ACTION_IN_FIELD_OUT_IPV6_HOPLIMIT] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_L3_OUT_0, .start = 8, .end = 15, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_UDP_SPORT] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_L4_OUT_0, .start = 16, .end = 31, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_UDP, }, [MLX5_ACTION_IN_FIELD_OUT_UDP_DPORT] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_L4_OUT_0, .start = 0, .end = 15, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_UDP, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV6_127_96] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_0, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV6_95_64] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_1, .start = 0, .end = 31, .l3_type = 
DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV6_63_32] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_2, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV6_31_0] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_3, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV6_127_96] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_0, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV6_95_64] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_1, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV6_63_32] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_2, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV6_31_0] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_3, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV4] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_IPV4_OUT_0, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV4] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_IPV4_OUT_1, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGA] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_GNRL_PURPOSE, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGB] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_METADATA_2_CQE, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_0] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_REGISTER_0_0, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_1] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_REGISTER_0_1, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_2] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_REGISTER_1_0, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_3] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_REGISTER_1_1, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_4] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_REGISTER_2_0, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_5] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_REGISTER_2_1, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_SEQ_NUM] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_TCP_MISC_0, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_ACK_NUM] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_TCP_MISC_1, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_FIRST_VID] = { .hw_field = DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_2, .start = 0, .end = 15, }, [MLX5_ACTION_IN_FIELD_OUT_GTPU_TEID] = { .flags = DR_STE_ACTION_MODIFY_FLAG_REQ_FLEX, .start = 0, .end = 31, }, }; static const struct dr_ste_action_modify_field dr_ste_v1_action_modify_flex_field_arr[] = { {.hw_field = DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_0, .start = 0, .end = 31,}, {.hw_field = DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_1, .start = 0, .end = 31,}, {.hw_field = DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_2, .start = 0, .end = 31,}, {.hw_field = DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_3, .start = 0, .end = 31,}, {.hw_field = DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_4, .start = 0, .end = 31,}, {.hw_field = DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_5, .start = 0, .end = 31,}, {.hw_field = DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_6, .start = 0, .end = 31,}, {.hw_field = DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_7, .start = 0, .end = 31,}, }; static 
void dr_ste_v1_set_entry_type(uint8_t *hw_ste_p, uint8_t entry_type)
{
	DR_STE_SET(match_bwc_v1, hw_ste_p, entry_format, entry_type);
}

static uint8_t dr_ste_v1_get_entry_type(uint8_t *hw_ste_p)
{
	return DR_STE_GET(match_bwc_v1, hw_ste_p, entry_format);
}

static void dr_ste_v1_set_miss_addr(uint8_t *hw_ste_p, uint64_t miss_addr)
{
	uint64_t index = miss_addr >> 6;

	DR_STE_SET(match_bwc_v1, hw_ste_p, miss_address_39_32, index >> 26);
	DR_STE_SET(match_bwc_v1, hw_ste_p, miss_address_31_6, index);
}

static uint64_t dr_ste_v1_get_miss_addr(uint8_t *hw_ste_p)
{
	uint64_t index =
		(DR_STE_GET(match_bwc_v1, hw_ste_p, miss_address_31_6) |
		 DR_STE_GET(match_bwc_v1, hw_ste_p, miss_address_39_32) << 26);

	return index << 6;
}

static void dr_ste_v1_set_byte_mask(uint8_t *hw_ste_p, uint16_t byte_mask)
{
	if (dr_ste_v1_get_entry_type(hw_ste_p) != DR_STE_V1_TYPE_MATCH)
		DR_STE_SET(match_bwc_v1, hw_ste_p, byte_mask, byte_mask);
}

static uint16_t dr_ste_v1_get_byte_mask(uint8_t *hw_ste_p)
{
	return DR_STE_GET(match_bwc_v1, hw_ste_p, byte_mask);
}

static void dr_ste_v1_set_lu_type(uint8_t *hw_ste_p, uint16_t lu_type)
{
	DR_STE_SET(match_bwc_v1, hw_ste_p, entry_format, lu_type >> 8);
	DR_STE_SET(match_bwc_v1, hw_ste_p, match_definer_ctx_idx, lu_type & 0xFF);
}

static void dr_ste_v1_set_next_lu_type(uint8_t *hw_ste_p, uint16_t lu_type)
{
	if (dr_ste_v1_get_entry_type(hw_ste_p) != DR_STE_V1_TYPE_MATCH)
		DR_STE_SET(match_bwc_v1, hw_ste_p, next_entry_format, lu_type >> 8);
	DR_STE_SET(match_bwc_v1, hw_ste_p, hash_definer_ctx_idx, lu_type & 0xFF);
}

static void dr_ste_v1_set_hit_gvmi(uint8_t *hw_ste_p, uint16_t gvmi)
{
	DR_STE_SET(match_bwc_v1, hw_ste_p, next_table_base_63_48, gvmi);
}

static uint16_t dr_ste_v1_get_next_lu_type(uint8_t *hw_ste_p)
{
	uint8_t mode = DR_STE_GET(match_bwc_v1, hw_ste_p, next_entry_format);
	uint8_t index = DR_STE_GET(match_bwc_v1, hw_ste_p, hash_definer_ctx_idx);

	return (mode << 8 | index);
}

static void dr_ste_v1_set_hit_addr(uint8_t *hw_ste_p, uint64_t icm_addr,
				   uint32_t ht_size)
{
	uint64_t index = (icm_addr >> 5) | ht_size;

	DR_STE_SET(match_bwc_v1, hw_ste_p, next_table_base_39_32_size, index >> 27);
	DR_STE_SET(match_bwc_v1, hw_ste_p, next_table_base_31_5_size, index);
}

static bool dr_ste_v1_is_match_ste(uint16_t lu_type)
{
	return ((lu_type >> 8) == DR_STE_V1_TYPE_MATCH);
}

static void dr_ste_v1_init(uint8_t *hw_ste_p, uint16_t lu_type,
			   bool is_rx, uint16_t gvmi)
{
	dr_ste_v1_set_lu_type(hw_ste_p, lu_type);

	/* No need for GVMI on match ste */
	if (!dr_ste_v1_is_match_ste(lu_type))
		DR_STE_SET(match_bwc_v1, hw_ste_p, gvmi, gvmi);

	dr_ste_v1_set_next_lu_type(hw_ste_p, DR_STE_LU_TYPE_DONT_CARE);
	DR_STE_SET(match_bwc_v1, hw_ste_p, next_table_base_63_48, gvmi);
	DR_STE_SET(match_bwc_v1, hw_ste_p, miss_address_63_48, gvmi);
}

static void dr_ste_v1_set_ctrl_always_hit_htbl(uint8_t *hw_ste_p,
					       uint16_t byte_mask,
					       uint16_t lu_type,
					       uint64_t icm_addr,
					       uint32_t num_of_entries,
					       uint16_t gvmi)
{
	bool target_is_match = dr_ste_v1_is_match_ste(lu_type);

	if (target_is_match) {
		uint32_t *first_action;

		/* Convert STE to MATCH */
		dr_ste_v1_set_entry_type(hw_ste_p, DR_STE_V1_TYPE_MATCH);
		dr_ste_v1_set_miss_addr(hw_ste_p, 0);

		first_action = (uint32_t *)DEVX_ADDR_OF(ste_mask_and_match_v1,
							hw_ste_p, action);
		*first_action = 0;
	} else {
		/* Convert STE to BWC */
		dr_ste_v1_set_entry_type(hw_ste_p, DR_STE_V1_TYPE_BWC_BYTE);
		dr_ste_v1_set_byte_mask(hw_ste_p, byte_mask);
		DR_STE_SET(match_bwc_v1, hw_ste_p, gvmi, gvmi);
		DR_STE_SET(match_bwc_v1, hw_ste_p, mask_mode, 0);
	}

	dr_ste_v1_set_next_lu_type(hw_ste_p, lu_type);
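	/* Note on the hit address written just below: dr_ste_v1_set_hit_addr()
	 * packs the 32-byte-aligned ICM address (icm_addr >> 5) together with
	 * the hash table size into a single index, then splits that index
	 * across the next_table_base_39_32_size / next_table_base_31_5_size
	 * fields, so one value updates both the next-table pointer and its
	 * size.
	 */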
dr_ste_v1_set_hit_addr(hw_ste_p, icm_addr, num_of_entries); } static void dr_ste_v1_set_ctrl_always_miss(uint8_t *hw_ste_p, uint64_t miss_addr, uint16_t gvmi) { dr_ste_v1_set_hit_addr(hw_ste_p, -1, 0); dr_ste_v1_set_next_lu_type(hw_ste_p, DR_STE_V1_LU_TYPE_DONT_CARE); dr_ste_v1_set_miss_addr(hw_ste_p, miss_addr); } static void dr_ste_v1_prepare_for_postsend(uint8_t *hw_ste_p, uint32_t ste_size) { uint8_t entry_type = dr_ste_v1_get_entry_type(hw_ste_p); uint8_t *tag = hw_ste_p + DR_STE_SIZE_CTRL; uint8_t *mask = tag + DR_STE_SIZE_TAG; uint8_t tmp_tag[DR_STE_SIZE_TAG] = {}; if (ste_size == DR_STE_SIZE_CTRL) return; if (ste_size != DR_STE_SIZE) assert(false); if (entry_type == DR_STE_V1_TYPE_MATCH) return; /* Backup tag */ memcpy(tmp_tag, tag, DR_STE_SIZE_TAG); /* Swap mask and tag both are the same size */ memcpy(tag, mask, DR_STE_SIZE_MASK); memcpy(mask, tmp_tag, DR_STE_SIZE_TAG); } static void dr_ste_v1_set_rx_flow_tag(uint8_t *s_action, uint32_t flow_tag) { DR_STE_SET(single_action_flow_tag_v1, s_action, action_id, DR_STE_V1_ACTION_ID_FLOW_TAG); DR_STE_SET(single_action_flow_tag_v1, s_action, flow_tag, flow_tag); } static void dr_ste_v1_set_counter_id(uint8_t *hw_ste_p, uint32_t ctr_id) { DR_STE_SET(match_bwc_v1, hw_ste_p, counter_id, ctr_id); } void dr_ste_v1_set_reparse(uint8_t *hw_ste_p) { DR_STE_SET(match_bwc_v1, hw_ste_p, reparse, 1); } static void dr_ste_v1_set_encap(uint8_t *hw_ste_p, uint8_t *d_action, uint32_t reformat_id, int size) { DR_STE_SET(double_action_insert_with_ptr_v1, d_action, action_id, DR_STE_V1_ACTION_ID_INSERT_POINTER); /* The hardware expects here size in words (2 bytes) */ DR_STE_SET(double_action_insert_with_ptr_v1, d_action, size, size / 2); DR_STE_SET(double_action_insert_with_ptr_v1, d_action, pointer, reformat_id); DR_STE_SET(double_action_insert_with_ptr_v1, d_action, attributes, DR_STE_V1_ACTION_INSERT_PTR_ATTR_ENCAP); dr_ste_v1_set_reparse(hw_ste_p); } static void dr_ste_v1_set_push_vlan(uint8_t *ste, uint8_t *d_action, uint32_t vlan_hdr) { DR_STE_SET(double_action_insert_with_inline_v1, d_action, action_id, DR_STE_V1_ACTION_ID_INSERT_INLINE); /* The hardware expects here offset to vlan header in words (2 byte) */ DR_STE_SET(double_action_insert_with_inline_v1, d_action, start_offset, HDR_LEN_L2_MACS >> 1); DR_STE_SET(double_action_insert_with_inline_v1, d_action, inline_data, vlan_hdr); dr_ste_v1_set_reparse(ste); } static void dr_ste_v1_set_pop_vlan(uint8_t *hw_ste_p, uint8_t *s_action, uint8_t vlans_num) { DR_STE_SET(single_action_remove_header_size_v1, s_action, action_id, DR_STE_V1_ACTION_ID_REMOVE_BY_SIZE); DR_STE_SET(single_action_remove_header_size_v1, s_action, start_anchor, DR_STE_HEADER_ANCHOR_1ST_VLAN); /* The hardware expects here size in words (2 byte) */ DR_STE_SET(single_action_remove_header_size_v1, s_action, remove_size, (HDR_LEN_L2_VLAN >> 1) * vlans_num); dr_ste_v1_set_reparse(hw_ste_p); } static void dr_ste_v1_set_encap_l3(uint8_t *hw_ste_p, uint8_t *frst_s_action, uint8_t *scnd_d_action, uint32_t reformat_id, int size) { /* Remove L2 headers */ DR_STE_SET(single_action_remove_header_v1, frst_s_action, action_id, DR_STE_V1_ACTION_ID_REMOVE_HEADER_TO_HEADER); DR_STE_SET(single_action_remove_header_v1, frst_s_action, end_anchor, DR_STE_HEADER_ANCHOR_IPV6_IPV4); /* Encapsulate with given reformat ID */ DR_STE_SET(double_action_insert_with_ptr_v1, scnd_d_action, action_id, DR_STE_V1_ACTION_ID_INSERT_POINTER); /* The hardware expects here size in words (2 bytes) */ DR_STE_SET(double_action_insert_with_ptr_v1, scnd_d_action, size, size / 
2); DR_STE_SET(double_action_insert_with_ptr_v1, scnd_d_action, pointer, reformat_id); DR_STE_SET(double_action_insert_with_ptr_v1, scnd_d_action, attributes, DR_STE_V1_ACTION_INSERT_PTR_ATTR_ENCAP); dr_ste_v1_set_reparse(hw_ste_p); } static void dr_ste_v1_set_rx_decap(uint8_t *hw_ste_p, uint8_t *s_action) { DR_STE_SET(single_action_remove_header_v1, s_action, action_id, DR_STE_V1_ACTION_ID_REMOVE_HEADER_TO_HEADER); DR_STE_SET(single_action_remove_header_v1, s_action, decap, 1); DR_STE_SET(single_action_remove_header_v1, s_action, vni_to_cqe, 1); DR_STE_SET(single_action_remove_header_v1, s_action, end_anchor, DR_STE_HEADER_ANCHOR_INNER_MAC); dr_ste_v1_set_reparse(hw_ste_p); } static void dr_ste_v1_set_accelerated_rewrite_actions(uint8_t *hw_ste_p, uint8_t *d_action, uint16_t num_of_actions, uint32_t re_write_pat, uint32_t re_write_args, uint8_t *action_data) { if (action_data) { memcpy(d_action, action_data, DR_MODIFY_ACTION_SIZE); } else { DR_STE_SET(double_action_accelerated_modify_action_list_v1, d_action, action_id, DR_STE_V1_ACTION_ID_ACCELERATED_LIST); DR_STE_SET(double_action_accelerated_modify_action_list_v1, d_action, modify_actions_pattern_pointer, re_write_pat); DR_STE_SET(double_action_accelerated_modify_action_list_v1, d_action, number_of_modify_actions, num_of_actions); DR_STE_SET(double_action_accelerated_modify_action_list_v1, d_action, modify_actions_argument_pointer, re_write_args); } dr_ste_v1_set_reparse(hw_ste_p); } static void dr_ste_v1_set_basic_rewrite_actions(uint8_t *hw_ste_p, uint8_t *action, uint16_t num_of_actions, uint32_t re_write_index) { DR_STE_SET(single_action_modify_list_v1, action, action_id, DR_STE_V1_ACTION_ID_MODIFY_LIST); DR_STE_SET(single_action_modify_list_v1, action, num_of_modify_actions, num_of_actions); DR_STE_SET(single_action_modify_list_v1, action, modify_actions_ptr, re_write_index); dr_ste_v1_set_reparse(hw_ste_p); } static void dr_ste_v1_set_rewrite_actions(uint8_t *hw_ste_p, uint8_t *d_action, uint16_t num_of_actions, uint32_t re_write_pat, uint32_t re_write_args, uint8_t *action_data) { if (re_write_pat != DR_INVALID_PATTERN_INDEX) return dr_ste_v1_set_accelerated_rewrite_actions(hw_ste_p, d_action, num_of_actions, re_write_pat, re_write_args, action_data); /* fall back to the code that doesn't support accelerated modify-header */ return dr_ste_v1_set_basic_rewrite_actions(hw_ste_p, d_action, num_of_actions, re_write_args); } static inline void dr_ste_v1_arr_init_next_match(uint8_t **last_ste, uint32_t *added_stes, uint16_t gvmi) { uint8_t *action; (*added_stes)++; *last_ste += DR_STE_SIZE; dr_ste_v1_init(*last_ste, DR_STE_V1_LU_TYPE_MATCH | DR_STE_V1_LU_TYPE_DONT_CARE, 0, gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, *last_ste, action); memset(action, 0, DEVX_FLD_SZ_BYTES(ste_mask_and_match_v1, action)); } static void dr_ste_v1_set_aso_first_hit(uint8_t *d_action, uint32_t object_id, uint32_t offset, uint8_t dest_reg_id, bool set) { DR_STE_SET(double_action_aso_v1, d_action, action_id, DR_STE_V1_ACTION_ID_ASO); DR_STE_SET(double_action_aso_v1, d_action, aso_context_number, object_id + (offset / MLX5_ASO_FIRST_HIT_NUM_PER_OBJ)); /* Convert reg_c index to HW 64bit index */ DR_STE_SET(double_action_aso_v1, d_action, dest_reg_id, (dest_reg_id - 1) / 2); DR_STE_SET(double_action_aso_v1, d_action, aso_context_type, DR_STE_V1_ASO_CTX_TYPE_FIRST_HIT); DR_STE_SET(double_action_aso_v1, d_action, first_hit.line_id, offset % MLX5_ASO_FIRST_HIT_NUM_PER_OBJ); /* In HW 0 is for set and 1 is for just read */ DR_STE_SET(double_action_aso_v1, 
d_action, first_hit.set, !set); } static void dr_ste_v1_set_aso_flow_meter(uint8_t *d_action, uint32_t object_id, uint32_t offset, uint8_t dest_reg_id, uint8_t initial_color) { DR_STE_SET(double_action_aso_v1, d_action, action_id, DR_STE_V1_ACTION_ID_ASO); DR_STE_SET(double_action_aso_v1, d_action, aso_context_number, object_id + (offset / MLX5_ASO_FLOW_METER_NUM_PER_OBJ)); /* Convert reg_c index to HW 64bit index */ DR_STE_SET(double_action_aso_v1, d_action, dest_reg_id, (dest_reg_id - 1) / 2); DR_STE_SET(double_action_aso_v1, d_action, aso_context_type, DR_STE_V1_ASO_CTX_TYPE_POLICERS); DR_STE_SET(double_action_aso_v1, d_action, flow_meter.line_id, offset % MLX5_ASO_FLOW_METER_NUM_PER_OBJ); DR_STE_SET(double_action_aso_v1, d_action, flow_meter.initial_color, initial_color); } void dr_ste_v1_set_aso_ct(uint8_t *d_action, uint32_t object_id, uint32_t offset, uint8_t dest_reg_id, bool direction) { DR_STE_SET(double_action_aso_v1, d_action, action_id, DR_STE_V1_ACTION_ID_ASO); DR_STE_SET(double_action_aso_v1, d_action, aso_context_number, object_id + (offset / MLX5_ASO_CT_NUM_PER_OBJ)); /* Convert reg_c index to HW 64bit index */ DR_STE_SET(double_action_aso_v1, d_action, dest_reg_id, (dest_reg_id - 1) / 2); DR_STE_SET(double_action_aso_v1, d_action, aso_context_type, DR_STE_V1_ASO_CTX_TYPE_CT); DR_STE_SET(double_action_aso_v1, d_action, ct.direction, direction); } static void dr_ste_v1_set_actions_tx(struct dr_ste_ctx *ste_ctx, uint8_t *action_type_set, uint32_t actions_caps, uint8_t *last_ste, struct dr_ste_actions_attr *attr, uint32_t *added_stes) { bool allow_modify_hdr = true; bool allow_pop_vlan = true; bool allow_encap = true; uint8_t action_sz; uint8_t *action; uint32_t ste_loc = 0; if (dr_ste_v1_get_entry_type(last_ste) == DR_STE_V1_TYPE_MATCH) { action_sz = DR_STE_ACTION_TRIPLE_SZ; action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); } else { action_sz = DR_STE_ACTION_DOUBLE_SZ; action = DEVX_ADDR_OF(ste_match_bwc_v1, last_ste, action); } if (action_type_set[DR_ACTION_TYP_ASO_FLOW_METER]) { if (action_sz < DR_STE_ACTION_DOUBLE_SZ) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; ste_loc++; } dr_ste_v1_set_aso_flow_meter(action, attr->aso->devx_obj->object_id, attr->aso->offset, attr->aso->dest_reg_id, attr->aso->flow_meter.initial_color); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; allow_pop_vlan = false; } if (action_type_set[DR_ACTION_TYP_POP_VLAN]) { if (action_sz < DR_STE_ACTION_SINGLE_SZ || !allow_pop_vlan) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; ste_loc++; } ste_ctx->set_pop_vlan(last_ste, action, attr->vlans.count_pop); action_sz -= DR_STE_ACTION_SINGLE_SZ; action += DR_STE_ACTION_SINGLE_SZ; /* Check if vlan_pop and modify_hdr on same STE is supported */ if (!(actions_caps & DR_STE_CTX_ACTION_CAP_POP_MDFY)) allow_modify_hdr = false; } if (action_type_set[DR_ACTION_TYP_ASO_CT]) { if (attr->aso->dmn != attr->dmn || action_sz < DR_STE_ACTION_DOUBLE_SZ) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; } if (attr->aso->dmn != attr->dmn) { attr->aso_ste_loc = ste_loc; } else { dr_ste_v1_set_aso_ct(action, attr->aso->devx_obj->object_id, attr->aso->offset, attr->aso->dest_reg_id, 
attr->aso->ct.direction); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; } } if (action_type_set[DR_ACTION_TYP_CTR]) dr_ste_v1_set_counter_id(last_ste, attr->ctr_id); if (action_type_set[DR_ACTION_TYP_MODIFY_HDR]) { if (!allow_modify_hdr || action_sz < DR_STE_ACTION_DOUBLE_SZ) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; } dr_ste_v1_set_rewrite_actions(last_ste, action, attr->modify_actions, attr->modify_pat_idx, attr->modify_index, attr->single_modify_action); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; allow_encap = false; } if (action_type_set[DR_ACTION_TYP_PUSH_VLAN]) { int i; for (i = 0; i < attr->vlans.count_push; i++) { if (action_sz < DR_STE_ACTION_DOUBLE_SZ || !allow_encap) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; allow_encap = true; } ste_ctx->set_push_vlan(last_ste, action, attr->vlans.headers[i]); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; } } if (action_type_set[DR_ACTION_TYP_ASO_FIRST_HIT]) { if (action_sz < DR_STE_ACTION_DOUBLE_SZ) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; allow_encap = true; } dr_ste_v1_set_aso_first_hit(action, attr->aso->devx_obj->object_id, attr->aso->offset, attr->aso->dest_reg_id, attr->aso->first_hit.set); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; } if (action_type_set[DR_ACTION_TYP_L2_TO_TNL_L2]) { if (!allow_encap || action_sz < DR_STE_ACTION_DOUBLE_SZ) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; allow_encap = true; } ste_ctx->set_encap(last_ste, action, attr->reformat_id, attr->reformat_size); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; } else if (action_type_set[DR_ACTION_TYP_L2_TO_TNL_L3]) { uint8_t *d_action; if (action_sz < DR_STE_ACTION_TRIPLE_SZ) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; } d_action = action + DR_STE_ACTION_SINGLE_SZ; ste_ctx->set_encap_l3(last_ste, action, d_action, attr->reformat_id, attr->reformat_size); action_sz -= DR_STE_ACTION_TRIPLE_SZ; action += DR_STE_ACTION_TRIPLE_SZ; } dr_ste_v1_set_hit_gvmi(last_ste, attr->hit_gvmi); dr_ste_v1_set_hit_addr(last_ste, attr->final_icm_addr, 1); } static void dr_ste_v1_set_actions_rx(struct dr_ste_ctx *ste_ctx, uint8_t *action_type_set, uint32_t actions_caps, uint8_t *last_ste, struct dr_ste_actions_attr *attr, uint32_t *added_stes) { bool allow_modify_hdr = true; bool allow_ctr = true; uint8_t action_sz; uint8_t *action; uint32_t ste_loc = 0; if (dr_ste_v1_get_entry_type(last_ste) == DR_STE_V1_TYPE_MATCH) { action_sz = DR_STE_ACTION_TRIPLE_SZ; action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); } else { action_sz = DR_STE_ACTION_DOUBLE_SZ; action = DEVX_ADDR_OF(ste_match_bwc_v1, last_ste, action); } if (action_type_set[DR_ACTION_TYP_TNL_L3_TO_L2]) { dr_ste_v1_set_rewrite_actions(last_ste, action, attr->decap_actions, attr->decap_pat_idx, attr->decap_index, NULL); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += 
DR_STE_ACTION_DOUBLE_SZ; allow_modify_hdr = false; allow_ctr = false; } else if (action_type_set[DR_ACTION_TYP_TNL_L2_TO_L2]) { ste_ctx->set_rx_decap(last_ste, action); action_sz -= DR_STE_ACTION_SINGLE_SZ; action += DR_STE_ACTION_SINGLE_SZ; allow_modify_hdr = false; allow_ctr = false; } if (action_type_set[DR_ACTION_TYP_TAG]) { if (action_sz < DR_STE_ACTION_SINGLE_SZ) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; allow_modify_hdr = true; allow_ctr = true; ste_loc++; } dr_ste_v1_set_rx_flow_tag(action, attr->flow_tag); action_sz -= DR_STE_ACTION_SINGLE_SZ; action += DR_STE_ACTION_SINGLE_SZ; } if (action_type_set[DR_ACTION_TYP_POP_VLAN]) { if (action_sz < DR_STE_ACTION_SINGLE_SZ || !allow_modify_hdr) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; ste_loc++; } ste_ctx->set_pop_vlan(last_ste, action, attr->vlans.count_pop); action_sz -= DR_STE_ACTION_SINGLE_SZ; action += DR_STE_ACTION_SINGLE_SZ; allow_ctr = false; /* Check if vlan_pop and modify_hdr on same STE is supported */ if (!(actions_caps & DR_STE_CTX_ACTION_CAP_POP_MDFY)) allow_modify_hdr = false; } if (action_type_set[DR_ACTION_TYP_ASO_FIRST_HIT]) { if (action_sz < DR_STE_ACTION_DOUBLE_SZ) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; allow_modify_hdr = true; allow_ctr = true; ste_loc++; } dr_ste_v1_set_aso_first_hit(action, attr->aso->devx_obj->object_id, attr->aso->offset, attr->aso->dest_reg_id, attr->aso->first_hit.set); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; } if (action_type_set[DR_ACTION_TYP_MODIFY_HDR]) { /* Modify header and decapsulation must use different STEs */ if (!allow_modify_hdr || action_sz < DR_STE_ACTION_DOUBLE_SZ) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; allow_modify_hdr = true; allow_ctr = true; ste_loc++; } dr_ste_v1_set_rewrite_actions(last_ste, action, attr->modify_actions, attr->modify_pat_idx, attr->modify_index, attr->single_modify_action); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; } if (action_type_set[DR_ACTION_TYP_PUSH_VLAN]) { int i; for (i = 0; i < attr->vlans.count_push; i++) { if (action_sz < DR_STE_ACTION_DOUBLE_SZ || !allow_modify_hdr) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; ste_loc++; } ste_ctx->set_push_vlan(last_ste, action, attr->vlans.headers[i]); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; } } if (action_type_set[DR_ACTION_TYP_ASO_FLOW_METER]) { if (action_sz < DR_STE_ACTION_DOUBLE_SZ) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, attr->gvmi); action = DEVX_ADDR_OF(ste_mask_and_match_v1, last_ste, action); action_sz = DR_STE_ACTION_TRIPLE_SZ; allow_ctr = true; ste_loc++; } dr_ste_v1_set_aso_flow_meter(action, attr->aso->devx_obj->object_id, attr->aso->offset, attr->aso->dest_reg_id, attr->aso->flow_meter.initial_color); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; allow_modify_hdr = false; } if (action_type_set[DR_ACTION_TYP_ASO_CT]) { if 
(attr->aso->dmn != attr->dmn ||
		    action_sz < DR_STE_ACTION_DOUBLE_SZ) {
			dr_ste_v1_arr_init_next_match(&last_ste, added_stes,
						      attr->gvmi);
			action = DEVX_ADDR_OF(ste_mask_and_match_v1,
					      last_ste, action);
			action_sz = DR_STE_ACTION_TRIPLE_SZ;
			allow_ctr = true;
		}
		if (attr->aso->dmn != attr->dmn) {
			/* The ASO object belongs to a different domain:
			 * only record the STE location, the ASO CT action
			 * itself is not set here.
			 */
			attr->aso_ste_loc = ste_loc;
		} else {
			dr_ste_v1_set_aso_ct(action,
					     attr->aso->devx_obj->object_id,
					     attr->aso->offset,
					     attr->aso->dest_reg_id,
					     attr->aso->ct.direction);
			action_sz -= DR_STE_ACTION_DOUBLE_SZ;
			action += DR_STE_ACTION_DOUBLE_SZ;
		}
	}

	if (action_type_set[DR_ACTION_TYP_CTR]) {
		/* The counter is set after decap to exclude the decapped
		 * header.
		 */
		if (!allow_ctr) {
			dr_ste_v1_arr_init_next_match(&last_ste, added_stes,
						      attr->gvmi);
			action = DEVX_ADDR_OF(ste_mask_and_match_v1,
					      last_ste, action);
			action_sz = DR_STE_ACTION_TRIPLE_SZ;
			allow_modify_hdr = true;
		}
		dr_ste_v1_set_counter_id(last_ste, attr->ctr_id);
		allow_ctr = false;
	}

	if (action_type_set[DR_ACTION_TYP_L2_TO_TNL_L2]) {
		if (action_sz < DR_STE_ACTION_DOUBLE_SZ) {
			dr_ste_v1_arr_init_next_match(&last_ste, added_stes,
						      attr->gvmi);
			action = DEVX_ADDR_OF(ste_mask_and_match_v1,
					      last_ste, action);
			action_sz = DR_STE_ACTION_TRIPLE_SZ;
		}
		ste_ctx->set_encap(last_ste, action,
				   attr->reformat_id, attr->reformat_size);
		action_sz -= DR_STE_ACTION_DOUBLE_SZ;
		action += DR_STE_ACTION_DOUBLE_SZ;
	} else if (action_type_set[DR_ACTION_TYP_L2_TO_TNL_L3]) {
		uint8_t *d_action;

		if (action_sz < DR_STE_ACTION_TRIPLE_SZ) {
			dr_ste_v1_arr_init_next_match(&last_ste, added_stes,
						      attr->gvmi);
			action = DEVX_ADDR_OF(ste_mask_and_match_v1,
					      last_ste, action);
			action_sz = DR_STE_ACTION_TRIPLE_SZ;
		}
		d_action = action + DR_STE_ACTION_SINGLE_SZ;
		ste_ctx->set_encap_l3(last_ste, action, d_action,
				      attr->reformat_id, attr->reformat_size);
		action_sz -= DR_STE_ACTION_TRIPLE_SZ;
	}

	dr_ste_v1_set_hit_gvmi(last_ste, attr->hit_gvmi);
	dr_ste_v1_set_hit_addr(last_ste, attr->final_icm_addr, 1);
}

static void dr_ste_v1_set_action_set(uint8_t *d_action, uint8_t hw_field,
				     uint8_t shifter, uint8_t length,
				     uint32_t data)
{
	shifter += MLX5_MODIFY_HEADER_V1_QW_OFFSET;
	DR_STE_SET(double_action_set_v1, d_action, action_id,
		   DR_STE_V1_ACTION_ID_SET);
	DR_STE_SET(double_action_set_v1, d_action, destination_dw_offset,
		   hw_field);
	DR_STE_SET(double_action_set_v1, d_action, destination_left_shifter,
		   shifter);
	DR_STE_SET(double_action_set_v1, d_action, destination_length, length);
	DR_STE_SET(double_action_set_v1, d_action, inline_data, data);
}

static void dr_ste_v1_set_action_add(uint8_t *d_action, uint8_t hw_field,
				     uint8_t shifter, uint8_t length,
				     uint32_t data)
{
	shifter += MLX5_MODIFY_HEADER_V1_QW_OFFSET;
	DR_STE_SET(double_action_add_v1, d_action, action_id,
		   DR_STE_V1_ACTION_ID_ADD);
	DR_STE_SET(double_action_add_v1, d_action, destination_dw_offset,
		   hw_field);
	DR_STE_SET(double_action_add_v1, d_action, destination_left_shifter,
		   shifter);
	DR_STE_SET(double_action_add_v1, d_action, destination_length, length);
	DR_STE_SET(double_action_add_v1, d_action, add_value, data);
}

static void dr_ste_v1_set_action_copy(uint8_t *d_action, uint8_t dst_hw_field,
				      uint8_t dst_shifter, uint8_t dst_len,
				      uint8_t src_hw_field, uint8_t src_shifter)
{
	dst_shifter += MLX5_MODIFY_HEADER_V1_QW_OFFSET;
	src_shifter += MLX5_MODIFY_HEADER_V1_QW_OFFSET;
	DR_STE_SET(double_action_copy_v1, d_action, action_id,
		   DR_STE_V1_ACTION_ID_COPY);
	DR_STE_SET(double_action_copy_v1, d_action, destination_dw_offset,
		   dst_hw_field);
	DR_STE_SET(double_action_copy_v1, d_action, destination_left_shifter,
		   dst_shifter);
	DR_STE_SET(double_action_copy_v1, d_action, destination_length,
		   dst_len);
	DR_STE_SET(double_action_copy_v1, d_action, source_dw_offset,
		   src_hw_field);
	DR_STE_SET(double_action_copy_v1, d_action, source_right_shifter,
		   src_shifter);
}

static int dr_ste_v1_set_action_decap_l3_list(void *data, uint32_t data_sz,
					      uint8_t *hw_action,
					      uint32_t hw_action_sz,
					      uint16_t *used_hw_action_num)
{
	uint8_t padded_data[DR_STE_L2_HDR_MAX_SZ] = {};
	void *data_ptr = padded_data;
	uint16_t used_actions = 0;
	uint32_t inline_data_sz;
	uint32_t i;

	if (hw_action_sz / DR_STE_ACTION_DOUBLE_SZ < DR_STE_DECAP_L3_ACTION_NUM) {
		errno = EINVAL;
		return errno;
	}

	inline_data_sz =
		DEVX_FLD_SZ_BYTES(ste_double_action_insert_with_inline_v1,
				  inline_data);

	/* Add alignment padding */
	memcpy(padded_data + data_sz % inline_data_sz, data, data_sz);

	/* Remove the outer L2/L3 headers */
	DR_STE_SET(single_action_remove_header_v1, hw_action, action_id,
		   DR_STE_V1_ACTION_ID_REMOVE_HEADER_TO_HEADER);
	DR_STE_SET(single_action_remove_header_v1, hw_action, decap, 1);
	DR_STE_SET(single_action_remove_header_v1, hw_action, vni_to_cqe, 1);
	DR_STE_SET(single_action_remove_header_v1, hw_action, end_anchor,
		   DR_STE_HEADER_ANCHOR_INNER_IPV6_IPV4);
	hw_action += DR_STE_ACTION_DOUBLE_SZ;
	used_actions++;

	/* Point to the last dword of the header */
	data_ptr += (data_sz / inline_data_sz) * inline_data_sz;

	/* Add the new header using the inline action, 4 bytes at a time.
	 * The header is added in reverse order to the beginning of the
	 * packet to avoid incorrect parsing by the HW. Since the header is
	 * 14B or 18B, two extra bytes are padded and later removed.
	 */
	for (i = 0; i < data_sz / inline_data_sz + 1; i++) {
		void *addr_inline;

		DR_STE_SET(double_action_insert_with_inline_v1, hw_action,
			   action_id, DR_STE_V1_ACTION_ID_INSERT_INLINE);
		/* The hardware expects the offset here in words (2 bytes) */
		DR_STE_SET(double_action_insert_with_inline_v1, hw_action,
			   start_offset, 0);

		/* Copy byte by byte to avoid endianness problems */
		addr_inline = DEVX_ADDR_OF(ste_double_action_insert_with_inline_v1,
					   hw_action, inline_data);
		memcpy(addr_inline, data_ptr - inline_data_sz * i, inline_data_sz);
		hw_action += DR_STE_ACTION_DOUBLE_SZ;
		used_actions++;
	}

	/* Remove the first 2 extra bytes */
	DR_STE_SET(single_action_remove_header_size_v1, hw_action, action_id,
		   DR_STE_V1_ACTION_ID_REMOVE_BY_SIZE);
	DR_STE_SET(single_action_remove_header_size_v1, hw_action,
		   start_offset, 0);
	/* The hardware expects the size here in words (2 bytes) */
	DR_STE_SET(single_action_remove_header_size_v1, hw_action,
		   remove_size, 1);
	used_actions++;

	*used_hw_action_num = used_actions;

	return 0;
}

static const struct dr_ste_action_modify_field *
dr_ste_v1_get_action_flex_hw_field(uint16_t sw_field, struct dr_devx_caps *caps)
{
	uint8_t flex_id;

	if (!caps->flex_parser_header_modify)
		goto not_found;

	if ((sw_field == MLX5_ACTION_IN_FIELD_OUT_GTPU_TEID) &&
	    (caps->flex_protocols & MLX5_FLEX_PARSER_GTPU_TEID_ENABLED))
		flex_id = caps->flex_parser_id_gtpu_teid;
	else
		goto not_found;

	if (flex_id >= ARRAY_SIZE(dr_ste_v1_action_modify_flex_field_arr))
		goto not_found;

	return &dr_ste_v1_action_modify_flex_field_arr[flex_id];

not_found:
	errno = EINVAL;
	return NULL;
}

static const struct dr_ste_action_modify_field *
dr_ste_v1_get_action_hw_field(struct dr_ste_ctx *ste_ctx, uint16_t sw_field,
			      struct dr_devx_caps *caps)
{
	const struct dr_ste_action_modify_field *hw_field;

	if (sw_field >= ste_ctx->action_modify_field_arr_size)
		goto not_found;

	hw_field = &ste_ctx->action_modify_field_arr[sw_field];
	if (!hw_field->end && !hw_field->start)
		goto not_found;

	if (hw_field->flags &
DR_STE_ACTION_MODIFY_FLAG_REQ_FLEX) return dr_ste_v1_get_action_flex_hw_field(sw_field, caps); return hw_field; not_found: errno = EINVAL; return NULL; } static void dr_ste_v1_build_eth_l2_src_dst_bit_mask(struct dr_match_param *value, bool inner, uint8_t *bit_mask) { struct dr_match_spec *mask = inner ? &value->inner : &value->outer; DR_STE_SET_TAG(eth_l2_src_dst_v1, bit_mask, dmac_47_16, mask, dmac_47_16); DR_STE_SET_TAG(eth_l2_src_dst_v1, bit_mask, dmac_15_0, mask, dmac_15_0); DR_STE_SET_TAG(eth_l2_src_dst_v1, bit_mask, smac_47_16, mask, smac_47_16); DR_STE_SET_TAG(eth_l2_src_dst_v1, bit_mask, smac_15_0, mask, smac_15_0); DR_STE_SET_TAG(eth_l2_src_dst_v1, bit_mask, first_vlan_id, mask, first_vid); DR_STE_SET_TAG(eth_l2_src_dst_v1, bit_mask, first_cfi, mask, first_cfi); DR_STE_SET_TAG(eth_l2_src_dst_v1, bit_mask, first_priority, mask, first_prio); DR_STE_SET_ONES(eth_l2_src_dst_v1, bit_mask, l3_type, mask, ip_version); if (mask->cvlan_tag || mask->svlan_tag) { DR_STE_SET(eth_l2_src_dst_v1, bit_mask, first_vlan_qualifier, -1); mask->cvlan_tag = 0; mask->svlan_tag = 0; } } static int dr_ste_v1_build_eth_l2_src_dst_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_spec *spec = sb->inner ? &value->inner : &value->outer; DR_STE_SET_TAG(eth_l2_src_dst_v1, tag, dmac_47_16, spec, dmac_47_16); DR_STE_SET_TAG(eth_l2_src_dst_v1, tag, dmac_15_0, spec, dmac_15_0); DR_STE_SET_TAG(eth_l2_src_dst_v1, tag, smac_47_16, spec, smac_47_16); DR_STE_SET_TAG(eth_l2_src_dst_v1, tag, smac_15_0, spec, smac_15_0); if (spec->ip_version) { if (spec->ip_version == IP_VERSION_IPV4) { DR_STE_SET(eth_l2_src_dst_v1, tag, l3_type, STE_IPV4); spec->ip_version = 0; } else if (spec->ip_version == IP_VERSION_IPV6) { DR_STE_SET(eth_l2_src_dst_v1, tag, l3_type, STE_IPV6); spec->ip_version = 0; } else { errno = EINVAL; return errno; } } DR_STE_SET_TAG(eth_l2_src_dst_v1, tag, first_vlan_id, spec, first_vid); DR_STE_SET_TAG(eth_l2_src_dst_v1, tag, first_cfi, spec, first_cfi); DR_STE_SET_TAG(eth_l2_src_dst_v1, tag, first_priority, spec, first_prio); if (spec->cvlan_tag) { DR_STE_SET(eth_l2_src_dst_v1, tag, first_vlan_qualifier, DR_STE_CVLAN); spec->cvlan_tag = 0; } else if (spec->svlan_tag) { DR_STE_SET(eth_l2_src_dst_v1, tag, first_vlan_qualifier, DR_STE_SVLAN); spec->svlan_tag = 0; } return 0; } static void dr_ste_v1_build_eth_l2_src_dst_init(struct dr_ste_build *sb, struct dr_match_param *mask) { dr_ste_v1_build_eth_l2_src_dst_bit_mask(mask, sb->inner, sb->bit_mask); sb->lu_type = DR_STE_CALC_DFNR_TYPE(ETHL2_SRC_DST, sb->inner); sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask); sb->ste_build_tag_func = &dr_ste_v1_build_eth_l2_src_dst_tag; } static int dr_ste_v1_build_eth_l3_ipv6_dst_tag(struct dr_match_param *value, struct dr_ste_build *sb, uint8_t *tag) { struct dr_match_spec *spec = sb->inner ? 
&value->inner : &value->outer;

	DR_STE_SET_TAG(eth_l3_ipv6_dst, tag, dst_ip_127_96, spec, dst_ip_127_96);
	DR_STE_SET_TAG(eth_l3_ipv6_dst, tag, dst_ip_95_64, spec, dst_ip_95_64);
	DR_STE_SET_TAG(eth_l3_ipv6_dst, tag, dst_ip_63_32, spec, dst_ip_63_32);
	DR_STE_SET_TAG(eth_l3_ipv6_dst, tag, dst_ip_31_0, spec, dst_ip_31_0);

	return 0;
}

static void dr_ste_v1_build_eth_l3_ipv6_dst_init(struct dr_ste_build *sb,
						 struct dr_match_param *mask)
{
	dr_ste_v1_build_eth_l3_ipv6_dst_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_CALC_DFNR_TYPE(IPV6_DES, sb->inner);
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_eth_l3_ipv6_dst_tag;
}

static int dr_ste_v1_build_eth_l3_ipv6_src_tag(struct dr_match_param *value,
					       struct dr_ste_build *sb,
					       uint8_t *tag)
{
	struct dr_match_spec *spec = sb->inner ? &value->inner : &value->outer;

	DR_STE_SET_TAG(eth_l3_ipv6_src, tag, src_ip_127_96, spec, src_ip_127_96);
	DR_STE_SET_TAG(eth_l3_ipv6_src, tag, src_ip_95_64, spec, src_ip_95_64);
	DR_STE_SET_TAG(eth_l3_ipv6_src, tag, src_ip_63_32, spec, src_ip_63_32);
	DR_STE_SET_TAG(eth_l3_ipv6_src, tag, src_ip_31_0, spec, src_ip_31_0);

	return 0;
}

static void dr_ste_v1_build_eth_l3_ipv6_src_init(struct dr_ste_build *sb,
						 struct dr_match_param *mask)
{
	dr_ste_v1_build_eth_l3_ipv6_src_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_CALC_DFNR_TYPE(IPV6_SRC, sb->inner);
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_eth_l3_ipv6_src_tag;
}

static int dr_ste_v1_build_eth_l3_ipv4_5_tuple_tag(struct dr_match_param *value,
						   struct dr_ste_build *sb,
						   uint8_t *tag)
{
	struct dr_match_spec *spec = sb->inner ? &value->inner : &value->outer;

	DR_STE_SET_TAG(eth_l3_ipv4_5_tuple_v1, tag, destination_address, spec, dst_ip_31_0);
	DR_STE_SET_TAG(eth_l3_ipv4_5_tuple_v1, tag, source_address, spec, src_ip_31_0);
	DR_STE_SET_TAG(eth_l3_ipv4_5_tuple_v1, tag, destination_port, spec, tcp_dport);
	DR_STE_SET_TAG(eth_l3_ipv4_5_tuple_v1, tag, destination_port, spec, udp_dport);
	DR_STE_SET_TAG(eth_l3_ipv4_5_tuple_v1, tag, source_port, spec, tcp_sport);
	DR_STE_SET_TAG(eth_l3_ipv4_5_tuple_v1, tag, source_port, spec, udp_sport);
	DR_STE_SET_TAG(eth_l3_ipv4_5_tuple_v1, tag, protocol, spec, ip_protocol);
	DR_STE_SET_TAG(eth_l3_ipv4_5_tuple_v1, tag, fragmented, spec, frag);
	DR_STE_SET_TAG(eth_l3_ipv4_5_tuple_v1, tag, dscp, spec, ip_dscp);
	DR_STE_SET_TAG(eth_l3_ipv4_5_tuple_v1, tag, ecn, spec, ip_ecn);

	if (spec->tcp_flags) {
		DR_STE_SET_TCP_FLAGS(eth_l3_ipv4_5_tuple_v1, tag, spec);
		spec->tcp_flags = 0;
	}

	return 0;
}

static void dr_ste_v1_build_eth_l3_ipv4_5_tuple_init(struct dr_ste_build *sb,
						     struct dr_match_param *mask)
{
	dr_ste_v1_build_eth_l3_ipv4_5_tuple_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_CALC_DFNR_TYPE(ETHL3_IPV4_5_TUPLE, sb->inner);
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_eth_l3_ipv4_5_tuple_tag;
}

static void dr_ste_v1_build_eth_l2_src_or_dst_bit_mask(struct dr_match_param *value,
						       bool inner,
						       uint8_t *bit_mask)
{
	struct dr_match_spec *mask = inner ? &value->inner : &value->outer;
	struct dr_match_misc *misc_mask = &value->misc;

	DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, first_vlan_id, mask, first_vid);
	DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, first_cfi, mask, first_cfi);
	DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, first_priority, mask, first_prio);
	DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, ip_fragmented, mask, frag); // ?
	DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, l3_ethertype, mask, ethertype); // ?
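	/* ip_version is not copied into the mask verbatim: the matching tag
	 * function encodes it through the STE l3_type field (STE_IPV4 /
	 * STE_IPV6), whose encoding differs from the match-param value, so
	 * DR_STE_SET_ONES() turns any non-zero ip_version mask into an
	 * all-ones l3_type mask instead.
	 */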
static void dr_ste_v1_build_eth_l2_src_or_dst_bit_mask(struct dr_match_param *value,
						       bool inner, uint8_t *bit_mask)
{
	struct dr_match_spec *mask = inner ? &value->inner : &value->outer;
	struct dr_match_misc *misc_mask = &value->misc;

	DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, first_vlan_id, mask, first_vid);
	DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, first_cfi, mask, first_cfi);
	DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, first_priority, mask, first_prio);
	DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, ip_fragmented, mask, frag); // ?
	DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, l3_ethertype, mask, ethertype); // ?
	DR_STE_SET_ONES(eth_l2_src_v1, bit_mask, l3_type, mask, ip_version);

	if (mask->svlan_tag || mask->cvlan_tag) {
		DR_STE_SET(eth_l2_src_v1, bit_mask, first_vlan_qualifier, -1);
		mask->cvlan_tag = 0;
		mask->svlan_tag = 0;
	}

	if (inner) {
		if (misc_mask->inner_second_cvlan_tag ||
		    misc_mask->inner_second_svlan_tag) {
			DR_STE_SET(eth_l2_src_v1, bit_mask, second_vlan_qualifier, -1);
			misc_mask->inner_second_cvlan_tag = 0;
			misc_mask->inner_second_svlan_tag = 0;
		}

		DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, second_vlan_id, misc_mask, inner_second_vid);
		DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, second_cfi, misc_mask, inner_second_cfi);
		DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, second_priority, misc_mask, inner_second_prio);
	} else {
		if (misc_mask->outer_second_cvlan_tag ||
		    misc_mask->outer_second_svlan_tag) {
			DR_STE_SET(eth_l2_src_v1, bit_mask, second_vlan_qualifier, -1);
			misc_mask->outer_second_cvlan_tag = 0;
			misc_mask->outer_second_svlan_tag = 0;
		}

		DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, second_vlan_id, misc_mask, outer_second_vid);
		DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, second_cfi, misc_mask, outer_second_cfi);
		DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, second_priority, misc_mask, outer_second_prio);
	}
}

static int dr_ste_v1_build_eth_l2_src_or_dst_tag(struct dr_match_param *value,
						 bool inner, uint8_t *tag)
{
	struct dr_match_spec *spec = inner ? &value->inner : &value->outer;
	struct dr_match_misc *misc_spec = &value->misc;

	DR_STE_SET_TAG(eth_l2_src_v1, tag, first_vlan_id, spec, first_vid);
	DR_STE_SET_TAG(eth_l2_src_v1, tag, first_cfi, spec, first_cfi);
	DR_STE_SET_TAG(eth_l2_src_v1, tag, first_priority, spec, first_prio);
	DR_STE_SET_TAG(eth_l2_src_v1, tag, ip_fragmented, spec, frag);
	DR_STE_SET_TAG(eth_l2_src_v1, tag, l3_ethertype, spec, ethertype);

	if (spec->ip_version) {
		if (spec->ip_version == IP_VERSION_IPV4) {
			DR_STE_SET(eth_l2_src_v1, tag, l3_type, STE_IPV4);
			spec->ip_version = 0;
		} else if (spec->ip_version == IP_VERSION_IPV6) {
			DR_STE_SET(eth_l2_src_v1, tag, l3_type, STE_IPV6);
			spec->ip_version = 0;
		} else {
			errno = EINVAL;
			return errno;
		}
	}

	if (spec->cvlan_tag) {
		DR_STE_SET(eth_l2_src_v1, tag, first_vlan_qualifier, DR_STE_CVLAN);
		spec->cvlan_tag = 0;
	} else if (spec->svlan_tag) {
		DR_STE_SET(eth_l2_src_v1, tag, first_vlan_qualifier, DR_STE_SVLAN);
		spec->svlan_tag = 0;
	}

	if (inner) {
		if (misc_spec->inner_second_cvlan_tag) {
			DR_STE_SET(eth_l2_src_v1, tag, second_vlan_qualifier, DR_STE_CVLAN);
			misc_spec->inner_second_cvlan_tag = 0;
		} else if (misc_spec->inner_second_svlan_tag) {
			DR_STE_SET(eth_l2_src_v1, tag, second_vlan_qualifier, DR_STE_SVLAN);
			misc_spec->inner_second_svlan_tag = 0;
		}

		DR_STE_SET_TAG(eth_l2_src_v1, tag, second_vlan_id, misc_spec, inner_second_vid);
		DR_STE_SET_TAG(eth_l2_src_v1, tag, second_cfi, misc_spec, inner_second_cfi);
		DR_STE_SET_TAG(eth_l2_src_v1, tag, second_priority, misc_spec, inner_second_prio);
	} else {
		if (misc_spec->outer_second_cvlan_tag) {
			DR_STE_SET(eth_l2_src_v1, tag, second_vlan_qualifier, DR_STE_CVLAN);
			misc_spec->outer_second_cvlan_tag = 0;
		} else if (misc_spec->outer_second_svlan_tag) {
			DR_STE_SET(eth_l2_src_v1, tag, second_vlan_qualifier, DR_STE_SVLAN);
			misc_spec->outer_second_svlan_tag = 0;
		}

		DR_STE_SET_TAG(eth_l2_src_v1, tag, second_vlan_id, misc_spec, outer_second_vid);
		DR_STE_SET_TAG(eth_l2_src_v1, tag, second_cfi, misc_spec, outer_second_cfi);
		DR_STE_SET_TAG(eth_l2_src_v1, tag, second_priority, misc_spec, outer_second_prio);
	}

	return 0;
}
static void dr_ste_v1_build_eth_l2_src_bit_mask(struct dr_match_param *value,
						bool inner, uint8_t *bit_mask)
{
	struct dr_match_spec *mask = inner ? &value->inner : &value->outer;

	DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, smac_47_16, mask, smac_47_16);
	DR_STE_SET_TAG(eth_l2_src_v1, bit_mask, smac_15_0, mask, smac_15_0);

	dr_ste_v1_build_eth_l2_src_or_dst_bit_mask(value, inner, bit_mask);
}

static int dr_ste_v1_build_eth_l2_src_tag(struct dr_match_param *value,
					  struct dr_ste_build *sb,
					  uint8_t *tag)
{
	struct dr_match_spec *spec = sb->inner ? &value->inner : &value->outer;

	DR_STE_SET_TAG(eth_l2_src_v1, tag, smac_47_16, spec, smac_47_16);
	DR_STE_SET_TAG(eth_l2_src_v1, tag, smac_15_0, spec, smac_15_0);

	return dr_ste_v1_build_eth_l2_src_or_dst_tag(value, sb->inner, tag);
}

static void dr_ste_v1_build_eth_l2_src_init(struct dr_ste_build *sb,
					    struct dr_match_param *mask)
{
	dr_ste_v1_build_eth_l2_src_bit_mask(mask, sb->inner, sb->bit_mask);

	sb->lu_type = DR_STE_CALC_DFNR_TYPE(ETHL2_SRC, sb->inner);
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_eth_l2_src_tag;
}

static void dr_ste_v1_build_eth_l2_dst_bit_mask(struct dr_match_param *value,
						bool inner, uint8_t *bit_mask)
{
	struct dr_match_spec *mask = inner ? &value->inner : &value->outer;

	DR_STE_SET_TAG(eth_l2_dst_v1, bit_mask, dmac_47_16, mask, dmac_47_16);
	DR_STE_SET_TAG(eth_l2_dst_v1, bit_mask, dmac_15_0, mask, dmac_15_0);

	dr_ste_v1_build_eth_l2_src_or_dst_bit_mask(value, inner, bit_mask);
}

static int dr_ste_v1_build_eth_l2_dst_tag(struct dr_match_param *value,
					  struct dr_ste_build *sb,
					  uint8_t *tag)
{
	struct dr_match_spec *spec = sb->inner ? &value->inner : &value->outer;

	DR_STE_SET_TAG(eth_l2_dst_v1, tag, dmac_47_16, spec, dmac_47_16);
	DR_STE_SET_TAG(eth_l2_dst_v1, tag, dmac_15_0, spec, dmac_15_0);

	return dr_ste_v1_build_eth_l2_src_or_dst_tag(value, sb->inner, tag);
}

static void dr_ste_v1_build_eth_l2_dst_init(struct dr_ste_build *sb,
					    struct dr_match_param *mask)
{
	dr_ste_v1_build_eth_l2_dst_bit_mask(mask, sb->inner, sb->bit_mask);

	sb->lu_type = DR_STE_CALC_DFNR_TYPE(ETHL2, sb->inner);
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_eth_l2_dst_tag;
}

static void dr_ste_v1_build_eth_l2_tnl_bit_mask(struct dr_match_param *value,
						bool inner, uint8_t *bit_mask)
{
	struct dr_match_spec *mask = inner ? &value->inner : &value->outer;
	struct dr_match_misc *misc = &value->misc;

	DR_STE_SET_TAG(eth_l2_tnl_v1, bit_mask, dmac_47_16, mask, dmac_47_16);
	DR_STE_SET_TAG(eth_l2_tnl_v1, bit_mask, dmac_15_0, mask, dmac_15_0);
	DR_STE_SET_TAG(eth_l2_tnl_v1, bit_mask, first_vlan_id, mask, first_vid);
	DR_STE_SET_TAG(eth_l2_tnl_v1, bit_mask, first_cfi, mask, first_cfi);
	DR_STE_SET_TAG(eth_l2_tnl_v1, bit_mask, first_priority, mask, first_prio);
	DR_STE_SET_TAG(eth_l2_tnl_v1, bit_mask, ip_fragmented, mask, frag);
	DR_STE_SET_TAG(eth_l2_tnl_v1, bit_mask, l3_ethertype, mask, ethertype);
	DR_STE_SET_ONES(eth_l2_tnl_v1, bit_mask, l3_type, mask, ip_version);

	if (misc->vxlan_vni) {
		DR_STE_SET(eth_l2_tnl_v1, bit_mask, l2_tunneling_network_id,
			   (misc->vxlan_vni << 8));
		misc->vxlan_vni = 0;
	}

	if (mask->svlan_tag || mask->cvlan_tag) {
		DR_STE_SET(eth_l2_tnl_v1, bit_mask, first_vlan_qualifier, -1);
		mask->cvlan_tag = 0;
		mask->svlan_tag = 0;
	}
}
static int dr_ste_v1_build_eth_l2_tnl_tag(struct dr_match_param *value,
					  struct dr_ste_build *sb,
					  uint8_t *tag)
{
	struct dr_match_spec *spec = sb->inner ? &value->inner : &value->outer;
	struct dr_match_misc *misc = &value->misc;

	DR_STE_SET_TAG(eth_l2_tnl_v1, tag, dmac_47_16, spec, dmac_47_16);
	DR_STE_SET_TAG(eth_l2_tnl_v1, tag, dmac_15_0, spec, dmac_15_0);
	DR_STE_SET_TAG(eth_l2_tnl_v1, tag, first_vlan_id, spec, first_vid);
	DR_STE_SET_TAG(eth_l2_tnl_v1, tag, first_cfi, spec, first_cfi);
	DR_STE_SET_TAG(eth_l2_tnl_v1, tag, ip_fragmented, spec, frag);
	DR_STE_SET_TAG(eth_l2_tnl_v1, tag, first_priority, spec, first_prio);
	DR_STE_SET_TAG(eth_l2_tnl_v1, tag, l3_ethertype, spec, ethertype);

	if (misc->vxlan_vni) {
		DR_STE_SET(eth_l2_tnl_v1, tag, l2_tunneling_network_id,
			   (misc->vxlan_vni << 8));
		misc->vxlan_vni = 0;
	}

	if (spec->cvlan_tag) {
		DR_STE_SET(eth_l2_tnl_v1, tag, first_vlan_qualifier, DR_STE_CVLAN);
		spec->cvlan_tag = 0;
	} else if (spec->svlan_tag) {
		DR_STE_SET(eth_l2_tnl_v1, tag, first_vlan_qualifier, DR_STE_SVLAN);
		spec->svlan_tag = 0;
	}

	if (spec->ip_version) {
		if (spec->ip_version == IP_VERSION_IPV4) {
			DR_STE_SET(eth_l2_tnl_v1, tag, l3_type, STE_IPV4);
			spec->ip_version = 0;
		} else if (spec->ip_version == IP_VERSION_IPV6) {
			DR_STE_SET(eth_l2_tnl_v1, tag, l3_type, STE_IPV6);
			spec->ip_version = 0;
		} else {
			errno = EINVAL;
			return errno;
		}
	}

	return 0;
}

static void dr_ste_v1_build_eth_l2_tnl_init(struct dr_ste_build *sb,
					    struct dr_match_param *mask)
{
	dr_ste_v1_build_eth_l2_tnl_bit_mask(mask, sb->inner, sb->bit_mask);

	sb->lu_type = DR_STE_V1_LU_TYPE_ETHL2_TNL;
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_eth_l2_tnl_tag;
}

static int dr_ste_v1_build_eth_l3_ipv4_misc_tag(struct dr_match_param *value,
						struct dr_ste_build *sb,
						uint8_t *tag)
{
	struct dr_match_spec *spec = sb->inner ? &value->inner : &value->outer;

	DR_STE_SET_TAG(eth_l3_ipv4_misc_v1, tag, time_to_live, spec, ip_ttl_hoplimit);
	DR_STE_SET_TAG(eth_l3_ipv4_misc_v1, tag, ihl, spec, ipv4_ihl);

	return 0;
}

static void dr_ste_v1_build_eth_l3_ipv4_misc_init(struct dr_ste_build *sb,
						  struct dr_match_param *mask)
{
	dr_ste_v1_build_eth_l3_ipv4_misc_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_CALC_DFNR_TYPE(ETHL3_IPV4_MISC, sb->inner);
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_eth_l3_ipv4_misc_tag;
}
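/*
 * The ETHL4 builder below serves both IPv6 L3 and generic L4 matching:
 * TCP and UDP ports share the same dst_port/src_port STE fields, and the
 * IPv6 flow label is taken from the misc parameters according to the
 * inner/outer direction of the ste_build.
 */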
static int dr_ste_v1_build_eth_ipv6_l3_l4_tag(struct dr_match_param *value,
					      struct dr_ste_build *sb,
					      uint8_t *tag)
{
	struct dr_match_spec *spec = sb->inner ? &value->inner : &value->outer;
	struct dr_match_misc *misc = &value->misc;

	DR_STE_SET_TAG(eth_l4_v1, tag, dst_port, spec, tcp_dport);
	DR_STE_SET_TAG(eth_l4_v1, tag, src_port, spec, tcp_sport);
	DR_STE_SET_TAG(eth_l4_v1, tag, dst_port, spec, udp_dport);
	DR_STE_SET_TAG(eth_l4_v1, tag, src_port, spec, udp_sport);
	DR_STE_SET_TAG(eth_l4_v1, tag, protocol, spec, ip_protocol);
	DR_STE_SET_TAG(eth_l4_v1, tag, fragmented, spec, frag);
	DR_STE_SET_TAG(eth_l4_v1, tag, dscp, spec, ip_dscp);
	DR_STE_SET_TAG(eth_l4_v1, tag, ecn, spec, ip_ecn);
	DR_STE_SET_TAG(eth_l4_v1, tag, ipv6_hop_limit, spec, ip_ttl_hoplimit);

	if (sb->inner)
		DR_STE_SET_TAG(eth_l4_v1, tag, flow_label, misc, inner_ipv6_flow_label);
	else
		DR_STE_SET_TAG(eth_l4_v1, tag, flow_label, misc, outer_ipv6_flow_label);

	if (spec->tcp_flags) {
		DR_STE_SET_TCP_FLAGS(eth_l4_v1, tag, spec);
		spec->tcp_flags = 0;
	}

	return 0;
}

static void dr_ste_v1_build_eth_ipv6_l3_l4_init(struct dr_ste_build *sb,
						struct dr_match_param *mask)
{
	dr_ste_v1_build_eth_ipv6_l3_l4_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_CALC_DFNR_TYPE(ETHL4, sb->inner);
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_eth_ipv6_l3_l4_tag;
}

static int dr_ste_v1_build_mpls_tag(struct dr_match_param *value,
				    struct dr_ste_build *sb,
				    uint8_t *tag)
{
	struct dr_match_misc2 *misc2 = &value->misc2;

	if (sb->inner)
		DR_STE_SET_MPLS(mpls_v1, misc2, inner, tag);
	else
		DR_STE_SET_MPLS(mpls_v1, misc2, outer, tag);

	return 0;
}

static void dr_ste_v1_build_mpls_init(struct dr_ste_build *sb,
				      struct dr_match_param *mask)
{
	dr_ste_v1_build_mpls_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_CALC_DFNR_TYPE(MPLS, sb->inner);
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_mpls_tag;
}

static int dr_ste_v1_build_tnl_gre_tag(struct dr_match_param *value,
				       struct dr_ste_build *sb,
				       uint8_t *tag)
{
	struct dr_match_misc *misc = &value->misc;

	DR_STE_SET_TAG(gre_v1, tag, gre_protocol, misc, gre_protocol);
	DR_STE_SET_TAG(gre_v1, tag, gre_k_present, misc, gre_k_present);
	DR_STE_SET_TAG(gre_v1, tag, gre_key_h, misc, gre_key_h);
	DR_STE_SET_TAG(gre_v1, tag, gre_key_l, misc, gre_key_l);
	DR_STE_SET_TAG(gre_v1, tag, gre_c_present, misc, gre_c_present);
	DR_STE_SET_TAG(gre_v1, tag, gre_s_present, misc, gre_s_present);

	return 0;
}

static void dr_ste_v1_build_tnl_gre_init(struct dr_ste_build *sb,
					 struct dr_match_param *mask)
{
	dr_ste_v1_build_tnl_gre_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_V1_LU_TYPE_GRE;
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_tnl_gre_tag;
}

static int dr_ste_v1_build_tnl_mpls_over_udp_tag(struct dr_match_param *value,
						 struct dr_ste_build *sb,
						 uint8_t *tag)
{
	struct dr_match_misc2 *misc2 = &value->misc2;
	uint8_t *parser_ptr;
	uint8_t parser_id;
	uint32_t mpls_hdr;

	mpls_hdr = misc2->outer_first_mpls_over_udp_label << HDR_MPLS_OFFSET_LABEL;
	misc2->outer_first_mpls_over_udp_label = 0;
	mpls_hdr |= misc2->outer_first_mpls_over_udp_exp << HDR_MPLS_OFFSET_EXP;
	misc2->outer_first_mpls_over_udp_exp = 0;
	mpls_hdr |= misc2->outer_first_mpls_over_udp_s_bos << HDR_MPLS_OFFSET_S_BOS;
	misc2->outer_first_mpls_over_udp_s_bos = 0;
	mpls_hdr |= misc2->outer_first_mpls_over_udp_ttl << HDR_MPLS_OFFSET_TTL;
	misc2->outer_first_mpls_over_udp_ttl = 0;

	parser_id = sb->caps->flex_parser_id_mpls_over_udp;
	parser_ptr = dr_ste_calc_flex_parser_offset(tag, parser_id);
	*(__be32 *)parser_ptr = htobe32(mpls_hdr);

	return 0;
}
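/*
 * The MPLS-over-UDP tag above (and its MPLS-over-GRE twin below) packs
 * the four label-stack-entry fields into one 32-bit word and writes it,
 * big-endian, at the flex parser offset. Assuming the HDR_MPLS_OFFSET_*
 * shifts follow the standard MPLS LSE layout (RFC 3032), the word is:
 *
 *	label[31:12] | exp[11:9] | s_bos[8] | ttl[7:0]
 *
 * e.g. label 16, exp 0, bottom-of-stack set, TTL 64 becomes 0x00010140.
 */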
static void dr_ste_v1_build_tnl_mpls_over_udp_init(struct dr_ste_build *sb,
						   struct dr_match_param *mask)
{
	dr_ste_v1_build_tnl_mpls_over_udp_tag(mask, sb, sb->bit_mask);

	/* STEs with lookup type FLEX_PARSER_{0/1} include
	 * flex parsers_{0-3}/{4-7} respectively.
	 */
	sb->lu_type = sb->caps->flex_parser_id_mpls_over_udp <= DR_STE_MAX_FLEX_0_ID ?
		      DR_STE_V1_LU_TYPE_FLEX_PARSER_0 :
		      DR_STE_V1_LU_TYPE_FLEX_PARSER_1;

	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_tnl_mpls_over_udp_tag;
}

static int dr_ste_v1_build_tnl_mpls_over_gre_tag(struct dr_match_param *value,
						 struct dr_ste_build *sb,
						 uint8_t *tag)
{
	struct dr_match_misc2 *misc2 = &value->misc2;
	uint8_t *parser_ptr;
	uint8_t parser_id;
	uint32_t mpls_hdr;

	mpls_hdr = misc2->outer_first_mpls_over_gre_label << HDR_MPLS_OFFSET_LABEL;
	misc2->outer_first_mpls_over_gre_label = 0;
	mpls_hdr |= misc2->outer_first_mpls_over_gre_exp << HDR_MPLS_OFFSET_EXP;
	misc2->outer_first_mpls_over_gre_exp = 0;
	mpls_hdr |= misc2->outer_first_mpls_over_gre_s_bos << HDR_MPLS_OFFSET_S_BOS;
	misc2->outer_first_mpls_over_gre_s_bos = 0;
	mpls_hdr |= misc2->outer_first_mpls_over_gre_ttl << HDR_MPLS_OFFSET_TTL;
	misc2->outer_first_mpls_over_gre_ttl = 0;

	parser_id = sb->caps->flex_parser_id_mpls_over_gre;
	parser_ptr = dr_ste_calc_flex_parser_offset(tag, parser_id);
	*(__be32 *)parser_ptr = htobe32(mpls_hdr);

	return 0;
}

static void dr_ste_v1_build_tnl_mpls_over_gre_init(struct dr_ste_build *sb,
						   struct dr_match_param *mask)
{
	dr_ste_v1_build_tnl_mpls_over_gre_tag(mask, sb, sb->bit_mask);

	/* STEs with lookup type FLEX_PARSER_{0/1} include
	 * flex parsers_{0-3}/{4-7} respectively.
	 */
	sb->lu_type = sb->caps->flex_parser_id_mpls_over_gre <= DR_STE_MAX_FLEX_0_ID ?
		      DR_STE_V1_LU_TYPE_FLEX_PARSER_0 :
		      DR_STE_V1_LU_TYPE_FLEX_PARSER_1;

	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_tnl_mpls_over_gre_tag;
}

static int dr_ste_v1_build_icmp_tag(struct dr_match_param *value,
				    struct dr_ste_build *sb,
				    uint8_t *tag)
{
	struct dr_match_misc3 *misc3 = &value->misc3;
	bool is_ipv4 = DR_MASK_IS_ICMPV4_SET(misc3);
	uint32_t *icmp_header_data;
	uint8_t *icmp_type;
	uint8_t *icmp_code;

	if (is_ipv4) {
		icmp_header_data = &misc3->icmpv4_header_data;
		icmp_type = &misc3->icmpv4_type;
		icmp_code = &misc3->icmpv4_code;
	} else {
		icmp_header_data = &misc3->icmpv6_header_data;
		icmp_type = &misc3->icmpv6_type;
		icmp_code = &misc3->icmpv6_code;
	}

	DR_STE_SET(icmp_v1, tag, icmp_header_data, *icmp_header_data);
	DR_STE_SET(icmp_v1, tag, icmp_type, *icmp_type);
	DR_STE_SET(icmp_v1, tag, icmp_code, *icmp_code);

	*icmp_header_data = 0;
	*icmp_type = 0;
	*icmp_code = 0;

	return 0;
}

static void dr_ste_v1_build_icmp_init(struct dr_ste_build *sb,
				      struct dr_match_param *mask)
{
	dr_ste_v1_build_icmp_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_V1_LU_TYPE_ETHL4_MISC_O;
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_icmp_tag;
}
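/*
 * Note that ICMPv4 and ICMPv6 above share a single icmp_v1 STE layout;
 * the builder merely picks the v4 or v6 misc3 fields as the source.
 * The ETHL4_MISC_O lookup type it uses is also used by the TCP
 * seq/ack-number builder further below.
 */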
static int dr_ste_v1_build_general_purpose_tag(struct dr_match_param *value,
					       struct dr_ste_build *sb,
					       uint8_t *tag)
{
	struct dr_match_misc2 *misc2 = &value->misc2;

	DR_STE_SET_TAG(general_purpose, tag, general_purpose_lookup_field,
		       misc2, metadata_reg_a);

	return 0;
}

static void dr_ste_v1_build_general_purpose_init(struct dr_ste_build *sb,
						 struct dr_match_param *mask)
{
	dr_ste_v1_build_general_purpose_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_V1_LU_TYPE_GENERAL_PURPOSE;
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_general_purpose_tag;
}

static int dr_ste_v1_build_eth_l4_misc_tag(struct dr_match_param *value,
					   struct dr_ste_build *sb,
					   uint8_t *tag)
{
	struct dr_match_misc3 *misc3 = &value->misc3;

	if (sb->inner) {
		DR_STE_SET_TAG(eth_l4_misc_v1, tag, seq_num, misc3, inner_tcp_seq_num);
		DR_STE_SET_TAG(eth_l4_misc_v1, tag, ack_num, misc3, inner_tcp_ack_num);
	} else {
		DR_STE_SET_TAG(eth_l4_misc_v1, tag, seq_num, misc3, outer_tcp_seq_num);
		DR_STE_SET_TAG(eth_l4_misc_v1, tag, ack_num, misc3, outer_tcp_ack_num);
	}

	return 0;
}

static void dr_ste_v1_build_eth_l4_misc_init(struct dr_ste_build *sb,
					     struct dr_match_param *mask)
{
	dr_ste_v1_build_eth_l4_misc_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_V1_LU_TYPE_ETHL4_MISC_O;
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_eth_l4_misc_tag;
}

static int dr_ste_v1_build_flex_parser_tnl_vxlan_gpe_tag(struct dr_match_param *value,
							 struct dr_ste_build *sb,
							 uint8_t *tag)
{
	struct dr_match_misc3 *misc3 = &value->misc3;

	DR_STE_SET_TAG(flex_parser_tnl_vxlan_gpe, tag,
		       outer_vxlan_gpe_flags, misc3, outer_vxlan_gpe_flags);
	DR_STE_SET_TAG(flex_parser_tnl_vxlan_gpe, tag,
		       outer_vxlan_gpe_next_protocol, misc3, outer_vxlan_gpe_next_protocol);
	DR_STE_SET_TAG(flex_parser_tnl_vxlan_gpe, tag,
		       outer_vxlan_gpe_vni, misc3, outer_vxlan_gpe_vni);

	return 0;
}

static void dr_ste_v1_build_flex_parser_tnl_vxlan_gpe_init(struct dr_ste_build *sb,
							   struct dr_match_param *mask)
{
	dr_ste_v1_build_flex_parser_tnl_vxlan_gpe_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_V1_LU_TYPE_FLEX_PARSER_TNL_HEADER;
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_flex_parser_tnl_vxlan_gpe_tag;
}

static int dr_ste_v1_build_flex_parser_tnl_geneve_tag(struct dr_match_param *value,
						      struct dr_ste_build *sb,
						      uint8_t *tag)
{
	struct dr_match_misc *misc = &value->misc;

	DR_STE_SET_TAG(flex_parser_tnl_geneve, tag,
		       geneve_protocol_type, misc, geneve_protocol_type);
	DR_STE_SET_TAG(flex_parser_tnl_geneve, tag,
		       geneve_oam, misc, geneve_oam);
	DR_STE_SET_TAG(flex_parser_tnl_geneve, tag,
		       geneve_opt_len, misc, geneve_opt_len);
	DR_STE_SET_TAG(flex_parser_tnl_geneve, tag,
		       geneve_vni, misc, geneve_vni);

	return 0;
}

static void dr_ste_v1_build_flex_parser_tnl_geneve_init(struct dr_ste_build *sb,
							struct dr_match_param *mask)
{
	dr_ste_v1_build_flex_parser_tnl_geneve_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_V1_LU_TYPE_FLEX_PARSER_TNL_HEADER;
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_flex_parser_tnl_geneve_tag;
}

static int dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_tag(struct dr_match_param *value,
							      struct dr_ste_build *sb,
							      uint8_t *tag)
{
	uint8_t parser_id = sb->caps->flex_parser_id_geneve_opt_0;
	uint8_t *parser_ptr = dr_ste_calc_flex_parser_offset(tag, parser_id);
	struct dr_match_misc3 *misc3 = &value->misc3;

	*(__be32 *)parser_ptr = htobe32(misc3->geneve_tlv_option_0_data);
	misc3->geneve_tlv_option_0_data = 0;

	return 0;
}
static void dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_init(struct dr_ste_build *sb,
								struct dr_match_param *mask)
{
	dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_tag(mask, sb, sb->bit_mask);

	/* STEs with lookup type FLEX_PARSER_{0/1} include
	 * flex parsers_{0-3}/{4-7} respectively.
	 */
	sb->lu_type = sb->caps->flex_parser_id_geneve_opt_0 <= DR_STE_MAX_FLEX_0_ID ?
		      DR_STE_V1_LU_TYPE_FLEX_PARSER_0 :
		      DR_STE_V1_LU_TYPE_FLEX_PARSER_1;

	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_tag;
}

static int dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_exist_tag(struct dr_match_param *value,
								    struct dr_ste_build *sb,
								    uint8_t *tag)
{
	uint8_t parser_id = sb->caps->flex_parser_id_geneve_opt_0;
	struct dr_match_misc *misc = &value->misc;

	if (misc->geneve_tlv_option_0_exist) {
		DR_STE_SET(flex_parser_ok, tag, flex_parsers_ok, 1 << parser_id);
		misc->geneve_tlv_option_0_exist = 0;
	}

	return 0;
}

static void dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_exist_init(struct dr_ste_build *sb,
								      struct dr_match_param *mask)
{
	sb->lu_type = DR_STE_V1_LU_TYPE_FLEX_PARSER_OK;
	dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_exist_tag(mask, sb, sb->bit_mask);
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_exist_tag;
}

static int dr_ste_v1_build_flex_parser_tnl_gtpu_tag(struct dr_match_param *value,
						    struct dr_ste_build *sb,
						    uint8_t *tag)
{
	struct dr_match_misc3 *misc3 = &value->misc3;

	DR_STE_SET_TAG(flex_parser_tnl_gtpu, tag, gtpu_msg_flags, misc3, gtpu_msg_flags);
	DR_STE_SET_TAG(flex_parser_tnl_gtpu, tag, gtpu_msg_type, misc3, gtpu_msg_type);
	DR_STE_SET_TAG(flex_parser_tnl_gtpu, tag, gtpu_teid, misc3, gtpu_teid);

	return 0;
}

static void dr_ste_v1_build_flex_parser_tnl_gtpu_init(struct dr_ste_build *sb,
						      struct dr_match_param *mask)
{
	dr_ste_v1_build_flex_parser_tnl_gtpu_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_V1_LU_TYPE_FLEX_PARSER_TNL_HEADER;
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_flex_parser_tnl_gtpu_tag;
}

static int dr_ste_v1_build_tnl_gtpu_flex_parser_0_tag(struct dr_match_param *value,
						      struct dr_ste_build *sb,
						      uint8_t *tag)
{
	if (sb->caps->flex_parser_id_gtpu_dw_0 <= DR_STE_MAX_FLEX_0_ID)
		DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_dw_0, sb->caps, &value->misc3);
	if (sb->caps->flex_parser_id_gtpu_teid <= DR_STE_MAX_FLEX_0_ID)
		DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_teid, sb->caps, &value->misc3);
	if (sb->caps->flex_parser_id_gtpu_dw_2 <= DR_STE_MAX_FLEX_0_ID)
		DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_dw_2, sb->caps, &value->misc3);
	if (sb->caps->flex_parser_id_gtpu_first_ext_dw_0 <= DR_STE_MAX_FLEX_0_ID)
		DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_first_ext_dw_0, sb->caps, &value->misc3);

	return 0;
}

static void dr_ste_v1_build_tnl_gtpu_flex_parser_0_init(struct dr_ste_build *sb,
							struct dr_match_param *mask)
{
	dr_ste_v1_build_tnl_gtpu_flex_parser_0_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_V1_LU_TYPE_FLEX_PARSER_0;
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_tnl_gtpu_flex_parser_0_tag;
}

static int dr_ste_v1_build_tnl_gtpu_flex_parser_1_tag(struct dr_match_param *value,
						      struct dr_ste_build *sb,
						      uint8_t *tag)
{
	if (sb->caps->flex_parser_id_gtpu_dw_0 > DR_STE_MAX_FLEX_0_ID)
		DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_dw_0, sb->caps, &value->misc3);
	if (sb->caps->flex_parser_id_gtpu_teid > DR_STE_MAX_FLEX_0_ID)
		DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_teid, sb->caps, &value->misc3);
	if (sb->caps->flex_parser_id_gtpu_dw_2 > DR_STE_MAX_FLEX_0_ID)
		DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_dw_2, sb->caps, &value->misc3);
	if (sb->caps->flex_parser_id_gtpu_first_ext_dw_0 > DR_STE_MAX_FLEX_0_ID)
		DR_STE_SET_FLEX_PARSER_FIELD(tag, gtpu_first_ext_dw_0, sb->caps, &value->misc3);

	return 0;
}
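/*
 * The GTP-U flex parser tag builders above split the same four fields by
 * parser bank: a field whose parser id is <= DR_STE_MAX_FLEX_0_ID is
 * matched via a FLEX_PARSER_0 STE, otherwise via FLEX_PARSER_1. A device
 * that maps, say, gtpu_teid to parser 5 therefore gets the TEID matched
 * by the FLEX_PARSER_1 builder and the remaining fields by FLEX_PARSER_0.
 */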
static void dr_ste_v1_build_tnl_gtpu_flex_parser_1_init(struct dr_ste_build *sb,
							struct dr_match_param *mask)
{
	dr_ste_v1_build_tnl_gtpu_flex_parser_1_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_V1_LU_TYPE_FLEX_PARSER_1;
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_tnl_gtpu_flex_parser_1_tag;
}

static int dr_ste_v1_build_register_0_tag(struct dr_match_param *value,
					  struct dr_ste_build *sb,
					  uint8_t *tag)
{
	struct dr_match_misc2 *misc2 = &value->misc2;

	DR_STE_SET_TAG(register_0, tag, register_0_h, misc2, metadata_reg_c_0);
	DR_STE_SET_TAG(register_0, tag, register_0_l, misc2, metadata_reg_c_1);
	DR_STE_SET_TAG(register_0, tag, register_1_h, misc2, metadata_reg_c_2);
	DR_STE_SET_TAG(register_0, tag, register_1_l, misc2, metadata_reg_c_3);

	return 0;
}

static void dr_ste_v1_build_register_0_init(struct dr_ste_build *sb,
					    struct dr_match_param *mask)
{
	dr_ste_v1_build_register_0_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_V1_LU_TYPE_STEERING_REGISTERS_0;
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_register_0_tag;
}

static int dr_ste_v1_build_register_1_tag(struct dr_match_param *value,
					  struct dr_ste_build *sb,
					  uint8_t *tag)
{
	struct dr_match_misc2 *misc2 = &value->misc2;

	DR_STE_SET_TAG(register_1, tag, register_2_h, misc2, metadata_reg_c_4);
	DR_STE_SET_TAG(register_1, tag, register_2_l, misc2, metadata_reg_c_5);
	DR_STE_SET_TAG(register_1, tag, register_3_h, misc2, metadata_reg_c_6);
	DR_STE_SET_TAG(register_1, tag, register_3_l, misc2, metadata_reg_c_7);

	return 0;
}

static void dr_ste_v1_build_register_1_init(struct dr_ste_build *sb,
					    struct dr_match_param *mask)
{
	dr_ste_v1_build_register_1_tag(mask, sb, sb->bit_mask);

	sb->lu_type = DR_STE_V1_LU_TYPE_STEERING_REGISTERS_1;
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_register_1_tag;
}

static void dr_ste_v1_build_src_gvmi_qpn_bit_mask(struct dr_match_param *value,
						  struct dr_ste_build *sb)
{
	struct dr_match_misc *misc_mask = &value->misc;
	uint8_t *bit_mask = sb->bit_mask;

	if (sb->rx && misc_mask->source_port)
		DR_STE_SET(src_gvmi_qp_v1, bit_mask, functional_lb, 1);

	DR_STE_SET_ONES(src_gvmi_qp_v1, bit_mask, source_gvmi, misc_mask, source_port);
	DR_STE_SET_ONES(src_gvmi_qp_v1, bit_mask, source_qp, misc_mask, source_sqn);
}

static int dr_ste_v1_build_src_gvmi_qpn_tag(struct dr_match_param *value,
					    struct dr_ste_build *sb,
					    uint8_t *tag)
{
	struct dr_match_misc *misc = &value->misc;
	struct dr_devx_vport_cap *vport_cap;
	uint8_t *bit_mask = sb->bit_mask;
	bool source_gvmi_set;

	DR_STE_SET_TAG(src_gvmi_qp_v1, tag, source_qp, misc, source_sqn);

	source_gvmi_set = DR_STE_GET(src_gvmi_qp_v1, bit_mask, source_gvmi);
	if (source_gvmi_set) {
		vport_cap = dr_vports_table_get_vport_cap(sb->caps,
							  misc->source_port);
		if (!vport_cap)
			return errno;

		if (vport_cap->vport_gvmi)
			DR_STE_SET(src_gvmi_qp_v1, tag, source_gvmi,
				   vport_cap->vport_gvmi);

		/* Make sure that this packet is not coming from the wire since
		 * wire GVMI is set to 0 and can be aliased with another port
		 */
		if (sb->rx && misc->source_port != WIRE_PORT)
			DR_STE_SET(src_gvmi_qp_v1, tag, functional_lb, 1);

		misc->source_port = 0;
	}

	return 0;
}

static void dr_ste_v1_build_src_gvmi_qpn_init(struct dr_ste_build *sb,
					      struct dr_match_param *mask)
{
	dr_ste_v1_build_src_gvmi_qpn_bit_mask(mask, sb);

	sb->lu_type = DR_STE_V1_LU_TYPE_SRC_QP_GVMI;
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_src_gvmi_qpn_tag;
}
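/*
 * Cross-domain ASO connection tracking: the helper below writes the ASO
 * CT action into the action field of a match_bwc STE, allowing a rule in
 * one domain to execute a CT action on an object owned by a peer domain.
 * The per-offset STE chains are built by dr_ste_v1_aso_other_domain_link()
 * further down.
 */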
static void dr_ste_v1_set_aso_ct_cross_dmn(uint8_t *hw_ste,
					   uint32_t object_id,
					   uint32_t offset,
					   uint8_t dest_reg_id,
					   bool direction)
{
	uint8_t *action_addr;

	action_addr = DEVX_ADDR_OF(ste_match_bwc_v1, hw_ste, action);

	dr_ste_v1_set_aso_ct(action_addr, object_id, offset,
			     dest_reg_id, direction);
}

static void dr_ste_set_flex_parser(uint16_t lu_type,
				   uint32_t *misc4_field_id,
				   uint32_t *misc4_field_value,
				   bool *parser_is_used,
				   uint8_t *tag)
{
	uint32_t id = *misc4_field_id;
	uint8_t *parser_ptr;
	bool skip_parser;

	/* Since this is a shared function to set flex parsers,
	 * we need to skip it if lookup type and parser ID don't match
	 */
	skip_parser = id <= DR_STE_MAX_FLEX_0_ID ?
		      lu_type != DR_STE_V1_LU_TYPE_FLEX_PARSER_0 :
		      lu_type != DR_STE_V1_LU_TYPE_FLEX_PARSER_1;

	skip_parser = skip_parser || (id >= NUM_OF_FLEX_PARSERS);

	if (skip_parser || parser_is_used[id])
		return;

	parser_is_used[id] = true;
	parser_ptr = dr_ste_calc_flex_parser_offset(tag, id);

	*(__be32 *)parser_ptr = htobe32(*misc4_field_value);
	*misc4_field_id = 0;
	*misc4_field_value = 0;
}

static int dr_ste_v1_build_flex_parser_tag(struct dr_match_param *value,
					   struct dr_ste_build *sb,
					   uint8_t *tag)
{
	struct dr_match_misc4 *misc_4_mask = &value->misc4;
	bool parser_is_used[NUM_OF_FLEX_PARSERS] = {};

	dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_0,
			       &misc_4_mask->prog_sample_field_value_0,
			       parser_is_used, tag);
	dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_1,
			       &misc_4_mask->prog_sample_field_value_1,
			       parser_is_used, tag);
	dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_2,
			       &misc_4_mask->prog_sample_field_value_2,
			       parser_is_used, tag);
	dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_3,
			       &misc_4_mask->prog_sample_field_value_3,
			       parser_is_used, tag);
	dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_4,
			       &misc_4_mask->prog_sample_field_value_4,
			       parser_is_used, tag);
	dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_5,
			       &misc_4_mask->prog_sample_field_value_5,
			       parser_is_used, tag);
	dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_6,
			       &misc_4_mask->prog_sample_field_value_6,
			       parser_is_used, tag);
	dr_ste_set_flex_parser(sb->lu_type, &misc_4_mask->prog_sample_field_id_7,
			       &misc_4_mask->prog_sample_field_value_7,
			       parser_is_used, tag);

	return 0;
}

static void dr_ste_v1_build_flex_parser_0_init(struct dr_ste_build *sb,
					       struct dr_match_param *mask)
{
	sb->lu_type = DR_STE_V1_LU_TYPE_FLEX_PARSER_0;
	dr_ste_v1_build_flex_parser_tag(mask, sb, sb->bit_mask);
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_flex_parser_tag;
}

static void dr_ste_v1_build_flex_parser_1_init(struct dr_ste_build *sb,
					       struct dr_match_param *mask)
{
	sb->lu_type = DR_STE_V1_LU_TYPE_FLEX_PARSER_1;
	dr_ste_v1_build_flex_parser_tag(mask, sb, sb->bit_mask);
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_flex_parser_tag;
}
static int dr_ste_v1_build_tunnel_header_tag(struct dr_match_param *value,
					     struct dr_ste_build *sb,
					     uint8_t *tag)
{
	struct dr_match_misc5 *misc5 = &value->misc5;

	DR_STE_SET_TAG(tunnel_header_v1, tag, tunnel_header_0, misc5, tunnel_header_0);
	DR_STE_SET_TAG(tunnel_header_v1, tag, tunnel_header_1, misc5, tunnel_header_1);
	if (sb->caps->support_full_tnl_hdr) {
		DR_STE_SET_TAG(tunnel_header_v1, tag, tunnel_header_2, misc5, tunnel_header_2);
		DR_STE_SET_TAG(tunnel_header_v1, tag, tunnel_header_3, misc5, tunnel_header_3);
	}

	return 0;
}

static void dr_ste_v1_build_tunnel_header_init(struct dr_ste_build *sb,
					       struct dr_match_param *mask)
{
	sb->lu_type = sb->caps->support_full_tnl_hdr ?
		      DR_STE_V1_LU_TYPE_TNL_HEADER :
		      DR_STE_V1_LU_TYPE_FLEX_PARSER_TNL_HEADER;
	dr_ste_v1_build_tunnel_header_tag(mask, sb, sb->bit_mask);
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_tunnel_header_tag;
}

static int dr_ste_v1_build_ib_l4_tag(struct dr_match_param *value,
				     struct dr_ste_build *sb,
				     uint8_t *tag)
{
	struct dr_match_misc *misc = &value->misc;

	DR_STE_SET_TAG(ib_l4, tag, opcode, misc, bth_opcode);
	DR_STE_SET_TAG(ib_l4, tag, qp, misc, bth_dst_qp);
	DR_STE_SET_TAG(ib_l4, tag, ackreq, misc, bth_a);

	return 0;
}

static void dr_ste_v1_build_ib_l4_init(struct dr_ste_build *sb,
				       struct dr_match_param *mask)
{
	sb->lu_type = DR_STE_V1_LU_TYPE_IBL4;
	dr_ste_v1_build_ib_l4_tag(mask, sb, sb->bit_mask);
	sb->byte_mask = dr_ste_conv_bit_to_byte_mask(sb->bit_mask);
	sb->ste_build_tag_func = &dr_ste_v1_build_ib_l4_tag;
}
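/*
 * The def*_v1 builders below target definer-based MATCH STEs. Unlike the
 * legacy builders above, they keep the mask in sb->match (the *_mask
 * helpers pre-process it and then run the same *_tag routine over it),
 * and no byte_mask is derived. Where TCP flags are matched bit by bit,
 * the spec's tcp_flags field is decomposed with fin at bit 0 up through
 * cwr at bit 7 (def2 additionally matches ns at bit 8).
 */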
static int dr_ste_v1_build_def0_tag(struct dr_match_param *value,
				    struct dr_ste_build *sb,
				    uint8_t *tag)
{
	struct dr_match_misc2 *misc2 = &value->misc2;
	struct dr_match_spec *outer = &value->outer;
	struct dr_match_spec *inner = &value->inner;

	DR_STE_SET_TAG(def0_v1, tag, metadata_reg_c_0, misc2, metadata_reg_c_0);
	DR_STE_SET_TAG(def0_v1, tag, metadata_reg_c_1, misc2, metadata_reg_c_1);
	DR_STE_SET_TAG(def0_v1, tag, dmac_47_16, outer, dmac_47_16);
	DR_STE_SET_TAG(def0_v1, tag, dmac_15_0, outer, dmac_15_0);
	DR_STE_SET_TAG(def0_v1, tag, smac_47_16, outer, smac_47_16);
	DR_STE_SET_TAG(def0_v1, tag, smac_15_0, outer, smac_15_0);
	DR_STE_SET_TAG(def0_v1, tag, ethertype, outer, ethertype);
	DR_STE_SET_TAG(def0_v1, tag, ip_frag, outer, frag);

	if (outer->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET(def0_v1, tag, outer_l3_type, STE_IPV4);
		outer->ip_version = 0;
	} else if (outer->ip_version == IP_VERSION_IPV6) {
		DR_STE_SET(def0_v1, tag, outer_l3_type, STE_IPV6);
		outer->ip_version = 0;
	}

	if (outer->cvlan_tag) {
		DR_STE_SET(def0_v1, tag, first_vlan_qualifier, DR_STE_CVLAN);
		outer->cvlan_tag = 0;
	} else if (outer->svlan_tag) {
		DR_STE_SET(def0_v1, tag, first_vlan_qualifier, DR_STE_SVLAN);
		outer->svlan_tag = 0;
	}

	DR_STE_SET_TAG(def0_v1, tag, first_priority, outer, first_prio);
	DR_STE_SET_TAG(def0_v1, tag, first_vlan_id, outer, first_vid);
	DR_STE_SET_TAG(def0_v1, tag, first_cfi, outer, first_cfi);

	if (sb->caps->definer_supp_checksum) {
		DR_STE_SET_TAG(def0_v1, tag, outer_l3_ok, outer, l3_ok);
		DR_STE_SET_TAG(def0_v1, tag, outer_l4_ok, outer, l4_ok);
		DR_STE_SET_TAG(def0_v1, tag, inner_l3_ok, inner, l3_ok);
		DR_STE_SET_TAG(def0_v1, tag, inner_l4_ok, inner, l4_ok);
		DR_STE_SET_TAG(def0_v1, tag, outer_ipv4_checksum_ok, outer, ipv4_checksum_ok);
		DR_STE_SET_TAG(def0_v1, tag, outer_l4_checksum_ok, outer, l4_checksum_ok);
		DR_STE_SET_TAG(def0_v1, tag, inner_ipv4_checksum_ok, inner, ipv4_checksum_ok);
		DR_STE_SET_TAG(def0_v1, tag, inner_l4_checksum_ok, inner, l4_checksum_ok);
	}

	if (outer->tcp_flags) {
		DR_STE_SET_BOOL(def0_v1, tag, tcp_cwr, outer->tcp_flags & (1 << 7));
		DR_STE_SET_BOOL(def0_v1, tag, tcp_ece, outer->tcp_flags & (1 << 6));
		DR_STE_SET_BOOL(def0_v1, tag, tcp_urg, outer->tcp_flags & (1 << 5));
		DR_STE_SET_BOOL(def0_v1, tag, tcp_ack, outer->tcp_flags & (1 << 4));
		DR_STE_SET_BOOL(def0_v1, tag, tcp_psh, outer->tcp_flags & (1 << 3));
		DR_STE_SET_BOOL(def0_v1, tag, tcp_rst, outer->tcp_flags & (1 << 2));
		DR_STE_SET_BOOL(def0_v1, tag, tcp_syn, outer->tcp_flags & (1 << 1));
		DR_STE_SET_BOOL(def0_v1, tag, tcp_fin, outer->tcp_flags & (1 << 0));
		outer->tcp_flags ^= (outer->tcp_flags & 0xff);
	}

	return 0;
}

static void dr_ste_v1_build_def0_mask(struct dr_match_param *value,
				      struct dr_ste_build *sb)
{
	struct dr_match_spec *outer = &value->outer;
	uint8_t *tag = sb->match;

	if (outer->svlan_tag || outer->cvlan_tag) {
		DR_STE_SET(def0_v1, tag, first_vlan_qualifier, -1);
		outer->cvlan_tag = 0;
		outer->svlan_tag = 0;
	}

	dr_ste_v1_build_def0_tag(value, sb, tag);
}

static void dr_ste_v1_build_def0_init(struct dr_ste_build *sb,
				      struct dr_match_param *mask)
{
	sb->lu_type = DR_STE_V1_LU_TYPE_MATCH;
	dr_ste_v1_build_def0_mask(mask, sb);
	sb->ste_build_tag_func = &dr_ste_v1_build_def0_tag;
}

static int dr_ste_v1_build_def2_tag(struct dr_match_param *value,
				    struct dr_ste_build *sb,
				    uint8_t *tag)
{
	struct dr_match_misc2 *misc2 = &value->misc2;
	struct dr_match_spec *outer = &value->outer;
	struct dr_match_spec *inner = &value->inner;

	DR_STE_SET_TAG(def2_v1, tag, metadata_reg_a, misc2, metadata_reg_a);
	DR_STE_SET_TAG(def2_v1, tag, outer_ip_version, outer, ip_version);
	DR_STE_SET_TAG(def2_v1, tag, outer_ip_ihl, outer, ipv4_ihl);
	DR_STE_SET_TAG(def2_v1, tag, outer_ip_dscp, outer, ip_dscp);
	DR_STE_SET_TAG(def2_v1, tag, outer_ip_ecn, outer, ip_ecn);
	DR_STE_SET_TAG(def2_v1, tag, outer_ip_ttl, outer, ip_ttl_hoplimit);
	DR_STE_SET_TAG(def2_v1, tag, outer_ip_protocol, outer, ip_protocol);
	DR_STE_SET_TAG(def2_v1, tag, outer_l4_sport, outer, tcp_sport);
	DR_STE_SET_TAG(def2_v1, tag, outer_l4_dport, outer, tcp_dport);
	DR_STE_SET_TAG(def2_v1, tag, outer_l4_sport, outer, udp_sport);
	DR_STE_SET_TAG(def2_v1, tag, outer_l4_dport, outer, udp_dport);
	DR_STE_SET_TAG(def2_v1, tag, outer_ip_frag, outer, frag);

	if (outer->tcp_flags) {
		DR_STE_SET_BOOL(def2_v1, tag, tcp_ns, outer->tcp_flags & (1 << 8));
		DR_STE_SET_BOOL(def2_v1, tag, tcp_cwr, outer->tcp_flags & (1 << 7));
		DR_STE_SET_BOOL(def2_v1, tag, tcp_ece, outer->tcp_flags & (1 << 6));
		DR_STE_SET_BOOL(def2_v1, tag, tcp_urg, outer->tcp_flags & (1 << 5));
		DR_STE_SET_BOOL(def2_v1, tag, tcp_ack, outer->tcp_flags & (1 << 4));
		DR_STE_SET_BOOL(def2_v1, tag, tcp_psh, outer->tcp_flags & (1 << 3));
		DR_STE_SET_BOOL(def2_v1, tag, tcp_rst, outer->tcp_flags & (1 << 2));
		DR_STE_SET_BOOL(def2_v1, tag, tcp_syn, outer->tcp_flags & (1 << 1));
		DR_STE_SET_BOOL(def2_v1, tag, tcp_fin, outer->tcp_flags & (1 << 0));
		outer->tcp_flags = 0;
	}

	if (sb->caps->definer_supp_checksum) {
		DR_STE_SET_TAG(def2_v1, tag, outer_l3_ok, outer, l3_ok);
		DR_STE_SET_TAG(def2_v1, tag, outer_l4_ok, outer, l4_ok);
		DR_STE_SET_TAG(def2_v1, tag, inner_l3_ok, inner, l3_ok);
		DR_STE_SET_TAG(def2_v1, tag, inner_l4_ok, inner, l4_ok);
		DR_STE_SET_TAG(def2_v1, tag, outer_ipv4_checksum_ok, outer, ipv4_checksum_ok);
		DR_STE_SET_TAG(def2_v1, tag, outer_l4_checksum_ok, outer, l4_checksum_ok);
		DR_STE_SET_TAG(def2_v1, tag, inner_ipv4_checksum_ok, inner, ipv4_checksum_ok);
		DR_STE_SET_TAG(def2_v1, tag, inner_l4_checksum_ok, inner, l4_checksum_ok);
	}

	return 0;
}

static void dr_ste_v1_build_def2_mask(struct dr_match_param *value,
				      struct dr_ste_build *sb)
{
	struct dr_match_spec *outer = &value->outer;
	uint8_t *tag = sb->match;

	if (outer->ip_version)
		DR_STE_SET_ONES(def2_v1, tag, outer_ip_version, outer, ip_version);

	dr_ste_v1_build_def2_tag(value, sb, sb->match);
}

static void dr_ste_v1_build_def2_init(struct dr_ste_build *sb,
				      struct dr_match_param *mask)
{
	sb->lu_type = DR_STE_V1_LU_TYPE_MATCH;
	dr_ste_v1_build_def2_mask(mask, sb);
	sb->ste_build_tag_func = &dr_ste_v1_build_def2_tag;
}
static int dr_ste_v1_build_def6_tag(struct dr_match_param *value,
				    struct dr_ste_build *sb,
				    uint8_t *tag)
{
	struct dr_match_spec *outer = &value->outer;

	/* Upper layer should verify that the IPv6 match params are provided */
	DR_STE_SET_TAG(def6_v1, tag, dst_ipv6_127_96, outer, dst_ip_127_96);
	DR_STE_SET_TAG(def6_v1, tag, dst_ipv6_95_64, outer, dst_ip_95_64);
	DR_STE_SET_TAG(def6_v1, tag, dst_ipv6_63_32, outer, dst_ip_63_32);
	DR_STE_SET_TAG(def6_v1, tag, dst_ipv6_31_0, outer, dst_ip_31_0);
	DR_STE_SET_TAG(def6_v1, tag, outer_l4_sport, outer, tcp_sport);
	DR_STE_SET_TAG(def6_v1, tag, outer_l4_sport, outer, udp_sport);
	DR_STE_SET_TAG(def6_v1, tag, outer_l4_dport, outer, tcp_dport);
	DR_STE_SET_TAG(def6_v1, tag, outer_l4_dport, outer, udp_dport);
	DR_STE_SET_TAG(def6_v1, tag, ip_frag, outer, frag);
	DR_STE_SET_TAG(def6_v1, tag, l3_ok, outer, l3_ok);
	DR_STE_SET_TAG(def6_v1, tag, l4_ok, outer, l4_ok);

	if (outer->tcp_flags) {
		DR_STE_SET_TCP_FLAGS(def6_v1, tag, outer);
		outer->tcp_flags = 0;
	}

	return 0;
}

static void dr_ste_v1_build_def6_init(struct dr_ste_build *sb,
				      struct dr_match_param *mask)
{
	sb->lu_type = DR_STE_V1_LU_TYPE_MATCH;
	dr_ste_v1_build_def6_tag(mask, sb, sb->match);
	sb->ste_build_tag_func = &dr_ste_v1_build_def6_tag;
}

static int dr_ste_v1_build_def16_tag(struct dr_match_param *value,
				     struct dr_ste_build *sb,
				     uint8_t *tag)
{
	struct dr_match_misc2 *misc2 = &value->misc2;
	struct dr_match_misc5 *misc5 = &value->misc5;
	struct dr_match_spec *outer = &value->outer;
	struct dr_match_misc *misc = &value->misc;
	struct dr_devx_vport_cap *vport_cap;
	bool source_gvmi_set;

	DR_STE_SET_TAG(def16_v1, tag, tunnel_header_0, misc5, tunnel_header_0);
	DR_STE_SET_TAG(def16_v1, tag, tunnel_header_1, misc5, tunnel_header_1);
	DR_STE_SET_TAG(def16_v1, tag, tunnel_header_2, misc5, tunnel_header_2);
	DR_STE_SET_TAG(def16_v1, tag, tunnel_header_3, misc5, tunnel_header_3);
	DR_STE_SET_TAG(def16_v1, tag, metadata_reg_a, misc2, metadata_reg_a);

	source_gvmi_set = DR_STE_GET(def16_v1, sb->match, source_gvmi);
	if (source_gvmi_set) {
		vport_cap = dr_vports_table_get_vport_cap(sb->caps,
							  misc->source_port);
		if (!vport_cap)
			return errno;

		if (vport_cap->vport_gvmi)
			DR_STE_SET(def16_v1, tag, source_gvmi, vport_cap->vport_gvmi);

		misc->source_port = 0;
	}

	if (outer->cvlan_tag) {
		DR_STE_SET(def16_v1, tag, outer_first_vlan_type, DR_STE_CVLAN);
		outer->cvlan_tag = 0;
	} else if (outer->svlan_tag) {
		DR_STE_SET(def16_v1, tag, outer_first_vlan_type, DR_STE_SVLAN);
		outer->svlan_tag = 0;
	}

	if (outer->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET(def16_v1, tag, outer_l3_type, STE_IPV4);
		outer->ip_version = 0;
	} else if (outer->ip_version == IP_VERSION_IPV6) {
		DR_STE_SET(def16_v1, tag, outer_l3_type, STE_IPV6);
		outer->ip_version = 0;
	}

	if (outer->ip_protocol == IP_PROTOCOL_UDP) {
		DR_STE_SET(def16_v1, tag, outer_l4_type, STE_UDP);
		outer->ip_protocol = 0;
	} else if (outer->ip_protocol == IP_PROTOCOL_TCP) {
		DR_STE_SET(def16_v1, tag, outer_l4_type, STE_TCP);
		outer->ip_protocol = 0;
	}

	DR_STE_SET_TAG(def16_v1, tag, source_sqn, misc, source_sqn);
	DR_STE_SET_TAG(def16_v1, tag, outer_ip_frag, outer, frag);

	return 0;
}
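/*
 * l4_type is an enumerated field (STE_UDP/STE_TCP), so it cannot be
 * masked bit-wise like the other fields. The def16/def22/def25 mask
 * builders therefore set it to all ones whenever the mask hints at a
 * TCP or UDP match (any L4 port or an explicit ip_protocol), and the
 * tag builder consumes ip_protocol to pick the concrete value.
 */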
static void dr_ste_v1_build_def16_mask(struct dr_match_param *value,
				       struct dr_ste_build *sb)
{
	struct dr_match_spec *outer = &value->outer;
	struct dr_match_misc *misc = &value->misc;
	uint8_t *tag = sb->match;
	bool is_tcp_or_udp;

	/* Hint to indicate UDP/TCP packet due to l4_type limitations */
	is_tcp_or_udp = outer->tcp_dport || outer->tcp_sport ||
			outer->udp_dport || outer->udp_sport ||
			outer->ip_protocol == IP_PROTOCOL_UDP ||
			outer->ip_protocol == IP_PROTOCOL_TCP;

	if (outer->ip_protocol && is_tcp_or_udp) {
		DR_STE_SET(def16_v1, tag, outer_l4_type, -1);
		outer->ip_protocol = 0;
	}

	if (outer->svlan_tag || outer->cvlan_tag) {
		DR_STE_SET(def16_v1, tag, outer_first_vlan_type, -1);
		outer->cvlan_tag = 0;
		outer->svlan_tag = 0;
	}

	dr_ste_v1_build_def16_tag(value, sb, tag);
	DR_STE_SET_ONES(def16_v1, tag, source_gvmi, misc, source_port);
}

static void dr_ste_v1_build_def16_init(struct dr_ste_build *sb,
				       struct dr_match_param *mask)
{
	sb->lu_type = DR_STE_V1_LU_TYPE_MATCH;
	dr_ste_v1_build_def16_mask(mask, sb);
	sb->ste_build_tag_func = &dr_ste_v1_build_def16_tag;
}

static int dr_ste_v1_build_def22_tag(struct dr_match_param *value,
				     struct dr_ste_build *sb,
				     uint8_t *tag)
{
	struct dr_match_misc2 *misc2 = &value->misc2;
	struct dr_match_spec *outer = &value->outer;

	if (outer->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET_TAG(def22_v1, tag, outer_ip_src_addr, outer, src_ip_31_0);
		DR_STE_SET_TAG(def22_v1, tag, outer_ip_dst_addr, outer, dst_ip_31_0);
	}

	if (outer->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET(def22_v1, tag, outer_l3_type, STE_IPV4);
		outer->ip_version = 0;
	} else if (outer->ip_version == IP_VERSION_IPV6) {
		DR_STE_SET(def22_v1, tag, outer_l3_type, STE_IPV6);
		outer->ip_version = 0;
	}

	if (outer->ip_protocol == IP_PROTOCOL_UDP) {
		DR_STE_SET(def22_v1, tag, outer_l4_type, STE_UDP);
		outer->ip_protocol = 0;
	} else if (outer->ip_protocol == IP_PROTOCOL_TCP) {
		DR_STE_SET(def22_v1, tag, outer_l4_type, STE_TCP);
		outer->ip_protocol = 0;
	}

	if (outer->cvlan_tag) {
		DR_STE_SET(def22_v1, tag, first_vlan_qualifier, DR_STE_CVLAN);
		outer->cvlan_tag = 0;
	} else if (outer->svlan_tag) {
		DR_STE_SET(def22_v1, tag, first_vlan_qualifier, DR_STE_SVLAN);
		outer->svlan_tag = 0;
	}

	DR_STE_SET_TAG(def22_v1, tag, outer_ip_frag, outer, frag);
	DR_STE_SET_TAG(def22_v1, tag, outer_l4_sport, outer, tcp_sport);
	DR_STE_SET_TAG(def22_v1, tag, outer_l4_sport, outer, udp_sport);
	DR_STE_SET_TAG(def22_v1, tag, outer_l4_dport, outer, tcp_dport);
	DR_STE_SET_TAG(def22_v1, tag, outer_l4_dport, outer, udp_dport);
	DR_STE_SET_TAG(def22_v1, tag, first_priority, outer, first_prio);
	DR_STE_SET_TAG(def22_v1, tag, first_vlan_id, outer, first_vid);
	DR_STE_SET_TAG(def22_v1, tag, first_cfi, outer, first_cfi);
	DR_STE_SET_TAG(def22_v1, tag, metadata_reg_c_0, misc2, metadata_reg_c_0);
	DR_STE_SET_TAG(def22_v1, tag, outer_dmac_47_16, outer, dmac_47_16);
	DR_STE_SET_TAG(def22_v1, tag, outer_dmac_15_0, outer, dmac_15_0);
	DR_STE_SET_TAG(def22_v1, tag, outer_smac_47_16, outer, smac_47_16);
	DR_STE_SET_TAG(def22_v1, tag, outer_smac_15_0, outer, smac_15_0);

	return 0;
}

static void dr_ste_v1_build_def22_mask(struct dr_match_param *value,
				       struct dr_ste_build *sb)
{
	struct dr_match_spec *outer = &value->outer;
	uint8_t *tag = sb->match;
	bool is_tcp_or_udp;

	/* Hint to indicate UDP/TCP packet due to l4_type limitations */
	is_tcp_or_udp = outer->tcp_dport || outer->tcp_sport ||
			outer->udp_dport || outer->udp_sport ||
			outer->ip_protocol == IP_PROTOCOL_UDP ||
			outer->ip_protocol == IP_PROTOCOL_TCP;

	if (outer->ip_protocol && is_tcp_or_udp) {
		DR_STE_SET(def22_v1, tag, outer_l4_type, -1);
		outer->ip_protocol = 0;
	}

	if (outer->svlan_tag || outer->cvlan_tag) {
		DR_STE_SET(def22_v1, tag, first_vlan_qualifier, -1);
		outer->cvlan_tag = 0;
		outer->svlan_tag = 0;
	}

	dr_ste_v1_build_def22_tag(value, sb, tag);
}

static void dr_ste_v1_build_def22_init(struct dr_ste_build *sb,
				       struct dr_match_param *mask)
{
	sb->lu_type = DR_STE_V1_LU_TYPE_MATCH;
	dr_ste_v1_build_def22_mask(mask, sb);
	sb->ste_build_tag_func = &dr_ste_v1_build_def22_tag;
}
static int dr_ste_v1_build_def24_tag(struct dr_match_param *value,
				     struct dr_ste_build *sb,
				     uint8_t *tag)
{
	struct dr_match_misc2 *misc2 = &value->misc2;
	struct dr_match_spec *outer = &value->outer;
	struct dr_match_spec *inner = &value->inner;

	DR_STE_SET_TAG(def24_v1, tag, metadata_reg_c_0, misc2, metadata_reg_c_0);
	DR_STE_SET_TAG(def24_v1, tag, metadata_reg_c_1, misc2, metadata_reg_c_1);
	DR_STE_SET_TAG(def24_v1, tag, metadata_reg_c_2, misc2, metadata_reg_c_2);
	DR_STE_SET_TAG(def24_v1, tag, metadata_reg_c_3, misc2, metadata_reg_c_3);

	if (outer->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET_TAG(def24_v1, tag, outer_ip_src_addr, outer, src_ip_31_0);
		DR_STE_SET_TAG(def24_v1, tag, outer_ip_dst_addr, outer, dst_ip_31_0);
	}

	if (outer->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET(def24_v1, tag, outer_l3_type, STE_IPV4);
		outer->ip_version = 0;
	} else if (outer->ip_version == IP_VERSION_IPV6) {
		DR_STE_SET(def24_v1, tag, outer_l3_type, STE_IPV6);
		outer->ip_version = 0;
	}

	DR_STE_SET_TAG(def24_v1, tag, outer_l4_sport, outer, tcp_sport);
	DR_STE_SET_TAG(def24_v1, tag, outer_l4_sport, outer, udp_sport);
	DR_STE_SET_TAG(def24_v1, tag, outer_l4_dport, outer, tcp_dport);
	DR_STE_SET_TAG(def24_v1, tag, outer_l4_dport, outer, udp_dport);
	DR_STE_SET_TAG(def24_v1, tag, outer_ip_protocol, outer, ip_protocol);
	DR_STE_SET_TAG(def24_v1, tag, outer_ip_frag, outer, frag);

	if (inner->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET(def24_v1, tag, inner_l3_type, STE_IPV4);
		inner->ip_version = 0;
	} else if (inner->ip_version == IP_VERSION_IPV6) {
		DR_STE_SET(def24_v1, tag, inner_l3_type, STE_IPV6);
		inner->ip_version = 0;
	}

	if (outer->cvlan_tag) {
		DR_STE_SET(def24_v1, tag, outer_first_vlan_type, DR_STE_CVLAN);
		outer->cvlan_tag = 0;
	} else if (outer->svlan_tag) {
		DR_STE_SET(def24_v1, tag, outer_first_vlan_type, DR_STE_SVLAN);
		outer->svlan_tag = 0;
	}

	if (inner->cvlan_tag) {
		DR_STE_SET(def24_v1, tag, inner_first_vlan_type, DR_STE_CVLAN);
		inner->cvlan_tag = 0;
	} else if (inner->svlan_tag) {
		DR_STE_SET(def24_v1, tag, inner_first_vlan_type, DR_STE_SVLAN);
		inner->svlan_tag = 0;
	}

	DR_STE_SET_TAG(def24_v1, tag, inner_ip_protocol, inner, ip_protocol);
	DR_STE_SET_TAG(def24_v1, tag, inner_ip_frag, inner, frag);

	return 0;
}

static void dr_ste_v1_build_def24_mask(struct dr_match_param *value,
				       struct dr_ste_build *sb)
{
	struct dr_match_spec *outer = &value->outer;
	struct dr_match_spec *inner = &value->inner;
	uint8_t *tag = sb->match;

	if (outer->svlan_tag || outer->cvlan_tag) {
		DR_STE_SET(def24_v1, tag, outer_first_vlan_type, -1);
		outer->cvlan_tag = 0;
		outer->svlan_tag = 0;
	}

	if (inner->svlan_tag || inner->cvlan_tag) {
		DR_STE_SET(def24_v1, tag, inner_first_vlan_type, -1);
		inner->cvlan_tag = 0;
		inner->svlan_tag = 0;
	}

	dr_ste_v1_build_def24_tag(value, sb, tag);
}

static void dr_ste_v1_build_def24_init(struct dr_ste_build *sb,
				       struct dr_match_param *mask)
{
	sb->lu_type = DR_STE_V1_LU_TYPE_MATCH;
	dr_ste_v1_build_def24_mask(mask, sb);
	sb->ste_build_tag_func = &dr_ste_v1_build_def24_tag;
}
static int dr_ste_v1_build_def25_tag(struct dr_match_param *value,
				     struct dr_ste_build *sb,
				     uint8_t *tag)
{
	struct dr_match_misc5 *misc5 = &value->misc5;
	struct dr_match_spec *outer = &value->outer;
	struct dr_match_spec *inner = &value->inner;

	if (inner->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET_TAG(def25_v1, tag, inner_ip_src_addr, inner, src_ip_31_0);
		DR_STE_SET_TAG(def25_v1, tag, inner_ip_dst_addr, inner, dst_ip_31_0);
	}

	DR_STE_SET_TAG(def25_v1, tag, inner_l4_sport, inner, tcp_sport);
	DR_STE_SET_TAG(def25_v1, tag, inner_l4_sport, inner, udp_sport);
	DR_STE_SET_TAG(def25_v1, tag, inner_l4_dport, inner, tcp_dport);
	DR_STE_SET_TAG(def25_v1, tag, inner_l4_dport, inner, udp_dport);
	DR_STE_SET_TAG(def25_v1, tag, tunnel_header_0, misc5, tunnel_header_0);
	DR_STE_SET_TAG(def25_v1, tag, tunnel_header_1, misc5, tunnel_header_1);
	DR_STE_SET_TAG(def25_v1, tag, outer_l4_dport, outer, tcp_dport);
	DR_STE_SET_TAG(def25_v1, tag, outer_l4_dport, outer, udp_dport);

	if (outer->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET(def25_v1, tag, outer_l3_type, STE_IPV4);
		outer->ip_version = 0;
	} else if (outer->ip_version == IP_VERSION_IPV6) {
		DR_STE_SET(def25_v1, tag, outer_l3_type, STE_IPV6);
		outer->ip_version = 0;
	}

	if (inner->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET(def25_v1, tag, inner_l3_type, STE_IPV4);
		inner->ip_version = 0;
	} else if (inner->ip_version == IP_VERSION_IPV6) {
		DR_STE_SET(def25_v1, tag, inner_l3_type, STE_IPV6);
		inner->ip_version = 0;
	}

	if (outer->ip_protocol == IP_PROTOCOL_UDP) {
		DR_STE_SET(def25_v1, tag, outer_l4_type, STE_UDP);
		outer->ip_protocol = 0;
	} else if (outer->ip_protocol == IP_PROTOCOL_TCP) {
		DR_STE_SET(def25_v1, tag, outer_l4_type, STE_TCP);
		outer->ip_protocol = 0;
	}

	if (inner->ip_protocol == IP_PROTOCOL_UDP) {
		DR_STE_SET(def25_v1, tag, inner_l4_type, STE_UDP);
		inner->ip_protocol = 0;
	} else if (inner->ip_protocol == IP_PROTOCOL_TCP) {
		DR_STE_SET(def25_v1, tag, inner_l4_type, STE_TCP);
		inner->ip_protocol = 0;
	}

	if (outer->cvlan_tag) {
		DR_STE_SET(def25_v1, tag, outer_first_vlan_type, DR_STE_CVLAN);
		outer->cvlan_tag = 0;
	} else if (outer->svlan_tag) {
		DR_STE_SET(def25_v1, tag, outer_first_vlan_type, DR_STE_SVLAN);
		outer->svlan_tag = 0;
	}

	if (inner->cvlan_tag) {
		DR_STE_SET(def25_v1, tag, inner_first_vlan_type, DR_STE_CVLAN);
		inner->cvlan_tag = 0;
	} else if (inner->svlan_tag) {
		DR_STE_SET(def25_v1, tag, inner_first_vlan_type, DR_STE_SVLAN);
		inner->svlan_tag = 0;
	}

	return 0;
}

static void dr_ste_v1_build_def25_mask(struct dr_match_param *mask,
				       struct dr_ste_build *sb)
{
	struct dr_match_spec *outer = &mask->outer;
	struct dr_match_spec *inner = &mask->inner;
	bool is_out_tcp_or_udp, is_in_tcp_or_udp;
	uint8_t *tag = sb->match;

	/* Hint to indicate UDP/TCP packet due to l4_type limitations */
	is_out_tcp_or_udp = outer->tcp_dport || outer->tcp_sport ||
			    outer->udp_dport || outer->udp_sport ||
			    outer->ip_protocol == IP_PROTOCOL_UDP ||
			    outer->ip_protocol == IP_PROTOCOL_TCP;
	is_in_tcp_or_udp = inner->tcp_dport || inner->tcp_sport ||
			   inner->udp_dport || inner->udp_sport ||
			   inner->ip_protocol == IP_PROTOCOL_UDP ||
			   inner->ip_protocol == IP_PROTOCOL_TCP;

	if (outer->ip_protocol && is_out_tcp_or_udp) {
		DR_STE_SET(def25_v1, tag, outer_l4_type, -1);
		outer->ip_protocol = 0;
	}

	if (outer->svlan_tag || outer->cvlan_tag) {
		DR_STE_SET(def25_v1, tag, outer_first_vlan_type, -1);
		outer->cvlan_tag = 0;
		outer->svlan_tag = 0;
	}

	if (inner->ip_protocol && is_in_tcp_or_udp) {
		DR_STE_SET(def25_v1, tag, inner_l4_type, -1);
		inner->ip_protocol = 0;
	}

	if (inner->svlan_tag || inner->cvlan_tag) {
		DR_STE_SET(def25_v1, tag, inner_first_vlan_type, -1);
		inner->cvlan_tag = 0;
		inner->svlan_tag = 0;
	}

	dr_ste_v1_build_def25_tag(mask, sb, tag);
}

static void dr_ste_v1_build_def25_init(struct dr_ste_build *sb,
				       struct dr_match_param *mask)
{
	sb->lu_type = DR_STE_V1_LU_TYPE_MATCH;
	dr_ste_v1_build_def25_mask(mask, sb);
	sb->ste_build_tag_func = &dr_ste_v1_build_def25_tag;
}
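/*
 * def26 below matches the outer IPv6 source address together with both
 * VLAN headers. Its tcp_flags handling clears only the low byte it
 * consumed (outer->tcp_flags ^= (outer->tcp_flags & 0xff)), presumably
 * so that an ns bit (bit 8), which this definer cannot match, is left
 * set for the caller to reject.
 */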
static int dr_ste_v1_build_def26_tag(struct dr_match_param *value,
				     struct dr_ste_build *sb,
				     uint8_t *tag)
{
	struct dr_match_spec *outer = &value->outer;
	struct dr_match_misc *misc = &value->misc;

	if (outer->ip_version == IP_VERSION_IPV6) {
		DR_STE_SET_TAG(def26_v1, tag, src_ipv6_127_96, outer, src_ip_127_96);
		DR_STE_SET_TAG(def26_v1, tag, src_ipv6_95_64, outer, src_ip_95_64);
		DR_STE_SET_TAG(def26_v1, tag, src_ipv6_63_32, outer, src_ip_63_32);
		DR_STE_SET_TAG(def26_v1, tag, src_ipv6_31_0, outer, src_ip_31_0);
	}

	DR_STE_SET_TAG(def26_v1, tag, ip_frag, outer, frag);

	if (outer->ip_version == IP_VERSION_IPV6) {
		DR_STE_SET(def26_v1, tag, l3_type, STE_IPV6);
		outer->ip_version = 0;
	}

	if (outer->cvlan_tag) {
		DR_STE_SET(def26_v1, tag, first_vlan_type, DR_STE_CVLAN);
		outer->cvlan_tag = 0;
	} else if (outer->svlan_tag) {
		DR_STE_SET(def26_v1, tag, first_vlan_type, DR_STE_SVLAN);
		outer->svlan_tag = 0;
	}

	DR_STE_SET_TAG(def26_v1, tag, first_vlan_id, outer, first_vid);
	DR_STE_SET_TAG(def26_v1, tag, first_cfi, outer, first_cfi);
	DR_STE_SET_TAG(def26_v1, tag, first_priority, outer, first_prio);
	DR_STE_SET_TAG(def26_v1, tag, l3_ok, outer, l3_ok);
	DR_STE_SET_TAG(def26_v1, tag, l4_ok, outer, l4_ok);

	if (misc->outer_second_cvlan_tag) {
		DR_STE_SET(def26_v1, tag, second_vlan_type, DR_STE_CVLAN);
		misc->outer_second_cvlan_tag = 0;
	} else if (misc->outer_second_svlan_tag) {
		DR_STE_SET(def26_v1, tag, second_vlan_type, DR_STE_SVLAN);
		misc->outer_second_svlan_tag = 0;
	}

	DR_STE_SET_TAG(def26_v1, tag, second_vlan_id, misc, outer_second_vid);
	DR_STE_SET_TAG(def26_v1, tag, second_cfi, misc, outer_second_cfi);
	DR_STE_SET_TAG(def26_v1, tag, second_priority, misc, outer_second_prio);
	DR_STE_SET_TAG(def26_v1, tag, smac_47_16, outer, smac_47_16);
	DR_STE_SET_TAG(def26_v1, tag, smac_15_0, outer, smac_15_0);
	DR_STE_SET_TAG(def26_v1, tag, ip_porotcol, outer, ip_protocol);

	if (outer->tcp_flags) {
		DR_STE_SET_BOOL(def26_v1, tag, tcp_cwr, outer->tcp_flags & (1 << 7));
		DR_STE_SET_BOOL(def26_v1, tag, tcp_ece, outer->tcp_flags & (1 << 6));
		DR_STE_SET_BOOL(def26_v1, tag, tcp_urg, outer->tcp_flags & (1 << 5));
		DR_STE_SET_BOOL(def26_v1, tag, tcp_ack, outer->tcp_flags & (1 << 4));
		DR_STE_SET_BOOL(def26_v1, tag, tcp_psh, outer->tcp_flags & (1 << 3));
		DR_STE_SET_BOOL(def26_v1, tag, tcp_rst, outer->tcp_flags & (1 << 2));
		DR_STE_SET_BOOL(def26_v1, tag, tcp_syn, outer->tcp_flags & (1 << 1));
		DR_STE_SET_BOOL(def26_v1, tag, tcp_fin, outer->tcp_flags & (1 << 0));
		outer->tcp_flags ^= (outer->tcp_flags & 0xff);
	}

	return 0;
}

static void dr_ste_v1_build_def26_mask(struct dr_match_param *mask,
				       struct dr_ste_build *sb)
{
	struct dr_match_spec *outer = &mask->outer;
	struct dr_match_misc *misc = &mask->misc;
	uint8_t *tag = sb->match;

	if (outer->svlan_tag || outer->cvlan_tag) {
		DR_STE_SET(def26_v1, tag, first_vlan_type, -1);
		outer->cvlan_tag = 0;
		outer->svlan_tag = 0;
	}

	if (misc->outer_second_svlan_tag || misc->outer_second_cvlan_tag) {
		DR_STE_SET(def26_v1, tag, second_vlan_type, -1);
		misc->outer_second_svlan_tag = 0;
		misc->outer_second_cvlan_tag = 0;
	}

	dr_ste_v1_build_def26_tag(mask, sb, tag);
}

static void dr_ste_v1_build_def26_init(struct dr_ste_build *sb,
				       struct dr_match_param *mask)
{
	sb->lu_type = DR_STE_V1_LU_TYPE_MATCH;
	dr_ste_v1_build_def26_mask(mask, sb);
	sb->ste_build_tag_func = &dr_ste_v1_build_def26_tag;
}
static int dr_ste_v1_build_def28_tag(struct dr_match_param *value,
				     struct dr_ste_build *sb,
				     uint8_t *tag)
{
	struct dr_match_misc3 *misc3 = &value->misc3;
	struct dr_match_spec *outer = &value->outer;
	struct dr_match_spec *inner = &value->inner;

	DR_STE_SET_TAG(def28_v1, tag, flex_gtpu_teid, misc3, gtpu_teid);

	if (outer->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET_TAG(def28_v1, tag, outer_ip_src_addr, outer, src_ip_31_0);
		DR_STE_SET_TAG(def28_v1, tag, outer_ip_dst_addr, outer, dst_ip_31_0);
	}

	if (inner->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET_TAG(def28_v1, tag, inner_ip_src_addr, inner, src_ip_31_0);
		DR_STE_SET_TAG(def28_v1, tag, inner_ip_dst_addr, inner, dst_ip_31_0);
	}

	if (outer->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET(def28_v1, tag, outer_l3_type, STE_IPV4);
		outer->ip_version = 0;
	} else if (outer->ip_version == IP_VERSION_IPV6) {
		DR_STE_SET(def28_v1, tag, outer_l3_type, STE_IPV6);
		outer->ip_version = 0;
	}

	DR_STE_SET_TAG(def28_v1, tag, outer_l4_sport, outer, tcp_sport);
	DR_STE_SET_TAG(def28_v1, tag, outer_l4_sport, outer, udp_sport);
	DR_STE_SET_TAG(def28_v1, tag, outer_l4_dport, outer, tcp_dport);
	DR_STE_SET_TAG(def28_v1, tag, outer_l4_dport, outer, udp_dport);
	DR_STE_SET_TAG(def28_v1, tag, inner_l4_sport, inner, tcp_sport);
	DR_STE_SET_TAG(def28_v1, tag, inner_l4_sport, inner, udp_sport);
	DR_STE_SET_TAG(def28_v1, tag, inner_l4_dport, inner, tcp_dport);
	DR_STE_SET_TAG(def28_v1, tag, inner_l4_dport, inner, udp_dport);
	DR_STE_SET_TAG(def28_v1, tag, outer_ip_protocol, outer, ip_protocol);
	DR_STE_SET_TAG(def28_v1, tag, outer_ip_frag, outer, frag);

	if (inner->ip_version == IP_VERSION_IPV4) {
		DR_STE_SET(def28_v1, tag, inner_l3_type, STE_IPV4);
		inner->ip_version = 0;
	} else if (inner->ip_version == IP_VERSION_IPV6) {
		DR_STE_SET(def28_v1, tag, inner_l3_type, STE_IPV6);
		inner->ip_version = 0;
	}

	if (outer->cvlan_tag) {
		DR_STE_SET(def28_v1, tag, outer_first_vlan_type, DR_STE_CVLAN);
		outer->cvlan_tag = 0;
	} else if (outer->svlan_tag) {
		DR_STE_SET(def28_v1, tag, outer_first_vlan_type, DR_STE_SVLAN);
		outer->svlan_tag = 0;
	}

	if (inner->cvlan_tag) {
		DR_STE_SET(def28_v1, tag, inner_first_vlan_type, DR_STE_CVLAN);
		inner->cvlan_tag = 0;
	} else if (inner->svlan_tag) {
		DR_STE_SET(def28_v1, tag, inner_first_vlan_type, DR_STE_SVLAN);
		inner->svlan_tag = 0;
	}

	DR_STE_SET_TAG(def28_v1, tag, inner_ip_protocol, inner, ip_protocol);
	DR_STE_SET_TAG(def28_v1, tag, inner_ip_frag, inner, frag);

	return 0;
}

static void dr_ste_v1_build_def28_mask(struct dr_match_param *value,
				       struct dr_ste_build *sb)
{
	struct dr_match_spec *outer = &value->outer;
	struct dr_match_spec *inner = &value->inner;
	uint8_t *tag = sb->match;

	if (outer->svlan_tag || outer->cvlan_tag) {
		DR_STE_SET(def28_v1, tag, outer_first_vlan_type, -1);
		outer->cvlan_tag = 0;
		outer->svlan_tag = 0;
	}

	if (inner->svlan_tag || inner->cvlan_tag) {
		DR_STE_SET(def28_v1, tag, inner_first_vlan_type, -1);
		inner->cvlan_tag = 0;
		inner->svlan_tag = 0;
	}

	dr_ste_v1_build_def28_tag(value, sb, tag);
}

static void dr_ste_v1_build_def28_init(struct dr_ste_build *sb,
				       struct dr_match_param *mask)
{
	sb->lu_type = DR_STE_V1_LU_TYPE_MATCH;
	dr_ste_v1_build_def28_mask(mask, sb);
	sb->ste_build_tag_func = &dr_ste_v1_build_def28_tag;
}
outer_l3_type, STE_IPV6); outer->ip_version = 0; } if (outer->cvlan_tag) { DR_STE_SET(def33_v1, tag, outer_first_vlan_type, DR_STE_CVLAN); outer->cvlan_tag = 0; } else if (outer->svlan_tag) { DR_STE_SET(def33_v1, tag, outer_first_vlan_type, DR_STE_SVLAN); outer->svlan_tag = 0; } DR_STE_SET_TAG(def33_v1, tag, outer_first_vlan_prio, outer, first_prio); DR_STE_SET_TAG(def33_v1, tag, outer_first_vlan_cfi, outer, first_cfi); DR_STE_SET_TAG(def33_v1, tag, outer_first_vlan_vid, outer, first_vid); DR_STE_SET_TAG(def33_v1, tag, outer_ip_version, outer, ip_version); DR_STE_SET_TAG(def33_v1, tag, outer_ip_ihl, outer, ipv4_ihl); DR_STE_SET_TAG(def33_v1, tag, outer_l3_ok, outer, l3_ok); DR_STE_SET_TAG(def33_v1, tag, outer_l4_ok, outer, l4_ok); DR_STE_SET_TAG(def33_v1, tag, inner_l3_ok, inner, l3_ok); DR_STE_SET_TAG(def33_v1, tag, inner_l4_ok, inner, l4_ok); DR_STE_SET_TAG(def33_v1, tag, outer_ipv4_checksum_ok, outer, ipv4_checksum_ok); DR_STE_SET_TAG(def33_v1, tag, outer_l4_checksum_ok, outer, l4_checksum_ok); DR_STE_SET_TAG(def33_v1, tag, inner_ipv4_checksum_ok, inner, ipv4_checksum_ok); DR_STE_SET_TAG(def33_v1, tag, inner_l4_checksum_ok, inner, l4_checksum_ok); DR_STE_SET_TAG(def33_v1, tag, outer_ip_ttl, outer, ip_ttl_hoplimit); DR_STE_SET_TAG(def33_v1, tag, outer_ip_protocol, outer, ip_protocol); return 0; } static void dr_ste_v1_build_def33_mask(struct dr_match_param *value, struct dr_ste_build *sb) { struct dr_match_spec *outer = &value->outer; uint8_t *tag = sb->match; if (outer->svlan_tag || outer->cvlan_tag) { DR_STE_SET(def33_v1, tag, outer_first_vlan_type, -1); outer->cvlan_tag = 0; outer->svlan_tag = 0; } dr_ste_v1_build_def33_tag(value, sb, tag); } static void dr_ste_v1_build_def33_init(struct dr_ste_build *sb, struct dr_match_param *mask) { sb->lu_type = DR_STE_V1_LU_TYPE_MATCH; dr_ste_v1_build_def33_mask(mask, sb); sb->ste_build_tag_func = &dr_ste_v1_build_def33_tag; } static int dr_ste_v1_aso_other_domain_link(struct mlx5dv_devx_obj *devx_obj, struct mlx5dv_dr_domain *peer_dmn, struct mlx5dv_dr_domain *dmn, uint32_t flags, uint8_t return_reg_c) { uint32_t chunk_size = devx_obj->log_obj_range; struct dr_aso_cross_dmn_arrays *cross_dmn_arrays; struct dr_ste_htbl **action_htbl, **rule_htbl; struct dr_ste_send_info **ste_info_arr; struct dr_ste *action_ste; LIST_HEAD(send_ste_list); struct dr_ste *rule_ste; uint8_t *action_hw_ste; int ret = 0, i, j; bool direction; if (!flags || (flags > MLX5DV_DR_ACTION_FLAGS_ASO_CT_DIRECTION_RESPONDER)) { errno = EINVAL; ret = errno; goto out; } if (flags == MLX5DV_DR_ACTION_FLAGS_ASO_CT_DIRECTION_INITIATOR) direction = MLX5_IFC_ASO_CT_DIRECTION_INITIATOR; else direction = MLX5_IFC_ASO_CT_DIRECTION_RESPONDER; action_hw_ste = calloc(1 << chunk_size, DR_STE_SIZE); if (!action_hw_ste) { errno = ENOMEM; ret = errno; goto out; } ste_info_arr = calloc((1 << chunk_size), sizeof(struct dr_ste_send_info *)); if (!ste_info_arr) { errno = ENOMEM; ret = errno; goto free_action_hw_ste; } action_htbl = calloc(1 << chunk_size, sizeof(struct dr_ste_htbl *)); if (!action_htbl) { errno = ENOMEM; ret = errno; goto free_ste_info_arr; } rule_htbl = calloc(1 << chunk_size, sizeof(struct dr_ste_htbl *)); if (!rule_htbl) { errno = ENOMEM; ret = errno; goto free_action_htbl; } for (i = 0; i < (1 << chunk_size); i++) { action_htbl[i] = dr_ste_htbl_alloc(peer_dmn->ste_icm_pool, DR_CHUNK_SIZE_1, DR_STE_HTBL_TYPE_LEGACY, DR_STE_LU_TYPE_DONT_CARE, 0); if (!action_htbl[i]) { dr_dbg(peer_dmn, "Failed allocating collision table\n"); errno = ENOMEM; ret = errno; goto 
free_till_i_with_ste_info;
		}
		dr_htbl_get(action_htbl[i]);

		rule_htbl[i] = dr_ste_htbl_alloc(dmn->ste_icm_pool,
						 DR_CHUNK_SIZE_1,
						 DR_STE_HTBL_TYPE_MATCH,
						 DR_STE_LU_TYPE_DONT_CARE, 0);
		if (!rule_htbl[i]) {
			dr_dbg(dmn, "Failed allocating collision table\n");
			errno = ENOMEM;
			ret = errno;
			goto free_action_htbl_i;
		}
		dr_htbl_get(rule_htbl[i]);

		action_ste = action_htbl[i]->ste_arr;
		dr_ste_get(action_ste);
		peer_dmn->ste_ctx->ste_init(action_ste->hw_ste,
					    DR_STE_LU_TYPE_DONT_CARE, 0,
					    peer_dmn->info.caps.gvmi);
		peer_dmn->ste_ctx->set_hit_gvmi(action_ste->hw_ste,
						dmn->info.caps.gvmi);
		peer_dmn->ste_ctx->set_aso_ct_cross_dmn(action_ste->hw_ste,
							devx_obj->object_id, i,
							return_reg_c, direction);
		list_add_tail(dr_ste_get_miss_list(action_ste),
			      &action_ste->miss_list_node);

		rule_ste = rule_htbl[i]->ste_arr;
		dr_ste_get(rule_ste);
		dmn->ste_ctx->ste_init(rule_ste->hw_ste,
				       DR_STE_LU_TYPE_DONT_CARE, 0,
				       dmn->info.caps.gvmi);
		list_add_tail(dr_ste_get_miss_list(rule_ste),
			      &rule_ste->miss_list_node);

		dr_ste_set_hit_addr_by_next_htbl(peer_dmn->ste_ctx,
						 action_ste->hw_ste,
						 rule_ste->htbl);
		rule_htbl[i]->pointing_ste = action_ste;
		action_ste->next_htbl = rule_htbl[i];

		ste_info_arr[i] = calloc(1, sizeof(struct dr_ste_send_info));
		if (!ste_info_arr[i]) {
			dr_dbg(peer_dmn, "Failed allocate ste_info\n");
			errno = ENOMEM;
			ret = errno;
			goto free_rule_htbl_i;
		}

		memcpy(&action_hw_ste[i * DR_STE_SIZE], action_ste->hw_ste,
		       DR_STE_SIZE_REDUCED);
		dr_send_fill_and_append_ste_send_info(action_ste, DR_STE_SIZE, 0,
						      &action_hw_ste[i * DR_STE_SIZE],
						      ste_info_arr[i],
						      &send_ste_list, false);
	}

	ret = dr_rule_send_update_list(&send_ste_list, peer_dmn, false, 0);
	if (ret) {
		dr_dbg(peer_dmn, "Failed sending ste!\n");
		goto free_till_i;
	}

	ret = mlx5dv_dr_domain_sync(peer_dmn, MLX5DV_DR_DOMAIN_SYNC_FLAGS_SW);
	if (ret) {
		dr_dbg(peer_dmn, "Failed syncing domain\n");
		goto free_till_i;
	}

	cross_dmn_arrays = (struct dr_aso_cross_dmn_arrays *)
			   malloc(sizeof(struct dr_aso_cross_dmn_arrays));
	/* Bug fix: the malloc() result was previously dereferenced without
	 * a NULL check; fail like the other allocation paths above.
	 */
	if (!cross_dmn_arrays) {
		errno = ENOMEM;
		ret = errno;
		goto free_till_i;
	}

	cross_dmn_arrays->action_htbl = action_htbl;
	cross_dmn_arrays->rule_htbl = rule_htbl;
	devx_obj->priv = cross_dmn_arrays;

	goto free_ste_info_arr;

free_rule_htbl_i:
	dr_htbl_put(rule_htbl[i]);
free_action_htbl_i:
	dr_htbl_put(action_htbl[i]);
free_till_i_with_ste_info:
	for (j = 0; j < i; j++)
		free(ste_info_arr[j]);
free_till_i:
	for (j = 0; j < i; j++) {
		dr_htbl_put(rule_htbl[j]);
		dr_htbl_put(action_htbl[j]);
	}
	free(rule_htbl);
free_action_htbl:
	free(action_htbl);
free_ste_info_arr:
	free(ste_info_arr);
free_action_hw_ste:
	free(action_hw_ste);
out:
	return ret;
}

static int dr_ste_v1_aso_other_domain_unlink(struct mlx5dv_devx_obj *devx_obj)
{
	struct dr_aso_cross_dmn_arrays *cross_dmn_arrays;
	bool ready_to_clear = true;
	int i;

	if (!devx_obj->priv) {
		errno = EINVAL;
		return errno;
	}

	cross_dmn_arrays = (struct dr_aso_cross_dmn_arrays *) devx_obj->priv;

	for (i = 0; i < (1 << devx_obj->log_obj_range); i++) {
		if ((atomic_load(&cross_dmn_arrays->rule_htbl[i]->ste_arr->refcount) > 1) ||
		    (atomic_load(&cross_dmn_arrays->action_htbl[i]->ste_arr->refcount) > 1))
			ready_to_clear = false;
	}

	if (ready_to_clear) {
		for (i = 0; i < (1 << devx_obj->log_obj_range); i++) {
			dr_htbl_put(cross_dmn_arrays->rule_htbl[i]);
			dr_htbl_put(cross_dmn_arrays->action_htbl[i]);
		}

		free(cross_dmn_arrays->rule_htbl);
		free(cross_dmn_arrays->action_htbl);
		free(cross_dmn_arrays);
		devx_obj->priv = NULL;
	} else {
		errno = EBUSY;
		return errno;
	}

	return 0;
}

static int dr_ste_v1_alloc_modify_hdr_ptrn_arg(struct mlx5dv_dr_action *action,
					       uint32_t chunck_size)
{
	struct dr_ptrn_mngr *ptrn_mngr;

	ptrn_mngr =
action->rewrite.dmn->modify_header_ptrn_mngr; if (!ptrn_mngr) return ENOTSUP; action->rewrite.ptrn_arg.arg = dr_arg_get_obj(action->rewrite.dmn->modify_header_arg_mngr, action->rewrite.param.num_of_actions, action->rewrite.param.data); if (!action->rewrite.ptrn_arg.arg) { dr_dbg(action->rewrite.dmn, "Failed allocating args for modify header\n"); return errno; } action->rewrite.ptrn_arg.ptrn = dr_ptrn_cache_get_pattern(ptrn_mngr, (enum dr_ptrn_type)action->action_type, action->rewrite.param.num_of_actions, action->rewrite.param.data); if (!action->rewrite.ptrn_arg.ptrn) { dr_dbg(action->rewrite.dmn, "Failed to get pattern\n"); goto put_arg; } return 0; put_arg: dr_arg_put_obj(action->rewrite.dmn->modify_header_arg_mngr, action->rewrite.ptrn_arg.arg); return errno; } static void dr_ste_v1_dealloc_modify_hdr_ptrn_arg(struct mlx5dv_dr_action *action) { dr_ptrn_cache_put_pattern(action->rewrite.dmn->modify_header_ptrn_mngr, action->rewrite.ptrn_arg.ptrn); dr_arg_put_obj(action->rewrite.dmn->modify_header_arg_mngr, action->rewrite.ptrn_arg.arg); } static struct dr_ste_ctx ste_ctx_v1 = { /* Builders */ .build_eth_l2_src_dst_init = &dr_ste_v1_build_eth_l2_src_dst_init, .build_eth_l3_ipv6_src_init = &dr_ste_v1_build_eth_l3_ipv6_src_init, .build_eth_l3_ipv6_dst_init = &dr_ste_v1_build_eth_l3_ipv6_dst_init, .build_eth_l3_ipv4_5_tuple_init = &dr_ste_v1_build_eth_l3_ipv4_5_tuple_init, .build_eth_l2_src_init = &dr_ste_v1_build_eth_l2_src_init, .build_eth_l2_dst_init = &dr_ste_v1_build_eth_l2_dst_init, .build_eth_l2_tnl_init = &dr_ste_v1_build_eth_l2_tnl_init, .build_eth_l3_ipv4_misc_init = &dr_ste_v1_build_eth_l3_ipv4_misc_init, .build_eth_ipv6_l3_l4_init = &dr_ste_v1_build_eth_ipv6_l3_l4_init, .build_mpls_init = &dr_ste_v1_build_mpls_init, .build_tnl_gre_init = &dr_ste_v1_build_tnl_gre_init, .build_tnl_mpls_over_udp_init = &dr_ste_v1_build_tnl_mpls_over_udp_init, .build_tnl_mpls_over_gre_init = &dr_ste_v1_build_tnl_mpls_over_gre_init, .build_icmp_init = &dr_ste_v1_build_icmp_init, .build_general_purpose_init = &dr_ste_v1_build_general_purpose_init, .build_eth_l4_misc_init = &dr_ste_v1_build_eth_l4_misc_init, .build_tnl_vxlan_gpe_init = &dr_ste_v1_build_flex_parser_tnl_vxlan_gpe_init, .build_tnl_geneve_init = &dr_ste_v1_build_flex_parser_tnl_geneve_init, .build_tnl_geneve_tlv_opt_init = &dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_init, .build_tnl_geneve_tlv_opt_exist_init = &dr_ste_v1_build_flex_parser_tnl_geneve_tlv_opt_exist_init, .build_tnl_gtpu_init = &dr_ste_v1_build_flex_parser_tnl_gtpu_init, .build_tnl_gtpu_flex_parser_0 = &dr_ste_v1_build_tnl_gtpu_flex_parser_0_init, .build_tnl_gtpu_flex_parser_1 = &dr_ste_v1_build_tnl_gtpu_flex_parser_1_init, .build_register_0_init = &dr_ste_v1_build_register_0_init, .build_register_1_init = &dr_ste_v1_build_register_1_init, .build_src_gvmi_qpn_init = &dr_ste_v1_build_src_gvmi_qpn_init, .build_flex_parser_0_init = &dr_ste_v1_build_flex_parser_0_init, .build_flex_parser_1_init = &dr_ste_v1_build_flex_parser_1_init, .build_tunnel_header_init = &dr_ste_v1_build_tunnel_header_init, .build_ib_l4_init = &dr_ste_v1_build_ib_l4_init, .build_def0_init = &dr_ste_v1_build_def0_init, .build_def2_init = &dr_ste_v1_build_def2_init, .build_def6_init = &dr_ste_v1_build_def6_init, .build_def16_init = &dr_ste_v1_build_def16_init, .build_def22_init = &dr_ste_v1_build_def22_init, .build_def24_init = &dr_ste_v1_build_def24_init, .build_def25_init = &dr_ste_v1_build_def25_init, .build_def26_init = &dr_ste_v1_build_def26_init, .build_def28_init = &dr_ste_v1_build_def28_init, 
.build_def33_init = &dr_ste_v1_build_def33_init, .aso_other_domain_link = &dr_ste_v1_aso_other_domain_link, .aso_other_domain_unlink = &dr_ste_v1_aso_other_domain_unlink, /* Getters and Setters */ .ste_init = &dr_ste_v1_init, .set_next_lu_type = &dr_ste_v1_set_next_lu_type, .get_next_lu_type = &dr_ste_v1_get_next_lu_type, .set_miss_addr = &dr_ste_v1_set_miss_addr, .get_miss_addr = &dr_ste_v1_get_miss_addr, .set_hit_addr = &dr_ste_v1_set_hit_addr, .set_byte_mask = &dr_ste_v1_set_byte_mask, .get_byte_mask = &dr_ste_v1_get_byte_mask, .set_ctrl_always_hit_htbl = &dr_ste_v1_set_ctrl_always_hit_htbl, .set_ctrl_always_miss = &dr_ste_v1_set_ctrl_always_miss, .set_hit_gvmi = &dr_ste_v1_set_hit_gvmi, /* Actions */ .actions_caps = DR_STE_CTX_ACTION_CAP_TX_POP | DR_STE_CTX_ACTION_CAP_RX_PUSH | DR_STE_CTX_ACTION_CAP_RX_ENCAP | DR_STE_CTX_ACTION_CAP_POP_MDFY | DR_STE_CTX_ACTION_CAP_MODIFY_HDR_INLINE, .action_modify_field_arr = dr_ste_v1_action_modify_field_arr, .action_modify_field_arr_size = ARRAY_SIZE(dr_ste_v1_action_modify_field_arr), .set_actions_rx = &dr_ste_v1_set_actions_rx, .set_actions_tx = &dr_ste_v1_set_actions_tx, .set_action_set = &dr_ste_v1_set_action_set, .set_action_add = &dr_ste_v1_set_action_add, .set_action_copy = &dr_ste_v1_set_action_copy, .get_action_hw_field = &dr_ste_v1_get_action_hw_field, .set_action_decap_l3_list = &dr_ste_v1_set_action_decap_l3_list, .set_aso_ct_cross_dmn = &dr_ste_v1_set_aso_ct_cross_dmn, .alloc_modify_hdr_chunk = &dr_ste_v1_alloc_modify_hdr_ptrn_arg, .dealloc_modify_hdr_chunk = &dr_ste_v1_dealloc_modify_hdr_ptrn_arg, /* Actions bit set */ .set_encap = &dr_ste_v1_set_encap, .set_push_vlan = &dr_ste_v1_set_push_vlan, .set_pop_vlan = &dr_ste_v1_set_pop_vlan, .set_rx_decap = &dr_ste_v1_set_rx_decap, .set_encap_l3 = &dr_ste_v1_set_encap_l3, /* Send */ .prepare_for_postsend = &dr_ste_v1_prepare_for_postsend, }; struct dr_ste_ctx *dr_ste_get_ctx_v1(void) { return &ste_ctx_v1; } rdma-core-56.1/providers/mlx5/dr_ste_v1.h000066400000000000000000000156341477342711600203040ustar00rootroot00000000000000/* * Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #ifndef _DR_STE_V1_ #define _DR_STE_V1_ #include "dr_ste.h" #define DR_STE_DECAP_L3_ACTION_NUM 8 #define DR_STE_L2_HDR_MAX_SZ 20 #define DR_STE_CALC_DFNR_TYPE(lookup_type, inner) \ ((inner) ? DR_STE_V1_LU_TYPE_##lookup_type##_I : \ DR_STE_V1_LU_TYPE_##lookup_type##_O) enum dr_ste_v1_entry_format { DR_STE_V1_TYPE_BWC_BYTE = 0x0, DR_STE_V1_TYPE_BWC_DW = 0x1, DR_STE_V1_TYPE_MATCH_AND_MASK_BYTE = 0x2, DR_STE_V1_TYPE_MATCH_AND_MASK_DW = 0x3, DR_STE_V1_TYPE_MATCH = 0x4, }; /* * Lookup type is built from 2B: [ Definer mode 1B ][ Definer index 1B ] */ enum dr_ste_v1_lu_type { DR_STE_V1_LU_TYPE_NOP = 0x0000, DR_STE_V1_LU_TYPE_ETHL2_TNL = 0x0002, DR_STE_V1_LU_TYPE_IBL3_EXT = 0x0102, DR_STE_V1_LU_TYPE_ETHL2_O = 0x0003, DR_STE_V1_LU_TYPE_IBL4 = 0x0103, DR_STE_V1_LU_TYPE_ETHL2_I = 0x0004, DR_STE_V1_LU_TYPE_SRC_QP_GVMI = 0x0104, DR_STE_V1_LU_TYPE_ETHL2_SRC_O = 0x0005, DR_STE_V1_LU_TYPE_ETHL2_HEADERS_O = 0x0105, DR_STE_V1_LU_TYPE_ETHL2_SRC_I = 0x0006, DR_STE_V1_LU_TYPE_ETHL2_HEADERS_I = 0x0106, DR_STE_V1_LU_TYPE_ETHL3_IPV4_5_TUPLE_O = 0x0007, DR_STE_V1_LU_TYPE_IPV6_DES_O = 0x0107, DR_STE_V1_LU_TYPE_ETHL3_IPV4_5_TUPLE_I = 0x0008, DR_STE_V1_LU_TYPE_IPV6_DES_I = 0x0108, DR_STE_V1_LU_TYPE_ETHL4_O = 0x0009, DR_STE_V1_LU_TYPE_IPV6_SRC_O = 0x0109, DR_STE_V1_LU_TYPE_ETHL4_I = 0x000a, DR_STE_V1_LU_TYPE_IPV6_SRC_I = 0x010a, DR_STE_V1_LU_TYPE_ETHL2_SRC_DST_O = 0x000b, DR_STE_V1_LU_TYPE_MPLS_O = 0x010b, DR_STE_V1_LU_TYPE_ETHL2_SRC_DST_I = 0x000c, DR_STE_V1_LU_TYPE_MPLS_I = 0x010c, DR_STE_V1_LU_TYPE_ETHL3_IPV4_MISC_O = 0x000d, DR_STE_V1_LU_TYPE_GRE = 0x010d, DR_STE_V1_LU_TYPE_FLEX_PARSER_TNL_HEADER = 0x000e, DR_STE_V1_LU_TYPE_GENERAL_PURPOSE = 0x010e, DR_STE_V1_LU_TYPE_ETHL3_IPV4_MISC_I = 0x000f, DR_STE_V1_LU_TYPE_STEERING_REGISTERS_0 = 0x010f, DR_STE_V1_LU_TYPE_STEERING_REGISTERS_1 = 0x0110, DR_STE_V1_LU_TYPE_FLEX_PARSER_OK = 0x0011, DR_STE_V1_LU_TYPE_FLEX_PARSER_0 = 0x0111, DR_STE_V1_LU_TYPE_FLEX_PARSER_1 = 0x0112, DR_STE_V1_LU_TYPE_ETHL4_MISC_O = 0x0113, DR_STE_V1_LU_TYPE_ETHL4_MISC_I = 0x0114, DR_STE_V1_LU_TYPE_TNL_HEADER = 0x0117, DR_STE_V1_LU_TYPE_MATCH = 0x0400, DR_STE_V1_LU_TYPE_INVALID = 0x00ff, DR_STE_V1_LU_TYPE_DONT_CARE = DR_STE_LU_TYPE_DONT_CARE, }; enum dr_ste_v1_header_anchors { DR_STE_HEADER_ANCHOR_START_OUTER = 0x00, DR_STE_HEADER_ANCHOR_1ST_VLAN = 0x02, DR_STE_HEADER_ANCHOR_IPV6_IPV4 = 0x07, DR_STE_HEADER_ANCHOR_INNER_MAC = 0x13, DR_STE_HEADER_ANCHOR_INNER_IPV6_IPV4 = 0x19, }; enum dr_ste_v1_action_size { DR_STE_ACTION_SINGLE_SZ = 4, DR_STE_ACTION_DOUBLE_SZ = 8, DR_STE_ACTION_TRIPLE_SZ = 12, }; enum dr_ste_v1_action_insert_ptr_attr { DR_STE_V1_ACTION_INSERT_PTR_ATTR_NONE = 0, /* Regular push header (e.g. 
push vlan) */ DR_STE_V1_ACTION_INSERT_PTR_ATTR_ENCAP = 1, /* Encapsulation / Tunneling */ DR_STE_V1_ACTION_INSERT_PTR_ATTR_ESP = 2, /* IPsec */ }; enum dr_ste_v1_action_id { DR_STE_V1_ACTION_ID_NOP = 0x00, DR_STE_V1_ACTION_ID_COPY = 0x05, DR_STE_V1_ACTION_ID_SET = 0x06, DR_STE_V1_ACTION_ID_ADD = 0x07, DR_STE_V1_ACTION_ID_REMOVE_BY_SIZE = 0x08, DR_STE_V1_ACTION_ID_REMOVE_HEADER_TO_HEADER = 0x09, DR_STE_V1_ACTION_ID_INSERT_INLINE = 0x0a, DR_STE_V1_ACTION_ID_INSERT_POINTER = 0x0b, DR_STE_V1_ACTION_ID_FLOW_TAG = 0x0c, DR_STE_V1_ACTION_ID_QUEUE_ID_SEL = 0x0d, DR_STE_V1_ACTION_ID_ACCELERATED_LIST = 0x0e, DR_STE_V1_ACTION_ID_MODIFY_LIST = 0x0f, DR_STE_V1_ACTION_ID_ASO = 0x12, DR_STE_V1_ACTION_ID_TRAILER = 0x13, DR_STE_V1_ACTION_ID_COUNTER_ID = 0x14, DR_STE_V1_ACTION_ID_MAX = 0x21, /* use for special cases */ DR_STE_V1_ACTION_ID_SPECIAL_ENCAP_L3 = 0x22, }; enum { DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_0 = 0x00, DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_1 = 0x01, DR_STE_V1_ACTION_MDFY_FLD_L2_OUT_2 = 0x02, DR_STE_V1_ACTION_MDFY_FLD_SRC_L2_OUT_0 = 0x08, DR_STE_V1_ACTION_MDFY_FLD_SRC_L2_OUT_1 = 0x09, DR_STE_V1_ACTION_MDFY_FLD_L3_OUT_0 = 0x0e, DR_STE_V1_ACTION_MDFY_FLD_L4_OUT_0 = 0x18, DR_STE_V1_ACTION_MDFY_FLD_L4_OUT_1 = 0x19, DR_STE_V1_ACTION_MDFY_FLD_IPV4_OUT_0 = 0x40, DR_STE_V1_ACTION_MDFY_FLD_IPV4_OUT_1 = 0x41, DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_0 = 0x44, DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_1 = 0x45, DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_2 = 0x46, DR_STE_V1_ACTION_MDFY_FLD_IPV6_DST_OUT_3 = 0x47, DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_0 = 0x4c, DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_1 = 0x4d, DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_2 = 0x4e, DR_STE_V1_ACTION_MDFY_FLD_IPV6_SRC_OUT_3 = 0x4f, DR_STE_V1_ACTION_MDFY_FLD_TCP_MISC_0 = 0x5e, DR_STE_V1_ACTION_MDFY_FLD_TCP_MISC_1 = 0x5f, DR_STE_V1_ACTION_MDFY_FLD_METADATA_2_CQE = 0x7b, DR_STE_V1_ACTION_MDFY_FLD_GNRL_PURPOSE = 0x7c, DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_7 = 0x82, DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_6 = 0x83, DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_5 = 0x84, DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_4 = 0x85, DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_3 = 0x86, DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_2 = 0x87, DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_1 = 0x88, DR_STE_V1_ACTION_MDFY_FLD_FLEX_PARSER_0 = 0x89, DR_STE_V1_ACTION_MDFY_FLD_REGISTER_2_0 = 0x8c, DR_STE_V1_ACTION_MDFY_FLD_REGISTER_2_1 = 0x8d, DR_STE_V1_ACTION_MDFY_FLD_REGISTER_1_0 = 0x8e, DR_STE_V1_ACTION_MDFY_FLD_REGISTER_1_1 = 0x8f, DR_STE_V1_ACTION_MDFY_FLD_REGISTER_0_0 = 0x90, DR_STE_V1_ACTION_MDFY_FLD_REGISTER_0_1 = 0x91, }; enum dr_ste_v1_aso_ctx_type { DR_STE_V1_ASO_CTX_TYPE_CT = 0x1, DR_STE_V1_ASO_CTX_TYPE_POLICERS = 0x2, DR_STE_V1_ASO_CTX_TYPE_FIRST_HIT = 0x4, }; void dr_ste_v1_set_reparse(uint8_t *hw_ste_p); #endif rdma-core-56.1/providers/mlx5/dr_ste_v2.c000066400000000000000000000216761477342711600203030ustar00rootroot00000000000000/* * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include "dr_ste.h" enum { DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_0 = 0x00, DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_1 = 0x01, DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_2 = 0x02, DR_STE_V2_ACTION_MDFY_FLD_SRC_L2_OUT_0 = 0x08, DR_STE_V2_ACTION_MDFY_FLD_SRC_L2_OUT_1 = 0x09, DR_STE_V2_ACTION_MDFY_FLD_L3_OUT_0 = 0x0e, DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0 = 0x18, DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_1 = 0x19, DR_STE_V2_ACTION_MDFY_FLD_IPV4_OUT_0 = 0x40, DR_STE_V2_ACTION_MDFY_FLD_IPV4_OUT_1 = 0x41, DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_0 = 0x44, DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_1 = 0x45, DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_2 = 0x46, DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_3 = 0x47, DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_0 = 0x4c, DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_1 = 0x4d, DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_2 = 0x4e, DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_3 = 0x4f, DR_STE_V2_ACTION_MDFY_FLD_TCP_MISC_0 = 0x5e, DR_STE_V2_ACTION_MDFY_FLD_TCP_MISC_1 = 0x5f, DR_STE_V2_ACTION_MDFY_FLD_METADATA_2_CQE = 0x7b, DR_STE_V2_ACTION_MDFY_FLD_GNRL_PURPOSE = 0x7c, DR_STE_V2_ACTION_MDFY_FLD_FLEX_PARSER_7 = 0x82, DR_STE_V2_ACTION_MDFY_FLD_FLEX_PARSER_6 = 0x83, DR_STE_V2_ACTION_MDFY_FLD_FLEX_PARSER_5 = 0x84, DR_STE_V2_ACTION_MDFY_FLD_FLEX_PARSER_4 = 0x85, DR_STE_V2_ACTION_MDFY_FLD_FLEX_PARSER_3 = 0x86, DR_STE_V2_ACTION_MDFY_FLD_FLEX_PARSER_2 = 0x87, DR_STE_V2_ACTION_MDFY_FLD_FLEX_PARSER_1 = 0x88, DR_STE_V2_ACTION_MDFY_FLD_FLEX_PARSER_0 = 0x89, DR_STE_V2_ACTION_MDFY_FLD_REGISTER_2_0 = 0x90, DR_STE_V2_ACTION_MDFY_FLD_REGISTER_2_1 = 0x91, DR_STE_V2_ACTION_MDFY_FLD_REGISTER_1_0 = 0x92, DR_STE_V2_ACTION_MDFY_FLD_REGISTER_1_1 = 0x93, DR_STE_V2_ACTION_MDFY_FLD_REGISTER_0_0 = 0x94, DR_STE_V2_ACTION_MDFY_FLD_REGISTER_0_1 = 0x95, }; static const struct dr_ste_action_modify_field dr_ste_v2_action_modify_field_arr[] = { [MLX5_ACTION_IN_FIELD_OUT_SMAC_47_16] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_SRC_L2_OUT_0, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_SMAC_15_0] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_SRC_L2_OUT_1, .start = 16, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_ETHERTYPE] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_1, .start = 0, .end = 15, }, [MLX5_ACTION_IN_FIELD_OUT_DMAC_47_16] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_0, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_DMAC_15_0] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_1, .start = 16, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_IP_DSCP] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L3_OUT_0, .start = 18, .end = 23, }, [MLX5_ACTION_IN_FIELD_OUT_IP_ECN] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L3_OUT_0, .start = 16, .end = 17, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_FLAGS] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_1, .start = 16, .end = 24, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_SPORT] = { .hw_field = 
DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0, .start = 16, .end = 31, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_DPORT] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0, .start = 0, .end = 15, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_TCP, }, [MLX5_ACTION_IN_FIELD_OUT_IP_TTL] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L3_OUT_0, .start = 8, .end = 15, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, }, [MLX5_ACTION_IN_FIELD_OUT_IPV6_HOPLIMIT] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L3_OUT_0, .start = 8, .end = 15, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_UDP_SPORT] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0, .start = 16, .end = 31, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_UDP, }, [MLX5_ACTION_IN_FIELD_OUT_UDP_DPORT] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L4_OUT_0, .start = 0, .end = 15, .l4_type = DR_STE_ACTION_MDFY_TYPE_L4_UDP, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV6_127_96] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_0, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV6_95_64] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_1, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV6_63_32] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_2, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV6_31_0] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_SRC_OUT_3, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV6_127_96] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_0, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV6_95_64] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_1, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV6_63_32] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_2, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV6_31_0] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV6_DST_OUT_3, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV6, }, [MLX5_ACTION_IN_FIELD_OUT_SIPV4] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV4_OUT_0, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, }, [MLX5_ACTION_IN_FIELD_OUT_DIPV4] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_IPV4_OUT_1, .start = 0, .end = 31, .l3_type = DR_STE_ACTION_MDFY_TYPE_L3_IPV4, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGA] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_GNRL_PURPOSE, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGB] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_METADATA_2_CQE, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_0] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_0_0, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_1] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_0_1, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_2] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_1_0, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_3] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_1_1, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_4] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_2_0, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_5] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_REGISTER_2_1, .start = 0, .end = 31, }, 
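	/* Note: the metadata register entries above are the main difference
	 * from the v1 table -- STE v2 maps REGISTER_{2,1,0} to hw field ids
	 * 0x90..0x95 (see the enum above), while v1 uses 0x8c..0x91.
	 */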
[MLX5_ACTION_IN_FIELD_OUT_TCP_SEQ_NUM] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_TCP_MISC_0, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_TCP_ACK_NUM] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_TCP_MISC_1, .start = 0, .end = 31, }, [MLX5_ACTION_IN_FIELD_OUT_FIRST_VID] = { .hw_field = DR_STE_V2_ACTION_MDFY_FLD_L2_OUT_2, .start = 0, .end = 15, }, [MLX5_ACTION_IN_FIELD_OUT_GTPU_TEID] = { .flags = DR_STE_ACTION_MODIFY_FLAG_REQ_FLEX, .start = 0, .end = 31, }, }; static struct dr_ste_ctx ste_ctx_v2; static pthread_mutex_t ctx_mutex = PTHREAD_MUTEX_INITIALIZER; struct dr_ste_ctx *dr_ste_get_ctx_v2(void) { pthread_mutex_lock(&ctx_mutex); if (!ste_ctx_v2.actions_caps) { ste_ctx_v2 = *dr_ste_get_ctx_v1(); ste_ctx_v2.actions_caps = DR_STE_CTX_ACTION_CAP_TX_POP | DR_STE_CTX_ACTION_CAP_RX_PUSH | DR_STE_CTX_ACTION_CAP_RX_ENCAP | DR_STE_CTX_ACTION_CAP_MODIFY_HDR_INLINE; ste_ctx_v2.action_modify_field_arr = dr_ste_v2_action_modify_field_arr; ste_ctx_v2.action_modify_field_arr_size = ARRAY_SIZE(dr_ste_v2_action_modify_field_arr); } pthread_mutex_unlock(&ctx_mutex); return &ste_ctx_v2; } rdma-core-56.1/providers/mlx5/dr_ste_v3.c000066400000000000000000000166411477342711600203000ustar00rootroot00000000000000/* * Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include "dr_ste_v1.h" static void dr_ste_v3_set_encap(uint8_t *hw_ste_p, uint8_t *d_action, uint32_t reformat_id, int size) { DR_STE_SET(double_action_insert_with_ptr_v3, d_action, action_id, DR_STE_V1_ACTION_ID_INSERT_POINTER); /* The hardware expects here size in words (2 bytes) */ DR_STE_SET(double_action_insert_with_ptr_v3, d_action, size, size / 2); DR_STE_SET(double_action_insert_with_ptr_v3, d_action, pointer, reformat_id); DR_STE_SET(double_action_insert_with_ptr_v3, d_action, attributes, DR_STE_V1_ACTION_INSERT_PTR_ATTR_ENCAP); dr_ste_v1_set_reparse(hw_ste_p); } static void dr_ste_v3_set_push_vlan(uint8_t *ste, uint8_t *d_action, uint32_t vlan_hdr) { DR_STE_SET(double_action_insert_with_inline_v3, d_action, action_id, DR_STE_V1_ACTION_ID_INSERT_INLINE); /* The hardware expects here offset to vlan header in words (2 byte) */ DR_STE_SET(double_action_insert_with_inline_v3, d_action, start_offset, HDR_LEN_L2_MACS >> 1); DR_STE_SET(double_action_insert_with_inline_v3, d_action, inline_data, vlan_hdr); dr_ste_v1_set_reparse(ste); } static void dr_ste_v3_set_pop_vlan(uint8_t *hw_ste_p, uint8_t *s_action, uint8_t vlans_num) { DR_STE_SET(single_action_remove_header_size_v3, s_action, action_id, DR_STE_V1_ACTION_ID_REMOVE_BY_SIZE); DR_STE_SET(single_action_remove_header_size_v3, s_action, start_anchor, DR_STE_HEADER_ANCHOR_1ST_VLAN); /* The hardware expects here size in words (2 byte) */ DR_STE_SET(single_action_remove_header_size_v3, s_action, remove_size, (HDR_LEN_L2_VLAN >> 1) * vlans_num); dr_ste_v1_set_reparse(hw_ste_p); } static void dr_ste_v3_set_encap_l3(uint8_t *hw_ste_p, uint8_t *frst_s_action, uint8_t *scnd_d_action, uint32_t reformat_id, int size) { /* Remove L2 headers */ DR_STE_SET(single_action_remove_header_v3, frst_s_action, action_id, DR_STE_V1_ACTION_ID_REMOVE_HEADER_TO_HEADER); DR_STE_SET(single_action_remove_header_v3, frst_s_action, end_anchor, DR_STE_HEADER_ANCHOR_IPV6_IPV4); /* Encapsulate with given reformat ID */ DR_STE_SET(double_action_insert_with_ptr_v3, scnd_d_action, action_id, DR_STE_V1_ACTION_ID_INSERT_POINTER); /* The hardware expects here size in words (2 bytes) */ DR_STE_SET(double_action_insert_with_ptr_v3, scnd_d_action, size, size / 2); DR_STE_SET(double_action_insert_with_ptr_v3, scnd_d_action, pointer, reformat_id); DR_STE_SET(double_action_insert_with_ptr_v3, scnd_d_action, attributes, DR_STE_V1_ACTION_INSERT_PTR_ATTR_ENCAP); dr_ste_v1_set_reparse(hw_ste_p); } static void dr_ste_v3_set_rx_decap(uint8_t *hw_ste_p, uint8_t *s_action) { DR_STE_SET(single_action_remove_header_v3, s_action, action_id, DR_STE_V1_ACTION_ID_REMOVE_HEADER_TO_HEADER); DR_STE_SET(single_action_remove_header_v3, s_action, decap, 1); DR_STE_SET(single_action_remove_header_v3, s_action, vni_to_cqe, 1); DR_STE_SET(single_action_remove_header_v3, s_action, end_anchor, DR_STE_HEADER_ANCHOR_INNER_MAC); dr_ste_v1_set_reparse(hw_ste_p); } static int dr_ste_v3_set_action_decap_l3_list(void *data, uint32_t data_sz, uint8_t *hw_action, uint32_t hw_action_sz, uint16_t *used_hw_action_num) { uint8_t padded_data[DR_STE_L2_HDR_MAX_SZ] = {}; void *data_ptr = padded_data; uint16_t used_actions = 0; uint32_t inline_data_sz; uint32_t i; if (hw_action_sz / DR_STE_ACTION_DOUBLE_SZ < DR_STE_DECAP_L3_ACTION_NUM) { errno = EINVAL; return errno; } inline_data_sz = DEVX_FLD_SZ_BYTES(ste_double_action_insert_with_inline_v3, inline_data); /* Add an alignment padding */ memcpy(padded_data + data_sz % inline_data_sz, data, data_sz); /* Remove L2L3 outer headers */ 
DR_STE_SET(single_action_remove_header_v3, hw_action, action_id, DR_STE_V1_ACTION_ID_REMOVE_HEADER_TO_HEADER); DR_STE_SET(single_action_remove_header_v3, hw_action, decap, 1); DR_STE_SET(single_action_remove_header_v3, hw_action, vni_to_cqe, 1); DR_STE_SET(single_action_remove_header_v3, hw_action, end_anchor, DR_STE_HEADER_ANCHOR_INNER_IPV6_IPV4); hw_action += DR_STE_ACTION_DOUBLE_SZ; used_actions++; /* Point to the last dword of the header */ data_ptr += (data_sz / inline_data_sz) * inline_data_sz; /* Add the new header using inline action 4Byte at a time, the header * is added in reversed order to the beginning of the packet to avoid * incorrect parsing by the HW. Since header is 14B or 18B an extra * two bytes are padded and later removed. */ for (i = 0; i < data_sz / inline_data_sz + 1; i++) { void *addr_inline; DR_STE_SET(double_action_insert_with_inline_v3, hw_action, action_id, DR_STE_V1_ACTION_ID_INSERT_INLINE); /* The hardware expects here offset to words (2 bytes) */ DR_STE_SET(double_action_insert_with_inline_v3, hw_action, start_offset, 0); /* Copy byte in order to skip endianness problem */ addr_inline = DEVX_ADDR_OF(ste_double_action_insert_with_inline_v3, hw_action, inline_data); memcpy(addr_inline, data_ptr - inline_data_sz * i, inline_data_sz); hw_action += DR_STE_ACTION_DOUBLE_SZ; used_actions++; } /* Remove first 2 extra bytes */ DR_STE_SET(single_action_remove_header_size_v3, hw_action, action_id, DR_STE_V1_ACTION_ID_REMOVE_BY_SIZE); DR_STE_SET(single_action_remove_header_size_v3, hw_action, start_offset, 0); /* The hardware expects here size in words (2 bytes) */ DR_STE_SET(single_action_remove_header_size_v3, hw_action, remove_size, 1); used_actions++; *used_hw_action_num = used_actions; return 0; } static struct dr_ste_ctx ste_ctx_v3; static pthread_mutex_t ctx_mutex = PTHREAD_MUTEX_INITIALIZER; struct dr_ste_ctx *dr_ste_get_ctx_v3(void) { pthread_mutex_lock(&ctx_mutex); if (!ste_ctx_v3.actions_caps) { ste_ctx_v3 = *dr_ste_get_ctx_v2(); ste_ctx_v3.set_encap = &dr_ste_v3_set_encap; ste_ctx_v3.set_push_vlan = &dr_ste_v3_set_push_vlan; ste_ctx_v3.set_pop_vlan = &dr_ste_v3_set_pop_vlan; ste_ctx_v3.set_rx_decap = &dr_ste_v3_set_rx_decap; ste_ctx_v3.set_encap_l3 = &dr_ste_v3_set_encap_l3; ste_ctx_v3.set_action_decap_l3_list = &dr_ste_v3_set_action_decap_l3_list; } pthread_mutex_unlock(&ctx_mutex); return &ste_ctx_v3; } rdma-core-56.1/providers/mlx5/dr_table.c000066400000000000000000000130521477342711600201550ustar00rootroot00000000000000/* * Copyright (c) 2019, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include "mlx5dv_dr.h" static void dr_table_uninit_nic(struct dr_table_rx_tx *nic_tbl) { dr_htbl_put(nic_tbl->s_anchor); } static void dr_table_uninit_fdb(struct mlx5dv_dr_table *tbl) { dr_table_uninit_nic(&tbl->rx); dr_table_uninit_nic(&tbl->tx); } static void dr_table_uninit(struct mlx5dv_dr_table *tbl) { switch (tbl->dmn->type) { case MLX5DV_DR_DOMAIN_TYPE_NIC_RX: dr_table_uninit_nic(&tbl->rx); break; case MLX5DV_DR_DOMAIN_TYPE_NIC_TX: dr_table_uninit_nic(&tbl->tx); break; case MLX5DV_DR_DOMAIN_TYPE_FDB: dr_table_uninit_fdb(tbl); break; default: break; } } static int dr_table_init_nic(struct mlx5dv_dr_domain *dmn, struct dr_table_rx_tx *nic_tbl) { struct dr_domain_rx_tx *nic_dmn = nic_tbl->nic_dmn; struct dr_htbl_connect_info info; int ret; nic_tbl->s_anchor = dr_ste_htbl_alloc(dmn->ste_icm_pool, DR_CHUNK_SIZE_1, DR_STE_HTBL_TYPE_LEGACY, DR_STE_LU_TYPE_DONT_CARE, 0); if (!nic_tbl->s_anchor) return errno; info.type = CONNECT_MISS; info.miss_icm_addr = nic_dmn->default_icm_addr; ret = dr_ste_htbl_init_and_postsend(dmn, nic_dmn, nic_tbl->s_anchor, &info, true, 0); if (ret) goto free_s_anchor; dr_htbl_get(nic_tbl->s_anchor); return 0; free_s_anchor: dr_ste_htbl_free(nic_tbl->s_anchor); return ret; } static int dr_table_init_fdb(struct mlx5dv_dr_table *tbl) { int ret; ret = dr_table_init_nic(tbl->dmn, &tbl->rx); if (ret) return ret; ret = dr_table_init_nic(tbl->dmn, &tbl->tx); if (ret) goto destroy_rx; return 0; destroy_rx: dr_table_uninit_nic(&tbl->rx); return ret; } static int dr_table_init(struct mlx5dv_dr_table *tbl) { int ret = 0; list_head_init(&tbl->matcher_list); switch (tbl->dmn->type) { case MLX5DV_DR_DOMAIN_TYPE_NIC_RX: tbl->table_type = FS_FT_NIC_RX; tbl->rx.nic_dmn = &tbl->dmn->info.rx; ret = dr_table_init_nic(tbl->dmn, &tbl->rx); break; case MLX5DV_DR_DOMAIN_TYPE_NIC_TX: tbl->table_type = FS_FT_NIC_TX; tbl->tx.nic_dmn = &tbl->dmn->info.tx; ret = dr_table_init_nic(tbl->dmn, &tbl->tx); break; case MLX5DV_DR_DOMAIN_TYPE_FDB: tbl->table_type = FS_FT_FDB; tbl->rx.nic_dmn = &tbl->dmn->info.rx; tbl->tx.nic_dmn = &tbl->dmn->info.tx; ret = dr_table_init_fdb(tbl); break; default: assert(false); break; } return ret; } static int dr_table_create_devx_tbl(struct mlx5dv_dr_table *tbl) { struct dr_devx_flow_table_attr ft_attr = {}; ft_attr.type = tbl->table_type; ft_attr.level = tbl->dmn->info.caps.max_ft_level - 1; ft_attr.sw_owner = true; if (tbl->rx.s_anchor) ft_attr.icm_addr_rx = dr_icm_pool_get_chunk_icm_addr(tbl->rx.s_anchor->chunk); if (tbl->tx.s_anchor) ft_attr.icm_addr_tx = dr_icm_pool_get_chunk_icm_addr(tbl->tx.s_anchor->chunk); tbl->devx_obj = dr_devx_create_flow_table(tbl->dmn->ctx, &ft_attr); if (!tbl->devx_obj) return errno; return 0; } struct mlx5dv_dr_table *mlx5dv_dr_table_create(struct mlx5dv_dr_domain *dmn, uint32_t level) { struct mlx5dv_dr_table *tbl; int ret; atomic_fetch_add(&dmn->refcount, 1); if (level && !dmn->info.supp_sw_steering) { errno = EOPNOTSUPP; goto dec_ref; } tbl = calloc(1, sizeof(*tbl)); if (!tbl) { errno = ENOMEM; goto dec_ref; } tbl->dmn = dmn; tbl->level = level; 
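	/* A new table starts with a single reference held by its creator;
	 * mlx5dv_dr_table_destroy() below returns EBUSY while additional
	 * references (e.g. from matchers) are still held.
	 */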
atomic_init(&tbl->refcount, 1); if (!dr_is_root_table(tbl)) { ret = dr_table_init(tbl); if (ret) goto free_tbl; ret = dr_send_ring_force_drain(dmn); if (ret) goto uninit_tbl; ret = dr_table_create_devx_tbl(tbl); if (ret) goto uninit_tbl; } list_node_init(&tbl->tbl_list); dr_domain_lock(dmn); list_add_tail(&dmn->tbl_list, &tbl->tbl_list); dr_domain_unlock(dmn); return tbl; uninit_tbl: dr_table_uninit(tbl); free_tbl: free(tbl); dec_ref: atomic_fetch_sub(&dmn->refcount, 1); return NULL; } int mlx5dv_dr_table_destroy(struct mlx5dv_dr_table *tbl) { int ret = 0; if (atomic_load(&tbl->refcount) > 1) return EBUSY; if (!dr_is_root_table(tbl)) { ret = mlx5dv_devx_obj_destroy(tbl->devx_obj); if (ret) return ret; } dr_domain_lock(tbl->dmn); list_del(&tbl->tbl_list); dr_domain_unlock(tbl->dmn); if (!dr_is_root_table(tbl)) dr_table_uninit(tbl); atomic_fetch_sub(&tbl->dmn->refcount, 1); free(tbl); return ret; } rdma-core-56.1/providers/mlx5/dr_vports.c000066400000000000000000000155341477342711600204320ustar00rootroot00000000000000/* * Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved */ #include #include "mlx5dv_dr.h" static uint32_t dr_vports_gen_vport_key(uint16_t vhca_gvmi, uint16_t vport_num) { return (((uint32_t)vhca_gvmi) << 16) | vport_num; } static int dr_vports_calc_bucket_idx(uint32_t vport_key) { return vport_key % DR_VPORTS_BUCKETS; } static struct dr_devx_vport_cap * dr_vports_table_find_vport_num(struct dr_vports_table *h, uint16_t vhca_gvmi, uint16_t vport_num) { struct dr_devx_vport_cap *vport_cap; uint32_t vport_key; uint32_t idx; vport_key = dr_vports_gen_vport_key(vhca_gvmi, vport_num); idx = dr_vports_calc_bucket_idx(vport_key); vport_cap = h->buckets[idx]; while (vport_cap) { if (vport_cap->vhca_gvmi == vhca_gvmi && vport_cap->num == vport_num) return vport_cap; vport_cap = vport_cap->next; } return NULL; } static void dr_vports_table_add_vport(struct dr_vports_table *h, struct dr_devx_vport_cap *vport) { uint32_t vport_key; uint32_t idx; vport_key = dr_vports_gen_vport_key(vport->vhca_gvmi, vport->num); idx = dr_vports_calc_bucket_idx(vport_key); vport->next = h->buckets[idx]; h->buckets[idx] = vport; } static struct dr_devx_vport_cap * dr_vports_table_query_and_add_vport(struct ibv_context *ctx, struct dr_devx_vports *vports, bool other_vport, uint16_t vport_number) { struct dr_devx_vport_cap *new_vport; int ret = 0; pthread_spin_lock(&vports->lock); new_vport = dr_vports_table_find_vport_num(vports->vports, vports->esw_mngr.vhca_gvmi, vport_number); if (new_vport) goto unlock_ret; new_vport = calloc(1, sizeof(*new_vport)); if (!new_vport) { errno = ENOMEM; goto unlock_ret; } ret = dr_devx_query_esw_vport_context(ctx, other_vport, vport_number, &new_vport->icm_address_rx, &new_vport->icm_address_tx); if (ret) goto unlock_free; ret = dr_devx_query_gvmi(ctx, other_vport, vport_number, &new_vport->vport_gvmi); if (ret) goto unlock_free; new_vport->num = vport_number; new_vport->vhca_gvmi = vports->esw_mngr.vhca_gvmi; dr_vports_table_add_vport(vports->vports, new_vport); pthread_spin_unlock(&vports->lock); return new_vport; unlock_free: free(new_vport); new_vport = NULL; unlock_ret: pthread_spin_unlock(&vports->lock); return new_vport; } static struct dr_devx_vport_cap * dr_vports_table_query_and_add_ib_port(struct ibv_context *ctx, struct dr_devx_vports *vports, uint32_t port_num) { struct dr_devx_vport_cap *vport_ptr; struct mlx5dv_port port_info = {}; bool new_vport = false; uint64_t vport_flags; uint64_t wire_flags; int ret; wire_flags = 
MLX5DV_QUERY_PORT_VPORT | MLX5DV_QUERY_PORT_ESW_OWNER_VHCA_ID | MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_TX; vport_flags = wire_flags | MLX5DV_QUERY_PORT_VPORT_VHCA_ID | MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_RX; ret = mlx5dv_query_port(ctx, port_num, &port_info); /* Check if query succeed and vport is enabled */ if (ret || !(port_info.flags & MLX5DV_QUERY_PORT_VPORT)) return NULL; /* Check if required fields were supplied */ if (port_info.vport == WIRE_PORT) { if ((port_info.flags & wire_flags) != wire_flags) { errno = EINVAL; return NULL; } } else { if ((port_info.flags & vport_flags) != vport_flags) return NULL; } pthread_spin_lock(&vports->lock); vport_ptr = dr_vports_table_find_vport_num(vports->vports, port_info.esw_owner_vhca_id, port_info.vport); if (!vport_ptr) { new_vport = true; vport_ptr = calloc(1, sizeof(struct dr_devx_vport_cap)); if (!vport_ptr) { errno = ENOMEM; goto unlock_ret; } } vport_ptr->num = port_info.vport; vport_ptr->vport_gvmi = port_info.vport_vhca_id; vport_ptr->vhca_gvmi = port_info.esw_owner_vhca_id; vport_ptr->icm_address_rx = port_info.vport_steering_icm_rx; vport_ptr->icm_address_tx = port_info.vport_steering_icm_tx; if (port_info.flags & MLX5DV_QUERY_PORT_VPORT_REG_C0) { vport_ptr->metadata_c = port_info.reg_c0.value; vport_ptr->metadata_c_mask = port_info.reg_c0.mask; } if (new_vport) { dr_vports_table_add_vport(vports->vports, vport_ptr); /* IB port idx <-> vport idx <-> GVMI/ICM is constant */ vports->ib_ports[port_num - 1] = vport_ptr; } unlock_ret: pthread_spin_unlock(&vports->lock); return vport_ptr; } struct dr_devx_vport_cap * dr_vports_table_get_vport_cap(struct dr_devx_caps *caps, uint16_t vport) { struct dr_devx_vports *vports = &caps->vports; bool other_vport = !!vport || caps->is_ecpf; struct dr_devx_vport_cap *vport_cap; if (vport == ECPF_PORT && caps->is_ecpf) return &vports->esw_mngr; /* no lock on vport_find since table is updated atomically */ vport_cap = dr_vports_table_find_vport_num(vports->vports, vports->esw_mngr.vhca_gvmi, vport); if (vport_cap) return vport_cap; return dr_vports_table_query_and_add_vport(caps->dmn->ctx, vports, other_vport, vport); } struct dr_devx_vport_cap * dr_vports_table_get_ib_port_cap(struct dr_devx_caps *caps, uint32_t ib_port) { struct dr_devx_vports *vports = &caps->vports; struct dr_devx_vport_cap *vport_cap; if (!ib_port) { errno = EINVAL; return NULL; } if (!vports->ib_ports || ib_port > vports->num_ports) { errno = ENOTSUP; return NULL; } /* Query IB port if not found */ vport_cap = vports->ib_ports[ib_port - 1]; if (vport_cap) return vport_cap; return dr_vports_table_query_and_add_ib_port(caps->dmn->ctx, vports, ib_port); } void dr_vports_table_add_wire(struct dr_devx_vports *vports) { pthread_spin_lock(&vports->lock); vports->wire.num = WIRE_PORT; dr_vports_table_add_vport(vports->vports, &vports->wire); pthread_spin_unlock(&vports->lock); } void dr_vports_table_del_wire(struct dr_devx_vports *vports) { struct dr_devx_vport_cap *wire = &vports->wire; struct dr_vports_table *h = vports->vports; struct dr_devx_vport_cap *vport, *prev; uint32_t vport_key; uint32_t idx; vport_key = dr_vports_gen_vport_key(wire->vhca_gvmi, wire->num); idx = dr_vports_calc_bucket_idx(vport_key); pthread_spin_lock(&vports->lock); if (h->buckets[idx] == wire) { h->buckets[idx] = wire->next; goto out_unlock; } prev = h->buckets[idx]; vport = prev->next; while (vport) { if (vport == wire) { prev->next = vport->next; break; } prev = vport; vport = vport->next; } out_unlock: pthread_spin_unlock(&vports->lock); } struct 
dr_vports_table *dr_vports_table_create(struct mlx5dv_dr_domain *dmn) { struct dr_vports_table *h; h = calloc(1, sizeof(*h)); if (!h) { errno = ENOMEM; return NULL; } return h; } void dr_vports_table_destroy(struct dr_vports_table *h) { struct dr_devx_vport_cap *vport_cap, *next; uint32_t idx; for (idx = 0; idx < DR_VPORTS_BUCKETS; ++idx) { vport_cap = h->buckets[idx]; while (vport_cap) { next = vport_cap->next; free(vport_cap); vport_cap = next; } } free(h); } rdma-core-56.1/providers/mlx5/libmlx5.map000066400000000000000000000113431477342711600203110ustar00rootroot00000000000000/* Export symbols should be added below according to Documentation/versioning.md document. */ MLX5_1.0 { global: mlx5dv_query_device; mlx5dv_init_obj; local: *; }; MLX5_1.1 { global: mlx5dv_create_cq; } MLX5_1.0; MLX5_1.2 { global: mlx5dv_init_obj; mlx5dv_set_context_attr; } MLX5_1.1; MLX5_1.3 { global: mlx5dv_create_qp; mlx5dv_create_wq; } MLX5_1.2; MLX5_1.4 { global: mlx5dv_get_clock_info; } MLX5_1.3; MLX5_1.5 { global: mlx5dv_create_flow_action_esp; } MLX5_1.4; MLX5_1.6 { global: mlx5dv_create_flow_matcher; mlx5dv_destroy_flow_matcher; mlx5dv_create_flow; } MLX5_1.5; MLX5_1.7 { global: mlx5dv_create_flow_action_modify_header; mlx5dv_create_flow_action_packet_reformat; mlx5dv_devx_alloc_uar; mlx5dv_devx_free_uar; mlx5dv_devx_general_cmd; mlx5dv_devx_obj_create; mlx5dv_devx_obj_destroy; mlx5dv_devx_obj_modify; mlx5dv_devx_obj_query; mlx5dv_devx_query_eqn; mlx5dv_devx_umem_dereg; mlx5dv_devx_umem_reg; mlx5dv_open_device; } MLX5_1.6; MLX5_1.8 { global: mlx5dv_devx_cq_modify; mlx5dv_devx_cq_query; mlx5dv_devx_ind_tbl_modify; mlx5dv_devx_ind_tbl_query; mlx5dv_devx_qp_modify; mlx5dv_devx_qp_query; mlx5dv_devx_srq_modify; mlx5dv_devx_srq_query; mlx5dv_devx_wq_modify; mlx5dv_devx_wq_query; mlx5dv_is_supported; } MLX5_1.7; MLX5_1.9 { global: mlx5dv_devx_create_cmd_comp; mlx5dv_devx_destroy_cmd_comp; mlx5dv_devx_get_async_cmd_comp; mlx5dv_devx_obj_query_async; } MLX5_1.8; MLX5_1.10 { global: mlx5dv_alloc_dm; mlx5dv_create_mkey; mlx5dv_destroy_mkey; mlx5dv_dr_action_create_dest_table; mlx5dv_dr_action_create_dest_ibv_qp; mlx5dv_dr_action_create_dest_vport; mlx5dv_dr_action_create_flow_counter; mlx5dv_dr_action_create_drop; mlx5dv_dr_action_create_modify_header; mlx5dv_dr_action_create_packet_reformat; mlx5dv_dr_action_create_tag; mlx5dv_dr_action_destroy; mlx5dv_dr_domain_create; mlx5dv_dr_domain_destroy; mlx5dv_dr_domain_sync; mlx5dv_dr_matcher_create; mlx5dv_dr_matcher_destroy; mlx5dv_dr_rule_create; mlx5dv_dr_rule_destroy; mlx5dv_dr_table_create; mlx5dv_dr_table_destroy; mlx5dv_qp_ex_from_ibv_qp_ex; } MLX5_1.9; MLX5_1.11 { global: mlx5dv_devx_create_event_channel; mlx5dv_devx_destroy_event_channel; mlx5dv_devx_get_event; mlx5dv_devx_subscribe_devx_event; mlx5dv_devx_subscribe_devx_event_fd; } MLX5_1.10; MLX5_1.12 { global: mlx5dv_alloc_var; mlx5dv_dr_action_create_flow_meter; mlx5dv_dr_action_modify_flow_meter; mlx5dv_dump_dr_domain; mlx5dv_dump_dr_matcher; mlx5dv_dump_dr_rule; mlx5dv_dump_dr_table; mlx5dv_free_var; } MLX5_1.11; MLX5_1.13 { global: mlx5dv_pp_alloc; mlx5dv_pp_free; } MLX5_1.12; MLX5_1.14 { global: mlx5dv_dr_action_create_default_miss; mlx5dv_dr_domain_set_reclaim_device_memory; mlx5dv_modify_qp_lag_port; mlx5dv_query_qp_lag_port; } MLX5_1.13; MLX5_1.15 { global: mlx5dv_dr_action_create_dest_devx_tir; } MLX5_1.14; MLX5_1.16 { global: mlx5dv_dr_action_create_dest_array; mlx5dv_dr_action_create_flow_sampler; } MLX5_1.15; MLX5_1.17 { global: mlx5dv_dr_action_create_aso; mlx5dv_dr_action_create_pop_vlan; 
mlx5dv_dr_action_create_push_vlan; mlx5dv_dr_action_modify_aso; mlx5dv_modify_qp_sched_elem; mlx5dv_modify_qp_udp_sport; mlx5dv_sched_leaf_create; mlx5dv_sched_leaf_destroy; mlx5dv_sched_leaf_modify; mlx5dv_sched_node_create; mlx5dv_sched_node_destroy; mlx5dv_sched_node_modify; } MLX5_1.16; MLX5_1.18 { global: mlx5dv_reserved_qpn_alloc; mlx5dv_reserved_qpn_dealloc; } MLX5_1.17; MLX5_1.19 { global: mlx5dv_devx_umem_reg_ex; mlx5dv_dm_map_op_addr; _mlx5dv_query_port; } MLX5_1.18; MLX5_1.20 { global: mlx5dv_dr_domain_allow_duplicate_rules; mlx5dv_map_ah_to_qp; mlx5dv_qp_cancel_posted_send_wrs; _mlx5dv_mkey_check; } MLX5_1.19; MLX5_1.21 { global: mlx5dv_crypto_login; mlx5dv_crypto_login_query_state; mlx5dv_crypto_logout; mlx5dv_dci_stream_id_reset; mlx5dv_dek_create; mlx5dv_dek_destroy; mlx5dv_dek_query; mlx5dv_dr_action_create_dest_ib_port; mlx5dv_dr_matcher_set_layout; mlx5dv_get_vfio_device_list; mlx5dv_vfio_get_events_fd; mlx5dv_vfio_process_events; } MLX5_1.20; MLX5_1.22 { global: mlx5dv_dr_aso_other_domain_link; mlx5dv_dr_aso_other_domain_unlink; } MLX5_1.21; MLX5_1.23 { global: mlx5dv_devx_alloc_msi_vector; mlx5dv_devx_create_eq; mlx5dv_devx_destroy_eq; mlx5dv_devx_free_msi_vector; } MLX5_1.22; MLX5_1.24 { global: mlx5dv_create_steering_anchor; mlx5dv_crypto_login_create; mlx5dv_crypto_login_destroy; mlx5dv_crypto_login_query; mlx5dv_destroy_steering_anchor; mlx5dv_dr_action_create_dest_root_table; } MLX5_1.23; MLX5_1.25 { global: mlx5dv_get_data_direct_sysfs_path; mlx5dv_reg_dmabuf_mr; } MLX5_1.24; rdma-core-56.1/providers/mlx5/man/000077500000000000000000000000001477342711600170075ustar00rootroot00000000000000rdma-core-56.1/providers/mlx5/man/CMakeLists.txt000066400000000000000000000141311477342711600215470ustar00rootroot00000000000000rdma_man_pages( mlx5dv_alloc_dm.3.md mlx5dv_alloc_var.3.md mlx5dv_create_cq.3.md mlx5dv_create_flow.3.md mlx5dv_create_flow_action_modify_header.3.md mlx5dv_create_flow_action_packet_reformat.3.md mlx5dv_create_flow_matcher.3.md mlx5dv_create_mkey.3.md mlx5dv_create_qp.3.md mlx5dv_create_steering_anchor.3.md mlx5dv_crypto_login.3.md mlx5dv_crypto_login_create.3.md mlx5dv_dci_stream_id_reset.3.md mlx5dv_dek_create.3.md mlx5dv_devx_alloc_msi_vector.3.md mlx5dv_devx_alloc_uar.3.md mlx5dv_devx_create_cmd_comp.3.md mlx5dv_devx_create_eq.3.md mlx5dv_devx_create_event_channel.3.md mlx5dv_devx_get_event.3.md mlx5dv_devx_obj_create.3.md mlx5dv_devx_qp_modify.3.md mlx5dv_devx_query_eqn.3.md mlx5dv_devx_subscribe_devx_event.3.md mlx5dv_devx_umem_reg.3.md mlx5dv_dm_map_op_addr.3.md mlx5dv_dr_flow.3.md mlx5dv_dump.3.md mlx5dv_flow_action_esp.3.md mlx5dv_get_clock_info.3 mlx5dv_get_data_direct_sysfs_path.3.md mlx5dv_get_vfio_device_list.3.md mlx5dv_init_obj.3 mlx5dv_is_supported.3.md mlx5dv_map_ah_to_qp.3.md mlx5dv_mkey_check.3.md mlx5dv_modify_qp_lag_port.3.md mlx5dv_modify_qp_sched_elem.3.md mlx5dv_modify_qp_udp_sport.3.md mlx5dv_open_device.3.md mlx5dv_pp_alloc.3.md mlx5dv_qp_cancel_posted_send_wrs.3.md mlx5dv_query_device.3 mlx5dv_query_port.3.md mlx5dv_query_qp_lag_port.3.md mlx5dv_reg_dmabuf_mr.3.md mlx5dv_reserved_qpn_alloc.3.md mlx5dv_sched_node_create.3.md mlx5dv_ts_to_ns.3 mlx5dv_wr_mkey_configure.3.md mlx5dv_vfio_get_events_fd.3.md mlx5dv_vfio_process_events.3.md mlx5dv_wr_post.3.md mlx5dv_wr_set_mkey_crypto.3.md mlx5dv_wr_set_mkey_sig_block.3.md mlx5dv.7 ) rdma_alias_man_pages( mlx5dv_alloc_var.3 mlx5dv_free_var.3 mlx5dv_create_mkey.3 mlx5dv_destroy_mkey.3 mlx5dv_create_steering_anchor.3 mlx5dv_destroy_steering_anchor.3 mlx5dv_crypto_login.3 
mlx5dv_crypto_login_query_state.3 mlx5dv_crypto_login.3 mlx5dv_crypto_logout.3 mlx5dv_crypto_login_create.3 mlx5dv_crypto_login_destroy.3 mlx5dv_crypto_login_create.3 mlx5dv_crypto_login_query.3 mlx5dv_dek_create.3 mlx5dv_dek_query.3 mlx5dv_dek_create.3 mlx5dv_dek_destroy.3 mlx5dv_devx_alloc_msi_vector.3 mlx5dv_devx_free_msi_vector.3 mlx5dv_devx_alloc_uar.3 mlx5dv_devx_free_uar.3 mlx5dv_devx_create_cmd_comp.3 mlx5dv_devx_destroy_cmd_comp.3 mlx5dv_devx_create_eq.3 mlx5dv_devx_destroy_eq.3 mlx5dv_devx_create_event_channel.3 mlx5dv_devx_destroy_event_channel.3 mlx5dv_devx_create_cmd_comp.3 mlx5dv_devx_get_async_cmd_comp.3 mlx5dv_devx_obj_create.3 mlx5dv_devx_general_cmd.3 mlx5dv_devx_obj_create.3 mlx5dv_devx_obj_destroy.3 mlx5dv_devx_obj_create.3 mlx5dv_devx_obj_query.3 mlx5dv_devx_obj_create.3 mlx5dv_devx_obj_query_async.3 mlx5dv_devx_obj_create.3 mlx5dv_devx_obj_modify.3 mlx5dv_devx_qp_modify.3 mlx5dv_devx_qp_query.3 mlx5dv_devx_qp_modify.3 mlx5dv_devx_cq_modify.3 mlx5dv_devx_qp_modify.3 mlx5dv_devx_cq_query.3 mlx5dv_devx_qp_modify.3 mlx5dv_devx_wq_modify.3 mlx5dv_devx_qp_modify.3 mlx5dv_devx_wq_query.3 mlx5dv_devx_qp_modify.3 mlx5dv_devx_srq_modify.3 mlx5dv_devx_qp_modify.3 mlx5dv_devx_srq_query.3 mlx5dv_devx_qp_modify.3 mlx5dv_devx_ind_tbl_modify.3 mlx5dv_devx_qp_modify.3 mlx5dv_devx_ind_tbl_query.3 mlx5dv_devx_subscribe_devx_event.3 mlx5dv_devx_subscribe_devx_event_fd.3 mlx5dv_devx_umem_reg.3 mlx5dv_devx_umem_dereg.3 mlx5dv_devx_umem_reg.3 mlx5dv_devx_umem_reg_ex.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_aso.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_dest_table.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_dest_root_table.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_dest_ib_port.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_dest_ibv_qp.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_dest_devx_tir.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_dest_vport.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_dest_array.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_flow_counter.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_drop.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_default_miss.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_flow_sampler.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_flow_meter.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_modify_header.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_packet_reformat.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_pop_vlan.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_push_vlan.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_create_tag.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_destroy.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_modify_aso.3 mlx5dv_dr_flow.3 mlx5dv_dr_action_modify_flow_meter.3 mlx5dv_dr_flow.3 mlx5dv_dr_aso_other_domain_link.3 mlx5dv_dr_flow.3 mlx5dv_dr_aso_other_domain_unlink.3 mlx5dv_dr_flow.3 mlx5dv_dr_domain_allow_duplicate_rules.3 mlx5dv_dr_flow.3 mlx5dv_dr_domain_create.3 mlx5dv_dr_flow.3 mlx5dv_dr_domain_destroy.3 mlx5dv_dr_flow.3 mlx5dv_dr_domain_sync.3 mlx5dv_dr_flow.3 mlx5dv_dr_domain_set_reclaim_device_memory.3 mlx5dv_dr_flow.3 mlx5dv_dr_matcher_create.3 mlx5dv_dr_flow.3 mlx5dv_dr_matcher_destroy.3 mlx5dv_dr_flow.3 mlx5dv_dr_matcher_set_layout.3 mlx5dv_dr_flow.3 mlx5dv_dr_rule_create.3 mlx5dv_dr_flow.3 mlx5dv_dr_rule_destroy.3 mlx5dv_dr_flow.3 mlx5dv_dr_table_create.3 mlx5dv_dr_flow.3 mlx5dv_dr_table_destroy.3 mlx5dv_dump.3 mlx5dv_dump_dr_domain.3 mlx5dv_dump.3 mlx5dv_dump_dr_matcher.3 mlx5dv_dump.3 mlx5dv_dump_dr_rule.3 mlx5dv_dump.3 mlx5dv_dump_dr_table.3 mlx5dv_pp_alloc.3 mlx5dv_pp_free.3 mlx5dv_reserved_qpn_alloc.3 mlx5dv_reserved_qpn_dealloc.3 mlx5dv_sched_node_create.3 
  mlx5dv_sched_leaf_create.3
  mlx5dv_sched_node_create.3 mlx5dv_sched_leaf_destroy.3
  mlx5dv_sched_node_create.3 mlx5dv_sched_leaf_modify.3
  mlx5dv_sched_node_create.3 mlx5dv_sched_node_destroy.3
  mlx5dv_sched_node_create.3 mlx5dv_sched_node_modify.3
  mlx5dv_wr_mkey_configure.3 mlx5dv_wr_set_mkey_access_flags.3
  mlx5dv_wr_mkey_configure.3 mlx5dv_wr_set_mkey_layout_interleaved.3
  mlx5dv_wr_mkey_configure.3 mlx5dv_wr_set_mkey_layout_list.3
  mlx5dv_wr_post.3 mlx5dv_wr_set_dc_addr.3
  mlx5dv_wr_post.3 mlx5dv_wr_set_dc_addr_stream.3
  mlx5dv_wr_post.3 mlx5dv_qp_ex_from_ibv_qp_ex.3
  mlx5dv_wr_post.3 mlx5dv_wr_memcpy.3
  mlx5dv_wr_post.3 mlx5dv_wr_mr_interleaved.3
  mlx5dv_wr_post.3 mlx5dv_wr_mr_list.3
  mlx5dv_wr_post.3 mlx5dv_wr_raw_wqe.3
  )
rdma-core-56.1/providers/mlx5/man/mlx5dv.7000066400000000000000000000032751477342711600203250ustar00rootroot00000000000000.\" -*- nroff -*-
.\" Licensed under the OpenIB.org (MIT) - See COPYING.md
.\"
.TH MLX5DV 7 2017-02-02 1.0.0
.SH "NAME"
mlx5dv \- Direct verbs for mlx5 devices
.br
This is low-level access to mlx5 devices to perform data path operations,
without the general branching performed by \fBibv_post_send\fR(3).
.SH "DESCRIPTION"
The libibverbs API is an abstract one. It is agnostic to any underlying
provider-specific implementation. While this abstraction has the advantage
of user application portability, it has a performance penalty. For some
applications optimizing performance is more important than portability.

The mlx5 direct verbs API is intended for such applications.
It exposes mlx5-specific low-level data path (send/receive/completion)
operations, allowing the application to bypass the libibverbs data path API.

This interface consists of one hardware-specific header file
with relevant inline functions and conversion logic from ibverbs structures
to mlx5-specific structures.

Including mlx5dv.h directly and linking against the mlx5 library allows
usage of this interface.

Once an application uses the direct flow, it is fully responsible for its own
locking scheme. An application is expected not to mix direct and non-direct
access in the same data path.
.SH "NOTES"
All Mellanox NIC devices starting from Connect-IB (Connect-IB, ConnectX-4,
ConnectX-4Lx, ConnectX-5, ...) implement the mlx5 API; using the mlx5 direct
verbs therefore does not limit applications to a single NIC HW device and
keeps some level of portability.
.SH "SEE ALSO"
.BR ibv_post_send (3),
.BR verbs (7),
.BR mlx5dv_is_supported (3)
.SH "AUTHORS"
.TP
Leon Romanovsky
rdma-core-56.1/providers/mlx5/man/mlx5dv_alloc_dm.3.md000066400000000000000000000044761477342711600225520ustar00rootroot00000000000000---
layout: page
title: mlx5dv_alloc_dm
section: 3
tagline: Verbs
date: 2018-9-1
header: "mlx5 Programmer's Manual"
footer: mlx5
---

# NAME

mlx5dv_alloc_dm - allocates device memory (DM)

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct ibv_dm *mlx5dv_alloc_dm(struct ibv_context *context,
			       struct ibv_alloc_dm_attr *dm_attr,
			       struct mlx5dv_alloc_dm_attr *mlx5_dm_attr)
```

# DESCRIPTION

**mlx5dv_alloc_dm()** allocates device memory (DM) with specific driver properties.

# ARGUMENTS

Please see *ibv_alloc_dm(3)* man page for *context* and *dm_attr*.
## mlx5_dm_attr

```c
struct mlx5dv_alloc_dm_attr {
	enum mlx5dv_alloc_dm_type type;
	uint64_t comp_mask;
};
```

*type*
:   The device memory type the user wishes to allocate:

    MLX5DV_DM_TYPE_MEMIC
    Device memory of type MEMIC - on-chip memory that can be allocated and used
    as a memory region for transmitting/receiving packets directly from/to the
    memory on the chip.

    MLX5DV_DM_TYPE_STEERING_SW_ICM
    Device memory of type STEERING SW ICM - this memory is used by the device to
    store the packet steering tables and rules. Can be used for direct table and
    steering rules creation when allocated by a privileged user.

    MLX5DV_DM_TYPE_HEADER_MODIFY_SW_ICM
    Device memory of type HEADER MODIFY SW ICM - this memory is used by the
    device to store the packet header modification tables and rules. Can be used
    for direct table and header modification rules creation when allocated by a
    privileged user.

    MLX5DV_DM_TYPE_HEADER_MODIFY_PATTERN_SW_ICM
    Device memory of type HEADER MODIFY PATTERN SW ICM - this memory is used by
    the device to store packet header modification patterns/templates. Can be
    used for direct table and header modification rules creation when allocated
    by a privileged user.

    MLX5DV_DM_TYPE_ENCAP_SW_ICM
    Device memory of type PACKET ENCAP SW ICM - this memory is used by the
    device to store packet encap data. Can be used for packet encap reformat
    rules creation when allocated by a privileged user.

*comp_mask*
:   Bitmask specifying what fields in the structure are valid.
    Currently reserved and should be set to 0.

# RETURN VALUE

**mlx5dv_alloc_dm()** returns a pointer to the created DM, on error NULL will be returned and errno will be set.

# SEE ALSO

**ibv_alloc_dm**(3),

# AUTHOR

Ariel Levkovich
rdma-core-56.1/providers/mlx5/man/mlx5dv_alloc_var.3.md000066400000000000000000000024441477342711600227370ustar00rootroot00000000000000---
layout: page
title: mlx5dv_alloc_var / mlx5dv_free_var
section: 3
tagline: Verbs
---

# NAME

mlx5dv_alloc_var - Allocates a VAR

mlx5dv_free_var - Frees a VAR

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_var *
mlx5dv_alloc_var(struct ibv_context *context, uint32_t flags);

void mlx5dv_free_var(struct mlx5dv_var *dv_var);
```

# DESCRIPTION

Create / free a VAR which can be used for some device commands over the DEVX interface.

The DEVX API enables direct access from the user space area to the mlx5 device
driver; the VAR information is needed for a few commands related to Virtio.

# ARGUMENTS

*context*
:   RDMA device context to work on.

*flags*
:   Allocation flags for the VAR.

## dv_var

```c
struct mlx5dv_var {
	uint32_t page_id;
	uint32_t length;
	off_t mmap_off;
	uint64_t comp_mask;
};
```

*page_id*
:   The device page id to be used.

*length*
:   The mmap length parameter to be used for mapping a VA to the allocated VAR entry.

*mmap_off*
:   The mmap offset parameter to be used for mapping a VA to the allocated VAR entry.

# RETURN VALUE

Upon success *mlx5dv_alloc_var* returns a pointer to the created VAR; on error NULL will be returned and errno will be set.
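# EXAMPLE

A minimal usage sketch (not part of the original documentation): it assumes a DEVX-capable *context* obtained via **mlx5dv_open_device()**, and it assumes the reported *mmap_off*/*length* are applied to the device context's command FD; both the helper name and the mapping target are illustrative.

```c
#include <sys/mman.h>
#include <infiniband/mlx5dv.h>

/* Sketch: allocate a VAR and map a VA to it; error handling is minimal. */
static void *map_var(struct ibv_context *ctx, struct mlx5dv_var **var_out)
{
	struct mlx5dv_var *var;
	void *va;

	var = mlx5dv_alloc_var(ctx, 0);
	if (!var)
		return NULL;

	/* Map a VA to the allocated VAR entry using the reported offset and
	 * length (mapping on ctx->cmd_fd is an assumption here). */
	va = mmap(NULL, var->length, PROT_READ | PROT_WRITE, MAP_SHARED,
		  ctx->cmd_fd, var->mmap_off);
	if (va == MAP_FAILED) {
		mlx5dv_free_var(var);
		return NULL;
	}

	*var_out = var;
	return va;
}
```

The VA must be unmapped before calling **mlx5dv_free_var()** on the VAR.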
# SEE ALSO

**mlx5dv_open_device**, **mlx5dv_devx_obj_create**

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_create_cq.3.md000066400000000000000000000035021477342711600227170ustar00rootroot00000000000000---
layout: page
title: mlx5dv_create_cq
section: 3
tagline: Verbs
date: 2018-9-1
header: "mlx5 Programmer's Manual"
footer: mlx5
---

# NAME

mlx5dv_create_cq - creates a completion queue (CQ)

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct ibv_cq_ex *mlx5dv_create_cq(struct ibv_context *context,
				   struct ibv_cq_init_attr_ex *cq_attr,
				   struct mlx5dv_cq_init_attr *mlx5_cq_attr);
```

# DESCRIPTION

**mlx5dv_create_cq()** creates a completion queue (CQ) with specific driver properties.

# ARGUMENTS

Please see **ibv_create_cq_ex(3)** man page for **context** and **cq_attr**.

## mlx5_cq_attr

```c
struct mlx5dv_cq_init_attr {
	uint64_t comp_mask;
	uint8_t  cqe_comp_res_format;
	uint32_t flags;
	uint16_t cqe_size;
};
```

*comp_mask*
:   Bitmask specifying what fields in the structure are valid:

    MLX5DV_CQ_INIT_ATTR_MASK_COMPRESSED_CQE
    enables creating a CQ in a mode where several CQEs may be compressed into a
    single CQE; valid values in *cqe_comp_res_format*

    MLX5DV_CQ_INIT_ATTR_MASK_FLAGS
    valid values in *flags*

    MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE
    valid values in *cqe_size*

*cqe_comp_res_format*
:   A bitwise OR of the various CQE response formats of the responder side:

    MLX5DV_CQE_RES_FORMAT_HASH
    CQE compression with hash

    MLX5DV_CQE_RES_FORMAT_CSUM
    CQE compression with RX checksum

    MLX5DV_CQE_RES_FORMAT_CSUM_STRIDX
    CQE compression with stride index

*flags*
:   A bitwise OR of the various values described below:

    MLX5DV_CQ_INIT_ATTR_FLAGS_CQE_PAD
    create a padded 128B CQE

*cqe_size*
:   Configures the CQE size to be 64 or 128 bytes; other values will cause
    mlx5dv_create_cq to fail.

# RETURN VALUE

**mlx5dv_create_cq()** returns a pointer to the created CQ, or NULL if the request fails, in which case errno will be set.

# SEE ALSO

**ibv_create_cq_ex**(3),

# AUTHOR

Yonatan Cohen
rdma-core-56.1/providers/mlx5/man/mlx5dv_create_flow.3.md000066400000000000000000000046721477342711600232720ustar00rootroot00000000000000---
layout: page
title: mlx5dv_create_flow
section: 3
tagline: Verbs
date: 2018-9-19
header: "mlx5 Programmer's Manual"
footer: mlx5
---

# NAME

mlx5dv_create_flow - creates a steering flow rule

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct ibv_flow *
mlx5dv_create_flow(struct mlx5dv_flow_matcher *flow_matcher,
		   struct mlx5dv_flow_match_parameters *match_value,
		   size_t num_actions,
		   struct mlx5dv_flow_action_attr actions_attr[])
```

# DESCRIPTION

**mlx5dv_create_flow()** creates a steering flow rule with the ability to specify specific driver properties.

# ARGUMENTS

Please see *mlx5dv_create_flow_matcher(3)* for *flow_matcher* and *match_value*.

*num_actions*
:   Specifies how many actions are passed in *actions_attr*

## *actions_attr*

```c
struct mlx5dv_flow_action_attr {
	enum mlx5dv_flow_action_type type;
	union {
		struct ibv_qp *qp;
		struct ibv_counters *counter;
		struct ibv_flow_action *action;
		uint32_t tag_value;
		struct mlx5dv_devx_obj *obj;
	};
};
```

*type*
:   MLX5DV_FLOW_ACTION_DEST_IBV_QP
    The QP passed will receive the matched packets.

    MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION
    The flow action to be applied.

    MLX5DV_FLOW_ACTION_TAG
    Flow tag to be provided in work completion.

    MLX5DV_FLOW_ACTION_DEST_DEVX
    The DEVX destination object for the matched packets.

    MLX5DV_FLOW_ACTION_COUNTERS_DEVX
    The DEVX counter object for the matched packets.

    MLX5DV_FLOW_ACTION_DEFAULT_MISS
    Steer the packet to the default miss destination.
    MLX5DV_FLOW_ACTION_DROP
    The action is to drop the matched packet.

*qp*
:   QP passed, to be used with *type* *MLX5DV_FLOW_ACTION_DEST_IBV_QP*.

*action*
:   Flow action, to be used with *type* *MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION*;
    see *mlx5dv_create_flow_action_modify_header(3)* and
    *mlx5dv_create_flow_action_packet_reformat(3)*.

*tag_value*
:   Tag value to be passed in the work completion, to be used with *type*
    *MLX5DV_FLOW_ACTION_TAG*; see *ibv_create_cq_ex(3)*.

*obj*
:   DEVX object, to be used with *type* *MLX5DV_FLOW_ACTION_DEST_DEVX* or by
    *MLX5DV_FLOW_ACTION_COUNTERS_DEVX*.

# RETURN VALUE

**mlx5dv_create_flow** returns a pointer to the created flow rule, on error NULL will be returned and errno will be set.

# SEE ALSO

*mlx5dv_create_flow_action_modify_header(3)*,
*mlx5dv_create_flow_action_packet_reformat(3)*,
*mlx5dv_create_flow_matcher(3)*, *mlx5dv_create_qp(3)*,
*ibv_create_qp_ex(3)*, *ibv_create_cq_ex(3)*, *ibv_create_counters(3)*

# AUTHOR

Mark Bloch
rdma-core-56.1/providers/mlx5/man/mlx5dv_create_flow_action_modify_header.3.md000066400000000000000000000022171477342711600275010ustar00rootroot00000000000000---
layout: page
title: mlx5dv_create_flow_action_modify_header
section: 3
tagline: Verbs
---

# NAME

mlx5dv_create_flow_action_modify_header - Flow action modify header for mlx5 provider

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct ibv_flow_action *
mlx5dv_create_flow_action_modify_header(struct ibv_context *ctx,
					size_t actions_sz,
					uint64_t actions[],
					enum mlx5dv_flow_table_type ft_type)
```

# DESCRIPTION

Create a modify-header flow steering action; it allows mutating a packet header.

# ARGUMENTS

*ctx*
:   RDMA device context to create the action on.

*actions_sz*
:   The size of the *actions* buffer in bytes.

*actions*
:   A buffer which contains modify actions provided in device spec format (i.e. be64).

*ft_type*
:   Defines the flow table type to which the modify header action will be attached.

    MLX5DV_FLOW_TABLE_TYPE_NIC_RX: RX FLOW TABLE

    MLX5DV_FLOW_TABLE_TYPE_NIC_TX: TX FLOW TABLE

# RETURN VALUE

Upon success *mlx5dv_create_flow_action_modify_header* will return a new *struct ibv_flow_action* object, on error NULL will be returned and errno will be set.

# SEE ALSO

*ibv_create_flow(3)*, *ibv_create_flow_action(3)*
rdma-core-56.1/providers/mlx5/man/mlx5dv_create_flow_action_packet_reformat.3.md000066400000000000000000000035371477342711600300520ustar00rootroot00000000000000
---
layout: page
title: mlx5dv_create_flow_action_packet_reformat
section: 3
tagline: Verbs
---

# NAME

mlx5dv_create_flow_action_packet_reformat - Flow action reformat packet for mlx5 provider

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct ibv_flow_action *
mlx5dv_create_flow_action_packet_reformat(struct ibv_context *ctx,
					  size_t data_sz,
					  void *data,
					  enum mlx5dv_flow_action_packet_reformat_type reformat_type,
					  enum mlx5dv_flow_table_type ft_type)
```

# DESCRIPTION

Create a packet reformat flow steering action. It allows adding/removing packet headers.

# ARGUMENTS

*ctx*
:   RDMA device context to create the action on.

*data_sz*
:   The size of the *data* buffer.

*data*
:   A buffer which contains headers in case the action requires them.

*reformat_type*
:   The reformat type to be created. Use enum mlx5dv_flow_action_packet_reformat_type.

    MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2:
    Decap a generic L2 tunneled packet up to inner L2.

    MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL:
    Generic encap; *data* should contain the encapsulating headers.

    MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L3_TUNNEL_TO_L2:
    Will do decap where the inner packet starts from L3.
    *data* should be MAC or MAC + vlan (14 or 18 bytes) to be appended to the
    packet after the decap action.

    MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL:
    Will do encap where the L2 of the original packet will not be included.
    *data* should be the encapsulating header.

*ft_type*
:   Defines the flow table type to which the packet reformat action will be attached.

# RETURN VALUE

Upon success *mlx5dv_create_flow_action_packet_reformat* will return a new *struct ibv_flow_action* object, on error NULL will be returned and errno will be set.

# SEE ALSO

*ibv_create_flow(3)*, *ibv_create_flow_action(3)*
rdma-core-56.1/providers/mlx5/man/mlx5dv_create_flow_matcher.3.md000066400000000000000000000046301477342711600247710ustar00rootroot00000000000000---
layout: page
title: mlx5dv_create_flow_matcher
section: 3
tagline: Verbs
date: 2018-9-19
header: "mlx5 Programmer's Manual"
footer: mlx5
---

# NAME

mlx5dv_create_flow_matcher - creates a matcher to be used with *mlx5dv_create_flow(3)*

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_flow_matcher *
mlx5dv_create_flow_matcher(struct ibv_context *context,
			   struct mlx5dv_flow_matcher_attr *attr)
```

# DESCRIPTION

**mlx5dv_create_flow_matcher()** creates a flow matcher (mask) to be used with *mlx5dv_create_flow(3)*.

# ARGUMENTS

Please see *ibv_open_device(3)* for *context*.

## *attr*

```c
struct mlx5dv_flow_matcher_attr {
	enum ibv_flow_attr_type type;
	uint32_t flags; /* From enum ibv_flow_flags */
	uint16_t priority;
	uint8_t match_criteria_enable; /* Device spec format */
	struct mlx5dv_flow_match_parameters *match_mask;
	uint64_t comp_mask;
	enum mlx5dv_flow_table_type ft_type;
};
```

*type*
:   Type of matcher to be created:

    IBV_FLOW_ATTR_NORMAL:
    Normal rule according to specification.

*flags*
:   Special flags to control the rule:

    0:
    A zero value means the matcher will store ingress flow rules.

    IBV_FLOW_ATTR_FLAGS_EGRESS:
    Specifies that this matcher will store egress flow rules.

*priority*
:   See *ibv_create_flow(3)*.

*match_criteria_enable*
:   Which match criteria are configured in *match_mask*, passed in device spec format.

## *match_mask*

```c
struct mlx5dv_flow_match_parameters {
	size_t match_sz;
	uint64_t match_buf[]; /* Device spec format */
};
```

*match_sz*
:   Size in bytes of *match_buf*.

*match_buf*
:   The mask to be used, passed in device spec format.

*comp_mask*
:   MLX5DV_FLOW_MATCHER_MASK_FT_TYPE for *ft_type*

## *ft_type*

Specifies in which flow table type the matcher will store the flow rules:

    MLX5DV_FLOW_TABLE_TYPE_NIC_RX:
    Specifies that this matcher will store ingress flow rules.

    MLX5DV_FLOW_TABLE_TYPE_NIC_TX:
    Specifies that this matcher will store egress flow rules.

    MLX5DV_FLOW_TABLE_TYPE_FDB:
    Specifies that this matcher will store FDB rules.

    MLX5DV_FLOW_TABLE_TYPE_RDMA_RX:
    Specifies that this matcher will store ingress RDMA flow rules.

    MLX5DV_FLOW_TABLE_TYPE_RDMA_TX:
    Specifies that this matcher will store egress RDMA flow rules.

# RETURN VALUE

**mlx5dv_create_flow_matcher** returns a pointer to *mlx5dv_flow_matcher*, on error NULL will be returned and errno will be set.
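# EXAMPLE

A minimal creation sketch (not from the original page). The match buffer size (*MATCH_SZ*) and the criteria bit used here are illustrative assumptions; real values must follow the device specification:

```c
#include <stdlib.h>
#include <infiniband/mlx5dv.h>

#define MATCH_SZ 0x180 /* assumed device-spec match parameter size */

static struct mlx5dv_flow_matcher *matcher_sketch(struct ibv_context *ctx)
{
	struct mlx5dv_flow_matcher_attr attr = {};
	struct mlx5dv_flow_match_parameters *mask;
	struct mlx5dv_flow_matcher *matcher;

	mask = calloc(1, sizeof(*mask) + MATCH_SZ);
	if (!mask)
		return NULL;
	mask->match_sz = MATCH_SZ;
	/* Set the mask bits in mask->match_buf, in device spec format. */

	attr.type = IBV_FLOW_ATTR_NORMAL;
	attr.priority = 0;
	attr.match_criteria_enable = 1 << 0; /* assumed: outer header criteria */
	attr.match_mask = mask;

	matcher = mlx5dv_create_flow_matcher(ctx, &attr);
	free(mask); /* the mask is consumed during creation */
	return matcher;
}
```

The returned matcher can then be passed to *mlx5dv_create_flow(3)* together with per-rule match values.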
# SEE ALSO

*ibv_open_device(3)*, *ibv_create_flow(3)*

# AUTHOR

Mark Bloch
rdma-core-56.1/providers/mlx5/man/mlx5dv_create_mkey.3.md000066400000000000000000000047001477342711600232620ustar00rootroot00000000000000---
layout: page
title: mlx5dv_create_mkey / mlx5dv_destroy_mkey
section: 3
tagline: Verbs
---

# NAME

mlx5dv_create_mkey - Creates an indirect mkey

mlx5dv_destroy_mkey - Destroys an indirect mkey

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_mkey_init_attr {
	struct ibv_pd *pd;
	uint32_t create_flags;
	uint16_t max_entries;
};

struct mlx5dv_mkey {
	uint32_t lkey;
	uint32_t rkey;
};

struct mlx5dv_mkey *
mlx5dv_create_mkey(struct mlx5dv_mkey_init_attr *mkey_init_attr);

int mlx5dv_destroy_mkey(struct mlx5dv_mkey *mkey);
```

# DESCRIPTION

Create / destroy an indirect mkey.

Create an indirect mkey to enable an application to use device-specific functionality.

# ARGUMENTS

## mkey_init_attr

*pd*
:   ibv protection domain.

*create_flags*
:   MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT:
    Indirect mkey is being created.

    MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE:
    Enable block signature offload support for mkey.

    MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO:
    Enable crypto offload support for mkey. Setting this flag means that crypto
    operations will be done and hence must be configured; i.e., if this flag is
    set and the MKey was not configured for crypto properties using
    **mlx5dv_wr_set_mkey_crypto()**, then running traffic with the MKey will
    fail, generating a CQE with error.

    MLX5DV_MKEY_INIT_ATTR_FLAGS_UPDATE_TAG:
    Enable update tag support for mkey. Setting this flag allows an application
    to set the mkey tag after creating the mkey. If the kernel does not support
    updating the mkey tag, mkey creation will fail.

    MLX5DV_MKEY_INIT_ATTR_FLAGS_REMOTE_INVALIDATE:
    Enable remote invalidation support for mkey.

*max_entries*
:   Requested max number of pointed entries by this indirect mkey. The function
    will update *mkey_init_attr->max_entries* with the actual number of entries
    for the created mkey; it will be greater than or equal to the value requested.

# RETURN VALUE

Upon success *mlx5dv_create_mkey* will return a new *struct mlx5dv_mkey*; on error NULL will be returned and errno will be set.

Upon successful destroy, 0 is returned, or the value of errno on a failure.

# NOTES

For this functionality to work, a DEVX context should be opened by using *mlx5dv_open_device*.

The created indirect mkey cannot work with the scatter-to-CQE feature; consider *mlx5dv_create_qp()* with MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE for small messages.

# SEE ALSO

**mlx5dv_open_device**(3), **mlx5dv_create_qp**(3)

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_create_qp.3.md000066400000000000000000000165471477342711600227450ustar00rootroot00000000000000---
layout: page
title: mlx5dv_create_qp
section: 3
tagline: Verbs
date: 2018-9-1
header: "mlx5 Programmer's Manual"
footer: mlx5
---

# NAME

mlx5dv_create_qp - creates a queue pair (QP)

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct ibv_qp *mlx5dv_create_qp(struct ibv_context *context,
				struct ibv_qp_init_attr_ex *qp_attr,
				struct mlx5dv_qp_init_attr *mlx5_qp_attr)
```

# DESCRIPTION

**mlx5dv_create_qp()** creates a queue pair (QP) with specific driver properties.

# ARGUMENTS

Please see *ibv_create_qp_ex(3)* man page for *context* and *qp_attr*.
## mlx5_qp_attr

```c
struct mlx5dv_qp_init_attr {
	uint64_t comp_mask;
	uint32_t create_flags;
	struct mlx5dv_dc_init_attr dc_init_attr;
	uint64_t send_ops_flags;
};
```

*comp_mask*
:   Bitmask specifying what fields in the structure are valid:

    MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS:
    valid values in *create_flags*

    MLX5DV_QP_INIT_ATTR_MASK_DC:
    valid values in *dc_init_attr*

    MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS:
    valid values in *send_ops_flags*

*create_flags*
:   A bitwise OR of the various values described below.

    MLX5DV_QP_CREATE_TUNNEL_OFFLOADS:
    Enable offloading such as checksum and LRO for incoming tunneling traffic.

    MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_UC:
    Allow receiving loopback unicast traffic.

    MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_MC:
    Allow receiving loopback multicast traffic.

    MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE:
    Disable the scatter to CQE feature, which is enabled by default.

    MLX5DV_QP_CREATE_ALLOW_SCATTER_TO_CQE:
    Allow scatter to CQE for requester even if the QP was not configured to
    signal all WRs.

    MLX5DV_QP_CREATE_PACKET_BASED_CREDIT_MODE:
    Set QP to work in end-to-end packet-based credit, instead of the default
    message-based credits (IB spec. section 9.7.7.2).
    It is the application's responsibility to make sure that the peer QP is
    configured with the same mode.

    MLX5DV_QP_CREATE_SIG_PIPELINING:
    If the flag is set, the QP is moved to SQD state upon encountering a
    signature error, and IBV_EVENT_SQ_DRAINED is generated to inform about the
    new state. The signature pipelining feature is a performance optimization,
    which reduces latency for read operations in the storage protocols. The
    feature is optional. Creating the QP fails if the kernel or device does not
    support the feature. In this case, an application should fall back to
    backward compatibility mode and handle read operations without the
    pipelining. See details about the signature pipelining in
    **mlx5dv_qp_cancel_posted_send_wrs**(3).

    MLX5DV_QP_CREATE_OOO_DP:
    If the flag is set, Receive WRs on the receiver side of the QP are allowed
    to be consumed out-of-order and the sender side of the QP is allowed to
    transmit messages without guaranteeing any arrival ordering on the receiver
    side. The flag, when set, must be set both on the sender and receiver side
    of a QP (e.g., DCT and DCI). Setting the flag is optional and the
    availability of this feature should be queried by the application (see
    details in **mlx5dv_query_device**(3)) and there is no automatic fallback:
    if the flag is set while the kernel or device does not support the feature,
    then creating the QP fails. Thus, before creating a QP with this flag set,
    the application must query the maximal outstanding Receive WRs possible on
    a QP with this flag set, according to the QP type (see details in
    **mlx5dv_query_device**(3)) and make sure the capability is supported.

    > **Note**
    >
    > All the following describe the behavior and semantics of a QP
    > with this flag set.

    Completions' delivery ordering:
    A Receive WR posted on this QP may be consumed by any arriving message to
    this QP that requires Receive WR consumption. Nonetheless, the ordering in
    which work completions are delivered for the posted WRs, both on sender
    side and receiver side, remains unchanged when this flag is set (and is
    independent of the ordering in which the Receive WRs are consumed). The ID
    delivered in every work completion (wr_id) will specify which WR was
    completed by the delivered work completion.
    Data placing and operations' execution ordering:
    RDMA Read and RDMA Atomic operations are executed on the responder side in
    order, i.e., these operations are executed after all previous messages are
    done executing. However, the ordering of RDMA Read response packets being
    scattered to memory on the requestor side is not guaranteed. This means
    that, although the data is read after executing all previous messages, it
    may be scattered out-of-order on the requestor side. Ordering of write
    requests towards the memory on the responder side, initiated by RDMA Send,
    RDMA Send with Immediate, RDMA Write or RDMA Write with Immediate, is not
    guaranteed.

    Good and bad practice:
    Since it cannot be guaranteed which RDMA Send (and/or RDMA Send with
    Immediate) will consume a Receive WR (and will scatter its data to the
    memory buffers specified in the WR), it is not recommended to post
    different sizes of Receive WRs. Polling on any memory that is used by the
    device to scatter data is not recommended, since ordering of data placement
    of RDMA Send, RDMA Write and RDMA Write with Immediate is not guaranteed.
    The receiver, upon getting a completion for an RDMA Write with Immediate,
    should not rely on wr_id alone to determine to which memory data was
    scattered by the operation.

*dc_init_attr*
:   DC init attributes.

## *dc_init_attr*

```c
struct mlx5dv_dci_streams {
	uint8_t log_num_concurent;
	uint8_t log_num_errored;
};

struct mlx5dv_dc_init_attr {
	enum mlx5dv_dc_type dc_type;
	union {
		uint64_t dct_access_key;
		struct mlx5dv_dci_streams dci_streams;
	};
};
```

*dc_type*
:   MLX5DV_DCTYPE_DCT
    QP type: Target DC.

    MLX5DV_DCTYPE_DCI
    QP type: Initiator DC.

*dct_access_key*
:   Used to create a DCT QP.

*dci_streams*
:   Used to define a DCI QP with multiple concurrent streams.
    Valid when comp_mask includes MLX5DV_QP_INIT_ATTR_MASK_DCI_STREAMS.

    log_num_concurent
    Defines the number of different parallel streams that can be handled by HW.
    All work requests of a specific stream_id are handled in order.

    log_num_errored
    Defines the number of DCI error stream channels before moving the DCI to an
    error state.

*send_ops_flags*
:   A bitwise OR of the various values described below.

    MLX5DV_QP_EX_WITH_MR_INTERLEAVED:
    Enables the mlx5dv_wr_mr_interleaved() work request on this QP.

    MLX5DV_QP_EX_WITH_MR_LIST:
    Enables the mlx5dv_wr_mr_list() work request on this QP.

    MLX5DV_QP_EX_WITH_MKEY_CONFIGURE:
    Enables the mlx5dv_wr_mkey_configure() work request and the related setters
    on this QP.

# NOTES

**mlx5dv_qp_ex_from_ibv_qp_ex()** is used to get *struct mlx5dv_qp_ex* for accessing the send ops interfaces when IBV_QP_INIT_ATTR_SEND_OPS_FLAGS is used.

The MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE flag should be set in cases where the IOVA doesn't match the process' VA and the message payload size is small enough to trigger the scatter to CQE feature.

When device memory is used, IBV_SEND_INLINE and scatter to CQE should not be used, as the required memcpy is not possible.

# RETURN VALUE

**mlx5dv_create_qp()** returns a pointer to the created QP, on error NULL will be returned and errno will be set.
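# EXAMPLE

A minimal sketch (not from the original page) of creating a DCI QP; error handling is omitted and *ctx*, *pd* and *cq* are assumed to pre-exist. The capacity values are placeholders:

```c
#include <infiniband/mlx5dv.h>

static struct ibv_qp *create_dci(struct ibv_context *ctx, struct ibv_pd *pd,
				 struct ibv_cq *cq)
{
	struct ibv_qp_init_attr_ex attr_ex = {};
	struct mlx5dv_qp_init_attr attr_dv = {};

	/* DC QPs are created through the driver-specific QP type. */
	attr_ex.qp_type = IBV_QPT_DRIVER;
	attr_ex.send_cq = cq;
	attr_ex.recv_cq = cq;
	attr_ex.pd = pd;
	attr_ex.comp_mask = IBV_QP_INIT_ATTR_PD;
	attr_ex.cap.max_send_wr = 64;	/* placeholder sizing */
	attr_ex.cap.max_send_sge = 1;

	attr_dv.comp_mask = MLX5DV_QP_INIT_ATTR_MASK_DC;
	attr_dv.dc_init_attr.dc_type = MLX5DV_DCTYPE_DCI;

	return mlx5dv_create_qp(ctx, &attr_ex, &attr_dv);
}
```

A DCT would instead set *dc_type* to MLX5DV_DCTYPE_DCT and supply *dct_access_key*.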
# SEE ALSO

**ibv_query_device_ex**(3), **ibv_create_qp_ex**(3), **mlx5dv_query_device**(3)

# AUTHOR

Yonatan Cohen
rdma-core-56.1/providers/mlx5/man/mlx5dv_create_steering_anchor.3.md000066400000000000000000000037631477342711600254760ustar00rootroot00000000000000---
layout: page
title: mlx5dv_create_steering_anchor / mlx5dv_destroy_steering_anchor
section: 3
tagline: Verbs
---

# NAME

mlx5dv_create_steering_anchor - Creates a steering anchor

mlx5dv_destroy_steering_anchor - Destroys a steering anchor

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_steering_anchor *
mlx5dv_create_steering_anchor(struct ibv_context *context,
			      struct mlx5dv_steering_anchor_attr *attr);

int mlx5dv_destroy_steering_anchor(struct mlx5dv_steering_anchor *sa);
```

# DESCRIPTION

A user can take packets into a user-configured sandbox and do packet processing, at the end of which a steering pipeline decision is made on what to do with the packet. A steering anchor allows the user to reinject the packet back into the kernel for additional processing.

**mlx5dv_create_steering_anchor()** creates an anchor which will allow injecting the packet back into the kernel steering pipeline.

**mlx5dv_destroy_steering_anchor()** destroys a steering anchor.

# ARGUMENTS

## context

The device context to associate the steering anchor with.

## attr

Anchor attributes specify the priority and flow table type to which the anchor will point.

```c
struct mlx5dv_steering_anchor_attr {
	enum mlx5dv_flow_table_type ft_type;
	uint16_t priority;
	uint64_t comp_mask;
};
```

*ft_type*
:   The flow table type to which the anchor will point.

*priority*
:   The priority inside *ft_type* to which the created anchor will point.

*comp_mask*
:   Reserved for future extension, must be 0 now.

## mlx5dv_steering_anchor

```c
struct mlx5dv_steering_anchor {
	uint32_t id;
};
```

*id*
:   The flow table ID to use as the destination when creating the flow table entry.

# RETURN VALUE

**mlx5dv_create_steering_anchor()** returns a pointer to a new *mlx5dv_steering_anchor* on success. On error NULL is returned and errno is set.

**mlx5dv_destroy_steering_anchor()** returns 0 on success and errno value on error.

# AUTHORS

Mark Bloch
rdma-core-56.1/providers/mlx5/man/mlx5dv_crypto_login.3.md000066400000000000000000000074051477342711600235070ustar00rootroot00000000000000---
layout: page
title: mlx5dv_crypto_login / mlx5dv_crypto_login_query_state / mlx5dv_crypto_logout
section: 3
tagline: Verbs
---

# NAME

mlx5dv_crypto_login - Creates a crypto login session

mlx5dv_crypto_login_query_state - Queries the state of the current crypto login session

mlx5dv_crypto_logout - Logs out from the current crypto login session

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_crypto_login(struct ibv_context *context,
			struct mlx5dv_crypto_login_attr *login_attr);

int mlx5dv_crypto_login_query_state(struct ibv_context *context,
				    enum mlx5dv_crypto_login_state *state);

int mlx5dv_crypto_logout(struct ibv_context *context);
```

# DESCRIPTION

When using a crypto engine that is in the wrapped import method, an active crypto login session must be present in order to create and query Data Encryption Keys (DEKs).

**mlx5dv_crypto_login()** creates a crypto login session with the credential given in *login_attr* and associates it with *context*. Only one active crypto login session can be associated per device context.

**mlx5dv_crypto_login_query_state()** queries the state of the crypto login session associated with *context* and returns the state in *state*, which indicates whether it is valid, invalid or doesn't exist.
A valid crypto login session can become invalid if the credential or the import KEK used in the crypto login session were deleted during the login session (for example by a crypto officer). In this case, **mlx5dv_crypto_logout()** should be called to destroy the current invalid crypto login session and, if still necessary, **mlx5dv_crypto_login()** should be called to create a new crypto login session with a valid credential and import KEK.

**mlx5dv_crypto_logout()** logs out from the current crypto login session associated with *context*.

Existing DEKs that were previously loaded to the device during a crypto login session don't need an active crypto login session in order to be used (in MKey or during traffic).

# ARGUMENTS

## context

The device context to associate the crypto login session with.

## login_attr

Crypto login attributes specify the credential to log in with and the import KEK to be used for secured communications during the crypto login session.

```c
struct mlx5dv_crypto_login_attr {
	uint32_t credential_id;
	uint32_t import_kek_id;
	char credential[48];
	uint64_t comp_mask;
};
```

*credential_id*
:   An ID of a credential, from the credentials stored on the device, that
    indicates the credential that should be validated against the credential
    provided in *credential*.

*import_kek_id*
:   An ID of an import KEK, from the import KEKs stored on the device, that
    indicates the import KEK that will be used for unwrapping the credential
    provided in *credential* and also for all other secured communications
    during the crypto login session.

*credential*
:   The credential to log in with. Must be provided wrapped by the AES key wrap
    algorithm using the import KEK indicated by *import_kek_id*.

*comp_mask*
:   Reserved for future extension, must be 0 now.

## state

Indicates the state of the current crypto login session. Can be one of MLX5DV_CRYPTO_LOGIN_STATE_VALID, MLX5DV_CRYPTO_LOGIN_STATE_NO_LOGIN and MLX5DV_CRYPTO_LOGIN_STATE_INVALID.

# RETURN VALUE

**mlx5dv_crypto_login()** returns 0 on success and errno value on error.

**mlx5dv_crypto_login_query_state()** returns 0 on success and updates *state* with the queried state. On error, errno value is returned.

**mlx5dv_crypto_logout()** returns 0 on success and errno value on error.

# ERRORS

EEXIST
:   A crypto login session already exists.

EINVAL
:   Invalid attributes were provided, or one or more of *credential*,
    *credential_id* and *import_kek_id* are invalid.

ENOENT
:   No crypto login session exists.

# AUTHORS

Avihai Horon
rdma-core-56.1/providers/mlx5/man/mlx5dv_crypto_login_create.3.md000066400000000000000000000114301477342711600250230ustar00rootroot00000000000000---
layout: page
title: mlx5dv_crypto_login_create / mlx5dv_crypto_login_query / mlx5dv_crypto_login_destroy
section: 3
tagline: Verbs
---

# NAME

mlx5dv_crypto_login_create - Creates a crypto login object

mlx5dv_crypto_login_query - Queries the given crypto login object

mlx5dv_crypto_login_destroy - Destroys the given crypto login object

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_crypto_login_obj *
mlx5dv_crypto_login_create(struct ibv_context *context,
			   struct mlx5dv_crypto_login_attr_ex *login_attr);

int mlx5dv_crypto_login_query(struct mlx5dv_crypto_login_obj *crypto_login,
			      struct mlx5dv_crypto_login_query_attr *query_attr);

int mlx5dv_crypto_login_destroy(struct mlx5dv_crypto_login_obj *crypto_login);
```

# DESCRIPTION

When using a crypto engine that is in the wrapped import method, a valid crypto login object must be provided in order to create and query wrapped Data Encryption Keys (DEKs).
A valid crypto login object is necessary only to create and query wrapped DEKs. Existing DEKs that were previously created don't need a valid crypto login object in order to be used (in MKey or during traffic).

**mlx5dv_crypto_login_create()** creates and returns a crypto login object with the credential given in *login_attr*. Only one crypto login object can be created per device context. The created crypto login object must be provided to **mlx5dv_dek_create()** in order to create wrapped DEKs.

**mlx5dv_crypto_login_query()** queries the crypto login object *crypto_login* and returns the queried attributes in *query_attr*.

**mlx5dv_crypto_login_destroy()** destroys the given crypto login object.

# ARGUMENTS

## context

The device context that will be associated with the crypto login object.

## login_attr

Crypto extended login attributes specify the credential to log in with and the import KEK to be used for secured communications done with the crypto login object.

```c
struct mlx5dv_crypto_login_attr_ex {
	uint32_t credential_id;
	uint32_t import_kek_id;
	const void *credential;
	size_t credential_len;
	uint64_t comp_mask;
};
```

*credential_id*
:   An ID of a credential, from the credentials stored on the device, that
    indicates the credential that should be validated against the credential
    provided in *credential*.

*import_kek_id*
:   An ID of an import KEK, from the import KEKs stored on the device, that
    indicates the import KEK that will be used for unwrapping the credential
    provided in *credential* and also for all other secured communications done
    with the crypto login object.

*credential*
:   The credential to log in with. Credential is a piece of data used to
    authenticate the user for crypto login. The credential in *credential* is
    validated against the credential indicated by *credential_id*, which is
    stored on the device. The credentials must match in order for the crypto
    login to succeed. *credential* must be provided wrapped by the AES key wrap
    algorithm using the import KEK indicated by *import_kek_id*. The
    *credential* format is ENC(iv_64b + plaintext_credential), where ENC() is
    the AES key wrap algorithm and iv_64b is 0xA6A6A6A6A6A6A6A6 as per the NIST
    SP 800-38F AES key wrap spec, and plaintext_credential is the credential
    value stored on the device.

*credential_len*
:   The length of the provided *credential* value in bytes.

*comp_mask*
:   Reserved for future extension, must be 0 now.

## query_attr

Crypto login attributes to be populated when querying a crypto login object.

```c
struct mlx5dv_crypto_login_query_attr {
	enum mlx5dv_crypto_login_state state;
	uint64_t comp_mask;
};
```

*state*
:   The state of the crypto login object, can be one of the following:

    **MLX5DV_CRYPTO_LOGIN_STATE_VALID**
    :   The crypto login object is valid and can be used.

    **MLX5DV_CRYPTO_LOGIN_STATE_INVALID**
    :   The crypto login object is invalid and cannot be used. A valid crypto
        login object can become invalid if the credential or the import KEK
        used in the crypto login object were deleted while in use (for example
        by a crypto officer). In this case, **mlx5dv_crypto_login_destroy()**
        should be called to destroy the invalid crypto login object and, if
        still necessary, **mlx5dv_crypto_login_create()** should be called to
        create a new crypto login object with a valid credential and import
        KEK.

*comp_mask*
:   Reserved for future extension, must be 0 now.

# RETURN VALUE

**mlx5dv_crypto_login_create()** returns a pointer to a new valid *struct mlx5dv_crypto_login_obj* on success. On error NULL is returned and errno is set.
**mlx5dv_crypto_login_query()** returns 0 on success and fills *query_attr* with the queried attributes. On error, errno is returned.

**mlx5dv_crypto_login_destroy()** returns 0 on success and errno on error.

# SEE ALSO

**mlx5dv_dek_create**(3), **mlx5dv_query_device**(3)

# AUTHORS

Avihai Horon
rdma-core-56.1/providers/mlx5/man/mlx5dv_dci_stream_id_reset.3.md000066400000000000000000000027711477342711600247660ustar00rootroot00000000000000---
layout: page
title: mlx5dv_dci_stream_id_reset
section: 3
tagline: Verbs
---

# NAME

mlx5dv_dci_stream_id_reset - Reset the stream_id of a given DCI QP

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_dci_stream_id_reset(struct ibv_qp *qp, uint16_t stream_id);
```

# DESCRIPTION

Used by SW to reset an errored *stream_id* in the HW DCI context.

On a work completion with error, the application should call ibv_query_qp() to check if the QP was moved to an error state, or whether it is still operational (in RTS state), which means that only the specific *stream_id* that caused the completion with error is in an error state.

Errors which are stream related will cause only that *stream_id*'s work requests to be flushed, as they are handled in order in the send queue. Once all of the *stream_id*'s WRs are flushed, the application should reset the errored *stream_id* by calling mlx5dv_dci_stream_id_reset(). Work requests for other *stream_id*s will continue to be processed by the QP.

The DCI QP will move to an error state and stop operating once the number of unique *stream_id*s in error reaches the DCI QP's 'log_num_errored' streams defined by SW.

The application should use the 'wr_id' in the ibv_wc to find the *stream_id* from its private context.

# ARGUMENTS

*qp*
:   The ibv_qp object to issue the action on.

*stream_id*
:   The DCI stream channel id that needs to be reset.

# RETURN VALUE

Returns 0 on success, or the value of errno on failure (which indicates the failure reason).

# AUTHOR

Lior Nahmanson
rdma-core-56.1/providers/mlx5/man/mlx5dv_dek_create.3.md000066400000000000000000000213761477342711600230640ustar00rootroot00000000000000---
layout: page
title: mlx5dv_dek_create / mlx5dv_dek_query / mlx5dv_dek_destroy
section: 3
tagline: Verbs
---

# NAME

mlx5dv_dek_create - Creates a DEK

mlx5dv_dek_query - Queries a DEK's attributes

mlx5dv_dek_destroy - Destroys a DEK

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_dek *mlx5dv_dek_create(struct ibv_context *context,
				     struct mlx5dv_dek_init_attr *init_attr);

int mlx5dv_dek_query(struct mlx5dv_dek *dek, struct mlx5dv_dek_attr *attr);

int mlx5dv_dek_destroy(struct mlx5dv_dek *dek);
```

# DESCRIPTION

Data Encryption Keys (DEKs) are used to encrypt and decrypt transmitted data. After a DEK is created, it can be configured in MKeys for crypto offload operations. DEKs are not persistent and are destroyed upon process exit. Therefore, a software process needs to re-create all needed DEKs on startup.

**mlx5dv_dek_create()** creates a new DEK with the attributes specified in *init_attr*. A pointer to the newly created DEK is returned, which can be used for DEK query, DEK destruction and when configuring a MKey for crypto offload operations. The DEK can be either wrapped or in plaintext, and the format that should be used is determined by the specified crypto_login object.

To create a wrapped DEK, the application must have a valid crypto login object prior to creating the DEK. Creating a wrapped DEK can be performed in two ways:

1. Call **mlx5dv_crypto_login_create()** to obtain a crypto login object.
   Indicate that the DEK is wrapped by setting the
   **MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN** value in *comp_mask* and passing the
   crypto login object in the *crypto_login* field of *init_attr*. Fill the
   other DEK attributes and create the DEK.

2. Call **mlx5dv_crypto_login()** (the older API), supplying the credential
   and import KEK to create a crypto login session. Then fill the DEK
   attributes and create the DEK.

To create a plaintext DEK, the application must indicate that the DEK is in plaintext by setting the **MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN** value in *comp_mask* and passing a NULL value in the *crypto_login* field of *init_attr*, fill the other DEK attributes and create the DEK.

To use the created DEK (either wrapped or plaintext) in a MKey, a valid crypto login object or session is not needed. Revoking the import KEK or credential that were used for the crypto login object or session (and therefore rendering the crypto login invalid) does not prevent using a created DEK.

**mlx5dv_dek_query()** queries the DEK specified by *dek* and returns the queried attributes in *attr*. A valid crypto login object or session is not required to query a plaintext DEK. On the other hand, to query a wrapped DEK a valid crypto login object or session must be present.

**mlx5dv_dek_destroy()** destroys the DEK specified by *dek*.

# ARGUMENTS

## context

The device context to create the DEK with.

## init_attr

```c
enum mlx5dv_dek_init_attr_mask {
	MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN = 1 << 0,
};

struct mlx5dv_dek_init_attr {
	enum mlx5dv_crypto_key_size key_size;
	bool has_keytag;
	enum mlx5dv_crypto_key_purpose key_purpose;
	struct ibv_pd *pd;
	char opaque[8];
	char key[128];
	uint64_t comp_mask; /* Use enum mlx5dv_dek_init_attr_mask */
	struct mlx5dv_crypto_login_obj *crypto_login;
};
```

*key_size*
:   The size of the key, can be one of the following:

    **MLX5DV_CRYPTO_KEY_SIZE_128**
    :   Key size is 128 bit.

    **MLX5DV_CRYPTO_KEY_SIZE_256**
    :   Key size is 256 bit.

*has_keytag*
:   Whether the DEK has a keytag or not. If set, the key should include an
    8-byte keytag. Keytag is used to verify that the DEK being used by a MKey
    is the expected DEK. This is done by comparing the keytag that was defined
    during DEK creation with the keytag provided in the MKey crypto
    configuration, and failing the operation if they are different.

*key_purpose*
:   The purpose of the key, currently can only be the following value:

    **MLX5DV_CRYPTO_KEY_PURPOSE_AES_XTS**
    :   The key will be used for the AES-XTS crypto engine.

*pd*
:   The protection domain to be associated with the DEK.

*opaque*
:   Plaintext metadata to describe the key.

*key*
:   The key that will be used for encryption and decryption of transmitted
    data. For a plaintext DEK, *key* must be provided in plaintext. For a
    wrapped DEK, *key* must be provided wrapped by the import KEK that was
    specified in the crypto login. The actual size and layout of this field
    depend on the provided *key_size* and *has_keytag* fields, as well as on
    the format of the key (plaintext or wrapped). *key* should be constructed
    according to the following table.
| Import Method | Has Keytag | Key size | Key Layout                                       |
| ------------- | ---------- | -------- | ------------------------------------------------ |
| Plaintext     | No         | 128 Bit  | key1_128b + key2_128b                            |
| Plaintext     | No         | 256 Bit  | key1_256b + key2_256b                            |
| Plaintext     | Yes        | 128 Bit  | key1_128b + key2_128b + keytag_64b               |
| Plaintext     | Yes        | 256 Bit  | key1_256b + key2_256b + keytag_64b               |
| Wrapped       | No         | 128 Bit  | ENC(iv_64b + key1_128b + key2_128b)              |
| Wrapped       | No         | 256 Bit  | ENC(iv_64b + key1_256b + key2_256b)              |
| Wrapped       | Yes        | 128 Bit  | ENC(iv_64b + key1_128b + key2_128b + keytag_64b) |
| Wrapped       | Yes        | 256 Bit  | ENC(iv_64b + key1_256b + key2_256b + keytag_64b) |

Table: DEK *key* Field Construction.

Where ENC() is the AES key wrap algorithm and iv_64b is 0xA6A6A6A6A6A6A6A6 as per the NIST SP 800-38F AES key wrap spec.

The following example shows how to wrap a 128 bit key that has a keytag, using a 128 bit import KEK in OpenSSL:

```c
#include <openssl/evp.h>

unsigned char import_kek[16]; /* 128 bit import KEK in plaintext for wrapping */
unsigned char iv[8] = {0xA6, 0xA6, 0xA6, 0xA6, 0xA6, 0xA6, 0xA6, 0xA6};
/*
 * Indexes 0-15 are key1 in plaintext, indexes 16-31 are key2 in plaintext,
 * and indexes 32-39 are key_tag in plaintext.
 */
unsigned char key[40];
unsigned char wrapped_key[48];
EVP_CIPHER_CTX *ctx;
int len;

ctx = EVP_CIPHER_CTX_new();
EVP_CIPHER_CTX_set_flags(ctx, EVP_CIPHER_CTX_FLAG_WRAP_ALLOW);
EVP_EncryptInit_ex(ctx, EVP_aes_128_wrap(), NULL, import_kek, iv);
EVP_EncryptUpdate(ctx, wrapped_key, &len, key, sizeof(key));
EVP_EncryptFinal_ex(ctx, wrapped_key + len, &len);
EVP_CIPHER_CTX_free(ctx);
```

*comp_mask*
:   Currently can be the following value:
    **MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN**, which indicates that the
    *crypto_login* field is applicable.

*crypto_login*
:   Pointer to a crypto login object. If set to a valid crypto login object,
    indicates that this is a wrapped DEK that will be created using the given
    crypto login object. If set to NULL, indicates that this is a plaintext
    DEK. Must be NULL if **MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN** is not set.
    Only relevant when comp_mask is set with
    *MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN*.

## dek

Pointer to an existing DEK to query or to destroy.

## attr

DEK attributes to be populated when querying a DEK.

```c
struct mlx5dv_dek_attr {
	enum mlx5dv_dek_state state;
	char opaque[8];
	uint64_t comp_mask;
};
```

*state*
:   The state of the DEK, can be one of the following:

    **MLX5DV_DEK_STATE_READY**
    :   The key is ready for use. This is the state of the key when it is
        first created.

    **MLX5DV_DEK_STATE_ERROR**
    :   The key is unusable. The key needs to be destroyed and re-created in
        order to be used. This can happen, for example, due to DEK memory
        corruption.

*opaque*
:   Plaintext metadata to describe the key.

*comp_mask*
:   Reserved for future extension, must be 0 now.

# RETURN VALUE

**mlx5dv_dek_create()** returns a pointer to a new *struct mlx5dv_dek* on success. On error NULL is returned and errno is set.

**mlx5dv_dek_query()** returns 0 on success and updates *attr* with the queried DEK attributes. On error, errno value is returned.

**mlx5dv_dek_destroy()** returns 0 on success and errno value on error.
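# EXAMPLE

A minimal sketch (not from the original page) of creating a plaintext (non-wrapped) 128-bit AES-XTS DEK, following the plaintext rules described above; *ctx* and *pd* are assumed to pre-exist and the key bytes are placeholders:

```c
#include <string.h>
#include <infiniband/mlx5dv.h>

static struct mlx5dv_dek *create_plaintext_dek(struct ibv_context *ctx,
					       struct ibv_pd *pd,
					       const char key[32])
{
	struct mlx5dv_dek_init_attr init_attr = {};

	init_attr.key_size = MLX5DV_CRYPTO_KEY_SIZE_128;
	init_attr.has_keytag = false;
	init_attr.key_purpose = MLX5DV_CRYPTO_KEY_PURPOSE_AES_XTS;
	init_attr.pd = pd;
	memcpy(init_attr.key, key, 32); /* key1_128b + key2_128b, plaintext */
	/* Setting the flag with a NULL crypto_login marks a plaintext DEK. */
	init_attr.comp_mask = MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN;
	init_attr.crypto_login = NULL;

	return mlx5dv_dek_create(ctx, &init_attr);
}
```

A wrapped DEK would instead pass a valid *crypto_login* object and provide *key* wrapped per the table above.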
# SEE ALSO

**mlx5dv_crypto_login**(3), **mlx5dv_crypto_login_create**(3), **mlx5dv_query_device**(3)

# AUTHORS

Avihai Horon
rdma-core-56.1/providers/mlx5/man/mlx5dv_devx_alloc_msi_vector.3.md000066400000000000000000000030351477342711600253440ustar00rootroot00000000000000---
date: 2022-01-12
footer: mlx5
header: "mlx5 Programmer's Manual"
tagline: Verbs
layout: page
license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
section: 3
title: mlx5dv_devx_alloc_msi_vector
---

# NAME

mlx5dv_devx_alloc_msi_vector - Allocate an msi vector to be used for creating an EQ.

mlx5dv_devx_free_msi_vector - Release an msi vector.

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_devx_msi_vector *
mlx5dv_devx_alloc_msi_vector(struct ibv_context *ibctx);

int mlx5dv_devx_free_msi_vector(struct mlx5dv_devx_msi_vector *msi);
```

# DESCRIPTION

Allocate or free an msi vector to be used for creating an EQ.

The allocate API exposes an mlx5dv_devx_msi_vector object, which includes an msi vector and an fd. The vector can be used as the "eqc.intr" field when creating an EQ, while the fd (created as non-blocking) can be polled to detect when there is data on that EQ.

# ARGUMENTS

*ibctx*
:   RDMA device context to create the action on.

*msi*
:   The msi vector object to work on.

## msi_vector

```c
struct mlx5dv_devx_msi_vector {
	int vector;
	int fd;
};
```

*vector*
:   The vector to be used when creating the EQ over the device specification.

*fd*
:   The FD that will be used for polling.

# RETURN VALUE

Upon success *mlx5dv_devx_alloc_msi_vector* will return a new *struct mlx5dv_devx_msi_vector*; on error NULL will be returned and errno will be set.

Upon success *mlx5dv_devx_free_msi_vector* will return 0; on error, errno will be returned.

# AUTHOR

Mark Zhang
rdma-core-56.1/providers/mlx5/man/mlx5dv_devx_alloc_uar.3.md000066400000000000000000000032171477342711600237630ustar00rootroot00000000000000---
layout: page
title: mlx5dv_devx_alloc_uar / mlx5dv_devx_free_uar
section: 3
tagline: Verbs
---

# NAME

mlx5dv_devx_alloc_uar - Allocates a DEVX UAR

mlx5dv_devx_free_uar - Frees a DEVX UAR

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_devx_uar *mlx5dv_devx_alloc_uar(struct ibv_context *context,
					      uint32_t flags);

void mlx5dv_devx_free_uar(struct mlx5dv_devx_uar *devx_uar);
```

# DESCRIPTION

Create / free a DEVX UAR which is needed for other device commands over the DEVX interface.

The DEVX API enables direct access from the user space area to the mlx5 device
driver; the UAR information is needed for a few commands, such as QP creation.

# ARGUMENTS

*context*
:   RDMA device context to work on.

*flags*
:   Allocation flags for the UAR.

    MLX5DV_UAR_ALLOC_TYPE_BF:
    Allocate UAR with Blueflame properties.

    MLX5DV_UAR_ALLOC_TYPE_NC:
    Allocate UAR with non-cache properties.

    MLX5DV_UAR_ALLOC_TYPE_NC_DEDICATED:
    Allocate a dedicated UAR with non-cache properties.

## devx_uar

```c
struct mlx5dv_devx_uar {
	void *reg_addr;
	void *base_addr;
	uint32_t page_id;
	off_t mmap_off;
	uint64_t comp_mask;
};
```

*reg_addr*
:   The write address of DB/BF.

*base_addr*
:   The base address of the UAR.

*page_id*
:   The device page id to be used.

*mmap_off*
:   The mmap offset parameter to be used for re-mapping, to be used by a secondary process.

# RETURN VALUE

Upon success *mlx5dv_devx_alloc_uar* will return a new *struct mlx5dv_devx_uar*, on error NULL will be returned and errno will be set.
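# EXAMPLE

A minimal allocation/release sketch (not from the original page), following the description above:

```c
#include <infiniband/mlx5dv.h>

static int uar_demo(struct ibv_context *ctx)
{
	struct mlx5dv_devx_uar *uar;

	/* Allocate a UAR with non-cache properties. */
	uar = mlx5dv_devx_alloc_uar(ctx, MLX5DV_UAR_ALLOC_TYPE_NC);
	if (!uar)
		return -1;

	/* uar->page_id is what DEVX object creation commands (e.g. a QP)
	 * expect; uar->reg_addr is the doorbell write address. */

	mlx5dv_devx_free_uar(uar);
	return 0;
}
```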
# SEE ALSO

**mlx5dv_open_device**, **mlx5dv_devx_obj_create**

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_devx_create_cmd_comp.3.md000066400000000000000000000042031477342711600251220ustar00rootroot00000000000000---
layout: page
title: mlx5dv_devx_create_cmd_comp, mlx5dv_devx_destroy_cmd_comp, get_async
section: 3
tagline: Verbs
---

# NAME

mlx5dv_devx_create_cmd_comp - Create a command completion to be used for DEVX asynchronous commands.

mlx5dv_devx_destroy_cmd_comp - Destroy a devx command completion.

mlx5dv_devx_get_async_cmd_comp - Get an asynchronous command completion.

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_devx_cmd_comp {
	int fd;
};

struct mlx5dv_devx_cmd_comp *
mlx5dv_devx_create_cmd_comp(struct ibv_context *context)

void mlx5dv_devx_destroy_cmd_comp(struct mlx5dv_devx_cmd_comp *cmd_comp)

struct mlx5dv_devx_async_cmd_hdr {
	uint64_t wr_id;
	uint8_t out_data[];
};

int mlx5dv_devx_get_async_cmd_comp(struct mlx5dv_devx_cmd_comp *cmd_comp,
				   struct mlx5dv_devx_async_cmd_hdr *cmd_resp,
				   size_t cmd_resp_len)
```

# DESCRIPTION

Create or destroy a command completion to be used for DEVX asynchronous commands.

The create verb exposes an mlx5dv_devx_cmd_comp object that can be used as part of asynchronous DEVX commands. This lets an application run commands asynchronously without blocking, and read the response from this object once it is ready.

The response can be read by the mlx5dv_devx_get_async_cmd_comp() API; upon response, the *wr_id* that was supplied with the asynchronous command is returned and *out_data* includes the data itself. The application must supply a large enough buffer to match any command that was issued on the *cmd_comp*; its size is given by the input *cmd_resp_len* parameter.

# ARGUMENTS

*context*
:   RDMA device context to create the action on.

*cmd_comp*
:   The command completion object.

*cmd_resp*
:   The output data from the asynchronous command.

*cmd_resp_len*
:   The output buffer size to hold the response.

# RETURN VALUE

Upon success *mlx5dv_devx_create_cmd_comp* will return a new *struct mlx5dv_devx_cmd_comp* object, on error NULL will be returned and errno will be set.

Upon success *mlx5dv_devx_get_async_cmd_comp* will return 0, otherwise errno will be returned.

# SEE ALSO

*mlx5dv_open_device(3)*, *mlx5dv_devx_obj_create(3)*

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_devx_create_eq.3.md000066400000000000000000000037521477342711600237520ustar00rootroot00000000000000---
date: 2022-01-12
footer: mlx5
header: "mlx5 Programmer's Manual"
tagline: Verbs
layout: page
license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
section: 3
title: mlx5dv_devx_create_eq
---

# NAME

mlx5dv_devx_create_eq - Create an EQ object

mlx5dv_devx_destroy_eq - Destroy an EQ object

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_devx_eq *
mlx5dv_devx_create_eq(struct ibv_context *ibctx, const void *in, size_t inlen,
		      void *out, size_t outlen);

int mlx5dv_devx_destroy_eq(struct mlx5dv_devx_eq *eq);
```

# DESCRIPTION

Create / destroy an EQ object. Upon creation, the caller prepares the in/out mailboxes based on the device specification format. For the input mailbox, the caller needs to prepare all fields except "eqc.log_page_size" and the pas list, which will be set by the driver. The "eqc.intr" field should be taken from the output of mlx5dv_devx_alloc_msi_vector().

# ARGUMENTS

*ibctx*
:   RDMA device context to create the action on.

*in*
:   A buffer which contains the command's input data provided in a device specification format.
*inlen*
:   The size of the *in* buffer in bytes.

*out*
:   A buffer which contains the command's output data according to the device specification format.

*outlen*
:   The size of the *out* buffer in bytes.

*eq*
:   The EQ object to work on.

```c
struct mlx5dv_devx_eq {
	void *vaddr;
};
```

*vaddr*
:   The EQ VA that was allocated by the driver.

# NOTES

mlx5dv_devx_query_eqn() will not support vectors which are used by mlx5dv_devx_create_eq().

# RETURN VALUE

Upon success *mlx5dv_devx_create_eq* will return a new *struct mlx5dv_devx_eq*; on error NULL will be returned and errno will be set.

Upon success *mlx5dv_devx_destroy_eq* will return 0; on error, errno will be returned.

If the error value is EREMOTEIO, outbox.status and outbox.syndrome will contain the command failure details.

# SEE ALSO

*mlx5dv_devx_alloc_msi_vector(3)*, *mlx5dv_devx_query_eqn(3)*

# AUTHOR

Mark Zhang
rdma-core-56.1/providers/mlx5/man/mlx5dv_devx_create_event_channel.3.md000066400000000000000000000026541477342711600261600ustar00rootroot00000000000000---
layout: page
title: mlx5dv_devx_create_event_channel, mlx5dv_devx_destroy_event_channel
section: 3
tagline: Verbs
---

# NAME

mlx5dv_devx_create_event_channel - Create an event channel to be used for DEVX asynchronous events.

mlx5dv_devx_destroy_event_channel - Destroy a DEVX event channel.

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_devx_event_channel {
	int fd;
};

struct mlx5dv_devx_event_channel *
mlx5dv_devx_create_event_channel(struct ibv_context *context,
				 enum mlx5dv_devx_create_event_channel_flags flags)

void mlx5dv_devx_destroy_event_channel(struct mlx5dv_devx_event_channel *event_channel)
```

# DESCRIPTION

Create or destroy a channel to be used for DEVX asynchronous events.

The create verb exposes an mlx5dv_devx_event_channel object that can be used to read asynchronous DEVX events. This lets an application subscribe to device events and read them from this object once they occur.

# ARGUMENTS

*context*
:   RDMA device context to create the channel on.

*flags*
:   MLX5DV_DEVX_CREATE_EVENT_CHANNEL_FLAGS_OMIT_EV_DATA:
    omit the event data on this channel.

# RETURN VALUE

Upon success *mlx5dv_devx_create_event_channel* will return a new *struct mlx5dv_devx_event_channel* object, on error NULL will be returned and errno will be set.

# SEE ALSO

*mlx5dv_open_device(3)*, *mlx5dv_devx_obj_create(3)*

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_devx_get_event.3.md000066400000000000000000000040011477342711600237760ustar00rootroot00000000000000---
layout: page
title: mlx5dv_devx_get_event
section: 3
tagline: Verbs
---

# NAME

mlx5dv_devx_get_event - Get an asynchronous event.

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_devx_async_event_hdr {
	uint64_t cookie;
	uint8_t out_data[];
};

ssize_t mlx5dv_devx_get_event(struct mlx5dv_devx_event_channel *event_channel,
			      struct mlx5dv_devx_async_event_hdr *event_data,
			      size_t event_resp_len)
```

# DESCRIPTION

Get a device event on the given *event_channel*. After a successful subscription over the event channel, made by calling mlx5dv_devx_subscribe_devx_event(), the application should use this API to read the event once it has occurred. Upon response, the *cookie* that was supplied with the subscription is returned and *out_data* includes the data itself. *out_data* may be omitted in case the channel was created with the omit data flag.

The application must supply a large enough buffer to hold the event according to the device specification; the buffer size is given by the input *event_resp_len* parameter.
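For illustration, a minimal read sketch (not from the original page); the channel *ch* is assumed to come from **mlx5dv_devx_create_event_channel()** and the response size used here is an assumption that must be large enough for the subscribed events per the device specification:

```c
#include <stdlib.h>
#include <infiniband/mlx5dv.h>

#define EVENT_RESP_SZ 64 /* assumed upper bound for the subscribed events */

static int read_one_event(struct mlx5dv_devx_event_channel *ch)
{
	struct mlx5dv_devx_async_event_hdr *hdr;
	ssize_t n;

	hdr = calloc(1, sizeof(*hdr) + EVENT_RESP_SZ);
	if (!hdr)
		return -1;

	/* Blocks until an event arrives (the channel fd may instead be
	 * polled to avoid blocking). */
	n = mlx5dv_devx_get_event(ch, hdr, sizeof(*hdr) + EVENT_RESP_SZ);
	if (n >= 0) {
		/* hdr->cookie identifies the subscription; hdr->out_data
		 * holds the raw event when the channel delivers data. */
	}

	free(hdr);
	return n < 0 ? -1 : 0;
}
```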
# ARGUMENTS

*event_channel*
:   The channel to get the event over.

*event_data*
:   The output data from the asynchronous event.

*event_resp_len*
:   The output buffer size to hold the response.

# RETURN VALUE

Upon success *mlx5dv_devx_get_event* will return the number of bytes read, otherwise -1 will be returned and errno will be set.

# NOTES

In case the *event_channel* was created with the omit data flag, events having the same type may be combined per subscription and be reported once with the matching *cookie*. In that mode of work, ordering is not preserved between those events and others on this channel.

On the other hand, when each event holds the device data, ordering is preserved; however, events might be lost due to lack of kernel memory, in which case EOVERFLOW will be reported.

# SEE ALSO

*mlx5dv_open_device(3)*, *mlx5dv_devx_subscribe_devx_event(3)*

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_devx_obj_create.3.md000066400000000000000000000113601477342711600241170ustar00rootroot00000000000000---
layout: page
title: mlx5dv_devx_obj_create / destroy / modify / query / general
section: 3
tagline: Verbs
---

# NAME

mlx5dv_devx_obj_create - Creates a devx object

mlx5dv_devx_obj_destroy - Destroys a devx object

mlx5dv_devx_obj_modify - Modifies a devx object

mlx5dv_devx_obj_query - Queries a devx object

mlx5dv_devx_obj_query_async - Queries a devx object in an asynchronous mode

mlx5dv_devx_general_cmd - Issues a general command over the devx interface

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_devx_obj *
mlx5dv_devx_obj_create(struct ibv_context *context, const void *in,
		       size_t inlen, void *out, size_t outlen);

int mlx5dv_devx_obj_query(struct mlx5dv_devx_obj *obj, const void *in,
			  size_t inlen, void *out, size_t outlen);

int mlx5dv_devx_obj_query_async(struct mlx5dv_devx_obj *obj, const void *in,
				size_t inlen, size_t outlen, uint64_t wr_id,
				struct mlx5dv_devx_cmd_comp *cmd_comp);

int mlx5dv_devx_obj_modify(struct mlx5dv_devx_obj *obj, const void *in,
			   size_t inlen, void *out, size_t outlen);

int mlx5dv_devx_obj_destroy(struct mlx5dv_devx_obj *obj);

int mlx5dv_devx_general_cmd(struct ibv_context *context, const void *in,
			    size_t inlen, void *out, size_t outlen);
```

# DESCRIPTION

Create / destroy / modify / query a devx object, or issue a general command over the devx interface.

The DEVX API enables direct access from the user space area to the mlx5 device driver by using the KABI mechanism. The main purpose is to make the user space driver as independent as possible from the kernel, so that future device functionality and commands can be activated with minimal or no kernel changes.

A DEVX object represents an underlying firmware object; the input command to create it is raw data given by the user application, which should match the device specification. Upon successful creation, the output buffer includes the raw data from the device according to its specification; this data can be used as part of related firmware commands to this object.

Once the DEVX object is created, it can be queried/modified/destroyed by the matching mlx5dv_devx_obj_xxx() API. Both the input and the output for those APIs need to match the device specification as well.

The mlx5dv_devx_general_cmd() API enables issuing a general command which is not related to a specific object, such as querying device capabilities.

The mlx5dv_devx_obj_query_async() API is similar to the query object API; however, it runs asynchronously without blocking.
The input includes an mlx5dv_devx_cmd_comp object and an identifier named 'wr_id' for this command. The response should be read upon success with the mlx5dv_devx_get_async_cmd_comp() API. The 'wr_id' that was supplied as an input is returned as part of the response, to let the application know which command the response relates to.

An application can gradually migrate to use DEVX according to its needs; it is not all or nothing. For example, it can create an ibv_cq via the ibv_create_cq() verb and then use the returned cqn to create a DEVX QP object by the mlx5dv_devx_obj_create() API, which needs that cqn. The above example can enable an application to create a QP with some driver-specific attributes that are not exposed in the ibv_create_qp() API; in that case no user or kernel change may be needed at all, as the command input reaches the firmware directly.

The expected users of the DEVX APIs are applications that use the mlx5 DV APIs and are familiar with the device specification in both the control and data path.

To successfully create a DEVX object and work on it, a DEVX context must be created; this is done by the mlx5dv_open_device() API with the *MLX5DV_CONTEXT_FLAGS_DEVX* flag.

# ARGUMENTS

*context*
:	RDMA device context to create the action on.

*in*
:	A buffer which contains the command's input data provided in a device specification format.

*inlen*
:	The size of *in* buffer in bytes.

*out*
:	A buffer which contains the command's output data according to the device specification format.

*outlen*
:	The size of *out* buffer in bytes.

*obj*
:	For query, modify, destroy: the devx object to work on.

*wr_id*
:	The command identifier when working in asynchronous mode.

*cmd_comp*
:	The command completion object to read the response from in asynchronous mode.

# RETURN VALUE

Upon success *mlx5dv_devx_obj_create* will return a new *struct mlx5dv_devx_obj*; on error NULL will be returned and errno will be set.

For the query, modify, destroy and general commands, 0 is returned on success, or the value of errno on a failure.

If the error value is EREMOTEIO, outbox.status and outbox.syndrome will contain the command failure details.
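For illustration, a hedged sketch of issuing a general command follows. The command layout is device-specific (per the device specification); the buffer sizes below and the way *in* would be filled are placeholders only:

```c
#include <errno.h>
#include <stdint.h>
#include <infiniband/mlx5dv.h>

/* Hypothetical sizes; real commands use the layouts defined by the
 * device specification. */
static int issue_general_cmd(struct ibv_context *ctx)
{
	uint8_t in[16] = {};
	uint8_t out[16] = {};

	/* Build a valid device command in `in` here, per the device spec. */

	if (mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)))
		return errno; /* on EREMOTEIO, parse status/syndrome from `out` */

	/* Parse the command output from `out`, per the device spec. */
	return 0;
}
```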
# SEE ALSO

**mlx5dv_open_device**, **mlx5dv_devx_create_cmd_comp**, **mlx5dv_devx_get_async_cmd_comp**

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_devx_qp_modify.3.md000066400000000000000000000066551477342711600240140ustar00rootroot00000000000000---
layout: page
title: mlx5dv_devx_qp[/cq/srq/wq/ind_tbl]_modify / query
section: 3
tagline: Verbs
---

# NAME

mlx5dv_devx_qp_modify - Modifies a verbs QP via DEVX

mlx5dv_devx_qp_query - Queries a verbs QP via DEVX

mlx5dv_devx_cq_modify - Modifies a verbs CQ via DEVX

mlx5dv_devx_cq_query - Queries a verbs CQ via DEVX

mlx5dv_devx_srq_modify - Modifies a verbs SRQ via DEVX

mlx5dv_devx_srq_query - Queries a verbs SRQ via DEVX

mlx5dv_devx_wq_modify - Modifies a verbs WQ via DEVX

mlx5dv_devx_wq_query - Queries a verbs WQ via DEVX

mlx5dv_devx_ind_tbl_modify - Modifies a verbs indirection table via DEVX

mlx5dv_devx_ind_tbl_query - Queries a verbs indirection table via DEVX

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_devx_qp_modify(struct ibv_qp *qp, const void *in, size_t inlen,
			  void *out, size_t outlen);
int mlx5dv_devx_qp_query(struct ibv_qp *qp, const void *in, size_t inlen,
			 void *out, size_t outlen);

int mlx5dv_devx_cq_modify(struct ibv_cq *cq, const void *in, size_t inlen,
			  void *out, size_t outlen);
int mlx5dv_devx_cq_query(struct ibv_cq *cq, const void *in, size_t inlen,
			 void *out, size_t outlen);

int mlx5dv_devx_srq_modify(struct ibv_srq *srq, const void *in, size_t inlen,
			   void *out, size_t outlen);
int mlx5dv_devx_srq_query(struct ibv_srq *srq, const void *in, size_t inlen,
			  void *out, size_t outlen);

int mlx5dv_devx_wq_modify(struct ibv_wq *wq, const void *in, size_t inlen,
			  void *out, size_t outlen);
int mlx5dv_devx_wq_query(struct ibv_wq *wq, const void *in, size_t inlen,
			 void *out, size_t outlen);

int mlx5dv_devx_ind_tbl_modify(struct ibv_rwq_ind_table *ind_tbl,
			       const void *in, size_t inlen,
			       void *out, size_t outlen);
int mlx5dv_devx_ind_tbl_query(struct ibv_rwq_ind_table *ind_tbl,
			      const void *in, size_t inlen,
			      void *out, size_t outlen);
```

# DESCRIPTION

Modify / query a verbs object over the DEVX interface.

The DEVX API enables direct access from the user space area to the mlx5 device driver by using the KABI mechanism. The main purpose is to make the user space driver as independent as possible from the kernel, so that future device functionality and commands can be activated with minimal or no kernel changes.

The above APIs enable modifying/querying a verbs object via the DEVX interface. This enables interoperability between verbs and DEVX. As such, an application can use the create method from verbs (e.g. ibv_create_qp) and modify and query the created object via DEVX (e.g. mlx5dv_devx_qp_modify).

# ARGUMENTS

*qp/cq/wq/srq/ind_tbl*
:	The ibv_xxx object to issue the action on.

*in*
:	A buffer which contains the command's input data provided in a device specification format.

*inlen*
:	The size of *in* buffer in bytes.

*out*
:	A buffer which contains the command's output data according to the device specification format.

*outlen*
:	The size of *out* buffer in bytes.

# RETURN VALUE

Upon success 0 is returned, or the value of errno on a failure.

If the error value is EREMOTEIO, outbox.status and outbox.syndrome will contain the command failure details.
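For illustration, a hedged sketch of querying a verbs-created QP through DEVX follows; the in/out layouts must follow the device specification for the QP query command, and the buffer sizes below are placeholders only:

```c
#include <errno.h>
#include <stdint.h>
#include <infiniband/mlx5dv.h>

static int devx_query_qp(struct ibv_qp *qp)
{
	uint8_t in[32] = {};   /* hypothetical size for the query command */
	uint8_t out[256] = {}; /* hypothetical size for the expected output */

	/* Build the QP query command in `in` here, per the device spec. */

	if (mlx5dv_devx_qp_query(qp, in, sizeof(in), out, sizeof(out)))
		return errno;

	/* Parse the QP context from `out`, per the device spec. */
	return 0;
}
```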
# SEE ALSO

**mlx5dv_open_device**, **mlx5dv_devx_obj_create**

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_devx_query_eqn.3.md000066400000000000000000000017201477342711600240270ustar00rootroot00000000000000---
layout: page
title: mlx5dv_devx_query_eqn
section: 3
tagline: Verbs
---

# NAME

mlx5dv_devx_query_eqn - Query EQN for a given vector id.

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_devx_query_eqn(struct ibv_context *context, uint32_t vector,
			  uint32_t *eqn);
```

# DESCRIPTION

Query the EQN for a given input vector; the EQN is needed for other device commands over the DEVX interface. The DEVX API enables direct access from the user space area to the mlx5 device driver; the EQN information is needed for a few commands, such as CQ creation.

# ARGUMENTS

*context*
:	RDMA device context to work on.

*vector*
:	Completion vector number.

*eqn*
:	The device EQ number which relates to the given input vector.

# RETURN VALUE

Returns 0 on success, or the value of errno on failure (which indicates the failure reason).

# SEE ALSO

**mlx5dv_open_device**, **mlx5dv_devx_obj_create**

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_devx_subscribe_devx_event.3.md000066400000000000000000000031621477342711600262310ustar00rootroot00000000000000---
layout: page
title: mlx5dv_devx_subscribe_devx_event, mlx5dv_devx_subscribe_devx_event_fd
section: 3
tagline: Verbs
---

# NAME

mlx5dv_devx_subscribe_devx_event - Subscribe over an event channel for device events.

mlx5dv_devx_subscribe_devx_event_fd - Subscribe over an event channel for device events to signal eventfd.

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_devx_subscribe_devx_event(struct mlx5dv_devx_event_channel *dv_event_channel,
				     struct mlx5dv_devx_obj *obj,
				     uint16_t events_sz,
				     uint16_t events_num[],
				     uint64_t cookie)

int mlx5dv_devx_subscribe_devx_event_fd(struct mlx5dv_devx_event_channel *dv_event_channel,
					int fd,
					struct mlx5dv_devx_obj *obj,
					uint16_t event_num)
```

# DESCRIPTION

Subscribe over a DEVX event channel for device events.

# ARGUMENTS

*dv_event_channel*
:	Event channel to subscribe over.

*fd*
:	A file descriptor that was previously opened by the eventfd() system call.

*obj*
:	DEVX object that *events_num* relates to; can be NULL for unaffiliated events.

*events_sz*
:	Size of the *events_num* buffer that holds the events to subscribe for.

*events_num*
:	Holds the required event numbers to subscribe for; numbers are according to the device specification.

*cookie*
:	The value to be returned back when reading the event; can be used as an ID for application use.

# NOTES

When mlx5dv_devx_subscribe_devx_event_fd() is used, the *fd* will be signaled once an event has occurred.
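For illustration, a minimal subscription sketch follows; MY_EVENT_A and MY_EVENT_B are hypothetical event numbers that must be taken from the device specification:

```c
#include <stdint.h>
#include <infiniband/mlx5dv.h>

/* Hypothetical event numbers, per the device specification. */
#define MY_EVENT_A 0x1
#define MY_EVENT_B 0x2

static int subscribe_obj_events(struct mlx5dv_devx_event_channel *ch,
				struct mlx5dv_devx_obj *obj, uint64_t cookie)
{
	uint16_t events[] = { MY_EVENT_A, MY_EVENT_B };

	/* events_sz is the byte size of the events buffer; cookie is
	 * returned by mlx5dv_devx_get_event() when the event is read. */
	return mlx5dv_devx_subscribe_devx_event(ch, obj, sizeof(events),
						events, cookie);
}
```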
# SEE ALSO

*mlx5dv_open_device(3)*, *mlx5dv_devx_create_event_channel(3)*, *mlx5dv_devx_get_event(3)*

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_devx_umem_reg.3.md000066400000000000000000000055561477342711600236260ustar00rootroot00000000000000---
layout: page
title: mlx5dv_devx_umem_reg, mlx5dv_devx_umem_dereg
section: 3
tagline: Verbs
---

# NAME

mlx5dv_devx_umem_reg - Register a user memory to be used by the devx interface

mlx5dv_devx_umem_reg_ex - Register a user memory to be used by the devx interface

mlx5dv_devx_umem_dereg - Deregister a devx umem object

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_devx_umem {
	uint32_t umem_id;
};

struct mlx5dv_devx_umem *
mlx5dv_devx_umem_reg(struct ibv_context *context, void *addr, size_t size,
		     uint32_t access)

struct mlx5dv_devx_umem_in {
	void *addr;
	size_t size;
	uint32_t access;
	uint64_t pgsz_bitmap;
	uint64_t comp_mask;
	int dmabuf_fd;
};

struct mlx5dv_devx_umem *
mlx5dv_devx_umem_reg_ex(struct ibv_context *ctx, struct mlx5dv_devx_umem_in *umem_in);

int mlx5dv_devx_umem_dereg(struct mlx5dv_devx_umem *dv_devx_umem)
```

# DESCRIPTION

Register or deregister a user memory to be used by the devx interface.

The register verb exposes a UMEM DEVX object for user memory registration for DMA. The API to register the user memory gets as input the user address, length and access flags, and provides as output an object holding the UMEM ID that the firmware returned for this registered memory. The user can ask for specific page sizes for the given address and length; in that case *mlx5dv_devx_umem_reg_ex()* should be used. If the kernel cannot find a matching page size in the given *umem_in->pgsz_bitmap* bitmap, the API will fail.

The user will use that UMEM ID in device direct commands that use this memory instead of the physical address list, for example upon *mlx5dv_devx_obj_create* to create a QP.

# ARGUMENTS

*context*
:	RDMA device context to create the action on.

*addr*
:	The memory start address to register.

*size*
:	The size of *addr* buffer.

*access*
:	The desired memory protection attributes; it is either 0 or the bitwise OR of one or more of *enum ibv_access_flags*.

*umem_in*
:	A structure that holds the argument bundle.

*pgsz_bitmap*
:	Represents the required page sizes. UMEM creation will fail if it cannot be created with these page sizes.

*comp_mask*
:	Flags indicating the additional fields.

*dmabuf_fd*
:	If MLX5DV_UMEM_MASK_DMABUF is set in *comp_mask* then this value must be an FD of a dmabuf. In this mode the dmabuf is used as the backing memory to create the umem out of. The dmabuf must be pinnable. *addr* is interpreted as the starting offset of the dmabuf.

# RETURN VALUE

Upon success *mlx5dv_devx_umem_reg* / *mlx5dv_devx_umem_reg_ex* will return a new *struct mlx5dv_devx_umem* object; on error NULL will be returned and errno will be set.

*mlx5dv_devx_umem_dereg* returns 0 on success, or the value of errno on failure (which indicates the failure reason).
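For illustration, a minimal registration sketch follows, assuming an already-allocated application buffer:

```c
#include <infiniband/verbs.h>
#include <infiniband/mlx5dv.h>

static struct mlx5dv_devx_umem *reg_buf(struct ibv_context *ctx,
					void *buf, size_t len)
{
	struct mlx5dv_devx_umem *umem;

	umem = mlx5dv_devx_umem_reg(ctx, buf, len, IBV_ACCESS_LOCAL_WRITE);
	if (!umem)
		return NULL; /* errno holds the failure reason */

	/* umem->umem_id can now be placed in device commands (e.g. via
	 * mlx5dv_devx_obj_create()) instead of a physical address list;
	 * release with mlx5dv_devx_umem_dereg() once unused. */
	return umem;
}
```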
# SEE ALSO

*mlx5dv_open_device(3)*, *ibv_reg_mr(3)*, *mlx5dv_devx_obj_create(3)*

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_dm_map_op_addr.3.md000066400000000000000000000014521477342711600237200ustar00rootroot00000000000000---
layout: page
title: mlx5dv_dm_map_op_addr
section: 3
tagline: Verbs
date: 2021-1-21
header: "mlx5 Programmer's Manual"
footer: mlx5
---

# NAME

mlx5dv_dm_map_op_addr - Get the operation address of a device memory (DM)

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

void *mlx5dv_dm_map_op_addr(struct ibv_dm *dm, uint8_t op);
```

# DESCRIPTION

**mlx5dv_dm_map_op_addr()** returns an mmapped address to the device memory for the requested **op**.

# ARGUMENTS

*dm*
:	The associated ibv_dm for this operation.

*op*
:	Indicates the DM operation type, based on the device specification.

# RETURN VALUE

Returns a pointer to the mmapped address; on error NULL will be returned and errno will be set.

# SEE ALSO

**ibv_alloc_dm**(3), **mlx5dv_alloc_dm**(3)

# AUTHOR

Maor Gottlieb
rdma-core-56.1/providers/mlx5/man/mlx5dv_dr_flow.3.md000066400000000000000000000464531477342711600224330ustar00rootroot00000000000000---
date: 2019-03-28
layout: page
title: MLX5DV_DR API
section: 3
license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
header: "mlx5 Programmer's Manual"
footer: mlx5
---

# NAME

mlx5dv_dr_domain_create, mlx5dv_dr_domain_sync, mlx5dv_dr_domain_destroy, mlx5dv_dr_domain_set_reclaim_device_memory, mlx5dv_dr_domain_allow_duplicate_rules - Manage flow domains

mlx5dv_dr_table_create, mlx5dv_dr_table_destroy - Manage flow tables

mlx5dv_dr_matcher_create, mlx5dv_dr_matcher_destroy, mlx5dv_dr_matcher_set_layout - Manage flow matchers

mlx5dv_dr_rule_create, mlx5dv_dr_rule_destroy - Manage flow rules

mlx5dv_dr_action_create_drop - Create drop action

mlx5dv_dr_action_create_default_miss - Create default miss action

mlx5dv_dr_action_create_tag - Create tag actions

mlx5dv_dr_action_create_dest_ibv_qp - Create packet destination QP action

mlx5dv_dr_action_create_dest_table - Create packet destination dr table action

mlx5dv_dr_action_create_dest_root_table - Create packet destination root table action

mlx5dv_dr_action_create_dest_vport - Create packet destination vport action

mlx5dv_dr_action_create_dest_ib_port - Create packet destination IB port action

mlx5dv_dr_action_create_dest_devx_tir - Create packet destination TIR action

mlx5dv_dr_action_create_dest_array - Create destination array action

mlx5dv_dr_action_create_packet_reformat - Create packet reformat actions

mlx5dv_dr_action_create_modify_header - Create modify header actions

mlx5dv_dr_action_create_flow_counter - Create devx flow counter actions

mlx5dv_dr_action_create_aso, mlx5dv_dr_action_modify_aso - Create and modify ASO actions

mlx5dv_dr_action_create_flow_meter, mlx5dv_dr_action_modify_flow_meter - Create and modify meter action

mlx5dv_dr_action_create_flow_sampler - Create flow sampler action

mlx5dv_dr_action_create_pop_vlan - Create pop vlan action

mlx5dv_dr_action_create_push_vlan - Create push vlan action

mlx5dv_dr_action_destroy - Destroy actions

mlx5dv_dr_aso_other_domain_link, mlx5dv_dr_aso_other_domain_unlink - link/unlink ASO devx object to work with different domains

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_dr_domain *mlx5dv_dr_domain_create(
		struct ibv_context *ctx,
		enum mlx5dv_dr_domain_type type);

int mlx5dv_dr_domain_sync(
		struct mlx5dv_dr_domain *domain,
		uint32_t flags);

int mlx5dv_dr_domain_destroy(struct mlx5dv_dr_domain *domain);

void mlx5dv_dr_domain_set_reclaim_device_memory( struct
mlx5dv_dr_domain *dmn, bool enable); void mlx5dv_dr_domain_allow_duplicate_rules(struct mlx5dv_dr_domain *dmn, bool allow); struct mlx5dv_dr_table *mlx5dv_dr_table_create( struct mlx5dv_dr_domain *domain, uint32_t level); int mlx5dv_dr_table_destroy(struct mlx5dv_dr_table *table); struct mlx5dv_dr_matcher *mlx5dv_dr_matcher_create( struct mlx5dv_dr_table *table, uint16_t priority, uint8_t match_criteria_enable, struct mlx5dv_flow_match_parameters *mask); int mlx5dv_dr_matcher_destroy(struct mlx5dv_dr_matcher *matcher); int mlx5dv_dr_matcher_set_layout(struct mlx5dv_dr_matcher *matcher, struct mlx5dv_dr_matcher_layout *matcher_layout); struct mlx5dv_dr_rule *mlx5dv_dr_rule_create( struct mlx5dv_dr_matcher *matcher, struct mlx5dv_flow_match_parameters *value, size_t num_actions, struct mlx5dv_dr_action *actions[]); void mlx5dv_dr_rule_destroy(struct mlx5dv_dr_rule *rule); struct mlx5dv_dr_action *mlx5dv_dr_action_create_drop(void); struct mlx5dv_dr_action *mlx5dv_dr_action_create_default_miss(void); struct mlx5dv_dr_action *mlx5dv_dr_action_create_tag( uint32_t tag_value); struct mlx5dv_dr_action *mlx5dv_dr_action_create_dest_ibv_qp( struct ibv_qp *ibqp); struct mlx5dv_dr_action *mlx5dv_dr_action_create_dest_table( struct mlx5dv_dr_table *table); struct mlx5dv_dr_action *mlx5dv_dr_action_create_dest_root_table( struct mlx5dv_dr_table *table, uint16_t priority); struct mlx5dv_dr_action *mlx5dv_dr_action_create_dest_vport( struct mlx5dv_dr_domain *domain, uint32_t vport); struct mlx5dv_dr_action *mlx5dv_dr_action_create_dest_ib_port( struct mlx5dv_dr_domain *domain, uint32_t ib_port); struct mlx5dv_dr_action *mlx5dv_dr_action_create_dest_devx_tir( struct mlx5dv_devx_obj *devx_obj); struct mlx5dv_dr_action *mlx5dv_dr_action_create_packet_reformat( struct mlx5dv_dr_domain *domain, uint32_t flags, enum mlx5dv_flow_action_packet_reformat_type reformat_type, size_t data_sz, void *data); struct mlx5dv_dr_action *mlx5dv_dr_action_create_modify_header( struct mlx5dv_dr_domain *domain, uint32_t flags, size_t actions_sz, __be64 actions[]); struct mlx5dv_dr_action *mlx5dv_dr_action_create_flow_counter( struct mlx5dv_devx_obj *devx_obj, uint32_t offset); struct mlx5dv_dr_action * mlx5dv_dr_action_create_aso(struct mlx5dv_dr_domain *domain, struct mlx5dv_devx_obj *devx_obj, uint32_t offset, uint32_t flags, uint8_t return_reg_c); int mlx5dv_dr_action_modify_aso(struct mlx5dv_dr_action *action, uint32_t offset, uint32_t flags, uint8_t return_reg_c); struct mlx5dv_dr_action * mlx5dv_dr_action_create_flow_meter(struct mlx5dv_dr_flow_meter_attr *attr); int mlx5dv_dr_action_modify_flow_meter(struct mlx5dv_dr_action *action, struct mlx5dv_dr_flow_meter_attr *attr, __be64 modify_field_select); struct mlx5dv_dr_action * mlx5dv_dr_action_create_flow_sampler(struct mlx5dv_dr_flow_sampler_attr *attr); struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_array(struct mlx5dv_dr_domain *domain, size_t num_dest, struct mlx5dv_dr_action_dest_attr *dests[]); struct mlx5dv_dr_action *mlx5dv_dr_action_create_pop_vlan(void); struct mlx5dv_dr_action *mlx5dv_dr_action_create_push_vlan( struct mlx5dv_dr_domain *dmn, __be32 vlan_hdr) int mlx5dv_dr_action_destroy(struct mlx5dv_dr_action *action); int mlx5dv_dr_aso_other_domain_link(struct mlx5dv_devx_obj *devx_obj, struct mlx5dv_dr_domain *peer_dmn, struct mlx5dv_dr_domain *dmn, uint32_t flags, uint8_t return_reg_c); int mlx5dv_dr_aso_other_domain_unlink(struct mlx5dv_devx_obj *devx_obj, struct mlx5dv_dr_domain *dmn); ``` # DESCRIPTION The Direct Rule API (mlx5dv_dr_\*) allows 
complete access by a verbs application to the device's packet steering functionality.

Steering flow rules are the combination of attributes with a match pattern and a list of actions. Rules can have several distinct actions (such as counting, encapsulating, decapsulating before redirecting packets to a particular queue or port, etc.).

In order to manage the rule execution order for the packet processing matching by HW, multiple flow tables in an ordered chain and multiple flow matchers sorted by priorities are defined.

## Domain

*mlx5dv_dr_domain_create()* creates a DR domain object to be used with *mlx5dv_dr_table_create()* and *mlx5dv_dr_action_create_\*()*.

A domain should be destroyed by calling *mlx5dv_dr_domain_destroy()* once all dependent resources are released.

The device supports the following domain types:

**MLX5DV_DR_DOMAIN_TYPE_NIC_RX**
Manage ethernet packets received on the NIC. Packets in this domain can be dropped, dispatched to QPs, modified or redirected to additional tables inside the domain.
Default behavior: Drop packet.

**MLX5DV_DR_DOMAIN_TYPE_NIC_TX**
Manage ethernet packets transmitted on the NIC. Packets in this domain can be dropped, modified or redirected to additional tables inside the domain.
Default behavior: Forward packet to NIC vport (to eSwitch or wire).

**MLX5DV_DR_DOMAIN_TYPE_FDB**
Manage ethernet packets in the eSwitch Forwarding Data Base for packets received from the wire or from any other vport. Packets in this domain can be dropped, dispatched to a vport, modified or redirected to additional tables inside the domain.
Default behavior: Forward packet to eSwitch manager vport.

*mlx5dv_dr_domain_sync()* is used to flush the rule submission queue. By default, rules in a domain are updated in HW asynchronously. **flags** should be a set of type *enum mlx5dv_dr_domain_sync_flags*:

**MLX5DV_DR_DOMAIN_SYNC_FLAGS_SW**: block until completion of all software queued tasks.

**MLX5DV_DR_DOMAIN_SYNC_FLAGS_HW**: clear the steering HW cache to enforce that the next packet hits the latest rules, in addition to the SW SYNC handling.

**MLX5DV_DR_DOMAIN_SYNC_FLAGS_MEM**: sync device memory to free cached memory.

*mlx5dv_dr_domain_set_reclaim_device_memory()* is used to enable reclaiming device memory back to the system when it is not in use; by default this feature is disabled.

*mlx5dv_dr_domain_allow_duplicate_rules()* is used to allow or prevent the insertion of rules matching on the same fields (duplicates) on non-root tables; by default duplicates are allowed.

## Table

*mlx5dv_dr_table_create()* creates a DR table in the **domain**, at the appropriate **level**, and can be used with *mlx5dv_dr_matcher_create()*, *mlx5dv_dr_action_create_dest_table()* and *mlx5dv_dr_action_create_dest_root_table*. All packets start traversing the steering domain tree at table **level** zero (0). Using rules and actions, packets can be redirected to other tables in the domain.

A table should be destroyed by calling *mlx5dv_dr_table_destroy()* once all dependent resources are released.

## Matcher

*mlx5dv_dr_matcher_create()* creates a matcher object in **table**, at sorted **priority** (lower values are checked first). A matcher can hold multiple rules, all with an identical **mask** of type *struct mlx5dv_flow_match_parameters*, which represents the exact attributes to be compared by HW steering. The **match_criteria_enable** and **mask** are defined in a device spec format. Only the fields that were masked in the *matcher* should be filled by the rule in *mlx5dv_dr_rule_create()*.
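For illustration only, a minimal matcher creation sketch follows; the mask buffer size, its contents and the criteria value are hypothetical placeholders that must be taken from the device specification:

```c
#include <stdlib.h>
#include <infiniband/mlx5dv.h>

static struct mlx5dv_dr_matcher *make_matcher(struct mlx5dv_dr_table *tbl)
{
	size_t buf_sz = 64; /* hypothetical match parameters size */
	struct mlx5dv_flow_match_parameters *mask;
	struct mlx5dv_dr_matcher *matcher;

	mask = calloc(1, sizeof(*mask) + buf_sz);
	if (!mask)
		return NULL;
	mask->match_sz = buf_sz;
	/* Set the bits of the fields to be matched in mask->match_buf,
	 * per the device spec format ... */

	matcher = mlx5dv_dr_matcher_create(tbl, 0 /* priority */,
					   1 /* match_criteria_enable */,
					   mask);
	free(mask); /* assuming the matcher keeps its own copy of the mask */
	return matcher;
}
```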
A matcher should be destroyed by calling *mlx5dv_dr_matcher_destroy()* once all dependent resources are released.

*mlx5dv_dr_matcher_set_layout()* is used to set specific layout parameters of a matcher. Under some conditions, setting some attributes might not be supported; in such cases ENOTSUP will be returned. **flags** should be a set of type *enum mlx5dv_dr_matcher_layout_flags*:

**MLX5DV_DR_MATCHER_LAYOUT_RESIZABLE**: The matcher can resize its scale and resources according to the rules that are inserted or removed.

**MLX5DV_DR_MATCHER_LAYOUT_NUM_RULE**: Indicates a hint from the application about the number of rules the matcher is expected to handle. This allows preallocation of matcher resources for faster rule updates when used with the non-resizable layout mode.

## Actions

A set of action creation APIs is defined by *mlx5dv_dr_action_create_\*()*. All actions are created as *struct mlx5dv_dr_action*. An action should be destroyed by calling *mlx5dv_dr_action_destroy()* once all dependent rules are destroyed.

When an action handle is reused for multiple rules, the same action will be executed. e.g.: action 'count' will count multiple flow rules on the same HW flow counter context. action 'drop' will drop packets of different rules from any matcher.

Action: Drop
*mlx5dv_dr_action_create_drop* creates a terminating action which drops packets. Cannot be mixed with Destination actions.

Action: Default miss
*mlx5dv_dr_action_create_default_miss* creates a terminating action which will execute the default behavior based on the domain type.

Action: Tag
*mlx5dv_dr_action_create_tag* creates a non-terminating action which tags packets with **tag_value**. The **tag_value** is available in the CQE of the received packet. Valid only on domain type NIC_RX.

Action: Destination
*mlx5dv_dr_action_create_dest_ibv_qp* creates a terminating action delivering the packet to a QP, defined by **ibqp**. Valid only on domain type NIC_RX.
*mlx5dv_dr_action_create_dest_table* creates a forwarding action to another flow table, defined by **table**. The destination **table** must be from the same domain with a level higher than zero.
*mlx5dv_dr_action_create_dest_root_table* creates a forwarding action to another priority inside a root flow table, defined by **table** and **priority**.
*mlx5dv_dr_action_create_dest_vport* creates a forwarding action to a **vport** on the same **domain**. Valid only on domain type FDB.
*mlx5dv_dr_action_create_dest_ib_port* creates a forwarding action to an **ib_port** on the same **domain**. The valid range of ports is based on the capability phys_port_cnt_ex provided by ibv_query_device_ex, and it is possible to query the port details using mlx5dv_query_port. The action is supported only on domain type FDB.
*mlx5dv_dr_action_create_dest_devx_tir* creates a terminating action delivering the packet to a TIR, defined by **devx_obj**. Valid only on domain type NIC_RX.

Action: Array
*mlx5dv_dr_action_create_dest_array* creates an action which replicates a packet to multiple destinations. **num_dest** defines the number of replication destinations. Each **dests** destination array entry can be of a different **type**. Use type MLX5DV_DR_ACTION_DEST for direct forwarding to an action destination. Use type MLX5DV_DR_ACTION_DEST_REFORMAT when a reformat action should be performed on the packet before it is forwarded to the destination action.

Action: Packet Reformat
*mlx5dv_dr_action_create_packet_reformat* creates a packet reformat context and action in the **domain**.
The **reformat_type**, **data_sz** and **data** are defined in *man mlx5dv_create_flow_action_packet_reformat*.

Action: Modify Header
*mlx5dv_dr_action_create_modify_header* creates a modify header context and action in the **domain**. The **actions_sz** and **actions** are defined in *man mlx5dv_create_flow_action_modify_header*.

Action: Flow Count
*mlx5dv_dr_action_create_flow_counter* creates a flow counter action from a DEVX flow counter object, based on **devx_obj** and a specific counter index from **offset** in the counter bulk.

Action: ASO
*mlx5dv_dr_action_create_aso* receives a **domain** pointer and creates an ASO action from the DEVX ASO object, based on **devx_obj**. Use **offset** to select the specific ASO object in the **devx_obj** bulk. DR rules using this action can optionally update the ASO object value according to **flags**, to choose the desired behavior of this object. After a packet hits the rule with the ASO object, the value of the ASO object will be copied into the chosen **return_reg_c**, which can be used for matching in subsequent DR rules.
*mlx5dv_dr_action_modify_aso* modifies the ASO action **action** with new values for **offset**, **return_reg_c** and **flags**. Only new DR rules using this **action** will use the modified values. Existing DR rules do not change the stored HW action values.
**flags** can be set to one of the types of *mlx5dv_dr_action_aso_first_hit_flags* or *mlx5dv_dr_action_aso_flow_meter_flags* or *mlx5dv_dr_action_aso_ct_flags*:
**MLX5DV_DR_ACTION_ASO_FIRST_HIT_FLAGS_SET**: is used to set the ASO first hit object context; otherwise the context is only copied to the return_reg_c.
**MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_RED**: is used to indicate to update the initial color in the ASO flow meter object value to red.
**MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_YELLOW**: is used to indicate to update the initial color in the ASO flow meter object value to yellow.
**MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_GREEN**: is used to indicate to update the initial color in the ASO flow meter object value to green.
**MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_UNDEFINED**: is used to indicate to update the initial color in the ASO flow meter object value to undefined.
**MLX5DV_DR_ACTION_FLAGS_ASO_CT_DIRECTION_INITIATOR**: is used to indicate the TCP connection direction the SYN packet was sent on.
**MLX5DV_DR_ACTION_FLAGS_ASO_CT_DIRECTION_RESPONDER**: is used to indicate the TCP connection direction the SYN-ACK packet was sent on.

Action: Meter
*mlx5dv_dr_action_create_flow_meter* creates a meter action based on the flow meter parameters. The parameters are according to the device specification.
*mlx5dv_dr_action_modify_flow_meter* modifies an existing flow meter **action** based on **modify_field_select**. **modify_field_select** is according to the device specification.

Action: Sampler
*mlx5dv_dr_action_create_flow_sampler* creates a sampler action, allowing the duplication and sampling of a portion of traffic. Packets steered to the sampler action will be sampled with an approximate probability of 1/sample_ratio provided in **attr**, and the sample_actions provided in **attr** will be executed over them. All original packets will be steered to default_next_table in **attr**. A modify header format SET_ACTION data can be provided in action of **attr**, which can be executed on packets before going to the default flow table. On some devices, this is required to set a register value.
Action: Pop Vlan
*mlx5dv_dr_action_create_pop_vlan* creates a pop VLAN action which removes VLAN tags from the packet's layer 2.

Action: Push Vlan
*mlx5dv_dr_action_create_push_vlan* creates a push VLAN action which adds VLAN tags to the packet's layer 2.

Action Flags: action **flags** can be set to one of the types of *enum mlx5dv_dr_action_flags*:

**MLX5DV_DR_ACTION_FLAGS_ROOT_LEVEL**: is used to indicate that the action is targeted for a flow table at level=0 (ROOT) of the specific domain.

## Rule

*mlx5dv_dr_rule_create()* creates a HW steering rule entry in **matcher**. The **value** of type *struct mlx5dv_flow_match_parameters* holds the exact attribute values of the steering rule to be matched, in a device spec format. Only the fields that were masked in the *matcher* should be filled. HW will perform the set of **num_actions** from the **actions** array of type *struct mlx5dv_dr_action*, once a packet matches the exact **value** of the rule (referred to as a 'hit').

*mlx5dv_dr_rule_destroy()* destroys the rule.

## Other

*mlx5dv_dr_aso_other_domain_link()* links the ASO devx object **devx_obj** to a domain **dmn**; this allows creating a rule with an ASO action using the given object on the linked domain **dmn**.
**peer_dmn** is the domain that the ASO devx object was created on.
**dmn** is the domain that the ASO devx object will be linked to.
**flags** chooses the desired behavior of this object, the same as for the ASO action creation flags.
**return_reg_c**: after a packet hits the rule with the ASO object, the value of the ASO object will be copied into the regc register indicated by this parameter, and the value can then be used for matching in subsequent DR rules.

*mlx5dv_dr_aso_other_domain_unlink()* will unlink the **devx_obj** from the linked **dmn**.
**dmn** is the domain that the ASO devx object is linked to.

# RETURN VALUE

The create API calls will return a pointer to the relevant object: table, matcher, action, rule. On failure, NULL will be returned and errno will be set.

The destroy API calls will return 0 on success, or the value of errno on failure (which indicates the failure reason).

# LIMITATIONS

An application can verify whether a feature is supported by *trial and error*. No capabilities are exposed, as the combinations of all the exposed options are far too large to define.

Tables have no fixed size by definition. They are expected to grow and shrink to accommodate all rules, according to driver capabilities. Once a limit is reached, an error is returned.

Matchers with the same priority in the same table have an undefined order.

A rule with an identical value pattern to another rule on a given matcher is rejected.

The IP version in the matcher mask and rule should be equal and set to 4, 6 or 0.

# SEE ALSO

**mlx5dv_open_device(3)**, **mlx5dv_create_flow_action_packet_reformat(3)**, **mlx5dv_create_flow_action_modify_header(3)**.
# AUTHOR Alex Rosenbaum Alex Vesker rdma-core-56.1/providers/mlx5/man/mlx5dv_dump.3.md000066400000000000000000000027651477342711600217500ustar00rootroot00000000000000--- date: 2019-11-18 layout: page title: MLX5DV_DUMP API section: 3 license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md' header: "mlx5 Programmer's Manual" footer: mlx5 --- # NAME mlx5dv_dump_dr_domain - Dump DR Domain mlx5dv_dump_dr_table - Dump DR Table mlx5dv_dump_dr_matcher - Dump DR Matcher mlx5dv_dump_dr_rule - Dump DR Rule # SYNOPSIS ```c #include int mlx5dv_dump_dr_domain(FILE *fout, struct mlx5dv_dr_domain *domain); int mlx5dv_dump_dr_table(FILE *fout, struct mlx5dv_dr_table *table); int mlx5dv_dump_dr_matcher(FILE *fout, struct mlx5dv_dr_matcher *matcher); int mlx5dv_dump_dr_rule(FILE *fout, struct mlx5dv_dr_rule *rule); ``` # DESCRIPTION The Dump API (mlx5dv_dump_\*) allows the dumping of the existing rdma-core resources to the provided file. The output file format is vendor specific. *mlx5dv_dump_dr_domain()* dumps a DR Domain object properties to a specified file. *mlx5dv_dump_dr_table()* dumps a DR Table object properties to a specified file. *mlx5dv_dump_dr_matcher()* dumps a DR Matcher object properties to a specified file. *mlx5dv_dump_dr_rule()* dumps a DR Rule object properties to a specified file. # RETURN VALUE The API calls returns 0 on success, or the value of errno on failure (which indicates the failure reason). The calls are blocking - function returns only when all related resources info is written to the file. # AUTHOR Yevgeny Kliteynik Muhammad Sammar rdma-core-56.1/providers/mlx5/man/mlx5dv_flow_action_esp.3.md000066400000000000000000000030231477342711600241420ustar00rootroot00000000000000--- layout: page title: mlx5dv_flow_action_esp section: 3 tagline: Verbs --- # NAME mlx5dv_flow_action_esp - Flow action esp for mlx5 provider # SYNOPSIS ```c #include struct ibv_flow_action * mlx5dv_create_flow_action_esp(struct ibv_context *ctx, struct ibv_flow_action_esp_attr *esp, struct mlx5dv_flow_action_esp *mlx5_attr); ``` # DESCRIPTION Create an IPSEC ESP flow steering action. This verb is identical to *ibv_create_flow_action_esp* verb, but allows mlx5 specific flags. # ARGUMENTS Please see *ibv_flow_action_esp(3)* man page for *ctx* and *esp*. ## *mlx5_attr* argument ```c struct mlx5dv_flow_action_esp { uint64_t comp_mask; /* Use enum mlx5dv_flow_action_esp_mask */ uint32_t action_flags; /* Use enum mlx5dv_flow_action_flags */ }; ``` *comp_mask* : Bitmask specifying what fields in the structure are valid (*enum mlx5dv_flow_action_esp_mask*). *action_flags* : A bitwise OR of the various values described below. *MLX5DV_FLOW_ACTION_FLAGS_REQUIRE_METADATA*: Each received and transmitted packet using offload is expected to carry metadata in the form of a L2 header with ethernet type 0x8CE4, followed by 6 bytes of data and the original packet ethertype. # NOTE The ESN is expected to be placed in the IV field for egress packets. The 64 bit sequence number is written in big-endian over the 64 bit IV field. There is no need to call modify to update the ESN window on egress when this DV is used. 
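For illustration, a hedged sketch follows; it assumes an already-initialized *esp* attribute structure (see *ibv_flow_action_esp(3)*) and only adds the mlx5-specific metadata flag:

```c
#include <infiniband/mlx5dv.h>

/* esp_attr is assumed to be filled by the caller per ibv_flow_action_esp(3). */
static struct ibv_flow_action *
create_esp_with_metadata(struct ibv_context *ctx,
			 struct ibv_flow_action_esp_attr *esp_attr)
{
	struct mlx5dv_flow_action_esp mlx5_attr = {
		.comp_mask = MLX5DV_FLOW_ACTION_ESP_MASK_FLAGS,
		.action_flags = MLX5DV_FLOW_ACTION_FLAGS_REQUIRE_METADATA,
	};

	return mlx5dv_create_flow_action_esp(ctx, esp_attr, &mlx5_attr);
}
```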
# SEE ALSO

*ibv_flow_action_esp(3)*, *RFC 4106*
rdma-core-56.1/providers/mlx5/man/mlx5dv_get_clock_info.3000066400000000000000000000025601477342711600233420ustar00rootroot00000000000000.\" -*- nroff -*-
.\" Licensed under the OpenIB.org (MIT) - See COPYING.md
.\"
.TH MLX5DV_GET_CLOCK_INFO 3 2017-11-08 1.0.0
.SH "NAME"
mlx5dv_get_clock_info \- Get device clock information
.SH "SYNOPSIS"
.nf
.B #include <infiniband/mlx5dv.h>
.sp
.BI "int mlx5dv_get_clock_info(struct ibv_context *ctx_in,
.BI "                          struct mlx5dv_clock_info *clock_info);
.fi
.SH "DESCRIPTION"
Get the updated core
.I clock_info
from the device driver. This information will be used later to translate the completion timestamp from HCA core clock to nanoseconds. The values of the clock are updated from the driver's PTP clock; therefore, without a running PTP client on the machine, the wall clock conversion will not be accurate.
.PP
Pass the latest \fBstruct mlx5dv_clock_info\fR to \fBmlx5dv_ts_to_ns(3)\fR in order to translate the completion timestamp from HCA core clock to nanoseconds.
.PP
If the clock_info becomes too old then the time conversion will return wrong results. The user must ensure that \fBmlx5dv_get_clock_info(3)\fR is called at least once every \fBmax_clock_info_update_nsec\fR as returned by the \fBmlx5dv_query_device(3)\fR function.
.PP
.fi
.SH "RETURN VALUE"
0 on success or the value of errno on failure (which indicates the failure reason).
.SH "SEE ALSO"
.BR mlx5dv (7),
.BR mlx5dv_ts_to_ns (3)
.SH "AUTHORS"
.TP
Feras Daoud
rdma-core-56.1/providers/mlx5/man/mlx5dv_get_data_direct_sysfs_path.3.md000066400000000000000000000023471477342711600263430ustar00rootroot00000000000000---
layout: page
title: mlx5dv_get_data_direct_sysfs_path
section: 3
tagline: Verbs
---

# NAME

mlx5dv_get_data_direct_sysfs_path - Get the sysfs path of a data direct device

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_get_data_direct_sysfs_path(struct ibv_context *context, char *buf,
				      size_t buf_len)
```

# DESCRIPTION

Get the sysfs path of the data direct device that is associated with the given *context*. This lets an application discover whether, and which, data direct device is associated with the given *context*.

# ARGUMENTS

*context*
:	RDMA device context to work on.

*buf*
:	The buffer where to place the sysfs path of the associated data direct device.

*buf_len*
:	The length of the buffer.

# RETURN VALUE

Upon success 0 is returned, or the value of errno on a failure.

# ERRORS

The below specific error values should be considered.

ENODEV
:	There is no associated data direct device for the given *context*.

ENOSPC
:	The input buffer size is too small to hold the full sysfs path.

# NOTES

Upon success, the caller should add the /sys/ prefix to get the full sysfs path.

# SEE ALSO

*mlx5dv_reg_dmabuf_mr(3)*

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_get_vfio_device_list.3.md000066400000000000000000000025761477342711600251530ustar00rootroot00000000000000---
layout: page
title: mlx5dv_get_vfio_device_list
section: 3
tagline: Verbs
---

# NAME

mlx5dv_get_vfio_device_list - Get list of available devices to be used over VFIO

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct ibv_device **
mlx5dv_get_vfio_device_list(struct mlx5dv_vfio_context_attr *attr);
```

# DESCRIPTION

Returns a NULL-terminated array of devices based on input *attr*.

# ARGUMENTS

*attr*
:	Describes the VFIO devices to return in the list.

## *attr* argument

```c
struct mlx5dv_vfio_context_attr {
	const char *pci_name;
	uint32_t flags;
	uint64_t comp_mask;
};
```

*pci_name*
:	The PCI name of the required device.
*flags* : A bitwise OR of the various values described below. *MLX5DV_VFIO_CTX_FLAGS_INIT_LINK_DOWN*: Upon device initialization link should stay down. *comp_mask* : Bitmask specifying what fields in the structure are valid. # RETURN VALUE Returns the array of the matching devices, or sets errno and returns NULL if the request fails. # NOTES Client code should open all the devices it intends to use with ibv_open_device() before calling ibv_free_device_list(). Once it frees the array with ibv_free_device_list(), it will be able to use only the open devices; pointers to unopened devices will no longer be valid. # SEE ALSO *ibv_open_device(3)* *ibv_free_device_list(3)* # AUTHOR Yishai Hadas rdma-core-56.1/providers/mlx5/man/mlx5dv_init_obj.3000066400000000000000000000071341477342711600221740ustar00rootroot00000000000000.\" -*- nroff -*- .\" Licensed under the OpenIB.org (MIT) - See COPYING.md .\" .TH MLX5DV_INIT_OBJ 3 2017-02-02 1.0.0 .SH "NAME" mlx5dv_init_obj \- Initialize mlx5 direct verbs object from ibv_xxx or mlx5dv_xxx structures .SH "SYNOPSIS" .nf .B #include .sp .BI "int mlx5dv_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type); .fi .SH "DESCRIPTION" .B mlx5dv_init_obj() This function will initialize mlx5dv_xxx structs based on supplied type. The information for initialization is taken from ibv_xx or mlx5dv_xxx structs supplied as part of input. Request information of CQ marks its owned by direct verbs for all consumer index related actions. The initialization type can be combination of several types together. .PP .nf struct mlx5dv_qp { .in +8 uint32_t *dbrec; struct { .in +8 void *buf; uint32_t wqe_cnt; uint32_t stride; .in -8 } sq; struct { .in +8 void *buf; uint32_t wqe_cnt; uint32_t stride; .in -8 } rq; struct { .in +8 void *reg; uint32_t size; .in -8 } bf; uint64_t comp_mask; off_t uar_mmap_offset; uint32_t tirn; uint32_t tisn; uint32_t rqn; uint32_t sqn; uint64_t tir_icm_address; .in -8 }; struct mlx5dv_cq { .in +8 void *buf; uint32_t *dbrec; uint32_t cqe_cnt; uint32_t cqe_size; void *cq_uar; uint32_t cqn; uint64_t comp_mask; .in -8 }; struct mlx5dv_srq { .in +8 void *buf; uint32_t *dbrec; uint32_t stride; uint32_t head; uint32_t tail; uint64_t comp_mask; uint32_t srqn; .in -8 }; struct mlx5dv_rwq { .in +8 void *buf; uint32_t *dbrec; uint32_t wqe_cnt; uint32_t stride; uint64_t comp_mask; .in -8 }; struct mlx5dv_dm { .in +8 void *buf; uint64_t length; uint64_t comp_mask; uint64_t remote_va; .in -8 }; struct mlx5dv_ah { .in +8 struct mlx5_wqe_av *av; uint64_t comp_mask; .in -8 }; struct mlx5dv_pd { .in +8 uint32_t pdn; uint64_t comp_mask; .in -8 }; struct mlx5dv_devx { .in +8 uint32_t handle; /* The kernel handle, can be used upon direct ioctl destroy */ .in -8 }; struct mlx5dv_obj { .in +8 struct { .in +8 struct ibv_qp *in; struct mlx5dv_qp *out; .in -8 } qp; struct { .in +8 struct ibv_cq *in; struct mlx5dv_cq *out; .in -8 } cq; struct { .in +8 struct ibv_srq *in; struct mlx5dv_srq *out; .in -8 } srq; struct { .in +8 struct ibv_wq *in; struct mlx5dv_rwq *out; .in -8 } rwq; struct { .in +8 struct ibv_dm *in; struct mlx5dv_dm *out; .in -8 } dm; struct { .in +8 struct ibv_ah *in; struct mlx5dv_ah *out; .in -8 } ah; struct { .in +8 struct ibv_pd *in; struct mlx5dv_pd *out; .in -8 } pd; struct { .in +8 struct mlx5dv_devx_obj *in; struct mlx5dv_devx *out; .in -8 } devx; .in -8 }; enum mlx5dv_obj_type { .in +8 MLX5DV_OBJ_QP = 1 << 0, MLX5DV_OBJ_CQ = 1 << 1, MLX5DV_OBJ_SRQ = 1 << 2, MLX5DV_OBJ_RWQ = 1 << 3, MLX5DV_OBJ_DM = 1 << 4, MLX5DV_OBJ_AH = 1 << 5, MLX5DV_OBJ_PD = 1 << 6, 
MLX5DV_OBJ_DEVX = 1 << 7,
.in -8
};
.fi
.SH "RETURN VALUE"
0 on success or the value of errno on failure (which indicates the failure reason).
.SH "NOTES"
* The information on whether the doorbell is blueflame is based on mlx5dv_qp->bf->size; in case of 0 it is not a BF.
* Compatibility masks (comp_mask) are in/out fields.
.SH "SEE ALSO"
.BR mlx5dv (7)
.SH "AUTHORS"
.TP
Leon Romanovsky
rdma-core-56.1/providers/mlx5/man/mlx5dv_is_supported.3.md000066400000000000000000000010741477342711600235130ustar00rootroot00000000000000---
layout: page
title: mlx5dv_is_supported
section: 3
tagline: Verbs
---

# NAME

mlx5dv_is_supported - Check whether an RDMA device is implemented by the mlx5 provider

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

bool mlx5dv_is_supported(struct ibv_device *device);
```

# DESCRIPTION

mlx5dv functions may be called only if this function returns true for the RDMA device.

# ARGUMENTS

*device*
:	RDMA device to check.

# RETURN VALUE

Returns true if the device is implemented by the mlx5 provider.

# SEE ALSO

*mlx5dv(7)*

# AUTHOR

Artemy Kovalyov
rdma-core-56.1/providers/mlx5/man/mlx5dv_map_ah_to_qp.3.md000066400000000000000000000030171477342711600234210ustar00rootroot00000000000000---
layout: page
title: mlx5dv_map_ah_to_qp
section: 3
tagline: Verbs
---

# NAME

mlx5dv_map_ah_to_qp - Map the destination path information in an address handle (AH) to the information extracted from the QP.

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_map_ah_to_qp(struct ibv_ah *ah, uint32_t qp_num);
```

# DESCRIPTION

This API maps the destination path information in the address handle (*ah*) to the information extracted from the QP (e.g. congestion control from ECE).

This API serves as an enhancement to DC and UD QPs to achieve better performance by using per-address congestion control (CC) algorithms, enabling DC/UD QPs to use multiple CC algorithms in the same datacenter.

The mapping created by this API is implicitly destroyed when the address handle is destroyed. It is not affected by the destruction of QP *qp_num*. A duplicate mapping to the same address handle is ignored; as this API is just a hint to the hardware, in this case it does nothing and returns success, regardless of the new qp_num ECE. The function must be called after ECE negotiation/preconfiguration was done by some external means.

# ARGUMENTS

*ah*
:	The target's address handle.

*qp_num*
:	The initiator QP from which congestion control information is extracted from its ECE.

# RETURN VALUE

Upon success, returns 0; upon failure, the value of errno is returned.

# SEE ALSO

*rdma_cm(7)*, *rdma_get_remote_ece(3)*, *ibv_query_ece(3)*, *ibv_set_ece(3)*

# AUTHOR

Yochai Cohen
Patrisious Haddad
rdma-core-56.1/providers/mlx5/man/mlx5dv_mkey_check.3.md000066400000000000000000000073021477342711600230750ustar00rootroot00000000000000---
layout: page
title: mlx5dv_mkey_check
section: 3
tagline: Verbs
---

# NAME

mlx5dv_mkey_check - Check an MKEY for errors

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_mkey_check(struct mlx5dv_mkey *mkey,
		      struct mlx5dv_mkey_err *err_info);
```

# DESCRIPTION

Checks *mkey* for errors and provides the result in *err_info* on success.

This should be called after using an MKEY configured with signature validation in a transfer operation. While the transfer operation itself may be completed successfully (i.e. no transport related errors occurred), there still may be errors related to the integrity of the data. The first of these errors is reported to the MKEY and kept there until application software queries it by calling this API.
The type of error indicates which part of the signature was bad (guard, reftag or apptag). Also provided is the actual calculated value based on the transferred data, and the expected value based on the signature fields. The last part provided is the offset in the transfer that caused the error.

# ARGUMENTS

*mkey*
:	The MKEY to check for errors.

*err_info*
:	The result of the MKEY check, information about the errors detected, if any.

```c
struct mlx5dv_mkey_err {
	enum mlx5dv_mkey_err_type err_type;
	union {
		struct mlx5dv_sig_err sig;
	} err;
};
```

*err_type*
:	What kind of error happened. If several errors exist in one block verified by the device, only the first of them is reported, according to the order specified in the T10DIF specification, which is: **MLX5DV_MKEY_SIG_BLOCK_BAD_GUARD**, **MLX5DV_MKEY_SIG_BLOCK_BAD_APPTAG**, **MLX5DV_MKEY_SIG_BLOCK_BAD_REFTAG**.

	**MLX5DV_MKEY_NO_ERR**
	:	No error is detected for the MKEY.

	**MLX5DV_MKEY_SIG_BLOCK_BAD_GUARD**
	:	A signature error was detected in CRC/CHECKSUM for T10-DIF or CRC32/CRC32C/CRC64_XP10 (depending on the configured signature type). Additional information about the error is provided in **struct mlx5dv_sig_err** of *err*.

	**MLX5DV_MKEY_SIG_BLOCK_BAD_REFTAG**
	:	A signature error was detected in the reference tag. This kind of signature error is relevant for T10-DIF only. Additional information about the error is provided in **struct mlx5dv_sig_err** of *err*.

	**MLX5DV_MKEY_SIG_BLOCK_BAD_APPTAG**
	:	A signature error was detected in the application tag. This kind of signature error is relevant for T10-DIF only. Additional information about the error is provided in **struct mlx5dv_sig_err** of *err*.

*err*
:	Information about the detected error if *err_type* is not **MLX5DV_MKEY_NO_ERR**. Otherwise, its value is not defined.

## Signature error

```c
struct mlx5dv_sig_err {
	uint64_t actual_value;
	uint64_t expected_value;
	uint64_t offset;
};
```

*actual_value*
:	The actual value that was calculated from the transferred data.

*expected_value*
:	The expected value based on what appears in the respective signature field.

*offset*
:	The offset within the transfer where the error happened. In block signature, this is guaranteed to be a block boundary offset.

# RETURN VALUE

0 on success or the value of errno on failure (which indicates the failure reason).

# NOTES

A DEVX context should be opened by using **mlx5dv_open_device**(3).

Checking the MKEY for errors should be done after the application knows that the data transfer that was using the MKEY has finished. The application should wait for the respective completion (if this was a local MKEY) or wait for a received message from a peer (if this was a remote MKEY).

# SEE ALSO

**mlx5dv_wr_mkey_configure**(3), **mlx5dv_wr_set_mkey_sig_block**(3), **mlx5dv_create_mkey**(3), **mlx5dv_destroy_mkey**(3)

# AUTHORS

Oren Duer
Sergey Gorenko
rdma-core-56.1/providers/mlx5/man/mlx5dv_modify_qp_lag_port.3.md000066400000000000000000000015061477342711600246510ustar00rootroot00000000000000---
layout: page
title: mlx5dv_modify_qp_lag_port
section: 3
tagline: Verbs
---

# NAME

mlx5dv_modify_qp_lag_port - Modify the lag port information of a given QP

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_modify_qp_lag_port(struct ibv_qp *qp, uint8_t port_num);
```

# DESCRIPTION

This API enables modifying the configured port num of a given QP.

If the QP state is modified later, the port num may be implicitly re-configured.

Use mlx5dv_query_qp_lag_port() to check the configured and active port num values.
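For illustration, a minimal sketch that pins a QP to LAG port 2 and reads back both port values:

```c
#include <stdint.h>
#include <infiniband/mlx5dv.h>

static int pin_qp_to_port2(struct ibv_qp *qp)
{
	uint8_t configured, active;
	int ret;

	ret = mlx5dv_modify_qp_lag_port(qp, 2);
	if (ret)
		return ret; /* e.g. EOPNOTSUPP when not in LAG mode */

	/* The active port may differ from the configured one, and may be
	 * implicitly re-configured on a later QP state change. */
	return mlx5dv_query_qp_lag_port(qp, &configured, &active);
}
```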
# ARGUMENTS

*qp*
:	The ibv_qp object to issue the action on.

*port_num*
:	The port_num to set for the QP.

# RETURN VALUE

0 on success; EOPNOTSUPP if not in LAG mode, or another errno value on other failures.

# SEE ALSO

*mlx5dv_query_qp_lag_port(3)*

# AUTHOR

Aharon Landau
rdma-core-56.1/providers/mlx5/man/mlx5dv_modify_qp_sched_elem.3.md000066400000000000000000000017521477342711600251350ustar00rootroot00000000000000---
layout: page
title: mlx5dv_modify_qp_sched_elem
section: 3
tagline: Verbs
date: 2020-9-22
header: "mlx5 Programmer's Manual"
footer: mlx5
---

# NAME

mlx5dv_modify_qp_sched_elem - Connect a QP with a requestor and/or a responder scheduling element

# SYNOPSIS

```c
int mlx5dv_modify_qp_sched_elem(struct ibv_qp *qp,
				struct mlx5dv_sched_leaf *requestor,
				struct mlx5dv_sched_leaf *responder);
```

# DESCRIPTION

The QP scheduling element (SE) allows the association of a QP to an SE tree. The SE is described in the *mlx5dv_sched_node_create(3)* man page.

By default, QPs are not associated with an SE. The default setting ensures fair bandwidth allocation with no maximum bandwidth limit.

A QP can be associated with a requestor and/or a responder SE, following the IB spec definition.

# RETURN VALUE

Upon success 0 is returned, or the value of errno on a failure.

# SEE ALSO

**mlx5dv_sched_node_create**(3)

# AUTHOR

Mark Zhang
Ariel Almog
rdma-core-56.1/providers/mlx5/man/mlx5dv_modify_qp_udp_sport.3.md000066400000000000000000000015121477342711600250560ustar00rootroot00000000000000---
layout: page
title: mlx5dv_modify_qp_udp_sport
section: 3
tagline: Verbs
---

# NAME

mlx5dv_modify_qp_udp_sport - Modify the UDP source port of a given QP

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_modify_qp_udp_sport(struct ibv_qp *qp, uint16_t udp_sport)
```

# DESCRIPTION

The UDP source port is used to create entropy for network routers (ECMP), load balancers and 802.3ad link aggregation switching that are not aware of RoCE IB headers.

This API enables modifying the configured UDP source port of a given RC/UC QP when the QP is in RTS state.

# ARGUMENTS

*qp*
:	The ibv_qp object to issue the action on.

*udp_sport*
:	The UDP source port to set for the QP.

# RETURN VALUE

Returns 0 on success, or the value of errno on failure (which indicates the failure reason).

# AUTHOR

Maor Gottlieb
rdma-core-56.1/providers/mlx5/man/mlx5dv_open_device.3.md000066400000000000000000000016651477342711600232570ustar00rootroot00000000000000---
layout: page
title: mlx5dv_open_device
section: 3
tagline: Verbs
---

# NAME

mlx5dv_open_device - Open an RDMA device context for the mlx5 provider

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct ibv_context *
mlx5dv_open_device(struct ibv_device *device, struct mlx5dv_context_attr *attr);
```

# DESCRIPTION

Open an RDMA device context with specific mlx5 provider attributes.

# ARGUMENTS

*device*
:	RDMA device to open.

## *attr* argument

```c
struct mlx5dv_context_attr {
	uint32_t flags;
	uint64_t comp_mask;
};
```

*flags*
:	A bitwise OR of the various values described below.

	*MLX5DV_CONTEXT_FLAGS_DEVX*:
	Allocate a DEVX context

*comp_mask*
:	Bitmask specifying what fields in the structure are valid

# RETURN VALUE

Returns a pointer to the allocated device context, or NULL if the request fails.
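For illustration, a minimal sketch that opens a DEVX-enabled context on an mlx5 device:

```c
#include <infiniband/mlx5dv.h>

static struct ibv_context *open_devx_ctx(struct ibv_device *device)
{
	struct mlx5dv_context_attr attr = {
		.flags = MLX5DV_CONTEXT_FLAGS_DEVX,
	};

	if (!mlx5dv_is_supported(device))
		return NULL; /* not an mlx5 device */

	return mlx5dv_open_device(device, &attr);
}
```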
# SEE ALSO

*ibv_open_device(3)*

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_pp_alloc.3.md000066400000000000000000000026371477342711600225660ustar00rootroot00000000000000---
layout: page
title: mlx5dv_pp_alloc / mlx5dv_pp_free
section: 3
tagline: Verbs
---

# NAME

mlx5dv_pp_alloc - Allocates a packet pacing entry

mlx5dv_pp_free - Frees a packet pacing entry

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_pp *
mlx5dv_pp_alloc(struct ibv_context *context,
		size_t pp_context_sz,
		const void *pp_context,
		uint32_t flags);

void mlx5dv_pp_free(struct mlx5dv_pp *dv_pp);
```

# DESCRIPTION

Create / free a packet pacing entry which can be used for some device commands over the DEVX interface.

The DEVX API enables direct access from the user space area to the mlx5 device driver; a packet pacing index is needed for a few commands.

# ARGUMENTS

*context*
:	RDMA device context to work on; it needs to be opened with DEVX support by using mlx5dv_open_device().

*pp_context_sz*
:	Length of *pp_context* input buffer.

*pp_context*
:	Packet pacing context according to the device specification.

*flags*
:	MLX5DV_PP_ALLOC_FLAGS_DEDICATED_INDEX: allocate a dedicated index.

## dv_pp

```c
struct mlx5dv_pp {
	uint16_t index;
};
```

*index*
:	The device index to be used.

# RETURN VALUE

Upon success *mlx5dv_pp_alloc* returns a pointer to the created packet pacing object; on error NULL will be returned and errno will be set.

# SEE ALSO

**mlx5dv_open_device**, **mlx5dv_devx_obj_create**

# AUTHOR

Yishai Hadas
rdma-core-56.1/providers/mlx5/man/mlx5dv_qp_cancel_posted_send_wrs.3.md000066400000000000000000000057501477342711600262030ustar00rootroot00000000000000---
layout: page
title: mlx5dv_qp_cancel_posted_send_wrs
section: 3
tagline: Verbs
---

# NAME

mlx5dv_qp_cancel_posted_send_wrs - Cancel all pending send work requests with the supplied WRID in a QP in SQD state

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_qp_cancel_posted_send_wrs(struct mlx5dv_qp_ex *mqp, uint64_t wr_id);
```

# DESCRIPTION

The canceled work requests are replaced with NOPs (no operation), and will generate good completions according to the signaling originally requested in the send flags, or "flushed" completions in case the QP goes to error. A work request can only be canceled when the QP is in SQD state.

The cancel function is a part of the signature pipelining feature. The feature allows posting a signature-related transfer operation together with a SEND carrying a good response to the client. Normally, the application must wait for the transfer to end, check the MKEY for errors, and only then send a good or bad response. However, this increases the latency of the good flow of a transaction.

To enable this feature, a QP must be created with the **MLX5DV_QP_CREATE_SIG_PIPELINING** creation flag. Such a QP will stop in SQD state after a transfer operation that failed signature validation. **IBV_EVENT_SQ_DRAINED** is generated to inform about the new state.

The SEND operation that might need to be canceled due to a bad signature of a previous operation must be posted with the **IBV_SEND_FENCE** option in the **ibv_qp_ex->wr_flags** field.

When the QP stops at SQD, it means that at least one WR caused a signature error. It may not be the last WR. It may be that more than one WR caused signature errors by the time the QP finally stopped. It is guaranteed that the QP has stopped somewhere between the WQE that generated the signature error, and the next WQE that has **IBV_SEND_FENCE** on it.
Software must handle the SQD event as described below:

1. Poll everything (poll until it returns 0 once) on the respective CQ, allowing the discovery of all possible signature errors.

2. Look through all "open" transactions, check related signature MKEYs using **mlx5dv_mkey_check**(3), find the one with the signature error, get a **WRID** from the operation software context and handle the failed operation.

3. Cancel the SEND WR by the WRID using **mlx5dv_qp_cancel_posted_send_wrs**().

4. Modify the QP back to RTS state.

# ARGUMENTS

*mqp*
:	The QP to investigate, which must be in SQD state.

*wr_id*
:	The WRID to cancel.

# RETURN VALUE

Number of work requests that were canceled, or -errno on error.

# NOTES

A DEVX context should be opened by using **mlx5dv_open_device**(3).

Must be called with a QP in SQD state.

The QP should be created with the **MLX5DV_QP_CREATE_SIG_PIPELINING** creation flag. The application must listen on QP events, and expect an SQD event.

# SEE ALSO

**mlx5dv_mkey_check**(3), **mlx5dv_wr_mkey_configure**(3), **mlx5dv_wr_set_mkey_sig_block**(3), **mlx5dv_create_mkey**(3), **mlx5dv_destroy_mkey**(3)

# AUTHORS

Oren Duer
Sergey Gorenko
rdma-core-56.1/providers/mlx5/man/mlx5dv_query_device.3000066400000000000000000000226401477342711600230620ustar00rootroot00000000000000.\" -*- nroff -*-
.\" Licensed under the OpenIB.org (MIT) - See COPYING.md
.\"
.TH MLX5DV_QUERY_DEVICE 3 2017-02-02 1.0.0
.SH "NAME"
mlx5dv_query_device \- Query device capabilities specific to mlx5
.SH "SYNOPSIS"
.nf
.B #include <infiniband/mlx5dv.h>
.sp
.BI "int mlx5dv_query_device(struct ibv_context *ctx_in,
.BI "                        struct mlx5dv_context *attrs_out);
.fi
.SH "DESCRIPTION"
.B mlx5dv_query_device()
Query HW device-specific information which is important for the data path, but isn't provided by
\fBibv_query_device\fR(3).
.PP
This function returns the version, flags and compatibility mask. The version represents the format of the internal hardware structures that mlx5dv.h represents. Additions of new fields to the existing structures are handled by the comp_mask field.
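.PP
For illustration, a minimal usage sketch (assuming an already-opened mlx5 device context \fIctx\fR, and requesting two optional capability sections; comp_mask is an in/out field, so only the bits still set on return were recognized and filled in by the provider):
.PP
.nf
struct mlx5dv_context attrs = {};

attrs.comp_mask = MLX5DV_CONTEXT_MASK_CQE_COMPRESION |
                  MLX5DV_CONTEXT_MASK_CLOCK_INFO_UPDATE;

if (!mlx5dv_query_device(ctx, &attrs)) {
        /* Inspect attrs.flags and the comp_mask bits that the
         * provider left set before using the optional fields. */
}
.fi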
.PP .nf struct mlx5dv_sw_parsing_caps { .in +8 uint32_t sw_parsing_offloads; /* Use enum mlx5dv_sw_parsing_offloads */ uint32_t supported_qpts; .in -8 }; .PP .nf struct mlx5dv_striding_rq_caps { .in +8 uint32_t min_single_stride_log_num_of_bytes; /* min log size of each stride */ uint32_t max_single_stride_log_num_of_bytes; /* max log size of each stride */ uint32_t min_single_wqe_log_num_of_strides; /* min log number of strides per WQE */ uint32_t max_single_wqe_log_num_of_strides; /* max log number of strides per WQE */ uint32_t supported_qpts; .in -8 }; .PP .nf struct mlx5dv_dci_streams_caps { .in +8 uint8_t max_log_num_concurent; /* max log number of parallel different streams that could be handled by HW */ uint8_t max_log_num_errored; /* max DCI error stream channels supported per DCI before a DCI move to an error state */ .in -8 }; .PP .nf struct mlx5dv_context { .in +8 uint8_t version; uint64_t flags; uint64_t comp_mask; /* Use enum mlx5dv_context_comp_mask */ struct mlx5dv_cqe_comp_caps cqe_comp_caps; struct mlx5dv_sw_parsing_caps sw_parsing_caps; uint32_t tunnel_offloads_caps; uint32_t max_dynamic_bfregs /* max blue-flame registers that can be dynamiclly allocated */ uint64_t max_clock_info_update_nsec; uint32_t flow_action_flags; /* use enum mlx5dv_flow_action_cap_flags */ uint32_t dc_odp_caps; /* use enum ibv_odp_transport_cap_bits */ void *hca_core_clock; /* points to a memory location that is mapped to the HCA's core clock */ struct mlx5dv_sig_caps sig_caps; size_t max_wr_memcpy_length; /* max length that is supported by the DMA memcpy WR */ struct mlx5dv_crypto_caps crypto_caps; uint64_t max_dc_rd_atom; /* Maximum number of outstanding RDMA read/atomic per DC QP as a requester */ uint64_t max_dc_init_rd_atom; /* Maximum number of outstanding RDMA read/atomic per DC QP as a responder */ struct mlx5dv_reg reg_c0; /* value and mask to match local vport egress traffic in FDB */ struct mlx5dv_ooo_recv_wrs_caps ooo_recv_wrs_caps; /* Maximum number of outstanding WRs per out-of-order QP type */ .in -8 }; enum mlx5dv_context_flags { .in +8 /* * This flag indicates if CQE version 0 or 1 is needed. */ MLX5DV_CONTEXT_FLAGS_CQE_V1 = (1 << 0), MLX5DV_CONTEXT_FLAGS_OBSOLETE = (1 << 1), /* Obsoleted, don't use */ MLX5DV_CONTEXT_FLAGS_MPW_ALLOWED = (1 << 2), /* Multi packet WQE is allowed */ MLX5DV_CONTEXT_FLAGS_ENHANCED_MPW = (1 << 3), /* Enhanced multi packet WQE is supported or not */ MLX5DV_CONTEXT_FLAGS_CQE_128B_COMP = (1 << 4), /* Support CQE 128B compression */ MLX5DV_CONTEXT_FLAGS_CQE_128B_PAD = (1 << 5), /* Support CQE 128B padding */ MLX5DV_CONTEXT_FLAGS_PACKET_BASED_CREDIT_MODE = (1 << 6), /* Support packet based credit mode in RC QP */ /* * If CQ was created with IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK, CQEs timestamp will be in real time format. 
*/ MLX5DV_CONTEXT_FLAGS_REAL_TIME_TS = (1 << 7), .in -8 }; .PP .nf enum mlx5dv_context_comp_mask { .in +8 MLX5DV_CONTEXT_MASK_CQE_COMPRESION = 1 << 0, MLX5DV_CONTEXT_MASK_SWP = 1 << 1, MLX5DV_CONTEXT_MASK_STRIDING_RQ = 1 << 2, MLX5DV_CONTEXT_MASK_TUNNEL_OFFLOADS = 1 << 3, MLX5DV_CONTEXT_MASK_DYN_BFREGS = 1 << 4, MLX5DV_CONTEXT_MASK_CLOCK_INFO_UPDATE = 1 << 5, MLX5DV_CONTEXT_MASK_FLOW_ACTION_FLAGS = 1 << 6, MLX5DV_CONTEXT_MASK_DC_ODP_CAPS = 1 << 7, MLX5DV_CONTEXT_MASK_HCA_CORE_CLOCK = 1 << 8, MLX5DV_CONTEXT_MASK_NUM_LAG_PORTS = 1 << 9, MLX5DV_CONTEXT_MASK_SIGNATURE_OFFLOAD = 1 << 10, MLX5DV_CONTEXT_MASK_DCI_STREAMS = 1 << 11, MLX5DV_CONTEXT_MASK_WR_MEMCPY_LENGTH = 1 << 12, MLX5DV_CONTEXT_MASK_CRYPTO_OFFLOAD = 1 << 13, MLX5DV_CONTEXT_MASK_MAX_DC_RD_ATOM = 1 << 14, MLX5DV_CONTEXT_MASK_REG_C0 = 1 << 15, MLX5DV_CONTEXT_MASK_OOO_RECV_WRS = 1 << 16, .in -8 }; .PP .nf enum enum mlx5dv_sw_parsing_offloads { .in +8 MLX5DV_SW_PARSING = 1 << 0, MLX5DV_SW_PARSING_CSUM = 1 << 1, MLX5DV_SW_PARSING_LSO = 1 << 2, .in -8 }; .PP .nf enum mlx5dv_tunnel_offloads { .in +8 MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_VXLAN = 1 << 0, MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GRE = 1 << 1, MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GENEVE = 1 << 2, .in -8 }; .PP .nf enum mlx5dv_flow_action_cap_flags { .in +8 MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM = 1 << 0, /* Flow action ESP (with AES_GCM keymat) is supported */ MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM_REQ_METADATA = 1 << 1, /* Flow action ESP always return metadata in the payload */ MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM_SPI_STEERING = 1 << 2, /* ESP (with AESGCM keymat) Supports matching by SPI (rather than hashing against SPI) */ MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM_FULL_OFFLOAD = 1 << 3, /* Flow action ESP supports full offload (with AES_GCM keymat) */ MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM_TX_IV_IS_ESN = 1 << 4, /* Flow action ESP (with AES_GCM keymat), ESN comes implicitly from IV. */ .in -8 }; .PP .nf struct mlx5dv_sig_caps { .in +8 uint64_t block_size; /* use enum mlx5dv_block_size_caps */ uint32_t block_prot; /* use enum mlx5dv_sig_prot_caps */ uint16_t t10dif_bg; /* use enum mlx5dv_sig_t10dif_bg_caps */ uint16_t crc_type; /* use enum mlx5dv_sig_crc_type_caps */ .in -8 }; enum mlx5dv_sig_prot_caps { .in +8 MLX5DV_SIG_PROT_CAP_T10DIF = 1 << MLX5DV_SIG_TYPE_T10DIF, MLX5DV_SIG_PROT_CAP_CRC = 1 << MLX5DV_SIG_TYPE_CRC, .in -8 }; enum mlx5dv_sig_t10dif_bg_caps { .in +8 MLX5DV_SIG_T10DIF_BG_CAP_CRC = 1 << MLX5DV_SIG_T10DIF_CRC, MLX5DV_SIG_T10DIF_BG_CAP_CSUM = 1 << MLX5DV_SIG_T10DIF_CSUM, .in -8 }; enum mlx5dv_sig_crc_type_caps { .in +8 MLX5DV_SIG_CRC_TYPE_CAP_CRC32 = 1 << MLX5DV_SIG_CRC_TYPE_CRC32, MLX5DV_SIG_CRC_TYPE_CAP_CRC32C = 1 << MLX5DV_SIG_CRC_TYPE_CRC32C, MLX5DV_SIG_CRC_TYPE_CAP_CRC64_XP10 = 1 << MLX5DV_SIG_CRC_TYPE_CRC64_XP10, .in -8 }; enum mlx5dv_block_size_caps { .in +8 MLX5DV_BLOCK_SIZE_CAP_512 = 1 << MLX5DV_BLOCK_SIZE_512, MLX5DV_BLOCK_SIZE_CAP_520 = 1 << MLX5DV_BLOCK_SIZE_520, MLX5DV_BLOCK_SIZE_CAP_4048 = 1 << MLX5DV_BLOCK_SIZE_4048, MLX5DV_BLOCK_SIZE_CAP_4096 = 1 << MLX5DV_BLOCK_SIZE_4096, MLX5DV_BLOCK_SIZE_CAP_4160 = 1 << MLX5DV_BLOCK_SIZE_4160, .in -8 }; .PP .nf struct mlx5dv_crypto_caps { .in +8 /* * if failed_selftests != 0 it means there are some self tests errors * that may render specific crypto engines unusable. Exact code meaning * should be consulted with NVIDIA. 
*/ uint16_t failed_selftests; uint8_t crypto_engines; /* use enum mlx5dv_crypto_engines_caps */ uint8_t wrapped_import_method; /* use enum mlx5dv_crypto_wrapped_import_method_caps */ uint8_t log_max_num_deks; uint32_t flags; /* use enum mlx5dv_crypto_caps_flags */ .in -8 }; /* This bitmap indicates which crypto engines are supported by the device. */ enum mlx5dv_crypto_engines_caps { .in +8 /* Deprecated, replaced by MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_SINGLE_BLOCK */ MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS = 1 << 0, /* * Indicates that AES-XTS only supports encrypting a single block * at a time. */ MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_SINGLE_BLOCK = 1 << 1, /* Indicates that AES-XTS supports multi-block encryption. */ MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_MULTI_BLOCK = 1 << 2, .in -8 }; /* * This bitmap indicates the import method of each crypto engine. * * If a bit is not set, the corresponding crypto engine is in plaintext import method and the * application must use plaintext DEKs for this crypto engine. * * Otherwise, the corresponding crypto engine is in wrapped import method and the application * must use wrapped DEKs for this crypto engine. To load wrapped DEKs the application must perform * crypto login, which in turn requires MLX5DV_CRYPTO_CAPS_WRAPPED_CRYPTO_OPERATIONAL below to be set. */ enum mlx5dv_crypto_wrapped_import_method_caps { .in +8 MLX5DV_CRYPTO_WRAPPED_IMPORT_METHOD_CAP_AES_XTS = 1 << 0, .in -8 }; enum mlx5dv_crypto_caps_flags { .in +8 /* Indicates whether crypto capabilities are enabled on the device. */ MLX5DV_CRYPTO_CAPS_CRYPTO = 1 << 0, /* Indicates whether crypto engines that are in wrapped import method are operational. */ MLX5DV_CRYPTO_CAPS_WRAPPED_CRYPTO_OPERATIONAL = 1 << 1, /* * If set, indicates that after the next FW reset the device will go back to * commissioning mode, meaning that MLX5DV_CRYPTO_CAPS_WRAPPED_CRYPTO_OPERATIONAL * will be set to 0. */ MLX5DV_CRYPTO_CAPS_WRAPPED_CRYPTO_GOING_TO_COMMISSIONING = 1 << 2, .in -8 }; .PP .nf struct mlx5dv_ooo_recv_wrs_caps { .in +8 uint32_t max_rc; uint32_t max_xrc; uint32_t max_dct; uint32_t max_ud; uint32_t max_uc; .in -8 }; .fi .SH "RETURN VALUE" 0 on success or the value of errno on failure (which indicates the failure reason). .SH "NOTES" * Compatibility mask (comp_mask) is in/out field. .SH "SEE ALSO" .BR mlx5dv (7), .BR ibv_query_device (3) .SH "AUTHORS" .TP Leon Romanovsky rdma-core-56.1/providers/mlx5/man/mlx5dv_query_port.3.md000066400000000000000000000055021477342711600232040ustar00rootroot00000000000000--- layout: page title: mlx5dv_query_port section: 3 tagline: Verbs --- # NAME mlx5dv_query_port - Query non standard attributes of IB device port. # SYNOPSIS ```c #include int mlx5dv_query_port(struct ibv_context *context, uint32_t port_num, struct mlx5dv_port *info); ``` # DESCRIPTION Query port info which can be used for some device commands over the DEVX interface and when directly accessing the hardware resources. A function that lets a user query hardware and configuration attributes associated with the port. # USAGE A user should provide the port number to query. On successful query *flags* will store a subset of the requested attributes which are supported/relevant for that port. # ARGUMENTS *context* : RDMA device context to work on. *port_num* : Port number to query. ## *info* : Stores the returned attributes from the kernel. 
```c
struct mlx5dv_port {
        uint64_t flags;
        uint16_t vport;
        uint16_t vport_vhca_id;
        uint16_t esw_owner_vhca_id;
        uint16_t rsvd0;
        uint64_t vport_steering_icm_rx;
        uint64_t vport_steering_icm_tx;
        struct mlx5dv_reg reg_c0;
};
```

*flags*
:   Bit field of attributes; on a successful query *flags* stores the valid filled attributes.

    MLX5DV_QUERY_PORT_VPORT: The vport number of that port.

    MLX5DV_QUERY_PORT_VPORT_VHCA_ID: The VHCA ID of *vport_num*.

    MLX5DV_QUERY_PORT_ESW_OWNER_VHCA_ID: The E-Switch owner of *vport_num*.

    MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_RX: The ICM RX address when directing traffic.

    MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_TX: The ICM TX address when directing traffic.

    MLX5DV_QUERY_PORT_VPORT_REG_C0: Register C0 value used to identify egress of *vport_num*.

*vport*
:   The VPORT number of that port.

*vport_vhca_id*
:   The VHCA ID of *vport_num*.

*rsvd0*
:   A reserved field. Not to be used.

*esw_owner_vhca_id*
:   The E-Switch owner of *vport_num*.

*vport_steering_icm_rx*
:   The ICM RX address when directing traffic.

*vport_steering_icm_tx*
:   The ICM TX address when directing traffic.

## reg_c0
:   Register C0 value used to identify traffic of *vport_num*.

```c
struct mlx5dv_reg {
        uint32_t value;
        uint32_t mask;
};
```

*value*
:   The value that should be used as match.

*mask*
:   The mask that should be used when matching.

# RETURN VALUE

Returns 0 on success, or the value of errno on failure (which indicates the failure reason).

# EXAMPLE

```c
struct mlx5dv_port port_info;
int i, ret;

for (i = 1; i <= ports; i++) {
        ret = mlx5dv_query_port(context, i, &port_info);
        if (ret) {
                printf("Error querying port %d\n", i);
                break;
        }

        printf("Port: %d:\n", i);

        if (port_info.flags & MLX5DV_QUERY_PORT_VPORT)
                printf("\tvport: 0x%x\n", port_info.vport);

        if (port_info.flags & MLX5DV_QUERY_PORT_VPORT_REG_C0)
                printf("\treg_c0: val: 0x%x mask: 0x%x\n",
                       port_info.reg_c0.value,
                       port_info.reg_c0.mask);
}
```

# AUTHOR

Mark Bloch

rdma-core-56.1/providers/mlx5/man/mlx5dv_query_qp_lag_port.3.md

---
layout: page
title: mlx5dv_query_qp_lag_port
section: 3
tagline: Verbs
---

# NAME

mlx5dv_query_qp_lag_port - Query the LAG port information of a given QP

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_query_qp_lag_port(struct ibv_qp *qp,
                             uint8_t *port_num,
                             uint8_t *active_port_num);
```

# DESCRIPTION

This API returns the configured and active port num of a given QP in mlx5 devices.

The active port num indicates the port through which the QP sends traffic in a LAG configuration.

A num_lag_ports field of struct mlx5dv_context greater than 1 means LAG is supported on this device.

# ARGUMENTS

*qp*
:   The ibv_qp object to issue the action on.

*port_num*
:   The configured port num of the QP.

*active_port_num*
:   The current port num of the QP, which may differ from the configured value because of the bonding status.

# RETURN VALUE

0 on success; EOPNOTSUPP if not in LAG mode, or another errno value on other failures.
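# EXAMPLE

A minimal usage sketch (not part of the original page; it assumes *qp* is an existing **ibv_qp** on a device reporting *num_lag_ports* > 1):

```c
uint8_t port_num, active_port_num;
int ret;

ret = mlx5dv_query_qp_lag_port(qp, &port_num, &active_port_num);
if (!ret)
        printf("configured port: %u, active port: %u\n",
               port_num, active_port_num);
else if (ret == EOPNOTSUPP)
        printf("QP is not in LAG mode\n");
```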
# SEE ALSO

*mlx5dv_modify_qp_lag_port(3)*

# AUTHOR

Aharon Landau

rdma-core-56.1/providers/mlx5/man/mlx5dv_reg_dmabuf_mr.3.md

---
layout: page
title: mlx5dv_reg_dmabuf_mr
section: 3
tagline: Verbs
---

# NAME

mlx5dv_reg_dmabuf_mr - Register a dma-buf based memory region (MR)

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct ibv_mr *mlx5dv_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset,
                                    size_t length, uint64_t iova, int fd,
                                    int access, int mlx5_access);
```

# DESCRIPTION

Register a dma-buf based memory region (MR). It follows the functionality of *ibv_reg_dmabuf_mr()*, with the added ability to supply specific mlx5 access flags.

# ARGUMENTS

*pd*
:   The associated protection domain.

*offset*
:   The offset of the dma-buf where the MR starts.

*length*
:   The length of the MR.

*iova*
:   Specifies the virtual base address of the MR when accessed through a lkey or rkey. It must have the same page offset as *offset* and be aligned with the system page size.

*fd*
:   The file descriptor that the dma-buf is identified by.

*access*
:   The desired memory protection attributes; it is either 0 or the bitwise OR of one or more of *enum ibv_access_flags*.

*mlx5_access*
:   Device-specific access flags; either 0 or the flag below.

    *MLX5DV_REG_DMABUF_ACCESS_DATA_DIRECT* if set, this MR will be accessed through the Data Direct engine bonded with that RDMA device.

# RETURN VALUE

Upon success returns a pointer to the registered MR, or NULL if the request fails; in that case the value of errno indicates the failure reason.

# SEE ALSO

*ibv_reg_dmabuf_mr(3)*, *mlx5dv_get_data_direct_sysfs_path(3)*

# AUTHOR

Yishai Hadas

rdma-core-56.1/providers/mlx5/man/mlx5dv_reserved_qpn_alloc.3.md

---
layout: page
title: mlx5dv_reserved_qpn_alloc / dealloc
section: 3
tagline: Verbs
date: 2020-12-29
header: "mlx5 Programmer's Manual"
footer: mlx5
---

# NAME

mlx5dv_reserved_qpn_alloc - Allocate a reserved QP number from the device

mlx5dv_reserved_qpn_dealloc - Release the reserved QP number

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_reserved_qpn_alloc(struct ibv_context *ctx, uint32_t *qpn);

int mlx5dv_reserved_qpn_dealloc(struct ibv_context *ctx, uint32_t qpn);
```

# DESCRIPTION

When working with RDMA_CM (RDMA_TCP_PS) and external QP support, a client node needs GUID-level unique QP numbers to comply with the CM's timewait logic. If a real unique QP is not allocated, a device-global QPN value is required and can be allocated via this interface. The mlx5 DCI QP is such an example: it can connect to a remote DCT multiple times as long as the application provides a unique QPN for each new RDMA_CM connection.

These 2 APIs provide the allocation/deallocation of a unique QP number from/to the device. This QPN can be used as the DC QPN in RDMA_CM connection establishment, which will comply with the CM timewait kernel logic.

# ARGUMENTS

*ctx*
:   The device context to issue the action on.

*qpn*
:   The allocated QP number (for the alloc API), or the QP number to be deallocated (for the dealloc API).

# RETURN VALUE

0 on success; EOPNOTSUPP if not supported, or another errno value on other failures.
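# EXAMPLE

A minimal allocation/release sketch (not part of the original page; the RDMA_CM external-QP wiring is omitted, and *ctx* is assumed to be an opened device context):

```c
uint32_t qpn;
int ret;

ret = mlx5dv_reserved_qpn_alloc(ctx, &qpn);
if (ret)
        return ret;

/* Use qpn as the unique QP number for an RDMA_CM connection that
 * drives an external QP (e.g. a DCI), then release it after the
 * connection is torn down and its timewait period has ended. */

ret = mlx5dv_reserved_qpn_dealloc(ctx, qpn);
```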
# AUTHOR

Mark Zhang

Alex Rosenbaum

rdma-core-56.1/providers/mlx5/man/mlx5dv_sched_node_create.3.md

---
layout: page
title: mlx5dv_sched_node[/leaf]_create / modify / destroy
section: 3
tagline: Verbs
date: 2020-9-3
header: "mlx5 Programmer's Manual"
footer: mlx5
---

# NAME

mlx5dv_sched_node_create - Creates a scheduling node element

mlx5dv_sched_leaf_create - Creates a scheduling leaf element

mlx5dv_sched_node_modify - Modifies a node scheduling element

mlx5dv_sched_leaf_modify - Modifies a leaf scheduling element

mlx5dv_sched_node_destroy - Destroys a node scheduling element

mlx5dv_sched_leaf_destroy - Destroys a leaf scheduling element

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

struct mlx5dv_sched_node *mlx5dv_sched_node_create(struct ibv_context *context,
                                                   struct mlx5dv_sched_attr *sched_attr);

struct mlx5dv_sched_leaf *mlx5dv_sched_leaf_create(struct ibv_context *context,
                                                   struct mlx5dv_sched_attr *sched_attr);

int mlx5dv_sched_node_modify(struct mlx5dv_sched_node *node,
                             struct mlx5dv_sched_attr *sched_attr);

int mlx5dv_sched_leaf_modify(struct mlx5dv_sched_leaf *leaf,
                             struct mlx5dv_sched_attr *sched_attr);

int mlx5dv_sched_node_destroy(struct mlx5dv_sched_node *node);

int mlx5dv_sched_leaf_destroy(struct mlx5dv_sched_leaf *leaf);
```

# DESCRIPTION

The transmit scheduling element (SE) schedules the transmission of all the nodes connected to it. By configuring the SE, QoS policies may be enforced between the competing entities (e.g. SQ, QP).

In each scheduling cycle, the SE schedules all ready-to-transmit entities. The SE assures that the weight of each entity is met. If an entity has reached its maximum allowed bandwidth within the scheduling cycle, it won't be scheduled until the end of the scheduling cycle. The unused transmission bandwidth will be distributed among the remaining entities while preserving the weight setting.

The SEs are connected in a tree structure. The entity is connected to a leaf. One or more leaves can be connected to an SE node. One or more SE nodes can be connected to an SE node, until reaching the SE root. For each input on each node, the user can assign the maximum bandwidth and the scheduling weight.

The SE APIs (mlx5dv_sched_\*) allow a verbs application to set the hierarchical SE tree on the device. The ibv_qp shall be connected to a leaf.

# ARGUMENTS

Please see the *ibv_create_qp_ex(3)* man page for *context*.

## mlx5dv_sched_attr

```c
struct mlx5dv_sched_attr {
        struct mlx5dv_sched_node *parent;
        uint32_t flags;
        uint32_t bw_share;
        uint32_t max_avg_bw;
        uint64_t comp_mask;
};
```

*parent*
:   A node handle to the parent scheduling element which this scheduling element will be connected to. The root scheduling element doesn't have a parent.

*flags*
:   Specifies which attributes in the structure are valid:

    MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE for *bw_share*

    MLX5DV_SCHED_ELEM_ATTR_FLAGS_MAX_AVG_BW for *max_avg_bw*

*bw_share*
:   The relative bandwidth share allocated for this element. This field has no units. The bandwidth is shared between all elements connected to the same parent element, relatively to their bw_share. A value of 0 indicates the device default weight. This field must be 0 for the root TSAR.

*max_avg_bw*
:   The maximal transmission rate allowed for the element, averaged over time. The value is given in units of 1 Mbit/sec. A value of 0x0 indicates the rate is unlimited. This field must be 0 for the root TSAR.

*comp_mask*
:   Reserved for future extension, must be 0 now.

*node/leaf*
:   For modify, destroy: the scheduling element to work on.

*sched_attr*
:   For create, modify: the attribute of the scheduling element to work on.

# NOTES

For example, if an application wants to create 2 QoS QP groups:

```c
g1: 70% bandwidth share of this application
g2: 30% bandwidth share of this application, with maximum average bandwidth limited to 4Gbps
```

Pseudo code:

```c
struct mlx5dv_sched_node *root;
struct mlx5dv_sched_leaf *leaf_g1, *leaf_g2;
struct mlx5dv_sched_attr attr;
struct ibv_qp *qp1, *qp2;

/* Create root node */
attr.comp_mask = 0;
attr.parent = NULL;
attr.flags = 0;
root = mlx5dv_sched_node_create(context, &attr);

/* Create group1 */
attr.comp_mask = 0;
attr.parent = root;
attr.bw_share = 7;
attr.flags = MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE;
leaf_g1 = mlx5dv_sched_leaf_create(context, &attr);

/* Create group2 */
attr.comp_mask = 0;
attr.parent = root;
attr.bw_share = 3;
attr.max_avg_bw = 4096;
attr.flags = MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE |
             MLX5DV_SCHED_ELEM_ATTR_FLAGS_MAX_AVG_BW;
leaf_g2 = mlx5dv_sched_leaf_create(context, &attr);

foreach (qp1 in group1)
        mlx5dv_modify_qp_sched_elem(qp1, leaf_g1, NULL);
foreach (qp2 in group2)
        mlx5dv_modify_qp_sched_elem(qp2, leaf_g2, NULL);
```

# RETURN VALUE

Upon success *mlx5dv_sched_node[/leaf]_create()* will return a new *struct mlx5dv_sched_node[/leaf]*; on error NULL will be returned and errno will be set.

Upon success of modify and destroy, 0 is returned, or the value of errno on a failure.

# SEE ALSO

**ibv_create_qp_ex**(3), **mlx5dv_modify_qp_sched_elem**(3)

# AUTHOR

Mark Zhang

Ariel Almog

rdma-core-56.1/providers/mlx5/man/mlx5dv_ts_to_ns.3

.\" -*- nroff -*-
.\" Licensed under the OpenIB.org (MIT) - See COPYING.md
.\"
.TH MLX5DV_TS_TO_NS 3 2017-11-08 1.0.0
.SH "NAME"
mlx5dv_ts_to_ns \- Convert device timestamp from HCA core clock units to the corresponding nanosecond counts
.SH "SYNOPSIS"
.nf
.B #include <infiniband/mlx5dv.h>
.sp
.BI "uint64_t mlx5dv_ts_to_ns(struct mlx5dv_clock_info *clock_info,
.BI "                         uint64_t device_timestamp);
.fi
.SH "DESCRIPTION"
.B mlx5dv_ts_to_ns(3)
Converts a host byte order
.I device_timestamp
from HCA core clock units into the corresponding nanosecond wallclock time.
.PP
\fBstruct mlx5dv_clock_info\fR can be retrieved using \fBmlx5dv_get_clock_info\fR(3).
.PP
The greater the difference between the device reporting a timestamp and the last mlx5dv_clock_info update, the greater the inaccuracy of the clock time conversion.
.fi
.SH "RETURN VALUE"
Timestamp in nanoseconds
.SH "SEE ALSO"
.BR mlx5dv (7),
.BR mlx5dv_get_clock_info (3),
.BR mlx5dv_query_device (3)
.SH "AUTHORS"
.TP
Feras Daoud

rdma-core-56.1/providers/mlx5/man/mlx5dv_vfio_get_events_fd.3.md

---
layout: page
title: mlx5dv_vfio_get_events_fd
section: 3
tagline: Verbs
---

# NAME

mlx5dv_vfio_get_events_fd - Get the file descriptor to manage driver events.

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_vfio_get_events_fd(struct ibv_context *ctx);
```

# DESCRIPTION

Returns the file descriptor to be used for managing driver events.

# ARGUMENTS

*ctx*
:   Device context that was opened for VFIO by calling mlx5dv_get_vfio_device_list().

# RETURN VALUE

Returns the internal matching file descriptor.

# NOTES

Client code should poll the returned file descriptor and, once there is some data to be managed, immediately call *mlx5dv_vfio_process_events()*.
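# EXAMPLE

A minimal event-loop sketch (not part of the original page; *ctx* is assumed to have been obtained via mlx5dv_get_vfio_device_list(), and error handling is trimmed):

```c
#include <poll.h>

struct pollfd pfd = {
        .fd = mlx5dv_vfio_get_events_fd(ctx),
        .events = POLLIN,
};

/* Block until driver events are pending, then let the
 * driver process them. */
while (poll(&pfd, 1, -1) >= 0) {
        if (pfd.revents & POLLIN)
                mlx5dv_vfio_process_events(ctx);
}
```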
# SEE ALSO

*ibv_open_device(3)* *ibv_free_device_list(3)* *mlx5dv_get_vfio_device_list(3)*

# AUTHOR

Yishai Hadas

rdma-core-56.1/providers/mlx5/man/mlx5dv_vfio_process_events.3.md

---
layout: page
title: mlx5dv_vfio_process_events
section: 3
tagline: Verbs
---

# NAME

mlx5dv_vfio_process_events - process vfio driver events

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

int mlx5dv_vfio_process_events(struct ibv_context *ctx);
```

# DESCRIPTION

This API should run from an application thread and maintains the device events. The application is responsible for getting the events FD by calling *mlx5dv_vfio_get_events_fd()* and, once the FD is pollable, calling this API to let the driver process its internal events.

# ARGUMENTS

*ctx*
:   Device context that was opened for VFIO by calling mlx5dv_get_vfio_device_list().

# RETURN VALUE

Returns 0 upon success, or an errno value in case a failure has occurred.

# NOTES

The application can also use this API to periodically check the device health state, even if no events exist.

# SEE ALSO

*ibv_open_device(3)* *ibv_free_device_list(3)* *mlx5dv_get_vfio_device_list(3)* *mlx5dv_vfio_get_events_fd(3)*

# AUTHOR

Yishai Hadas

rdma-core-56.1/providers/mlx5/man/mlx5dv_wr_mkey_configure.3.md

---
layout: page
title: mlx5dv_wr_mkey_configure
section: 3
tagline: Verbs
---

# NAME

mlx5dv_wr_mkey_configure - Create a work request to configure an MKEY

mlx5dv_wr_set_mkey_access_flags - Set the memory protection attributes for an MKEY

mlx5dv_wr_set_mkey_layout_list - Set a memory layout for an MKEY based on an SGE list

mlx5dv_wr_set_mkey_layout_interleaved - Set an interleaved memory layout for an MKEY

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

static inline void mlx5dv_wr_mkey_configure(struct mlx5dv_qp_ex *mqp,
                                            struct mlx5dv_mkey *mkey,
                                            uint8_t num_setters,
                                            struct mlx5dv_mkey_conf_attr *attr);

static inline void mlx5dv_wr_set_mkey_access_flags(struct mlx5dv_qp_ex *mqp,
                                                   uint32_t access_flags);

static inline void mlx5dv_wr_set_mkey_layout_list(struct mlx5dv_qp_ex *mqp,
                                                  uint16_t num_sges,
                                                  const struct ibv_sge *sge);

static inline void mlx5dv_wr_set_mkey_layout_interleaved(struct mlx5dv_qp_ex *mqp,
                                                         uint32_t repeat_count,
                                                         uint16_t num_interleaved,
                                                         const struct mlx5dv_mr_interleaved *data);
```

# DESCRIPTION

The MLX5DV MKEY configure API and the related setters (mlx5dv_wr_set_mkey\*) are an extension of the IBV work request API (ibv_wr\*) with specific features for the MLX5DV MKEY.

MKEYs allow the creation of virtually-contiguous address spaces out of non-contiguous chunks of memory regions already registered with the hardware. Additionally, they provide access to some advanced hardware offload features, e.g. signature offload.

These APIs are intended to be used to access additional functionality beyond what is provided by **mlx5dv_wr_mr_list**() and **mlx5dv_wr_mr_interleaved**(). The MKEY features can be optionally enabled using the MKEY configure setters, which allows using different features in the same MKEY.

# USAGE

To use these APIs, a QP must be created using **mlx5dv_create_qp**(3), which allows setting **MLX5DV_QP_EX_WITH_MKEY_CONFIGURE** in **send_ops_flags**.

The MKEY configuration work request is created by calling **mlx5dv_wr_mkey_configure**(), a WR builder function, followed by the setter functions. *num_setters* is the number of setters that follow for the WR. All setters are optional, and *num_setters* can be zero to apply *attr* only. Each setter can be called only once per WR builder.
The WR configures *mkey* and applies *attr* of the builder function and the setter functions' arguments to it. If *mkey* is already configured, the WR overrides some *mkey* properties, depending on the builder and setter functions' arguments (see details in the setters' description). To clear the configuration of *mkey*, use **ibv_post_send**() with the **IBV_WR_LOCAL_INV** opcode or **ibv_wr_local_inv**().

The current implementation requires the **IBV_SEND_INLINE** option to be set in the **wr_flags** field of the **ibv_qp_ex** structure prior to the builder function call. Non-inline payload is currently not supported by this API. Please note that inlining here is done for the MKEY configuration data, not for user data referenced by data layouts.

Once the MKEY is configured, it may be used in subsequent work requests (SEND, RDMA_READ, RDMA_WRITE, etc.). If these work requests are posted on the same QP, there is no need to wait for completion of the MKEY configuration work request. They can be posted immediately after the last setter (or the builder if there are no setters). Usually there is no need to even request a completion for the MKEY configuration work request.

If a completion is requested for an MKEY configuration work request, it will be delivered with the **IBV_WC_DRIVER1** opcode.

## Builder function

**mlx5dv_wr_mkey_configure()**
:   Post a work request to configure an existing MKEY. With this call alone, it is possible to configure the MKEY and keep or reset signature attributes. This call may be followed by zero or more optional setters.

*mqp*
:   The QP to post the work request on.

*mkey*
:   The MKEY to configure.

*num_setters*
:   The number of setters that must be called after this function.

*attr*
:   The MKEY configuration attributes.

## MKEY configuration attributes

MKEY configuration attributes are provided in the **mlx5dv_mkey_conf_attr** structure.

```c
struct mlx5dv_mkey_conf_attr {
        uint32_t conf_flags;
        uint64_t comp_mask;
};
```

*conf_flags*
:   Bitwise OR of the following flags:

    **MLX5DV_MKEY_CONF_FLAG_RESET_SIG_ATTR**
    :   Reset the signature attributes of the MKEY. If not set, previously configured signature attributes will be kept.

*comp_mask*
:   Reserved for future extension, must be 0 now.

## Generic setters

**mlx5dv_wr_set_mkey_access_flags()**
:   Set the memory protection attributes for the MKEY. If the MKEY is configured, the setter overrides the previous value. For example, suppose two MKEY configuration WRs are posted: the first one sets **IBV_ACCESS_REMOTE_READ** and the second one sets **IBV_ACCESS_REMOTE_WRITE**. In this case, the second WR overrides the memory protection attributes, and only **IBV_ACCESS_REMOTE_WRITE** is allowed for the MKEY when the WR is completed.

*mqp*
:   The QP where an MKEY configuration work request was created by **mlx5dv_wr_mkey_configure()**.

*access_flags*
:   The desired memory protection attributes; either 0 or the bitwise OR of one or more of the flags in **enum ibv_access_flags**.

## Data layout setters

Data layout setters define how data referenced by the MKEY will be scattered/gathered in memory. In order to use an MKEY with RDMA operations, it must be configured with a layout.

No more than one data layout setter may follow a builder function. The layout can be updated in subsequent calls to the builder function.

When the MKEY is used in RDMA operations, it should be used in a zero-based mode, i.e. the **addr** field in the **ibv_sge** structure is an offset in the total data.

**mlx5dv_wr_set_mkey_layout_list()**
:   Set a memory layout for an MKEY based on an SGE list.
    If the MKEY is configured and the data layout was defined by some data layout setter (not necessarily this one), the setter overrides the previous value.

    The default WQE size can fit only 4 SGE entries. To allow more, the QP should be created with a larger WQE size that can fit them. This should be done using the **max_inline_data** attribute of **struct ibv_qp_cap** upon QP creation.

*mqp*
:   The QP where an MKEY configuration work request was created by **mlx5dv_wr_mkey_configure()**.

*num_sges*
:   Number of SGEs in the list.

*sge*
:   Pointer to the list of **ibv_sge** structures.

**mlx5dv_wr_set_mkey_layout_interleaved()**
:   Set an interleaved memory layout for an MKEY.

    If the MKEY is configured and the data layout was defined by some data layout setter (not necessarily this one), the setter overrides the previous value.

    The default WQE size can fit only 3 interleaved entries. To allow more, the QP should be created with a larger WQE size that can fit them. This should be done using the **max_inline_data** attribute of **struct ibv_qp_cap** upon QP creation.

    As one entry will be consumed for the strided header, the MKEY should be created with one more entry than the required *num_interleaved*.

*mqp*
:   The QP where an MKEY configuration work request was created by **mlx5dv_wr_mkey_configure()**.

*repeat_count*
:   The *data* layout representation is repeated *repeat_count* times.

*num_interleaved*
:   Number of entries in the *data* representation.

*data*
:   Pointer to the list of interleaved data layout descriptions.

Interleaved data layout is described by the **mlx5dv_mr_interleaved** structure.

```c
struct mlx5dv_mr_interleaved {
        uint64_t addr;
        uint32_t bytes_count;
        uint32_t bytes_skip;
        uint32_t lkey;
};
```

*addr*
:   Start address of the local memory buffer.

*bytes_count*
:   Number of data bytes to put into the buffer.

*bytes_skip*
:   Number of bytes to skip in the buffer before the next data block.

*lkey*
:   Key of the local Memory Region.

## Signature setters

The signature attributes of the MKEY allow adding/modifying/stripping/validating integrity fields when transmitting data from memory to network and when receiving data from network to memory.

Use the signature setters to set/update the signature attributes of the MKEY. To reset the signature attributes without invalidating the MKEY, use the **MLX5DV_MKEY_CONF_FLAG_RESET_SIG_ATTR** flag.

**mlx5dv_wr_set_mkey_sig_block**()
:   Set MKEY block signature attributes. If the MKEY is already configured with the signature attributes, the setter overrides the previous value. See the dedicated man page for **mlx5dv_wr_set_mkey_sig_block**(3).

## Crypto setter

The crypto attributes of the MKey allow encryption and decryption of transmitted data from memory to network and when receiving data from network to memory.

Use the crypto setter to set/update the crypto attributes of the MKey. When the MKey is created with **MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO**, it must be configured with crypto attributes before the MKey can be used.

**mlx5dv_wr_set_mkey_crypto()**
:   Set MKey crypto attributes. If the MKey is already configured with crypto attributes, the setter overrides the previous value. See the dedicated man page for **mlx5dv_wr_set_mkey_crypto**(3).

# EXAMPLES

## Create QP and MKEY

The code below creates a QP with MKEY configure operation support and an indirect MKEY.
```c
/* Create QP with MKEY configure support */
struct ibv_qp_init_attr_ex attr_ex = {};
attr_ex.comp_mask |= IBV_QP_INIT_ATTR_SEND_OPS_FLAGS;
attr_ex.send_ops_flags |= IBV_QP_EX_WITH_RDMA_WRITE;

struct mlx5dv_qp_init_attr attr_dv = {};
attr_dv.comp_mask |= MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS;
attr_dv.send_ops_flags = MLX5DV_QP_EX_WITH_MKEY_CONFIGURE;

struct ibv_qp *qp = mlx5dv_create_qp(ctx, &attr_ex, &attr_dv);
struct ibv_qp_ex *qpx = ibv_qp_to_qp_ex(qp);
struct mlx5dv_qp_ex *mqpx = mlx5dv_qp_ex_from_ibv_qp_ex(qpx);

/* Create an indirect MKEY on the application's PD */
struct mlx5dv_mkey_init_attr mkey_attr = {};
mkey_attr.pd = pd;
mkey_attr.max_entries = 4;
mkey_attr.create_flags = MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT;
struct mlx5dv_mkey *mkey = mlx5dv_create_mkey(&mkey_attr);
```

## List data layout configuration

The code below configures an MKEY which allows remote access for read and write and is based on an SGE list layout with two entries. When this MKEY is used in an RDMA write operation, data will be scattered between two memory regions. The first 64 bytes will go to memory referenced by **mr1**. The next 4096 bytes will go to memory referenced by **mr2**.

```c
ibv_wr_start(qpx);
qpx->wr_id = my_wr_id_1;
qpx->wr_flags = IBV_SEND_INLINE;

struct mlx5dv_mkey_conf_attr mkey_attr = {};
mlx5dv_wr_mkey_configure(mqpx, mkey, 2, &mkey_attr);
mlx5dv_wr_set_mkey_access_flags(mqpx, IBV_ACCESS_REMOTE_READ |
                                      IBV_ACCESS_REMOTE_WRITE);

struct ibv_sge sgl[2];
sgl[0].addr = (uint64_t)(uintptr_t)mr1->addr;
sgl[0].length = 64;
sgl[0].lkey = mr1->lkey;
sgl[1].addr = (uint64_t)(uintptr_t)mr2->addr;
sgl[1].length = 4096;
sgl[1].lkey = mr2->lkey;
mlx5dv_wr_set_mkey_layout_list(mqpx, 2, sgl);

ret = ibv_wr_complete(qpx);
```

## Interleaved data layout configuration

The code below configures an MKEY which allows remote access for read and write and is based on an interleaved data layout with two entries and a repeat count of two. When this MKEY is used in an RDMA write operation, data will be scattered between two memory regions. The first 512 bytes will go to memory referenced by **mr1** at offset 0. The next 8 bytes will go to memory referenced by **mr2** at offset 0. The next 512 bytes will go to memory referenced by **mr1** at offset 516. The next 8 bytes will go to memory referenced by **mr2** at offset 8.

```c
ibv_wr_start(qpx);
qpx->wr_id = my_wr_id_1;
qpx->wr_flags = IBV_SEND_INLINE;

struct mlx5dv_mkey_conf_attr mkey_attr = {};
mlx5dv_wr_mkey_configure(mqpx, mkey, 2, &mkey_attr);
mlx5dv_wr_set_mkey_access_flags(mqpx, IBV_ACCESS_REMOTE_READ |
                                      IBV_ACCESS_REMOTE_WRITE);

struct mlx5dv_mr_interleaved data[2];
data[0].addr = (uint64_t)(uintptr_t)mr1->addr;
data[0].bytes_count = 512;
data[0].bytes_skip = 4;
data[0].lkey = mr1->lkey;
data[1].addr = (uint64_t)(uintptr_t)mr2->addr;
data[1].bytes_count = 8;
data[1].bytes_skip = 0;
data[1].lkey = mr2->lkey;
mlx5dv_wr_set_mkey_layout_interleaved(mqpx, 2, 2, data);

ret = ibv_wr_complete(qpx);
```

# NOTES

A DEVX context should be opened by using **mlx5dv_open_device**(3).
# SEE ALSO

**mlx5dv_create_mkey**(3), **mlx5dv_create_qp**(3), **mlx5dv_wr_set_mkey_sig_block**(3), **mlx5dv_wr_set_mkey_crypto**(3)

# AUTHORS

Oren Duer

Sergey Gorenko

Evgenii Kochetov

rdma-core-56.1/providers/mlx5/man/mlx5dv_wr_post.3.md

---
date: 2019-02-24
footer: mlx5
header: "mlx5 Programmer's Manual"
tagline: Verbs
layout: page
license: 'Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md'
section: 3
title: MLX5DV_WR
---

# NAME

mlx5dv_wr_set_dc_addr - Attach DC info to the last work request

mlx5dv_wr_raw_wqe - Build a raw work request

mlx5dv_wr_memcpy - Build a DMA memcpy work request

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

static inline void mlx5dv_wr_set_dc_addr(struct mlx5dv_qp_ex *mqp,
                                         struct ibv_ah *ah,
                                         uint32_t remote_dctn,
                                         uint64_t remote_dc_key);

static inline void mlx5dv_wr_set_dc_addr_stream(struct mlx5dv_qp_ex *mqp,
                                                struct ibv_ah *ah,
                                                uint32_t remote_dctn,
                                                uint64_t remote_dc_key,
                                                uint16_t stream_id);

struct mlx5dv_mr_interleaved {
        uint64_t addr;
        uint32_t bytes_count;
        uint32_t bytes_skip;
        uint32_t lkey;
};

static inline void mlx5dv_wr_mr_interleaved(struct mlx5dv_qp_ex *mqp,
                                            struct mlx5dv_mkey *mkey,
                                            uint32_t access_flags, /* use enum ibv_access_flags */
                                            uint32_t repeat_count,
                                            uint16_t num_interleaved,
                                            struct mlx5dv_mr_interleaved *data);

static inline void mlx5dv_wr_mr_list(struct mlx5dv_qp_ex *mqp,
                                     struct mlx5dv_mkey *mkey,
                                     uint32_t access_flags, /* use enum ibv_access_flags */
                                     uint16_t num_sges,
                                     struct ibv_sge *sge);

static inline int mlx5dv_wr_raw_wqe(struct mlx5dv_qp_ex *mqp, const void *wqe);

static inline void mlx5dv_wr_memcpy(struct mlx5dv_qp_ex *mqp_ex,
                                    uint32_t dest_lkey, uint64_t dest_addr,
                                    uint32_t src_lkey, uint64_t src_addr,
                                    size_t length);
```

# DESCRIPTION

The MLX5DV work request APIs (mlx5dv_wr_\*) are an extension of the IBV work request API (ibv_wr_\*) with mlx5-specific features for send work requests. They may be used together with or without ibv_wr_* calls.

# USAGE

To use these APIs, a QP must be created using mlx5dv_create_qp() with *send_ops_flags* of struct ibv_qp_init_attr_ex set.

If the QP does not support all the requested work request types, then QP creation will fail.

The mlx5dv_qp_ex is extracted from the IBV_QP by ibv_qp_to_qp_ex() and mlx5dv_qp_ex_from_ibv_qp_ex(). It should be used to apply the mlx5-specific features on the posted WR.

Creating a work request requires using ibv_qp_ex as described in the ibv_wr_post(3) man page, and mlx5dv_qp_ex with its available builders and setters.

## QP Specific builders

*RC* QPs
:   *mlx5dv_wr_mr_interleaved()*

    registers an interleaved memory layout by using an indirect MKEY and some interleaved data. The layout of the memory pointed to by the MKEY after its registration will be the *data* representation for the *num_interleaved* entries. This single layout representation is repeated *repeat_count* times.

    The *data*, as described by struct mlx5dv_mr_interleaved, will hold real data defined by *bytes_count* and then a padding of *bytes_skip*.

    After a successful registration, RDMA operations can use this *mkey*. The hardware will scatter the data according to the pattern. The *mkey* should be used in a zero-based mode. The *addr* field in its *ibv_sge* is an offset in the total data.

    To create this *mkey*, mlx5dv_create_mkey() should be used.

    The current implementation requires the IBV_SEND_INLINE option to be on in the *ibv_qp_ex->wr_flags* field.
    To be able to have more than 3 *num_interleaved* entries, the QP should be created with a larger WQE size that can fit them. This should be done using the *max_inline_data* attribute of *struct ibv_qp_cap* upon its creation.

    As one entry will be consumed for the strided header, the *mkey* should be created with one more entry than the required *num_interleaved*.

    In case *ibv_qp_ex->wr_flags* turns on IBV_SEND_SIGNALED, the reported WC opcode will be MLX5DV_WC_UMR. Unregistering the *mkey*, to enable another pattern registration, should be done via ibv_post_send with the IBV_WR_LOCAL_INV opcode.

:   *mlx5dv_wr_mr_list()*

    registers a memory layout based on a list of ibv_sge. The layout of the memory pointed to by the *mkey* after its registration will be based on the list of *sge* counted by *num_sges*. After a successful registration, RDMA operations can use this *mkey*; the hardware will scatter the data according to the pattern. The *mkey* should be used in a zero-based mode; the *addr* field in its *ibv_sge* is an offset in the total data.

    The current implementation requires the IBV_SEND_INLINE option to be on in the *ibv_qp_ex->wr_flags* field.

    To be able to have more than 4 *num_sge* entries, the QP should be created with a larger WQE size that can fit them. This should be done using the *max_inline_data* attribute of *struct ibv_qp_cap* upon its creation.

    In case *ibv_qp_ex->wr_flags* turns on IBV_SEND_SIGNALED, the reported WC opcode will be MLX5DV_WC_UMR. Unregistering the *mkey*, to enable another pattern registration, should be done via ibv_post_send with the IBV_WR_LOCAL_INV opcode.

*RC* or *DCI* QPs
:   *mlx5dv_wr_memcpy()*

    Builds a DMA memcpy work request to copy data of length *length* from *src_addr* to *dest_addr*. The copy operation will be done using the DMA MMO functionality of the device to copy data over the PCI bus.

    The MLX5DV_QP_EX_WITH_MEMCPY flag in *mlx5dv_qp_init_attr.send_ops_flags* needs to be set during QP creation. If the device or QP doesn't support it, then QP creation will fail. The maximum memcpy length that is supported by the device is reported in *mlx5dv_context->max_wr_memcpy_length*. A zero value in *mlx5dv_context->max_wr_memcpy_length* means the device doesn't support memcpy operations.

    The IBV_SEND_FENCE indicator should be used on a following send request that depends on the *dest_addr* of the memcpy operation.

    In case *ibv_qp_ex->wr_flags* turns on IBV_SEND_SIGNALED, the reported WC opcode will be MLX5DV_WC_MEMCPY.

## Raw WQE builders

*mlx5dv_wr_raw_wqe()*
:   It is used to build a custom work request (WQE) and post it on a normal QP. The caller needs to set all details of the WQE (except the "ctrl.wqe_index" and "ctrl.signature" fields, which are the driver's responsibility to set). The MLX5DV_QP_EX_WITH_RAW_WQE flag in mlx5dv_qp_init_attr.send_ops_flags needs to be set. The wr_flags are ignored, as it is the caller's responsibility to set the flags in the WQE.

    No matter what the send opcode is, the work completion opcode for a raw WQE is IBV_WC_DRIVER2.

## QP Specific setters

*DCI* QPs
:   *mlx5dv_wr_set_dc_addr()* must be called to set the DCI WR properties. The destination address of the work is specified by *ah*, the remote DCT number is specified by *remote_dctn* and the DC key is specified by *remote_dc_key*. This setter is available when the QP transport is DCI and send_ops_flags in struct ibv_qp_init_attr_ex is set.

    The available builders and setters for DCI QPs are the same as for RC QPs.
    A DCI QP created with MLX5DV_QP_INIT_ATTR_MASK_DCI_STREAMS can call *mlx5dv_wr_set_dc_addr_stream()* to define the *stream_id* of the operation, allowing the HW to choose one of the multiple concurrent DCI resources. Calls to *mlx5dv_wr_set_dc_addr()* are equivalent to using *stream_id*=0.

# EXAMPLE

```c
/* create DC QP type and specify the required send opcodes */
attr_ex.qp_type = IBV_QPT_DRIVER;
attr_ex.comp_mask |= IBV_QP_INIT_ATTR_SEND_OPS_FLAGS;
attr_ex.send_ops_flags |= IBV_QP_EX_WITH_RDMA_WRITE;

attr_dv.comp_mask |= MLX5DV_QP_INIT_ATTR_MASK_DC;
attr_dv.dc_init_attr.dc_type = MLX5DV_DCTYPE_DCI;

struct ibv_qp *qp = mlx5dv_create_qp(ctx, &attr_ex, &attr_dv);
struct ibv_qp_ex *qpx = ibv_qp_to_qp_ex(qp);
struct mlx5dv_qp_ex *mqpx = mlx5dv_qp_ex_from_ibv_qp_ex(qpx);

ibv_wr_start(qpx);

/* Use the ibv_qp_ex object to set generic WR attributes */
qpx->wr_id = my_wr_id_1;
qpx->wr_flags = IBV_SEND_SIGNALED;
ibv_wr_rdma_write(qpx, rkey, remote_addr_1);
ibv_wr_set_sge(qpx, lkey, local_addr_1, length_1);

/* Use the mlx5 DC setter through the mlx5dv_qp_ex object */
mlx5dv_wr_set_dc_addr(mqpx, ah, remote_dctn, remote_dc_key);

ret = ibv_wr_complete(qpx);
```

# SEE ALSO

**ibv_post_send**(3), **ibv_create_qp_ex(3)**, **ibv_wr_post(3)**, **mlx5dv_create_mkey(3)**.

# AUTHOR

Guy Levi

Mark Zhang

rdma-core-56.1/providers/mlx5/man/mlx5dv_wr_set_mkey_crypto.3.md

---
layout: page
title: mlx5dv_wr_set_mkey_crypto
section: 3
tagline: Verbs
---

# NAME

mlx5dv_wr_set_mkey_crypto - Configure a MKey for crypto operation.

# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

static inline void mlx5dv_wr_set_mkey_crypto(struct mlx5dv_qp_ex *mqp,
                                             const struct mlx5dv_crypto_attr *attr);
```

# DESCRIPTION

Configure a MKey with crypto properties. With this, the device will encrypt/decrypt data when transmitting data from memory to network and when receiving data from network to memory.

In order to configure a MKey with crypto properties, the MKey should be created with **MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO**. A MKey that was created with **MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO** must have crypto properties configured to it before it can be used, i.e. this setter must be called before the MKey can be used, or else traffic will fail, generating a CQE with an error. A call to this setter on a MKey that already has crypto properties configured to it will override the existing crypto properties.

Configuring crypto properties to a MKey is done by specifying the crypto standard that should be used and its attributes, and also by providing the Data Encryption Key (DEK) to be used for the encryption/decryption itself.

The MKey represents a virtually contiguous memory, by configuring a layout to it. The crypto properties of the MKey describe whether data in this virtually contiguous memory is encrypted or in plaintext, and whether it should be encrypted/decrypted before transmitting it or after receiving it. Depending on the actual operation that happens (TX or RX), the device will do the "right thing" based on the crypto properties configured in the MKey.

MKeys can be configured with both crypto and signature properties at the same time by calling both **mlx5dv_wr_set_mkey_crypto**(3) and **mlx5dv_wr_set_mkey_sig_block**(3). In this case, both crypto and signature operations will be performed according to the crypto and signature properties configured in the MKey, and the order of operations will be determined by the *signature_crypto_order* property.
## Example 1 (corresponds to row F in the table below):

Memory signature domain is not configured, and memory data is encrypted.

Wire signature domain is not configured, and wire data is in plaintext.

*encrypt_on_tx* is set to false, and because signature is not configured, the *signature_crypto_order* value doesn't matter.

A SEND is issued using the MKey as a local key.

Result: the device will gather the encrypted data from the MKey (using whatever layout was configured to the MKey to locate the actual memory), decrypt it using the supplied DEK and transmit the decrypted data to the wire.

## Example 1.1:

Same as above, but a RECV is issued with the same MKey, and RX happens.

Result: the device will receive the data from the wire, encrypt it using the supplied DEK and scatter it to the MKey (using whatever layout was configured to the MKey to locate the actual memory).

## Example 2 (corresponds to row C in the table below):

Memory signature domain is configured for no signature, and memory data is in plaintext.

Wire signature domain is configured for T10DIF every 512-byte block, and wire data (including the T10DIF) is encrypted.

*encrypt_on_tx* is set to true and *signature_crypto_order* is set to **MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_BEFORE_CRYPTO_ON_TX**. *data_unit_size* is set to **MLX5DV_BLOCK_SIZE_520**.

The MKey is sent to a remote node that issues an RDMA_READ to this MKey.

Result: the device will gather the data from the MKey (using whatever layout was configured to the MKey to locate the actual memory), generate an additional T10DIF field every 512B of data, encrypt the data and the newly generated T10DIF field using the supplied DEK, and transmit it to the wire.

## Example 2.1:

Same as above, but the remote node issues an RDMA_WRITE to this MKey.

Result: the device will receive the data from the wire, decrypt the data using the supplied DEK, validate each T10DIF field against the previous 512B of data, strip the T10DIF field, and scatter the data alone to the MKey (using whatever layout was configured to the MKey to locate the actual memory).

# ARGUMENTS

*mqp*
:   The QP where an MKey configuration work request was created by **mlx5dv_wr_mkey_configure()**.

*attr*
:   Crypto attributes to set for the MKey.

## Crypto Attributes

Crypto attributes describe the format (encrypted or plaintext) and layout of the input and output data in memory and wire domains, the crypto standard that should be used and its attributes.

```c
struct mlx5dv_crypto_attr {
        enum mlx5dv_crypto_standard crypto_standard;
        bool encrypt_on_tx;
        enum mlx5dv_signature_crypto_order signature_crypto_order;
        enum mlx5dv_block_size data_unit_size;
        char initial_tweak[16];
        struct mlx5dv_dek *dek;
        char keytag[8];
        uint64_t comp_mask;
};
```

*crypto_standard*
:   The encryption standard that should be used; currently it can only be the following value:

    **MLX5DV_CRYPTO_STANDARD_AES_XTS**
    :   The AES-XTS encryption standard defined in IEEE Std 1619-2007.

*encrypt_on_tx*
:   If set, memory data will be encrypted during TX and wire data will be decrypted during RX. If not set, memory data will be decrypted during TX and wire data will be encrypted during RX.

*signature_crypto_order*
:   Controls the order between crypto and signature operations (please see the detailed table below). Relevant only if signature is configured. Can be one of the following values:

    **MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_AFTER_CRYPTO_ON_TX**
    :   During TX, first perform the crypto operation (encrypt/decrypt based on *encrypt_on_tx*) and then the signature operation on memory data.
        During RX, first perform the signature operation and then the crypto operation (encrypt/decrypt based on *encrypt_on_tx*) on wire data.

    **MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_BEFORE_CRYPTO_ON_TX**
    :   During TX, first perform the signature operation and then the crypto operation (encrypt/decrypt based on *encrypt_on_tx*) on memory data.

        During RX, first perform the crypto operation (encrypt/decrypt based on *encrypt_on_tx*) and then the signature operation on wire data.

Table: *signature_crypto_order* and *encrypt_on_tx* Meaning.

The table describes the possible data layouts in memory and wire domains, and the order in which crypto and signature operations are performed according to *signature_crypto_order*, *encrypt_on_tx* and the signature configuration.

The Memory column represents the data layout in the memory domain. The Wire column represents the data layout in the wire domain. There are three possible operations that can be performed by the device on the data when processing it from memory to wire and from wire to memory:

1. Crypto operation.
2. Signature operation in memory domain.
3. Signature operation in wire domain.

The Op1, Op2 and Op3 columns represent these operations. On TX, Op1, Op2 and Op3 are performed on memory data to produce the data layout that is specified in the Wire column. On RX, Op3, Op2 and Op1 are performed on wire data to produce the data layout specified in the Memory column. "SIG.mem" and "SIG.wire" represent the signature operation that is performed in the memory and wire domains respectively. None means no operation is performed. The exact signature operations are determined by the signature attributes configured by **mlx5dv_wr_set_mkey_sig_block()**.

The encrypt_on_tx and signature_crypto_order columns represent the values that *encrypt_on_tx* and *signature_crypto_order* should have in order to achieve such behavior.

|   | Memory         | Op1            | Op2             | Op3             | Wire           | encrypt_on_tx | signature_crypto_order        |
|---|----------------|----------------|-----------------|-----------------|----------------|---------------|-------------------------------|
| A | data           | Encrypt on TX  | SIG.mem = none  | SIG.wire = none | enc(data)      | True          | Doesn't matter                |
| B | data           | Encrypt on TX  | SIG.mem = none  | SIG.wire = SIG  | enc(data)+SIG  | True          | SIGNATURE_AFTER_CRYPTO_ON_TX  |
| C | data           | SIG.mem = none | SIG.wire = SIG  | Encrypt on TX   | enc(data+SIG)  | True          | SIGNATURE_BEFORE_CRYPTO_ON_TX |
| D | data+SIG       | SIG.mem = SIG  | SIG.wire = none | Encrypt on TX   | enc(data)      | True          | SIGNATURE_BEFORE_CRYPTO_ON_TX |
| E | data+SIG1      | SIG.mem = SIG1 | SIG.wire = SIG2 | Encrypt on TX   | enc(data+SIG2) | True          | SIGNATURE_BEFORE_CRYPTO_ON_TX |
| F | enc(data)      | Decrypt on TX  | SIG.mem = none  | SIG.wire = none | data           | False         | Doesn't matter                |
| G | enc(data)      | Decrypt on TX  | SIG.mem = none  | SIG.wire = SIG  | data+SIG       | False         | SIGNATURE_AFTER_CRYPTO_ON_TX  |
| H | enc(data+SIG)  | Decrypt on TX  | SIG.mem = SIG   | SIG.wire = none | data           | False         | SIGNATURE_AFTER_CRYPTO_ON_TX  |
| I | enc(data+SIG1) | Decrypt on TX  | SIG.mem = SIG1  | SIG.wire = SIG2 | data+SIG2      | False         | SIGNATURE_AFTER_CRYPTO_ON_TX  |
| J | enc(data)+SIG  | SIG.mem = SIG  | SIG.wire = none | Decrypt on TX   | data           | False         | SIGNATURE_BEFORE_CRYPTO_ON_TX |

Notes:

- "Encrypt on TX" also means "Decrypt on RX", and "Decrypt on TX" also means "Encrypt on RX".

- When signature properties are not configured in the MKey, only crypto operations will be performed. Thus, *signature_crypto_order* has no meaning in this case (rows A and F), and it can be set to either one of its values.

*data_unit_size*
:   For storage, this will normally be the storage block size. The tweak is incremented after each *data_unit_size* during the encryption. Can be one of **enum mlx5dv_block_size**.

*initial_tweak*
:   A value to be used during encryption of each data unit. Must be supplied in little endian. This value is incremented by the device for every data unit in the message. For storage encryption, this will normally be the LBA of the first block in the message, so that the increments represent the LBAs of the rest of the blocks in the message.

*dek*
:   The DEK to be used for the crypto operations. This DEK must be pre-loaded to the device using **mlx5dv_dek_create()**.

*keytag*
:   A tag that verifies that the correct DEK is being used. *keytag* is optional and is valid only if the DEK was created with **has_keytag** set to true. If so, it must match the key tag that was provided when the DEK was created. Supplied in plaintext.

*comp_mask*
:   Reserved for future extension, must be 0 now.

# RETURN VALUE

This function does not return a value. In case of error, the user will be notified later when completing the DV WRs chain.

# NOTES

The MKey must be created with the **MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO** flag.

The last operation posted on the supplied QP should be **mlx5dv_wr_mkey_configure**(3), or one of its related setters, and the operation must still be open (no doorbell issued).

In case of **ibv_wr_complete()** failure or a call to **ibv_wr_abort()**, the MKey may be left in an unknown state. The next configuration of it should not assume any previous state of the MKey, i.e. signature/crypto should be re-configured or reset, as required. For example, assuming **mlx5dv_wr_set_mkey_sig_block()** and then **ibv_wr_abort()** were called, then on the next configuration of the MKey, if signature is not needed, it should be reset using **MLX5DV_MKEY_CONF_FLAG_RESET_SIG_ATTR**.

When configuring a MKey with AES-XTS crypto offload and using it for traffic (send/receive), the amount of data to send/receive must meet one of the following conditions for a successful encryption/decryption process (per the AES-XTS spec).

Let's refer to the amount of data to send/receive as 'job_size':

1. job_size % *data_unit_size* == 0

2. (job_size % 16 == 0) && (job_size % *data_unit_size* <= *data_unit_size* - 16)

For example:

When *data_unit_size* = 512B:

1. job_size = 512B is valid (1 holds).
2. job_size = 128B is valid (2 holds).
3. job_size = 47B is invalid (neither 1 nor 2 holds).

When *data_unit_size* = 520B:

1. job_size = 520B is valid (1 holds).
2. job_size = 496B is valid (2 holds).
3. job_size = 512B is invalid (neither 1 nor 2 holds).

# SEE ALSO

**mlx5dv_wr_mkey_configure**(3), **mlx5dv_wr_set_mkey_sig_block**(3), **mlx5dv_create_mkey**(3), **mlx5dv_destroy_mkey**(3), **mlx5dv_crypto_login**(3), **mlx5dv_crypto_login_create**(3), **mlx5dv_dek_create**(3)

# AUTHORS

Oren Duer

Avihai Horon

Maher Sanalla

rdma-core-56.1/providers/mlx5/man/mlx5dv_wr_set_mkey_sig_block.3.md

---
layout: page
title: mlx5dv_wr_set_mkey_sig_block
section: 3
tagline: Verbs
---

# NAME

mlx5dv_wr_set_mkey_sig_block - Configure an MKEY for block signature (data integrity) operation.
# SYNOPSIS

```c
#include <infiniband/mlx5dv.h>

static inline void mlx5dv_wr_set_mkey_sig_block(struct mlx5dv_qp_ex *mqp,
                                                const struct mlx5dv_sig_block_attr *attr);
```

# DESCRIPTION

Configure an MKEY with block-level data protection properties. With this, the device can add/modify/strip/validate integrity fields per block when transmitting data from memory to network and when receiving data from network to memory.

This setter can be optionally called after an MKEY configuration work request posting has started using **mlx5dv_wr_mkey_configure**(3).

Configuring block signature properties on an MKEY is done by describing what kind of signature is required (or expected) in two domains: the wire domain and the memory domain.

The MKEY represents a virtually contiguous memory, by configuring a layout to it. The memory signature domain describes whether data in this virtually contiguous memory includes integrity fields, and if so, what kind (**enum mlx5dv_sig_type**) and what block size (**enum mlx5dv_block_size**).

The wire signature domain describes the same kind of properties for the data as it is seen on the wire. Now, depending on the actual operation that happens (TX or RX), the device will do the "right thing" based on the signature configurations of the two domains.

## Example 1:

Memory signature domain is configured for CRC32 every 512B block.

Wire signature domain is configured for no signature.

A SEND is issued using the MKEY as a local key.

Result: the device will gather the data with the CRC32 fields from the MKEY (using whatever layout was configured to the MKEY to locate the actual memory), validate each CRC32 against the previous 512 bytes of data, strip the CRC32 field, and transmit only 512 bytes of data to the wire.

### Example 1.1:

Same as above, but a RECV is issued with the same key, and RX happens.

Result: the device will receive the data from the wire, scatter it to the MKEY (using whatever layout was configured to the MKEY to locate the actual memory), generating and scattering an additional CRC32 field after every 512 bytes that are scattered.

## Example 2:

Memory signature domain is configured for no signature.

Wire signature domain is configured for T10DIF every 4K block.

The MKEY is sent to a remote node that issues an RDMA_READ to this MKEY.

Result: the device will gather the data from the MKEY (using whatever layout was configured to the MKEY to locate the actual memory), and transmit it to the wire while generating an additional T10DIF field every 4K of data.

### Example 2.1:

Same as above, but the remote node issues an RDMA_WRITE to this MKEY.

Result: the device will receive the data from the wire, validate each T10DIF field against the previous 4K of data, strip the T10DIF field, and scatter the data alone to the MKEY (using whatever layout was configured to the MKEY to locate the actual memory).

# ARGUMENTS

*mqp*
:   The QP where an MKEY configuration work request was created by **mlx5dv_wr_mkey_configure()**.

*attr*
:   Block signature attributes to set for the MKEY.

## Block signature attributes

Block signature attributes describe the input and output data structures in memory and wire domains.

```c
struct mlx5dv_sig_block_attr {
        const struct mlx5dv_sig_block_domain *mem;
        const struct mlx5dv_sig_block_domain *wire;
        uint32_t flags;
        uint8_t check_mask;
        uint8_t copy_mask;
        uint64_t comp_mask;
};
```

*mem*
:   A pointer to the signature configuration for the memory domain, or NULL if the domain does not have a signature.

*wire*
:   A pointer to the signature configuration for the wire domain, or NULL if the domain does not have a signature.
*flags*
:   A bitwise OR of the various values described below.

    **MLX5DV_SIG_BLOCK_ATTR_FLAG_COPY_MASK**
    :   If the bit is not set, then *copy_mask* is ignored. See details in
        the *copy_mask* description.

*check_mask*
:   Each bit of *check_mask* corresponds to a byte of the signature field in
    the input domain. A byte of the input signature is checked if the
    corresponding bit in *check_mask* is set. Bits not relevant to the
    signature type are ignored.

Table: Layout of *check_mask*.

| check_mask (bits)    | 7        | 6        | 5        | 4        | 3        | 2        | 1        | 0        |
| -------------------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- | -------- |
| T10-DIF (bytes)      | GUARD[1] | GUARD[0] | APP[1]   | APP[0]   | REF[3]   | REF[2]   | REF[1]   | REF[0]   |
| CRC32C/CRC32 (bytes) | 3        | 2        | 1        | 0        |          |          |          |          |
| CRC64_XP10 (bytes)   | 7        | 6        | 5        | 4        | 3        | 2        | 1        | 0        |

Commonly used masks are defined in **enum mlx5dv_sig_mask**. Other masks are
also supported. Follow the above table to define a custom mask. For example,
this can be useful for the application tag field of the T10DIF signature.
Using the application tag is out of the scope of the T10DIF specification and
depends on the implementation. *check_mask* allows validating a part of the
application tag if needed.

*copy_mask*
:   A mask to specify what part of the signature is copied from the source
    domain to the destination domain. The copy mask is usually calculated
    automatically. The signature is copied if the same signature type is
    configured on both domains. The parts of the T10-DIF are compared and
    handled independently.

    If **MLX5DV_SIG_BLOCK_ATTR_FLAG_COPY_MASK** is set, the *copy_mask*
    attribute overrides the calculated value of the copy mask. Otherwise,
    *copy_mask* is ignored.

    Each bit of *copy_mask* corresponds to a byte of the signature field. If
    the corresponding bit in *copy_mask* is set, the byte of the signature
    field is copied from the input domain to the output domain. Calculation
    according to the output domain configuration is not performed in this
    case. Bits not relevant to the signature type are ignored. *copy_mask*
    may be used only if the input and output domains have the same structure,
    i.e. the same block size and signature type. The MKEY configuration will
    fail if **MLX5DV_SIG_BLOCK_ATTR_FLAG_COPY_MASK** is set but the domains
    have different signature structures.

    The predefined masks are available in **enum mlx5dv_sig_mask**. It is
    also supported to specify a user-defined mask. Follow the table in the
    *check_mask* description to define a custom mask.

    *copy_mask* can be useful when some bytes of the signature are not known
    in advance, hence can't be checked, but shall be preserved. In this case
    the corresponding bits should be cleared in *check_mask* and set in
    *copy_mask*.

*comp_mask*
:   Reserved for future extension, must be 0 now.

## Block signature domain

```c
struct mlx5dv_sig_block_domain {
	enum mlx5dv_sig_type sig_type;
	union {
		const struct mlx5dv_sig_t10dif *dif;
		const struct mlx5dv_sig_crc *crc;
	} sig;
	enum mlx5dv_block_size block_size;
	uint64_t comp_mask;
};
```

*sig_type*
:   The signature type for this domain, one of the following.

    **MLX5DV_SIG_TYPE_T10DIF**
    :   The block-level data protection defined in the T10 specifications
        (T10 SBC-3).

    **MLX5DV_SIG_TYPE_CRC**
    :   The block-level data protection based on a cyclic redundancy check
        (CRC). The specific type of CRC is defined in *sig*.

*sig*
:   Depending on *sig_type*, this is the per signature type specific
    configuration.

*block_size*
:   The block size for this domain, one of **enum mlx5dv_block_size**.
*comp_mask*
:   Reserved for future extension, must be 0 now.

## CRC signature

```c
struct mlx5dv_sig_crc {
	enum mlx5dv_sig_crc_type type;
	uint64_t seed;
};
```

*type*
:   The specific CRC type, one of the following.

    **MLX5DV_SIG_CRC_TYPE_CRC32**
    :   CRC32 signature is created by calculating a 32-bit CRC defined in
        Fibre Channel Physical and Signaling Interface (FC-PH),
        ANSI X3.230:1994.

    **MLX5DV_SIG_CRC_TYPE_CRC32C**
    :   CRC32C signature is created by calculating a 32-bit CRC called the
        Castagnoli CRC, defined in the Internet Small Computer Systems
        Interface (iSCSI) RFC 3720.

    **MLX5DV_SIG_CRC_TYPE_CRC64_XP10**
    :   CRC64_XP10 signature is created by calculating a 64-bit CRC defined
        in the Microsoft XP10 compression standard.

*seed*
:   A seed for the CRC calculation per block. Bits not relevant to the CRC
    type are ignored. For example, all bits are used for CRC64_XP10, but only
    the 32 least significant bits are used for CRC32/CRC32C.

    Only the following values are supported as a seed:
    CRC32/CRC32C - 0, 0xFFFFFFFF (UINT32_MAX);
    CRC64_XP10 - 0, 0xFFFFFFFFFFFFFFFF (UINT64_MAX).

## T10DIF signature

T10DIF signature is defined in the T10 specifications (T10 SBC-3) for
block-level data protection. The size of a data block protected by T10DIF
must be a multiple of 8 bytes, as required in the T10DIF specifications. Note
that when setting the initial LBA value to *ref_tag*, it should be the value
of the first block to be transmitted.

```c
struct mlx5dv_sig_t10dif {
	enum mlx5dv_sig_t10dif_bg_type bg_type;
	uint16_t bg;
	uint16_t app_tag;
	uint32_t ref_tag;
	uint16_t flags;
};
```

*bg_type*
:   The block guard type to be used, one of the following.

    **MLX5DV_SIG_T10DIF_CRC**
    :   Use CRC in the block guard field as required in the T10DIF
        specifications.

    **MLX5DV_SIG_T10DIF_CSUM**
    :   Use IP checksum instead of CRC in the block guard field.

*bg*
:   A seed for the block guard calculation per block. The following values
    are supported as a seed: 0, 0xFFFF (UINT16_MAX).

*app_tag*
:   An application tag to generate or validate.

*ref_tag*
:   A reference tag to generate or validate.

*flags*
:   Flags for the T10DIF attributes, one of the following.

    **MLX5DV_SIG_T10DIF_FLAG_REF_REMAP**
    :   Increment the reference tag per block.

    **MLX5DV_SIG_T10DIF_FLAG_APP_ESCAPE**
    :   Do not check the block guard if the application tag is 0xFFFF.

    **MLX5DV_SIG_T10DIF_FLAG_APP_REF_ESCAPE**
    :   Do not check the block guard if the application tag is 0xFFFF and the
        reference tag is 0xFFFFFFFF.

# RETURN VALUE

This function does not return a value. In case of error, the user will be
notified later when completing the DV WRs chain.

# NOTES

A DEVX context should be opened by using **mlx5dv_open_device**(3).

The MKEY must be created with the
**MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE** flag.

The last operation posted on the supplied QP should be
**mlx5dv_wr_mkey_configure**(3), or one of its related setters, and the
operation must still be open (no doorbell issued).

In case of **ibv_wr_complete()** failure or calling **ibv_wr_abort()**, the
MKey may be left in an unknown state. The next configuration of it should not
assume any previous state of the MKey, i.e. signature/crypto should be
re-configured or reset, as required. For example, assuming
**mlx5dv_wr_set_mkey_sig_block()** and then **ibv_wr_abort()** were called,
then on the next configuration of the MKey, if signature is not needed, it
should be reset using **MLX5DV_MKEY_CONF_FLAG_RESET_SIG_ATTR**.
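# EXAMPLE

The following is a minimal sketch (not taken from the library sources) of how
this setter fits into the **mlx5dv_wr_mkey_configure**(3) flow. It configures
a MKEY with no signature in the memory domain and a CRC32C per 512-byte block
in the wire domain, similar to Example 2 above but with CRC instead of T10DIF.
It assumes *qp* was created via **mlx5dv_create_qp**(3) with the
**MLX5DV_QP_EX_WITH_MKEY_CONFIGURE** send ops flag and that *mkey* and *sge*
were prepared as described in **mlx5dv_wr_mkey_configure**(3); the helper name
*configure_crc32c_mkey* is illustrative only, and error handling is omitted.

```c
#include <infiniband/verbs.h>
#include <infiniband/mlx5dv.h>

/* Sketch: the memory domain carries plain data, the wire domain carries
 * a CRC32C field after every 512 bytes of data. */
static int configure_crc32c_mkey(struct ibv_qp *qp, struct mlx5dv_mkey *mkey,
				 struct ibv_sge *sge)
{
	struct ibv_qp_ex *qpx = ibv_qp_to_qp_ex(qp);
	struct mlx5dv_qp_ex *mqp = mlx5dv_qp_ex_from_ibv_qp_ex(qpx);
	struct mlx5dv_mkey_conf_attr conf_attr = {};
	struct mlx5dv_sig_crc crc = {
		.type = MLX5DV_SIG_CRC_TYPE_CRC32C,
		.seed = 0xFFFFFFFF,	/* one of the two supported seeds */
	};
	struct mlx5dv_sig_block_domain wire = {
		.sig_type = MLX5DV_SIG_TYPE_CRC,
		.sig.crc = &crc,
		.block_size = MLX5DV_BLOCK_SIZE_512,
	};
	struct mlx5dv_sig_block_attr sig_attr = {
		.mem = NULL,		/* no signature in memory */
		.wire = &wire,
		/* Check all four CRC32C bytes of the input domain; 0xf0
		 * follows the CRC32C/CRC32 row of the check_mask table. */
		.check_mask = 0xf0,
	};

	ibv_wr_start(qpx);
	qpx->wr_flags = IBV_SEND_INLINE;
	/* The configure WR is followed by exactly 3 setters and stays
	 * open until ibv_wr_complete() issues the doorbell. */
	mlx5dv_wr_mkey_configure(mqp, mkey, 3, &conf_attr);
	mlx5dv_wr_set_mkey_access_flags(mqp, IBV_ACCESS_LOCAL_WRITE);
	mlx5dv_wr_set_mkey_layout_list(mqp, 1, sge);
	mlx5dv_wr_set_mkey_sig_block(mqp, &sig_attr);
	return ibv_wr_complete(qpx);
}
```

A SEND through this MKEY then transmits each 512-byte block followed by its
generated CRC32C. A failure returned by **ibv_wr_complete()** leaves the MKey
in an unknown state, as described in the NOTES above.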
# SEE ALSO **mlx5dv_wr_mkey_configure**(3), **mlx5dv_create_mkey**(3), **mlx5dv_destroy_mkey**(3) # AUTHORS Oren Duer Sergey Gorenko rdma-core-56.1/providers/mlx5/mlx5-abi.h000066400000000000000000000075501477342711600200320ustar00rootroot00000000000000/* * Copyright (c) 2012 Mellanox Technologies, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef MLX5_ABI_H #define MLX5_ABI_H #include #include #include #include #include "mlx5dv.h" #define MLX5_UVERBS_MIN_ABI_VERSION 1 #define MLX5_UVERBS_MAX_ABI_VERSION 1 enum { MLX5_NUM_NON_FP_BFREGS_PER_UAR = 2, NUM_BFREGS_PER_UAR = 4, MLX5_MAX_UARS = 1 << 8, MLX5_MAX_BFREGS = MLX5_MAX_UARS * MLX5_NUM_NON_FP_BFREGS_PER_UAR, MLX5_DEF_TOT_UUARS = 8 * MLX5_NUM_NON_FP_BFREGS_PER_UAR, MLX5_MED_BFREGS_TSHOLD = 12, }; DECLARE_DRV_CMD(mlx5_alloc_ucontext, IB_USER_VERBS_CMD_GET_CONTEXT, mlx5_ib_alloc_ucontext_req_v2, mlx5_ib_alloc_ucontext_resp); DECLARE_DRV_CMD(mlx5_create_ah, IB_USER_VERBS_CMD_CREATE_AH, empty, mlx5_ib_create_ah_resp); DECLARE_DRV_CMD(mlx5_alloc_pd, IB_USER_VERBS_CMD_ALLOC_PD, empty, mlx5_ib_alloc_pd_resp); DECLARE_DRV_CMD(mlx5_create_cq, IB_USER_VERBS_CMD_CREATE_CQ, mlx5_ib_create_cq, mlx5_ib_create_cq_resp); DECLARE_DRV_CMD(mlx5_create_cq_ex, IB_USER_VERBS_EX_CMD_CREATE_CQ, mlx5_ib_create_cq, mlx5_ib_create_cq_resp); DECLARE_DRV_CMD(mlx5_create_srq, IB_USER_VERBS_CMD_CREATE_SRQ, mlx5_ib_create_srq, mlx5_ib_create_srq_resp); DECLARE_DRV_CMD(mlx5_create_srq_ex, IB_USER_VERBS_CMD_CREATE_XSRQ, mlx5_ib_create_srq, mlx5_ib_create_srq_resp); DECLARE_DRV_CMD(mlx5_create_qp_ex, IB_USER_VERBS_EX_CMD_CREATE_QP, mlx5_ib_create_qp, mlx5_ib_create_qp_resp); DECLARE_DRV_CMD(mlx5_create_qp_ex_rss, IB_USER_VERBS_EX_CMD_CREATE_QP, mlx5_ib_create_qp_rss, mlx5_ib_create_qp_resp); DECLARE_DRV_CMD(mlx5_create_qp, IB_USER_VERBS_CMD_CREATE_QP, mlx5_ib_create_qp, mlx5_ib_create_qp_resp); DECLARE_DRV_CMD(mlx5_create_wq, IB_USER_VERBS_EX_CMD_CREATE_WQ, mlx5_ib_create_wq, mlx5_ib_create_wq_resp); DECLARE_DRV_CMD(mlx5_modify_wq, IB_USER_VERBS_EX_CMD_MODIFY_WQ, mlx5_ib_modify_wq, empty); DECLARE_DRV_CMD(mlx5_create_rwq_ind_table, IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL, empty, empty); DECLARE_DRV_CMD(mlx5_destroy_rwq_ind_table, IB_USER_VERBS_EX_CMD_DESTROY_RWQ_IND_TBL, empty, empty); DECLARE_DRV_CMD(mlx5_resize_cq, 
IB_USER_VERBS_CMD_RESIZE_CQ, mlx5_ib_resize_cq, empty); DECLARE_DRV_CMD(mlx5_query_device_ex, IB_USER_VERBS_EX_CMD_QUERY_DEVICE, empty, mlx5_ib_query_device_resp); DECLARE_DRV_CMD(mlx5_modify_qp_ex, IB_USER_VERBS_EX_CMD_MODIFY_QP, empty, mlx5_ib_modify_qp_resp); struct mlx5_modify_qp { struct ibv_modify_qp_ex ibv_cmd; __u32 comp_mask; struct mlx5_ib_burst_info burst_info; __u32 ece_options; }; #endif /* MLX5_ABI_H */ rdma-core-56.1/providers/mlx5/mlx5.c000066400000000000000000002243541477342711600172770ustar00rootroot00000000000000/* * Copyright (c) 2012 Mellanox Technologies, Inc. All rights reserved. * Copyright (c) 2020 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include "mlx5.h" #include "mlx5-abi.h" #include "wqe.h" #include "mlx5_ifc.h" #include "mlx5_vfio.h" static void mlx5_free_context(struct ibv_context *ibctx); static bool is_mlx5_dev(struct ibv_device *device); #ifndef CPU_OR #define CPU_OR(x, y, z) do {} while (0) #endif #ifndef CPU_EQUAL #define CPU_EQUAL(x, y) 1 #endif #define HCA(v, d) VERBS_PCI_MATCH(PCI_VENDOR_ID_##v, d, NULL) const struct verbs_match_ent mlx5_hca_table[] = { VERBS_DRIVER_ID(RDMA_DRIVER_MLX5), HCA(MELLANOX, 0x1011), /* MT4113 Connect-IB */ HCA(MELLANOX, 0x1012), /* Connect-IB Virtual Function */ HCA(MELLANOX, 0x1013), /* ConnectX-4 */ HCA(MELLANOX, 0x1014), /* ConnectX-4 Virtual Function */ HCA(MELLANOX, 0x1015), /* ConnectX-4LX */ HCA(MELLANOX, 0x1016), /* ConnectX-4LX Virtual Function */ HCA(MELLANOX, 0x1017), /* ConnectX-5, PCIe 3.0 */ HCA(MELLANOX, 0x1018), /* ConnectX-5 Virtual Function */ HCA(MELLANOX, 0x1019), /* ConnectX-5 Ex */ HCA(MELLANOX, 0x101a), /* ConnectX-5 Ex VF */ HCA(MELLANOX, 0x101b), /* ConnectX-6 */ HCA(MELLANOX, 0x101c), /* ConnectX-6 VF */ HCA(MELLANOX, 0x101d), /* ConnectX-6 DX */ HCA(MELLANOX, 0x101e), /* ConnectX family mlx5Gen Virtual Function */ HCA(MELLANOX, 0x101f), /* ConnectX-6 LX */ HCA(MELLANOX, 0x1021), /* ConnectX-7 */ HCA(MELLANOX, 0x1023), /* ConnectX-8 */ HCA(MELLANOX, 0x1025), /* ConnectX-9 */ HCA(MELLANOX, 0xa2d2), /* BlueField integrated ConnectX-5 network controller */ HCA(MELLANOX, 0xa2d3), /* BlueField integrated ConnectX-5 network controller VF */ HCA(MELLANOX, 0xa2d6), /* BlueField-2 integrated ConnectX-6 Dx network controller */ HCA(MELLANOX, 0xa2dc), /* BlueField-3 integrated ConnectX-7 network controller */ HCA(MELLANOX, 0xa2df), /* BlueField-4 integrated ConnectX-8 network controller */ {} }; uint32_t mlx5_debug_mask = 0; int mlx5_freeze_on_error_cqe; static const struct verbs_context_ops mlx5_ctx_common_ops = { .query_port = mlx5_query_port, .alloc_pd = mlx5_alloc_pd, .async_event = mlx5_async_event, .dealloc_pd = mlx5_free_pd, .reg_mr = mlx5_reg_mr, .reg_dmabuf_mr = mlx5_reg_dmabuf_mr, .rereg_mr = mlx5_rereg_mr, .dereg_mr = mlx5_dereg_mr, .alloc_mw = mlx5_alloc_mw, .dealloc_mw = mlx5_dealloc_mw, .bind_mw = mlx5_bind_mw, .create_cq = mlx5_create_cq, .poll_cq = mlx5_poll_cq, .req_notify_cq = mlx5_arm_cq, .cq_event = mlx5_cq_event, .resize_cq = mlx5_resize_cq, .destroy_cq = mlx5_destroy_cq, .create_srq = mlx5_create_srq, .modify_srq = mlx5_modify_srq, .query_srq = mlx5_query_srq, .destroy_srq = mlx5_destroy_srq, .post_srq_recv = mlx5_post_srq_recv, .create_qp = mlx5_create_qp, .query_qp = mlx5_query_qp, .modify_qp = mlx5_modify_qp, .destroy_qp = mlx5_destroy_qp, .post_send = mlx5_post_send, .post_recv = mlx5_post_recv, .create_ah = mlx5_create_ah, .destroy_ah = mlx5_destroy_ah, .attach_mcast = mlx5_attach_mcast, .detach_mcast = mlx5_detach_mcast, .advise_mr = mlx5_advise_mr, .alloc_dm = mlx5_alloc_dm, .alloc_parent_domain = mlx5_alloc_parent_domain, .alloc_td = mlx5_alloc_td, .attach_counters_point_flow = mlx5_attach_counters_point_flow, .close_xrcd = mlx5_close_xrcd, .create_counters = mlx5_create_counters, .create_cq_ex = mlx5_create_cq_ex, .create_flow = mlx5_create_flow, .create_flow_action_esp = mlx5_create_flow_action_esp, .create_qp_ex = mlx5_create_qp_ex, .create_rwq_ind_table = mlx5_create_rwq_ind_table, .create_srq_ex = mlx5_create_srq_ex, .create_wq = mlx5_create_wq, .dealloc_td = mlx5_dealloc_td, .destroy_counters = 
mlx5_destroy_counters, .destroy_flow = mlx5_destroy_flow, .destroy_flow_action = mlx5_destroy_flow_action, .destroy_rwq_ind_table = mlx5_destroy_rwq_ind_table, .destroy_wq = mlx5_destroy_wq, .free_dm = mlx5_free_dm, .get_srq_num = mlx5_get_srq_num, .import_dm = mlx5_import_dm, .import_mr = mlx5_import_mr, .import_pd = mlx5_import_pd, .modify_cq = mlx5_modify_cq, .modify_flow_action_esp = mlx5_modify_flow_action_esp, .modify_qp_rate_limit = mlx5_modify_qp_rate_limit, .modify_wq = mlx5_modify_wq, .open_qp = mlx5_open_qp, .open_xrcd = mlx5_open_xrcd, .post_srq_ops = mlx5_post_srq_ops, .query_device_ex = mlx5_query_device_ex, .query_ece = mlx5_query_ece, .query_rt_values = mlx5_query_rt_values, .read_counters = mlx5_read_counters, .reg_dm_mr = mlx5_reg_dm_mr, .alloc_null_mr = mlx5_alloc_null_mr, .free_context = mlx5_free_context, .set_ece = mlx5_set_ece, .unimport_dm = mlx5_unimport_dm, .unimport_mr = mlx5_unimport_mr, .unimport_pd = mlx5_unimport_pd, .query_qp_data_in_order = mlx5_query_qp_data_in_order, }; static const struct verbs_context_ops mlx5_ctx_cqev1_ops = { .poll_cq = mlx5_poll_cq_v1, }; static int read_number_from_line(const char *line, int *value) { const char *ptr; ptr = strchr(line, ':'); if (!ptr) return 1; ++ptr; *value = atoi(ptr); return 0; } /** * The function looks for the first free user-index in all the * user-index tables. If all are used, returns -1, otherwise * a valid user-index. * In case the reference count of the table is zero, it means the * table is not in use and wasn't allocated yet, therefore the * mlx5_store_uidx allocates the table, and increment the reference * count on the table. */ static int32_t get_free_uidx(struct mlx5_context *ctx) { int32_t tind; int32_t i; for (tind = 0; tind < MLX5_UIDX_TABLE_SIZE; tind++) { if (ctx->uidx_table[tind].refcnt < MLX5_UIDX_TABLE_MASK) break; } if (tind == MLX5_UIDX_TABLE_SIZE) return -1; if (!ctx->uidx_table[tind].refcnt) return tind << MLX5_UIDX_TABLE_SHIFT; for (i = 0; i < MLX5_UIDX_TABLE_MASK + 1; i++) { if (!ctx->uidx_table[tind].table[i]) break; } return (tind << MLX5_UIDX_TABLE_SHIFT) | i; } int mlx5_cmd_status_to_err(uint8_t status) { switch (status) { case MLX5_CMD_STAT_OK: return 0; case MLX5_CMD_STAT_INT_ERR: return EIO; case MLX5_CMD_STAT_BAD_OP_ERR: return EINVAL; case MLX5_CMD_STAT_BAD_PARAM_ERR: return EINVAL; case MLX5_CMD_STAT_BAD_SYS_STATE_ERR: return EIO; case MLX5_CMD_STAT_BAD_RES_ERR: return EINVAL; case MLX5_CMD_STAT_RES_BUSY: return EBUSY; case MLX5_CMD_STAT_LIM_ERR: return ENOMEM; case MLX5_CMD_STAT_BAD_RES_STATE_ERR: return EINVAL; case MLX5_CMD_STAT_IX_ERR: return EINVAL; case MLX5_CMD_STAT_NO_RES_ERR: return EAGAIN; case MLX5_CMD_STAT_BAD_INP_LEN_ERR: return EIO; case MLX5_CMD_STAT_BAD_OUTP_LEN_ERR: return EIO; case MLX5_CMD_STAT_BAD_QP_STATE_ERR: return EINVAL; case MLX5_CMD_STAT_BAD_PKT_ERR: return EINVAL; case MLX5_CMD_STAT_BAD_SIZE_OUTS_CQES_ERR: return EINVAL; default: return EIO; } } int mlx5_get_cmd_status_err(int err, void *out) { if (err == EREMOTEIO) err = mlx5_cmd_status_to_err(DEVX_GET(mbox_out, out, status)); return err; } int32_t mlx5_store_uidx(struct mlx5_context *ctx, void *rsc) { int32_t tind; int32_t ret = -1; int32_t uidx; pthread_mutex_lock(&ctx->uidx_table_mutex); uidx = get_free_uidx(ctx); if (uidx < 0) goto out; tind = uidx >> MLX5_UIDX_TABLE_SHIFT; if (!ctx->uidx_table[tind].refcnt) { ctx->uidx_table[tind].table = calloc(MLX5_UIDX_TABLE_MASK + 1, sizeof(struct mlx5_resource *)); if (!ctx->uidx_table[tind].table) goto out; } ++ctx->uidx_table[tind].refcnt; 
ctx->uidx_table[tind].table[uidx & MLX5_UIDX_TABLE_MASK] = rsc; ret = uidx; out: pthread_mutex_unlock(&ctx->uidx_table_mutex); return ret; } void mlx5_clear_uidx(struct mlx5_context *ctx, uint32_t uidx) { int tind = uidx >> MLX5_UIDX_TABLE_SHIFT; pthread_mutex_lock(&ctx->uidx_table_mutex); if (!--ctx->uidx_table[tind].refcnt) free(ctx->uidx_table[tind].table); else ctx->uidx_table[tind].table[uidx & MLX5_UIDX_TABLE_MASK] = NULL; pthread_mutex_unlock(&ctx->uidx_table_mutex); } struct mlx5_mkey *mlx5_find_mkey(struct mlx5_context *ctx, uint32_t mkey) { int tind = mkey >> MLX5_MKEY_TABLE_SHIFT; if (ctx->mkey_table[tind].refcnt) return ctx->mkey_table[tind].table[mkey & MLX5_MKEY_TABLE_MASK]; else return NULL; } int mlx5_store_mkey(struct mlx5_context *ctx, uint32_t mkey, struct mlx5_mkey *mlx5_mkey) { int tind = mkey >> MLX5_MKEY_TABLE_SHIFT; int ret = 0; pthread_mutex_lock(&ctx->mkey_table_mutex); if (!ctx->mkey_table[tind].refcnt) { ctx->mkey_table[tind].table = calloc(MLX5_MKEY_TABLE_MASK + 1, sizeof(struct mlx5_mkey *)); if (!ctx->mkey_table[tind].table) { ret = -1; goto out; } } ++ctx->mkey_table[tind].refcnt; ctx->mkey_table[tind].table[mkey & MLX5_MKEY_TABLE_MASK] = mlx5_mkey; out: pthread_mutex_unlock(&ctx->mkey_table_mutex); return ret; } void mlx5_clear_mkey(struct mlx5_context *ctx, uint32_t mkey) { int tind = mkey >> MLX5_MKEY_TABLE_SHIFT; pthread_mutex_lock(&ctx->mkey_table_mutex); if (!--ctx->mkey_table[tind].refcnt) free(ctx->mkey_table[tind].table); else ctx->mkey_table[tind].table[mkey & MLX5_MKEY_TABLE_MASK] = NULL; pthread_mutex_unlock(&ctx->mkey_table_mutex); } struct mlx5_psv *mlx5_create_psv(struct ibv_pd *pd) { uint32_t out[DEVX_ST_SZ_DW(create_psv_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(create_psv_in)] = {}; struct mlx5_psv *psv; psv = calloc(1, sizeof(*psv)); if (!psv) { errno = ENOMEM; return NULL; } DEVX_SET(create_psv_in, in, opcode, MLX5_CMD_OP_CREATE_PSV); DEVX_SET(create_psv_in, in, pd, to_mpd(pd)->pdn); DEVX_SET(create_psv_in, in, num_psv, 1); psv->devx_obj = mlx5dv_devx_obj_create(pd->context, in, sizeof(in), out, sizeof(out)); if (!psv->devx_obj) { errno = mlx5_get_cmd_status_err(errno, out); goto err_free_psv; } psv->index = DEVX_GET(create_psv_out, out, psv0_index); return psv; err_free_psv: free(psv); return NULL; } int mlx5_destroy_psv(struct mlx5_psv *psv) { int ret; ret = mlx5dv_devx_obj_destroy(psv->devx_obj); if (!ret) free(psv); return ret; } static int mlx5_is_sandy_bridge(int *num_cores) { char line[128]; FILE *fd; int rc = 0; int cur_cpu_family = -1; int cur_cpu_model = -1; fd = fopen("/proc/cpuinfo", "r"); if (!fd) return 0; *num_cores = 0; while (fgets(line, 128, fd)) { int value; /* if this is information on new processor */ if (!strncmp(line, "processor", 9)) { ++*num_cores; cur_cpu_family = -1; cur_cpu_model = -1; } else if (!strncmp(line, "cpu family", 10)) { if ((cur_cpu_family < 0) && (!read_number_from_line(line, &value))) cur_cpu_family = value; } else if (!strncmp(line, "model", 5)) { if ((cur_cpu_model < 0) && (!read_number_from_line(line, &value))) cur_cpu_model = value; } /* if this is a Sandy Bridge CPU */ if ((cur_cpu_family == 6) && (cur_cpu_model == 0x2A || (cur_cpu_model == 0x2D) )) rc = 1; } fclose(fd); return rc; } /* man cpuset This format displays each 32-bit word in hexadecimal (using ASCII characters "0" - "9" and "a" - "f"); words are filled with leading zeros, if required. For masks longer than one word, a comma separator is used between words. 
Words are displayed in big-endian order, which has the most significant bit first. The hex digits within a word are also in big-endian order. The number of 32-bit words displayed is the minimum number needed to display all bits of the bitmask, based on the size of the bitmask. Examples of the Mask Format: 00000001 # just bit 0 set 40000000,00000000,00000000 # just bit 94 set 000000ff,00000000 # bits 32-39 set 00000000,000E3862 # 1,5,6,11-13,17-19 set A mask with bits 0, 1, 2, 4, 8, 16, 32, and 64 set displays as: 00000001,00000001,00010117 The first "1" is for bit 64, the second for bit 32, the third for bit 16, the fourth for bit 8, the fifth for bit 4, and the "7" is for bits 2, 1, and 0. */ static void mlx5_local_cpu_set(struct ibv_device *ibdev, struct mlx5_context *mctx, cpu_set_t *cpu_set) { char *p, buf[1024] = {}; char *env_value; uint32_t word; int i, k; env_value = getenv("MLX5_LOCAL_CPUS"); if (env_value) strncpy(buf, env_value, sizeof(buf) - 1); else { char fname[MAXPATHLEN]; FILE *fp; snprintf(fname, MAXPATHLEN, "/sys/class/infiniband/%s/device/local_cpus", ibv_get_device_name(ibdev)); fp = fopen(fname, "r"); if (!fp) { mlx5_err(mctx->dbg_fp, PFX "Warning: can not get local cpu set: failed to open %s\n", fname); return; } if (!fgets(buf, sizeof(buf), fp)) { mlx5_err(mctx->dbg_fp, PFX "Warning: can not get local cpu set: failed to read cpu mask\n"); fclose(fp); return; } fclose(fp); } p = strrchr(buf, ','); if (!p) p = buf; i = 0; do { if (*p == ',') { *p = 0; p ++; } word = strtoul(p, NULL, 16); for (k = 0; word; ++k, word >>= 1) if (word & 1) CPU_SET(k+i, cpu_set); if (p == buf) break; p = strrchr(buf, ','); if (!p) p = buf; i += 32; } while (i < CPU_SETSIZE); } static int mlx5_enable_sandy_bridge_fix(struct ibv_device *ibdev, struct mlx5_context *mctx) { cpu_set_t my_cpus, dev_local_cpus, result_set; int stall_enable; int ret; int num_cores; if (!mlx5_is_sandy_bridge(&num_cores)) return 0; /* by default enable stall on sandy bridge arch */ stall_enable = 1; /* * check if app is bound to cpu set that is inside * of device local cpu set. Disable stalling if true */ /* use static cpu set - up to CPU_SETSIZE (1024) cpus/node */ CPU_ZERO(&my_cpus); CPU_ZERO(&dev_local_cpus); CPU_ZERO(&result_set); ret = sched_getaffinity(0, sizeof(my_cpus), &my_cpus); if (ret == -1) { if (errno == EINVAL) mlx5_err(mctx->dbg_fp, PFX "Warning: my cpu set is too small\n"); else mlx5_err(mctx->dbg_fp, PFX "Warning: failed to get my cpu set\n"); goto out; } /* get device local cpu set */ mlx5_local_cpu_set(ibdev, mctx, &dev_local_cpus); /* check if my cpu set is in dev cpu */ CPU_OR(&result_set, &my_cpus, &dev_local_cpus); stall_enable = CPU_EQUAL(&result_set, &dev_local_cpus) ? 0 : 1; out: return stall_enable; } static void mlx5_read_env(struct ibv_device *ibdev, struct mlx5_context *ctx) { char *env_value; env_value = getenv("MLX5_STALL_CQ_POLL"); if (env_value) /* check if cq stall is enforced by user */ ctx->stall_enable = (strcmp(env_value, "0")) ? 
1 : 0; else /* autodetect if we need to do cq polling */ ctx->stall_enable = mlx5_enable_sandy_bridge_fix(ibdev, ctx); env_value = getenv("MLX5_STALL_NUM_LOOP"); if (env_value) mlx5_stall_num_loop = atoi(env_value); env_value = getenv("MLX5_STALL_CQ_POLL_MIN"); if (env_value) mlx5_stall_cq_poll_min = atoi(env_value); env_value = getenv("MLX5_STALL_CQ_POLL_MAX"); if (env_value) mlx5_stall_cq_poll_max = atoi(env_value); env_value = getenv("MLX5_STALL_CQ_INC_STEP"); if (env_value) mlx5_stall_cq_inc_step = atoi(env_value); env_value = getenv("MLX5_STALL_CQ_DEC_STEP"); if (env_value) mlx5_stall_cq_dec_step = atoi(env_value); ctx->stall_adaptive_enable = 0; ctx->stall_cycles = 0; if (mlx5_stall_num_loop < 0) { ctx->stall_adaptive_enable = 1; ctx->stall_cycles = mlx5_stall_cq_poll_min; } } static int get_total_uuars(int page_size) { int size = MLX5_DEF_TOT_UUARS; int uuars_in_page; char *env; env = getenv("MLX5_TOTAL_UUARS"); if (env) size = atoi(env); if (size < 1) return -EINVAL; uuars_in_page = page_size / MLX5_ADAPTER_PAGE_SIZE * MLX5_NUM_NON_FP_BFREGS_PER_UAR; size = max(uuars_in_page, size); size = align(size, MLX5_NUM_NON_FP_BFREGS_PER_UAR); if (size > MLX5_MAX_BFREGS) return -ENOMEM; return size; } void mlx5_open_debug_file(FILE **dbg_fp) { char *env; FILE *default_dbg_fp = NULL; #ifdef MLX5_DEBUG default_dbg_fp = stderr; #endif env = getenv("MLX5_DEBUG_FILE"); if (!env) { *dbg_fp = default_dbg_fp; return; } *dbg_fp = fopen(env, "aw+"); if (!*dbg_fp) { *dbg_fp = default_dbg_fp; mlx5_err(*dbg_fp, "Failed opening debug file %s\n", env); return; } } void mlx5_close_debug_file(FILE *dbg_fp) { if (dbg_fp && dbg_fp != stderr) fclose(dbg_fp); } void mlx5_set_debug_mask(void) { char *env; env = getenv("MLX5_DEBUG_MASK"); if (env) mlx5_debug_mask = strtol(env, NULL, 0); } static void set_freeze_on_error(void) { char *env; env = getenv("MLX5_FREEZE_ON_ERROR_CQE"); if (env) mlx5_freeze_on_error_cqe = strtol(env, NULL, 0); } static int get_always_bf(void) { char *env; env = getenv("MLX5_POST_SEND_PREFER_BF"); if (!env) return 1; return strcmp(env, "0") ? 1 : 0; } static int get_shut_up_bf(void) { char *env; env = getenv("MLX5_SHUT_UP_BF"); if (!env) return 0; return strcmp(env, "0") ? 1 : 0; } static int get_num_low_lat_uuars(int tot_uuars) { char *env; int num = 4; env = getenv("MLX5_NUM_LOW_LAT_UUARS"); if (env) num = atoi(env); if (num < 0) return -EINVAL; num = max(num, tot_uuars - MLX5_MED_BFREGS_TSHOLD); return num; } /* The library allocates an array of uuar contexts. The one in index zero does * not to execersize odd/even policy so it can avoid a lock but it may not use * blue flame. The upper ones, low_lat_uuars can use blue flame with no lock * since they are assigned to one QP only. The rest can use blue flame but since * they are shared they need a lock */ static int need_uuar_lock(struct mlx5_context *ctx, int uuarn) { int i; if (uuarn == 0 || mlx5_single_threaded) return 0; i = (uuarn / 2) + (uuarn % 2); if (i >= ctx->tot_uuars - ctx->low_lat_uuars) return 0; return 1; } static int single_threaded_app(void) { char *env; env = getenv("MLX5_SINGLE_THREADED"); if (env) return strcmp(env, "1") ? 
0 : 1; return 0; } static int mlx5_cmd_get_context(struct mlx5_context *context, struct mlx5_alloc_ucontext *req, size_t req_len, struct mlx5_alloc_ucontext_resp *resp, size_t resp_len) { struct verbs_context *verbs_ctx = &context->ibv_ctx; if (!ibv_cmd_get_context(verbs_ctx, &req->ibv_cmd, req_len, &resp->ibv_resp, resp_len)) return 0; /* The ibv_cmd_get_context fails in older kernels when passing * a request length that the kernel doesn't know. * To avoid breaking compatibility of new libmlx5 and older * kernels, when ibv_cmd_get_context fails with the full * request length, we try once again with the legacy length. * We repeat this process while reducing requested size based * on the feature input size. To avoid this in the future, we * will remove the check in kernel that requires fields unknown * to the kernel to be cleared. This will require that any new * feature that involves extending struct mlx5_alloc_ucontext * will be accompanied by an indication in the form of one or * more fields in struct mlx5_alloc_ucontext_resp. If the * response value can be interpreted as feature not supported * when the returned value is zero, this will suffice to * indicate to the library that the request was ignored by the * kernel, either because it is unaware or because it decided * to do so. If zero is a valid response, we will add a new * field that indicates whether the request was handled. */ if (!ibv_cmd_get_context(verbs_ctx, &req->ibv_cmd, offsetof(struct mlx5_alloc_ucontext, lib_caps), &resp->ibv_resp, resp_len)) return 0; return ibv_cmd_get_context(verbs_ctx, &req->ibv_cmd, offsetof(struct mlx5_alloc_ucontext, max_cqe_version), &resp->ibv_resp, resp_len); } static int mlx5_map_internal_clock(struct mlx5_device *mdev, struct ibv_context *ibv_ctx) { struct mlx5_context *context = to_mctx(ibv_ctx); void *hca_clock_page; off_t offset = 0; set_command(MLX5_IB_MMAP_CORE_CLOCK, &offset); hca_clock_page = mmap(NULL, mdev->page_size, PROT_READ, MAP_SHARED, ibv_ctx->cmd_fd, mdev->page_size * offset); if (hca_clock_page == MAP_FAILED) { mlx5_err(context->dbg_fp, PFX "Warning: Timestamp available,\n" "but failed to mmap() hca core clock page.\n"); return -1; } context->hca_core_clock = hca_clock_page + (context->core_clock.offset & (mdev->page_size - 1)); return 0; } static void mlx5_map_clock_info(struct mlx5_device *mdev, struct ibv_context *ibv_ctx) { struct mlx5_context *context = to_mctx(ibv_ctx); void *clock_info_page; off_t offset = 0; set_command(MLX5_IB_MMAP_CLOCK_INFO, &offset); set_index(MLX5_IB_CLOCK_INFO_V1, &offset); clock_info_page = mmap(NULL, mdev->page_size, PROT_READ, MAP_SHARED, ibv_ctx->cmd_fd, offset * mdev->page_size); if (clock_info_page != MAP_FAILED) context->clock_info_page = clock_info_page; } static uint32_t get_dc_odp_caps(struct ibv_context *ctx) { uint32_t in[DEVX_ST_SZ_DW(query_hca_cap_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(query_hca_cap_out)] = {}; uint16_t opmod = (MLX5_CAP_ODP << 1) | HCA_CAP_OPMOD_GET_CUR; uint32_t ret; DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP); DEVX_SET(query_hca_cap_in, in, op_mod, opmod); ret = mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)); if (ret) return 0; if (DEVX_GET(query_hca_cap_out, out, capability.odp_cap.dc_odp_caps.send)) ret |= IBV_ODP_SUPPORT_SEND; if (DEVX_GET(query_hca_cap_out, out, capability.odp_cap.dc_odp_caps.receive)) ret |= IBV_ODP_SUPPORT_RECV; if (DEVX_GET(query_hca_cap_out, out, capability.odp_cap.dc_odp_caps.write)) ret |= IBV_ODP_SUPPORT_WRITE; if (DEVX_GET(query_hca_cap_out, out, 
capability.odp_cap.dc_odp_caps.read)) ret |= IBV_ODP_SUPPORT_READ; if (DEVX_GET(query_hca_cap_out, out, capability.odp_cap.dc_odp_caps.atomic)) ret |= IBV_ODP_SUPPORT_ATOMIC; if (DEVX_GET(query_hca_cap_out, out, capability.odp_cap.dc_odp_caps.srq_receive)) ret |= IBV_ODP_SUPPORT_SRQ_RECV; return ret; } static int _mlx5dv_query_device(struct ibv_context *ctx_in, struct mlx5dv_context *attrs_out) { struct mlx5_context *mctx = to_mctx(ctx_in); uint64_t comp_mask_out = 0; attrs_out->version = 0; attrs_out->flags = 0; if (mctx->cqe_version == MLX5_CQE_VERSION_V1) attrs_out->flags |= MLX5DV_CONTEXT_FLAGS_CQE_V1; if (mctx->vendor_cap_flags & MLX5_VENDOR_CAP_FLAGS_MPW_ALLOWED) attrs_out->flags |= MLX5DV_CONTEXT_FLAGS_MPW_ALLOWED; if (mctx->vendor_cap_flags & MLX5_VENDOR_CAP_FLAGS_CQE_128B_COMP) attrs_out->flags |= MLX5DV_CONTEXT_FLAGS_CQE_128B_COMP; if (mctx->vendor_cap_flags & MLX5_VENDOR_CAP_FLAGS_CQE_128B_PAD) attrs_out->flags |= MLX5DV_CONTEXT_FLAGS_CQE_128B_PAD; if (mctx->flags & MLX5_CTX_FLAGS_REAL_TIME_TS_SUPPORTED) attrs_out->flags |= MLX5DV_CONTEXT_FLAGS_REAL_TIME_TS; if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_CQE_COMPRESION) { attrs_out->cqe_comp_caps = mctx->cqe_comp_caps; comp_mask_out |= MLX5DV_CONTEXT_MASK_CQE_COMPRESION; } if (mctx->vendor_cap_flags & MLX5_VENDOR_CAP_FLAGS_ENHANCED_MPW) attrs_out->flags |= MLX5DV_CONTEXT_FLAGS_ENHANCED_MPW; if (mctx->vendor_cap_flags & MLX5_VENDOR_CAP_FLAGS_PACKET_BASED_CREDIT_MODE) attrs_out->flags |= MLX5DV_CONTEXT_FLAGS_PACKET_BASED_CREDIT_MODE; if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_SWP) { attrs_out->sw_parsing_caps = mctx->sw_parsing_caps; comp_mask_out |= MLX5DV_CONTEXT_MASK_SWP; } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_STRIDING_RQ) { attrs_out->striding_rq_caps = mctx->striding_rq_caps; comp_mask_out |= MLX5DV_CONTEXT_MASK_STRIDING_RQ; } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_TUNNEL_OFFLOADS) { attrs_out->tunnel_offloads_caps = mctx->tunnel_offloads_caps; comp_mask_out |= MLX5DV_CONTEXT_MASK_TUNNEL_OFFLOADS; } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_DCI_STREAMS) { attrs_out->dci_streams_caps = mctx->dci_streams_caps; comp_mask_out |= MLX5DV_CONTEXT_MASK_DCI_STREAMS; } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_DYN_BFREGS) { attrs_out->max_dynamic_bfregs = mctx->num_dyn_bfregs; comp_mask_out |= MLX5DV_CONTEXT_MASK_DYN_BFREGS; } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_CLOCK_INFO_UPDATE) { if (mctx->clock_info_page) { attrs_out->max_clock_info_update_nsec = mctx->clock_info_page->overflow_period; comp_mask_out |= MLX5DV_CONTEXT_MASK_CLOCK_INFO_UPDATE; } } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_FLOW_ACTION_FLAGS) { attrs_out->flow_action_flags = mctx->flow_action_flags; comp_mask_out |= MLX5DV_CONTEXT_MASK_FLOW_ACTION_FLAGS; } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_DC_ODP_CAPS) { attrs_out->dc_odp_caps = get_dc_odp_caps(ctx_in); comp_mask_out |= MLX5DV_CONTEXT_MASK_DC_ODP_CAPS; } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_HCA_CORE_CLOCK) { if (mctx->hca_core_clock) { attrs_out->hca_core_clock = mctx->hca_core_clock; comp_mask_out |= MLX5DV_CONTEXT_MASK_HCA_CORE_CLOCK; } } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_NUM_LAG_PORTS) { if (mctx->entropy_caps.num_lag_ports) { attrs_out->num_lag_ports = mctx->entropy_caps.num_lag_ports; comp_mask_out |= MLX5DV_CONTEXT_MASK_NUM_LAG_PORTS; } } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_SIGNATURE_OFFLOAD) { attrs_out->sig_caps = mctx->sig_caps; comp_mask_out |= MLX5DV_CONTEXT_MASK_SIGNATURE_OFFLOAD; } if (attrs_out->comp_mask & 
MLX5DV_CONTEXT_MASK_WR_MEMCPY_LENGTH) { attrs_out->max_wr_memcpy_length = mctx->dma_mmo_caps.dma_max_size; comp_mask_out |= MLX5DV_CONTEXT_MASK_WR_MEMCPY_LENGTH; } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_CRYPTO_OFFLOAD) { attrs_out->crypto_caps = mctx->crypto_caps; comp_mask_out |= MLX5DV_CONTEXT_MASK_CRYPTO_OFFLOAD; } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_MAX_DC_RD_ATOM) { attrs_out->max_dc_rd_atom = mctx->max_dc_rd_atom; attrs_out->max_dc_init_rd_atom = mctx->max_dc_init_rd_atom; comp_mask_out |= MLX5DV_CONTEXT_MASK_MAX_DC_RD_ATOM; } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_REG_C0) { if (mctx->reg_c0.mask) { attrs_out->reg_c0 = mctx->reg_c0; comp_mask_out |= MLX5DV_CONTEXT_MASK_REG_C0; } } if (attrs_out->comp_mask & MLX5DV_CONTEXT_MASK_OOO_RECV_WRS) { if (mctx->vendor_cap_flags & MLX5_VENDOR_CAP_FLAGS_OOO_DP) { attrs_out->ooo_recv_wrs_caps = mctx->ooo_recv_wrs_caps; comp_mask_out |= MLX5DV_CONTEXT_MASK_OOO_RECV_WRS; } } attrs_out->comp_mask = comp_mask_out; return 0; } int mlx5dv_query_device(struct ibv_context *ctx_in, struct mlx5dv_context *attrs_out) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ctx_in); if (!dvops || !dvops->query_device) return EOPNOTSUPP; return dvops->query_device(ctx_in, attrs_out); } static int mlx5dv_get_qp(struct ibv_qp *qp_in, struct mlx5dv_qp *qp_out) { struct mlx5_qp *mqp = to_mqp(qp_in); uint64_t mask_out = 0; qp_out->dbrec = mqp->db; if (mqp->sq_buf_size) /* IBV_QPT_RAW_PACKET */ qp_out->sq.buf = (void *)((uintptr_t)mqp->sq_buf.buf); else qp_out->sq.buf = (void *)((uintptr_t)mqp->buf.buf + mqp->sq.offset); qp_out->sq.wqe_cnt = mqp->sq.wqe_cnt; qp_out->sq.stride = 1 << mqp->sq.wqe_shift; qp_out->rq.buf = (void *)((uintptr_t)mqp->buf.buf + mqp->rq.offset); qp_out->rq.wqe_cnt = mqp->rq.wqe_cnt; qp_out->rq.stride = 1 << mqp->rq.wqe_shift; qp_out->bf.reg = mqp->bf->reg; if (qp_out->comp_mask & MLX5DV_QP_MASK_UAR_MMAP_OFFSET) { qp_out->uar_mmap_offset = mqp->bf->uar_mmap_offset; mask_out |= MLX5DV_QP_MASK_UAR_MMAP_OFFSET; } if (qp_out->comp_mask & MLX5DV_QP_MASK_RAW_QP_HANDLES) { qp_out->tirn = mqp->tirn; qp_out->tisn = mqp->tisn; qp_out->rqn = mqp->rqn; qp_out->sqn = mqp->sqn; mask_out |= MLX5DV_QP_MASK_RAW_QP_HANDLES; } if (qp_out->comp_mask & MLX5DV_QP_MASK_RAW_QP_TIR_ADDR) { qp_out->tir_icm_addr = mqp->tir_icm_addr; mask_out |= MLX5DV_QP_MASK_RAW_QP_TIR_ADDR; } if (mqp->bf->uuarn > 0) qp_out->bf.size = mqp->bf->buf_size; else qp_out->bf.size = 0; qp_out->comp_mask = mask_out; return 0; } static int mlx5dv_get_cq(struct ibv_cq *cq_in, struct mlx5dv_cq *cq_out) { struct mlx5_cq *mcq = to_mcq(cq_in); struct mlx5_context *mctx = to_mctx(cq_in->context); cq_out->comp_mask = 0; cq_out->cqn = mcq->cqn; cq_out->cqe_cnt = mcq->verbs_cq.cq.cqe + 1; cq_out->cqe_size = mcq->cqe_sz; cq_out->buf = mcq->active_buf->buf; cq_out->dbrec = mcq->dbrec; cq_out->cq_uar = mctx->cq_uar_reg; mcq->flags |= MLX5_CQ_FLAGS_DV_OWNED; return 0; } static int mlx5dv_get_rwq(struct ibv_wq *wq_in, struct mlx5dv_rwq *rwq_out) { struct mlx5_rwq *mrwq = to_mrwq(wq_in); rwq_out->comp_mask = 0; rwq_out->buf = mrwq->pbuff; rwq_out->dbrec = mrwq->recv_db; rwq_out->wqe_cnt = mrwq->rq.wqe_cnt; rwq_out->stride = 1 << mrwq->rq.wqe_shift; return 0; } static int mlx5dv_get_srq(struct ibv_srq *srq_in, struct mlx5dv_srq *srq_out) { struct mlx5_srq *msrq; uint64_t mask_out = 0; msrq = container_of(srq_in, struct mlx5_srq, vsrq.srq); srq_out->buf = msrq->buf.buf; srq_out->dbrec = msrq->db; srq_out->stride = 1 << msrq->wqe_shift; srq_out->head = msrq->head; srq_out->tail = 
msrq->tail; if (srq_out->comp_mask & MLX5DV_SRQ_MASK_SRQN) { srq_out->srqn = msrq->srqn; mask_out |= MLX5DV_SRQ_MASK_SRQN; } srq_out->comp_mask = mask_out; return 0; } static int mlx5dv_get_dm(struct ibv_dm *dm_in, struct mlx5dv_dm *dm_out) { struct mlx5_dm *mdm = to_mdm(dm_in); uint64_t mask_out = 0; dm_out->buf = mdm->start_va; dm_out->length = mdm->length; if (dm_out->comp_mask & MLX5DV_DM_MASK_REMOTE_VA) { dm_out->remote_va = mdm->remote_va; mask_out |= MLX5DV_DM_MASK_REMOTE_VA; } dm_out->comp_mask = mask_out; return 0; } static int mlx5dv_get_av(struct ibv_ah *ah_in, struct mlx5dv_ah *ah_out) { struct mlx5_ah *mah = to_mah(ah_in); ah_out->comp_mask = 0; ah_out->av = &mah->av; return 0; } static int mlx5dv_get_pd(struct ibv_pd *pd_in, struct mlx5dv_pd *pd_out) { struct mlx5_pd *mpd = to_mpd(pd_in); pd_out->comp_mask = 0; pd_out->pdn = mpd->pdn; return 0; } static int mlx5dv_get_devx(struct mlx5dv_devx_obj *devx_in, struct mlx5dv_devx *devx_out) { devx_out->handle = devx_in->handle; return 0; } static int query_lag(struct ibv_context *ctx, uint8_t *lag_state, uint8_t *tx_remap_affinity_1, uint8_t *tx_remap_affinity_2) { uint32_t out_lag[DEVX_ST_SZ_DW(query_lag_out)] = {}; uint32_t in_lag[DEVX_ST_SZ_DW(query_lag_in)] = {}; int ret; DEVX_SET(query_lag_in, in_lag, opcode, MLX5_CMD_OP_QUERY_LAG); ret = mlx5dv_devx_general_cmd(ctx, in_lag, sizeof(in_lag), out_lag, sizeof(out_lag)); if (ret) return mlx5_get_cmd_status_err(ret, out_lag); *lag_state = DEVX_GET(query_lag_out, out_lag, ctx.lag_state); if (tx_remap_affinity_1) *tx_remap_affinity_1 = DEVX_GET(query_lag_out, out_lag, ctx.tx_remap_affinity_1); if (tx_remap_affinity_2) *tx_remap_affinity_2 = DEVX_GET(query_lag_out, out_lag, ctx.tx_remap_affinity_2); return 0; } static bool lag_operation_supported(struct ibv_qp *qp) { struct mlx5_context *mctx = to_mctx(qp->context); struct mlx5_qp *mqp = to_mqp(qp); if (mctx->entropy_caps.num_lag_ports <= 1) return false; if ((qp->qp_type == IBV_QPT_RC) || (qp->qp_type == IBV_QPT_UD) || (qp->qp_type == IBV_QPT_UC) || (qp->qp_type == IBV_QPT_RAW_PACKET) || (qp->qp_type == IBV_QPT_XRC_SEND) || ((qp->qp_type == IBV_QPT_DRIVER) && (mqp->dc_type == MLX5DV_DCTYPE_DCI))) return true; return false; } static int _mlx5dv_query_qp_lag_port(struct ibv_qp *qp, uint8_t *port_num, uint8_t *active_port_num) { uint8_t lag_state = 0, tx_remap_affinity_1 = 0, tx_remap_affinity_2 = 0; uint32_t in_tis[DEVX_ST_SZ_DW(query_tis_in)] = {}; uint32_t out_tis[DEVX_ST_SZ_DW(query_tis_out)] = {}; uint32_t in_qp[DEVX_ST_SZ_DW(query_qp_in)] = {}; uint32_t out_qp[DEVX_ST_SZ_DW(query_qp_out)] = {}; struct mlx5_context *mctx = to_mctx(qp->context); struct mlx5_qp *mqp = to_mqp(qp); int ret; if (!lag_operation_supported(qp)) return EOPNOTSUPP; ret = query_lag(qp->context, &lag_state, &tx_remap_affinity_1, &tx_remap_affinity_2); if (ret) return ret; if (!lag_state && !mctx->entropy_caps.lag_tx_port_affinity) return EOPNOTSUPP; switch (qp->qp_type) { case IBV_QPT_RAW_PACKET: DEVX_SET(query_tis_in, in_tis, opcode, MLX5_CMD_OP_QUERY_TIS); DEVX_SET(query_tis_in, in_tis, tisn, mqp->tisn); ret = mlx5dv_devx_qp_query(qp, in_tis, sizeof(in_tis), out_tis, sizeof(out_tis)); if (ret) return mlx5_get_cmd_status_err(ret, out_tis); *port_num = DEVX_GET(query_tis_out, out_tis, tis_context.lag_tx_port_affinity); break; default: DEVX_SET(query_qp_in, in_qp, opcode, MLX5_CMD_OP_QUERY_QP); DEVX_SET(query_qp_in, in_qp, qpn, qp->qp_num); ret = mlx5dv_devx_qp_query(qp, in_qp, sizeof(in_qp), out_qp, sizeof(out_qp)); if (ret) return mlx5_get_cmd_status_err(ret, 
out_qp); *port_num = DEVX_GET(query_qp_out, out_qp, qpc.lag_tx_port_affinity); break; } switch (*port_num) { case 1: *active_port_num = tx_remap_affinity_1; break; case 2: *active_port_num = tx_remap_affinity_2; break; default: return EOPNOTSUPP; } return 0; } int mlx5dv_query_qp_lag_port(struct ibv_qp *qp, uint8_t *port_num, uint8_t *active_port_num) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(qp->context); if (!dvops || !dvops->query_qp_lag_port) return EOPNOTSUPP; return dvops->query_qp_lag_port(qp, port_num, active_port_num); } static int modify_tis_lag_port(struct ibv_qp *qp, uint8_t port_num) { uint32_t out[DEVX_ST_SZ_DW(modify_tis_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(modify_tis_in)] = {}; struct mlx5_qp *mqp = to_mqp(qp); int ret; DEVX_SET(modify_tis_in, in, opcode, MLX5_CMD_OP_MODIFY_TIS); DEVX_SET(modify_tis_in, in, tisn, mqp->tisn); DEVX_SET(modify_tis_in, in, bitmask.lag_tx_port_affinity, 1); DEVX_SET(modify_tis_in, in, ctx.lag_tx_port_affinity, port_num); ret = mlx5dv_devx_qp_modify(qp, in, sizeof(in), out, sizeof(out)); return ret ? mlx5_get_cmd_status_err(ret, out) : 0; } static int modify_qp_lag_port(struct ibv_qp *qp, uint8_t port_num) { uint32_t out[DEVX_ST_SZ_DW(rts2rts_qp_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(rts2rts_qp_in)] = {}; struct mlx5_context *mctx = to_mctx(qp->context); int ret; if (!mctx->entropy_caps.rts2rts_lag_tx_port_affinity || qp->state != IBV_QPS_RTS) return EOPNOTSUPP; DEVX_SET(rts2rts_qp_in, in, opcode, MLX5_CMD_OP_RTS2RTS_QP); DEVX_SET(rts2rts_qp_in, in, qpn, qp->qp_num); DEVX_SET(rts2rts_qp_in, in, opt_param_mask, MLX5_QPC_OPT_MASK_RTS2RTS_LAG_TX_PORT_AFFINITY); DEVX_SET(rts2rts_qp_in, in, qpc.lag_tx_port_affinity, port_num); ret = mlx5dv_devx_qp_modify(qp, in, sizeof(in), out, sizeof(out)); return ret ? mlx5_get_cmd_status_err(ret, out) : 0; } static int _mlx5dv_modify_qp_lag_port(struct ibv_qp *qp, uint8_t port_num) { uint8_t curr_configured, curr_active; struct mlx5_qp *mqp = to_mqp(qp); int ret; /* Query lag port to see if we are at all in lag mode, otherwise FW * might return success and ignore the modification. */ ret = mlx5dv_query_qp_lag_port(qp, &curr_configured, &curr_active); if (ret) return ret; switch (qp->qp_type) { case IBV_QPT_RAW_PACKET: return modify_tis_lag_port(qp, port_num); case IBV_QPT_DRIVER: if (mqp->dc_type != MLX5DV_DCTYPE_DCI) return EOPNOTSUPP; SWITCH_FALLTHROUGH; case IBV_QPT_RC: case IBV_QPT_UD: case IBV_QPT_UC: return modify_qp_lag_port(qp, port_num); default: return EOPNOTSUPP; } } int mlx5dv_modify_qp_lag_port(struct ibv_qp *qp, uint8_t port_num) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(qp->context); if (!dvops || !dvops->modify_qp_lag_port) return EOPNOTSUPP; return dvops->modify_qp_lag_port(qp, port_num); } static int _mlx5dv_modify_qp_udp_sport(struct ibv_qp *qp, uint16_t udp_sport) { uint32_t in[DEVX_ST_SZ_DW(rts2rts_qp_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(rts2rts_qp_out)] = {}; struct mlx5_context *mctx = to_mctx(qp->context); int ret; switch (qp->qp_type) { case IBV_QPT_RC: case IBV_QPT_UC: if (qp->state != IBV_QPS_RTS || !mctx->entropy_caps.rts2rts_qp_udp_sport) return EOPNOTSUPP; break; default: return EOPNOTSUPP; } DEVX_SET(rts2rts_qp_in, in, opcode, MLX5_CMD_OP_RTS2RTS_QP); DEVX_SET(rts2rts_qp_in, in, qpn, qp->qp_num); DEVX_SET64(rts2rts_qp_in, in, opt_param_mask_95_32, MLX5_QPC_OPT_MASK_32_UDP_SPORT); DEVX_SET(rts2rts_qp_in, in, qpc.primary_address_path.udp_sport, udp_sport); ret = mlx5dv_devx_qp_modify(qp, in, sizeof(in), out, sizeof(out)); return ret ? 
mlx5_get_cmd_status_err(ret, out) : 0; } int mlx5dv_modify_qp_udp_sport(struct ibv_qp *qp, uint16_t udp_sport) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(qp->context); if (!dvops || !dvops->modify_qp_udp_sport) return EOPNOTSUPP; return dvops->modify_qp_udp_sport(qp, udp_sport); } int mlx5dv_dci_stream_id_reset(struct ibv_qp *qp, uint16_t stream_id) { uint32_t out[DEVX_ST_SZ_DW(rts2rts_qp_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(rts2rts_qp_in)] = {}; struct mlx5_context *mctx = to_mctx(qp->context); struct mlx5_qp *mqp = to_mqp(qp); void *qpce = DEVX_ADDR_OF(rts2rts_qp_in, in, qpc_data_ext); int ret; if (!is_mlx5_dev(qp->context->device) || !mctx->dci_streams_caps.max_log_num_errored || !mctx->qpc_extension_cap || qp->state != IBV_QPS_RTS) return EOPNOTSUPP; if ((mqp->dc_type != MLX5DV_DCTYPE_DCI) || (qp->qp_type != IBV_QPT_DRIVER)) return EINVAL; DEVX_SET(rts2rts_qp_in, in, opcode, MLX5_CMD_OP_RTS2RTS_QP); DEVX_SET(rts2rts_qp_in, in, qpn, qp->qp_num); DEVX_SET(rts2rts_qp_in, in, qpc_ext, 1); DEVX_SET64(rts2rts_qp_in, in, opt_param_mask_95_32, MLX5_QPC_OPT_MASK_32_DCI_STREAM_CHANNEL_ID); DEVX_SET(qpc_ext, qpce, dci_stream_channel_id, stream_id); ret = mlx5dv_devx_qp_modify(qp, in, sizeof(in), out, sizeof(out)); return ret ? mlx5_get_cmd_status_err(ret, out) : 0; } static bool sched_supported(struct ibv_context *ctx) { struct mlx5_qos_caps *qc = &to_mctx(ctx)->qos_caps; return (qc->qos && (qc->nic_element_type & ELEMENT_TYPE_CAP_MASK_TASR) && (qc->nic_element_type & ELEMENT_TYPE_CAP_MASK_QUEUE_GROUP) && (qc->nic_tsar_type & TSAR_TYPE_CAP_MASK_DWRR)); } static struct mlx5dv_devx_obj * mlx5dv_sched_nic_create(struct ibv_context *ctx, const struct mlx5dv_sched_attr *sched_attr, int elem_type) { uint32_t out[DEVX_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; uint32_t in[DEVX_ST_SZ_DW(create_sched_elem_in)] = {}; struct mlx5dv_devx_obj *obj; uint32_t parent_id; void *attr; attr = DEVX_ADDR_OF(create_sched_elem_in, in, hdr); DEVX_SET(general_obj_in_cmd_hdr, attr, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); DEVX_SET(general_obj_in_cmd_hdr, attr, obj_type, MLX5_OBJ_TYPE_SCHEDULING_ELEMENT); attr = DEVX_ADDR_OF(create_sched_elem_in, in, sched_elem); DEVX_SET64(sched_elem, attr, modify_field_select, sched_attr->flags); DEVX_SET(sched_elem, attr, scheduling_hierarchy, MLX5_SCHED_HIERARCHY_NIC); attr = DEVX_ADDR_OF(create_sched_elem_in, in, sched_elem.sched_context); DEVX_SET(sched_context, attr, element_type, elem_type); parent_id = sched_attr->parent ? 
sched_attr->parent->obj->object_id : 0; DEVX_SET(sched_context, attr, parent_element_id, parent_id); if (sched_attr->flags & MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE) DEVX_SET(sched_context, attr, bw_share, sched_attr->bw_share); if (sched_attr->flags & MLX5DV_SCHED_ELEM_ATTR_FLAGS_MAX_AVG_BW) DEVX_SET(sched_context, attr, max_average_bw, sched_attr->max_avg_bw); attr = DEVX_ADDR_OF(create_sched_elem_in, in, sched_elem.sched_context.sched_elem_attr); DEVX_SET(sched_elem_attr_tsar, attr, tsar_type, MLX5_SCHED_TSAR_TYPE_DWRR); obj = mlx5dv_devx_obj_create(ctx, in, sizeof(in), out, sizeof(out)); if (!obj) errno = mlx5_get_cmd_status_err(errno, out); return obj; } static int mlx5dv_sched_nic_modify(struct mlx5dv_devx_obj *obj, const struct mlx5dv_sched_attr *sched_attr, int elem_type) { uint32_t out[DEVX_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; uint32_t in[DEVX_ST_SZ_DW(create_sched_elem_in)] = {}; void *attr; int ret; attr = DEVX_ADDR_OF(create_sched_elem_in, in, hdr); DEVX_SET(general_obj_in_cmd_hdr, attr, opcode, MLX5_CMD_OP_MODIFY_GENERAL_OBJECT); DEVX_SET(general_obj_in_cmd_hdr, attr, obj_type, MLX5_OBJ_TYPE_SCHEDULING_ELEMENT); DEVX_SET(general_obj_in_cmd_hdr, in, obj_id, obj->object_id); attr = DEVX_ADDR_OF(create_sched_elem_in, in, sched_elem); DEVX_SET64(sched_elem, attr, modify_field_select, sched_attr->flags); DEVX_SET(sched_elem, attr, scheduling_hierarchy, MLX5_SCHED_HIERARCHY_NIC); attr = DEVX_ADDR_OF(create_sched_elem_in, in, sched_elem.sched_context); DEVX_SET(sched_context, attr, element_type, elem_type); if (sched_attr->flags & MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE) DEVX_SET(sched_context, attr, bw_share, sched_attr->bw_share); if (sched_attr->flags & MLX5DV_SCHED_ELEM_ATTR_FLAGS_MAX_AVG_BW) DEVX_SET(sched_context, attr, max_average_bw, sched_attr->max_avg_bw); attr = DEVX_ADDR_OF(create_sched_elem_in, in, sched_elem.sched_context.sched_elem_attr); DEVX_SET(sched_elem_attr_tsar, attr, tsar_type, MLX5_SCHED_TSAR_TYPE_DWRR); ret = mlx5dv_devx_obj_modify(obj, in, sizeof(in), out, sizeof(out)); return ret ? 
mlx5_get_cmd_status_err(ret, out) : 0; } #define MLX5DV_SCHED_ELEM_ATTR_ALL_FLAGS \ (MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE | \ MLX5DV_SCHED_ELEM_ATTR_FLAGS_MAX_AVG_BW) static bool attr_supported(struct ibv_context *ctx, const struct mlx5dv_sched_attr *attr) { struct mlx5_qos_caps *qc = &to_mctx(ctx)->qos_caps; if ((attr->flags & MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE) && !qc->nic_bw_share) return false; if ((attr->flags & MLX5DV_SCHED_ELEM_ATTR_FLAGS_MAX_AVG_BW) && !qc->nic_rate_limit) return false; return true; } static bool sched_attr_valid(const struct mlx5dv_sched_attr *attr, bool node) { if (!attr || attr->comp_mask || !check_comp_mask(attr->flags, MLX5DV_SCHED_ELEM_ATTR_ALL_FLAGS)) return false; if (node && (!attr->parent && attr->flags)) return false; if (!node && !attr->parent) return false; return true; } static struct mlx5dv_sched_node * _mlx5dv_sched_node_create(struct ibv_context *ctx, const struct mlx5dv_sched_attr *attr) { struct mlx5dv_sched_node *node; struct mlx5dv_devx_obj *obj; if (!sched_attr_valid(attr, true)) { errno = EINVAL; return NULL; } if (!sched_supported(ctx) || !attr_supported(ctx, attr)) { errno = EOPNOTSUPP; return NULL; } node = calloc(1, sizeof(*node)); if (!node) { errno = ENOMEM; return NULL; } obj = mlx5dv_sched_nic_create(ctx, attr, MLX5_SCHED_ELEM_TYPE_TSAR); if (!obj) goto err_sched_nic_create; node->obj = obj; node->parent = attr->parent; return node; err_sched_nic_create: free(node); return NULL; } struct mlx5dv_sched_node * mlx5dv_sched_node_create(struct ibv_context *ctx, const struct mlx5dv_sched_attr *attr) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ctx); if (!dvops || !dvops->sched_node_create) { errno = EOPNOTSUPP; return NULL; } return dvops->sched_node_create(ctx, attr); } static struct mlx5dv_sched_leaf * _mlx5dv_sched_leaf_create(struct ibv_context *ctx, const struct mlx5dv_sched_attr *attr) { struct mlx5dv_sched_leaf *leaf; struct mlx5dv_devx_obj *obj; if (!sched_attr_valid(attr, false)) { errno = EINVAL; return NULL; } if (!attr_supported(ctx, attr)) { errno = EOPNOTSUPP; return NULL; } leaf = calloc(1, sizeof(*leaf)); if (!leaf) { errno = ENOMEM; return NULL; } obj = mlx5dv_sched_nic_create(ctx, attr, MLX5_SCHED_ELEM_TYPE_QUEUE_GROUP); if (!obj) goto err_sched_nic_create; leaf->obj = obj; leaf->parent = attr->parent; return leaf; err_sched_nic_create: free(leaf); return NULL; } struct mlx5dv_sched_leaf * mlx5dv_sched_leaf_create(struct ibv_context *ctx, const struct mlx5dv_sched_attr *attr) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ctx); if (!dvops || !dvops->sched_leaf_create) { errno = EOPNOTSUPP; return NULL; } return dvops->sched_leaf_create(ctx, attr); } static int _mlx5dv_sched_node_modify(struct mlx5dv_sched_node *node, const struct mlx5dv_sched_attr *attr) { if (!node || !sched_attr_valid(attr, true)) { errno = EINVAL; return errno; } if (!attr_supported(node->obj->context, attr)) { errno = EOPNOTSUPP; return errno; } return mlx5dv_sched_nic_modify(node->obj, attr, MLX5_SCHED_ELEM_TYPE_TSAR); } int mlx5dv_sched_node_modify(struct mlx5dv_sched_node *node, const struct mlx5dv_sched_attr *attr) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(node->obj->context); if (!dvops || !dvops->sched_node_modify) return EOPNOTSUPP; return dvops->sched_node_modify(node, attr); } static int _mlx5dv_sched_leaf_modify(struct mlx5dv_sched_leaf *leaf, const struct mlx5dv_sched_attr *attr) { if (!leaf || !sched_attr_valid(attr, false)) { errno = EINVAL; return errno; } if (!attr_supported(leaf->obj->context, attr)) { 
errno = EOPNOTSUPP; return errno; } return mlx5dv_sched_nic_modify(leaf->obj, attr, MLX5_SCHED_ELEM_TYPE_QUEUE_GROUP); } int mlx5dv_sched_leaf_modify(struct mlx5dv_sched_leaf *leaf, const struct mlx5dv_sched_attr *attr) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(leaf->obj->context); if (!dvops || !dvops->sched_leaf_modify) return EOPNOTSUPP; return dvops->sched_leaf_modify(leaf, attr); } static int _mlx5dv_sched_node_destroy(struct mlx5dv_sched_node *node) { int ret; ret = mlx5dv_devx_obj_destroy(node->obj); if (ret) return ret; free(node); return 0; } int mlx5dv_sched_node_destroy(struct mlx5dv_sched_node *node) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(node->obj->context); if (!dvops || !dvops->sched_node_destroy) return EOPNOTSUPP; return dvops->sched_node_destroy(node); } static int _mlx5dv_sched_leaf_destroy(struct mlx5dv_sched_leaf *leaf) { int ret; ret = mlx5dv_devx_obj_destroy(leaf->obj); if (ret) return ret; free(leaf); return 0; } int mlx5dv_sched_leaf_destroy(struct mlx5dv_sched_leaf *leaf) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(leaf->obj->context); if (!dvops || !dvops->sched_leaf_destroy) return EOPNOTSUPP; return dvops->sched_leaf_destroy(leaf); } static int modify_ib_qp_sched_elem_init(struct ibv_qp *qp, uint32_t req_id, uint32_t resp_id) { uint64_t mask = MLX5_QPC_OPT_MASK_32_QOS_QUEUE_GROUP_ID; uint32_t in[DEVX_ST_SZ_DW(init2init_qp_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(init2init_qp_out)] = {}; void *qpce = DEVX_ADDR_OF(init2init_qp_in, in, qpc_data_ext); int ret; DEVX_SET(init2init_qp_in, in, opcode, MLX5_CMD_OP_INIT2INIT_QP); DEVX_SET(init2init_qp_in, in, qpc_ext, 1); DEVX_SET(init2init_qp_in, in, qpn, qp->qp_num); DEVX_SET64(init2init_qp_in, in, opt_param_mask_95_32, mask); DEVX_SET(qpc_ext, qpce, qos_queue_group_id_requester, req_id); DEVX_SET(qpc_ext, qpce, qos_queue_group_id_responder, resp_id); ret = mlx5dv_devx_qp_modify(qp, in, sizeof(in), out, sizeof(out)); return ret ? mlx5_get_cmd_status_err(ret, out) : 0; } static int modify_ib_qp_sched_elem_rts(struct ibv_qp *qp, uint32_t req_id, uint32_t resp_id) { uint64_t mask = MLX5_QPC_OPT_MASK_32_QOS_QUEUE_GROUP_ID; uint32_t in[DEVX_ST_SZ_DW(rts2rts_qp_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(rts2rts_qp_out)] = {}; void *qpce = DEVX_ADDR_OF(rts2rts_qp_in, in, qpc_data_ext); int ret; DEVX_SET(rts2rts_qp_in, in, opcode, MLX5_CMD_OP_RTS2RTS_QP); DEVX_SET(rts2rts_qp_in, in, qpc_ext, 1); DEVX_SET(rts2rts_qp_in, in, qpn, qp->qp_num); DEVX_SET64(rts2rts_qp_in, in, opt_param_mask_95_32, mask); DEVX_SET(qpc_ext, qpce, qos_queue_group_id_requester, req_id); DEVX_SET(qpc_ext, qpce, qos_queue_group_id_responder, resp_id); ret = mlx5dv_devx_qp_modify(qp, in, sizeof(in), out, sizeof(out)); return ret ? 
mlx5_get_cmd_status_err(ret, out) : 0; } static int modify_ib_qp_sched_elem(struct ibv_qp *qp, uint32_t req_id, uint32_t resp_id) { int ret; switch (qp->state) { case IBV_QPS_INIT: ret = modify_ib_qp_sched_elem_init(qp, req_id, resp_id); break; case IBV_QPS_RTS: ret = modify_ib_qp_sched_elem_rts(qp, req_id, resp_id); break; default: return EOPNOTSUPP; }; return ret; } static int modify_raw_qp_sched_elem(struct ibv_qp *qp, uint32_t qos_id) { struct mlx5_qos_caps *qc = &to_mctx(qp->context)->qos_caps; uint32_t mout[DEVX_ST_SZ_DW(modify_sq_out)] = {}; uint32_t min[DEVX_ST_SZ_DW(modify_sq_in)] = {}; struct mlx5_qp *mqp = to_mqp(qp); void *sqc; int ret; if (qp->state != IBV_QPS_RTS || !qc->nic_sq_scheduling) return EOPNOTSUPP; DEVX_SET(modify_sq_in, min, opcode, MLX5_CMD_OP_MODIFY_SQ); DEVX_SET(modify_sq_in, min, sq_state, MLX5_SQC_STATE_RDY); DEVX_SET(modify_sq_in, min, sqn, mqp->sqn); DEVX_SET64(modify_sq_in, min, modify_bitmask, MLX5_MODIFY_SQ_BITMASK_QOS_QUEUE_GROUP_ID); sqc = DEVX_ADDR_OF(modify_sq_in, min, sq_context); DEVX_SET(sqc, sqc, state, MLX5_SQC_STATE_RDY); DEVX_SET(sqc, sqc, qos_queue_group_id, qos_id); ret = mlx5dv_devx_qp_modify(qp, min, sizeof(min), mout, sizeof(mout)); return ret ? mlx5_get_cmd_status_err(ret, mout) : 0; } static int _mlx5dv_modify_qp_sched_elem(struct ibv_qp *qp, const struct mlx5dv_sched_leaf *requestor, const struct mlx5dv_sched_leaf *responder) { struct mlx5_qos_caps *qc = &to_mctx(qp->context)->qos_caps; switch (qp->qp_type) { case IBV_QPT_UC: case IBV_QPT_UD: if (responder) return EINVAL; SWITCH_FALLTHROUGH; case IBV_QPT_RC: if ((!to_mctx(qp->context)->qpc_extension_cap) || !(qc->nic_qp_scheduling)) return EOPNOTSUPP; return modify_ib_qp_sched_elem(qp, requestor ? requestor->obj->object_id : 0, responder ? responder->obj->object_id : 0); case IBV_QPT_RAW_PACKET: if (responder) return EINVAL; return modify_raw_qp_sched_elem(qp, requestor ? requestor->obj->object_id : 0); default: return EOPNOTSUPP; } } int mlx5dv_modify_qp_sched_elem(struct ibv_qp *qp, const struct mlx5dv_sched_leaf *requestor, const struct mlx5dv_sched_leaf *responder) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(qp->context); if (!dvops || !dvops->modify_qp_sched_elem) return EOPNOTSUPP; return dvops->modify_qp_sched_elem(qp, requestor, responder); } int mlx5_modify_qp_drain_sigerr(struct ibv_qp *qp) { uint64_t mask = MLX5_QPC_OPT_MASK_INIT2INIT_DRAIN_SIGERR; uint32_t in[DEVX_ST_SZ_DW(init2init_qp_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(init2init_qp_out)] = {}; void *qpc = DEVX_ADDR_OF(init2init_qp_in, in, qpc); int ret; DEVX_SET(init2init_qp_in, in, opcode, MLX5_CMD_OP_INIT2INIT_QP); DEVX_SET(init2init_qp_in, in, qpn, qp->qp_num); DEVX_SET(init2init_qp_in, in, opt_param_mask, mask); DEVX_SET(qpc, qpc, drain_sigerr, 1); ret = mlx5dv_devx_qp_modify(qp, in, sizeof(in), out, sizeof(out)); return ret ? 
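/*
 * Caller-side sketch for mlx5dv_modify_qp_sched_elem() as dispatched
 * above: RC QPs may name both a requestor and a responder leaf, UC/UD
 * accept a requestor only, and raw packet QPs are retargeted through
 * MODIFY_SQ, which additionally requires the QP to be in RTS. A minimal
 * sketch; `leaf` is assumed to come from mlx5dv_sched_leaf_create() and
 * attach_qp() is a hypothetical helper:
 *
 *	int attach_qp(struct ibv_qp *qp, const struct mlx5dv_sched_leaf *leaf)
 *	{
 *		switch (qp->qp_type) {
 *		case IBV_QPT_RC:
 *			// Same leaf for both directions here; they may differ.
 *			return mlx5dv_modify_qp_sched_elem(qp, leaf, leaf);
 *		case IBV_QPT_RAW_PACKET:
 *			// Must be in RTS; the responder must be NULL.
 *			return mlx5dv_modify_qp_sched_elem(qp, leaf, NULL);
 *		default:
 *			return EOPNOTSUPP;
 *		}
 *	}
 */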
mlx5_get_cmd_status_err(ret, out) : 0;
}

static struct reserved_qpn_blk *reserved_qpn_blk_alloc(struct mlx5_context *mctx)
{
	uint32_t out[DEVX_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
	uint32_t in[DEVX_ST_SZ_DW(create_reserved_qpn_in)] = {};
	struct reserved_qpn_blk *blk;
	void *attr;

	blk = calloc(1, sizeof(*blk));
	if (!blk) {
		errno = ENOMEM;
		return NULL;
	}

	blk->bmp = bitmap_alloc0(1 << mctx->hca_cap_2_caps.log_reserved_qpns_per_obj);
	if (!blk->bmp) {
		errno = ENOMEM;
		goto bmp_alloc_fail;
	}

	attr = DEVX_ADDR_OF(create_reserved_qpn_in, in, hdr);
	DEVX_SET(general_obj_in_cmd_hdr, attr, opcode,
		 MLX5_CMD_OP_CREATE_GENERAL_OBJECT);
	DEVX_SET(general_obj_in_cmd_hdr, attr, obj_type,
		 MLX5_OBJ_TYPE_RESERVED_QPN);
	DEVX_SET(general_obj_in_cmd_hdr, attr, log_obj_range,
		 mctx->hca_cap_2_caps.log_reserved_qpns_per_obj);

	blk->obj = mlx5dv_devx_obj_create(&mctx->ibv_ctx.context, in,
					  sizeof(in), out, sizeof(out));
	if (!blk->obj) {
		errno = mlx5_get_cmd_status_err(errno, out);
		goto obj_alloc_fail;
	}

	blk->first_qpn = blk->obj->object_id;
	blk->next_avail_slot = 0;
	return blk;

obj_alloc_fail:
	free(blk->bmp);
bmp_alloc_fail:
	free(blk);
	return NULL;
}

static void reserved_qpn_blk_dealloc(struct reserved_qpn_blk *blk)
{
	if (mlx5dv_devx_obj_destroy(blk->obj))
		assert(false);

	free(blk->bmp);
	free(blk);
}

static void reserved_qpn_blks_free(struct mlx5_context *mctx)
{
	struct reserved_qpn_blk *blk, *tmp;

	pthread_mutex_lock(&mctx->reserved_qpns.mutex);

	list_for_each_safe(&mctx->reserved_qpns.blk_list, blk, tmp, entry) {
		list_del(&blk->entry);
		reserved_qpn_blk_dealloc(blk);
	}

	pthread_mutex_unlock(&mctx->reserved_qpns.mutex);
}

/**
 * Allocate a reserved QPN, either from the last FW object allocated
 * or by allocating a new one. When looking for a free QPN in an object,
 * the search always starts from the last allocation position, so that
 * QPNs always move forward; this prevents handing out a stale QPN.
 */
static int _mlx5dv_reserved_qpn_alloc(struct ibv_context *ctx, uint32_t *qpn)
{
	struct mlx5_context *mctx = to_mctx(ctx);
	struct reserved_qpn_blk *blk;
	uint32_t qpns_per_obj;
	int ret = 0;

	if (!(mctx->general_obj_types_caps &
	      (1ULL << MLX5_OBJ_TYPE_RESERVED_QPN)))
		return EOPNOTSUPP;

	qpns_per_obj = 1 << mctx->hca_cap_2_caps.log_reserved_qpns_per_obj;

	pthread_mutex_lock(&mctx->reserved_qpns.mutex);

	blk = list_tail(&mctx->reserved_qpns.blk_list,
			struct reserved_qpn_blk, entry);
	if (!blk || (blk->next_avail_slot >= qpns_per_obj)) {
		blk = reserved_qpn_blk_alloc(mctx);
		if (!blk) {
			ret = errno;
			goto end;
		}
		list_add_tail(&mctx->reserved_qpns.blk_list, &blk->entry);
	}

	*qpn = blk->first_qpn + blk->next_avail_slot;
	bitmap_set_bit(blk->bmp, blk->next_avail_slot);
	blk->next_avail_slot++;

end:
	pthread_mutex_unlock(&mctx->reserved_qpns.mutex);
	return ret;
}

int mlx5dv_reserved_qpn_alloc(struct ibv_context *ctx, uint32_t *qpn)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ctx);

	if (!dvops || !dvops->reserved_qpn_alloc)
		return EOPNOTSUPP;

	return dvops->reserved_qpn_alloc(ctx, qpn);
}

/**
 * Deallocate a reserved QPN. The FW object is destroyed only when all QPNs
 * in this object were used and freed.
*/ static int _mlx5dv_reserved_qpn_dealloc(struct ibv_context *ctx, uint32_t qpn) { struct mlx5_context *mctx = to_mctx(ctx); struct reserved_qpn_blk *blk, *tmp; uint32_t qpns_per_obj; bool found = false; int ret = 0; qpns_per_obj = 1 << mctx->hca_cap_2_caps.log_reserved_qpns_per_obj; pthread_mutex_lock(&mctx->reserved_qpns.mutex); list_for_each_safe(&mctx->reserved_qpns.blk_list, blk, tmp, entry) { if ((qpn >= blk->first_qpn) && (qpn < blk->first_qpn + qpns_per_obj)) { found = true; break; } } if (!found || !bitmap_test_bit(blk->bmp, qpn - blk->first_qpn)) { errno = EINVAL; ret = errno; goto end; } bitmap_clear_bit(blk->bmp, qpn - blk->first_qpn); if ((blk->next_avail_slot >= qpns_per_obj) && (bitmap_empty(blk->bmp, qpns_per_obj))) { list_del(&blk->entry); reserved_qpn_blk_dealloc(blk); } end: pthread_mutex_unlock(&mctx->reserved_qpns.mutex); return ret; } int mlx5dv_reserved_qpn_dealloc(struct ibv_context *ctx, uint32_t qpn) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ctx); if (!dvops || !dvops->reserved_qpn_dealloc) return EOPNOTSUPP; return dvops->reserved_qpn_dealloc(ctx, qpn); } static int _mlx5dv_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type) { int ret = 0; if (obj_type & MLX5DV_OBJ_QP) ret = mlx5dv_get_qp(obj->qp.in, obj->qp.out); if (!ret && (obj_type & MLX5DV_OBJ_CQ)) ret = mlx5dv_get_cq(obj->cq.in, obj->cq.out); if (!ret && (obj_type & MLX5DV_OBJ_SRQ)) ret = mlx5dv_get_srq(obj->srq.in, obj->srq.out); if (!ret && (obj_type & MLX5DV_OBJ_RWQ)) ret = mlx5dv_get_rwq(obj->rwq.in, obj->rwq.out); if (!ret && (obj_type & MLX5DV_OBJ_DM)) ret = mlx5dv_get_dm(obj->dm.in, obj->dm.out); if (!ret && (obj_type & MLX5DV_OBJ_AH)) ret = mlx5dv_get_av(obj->ah.in, obj->ah.out); if (!ret && (obj_type & MLX5DV_OBJ_PD)) ret = mlx5dv_get_pd(obj->pd.in, obj->pd.out); if (!ret && (obj_type & MLX5DV_OBJ_DEVX)) ret = mlx5dv_get_devx(obj->devx.in, obj->devx.out); return ret; } static struct ibv_context * get_context_from_obj(struct mlx5dv_obj *obj, uint64_t obj_type) { if (obj_type & MLX5DV_OBJ_QP) return obj->qp.in->context; if (obj_type & MLX5DV_OBJ_CQ) return obj->cq.in->context; if (obj_type & MLX5DV_OBJ_SRQ) return obj->srq.in->context; if (obj_type & MLX5DV_OBJ_RWQ) return obj->rwq.in->context; if (obj_type & MLX5DV_OBJ_DM) return obj->dm.in->context; if (obj_type & MLX5DV_OBJ_AH) return obj->ah.in->context; if (obj_type & MLX5DV_OBJ_PD) return obj->pd.in->context; if (obj_type & MLX5DV_OBJ_DEVX) return obj->devx.in->context; return NULL; } LATEST_SYMVER_FUNC(mlx5dv_init_obj, 1_2, "MLX5_1.2", int, struct mlx5dv_obj *obj, uint64_t obj_type) { struct mlx5_dv_context_ops *dvops; struct ibv_context *ctx; ctx = get_context_from_obj(obj, obj_type); if (!ctx) return EINVAL; dvops = mlx5_get_dv_ops(ctx); if (!dvops || !dvops->init_obj) return EOPNOTSUPP; return dvops->init_obj(obj, obj_type); } COMPAT_SYMVER_FUNC(mlx5dv_init_obj, 1_0, "MLX5_1.0", int, struct mlx5dv_obj *obj, uint64_t obj_type) { int ret = 0; ret = __mlx5dv_init_obj_1_2(obj, obj_type); if (!ret && (obj_type & MLX5DV_OBJ_CQ)) { /* ABI version 1.0 returns the void ** in this memory * location */ obj->cq.out->cq_uar = &(to_mctx(obj->cq.in->context)->cq_uar_reg); } return ret; } off_t get_uar_mmap_offset(int idx, int page_size, int command) { off_t offset = 0; set_command(command, &offset); if (command == MLX5_IB_MMAP_ALLOC_WC && idx >= (1 << MLX5_IB_MMAP_CMD_SHIFT)) set_extended_index(idx, &offset); else set_index(idx, &offset); return offset * page_size; } static off_t uar_type_to_cmd(int uar_type) { return (uar_type == 
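/*
 * Usage sketch for the reserved-QPN pool above. Each FW object reserves
 * 2^log_reserved_qpns_per_obj consecutive QPNs starting at its object
 * id; alloc hands them out in order, dealloc returns them, and the
 * object is destroyed once every slot has been both used and freed.
 * Minimal sketch, error handling elided; the QPN can be used wherever
 * the application needs a QP number guaranteed not to collide with any
 * real QP on this device:
 *
 *	uint32_t qpn;
 *
 *	if (!mlx5dv_reserved_qpn_alloc(ctx, &qpn)) {
 *		// ... use qpn ...
 *		mlx5dv_reserved_qpn_dealloc(ctx, qpn);
 *	}
 */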
MLX5_UAR_TYPE_NC) ? MLX5_MMAP_GET_NC_PAGES_CMD : MLX5_MMAP_GET_REGULAR_PAGES_CMD; } void *mlx5_mmap(struct mlx5_uar_info *uar, int index, int cmd_fd, int page_size, int uar_type) { off_t offset; if (uar_type == MLX5_UAR_TYPE_NC) { offset = get_uar_mmap_offset(index, page_size, MLX5_MMAP_GET_NC_PAGES_CMD); uar->reg = mmap(NULL, page_size, PROT_WRITE, MAP_SHARED, cmd_fd, offset); if (uar->reg != MAP_FAILED) { uar->type = MLX5_UAR_TYPE_NC; goto out; } } /* Backward compatibility for legacy kernels that don't support * MLX5_MMAP_GET_NC_PAGES_CMD mmap command. */ offset = get_uar_mmap_offset(index, page_size, (uar_type == MLX5_UAR_TYPE_REGULAR_DYN) ? MLX5_IB_MMAP_ALLOC_WC : MLX5_MMAP_GET_REGULAR_PAGES_CMD); uar->reg = mmap(NULL, page_size, PROT_WRITE, MAP_SHARED, cmd_fd, offset); if (uar->reg != MAP_FAILED) uar->type = MLX5_UAR_TYPE_REGULAR; out: return uar->reg; } static int _mlx5dv_set_context_attr(struct ibv_context *ibv_ctx, enum mlx5dv_set_ctx_attr_type type, void *attr) { struct mlx5_context *ctx = to_mctx(ibv_ctx); switch (type) { case MLX5DV_CTX_ATTR_BUF_ALLOCATORS: ctx->extern_alloc = *((struct mlx5dv_ctx_allocators *)attr); break; default: return ENOTSUP; } return 0; } int mlx5dv_set_context_attr(struct ibv_context *ibv_ctx, enum mlx5dv_set_ctx_attr_type type, void *attr) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ibv_ctx); if (!dvops || !dvops->set_context_attr) return EOPNOTSUPP; return dvops->set_context_attr(ibv_ctx, type, attr); } static int _mlx5dv_get_clock_info(struct ibv_context *ctx_in, struct mlx5dv_clock_info *clock_info) { struct mlx5_context *ctx = to_mctx(ctx_in); const struct mlx5_ib_clock_info *ci; uint32_t retry, tmp_sig; atomic_uint32_t *sig; if (!is_mlx5_dev(ctx_in->device)) return EOPNOTSUPP; ci = ctx->clock_info_page; if (!ci) return EINVAL; sig = (atomic_uint32_t *)&ci->sign; do { retry = 10; repeat: tmp_sig = atomic_load(sig); if (unlikely(tmp_sig & MLX5_IB_CLOCK_INFO_KERNEL_UPDATING)) { if (--retry) goto repeat; return EBUSY; } clock_info->nsec = ci->nsec; clock_info->last_cycles = ci->cycles; clock_info->frac = ci->frac; clock_info->mult = ci->mult; clock_info->shift = ci->shift; clock_info->mask = ci->mask; } while (unlikely(tmp_sig != atomic_load(sig))); return 0; } int mlx5dv_get_clock_info(struct ibv_context *ctx_in, struct mlx5dv_clock_info *clock_info) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ctx_in); if (!dvops || !dvops->get_clock_info) return EOPNOTSUPP; return dvops->get_clock_info(ctx_in, clock_info); } static struct mlx5_dv_context_ops mlx5_dv_ctx_ops = { .query_device = _mlx5dv_query_device, .query_qp_lag_port = _mlx5dv_query_qp_lag_port, .modify_qp_lag_port = _mlx5dv_modify_qp_lag_port, .modify_qp_udp_sport = _mlx5dv_modify_qp_udp_sport, .sched_node_create = _mlx5dv_sched_node_create, .sched_leaf_create = _mlx5dv_sched_leaf_create, .sched_node_modify = _mlx5dv_sched_node_modify, .sched_leaf_modify = _mlx5dv_sched_leaf_modify, .sched_node_destroy = _mlx5dv_sched_node_destroy, .sched_leaf_destroy = _mlx5dv_sched_leaf_destroy, .modify_qp_sched_elem = _mlx5dv_modify_qp_sched_elem, .reserved_qpn_alloc = _mlx5dv_reserved_qpn_alloc, .reserved_qpn_dealloc = _mlx5dv_reserved_qpn_dealloc, .set_context_attr = _mlx5dv_set_context_attr, .get_clock_info = _mlx5dv_get_clock_info, .init_obj = _mlx5dv_init_obj, }; static void adjust_uar_info(struct mlx5_device *mdev, struct mlx5_context *context, struct mlx5_ib_alloc_ucontext_resp *resp) { if (!resp->log_uar_size && !resp->num_uars_per_page) { /* old kernel */ context->uar_size = 
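/*
 * _mlx5dv_get_clock_info() above is a sequence-lock style reader: it
 * snapshots the kernel-updated clock page and retries if the signature
 * changed mid-copy (or bails out with EBUSY while an update is in
 * flight). A consumer typically pairs it with mlx5dv_ts_to_ns() to turn
 * a raw device timestamp into nanoseconds. Minimal sketch; `raw_ts` is
 * assumed to come from ibv_wc_read_completion_ts() on a CQ created with
 * completion timestamps enabled:
 *
 *	struct mlx5dv_clock_info ci;
 *	uint64_t ns;
 *
 *	if (!mlx5dv_get_clock_info(ctx, &ci))
 *		ns = mlx5dv_ts_to_ns(&ci, raw_ts);
 */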
mdev->page_size; context->num_uars_per_page = 1; return; } context->uar_size = 1 << resp->log_uar_size; context->num_uars_per_page = resp->num_uars_per_page; } bool mlx5dv_is_supported(struct ibv_device *device) { return is_mlx5_dev(device); } struct ibv_context * mlx5dv_open_device(struct ibv_device *device, struct mlx5dv_context_attr *attr) { if (!is_mlx5_dev(device)) { errno = EOPNOTSUPP; return NULL; } return verbs_open_device(device, attr); } static int get_uar_info(struct mlx5_device *mdev, int *tot_uuars, int *low_lat_uuars) { *tot_uuars = get_total_uuars(mdev->page_size); if (*tot_uuars < 0) { errno = -*tot_uuars; return -1; } *low_lat_uuars = get_num_low_lat_uuars(*tot_uuars); if (*low_lat_uuars < 0) { errno = -*low_lat_uuars; return -1; } if (*low_lat_uuars > *tot_uuars - 1) { errno = ENOMEM; return -1; } return 0; } static void mlx5_uninit_context(struct mlx5_context *context) { mlx5_close_debug_file(context->dbg_fp); verbs_uninit_context(&context->ibv_ctx); free(context); } static struct mlx5_context *mlx5_init_context(struct ibv_device *ibdev, int cmd_fd) { struct mlx5_device *mdev = to_mdev(ibdev); struct mlx5_context *context; int low_lat_uuars; int tot_uuars; int ret; context = verbs_init_and_alloc_context(ibdev, cmd_fd, context, ibv_ctx, RDMA_DRIVER_MLX5); if (!context) return NULL; mlx5_open_debug_file(&context->dbg_fp); mlx5_set_debug_mask(); set_freeze_on_error(); if (gethostname(context->hostname, sizeof(context->hostname))) strcpy(context->hostname, "host_unknown"); mlx5_single_threaded = single_threaded_app(); ret = get_uar_info(mdev, &tot_uuars, &low_lat_uuars); if (ret) { mlx5_uninit_context(context); return NULL; } context->tot_uuars = tot_uuars; context->low_lat_uuars = low_lat_uuars; return context; } static int mlx5_set_context(struct mlx5_context *context, struct mlx5_ib_alloc_ucontext_resp *resp, bool is_import) { struct verbs_context *v_ctx = &context->ibv_ctx; struct ibv_port_attr port_attr = {}; int cmd_fd = v_ctx->context.cmd_fd; struct mlx5_device *mdev = to_mdev(v_ctx->context.device); struct ibv_device *ibdev = v_ctx->context.device; int page_size = mdev->page_size; int num_sys_page_map; int gross_uuars; int bfi; int i, k, j; context->max_num_qps = resp->qp_tab_size; context->bf_reg_size = resp->bf_reg_size; context->cache_line_size = resp->cache_line_size; context->max_sq_desc_sz = resp->max_sq_desc_sz; context->max_rq_desc_sz = resp->max_rq_desc_sz; context->max_send_wqebb = resp->max_send_wqebb; context->num_ports = resp->num_ports; context->max_recv_wr = resp->max_recv_wr; context->max_srq_recv_wr = resp->max_srq_recv_wr; context->num_dyn_bfregs = resp->num_dyn_bfregs; if (resp->comp_mask & MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_ECE) context->flags |= MLX5_CTX_FLAGS_ECE_SUPPORTED; if (resp->comp_mask & MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_SQD2RTS) context->flags |= MLX5_CTX_FLAGS_SQD2RTS_SUPPORTED; if (resp->comp_mask & MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_REAL_TIME_TS) context->flags |= MLX5_CTX_FLAGS_REAL_TIME_TS_SUPPORTED; if (resp->comp_mask & MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_MKEY_UPDATE_TAG) context->flags |= MLX5_CTX_FLAGS_MKEY_UPDATE_TAG_SUPPORTED; if (resp->comp_mask & MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_DUMP_FILL_MKEY) { context->dump_fill_mkey = resp->dump_fill_mkey; /* Have the BE value ready to be used in data path */ context->dump_fill_mkey_be = htobe32(resp->dump_fill_mkey); } else { /* kernel driver will never return MLX5_INVALID_LKEY for * dump_fill_mkey */ context->dump_fill_mkey = MLX5_INVALID_LKEY; context->dump_fill_mkey_be = 
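/*
 * mlx5dv_open_device() above is the DV entry point for opening a device
 * with provider-specific attributes; the only flag honoured by
 * mlx5_alloc_context() further below is MLX5DV_CONTEXT_FLAGS_DEVX.
 * Minimal sketch, assuming `device` came from ibv_get_device_list():
 *
 *	struct mlx5dv_context_attr attr = {
 *		.flags = MLX5DV_CONTEXT_FLAGS_DEVX,
 *	};
 *	struct ibv_context *ctx;
 *
 *	if (!mlx5dv_is_supported(device))
 *		return;			// not an mlx5 device
 *	ctx = mlx5dv_open_device(device, &attr);
 */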
htobe32(MLX5_INVALID_LKEY); } context->cqe_version = resp->cqe_version; adjust_uar_info(mdev, context, resp); context->cmds_supp_uhw = resp->cmds_supp_uhw; context->vendor_cap_flags = 0; list_head_init(&context->dyn_uar_bf_list); list_head_init(&context->dyn_uar_db_list); list_head_init(&context->dyn_uar_qp_shared_list); list_head_init(&context->dyn_uar_qp_dedicated_list); if (resp->eth_min_inline) context->eth_min_inline_size = (resp->eth_min_inline == MLX5_USER_INLINE_MODE_NONE) ? 0 : MLX5_ETH_L2_INLINE_HEADER_SIZE; else context->eth_min_inline_size = MLX5_ETH_L2_INLINE_HEADER_SIZE; pthread_mutex_init(&context->qp_table_mutex, NULL); pthread_mutex_init(&context->srq_table_mutex, NULL); pthread_mutex_init(&context->uidx_table_mutex, NULL); pthread_mutex_init(&context->mkey_table_mutex, NULL); pthread_mutex_init(&context->dyn_bfregs_mutex, NULL); pthread_mutex_init(&context->crypto_login_mutex, NULL); for (i = 0; i < MLX5_QP_TABLE_SIZE; ++i) context->qp_table[i].refcnt = 0; for (i = 0; i < MLX5_QP_TABLE_SIZE; ++i) context->uidx_table[i].refcnt = 0; for (i = 0; i < MLX5_MKEY_TABLE_SIZE; ++i) context->mkey_table[i].refcnt = 0; list_head_init(&context->dbr_available_pages); cl_qmap_init(&context->dbr_map); pthread_mutex_init(&context->dbr_map_mutex, NULL); context->prefer_bf = get_always_bf(); context->shut_up_bf = get_shut_up_bf(); if (resp->tot_bfregs) { if (is_import) { errno = EINVAL; return EINVAL; } context->tot_uuars = resp->tot_bfregs; gross_uuars = context->tot_uuars / MLX5_NUM_NON_FP_BFREGS_PER_UAR * NUM_BFREGS_PER_UAR; context->bfs = calloc(gross_uuars, sizeof(*context->bfs)); if (!context->bfs) { errno = ENOMEM; goto err_free; } context->flags |= MLX5_CTX_FLAGS_NO_KERN_DYN_UAR; } else { context->qp_max_dedicated_uuars = context->low_lat_uuars; context->qp_max_shared_uuars = context->tot_uuars - context->low_lat_uuars; goto bf_done; } context->max_num_legacy_dyn_uar_sys_page = context->num_dyn_bfregs / (context->num_uars_per_page * MLX5_NUM_NON_FP_BFREGS_PER_UAR); num_sys_page_map = context->tot_uuars / (context->num_uars_per_page * MLX5_NUM_NON_FP_BFREGS_PER_UAR); for (i = 0; i < num_sys_page_map; ++i) { if (mlx5_mmap(&context->uar[i], i, cmd_fd, page_size, context->shut_up_bf ? 
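/*
 * Worked example for the bfreg bookkeeping above, with hypothetical
 * numbers (the per-UAR constants are defined elsewhere in this tree;
 * the values here are for illustration only): assume tot_uuars == 16,
 * num_uars_per_page == 2 (two 4K UARs per system page), and two
 * non-fast-path bfregs per UAR out of four total. Then
 *
 *	gross_uuars      = 16 / 2 * 4   = 32 bf slots
 *	num_sys_page_map = 16 / (2 * 2) = 4 system pages to mmap
 *
 * i.e. every mapped system page contributes num_uars_per_page UARs,
 * each carrying MLX5_NUM_NON_FP_BFREGS_PER_UAR usable doorbell regs.
 */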
MLX5_UAR_TYPE_NC : MLX5_UAR_TYPE_REGULAR) == MAP_FAILED) { context->uar[i].reg = NULL; goto err_free_bf; } } for (i = 0; i < num_sys_page_map; i++) { for (j = 0; j < context->num_uars_per_page; j++) { for (k = 0; k < NUM_BFREGS_PER_UAR; k++) { bfi = (i * context->num_uars_per_page + j) * NUM_BFREGS_PER_UAR + k; context->bfs[bfi].reg = context->uar[i].reg + MLX5_ADAPTER_PAGE_SIZE * j + MLX5_BF_OFFSET + k * context->bf_reg_size; context->bfs[bfi].need_lock = need_uuar_lock(context, bfi); mlx5_spinlock_init(&context->bfs[bfi].lock, context->bfs[bfi].need_lock); context->bfs[bfi].offset = 0; if (bfi) context->bfs[bfi].buf_size = context->bf_reg_size / 2; context->bfs[bfi].uuarn = bfi; context->bfs[bfi].uar_mmap_offset = get_uar_mmap_offset(i, page_size, uar_type_to_cmd(context->uar[i].type)); } } } bf_done: context->hca_core_clock = NULL; if (resp->comp_mask & MLX5_IB_ALLOC_UCONTEXT_RESP_MASK_CORE_CLOCK_OFFSET) { context->core_clock.offset = resp->hca_core_clock_offset; mlx5_map_internal_clock(mdev, &v_ctx->context); } context->clock_info_page = NULL; if ((resp->clock_info_versions & (1 << MLX5_IB_CLOCK_INFO_V1))) mlx5_map_clock_info(mdev, &v_ctx->context); context->flow_action_flags = resp->flow_action_flags; mlx5_read_env(ibdev, context); mlx5_spinlock_init(&context->hugetlb_lock, !mlx5_single_threaded); list_head_init(&context->hugetlb_list); verbs_set_ops(v_ctx, &mlx5_ctx_common_ops); if (context->cqe_version) { if (context->cqe_version == MLX5_CQE_VERSION_V1) verbs_set_ops(v_ctx, &mlx5_ctx_cqev1_ops); else goto err_free; } context->dv_ctx_ops = &mlx5_dv_ctx_ops; mlx5_query_device_ctx(context); for (j = 0; j < min(MLX5_MAX_PORTS_NUM, context->num_ports); ++j) { memset(&port_attr, 0, sizeof(port_attr)); if (!mlx5_query_port(&v_ctx->context, j + 1, &port_attr)) { context->cached_link_layer[j] = port_attr.link_layer; context->cached_port_flags[j] = port_attr.flags; } } mlx5_set_singleton_nc_uar(&v_ctx->context); context->cq_uar_reg = context->nc_uar ? 
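/*
 * Address-math recap for the triple loop above, continuing the
 * hypothetical numbers from the previous note (two UARs per system
 * page, four bfregs per UAR; MLX5_BF_OFFSET is 0x800, see mlx5.h
 * below): system page i == 1, UAR j == 1, bfreg k == 2 gives
 *
 *	bfi = (1 * 2 + 1) * 4 + 2 = 14
 *	reg = uar[1].reg + 4096 * 1 + 0x800 + 2 * bf_reg_size
 *
 * so bfreg 14's doorbell lands in the second 4K UAR of the second
 * mapped page. bfreg 0 is left with buf_size == 0 (the `if (bfi)`
 * guard above skips it).
 */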
context->nc_uar->uar : context->uar[0].reg; pthread_mutex_init(&context->reserved_qpns.mutex, NULL); list_head_init(&context->reserved_qpns.blk_list); return 0; err_free_bf: free(context->bfs); err_free: for (i = 0; i < MLX5_MAX_UARS; ++i) { if (context->uar[i].reg) munmap(context->uar[i].reg, page_size); } return -1; } static struct verbs_context *mlx5_alloc_context(struct ibv_device *ibdev, int cmd_fd, void *private_data) { struct mlx5_context *context; struct mlx5_alloc_ucontext req = {}; struct mlx5_alloc_ucontext_resp resp = {}; struct mlx5dv_context_attr *ctx_attr = private_data; bool always_devx = false; int ret; context = mlx5_init_context(ibdev, cmd_fd); if (!context) return NULL; if (ctx_attr && ctx_attr->comp_mask) { errno = EINVAL; goto err; } req.total_num_bfregs = context->tot_uuars; req.num_low_latency_bfregs = context->low_lat_uuars; req.max_cqe_version = MLX5_CQE_VERSION_V1; req.lib_caps |= (MLX5_LIB_CAP_4K_UAR | MLX5_LIB_CAP_DYN_UAR); if (ctx_attr && ctx_attr->flags) { if (!check_comp_mask(ctx_attr->flags, MLX5DV_CONTEXT_FLAGS_DEVX)) { errno = EINVAL; goto err; } req.flags = MLX5_IB_ALLOC_UCTX_DEVX; } else { req.flags = MLX5_IB_ALLOC_UCTX_DEVX; always_devx = true; } retry_open: if (mlx5_cmd_get_context(context, &req, sizeof(req), &resp, sizeof(resp))) { if (always_devx) { req.flags &= ~MLX5_IB_ALLOC_UCTX_DEVX; always_devx = false; memset(&resp, 0, sizeof(resp)); goto retry_open; } else { goto err; } } ret = mlx5_set_context(context, &resp.drv_payload, false); if (ret) goto err; return &context->ibv_ctx; err: mlx5_uninit_context(context); return NULL; } static struct verbs_context *mlx5_import_context(struct ibv_device *ibdev, int cmd_fd) { struct mlx5_ib_alloc_ucontext_resp resp = {}; DECLARE_COMMAND_BUFFER_LINK(driver_attr, UVERBS_OBJECT_DEVICE, UVERBS_METHOD_QUERY_CONTEXT, 1, NULL); struct ibv_context *context; struct mlx5_context *mctx; int ret; mctx = mlx5_init_context(ibdev, cmd_fd); if (!mctx) return NULL; context = &mctx->ibv_ctx.context; fill_attr_out_ptr(driver_attr, MLX5_IB_ATTR_QUERY_CONTEXT_RESP_UCTX, &resp); ret = ibv_cmd_query_context(context, driver_attr); if (ret) goto err; ret = mlx5_set_context(mctx, &resp, true); if (ret) goto err; return &mctx->ibv_ctx; err: mlx5_uninit_context(mctx); return NULL; } static void mlx5_free_context(struct ibv_context *ibctx) { struct mlx5_context *context = to_mctx(ibctx); int page_size = to_mdev(ibctx->device)->page_size; int i; free(context->bfs); for (i = 0; i < MLX5_MAX_UARS; ++i) { if (context->uar[i].reg) munmap(context->uar[i].reg, page_size); } if (context->hca_core_clock) munmap(context->hca_core_clock - context->core_clock.offset, page_size); if (context->clock_info_page) munmap((void *)context->clock_info_page, page_size); mlx5_close_debug_file(context->dbg_fp); clean_dyn_uars(ibctx); reserved_qpn_blks_free(context); verbs_uninit_context(&context->ibv_ctx); free(context); } static void mlx5_uninit_device(struct verbs_device *verbs_device) { struct mlx5_device *dev = to_mdev(&verbs_device->device); free(dev); } static struct verbs_device *mlx5_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct mlx5_device *dev; dev = calloc(1, sizeof *dev); if (!dev) return NULL; dev->page_size = sysconf(_SC_PAGESIZE); dev->driver_abi_ver = sysfs_dev->abi_ver; mlx5_set_dv_ctx_ops(&mlx5_dv_ctx_ops); return &dev->verbs_dev; } static const struct verbs_device_ops mlx5_dev_ops = { .name = "mlx5", .match_min_abi_version = MLX5_UVERBS_MIN_ABI_VERSION, .match_max_abi_version = MLX5_UVERBS_MAX_ABI_VERSION, .match_table = 
mlx5_hca_table, .alloc_device = mlx5_device_alloc, .uninit_device = mlx5_uninit_device, .alloc_context = mlx5_alloc_context, .import_context = mlx5_import_context, }; static bool is_mlx5_dev(struct ibv_device *device) { struct verbs_device *verbs_device = verbs_get_device(device); return verbs_device->ops == &mlx5_dev_ops; } struct mlx5_dv_context_ops *mlx5_get_dv_ops(struct ibv_context *ibctx) { if (is_mlx5_dev(ibctx->device)) return to_mctx(ibctx)->dv_ctx_ops; else if (is_mlx5_vfio_dev(ibctx->device)) return to_mvfio_ctx(ibctx)->dv_ctx_ops; else return NULL; } PROVIDER_DRIVER(mlx5, mlx5_dev_ops); rdma-core-56.1/providers/mlx5/mlx5.h000066400000000000000000001375751477342711600173140ustar00rootroot00000000000000/* * Copyright (c) 2012 Mellanox Technologies, Inc. All rights reserved. * Copyright (c) 2020 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef MLX5_H #define MLX5_H #include #include #include #include #include #include #include #include #include #include "mlx5-abi.h" #include #include #include #include "mlx5dv.h" #include #define PFX "mlx5: " #ifndef PCI_VENDOR_ID_MELLANOX #define PCI_VENDOR_ID_MELLANOX 0x15b3 #endif typedef _Atomic(uint32_t) atomic_uint32_t; enum { MLX5_IB_MMAP_CMD_SHIFT = 8, MLX5_IB_MMAP_CMD_MASK = 0xff, }; enum { MLX5_CQE_VERSION_V0 = 0, MLX5_CQE_VERSION_V1 = 1, }; enum { MLX5_ADAPTER_PAGE_SIZE = 4096, MLX5_ADAPTER_PAGE_SHIFT = 12, }; #define MLX5_CQ_PREFIX "MLX_CQ" #define MLX5_QP_PREFIX "MLX_QP" #define MLX5_MR_PREFIX "MLX_MR" #define MLX5_RWQ_PREFIX "MLX_RWQ" #define MLX5_SRQ_PREFIX "MLX_SRQ" #define MLX5_MAX_LOG2_CONTIG_BLOCK_SIZE 23 #define MLX5_MIN_LOG2_CONTIG_BLOCK_SIZE 12 enum { MLX5_DBG_QP = 1 << 0, MLX5_DBG_CQ = 1 << 1, MLX5_DBG_QP_SEND = 1 << 2, MLX5_DBG_QP_SEND_ERR = 1 << 3, MLX5_DBG_CQ_CQE = 1 << 4, MLX5_DBG_CONTIG = 1 << 5, MLX5_DBG_DR = 1 << 6, }; extern uint32_t mlx5_debug_mask; extern int mlx5_freeze_on_error_cqe; extern const struct verbs_match_ent mlx5_hca_table[]; #ifdef MLX5_DEBUG #define mlx5_dbg(fp, mask, format, arg...) \ do { \ if (mask & mlx5_debug_mask) { \ int tmp = errno; \ fprintf(fp, "%s:%d: " format, __func__, __LINE__, ##arg); \ errno = tmp; \ } \ } while (0) #else static inline void mlx5_dbg(FILE *fp, uint32_t mask, const char *fmt, ...) 
__attribute__((format(printf, 3, 4))); static inline void mlx5_dbg(FILE *fp, uint32_t mask, const char *fmt, ...) { } #endif __attribute__((format(printf, 2, 3))) static inline void mlx5_err(FILE *fp, const char *fmt, ...) { va_list args; if (!fp) return; va_start(args, fmt); vfprintf(fp, fmt, args); va_end(args); } enum { MLX5_STAT_RATE_OFFSET = 5 }; enum { MLX5_QP_TABLE_SHIFT = 12, MLX5_QP_TABLE_MASK = (1 << MLX5_QP_TABLE_SHIFT) - 1, MLX5_QP_TABLE_SIZE = 1 << (24 - MLX5_QP_TABLE_SHIFT), }; enum { MLX5_UIDX_TABLE_SHIFT = 12, MLX5_UIDX_TABLE_MASK = (1 << MLX5_UIDX_TABLE_SHIFT) - 1, MLX5_UIDX_TABLE_SIZE = 1 << (24 - MLX5_UIDX_TABLE_SHIFT), }; enum { MLX5_SRQ_TABLE_SHIFT = 12, MLX5_SRQ_TABLE_MASK = (1 << MLX5_SRQ_TABLE_SHIFT) - 1, MLX5_SRQ_TABLE_SIZE = 1 << (24 - MLX5_SRQ_TABLE_SHIFT), }; enum { MLX5_MKEY_TABLE_SHIFT = 12, MLX5_MKEY_TABLE_MASK = (1 << MLX5_MKEY_TABLE_SHIFT) - 1, MLX5_MKEY_TABLE_SIZE = 1 << (24 - MLX5_MKEY_TABLE_SHIFT), }; enum { MLX5_BF_OFFSET = 0x800 }; enum { MLX5_TM_OPCODE_NOP = 0x00, MLX5_TM_OPCODE_APPEND = 0x01, MLX5_TM_OPCODE_REMOVE = 0x02, }; enum { MLX5_RECV_OPCODE_RDMA_WRITE_IMM = 0x00, MLX5_RECV_OPCODE_SEND = 0x01, MLX5_RECV_OPCODE_SEND_IMM = 0x02, MLX5_RECV_OPCODE_SEND_INVAL = 0x03, MLX5_CQE_OPCODE_ERROR = 0x1e, MLX5_CQE_OPCODE_RESIZE = 0x16, }; enum { MLX5_SRQ_FLAG_TM_SW_CNT = (1 << 6), MLX5_SRQ_FLAG_TM_CQE_REQ = (1 << 7), }; enum { MLX5_MAX_PORTS_NUM = 2, }; enum { MLX5_CSUM_SUPPORT_RAW_OVER_ETH = (1 << 0), MLX5_CSUM_SUPPORT_UNDERLAY_UD = (1 << 1), /* * Only report rx checksum when the validation * is valid. */ MLX5_RX_CSUM_VALID = (1 << 16), }; enum mlx5_alloc_type { MLX5_ALLOC_TYPE_ANON, MLX5_ALLOC_TYPE_HUGE, MLX5_ALLOC_TYPE_CONTIG, MLX5_ALLOC_TYPE_PREFER_HUGE, MLX5_ALLOC_TYPE_PREFER_CONTIG, MLX5_ALLOC_TYPE_EXTERNAL, MLX5_ALLOC_TYPE_CUSTOM, MLX5_ALLOC_TYPE_ALL }; enum mlx5_rsc_type { MLX5_RSC_TYPE_QP, MLX5_RSC_TYPE_XSRQ, MLX5_RSC_TYPE_SRQ, MLX5_RSC_TYPE_RWQ, MLX5_RSC_TYPE_INVAL, }; enum mlx5_vendor_cap_flags { MLX5_VENDOR_CAP_FLAGS_MPW = 1 << 0, /* Obsoleted */ MLX5_VENDOR_CAP_FLAGS_MPW_ALLOWED = 1 << 1, MLX5_VENDOR_CAP_FLAGS_ENHANCED_MPW = 1 << 2, MLX5_VENDOR_CAP_FLAGS_CQE_128B_COMP = 1 << 3, MLX5_VENDOR_CAP_FLAGS_CQE_128B_PAD = 1 << 4, MLX5_VENDOR_CAP_FLAGS_PACKET_BASED_CREDIT_MODE = 1 << 5, MLX5_VENDOR_CAP_FLAGS_SCAT2CQE_DCT = 1 << 6, MLX5_VENDOR_CAP_FLAGS_OOO_DP = 1 << 7, }; enum { MLX5_FLOW_TAG_MASK = 0x00ffffff, }; struct mlx5_resource { enum mlx5_rsc_type type; uint32_t rsn; }; struct mlx5_device { struct verbs_device verbs_dev; int page_size; int driver_abi_ver; }; struct mlx5_db_page; struct mlx5_spinlock { pthread_spinlock_t lock; int in_use; int need_lock; }; enum mlx5_uar_type { MLX5_UAR_TYPE_REGULAR, MLX5_UAR_TYPE_NC, MLX5_UAR_TYPE_REGULAR_DYN, }; struct mlx5_uar_info { void *reg; enum mlx5_uar_type type; }; enum mlx5_ctx_flags { MLX5_CTX_FLAGS_FATAL_STATE = 1 << 0, MLX5_CTX_FLAGS_NO_KERN_DYN_UAR = 1 << 1, MLX5_CTX_FLAGS_ECE_SUPPORTED = 1 << 2, MLX5_CTX_FLAGS_SQD2RTS_SUPPORTED = 1 << 3, MLX5_CTX_FLAGS_REAL_TIME_TS_SUPPORTED = 1 << 4, MLX5_CTX_FLAGS_MKEY_UPDATE_TAG_SUPPORTED = 1 << 5, }; struct mlx5_entropy_caps { uint8_t num_lag_ports; uint8_t lag_tx_port_affinity:1; uint8_t rts2rts_qp_udp_sport:1; uint8_t rts2rts_lag_tx_port_affinity:1; }; struct mlx5_qos_caps { uint8_t qos:1; uint8_t nic_sq_scheduling:1; uint8_t nic_bw_share:1; uint8_t nic_rate_limit:1; uint8_t nic_qp_scheduling:1; uint32_t nic_element_type; uint32_t nic_tsar_type; }; struct mlx5_hca_cap_2_caps { uint32_t log_reserved_qpns_per_obj; }; struct reserved_qpn_blk { unsigned long 
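/*
 * All of the table enums above use the same two-level split of a 24-bit
 * resource number: the top 12 bits select a lazily allocated page of
 * pointers, the low 12 bits index within it (see mlx5_find_uidx()
 * further below for the read side). Worked example for the QP table:
 *
 *	qpn   = 0x123456
 *	tind  = qpn >> MLX5_QP_TABLE_SHIFT = 0x123  (which page)
 *	index = qpn & MLX5_QP_TABLE_MASK   = 0x456  (slot in that page)
 *
 * giving MLX5_QP_TABLE_SIZE == 1 << (24 - 12) == 4096 pages of 4096
 * entries each, with a per-page refcnt tracking when it can be freed.
 */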
*bmp; uint32_t first_qpn; struct list_node entry; unsigned int next_avail_slot; struct mlx5dv_devx_obj *obj; }; struct mlx5_reserved_qpns { struct list_head blk_list; pthread_mutex_t mutex; }; struct mlx5_dv_context_ops; #define MLX5_DMA_MMO_MAX_SIZE (1ULL << 31) struct mlx5_dma_mmo_caps { uint8_t dma_mmo_sq:1; /* Indicates that RC and DCI support DMA MMO */ uint8_t dma_mmo_qp:1; uint64_t dma_max_size; }; struct mlx5_context { struct verbs_context ibv_ctx; int max_num_qps; int bf_reg_size; int tot_uuars; int low_lat_uuars; int num_uars_per_page; int bf_regs_per_page; int num_bf_regs; int prefer_bf; int shut_up_bf; struct { struct mlx5_qp **table; int refcnt; } qp_table[MLX5_QP_TABLE_SIZE]; pthread_mutex_t qp_table_mutex; struct { struct mlx5_srq **table; int refcnt; } srq_table[MLX5_SRQ_TABLE_SIZE]; pthread_mutex_t srq_table_mutex; struct { struct mlx5_resource **table; int refcnt; } uidx_table[MLX5_UIDX_TABLE_SIZE]; pthread_mutex_t uidx_table_mutex; struct { struct mlx5_mkey **table; int refcnt; } mkey_table[MLX5_MKEY_TABLE_SIZE]; pthread_mutex_t mkey_table_mutex; struct mlx5_uar_info uar[MLX5_MAX_UARS]; struct list_head dbr_available_pages; cl_qmap_t dbr_map; pthread_mutex_t dbr_map_mutex; int cache_line_size; int max_sq_desc_sz; int max_rq_desc_sz; int max_send_wqebb; int max_recv_wr; unsigned max_srq_recv_wr; int num_ports; int stall_enable; int stall_adaptive_enable; int stall_cycles; struct mlx5_bf *bfs; FILE *dbg_fp; char hostname[40]; struct mlx5_spinlock hugetlb_lock; struct list_head hugetlb_list; int cqe_version; uint8_t cached_link_layer[MLX5_MAX_PORTS_NUM]; uint8_t cached_port_flags[MLX5_MAX_PORTS_NUM]; unsigned int cached_device_cap_flags; enum ibv_atomic_cap atomic_cap; struct { uint64_t offset; uint64_t mask; } core_clock; void *hca_core_clock; const struct mlx5_ib_clock_info *clock_info_page; struct mlx5_ib_tso_caps cached_tso_caps; int cmds_supp_uhw; uint32_t uar_size; uint64_t vendor_cap_flags; /* Use enum mlx5_vendor_cap_flags */ struct mlx5dv_cqe_comp_caps cqe_comp_caps; struct mlx5dv_ctx_allocators extern_alloc; struct mlx5dv_sw_parsing_caps sw_parsing_caps; struct mlx5dv_striding_rq_caps striding_rq_caps; struct mlx5dv_dci_streams_caps dci_streams_caps; uint32_t tunnel_offloads_caps; struct mlx5_packet_pacing_caps packet_pacing_caps; struct mlx5_entropy_caps entropy_caps; struct mlx5_qos_caps qos_caps; struct mlx5_hca_cap_2_caps hca_cap_2_caps; uint64_t general_obj_types_caps; uint8_t qpc_extension_cap:1; struct mlx5dv_sig_caps sig_caps; struct mlx5_dma_mmo_caps dma_mmo_caps; struct mlx5dv_crypto_caps crypto_caps; pthread_mutex_t dyn_bfregs_mutex; /* protects the dynamic bfregs allocation */ uint32_t num_dyn_bfregs; uint32_t max_num_legacy_dyn_uar_sys_page; uint32_t curr_legacy_dyn_sys_uar_page; uint16_t flow_action_flags; uint64_t max_dm_size; uint32_t eth_min_inline_size; uint32_t dump_fill_mkey; __be32 dump_fill_mkey_be; uint32_t flags; struct list_head dyn_uar_bf_list; struct list_head dyn_uar_db_list; struct list_head dyn_uar_qp_shared_list; struct list_head dyn_uar_qp_dedicated_list; uint16_t qp_max_dedicated_uuars; uint16_t qp_alloc_dedicated_uuars; uint16_t qp_max_shared_uuars; uint16_t qp_alloc_shared_uuars; struct mlx5_bf *nc_uar; void *cq_uar_reg; struct mlx5_reserved_qpns reserved_qpns; uint8_t qp_data_in_order_cap:1; struct mlx5_dv_context_ops *dv_ctx_ops; struct mlx5dv_devx_obj *crypto_login; pthread_mutex_t crypto_login_mutex; uint64_t max_dc_rd_atom; uint64_t max_dc_init_rd_atom; struct mlx5dv_reg reg_c0; struct mlx5dv_ooo_recv_wrs_caps 
ooo_recv_wrs_caps; }; struct mlx5_hugetlb_mem { int shmid; void *shmaddr; unsigned long *bitmap; unsigned long bmp_size; struct list_node entry; }; struct mlx5_buf { void *buf; size_t length; int base; struct mlx5_hugetlb_mem *hmem; enum mlx5_alloc_type type; uint64_t resource_type; size_t req_alignment; struct mlx5_parent_domain *mparent_domain; }; struct mlx5_td { struct ibv_td ibv_td; struct mlx5_bf *bf; atomic_int refcount; }; struct mlx5_pd { struct ibv_pd ibv_pd; uint32_t pdn; atomic_int refcount; struct mlx5_pd *mprotection_domain; struct { void *opaque_buf; struct ibv_mr *opaque_mr; pthread_mutex_t opaque_mr_mutex; }; }; struct mlx5_parent_domain { struct mlx5_pd mpd; struct mlx5_td *mtd; void *(*alloc)(struct ibv_pd *pd, void *pd_context, size_t size, size_t alignment, uint64_t resource_type); void (*free)(struct ibv_pd *pd, void *pd_context, void *ptr, uint64_t resource_type); void *pd_context; }; enum { MLX5_CQ_SET_CI = 0, MLX5_CQ_ARM_DB = 1, }; enum { MLX5_CQ_FLAGS_RX_CSUM_VALID = 1 << 0, MLX5_CQ_FLAGS_EMPTY_DURING_POLL = 1 << 1, MLX5_CQ_FLAGS_FOUND_CQES = 1 << 2, MLX5_CQ_FLAGS_EXTENDED = 1 << 3, MLX5_CQ_FLAGS_SINGLE_THREADED = 1 << 4, MLX5_CQ_FLAGS_DV_OWNED = 1 << 5, MLX5_CQ_FLAGS_TM_SYNC_REQ = 1 << 6, MLX5_CQ_FLAGS_RAW_WQE = 1 << 7, }; struct mlx5_cq { struct verbs_cq verbs_cq; struct mlx5_buf buf_a; struct mlx5_buf buf_b; struct mlx5_buf *active_buf; struct mlx5_buf *resize_buf; int resize_cqes; int active_cqes; struct mlx5_spinlock lock; uint32_t cqn; uint32_t cons_index; __be32 *dbrec; bool custom_db; int arm_sn; int cqe_sz; int resize_cqe_sz; int stall_next_poll; int stall_enable; uint64_t stall_last_count; int stall_adaptive_enable; int stall_cycles; struct mlx5_resource *cur_rsc; struct mlx5_srq *cur_srq; struct mlx5_cqe64 *cqe64; uint32_t flags; int cached_opcode; struct mlx5dv_clock_info last_clock_info; struct ibv_pd *parent_domain; }; struct mlx5_tag_entry { struct mlx5_tag_entry *next; uint64_t wr_id; int phase_cnt; void *ptr; uint32_t size; int8_t expect_cqe; }; struct mlx5_srq_op { struct mlx5_tag_entry *tag; uint64_t wr_id; /* we need to advance tail pointer */ uint32_t wqe_head; }; struct mlx5_srq { struct mlx5_resource rsc; /* This struct must be first */ struct verbs_srq vsrq; struct mlx5_buf buf; struct mlx5_spinlock lock; uint64_t *wrid; uint32_t srqn; int max; int max_gs; int wqe_shift; int head; int tail; int waitq_head; int waitq_tail; __be32 *db; bool custom_db; uint16_t counter; int wq_sig; struct ibv_qp *cmd_qp; struct mlx5_tag_entry *tm_list; /* vector of all tags */ struct mlx5_tag_entry *tm_head; /* queue of free tags */ struct mlx5_tag_entry *tm_tail; struct mlx5_srq_op *op; int op_head; int op_tail; int unexp_in; int unexp_out; /* Bit is set if WQE is in SW ownership and not part of the SRQ queues * (main/wait) */ unsigned long *free_wqe_bitmap; uint32_t nwqes; }; static inline void mlx5_tm_release_tag(struct mlx5_srq *srq, struct mlx5_tag_entry *tag) { if (!--tag->expect_cqe) { tag->next = NULL; srq->tm_tail->next = tag; srq->tm_tail = tag; } } struct wr_list { uint16_t opcode; uint16_t next; }; struct mlx5_wq { uint64_t *wrid; unsigned *wqe_head; struct mlx5_spinlock lock; unsigned wqe_cnt; unsigned max_post; unsigned head; unsigned tail; unsigned cur_post; int max_gs; /* * Equal to max_gs when qp is in RTS state for sq, or in INIT state for * rq, equal to -1 otherwise, used to verify qp_state in data path. 
*/ int qp_state_max_gs; int wqe_shift; int offset; void *qend; uint32_t *wr_data; }; struct mlx5_devx_uar { struct mlx5dv_devx_uar dv_devx_uar; struct ibv_context *context; }; struct mlx5_bf { void *reg; int need_lock; struct mlx5_spinlock lock; unsigned offset; unsigned buf_size; unsigned uuarn; off_t uar_mmap_offset; /* The virtual address of the mmaped uar, applicable for the dynamic use case */ void *uar; /* Index in the dynamic bfregs portion */ uint32_t bfreg_dyn_index; struct mlx5_devx_uar devx_uar; uint8_t dyn_alloc_uar : 1; uint8_t mmaped_entry : 1; uint8_t nc_mode : 1; uint8_t singleton : 1; uint8_t qp_dedicated : 1; uint8_t qp_shared : 1; uint32_t count; struct list_node uar_entry; uint32_t uar_handle; uint32_t length; uint32_t page_id; }; struct mlx5_dm { struct verbs_dm verbs_dm; size_t length; void *mmap_va; void *start_va; uint64_t remote_va; }; struct mlx5_mr { struct verbs_mr vmr; uint32_t alloc_flags; }; enum mlx5_qp_flags { MLX5_QP_FLAGS_USE_UNDERLAY = 0x01, MLX5_QP_FLAGS_DRAIN_SIGERR = 0x02, MLX5_QP_FLAGS_OOO_DP = 1 << 2, }; struct mlx5_qp { struct mlx5_resource rsc; /* This struct must be first */ struct verbs_qp verbs_qp; struct mlx5dv_qp_ex dv_qp; struct ibv_qp *ibv_qp; struct mlx5_buf buf; int max_inline_data; int buf_size; /* For Raw Packet QP, use different buffers for the SQ and RQ */ struct mlx5_buf sq_buf; int sq_buf_size; struct mlx5_bf *bf; /* Start of new post send API specific fields */ bool inl_wqe; uint8_t cur_setters_cnt; uint8_t num_mkey_setters; uint8_t fm_cache_rb; int err; int nreq; uint32_t cur_size; uint32_t cur_post_rb; void *cur_eth; void *cur_data; struct mlx5_wqe_ctrl_seg *cur_ctrl; struct mlx5_mkey *cur_mkey; /* End of new post send API specific fields */ uint8_t fm_cache; uint8_t sq_signal_bits; void *sq_start; struct mlx5_wq sq; __be32 *db; bool custom_db; struct mlx5_wq rq; int wq_sig; uint32_t qp_cap_cache; int atomics_enabled; uint32_t max_tso; uint16_t max_tso_header; int rss_qp; uint32_t flags; /* Use enum mlx5_qp_flags */ enum mlx5dv_dc_type dc_type; uint32_t tirn; uint32_t tisn; uint32_t rqn; uint32_t sqn; uint64_t tir_icm_addr; /* * ECE configuration is done in create/modify QP stages, * so this value is cached version of the requested ECE prior * to its execution. This field will be cleared after successful * call to relevant "executor". */ uint32_t set_ece; /* * This field indicates returned ECE options from the device * as were received from the HW in previous stage. Every * write to the set_ece will clear this field. 
*/ uint32_t get_ece; uint8_t need_mmo_enable:1; }; struct mlx5_ah { struct ibv_ah ibv_ah; struct mlx5_wqe_av av; bool kern_ah; pthread_mutex_t mutex; uint8_t is_global; struct mlx5dv_devx_obj *ah_qp_mapping; }; struct mlx5_rwq { struct mlx5_resource rsc; struct ibv_wq wq; struct mlx5_buf buf; int buf_size; struct mlx5_wq rq; __be32 *db; bool custom_db; void *pbuff; __be32 *recv_db; int wq_sig; }; struct mlx5_counter_node { uint32_t index; struct list_node entry; enum ibv_counter_description desc; }; struct mlx5_counters { struct verbs_counters vcounters; struct list_head counters_list; pthread_mutex_t lock; uint32_t ncounters; /* number of bounded objects */ int refcount; }; struct mlx5_flow { struct ibv_flow flow_id; struct mlx5_counters *mcounters; }; struct mlx5dv_flow_matcher { struct ibv_context *context; uint32_t handle; }; struct mlx5_steering_anchor { struct ibv_context *context; uint32_t handle; struct mlx5dv_steering_anchor sa; }; enum mlx5_devx_obj_type { MLX5_DEVX_FLOW_TABLE = 1, MLX5_DEVX_FLOW_COUNTER = 2, MLX5_DEVX_FLOW_METER = 3, MLX5_DEVX_QP = 4, MLX5_DEVX_PKT_REFORMAT_CTX = 5, MLX5_DEVX_TIR = 6, MLX5_DEVX_FLOW_GROUP = 7, MLX5_DEVX_FLOW_TABLE_ENTRY = 8, MLX5_DEVX_FLOW_SAMPLER = 9, MLX5_DEVX_ASO_FIRST_HIT = 10, MLX5_DEVX_ASO_FLOW_METER = 11, MLX5_DEVX_ASO_CT = 12, }; struct mlx5dv_devx_obj { struct ibv_context *context; uint32_t handle; enum mlx5_devx_obj_type type; uint32_t object_id; uint64_t rx_icm_addr; uint8_t log_obj_range; void *priv; }; struct mlx5_var_obj { struct mlx5dv_var dv_var; struct ibv_context *context; uint32_t handle; }; struct mlx5_pp_obj { struct mlx5dv_pp dv_pp; struct ibv_context *context; uint32_t handle; }; struct mlx5_devx_umem { struct mlx5dv_devx_umem dv_devx_umem; struct ibv_context *context; uint32_t handle; void *addr; size_t size; }; /* * The BSF state is used in signature and crypto attributes. It indicates the * state the attributes are in, and helps constructing the signature and crypto * BSFs during MKey configuration. * * INIT state indicates that the attributes are not configured. * RESET state indicates that the attributes should be reset in current MKey * configuration. * SET state indicates that the attributes have been set before. * UPDATED state indicates that the attributes have been updated in current * MKey configuration. 
*/ enum mlx5_mkey_bsf_state { MLX5_MKEY_BSF_STATE_INIT, MLX5_MKEY_BSF_STATE_RESET, MLX5_MKEY_BSF_STATE_SET, MLX5_MKEY_BSF_STATE_UPDATED, }; struct mlx5_psv { uint32_t index; struct mlx5dv_devx_obj *devx_obj; }; enum mlx5_sig_type { MLX5_SIG_TYPE_NONE = 0, MLX5_SIG_TYPE_CRC, MLX5_SIG_TYPE_T10DIF, }; struct mlx5_sig_block_domain { enum mlx5_sig_type sig_type; union { struct mlx5dv_sig_t10dif dif; struct mlx5dv_sig_crc crc; } sig; enum mlx5dv_block_size block_size; }; struct mlx5_sig_block_attr { struct mlx5_sig_block_domain mem; struct mlx5_sig_block_domain wire; uint32_t flags; uint8_t check_mask; uint8_t copy_mask; }; struct mlx5_sig_block { struct mlx5_psv *mem_psv; struct mlx5_psv *wire_psv; struct mlx5_sig_block_attr attr; enum mlx5_mkey_bsf_state state; }; struct mlx5_sig_err { uint16_t syndrome; uint64_t expected; uint64_t actual; uint64_t offset; uint8_t sig_type; uint8_t domain; }; struct mlx5_sig_ctx { struct mlx5_sig_block block; struct mlx5_sig_err err_info; uint32_t err_count; bool err_exists; bool err_count_updated; }; struct mlx5_crypto_attr { enum mlx5dv_crypto_standard crypto_standard; bool encrypt_on_tx; enum mlx5dv_signature_crypto_order signature_crypto_order; enum mlx5dv_block_size data_unit_size; char initial_tweak[16]; struct mlx5dv_dek *dek; char keytag[8]; enum mlx5_mkey_bsf_state state; }; struct mlx5_mkey { struct mlx5dv_mkey dv_mkey; struct mlx5dv_devx_obj *devx_obj; uint16_t num_desc; uint64_t length; struct mlx5_sig_ctx *sig; struct mlx5_crypto_attr *crypto; }; struct mlx5dv_crypto_login_obj { struct mlx5dv_devx_obj *devx_obj; }; struct mlx5dv_dek { struct mlx5dv_devx_obj *devx_obj; }; struct mlx5_devx_event_channel { struct ibv_context *context; struct mlx5dv_devx_event_channel dv_event_channel; }; enum mlx5_flow_action_type { MLX5_FLOW_ACTION_COUNTER_OFFSET = 1, }; struct mlx5_flow_action_attr_aux { enum mlx5_flow_action_type type; uint32_t offset; }; struct mlx5dv_sched_node { struct mlx5dv_sched_node *parent; struct mlx5dv_devx_obj *obj; }; struct mlx5dv_sched_leaf { struct mlx5dv_sched_node *parent; struct mlx5dv_devx_obj *obj; }; struct mlx5_devx_msi_vector { struct mlx5dv_devx_msi_vector dv_msi; struct ibv_context *ibctx; }; struct mlx5_devx_eq { struct mlx5dv_devx_eq dv_eq; struct ibv_context *ibctx; uint64_t iova; size_t size; int eqn; }; struct ibv_flow * _mlx5dv_create_flow(struct mlx5dv_flow_matcher *flow_matcher, struct mlx5dv_flow_match_parameters *match_value, size_t num_actions, struct mlx5dv_flow_action_attr actions_attr[], struct mlx5_flow_action_attr_aux actions_attr_aux[]); extern int mlx5_stall_num_loop; extern int mlx5_stall_cq_poll_min; extern int mlx5_stall_cq_poll_max; extern int mlx5_stall_cq_inc_step; extern int mlx5_stall_cq_dec_step; extern int mlx5_single_threaded; #define to_mxxx(xxx, type) container_of(ib##xxx, struct mlx5_##type, ibv_##xxx) static inline struct mlx5_device *to_mdev(struct ibv_device *ibdev) { return container_of(ibdev, struct mlx5_device, verbs_dev.device); } static inline struct mlx5_context *to_mctx(struct ibv_context *ibctx) { return container_of(ibctx, struct mlx5_context, ibv_ctx.context); } /* to_mpd always returns the real mlx5_pd object ie the protection domain. */ static inline struct mlx5_pd *to_mpd(struct ibv_pd *ibpd) { struct mlx5_pd *mpd = to_mxxx(pd, pd); if (mpd->mprotection_domain) return mpd->mprotection_domain; return mpd; } static inline struct mlx5_parent_domain *to_mparent_domain(struct ibv_pd *ibpd) { struct mlx5_parent_domain *mparent_domain = ibpd ? 
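/*
 * The converters in this header all reduce to container_of(): given a
 * pointer to an embedded ibv_* member, subtract its offset to recover
 * the enclosing mlx5_* object. A minimal illustration of the pattern
 * generated by the to_mxxx() macro above (hypothetical caller):
 *
 *	struct ibv_pd *ibpd = ibv_alloc_pd(ctx);
 *	struct mlx5_pd *mpd = to_mpd(ibpd);
 *	// expands to container_of(ibpd, struct mlx5_pd, ibv_pd),
 *	// plus a chase through mprotection_domain so that parent
 *	// domains resolve to the real protection domain.
 *
 * This is plain pointer arithmetic and is safe because every mlx5_pd
 * embeds its ibv_pd member at a fixed offset.
 */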
container_of(ibpd, struct mlx5_parent_domain, mpd.ibv_pd) : NULL; if (mparent_domain && mparent_domain->mpd.mprotection_domain) return mparent_domain; /* Otherwise ibpd isn't a parent_domain */ return NULL; } static inline struct mlx5_cq *to_mcq(struct ibv_cq *ibcq) { return container_of(ibcq, struct mlx5_cq, verbs_cq.cq); } static inline struct mlx5_srq *to_msrq(struct ibv_srq *ibsrq) { struct verbs_srq *vsrq = (struct verbs_srq *)ibsrq; return container_of(vsrq, struct mlx5_srq, vsrq); } static inline struct mlx5_td *to_mtd(struct ibv_td *ibtd) { return to_mxxx(td, td); } static inline struct mlx5_qp *to_mqp(struct ibv_qp *ibqp) { struct verbs_qp *vqp = (struct verbs_qp *)ibqp; return container_of(vqp, struct mlx5_qp, verbs_qp); } static inline struct mlx5_qp *mqp_from_mlx5dv_qp_ex(struct mlx5dv_qp_ex *dv_qp) { return container_of(dv_qp, struct mlx5_qp, dv_qp); } static inline struct mlx5_rwq *to_mrwq(struct ibv_wq *ibwq) { return container_of(ibwq, struct mlx5_rwq, wq); } static inline struct mlx5_dm *to_mdm(struct ibv_dm *ibdm) { return container_of(ibdm, struct mlx5_dm, verbs_dm.dm); } static inline struct mlx5_mr *to_mmr(struct ibv_mr *ibmr) { return container_of(ibmr, struct mlx5_mr, vmr.ibv_mr); } static inline struct mlx5_ah *to_mah(struct ibv_ah *ibah) { return to_mxxx(ah, ah); } static inline int max_int(int a, int b) { return a > b ? a : b; } static inline struct mlx5_qp *rsc_to_mqp(struct mlx5_resource *rsc) { return (struct mlx5_qp *)rsc; } static inline struct mlx5_srq *rsc_to_msrq(struct mlx5_resource *rsc) { return (struct mlx5_srq *)rsc; } static inline struct mlx5_rwq *rsc_to_mrwq(struct mlx5_resource *rsc) { return (struct mlx5_rwq *)rsc; } static inline struct mlx5_counters *to_mcounters(struct ibv_counters *ibcounters) { return container_of(ibcounters, struct mlx5_counters, vcounters.counters); } static inline struct mlx5_flow *to_mflow(struct ibv_flow *flow_id) { return container_of(flow_id, struct mlx5_flow, flow_id); } bool is_mlx5_vfio_dev(struct ibv_device *device); void mlx5_open_debug_file(FILE **dbg_fp); void mlx5_close_debug_file(FILE *dbg_fp); void mlx5_set_debug_mask(void); int mlx5_alloc_buf(struct mlx5_buf *buf, size_t size, int page_size); void mlx5_free_buf(struct mlx5_buf *buf); int mlx5_alloc_buf_contig(struct mlx5_context *mctx, struct mlx5_buf *buf, size_t size, int page_size, const char *component); void mlx5_free_buf_contig(struct mlx5_context *mctx, struct mlx5_buf *buf); int mlx5_alloc_prefered_buf(struct mlx5_context *mctx, struct mlx5_buf *buf, size_t size, int page_size, enum mlx5_alloc_type alloc_type, const char *component); int mlx5_free_actual_buf(struct mlx5_context *ctx, struct mlx5_buf *buf); void mlx5_get_alloc_type(struct mlx5_context *context, struct ibv_pd *pd, const char *component, enum mlx5_alloc_type *alloc_type, enum mlx5_alloc_type default_alloc_type); int mlx5_use_huge(const char *key); bool mlx5_is_custom_alloc(struct ibv_pd *pd); bool mlx5_is_extern_alloc(struct mlx5_context *context); int mlx5_alloc_buf_extern(struct mlx5_context *ctx, struct mlx5_buf *buf, size_t size); void mlx5_free_buf_extern(struct mlx5_context *ctx, struct mlx5_buf *buf); __be32 *mlx5_alloc_dbrec(struct mlx5_context *context, struct ibv_pd *pd, bool *custom_alloc); void mlx5_free_db(struct mlx5_context *context, __be32 *db, struct ibv_pd *pd, bool custom_alloc); void mlx5_query_device_ctx(struct mlx5_context *mctx); int mlx5_query_device_ex(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, 
size_t attr_size); int mlx5_query_rt_values(struct ibv_context *context, struct ibv_values_ex *values); struct ibv_qp *mlx5_create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr); int mlx5_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr); struct ibv_pd *mlx5_alloc_pd(struct ibv_context *context); int mlx5_free_pd(struct ibv_pd *pd); void mlx5_async_event(struct ibv_context *context, struct ibv_async_event *event); struct ibv_mr *mlx5_alloc_null_mr(struct ibv_pd *pd); struct ibv_mr *mlx5_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access); struct ibv_mr *mlx5_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access); int mlx5_rereg_mr(struct verbs_mr *mr, int flags, struct ibv_pd *pd, void *addr, size_t length, int access); int mlx5_dereg_mr(struct verbs_mr *mr); struct ibv_mw *mlx5_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type); int mlx5_dealloc_mw(struct ibv_mw *mw); int mlx5_bind_mw(struct ibv_qp *qp, struct ibv_mw *mw, struct ibv_mw_bind *mw_bind); struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); struct ibv_cq_ex *mlx5_create_cq_ex(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr); int mlx5_cq_fill_pfns(struct mlx5_cq *cq, const struct ibv_cq_init_attr_ex *cq_attr, struct mlx5_context *mctx); int mlx5_alloc_cq_buf(struct mlx5_context *mctx, struct mlx5_cq *cq, struct mlx5_buf *buf, int nent, int cqe_sz); int mlx5_free_cq_buf(struct mlx5_context *ctx, struct mlx5_buf *buf); int mlx5_resize_cq(struct ibv_cq *cq, int cqe); int mlx5_modify_cq(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr); int mlx5_destroy_cq(struct ibv_cq *cq); int mlx5_poll_cq(struct ibv_cq *cq, int ne, struct ibv_wc *wc); int mlx5_poll_cq_v1(struct ibv_cq *cq, int ne, struct ibv_wc *wc); int mlx5_arm_cq(struct ibv_cq *cq, int solicited); void mlx5_cq_event(struct ibv_cq *cq); void __mlx5_cq_clean(struct mlx5_cq *cq, uint32_t qpn, struct mlx5_srq *srq); void mlx5_cq_clean(struct mlx5_cq *cq, uint32_t qpn, struct mlx5_srq *srq); void mlx5_cq_resize_copy_cqes(struct mlx5_context *mctx, struct mlx5_cq *cq); struct ibv_srq *mlx5_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr); int mlx5_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, int mask); int mlx5_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr); int mlx5_destroy_srq(struct ibv_srq *srq); int mlx5_alloc_srq_buf(struct ibv_context *context, struct mlx5_srq *srq, uint32_t nwr, struct ibv_pd *pd); void mlx5_complete_odp_fault(struct mlx5_srq *srq, int ind); void mlx5_free_srq_wqe(struct mlx5_srq *srq, int ind); int mlx5_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); struct ibv_qp *mlx5_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); int mlx5_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); int mlx5_query_qp_data_in_order(struct ibv_qp *qp, enum ibv_wr_opcode op, uint32_t flags); int mlx5_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask); int mlx5_modify_qp_rate_limit(struct ibv_qp *qp, struct ibv_qp_rate_limit_attr *attr); int mlx5_modify_qp_drain_sigerr(struct ibv_qp *qp); int mlx5_destroy_qp(struct ibv_qp *qp); void mlx5_init_qp_indices(struct mlx5_qp *qp); void mlx5_init_rwq_indices(struct mlx5_rwq *rwq); int mlx5_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr 
**bad_wr); int mlx5_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); int mlx5_post_wq_recv(struct ibv_wq *ibwq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); void mlx5_calc_sq_wqe_size(struct ibv_qp_cap *cap, enum ibv_qp_type type, struct mlx5_qp *qp); void mlx5_set_sq_sizes(struct mlx5_qp *qp, struct ibv_qp_cap *cap, enum ibv_qp_type type); struct mlx5_qp *mlx5_find_qp(struct mlx5_context *ctx, uint32_t qpn); int mlx5_store_qp(struct mlx5_context *ctx, uint32_t qpn, struct mlx5_qp *qp); void mlx5_clear_qp(struct mlx5_context *ctx, uint32_t qpn); int32_t mlx5_store_uidx(struct mlx5_context *ctx, void *rsc); void mlx5_clear_uidx(struct mlx5_context *ctx, uint32_t uidx); struct mlx5_srq *mlx5_find_srq(struct mlx5_context *ctx, uint32_t srqn); int mlx5_store_srq(struct mlx5_context *ctx, uint32_t srqn, struct mlx5_srq *srq); void mlx5_clear_srq(struct mlx5_context *ctx, uint32_t srqn); struct mlx5_mkey *mlx5_find_mkey(struct mlx5_context *ctx, uint32_t mkeyn); int mlx5_store_mkey(struct mlx5_context *ctx, uint32_t mkeyn, struct mlx5_mkey *mkey); void mlx5_clear_mkey(struct mlx5_context *ctx, uint32_t mkeyn); struct ibv_ah *mlx5_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr); int mlx5_destroy_ah(struct ibv_ah *ah); int mlx5_alloc_av(struct mlx5_pd *pd, struct ibv_ah_attr *attr, struct mlx5_ah *ah); void mlx5_free_av(struct mlx5_ah *ah); int mlx5_attach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); int mlx5_detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); void *mlx5_get_atomic_laddr(struct mlx5_qp *qp, uint16_t idx, int *byte_count); void *mlx5_get_send_wqe(struct mlx5_qp *qp, int n); int mlx5_copy_to_recv_wqe(struct mlx5_qp *qp, int idx, void *buf, int size); int mlx5_copy_to_send_wqe(struct mlx5_qp *qp, int idx, void *buf, int size); int mlx5_copy_to_recv_srq(struct mlx5_srq *srq, int idx, void *buf, int size); struct ibv_xrcd *mlx5_open_xrcd(struct ibv_context *context, struct ibv_xrcd_init_attr *xrcd_init_attr); int mlx5_get_srq_num(struct ibv_srq *srq, uint32_t *srq_num); struct ibv_qp *mlx5_open_qp(struct ibv_context *context, struct ibv_qp_open_attr *attr); int mlx5_close_xrcd(struct ibv_xrcd *ib_xrcd); struct ibv_wq *mlx5_create_wq(struct ibv_context *context, struct ibv_wq_init_attr *attr); int mlx5_modify_wq(struct ibv_wq *wq, struct ibv_wq_attr *attr); int mlx5_destroy_wq(struct ibv_wq *wq); struct ibv_rwq_ind_table *mlx5_create_rwq_ind_table(struct ibv_context *context, struct ibv_rwq_ind_table_init_attr *init_attr); int mlx5_destroy_rwq_ind_table(struct ibv_rwq_ind_table *rwq_ind_table); struct ibv_flow *mlx5_create_flow(struct ibv_qp *qp, struct ibv_flow_attr *flow_attr); int mlx5_destroy_flow(struct ibv_flow *flow_id); struct ibv_srq *mlx5_create_srq_ex(struct ibv_context *context, struct ibv_srq_init_attr_ex *attr); int mlx5_post_srq_ops(struct ibv_srq *srq, struct ibv_ops_wr *wr, struct ibv_ops_wr **bad_wr); struct ibv_flow_action *mlx5_create_flow_action_esp(struct ibv_context *ctx, struct ibv_flow_action_esp_attr *attr); int mlx5_destroy_flow_action(struct ibv_flow_action *action); int mlx5_modify_flow_action_esp(struct ibv_flow_action *action, struct ibv_flow_action_esp_attr *attr); struct ibv_dm *mlx5_alloc_dm(struct ibv_context *context, struct ibv_alloc_dm_attr *dm_attr); int mlx5_free_dm(struct ibv_dm *ibdm); struct ibv_mr *mlx5_reg_dm_mr(struct ibv_pd *pd, struct ibv_dm *ibdm, uint64_t dm_offset, size_t length, unsigned int acc); struct ibv_td *mlx5_alloc_td(struct 
ibv_context *context, struct ibv_td_init_attr *init_attr); int mlx5_dealloc_td(struct ibv_td *td); struct ibv_pd *mlx5_alloc_parent_domain(struct ibv_context *context, struct ibv_parent_domain_init_attr *attr); void *mlx5_mmap(struct mlx5_uar_info *uar, int index, int cmd_fd, int page_size, int uar_type); off_t get_uar_mmap_offset(int idx, int page_size, int command); struct ibv_counters *mlx5_create_counters(struct ibv_context *context, struct ibv_counters_init_attr *init_attr); int mlx5_destroy_counters(struct ibv_counters *counters); int mlx5_attach_counters_point_flow(struct ibv_counters *counters, struct ibv_counter_attach_attr *attr, struct ibv_flow *flow); int mlx5_read_counters(struct ibv_counters *counters, uint64_t *counters_value, uint32_t ncounters, uint32_t flags); int mlx5_advise_mr(struct ibv_pd *pd, enum ibv_advise_mr_advice advice, uint32_t flags, struct ibv_sge *sg_list, uint32_t num_sges); struct ibv_dm *mlx5_import_dm(struct ibv_context *context, uint32_t dm_handle); void mlx5_unimport_dm(struct ibv_dm *dm); struct ibv_mr *mlx5_import_mr(struct ibv_pd *pd, uint32_t mr_handle); void mlx5_unimport_mr(struct ibv_mr *mr); struct ibv_pd *mlx5_import_pd(struct ibv_context *context, uint32_t pd_handle); void mlx5_unimport_pd(struct ibv_pd *pd); void mlx5_qp_fill_wr_complete_error(struct mlx5_qp *mqp); void mlx5_qp_fill_wr_complete_real(struct mlx5_qp *mqp); int mlx5_qp_fill_wr_pfns(struct mlx5_qp *mqp, const struct ibv_qp_init_attr_ex *attr, const struct mlx5dv_qp_init_attr *mlx5_attr); void clean_dyn_uars(struct ibv_context *context); void mlx5_set_singleton_nc_uar(struct ibv_context *context); int mlx5_set_ece(struct ibv_qp *qp, struct ibv_ece *ece); int mlx5_query_ece(struct ibv_qp *qp, struct ibv_ece *ece); struct mlx5_psv *mlx5_create_psv(struct ibv_pd *pd); int mlx5_destroy_psv(struct mlx5_psv *psv); static inline void *mlx5_find_uidx(struct mlx5_context *ctx, uint32_t uidx) { int tind = uidx >> MLX5_UIDX_TABLE_SHIFT; if (likely(ctx->uidx_table[tind].refcnt)) return ctx->uidx_table[tind].table[uidx & MLX5_UIDX_TABLE_MASK]; return NULL; } static inline int mlx5_spin_lock(struct mlx5_spinlock *lock) { if (lock->need_lock) return pthread_spin_lock(&lock->lock); if (unlikely(lock->in_use)) { fprintf(stderr, "*** ERROR: multithreading violation ***\n" "You are running a multithreaded application but\n" "you set MLX5_SINGLE_THREADED=1. Please unset it.\n"); abort(); } else { lock->in_use = 1; /* * This fence is not at all correct, but it increases the * chance that in_use is detected by another thread without * much runtime cost. 
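 * Note that in_use is a plain, non-atomic flag, so the check stays
 * best-effort: the fence only constrains reordering in this thread and
 * cannot guarantee that a concurrent locker observes the store.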
*/ atomic_thread_fence(memory_order_acq_rel); } return 0; } static inline int mlx5_spin_unlock(struct mlx5_spinlock *lock) { if (lock->need_lock) return pthread_spin_unlock(&lock->lock); lock->in_use = 0; return 0; } static inline int mlx5_spinlock_init(struct mlx5_spinlock *lock, int need_lock) { lock->in_use = 0; lock->need_lock = need_lock; return pthread_spin_init(&lock->lock, PTHREAD_PROCESS_PRIVATE); } static inline int mlx5_spinlock_init_pd(struct mlx5_spinlock *lock, struct ibv_pd *pd) { struct mlx5_parent_domain *mparent_domain; int thread_safe; mparent_domain = to_mparent_domain(pd); if (mparent_domain && mparent_domain->mtd) thread_safe = 1; else thread_safe = mlx5_single_threaded; return mlx5_spinlock_init(lock, !thread_safe); } static inline int mlx5_spinlock_destroy(struct mlx5_spinlock *lock) { return pthread_spin_destroy(&lock->lock); } static inline void set_command(int command, off_t *offset) { *offset |= (command << MLX5_IB_MMAP_CMD_SHIFT); } static inline void set_arg(int arg, off_t *offset) { *offset |= arg; } static inline void set_order(int order, off_t *offset) { set_arg(order, offset); } static inline void set_index(int index, off_t *offset) { set_arg(index, offset); } static inline void set_extended_index(int index, off_t *offset) { *offset |= (index & 0xff) | ((index >> 8) << 16); } static inline uint8_t calc_sig(void *wqe, int size) { int i; uint8_t *p = wqe; uint8_t res = 0; for (i = 0; i < size; ++i) res ^= p[i]; return ~res; } static inline int align_queue_size(long long req) { return roundup_pow_of_two(req); } static inline bool srq_has_waitq(struct mlx5_srq *srq) { return srq->waitq_head >= 0; } bool srq_cooldown_wqe(struct mlx5_srq *srq, int ind); struct mlx5_dv_context_ops { int (*devx_general_cmd)(struct ibv_context *context, const void *in, size_t inlen, void *out, size_t outlen); struct mlx5dv_devx_obj *(*devx_obj_create)(struct ibv_context *context, const void *in, size_t inlen, void *out, size_t outlen); int (*devx_obj_query)(struct mlx5dv_devx_obj *obj, const void *in, size_t inlen, void *out, size_t outlen); int (*devx_obj_modify)(struct mlx5dv_devx_obj *obj, const void *in, size_t inlen, void *out, size_t outlen); int (*devx_obj_destroy)(struct mlx5dv_devx_obj *obj); int (*devx_query_eqn)(struct ibv_context *context, uint32_t vector, uint32_t *eqn); int (*devx_cq_query)(struct ibv_cq *cq, const void *in, size_t inlen, void *out, size_t outlen); int (*devx_cq_modify)(struct ibv_cq *cq, const void *in, size_t inlen, void *out, size_t outlen); int (*devx_qp_query)(struct ibv_qp *qp, const void *in, size_t inlen, void *out, size_t outlen); int (*devx_qp_modify)(struct ibv_qp *qp, const void *in, size_t inlen, void *out, size_t outlen); int (*devx_srq_query)(struct ibv_srq *srq, const void *in, size_t inlen, void *out, size_t outlen); int (*devx_srq_modify)(struct ibv_srq *srq, const void *in, size_t inlen, void *out, size_t outlen); int (*devx_wq_query)(struct ibv_wq *wq, const void *in, size_t inlen, void *out, size_t outlen); int (*devx_wq_modify)(struct ibv_wq *wq, const void *in, size_t inlen, void *out, size_t outlen); int (*devx_ind_tbl_query)(struct ibv_rwq_ind_table *ind_tbl, const void *in, size_t inlen, void *out, size_t outlen); int (*devx_ind_tbl_modify)(struct ibv_rwq_ind_table *ind_tbl, const void *in, size_t inlen, void *out, size_t outlen); struct mlx5dv_devx_cmd_comp *(*devx_create_cmd_comp)(struct ibv_context *context); void (*devx_destroy_cmd_comp)(struct mlx5dv_devx_cmd_comp *cmd_comp); struct mlx5dv_devx_event_channel 
*(*devx_create_event_channel)(struct ibv_context *context, enum mlx5dv_devx_create_event_channel_flags flags); void (*devx_destroy_event_channel)(struct mlx5dv_devx_event_channel *dv_event_channel); int (*devx_subscribe_devx_event)(struct mlx5dv_devx_event_channel *dv_event_channel, struct mlx5dv_devx_obj *obj, uint16_t events_sz, uint16_t events_num[], uint64_t cookie); int (*devx_subscribe_devx_event_fd)(struct mlx5dv_devx_event_channel *dv_event_channel, int fd, struct mlx5dv_devx_obj *obj, uint16_t event_num); int (*devx_obj_query_async)(struct mlx5dv_devx_obj *obj, const void *in, size_t inlen, size_t outlen, uint64_t wr_id, struct mlx5dv_devx_cmd_comp *cmd_comp); int (*devx_get_async_cmd_comp)(struct mlx5dv_devx_cmd_comp *cmd_comp, struct mlx5dv_devx_async_cmd_hdr *cmd_resp, size_t cmd_resp_len); ssize_t (*devx_get_event)(struct mlx5dv_devx_event_channel *event_channel, struct mlx5dv_devx_async_event_hdr *event_data, size_t event_resp_len); struct mlx5dv_devx_uar *(*devx_alloc_uar)(struct ibv_context *context, uint32_t flags); void (*devx_free_uar)(struct mlx5dv_devx_uar *dv_devx_uar); struct mlx5dv_devx_umem *(*devx_umem_reg)(struct ibv_context *context, void *addr, size_t size, uint32_t access); struct mlx5dv_devx_umem *(*devx_umem_reg_ex)(struct ibv_context *ctx, struct mlx5dv_devx_umem_in *umem_in); int (*devx_umem_dereg)(struct mlx5dv_devx_umem *dv_devx_umem); struct mlx5dv_mkey *(*create_mkey)(struct mlx5dv_mkey_init_attr *mkey_init_attr); int (*destroy_mkey)(struct mlx5dv_mkey *dv_mkey); int (*crypto_login)(struct ibv_context *context, struct mlx5dv_crypto_login_attr *login_attr); int (*crypto_login_query_state)(struct ibv_context *context, enum mlx5dv_crypto_login_state *state); int (*crypto_logout)(struct ibv_context *context); struct mlx5dv_crypto_login_obj *(*crypto_login_create)( struct ibv_context *context, struct mlx5dv_crypto_login_attr_ex *login_attr); int (*crypto_login_query)( struct mlx5dv_crypto_login_obj *crypto_login, struct mlx5dv_crypto_login_query_attr *query_attr); int (*crypto_login_destroy)( struct mlx5dv_crypto_login_obj *crypto_login); struct mlx5dv_dek *(*dek_create)(struct ibv_context *context, struct mlx5dv_dek_init_attr *init_attr); int (*dek_query)(struct mlx5dv_dek *dek, struct mlx5dv_dek_attr *dek_attr); int (*dek_destroy)(struct mlx5dv_dek *dek); struct mlx5dv_var *(*alloc_var)(struct ibv_context *context, uint32_t flags); void (*free_var)(struct mlx5dv_var *dv_var); struct mlx5dv_pp *(*pp_alloc)(struct ibv_context *context, size_t pp_context_sz, const void *pp_context, uint32_t flags); void (*pp_free)(struct mlx5dv_pp *dv_pp); int (*init_obj)(struct mlx5dv_obj *obj, uint64_t obj_type); struct ibv_cq_ex *(*create_cq)(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr, struct mlx5dv_cq_init_attr *mlx5_cq_attr); struct ibv_qp *(*create_qp)(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr); struct mlx5dv_qp_ex *(*qp_ex_from_ibv_qp_ex)(struct ibv_qp_ex *qp); /* Is this needed? 
*/
	struct ibv_wq *(*create_wq)(struct ibv_context *context, struct ibv_wq_init_attr *attr, struct mlx5dv_wq_init_attr *mlx5_wq_attr);
	struct ibv_dm *(*alloc_dm)(struct ibv_context *context, struct ibv_alloc_dm_attr *dm_attr, struct mlx5dv_alloc_dm_attr *mlx5_dm_attr);
	void *(*dm_map_op_addr)(struct ibv_dm *dm, uint8_t op);
	struct ibv_flow_action *(*create_flow_action_esp)(struct ibv_context *ctx, struct ibv_flow_action_esp_attr *esp, struct mlx5dv_flow_action_esp *mlx5_attr);
	struct ibv_flow_action *(*create_flow_action_modify_header)(struct ibv_context *ctx, size_t actions_sz, uint64_t actions[], enum mlx5dv_flow_table_type ft_type);
	struct ibv_flow_action *(*create_flow_action_packet_reformat)(struct ibv_context *ctx, size_t data_sz, void *data, enum mlx5dv_flow_action_packet_reformat_type reformat_type, enum mlx5dv_flow_table_type ft_type);
	struct mlx5dv_flow_matcher *(*create_flow_matcher)(struct ibv_context *context, struct mlx5dv_flow_matcher_attr *attr);
	int (*destroy_flow_matcher)(struct mlx5dv_flow_matcher *flow_matcher);
	struct ibv_flow *(*create_flow)(struct mlx5dv_flow_matcher *flow_matcher, struct mlx5dv_flow_match_parameters *match_value, size_t num_actions, struct mlx5dv_flow_action_attr actions_attr[], struct mlx5_flow_action_attr_aux actions_attr_aux[]);
	struct mlx5dv_steering_anchor *(*create_steering_anchor)(struct ibv_context *context, struct mlx5dv_steering_anchor_attr *attr);
	int (*destroy_steering_anchor)(struct mlx5_steering_anchor *sa);
	int (*query_device)(struct ibv_context *ctx_in, struct mlx5dv_context *attrs_out);
	int (*query_qp_lag_port)(struct ibv_qp *qp, uint8_t *port_num, uint8_t *active_port_num);
	int (*modify_qp_lag_port)(struct ibv_qp *qp, uint8_t port_num);
	int (*modify_qp_udp_sport)(struct ibv_qp *qp, uint16_t udp_sport);
	struct mlx5dv_sched_node *(*sched_node_create)(struct ibv_context *ctx, const struct mlx5dv_sched_attr *attr);
	struct mlx5dv_sched_leaf *(*sched_leaf_create)(struct ibv_context *ctx, const struct mlx5dv_sched_attr *attr);
	int (*sched_node_modify)(struct mlx5dv_sched_node *node, const struct mlx5dv_sched_attr *attr);
	int (*sched_leaf_modify)(struct mlx5dv_sched_leaf *leaf, const struct mlx5dv_sched_attr *attr);
	int (*sched_node_destroy)(struct mlx5dv_sched_node *node);
	int (*sched_leaf_destroy)(struct mlx5dv_sched_leaf *leaf);
	int (*modify_qp_sched_elem)(struct ibv_qp *qp, const struct mlx5dv_sched_leaf *requestor, const struct mlx5dv_sched_leaf *responder);
	int (*reserved_qpn_alloc)(struct ibv_context *ctx, uint32_t *qpn);
	int (*reserved_qpn_dealloc)(struct ibv_context *ctx, uint32_t qpn);
	int (*set_context_attr)(struct ibv_context *ibv_ctx, enum mlx5dv_set_ctx_attr_type type, void *attr);
	int (*get_clock_info)(struct ibv_context *ctx_in, struct mlx5dv_clock_info *clock_info);
	int (*query_port)(struct ibv_context *context, uint32_t port_num, struct mlx5dv_port *info, size_t info_len);
	int (*map_ah_to_qp)(struct ibv_ah *ah, uint32_t qp_num);
	struct mlx5dv_devx_msi_vector *(*devx_alloc_msi_vector)(struct ibv_context *ibctx);
	int (*devx_free_msi_vector)(struct mlx5dv_devx_msi_vector *msi);
	struct mlx5dv_devx_eq *(*devx_create_eq)(struct ibv_context *ibctx, const void *in, size_t inlen, void *out, size_t outlen);
	int (*devx_destroy_eq)(struct mlx5dv_devx_eq *eq);
	struct ibv_mr *(*reg_dmabuf_mr)(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access, int mlx5_access);
	int (*get_data_direct_sysfs_path)(struct ibv_context *context, char *buf, size_t buf_len);
};
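/*
 * All mlx5dv_* entry points dispatch through the table above, which
 * lets an alternative transport (for instance one that cannot go
 * through the regular verbs command channel) install its own
 * implementations; mlx5_get_dv_ops() returns the table bound to the
 * given context.
 */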
struct mlx5_dv_context_ops *mlx5_get_dv_ops(struct ibv_context *context);
void mlx5_set_dv_ctx_ops(struct mlx5_dv_context_ops *ops);

int mlx5_cmd_status_to_err(uint8_t status);
int mlx5_get_cmd_status_err(int err, void *out);

#endif /* MLX5_H */
rdma-core-56.1/providers/mlx5/mlx5_api.h000066400000000000000000000105311477342711600201230ustar00rootroot00000000000000/*
 * Copyright (c) 2017, Mellanox Technologies inc. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses. You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * - Redistributions of source code must retain the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials
 *   provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
#ifndef MLX5_API_H
#define MLX5_API_H

#include <infiniband/mlx5_user_ioctl_verbs.h>

#define mlx5dv_flow_action_flags mlx5_ib_uapi_flow_action_flags
#define MLX5DV_FLOW_ACTION_FLAGS_REQUIRE_METADATA MLX5_IB_UAPI_FLOW_ACTION_FLAGS_REQUIRE_METADATA
#define mlx5dv_flow_table_type mlx5_ib_uapi_flow_table_type
#define MLX5DV_FLOW_TABLE_TYPE_NIC_RX MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_RX
#define MLX5DV_FLOW_TABLE_TYPE_NIC_TX MLX5_IB_UAPI_FLOW_TABLE_TYPE_NIC_TX
#define MLX5DV_FLOW_TABLE_TYPE_FDB MLX5_IB_UAPI_FLOW_TABLE_TYPE_FDB
#define MLX5DV_FLOW_TABLE_TYPE_RDMA_RX MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_RX
#define MLX5DV_FLOW_TABLE_TYPE_RDMA_TX MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TX
#define mlx5dv_flow_action_packet_reformat_type mlx5_ib_uapi_flow_action_packet_reformat_type
#define MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2 MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2
#define MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL
#define MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L3_TUNNEL_TO_L2 MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L3_TUNNEL_TO_L2
#define MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL MLX5_IB_UAPI_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL
#define mlx5dv_devx_async_cmd_hdr mlx5_ib_uapi_devx_async_cmd_hdr
#define mlx5dv_devx_async_event_hdr mlx5_ib_uapi_devx_async_event_hdr
#define mlx5dv_alloc_dm_type mlx5_ib_uapi_dm_type
#define MLX5DV_DM_TYPE_MEMIC MLX5_IB_UAPI_DM_TYPE_MEMIC
#define MLX5DV_DM_TYPE_STEERING_SW_ICM MLX5_IB_UAPI_DM_TYPE_STEERING_SW_ICM
#define MLX5DV_DM_TYPE_HEADER_MODIFY_SW_ICM MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_SW_ICM
#define MLX5DV_DM_TYPE_HEADER_MODIFY_PATTERN_SW_ICM MLX5_IB_UAPI_DM_TYPE_HEADER_MODIFY_PATTERN_SW_ICM
#define MLX5DV_DM_TYPE_ENCAP_SW_ICM MLX5_IB_UAPI_DM_TYPE_ENCAP_SW_ICM
#define mlx5dv_devx_create_event_channel_flags mlx5_ib_uapi_devx_create_event_channel_flags
#define MLX5DV_DEVX_CREATE_EVENT_CHANNEL_FLAGS_OMIT_EV_DATA MLX5_IB_UAPI_DEVX_CR_EV_CH_FLAGS_OMIT_DATA
#define MLX5DV_PP_ALLOC_FLAGS_DEDICATED_INDEX MLX5_IB_UAPI_PP_ALLOC_FLAGS_DEDICATED_INDEX
#define MLX5DV_UAR_ALLOC_TYPE_BF MLX5_IB_UAPI_UAR_ALLOC_TYPE_BF
#define MLX5DV_UAR_ALLOC_TYPE_NC MLX5_IB_UAPI_UAR_ALLOC_TYPE_NC
#define MLX5DV_UAR_ALLOC_TYPE_NC_DEDICATED (1U << 31)
#define MLX5DV_QUERY_PORT_VPORT MLX5_IB_UAPI_QUERY_PORT_VPORT
#define MLX5DV_QUERY_PORT_VPORT_VHCA_ID MLX5_IB_UAPI_QUERY_PORT_VPORT_VHCA_ID
#define MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_RX MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_RX
#define MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_TX MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_TX
#define MLX5DV_QUERY_PORT_VPORT_REG_C0 MLX5_IB_UAPI_QUERY_PORT_VPORT_REG_C0
#define MLX5DV_QUERY_PORT_ESW_OWNER_VHCA_ID MLX5_IB_UAPI_QUERY_PORT_ESW_OWNER_VHCA_ID
#define mlx5dv_port mlx5_ib_uapi_query_port
#define mlx5dv_reg mlx5_ib_uapi_reg
#define MLX5DV_REG_DMABUF_ACCESS_DATA_DIRECT MLX5_IB_UAPI_REG_DMABUF_ACCESS_DATA_DIRECT

#endif
rdma-core-56.1/providers/mlx5/mlx5_ifc.h000066400000000000000000004132201477342711600201150ustar00rootroot00000000000000/*
 * Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses. You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 * - Redistributions of source code must retain the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer.
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials
 *   provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
#ifndef MLX5_IFC_H
#define MLX5_IFC_H

#define u8 uint8_t

enum mlx5_cap_mode {
	HCA_CAP_OPMOD_GET_MAX = 0,
	HCA_CAP_OPMOD_GET_CUR = 1,
};

enum { MLX5_CMD_OP_QUERY_HCA_CAP = 0x100, MLX5_CMD_OP_INIT_HCA = 0x102, MLX5_CMD_OP_TEARDOWN_HCA = 0x103, MLX5_CMD_OP_ENABLE_HCA = 0x104, MLX5_CMD_OP_QUERY_PAGES = 0x107, MLX5_CMD_OP_MANAGE_PAGES = 0x108, MLX5_CMD_OP_SET_HCA_CAP = 0x109, MLX5_CMD_OP_QUERY_ISSI = 0x10a, MLX5_CMD_OP_SET_ISSI = 0x10b, MLX5_CMD_OP_CREATE_MKEY = 0x200, MLX5_CMD_OP_DESTROY_MKEY = 0x202, MLX5_CMD_OP_CREATE_EQ = 0x301, MLX5_CMD_OP_DESTROY_EQ = 0x302, MLX5_CMD_OP_CREATE_CQ = 0x400, MLX5_CMD_OP_DESTROY_CQ = 0x401, MLX5_CMD_OP_CREATE_QP = 0x500, MLX5_CMD_OP_DESTROY_QP = 0x501, MLX5_CMD_OP_RST2INIT_QP = 0x502, MLX5_CMD_OP_INIT2RTR_QP = 0x503, MLX5_CMD_OP_RTR2RTS_QP = 0x504, MLX5_CMD_OP_RTS2RTS_QP = 0x505, MLX5_CMD_OP_SQERR2RTS_QP = 0x506, MLX5_CMD_OP_2ERR_QP = 0x507, MLX5_CMD_OP_2RST_QP = 0x50a, MLX5_CMD_OP_QUERY_QP = 0x50b, MLX5_CMD_OP_SQD_RTS_QP = 0x50c, MLX5_CMD_OP_INIT2INIT_QP = 0x50e, MLX5_CMD_OP_CREATE_PSV = 0x600, MLX5_CMD_OP_DESTROY_PSV = 0x601, MLX5_CMD_OP_CREATE_SRQ = 0x700, MLX5_CMD_OP_DESTROY_SRQ = 0x701, MLX5_CMD_OP_CREATE_XRC_SRQ = 0x705, MLX5_CMD_OP_DESTROY_XRC_SRQ = 0x706, MLX5_CMD_OP_CREATE_DCT = 0x710, MLX5_CMD_OP_DESTROY_DCT = 0x711, MLX5_CMD_OP_QUERY_DCT = 0x713, MLX5_CMD_OP_CREATE_XRQ = 0x717, MLX5_CMD_OP_DESTROY_XRQ = 0x718, MLX5_CMD_OP_QUERY_ESW_FUNCTIONS = 0x740, MLX5_CMD_OP_QUERY_ESW_VPORT_CONTEXT = 0x752, MLX5_CMD_OP_QUERY_NIC_VPORT_CONTEXT = 0x754, MLX5_CMD_OP_MODIFY_NIC_VPORT_CONTEXT = 0x755, MLX5_CMD_OP_QUERY_ROCE_ADDRESS = 0x760, MLX5_CMD_OP_ALLOC_Q_COUNTER = 0x771, MLX5_CMD_OP_DEALLOC_Q_COUNTER = 0x772, MLX5_CMD_OP_CREATE_SCHEDULING_ELEMENT = 0x782, MLX5_CMD_OP_DESTROY_SCHEDULING_ELEMENT = 0x783, MLX5_CMD_OP_ALLOC_PD = 0x800, MLX5_CMD_OP_DEALLOC_PD = 0x801, MLX5_CMD_OP_ALLOC_UAR = 0x802, MLX5_CMD_OP_DEALLOC_UAR = 0x803, MLX5_CMD_OP_ACCESS_REG = 0x805, MLX5_CMD_OP_ATTACH_TO_MCG = 0x806, MLX5_CMD_OP_DETACH_FROM_MCG = 0x807, MLX5_CMD_OP_ALLOC_XRCD = 0x80e, MLX5_CMD_OP_DEALLOC_XRCD = 0x80f, MLX5_CMD_OP_ALLOC_TRANSPORT_DOMAIN = 0x816, MLX5_CMD_OP_DEALLOC_TRANSPORT_DOMAIN = 0x817, MLX5_CMD_OP_ADD_VXLAN_UDP_DPORT = 0x827, MLX5_CMD_OP_DELETE_VXLAN_UDP_DPORT = 0x828, MLX5_CMD_OP_SET_L2_TABLE_ENTRY = 0x829, MLX5_CMD_OP_DELETE_L2_TABLE_ENTRY = 0x82b, MLX5_CMD_OP_QUERY_LAG = 0x842, MLX5_CMD_OP_CREATE_TIR = 0x900, MLX5_CMD_OP_DESTROY_TIR = 0x902, MLX5_CMD_OP_CREATE_SQ = 0x904, MLX5_CMD_OP_MODIFY_SQ = 0x905, MLX5_CMD_OP_DESTROY_SQ = 0x906, MLX5_CMD_OP_CREATE_RQ = 0x908, MLX5_CMD_OP_DESTROY_RQ = 0x90a, MLX5_CMD_OP_CREATE_RMP = 0x90c, MLX5_CMD_OP_DESTROY_RMP = 0x90e, MLX5_CMD_OP_CREATE_TIS = 0x912, MLX5_CMD_OP_MODIFY_TIS = 0x913, MLX5_CMD_OP_DESTROY_TIS = 0x914, MLX5_CMD_OP_QUERY_TIS = 0x915, MLX5_CMD_OP_CREATE_RQT = 0x916, MLX5_CMD_OP_DESTROY_RQT = 0x918, MLX5_CMD_OP_CREATE_FLOW_TABLE = 0x930, MLX5_CMD_OP_DESTROY_FLOW_TABLE = 0x931, MLX5_CMD_OP_QUERY_FLOW_TABLE = 0x932, MLX5_CMD_OP_CREATE_FLOW_GROUP = 0x933, MLX5_CMD_OP_DESTROY_FLOW_GROUP = 0x934, MLX5_CMD_OP_SET_FLOW_TABLE_ENTRY = 0x936, MLX5_CMD_OP_DELETE_FLOW_TABLE_ENTRY = 0x938, MLX5_CMD_OP_CREATE_FLOW_COUNTER = 0x939, MLX5_CMD_OP_DEALLOC_FLOW_COUNTER = 0x93a, MLX5_CMD_OP_ALLOC_PACKET_REFORMAT_CONTEXT = 0x93d, MLX5_CMD_OP_DEALLOC_PACKET_REFORMAT_CONTEXT = 0x93e, MLX5_CMD_OP_ALLOC_MODIFY_HEADER_CONTEXT = 0x940, MLX5_CMD_OP_DEALLOC_MODIFY_HEADER_CONTEXT = 0x941, MLX5_CMD_OP_CREATE_GENERAL_OBJECT = 0xa00, MLX5_CMD_OP_MODIFY_GENERAL_OBJECT = 0xa01, MLX5_CMD_OP_QUERY_GENERAL_OBJECT = 0xa02,
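/*
 * Illustrative sketch, assuming the DEVX_ST_SZ_DW()/DEVX_SET()/DEVX_GET()
 * helpers from mlx5dv.h and a caller-opened ibv_context "ctx": these
 * opcodes are written into command mailboxes laid out by the *_bits
 * structs below, where each "u8 name[0xN]" member is N bits wide, not
 * N bytes:
 *
 *	uint32_t in[DEVX_ST_SZ_DW(query_hca_cap_in)] = {};
 *	uint32_t out[DEVX_ST_SZ_DW(query_hca_cap_out)] = {};
 *	int log_max_qp;
 *
 *	DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
 *	DEVX_SET(query_hca_cap_in, in, op_mod,
 *		 MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE | HCA_CAP_OPMOD_GET_CUR);
 *	if (!mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)))
 *		log_max_qp = DEVX_GET(query_hca_cap_out, out,
 *				      capability.cmd_hca_cap.log_max_qp);
 */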
MLX5_CMD_OP_DESTROY_GENERAL_OBJECT = 0xa03, MLX5_CMD_OP_CREATE_UMEM = 0xa08, MLX5_CMD_OP_DESTROY_UMEM = 0xa0a, MLX5_CMD_OP_SYNC_STEERING = 0xb00, }; enum { MLX5_CMD_STAT_OK = 0x0, MLX5_CMD_STAT_INT_ERR = 0x1, MLX5_CMD_STAT_BAD_OP_ERR = 0x2, MLX5_CMD_STAT_BAD_PARAM_ERR = 0x3, MLX5_CMD_STAT_BAD_SYS_STATE_ERR = 0x4, MLX5_CMD_STAT_BAD_RES_ERR = 0x5, MLX5_CMD_STAT_RES_BUSY = 0x6, MLX5_CMD_STAT_LIM_ERR = 0x8, MLX5_CMD_STAT_BAD_RES_STATE_ERR = 0x9, MLX5_CMD_STAT_IX_ERR = 0xa, MLX5_CMD_STAT_NO_RES_ERR = 0xf, MLX5_CMD_STAT_BAD_INP_LEN_ERR = 0x50, MLX5_CMD_STAT_BAD_OUTP_LEN_ERR = 0x51, MLX5_CMD_STAT_BAD_QP_STATE_ERR = 0x10, MLX5_CMD_STAT_BAD_PKT_ERR = 0x30, MLX5_CMD_STAT_BAD_SIZE_OUTS_CQES_ERR = 0x40, }; enum { MLX5_PAGES_CANT_GIVE = 0, MLX5_PAGES_GIVE = 1, MLX5_PAGES_TAKE = 2, }; enum { MLX5_REG_HOST_ENDIANNESS = 0x7004, }; enum { MLX5_CAP_PORT_TYPE_IB = 0x0, MLX5_CAP_PORT_TYPE_ETH = 0x1, }; enum mlx5_event { MLX5_EVENT_TYPE_CMD = 0x0a, MLX5_EVENT_TYPE_PAGE_REQUEST = 0xb, }; enum { MLX5_EQ_DOORBEL_OFFSET = 0x40, }; enum { OPCODE_MOD_UPDATE_HEADER_MODIFY_ARGUMENT = 0x1, }; enum { MLX5_DB_BLUEFLAME_BUFFER_SIZE = 0x100, }; struct mlx5_ifc_atomic_caps_bits { u8 reserved_at_0[0x40]; u8 atomic_req_8B_endianness_mode[0x2]; u8 reserved_at_42[0x4]; u8 supported_atomic_req_8B_endianness_mode_1[0x1]; u8 reserved_at_47[0x19]; u8 reserved_at_60[0x20]; u8 reserved_at_80[0x10]; u8 atomic_operations[0x10]; u8 reserved_at_a0[0x10]; u8 atomic_size_qp[0x10]; u8 reserved_at_c0[0x10]; u8 atomic_size_dc[0x10]; u8 reserved_at_e0[0x1a0]; u8 fetch_add_pci_atomic[0x10]; u8 swap_pci_atomic[0x10]; u8 compare_swap_pci_atomic[0x10]; u8 reserved_at_2b0[0x550]; }; struct mlx5_ifc_roce_cap_bits { u8 reserved_0[0x4]; u8 sw_r_roce_src_udp_port[0x1]; u8 fl_rc_qp_when_roce_disabled[0x1]; u8 fl_rc_qp_when_roce_enabled[0x1]; u8 reserved_at_7[0x17]; u8 qp_ts_format[0x2]; u8 reserved_at_20[0x7e0]; }; enum { MLX5_MULTI_PATH_FT_MAX_LEVEL = 64, }; struct mlx5_ifc_flow_table_context_bits { u8 reformat_en[0x1]; u8 decap_en[0x1]; u8 sw_owner[0x1]; u8 termination_table[0x1]; u8 table_miss_action[0x4]; u8 level[0x8]; u8 reserved_at_10[0x8]; u8 log_size[0x8]; u8 reserved_at_20[0x8]; u8 table_miss_id[0x18]; u8 reserved_at_40[0x8]; u8 lag_master_next_table_id[0x18]; u8 reserved_at_60[0x60]; u8 sw_owner_icm_root_1[0x40]; u8 sw_owner_icm_root_0[0x40]; }; struct mlx5_ifc_create_flow_table_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 other_vport[0x1]; u8 reserved_at_41[0xf]; u8 vport_number[0x10]; u8 reserved_at_60[0x20]; u8 table_type[0x8]; u8 reserved_at_88[0x18]; u8 reserved_at_a0[0x20]; struct mlx5_ifc_flow_table_context_bits flow_table_context; }; struct mlx5_ifc_create_flow_table_out_bits { u8 status[0x8]; u8 icm_address_63_40[0x18]; u8 syndrome[0x20]; u8 icm_address_39_32[0x8]; u8 table_id[0x18]; u8 icm_address_31_0[0x20]; }; struct mlx5_ifc_destroy_flow_table_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x20]; u8 other_vport[0x1]; u8 reserved_at_41[0xf]; u8 vport_number[0x10]; u8 reserved_at_60[0x20]; u8 table_type[0x8]; u8 reserved_at_88[0x18]; u8 reserved_at_a0[0x8]; u8 table_id[0x18]; u8 reserved_at_c0[0x140]; }; struct mlx5_ifc_query_flow_table_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x40]; u8 table_type[0x8]; u8 reserved_at_88[0x18]; u8 reserved_at_a0[0x8]; u8 table_id[0x18]; u8 reserved_at_c0[0x140]; }; struct mlx5_ifc_query_flow_table_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; 
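/*
 * Common mailbox convention: every *_out layout opens with an 8-bit
 * status and a 32-bit syndrome, as above; mlx5_cmd_status_to_err()
 * (declared in mlx5.h) maps the MLX5_CMD_STAT_* status codes to errno
 * values.
 */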
u8 reserved_at_40[0x80]; struct mlx5_ifc_flow_table_context_bits flow_table_context; }; struct mlx5_ifc_sync_steering_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0xc0]; }; struct mlx5_ifc_sync_steering_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_device_mem_cap_bits { u8 memic[0x1]; u8 reserved_at_1[0x1f]; u8 reserved_at_20[0xb]; u8 log_min_memic_alloc_size[0x5]; u8 reserved_at_30[0x8]; u8 log_max_memic_addr_alignment[0x8]; u8 memic_bar_start_addr[0x40]; u8 memic_bar_size[0x20]; u8 max_memic_size[0x20]; u8 steering_sw_icm_start_address[0x40]; u8 reserved_at_100[0x8]; u8 log_header_modify_sw_icm_size[0x8]; u8 reserved_at_110[0x2]; u8 log_sw_icm_alloc_granularity[0x6]; u8 log_steering_sw_icm_size[0x8]; u8 log_indirect_encap_sw_icm_size[0x8]; u8 reserved_at_128[0x10]; u8 log_header_modify_pattern_sw_icm_size[0x8]; u8 header_modify_sw_icm_start_address[0x40]; u8 reserved_at_180[0x40]; u8 header_modify_pattern_sw_icm_start_address[0x40]; u8 reserved_at_200[0x40]; u8 indirect_encap_sw_icm_start_address[0x40]; u8 indirect_encap_icm_base[0x40]; u8 reserved_at_2c0[0x540]; }; struct mlx5_ifc_flow_table_fields_supported_bits { u8 outer_dmac[0x1]; u8 outer_smac[0x1]; u8 outer_ether_type[0x1]; u8 outer_ip_version[0x1]; u8 outer_first_prio[0x1]; u8 outer_first_cfi[0x1]; u8 outer_first_vid[0x1]; u8 outer_ipv4_ttl[0x1]; u8 outer_second_prio[0x1]; u8 outer_second_cfi[0x1]; u8 outer_second_vid[0x1]; u8 outer_ipv6_flow_label[0x1]; u8 outer_sip[0x1]; u8 outer_dip[0x1]; u8 outer_frag[0x1]; u8 outer_ip_protocol[0x1]; u8 outer_ip_ecn[0x1]; u8 outer_ip_dscp[0x1]; u8 outer_udp_sport[0x1]; u8 outer_udp_dport[0x1]; u8 outer_tcp_sport[0x1]; u8 outer_tcp_dport[0x1]; u8 outer_tcp_flags[0x1]; u8 outer_gre_protocol[0x1]; u8 outer_gre_key[0x1]; u8 outer_vxlan_vni[0x1]; u8 outer_geneve_vni[0x1]; u8 outer_geneve_oam[0x1]; u8 outer_geneve_protocol_type[0x1]; u8 outer_geneve_opt_len[0x1]; u8 source_vhca_port[0x1]; u8 source_eswitch_port[0x1]; u8 inner_dmac[0x1]; u8 inner_smac[0x1]; u8 inner_ether_type[0x1]; u8 inner_ip_version[0x1]; u8 inner_first_prio[0x1]; u8 inner_first_cfi[0x1]; u8 inner_first_vid[0x1]; u8 inner_ipv4_ttl[0x1]; u8 inner_second_prio[0x1]; u8 inner_second_cfi[0x1]; u8 inner_second_vid[0x1]; u8 inner_ipv6_flow_label[0x1]; u8 inner_sip[0x1]; u8 inner_dip[0x1]; u8 inner_frag[0x1]; u8 inner_ip_protocol[0x1]; u8 inner_ip_ecn[0x1]; u8 inner_ip_dscp[0x1]; u8 inner_udp_sport[0x1]; u8 inner_udp_dport[0x1]; u8 inner_tcp_sport[0x1]; u8 inner_tcp_dport[0x1]; u8 inner_tcp_flags[0x1]; u8 reserved_at_37[0x7]; u8 metadata_reg_b[0x1]; u8 metadata_reg_a[0x1]; u8 geneve_tlv_option_0_data[0x1]; u8 geneve_tlv_option_0_exist[0x1]; u8 reserved_at_42[0x3]; u8 outer_first_mpls_over_udp_ttl[0x1]; u8 outer_first_mpls_over_udp_s_bos[0x1]; u8 outer_first_mpls_over_udp_exp[0x1]; u8 outer_first_mpls_over_udp_label[0x1]; u8 outer_first_mpls_over_gre_ttl[0x1]; u8 outer_first_mpls_over_gre_s_bos[0x1]; u8 outer_first_mpls_over_gre_exp[0x1]; u8 outer_first_mpls_over_gre_label[0x1]; u8 inner_first_mpls_ttl[0x1]; u8 inner_first_mpls_s_bos[0x1]; u8 inner_first_mpls_exp[0x1]; u8 inner_first_mpls_label[0x1]; u8 outer_first_mpls_ttl[0x1]; u8 outer_first_mpls_s_bos[0x1]; u8 outer_first_mpls_exp[0x1]; u8 outer_first_mpls_label[0x1]; u8 outer_emd_tag[0x1]; u8 inner_esp_spi[0x1]; u8 outer_esp_spi[0x1]; u8 inner_ipv6_hop_limit[0x1]; u8 outer_ipv6_hop_limit[0x1]; u8 bth_dst_qp[0x1]; u8 inner_first_svlan[0x1]; u8 
inner_second_svlan[0x1]; u8 outer_first_svlan[0x1]; u8 outer_second_svlan[0x1]; u8 source_sqn[0x1]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_dr_match_spec_bits { u8 smac_47_16[0x20]; u8 smac_15_0[0x10]; u8 ethertype[0x10]; u8 dmac_47_16[0x20]; u8 dmac_15_0[0x10]; u8 first_prio[0x3]; u8 first_cfi[0x1]; u8 first_vid[0xc]; u8 ip_protocol[0x8]; u8 ip_dscp[0x6]; u8 ip_ecn[0x2]; u8 cvlan_tag[0x1]; u8 svlan_tag[0x1]; u8 frag[0x1]; u8 ip_version[0x4]; u8 tcp_flags[0x9]; u8 tcp_sport[0x10]; u8 tcp_dport[0x10]; u8 reserved_at_c0[0x10]; u8 ipv4_ihl[0x4]; u8 l3_ok[0x1]; u8 l4_ok[0x1]; u8 ipv4_checksum_ok[0x1]; u8 l4_checksum_ok[0x1]; u8 ip_ttl_hoplimit[0x8]; u8 udp_sport[0x10]; u8 udp_dport[0x10]; u8 src_ip_127_96[0x20]; u8 src_ip_95_64[0x20]; u8 src_ip_63_32[0x20]; u8 src_ip_31_0[0x20]; u8 dst_ip_127_96[0x20]; u8 dst_ip_95_64[0x20]; u8 dst_ip_63_32[0x20]; u8 dst_ip_31_0[0x20]; }; struct mlx5_ifc_dr_match_set_misc_bits { u8 gre_c_present[0x1]; u8 bth_a[0x1]; u8 gre_k_present[0x1]; u8 gre_s_present[0x1]; u8 source_vhca_port[0x4]; u8 source_sqn[0x18]; u8 source_eswitch_owner_vhca_id[0x10]; u8 source_port[0x10]; u8 outer_second_prio[0x3]; u8 outer_second_cfi[0x1]; u8 outer_second_vid[0xc]; u8 inner_second_prio[0x3]; u8 inner_second_cfi[0x1]; u8 inner_second_vid[0xc]; u8 outer_second_cvlan_tag[0x1]; u8 inner_second_cvlan_tag[0x1]; u8 outer_second_svlan_tag[0x1]; u8 inner_second_svlan_tag[0x1]; u8 outer_emd_tag[0x1]; u8 reserved_at_65[0xb]; u8 gre_protocol[0x10]; u8 gre_key_h[0x18]; u8 gre_key_l[0x8]; u8 vxlan_vni[0x18]; u8 bth_opcode[0x8]; u8 geneve_vni[0x18]; u8 reserved_at_e4[0x6]; u8 geneve_tlv_option_0_exist[0x1]; u8 geneve_oam[0x1]; u8 reserved_at_ec[0xc]; u8 outer_ipv6_flow_label[0x14]; u8 reserved_at_100[0xc]; u8 inner_ipv6_flow_label[0x14]; u8 reserved_at_120[0xa]; u8 geneve_opt_len[0x6]; u8 geneve_protocol_type[0x10]; u8 reserved_at_140[0x8]; u8 bth_dst_qp[0x18]; u8 inner_esp_spi[0x20]; u8 outer_esp_spi[0x20]; u8 reserved_at_1a0[0x20]; u8 reserved_at_1c0[0x20]; u8 reserved_at_1e0[0x20]; }; struct mlx5_ifc_dr_match_set_misc2_bits { u8 outer_first_mpls_label[0x14]; u8 outer_first_mpls_exp[0x3]; u8 outer_first_mpls_s_bos[0x1]; u8 outer_first_mpls_ttl[0x8]; u8 inner_first_mpls_label[0x14]; u8 inner_first_mpls_exp[0x3]; u8 inner_first_mpls_s_bos[0x1]; u8 inner_first_mpls_ttl[0x8]; u8 outer_first_mpls_over_gre_label[0x14]; u8 outer_first_mpls_over_gre_exp[0x3]; u8 outer_first_mpls_over_gre_s_bos[0x1]; u8 outer_first_mpls_over_gre_ttl[0x8]; u8 outer_first_mpls_over_udp_label[0x14]; u8 outer_first_mpls_over_udp_exp[0x3]; u8 outer_first_mpls_over_udp_s_bos[0x1]; u8 outer_first_mpls_over_udp_ttl[0x8]; u8 metadata_reg_c_7[0x20]; u8 metadata_reg_c_6[0x20]; u8 metadata_reg_c_5[0x20]; u8 metadata_reg_c_4[0x20]; u8 metadata_reg_c_3[0x20]; u8 metadata_reg_c_2[0x20]; u8 metadata_reg_c_1[0x20]; u8 metadata_reg_c_0[0x20]; u8 metadata_reg_a[0x20]; u8 reserved_at_1a0[0x20]; u8 reserved_at_1c0[0x20]; u8 reserved_at_1e0[0x20]; }; struct mlx5_ifc_dr_match_set_misc3_bits { u8 inner_tcp_seq_num[0x20]; u8 outer_tcp_seq_num[0x20]; u8 inner_tcp_ack_num[0x20]; u8 outer_tcp_ack_num[0x20]; u8 reserved_at_80[0x8]; u8 outer_vxlan_gpe_vni[0x18]; u8 outer_vxlan_gpe_next_protocol[0x8]; u8 outer_vxlan_gpe_flags[0x8]; u8 reserved_at_b0[0x10]; u8 icmp_header_data[0x20]; u8 icmpv6_header_data[0x20]; u8 icmp_type[0x8]; u8 icmp_code[0x8]; u8 icmpv6_type[0x8]; u8 icmpv6_code[0x8]; u8 geneve_tlv_option_0_data[0x20]; u8 gtpu_teid[0x20]; u8 gtpu_msg_type[0x8]; u8 gtpu_msg_flags[0x8]; u8 reserved_at_170[0x10]; u8 gtpu_dw_2[0x20]; u8 
gtpu_first_ext_dw_0[0x20]; u8 gtpu_dw_0[0x20]; u8 reserved_at_1e0[0x20]; }; struct mlx5_ifc_dr_match_set_misc4_bits { u8 prog_sample_field_value_0[0x20]; u8 prog_sample_field_id_0[0x20]; u8 prog_sample_field_value_1[0x20]; u8 prog_sample_field_id_1[0x20]; u8 prog_sample_field_value_2[0x20]; u8 prog_sample_field_id_2[0x20]; u8 prog_sample_field_value_3[0x20]; u8 prog_sample_field_id_3[0x20]; u8 prog_sample_field_value_4[0x20]; u8 prog_sample_field_id_4[0x20]; u8 prog_sample_field_value_5[0x20]; u8 prog_sample_field_id_5[0x20]; u8 prog_sample_field_value_6[0x20]; u8 prog_sample_field_id_6[0x20]; u8 prog_sample_field_value_7[0x20]; u8 prog_sample_field_id_7[0x20]; }; struct mlx5_ifc_dr_match_set_misc5_bits { u8 macsec_tag_0[0x20]; u8 macsec_tag_1[0x20]; u8 macsec_tag_2[0x20]; u8 macsec_tag_3[0x20]; u8 tunnel_header_0[0x20]; u8 tunnel_header_1[0x20]; u8 tunnel_header_2[0x20]; u8 tunnel_header_3[0x20]; u8 reserved_at_100[0x20]; u8 reserved_at_120[0x20]; u8 reserved_at_140[0x20]; u8 reserved_at_160[0x20]; u8 reserved_at_180[0x20]; u8 reserved_at_1a0[0x20]; u8 reserved_at_1c0[0x20]; u8 reserved_at_1e0[0x20]; }; struct mlx5_ifc_dr_match_param_bits { struct mlx5_ifc_dr_match_spec_bits outer; struct mlx5_ifc_dr_match_set_misc_bits misc; struct mlx5_ifc_dr_match_spec_bits inner; struct mlx5_ifc_dr_match_set_misc2_bits misc2; struct mlx5_ifc_dr_match_set_misc3_bits misc3; struct mlx5_ifc_dr_match_set_misc4_bits misc4; struct mlx5_ifc_dr_match_set_misc5_bits misc5; }; struct mlx5_ifc_flow_table_prop_layout_bits { u8 ft_support[0x1]; u8 flow_tag[0x1]; u8 flow_counter[0x1]; u8 flow_modify_en[0x1]; u8 modify_root[0x1]; u8 identified_miss_table[0x1]; u8 flow_table_modify[0x1]; u8 reformat[0x1]; u8 decap[0x1]; u8 reset_root_to_default[0x1]; u8 pop_vlan[0x1]; u8 push_vlan[0x1]; u8 fpga_vendor_acceleration[0x1]; u8 pop_vlan_2[0x1]; u8 push_vlan_2[0x1]; u8 reformat_and_vlan_action[0x1]; u8 modify_and_vlan_action[0x1]; u8 sw_owner[0x1]; u8 reformat_l3_tunnel_to_l2[0x1]; u8 reformat_l2_to_l3_tunnel[0x1]; u8 reformat_and_modify_action[0x1]; u8 reserved_at_15[0x9]; u8 sw_owner_v2[0x1]; u8 reserved_at_1f[0x1]; u8 reserved_at_20[0x2]; u8 log_max_ft_size[0x6]; u8 log_max_modify_header_context[0x8]; u8 max_modify_header_actions[0x8]; u8 max_ft_level[0x8]; u8 reserved_at_40[0x10]; u8 metadata_reg_b_width[0x8]; u8 metadata_reg_a_width[0x8]; u8 reserved_at_60[0x18]; u8 log_max_ft_num[0x8]; u8 reserved_at_80[0x10]; u8 log_max_flow_counter[0x8]; u8 log_max_destination[0x8]; u8 reserved_at_a0[0x18]; u8 log_max_flow[0x8]; u8 reserved_at_c0[0x40]; struct mlx5_ifc_flow_table_fields_supported_bits ft_field_support; struct mlx5_ifc_flow_table_fields_supported_bits ft_field_bitmask_support; }; enum { MLX5_FLEX_PARSER_GENEVE_ENABLED = 1 << 3, MLX5_FLEX_PARSER_MPLS_OVER_GRE_ENABLED = 1 << 4, mlx5_FLEX_PARSER_MPLS_OVER_UDP_ENABLED = 1 << 5, MLX5_FLEX_PARSER_VXLAN_GPE_ENABLED = 1 << 7, MLX5_FLEX_PARSER_ICMP_V4_ENABLED = 1 << 8, MLX5_FLEX_PARSER_ICMP_V6_ENABLED = 1 << 9, MLX5_FLEX_PARSER_GENEVE_OPT_0_ENABLED = 1 << 10, MLX5_FLEX_PARSER_GTPU_ENABLED = 1 << 11, MLX5_FLEX_PARSER_GTPU_DW_2_ENABLED = 1 << 16, MLX5_FLEX_PARSER_GTPU_FIRST_EXT_DW_0_ENABLED = 1 << 17, MLX5_FLEX_PARSER_GTPU_DW_0_ENABLED = 1 << 18, MLX5_FLEX_PARSER_GTPU_TEID_ENABLED = 1 << 19, }; enum mlx5_ifc_steering_format_version { MLX5_HW_CONNECTX_5 = 0x0, MLX5_HW_CONNECTX_6DX = 0x1, MLX5_HW_CONNECTX_7 = 0x2, MLX5_HW_CONNECTX_8 = 0x3, }; enum mlx5_ifc_ste_v1_modify_hdr_offset { MLX5_MODIFY_HEADER_V1_QW_OFFSET = 0x20, }; struct mlx5_ifc_cmd_hca_cap_bits { u8 
access_other_hca_roce[0x1]; u8 reserved_at_1[0x1e]; u8 vhca_resource_manager[0x1]; u8 hca_cap_2[0x1]; u8 reserved_at_21[0xf]; u8 vhca_id[0x10]; u8 reserved_at_40[0x20]; u8 reserved_at_60[0x2]; u8 qp_data_in_order[0x1]; u8 reserved_at_63[0x8]; u8 log_dma_mmo_max_size[0x5]; u8 reserved_at_70[0x10]; u8 log_max_srq_sz[0x8]; u8 log_max_qp_sz[0x8]; u8 reserved_at_90[0x3]; u8 isolate_vl_tc_new[0x1]; u8 reserved_at_94[0x4]; u8 prio_tag_required[0x1]; u8 reserved_at_99[0x2]; u8 log_max_qp[0x5]; u8 reserved_at_a0[0xb]; u8 log_max_srq[0x5]; u8 reserved_at_b0[0x10]; u8 reserved_at_c0[0x8]; u8 log_max_cq_sz[0x8]; u8 reserved_at_d0[0xb]; u8 log_max_cq[0x5]; u8 log_max_eq_sz[0x8]; u8 relaxed_ordering_write[0x1]; u8 reserved_at_e9[0x1]; u8 log_max_mkey[0x6]; u8 tunneled_atomic[0x1]; u8 as_notify[0x1]; u8 m_pci_port[0x1]; u8 m_vhca_mk[0x1]; u8 cmd_on_behalf[0x1]; u8 device_emulation_manager[0x1]; u8 terminate_scatter_list_mkey[0x1]; u8 repeated_mkey[0x1]; u8 dump_fill_mkey[0x1]; u8 reserved_at_f9[0x3]; u8 log_max_eq[0x4]; u8 max_indirection[0x8]; u8 fixed_buffer_size[0x1]; u8 log_max_mrw_sz[0x7]; u8 force_teardown[0x1]; u8 fast_teardown[0x1]; u8 log_max_bsf_list_size[0x6]; u8 umr_extended_translation_offset[0x1]; u8 null_mkey[0x1]; u8 log_max_klm_list_size[0x6]; u8 reserved_at_120[0x2]; u8 qpc_extension[0x1]; u8 reserved_at_123[0x7]; u8 log_max_ra_req_dc[0x6]; u8 reserved_at_130[0xa]; u8 log_max_ra_res_dc[0x6]; u8 reserved_at_140[0x7]; u8 sig_crc64_xp10[0x1]; u8 sig_crc32c[0x1]; u8 reserved_at_149[0x1]; u8 log_max_ra_req_qp[0x6]; u8 reserved_at_150[0x1]; u8 rts2rts_qp_udp_sport[0x1]; u8 rts2rts_lag_tx_port_affinity[0x1]; u8 dma_mmo_sq[0x1]; u8 reserved_at_154[0x6]; u8 log_max_ra_res_qp[0x6]; u8 end_pad[0x1]; u8 cc_query_allowed[0x1]; u8 cc_modify_allowed[0x1]; u8 start_pad[0x1]; u8 cache_line_128byte[0x1]; u8 gid_table_size_ro[0x1]; u8 pkey_table_size_ro[0x1]; u8 reserved_at_167[0x1]; u8 rnr_nak_q_counters[0x1]; u8 rts2rts_qp_counters_set_id[0x1]; u8 rts2rts_qp_dscp[0x1]; u8 reserved_at_16b[0x4]; u8 qcam_reg[0x1]; u8 gid_table_size[0x10]; u8 out_of_seq_cnt[0x1]; u8 vport_counters[0x1]; u8 retransmission_q_counters[0x1]; u8 debug[0x1]; u8 modify_rq_counters_set_id[0x1]; u8 rq_delay_drop[0x1]; u8 max_qp_cnt[0xa]; u8 pkey_table_size[0x10]; u8 vport_group_manager[0x1]; u8 vhca_group_manager[0x1]; u8 ib_virt[0x1]; u8 eth_virt[0x1]; u8 vnic_env_queue_counters[0x1]; u8 ets[0x1]; u8 nic_flow_table[0x1]; u8 eswitch_manager[0x1]; u8 device_memory[0x1]; u8 mcam_reg[0x1]; u8 pcam_reg[0x1]; u8 local_ca_ack_delay[0x5]; u8 port_module_event[0x1]; u8 enhanced_retransmission_q_counters[0x1]; u8 port_checks[0x1]; u8 pulse_gen_control[0x1]; u8 disable_link_up_by_init_hca[0x1]; u8 beacon_led[0x1]; u8 port_type[0x2]; u8 num_ports[0x8]; u8 reserved_at_1c0[0x1]; u8 pps[0x1]; u8 pps_modify[0x1]; u8 log_max_msg[0x5]; u8 multi_path_xrc_rdma[0x1]; u8 multi_path_dc_rdma[0x1]; u8 multi_path_rc_rdma[0x1]; u8 traffic_fast_control[0x1]; u8 max_tc[0x4]; u8 temp_warn_event[0x1]; u8 dcbx[0x1]; u8 general_notification_event[0x1]; u8 multi_prio_sq[0x1]; u8 afu_owner[0x1]; u8 fpga[0x1]; u8 rol_s[0x1]; u8 rol_g[0x1]; u8 ib_port_sniffer[0x1]; u8 wol_s[0x1]; u8 wol_g[0x1]; u8 wol_a[0x1]; u8 wol_b[0x1]; u8 wol_m[0x1]; u8 wol_u[0x1]; u8 wol_p[0x1]; u8 stat_rate_support[0x10]; u8 sig_block_4048[0x1]; u8 reserved_at_1f1[0xb]; u8 cqe_version[0x4]; u8 compact_address_vector[0x1]; u8 eth_striding_wq[0x1]; u8 reserved_at_202[0x1]; u8 ipoib_enhanced_offloads[0x1]; u8 ipoib_basic_offloads[0x1]; u8 ib_striding_wq[0x1]; u8 repeated_block_disabled[0x1]; u8 
umr_modify_entity_size_disabled[0x1]; u8 umr_modify_atomic_disabled[0x1]; u8 umr_indirect_mkey_disabled[0x1]; u8 umr_fence[0x2]; u8 dc_req_sctr_data_cqe[0x1]; u8 dc_connect_qp[0x1]; u8 dc_cnak_trace[0x1]; u8 drain_sigerr[0x1]; u8 cmdif_checksum[0x2]; u8 sigerr_cqe[0x1]; u8 reserved_at_213[0x1]; u8 wq_signature[0x1]; u8 sctr_data_cqe[0x1]; u8 reserved_at_216[0x1]; u8 sho[0x1]; u8 tph[0x1]; u8 rf[0x1]; u8 dct[0x1]; u8 qos[0x1]; u8 eth_net_offloads[0x1]; u8 roce[0x1]; u8 atomic[0x1]; u8 extended_retry_count[0x1]; u8 cq_oi[0x1]; u8 cq_resize[0x1]; u8 cq_moderation[0x1]; u8 cq_period_mode_modify[0x1]; u8 cq_invalidate[0x1]; u8 reserved_at_225[0x1]; u8 cq_eq_remap[0x1]; u8 pg[0x1]; u8 block_lb_mc[0x1]; u8 exponential_backoff[0x1]; u8 scqe_break_moderation[0x1]; u8 cq_period_start_from_cqe[0x1]; u8 cd[0x1]; u8 atm[0x1]; u8 apm[0x1]; u8 vector_calc[0x1]; u8 umr_ptr_rlkey[0x1]; u8 imaicl[0x1]; u8 qp_packet_based[0x1]; u8 reserved_at_233[0x1]; u8 ipoib_enhanced_pkey_change[0x1]; u8 initiator_src_dct_in_cqe[0x1]; u8 qkv[0x1]; u8 pkv[0x1]; u8 set_deth_sqpn[0x1]; u8 rts2rts_primary_sl[0x1]; u8 initiator_src_dct[0x1]; u8 dc_v2[0x1]; u8 xrc[0x1]; u8 ud[0x1]; u8 uc[0x1]; u8 rc[0x1]; u8 uar_4k[0x1]; u8 reserved_at_241[0x7]; u8 fl_rc_qp_when_roce_disabled[0x1]; u8 reserved_at_249[0x1]; u8 uar_sz[0x6]; u8 reserved_at_250[0x2]; u8 umem_uid_0[0x1]; u8 log_max_dc_cnak_qps[0x5]; u8 log_pg_sz[0x8]; u8 bf[0x1]; u8 driver_version[0x1]; u8 pad_tx_eth_packet[0x1]; u8 query_driver_version[0x1]; u8 max_qp_retry_freq[0x1]; u8 qp_by_name[0x1]; u8 mkey_by_name[0x1]; u8 reserved_at_267[0x1]; u8 suspend_qp_uc[0x1]; u8 suspend_qp_ud[0x1]; u8 suspend_qp_rc[0x1]; u8 log_bf_reg_size[0x5]; u8 reserved_at_270[0x6]; u8 lag_dct[0x2]; u8 lag_tx_port_affinity[0x1]; u8 reserved_at_279[0x2]; u8 lag_master[0x1]; u8 num_lag_ports[0x4]; u8 num_of_diagnostic_counters[0x10]; u8 max_wqe_sz_sq[0x10]; u8 reserved_at_2a0[0x10]; u8 max_wqe_sz_rq[0x10]; u8 max_flow_counter_31_16[0x10]; u8 max_wqe_sz_sq_dc[0x10]; u8 reserved_at_2e0[0x7]; u8 max_qp_mcg[0x19]; u8 mlnx_tag_ethertype[0x10]; u8 reserved_at_310[0x8]; u8 log_max_mcg[0x8]; u8 reserved_at_320[0x3]; u8 log_max_transport_domain[0x5]; u8 reserved_at_328[0x3]; u8 log_max_pd[0x5]; u8 dp_ordering_ooo_all_ud[0x1]; u8 dp_ordering_ooo_all_uc[0x1]; u8 dp_ordering_ooo_all_xrc[0x1]; u8 dp_ordering_ooo_all_dc[0x1]; u8 dp_ordering_ooo_all_rc[0x1]; u8 reserved_at_335[0x6]; u8 log_max_xrcd[0x5]; u8 nic_receive_steering_discard[0x1]; u8 receive_discard_vport_down[0x1]; u8 transmit_discard_vport_down[0x1]; u8 eq_overrun_count[0x1]; u8 nic_receive_steering_depth[0x1]; u8 invalid_command_count[0x1]; u8 quota_exceeded_count[0x1]; u8 reserved_at_347[0x1]; u8 log_max_flow_counter_bulk[0x8]; u8 max_flow_counter_15_0[0x10]; u8 modify_tis[0x1]; u8 reserved_at_361[0x2]; u8 log_max_rq[0x5]; u8 reserved_at_368[0x3]; u8 log_max_sq[0x5]; u8 reserved_at_370[0x3]; u8 log_max_tir[0x5]; u8 reserved_at_378[0x3]; u8 log_max_tis[0x5]; u8 basic_cyclic_rcv_wqe[0x1]; u8 reserved_at_381[0x2]; u8 log_max_rmp[0x5]; u8 reserved_at_388[0x3]; u8 log_max_rqt[0x5]; u8 reserved_at_390[0x3]; u8 log_max_rqt_size[0x5]; u8 reserved_at_398[0x3]; u8 log_max_tis_per_sq[0x5]; u8 ext_stride_num_range[0x1]; u8 reserved_at_3a1[0x2]; u8 log_max_stride_sz_rq[0x5]; u8 reserved_at_3a8[0x3]; u8 log_min_stride_sz_rq[0x5]; u8 reserved_at_3b0[0x3]; u8 log_max_stride_sz_sq[0x5]; u8 reserved_at_3b8[0x3]; u8 log_min_stride_sz_sq[0x5]; u8 hairpin[0x1]; u8 reserved_at_3c1[0x2]; u8 log_max_hairpin_queues[0x5]; u8 reserved_at_3c8[0x3]; u8 
log_max_hairpin_wq_data_sz[0x5]; u8 reserved_at_3d0[0x3]; u8 log_max_hairpin_num_packets[0x5]; u8 reserved_at_3d8[0x3]; u8 log_max_wq_sz[0x5]; u8 nic_vport_change_event[0x1]; u8 disable_local_lb_uc[0x1]; u8 disable_local_lb_mc[0x1]; u8 log_min_hairpin_wq_data_sz[0x5]; u8 reserved_at_3e8[0x3]; u8 log_max_vlan_list[0x5]; u8 reserved_at_3f0[0x1]; u8 aes_xts_single_block_le_tweak[0x1]; u8 aes_xts_multi_block_be_tweak[0x1]; u8 log_max_current_mc_list[0x5]; u8 reserved_at_3f8[0x3]; u8 log_max_current_uc_list[0x5]; u8 general_obj_types[0x40]; u8 sq_ts_format[0x2]; u8 rq_ts_format[0x2]; u8 steering_format_version[0x4]; u8 create_qp_start_hint[0x18]; u8 reserved_at_460[0x9]; u8 crypto[0x1]; u8 reserved_at_46a[0x6]; u8 max_num_eqs[0x10]; u8 sigerr_domain_and_sig_type[0x1]; u8 reserved_at_481[0x2]; u8 log_max_l2_table[0x5]; u8 reserved_at_488[0x8]; u8 log_uar_page_sz[0x10]; u8 reserved_at_4a0[0x20]; u8 device_frequency_mhz[0x20]; u8 device_frequency_khz[0x20]; u8 capi[0x1]; u8 create_pec[0x1]; u8 nvmf_target_offload[0x1]; u8 capi_invalidate[0x1]; u8 reserved_at_504[0x17]; u8 log_max_pasid[0x5]; u8 num_of_uars_per_page[0x20]; u8 flex_parser_protocols[0x20]; u8 reserved_at_560[0x10]; u8 flex_parser_header_modify[0x1]; u8 reserved_at_571[0x2]; u8 log_max_guaranteed_connections[0x5]; u8 reserved_at_578[0x3]; u8 log_max_dct_connections[0x5]; u8 log_max_atomic_size_qp[0x8]; u8 reserved_at_588[0x10]; u8 log_max_atomic_size_dc[0x8]; u8 reserved_at_5a0[0x1c]; u8 mini_cqe_resp_stride_index[0x1]; u8 cqe_128_always[0x1]; u8 cqe_compression_128b[0x1]; u8 cqe_compression[0x1]; u8 cqe_compression_timeout[0x10]; u8 cqe_compression_max_num[0x10]; u8 reserved_at_5e0[0x8]; u8 flex_parser_id_gtpu_dw_0[0x4]; u8 log_max_tm_offloaded_op_size[0x4]; u8 tag_matching[0x1]; u8 rndv_offload_rc[0x1]; u8 rndv_offload_dc[0x1]; u8 log_tag_matching_list_sz[0x5]; u8 reserved_at_5f8[0x3]; u8 log_max_xrq[0x5]; u8 affiliate_nic_vport_criteria[0x8]; u8 native_port_num[0x8]; u8 num_vhca_ports[0x8]; u8 flex_parser_id_gtpu_teid[0x4]; u8 reserved_at_61c[0x1]; u8 trusted_vnic_vhca[0x1]; u8 sw_owner_id[0x1]; u8 reserve_not_to_use[0x1]; u8 reserved_at_620[0x60]; u8 sf[0x1]; u8 reserved_at_682[0x43]; u8 flex_parser_id_geneve_opt_0[0x4]; u8 flex_parser_id_icmp_dw1[0x4]; u8 flex_parser_id_icmp_dw0[0x4]; u8 flex_parser_id_icmpv6_dw1[0x4]; u8 flex_parser_id_icmpv6_dw0[0x4]; u8 flex_parser_id_outer_first_mpls_over_gre[0x4]; u8 flex_parser_id_outer_first_mpls_over_udp_label[0x4]; u8 reserved_at_6e0[0x20]; u8 flex_parser_id_gtpu_dw_2[0x4]; u8 flex_parser_id_gtpu_first_ext_dw_0[0x4]; u8 reserved_at_708[0x18]; u8 reserved_at_720[0x20]; u8 reserved_at_740[0x8]; u8 dma_mmo_qp[0x1]; u8 reserved_at_749[0x17]; u8 reserved_at_760[0x3]; u8 log_max_num_header_modify_argument[0x5]; u8 reserved_at_768[0x4]; u8 log_header_modify_argument_granularity[0x4]; u8 reserved_at_770[0x3]; u8 log_header_modify_argument_max_alloc[0x5]; u8 reserved_at_778[0x8]; u8 reserved_at_780[0x40]; u8 match_definer_format_supported[0x40]; }; struct mlx5_ifc_header_modify_cap_properties_bits { struct mlx5_ifc_flow_table_fields_supported_bits set_action_field_support; u8 reserved_at_80[0x80]; struct mlx5_ifc_flow_table_fields_supported_bits add_action_field_support; u8 reserved_at_180[0x80]; u8 copy_action_field_support[8][0x20]; u8 reserved_at_300[0x100]; }; struct mlx5_ifc_flow_table_fields_supported_2_bits { u8 reserved_at_0[0xf]; u8 tunnel_header_2_3[0x1]; u8 tunnel_header_0_1[0x1]; u8 reserved_at_11[0x6]; u8 inner_l3_ok[0x1]; u8 inner_l4_ok[0x1]; u8 outer_l3_ok[0x1]; u8 
outer_l4_ok[0x1]; u8 psp_header[0x1]; u8 inner_ipv4_checksum_ok[0x1]; u8 inner_l4_checksum_ok[0x1]; u8 outer_ipv4_checksum_ok[0x1]; u8 outer_l4_checksum_ok[0x1]; u8 reserved_at_20[0x60]; }; struct mlx5_ifc_flow_table_nic_cap_bits { u8 nic_rx_multi_path_tirs[0x1]; u8 nic_rx_multi_path_tirs_fts[0x1]; u8 allow_sniffer_and_nic_rx_shared_tir[0x1]; u8 reserved_at_3[0x1]; u8 nic_rx_flow_tag_multipath_en[0x1]; u8 reserved_at_5[0x13]; u8 nic_receive_max_steering_depth[0x8]; u8 encap_general_header[0x1]; u8 reserved_at_21[0xa]; u8 log_max_packet_reformat_context[0x5]; u8 reserved_at_30[0x6]; u8 max_encap_header_size[0xa]; u8 reserved_at_40[0x1c0]; struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_receive; struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_receive_rdma; struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_receive_sniffer; struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_transmit; struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_transmit_rdma; struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_transmit_sniffer; u8 reserved_at_e00[0x200]; struct mlx5_ifc_header_modify_cap_properties_bits header_modify_nic_receive; struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_support_2_nic_receive; struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_bitmask_support_2_nic_receive; struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_support_2_nic_receive_rdma; struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_bitmask_support_2_nic_receive_rdma; struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_support_2_nic_receive_sniffer; struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_bitmask_support_2_nic_receive_sniffer; struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_support_2_nic_transmit; struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_bitmask_support_2_nic_transmit; struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_support_2_nic_transmit_rdma; struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_bitmask_support_2_nic_transmit_rdma; struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_support_2_nic_transmit_sniffer; struct mlx5_ifc_flow_table_fields_supported_2_bits ft_field_bitmask_support_2_nic_transmit_sniffer; u8 reserved_at_1400[0x200]; struct mlx5_ifc_header_modify_cap_properties_bits header_modify_nic_transmit; u8 sw_steering_nic_rx_action_drop_icm_address[0x40]; u8 sw_steering_nic_tx_action_drop_icm_address[0x40]; u8 sw_steering_nic_tx_action_allow_icm_address[0x40]; u8 reserved_at_20c0[0x5f40]; }; struct mlx5_ifc_flow_table_eswitch_cap_bits { u8 reserved_at_0[0x1c]; u8 fdb_multi_path_to_table[0x1]; u8 reserved_at_1d[0x1e3]; struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_nic_esw_fdb; struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_esw_acl_ingress; struct mlx5_ifc_flow_table_prop_layout_bits flow_table_properties_esw_acl_egress; u8 reserved_at_800[0x1000]; u8 sw_steering_fdb_action_drop_icm_address_rx[0x40]; u8 sw_steering_fdb_action_drop_icm_address_tx[0x40]; u8 sw_steering_uplink_icm_address_rx[0x40]; u8 sw_steering_uplink_icm_address_tx[0x40]; u8 reserved_at_1900[0x6700]; }; struct mlx5_ifc_odp_per_transport_service_cap_bits { u8 send[0x1]; u8 receive[0x1]; u8 write[0x1]; u8 read[0x1]; u8 atomic[0x1]; u8 srq_receive[0x1]; u8 reserved_at_6[0x1a]; }; struct mlx5_ifc_odp_cap_bits { u8 reserved_at_0[0x40]; u8 sig[0x1]; u8 reserved_at_41[0x1f]; u8 
reserved_at_60[0x20]; struct mlx5_ifc_odp_per_transport_service_cap_bits rc_odp_caps; struct mlx5_ifc_odp_per_transport_service_cap_bits uc_odp_caps; struct mlx5_ifc_odp_per_transport_service_cap_bits ud_odp_caps; struct mlx5_ifc_odp_per_transport_service_cap_bits xrc_odp_caps; struct mlx5_ifc_odp_per_transport_service_cap_bits dc_odp_caps; u8 reserved_at_120[0x6e0]; }; struct mlx5_ifc_e_switch_cap_bits { u8 reserved_at_0[0x4b]; u8 log_max_esw_sf[0x5]; u8 esw_sf_base_id[0x10]; u8 esw_manager_vport_number_valid[0x1]; u8 reserved_at_61[0xf]; u8 esw_manager_vport_number[0x10]; u8 reserved_at_80[0x780]; }; enum { ELEMENT_TYPE_CAP_MASK_TASR = 1 << 0, ELEMENT_TYPE_CAP_MASK_QUEUE_GROUP = 1 << 4, }; enum { TSAR_TYPE_CAP_MASK_DWRR = 1 << 0, }; struct mlx5_ifc_qos_cap_bits { u8 reserved_at_0[0x8]; u8 nic_sq_scheduling[0x1]; u8 nic_bw_share[0x1]; u8 nic_rate_limit[0x1]; u8 reserved_at_b[0x15]; u8 reserved_at_20[0x1]; u8 nic_qp_scheduling[0x1]; u8 reserved_at_22[0x1e]; u8 reserved_at_40[0xc0]; u8 nic_element_type[0x10]; u8 nic_tsar_type[0x10]; u8 reserved_at_120[0x6e0]; }; struct mlx5_ifc_cmd_hca_cap_2_bits { u8 reserved_at_0[0x80]; u8 reserved_at_80[0x13]; u8 log_reserved_qpn_granularity[0x5]; u8 reserved_at_98[0x8]; u8 reserved_at_a0[0x760]; }; enum { MLX5_CRYPTO_CAPS_WRAPPED_IMPORT_METHOD_AES = 0x4, }; struct mlx5_ifc_crypto_caps_bits { u8 wrapped_crypto_operational[0x1]; u8 wrapped_crypto_going_to_commissioning[0x1]; u8 reserved_at_2[0x16]; u8 wrapped_import_method[0x8]; u8 reserved_at_20[0xb]; u8 log_max_num_deks[0x5]; u8 reserved_at_30[0x3]; u8 log_max_num_import_keks[0x5]; u8 reserved_at_38[0x3]; u8 log_max_num_creds[0x5]; u8 failed_selftests[0x10]; u8 num_nv_import_keks[0x8]; u8 num_nv_credentials[0x8]; u8 reserved_at_60[0x7a0]; }; union mlx5_ifc_hca_cap_union_bits { struct mlx5_ifc_atomic_caps_bits atomic_caps; struct mlx5_ifc_cmd_hca_cap_bits cmd_hca_cap; struct mlx5_ifc_flow_table_nic_cap_bits flow_table_nic_cap; struct mlx5_ifc_flow_table_eswitch_cap_bits flow_table_eswitch_cap; struct mlx5_ifc_e_switch_cap_bits e_switch_cap; struct mlx5_ifc_device_mem_cap_bits device_mem_cap; struct mlx5_ifc_odp_cap_bits odp_cap; struct mlx5_ifc_roce_cap_bits roce_caps; struct mlx5_ifc_qos_cap_bits qos_caps; struct mlx5_ifc_cmd_hca_cap_2_bits cmd_hca_cap_2; struct mlx5_ifc_crypto_caps_bits crypto_caps; u8 reserved_at_0[0x8000]; }; struct mlx5_ifc_query_hca_cap_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; union mlx5_ifc_hca_cap_union_bits capability; }; struct mlx5_ifc_query_hca_cap_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 other_function[0x1]; u8 reserved_at_41[0xf]; u8 function_id[0x10]; u8 reserved_at_60[0x20]; }; enum mlx5_cap_type { MLX5_CAP_GENERAL = 0, MLX5_CAP_ODP = 2, MLX5_CAP_ATOMIC = 3, MLX5_CAP_ROCE, MLX5_CAP_NUM, }; enum { MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE = 0x0 << 1, MLX5_SET_HCA_CAP_OP_MOD_ROCE = 0x4 << 1, MLX5_SET_HCA_CAP_OP_MOD_NIC_FLOW_TABLE = 0x7 << 1, MLX5_SET_HCA_CAP_OP_MOD_ESW_FLOW_TABLE = 0x8 << 1, MLX5_SET_HCA_CAP_OP_MOD_QOS = 0xc << 1, MLX5_SET_HCA_CAP_OP_MOD_ESW = 0x9 << 1, MLX5_SET_HCA_CAP_OP_MOD_DEVICE_MEMORY = 0xf << 1, MLX5_SET_HCA_CAP_OP_MOD_CRYPTO = 0x1a << 1, MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE_CAP_2 = 0x20 << 1, }; enum { MLX5_MKC_ACCESS_MODE_MTT = 0x1, MLX5_MKC_ACCESS_MODE_KLMS = 0x2, }; struct mlx5_ifc_mkc_bits { u8 reserved_at_0[0x1]; u8 free[0x1]; u8 reserved_at_2[0x1]; u8 access_mode_4_2[0x3]; u8 reserved_at_6[0x7]; u8 relaxed_ordering_write[0x1]; u8 
reserved_at_e[0x1]; u8 small_fence_on_rdma_read_response[0x1]; u8 umr_en[0x1]; u8 a[0x1]; u8 rw[0x1]; u8 rr[0x1]; u8 lw[0x1]; u8 lr[0x1]; u8 access_mode_1_0[0x2]; u8 reserved_at_18[0x8]; u8 qpn[0x18]; u8 mkey_7_0[0x8]; u8 reserved_at_40[0x20]; u8 length64[0x1]; u8 bsf_en[0x1]; u8 sync_umr[0x1]; u8 reserved_at_63[0x2]; u8 expected_sigerr_count[0x1]; u8 reserved_at_66[0x1]; u8 en_rinval[0x1]; u8 pd[0x18]; u8 start_addr[0x40]; u8 len[0x40]; u8 bsf_octword_size[0x20]; u8 reserved_at_120[0x80]; u8 translations_octword_size[0x20]; u8 reserved_at_1c0[0x19]; u8 relaxed_ordering_read[0x1]; u8 reserved_at_1d9[0x1]; u8 log_page_size[0x5]; u8 reserved_at_1e0[0x3]; u8 crypto_en[0x2]; u8 reserved_at_1e5[0x1b]; }; struct mlx5_ifc_create_mkey_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x8]; u8 mkey_index[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_create_mkey_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x20]; u8 pg_access[0x1]; u8 mkey_umem_valid[0x1]; u8 reserved_at_62[0x1e]; struct mlx5_ifc_mkc_bits memory_key_mkey_entry; u8 reserved_at_280[0x80]; u8 translations_octword_actual_size[0x20]; u8 reserved_at_320[0x560]; u8 klm_pas_mtt[0][0x20]; }; struct mlx5_ifc_destroy_mkey_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_destroy_mkey_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x8]; u8 mkey_index[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_l2_hdr_bits { u8 dmac_47_16[0x20]; u8 dmac_15_0[0x10]; u8 smac_47_32[0x10]; u8 smac_31_0[0x20]; u8 ethertype[0x10]; u8 vlan_type[0x10]; u8 vlan[0x10]; }; enum { FS_FT_NIC_RX = 0x0, FS_FT_NIC_TX = 0x1, FS_FT_ESW_EGRESS_ACL = 0x2, FS_FT_ESW_INGRESS_ACL = 0x3, FS_FT_FDB = 0X4, FS_FT_SNIFFER_RX = 0X5, FS_FT_SNIFFER_TX = 0X6, }; struct mlx5_ifc_ste_general_bits { u8 entry_type[0x4]; u8 reserved_at_4[0x4]; u8 entry_sub_type[0x8]; u8 byte_mask[0x10]; u8 next_table_base_63_48[0x10]; u8 next_lu_type[0x8]; u8 next_table_base_39_32_size[0x8]; u8 next_table_base_31_5_size[0x1b]; u8 linear_hash_enable[0x1]; u8 reserved_at_5c[0x2]; u8 next_table_rank[0x2]; u8 reserved_at_60[0xa0]; u8 tag_value[0x60]; u8 bit_mask[0x60]; }; struct mlx5_ifc_ste_sx_transmit_bits { u8 entry_type[0x4]; u8 reserved_at_4[0x4]; u8 entry_sub_type[0x8]; u8 byte_mask[0x10]; u8 next_table_base_63_48[0x10]; u8 next_lu_type[0x8]; u8 next_table_base_39_32_size[0x8]; u8 next_table_base_31_5_size[0x1b]; u8 linear_hash_enable[0x1]; u8 reserved_at_5c[0x2]; u8 next_table_rank[0x2]; u8 sx_wire[0x1]; u8 sx_func_lb[0x1]; u8 sx_sniffer[0x1]; u8 sx_wire_enable[0x1]; u8 sx_func_lb_enable[0x1]; u8 sx_sniffer_enable[0x1]; u8 action_type[0x3]; u8 reserved_at_69[0x1]; u8 action_description[0x6]; u8 gvmi[0x10]; u8 encap_pointer_vlan_data[0x20]; u8 loopback_syndome_en[0x8]; u8 loopback_syndome[0x8]; u8 counter_trigger[0x10]; u8 miss_address_63_48[0x10]; u8 counter_trigger_23_16[0x8]; u8 miss_address_39_32[0x8]; u8 miss_address_31_6[0x1a]; u8 learning_point[0x1]; u8 go_back[0x1]; u8 match_polarity[0x1]; u8 mask_mode[0x1]; u8 miss_rank[0x2]; }; struct mlx5_ifc_ste_rx_steering_mult_bits { u8 entry_type[0x4]; u8 reserved_at_4[0x4]; u8 entry_sub_type[0x8]; u8 byte_mask[0x10]; u8 next_table_base_63_48[0x10]; u8 next_lu_type[0x8]; u8 next_table_base_39_32_size[0x8]; u8 next_table_base_31_5_size[0x1b]; u8 linear_hash_enable[0x1]; u8 reserved_at_5c[0x2]; u8 next_table_rank[0x2]; u8 
member_count[0x10]; u8 gvmi[0x10]; u8 qp_list_pointer[0x20]; u8 reserved_at_a0[0x1]; u8 tunneling_action[0x3]; u8 action_description[0x4]; u8 reserved_at_a8[0x8]; u8 counter_trigger_15_0[0x10]; u8 miss_address_63_48[0x10]; u8 counter_trigger_23_16[0x08]; u8 miss_address_39_32[0x8]; u8 miss_address_31_6[0x1a]; u8 learning_point[0x1]; u8 fail_on_error[0x1]; u8 match_polarity[0x1]; u8 mask_mode[0x1]; u8 miss_rank[0x2]; }; struct mlx5_ifc_ste_modify_packet_bits { u8 entry_type[0x4]; u8 reserved_at_4[0x4]; u8 entry_sub_type[0x8]; u8 byte_mask[0x10]; u8 next_table_base_63_48[0x10]; u8 next_lu_type[0x8]; u8 next_table_base_39_32_size[0x8]; u8 next_table_base_31_5_size[0x1b]; u8 linear_hash_enable[0x1]; u8 reserved_at_5c[0x2]; u8 next_table_rank[0x2]; u8 number_of_re_write_actions[0x10]; u8 gvmi[0x10]; u8 header_re_write_actions_pointer[0x20]; u8 reserved_at_a0[0x1]; u8 tunneling_action[0x3]; u8 action_description[0x4]; u8 reserved_at_a8[0x8]; u8 counter_trigger_15_0[0x10]; u8 miss_address_63_48[0x10]; u8 counter_trigger_23_16[0x08]; u8 miss_address_39_32[0x8]; u8 miss_address_31_6[0x1a]; u8 learning_point[0x1]; u8 fail_on_error[0x1]; u8 match_polarity[0x1]; u8 mask_mode[0x1]; u8 miss_rank[0x2]; }; struct mlx5_ifc_ste_single_action_flow_tag_v1_bits { u8 action_id[0x8]; u8 flow_tag[0x18]; }; struct mlx5_ifc_ste_single_action_modify_list_v1_bits { u8 action_id[0x8]; u8 num_of_modify_actions[0x8]; u8 modify_actions_ptr[0x10]; }; struct mlx5_ifc_ste_single_action_remove_header_v1_bits { u8 action_id[0x8]; u8 reserved_at_8[0x2]; u8 start_anchor[0x6]; u8 reserved_at_10[0x2]; u8 end_anchor[0x6]; u8 reserved_at_18[0x4]; u8 decap[0x1]; u8 vni_to_cqe[0x1]; u8 qos_profile[0x2]; }; struct mlx5_ifc_ste_single_action_remove_header_size_v1_bits { u8 action_id[0x8]; u8 reserved_at_8[0x2]; u8 start_anchor[0x6]; u8 outer_l4_remove[0x1]; u8 reserved_at_11[0x1]; u8 start_offset[0x7]; u8 reserved_at_18[0x1]; u8 remove_size[0x6]; }; struct mlx5_ifc_ste_double_action_copy_v1_bits { u8 action_id[0x8]; u8 destination_dw_offset[0x8]; u8 reserved_at_10[0x2]; u8 destination_left_shifter[0x6]; u8 reserved_at_18[0x2]; u8 destination_length[0x6]; u8 reserved_at_20[0x8]; u8 source_dw_offset[0x8]; u8 reserved_at_30[0x2]; u8 source_right_shifter[0x6]; u8 reserved_at_38[0x8]; }; struct mlx5_ifc_ste_double_action_set_v1_bits { u8 action_id[0x8]; u8 destination_dw_offset[0x8]; u8 reserved_at_10[0x2]; u8 destination_left_shifter[0x6]; u8 reserved_at_18[0x2]; u8 destination_length[0x6]; u8 inline_data[0x20]; }; struct mlx5_ifc_ste_double_action_add_v1_bits { u8 action_id[0x8]; u8 destination_dw_offset[0x8]; u8 reserved_at_10[0x2]; u8 destination_left_shifter[0x6]; u8 reserved_at_18[0x2]; u8 destination_length[0x6]; u8 add_value[0x20]; }; struct mlx5_ifc_ste_double_action_insert_with_inline_v1_bits { u8 action_id[0x8]; u8 reserved_at_8[0x2]; u8 start_anchor[0x6]; u8 start_offset[0x7]; u8 reserved_at_17[0x9]; u8 inline_data[0x20]; }; struct mlx5_ifc_ste_double_action_insert_with_ptr_v1_bits { u8 action_id[0x8]; u8 reserved_at_8[0x2]; u8 start_anchor[0x6]; u8 start_offset[0x7]; u8 size[0x6]; u8 attributes[0x3]; u8 pointer[0x20]; }; struct mlx5_ifc_ste_double_action_accelerated_modify_action_list_v1_bits { u8 action_id[0x8]; u8 modify_actions_pattern_pointer[0x18]; u8 number_of_modify_actions[0x8]; u8 modify_actions_argument_pointer[0x18]; }; enum { MLX5_IFC_ASO_FLOW_METER_INITIAL_COLOR_RED = 0x0, MLX5_IFC_ASO_FLOW_METER_INITIAL_COLOR_YELLOW = 0x1, MLX5_IFC_ASO_FLOW_METER_INITIAL_COLOR_GREEN = 0x2, 
MLX5_IFC_ASO_FLOW_METER_INITIAL_COLOR_UNDEFINED = 0x3, }; enum { MLX5_IFC_ASO_CT_DIRECTION_INITIATOR = 0x0, MLX5_IFC_ASO_CT_DIRECTION_RESPONDER = 0x1, }; struct mlx5_ifc_ste_aso_first_hit_action_v1_bits { u8 reserved_at_0[0x6]; u8 set[0x1]; u8 line_id[0x9]; }; struct mlx5_ifc_ste_aso_flow_meter_action_v1_bits { u8 reserved_at_0[0xc]; u8 action[0x1]; u8 initial_color[0x2]; u8 line_id[0x1]; }; struct mlx5_ifc_ste_aso_ct_action_v1_bits { u8 reserved_at_0[0xf]; u8 direction[0x1]; }; struct mlx5_ifc_ste_double_action_aso_v1_bits { u8 action_id[0x8]; u8 aso_context_number[0x18]; u8 dest_reg_id[0x2]; u8 change_ordering_tag[0x1]; u8 aso_check_ordering[0x1]; u8 aso_context_type[0x4]; u8 reserved_at_28[0x8]; union { u8 aso_fields[0x10]; struct mlx5_ifc_ste_aso_first_hit_action_v1_bits first_hit; struct mlx5_ifc_ste_aso_flow_meter_action_v1_bits flow_meter; struct mlx5_ifc_ste_aso_ct_action_v1_bits ct; }; }; struct mlx5_ifc_ste_match_bwc_v1_bits { u8 entry_format[0x8]; u8 counter_id[0x18]; u8 miss_address_63_48[0x10]; u8 match_definer_ctx_idx[0x8]; u8 miss_address_39_32[0x8]; u8 miss_address_31_6[0x1a]; u8 reserved_at_5a[0x1]; u8 match_polarity[0x1]; u8 reparse[0x1]; u8 reserved_at_5d[0x3]; u8 next_table_base_63_48[0x10]; u8 hash_definer_ctx_idx[0x8]; u8 next_table_base_39_32_size[0x8]; u8 next_table_base_31_5_size[0x1b]; u8 hash_type[0x2]; u8 hash_after_actions[0x1]; u8 reserved_at_9e[0x2]; u8 byte_mask[0x10]; u8 next_entry_format[0x1]; u8 mask_mode[0x1]; u8 gvmi[0xe]; u8 action[0x40]; }; struct mlx5_ifc_ste_mask_and_match_v1_bits { u8 entry_format[0x8]; u8 counter_id[0x18]; u8 miss_address_63_48[0x10]; u8 match_definer_ctx_idx[0x8]; u8 miss_address_39_32[0x8]; u8 miss_address_31_6[0x1a]; u8 reserved_at_5a[0x1]; u8 match_polarity[0x1]; u8 reparse[0x1]; u8 reserved_at_5d[0x3]; u8 next_table_base_63_48[0x10]; u8 hash_definer_ctx_idx[0x8]; u8 next_table_base_39_32_size[0x8]; u8 next_table_base_31_5_size[0x1b]; u8 hash_type[0x2]; u8 hash_after_actions[0x1]; u8 reserved_at_9e[0x2]; u8 action[0x60]; }; struct mlx5_ifc_ste_eth_l2_src_bits { u8 smac_47_16[0x20]; u8 smac_15_0[0x10]; u8 l3_ethertype[0x10]; u8 qp_type[0x2]; u8 ethertype_filter[0x1]; u8 reserved_at_43[0x1]; u8 sx_sniffer[0x1]; u8 force_lb[0x1]; u8 functional_lb[0x1]; u8 port[0x1]; u8 reserved_at_48[0x4]; u8 first_priority[0x3]; u8 first_cfi[0x1]; u8 first_vlan_qualifier[0x2]; u8 reserved_at_52[0x2]; u8 first_vlan_id[0xc]; u8 ip_fragmented[0x1]; u8 tcp_syn[0x1]; u8 encp_type[0x2]; u8 l3_type[0x2]; u8 l4_type[0x2]; u8 reserved_at_68[0x4]; u8 second_priority[0x3]; u8 second_cfi[0x1]; u8 second_vlan_qualifier[0x2]; u8 reserved_at_72[0x2]; u8 second_vlan_id[0xc]; }; struct mlx5_ifc_ste_eth_l2_src_v1_bits { u8 reserved_at_0[0x1]; u8 sx_sniffer[0x1]; u8 functional_loopback[0x1]; u8 ip_fragmented[0x1]; u8 qp_type[0x2]; u8 encapsulation_type[0x2]; u8 port[0x2]; u8 l3_type[0x2]; u8 l4_type[0x2]; u8 first_vlan_qualifier[0x2]; u8 first_priority[0x3]; u8 first_cfi[0x1]; u8 first_vlan_id[0xc]; u8 smac_47_16[0x20]; u8 smac_15_0[0x10]; u8 l3_ethertype[0x10]; u8 reserved_at_60[0x6]; u8 tcp_syn[0x1]; u8 reserved_at_67[0x3]; u8 force_loopback[0x1]; u8 l2_ok[0x1]; u8 l3_ok[0x1]; u8 l4_ok[0x1]; u8 second_vlan_qualifier[0x2]; u8 second_priority[0x3]; u8 second_cfi[0x1]; u8 second_vlan_id[0xc]; }; struct mlx5_ifc_ste_eth_l2_dst_bits { u8 dmac_47_16[0x20]; u8 dmac_15_0[0x10]; u8 l3_ethertype[0x10]; u8 qp_type[0x2]; u8 ethertype_filter[0x1]; u8 reserved_at_43[0x1]; u8 sx_sniffer[0x1]; u8 force_lb[0x1]; u8 functional_lb[0x1]; u8 port[0x1]; u8 reserved_at_48[0x4]; u8 
first_priority[0x3]; u8 first_cfi[0x1]; u8 first_vlan_qualifier[0x2]; u8 reserved_at_52[0x2]; u8 first_vlan_id[0xc]; u8 ip_fragmented[0x1]; u8 tcp_syn[0x1]; u8 encp_type[0x2]; u8 l3_type[0x2]; u8 l4_type[0x2]; u8 reserved_at_68[0x4]; u8 second_priority[0x3]; u8 second_cfi[0x1]; u8 second_vlan_qualifier[0x2]; u8 reserved_at_72[0x2]; u8 second_vlan_id[0xc]; }; struct mlx5_ifc_ste_eth_l2_dst_v1_bits { u8 reserved_at_0[0x1]; u8 sx_sniffer[0x1]; u8 functional_lb[0x1]; u8 ip_fragmented[0x1]; u8 qp_type[0x2]; u8 encapsulation_type[0x2]; u8 port[0x2]; u8 l3_type[0x2]; u8 l4_type[0x2]; u8 first_vlan_qualifier[0x2]; u8 first_priority[0x3]; u8 first_cfi[0x1]; u8 first_vlan_id[0xc]; u8 dmac_47_16[0x20]; u8 dmac_15_0[0x10]; u8 l3_ethertype[0x10]; u8 reserved_at_60[0x6]; u8 tcp_syn[0x1]; u8 reserved_at_67[0x3]; u8 force_lb[0x1]; u8 l2_ok[0x1]; u8 l3_ok[0x1]; u8 l4_ok[0x1]; u8 second_vlan_qualifier[0x2]; u8 second_priority[0x3]; u8 second_cfi[0x1]; u8 second_vlan_id[0xc]; }; struct mlx5_ifc_ste_eth_l2_src_dst_bits { u8 dmac_47_16[0x20]; u8 dmac_15_0[0x10]; u8 smac_47_32[0x10]; u8 smac_31_0[0x20]; u8 sx_sniffer[0x1]; u8 force_lb[0x1]; u8 functional_lb[0x1]; u8 port[0x1]; u8 l3_type[0x2]; u8 reserved_at_66[0x6]; u8 first_priority[0x3]; u8 first_cfi[0x1]; u8 first_vlan_qualifier[0x2]; u8 reserved_at_72[0x2]; u8 first_vlan_id[0xc]; }; struct mlx5_ifc_ste_eth_l2_src_dst_v1_bits { u8 dmac_47_16[0x20]; u8 smac_47_16[0x20]; u8 dmac_15_0[0x10]; u8 reserved_at_50[0x2]; u8 functional_lb[0x1]; u8 reserved_at_53[0x5]; u8 port[0x2]; u8 l3_type[0x2]; u8 reserved_at_5c[0x2]; u8 first_vlan_qualifier[0x2]; u8 first_priority[0x3]; u8 first_cfi[0x1]; u8 first_vlan_id[0xc]; u8 smac_15_0[0x10]; }; struct mlx5_ifc_ste_eth_l3_ipv4_5_tuple_bits { u8 destination_address[0x20]; u8 source_address[0x20]; u8 source_port[0x10]; u8 destination_port[0x10]; u8 fragmented[0x1]; u8 first_fragment[0x1]; u8 reserved_at_62[0x2]; u8 reserved_at_64[0x1]; u8 ecn[0x2]; u8 tcp_ns[0x1]; u8 tcp_cwr[0x1]; u8 tcp_ece[0x1]; u8 tcp_urg[0x1]; u8 tcp_ack[0x1]; u8 tcp_psh[0x1]; u8 tcp_rst[0x1]; u8 tcp_syn[0x1]; u8 tcp_fin[0x1]; u8 dscp[0x6]; u8 reserved_at_76[0x2]; u8 protocol[0x8]; }; struct mlx5_ifc_ste_eth_l3_ipv4_5_tuple_v1_bits { u8 source_address[0x20]; u8 destination_address[0x20]; u8 source_port[0x10]; u8 destination_port[0x10]; u8 reserved_at_60[0x4]; u8 l4_ok[0x1]; u8 l3_ok[0x1]; u8 fragmented[0x1]; u8 tcp_ns[0x1]; u8 tcp_cwr[0x1]; u8 tcp_ece[0x1]; u8 tcp_urg[0x1]; u8 tcp_ack[0x1]; u8 tcp_psh[0x1]; u8 tcp_rst[0x1]; u8 tcp_syn[0x1]; u8 tcp_fin[0x1]; u8 dscp[0x6]; u8 ecn[0x2]; u8 protocol[0x8]; }; struct mlx5_ifc_ste_eth_l3_ipv6_dst_bits { u8 dst_ip_127_96[0x20]; u8 dst_ip_95_64[0x20]; u8 dst_ip_63_32[0x20]; u8 dst_ip_31_0[0x20]; }; struct mlx5_ifc_ste_eth_l2_tnl_bits { u8 dmac_47_16[0x20]; u8 dmac_15_0[0x10]; u8 l3_ethertype[0x10]; u8 l2_tunneling_network_id[0x20]; u8 ip_fragmented[0x1]; u8 tcp_syn[0x1]; u8 encp_type[0x2]; u8 l3_type[0x2]; u8 l4_type[0x2]; u8 first_priority[0x3]; u8 first_cfi[0x1]; u8 reserved_at_6c[0x3]; u8 gre_key_flag[0x1]; u8 first_vlan_qualifier[0x2]; u8 reserved_at_72[0x2]; u8 first_vlan_id[0xc]; }; struct mlx5_ifc_ste_eth_l2_tnl_v1_bits { u8 l2_tunneling_network_id[0x20]; u8 dmac_47_16[0x20]; u8 dmac_15_0[0x10]; u8 l3_ethertype[0x10]; u8 reserved_at_60[0x3]; u8 ip_fragmented[0x1]; u8 reserved_at_64[0x2]; u8 encp_type[0x2]; u8 reserved_at_68[0x2]; u8 l3_type[0x2]; u8 l4_type[0x2]; u8 first_vlan_qualifier[0x2]; u8 first_priority[0x3]; u8 first_cfi[0x1]; u8 first_vlan_id[0xc]; }; struct mlx5_ifc_ste_eth_l3_ipv6_src_bits { u8 
src_ip_127_96[0x20]; u8 src_ip_95_64[0x20]; u8 src_ip_63_32[0x20]; u8 src_ip_31_0[0x20]; }; struct mlx5_ifc_ste_eth_l3_ipv4_misc_bits { u8 version[0x4]; u8 ihl[0x4]; u8 reserved_at_8[0x8]; u8 total_length[0x10]; u8 identification[0x10]; u8 flags[0x3]; u8 fragment_offset[0xd]; u8 time_to_live[0x8]; u8 reserved_at_48[0x8]; u8 checksum[0x10]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_ste_eth_l3_ipv4_misc_v1_bits { u8 identification[0x10]; u8 flags[0x3]; u8 fragment_offset[0xd]; u8 total_length[0x10]; u8 checksum[0x10]; u8 version[0x4]; u8 ihl[0x4]; u8 time_to_live[0x8]; u8 reserved_at_50[0x10]; u8 reserved_at_60[0x1c]; u8 voq_internal_prio[0x4]; }; struct mlx5_ifc_ste_eth_l4_bits { u8 fragmented[0x1]; u8 first_fragment[0x1]; u8 reserved_at_2[0x6]; u8 protocol[0x8]; u8 dst_port[0x10]; u8 ipv6_version[0x4]; u8 reserved_at_24[0x1]; u8 ecn[0x2]; u8 tcp_ns[0x1]; u8 tcp_cwr[0x1]; u8 tcp_ece[0x1]; u8 tcp_urg[0x1]; u8 tcp_ack[0x1]; u8 tcp_psh[0x1]; u8 tcp_rst[0x1]; u8 tcp_syn[0x1]; u8 tcp_fin[0x1]; u8 src_port[0x10]; u8 ipv6_payload_length[0x10]; u8 ipv6_hop_limit[0x8]; u8 dscp[0x6]; u8 reserved_at_5e[0x2]; u8 tcp_data_offset[0x4]; u8 reserved_at_64[0x8]; u8 flow_label[0x14]; }; struct mlx5_ifc_ste_eth_l4_v1_bits { u8 ipv6_version[0x4]; u8 reserved_at_4[0x4]; u8 dscp[0x6]; u8 ecn[0x2]; u8 ipv6_hop_limit[0x8]; u8 protocol[0x8]; u8 src_port[0x10]; u8 dst_port[0x10]; u8 first_fragment[0x1]; u8 reserved_at_41[0xb]; u8 flow_label[0x14]; u8 tcp_data_offset[0x4]; u8 l4_ok[0x1]; u8 l3_ok[0x1]; u8 fragmented[0x1]; u8 tcp_ns[0x1]; u8 tcp_cwr[0x1]; u8 tcp_ece[0x1]; u8 tcp_urg[0x1]; u8 tcp_ack[0x1]; u8 tcp_psh[0x1]; u8 tcp_rst[0x1]; u8 tcp_syn[0x1]; u8 tcp_fin[0x1]; u8 ipv6_paylen[0x10]; }; struct mlx5_ifc_ste_eth_l4_misc_bits { u8 checksum[0x10]; u8 length[0x10]; u8 seq_num[0x20]; u8 ack_num[0x20]; u8 urgent_pointer[0x10]; u8 window_size[0x10]; }; struct mlx5_ifc_ste_eth_l4_misc_v1_bits { u8 window_size[0x10]; u8 urgent_pointer[0x10]; u8 ack_num[0x20]; u8 seq_num[0x20]; u8 length[0x10]; u8 checksum[0x10]; }; struct mlx5_ifc_ste_mpls_bits { u8 mpls0_label[0x14]; u8 mpls0_exp[0x3]; u8 mpls0_s_bos[0x1]; u8 mpls0_ttl[0x8]; u8 mpls1_label[0x20]; u8 mpls2_label[0x20]; u8 reserved_at_60[0x16]; u8 mpls4_s_bit[0x1]; u8 mpls4_qualifier[0x1]; u8 mpls3_s_bit[0x1]; u8 mpls3_qualifier[0x1]; u8 mpls2_s_bit[0x1]; u8 mpls2_qualifier[0x1]; u8 mpls1_s_bit[0x1]; u8 mpls1_qualifier[0x1]; u8 mpls0_s_bit[0x1]; u8 mpls0_qualifier[0x1]; }; struct mlx5_ifc_ste_mpls_v1_bits { u8 reserved_at_0[0x15]; u8 mpls_ok[0x1]; u8 mpls4_s_bit[0x1]; u8 mpls4_qualifier[0x1]; u8 mpls3_s_bit[0x1]; u8 mpls3_qualifier[0x1]; u8 mpls2_s_bit[0x1]; u8 mpls2_qualifier[0x1]; u8 mpls1_s_bit[0x1]; u8 mpls1_qualifier[0x1]; u8 mpls0_s_bit[0x1]; u8 mpls0_qualifier[0x1]; u8 mpls0_label[0x14]; u8 mpls0_exp[0x3]; u8 mpls0_s_bos[0x1]; u8 mpls0_ttl[0x8]; u8 mpls1_label[0x20]; u8 mpls2_label[0x20]; }; struct mlx5_ifc_ste_register_0_bits { u8 register_0_h[0x20]; u8 register_0_l[0x20]; u8 register_1_h[0x20]; u8 register_1_l[0x20]; }; struct mlx5_ifc_ste_register_1_bits { u8 register_2_h[0x20]; u8 register_2_l[0x20]; u8 register_3_h[0x20]; u8 register_3_l[0x20]; }; struct mlx5_ifc_ste_gre_bits { u8 gre_c_present[0x1]; u8 reserved_at_1[0x1]; u8 gre_k_present[0x1]; u8 gre_s_present[0x1]; u8 strict_src_route[0x1]; u8 recur[0x3]; u8 flags[0x5]; u8 version[0x3]; u8 gre_protocol[0x10]; u8 checksum[0x10]; u8 offset[0x10]; u8 gre_key_h[0x18]; u8 gre_key_l[0x8]; u8 seq_num[0x20]; }; struct mlx5_ifc_ste_gre_v1_bits { u8 gre_c_present[0x1]; u8 reserved_at_1[0x1]; u8 
gre_k_present[0x1]; u8 gre_s_present[0x1]; u8 strict_src_route[0x1]; u8 recur[0x3]; u8 flags[0x5]; u8 version[0x3]; u8 gre_protocol[0x10]; u8 reserved_at_20[0x20]; u8 gre_key_h[0x18]; u8 gre_key_l[0x8]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_ste_flex_parser_0_bits { u8 flex_parser_3[0x20]; u8 flex_parser_2[0x20]; u8 flex_parser_1[0x20]; u8 flex_parser_0[0x20]; }; struct mlx5_ifc_ste_flex_parser_1_bits { u8 flex_parser_7[0x20]; u8 flex_parser_6[0x20]; u8 flex_parser_5[0x20]; u8 flex_parser_4[0x20]; }; struct mlx5_ifc_ste_flex_parser_ok_bits { u8 flex_parser_3[0x20]; u8 flex_parser_2[0x20]; u8 flex_parsers_ok[0x8]; u8 reserved_at_48[0x18]; u8 flex_parser_0[0x20]; }; struct mlx5_ifc_ste_tunnel_header_bits { u8 tunnel_header_dw0[0x20]; u8 tunnel_header_dw1[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_ste_tunnel_header_v1_bits { u8 tunnel_header_0[0x20]; u8 tunnel_header_1[0x20]; u8 tunnel_header_2[0x20]; u8 tunnel_header_3[0x20]; }; struct mlx5_ifc_ste_flex_parser_tnl_vxlan_gpe_bits { u8 outer_vxlan_gpe_flags[0x8]; u8 reserved_at_8[0x10]; u8 outer_vxlan_gpe_next_protocol[0x8]; u8 outer_vxlan_gpe_vni[0x18]; u8 reserved_at_38[0x8]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_ste_flex_parser_tnl_geneve_bits { u8 reserved_at_0[0x2]; u8 geneve_opt_len[0x6]; u8 geneve_oam[0x1]; u8 reserved_at_9[0x7]; u8 geneve_protocol_type[0x10]; u8 geneve_vni[0x18]; u8 reserved_at_38[0x8]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_ste_flex_parser_tnl_gtpu_bits { u8 gtpu_msg_flags[0x8]; u8 gtpu_msg_type[0x8]; u8 reserved_at_10[0x10]; u8 gtpu_teid[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_ste_general_purpose_bits { u8 general_purpose_lookup_field[0x20]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x20]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_ste_src_gvmi_qp_bits { u8 loopback_syndrome[0x8]; u8 reserved_at_8[0x8]; u8 source_gvmi[0x10]; u8 reserved_at_20[0x5]; u8 force_lb[0x1]; u8 functional_lb[0x1]; u8 source_is_requestor[0x1]; u8 source_qp[0x18]; u8 reserved_at_40[0x20]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_ste_src_gvmi_qp_v1_bits { u8 loopback_synd[0x8]; u8 reserved_at_8[0x7]; u8 functional_lb[0x1]; u8 source_gvmi[0x10]; u8 force_lb[0x1]; u8 reserved_at_21[0x1]; u8 source_is_requestor[0x1]; u8 reserved_at_23[0x5]; u8 source_qp[0x18]; u8 reserved_at_40[0x20]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_ste_icmp_v1_bits { u8 icmp_payload_data[0x20]; u8 icmp_header_data[0x20]; u8 icmp_type[0x8]; u8 icmp_code[0x8]; u8 reserved_at_50[0x10]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_ste_ib_l4_bits { u8 opcode[0x8]; u8 qp[0x18]; u8 se[0x1]; u8 migreg[0x1]; u8 ackreq[0x1]; u8 fecn[0x1]; u8 becn[0x1]; u8 bth[0x1]; u8 deth[0x1]; u8 dcceth[0x1]; u8 reserved_at_28[0x2]; u8 pad_count[0x2]; u8 tver[0x4]; u8 pkey[0x10]; u8 reserved_at_40[0x8]; u8 deth_source_qp[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_ste_def0_v1_bits { u8 metadata_reg_c_0[0x20]; u8 metadata_reg_c_1[0x20]; u8 dmac_47_16[0x20]; u8 dmac_15_0[0x10]; u8 ethertype[0x10]; u8 reserved_at_60[0x1]; u8 sx_sniffer[0x1]; u8 functional_loopback[0x1]; u8 ip_frag[0x1]; u8 qp_type[0x2]; u8 encapsulation_type[0x2]; u8 port[0x2]; u8 outer_l3_type[0x2]; u8 outer_l4_type[0x2]; u8 first_vlan_qualifier[0x2]; u8 first_priority[0x3]; u8 first_cfi[0x1]; u8 first_vlan_id[0xc]; u8 reserved_at_80[0xa]; u8 force_loopback[0x1]; u8 reserved_at_8b[0x3]; u8 second_vlan_qualifier[0x2]; u8 second_priority[0x3]; u8 second_cfi[0x1]; u8 second_vlan_id[0xc]; u8 smac_47_16[0x20]; u8 smac_15_0[0x10]; u8 inner_ipv4_checksum_ok[0x1]; u8 
inner_l4_checksum_ok[0x1]; u8 outer_ipv4_checksum_ok[0x1]; u8 outer_l4_checksum_ok[0x1]; u8 inner_l3_ok[0x1]; u8 inner_l4_ok[0x1]; u8 outer_l3_ok[0x1]; u8 outer_l4_ok[0x1]; u8 tcp_cwr[0x1]; u8 tcp_ece[0x1]; u8 tcp_urg[0x1]; u8 tcp_ack[0x1]; u8 tcp_psh[0x1]; u8 tcp_rst[0x1]; u8 tcp_syn[0x1]; u8 tcp_fin[0x1]; }; struct mlx5_ifc_ste_def2_v1_bits { u8 metadata_reg_a[0x20]; u8 outer_ip_version[0x4]; u8 outer_ip_ihl[0x4]; u8 outer_ip_dscp[0x6]; u8 outer_ip_ecn[0x2]; u8 outer_ip_ttl[0x8]; u8 outer_ip_protocol[0x8]; u8 outer_ip_identification[0x10]; u8 outer_ip_flags[0x3]; u8 outer_ip_fragment_offset[0xd]; u8 outer_ip_total_length[0x10]; u8 outer_ip_checksum[0x10]; u8 reserved_180[0xc]; u8 outer_ip_flow_label[0x14]; u8 outer_eth_packet_length[0x10]; u8 outer_ip_payload_length[0x10]; u8 outer_l4_sport[0x10]; u8 outer_l4_dport[0x10]; u8 outer_data_offset[0x4]; u8 reserved_1e4[0x2]; u8 outer_ip_frag[0x1]; u8 tcp_ns[0x1]; u8 tcp_cwr[0x1]; u8 tcp_ece[0x1]; u8 tcp_urg[0x1]; u8 tcp_ack[0x1]; u8 tcp_psh[0x1]; u8 tcp_rst[0x1]; u8 tcp_syn[0x1]; u8 tcp_fin[0x1]; u8 outer_ip_frag_first[0x1]; u8 reserved_1f0[0x7]; u8 inner_ipv4_checksum_ok[0x1]; u8 inner_l4_checksum_ok[0x1]; u8 outer_ipv4_checksum_ok[0x1]; u8 outer_l4_checksum_ok[0x1]; u8 inner_l3_ok[0x1]; u8 inner_l4_ok[0x1]; u8 outer_l3_ok[0x1]; u8 outer_l4_ok[0x1]; }; struct mlx5_ifc_ste_def6_v1_bits { u8 dst_ipv6_127_96[0x20]; u8 dst_ipv6_95_64[0x20]; u8 dst_ipv6_63_32[0x20]; u8 dst_ipv6_31_0[0x20]; u8 reserved_at_80[0x40]; u8 outer_l4_sport[0x10]; u8 outer_l4_dport[0x10]; u8 reserved_e0[0x4]; u8 l4_ok[0x1]; u8 l3_ok[0x1]; u8 ip_frag[0x1]; u8 tcp_ns[0x1]; u8 tcp_cwr[0x1]; u8 tcp_ece[0x1]; u8 tcp_urg[0x1]; u8 tcp_ack[0x1]; u8 tcp_psh[0x1]; u8 tcp_rst[0x1]; u8 tcp_syn[0x1]; u8 tcp_fin[0x1]; u8 reserved_f0[0x10]; }; struct mlx5_ifc_ste_def16_v1_bits { u8 tunnel_header_0[0x20]; u8 tunnel_header_1[0x20]; u8 tunnel_header_2[0x20]; u8 tunnel_header_3[0x20]; u8 random_number[0x10]; u8 reserved_90[0x10]; u8 metadata_reg_a[0x20]; u8 reserved_c0[0x8]; u8 outer_l3_type[0x2]; u8 outer_l4_type[0x2]; u8 outer_first_vlan_type[0x2]; u8 reserved_ce[0x1]; u8 functional_lb[0x1]; u8 source_gvmi[0x10]; u8 force_lb[0x1]; u8 outer_ip_frag[0x1]; u8 source_is_requester[0x1]; u8 reserved_e3[0x5]; u8 source_sqn[0x18]; }; struct mlx5_ifc_ste_def22_v1_bits { u8 outer_ip_src_addr[0x20]; u8 outer_ip_dst_addr[0x20]; u8 outer_l4_sport[0x10]; u8 outer_l4_dport[0x10]; u8 reserved_at_40[0x1]; u8 sx_sniffer[0x1]; u8 functional_loopback[0x1]; u8 outer_ip_frag[0x1]; u8 qp_type[0x2]; u8 encapsulation_type[0x2]; u8 port[0x2]; u8 outer_l3_type[0x2]; u8 outer_l4_type[0x2]; u8 first_vlan_qualifier[0x2]; u8 first_priority[0x3]; u8 first_cfi[0x1]; u8 first_vlan_id[0xc]; u8 metadata_reg_c_0[0x20]; u8 outer_dmac_47_16[0x20]; u8 outer_smac_47_16[0x20]; u8 outer_smac_15_0[0x10]; u8 outer_dmac_15_0[0x10]; }; struct mlx5_ifc_ste_def24_v1_bits { u8 metadata_reg_c_2[0x20]; u8 metadata_reg_c_3[0x20]; u8 metadata_reg_c_0[0x20]; u8 metadata_reg_c_1[0x20]; u8 outer_ip_src_addr[0x20]; u8 outer_ip_dst_addr[0x20]; u8 outer_l4_sport[0x10]; u8 outer_l4_dport[0x10]; u8 inner_ip_protocol[0x8]; u8 inner_l3_type[0x2]; u8 inner_l4_type[0x2]; u8 inner_first_vlan_type[0x2]; u8 inner_ip_frag[0x1]; u8 functional_lb[0x1]; u8 outer_ip_protocol[0x8]; u8 outer_l3_type[0x2]; u8 outer_l4_type[0x2]; u8 outer_first_vlan_type[0x2]; u8 outer_ip_frag[0x1]; u8 functional_lb_dup[0x1]; }; struct mlx5_ifc_ste_def25_v1_bits { u8 inner_ip_src_addr[0x20]; u8 inner_ip_dst_addr[0x20]; u8 inner_l4_sport[0x10]; u8 inner_l4_dport[0x10]; u8 
tunnel_header_0[0x20];
	u8         tunnel_header_1[0x20];
	u8         reserved_at_a0[0x20];
	u8         port_number_dup[0x2];
	u8         inner_l3_type[0x2];
	u8         inner_l4_type[0x2];
	u8         inner_first_vlan_type[0x2];
	u8         port_number[0x2];
	u8         outer_l3_type[0x2];
	u8         outer_l4_type[0x2];
	u8         outer_first_vlan_type[0x2];
	u8         outer_l4_dport[0x10];
	u8         reserved_at_e0[0x20];
};

struct mlx5_ifc_ste_def26_v1_bits {
	u8         src_ipv6_127_96[0x20];
	u8         src_ipv6_95_64[0x20];
	u8         src_ipv6_63_32[0x20];
	u8         src_ipv6_31_0[0x20];
	u8         reserved_at_80[0x3];
	u8         ip_frag[0x1];
	u8         reserved_at_84[0x6];
	u8         l3_type[0x2];
	u8         l4_type[0x2];
	u8         first_vlan_type[0x2];
	u8         first_priority[0x3];
	u8         first_cfi[0x1];
	u8         first_vlan_id[0xc];
	u8         reserved_at_a0[0xb];
	u8         l2_ok[0x1];
	u8         l3_ok[0x1];
	u8         l4_ok[0x1];
	u8         second_vlan_type[0x2];
	u8         second_priority[0x3];
	u8         second_cfi[0x1];
	u8         second_vlan_id[0xc];
	u8         smac_47_16[0x20];
	u8         smac_15_0[0x10];
	u8         ip_porotcol[0x8];
	u8         tcp_cwr[0x1];
	u8         tcp_ece[0x1];
	u8         tcp_urg[0x1];
	u8         tcp_ack[0x1];
	u8         tcp_psh[0x1];
	u8         tcp_rst[0x1];
	u8         tcp_syn[0x1];
	u8         tcp_fin[0x1];
};

struct mlx5_ifc_ste_def28_v1_bits {
	u8         inner_l4_sport[0x10];
	u8         inner_l4_dport[0x10];
	u8         flex_gtpu_teid[0x20];
	u8         inner_ip_src_addr[0x20];
	u8         inner_ip_dst_addr[0x20];
	u8         outer_ip_src_addr[0x20];
	u8         outer_ip_dst_addr[0x20];
	u8         outer_l4_sport[0x10];
	u8         outer_l4_dport[0x10];
	u8         inner_ip_protocol[0x8];
	u8         inner_l3_type[0x2];
	u8         inner_l4_type[0x2];
	u8         inner_first_vlan_type[0x2];
	u8         inner_ip_frag[0x1];
	u8         functional_lb[0x1];
	u8         outer_ip_protocol[0x8];
	u8         outer_l3_type[0x2];
	u8         outer_l4_type[0x2];
	u8         outer_first_vlan_type[0x2];
	u8         outer_ip_frag[0x1];
	u8         functional_lb_dup[0x1];
};

struct mlx5_ifc_ste_def33_v1_bits {
	u8         outer_ip_src_addr[0x20];
	u8         outer_ip_dst_addr[0x20];
	u8         outer_l4_sport[0x10];
	u8         outer_l4_dport[0x10];
	u8         reserved_at_60[0x1];
	u8         sx_sniffer[0x1];
	u8         functional_loopback[0x1];
	u8         outer_ip_frag[0x1];
	u8         qp_type[0x2];
	u8         encapsulation_type[0x2];
	u8         port[0x2];
	u8         outer_l3_type[0x2];
	u8         outer_l4_type[0x2];
	u8         outer_first_vlan_type[0x2];
	u8         outer_first_vlan_prio[0x3];
	u8         outer_first_vlan_cfi[0x1];
	u8         outer_first_vlan_vid[0xc];
	u8         reserved_at_80[0x20];
	u8         reserved_at_a0[0x20];
	u8         reserved_at_c0[0x20];
	u8         outer_ip_version[0x4];
	u8         outer_ip_ihl[0x4];
	u8         inner_ipv4_checksum_ok[0x1];
	u8         inner_l4_checksum_ok[0x1];
	u8         outer_ipv4_checksum_ok[0x1];
	u8         outer_l4_checksum_ok[0x1];
	u8         inner_l3_ok[0x1];
	u8         inner_l4_ok[0x1];
	u8         outer_l3_ok[0x1];
	u8         outer_l4_ok[0x1];
	u8         outer_ip_ttl[0x8];
	u8         outer_ip_protocol[0x8];
};

struct mlx5_ifc_ste_single_action_remove_header_v3_bits {
	u8         action_id[0x8];
	u8         start_anchor[0x7];
	u8         end_anchor[0x7];
	u8         reserved_at_16[0x1];
	u8         outer_l4_remove[0x1];
	u8         reserved_at_18[0x4];
	u8         decap[0x1];
	u8         vni_to_cqe[0x1];
	u8         qos_profile[0x2];
};

struct mlx5_ifc_ste_single_action_remove_header_size_v3_bits {
	u8         action_id[0x8];
	u8         start_anchor[0x7];
	u8         start_offset[0x8];
	u8         outer_l4_remove[0x1];
	u8         reserved_at_18[0x2];
	u8         remove_size[0x6];
};

struct mlx5_ifc_ste_double_action_insert_with_inline_v3_bits {
	u8         action_id[0x8];
	u8         start_anchor[0x7];
	u8         start_offset[0x8];
	u8         reserved_at_17[0x9];
	u8         inline_data[0x20];
};

struct mlx5_ifc_ste_double_action_insert_with_ptr_v3_bits {
	u8         action_id[0x8];
	u8         start_anchor[0x7];
	u8         start_offset[0x8];
	u8         size[0x6];
	u8         attributes[0x3];
	u8         pointer[0x20];
};

struct mlx5_ifc_set_action_in_bits {
	u8         action_type[0x4];
	u8         field[0xc];
	u8         reserved_at_10[0x3];
	u8         offset[0x5];
	u8         reserved_at_18[0x3];
	u8         length[0x5];
	u8         data[0x20];
};

struct mlx5_ifc_add_action_in_bits {
	u8         action_type[0x4];
	u8         field[0xc];
	u8         reserved_at_10[0x10];
	u8         data[0x20];
};
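/*
 * Illustrative sketch: each set/add/copy *_action_in layout here describes
 * one 64-bit modify-header action.  A SET action rewriting the outer TCP
 * destination port could be packed as below, assuming the DEVX_SET()
 * accessor from earlier in this header; "actions" is a caller-provided
 * buffer that would later be handed to ALLOC_MODIFY_HEADER_CONTEXT.  A
 * length of 0 conventionally selects the full width of the target field.
 *
 *	__be64 actions[1] = {};
 *
 *	DEVX_SET(set_action_in, actions, action_type, MLX5_ACTION_TYPE_SET);
 *	DEVX_SET(set_action_in, actions, field,
 *		 MLX5_ACTION_IN_FIELD_OUT_TCP_DPORT);
 *	DEVX_SET(set_action_in, actions, offset, 0);
 *	DEVX_SET(set_action_in, actions, length, 0);
 *	DEVX_SET(set_action_in, actions, data, 4791);
 */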
struct mlx5_ifc_copy_action_in_bits {
	u8         action_type[0x4];
	u8         src_field[0xc];
	u8         reserved_at_10[0x3];
	u8         src_offset[0x5];
	u8         reserved_at_18[0x3];
	u8         length[0x5];
	u8         reserved_at_20[0x4];
	u8         dst_field[0xc];
	u8         reserved_at_30[0x3];
	u8         dst_offset[0x5];
	u8         reserved_at_38[0x8];
};

enum {
	MLX5_ACTION_TYPE_SET = 0x1,
	MLX5_ACTION_TYPE_ADD = 0x2,
	MLX5_ACTION_TYPE_COPY = 0x3,
};

enum {
	MLX5_ACTION_IN_FIELD_OUT_SMAC_47_16 = 0x1,
	MLX5_ACTION_IN_FIELD_OUT_SMAC_15_0 = 0x2,
	MLX5_ACTION_IN_FIELD_OUT_ETHERTYPE = 0x3,
	MLX5_ACTION_IN_FIELD_OUT_DMAC_47_16 = 0x4,
	MLX5_ACTION_IN_FIELD_OUT_DMAC_15_0 = 0x5,
	MLX5_ACTION_IN_FIELD_OUT_IP_DSCP = 0x6,
	MLX5_ACTION_IN_FIELD_OUT_TCP_FLAGS = 0x7,
	MLX5_ACTION_IN_FIELD_OUT_TCP_SPORT = 0x8,
	MLX5_ACTION_IN_FIELD_OUT_TCP_DPORT = 0x9,
	MLX5_ACTION_IN_FIELD_OUT_IP_TTL = 0xa,
	MLX5_ACTION_IN_FIELD_OUT_UDP_SPORT = 0xb,
	MLX5_ACTION_IN_FIELD_OUT_UDP_DPORT = 0xc,
	MLX5_ACTION_IN_FIELD_OUT_SIPV6_127_96 = 0xd,
	MLX5_ACTION_IN_FIELD_OUT_SIPV6_95_64 = 0xe,
	MLX5_ACTION_IN_FIELD_OUT_SIPV6_63_32 = 0xf,
	MLX5_ACTION_IN_FIELD_OUT_SIPV6_31_0 = 0x10,
	MLX5_ACTION_IN_FIELD_OUT_DIPV6_127_96 = 0x11,
	MLX5_ACTION_IN_FIELD_OUT_DIPV6_95_64 = 0x12,
	MLX5_ACTION_IN_FIELD_OUT_DIPV6_63_32 = 0x13,
	MLX5_ACTION_IN_FIELD_OUT_DIPV6_31_0 = 0x14,
	MLX5_ACTION_IN_FIELD_OUT_SIPV4 = 0x15,
	MLX5_ACTION_IN_FIELD_OUT_DIPV4 = 0x16,
	MLX5_ACTION_IN_FIELD_OUT_FIRST_VID = 0x17,
	MLX5_ACTION_IN_FIELD_OUT_IPV6_HOPLIMIT = 0x47,
	MLX5_ACTION_IN_FIELD_OUT_METADATA_REGA = 0x49,
	MLX5_ACTION_IN_FIELD_OUT_METADATA_REGB = 0x50,
	MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_0 = 0x51,
	MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_1 = 0x52,
	MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_2 = 0x53,
	MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_3 = 0x54,
	MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_4 = 0x55,
	MLX5_ACTION_IN_FIELD_OUT_METADATA_REGC_5 = 0x56,
	MLX5_ACTION_IN_FIELD_OUT_TCP_SEQ_NUM = 0x59,
	MLX5_ACTION_IN_FIELD_OUT_TCP_ACK_NUM = 0x5B,
	MLX5_ACTION_IN_FIELD_OUT_GTPU_TEID = 0x6E,
	MLX5_ACTION_IN_FIELD_OUT_IP_ECN = 0x73,
};

struct mlx5_ifc_dctc_bits {
	u8         reserved_at_0[0x1d];
	u8         data_in_order[0x1];
	u8         reserved_at_1e[0x362];
};

struct mlx5_ifc_packet_reformat_context_in_bits {
	u8         reserved_at_0[0x5];
	u8         reformat_type[0x3];
	u8         reserved_at_8[0xe];
	u8         reformat_data_size[0xa];
	u8         reserved_at_20[0x10];
	u8         reformat_data[2][0x8];
	u8         more_reformat_data[0][0x8];
};

struct mlx5_ifc_alloc_packet_reformat_context_in_bits {
	u8         opcode[0x10];
	u8         reserved_at_10[0x10];
	u8         reserved_at_20[0x10];
	u8         op_mod[0x10];
	u8         reserved_at_40[0xa0];
	struct mlx5_ifc_packet_reformat_context_in_bits packet_reformat_context;
};

struct mlx5_ifc_alloc_packet_reformat_context_out_bits {
	u8         status[0x8];
	u8         reserved_at_8[0x18];
	u8         syndrome[0x20];
	u8         packet_reformat_id[0x20];
	u8         reserved_at_60[0x20];
};

struct mlx5_ifc_dealloc_packet_reformat_context_in_bits {
	u8         opcode[0x10];
	u8         reserved_at_10[0x10];
	u8         reserved_20[0x10];
	u8         op_mod[0x10];
	u8         packet_reformat_id[0x20];
	u8         reserved_60[0x20];
};

struct mlx5_ifc_dealloc_packet_reformat_context_out_bits {
	u8         status[0x8];
	u8         reserved_at_8[0x18];
	u8         syndrome[0x20];
	u8         reserved_at_40[0x40];
};

enum reformat_type {
	MLX5_REFORMAT_TYPE_L2_TO_VXLAN = 0x0,
	MLX5_REFORMAT_TYPE_L2_TO_NVGRE = 0x1,
	MLX5_REFORMAT_TYPE_L2_TO_L2_TUNNEL = 0x2,
	MLX5_REFORMAT_TYPE_L3_TUNNEL_TO_L2 = 0x3,
	MLX5_REFORMAT_TYPE_L2_TO_L3_TUNNEL = 0x4,
};

struct mlx5_ifc_alloc_flow_counter_in_bits {
	u8         opcode[0x10];
	u8         uid[0x10];
	u8         reserved_at_20[0x10];
	u8         op_mod[0x10];
	u8         reserved_at_40[0x40];
};

struct mlx5_ifc_alloc_flow_counter_out_bits {
	u8         status[0x8];
	u8         reserved_at_8[0x18];
	u8         syndrome[0x20];
	u8         flow_counter_id[0x20];
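/*
 * Illustrative sketch: allocating a flow counter and reading back its id,
 * with MLX5_CMD_OP_ALLOC_FLOW_COUNTER assumed to come from the opcode enum
 * earlier in this header:
 *
 *	uint32_t in[DEVX_ST_SZ_DW(alloc_flow_counter_in)] = {};
 *	uint32_t out[DEVX_ST_SZ_DW(alloc_flow_counter_out)] = {};
 *
 *	DEVX_SET(alloc_flow_counter_in, in, opcode,
 *		 MLX5_CMD_OP_ALLOC_FLOW_COUNTER);
 *	counter = mlx5dv_devx_obj_create(ctx, in, sizeof(in),
 *					 out, sizeof(out));
 *	if (counter)
 *		counter_id = DEVX_GET(alloc_flow_counter_out, out,
 *				      flow_counter_id);
 */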
u8 reserved_at_60[0x20]; }; struct mlx5_ifc_dealloc_flow_counter_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x20]; u8 flow_counter_id[0x20]; u8 reserved_at_60[0x20]; }; enum { MLX5_OBJ_TYPE_FLOW_METER = 0x000a, MLX5_OBJ_TYPE_DEK = 0x000C, MLX5_OBJ_TYPE_MATCH_DEFINER = 0x0018, MLX5_OBJ_TYPE_CRYPTO_LOGIN = 0x001F, MLX5_OBJ_TYPE_FLOW_SAMPLER = 0x0020, MLX5_OBJ_TYPE_HEADER_MODIFY_ARGUMENT = 0x0023, MLX5_OBJ_TYPE_ASO_FLOW_METER = 0x0024, MLX5_OBJ_TYPE_ASO_FIRST_HIT = 0x0025, MLX5_OBJ_TYPE_SCHEDULING_ELEMENT = 0x0026, MLX5_OBJ_TYPE_RESERVED_QPN = 0x002C, MLX5_OBJ_TYPE_ASO_CT = 0x0031, MLX5_OBJ_TYPE_AV_QP_MAPPING = 0x003A, }; struct mlx5_ifc_general_obj_in_cmd_hdr_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 obj_type[0x10]; u8 obj_id[0x20]; u8 reserved_at_60[0x3]; u8 log_obj_range[0x5]; u8 reserved_at_68[0x18]; }; struct mlx5_ifc_general_obj_out_cmd_hdr_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 obj_id[0x20]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_flow_meter_bits { u8 modify_field_select[0x40]; u8 active[0x1]; u8 reserved_at_41[0x3]; u8 return_reg_id[0x4]; u8 table_type[0x8]; u8 reserved_at_50[0x10]; u8 reserved_at_60[0x8]; u8 destination_table_id[0x18]; u8 reserved_at_80[0x80]; u8 flow_meter_params[0x100]; u8 reserved_at_180[0x180]; u8 sw_steering_icm_address_rx[0x40]; u8 sw_steering_icm_address_tx[0x40]; }; struct mlx5_ifc_create_flow_meter_in_bits { struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; struct mlx5_ifc_flow_meter_bits meter; }; struct mlx5_ifc_query_flow_meter_out_bits { struct mlx5_ifc_general_obj_out_cmd_hdr_bits hdr; struct mlx5_ifc_flow_meter_bits obj; }; struct mlx5_ifc_flow_sampler_bits { u8 modify_field_select[0x40]; u8 table_type[0x8]; u8 level[0x8]; u8 reserved_at_50[0xf]; u8 ignore_flow_level[0x1]; u8 sample_ratio[0x20]; u8 reserved_at_80[0x8]; u8 sample_table_id[0x18]; u8 reserved_at_a0[0x8]; u8 default_table_id[0x18]; u8 sw_steering_icm_address_rx[0x40]; u8 sw_steering_icm_address_tx[0x40]; }; struct mlx5_ifc_create_flow_sampler_in_bits { struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; struct mlx5_ifc_flow_sampler_bits sampler; }; struct mlx5_ifc_query_flow_sampler_out_bits { struct mlx5_ifc_general_obj_out_cmd_hdr_bits hdr; struct mlx5_ifc_flow_sampler_bits obj; }; struct mlx5_ifc_modify_header_arg_bits { u8 reserved_at_0[0x80]; u8 reserved_at_80[0x8]; u8 access_pd[0x18]; }; struct mlx5_ifc_create_modify_header_arg_in_bits { struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; struct mlx5_ifc_modify_header_arg_bits arg; }; struct mlx5_ifc_definer_bits { u8 modify_field_select[0x40]; u8 reserved_at_40[0x40]; u8 reserved_at_80[0x10]; u8 format_id[0x10]; u8 reserved_at_60[0x160]; u8 ctrl[0xA0]; u8 match_mask_dw_11_8[0x60]; u8 match_mask_dw_7_0[0x100]; }; struct mlx5_ifc_create_definer_in_bits { struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; struct mlx5_ifc_definer_bits definer; }; struct mlx5_ifc_esw_vport_context_bits { u8 reserved_at_0[0x3]; u8 vport_svlan_strip[0x1]; u8 vport_cvlan_strip[0x1]; u8 vport_svlan_insert[0x1]; u8 vport_cvlan_insert[0x2]; u8 reserved_at_8[0x18]; u8 reserved_at_20[0x20]; u8 svlan_cfi[0x1]; u8 svlan_pcp[0x3]; u8 svlan_id[0xc]; u8 cvlan_cfi[0x1]; u8 cvlan_pcp[0x3]; u8 cvlan_id[0xc]; u8 reserved_at_40[0x720]; u8 sw_steering_vport_icm_address_rx[0x40]; u8 sw_steering_vport_icm_address_tx[0x40]; }; struct mlx5_ifc_query_esw_vport_context_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; struct mlx5_ifc_esw_vport_context_bits 
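/*
 * Illustrative sketch: every MLX5_OBJ_TYPE_* object above is managed through
 * the shared general_obj_in/out_cmd_hdr pair.  Creating a flow meter, for
 * example, fills the header plus the object-specific payload;
 * MLX5_CMD_OP_CREATE_GENERAL_OBJECT is assumed to come from the opcode enum
 * earlier in this header:
 *
 *	uint32_t in[DEVX_ST_SZ_DW(create_flow_meter_in)] = {};
 *	uint32_t out[DEVX_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
 *
 *	DEVX_SET(general_obj_in_cmd_hdr, in, opcode,
 *		 MLX5_CMD_OP_CREATE_GENERAL_OBJECT);
 *	DEVX_SET(general_obj_in_cmd_hdr, in, obj_type,
 *		 MLX5_OBJ_TYPE_FLOW_METER);
 *	... fill the mlx5_ifc_flow_meter_bits payload ...
 *	obj = mlx5dv_devx_obj_create(ctx, in, sizeof(in), out, sizeof(out));
 *	if (obj)
 *		obj_id = DEVX_GET(general_obj_out_cmd_hdr, out, obj_id);
 */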
esw_vport_context; }; struct mlx5_ifc_query_esw_vport_context_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 other_vport[0x1]; u8 reserved_at_41[0xf]; u8 vport_number[0x10]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_nic_vport_context_bits { u8 reserved_at_0[0x1f]; u8 roce_en[0x1]; u8 reserved_at_20[0x7e0]; }; struct mlx5_ifc_query_nic_vport_context_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; struct mlx5_ifc_nic_vport_context_bits nic_vport_context; }; struct mlx5_ifc_query_nic_vport_context_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x40]; }; enum { MLX5_QPC_ST_RC = 0x0, }; enum { MLX5_QPC_PM_STATE_MIGRATED = 0x3, }; struct mlx5_ifc_ud_av_bits { u8 reserved_at_0[0x60]; u8 reserved_at_60[0x4]; u8 sl_or_eth_prio[0x4]; u8 reserved_at_68[0x18]; u8 reserved_at_80[0x60]; u8 reserved_at_e0[0x4]; u8 src_addr_index[0x8]; u8 reserved_at_ec[0x14]; u8 rgid_or_rip[16][0x8]; }; struct mlx5_ifc_ads_bits { u8 fl[0x1]; u8 free_ar[0x1]; u8 reserved_at_2[0xe]; u8 pkey_index[0x10]; u8 reserved_at_20[0x8]; u8 grh[0x1]; u8 mlid[0x7]; u8 rlid[0x10]; u8 ack_timeout[0x5]; u8 reserved_at_45[0x3]; u8 src_addr_index[0x8]; u8 reserved_at_50[0x4]; u8 stat_rate[0x4]; u8 hop_limit[0x8]; u8 reserved_at_60[0x4]; u8 tclass[0x8]; u8 flow_label[0x14]; u8 rgid_rip[16][0x8]; u8 reserved_at_100[0x4]; u8 f_dscp[0x1]; u8 f_ecn[0x1]; u8 reserved_at_106[0x1]; u8 f_eth_prio[0x1]; u8 ecn[0x2]; u8 dscp[0x6]; u8 udp_sport[0x10]; u8 dei_cfi[0x1]; u8 eth_prio[0x3]; u8 sl[0x4]; u8 vhca_port_num[0x8]; u8 rmac_47_32[0x10]; u8 rmac_31_0[0x20]; }; enum { MLX5_QPC_STATE_SQDRAINED = 0x5, }; enum { MLX5_QPC_TIMESTAMP_FORMAT_FREE_RUNNING = 0x0, MLX5_QPC_TIMESTAMP_FORMAT_DEFAULT = 0x1, MLX5_QPC_TIMESTAMP_FORMAT_REAL_TIME = 0x2, }; struct mlx5_ifc_qpc_bits { u8 state[0x4]; u8 lag_tx_port_affinity[0x4]; u8 st[0x8]; u8 reserved_at_10[0x2]; u8 isolate_vl_tc[0x1]; u8 pm_state[0x2]; u8 reserved_at_15[0x1]; u8 req_e2e_credit_mode[0x2]; u8 offload_type[0x4]; u8 end_padding_mode[0x2]; u8 reserved_at_1e[0x2]; u8 wq_signature[0x1]; u8 block_lb_mc[0x1]; u8 atomic_like_write_en[0x1]; u8 latency_sensitive[0x1]; u8 reserved_at_24[0x1]; u8 drain_sigerr[0x1]; u8 reserved_at_26[0x2]; u8 pd[0x18]; u8 mtu[0x3]; u8 log_msg_max[0x5]; u8 reserved_at_48[0x1]; u8 log_rq_size[0x4]; u8 log_rq_stride[0x3]; u8 no_sq[0x1]; u8 log_sq_size[0x4]; u8 reserved_at_55[0x3]; u8 ts_format[0x2]; u8 data_in_order[0x1]; u8 rlky[0x1]; u8 ulp_stateless_offload_mode[0x4]; u8 counter_set_id[0x8]; u8 uar_page[0x18]; u8 reserved_at_80[0x8]; u8 user_index[0x18]; u8 reserved_at_a0[0x3]; u8 log_page_size[0x5]; u8 remote_qpn[0x18]; struct mlx5_ifc_ads_bits primary_address_path; struct mlx5_ifc_ads_bits secondary_address_path; u8 log_ack_req_freq[0x4]; u8 reserved_at_384[0x4]; u8 log_sra_max[0x3]; u8 reserved_at_38b[0x2]; u8 retry_count[0x3]; u8 rnr_retry[0x3]; u8 reserved_at_393[0x1]; u8 fre[0x1]; u8 cur_rnr_retry[0x3]; u8 cur_retry_count[0x3]; u8 reserved_at_39b[0x5]; u8 reserved_at_3a0[0x20]; u8 reserved_at_3c0[0x8]; u8 next_send_psn[0x18]; u8 reserved_at_3e0[0x8]; u8 cqn_snd[0x18]; u8 reserved_at_400[0x8]; u8 deth_sqpn[0x18]; u8 reserved_at_420[0x20]; u8 reserved_at_440[0x8]; u8 last_acked_psn[0x18]; u8 reserved_at_460[0x8]; u8 ssn[0x18]; u8 reserved_at_480[0x8]; u8 log_rra_max[0x3]; u8 reserved_at_48b[0x1]; u8 atomic_mode[0x4]; u8 rre[0x1]; u8 rwe[0x1]; u8 rae[0x1]; u8 reserved_at_493[0x1]; u8 page_offset[0x6]; u8 
reserved_at_49a[0x3]; u8 cd_slave_receive[0x1]; u8 cd_slave_send[0x1]; u8 cd_master[0x1]; u8 reserved_at_4a0[0x3]; u8 min_rnr_nak[0x5]; u8 next_rcv_psn[0x18]; u8 reserved_at_4c0[0x8]; u8 xrcd[0x18]; u8 reserved_at_4e0[0x8]; u8 cqn_rcv[0x18]; u8 dbr_addr[0x40]; u8 q_key[0x20]; u8 reserved_at_560[0x5]; u8 rq_type[0x3]; u8 srqn_rmpn_xrqn[0x18]; u8 reserved_at_580[0x8]; u8 rmsn[0x18]; u8 hw_sq_wqebb_counter[0x10]; u8 sw_sq_wqebb_counter[0x10]; u8 hw_rq_counter[0x20]; u8 sw_rq_counter[0x20]; u8 reserved_at_600[0x20]; u8 reserved_at_620[0xf]; u8 cgs[0x1]; u8 cs_req[0x8]; u8 cs_res[0x8]; u8 dc_access_key[0x40]; u8 reserved_at_680[0x3]; u8 dbr_umem_valid[0x1]; u8 reserved_at_684[0x9c]; u8 dbr_umem_id[0x20]; }; struct mlx5_ifc_qpc_ext_bits { u8 reserved_at_0[0x2]; u8 mmo[0x1]; u8 reserved_at_3[0xd]; u8 dci_stream_channel_id[0x10]; u8 qos_queue_group_id_requester[0x20]; u8 qos_queue_group_id_responder[0x20]; u8 reserved_at_60[0x5a0]; }; struct mlx5_ifc_create_tir_out_bits { u8 status[0x8]; u8 icm_address_63_40[0x18]; u8 syndrome[0x20]; u8 icm_address_39_32[0x8]; u8 tirn[0x18]; u8 icm_address_31_0[0x20]; }; struct mlx5_ifc_destroy_tir_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 tirn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_create_qp_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x8]; u8 qpn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_create_qp_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x40]; u8 opt_param_mask[0x20]; u8 reserved_at_a0[0x20]; struct mlx5_ifc_qpc_bits qpc; u8 reserved_at_800[0x40]; u8 wq_umem_id[0x20]; u8 wq_umem_valid[0x1]; u8 reserved_at_861[0x1f]; u8 pas[0][0x40]; }; struct mlx5_ifc_destroy_qp_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 qpn[0x18]; u8 reserved_at_60[0x20]; }; enum mlx5_qpc_opt_mask_32 { MLX5_QPC_OPT_MASK_32_DCI_STREAM_CHANNEL_ID = 1 << 0, MLX5_QPC_OPT_MASK_32_QOS_QUEUE_GROUP_ID = 1 << 1, MLX5_QPC_OPT_MASK_32_UDP_SPORT = 1 << 2, MLX5_QPC_OPT_MASK_32_INIT2INIT_MMO = 1 << 3, }; enum mlx5_qpc_opt_mask { MLX5_QPC_OPT_MASK_INIT2INIT_DRAIN_SIGERR = 1 << 11, MLX5_QPC_OPT_MASK_RTS2RTS_LAG_TX_PORT_AFFINITY = 1 << 15, }; struct mlx5_ifc_init2init_qp_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_init2init_qp_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 qpc_ext[0x1]; u8 reserved_at_41[0x7]; u8 qpn[0x18]; u8 reserved_at_60[0x20]; u8 opt_param_mask[0x20]; u8 reserved_at_a0[0x20]; struct mlx5_ifc_qpc_bits qpc; u8 reserved_at_800[0x40]; u8 opt_param_mask_95_32[0x40]; struct mlx5_ifc_qpc_ext_bits qpc_data_ext; }; struct mlx5_ifc_init2rtr_qp_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_init2rtr_qp_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x8]; u8 qpn[0x18]; u8 reserved_at_60[0x20]; u8 opt_param_mask[0x20]; u8 reserved_at_a0[0x20]; struct mlx5_ifc_qpc_bits qpc; u8 reserved_at_800[0x80]; }; struct mlx5_ifc_rtr2rts_qp_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_rtr2rts_qp_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x8]; u8 qpn[0x18]; u8 reserved_at_60[0x20]; u8 opt_param_mask[0x20]; u8 reserved_at_a0[0x20]; struct 
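/*
 * Illustrative sketch: a DEVX-created QP is walked through
 * RST -> INIT -> RTR -> RTS with the rst2init/init2rtr/rtr2rts commands in
 * this block, each carrying a complete qpc plus an opt_param_mask naming
 * the optional fields actually being set.  Assuming MLX5_CMD_OP_RST2INIT_QP
 * from the opcode enum earlier in this header and an already created DEVX
 * QP object:
 *
 *	uint32_t in[DEVX_ST_SZ_DW(rst2init_qp_in)] = {};
 *	uint32_t out[DEVX_ST_SZ_DW(rst2init_qp_out)] = {};
 *
 *	DEVX_SET(rst2init_qp_in, in, opcode, MLX5_CMD_OP_RST2INIT_QP);
 *	DEVX_SET(rst2init_qp_in, in, qpn, qpn);
 *	DEVX_SET(rst2init_qp_in, in, qpc.pm_state, MLX5_QPC_PM_STATE_MIGRATED);
 *	DEVX_SET(rst2init_qp_in, in,
 *		 qpc.primary_address_path.vhca_port_num, 1);
 *	ret = mlx5dv_devx_obj_modify(qp_obj, in, sizeof(in), out, sizeof(out));
 */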
mlx5_ifc_qpc_bits qpc; u8 reserved_at_800[0x80]; }; struct mlx5_ifc_rst2init_qp_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_rst2init_qp_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x8]; u8 qpn[0x18]; u8 reserved_at_60[0x20]; u8 opt_param_mask[0x20]; u8 reserved_at_a0[0x20]; struct mlx5_ifc_qpc_bits qpc; u8 reserved_at_800[0x80]; }; struct mlx5_ifc_rts2rts_qp_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_rts2rts_qp_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 qpc_ext[0x1]; u8 reserved_at_41[0x7]; u8 qpn[0x18]; u8 reserved_at_60[0x20]; u8 opt_param_mask[0x20]; u8 reserved_at_a0[0x20]; struct mlx5_ifc_qpc_bits qpc; u8 reserved_at_800[0x40]; u8 opt_param_mask_95_32[0x40]; struct mlx5_ifc_qpc_ext_bits qpc_data_ext; }; struct mlx5_ifc_query_qp_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; u8 opt_param_mask[0x20]; u8 reserved_at_a0[0x20]; struct mlx5_ifc_qpc_bits qpc; u8 reserved_at_800[0x80]; u8 pas[0][0x40]; }; struct mlx5_ifc_query_qp_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x8]; u8 qpn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_query_dct_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; struct mlx5_ifc_dctc_bits dctc; }; struct mlx5_ifc_query_dct_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x8]; u8 dctn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_tisc_bits { u8 strict_lag_tx_port_affinity[0x1]; u8 tls_en[0x1]; u8 reserved_at_2[0x2]; u8 lag_tx_port_affinity[0x04]; u8 reserved_at_8[0x4]; u8 prio[0x4]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x100]; u8 reserved_at_120[0x8]; u8 transport_domain[0x18]; u8 reserved_at_140[0x8]; u8 underlay_qpn[0x18]; u8 reserved_at_160[0x8]; u8 pd[0x18]; u8 reserved_at_180[0x380]; }; struct mlx5_ifc_query_tis_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; struct mlx5_ifc_tisc_bits tis_context; }; struct mlx5_ifc_query_tis_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x8]; u8 tisn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_lagc_bits { u8 reserved_at_0[0x1d]; u8 lag_state[0x3]; u8 reserved_at_20[0x14]; u8 tx_remap_affinity_2[0x4]; u8 reserved_at_38[0x4]; u8 tx_remap_affinity_1[0x4]; }; struct mlx5_ifc_query_lag_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; struct mlx5_ifc_lagc_bits ctx; }; struct mlx5_ifc_query_lag_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_av_qp_mapping_bits { u8 modify_field_select[0x40]; u8 reserved_at_40[0x20]; u8 qpn[0x20]; struct mlx5_ifc_ud_av_bits remote_address_vector; }; struct mlx5_ifc_create_av_qp_mapping_in_bits { struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; struct mlx5_ifc_av_qp_mapping_bits mapping; }; struct mlx5_ifc_query_av_qp_mapping_out_bits { struct mlx5_ifc_general_obj_out_cmd_hdr_bits hdr; struct mlx5_ifc_av_qp_mapping_bits obj; }; struct mlx5_ifc_modify_tis_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_modify_tis_bitmask_bits { u8 
reserved_at_0[0x20]; u8 reserved_at_20[0x1d]; u8 lag_tx_port_affinity[0x1]; u8 strict_lag_tx_port_affinity[0x1]; u8 prio[0x1]; }; struct mlx5_ifc_modify_tis_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x8]; u8 tisn[0x18]; u8 reserved_at_60[0x20]; struct mlx5_ifc_modify_tis_bitmask_bits bitmask; u8 reserved_at_c0[0x40]; struct mlx5_ifc_tisc_bits ctx; }; enum roce_version { MLX5_ROCE_VERSION_1 = 0, MLX5_ROCE_VERSION_2 = 2, }; struct mlx5_ifc_roce_addr_layout_bits { u8 source_l3_address[16][0x8]; u8 reserved_at_80[0x3]; u8 vlan_valid[0x1]; u8 vlan_id[0xc]; u8 source_mac_47_32[0x10]; u8 source_mac_31_0[0x20]; u8 reserved_at_c0[0x14]; u8 roce_l3_type[0x4]; u8 roce_version[0x8]; u8 reserved_at_e0[0x20]; }; struct mlx5_ifc_query_roce_address_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; struct mlx5_ifc_roce_addr_layout_bits roce_address; }; struct mlx5_ifc_query_roce_address_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 roce_address_index[0x10]; u8 reserved_at_50[0xc]; u8 vhca_port_num[0x4]; u8 reserved_at_60[0x20]; }; /* Both HW set and HW add share the same HW format with different opcodes */ struct mlx5_ifc_dr_action_hw_set_bits { u8 opcode[0x8]; u8 destination_field_code[0x8]; u8 reserved_at_10[0x2]; u8 destination_left_shifter[0x6]; u8 reserved_at_18[0x3]; u8 destination_length[0x5]; u8 inline_data[0x20]; }; struct mlx5_ifc_dr_action_hw_copy_bits { u8 opcode[0x8]; u8 destination_field_code[0x8]; u8 reserved_at_10[0x2]; u8 destination_left_shifter[0x6]; u8 reserved_at_18[0x2]; u8 destination_length[0x6]; u8 reserved_at_20[0x8]; u8 source_field_code[0x8]; u8 reserved_at_30[0x2]; u8 source_left_shifter[0x6]; u8 reserved_at_38[0x8]; }; struct mlx5_ifc_host_params_context_bits { u8 host_number[0x8]; u8 reserved_at_8[0x6]; u8 host_pf_vhca_id_valid[0x1]; u8 host_pf_disabled[0x1]; u8 host_num_of_vfs[0x10]; u8 host_total_vfs[0x10]; u8 host_pci_bus[0x10]; u8 host_pf_vhca_id[0x10]; u8 host_pci_device[0x10]; u8 reserved_at_60[0x10]; u8 host_pci_function[0x10]; u8 reserved_at_80[0x180]; }; struct mlx5_ifc_query_esw_functions_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_query_esw_functions_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; struct mlx5_ifc_host_params_context_bits host_params_context; u8 reserved_at_280[0x180]; u8 host_sf_enable[0][0x40]; }; struct mlx5_ifc_create_flow_group_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x20]; u8 other_vport[0x1]; u8 reserved_at_41[0xf]; u8 vport_number[0x10]; u8 reserved_at_60[0x20]; u8 table_type[0x8]; u8 reserved_at_88[0x18]; u8 reserved_at_a0[0x8]; u8 table_id[0x18]; u8 reserved_at_c0[0x1f40]; }; struct mlx5_ifc_create_flow_group_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x8]; u8 group_id[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_destroy_flow_group_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x20]; u8 other_vport[0x1]; u8 reserved_at_41[0xf]; u8 vport_number[0x10]; u8 reserved_at_60[0x20]; u8 table_type[0x8]; u8 reserved_at_88[0x18]; u8 reserved_at_a0[0x8]; u8 table_id[0x18]; u8 group_id[0x20]; u8 reserved_at_e0[0x120]; }; struct mlx5_ifc_dest_format_bits { u8 destination_type[0x8]; u8 destination_id[0x18]; u8 reserved_at_20[0x1]; u8 packet_reformat[0x1]; u8 
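/*
 * Illustrative sketch: the dr_action_hw_set/hw_copy layouts above are the
 * device-native encoding that software steering rewrites the
 * set_action_in/copy_action_in format into before the actions are written
 * out.  The opcode, field code and length below are placeholders only,
 * since the real values are looked up per device field:
 *
 *	__be64 hw_action = 0;
 *
 *	DEVX_SET(dr_action_hw_set, &hw_action, opcode, hw_set_opcode);
 *	DEVX_SET(dr_action_hw_set, &hw_action, destination_field_code, fcode);
 *	DEVX_SET(dr_action_hw_set, &hw_action, destination_left_shifter, 0);
 *	DEVX_SET(dr_action_hw_set, &hw_action, destination_length, 16);
 *	DEVX_SET(dr_action_hw_set, &hw_action, inline_data, value);
 */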
reserved_at_22[0x1e]; }; struct mlx5_ifc_extended_dest_format_bits { struct mlx5_ifc_dest_format_bits destination_entry; u8 packet_reformat_id[0x20]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_flow_counter_list_bits { u8 flow_counter_id[0x20]; u8 reserved_at_20[0x20]; }; union mlx5_ifc_dest_format_flow_counter_list_auto_bits { struct mlx5_ifc_dest_format_bits dest_format; struct mlx5_ifc_flow_counter_list_bits flow_counter_list; u8 reserved_at_0[0x40]; }; struct mlx5_ifc_flow_context_bits { u8 reserved_at_00[0x20]; u8 group_id[0x20]; u8 reserved_at_40[0x8]; u8 flow_tag[0x18]; u8 reserved_at_60[0x10]; u8 action[0x10]; u8 extended_destination[0x1]; u8 reserved_at_81[0x7]; u8 destination_list_size[0x18]; u8 reserved_at_a0[0x8]; u8 flow_counter_list_size[0x18]; u8 reserved_at_c0[0x1740]; union mlx5_ifc_dest_format_flow_counter_list_auto_bits destination[0]; }; struct mlx5_ifc_set_fte_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 other_vport[0x1]; u8 reserved_at_41[0xf]; u8 vport_number[0x10]; u8 reserved_at_60[0x20]; u8 table_type[0x8]; u8 reserved_at_88[0x18]; u8 reserved_at_a0[0x8]; u8 table_id[0x18]; u8 reserved_at_c0[0x40]; u8 flow_index[0x20]; u8 reserved_at_120[0xe0]; struct mlx5_ifc_flow_context_bits flow_context; }; struct mlx5_ifc_set_fte_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; enum dr_devx_flow_dest_type { MLX5_FLOW_DEST_TYPE_VPORT = 0x0, MLX5_FLOW_DEST_TYPE_FT = 0x1, MLX5_FLOW_DEST_TYPE_TIR = 0x2, MLX5_FLOW_DEST_TYPE_COUNTER = 0x100, }; enum { MLX5_FLOW_CONTEXT_ACTION_FWD_DEST = 0x4, MLX5_FLOW_CONTEXT_ACTION_COUNT = 0x8, }; enum { MLX5_QPC_PAGE_OFFSET_QUANTA = 64, }; enum { MLX5_ASO_FIRST_HIT_NUM_PER_OBJ = 512, MLX5_ASO_FLOW_METER_NUM_PER_OBJ = 2, MLX5_ASO_CT_NUM_PER_OBJ = 1, }; enum mlx5_sched_hierarchy_type { MLX5_SCHED_HIERARCHY_NIC = 3, }; enum mlx5_sched_elem_type { MLX5_SCHED_ELEM_TYPE_TSAR = 0x0, MLX5_SCHED_ELEM_TYPE_VPORT = 0x1, MLX5_SCHED_ELEM_TYPE_VPORT_TC = 0x2, MLX5_SCHED_ELEM_TYPE_PARA_VPORT_TC = 0x3, MLX5_SCHED_ELEM_TYPE_QUEUE_GROUP = 0x4, }; enum mlx5_sched_tsar_type { MLX5_SCHED_TSAR_TYPE_DWRR = 0x0, MLX5_SCHED_TSAR_TYPE_ROUND_ROBIN = 0x1, MLX5_SCHED_TSAR_TYPE_ETS = 0x2, }; struct mlx5_ifc_sched_elem_attr_tsar_bits { u8 reserved_at_0[0x8]; u8 tsar_type[0x8]; u8 reserved_at_10[0x10]; }; union mlx5_ifc_sched_elem_attr_bits { struct mlx5_ifc_sched_elem_attr_tsar_bits tsar; }; struct mlx5_ifc_sched_context_bits { u8 element_type[0x8]; u8 reserved_at_8[0x18]; union mlx5_ifc_sched_elem_attr_bits sched_elem_attr; u8 parent_element_id[0x20]; u8 reserved_at_60[0x40]; u8 bw_share[0x20]; u8 max_average_bw[0x20]; u8 reserved_at_e0[0x120]; }; struct mlx5_ifc_sched_elem_bits { u8 modify_field_select[0x40]; u8 scheduling_hierarchy[0x8]; u8 reserved_at_48[0x18]; u8 reserved_at_60[0xa0]; struct mlx5_ifc_sched_context_bits sched_context; u8 reserved_at_300[0x100]; }; struct mlx5_ifc_create_sched_elem_in_bits { struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; struct mlx5_ifc_sched_elem_bits sched_elem; }; struct mlx5_ifc_create_modify_elem_in_bits { struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; struct mlx5_ifc_sched_elem_bits sched_elem; }; enum { MLX5_SQC_STATE_RDY = 0x1, }; struct mlx5_ifc_sqc_bits { u8 reserved_at_0[0x8]; u8 state[0x4]; u8 reserved_at_c[0x14]; u8 reserved_at_20[0xe0]; u8 reserved_at_100[0x10]; u8 qos_queue_group_id[0x10]; u8 reserved_at_120[0x660]; }; enum { MLX5_MODIFY_SQ_BITMASK_QOS_QUEUE_GROUP_ID = 1 << 2, }; struct mlx5_ifc_modify_sq_out_bits { u8 
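/*
 * Illustrative sketch: inserting a flow table entry with the
 * set_fte/flow_context layouts above.  The destination list is laid out
 * inline after the flow_context; MLX5_CMD_OP_SET_FLOW_TABLE_ENTRY is
 * assumed to come from the opcode enum earlier in this header:
 *
 *	uint32_t in[DEVX_ST_SZ_DW(set_fte_in) +
 *		    DEVX_ST_SZ_DW(dest_format)] = {};
 *	uint32_t out[DEVX_ST_SZ_DW(set_fte_out)] = {};
 *	void *dest = DEVX_ADDR_OF(set_fte_in, in, flow_context.destination);
 *
 *	DEVX_SET(set_fte_in, in, opcode, MLX5_CMD_OP_SET_FLOW_TABLE_ENTRY);
 *	DEVX_SET(set_fte_in, in, table_type, FS_FT_NIC_RX);
 *	DEVX_SET(set_fte_in, in, table_id, table_id);
 *	DEVX_SET(set_fte_in, in, flow_context.action,
 *		 MLX5_FLOW_CONTEXT_ACTION_FWD_DEST);
 *	DEVX_SET(set_fte_in, in, flow_context.destination_list_size, 1);
 *	DEVX_SET(dest_format, dest, destination_type,
 *		 MLX5_FLOW_DEST_TYPE_TIR);
 *	DEVX_SET(dest_format, dest, destination_id, tirn);
 */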
status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_modify_sq_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 sq_state[0x4]; u8 reserved_at_44[0x4]; u8 sqn[0x18]; u8 reserved_at_60[0x20]; u8 modify_bitmask[0x40]; u8 reserved_at_c0[0x40]; struct mlx5_ifc_sqc_bits sq_context; }; struct mlx5_ifc_reserved_qpn_bits { u8 reserved_at_0[0x80]; }; struct mlx5_ifc_create_reserved_qpn_in_bits { struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; struct mlx5_ifc_reserved_qpn_bits rqpns; }; struct mlx5_ifc_create_psv_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; u8 reserved_at_80[0x8]; u8 psv0_index[0x18]; u8 reserved_at_a0[0x8]; u8 psv1_index[0x18]; u8 reserved_at_c0[0x8]; u8 psv2_index[0x18]; u8 reserved_at_e0[0x8]; u8 psv3_index[0x18]; }; struct mlx5_ifc_create_psv_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 num_psv[0x4]; u8 reserved_at_44[0x4]; u8 pd[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_destroy_psv_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 psvn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_mbox_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_mbox_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_enable_hca_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x10]; u8 function_id[0x10]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_enable_hca_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x20]; }; struct mlx5_ifc_query_issi_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x10]; u8 current_issi[0x10]; u8 reserved_at_60[0xa0]; u8 reserved_at_100[76][0x8]; u8 supported_issi_dw0[0x20]; }; struct mlx5_ifc_query_issi_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_set_issi_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_set_issi_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x10]; u8 current_issi[0x10]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_query_pages_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 embedded_cpu_function[0x01]; u8 reserved_bits[0x0f]; u8 function_id[0x10]; u8 num_pages[0x20]; }; struct mlx5_ifc_query_pages_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x10]; u8 function_id[0x10]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_manage_pages_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 output_num_entries[0x20]; u8 reserved_at_60[0x20]; u8 pas[][0x40]; }; struct mlx5_ifc_manage_pages_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 embedded_cpu_function[0x1]; u8 reserved_at_41[0xf]; u8 function_id[0x10]; u8 input_num_entries[0x20]; u8 pas[][0x40]; }; enum { MLX5_TEARDOWN_HCA_OUT_FORCE_STATE_FAIL = 0x1, }; struct mlx5_ifc_teardown_hca_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x3f]; u8 state[0x1]; }; enum { 
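/*
 * Illustrative sketch: the enable_hca/query_issi/set_issi/query_pages/
 * manage_pages commands above mirror the bring-up handshake a driver
 * performs before anything else: ENABLE_HCA, agree on an ISSI, supply
 * boot/init pages, then INIT_HCA; the device is later shut down with
 * TEARDOWN_HCA using one of the profiles listed below.  Packing the first
 * step, with MLX5_CMD_OP_ENABLE_HCA assumed from the opcode enum earlier
 * in this header:
 *
 *	uint32_t in[DEVX_ST_SZ_DW(enable_hca_in)] = {};
 *	uint32_t out[DEVX_ST_SZ_DW(enable_hca_out)] = {};
 *
 *	DEVX_SET(enable_hca_in, in, opcode, MLX5_CMD_OP_ENABLE_HCA);
 *	DEVX_SET(enable_hca_in, in, function_id, 0);
 *	... issue on the command interface, then check out.status and
 *	    out.syndrome ...
 */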
MLX5_TEARDOWN_HCA_IN_PROFILE_GRACEFUL_CLOSE = 0x0, MLX5_TEARDOWN_HCA_IN_PROFILE_PREPARE_FAST_TEARDOWN = 0x2, }; struct mlx5_ifc_teardown_hca_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x10]; u8 profile[0x10]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_init_hca_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_init_hca_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_access_register_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; u8 register_data[][0x20]; }; struct mlx5_ifc_access_register_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x10]; u8 register_id[0x10]; u8 argument[0x20]; u8 register_data[][0x20]; }; struct mlx5_ifc_modify_nic_vport_context_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_modify_nic_vport_field_select_bits { u8 reserved_at_0[0x12]; u8 affiliation[0x1]; u8 reserved_at_13[0x1]; u8 disable_uc_local_lb[0x1]; u8 disable_mc_local_lb[0x1]; u8 node_guid[0x1]; u8 port_guid[0x1]; u8 min_inline[0x1]; u8 mtu[0x1]; u8 change_event[0x1]; u8 promisc[0x1]; u8 permanent_address[0x1]; u8 addresses_list[0x1]; u8 roce_en[0x1]; u8 reserved_at_1f[0x1]; }; struct mlx5_ifc_modify_nic_vport_context_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 other_vport[0x1]; u8 reserved_at_41[0xf]; u8 vport_number[0x10]; struct mlx5_ifc_modify_nic_vport_field_select_bits field_select; u8 reserved_at_80[0x780]; struct mlx5_ifc_nic_vport_context_bits nic_vport_context; }; struct mlx5_ifc_set_hca_cap_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_set_hca_cap_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 other_function[0x1]; u8 reserved_at_41[0xf]; u8 function_id[0x10]; u8 reserved_at_60[0x20]; union mlx5_ifc_hca_cap_union_bits capability; }; struct mlx5_ifc_alloc_uar_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x8]; u8 uar[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_alloc_uar_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_dealloc_uar_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_dealloc_uar_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x8]; u8 uar[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_eqc_bits { u8 status[0x4]; u8 reserved_at_4[0x9]; u8 ec[0x1]; u8 oi[0x1]; u8 reserved_at_f[0x5]; u8 st[0x4]; u8 reserved_at_18[0x8]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x14]; u8 page_offset[0x6]; u8 reserved_at_5a[0x6]; u8 reserved_at_60[0x3]; u8 log_eq_size[0x5]; u8 uar_page[0x18]; u8 reserved_at_80[0x20]; u8 reserved_at_a0[0x18]; u8 intr[0x8]; u8 reserved_at_c0[0x3]; u8 log_page_size[0x5]; u8 reserved_at_c8[0x18]; u8 reserved_at_e0[0x60]; u8 reserved_at_140[0x8]; u8 consumer_counter[0x18]; u8 reserved_at_160[0x8]; u8 producer_counter[0x18]; u8 reserved_at_180[0x80]; }; struct mlx5_ifc_create_eq_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 
reserved_at_40[0x18]; u8 eq_number[0x8]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_create_eq_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x40]; struct mlx5_ifc_eqc_bits eq_context_entry; u8 reserved_at_280[0x40]; u8 event_bitmask[4][0x40]; u8 reserved_at_3c0[0x4c0]; u8 pas[][0x40]; }; struct mlx5_ifc_destroy_eq_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_destroy_eq_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x18]; u8 eq_number[0x8]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_alloc_pd_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x8]; u8 pd[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_alloc_pd_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_dealloc_pd_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_dealloc_pd_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x8]; u8 pd[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_mtt_bits { u8 ptag_63_32[0x20]; u8 ptag_31_8[0x18]; u8 reserved_at_38[0x6]; u8 wr_en[0x1]; u8 rd_en[0x1]; }; struct mlx5_ifc_umem_bits { u8 reserved_at_0[0x80]; u8 reserved_at_80[0x1b]; u8 log_page_size[0x5]; u8 page_offset[0x20]; u8 num_of_mtt[0x40]; struct mlx5_ifc_mtt_bits mtt[]; }; struct mlx5_ifc_create_umem_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x40]; struct mlx5_ifc_umem_bits umem; }; struct mlx5_ifc_create_umem_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x8]; u8 umem_id[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_destroy_umem_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x10]; u8 op_mod[0x10]; u8 reserved_at_40[0x8]; u8 umem_id[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_destroy_umem_out_bits { u8 status[0x8]; u8 reserved_at_8[0x18]; u8 syndrome[0x20]; u8 reserved_at_40[0x40]; }; struct mlx5_ifc_delete_fte_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x20]; u8 other_vport[0x1]; u8 reserved_at_41[0xf]; u8 vport_number[0x10]; u8 reserved_at_60[0x20]; u8 table_type[0x8]; u8 reserved_at_88[0x18]; u8 reserved_at_a0[0x8]; u8 table_id[0x18]; u8 reserved_at_c0[0x40]; u8 flow_index[0x20]; u8 reserved_at_120[0xe0]; }; struct mlx5_ifc_create_cq_out_bits { u8 reserved_at_0[0x40]; u8 reserved_at_40[0x8]; u8 cqn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_destroy_cq_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 cqn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_alloc_transport_domain_out_bits { u8 reserved_at_0[0x40]; u8 reserved_at_40[0x8]; u8 transport_domain[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_dealloc_transport_domain_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 transport_domain[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_create_rmp_out_bits { u8 reserved_at_0[0x40]; u8 reserved_at_40[0x8]; u8 rmpn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_destroy_rmp_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 rmpn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_create_sq_out_bits { u8 reserved_at_0[0x40]; u8 
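/*
 * Illustrative sketch: the mtt/umem layouts above describe memory that
 * queue-create commands reference by umem_id (e.g. wq_umem_id/wq_umem_valid
 * in create_qp_in) instead of passing a "pas" page list.  Userspace
 * normally gets such an id from mlx5dv_devx_umem_reg():
 *
 *	struct mlx5dv_devx_umem *mem;
 *
 *	mem = mlx5dv_devx_umem_reg(ctx, buf, size, IBV_ACCESS_LOCAL_WRITE);
 *	if (mem) {
 *		DEVX_SET(create_qp_in, in, wq_umem_id, mem->umem_id);
 *		DEVX_SET(create_qp_in, in, wq_umem_valid, 1);
 *	}
 */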
reserved_at_40[0x8]; u8 sqn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_destroy_sq_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 sqn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_create_rq_out_bits { u8 reserved_at_0[0x40]; u8 reserved_at_40[0x8]; u8 rqn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_destroy_rq_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 rqn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_create_rqt_out_bits { u8 reserved_at_0[0x40]; u8 reserved_at_40[0x8]; u8 rqtn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_destroy_rqt_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 rqtn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_create_tis_out_bits { u8 reserved_at_0[0x40]; u8 reserved_at_40[0x8]; u8 tisn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_destroy_tis_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 tisn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_alloc_q_counter_out_bits { u8 reserved_at_0[0x40]; u8 reserved_at_40[0x18]; u8 counter_set_id[0x8]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_dealloc_q_counter_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x18]; u8 counter_set_id[0x8]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_alloc_modify_header_context_out_bits { u8 reserved_at_0[0x40]; u8 modify_header_id[0x20]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_dealloc_modify_header_context_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x20]; u8 modify_header_id[0x20]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_create_scheduling_element_out_bits { u8 reserved_at_0[0x80]; u8 scheduling_element_id[0x20]; u8 reserved_at_a0[0x160]; }; struct mlx5_ifc_create_scheduling_element_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x20]; u8 scheduling_hierarchy[0x8]; u8 reserved_at_48[0x18]; u8 reserved_at_60[0x3a0]; }; struct mlx5_ifc_destroy_scheduling_element_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x20]; u8 scheduling_hierarchy[0x8]; u8 reserved_at_48[0x18]; u8 scheduling_element_id[0x20]; u8 reserved_at_80[0x180]; }; struct mlx5_ifc_add_vxlan_udp_dport_in_bits { u8 reserved_at_0[0x60]; u8 reserved_at_60[0x10]; u8 vxlan_udp_port[0x10]; }; struct mlx5_ifc_delete_vxlan_udp_dport_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x40]; u8 reserved_at_60[0x10]; u8 vxlan_udp_port[0x10]; }; struct mlx5_ifc_set_l2_table_entry_in_bits { u8 reserved_at_0[0xa0]; u8 reserved_at_a0[0x8]; u8 table_index[0x18]; u8 reserved_at_c0[0x140]; }; struct mlx5_ifc_delete_l2_table_entry_in_bits { u8 opcode[0x10]; u8 reserved_at_10[0x10]; u8 reserved_at_20[0x80]; u8 reserved_at_a0[0x8]; u8 table_index[0x18]; u8 reserved_at_c0[0x140]; }; struct mlx5_ifc_create_srq_out_bits { u8 reserved_at_0[0x40]; u8 reserved_at_40[0x8]; u8 srqn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_destroy_srq_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 srqn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_create_xrc_srq_out_bits { u8 reserved_at_0[0x40]; u8 reserved_at_40[0x8]; u8 xrc_srqn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_destroy_xrc_srq_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 xrc_srqn[0x18]; u8 reserved_at_60[0x20]; }; struct 
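/*
 * Note on the layouts above: in the mlx5_ifc convention, "u8 name[0xN]"
 * declares an N-bit field at a fixed bit offset inside the command, not an
 * array of N bytes, and "reserved_at_X" records the bit offset X of padding.
 * Fields are accessed through the DEVX macros rather than by direct member
 * access.  A minimal, illustrative sketch using the dealloc_pd layouts
 * defined above ("pdn" is a caller-supplied PD number, and the command is
 * executed through whatever transport the caller uses):
 *
 *	uint32_t in[DEVX_ST_SZ_DW(dealloc_pd_in)] = {};
 *	uint32_t out[DEVX_ST_SZ_DW(dealloc_pd_out)] = {};
 *
 *	DEVX_SET(dealloc_pd_in, in, opcode, MLX5_CMD_OP_DEALLOC_PD);
 *	DEVX_SET(dealloc_pd_in, in, pd, pdn);
 *	// ...execute in/out..., then:
 *	// status = DEVX_GET(dealloc_pd_out, out, status);
 */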
mlx5_ifc_create_dct_out_bits { u8 reserved_at_0[0x40]; u8 reserved_at_40[0x8]; u8 dctn[0x18]; u8 ece[0x20]; }; struct mlx5_ifc_destroy_dct_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 dctn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_create_xrq_out_bits { u8 reserved_at_0[0x40]; u8 reserved_at_40[0x8]; u8 xrqn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_destroy_xrq_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 xrqn[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_attach_to_mcg_in_bits { u8 reserved_at_0[0x40]; u8 reserved_at_40[0x8]; u8 qpn[0x18]; u8 reserved_at_60[0x20]; u8 multicast_gid[16][0x8]; }; struct mlx5_ifc_detach_from_mcg_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 qpn[0x18]; u8 reserved_at_60[0x20]; u8 multicast_gid[16][0x8]; }; struct mlx5_ifc_alloc_xrcd_out_bits { u8 reserved_at_0[0x40]; u8 reserved_at_40[0x8]; u8 xrcd[0x18]; u8 reserved_at_60[0x20]; }; struct mlx5_ifc_dealloc_xrcd_in_bits { u8 opcode[0x10]; u8 uid[0x10]; u8 reserved_at_20[0x20]; u8 reserved_at_40[0x8]; u8 xrcd[0x18]; u8 reserved_at_60[0x20]; }; enum { MLX5_CRYPTO_LOGIN_OBJ_STATE_VALID = 0x0, MLX5_CRYPTO_LOGIN_OBJ_STATE_INVALID = 0x1, }; struct mlx5_ifc_crypto_login_obj_bits { u8 modify_field_select[0x40]; u8 reserved_at_40[0x40]; u8 reserved_at_80[0x4]; u8 state[0x4]; u8 credential_pointer[0x18]; u8 reserved_at_a0[0x8]; u8 session_import_kek_ptr[0x18]; u8 reserved_at_c0[0x140]; u8 credential[12][0x20]; u8 reserved_at_380[0x480]; }; struct mlx5_ifc_create_crypto_login_obj_in_bits { struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; struct mlx5_ifc_crypto_login_obj_bits login_obj; }; struct mlx5_ifc_query_crypto_login_obj_out_bits { struct mlx5_ifc_general_obj_out_cmd_hdr_bits hdr; struct mlx5_ifc_crypto_login_obj_bits obj; }; enum { MLX5_ENCRYPTION_KEY_OBJ_STATE_READY = 0x0, MLX5_ENCRYPTION_KEY_OBJ_STATE_ERROR = 0x1, }; enum { MLX5_ENCRYPTION_KEY_OBJ_KEY_SIZE_SIZE_128 = 0x0, MLX5_ENCRYPTION_KEY_OBJ_KEY_SIZE_SIZE_256 = 0x1, }; enum { MLX5_ENCRYPTION_KEY_OBJ_KEY_PURPOSE_AES_XTS = 0x3, }; struct mlx5_ifc_encryption_key_obj_bits { u8 modify_field_select[0x40]; u8 state[0x8]; u8 reserved_at_48[0xc]; u8 key_size[0x4]; u8 has_keytag[0x1]; u8 reserved_at_59[0x3]; u8 key_purpose[0x4]; u8 reserved_at_60[0x8]; u8 pd[0x18]; u8 reserved_at_80[0x100]; u8 opaque[0x40]; u8 reserved_at_1c0[0x40]; u8 key[32][0x20]; u8 reserved_at_600[0x200]; }; struct mlx5_ifc_create_encryption_key_obj_in_bits { struct mlx5_ifc_general_obj_in_cmd_hdr_bits hdr; struct mlx5_ifc_encryption_key_obj_bits key_obj; }; struct mlx5_ifc_query_encryption_key_obj_out_bits { struct mlx5_ifc_general_obj_out_cmd_hdr_bits hdr; struct mlx5_ifc_encryption_key_obj_bits obj; }; enum { MLX5_ENCRYPTION_ORDER_ENCRYPTED_WIRE_SIGNATURE = 0x0, MLX5_ENCRYPTION_ORDER_ENCRYPTED_MEMORY_SIGNATURE = 0x1, MLX5_ENCRYPTION_ORDER_ENCRYPTED_RAW_WIRE = 0x2, MLX5_ENCRYPTION_ORDER_ENCRYPTED_RAW_MEMORY = 0x3, }; enum { MLX5_ENCRYPTION_STANDARD_AES_XTS = 0x0, }; #endif /* MLX5_IFC_H */ rdma-core-56.1/providers/mlx5/mlx5_trace.c000066400000000000000000000003651477342711600204470ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ /* * Copyright 2023 Bytedance.com, Inc. or its affiliates. All rights reserved. 
*/ #define LTTNG_UST_TRACEPOINT_CREATE_PROBES #define LTTNG_UST_TRACEPOINT_DEFINE #include "mlx5_trace.h" rdma-core-56.1/providers/mlx5/mlx5_trace.h000066400000000000000000000023771477342711600204570ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ /* * Copyright 2023 Bytedance.com, Inc. or its affiliates. All rights reserved. */ #if defined(LTTNG_ENABLED) #undef LTTNG_UST_TRACEPOINT_PROVIDER #define LTTNG_UST_TRACEPOINT_PROVIDER rdma_core_mlx5 #undef LTTNG_UST_TRACEPOINT_INCLUDE #define LTTNG_UST_TRACEPOINT_INCLUDE "mlx5_trace.h" #if !defined(__MLX5_TRACE_H__) || defined(LTTNG_UST_TRACEPOINT_HEADER_MULTI_READ) #define __MLX5_TRACE_H__ #include <stdint.h> #include <lttng/tracepoint.h> LTTNG_UST_TRACEPOINT_EVENT( /* Tracepoint provider name */ rdma_core_mlx5, /* Tracepoint name */ post_send, /* Input arguments */ LTTNG_UST_TP_ARGS( char *, dev, uint32_t, src_qp_num, char *, opcode, uint32_t, bytes ), /* Output event fields */ LTTNG_UST_TP_FIELDS( lttng_ust_field_string(dev, dev) lttng_ust_field_integer(uint32_t, src_qp_num, src_qp_num) lttng_ust_field_string(opcode, opcode) lttng_ust_field_integer(uint32_t, bytes, bytes) ) ) #define rdma_tracepoint(arg...) lttng_ust_tracepoint(arg) #endif /* __MLX5_TRACE_H__*/ #include <lttng/tracepoint-event.h> #else #ifndef __MLX5_TRACE_H__ #define __MLX5_TRACE_H__ #define rdma_tracepoint(arg...) #endif /* __MLX5_TRACE_H__*/ #endif /* defined(LTTNG_ENABLED) */ rdma-core-56.1/providers/mlx5/mlx5_vfio.c000066400000000000000000002717511477342711600203220ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB /* * Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "mlx5dv.h" #include "mlx5_vfio.h" #include "mlx5.h" #include "mlx5_ifc.h" enum { MLX5_VFIO_CMD_VEC_IDX, }; enum { MLX5_VFIO_SUPP_MR_ACCESS_FLAGS = IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_REMOTE_READ | IBV_ACCESS_REMOTE_ATOMIC | IBV_ACCESS_RELAXED_ORDERING, MLX5_VFIO_SUPP_UMEM_ACCESS_FLAGS = IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_REMOTE_READ, }; static int mlx5_vfio_give_pages(struct mlx5_vfio_context *ctx, uint16_t func_id, int32_t npages, bool is_event); static int mlx5_vfio_reclaim_pages(struct mlx5_vfio_context *ctx, uint32_t func_id, int npages); static void mlx5_vfio_free_cmd_msg(struct mlx5_vfio_context *ctx, struct mlx5_cmd_msg *msg); static int mlx5_vfio_alloc_cmd_msg(struct mlx5_vfio_context *ctx, uint32_t size, struct mlx5_cmd_msg *msg); static int mlx5_vfio_post_cmd(struct mlx5_vfio_context *ctx, void *in, int ilen, void *out, int olen, unsigned int slot, bool async); static int mlx5_vfio_register_mem(struct mlx5_vfio_context *ctx, void *vaddr, uint64_t iova, uint64_t size) { struct vfio_iommu_type1_dma_map dma_map = { .argsz = sizeof(dma_map) }; dma_map.vaddr = (uintptr_t)vaddr; dma_map.size = size; dma_map.iova = iova; dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE; return ioctl(ctx->container_fd, VFIO_IOMMU_MAP_DMA, &dma_map); } static void mlx5_vfio_unregister_mem(struct mlx5_vfio_context *ctx, uint64_t iova, uint64_t size) { struct vfio_iommu_type1_dma_unmap dma_unmap = {}; dma_unmap.argsz = sizeof(struct vfio_iommu_type1_dma_unmap); dma_unmap.size = size; dma_unmap.iova = iova; if (ioctl(ctx->container_fd, VFIO_IOMMU_UNMAP_DMA, &dma_unmap)) assert(false); } static struct page_block *mlx5_vfio_new_block(struct
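/*
 * mlx5_vfio_register_mem()/mlx5_vfio_unregister_mem() above are thin
 * wrappers around the type1 IOMMU ioctls: they pin a virtual address range
 * and make it visible to the device at a caller-chosen IOVA.  A standalone
 * sketch of the same pattern (4 KiB buffer; the IOVA value is illustrative
 * and in a real caller must come from a usable IOVA range):
 *
 *	struct vfio_iommu_type1_dma_map map = { .argsz = sizeof(map) };
 *	void *buf;
 *
 *	if (posix_memalign(&buf, 4096, 4096))
 *		return -1;
 *	map.vaddr = (uintptr_t)buf;
 *	map.iova = 0x10000;
 *	map.size = 4096;
 *	map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
 *	if (ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map))
 *		return -1;
 */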
mlx5_vfio_context *ctx) { struct page_block *page_block; int err; page_block = calloc(1, sizeof(*page_block)); if (!page_block) { errno = ENOMEM; return NULL; } err = posix_memalign(&page_block->page_ptr, MLX5_VFIO_BLOCK_SIZE, MLX5_VFIO_BLOCK_SIZE); if (err) { errno = err; goto err; } err = iset_alloc_range(ctx->iova_alloc, MLX5_VFIO_BLOCK_SIZE, &page_block->iova, MLX5_VFIO_BLOCK_SIZE); if (err) goto err_range; bitmap_fill(page_block->free_pages, MLX5_VFIO_BLOCK_NUM_PAGES); err = mlx5_vfio_register_mem(ctx, page_block->page_ptr, page_block->iova, MLX5_VFIO_BLOCK_SIZE); if (err) goto err_reg; list_add(&ctx->mem_alloc.block_list, &page_block->next_block); return page_block; err_reg: iset_insert_range(ctx->iova_alloc, page_block->iova, MLX5_VFIO_BLOCK_SIZE); err_range: free(page_block->page_ptr); err: free(page_block); return NULL; } static void mlx5_vfio_free_block(struct mlx5_vfio_context *ctx, struct page_block *page_block) { mlx5_vfio_unregister_mem(ctx, page_block->iova, MLX5_VFIO_BLOCK_SIZE); iset_insert_range(ctx->iova_alloc, page_block->iova, MLX5_VFIO_BLOCK_SIZE); list_del(&page_block->next_block); free(page_block->page_ptr); free(page_block); } static int mlx5_vfio_alloc_page(struct mlx5_vfio_context *ctx, uint64_t *iova) { struct page_block *page_block; unsigned long pg; int ret = 0; pthread_mutex_lock(&ctx->mem_alloc.block_list_mutex); while (true) { list_for_each(&ctx->mem_alloc.block_list, page_block, next_block) { pg = bitmap_find_first_bit(page_block->free_pages, 0, MLX5_VFIO_BLOCK_NUM_PAGES); if (pg != MLX5_VFIO_BLOCK_NUM_PAGES) { bitmap_clear_bit(page_block->free_pages, pg); *iova = page_block->iova + pg * MLX5_ADAPTER_PAGE_SIZE; goto end; } } if (!mlx5_vfio_new_block(ctx)) { ret = -1; goto end; } } end: pthread_mutex_unlock(&ctx->mem_alloc.block_list_mutex); return ret; } static void mlx5_vfio_free_page(struct mlx5_vfio_context *ctx, uint64_t iova) { struct page_block *page_block; unsigned long pg; pthread_mutex_lock(&ctx->mem_alloc.block_list_mutex); list_for_each(&ctx->mem_alloc.block_list, page_block, next_block) { if (page_block->iova > iova || (page_block->iova + MLX5_VFIO_BLOCK_SIZE <= iova)) continue; pg = (iova - page_block->iova) / MLX5_ADAPTER_PAGE_SIZE; assert(!bitmap_test_bit(page_block->free_pages, pg)); bitmap_set_bit(page_block->free_pages, pg); if (bitmap_full(page_block->free_pages, MLX5_VFIO_BLOCK_NUM_PAGES)) mlx5_vfio_free_block(ctx, page_block); goto end; } assert(false); end: pthread_mutex_unlock(&ctx->mem_alloc.block_list_mutex); } static const char *cmd_status_str(uint8_t status) { switch (status) { case MLX5_CMD_STAT_OK: return "OK"; case MLX5_CMD_STAT_INT_ERR: return "internal error"; case MLX5_CMD_STAT_BAD_OP_ERR: return "bad operation"; case MLX5_CMD_STAT_BAD_PARAM_ERR: return "bad parameter"; case MLX5_CMD_STAT_BAD_SYS_STATE_ERR: return "bad system state"; case MLX5_CMD_STAT_BAD_RES_ERR: return "bad resource"; case MLX5_CMD_STAT_RES_BUSY: return "resource busy"; case MLX5_CMD_STAT_LIM_ERR: return "limits exceeded"; case MLX5_CMD_STAT_BAD_RES_STATE_ERR: return "bad resource state"; case MLX5_CMD_STAT_IX_ERR: return "bad index"; case MLX5_CMD_STAT_NO_RES_ERR: return "no resources"; case MLX5_CMD_STAT_BAD_INP_LEN_ERR: return "bad input length"; case MLX5_CMD_STAT_BAD_OUTP_LEN_ERR: return "bad output length"; case MLX5_CMD_STAT_BAD_QP_STATE_ERR: return "bad QP state"; case MLX5_CMD_STAT_BAD_PKT_ERR: return "bad packet (discarded)"; case MLX5_CMD_STAT_BAD_SIZE_OUTS_CQES_ERR: return "bad size too many outstanding CQEs"; default: return "unknown 
status"; } } static struct mlx5_eqe *get_eqe(struct mlx5_eq *eq, uint32_t entry) { return eq->vaddr + entry * MLX5_EQE_SIZE; } static struct mlx5_eqe *mlx5_eq_get_eqe(struct mlx5_eq *eq, uint32_t cc) { uint32_t ci = eq->cons_index + cc; struct mlx5_eqe *eqe; eqe = get_eqe(eq, ci & (eq->nent - 1)); eqe = ((eqe->owner & 1) ^ !!(ci & eq->nent)) ? NULL : eqe; if (eqe) udma_from_device_barrier(); return eqe; } static void eq_update_ci(struct mlx5_eq *eq, uint32_t cc, int arm) { __be32 *addr = eq->doorbell + (arm ? 0 : 2); uint32_t val; eq->cons_index += cc; val = (eq->cons_index & 0xffffff) | (eq->eqn << 24); mmio_write32_be(addr, htobe32(val)); udma_to_device_barrier(); } static int mlx5_vfio_handle_page_req_event(struct mlx5_vfio_context *ctx, struct mlx5_eqe *eqe) { struct mlx5_eqe_page_req *req = &eqe->data.req_pages; int32_t num_pages; int16_t func_id; func_id = be16toh(req->func_id); num_pages = be32toh(req->num_pages); if (num_pages > 0) return mlx5_vfio_give_pages(ctx, func_id, num_pages, true); return mlx5_vfio_reclaim_pages(ctx, func_id, -1 * num_pages); } static void mlx5_cmd_mbox_status(void *out, uint8_t *status, uint32_t *syndrome) { *status = DEVX_GET(mbox_out, out, status); *syndrome = DEVX_GET(mbox_out, out, syndrome); } static int mlx5_vfio_cmd_check(struct mlx5_vfio_context *ctx, void *in, void *out) { uint32_t syndrome; uint8_t status; uint16_t opcode; uint16_t op_mod; mlx5_cmd_mbox_status(out, &status, &syndrome); if (!status) return 0; opcode = DEVX_GET(mbox_in, in, opcode); op_mod = DEVX_GET(mbox_in, in, op_mod); mlx5_err(ctx->dbg_fp, "mlx5_vfio_op_code(0x%x), op_mod(0x%x) failed, status %s(0x%x), syndrome (0x%x)\n", opcode, op_mod, cmd_status_str(status), status, syndrome); errno = mlx5_cmd_status_to_err(status); return errno; } static int mlx5_copy_from_msg(void *to, struct mlx5_cmd_msg *from, int size, struct mlx5_cmd_layout *cmd_lay) { struct mlx5_cmd_block *block; struct mlx5_cmd_mailbox *next; int copy; copy = min_t(int, size, sizeof(cmd_lay->out)); memcpy(to, cmd_lay->out, copy); size -= copy; to += copy; next = from->next; while (size) { if (!next) { assert(false); errno = ENOMEM; return errno; } copy = min_t(int, size, MLX5_CMD_DATA_BLOCK_SIZE); block = next->buf; memcpy(to, block->data, copy); to += copy; size -= copy; next = next->next; } return 0; } static int mlx5_copy_to_msg(struct mlx5_cmd_msg *to, void *from, int size, struct mlx5_cmd_layout *cmd_lay) { struct mlx5_cmd_block *block; struct mlx5_cmd_mailbox *next; int copy; copy = min_t(int, size, sizeof(cmd_lay->in)); memcpy(cmd_lay->in, from, copy); size -= copy; from += copy; next = to->next; while (size) { if (!next) { assert(false); errno = ENOMEM; return errno; } copy = min_t(int, size, MLX5_CMD_DATA_BLOCK_SIZE); block = next->buf; memcpy(block->data, from, copy); from += copy; size -= copy; next = next->next; } return 0; } /* The HCA will think the queue has overflowed if we don't tell it we've been * processing events. * We create EQs with MLX5_NUM_SPARE_EQE extra entries, * so we must update our consumer index at least that often. 
*/ static inline uint32_t mlx5_eq_update_cc(struct mlx5_eq *eq, uint32_t cc) { if (unlikely(cc >= MLX5_NUM_SPARE_EQE)) { eq_update_ci(eq, cc, 0); cc = 0; } return cc; } static int mlx5_vfio_process_page_request_comp(struct mlx5_vfio_context *ctx, unsigned long slot) { struct mlx5_vfio_cmd_slot *cmd_slot = &ctx->cmd.cmds[slot]; struct cmd_async_data *cmd_data = &cmd_slot->curr; int num_claimed; int ret, i; ret = mlx5_copy_from_msg(cmd_data->buff_out, &cmd_slot->out, cmd_data->olen, cmd_slot->lay); if (ret) goto end; ret = mlx5_vfio_cmd_check(ctx, cmd_data->buff_in, cmd_data->buff_out); if (ret) goto end; if (DEVX_GET(manage_pages_in, cmd_data->buff_in, op_mod) == MLX5_PAGES_GIVE) goto end; num_claimed = DEVX_GET(manage_pages_out, cmd_data->buff_out, output_num_entries); if (num_claimed > DEVX_GET(manage_pages_in, cmd_data->buff_in, input_num_entries)) { ret = EINVAL; errno = ret; goto end; } for (i = 0; i < num_claimed; i++) mlx5_vfio_free_page(ctx, DEVX_GET64(manage_pages_out, cmd_data->buff_out, pas[i])); end: free(cmd_data->buff_in); free(cmd_data->buff_out); cmd_slot->in_use = false; if (!ret && cmd_slot->is_pending) { cmd_data = &cmd_slot->pending; pthread_mutex_lock(&cmd_slot->lock); cmd_slot->is_pending = false; ret = mlx5_vfio_post_cmd(ctx, cmd_data->buff_in, cmd_data->ilen, cmd_data->buff_out, cmd_data->olen, slot, true); pthread_mutex_unlock(&cmd_slot->lock); } return ret; } static int mlx5_vfio_cmd_comp(struct mlx5_vfio_context *ctx, unsigned long slot) { uint64_t u = 1; ssize_t s; s = write(ctx->cmd.cmds[slot].completion_event_fd, &u, sizeof(uint64_t)); if (s != sizeof(uint64_t)) return -1; return 0; } static int mlx5_vfio_process_cmd_eqe(struct mlx5_vfio_context *ctx, struct mlx5_eqe *eqe) { struct mlx5_eqe_cmd *cmd_eqe = &eqe->data.cmd; unsigned long vector = be32toh(cmd_eqe->vector); unsigned long slot; int count = 0; int ret; for (slot = 0; slot < MLX5_MAX_COMMANDS; slot++) { if (vector & (1 << slot)) { assert(ctx->cmd.cmds[slot].comp_func); ret = ctx->cmd.cmds[slot].comp_func(ctx, slot); if (ret) return ret; vector &= ~(1 << slot); count++; } } assert(!vector && count); return 0; } static int mlx5_vfio_process_async_events(struct mlx5_vfio_context *ctx) { struct mlx5_eqe *eqe; int ret = 0; int cc = 0; pthread_mutex_lock(&ctx->eq_lock); while ((eqe = mlx5_eq_get_eqe(&ctx->async_eq, cc))) { switch (eqe->type) { case MLX5_EVENT_TYPE_CMD: ret = mlx5_vfio_process_cmd_eqe(ctx, eqe); break; case MLX5_EVENT_TYPE_PAGE_REQUEST: ret = mlx5_vfio_handle_page_req_event(ctx, eqe); break; default: break; } cc = mlx5_eq_update_cc(&ctx->async_eq, ++cc); if (ret) goto out; } out: eq_update_ci(&ctx->async_eq, cc, 1); pthread_mutex_unlock(&ctx->eq_lock); return ret; } static int mlx5_vfio_enlarge_cmd_msg(struct mlx5_vfio_context *ctx, struct mlx5_cmd_msg *cmd_msg, struct mlx5_cmd_layout *cmd_lay, uint32_t len, bool is_in) { int err; mlx5_vfio_free_cmd_msg(ctx, cmd_msg); err = mlx5_vfio_alloc_cmd_msg(ctx, len, cmd_msg); if (err) return err; if (is_in) cmd_lay->iptr = htobe64(cmd_msg->next->iova); else cmd_lay->optr = htobe64(cmd_msg->next->iova); return 0; } static int mlx5_vfio_wait_event(struct mlx5_vfio_context *ctx, unsigned int slot) { struct mlx5_cmd_layout *cmd_lay = ctx->cmd.cmds[slot].lay; uint64_t u; ssize_t s; int err; struct pollfd fds[2] = { { .fd = ctx->cmd_comp_fd, .events = POLLIN }, { .fd = ctx->cmd.cmds[slot].completion_event_fd, .events = POLLIN } }; while (true) { err = poll(fds, 2, -1); if (err < 0 && errno != EAGAIN) { mlx5_err(ctx->dbg_fp, "mlx5_vfio_wait_event, poll 
failed, errno=%d\n", errno); return errno; } if (fds[0].revents & POLLIN) { s = read(fds[0].fd, &u, sizeof(uint64_t)); if (s < 0 && errno != EAGAIN) { mlx5_err(ctx->dbg_fp, "mlx5_vfio_wait_event, read failed, errno=%d\n", errno); return errno; } err = mlx5_vfio_process_async_events(ctx); if (err) return err; } if (fds[1].revents & POLLIN) { s = read(fds[1].fd, &u, sizeof(uint64_t)); if (s < 0 && errno != EAGAIN) { mlx5_err(ctx->dbg_fp, "mlx5_vfio_wait_event, read failed, slot=%d, errno=%d\n", slot, errno); return errno; } if (!(mmio_read8(&cmd_lay->status_own) & 0x1)) return 0; } } } /* One minute for the sake of bringup */ #define MLX5_CMD_TIMEOUT_MSEC (60 * 1000) static int mlx5_vfio_poll_timeout(struct mlx5_cmd_layout *cmd_lay) { static struct timeval start, curr; uint64_t ms_start, ms_curr; gettimeofday(&start, NULL); ms_start = (uint64_t)start.tv_sec * 1000 + start.tv_usec / 1000; do { if (!(mmio_read8(&cmd_lay->status_own) & 0x1)) return 0; sched_yield(); gettimeofday(&curr, NULL); ms_curr = (uint64_t)curr.tv_sec * 1000 + curr.tv_usec / 1000; } while (ms_curr - ms_start < MLX5_CMD_TIMEOUT_MSEC); errno = ETIMEDOUT; return errno; } static int mlx5_vfio_cmd_prep_in(struct mlx5_vfio_context *ctx, struct mlx5_cmd_msg *cmd_in, struct mlx5_cmd_layout *cmd_lay, void *in, int ilen) { int err; if (ilen > cmd_in->len) { err = mlx5_vfio_enlarge_cmd_msg(ctx, cmd_in, cmd_lay, ilen, true); if (err) return err; } err = mlx5_copy_to_msg(cmd_in, in, ilen, cmd_lay); if (err) return err; cmd_lay->ilen = htobe32(ilen); return 0; } static int mlx5_vfio_cmd_prep_out(struct mlx5_vfio_context *ctx, struct mlx5_cmd_msg *cmd_out, struct mlx5_cmd_layout *cmd_lay, int olen) { struct mlx5_cmd_mailbox *tmp; struct mlx5_cmd_block *block; cmd_lay->olen = htobe32(olen); /* zeroing output header */ memset(cmd_lay->out, 0, sizeof(cmd_lay->out)); if (olen > cmd_out->len) /* Upon enlarge, the output message is zeroed */ return mlx5_vfio_enlarge_cmd_msg(ctx, cmd_out, cmd_lay, olen, false); /* zeroing output message */ tmp = cmd_out->next; olen -= min_t(int, olen, sizeof(cmd_lay->out)); while (olen > 0) { block = tmp->buf; memset(block->data, 0, MLX5_CMD_DATA_BLOCK_SIZE); olen -= MLX5_CMD_DATA_BLOCK_SIZE; tmp = tmp->next; assert(tmp || olen <= 0); } return 0; } static int mlx5_vfio_post_cmd(struct mlx5_vfio_context *ctx, void *in, int ilen, void *out, int olen, unsigned int slot, bool async) { struct mlx5_init_seg *init_seg = ctx->bar_map; struct mlx5_cmd_layout *cmd_lay = ctx->cmd.cmds[slot].lay; struct mlx5_cmd_msg *cmd_in = &ctx->cmd.cmds[slot].in; struct mlx5_cmd_msg *cmd_out = &ctx->cmd.cmds[slot].out; int err; /* Lock was taken by caller */ if (async && ctx->cmd.cmds[slot].in_use) { struct cmd_async_data *pending = &ctx->cmd.cmds[slot].pending; if (ctx->cmd.cmds[slot].is_pending) { assert(false); return EINVAL; } /* We might get another page request event before the previous command has completed. * Save the new work and, once the command completion arrives, go and do the job.
*/ pending->buff_in = in; pending->buff_out = out; pending->ilen = ilen; pending->olen = olen; ctx->cmd.cmds[slot].is_pending = true; return 0; } err = mlx5_vfio_cmd_prep_in(ctx, cmd_in, cmd_lay, in, ilen); if (err) return err; err = mlx5_vfio_cmd_prep_out(ctx, cmd_out, cmd_lay, olen); if (err) return err; if (async) { ctx->cmd.cmds[slot].in_use = true; ctx->cmd.cmds[slot].curr.ilen = ilen; ctx->cmd.cmds[slot].curr.olen = olen; ctx->cmd.cmds[slot].curr.buff_in = in; ctx->cmd.cmds[slot].curr.buff_out = out; } cmd_lay->status_own = 0x1; udma_to_device_barrier(); mmio_write32_be(&init_seg->cmd_dbell, htobe32(0x1 << slot)); return 0; } static int mlx5_vfio_cmd_do(struct mlx5_vfio_context *ctx, void *in, int ilen, void *out, int olen, unsigned int slot) { struct mlx5_cmd_layout *cmd_lay = ctx->cmd.cmds[slot].lay; struct mlx5_cmd_msg *cmd_out = &ctx->cmd.cmds[slot].out; int err; pthread_mutex_lock(&ctx->cmd.cmds[slot].lock); err = mlx5_vfio_post_cmd(ctx, in, ilen, out, olen, slot, false); if (err) goto end; if (ctx->have_eq) { err = mlx5_vfio_wait_event(ctx, slot); if (err) goto end; } else { err = mlx5_vfio_poll_timeout(cmd_lay); if (err) goto end; udma_from_device_barrier(); } err = mlx5_copy_from_msg(out, cmd_out, olen, cmd_lay); if (err) goto end; if (DEVX_GET(mbox_out, out, status) != MLX5_CMD_STAT_OK) err = EREMOTEIO; end: pthread_mutex_unlock(&ctx->cmd.cmds[slot].lock); return err; } static int mlx5_vfio_cmd_exec(struct mlx5_vfio_context *ctx, void *in, int ilen, void *out, int olen, unsigned int slot) { int err; err = mlx5_vfio_cmd_do(ctx, in, ilen, out, olen, slot); if (err != EREMOTEIO) return err; return mlx5_vfio_cmd_check(ctx, in, out); } static int mlx5_vfio_enable_pci_cmd(struct mlx5_vfio_context *ctx) { struct vfio_region_info pci_config_reg = {}; uint16_t pci_com_buf = 0x6; char buffer[4096]; pci_config_reg.argsz = sizeof(pci_config_reg); pci_config_reg.index = VFIO_PCI_CONFIG_REGION_INDEX; if (ioctl(ctx->device_fd, VFIO_DEVICE_GET_REGION_INFO, &pci_config_reg)) return -1; if (pwrite(ctx->device_fd, &pci_com_buf, 2, pci_config_reg.offset + 0x4) != 2) return -1; if (pread(ctx->device_fd, buffer, pci_config_reg.size, pci_config_reg.offset) != pci_config_reg.size) return -1; return 0; } static void free_cmd_box(struct mlx5_vfio_context *ctx, struct mlx5_cmd_mailbox *mailbox) { mlx5_vfio_unregister_mem(ctx, mailbox->iova, MLX5_ADAPTER_PAGE_SIZE); iset_insert_range(ctx->iova_alloc, mailbox->iova, MLX5_ADAPTER_PAGE_SIZE); free(mailbox->buf); free(mailbox); } static struct mlx5_cmd_mailbox *alloc_cmd_box(struct mlx5_vfio_context *ctx) { struct mlx5_cmd_mailbox *mailbox; int ret; mailbox = calloc(1, sizeof(*mailbox)); if (!mailbox) { errno = ENOMEM; return NULL; } ret = posix_memalign(&mailbox->buf, MLX5_ADAPTER_PAGE_SIZE, MLX5_ADAPTER_PAGE_SIZE); if (ret) { errno = ret; goto err_free; } memset(mailbox->buf, 0, MLX5_ADAPTER_PAGE_SIZE); ret = iset_alloc_range(ctx->iova_alloc, MLX5_ADAPTER_PAGE_SIZE, &mailbox->iova, MLX5_ADAPTER_PAGE_SIZE); if (ret) goto err_tree; ret = mlx5_vfio_register_mem(ctx, mailbox->buf, mailbox->iova, MLX5_ADAPTER_PAGE_SIZE); if (ret) goto err_reg; return mailbox; err_reg: iset_insert_range(ctx->iova_alloc, mailbox->iova, MLX5_ADAPTER_PAGE_SIZE); err_tree: free(mailbox->buf); err_free: free(mailbox); return NULL; } static int mlx5_calc_cmd_blocks(uint32_t msg_len) { int size = msg_len; int blen = size - min_t(int, 16, size); return DIV_ROUND_UP(blen, MLX5_CMD_DATA_BLOCK_SIZE); } static void mlx5_vfio_free_cmd_msg(struct mlx5_vfio_context *ctx, struct mlx5_cmd_msg 
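/*
 * A command message keeps its first 16 bytes inline in the command layout;
 * the remainder travels in a chain of fixed-size data blocks, which is what
 * mlx5_calc_cmd_blocks() above sizes.  Worked example, assuming the usual
 * MLX5_CMD_DATA_BLOCK_SIZE of 512 bytes (as in the kernel driver):
 *
 *	// msg_len = 4096
 *	// blen    = 4096 - 16 = 4080
 *	// blocks  = DIV_ROUND_UP(4080, 512) = 8 chained mailboxes
 */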
*msg) { struct mlx5_cmd_mailbox *head = msg->next; struct mlx5_cmd_mailbox *next; while (head) { next = head->next; free_cmd_box(ctx, head); head = next; } msg->len = 0; } static int mlx5_vfio_alloc_cmd_msg(struct mlx5_vfio_context *ctx, uint32_t size, struct mlx5_cmd_msg *msg) { struct mlx5_cmd_mailbox *tmp, *head = NULL; struct mlx5_cmd_block *block; int i, num_blocks; msg->len = size; num_blocks = mlx5_calc_cmd_blocks(size); for (i = 0; i < num_blocks; i++) { tmp = alloc_cmd_box(ctx); if (!tmp) goto err_alloc; block = tmp->buf; tmp->next = head; block->next = htobe64(tmp->next ? tmp->next->iova : 0); block->block_num = htobe32(num_blocks - i - 1); head = tmp; } msg->next = head; return 0; err_alloc: while (head) { tmp = head->next; free_cmd_box(ctx, head); head = tmp; } msg->len = 0; return -1; } static void mlx5_vfio_free_cmd_slot(struct mlx5_vfio_context *ctx, int slot) { struct mlx5_vfio_cmd_slot *cmd_slot = &ctx->cmd.cmds[slot]; mlx5_vfio_free_cmd_msg(ctx, &cmd_slot->in); mlx5_vfio_free_cmd_msg(ctx, &cmd_slot->out); close(cmd_slot->completion_event_fd); } static int mlx5_vfio_setup_cmd_slot(struct mlx5_vfio_context *ctx, int slot) { struct mlx5_vfio_cmd *cmd = &ctx->cmd; struct mlx5_vfio_cmd_slot *cmd_slot = &cmd->cmds[slot]; struct mlx5_cmd_layout *cmd_lay; int ret; ret = mlx5_vfio_alloc_cmd_msg(ctx, 4096, &cmd_slot->in); if (ret) return ret; ret = mlx5_vfio_alloc_cmd_msg(ctx, 4096, &cmd_slot->out); if (ret) goto err; cmd_lay = cmd->vaddr + (slot * (1 << cmd->log_stride)); cmd_lay->type = MLX5_PCI_CMD_XPORT; cmd_lay->iptr = htobe64(cmd_slot->in.next->iova); cmd_lay->optr = htobe64(cmd_slot->out.next->iova); cmd_slot->lay = cmd_lay; cmd_slot->completion_event_fd = eventfd(0, EFD_CLOEXEC); if (cmd_slot->completion_event_fd < 0) { ret = -1; goto err_fd; } if (slot != MLX5_MAX_COMMANDS - 1) cmd_slot->comp_func = mlx5_vfio_cmd_comp; else cmd_slot->comp_func = mlx5_vfio_process_page_request_comp; pthread_mutex_init(&cmd_slot->lock, NULL); return 0; err_fd: mlx5_vfio_free_cmd_msg(ctx, &cmd_slot->out); err: mlx5_vfio_free_cmd_msg(ctx, &cmd_slot->in); return ret; } static int mlx5_vfio_init_cmd_interface(struct mlx5_vfio_context *ctx) { struct mlx5_init_seg *init_seg = ctx->bar_map; struct mlx5_vfio_cmd *cmd = &ctx->cmd; uint16_t cmdif_rev; uint32_t cmd_h, cmd_l; int ret; cmdif_rev = be32toh(init_seg->cmdif_rev_fw_sub) >> 16; if (cmdif_rev != 5) { errno = EINVAL; return -1; } cmd_l = be32toh(init_seg->cmdq_addr_l_sz) & 0xff; ctx->cmd.log_sz = cmd_l >> 4 & 0xf; ctx->cmd.log_stride = cmd_l & 0xf; if (1 << ctx->cmd.log_sz > MLX5_MAX_COMMANDS) { errno = EINVAL; return -1; } if (ctx->cmd.log_sz + ctx->cmd.log_stride > MLX5_ADAPTER_PAGE_SHIFT) { errno = EINVAL; return -1; } /* The initial address must be 4K aligned */ ret = posix_memalign(&cmd->vaddr, MLX5_ADAPTER_PAGE_SIZE, MLX5_ADAPTER_PAGE_SIZE); if (ret) { errno = ret; return -1; } memset(cmd->vaddr, 0, MLX5_ADAPTER_PAGE_SIZE); ret = iset_alloc_range(ctx->iova_alloc, MLX5_ADAPTER_PAGE_SIZE, &cmd->iova, MLX5_ADAPTER_PAGE_SIZE); if (ret) goto err_free; ret = mlx5_vfio_register_mem(ctx, cmd->vaddr, cmd->iova, MLX5_ADAPTER_PAGE_SIZE); if (ret) goto err_reg; cmd_h = (uint32_t)((uint64_t)(cmd->iova) >> 32); cmd_l = (uint32_t)(uint64_t)(cmd->iova); init_seg->cmdq_addr_h = htobe32(cmd_h); init_seg->cmdq_addr_l_sz = htobe32(cmd_l); /* Make sure firmware sees the complete address before we proceed */ udma_to_device_barrier(); ret = mlx5_vfio_setup_cmd_slot(ctx, 0); if (ret) goto err_slot_0; ret = mlx5_vfio_setup_cmd_slot(ctx, MLX5_MAX_COMMANDS - 1); if 
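/*
 * The queue geometry decoded in mlx5_vfio_init_cmd_interface() above packs
 * two log2 values into the low byte of cmdq_addr_l_sz.  An illustrative
 * decode (the value 0x56 is made up):
 *
 *	// cmd_l      = 0x56
 *	// log_sz     = (0x56 >> 4) & 0xf = 5  -> 32 command slots
 *	// log_stride =  0x56       & 0xf = 6  -> 64-byte slot stride
 *	// slot i therefore lives at cmd->vaddr + (i << 6)
 */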
(ret) goto err_slot_1; ret = mlx5_vfio_enable_pci_cmd(ctx); if (!ret) return 0; mlx5_vfio_free_cmd_slot(ctx, MLX5_MAX_COMMANDS - 1); err_slot_1: mlx5_vfio_free_cmd_slot(ctx, 0); err_slot_0: mlx5_vfio_unregister_mem(ctx, cmd->iova, MLX5_ADAPTER_PAGE_SIZE); err_reg: iset_insert_range(ctx->iova_alloc, cmd->iova, MLX5_ADAPTER_PAGE_SIZE); err_free: free(cmd->vaddr); return ret; } static void mlx5_vfio_clean_cmd_interface(struct mlx5_vfio_context *ctx) { struct mlx5_vfio_cmd *cmd = &ctx->cmd; mlx5_vfio_free_cmd_slot(ctx, 0); mlx5_vfio_free_cmd_slot(ctx, MLX5_MAX_COMMANDS - 1); mlx5_vfio_unregister_mem(ctx, cmd->iova, MLX5_ADAPTER_PAGE_SIZE); iset_insert_range(ctx->iova_alloc, cmd->iova, MLX5_ADAPTER_PAGE_SIZE); free(cmd->vaddr); } static void set_iova_min_page_size(struct mlx5_vfio_context *ctx, uint64_t iova_pgsizes) { int i; for (i = MLX5_ADAPTER_PAGE_SHIFT; i < 64; i++) { if (iova_pgsizes & (1ULL << i)) { ctx->iova_min_page_size = 1ULL << i; return; } } assert(false); } /* If the kernel does not report usable IOVA regions, fall back to the legacy regions */ #define MLX5_VFIO_IOVA_MIN1 0x10000ULL #define MLX5_VFIO_IOVA_MAX1 0xFEDFFFFFULL #define MLX5_VFIO_IOVA_MIN2 0xFEF00000ULL #define MLX5_VFIO_IOVA_MAX2 ((1ULL << 39) - 1) static int mlx5_vfio_get_iommu_info(struct mlx5_vfio_context *ctx) { struct vfio_iommu_type1_info *info; int ret, i; void *ptr; uint32_t offset; info = calloc(1, sizeof(*info)); if (!info) { errno = ENOMEM; return -1; } info->argsz = sizeof(*info); ret = ioctl(ctx->container_fd, VFIO_IOMMU_GET_INFO, info); if (ret) goto end; if (info->argsz > sizeof(*info)) { struct vfio_iommu_type1_info *tmp; tmp = realloc(info, info->argsz); if (!tmp) { errno = ENOMEM; ret = -1; goto end; } info = tmp; ret = ioctl(ctx->container_fd, VFIO_IOMMU_GET_INFO, info); if (ret) goto end; } set_iova_min_page_size(ctx, (info->flags & VFIO_IOMMU_INFO_PGSIZES) ?
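/*
 * iova_pgsizes is a bitmap of IOMMU-supported page sizes;
 * set_iova_min_page_size() above picks the smallest supported size that is
 * at least the adapter page size (bit 12, 4 KiB).  For example (value
 * illustrative):
 *
 *	// iova_pgsizes = 0x201000 -> bits 12 and 21 set (4 KiB and 2 MiB)
 *	// lowest set bit >= 12 is bit 12 -> iova_min_page_size = 4096
 */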
info->iova_pgsizes : 4096); if (!(info->flags & VFIO_IOMMU_INFO_CAPS)) goto set_legacy; offset = info->cap_offset; while (offset) { struct vfio_iommu_type1_info_cap_iova_range *iova_range; struct vfio_info_cap_header *header; ptr = (void *)info + offset; header = ptr; if (header->id != VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE) { offset = header->next; continue; } iova_range = (struct vfio_iommu_type1_info_cap_iova_range *)header; for (i = 0; i < iova_range->nr_iovas; i++) { ret = iset_insert_range(ctx->iova_alloc, iova_range->iova_ranges[i].start, iova_range->iova_ranges[i].end - iova_range->iova_ranges[i].start + 1); if (ret) goto end; } goto end; } set_legacy: ret = iset_insert_range(ctx->iova_alloc, MLX5_VFIO_IOVA_MIN1, MLX5_VFIO_IOVA_MAX1 - MLX5_VFIO_IOVA_MIN1 + 1); if (!ret) ret = iset_insert_range(ctx->iova_alloc, MLX5_VFIO_IOVA_MIN2, MLX5_VFIO_IOVA_MAX2 - MLX5_VFIO_IOVA_MIN2 + 1); end: free(info); return ret; } static void mlx5_vfio_clean_device_dma(struct mlx5_vfio_context *ctx) { struct page_block *page_block, *tmp; list_for_each_safe(&ctx->mem_alloc.block_list, page_block, tmp, next_block) mlx5_vfio_free_block(ctx, page_block); iset_destroy(ctx->iova_alloc); } static int mlx5_vfio_init_device_dma(struct mlx5_vfio_context *ctx) { ctx->iova_alloc = iset_create(); if (!ctx->iova_alloc) return -1; list_head_init(&ctx->mem_alloc.block_list); pthread_mutex_init(&ctx->mem_alloc.block_list_mutex, NULL); if (mlx5_vfio_get_iommu_info(ctx)) goto err; /* create an initial block of DMA memory ready to be used */ if (!mlx5_vfio_new_block(ctx)) goto err; return 0; err: iset_destroy(ctx->iova_alloc); return -1; } static void mlx5_vfio_uninit_bar0(struct mlx5_vfio_context *ctx) { munmap(ctx->bar_map, ctx->bar_map_size); } static int mlx5_vfio_init_bar0(struct mlx5_vfio_context *ctx) { struct vfio_region_info reg = { .argsz = sizeof(reg) }; void *base; int err; reg.index = 0; err = ioctl(ctx->device_fd, VFIO_DEVICE_GET_REGION_INFO, ®); if (err) return err; base = mmap(NULL, reg.size, PROT_READ | PROT_WRITE, MAP_SHARED, ctx->device_fd, reg.offset); if (base == MAP_FAILED) return -1; ctx->bar_map = (struct mlx5_init_seg *)base; ctx->bar_map_size = reg.size; return 0; } static int mlx5_vfio_msix_set_irqs(struct mlx5_vfio_context *ctx, int start, int count, void *irq_set_buf) { struct vfio_irq_set *irq_set = (struct vfio_irq_set *)irq_set_buf; irq_set->argsz = sizeof(*irq_set) + sizeof(int) * count; irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER; irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX; irq_set->start = start; irq_set->count = count; return ioctl(ctx->device_fd, VFIO_DEVICE_SET_IRQS, irq_set); } static int mlx5_vfio_init_async_fd(struct mlx5_vfio_context *ctx) { struct vfio_irq_info irq = { .argsz = sizeof(irq) }; struct vfio_irq_set *irq_set_buf; int fdlen, i; irq.index = VFIO_PCI_MSIX_IRQ_INDEX; if (ioctl(ctx->device_fd, VFIO_DEVICE_GET_IRQ_INFO, &irq)) return -1; /* fail if this vector cannot be used with eventfd */ if ((irq.flags & VFIO_IRQ_INFO_EVENTFD) == 0) return -1; fdlen = sizeof(int) * irq.count; ctx->msix_fds = calloc(1, fdlen); if (!ctx->msix_fds) { errno = ENOMEM; return -1; } for (i = 0; i < irq.count; i++) ctx->msix_fds[i] = -1; /* set up an eventfd for command completion interrupts */ ctx->cmd_comp_fd = eventfd(0, EFD_CLOEXEC | O_NONBLOCK); if (ctx->cmd_comp_fd < 0) goto err_eventfd; ctx->msix_fds[MLX5_VFIO_CMD_VEC_IDX] = ctx->cmd_comp_fd; irq_set_buf = calloc(1, sizeof(*irq_set_buf) + fdlen); if (!irq_set_buf) { errno = ENOMEM; goto err_irq_set_buf; } /* Enable 
MSI-X interrupts; the first time this is called, the count * must be the maximum that we will need */ memcpy(irq_set_buf->data, ctx->msix_fds, fdlen); if (mlx5_vfio_msix_set_irqs(ctx, 0, irq.count, irq_set_buf)) goto err_msix; free(irq_set_buf); pthread_mutex_init(&ctx->msix_fds_lock, NULL); ctx->vctx.context.num_comp_vectors = irq.count; return 0; err_msix: free(irq_set_buf); err_irq_set_buf: close(ctx->cmd_comp_fd); err_eventfd: free(ctx->msix_fds); return -1; } static void mlx5_vfio_close_fds(struct mlx5_vfio_context *ctx) { int vec; close(ctx->device_fd); close(ctx->container_fd); close(ctx->group_fd); pthread_mutex_lock(&ctx->msix_fds_lock); for (vec = 0; vec < ctx->vctx.context.num_comp_vectors; vec++) if (ctx->msix_fds[vec] >= 0) close(ctx->msix_fds[vec]); free(ctx->msix_fds); pthread_mutex_unlock(&ctx->msix_fds_lock); } static int mlx5_vfio_open_fds(struct mlx5_vfio_context *ctx, struct mlx5_vfio_device *mdev) { struct vfio_group_status group_status = { .argsz = sizeof(group_status) }; /* Create a new container */ ctx->container_fd = open("/dev/vfio/vfio", O_RDWR); if (ctx->container_fd < 0) return -1; if (ioctl(ctx->container_fd, VFIO_GET_API_VERSION) != VFIO_API_VERSION) goto close_cont; if (!ioctl(ctx->container_fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1v2_IOMMU)) /* Doesn't support the IOMMU driver we want. */ goto close_cont; /* Open the group */ ctx->group_fd = open(mdev->vfio_path, O_RDWR); if (ctx->group_fd < 0) goto close_cont; /* Test that the group is viable and available */ if (ioctl(ctx->group_fd, VFIO_GROUP_GET_STATUS, &group_status)) goto close_group; if (!(group_status.flags & VFIO_GROUP_FLAGS_VIABLE)) { /* Group is not viable (i.e., not all devices are bound to vfio) */ errno = EINVAL; goto close_group; } /* Add the group to the container */ if (ioctl(ctx->group_fd, VFIO_GROUP_SET_CONTAINER, &ctx->container_fd)) goto close_group; /* Enable the IOMMU model we want */ if (ioctl(ctx->container_fd, VFIO_SET_IOMMU, VFIO_TYPE1v2_IOMMU)) goto close_group; /* Get a file descriptor for the device */ ctx->device_fd = ioctl(ctx->group_fd, VFIO_GROUP_GET_DEVICE_FD, mdev->pci_name); if (ctx->device_fd < 0) goto close_group; if (mlx5_vfio_init_async_fd(ctx)) goto close_group; return 0; close_group: close(ctx->group_fd); close_cont: close(ctx->container_fd); return -1; } enum { MLX5_EQE_OWNER_INIT_VAL = 0x1, }; static void init_eq_buf(struct mlx5_eq *eq) { struct mlx5_eqe *eqe; int i; for (i = 0; i < eq->nent; i++) { eqe = get_eqe(eq, i); eqe->owner = MLX5_EQE_OWNER_INIT_VAL; } } static uint64_t uar2iova(struct mlx5_vfio_context *ctx, uint32_t index) { return (uint64_t)(uintptr_t)((void *)ctx->bar_map + (index * MLX5_ADAPTER_PAGE_SIZE)); } static int mlx5_vfio_alloc_uar(struct mlx5_vfio_context *ctx, uint32_t *uarn) { uint32_t out[DEVX_ST_SZ_DW(alloc_uar_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(alloc_uar_in)] = {}; int err; DEVX_SET(alloc_uar_in, in, opcode, MLX5_CMD_OP_ALLOC_UAR); err = mlx5_vfio_cmd_exec(ctx, in, sizeof(in), out, sizeof(out), 0); if (!err) *uarn = DEVX_GET(alloc_uar_out, out, uar); return err; } static void mlx5_vfio_dealloc_uar(struct mlx5_vfio_context *ctx, uint32_t uarn) { uint32_t out[DEVX_ST_SZ_DW(dealloc_uar_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(dealloc_uar_in)] = {}; DEVX_SET(dealloc_uar_in, in, opcode, MLX5_CMD_OP_DEALLOC_UAR); DEVX_SET(dealloc_uar_in, in, uar, uarn); mlx5_vfio_cmd_exec(ctx, in, sizeof(in), out, sizeof(out), 0); } static void mlx5_vfio_destroy_eq(struct mlx5_vfio_context *ctx, struct mlx5_eq *eq) { uint32_t in[DEVX_ST_SZ_DW(destroy_eq_in)] = {}; uint32_t
out[DEVX_ST_SZ_DW(destroy_eq_out)] = {}; DEVX_SET(destroy_eq_in, in, opcode, MLX5_CMD_OP_DESTROY_EQ); DEVX_SET(destroy_eq_in, in, eq_number, eq->eqn); mlx5_vfio_cmd_exec(ctx, in, sizeof(in), out, sizeof(out), 0); mlx5_vfio_unregister_mem(ctx, eq->iova, eq->iova_size); iset_insert_range(ctx->iova_alloc, eq->iova, eq->iova_size); free(eq->vaddr); } static void destroy_async_eqs(struct mlx5_vfio_context *ctx) { ctx->have_eq = false; mlx5_vfio_destroy_eq(ctx, &ctx->async_eq); mlx5_vfio_dealloc_uar(ctx, ctx->eqs_uar.uarn); } static int create_map_eq(struct mlx5_vfio_context *ctx, struct mlx5_eq *eq, struct mlx5_eq_param *param) { uint32_t out[DEVX_ST_SZ_DW(create_eq_out)] = {}; uint8_t vecidx = param->irq_index; __be64 *pas; void *eqc; int inlen; uint32_t *in; int err; int i; int alloc_size; pthread_mutex_init(&ctx->eq_lock, NULL); eq->nent = roundup_pow_of_two(param->nent + MLX5_NUM_SPARE_EQE); eq->cons_index = 0; alloc_size = eq->nent * MLX5_EQE_SIZE; eq->iova_size = max(roundup_pow_of_two(alloc_size), ctx->iova_min_page_size); inlen = DEVX_ST_SZ_BYTES(create_eq_in) + DEVX_FLD_SZ_BYTES(create_eq_in, pas[0]) * 1; in = calloc(1, inlen); if (!in) return ENOMEM; pas = (__be64 *)DEVX_ADDR_OF(create_eq_in, in, pas); err = posix_memalign(&eq->vaddr, eq->iova_size, alloc_size); if (err) { errno = err; goto end; } err = iset_alloc_range(ctx->iova_alloc, eq->iova_size, &eq->iova, eq->iova_size); if (err) goto err_range; err = mlx5_vfio_register_mem(ctx, eq->vaddr, eq->iova, eq->iova_size); if (err) goto err_reg; pas[0] = htobe64(eq->iova); init_eq_buf(eq); DEVX_SET(create_eq_in, in, opcode, MLX5_CMD_OP_CREATE_EQ); for (i = 0; i < 4; i++) DEVX_ARRAY_SET64(create_eq_in, in, event_bitmask, i, param->mask[i]); eqc = DEVX_ADDR_OF(create_eq_in, in, eq_context_entry); DEVX_SET(eqc, eqc, log_eq_size, ilog32(eq->nent - 1)); DEVX_SET(eqc, eqc, uar_page, ctx->eqs_uar.uarn); DEVX_SET(eqc, eqc, intr, vecidx); DEVX_SET(eqc, eqc, log_page_size, ilog32(eq->iova_size - 1) - MLX5_ADAPTER_PAGE_SHIFT); err = mlx5_vfio_cmd_exec(ctx, in, inlen, out, sizeof(out), 0); if (err) goto err_cmd; eq->vecidx = vecidx; eq->eqn = DEVX_GET(create_eq_out, out, eq_number); eq->doorbell = (void *)(uintptr_t)ctx->eqs_uar.iova + MLX5_EQ_DOORBEL_OFFSET; free(in); return 0; err_cmd: mlx5_vfio_unregister_mem(ctx, eq->iova, eq->iova_size); err_reg: iset_insert_range(ctx->iova_alloc, eq->iova, eq->iova_size); err_range: free(eq->vaddr); end: free(in); return err; } static int setup_async_eq(struct mlx5_vfio_context *ctx, struct mlx5_eq_param *param, struct mlx5_eq *eq) { int err; err = create_map_eq(ctx, eq, param); if (err) return err; eq_update_ci(eq, 0, 1); return 0; } static int create_async_eqs(struct mlx5_vfio_context *ctx) { struct mlx5_eq_param param = {}; int err; err = mlx5_vfio_alloc_uar(ctx, &ctx->eqs_uar.uarn); if (err) return err; ctx->eqs_uar.iova = uar2iova(ctx, ctx->eqs_uar.uarn); param = (struct mlx5_eq_param) { .irq_index = MLX5_VFIO_CMD_VEC_IDX, .nent = MLX5_NUM_CMD_EQE, .mask[0] = 1ull << MLX5_EVENT_TYPE_CMD | 1ull << MLX5_EVENT_TYPE_PAGE_REQUEST, }; err = setup_async_eq(ctx, ¶m, &ctx->async_eq); if (err) goto err; ctx->have_eq = true; return 0; err: mlx5_vfio_dealloc_uar(ctx, ctx->eqs_uar.uarn); return err; } static int mlx5_vfio_reclaim_pages(struct mlx5_vfio_context *ctx, uint32_t func_id, int npages) { uint32_t inlen = DEVX_ST_SZ_BYTES(manage_pages_in); int outlen; uint32_t *out; void *in; int err; int slot = MLX5_MAX_COMMANDS - 1; outlen = DEVX_ST_SZ_BYTES(manage_pages_out); outlen += npages * 
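/*
 * EQ sizing in create_map_eq() above, worked through with the async EQ
 * parameters used by create_async_eqs() (assuming the kernel driver's
 * values MLX5_NUM_CMD_EQE == 32 and MLX5_NUM_SPARE_EQE == 0x80, and a
 * 64-byte EQE):
 *
 *	// nent        = roundup_pow_of_two(32 + 128) = 256
 *	// alloc_size  = 256 * 64 = 16 KiB
 *	// log_eq_size = ilog32(256 - 1) = 8, i.e. 2^8 entries
 */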
DEVX_FLD_SZ_BYTES(manage_pages_out, pas[0]); out = calloc(1, outlen); if (!out) { errno = ENOMEM; return errno; } in = calloc(1, inlen); if (!in) { err = ENOMEM; errno = err; goto out_free; } DEVX_SET(manage_pages_in, in, opcode, MLX5_CMD_OP_MANAGE_PAGES); DEVX_SET(manage_pages_in, in, op_mod, MLX5_PAGES_TAKE); DEVX_SET(manage_pages_in, in, function_id, func_id); DEVX_SET(manage_pages_in, in, input_num_entries, npages); pthread_mutex_lock(&ctx->cmd.cmds[slot].lock); err = mlx5_vfio_post_cmd(ctx, in, inlen, out, outlen, slot, true); pthread_mutex_unlock(&ctx->cmd.cmds[slot].lock); if (!err) return 0; free(in); out_free: free(out); return err; } static int mlx5_vfio_enable_hca(struct mlx5_vfio_context *ctx) { uint32_t in[DEVX_ST_SZ_DW(enable_hca_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(enable_hca_out)] = {}; DEVX_SET(enable_hca_in, in, opcode, MLX5_CMD_OP_ENABLE_HCA); return mlx5_vfio_cmd_exec(ctx, in, sizeof(in), out, sizeof(out), 0); } static int mlx5_vfio_set_issi(struct mlx5_vfio_context *ctx) { uint32_t query_in[DEVX_ST_SZ_DW(query_issi_in)] = {}; uint32_t query_out[DEVX_ST_SZ_DW(query_issi_out)] = {}; uint32_t set_in[DEVX_ST_SZ_DW(set_issi_in)] = {}; uint32_t set_out[DEVX_ST_SZ_DW(set_issi_out)] = {}; uint32_t sup_issi; int err; DEVX_SET(query_issi_in, query_in, opcode, MLX5_CMD_OP_QUERY_ISSI); err = mlx5_vfio_cmd_exec(ctx, query_in, sizeof(query_in), query_out, sizeof(query_out), 0); if (err) return err; sup_issi = DEVX_GET(query_issi_out, query_out, supported_issi_dw0); if (!(sup_issi & (1 << 1))) { errno = EOPNOTSUPP; return errno; } DEVX_SET(set_issi_in, set_in, opcode, MLX5_CMD_OP_SET_ISSI); DEVX_SET(set_issi_in, set_in, current_issi, 1); return mlx5_vfio_cmd_exec(ctx, set_in, sizeof(set_in), set_out, sizeof(set_out), 0); } static int mlx5_vfio_give_pages(struct mlx5_vfio_context *ctx, uint16_t func_id, int32_t npages, bool is_event) { int32_t out[DEVX_ST_SZ_DW(manage_pages_out)] = {}; int inlen = DEVX_ST_SZ_BYTES(manage_pages_in); int slot = MLX5_MAX_COMMANDS - 1; void *outp = out; int i, err; int32_t *in; uint64_t iova; inlen += npages * DEVX_FLD_SZ_BYTES(manage_pages_in, pas[0]); in = calloc(1, inlen); if (!in) { errno = ENOMEM; return errno; } if (is_event) { outp = calloc(1, sizeof(out)); if (!outp) { errno = ENOMEM; err = errno; goto end; } } for (i = 0; i < npages; i++) { err = mlx5_vfio_alloc_page(ctx, &iova); if (err) goto err; DEVX_ARRAY_SET64(manage_pages_in, in, pas, i, iova); } DEVX_SET(manage_pages_in, in, opcode, MLX5_CMD_OP_MANAGE_PAGES); DEVX_SET(manage_pages_in, in, op_mod, MLX5_PAGES_GIVE); DEVX_SET(manage_pages_in, in, function_id, func_id); DEVX_SET(manage_pages_in, in, input_num_entries, npages); if (is_event) { pthread_mutex_lock(&ctx->cmd.cmds[slot].lock); err = mlx5_vfio_post_cmd(ctx, in, inlen, outp, sizeof(out), slot, true); pthread_mutex_unlock(&ctx->cmd.cmds[slot].lock); } else { err = mlx5_vfio_cmd_exec(ctx, in, inlen, outp, sizeof(out), slot); } if (!err) { if (is_event) return 0; goto end; } err: if (is_event) free(outp); for (i--; i >= 0; i--) mlx5_vfio_free_page(ctx, DEVX_GET64(manage_pages_in, in, pas[i])); end: free(in); return err; } static int mlx5_vfio_query_pages(struct mlx5_vfio_context *ctx, int boot, uint16_t *func_id, int32_t *npages) { uint32_t query_pages_in[DEVX_ST_SZ_DW(query_pages_in)] = {}; uint32_t query_pages_out[DEVX_ST_SZ_DW(query_pages_out)] = {}; int ret; DEVX_SET(query_pages_in, query_pages_in, opcode, MLX5_CMD_OP_QUERY_PAGES); DEVX_SET(query_pages_in, query_pages_in, op_mod, boot ? 
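/*
 * The op_mod in the surrounding DEVX_SET selects which page requirement
 * QUERY_PAGES returns: 0x01 is the boot-stage count and 0x02 the init-stage
 * count, hence the boot ? 0x01 : 0x02 selector.  A typical bring-up then
 * looks like (numbers illustrative):
 *
 *	// query_pages(boot=1) -> func_id 0, npages 12  -> give_pages(0, 12)
 *	// query_pages(boot=0) -> func_id 0, npages 512 -> give_pages(0, 512)
 */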
0x01 : 0x02); ret = mlx5_vfio_cmd_exec(ctx, query_pages_in, sizeof(query_pages_in), query_pages_out, sizeof(query_pages_out), 0); if (ret) return ret; *npages = DEVX_GET(query_pages_out, query_pages_out, num_pages); *func_id = DEVX_GET(query_pages_out, query_pages_out, function_id); return 0; } static int mlx5_vfio_satisfy_startup_pages(struct mlx5_vfio_context *ctx, int boot) { uint16_t function_id; int32_t npages = 0; int ret; ret = mlx5_vfio_query_pages(ctx, boot, &function_id, &npages); if (ret) return ret; return mlx5_vfio_give_pages(ctx, function_id, npages, false); } static int mlx5_vfio_access_reg(struct mlx5_vfio_context *ctx, void *data_in, int size_in, void *data_out, int size_out, uint16_t reg_id, int arg, int write) { int outlen = DEVX_ST_SZ_BYTES(access_register_out) + size_out; int inlen = DEVX_ST_SZ_BYTES(access_register_in) + size_in; int err = ENOMEM; uint32_t *out = NULL; uint32_t *in = NULL; void *data; in = calloc(1, inlen); out = calloc(1, outlen); if (!in || !out) { errno = ENOMEM; goto out; } data = DEVX_ADDR_OF(access_register_in, in, register_data); memcpy(data, data_in, size_in); DEVX_SET(access_register_in, in, opcode, MLX5_CMD_OP_ACCESS_REG); DEVX_SET(access_register_in, in, op_mod, !write); DEVX_SET(access_register_in, in, argument, arg); DEVX_SET(access_register_in, in, register_id, reg_id); err = mlx5_vfio_cmd_exec(ctx, in, inlen, out, outlen, 0); if (err) goto out; data = DEVX_ADDR_OF(access_register_out, out, register_data); memcpy(data_out, data, size_out); out: free(out); free(in); return err; } static int mlx5_vfio_get_caps_mode(struct mlx5_vfio_context *ctx, enum mlx5_cap_type cap_type, enum mlx5_cap_mode cap_mode) { uint8_t in[DEVX_ST_SZ_BYTES(query_hca_cap_in)] = {}; int out_sz = DEVX_ST_SZ_BYTES(query_hca_cap_out); void *out, *hca_caps; uint16_t opmod = (cap_type << 1) | (cap_mode & 0x01); int err; out = calloc(1, out_sz); if (!out) { errno = ENOMEM; return errno; } DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP); DEVX_SET(query_hca_cap_in, in, op_mod, opmod); err = mlx5_vfio_cmd_exec(ctx, in, sizeof(in), out, out_sz, 0); if (err) goto query_ex; hca_caps = DEVX_ADDR_OF(query_hca_cap_out, out, capability); switch (cap_mode) { case HCA_CAP_OPMOD_GET_MAX: memcpy(ctx->caps.hca_max[cap_type], hca_caps, DEVX_UN_SZ_BYTES(hca_cap_union)); break; case HCA_CAP_OPMOD_GET_CUR: memcpy(ctx->caps.hca_cur[cap_type], hca_caps, DEVX_UN_SZ_BYTES(hca_cap_union)); break; default: err = EINVAL; assert(false); break; } query_ex: free(out); return err; } enum mlx5_vport_roce_state { MLX5_VPORT_ROCE_DISABLED = 0, MLX5_VPORT_ROCE_ENABLED = 1, }; static int mlx5_vfio_nic_vport_update_roce_state(struct mlx5_vfio_context *ctx, enum mlx5_vport_roce_state state) { uint32_t out[DEVX_ST_SZ_DW(modify_nic_vport_context_out)] = {}; int inlen = DEVX_ST_SZ_BYTES(modify_nic_vport_context_in); void *in; int err; in = calloc(1, inlen); if (!in) { errno = ENOMEM; return errno; } DEVX_SET(modify_nic_vport_context_in, in, field_select.roce_en, 1); DEVX_SET(modify_nic_vport_context_in, in, nic_vport_context.roce_en, state); DEVX_SET(modify_nic_vport_context_in, in, opcode, MLX5_CMD_OP_MODIFY_NIC_VPORT_CONTEXT); err = mlx5_vfio_cmd_exec(ctx, in, inlen, out, sizeof(out), 0); free(in); return err; } static int mlx5_vfio_get_caps(struct mlx5_vfio_context *ctx, enum mlx5_cap_type cap_type) { int ret; ret = mlx5_vfio_get_caps_mode(ctx, cap_type, HCA_CAP_OPMOD_GET_CUR); if (ret) return ret; return mlx5_vfio_get_caps_mode(ctx, cap_type, HCA_CAP_OPMOD_GET_MAX); } static int 
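/*
 * mlx5_vfio_get_caps_mode() above encodes the capability page selector as
 * opmod = (cap_type << 1) | (cap_mode & 0x01).  For example, assuming the
 * PRM value MLX5_CAP_ROCE == 0x4:
 *
 *	// current ROCE caps: opmod = (0x4 << 1) | 1 = 0x9
 *	// maximum ROCE caps: opmod = (0x4 << 1) | 0 = 0x8
 */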
handle_hca_cap_roce(struct mlx5_vfio_context *ctx, void *set_ctx, int ctx_size) { int err; uint32_t out[DEVX_ST_SZ_DW(set_hca_cap_out)] = {}; void *set_hca_cap; if (!MLX5_VFIO_CAP_GEN(ctx, roce)) return 0; err = mlx5_vfio_get_caps(ctx, MLX5_CAP_ROCE); if (err) return err; if (MLX5_VFIO_CAP_ROCE(ctx, sw_r_roce_src_udp_port) || !MLX5_VFIO_CAP_ROCE_MAX(ctx, sw_r_roce_src_udp_port)) return 0; set_hca_cap = DEVX_ADDR_OF(set_hca_cap_in, set_ctx, capability); memcpy(set_hca_cap, ctx->caps.hca_cur[MLX5_CAP_ROCE], DEVX_ST_SZ_BYTES(roce_cap)); DEVX_SET(roce_cap, set_hca_cap, sw_r_roce_src_udp_port, 1); DEVX_SET(set_hca_cap_in, set_ctx, opcode, MLX5_CMD_OP_SET_HCA_CAP); DEVX_SET(set_hca_cap_in, set_ctx, op_mod, MLX5_SET_HCA_CAP_OP_MOD_ROCE); return mlx5_vfio_cmd_exec(ctx, set_ctx, ctx_size, out, sizeof(out), 0); } static int handle_hca_cap(struct mlx5_vfio_context *ctx, void *set_ctx, int set_sz) { struct mlx5_vfio_device *dev = to_mvfio_dev(ctx->vctx.context.device); int sys_page_shift = ilog32(dev->page_size - 1); uint32_t out[DEVX_ST_SZ_DW(set_hca_cap_out)] = {}; void *set_hca_cap; int err; err = mlx5_vfio_get_caps(ctx, MLX5_CAP_GENERAL); if (err) return err; set_hca_cap = DEVX_ADDR_OF(set_hca_cap_in, set_ctx, capability); memcpy(set_hca_cap, ctx->caps.hca_cur[MLX5_CAP_GENERAL], DEVX_ST_SZ_BYTES(cmd_hca_cap)); /* disable cmdif checksum */ DEVX_SET(cmd_hca_cap, set_hca_cap, cmdif_checksum, 0); if (dev->flags & MLX5DV_VFIO_CTX_FLAGS_INIT_LINK_DOWN) DEVX_SET(cmd_hca_cap, set_hca_cap, disable_link_up_by_init_hca, 1); DEVX_SET(cmd_hca_cap, set_hca_cap, log_uar_page_sz, sys_page_shift - 12); if (MLX5_VFIO_CAP_GEN_MAX(ctx, mkey_by_name)) DEVX_SET(cmd_hca_cap, set_hca_cap, mkey_by_name, 1); DEVX_SET(set_hca_cap_in, set_ctx, opcode, MLX5_CMD_OP_SET_HCA_CAP); DEVX_SET(set_hca_cap_in, set_ctx, op_mod, MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE); return mlx5_vfio_cmd_exec(ctx, set_ctx, set_sz, out, sizeof(out), 0); } static int set_hca_cap(struct mlx5_vfio_context *ctx) { int set_sz = DEVX_ST_SZ_BYTES(set_hca_cap_in); void *set_ctx; int err; set_ctx = calloc(1, set_sz); if (!set_ctx) { errno = ENOMEM; return errno; } err = handle_hca_cap(ctx, set_ctx, set_sz); if (err) goto out; memset(set_ctx, 0, set_sz); err = handle_hca_cap_roce(ctx, set_ctx, set_sz); out: free(set_ctx); return err; } static int mlx5_vfio_set_hca_ctrl(struct mlx5_vfio_context *ctx) { struct mlx5_reg_host_endianness he_in = {}; struct mlx5_reg_host_endianness he_out = {}; he_in.he = MLX5_SET_HOST_ENDIANNESS; return mlx5_vfio_access_reg(ctx, &he_in, sizeof(he_in), &he_out, sizeof(he_out), MLX5_REG_HOST_ENDIANNESS, 0, 1); } static int mlx5_vfio_init_hca(struct mlx5_vfio_context *ctx) { uint32_t in[DEVX_ST_SZ_DW(init_hca_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(init_hca_out)] = {}; DEVX_SET(init_hca_in, in, opcode, MLX5_CMD_OP_INIT_HCA); return mlx5_vfio_cmd_exec(ctx, in, sizeof(in), out, sizeof(out), 0); } static int fw_initializing(struct mlx5_init_seg *init_seg) { return be32toh(init_seg->initializing) >> 31; } static int wait_fw_init(struct mlx5_init_seg *init_seg, uint32_t max_wait_mili) { int num_loops = max_wait_mili / FW_INIT_WAIT_MS; int loop = 0; while (fw_initializing(init_seg)) { usleep(FW_INIT_WAIT_MS * 1000); loop++; if (loop == num_loops) { errno = EBUSY; return errno; } } return 0; } static int mlx5_vfio_teardown_hca_regular(struct mlx5_vfio_context *ctx) { uint32_t in[DEVX_ST_SZ_DW(teardown_hca_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(teardown_hca_out)] = {}; DEVX_SET(teardown_hca_in, in, opcode, MLX5_CMD_OP_TEARDOWN_HCA); 
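/*
 * Besides the TEARDOWN_HCA command itself, teardown also drives the NIC
 * interface state machine through bits 10:8 of cmdq_addr_l_sz (see
 * mlx5_vfio_get_nic_state()/mlx5_vfio_set_nic_state() below).  Reading the
 * state reduces to (register value illustrative):
 *
 *	// state = (be32toh(cmdq_addr_l_sz) >> 8) & 7
 *	// e.g. 0x00000756 -> (0x756 >> 8) & 7 = 7 -> MLX5_NIC_IFC_SW_RESET
 */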
DEVX_SET(teardown_hca_in, in, profile, MLX5_TEARDOWN_HCA_IN_PROFILE_GRACEFUL_CLOSE); return mlx5_vfio_cmd_exec(ctx, in, sizeof(in), out, sizeof(out), 0); } enum mlx5_cmd_addr_l_sz_offset { MLX5_NIC_IFC_OFFSET = 8, }; enum { MLX5_NIC_IFC_DISABLED = 1, MLX5_NIC_IFC_SW_RESET = 7, }; static uint8_t mlx5_vfio_get_nic_state(struct mlx5_vfio_context *ctx) { return (be32toh(mmio_read32_be(&ctx->bar_map->cmdq_addr_l_sz)) >> 8) & 7; } static void mlx5_vfio_set_nic_state(struct mlx5_vfio_context *ctx, uint8_t state) { uint32_t cur_cmdq_addr_l_sz; cur_cmdq_addr_l_sz = be32toh(mmio_read32_be(&ctx->bar_map->cmdq_addr_l_sz)); mmio_write32_be(&ctx->bar_map->cmdq_addr_l_sz, htobe32((cur_cmdq_addr_l_sz & 0xFFFFF000) | state << MLX5_NIC_IFC_OFFSET)); } #define MLX5_FAST_TEARDOWN_WAIT_MS 3000 #define MLX5_FAST_TEARDOWN_WAIT_ONCE_MS 1 static int mlx5_vfio_teardown_hca_fast(struct mlx5_vfio_context *ctx) { uint32_t out[DEVX_ST_SZ_DW(teardown_hca_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(teardown_hca_in)] = {}; int waited = 0, state, ret; DEVX_SET(teardown_hca_in, in, opcode, MLX5_CMD_OP_TEARDOWN_HCA); DEVX_SET(teardown_hca_in, in, profile, MLX5_TEARDOWN_HCA_IN_PROFILE_PREPARE_FAST_TEARDOWN); ret = mlx5_vfio_cmd_exec(ctx, in, sizeof(in), out, sizeof(out), 0); if (ret) return ret; state = DEVX_GET(teardown_hca_out, out, state); if (state == MLX5_TEARDOWN_HCA_OUT_FORCE_STATE_FAIL) { mlx5_err(ctx->dbg_fp, "teardown with fast mode failed\n"); return EIO; } mlx5_vfio_set_nic_state(ctx, MLX5_NIC_IFC_DISABLED); do { if (mlx5_vfio_get_nic_state(ctx) == MLX5_NIC_IFC_DISABLED) break; usleep(MLX5_FAST_TEARDOWN_WAIT_ONCE_MS * 1000); waited += MLX5_FAST_TEARDOWN_WAIT_ONCE_MS; } while (waited < MLX5_FAST_TEARDOWN_WAIT_MS); if (mlx5_vfio_get_nic_state(ctx) != MLX5_NIC_IFC_DISABLED) { mlx5_err(ctx->dbg_fp, "NIC IFC still %d after %ums.\n", mlx5_vfio_get_nic_state(ctx), waited); return EIO; } return 0; } static int mlx5_vfio_teardown_hca(struct mlx5_vfio_context *ctx) { int err; if (MLX5_VFIO_CAP_GEN(ctx, fast_teardown)) { err = mlx5_vfio_teardown_hca_fast(ctx); if (!err) return 0; } return mlx5_vfio_teardown_hca_regular(ctx); } static bool sensor_pci_not_working(struct mlx5_init_seg *init_seg) { /* Offline PCI reads return 0xffffffff */ return (be32toh(mmio_read32_be(&init_seg->health.fw_ver)) == 0xffffffff); } enum mlx5_fatal_assert_bit_offsets { MLX5_RFR_OFFSET = 31, }; static bool sensor_fw_synd_rfr(struct mlx5_init_seg *init_seg) { uint32_t rfr = be32toh(mmio_read32_be(&init_seg->health.rfr)) >> MLX5_RFR_OFFSET; uint8_t synd = mmio_read8(&init_seg->health.synd); return (rfr && synd); } enum { MLX5_SENSOR_NO_ERR = 0, MLX5_SENSOR_PCI_COMM_ERR = 1, MLX5_SENSOR_NIC_DISABLED = 3, MLX5_SENSOR_NIC_SW_RESET = 4, MLX5_SENSOR_FW_SYND_RFR = 5, }; static uint32_t mlx5_health_check_fatal_sensors(struct mlx5_vfio_context *ctx) { if (sensor_pci_not_working(ctx->bar_map)) return MLX5_SENSOR_PCI_COMM_ERR; if (mlx5_vfio_get_nic_state(ctx) == MLX5_NIC_IFC_DISABLED) return MLX5_SENSOR_NIC_DISABLED; if (mlx5_vfio_get_nic_state(ctx) == MLX5_NIC_IFC_SW_RESET) return MLX5_SENSOR_NIC_SW_RESET; if (sensor_fw_synd_rfr(ctx->bar_map)) return MLX5_SENSOR_FW_SYND_RFR; return MLX5_SENSOR_NO_ERR; } enum { MLX5_HEALTH_SYNDR_FW_ERR = 0x1, MLX5_HEALTH_SYNDR_IRISC_ERR = 0x7, MLX5_HEALTH_SYNDR_HW_UNRECOVERABLE_ERR = 0x8, MLX5_HEALTH_SYNDR_CRC_ERR = 0x9, MLX5_HEALTH_SYNDR_FETCH_PCI_ERR = 0xa, MLX5_HEALTH_SYNDR_HW_FTL_ERR = 0xb, MLX5_HEALTH_SYNDR_ASYNC_EQ_OVERRUN_ERR = 0xc, MLX5_HEALTH_SYNDR_EQ_ERR = 0xd, MLX5_HEALTH_SYNDR_EQ_INV = 0xe, MLX5_HEALTH_SYNDR_FFSER_ERR = 
0xf,
	MLX5_HEALTH_SYNDR_HIGH_TEMP = 0x10,
};

static const char *hsynd_str(uint8_t synd)
{
	switch (synd) {
	case MLX5_HEALTH_SYNDR_FW_ERR:
		return "firmware internal error";
	case MLX5_HEALTH_SYNDR_IRISC_ERR:
		return "irisc not responding";
	case MLX5_HEALTH_SYNDR_HW_UNRECOVERABLE_ERR:
		return "unrecoverable hardware error";
	case MLX5_HEALTH_SYNDR_CRC_ERR:
		return "firmware CRC error";
	case MLX5_HEALTH_SYNDR_FETCH_PCI_ERR:
		return "ICM fetch PCI error";
	case MLX5_HEALTH_SYNDR_HW_FTL_ERR:
		return "HW fatal error";
	case MLX5_HEALTH_SYNDR_ASYNC_EQ_OVERRUN_ERR:
		return "async EQ buffer overrun";
	case MLX5_HEALTH_SYNDR_EQ_ERR:
		return "EQ error";
	case MLX5_HEALTH_SYNDR_EQ_INV:
		return "Invalid EQ referenced";
	case MLX5_HEALTH_SYNDR_FFSER_ERR:
		return "FFSER error";
	case MLX5_HEALTH_SYNDR_HIGH_TEMP:
		return "High temperature";
	default:
		return "unrecognized error";
	}
}

static void print_health_info(struct mlx5_vfio_context *ctx)
{
	struct mlx5_init_seg *iseg = ctx->bar_map;
	struct health_buffer *h = &iseg->health;
	char fw_str[18] = {};
	int i;

	/* If the syndrome is 0, the device is OK and no need to print buffer */
	if (!mmio_read8(&h->synd))
		return;

	for (i = 0; i < ARRAY_SIZE(h->assert_var); i++)
		mlx5_err(ctx->dbg_fp, "assert_var[%d] 0x%08x\n", i,
			 be32toh(mmio_read32_be(h->assert_var + i)));

	mlx5_err(ctx->dbg_fp, "assert_exit_ptr 0x%08x\n",
		 be32toh(mmio_read32_be(&h->assert_exit_ptr)));
	mlx5_err(ctx->dbg_fp, "assert_callra 0x%08x\n",
		 be32toh(mmio_read32_be(&h->assert_callra)));
	sprintf(fw_str, "%d.%d.%d",
		be32toh(mmio_read32_be(&iseg->fw_rev)) & 0xffff,
		be32toh(mmio_read32_be(&iseg->fw_rev)) >> 16,
		be32toh(mmio_read32_be(&iseg->cmdif_rev_fw_sub)) & 0xffff);
	mlx5_err(ctx->dbg_fp, "fw_ver %s\n", fw_str);
	mlx5_err(ctx->dbg_fp, "hw_id 0x%08x\n",
		 be32toh(mmio_read32_be(&h->hw_id)));
	mlx5_err(ctx->dbg_fp, "irisc_index %d\n", mmio_read8(&h->irisc_index));
	mlx5_err(ctx->dbg_fp, "synd 0x%x: %s\n", mmio_read8(&h->synd),
		 hsynd_str(mmio_read8(&h->synd)));
	mlx5_err(ctx->dbg_fp, "ext_synd 0x%04x\n",
		 be16toh(mmio_read16_be(&h->ext_synd)));
	mlx5_err(ctx->dbg_fp, "raw fw_ver 0x%08x\n",
		 be32toh(mmio_read32_be(&iseg->fw_rev)));
}

static void mlx5_vfio_poll_health(struct mlx5_vfio_context *ctx)
{
	struct mlx5_vfio_health_state *hstate = &ctx->health_state;
	uint32_t fatal_error, count;
	struct timeval tv;
	uint64_t time;
	int ret;

	ret = gettimeofday(&tv, NULL);
	if (ret)
		return;

	time = (uint64_t)tv.tv_sec * 1000 + tv.tv_usec / 1000;
	if (time - hstate->prev_time < POLL_HEALTH_INTERVAL)
		return;

	fatal_error = mlx5_health_check_fatal_sensors(ctx);
	if (fatal_error) {
		mlx5_err(ctx->dbg_fp, "%s: Fatal error %u detected\n",
			 __func__, fatal_error);
		goto err;
	}

	count = be32toh(mmio_read32_be(&ctx->bar_map->health_counter)) & 0xffffff;
	if (count == hstate->prev_count)
		++hstate->miss_counter;
	else
		hstate->miss_counter = 0;

	hstate->prev_time = time;
	hstate->prev_count = count;
	if (hstate->miss_counter == MAX_MISSES) {
		mlx5_err(ctx->dbg_fp,
			 "device's health compromised - reached miss count\n");
		goto err;
	}

	return;
err:
	print_health_info(ctx);
	abort();
}

static int mlx5_vfio_setup_function(struct mlx5_vfio_context *ctx)
{
	int err;

	err = wait_fw_init(ctx->bar_map, FW_PRE_INIT_TIMEOUT_MILI);
	if (err)
		return err;

	err = mlx5_vfio_enable_hca(ctx);
	if (err)
		return err;

	err = mlx5_vfio_set_issi(ctx);
	if (err)
		return err;

	err = mlx5_vfio_satisfy_startup_pages(ctx, 1);
	if (err)
		return err;

	err = mlx5_vfio_set_hca_ctrl(ctx);
	if (err)
		return err;

	err = set_hca_cap(ctx);
	if (err)
		return err;

	if (!MLX5_VFIO_CAP_GEN(ctx, umem_uid_0)) {
		errno = EOPNOTSUPP;
		return errno;
	}
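	/*
	 * Boot-phase pages were supplied above (second argument 1, before the
	 * HCA caps were programmed); the call below answers the firmware's
	 * init-phase page request (second argument 0), after which INIT_HCA
	 * can run.
	 */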
	err = mlx5_vfio_satisfy_startup_pages(ctx, 0);
	if (err)
		return err;

	err = mlx5_vfio_init_hca(ctx);
	if (err)
		return err;

	if (MLX5_VFIO_CAP_GEN(ctx, port_type) == MLX5_CAP_PORT_TYPE_ETH)
		err = mlx5_vfio_nic_vport_update_roce_state(ctx,
							    MLX5_VPORT_ROCE_ENABLED);

	return err;
}

static struct ibv_pd *mlx5_vfio_alloc_pd(struct ibv_context *ibctx)
{
	struct mlx5_vfio_context *ctx = to_mvfio_ctx(ibctx);
	uint32_t in[DEVX_ST_SZ_DW(alloc_pd_in)] = {0};
	uint32_t out[DEVX_ST_SZ_DW(alloc_pd_out)] = {0};
	int err;
	struct mlx5_pd *pd;

	pd = calloc(1, sizeof(*pd));
	if (!pd)
		return NULL;

	DEVX_SET(alloc_pd_in, in, opcode, MLX5_CMD_OP_ALLOC_PD);
	err = mlx5_vfio_cmd_exec(ctx, in, sizeof(in), out, sizeof(out), 0);
	if (err)
		goto err;

	pd->pdn = DEVX_GET(alloc_pd_out, out, pd);
	return &pd->ibv_pd;

err:
	free(pd);
	return NULL;
}

static int mlx5_vfio_dealloc_pd(struct ibv_pd *pd)
{
	struct mlx5_vfio_context *ctx = to_mvfio_ctx(pd->context);
	uint32_t in[DEVX_ST_SZ_DW(dealloc_pd_in)] = {};
	uint32_t out[DEVX_ST_SZ_DW(dealloc_pd_out)] = {};
	struct mlx5_pd *mpd = to_mpd(pd);
	int ret;

	DEVX_SET(dealloc_pd_in, in, opcode, MLX5_CMD_OP_DEALLOC_PD);
	DEVX_SET(dealloc_pd_in, in, pd, mpd->pdn);

	ret = mlx5_vfio_cmd_exec(ctx, in, sizeof(in), out, sizeof(out), 0);
	if (ret)
		return ret;

	free(mpd);
	return 0;
}

static size_t calc_num_dma_blocks(uint64_t iova, size_t length,
				  unsigned long pgsz)
{
	return (size_t)((align(iova + length, pgsz) -
			 align_down(iova, pgsz)) / pgsz);
}

static int get_octo_len(uint64_t addr, uint64_t len, int page_shift)
{
	uint64_t page_size = 1ULL << page_shift;
	uint64_t offset;
	int npages;

	offset = addr & (page_size - 1);
	npages = align(len + offset, page_size) >> page_shift;
	return (npages + 1) / 2;
}

static inline uint32_t mlx5_mkey_to_idx(uint32_t mkey)
{
	return mkey >> 8;
}

static inline uint32_t mlx5_idx_to_mkey(uint32_t mkey_idx)
{
	return mkey_idx << 8;
}

static void set_mkc_access_pd_addr_fields(void *mkc, int acc,
					  uint64_t start_addr,
					  struct ibv_pd *pd)
{
	struct mlx5_pd *mpd = to_mpd(pd);

	DEVX_SET(mkc, mkc, a, !!(acc & IBV_ACCESS_REMOTE_ATOMIC));
	DEVX_SET(mkc, mkc, rw, !!(acc & IBV_ACCESS_REMOTE_WRITE));
	DEVX_SET(mkc, mkc, rr, !!(acc & IBV_ACCESS_REMOTE_READ));
	DEVX_SET(mkc, mkc, lw, !!(acc & IBV_ACCESS_LOCAL_WRITE));
	DEVX_SET(mkc, mkc, lr, 1);
	/* Application is responsible to set based on caps */
	DEVX_SET(mkc, mkc, relaxed_ordering_write,
		 !!(acc & IBV_ACCESS_RELAXED_ORDERING));
	DEVX_SET(mkc, mkc, relaxed_ordering_read,
		 !!(acc & IBV_ACCESS_RELAXED_ORDERING));
	DEVX_SET(mkc, mkc, pd, mpd->pdn);
	DEVX_SET(mkc, mkc, qpn, 0xffffff);
	DEVX_SET64(mkc, mkc, start_addr, start_addr);
}

static int mlx5_vfio_dereg_mr(struct verbs_mr *vmr)
{
	struct mlx5_vfio_context *ctx = to_mvfio_ctx(vmr->ibv_mr.context);
	struct mlx5_vfio_mr *mr = to_mvfio_mr(&vmr->ibv_mr);
	uint32_t in[DEVX_ST_SZ_DW(destroy_mkey_in)] = {};
	uint32_t out[DEVX_ST_SZ_DW(destroy_mkey_out)] = {};
	int ret;

	DEVX_SET(destroy_mkey_in, in, opcode, MLX5_CMD_OP_DESTROY_MKEY);
	DEVX_SET(destroy_mkey_in, in, mkey_index,
		 mlx5_mkey_to_idx(vmr->ibv_mr.lkey));
	ret = mlx5_vfio_cmd_exec(ctx, in, sizeof(in), out, sizeof(out), 0);
	if (ret)
		return ret;

	mlx5_vfio_unregister_mem(ctx, mr->iova + mr->iova_aligned_offset,
				 mr->iova_reg_size);
	iset_insert_range(ctx->iova_alloc, mr->iova, mr->iova_page_size);
	free(vmr);
	return 0;
}

static void mlx5_vfio_populate_pas(uint64_t dma_addr, int num_dma,
				   size_t page_size, __be64 *pas,
				   uint64_t access_flags)
{
	int i;

	for (i = 0; i < num_dma; i++) {
		*pas = htobe64(dma_addr | access_flags);
		pas++;
		dma_addr += page_size;
	}
}

static uint64_t
calc_spanning_page_size(uint64_t start, uint64_t length)
{
	/*
	 * Compute a page_size such that the whole range fits into a single
	 * aligned block:
	 * align_down(start, page_size) ==
	 * align_down(start + length - 1, page_size)
	 */
	uint64_t diffs = start ^ (start + length - 1);
	uint64_t page_size = roundup_pow_of_two(diffs + 1);

	/*
	 * Don't waste more than 1G of IOVA address space trying to
	 * minimize MTTs
	 */
	while (page_size - length > 1024 * 1024 * 1024) {
		if (page_size / 2 < length)
			break;
		page_size /= 2;
	}

	return page_size;
}

static struct ibv_mr *mlx5_vfio_reg_mr(struct ibv_pd *pd, void *addr,
				       size_t length, uint64_t hca_va,
				       int access)
{
	struct mlx5_vfio_device *dev = to_mvfio_dev(pd->context->device);
	struct mlx5_vfio_context *ctx = to_mvfio_ctx(pd->context);
	uint32_t out[DEVX_ST_SZ_DW(create_mkey_out)] = {};
	uint32_t mkey_index;
	uint32_t *in;
	int inlen, num_pas, ret;
	struct mlx5_vfio_mr *mr;
	struct verbs_mr *vmr;
	int page_shift, iova_min_page_shift;
	__be64 *pas;
	uint8_t key;
	void *mkc;
	void *aligned_va;

	if (!check_comp_mask(access, MLX5_VFIO_SUPP_MR_ACCESS_FLAGS)) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	if (((uint64_t)(uintptr_t)addr & (ctx->iova_min_page_size - 1)) !=
	    (hca_va & (ctx->iova_min_page_size - 1))) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	mr = calloc(1, sizeof(*mr));
	if (!mr) {
		errno = ENOMEM;
		return NULL;
	}

	aligned_va = (void *)(uintptr_t)((unsigned long)addr &
					 ~(ctx->iova_min_page_size - 1));
	iova_min_page_shift = ilog64(ctx->iova_min_page_size - 1);
	mr->iova_page_size = max(calc_spanning_page_size(hca_va, length),
				 ctx->iova_min_page_size);
	page_shift = ilog64(mr->iova_page_size - 1);
	/*
	 * Ensure the low bits of the mkey VA match the low bits of the IOVA
	 * because the mkc start_addr specifies both the wire VA and the DMA VA.
	 */
	mr->iova_aligned_offset = hca_va & GENMASK(page_shift - 1,
						   iova_min_page_shift);
	mr->iova_reg_size = align(length + hca_va, ctx->iova_min_page_size) -
			    align_down(hca_va, ctx->iova_min_page_size);

	if (page_shift > MLX5_MAX_PAGE_SHIFT) {
		page_shift = MLX5_MAX_PAGE_SHIFT;
		mr->iova_page_size = 1ULL << page_shift;
	}

	ret = iset_alloc_range(ctx->iova_alloc,
			       mr->iova_aligned_offset + mr->iova_reg_size,
			       &mr->iova, mr->iova_page_size);
	if (ret)
		goto end;

	/* IOVA must be aligned */
	assert(mr->iova % mr->iova_page_size == 0);
	ret = mlx5_vfio_register_mem(ctx, aligned_va,
				     mr->iova + mr->iova_aligned_offset,
				     mr->iova_reg_size);
	if (ret)
		goto err_reg;

	num_pas = calc_num_dma_blocks(hca_va, length, mr->iova_page_size);
	inlen = DEVX_ST_SZ_BYTES(create_mkey_in) +
		(sizeof(*pas) * align(num_pas, 2));
	in = calloc(1, inlen);
	if (!in) {
		errno = ENOMEM;
		goto err_in;
	}

	pas = (__be64 *)DEVX_ADDR_OF(create_mkey_in, in, klm_pas_mtt);
	/* if page_shift was greater than MLX5_MAX_PAGE_SHIFT then limiting it
	 * will cause the starting IOVA to be incorrect, adjust it.
*/ mlx5_vfio_populate_pas(align_down(mr->iova + mr->iova_aligned_offset, mr->iova_page_size), num_pas, mr->iova_page_size, pas, MLX5_MTT_PRESENT); DEVX_SET(create_mkey_in, in, opcode, MLX5_CMD_OP_CREATE_MKEY); DEVX_SET(create_mkey_in, in, pg_access, 1); mkc = DEVX_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); set_mkc_access_pd_addr_fields(mkc, access, hca_va, pd); DEVX_SET(mkc, mkc, free, 0); DEVX_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_MTT); DEVX_SET64(mkc, mkc, len, length); DEVX_SET(mkc, mkc, bsf_octword_size, 0); DEVX_SET(mkc, mkc, translations_octword_size, get_octo_len(hca_va, length, page_shift)); DEVX_SET(mkc, mkc, log_page_size, page_shift); DEVX_SET(create_mkey_in, in, translations_octword_actual_size, get_octo_len(hca_va, length, page_shift)); key = atomic_fetch_add(&dev->mkey_var, 1); DEVX_SET(mkc, mkc, mkey_7_0, key); ret = mlx5_vfio_cmd_exec(ctx, in, inlen, out, sizeof(out), 0); if (ret) goto err_exec; free(in); mkey_index = DEVX_GET(create_mkey_out, out, mkey_index); vmr = &mr->vmr; vmr->ibv_mr.lkey = key | mlx5_idx_to_mkey(mkey_index); vmr->ibv_mr.rkey = vmr->ibv_mr.lkey; vmr->ibv_mr.context = pd->context; vmr->mr_type = IBV_MR_TYPE_MR; vmr->access = access; vmr->ibv_mr.handle = 0; return &mr->vmr.ibv_mr; err_exec: free(in); err_in: mlx5_vfio_unregister_mem(ctx, mr->iova + mr->iova_aligned_offset, mr->iova_reg_size); err_reg: iset_insert_range(ctx->iova_alloc, mr->iova, mr->iova_page_size); end: free(mr); return NULL; } static int vfio_devx_query_eqn(struct ibv_context *ibctx, uint32_t vector, uint32_t *eqn) { struct mlx5_vfio_context *ctx = to_mvfio_ctx(ibctx); if (vector != MLX5_VFIO_CMD_VEC_IDX) return EINVAL; /* For now use the singleton EQN created for async events */ *eqn = ctx->async_eq.eqn; return 0; } static struct mlx5dv_devx_uar * vfio_devx_alloc_uar(struct ibv_context *ibctx, uint32_t flags) { struct mlx5_vfio_context *ctx = to_mvfio_ctx(ibctx); struct mlx5_devx_uar *uar; if (flags != MLX5_IB_UAPI_UAR_ALLOC_TYPE_NC) { errno = EOPNOTSUPP; return NULL; } uar = calloc(1, sizeof(*uar)); if (!uar) { errno = ENOMEM; return NULL; } uar->dv_devx_uar.page_id = ctx->eqs_uar.uarn; uar->dv_devx_uar.base_addr = (void *)(uintptr_t)ctx->eqs_uar.iova; uar->dv_devx_uar.reg_addr = uar->dv_devx_uar.base_addr + MLX5_BF_OFFSET; uar->context = ibctx; return &uar->dv_devx_uar; } static void vfio_devx_free_uar(struct mlx5dv_devx_uar *dv_devx_uar) { free(dv_devx_uar); } static struct mlx5dv_devx_umem * _vfio_devx_umem_reg(struct ibv_context *context, void *addr, size_t size, uint32_t access, uint64_t pgsz_bitmap) { struct mlx5_vfio_context *ctx = to_mvfio_ctx(context); uint32_t out[DEVX_ST_SZ_DW(create_umem_out)] = {}; struct mlx5_vfio_devx_umem *vfio_umem; int iova_page_shift; uint64_t iova_size; int ret; void *in; uint32_t inlen; __be64 *mtt; void *umem; bool writeable; void *aligned_va; int num_pas; if (!check_comp_mask(access, MLX5_VFIO_SUPP_UMEM_ACCESS_FLAGS)) { errno = EOPNOTSUPP; return NULL; } if ((access & IBV_ACCESS_REMOTE_WRITE) && !(access & IBV_ACCESS_LOCAL_WRITE)) { errno = EINVAL; return NULL; } /* Page size that encloses the start and end of the umem range */ iova_size = max(roundup_pow_of_two(size + ((uint64_t)(uintptr_t)addr & (ctx->iova_min_page_size - 1))), ctx->iova_min_page_size); if (!(iova_size & pgsz_bitmap)) { /* input should include the iova page size */ errno = EOPNOTSUPP; return NULL; } writeable = access & (IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE); vfio_umem = calloc(1, sizeof(*vfio_umem)); if (!vfio_umem) { errno = ENOMEM; return 
NULL;
	}

	vfio_umem->iova_size = iova_size;
	if (ibv_dontfork_range(addr, size))
		goto err;

	ret = iset_alloc_range(ctx->iova_alloc, vfio_umem->iova_size,
			       &vfio_umem->iova, vfio_umem->iova_size);
	if (ret)
		goto err_alloc;

	/*
	 * The registration arguments have to reflect the real VA that is
	 * presently mapped into the process.
	 */
	aligned_va = (void *)(uintptr_t)((unsigned long)addr &
					 ~(ctx->iova_min_page_size - 1));
	vfio_umem->iova_reg_size = align((addr + size) - aligned_va,
					 ctx->iova_min_page_size);
	ret = mlx5_vfio_register_mem(ctx, aligned_va, vfio_umem->iova,
				     vfio_umem->iova_reg_size);
	if (ret)
		goto err_reg;

	iova_page_shift = ilog32(vfio_umem->iova_size - 1);
	num_pas = 1;
	if (iova_page_shift > MLX5_MAX_PAGE_SHIFT) {
		iova_page_shift = MLX5_MAX_PAGE_SHIFT;
		num_pas = DIV_ROUND_UP(vfio_umem->iova_size,
				       (1ULL << iova_page_shift));
	}

	inlen = DEVX_ST_SZ_BYTES(create_umem_in) +
		DEVX_ST_SZ_BYTES(mtt) * num_pas;
	in = calloc(1, inlen);
	if (!in) {
		errno = ENOMEM;
		goto err_in;
	}

	umem = DEVX_ADDR_OF(create_umem_in, in, umem);
	mtt = (__be64 *)DEVX_ADDR_OF(umem, umem, mtt);
	DEVX_SET(create_umem_in, in, opcode, MLX5_CMD_OP_CREATE_UMEM);
	DEVX_SET64(umem, umem, num_of_mtt, num_pas);
	DEVX_SET(umem, umem, log_page_size,
		 iova_page_shift - MLX5_ADAPTER_PAGE_SHIFT);
	DEVX_SET(umem, umem, page_offset, addr - aligned_va);

	mlx5_vfio_populate_pas(vfio_umem->iova, num_pas,
			       (1ULL << iova_page_shift), mtt,
			       (writeable ? MLX5_MTT_WRITE : 0) | MLX5_MTT_READ);

	ret = mlx5_vfio_cmd_exec(ctx, in, inlen, out, sizeof(out), 0);
	if (ret)
		goto err_exec;

	free(in);

	vfio_umem->dv_devx_umem.umem_id = DEVX_GET(create_umem_out, out, umem_id);
	vfio_umem->context = context;
	vfio_umem->addr = addr;
	vfio_umem->size = size;
	return &vfio_umem->dv_devx_umem;

err_exec:
	free(in);
err_in:
	mlx5_vfio_unregister_mem(ctx, vfio_umem->iova, vfio_umem->iova_reg_size);
err_reg:
	iset_insert_range(ctx->iova_alloc, vfio_umem->iova, vfio_umem->iova_size);
err_alloc:
	ibv_dofork_range(addr, size);
err:
	free(vfio_umem);
	return NULL;
}

static struct mlx5dv_devx_umem *
vfio_devx_umem_reg(struct ibv_context *context, void *addr, size_t size,
		   uint32_t access)
{
	return _vfio_devx_umem_reg(context, addr, size, access, UINT64_MAX);
}

static struct mlx5dv_devx_umem *
vfio_devx_umem_reg_ex(struct ibv_context *ctx, struct mlx5dv_devx_umem_in *in)
{
	if (!check_comp_mask(in->comp_mask, 0)) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return _vfio_devx_umem_reg(ctx, in->addr, in->size, in->access,
				   in->pgsz_bitmap);
}

static int vfio_devx_umem_dereg(struct mlx5dv_devx_umem *dv_devx_umem)
{
	struct mlx5_vfio_devx_umem *vfio_umem =
		container_of(dv_devx_umem, struct mlx5_vfio_devx_umem,
			     dv_devx_umem);
	struct mlx5_vfio_context *ctx = to_mvfio_ctx(vfio_umem->context);
	uint32_t in[DEVX_ST_SZ_DW(destroy_umem_in)] = {};
	uint32_t out[DEVX_ST_SZ_DW(destroy_umem_out)] = {};
	int ret;

	DEVX_SET(destroy_umem_in, in, opcode, MLX5_CMD_OP_DESTROY_UMEM);
	DEVX_SET(destroy_umem_in, in, umem_id, dv_devx_umem->umem_id);

	ret = mlx5_vfio_cmd_exec(ctx, in, sizeof(in), out, sizeof(out), 0);
	if (ret)
		return ret;

	mlx5_vfio_unregister_mem(ctx, vfio_umem->iova, vfio_umem->iova_reg_size);
	iset_insert_range(ctx->iova_alloc, vfio_umem->iova, vfio_umem->iova_size);
	ibv_dofork_range(vfio_umem->addr, vfio_umem->size);
	free(vfio_umem);
	return 0;
}

static int vfio_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type)
{
	struct ibv_pd *pd_in = obj->pd.in;
	struct mlx5dv_pd *pd_out = obj->pd.out;
	struct mlx5_pd *mpd = to_mpd(pd_in);

	if (obj_type != MLX5DV_OBJ_PD)
		return EOPNOTSUPP;

	pd_out->comp_mask = 0;
	pd_out->pdn = mpd->pdn;
	return 0;
}

static
int vfio_devx_general_cmd(struct ibv_context *context, const void *in, size_t inlen, void *out, size_t outlen) { struct mlx5_vfio_context *ctx = to_mvfio_ctx(context); return mlx5_vfio_cmd_do(ctx, (void *)in, inlen, out, outlen, 0); } static bool devx_is_obj_create_cmd(const void *in) { uint16_t opcode = DEVX_GET(general_obj_in_cmd_hdr, in, opcode); switch (opcode) { case MLX5_CMD_OP_CREATE_GENERAL_OBJECT: case MLX5_CMD_OP_CREATE_MKEY: case MLX5_CMD_OP_CREATE_CQ: case MLX5_CMD_OP_ALLOC_PD: case MLX5_CMD_OP_ALLOC_TRANSPORT_DOMAIN: case MLX5_CMD_OP_CREATE_RMP: case MLX5_CMD_OP_CREATE_SQ: case MLX5_CMD_OP_CREATE_RQ: case MLX5_CMD_OP_CREATE_RQT: case MLX5_CMD_OP_CREATE_TIR: case MLX5_CMD_OP_CREATE_TIS: case MLX5_CMD_OP_ALLOC_Q_COUNTER: case MLX5_CMD_OP_CREATE_FLOW_TABLE: case MLX5_CMD_OP_CREATE_FLOW_GROUP: case MLX5_CMD_OP_CREATE_FLOW_COUNTER: case MLX5_CMD_OP_ALLOC_PACKET_REFORMAT_CONTEXT: case MLX5_CMD_OP_ALLOC_MODIFY_HEADER_CONTEXT: case MLX5_CMD_OP_CREATE_SCHEDULING_ELEMENT: case MLX5_CMD_OP_ADD_VXLAN_UDP_DPORT: case MLX5_CMD_OP_SET_L2_TABLE_ENTRY: case MLX5_CMD_OP_CREATE_QP: case MLX5_CMD_OP_CREATE_SRQ: case MLX5_CMD_OP_CREATE_XRC_SRQ: case MLX5_CMD_OP_CREATE_DCT: case MLX5_CMD_OP_CREATE_XRQ: case MLX5_CMD_OP_ATTACH_TO_MCG: case MLX5_CMD_OP_ALLOC_XRCD: return true; case MLX5_CMD_OP_SET_FLOW_TABLE_ENTRY: { uint8_t op_mod = DEVX_GET(set_fte_in, in, op_mod); if (op_mod == 0) return true; return false; } case MLX5_CMD_OP_CREATE_PSV: { uint8_t num_psv = DEVX_GET(create_psv_in, in, num_psv); if (num_psv == 1) return true; return false; } default: return false; } } static uint32_t devx_get_created_obj_id(const void *in, const void *out, uint16_t opcode) { switch (opcode) { case MLX5_CMD_OP_CREATE_GENERAL_OBJECT: return DEVX_GET(general_obj_out_cmd_hdr, out, obj_id); case MLX5_CMD_OP_CREATE_UMEM: return DEVX_GET(create_umem_out, out, umem_id); case MLX5_CMD_OP_CREATE_MKEY: return DEVX_GET(create_mkey_out, out, mkey_index); case MLX5_CMD_OP_CREATE_CQ: return DEVX_GET(create_cq_out, out, cqn); case MLX5_CMD_OP_ALLOC_PD: return DEVX_GET(alloc_pd_out, out, pd); case MLX5_CMD_OP_ALLOC_TRANSPORT_DOMAIN: return DEVX_GET(alloc_transport_domain_out, out, transport_domain); case MLX5_CMD_OP_CREATE_RMP: return DEVX_GET(create_rmp_out, out, rmpn); case MLX5_CMD_OP_CREATE_SQ: return DEVX_GET(create_sq_out, out, sqn); case MLX5_CMD_OP_CREATE_RQ: return DEVX_GET(create_rq_out, out, rqn); case MLX5_CMD_OP_CREATE_RQT: return DEVX_GET(create_rqt_out, out, rqtn); case MLX5_CMD_OP_CREATE_TIR: return DEVX_GET(create_tir_out, out, tirn); case MLX5_CMD_OP_CREATE_TIS: return DEVX_GET(create_tis_out, out, tisn); case MLX5_CMD_OP_ALLOC_Q_COUNTER: return DEVX_GET(alloc_q_counter_out, out, counter_set_id); case MLX5_CMD_OP_CREATE_FLOW_TABLE: return DEVX_GET(create_flow_table_out, out, table_id); case MLX5_CMD_OP_CREATE_FLOW_GROUP: return DEVX_GET(create_flow_group_out, out, group_id); case MLX5_CMD_OP_SET_FLOW_TABLE_ENTRY: return DEVX_GET(set_fte_in, in, flow_index); case MLX5_CMD_OP_CREATE_FLOW_COUNTER: return DEVX_GET(alloc_flow_counter_out, out, flow_counter_id); case MLX5_CMD_OP_ALLOC_PACKET_REFORMAT_CONTEXT: return DEVX_GET(alloc_packet_reformat_context_out, out, packet_reformat_id); case MLX5_CMD_OP_ALLOC_MODIFY_HEADER_CONTEXT: return DEVX_GET(alloc_modify_header_context_out, out, modify_header_id); case MLX5_CMD_OP_CREATE_SCHEDULING_ELEMENT: return DEVX_GET(create_scheduling_element_out, out, scheduling_element_id); case MLX5_CMD_OP_ADD_VXLAN_UDP_DPORT: return DEVX_GET(add_vxlan_udp_dport_in, in, vxlan_udp_port); 
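	/*
	 * Like ADD_VXLAN_UDP_DPORT above, SET_L2_TABLE_ENTRY below carries no
	 * firmware-assigned object id in its output mailbox; the value that
	 * later identifies the object for teardown comes from the *input*
	 * mailbox, hence DEVX_GET(..., in, ...).
	 */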
case MLX5_CMD_OP_SET_L2_TABLE_ENTRY: return DEVX_GET(set_l2_table_entry_in, in, table_index); case MLX5_CMD_OP_CREATE_QP: return DEVX_GET(create_qp_out, out, qpn); case MLX5_CMD_OP_CREATE_SRQ: return DEVX_GET(create_srq_out, out, srqn); case MLX5_CMD_OP_CREATE_XRC_SRQ: return DEVX_GET(create_xrc_srq_out, out, xrc_srqn); case MLX5_CMD_OP_CREATE_DCT: return DEVX_GET(create_dct_out, out, dctn); case MLX5_CMD_OP_CREATE_XRQ: return DEVX_GET(create_xrq_out, out, xrqn); case MLX5_CMD_OP_ATTACH_TO_MCG: return DEVX_GET(attach_to_mcg_in, in, qpn); case MLX5_CMD_OP_ALLOC_XRCD: return DEVX_GET(alloc_xrcd_out, out, xrcd); case MLX5_CMD_OP_CREATE_PSV: return DEVX_GET(create_psv_out, out, psv0_index); default: /* The entry must match to one of the devx_is_obj_create_cmd */ assert(false); return 0; } } static void devx_obj_build_destroy_cmd(const void *in, void *out, void *din, uint32_t *dinlen, struct mlx5dv_devx_obj *obj) { uint16_t opcode = DEVX_GET(general_obj_in_cmd_hdr, in, opcode); uint16_t uid = DEVX_GET(general_obj_in_cmd_hdr, in, uid); uint32_t *obj_id = &obj->object_id; *obj_id = devx_get_created_obj_id(in, out, opcode); *dinlen = DEVX_ST_SZ_BYTES(general_obj_in_cmd_hdr); DEVX_SET(general_obj_in_cmd_hdr, din, uid, uid); switch (opcode) { case MLX5_CMD_OP_CREATE_GENERAL_OBJECT: DEVX_SET(general_obj_in_cmd_hdr, din, opcode, MLX5_CMD_OP_DESTROY_GENERAL_OBJECT); DEVX_SET(general_obj_in_cmd_hdr, din, obj_id, *obj_id); DEVX_SET(general_obj_in_cmd_hdr, din, obj_type, DEVX_GET(general_obj_in_cmd_hdr, in, obj_type)); break; case MLX5_CMD_OP_CREATE_UMEM: DEVX_SET(destroy_umem_in, din, opcode, MLX5_CMD_OP_DESTROY_UMEM); DEVX_SET(destroy_umem_in, din, umem_id, *obj_id); break; case MLX5_CMD_OP_CREATE_MKEY: DEVX_SET(destroy_mkey_in, din, opcode, MLX5_CMD_OP_DESTROY_MKEY); DEVX_SET(destroy_mkey_in, din, mkey_index, *obj_id); break; case MLX5_CMD_OP_CREATE_CQ: DEVX_SET(destroy_cq_in, din, opcode, MLX5_CMD_OP_DESTROY_CQ); DEVX_SET(destroy_cq_in, din, cqn, *obj_id); break; case MLX5_CMD_OP_ALLOC_PD: DEVX_SET(dealloc_pd_in, din, opcode, MLX5_CMD_OP_DEALLOC_PD); DEVX_SET(dealloc_pd_in, din, pd, *obj_id); break; case MLX5_CMD_OP_ALLOC_TRANSPORT_DOMAIN: DEVX_SET(dealloc_transport_domain_in, din, opcode, MLX5_CMD_OP_DEALLOC_TRANSPORT_DOMAIN); DEVX_SET(dealloc_transport_domain_in, din, transport_domain, *obj_id); break; case MLX5_CMD_OP_CREATE_RMP: DEVX_SET(destroy_rmp_in, din, opcode, MLX5_CMD_OP_DESTROY_RMP); DEVX_SET(destroy_rmp_in, din, rmpn, *obj_id); break; case MLX5_CMD_OP_CREATE_SQ: DEVX_SET(destroy_sq_in, din, opcode, MLX5_CMD_OP_DESTROY_SQ); DEVX_SET(destroy_sq_in, din, sqn, *obj_id); break; case MLX5_CMD_OP_CREATE_RQ: DEVX_SET(destroy_rq_in, din, opcode, MLX5_CMD_OP_DESTROY_RQ); DEVX_SET(destroy_rq_in, din, rqn, *obj_id); break; case MLX5_CMD_OP_CREATE_RQT: DEVX_SET(destroy_rqt_in, din, opcode, MLX5_CMD_OP_DESTROY_RQT); DEVX_SET(destroy_rqt_in, din, rqtn, *obj_id); break; case MLX5_CMD_OP_CREATE_TIR: DEVX_SET(destroy_tir_in, din, opcode, MLX5_CMD_OP_DESTROY_TIR); DEVX_SET(destroy_tir_in, din, tirn, *obj_id); break; case MLX5_CMD_OP_CREATE_TIS: DEVX_SET(destroy_tis_in, din, opcode, MLX5_CMD_OP_DESTROY_TIS); DEVX_SET(destroy_tis_in, din, tisn, *obj_id); break; case MLX5_CMD_OP_ALLOC_Q_COUNTER: DEVX_SET(dealloc_q_counter_in, din, opcode, MLX5_CMD_OP_DEALLOC_Q_COUNTER); DEVX_SET(dealloc_q_counter_in, din, counter_set_id, *obj_id); break; case MLX5_CMD_OP_CREATE_FLOW_TABLE: *dinlen = DEVX_ST_SZ_BYTES(destroy_flow_table_in); DEVX_SET(destroy_flow_table_in, din, other_vport, DEVX_GET(create_flow_table_in, in, 
other_vport)); DEVX_SET(destroy_flow_table_in, din, vport_number, DEVX_GET(create_flow_table_in, in, vport_number)); DEVX_SET(destroy_flow_table_in, din, table_type, DEVX_GET(create_flow_table_in, in, table_type)); DEVX_SET(destroy_flow_table_in, din, table_id, *obj_id); DEVX_SET(destroy_flow_table_in, din, opcode, MLX5_CMD_OP_DESTROY_FLOW_TABLE); break; case MLX5_CMD_OP_CREATE_FLOW_GROUP: *dinlen = DEVX_ST_SZ_BYTES(destroy_flow_group_in); DEVX_SET(destroy_flow_group_in, din, other_vport, DEVX_GET(create_flow_group_in, in, other_vport)); DEVX_SET(destroy_flow_group_in, din, vport_number, DEVX_GET(create_flow_group_in, in, vport_number)); DEVX_SET(destroy_flow_group_in, din, table_type, DEVX_GET(create_flow_group_in, in, table_type)); DEVX_SET(destroy_flow_group_in, din, table_id, DEVX_GET(create_flow_group_in, in, table_id)); DEVX_SET(destroy_flow_group_in, din, group_id, *obj_id); DEVX_SET(destroy_flow_group_in, din, opcode, MLX5_CMD_OP_DESTROY_FLOW_GROUP); break; case MLX5_CMD_OP_SET_FLOW_TABLE_ENTRY: *dinlen = DEVX_ST_SZ_BYTES(delete_fte_in); DEVX_SET(delete_fte_in, din, other_vport, DEVX_GET(set_fte_in, in, other_vport)); DEVX_SET(delete_fte_in, din, vport_number, DEVX_GET(set_fte_in, in, vport_number)); DEVX_SET(delete_fte_in, din, table_type, DEVX_GET(set_fte_in, in, table_type)); DEVX_SET(delete_fte_in, din, table_id, DEVX_GET(set_fte_in, in, table_id)); DEVX_SET(delete_fte_in, din, flow_index, *obj_id); DEVX_SET(delete_fte_in, din, opcode, MLX5_CMD_OP_DELETE_FLOW_TABLE_ENTRY); break; case MLX5_CMD_OP_CREATE_FLOW_COUNTER: DEVX_SET(dealloc_flow_counter_in, din, opcode, MLX5_CMD_OP_DEALLOC_FLOW_COUNTER); DEVX_SET(dealloc_flow_counter_in, din, flow_counter_id, *obj_id); break; case MLX5_CMD_OP_ALLOC_PACKET_REFORMAT_CONTEXT: DEVX_SET(dealloc_packet_reformat_context_in, din, opcode, MLX5_CMD_OP_DEALLOC_PACKET_REFORMAT_CONTEXT); DEVX_SET(dealloc_packet_reformat_context_in, din, packet_reformat_id, *obj_id); break; case MLX5_CMD_OP_ALLOC_MODIFY_HEADER_CONTEXT: DEVX_SET(dealloc_modify_header_context_in, din, opcode, MLX5_CMD_OP_DEALLOC_MODIFY_HEADER_CONTEXT); DEVX_SET(dealloc_modify_header_context_in, din, modify_header_id, *obj_id); break; case MLX5_CMD_OP_CREATE_SCHEDULING_ELEMENT: *dinlen = DEVX_ST_SZ_BYTES(destroy_scheduling_element_in); DEVX_SET(destroy_scheduling_element_in, din, scheduling_hierarchy, DEVX_GET(create_scheduling_element_in, in, scheduling_hierarchy)); DEVX_SET(destroy_scheduling_element_in, din, scheduling_element_id, *obj_id); DEVX_SET(destroy_scheduling_element_in, din, opcode, MLX5_CMD_OP_DESTROY_SCHEDULING_ELEMENT); break; case MLX5_CMD_OP_ADD_VXLAN_UDP_DPORT: *dinlen = DEVX_ST_SZ_BYTES(delete_vxlan_udp_dport_in); DEVX_SET(delete_vxlan_udp_dport_in, din, vxlan_udp_port, *obj_id); DEVX_SET(delete_vxlan_udp_dport_in, din, opcode, MLX5_CMD_OP_DELETE_VXLAN_UDP_DPORT); break; case MLX5_CMD_OP_SET_L2_TABLE_ENTRY: *dinlen = DEVX_ST_SZ_BYTES(delete_l2_table_entry_in); DEVX_SET(delete_l2_table_entry_in, din, table_index, *obj_id); DEVX_SET(delete_l2_table_entry_in, din, opcode, MLX5_CMD_OP_DELETE_L2_TABLE_ENTRY); break; case MLX5_CMD_OP_CREATE_QP: DEVX_SET(destroy_qp_in, din, opcode, MLX5_CMD_OP_DESTROY_QP); DEVX_SET(destroy_qp_in, din, qpn, *obj_id); break; case MLX5_CMD_OP_CREATE_SRQ: DEVX_SET(destroy_srq_in, din, opcode, MLX5_CMD_OP_DESTROY_SRQ); DEVX_SET(destroy_srq_in, din, srqn, *obj_id); break; case MLX5_CMD_OP_CREATE_XRC_SRQ: DEVX_SET(destroy_xrc_srq_in, din, opcode, MLX5_CMD_OP_DESTROY_XRC_SRQ); DEVX_SET(destroy_xrc_srq_in, din, xrc_srqn, *obj_id); break; case 
MLX5_CMD_OP_CREATE_DCT: DEVX_SET(destroy_dct_in, din, opcode, MLX5_CMD_OP_DESTROY_DCT); DEVX_SET(destroy_dct_in, din, dctn, *obj_id); break; case MLX5_CMD_OP_CREATE_XRQ: DEVX_SET(destroy_xrq_in, din, opcode, MLX5_CMD_OP_DESTROY_XRQ); DEVX_SET(destroy_xrq_in, din, xrqn, *obj_id); break; case MLX5_CMD_OP_ATTACH_TO_MCG: *dinlen = DEVX_ST_SZ_BYTES(detach_from_mcg_in); DEVX_SET(detach_from_mcg_in, din, qpn, DEVX_GET(attach_to_mcg_in, in, qpn)); memcpy(DEVX_ADDR_OF(detach_from_mcg_in, din, multicast_gid), DEVX_ADDR_OF(attach_to_mcg_in, in, multicast_gid), DEVX_FLD_SZ_BYTES(attach_to_mcg_in, multicast_gid)); DEVX_SET(detach_from_mcg_in, din, opcode, MLX5_CMD_OP_DETACH_FROM_MCG); DEVX_SET(detach_from_mcg_in, din, qpn, *obj_id); break; case MLX5_CMD_OP_ALLOC_XRCD: DEVX_SET(dealloc_xrcd_in, din, opcode, MLX5_CMD_OP_DEALLOC_XRCD); DEVX_SET(dealloc_xrcd_in, din, xrcd, *obj_id); break; case MLX5_CMD_OP_CREATE_PSV: DEVX_SET(destroy_psv_in, din, opcode, MLX5_CMD_OP_DESTROY_PSV); DEVX_SET(destroy_psv_in, din, psvn, *obj_id); break; default: /* The entry must match to one of the devx_is_obj_create_cmd */ assert(false); break; } } static struct mlx5dv_devx_obj * vfio_devx_obj_create(struct ibv_context *context, const void *in, size_t inlen, void *out, size_t outlen) { struct mlx5_vfio_context *ctx = to_mvfio_ctx(context); struct mlx5_devx_obj *obj; int ret; if (!devx_is_obj_create_cmd(in)) { errno = EINVAL; return NULL; } obj = calloc(1, sizeof(*obj)); if (!obj) { errno = ENOMEM; return NULL; } ret = mlx5_vfio_cmd_do(ctx, (void *)in, inlen, out, outlen, 0); if (ret) { errno = ret; goto fail; } devx_obj_build_destroy_cmd(in, out, obj->dinbox, &obj->dinlen, &obj->dv_obj); obj->dv_obj.context = context; return &obj->dv_obj; fail: free(obj); return NULL; } static int vfio_devx_obj_query(struct mlx5dv_devx_obj *obj, const void *in, size_t inlen, void *out, size_t outlen) { struct mlx5_vfio_context *ctx = to_mvfio_ctx(obj->context); return mlx5_vfio_cmd_do(ctx, (void *)in, inlen, out, outlen, 0); } static int vfio_devx_obj_modify(struct mlx5dv_devx_obj *obj, const void *in, size_t inlen, void *out, size_t outlen) { struct mlx5_vfio_context *ctx = to_mvfio_ctx(obj->context); return mlx5_vfio_cmd_do(ctx, (void *)in, inlen, out, outlen, 0); } static int vfio_devx_obj_destroy(struct mlx5dv_devx_obj *obj) { struct mlx5_devx_obj *mobj = container_of(obj, struct mlx5_devx_obj, dv_obj); struct mlx5_vfio_context *ctx = to_mvfio_ctx(obj->context); uint32_t out[DEVX_ST_SZ_DW(general_obj_out_cmd_hdr)]; int ret; ret = mlx5_vfio_cmd_exec(ctx, mobj->dinbox, mobj->dinlen, out, sizeof(out), 0); if (ret) return ret; free(mobj); return 0; } static struct mlx5dv_devx_msi_vector * vfio_devx_alloc_msi_vector(struct ibv_context *ibctx) { uint8_t buf[sizeof(struct vfio_irq_set) + sizeof(int)] = {}; struct mlx5_vfio_context *ctx = to_mvfio_ctx(ibctx); struct mlx5_devx_msi_vector *msi; int vector, *fd, err; msi = calloc(1, sizeof(*msi)); if (!msi) { errno = ENOMEM; return NULL; } pthread_mutex_lock(&ctx->msix_fds_lock); for (vector = 0; vector < ibctx->num_comp_vectors; vector++) if (ctx->msix_fds[vector] < 0) break; if (vector == ibctx->num_comp_vectors) { errno = ENOSPC; goto fail; } fd = (int *)(buf + sizeof(struct vfio_irq_set)); *fd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK); if (*fd < 0) goto fail; err = mlx5_vfio_msix_set_irqs(ctx, vector, 1, buf); if (err) goto fail_set_irqs; ctx->msix_fds[vector] = *fd; msi->dv_msi.vector = vector; msi->dv_msi.fd = *fd; msi->ibctx = ibctx; pthread_mutex_unlock(&ctx->msix_fds_lock); return 
&msi->dv_msi; fail_set_irqs: close(*fd); fail: pthread_mutex_unlock(&ctx->msix_fds_lock); free(msi); return NULL; } static int vfio_devx_free_msi_vector(struct mlx5dv_devx_msi_vector *msi) { struct mlx5_devx_msi_vector *msiv = container_of(msi, struct mlx5_devx_msi_vector, dv_msi); uint8_t buf[sizeof(struct vfio_irq_set) + sizeof(int)] = {}; struct mlx5_vfio_context *ctx = to_mvfio_ctx(msiv->ibctx); int ret; pthread_mutex_lock(&ctx->msix_fds_lock); if ((msi->vector >= msiv->ibctx->num_comp_vectors) || (msi->vector == MLX5_VFIO_CMD_VEC_IDX) || (msi->fd != ctx->msix_fds[msi->vector])) { ret = EINVAL; goto out; } *(int *)(buf + sizeof(struct vfio_irq_set)) = -1; ret = mlx5_vfio_msix_set_irqs(ctx, msi->vector, 1, buf); if (ret) { ret = errno; goto out; } close(msi->fd); ctx->msix_fds[msi->vector] = -1; free(msiv); out: pthread_mutex_unlock(&ctx->msix_fds_lock); return ret; } static struct mlx5dv_devx_eq * vfio_devx_create_eq(struct ibv_context *ibctx, const void *in, size_t inlen, void *out, size_t outlen) { struct mlx5_vfio_context *ctx = to_mvfio_ctx(ibctx); struct mlx5_devx_eq *eq; void *eqc, *in_pas; size_t inlen_pas; uint64_t size; __be64 *pas; int err; eqc = DEVX_ADDR_OF(create_eq_in, in, eq_context_entry); if ((inlen < DEVX_ST_SZ_BYTES(create_eq_in)) || (DEVX_GET(create_eq_in, in, opcode) != MLX5_CMD_OP_CREATE_EQ) || (DEVX_GET(eqc, eqc, intr) == MLX5_VFIO_CMD_VEC_IDX)) { errno = EINVAL; return NULL; } size = max(roundup_pow_of_two( (1ULL << DEVX_GET(eqc, eqc, log_eq_size)) * MLX5_EQE_SIZE), ctx->iova_min_page_size); if (size > SIZE_MAX) { errno = ERANGE; return NULL; } eq = calloc(1, sizeof(*eq)); if (!eq) { errno = ENOMEM; return NULL; } eq->size = size; err = posix_memalign(&eq->dv_eq.vaddr, MLX5_ADAPTER_PAGE_SIZE, eq->size); if (err) { errno = err; goto err_va; } err = iset_alloc_range(ctx->iova_alloc, eq->size, &eq->iova, eq->size); if (err) goto err_range; err = mlx5_vfio_register_mem(ctx, eq->dv_eq.vaddr, eq->iova, eq->size); if (err) goto err_reg; inlen_pas = inlen + DEVX_FLD_SZ_BYTES(create_eq_in, pas[0]) * 1; in_pas = calloc(1, inlen_pas); if (!in_pas) { errno = ENOMEM; goto err_inpas; } memcpy(in_pas, in, inlen); eqc = DEVX_ADDR_OF(create_eq_in, in_pas, eq_context_entry); DEVX_SET(eqc, eqc, log_page_size, ilog32(eq->size - 1) - MLX5_ADAPTER_PAGE_SHIFT); pas = (__be64 *)DEVX_ADDR_OF(create_eq_in, in_pas, pas); pas[0] = htobe64(eq->iova); err = mlx5_vfio_cmd_do(ctx, in_pas, inlen_pas, out, outlen, 0); if (err) { errno = err; goto err_cmd; } free(in_pas); eq->ibctx = ibctx; eq->eqn = DEVX_GET(create_eq_out, out, eq_number); return &eq->dv_eq; err_cmd: free(in_pas); err_inpas: mlx5_vfio_unregister_mem(ctx, eq->iova, eq->size); err_reg: iset_insert_range(ctx->iova_alloc, eq->iova, eq->size); err_range: free(eq->dv_eq.vaddr); err_va: free(eq); return NULL; } static int vfio_devx_destroy_eq(struct mlx5dv_devx_eq *dveq) { struct mlx5_devx_eq *eq = container_of(dveq, struct mlx5_devx_eq, dv_eq); struct mlx5_vfio_context *ctx = to_mvfio_ctx(eq->ibctx); uint32_t out[DEVX_ST_SZ_DW(destroy_eq_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(destroy_eq_in)] = {}; int err; DEVX_SET(destroy_eq_in, in, opcode, MLX5_CMD_OP_DESTROY_EQ); DEVX_SET(destroy_eq_in, in, eq_number, eq->eqn); err = mlx5_vfio_cmd_exec(ctx, in, sizeof(in), out, sizeof(out), 0); if (err) return err; mlx5_vfio_unregister_mem(ctx, eq->iova, eq->size); iset_insert_range(ctx->iova_alloc, eq->iova, eq->size); free(eq); return 0; } static struct mlx5_dv_context_ops mlx5_vfio_dv_ctx_ops = { .devx_general_cmd = vfio_devx_general_cmd, 
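	/*
	 * DEVX commands issued through these entry points are pushed onto the
	 * VFIO command queue (mlx5_vfio_cmd_do/exec) instead of uverbs
	 * ioctls; UAR and EQ resources are likewise managed locally.
	 */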
.devx_obj_create = vfio_devx_obj_create, .devx_obj_query = vfio_devx_obj_query, .devx_obj_modify = vfio_devx_obj_modify, .devx_obj_destroy = vfio_devx_obj_destroy, .devx_query_eqn = vfio_devx_query_eqn, .devx_alloc_uar = vfio_devx_alloc_uar, .devx_free_uar = vfio_devx_free_uar, .devx_umem_reg = vfio_devx_umem_reg, .devx_umem_reg_ex = vfio_devx_umem_reg_ex, .devx_umem_dereg = vfio_devx_umem_dereg, .init_obj = vfio_init_obj, .devx_alloc_msi_vector = vfio_devx_alloc_msi_vector, .devx_free_msi_vector = vfio_devx_free_msi_vector, .devx_create_eq = vfio_devx_create_eq, .devx_destroy_eq = vfio_devx_destroy_eq, }; static void mlx5_vfio_uninit_context(struct mlx5_vfio_context *ctx) { mlx5_close_debug_file(ctx->dbg_fp); verbs_uninit_context(&ctx->vctx); free(ctx); } static void mlx5_vfio_free_context(struct ibv_context *ibctx) { struct mlx5_vfio_context *ctx = to_mvfio_ctx(ibctx); destroy_async_eqs(ctx); mlx5_vfio_teardown_hca(ctx); mlx5_vfio_clean_cmd_interface(ctx); mlx5_vfio_clean_device_dma(ctx); mlx5_vfio_uninit_bar0(ctx); mlx5_vfio_close_fds(ctx); mlx5_vfio_uninit_context(ctx); } static const struct verbs_context_ops mlx5_vfio_common_ops = { .alloc_pd = mlx5_vfio_alloc_pd, .dealloc_pd = mlx5_vfio_dealloc_pd, .reg_mr = mlx5_vfio_reg_mr, .dereg_mr = mlx5_vfio_dereg_mr, .free_context = mlx5_vfio_free_context, }; static struct verbs_context * mlx5_vfio_alloc_context(struct ibv_device *ibdev, int cmd_fd, void *private_data) { struct mlx5_vfio_device *mdev = to_mvfio_dev(ibdev); struct mlx5_vfio_context *mctx; cmd_fd = -1; mctx = verbs_init_and_alloc_context(ibdev, cmd_fd, mctx, vctx, RDMA_DRIVER_UNKNOWN); if (!mctx) return NULL; mlx5_open_debug_file(&mctx->dbg_fp); mlx5_set_debug_mask(); if (mlx5_vfio_open_fds(mctx, mdev)) goto err; if (mlx5_vfio_init_bar0(mctx)) goto close_fds; if (mlx5_vfio_init_device_dma(mctx)) goto err_bar; if (mlx5_vfio_init_cmd_interface(mctx)) goto err_dma; if (mlx5_vfio_setup_function(mctx)) goto clean_cmd; if (create_async_eqs(mctx)) goto func_teardown; verbs_set_ops(&mctx->vctx, &mlx5_vfio_common_ops); mctx->dv_ctx_ops = &mlx5_vfio_dv_ctx_ops; return &mctx->vctx; func_teardown: mlx5_vfio_teardown_hca(mctx); clean_cmd: mlx5_vfio_clean_cmd_interface(mctx); err_dma: mlx5_vfio_clean_device_dma(mctx); err_bar: mlx5_vfio_uninit_bar0(mctx); close_fds: mlx5_vfio_close_fds(mctx); err: mlx5_vfio_uninit_context(mctx); return NULL; } static void mlx5_vfio_uninit_device(struct verbs_device *verbs_device) { struct mlx5_vfio_device *dev = to_mvfio_dev(&verbs_device->device); free(dev->pci_name); free(dev); } static const struct verbs_device_ops mlx5_vfio_dev_ops = { .name = "mlx5_vfio", .alloc_context = mlx5_vfio_alloc_context, .uninit_device = mlx5_vfio_uninit_device, }; static bool is_mlx5_pci(const char *pci_path) { const struct verbs_match_ent *ent; uint16_t vendor_id, device_id; char pci_info_path[256]; char buff[128]; int fd; snprintf(pci_info_path, sizeof(pci_info_path), "%s/vendor", pci_path); fd = open(pci_info_path, O_RDONLY); if (fd < 0) return false; if (read(fd, buff, sizeof(buff)) <= 0) goto err; vendor_id = strtoul(buff, NULL, 0); close(fd); snprintf(pci_info_path, sizeof(pci_info_path), "%s/device", pci_path); fd = open(pci_info_path, O_RDONLY); if (fd < 0) return false; if (read(fd, buff, sizeof(buff)) <= 0) goto err; device_id = strtoul(buff, NULL, 0); close(fd); for (ent = mlx5_hca_table; ent->kind != VERBS_MATCH_SENTINEL; ent++) { if (ent->kind != VERBS_MATCH_PCI) continue; if (ent->device == device_id && ent->vendor == vendor_id) return true; } return false; err: 
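	/* Reached on a short or failed sysfs read with the fd still open. */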
close(fd); return false; } static int mlx5_vfio_get_iommu_group_id(const char *pci_name) { int seg, bus, slot, func; int ret, groupid; char path[128], iommu_group_path[128], *group_name; struct stat st; ssize_t len; ret = sscanf(pci_name, "%04x:%02x:%02x.%d", &seg, &bus, &slot, &func); if (ret != 4) return -1; snprintf(path, sizeof(path), "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/", seg, bus, slot, func); ret = stat(path, &st); if (ret < 0) return -1; if (!is_mlx5_pci(path)) return -1; strncat(path, "iommu_group", sizeof(path) - strlen(path) - 1); len = readlink(path, iommu_group_path, sizeof(iommu_group_path)); if (len <= 0) return -1; iommu_group_path[len] = 0; group_name = basename(iommu_group_path); if (sscanf(group_name, "%d", &groupid) != 1) return -1; snprintf(path, sizeof(path), "/dev/vfio/%d", groupid); ret = stat(path, &st); if (ret < 0) return -1; return groupid; } static int mlx5_vfio_get_handle(struct mlx5_vfio_device *vfio_dev, struct mlx5dv_vfio_context_attr *attr) { int iommu_group; iommu_group = mlx5_vfio_get_iommu_group_id(attr->pci_name); if (iommu_group < 0) return -1; sprintf(vfio_dev->vfio_path, "/dev/vfio/%d", iommu_group); vfio_dev->pci_name = strdup(attr->pci_name); return 0; } int mlx5dv_vfio_get_events_fd(struct ibv_context *ibctx) { struct mlx5_vfio_context *ctx = to_mvfio_ctx(ibctx); return ctx->cmd_comp_fd; } int mlx5dv_vfio_process_events(struct ibv_context *ibctx) { struct mlx5_vfio_context *ctx = to_mvfio_ctx(ibctx); uint64_t u; ssize_t s; mlx5_vfio_poll_health(ctx); /* read to re-arm the FD and process all existing events */ s = read(ctx->cmd_comp_fd, &u, sizeof(uint64_t)); if (s < 0 && errno != EAGAIN) { mlx5_err(ctx->dbg_fp, "%s, read failed, errno=%d\n", __func__, errno); return errno; } return mlx5_vfio_process_async_events(ctx); } struct ibv_device ** mlx5dv_get_vfio_device_list(struct mlx5dv_vfio_context_attr *attr) { struct mlx5_vfio_device *vfio_dev; struct ibv_device **list = NULL; int err; if (!check_comp_mask(attr->comp_mask, 0) || !check_comp_mask(attr->flags, MLX5DV_VFIO_CTX_FLAGS_INIT_LINK_DOWN)) { errno = EOPNOTSUPP; return NULL; } list = calloc(2, sizeof(struct ibv_device *)); if (!list) { errno = ENOMEM; return NULL; } vfio_dev = calloc(1, sizeof(*vfio_dev)); if (!vfio_dev) { errno = ENOMEM; goto end; } vfio_dev->vdev.ops = &mlx5_vfio_dev_ops; atomic_init(&vfio_dev->vdev.refcount, 1); /* Find the vfio handle for attrs, store in mlx5_vfio_device */ err = mlx5_vfio_get_handle(vfio_dev, attr); if (err) goto err_get; vfio_dev->flags = attr->flags; vfio_dev->page_size = sysconf(_SC_PAGESIZE); atomic_init(&vfio_dev->mkey_var, 0); list[0] = &vfio_dev->vdev.device; return list; err_get: free(vfio_dev); end: free(list); return NULL; } bool is_mlx5_vfio_dev(struct ibv_device *device) { struct verbs_device *verbs_device = verbs_get_device(device); return verbs_device->ops == &mlx5_vfio_dev_ops; } rdma-core-56.1/providers/mlx5/mlx5_vfio.h000066400000000000000000000151421477342711600203200ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 or Linux-OpenIB /* * Copyright (c) 2021, NVIDIA CORPORATION & AFFILIATES. 
All rights reserved */

#ifndef MLX5_VFIO_H
#define MLX5_VFIO_H

#include <stddef.h>
#include <stdio.h>
#include "mlx5.h"
#include "mlx5_ifc.h"
#include <pthread.h>
#include <stdatomic.h>

#define FW_INIT_WAIT_MS 2
#define FW_PRE_INIT_TIMEOUT_MILI 120000

enum {
	MLX5_MAX_COMMANDS = 32,
	MLX5_CMD_DATA_BLOCK_SIZE = 512,
	MLX5_PCI_CMD_XPORT = 7,
};

enum mlx5_ib_mtt_access_flags {
	MLX5_MTT_READ = (1 << 0),
	MLX5_MTT_WRITE = (1 << 1),
};

enum {
	MLX5_MAX_PAGE_SHIFT = 31,
};

#define MLX5_MTT_PRESENT (MLX5_MTT_READ | MLX5_MTT_WRITE)

enum {
	MLX5_VFIO_BLOCK_SIZE = 2 * 1024 * 1024,
	MLX5_VFIO_BLOCK_NUM_PAGES = MLX5_VFIO_BLOCK_SIZE / MLX5_ADAPTER_PAGE_SIZE,
};

struct mlx5_vfio_mr {
	struct verbs_mr vmr;
	uint64_t iova;
	uint64_t iova_page_size;
	uint64_t iova_aligned_offset;
	uint64_t iova_reg_size;
};

struct mlx5_vfio_devx_umem {
	struct mlx5dv_devx_umem dv_devx_umem;
	struct ibv_context *context;
	void *addr;
	size_t size;
	uint64_t iova;
	uint64_t iova_size;
	uint64_t iova_reg_size;
};

struct mlx5_vfio_device {
	struct verbs_device vdev;
	char *pci_name;
	char vfio_path[IBV_SYSFS_PATH_MAX];
	int page_size;
	uint32_t flags;
	atomic_int mkey_var;
};

#if __BYTE_ORDER == __LITTLE_ENDIAN
#define MLX5_SET_HOST_ENDIANNESS 0
#elif __BYTE_ORDER == __BIG_ENDIAN
#define MLX5_SET_HOST_ENDIANNESS 0x80
#else
#error Host endianness not defined
#endif

/* GET Dev Caps macros */
#define MLX5_VFIO_CAP_GEN(ctx, cap) \
	DEVX_GET(cmd_hca_cap, ctx->caps.hca_cur[MLX5_CAP_GENERAL], cap)

#define MLX5_VFIO_CAP_GEN_64(mdev, cap) \
	DEVX_GET64(cmd_hca_cap, mdev->caps.hca_cur[MLX5_CAP_GENERAL], cap)

#define MLX5_VFIO_CAP_GEN_MAX(ctx, cap) \
	DEVX_GET(cmd_hca_cap, ctx->caps.hca_max[MLX5_CAP_GENERAL], cap)

#define MLX5_VFIO_CAP_ROCE(ctx, cap) \
	DEVX_GET(roce_cap, ctx->caps.hca_cur[MLX5_CAP_ROCE], cap)

#define MLX5_VFIO_CAP_ROCE_MAX(ctx, cap) \
	DEVX_GET(roce_cap, ctx->caps.hca_max[MLX5_CAP_ROCE], cap)

struct mlx5_vfio_context;

struct mlx5_reg_host_endianness {
	uint8_t he;
	uint8_t rsvd[15];
};

struct health_buffer {
	__be32 assert_var[5];
	__be32 rsvd0[3];
	__be32 assert_exit_ptr;
	__be32 assert_callra;
	__be32 rsvd1[2];
	__be32 fw_ver;
	__be32 hw_id;
	__be32 rfr;
	uint8_t irisc_index;
	uint8_t synd;
	__be16 ext_synd;
};

struct mlx5_init_seg {
	__be32 fw_rev;
	__be32 cmdif_rev_fw_sub;
	__be32 rsvd0[2];
	__be32 cmdq_addr_h;
	__be32 cmdq_addr_l_sz;
	__be32 cmd_dbell;
	__be32 rsvd1[120];
	__be32 initializing;
	struct health_buffer health;
	__be32 rsvd2[880];
	__be32 internal_timer_h;
	__be32 internal_timer_l;
	__be32 rsvd3[2];
	__be32 health_counter;
	__be32 rsvd4[1019];
	__be64 ieee1588_clk;
	__be32 ieee1588_clk_type;
	__be32 clr_intx;
};

struct mlx5_cmd_layout {
	uint8_t type;
	uint8_t rsvd0[3];
	__be32 ilen;
	__be64 iptr;
	__be32 in[4];
	__be32 out[4];
	__be64 optr;
	__be32 olen;
	uint8_t token;
	uint8_t sig;
	uint8_t rsvd1;
	uint8_t status_own;
};

struct mlx5_cmd_block {
	uint8_t data[MLX5_CMD_DATA_BLOCK_SIZE];
	uint8_t rsvd0[48];
	__be64 next;
	__be32 block_num;
	uint8_t rsvd1;
	uint8_t token;
	uint8_t ctrl_sig;
	uint8_t sig;
};

struct page_block {
	void *page_ptr;
	uint64_t iova;
	struct list_node next_block;
	BMP_DECLARE(free_pages, MLX5_VFIO_BLOCK_NUM_PAGES);
};

struct vfio_mem_allocator {
	struct list_head block_list;
	pthread_mutex_t block_list_mutex;
};

struct mlx5_cmd_mailbox {
	void *buf;
	uint64_t iova;
	struct mlx5_cmd_mailbox *next;
};

struct mlx5_cmd_msg {
	uint32_t len;
	struct mlx5_cmd_mailbox *next;
};

typedef int (*vfio_cmd_slot_comp)(struct mlx5_vfio_context *ctx,
				  unsigned long slot);

struct cmd_async_data {
	void *buff_in;
	int ilen;
	void *buff_out;
	int olen;
};

struct mlx5_vfio_cmd_slot {
	struct mlx5_cmd_layout *lay;
	struct
mlx5_cmd_msg in; struct mlx5_cmd_msg out; pthread_mutex_t lock; int completion_event_fd; vfio_cmd_slot_comp comp_func; /* async cmd caller data */ bool in_use; struct cmd_async_data curr; bool is_pending; struct cmd_async_data pending; }; struct mlx5_vfio_cmd { void *vaddr; /* cmd page address */ uint64_t iova; uint8_t log_sz; uint8_t log_stride; struct mlx5_vfio_cmd_slot cmds[MLX5_MAX_COMMANDS]; }; struct mlx5_eq_param { uint8_t irq_index; int nent; uint64_t mask[4]; }; struct mlx5_eq { __be32 *doorbell; uint32_t cons_index; unsigned int vecidx; uint8_t eqn; int nent; void *vaddr; uint64_t iova; uint64_t iova_size; }; struct mlx5_eqe_cmd { __be32 vector; __be32 rsvd[6]; }; struct mlx5_eqe_page_req { __be16 ec_function; __be16 func_id; __be32 num_pages; __be32 rsvd1[5]; }; union ev_data { __be32 raw[7]; struct mlx5_eqe_cmd cmd; struct mlx5_eqe_page_req req_pages; }; struct mlx5_eqe { uint8_t rsvd0; uint8_t type; uint8_t rsvd1; uint8_t sub_type; __be32 rsvd2[7]; union ev_data data; __be16 rsvd3; uint8_t signature; uint8_t owner; }; #define MLX5_EQE_SIZE (sizeof(struct mlx5_eqe)) #define MLX5_NUM_CMD_EQE (32) #define MLX5_NUM_SPARE_EQE (0x80) struct mlx5_vfio_eqs_uar { uint32_t uarn; uint64_t iova; }; #define POLL_HEALTH_INTERVAL 1000 /* ms */ #define MAX_MISSES 3 struct mlx5_vfio_health_state { uint64_t prev_time; /* ms */ uint32_t prev_count; uint32_t miss_counter; }; struct mlx5_vfio_context { struct verbs_context vctx; int container_fd; int group_fd; int device_fd; int cmd_comp_fd; /* command completion FD */ struct iset *iova_alloc; uint64_t iova_min_page_size; FILE *dbg_fp; struct vfio_mem_allocator mem_alloc; struct mlx5_init_seg *bar_map; size_t bar_map_size; struct mlx5_vfio_cmd cmd; bool have_eq; struct { uint32_t hca_cur[MLX5_CAP_NUM][DEVX_UN_SZ_DW(hca_cap_union)]; uint32_t hca_max[MLX5_CAP_NUM][DEVX_UN_SZ_DW(hca_cap_union)]; } caps; struct mlx5_vfio_health_state health_state; struct mlx5_eq async_eq; struct mlx5_vfio_eqs_uar eqs_uar; pthread_mutex_t eq_lock; struct mlx5_dv_context_ops *dv_ctx_ops; int *msix_fds; pthread_mutex_t msix_fds_lock; }; #define MLX5_MAX_DESTROY_INBOX_SIZE_DW DEVX_ST_SZ_DW(delete_fte_in) struct mlx5_devx_obj { struct mlx5dv_devx_obj dv_obj; uint32_t dinbox[MLX5_MAX_DESTROY_INBOX_SIZE_DW]; uint32_t dinlen; }; static inline struct mlx5_vfio_device *to_mvfio_dev(struct ibv_device *ibdev) { return container_of(ibdev, struct mlx5_vfio_device, vdev.device); } static inline struct mlx5_vfio_context *to_mvfio_ctx(struct ibv_context *ibctx) { return container_of(ibctx, struct mlx5_vfio_context, vctx.context); } static inline struct mlx5_vfio_mr *to_mvfio_mr(struct ibv_mr *ibmr) { return container_of(ibmr, struct mlx5_vfio_mr, vmr.ibv_mr); } #endif rdma-core-56.1/providers/mlx5/mlx5dv.h000066400000000000000000001734521477342711600176400ustar00rootroot00000000000000/* * Copyright (c) 2017 Mellanox Technologies, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
 *
 * - Redistributions in binary form must reproduce the above
 *   copyright notice, this list of conditions and the following
 *   disclaimer in the documentation and/or other materials
 *   provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

#ifndef _MLX5DV_H_
#define _MLX5DV_H_

#include <stdio.h>
#include <stdbool.h>
#include <linux/types.h> /* For the __be64 type */
#include <sys/types.h>
#include <endian.h>
#if defined(__SSE3__)
#include <limits.h>
#include <emmintrin.h>
#include <tmmintrin.h>
#endif /* defined(__SSE3__) */

#include <infiniband/verbs.h>
#include <infiniband/tm_types.h>
#include <infiniband/mlx5_api.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Always inline the functions */
#ifdef __GNUC__
#define MLX5DV_ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define MLX5DV_ALWAYS_INLINE inline
#endif

#define MLX5DV_RES_TYPE_QP ((uint64_t)RDMA_DRIVER_MLX5 << 32 | 1)
#define MLX5DV_RES_TYPE_RWQ ((uint64_t)RDMA_DRIVER_MLX5 << 32 | 2)
#define MLX5DV_RES_TYPE_DBR ((uint64_t)RDMA_DRIVER_MLX5 << 32 | 3)
#define MLX5DV_RES_TYPE_SRQ ((uint64_t)RDMA_DRIVER_MLX5 << 32 | 4)
#define MLX5DV_RES_TYPE_CQ ((uint64_t)RDMA_DRIVER_MLX5 << 32 | 5)

enum {
	MLX5_RCV_DBR = 0,
	MLX5_SND_DBR = 1,
};

enum mlx5dv_context_comp_mask {
	MLX5DV_CONTEXT_MASK_CQE_COMPRESION = 1 << 0,
	MLX5DV_CONTEXT_MASK_SWP = 1 << 1,
	MLX5DV_CONTEXT_MASK_STRIDING_RQ = 1 << 2,
	MLX5DV_CONTEXT_MASK_TUNNEL_OFFLOADS = 1 << 3,
	MLX5DV_CONTEXT_MASK_DYN_BFREGS = 1 << 4,
	MLX5DV_CONTEXT_MASK_CLOCK_INFO_UPDATE = 1 << 5,
	MLX5DV_CONTEXT_MASK_FLOW_ACTION_FLAGS = 1 << 6,
	MLX5DV_CONTEXT_MASK_DC_ODP_CAPS = 1 << 7,
	MLX5DV_CONTEXT_MASK_HCA_CORE_CLOCK = 1 << 8,
	MLX5DV_CONTEXT_MASK_NUM_LAG_PORTS = 1 << 9,
	MLX5DV_CONTEXT_MASK_SIGNATURE_OFFLOAD = 1 << 10,
	MLX5DV_CONTEXT_MASK_DCI_STREAMS = 1 << 11,
	MLX5DV_CONTEXT_MASK_WR_MEMCPY_LENGTH = 1 << 12,
	MLX5DV_CONTEXT_MASK_CRYPTO_OFFLOAD = 1 << 13,
	MLX5DV_CONTEXT_MASK_MAX_DC_RD_ATOM = 1 << 14,
	MLX5DV_CONTEXT_MASK_REG_C0 = 1 << 15,
	MLX5DV_CONTEXT_MASK_OOO_RECV_WRS = 1 << 16,
};

struct mlx5dv_cqe_comp_caps {
	uint32_t max_num;
	uint32_t supported_format; /* enum mlx5dv_cqe_comp_res_format */
};

struct mlx5dv_sw_parsing_caps {
	uint32_t sw_parsing_offloads; /* Use enum mlx5dv_sw_parsing_offloads */
	uint32_t supported_qpts;
};

struct mlx5dv_striding_rq_caps {
	uint32_t min_single_stride_log_num_of_bytes;
	uint32_t max_single_stride_log_num_of_bytes;
	uint32_t min_single_wqe_log_num_of_strides;
	uint32_t max_single_wqe_log_num_of_strides;
	uint32_t supported_qpts;
};

struct mlx5dv_dci_streams_caps {
	uint8_t max_log_num_concurent;
	uint8_t max_log_num_errored;
};

enum mlx5dv_tunnel_offloads {
	MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_VXLAN = 1 << 0,
	MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GRE = 1 << 1,
	MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GENEVE = 1 << 2,
	MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_CW_MPLS_OVER_GRE = 1 << 3,
	MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_CW_MPLS_OVER_UDP = 1 << 4,
};

enum mlx5dv_flow_action_cap_flags {
	MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM = 1 << 0,
	MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM_REQ_METADATA = 1 << 1,
	MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM_SPI_STEERING = 1 << 2,
	MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM_FULL_OFFLOAD = 1 << 3,
	MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM_TX_IV_IS_ESN = 1 << 4,
};

enum mlx5dv_sig_type {
MLX5DV_SIG_TYPE_T10DIF, MLX5DV_SIG_TYPE_CRC, }; enum mlx5dv_sig_prot_caps { MLX5DV_SIG_PROT_CAP_T10DIF = 1 << MLX5DV_SIG_TYPE_T10DIF, MLX5DV_SIG_PROT_CAP_CRC = 1 << MLX5DV_SIG_TYPE_CRC, }; enum mlx5dv_sig_t10dif_bg_type { MLX5DV_SIG_T10DIF_CRC, MLX5DV_SIG_T10DIF_CSUM, }; enum mlx5dv_sig_t10dif_bg_caps { MLX5DV_SIG_T10DIF_BG_CAP_CRC = 1 << MLX5DV_SIG_T10DIF_CRC, MLX5DV_SIG_T10DIF_BG_CAP_CSUM = 1 << MLX5DV_SIG_T10DIF_CSUM, }; enum mlx5dv_sig_crc_type { MLX5DV_SIG_CRC_TYPE_CRC32, MLX5DV_SIG_CRC_TYPE_CRC32C, MLX5DV_SIG_CRC_TYPE_CRC64_XP10, }; enum mlx5dv_sig_crc_type_caps { MLX5DV_SIG_CRC_TYPE_CAP_CRC32 = 1 << MLX5DV_SIG_CRC_TYPE_CRC32, MLX5DV_SIG_CRC_TYPE_CAP_CRC32C = 1 << MLX5DV_SIG_CRC_TYPE_CRC32C, MLX5DV_SIG_CRC_TYPE_CAP_CRC64_XP10 = 1 << MLX5DV_SIG_CRC_TYPE_CRC64_XP10, }; enum mlx5dv_block_size { MLX5DV_BLOCK_SIZE_512, MLX5DV_BLOCK_SIZE_520, MLX5DV_BLOCK_SIZE_4048, MLX5DV_BLOCK_SIZE_4096, MLX5DV_BLOCK_SIZE_4160, }; enum mlx5dv_block_size_caps { MLX5DV_BLOCK_SIZE_CAP_512 = 1 << MLX5DV_BLOCK_SIZE_512, MLX5DV_BLOCK_SIZE_CAP_520 = 1 << MLX5DV_BLOCK_SIZE_520, MLX5DV_BLOCK_SIZE_CAP_4048 = 1 << MLX5DV_BLOCK_SIZE_4048, MLX5DV_BLOCK_SIZE_CAP_4096 = 1 << MLX5DV_BLOCK_SIZE_4096, MLX5DV_BLOCK_SIZE_CAP_4160 = 1 << MLX5DV_BLOCK_SIZE_4160, }; struct mlx5dv_sig_caps { uint64_t block_size; /* use enum mlx5dv_block_size_caps */ uint32_t block_prot; /* use enum mlx5dv_sig_prot_caps */ uint16_t t10dif_bg; /* use enum mlx5dv_sig_t10dif_bg_caps */ uint16_t crc_type; /* use enum mlx5dv_sig_crc_type_caps */ }; enum mlx5dv_crypto_engines_caps { MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS = 1 << 0, MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_SINGLE_BLOCK = 1 << 1, MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_MULTI_BLOCK = 1 << 2, }; enum mlx5dv_crypto_wrapped_import_method_caps { MLX5DV_CRYPTO_WRAPPED_IMPORT_METHOD_CAP_AES_XTS = 1 << 0, }; enum mlx5dv_crypto_caps_flags { MLX5DV_CRYPTO_CAPS_CRYPTO = 1 << 0, MLX5DV_CRYPTO_CAPS_WRAPPED_CRYPTO_OPERATIONAL = 1 << 1, MLX5DV_CRYPTO_CAPS_WRAPPED_CRYPTO_GOING_TO_COMMISSIONING = 1 << 2, }; struct mlx5dv_crypto_caps { /* * if failed_selftests != 0 it means there are some self tests errors * that may render specific crypto engines unusable. Exact code meaning * should be consulted with NVIDIA. */ uint16_t failed_selftests; uint8_t crypto_engines; /* use enum mlx5dv_crypto_engines_caps */ uint8_t wrapped_import_method; /* use enum mlx5dv_crypto_wrapped_import_method_caps */ uint8_t log_max_num_deks; uint32_t flags; /* use enum mlx5dv_crypto_caps_flags */ }; struct mlx5dv_ooo_recv_wrs_caps { uint32_t max_rc; uint32_t max_xrc; uint32_t max_dct; uint32_t max_ud; uint32_t max_uc; }; /* * Direct verbs device-specific attributes */ struct mlx5dv_context { uint8_t version; uint64_t flags; uint64_t comp_mask; struct mlx5dv_cqe_comp_caps cqe_comp_caps; struct mlx5dv_sw_parsing_caps sw_parsing_caps; struct mlx5dv_striding_rq_caps striding_rq_caps; uint32_t tunnel_offloads_caps; uint32_t max_dynamic_bfregs; uint64_t max_clock_info_update_nsec; uint32_t flow_action_flags; /* use enum mlx5dv_flow_action_cap_flags */ uint32_t dc_odp_caps; /* use enum ibv_odp_transport_cap_bits */ void *hca_core_clock; uint8_t num_lag_ports; struct mlx5dv_sig_caps sig_caps; struct mlx5dv_dci_streams_caps dci_streams_caps; size_t max_wr_memcpy_length; struct mlx5dv_crypto_caps crypto_caps; uint64_t max_dc_rd_atom; uint64_t max_dc_init_rd_atom; struct mlx5dv_reg reg_c0; struct mlx5dv_ooo_recv_wrs_caps ooo_recv_wrs_caps; }; enum mlx5dv_context_flags { /* * This flag indicates if CQE version 0 or 1 is needed. 
*/ MLX5DV_CONTEXT_FLAGS_CQE_V1 = (1 << 0), MLX5DV_CONTEXT_FLAGS_OBSOLETE = (1 << 1), /* Obsoleted, don't use */ MLX5DV_CONTEXT_FLAGS_MPW_ALLOWED = (1 << 2), MLX5DV_CONTEXT_FLAGS_ENHANCED_MPW = (1 << 3), MLX5DV_CONTEXT_FLAGS_CQE_128B_COMP = (1 << 4), /* Support CQE 128B compression */ MLX5DV_CONTEXT_FLAGS_CQE_128B_PAD = (1 << 5), /* Support CQE 128B padding */ MLX5DV_CONTEXT_FLAGS_PACKET_BASED_CREDIT_MODE = (1 << 6), MLX5DV_CONTEXT_FLAGS_REAL_TIME_TS = (1 << 7), }; enum mlx5dv_cq_init_attr_mask { MLX5DV_CQ_INIT_ATTR_MASK_COMPRESSED_CQE = 1 << 0, MLX5DV_CQ_INIT_ATTR_MASK_FLAGS = 1 << 1, MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE = 1 << 2, }; enum mlx5dv_cq_init_attr_flags { MLX5DV_CQ_INIT_ATTR_FLAGS_CQE_PAD = 1 << 0, MLX5DV_CQ_INIT_ATTR_FLAGS_RESERVED = 1 << 1, }; struct mlx5dv_cq_init_attr { uint64_t comp_mask; /* Use enum mlx5dv_cq_init_attr_mask */ uint8_t cqe_comp_res_format; /* Use enum mlx5dv_cqe_comp_res_format */ uint32_t flags; /* Use enum mlx5dv_cq_init_attr_flags */ uint16_t cqe_size; /* when MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE set */ }; struct ibv_cq_ex *mlx5dv_create_cq(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr, struct mlx5dv_cq_init_attr *mlx5_cq_attr); enum mlx5dv_qp_create_flags { MLX5DV_QP_CREATE_TUNNEL_OFFLOADS = 1 << 0, MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_UC = 1 << 1, MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_MC = 1 << 2, MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE = 1 << 3, MLX5DV_QP_CREATE_ALLOW_SCATTER_TO_CQE = 1 << 4, MLX5DV_QP_CREATE_PACKET_BASED_CREDIT_MODE = 1 << 5, MLX5DV_QP_CREATE_SIG_PIPELINING = 1 << 6, MLX5DV_QP_CREATE_OOO_DP = 1 << 7, }; enum mlx5dv_mkey_init_attr_flags { MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT = 1 << 0, MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE = 1 << 1, MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO = 1 << 2, MLX5DV_MKEY_INIT_ATTR_FLAGS_UPDATE_TAG = 1 << 3, MLX5DV_MKEY_INIT_ATTR_FLAGS_REMOTE_INVALIDATE = 1 << 4, }; struct mlx5dv_mkey_init_attr { struct ibv_pd *pd; uint32_t create_flags; /* Use enum mlx5dv_mkey_init_attr_flags */ uint16_t max_entries; /* Requested max number of pointed entries by this indirect mkey */ }; struct mlx5dv_mkey { uint32_t lkey; uint32_t rkey; }; struct mlx5dv_mkey *mlx5dv_create_mkey(struct mlx5dv_mkey_init_attr *mkey_init_attr); int mlx5dv_destroy_mkey(struct mlx5dv_mkey *mkey); enum mlx5dv_qp_init_attr_mask { MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS = 1 << 0, MLX5DV_QP_INIT_ATTR_MASK_DC = 1 << 1, MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS = 1 << 2, MLX5DV_QP_INIT_ATTR_MASK_DCI_STREAMS = 1 << 3, }; enum mlx5dv_dc_type { MLX5DV_DCTYPE_DCT = 1, MLX5DV_DCTYPE_DCI, }; struct mlx5dv_dci_streams { uint8_t log_num_concurent; uint8_t log_num_errored; }; struct mlx5dv_dc_init_attr { enum mlx5dv_dc_type dc_type; union { uint64_t dct_access_key; struct mlx5dv_dci_streams dci_streams; }; }; enum mlx5dv_qp_create_send_ops_flags { MLX5DV_QP_EX_WITH_MR_INTERLEAVED = 1 << 0, MLX5DV_QP_EX_WITH_MR_LIST = 1 << 1, MLX5DV_QP_EX_WITH_MKEY_CONFIGURE = 1 << 2, MLX5DV_QP_EX_WITH_RAW_WQE = 1 << 3, MLX5DV_QP_EX_WITH_MEMCPY = 1 << 4, }; struct mlx5dv_qp_init_attr { uint64_t comp_mask; /* Use enum mlx5dv_qp_init_attr_mask */ uint32_t create_flags; /* Use enum mlx5dv_qp_create_flags */ struct mlx5dv_dc_init_attr dc_init_attr; uint64_t send_ops_flags; /* Use enum mlx5dv_qp_create_send_ops_flags */ }; struct ibv_qp *mlx5dv_create_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr); struct mlx5dv_mr_interleaved { uint64_t addr; uint32_t bytes_count; uint32_t bytes_skip; uint32_t lkey; }; 
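/*
 * Usage sketch (illustrative only, not part of this header): creating a DCI
 * QP through mlx5dv_create_qp() above.  The surrounding verbs objects (ctx,
 * pd, cq) and all error handling are assumed to exist in the caller, and the
 * capability values are arbitrary:
 *
 *	struct ibv_qp_init_attr_ex attr_ex = {
 *		.qp_type = IBV_QPT_DRIVER,
 *		.send_cq = cq,
 *		.recv_cq = cq,
 *		.pd = pd,
 *		.comp_mask = IBV_QP_INIT_ATTR_PD |
 *			     IBV_QP_INIT_ATTR_SEND_OPS_FLAGS,
 *		.send_ops_flags = IBV_QP_EX_WITH_SEND,
 *		.cap = { .max_send_wr = 64, .max_send_sge = 1 },
 *	};
 *	struct mlx5dv_qp_init_attr attr_dv = {
 *		.comp_mask = MLX5DV_QP_INIT_ATTR_MASK_DC,
 *		.dc_init_attr = { .dc_type = MLX5DV_DCTYPE_DCI },
 *	};
 *	struct ibv_qp *dci = mlx5dv_create_qp(ctx, &attr_ex, &attr_dv);
 *
 * A DCT (target) side would instead set .dc_type = MLX5DV_DCTYPE_DCT with a
 * dct_access_key, and supplies an SRQ in the init attributes.
 */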
enum mlx5dv_sig_t10dif_flags { MLX5DV_SIG_T10DIF_FLAG_REF_REMAP = 1 << 0, MLX5DV_SIG_T10DIF_FLAG_APP_ESCAPE = 1 << 1, MLX5DV_SIG_T10DIF_FLAG_APP_REF_ESCAPE = 1 << 2, }; struct mlx5dv_sig_t10dif { enum mlx5dv_sig_t10dif_bg_type bg_type; uint16_t bg; uint16_t app_tag; uint32_t ref_tag; uint16_t flags; /* Use enum mlx5dv_sig_t10dif_flags */ }; struct mlx5dv_sig_crc { enum mlx5dv_sig_crc_type type; uint64_t seed; }; struct mlx5dv_sig_block_domain { enum mlx5dv_sig_type sig_type; union { const struct mlx5dv_sig_t10dif *dif; const struct mlx5dv_sig_crc *crc; } sig; enum mlx5dv_block_size block_size; uint64_t comp_mask; }; enum mlx5dv_sig_mask { MLX5DV_SIG_MASK_T10DIF_GUARD = 0xc0, MLX5DV_SIG_MASK_T10DIF_APPTAG = 0x30, MLX5DV_SIG_MASK_T10DIF_REFTAG = 0x0f, MLX5DV_SIG_MASK_CRC32 = 0xf0, MLX5DV_SIG_MASK_CRC32C = MLX5DV_SIG_MASK_CRC32, MLX5DV_SIG_MASK_CRC64_XP10 = 0xff, }; enum mlx5dv_sig_block_attr_flags { MLX5DV_SIG_BLOCK_ATTR_FLAG_COPY_MASK = 1 << 0, }; struct mlx5dv_sig_block_attr { const struct mlx5dv_sig_block_domain *mem; const struct mlx5dv_sig_block_domain *wire; uint32_t flags; /* Use enum mlx5dv_sig_block_attr_flags */ uint8_t check_mask; uint8_t copy_mask; uint64_t comp_mask; }; enum mlx5dv_crypto_standard { MLX5DV_CRYPTO_STANDARD_AES_XTS, }; enum mlx5dv_signature_crypto_order { MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_AFTER_CRYPTO_ON_TX, MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_BEFORE_CRYPTO_ON_TX, }; struct mlx5dv_crypto_attr { enum mlx5dv_crypto_standard crypto_standard; bool encrypt_on_tx; enum mlx5dv_signature_crypto_order signature_crypto_order; enum mlx5dv_block_size data_unit_size; char initial_tweak[16]; struct mlx5dv_dek *dek; char keytag[8]; uint64_t comp_mask; }; enum mlx5dv_mkey_conf_flags { MLX5DV_MKEY_CONF_FLAG_RESET_SIG_ATTR = 1 << 0, }; struct mlx5dv_mkey_conf_attr { uint32_t conf_flags; /* Use enum mlx5dv_mkey_conf_flags */ uint64_t comp_mask; }; enum mlx5dv_wc_opcode { MLX5DV_WC_UMR = IBV_WC_DRIVER1, MLX5DV_WC_RAW_WQE = IBV_WC_DRIVER2, MLX5DV_WC_MEMCPY = IBV_WC_DRIVER3, }; struct mlx5dv_qp_ex { uint64_t comp_mask; /* * Available just for the MLX5 DC QP type with send opcodes of type: * rdma, atomic and send. 
*/ void (*wr_set_dc_addr)(struct mlx5dv_qp_ex *mqp, struct ibv_ah *ah, uint32_t remote_dctn, uint64_t remote_dc_key); void (*wr_mr_interleaved)(struct mlx5dv_qp_ex *mqp, struct mlx5dv_mkey *mkey, uint32_t access_flags, /* use enum ibv_access_flags */ uint32_t repeat_count, uint16_t num_interleaved, struct mlx5dv_mr_interleaved *data); void (*wr_mr_list)(struct mlx5dv_qp_ex *mqp, struct mlx5dv_mkey *mkey, uint32_t access_flags, /* use enum ibv_access_flags */ uint16_t num_sges, struct ibv_sge *sge); void (*wr_mkey_configure)(struct mlx5dv_qp_ex *mqp, struct mlx5dv_mkey *mkey, uint8_t num_setters, struct mlx5dv_mkey_conf_attr *attr); void (*wr_set_mkey_access_flags)(struct mlx5dv_qp_ex *mqp, uint32_t access_flags); void (*wr_set_mkey_layout_list)(struct mlx5dv_qp_ex *mqp, uint16_t num_sges, const struct ibv_sge *sge); void (*wr_set_mkey_layout_interleaved)( struct mlx5dv_qp_ex *mqp, uint32_t repeat_count, uint16_t num_interleaved, const struct mlx5dv_mr_interleaved *data); void (*wr_set_mkey_sig_block)(struct mlx5dv_qp_ex *mqp, const struct mlx5dv_sig_block_attr *attr); void (*wr_raw_wqe)(struct mlx5dv_qp_ex *mqp, const void *wqe); void (*wr_set_dc_addr_stream)(struct mlx5dv_qp_ex *mqp, struct ibv_ah *ah, uint32_t remote_dctn, uint64_t remote_dc_key, uint16_t stream_id); void (*wr_memcpy)(struct mlx5dv_qp_ex *mqp, uint32_t dest_lkey, uint64_t dest_addr, uint32_t src_lkey, uint64_t src_addr, size_t length); void (*wr_set_mkey_crypto)(struct mlx5dv_qp_ex *mqp, const struct mlx5dv_crypto_attr *attr); }; struct mlx5dv_qp_ex *mlx5dv_qp_ex_from_ibv_qp_ex(struct ibv_qp_ex *qp); static inline void mlx5dv_wr_set_dc_addr(struct mlx5dv_qp_ex *mqp, struct ibv_ah *ah, uint32_t remote_dctn, uint64_t remote_dc_key) { mqp->wr_set_dc_addr(mqp, ah, remote_dctn, remote_dc_key); } static inline void mlx5dv_wr_set_dc_addr_stream(struct mlx5dv_qp_ex *mqp, struct ibv_ah *ah, uint32_t remote_dctn, uint64_t remote_dc_key, uint16_t stream_id) { mqp->wr_set_dc_addr_stream(mqp, ah, remote_dctn, remote_dc_key, stream_id); } static inline void mlx5dv_wr_mr_interleaved(struct mlx5dv_qp_ex *mqp, struct mlx5dv_mkey *mkey, uint32_t access_flags, uint32_t repeat_count, uint16_t num_interleaved, struct mlx5dv_mr_interleaved *data) { mqp->wr_mr_interleaved(mqp, mkey, access_flags, repeat_count, num_interleaved, data); } static inline void mlx5dv_wr_mr_list(struct mlx5dv_qp_ex *mqp, struct mlx5dv_mkey *mkey, uint32_t access_flags, uint16_t num_sges, struct ibv_sge *sge) { mqp->wr_mr_list(mqp, mkey, access_flags, num_sges, sge); } static inline void mlx5dv_wr_mkey_configure(struct mlx5dv_qp_ex *mqp, struct mlx5dv_mkey *mkey, uint8_t num_setters, struct mlx5dv_mkey_conf_attr *attr) { mqp->wr_mkey_configure(mqp, mkey, num_setters, attr); } static inline void mlx5dv_wr_set_mkey_access_flags(struct mlx5dv_qp_ex *mqp, uint32_t access_flags) { mqp->wr_set_mkey_access_flags(mqp, access_flags); } static inline void mlx5dv_wr_set_mkey_layout_list(struct mlx5dv_qp_ex *mqp, uint16_t num_sges, const struct ibv_sge *sge) { mqp->wr_set_mkey_layout_list(mqp, num_sges, sge); } static inline void mlx5dv_wr_set_mkey_layout_interleaved(struct mlx5dv_qp_ex *mqp, uint32_t repeat_count, uint16_t num_interleaved, const struct mlx5dv_mr_interleaved *data) { mqp->wr_set_mkey_layout_interleaved(mqp, repeat_count, num_interleaved, data); } static inline void mlx5dv_wr_set_mkey_sig_block(struct mlx5dv_qp_ex *mqp, const struct mlx5dv_sig_block_attr *attr) { mqp->wr_set_mkey_sig_block(mqp, attr); } static inline void mlx5dv_wr_set_mkey_crypto(struct 
mlx5dv_qp_ex *mqp, const struct mlx5dv_crypto_attr *attr) { mqp->wr_set_mkey_crypto(mqp, attr); } static inline void mlx5dv_wr_memcpy(struct mlx5dv_qp_ex *mqp, uint32_t dest_lkey, uint64_t dest_addr, uint32_t src_lkey, uint64_t src_addr, size_t length) { mqp->wr_memcpy(mqp, dest_lkey, dest_addr, src_lkey, src_addr, length); } enum mlx5dv_mkey_err_type { MLX5DV_MKEY_NO_ERR, MLX5DV_MKEY_SIG_BLOCK_BAD_GUARD, MLX5DV_MKEY_SIG_BLOCK_BAD_REFTAG, MLX5DV_MKEY_SIG_BLOCK_BAD_APPTAG, }; struct mlx5dv_sig_err { uint64_t actual_value; uint64_t expected_value; uint64_t offset; }; struct mlx5dv_mkey_err { enum mlx5dv_mkey_err_type err_type; union { struct mlx5dv_sig_err sig; } err; }; int _mlx5dv_mkey_check(struct mlx5dv_mkey *mkey, struct mlx5dv_mkey_err *err_info, size_t err_info_size); static inline int mlx5dv_mkey_check(struct mlx5dv_mkey *mkey, struct mlx5dv_mkey_err *err_info) { return _mlx5dv_mkey_check(mkey, err_info, sizeof(*err_info)); } int mlx5dv_qp_cancel_posted_send_wrs(struct mlx5dv_qp_ex *mqp, uint64_t wr_id); static inline void mlx5dv_wr_raw_wqe(struct mlx5dv_qp_ex *mqp, const void *wqe) { mqp->wr_raw_wqe(mqp, wqe); } struct mlx5dv_crypto_login_obj; struct mlx5dv_crypto_login_attr { uint32_t credential_id; uint32_t import_kek_id; char credential[48]; uint64_t comp_mask; }; struct mlx5dv_crypto_login_attr_ex { uint32_t credential_id; uint32_t import_kek_id; const void *credential; size_t credential_len; uint64_t comp_mask; }; enum mlx5dv_crypto_login_state { MLX5DV_CRYPTO_LOGIN_STATE_VALID, MLX5DV_CRYPTO_LOGIN_STATE_NO_LOGIN, MLX5DV_CRYPTO_LOGIN_STATE_INVALID, }; struct mlx5dv_crypto_login_query_attr { enum mlx5dv_crypto_login_state state; uint64_t comp_mask; }; int mlx5dv_crypto_login(struct ibv_context *context, struct mlx5dv_crypto_login_attr *login_attr); int mlx5dv_crypto_login_query_state(struct ibv_context *context, enum mlx5dv_crypto_login_state *state); int mlx5dv_crypto_logout(struct ibv_context *context); struct mlx5dv_crypto_login_obj * mlx5dv_crypto_login_create(struct ibv_context *context, struct mlx5dv_crypto_login_attr_ex *login_attr); int mlx5dv_crypto_login_query(struct mlx5dv_crypto_login_obj *crypto_login, struct mlx5dv_crypto_login_query_attr *query_attr); int mlx5dv_crypto_login_destroy(struct mlx5dv_crypto_login_obj *crypto_login); enum mlx5dv_crypto_key_size { MLX5DV_CRYPTO_KEY_SIZE_128, MLX5DV_CRYPTO_KEY_SIZE_256, }; enum mlx5dv_crypto_key_purpose { MLX5DV_CRYPTO_KEY_PURPOSE_AES_XTS, }; enum mlx5dv_dek_state { MLX5DV_DEK_STATE_READY, MLX5DV_DEK_STATE_ERROR, }; enum mlx5dv_dek_init_attr_mask { MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN = 1 << 0, }; struct mlx5dv_dek_init_attr { enum mlx5dv_crypto_key_size key_size; bool has_keytag; enum mlx5dv_crypto_key_purpose key_purpose; struct ibv_pd *pd; char opaque[8]; char key[128]; uint64_t comp_mask; struct mlx5dv_crypto_login_obj *crypto_login; }; struct mlx5dv_dek_attr { enum mlx5dv_dek_state state; char opaque[8]; uint64_t comp_mask; }; struct mlx5dv_dek; struct mlx5dv_dek *mlx5dv_dek_create(struct ibv_context *context, struct mlx5dv_dek_init_attr *init_attr); int mlx5dv_dek_query(struct mlx5dv_dek *dek, struct mlx5dv_dek_attr *attr); int mlx5dv_dek_destroy(struct mlx5dv_dek *dek); enum mlx5dv_flow_action_esp_mask { MLX5DV_FLOW_ACTION_ESP_MASK_FLAGS = 1 << 0, }; struct mlx5dv_flow_action_esp { uint64_t comp_mask; /* Use enum mlx5dv_flow_action_esp_mask */ uint32_t action_flags; /* Use enum mlx5dv_flow_action_flags */ }; struct mlx5dv_flow_match_parameters { size_t match_sz; uint64_t match_buf[]; /* Device spec format */ }; enum 
mlx5dv_flow_matcher_attr_mask { MLX5DV_FLOW_MATCHER_MASK_FT_TYPE = 1 << 0, }; struct mlx5dv_flow_matcher_attr { enum ibv_flow_attr_type type; uint32_t flags; /* From enum ibv_flow_flags */ uint16_t priority; uint8_t match_criteria_enable; /* Device spec format */ struct mlx5dv_flow_match_parameters *match_mask; uint64_t comp_mask; /* use mlx5dv_flow_matcher_attr_mask */ enum mlx5dv_flow_table_type ft_type; }; struct mlx5dv_flow_matcher; struct mlx5dv_flow_matcher * mlx5dv_create_flow_matcher(struct ibv_context *context, struct mlx5dv_flow_matcher_attr *matcher_attr); int mlx5dv_destroy_flow_matcher(struct mlx5dv_flow_matcher *matcher); struct mlx5dv_steering_anchor_attr { enum mlx5dv_flow_table_type ft_type; uint16_t priority; uint64_t comp_mask; }; struct mlx5dv_steering_anchor { uint32_t id; }; struct mlx5dv_steering_anchor * mlx5dv_create_steering_anchor(struct ibv_context *context, struct mlx5dv_steering_anchor_attr *attr); int mlx5dv_destroy_steering_anchor(struct mlx5dv_steering_anchor *sa); enum mlx5dv_flow_action_type { MLX5DV_FLOW_ACTION_DEST_IBV_QP, MLX5DV_FLOW_ACTION_DROP, MLX5DV_FLOW_ACTION_IBV_COUNTER, MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION, MLX5DV_FLOW_ACTION_TAG, MLX5DV_FLOW_ACTION_DEST_DEVX, MLX5DV_FLOW_ACTION_COUNTERS_DEVX, MLX5DV_FLOW_ACTION_DEFAULT_MISS, }; struct mlx5dv_flow_action_attr { enum mlx5dv_flow_action_type type; union { struct ibv_qp *qp; struct ibv_counters *counter; struct ibv_flow_action *action; uint32_t tag_value; struct mlx5dv_devx_obj *obj; }; }; struct ibv_flow * mlx5dv_create_flow(struct mlx5dv_flow_matcher *matcher, struct mlx5dv_flow_match_parameters *match_value, size_t num_actions, struct mlx5dv_flow_action_attr actions_attr[]); struct ibv_flow_action *mlx5dv_create_flow_action_esp(struct ibv_context *ctx, struct ibv_flow_action_esp_attr *esp, struct mlx5dv_flow_action_esp *mlx5_attr); /* * mlx5dv_create_flow_action_modify_header - Create a flow action which mutates * a packet. The flow action can be attached to steering rules via * ibv_create_flow(). * * @ctx: RDMA device context to create the action on. * @actions_sz: The size of *actions* buffer in bytes. * @actions: A buffer which contains modify actions provided in device spec * format. * @ft_type: Defines the flow table type to which the modify * header action will be attached. * * Return a valid ibv_flow_action if successful, NULL otherwise. */ struct ibv_flow_action * mlx5dv_create_flow_action_modify_header(struct ibv_context *ctx, size_t actions_sz, uint64_t actions[], enum mlx5dv_flow_table_type ft_type); /* * mlx5dv_create_flow_action_packet_reformat - Create flow action which can * encap/decap packets. */ struct ibv_flow_action * mlx5dv_create_flow_action_packet_reformat(struct ibv_context *ctx, size_t data_sz, void *data, enum mlx5dv_flow_action_packet_reformat_type reformat_type, enum mlx5dv_flow_table_type ft_type); /* * Most device capabilities are exported by ibv_query_device(...), * but there is HW device-specific information which is important * for data-path, but isn't provided. * * Return 0 on success. 
*/ int mlx5dv_query_device(struct ibv_context *ctx_in, struct mlx5dv_context *attrs_out); int mlx5dv_map_ah_to_qp(struct ibv_ah *ah, uint32_t qp_num); enum mlx5dv_qp_comp_mask { MLX5DV_QP_MASK_UAR_MMAP_OFFSET = 1 << 0, MLX5DV_QP_MASK_RAW_QP_HANDLES = 1 << 1, MLX5DV_QP_MASK_RAW_QP_TIR_ADDR = 1 << 2, }; struct mlx5dv_qp { __be32 *dbrec; struct { void *buf; uint32_t wqe_cnt; uint32_t stride; } sq; struct { void *buf; uint32_t wqe_cnt; uint32_t stride; } rq; struct { void *reg; uint32_t size; } bf; uint64_t comp_mask; off_t uar_mmap_offset; uint32_t tirn; uint32_t tisn; uint32_t rqn; uint32_t sqn; uint64_t tir_icm_addr; }; struct mlx5dv_cq { void *buf; __be32 *dbrec; uint32_t cqe_cnt; uint32_t cqe_size; void *cq_uar; uint32_t cqn; uint64_t comp_mask; }; enum mlx5dv_srq_comp_mask { MLX5DV_SRQ_MASK_SRQN = 1 << 0, }; struct mlx5dv_srq { void *buf; __be32 *dbrec; uint32_t stride; uint32_t head; uint32_t tail; uint64_t comp_mask; uint32_t srqn; }; struct mlx5dv_rwq { void *buf; __be32 *dbrec; uint32_t wqe_cnt; uint32_t stride; uint64_t comp_mask; }; struct mlx5dv_alloc_dm_attr { enum mlx5dv_alloc_dm_type type; uint64_t comp_mask; }; enum mlx5dv_dm_comp_mask { MLX5DV_DM_MASK_REMOTE_VA = 1 << 0, }; struct mlx5dv_dm { void *buf; uint64_t length; uint64_t comp_mask; uint64_t remote_va; }; struct ibv_dm *mlx5dv_alloc_dm(struct ibv_context *context, struct ibv_alloc_dm_attr *dm_attr, struct mlx5dv_alloc_dm_attr *mlx5_dm_attr); void *mlx5dv_dm_map_op_addr(struct ibv_dm *dm, uint8_t op); struct ibv_mr *mlx5dv_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access, int mlx5_access); int mlx5dv_get_data_direct_sysfs_path(struct ibv_context *context, char *buf, size_t buf_len); struct mlx5_wqe_av; struct mlx5dv_ah { struct mlx5_wqe_av *av; uint64_t comp_mask; }; struct mlx5dv_pd { uint32_t pdn; uint64_t comp_mask; }; struct mlx5dv_devx { uint32_t handle; }; struct mlx5dv_obj { struct { struct ibv_qp *in; struct mlx5dv_qp *out; } qp; struct { struct ibv_cq *in; struct mlx5dv_cq *out; } cq; struct { struct ibv_srq *in; struct mlx5dv_srq *out; } srq; struct { struct ibv_wq *in; struct mlx5dv_rwq *out; } rwq; struct { struct ibv_dm *in; struct mlx5dv_dm *out; } dm; struct { struct ibv_ah *in; struct mlx5dv_ah *out; } ah; struct { struct ibv_pd *in; struct mlx5dv_pd *out; } pd; struct { struct mlx5dv_devx_obj *in; struct mlx5dv_devx *out; } devx; }; enum mlx5dv_obj_type { MLX5DV_OBJ_QP = 1 << 0, MLX5DV_OBJ_CQ = 1 << 1, MLX5DV_OBJ_SRQ = 1 << 2, MLX5DV_OBJ_RWQ = 1 << 3, MLX5DV_OBJ_DM = 1 << 4, MLX5DV_OBJ_AH = 1 << 5, MLX5DV_OBJ_PD = 1 << 6, MLX5DV_OBJ_DEVX = 1 << 7, }; enum mlx5dv_wq_init_attr_mask { MLX5DV_WQ_INIT_ATTR_MASK_STRIDING_RQ = 1 << 0, }; struct mlx5dv_striding_rq_init_attr { uint32_t single_stride_log_num_of_bytes; uint32_t single_wqe_log_num_of_strides; uint8_t two_byte_shift_en; }; struct mlx5dv_wq_init_attr { uint64_t comp_mask; /* Use enum mlx5dv_wq_init_attr_mask */ struct mlx5dv_striding_rq_init_attr striding_rq_attrs; }; /* * This function creates a work queue object with extra properties * defined by mlx5dv_wq_init_attr struct. * * For each bit in the comp_mask, a field in mlx5dv_wq_init_attr * should follow. * * MLX5DV_WQ_INIT_ATTR_MASK_STRIDING_RQ: Create a work queue with * striding RQ capabilities. * - single_stride_log_num_of_bytes represents the size of each stride in the * WQE and its value should be between min_single_stride_log_num_of_bytes * and max_single_stride_log_num_of_bytes that are reported in * mlx5dv_query_device. 
* single_wqe_log_num_of_strides represents the number of strides in each WQE. * Its value should be between min_single_wqe_log_num_of_strides and * max_single_wqe_log_num_of_strides that are reported in mlx5dv_query_device. * - two_byte_shift_en: When enabled, hardware pads 2 bytes of zeroes * before writing the message to memory (e.g. for IP alignment) */ struct ibv_wq *mlx5dv_create_wq(struct ibv_context *context, struct ibv_wq_init_attr *wq_init_attr, struct mlx5dv_wq_init_attr *mlx5_wq_attr); /* * This function will initialize mlx5dv_xxx structs based on the supplied type. * The information for initialization is taken from either the ibv_xx or * mlx5dv_xxx structs supplied as part of the input. * * Requesting information for a CQ marks it as owned by DV for all * consumer-index-related actions. * * The initialization type can be a combination of several types together. * * Return: 0 in case of success. */ int mlx5dv_init_obj(struct mlx5dv_obj *obj, uint64_t obj_type); enum { MLX5_OPCODE_NOP = 0x00, MLX5_OPCODE_SEND_INVAL = 0x01, MLX5_OPCODE_RDMA_WRITE = 0x08, MLX5_OPCODE_RDMA_WRITE_IMM = 0x09, MLX5_OPCODE_SEND = 0x0a, MLX5_OPCODE_SEND_IMM = 0x0b, MLX5_OPCODE_TSO = 0x0e, MLX5_OPCODE_RDMA_READ = 0x10, MLX5_OPCODE_ATOMIC_CS = 0x11, MLX5_OPCODE_ATOMIC_FA = 0x12, MLX5_OPCODE_ATOMIC_MASKED_CS = 0x14, MLX5_OPCODE_ATOMIC_MASKED_FA = 0x15, MLX5_OPCODE_FMR = 0x19, MLX5_OPCODE_LOCAL_INVAL = 0x1b, MLX5_OPCODE_CONFIG_CMD = 0x1f, MLX5_OPCODE_SET_PSV = 0x20, MLX5_OPCODE_UMR = 0x25, MLX5_OPCODE_TAG_MATCHING = 0x28, MLX5_OPCODE_FLOW_TBL_ACCESS = 0x2c, MLX5_OPCODE_MMO = 0x2F, }; /* * CQE related part */ enum { MLX5_INLINE_SCATTER_32 = 0x4, MLX5_INLINE_SCATTER_64 = 0x8, }; enum { MLX5_CQE_SYNDROME_LOCAL_LENGTH_ERR = 0x01, MLX5_CQE_SYNDROME_LOCAL_QP_OP_ERR = 0x02, MLX5_CQE_SYNDROME_LOCAL_PROT_ERR = 0x04, MLX5_CQE_SYNDROME_WR_FLUSH_ERR = 0x05, MLX5_CQE_SYNDROME_MW_BIND_ERR = 0x06, MLX5_CQE_SYNDROME_BAD_RESP_ERR = 0x10, MLX5_CQE_SYNDROME_LOCAL_ACCESS_ERR = 0x11, MLX5_CQE_SYNDROME_REMOTE_INVAL_REQ_ERR = 0x12, MLX5_CQE_SYNDROME_REMOTE_ACCESS_ERR = 0x13, MLX5_CQE_SYNDROME_REMOTE_OP_ERR = 0x14, MLX5_CQE_SYNDROME_TRANSPORT_RETRY_EXC_ERR = 0x15, MLX5_CQE_SYNDROME_RNR_RETRY_EXC_ERR = 0x16, MLX5_CQE_SYNDROME_REMOTE_ABORTED_ERR = 0x22, }; enum { MLX5_CQE_VENDOR_SYNDROME_ODP_PFAULT = 0x93, }; enum { MLX5_CQE_L2_OK = 1 << 0, MLX5_CQE_L3_OK = 1 << 1, MLX5_CQE_L4_OK = 1 << 2, }; enum { MLX5_CQE_L3_HDR_TYPE_NONE = 0x0, MLX5_CQE_L3_HDR_TYPE_IPV6 = 0x1, MLX5_CQE_L3_HDR_TYPE_IPV4 = 0x2, }; enum { MLX5_CQE_OWNER_MASK = 1, MLX5_CQE_REQ = 0, MLX5_CQE_RESP_WR_IMM = 1, MLX5_CQE_RESP_SEND = 2, MLX5_CQE_RESP_SEND_IMM = 3, MLX5_CQE_RESP_SEND_INV = 4, MLX5_CQE_RESIZE_CQ = 5, MLX5_CQE_NO_PACKET = 6, MLX5_CQE_SIG_ERR = 12, MLX5_CQE_REQ_ERR = 13, MLX5_CQE_RESP_ERR = 14, MLX5_CQE_INVALID = 15, }; enum { MLX5_CQ_DOORBELL = 0x20 }; enum { MLX5_CQ_DB_REQ_NOT_SOL = 1 << 24, MLX5_CQ_DB_REQ_NOT = 0 << 24, }; struct mlx5_err_cqe { uint8_t rsvd0[32]; uint32_t srqn; uint8_t rsvd1[18]; uint8_t vendor_err_synd; uint8_t syndrome; uint32_t s_wqe_opcode_qpn; uint16_t wqe_counter; uint8_t signature; uint8_t op_own; }; struct mlx5_tm_cqe { __be32 success; __be16 hw_phase_cnt; uint8_t rsvd0[12]; }; struct mlx5_cqe64 { union { struct { uint8_t rsvd0[2]; __be16 wqe_id; uint8_t rsvd4[13]; uint8_t ml_path; uint8_t rsvd20[4]; __be16 slid; __be32 flags_rqpn; uint8_t hds_ip_ext; uint8_t l4_hdr_type_etc; __be16 vlan_info; }; struct mlx5_tm_cqe tm_cqe; /* TMH is scattered to CQE upon match */ struct ibv_tmh tmh; }; __be32 srqn_uidx; __be32 imm_inval_pkey; uint8_t app; uint8_t
app_op; __be16 app_info; __be32 byte_cnt; __be64 timestamp; __be32 sop_drop_qpn; __be16 wqe_counter; uint8_t signature; uint8_t op_own; }; enum { MLX5_TMC_SUCCESS = 0x80000000U, }; enum mlx5dv_cqe_comp_res_format { MLX5DV_CQE_RES_FORMAT_HASH = 1 << 0, MLX5DV_CQE_RES_FORMAT_CSUM = 1 << 1, MLX5DV_CQE_RES_FORMAT_CSUM_STRIDX = 1 << 2, }; enum mlx5dv_sw_parsing_offloads { MLX5DV_SW_PARSING = 1 << 0, MLX5DV_SW_PARSING_CSUM = 1 << 1, MLX5DV_SW_PARSING_LSO = 1 << 2, }; static MLX5DV_ALWAYS_INLINE uint8_t mlx5dv_get_cqe_owner(struct mlx5_cqe64 *cqe) { return cqe->op_own & 0x1; } static MLX5DV_ALWAYS_INLINE void mlx5dv_set_cqe_owner(struct mlx5_cqe64 *cqe, uint8_t val) { cqe->op_own = (val & 0x1) | (cqe->op_own & ~0x1); } /* Solicited event */ static MLX5DV_ALWAYS_INLINE uint8_t mlx5dv_get_cqe_se(struct mlx5_cqe64 *cqe) { return (cqe->op_own >> 1) & 0x1; } static MLX5DV_ALWAYS_INLINE uint8_t mlx5dv_get_cqe_format(struct mlx5_cqe64 *cqe) { return (cqe->op_own >> 2) & 0x3; } static MLX5DV_ALWAYS_INLINE uint8_t mlx5dv_get_cqe_opcode(struct mlx5_cqe64 *cqe) { return cqe->op_own >> 4; } /* * WQE related part */ enum { MLX5_INVALID_LKEY = 0x100, }; enum { MLX5_EXTENDED_UD_AV = 0x80000000, }; enum { MLX5_WQE_CTRL_CQ_UPDATE = 2 << 2, MLX5_WQE_CTRL_SOLICITED = 1 << 1, MLX5_WQE_CTRL_FENCE = 4 << 5, MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE = 1 << 5, }; enum { MLX5_SEND_WQE_BB = 64, MLX5_SEND_WQE_SHIFT = 6, }; enum { MLX5_INLINE_SEG = 0x80000000, }; enum { MLX5_ETH_WQE_L3_CSUM = (1 << 6), MLX5_ETH_WQE_L4_CSUM = (1 << 7), }; struct mlx5_wqe_srq_next_seg { uint8_t rsvd0[2]; __be16 next_wqe_index; uint8_t signature; uint8_t rsvd1[11]; }; struct mlx5_wqe_data_seg { __be32 byte_count; __be32 lkey; __be64 addr; }; struct mlx5_wqe_ctrl_seg { __be32 opmod_idx_opcode; __be32 qpn_ds; uint8_t signature; __be16 dci_stream_channel_id; uint8_t fm_ce_se; __be32 imm; } __attribute__((__packed__)) __attribute__((__aligned__(4))); struct mlx5_mprq_wqe { struct mlx5_wqe_srq_next_seg nseg; struct mlx5_wqe_data_seg dseg; }; struct mlx5_wqe_av { union { struct { __be32 qkey; __be32 reserved; } qkey; __be64 dc_key; } key; __be32 dqp_dct; uint8_t stat_rate_sl; uint8_t fl_mlid; __be16 rlid; uint8_t reserved0[4]; uint8_t rmac[ETHERNET_LL_SIZE]; uint8_t tclass; uint8_t hop_limit; __be32 grh_gid_fl; uint8_t rgid[16]; }; struct mlx5_wqe_datagram_seg { struct mlx5_wqe_av av; }; struct mlx5_wqe_raddr_seg { __be64 raddr; __be32 rkey; __be32 reserved; }; struct mlx5_wqe_atomic_seg { __be64 swap_add; __be64 compare; }; struct mlx5_wqe_inl_data_seg { uint32_t byte_count; }; struct mlx5_wqe_eth_seg { __be32 rsvd0; uint8_t cs_flags; uint8_t rsvd1; __be16 mss; __be32 rsvd2; __be16 inline_hdr_sz; uint8_t inline_hdr_start[2]; uint8_t inline_hdr[16]; }; struct mlx5_wqe_tm_seg { uint8_t opcode; uint8_t flags; __be16 index; uint8_t rsvd0[2]; __be16 sw_cnt; uint8_t rsvd1[8]; __be64 append_tag; __be64 append_mask; }; enum { MLX5_WQE_UMR_CTRL_FLAG_INLINE = 1 << 7, MLX5_WQE_UMR_CTRL_FLAG_CHECK_FREE = 1 << 5, MLX5_WQE_UMR_CTRL_FLAG_TRNSLATION_OFFSET = 1 << 4, MLX5_WQE_UMR_CTRL_FLAG_CHECK_QPN = 1 << 3, }; enum { MLX5_WQE_UMR_CTRL_MKEY_MASK_LEN = 1 << 0, MLX5_WQE_UMR_CTRL_MKEY_MASK_START_ADDR = 1 << 6, MLX5_WQE_UMR_CTRL_MKEY_MASK_SIG_ERR = 1 << 9, MLX5_WQE_UMR_CTRL_MKEY_MASK_BSF_ENABLE = 1 << 12, MLX5_WQE_UMR_CTRL_MKEY_MASK_MKEY = 1 << 13, MLX5_WQE_UMR_CTRL_MKEY_MASK_QPN = 1 << 14, MLX5_WQE_UMR_CTRL_MKEY_MASK_ACCESS_LOCAL_WRITE = 1 << 18, MLX5_WQE_UMR_CTRL_MKEY_MASK_ACCESS_REMOTE_READ = 1 << 19, MLX5_WQE_UMR_CTRL_MKEY_MASK_ACCESS_REMOTE_WRITE = 1 << 20, 
MLX5_WQE_UMR_CTRL_MKEY_MASK_ACCESS_ATOMIC = 1 << 21, MLX5_WQE_UMR_CTRL_MKEY_MASK_FREE = 1 << 29, }; struct mlx5_wqe_umr_ctrl_seg { uint8_t flags; uint8_t rsvd0[3]; __be16 klm_octowords; union { __be16 translation_offset; __be16 bsf_octowords; }; __be64 mkey_mask; uint8_t rsvd1[32]; }; struct mlx5_wqe_umr_klm_seg { /* up to 2GB */ __be32 byte_count; __be32 mkey; __be64 address; }; union mlx5_wqe_umr_inline_seg { struct mlx5_wqe_umr_klm_seg klm; }; struct mlx5_wqe_umr_repeat_ent_seg { __be16 stride; __be16 byte_count; __be32 memkey; __be64 va; }; struct mlx5_wqe_umr_repeat_block_seg { __be32 byte_count; __be32 op; __be32 repeat_count; __be16 reserved; __be16 num_ent; struct mlx5_wqe_umr_repeat_ent_seg entries[0]; }; enum { MLX5_WQE_MKEY_CONTEXT_FREE = 1 << 6 }; enum { MLX5_WQE_MKEY_CONTEXT_ACCESS_FLAGS_ATOMIC = 1 << 6, MLX5_WQE_MKEY_CONTEXT_ACCESS_FLAGS_REMOTE_WRITE = 1 << 5, MLX5_WQE_MKEY_CONTEXT_ACCESS_FLAGS_REMOTE_READ = 1 << 4, MLX5_WQE_MKEY_CONTEXT_ACCESS_FLAGS_LOCAL_WRITE = 1 << 3, MLX5_WQE_MKEY_CONTEXT_ACCESS_FLAGS_LOCAL_READ = 1 << 2 }; struct mlx5_wqe_mkey_context_seg { uint8_t free; uint8_t reserved1; uint8_t access_flags; uint8_t sf; __be32 qpn_mkey; __be32 reserved2; __be32 flags_pd; __be64 start_addr; __be64 len; __be32 bsf_octword_size; __be32 reserved3[4]; __be32 translations_octword_size; uint8_t reserved4[3]; uint8_t log_page_size; __be32 reserved; union mlx5_wqe_umr_inline_seg inseg[0]; }; /* * Control segment - contains some control information for the current WQE. * * Output: * seg - control segment to be filled * Input: * pi - WQEBB number of the first block of this WQE. * This number should wrap at 0xffff, regardless of the * size of the WQ. * opcode - Opcode of this WQE. Encodes the type of operation * to be executed on the QP. * opmod - Opcode modifier. * qp_num - QP/SQ number this WQE is posted to. * fm_ce_se - FM (fence mode), CE (completion and event mode) * and SE (solicited event). * ds - WQE size in octowords (16-byte units). DS accounts for all * the segments in the WQE as summarized in WQE construction. * signature - WQE signature. * imm - Immediate data/Invalidation key/UMR mkey. */ static MLX5DV_ALWAYS_INLINE void mlx5dv_set_ctrl_seg(struct mlx5_wqe_ctrl_seg *seg, uint16_t pi, uint8_t opcode, uint8_t opmod, uint32_t qp_num, uint8_t fm_ce_se, uint8_t ds, uint8_t signature, uint32_t imm) { seg->opmod_idx_opcode = htobe32(((uint32_t)opmod << 24) | ((uint32_t)pi << 8) | opcode); seg->qpn_ds = htobe32((qp_num << 8) | ds); seg->fm_ce_se = fm_ce_se; seg->signature = signature; /* * The caller should prepare "imm" in advance based on the WR opcode. * For IBV_WR_SEND_WITH_IMM and IBV_WR_RDMA_WRITE_WITH_IMM, * the "imm" should be assigned as is. * For IBV_WR_SEND_WITH_INV, it should be htobe32(imm). */ seg->imm = imm; } /* x86 optimized version of mlx5dv_set_ctrl_seg() * * This is useful when doing calculations on large data sets * in parallel. * * It is not suitable for serialized algorithms.
*/ #if defined(__SSE3__) static MLX5DV_ALWAYS_INLINE void mlx5dv_x86_set_ctrl_seg(struct mlx5_wqe_ctrl_seg *seg, uint16_t pi, uint8_t opcode, uint8_t opmod, uint32_t qp_num, uint8_t fm_ce_se, uint8_t ds, uint8_t signature, uint32_t imm) { __m128i val = _mm_set_epi32(imm, qp_num, (ds << 16) | pi, (signature << 24) | (opcode << 16) | (opmod << 8) | fm_ce_se); __m128i mask = _mm_set_epi8(15, 14, 13, 12, /* immediate */ 0, /* signal/fence_mode */ #if CHAR_MIN -128, -128, /* reserved */ #else 0x80, 0x80, /* reserved */ #endif 3, /* signature */ 6, /* data size */ 8, 9, 10, /* QP num */ 2, /* opcode */ 4, 5, /* sw_pi in BE */ 1 /* opmod */ ); *(__m128i *) seg = _mm_shuffle_epi8(val, mask); } #endif /* defined(__SSE3__) */ /* * Datagram Segment - contains address information required in order * to form a datagram message. * * Output: * seg - datagram segment to be filled. * Input: * key - Q_key/access key. * dqp_dct - Destination QP number for UD and DCT for DC. * ext - Address vector extension. * stat_rate_sl - Maximum static rate control, SL/ethernet priority. * fl_mlid - Force loopback and source LID for IB. * rlid - Remote LID * rmac - Remote MAC * tclass - GRH tclass/IPv6 tclass/IPv4 ToS * hop_limit - GRH hop limit/IPv6 hop limit/IPv4 TTL * grh_gid_fi - GRH, source GID address and IPv6 flow label. * rgid - Remote GID/IP address. */ static MLX5DV_ALWAYS_INLINE void mlx5dv_set_dgram_seg(struct mlx5_wqe_datagram_seg *seg, uint64_t key, uint32_t dqp_dct, uint8_t ext, uint8_t stat_rate_sl, uint8_t fl_mlid, uint16_t rlid, uint8_t *rmac, uint8_t tclass, uint8_t hop_limit, uint32_t grh_gid_fi, uint8_t *rgid) { /* Always write all 64 bits; for q_key, the reserved part will be 0 */ seg->av.key.dc_key = htobe64(key); seg->av.dqp_dct = htobe32(((uint32_t)ext << 31) | dqp_dct); seg->av.stat_rate_sl = stat_rate_sl; seg->av.fl_mlid = fl_mlid; seg->av.rlid = htobe16(rlid); memcpy(seg->av.rmac, rmac, ETHERNET_LL_SIZE); seg->av.tclass = tclass; seg->av.hop_limit = hop_limit; seg->av.grh_gid_fl = htobe32(grh_gid_fi); memcpy(seg->av.rgid, rgid, 16); } /* * Data Segments - contain pointers and a byte count for the scatter/gather list. * They can optionally contain data, which will save a memory read access for * gather Work Requests. */ static MLX5DV_ALWAYS_INLINE void mlx5dv_set_data_seg(struct mlx5_wqe_data_seg *seg, uint32_t length, uint32_t lkey, uintptr_t address) { seg->byte_count = htobe32(length); seg->lkey = htobe32(lkey); seg->addr = htobe64(address); } /* * x86 optimized version of mlx5dv_set_data_seg() * * This is useful when doing calculations on large data sets * in parallel. * * It is not suitable for serialized algorithms. */ #if defined(__SSE3__) static MLX5DV_ALWAYS_INLINE void mlx5dv_x86_set_data_seg(struct mlx5_wqe_data_seg *seg, uint32_t length, uint32_t lkey, uintptr_t address) { uint64_t address64 = address; __m128i val = _mm_set_epi32((uint32_t)address64, (uint32_t)(address64 >> 32), lkey, length); __m128i mask = _mm_set_epi8(12, 13, 14, 15, /* local address low */ 8, 9, 10, 11, /* local address high */ 4, 5, 6, 7, /* l_key */ 0, 1, 2, 3 /* byte count */ ); *(__m128i *) seg = _mm_shuffle_epi8(val, mask); } #endif /* defined(__SSE3__) */ /* * Eth Segment - contains packet headers and information for stateless L2, L3, L4 offloading. * * Output: * seg - Eth segment to be filled. * Input: * cs_flags - l3cs/l3cs_inner/l4cs/l4cs_inner. * mss - Maximum segment size. For TSO WQEs, the number of bytes * in the TCP payload to be transmitted in each packet. Must * be 0 on non-TSO WQEs.
* inline_hdr_sz - Length of the inlined packet headers. * inline_hdr_start - Inlined packet header. */ static MLX5DV_ALWAYS_INLINE void mlx5dv_set_eth_seg(struct mlx5_wqe_eth_seg *seg, uint8_t cs_flags, uint16_t mss, uint16_t inline_hdr_sz, uint8_t *inline_hdr_start) { seg->cs_flags = cs_flags; seg->mss = htobe16(mss); seg->inline_hdr_sz = htobe16(inline_hdr_sz); memcpy(seg->inline_hdr_start, inline_hdr_start, inline_hdr_sz); } enum mlx5dv_set_ctx_attr_type { MLX5DV_CTX_ATTR_BUF_ALLOCATORS = 1, }; enum { MLX5_MMAP_GET_REGULAR_PAGES_CMD = 0, MLX5_MMAP_GET_NC_PAGES_CMD = 3, }; struct mlx5dv_ctx_allocators { void *(*alloc)(size_t size, void *priv_data); void (*free)(void *ptr, void *priv_data); void *data; }; /* * Generic context attributes set API * * Returns 0 on success, or the value of errno on failure * (which indicates the failure reason). */ int mlx5dv_set_context_attr(struct ibv_context *context, enum mlx5dv_set_ctx_attr_type type, void *attr); struct mlx5dv_clock_info { uint64_t nsec; uint64_t last_cycles; uint64_t frac; uint32_t mult; uint32_t shift; uint64_t mask; }; /* * Get mlx5 core clock info * * Output: * clock_info - clock info to be filled * Input: * context - device context * * Return: 0 on success, or the value of errno on failure */ int mlx5dv_get_clock_info(struct ibv_context *context, struct mlx5dv_clock_info *clock_info); /* * Translate a device timestamp to nanoseconds * * Input: * clock_info - clock info, as filled by mlx5dv_get_clock_info() * device_timestamp - timestamp to translate * * Return: time in nanoseconds */ static inline uint64_t mlx5dv_ts_to_ns(struct mlx5dv_clock_info *clock_info, uint64_t device_timestamp) { uint64_t delta, nsec; /* * device_timestamp & cycles are the free running 'mask' bit counters * from the hardware hca_core_clock clock. */ delta = (device_timestamp - clock_info->last_cycles) & clock_info->mask; nsec = clock_info->nsec; /* * Guess whether device_timestamp is more recent than * clock_info->last_cycles; if it seems too far in the future, treat * it as an old timestamp. This heuristic breaks once every * max_clock_info_update_nsec. */ if (delta > clock_info->mask / 2) { delta = (clock_info->last_cycles - device_timestamp) & clock_info->mask; nsec -= ((delta * clock_info->mult) - clock_info->frac) >> clock_info->shift; } else { nsec += ((delta * clock_info->mult) + clock_info->frac) >> clock_info->shift; } return nsec; } enum mlx5dv_context_attr_flags { MLX5DV_CONTEXT_FLAGS_DEVX = 1 << 0, }; struct mlx5dv_context_attr { uint32_t flags; /* Use enum mlx5dv_context_attr_flags */ uint64_t comp_mask; }; bool mlx5dv_is_supported(struct ibv_device *device); enum mlx5dv_vfio_context_attr_flags { MLX5DV_VFIO_CTX_FLAGS_INIT_LINK_DOWN = 1 << 0, }; struct mlx5dv_vfio_context_attr { const char *pci_name; uint32_t flags; /* Use enum mlx5dv_vfio_context_attr_flags */ uint64_t comp_mask; }; struct ibv_device ** mlx5dv_get_vfio_device_list(struct mlx5dv_vfio_context_attr *attr); int mlx5dv_vfio_get_events_fd(struct ibv_context *ibctx); /* This API should be called from an application thread to maintain device events. * The application is responsible for getting the events FD by calling mlx5dv_vfio_get_events_fd * and, once the FD is readable, for calling this API to let the driver process the pending events.
*/ int mlx5dv_vfio_process_events(struct ibv_context *context); struct ibv_context * mlx5dv_open_device(struct ibv_device *device, struct mlx5dv_context_attr *attr); struct mlx5dv_devx_obj; struct mlx5dv_devx_obj * mlx5dv_devx_obj_create(struct ibv_context *context, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_obj_query(struct mlx5dv_devx_obj *obj, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_obj_modify(struct mlx5dv_devx_obj *obj, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_obj_destroy(struct mlx5dv_devx_obj *obj); int mlx5dv_devx_general_cmd(struct ibv_context *context, const void *in, size_t inlen, void *out, size_t outlen); int _mlx5dv_query_port(struct ibv_context *context, uint32_t port_num, struct mlx5dv_port *info, size_t info_len); static inline int mlx5dv_query_port(struct ibv_context *context, uint32_t port_num, struct mlx5dv_port *info) { return _mlx5dv_query_port(context, port_num, info, sizeof(*info)); } struct mlx5dv_devx_umem { uint32_t umem_id; }; struct mlx5dv_devx_umem * mlx5dv_devx_umem_reg(struct ibv_context *ctx, void *addr, size_t size, uint32_t access); enum mlx5dv_devx_umem_in_mask { MLX5DV_UMEM_MASK_DMABUF = 1 << 0, }; struct mlx5dv_devx_umem_in { void *addr; size_t size; uint32_t access; uint64_t pgsz_bitmap; uint64_t comp_mask; int dmabuf_fd; }; struct mlx5dv_devx_umem * mlx5dv_devx_umem_reg_ex(struct ibv_context *ctx, struct mlx5dv_devx_umem_in *umem_in); int mlx5dv_devx_umem_dereg(struct mlx5dv_devx_umem *umem); struct mlx5dv_devx_uar { void *reg_addr; void *base_addr; uint32_t page_id; off_t mmap_off; uint64_t comp_mask; }; struct mlx5dv_devx_uar *mlx5dv_devx_alloc_uar(struct ibv_context *context, uint32_t flags); void mlx5dv_devx_free_uar(struct mlx5dv_devx_uar *devx_uar); struct mlx5dv_var { uint32_t page_id; uint32_t length; off_t mmap_off; uint64_t comp_mask; }; struct mlx5dv_var * mlx5dv_alloc_var(struct ibv_context *context, uint32_t flags); void mlx5dv_free_var(struct mlx5dv_var *dv_var); int mlx5dv_devx_query_eqn(struct ibv_context *context, uint32_t vector, uint32_t *eqn); int mlx5dv_devx_cq_query(struct ibv_cq *cq, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_cq_modify(struct ibv_cq *cq, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_qp_query(struct ibv_qp *qp, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_qp_modify(struct ibv_qp *qp, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_srq_query(struct ibv_srq *srq, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_srq_modify(struct ibv_srq *srq, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_wq_query(struct ibv_wq *wq, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_wq_modify(struct ibv_wq *wq, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_ind_tbl_query(struct ibv_rwq_ind_table *ind_tbl, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_ind_tbl_modify(struct ibv_rwq_ind_table *ind_tbl, const void *in, size_t inlen, void *out, size_t outlen); struct mlx5dv_devx_cmd_comp { int fd; }; struct mlx5dv_devx_cmd_comp * mlx5dv_devx_create_cmd_comp(struct ibv_context *context); void mlx5dv_devx_destroy_cmd_comp(struct mlx5dv_devx_cmd_comp *cmd_comp); int mlx5dv_devx_obj_query_async(struct mlx5dv_devx_obj *obj, const void *in, size_t inlen, size_t outlen, uint64_t wr_id, struct mlx5dv_devx_cmd_comp 
*cmd_comp); int mlx5dv_devx_get_async_cmd_comp(struct mlx5dv_devx_cmd_comp *cmd_comp, struct mlx5dv_devx_async_cmd_hdr *cmd_resp, size_t cmd_resp_len); struct mlx5dv_devx_event_channel { int fd; }; struct mlx5dv_devx_event_channel * mlx5dv_devx_create_event_channel(struct ibv_context *context, enum mlx5dv_devx_create_event_channel_flags flags); void mlx5dv_devx_destroy_event_channel(struct mlx5dv_devx_event_channel *event_channel); int mlx5dv_devx_subscribe_devx_event(struct mlx5dv_devx_event_channel *event_channel, struct mlx5dv_devx_obj *obj, /* can be NULL for unaffiliated events */ uint16_t events_sz, uint16_t events_num[], uint64_t cookie); int mlx5dv_devx_subscribe_devx_event_fd(struct mlx5dv_devx_event_channel *event_channel, int fd, struct mlx5dv_devx_obj *obj, /* can be NULL for unaffiliated events */ uint16_t event_num); /* return code: upon success number of bytes read, otherwise -1 and errno was set */ ssize_t mlx5dv_devx_get_event(struct mlx5dv_devx_event_channel *event_channel, struct mlx5dv_devx_async_event_hdr *event_data, size_t event_resp_len); #define __devx_nullp(typ) ((struct mlx5_ifc_##typ##_bits *)NULL) #define __devx_st_sz_bits(typ) sizeof(struct mlx5_ifc_##typ##_bits) #define __devx_bit_sz(typ, fld) sizeof(__devx_nullp(typ)->fld) #define __devx_bit_off(typ, fld) offsetof(struct mlx5_ifc_##typ##_bits, fld) #define __devx_dw_off(bit_off) ((bit_off) / 32) #define __devx_64_off(bit_off) ((bit_off) / 64) #define __devx_dw_bit_off(bit_sz, bit_off) (32 - (bit_sz) - ((bit_off) & 0x1f)) #define __devx_mask(bit_sz) ((uint32_t)((1ull << (bit_sz)) - 1)) #define __devx_dw_mask(bit_sz, bit_off) \ (__devx_mask(bit_sz) << __devx_dw_bit_off(bit_sz, bit_off)) #define DEVX_FLD_SZ_BYTES(typ, fld) (__devx_bit_sz(typ, fld) / 8) #define DEVX_ST_SZ_BYTES(typ) (sizeof(struct mlx5_ifc_##typ##_bits) / 8) #define DEVX_ST_SZ_DW(typ) (sizeof(struct mlx5_ifc_##typ##_bits) / 32) #define DEVX_ST_SZ_QW(typ) (sizeof(struct mlx5_ifc_##typ##_bits) / 64) #define DEVX_UN_SZ_BYTES(typ) (sizeof(union mlx5_ifc_##typ##_bits) / 8) #define DEVX_UN_SZ_DW(typ) (sizeof(union mlx5_ifc_##typ##_bits) / 32) #define DEVX_BYTE_OFF(typ, fld) (__devx_bit_off(typ, fld) / 8) #define DEVX_ADDR_OF(typ, p, fld) \ ((unsigned char *)(p) + DEVX_BYTE_OFF(typ, fld)) static inline void _devx_set(void *p, uint32_t value, size_t bit_off, size_t bit_sz) { __be32 *fld = (__be32 *)(p) + __devx_dw_off(bit_off); uint32_t dw_mask = __devx_dw_mask(bit_sz, bit_off); uint32_t mask = __devx_mask(bit_sz); *fld = htobe32((be32toh(*fld) & (~dw_mask)) | ((value & mask) << __devx_dw_bit_off(bit_sz, bit_off))); } #define DEVX_SET(typ, p, fld, v) \ _devx_set(p, v, __devx_bit_off(typ, fld), __devx_bit_sz(typ, fld)) static inline uint32_t _devx_get(const void *p, size_t bit_off, size_t bit_sz) { return ((be32toh(*((const __be32 *)(p) + __devx_dw_off(bit_off))) >> __devx_dw_bit_off(bit_sz, bit_off)) & __devx_mask(bit_sz)); } #define DEVX_GET(typ, p, fld) \ _devx_get(p, __devx_bit_off(typ, fld), __devx_bit_sz(typ, fld)) static inline void _devx_set64(void *p, uint64_t v, size_t bit_off) { *((__be64 *)(p) + __devx_64_off(bit_off)) = htobe64(v); } #define DEVX_SET64(typ, p, fld, v) _devx_set64(p, v, __devx_bit_off(typ, fld)) static inline uint64_t _devx_get64(const void *p, size_t bit_off) { return be64toh(*((const __be64 *)(p) + __devx_64_off(bit_off))); } #define DEVX_GET64(typ, p, fld) _devx_get64(p, __devx_bit_off(typ, fld)) #define DEVX_ARRAY_SET64(typ, p, fld, idx, v) do { \ DEVX_SET64(typ, p, fld[idx], v); \ } while (0) struct mlx5dv_dr_domain; 
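/*
 * Usage sketch (added note, illustrative only) for the DEVX accessor macros
 * above: building a PRM command buffer with DEVX_ST_SZ_DW()/DEVX_SET() and
 * reading a result field with DEVX_GET(). It assumes a DEVX-enabled 'ctx'
 * (opened with MLX5DV_CONTEXT_FLAGS_DEVX); the struct and field names come
 * from the mlx5_ifc device spec headers and are shown here for illustration.
 *
 *	uint32_t in[DEVX_ST_SZ_DW(query_hca_cap_in)] = {};
 *	uint32_t out[DEVX_ST_SZ_DW(query_hca_cap_out)] = {};
 *	uint8_t log_max_qp;
 *
 *	DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
 *	DEVX_SET(query_hca_cap_in, in, op_mod, 0x1); // general caps, cur values
 *	if (mlx5dv_devx_general_cmd(ctx, in, sizeof(in), out, sizeof(out)))
 *		return errno;
 *	log_max_qp = DEVX_GET(query_hca_cap_out, out,
 *			      capability.cmd_hca_cap.log_max_qp);
 */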
struct mlx5dv_dr_table; struct mlx5dv_dr_matcher; struct mlx5dv_dr_rule; struct mlx5dv_dr_action; enum mlx5dv_dr_domain_type { MLX5DV_DR_DOMAIN_TYPE_NIC_RX, MLX5DV_DR_DOMAIN_TYPE_NIC_TX, MLX5DV_DR_DOMAIN_TYPE_FDB, }; enum mlx5dv_dr_domain_sync_flags { MLX5DV_DR_DOMAIN_SYNC_FLAGS_SW = 1 << 0, MLX5DV_DR_DOMAIN_SYNC_FLAGS_HW = 1 << 1, MLX5DV_DR_DOMAIN_SYNC_FLAGS_MEM = 1 << 2, }; struct mlx5dv_dr_flow_meter_attr { struct mlx5dv_dr_table *next_table; uint8_t active; uint8_t reg_c_index; size_t flow_meter_parameter_sz; void *flow_meter_parameter; }; struct mlx5dv_dr_flow_sampler_attr { uint32_t sample_ratio; struct mlx5dv_dr_table *default_next_table; uint32_t num_sample_actions; struct mlx5dv_dr_action **sample_actions; __be64 action; }; struct mlx5dv_dr_domain * mlx5dv_dr_domain_create(struct ibv_context *ctx, enum mlx5dv_dr_domain_type type); int mlx5dv_dr_domain_destroy(struct mlx5dv_dr_domain *domain); int mlx5dv_dr_domain_sync(struct mlx5dv_dr_domain *domain, uint32_t flags); void mlx5dv_dr_domain_set_reclaim_device_memory(struct mlx5dv_dr_domain *dmn, bool enable); void mlx5dv_dr_domain_allow_duplicate_rules(struct mlx5dv_dr_domain *domain, bool allow); struct mlx5dv_dr_table * mlx5dv_dr_table_create(struct mlx5dv_dr_domain *domain, uint32_t level); int mlx5dv_dr_table_destroy(struct mlx5dv_dr_table *table); struct mlx5dv_dr_matcher * mlx5dv_dr_matcher_create(struct mlx5dv_dr_table *table, uint16_t priority, uint8_t match_criteria_enable, struct mlx5dv_flow_match_parameters *mask); int mlx5dv_dr_matcher_destroy(struct mlx5dv_dr_matcher *matcher); enum mlx5dv_dr_matcher_layout_flags { MLX5DV_DR_MATCHER_LAYOUT_RESIZABLE = 1 << 0, MLX5DV_DR_MATCHER_LAYOUT_NUM_RULE = 1 << 1, }; struct mlx5dv_dr_matcher_layout { uint32_t flags; /* use enum mlx5dv_dr_matcher_layout_flags */ uint32_t log_num_of_rules_hint; }; int mlx5dv_dr_matcher_set_layout(struct mlx5dv_dr_matcher *matcher, struct mlx5dv_dr_matcher_layout *layout); struct mlx5dv_dr_rule * mlx5dv_dr_rule_create(struct mlx5dv_dr_matcher *matcher, struct mlx5dv_flow_match_parameters *value, size_t num_actions, struct mlx5dv_dr_action *actions[]); int mlx5dv_dr_rule_destroy(struct mlx5dv_dr_rule *rule); enum mlx5dv_dr_action_flags { MLX5DV_DR_ACTION_FLAGS_ROOT_LEVEL = 1 << 0, }; struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_ibv_qp(struct ibv_qp *ibqp); struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_table(struct mlx5dv_dr_table *table); struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_vport(struct mlx5dv_dr_domain *domain, uint32_t vport); struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_ib_port(struct mlx5dv_dr_domain *domain, uint32_t ib_port); struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_devx_tir(struct mlx5dv_devx_obj *devx_obj); enum mlx5dv_dr_action_dest_type { MLX5DV_DR_ACTION_DEST, MLX5DV_DR_ACTION_DEST_REFORMAT, }; struct mlx5dv_dr_action_dest_reformat { struct mlx5dv_dr_action *reformat; struct mlx5dv_dr_action *dest; }; struct mlx5dv_dr_action_dest_attr { enum mlx5dv_dr_action_dest_type type; union { struct mlx5dv_dr_action *dest; struct mlx5dv_dr_action_dest_reformat *dest_reformat; }; }; struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_array(struct mlx5dv_dr_domain *domain, size_t num_dest, struct mlx5dv_dr_action_dest_attr *dests[]); struct mlx5dv_dr_action *mlx5dv_dr_action_create_drop(void); struct mlx5dv_dr_action *mlx5dv_dr_action_create_default_miss(void); struct mlx5dv_dr_action *mlx5dv_dr_action_create_tag(uint32_t tag_value); struct mlx5dv_dr_action * 
mlx5dv_dr_action_create_flow_counter(struct mlx5dv_devx_obj *devx_obj, uint32_t offset); enum mlx5dv_dr_action_aso_first_hit_flags { MLX5DV_DR_ACTION_FLAGS_ASO_FIRST_HIT_SET = 1 << 0, }; enum mlx5dv_dr_action_aso_flow_meter_flags { MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_RED = 1 << 0, MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_YELLOW = 1 << 1, MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_GREEN = 1 << 2, MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_UNDEFINED = 1 << 3, }; enum mlx5dv_dr_action_aso_ct_flags { MLX5DV_DR_ACTION_FLAGS_ASO_CT_DIRECTION_INITIATOR = 1 << 0, MLX5DV_DR_ACTION_FLAGS_ASO_CT_DIRECTION_RESPONDER = 1 << 1, }; struct mlx5dv_dr_action * mlx5dv_dr_action_create_aso(struct mlx5dv_dr_domain *domain, struct mlx5dv_devx_obj *devx_obj, uint32_t offset, uint32_t flags, uint8_t return_reg_c); int mlx5dv_dr_action_modify_aso(struct mlx5dv_dr_action *action, uint32_t offset, uint32_t flags, uint8_t return_reg_c); struct mlx5dv_dr_action * mlx5dv_dr_action_create_packet_reformat(struct mlx5dv_dr_domain *domain, uint32_t flags, enum mlx5dv_flow_action_packet_reformat_type reformat_type, size_t data_sz, void *data); struct mlx5dv_dr_action * mlx5dv_dr_action_create_modify_header(struct mlx5dv_dr_domain *domain, uint32_t flags, size_t actions_sz, __be64 actions[]); struct mlx5dv_dr_action * mlx5dv_dr_action_create_flow_meter(struct mlx5dv_dr_flow_meter_attr *attr); int mlx5dv_dr_action_modify_flow_meter(struct mlx5dv_dr_action *action, struct mlx5dv_dr_flow_meter_attr *attr, __be64 modify_field_select); struct mlx5dv_dr_action * mlx5dv_dr_action_create_flow_sampler(struct mlx5dv_dr_flow_sampler_attr *attr); struct mlx5dv_dr_action * mlx5dv_dr_action_create_pop_vlan(void); struct mlx5dv_dr_action * mlx5dv_dr_action_create_push_vlan(struct mlx5dv_dr_domain *domain, __be32 vlan_hdr); struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_root_table(struct mlx5dv_dr_table *table, uint16_t priority); int mlx5dv_dr_action_destroy(struct mlx5dv_dr_action *action); int mlx5dv_dump_dr_domain(FILE *fout, struct mlx5dv_dr_domain *domain); int mlx5dv_dump_dr_table(FILE *fout, struct mlx5dv_dr_table *table); int mlx5dv_dump_dr_matcher(FILE *fout, struct mlx5dv_dr_matcher *matcher); int mlx5dv_dump_dr_rule(FILE *fout, struct mlx5dv_dr_rule *rule); struct mlx5dv_pp { uint16_t index; }; struct mlx5dv_pp *mlx5dv_pp_alloc(struct ibv_context *context, size_t pp_context_sz, const void *pp_context, uint32_t flags); void mlx5dv_pp_free(struct mlx5dv_pp *pp); int mlx5dv_query_qp_lag_port(struct ibv_qp *qp, uint8_t *port_num, uint8_t *active_port_num); int mlx5dv_modify_qp_lag_port(struct ibv_qp *qp, uint8_t port_num); int mlx5dv_modify_qp_udp_sport(struct ibv_qp *qp, uint16_t udp_sport); int mlx5dv_dci_stream_id_reset(struct ibv_qp *qp, uint16_t stream_id); enum mlx5dv_sched_elem_attr_flags { MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE = 1 << 0, MLX5DV_SCHED_ELEM_ATTR_FLAGS_MAX_AVG_BW = 1 << 1, }; struct mlx5dv_sched_attr { struct mlx5dv_sched_node *parent; uint32_t flags; /* Use mlx5dv_sched_elem_attr_flags */ uint32_t bw_share; uint32_t max_avg_bw; uint64_t comp_mask; }; struct mlx5dv_sched_node; struct mlx5dv_sched_leaf; struct mlx5dv_sched_node * mlx5dv_sched_node_create(struct ibv_context *context, const struct mlx5dv_sched_attr *sched_attr); struct mlx5dv_sched_leaf * mlx5dv_sched_leaf_create(struct ibv_context *context, const struct mlx5dv_sched_attr *sched_attr); int mlx5dv_sched_node_modify(struct mlx5dv_sched_node *node, const struct mlx5dv_sched_attr *sched_attr); int mlx5dv_sched_leaf_modify(struct mlx5dv_sched_leaf *leaf, 
const struct mlx5dv_sched_attr *sched_attr); int mlx5dv_sched_node_destroy(struct mlx5dv_sched_node *node); int mlx5dv_sched_leaf_destroy(struct mlx5dv_sched_leaf *leaf); int mlx5dv_modify_qp_sched_elem(struct ibv_qp *qp, const struct mlx5dv_sched_leaf *requestor, const struct mlx5dv_sched_leaf *responder); int mlx5dv_reserved_qpn_alloc(struct ibv_context *ctx, uint32_t *qpn); int mlx5dv_reserved_qpn_dealloc(struct ibv_context *ctx, uint32_t qpn); int mlx5dv_dr_aso_other_domain_link(struct mlx5dv_devx_obj *devx_obj, struct mlx5dv_dr_domain *peer_dmn, struct mlx5dv_dr_domain *dmn, uint32_t flags, uint8_t return_reg_c); int mlx5dv_dr_aso_other_domain_unlink(struct mlx5dv_devx_obj *devx_obj, struct mlx5dv_dr_domain *dmn); struct mlx5dv_devx_msi_vector { int vector; int fd; }; struct mlx5dv_devx_msi_vector * mlx5dv_devx_alloc_msi_vector(struct ibv_context *ibctx); int mlx5dv_devx_free_msi_vector(struct mlx5dv_devx_msi_vector *msi); struct mlx5dv_devx_eq { void *vaddr; }; struct mlx5dv_devx_eq * mlx5dv_devx_create_eq(struct ibv_context *ibctx, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_destroy_eq(struct mlx5dv_devx_eq *eq); #ifdef __cplusplus } #endif #endif /* _MLX5DV_H_ */ rdma-core-56.1/providers/mlx5/mlx5dv_dr.h000066400000000000000000001556041477342711600203240ustar00rootroot00000000000000/* * Copyright (c) 2019, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef _MLX5_DV_DR_ #define _MLX5_DV_DR_ #include #include #include #include #include "mlx5dv.h" #include "mlx5_ifc.h" #include "mlx5.h" #define DR_RULE_MAX_STES 20 #define DR_ACTION_MAX_STES 7 #define DR_ACTION_ASO_CROSS_GVMI_STES 2 /* Use up to 14 send rings. This number provided the best performance */ #define DR_MAX_SEND_RINGS 14 #define NUM_OF_LOCKS DR_MAX_SEND_RINGS #define WIRE_PORT 0xFFFF #define ECPF_PORT 0xFFFE #define DR_STE_SVLAN 0x1 #define DR_STE_CVLAN 0x2 #define CVLAN_ETHERTYPE 0x8100 #define SVLAN_ETHERTYPE 0x88a8 #define NUM_OF_FLEX_PARSERS 8 #define DR_STE_MAX_FLEX_0_ID 3 #define DR_STE_MAX_FLEX_1_ID 7 #define DR_VPORTS_BUCKETS 256 #define ACTION_CACHE_LINE_SIZE 64 #define dr_dbg(dmn, arg...) dr_dbg_ctx((dmn)->ctx, ##arg) #define dr_dbg_ctx(ctx, arg...) 
\ mlx5_dbg(to_mctx(ctx)->dbg_fp, MLX5_DBG_DR, ##arg); enum dr_icm_chunk_size { DR_CHUNK_SIZE_1, DR_CHUNK_SIZE_MIN = DR_CHUNK_SIZE_1, /* keep updated when changing */ DR_CHUNK_SIZE_2, DR_CHUNK_SIZE_4, DR_CHUNK_SIZE_8, DR_CHUNK_SIZE_16, DR_CHUNK_SIZE_32, DR_CHUNK_SIZE_64, DR_CHUNK_SIZE_128, DR_CHUNK_SIZE_256, DR_CHUNK_SIZE_512, DR_CHUNK_SIZE_1K, DR_CHUNK_SIZE_2K, DR_CHUNK_SIZE_4K, DR_CHUNK_SIZE_8K, DR_CHUNK_SIZE_16K, DR_CHUNK_SIZE_32K, DR_CHUNK_SIZE_64K, DR_CHUNK_SIZE_128K, DR_CHUNK_SIZE_256K, DR_CHUNK_SIZE_512K, DR_CHUNK_SIZE_1024K, DR_CHUNK_SIZE_2048K, DR_CHUNK_SIZE_4096K, DR_CHUNK_SIZE_8192K, DR_CHUNK_SIZE_16384K, DR_CHUNK_SIZE_MAX, }; enum dr_icm_type { DR_ICM_TYPE_STE, DR_ICM_TYPE_MODIFY_ACTION, DR_ICM_TYPE_MODIFY_HDR_PTRN, DR_ICM_TYPE_ENCAP, DR_ICM_TYPE_MAX, }; static inline enum dr_icm_chunk_size dr_icm_next_higher_chunk(enum dr_icm_chunk_size chunk) { chunk += 2; if (chunk < DR_CHUNK_SIZE_MAX) return chunk; return DR_CHUNK_SIZE_MAX; } enum dr_ste_lu_type { DR_STE_LU_TYPE_DONT_CARE = 0x0f, }; enum { DR_STE_SIZE = 64, DR_STE_SIZE_CTRL = 32, DR_STE_SIZE_MATCH_TAG = 32, DR_STE_SIZE_TAG = 16, DR_STE_SIZE_MASK = 16, DR_STE_SIZE_REDUCED = DR_STE_SIZE - DR_STE_SIZE_MASK, DR_STE_LOG_SIZE = 6, }; enum dr_ste_ctx_action_cap { DR_STE_CTX_ACTION_CAP_NONE = 0, DR_STE_CTX_ACTION_CAP_TX_POP = 1 << 0, DR_STE_CTX_ACTION_CAP_RX_PUSH = 1 << 1, DR_STE_CTX_ACTION_CAP_RX_ENCAP = 1 << 3, DR_STE_CTX_ACTION_CAP_POP_MDFY = 1 << 4, DR_STE_CTX_ACTION_CAP_MODIFY_HDR_INLINE = 1 << 5, }; enum { DR_MODIFY_ACTION_SIZE = 8, DR_MODIFY_ACTION_LOG_SIZE = 3, }; enum { DR_SW_ENCAP_ENTRY_SIZE = 64, DR_SW_ENCAP_ENTRY_LOG_SIZE = 6, }; enum dr_matcher_criteria { DR_MATCHER_CRITERIA_EMPTY = 0, DR_MATCHER_CRITERIA_OUTER = 1 << 0, DR_MATCHER_CRITERIA_MISC = 1 << 1, DR_MATCHER_CRITERIA_INNER = 1 << 2, DR_MATCHER_CRITERIA_MISC2 = 1 << 3, DR_MATCHER_CRITERIA_MISC3 = 1 << 4, DR_MATCHER_CRITERIA_MISC4 = 1 << 5, DR_MATCHER_CRITERIA_MISC5 = 1 << 6, DR_MATCHER_CRITERIA_MAX = 1 << 7, }; enum dr_matcher_definer { DR_MATCHER_DEFINER_0 = 0, DR_MATCHER_DEFINER_2 = 2, DR_MATCHER_DEFINER_6 = 6, DR_MATCHER_DEFINER_16 = 16, DR_MATCHER_DEFINER_22 = 22, DR_MATCHER_DEFINER_24 = 24, DR_MATCHER_DEFINER_25 = 25, DR_MATCHER_DEFINER_26 = 26, DR_MATCHER_DEFINER_28 = 28, DR_MATCHER_DEFINER_33 = 33, }; enum dr_action_type { DR_ACTION_TYP_TNL_L2_TO_L2, DR_ACTION_TYP_L2_TO_TNL_L2, DR_ACTION_TYP_TNL_L3_TO_L2, DR_ACTION_TYP_L2_TO_TNL_L3, DR_ACTION_TYP_DROP, DR_ACTION_TYP_QP, DR_ACTION_TYP_FT, DR_ACTION_TYP_CTR, DR_ACTION_TYP_TAG, DR_ACTION_TYP_MODIFY_HDR, DR_ACTION_TYP_VPORT, DR_ACTION_TYP_METER, DR_ACTION_TYP_MISS, DR_ACTION_TYP_SAMPLER, DR_ACTION_TYP_DEST_ARRAY, DR_ACTION_TYP_POP_VLAN, DR_ACTION_TYP_PUSH_VLAN, DR_ACTION_TYP_ASO_FIRST_HIT, DR_ACTION_TYP_ASO_FLOW_METER, DR_ACTION_TYP_ASO_CT, DR_ACTION_TYP_ROOT_FT, DR_ACTION_TYP_MAX, }; struct dr_icm_pool; struct dr_icm_chunk; struct dr_icm_buddy_mem; struct dr_ste_htbl; struct dr_match_param; struct dr_devx_caps; struct dr_rule_rx_tx; struct dr_matcher_rx_tx; struct dr_ste_ctx; struct dr_ptrn_mngr; struct dr_ptrn_obj; struct dr_arg_mngr; struct dr_arg_obj; struct dr_data_seg { uint64_t addr; uint32_t length; uint32_t lkey; unsigned int send_flags; }; enum send_info_type { WRITE_ICM = 0, GTA_ARG = 1, }; struct postsend_info { enum send_info_type type; struct dr_data_seg write; struct dr_data_seg read; uint64_t remote_addr; uint32_t rkey; }; struct dr_ste { uint8_t *hw_ste; /* refcount: indicates the number of rules that are using this STE */ atomic_int refcount; /* attached to the miss_list head at each htbl entry */
struct list_node miss_list_node; /* this STE is a member of htbl */ struct dr_ste_htbl *htbl; struct dr_ste_htbl *next_htbl; /* The rule this STE belongs to */ struct dr_rule_rx_tx *rule_rx_tx; /* this STE is part of a rule, located in the STE's chain */ uint8_t ste_chain_location; uint8_t size; }; struct dr_ste_htbl_ctrl { /* total number of valid entries belonging to this hash table. This * includes the non-collision and collision entries */ int num_of_valid_entries; /* total number of collision entries attached to this table */ int num_of_collisions; }; enum dr_ste_htbl_type { DR_STE_HTBL_TYPE_LEGACY = 0, DR_STE_HTBL_TYPE_MATCH = 1, }; struct dr_ste_htbl { enum dr_ste_htbl_type type; uint16_t lu_type; uint16_t byte_mask; atomic_int refcount; struct dr_icm_chunk *chunk; struct dr_ste *ste_arr; uint8_t *hw_ste_arr; struct list_head *miss_list; enum dr_icm_chunk_size chunk_size; struct dr_ste *pointing_ste; struct dr_ste_htbl_ctrl ctrl; }; struct dr_ste_send_info { struct dr_ste *ste; struct list_node send_list; uint16_t size; uint16_t offset; uint8_t data_cont[DR_STE_SIZE]; uint8_t *data; }; void dr_send_fill_and_append_ste_send_info(struct dr_ste *ste, uint16_t size, uint16_t offset, uint8_t *data, struct dr_ste_send_info *ste_info, struct list_head *send_list, bool copy_data); struct dr_ste_build { bool inner; bool rx; struct dr_devx_caps *caps; uint16_t lu_type; enum dr_ste_htbl_type htbl_type; union { struct { uint16_t byte_mask; uint8_t bit_mask[DR_STE_SIZE_MASK]; }; struct { uint16_t format_id; uint8_t match[DR_STE_SIZE_MATCH_TAG]; struct mlx5dv_devx_obj *definer_obj; }; }; int (*ste_build_tag_func)(struct dr_match_param *spec, struct dr_ste_build *sb, uint8_t *tag); }; struct dr_ste_htbl *dr_ste_htbl_alloc(struct dr_icm_pool *pool, enum dr_icm_chunk_size chunk_size, enum dr_ste_htbl_type type, uint16_t lu_type, uint16_t byte_mask); int dr_ste_htbl_free(struct dr_ste_htbl *htbl); static inline void dr_htbl_put(struct dr_ste_htbl *htbl) { if (atomic_fetch_sub(&htbl->refcount, 1) == 1) dr_ste_htbl_free(htbl); } static inline void dr_htbl_get(struct dr_ste_htbl *htbl) { atomic_fetch_add(&htbl->refcount, 1); } /* STE utils */ uint32_t dr_ste_calc_hash_index(uint8_t *hw_ste_p, struct dr_ste_htbl *htbl); void dr_ste_set_miss_addr(struct dr_ste_ctx *ste_ctx, uint8_t *hw_ste_p, uint64_t miss_addr); void dr_ste_set_hit_addr_by_next_htbl(struct dr_ste_ctx *ste_ctx, uint8_t *hw_ste, struct dr_ste_htbl *next_htbl); void dr_ste_set_hit_addr(struct dr_ste_ctx *ste_ctx, uint8_t *hw_ste_p, uint64_t icm_addr, uint32_t ht_size); void dr_ste_set_hit_gvmi(struct dr_ste_ctx *ste_ctx, uint8_t *hw_ste_p, uint16_t gvmi); void dr_ste_set_bit_mask(uint8_t *hw_ste_p, struct dr_ste_build *sb); bool dr_ste_is_last_in_rule(struct dr_matcher_rx_tx *nic_matcher, uint8_t ste_location); uint64_t dr_ste_get_icm_addr(struct dr_ste *ste); uint64_t dr_ste_get_mr_addr(struct dr_ste *ste); struct list_head *dr_ste_get_miss_list(struct dr_ste *ste); struct dr_ste *dr_ste_get_miss_list_top(struct dr_ste *ste); static inline int dr_ste_tag_sz(struct dr_ste *ste) { if (ste->htbl->type == DR_STE_HTBL_TYPE_LEGACY) return DR_STE_SIZE_TAG; return DR_STE_SIZE_MATCH_TAG; } #define MAX_VLANS 2 struct dr_aso_cross_dmn_arrays { struct dr_ste_htbl **action_htbl; struct dr_ste_htbl **rule_htbl; }; struct dr_action_aso { struct mlx5dv_dr_domain *dmn; struct mlx5dv_devx_obj *devx_obj; uint32_t offset; uint8_t dest_reg_id; union { struct { bool set; } first_hit; struct { uint8_t initial_color; } flow_meter; struct { bool direction; } ct; }; };
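/*
 * Reference counting sketch (added note, descriptive only): the
 * dr_htbl_get()/dr_htbl_put() helpers above, like the dr_ste_get()/
 * dr_ste_put() helpers later in this header, follow the usual
 * last-put-frees discipline. Any code path that stores a pointer to an
 * htbl takes a reference and drops it when the pointer goes away:
 *
 *	dr_htbl_get(htbl);	// new user of this hash table
 *	// ... use htbl->ste_arr / htbl->chunk ...
 *	dr_htbl_put(htbl);	// the last put calls dr_ste_htbl_free()
 *
 * STE refcounts start at 0 and are incremented only when the STE appears
 * in a new rule, so dr_ste_is_not_used() identifies unused entries.
 */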
#define DR_INVALID_PATTERN_INDEX 0xffffffff struct dr_ste_actions_attr { uint32_t modify_index; uint32_t modify_pat_idx; uint16_t modify_actions; uint8_t *single_modify_action; uint32_t decap_index; uint32_t decap_pat_idx; uint16_t decap_actions; bool decap_with_vlan; uint64_t final_icm_addr; uint32_t flow_tag; uint32_t ctr_id; uint16_t gvmi; uint16_t hit_gvmi; uint32_t reformat_id; uint32_t reformat_size; bool prio_tag_required; struct { int count_pop; int count_push; uint32_t headers[MAX_VLANS]; } vlans; struct dr_action_aso *aso; uint32_t aso_ste_loc; struct mlx5dv_dr_domain *dmn; }; struct cross_dmn_params { uint32_t cross_dmn_loc; struct mlx5dv_dr_action *cross_dmn_action; }; void dr_ste_set_actions_rx(struct dr_ste_ctx *ste_ctx, uint8_t *action_type_set, uint8_t *last_ste, struct dr_ste_actions_attr *attr, uint32_t *added_stes); void dr_ste_set_actions_tx(struct dr_ste_ctx *ste_ctx, uint8_t *action_type_set, uint8_t *last_ste, struct dr_ste_actions_attr *attr, uint32_t *added_stes); void dr_ste_set_action_set(struct dr_ste_ctx *ste_ctx, __be64 *hw_action, uint8_t hw_field, uint8_t shifter, uint8_t length, uint32_t data); void dr_ste_set_action_add(struct dr_ste_ctx *ste_ctx, __be64 *hw_action, uint8_t hw_field, uint8_t shifter, uint8_t length, uint32_t data); void dr_ste_set_action_copy(struct dr_ste_ctx *ste_ctx, __be64 *hw_action, uint8_t dst_hw_field, uint8_t dst_shifter, uint8_t dst_len, uint8_t src_hw_field, uint8_t src_shifter); int dr_ste_set_action_decap_l3_list(struct dr_ste_ctx *ste_ctx, void *data, uint32_t data_sz, uint8_t *hw_action, uint32_t hw_action_sz, uint16_t *used_hw_action_num); void dr_ste_v1_set_aso_ct(uint8_t *d_action, uint32_t object_id, uint32_t offset, uint8_t dest_reg_id, bool direction); const struct dr_ste_action_modify_field * dr_ste_conv_modify_hdr_sw_field(struct dr_ste_ctx *ste_ctx, struct dr_devx_caps *caps, uint16_t sw_field); struct dr_ste_ctx *dr_ste_get_ctx(uint8_t version); void dr_ste_free(struct dr_ste *ste, struct mlx5dv_dr_rule *rule, struct dr_rule_rx_tx *nic_rule); static inline void dr_ste_put(struct dr_ste *ste, struct mlx5dv_dr_rule *rule, struct dr_rule_rx_tx *nic_rule) { if (atomic_fetch_sub(&ste->refcount, 1) == 1) dr_ste_free(ste, rule, nic_rule); } /* initial as 0, increased only when ste appears in a new rule */ static inline void dr_ste_get(struct dr_ste *ste) { atomic_fetch_add(&ste->refcount, 1); } static inline bool dr_ste_is_not_used(struct dr_ste *ste) { return !atomic_load(&ste->refcount); } bool dr_ste_equal_tag(void *src, void *dst, uint8_t tag_size); int dr_ste_create_next_htbl(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, struct dr_ste *ste, uint8_t *cur_hw_ste, enum dr_icm_chunk_size log_table_size, uint8_t send_ring_idx); /* STE build functions */ int dr_ste_build_pre_check(struct mlx5dv_dr_domain *dmn, uint8_t match_criteria, struct dr_match_param *mask, struct dr_match_param *value); int dr_ste_build_ste_arr(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, struct dr_match_param *value, uint8_t *ste_arr); void dr_ste_build_eth_l2_src_dst(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_eth_l3_ipv4_5_tuple(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_eth_l3_ipv4_misc(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_eth_l3_ipv6_dst(struct 
dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_eth_l3_ipv6_src(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_eth_l2_src(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_eth_l2_dst(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_eth_l2_tnl(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_eth_ipv6_l3_l4(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_eth_l4_misc(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_tnl_gre(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_mpls(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_tnl_mpls_over_gre(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx); void dr_ste_build_tnl_mpls_over_udp(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx); void dr_ste_build_icmp(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx); void dr_ste_build_tnl_vxlan_gpe(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_tnl_geneve(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_tnl_geneve_tlv_opt(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx); void dr_ste_build_tnl_geneve_tlv_opt_exist(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx); void dr_ste_build_tnl_gtpu(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_tnl_gtpu_flex_parser_0(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx); void dr_ste_build_tnl_gtpu_flex_parser_1(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx); void dr_ste_build_general_purpose(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_register_0(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_register_1(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_src_gvmi_qpn(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx); void dr_ste_build_flex_parser_0(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_flex_parser_1(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, 
bool inner, bool rx); void dr_ste_build_tunnel_header(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx); void dr_ste_build_ib_l4(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); int dr_ste_build_def0(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx); int dr_ste_build_def2(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx); int dr_ste_build_def6(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); int dr_ste_build_def16(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, struct dr_devx_caps *caps, bool inner, bool rx); int dr_ste_build_def22(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); int dr_ste_build_def24(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); int dr_ste_build_def25(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); int dr_ste_build_def26(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); int dr_ste_build_def28(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); int dr_ste_build_def33(struct dr_ste_ctx *ste_ctx, struct dr_ste_build *sb, struct dr_match_param *mask, bool inner, bool rx); void dr_ste_build_empty_always_hit(struct dr_ste_build *sb, bool rx); /* Actions utils */ int dr_actions_build_ste_arr(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, struct mlx5dv_dr_action *actions[], uint32_t num_actions, uint8_t *ste_arr, uint32_t *new_hw_ste_arr_sz, struct cross_dmn_params *cross_dmn_p, uint8_t send_ring_idx); int dr_actions_build_attr(struct mlx5dv_dr_matcher *matcher, struct mlx5dv_dr_action *actions[], size_t num_actions, struct mlx5dv_flow_action_attr *attr, struct mlx5_flow_action_attr_aux *attr_aux); uint32_t dr_actions_reformat_get_id(struct mlx5dv_dr_action *action); struct dr_match_spec { uint32_t smac_47_16; /* Source MAC address of incoming packet */ uint32_t smac_15_0:16; /* Source MAC address of incoming packet */ uint32_t ethertype:16; /* Incoming packet Ethertype - this is the Ethertype following the last ;VLAN tag of the packet */ uint32_t dmac_47_16; /* Destination MAC address of incoming packet */ uint32_t dmac_15_0:16; /* Destination MAC address of incoming packet */ uint32_t first_prio:3; /* Priority of first VLAN tag in the incoming packet. Valid only when ;cvlan_tag==1 or svlan_tag==1 */ uint32_t first_cfi:1; /* CFI bit of first VLAN tag in the incoming packet. Valid only when ;cvlan_tag==1 or svlan_tag==1 */ uint32_t first_vid:12; /* VLAN ID of first VLAN tag in the incoming packet. Valid only ;when cvlan_tag==1 or svlan_tag==1 */ uint32_t ip_protocol:8; /* IP protocol */ uint32_t ip_dscp:6; /* Differentiated Services Code Point derived from Traffic Class/;TOS field of IPv6/v4 */ uint32_t ip_ecn:2; /* Explicit Congestion Notification derived from Traffic Class/TOS ;field of IPv6/v4 */ uint32_t cvlan_tag:1; /* The first vlan in the packet is c-vlan (0x8100). cvlan_tag and ;svlan_tag cannot be set together */ uint32_t svlan_tag:1; /* The first vlan in the packet is s-vlan (0x88a8).
cvlan_tag and ;svlan_tag cannot be set together */ uint32_t frag:1; /* Packet is an IP fragment */ uint32_t ip_version:4; /* IP version */ uint32_t tcp_flags:9; /* TCP flags. ;Bit 0: FIN;Bit 1: SYN;Bit 2: RST;Bit 3: PSH;Bit 4: ACK;Bit 5: URG;Bit 6: ECE;Bit 7: CWR;Bit 8: NS */ uint32_t tcp_sport:16; /* TCP source port.;tcp and udp sport/dport are mutually exclusive */ uint32_t tcp_dport:16; /* TCP destination port. ;tcp and udp sport/dport are mutually exclusive */ uint32_t reserved_at_c0:16; uint32_t ipv4_ihl:4; uint32_t l3_ok:1; uint32_t l4_ok:1; uint32_t ipv4_checksum_ok:1; uint32_t l4_checksum_ok:1; uint32_t ip_ttl_hoplimit:8; uint32_t udp_sport:16; /* UDP source port.;tcp and udp sport/dport are mutually exclusive */ uint32_t udp_dport:16; /* UDP destination port.;tcp and udp sport/dport are mutually exclusive */ uint32_t src_ip_127_96; /* IPv6 source address of incoming packets ;For IPv4 address use bits 31:0 (rest of the bits are reserved);This field should be qualified by an appropriate ;ethertype */ uint32_t src_ip_95_64; /* IPv6 source address of incoming packets ;For IPv4 address use bits 31:0 (rest of the bits are reserved);This field should be qualified by an appropriate ;ethertype */ uint32_t src_ip_63_32; /* IPv6 source address of incoming packets ;For IPv4 address use bits 31:0 (rest of the bits are reserved);This field should be qualified by an appropriate ;ethertype */ uint32_t src_ip_31_0; /* IPv6 source address of incoming packets ;For IPv4 address use bits 31:0 (rest of the bits are reserved);This field should be qualified by an appropriate ;ethertype */ uint32_t dst_ip_127_96; /* IPv6 destination address of incoming packets ;For IPv4 address use bits 31:0 (rest of the bits are reserved);This field should be qualified by an appropriate ;ethertype */ uint32_t dst_ip_95_64; /* IPv6 destination address of incoming packets ;For IPv4 address use bits 31:0 (rest of the bits are reserved);This field should be qualified by an appropriate ;ethertype */ uint32_t dst_ip_63_32; /* IPv6 destination address of incoming packets ;For IPv4 address use bits 31:0 (rest of the bits are reserved);This field should be qualified by an appropriate ;ethertype */ uint32_t dst_ip_31_0; /* IPv6 destination address of incoming packets ;For IPv4 address use bits 31:0 (rest of the bits are reserved);This field should be qualified by an appropriate ;ethertype */ }; struct dr_match_misc { uint32_t gre_c_present:1; /* used with GRE, checksum exist when gre_c_present == 1 */ uint32_t bth_a:1; uint32_t gre_k_present:1; /* used with GRE, key exist when gre_k_present == 1 */ uint32_t gre_s_present:1; /* used with GRE, sequence number exist when gre_s_present == 1 */ uint32_t source_vhca_port:4; uint32_t source_sqn:24; /* Source SQN */ uint32_t source_eswitch_owner_vhca_id:16; uint32_t source_port:16; /* Source port.;0xffff determines wire port */ uint32_t outer_second_prio:3; /* Priority of second VLAN tag in the outer header of the incoming ;packet. Valid only when outer_second_cvlan_tag ==1 or outer_sec;ond_svlan_tag ==1 */ uint32_t outer_second_cfi:1; /* CFI bit of first VLAN tag in the outer header of the incoming packet. ;Valid only when outer_second_cvlan_tag ==1 or outer_sec;ond_svlan_tag ==1 */ uint32_t outer_second_vid:12; /* VLAN ID of first VLAN tag the outer header of the incoming packet. ;Valid only when outer_second_cvlan_tag ==1 or outer_sec;ond_svlan_tag ==1 */ uint32_t inner_second_prio:3; /* Priority of second VLAN tag in the inner header of the incoming ;packet. 
Valid only when inner_second_cvlan_tag ==1 or inner_sec;ond_svlan_tag ==1 */ uint32_t inner_second_cfi:1; /* CFI bit of first VLAN tag in the inner header of the incoming packet. ;Valid only when inner_second_cvlan_tag ==1 or inner_sec;ond_svlan_tag ==1 */ uint32_t inner_second_vid:12; /* VLAN ID of first VLAN tag in the inner header of the incoming packet. ;Valid only when inner_second_cvlan_tag ==1 or inner_sec;ond_svlan_tag ==1 */ uint32_t outer_second_cvlan_tag:1; /* The second vlan in the outer header of the packet is c-vlan (0x8100). ;outer_second_cvlan_tag and outer_second_svlan_tag cannot be set ;together */ uint32_t inner_second_cvlan_tag:1; /* The second vlan in the inner header of the packet is c-vlan (0x8100). ;inner_second_cvlan_tag and inner_second_svlan_tag cannot be set ;together */ uint32_t outer_second_svlan_tag:1; /* The second vlan in the outer header of the packet is s-vlan (0x88a8). ;outer_second_cvlan_tag and outer_second_svlan_tag cannot be set ;together */ uint32_t inner_second_svlan_tag:1; /* The second vlan in the inner header of the packet is s-vlan (0x88a8). ;inner_second_cvlan_tag and inner_second_svlan_tag cannot be set ;together */ uint32_t outer_emd_tag:1; uint32_t reserved_at_65:11; uint32_t gre_protocol:16; /* GRE Protocol (outer) */ uint32_t gre_key_h:24; /* GRE Key[31:8] (outer) */ uint32_t gre_key_l:8; /* GRE Key [7:0] (outer) */ uint32_t vxlan_vni:24; /* VXLAN VNI (outer) */ uint32_t bth_opcode:8; /* Opcode field from BTH header */ uint32_t geneve_vni:24; /* GENEVE VNI field (outer) */ uint32_t reserved_at_e4:6; uint32_t geneve_tlv_option_0_exist:1; uint32_t geneve_oam:1; /* GENEVE OAM field (outer) */ uint32_t reserved_at_ec:12; uint32_t outer_ipv6_flow_label:20; /* Flow label of incoming IPv6 packet (outer) */ uint32_t reserved_at_100:12; uint32_t inner_ipv6_flow_label:20; /* Flow label of incoming IPv6 packet (inner) */ uint32_t reserved_at_120:10; uint32_t geneve_opt_len:6; /* GENEVE OptLen (outer) */ uint32_t geneve_protocol_type:16; /* GENEVE protocol type (outer) */ uint32_t reserved_at_140:8; uint32_t bth_dst_qp:24; /* Destination QP in BTH header */ uint32_t inner_esp_spi; uint32_t outer_esp_spi; uint32_t reserved_at_1a0; uint32_t reserved_at_1c0; uint32_t reserved_at_1e0; }; struct dr_match_misc2 { uint32_t outer_first_mpls_label:20; /* First MPLS LABEL (outer) */ uint32_t outer_first_mpls_exp:3; /* First MPLS EXP (outer) */ uint32_t outer_first_mpls_s_bos:1; /* First MPLS S_BOS (outer) */ uint32_t outer_first_mpls_ttl:8; /* First MPLS TTL (outer) */ uint32_t inner_first_mpls_label:20; /* First MPLS LABEL (inner) */ uint32_t inner_first_mpls_exp:3; /* First MPLS EXP (inner) */ uint32_t inner_first_mpls_s_bos:1; /* First MPLS S_BOS (inner) */ uint32_t inner_first_mpls_ttl:8; /* First MPLS TTL (inner) */ uint32_t outer_first_mpls_over_gre_label:20; /* last MPLS LABEL (outer) */ uint32_t outer_first_mpls_over_gre_exp:3; /* last MPLS EXP (outer) */ uint32_t outer_first_mpls_over_gre_s_bos:1; /* last MPLS S_BOS (outer) */ uint32_t outer_first_mpls_over_gre_ttl:8; /* last MPLS TTL (outer) */ uint32_t outer_first_mpls_over_udp_label:20; /* last MPLS LABEL (outer) */ uint32_t outer_first_mpls_over_udp_exp:3; /* last MPLS EXP (outer) */ uint32_t outer_first_mpls_over_udp_s_bos:1; /* last MPLS S_BOS (outer) */ uint32_t outer_first_mpls_over_udp_ttl:8; /* last MPLS TTL (outer) */ uint32_t metadata_reg_c_7; /* metadata_reg_c_7 */ uint32_t metadata_reg_c_6; /* metadata_reg_c_6 */ uint32_t metadata_reg_c_5; /* metadata_reg_c_5 */ uint32_t metadata_reg_c_4;
/* metadata_reg_c_4 */ uint32_t metadata_reg_c_3; /* metadata_reg_c_3 */ uint32_t metadata_reg_c_2; /* metadata_reg_c_2 */ uint32_t metadata_reg_c_1; /* metadata_reg_c_1 */ uint32_t metadata_reg_c_0; /* metadata_reg_c_0 */ uint32_t metadata_reg_a; /* metadata_reg_a */ uint32_t reserved_at_1a0; uint32_t reserved_at_1c0; uint32_t reserved_at_1e0; }; struct dr_match_misc3 { uint32_t inner_tcp_seq_num; uint32_t outer_tcp_seq_num; uint32_t inner_tcp_ack_num; uint32_t outer_tcp_ack_num; uint32_t reserved_at_80:8; uint32_t outer_vxlan_gpe_vni:24; uint32_t outer_vxlan_gpe_next_protocol:8; uint32_t outer_vxlan_gpe_flags:8; uint32_t reserved_at_b0:16; uint32_t icmpv4_header_data; uint32_t icmpv6_header_data; uint8_t icmpv4_type; uint8_t icmpv4_code; uint8_t icmpv6_type; uint8_t icmpv6_code; uint32_t geneve_tlv_option_0_data; uint32_t gtpu_teid; uint32_t gtpu_msg_type:8; uint32_t gtpu_msg_flags:8; uint32_t reserved_at_170:16; uint32_t gtpu_dw_2; uint32_t gtpu_first_ext_dw_0; uint32_t gtpu_dw_0; uint32_t reserved_at_1e0; }; struct dr_match_misc4 { uint32_t prog_sample_field_value_0; uint32_t prog_sample_field_id_0; uint32_t prog_sample_field_value_1; uint32_t prog_sample_field_id_1; uint32_t prog_sample_field_value_2; uint32_t prog_sample_field_id_2; uint32_t prog_sample_field_value_3; uint32_t prog_sample_field_id_3; uint32_t prog_sample_field_value_4; uint32_t prog_sample_field_id_4; uint32_t prog_sample_field_value_5; uint32_t prog_sample_field_id_5; uint32_t prog_sample_field_value_6; uint32_t prog_sample_field_id_6; uint32_t prog_sample_field_value_7; uint32_t prog_sample_field_id_7; }; struct dr_match_misc5 { uint32_t macsec_tag_0; uint32_t macsec_tag_1; uint32_t macsec_tag_2; uint32_t macsec_tag_3; uint32_t tunnel_header_0; uint32_t tunnel_header_1; uint32_t tunnel_header_2; uint32_t tunnel_header_3; uint32_t reserved_at_100; uint32_t reserved_at_120; uint32_t reserved_at_140; uint32_t reserved_at_160; uint32_t reserved_at_180; uint32_t reserved_at_1a0; uint32_t reserved_at_1c0; uint32_t reserved_at_1e0; }; struct dr_match_param { struct dr_match_spec outer; struct dr_match_misc misc; struct dr_match_spec inner; struct dr_match_misc2 misc2; struct dr_match_misc3 misc3; struct dr_match_misc4 misc4; struct dr_match_misc5 misc5; }; #define DR_MASK_IS_ICMPV4_SET(_misc3) ((_misc3)->icmpv4_type || \ (_misc3)->icmpv4_code || \ (_misc3)->icmpv4_header_data) struct dr_esw_caps { uint64_t drop_icm_address_rx; uint64_t drop_icm_address_tx; uint64_t uplink_icm_address_rx; uint64_t uplink_icm_address_tx; bool sw_owner; bool sw_owner_v2; }; struct dr_devx_vport_cap { uint16_t vport_gvmi; uint16_t vhca_gvmi; uint64_t icm_address_rx; uint64_t icm_address_tx; uint16_t num; uint32_t metadata_c; uint32_t metadata_c_mask; /* locate vports table */ struct dr_devx_vport_cap *next; }; struct dr_devx_roce_cap { bool roce_en; bool fl_rc_qp_when_roce_disabled; bool fl_rc_qp_when_roce_enabled; uint8_t qp_ts_format; }; struct dr_vports_table { struct dr_devx_vport_cap *buckets[DR_VPORTS_BUCKETS]; }; struct dr_devx_vports { /* E-Switch manager */ struct dr_devx_vport_cap esw_mngr; /* Uplink */ struct dr_devx_vport_cap wire; /* PF + VFS + SF */ struct dr_vports_table *vports; /* IB ports to vport + other_vports */ struct dr_devx_vport_cap **ib_ports; /* Number of vports PF + VFS + SFS + WIRE */ uint32_t num_ports; /* Protect vport query and add*/ pthread_spinlock_t lock; }; struct dr_devx_caps { struct mlx5dv_dr_domain *dmn; uint16_t gvmi; uint64_t nic_rx_drop_address; uint64_t nic_tx_drop_address; uint64_t 
nic_tx_allow_address; uint64_t esw_rx_drop_address; uint64_t esw_tx_drop_address; uint32_t log_icm_size; uint8_t log_modify_hdr_icm_size; uint64_t hdr_modify_icm_addr; uint32_t log_modify_pattern_icm_size; uint64_t hdr_modify_pattern_icm_addr; uint64_t indirect_encap_icm_base; uint32_t log_sw_encap_icm_size; uint16_t max_encap_size; uint32_t flex_protocols; uint8_t flex_parser_header_modify; uint8_t flex_parser_id_icmp_dw0; uint8_t flex_parser_id_icmp_dw1; uint8_t flex_parser_id_icmpv6_dw0; uint8_t flex_parser_id_icmpv6_dw1; uint8_t flex_parser_id_geneve_opt_0; uint8_t flex_parser_id_mpls_over_gre; uint8_t flex_parser_id_mpls_over_udp; uint8_t flex_parser_id_gtpu_dw_0; uint8_t flex_parser_id_gtpu_teid; uint8_t flex_parser_id_gtpu_dw_2; uint8_t flex_parser_id_gtpu_first_ext_dw_0; uint8_t flex_parser_ok_bits_supp; uint8_t definer_supp_checksum; uint8_t max_ft_level; uint8_t sw_format_ver; bool isolate_vl_tc; bool eswitch_manager; bool rx_sw_owner; bool tx_sw_owner; bool fdb_sw_owner; bool rx_sw_owner_v2; bool tx_sw_owner_v2; bool fdb_sw_owner_v2; struct dr_devx_roce_cap roce_caps; uint64_t definer_format_sup; uint16_t log_header_modify_argument_granularity; uint16_t log_header_modify_argument_max_alloc; bool support_modify_argument; bool prio_tag_required; bool is_ecpf; struct dr_devx_vports vports; bool support_full_tnl_hdr; }; struct dr_devx_flow_table_attr { uint8_t type; uint8_t level; bool sw_owner; bool term_tbl; bool reformat_en; uint64_t icm_addr_rx; uint64_t icm_addr_tx; }; struct dr_devx_flow_group_attr { uint32_t table_id; uint32_t table_type; }; struct dr_devx_flow_dest_info { enum dr_devx_flow_dest_type type; union { uint32_t vport_num; uint32_t tir_num; uint32_t counter_id; uint32_t ft_id; }; bool has_reformat; uint32_t reformat_id; }; struct dr_devx_flow_fte_attr { uint32_t table_id; uint32_t table_type; uint32_t group_id; uint32_t flow_tag; uint32_t action; uint32_t dest_size; struct dr_devx_flow_dest_info *dest_arr; bool extended_dest; }; struct dr_devx_tbl { uint8_t type; uint8_t level; struct mlx5dv_devx_obj *ft_dvo; struct mlx5dv_devx_obj *fg_dvo; struct mlx5dv_devx_obj *fte_dvo; }; struct dr_devx_flow_sampler_attr { uint8_t table_type; uint8_t level; uint8_t ignore_flow_level; uint32_t sample_ratio; uint32_t default_next_table_id; uint32_t sample_table_id; }; enum dr_domain_nic_type { DR_DOMAIN_NIC_TYPE_RX, DR_DOMAIN_NIC_TYPE_TX, }; struct dr_domain_rx_tx { uint64_t drop_icm_addr; uint64_t default_icm_addr; enum dr_domain_nic_type type; /* protect rx/tx domain */ pthread_spinlock_t locks[NUM_OF_LOCKS]; }; struct dr_domain_info { bool supp_sw_steering; uint32_t max_log_sw_icm_sz; uint32_t max_log_action_icm_sz; uint32_t max_log_modify_hdr_pattern_icm_sz; uint32_t max_log_sw_icm_rehash_sz; uint32_t max_log_sw_encap_icm_sz; uint32_t max_send_size; struct dr_domain_rx_tx rx; struct dr_domain_rx_tx tx; struct ibv_device_attr_ex attr; struct dr_devx_caps caps; bool use_mqs; }; enum dr_domain_flags { DR_DOMAIN_FLAG_MEMORY_RECLAIM = 1 << 0, DR_DOMAIN_FLAG_DISABLE_DUPLICATE_RULES = 1 << 1, }; struct mlx5dv_dr_domain { struct ibv_context *ctx; struct dr_ste_ctx *ste_ctx; struct ibv_pd *pd; int pd_num; struct mlx5dv_devx_uar *uar; enum mlx5dv_dr_domain_type type; atomic_int refcount; struct dr_icm_pool *ste_icm_pool; struct dr_icm_pool *action_icm_pool; struct dr_ptrn_mngr *modify_header_ptrn_mngr; struct dr_arg_mngr *modify_header_arg_mngr; struct dr_icm_pool *encap_icm_pool; struct dr_send_ring *send_ring[DR_MAX_SEND_RINGS]; struct dr_domain_info info; struct list_head tbl_list; 
uint32_t flags; /* protect debug lists of all tracked objects */ pthread_spinlock_t debug_lock; /* statistics */ uint32_t num_buddies[DR_ICM_TYPE_MAX]; }; static inline int dr_domain_nic_lock_init(struct dr_domain_rx_tx *nic_dmn) { int ret; int i; for (i = 0; i < NUM_OF_LOCKS; i++) { ret = pthread_spin_init(&nic_dmn->locks[i], PTHREAD_PROCESS_PRIVATE); if (ret) { errno = ret; goto destroy_locks; } } return 0; destroy_locks: while (i--) pthread_spin_destroy(&nic_dmn->locks[i]); return ret; } static inline void dr_domain_nic_lock_uninit(struct dr_domain_rx_tx *nic_dmn) { int i; for (i = 0; i < NUM_OF_LOCKS; i++) pthread_spin_destroy(&nic_dmn->locks[i]); } static inline void dr_domain_nic_lock(struct dr_domain_rx_tx *nic_dmn) { int i; for (i = 0; i < NUM_OF_LOCKS; i++) pthread_spin_lock(&nic_dmn->locks[i]); } static inline void dr_domain_nic_unlock(struct dr_domain_rx_tx *nic_dmn) { int i; for (i = 0; i < NUM_OF_LOCKS; i++) pthread_spin_unlock(&nic_dmn->locks[i]); } static inline void dr_domain_lock(struct mlx5dv_dr_domain *dmn) { dr_domain_nic_lock(&dmn->info.rx); dr_domain_nic_lock(&dmn->info.tx); } static inline void dr_domain_unlock(struct mlx5dv_dr_domain *dmn) { dr_domain_nic_unlock(&dmn->info.tx); dr_domain_nic_unlock(&dmn->info.rx); } struct dr_table_rx_tx { struct dr_ste_htbl *s_anchor; struct dr_domain_rx_tx *nic_dmn; }; struct mlx5dv_dr_table { struct mlx5dv_dr_domain *dmn; struct dr_table_rx_tx rx; struct dr_table_rx_tx tx; uint32_t level; uint32_t table_type; struct list_head matcher_list; struct mlx5dv_devx_obj *devx_obj; atomic_int refcount; struct list_node tbl_list; }; struct dr_matcher_rx_tx { struct dr_ste_htbl *s_htbl; struct dr_ste_htbl *e_anchor; struct dr_ste_build ste_builder[DR_RULE_MAX_STES]; uint8_t num_of_builders; uint64_t default_icm_addr; struct dr_table_rx_tx *nic_tbl; bool fixed_size; }; struct mlx5dv_dr_matcher { struct mlx5dv_dr_table *tbl; struct dr_matcher_rx_tx rx; struct dr_matcher_rx_tx tx; struct list_node matcher_list; uint16_t prio; struct dr_match_param mask; uint8_t match_criteria; atomic_int refcount; struct mlx5dv_flow_matcher *dv_matcher; struct list_head rule_list; }; struct dr_ste_action_modify_field { uint16_t hw_field; uint8_t start; uint8_t end; uint8_t l3_type; uint8_t l4_type; uint32_t flags; }; struct dr_devx_tbl_with_refs { uint16_t ref_actions_num; struct mlx5dv_dr_action **ref_actions; struct dr_devx_tbl *devx_tbl; }; struct dr_flow_sampler { struct mlx5dv_devx_obj *devx_obj; uint64_t rx_icm_addr; uint64_t tx_icm_addr; struct mlx5dv_dr_table *next_ft; }; struct dr_flow_sampler_restore_tbl { struct mlx5dv_dr_table *tbl; struct mlx5dv_dr_matcher *matcher; struct mlx5dv_dr_rule *rule; struct mlx5dv_dr_action **actions; uint16_t num_of_actions; }; struct dr_rewrite_param { struct dr_icm_chunk *chunk; uint8_t *data; uint32_t data_size; uint16_t num_of_actions; uint32_t index; }; enum dr_ptrn_type { DR_PTRN_TYP_MODIFY_HDR = DR_ACTION_TYP_MODIFY_HDR, DR_PTRN_TYP_TNL_L3_TO_L2 = DR_ACTION_TYP_TNL_L3_TO_L2, }; struct dr_ptrn_obj { struct dr_rewrite_param rewrite_param; atomic_int refcount; struct list_node list; enum dr_ptrn_type type; }; struct dr_arg_obj { struct mlx5dv_devx_obj *obj; uint32_t obj_offset; struct list_node list_node; uint32_t log_chunk_size; }; struct mlx5dv_dr_action { enum dr_action_type action_type; atomic_int refcount; union { struct { struct mlx5dv_dr_domain *dmn; bool is_root_level; uint32_t args_send_qp; union { struct ibv_flow_action *flow_action; /* root */ struct { struct dr_rewrite_param param; uint8_t
single_action_opt:1; uint8_t allow_rx:1; uint8_t allow_tx:1; struct { struct dr_ptrn_obj *ptrn; struct dr_arg_obj *arg; } ptrn_arg; }; }; } rewrite; struct { struct mlx5dv_dr_domain *dmn; bool is_root_level; union { struct ibv_flow_action *flow_action; /* root*/ struct { struct mlx5dv_devx_obj *dvo; uint8_t *data; uint32_t index; struct dr_icm_chunk *chunk; uint32_t reformat_size; }; }; } reformat; struct { struct mlx5dv_dr_table *next_ft; struct mlx5dv_devx_obj *devx_obj; uint64_t rx_icm_addr; uint64_t tx_icm_addr; } meter; struct { struct mlx5dv_dr_domain *dmn; struct dr_devx_tbl_with_refs *term_tbl; struct dr_flow_sampler *sampler_default; struct dr_flow_sampler_restore_tbl *restore_tbl; struct dr_flow_sampler *sampler_restore; } sampler; struct mlx5dv_dr_table *dest_tbl; struct { struct mlx5dv_dr_domain *dmn; struct list_head actions_list; struct dr_devx_tbl *devx_tbl; uint64_t rx_icm_addr; uint64_t tx_icm_addr; } dest_array; struct { struct mlx5dv_devx_obj *devx_obj; uint32_t offset; } ctr; struct { struct mlx5dv_dr_domain *dmn; struct dr_devx_vport_cap *caps; } vport; struct { uint32_t vlan_hdr; } push_vlan; struct { bool is_qp; union { struct mlx5dv_devx_obj *devx_tir; struct ibv_qp *qp; }; } dest_qp; struct { struct mlx5dv_dr_table *tbl; struct dr_devx_tbl *devx_tbl; struct mlx5dv_steering_anchor *sa; uint64_t rx_icm_addr; uint64_t tx_icm_addr; } root_tbl; struct dr_action_aso aso; struct mlx5dv_devx_obj *devx_obj; uint32_t flow_tag; }; }; struct dr_rule_action_member { struct mlx5dv_dr_action *action; struct list_node list; }; enum dr_connect_type { CONNECT_HIT = 1, CONNECT_MISS = 2, }; struct dr_htbl_connect_info { enum dr_connect_type type; union { struct dr_ste_htbl *hit_next_htbl; uint64_t miss_icm_addr; }; }; struct dr_rule_rx_tx { struct dr_matcher_rx_tx *nic_matcher; struct dr_ste *last_rule_ste; uint8_t lock_index; }; struct mlx5dv_dr_rule { struct mlx5dv_dr_matcher *matcher; union { struct { struct dr_rule_rx_tx rx; struct dr_rule_rx_tx tx; }; struct ibv_flow *flow; }; struct list_node rule_list; struct mlx5dv_dr_action **actions; uint16_t num_actions; }; static inline void dr_rule_lock(struct dr_rule_rx_tx *nic_rule, uint8_t *hw_ste) { struct dr_matcher_rx_tx *nic_matcher = nic_rule->nic_matcher; struct dr_domain_rx_tx *nic_dmn = nic_matcher->nic_tbl->nic_dmn; uint32_t index; if (nic_matcher->fixed_size) { if (hw_ste) { index = dr_ste_calc_hash_index(hw_ste, nic_matcher->s_htbl); nic_rule->lock_index = index % NUM_OF_LOCKS; } pthread_spin_lock(&nic_dmn->locks[nic_rule->lock_index]); } else { pthread_spin_lock(&nic_dmn->locks[0]); } } static inline void dr_rule_unlock(struct dr_rule_rx_tx *nic_rule) { struct dr_matcher_rx_tx *nic_matcher = nic_rule->nic_matcher; struct dr_domain_rx_tx *nic_dmn = nic_matcher->nic_tbl->nic_dmn; if (nic_matcher->fixed_size) pthread_spin_unlock(&nic_dmn->locks[nic_rule->lock_index]); else pthread_spin_unlock(&nic_dmn->locks[0]); } void dr_rule_set_last_member(struct dr_rule_rx_tx *nic_rule, struct dr_ste *ste, bool force); void dr_rule_get_reverse_rule_members(struct dr_ste **ste_arr, struct dr_ste *curr_ste, int *num_of_stes); int dr_rule_send_update_list(struct list_head *send_ste_list, struct mlx5dv_dr_domain *dmn, bool is_reverse, uint8_t send_ring_idx); struct dr_icm_chunk { struct dr_icm_buddy_mem *buddy_mem; struct list_node chunk_list; uint32_t num_of_entries; uint32_t byte_size; /* segment indicates the index of this chunk in its buddy's memory */ uint32_t seg; /* Memory optimisation */ struct dr_ste *ste_arr; uint8_t *hw_ste_arr; 
struct list_head *miss_list; }; static inline int dr_icm_pool_dm_type_to_entry_size(enum dr_icm_type icm_type) { if (icm_type == DR_ICM_TYPE_STE) return DR_STE_SIZE; else if (icm_type == DR_ICM_TYPE_ENCAP) return DR_SW_ENCAP_ENTRY_SIZE; return DR_MODIFY_ACTION_SIZE; } static inline uint32_t dr_icm_pool_chunk_size_to_entries(enum dr_icm_chunk_size chunk_size) { return 1 << chunk_size; } static inline int dr_icm_pool_chunk_size_to_byte(enum dr_icm_chunk_size chunk_size, enum dr_icm_type icm_type) { int num_of_entries; int entry_size; entry_size = dr_icm_pool_dm_type_to_entry_size(icm_type); num_of_entries = dr_icm_pool_chunk_size_to_entries(chunk_size); return entry_size * num_of_entries; } void dr_icm_pool_set_pool_max_log_chunk_sz(struct dr_icm_pool *pool, enum dr_icm_chunk_size max_log_chunk_sz); static inline int dr_ste_htbl_increase_threshold(struct dr_ste_htbl *htbl) { int num_of_entries = dr_icm_pool_chunk_size_to_entries(htbl->chunk_size); /* Threshold is 50%, one is added to table of size 1 */ return (num_of_entries + 1) / 2; } static inline bool dr_ste_htbl_may_grow(struct dr_ste_htbl *htbl) { if (htbl->chunk_size == DR_CHUNK_SIZE_MAX - 1 || (htbl->type == DR_STE_HTBL_TYPE_LEGACY && !htbl->byte_mask)) return false; return true; } /* internal API functions */ int dr_devx_query_device(struct ibv_context *ctx, struct dr_devx_caps *caps); int dr_devx_query_esw_vport_context(struct ibv_context *ctx, bool other_vport, uint16_t vport_number, uint64_t *icm_address_rx, uint64_t *icm_address_tx); int dr_devx_query_gvmi(struct ibv_context *ctx, bool other_vport, uint16_t vport_number, uint16_t *gvmi); int dr_devx_query_esw_caps(struct ibv_context *ctx, struct dr_esw_caps *caps); int dr_devx_sync_steering(struct ibv_context *ctx); struct mlx5dv_devx_obj * dr_devx_create_flow_table(struct ibv_context *ctx, struct dr_devx_flow_table_attr *table_attr); int dr_devx_query_flow_table(struct mlx5dv_devx_obj *obj, uint32_t type, uint64_t *rx_icm_addr, uint64_t *tx_icm_addr); struct dr_devx_tbl * dr_devx_create_always_hit_ft(struct ibv_context *ctx, struct dr_devx_flow_table_attr *ft_attr, struct dr_devx_flow_group_attr *fg_attr, struct dr_devx_flow_fte_attr *fte_attr); void dr_devx_destroy_always_hit_ft(struct dr_devx_tbl *devx_tbl); struct mlx5dv_devx_obj * dr_devx_create_flow_sampler(struct ibv_context *ctx, struct dr_devx_flow_sampler_attr *sampler_attr); int dr_devx_query_flow_sampler(struct mlx5dv_devx_obj *obj, uint64_t *rx_icm_addr, uint64_t *tx_icm_addr); struct mlx5dv_devx_obj *dr_devx_create_definer(struct ibv_context *ctx, uint16_t format_id, uint8_t *match_mask); struct mlx5dv_devx_obj *dr_devx_create_reformat_ctx(struct ibv_context *ctx, enum reformat_type rt, size_t reformat_size, void *reformat_data); struct mlx5dv_devx_obj *dr_devx_create_meter(struct ibv_context *ctx, struct mlx5dv_dr_flow_meter_attr *attr); int dr_devx_query_meter(struct mlx5dv_devx_obj *obj, uint64_t *rx_icm_addr, uint64_t *tx_icm_addr); int dr_devx_modify_meter(struct mlx5dv_devx_obj *obj, struct mlx5dv_dr_flow_meter_attr *attr, __be64 modify_bits); struct mlx5dv_devx_obj *dr_devx_create_cq(struct ibv_context *ctx, uint32_t page_id, uint32_t buff_umem_id, uint32_t db_umem_id, uint32_t eqn, int ncqe, int cqen); struct dr_devx_qp_create_attr { uint32_t page_id; uint32_t pdn; uint32_t cqn; uint32_t pm_state; uint32_t service_type; uint32_t buff_umem_id; uint32_t db_umem_id; uint32_t sq_wqe_cnt; uint32_t rq_wqe_cnt; uint32_t rq_wqe_shift; bool isolate_vl_tc; uint8_t qp_ts_format; }; struct mlx5dv_devx_obj 
*dr_devx_create_qp(struct ibv_context *ctx, struct dr_devx_qp_create_attr *attr); int dr_devx_modify_qp_rst2init(struct ibv_context *ctx, struct mlx5dv_devx_obj *qp_obj, uint16_t port); struct dr_gid_attr { union ibv_gid gid; enum roce_version roce_ver; uint8_t mac[ETHERNET_LL_SIZE]; }; struct dr_devx_qp_rtr_attr { struct dr_gid_attr dgid_attr; enum ibv_mtu mtu; uint32_t qp_num; uint16_t port_num; uint8_t min_rnr_timer; uint8_t sgid_index; bool fl; }; int dr_devx_modify_qp_init2rtr(struct ibv_context *ctx, struct mlx5dv_devx_obj *qp_obj, struct dr_devx_qp_rtr_attr *attr); struct dr_devx_qp_rts_attr { uint8_t timeout; uint8_t retry_cnt; uint8_t rnr_retry; }; int dr_devx_modify_qp_rtr2rts(struct ibv_context *ctx, struct mlx5dv_devx_obj *qp_obj, struct dr_devx_qp_rts_attr *attr); int dr_devx_query_gid(struct ibv_context *ctx, uint8_t vhca_port_num, uint16_t index, struct dr_gid_attr *attr); struct mlx5dv_devx_obj *dr_devx_create_modify_header_arg(struct ibv_context *ctx, uint16_t log_obj_range, uint32_t pd); static inline bool dr_is_root_table(struct mlx5dv_dr_table *tbl) { return tbl->level == 0; } bool dr_domain_is_support_ste_icm_size(struct mlx5dv_dr_domain *dmn, uint32_t req_log_icm_sz); bool dr_domain_set_max_ste_icm_size(struct mlx5dv_dr_domain *dmn, uint32_t req_log_icm_sz); int dr_rule_rehash_matcher_s_anchor(struct mlx5dv_dr_matcher *matcher, struct dr_matcher_rx_tx *nic_matcher, enum dr_icm_chunk_size new_size); struct dr_icm_pool *dr_icm_pool_create(struct mlx5dv_dr_domain *dmn, enum dr_icm_type icm_type); void dr_icm_pool_destroy(struct dr_icm_pool *pool); int dr_icm_pool_sync_pool(struct dr_icm_pool *pool); uint64_t dr_icm_pool_get_chunk_icm_addr(struct dr_icm_chunk *chunk); uint64_t dr_icm_pool_get_chunk_mr_addr(struct dr_icm_chunk *chunk); uint32_t dr_icm_pool_get_chunk_rkey(struct dr_icm_chunk *chunk); struct dr_icm_chunk *dr_icm_alloc_chunk(struct dr_icm_pool *pool, enum dr_icm_chunk_size chunk_size); void dr_icm_free_chunk(struct dr_icm_chunk *chunk); void dr_ste_prepare_for_postsend(struct dr_ste_ctx *ste_ctx, uint8_t *hw_ste_p, uint32_t ste_size); int dr_ste_htbl_init_and_postsend(struct mlx5dv_dr_domain *dmn, struct dr_domain_rx_tx *nic_dmn, struct dr_ste_htbl *htbl, struct dr_htbl_connect_info *connect_info, bool update_hw_ste, uint8_t send_ring_idx); void dr_ste_set_formated_ste(struct dr_ste_ctx *ste_ctx, uint16_t gvmi, enum dr_domain_nic_type nic_type, struct dr_ste_htbl *htbl, uint8_t *formated_ste, struct dr_htbl_connect_info *connect_info); void dr_ste_copy_param(uint8_t match_criteria, struct dr_match_param *set_param, uint64_t *mask_buf, size_t mask_sz, bool clear); void dr_crc32_init_table(void); uint32_t dr_crc32_slice8_calc(const void *input_data, size_t length); struct dr_wq { unsigned *wqe_head; unsigned wqe_cnt; unsigned max_post; unsigned head; unsigned tail; unsigned cur_post; int max_gs; int wqe_shift; int offset; void *qend; }; struct dr_qp { struct mlx5_buf buf; struct dr_wq sq; struct dr_wq rq; int sq_size; void *sq_start; int max_inline_data; __be32 *db; struct mlx5dv_devx_obj *obj; struct mlx5dv_devx_uar *uar; struct mlx5dv_devx_umem *buf_umem; struct mlx5dv_devx_umem *db_umem; uint8_t nc_uar : 1; }; struct dr_cq { uint8_t *buf; uint32_t cons_index; int ncqe; struct dr_qp *qp; /* Assume CQ per QP */ __be32 *db; struct ibv_cq *ibv_cq; uint32_t cqn; uint32_t cqe_sz; }; #define MAX_SEND_CQE 64
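/*
 * Editor's note -- illustrative sketch, not part of the original source:
 * the dr_send_ring below batches completions.  Instead of requesting a
 * CQE for every posted WQE, pending_wqe counts posted WQEs and only
 * every signal_th-th post is signaled; on an in-order QP, polling that
 * single CQE confirms all earlier WQEs completed as well.  The
 * hypothetical helper below (made-up name, assumes signal_th > 0) shows
 * how a post path could pick the completion flag:
 */
static inline bool dr_example_wqe_needs_cqe(uint32_t pending_wqe,
					    uint16_t signal_th)
{
	/* signal only one WQE per signal_th posts */
	return pending_wqe % signal_th == 0;
}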
struct dr_send_ring { struct dr_cq cq; struct dr_qp *qp; struct ibv_mr *mr; /* How many WQEs are waiting for completion */ uint32_t pending_wqe; /* Signal request per this threshold value */ uint16_t signal_th; uint32_t max_inline_size; /* manage the send queue */ uint32_t tx_head; /* protect QP/CQ operations */ pthread_spinlock_t lock; void *buf; uint32_t buf_size; void *sync_buff; struct ibv_mr *sync_mr; }; int dr_send_ring_alloc(struct mlx5dv_dr_domain *dmn); void dr_send_ring_free(struct mlx5dv_dr_domain *dmn); int dr_send_ring_force_drain(struct mlx5dv_dr_domain *dmn); bool dr_send_allow_fl(struct dr_devx_caps *caps); int dr_send_postsend_ste(struct mlx5dv_dr_domain *dmn, struct dr_ste *ste, uint8_t *data, uint16_t size, uint16_t offset, uint8_t ring_idx); int dr_send_postsend_htbl(struct mlx5dv_dr_domain *dmn, struct dr_ste_htbl *htbl, uint8_t *formated_ste, uint8_t *mask, uint8_t send_ring_idx); int dr_send_postsend_formated_htbl(struct mlx5dv_dr_domain *dmn, struct dr_ste_htbl *htbl, uint8_t *ste_init_data, bool update_hw_ste, uint8_t send_ring_idx); int dr_send_postsend_action(struct mlx5dv_dr_domain *dmn, struct mlx5dv_dr_action *action); int dr_send_postsend_pattern(struct mlx5dv_dr_domain *dmn, struct dr_icm_chunk *chunk, uint16_t num_of_actions, uint8_t *data); int dr_send_postsend_args(struct mlx5dv_dr_domain *dmn, uint64_t arg_id, uint16_t num_of_actions, uint8_t *actions_data, uint32_t ring_index); /* buddy functions & structure */ struct dr_icm_mr; struct dr_icm_buddy_mem { unsigned long **bits; unsigned int *num_free; unsigned long **set_bit; uint32_t max_order; struct list_node list_node; struct dr_icm_mr *icm_mr; struct dr_icm_pool *pool; /* This is the list of used chunks. HW may be accessing this memory */ struct list_head used_list; size_t used_memory; /* Hardware may be accessing this memory, but at some future, * undetermined time, it might cease to do so. * The sync_ste command sets them free.
*/ struct list_head hot_list; /* Memory optimization */ struct dr_ste *ste_arr; struct list_head *miss_list; uint8_t *hw_ste_arr; /* HW STE cache entry size */ uint8_t hw_ste_sz; }; bool dr_domain_is_support_modify_hdr_cache(struct mlx5dv_dr_domain *dmn); struct dr_ptrn_mngr * dr_ptrn_mngr_create(struct mlx5dv_dr_domain *dmn); void dr_ptrn_mngr_destroy(struct dr_ptrn_mngr *mngr); struct dr_ptrn_obj * dr_ptrn_cache_get_pattern(struct dr_ptrn_mngr *mngr, enum dr_ptrn_type type, uint16_t num_of_actions, uint8_t *data); void dr_ptrn_cache_put_pattern(struct dr_ptrn_mngr *mngr, struct dr_ptrn_obj *pattern); int dr_ptrn_sync_pool(struct dr_ptrn_mngr *ptrn_mngr); struct dr_arg_mngr* dr_arg_mngr_create(struct mlx5dv_dr_domain *dmn); void dr_arg_mngr_destroy(struct dr_arg_mngr *mngr); struct dr_arg_obj *dr_arg_get_obj(struct dr_arg_mngr *mngr, uint16_t num_of_actions, uint8_t *data); void dr_arg_put_obj(struct dr_arg_mngr *mngr, struct dr_arg_obj *arg_obj); uint32_t dr_arg_get_object_id(struct dr_arg_obj *arg_obj); bool dr_domain_is_support_sw_encap(struct mlx5dv_dr_domain *dmn); int dr_buddy_init(struct dr_icm_buddy_mem *buddy, uint32_t max_order); void dr_buddy_cleanup(struct dr_icm_buddy_mem *buddy); int dr_buddy_alloc_mem(struct dr_icm_buddy_mem *buddy, int order); void dr_buddy_free_mem(struct dr_icm_buddy_mem *buddy, uint32_t seg, int order); void dr_ste_free_modify_hdr(struct mlx5dv_dr_action *action); int dr_ste_alloc_modify_hdr(struct mlx5dv_dr_action *action); int dr_ste_alloc_encap(struct mlx5dv_dr_action *action); void dr_ste_free_encap(struct mlx5dv_dr_action *action); void dr_vports_table_add_wire(struct dr_devx_vports *vports); void dr_vports_table_del_wire(struct dr_devx_vports *vports); struct dr_devx_vport_cap *dr_vports_table_get_vport_cap(struct dr_devx_caps *caps, uint16_t vport); struct dr_devx_vport_cap *dr_vports_table_get_ib_port_cap(struct dr_devx_caps *caps, uint32_t ib_port); struct dr_vports_table *dr_vports_table_create(struct mlx5dv_dr_domain *dmn); void dr_vports_table_destroy(struct dr_vports_table *vports_tbl); #endif rdma-core-56.1/providers/mlx5/qp.c000066400000000000000000003377221477342711600170360ustar00rootroot00000000000000/* * Copyright (c) 2012 Mellanox Technologies, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/
#include <config.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>
#include <errno.h>
#include <stdio.h>
#include <util/compiler.h>
#include <util/mmio.h>
#include "mlx5.h"
#include "mlx5_ifc.h"
#include "mlx5_trace.h"
#include "wqe.h"
#define MLX5_ATOMIC_SIZE 8 static const uint32_t mlx5_ib_opcode[] = { [IBV_WR_SEND] = MLX5_OPCODE_SEND, [IBV_WR_SEND_WITH_INV] = MLX5_OPCODE_SEND_INVAL, [IBV_WR_SEND_WITH_IMM] = MLX5_OPCODE_SEND_IMM, [IBV_WR_RDMA_WRITE] = MLX5_OPCODE_RDMA_WRITE, [IBV_WR_RDMA_WRITE_WITH_IMM] = MLX5_OPCODE_RDMA_WRITE_IMM, [IBV_WR_RDMA_READ] = MLX5_OPCODE_RDMA_READ, [IBV_WR_ATOMIC_CMP_AND_SWP] = MLX5_OPCODE_ATOMIC_CS, [IBV_WR_ATOMIC_FETCH_AND_ADD] = MLX5_OPCODE_ATOMIC_FA, [IBV_WR_BIND_MW] = MLX5_OPCODE_UMR, [IBV_WR_LOCAL_INV] = MLX5_OPCODE_UMR, [IBV_WR_TSO] = MLX5_OPCODE_TSO, [IBV_WR_DRIVER1] = MLX5_OPCODE_UMR, }; static void *get_recv_wqe(struct mlx5_qp *qp, int n) { return qp->buf.buf + qp->rq.offset + (n << qp->rq.wqe_shift); } static void *get_wq_recv_wqe(struct mlx5_rwq *rwq, int n) { return rwq->pbuff + (n << rwq->rq.wqe_shift); } static int copy_to_scat(struct mlx5_wqe_data_seg *scat, void *buf, int *size, int max, struct mlx5_context *ctx) { int copy; int i; if (unlikely(!(*size))) return IBV_WC_SUCCESS; for (i = 0; i < max; ++i) { copy = min_t(long, *size, be32toh(scat->byte_count)); /* When the NULL MR is used we can't copy to the target; * it is expected to be NULL. */ if (likely(scat->lkey != ctx->dump_fill_mkey_be)) memcpy((void *)(unsigned long)be64toh(scat->addr), buf, copy); *size -= copy; if (*size == 0) return IBV_WC_SUCCESS; buf += copy; ++scat; } return IBV_WC_LOC_LEN_ERR; } int mlx5_copy_to_recv_wqe(struct mlx5_qp *qp, int idx, void *buf, int size) { struct mlx5_context *ctx = to_mctx(qp->ibv_qp->pd->context); struct mlx5_wqe_data_seg *scat; int max = 1 << (qp->rq.wqe_shift - 4); scat = get_recv_wqe(qp, idx); if (unlikely(qp->wq_sig)) ++scat; return copy_to_scat(scat, buf, &size, max, ctx); } int mlx5_copy_to_send_wqe(struct mlx5_qp *qp, int idx, void *buf, int size) { struct mlx5_context *ctx = to_mctx(qp->ibv_qp->pd->context); struct mlx5_wqe_ctrl_seg *ctrl; struct mlx5_wqe_data_seg *scat; void *p; int max; idx &= (qp->sq.wqe_cnt - 1); ctrl = mlx5_get_send_wqe(qp, idx); if (qp->ibv_qp->qp_type != IBV_QPT_RC) { mlx5_err(ctx->dbg_fp, "scatter to CQE is supported only for RC QPs\n"); return IBV_WC_GENERAL_ERR; } p = ctrl + 1; switch (be32toh(ctrl->opmod_idx_opcode) & 0xff) { case MLX5_OPCODE_RDMA_READ: p = p + sizeof(struct mlx5_wqe_raddr_seg); break; case MLX5_OPCODE_ATOMIC_CS: case MLX5_OPCODE_ATOMIC_FA: p = p + sizeof(struct mlx5_wqe_raddr_seg) + sizeof(struct mlx5_wqe_atomic_seg); break; default: mlx5_err(ctx->dbg_fp, "scatter to CQE for opcode %d\n", be32toh(ctrl->opmod_idx_opcode) & 0xff); return IBV_WC_REM_INV_REQ_ERR; } scat = p; max = (be32toh(ctrl->qpn_ds) & 0x3F) - (((void *)scat - (void *)ctrl) >> 4); if (unlikely((void *)(scat + max) > qp->sq.qend)) { int tmp = ((void *)qp->sq.qend - (void *)scat) >> 4; int orig_size = size; if (copy_to_scat(scat, buf, &size, tmp, ctx) == IBV_WC_SUCCESS) return IBV_WC_SUCCESS; max = max - tmp; buf += orig_size - size; scat = mlx5_get_send_wqe(qp, 0); } return copy_to_scat(scat, buf, &size, max, ctx); } void *mlx5_get_send_wqe(struct mlx5_qp *qp, int n) { return qp->sq_start + (n << MLX5_SEND_WQE_SHIFT); } void mlx5_init_rwq_indices(struct mlx5_rwq *rwq) { rwq->rq.head = 0; rwq->rq.tail = 0; } void mlx5_init_qp_indices(struct mlx5_qp *qp) { qp->sq.head = 0; qp->sq.tail = 0; qp->rq.head = 0; qp->rq.tail = 0; qp->sq.cur_post = 0; } static int mlx5_wq_overflow(struct mlx5_wq *wq, int nreq, struct mlx5_cq *cq) {
unsigned cur; cur = wq->head - wq->tail; if (cur + nreq < wq->max_post) return 0; mlx5_spin_lock(&cq->lock); cur = wq->head - wq->tail; mlx5_spin_unlock(&cq->lock); return cur + nreq >= wq->max_post; } static inline void set_raddr_seg(struct mlx5_wqe_raddr_seg *rseg, uint64_t remote_addr, uint32_t rkey) { rseg->raddr = htobe64(remote_addr); rseg->rkey = htobe32(rkey); rseg->reserved = 0; } static void set_tm_seg(struct mlx5_wqe_tm_seg *tmseg, int op, struct ibv_ops_wr *wr, int index) { tmseg->flags = 0; if (wr->flags & IBV_OPS_SIGNALED) tmseg->flags |= MLX5_SRQ_FLAG_TM_CQE_REQ; if (wr->flags & IBV_OPS_TM_SYNC) { tmseg->flags |= MLX5_SRQ_FLAG_TM_SW_CNT; tmseg->sw_cnt = htobe16(wr->tm.unexpected_cnt); } tmseg->opcode = op << 4; if (op == MLX5_TM_OPCODE_NOP) return; tmseg->index = htobe16(index); if (op == MLX5_TM_OPCODE_REMOVE) return; tmseg->append_tag = htobe64(wr->tm.add.tag); tmseg->append_mask = htobe64(wr->tm.add.mask); } static inline void _set_atomic_seg(struct mlx5_wqe_atomic_seg *aseg, enum ibv_wr_opcode opcode, uint64_t swap, uint64_t compare_add) ALWAYS_INLINE; static inline void _set_atomic_seg(struct mlx5_wqe_atomic_seg *aseg, enum ibv_wr_opcode opcode, uint64_t swap, uint64_t compare_add) { if (opcode == IBV_WR_ATOMIC_CMP_AND_SWP) { aseg->swap_add = htobe64(swap); aseg->compare = htobe64(compare_add); } else { aseg->swap_add = htobe64(compare_add); } } static void set_atomic_seg(struct mlx5_wqe_atomic_seg *aseg, enum ibv_wr_opcode opcode, uint64_t swap, uint64_t compare_add) { _set_atomic_seg(aseg, opcode, swap, compare_add); } static inline void _set_datagram_seg(struct mlx5_wqe_datagram_seg *dseg, struct mlx5_wqe_av *av, uint32_t remote_qpn, uint32_t remote_qkey) { memcpy(&dseg->av, av, sizeof(dseg->av)); dseg->av.dqp_dct = htobe32(remote_qpn | MLX5_EXTENDED_UD_AV); dseg->av.key.qkey.qkey = htobe32(remote_qkey); } static void set_datagram_seg(struct mlx5_wqe_datagram_seg *dseg, struct ibv_send_wr *wr) { _set_datagram_seg(dseg, &to_mah(wr->wr.ud.ah)->av, wr->wr.ud.remote_qpn, wr->wr.ud.remote_qkey); } static void set_data_ptr_seg(struct mlx5_wqe_data_seg *dseg, struct ibv_sge *sg, int offset) { dseg->byte_count = htobe32(sg->length - offset); dseg->lkey = htobe32(sg->lkey); dseg->addr = htobe64(sg->addr + offset); } static void set_data_ptr_seg_atomic(struct mlx5_wqe_data_seg *dseg, struct ibv_sge *sg) { dseg->byte_count = htobe32(MLX5_ATOMIC_SIZE); dseg->lkey = htobe32(sg->lkey); dseg->addr = htobe64(sg->addr); } static void set_data_ptr_seg_end(struct mlx5_wqe_data_seg *dseg) { dseg->byte_count = 0; dseg->lkey = htobe32(MLX5_INVALID_LKEY); dseg->addr = 0; } /* * Avoid using memcpy() to copy to BlueFlame page, since memcpy() * implementations may use move-string-buffer assembler instructions, * which do not guarantee order of copying. 
*/ static void mlx5_bf_copy(uint64_t *dst, const uint64_t *src, unsigned bytecnt, struct mlx5_qp *qp) { do { mmio_memcpy_x64(dst, src, 64); bytecnt -= 64; dst += 8; src += 8; if (unlikely(src == qp->sq.qend)) src = qp->sq_start; } while (bytecnt > 0); } static __be32 send_ieth(struct ibv_send_wr *wr) { switch (wr->opcode) { case IBV_WR_SEND_WITH_IMM: case IBV_WR_RDMA_WRITE_WITH_IMM: return wr->imm_data; case IBV_WR_SEND_WITH_INV: return htobe32(wr->invalidate_rkey); default: return 0; } } static int set_data_inl_seg(struct mlx5_qp *qp, struct ibv_send_wr *wr, void *wqe, int *sz, struct mlx5_sg_copy_ptr *sg_copy_ptr) { struct mlx5_wqe_inline_seg *seg; void *addr; int len; int i; int inl = 0; void *qend = qp->sq.qend; int copy; int offset = sg_copy_ptr->offset; seg = wqe; wqe += sizeof *seg; for (i = sg_copy_ptr->index; i < wr->num_sge; ++i) { addr = (void *) (unsigned long)(wr->sg_list[i].addr + offset); len = wr->sg_list[i].length - offset; inl += len; offset = 0; if (unlikely(inl > qp->max_inline_data)) return ENOMEM; if (unlikely(wqe + len > qend)) { copy = qend - wqe; memcpy(wqe, addr, copy); addr += copy; len -= copy; wqe = mlx5_get_send_wqe(qp, 0); } memcpy(wqe, addr, len); wqe += len; } if (likely(inl)) { seg->byte_count = htobe32(inl | MLX5_INLINE_SEG); *sz = align(inl + sizeof seg->byte_count, 16) / 16; } else *sz = 0; return 0; } static uint8_t wq_sig(struct mlx5_wqe_ctrl_seg *ctrl) { return calc_sig(ctrl, (be32toh(ctrl->qpn_ds) & 0x3f) << 4); } #ifdef MLX5_DEBUG static void dump_wqe(struct mlx5_context *mctx, int idx, int size_16, struct mlx5_qp *qp) { uint32_t *uninitialized_var(p); int i, j; int tidx = idx; mlx5_err(mctx->dbg_fp, "dump wqe at %p\n", mlx5_get_send_wqe(qp, tidx)); for (i = 0, j = 0; i < size_16 * 4; i += 4, j += 4) { if ((i & 0xf) == 0) { void *buf = mlx5_get_send_wqe(qp, tidx); tidx = (tidx + 1) & (qp->sq.wqe_cnt - 1); p = buf; j = 0; } mlx5_err(mctx->dbg_fp, "%08x %08x %08x %08x\n", be32toh(p[j]), be32toh(p[j + 1]), be32toh(p[j + 2]), be32toh(p[j + 3])); } } #endif /* MLX5_DEBUG */ void *mlx5_get_atomic_laddr(struct mlx5_qp *qp, uint16_t idx, int *byte_count) { struct mlx5_wqe_data_seg *dpseg; void *addr; dpseg = mlx5_get_send_wqe(qp, idx) + sizeof(struct mlx5_wqe_ctrl_seg) + sizeof(struct mlx5_wqe_raddr_seg) + sizeof(struct mlx5_wqe_atomic_seg); addr = (void *)(unsigned long)be64toh(dpseg->addr); /* * Currently byte count is always 8 bytes. 
Fix this when * we support variable size of atomics */ *byte_count = 8; return addr; } static inline int copy_eth_inline_headers(struct ibv_qp *ibqp, const void *list, size_t nelem, struct mlx5_wqe_eth_seg *eseg, struct mlx5_sg_copy_ptr *sg_copy_ptr, bool is_sge) ALWAYS_INLINE; static inline int copy_eth_inline_headers(struct ibv_qp *ibqp, const void *list, size_t nelem, struct mlx5_wqe_eth_seg *eseg, struct mlx5_sg_copy_ptr *sg_copy_ptr, bool is_sge) { uint32_t inl_hdr_size = to_mctx(ibqp->context)->eth_min_inline_size; size_t inl_hdr_copy_size = 0; int j = 0; FILE *fp = to_mctx(ibqp->context)->dbg_fp; size_t length; void *addr; if (unlikely(nelem < 1)) { mlx5_dbg(fp, MLX5_DBG_QP_SEND, "illegal num_sge: %zu, minimum is 1\n", nelem); return EINVAL; } if (is_sge) { addr = (void *)(uintptr_t)((struct ibv_sge *)list)[0].addr; length = (size_t)((struct ibv_sge *)list)[0].length; } else { addr = ((struct ibv_data_buf *)list)[0].addr; length = ((struct ibv_data_buf *)list)[0].length; } if (likely(length >= MLX5_ETH_L2_INLINE_HEADER_SIZE)) { inl_hdr_copy_size = inl_hdr_size; memcpy(eseg->inline_hdr_start, addr, inl_hdr_copy_size); } else { uint32_t inl_hdr_size_left = inl_hdr_size; for (j = 0; j < nelem && inl_hdr_size_left > 0; ++j) { if (is_sge) { addr = (void *)(uintptr_t)((struct ibv_sge *)list)[j].addr; length = (size_t)((struct ibv_sge *)list)[j].length; } else { addr = ((struct ibv_data_buf *)list)[j].addr; length = ((struct ibv_data_buf *)list)[j].length; } inl_hdr_copy_size = min_t(size_t, length, inl_hdr_size_left); memcpy(eseg->inline_hdr_start + (MLX5_ETH_L2_INLINE_HEADER_SIZE - inl_hdr_size_left), addr, inl_hdr_copy_size); inl_hdr_size_left -= inl_hdr_copy_size; } if (unlikely(inl_hdr_size_left)) { mlx5_dbg(fp, MLX5_DBG_QP_SEND, "Ethernet headers < 16 bytes\n"); return EINVAL; } if (j) --j; } eseg->inline_hdr_sz = htobe16(inl_hdr_size); /* If we copied all the sge into the inline-headers, then we need to * start copying from the next sge into the data-segment. */ if (unlikely(length == inl_hdr_copy_size)) { ++j; inl_hdr_copy_size = 0; } sg_copy_ptr->index = j; sg_copy_ptr->offset = inl_hdr_copy_size; return 0; } #define ALIGN(x, log_a) ((((x) + (1 << (log_a)) - 1)) & ~((1 << (log_a)) - 1)) static inline __be16 get_klm_octo(int nentries) { return htobe16(ALIGN(nentries, 3) / 2); } static void set_umr_data_seg(struct mlx5_qp *qp, enum ibv_mw_type type, int32_t rkey, const struct ibv_mw_bind_info *bind_info, uint32_t qpn, void **seg, int *size) { union { struct mlx5_wqe_umr_klm_seg klm; uint8_t reserved[64]; } *data = *seg; data->klm.byte_count = htobe32(bind_info->length); data->klm.mkey = htobe32(bind_info->mr->lkey); data->klm.address = htobe64(bind_info->addr); memset(&data->klm + 1, 0, sizeof(data->reserved) - sizeof(data->klm)); *seg += sizeof(*data); *size += (sizeof(*data) / 16); } static void set_umr_mkey_seg(struct mlx5_qp *qp, enum ibv_mw_type type, int32_t rkey, const struct ibv_mw_bind_info *bind_info, uint32_t qpn, void **seg, int *size) { struct mlx5_wqe_mkey_context_seg *mkey = *seg; mkey->qpn_mkey = htobe32((rkey & 0xFF) | ((type == IBV_MW_TYPE_1 || !bind_info->length) ? 
0xFFFFFF00 : qpn << 8)); if (bind_info->length) { /* Local read is set in kernel */ mkey->access_flags = 0; mkey->free = 0; if (bind_info->mw_access_flags & IBV_ACCESS_LOCAL_WRITE) mkey->access_flags |= MLX5_WQE_MKEY_CONTEXT_ACCESS_FLAGS_LOCAL_WRITE; if (bind_info->mw_access_flags & IBV_ACCESS_REMOTE_WRITE) mkey->access_flags |= MLX5_WQE_MKEY_CONTEXT_ACCESS_FLAGS_REMOTE_WRITE; if (bind_info->mw_access_flags & IBV_ACCESS_REMOTE_READ) mkey->access_flags |= MLX5_WQE_MKEY_CONTEXT_ACCESS_FLAGS_REMOTE_READ; if (bind_info->mw_access_flags & IBV_ACCESS_REMOTE_ATOMIC) mkey->access_flags |= MLX5_WQE_MKEY_CONTEXT_ACCESS_FLAGS_ATOMIC; if (bind_info->mw_access_flags & IBV_ACCESS_ZERO_BASED) mkey->start_addr = 0; else mkey->start_addr = htobe64(bind_info->addr); mkey->len = htobe64(bind_info->length); } else { mkey->free = MLX5_WQE_MKEY_CONTEXT_FREE; } *seg += sizeof(struct mlx5_wqe_mkey_context_seg); *size += (sizeof(struct mlx5_wqe_mkey_context_seg) / 16); } static inline void set_umr_control_seg(struct mlx5_qp *qp, enum ibv_mw_type type, int32_t rkey, const struct ibv_mw_bind_info *bind_info, uint32_t qpn, void **seg, int *size) { struct mlx5_wqe_umr_ctrl_seg *ctrl = *seg; ctrl->flags = MLX5_WQE_UMR_CTRL_FLAG_TRNSLATION_OFFSET | MLX5_WQE_UMR_CTRL_FLAG_INLINE; ctrl->mkey_mask = htobe64(MLX5_WQE_UMR_CTRL_MKEY_MASK_FREE | MLX5_WQE_UMR_CTRL_MKEY_MASK_MKEY); ctrl->translation_offset = 0; memset(ctrl->rsvd0, 0, sizeof(ctrl->rsvd0)); memset(ctrl->rsvd1, 0, sizeof(ctrl->rsvd1)); if (type == IBV_MW_TYPE_2) ctrl->mkey_mask |= htobe64(MLX5_WQE_UMR_CTRL_MKEY_MASK_QPN); if (bind_info->length) { ctrl->klm_octowords = get_klm_octo(1); if (type == IBV_MW_TYPE_2) ctrl->flags |= MLX5_WQE_UMR_CTRL_FLAG_CHECK_FREE; ctrl->mkey_mask |= htobe64(MLX5_WQE_UMR_CTRL_MKEY_MASK_LEN | MLX5_WQE_UMR_CTRL_MKEY_MASK_START_ADDR | MLX5_WQE_UMR_CTRL_MKEY_MASK_ACCESS_LOCAL_WRITE | MLX5_WQE_UMR_CTRL_MKEY_MASK_ACCESS_REMOTE_READ | MLX5_WQE_UMR_CTRL_MKEY_MASK_ACCESS_REMOTE_WRITE | MLX5_WQE_UMR_CTRL_MKEY_MASK_ACCESS_ATOMIC); } else { ctrl->klm_octowords = get_klm_octo(0); if (type == IBV_MW_TYPE_2) ctrl->flags |= MLX5_WQE_UMR_CTRL_FLAG_CHECK_QPN; } *seg += sizeof(struct mlx5_wqe_umr_ctrl_seg); *size += sizeof(struct mlx5_wqe_umr_ctrl_seg) / 16; } static inline int set_bind_wr(struct mlx5_qp *qp, enum ibv_mw_type type, int32_t rkey, const struct ibv_mw_bind_info *bind_info, uint32_t qpn, void **seg, int *size) { void *qend = qp->sq.qend; #ifdef MW_DEBUG if (bind_info->mw_access_flags & ~(IBV_ACCESS_REMOTE_ATOMIC | IBV_ACCESS_REMOTE_READ | IBV_ACCESS_REMOTE_WRITE)) return EINVAL; if (bind_info->mr && (bind_info->mr->addr > (void *)bind_info->addr || bind_info->mr->addr + bind_info->mr->length < (void *)bind_info->addr + bind_info->length || !(to_mmr(bind_info->mr)->alloc_flags & IBV_ACCESS_MW_BIND) || (bind_info->mw_access_flags & (IBV_ACCESS_REMOTE_ATOMIC | IBV_ACCESS_REMOTE_WRITE) && !(to_mmr(bind_info->mr)->alloc_flags & IBV_ACCESS_LOCAL_WRITE)))) return EINVAL; #endif /* reject lengths above 2GB, since KLM supports only up to 2GB */ if (bind_info->length > 1UL << 31) return EOPNOTSUPP; set_umr_control_seg(qp, type, rkey, bind_info, qpn, seg, size); if (unlikely((*seg == qend))) *seg = mlx5_get_send_wqe(qp, 0); set_umr_mkey_seg(qp, type, rkey, bind_info, qpn, seg, size); if (!bind_info->length) return 0; if (unlikely((*seg == qend))) *seg = mlx5_get_send_wqe(qp, 0); set_umr_data_seg(qp, type, rkey, bind_info, qpn, seg, size); return 0; } /* Copy the TSO header to the eth segment, taking padding and WQE * wrap-around in the WQ buffer into account.
*/ static inline int set_tso_eth_seg(void **seg, void *hdr, uint16_t hdr_sz, uint16_t mss, struct mlx5_qp *qp, int *size) { struct mlx5_wqe_eth_seg *eseg = *seg; int size_of_inl_hdr_start = sizeof(eseg->inline_hdr_start); uint64_t left, left_len, copy_sz; FILE *fp = to_mctx(qp->ibv_qp->context)->dbg_fp; if (unlikely(hdr_sz < MLX5_ETH_L2_MIN_HEADER_SIZE || hdr_sz > qp->max_tso_header)) { mlx5_dbg(fp, MLX5_DBG_QP_SEND, "TSO header size should be at least %d and at most %d\n", MLX5_ETH_L2_MIN_HEADER_SIZE, qp->max_tso_header); return EINVAL; } left = hdr_sz; eseg->mss = htobe16(mss); eseg->inline_hdr_sz = htobe16(hdr_sz); /* If there is room up to the end of the queue, copy the whole header * in one shot; otherwise copy up to the end of the queue, wrap around * and then copy the remainder */ left_len = qp->sq.qend - (void *)eseg->inline_hdr_start; copy_sz = min(left_len, left); memcpy(eseg->inline_hdr_start, hdr, copy_sz); /* Subtract the 16 bytes (one octoword, hence the trailing -1 in * 16-byte units) that are already included in eseg->inline_hdr[16] */ *seg += align(copy_sz - size_of_inl_hdr_start, 16) - 16; *size += align(copy_sz - size_of_inl_hdr_start, 16) / 16 - 1; /* The header spilled past the last WQE in the queue */ if (unlikely(copy_sz < left)) { *seg = mlx5_get_send_wqe(qp, 0); left -= copy_sz; hdr += copy_sz; memcpy(*seg, hdr, left); *seg += align(left, 16); *size += align(left, 16) / 16; } return 0; } static inline int mlx5_post_send_underlay(struct mlx5_qp *qp, struct ibv_send_wr *wr, void **pseg, int *total_size, struct mlx5_sg_copy_ptr *sg_copy_ptr) { struct mlx5_wqe_eth_seg *eseg; int inl_hdr_copy_size; void *seg = *pseg; int size = 0; if (unlikely(wr->opcode == IBV_WR_SEND_WITH_IMM)) return EINVAL; memset(seg, 0, sizeof(struct mlx5_wqe_eth_pad)); size += sizeof(struct mlx5_wqe_eth_pad); seg += sizeof(struct mlx5_wqe_eth_pad); eseg = seg; *((uint64_t *)eseg) = 0; eseg->rsvd2 = 0; if (wr->send_flags & IBV_SEND_IP_CSUM) { if (!(qp->qp_cap_cache & MLX5_CSUM_SUPPORT_UNDERLAY_UD)) return EINVAL; eseg->cs_flags |= MLX5_ETH_WQE_L3_CSUM | MLX5_ETH_WQE_L4_CSUM; } if (likely(wr->sg_list[0].length >= MLX5_SOURCE_QPN_INLINE_MAX_HEADER_SIZE)) /* Copy the minimum required data unless inline mode is set */ inl_hdr_copy_size = (wr->send_flags & IBV_SEND_INLINE) ? MLX5_SOURCE_QPN_INLINE_MAX_HEADER_SIZE : MLX5_IPOIB_INLINE_MIN_HEADER_SIZE; else { inl_hdr_copy_size = MLX5_IPOIB_INLINE_MIN_HEADER_SIZE; /* We expect at least 4 bytes as part of the first entry to hold the IPoIB header */ if (unlikely(wr->sg_list[0].length < inl_hdr_copy_size)) return EINVAL; } memcpy(eseg->inline_hdr_start, (void *)(uintptr_t)wr->sg_list[0].addr, inl_hdr_copy_size); eseg->inline_hdr_sz = htobe16(inl_hdr_copy_size); size += sizeof(struct mlx5_wqe_eth_seg); seg += sizeof(struct mlx5_wqe_eth_seg); /* If we copied all the sge into the inline-headers, then we need to * start copying from the next sge into the data-segment.
*/ if (unlikely(wr->sg_list[0].length == inl_hdr_copy_size)) sg_copy_ptr->index++; else sg_copy_ptr->offset = inl_hdr_copy_size; *pseg = seg; *total_size += (size / 16); return 0; } static inline void post_send_db(struct mlx5_qp *qp, struct mlx5_bf *bf, int nreq, int inl, int size, void *ctrl) { struct mlx5_context *ctx; if (unlikely(!nreq)) return; qp->sq.head += nreq; /* * Make sure that descriptors are written before * updating doorbell record and ringing the doorbell */ udma_to_device_barrier(); qp->db[MLX5_SND_DBR] = htobe32(qp->sq.cur_post & 0xffff); /* Make sure that the doorbell write happens before the memcpy * to WC memory below */ ctx = to_mctx(qp->ibv_qp->context); if (bf->need_lock) mmio_wc_spinlock(&bf->lock.lock); else mmio_wc_start(); if (!ctx->shut_up_bf && nreq == 1 && bf->uuarn && (inl || ctx->prefer_bf) && size > 1 && size <= bf->buf_size / 16) mlx5_bf_copy(bf->reg + bf->offset, ctrl, align(size * 16, 64), qp); else mmio_write64_be(bf->reg + bf->offset, *(__be64 *)ctrl); /* * Use mmio_flush_writes() to ensure write combining buffers are * flushed out of the running CPU. This must be carried inside * the spinlock. Otherwise, there is a potential race. In the * race, CPU A writes doorbell 1, which is waiting in the WC * buffer. CPU B writes doorbell 2, and its write is flushed * earlier. Since the mmio_flush_writes is CPU local, this will * result in the HCA seeing doorbell 2, followed by doorbell 1. * Flush before toggling bf_offset to be latency oriented. */ mmio_flush_writes(); bf->offset ^= bf->buf_size; if (bf->need_lock) mlx5_spin_unlock(&bf->lock); } static inline int _mlx5_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { struct mlx5_qp *qp = to_mqp(ibqp); void *seg; struct mlx5_wqe_eth_seg *eseg; struct mlx5_wqe_ctrl_seg *ctrl = NULL; struct mlx5_wqe_data_seg *dpseg; struct mlx5_sg_copy_ptr sg_copy_ptr = {.index = 0, .offset = 0}; int nreq; int inl = 0; int err = 0; int size = 0; int i; unsigned idx; uint8_t opmod = 0; struct mlx5_bf *bf = qp->bf; void *qend = qp->sq.qend; uint32_t mlx5_opcode; struct mlx5_wqe_xrc_seg *xrc; uint8_t fence; uint8_t next_fence; uint32_t max_tso = 0; FILE *fp = to_mctx(ibqp->context)->dbg_fp; /* Unused in non-debug mode */ mlx5_spin_lock(&qp->sq.lock); next_fence = qp->fm_cache; for (nreq = 0; wr; ++nreq, wr = wr->next) { if (unlikely(wr->opcode < 0 || wr->opcode >= sizeof mlx5_ib_opcode / sizeof mlx5_ib_opcode[0])) { mlx5_dbg(fp, MLX5_DBG_QP_SEND, "bad opcode %d\n", wr->opcode); err = EINVAL; *bad_wr = wr; goto out; } if (unlikely(mlx5_wq_overflow(&qp->sq, nreq, to_mcq(qp->ibv_qp->send_cq)))) { mlx5_dbg(fp, MLX5_DBG_QP_SEND, "work queue overflow\n"); err = ENOMEM; *bad_wr = wr; goto out; } if (unlikely(wr->num_sge > qp->sq.max_gs)) { mlx5_dbg(fp, MLX5_DBG_QP_SEND, "max gs exceeded %d (max = %d)\n", wr->num_sge, qp->sq.max_gs); err = ENOMEM; *bad_wr = wr; goto out; } if (wr->send_flags & IBV_SEND_FENCE) fence = MLX5_WQE_CTRL_FENCE; else fence = next_fence; next_fence = 0; idx = qp->sq.cur_post & (qp->sq.wqe_cnt - 1); ctrl = seg = mlx5_get_send_wqe(qp, idx); *(uint32_t *)(seg + 8) = 0; ctrl->imm = send_ieth(wr); ctrl->fm_ce_se = qp->sq_signal_bits | fence | (wr->send_flags & IBV_SEND_SIGNALED ? MLX5_WQE_CTRL_CQ_UPDATE : 0) | (wr->send_flags & IBV_SEND_SOLICITED ?
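/*
 * For reference, the fm_ce_se bits assembled here map directly from the
 * caller's send_flags.  An illustrative post (placeholder names, error
 * handling elided):
 *
 *	struct ibv_sge sge = {
 *		.addr = (uintptr_t)buf, .length = len, .lkey = mr->lkey,
 *	};
 *	struct ibv_send_wr swr = {
 *		.wr_id = 1,
 *		.sg_list = &sge,
 *		.num_sge = 1,
 *		.opcode = IBV_WR_SEND,
 *		.send_flags = IBV_SEND_SIGNALED,  (sets MLX5_WQE_CTRL_CQ_UPDATE)
 *	}, *bad;
 *
 *	err = ibv_post_send(qp, &swr, &bad);
 */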
MLX5_WQE_CTRL_SOLICITED : 0); seg += sizeof *ctrl; size = sizeof *ctrl / 16; qp->sq.wr_data[idx] = 0; switch (ibqp->qp_type) { case IBV_QPT_XRC_SEND: if (unlikely(wr->opcode != IBV_WR_BIND_MW && wr->opcode != IBV_WR_LOCAL_INV)) { xrc = seg; xrc->xrc_srqn = htobe32(wr->qp_type.xrc.remote_srqn); seg += sizeof(*xrc); size += sizeof(*xrc) / 16; } /* fall through */ case IBV_QPT_RC: switch (wr->opcode) { case IBV_WR_RDMA_READ: case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_WRITE_WITH_IMM: set_raddr_seg(seg, wr->wr.rdma.remote_addr, wr->wr.rdma.rkey); seg += sizeof(struct mlx5_wqe_raddr_seg); size += sizeof(struct mlx5_wqe_raddr_seg) / 16; break; case IBV_WR_ATOMIC_CMP_AND_SWP: case IBV_WR_ATOMIC_FETCH_AND_ADD: if (unlikely(!qp->atomics_enabled)) { mlx5_dbg(fp, MLX5_DBG_QP_SEND, "atomic operations are not supported\n"); err = EOPNOTSUPP; *bad_wr = wr; goto out; } set_raddr_seg(seg, wr->wr.atomic.remote_addr, wr->wr.atomic.rkey); seg += sizeof(struct mlx5_wqe_raddr_seg); set_atomic_seg(seg, wr->opcode, wr->wr.atomic.swap, wr->wr.atomic.compare_add); seg += sizeof(struct mlx5_wqe_atomic_seg); size += (sizeof(struct mlx5_wqe_raddr_seg) + sizeof(struct mlx5_wqe_atomic_seg)) / 16; break; case IBV_WR_BIND_MW: next_fence = MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE; ctrl->imm = htobe32(wr->bind_mw.mw->rkey); err = set_bind_wr(qp, wr->bind_mw.mw->type, wr->bind_mw.rkey, &wr->bind_mw.bind_info, ibqp->qp_num, &seg, &size); if (err) { *bad_wr = wr; goto out; } qp->sq.wr_data[idx] = IBV_WC_BIND_MW; break; case IBV_WR_LOCAL_INV: { struct ibv_mw_bind_info bind_info = {}; next_fence = MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE; ctrl->imm = htobe32(wr->invalidate_rkey); err = set_bind_wr(qp, IBV_MW_TYPE_2, 0, &bind_info, ibqp->qp_num, &seg, &size); if (err) { *bad_wr = wr; goto out; } qp->sq.wr_data[idx] = IBV_WC_LOCAL_INV; break; } default: break; } break; case IBV_QPT_UC: switch (wr->opcode) { case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_WRITE_WITH_IMM: set_raddr_seg(seg, wr->wr.rdma.remote_addr, wr->wr.rdma.rkey); seg += sizeof(struct mlx5_wqe_raddr_seg); size += sizeof(struct mlx5_wqe_raddr_seg) / 16; break; case IBV_WR_BIND_MW: next_fence = MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE; ctrl->imm = htobe32(wr->bind_mw.mw->rkey); err = set_bind_wr(qp, wr->bind_mw.mw->type, wr->bind_mw.rkey, &wr->bind_mw.bind_info, ibqp->qp_num, &seg, &size); if (err) { *bad_wr = wr; goto out; } qp->sq.wr_data[idx] = IBV_WC_BIND_MW; break; case IBV_WR_LOCAL_INV: { struct ibv_mw_bind_info bind_info = {}; next_fence = MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE; ctrl->imm = htobe32(wr->invalidate_rkey); err = set_bind_wr(qp, IBV_MW_TYPE_2, 0, &bind_info, ibqp->qp_num, &seg, &size); if (err) { *bad_wr = wr; goto out; } qp->sq.wr_data[idx] = IBV_WC_LOCAL_INV; break; } default: break; } break; case IBV_QPT_UD: set_datagram_seg(seg, wr); seg += sizeof(struct mlx5_wqe_datagram_seg); size += sizeof(struct mlx5_wqe_datagram_seg) / 16; if (unlikely((seg == qend))) seg = mlx5_get_send_wqe(qp, 0); if (unlikely(qp->flags & MLX5_QP_FLAGS_USE_UNDERLAY)) { err = mlx5_post_send_underlay(qp, wr, &seg, &size, &sg_copy_ptr); if (unlikely(err)) { *bad_wr = wr; goto out; } } break; case IBV_QPT_RAW_PACKET: memset(seg, 0, sizeof(struct mlx5_wqe_eth_seg)); eseg = seg; if (wr->send_flags & IBV_SEND_IP_CSUM) { if (!(qp->qp_cap_cache & MLX5_CSUM_SUPPORT_RAW_OVER_ETH)) { err = EINVAL; *bad_wr = wr; goto out; } eseg->cs_flags |= MLX5_ETH_WQE_L3_CSUM | MLX5_ETH_WQE_L4_CSUM; } if (wr->opcode == IBV_WR_TSO) { max_tso = qp->max_tso; err = set_tso_eth_seg(&seg, wr->tso.hdr, wr->tso.hdr_sz, 
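/*
 * Sketch of the work request consumed by this TSO branch (placeholder
 * names; the header pointer, its size and the MSS come straight from
 * wr->tso):
 *
 *	struct ibv_send_wr twr = {};
 *
 *	twr.opcode = IBV_WR_TSO;
 *	twr.tso.hdr = hdr_buf;	    (copied inline by set_tso_eth_seg)
 *	twr.tso.hdr_sz = hdr_len;   (at least MLX5_ETH_L2_MIN_HEADER_SIZE)
 *	twr.tso.mss = 1460;
 *	twr.sg_list = sges;	    (payload; each SGE is checked against max_tso)
 *	twr.num_sge = n;
 */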
wr->tso.mss, qp, &size); if (unlikely(err)) { *bad_wr = wr; goto out; } /* For TSO WR we always copy at least MLX5_ETH_L2_MIN_HEADER_SIZE * bytes of inline header which is included in struct mlx5_wqe_eth_seg. * If additional bytes are copied, 'seg' and 'size' are adjusted * inside set_tso_eth_seg(). */ seg += sizeof(struct mlx5_wqe_eth_seg); size += sizeof(struct mlx5_wqe_eth_seg) / 16; } else { uint32_t inl_hdr_size = to_mctx(ibqp->context)->eth_min_inline_size; err = copy_eth_inline_headers(ibqp, wr->sg_list, wr->num_sge, seg, &sg_copy_ptr, 1); if (unlikely(err)) { *bad_wr = wr; mlx5_dbg(fp, MLX5_DBG_QP_SEND, "copy_eth_inline_headers failed, err: %d\n", err); goto out; } /* The eth segment size depends on the device's min inline * header requirement which can be 0 or 18. The basic eth segment * always includes room for first 2 inline header bytes (even if * copy size is 0) so the additional seg size is adjusted accordingly. */ seg += (offsetof(struct mlx5_wqe_eth_seg, inline_hdr) + inl_hdr_size) & ~0xf; size += (offsetof(struct mlx5_wqe_eth_seg, inline_hdr) + inl_hdr_size) >> 4; } break; default: break; } if (wr->send_flags & IBV_SEND_INLINE && wr->num_sge) { int uninitialized_var(sz); err = set_data_inl_seg(qp, wr, seg, &sz, &sg_copy_ptr); if (unlikely(err)) { *bad_wr = wr; mlx5_dbg(fp, MLX5_DBG_QP_SEND, "inline layout failed, err %d\n", err); goto out; } inl = 1; size += sz; } else { dpseg = seg; for (i = sg_copy_ptr.index; i < wr->num_sge; ++i) { if (unlikely(dpseg == qend)) { seg = mlx5_get_send_wqe(qp, 0); dpseg = seg; } if (likely(wr->sg_list[i].length)) { if (unlikely(wr->opcode == IBV_WR_ATOMIC_CMP_AND_SWP || wr->opcode == IBV_WR_ATOMIC_FETCH_AND_ADD)) set_data_ptr_seg_atomic(dpseg, wr->sg_list + i); else { if (unlikely(wr->opcode == IBV_WR_TSO)) { if (max_tso < wr->sg_list[i].length) { err = EINVAL; *bad_wr = wr; goto out; } max_tso -= wr->sg_list[i].length; } set_data_ptr_seg(dpseg, wr->sg_list + i, sg_copy_ptr.offset); } sg_copy_ptr.offset = 0; ++dpseg; size += sizeof(struct mlx5_wqe_data_seg) / 16; } } } mlx5_opcode = mlx5_ib_opcode[wr->opcode]; ctrl->opmod_idx_opcode = htobe32(((qp->sq.cur_post & 0xffff) << 8) | mlx5_opcode | (opmod << 24)); ctrl->qpn_ds = htobe32(size | (ibqp->qp_num << 8)); if (unlikely(qp->wq_sig)) ctrl->signature = wq_sig(ctrl); qp->sq.wrid[idx] = wr->wr_id; qp->sq.wqe_head[idx] = qp->sq.head + nreq; qp->sq.cur_post += DIV_ROUND_UP(size * 16, MLX5_SEND_WQE_BB); #ifdef MLX5_DEBUG if (mlx5_debug_mask & MLX5_DBG_QP_SEND) dump_wqe(to_mctx(ibqp->context), idx, size, qp); #endif rdma_tracepoint(rdma_core_mlx5, post_send, ibqp->context->device->name, ibqp->qp_num, (char *)ibv_wr_opcode_str(wr->opcode), wr->num_sge); } out: qp->fm_cache = next_fence; post_send_db(qp, bf, nreq, inl, size, ctrl); mlx5_spin_unlock(&qp->sq.lock); return err; } int mlx5_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { #ifdef MW_DEBUG if (wr->opcode == IBV_WR_BIND_MW) { if (wr->bind_mw.mw->type == IBV_MW_TYPE_1) return EINVAL; if (!wr->bind_mw.bind_info.mr || !wr->bind_mw.bind_info.addr || !wr->bind_mw.bind_info.length) return EINVAL; if (wr->bind_mw.bind_info.mr->pd != wr->bind_mw.mw->pd) return EINVAL; } #endif return _mlx5_post_send(ibqp, wr, bad_wr); } enum { WQE_REQ_SETTERS_UD_XRC_DC = 2, }; static void mlx5_send_wr_start(struct ibv_qp_ex *ibqp) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); mlx5_spin_lock(&mqp->sq.lock); mqp->cur_post_rb = mqp->sq.cur_post; mqp->fm_cache_rb = mqp->fm_cache; mqp->err = 0; mqp->nreq = 0; mqp->inl_wqe = 
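/*
 * mlx5_send_wr_start() opens an extended-QP posting session:
 * cur_post_rb and fm_cache_rb snapshot the SQ state so that
 * mlx5_send_wr_complete() can roll back a failed batch and
 * mlx5_send_wr_abort() can discard one.  Typical caller pattern
 * (placeholder names):
 *
 *	ibv_wr_start(qpx);
 *	qpx->wr_id = my_id;
 *	ibv_wr_send(qpx);
 *	ibv_wr_set_sge(qpx, mr->lkey, (uintptr_t)buf, len);
 *	err = ibv_wr_complete(qpx);
 *
 * ibv_wr_complete() rings the doorbell on success, or rolls the SQ back
 * and returns an errno if any builder in the batch failed.
 */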
0; } static int mlx5_send_wr_complete_error(struct ibv_qp_ex *ibqp) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); /* Rolling back */ mqp->sq.cur_post = mqp->cur_post_rb; mqp->fm_cache = mqp->fm_cache_rb; mlx5_spin_unlock(&mqp->sq.lock); return EINVAL; } static int mlx5_send_wr_complete(struct ibv_qp_ex *ibqp) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); int err = mqp->err; if (unlikely(err)) { /* Rolling back */ mqp->sq.cur_post = mqp->cur_post_rb; mqp->fm_cache = mqp->fm_cache_rb; goto out; } post_send_db(mqp, mqp->bf, mqp->nreq, mqp->inl_wqe, mqp->cur_size, mqp->cur_ctrl); out: mlx5_spin_unlock(&mqp->sq.lock); return err; } static void mlx5_send_wr_abort(struct ibv_qp_ex *ibqp) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); /* Rolling back */ mqp->sq.cur_post = mqp->cur_post_rb; mqp->fm_cache = mqp->fm_cache_rb; mlx5_spin_unlock(&mqp->sq.lock); } static inline void _common_wqe_init_op(struct ibv_qp_ex *ibqp, int ib_op, uint8_t mlx5_op) ALWAYS_INLINE; static inline void _common_wqe_init_op(struct ibv_qp_ex *ibqp, int ib_op, uint8_t mlx5_op) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); struct mlx5_wqe_ctrl_seg *ctrl; uint8_t fence; uint32_t idx; if (unlikely(mlx5_wq_overflow(&mqp->sq, mqp->nreq, to_mcq(ibqp->qp_base.send_cq)))) { FILE *fp = to_mctx(((struct ibv_qp *)ibqp)->context)->dbg_fp; mlx5_dbg(fp, MLX5_DBG_QP_SEND, "Work queue overflow\n"); if (!mqp->err) mqp->err = ENOMEM; return; } idx = mqp->sq.cur_post & (mqp->sq.wqe_cnt - 1); mqp->sq.wrid[idx] = ibqp->wr_id; mqp->sq.wqe_head[idx] = mqp->sq.head + mqp->nreq; if (ib_op == IBV_WR_BIND_MW) mqp->sq.wr_data[idx] = IBV_WC_BIND_MW; else if (ib_op == IBV_WR_LOCAL_INV) mqp->sq.wr_data[idx] = IBV_WC_LOCAL_INV; else if (ib_op == IBV_WR_DRIVER1) mqp->sq.wr_data[idx] = IBV_WC_DRIVER1; else if (mlx5_op == MLX5_OPCODE_MMO) mqp->sq.wr_data[idx] = IBV_WC_DRIVER3; else mqp->sq.wr_data[idx] = 0; ctrl = mlx5_get_send_wqe(mqp, idx); *(uint32_t *)((void *)ctrl + 8) = 0; fence = (ibqp->wr_flags & IBV_SEND_FENCE) ? MLX5_WQE_CTRL_FENCE : mqp->fm_cache; mqp->fm_cache = 0; ctrl->fm_ce_se = mqp->sq_signal_bits | fence | (ibqp->wr_flags & IBV_SEND_SIGNALED ? MLX5_WQE_CTRL_CQ_UPDATE : 0) | (ibqp->wr_flags & IBV_SEND_SOLICITED ? 
MLX5_WQE_CTRL_SOLICITED : 0); ctrl->opmod_idx_opcode = htobe32(((mqp->sq.cur_post & 0xffff) << 8) | mlx5_op); mqp->cur_ctrl = ctrl; } static inline void _common_wqe_init(struct ibv_qp_ex *ibqp, enum ibv_wr_opcode ib_op) ALWAYS_INLINE; static inline void _common_wqe_init(struct ibv_qp_ex *ibqp, enum ibv_wr_opcode ib_op) { _common_wqe_init_op(ibqp, ib_op, mlx5_ib_opcode[ib_op]); } static inline void __wqe_finalize(struct mlx5_qp *mqp) ALWAYS_INLINE; static inline void __wqe_finalize(struct mlx5_qp *mqp) { if (unlikely(mqp->wq_sig)) mqp->cur_ctrl->signature = wq_sig(mqp->cur_ctrl); #ifdef MLX5_DEBUG if (mlx5_debug_mask & MLX5_DBG_QP_SEND) { int idx = mqp->sq.cur_post & (mqp->sq.wqe_cnt - 1); dump_wqe(to_mctx(mqp->ibv_qp->context), idx, mqp->cur_size, mqp); } #endif mqp->sq.cur_post += DIV_ROUND_UP(mqp->cur_size, 4); } static inline void _common_wqe_finalize(struct mlx5_qp *mqp) { mqp->cur_ctrl->qpn_ds = htobe32(mqp->cur_size | (mqp->ibv_qp->qp_num << 8)); __wqe_finalize(mqp); } static inline void _mlx5_send_wr_send(struct ibv_qp_ex *ibqp, enum ibv_wr_opcode ib_op) ALWAYS_INLINE; static inline void _mlx5_send_wr_send(struct ibv_qp_ex *ibqp, enum ibv_wr_opcode ib_op) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); size_t transport_seg_sz = 0; _common_wqe_init(ibqp, ib_op); if (ibqp->qp_base.qp_type == IBV_QPT_UD || ibqp->qp_base.qp_type == IBV_QPT_DRIVER) transport_seg_sz = sizeof(struct mlx5_wqe_datagram_seg); else if (ibqp->qp_base.qp_type == IBV_QPT_XRC_SEND) transport_seg_sz = sizeof(struct mlx5_wqe_xrc_seg); mqp->cur_data = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg) + transport_seg_sz; /* In UD/DC cur_data may overrun the SQ */ if (unlikely(mqp->cur_data == mqp->sq.qend)) mqp->cur_data = mlx5_get_send_wqe(mqp, 0); mqp->cur_size = (sizeof(struct mlx5_wqe_ctrl_seg) + transport_seg_sz) / 16; mqp->nreq++; /* Relevant just for WQE construction which requires more than 1 setter */ mqp->cur_setters_cnt = 0; } static void mlx5_send_wr_send_other(struct ibv_qp_ex *ibqp) { _mlx5_send_wr_send(ibqp, IBV_WR_SEND); } static void mlx5_send_wr_send_eth(struct ibv_qp_ex *ibqp) { uint32_t inl_hdr_size = to_mctx(((struct ibv_qp *)ibqp)->context)->eth_min_inline_size; struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); struct mlx5_wqe_eth_seg *eseg; size_t eseg_sz; _common_wqe_init(ibqp, IBV_WR_SEND); eseg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg); memset(eseg, 0, sizeof(struct mlx5_wqe_eth_seg)); if (inl_hdr_size) mqp->cur_eth = eseg; if (ibqp->wr_flags & IBV_SEND_IP_CSUM) { if (unlikely(!(mqp->qp_cap_cache & MLX5_CSUM_SUPPORT_RAW_OVER_ETH))) { if (!mqp->err) mqp->err = EINVAL; return; } eseg->cs_flags |= MLX5_ETH_WQE_L3_CSUM | MLX5_ETH_WQE_L4_CSUM; } /* The eth segment size depends on the device's min inline * header requirement which can be 0 or 18. The basic eth segment * always includes room for first 2 inline header bytes (even if * copy size is 0) so the additional seg size is adjusted accordingly. 
*/ eseg_sz = (offsetof(struct mlx5_wqe_eth_seg, inline_hdr) + inl_hdr_size) & ~0xf; mqp->cur_data = (void *)eseg + eseg_sz; mqp->cur_size = (sizeof(struct mlx5_wqe_ctrl_seg) + eseg_sz) >> 4; mqp->nreq++; } static void mlx5_send_wr_send_imm(struct ibv_qp_ex *ibqp, __be32 imm_data) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_send(ibqp, IBV_WR_SEND_WITH_IMM); mqp->cur_ctrl->imm = imm_data; } static void mlx5_send_wr_send_inv(struct ibv_qp_ex *ibqp, uint32_t invalidate_rkey) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_send(ibqp, IBV_WR_SEND_WITH_INV); mqp->cur_ctrl->imm = htobe32(invalidate_rkey); } static void mlx5_send_wr_send_tso(struct ibv_qp_ex *ibqp, void *hdr, uint16_t hdr_sz, uint16_t mss) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); struct mlx5_wqe_eth_seg *eseg; int size = 0; int err; _common_wqe_init(ibqp, IBV_WR_TSO); eseg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg); memset(eseg, 0, sizeof(struct mlx5_wqe_eth_seg)); if (ibqp->wr_flags & IBV_SEND_IP_CSUM) { if (unlikely(!(mqp->qp_cap_cache & MLX5_CSUM_SUPPORT_RAW_OVER_ETH))) { if (!mqp->err) mqp->err = EINVAL; return; } eseg->cs_flags |= MLX5_ETH_WQE_L3_CSUM | MLX5_ETH_WQE_L4_CSUM; } err = set_tso_eth_seg((void *)&eseg, hdr, hdr_sz, mss, mqp, &size); if (unlikely(err)) { if (!mqp->err) mqp->err = err; return; } /* eseg and cur_size were updated with the header size inside set_tso_eth_seg() */ mqp->cur_data = (void *)eseg + sizeof(struct mlx5_wqe_eth_seg); mqp->cur_size = size + ((sizeof(struct mlx5_wqe_ctrl_seg) + sizeof(struct mlx5_wqe_eth_seg)) >> 4); mqp->cur_eth = NULL; mqp->nreq++; } static inline void _mlx5_send_wr_rdma(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr, enum ibv_wr_opcode ib_op) ALWAYS_INLINE; static inline void _mlx5_send_wr_rdma(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr, enum ibv_wr_opcode ib_op) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); size_t transport_seg_sz = 0; void *raddr_seg; _common_wqe_init(ibqp, ib_op); if (ibqp->qp_base.qp_type == IBV_QPT_DRIVER) transport_seg_sz = sizeof(struct mlx5_wqe_datagram_seg); else if (ibqp->qp_base.qp_type == IBV_QPT_XRC_SEND) transport_seg_sz = sizeof(struct mlx5_wqe_xrc_seg); raddr_seg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg) + transport_seg_sz; /* In DC raddr_seg may overrun the SQ */ if (unlikely(raddr_seg == mqp->sq.qend)) raddr_seg = mlx5_get_send_wqe(mqp, 0); set_raddr_seg(raddr_seg, remote_addr, rkey); mqp->cur_data = raddr_seg + sizeof(struct mlx5_wqe_raddr_seg); mqp->cur_size = (sizeof(struct mlx5_wqe_ctrl_seg) + transport_seg_sz + sizeof(struct mlx5_wqe_raddr_seg)) / 16; mqp->nreq++; /* Relevant just for WQE construction which requires more than 1 setter */ mqp->cur_setters_cnt = 0; } static void mlx5_send_wr_rdma_write(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr) { _mlx5_send_wr_rdma(ibqp, rkey, remote_addr, IBV_WR_RDMA_WRITE); } static void mlx5_send_wr_rdma_write_imm(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr, __be32 imm_data) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_rdma(ibqp, rkey, remote_addr, IBV_WR_RDMA_WRITE_WITH_IMM); mqp->cur_ctrl->imm = imm_data; } static void mlx5_send_wr_rdma_read(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr) { _mlx5_send_wr_rdma(ibqp, rkey, remote_addr, IBV_WR_RDMA_READ); } static inline void _mlx5_send_wr_atomic(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr, uint64_t compare_add, uint64_t swap, enum ibv_wr_opcode
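/*
 * Illustrative RDMA-write post through the wrappers above (placeholder
 * names, error handling elided):
 *
 *	ibv_wr_start(qpx);
 *	qpx->wr_flags = IBV_SEND_SIGNALED;
 *	ibv_wr_rdma_write(qpx, remote_rkey, remote_addr);
 *	ibv_wr_set_sge(qpx, mr->lkey, (uintptr_t)src, len);
 *	err = ibv_wr_complete(qpx);
 */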
ib_op) ALWAYS_INLINE; static inline void _mlx5_send_wr_atomic(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr, uint64_t compare_add, uint64_t swap, enum ibv_wr_opcode ib_op) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); size_t transport_seg_sz = 0; void *raddr_seg; _common_wqe_init(ibqp, ib_op); if (ibqp->qp_base.qp_type == IBV_QPT_DRIVER) transport_seg_sz = sizeof(struct mlx5_wqe_datagram_seg); else if (ibqp->qp_base.qp_type == IBV_QPT_XRC_SEND) transport_seg_sz = sizeof(struct mlx5_wqe_xrc_seg); raddr_seg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg) + transport_seg_sz; /* In DC raddr_seg may overrun the SQ */ if (unlikely(raddr_seg == mqp->sq.qend)) raddr_seg = mlx5_get_send_wqe(mqp, 0); set_raddr_seg(raddr_seg, remote_addr, rkey); _set_atomic_seg((struct mlx5_wqe_atomic_seg *)(raddr_seg + sizeof(struct mlx5_wqe_raddr_seg)), ib_op, swap, compare_add); mqp->cur_data = raddr_seg + sizeof(struct mlx5_wqe_raddr_seg) + sizeof(struct mlx5_wqe_atomic_seg); /* In XRC, cur_data may overrun the SQ */ if (unlikely(mqp->cur_data == mqp->sq.qend)) mqp->cur_data = mlx5_get_send_wqe(mqp, 0); mqp->cur_size = (sizeof(struct mlx5_wqe_ctrl_seg) + transport_seg_sz + sizeof(struct mlx5_wqe_raddr_seg) + sizeof(struct mlx5_wqe_atomic_seg)) / 16; mqp->nreq++; /* Relevant just for WQE construction which requires more than 1 setter */ mqp->cur_setters_cnt = 0; } static void mlx5_send_wr_atomic_cmp_swp(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr, uint64_t compare, uint64_t swap) { _mlx5_send_wr_atomic(ibqp, rkey, remote_addr, compare, swap, IBV_WR_ATOMIC_CMP_AND_SWP); } static void mlx5_send_wr_atomic_fetch_add(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr, uint64_t add) { _mlx5_send_wr_atomic(ibqp, rkey, remote_addr, add, 0, IBV_WR_ATOMIC_FETCH_AND_ADD); } static inline void _build_umr_wqe(struct ibv_qp_ex *ibqp, uint32_t orig_rkey, uint32_t new_rkey, const struct ibv_mw_bind_info *bind_info, enum ibv_wr_opcode ib_op) ALWAYS_INLINE; static inline void _build_umr_wqe(struct ibv_qp_ex *ibqp, uint32_t orig_rkey, uint32_t new_rkey, const struct ibv_mw_bind_info *bind_info, enum ibv_wr_opcode ib_op) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); void *umr_seg; int err = 0; int size = sizeof(struct mlx5_wqe_ctrl_seg) / 16; _common_wqe_init(ibqp, ib_op); mqp->cur_ctrl->imm = htobe32(orig_rkey); umr_seg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg); err = set_bind_wr(mqp, IBV_MW_TYPE_2, new_rkey, bind_info, ((struct ibv_qp *)ibqp)->qp_num, &umr_seg, &size); if (unlikely(err)) { if (!mqp->err) mqp->err = err; return; } mqp->cur_size = size; mqp->fm_cache = MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE; mqp->nreq++; _common_wqe_finalize(mqp); } static void mlx5_send_wr_bind_mw(struct ibv_qp_ex *ibqp, struct ibv_mw *mw, uint32_t rkey, const struct ibv_mw_bind_info *bind_info) { _build_umr_wqe(ibqp, mw->rkey, rkey, bind_info, IBV_WR_BIND_MW); } static void mlx5_send_wr_local_inv(struct ibv_qp_ex *ibqp, uint32_t invalidate_rkey) { const struct ibv_mw_bind_info bind_info = {}; _build_umr_wqe(ibqp, invalidate_rkey, 0, &bind_info, IBV_WR_LOCAL_INV); } static inline void _mlx5_send_wr_set_sge(struct mlx5_qp *mqp, uint32_t lkey, uint64_t addr, uint32_t length) { struct mlx5_wqe_data_seg *dseg; if (unlikely(!length)) return; dseg = mqp->cur_data; dseg->byte_count = htobe32(length); dseg->lkey = htobe32(lkey); dseg->addr = htobe64(addr); mqp->cur_size += sizeof(*dseg) / 16; } static void mlx5_send_wr_set_sge_rc_uc(struct ibv_qp_ex *ibqp, uint32_t lkey, 
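/*
 * The atomic builders above always transfer 8 bytes; the local SGE
 * receives the prior value of the remote location.  Sketch (placeholder
 * names):
 *
 *	uint64_t old_val;
 *
 *	ibv_wr_start(qpx);
 *	qpx->wr_flags = IBV_SEND_SIGNALED;
 *	ibv_wr_atomic_fetch_add(qpx, remote_rkey, remote_addr, 1);
 *	ibv_wr_set_sge(qpx, mr->lkey, (uintptr_t)&old_val, sizeof(old_val));
 *	err = ibv_wr_complete(qpx);
 */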
uint64_t addr, uint32_t length) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_set_sge(mqp, lkey, addr, length); _common_wqe_finalize(mqp); } static void mlx5_send_wr_set_sge_ud_xrc_dc(struct ibv_qp_ex *ibqp, uint32_t lkey, uint64_t addr, uint32_t length) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_set_sge(mqp, lkey, addr, length); if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) _common_wqe_finalize(mqp); else mqp->cur_setters_cnt++; } static void mlx5_send_wr_set_sge_eth(struct ibv_qp_ex *ibqp, uint32_t lkey, uint64_t addr, uint32_t length) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); struct mlx5_wqe_eth_seg *eseg = mqp->cur_eth; int err; if (eseg) { /* Inline headers were set */ struct mlx5_sg_copy_ptr sg_copy_ptr = {.index = 0, .offset = 0}; struct ibv_sge sge = {.addr = addr, .length = length}; err = copy_eth_inline_headers((struct ibv_qp *)ibqp, &sge, 1, eseg, &sg_copy_ptr, 1); if (unlikely(err)) { if (!mqp->err) mqp->err = err; return; } addr += sg_copy_ptr.offset; length -= sg_copy_ptr.offset; } _mlx5_send_wr_set_sge(mqp, lkey, addr, length); _common_wqe_finalize(mqp); } static inline void _mlx5_send_wr_set_sge_list(struct mlx5_qp *mqp, size_t num_sge, const struct ibv_sge *sg_list) { struct mlx5_wqe_data_seg *dseg = mqp->cur_data; size_t i; if (unlikely(num_sge > mqp->sq.max_gs)) { FILE *fp = to_mctx(mqp->ibv_qp->context)->dbg_fp; mlx5_dbg(fp, MLX5_DBG_QP_SEND, "Num SGEs %zu exceeds the maximum (%d)\n", num_sge, mqp->sq.max_gs); if (!mqp->err) mqp->err = ENOMEM; return; } for (i = 0; i < num_sge; i++) { if (unlikely(dseg == mqp->sq.qend)) dseg = mlx5_get_send_wqe(mqp, 0); if (unlikely(!sg_list[i].length)) continue; dseg->byte_count = htobe32(sg_list[i].length); dseg->lkey = htobe32(sg_list[i].lkey); dseg->addr = htobe64(sg_list[i].addr); dseg++; mqp->cur_size += (sizeof(*dseg) / 16); } } static void mlx5_send_wr_set_sge_list_rc_uc(struct ibv_qp_ex *ibqp, size_t num_sge, const struct ibv_sge *sg_list) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_set_sge_list(mqp, num_sge, sg_list); _common_wqe_finalize(mqp); } static void mlx5_send_wr_set_sge_list_ud_xrc_dc(struct ibv_qp_ex *ibqp, size_t num_sge, const struct ibv_sge *sg_list) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_set_sge_list(mqp, num_sge, sg_list); if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) _common_wqe_finalize(mqp); else mqp->cur_setters_cnt++; } static void mlx5_send_wr_set_sge_list_eth(struct ibv_qp_ex *ibqp, size_t num_sge, const struct ibv_sge *sg_list) { struct mlx5_sg_copy_ptr sg_copy_ptr = {.index = 0, .offset = 0}; struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); struct mlx5_wqe_data_seg *dseg = mqp->cur_data; struct mlx5_wqe_eth_seg *eseg = mqp->cur_eth; size_t i; if (unlikely(num_sge > mqp->sq.max_gs)) { FILE *fp = to_mctx(mqp->ibv_qp->context)->dbg_fp; mlx5_dbg(fp, MLX5_DBG_QP_SEND, "Num SGEs %zu exceeds the maximum (%d)\n", num_sge, mqp->sq.max_gs); if (!mqp->err) mqp->err = ENOMEM; return; } if (eseg) { /* Inline headers were set */ int err; err = copy_eth_inline_headers((struct ibv_qp *)ibqp, sg_list, num_sge, eseg, &sg_copy_ptr, 1); if (unlikely(err)) { if (!mqp->err) mqp->err = err; return; } } for (i = sg_copy_ptr.index; i < num_sge; i++) { uint32_t length = sg_list[i].length - sg_copy_ptr.offset; if (unlikely(!length)) continue; if (unlikely(dseg == mqp->sq.qend)) dseg = mlx5_get_send_wqe(mqp, 0); dseg->addr = htobe64(sg_list[i].addr + sg_copy_ptr.offset); dseg->byte_count
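/*
 * Gather-list variant of the SGE setters above (placeholder names, error
 * handling elided):
 *
 *	struct ibv_sge sges[2] = {
 *		{ .addr = (uintptr_t)hdr, .length = hdr_len, .lkey = mr->lkey },
 *		{ .addr = (uintptr_t)payload, .length = pay_len, .lkey = mr->lkey },
 *	};
 *
 *	ibv_wr_start(qpx);
 *	ibv_wr_send(qpx);
 *	ibv_wr_set_sge_list(qpx, 2, sges);
 *	err = ibv_wr_complete(qpx);
 */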
= htobe32(length); dseg->lkey = htobe32(sg_list[i].lkey); dseg++; mqp->cur_size += (sizeof(*dseg) / 16); sg_copy_ptr.offset = 0; } _common_wqe_finalize(mqp); } static inline void memcpy_to_wqe(struct mlx5_qp *mqp, void *dest, void *src, size_t n) { if (unlikely(dest + n > mqp->sq.qend)) { size_t copy = mqp->sq.qend - dest; memcpy(dest, src, copy); src += copy; n -= copy; dest = mlx5_get_send_wqe(mqp, 0); } memcpy(dest, src, n); } static inline void memcpy_to_wqe_and_update(struct mlx5_qp *mqp, void **dest, void *src, size_t n) { if (unlikely(*dest + n > mqp->sq.qend)) { size_t copy = mqp->sq.qend - *dest; memcpy(*dest, src, copy); src += copy; n -= copy; *dest = mlx5_get_send_wqe(mqp, 0); } memcpy(*dest, src, n); *dest += n; } static inline void _mlx5_send_wr_set_inline_data(struct mlx5_qp *mqp, void *addr, size_t length) { struct mlx5_wqe_inline_seg *dseg = mqp->cur_data; if (unlikely(length > mqp->max_inline_data)) { FILE *fp = to_mctx(mqp->ibv_qp->context)->dbg_fp; mlx5_dbg(fp, MLX5_DBG_QP_SEND, "Inline data %zu exceeds the maximum (%d)\n", length, mqp->max_inline_data); if (!mqp->err) mqp->err = ENOMEM; return; } mqp->inl_wqe = 1; /* Encourage BlueFlame usage */ if (unlikely(!length)) return; memcpy_to_wqe(mqp, (void *)dseg + sizeof(*dseg), addr, length); dseg->byte_count = htobe32(length | MLX5_INLINE_SEG); mqp->cur_size += DIV_ROUND_UP(length + sizeof(*dseg), 16); } static void mlx5_send_wr_set_inline_data_rc_uc(struct ibv_qp_ex *ibqp, void *addr, size_t length) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_set_inline_data(mqp, addr, length); _common_wqe_finalize(mqp); } static void mlx5_send_wr_set_inline_data_ud_xrc_dc(struct ibv_qp_ex *ibqp, void *addr, size_t length) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_set_inline_data(mqp, addr, length); if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) _common_wqe_finalize(mqp); else mqp->cur_setters_cnt++; } static void mlx5_send_wr_set_inline_data_eth(struct ibv_qp_ex *ibqp, void *addr, size_t length) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); struct mlx5_wqe_eth_seg *eseg = mqp->cur_eth; if (eseg) { /* Inline headers were set */ struct mlx5_sg_copy_ptr sg_copy_ptr = {.index = 0, .offset = 0}; struct ibv_data_buf buf = {.addr = addr, .length = length}; int err; err = copy_eth_inline_headers((struct ibv_qp *)ibqp, &buf, 1, eseg, &sg_copy_ptr, 0); if (unlikely(err)) { if (!mqp->err) mqp->err = err; return; } addr += sg_copy_ptr.offset; length -= sg_copy_ptr.offset; } _mlx5_send_wr_set_inline_data(mqp, addr, length); _common_wqe_finalize(mqp); } static inline void _mlx5_send_wr_set_inline_data_list(struct mlx5_qp *mqp, size_t num_buf, const struct ibv_data_buf *buf_list) { struct mlx5_wqe_inline_seg *dseg = mqp->cur_data; void *wqe = (void *)dseg + sizeof(*dseg); size_t inl_size = 0; int i; for (i = 0; i < num_buf; i++) { size_t length = buf_list[i].length; inl_size += length; if (unlikely(inl_size > mqp->max_inline_data)) { FILE *fp = to_mctx(mqp->ibv_qp->context)->dbg_fp; mlx5_dbg(fp, MLX5_DBG_QP_SEND, "Inline data %zu exceeds the maximum (%d)\n", inl_size, mqp->max_inline_data); if (!mqp->err) mqp->err = ENOMEM; return; } memcpy_to_wqe_and_update(mqp, &wqe, buf_list[i].addr, length); } mqp->inl_wqe = 1; /* Encourage BlueFlame usage */ if (unlikely(!inl_size)) return; dseg->byte_count = htobe32(inl_size | MLX5_INLINE_SEG); mqp->cur_size += DIV_ROUND_UP(inl_size + sizeof(*dseg), 16); } static void mlx5_send_wr_set_inline_data_list_rc_uc(struct ibv_qp_ex *ibqp, size_t
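/*
 * Inline data is copied into the WQE itself at setter time, so no lkey
 * is needed and the caller's buffer may be reused immediately; it also
 * makes the WQE eligible for the BlueFlame path via inl_wqe.  Sketch
 * (placeholder names; the length is bounded by the QP's max_inline_data):
 *
 *	ibv_wr_start(qpx);
 *	ibv_wr_send(qpx);
 *	ibv_wr_set_inline_data(qpx, small_buf, small_len);
 *	err = ibv_wr_complete(qpx);
 */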
num_buf, const struct ibv_data_buf *buf_list) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_set_inline_data_list(mqp, num_buf, buf_list); _common_wqe_finalize(mqp); } static void mlx5_send_wr_set_inline_data_list_ud_xrc_dc(struct ibv_qp_ex *ibqp, size_t num_buf, const struct ibv_data_buf *buf_list) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); _mlx5_send_wr_set_inline_data_list(mqp, num_buf, buf_list); if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) _common_wqe_finalize(mqp); else mqp->cur_setters_cnt++; } static void mlx5_send_wr_set_inline_data_list_eth(struct ibv_qp_ex *ibqp, size_t num_buf, const struct ibv_data_buf *buf_list) { struct mlx5_sg_copy_ptr sg_copy_ptr = {.index = 0, .offset = 0}; struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); struct mlx5_wqe_inline_seg *dseg = mqp->cur_data; struct mlx5_wqe_eth_seg *eseg = mqp->cur_eth; void *wqe = (void *)dseg + sizeof(*dseg); size_t inl_size = 0; size_t i; if (eseg) { /* Inline headers were set */ int err; err = copy_eth_inline_headers((struct ibv_qp *)ibqp, buf_list, num_buf, eseg, &sg_copy_ptr, 0); if (unlikely(err)) { if (!mqp->err) mqp->err = err; return; } } for (i = sg_copy_ptr.index; i < num_buf; i++) { size_t length = buf_list[i].length - sg_copy_ptr.offset; inl_size += length; if (unlikely(inl_size > mqp->max_inline_data)) { FILE *fp = to_mctx(mqp->ibv_qp->context)->dbg_fp; mlx5_dbg(fp, MLX5_DBG_QP_SEND, "Inline data %zu exceeds the maximum (%d)\n", inl_size, mqp->max_inline_data); if (!mqp->err) mqp->err = ENOMEM; return; } memcpy_to_wqe_and_update(mqp, &wqe, buf_list[i].addr + sg_copy_ptr.offset, length); sg_copy_ptr.offset = 0; } if (likely(inl_size)) { dseg->byte_count = htobe32(inl_size | MLX5_INLINE_SEG); mqp->cur_size += DIV_ROUND_UP(inl_size + sizeof(*dseg), 16); } mqp->inl_wqe = 1; /* Encourage BlueFlame usage */ _common_wqe_finalize(mqp); } static void mlx5_send_wr_set_ud_addr(struct ibv_qp_ex *ibqp, struct ibv_ah *ah, uint32_t remote_qpn, uint32_t remote_qkey) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); struct mlx5_wqe_datagram_seg *dseg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg); struct mlx5_ah *mah = to_mah(ah); _set_datagram_seg(dseg, &mah->av, remote_qpn, remote_qkey); if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) _common_wqe_finalize(mqp); else mqp->cur_setters_cnt++; } static void mlx5_send_wr_set_xrc_srqn(struct ibv_qp_ex *ibqp, uint32_t remote_srqn) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); struct mlx5_wqe_xrc_seg *xrc_seg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg); xrc_seg->xrc_srqn = htobe32(remote_srqn); if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) _common_wqe_finalize(mqp); else mqp->cur_setters_cnt++; } static uint8_t get_umr_mr_flags(uint32_t acc) { return ((acc & IBV_ACCESS_REMOTE_ATOMIC ? MLX5_WQE_MKEY_CONTEXT_ACCESS_FLAGS_ATOMIC : 0) | (acc & IBV_ACCESS_REMOTE_WRITE ? MLX5_WQE_MKEY_CONTEXT_ACCESS_FLAGS_REMOTE_WRITE : 0) | (acc & IBV_ACCESS_REMOTE_READ ? MLX5_WQE_MKEY_CONTEXT_ACCESS_FLAGS_REMOTE_READ : 0) | (acc & IBV_ACCESS_LOCAL_WRITE ?
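/*
 * On UD and XRC QPs the address setters above count as one of the
 * WQE_REQ_SETTERS_UD_XRC_DC required setters, so the WQE is finalized
 * only once both the destination and the data have been supplied.
 * UD sketch (placeholder names):
 *
 *	ibv_wr_start(qpx);
 *	ibv_wr_send(qpx);
 *	ibv_wr_set_ud_addr(qpx, ah, remote_qpn, remote_qkey);
 *	ibv_wr_set_sge(qpx, mr->lkey, (uintptr_t)buf, len);
 *	err = ibv_wr_complete(qpx);
 */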
MLX5_WQE_MKEY_CONTEXT_ACCESS_FLAGS_LOCAL_WRITE : 0)); } static int umr_sg_list_create(struct mlx5_qp *qp, uint16_t num_sges, const struct ibv_sge *sge, void *seg, void *qend, int *size, int *xlat_size, uint64_t *reglen) { struct mlx5_wqe_data_seg *dseg; int byte_count = 0; int i; size_t tmp; dseg = seg; for (i = 0; i < num_sges; i++, dseg++) { if (unlikely(dseg == qend)) dseg = mlx5_get_send_wqe(qp, 0); dseg->addr = htobe64(sge[i].addr); dseg->lkey = htobe32(sge[i].lkey); dseg->byte_count = htobe32(sge[i].length); byte_count += sge[i].length; } tmp = align(num_sges, 4) - num_sges; memset(dseg, 0, tmp * sizeof(*dseg)); *size = align(num_sges * sizeof(*dseg), 64); *reglen = byte_count; *xlat_size = num_sges * sizeof(*dseg); return 0; } /* The strided block format is as follows: * | repeat_block | entry_block | entry_block |...| entry_block | * The repeat block holds the details of the list of entry blocks. */ static void umr_strided_seg_create(struct mlx5_qp *qp, uint32_t repeat_count, uint16_t num_interleaved, const struct mlx5dv_mr_interleaved *data, void *seg, void *qend, int *wqe_size, int *xlat_size, uint64_t *reglen) { struct mlx5_wqe_umr_repeat_block_seg *rb = seg; struct mlx5_wqe_umr_repeat_ent_seg *eb; uint64_t byte_count = 0; int tmp; int i; rb->op = htobe32(0x400); rb->reserved = 0; rb->num_ent = htobe16(num_interleaved); rb->repeat_count = htobe32(repeat_count); eb = rb->entries; /* * ------------------------------------------------------------ * | repeat_block | entry_block | entry_block |...| entry_block * ------------------------------------------------------------ */ for (i = 0; i < num_interleaved; i++, eb++) { if (unlikely(eb == qend)) eb = mlx5_get_send_wqe(qp, 0); byte_count += data[i].bytes_count; eb->va = htobe64(data[i].addr); eb->byte_count = htobe16(data[i].bytes_count); eb->stride = htobe16(data[i].bytes_count + data[i].bytes_skip); eb->memkey = htobe32(data[i].lkey); } rb->byte_count = htobe32(byte_count); *reglen = byte_count * repeat_count; tmp = align(num_interleaved + 1, 4) - num_interleaved - 1; memset(eb, 0, tmp * sizeof(*eb)); *wqe_size = align(sizeof(*rb) + sizeof(*eb) * num_interleaved, 64); *xlat_size = (num_interleaved + 1) * sizeof(*eb); } static inline uint8_t bs_to_bs_selector(enum mlx5dv_block_size bs) { static const uint8_t bs_selector[] = { [MLX5DV_BLOCK_SIZE_512] = 1, [MLX5DV_BLOCK_SIZE_520] = 2, [MLX5DV_BLOCK_SIZE_4048] = 6, [MLX5DV_BLOCK_SIZE_4096] = 3, [MLX5DV_BLOCK_SIZE_4160] = 4, }; return bs_selector[bs]; } static uint32_t mlx5_umr_crc_bfs(struct mlx5dv_sig_crc *crc) { enum mlx5dv_sig_crc_type type = crc->type; uint32_t block_format_selector; switch (type) { case MLX5DV_SIG_CRC_TYPE_CRC32: block_format_selector = MLX5_BFS_CRC32_BASE; break; case MLX5DV_SIG_CRC_TYPE_CRC32C: block_format_selector = MLX5_BFS_CRC32C_BASE; break; case MLX5DV_SIG_CRC_TYPE_CRC64_XP10: block_format_selector = MLX5_BFS_CRC64_XP10_BASE; break; default: return 0; } if (!crc->seed) block_format_selector |= MLX5_BFS_CRC_SEED_BIT; block_format_selector |= MLX5_BFS_CRC_REPEAT_BIT; return block_format_selector << MLX5_BFS_SHIFT; } static void mlx5_umr_fill_inl_bsf_t10dif(struct mlx5dv_sig_t10dif *dif, struct mlx5_bsf_inl *inl) { uint8_t inc_ref_guard_check = 0; /* Valid inline section and allow BSF refresh */ inl->vld_refresh = htobe16(MLX5_BSF_INL_VALID | MLX5_BSF_REFRESH_DIF); inl->dif_apptag = htobe16(dif->app_tag); inl->dif_reftag = htobe32(dif->ref_tag); /* repeating block */ inl->rp_inv_seed = MLX5_BSF_REPEAT_BLOCK; if (dif->bg) inl->rp_inv_seed |=
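/*
 * The repeat-block layout built above backs interleaved MKEY
 * registration, e.g. data blocks interleaved with protection fields.
 * Illustrative two-buffer interleave (placeholder names; the QP is
 * assumed to have been created with MLX5DV_QP_EX_WITH_MR_INTERLEAVED
 * and the WR must carry IBV_SEND_INLINE):
 *
 *	struct mlx5dv_qp_ex *mqpx = mlx5dv_qp_ex_from_ibv_qp_ex(qpx);
 *	struct mlx5dv_mr_interleaved data[2] = {
 *		{ .addr = (uintptr_t)blocks, .bytes_count = 512,
 *		  .bytes_skip = 0, .lkey = data_mr->lkey },
 *		{ .addr = (uintptr_t)pi, .bytes_count = 8,
 *		  .bytes_skip = 0, .lkey = pi_mr->lkey },
 *	};
 *
 *	ibv_wr_start(qpx);
 *	qpx->wr_flags = IBV_SEND_SIGNALED | IBV_SEND_INLINE;
 *	mlx5dv_wr_mr_interleaved(mqpx, mkey, IBV_ACCESS_LOCAL_WRITE,
 *				 num_repeats, 2, data);
 *	err = ibv_wr_complete(qpx);
 */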
MLX5_BSF_SEED; inl->sig_type = dif->bg_type == MLX5DV_SIG_T10DIF_CRC ? MLX5_T10DIF_CRC : MLX5_T10DIF_IPCS; if (dif->flags & MLX5DV_SIG_T10DIF_FLAG_REF_REMAP) inc_ref_guard_check |= MLX5_BSF_INC_REFTAG; if (dif->flags & MLX5DV_SIG_T10DIF_FLAG_APP_REF_ESCAPE) inc_ref_guard_check |= MLX5_BSF_APPREF_ESCAPE; else if (dif->flags & MLX5DV_SIG_T10DIF_FLAG_APP_ESCAPE) inc_ref_guard_check |= MLX5_BSF_APPTAG_ESCAPE; inl->dif_inc_ref_guard_check |= inc_ref_guard_check; inl->dif_app_bitmask_check = htobe16(0xffff); } static bool mlx5_umr_block_crc_sbs(struct mlx5_sig_block_domain *mem, struct mlx5_sig_block_domain *wire, uint8_t *copy_mask) { enum mlx5dv_sig_crc_type crc_type; *copy_mask = 0; if (mem->sig_type != wire->sig_type || mem->block_size != wire->block_size || mem->sig.crc.type != wire->sig.crc.type) return false; crc_type = wire->sig.crc.type; switch (crc_type) { case MLX5DV_SIG_CRC_TYPE_CRC32: case MLX5DV_SIG_CRC_TYPE_CRC32C: *copy_mask = MLX5DV_SIG_MASK_CRC32; break; case MLX5DV_SIG_CRC_TYPE_CRC64_XP10: *copy_mask = MLX5DV_SIG_MASK_CRC64_XP10; break; } return true; } static bool mlx5_umr_block_t10dif_sbs(struct mlx5_sig_block_domain *block_mem, struct mlx5_sig_block_domain *block_wire, uint8_t *copy_mask) { struct mlx5dv_sig_t10dif *mem; struct mlx5dv_sig_t10dif *wire; *copy_mask = 0; if (block_mem->sig_type != block_wire->sig_type || block_mem->block_size != block_wire->block_size) return false; mem = &block_mem->sig.dif; wire = &block_wire->sig.dif; if (mem->bg_type == wire->bg_type && mem->bg == wire->bg) *copy_mask |= MLX5DV_SIG_MASK_T10DIF_GUARD; if (mem->app_tag == wire->app_tag) *copy_mask |= MLX5DV_SIG_MASK_T10DIF_APPTAG; if (mem->ref_tag == wire->ref_tag) *copy_mask |= MLX5DV_SIG_MASK_T10DIF_REFTAG; return true; } static int mlx5_umr_fill_sig_bsf(struct mlx5_bsf *bsf, struct mlx5_sig_block *block, bool have_crypto_bsf) { struct mlx5_bsf_basic *basic = &bsf->basic; struct mlx5_sig_block_domain *block_mem = &block->attr.mem; struct mlx5_sig_block_domain *block_wire = &block->attr.wire; enum mlx5_sig_type type; uint32_t bfs_psv; bool sbs = false; /* Same Block Structure */ uint8_t copy_mask = 0; memset(bsf, 0, sizeof(*bsf)); basic->bsf_size_sbs |= (have_crypto_bsf ? 
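/*
 * These BSF fillers consume the block-signature attributes that a caller
 * supplies through the mlx5dv MKEY API.  A wire-domain CRC32 sketch
 * (placeholder names; field layout per the mlx5dv_wr_set_mkey_sig_block(3)
 * man page):
 *
 *	struct mlx5dv_sig_crc crc = {
 *		.type = MLX5DV_SIG_CRC_TYPE_CRC32,
 *		.seed = 0,
 *	};
 *	struct mlx5dv_sig_block_domain wire = {
 *		.sig_type = MLX5DV_SIG_TYPE_CRC,
 *		.sig.crc = &crc,
 *		.block_size = MLX5DV_BLOCK_SIZE_512,
 *	};
 *	struct mlx5dv_sig_block_attr attr = {
 *		.wire = &wire,
 *		.check_mask = MLX5DV_SIG_MASK_CRC32,
 *	};
 *
 *	mlx5dv_wr_set_mkey_sig_block(mqpx, &attr);
 */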
MLX5_BSF_SIZE_SIG_AND_CRYPTO : MLX5_BSF_SIZE_WITH_INLINE) << MLX5_BSF_SIZE_SHIFT; basic->raw_data_size = htobe32(UINT32_MAX); if (block_wire->sig_type != MLX5_SIG_TYPE_NONE || block_mem->sig_type != MLX5_SIG_TYPE_NONE) basic->check_byte_mask = block->attr.check_mask; /* Sig block mem domain */ type = block_mem->sig_type; if (type != MLX5_SIG_TYPE_NONE) { bfs_psv = 0; if (type == MLX5_SIG_TYPE_CRC) bfs_psv = mlx5_umr_crc_bfs(&block_mem->sig.crc); else mlx5_umr_fill_inl_bsf_t10dif(&block_mem->sig.dif, &bsf->m_inl); bfs_psv |= block->mem_psv->index & MLX5_BSF_PSV_INDEX_MASK; basic->m_bfs_psv = htobe32(bfs_psv); basic->mem.bs_selector = bs_to_bs_selector(block_mem->block_size); } /* Sig block wire domain */ type = block_wire->sig_type; if (type != MLX5_SIG_TYPE_NONE) { bfs_psv = 0; if (type == MLX5_SIG_TYPE_CRC) { bfs_psv = mlx5_umr_crc_bfs(&block_wire->sig.crc); sbs = mlx5_umr_block_crc_sbs(block_mem, block_wire, ©_mask); } else { mlx5_umr_fill_inl_bsf_t10dif(&block_wire->sig.dif, &bsf->w_inl); sbs = mlx5_umr_block_t10dif_sbs(block_mem, block_wire, ©_mask); } if (block->attr.flags & MLX5DV_SIG_BLOCK_ATTR_FLAG_COPY_MASK) { if (!sbs) return EINVAL; copy_mask = block->attr.copy_mask; } bfs_psv |= block->wire_psv->index & MLX5_BSF_PSV_INDEX_MASK; basic->w_bfs_psv = htobe32(bfs_psv); if (sbs) { basic->bsf_size_sbs |= 1 << MLX5_BSF_SBS_SHIFT; basic->wire.copy_byte_mask = copy_mask; } else { basic->wire.bs_selector = bs_to_bs_selector(block_wire->block_size); } } return 0; } static int get_crypto_order(bool encrypt_on_tx, enum mlx5dv_signature_crypto_order sig_crypto_order, struct mlx5_sig_block *block) { int order = -1; if (encrypt_on_tx) { if (sig_crypto_order == MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_AFTER_CRYPTO_ON_TX) order = MLX5_ENCRYPTION_ORDER_ENCRYPTED_RAW_WIRE; else order = MLX5_ENCRYPTION_ORDER_ENCRYPTED_WIRE_SIGNATURE; } else { if (sig_crypto_order == MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_AFTER_CRYPTO_ON_TX) order = MLX5_ENCRYPTION_ORDER_ENCRYPTED_MEMORY_SIGNATURE; else order = MLX5_ENCRYPTION_ORDER_ENCRYPTED_RAW_MEMORY; } /* * The combination of RAW_WIRE or RAW_MEMORY with signature configured * in both memory and wire domains is not yet supported by the device. * Return error if the user has mistakenly configured it. */ if (order == MLX5_ENCRYPTION_ORDER_ENCRYPTED_RAW_WIRE || order == MLX5_ENCRYPTION_ORDER_ENCRYPTED_RAW_MEMORY) if (block && block->attr.mem.sig_type != MLX5_SIG_TYPE_NONE && block->attr.wire.sig_type != MLX5_SIG_TYPE_NONE) return -1; return order; } static int mlx5_umr_fill_crypto_bsf(struct mlx5_context *ctx, struct mlx5_crypto_bsf *crypto_bsf, struct mlx5_crypto_attr *attr, struct mlx5_sig_block *block) { int order; memset(crypto_bsf, 0, sizeof(*crypto_bsf)); crypto_bsf->bsf_size_type |= (block ? MLX5_BSF_SIZE_SIG_AND_CRYPTO : MLX5_BSF_SIZE_WITH_INLINE) << MLX5_BSF_SIZE_SHIFT; crypto_bsf->bsf_size_type |= MLX5_BSF_TYPE_CRYPTO; order = get_crypto_order(attr->encrypt_on_tx, attr->signature_crypto_order, block); if (order < 0) return EINVAL; crypto_bsf->enc_order = order; crypto_bsf->enc_standard = MLX5_ENCRYPTION_STANDARD_AES_XTS; crypto_bsf->raw_data_size = htobe32(UINT32_MAX); crypto_bsf->bs_pointer = bs_to_bs_selector(attr->data_unit_size); /* * Multi-block encryption requires big endian tweak. Thus, * convert the tweak accordingly. 
*/ if (ctx->crypto_caps.crypto_engines & MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_MULTI_BLOCK) { int len = sizeof(attr->initial_tweak); for (int i = 0; i < len; i++) crypto_bsf->xts_init_tweak[i] = attr->initial_tweak[len - i - 1]; } else { memcpy(crypto_bsf->xts_init_tweak, attr->initial_tweak, sizeof(crypto_bsf->xts_init_tweak)); } crypto_bsf->rsvd_dek_ptr = htobe32(attr->dek->devx_obj->object_id & 0x00FFFFFF); memcpy(crypto_bsf->keytag, attr->keytag, sizeof(crypto_bsf->keytag)); return 0; } static void mlx5_umr_set_psv(struct mlx5_qp *mqp, uint32_t psv_index, uint64_t transient_signature, bool reset_signal) { struct ibv_qp_ex *ibqp = &mqp->verbs_qp.qp_ex; unsigned int wr_flags; void *seg; struct mlx5_wqe_set_psv_seg *psv; size_t wqe_size; if (reset_signal) { wr_flags = ibqp->wr_flags; ibqp->wr_flags &= ~IBV_SEND_SIGNALED; } _common_wqe_init_op(ibqp, IBV_WR_DRIVER1, MLX5_OPCODE_SET_PSV); if (reset_signal) ibqp->wr_flags = wr_flags; /* Prevent posted wqe corruption if WQ is full */ if (mqp->err) return; seg = mqp->cur_ctrl; seg += sizeof(struct mlx5_wqe_ctrl_seg); wqe_size = sizeof(struct mlx5_wqe_ctrl_seg); psv = seg; seg += sizeof(struct mlx5_wqe_set_psv_seg); wqe_size += sizeof(struct mlx5_wqe_set_psv_seg); memset(psv, 0, sizeof(*psv)); psv->psv_index = htobe32(psv_index); psv->transient_signature = htobe64(transient_signature); mqp->cur_size = wqe_size / 16; mqp->nreq++; mqp->fm_cache = MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE; _common_wqe_finalize(mqp); } static inline void umr_transient_signature_crc(struct mlx5dv_sig_crc *crc, uint64_t *ts) { *ts = (crc->type == MLX5DV_SIG_CRC_TYPE_CRC64_XP10) ? crc->seed : crc->seed << 32; } static inline void umr_transient_signature_t10dif(struct mlx5dv_sig_t10dif *dif, uint64_t *ts) { *ts = (uint64_t)dif->bg << 48 | (uint64_t)dif->app_tag << 32 | dif->ref_tag; } static uint64_t psv_transient_signature(enum mlx5_sig_type type, void *sig) { uint64_t ts; if (type == MLX5_SIG_TYPE_CRC) umr_transient_signature_crc(sig, &ts); else umr_transient_signature_t10dif(sig, &ts); return ts; } static inline int upd_mkc_sig_err_cnt(struct mlx5_mkey *mkey, struct mlx5_wqe_umr_ctrl_seg *umr_ctrl, struct mlx5_wqe_mkey_context_seg *mk) { if (!mkey->sig->err_count_updated) return 0; umr_ctrl->mkey_mask |= htobe64(MLX5_WQE_UMR_CTRL_MKEY_MASK_SIG_ERR); mk->flags_pd |= htobe32( (mkey->sig->err_count & MLX5_WQE_MKEY_CONTEXT_SIG_ERR_CNT_MASK) << MLX5_WQE_MKEY_CONTEXT_SIG_ERR_CNT_SHIFT); mkey->sig->err_count_updated = false; return 1; } static inline void suppress_umr_completion(struct mlx5_qp *mqp) { struct mlx5_wqe_ctrl_seg *wqe_ctrl; /* * Up to 3 WQEs can be posted to configure an MKEY with the signature * attributes: 1 UMR + 1 or 2 SET_PSV. The MKEY is ready to use when the * last WQE is completed. There is no reason to report 3 completions. * One completion for the last SET_PSV WQE is enough. Reset the signal * flag to suppress a completion for UMR WQE. 
*/ wqe_ctrl = (void *)mqp->cur_ctrl; wqe_ctrl->fm_ce_se &= ~MLX5_WQE_CTRL_CQ_UPDATE; } static inline void umr_enable_bsf(struct mlx5_qp *mqp, size_t bsf_size, struct mlx5_wqe_umr_ctrl_seg *umr_ctrl, struct mlx5_wqe_mkey_context_seg *mk) { mqp->cur_size += bsf_size / 16; umr_ctrl->bsf_octowords = htobe16(bsf_size / 16); umr_ctrl->mkey_mask |= htobe64(MLX5_WQE_UMR_CTRL_MKEY_MASK_BSF_ENABLE); mk->flags_pd |= htobe32(MLX5_WQE_MKEY_CONTEXT_FLAGS_BSF_ENABLE); } static inline void umr_finalize_common(struct mlx5_qp *mqp) { mqp->nreq++; _common_wqe_finalize(mqp); mqp->cur_mkey = NULL; } static inline void umr_finalize_and_set_psvs(struct mlx5_qp *mqp, struct mlx5_sig_block *block) { uint64_t ts; bool mem_sig; bool wire_sig; suppress_umr_completion(mqp); umr_finalize_common(mqp); mem_sig = block->attr.mem.sig_type != MLX5_SIG_TYPE_NONE; wire_sig = block->attr.wire.sig_type != MLX5_SIG_TYPE_NONE; if (mem_sig) { ts = psv_transient_signature(block->attr.mem.sig_type, &block->attr.mem.sig); mlx5_umr_set_psv(mqp, block->mem_psv->index, ts, wire_sig); } if (wire_sig) { ts = psv_transient_signature(block->attr.wire.sig_type, &block->attr.wire.sig); mlx5_umr_set_psv(mqp, block->wire_psv->index, ts, false); } } static inline int block_size_to_bytes(enum mlx5dv_block_size data_unit_size) { switch (data_unit_size) { case MLX5DV_BLOCK_SIZE_512: return 512; case MLX5DV_BLOCK_SIZE_520: return 520; case MLX5DV_BLOCK_SIZE_4048: return 4048; case MLX5DV_BLOCK_SIZE_4096: return 4096; case MLX5DV_BLOCK_SIZE_4160: return 4160; default: return -1; } } static void crypto_umr_wqe_finalize(struct mlx5_qp *mqp) { struct mlx5_context *ctx = to_mctx(mqp->ibv_qp->context); struct mlx5_mkey *mkey = mqp->cur_mkey; void *seg; void *qend = mqp->sq.qend; struct mlx5_wqe_umr_ctrl_seg *umr_ctrl; struct mlx5_wqe_mkey_context_seg *mk; size_t cur_data_size; size_t max_data_size; size_t bsf_size = 0; bool set_crypto_bsf = false; bool set_psv = false; int ret; seg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg); umr_ctrl = seg; seg += sizeof(struct mlx5_wqe_umr_ctrl_seg); if (unlikely(seg == qend)) seg = mlx5_get_send_wqe(mqp, 0); mk = seg; if (mkey->sig && upd_mkc_sig_err_cnt(mkey, umr_ctrl, mk) && mkey->sig->block.state == MLX5_MKEY_BSF_STATE_SET) set_psv = true; /* Length must fit block size in single block encryption. 
*/ if ((mkey->crypto->state == MLX5_MKEY_BSF_STATE_UPDATED || mkey->crypto->state == MLX5_MKEY_BSF_STATE_SET) && (ctx->crypto_caps.crypto_engines & MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_SINGLE_BLOCK) && !(ctx->crypto_caps.crypto_engines & MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_MULTI_BLOCK)) { int block_size = block_size_to_bytes(mkey->crypto->data_unit_size); if (block_size < 0 || mkey->length > block_size) { mqp->err = EINVAL; return; } } if (!(mkey->sig && mkey->sig->block.state == MLX5_MKEY_BSF_STATE_UPDATED) && !(mkey->crypto->state == MLX5_MKEY_BSF_STATE_UPDATED) && !(mkey->sig && mkey->sig->block.state == MLX5_MKEY_BSF_STATE_RESET)) goto umr_finalize; if (mkey->sig) { bsf_size += sizeof(struct mlx5_bsf); if (mkey->sig->block.state == MLX5_MKEY_BSF_STATE_UPDATED) set_psv = true; } if (mkey->crypto->state == MLX5_MKEY_BSF_STATE_UPDATED || mkey->crypto->state == MLX5_MKEY_BSF_STATE_SET) { bsf_size += sizeof(struct mlx5_crypto_bsf); set_crypto_bsf = true; } cur_data_size = be16toh(umr_ctrl->klm_octowords) * 16; max_data_size = mqp->max_inline_data + sizeof(struct mlx5_wqe_inl_data_seg); if (unlikely((cur_data_size + bsf_size) > max_data_size)) { mqp->err = ENOMEM; return; } /* The length must fit the raw_data_size of the BSF. */ if (unlikely(mkey->length > UINT32_MAX)) { mqp->err = EINVAL; return; } /* * Handle the case when the size of the previously written data segment * was bigger than MLX5_SEND_WQE_BB and it overlapped the end of the SQ. * The BSF segment needs to come just after it for this UMR WQE. */ seg = mqp->cur_data + cur_data_size; if (unlikely(seg >= qend)) seg = seg - qend + mlx5_get_send_wqe(mqp, 0); if (mkey->sig) { /* If sig and crypto are enabled, sig BSF must be set */ ret = mlx5_umr_fill_sig_bsf(seg, &mkey->sig->block, set_crypto_bsf); if (ret) { mqp->err = ret; return; } seg += sizeof(struct mlx5_bsf); if (unlikely(seg == qend)) seg = mlx5_get_send_wqe(mqp, 0); } if (set_crypto_bsf) { ret = mlx5_umr_fill_crypto_bsf(ctx, seg, mkey->crypto, mkey->sig ? &mkey->sig->block : NULL); if (ret) { mqp->err = ret; return; } } umr_enable_bsf(mqp, bsf_size, umr_ctrl, mk); umr_finalize: if (set_psv) umr_finalize_and_set_psvs(mqp, &mkey->sig->block); else umr_finalize_common(mqp); } static void umr_wqe_finalize(struct mlx5_qp *mqp) { struct mlx5_mkey *mkey = mqp->cur_mkey; struct mlx5_sig_block *block; void *seg; void *qend = mqp->sq.qend; struct mlx5_wqe_umr_ctrl_seg *umr_ctrl; struct mlx5_wqe_mkey_context_seg *mk; bool set_psv = false; size_t cur_data_size; size_t max_data_size; size_t bsf_size = sizeof(struct mlx5_bsf); int ret; if (!mkey->sig && !mkey->crypto) { umr_finalize_common(mqp); return; } if (mkey->crypto) { crypto_umr_wqe_finalize(mqp); return; } seg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg); umr_ctrl = seg; seg += sizeof(struct mlx5_wqe_umr_ctrl_seg); if (unlikely(seg == qend)) seg = mlx5_get_send_wqe(mqp, 0); mk = seg; block = &mkey->sig->block; /* Disable BSF for the MKEY if the block signature is not configured. */ if (block->state != MLX5_MKEY_BSF_STATE_UPDATED && block->state != MLX5_MKEY_BSF_STATE_SET) { /* * Set bsf_enable bit in the mask to update the * corresponding bit in the MKEY context. The new value * is 0 (BSF is disabled) because the MKEY context * segment was zeroed in the mkey conf builder. 
*/ umr_ctrl->mkey_mask |= htobe64(MLX5_WQE_UMR_CTRL_MKEY_MASK_BSF_ENABLE); } if (upd_mkc_sig_err_cnt(mkey, umr_ctrl, mk) && block->state == MLX5_MKEY_BSF_STATE_SET) set_psv = true; if (block->state != MLX5_MKEY_BSF_STATE_UPDATED) { if (set_psv) umr_finalize_and_set_psvs(mqp, block); else umr_finalize_common(mqp); return; } cur_data_size = be16toh(umr_ctrl->klm_octowords) * 16; max_data_size = mqp->max_inline_data + sizeof(struct mlx5_wqe_inl_data_seg); if (unlikely((cur_data_size + bsf_size) > max_data_size)) { mqp->err = ENOMEM; return; } /* The length must fit the raw_data_size of the BSF. */ if (unlikely(mkey->length > UINT32_MAX)) { mqp->err = EINVAL; return; } /* * Handle the case when the size of the previously written data segment * was bigger than MLX5_SEND_WQE_BB and it overlapped the end of the SQ. * The BSF segment needs to come just after it for this UMR WQE. */ seg = mqp->cur_data + cur_data_size; if (unlikely(seg >= qend)) seg = seg - qend + mlx5_get_send_wqe(mqp, 0); ret = mlx5_umr_fill_sig_bsf(seg, &mkey->sig->block, false); if (ret) { mqp->err = ret; return; } umr_enable_bsf(mqp, bsf_size, umr_ctrl, mk); umr_finalize_and_set_psvs(mqp, block); } static void mlx5_send_wr_mkey_configure(struct mlx5dv_qp_ex *dv_qp, struct mlx5dv_mkey *dv_mkey, uint8_t num_setters, struct mlx5dv_mkey_conf_attr *attr) { struct mlx5_qp *mqp = mqp_from_mlx5dv_qp_ex(dv_qp); struct mlx5_context *mctx = to_mctx(mqp->ibv_qp->context); struct ibv_qp_ex *ibqp = &mqp->verbs_qp.qp_ex; struct mlx5_wqe_umr_ctrl_seg *umr_ctrl; struct mlx5_wqe_mkey_context_seg *mk; struct mlx5_mkey *mkey = container_of(dv_mkey, struct mlx5_mkey, dv_mkey); uint64_t mkey_mask; void *qend = mqp->sq.qend; void *seg; if (unlikely(!(ibqp->wr_flags & IBV_SEND_INLINE))) { mqp->err = EOPNOTSUPP; return; } if (unlikely(!check_comp_mask(attr->conf_flags, MLX5DV_MKEY_CONF_FLAG_RESET_SIG_ATTR) || attr->comp_mask)) { mqp->err = EOPNOTSUPP; return; } _common_wqe_init(ibqp, IBV_WR_DRIVER1); mqp->cur_mkey = mkey; mqp->cur_size = sizeof(struct mlx5_wqe_ctrl_seg) / 16; mqp->cur_ctrl->imm = htobe32(dv_mkey->lkey); /* * There is no need to check (umr_ctrl == qend) here because the WQE * control and UMR control segments are always in the same WQEBB. 
*/ seg = umr_ctrl = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg); memset(umr_ctrl, 0, sizeof(*umr_ctrl)); mkey_mask = MLX5_WQE_UMR_CTRL_MKEY_MASK_FREE; seg += sizeof(struct mlx5_wqe_umr_ctrl_seg); mqp->cur_size += sizeof(struct mlx5_wqe_umr_ctrl_seg) / 16; if (unlikely(seg == qend)) seg = mlx5_get_send_wqe(mqp, 0); mk = seg; memset(mk, 0, sizeof(*mk)); if (unlikely(dv_mkey->lkey & 0xff && !(mctx->flags & MLX5_CTX_FLAGS_MKEY_UPDATE_TAG_SUPPORTED))) { mqp->err = EOPNOTSUPP; return; } mk->qpn_mkey = htobe32(0xffffff00 | (dv_mkey->lkey & 0xff)); mkey_mask |= MLX5_WQE_UMR_CTRL_MKEY_MASK_MKEY; seg += sizeof(*mk); mqp->cur_size += (sizeof(*mk) / 16); if (unlikely(seg == qend)) seg = mlx5_get_send_wqe(mqp, 0); mqp->cur_data = seg; umr_ctrl->flags = MLX5_WQE_UMR_CTRL_FLAG_INLINE; if (mkey->sig) { if (attr->conf_flags & MLX5DV_MKEY_CONF_FLAG_RESET_SIG_ATTR) { mkey->sig->block.attr.mem.sig_type = MLX5_SIG_TYPE_NONE; mkey->sig->block.attr.wire.sig_type = MLX5_SIG_TYPE_NONE; mkey->sig->block.state = MLX5_MKEY_BSF_STATE_RESET; } else { if (mkey->sig->block.state == MLX5_MKEY_BSF_STATE_UPDATED) mkey->sig->block.state = MLX5_MKEY_BSF_STATE_SET; else if (mkey->sig->block.state == MLX5_MKEY_BSF_STATE_RESET) mkey->sig->block.state = MLX5_MKEY_BSF_STATE_INIT; } } if (mkey->crypto && mkey->crypto->state == MLX5_MKEY_BSF_STATE_UPDATED) mkey->crypto->state = MLX5_MKEY_BSF_STATE_SET; umr_ctrl->mkey_mask = htobe64(mkey_mask); mqp->fm_cache = MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE; mqp->inl_wqe = 1; if (!num_setters) { umr_wqe_finalize(mqp); } else { mqp->cur_setters_cnt = 0; mqp->num_mkey_setters = num_setters; } } static void mlx5_send_wr_set_mkey_access_flags(struct mlx5dv_qp_ex *dv_qp, uint32_t access_flags) { struct mlx5_qp *mqp = mqp_from_mlx5dv_qp_ex(dv_qp); void *seg; void *qend = mqp->sq.qend; struct mlx5_wqe_umr_ctrl_seg *umr_ctrl; __be64 access_flags_mask = htobe64(MLX5_WQE_UMR_CTRL_MKEY_MASK_ACCESS_LOCAL_WRITE | MLX5_WQE_UMR_CTRL_MKEY_MASK_ACCESS_REMOTE_READ | MLX5_WQE_UMR_CTRL_MKEY_MASK_ACCESS_REMOTE_WRITE | MLX5_WQE_UMR_CTRL_MKEY_MASK_ACCESS_ATOMIC); struct mlx5_wqe_mkey_context_seg *mk; if (unlikely(mqp->err)) return; if (unlikely(!mqp->cur_mkey)) { mqp->err = EINVAL; return; } if (unlikely(!check_comp_mask(access_flags, IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_REMOTE_READ | IBV_ACCESS_REMOTE_ATOMIC))) { mqp->err = EINVAL; return; } seg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg); umr_ctrl = seg; /* Return an error if the setter is called twice per WQE. */ if (umr_ctrl->mkey_mask & access_flags_mask) { mqp->err = EINVAL; return; } umr_ctrl->mkey_mask |= access_flags_mask; seg += sizeof(struct mlx5_wqe_umr_ctrl_seg); if (unlikely(seg == qend)) seg = mlx5_get_send_wqe(mqp, 0); mk = seg; mk->access_flags = get_umr_mr_flags(access_flags); mqp->cur_setters_cnt++; if (mqp->cur_setters_cnt == mqp->num_mkey_setters) umr_wqe_finalize(mqp); } static void mlx5_send_wr_set_mkey_layout(struct mlx5dv_qp_ex *dv_qp, uint32_t repeat_count, uint16_t num_entries, const struct mlx5dv_mr_interleaved *data, const struct ibv_sge *sge) { struct mlx5_qp *mqp = mqp_from_mlx5dv_qp_ex(dv_qp); struct mlx5_mkey *mkey = mqp->cur_mkey; struct mlx5_wqe_umr_ctrl_seg *umr_ctrl; struct mlx5_wqe_mkey_context_seg *mk; int xlat_size; int size; uint64_t reglen = 0; void *qend = mqp->sq.qend; void *seg; uint16_t max_entries; if (unlikely(mqp->err)) return; if (unlikely(!mkey)) { mqp->err = EINVAL; return; } max_entries = data ? 
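/*
 * The MKEY setters below are normally driven as a fixed-length sequence
 * declared up front in mlx5dv_wr_mkey_configure().  Equivalent open-coded
 * form of the mr_list helper further down (placeholder names):
 *
 *	struct mlx5dv_mkey_conf_attr conf = {};
 *
 *	ibv_wr_start(qpx);
 *	qpx->wr_flags = IBV_SEND_SIGNALED | IBV_SEND_INLINE;
 *	mlx5dv_wr_mkey_configure(mqpx, mkey, 2, &conf);
 *	mlx5dv_wr_set_mkey_access_flags(mqpx, IBV_ACCESS_LOCAL_WRITE);
 *	mlx5dv_wr_set_mkey_layout_list(mqpx, num_sges, sges);
 *	err = ibv_wr_complete(qpx);
 */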
min_t(size_t, (mqp->max_inline_data + sizeof(struct mlx5_wqe_inl_data_seg)) / sizeof(struct mlx5_wqe_umr_repeat_ent_seg) - 1, mkey->num_desc) : min_t(size_t, (mqp->max_inline_data + sizeof(struct mlx5_wqe_inl_data_seg)) / sizeof(struct mlx5_wqe_data_seg), mkey->num_desc); if (unlikely(num_entries > max_entries)) { mqp->err = ENOMEM; return; } seg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg); umr_ctrl = seg; /* Check whether the data layout is already set. */ if (umr_ctrl->klm_octowords) { mqp->err = EINVAL; return; } seg += sizeof(struct mlx5_wqe_umr_ctrl_seg); if (unlikely(seg == qend)) seg = mlx5_get_send_wqe(mqp, 0); mk = seg; seg = mqp->cur_data; if (data) umr_strided_seg_create(mqp, repeat_count, num_entries, data, seg, qend, &size, &xlat_size, ®len); else umr_sg_list_create(mqp, num_entries, sge, seg, qend, &size, &xlat_size, ®len); mk->len = htobe64(reglen); umr_ctrl->mkey_mask |= htobe64(MLX5_WQE_UMR_CTRL_MKEY_MASK_LEN); umr_ctrl->klm_octowords = htobe16(align(xlat_size, 64) / 16); mqp->cur_size += size / 16; mkey->length = reglen; mqp->cur_setters_cnt++; if (mqp->cur_setters_cnt == mqp->num_mkey_setters) umr_wqe_finalize(mqp); } static void mlx5_send_wr_set_mkey_layout_interleaved(struct mlx5dv_qp_ex *dv_qp, uint32_t repeat_count, uint16_t num_interleaved, const struct mlx5dv_mr_interleaved *data) { mlx5_send_wr_set_mkey_layout(dv_qp, repeat_count, num_interleaved, data, NULL); } static void mlx5_send_wr_mr_interleaved(struct mlx5dv_qp_ex *dv_qp, struct mlx5dv_mkey *mkey, uint32_t access_flags, uint32_t repeat_count, uint16_t num_interleaved, struct mlx5dv_mr_interleaved *data) { struct mlx5dv_mkey_conf_attr attr = {}; mlx5_send_wr_mkey_configure(dv_qp, mkey, 2, &attr); mlx5_send_wr_set_mkey_access_flags(dv_qp, access_flags); mlx5_send_wr_set_mkey_layout(dv_qp, repeat_count, num_interleaved, data, NULL); } static void mlx5_send_wr_set_mkey_layout_list(struct mlx5dv_qp_ex *dv_qp, uint16_t num_sges, const struct ibv_sge *sge) { mlx5_send_wr_set_mkey_layout(dv_qp, 0, num_sges, NULL, sge); } static inline void mlx5_send_wr_mr_list(struct mlx5dv_qp_ex *dv_qp, struct mlx5dv_mkey *mkey, uint32_t access_flags, uint16_t num_sges, struct ibv_sge *sge) { struct mlx5dv_mkey_conf_attr attr = {}; mlx5_send_wr_mkey_configure(dv_qp, mkey, 2, &attr); mlx5_send_wr_set_mkey_access_flags(dv_qp, access_flags); mlx5_send_wr_set_mkey_layout(dv_qp, 0, num_sges, NULL, sge); } static bool mlx5_validate_sig_t10dif(const struct mlx5dv_sig_t10dif *dif) { if (unlikely(dif->bg != 0 && dif->bg != 0xffff)) return false; if (unlikely(dif->bg_type != MLX5DV_SIG_T10DIF_CRC && dif->bg_type != MLX5DV_SIG_T10DIF_CSUM)) return false; if (unlikely(!check_comp_mask(dif->flags, MLX5DV_SIG_T10DIF_FLAG_REF_REMAP | MLX5DV_SIG_T10DIF_FLAG_APP_ESCAPE | MLX5DV_SIG_T10DIF_FLAG_APP_REF_ESCAPE))) return false; return true; } static bool mlx5_validate_sig_crc(const struct mlx5dv_sig_crc *crc) { switch (crc->type) { case MLX5DV_SIG_CRC_TYPE_CRC32: case MLX5DV_SIG_CRC_TYPE_CRC32C: if (unlikely(crc->seed != 0 && crc->seed != UINT32_MAX)) return false; break; case MLX5DV_SIG_CRC_TYPE_CRC64_XP10: if (unlikely(crc->seed != 0 && crc->seed != UINT64_MAX)) return false; break; default: return false; } return true; } static bool mlx5_validate_sig_block_domain(const struct mlx5dv_sig_block_domain *domain) { if (unlikely(domain->block_size < MLX5DV_BLOCK_SIZE_512 || domain->block_size > MLX5DV_BLOCK_SIZE_4160)) return false; if (unlikely(domain->comp_mask)) return false; switch (domain->sig_type) { case MLX5DV_SIG_TYPE_T10DIF: if 
(unlikely(!mlx5_validate_sig_t10dif(domain->sig.dif))) return false; break; case MLX5DV_SIG_TYPE_CRC: if (unlikely(!mlx5_validate_sig_crc(domain->sig.crc))) return false; break; default: return false; } return true; } static void mlx5_copy_sig_block_domain(const struct mlx5dv_sig_block_domain *src, struct mlx5_sig_block_domain *dst) { if (!src) { dst->sig_type = MLX5_SIG_TYPE_NONE; return; } if (src->sig_type == MLX5DV_SIG_TYPE_CRC) { dst->sig.crc = *src->sig.crc; dst->sig_type = MLX5_SIG_TYPE_CRC; } else { dst->sig.dif = *src->sig.dif; dst->sig_type = MLX5_SIG_TYPE_T10DIF; } dst->block_size = src->block_size; } static void mlx5_send_wr_set_mkey_sig_block(struct mlx5dv_qp_ex *dv_qp, const struct mlx5dv_sig_block_attr *dv_attr) { struct mlx5_qp *mqp = mqp_from_mlx5dv_qp_ex(dv_qp); struct mlx5_mkey *mkey = mqp->cur_mkey; struct mlx5_sig_block *sig_block; if (unlikely(mqp->err)) return; if (unlikely(!mkey)) { mqp->err = EINVAL; return; } if (unlikely(!mkey->sig)) { mqp->err = EINVAL; return; } /* Check whether the setter is already called for the current UMR WQE. */ sig_block = &mkey->sig->block; if (unlikely(sig_block->state == MLX5_MKEY_BSF_STATE_UPDATED)) { mqp->err = EINVAL; return; } if (unlikely(!dv_attr->mem && !dv_attr->wire)) { mqp->err = EINVAL; return; } if (unlikely(!check_comp_mask(dv_attr->flags, MLX5DV_SIG_BLOCK_ATTR_FLAG_COPY_MASK))) { mqp->err = EINVAL; return; } if (unlikely(dv_attr->comp_mask)) { mqp->err = EINVAL; return; } if (dv_attr->mem) { if (unlikely(!mlx5_validate_sig_block_domain(dv_attr->mem))) { mqp->err = EINVAL; return; } } if (dv_attr->wire) { if (unlikely(!mlx5_validate_sig_block_domain(dv_attr->wire))) { mqp->err = EINVAL; return; } } sig_block = &mkey->sig->block; mlx5_copy_sig_block_domain(dv_attr->mem, &sig_block->attr.mem); mlx5_copy_sig_block_domain(dv_attr->wire, &sig_block->attr.wire); sig_block->attr.flags = dv_attr->flags; sig_block->attr.check_mask = dv_attr->check_mask; sig_block->attr.copy_mask = dv_attr->copy_mask; sig_block->state = MLX5_MKEY_BSF_STATE_UPDATED; mqp->cur_setters_cnt++; if (mqp->cur_setters_cnt == mqp->num_mkey_setters) umr_wqe_finalize(mqp); } static void mlx5_send_wr_set_mkey_crypto(struct mlx5dv_qp_ex *dv_qp, const struct mlx5dv_crypto_attr *dv_attr) { struct mlx5_qp *mqp = mqp_from_mlx5dv_qp_ex(dv_qp); struct mlx5_mkey *mkey = mqp->cur_mkey; struct mlx5_crypto_attr *crypto_attr; if (unlikely(mqp->err)) return; if (unlikely(!mkey)) { mqp->err = EINVAL; return; } if (unlikely(!mkey->crypto)) { mqp->err = EINVAL; return; } /* Check whether the setter is already called for the current UMR WQE */ crypto_attr = mkey->crypto; if (unlikely(crypto_attr->state == MLX5_MKEY_BSF_STATE_UPDATED)) { mqp->err = EINVAL; return; } if (unlikely(dv_attr->comp_mask)) { mqp->err = EINVAL; return; } if (unlikely(dv_attr->crypto_standard != MLX5DV_CRYPTO_STANDARD_AES_XTS)) { mqp->err = EINVAL; return; } if (unlikely( dv_attr->signature_crypto_order != MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_AFTER_CRYPTO_ON_TX && dv_attr->signature_crypto_order != MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_BEFORE_CRYPTO_ON_TX)) { mqp->err = EINVAL; return; } if (unlikely(dv_attr->data_unit_size < MLX5DV_BLOCK_SIZE_512 || dv_attr->data_unit_size > MLX5DV_BLOCK_SIZE_4160)) { mqp->err = EINVAL; return; } crypto_attr->crypto_standard = dv_attr->crypto_standard; crypto_attr->encrypt_on_tx = dv_attr->encrypt_on_tx; crypto_attr->signature_crypto_order = dv_attr->signature_crypto_order; crypto_attr->data_unit_size = dv_attr->data_unit_size; crypto_attr->dek = dv_attr->dek; 
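	/*
	 * The initial tweak and keytag are fixed-size byte arrays; copy them
	 * into the mkey's crypto state verbatim.
	 */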
memcpy(crypto_attr->initial_tweak, dv_attr->initial_tweak, sizeof(crypto_attr->initial_tweak)); memcpy(crypto_attr->keytag, dv_attr->keytag, sizeof(crypto_attr->keytag)); crypto_attr->state = MLX5_MKEY_BSF_STATE_UPDATED; mqp->cur_setters_cnt++; if (mqp->cur_setters_cnt == mqp->num_mkey_setters) umr_wqe_finalize(mqp); } static void mlx5_send_wr_set_dc_addr(struct mlx5dv_qp_ex *dv_qp, struct ibv_ah *ah, uint32_t remote_dctn, uint64_t remote_dc_key) { struct mlx5_qp *mqp = mqp_from_mlx5dv_qp_ex(dv_qp); struct mlx5_wqe_datagram_seg *dseg = (void *)mqp->cur_ctrl + sizeof(struct mlx5_wqe_ctrl_seg); struct mlx5_ah *mah = to_mah(ah); memcpy(&dseg->av, &mah->av, sizeof(dseg->av)); dseg->av.dqp_dct |= htobe32(remote_dctn | MLX5_EXTENDED_UD_AV); dseg->av.key.dc_key = htobe64(remote_dc_key); if (mqp->cur_setters_cnt == WQE_REQ_SETTERS_UD_XRC_DC - 1) _common_wqe_finalize(mqp); else mqp->cur_setters_cnt++; } static void mlx5_send_wr_set_dc_addr_stream(struct mlx5dv_qp_ex *dv_qp, struct ibv_ah *ah, uint32_t remote_dctn, uint64_t remote_dc_key, uint16_t stream_id) { struct mlx5_qp *mqp = mqp_from_mlx5dv_qp_ex(dv_qp); mqp->cur_ctrl->dci_stream_channel_id = htobe16(stream_id); mlx5_send_wr_set_dc_addr(dv_qp, ah, remote_dctn, remote_dc_key); } static inline void raw_wqe_init(struct ibv_qp_ex *ibqp) { struct mlx5_qp *mqp = to_mqp((struct ibv_qp *)ibqp); uint32_t idx; if (unlikely(mlx5_wq_overflow(&mqp->sq, mqp->nreq, to_mcq(ibqp->qp_base.send_cq)))) { FILE *fp = to_mctx(((struct ibv_qp *)ibqp)->context)->dbg_fp; mlx5_dbg(fp, MLX5_DBG_QP_SEND, "Work queue overflow\n"); if (!mqp->err) mqp->err = ENOMEM; return; } idx = mqp->sq.cur_post & (mqp->sq.wqe_cnt - 1); mqp->sq.wrid[idx] = ibqp->wr_id; mqp->sq.wqe_head[idx] = mqp->sq.head + mqp->nreq; mqp->sq.wr_data[idx] = IBV_WC_DRIVER2; mqp->fm_cache = 0; mqp->cur_ctrl = mlx5_get_send_wqe(mqp, idx); } static void mlx5_wr_raw_wqe(struct mlx5dv_qp_ex *mqp_ex, const void *wqe) { struct mlx5_wqe_ctrl_seg *ctrl = (struct mlx5_wqe_ctrl_seg *)wqe; struct mlx5_qp *mqp = mqp_from_mlx5dv_qp_ex(mqp_ex); struct ibv_qp_ex *ibqp = ibv_qp_to_qp_ex(mqp->ibv_qp); uint8_t ds = be32toh(ctrl->qpn_ds) & 0x3f; int wq_left; raw_wqe_init(ibqp); wq_left = mqp->sq.qend - (void *)mqp->cur_ctrl; if (likely(wq_left >= ds << 4)) { memcpy(mqp->cur_ctrl, wqe, ds << 4); } else { memcpy(mqp->cur_ctrl, wqe, wq_left); memcpy(mlx5_get_send_wqe(mqp, 0), wqe + wq_left, (ds << 4) - wq_left); } mqp->cur_ctrl->opmod_idx_opcode = htobe32((be32toh(ctrl->opmod_idx_opcode) & 0xff0000ff) | ((mqp->sq.cur_post & 0xffff) << 8)); mqp->cur_size = ds; mqp->nreq++; __wqe_finalize(mqp); } static inline void mlx5_wr_memcpy(struct mlx5dv_qp_ex *mqp_ex, uint32_t dest_lkey, uint64_t dest_addr, uint32_t src_lkey, uint64_t src_addr, size_t length) ALWAYS_INLINE; static inline void mlx5_wr_memcpy(struct mlx5dv_qp_ex *mqp_ex, uint32_t dest_lkey, uint64_t dest_addr, uint32_t src_lkey, uint64_t src_addr, size_t length) { struct mlx5_qp *mqp = mqp_from_mlx5dv_qp_ex(mqp_ex); struct ibv_qp_ex *ibqp = &mqp->verbs_qp.qp_ex; struct mlx5_pd *mpd = to_mpd(mqp->ibv_qp->pd); struct mlx5_mmo_wqe *dma_wqe; if (unlikely(!length || length > to_mctx(mqp->ibv_qp->context) ->dma_mmo_caps.dma_max_size)) { if (!mqp->err) mqp->err = EINVAL; return; } if (length == MLX5_DMA_MMO_MAX_SIZE) /* 2 Gbyte is represented as 0 in data segment byte count */ length = 0; _common_wqe_init_op(ibqp, -1, MLX5_OPCODE_MMO); mqp->cur_ctrl->opmod_idx_opcode = htobe32((be32toh(mqp->cur_ctrl->opmod_idx_opcode) & 0xffffff) | (MLX5_OPC_MOD_MMO_DMA << 24)); dma_wqe = (struct 
mlx5_mmo_wqe *)mqp->cur_ctrl; dma_wqe->mmo_meta.mmo_control_31_0 = 0; dma_wqe->mmo_meta.local_key = htobe32(mpd->opaque_mr->lkey); dma_wqe->mmo_meta.local_address = htobe64((uint64_t)(uintptr_t)mpd->opaque_buf); mlx5dv_set_data_seg(&dma_wqe->src, length, src_lkey, src_addr); mlx5dv_set_data_seg(&dma_wqe->dest, length, dest_lkey, dest_addr); mqp->cur_size = sizeof(*dma_wqe) / 16; mqp->nreq++; _common_wqe_finalize(mqp); } enum { MLX5_SUPPORTED_SEND_OPS_FLAGS_RC = IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_SEND_WITH_INV | IBV_QP_EX_WITH_SEND_WITH_IMM | IBV_QP_EX_WITH_RDMA_WRITE | IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM | IBV_QP_EX_WITH_RDMA_READ | IBV_QP_EX_WITH_ATOMIC_CMP_AND_SWP | IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD | IBV_QP_EX_WITH_LOCAL_INV | IBV_QP_EX_WITH_BIND_MW, MLX5_SUPPORTED_SEND_OPS_FLAGS_XRC = MLX5_SUPPORTED_SEND_OPS_FLAGS_RC, MLX5_SUPPORTED_SEND_OPS_FLAGS_DCI = MLX5_SUPPORTED_SEND_OPS_FLAGS_RC, MLX5_SUPPORTED_SEND_OPS_FLAGS_UD = IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_SEND_WITH_IMM, MLX5_SUPPORTED_SEND_OPS_FLAGS_UC = IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_SEND_WITH_INV | IBV_QP_EX_WITH_SEND_WITH_IMM | IBV_QP_EX_WITH_RDMA_WRITE | IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM | IBV_QP_EX_WITH_LOCAL_INV | IBV_QP_EX_WITH_BIND_MW, MLX5_SUPPORTED_SEND_OPS_FLAGS_RAW_PACKET = IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_TSO, }; static void fill_wr_builders_rc_xrc_dc(struct ibv_qp_ex *ibqp) { ibqp->wr_send = mlx5_send_wr_send_other; ibqp->wr_send_imm = mlx5_send_wr_send_imm; ibqp->wr_send_inv = mlx5_send_wr_send_inv; ibqp->wr_rdma_write = mlx5_send_wr_rdma_write; ibqp->wr_rdma_write_imm = mlx5_send_wr_rdma_write_imm; ibqp->wr_rdma_read = mlx5_send_wr_rdma_read; ibqp->wr_atomic_cmp_swp = mlx5_send_wr_atomic_cmp_swp; ibqp->wr_atomic_fetch_add = mlx5_send_wr_atomic_fetch_add; ibqp->wr_bind_mw = mlx5_send_wr_bind_mw; ibqp->wr_local_inv = mlx5_send_wr_local_inv; } static void fill_wr_builders_uc(struct ibv_qp_ex *ibqp) { ibqp->wr_send = mlx5_send_wr_send_other; ibqp->wr_send_imm = mlx5_send_wr_send_imm; ibqp->wr_send_inv = mlx5_send_wr_send_inv; ibqp->wr_rdma_write = mlx5_send_wr_rdma_write; ibqp->wr_rdma_write_imm = mlx5_send_wr_rdma_write_imm; ibqp->wr_bind_mw = mlx5_send_wr_bind_mw; ibqp->wr_local_inv = mlx5_send_wr_local_inv; } static void fill_wr_builders_ud(struct ibv_qp_ex *ibqp) { ibqp->wr_send = mlx5_send_wr_send_other; ibqp->wr_send_imm = mlx5_send_wr_send_imm; } static void fill_wr_builders_eth(struct ibv_qp_ex *ibqp) { ibqp->wr_send = mlx5_send_wr_send_eth; ibqp->wr_send_tso = mlx5_send_wr_send_tso; } static void fill_wr_setters_rc_uc(struct ibv_qp_ex *ibqp) { ibqp->wr_set_sge = mlx5_send_wr_set_sge_rc_uc; ibqp->wr_set_sge_list = mlx5_send_wr_set_sge_list_rc_uc; ibqp->wr_set_inline_data = mlx5_send_wr_set_inline_data_rc_uc; ibqp->wr_set_inline_data_list = mlx5_send_wr_set_inline_data_list_rc_uc; } static void fill_wr_setters_ud_xrc_dc(struct ibv_qp_ex *ibqp) { ibqp->wr_set_sge = mlx5_send_wr_set_sge_ud_xrc_dc; ibqp->wr_set_sge_list = mlx5_send_wr_set_sge_list_ud_xrc_dc; ibqp->wr_set_inline_data = mlx5_send_wr_set_inline_data_ud_xrc_dc; ibqp->wr_set_inline_data_list = mlx5_send_wr_set_inline_data_list_ud_xrc_dc; } static void fill_wr_setters_eth(struct ibv_qp_ex *ibqp) { ibqp->wr_set_sge = mlx5_send_wr_set_sge_eth; ibqp->wr_set_sge_list = mlx5_send_wr_set_sge_list_eth; ibqp->wr_set_inline_data = mlx5_send_wr_set_inline_data_eth; ibqp->wr_set_inline_data_list = mlx5_send_wr_set_inline_data_list_eth; } void mlx5_qp_fill_wr_complete_error(struct mlx5_qp *mqp) { struct ibv_qp_ex *ibqp = &mqp->verbs_qp.qp_ex; if 
(ibqp->wr_complete)
		ibqp->wr_complete = mlx5_send_wr_complete_error;
}

void mlx5_qp_fill_wr_complete_real(struct mlx5_qp *mqp)
{
	struct ibv_qp_ex *ibqp = &mqp->verbs_qp.qp_ex;

	if (ibqp->wr_complete)
		ibqp->wr_complete = mlx5_send_wr_complete;
}

int mlx5_qp_fill_wr_pfns(struct mlx5_qp *mqp,
			 const struct ibv_qp_init_attr_ex *attr,
			 const struct mlx5dv_qp_init_attr *mlx5_attr)
{
	struct ibv_qp_ex *ibqp = &mqp->verbs_qp.qp_ex;
	struct mlx5dv_qp_ex *dv_qp = &mqp->dv_qp;
	uint64_t ops = attr->send_ops_flags;
	uint64_t mlx5_ops = 0;

	ibqp->wr_start = mlx5_send_wr_start;
	ibqp->wr_complete = mlx5_send_wr_complete;
	ibqp->wr_abort = mlx5_send_wr_abort;

	if (!mqp->atomics_enabled &&
	    (ops & IBV_QP_EX_WITH_ATOMIC_CMP_AND_SWP ||
	     ops & IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD))
		return EOPNOTSUPP;

	if (mlx5_attr &&
	    mlx5_attr->comp_mask & MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS)
		mlx5_ops = mlx5_attr->send_ops_flags;

	if (mlx5_ops) {
		if (!check_comp_mask(mlx5_ops,
				     MLX5DV_QP_EX_WITH_MR_INTERLEAVED |
				     MLX5DV_QP_EX_WITH_MR_LIST |
				     MLX5DV_QP_EX_WITH_MKEY_CONFIGURE |
				     MLX5DV_QP_EX_WITH_RAW_WQE |
				     MLX5DV_QP_EX_WITH_MEMCPY))
			return EOPNOTSUPP;

		dv_qp->wr_raw_wqe = mlx5_wr_raw_wqe;
	}

	/* Set all supported micro-functions regardless of the user's request */
	switch (attr->qp_type) {
	case IBV_QPT_RC:
		if (ops & ~MLX5_SUPPORTED_SEND_OPS_FLAGS_RC)
			return EOPNOTSUPP;
		fill_wr_builders_rc_xrc_dc(ibqp);
		fill_wr_setters_rc_uc(ibqp);
		if (mlx5_ops) {
			dv_qp->wr_mr_interleaved = mlx5_send_wr_mr_interleaved;
			dv_qp->wr_mr_list = mlx5_send_wr_mr_list;
			dv_qp->wr_mkey_configure = mlx5_send_wr_mkey_configure;
			dv_qp->wr_set_mkey_access_flags =
				mlx5_send_wr_set_mkey_access_flags;
			dv_qp->wr_set_mkey_layout_list =
				mlx5_send_wr_set_mkey_layout_list;
			dv_qp->wr_set_mkey_layout_interleaved =
				mlx5_send_wr_set_mkey_layout_interleaved;
			dv_qp->wr_set_mkey_sig_block =
				mlx5_send_wr_set_mkey_sig_block;
			dv_qp->wr_set_mkey_crypto =
				mlx5_send_wr_set_mkey_crypto;
			dv_qp->wr_memcpy = mlx5_wr_memcpy;
		}
		break;
	case IBV_QPT_UC:
		if (ops & ~MLX5_SUPPORTED_SEND_OPS_FLAGS_UC ||
		    (mlx5_ops & ~MLX5DV_QP_EX_WITH_RAW_WQE))
			return EOPNOTSUPP;
		fill_wr_builders_uc(ibqp);
		fill_wr_setters_rc_uc(ibqp);
		break;
	case IBV_QPT_XRC_SEND:
		if (ops & ~MLX5_SUPPORTED_SEND_OPS_FLAGS_XRC ||
		    (mlx5_ops & ~MLX5DV_QP_EX_WITH_RAW_WQE))
			return EOPNOTSUPP;
		fill_wr_builders_rc_xrc_dc(ibqp);
		fill_wr_setters_ud_xrc_dc(ibqp);
		ibqp->wr_set_xrc_srqn = mlx5_send_wr_set_xrc_srqn;
		break;
	case IBV_QPT_UD:
		if (ops & ~MLX5_SUPPORTED_SEND_OPS_FLAGS_UD ||
		    (mlx5_ops & ~MLX5DV_QP_EX_WITH_RAW_WQE))
			return EOPNOTSUPP;
		if (mqp->flags & MLX5_QP_FLAGS_USE_UNDERLAY)
			return EOPNOTSUPP;
		fill_wr_builders_ud(ibqp);
		fill_wr_setters_ud_xrc_dc(ibqp);
		ibqp->wr_set_ud_addr = mlx5_send_wr_set_ud_addr;
		break;
	case IBV_QPT_RAW_PACKET:
		if (ops & ~MLX5_SUPPORTED_SEND_OPS_FLAGS_RAW_PACKET ||
		    (mlx5_ops & ~MLX5DV_QP_EX_WITH_RAW_WQE))
			return EOPNOTSUPP;
		fill_wr_builders_eth(ibqp);
		fill_wr_setters_eth(ibqp);
		break;
	case IBV_QPT_DRIVER:
		if (!(mlx5_attr->comp_mask & MLX5DV_QP_INIT_ATTR_MASK_DC &&
		      mlx5_attr->dc_init_attr.dc_type == MLX5DV_DCTYPE_DCI))
			return EOPNOTSUPP;
		if (ops & ~MLX5_SUPPORTED_SEND_OPS_FLAGS_DCI ||
		    (mlx5_ops & ~(MLX5DV_QP_EX_WITH_RAW_WQE |
				  MLX5DV_QP_EX_WITH_MEMCPY)))
			return EOPNOTSUPP;
		fill_wr_builders_rc_xrc_dc(ibqp);
		fill_wr_setters_ud_xrc_dc(ibqp);
		dv_qp->wr_set_dc_addr = mlx5_send_wr_set_dc_addr;
		dv_qp->wr_set_dc_addr_stream = mlx5_send_wr_set_dc_addr_stream;
		dv_qp->wr_memcpy = mlx5_wr_memcpy;
		break;
	default:
		return EOPNOTSUPP;
	}

	return 0;
}

int mlx5_bind_mw(struct ibv_qp *qp, struct ibv_mw *mw,
		 struct ibv_mw_bind *mw_bind)
{
	struct
ibv_mw_bind_info *bind_info = &mw_bind->bind_info; struct ibv_send_wr wr = {}; struct ibv_send_wr *bad_wr = NULL; int ret; if (bind_info->mw_access_flags & IBV_ACCESS_ZERO_BASED) { errno = EINVAL; return errno; } if (bind_info->mr) { if (verbs_get_mr(bind_info->mr)->mr_type != IBV_MR_TYPE_MR) { errno = ENOTSUP; return errno; } if (to_mmr(bind_info->mr)->alloc_flags & IBV_ACCESS_ZERO_BASED) { errno = EINVAL; return errno; } } wr.opcode = IBV_WR_BIND_MW; wr.next = NULL; wr.wr_id = mw_bind->wr_id; wr.send_flags = mw_bind->send_flags; wr.bind_mw.bind_info = mw_bind->bind_info; wr.bind_mw.mw = mw; wr.bind_mw.rkey = ibv_inc_rkey(mw->rkey); ret = _mlx5_post_send(qp, &wr, &bad_wr); if (ret) return ret; mw->rkey = wr.bind_mw.rkey; return 0; } static void set_sig_seg(struct mlx5_qp *qp, struct mlx5_rwqe_sig *sig, int size, uint16_t idx) { uint8_t sign; uint32_t qpn = qp->ibv_qp->qp_num; sign = calc_sig(sig, size); sign ^= calc_sig(&qpn, 4); sign ^= calc_sig(&idx, 2); sig->signature = sign; } static void set_wq_sig_seg(struct mlx5_rwq *rwq, struct mlx5_rwqe_sig *sig, int size, uint16_t idx) { uint8_t sign; uint32_t qpn = rwq->wq.wq_num; sign = calc_sig(sig, size); sign ^= calc_sig(&qpn, 4); sign ^= calc_sig(&idx, 2); sig->signature = sign; } int mlx5_post_wq_recv(struct ibv_wq *ibwq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct mlx5_rwq *rwq = to_mrwq(ibwq); struct mlx5_wqe_data_seg *scat; int err = 0; int nreq; int ind; int i, j; struct mlx5_rwqe_sig *sig; mlx5_spin_lock(&rwq->rq.lock); ind = rwq->rq.head & (rwq->rq.wqe_cnt - 1); for (nreq = 0; wr; ++nreq, wr = wr->next) { if (unlikely(mlx5_wq_overflow(&rwq->rq, nreq, to_mcq(rwq->wq.cq)))) { err = ENOMEM; *bad_wr = wr; goto out; } if (unlikely(wr->num_sge > rwq->rq.max_gs)) { err = EINVAL; *bad_wr = wr; goto out; } scat = get_wq_recv_wqe(rwq, ind); sig = (struct mlx5_rwqe_sig *)scat; if (unlikely(rwq->wq_sig)) { memset(sig, 0, 1 << rwq->rq.wqe_shift); ++scat; } for (i = 0, j = 0; i < wr->num_sge; ++i) { if (unlikely(!wr->sg_list[i].length)) continue; set_data_ptr_seg(scat + j++, wr->sg_list + i, 0); } if (j < rwq->rq.max_gs) { scat[j].byte_count = 0; scat[j].lkey = htobe32(MLX5_INVALID_LKEY); scat[j].addr = 0; } if (unlikely(rwq->wq_sig)) set_wq_sig_seg(rwq, sig, (wr->num_sge + 1) << 4, rwq->rq.head & 0xffff); rwq->rq.wrid[ind] = wr->wr_id; ind = (ind + 1) & (rwq->rq.wqe_cnt - 1); } out: if (likely(nreq)) { rwq->rq.head += nreq; /* * Make sure that descriptors are written before * doorbell record. 
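 * Otherwise the device could read a doorbell value that points past
 * receive WQEs whose contents are not yet globally visible.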
*/ udma_to_device_barrier(); *(rwq->recv_db) = htobe32(rwq->rq.head & 0xffff); } mlx5_spin_unlock(&rwq->rq.lock); return err; } int mlx5_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct mlx5_qp *qp = to_mqp(ibqp); struct mlx5_wqe_data_seg *scat; int err = 0; int nreq; int ind; int i, j; struct mlx5_rwqe_sig *sig; mlx5_spin_lock(&qp->rq.lock); ind = qp->rq.head & (qp->rq.wqe_cnt - 1); for (nreq = 0; wr; ++nreq, wr = wr->next) { if (unlikely(mlx5_wq_overflow(&qp->rq, nreq, to_mcq(qp->ibv_qp->recv_cq)))) { err = ENOMEM; *bad_wr = wr; goto out; } if (unlikely(wr->num_sge > qp->rq.qp_state_max_gs)) { err = EINVAL; *bad_wr = wr; goto out; } scat = get_recv_wqe(qp, ind); sig = (struct mlx5_rwqe_sig *)scat; if (unlikely(qp->wq_sig)) { memset(sig, 0, 1 << qp->rq.wqe_shift); ++scat; } for (i = 0, j = 0; i < wr->num_sge; ++i) { if (unlikely(!wr->sg_list[i].length)) continue; set_data_ptr_seg(scat + j++, wr->sg_list + i, 0); } if (j < qp->rq.max_gs) { scat[j].byte_count = 0; scat[j].lkey = htobe32(MLX5_INVALID_LKEY); scat[j].addr = 0; } if (unlikely(qp->wq_sig)) set_sig_seg(qp, sig, (wr->num_sge + 1) << 4, qp->rq.head & 0xffff); qp->rq.wrid[ind] = wr->wr_id; ind = (ind + 1) & (qp->rq.wqe_cnt - 1); } out: if (likely(nreq)) { qp->rq.head += nreq; /* * Make sure that descriptors are written before * doorbell record. */ udma_to_device_barrier(); /* * For Raw Packet QP, avoid updating the doorbell record * as long as the QP isn't in RTR state, to avoid receiving * packets in illegal states. * This is only for Raw Packet QPs since they are represented * differently in the hardware. */ if (likely(!((ibqp->qp_type == IBV_QPT_RAW_PACKET || qp->flags & MLX5_QP_FLAGS_USE_UNDERLAY) && ibqp->state < IBV_QPS_RTR))) qp->db[MLX5_RCV_DBR] = htobe32(qp->rq.head & 0xffff); } mlx5_spin_unlock(&qp->rq.lock); return err; } static void mlx5_tm_add_op(struct mlx5_srq *srq, struct mlx5_tag_entry *tag, uint64_t wr_id, int nreq) { struct mlx5_qp *qp = to_mqp(srq->cmd_qp); struct mlx5_srq_op *op; op = srq->op + (srq->op_tail++ & (qp->sq.wqe_cnt - 1)); op->tag = tag; op->wr_id = wr_id; /* Will point to next available WQE */ op->wqe_head = qp->sq.head + nreq; if (tag) tag->expect_cqe++; } int mlx5_post_srq_ops(struct ibv_srq *ibsrq, struct ibv_ops_wr *wr, struct ibv_ops_wr **bad_wr) { struct mlx5_context *ctx = to_mctx(ibsrq->context); struct mlx5_srq *srq = to_msrq(ibsrq); struct mlx5_wqe_ctrl_seg *ctrl = NULL; struct mlx5_tag_entry *tag; struct mlx5_bf *bf; struct mlx5_qp *qp; unsigned int idx; int size = 0; int nreq = 0; int err = 0; void *qend; void *seg; FILE *fp = ctx->dbg_fp; if (unlikely(!srq->cmd_qp)) { *bad_wr = wr; return EINVAL; } qp = to_mqp(srq->cmd_qp); bf = qp->bf; qend = qp->sq.qend; mlx5_spin_lock(&srq->lock); for (nreq = 0; wr; ++nreq, wr = wr->next) { if (unlikely(mlx5_wq_overflow(&qp->sq, nreq, to_mcq(qp->ibv_qp->send_cq)))) { mlx5_dbg(fp, MLX5_DBG_QP_SEND, "work queue overflow\n"); err = ENOMEM; *bad_wr = wr; goto out; } idx = qp->sq.cur_post & (qp->sq.wqe_cnt - 1); ctrl = seg = mlx5_get_send_wqe(qp, idx); *(uint32_t *)(seg + 8) = 0; ctrl->imm = 0; ctrl->fm_ce_se = 0; seg += sizeof(*ctrl); size = sizeof(*ctrl) / 16; switch (wr->opcode) { case IBV_WR_TAG_ADD: if (unlikely(!srq->tm_head->next)) { mlx5_dbg(fp, MLX5_DBG_QP_SEND, "tag matching list is full\n"); err = ENOMEM; *bad_wr = wr; goto out; } tag = srq->tm_head; #ifdef MLX5_DEBUG if (wr->tm.add.num_sge > 1) { mlx5_dbg(fp, MLX5_DBG_QP_SEND, "num_sge must be at most 1\n"); err = EINVAL; *bad_wr = wr; goto out; } if 
(tag->expect_cqe) { mlx5_dbg(fp, MLX5_DBG_QP_SEND, "tag matching list is corrupted\n"); err = ENOMEM; *bad_wr = wr; goto out; } #endif srq->tm_head = tag->next; /* place index of next entry into TM segment */ set_tm_seg(seg, MLX5_TM_OPCODE_APPEND, wr, tag->next - srq->tm_list); tag->next = NULL; tag->wr_id = wr->tm.add.recv_wr_id; if (wr->flags & IBV_OPS_TM_SYNC) srq->unexp_out = wr->tm.unexpected_cnt; tag->phase_cnt = srq->unexp_out; tag->expect_cqe++; if (wr->flags & IBV_OPS_SIGNALED) mlx5_tm_add_op(srq, tag, wr->wr_id, nreq); wr->tm.handle = tag - srq->tm_list; seg += sizeof(struct mlx5_wqe_tm_seg); size += sizeof(struct mlx5_wqe_tm_seg) / 16; if (unlikely(seg == qend)) seg = mlx5_get_send_wqe(qp, 0); /* message is allowed to be empty */ if (wr->tm.add.num_sge && wr->tm.add.sg_list->length) { set_data_ptr_seg(seg, wr->tm.add.sg_list, 0); tag->ptr = (void *)(uintptr_t)wr->tm.add.sg_list->addr; tag->size = wr->tm.add.sg_list->length; } else { set_data_ptr_seg_end(seg); } size += sizeof(struct mlx5_wqe_data_seg) / 16; break; case IBV_WR_TAG_DEL: tag = &srq->tm_list[wr->tm.handle]; #ifdef MLX5_DEBUG if (!tag->expect_cqe) { mlx5_dbg(fp, MLX5_DBG_QP_SEND, "removing tag which isn't in HW ownership\n"); err = ENOMEM; *bad_wr = wr; goto out; } #endif set_tm_seg(seg, MLX5_TM_OPCODE_REMOVE, wr, wr->tm.handle); if (wr->flags & IBV_OPS_SIGNALED) mlx5_tm_add_op(srq, tag, wr->wr_id, nreq); else mlx5_tm_release_tag(srq, tag); seg += sizeof(struct mlx5_wqe_tm_seg); size += sizeof(struct mlx5_wqe_tm_seg) / 16; break; case IBV_WR_TAG_SYNC: set_tm_seg(seg, MLX5_TM_OPCODE_NOP, wr, 0); if (wr->flags & IBV_OPS_SIGNALED) mlx5_tm_add_op(srq, NULL, wr->wr_id, nreq); seg += sizeof(struct mlx5_wqe_tm_seg); size += sizeof(struct mlx5_wqe_tm_seg) / 16; break; default: mlx5_dbg(fp, MLX5_DBG_QP_SEND, "bad opcode %d\n", wr->opcode); err = EINVAL; *bad_wr = wr; goto out; } ctrl->opmod_idx_opcode = htobe32(MLX5_OPCODE_TAG_MATCHING | ((qp->sq.cur_post & 0xffff) << 8)); ctrl->qpn_ds = htobe32(size | (srq->cmd_qp->qp_num << 8)); if (unlikely(qp->wq_sig)) ctrl->signature = wq_sig(ctrl); qp->sq.cur_post += DIV_ROUND_UP(size * 16, MLX5_SEND_WQE_BB); #ifdef MLX5_DEBUG if (mlx5_debug_mask & MLX5_DBG_QP_SEND) dump_wqe(ctx, idx, size, qp); #endif } out: qp->fm_cache = 0; post_send_db(qp, bf, nreq, 0, size, ctrl); mlx5_spin_unlock(&srq->lock); return err; } int mlx5_use_huge(const char *key) { char *e; e = getenv(key); if (e && !strcmp(e, "y")) return 1; return 0; } struct mlx5_qp *mlx5_find_qp(struct mlx5_context *ctx, uint32_t qpn) { int tind = qpn >> MLX5_QP_TABLE_SHIFT; if (ctx->qp_table[tind].refcnt) return ctx->qp_table[tind].table[qpn & MLX5_QP_TABLE_MASK]; else return NULL; } int mlx5_store_qp(struct mlx5_context *ctx, uint32_t qpn, struct mlx5_qp *qp) { int tind = qpn >> MLX5_QP_TABLE_SHIFT; if (!ctx->qp_table[tind].refcnt) { ctx->qp_table[tind].table = calloc(MLX5_QP_TABLE_MASK + 1, sizeof(struct mlx5_qp *)); if (!ctx->qp_table[tind].table) return -1; } ++ctx->qp_table[tind].refcnt; ctx->qp_table[tind].table[qpn & MLX5_QP_TABLE_MASK] = qp; return 0; } void mlx5_clear_qp(struct mlx5_context *ctx, uint32_t qpn) { int tind = qpn >> MLX5_QP_TABLE_SHIFT; if (!--ctx->qp_table[tind].refcnt) free(ctx->qp_table[tind].table); else ctx->qp_table[tind].table[qpn & MLX5_QP_TABLE_MASK] = NULL; } static int mlx5_qp_query_sqd(struct mlx5_qp *mqp, unsigned int *cur_idx) { struct ibv_qp *ibqp = mqp->ibv_qp; uint32_t in[DEVX_ST_SZ_DW(query_qp_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(query_qp_out)] = {}; int err; void *qpc; 
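	/*
	 * Query the QP through DEVX and report the WQE index at which the HW
	 * SQ counter currently points. This only succeeds once the QP has
	 * fully entered the SQ-drained state, so
	 * mlx5dv_qp_cancel_posted_send_wrs() expects its caller to have
	 * already moved the QP to SQD, roughly (sketch only, error handling
	 * and the async event flow omitted):
	 *
	 *	struct ibv_qp_attr attr = { .qp_state = IBV_QPS_SQD };
	 *
	 *	ibv_modify_qp(qp, &attr, IBV_QP_STATE);
	 *	(... wait for the IBV_EVENT_SQ_DRAINED async event ...)
	 *	mlx5dv_qp_cancel_posted_send_wrs(dv_qp, wr_id);
	 */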
DEVX_SET(query_qp_in, in, opcode, MLX5_CMD_OP_QUERY_QP); DEVX_SET(query_qp_in, in, qpn, ibqp->qp_num); err = mlx5dv_devx_qp_query(ibqp, in, sizeof(in), out, sizeof(out)); if (err) { err = mlx5_get_cmd_status_err(err, out); return -err; } qpc = DEVX_ADDR_OF(query_qp_out, out, qpc); if (DEVX_GET(qpc, qpc, state) != MLX5_QPC_STATE_SQDRAINED) return -EINVAL; *cur_idx = DEVX_GET(qpc, qpc, hw_sq_wqebb_counter) & (mqp->sq.wqe_cnt - 1); return 0; } static int mlx5_qp_sq_next_idx(struct mlx5_qp *mqp, unsigned int cur_idx, unsigned int *next_idx) { unsigned int *wqe_head = mqp->sq.wqe_head; unsigned int idx_mask = mqp->sq.wqe_cnt - 1; unsigned int idx = cur_idx; unsigned int next_head; next_head = wqe_head[idx] + 1; if (next_head == mqp->sq.head) return ENOENT; idx++; while (wqe_head[idx] != next_head) idx = (idx + 1) & idx_mask; *next_idx = idx; return 0; } static int mlx5dv_qp_cancel_wr(struct mlx5_qp *mqp, unsigned int idx) { struct mlx5_wqe_ctrl_seg *ctrl; uint32_t opmod_idx_opcode; uint32_t *wr_data = &mqp->sq.wr_data[idx]; ctrl = mlx5_get_send_wqe(mqp, idx); opmod_idx_opcode = be32toh(ctrl->opmod_idx_opcode); if (unlikely(*wr_data == IBV_WC_DRIVER2)) goto out; /* Save the original opcode to return it in the work completion. */ switch (opmod_idx_opcode & 0xff) { case MLX5_OPCODE_RDMA_WRITE_IMM: case MLX5_OPCODE_RDMA_WRITE: *wr_data = IBV_WC_RDMA_WRITE; break; case MLX5_OPCODE_SEND_IMM: case MLX5_OPCODE_SEND: case MLX5_OPCODE_SEND_INVAL: *wr_data = IBV_WC_SEND; break; case MLX5_OPCODE_RDMA_READ: *wr_data = IBV_WC_RDMA_READ; break; case MLX5_OPCODE_ATOMIC_CS: *wr_data = IBV_WC_COMP_SWAP; break; case MLX5_OPCODE_ATOMIC_FA: *wr_data = IBV_WC_FETCH_ADD; break; case MLX5_OPCODE_TSO: *wr_data = IBV_WC_TSO; break; case MLX5_OPCODE_UMR: case MLX5_OPCODE_SET_PSV: case MLX5_OPCODE_MMO: /* wr_data is already set at posting WQE */ break; default: return -EINVAL; } out: /* Reset opcode and opmod to 0 */ opmod_idx_opcode &= 0xffff00; opmod_idx_opcode |= MLX5_OPCODE_NOP; ctrl->opmod_idx_opcode = htobe32(opmod_idx_opcode); return 0; } int mlx5dv_qp_cancel_posted_send_wrs(struct mlx5dv_qp_ex *dv_qp, uint64_t wr_id) { struct mlx5_qp *mqp = mqp_from_mlx5dv_qp_ex(dv_qp); unsigned int idx; int ret; int num_canceled_wrs = 0; mlx5_spin_lock(&mqp->sq.lock); ret = mlx5_qp_query_sqd(mqp, &idx); if (ret) goto unlock_and_exit; if (idx == mqp->sq.cur_post) goto unlock_and_exit; while (!ret) { if (mqp->sq.wrid[idx] == wr_id) { num_canceled_wrs++; ret = mlx5dv_qp_cancel_wr(mqp, idx); if (ret) goto unlock_and_exit; } ret = mlx5_qp_sq_next_idx(mqp, idx, &idx); } ret = num_canceled_wrs; unlock_and_exit: mlx5_spin_unlock(&mqp->sq.lock); return ret; } rdma-core-56.1/providers/mlx5/srq.c000066400000000000000000000302471477342711600172130ustar00rootroot00000000000000/* * Copyright (c) 2012 Mellanox Technologies, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include "mlx5.h" #include "wqe.h" static void *get_wqe(struct mlx5_srq *srq, int n) { return srq->buf.buf + (n << srq->wqe_shift); } static inline void set_next_tail(struct mlx5_srq *srq, int next_tail) { struct mlx5_wqe_srq_next_seg *next; next = get_wqe(srq, srq->tail); next->next_wqe_index = htobe16(next_tail); srq->tail = next_tail; bitmap_clear_bit(srq->free_wqe_bitmap, srq->tail); } int mlx5_copy_to_recv_srq(struct mlx5_srq *srq, int idx, void *buf, int size) { struct mlx5_wqe_srq_next_seg *next; struct mlx5_wqe_data_seg *scat; int copy; int i; int max = 1 << (srq->wqe_shift - 4); next = get_wqe(srq, idx); scat = (struct mlx5_wqe_data_seg *) (next + 1); for (i = 0; i < max; ++i) { copy = min_t(long, size, be32toh(scat->byte_count)); memcpy((void *)(unsigned long)be64toh(scat->addr), buf, copy); size -= copy; if (size <= 0) return IBV_WC_SUCCESS; buf += copy; ++scat; } return IBV_WC_LOC_LEN_ERR; } void mlx5_free_srq_wqe(struct mlx5_srq *srq, int ind) { mlx5_spin_lock(&srq->lock); bitmap_set_bit(srq->free_wqe_bitmap, ind); mlx5_spin_unlock(&srq->lock); } /* Take an index and put it last in wait queue */ static void srq_put_in_waitq(struct mlx5_srq *srq, int ind) { struct mlx5_wqe_srq_next_seg *waitq_tail; waitq_tail = get_wqe(srq, srq->waitq_tail); waitq_tail->next_wqe_index = htobe16(ind); srq->waitq_tail = ind; } /* Take first in wait queue and put in tail of SRQ */ static void srq_get_from_waitq(struct mlx5_srq *srq) { struct mlx5_wqe_srq_next_seg *tail; struct mlx5_wqe_srq_next_seg *waitq_head; tail = get_wqe(srq, srq->tail); waitq_head = get_wqe(srq, srq->waitq_head); tail->next_wqe_index = htobe16(srq->waitq_head); srq->tail = srq->waitq_head; srq->waitq_head = be16toh(waitq_head->next_wqe_index); } /* Put the given WQE that is in SW ownership at the end of the wait queue. * Take a WQE from the wait queue and add it to WQEs in SW ownership instead. */ bool srq_cooldown_wqe(struct mlx5_srq *srq, int ind) { if (!srq_has_waitq(srq)) return false; srq_put_in_waitq(srq, ind); srq_get_from_waitq(srq); return true; } /* Post a WQE internally, based on a previous application post. * Copy a given WQE's data segments to the SRQ head, advance the head * and ring the HW doorbell. 
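 * The copy stops at the terminating scatter entry (lkey == MLX5_INVALID_LKEY),
 * which marks the end of the data segments when fewer than max_gs entries
 * were posted.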
 */
static void srq_repost(struct mlx5_srq *srq, int ind)
{
	struct mlx5_wqe_srq_next_seg *src, *dst;
	struct mlx5_wqe_data_seg *src_scat, *dst_scat;
	int i;

	srq->wrid[srq->head] = srq->wrid[ind];

	src = get_wqe(srq, ind);
	dst = get_wqe(srq, srq->head);
	src_scat = (struct mlx5_wqe_data_seg *)(src + 1);
	dst_scat = (struct mlx5_wqe_data_seg *)(dst + 1);

	for (i = 0; i < srq->max_gs; ++i) {
		dst_scat[i] = src_scat[i];
		if (dst_scat[i].lkey == htobe32(MLX5_INVALID_LKEY))
			break;
	}

	srq->head = be16toh(dst->next_wqe_index);
	srq->counter++;

	/* Flush descriptors */
	udma_to_device_barrier();
	*srq->db = htobe32(srq->counter);
}

static void populate_srq_ll(struct mlx5_srq *srq)
{
	int i;

	for (i = 0; i < srq->nwqes; i++) {
		if (bitmap_test_bit(srq->free_wqe_bitmap, i))
			set_next_tail(srq, i);
	}
}

void mlx5_complete_odp_fault(struct mlx5_srq *srq, int ind)
{
	mlx5_spin_lock(&srq->lock);

	/* Extend the SRQ LL with all the available WQEs that are not part of
	 * the main/wait queue to reduce the risk of overwriting the
	 * page-faulted WQE.
	 */
	populate_srq_ll(srq);

	/* Expand nwqes to include wait queue indexes as from now on these WQEs
	 * can be popped from the wait queue and be part of the main SRQ LL.
	 * Neglecting this step could render some WQE indexes unreachable
	 * despite their availability for use.
	 */
	srq->nwqes = srq->max;

	if (!srq_cooldown_wqe(srq, ind)) {
		struct mlx5_wqe_srq_next_seg *tail = get_wqe(srq, srq->tail);

		/* Without a wait queue, put the page-faulted WQE
		 * back in the SRQ tail. The repost is still possible but
		 * the risk of overwriting the page-faulted WQE with a future
		 * post_srq_recv() is now higher.
		 */
		tail->next_wqe_index = htobe16(ind);
		srq->tail = ind;
	}

	srq_repost(srq, ind);
	mlx5_spin_unlock(&srq->lock);
}

static inline int get_next_contig_wqes(struct mlx5_srq *srq, int first_idx,
				       int last_idx, int *next_wqe_index)
{
	int contig_wqes_count = 0;
	int i;

	for (i = first_idx; i < last_idx; i++) {
		if (bitmap_test_bit(srq->free_wqe_bitmap, i))
			contig_wqes_count++;
		else if (contig_wqes_count > 0)
			break;
	}

	*next_wqe_index = i - contig_wqes_count;
	return contig_wqes_count;
}

/* Locate a contiguous chunk of available WQEs that is closest to the current
 * SRQ HEAD and reorder the SRQ Linked List pointers accordingly.
 * Returns 0 on success, or a negative value if the SRQ is full.
 */
static int set_next_contig_wqes(struct mlx5_srq *srq)
{
	struct mlx5_wqe_srq_next_seg *cur;
	int contig_wqes_count;
	int next_wqe_index;
	int i;

	contig_wqes_count = get_next_contig_wqes(srq, srq->head + 1,
						 srq->nwqes, &next_wqe_index);
	if (contig_wqes_count == 0) {
		contig_wqes_count = get_next_contig_wqes(srq, 0, srq->head,
							 &next_wqe_index);
		if (contig_wqes_count == 0)
			return -1;
	}

	cur = get_wqe(srq, srq->tail);
	cur->next_wqe_index = htobe16(next_wqe_index);
	srq->tail = next_wqe_index + contig_wqes_count - 1;
	bitmap_clear_bit(srq->free_wqe_bitmap, srq->tail);

	/* Reorder the WQE indexes of the new contiguous chunk sequentially,
	 * since the "next" pointers might have been modified and are not
	 * pointing to the subsequent index.
*/ for (i = next_wqe_index; i < srq->tail; i++) { cur = get_wqe(srq, i); cur->next_wqe_index = htobe16(i + 1); bitmap_clear_bit(srq->free_wqe_bitmap, i); } return 0; } int mlx5_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct mlx5_srq *srq = to_msrq(ibsrq); struct mlx5_wqe_srq_next_seg *next; struct mlx5_wqe_data_seg *scat; int next_tail; int err = 0; int nreq; int i; mlx5_spin_lock(&srq->lock); for (nreq = 0; wr; ++nreq, wr = wr->next) { if (wr->num_sge > srq->max_gs) { err = EINVAL; *bad_wr = wr; break; } if (srq->head == srq->tail) { next_tail = (srq->tail + 1) % srq->nwqes; if (bitmap_test_bit(srq->free_wqe_bitmap, next_tail)) { set_next_tail(srq, next_tail); } else if (set_next_contig_wqes(srq)) { /* SRQ is full */ err = ENOMEM; *bad_wr = wr; break; } } srq->wrid[srq->head] = wr->wr_id; next = get_wqe(srq, srq->head); srq->head = be16toh(next->next_wqe_index); scat = (struct mlx5_wqe_data_seg *) (next + 1); for (i = 0; i < wr->num_sge; ++i) { scat[i].byte_count = htobe32(wr->sg_list[i].length); scat[i].lkey = htobe32(wr->sg_list[i].lkey); scat[i].addr = htobe64(wr->sg_list[i].addr); } if (i < srq->max_gs) { scat[i].byte_count = 0; scat[i].lkey = htobe32(MLX5_INVALID_LKEY); scat[i].addr = 0; } } if (nreq) { srq->counter += nreq; /* * Make sure that descriptors are written before * we write doorbell record. */ udma_to_device_barrier(); *srq->db = htobe32(srq->counter); } mlx5_spin_unlock(&srq->lock); return err; } /* Build a linked list on an array of SRQ WQEs. * Since WQEs are always added to the tail and taken from the head * it doesn't matter where the last WQE points to. */ static void set_srq_buf_ll(struct mlx5_srq *srq, int start, int end) { struct mlx5_wqe_srq_next_seg *next; int i; for (i = start; i < end; ++i) { next = get_wqe(srq, i); next->next_wqe_index = htobe16(i + 1); } } int mlx5_alloc_srq_buf(struct ibv_context *context, struct mlx5_srq *srq, uint32_t max_wr, struct ibv_pd *pd) { int size; int buf_size; struct mlx5_context *ctx; uint32_t orig_max_wr = max_wr; bool have_wq = true; enum mlx5_alloc_type alloc_type; ctx = to_mctx(context); if (srq->max_gs < 0) { errno = EINVAL; return -1; } /* At first, try to allocate more WQEs than requested so the extra will * be used for the wait queue. */ max_wr = orig_max_wr * 2 + 1; if (max_wr > ctx->max_srq_recv_wr) { /* Device limits are smaller than required * to provide a wait queue, continue without. 
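	 * Only the one spare WQE implied by "orig_max_wr + 1" is requested
	 * then, and srq_cooldown_wqe() will simply never trigger.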
*/ max_wr = orig_max_wr + 1; have_wq = false; } size = sizeof(struct mlx5_wqe_srq_next_seg) + srq->max_gs * sizeof(struct mlx5_wqe_data_seg); size = max(32, size); size = roundup_pow_of_two(size); if (size > ctx->max_recv_wr) { errno = EINVAL; return -1; } srq->max_gs = (size - sizeof(struct mlx5_wqe_srq_next_seg)) / sizeof(struct mlx5_wqe_data_seg); srq->wqe_shift = ilog32(size - 1); srq->max = align_queue_size(max_wr); buf_size = srq->max * size; mlx5_get_alloc_type(ctx, pd, MLX5_SRQ_PREFIX, &alloc_type, MLX5_ALLOC_TYPE_ANON); if (alloc_type == MLX5_ALLOC_TYPE_CUSTOM) { srq->buf.mparent_domain = to_mparent_domain(pd); srq->buf.req_alignment = to_mdev(context->device)->page_size; srq->buf.resource_type = MLX5DV_RES_TYPE_SRQ; } if (mlx5_alloc_prefered_buf(ctx, &srq->buf, buf_size, to_mdev(context->device)->page_size, alloc_type, MLX5_SRQ_PREFIX)) return -1; if (srq->buf.type != MLX5_ALLOC_TYPE_CUSTOM) memset(srq->buf.buf, 0, buf_size); srq->head = 0; srq->tail = align_queue_size(orig_max_wr + 1) - 1; srq->nwqes = srq->tail + 1; if (have_wq) { srq->waitq_head = srq->tail + 1; srq->waitq_tail = srq->max - 1; } else { srq->waitq_head = -1; srq->waitq_tail = -1; } srq->wrid = malloc(srq->max * sizeof(*srq->wrid)); if (!srq->wrid) goto err_free_buf; srq->free_wqe_bitmap = bitmap_alloc0(srq->max); if (!srq->free_wqe_bitmap) goto err_free_wrid; /* * Now initialize the SRQ buffer so that all of the WQEs are * linked into the list of free WQEs. */ set_srq_buf_ll(srq, srq->head, srq->tail); if (have_wq) set_srq_buf_ll(srq, srq->waitq_head, srq->waitq_tail); return 0; err_free_wrid: free(srq->wrid); err_free_buf: mlx5_free_actual_buf(ctx, &srq->buf); return -1; } struct mlx5_srq *mlx5_find_srq(struct mlx5_context *ctx, uint32_t srqn) { int tind = srqn >> MLX5_SRQ_TABLE_SHIFT; if (ctx->srq_table[tind].refcnt) return ctx->srq_table[tind].table[srqn & MLX5_SRQ_TABLE_MASK]; else return NULL; } int mlx5_store_srq(struct mlx5_context *ctx, uint32_t srqn, struct mlx5_srq *srq) { int tind = srqn >> MLX5_SRQ_TABLE_SHIFT; if (!ctx->srq_table[tind].refcnt) { ctx->srq_table[tind].table = calloc(MLX5_SRQ_TABLE_MASK + 1, sizeof(struct mlx5_srq *)); if (!ctx->srq_table[tind].table) return -1; } ++ctx->srq_table[tind].refcnt; ctx->srq_table[tind].table[srqn & MLX5_SRQ_TABLE_MASK] = srq; return 0; } void mlx5_clear_srq(struct mlx5_context *ctx, uint32_t srqn) { int tind = srqn >> MLX5_SRQ_TABLE_SHIFT; if (!--ctx->srq_table[tind].refcnt) free(ctx->srq_table[tind].table); else ctx->srq_table[tind].table[srqn & MLX5_SRQ_TABLE_MASK] = NULL; } rdma-core-56.1/providers/mlx5/verbs.c000066400000000000000000006334321477342711600175340ustar00rootroot00000000000000/* * Copyright (c) 2012 Mellanox Technologies, Inc. All rights reserved. * Copyright (c) 2020 Intel Corporation. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "mlx5.h" #include "mlx5-abi.h" #include "wqe.h" #include "mlx5_ifc.h" int mlx5_single_threaded = 0; static inline int is_xrc_tgt(int type) { return type == IBV_QPT_XRC_RECV; } static int mlx5_read_clock(struct ibv_context *context, uint64_t *cycles) { unsigned int clockhi, clocklo, clockhi1; int i; struct mlx5_context *ctx = to_mctx(context); if (!ctx->hca_core_clock) return EOPNOTSUPP; /* Handle wraparound */ for (i = 0; i < 2; i++) { clockhi = be32toh(mmio_read32_be(ctx->hca_core_clock)); clocklo = be32toh(mmio_read32_be(ctx->hca_core_clock + 4)); clockhi1 = be32toh(mmio_read32_be(ctx->hca_core_clock)); if (clockhi == clockhi1) break; } *cycles = (uint64_t)clockhi << 32 | (uint64_t)clocklo; return 0; } int mlx5_query_rt_values(struct ibv_context *context, struct ibv_values_ex *values) { uint32_t comp_mask = 0; int err = 0; if (!check_comp_mask(values->comp_mask, IBV_VALUES_MASK_RAW_CLOCK)) return EINVAL; if (values->comp_mask & IBV_VALUES_MASK_RAW_CLOCK) { uint64_t cycles; err = mlx5_read_clock(context, &cycles); if (!err) { values->raw_clock.tv_sec = 0; values->raw_clock.tv_nsec = cycles; comp_mask |= IBV_VALUES_MASK_RAW_CLOCK; } } values->comp_mask = comp_mask; return err; } int mlx5_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr) { struct ibv_query_port cmd; return ibv_cmd_query_port(context, port, attr, &cmd, sizeof cmd); } void mlx5_async_event(struct ibv_context *context, struct ibv_async_event *event) { struct mlx5_context *ctx; switch (event->event_type) { case IBV_EVENT_DEVICE_FATAL: ctx = to_mctx(context); ctx->flags |= MLX5_CTX_FLAGS_FATAL_STATE; break; default: break; } } struct ibv_pd *mlx5_alloc_pd(struct ibv_context *context) { struct ibv_alloc_pd cmd; struct mlx5_alloc_pd_resp resp; struct mlx5_pd *pd; pd = calloc(1, sizeof *pd); if (!pd) return NULL; if (ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd, sizeof cmd, &resp.ibv_resp, sizeof resp)) { free(pd); return NULL; } atomic_init(&pd->refcount, 1); pd->pdn = resp.pdn; pthread_mutex_init(&pd->opaque_mr_mutex, NULL); return &pd->ibv_pd; } static void mlx5_free_uar(struct ibv_context *ctx, struct mlx5_bf *bf) { DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_UAR, MLX5_IB_METHOD_UAR_OBJ_DESTROY, 1); if (!bf->length) goto end; if (bf->mmaped_entry && munmap(bf->uar, bf->length)) assert(false); if (!bf->dyn_alloc_uar) goto end; fill_attr_in_obj(cmd, MLX5_IB_ATTR_UAR_OBJ_DESTROY_HANDLE, bf->uar_handle); if (execute_ioctl(ctx, cmd)) assert(false); end: free(bf); } static struct mlx5_bf * mlx5_alloc_dyn_uar(struct ibv_context *context, uint32_t flags) { DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_UAR, MLX5_IB_METHOD_UAR_OBJ_ALLOC, 5); 
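	/*
	 * Two allocation paths follow: on kernels without dynamic UAR
	 * support (MLX5_CTX_FLAGS_NO_KERN_DYN_UAR) a legacy UAR system page
	 * is mmap()ed by its computed offset; otherwise a UAR object is
	 * allocated via the MLX5_IB_METHOD_UAR_OBJ_ALLOC ioctl and mmap()ed
	 * at the offset the kernel returns.
	 */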
struct ib_uverbs_attr *handle; struct mlx5_context *ctx = to_mctx(context); struct mlx5_bf *bf; bool legacy_mode = false; off_t offset; int ret; if (ctx->flags & MLX5_CTX_FLAGS_NO_KERN_DYN_UAR) { if (flags == MLX5_IB_UAPI_UAR_ALLOC_TYPE_NC) { errno = EOPNOTSUPP; return NULL; } if (ctx->curr_legacy_dyn_sys_uar_page > ctx->max_num_legacy_dyn_uar_sys_page) { errno = ENOSPC; return NULL; } legacy_mode = true; } bf = calloc(1, sizeof(*bf)); if (!bf) { errno = ENOMEM; return NULL; } if (legacy_mode) { struct mlx5_device *dev = to_mdev(context->device); offset = get_uar_mmap_offset(ctx->curr_legacy_dyn_sys_uar_page, dev->page_size, MLX5_IB_MMAP_ALLOC_WC); bf->length = dev->page_size; goto do_mmap; } bf->dyn_alloc_uar = true; handle = fill_attr_out_obj(cmd, MLX5_IB_ATTR_UAR_OBJ_ALLOC_HANDLE); fill_attr_const_in(cmd, MLX5_IB_ATTR_UAR_OBJ_ALLOC_TYPE, flags); fill_attr_out_ptr(cmd, MLX5_IB_ATTR_UAR_OBJ_ALLOC_MMAP_OFFSET, &bf->uar_mmap_offset); fill_attr_out_ptr(cmd, MLX5_IB_ATTR_UAR_OBJ_ALLOC_MMAP_LENGTH, &bf->length); fill_attr_out_ptr(cmd, MLX5_IB_ATTR_UAR_OBJ_ALLOC_PAGE_ID, &bf->page_id); ret = execute_ioctl(context, cmd); if (ret) { free(bf); return NULL; } do_mmap: bf->uar = mmap(NULL, bf->length, PROT_WRITE, MAP_SHARED, context->cmd_fd, legacy_mode ? offset : bf->uar_mmap_offset); if (bf->uar == MAP_FAILED) goto err; bf->mmaped_entry = true; if (legacy_mode) ctx->curr_legacy_dyn_sys_uar_page++; else bf->uar_handle = read_attr_obj(MLX5_IB_ATTR_UAR_OBJ_ALLOC_HANDLE, handle); bf->nc_mode = (flags == MLX5_IB_UAPI_UAR_ALLOC_TYPE_NC); return bf; err: mlx5_free_uar(context, bf); return NULL; } static void mlx5_insert_dyn_uuars(struct mlx5_context *ctx, struct mlx5_bf *bf_uar) { int num_db_bf_per_uar = MLX5_NUM_NON_FP_BFREGS_PER_UAR; int db_bf_reg_size = ctx->bf_reg_size; int index_in_uar, index_uar_in_page; int num_db_bf_per_page; struct list_head *head; struct mlx5_bf *bf = bf_uar; int j; if (bf_uar->nc_mode) { /* DBs are not limited to the odd/even BF's usage */ num_db_bf_per_uar *= 2; db_bf_reg_size = MLX5_DB_BLUEFLAME_BUFFER_SIZE; } num_db_bf_per_page = ctx->num_uars_per_page * num_db_bf_per_uar; if (bf_uar->qp_dedicated) head = &ctx->dyn_uar_qp_dedicated_list; else if (bf_uar->qp_shared) head = &ctx->dyn_uar_qp_shared_list; else if (bf_uar->nc_mode) head = &ctx->dyn_uar_db_list; else head = &ctx->dyn_uar_bf_list; for (j = 0; j < num_db_bf_per_page; j++) { if (j != 0) { bf = calloc(1, sizeof(*bf)); if (!bf) return; } index_uar_in_page = j / num_db_bf_per_uar; index_in_uar = j % num_db_bf_per_uar; bf->reg = bf_uar->uar + (index_uar_in_page * MLX5_ADAPTER_PAGE_SIZE) + MLX5_BF_OFFSET + (index_in_uar * db_bf_reg_size); bf->buf_size = bf_uar->nc_mode ? 0 : ctx->bf_reg_size / 2; /* set to non zero is BF entry, will be detected as part of post_send */ bf->uuarn = bf_uar->nc_mode ? 
0 : 1; list_node_init(&bf->uar_entry); list_add_tail(head, &bf->uar_entry); if (!bf_uar->dyn_alloc_uar) bf->bfreg_dyn_index = (ctx->curr_legacy_dyn_sys_uar_page - 1) * num_db_bf_per_page + j; bf->dyn_alloc_uar = bf_uar->dyn_alloc_uar; bf->need_lock = bf_uar->qp_shared && !mlx5_single_threaded; mlx5_spinlock_init(&bf->lock, bf->need_lock); if (j != 0) { bf->uar = bf_uar->uar; bf->page_id = bf_uar->page_id + index_uar_in_page; bf->uar_handle = bf_uar->uar_handle; bf->nc_mode = bf_uar->nc_mode; if (bf_uar->dyn_alloc_uar) bf->uar_mmap_offset = bf_uar->uar_mmap_offset; } if (bf_uar->qp_dedicated) { ctx->qp_alloc_dedicated_uuars++; bf->qp_dedicated = true; } else if (bf_uar->qp_shared) { ctx->qp_alloc_shared_uuars++; bf->qp_shared = true; } } } static void mlx5_put_qp_uar(struct mlx5_context *ctx, struct mlx5_bf *bf) { if (!bf || (!bf->qp_dedicated && !bf->qp_shared)) return; pthread_mutex_lock(&ctx->dyn_bfregs_mutex); if (bf->qp_dedicated) list_add_tail(&ctx->dyn_uar_qp_dedicated_list, &bf->uar_entry); else bf->count--; pthread_mutex_unlock(&ctx->dyn_bfregs_mutex); } static int mlx5_alloc_qp_uar(struct ibv_context *context, bool dedicated) { struct mlx5_context *ctx = to_mctx(context); struct mlx5_bf *bf; bf = mlx5_alloc_dyn_uar(context, MLX5_IB_UAPI_UAR_ALLOC_TYPE_BF); if (!bf) return -1; if (dedicated) bf->qp_dedicated = true; else bf->qp_shared = true; mlx5_insert_dyn_uuars(ctx, bf); return 0; } static struct mlx5_bf *mlx5_get_qp_uar(struct ibv_context *context) { struct mlx5_context *ctx = to_mctx(context); struct mlx5_bf *bf = NULL, *bf_entry; if (ctx->shut_up_bf || !ctx->bf_reg_size) return ctx->nc_uar; pthread_mutex_lock(&ctx->dyn_bfregs_mutex); do { bf = list_pop(&ctx->dyn_uar_qp_dedicated_list, struct mlx5_bf, uar_entry); if (bf) break; if (ctx->qp_alloc_dedicated_uuars < ctx->qp_max_dedicated_uuars) { if (mlx5_alloc_qp_uar(context, true)) break; continue; } if (ctx->qp_alloc_shared_uuars < ctx->qp_max_shared_uuars) { if (mlx5_alloc_qp_uar(context, false)) break; } /* Looking for a shared uuar with the less concurrent usage */ list_for_each(&ctx->dyn_uar_qp_shared_list, bf_entry, uar_entry) { if (!bf) { bf = bf_entry; } else { if (bf_entry->count < bf->count) bf = bf_entry; } } bf->count++; } while (!bf); pthread_mutex_unlock(&ctx->dyn_bfregs_mutex); return bf; } /* Returns a dedicated UAR */ static struct mlx5_bf *mlx5_attach_dedicated_uar(struct ibv_context *context, uint32_t flags) { struct mlx5_context *ctx = to_mctx(context); struct mlx5_bf *bf; struct list_head *head; pthread_mutex_lock(&ctx->dyn_bfregs_mutex); head = (flags == MLX5_IB_UAPI_UAR_ALLOC_TYPE_NC) ? &ctx->dyn_uar_db_list : &ctx->dyn_uar_bf_list; bf = list_pop(head, struct mlx5_bf, uar_entry); if (!bf) { bf = mlx5_alloc_dyn_uar(context, flags); if (!bf) goto end; mlx5_insert_dyn_uuars(ctx, bf); bf = list_pop(head, struct mlx5_bf, uar_entry); assert(bf); } end: pthread_mutex_unlock(&ctx->dyn_bfregs_mutex); return bf; } static void mlx5_detach_dedicated_uar(struct ibv_context *context, struct mlx5_bf *bf) { struct mlx5_context *ctx = to_mctx(context); struct list_head *head; pthread_mutex_lock(&ctx->dyn_bfregs_mutex); head = bf->nc_mode ? 
&ctx->dyn_uar_db_list : &ctx->dyn_uar_bf_list; list_add_tail(head, &bf->uar_entry); pthread_mutex_unlock(&ctx->dyn_bfregs_mutex); return; } struct ibv_td *mlx5_alloc_td(struct ibv_context *context, struct ibv_td_init_attr *init_attr) { struct mlx5_td *td; if (init_attr->comp_mask) { errno = EINVAL; return NULL; } td = calloc(1, sizeof(*td)); if (!td) { errno = ENOMEM; return NULL; } td->bf = mlx5_attach_dedicated_uar(context, 0); if (!td->bf) { free(td); return NULL; } td->ibv_td.context = context; atomic_init(&td->refcount, 1); return &td->ibv_td; } int mlx5_dealloc_td(struct ibv_td *ib_td) { struct mlx5_td *td; td = to_mtd(ib_td); if (atomic_load(&td->refcount) > 1) return EBUSY; mlx5_detach_dedicated_uar(ib_td->context, td->bf); free(td); return 0; } void mlx5_set_singleton_nc_uar(struct ibv_context *context) { struct mlx5_context *ctx = to_mctx(context); struct mlx5_devx_uar *devx_uar; ctx->nc_uar = mlx5_alloc_dyn_uar(context, MLX5_IB_UAPI_UAR_ALLOC_TYPE_NC); if (!ctx->nc_uar) return; ctx->nc_uar->reg = ctx->nc_uar->uar + MLX5_BF_OFFSET; /* set the singleton devx NC UAR fields */ devx_uar = &ctx->nc_uar->devx_uar; devx_uar->dv_devx_uar.reg_addr = ctx->nc_uar->reg; devx_uar->dv_devx_uar.base_addr = ctx->nc_uar->uar; devx_uar->dv_devx_uar.page_id = ctx->nc_uar->page_id; devx_uar->dv_devx_uar.mmap_off = ctx->nc_uar->uar_mmap_offset; devx_uar->dv_devx_uar.comp_mask = 0; ctx->nc_uar->singleton = true; devx_uar->context = context; } static struct mlx5dv_devx_uar * mlx5_get_singleton_nc_uar(struct ibv_context *context) { struct mlx5_context *ctx = to_mctx(context); if (!ctx->nc_uar) { errno = EOPNOTSUPP; return NULL; } return &ctx->nc_uar->devx_uar.dv_devx_uar; } struct ibv_pd * mlx5_alloc_parent_domain(struct ibv_context *context, struct ibv_parent_domain_init_attr *attr) { struct mlx5_parent_domain *mparent_domain; if (ibv_check_alloc_parent_domain(attr)) return NULL; if (!check_comp_mask(attr->comp_mask, IBV_PARENT_DOMAIN_INIT_ATTR_ALLOCATORS | IBV_PARENT_DOMAIN_INIT_ATTR_PD_CONTEXT)) { errno = EINVAL; return NULL; } mparent_domain = calloc(1, sizeof(*mparent_domain)); if (!mparent_domain) { errno = ENOMEM; return NULL; } if (attr->td) { mparent_domain->mtd = to_mtd(attr->td); atomic_fetch_add(&mparent_domain->mtd->refcount, 1); } mparent_domain->mpd.mprotection_domain = to_mpd(attr->pd); atomic_fetch_add(&mparent_domain->mpd.mprotection_domain->refcount, 1); atomic_init(&mparent_domain->mpd.refcount, 1); ibv_initialize_parent_domain( &mparent_domain->mpd.ibv_pd, &mparent_domain->mpd.mprotection_domain->ibv_pd); if (attr->comp_mask & IBV_PARENT_DOMAIN_INIT_ATTR_ALLOCATORS) { mparent_domain->alloc = attr->alloc; mparent_domain->free = attr->free; } if (attr->comp_mask & IBV_PARENT_DOMAIN_INIT_ATTR_PD_CONTEXT) mparent_domain->pd_context = attr->pd_context; return &mparent_domain->mpd.ibv_pd; } static int mlx5_dealloc_parent_domain(struct mlx5_parent_domain *mparent_domain) { if (atomic_load(&mparent_domain->mpd.refcount) > 1) return EBUSY; atomic_fetch_sub(&mparent_domain->mpd.mprotection_domain->refcount, 1); if (mparent_domain->mtd) atomic_fetch_sub(&mparent_domain->mtd->refcount, 1); free(mparent_domain); return 0; } static int _mlx5_free_pd(struct ibv_pd *pd, bool unimport) { int ret; struct mlx5_parent_domain *mparent_domain = to_mparent_domain(pd); struct mlx5_pd *mpd = to_mpd(pd); if (mparent_domain) { if (unimport) return EINVAL; return mlx5_dealloc_parent_domain(mparent_domain); } if (atomic_load(&mpd->refcount) > 1) return EBUSY; if (mpd->opaque_mr) { ret = 
mlx5_dereg_mr(verbs_get_mr(mpd->opaque_mr)); if (ret) return ret; mpd->opaque_mr = NULL; free(mpd->opaque_buf); } if (unimport) goto end; ret = ibv_cmd_dealloc_pd(pd); if (ret) return ret; end: free(mpd); return 0; } int mlx5_free_pd(struct ibv_pd *pd) { return _mlx5_free_pd(pd, false); } struct ibv_mr *mlx5_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int acc) { struct mlx5_mr *mr; struct ibv_reg_mr cmd; int ret; enum ibv_access_flags access = (enum ibv_access_flags)acc; struct ib_uverbs_reg_mr_resp resp; mr = calloc(1, sizeof(*mr)); if (!mr) return NULL; ret = ibv_cmd_reg_mr(pd, addr, length, hca_va, access, &mr->vmr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) { free(mr); return NULL; } mr->alloc_flags = acc; return &mr->vmr.ibv_mr; } struct ibv_mr *mlx5_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int acc) { struct mlx5_mr *mr; int ret; mr = calloc(1, sizeof(*mr)); if (!mr) return NULL; ret = ibv_cmd_reg_dmabuf_mr(pd, offset, length, iova, fd, acc, &mr->vmr, NULL); if (ret) { free(mr); return NULL; } mr->alloc_flags = acc; return &mr->vmr.ibv_mr; } struct ibv_mr *mlx5_alloc_null_mr(struct ibv_pd *pd) { struct mlx5_mr *mr; struct mlx5_context *ctx = to_mctx(pd->context); if (ctx->dump_fill_mkey == MLX5_INVALID_LKEY) { errno = ENOTSUP; return NULL; } mr = calloc(1, sizeof(*mr)); if (!mr) { errno = ENOMEM; return NULL; } mr->vmr.ibv_mr.lkey = ctx->dump_fill_mkey; mr->vmr.ibv_mr.context = pd->context; mr->vmr.ibv_mr.pd = pd; mr->vmr.ibv_mr.addr = NULL; mr->vmr.ibv_mr.length = SIZE_MAX; mr->vmr.mr_type = IBV_MR_TYPE_NULL_MR; return &mr->vmr.ibv_mr; } enum { MLX5_DM_ALLOWED_ACCESS = IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE | IBV_ACCESS_REMOTE_READ | IBV_ACCESS_REMOTE_ATOMIC | IBV_ACCESS_ZERO_BASED | IBV_ACCESS_OPTIONAL_RANGE }; struct ibv_mr *mlx5_reg_dm_mr(struct ibv_pd *pd, struct ibv_dm *ibdm, uint64_t dm_offset, size_t length, unsigned int acc) { struct mlx5_dm *dm = to_mdm(ibdm); struct mlx5_mr *mr; int ret; if (acc & ~MLX5_DM_ALLOWED_ACCESS) { errno = EINVAL; return NULL; } mr = calloc(1, sizeof(*mr)); if (!mr) { errno = ENOMEM; return NULL; } ret = ibv_cmd_reg_dm_mr(pd, &dm->verbs_dm, dm_offset, length, acc, &mr->vmr, NULL); if (ret) { free(mr); return NULL; } mr->alloc_flags = acc; return &mr->vmr.ibv_mr; } int mlx5_rereg_mr(struct verbs_mr *vmr, int flags, struct ibv_pd *pd, void *addr, size_t length, int access) { struct ibv_rereg_mr cmd; struct ib_uverbs_rereg_mr_resp resp; return ibv_cmd_rereg_mr(vmr, flags, addr, length, (uintptr_t)addr, access, pd, &cmd, sizeof(cmd), &resp, sizeof(resp)); } int mlx5_dereg_mr(struct verbs_mr *vmr) { int ret; if (vmr->mr_type == IBV_MR_TYPE_NULL_MR) goto free; ret = ibv_cmd_dereg_mr(vmr); if (ret) return ret; free: free(vmr); return 0; } int mlx5_advise_mr(struct ibv_pd *pd, enum ibv_advise_mr_advice advice, uint32_t flags, struct ibv_sge *sg_list, uint32_t num_sge) { return ibv_cmd_advise_mr(pd, advice, flags, sg_list, num_sge); } struct ibv_pd *mlx5_import_pd(struct ibv_context *context, uint32_t pd_handle) { DECLARE_COMMAND_BUFFER(cmd, UVERBS_OBJECT_PD, MLX5_IB_METHOD_PD_QUERY, 2); struct mlx5_pd *pd; int ret; pd = calloc(1, sizeof *pd); if (!pd) return NULL; fill_attr_in_obj(cmd, MLX5_IB_ATTR_QUERY_PD_HANDLE, pd_handle); fill_attr_out_ptr(cmd, MLX5_IB_ATTR_QUERY_PD_RESP_PDN, &pd->pdn); ret = execute_ioctl(context, cmd); if (ret) { free(pd); return NULL; } pd->ibv_pd.context = context; pd->ibv_pd.handle = pd_handle; atomic_init(&pd->refcount, 1); 
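	/*
	 * An imported PD shares the exporter's kernel PD object through its
	 * handle; only process-local bookkeeping is set up here, and
	 * mlx5_unimport_pd() later frees just this local copy.
	 */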
pthread_mutex_init(&pd->opaque_mr_mutex, NULL); return &pd->ibv_pd; } void mlx5_unimport_pd(struct ibv_pd *pd) { if (_mlx5_free_pd(pd, true)) assert(false); } struct ibv_mr *mlx5_import_mr(struct ibv_pd *pd, uint32_t mr_handle) { struct mlx5_mr *mr; int ret; mr = calloc(1, sizeof *mr); if (!mr) return NULL; ret = ibv_cmd_query_mr(pd, &mr->vmr, mr_handle); if (ret) { free(mr); return NULL; } return &mr->vmr.ibv_mr; } void mlx5_unimport_mr(struct ibv_mr *ibmr) { free(to_mmr(ibmr)); } struct ibv_mw *mlx5_alloc_mw(struct ibv_pd *pd, enum ibv_mw_type type) { struct ibv_mw *mw; struct ibv_alloc_mw cmd; struct ib_uverbs_alloc_mw_resp resp; int ret; mw = malloc(sizeof(*mw)); if (!mw) return NULL; memset(mw, 0, sizeof(*mw)); ret = ibv_cmd_alloc_mw(pd, type, mw, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) { free(mw); return NULL; } return mw; } int mlx5_dealloc_mw(struct ibv_mw *mw) { int ret; ret = ibv_cmd_dealloc_mw(mw); if (ret) return ret; free(mw); return 0; } static int get_cqe_size(struct mlx5dv_cq_init_attr *mlx5cq_attr) { char *env; int size = 64; if (mlx5cq_attr && (mlx5cq_attr->comp_mask & MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE)) { size = mlx5cq_attr->cqe_size; } else { env = getenv("MLX5_CQE_SIZE"); if (env) size = atoi(env); } switch (size) { case 64: case 128: return size; default: return -EINVAL; } } static int use_scatter_to_cqe(void) { char *env; env = getenv("MLX5_SCATTER_TO_CQE"); if (env && !strcmp(env, "0")) return 0; return 1; } static int srq_sig_enabled(void) { char *env; env = getenv("MLX5_SRQ_SIGNATURE"); if (env) return 1; return 0; } static int qp_sig_enabled(void) { char *env; env = getenv("MLX5_QP_SIGNATURE"); if (env) return 1; return 0; } enum { CREATE_CQ_SUPPORTED_WC_FLAGS = IBV_WC_STANDARD_FLAGS | IBV_WC_EX_WITH_COMPLETION_TIMESTAMP | IBV_WC_EX_WITH_CVLAN | IBV_WC_EX_WITH_FLOW_TAG | IBV_WC_EX_WITH_TM_INFO | IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK }; enum { CREATE_CQ_SUPPORTED_COMP_MASK = IBV_CQ_INIT_ATTR_MASK_FLAGS | IBV_CQ_INIT_ATTR_MASK_PD }; enum { CREATE_CQ_SUPPORTED_FLAGS = IBV_CREATE_CQ_ATTR_SINGLE_THREADED | IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN }; enum { MLX5_DV_CREATE_CQ_SUP_COMP_MASK = (MLX5DV_CQ_INIT_ATTR_MASK_COMPRESSED_CQE | MLX5DV_CQ_INIT_ATTR_MASK_FLAGS | MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE), }; static struct ibv_cq_ex *create_cq(struct ibv_context *context, const struct ibv_cq_init_attr_ex *cq_attr, int cq_alloc_flags, struct mlx5dv_cq_init_attr *mlx5cq_attr) { DECLARE_COMMAND_BUFFER_LINK(driver_attrs, UVERBS_OBJECT_CQ, UVERBS_METHOD_CQ_CREATE, 1, NULL); struct mlx5_create_cq_ex cmd_ex = {}; struct mlx5_create_cq_ex_resp resp_ex = {}; struct mlx5_ib_create_cq *cmd_drv; struct mlx5_ib_create_cq_resp *resp_drv; struct mlx5_cq *cq; int cqe_sz; int ret; int ncqe; int rc; struct mlx5_context *mctx = to_mctx(context); FILE *fp = to_mctx(context)->dbg_fp; if (!cq_attr->cqe) { mlx5_dbg(fp, MLX5_DBG_CQ, "CQE invalid\n"); errno = EINVAL; return NULL; } if (cq_attr->comp_mask & ~CREATE_CQ_SUPPORTED_COMP_MASK) { mlx5_dbg(fp, MLX5_DBG_CQ, "Unsupported comp_mask for create_cq\n"); errno = EINVAL; return NULL; } if (cq_attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_FLAGS && cq_attr->flags & ~CREATE_CQ_SUPPORTED_FLAGS) { mlx5_dbg(fp, MLX5_DBG_CQ, "Unsupported creation flags requested for create_cq\n"); errno = EINVAL; return NULL; } if (cq_attr->wc_flags & ~CREATE_CQ_SUPPORTED_WC_FLAGS) { mlx5_dbg(fp, MLX5_DBG_CQ, "\n"); errno = ENOTSUP; return NULL; } if (mlx5cq_attr && !check_comp_mask(mlx5cq_attr->comp_mask, MLX5_DV_CREATE_CQ_SUP_COMP_MASK)) { mlx5_dbg(fp, MLX5_DBG_CQ, 
"unsupported vendor comp_mask for %s\n", __func__); errno = EINVAL; return NULL; } cq = calloc(1, sizeof *cq); if (!cq) { mlx5_dbg(fp, MLX5_DBG_CQ, "\n"); return NULL; } if (cq_attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_FLAGS) { if (cq_attr->flags & IBV_CREATE_CQ_ATTR_SINGLE_THREADED) cq->flags |= MLX5_CQ_FLAGS_SINGLE_THREADED; } if (cq_attr->comp_mask & IBV_CQ_INIT_ATTR_MASK_PD) { if (!(to_mparent_domain(cq_attr->parent_domain))) { errno = EINVAL; goto err; } cq->parent_domain = cq_attr->parent_domain; } if (cq_alloc_flags & MLX5_CQ_FLAGS_EXTENDED) { rc = mlx5_cq_fill_pfns(cq, cq_attr, mctx); if (rc) { errno = rc; goto err; } } cmd_drv = &cmd_ex.drv_payload; resp_drv = &resp_ex.drv_payload; cq->cons_index = 0; if (mlx5_spinlock_init(&cq->lock, !mlx5_single_threaded)) goto err; ncqe = align_queue_size(cq_attr->cqe + 1); if ((ncqe > (1 << 24)) || (ncqe < (cq_attr->cqe + 1))) { mlx5_dbg(fp, MLX5_DBG_CQ, "ncqe %d\n", ncqe); errno = EINVAL; goto err_spl; } cqe_sz = get_cqe_size(mlx5cq_attr); if (cqe_sz < 0) { mlx5_dbg(fp, MLX5_DBG_CQ, "\n"); errno = -cqe_sz; goto err_spl; } if (mlx5_alloc_cq_buf(to_mctx(context), cq, &cq->buf_a, ncqe, cqe_sz)) { mlx5_dbg(fp, MLX5_DBG_CQ, "\n"); goto err_spl; } cq->dbrec = mlx5_alloc_dbrec(to_mctx(context), cq->parent_domain, &cq->custom_db); if (!cq->dbrec) { mlx5_dbg(fp, MLX5_DBG_CQ, "\n"); goto err_buf; } cq->dbrec[MLX5_CQ_SET_CI] = 0; cq->dbrec[MLX5_CQ_ARM_DB] = 0; cq->arm_sn = 0; cq->cqe_sz = cqe_sz; cq->flags = cq_alloc_flags; cmd_drv->buf_addr = (uintptr_t) cq->buf_a.buf; cmd_drv->db_addr = (uintptr_t) cq->dbrec; cmd_drv->cqe_size = cqe_sz; if (mlx5cq_attr) { if (mlx5cq_attr->comp_mask & MLX5DV_CQ_INIT_ATTR_MASK_COMPRESSED_CQE) { if (mctx->cqe_comp_caps.max_num && (mlx5cq_attr->cqe_comp_res_format & mctx->cqe_comp_caps.supported_format)) { cmd_drv->cqe_comp_en = 1; cmd_drv->cqe_comp_res_format = mlx5cq_attr->cqe_comp_res_format; } else { mlx5_dbg(fp, MLX5_DBG_CQ, "CQE Compression is not supported\n"); errno = EINVAL; goto err_db; } } if (mlx5cq_attr->comp_mask & MLX5DV_CQ_INIT_ATTR_MASK_FLAGS) { if (mlx5cq_attr->flags & ~(MLX5DV_CQ_INIT_ATTR_FLAGS_RESERVED - 1)) { mlx5_dbg(fp, MLX5_DBG_CQ, "Unsupported vendor flags for create_cq\n"); errno = EINVAL; goto err_db; } if (mlx5cq_attr->flags & MLX5DV_CQ_INIT_ATTR_FLAGS_CQE_PAD) { if (!(mctx->vendor_cap_flags & MLX5_VENDOR_CAP_FLAGS_CQE_128B_PAD) || (cqe_sz != 128)) { mlx5_dbg(fp, MLX5_DBG_CQ, "%dB CQE paddind is not supported\n", cqe_sz); errno = EINVAL; goto err_db; } cmd_drv->flags |= MLX5_IB_CREATE_CQ_FLAGS_CQE_128B_PAD; } } } if (mctx->flags & MLX5_CTX_FLAGS_REAL_TIME_TS_SUPPORTED && !(cq_attr->wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP) && cq_attr->wc_flags & IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK) cmd_drv->flags |= MLX5_IB_CREATE_CQ_FLAGS_REAL_TIME_TS; if (mctx->nc_uar) { if (mctx->nc_uar->page_id >= (1ul << 16)) { fill_attr_in_uint32(driver_attrs, MLX5_IB_ATTR_CREATE_CQ_UAR_INDEX, mctx->nc_uar->page_id); } else { cmd_drv->flags |= MLX5_IB_CREATE_CQ_FLAGS_UAR_PAGE_INDEX; cmd_drv->uar_page_index = mctx->nc_uar->page_id; } } { struct ibv_cq_init_attr_ex cq_attr_ex = *cq_attr; cq_attr_ex.cqe = ncqe - 1; ret = ibv_cmd_create_cq_ex2(context, &cq_attr_ex, &cq->verbs_cq, &cmd_ex.ibv_cmd, sizeof(cmd_ex), &resp_ex.ibv_resp, sizeof(resp_ex), CREATE_CQ_CMD_FLAGS_TS_IGNORED_EX, driver_attrs); } if (ret) { mlx5_dbg(fp, MLX5_DBG_CQ, "ret %d\n", ret); goto err_db; } if (cq->parent_domain) atomic_fetch_add(&to_mparent_domain(cq->parent_domain)->mpd.refcount, 1); cq->active_buf = &cq->buf_a; cq->resize_buf = 
NULL; cq->cqn = resp_drv->cqn; cq->stall_enable = to_mctx(context)->stall_enable; cq->stall_adaptive_enable = to_mctx(context)->stall_adaptive_enable; cq->stall_cycles = to_mctx(context)->stall_cycles; return &cq->verbs_cq.cq_ex; err_db: mlx5_free_db(to_mctx(context), cq->dbrec, cq->parent_domain, cq->custom_db); err_buf: mlx5_free_cq_buf(to_mctx(context), &cq->buf_a); err_spl: mlx5_spinlock_destroy(&cq->lock); err: free(cq); return NULL; } struct ibv_cq *mlx5_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector) { struct ibv_cq_ex *cq; struct ibv_cq_init_attr_ex cq_attr = {.cqe = cqe, .channel = channel, .comp_vector = comp_vector, .wc_flags = IBV_WC_STANDARD_FLAGS}; if (cqe <= 0) { errno = EINVAL; return NULL; } cq = create_cq(context, &cq_attr, 0, NULL); return cq ? ibv_cq_ex_to_cq(cq) : NULL; } struct ibv_cq_ex *mlx5_create_cq_ex(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr) { return create_cq(context, cq_attr, MLX5_CQ_FLAGS_EXTENDED, NULL); } static struct ibv_cq_ex *_mlx5dv_create_cq(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr, struct mlx5dv_cq_init_attr *mlx5_cq_attr) { struct ibv_cq_ex *cq; cq = create_cq(context, cq_attr, MLX5_CQ_FLAGS_EXTENDED, mlx5_cq_attr); if (!cq) return NULL; verbs_init_cq(ibv_cq_ex_to_cq(cq), context, cq_attr->channel, cq_attr->cq_context); return cq; } struct ibv_cq_ex *mlx5dv_create_cq(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr, struct mlx5dv_cq_init_attr *mlx5_cq_attr) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context); if (!dvops || !dvops->create_cq) { errno = EOPNOTSUPP; return NULL; } return dvops->create_cq(context, cq_attr, mlx5_cq_attr); } int mlx5_resize_cq(struct ibv_cq *ibcq, int cqe) { struct mlx5_cq *cq = to_mcq(ibcq); struct mlx5_resize_cq_resp resp; struct mlx5_resize_cq cmd; struct mlx5_context *mctx = to_mctx(ibcq->context); int err; if (cqe < 0) { errno = EINVAL; return errno; } memset(&cmd, 0, sizeof(cmd)); memset(&resp, 0, sizeof(resp)); if (((long long)cqe * 64) > INT_MAX) return EINVAL; mlx5_spin_lock(&cq->lock); cq->active_cqes = cq->verbs_cq.cq.cqe; if (cq->active_buf == &cq->buf_a) cq->resize_buf = &cq->buf_b; else cq->resize_buf = &cq->buf_a; cqe = align_queue_size(cqe + 1); if (cqe == ibcq->cqe + 1) { cq->resize_buf = NULL; err = 0; goto out; } /* currently we don't change cqe size */ cq->resize_cqe_sz = cq->cqe_sz; cq->resize_cqes = cqe; err = mlx5_alloc_cq_buf(mctx, cq, cq->resize_buf, cq->resize_cqes, cq->resize_cqe_sz); if (err) { cq->resize_buf = NULL; errno = ENOMEM; goto out; } cmd.buf_addr = (uintptr_t)cq->resize_buf->buf; cmd.cqe_size = cq->resize_cqe_sz; err = ibv_cmd_resize_cq(ibcq, cqe - 1, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (err) goto out_buf; mlx5_cq_resize_copy_cqes(mctx, cq); mlx5_free_cq_buf(mctx, cq->active_buf); cq->active_buf = cq->resize_buf; cq->verbs_cq.cq.cqe = cqe - 1; mlx5_spin_unlock(&cq->lock); cq->resize_buf = NULL; return 0; out_buf: mlx5_free_cq_buf(mctx, cq->resize_buf); cq->resize_buf = NULL; out: mlx5_spin_unlock(&cq->lock); return err; } int mlx5_destroy_cq(struct ibv_cq *cq) { int ret; struct mlx5_cq *mcq = to_mcq(cq); ret = ibv_cmd_destroy_cq(cq); if (ret) return ret; mlx5_free_db(to_mctx(cq->context), mcq->dbrec, mcq->parent_domain, mcq->custom_db); mlx5_free_cq_buf(to_mctx(cq->context), mcq->active_buf); if (mcq->parent_domain) atomic_fetch_sub(&to_mparent_domain(mcq->parent_domain)->mpd.refcount, 1); free(mcq); return 0; } struct ibv_srq 
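/*
 * (A minimal caller sketch, under the usual verbs flow: fill a struct
 * ibv_srq_init_attr with srq_init_attr.attr.max_wr and
 * srq_init_attr.attr.max_sge, call ibv_create_srq(pd, &srq_init_attr),
 * and read srq_init_attr.attr.max_wr back for the depth actually
 * provisioned by the function below.)
 */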
*mlx5_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr) { struct mlx5_create_srq cmd; struct mlx5_create_srq_resp resp; struct mlx5_srq *srq; int ret; struct mlx5_context *ctx; int max_sge; struct ibv_srq *ibsrq; ctx = to_mctx(pd->context); srq = calloc(1, sizeof *srq); if (!srq) { mlx5_err(ctx->dbg_fp, "%s-%d:\n", __func__, __LINE__); return NULL; } ibsrq = &srq->vsrq.srq; memset(&cmd, 0, sizeof cmd); if (mlx5_spinlock_init_pd(&srq->lock, pd)) { mlx5_err(ctx->dbg_fp, "%s-%d:\n", __func__, __LINE__); goto err; } if (attr->attr.max_wr > ctx->max_srq_recv_wr) { mlx5_err(ctx->dbg_fp, "%s-%d:max_wr %d, max_srq_recv_wr %d\n", __func__, __LINE__, attr->attr.max_wr, ctx->max_srq_recv_wr); errno = EINVAL; goto err; } /* * this calculation does not consider required control segments. The * final calculation is done again later. This is done to avoid * integer overflow. */ max_sge = ctx->max_rq_desc_sz / sizeof(struct mlx5_wqe_data_seg); if (attr->attr.max_sge > max_sge) { mlx5_err(ctx->dbg_fp, "%s-%d:max_sge %d, max supported %d\n", __func__, __LINE__, attr->attr.max_sge, max_sge); errno = EINVAL; goto err; } srq->max_gs = attr->attr.max_sge; srq->counter = 0; if (mlx5_alloc_srq_buf(pd->context, srq, attr->attr.max_wr, pd)) { mlx5_err(ctx->dbg_fp, "%s-%d:\n", __func__, __LINE__); goto err; } srq->db = mlx5_alloc_dbrec(to_mctx(pd->context), pd, &srq->custom_db); if (!srq->db) { mlx5_err(ctx->dbg_fp, "%s-%d:\n", __func__, __LINE__); goto err_free; } if (!srq->custom_db) *srq->db = 0; cmd.buf_addr = (uintptr_t) srq->buf.buf; cmd.db_addr = (uintptr_t) srq->db; srq->wq_sig = srq_sig_enabled(); if (srq->wq_sig) cmd.flags = MLX5_SRQ_FLAG_SIGNATURE; attr->attr.max_sge = srq->max_gs; pthread_mutex_lock(&ctx->srq_table_mutex); /* Override max_wr to let kernel know about extra WQEs for the * wait queue. */ attr->attr.max_wr = srq->max - 1; ret = ibv_cmd_create_srq(pd, ibsrq, attr, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (ret) goto err_db; /* Override kernel response that includes the wait queue with the real * number of WQEs that are applicable for the application.
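 * E.g. if the caller asked for N WQEs, the kernel was told srq->max - 1
 * (which already includes the internal wait queue), while the value
 * handed back below is srq->tail, the count actually postable by the
 * application.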
*/ attr->attr.max_wr = srq->tail; ret = mlx5_store_srq(ctx, resp.srqn, srq); if (ret) goto err_destroy; pthread_mutex_unlock(&ctx->srq_table_mutex); srq->srqn = resp.srqn; srq->rsc.rsn = resp.srqn; srq->rsc.type = MLX5_RSC_TYPE_SRQ; return ibsrq; err_destroy: ibv_cmd_destroy_srq(ibsrq); err_db: pthread_mutex_unlock(&ctx->srq_table_mutex); mlx5_free_db(to_mctx(pd->context), srq->db, pd, srq->custom_db); err_free: free(srq->wrid); mlx5_free_actual_buf(ctx, &srq->buf); free(srq->free_wqe_bitmap); err: free(srq); return NULL; } int mlx5_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, int attr_mask) { struct ibv_modify_srq cmd; return ibv_cmd_modify_srq(srq, attr, attr_mask, &cmd, sizeof cmd); } int mlx5_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr) { struct ibv_query_srq cmd; return ibv_cmd_query_srq(srq, attr, &cmd, sizeof cmd); } int mlx5_destroy_srq(struct ibv_srq *srq) { int ret; struct mlx5_srq *msrq = to_msrq(srq); struct mlx5_context *ctx = to_mctx(srq->context); if (msrq->cmd_qp) { ret = mlx5_destroy_qp(msrq->cmd_qp); if (ret) return ret; msrq->cmd_qp = NULL; } ret = ibv_cmd_destroy_srq(srq); if (ret) return ret; if (ctx->cqe_version && msrq->rsc.type == MLX5_RSC_TYPE_XSRQ) mlx5_clear_uidx(ctx, msrq->rsc.rsn); else mlx5_clear_srq(ctx, msrq->srqn); mlx5_free_db(ctx, msrq->db, srq->pd, msrq->custom_db); mlx5_free_actual_buf(ctx, &msrq->buf); free(msrq->tm_list); free(msrq->wrid); free(msrq->op); free(msrq->free_wqe_bitmap); free(msrq); return 0; } static int _sq_overhead(struct mlx5_qp *qp, enum ibv_qp_type qp_type, uint64_t ops, uint64_t mlx5_ops) { size_t size = sizeof(struct mlx5_wqe_ctrl_seg); size_t rdma_size = 0; size_t atomic_size = 0; size_t mw_size = 0; /* Operation overhead */ if (ops & (IBV_QP_EX_WITH_RDMA_WRITE | IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM | IBV_QP_EX_WITH_RDMA_READ)) rdma_size = sizeof(struct mlx5_wqe_ctrl_seg) + sizeof(struct mlx5_wqe_raddr_seg); if (ops & (IBV_QP_EX_WITH_ATOMIC_CMP_AND_SWP | IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD)) atomic_size = sizeof(struct mlx5_wqe_ctrl_seg) + sizeof(struct mlx5_wqe_raddr_seg) + sizeof(struct mlx5_wqe_atomic_seg); if (ops & (IBV_QP_EX_WITH_BIND_MW | IBV_QP_EX_WITH_LOCAL_INV) || (mlx5_ops & (MLX5DV_QP_EX_WITH_MR_INTERLEAVED | MLX5DV_QP_EX_WITH_MR_LIST | MLX5DV_QP_EX_WITH_MKEY_CONFIGURE))) mw_size = sizeof(struct mlx5_wqe_ctrl_seg) + sizeof(struct mlx5_wqe_umr_ctrl_seg) + sizeof(struct mlx5_wqe_mkey_context_seg) + max_t(size_t, sizeof(struct mlx5_wqe_umr_klm_seg), 64); size = max_t(size_t, size, rdma_size); size = max_t(size_t, size, atomic_size); size = max_t(size_t, size, mw_size); /* Transport overhead */ switch (qp_type) { case IBV_QPT_DRIVER: if (qp->dc_type != MLX5DV_DCTYPE_DCI) return -EINVAL; SWITCH_FALLTHROUGH; case IBV_QPT_UD: size += sizeof(struct mlx5_wqe_datagram_seg); if (qp->flags & MLX5_QP_FLAGS_USE_UNDERLAY) size += sizeof(struct mlx5_wqe_eth_seg) + sizeof(struct mlx5_wqe_eth_pad); break; case IBV_QPT_XRC_RECV: case IBV_QPT_XRC_SEND: size += sizeof(struct mlx5_wqe_xrc_seg); break; case IBV_QPT_RAW_PACKET: size += sizeof(struct mlx5_wqe_eth_seg); break; case IBV_QPT_RC: case IBV_QPT_UC: break; default: return -EINVAL; } return size; } static int sq_overhead(struct mlx5_qp *qp, struct ibv_qp_init_attr_ex *attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr) { uint64_t ops; uint64_t mlx5_ops = 0; if (attr->comp_mask & IBV_QP_INIT_ATTR_SEND_OPS_FLAGS) { ops = attr->send_ops_flags; } else { switch (attr->qp_type) { case IBV_QPT_RC: case IBV_QPT_UC: case IBV_QPT_DRIVER: case IBV_QPT_XRC_RECV: case 
IBV_QPT_XRC_SEND: ops = IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_SEND_WITH_INV | IBV_QP_EX_WITH_SEND_WITH_IMM | IBV_QP_EX_WITH_RDMA_WRITE | IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM | IBV_QP_EX_WITH_RDMA_READ | IBV_QP_EX_WITH_ATOMIC_CMP_AND_SWP | IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD | IBV_QP_EX_WITH_LOCAL_INV | IBV_QP_EX_WITH_BIND_MW; break; case IBV_QPT_UD: ops = IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_SEND_WITH_IMM | IBV_QP_EX_WITH_TSO; break; case IBV_QPT_RAW_PACKET: ops = IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_TSO; break; default: return -EINVAL; } } if (mlx5_qp_attr && mlx5_qp_attr->comp_mask & MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS) mlx5_ops = mlx5_qp_attr->send_ops_flags; return _sq_overhead(qp, attr->qp_type, ops, mlx5_ops); } static int mlx5_calc_send_wqe(struct mlx5_context *ctx, struct ibv_qp_init_attr_ex *attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr, struct mlx5_qp *qp) { int size; int inl_size = 0; int max_gather; int tot_size; size = sq_overhead(qp, attr, mlx5_qp_attr); if (size < 0) return size; if (attr->cap.max_inline_data) { inl_size = size + align(sizeof(struct mlx5_wqe_inl_data_seg) + attr->cap.max_inline_data, 16); } if (attr->comp_mask & IBV_QP_INIT_ATTR_MAX_TSO_HEADER) { size += align(attr->max_tso_header, 16); qp->max_tso_header = attr->max_tso_header; } max_gather = (ctx->max_sq_desc_sz - size) / sizeof(struct mlx5_wqe_data_seg); if (attr->cap.max_send_sge > max_gather) return -EINVAL; size += attr->cap.max_send_sge * sizeof(struct mlx5_wqe_data_seg); tot_size = max_int(size, inl_size); if (tot_size > ctx->max_sq_desc_sz) return -EINVAL; return align(tot_size, MLX5_SEND_WQE_BB); } static int mlx5_calc_rcv_wqe(struct mlx5_context *ctx, struct ibv_qp_init_attr_ex *attr, struct mlx5_qp *qp) { uint32_t size; int num_scatter; if (attr->srq) return 0; num_scatter = max_t(uint32_t, attr->cap.max_recv_sge, 1); size = sizeof(struct mlx5_wqe_data_seg) * num_scatter; if (qp->wq_sig) size += sizeof(struct mlx5_rwqe_sig); if (size > ctx->max_rq_desc_sz) return -EINVAL; size = roundup_pow_of_two(size); return size; } static int mlx5_calc_sq_size(struct mlx5_context *ctx, struct ibv_qp_init_attr_ex *attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr, struct mlx5_qp *qp) { int wqe_size; int wq_size; FILE *fp = ctx->dbg_fp; if (!attr->cap.max_send_wr) return 0; wqe_size = mlx5_calc_send_wqe(ctx, attr, mlx5_qp_attr, qp); if (wqe_size < 0) { mlx5_dbg(fp, MLX5_DBG_QP, "\n"); return wqe_size; } if (wqe_size > ctx->max_sq_desc_sz) { mlx5_dbg(fp, MLX5_DBG_QP, "\n"); return -EINVAL; } qp->max_inline_data = wqe_size - sq_overhead(qp, attr, mlx5_qp_attr) - sizeof(struct mlx5_wqe_inl_data_seg); attr->cap.max_inline_data = qp->max_inline_data; /* * to avoid overflow, we limit max_send_wr so * that the multiplication will fit in int */ if (attr->cap.max_send_wr > 0x7fffffff / ctx->max_sq_desc_sz) { mlx5_dbg(fp, MLX5_DBG_QP, "\n"); return -EINVAL; } wq_size = roundup_pow_of_two(attr->cap.max_send_wr * wqe_size); qp->sq.wqe_cnt = wq_size / MLX5_SEND_WQE_BB; if (qp->sq.wqe_cnt > ctx->max_send_wqebb) { mlx5_dbg(fp, MLX5_DBG_QP, "\n"); return -EINVAL; } qp->sq.wqe_shift = STATIC_ILOG_32(MLX5_SEND_WQE_BB) - 1; qp->sq.max_gs = attr->cap.max_send_sge; qp->sq.max_post = wq_size / wqe_size; return wq_size; } enum { DV_CREATE_WQ_SUPPORTED_COMP_MASK = MLX5DV_WQ_INIT_ATTR_MASK_STRIDING_RQ }; static int mlx5_calc_rwq_size(struct mlx5_context *ctx, struct mlx5_rwq *rwq, struct ibv_wq_init_attr *attr, struct mlx5dv_wq_init_attr *mlx5wq_attr) { size_t wqe_size; int wq_size; uint32_t num_scatter; int is_mprq = 0; int scat_spc; 
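	/*
	 * Illustrative sizing (assuming no striding RQ and no WQE
	 * signature): max_sge = 3 gives wqe_size = 3 * 16 = 48, rounded
	 * up to 64 bytes; with max_wr = 100 the WQ becomes
	 * roundup_pow_of_two(100) * 64 = 128 * 64 = 8192 bytes, i.e. 128
	 * WQEs.
	 */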
if (!attr->max_wr) return -EINVAL; if (mlx5wq_attr) { if (!check_comp_mask(mlx5wq_attr->comp_mask, DV_CREATE_WQ_SUPPORTED_COMP_MASK)) return -EINVAL; is_mprq = !!(mlx5wq_attr->comp_mask & MLX5DV_WQ_INIT_ATTR_MASK_STRIDING_RQ); } /* TBD: check caps for RQ */ num_scatter = max_t(uint32_t, attr->max_sge, 1); wqe_size = sizeof(struct mlx5_wqe_data_seg) * num_scatter + sizeof(struct mlx5_wqe_srq_next_seg) * is_mprq; if (rwq->wq_sig) wqe_size += sizeof(struct mlx5_rwqe_sig); if (wqe_size <= 0 || wqe_size > ctx->max_rq_desc_sz) return -EINVAL; wqe_size = roundup_pow_of_two(wqe_size); wq_size = roundup_pow_of_two(attr->max_wr) * wqe_size; wq_size = max(wq_size, MLX5_SEND_WQE_BB); rwq->rq.wqe_cnt = wq_size / wqe_size; rwq->rq.wqe_shift = ilog32(wqe_size - 1); rwq->rq.max_post = 1 << ilog32(wq_size / wqe_size - 1); scat_spc = wqe_size - ((rwq->wq_sig) ? sizeof(struct mlx5_rwqe_sig) : 0) - is_mprq * sizeof(struct mlx5_wqe_srq_next_seg); rwq->rq.max_gs = scat_spc / sizeof(struct mlx5_wqe_data_seg); return wq_size; } static int mlx5_get_max_recv_wr(struct mlx5_context *ctx, struct ibv_qp_init_attr_ex *attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr, uint32_t *max_recv_wr) { if (mlx5_qp_attr && (mlx5_qp_attr->create_flags & MLX5DV_QP_CREATE_OOO_DP) && attr->cap.max_recv_wr > 1) { uint32_t max_recv_wr_cap = 0; /* OOO-enabled cyclic buffers require double the user requested size. */ switch (attr->qp_type) { case IBV_QPT_RC: max_recv_wr_cap = ctx->ooo_recv_wrs_caps.max_rc; break; case IBV_QPT_UC: max_recv_wr_cap = ctx->ooo_recv_wrs_caps.max_uc; break; case IBV_QPT_UD: max_recv_wr_cap = ctx->ooo_recv_wrs_caps.max_ud; break; default: break; } if (max_recv_wr_cap) { if (attr->cap.max_recv_wr > max_recv_wr_cap) goto inval_max_wr; *max_recv_wr = attr->cap.max_recv_wr << 1; return 0; } } if (attr->cap.max_recv_wr > ctx->max_recv_wr) goto inval_max_wr; *max_recv_wr = attr->cap.max_recv_wr; return 0; inval_max_wr: mlx5_dbg(ctx->dbg_fp, MLX5_DBG_QP, "Invalid max_recv_wr value\n"); return -EINVAL; } static int mlx5_calc_rq_size(struct mlx5_context *ctx, struct ibv_qp_init_attr_ex *attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr, struct mlx5_qp *qp) { int wqe_size; int wq_size; int scat_spc; int ret; uint32_t max_recv_wr; FILE *fp = ctx->dbg_fp; if (!attr->cap.max_recv_wr) return 0; ret = mlx5_get_max_recv_wr(ctx, attr, mlx5_qp_attr, &max_recv_wr); if (ret < 0) return ret; wqe_size = mlx5_calc_rcv_wqe(ctx, attr, qp); if (wqe_size < 0 || wqe_size > ctx->max_rq_desc_sz) { mlx5_dbg(fp, MLX5_DBG_QP, "\n"); return -EINVAL; } wq_size = roundup_pow_of_two(max_recv_wr) * wqe_size; if (wqe_size) { wq_size = max(wq_size, MLX5_SEND_WQE_BB); qp->rq.wqe_cnt = wq_size / wqe_size; qp->rq.wqe_shift = ilog32(wqe_size - 1); /* If cyclic OOO RQ buffer is used, the max_posts a user can do * is half the internally allocated size (wqe_cnt). * This prevents overwriting non-consumed WQEs during the * execution of mlx5_post_recv(), enforced by mlx5_wq_overflow(). */ if (max_recv_wr != attr->cap.max_recv_wr) { /* OOO is applicable only when max_recv_wr and wqe_cnt are larger than 1 */ assert(qp->rq.wqe_cnt > 1); qp->rq.max_post = 1 << (ilog32(qp->rq.wqe_cnt - 1) - 1); } else { qp->rq.max_post = 1 << ilog32(qp->rq.wqe_cnt - 1); } scat_spc = wqe_size - (qp->wq_sig ? 
sizeof(struct mlx5_rwqe_sig) : 0); qp->rq.max_gs = scat_spc / sizeof(struct mlx5_wqe_data_seg); } else { qp->rq.wqe_cnt = 0; qp->rq.wqe_shift = 0; qp->rq.max_post = 0; qp->rq.max_gs = 0; } return wq_size; } static int mlx5_calc_wq_size(struct mlx5_context *ctx, struct ibv_qp_init_attr_ex *attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr, struct mlx5_qp *qp) { int ret; int result; ret = mlx5_calc_sq_size(ctx, attr, mlx5_qp_attr, qp); if (ret < 0) return ret; result = ret; ret = mlx5_calc_rq_size(ctx, attr, mlx5_qp_attr, qp); if (ret < 0) return ret; result += ret; qp->sq.offset = ret; qp->rq.offset = 0; return result; } static void map_uuar(struct ibv_context *context, struct mlx5_qp *qp, int uuar_index, struct mlx5_bf *dyn_bf) { struct mlx5_context *ctx = to_mctx(context); if (!dyn_bf) qp->bf = &ctx->bfs[uuar_index]; else qp->bf = dyn_bf; } static const char *qptype2key(enum ibv_qp_type type) { switch (type) { case IBV_QPT_RC: return "HUGE_RC"; case IBV_QPT_UC: return "HUGE_UC"; case IBV_QPT_UD: return "HUGE_UD"; case IBV_QPT_RAW_PACKET: return "HUGE_RAW_ETH"; default: return "HUGE_NA"; } } static size_t mlx5_set_custom_qp_alignment(struct ibv_context *context, struct mlx5_qp *qp) { uint32_t max_stride; uint32_t buf_page; /* The main QP buffer alignment requirement is QP_PAGE_SIZE / * MLX5_QPC_PAGE_OFFSET_QUANTA. If the buffer is contiguous, then * QP_PAGE_SIZE is the buffer size aligned to the system page_size and * rounded up to the next power of two. */ buf_page = roundup_pow_of_two(align(qp->buf_size, to_mdev(context->device)->page_size)); /* Another QP buffer alignment requirement is to consider send wqe and * receive wqe strides. */ max_stride = max((1 << qp->sq.wqe_shift), (1 << qp->rq.wqe_shift)); return max(max_stride, buf_page / MLX5_QPC_PAGE_OFFSET_QUANTA); } static int mlx5_alloc_qp_buf(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr, struct mlx5_qp *qp, int size) { int err; enum mlx5_alloc_type alloc_type; enum mlx5_alloc_type default_alloc_type = MLX5_ALLOC_TYPE_ANON; const char *qp_huge_key; size_t req_align = to_mdev(context->device)->page_size; if (qp->sq.wqe_cnt) { qp->sq.wrid = malloc(qp->sq.wqe_cnt * sizeof(*qp->sq.wrid)); if (!qp->sq.wrid) { errno = ENOMEM; err = -1; return err; } qp->sq.wr_data = malloc(qp->sq.wqe_cnt * sizeof(*qp->sq.wr_data)); if (!qp->sq.wr_data) { errno = ENOMEM; err = -1; goto ex_wrid; } qp->sq.wqe_head = malloc(qp->sq.wqe_cnt * sizeof(*qp->sq.wqe_head)); if (!qp->sq.wqe_head) { errno = ENOMEM; err = -1; goto ex_wrid; } } if (qp->rq.wqe_cnt) { qp->rq.wrid = malloc(qp->rq.wqe_cnt * sizeof(uint64_t)); if (!qp->rq.wrid) { errno = ENOMEM; err = -1; goto ex_wrid; } } /* compatibility support */ qp_huge_key = qptype2key(attr->qp_type); if (mlx5_use_huge(qp_huge_key)) default_alloc_type = MLX5_ALLOC_TYPE_HUGE; mlx5_get_alloc_type(to_mctx(context), attr->pd, MLX5_QP_PREFIX, &alloc_type, default_alloc_type); if (alloc_type == MLX5_ALLOC_TYPE_CUSTOM) { qp->buf.mparent_domain = to_mparent_domain(attr->pd); if (attr->qp_type != IBV_QPT_RAW_PACKET && !(qp->flags & MLX5_QP_FLAGS_USE_UNDERLAY)) req_align = mlx5_set_custom_qp_alignment(context, qp); qp->buf.req_alignment = req_align; qp->buf.resource_type = MLX5DV_RES_TYPE_QP; } err = mlx5_alloc_prefered_buf(to_mctx(context), &qp->buf, align(qp->buf_size, req_align), to_mdev(context->device)->page_size, alloc_type, MLX5_QP_PREFIX); if (err) { err = -ENOMEM; goto ex_wrid; } if (qp->buf.type != MLX5_ALLOC_TYPE_CUSTOM) memset(qp->buf.buf, 0, qp->buf_size); if (attr->qp_type == IBV_QPT_RAW_PACKET || qp->flags &
MLX5_QP_FLAGS_USE_UNDERLAY) { size_t aligned_sq_buf_size = align(qp->sq_buf_size, to_mdev(context->device)->page_size); if (alloc_type == MLX5_ALLOC_TYPE_CUSTOM) { qp->sq_buf.mparent_domain = to_mparent_domain(attr->pd); qp->sq_buf.req_alignment = to_mdev(context->device)->page_size; qp->sq_buf.resource_type = MLX5DV_RES_TYPE_QP; } /* For Raw Packet QP, allocate a separate buffer for the SQ */ err = mlx5_alloc_prefered_buf(to_mctx(context), &qp->sq_buf, aligned_sq_buf_size, to_mdev(context->device)->page_size, alloc_type, MLX5_QP_PREFIX); if (err) { err = -ENOMEM; goto rq_buf; } if (qp->sq_buf.type != MLX5_ALLOC_TYPE_CUSTOM) memset(qp->sq_buf.buf, 0, aligned_sq_buf_size); } return 0; rq_buf: mlx5_free_actual_buf(to_mctx(context), &qp->buf); ex_wrid: if (qp->rq.wrid) free(qp->rq.wrid); if (qp->sq.wqe_head) free(qp->sq.wqe_head); if (qp->sq.wr_data) free(qp->sq.wr_data); if (qp->sq.wrid) free(qp->sq.wrid); return err; } static void mlx5_free_qp_buf(struct mlx5_context *ctx, struct mlx5_qp *qp) { mlx5_free_actual_buf(ctx, &qp->buf); if (qp->sq_buf.buf) mlx5_free_actual_buf(ctx, &qp->sq_buf); if (qp->rq.wrid) free(qp->rq.wrid); if (qp->sq.wqe_head) free(qp->sq.wqe_head); if (qp->sq.wrid) free(qp->sq.wrid); if (qp->sq.wr_data) free(qp->sq.wr_data); } int mlx5_set_ece(struct ibv_qp *qp, struct ibv_ece *ece) { struct mlx5_context *context = to_mctx(qp->context); struct mlx5_qp *mqp = to_mqp(qp); if (ece->comp_mask) { errno = EINVAL; return errno; } if (ece->vendor_id != PCI_VENDOR_ID_MELLANOX) { errno = EINVAL; return errno; } if (!(context->flags & MLX5_CTX_FLAGS_ECE_SUPPORTED)) { errno = EOPNOTSUPP; return errno; } mqp->set_ece = ece->options; /* Clean previously returned ECE options */ mqp->get_ece = 0; return 0; } int mlx5_query_ece(struct ibv_qp *qp, struct ibv_ece *ece) { struct mlx5_qp *mqp = to_mqp(qp); ece->vendor_id = PCI_VENDOR_ID_MELLANOX; ece->options = mqp->get_ece; ece->comp_mask = 0; return 0; } static int mlx5_cmd_create_rss_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr, struct mlx5_qp *qp, uint32_t mlx5_create_flags) { struct mlx5_create_qp_ex_rss cmd_ex_rss = {}; struct mlx5_create_qp_ex_resp resp = {}; struct mlx5_ib_create_qp_resp *resp_drv; int ret; if (attr->rx_hash_conf.rx_hash_key_len > sizeof(cmd_ex_rss.rx_hash_key)) { errno = EINVAL; return errno; } cmd_ex_rss.rx_hash_fields_mask = attr->rx_hash_conf.rx_hash_fields_mask; cmd_ex_rss.rx_hash_function = attr->rx_hash_conf.rx_hash_function; cmd_ex_rss.rx_key_len = attr->rx_hash_conf.rx_hash_key_len; cmd_ex_rss.flags = mlx5_create_flags; memcpy(cmd_ex_rss.rx_hash_key, attr->rx_hash_conf.rx_hash_key, attr->rx_hash_conf.rx_hash_key_len); ret = ibv_cmd_create_qp_ex2(context, &qp->verbs_qp, attr, &cmd_ex_rss.ibv_cmd, sizeof(cmd_ex_rss), &resp.ibv_resp, sizeof(resp)); if (ret) return ret; resp_drv = &resp.drv_payload; if (resp_drv->comp_mask & MLX5_IB_CREATE_QP_RESP_MASK_TIRN) qp->tirn = resp_drv->tirn; if (resp_drv->comp_mask & MLX5_IB_CREATE_QP_RESP_MASK_TIR_ICM_ADDR) qp->tir_icm_addr = resp_drv->tir_icm_addr; qp->rss_qp = 1; return 0; } static int mlx5_cmd_create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr, struct mlx5_create_qp *cmd, struct mlx5_qp *qp, struct mlx5_create_qp_ex_resp *resp) { struct mlx5_create_qp_ex cmd_ex; int ret; memset(&cmd_ex, 0, sizeof(cmd_ex)); *ibv_create_qp_ex_to_reg(&cmd_ex.ibv_cmd) = cmd->ibv_cmd.core_payload; cmd_ex.drv_payload = cmd->drv_payload; ret = ibv_cmd_create_qp_ex2(context, &qp->verbs_qp, attr, &cmd_ex.ibv_cmd, sizeof(cmd_ex), &resp->ibv_resp, 
sizeof(*resp)); return ret; } enum { MLX5_CREATE_QP_SUP_COMP_MASK = (IBV_QP_INIT_ATTR_PD | IBV_QP_INIT_ATTR_XRCD | IBV_QP_INIT_ATTR_CREATE_FLAGS | IBV_QP_INIT_ATTR_MAX_TSO_HEADER | IBV_QP_INIT_ATTR_IND_TABLE | IBV_QP_INIT_ATTR_RX_HASH | IBV_QP_INIT_ATTR_SEND_OPS_FLAGS), }; enum { MLX5_DV_CREATE_QP_SUP_COMP_MASK = MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS | MLX5DV_QP_INIT_ATTR_MASK_DC | MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS | MLX5DV_QP_INIT_ATTR_MASK_DCI_STREAMS }; enum { MLX5_CREATE_QP_EX2_COMP_MASK = (IBV_QP_INIT_ATTR_CREATE_FLAGS | IBV_QP_INIT_ATTR_MAX_TSO_HEADER | IBV_QP_INIT_ATTR_IND_TABLE | IBV_QP_INIT_ATTR_RX_HASH), }; enum { MLX5DV_QP_CREATE_SUP_FLAGS = (MLX5DV_QP_CREATE_TUNNEL_OFFLOADS | MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_UC | MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_MC | MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE | MLX5DV_QP_CREATE_ALLOW_SCATTER_TO_CQE | MLX5DV_QP_CREATE_PACKET_BASED_CREDIT_MODE | MLX5DV_QP_CREATE_SIG_PIPELINING | MLX5DV_QP_CREATE_OOO_DP), }; static int create_dct(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr, struct mlx5_qp *qp, uint32_t mlx5_create_flags) { struct mlx5_create_qp cmd = {}; struct mlx5_create_qp_resp resp = {}; int ret; struct mlx5_context *ctx = to_mctx(context); int32_t usr_idx = 0xffffff; FILE *fp = ctx->dbg_fp; if (!check_comp_mask(attr->comp_mask, IBV_QP_INIT_ATTR_PD)) { mlx5_dbg(fp, MLX5_DBG_QP, "Unsupported comp_mask for %s\n", __func__); errno = EINVAL; return errno; } if (!check_comp_mask(mlx5_qp_attr->comp_mask, MLX5DV_QP_INIT_ATTR_MASK_DC | MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS)) { mlx5_dbg(fp, MLX5_DBG_QP, "Unsupported vendor comp_mask for %s\n", __func__); errno = EINVAL; return errno; } if (!check_comp_mask(mlx5_create_flags, MLX5_QP_FLAG_SCATTER_CQE)) { mlx5_dbg(fp, MLX5_DBG_QP, "Unsupported creation flags requested for DCT QP\n"); errno = EINVAL; return errno; } if (!(ctx->vendor_cap_flags & MLX5_VENDOR_CAP_FLAGS_SCAT2CQE_DCT)) mlx5_create_flags &= ~MLX5_QP_FLAG_SCATTER_CQE; cmd.flags = MLX5_QP_FLAG_TYPE_DCT | mlx5_create_flags; cmd.access_key = mlx5_qp_attr->dc_init_attr.dct_access_key; if (ctx->cqe_version) { usr_idx = mlx5_store_uidx(ctx, qp); if (usr_idx < 0) { mlx5_dbg(fp, MLX5_DBG_QP, "Couldn't find free user index\n"); errno = ENOMEM; return errno; } } cmd.uidx = usr_idx; if (ctx->flags & MLX5_CTX_FLAGS_ECE_SUPPORTED) /* Create QP should start from ECE version 1 as a trigger */ cmd.ece_options = 0x10000000; ret = ibv_cmd_create_qp_ex(context, &qp->verbs_qp, attr, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (ret) { mlx5_dbg(fp, MLX5_DBG_QP, "Couldn't create dct, ret %d\n", ret); if (ctx->cqe_version) mlx5_clear_uidx(ctx, cmd.uidx); return ret; } qp->get_ece = resp.ece_options; qp->dc_type = MLX5DV_DCTYPE_DCT; qp->rsc.type = MLX5_RSC_TYPE_QP; if (ctx->cqe_version) qp->rsc.rsn = usr_idx; return 0; } #define MLX5_OPAQUE_BUF_LEN 64 static int reg_opaque_mr(struct ibv_pd *pd) { struct mlx5_pd *mpd = to_mpd(pd); int ret = 0; pthread_mutex_lock(&mpd->opaque_mr_mutex); if (mpd->opaque_mr) goto out; ret = posix_memalign(&mpd->opaque_buf, MLX5_OPAQUE_BUF_LEN, MLX5_OPAQUE_BUF_LEN); if (ret) { errno = ret; goto out; } mpd->opaque_mr = mlx5_reg_mr(&mpd->ibv_pd, mpd->opaque_buf, MLX5_OPAQUE_BUF_LEN, (uint64_t)(uintptr_t)mpd->opaque_buf, IBV_ACCESS_LOCAL_WRITE); if (!mpd->opaque_mr) { ret = errno; free(mpd->opaque_buf); mpd->opaque_buf = NULL; } out: pthread_mutex_unlock(&mpd->opaque_mr_mutex); return ret; } static int qp_init_wr_memcpy(struct mlx5_qp *mqp, 
struct ibv_qp_init_attr_ex *attr, struct mlx5dv_qp_init_attr *mlx5_attr) { struct mlx5_context *mctx; if (!(attr->comp_mask & IBV_QP_INIT_ATTR_PD)) { errno = EINVAL; return errno; } mctx = to_mctx(attr->pd->context); if (!mctx->dma_mmo_caps.dma_mmo_sq && !mctx->dma_mmo_caps.dma_mmo_qp) { errno = EOPNOTSUPP; return errno; } if (mctx->dma_mmo_caps.dma_mmo_qp) mqp->need_mmo_enable = 1; return reg_opaque_mr(attr->pd); } static void set_qp_operational_state(struct mlx5_qp *qp, enum ibv_qp_state state) { switch (state) { case IBV_QPS_RESET: mlx5_qp_fill_wr_complete_error(qp); qp->rq.qp_state_max_gs = -1; qp->sq.qp_state_max_gs = -1; break; case IBV_QPS_INIT: qp->rq.qp_state_max_gs = qp->rq.max_gs; break; case IBV_QPS_RTS: qp->sq.qp_state_max_gs = qp->sq.max_gs; mlx5_qp_fill_wr_complete_real(qp); break; default: break; } } static int is_qpt_ooo_sup(struct mlx5_context *ctx, struct ibv_qp_init_attr_ex *attr) { if (!(ctx->vendor_cap_flags & MLX5_VENDOR_CAP_FLAGS_OOO_DP)) return 0; switch (attr->qp_type) { case IBV_QPT_DRIVER: return ctx->ooo_recv_wrs_caps.max_dct; case IBV_QPT_RC: return ctx->ooo_recv_wrs_caps.max_rc; case IBV_QPT_XRC_SEND: case IBV_QPT_XRC_RECV: return ctx->ooo_recv_wrs_caps.max_xrc; case IBV_QPT_UC: return ctx->ooo_recv_wrs_caps.max_uc; case IBV_QPT_UD: return ctx->ooo_recv_wrs_caps.max_ud; default: return 0; } } static struct ibv_qp *create_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr) { struct mlx5_create_qp cmd; struct mlx5_create_qp_resp resp; struct mlx5_create_qp_ex_resp resp_ex; struct mlx5_qp *qp; int ret; struct mlx5_context *ctx = to_mctx(context); struct ibv_qp *ibqp; int32_t usr_idx = 0; uint32_t mlx5_create_flags = 0; struct mlx5_bf *bf = NULL; FILE *fp = ctx->dbg_fp; struct mlx5_parent_domain *mparent_domain; struct mlx5_ib_create_qp_resp *resp_drv; if (attr->comp_mask & ~MLX5_CREATE_QP_SUP_COMP_MASK) return NULL; if ((attr->comp_mask & IBV_QP_INIT_ATTR_MAX_TSO_HEADER) && (attr->qp_type != IBV_QPT_RAW_PACKET)) return NULL; if (attr->comp_mask & IBV_QP_INIT_ATTR_SEND_OPS_FLAGS && (attr->comp_mask & IBV_QP_INIT_ATTR_RX_HASH || (attr->qp_type == IBV_QPT_DRIVER && mlx5_qp_attr && mlx5_qp_attr->comp_mask & MLX5DV_QP_INIT_ATTR_MASK_DC && mlx5_qp_attr->dc_init_attr.dc_type == MLX5DV_DCTYPE_DCT))) { errno = EINVAL; return NULL; } qp = calloc(1, sizeof(*qp)); if (!qp) { mlx5_dbg(fp, MLX5_DBG_QP, "\n"); return NULL; } ibqp = &qp->verbs_qp.qp; qp->ibv_qp = ibqp; if ((attr->comp_mask & IBV_QP_INIT_ATTR_CREATE_FLAGS) && (attr->create_flags & IBV_QP_CREATE_SOURCE_QPN)) { if (attr->qp_type != IBV_QPT_UD) { errno = EINVAL; goto err; } qp->flags |= MLX5_QP_FLAGS_USE_UNDERLAY; } memset(&cmd, 0, sizeof(cmd)); memset(&resp, 0, sizeof(resp)); memset(&resp_ex, 0, sizeof(resp_ex)); if (use_scatter_to_cqe()) mlx5_create_flags |= MLX5_QP_FLAG_SCATTER_CQE; if (mlx5_qp_attr) { if (!check_comp_mask(mlx5_qp_attr->comp_mask, MLX5_DV_CREATE_QP_SUP_COMP_MASK)) { mlx5_dbg(fp, MLX5_DBG_QP, "Unsupported vendor comp_mask for create_qp\n"); errno = EINVAL; goto err; } if ((mlx5_qp_attr->comp_mask & MLX5DV_QP_INIT_ATTR_MASK_DC) && (attr->qp_type != IBV_QPT_DRIVER)) { mlx5_dbg(fp, MLX5_DBG_QP, "DC QP must be of type IBV_QPT_DRIVER\n"); errno = EINVAL; goto err; } if (mlx5_qp_attr->comp_mask & MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS) { if (!check_comp_mask(mlx5_qp_attr->create_flags, MLX5DV_QP_CREATE_SUP_FLAGS)) { mlx5_dbg(fp, MLX5_DBG_QP, "Unsupported creation flags requested for create_qp\n"); errno = EINVAL; goto err; } if 
(mlx5_qp_attr->create_flags & MLX5DV_QP_CREATE_TUNNEL_OFFLOADS) { mlx5_create_flags |= MLX5_QP_FLAG_TUNNEL_OFFLOADS; } if (mlx5_qp_attr->create_flags & MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_UC) { mlx5_create_flags |= MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_UC; } if (mlx5_qp_attr->create_flags & MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_MC) { mlx5_create_flags |= MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_MC; } if (mlx5_qp_attr->create_flags & MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE) { if (mlx5_qp_attr->create_flags & MLX5DV_QP_CREATE_ALLOW_SCATTER_TO_CQE) { mlx5_dbg(fp, MLX5_DBG_QP, "Wrong usage of creation flags requested for create_qp\n"); errno = EINVAL; goto err; } mlx5_create_flags &= ~MLX5_QP_FLAG_SCATTER_CQE; } if (mlx5_qp_attr->create_flags & MLX5DV_QP_CREATE_ALLOW_SCATTER_TO_CQE) { mlx5_create_flags |= (MLX5_QP_FLAG_ALLOW_SCATTER_CQE | MLX5_QP_FLAG_SCATTER_CQE); } if (mlx5_qp_attr->create_flags & MLX5DV_QP_CREATE_PACKET_BASED_CREDIT_MODE) mlx5_create_flags |= MLX5_QP_FLAG_PACKET_BASED_CREDIT_MODE; if (mlx5_qp_attr->create_flags & MLX5DV_QP_CREATE_SIG_PIPELINING) { if (!(to_mctx(context)->flags & MLX5_CTX_FLAGS_SQD2RTS_SUPPORTED)) { errno = EOPNOTSUPP; goto err; } qp->flags |= MLX5_QP_FLAGS_DRAIN_SIGERR; } if (mlx5_qp_attr->create_flags & MLX5DV_QP_CREATE_OOO_DP) { if (!is_qpt_ooo_sup(ctx, attr)) { errno = EOPNOTSUPP; goto err; } qp->flags |= MLX5_QP_FLAGS_OOO_DP; } } if (attr->qp_type == IBV_QPT_DRIVER) { if (mlx5_qp_attr->comp_mask & MLX5DV_QP_INIT_ATTR_MASK_DC) { if (mlx5_qp_attr->dc_init_attr.dc_type == MLX5DV_DCTYPE_DCT) { ret = create_dct(context, attr, mlx5_qp_attr, qp, mlx5_create_flags); if (ret) goto err; return ibqp; } else if (mlx5_qp_attr->dc_init_attr.dc_type == MLX5DV_DCTYPE_DCI) { mlx5_create_flags |= MLX5_QP_FLAG_TYPE_DCI; qp->dc_type = MLX5DV_DCTYPE_DCI; if (mlx5_qp_attr->comp_mask & MLX5DV_QP_INIT_ATTR_MASK_DCI_STREAMS) { if ((ctx->dci_streams_caps.max_log_num_concurent < mlx5_qp_attr->dc_init_attr.dci_streams.log_num_concurent) || (ctx->dci_streams_caps.max_log_num_errored < mlx5_qp_attr->dc_init_attr.dci_streams.log_num_errored)) { errno = EINVAL; goto err; } mlx5_create_flags |= MLX5_QP_FLAG_DCI_STREAM; cmd.dci_streams.log_num_concurent = mlx5_qp_attr->dc_init_attr.dci_streams.log_num_concurent; cmd.dci_streams.log_num_errored = mlx5_qp_attr->dc_init_attr.dci_streams.log_num_errored; } } else { errno = EINVAL; goto err; } } else { errno = EINVAL; goto err; } } } else { if (attr->qp_type == IBV_QPT_DRIVER) goto err; } if (attr->comp_mask & IBV_QP_INIT_ATTR_RX_HASH) { /* Scatter2CQE is unsupported for RSS QP */ mlx5_create_flags &= ~MLX5_QP_FLAG_SCATTER_CQE; ret = mlx5_cmd_create_rss_qp(context, attr, qp, mlx5_create_flags); if (ret) goto err; return ibqp; } if (ctx->atomic_cap) qp->atomics_enabled = 1; if (attr->comp_mask & IBV_QP_INIT_ATTR_SEND_OPS_FLAGS || (mlx5_qp_attr && mlx5_qp_attr->comp_mask & MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS)) { /* * Scatter2cqe, which is a data-path optimization, is disabled * since driver DC data-path doesn't support it. 
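 * (The MLX5_QP_FLAG_SCATTER_CQE bit prepared earlier is therefore
 * cleared just below whenever MLX5DV_QP_INIT_ATTR_MASK_DC was set.)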
*/ if (mlx5_qp_attr && mlx5_qp_attr->comp_mask & MLX5DV_QP_INIT_ATTR_MASK_DC) { mlx5_create_flags &= ~MLX5_QP_FLAG_SCATTER_CQE; } ret = mlx5_qp_fill_wr_pfns(qp, attr, mlx5_qp_attr); if (ret) { errno = ret; mlx5_dbg(fp, MLX5_DBG_QP, "Failed to handle operations flags (errno %d)\n", errno); goto err; } if (mlx5_qp_attr && (mlx5_qp_attr->comp_mask & MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS) && (mlx5_qp_attr->send_ops_flags & MLX5DV_QP_EX_WITH_MEMCPY)) { ret = qp_init_wr_memcpy(qp, attr, mlx5_qp_attr); if (ret) goto err; } } cmd.flags = mlx5_create_flags; qp->wq_sig = qp_sig_enabled(); if (qp->wq_sig) cmd.flags |= MLX5_QP_FLAG_SIGNATURE; ret = mlx5_calc_wq_size(ctx, attr, mlx5_qp_attr, qp); if (ret < 0) { errno = -ret; goto err; } if (attr->qp_type == IBV_QPT_RAW_PACKET || qp->flags & MLX5_QP_FLAGS_USE_UNDERLAY) { qp->buf_size = qp->sq.offset; qp->sq_buf_size = ret - qp->buf_size; qp->sq.offset = 0; } else { qp->buf_size = ret; qp->sq_buf_size = 0; } if (mlx5_alloc_qp_buf(context, attr, qp, ret)) { mlx5_dbg(fp, MLX5_DBG_QP, "\n"); goto err; } if (attr->qp_type == IBV_QPT_RAW_PACKET || qp->flags & MLX5_QP_FLAGS_USE_UNDERLAY) { qp->sq_start = qp->sq_buf.buf; qp->sq.qend = qp->sq_buf.buf + (qp->sq.wqe_cnt << qp->sq.wqe_shift); } else { qp->sq_start = qp->buf.buf + qp->sq.offset; qp->sq.qend = qp->buf.buf + qp->sq.offset + (qp->sq.wqe_cnt << qp->sq.wqe_shift); } mlx5_init_qp_indices(qp); if (mlx5_spinlock_init_pd(&qp->sq.lock, attr->pd) || mlx5_spinlock_init_pd(&qp->rq.lock, attr->pd)) goto err_free_qp_buf; qp->db = mlx5_alloc_dbrec(ctx, attr->pd, &qp->custom_db); if (!qp->db) { mlx5_dbg(fp, MLX5_DBG_QP, "\n"); goto err_free_qp_buf; } if (!qp->custom_db) { qp->db[MLX5_RCV_DBR] = 0; qp->db[MLX5_SND_DBR] = 0; } cmd.buf_addr = (uintptr_t) qp->buf.buf; cmd.sq_buf_addr = (attr->qp_type == IBV_QPT_RAW_PACKET || qp->flags & MLX5_QP_FLAGS_USE_UNDERLAY) ? (uintptr_t) qp->sq_buf.buf : 0; cmd.db_addr = (uintptr_t) qp->db; cmd.sq_wqe_count = qp->sq.wqe_cnt; cmd.rq_wqe_count = qp->rq.wqe_cnt; cmd.rq_wqe_shift = qp->rq.wqe_shift; if (!ctx->cqe_version) { cmd.uidx = 0xffffff; pthread_mutex_lock(&ctx->qp_table_mutex); } else if (!is_xrc_tgt(attr->qp_type)) { usr_idx = mlx5_store_uidx(ctx, qp); if (usr_idx < 0) { mlx5_dbg(fp, MLX5_DBG_QP, "Couldn't find free user index\n"); goto err_rq_db; } cmd.uidx = usr_idx; } mparent_domain = to_mparent_domain(attr->pd); if (mparent_domain && mparent_domain->mtd) bf = mparent_domain->mtd->bf; if (!bf && !(ctx->flags & MLX5_CTX_FLAGS_NO_KERN_DYN_UAR)) { bf = mlx5_get_qp_uar(context); if (!bf) goto err_free_uidx; } if (bf) { if (bf->dyn_alloc_uar) { cmd.bfreg_index = bf->page_id; cmd.flags |= MLX5_QP_FLAG_UAR_PAGE_INDEX; } else { cmd.bfreg_index = bf->bfreg_dyn_index; cmd.flags |= MLX5_QP_FLAG_BFREG_INDEX; } } if (ctx->flags & MLX5_CTX_FLAGS_ECE_SUPPORTED) /* Create QP should start from ECE version 1 as a trigger */ cmd.ece_options = 0x10000000; if (attr->comp_mask & MLX5_CREATE_QP_EX2_COMP_MASK) ret = mlx5_cmd_create_qp_ex(context, attr, &cmd, qp, &resp_ex); else ret = ibv_cmd_create_qp_ex(context, &qp->verbs_qp, attr, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (ret) { mlx5_dbg(fp, MLX5_DBG_QP, "ret %d\n", ret); goto err_free_uidx; } resp_drv = attr->comp_mask & MLX5_CREATE_QP_EX2_COMP_MASK ? 
&resp_ex.drv_payload : &resp.drv_payload; if (!ctx->cqe_version) { if (qp->sq.wqe_cnt || qp->rq.wqe_cnt) { ret = mlx5_store_qp(ctx, ibqp->qp_num, qp); if (ret) { mlx5_dbg(fp, MLX5_DBG_QP, "ret %d\n", ret); goto err_destroy; } } pthread_mutex_unlock(&ctx->qp_table_mutex); } qp->get_ece = resp_drv->ece_options; map_uuar(context, qp, resp_drv->bfreg_index, bf); if (attr->sq_sig_all) qp->sq_signal_bits = MLX5_WQE_CTRL_CQ_UPDATE; else qp->sq_signal_bits = 0; attr->cap.max_send_wr = qp->sq.max_post; attr->cap.max_recv_wr = qp->rq.max_post; attr->cap.max_recv_sge = qp->rq.max_gs; qp->rsc.type = MLX5_RSC_TYPE_QP; qp->rsc.rsn = (ctx->cqe_version && !is_xrc_tgt(attr->qp_type)) ? usr_idx : ibqp->qp_num; if (mparent_domain) atomic_fetch_add(&mparent_domain->mpd.refcount, 1); if (resp_drv->comp_mask & MLX5_IB_CREATE_QP_RESP_MASK_TIRN) qp->tirn = resp_drv->tirn; if (resp_drv->comp_mask & MLX5_IB_CREATE_QP_RESP_MASK_TISN) qp->tisn = resp_drv->tisn; if (resp_drv->comp_mask & MLX5_IB_CREATE_QP_RESP_MASK_RQN) qp->rqn = resp_drv->rqn; if (resp_drv->comp_mask & MLX5_IB_CREATE_QP_RESP_MASK_SQN) qp->sqn = resp_drv->sqn; if (resp_drv->comp_mask & MLX5_IB_CREATE_QP_RESP_MASK_TIR_ICM_ADDR) qp->tir_icm_addr = resp_drv->tir_icm_addr; if (attr->comp_mask & IBV_QP_INIT_ATTR_SEND_OPS_FLAGS) qp->verbs_qp.comp_mask |= VERBS_QP_EX; set_qp_operational_state(qp, IBV_QPS_RESET); return ibqp; err_destroy: ibv_cmd_destroy_qp(ibqp); err_free_uidx: if (bf) mlx5_put_qp_uar(ctx, bf); if (!ctx->cqe_version) pthread_mutex_unlock(&to_mctx(context)->qp_table_mutex); else if (!is_xrc_tgt(attr->qp_type)) mlx5_clear_uidx(ctx, usr_idx); err_rq_db: mlx5_free_db(to_mctx(context), qp->db, attr->pd, qp->custom_db); err_free_qp_buf: mlx5_free_qp_buf(ctx, qp); err: free(qp); return NULL; } struct ibv_qp *mlx5_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct ibv_qp *qp; struct ibv_qp_init_attr_ex attrx; memset(&attrx, 0, sizeof(attrx)); memcpy(&attrx, attr, sizeof(*attr)); attrx.comp_mask = IBV_QP_INIT_ATTR_PD; attrx.pd = pd; qp = create_qp(pd->context, &attrx, NULL); if (qp) memcpy(attr, &attrx, sizeof(*attr)); return qp; } static void mlx5_lock_cqs(struct ibv_qp *qp) { struct mlx5_cq *send_cq = to_mcq(qp->send_cq); struct mlx5_cq *recv_cq = to_mcq(qp->recv_cq); if (send_cq && recv_cq) { if (send_cq == recv_cq) { mlx5_spin_lock(&send_cq->lock); } else if (send_cq->cqn < recv_cq->cqn) { mlx5_spin_lock(&send_cq->lock); mlx5_spin_lock(&recv_cq->lock); } else { mlx5_spin_lock(&recv_cq->lock); mlx5_spin_lock(&send_cq->lock); } } else if (send_cq) { mlx5_spin_lock(&send_cq->lock); } else if (recv_cq) { mlx5_spin_lock(&recv_cq->lock); } } static void mlx5_unlock_cqs(struct ibv_qp *qp) { struct mlx5_cq *send_cq = to_mcq(qp->send_cq); struct mlx5_cq *recv_cq = to_mcq(qp->recv_cq); if (send_cq && recv_cq) { if (send_cq == recv_cq) { mlx5_spin_unlock(&send_cq->lock); } else if (send_cq->cqn < recv_cq->cqn) { mlx5_spin_unlock(&recv_cq->lock); mlx5_spin_unlock(&send_cq->lock); } else { mlx5_spin_unlock(&send_cq->lock); mlx5_spin_unlock(&recv_cq->lock); } } else if (send_cq) { mlx5_spin_unlock(&send_cq->lock); } else if (recv_cq) { mlx5_spin_unlock(&recv_cq->lock); } } int mlx5_destroy_qp(struct ibv_qp *ibqp) { struct mlx5_qp *qp = to_mqp(ibqp); struct mlx5_context *ctx = to_mctx(ibqp->context); int ret; struct mlx5_parent_domain *mparent_domain = to_mparent_domain(ibqp->pd); if (qp->rss_qp) { ret = ibv_cmd_destroy_qp(ibqp); if (ret) return ret; goto free; } if (!ctx->cqe_version) pthread_mutex_lock(&ctx->qp_table_mutex); ret = 
ibv_cmd_destroy_qp(ibqp); if (ret) { if (!ctx->cqe_version) pthread_mutex_unlock(&ctx->qp_table_mutex); return ret; } mlx5_lock_cqs(ibqp); __mlx5_cq_clean(to_mcq(ibqp->recv_cq), qp->rsc.rsn, ibqp->srq ? to_msrq(ibqp->srq) : NULL); if (ibqp->send_cq != ibqp->recv_cq) __mlx5_cq_clean(to_mcq(ibqp->send_cq), qp->rsc.rsn, NULL); if (!ctx->cqe_version) { if (qp->dc_type == MLX5DV_DCTYPE_DCT) { /* The QP was inserted to the tracking table only after * it was modified to RTR */ if (ibqp->state == IBV_QPS_RTR) mlx5_clear_qp(ctx, ibqp->qp_num); } else { if (qp->sq.wqe_cnt || qp->rq.wqe_cnt) mlx5_clear_qp(ctx, ibqp->qp_num); } } mlx5_unlock_cqs(ibqp); if (!ctx->cqe_version) pthread_mutex_unlock(&ctx->qp_table_mutex); else if (!is_xrc_tgt(ibqp->qp_type)) mlx5_clear_uidx(ctx, qp->rsc.rsn); if (qp->dc_type != MLX5DV_DCTYPE_DCT) { mlx5_free_db(ctx, qp->db, ibqp->pd, qp->custom_db); mlx5_free_qp_buf(ctx, qp); } free: if (mparent_domain) atomic_fetch_sub(&mparent_domain->mpd.refcount, 1); mlx5_put_qp_uar(ctx, qp->bf); free(qp); return 0; } static int query_dct_in_order(struct ibv_qp *qp) { uint32_t in_dct[DEVX_ST_SZ_DW(query_dct_in)] = {}; uint32_t out_dct[DEVX_ST_SZ_DW(query_dct_out)] = {}; int ret; DEVX_SET(query_dct_in, in_dct, opcode, MLX5_CMD_OP_QUERY_DCT); DEVX_SET(query_dct_in, in_dct, dctn, qp->qp_num); ret = mlx5dv_devx_qp_query(qp, in_dct, sizeof(in_dct), out_dct, sizeof(out_dct)); if (ret) return 0; return DEVX_GET(query_dct_out, out_dct, dctc.data_in_order); } int mlx5_query_qp_data_in_order(struct ibv_qp *qp, enum ibv_wr_opcode op, uint32_t flags) { uint32_t in_qp[DEVX_ST_SZ_DW(query_qp_in)] = {}; uint32_t out_qp[DEVX_ST_SZ_DW(query_qp_out)] = {}; struct mlx5_context *mctx = to_mctx(qp->context); struct mlx5_qp *mqp = to_mqp(qp); int ret; if (!mctx->qp_data_in_order_cap) return 0; if (mqp->dc_type == MLX5DV_DCTYPE_DCT) return query_dct_in_order(qp) ? IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG : 0; if (qp->state != IBV_QPS_RTS) return 0; DEVX_SET(query_qp_in, in_qp, opcode, MLX5_CMD_OP_QUERY_QP); DEVX_SET(query_qp_in, in_qp, qpn, qp->qp_num); ret = mlx5dv_devx_qp_query(qp, in_qp, sizeof(in_qp), out_qp, sizeof(out_qp)); if (ret) return 0; return DEVX_GET(query_qp_out, out_qp, qpc.data_in_order) ?
IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG : 0; } int mlx5_query_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd; struct mlx5_qp *qp = to_mqp(ibqp); int ret; if (qp->rss_qp) return EOPNOTSUPP; ret = ibv_cmd_query_qp(ibqp, attr, attr_mask, init_attr, &cmd, sizeof(cmd)); if (ret) return ret; init_attr->cap.max_send_wr = qp->sq.max_post; init_attr->cap.max_send_sge = qp->sq.max_gs; init_attr->cap.max_inline_data = qp->max_inline_data; if (qp->flags & MLX5_QP_FLAGS_OOO_DP && init_attr->cap.max_recv_wr > 1) init_attr->cap.max_recv_wr = init_attr->cap.max_recv_wr >> 1; attr->cap = init_attr->cap; return 0; } enum { MLX5_MODIFY_QP_EX_ATTR_MASK = IBV_QP_RATE_LIMIT, }; static int modify_dct(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask) { struct mlx5_modify_qp cmd_ex = {}; struct mlx5_modify_qp_ex_resp resp = {}; struct mlx5_qp *mqp = to_mqp(qp); struct mlx5_context *context = to_mctx(qp->context); int min_resp_size; bool dct_create; int ret; cmd_ex.ece_options = mqp->set_ece; if (mqp->flags & MLX5_QP_FLAGS_OOO_DP && attr_mask & IBV_QP_STATE && attr->qp_state == IBV_QPS_INIT) cmd_ex.comp_mask |= MLX5_IB_MODIFY_QP_OOO_DP; ret = ibv_cmd_modify_qp_ex(qp, attr, attr_mask, &cmd_ex.ibv_cmd, sizeof(cmd_ex), &resp.ibv_resp, sizeof(resp)); if (ret) return ret; /* dct is created in hardware and gets unique qp number when QP * is modified to RTR so operations that require QP number need * to be delayed to this time */ dct_create = (attr_mask & IBV_QP_STATE) && (attr->qp_state == IBV_QPS_RTR); if (!dct_create) return 0; min_resp_size = offsetof(typeof(resp), dctn) + sizeof(resp.dctn) - sizeof(resp.ibv_resp); if (resp.response_length < min_resp_size) { errno = EINVAL; return errno; } qp->qp_num = resp.dctn; if (mqp->set_ece) { mqp->set_ece = 0; mqp->get_ece = resp.ece_options; } if (!context->cqe_version) { pthread_mutex_lock(&context->qp_table_mutex); ret = mlx5_store_qp(context, qp->qp_num, mqp); if (!ret) mqp->rsc.rsn = qp->qp_num; else errno = ENOMEM; pthread_mutex_unlock(&context->qp_table_mutex); return ret ? errno : 0; } return 0; } static int qp_enable_mmo(struct ibv_qp *qp) { uint32_t in[DEVX_ST_SZ_DW(init2init_qp_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(init2init_qp_out)] = {}; void *qpce = DEVX_ADDR_OF(init2init_qp_in, in, qpc_data_ext); int ret; DEVX_SET(init2init_qp_in, in, opcode, MLX5_CMD_OP_INIT2INIT_QP); DEVX_SET(init2init_qp_in, in, qpc_ext, 1); DEVX_SET(init2init_qp_in, in, qpn, qp->qp_num); DEVX_SET64(init2init_qp_in, in, opt_param_mask_95_32, MLX5_QPC_OPT_MASK_32_INIT2INIT_MMO); DEVX_SET(qpc_ext, qpce, mmo, 1); ret = mlx5dv_devx_qp_modify(qp, in, sizeof(in), out, sizeof(out)); return ret ? 
mlx5_get_cmd_status_err(ret, out) : 0; } int mlx5_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask) { struct ibv_modify_qp cmd = {}; struct mlx5_modify_qp cmd_ex = {}; struct mlx5_modify_qp_ex_resp resp = {}; struct mlx5_qp *mqp = to_mqp(qp); struct mlx5_context *context = to_mctx(qp->context); int ret; __be32 *db; if (mqp->dc_type == MLX5DV_DCTYPE_DCT) return modify_dct(qp, attr, attr_mask); if (mqp->rss_qp) return EOPNOTSUPP; if (mqp->flags & MLX5_QP_FLAGS_USE_UNDERLAY) { if (attr_mask & ~(IBV_QP_STATE | IBV_QP_CUR_STATE)) return EINVAL; /* Underlay QP is UD over infiniband */ if (context->cached_device_cap_flags & IBV_DEVICE_UD_IP_CSUM) mqp->qp_cap_cache |= MLX5_CSUM_SUPPORT_UNDERLAY_UD | MLX5_RX_CSUM_VALID; } if (attr_mask & IBV_QP_PORT) { switch (qp->qp_type) { case IBV_QPT_RAW_PACKET: if (context->cached_link_layer[attr->port_num - 1] == IBV_LINK_LAYER_ETHERNET) { if (context->cached_device_cap_flags & IBV_DEVICE_RAW_IP_CSUM) mqp->qp_cap_cache |= MLX5_CSUM_SUPPORT_RAW_OVER_ETH | MLX5_RX_CSUM_VALID; if (ibv_is_qpt_supported( context->cached_tso_caps.supported_qpts, IBV_QPT_RAW_PACKET)) mqp->max_tso = context->cached_tso_caps.max_tso; } break; default: break; } } if (attr_mask & MLX5_MODIFY_QP_EX_ATTR_MASK || mqp->set_ece || mqp->flags & MLX5_QP_FLAGS_OOO_DP) { cmd_ex.ece_options = mqp->set_ece; if (mqp->flags & MLX5_QP_FLAGS_OOO_DP && attr_mask & IBV_QP_STATE && attr->qp_state == IBV_QPS_INIT) cmd_ex.comp_mask |= MLX5_IB_MODIFY_QP_OOO_DP; ret = ibv_cmd_modify_qp_ex(qp, attr, attr_mask, &cmd_ex.ibv_cmd, sizeof(cmd_ex), &resp.ibv_resp, sizeof(resp)); } else { ret = ibv_cmd_modify_qp(qp, attr, attr_mask, &cmd, sizeof(cmd)); } if (!ret && mqp->set_ece) { mqp->set_ece = 0; mqp->get_ece = resp.ece_options; } if (!ret && (attr_mask & IBV_QP_STATE) && attr->qp_state == IBV_QPS_RESET) { if (qp->recv_cq) { mlx5_cq_clean(to_mcq(qp->recv_cq), mqp->rsc.rsn, qp->srq ? to_msrq(qp->srq) : NULL); } if (qp->send_cq != qp->recv_cq && qp->send_cq) mlx5_cq_clean(to_mcq(qp->send_cq), to_mqp(qp)->rsc.rsn, NULL); mlx5_init_qp_indices(mqp); db = mqp->db; db[MLX5_RCV_DBR] = 0; db[MLX5_SND_DBR] = 0; } /* * When the Raw Packet QP is in INIT state, its RQ * underneath is already in RDY, which means it can * receive packets. According to the IB spec, a QP can't * receive packets until moved to RTR state. To achieve this, * for Raw Packet QPs, we update the doorbell record * once the QP is moved to RTR. 
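 * The write below publishes the current rq.head to the hardware, so
 * receive WQEs posted while the QP was still in INIT become visible
 * only at the RTR transition.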
*/ if (!ret && (attr_mask & IBV_QP_STATE) && attr->qp_state == IBV_QPS_RTR && (qp->qp_type == IBV_QPT_RAW_PACKET || mqp->flags & MLX5_QP_FLAGS_USE_UNDERLAY)) { mlx5_spin_lock(&mqp->rq.lock); mqp->db[MLX5_RCV_DBR] = htobe32(mqp->rq.head & 0xffff); mlx5_spin_unlock(&mqp->rq.lock); } if (!ret && (attr_mask & IBV_QP_STATE) && attr->qp_state == IBV_QPS_INIT && (mqp->flags & MLX5_QP_FLAGS_DRAIN_SIGERR)) { ret = mlx5_modify_qp_drain_sigerr(qp); } if (!ret && (attr_mask & IBV_QP_STATE) && (attr->qp_state == IBV_QPS_INIT) && mqp->need_mmo_enable) ret = qp_enable_mmo(qp); if (!ret && (attr_mask & IBV_QP_STATE)) set_qp_operational_state(mqp, attr->qp_state); return ret; } int mlx5_modify_qp_rate_limit(struct ibv_qp *qp, struct ibv_qp_rate_limit_attr *attr) { struct ibv_qp_attr qp_attr = {}; struct ib_uverbs_ex_modify_qp_resp resp = {}; struct mlx5_modify_qp cmd = {}; struct mlx5_context *mctx = to_mctx(qp->context); int ret; if (attr->comp_mask) return EINVAL; if ((attr->max_burst_sz || attr->typical_pkt_sz) && (!attr->rate_limit || !(mctx->packet_pacing_caps.cap_flags & MLX5_IB_PP_SUPPORT_BURST))) return EINVAL; cmd.burst_info.max_burst_sz = attr->max_burst_sz; cmd.burst_info.typical_pkt_sz = attr->typical_pkt_sz; qp_attr.rate_limit = attr->rate_limit; ret = ibv_cmd_modify_qp_ex(qp, &qp_attr, IBV_QP_RATE_LIMIT, &cmd.ibv_cmd, sizeof(cmd), &resp, sizeof(resp)); return ret; } /* * IB spec version 1.3. Table 224 Rate to mlx5 rate * conversion table on best effort basis. */ static const uint8_t ib_to_mlx5_rate_table[] = { 0, /* Invalid to unlimited */ 0, /* Invalid to unlimited */ 7, /* 2.5 Gbps */ 8, /* 10Gbps */ 9, /* 30Gbps */ 10, /* 5 Gbps */ 11, /* 20 Gbps */ 12, /* 40 Gbps */ 13, /* 60 Gbps */ 14, /* 80 Gbps */ 15, /* 120 Gbps */ 11, /* 14 Gbps to 20 Gbps */ 13, /* 56 Gbps to 60 Gbps */ 15, /* 112 Gbps to 120 Gbps */ 0, /* 168 Gbps to unlimited */ 9, /* 25 Gbps to 30 Gbps */ 15, /* 100 Gbps to 120 Gbps */ 0, /* 200 Gbps to unlimited */ 0, /* 300 Gbps to unlimited */ 9, /* 28 Gbps to 30 Gbps */ 13, /* 50 Gbps to 60 Gbps */ 0, /* 400 Gbps to unlimited */ 0, /* 600 Gbps to unlimited */ }; static uint8_t ah_attr_to_mlx5_rate(enum ibv_rate ah_static_rate) { if (ah_static_rate >= ARRAY_SIZE(ib_to_mlx5_rate_table)) return 0; return ib_to_mlx5_rate_table[ah_static_rate]; } static void mlx5_ah_set_udp_sport(struct mlx5_ah *ah, const struct ibv_ah_attr *attr) { uint16_t sport; uint32_t fl; fl = attr->grh.flow_label & IB_GRH_FLOWLABEL_MASK; if (fl) sport = ibv_flow_label_to_udp_sport(fl); else sport = get_random() % (IB_ROCE_UDP_ENCAP_VALID_PORT_MAX + 1 - IB_ROCE_UDP_ENCAP_VALID_PORT_MIN) + IB_ROCE_UDP_ENCAP_VALID_PORT_MIN; ah->av.rlid = htobe16(sport); } struct ibv_ah *mlx5_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr) { struct mlx5_context *ctx = to_mctx(pd->context); struct ibv_port_attr port_attr; struct mlx5_ah *ah; uint8_t static_rate; uint32_t gid_type; __be32 tmp; uint8_t grh; bool is_eth; bool grh_req; if (attr->port_num < 1 || attr->port_num > ctx->num_ports) return NULL; if (ctx->cached_link_layer[attr->port_num - 1]) { is_eth = ctx->cached_link_layer[attr->port_num - 1] == IBV_LINK_LAYER_ETHERNET; grh_req = ctx->cached_port_flags[attr->port_num - 1] & IBV_QPF_GRH_REQUIRED; } else { if (ibv_query_port(pd->context, attr->port_num, &port_attr)) return NULL; is_eth = port_attr.link_layer == IBV_LINK_LAYER_ETHERNET; grh_req = port_attr.flags & IBV_QPF_GRH_REQUIRED; } if (unlikely((!attr->is_global) && (is_eth || grh_req))) { errno = EINVAL; return NULL; } ah = calloc(1, sizeof *ah); if (!ah) 
return NULL; static_rate = ah_attr_to_mlx5_rate(attr->static_rate); if (is_eth) { if (ibv_query_gid_type(pd->context, attr->port_num, attr->grh.sgid_index, &gid_type)) goto err; if (gid_type == IBV_GID_TYPE_SYSFS_ROCE_V2) mlx5_ah_set_udp_sport(ah, attr); /* Since RoCE packets must contain GRH, this bit is reserved * for RoCE and shouldn't be set. */ grh = 0; ah->av.stat_rate_sl = (static_rate << 4) | ((attr->sl & 0x7) << 1); } else { ah->av.fl_mlid = attr->src_path_bits & 0x7f; ah->av.rlid = htobe16(attr->dlid); grh = 1; ah->av.stat_rate_sl = (static_rate << 4) | (attr->sl & 0xf); } if (attr->is_global) { ah->av.tclass = attr->grh.traffic_class; ah->av.hop_limit = attr->grh.hop_limit; tmp = htobe32((grh << 30) | ((attr->grh.sgid_index & 0xff) << 20) | (attr->grh.flow_label & IB_GRH_FLOWLABEL_MASK)); ah->av.grh_gid_fl = tmp; memcpy(ah->av.rgid, attr->grh.dgid.raw, 16); } if (is_eth) { if (ctx->cmds_supp_uhw & MLX5_USER_CMDS_SUPP_UHW_CREATE_AH) { struct mlx5_create_ah_resp resp = {}; if (ibv_cmd_create_ah(pd, &ah->ibv_ah, attr, &resp.ibv_resp, sizeof(resp))) goto err; ah->kern_ah = true; memcpy(ah->av.rmac, resp.dmac, ETHERNET_LL_SIZE); } else { if (ibv_resolve_eth_l2_from_gid(pd->context, attr, ah->av.rmac, NULL)) goto err; } } pthread_mutex_init(&ah->mutex, NULL); ah->is_global = attr->is_global; return &ah->ibv_ah; err: free(ah); return NULL; } int mlx5_destroy_ah(struct ibv_ah *ah) { struct mlx5_ah *mah = to_mah(ah); int err; if (mah->kern_ah) { err = ibv_cmd_destroy_ah(ah); if (err) return err; } if (mah->ah_qp_mapping) mlx5dv_devx_obj_destroy(mah->ah_qp_mapping); free(mah); return 0; } static int _mlx5dv_map_ah_to_qp(struct ibv_ah *ah, uint32_t qp_num) { uint32_t out[DEVX_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; uint32_t in[DEVX_ST_SZ_DW(create_av_qp_mapping_in)] = {}; struct mlx5_context *mctx = to_mctx(ah->context); struct mlx5_ah *mah = to_mah(ah); uint8_t sgid_index; void *attr; int ret = 0; if (!(mctx->general_obj_types_caps & (1ULL << MLX5_OBJ_TYPE_AV_QP_MAPPING)) || !mah->is_global) return EOPNOTSUPP; attr = DEVX_ADDR_OF(create_av_qp_mapping_in, in, hdr); DEVX_SET(general_obj_in_cmd_hdr, attr, opcode, MLX5_CMD_OP_CREATE_GENERAL_OBJECT); DEVX_SET(general_obj_in_cmd_hdr, attr, obj_type, MLX5_OBJ_TYPE_AV_QP_MAPPING); sgid_index = (be32toh(mah->av.grh_gid_fl) >> 20) & 0xff; attr = DEVX_ADDR_OF(create_av_qp_mapping_in, in, mapping); DEVX_SET(av_qp_mapping, attr, qpn, qp_num); DEVX_SET(av_qp_mapping, attr, remote_address_vector.sl_or_eth_prio, mah->av.stat_rate_sl); DEVX_SET(av_qp_mapping, attr, remote_address_vector.src_addr_index, sgid_index); memcpy(DEVX_ADDR_OF(av_qp_mapping, attr, remote_address_vector.rgid_or_rip), mah->av.rgid, sizeof(mah->av.rgid)); pthread_mutex_lock(&mah->mutex); if (!mah->ah_qp_mapping) { mah->ah_qp_mapping = mlx5dv_devx_obj_create( ah->context, in, sizeof(in), out, sizeof(out)); if (!mah->ah_qp_mapping) ret = mlx5_get_cmd_status_err(errno, out); } pthread_mutex_unlock(&mah->mutex); return ret; } int mlx5dv_map_ah_to_qp(struct ibv_ah *ah, uint32_t qp_num) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ah->context); if (!dvops || !dvops->map_ah_to_qp) return EOPNOTSUPP; return dvops->map_ah_to_qp(ah, qp_num); } int mlx5_attach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid) { return ibv_cmd_attach_mcast(qp, gid, lid); } int mlx5_detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid) { return ibv_cmd_detach_mcast(qp, gid, lid); } struct ibv_qp *mlx5_create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex 
*attr) { return create_qp(context, attr, NULL); } static struct ibv_qp *_mlx5dv_create_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr) { return create_qp(context, qp_attr, mlx5_qp_attr); } struct ibv_qp *mlx5dv_create_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context); if (!dvops || !dvops->create_qp) { errno = EOPNOTSUPP; return NULL; } return dvops->create_qp(context, qp_attr, mlx5_qp_attr); } struct mlx5dv_qp_ex *mlx5dv_qp_ex_from_ibv_qp_ex(struct ibv_qp_ex *qp) { return &(container_of(qp, struct mlx5_qp, verbs_qp.qp_ex))->dv_qp; } int mlx5_get_srq_num(struct ibv_srq *srq, uint32_t *srq_num) { struct mlx5_srq *msrq = to_msrq(srq); /* This may be used by DC users in addition to XRC ones; since there is * no indication on the SRQ of DC usage, we cannot enforce an XRC-only * check here. Even DC users are encouraged to use mlx5dv_init_obj() to * get the SRQN. */ *srq_num = msrq->srqn; return 0; } struct ibv_qp *mlx5_open_qp(struct ibv_context *context, struct ibv_qp_open_attr *attr) { struct ibv_open_qp cmd; struct ib_uverbs_create_qp_resp resp; struct mlx5_qp *qp; int ret; qp = calloc(1, sizeof(*qp)); if (!qp) return NULL; ret = ibv_cmd_open_qp(context, &qp->verbs_qp, sizeof(qp->verbs_qp), attr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) goto err; return &qp->verbs_qp.qp; err: free(qp); return NULL; } struct ibv_xrcd * mlx5_open_xrcd(struct ibv_context *context, struct ibv_xrcd_init_attr *xrcd_init_attr) { int err; struct verbs_xrcd *xrcd; struct ibv_open_xrcd cmd = {}; struct ib_uverbs_open_xrcd_resp resp = {}; xrcd = calloc(1, sizeof(*xrcd)); if (!xrcd) return NULL; err = ibv_cmd_open_xrcd(context, xrcd, sizeof(*xrcd), xrcd_init_attr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (err) { free(xrcd); return NULL; } return &xrcd->xrcd; } int mlx5_close_xrcd(struct ibv_xrcd *ib_xrcd) { struct verbs_xrcd *xrcd = container_of(ib_xrcd, struct verbs_xrcd, xrcd); int ret; ret = ibv_cmd_close_xrcd(xrcd); if (!ret) free(xrcd); return ret; } static struct ibv_qp * create_cmd_qp(struct ibv_context *context, struct ibv_srq_init_attr_ex *srq_attr, struct ibv_srq *srq) { struct ibv_qp_init_attr_ex init_attr = {}; FILE *fp = to_mctx(context)->dbg_fp; struct ibv_port_attr port_attr; struct ibv_modify_qp qcmd = {}; struct ibv_qp_attr attr = {}; struct ibv_query_port pcmd; struct ibv_qp *qp; int attr_mask; int port = 1; int ret; ret = ibv_cmd_query_port(context, port, &port_attr, &pcmd, sizeof(pcmd)); if (ret) { mlx5_dbg(fp, MLX5_DBG_QP, "ret %d\n", ret); return NULL; } init_attr.qp_type = IBV_QPT_RC; init_attr.srq = srq; /* Command QP will be used to pass MLX5_OPCODE_TAG_MATCHING messages * to add/remove tag matching list entries. * WQ size is based on max_ops parameter holding max number of * outstanding list operations.
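* Each pending add/remove occupies one send WQE on this QP, which is
* why max_send_wr below is sized to tm_cap.max_ops.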
*/ init_attr.cap.max_send_wr = srq_attr->tm_cap.max_ops; /* Tag matching list entry will point to a single sge buffer */ init_attr.cap.max_send_sge = 1; init_attr.comp_mask = IBV_QP_INIT_ATTR_PD; init_attr.pd = srq_attr->pd; init_attr.send_cq = srq_attr->cq; init_attr.recv_cq = srq_attr->cq; qp = create_qp(context, &init_attr, NULL); if (!qp) return NULL; attr.qp_state = IBV_QPS_INIT; attr.port_num = port; attr_mask = IBV_QP_STATE | IBV_QP_PKEY_INDEX | IBV_QP_PORT | IBV_QP_ACCESS_FLAGS; ret = ibv_cmd_modify_qp(qp, &attr, attr_mask, &qcmd, sizeof(qcmd)); if (ret) { mlx5_dbg(fp, MLX5_DBG_QP, "ret %d\n", ret); goto err; } attr.qp_state = IBV_QPS_RTR; attr.path_mtu = IBV_MTU_256; attr.dest_qp_num = qp->qp_num; /* Loopback */ attr.ah_attr.dlid = port_attr.lid; attr.ah_attr.port_num = port; attr_mask = IBV_QP_STATE | IBV_QP_AV | IBV_QP_PATH_MTU | IBV_QP_DEST_QPN | IBV_QP_RQ_PSN | IBV_QP_MAX_DEST_RD_ATOMIC | IBV_QP_MIN_RNR_TIMER; ret = ibv_cmd_modify_qp(qp, &attr, attr_mask, &qcmd, sizeof(qcmd)); if (ret) { mlx5_dbg(fp, MLX5_DBG_QP, "ret %d\n", ret); goto err; } attr.qp_state = IBV_QPS_RTS; attr_mask = IBV_QP_STATE | IBV_QP_TIMEOUT | IBV_QP_RETRY_CNT | IBV_QP_RNR_RETRY | IBV_QP_SQ_PSN | IBV_QP_MAX_QP_RD_ATOMIC; ret = ibv_cmd_modify_qp(qp, &attr, attr_mask, &qcmd, sizeof(qcmd)); if (ret) { mlx5_dbg(fp, MLX5_DBG_QP, "ret %d\n", ret); goto err; } return qp; err: mlx5_destroy_qp(qp); return NULL; } struct ibv_srq *mlx5_create_srq_ex(struct ibv_context *context, struct ibv_srq_init_attr_ex *attr) { int err; struct mlx5_create_srq_ex cmd; struct mlx5_create_srq_resp resp; struct mlx5_srq *msrq; struct mlx5_context *ctx = to_mctx(context); int max_sge; struct ibv_srq *ibsrq; int uidx; if (!(attr->comp_mask & IBV_SRQ_INIT_ATTR_TYPE) || (attr->srq_type == IBV_SRQT_BASIC)) return mlx5_create_srq(attr->pd, (struct ibv_srq_init_attr *)attr); if (attr->srq_type != IBV_SRQT_XRC && attr->srq_type != IBV_SRQT_TM) { errno = EINVAL; return NULL; } /* An extended CQ is required to read TM information from */ if (attr->srq_type == IBV_SRQT_TM && !(attr->cq && (to_mcq(attr->cq)->flags & MLX5_CQ_FLAGS_EXTENDED))) { errno = EINVAL; return NULL; } msrq = calloc(1, sizeof(*msrq)); if (!msrq) return NULL; ibsrq = (struct ibv_srq *)&msrq->vsrq; memset(&cmd, 0, sizeof(cmd)); memset(&resp, 0, sizeof(resp)); if (mlx5_spinlock_init_pd(&msrq->lock, attr->pd)) { mlx5_err(ctx->dbg_fp, "%s-%d:\n", __func__, __LINE__); goto err; } if (attr->attr.max_wr > ctx->max_srq_recv_wr) { mlx5_err(ctx->dbg_fp, "%s-%d:max_wr %d, max_srq_recv_wr %d\n", __func__, __LINE__, attr->attr.max_wr, ctx->max_srq_recv_wr); errno = EINVAL; goto err; } /* * this calculation does not consider required control segments. The * final calculation is done again later. 
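* (max_rq_desc_sz / sizeof(struct mlx5_wqe_data_seg) is the largest
* number of scatter entries that could ever fit in one receive WQE,
* so it serves as a safe early upper bound.)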
This is done to avoid * variable overflow. */ max_sge = ctx->max_rq_desc_sz / sizeof(struct mlx5_wqe_data_seg); if (attr->attr.max_sge > max_sge) { mlx5_err(ctx->dbg_fp, "%s-%d:attr.max_sge %d, max_sge %d\n", __func__, __LINE__, attr->attr.max_sge, max_sge); errno = EINVAL; goto err; } msrq->max_gs = attr->attr.max_sge; msrq->counter = 0; if (mlx5_alloc_srq_buf(context, msrq, attr->attr.max_wr, attr->pd)) { mlx5_err(ctx->dbg_fp, "%s-%d:\n", __func__, __LINE__); goto err; } msrq->db = mlx5_alloc_dbrec(ctx, attr->pd, &msrq->custom_db); if (!msrq->db) { mlx5_err(ctx->dbg_fp, "%s-%d:\n", __func__, __LINE__); goto err_free; } if (!msrq->custom_db) *msrq->db = 0; cmd.buf_addr = (uintptr_t)msrq->buf.buf; cmd.db_addr = (uintptr_t)msrq->db; msrq->wq_sig = srq_sig_enabled(); if (msrq->wq_sig) cmd.flags = MLX5_SRQ_FLAG_SIGNATURE; attr->attr.max_sge = msrq->max_gs; if (ctx->cqe_version) { uidx = mlx5_store_uidx(ctx, msrq); if (uidx < 0) { mlx5_dbg(ctx->dbg_fp, MLX5_DBG_QP, "Couldn't find free user index\n"); goto err_free_db; } cmd.uidx = uidx; } else { cmd.uidx = 0xffffff; pthread_mutex_lock(&ctx->srq_table_mutex); } /* Override max_wr to let kernel know about extra WQEs for the * wait queue. */ attr->attr.max_wr = msrq->max - 1; err = ibv_cmd_create_srq_ex(context, &msrq->vsrq, attr, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); /* Override kernel response that includes the wait queue with the real * number of WQEs that are applicable for the application. */ attr->attr.max_wr = msrq->tail; if (err) goto err_free_uidx; if (attr->srq_type == IBV_SRQT_TM) { int i; msrq->cmd_qp = create_cmd_qp(context, attr, ibsrq); if (!msrq->cmd_qp) goto err_destroy; msrq->tm_list = calloc(attr->tm_cap.max_num_tags + 1, sizeof(struct mlx5_tag_entry)); if (!msrq->tm_list) goto err_free_cmd; for (i = 0; i < attr->tm_cap.max_num_tags; i++) msrq->tm_list[i].next = &msrq->tm_list[i + 1]; msrq->tm_head = &msrq->tm_list[0]; msrq->tm_tail = &msrq->tm_list[attr->tm_cap.max_num_tags]; msrq->op = calloc(to_mqp(msrq->cmd_qp)->sq.wqe_cnt, sizeof(struct mlx5_srq_op)); if (!msrq->op) goto err_free_tm; msrq->op_head = 0; msrq->op_tail = 0; } if (!ctx->cqe_version) { err = mlx5_store_srq(to_mctx(context), resp.srqn, msrq); if (err) goto err_free_tm; pthread_mutex_unlock(&ctx->srq_table_mutex); } msrq->srqn = resp.srqn; msrq->rsc.type = MLX5_RSC_TYPE_XSRQ; msrq->rsc.rsn = ctx->cqe_version ?
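/* with CQE version 1 completions identify the resource by user index,
 * otherwise by the kernel-assigned SRQ number */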
cmd.uidx : resp.srqn; return ibsrq; err_free_tm: free(msrq->tm_list); free(msrq->op); err_free_cmd: if (msrq->cmd_qp) mlx5_destroy_qp(msrq->cmd_qp); err_destroy: ibv_cmd_destroy_srq(ibsrq); err_free_uidx: if (ctx->cqe_version) mlx5_clear_uidx(ctx, cmd.uidx); else pthread_mutex_unlock(&ctx->srq_table_mutex); err_free_db: mlx5_free_db(ctx, msrq->db, attr->pd, msrq->custom_db); err_free: free(msrq->wrid); mlx5_free_actual_buf(ctx, &msrq->buf); free(msrq->free_wqe_bitmap); err: free(msrq); return NULL; } static void get_pci_atomic_caps(struct ibv_context *context, struct ibv_device_attr_ex *attr) { uint32_t in[DEVX_ST_SZ_DW(query_hca_cap_in)] = {}; uint32_t out[DEVX_ST_SZ_DW(query_hca_cap_out)] = {}; uint16_t opmod = (MLX5_CAP_ATOMIC << 1) | HCA_CAP_OPMOD_GET_CUR; int ret; DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP); DEVX_SET(query_hca_cap_in, in, op_mod, opmod); ret = mlx5dv_devx_general_cmd(context, in, sizeof(in), out, sizeof(out)); if (!ret) { attr->pci_atomic_caps.fetch_add = DEVX_GET(query_hca_cap_out, out, capability.atomic_caps.fetch_add_pci_atomic); attr->pci_atomic_caps.swap = DEVX_GET(query_hca_cap_out, out, capability.atomic_caps.swap_pci_atomic); attr->pci_atomic_caps.compare_swap = DEVX_GET(query_hca_cap_out, out, capability.atomic_caps.compare_swap_pci_atomic); if (attr->orig_attr.atomic_cap == IBV_ATOMIC_HCA && (attr->pci_atomic_caps.fetch_add & IBV_PCI_ATOMIC_OPERATION_8_BYTE_SIZE_SUP) && (attr->pci_atomic_caps.compare_swap & IBV_PCI_ATOMIC_OPERATION_8_BYTE_SIZE_SUP)) attr->orig_attr.atomic_cap = IBV_ATOMIC_GLOB; } } static void get_hca_general_caps_2(struct mlx5_context *mctx) { uint16_t opmod = MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE_CAP_2 | HCA_CAP_OPMOD_GET_CUR; uint32_t out[DEVX_ST_SZ_DW(query_hca_cap_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(query_hca_cap_in)] = {}; int ret; DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP); DEVX_SET(query_hca_cap_in, in, op_mod, opmod); ret = mlx5dv_devx_general_cmd(&mctx->ibv_ctx.context, in, sizeof(in), out, sizeof(out)); if (ret) return; mctx->hca_cap_2_caps.log_reserved_qpns_per_obj = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap_2.log_reserved_qpn_granularity); } static void get_hca_sig_caps(uint32_t *hca_caps, struct mlx5_context *mctx) { if (!DEVX_GET(query_hca_cap_out, hca_caps, capability.cmd_hca_cap.sho) || !DEVX_GET(query_hca_cap_out, hca_caps, capability.cmd_hca_cap.sigerr_domain_and_sig_type)) return; /* Basic signature offload features */ mctx->sig_caps.block_prot = MLX5DV_SIG_PROT_CAP_T10DIF | MLX5DV_SIG_PROT_CAP_CRC; mctx->sig_caps.block_size = MLX5DV_BLOCK_SIZE_CAP_512 | MLX5DV_BLOCK_SIZE_CAP_520 | MLX5DV_BLOCK_SIZE_CAP_4096 | MLX5DV_BLOCK_SIZE_CAP_4160; mctx->sig_caps.t10dif_bg = MLX5DV_SIG_T10DIF_BG_CAP_CRC | MLX5DV_SIG_T10DIF_BG_CAP_CSUM; mctx->sig_caps.crc_type = MLX5DV_SIG_CRC_TYPE_CAP_CRC32; /* Optional signature offload features */ if (DEVX_GET(query_hca_cap_out, hca_caps, capability.cmd_hca_cap.sig_block_4048)) mctx->sig_caps.block_size |= MLX5DV_BLOCK_SIZE_CAP_4048; if (DEVX_GET(query_hca_cap_out, hca_caps, capability.cmd_hca_cap.sig_crc32c)) mctx->sig_caps.crc_type |= MLX5DV_SIG_CRC_TYPE_CAP_CRC32C; if (DEVX_GET(query_hca_cap_out, hca_caps, capability.cmd_hca_cap.sig_crc64_xp10)) mctx->sig_caps.crc_type |= MLX5DV_SIG_CRC_TYPE_CAP_CRC64_XP10; } static void get_hca_general_caps(struct mlx5_context *mctx) { uint16_t opmod = MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE | HCA_CAP_OPMOD_GET_CUR; uint32_t out[DEVX_ST_SZ_DW(query_hca_cap_out)] = {}; uint32_t 
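/* DEVX_ST_SZ_DW() sizes the command buffers in dwords from the
 * PRM-derived structure layouts */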
in[DEVX_ST_SZ_DW(query_hca_cap_in)] = {}; int max_cyclic_qp_wr; int ret; DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP); DEVX_SET(query_hca_cap_in, in, op_mod, opmod); ret = mlx5dv_devx_general_cmd(&mctx->ibv_ctx.context, in, sizeof(in), out, sizeof(out)); if (ret) return; mctx->qp_data_in_order_cap = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.qp_data_in_order); mctx->entropy_caps.num_lag_ports = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.num_lag_ports); mctx->entropy_caps.lag_tx_port_affinity = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.lag_tx_port_affinity); mctx->entropy_caps.rts2rts_qp_udp_sport = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.rts2rts_qp_udp_sport); mctx->entropy_caps.rts2rts_lag_tx_port_affinity = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.rts2rts_lag_tx_port_affinity); mctx->qos_caps.qos = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.qos); mctx->qpc_extension_cap = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.qpc_extension); mctx->general_obj_types_caps = DEVX_GET64(query_hca_cap_out, out, capability.cmd_hca_cap.general_obj_types); mctx->max_dc_rd_atom = 1 << DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.log_max_ra_req_dc); mctx->max_dc_init_rd_atom = 1 << DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.log_max_ra_res_dc); get_hca_sig_caps(out, mctx); if (DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.crypto)) mctx->crypto_caps.flags |= MLX5DV_CRYPTO_CAPS_CRYPTO; if (DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.aes_xts_single_block_le_tweak)) mctx->crypto_caps.crypto_engines |= MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_SINGLE_BLOCK; if (DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.aes_xts_multi_block_be_tweak)) mctx->crypto_caps.crypto_engines |= (MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_SINGLE_BLOCK | MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_MULTI_BLOCK); if (DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.hca_cap_2)) get_hca_general_caps_2(mctx); mctx->dma_mmo_caps.dma_mmo_sq = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.dma_mmo_sq); mctx->dma_mmo_caps.dma_mmo_qp = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.dma_mmo_qp); if (mctx->dma_mmo_caps.dma_mmo_sq || mctx->dma_mmo_caps.dma_mmo_qp) { uint8_t log_sz; log_sz = DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.log_dma_mmo_max_size); if (log_sz) mctx->dma_mmo_caps.dma_max_size = 1ULL << log_sz; else mctx->dma_mmo_caps.dma_max_size = MLX5_DMA_MMO_MAX_SIZE; } /* OOO-enabled cyclic buffers require double the user requested size. * XRC and DC are implemented as linked-list buffers. Hence, halving * is not required. */ max_cyclic_qp_wr = mctx->max_recv_wr > 1 ? 
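/* halve the advertised depth for cyclic RQs; depths of 0 or 1 are
 * kept as-is */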
mctx->max_recv_wr >> 1 : mctx->max_recv_wr; if (DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.dp_ordering_ooo_all_xrc)) mctx->ooo_recv_wrs_caps.max_xrc = mctx->max_recv_wr; if (DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.dp_ordering_ooo_all_dc)) mctx->ooo_recv_wrs_caps.max_dct = mctx->max_recv_wr; if (DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.dp_ordering_ooo_all_rc)) mctx->ooo_recv_wrs_caps.max_rc = max_cyclic_qp_wr; if (DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.dp_ordering_ooo_all_ud)) mctx->ooo_recv_wrs_caps.max_ud = max_cyclic_qp_wr; if (DEVX_GET(query_hca_cap_out, out, capability.cmd_hca_cap.dp_ordering_ooo_all_uc)) mctx->ooo_recv_wrs_caps.max_uc = max_cyclic_qp_wr; } static void get_qos_caps(struct mlx5_context *mctx) { uint16_t opmod = MLX5_SET_HCA_CAP_OP_MOD_QOS | HCA_CAP_OPMOD_GET_CUR; uint32_t out[DEVX_ST_SZ_DW(query_hca_cap_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(query_hca_cap_in)] = {}; int ret; DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP); DEVX_SET(query_hca_cap_in, in, op_mod, opmod); ret = mlx5dv_devx_general_cmd(&mctx->ibv_ctx.context, in, sizeof(in), out, sizeof(out)); if (ret) return; mctx->qos_caps.nic_sq_scheduling = DEVX_GET(query_hca_cap_out, out, capability.qos_caps.nic_sq_scheduling); if (mctx->qos_caps.nic_sq_scheduling) { mctx->qos_caps.nic_bw_share = DEVX_GET(query_hca_cap_out, out, capability.qos_caps.nic_bw_share); mctx->qos_caps.nic_rate_limit = DEVX_GET(query_hca_cap_out, out, capability.qos_caps.nic_rate_limit); } mctx->qos_caps.nic_qp_scheduling = DEVX_GET(query_hca_cap_out, out, capability.qos_caps.nic_qp_scheduling); mctx->qos_caps.nic_element_type = DEVX_GET(query_hca_cap_out, out, capability.qos_caps.nic_element_type); mctx->qos_caps.nic_tsar_type = DEVX_GET(query_hca_cap_out, out, capability.qos_caps.nic_tsar_type); } static void get_crypto_caps(struct mlx5_context *mctx) { uint16_t opmod = MLX5_SET_HCA_CAP_OP_MOD_CRYPTO | HCA_CAP_OPMOD_GET_CUR; uint32_t out[DEVX_ST_SZ_DW(query_hca_cap_out)] = {}; uint32_t in[DEVX_ST_SZ_DW(query_hca_cap_in)] = {}; int ret; DEVX_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP); DEVX_SET(query_hca_cap_in, in, op_mod, opmod); ret = mlx5dv_devx_general_cmd(&mctx->ibv_ctx.context, in, sizeof(in), out, sizeof(out)); if (ret) return; if (DEVX_GET(query_hca_cap_out, out, capability.crypto_caps.wrapped_crypto_operational)) mctx->crypto_caps.flags |= MLX5DV_CRYPTO_CAPS_WRAPPED_CRYPTO_OPERATIONAL; if (DEVX_GET(query_hca_cap_out, out, capability.crypto_caps .wrapped_crypto_going_to_commissioning)) mctx->crypto_caps.flags |= MLX5DV_CRYPTO_CAPS_WRAPPED_CRYPTO_GOING_TO_COMMISSIONING; if (DEVX_GET(query_hca_cap_out, out, capability.crypto_caps.wrapped_import_method) & MLX5_CRYPTO_CAPS_WRAPPED_IMPORT_METHOD_AES) mctx->crypto_caps.wrapped_import_method |= MLX5DV_CRYPTO_WRAPPED_IMPORT_METHOD_CAP_AES_XTS; mctx->crypto_caps.log_max_num_deks = DEVX_GET(query_hca_cap_out, out, capability.crypto_caps.log_max_num_deks); mctx->crypto_caps.failed_selftests = DEVX_GET(query_hca_cap_out, out, capability.crypto_caps.failed_selftests); } int mlx5_query_device_ex(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct mlx5_context *mctx = to_mctx(context); struct mlx5_query_device_ex_resp resp = {}; size_t resp_size = (mctx->cmds_supp_uhw & MLX5_USER_CMDS_SUPP_UHW_QUERY_DEVICE) ? 
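/* the vendor part of the response is only requested when the kernel
 * supports user-hardware (UHW) data on query_device */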
sizeof(resp) : sizeof(resp.ibv_resp); struct ibv_device_attr *a; uint64_t raw_fw_ver; unsigned sub_minor; unsigned major; unsigned minor; int err; err = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp.ibv_resp, &resp_size); if (err) return err; if (attr_size >= offsetofend(struct ibv_device_attr_ex, tso_caps)) { attr->tso_caps.max_tso = resp.tso_caps.max_tso; attr->tso_caps.supported_qpts = resp.tso_caps.supported_qpts; } if (attr_size >= offsetofend(struct ibv_device_attr_ex, rss_caps)) { attr->rss_caps.rx_hash_fields_mask = resp.rss_caps.rx_hash_fields_mask; attr->rss_caps.rx_hash_function = resp.rss_caps.rx_hash_function; } if (attr_size >= offsetofend(struct ibv_device_attr_ex, packet_pacing_caps)) { attr->packet_pacing_caps.qp_rate_limit_min = resp.packet_pacing_caps.qp_rate_limit_min; attr->packet_pacing_caps.qp_rate_limit_max = resp.packet_pacing_caps.qp_rate_limit_max; attr->packet_pacing_caps.supported_qpts = resp.packet_pacing_caps.supported_qpts; } if (attr_size >= offsetofend(struct ibv_device_attr_ex, pci_atomic_caps)) get_pci_atomic_caps(context, attr); raw_fw_ver = resp.ibv_resp.base.fw_ver; major = (raw_fw_ver >> 32) & 0xffff; minor = (raw_fw_ver >> 16) & 0xffff; sub_minor = raw_fw_ver & 0xffff; a = &attr->orig_attr; snprintf(a->fw_ver, sizeof(a->fw_ver), "%d.%d.%04d", major, minor, sub_minor); return 0; } void mlx5_query_device_ctx(struct mlx5_context *mctx) { struct ibv_device_attr_ex device_attr; struct mlx5_query_device_ex_resp resp = {}; size_t resp_size = (mctx->cmds_supp_uhw & MLX5_USER_CMDS_SUPP_UHW_QUERY_DEVICE) ? sizeof(resp) : sizeof(resp.ibv_resp); get_hca_general_caps(mctx); if (mctx->qos_caps.qos) get_qos_caps(mctx); if (mctx->crypto_caps.flags & MLX5DV_CRYPTO_CAPS_CRYPTO) get_crypto_caps(mctx); if (ibv_cmd_query_device_any(&mctx->ibv_ctx.context, NULL, &device_attr, sizeof(device_attr), &resp.ibv_resp, &resp_size)) return; mctx->cached_device_cap_flags = device_attr.orig_attr.device_cap_flags; mctx->atomic_cap = device_attr.orig_attr.atomic_cap; mctx->max_dm_size = device_attr.max_dm_size; mctx->cached_tso_caps = resp.tso_caps; if (resp.mlx5_ib_support_multi_pkt_send_wqes & MLX5_IB_ALLOW_MPW) mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_MPW_ALLOWED; if (resp.mlx5_ib_support_multi_pkt_send_wqes & MLX5_IB_SUPPORT_EMPW) mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_ENHANCED_MPW; mctx->cqe_comp_caps.max_num = resp.cqe_comp_caps.max_num; mctx->cqe_comp_caps.supported_format = resp.cqe_comp_caps.supported_format; mctx->sw_parsing_caps.sw_parsing_offloads = resp.sw_parsing_caps.sw_parsing_offloads; mctx->sw_parsing_caps.supported_qpts = resp.sw_parsing_caps.supported_qpts; mctx->striding_rq_caps.min_single_stride_log_num_of_bytes = resp.striding_rq_caps.min_single_stride_log_num_of_bytes; mctx->striding_rq_caps.max_single_stride_log_num_of_bytes = resp.striding_rq_caps.max_single_stride_log_num_of_bytes; mctx->striding_rq_caps.min_single_wqe_log_num_of_strides = resp.striding_rq_caps.min_single_wqe_log_num_of_strides; mctx->striding_rq_caps.max_single_wqe_log_num_of_strides = resp.striding_rq_caps.max_single_wqe_log_num_of_strides; mctx->striding_rq_caps.supported_qpts = resp.striding_rq_caps.supported_qpts; mctx->tunnel_offloads_caps = resp.tunnel_offloads_caps; mctx->packet_pacing_caps = resp.packet_pacing_caps; mctx->dci_streams_caps.max_log_num_concurent = resp.dci_streams_caps.max_log_num_concurent; mctx->dci_streams_caps.max_log_num_errored = resp.dci_streams_caps.max_log_num_errored; mctx->reg_c0 = resp.reg_c0; if (resp.flags & 
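/* translate the kernel's query_device response flags into the
 * provider's vendor_cap_flags */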
MLX5_IB_QUERY_DEV_RESP_FLAGS_CQE_128B_COMP) mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_CQE_128B_COMP; if (resp.flags & MLX5_IB_QUERY_DEV_RESP_FLAGS_CQE_128B_PAD) mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_CQE_128B_PAD; if (resp.flags & MLX5_IB_QUERY_DEV_RESP_PACKET_BASED_CREDIT_MODE) mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_PACKET_BASED_CREDIT_MODE; if (resp.flags & MLX5_IB_QUERY_DEV_RESP_FLAGS_SCAT2CQE_DCT) mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_SCAT2CQE_DCT; if (resp.flags & MLX5_IB_QUERY_DEV_RESP_FLAGS_OOO_DP) mctx->vendor_cap_flags |= MLX5_VENDOR_CAP_FLAGS_OOO_DP; } static int rwq_sig_enabled(struct ibv_context *context) { char *env; env = getenv("MLX5_RWQ_SIGNATURE"); if (env) return 1; return 0; } static void mlx5_free_rwq_buf(struct mlx5_rwq *rwq, struct ibv_context *context) { struct mlx5_context *ctx = to_mctx(context); mlx5_free_actual_buf(ctx, &rwq->buf); free(rwq->rq.wrid); } static int mlx5_alloc_rwq_buf(struct ibv_context *context, struct ibv_pd *pd, struct mlx5_rwq *rwq, int size) { int err; enum mlx5_alloc_type alloc_type; mlx5_get_alloc_type(to_mctx(context), pd, MLX5_RWQ_PREFIX, &alloc_type, MLX5_ALLOC_TYPE_ANON); rwq->rq.wrid = malloc(rwq->rq.wqe_cnt * sizeof(uint64_t)); if (!rwq->rq.wrid) { errno = ENOMEM; return -1; } if (alloc_type == MLX5_ALLOC_TYPE_CUSTOM) { rwq->buf.mparent_domain = to_mparent_domain(pd); rwq->buf.req_alignment = to_mdev(context->device)->page_size; rwq->buf.resource_type = MLX5DV_RES_TYPE_RWQ; } err = mlx5_alloc_prefered_buf(to_mctx(context), &rwq->buf, align(rwq->buf_size, to_mdev (context->device)->page_size), to_mdev(context->device)->page_size, alloc_type, MLX5_RWQ_PREFIX); if (err) { free(rwq->rq.wrid); errno = ENOMEM; return -1; } return 0; } static struct ibv_wq *create_wq(struct ibv_context *context, struct ibv_wq_init_attr *attr, struct mlx5dv_wq_init_attr *mlx5wq_attr) { struct mlx5_create_wq cmd; struct mlx5_create_wq_resp resp; int err; struct mlx5_rwq *rwq; struct mlx5_context *ctx = to_mctx(context); int ret; int32_t usr_idx = 0; FILE *fp = ctx->dbg_fp; if (attr->wq_type != IBV_WQT_RQ) return NULL; memset(&cmd, 0, sizeof(cmd)); memset(&resp, 0, sizeof(resp)); rwq = calloc(1, sizeof(*rwq)); if (!rwq) return NULL; rwq->wq_sig = rwq_sig_enabled(context); if (rwq->wq_sig) cmd.flags = MLX5_WQ_FLAG_SIGNATURE; ret = mlx5_calc_rwq_size(ctx, rwq, attr, mlx5wq_attr); if (ret < 0) { errno = -ret; goto err; } rwq->buf_size = ret; if (mlx5_alloc_rwq_buf(context, attr->pd, rwq, ret)) goto err; mlx5_init_rwq_indices(rwq); if (mlx5_spinlock_init_pd(&rwq->rq.lock, attr->pd)) goto err_free_rwq_buf; rwq->db = mlx5_alloc_dbrec(ctx, attr->pd, &rwq->custom_db); if (!rwq->db) goto err_free_rwq_buf; if (!rwq->custom_db) { rwq->db[MLX5_RCV_DBR] = 0; rwq->db[MLX5_SND_DBR] = 0; } rwq->pbuff = rwq->buf.buf + rwq->rq.offset; rwq->recv_db = &rwq->db[MLX5_RCV_DBR]; cmd.buf_addr = (uintptr_t)rwq->buf.buf; cmd.db_addr = (uintptr_t)rwq->db; cmd.rq_wqe_count = rwq->rq.wqe_cnt; cmd.rq_wqe_shift = rwq->rq.wqe_shift; usr_idx = mlx5_store_uidx(ctx, rwq); if (usr_idx < 0) { mlx5_dbg(fp, MLX5_DBG_QP, "Couldn't find free user index\n"); goto err_free_db_rec; } cmd.user_index = usr_idx; if (mlx5wq_attr) { if (mlx5wq_attr->comp_mask & MLX5DV_WQ_INIT_ATTR_MASK_STRIDING_RQ) { if ((mlx5wq_attr->striding_rq_attrs.single_stride_log_num_of_bytes < ctx->striding_rq_caps.min_single_stride_log_num_of_bytes) || (mlx5wq_attr->striding_rq_attrs.single_stride_log_num_of_bytes > ctx->striding_rq_caps.max_single_stride_log_num_of_bytes)) { errno = EINVAL; goto 
err_create; } if ((mlx5wq_attr->striding_rq_attrs.single_wqe_log_num_of_strides < ctx->striding_rq_caps.min_single_wqe_log_num_of_strides) || (mlx5wq_attr->striding_rq_attrs.single_wqe_log_num_of_strides > ctx->striding_rq_caps.max_single_wqe_log_num_of_strides)) { errno = EINVAL; goto err_create; } cmd.single_stride_log_num_of_bytes = mlx5wq_attr->striding_rq_attrs.single_stride_log_num_of_bytes; cmd.single_wqe_log_num_of_strides = mlx5wq_attr->striding_rq_attrs.single_wqe_log_num_of_strides; cmd.two_byte_shift_en = mlx5wq_attr->striding_rq_attrs.two_byte_shift_en; cmd.comp_mask |= MLX5_IB_CREATE_WQ_STRIDING_RQ; } } err = ibv_cmd_create_wq(context, attr, &rwq->wq, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (err) goto err_create; rwq->rsc.type = MLX5_RSC_TYPE_RWQ; rwq->rsc.rsn = cmd.user_index; rwq->wq.post_recv = mlx5_post_wq_recv; return &rwq->wq; err_create: mlx5_clear_uidx(ctx, cmd.user_index); err_free_db_rec: mlx5_free_db(to_mctx(context), rwq->db, attr->pd, rwq->custom_db); err_free_rwq_buf: mlx5_free_rwq_buf(rwq, context); err: free(rwq); return NULL; } struct ibv_wq *mlx5_create_wq(struct ibv_context *context, struct ibv_wq_init_attr *attr) { return create_wq(context, attr, NULL); } static struct ibv_wq *_mlx5dv_create_wq(struct ibv_context *context, struct ibv_wq_init_attr *attr, struct mlx5dv_wq_init_attr *mlx5_wq_attr) { return create_wq(context, attr, mlx5_wq_attr); } struct ibv_wq *mlx5dv_create_wq(struct ibv_context *context, struct ibv_wq_init_attr *attr, struct mlx5dv_wq_init_attr *mlx5_wq_attr) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context); if (!dvops || !dvops->create_wq) { errno = EOPNOTSUPP; return NULL; } return dvops->create_wq(context, attr, mlx5_wq_attr); } int mlx5_modify_wq(struct ibv_wq *wq, struct ibv_wq_attr *attr) { struct mlx5_modify_wq cmd = {}; struct mlx5_rwq *rwq = to_mrwq(wq); if ((attr->attr_mask & IBV_WQ_ATTR_STATE) && attr->wq_state == IBV_WQS_RDY) { if ((attr->attr_mask & IBV_WQ_ATTR_CURR_STATE) && attr->curr_wq_state != wq->state) return -EINVAL; if (wq->state == IBV_WQS_RESET) { mlx5_spin_lock(&to_mcq(wq->cq)->lock); __mlx5_cq_clean(to_mcq(wq->cq), rwq->rsc.rsn, NULL); mlx5_spin_unlock(&to_mcq(wq->cq)->lock); mlx5_init_rwq_indices(rwq); rwq->db[MLX5_RCV_DBR] = 0; rwq->db[MLX5_SND_DBR] = 0; } } return ibv_cmd_modify_wq(wq, attr, &cmd.ibv_cmd, sizeof(cmd)); } int mlx5_destroy_wq(struct ibv_wq *wq) { struct mlx5_rwq *rwq = to_mrwq(wq); int ret; ret = ibv_cmd_destroy_wq(wq); if (ret) return ret; mlx5_spin_lock(&to_mcq(wq->cq)->lock); __mlx5_cq_clean(to_mcq(wq->cq), rwq->rsc.rsn, NULL); mlx5_spin_unlock(&to_mcq(wq->cq)->lock); mlx5_clear_uidx(to_mctx(wq->context), rwq->rsc.rsn); mlx5_free_db(to_mctx(wq->context), rwq->db, wq->pd, rwq->custom_db); mlx5_free_rwq_buf(rwq, wq->context); free(rwq); return 0; } static void free_flow_counters_descriptions(struct mlx5_ib_create_flow *cmd) { int i; for (i = 0; i < cmd->ncounters_data; i++) free(cmd->data[i].counters_data); } static int get_flow_mcounters(struct mlx5_flow *mflow, struct ibv_flow_attr *flow_attr, struct mlx5_counters **mcounters, uint32_t *data_size) { struct ibv_flow_spec *ib_spec; uint32_t ncounters_used = 0; int i; ib_spec = (struct ibv_flow_spec *)(flow_attr + 1); for (i = 0; i < flow_attr->num_of_specs; i++, ib_spec = (void *)ib_spec + ib_spec->hdr.size) { if (ib_spec->hdr.type != IBV_FLOW_SPEC_ACTION_COUNT) continue; /* currently support only one counters data */ if (ncounters_used > 0) return EINVAL; *mcounters = 
to_mcounters(ib_spec->flow_count.counters); ncounters_used++; } *data_size = ncounters_used * sizeof(struct mlx5_ib_flow_counters_data); return 0; } static int allocate_flow_counters_descriptions(struct mlx5_counters *mcounters, struct mlx5_ib_create_flow *cmd) { struct mlx5_ib_flow_counters_data *mcntrs_data; struct mlx5_ib_flow_counters_desc *cntrs_data; struct mlx5_counter_node *cntr_node; uint32_t ncounters; int j = 0; mcntrs_data = cmd->data; ncounters = mcounters->ncounters; /* mlx5_attach_counters_point_flow was never called */ if (!ncounters) return EINVAL; /* each counter has both index and description */ cntrs_data = calloc(ncounters, sizeof(*cntrs_data)); if (!cntrs_data) return ENOMEM; list_for_each(&mcounters->counters_list, cntr_node, entry) { cntrs_data[j].description = cntr_node->desc; cntrs_data[j].index = cntr_node->index; j++; } scrub_ptr_attr(cntrs_data); mcntrs_data[cmd->ncounters_data].counters_data = cntrs_data; mcntrs_data[cmd->ncounters_data].ncounters = ncounters; cmd->ncounters_data++; return 0; } struct ibv_flow *mlx5_create_flow(struct ibv_qp *qp, struct ibv_flow_attr *flow_attr) { struct mlx5_ib_create_flow *cmd; uint32_t required_cmd_size = 0; struct ibv_flow *flow_id; struct mlx5_flow *mflow; int ret; mflow = calloc(1, sizeof(*mflow)); if (!mflow) { errno = ENOMEM; return NULL; } ret = get_flow_mcounters(mflow, flow_attr, &mflow->mcounters, &required_cmd_size); if (ret) { errno = ret; goto err_get_mcounters; } required_cmd_size += sizeof(*cmd); cmd = calloc(1, required_cmd_size); if (!cmd) { errno = ENOMEM; goto err_get_mcounters; } if (mflow->mcounters) { pthread_mutex_lock(&mflow->mcounters->lock); /* if the counters already bound no need to pass its description */ if (!mflow->mcounters->refcount) { ret = allocate_flow_counters_descriptions(mflow->mcounters, cmd); if (ret) { errno = ret; goto err_desc_alloc; } } } flow_id = &mflow->flow_id; ret = ibv_cmd_create_flow(qp, flow_id, flow_attr, cmd, required_cmd_size); if (ret) goto err_create_flow; if (mflow->mcounters) { free_flow_counters_descriptions(cmd); mflow->mcounters->refcount++; pthread_mutex_unlock(&mflow->mcounters->lock); } free(cmd); return flow_id; err_create_flow: if (mflow->mcounters) { free_flow_counters_descriptions(cmd); pthread_mutex_unlock(&mflow->mcounters->lock); } err_desc_alloc: free(cmd); err_get_mcounters: free(mflow); return NULL; } int mlx5_destroy_flow(struct ibv_flow *flow_id) { struct mlx5_flow *mflow = to_mflow(flow_id); int ret; ret = ibv_cmd_destroy_flow(flow_id); if (ret) return ret; if (mflow->mcounters) { pthread_mutex_lock(&mflow->mcounters->lock); mflow->mcounters->refcount--; pthread_mutex_unlock(&mflow->mcounters->lock); } free(mflow); return 0; } struct ibv_rwq_ind_table *mlx5_create_rwq_ind_table(struct ibv_context *context, struct ibv_rwq_ind_table_init_attr *init_attr) { struct mlx5_create_rwq_ind_table_resp resp; struct ibv_rwq_ind_table *ind_table; int err; memset(&resp, 0, sizeof(resp)); ind_table = calloc(1, sizeof(*ind_table)); if (!ind_table) return NULL; err = ibv_cmd_create_rwq_ind_table(context, init_attr, ind_table, &resp.ibv_resp, sizeof(resp)); if (err) goto err; return ind_table; err: free(ind_table); return NULL; } int mlx5_destroy_rwq_ind_table(struct ibv_rwq_ind_table *rwq_ind_table) { int ret; ret = ibv_cmd_destroy_rwq_ind_table(rwq_ind_table); if (ret) return ret; free(rwq_ind_table); return 0; } int mlx5_modify_cq(struct ibv_cq *cq, struct ibv_modify_cq_attr *attr) { struct ibv_modify_cq cmd = {}; return ibv_cmd_modify_cq(cq, attr, &cmd, 
sizeof(cmd)); } static struct ibv_flow_action *_mlx5_create_flow_action_esp(struct ibv_context *ctx, struct ibv_flow_action_esp_attr *attr, struct ibv_command_buffer *driver_attr) { struct verbs_flow_action *action; int ret; if (!check_comp_mask(attr->comp_mask, IBV_FLOW_ACTION_ESP_MASK_ESN)) { errno = EOPNOTSUPP; return NULL; } action = calloc(1, sizeof(*action)); if (!action) { errno = ENOMEM; return NULL; } ret = ibv_cmd_create_flow_action_esp(ctx, attr, action, driver_attr); if (ret) { free(action); return NULL; } return &action->action; } struct ibv_flow_action *mlx5_create_flow_action_esp(struct ibv_context *ctx, struct ibv_flow_action_esp_attr *attr) { return _mlx5_create_flow_action_esp(ctx, attr, NULL); } static struct ibv_flow_action * _mlx5dv_create_flow_action_esp(struct ibv_context *ctx, struct ibv_flow_action_esp_attr *esp, struct mlx5dv_flow_action_esp *mlx5_attr) { DECLARE_COMMAND_BUFFER_LINK(driver_attr, UVERBS_OBJECT_FLOW_ACTION, UVERBS_METHOD_FLOW_ACTION_ESP_CREATE, 1, NULL); if (!check_comp_mask(mlx5_attr->comp_mask, MLX5DV_FLOW_ACTION_ESP_MASK_FLAGS)) { errno = EOPNOTSUPP; return NULL; } if (mlx5_attr->comp_mask & MLX5DV_FLOW_ACTION_ESP_MASK_FLAGS) { if (!check_comp_mask(mlx5_attr->action_flags, MLX5_IB_UAPI_FLOW_ACTION_FLAGS_REQUIRE_METADATA)) { errno = EOPNOTSUPP; return NULL; } fill_attr_in_uint64(driver_attr, MLX5_IB_ATTR_CREATE_FLOW_ACTION_FLAGS, mlx5_attr->action_flags); } return _mlx5_create_flow_action_esp(ctx, esp, driver_attr); } struct ibv_flow_action *mlx5dv_create_flow_action_esp(struct ibv_context *ctx, struct ibv_flow_action_esp_attr *esp, struct mlx5dv_flow_action_esp *mlx5_attr) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ctx); if (!dvops || !dvops->create_flow_action_esp) { errno = EOPNOTSUPP; return NULL; } return dvops->create_flow_action_esp(ctx, esp, mlx5_attr); } int mlx5_modify_flow_action_esp(struct ibv_flow_action *action, struct ibv_flow_action_esp_attr *attr) { struct verbs_flow_action *vaction = container_of(action, struct verbs_flow_action, action); if (!check_comp_mask(attr->comp_mask, IBV_FLOW_ACTION_ESP_MASK_ESN)) return EOPNOTSUPP; return ibv_cmd_modify_flow_action_esp(vaction, attr, NULL); } static struct ibv_flow_action * _mlx5dv_create_flow_action_modify_header(struct ibv_context *ctx, size_t actions_sz, uint64_t actions[], enum mlx5dv_flow_table_type ft_type) { DECLARE_COMMAND_BUFFER(cmd, UVERBS_OBJECT_FLOW_ACTION, MLX5_IB_METHOD_FLOW_ACTION_CREATE_MODIFY_HEADER, 3); struct ib_uverbs_attr *handle = fill_attr_out_obj(cmd, MLX5_IB_ATTR_CREATE_MODIFY_HEADER_HANDLE); struct verbs_flow_action *action; int ret; fill_attr_in(cmd, MLX5_IB_ATTR_CREATE_MODIFY_HEADER_ACTIONS_PRM, actions, actions_sz); fill_attr_const_in(cmd, MLX5_IB_ATTR_CREATE_MODIFY_HEADER_FT_TYPE, ft_type); action = calloc(1, sizeof(*action)); if (!action) { errno = ENOMEM; return NULL; } ret = execute_ioctl(ctx, cmd); if (ret) { free(action); return NULL; } action->action.context = ctx; action->type = IBV_FLOW_ACTION_UNSPECIFIED; action->handle = read_attr_obj(MLX5_IB_ATTR_CREATE_MODIFY_HEADER_HANDLE, handle); return &action->action; } struct ibv_flow_action *mlx5dv_create_flow_action_modify_header(struct ibv_context *ctx, size_t actions_sz, uint64_t actions[], enum mlx5dv_flow_table_type ft_type) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ctx); if (!dvops || !dvops->create_flow_action_modify_header) { errno = EOPNOTSUPP; return NULL; } return dvops->create_flow_action_modify_header(ctx, actions_sz, actions, ft_type); } static struct ibv_flow_action * 
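/* builds an encap/decap packet-reformat flow action through a direct
 * ioctl; data and data_sz must be both set or both zero, as validated
 * below */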
_mlx5dv_create_flow_action_packet_reformat(struct ibv_context *ctx, size_t data_sz, void *data, enum mlx5dv_flow_action_packet_reformat_type reformat_type, enum mlx5dv_flow_table_type ft_type) { DECLARE_COMMAND_BUFFER(cmd, UVERBS_OBJECT_FLOW_ACTION, MLX5_IB_METHOD_FLOW_ACTION_CREATE_PACKET_REFORMAT, 4); struct ib_uverbs_attr *handle = fill_attr_out_obj(cmd, MLX5_IB_ATTR_CREATE_PACKET_REFORMAT_HANDLE); struct verbs_flow_action *action; int ret; if ((!data && data_sz) || (data && !data_sz)) { errno = EINVAL; return NULL; } if (data && data_sz) fill_attr_in(cmd, MLX5_IB_ATTR_CREATE_PACKET_REFORMAT_DATA_BUF, data, data_sz); fill_attr_const_in(cmd, MLX5_IB_ATTR_CREATE_PACKET_REFORMAT_TYPE, reformat_type); fill_attr_const_in(cmd, MLX5_IB_ATTR_CREATE_PACKET_REFORMAT_FT_TYPE, ft_type); action = calloc(1, sizeof(*action)); if (!action) { errno = ENOMEM; return NULL; } ret = execute_ioctl(ctx, cmd); if (ret) { free(action); return NULL; } action->action.context = ctx; action->type = IBV_FLOW_ACTION_UNSPECIFIED; action->handle = read_attr_obj(MLX5_IB_ATTR_CREATE_PACKET_REFORMAT_HANDLE, handle); return &action->action; } struct ibv_flow_action * mlx5dv_create_flow_action_packet_reformat(struct ibv_context *ctx, size_t data_sz, void *data, enum mlx5dv_flow_action_packet_reformat_type reformat_type, enum mlx5dv_flow_table_type ft_type) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ctx); if (!dvops || !dvops->create_flow_action_packet_reformat) { errno = EOPNOTSUPP; return NULL; } return dvops->create_flow_action_packet_reformat(ctx, data_sz, data, reformat_type, ft_type); } int mlx5_destroy_flow_action(struct ibv_flow_action *action) { struct verbs_flow_action *vaction = container_of(action, struct verbs_flow_action, action); int ret = ibv_cmd_destroy_flow_action(vaction); if (!ret) free(action); return ret; } static inline int mlx5_access_dm(struct ibv_dm *ibdm, uint64_t dm_offset, void *host_addr, size_t length, uint32_t read) { struct mlx5_dm *dm = to_mdm(ibdm); atomic_uint32_t *dm_ptr = (atomic_uint32_t *)dm->start_va + dm_offset / 4; uint32_t *host_ptr = host_addr; const uint32_t *host_end = host_ptr + length / 4; if (dm_offset + length > dm->length) return EFAULT; /* Due to HW limitation, DM access address and length must be aligned * to 4 bytes. */ if ((length & 3) || (dm_offset & 3)) return EINVAL; /* Copy granularity should be 4 Bytes since we enforce copy size to be * a multiple of 4 bytes. 
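* The 4-byte relaxed atomics below also keep the compiler from
* splitting or merging accesses to the mapped device memory.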
*/ if (read) { while (host_ptr != host_end) { *host_ptr = atomic_load_explicit(dm_ptr, memory_order_relaxed); host_ptr++; dm_ptr++; } } else { while (host_ptr != host_end) { atomic_store_explicit(dm_ptr, *host_ptr, memory_order_relaxed); host_ptr++; dm_ptr++; } } return 0; } static inline int mlx5_memcpy_to_dm(struct ibv_dm *ibdm, uint64_t dm_offset, const void *host_addr, size_t length) { return mlx5_access_dm(ibdm, dm_offset, (void *)host_addr, length, 0); } static inline int mlx5_memcpy_from_dm(void *host_addr, struct ibv_dm *ibdm, uint64_t dm_offset, size_t length) { return mlx5_access_dm(ibdm, dm_offset, host_addr, length, 1); } static void *dm_mmap(struct ibv_context *context, struct mlx5_dm *mdm, uint16_t page_idx, size_t length) { int page_size = to_mdev(context->device)->page_size; uint64_t act_size = align(length, page_size); off_t offset = 0; set_command(MLX5_IB_MMAP_DEVICE_MEM, &offset); set_extended_index(page_idx, &offset); return mmap(NULL, act_size, PROT_READ | PROT_WRITE, MAP_SHARED, context->cmd_fd, page_size * offset); } static void *_mlx5dv_dm_map_op_addr(struct ibv_dm *dm, uint8_t op) { int page_size = to_mdev(dm->context->device)->page_size; struct mlx5_dm *mdm = to_mdm(dm); uint64_t start_offset; uint16_t page_idx; void *va; int ret; DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_DM, MLX5_IB_METHOD_DM_MAP_OP_ADDR, 4); fill_attr_in_obj(cmdb, MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_HANDLE, mdm->verbs_dm.handle); fill_attr_in(cmdb, MLX5_IB_ATTR_DM_MAP_OP_ADDR_REQ_OP, &op, sizeof(op)); fill_attr_out(cmdb, MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_START_OFFSET, &start_offset, sizeof(start_offset)); fill_attr_out(cmdb, MLX5_IB_ATTR_DM_MAP_OP_ADDR_RESP_PAGE_INDEX, &page_idx, sizeof(page_idx)); ret = execute_ioctl(dm->context, cmdb); if (ret) return NULL; va = dm_mmap(dm->context, mdm, page_idx, mdm->length); if (va == MAP_FAILED) return NULL; return va + (start_offset & (page_size - 1)); } void *mlx5dv_dm_map_op_addr(struct ibv_dm *dm, uint8_t op) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(dm->context); if (!dvops || !dvops->dm_map_op_addr) { errno = EOPNOTSUPP; return NULL; } return dvops->dm_map_op_addr(dm, op); } static struct ibv_mr * _mlx5dv_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access, int mlx5_access) { DECLARE_COMMAND_BUFFER_LINK(driver_attr, UVERBS_OBJECT_MR, UVERBS_METHOD_REG_DMABUF_MR, 1, NULL); struct mlx5_mr *mr; int ret; mr = calloc(1, sizeof(*mr)); if (!mr) return NULL; fill_attr_in_uint32(driver_attr, MLX5_IB_ATTR_REG_DMABUF_MR_ACCESS_FLAGS, mlx5_access); ret = ibv_cmd_reg_dmabuf_mr(pd, offset, length, iova, fd, access, &mr->vmr, driver_attr); if (ret) { free(mr); return NULL; } mr->alloc_flags = access; return &mr->vmr.ibv_mr; } struct ibv_mr *mlx5dv_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access, int mlx5_access) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(pd->context); if (!dvops || !dvops->reg_dmabuf_mr) { errno = EOPNOTSUPP; return NULL; } return dvops->reg_dmabuf_mr(pd, offset, length, iova, fd, access, mlx5_access); } static int _mlx5dv_get_data_direct_sysfs_path(struct ibv_context *context, char *buf, size_t buf_len) { DECLARE_COMMAND_BUFFER(cmd, UVERBS_OBJECT_DEVICE, MLX5_IB_METHOD_GET_DATA_DIRECT_SYSFS_PATH, 1); fill_attr_out(cmd, MLX5_IB_ATTR_GET_DATA_DIRECT_SYSFS_PATH, buf, buf_len); return execute_ioctl(context, cmd); } int mlx5dv_get_data_direct_sysfs_path(struct ibv_context *context, char *buf, size_t buf_len) { struct mlx5_dv_context_ops 
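/* like the other mlx5dv_* entry points, dispatch through the
 * per-context dvops table so that non-mlx5 contexts fail with
 * EOPNOTSUPP */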
*dvops = mlx5_get_dv_ops(context); if (!dvops || !dvops->get_data_direct_sysfs_path) return EOPNOTSUPP; return dvops->get_data_direct_sysfs_path(context, buf, buf_len); } void mlx5_unimport_dm(struct ibv_dm *ibdm) { struct mlx5_dm *dm = to_mdm(ibdm); size_t act_size = align(dm->length, to_mdev(ibdm->context->device)->page_size); munmap(dm->mmap_va, act_size); free(dm); } struct ibv_dm *mlx5_import_dm(struct ibv_context *context, uint32_t dm_handle) { DECLARE_COMMAND_BUFFER(cmd, UVERBS_OBJECT_DM, MLX5_IB_METHOD_DM_QUERY, 4); int page_size = to_mdev(context->device)->page_size; uint64_t start_offset, length; struct mlx5_dm *dm; uint16_t page_idx; void *va; int ret; dm = calloc(1, sizeof(*dm)); if (!dm) { errno = ENOMEM; return NULL; } fill_attr_in_obj(cmd, MLX5_IB_ATTR_QUERY_DM_REQ_HANDLE, dm_handle); fill_attr_out(cmd, MLX5_IB_ATTR_QUERY_DM_RESP_START_OFFSET, &start_offset, sizeof(start_offset)); fill_attr_out(cmd, MLX5_IB_ATTR_QUERY_DM_RESP_PAGE_INDEX, &page_idx, sizeof(page_idx)); fill_attr_out(cmd, MLX5_IB_ATTR_QUERY_DM_RESP_LENGTH, &length, sizeof(length)); ret = execute_ioctl(context, cmd); if (ret) goto free_dm; va = dm_mmap(context, dm, page_idx, length); if (va == MAP_FAILED) goto free_dm; dm->mmap_va = va; dm->length = length; dm->start_va = va + (start_offset & (page_size - 1)); dm->verbs_dm.dm.memcpy_to_dm = mlx5_memcpy_to_dm; dm->verbs_dm.dm.memcpy_from_dm = mlx5_memcpy_from_dm; dm->verbs_dm.dm.context = context; dm->verbs_dm.handle = dm->verbs_dm.dm.handle = dm_handle; return &dm->verbs_dm.dm; free_dm: free(dm); return NULL; } static int alloc_dm_memic(struct ibv_context *ctx, struct mlx5_dm *dm, struct ibv_alloc_dm_attr *dm_attr, struct ibv_command_buffer *cmdb) { int page_size = to_mdev(ctx->device)->page_size; uint64_t start_offset; uint16_t page_idx; void *va; if (dm_attr->length > to_mctx(ctx)->max_dm_size) { errno = EINVAL; return errno; } fill_attr_out(cmdb, MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET, &start_offset, sizeof(start_offset)); fill_attr_out(cmdb, MLX5_IB_ATTR_ALLOC_DM_RESP_PAGE_INDEX, &page_idx, sizeof(page_idx)); if (ibv_cmd_alloc_dm(ctx, dm_attr, &dm->verbs_dm, cmdb)) return EINVAL; va = dm_mmap(ctx, dm, page_idx, dm_attr->length); if (va == MAP_FAILED) { ibv_cmd_free_dm(&dm->verbs_dm); return ENOMEM; } dm->mmap_va = va; dm->start_va = va + (start_offset & (page_size - 1)); dm->verbs_dm.dm.memcpy_to_dm = mlx5_memcpy_to_dm; dm->verbs_dm.dm.memcpy_from_dm = mlx5_memcpy_from_dm; return 0; } static int alloc_dm_steering_sw_icm(struct ibv_context *ctx, struct mlx5_dm *dm, struct ibv_alloc_dm_attr *dm_attr, struct ibv_command_buffer *cmdb) { uint64_t start_offset; fill_attr_out(cmdb, MLX5_IB_ATTR_ALLOC_DM_RESP_START_OFFSET, &start_offset, sizeof(start_offset)); if (ibv_cmd_alloc_dm(ctx, dm_attr, &dm->verbs_dm, cmdb)) return EINVAL; /* For SW ICM we get address in the start_offset attribute */ dm->remote_va = start_offset; return 0; } static struct ibv_dm * _mlx5dv_alloc_dm(struct ibv_context *context, struct ibv_alloc_dm_attr *dm_attr, struct mlx5dv_alloc_dm_attr *mlx5_dm_attr) { DECLARE_COMMAND_BUFFER(cmdb, UVERBS_OBJECT_DM, UVERBS_METHOD_DM_ALLOC, 3); struct ib_uverbs_attr *type_attr; struct mlx5_dm *dm; int err; if ((mlx5_dm_attr->type != MLX5DV_DM_TYPE_MEMIC) && (mlx5_dm_attr->type != MLX5DV_DM_TYPE_ENCAP_SW_ICM) && (mlx5_dm_attr->type != MLX5DV_DM_TYPE_STEERING_SW_ICM) && (mlx5_dm_attr->type != MLX5DV_DM_TYPE_HEADER_MODIFY_SW_ICM) && (mlx5_dm_attr->type != MLX5DV_DM_TYPE_HEADER_MODIFY_PATTERN_SW_ICM)) { errno = EOPNOTSUPP; return NULL; } if 
(!check_comp_mask(dm_attr->comp_mask, 0) || !check_comp_mask(mlx5_dm_attr->comp_mask, 0)) { errno = EINVAL; return NULL; } dm = calloc(1, sizeof(*dm)); if (!dm) { errno = ENOMEM; return NULL; } type_attr = fill_attr_const_in(cmdb, MLX5_IB_ATTR_ALLOC_DM_REQ_TYPE, mlx5_dm_attr->type); if (mlx5_dm_attr->type == MLX5DV_DM_TYPE_MEMIC) { attr_optional(type_attr); err = alloc_dm_memic(context, dm, dm_attr, cmdb); } else { err = alloc_dm_steering_sw_icm(context, dm, dm_attr, cmdb); } if (err) goto err_free_mem; dm->length = dm_attr->length; return &dm->verbs_dm.dm; err_free_mem: free(dm); return NULL; } struct ibv_dm * mlx5dv_alloc_dm(struct ibv_context *context, struct ibv_alloc_dm_attr *dm_attr, struct mlx5dv_alloc_dm_attr *mlx5_dm_attr) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context); if (!dvops || !dvops->alloc_dm) { errno = EOPNOTSUPP; return NULL; } return dvops->alloc_dm(context, dm_attr, mlx5_dm_attr); } int mlx5_free_dm(struct ibv_dm *ibdm) { struct mlx5_device *mdev = to_mdev(ibdm->context->device); struct mlx5_dm *dm = to_mdm(ibdm); size_t act_size = align(dm->length, mdev->page_size); int ret; ret = ibv_cmd_free_dm(&dm->verbs_dm); if (ret) return ret; if (dm->mmap_va) munmap(dm->mmap_va, act_size); free(dm); return 0; } struct ibv_dm *mlx5_alloc_dm(struct ibv_context *context, struct ibv_alloc_dm_attr *dm_attr) { struct mlx5dv_alloc_dm_attr mlx5_attr = { .type = MLX5DV_DM_TYPE_MEMIC }; return mlx5dv_alloc_dm(context, dm_attr, &mlx5_attr); } struct ibv_counters *mlx5_create_counters(struct ibv_context *context, struct ibv_counters_init_attr *init_attr) { struct mlx5_counters *mcntrs; int ret; if (!check_comp_mask(init_attr->comp_mask, 0)) { errno = EOPNOTSUPP; return NULL; } mcntrs = calloc(1, sizeof(*mcntrs)); if (!mcntrs) { errno = ENOMEM; return NULL; } pthread_mutex_init(&mcntrs->lock, NULL); ret = ibv_cmd_create_counters(context, init_attr, &mcntrs->vcounters, NULL); if (ret) goto err_create; list_head_init(&mcntrs->counters_list); return &mcntrs->vcounters.counters; err_create: free(mcntrs); return NULL; } int mlx5_destroy_counters(struct ibv_counters *counters) { struct mlx5_counters *mcntrs = to_mcounters(counters); struct mlx5_counter_node *tmp, *cntrs_node; int ret; ret = ibv_cmd_destroy_counters(&mcntrs->vcounters); if (ret) return ret; list_for_each_safe(&mcntrs->counters_list, cntrs_node, tmp, entry) { list_del(&cntrs_node->entry); free(cntrs_node); } free(mcntrs); return 0; } int mlx5_attach_counters_point_flow(struct ibv_counters *counters, struct ibv_counter_attach_attr *attr, struct ibv_flow *flow) { struct mlx5_counters *mcntrs = to_mcounters(counters); struct mlx5_counter_node *cntrs_node; int ret; /* The driver supports only the static binding mode as part of ibv_create_flow */ if (flow) return ENOTSUP; if (!check_comp_mask(attr->comp_mask, 0)) return EOPNOTSUPP; /* Check whether the attached counter is supported */ if (attr->counter_desc < IBV_COUNTER_PACKETS || attr->counter_desc > IBV_COUNTER_BYTES) return ENOTSUP; cntrs_node = calloc(1, sizeof(*cntrs_node)); if (!cntrs_node) return ENOMEM; pthread_mutex_lock(&mcntrs->lock); /* The counter is bound to a flow, attach is not allowed */ if (mcntrs->refcount) { ret = EBUSY; goto err_already_bound; } cntrs_node->index = attr->index; cntrs_node->desc = attr->counter_desc; list_add(&mcntrs->counters_list, &cntrs_node->entry); mcntrs->ncounters++; pthread_mutex_unlock(&mcntrs->lock); return 0; err_already_bound: pthread_mutex_unlock(&mcntrs->lock); free(cntrs_node); return ret; } int 
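/* counter values are written to counters_value[] at the index chosen
 * when each counter point was attached */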
mlx5_read_counters(struct ibv_counters *counters, uint64_t *counters_value, uint32_t ncounters, uint32_t flags) { struct mlx5_counters *mcntrs = to_mcounters(counters); return ibv_cmd_read_counters(&mcntrs->vcounters, counters_value, ncounters, flags, NULL); } static struct mlx5dv_flow_matcher * _mlx5dv_create_flow_matcher(struct ibv_context *context, struct mlx5dv_flow_matcher_attr *attr) { DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_FLOW_MATCHER, MLX5_IB_METHOD_FLOW_MATCHER_CREATE, 6); struct mlx5dv_flow_matcher *flow_matcher; struct ib_uverbs_attr *handle; int ret; if (!check_comp_mask(attr->comp_mask, MLX5DV_FLOW_MATCHER_MASK_FT_TYPE)) { errno = EOPNOTSUPP; return NULL; } flow_matcher = calloc(1, sizeof(*flow_matcher)); if (!flow_matcher) { errno = ENOMEM; return NULL; } if (attr->type != IBV_FLOW_ATTR_NORMAL) { errno = EOPNOTSUPP; goto err; } handle = fill_attr_out_obj(cmd, MLX5_IB_ATTR_FLOW_MATCHER_CREATE_HANDLE); fill_attr_in(cmd, MLX5_IB_ATTR_FLOW_MATCHER_MATCH_MASK, attr->match_mask->match_buf, attr->match_mask->match_sz); fill_attr_in(cmd, MLX5_IB_ATTR_FLOW_MATCHER_MATCH_CRITERIA, &attr->match_criteria_enable, sizeof(attr->match_criteria_enable)); fill_attr_in_enum(cmd, MLX5_IB_ATTR_FLOW_MATCHER_FLOW_TYPE, IBV_FLOW_ATTR_NORMAL, &attr->priority, sizeof(attr->priority)); if (attr->comp_mask & MLX5DV_FLOW_MATCHER_MASK_FT_TYPE) fill_attr_const_in(cmd, MLX5_IB_ATTR_FLOW_MATCHER_FT_TYPE, attr->ft_type); if (attr->flags) fill_attr_const_in(cmd, MLX5_IB_ATTR_FLOW_MATCHER_FLOW_FLAGS, attr->flags); ret = execute_ioctl(context, cmd); if (ret) goto err; flow_matcher->context = context; flow_matcher->handle = read_attr_obj(MLX5_IB_ATTR_FLOW_MATCHER_CREATE_HANDLE, handle); return flow_matcher; err: free(flow_matcher); return NULL; } struct mlx5dv_flow_matcher * mlx5dv_create_flow_matcher(struct ibv_context *context, struct mlx5dv_flow_matcher_attr *attr) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context); if (!dvops || !dvops->create_flow_matcher) { errno = EOPNOTSUPP; return NULL; } return dvops->create_flow_matcher(context, attr); } static int _mlx5dv_destroy_flow_matcher(struct mlx5dv_flow_matcher *flow_matcher) { DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_FLOW_MATCHER, MLX5_IB_METHOD_FLOW_MATCHER_DESTROY, 1); int ret; fill_attr_in_obj(cmd, MLX5_IB_ATTR_FLOW_MATCHER_DESTROY_HANDLE, flow_matcher->handle); ret = execute_ioctl(flow_matcher->context, cmd); verbs_is_destroy_err(&ret); if (ret) return ret; free(flow_matcher); return 0; } int mlx5dv_destroy_flow_matcher(struct mlx5dv_flow_matcher *flow_matcher) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(flow_matcher->context); if (!dvops || !dvops->destroy_flow_matcher) return EOPNOTSUPP; return dvops->destroy_flow_matcher(flow_matcher); } #define CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED 8 struct ibv_flow * _mlx5dv_create_flow(struct mlx5dv_flow_matcher *flow_matcher, struct mlx5dv_flow_match_parameters *match_value, size_t num_actions, struct mlx5dv_flow_action_attr actions_attr[], struct mlx5_flow_action_attr_aux actions_attr_aux[]) { uint32_t flow_actions[CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED]; struct verbs_flow_action *vaction; int num_flow_actions = 0; struct mlx5_flow *mflow; bool have_qp = false; bool have_dest_devx = false; bool have_flow_tag = false; bool have_counter = false; bool have_default = false; bool have_drop = false; int ret; int i; DECLARE_COMMAND_BUFFER(cmd, UVERBS_OBJECT_FLOW, MLX5_IB_METHOD_CREATE_FLOW, 8); struct ib_uverbs_attr *handle; enum mlx5dv_flow_action_type type; mflow = calloc(1, 
sizeof(*mflow)); if (!mflow) { errno = ENOMEM; return NULL; } handle = fill_attr_out_obj(cmd, MLX5_IB_ATTR_CREATE_FLOW_HANDLE); fill_attr_in(cmd, MLX5_IB_ATTR_CREATE_FLOW_MATCH_VALUE, match_value->match_buf, match_value->match_sz); fill_attr_in_obj(cmd, MLX5_IB_ATTR_CREATE_FLOW_MATCHER, flow_matcher->handle); for (i = 0; i < num_actions; i++) { type = actions_attr[i].type; switch (type) { case MLX5DV_FLOW_ACTION_DEST_IBV_QP: if (have_qp || have_dest_devx || have_default || have_drop) { errno = EOPNOTSUPP; goto err; } fill_attr_in_obj(cmd, MLX5_IB_ATTR_CREATE_FLOW_DEST_QP, actions_attr[i].qp->handle); have_qp = true; break; case MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION: if (num_flow_actions == CREATE_FLOW_MAX_FLOW_ACTIONS_SUPPORTED) { errno = EOPNOTSUPP; goto err; } vaction = container_of(actions_attr[i].action, struct verbs_flow_action, action); flow_actions[num_flow_actions] = vaction->handle; num_flow_actions++; break; case MLX5DV_FLOW_ACTION_DEST_DEVX: if (have_dest_devx || have_qp || have_default || have_drop) { errno = EOPNOTSUPP; goto err; } fill_attr_in_obj(cmd, MLX5_IB_ATTR_CREATE_FLOW_DEST_DEVX, actions_attr[i].obj->handle); have_dest_devx = true; break; case MLX5DV_FLOW_ACTION_TAG: if (have_flow_tag) { errno = EINVAL; goto err; } fill_attr_in_uint32(cmd, MLX5_IB_ATTR_CREATE_FLOW_TAG, actions_attr[i].tag_value); have_flow_tag = true; break; case MLX5DV_FLOW_ACTION_COUNTERS_DEVX: if (have_counter) { errno = EOPNOTSUPP; goto err; } fill_attr_in_objs_arr(cmd, MLX5_IB_ATTR_CREATE_FLOW_ARR_COUNTERS_DEVX, &actions_attr[i].obj->handle, 1); if (actions_attr_aux && actions_attr_aux[i].type == MLX5_FLOW_ACTION_COUNTER_OFFSET) fill_attr_in_ptr_array(cmd, MLX5_IB_ATTR_CREATE_FLOW_ARR_COUNTERS_DEVX_OFFSET, &actions_attr_aux[i].offset, 1); have_counter = true; break; case MLX5DV_FLOW_ACTION_DEFAULT_MISS: if (have_qp || have_dest_devx || have_default || have_drop) { errno = EOPNOTSUPP; goto err; } fill_attr_in_uint32(cmd, MLX5_IB_ATTR_CREATE_FLOW_FLAGS, MLX5_IB_ATTR_CREATE_FLOW_FLAGS_DEFAULT_MISS); have_default = true; break; case MLX5DV_FLOW_ACTION_DROP: if (have_qp || have_dest_devx || have_default || have_drop) { errno = EOPNOTSUPP; goto err; } fill_attr_in_uint32(cmd, MLX5_IB_ATTR_CREATE_FLOW_FLAGS, MLX5_IB_ATTR_CREATE_FLOW_FLAGS_DROP); have_drop = true; break; default: errno = EOPNOTSUPP; goto err; } } if (num_flow_actions) fill_attr_in_objs_arr(cmd, MLX5_IB_ATTR_CREATE_FLOW_ARR_FLOW_ACTIONS, flow_actions, num_flow_actions); ret = execute_ioctl(flow_matcher->context, cmd); if (ret) goto err; mflow->flow_id.handle = read_attr_obj(MLX5_IB_ATTR_CREATE_FLOW_HANDLE, handle); mflow->flow_id.context = flow_matcher->context; return &mflow->flow_id; err: free(mflow); return NULL; } struct ibv_flow * mlx5dv_create_flow(struct mlx5dv_flow_matcher *flow_matcher, struct mlx5dv_flow_match_parameters *match_value, size_t num_actions, struct mlx5dv_flow_action_attr actions_attr[]) { struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(flow_matcher->context); if (!dvops || !dvops->create_flow) { errno = EOPNOTSUPP; return NULL; } return dvops->create_flow(flow_matcher, match_value, num_actions, actions_attr, NULL); } static struct mlx5dv_steering_anchor * _mlx5dv_create_steering_anchor(struct ibv_context *context, struct mlx5dv_steering_anchor_attr *attr) { DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_STEERING_ANCHOR, MLX5_IB_METHOD_STEERING_ANCHOR_CREATE, 4); struct mlx5_steering_anchor *steering_anchor; struct ib_uverbs_attr *handle; int err; if (!check_comp_mask(attr->comp_mask, 0)) { errno = EOPNOTSUPP; return 
static struct mlx5dv_steering_anchor *
_mlx5dv_create_steering_anchor(struct ibv_context *context,
			       struct mlx5dv_steering_anchor_attr *attr)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_STEERING_ANCHOR,
			       MLX5_IB_METHOD_STEERING_ANCHOR_CREATE, 4);
	struct mlx5_steering_anchor *steering_anchor;
	struct ib_uverbs_attr *handle;
	int err;

	if (!check_comp_mask(attr->comp_mask, 0)) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	steering_anchor = calloc(1, sizeof(*steering_anchor));
	if (!steering_anchor) {
		errno = ENOMEM;
		return NULL;
	}

	handle = fill_attr_out_obj(cmd, MLX5_IB_ATTR_STEERING_ANCHOR_CREATE_HANDLE);
	fill_attr_const_in(cmd, MLX5_IB_ATTR_STEERING_ANCHOR_FT_TYPE, attr->ft_type);
	fill_attr_in(cmd, MLX5_IB_ATTR_STEERING_ANCHOR_PRIORITY, &attr->priority,
		     sizeof(attr->priority));
	fill_attr_out(cmd, MLX5_IB_ATTR_STEERING_ANCHOR_FT_ID,
		      &steering_anchor->sa.id, sizeof(steering_anchor->sa.id));

	err = execute_ioctl(context, cmd);
	if (err) {
		free(steering_anchor);
		return NULL;
	}

	steering_anchor->context = context;
	steering_anchor->handle =
		read_attr_obj(MLX5_IB_ATTR_STEERING_ANCHOR_CREATE_HANDLE, handle);

	return &steering_anchor->sa;
}

static int _mlx5dv_destroy_steering_anchor(struct mlx5_steering_anchor *anchor)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_STEERING_ANCHOR,
			       MLX5_IB_METHOD_STEERING_ANCHOR_DESTROY, 1);
	int ret;

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_STEERING_ANCHOR_DESTROY_HANDLE,
			 anchor->handle);
	ret = execute_ioctl(anchor->context, cmd);
	if (ret)
		return ret;

	free(anchor);
	return 0;
}

struct mlx5dv_steering_anchor *
mlx5dv_create_steering_anchor(struct ibv_context *context,
			      struct mlx5dv_steering_anchor_attr *attr)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->create_steering_anchor) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->create_steering_anchor(context, attr);
}

int mlx5dv_destroy_steering_anchor(struct mlx5dv_steering_anchor *sa)
{
	struct mlx5_steering_anchor *anchor;
	struct mlx5_dv_context_ops *dvops;

	anchor = container_of(sa, struct mlx5_steering_anchor, sa);
	dvops = mlx5_get_dv_ops(anchor->context);
	if (!dvops || !dvops->destroy_steering_anchor)
		return EOPNOTSUPP;

	return dvops->destroy_steering_anchor(anchor);
}

static struct mlx5dv_devx_umem *
__mlx5dv_devx_umem_reg_ex(struct ibv_context *context,
			  struct mlx5dv_devx_umem_in *in,
			  bool legacy)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_UMEM,
			       MLX5_IB_METHOD_DEVX_UMEM_REG, 7);
	struct ib_uverbs_attr *pgsz_bitmap;
	struct ib_uverbs_attr *handle;
	struct mlx5_devx_umem *umem;
	int ret;

	if (!check_comp_mask(in->comp_mask, MLX5DV_UMEM_MASK_DMABUF)) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	umem = calloc(1, sizeof(*umem));
	if (!umem) {
		errno = ENOMEM;
		return NULL;
	}

	if (ibv_dontfork_range(in->addr, in->size))
		goto err;

	fill_attr_in_uint64(cmd, MLX5_IB_ATTR_DEVX_UMEM_REG_ADDR, (intptr_t)in->addr);
	fill_attr_in_uint64(cmd, MLX5_IB_ATTR_DEVX_UMEM_REG_LEN, in->size);
	fill_attr_in_uint32(cmd, MLX5_IB_ATTR_DEVX_UMEM_REG_ACCESS, in->access);
	if (in->comp_mask & MLX5DV_UMEM_MASK_DMABUF) {
		if (in->dmabuf_fd == -1) {
			errno = EBADF;
			goto err_umem_reg_cmd;
		}
		fill_attr_in_fd(cmd, MLX5_IB_ATTR_DEVX_UMEM_REG_DMABUF_FD,
				in->dmabuf_fd);
	}
	pgsz_bitmap = fill_attr_in_uint64(cmd, MLX5_IB_ATTR_DEVX_UMEM_REG_PGSZ_BITMAP,
					  in->pgsz_bitmap);
	if (legacy)
		attr_optional(pgsz_bitmap);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_UMEM_REG_OUT_ID,
		      &umem->dv_devx_umem.umem_id,
		      sizeof(umem->dv_devx_umem.umem_id));
	handle = fill_attr_out_obj(cmd, MLX5_IB_ATTR_DEVX_UMEM_REG_HANDLE);

	ret = execute_ioctl(context, cmd);
	if (ret)
		goto err_umem_reg_cmd;

	umem->handle = read_attr_obj(MLX5_IB_ATTR_DEVX_UMEM_REG_HANDLE, handle);
	umem->context = context;
	umem->addr = in->addr;
	umem->size = in->size;

	return &umem->dv_devx_umem;

err_umem_reg_cmd:
	ibv_dofork_range(in->addr, in->size);
err:
	free(umem);
	return NULL;
}
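/*
 * Editor's note: __mlx5dv_devx_umem_reg_ex() pins the range with
 * ibv_dontfork_range() before issuing the command and unwinds with
 * ibv_dofork_range() on failure; in the legacy path the page-size bitmap
 * attribute is marked optional so older kernels that do not know
 * MLX5_IB_ATTR_DEVX_UMEM_REG_PGSZ_BITMAP still accept the registration.
 */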
static struct mlx5dv_devx_umem *
_mlx5dv_devx_umem_reg(struct ibv_context *context, void *addr, size_t size,
		      uint32_t access)
{
	struct mlx5dv_devx_umem_in umem_in = {};

	umem_in.access = access;
	umem_in.addr = addr;
	umem_in.size = size;
	umem_in.pgsz_bitmap = UINT64_MAX & ~(MLX5_ADAPTER_PAGE_SIZE - 1);

	return __mlx5dv_devx_umem_reg_ex(context, &umem_in, true);
}

static struct mlx5dv_devx_umem *
_mlx5dv_devx_umem_reg_ex(struct ibv_context *ctx, struct mlx5dv_devx_umem_in *umem_in)
{
	return __mlx5dv_devx_umem_reg_ex(ctx, umem_in, false);
}

struct mlx5dv_devx_umem *
mlx5dv_devx_umem_reg_ex(struct ibv_context *ctx, struct mlx5dv_devx_umem_in *umem_in)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ctx);

	if (!dvops || !dvops->devx_umem_reg_ex) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->devx_umem_reg_ex(ctx, umem_in);
}

struct mlx5dv_devx_umem *
mlx5dv_devx_umem_reg(struct ibv_context *context, void *addr, size_t size,
		     uint32_t access)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->devx_umem_reg) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->devx_umem_reg(context, addr, size, access);
}

static int _mlx5dv_devx_umem_dereg(struct mlx5dv_devx_umem *dv_devx_umem)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_UMEM,
			       MLX5_IB_METHOD_DEVX_UMEM_DEREG, 1);
	int ret;
	struct mlx5_devx_umem *umem =
		container_of(dv_devx_umem, struct mlx5_devx_umem, dv_devx_umem);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_UMEM_DEREG_HANDLE, umem->handle);
	ret = execute_ioctl(umem->context, cmd);
	if (ret)
		return ret;

	ibv_dofork_range(umem->addr, umem->size);
	free(umem);
	return 0;
}

int mlx5dv_devx_umem_dereg(struct mlx5dv_devx_umem *dv_devx_umem)
{
	struct mlx5_devx_umem *umem =
		container_of(dv_devx_umem, struct mlx5_devx_umem, dv_devx_umem);
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(umem->context);

	if (!dvops || !dvops->devx_umem_dereg)
		return EOPNOTSUPP;

	return dvops->devx_umem_dereg(dv_devx_umem);
}
static void set_devx_obj_info(const void *in, const void *out,
			      struct mlx5dv_devx_obj *obj)
{
	uint16_t opcode;
	uint16_t obj_type;

	opcode = DEVX_GET(general_obj_in_cmd_hdr, in, opcode);

	switch (opcode) {
	case MLX5_CMD_OP_CREATE_FLOW_TABLE:
		obj->type = MLX5_DEVX_FLOW_TABLE;
		obj->object_id = DEVX_GET(create_flow_table_out, out, table_id);
		break;
	case MLX5_CMD_OP_CREATE_FLOW_GROUP:
		obj->type = MLX5_DEVX_FLOW_GROUP;
		obj->object_id = DEVX_GET(create_flow_group_out, out, group_id);
		break;
	case MLX5_CMD_OP_SET_FLOW_TABLE_ENTRY:
		obj->type = MLX5_DEVX_FLOW_TABLE_ENTRY;
		obj->object_id = DEVX_GET(set_fte_in, in, flow_index);
		break;
	case MLX5_CMD_OP_CREATE_FLOW_COUNTER:
		obj->type = MLX5_DEVX_FLOW_COUNTER;
		obj->object_id = DEVX_GET(alloc_flow_counter_out, out, flow_counter_id);
		break;
	case MLX5_CMD_OP_CREATE_GENERAL_OBJECT:
		obj_type = DEVX_GET(general_obj_in_cmd_hdr, in, obj_type);
		if (obj_type == MLX5_OBJ_TYPE_FLOW_METER)
			obj->type = MLX5_DEVX_FLOW_METER;
		else if (obj_type == MLX5_OBJ_TYPE_FLOW_SAMPLER)
			obj->type = MLX5_DEVX_FLOW_SAMPLER;
		else if (obj_type == MLX5_OBJ_TYPE_ASO_FIRST_HIT)
			obj->type = MLX5_DEVX_ASO_FIRST_HIT;
		else if (obj_type == MLX5_OBJ_TYPE_ASO_FLOW_METER)
			obj->type = MLX5_DEVX_ASO_FLOW_METER;
		else if (obj_type == MLX5_OBJ_TYPE_ASO_CT)
			obj->type = MLX5_DEVX_ASO_CT;
		obj->log_obj_range = DEVX_GET(general_obj_in_cmd_hdr, in, log_obj_range);
		obj->object_id = DEVX_GET(general_obj_out_cmd_hdr, out, obj_id);
		break;
	case MLX5_CMD_OP_CREATE_QP:
		obj->type = MLX5_DEVX_QP;
		obj->object_id = DEVX_GET(create_qp_out, out, qpn);
		break;
	case MLX5_CMD_OP_CREATE_TIR:
		obj->type = MLX5_DEVX_TIR;
		obj->object_id = DEVX_GET(create_tir_out, out, tirn);
		obj->rx_icm_addr = DEVX_GET(create_tir_out, out, icm_address_31_0);
		obj->rx_icm_addr |=
			(uint64_t)DEVX_GET(create_tir_out, out, icm_address_39_32) << 32;
		obj->rx_icm_addr |=
			(uint64_t)DEVX_GET(create_tir_out, out, icm_address_63_40) << 40;
		break;
	case MLX5_CMD_OP_ALLOC_PACKET_REFORMAT_CONTEXT:
		obj->type = MLX5_DEVX_PKT_REFORMAT_CTX;
		obj->object_id = DEVX_GET(alloc_packet_reformat_context_out, out,
					  packet_reformat_id);
		break;
	default:
		break;
	}
}

static struct mlx5dv_devx_obj *
_mlx5dv_devx_obj_create(struct ibv_context *context, const void *in,
			size_t inlen, void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_CREATE, 3);
	struct ib_uverbs_attr *handle;
	struct mlx5dv_devx_obj *obj;
	int ret;

	obj = calloc(1, sizeof(*obj));
	if (!obj) {
		errno = ENOMEM;
		return NULL;
	}

	handle = fill_attr_out_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_CREATE_HANDLE);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_CREATE_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OBJ_CREATE_CMD_OUT, out, outlen);

	ret = execute_ioctl(context, cmd);
	if (ret)
		goto err;

	obj->handle = read_attr_obj(MLX5_IB_ATTR_DEVX_OBJ_CREATE_HANDLE, handle);
	obj->context = context;
	set_devx_obj_info(in, out, obj);

	return obj;

err:
	free(obj);
	return NULL;
}

struct mlx5dv_devx_obj *
mlx5dv_devx_obj_create(struct ibv_context *context, const void *in,
		       size_t inlen, void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->devx_obj_create) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->devx_obj_create(context, in, inlen, out, outlen);
}
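/*
 * A hedged usage sketch (editor's comment, not from the original source):
 * DEVX objects are created from raw PRM command payloads built with the
 * DEVX_ST_SZ_DW()/DEVX_SET() helpers used throughout this file. The TIR
 * example below is illustrative; the real context fields depend on the
 * caller's configuration.
 *
 *	uint32_t in[DEVX_ST_SZ_DW(create_tir_in)] = {};
 *	uint32_t out[DEVX_ST_SZ_DW(create_tir_out)] = {};
 *	DEVX_SET(create_tir_in, in, opcode, MLX5_CMD_OP_CREATE_TIR);
 *	// ... fill the TIR context ...
 *	struct mlx5dv_devx_obj *tir =
 *		mlx5dv_devx_obj_create(ctx, in, sizeof(in), out, sizeof(out));
 */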
static int _mlx5dv_devx_obj_query(struct mlx5dv_devx_obj *obj, const void *in,
				  size_t inlen, void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_QUERY, 3);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_HANDLE, obj->handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_OUT, out, outlen);

	return execute_ioctl(obj->context, cmd);
}

int mlx5dv_devx_obj_query(struct mlx5dv_devx_obj *obj, const void *in,
			  size_t inlen, void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(obj->context);

	if (!dvops || !dvops->devx_obj_query)
		return EOPNOTSUPP;

	return dvops->devx_obj_query(obj, in, inlen, out, outlen);
}

static int _mlx5dv_devx_obj_modify(struct mlx5dv_devx_obj *obj, const void *in,
				   size_t inlen, void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_MODIFY, 3);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_HANDLE, obj->handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_OUT, out, outlen);

	return execute_ioctl(obj->context, cmd);
}

int mlx5dv_devx_obj_modify(struct mlx5dv_devx_obj *obj, const void *in,
			   size_t inlen, void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(obj->context);

	if (!dvops || !dvops->devx_obj_modify)
		return EOPNOTSUPP;

	return dvops->devx_obj_modify(obj, in, inlen, out, outlen);
}

static int _mlx5dv_devx_obj_destroy(struct mlx5dv_devx_obj *obj)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_DESTROY, 1);
	int ret;

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_DESTROY_HANDLE, obj->handle);
	ret = execute_ioctl(obj->context, cmd);
	if (ret)
		return ret;

	free(obj);
	return 0;
}

int mlx5dv_devx_obj_destroy(struct mlx5dv_devx_obj *obj)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(obj->context);

	if (!dvops || !dvops->devx_obj_destroy)
		return EOPNOTSUPP;

	return dvops->devx_obj_destroy(obj);
}

static int _mlx5dv_devx_general_cmd(struct ibv_context *context, const void *in,
				    size_t inlen, void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX,
			       MLX5_IB_METHOD_DEVX_OTHER, 2);

	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OTHER_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OTHER_CMD_OUT, out, outlen);

	return execute_ioctl(context, cmd);
}

int mlx5dv_devx_general_cmd(struct ibv_context *context, const void *in,
			    size_t inlen, void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->devx_general_cmd)
		return EOPNOTSUPP;

	return dvops->devx_general_cmd(context, in, inlen, out, outlen);
}

static int __mlx5dv_query_port(struct ibv_context *context, uint32_t port_num,
			       struct mlx5dv_port *info, size_t info_len)
{
	DECLARE_COMMAND_BUFFER(cmd, UVERBS_OBJECT_DEVICE,
			       MLX5_IB_METHOD_QUERY_PORT, 2);

	fill_attr_in_uint32(cmd, MLX5_IB_ATTR_QUERY_PORT_PORT_NUM, port_num);
	fill_attr_out(cmd, MLX5_IB_ATTR_QUERY_PORT, info, info_len);

	return execute_ioctl(context, cmd);
}

int _mlx5dv_query_port(struct ibv_context *context, uint32_t port_num,
		       struct mlx5dv_port *info, size_t info_len)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->query_port)
		return EOPNOTSUPP;

	return dvops->query_port(context, port_num, info, info_len);
}
void clean_dyn_uars(struct ibv_context *context)
{
	struct mlx5_context *ctx = to_mctx(context);
	struct mlx5_bf *bf, *tmp_bf;

	list_for_each_safe(&ctx->dyn_uar_bf_list, bf, tmp_bf, uar_entry) {
		list_del(&bf->uar_entry);
		mlx5_free_uar(context, bf);
	}
	list_for_each_safe(&ctx->dyn_uar_db_list, bf, tmp_bf, uar_entry) {
		list_del(&bf->uar_entry);
		mlx5_free_uar(context, bf);
	}
	list_for_each_safe(&ctx->dyn_uar_qp_dedicated_list, bf, tmp_bf, uar_entry) {
		list_del(&bf->uar_entry);
		mlx5_free_uar(context, bf);
	}
	list_for_each_safe(&ctx->dyn_uar_qp_shared_list, bf, tmp_bf, uar_entry) {
		list_del(&bf->uar_entry);
		mlx5_free_uar(context, bf);
	}

	if (ctx->nc_uar)
		mlx5_free_uar(context, ctx->nc_uar);
}

static struct mlx5dv_devx_uar *
_mlx5dv_devx_alloc_uar(struct ibv_context *context, uint32_t flags)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX,
			       MLX5_IB_METHOD_DEVX_QUERY_UAR, 2);
	int ret;
	struct mlx5_bf *bf;

	if (!check_comp_mask(flags, MLX5_IB_UAPI_UAR_ALLOC_TYPE_NC |
			     MLX5DV_UAR_ALLOC_TYPE_NC_DEDICATED)) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	if (flags & MLX5_IB_UAPI_UAR_ALLOC_TYPE_NC)
		return mlx5_get_singleton_nc_uar(context);

	if (flags & MLX5DV_UAR_ALLOC_TYPE_NC_DEDICATED)
		flags = MLX5_IB_UAPI_UAR_ALLOC_TYPE_NC;

	bf = mlx5_attach_dedicated_uar(context, flags);
	if (!bf)
		return NULL;

	if (bf->dyn_alloc_uar)
		bf->devx_uar.dv_devx_uar.page_id = bf->page_id;
	else {
		fill_attr_in_uint32(cmd, MLX5_IB_ATTR_DEVX_QUERY_UAR_USER_IDX,
				    bf->bfreg_dyn_index);
		fill_attr_out_ptr(cmd, MLX5_IB_ATTR_DEVX_QUERY_UAR_DEV_IDX,
				  &bf->devx_uar.dv_devx_uar.page_id);
		ret = execute_ioctl(context, cmd);
		if (ret) {
			mlx5_detach_dedicated_uar(context, bf);
			return NULL;
		}
	}

	bf->devx_uar.dv_devx_uar.reg_addr = bf->reg;
	bf->devx_uar.dv_devx_uar.base_addr = bf->uar;
	bf->devx_uar.dv_devx_uar.mmap_off = bf->uar_mmap_offset;
	bf->devx_uar.dv_devx_uar.comp_mask = 0;
	bf->devx_uar.context = context;
	return &bf->devx_uar.dv_devx_uar;
}

struct mlx5dv_devx_uar *
mlx5dv_devx_alloc_uar(struct ibv_context *context, uint32_t flags)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->devx_alloc_uar) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->devx_alloc_uar(context, flags);
}

static void _mlx5dv_devx_free_uar(struct mlx5dv_devx_uar *dv_devx_uar)
{
	struct mlx5_bf *bf = container_of(dv_devx_uar, struct mlx5_bf,
					  devx_uar.dv_devx_uar);

	if (bf->singleton)
		return;

	mlx5_detach_dedicated_uar(bf->devx_uar.context, bf);
}

void mlx5dv_devx_free_uar(struct mlx5dv_devx_uar *dv_devx_uar)
{
	struct mlx5_devx_uar *uar = container_of(dv_devx_uar,
						 struct mlx5_devx_uar,
						 dv_devx_uar);
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(uar->context);

	if (!dvops || !dvops->devx_free_uar)
		return;

	dvops->devx_free_uar(dv_devx_uar);
}

static int _mlx5dv_devx_query_eqn(struct ibv_context *context, uint32_t vector,
				  uint32_t *eqn)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX,
			       MLX5_IB_METHOD_DEVX_QUERY_EQN, 2);

	fill_attr_in_uint32(cmd, MLX5_IB_ATTR_DEVX_QUERY_EQN_USER_VEC, vector);
	fill_attr_out_ptr(cmd, MLX5_IB_ATTR_DEVX_QUERY_EQN_DEV_EQN, eqn);

	return execute_ioctl(context, cmd);
}

int mlx5dv_devx_query_eqn(struct ibv_context *context, uint32_t vector,
			  uint32_t *eqn)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->devx_query_eqn)
		return EOPNOTSUPP;

	return dvops->devx_query_eqn(context, vector, eqn);
}
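/*
 * A hedged sketch (editor's comment, not from the original source): the EQN
 * returned for a completion vector is what a DEVX-created CQ is bound to.
 * The c_eqn field name below is an assumption about the PRM cqc layout.
 *
 *	uint32_t eqn;
 *	if (!mlx5dv_devx_query_eqn(ctx, 0, &eqn))
 *		DEVX_SET(cqc, cq_ctx, c_eqn, eqn);	// cq_ctx: assumed CQ context
 */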
static int _mlx5dv_devx_cq_query(struct ibv_cq *cq, const void *in,
				 size_t inlen, void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_QUERY, 3);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_HANDLE, cq->handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_OUT, out, outlen);

	return execute_ioctl(cq->context, cmd);
}

int mlx5dv_devx_cq_query(struct ibv_cq *cq, const void *in, size_t inlen,
			 void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(cq->context);

	if (!dvops || !dvops->devx_cq_query)
		return EOPNOTSUPP;

	return dvops->devx_cq_query(cq, in, inlen, out, outlen);
}

static int _mlx5dv_devx_cq_modify(struct ibv_cq *cq, const void *in,
				  size_t inlen, void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_MODIFY, 3);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_HANDLE, cq->handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_OUT, out, outlen);

	return execute_ioctl(cq->context, cmd);
}

int mlx5dv_devx_cq_modify(struct ibv_cq *cq, const void *in, size_t inlen,
			  void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(cq->context);

	if (!dvops || !dvops->devx_cq_modify)
		return EOPNOTSUPP;

	return dvops->devx_cq_modify(cq, in, inlen, out, outlen);
}

static int _mlx5dv_devx_qp_query(struct ibv_qp *qp, const void *in,
				 size_t inlen, void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_QUERY, 3);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_HANDLE, qp->handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_OUT, out, outlen);

	return execute_ioctl(qp->context, cmd);
}

int mlx5dv_devx_qp_query(struct ibv_qp *qp, const void *in, size_t inlen,
			 void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(qp->context);

	if (!dvops || !dvops->devx_qp_query)
		return EOPNOTSUPP;

	return dvops->devx_qp_query(qp, in, inlen, out, outlen);
}

static int _mlx5dv_devx_qp_modify(struct ibv_qp *qp, const void *in,
				  size_t inlen, void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_MODIFY, 3);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_HANDLE, qp->handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_OUT, out, outlen);

	return execute_ioctl(qp->context, cmd);
}

static enum ibv_qp_state modify_opcode_to_state(uint16_t opcode)
{
	switch (opcode) {
	case MLX5_CMD_OP_INIT2INIT_QP:
	case MLX5_CMD_OP_RST2INIT_QP:
		return IBV_QPS_INIT;
	case MLX5_CMD_OP_INIT2RTR_QP:
		return IBV_QPS_RTR;
	case MLX5_CMD_OP_RTR2RTS_QP:
	case MLX5_CMD_OP_RTS2RTS_QP:
	case MLX5_CMD_OP_SQERR2RTS_QP:
	case MLX5_CMD_OP_SQD_RTS_QP:
		return IBV_QPS_RTS;
	case MLX5_CMD_OP_2ERR_QP:
		return IBV_QPS_ERR;
	case MLX5_CMD_OP_2RST_QP:
		return IBV_QPS_RESET;
	default:
		return IBV_QPS_UNKNOWN;
	}
}

int mlx5dv_devx_qp_modify(struct ibv_qp *qp, const void *in, size_t inlen,
			  void *out, size_t outlen)
{
	int ret;
	enum ibv_qp_state qp_state;
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(qp->context);

	if (!dvops || !dvops->devx_qp_modify)
		return EOPNOTSUPP;

	ret = dvops->devx_qp_modify(qp, in, inlen, out, outlen);
	if (ret)
		return ret;

	qp_state = modify_opcode_to_state(DEVX_GET(rtr2rts_qp_in, in, opcode));
	set_qp_operational_state(to_mqp(qp), qp_state);

	return 0;
}
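/*
 * Editor's note: mlx5dv_devx_qp_modify() above mirrors the PRM transition
 * opcode into the provider's QP state tracking. Reading the opcode through
 * the rtr2rts_qp_in layout works for every *2*_QP command because they all
 * share the same command header layout for the opcode field.
 */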
static int _mlx5dv_devx_srq_query(struct ibv_srq *srq, const void *in,
				  size_t inlen, void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_QUERY, 3);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_HANDLE, srq->handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_OUT, out, outlen);

	return execute_ioctl(srq->context, cmd);
}

int mlx5dv_devx_srq_query(struct ibv_srq *srq, const void *in, size_t inlen,
			  void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(srq->context);

	if (!dvops || !dvops->devx_srq_query)
		return EOPNOTSUPP;

	return dvops->devx_srq_query(srq, in, inlen, out, outlen);
}

static int _mlx5dv_devx_srq_modify(struct ibv_srq *srq, const void *in,
				   size_t inlen, void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_MODIFY, 3);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_HANDLE, srq->handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_OUT, out, outlen);

	return execute_ioctl(srq->context, cmd);
}

int mlx5dv_devx_srq_modify(struct ibv_srq *srq, const void *in, size_t inlen,
			   void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(srq->context);

	if (!dvops || !dvops->devx_srq_modify)
		return EOPNOTSUPP;

	return dvops->devx_srq_modify(srq, in, inlen, out, outlen);
}

static int _mlx5dv_devx_wq_query(struct ibv_wq *wq, const void *in, size_t inlen,
				 void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_QUERY, 3);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_HANDLE, wq->handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_OUT, out, outlen);

	return execute_ioctl(wq->context, cmd);
}

int mlx5dv_devx_wq_query(struct ibv_wq *wq, const void *in, size_t inlen,
			 void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(wq->context);

	if (!dvops || !dvops->devx_wq_query)
		return EOPNOTSUPP;

	return dvops->devx_wq_query(wq, in, inlen, out, outlen);
}

static int _mlx5dv_devx_wq_modify(struct ibv_wq *wq, const void *in,
				  size_t inlen, void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_MODIFY, 3);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_HANDLE, wq->handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_OUT, out, outlen);

	return execute_ioctl(wq->context, cmd);
}

int mlx5dv_devx_wq_modify(struct ibv_wq *wq, const void *in, size_t inlen,
			  void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(wq->context);

	if (!dvops || !dvops->devx_wq_modify)
		return EOPNOTSUPP;

	return dvops->devx_wq_modify(wq, in, inlen, out, outlen);
}

static int _mlx5dv_devx_ind_tbl_query(struct ibv_rwq_ind_table *ind_tbl,
				      const void *in, size_t inlen,
				      void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_QUERY, 3);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_HANDLE,
			 ind_tbl->ind_tbl_handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_CMD_OUT, out, outlen);

	return execute_ioctl(ind_tbl->context, cmd);
}

int mlx5dv_devx_ind_tbl_query(struct ibv_rwq_ind_table *ind_tbl, const void *in,
			      size_t inlen, void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ind_tbl->context);

	if (!dvops || !dvops->devx_ind_tbl_query)
		return EOPNOTSUPP;

	return dvops->devx_ind_tbl_query(ind_tbl, in, inlen, out, outlen);
}

static int _mlx5dv_devx_ind_tbl_modify(struct ibv_rwq_ind_table *ind_tbl,
				       const void *in, size_t inlen,
				       void *out, size_t outlen)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_MODIFY, 3);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_HANDLE,
			 ind_tbl->ind_tbl_handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_IN, in, inlen);
	fill_attr_out(cmd, MLX5_IB_ATTR_DEVX_OBJ_MODIFY_CMD_OUT, out, outlen);

	return execute_ioctl(ind_tbl->context, cmd);
}

int mlx5dv_devx_ind_tbl_modify(struct ibv_rwq_ind_table *ind_tbl, const void *in,
			       size_t inlen, void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ind_tbl->context);

	if (!dvops || !dvops->devx_ind_tbl_modify)
		return EOPNOTSUPP;

	return dvops->devx_ind_tbl_modify(ind_tbl, in, inlen, out, outlen);
}

static struct mlx5dv_devx_cmd_comp *
_mlx5dv_devx_create_cmd_comp(struct ibv_context *context)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_ASYNC_CMD_FD,
			       MLX5_IB_METHOD_DEVX_ASYNC_CMD_FD_ALLOC, 1);
	struct ib_uverbs_attr *handle;
	struct mlx5dv_devx_cmd_comp *cmd_comp;
	int ret;

	cmd_comp = calloc(1, sizeof(*cmd_comp));
	if (!cmd_comp) {
		errno = ENOMEM;
		return NULL;
	}

	handle = fill_attr_out_fd(cmd, MLX5_IB_ATTR_DEVX_ASYNC_CMD_FD_ALLOC_HANDLE, 0);

	ret = execute_ioctl(context, cmd);
	if (ret)
		goto err;

	cmd_comp->fd = read_attr_fd(MLX5_IB_ATTR_DEVX_ASYNC_CMD_FD_ALLOC_HANDLE,
				    handle);
	return cmd_comp;

err:
	free(cmd_comp);
	return NULL;
}

struct mlx5dv_devx_cmd_comp *
mlx5dv_devx_create_cmd_comp(struct ibv_context *context)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->devx_create_cmd_comp) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->devx_create_cmd_comp(context);
}

static void _mlx5dv_devx_destroy_cmd_comp(struct mlx5dv_devx_cmd_comp *cmd_comp)
{
	close(cmd_comp->fd);
	free(cmd_comp);
}

void mlx5dv_devx_destroy_cmd_comp(struct mlx5dv_devx_cmd_comp *cmd_comp)
{
	_mlx5dv_devx_destroy_cmd_comp(cmd_comp);
}
static struct mlx5dv_devx_event_channel *
_mlx5dv_devx_create_event_channel(struct ibv_context *context,
				  enum mlx5dv_devx_create_event_channel_flags flags)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_ASYNC_EVENT_FD,
			       MLX5_IB_METHOD_DEVX_ASYNC_EVENT_FD_ALLOC, 2);
	struct ib_uverbs_attr *handle;
	struct mlx5_devx_event_channel *event_channel;
	int ret;

	event_channel = calloc(1, sizeof(*event_channel));
	if (!event_channel) {
		errno = ENOMEM;
		return NULL;
	}

	handle = fill_attr_out_fd(cmd, MLX5_IB_ATTR_DEVX_ASYNC_EVENT_FD_ALLOC_HANDLE, 0);
	fill_attr_in_uint32(cmd, MLX5_IB_ATTR_DEVX_ASYNC_EVENT_FD_ALLOC_FLAGS, flags);

	ret = execute_ioctl(context, cmd);
	if (ret)
		goto err;

	event_channel->dv_event_channel.fd =
		read_attr_fd(MLX5_IB_ATTR_DEVX_ASYNC_EVENT_FD_ALLOC_HANDLE, handle);
	event_channel->context = context;
	return &event_channel->dv_event_channel;

err:
	free(event_channel);
	return NULL;
}

struct mlx5dv_devx_event_channel *
mlx5dv_devx_create_event_channel(struct ibv_context *context,
				 enum mlx5dv_devx_create_event_channel_flags flags)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->devx_create_event_channel) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->devx_create_event_channel(context, flags);
}

static void _mlx5dv_devx_destroy_event_channel(
	struct mlx5dv_devx_event_channel *dv_event_channel)
{
	struct mlx5_devx_event_channel *event_channel =
		container_of(dv_event_channel, struct mlx5_devx_event_channel,
			     dv_event_channel);

	close(dv_event_channel->fd);
	free(event_channel);
}

void mlx5dv_devx_destroy_event_channel(
	struct mlx5dv_devx_event_channel *dv_event_channel)
{
	struct mlx5_devx_event_channel *ech =
		container_of(dv_event_channel, struct mlx5_devx_event_channel,
			     dv_event_channel);
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ech->context);

	if (!dvops || !dvops->devx_destroy_event_channel)
		return;

	return dvops->devx_destroy_event_channel(dv_event_channel);
}

static int _mlx5dv_devx_subscribe_devx_event(
	struct mlx5dv_devx_event_channel *dv_event_channel,
	struct mlx5dv_devx_obj *obj, /* can be NULL for unaffiliated events */
	uint16_t events_sz, uint16_t events_num[], uint64_t cookie)
{
	struct mlx5_devx_event_channel *event_channel =
		container_of(dv_event_channel, struct mlx5_devx_event_channel,
			     dv_event_channel);
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX,
			       MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT, 4);

	fill_attr_in_fd(cmd, MLX5_IB_ATTR_DEVX_SUBSCRIBE_EVENT_FD_HANDLE,
			dv_event_channel->fd);
	fill_attr_in_uint64(cmd, MLX5_IB_ATTR_DEVX_SUBSCRIBE_EVENT_COOKIE, cookie);
	if (obj)
		fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_SUBSCRIBE_EVENT_OBJ_HANDLE,
				 obj->handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_SUBSCRIBE_EVENT_TYPE_NUM_LIST,
		     events_num, events_sz);

	return execute_ioctl(event_channel->context, cmd);
}

int mlx5dv_devx_subscribe_devx_event(
	struct mlx5dv_devx_event_channel *dv_event_channel,
	struct mlx5dv_devx_obj *obj, /* can be NULL for unaffiliated events */
	uint16_t events_sz, uint16_t events_num[], uint64_t cookie)
{
	struct mlx5_devx_event_channel *event_channel =
		container_of(dv_event_channel, struct mlx5_devx_event_channel,
			     dv_event_channel);
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(event_channel->context);

	if (!dvops || !dvops->devx_subscribe_devx_event)
		return EOPNOTSUPP;

	return dvops->devx_subscribe_devx_event(dv_event_channel, obj,
						events_sz, events_num, cookie);
}
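/*
 * Editor's note: events subscribed above are delivered through the channel
 * fd and fetched with mlx5dv_devx_get_event() further down; the cookie is
 * echoed back in struct mlx5dv_devx_async_event_hdr so callers can
 * demultiplex several subscriptions sharing one channel.
 */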
static int _mlx5dv_devx_subscribe_devx_event_fd(
	struct mlx5dv_devx_event_channel *dv_event_channel, int fd,
	struct mlx5dv_devx_obj *obj, /* can be NULL for unaffiliated events */
	uint16_t event_num)
{
	struct mlx5_devx_event_channel *event_channel =
		container_of(dv_event_channel, struct mlx5_devx_event_channel,
			     dv_event_channel);
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX,
			       MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT, 4);

	fill_attr_in_fd(cmd, MLX5_IB_ATTR_DEVX_SUBSCRIBE_EVENT_FD_HANDLE,
			dv_event_channel->fd);
	if (obj)
		fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_SUBSCRIBE_EVENT_OBJ_HANDLE,
				 obj->handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_SUBSCRIBE_EVENT_TYPE_NUM_LIST,
		     &event_num, sizeof(event_num));
	fill_attr_in_uint32(cmd, MLX5_IB_ATTR_DEVX_SUBSCRIBE_EVENT_FD_NUM, fd);

	return execute_ioctl(event_channel->context, cmd);
}

int mlx5dv_devx_subscribe_devx_event_fd(
	struct mlx5dv_devx_event_channel *dv_event_channel, int fd,
	struct mlx5dv_devx_obj *obj, /* can be NULL for unaffiliated events */
	uint16_t event_num)
{
	struct mlx5_devx_event_channel *event_channel =
		container_of(dv_event_channel, struct mlx5_devx_event_channel,
			     dv_event_channel);
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(event_channel->context);

	if (!dvops || !dvops->devx_subscribe_devx_event_fd)
		return EOPNOTSUPP;

	return dvops->devx_subscribe_devx_event_fd(dv_event_channel, fd,
						   obj, event_num);
}

static int _mlx5dv_devx_obj_query_async(struct mlx5dv_devx_obj *obj,
					const void *in, size_t inlen,
					size_t outlen, uint64_t wr_id,
					struct mlx5dv_devx_cmd_comp *cmd_comp)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_DEVX_OBJ,
			       MLX5_IB_METHOD_DEVX_OBJ_ASYNC_QUERY, 5);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_ASYNC_HANDLE, obj->handle);
	fill_attr_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_ASYNC_CMD_IN, in, inlen);
	fill_attr_const_in(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_ASYNC_OUT_LEN, outlen);
	fill_attr_in_uint64(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_ASYNC_WR_ID, wr_id);
	fill_attr_in_fd(cmd, MLX5_IB_ATTR_DEVX_OBJ_QUERY_ASYNC_FD, cmd_comp->fd);

	return execute_ioctl(obj->context, cmd);
}

int mlx5dv_devx_obj_query_async(struct mlx5dv_devx_obj *obj, const void *in,
				size_t inlen, size_t outlen, uint64_t wr_id,
				struct mlx5dv_devx_cmd_comp *cmd_comp)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(obj->context);

	if (!dvops || !dvops->devx_obj_query_async)
		return EOPNOTSUPP;

	return dvops->devx_obj_query_async(obj, in, inlen, outlen,
					   wr_id, cmd_comp);
}

static int _mlx5dv_devx_get_async_cmd_comp(struct mlx5dv_devx_cmd_comp *cmd_comp,
					   struct mlx5dv_devx_async_cmd_hdr *cmd_resp,
					   size_t cmd_resp_len)
{
	ssize_t bytes;

	bytes = read(cmd_comp->fd, cmd_resp, cmd_resp_len);
	if (bytes < 0)
		return errno;

	if (bytes < sizeof(*cmd_resp))
		return EINVAL;

	return 0;
}

int mlx5dv_devx_get_async_cmd_comp(struct mlx5dv_devx_cmd_comp *cmd_comp,
				   struct mlx5dv_devx_async_cmd_hdr *cmd_resp,
				   size_t cmd_resp_len)
{
	return _mlx5dv_devx_get_async_cmd_comp(cmd_comp, cmd_resp, cmd_resp_len);
}

static int mlx5_destroy_sig_psvs(struct mlx5_sig_ctx *sig)
{
	int ret = 0;

	if (sig->block.mem_psv) {
		ret = mlx5_destroy_psv(sig->block.mem_psv);
		if (!ret)
			sig->block.mem_psv = NULL;
	}
	if (!ret && sig->block.wire_psv) {
		ret = mlx5_destroy_psv(sig->block.wire_psv);
		if (!ret)
			sig->block.wire_psv = NULL;
	}

	return ret;
}

static int mlx5_create_sig_psvs(struct ibv_pd *pd,
				struct mlx5dv_mkey_init_attr *attr,
				struct mlx5_sig_ctx *sig)
{
	int err;

	if (attr->create_flags & MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE) {
		sig->block.mem_psv = mlx5_create_psv(pd);
		if (!sig->block.mem_psv)
			return errno;

		sig->block.wire_psv = mlx5_create_psv(pd);
		if (!sig->block.wire_psv) {
			err = errno;
			goto err_destroy_psvs;
		}
	}

	return 0;

err_destroy_psvs:
	mlx5_destroy_sig_psvs(sig);
	return err;
}
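/*
 * Editor's note: a signature-enabled MKEY needs one PSV (protection
 * signature value) object per domain; mlx5_create_sig_psvs() above
 * allocates the memory-domain and wire-domain PSVs together and unwinds
 * via mlx5_destroy_sig_psvs() if the second allocation fails.
 */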
static struct mlx5_sig_ctx *mlx5_create_sig_ctx(struct ibv_pd *pd,
						struct mlx5dv_mkey_init_attr *attr)
{
	struct mlx5_sig_ctx *sig;
	int err;

	if (!to_mctx(pd->context)->sig_caps.block_prot) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	sig = calloc(1, sizeof(*sig));
	if (!sig) {
		errno = ENOMEM;
		return NULL;
	}

	err = mlx5_create_sig_psvs(pd, attr, sig);
	if (err) {
		errno = err;
		goto err_free_sig;
	}

	sig->err_exists = false;
	sig->err_count = 1;
	sig->err_count_updated = true;

	return sig;

err_free_sig:
	free(sig);
	return NULL;
}

static int mlx5_destroy_sig_ctx(struct mlx5_sig_ctx *sig)
{
	int ret;

	ret = mlx5_destroy_sig_psvs(sig);
	if (!ret)
		free(sig);

	return ret;
}

static ssize_t _mlx5dv_devx_get_event(struct mlx5dv_devx_event_channel *event_channel,
				      struct mlx5dv_devx_async_event_hdr *event_data,
				      size_t event_resp_len)
{
	ssize_t bytes;

	bytes = read(event_channel->fd, event_data, event_resp_len);
	if (bytes < 0)
		return -1;

	/* the cookie must always be present */
	if (bytes < sizeof(*event_data)) {
		errno = EINVAL;
		return -1;
	}

	/*
	 * Event data may be omitted when no EQE data exists
	 * (e.g. a completion event on a CQ).
	 */
	return bytes;
}

ssize_t mlx5dv_devx_get_event(struct mlx5dv_devx_event_channel *event_channel,
			      struct mlx5dv_devx_async_event_hdr *event_data,
			      size_t event_resp_len)
{
	return _mlx5dv_devx_get_event(event_channel, event_data, event_resp_len);
}
static struct mlx5dv_mkey *
_mlx5dv_create_mkey(struct mlx5dv_mkey_init_attr *mkey_init_attr)
{
	uint32_t out[DEVX_ST_SZ_DW(create_mkey_out)] = {};
	uint32_t in[DEVX_ST_SZ_DW(create_mkey_in)] = {};
	struct mlx5_mkey *mkey;
	bool update_tag;
	bool sig_mkey;
	bool crypto_mkey;
	struct ibv_pd *pd = mkey_init_attr->pd;
	size_t bsf_size = 0;
	void *mkc;

	update_tag = to_mctx(pd->context)->flags &
		     MLX5_CTX_FLAGS_MKEY_UPDATE_TAG_SUPPORTED;
	if (!mkey_init_attr->create_flags ||
	    !check_comp_mask(mkey_init_attr->create_flags,
			     MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT |
			     MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE |
			     MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO |
			     (update_tag ?
			      MLX5DV_MKEY_INIT_ATTR_FLAGS_UPDATE_TAG : 0) |
			     MLX5DV_MKEY_INIT_ATTR_FLAGS_REMOTE_INVALIDATE)) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	mkey = calloc(1, sizeof(*mkey));
	if (!mkey) {
		errno = ENOMEM;
		return NULL;
	}

	sig_mkey = mkey_init_attr->create_flags &
		   MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE;
	if (sig_mkey) {
		mkey->sig = mlx5_create_sig_ctx(pd, mkey_init_attr);
		if (!mkey->sig)
			goto err_free_mkey;
		bsf_size += sizeof(struct mlx5_bsf);
	}

	crypto_mkey = mkey_init_attr->create_flags &
		      MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO;
	if (crypto_mkey) {
		if (!(to_mctx(pd->context)->crypto_caps.crypto_engines &
		      (MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_SINGLE_BLOCK |
		       MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_MULTI_BLOCK))) {
			errno = EOPNOTSUPP;
			goto err_destroy_sig_ctx;
		}

		mkey->crypto = calloc(1, sizeof(*mkey->crypto));
		if (!mkey->crypto) {
			errno = ENOMEM;
			goto err_destroy_sig_ctx;
		}

		bsf_size += sizeof(struct mlx5_crypto_bsf);
	}

	mkey->num_desc = align(mkey_init_attr->max_entries, 4);
	DEVX_SET(create_mkey_in, in, opcode, MLX5_CMD_OP_CREATE_MKEY);
	mkc = DEVX_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry);
	DEVX_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_KLMS);
	DEVX_SET(mkc, mkc, free, 1);
	DEVX_SET(mkc, mkc, umr_en, 1);
	DEVX_SET(mkc, mkc, pd, to_mpd(pd)->pdn);
	DEVX_SET(mkc, mkc, translations_octword_size, mkey->num_desc);
	DEVX_SET(mkc, mkc, lr, 1);
	DEVX_SET(mkc, mkc, qpn, 0xffffff);
	DEVX_SET(mkc, mkc, mkey_7_0, 0);
	if (crypto_mkey)
		DEVX_SET(mkc, mkc, crypto_en, 1);
	if (sig_mkey || crypto_mkey) {
		DEVX_SET(mkc, mkc, bsf_en, 1);
		DEVX_SET(mkc, mkc, bsf_octword_size, bsf_size / 16);
	}
	if (mkey_init_attr->create_flags &
	    MLX5DV_MKEY_INIT_ATTR_FLAGS_REMOTE_INVALIDATE)
		DEVX_SET(mkc, mkc, en_rinval, 1);

	mkey->devx_obj = mlx5dv_devx_obj_create(pd->context, in, sizeof(in),
						out, sizeof(out));
	if (!mkey->devx_obj) {
		errno = mlx5_get_cmd_status_err(errno, out);
		goto err_free_crypto;
	}

	mkey_init_attr->max_entries = mkey->num_desc;
	mkey->dv_mkey.lkey = (DEVX_GET(create_mkey_out, out, mkey_index) << 8) | 0;
	mkey->dv_mkey.rkey = mkey->dv_mkey.lkey;

	if (mlx5_store_mkey(to_mctx(pd->context), mkey->dv_mkey.lkey >> 8, mkey)) {
		errno = ENOMEM;
		goto err_destroy_mkey_obj;
	}

	return &mkey->dv_mkey;

err_destroy_mkey_obj:
	mlx5dv_devx_obj_destroy(mkey->devx_obj);
err_free_crypto:
	if (crypto_mkey)
		free(mkey->crypto);
err_destroy_sig_ctx:
	if (sig_mkey)
		mlx5_destroy_sig_ctx(mkey->sig);
err_free_mkey:
	free(mkey);
	return NULL;
}

struct mlx5dv_mkey *mlx5dv_create_mkey(struct mlx5dv_mkey_init_attr *mkey_init_attr)
{
	struct mlx5_dv_context_ops *dvops =
		mlx5_get_dv_ops(mkey_init_attr->pd->context);

	if (!dvops || !dvops->create_mkey) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->create_mkey(mkey_init_attr);
}

static int _mlx5dv_destroy_mkey(struct mlx5dv_mkey *dv_mkey)
{
	struct mlx5_mkey *mkey = container_of(dv_mkey, struct mlx5_mkey, dv_mkey);
	struct mlx5_context *mctx = to_mctx(mkey->devx_obj->context);
	int ret;

	if (mkey->sig) {
		ret = mlx5_destroy_sig_ctx(mkey->sig);
		if (ret)
			return ret;

		mkey->sig = NULL;
	}

	ret = mlx5dv_devx_obj_destroy(mkey->devx_obj);
	if (ret)
		return ret;

	if (mkey->crypto)
		free(mkey->crypto);
	mlx5_clear_mkey(mctx, dv_mkey->lkey >> 8);
	free(mkey);
	return 0;
}

int mlx5dv_destroy_mkey(struct mlx5dv_mkey *dv_mkey)
{
	struct mlx5_mkey *mkey = container_of(dv_mkey, struct mlx5_mkey, dv_mkey);
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(mkey->devx_obj->context);

	if (!dvops || !dvops->destroy_mkey)
		return EOPNOTSUPP;

	return dvops->destroy_mkey(dv_mkey);
}
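/*
 * Editor's note: the lkey returned above is the firmware mkey_index shifted
 * left by 8 with a zero variant byte in bits 7..0, which is why
 * mlx5_store_mkey()/mlx5_clear_mkey() index the mkey table by lkey >> 8.
 */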
enum {
	MLX5_SIGERR_CQE_SYNDROME_REFTAG = 1 << 11,
	MLX5_SIGERR_CQE_SYNDROME_APPTAG = 1 << 12,
	MLX5_SIGERR_CQE_SYNDROME_GUARD = 1 << 13,

	MLX5_SIGERR_CQE_SIG_TYPE_BLOCK = 0,
	MLX5_SIGERR_CQE_SIG_TYPE_TRANSACTION = 1,

	MLX5_SIGERR_CQE_DOMAIN_WIRE = 0,
	MLX5_SIGERR_CQE_DOMAIN_MEMORY = 1,
};

static void mlx5_decode_sigerr(struct mlx5_sig_err *mlx5_err,
			       struct mlx5_sig_block_domain *bd,
			       struct mlx5dv_mkey_err *err_info)
{
	struct mlx5dv_sig_err *dv_err = &err_info->err.sig;

	dv_err->offset = mlx5_err->offset;

	if (mlx5_err->syndrome & MLX5_SIGERR_CQE_SYNDROME_REFTAG) {
		err_info->err_type = MLX5DV_MKEY_SIG_BLOCK_BAD_REFTAG;
		dv_err->expected_value = mlx5_err->expected & 0xffffffff;
		dv_err->actual_value = mlx5_err->actual & 0xffffffff;
	} else if (mlx5_err->syndrome & MLX5_SIGERR_CQE_SYNDROME_APPTAG) {
		err_info->err_type = MLX5DV_MKEY_SIG_BLOCK_BAD_APPTAG;
		dv_err->expected_value = (mlx5_err->expected >> 32) & 0xffff;
		dv_err->actual_value = (mlx5_err->actual >> 32) & 0xffff;
	} else {
		err_info->err_type = MLX5DV_MKEY_SIG_BLOCK_BAD_GUARD;
		if (bd->sig_type == MLX5_SIG_TYPE_T10DIF) {
			dv_err->expected_value = mlx5_err->expected >> 48;
			dv_err->actual_value = mlx5_err->actual >> 48;
		} else if (bd->sig.crc.type == MLX5DV_SIG_CRC_TYPE_CRC64_XP10) {
			dv_err->expected_value = mlx5_err->expected;
			dv_err->actual_value = mlx5_err->actual;
		} else {
			/* CRC32 or CRC32C */
			dv_err->expected_value = mlx5_err->expected >> 32;
			dv_err->actual_value = mlx5_err->actual >> 32;
		}
	}
}

int _mlx5dv_mkey_check(struct mlx5dv_mkey *dv_mkey,
		       struct mlx5dv_mkey_err *err_info,
		       size_t err_info_size)
{
	struct mlx5_mkey *mkey = container_of(dv_mkey, struct mlx5_mkey, dv_mkey);
	struct mlx5_sig_ctx *sig_ctx = mkey->sig;
	FILE *fp = to_mctx(mkey->devx_obj->context)->dbg_fp;
	struct mlx5_sig_err *sig_err;
	struct mlx5_sig_block_domain *domain;

	if (!sig_ctx)
		return EINVAL;

	if (!sig_ctx->err_exists) {
		err_info->err_type = MLX5DV_MKEY_NO_ERR;
		return 0;
	}

	sig_err = &sig_ctx->err_info;

	if (!(sig_err->syndrome & (MLX5_SIGERR_CQE_SYNDROME_REFTAG |
				   MLX5_SIGERR_CQE_SYNDROME_APPTAG |
				   MLX5_SIGERR_CQE_SYNDROME_GUARD))) {
		mlx5_dbg(fp, MLX5_DBG_CQ,
			 "unknown signature error, syndrome 0x%x\n",
			 sig_err->syndrome);
		return EINVAL;
	}

	if (sig_err->sig_type != MLX5_SIGERR_CQE_SIG_TYPE_BLOCK) {
		mlx5_dbg(fp, MLX5_DBG_CQ,
			 "not supported signature type 0x%x\n",
			 sig_err->sig_type);
		return EINVAL;
	}

	switch (sig_err->domain) {
	case MLX5_SIGERR_CQE_DOMAIN_WIRE:
		domain = &sig_ctx->block.attr.wire;
		break;
	case MLX5_SIGERR_CQE_DOMAIN_MEMORY:
		domain = &sig_ctx->block.attr.mem;
		break;
	default:
		mlx5_dbg(fp, MLX5_DBG_CQ, "unknown signature domain 0x%x\n",
			 sig_err->domain);
		return EINVAL;
	}

	if (domain->sig_type == MLX5_SIG_TYPE_NONE) {
		mlx5_dbg(fp, MLX5_DBG_CQ,
			 "unexpected signature error for non-signature domain\n");
		return EINVAL;
	}

	mlx5_decode_sigerr(sig_err, domain, err_info);
	sig_ctx->err_exists = false;

	return 0;
}
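/*
 * Editor's note: the 64-bit expected/actual values decoded above pack the
 * ref-tag in bits 31..0, the app-tag in bits 47..32 and the T10-DIF guard
 * in bits 63..48, which is why mlx5_decode_sigerr() shifts by 32 or 48
 * depending on the reported syndrome and configured signature type.
 */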
static struct mlx5dv_devx_obj *
crypto_login_create(struct ibv_context *context,
		    struct mlx5dv_crypto_login_attr_ex *login_attr)
{
	uint32_t in[DEVX_ST_SZ_DW(create_crypto_login_obj_in)] = {};
	uint32_t out[DEVX_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
	struct mlx5_context *mctx = to_mctx(context);
	struct mlx5dv_devx_obj *obj;
	void *attr;

	if (!(mctx->crypto_caps.flags & MLX5DV_CRYPTO_CAPS_CRYPTO) ||
	    !(mctx->crypto_caps.flags &
	      MLX5DV_CRYPTO_CAPS_WRAPPED_CRYPTO_OPERATIONAL)) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	if (!(mctx->general_obj_types_caps & (1ULL << MLX5_OBJ_TYPE_CRYPTO_LOGIN))) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	if (login_attr->credential_id & 0xff000000 ||
	    login_attr->import_kek_id & 0xff000000) {
		errno = EINVAL;
		return NULL;
	}

	attr = DEVX_ADDR_OF(create_crypto_login_obj_in, in, hdr);
	DEVX_SET(general_obj_in_cmd_hdr, attr, opcode,
		 MLX5_CMD_OP_CREATE_GENERAL_OBJECT);
	DEVX_SET(general_obj_in_cmd_hdr, attr, obj_type, MLX5_OBJ_TYPE_CRYPTO_LOGIN);

	attr = DEVX_ADDR_OF(create_crypto_login_obj_in, in, login_obj);
	DEVX_SET(crypto_login_obj, attr, credential_pointer,
		 login_attr->credential_id);
	DEVX_SET(crypto_login_obj, attr, session_import_kek_ptr,
		 login_attr->import_kek_id);
	memcpy(DEVX_ADDR_OF(crypto_login_obj, attr, credential),
	       login_attr->credential, login_attr->credential_len);

	obj = mlx5dv_devx_obj_create(context, in, sizeof(in), out, sizeof(out));
	if (!obj)
		errno = mlx5_get_cmd_status_err(errno, out);

	return obj;
}

static int crypto_login_query(struct mlx5dv_devx_obj *obj,
			      struct mlx5dv_crypto_login_query_attr *query_attr)
{
	uint32_t out[DEVX_ST_SZ_DW(query_crypto_login_obj_out)] = {};
	uint32_t in[DEVX_ST_SZ_DW(general_obj_in_cmd_hdr)] = {};
	uint8_t crypto_login_state;
	void *attr;
	int ret;

	DEVX_SET(general_obj_in_cmd_hdr, in, opcode,
		 MLX5_CMD_OP_QUERY_GENERAL_OBJECT);
	DEVX_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_CRYPTO_LOGIN);
	DEVX_SET(general_obj_in_cmd_hdr, in, obj_id, obj->object_id);

	ret = mlx5dv_devx_obj_query(obj, in, sizeof(in), out, sizeof(out));
	if (ret)
		return mlx5_get_cmd_status_err(ret, out);

	attr = DEVX_ADDR_OF(query_crypto_login_obj_out, out, obj);
	crypto_login_state = DEVX_GET(crypto_login_obj, attr, state);

	switch (crypto_login_state) {
	case MLX5_CRYPTO_LOGIN_OBJ_STATE_VALID:
		query_attr->state = MLX5DV_CRYPTO_LOGIN_STATE_VALID;
		break;
	case MLX5_CRYPTO_LOGIN_OBJ_STATE_INVALID:
		query_attr->state = MLX5DV_CRYPTO_LOGIN_STATE_INVALID;
		break;
	default:
		ret = EINVAL;
		break;
	}

	return ret;
}

static int _mlx5dv_crypto_login(struct ibv_context *context,
				struct mlx5dv_crypto_login_attr *login_attr)
{
	struct mlx5dv_crypto_login_attr_ex login_attr_ex;
	struct mlx5_context *mctx = to_mctx(context);
	struct mlx5dv_devx_obj *obj;
	int ret = 0;

	if (login_attr->comp_mask)
		return EINVAL;

	pthread_mutex_lock(&mctx->crypto_login_mutex);
	if (mctx->crypto_login) {
		ret = EEXIST;
		goto out;
	}

	login_attr_ex.credential_len = sizeof(login_attr->credential);
	login_attr_ex.credential_id = login_attr->credential_id;
	login_attr_ex.import_kek_id = login_attr->import_kek_id;
	login_attr_ex.credential = login_attr->credential;
	login_attr_ex.comp_mask = 0;

	obj = crypto_login_create(context, &login_attr_ex);
	if (!obj) {
		ret = errno;
		goto out;
	}

	mctx->crypto_login = obj;

out:
	pthread_mutex_unlock(&mctx->crypto_login_mutex);
	return ret;
}

int mlx5dv_crypto_login(struct ibv_context *context,
			struct mlx5dv_crypto_login_attr *login_attr)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->crypto_login)
		return EOPNOTSUPP;

	return dvops->crypto_login(context, login_attr);
}
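/*
 * Editor's note: mlx5dv_crypto_login() keeps a single global login object
 * per context, guarded by crypto_login_mutex and returning EEXIST on a
 * second login, while the mlx5dv_crypto_login_create() API further down
 * hands the caller an explicit mlx5dv_crypto_login_obj instead.
 */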
static int _mlx5dv_crypto_login_query_state(struct ibv_context *context,
					    enum mlx5dv_crypto_login_state *state)
{
	struct mlx5_context *mctx = to_mctx(context);
	struct mlx5dv_crypto_login_query_attr attr = {};
	int ret;

	pthread_mutex_lock(&mctx->crypto_login_mutex);
	if (!mctx->crypto_login) {
		*state = MLX5DV_CRYPTO_LOGIN_STATE_NO_LOGIN;
		ret = 0;
		goto out;
	}

	ret = crypto_login_query(mctx->crypto_login, &attr);
	if (ret)
		goto out;

	*state = attr.state;

out:
	pthread_mutex_unlock(&mctx->crypto_login_mutex);
	return ret;
}

int mlx5dv_crypto_login_query_state(struct ibv_context *context,
				    enum mlx5dv_crypto_login_state *state)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->crypto_login_query_state)
		return EOPNOTSUPP;

	return dvops->crypto_login_query_state(context, state);
}

static int _mlx5dv_crypto_logout(struct ibv_context *context)
{
	struct mlx5_context *mctx = to_mctx(context);
	int ret;

	pthread_mutex_lock(&mctx->crypto_login_mutex);
	if (!mctx->crypto_login) {
		ret = ENOENT;
		goto out;
	}

	ret = mlx5dv_devx_obj_destroy(mctx->crypto_login);
	if (ret)
		goto out;

	mctx->crypto_login = NULL;

out:
	pthread_mutex_unlock(&mctx->crypto_login_mutex);
	return ret;
}

int mlx5dv_crypto_logout(struct ibv_context *context)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->crypto_logout)
		return EOPNOTSUPP;

	return dvops->crypto_logout(context);
}

static struct mlx5dv_crypto_login_obj *
_mlx5dv_crypto_login_create(struct ibv_context *context,
			    struct mlx5dv_crypto_login_attr_ex *login_attr)
{
	struct mlx5dv_crypto_login_obj *crypto_login;
	struct mlx5dv_devx_obj *obj;

	if (login_attr->comp_mask) {
		errno = EINVAL;
		return NULL;
	}

	crypto_login = calloc(1, sizeof(*crypto_login));
	if (!crypto_login) {
		errno = ENOMEM;
		return NULL;
	}

	obj = crypto_login_create(context, login_attr);
	if (!obj) {
		free(crypto_login);
		return NULL;
	}

	crypto_login->devx_obj = obj;
	return crypto_login;
}

struct mlx5dv_crypto_login_obj *
mlx5dv_crypto_login_create(struct ibv_context *context,
			   struct mlx5dv_crypto_login_attr_ex *login_attr)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->crypto_login_create) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->crypto_login_create(context, login_attr);
}

static int _mlx5dv_crypto_login_query(struct mlx5dv_crypto_login_obj *crypto_login,
				      struct mlx5dv_crypto_login_query_attr *query_attr)
{
	if (query_attr->comp_mask)
		return EINVAL;

	return crypto_login_query(crypto_login->devx_obj, query_attr);
}

int mlx5dv_crypto_login_query(struct mlx5dv_crypto_login_obj *crypto_login,
			      struct mlx5dv_crypto_login_query_attr *query_attr)
{
	struct mlx5_dv_context_ops *dvops =
		mlx5_get_dv_ops(crypto_login->devx_obj->context);

	if (!dvops || !dvops->crypto_login_query)
		return EOPNOTSUPP;

	return dvops->crypto_login_query(crypto_login, query_attr);
}

static int _mlx5dv_crypto_login_destroy(struct mlx5dv_crypto_login_obj *crypto_login)
{
	int err;

	err = mlx5dv_devx_obj_destroy(crypto_login->devx_obj);
	if (err)
		return err;

	free(crypto_login);
	return 0;
}

int mlx5dv_crypto_login_destroy(struct mlx5dv_crypto_login_obj *crypto_login)
{
	struct mlx5_dv_context_ops *dvops =
		mlx5_get_dv_ops(crypto_login->devx_obj->context);

	if (!dvops || !dvops->crypto_login_destroy)
		return EOPNOTSUPP;

	return dvops->crypto_login_destroy(crypto_login);
}

static int check_dek_import_method(struct ibv_context *context,
				   struct mlx5dv_dek_init_attr *init_attr)
{
	struct mlx5_context *mctx = to_mctx(context);
	int err = 0;

	if (init_attr->comp_mask & MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN) {
		if (init_attr->crypto_login) {
			/*
			 * User wants to create a wrapped DEK using a crypto
			 * login object created by mlx5dv_crypto_login_create().
			 */
			if (!(mctx->crypto_caps.wrapped_import_method &
			      MLX5DV_CRYPTO_WRAPPED_IMPORT_METHOD_CAP_AES_XTS))
				err = EINVAL;
		} else {
			/* User wants to create a plaintext DEK. */
			if (mctx->crypto_caps.wrapped_import_method &
			    MLX5DV_CRYPTO_WRAPPED_IMPORT_METHOD_CAP_AES_XTS)
				err = EINVAL;
		}
	} else {
		/*
		 * User wants to create a wrapped DEK using the global crypto
		 * login object created by mlx5dv_crypto_login().
		 */
		if (!mctx->crypto_login ||
		    !(mctx->crypto_caps.wrapped_import_method &
		      MLX5DV_CRYPTO_WRAPPED_IMPORT_METHOD_CAP_AES_XTS))
			err = EINVAL;
	}

	return err;
}
static struct mlx5dv_dek *_mlx5dv_dek_create(struct ibv_context *context,
					     struct mlx5dv_dek_init_attr *init_attr)
{
	uint32_t in[DEVX_ST_SZ_DW(create_encryption_key_obj_in)] = {};
	uint32_t out[DEVX_ST_SZ_DW(general_obj_out_cmd_hdr)] = {};
	struct mlx5_context *mctx = to_mctx(context);
	struct mlx5dv_devx_obj *obj;
	struct mlx5dv_dek *dek;
	uint8_t key_size;
	void *attr;

	if (!(mctx->crypto_caps.crypto_engines &
	      (MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_SINGLE_BLOCK |
	       MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_MULTI_BLOCK))) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	if (!(mctx->general_obj_types_caps & (1ULL << MLX5_OBJ_TYPE_DEK))) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	if (init_attr->key_purpose != MLX5DV_CRYPTO_KEY_PURPOSE_AES_XTS) {
		errno = EINVAL;
		return NULL;
	}

	switch (init_attr->key_size) {
	case MLX5DV_CRYPTO_KEY_SIZE_128:
		key_size = MLX5_ENCRYPTION_KEY_OBJ_KEY_SIZE_SIZE_128;
		break;
	case MLX5DV_CRYPTO_KEY_SIZE_256:
		key_size = MLX5_ENCRYPTION_KEY_OBJ_KEY_SIZE_SIZE_256;
		break;
	default:
		errno = EINVAL;
		return NULL;
	}

	if (!check_comp_mask(init_attr->comp_mask,
			     MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN)) {
		errno = EINVAL;
		return NULL;
	}

	errno = check_dek_import_method(context, init_attr);
	if (errno) {
		dek = NULL;
		goto out;
	}

	dek = calloc(1, sizeof(*dek));
	if (!dek) {
		errno = ENOMEM;
		goto out;
	}

	attr = DEVX_ADDR_OF(create_encryption_key_obj_in, in, hdr);
	DEVX_SET(general_obj_in_cmd_hdr, attr, opcode,
		 MLX5_CMD_OP_CREATE_GENERAL_OBJECT);
	DEVX_SET(general_obj_in_cmd_hdr, attr, obj_type, MLX5_OBJ_TYPE_DEK);

	attr = DEVX_ADDR_OF(create_encryption_key_obj_in, in, key_obj);
	DEVX_SET(encryption_key_obj, attr, key_size, key_size);
	DEVX_SET(encryption_key_obj, attr, has_keytag, !!init_attr->has_keytag);
	DEVX_SET(encryption_key_obj, attr, key_purpose,
		 MLX5_ENCRYPTION_KEY_OBJ_KEY_PURPOSE_AES_XTS);
	DEVX_SET(encryption_key_obj, attr, pd, to_mpd(init_attr->pd)->pdn);
	memcpy(DEVX_ADDR_OF(encryption_key_obj, attr, opaque),
	       init_attr->opaque, sizeof(init_attr->opaque));
	memcpy(DEVX_ADDR_OF(encryption_key_obj, attr, key),
	       init_attr->key, sizeof(init_attr->key));

	obj = mlx5dv_devx_obj_create(context, in, sizeof(in), out, sizeof(out));
	if (!obj) {
		errno = mlx5_get_cmd_status_err(errno, out);
		free(dek);
		dek = NULL;
		goto out;
	}

	dek->devx_obj = obj;

out:
	return dek;
}

struct mlx5dv_dek *mlx5dv_dek_create(struct ibv_context *context,
				     struct mlx5dv_dek_init_attr *init_attr)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->dek_create) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->dek_create(context, init_attr);
}
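/*
 * A hedged usage sketch (editor's comment, not from the original source),
 * with illustrative values, for creating a plaintext AES-XTS DEK when no
 * wrapped-import login is required:
 *
 *	struct mlx5dv_dek_init_attr dattr = {
 *		.key_purpose = MLX5DV_CRYPTO_KEY_PURPOSE_AES_XTS,
 *		.key_size = MLX5DV_CRYPTO_KEY_SIZE_128,
 *		.pd = pd,				// assumed PD
 *		.comp_mask = MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN,
 *		.crypto_login = NULL,			// plaintext key
 *	};
 *	// dattr.key carries the raw key material
 *	struct mlx5dv_dek *dek = mlx5dv_dek_create(ctx, &dattr);
 */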
static int _mlx5dv_dek_query(struct mlx5dv_dek *dek, struct mlx5dv_dek_attr *dek_attr)
{
	uint32_t out[DEVX_ST_SZ_DW(query_encryption_key_obj_out)] = {};
	uint32_t in[DEVX_ST_SZ_DW(general_obj_in_cmd_hdr)] = {};
	uint8_t dek_state;
	void *attr;
	int ret;

	if (dek_attr->comp_mask)
		return EINVAL;

	DEVX_SET(general_obj_in_cmd_hdr, in, opcode,
		 MLX5_CMD_OP_QUERY_GENERAL_OBJECT);
	DEVX_SET(general_obj_in_cmd_hdr, in, obj_type, MLX5_OBJ_TYPE_DEK);
	DEVX_SET(general_obj_in_cmd_hdr, in, obj_id, dek->devx_obj->object_id);

	ret = mlx5dv_devx_obj_query(dek->devx_obj, in, sizeof(in), out, sizeof(out));
	if (ret)
		return mlx5_get_cmd_status_err(ret, out);

	attr = DEVX_ADDR_OF(query_encryption_key_obj_out, out, obj);
	dek_state = DEVX_GET(encryption_key_obj, attr, state);
	switch (dek_state) {
	case MLX5_ENCRYPTION_KEY_OBJ_STATE_READY:
		dek_attr->state = MLX5DV_DEK_STATE_READY;
		break;
	case MLX5_ENCRYPTION_KEY_OBJ_STATE_ERROR:
		dek_attr->state = MLX5DV_DEK_STATE_ERROR;
		break;
	default:
		return EINVAL;
	}
	memcpy(dek_attr->opaque, DEVX_ADDR_OF(encryption_key_obj, attr, opaque),
	       sizeof(dek_attr->opaque));

	return 0;
}

int mlx5dv_dek_query(struct mlx5dv_dek *dek, struct mlx5dv_dek_attr *dek_attr)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(dek->devx_obj->context);

	if (!dvops || !dvops->dek_query)
		return EOPNOTSUPP;

	return dvops->dek_query(dek, dek_attr);
}

static int _mlx5dv_dek_destroy(struct mlx5dv_dek *dek)
{
	int ret;

	ret = mlx5dv_devx_obj_destroy(dek->devx_obj);
	if (ret)
		return ret;

	free(dek);
	return 0;
}

int mlx5dv_dek_destroy(struct mlx5dv_dek *dek)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(dek->devx_obj->context);

	if (!dvops || !dvops->dek_destroy)
		return EOPNOTSUPP;

	return dvops->dek_destroy(dek);
}

static struct mlx5dv_var *
_mlx5dv_alloc_var(struct ibv_context *context, uint32_t flags)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_VAR,
			       MLX5_IB_METHOD_VAR_OBJ_ALLOC, 4);
	struct ib_uverbs_attr *handle;
	struct mlx5_var_obj *obj;
	int ret;

	if (flags) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	obj = calloc(1, sizeof(*obj));
	if (!obj) {
		errno = ENOMEM;
		return NULL;
	}

	handle = fill_attr_out_obj(cmd, MLX5_IB_ATTR_VAR_OBJ_ALLOC_HANDLE);
	fill_attr_out_ptr(cmd, MLX5_IB_ATTR_VAR_OBJ_ALLOC_MMAP_OFFSET,
			  &obj->dv_var.mmap_off);
	fill_attr_out_ptr(cmd, MLX5_IB_ATTR_VAR_OBJ_ALLOC_MMAP_LENGTH,
			  &obj->dv_var.length);
	fill_attr_out_ptr(cmd, MLX5_IB_ATTR_VAR_OBJ_ALLOC_PAGE_ID,
			  &obj->dv_var.page_id);

	ret = execute_ioctl(context, cmd);
	if (ret)
		goto err;

	obj->handle = read_attr_obj(MLX5_IB_ATTR_VAR_OBJ_ALLOC_HANDLE, handle);
	obj->context = context;

	return &obj->dv_var;

err:
	free(obj);
	return NULL;
}

struct mlx5dv_var *
mlx5dv_alloc_var(struct ibv_context *context, uint32_t flags)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->alloc_var) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->alloc_var(context, flags);
}

static void _mlx5dv_free_var(struct mlx5dv_var *dv_var)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_VAR,
			       MLX5_IB_METHOD_VAR_OBJ_DESTROY, 1);
	struct mlx5_var_obj *obj = container_of(dv_var, struct mlx5_var_obj, dv_var);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_VAR_OBJ_DESTROY_HANDLE, obj->handle);
	if (execute_ioctl(obj->context, cmd))
		assert(false);

	free(obj);
}

void mlx5dv_free_var(struct mlx5dv_var *dv_var)
{
	struct mlx5_var_obj *obj = container_of(dv_var, struct mlx5_var_obj, dv_var);
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(obj->context);

	if (!dvops || !dvops->free_var)
		return;

	return dvops->free_var(dv_var);
}
static struct mlx5dv_pp *_mlx5dv_pp_alloc(struct ibv_context *context,
					  size_t pp_context_sz,
					  const void *pp_context,
					  uint32_t flags)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_PP,
			       MLX5_IB_METHOD_PP_OBJ_ALLOC, 4);
	struct ib_uverbs_attr *handle;
	struct mlx5_pp_obj *obj;
	int ret;

	if (!check_comp_mask(flags, MLX5_IB_UAPI_PP_ALLOC_FLAGS_DEDICATED_INDEX)) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	obj = calloc(1, sizeof(*obj));
	if (!obj) {
		errno = ENOMEM;
		return NULL;
	}

	handle = fill_attr_out_obj(cmd, MLX5_IB_ATTR_PP_OBJ_ALLOC_HANDLE);
	fill_attr_in(cmd, MLX5_IB_ATTR_PP_OBJ_ALLOC_CTX, pp_context, pp_context_sz);
	fill_attr_const_in(cmd, MLX5_IB_ATTR_PP_OBJ_ALLOC_FLAGS, flags);
	fill_attr_out_ptr(cmd, MLX5_IB_ATTR_PP_OBJ_ALLOC_INDEX, &obj->dv_pp.index);

	ret = execute_ioctl(context, cmd);
	if (ret)
		goto err;

	obj->handle = read_attr_obj(MLX5_IB_ATTR_PP_OBJ_ALLOC_HANDLE, handle);
	obj->context = context;

	return &obj->dv_pp;

err:
	free(obj);
	return NULL;
}

struct mlx5dv_pp *mlx5dv_pp_alloc(struct ibv_context *context,
				  size_t pp_context_sz,
				  const void *pp_context,
				  uint32_t flags)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(context);

	if (!dvops || !dvops->pp_alloc) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->pp_alloc(context, pp_context_sz, pp_context, flags);
}

static void _mlx5dv_pp_free(struct mlx5dv_pp *dv_pp)
{
	DECLARE_COMMAND_BUFFER(cmd, MLX5_IB_OBJECT_PP,
			       MLX5_IB_METHOD_PP_OBJ_DESTROY, 1);
	struct mlx5_pp_obj *obj = container_of(dv_pp, struct mlx5_pp_obj, dv_pp);

	fill_attr_in_obj(cmd, MLX5_IB_ATTR_PP_OBJ_DESTROY_HANDLE, obj->handle);
	if (execute_ioctl(obj->context, cmd))
		assert(false);

	free(obj);
}

void mlx5dv_pp_free(struct mlx5dv_pp *dv_pp)
{
	struct mlx5_pp_obj *obj = container_of(dv_pp, struct mlx5_pp_obj, dv_pp);
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(obj->context);

	if (!dvops || !dvops->pp_free)
		return;

	dvops->pp_free(dv_pp);
}

struct mlx5dv_devx_msi_vector *
mlx5dv_devx_alloc_msi_vector(struct ibv_context *ibctx)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ibctx);

	if (!dvops || !dvops->devx_alloc_msi_vector) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->devx_alloc_msi_vector(ibctx);
}

int mlx5dv_devx_free_msi_vector(struct mlx5dv_devx_msi_vector *dvmsi)
{
	struct mlx5_devx_msi_vector *msi =
		container_of(dvmsi, struct mlx5_devx_msi_vector, dv_msi);
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(msi->ibctx);

	if (!dvops || !dvops->devx_free_msi_vector)
		return EOPNOTSUPP;

	return dvops->devx_free_msi_vector(dvmsi);
}

struct mlx5dv_devx_eq *
mlx5dv_devx_create_eq(struct ibv_context *ibctx, const void *in, size_t inlen,
		      void *out, size_t outlen)
{
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(ibctx);

	if (!dvops || !dvops->devx_create_eq) {
		errno = EOPNOTSUPP;
		return NULL;
	}

	return dvops->devx_create_eq(ibctx, in, inlen, out, outlen);
}

int mlx5dv_devx_destroy_eq(struct mlx5dv_devx_eq *dveq)
{
	struct mlx5_devx_eq *eq = container_of(dveq, struct mlx5_devx_eq, dv_eq);
	struct mlx5_dv_context_ops *dvops = mlx5_get_dv_ops(eq->ibctx);

	if (!dvops || !dvops->devx_destroy_eq)
		return EOPNOTSUPP;

	return dvops->devx_destroy_eq(dveq);
}
= _mlx5dv_devx_get_event; ops->devx_alloc_uar = _mlx5dv_devx_alloc_uar; ops->devx_free_uar = _mlx5dv_devx_free_uar; ops->devx_umem_reg = _mlx5dv_devx_umem_reg; ops->devx_umem_reg_ex = _mlx5dv_devx_umem_reg_ex; ops->devx_umem_dereg = _mlx5dv_devx_umem_dereg; ops->create_mkey = _mlx5dv_create_mkey; ops->destroy_mkey = _mlx5dv_destroy_mkey; ops->crypto_login = _mlx5dv_crypto_login; ops->crypto_login_query_state = _mlx5dv_crypto_login_query_state; ops->crypto_logout = _mlx5dv_crypto_logout; ops->crypto_login_create = _mlx5dv_crypto_login_create; ops->crypto_login_query = _mlx5dv_crypto_login_query; ops->crypto_login_destroy = _mlx5dv_crypto_login_destroy; ops->dek_create = _mlx5dv_dek_create; ops->dek_query = _mlx5dv_dek_query; ops->dek_destroy = _mlx5dv_dek_destroy; ops->alloc_var = _mlx5dv_alloc_var; ops->free_var = _mlx5dv_free_var; ops->pp_alloc = _mlx5dv_pp_alloc; ops->pp_free = _mlx5dv_pp_free; ops->create_cq = _mlx5dv_create_cq; ops->create_qp = _mlx5dv_create_qp; ops->create_wq = _mlx5dv_create_wq; ops->alloc_dm = _mlx5dv_alloc_dm; ops->dm_map_op_addr = _mlx5dv_dm_map_op_addr; ops->create_flow_action_esp = _mlx5dv_create_flow_action_esp; ops->create_flow_action_modify_header = _mlx5dv_create_flow_action_modify_header; ops->create_flow_action_packet_reformat = _mlx5dv_create_flow_action_packet_reformat; ops->create_flow_matcher = _mlx5dv_create_flow_matcher; ops->destroy_flow_matcher = _mlx5dv_destroy_flow_matcher; ops->create_flow = _mlx5dv_create_flow; ops->map_ah_to_qp = _mlx5dv_map_ah_to_qp; ops->query_port = __mlx5dv_query_port; ops->create_steering_anchor = _mlx5dv_create_steering_anchor; ops->destroy_steering_anchor = _mlx5dv_destroy_steering_anchor; ops->reg_dmabuf_mr = _mlx5dv_reg_dmabuf_mr; ops->get_data_direct_sysfs_path = _mlx5dv_get_data_direct_sysfs_path; } rdma-core-56.1/providers/mlx5/wqe.h000066400000000000000000000126151477342711600172060ustar00rootroot00000000000000/* * Copyright (c) 2012 Mellanox Technologies, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
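 */

/*
 * The table filled in by mlx5_set_dv_ctx_ops() above is the provider's
 * dispatch mechanism: every public mlx5dv_* wrapper looks up a
 * per-context ops table and fails with EOPNOTSUPP when no
 * implementation was registered. A minimal sketch of that guard
 * pattern; struct my_ops, my_call_feature() and the feature callback
 * are made-up illustrations, not mlx5 types:
 */
#include <errno.h>

struct my_ops {
	int (*feature)(int arg);	/* left NULL when unsupported */
};

static int my_call_feature(const struct my_ops *ops, int arg)
{
	/* Same guard the mlx5dv_* wrappers use before dispatching. */
	if (!ops || !ops->feature)
		return EOPNOTSUPP;
	return ops->feature(arg);
}

/*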
*/ #ifndef WQE_H #define WQE_H #include #include "mlx5dv.h" struct mlx5_sg_copy_ptr { int index; int offset; }; struct mlx5_eqe_comp { uint32_t reserved[6]; uint32_t cqn; }; struct mlx5_eqe_qp_srq { uint32_t reserved[6]; uint32_t qp_srq_n; }; struct mlx5_wqe_eth_pad { uint8_t rsvd0[16]; }; struct mlx5_wqe_xrc_seg { __be32 xrc_srqn; uint8_t rsvd[12]; }; struct mlx5_wqe_masked_atomic_seg { uint64_t swap_add; uint64_t compare; uint64_t swap_add_mask; uint64_t compare_mask; }; enum { MLX5_IPOIB_INLINE_MIN_HEADER_SIZE = 4, MLX5_SOURCE_QPN_INLINE_MAX_HEADER_SIZE = 18, MLX5_ETH_L2_INLINE_HEADER_SIZE = 18, MLX5_ETH_L2_MIN_HEADER_SIZE = 14, }; struct mlx5_seg_set_psv { uint8_t rsvd[4]; uint16_t syndrome; uint16_t status; uint16_t block_guard; uint16_t app_tag; uint32_t ref_tag; uint32_t mkey; uint64_t va; }; struct mlx5_seg_get_psv { uint8_t rsvd[19]; uint8_t num_psv; uint32_t l_key; uint64_t va; uint32_t psv_index[4]; }; struct mlx5_seg_check_psv { uint8_t rsvd0[2]; uint16_t err_coalescing_op; uint8_t rsvd1[2]; uint16_t xport_err_op; uint8_t rsvd2[2]; uint16_t xport_err_mask; uint8_t rsvd3[7]; uint8_t num_psv; uint32_t l_key; uint64_t va; uint32_t psv_index[4]; }; struct mlx5_rwqe_sig { uint8_t rsvd0[4]; uint8_t signature; uint8_t rsvd1[11]; }; struct mlx5_wqe_signature_seg { uint8_t rsvd0[4]; uint8_t signature; uint8_t rsvd1[11]; }; struct mlx5_wqe_inline_seg { __be32 byte_count; }; enum { MLX5_WQE_MKEY_CONTEXT_FLAGS_BSF_ENABLE = 1 << 30, MLX5_WQE_MKEY_CONTEXT_SIG_ERR_CNT_MASK = 1, MLX5_WQE_MKEY_CONTEXT_SIG_ERR_CNT_SHIFT = 26, }; enum { MLX5_BSF_SIZE_BASIC = 0, MLX5_BSF_SIZE_EXTENDED = 1, MLX5_BSF_SIZE_WITH_INLINE = 2, MLX5_BSF_SIZE_SIG_AND_CRYPTO = 3, MLX5_BSF_TYPE_CRYPTO = 1, MLX5_BSF_SIZE_SHIFT = 6, MLX5_BSF_SBS_SHIFT = 4, /* Block Format Selector */ MLX5_BFS_CRC32_BASE = 0x20, MLX5_BFS_CRC32C_BASE = 0x40, MLX5_BFS_CRC64_XP10_BASE = 0x50, MLX5_BFS_CRC_REPEAT_BIT = 0x2, MLX5_BFS_CRC_BLOCK_SIGS_COV_BIT = 0x2, MLX5_BFS_CRC_SEED_BIT = 0x1, MLX5_BFS_SHIFT = 24, MLX5_BSF_PSV_INDEX_MASK = 0xFFFFFF, /* Inline section */ MLX5_BSF_INL_VALID = 1 << 15, MLX5_BSF_REFRESH_DIF = 1 << 14, MLX5_BSF_REPEAT_BLOCK = 1 << 7, MLX5_BSF_INC_REFTAG = 1 << 6, MLX5_BSF_SEED = 1 << 3, MLX5_BSF_APPTAG_ESCAPE = 0x1, MLX5_BSF_APPREF_ESCAPE = 0x2, MLX5_T10DIF_CRC = 0x1, MLX5_T10DIF_IPCS = 0x2, }; struct mlx5_bsf_inl { __be16 vld_refresh; __be16 dif_apptag; __be32 dif_reftag; uint8_t sig_type; uint8_t rp_inv_seed; uint8_t rsvd[3]; uint8_t dif_inc_ref_guard_check; __be16 dif_app_bitmask_check; }; struct mlx5_crypto_bsf { uint8_t bsf_size_type; uint8_t enc_order; uint8_t rsvd0; uint8_t enc_standard; __be32 raw_data_size; uint8_t bs_pointer; uint8_t rsvd1[7]; uint8_t xts_init_tweak[16]; __be32 rsvd_dek_ptr; uint8_t rsvd2[4]; uint8_t keytag[8]; uint8_t rsvd3[16]; }; struct mlx5_bsf { struct mlx5_bsf_basic { uint8_t bsf_size_sbs; uint8_t check_byte_mask; union { uint8_t copy_byte_mask; uint8_t bs_selector; uint8_t rsvd_wflags; } wire; union { uint8_t bs_selector; uint8_t rsvd_mflags; } mem; __be32 raw_data_size; __be32 w_bfs_psv; __be32 m_bfs_psv; } basic; struct mlx5_bsf_ext { __be32 t_init_gen_pro_size; __be32 rsvd_epi_size; __be32 w_tfs_psv; __be32 m_tfs_psv; } ext; struct mlx5_bsf_inl w_inl; struct mlx5_bsf_inl m_inl; }; struct mlx5_wqe_set_psv_seg { __be32 psv_index; __be16 syndrome; uint8_t reserved[2]; __be64 transient_signature; }; enum { MLX5_OPC_MOD_MMO_DMA = 0x1, }; struct mlx5_mmo_metadata_seg { __be32 mmo_control_31_0; __be32 local_key; __be64 local_address; }; struct mlx5_mmo_wqe { struct mlx5_wqe_ctrl_seg ctrl; 
struct mlx5_mmo_metadata_seg mmo_meta; struct mlx5_wqe_data_seg src; struct mlx5_wqe_data_seg dest; }; struct mlx5_wqe_flow_update_ctrl_seg { __be32 flow_idx_update; __be32 dest_handle; uint8_t reserved0[40]; }; struct mlx5_wqe_header_modify_argument_update_seg { uint8_t argument_list[64]; }; #endif /* WQE_H */ rdma-core-56.1/providers/mthca/000077500000000000000000000000001477342711600164435ustar00rootroot00000000000000rdma-core-56.1/providers/mthca/CMakeLists.txt000066400000000000000000000001331477342711600212000ustar00rootroot00000000000000rdma_provider(mthca ah.c buf.c cq.c memfree.c mthca.c qp.c srq.c verbs.c ) rdma-core-56.1/providers/mthca/ah.c000066400000000000000000000115671477342711600172110ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include "mthca.h" struct mthca_ah_page { struct mthca_ah_page *prev, *next; struct mthca_buf buf; struct ibv_mr *mr; int use_cnt; unsigned free[0]; }; static struct mthca_ah_page *__add_page(struct mthca_pd *pd, int page_size, int per_page) { struct mthca_ah_page *page; int i; page = malloc(sizeof *page + per_page * sizeof (int)); if (!page) return NULL; if (mthca_alloc_buf(&page->buf, page_size, page_size)) { free(page); return NULL; } page->mr = mthca_reg_mr(&pd->ibv_pd, page->buf.buf, page_size, (uintptr_t) page->buf.buf, 0); if (!page->mr) { mthca_free_buf(&page->buf); free(page); return NULL; } page->mr->context = pd->ibv_pd.context; page->use_cnt = 0; for (i = 0; i < per_page; ++i) page->free[i] = ~0; page->prev = NULL; page->next = pd->ah_list; pd->ah_list = page; if (page->next) page->next->prev = page; return page; } int mthca_alloc_av(struct mthca_pd *pd, struct ibv_ah_attr *attr, struct mthca_ah *ah) { if (mthca_is_memfree(pd->ibv_pd.context)) { ah->av = malloc(sizeof *ah->av); if (!ah->av) return -1; } else { struct mthca_ah_page *page; int ps; int pp; int i, j; ps = to_mdev(pd->ibv_pd.context->device)->page_size; pp = ps / (sizeof *ah->av * 8 * sizeof (int)); pthread_mutex_lock(&pd->ah_mutex); for (page = pd->ah_list; page; page = page->next) if (page->use_cnt < ps / sizeof *ah->av) for (i = 0; i < pp; ++i) if (page->free[i]) goto found; page = __add_page(pd, ps, pp); if (!page) { pthread_mutex_unlock(&pd->ah_mutex); return -1; } found: ++page->use_cnt; for (i = 0, j = -1; i < pp; ++i) if (page->free[i]) { j = ffs(page->free[i]); page->free[i] &= ~(1 << (j - 1)); ah->av = page->buf.buf + (i * 8 * sizeof (int) + (j - 1)) * sizeof *ah->av; break; } ah->key = page->mr->lkey; ah->page = page; pthread_mutex_unlock(&pd->ah_mutex); } memset(ah->av, 0, sizeof *ah->av); ah->av->port_pd = htobe32(pd->pdn | (attr->port_num << 24)); ah->av->g_slid = attr->src_path_bits; ah->av->dlid = htobe16(attr->dlid); ah->av->msg_sr = (3 << 4) | /* 2K message */ attr->static_rate; ah->av->sl_tclass_flowlabel = htobe32(attr->sl << 28); if (attr->is_global) { ah->av->g_slid |= 0x80; /* XXX get gid_table length */ ah->av->gid_index = (attr->port_num - 1) * 32 + attr->grh.sgid_index; ah->av->hop_limit = attr->grh.hop_limit; ah->av->sl_tclass_flowlabel |= htobe32((attr->grh.traffic_class << 20) | attr->grh.flow_label); memcpy(ah->av->dgid, attr->grh.dgid.raw, 16); } else { /* Arbel workaround -- low byte of GID must be 2 */ ah->av->dgid[3] = htobe32(2); } return 0; } void mthca_free_av(struct mthca_ah *ah) { if (mthca_is_memfree(ah->ibv_ah.context)) { free(ah->av); } else { struct mthca_pd *pd = to_mpd(ah->ibv_ah.pd); struct mthca_ah_page *page; int i; pthread_mutex_lock(&pd->ah_mutex); page = ah->page; i = ((void *) ah->av - page->buf.buf) / sizeof *ah->av; page->free[i / (8 * sizeof (int))] |= 1 << (i % (8 * sizeof (int))); if (!--page->use_cnt) { if (page->prev) page->prev->next = page->next; else pd->ah_list = page->next; if (page->next) page->next->prev = page->prev; mthca_dereg_mr(verbs_get_mr(page->mr)); mthca_free_buf(&page->buf); free(page); } pthread_mutex_unlock(&pd->ah_mutex); } } rdma-core-56.1/providers/mthca/buf.c000066400000000000000000000037541477342711600173740ustar00rootroot00000000000000/* * Copyright (c) 2006 Cisco Systems, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

#include <config.h>

#include <stdlib.h>
#include <errno.h>
#include <sys/mman.h>

#include "mthca.h"

int mthca_alloc_buf(struct mthca_buf *buf, size_t size, int page_size)
{
	int ret;

	buf->length = align(size, page_size);
	buf->buf = mmap(NULL, buf->length, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf->buf == MAP_FAILED)
		return errno;

	ret = ibv_dontfork_range(buf->buf, size);
	if (ret)
		munmap(buf->buf, buf->length);

	return ret;
}

void mthca_free_buf(struct mthca_buf *buf)
{
	ibv_dofork_range(buf->buf, buf->length);
	munmap(buf->buf, buf->length);
}
rdma-core-56.1/providers/mthca/cq.c000066400000000000000000000400361477342711600172150ustar00rootroot00000000000000/*
 * Copyright (c) 2005 Topspin Communications. All rights reserved.
 * Copyright (c) 2005 Mellanox Technologies Ltd. All rights reserved.
 * Copyright (c) 2006 Cisco Systems. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses. You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 * Redistribution and use in source and binary forms, with or
 * without modification, are permitted provided that the following
 * conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
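 */

/*
 * A usage sketch (hypothetical caller, not code from the tree) for the
 * buf.c helpers above: mthca_alloc_buf() returns 0 on success and an
 * errno value on failure, and every allocation must be paired with
 * mthca_free_buf(). struct mthca_buf and the prototypes come from
 * "mthca.h".
 */
static int demo_buf_roundtrip(void)
{
	struct mthca_buf buf;

	/* 8192 bytes, rounded up to the assumed 4096-byte page size. */
	if (mthca_alloc_buf(&buf, 8192, 4096))
		return -1;

	/* ... buf.buf is now page-aligned, fork-safe memory ... */

	mthca_free_buf(&buf);
	return 0;
}

/*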
*/ #include #include #include #include #include #include #include #include "mthca.h" #include "doorbell.h" enum { MTHCA_CQ_DOORBELL = 0x20 }; enum { CQ_OK = 0, CQ_EMPTY = -1, CQ_POLL_ERR = -2 }; #define MTHCA_TAVOR_CQ_DB_INC_CI (1 << 24) #define MTHCA_TAVOR_CQ_DB_REQ_NOT (2 << 24) #define MTHCA_TAVOR_CQ_DB_REQ_NOT_SOL (3 << 24) #define MTHCA_TAVOR_CQ_DB_SET_CI (4 << 24) #define MTHCA_TAVOR_CQ_DB_REQ_NOT_MULT (5 << 24) #define MTHCA_ARBEL_CQ_DB_REQ_NOT_SOL (1 << 24) #define MTHCA_ARBEL_CQ_DB_REQ_NOT (2 << 24) #define MTHCA_ARBEL_CQ_DB_REQ_NOT_MULT (3 << 24) enum { MTHCA_CQ_ENTRY_OWNER_SW = 0x00, MTHCA_CQ_ENTRY_OWNER_HW = 0x80, MTHCA_ERROR_CQE_OPCODE_MASK = 0xfe }; enum { SYNDROME_LOCAL_LENGTH_ERR = 0x01, SYNDROME_LOCAL_QP_OP_ERR = 0x02, SYNDROME_LOCAL_EEC_OP_ERR = 0x03, SYNDROME_LOCAL_PROT_ERR = 0x04, SYNDROME_WR_FLUSH_ERR = 0x05, SYNDROME_MW_BIND_ERR = 0x06, SYNDROME_BAD_RESP_ERR = 0x10, SYNDROME_LOCAL_ACCESS_ERR = 0x11, SYNDROME_REMOTE_INVAL_REQ_ERR = 0x12, SYNDROME_REMOTE_ACCESS_ERR = 0x13, SYNDROME_REMOTE_OP_ERR = 0x14, SYNDROME_RETRY_EXC_ERR = 0x15, SYNDROME_RNR_RETRY_EXC_ERR = 0x16, SYNDROME_LOCAL_RDD_VIOL_ERR = 0x20, SYNDROME_REMOTE_INVAL_RD_REQ_ERR = 0x21, SYNDROME_REMOTE_ABORTED_ERR = 0x22, SYNDROME_INVAL_EECN_ERR = 0x23, SYNDROME_INVAL_EEC_STATE_ERR = 0x24 }; struct mthca_cqe { __be32 my_qpn; __be32 my_ee; __be32 rqpn; __be16 sl_g_mlpath; __be16 rlid; __be32 imm_etype_pkey_eec; __be32 byte_cnt; __be32 wqe; uint8_t opcode; uint8_t is_send; uint8_t reserved; uint8_t owner; }; struct mthca_err_cqe { __be32 my_qpn; __be32 reserved1[3]; uint8_t syndrome; uint8_t vendor_err; __be16 db_cnt; __be32 reserved2; __be32 wqe; uint8_t opcode; uint8_t reserved3[2]; uint8_t owner; }; static inline struct mthca_cqe *get_cqe(struct mthca_cq *cq, int entry) { return cq->buf.buf + entry * MTHCA_CQ_ENTRY_SIZE; } static inline struct mthca_cqe *cqe_sw(struct mthca_cq *cq, int i) { struct mthca_cqe *cqe = get_cqe(cq, i); return MTHCA_CQ_ENTRY_OWNER_HW & cqe->owner ? NULL : cqe; } static inline struct mthca_cqe *next_cqe_sw(struct mthca_cq *cq) { return cqe_sw(cq, cq->cons_index & cq->ibv_cq.cqe); } static inline void set_cqe_hw(struct mthca_cqe *cqe) { VALGRIND_MAKE_MEM_UNDEFINED(cqe, sizeof *cqe); cqe->owner = MTHCA_CQ_ENTRY_OWNER_HW; } /* * incr is ignored in native Arbel (mem-free) mode, so cq->cons_index * should be correct before calling update_cons_index(). */ static inline void update_cons_index(struct mthca_cq *cq, int incr) { uint32_t doorbell[2]; if (mthca_is_memfree(cq->ibv_cq.context)) { *cq->set_ci_db = htobe32(cq->cons_index); mmio_ordered_writes_hack(); } else { doorbell[0] = MTHCA_TAVOR_CQ_DB_INC_CI | cq->cqn; doorbell[1] = incr - 1; mthca_write64(doorbell, to_mctx(cq->ibv_cq.context)->uar + MTHCA_CQ_DOORBELL); } } static void dump_cqe(void *cqe_ptr) { __be32 *cqe = cqe_ptr; int i; for (i = 0; i < 8; ++i) printf(" [%2x] %08x\n", i * 4, be32toh(cqe[i])); } static int handle_error_cqe(struct mthca_cq *cq, struct mthca_qp *qp, int wqe_index, int is_send, struct mthca_err_cqe *cqe, struct ibv_wc *wc, int *free_cqe) { int err; int dbd; __be32 new_wqe; if (cqe->syndrome == SYNDROME_LOCAL_QP_OP_ERR) { printf("local QP operation err " "(QPN %06x, WQE @ %08x, CQN %06x, index %d)\n", be32toh(cqe->my_qpn), be32toh(cqe->wqe), cq->cqn, cq->cons_index); dump_cqe(cqe); } /* * For completions in error, only work request ID, status, vendor error * (and freed resource count for RD) have to be set. 
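 * Everything else in the ibv_wc is left as-is; the switch that
 * follows just translates each hardware syndrome into the matching
 * ibv_wc_status value.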
*/ switch (cqe->syndrome) { case SYNDROME_LOCAL_LENGTH_ERR: wc->status = IBV_WC_LOC_LEN_ERR; break; case SYNDROME_LOCAL_QP_OP_ERR: wc->status = IBV_WC_LOC_QP_OP_ERR; break; case SYNDROME_LOCAL_EEC_OP_ERR: wc->status = IBV_WC_LOC_EEC_OP_ERR; break; case SYNDROME_LOCAL_PROT_ERR: wc->status = IBV_WC_LOC_PROT_ERR; break; case SYNDROME_WR_FLUSH_ERR: wc->status = IBV_WC_WR_FLUSH_ERR; break; case SYNDROME_MW_BIND_ERR: wc->status = IBV_WC_MW_BIND_ERR; break; case SYNDROME_BAD_RESP_ERR: wc->status = IBV_WC_BAD_RESP_ERR; break; case SYNDROME_LOCAL_ACCESS_ERR: wc->status = IBV_WC_LOC_ACCESS_ERR; break; case SYNDROME_REMOTE_INVAL_REQ_ERR: wc->status = IBV_WC_REM_INV_REQ_ERR; break; case SYNDROME_REMOTE_ACCESS_ERR: wc->status = IBV_WC_REM_ACCESS_ERR; break; case SYNDROME_REMOTE_OP_ERR: wc->status = IBV_WC_REM_OP_ERR; break; case SYNDROME_RETRY_EXC_ERR: wc->status = IBV_WC_RETRY_EXC_ERR; break; case SYNDROME_RNR_RETRY_EXC_ERR: wc->status = IBV_WC_RNR_RETRY_EXC_ERR; break; case SYNDROME_LOCAL_RDD_VIOL_ERR: wc->status = IBV_WC_LOC_RDD_VIOL_ERR; break; case SYNDROME_REMOTE_INVAL_RD_REQ_ERR: wc->status = IBV_WC_REM_INV_RD_REQ_ERR; break; case SYNDROME_REMOTE_ABORTED_ERR: wc->status = IBV_WC_REM_ABORT_ERR; break; case SYNDROME_INVAL_EECN_ERR: wc->status = IBV_WC_INV_EECN_ERR; break; case SYNDROME_INVAL_EEC_STATE_ERR: wc->status = IBV_WC_INV_EEC_STATE_ERR; break; default: wc->status = IBV_WC_GENERAL_ERR; break; } wc->vendor_err = cqe->vendor_err; /* * Mem-free HCAs always generate one CQE per WQE, even in the * error case, so we don't have to check the doorbell count, etc. */ if (mthca_is_memfree(cq->ibv_cq.context)) return 0; err = mthca_free_err_wqe(qp, is_send, wqe_index, &dbd, &new_wqe); if (err) return err; /* * If we're at the end of the WQE chain, or we've used up our * doorbell count, free the CQE. Otherwise just update it for * the next poll operation. * * This doesn't apply to mem-free HCAs, which never use the * doorbell count field. In that case we always free the CQE. */ if (mthca_is_memfree(cq->ibv_cq.context) || !(new_wqe & htobe32(0x3f)) || (!cqe->db_cnt && dbd)) return 0; cqe->db_cnt = htobe16(be16toh(cqe->db_cnt) - dbd); cqe->wqe = new_wqe; cqe->syndrome = SYNDROME_WR_FLUSH_ERR; *free_cqe = 0; return 0; } static inline int mthca_poll_one(struct mthca_cq *cq, struct mthca_qp **cur_qp, int *freed, struct ibv_wc *wc) { struct mthca_wq *wq; struct mthca_cqe *cqe; struct mthca_srq *srq; uint32_t qpn; int wqe_index; int is_error; int is_send; int free_cqe = 1; int err = 0; cqe = next_cqe_sw(cq); if (!cqe) return CQ_EMPTY; VALGRIND_MAKE_MEM_DEFINED(cqe, sizeof *cqe); /* * Make sure we read CQ entry contents after we've checked the * ownership bit. */ udma_from_device_barrier(); qpn = be32toh(cqe->my_qpn); is_error = (cqe->opcode & MTHCA_ERROR_CQE_OPCODE_MASK) == MTHCA_ERROR_CQE_OPCODE_MASK; is_send = is_error ? cqe->opcode & 0x01 : cqe->is_send & 0x80; if (!*cur_qp || qpn != (*cur_qp)->ibv_qp.qp_num) { /* * We do not have to take the QP table lock here, * because CQs will be locked while QPs are removed * from the table. 
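 * (The destroy path runs mthca_cq_clean(), which takes this same
 * cq->lock before the QP is dropped from the table, and we poll
 * while holding that lock, so the lookup below cannot race with QP
 * destruction.)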
*/ *cur_qp = mthca_find_qp(to_mctx(cq->ibv_cq.context), qpn); if (!*cur_qp) { err = CQ_POLL_ERR; goto out; } } wc->qp_num = (*cur_qp)->ibv_qp.qp_num; if (is_send) { wq = &(*cur_qp)->sq; wqe_index = ((be32toh(cqe->wqe) - (*cur_qp)->send_wqe_offset) >> wq->wqe_shift); wc->wr_id = (*cur_qp)->wrid[wqe_index + (*cur_qp)->rq.max]; } else if ((*cur_qp)->ibv_qp.srq) { uint32_t wqe; srq = to_msrq((*cur_qp)->ibv_qp.srq); wqe = be32toh(cqe->wqe); wq = NULL; wqe_index = wqe >> srq->wqe_shift; wc->wr_id = srq->wrid[wqe_index]; mthca_free_srq_wqe(srq, wqe_index); } else { int32_t wqe; wq = &(*cur_qp)->rq; wqe = be32toh(cqe->wqe); wqe_index = wqe >> wq->wqe_shift; /* * WQE addr == base - 1 might be reported by Sinai FW * 1.0.800 and Arbel FW 5.1.400 in receive completion * with error instead of (rq size - 1). This bug * should be fixed in later FW revisions. */ if (wqe_index < 0) wqe_index = wq->max - 1; wc->wr_id = (*cur_qp)->wrid[wqe_index]; } if (wq) { if (wq->last_comp < wqe_index) wq->tail += wqe_index - wq->last_comp; else wq->tail += wqe_index + wq->max - wq->last_comp; wq->last_comp = wqe_index; } if (is_error) { err = handle_error_cqe(cq, *cur_qp, wqe_index, is_send, (struct mthca_err_cqe *) cqe, wc, &free_cqe); goto out; } if (is_send) { wc->wc_flags = 0; switch (cqe->opcode) { case MTHCA_OPCODE_RDMA_WRITE: wc->opcode = IBV_WC_RDMA_WRITE; break; case MTHCA_OPCODE_RDMA_WRITE_IMM: wc->opcode = IBV_WC_RDMA_WRITE; wc->wc_flags |= IBV_WC_WITH_IMM; break; case MTHCA_OPCODE_SEND: wc->opcode = IBV_WC_SEND; break; case MTHCA_OPCODE_SEND_IMM: wc->opcode = IBV_WC_SEND; wc->wc_flags |= IBV_WC_WITH_IMM; break; case MTHCA_OPCODE_RDMA_READ: wc->opcode = IBV_WC_RDMA_READ; wc->byte_len = be32toh(cqe->byte_cnt); break; case MTHCA_OPCODE_ATOMIC_CS: wc->opcode = IBV_WC_COMP_SWAP; wc->byte_len = be32toh(cqe->byte_cnt); break; case MTHCA_OPCODE_ATOMIC_FA: wc->opcode = IBV_WC_FETCH_ADD; wc->byte_len = be32toh(cqe->byte_cnt); break; case MTHCA_OPCODE_BIND_MW: wc->opcode = IBV_WC_BIND_MW; break; default: /* assume it's a send completion */ wc->opcode = IBV_WC_SEND; break; } } else { wc->byte_len = be32toh(cqe->byte_cnt); switch (cqe->opcode & 0x1f) { case IBV_OPCODE_SEND_LAST_WITH_IMMEDIATE: case IBV_OPCODE_SEND_ONLY_WITH_IMMEDIATE: wc->wc_flags = IBV_WC_WITH_IMM; wc->imm_data = cqe->imm_etype_pkey_eec; wc->opcode = IBV_WC_RECV; break; case IBV_OPCODE_RDMA_WRITE_LAST_WITH_IMMEDIATE: case IBV_OPCODE_RDMA_WRITE_ONLY_WITH_IMMEDIATE: wc->wc_flags = IBV_WC_WITH_IMM; wc->imm_data = cqe->imm_etype_pkey_eec; wc->opcode = IBV_WC_RECV_RDMA_WITH_IMM; break; default: wc->wc_flags = 0; wc->opcode = IBV_WC_RECV; break; } wc->slid = be16toh(cqe->rlid); wc->sl = be16toh(cqe->sl_g_mlpath) >> 12; wc->src_qp = be32toh(cqe->rqpn) & 0xffffff; wc->dlid_path_bits = be16toh(cqe->sl_g_mlpath) & 0x7f; wc->pkey_index = be32toh(cqe->imm_etype_pkey_eec) >> 16; wc->wc_flags |= be16toh(cqe->sl_g_mlpath) & 0x80 ? IBV_WC_GRH : 0; } wc->status = IBV_WC_SUCCESS; out: if (free_cqe) { set_cqe_hw(cqe); ++(*freed); ++cq->cons_index; } return err; } int mthca_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc) { struct mthca_cq *cq = to_mcq(ibcq); struct mthca_qp *qp = NULL; int npolled; int err = CQ_OK; int freed = 0; pthread_spin_lock(&cq->lock); for (npolled = 0; npolled < ne; ++npolled) { err = mthca_poll_one(cq, &qp, &freed, wc + npolled); if (err != CQ_OK) break; } if (freed) { udma_to_device_barrier(); update_cons_index(cq, freed); } pthread_spin_unlock(&cq->lock); return err == CQ_POLL_ERR ? 
err : npolled; } int mthca_tavor_arm_cq(struct ibv_cq *cq, int solicited) { uint32_t doorbell[2]; doorbell[0] = (solicited ? MTHCA_TAVOR_CQ_DB_REQ_NOT_SOL : MTHCA_TAVOR_CQ_DB_REQ_NOT) | to_mcq(cq)->cqn; doorbell[1] = 0xffffffff; mthca_write64(doorbell, to_mctx(cq->context)->uar + MTHCA_CQ_DOORBELL); return 0; } int mthca_arbel_arm_cq(struct ibv_cq *ibvcq, int solicited) { struct mthca_cq *cq = to_mcq(ibvcq); uint32_t doorbell[2]; uint32_t sn; sn = cq->arm_sn & 3; doorbell[0] = cq->cons_index; doorbell[1] = (cq->cqn << 8) | (2 << 5) | (sn << 3) | (solicited ? 1 : 2); mthca_write64(doorbell, cq->arm_db); /* * Make sure that the doorbell record in host memory is * written before ringing the doorbell via PCI MMIO. */ udma_to_device_barrier(); doorbell[0] = (sn << 28) | (solicited ? MTHCA_ARBEL_CQ_DB_REQ_NOT_SOL : MTHCA_ARBEL_CQ_DB_REQ_NOT) | cq->cqn; doorbell[1] = cq->cons_index; mthca_write64(doorbell, to_mctx(ibvcq->context)->uar + MTHCA_CQ_DOORBELL); return 0; } void mthca_arbel_cq_event(struct ibv_cq *cq) { to_mcq(cq)->arm_sn++; } static inline int is_recv_cqe(struct mthca_cqe *cqe) { if ((cqe->opcode & MTHCA_ERROR_CQE_OPCODE_MASK) == MTHCA_ERROR_CQE_OPCODE_MASK) return !(cqe->opcode & 0x01); else return !(cqe->is_send & 0x80); } void __mthca_cq_clean(struct mthca_cq *cq, uint32_t qpn, struct mthca_srq *srq) { struct mthca_cqe *cqe; uint32_t prod_index; int i, nfreed = 0; /* * First we need to find the current producer index, so we * know where to start cleaning from. It doesn't matter if HW * adds new entries after this loop -- the QP we're worried * about is already in RESET, so the new entries won't come * from our QP and therefore don't need to be checked. */ for (prod_index = cq->cons_index; cqe_sw(cq, prod_index & cq->ibv_cq.cqe); ++prod_index) if (prod_index == cq->cons_index + cq->ibv_cq.cqe) break; /* * Now sweep backwards through the CQ, removing CQ entries * that match our QP by copying older entries on top of them. */ while ((int) --prod_index - (int) cq->cons_index >= 0) { cqe = get_cqe(cq, prod_index & cq->ibv_cq.cqe); if (cqe->my_qpn == htobe32(qpn)) { if (srq && is_recv_cqe(cqe)) mthca_free_srq_wqe(srq, be32toh(cqe->wqe) >> srq->wqe_shift); ++nfreed; } else if (nfreed) memcpy(get_cqe(cq, (prod_index + nfreed) & cq->ibv_cq.cqe), cqe, MTHCA_CQ_ENTRY_SIZE); } if (nfreed) { for (i = 0; i < nfreed; ++i) set_cqe_hw(get_cqe(cq, (cq->cons_index + i) & cq->ibv_cq.cqe)); udma_to_device_barrier(); cq->cons_index += nfreed; update_cons_index(cq, nfreed); } } void mthca_cq_clean(struct mthca_cq *cq, uint32_t qpn, struct mthca_srq *srq) { pthread_spin_lock(&cq->lock); __mthca_cq_clean(cq, qpn, srq); pthread_spin_unlock(&cq->lock); } void mthca_cq_resize_copy_cqes(struct mthca_cq *cq, void *buf, int old_cqe) { int i; /* * In Tavor mode, the hardware keeps the consumer and producer * indices mod the CQ size. Since we might be making the CQ * bigger, we need to deal with the case where the producer * index wrapped around before the CQ was resized. 
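 * The fixup below first masks cons_index down to the old ring and,
 * if the CQE at index old_cqe is still software-owned, subtracts one
 * full old ring (old_cqe + 1 entries) so the copy loop starts at the
 * correct producer position.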
*/ if (!mthca_is_memfree(cq->ibv_cq.context) && old_cqe < cq->ibv_cq.cqe) { cq->cons_index &= old_cqe; if (cqe_sw(cq, old_cqe)) cq->cons_index -= old_cqe + 1; } for (i = cq->cons_index; cqe_sw(cq, i & old_cqe); ++i) memcpy(buf + (i & cq->ibv_cq.cqe) * MTHCA_CQ_ENTRY_SIZE, get_cqe(cq, i & old_cqe), MTHCA_CQ_ENTRY_SIZE); } int mthca_alloc_cq_buf(struct mthca_device *dev, struct mthca_buf *buf, int nent) { int i; if (mthca_alloc_buf(buf, align(nent * MTHCA_CQ_ENTRY_SIZE, dev->page_size), dev->page_size)) return -1; for (i = 0; i < nent; ++i) ((struct mthca_cqe *) buf->buf)[i].owner = MTHCA_CQ_ENTRY_OWNER_HW; return 0; } rdma-core-56.1/providers/mthca/doorbell.h000066400000000000000000000004641477342711600204220ustar00rootroot00000000000000/* GPLv2 or OpenIB.org BSD (MIT) See COPYING file */ #ifndef DOORBELL_H #define DOORBELL_H #include #include "mthca.h" static inline void mthca_write64(uint32_t val[2], void *reg) { uint64_t doorbell = (((uint64_t)val[0]) << 32) | val[1]; mmio_write64_be(reg, htobe64(doorbell)); } #endif rdma-core-56.1/providers/mthca/memfree.c000066400000000000000000000113441477342711600202320ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
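 */

/*
 * doorbell.h above combines the two host-order 32-bit doorbell words
 * into one big-endian 64-bit MMIO store so the device never sees a
 * torn doorbell. A standalone sketch of the same packing, writing to
 * plain memory; demo_write64() and the memcpy stand in for the real
 * mthca_write64()/mmio_write64_be() pair:
 */
#include <endian.h>
#include <stdint.h>
#include <string.h>

static void demo_write64(const uint32_t val[2], void *reg)
{
	/* val[0] becomes the high word, val[1] the low word. */
	uint64_t db = htobe64(((uint64_t) val[0] << 32) | val[1]);

	memcpy(reg, &db, sizeof(db));
}

/*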
*/ #define _GNU_SOURCE #include #include #include #include #include #include #include "mthca.h" #define MTHCA_FREE_MAP_SIZE (MTHCA_DB_REC_PER_PAGE / (SIZEOF_LONG * 8)) struct mthca_db_page { unsigned long free[MTHCA_FREE_MAP_SIZE]; struct mthca_buf db_rec; }; struct mthca_db_table { int npages; int max_group1; int min_group2; pthread_mutex_t mutex; struct mthca_db_page page[]; }; int mthca_alloc_db(struct mthca_db_table *db_tab, enum mthca_db_type type, __be32 **db) { int i, j, k; int group, start, end, dir; int ret = 0; pthread_mutex_lock(&db_tab->mutex); switch (type) { case MTHCA_DB_TYPE_CQ_ARM: case MTHCA_DB_TYPE_SQ: group = 0; start = 0; end = db_tab->max_group1; dir = 1; break; case MTHCA_DB_TYPE_CQ_SET_CI: case MTHCA_DB_TYPE_RQ: case MTHCA_DB_TYPE_SRQ: group = 1; start = db_tab->npages - 1; end = db_tab->min_group2; dir = -1; break; default: ret = -1; goto out; } for (i = start; i != end; i += dir) if (db_tab->page[i].db_rec.buf) for (j = 0; j < MTHCA_FREE_MAP_SIZE; ++j) if (db_tab->page[i].free[j]) goto found; if (db_tab->max_group1 >= db_tab->min_group2 - 1) { ret = -1; goto out; } if (mthca_alloc_buf(&db_tab->page[i].db_rec, MTHCA_DB_REC_PAGE_SIZE, MTHCA_DB_REC_PAGE_SIZE)) { ret = -1; goto out; } memset(db_tab->page[i].db_rec.buf, 0, MTHCA_DB_REC_PAGE_SIZE); memset(db_tab->page[i].free, 0xff, sizeof db_tab->page[i].free); if (group == 0) ++db_tab->max_group1; else --db_tab->min_group2; found: for (j = 0; j < MTHCA_FREE_MAP_SIZE; ++j) { k = ffsl(db_tab->page[i].free[j]); if (k) break; } if (!k) { ret = -1; goto out; } --k; db_tab->page[i].free[j] &= ~(1UL << k); j = j * SIZEOF_LONG * 8 + k; if (group == 1) j = MTHCA_DB_REC_PER_PAGE - 1 - j; ret = i * MTHCA_DB_REC_PER_PAGE + j; *db = db_tab->page[i].db_rec.buf + j * 8; out: pthread_mutex_unlock(&db_tab->mutex); return ret; } void mthca_set_db_qn(__be32 *db, enum mthca_db_type type, uint32_t qn) { db[1] = htobe32((qn << 8) | (type << 5)); } void mthca_free_db(struct mthca_db_table *db_tab, enum mthca_db_type type, int db_index) { int i, j; struct mthca_db_page *page; i = db_index / MTHCA_DB_REC_PER_PAGE; j = db_index % MTHCA_DB_REC_PER_PAGE; page = db_tab->page + i; pthread_mutex_lock(&db_tab->mutex); *(uint64_t *) (page->db_rec.buf + j * 8) = 0; if (i >= db_tab->min_group2) j = MTHCA_DB_REC_PER_PAGE - 1 - j; page->free[j / (SIZEOF_LONG * 8)] |= 1UL << (j % (SIZEOF_LONG * 8)); pthread_mutex_unlock(&db_tab->mutex); } struct mthca_db_table *mthca_alloc_db_tab(int uarc_size) { struct mthca_db_table *db_tab; int npages; int i; npages = uarc_size / MTHCA_DB_REC_PAGE_SIZE; db_tab = malloc(sizeof (struct mthca_db_table) + npages * sizeof (struct mthca_db_page)); pthread_mutex_init(&db_tab->mutex, NULL); db_tab->npages = npages; db_tab->max_group1 = 0; db_tab->min_group2 = npages - 1; for (i = 0; i < npages; ++i) db_tab->page[i].db_rec.buf = NULL; return db_tab; } void mthca_free_db_tab(struct mthca_db_table *db_tab) { int i; if (!db_tab) return; for (i = 0; i < db_tab->npages; ++i) if (db_tab->page[i].db_rec.buf) mthca_free_buf(&db_tab->page[i].db_rec); free(db_tab); } rdma-core-56.1/providers/mthca/mthca-abi.h000066400000000000000000000044421477342711600204450ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved. * Copyright (c) 2006 Cisco Systems. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef MTHCA_ABI_H #define MTHCA_ABI_H #include #include #include DECLARE_DRV_CMD(umthca_alloc_pd, IB_USER_VERBS_CMD_ALLOC_PD, empty, mthca_alloc_pd_resp); DECLARE_DRV_CMD(umthca_create_cq, IB_USER_VERBS_CMD_CREATE_CQ, mthca_create_cq, mthca_create_cq_resp); DECLARE_DRV_CMD(umthca_create_qp, IB_USER_VERBS_CMD_CREATE_QP, mthca_create_qp, empty); DECLARE_DRV_CMD(umthca_create_srq, IB_USER_VERBS_CMD_CREATE_SRQ, mthca_create_srq, mthca_create_srq_resp); DECLARE_DRV_CMD(umthca_alloc_ucontext, IB_USER_VERBS_CMD_GET_CONTEXT, empty, mthca_alloc_ucontext_resp); DECLARE_DRV_CMD(umthca_reg_mr, IB_USER_VERBS_CMD_REG_MR, mthca_reg_mr, empty); DECLARE_DRV_CMD(umthca_resize_cq, IB_USER_VERBS_CMD_RESIZE_CQ, mthca_resize_cq, empty); #endif /* MTHCA_ABI_H */ rdma-core-56.1/providers/mthca/mthca.c000066400000000000000000000161761477342711600177160ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved. * Copyright (c) 2006 Cisco Systems. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
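 */

/*
 * The DECLARE_DRV_CMD() wrappers in mthca-abi.h above splice mthca's
 * private payload onto the generic uverbs structures; callers pass
 * "&resp.ibv_resp, sizeof resp" (see mthca_alloc_context() below) so
 * the kernel core consumes its leading part and the driver reads the
 * trailer. A schematic of that layout with made-up struct names
 * (demo_core_resp/demo_resp), not the real generated types:
 */
#include <stdint.h>

struct demo_core_resp {			/* generic uverbs response part */
	uint32_t core_field;
};

struct demo_resp {
	struct demo_core_resp ibv_resp;	/* must be the first member */
	uint32_t qp_tab_size;		/* driver-private trailer */
	uint32_t uarc_size;
};

/*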
*/ #include #include #include #include #include #include #include #include #include "mthca.h" #include "mthca-abi.h" static void mthca_free_context(struct ibv_context *ibctx); #ifndef PCI_VENDOR_ID_MELLANOX #define PCI_VENDOR_ID_MELLANOX 0x15b3 #endif #ifndef PCI_DEVICE_ID_MELLANOX_TAVOR #define PCI_DEVICE_ID_MELLANOX_TAVOR 0x5a44 #endif #ifndef PCI_DEVICE_ID_MELLANOX_ARBEL_COMPAT #define PCI_DEVICE_ID_MELLANOX_ARBEL_COMPAT 0x6278 #endif #ifndef PCI_DEVICE_ID_MELLANOX_ARBEL #define PCI_DEVICE_ID_MELLANOX_ARBEL 0x6282 #endif #ifndef PCI_DEVICE_ID_MELLANOX_SINAI_OLD #define PCI_DEVICE_ID_MELLANOX_SINAI_OLD 0x5e8c #endif #ifndef PCI_DEVICE_ID_MELLANOX_SINAI #define PCI_DEVICE_ID_MELLANOX_SINAI 0x6274 #endif #ifndef PCI_VENDOR_ID_TOPSPIN #define PCI_VENDOR_ID_TOPSPIN 0x1867 #endif #define HCA(v, d, t) \ VERBS_PCI_MATCH(PCI_VENDOR_ID_##v, PCI_DEVICE_ID_MELLANOX_##d, \ (void *)(MTHCA_##t)) static const struct verbs_match_ent hca_table[] = { HCA(MELLANOX, TAVOR, TAVOR), HCA(MELLANOX, ARBEL_COMPAT, TAVOR), HCA(MELLANOX, ARBEL, ARBEL), HCA(MELLANOX, SINAI_OLD, ARBEL), HCA(MELLANOX, SINAI, ARBEL), HCA(TOPSPIN, TAVOR, TAVOR), HCA(TOPSPIN, ARBEL_COMPAT, TAVOR), HCA(TOPSPIN, ARBEL, ARBEL), HCA(TOPSPIN, SINAI_OLD, ARBEL), HCA(TOPSPIN, SINAI, ARBEL), {} }; static const struct verbs_context_ops mthca_ctx_common_ops = { .query_device_ex = mthca_query_device, .query_port = mthca_query_port, .alloc_pd = mthca_alloc_pd, .dealloc_pd = mthca_free_pd, .reg_mr = mthca_reg_mr, .dereg_mr = mthca_dereg_mr, .create_cq = mthca_create_cq, .poll_cq = mthca_poll_cq, .resize_cq = mthca_resize_cq, .destroy_cq = mthca_destroy_cq, .create_srq = mthca_create_srq, .modify_srq = mthca_modify_srq, .query_srq = mthca_query_srq, .destroy_srq = mthca_destroy_srq, .create_qp = mthca_create_qp, .query_qp = mthca_query_qp, .modify_qp = mthca_modify_qp, .destroy_qp = mthca_destroy_qp, .create_ah = mthca_create_ah, .destroy_ah = mthca_destroy_ah, .attach_mcast = ibv_cmd_attach_mcast, .detach_mcast = ibv_cmd_detach_mcast, .free_context = mthca_free_context, }; static const struct verbs_context_ops mthca_ctx_arbel_ops = { .cq_event = mthca_arbel_cq_event, .post_recv = mthca_arbel_post_recv, .post_send = mthca_arbel_post_send, .post_srq_recv = mthca_arbel_post_srq_recv, .req_notify_cq = mthca_arbel_arm_cq, }; static const struct verbs_context_ops mthca_ctx_tavor_ops = { .post_recv = mthca_tavor_post_recv, .post_send = mthca_tavor_post_send, .post_srq_recv = mthca_tavor_post_srq_recv, .req_notify_cq = mthca_tavor_arm_cq, }; static struct verbs_context *mthca_alloc_context(struct ibv_device *ibdev, int cmd_fd, void *private_data) { struct mthca_context *context; struct ibv_get_context cmd; struct umthca_alloc_ucontext_resp resp; int i; context = verbs_init_and_alloc_context(ibdev, cmd_fd, context, ibv_ctx, RDMA_DRIVER_MTHCA); if (!context) return NULL; if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof cmd, &resp.ibv_resp, sizeof resp)) goto err_free; context->num_qps = resp.qp_tab_size; context->qp_table_shift = ffs(context->num_qps) - 1 - MTHCA_QP_TABLE_BITS; context->qp_table_mask = (1 << context->qp_table_shift) - 1; if (mthca_is_memfree(&context->ibv_ctx.context)) { context->db_tab = mthca_alloc_db_tab(resp.uarc_size); if (!context->db_tab) goto err_free; } else context->db_tab = NULL; pthread_mutex_init(&context->qp_table_mutex, NULL); for (i = 0; i < MTHCA_QP_TABLE_SIZE; ++i) context->qp_table[i].refcnt = 0; context->uar = mmap(NULL, to_mdev(ibdev)->page_size, PROT_WRITE, MAP_SHARED, cmd_fd, 0); if (context->uar == MAP_FAILED) 
goto err_db_tab; pthread_spin_init(&context->uar_lock, PTHREAD_PROCESS_PRIVATE); context->pd = mthca_alloc_pd(&context->ibv_ctx.context); if (!context->pd) goto err_unmap; context->pd->context = &context->ibv_ctx.context; verbs_set_ops(&context->ibv_ctx, &mthca_ctx_common_ops); if (mthca_is_memfree(&context->ibv_ctx.context)) verbs_set_ops(&context->ibv_ctx, &mthca_ctx_arbel_ops); else verbs_set_ops(&context->ibv_ctx, &mthca_ctx_tavor_ops); return &context->ibv_ctx; err_unmap: munmap(context->uar, to_mdev(ibdev)->page_size); err_db_tab: mthca_free_db_tab(context->db_tab); err_free: verbs_uninit_context(&context->ibv_ctx); free(context); return NULL; } static void mthca_free_context(struct ibv_context *ibctx) { struct mthca_context *context = to_mctx(ibctx); mthca_free_pd(context->pd); munmap(context->uar, to_mdev(ibctx->device)->page_size); mthca_free_db_tab(context->db_tab); verbs_uninit_context(&context->ibv_ctx); free(context); } static void mthca_uninit_device(struct verbs_device *verbs_device) { struct mthca_device *dev = to_mdev(&verbs_device->device); free(dev); } static struct verbs_device * mthca_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct mthca_device *dev; dev = calloc(1, sizeof(*dev)); if (!dev) return NULL; dev->hca_type = (uintptr_t)sysfs_dev->match->driver_data; dev->page_size = sysconf(_SC_PAGESIZE); return &dev->ibv_dev; } static const struct verbs_device_ops mthca_dev_ops = { .name = "mthca", .match_min_abi_version = 0, .match_max_abi_version = MTHCA_UVERBS_ABI_VERSION, .match_table = hca_table, .alloc_device = mthca_device_alloc, .uninit_device = mthca_uninit_device, .alloc_context = mthca_alloc_context, }; PROVIDER_DRIVER(mthca, mthca_dev_ops); rdma-core-56.1/providers/mthca/mthca.h000066400000000000000000000243731477342711600177210ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005, 2006 Cisco Systems. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
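 */

/*
 * PROVIDER_DRIVER(mthca, ...) above registers the driver with
 * libibverbs; applications never call mthca_alloc_context() directly.
 * A minimal caller using only the public verbs API (opening whatever
 * device is listed first, which reaches mthca_alloc_context() when
 * that device is an mthca HCA):
 */
#include <stdio.h>
#include <infiniband/verbs.h>

static int demo_open_first_device(void)
{
	struct ibv_device **list = ibv_get_device_list(NULL);
	struct ibv_context *ctx = NULL;

	if (!list)
		return -1;
	if (!list[0]) {
		ibv_free_device_list(list);
		return -1;
	}

	ctx = ibv_open_device(list[0]);
	if (ctx) {
		printf("opened %s\n", ibv_get_device_name(list[0]));
		ibv_close_device(ctx);
	}
	ibv_free_device_list(list);
	return ctx ? 0 : -1;
}

/*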
*/ #ifndef MTHCA_H #define MTHCA_H #include #include #include #include #define PFX "mthca: " enum mthca_hca_type { MTHCA_TAVOR, MTHCA_ARBEL }; enum { MTHCA_CQ_ENTRY_SIZE = 0x20 }; enum { MTHCA_QP_TABLE_BITS = 8, MTHCA_QP_TABLE_SIZE = 1 << MTHCA_QP_TABLE_BITS, MTHCA_QP_TABLE_MASK = MTHCA_QP_TABLE_SIZE - 1 }; enum { MTHCA_DB_REC_PAGE_SIZE = 4096, MTHCA_DB_REC_PER_PAGE = MTHCA_DB_REC_PAGE_SIZE / 8 }; enum mthca_db_type { MTHCA_DB_TYPE_INVALID = 0x0, MTHCA_DB_TYPE_CQ_SET_CI = 0x1, MTHCA_DB_TYPE_CQ_ARM = 0x2, MTHCA_DB_TYPE_SQ = 0x3, MTHCA_DB_TYPE_RQ = 0x4, MTHCA_DB_TYPE_SRQ = 0x5, MTHCA_DB_TYPE_GROUP_SEP = 0x7 }; enum { MTHCA_OPCODE_NOP = 0x00, MTHCA_OPCODE_RDMA_WRITE = 0x08, MTHCA_OPCODE_RDMA_WRITE_IMM = 0x09, MTHCA_OPCODE_SEND = 0x0a, MTHCA_OPCODE_SEND_IMM = 0x0b, MTHCA_OPCODE_RDMA_READ = 0x10, MTHCA_OPCODE_ATOMIC_CS = 0x11, MTHCA_OPCODE_ATOMIC_FA = 0x12, MTHCA_OPCODE_BIND_MW = 0x18, MTHCA_OPCODE_INVALID = 0xff }; struct mthca_ah_page; struct mthca_device { struct verbs_device ibv_dev; enum mthca_hca_type hca_type; int page_size; }; struct mthca_db_table; struct mthca_context { struct verbs_context ibv_ctx; void *uar; pthread_spinlock_t uar_lock; struct mthca_db_table *db_tab; struct ibv_pd *pd; struct { struct mthca_qp **table; int refcnt; } qp_table[MTHCA_QP_TABLE_SIZE]; pthread_mutex_t qp_table_mutex; int num_qps; int qp_table_shift; int qp_table_mask; }; struct mthca_buf { void *buf; size_t length; }; struct mthca_pd { struct ibv_pd ibv_pd; struct mthca_ah_page *ah_list; pthread_mutex_t ah_mutex; uint32_t pdn; }; struct mthca_cq { struct ibv_cq ibv_cq; struct mthca_buf buf; pthread_spinlock_t lock; struct ibv_mr *mr; uint32_t cqn; uint32_t cons_index; /* Next fields are mem-free only */ int set_ci_db_index; __be32 *set_ci_db; int arm_db_index; __be32 *arm_db; int arm_sn; }; struct mthca_srq { struct ibv_srq ibv_srq; struct mthca_buf buf; void *last; pthread_spinlock_t lock; struct ibv_mr *mr; uint64_t *wrid; uint32_t srqn; int max; int max_gs; int wqe_shift; int first_free; int last_free; int buf_size; /* Next fields are mem-free only */ int db_index; __be32 *db; uint16_t counter; }; struct mthca_wq { pthread_spinlock_t lock; int max; unsigned next_ind; unsigned last_comp; unsigned head; unsigned tail; void *last; int max_gs; int wqe_shift; /* Next fields are mem-free only */ int db_index; __be32 *db; }; struct mthca_qp { struct ibv_qp ibv_qp; struct mthca_buf buf; uint64_t *wrid; int send_wqe_offset; int max_inline_data; int buf_size; struct mthca_wq sq; struct mthca_wq rq; struct ibv_mr *mr; int sq_sig_all; }; struct mthca_av { __be32 port_pd; uint8_t reserved1; uint8_t g_slid; __be16 dlid; uint8_t reserved2; uint8_t gid_index; uint8_t msg_sr; uint8_t hop_limit; __be32 sl_tclass_flowlabel; __be32 dgid[4]; }; struct mthca_ah { struct ibv_ah ibv_ah; struct mthca_av *av; struct mthca_ah_page *page; uint32_t key; }; static inline unsigned long align(unsigned long val, unsigned long align) { return (val + align - 1) & ~(align - 1); } static inline uintptr_t db_align(__be32 *db) { return (uintptr_t) db & ~((uintptr_t) MTHCA_DB_REC_PAGE_SIZE - 1); } #define to_mxxx(xxx, type) container_of(ib##xxx, struct mthca_##type, ibv_##xxx) static inline struct mthca_device *to_mdev(struct ibv_device *ibdev) { return container_of(ibdev, struct mthca_device, ibv_dev.device); } static inline struct mthca_context *to_mctx(struct ibv_context *ibctx) { return container_of(ibctx, struct mthca_context, ibv_ctx.context); } static inline struct mthca_pd *to_mpd(struct ibv_pd *ibpd) { return to_mxxx(pd, pd); } 
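/*
 * Note that to_mpd() above and the casts below are pure pointer
 * arithmetic: each ibv_* object handed to libibverbs is embedded in a
 * larger mthca_* structure, and container_of() subtracts the member
 * offset to recover it. No lookup table or allocation is involved.
 */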
static inline struct mthca_cq *to_mcq(struct ibv_cq *ibcq) { return to_mxxx(cq, cq); } static inline struct mthca_srq *to_msrq(struct ibv_srq *ibsrq) { return to_mxxx(srq, srq); } static inline struct mthca_qp *to_mqp(struct ibv_qp *ibqp) { return to_mxxx(qp, qp); } static inline struct mthca_ah *to_mah(struct ibv_ah *ibah) { return to_mxxx(ah, ah); } static inline int mthca_is_memfree(struct ibv_context *ibctx) { return to_mdev(ibctx->device)->hca_type == MTHCA_ARBEL; } int mthca_alloc_buf(struct mthca_buf *buf, size_t size, int page_size); void mthca_free_buf(struct mthca_buf *buf); int mthca_alloc_db(struct mthca_db_table *db_tab, enum mthca_db_type type, __be32 **db); void mthca_set_db_qn(__be32 *db, enum mthca_db_type type, uint32_t qn); void mthca_free_db(struct mthca_db_table *db_tab, enum mthca_db_type type, int db_index); struct mthca_db_table *mthca_alloc_db_tab(int uarc_size); void mthca_free_db_tab(struct mthca_db_table *db_tab); int mthca_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); int mthca_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr); struct ibv_pd *mthca_alloc_pd(struct ibv_context *context); int mthca_free_pd(struct ibv_pd *pd); struct ibv_mr *mthca_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access); int mthca_dereg_mr(struct verbs_mr *mr); struct ibv_cq *mthca_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); int mthca_resize_cq(struct ibv_cq *cq, int cqe); int mthca_destroy_cq(struct ibv_cq *cq); int mthca_poll_cq(struct ibv_cq *cq, int ne, struct ibv_wc *wc); int mthca_tavor_arm_cq(struct ibv_cq *cq, int solicited); int mthca_arbel_arm_cq(struct ibv_cq *cq, int solicited); void mthca_arbel_cq_event(struct ibv_cq *cq); void __mthca_cq_clean(struct mthca_cq *cq, uint32_t qpn, struct mthca_srq *srq); void mthca_cq_clean(struct mthca_cq *cq, uint32_t qpn, struct mthca_srq *srq); void mthca_cq_resize_copy_cqes(struct mthca_cq *cq, void *buf, int new_cqe); int mthca_alloc_cq_buf(struct mthca_device *dev, struct mthca_buf *buf, int nent); struct ibv_srq *mthca_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr); int mthca_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, int mask); int mthca_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr); int mthca_destroy_srq(struct ibv_srq *srq); int mthca_alloc_srq_buf(struct ibv_pd *pd, struct ibv_srq_attr *attr, struct mthca_srq *srq); void mthca_free_srq_wqe(struct mthca_srq *srq, int ind); int mthca_tavor_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); int mthca_arbel_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); struct ibv_qp *mthca_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); int mthca_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); int mthca_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask); int mthca_destroy_qp(struct ibv_qp *qp); void mthca_init_qp_indices(struct mthca_qp *qp); int mthca_tavor_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr); int mthca_tavor_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); int mthca_arbel_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr); int mthca_arbel_post_recv(struct 
ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); int mthca_alloc_qp_buf(struct ibv_pd *pd, struct ibv_qp_cap *cap, enum ibv_qp_type type, struct mthca_qp *qp); struct mthca_qp *mthca_find_qp(struct mthca_context *ctx, uint32_t qpn); int mthca_store_qp(struct mthca_context *ctx, uint32_t qpn, struct mthca_qp *qp); void mthca_clear_qp(struct mthca_context *ctx, uint32_t qpn); int mthca_free_err_wqe(struct mthca_qp *qp, int is_send, int index, int *dbd, __be32 *new_wqe); struct ibv_ah *mthca_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr); int mthca_destroy_ah(struct ibv_ah *ah); int mthca_alloc_av(struct mthca_pd *pd, struct ibv_ah_attr *attr, struct mthca_ah *ah); void mthca_free_av(struct mthca_ah *ah); #endif /* MTHCA_H */ rdma-core-56.1/providers/mthca/qp.c000066400000000000000000000572141477342711600172400ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005 Mellanox Technologies Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
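 */

/*
 * The post verbs declared in mthca.h above are installed via
 * verbs_set_ops(), so callers go through the generic entry points. A
 * minimal signaled SEND of a single SGE; demo_post_send() is a
 * hypothetical caller and assumes qp, mr and buf were set up earlier:
 */
#include <stdint.h>
#include <infiniband/verbs.h>

static int demo_post_send(struct ibv_qp *qp, struct ibv_mr *mr,
			  void *buf, uint32_t len)
{
	struct ibv_sge sge = {
		.addr	= (uintptr_t) buf,
		.length	= len,
		.lkey	= mr->lkey,
	};
	struct ibv_send_wr wr = {
		.wr_id	    = 1,
		.sg_list    = &sge,
		.num_sge    = 1,
		.opcode	    = IBV_WR_SEND,
		.send_flags = IBV_SEND_SIGNALED,
	};
	struct ibv_send_wr *bad_wr;

	/* Dispatches to mthca_tavor_post_send() or mthca_arbel_post_send(). */
	return ibv_post_send(qp, &wr, &bad_wr);
}

/*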
*/ #include #include #include #include #include #include #include "mthca.h" #include "doorbell.h" #include "wqe.h" enum { MTHCA_SEND_DOORBELL_FENCE = 1 << 5 }; static const uint8_t mthca_opcode[] = { [IBV_WR_SEND] = MTHCA_OPCODE_SEND, [IBV_WR_SEND_WITH_IMM] = MTHCA_OPCODE_SEND_IMM, [IBV_WR_RDMA_WRITE] = MTHCA_OPCODE_RDMA_WRITE, [IBV_WR_RDMA_WRITE_WITH_IMM] = MTHCA_OPCODE_RDMA_WRITE_IMM, [IBV_WR_RDMA_READ] = MTHCA_OPCODE_RDMA_READ, [IBV_WR_ATOMIC_CMP_AND_SWP] = MTHCA_OPCODE_ATOMIC_CS, [IBV_WR_ATOMIC_FETCH_AND_ADD] = MTHCA_OPCODE_ATOMIC_FA, }; static void *get_recv_wqe(struct mthca_qp *qp, int n) { return qp->buf.buf + (n << qp->rq.wqe_shift); } static void *get_send_wqe(struct mthca_qp *qp, int n) { return qp->buf.buf + qp->send_wqe_offset + (n << qp->sq.wqe_shift); } void mthca_init_qp_indices(struct mthca_qp *qp) { qp->sq.next_ind = 0; qp->sq.last_comp = qp->sq.max - 1; qp->sq.head = 0; qp->sq.tail = 0; qp->sq.last = get_send_wqe(qp, qp->sq.max - 1); qp->rq.next_ind = 0; qp->rq.last_comp = qp->rq.max - 1; qp->rq.head = 0; qp->rq.tail = 0; qp->rq.last = get_recv_wqe(qp, qp->rq.max - 1); } static inline int wq_overflow(struct mthca_wq *wq, int nreq, struct mthca_cq *cq) { unsigned cur; cur = wq->head - wq->tail; if (cur + nreq < wq->max) return 0; pthread_spin_lock(&cq->lock); cur = wq->head - wq->tail; pthread_spin_unlock(&cq->lock); return cur + nreq >= wq->max; } int mthca_tavor_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { struct mthca_qp *qp = to_mqp(ibqp); void *wqe, *prev_wqe; int ind; int nreq; int ret = 0; int size; int size0 = 0; int i; uint32_t uninitialized_var(f0); uint32_t uninitialized_var(op0); pthread_spin_lock(&qp->sq.lock); udma_to_device_barrier(); ind = qp->sq.next_ind; for (nreq = 0; wr; ++nreq, wr = wr->next) { if (wq_overflow(&qp->sq, nreq, to_mcq(qp->ibv_qp.send_cq))) { ret = -1; *bad_wr = wr; goto out; } wqe = get_send_wqe(qp, ind); prev_wqe = qp->sq.last; qp->sq.last = wqe; ((struct mthca_next_seg *) wqe)->nda_op = 0; ((struct mthca_next_seg *) wqe)->ee_nds = 0; ((struct mthca_next_seg *) wqe)->flags = ((wr->send_flags & IBV_SEND_SIGNALED) ? htobe32(MTHCA_NEXT_CQ_UPDATE) : 0) | ((wr->send_flags & IBV_SEND_SOLICITED) ? 
htobe32(MTHCA_NEXT_SOLICIT) : 0) | htobe32(1); if (wr->opcode == IBV_WR_SEND_WITH_IMM || wr->opcode == IBV_WR_RDMA_WRITE_WITH_IMM) ((struct mthca_next_seg *) wqe)->imm = wr->imm_data; wqe += sizeof (struct mthca_next_seg); size = sizeof (struct mthca_next_seg) / 16; switch (ibqp->qp_type) { case IBV_QPT_RC: switch (wr->opcode) { case IBV_WR_ATOMIC_CMP_AND_SWP: case IBV_WR_ATOMIC_FETCH_AND_ADD: ((struct mthca_raddr_seg *) wqe)->raddr = htobe64(wr->wr.atomic.remote_addr); ((struct mthca_raddr_seg *) wqe)->rkey = htobe32(wr->wr.atomic.rkey); ((struct mthca_raddr_seg *) wqe)->reserved = 0; wqe += sizeof (struct mthca_raddr_seg); if (wr->opcode == IBV_WR_ATOMIC_CMP_AND_SWP) { ((struct mthca_atomic_seg *) wqe)->swap_add = htobe64(wr->wr.atomic.swap); ((struct mthca_atomic_seg *) wqe)->compare = htobe64(wr->wr.atomic.compare_add); } else { ((struct mthca_atomic_seg *) wqe)->swap_add = htobe64(wr->wr.atomic.compare_add); ((struct mthca_atomic_seg *) wqe)->compare = 0; } wqe += sizeof (struct mthca_atomic_seg); size += (sizeof (struct mthca_raddr_seg) + sizeof (struct mthca_atomic_seg)) / 16; break; case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_WRITE_WITH_IMM: case IBV_WR_RDMA_READ: ((struct mthca_raddr_seg *) wqe)->raddr = htobe64(wr->wr.rdma.remote_addr); ((struct mthca_raddr_seg *) wqe)->rkey = htobe32(wr->wr.rdma.rkey); ((struct mthca_raddr_seg *) wqe)->reserved = 0; wqe += sizeof (struct mthca_raddr_seg); size += sizeof (struct mthca_raddr_seg) / 16; break; default: /* No extra segments required for sends */ break; } break; case IBV_QPT_UC: switch (wr->opcode) { case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_WRITE_WITH_IMM: ((struct mthca_raddr_seg *) wqe)->raddr = htobe64(wr->wr.rdma.remote_addr); ((struct mthca_raddr_seg *) wqe)->rkey = htobe32(wr->wr.rdma.rkey); ((struct mthca_raddr_seg *) wqe)->reserved = 0; wqe += sizeof (struct mthca_raddr_seg); size += sizeof (struct mthca_raddr_seg) / 16; break; default: /* No extra segments required for sends */ break; } break; case IBV_QPT_UD: ((struct mthca_tavor_ud_seg *) wqe)->lkey = htobe32(to_mah(wr->wr.ud.ah)->key); ((struct mthca_tavor_ud_seg *) wqe)->av_addr = htobe64((uintptr_t) to_mah(wr->wr.ud.ah)->av); ((struct mthca_tavor_ud_seg *) wqe)->dqpn = htobe32(wr->wr.ud.remote_qpn); ((struct mthca_tavor_ud_seg *) wqe)->qkey = htobe32(wr->wr.ud.remote_qkey); wqe += sizeof (struct mthca_tavor_ud_seg); size += sizeof (struct mthca_tavor_ud_seg) / 16; break; default: break; } if (wr->num_sge > qp->sq.max_gs) { ret = -1; *bad_wr = wr; goto out; } if (wr->send_flags & IBV_SEND_INLINE) { if (wr->num_sge) { struct mthca_inline_seg *seg = wqe; int s = 0; wqe += sizeof *seg; for (i = 0; i < wr->num_sge; ++i) { struct ibv_sge *sge = &wr->sg_list[i]; s += sge->length; if (s > qp->max_inline_data) { ret = -1; *bad_wr = wr; goto out; } memcpy(wqe, (void *) (intptr_t) sge->addr, sge->length); wqe += sge->length; } seg->byte_count = htobe32(MTHCA_INLINE_SEG | s); size += align(s + sizeof *seg, 16) / 16; } } else { struct mthca_data_seg *seg; for (i = 0; i < wr->num_sge; ++i) { seg = wqe; seg->byte_count = htobe32(wr->sg_list[i].length); seg->lkey = htobe32(wr->sg_list[i].lkey); seg->addr = htobe64(wr->sg_list[i].addr); wqe += sizeof *seg; } size += wr->num_sge * (sizeof *seg / 16); } qp->wrid[ind + qp->rq.max] = wr->wr_id; if (wr->opcode >= sizeof mthca_opcode / sizeof mthca_opcode[0]) { ret = -1; *bad_wr = wr; goto out; } ((struct mthca_next_seg *) prev_wqe)->nda_op = htobe32(((ind << qp->sq.wqe_shift) + qp->send_wqe_offset) | mthca_opcode[wr->opcode]); /* * Make sure 
	 * that nda_op is written before setting ee_nds.
	 */
	udma_ordering_write_barrier();

		((struct mthca_next_seg *) prev_wqe)->ee_nds =
			htobe32((size0 ? 0 : MTHCA_NEXT_DBD) | size |
				((wr->send_flags & IBV_SEND_FENCE) ?
				 MTHCA_NEXT_FENCE : 0));

		if (!size0) {
			size0 = size;
			op0 = mthca_opcode[wr->opcode];
			f0 = wr->send_flags & IBV_SEND_FENCE ?
				MTHCA_SEND_DOORBELL_FENCE : 0;
		}

		++ind;
		if (ind >= qp->sq.max)
			ind -= qp->sq.max;
	}

out:
	if (nreq) {
		uint32_t doorbell[2];

		doorbell[0] = ((qp->sq.next_ind << qp->sq.wqe_shift) +
			       qp->send_wqe_offset) | f0 | op0;
		doorbell[1] = (ibqp->qp_num << 8) | size0;

		udma_to_device_barrier();
		mthca_write64(doorbell,
			      to_mctx(ibqp->context)->uar + MTHCA_SEND_DOORBELL);
	}

	qp->sq.next_ind = ind;
	qp->sq.head    += nreq;

	pthread_spin_unlock(&qp->sq.lock);
	return ret;
}

int mthca_tavor_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr,
			  struct ibv_recv_wr **bad_wr)
{
	struct mthca_qp *qp = to_mqp(ibqp);
	uint32_t doorbell[2];
	int ret = 0;
	int nreq;
	int i;
	int size;
	int size0 = 0;
	int ind;
	void *wqe;
	void *prev_wqe;

	pthread_spin_lock(&qp->rq.lock);

	ind = qp->rq.next_ind;

	for (nreq = 0; wr; wr = wr->next) {
		if (wq_overflow(&qp->rq, nreq, to_mcq(qp->ibv_qp.recv_cq))) {
			ret = -1;
			*bad_wr = wr;
			goto out;
		}

		wqe = get_recv_wqe(qp, ind);
		prev_wqe = qp->rq.last;
		qp->rq.last = wqe;

		((struct mthca_next_seg *) wqe)->ee_nds =
			htobe32(MTHCA_NEXT_DBD);
		((struct mthca_next_seg *) wqe)->flags =
			htobe32(MTHCA_NEXT_CQ_UPDATE);

		wqe += sizeof (struct mthca_next_seg);
		size = sizeof (struct mthca_next_seg) / 16;

		if (wr->num_sge > qp->rq.max_gs) {
			ret = -1;
			*bad_wr = wr;
			goto out;
		}

		for (i = 0; i < wr->num_sge; ++i) {
			((struct mthca_data_seg *) wqe)->byte_count =
				htobe32(wr->sg_list[i].length);
			((struct mthca_data_seg *) wqe)->lkey =
				htobe32(wr->sg_list[i].lkey);
			((struct mthca_data_seg *) wqe)->addr =
				htobe64(wr->sg_list[i].addr);
			wqe += sizeof (struct mthca_data_seg);
			size += sizeof (struct mthca_data_seg) / 16;
		}

		qp->wrid[ind] = wr->wr_id;

		((struct mthca_next_seg *) prev_wqe)->ee_nds =
			htobe32(MTHCA_NEXT_DBD | size);

		if (!size0)
			size0 = size;

		++ind;
		if (ind >= qp->rq.max)
			ind -= qp->rq.max;

		++nreq;
		if (nreq == MTHCA_TAVOR_MAX_WQES_PER_RECV_DB) {
			nreq = 0;

			doorbell[0] = (qp->rq.next_ind << qp->rq.wqe_shift) | size0;
			doorbell[1] = ibqp->qp_num << 8;

			/*
			 * Make sure that descriptors are written
			 * before doorbell is rung.
			 */
			udma_to_device_barrier();
			mthca_write64(doorbell,
				      to_mctx(ibqp->context)->uar + MTHCA_RECV_DOORBELL);

			qp->rq.next_ind = ind;
			qp->rq.head    += MTHCA_TAVOR_MAX_WQES_PER_RECV_DB;
			size0 = 0;
		}
	}

out:
	if (nreq) {
		doorbell[0] = (qp->rq.next_ind << qp->rq.wqe_shift) | size0;
		doorbell[1] = (ibqp->qp_num << 8) | nreq;

		/*
		 * Make sure that descriptors are written before
		 * doorbell is rung.
*/ udma_to_device_barrier(); mthca_write64(doorbell, to_mctx(ibqp->context)->uar + MTHCA_RECV_DOORBELL); } qp->rq.next_ind = ind; qp->rq.head += nreq; pthread_spin_unlock(&qp->rq.lock); return ret; } int mthca_arbel_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { struct mthca_qp *qp = to_mqp(ibqp); uint32_t doorbell[2]; void *wqe, *prev_wqe; int ind; int nreq; int ret = 0; int size; int size0 = 0; int i; uint32_t uninitialized_var(f0); uint32_t uninitialized_var(op0); pthread_spin_lock(&qp->sq.lock); /* XXX check that state is OK to post send */ ind = qp->sq.head & (qp->sq.max - 1); for (nreq = 0; wr; ++nreq, wr = wr->next) { if (nreq == MTHCA_ARBEL_MAX_WQES_PER_SEND_DB) { nreq = 0; doorbell[0] = (MTHCA_ARBEL_MAX_WQES_PER_SEND_DB << 24) | ((qp->sq.head & 0xffff) << 8) | f0 | op0; doorbell[1] = (ibqp->qp_num << 8) | size0; qp->sq.head += MTHCA_ARBEL_MAX_WQES_PER_SEND_DB; /* * Make sure that descriptors are written before * doorbell record. */ udma_to_device_barrier(); *qp->sq.db = htobe32(qp->sq.head & 0xffff); /* * Make sure doorbell record is written before we * write MMIO send doorbell. */ mmio_ordered_writes_hack(); mthca_write64(doorbell, to_mctx(ibqp->context)->uar + MTHCA_SEND_DOORBELL); size0 = 0; } if (wq_overflow(&qp->sq, nreq, to_mcq(qp->ibv_qp.send_cq))) { ret = -1; *bad_wr = wr; goto out; } wqe = get_send_wqe(qp, ind); prev_wqe = qp->sq.last; qp->sq.last = wqe; ((struct mthca_next_seg *) wqe)->flags = ((wr->send_flags & IBV_SEND_SIGNALED) ? htobe32(MTHCA_NEXT_CQ_UPDATE) : 0) | ((wr->send_flags & IBV_SEND_SOLICITED) ? htobe32(MTHCA_NEXT_SOLICIT) : 0) | htobe32(1); if (wr->opcode == IBV_WR_SEND_WITH_IMM || wr->opcode == IBV_WR_RDMA_WRITE_WITH_IMM) ((struct mthca_next_seg *) wqe)->imm = wr->imm_data; wqe += sizeof (struct mthca_next_seg); size = sizeof (struct mthca_next_seg) / 16; switch (ibqp->qp_type) { case IBV_QPT_RC: switch (wr->opcode) { case IBV_WR_ATOMIC_CMP_AND_SWP: case IBV_WR_ATOMIC_FETCH_AND_ADD: ((struct mthca_raddr_seg *) wqe)->raddr = htobe64(wr->wr.atomic.remote_addr); ((struct mthca_raddr_seg *) wqe)->rkey = htobe32(wr->wr.atomic.rkey); ((struct mthca_raddr_seg *) wqe)->reserved = 0; wqe += sizeof (struct mthca_raddr_seg); if (wr->opcode == IBV_WR_ATOMIC_CMP_AND_SWP) { ((struct mthca_atomic_seg *) wqe)->swap_add = htobe64(wr->wr.atomic.swap); ((struct mthca_atomic_seg *) wqe)->compare = htobe64(wr->wr.atomic.compare_add); } else { ((struct mthca_atomic_seg *) wqe)->swap_add = htobe64(wr->wr.atomic.compare_add); ((struct mthca_atomic_seg *) wqe)->compare = 0; } wqe += sizeof (struct mthca_atomic_seg); size += (sizeof (struct mthca_raddr_seg) + sizeof (struct mthca_atomic_seg)) / 16; break; case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_WRITE_WITH_IMM: case IBV_WR_RDMA_READ: ((struct mthca_raddr_seg *) wqe)->raddr = htobe64(wr->wr.rdma.remote_addr); ((struct mthca_raddr_seg *) wqe)->rkey = htobe32(wr->wr.rdma.rkey); ((struct mthca_raddr_seg *) wqe)->reserved = 0; wqe += sizeof (struct mthca_raddr_seg); size += sizeof (struct mthca_raddr_seg) / 16; break; default: /* No extra segments required for sends */ break; } break; case IBV_QPT_UC: switch (wr->opcode) { case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_WRITE_WITH_IMM: ((struct mthca_raddr_seg *) wqe)->raddr = htobe64(wr->wr.rdma.remote_addr); ((struct mthca_raddr_seg *) wqe)->rkey = htobe32(wr->wr.rdma.rkey); ((struct mthca_raddr_seg *) wqe)->reserved = 0; wqe += sizeof (struct mthca_raddr_seg); size += sizeof (struct mthca_raddr_seg) / 16; break; default: /* No extra segments required 
for sends */ break; } break; case IBV_QPT_UD: memcpy(((struct mthca_arbel_ud_seg *) wqe)->av, to_mah(wr->wr.ud.ah)->av, sizeof (struct mthca_av)); ((struct mthca_arbel_ud_seg *) wqe)->dqpn = htobe32(wr->wr.ud.remote_qpn); ((struct mthca_arbel_ud_seg *) wqe)->qkey = htobe32(wr->wr.ud.remote_qkey); wqe += sizeof (struct mthca_arbel_ud_seg); size += sizeof (struct mthca_arbel_ud_seg) / 16; break; default: break; } if (wr->num_sge > qp->sq.max_gs) { ret = -1; *bad_wr = wr; goto out; } if (wr->send_flags & IBV_SEND_INLINE) { if (wr->num_sge) { struct mthca_inline_seg *seg = wqe; int s = 0; wqe += sizeof *seg; for (i = 0; i < wr->num_sge; ++i) { struct ibv_sge *sge = &wr->sg_list[i]; s += sge->length; if (s > qp->max_inline_data) { ret = -1; *bad_wr = wr; goto out; } memcpy(wqe, (void *) (uintptr_t) sge->addr, sge->length); wqe += sge->length; } seg->byte_count = htobe32(MTHCA_INLINE_SEG | s); size += align(s + sizeof *seg, 16) / 16; } } else { struct mthca_data_seg *seg; for (i = 0; i < wr->num_sge; ++i) { seg = wqe; seg->byte_count = htobe32(wr->sg_list[i].length); seg->lkey = htobe32(wr->sg_list[i].lkey); seg->addr = htobe64(wr->sg_list[i].addr); wqe += sizeof *seg; } size += wr->num_sge * (sizeof *seg / 16); } qp->wrid[ind + qp->rq.max] = wr->wr_id; if (wr->opcode >= sizeof mthca_opcode / sizeof mthca_opcode[0]) { ret = -1; *bad_wr = wr; goto out; } ((struct mthca_next_seg *) prev_wqe)->nda_op = htobe32(((ind << qp->sq.wqe_shift) + qp->send_wqe_offset) | mthca_opcode[wr->opcode]); udma_ordering_write_barrier(); ((struct mthca_next_seg *) prev_wqe)->ee_nds = htobe32(MTHCA_NEXT_DBD | size | ((wr->send_flags & IBV_SEND_FENCE) ? MTHCA_NEXT_FENCE : 0)); if (!size0) { size0 = size; op0 = mthca_opcode[wr->opcode]; f0 = wr->send_flags & IBV_SEND_FENCE ? MTHCA_SEND_DOORBELL_FENCE : 0; } ++ind; if (ind >= qp->sq.max) ind -= qp->sq.max; } out: if (nreq) { doorbell[0] = (nreq << 24) | ((qp->sq.head & 0xffff) << 8) | f0 | op0; doorbell[1] = (ibqp->qp_num << 8) | size0; qp->sq.head += nreq; /* * Make sure that descriptors are written before * doorbell record. */ udma_to_device_barrier(); *qp->sq.db = htobe32(qp->sq.head & 0xffff); /* * Make sure doorbell record is written before we * write MMIO send doorbell. 
*/ mmio_ordered_writes_hack(); mthca_write64(doorbell, to_mctx(ibqp->context)->uar + MTHCA_SEND_DOORBELL); } pthread_spin_unlock(&qp->sq.lock); return ret; } int mthca_arbel_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct mthca_qp *qp = to_mqp(ibqp); int ret = 0; int nreq; int ind; int i; void *wqe; pthread_spin_lock(&qp->rq.lock); /* XXX check that state is OK to post receive */ ind = qp->rq.head & (qp->rq.max - 1); for (nreq = 0; wr; ++nreq, wr = wr->next) { if (wq_overflow(&qp->rq, nreq, to_mcq(qp->ibv_qp.recv_cq))) { ret = -1; *bad_wr = wr; goto out; } wqe = get_recv_wqe(qp, ind); ((struct mthca_next_seg *) wqe)->flags = 0; wqe += sizeof (struct mthca_next_seg); if (wr->num_sge > qp->rq.max_gs) { ret = -1; *bad_wr = wr; goto out; } for (i = 0; i < wr->num_sge; ++i) { ((struct mthca_data_seg *) wqe)->byte_count = htobe32(wr->sg_list[i].length); ((struct mthca_data_seg *) wqe)->lkey = htobe32(wr->sg_list[i].lkey); ((struct mthca_data_seg *) wqe)->addr = htobe64(wr->sg_list[i].addr); wqe += sizeof (struct mthca_data_seg); } if (i < qp->rq.max_gs) { ((struct mthca_data_seg *) wqe)->byte_count = 0; ((struct mthca_data_seg *) wqe)->lkey = htobe32(MTHCA_INVAL_LKEY); ((struct mthca_data_seg *) wqe)->addr = 0; } qp->wrid[ind] = wr->wr_id; ++ind; if (ind >= qp->rq.max) ind -= qp->rq.max; } out: if (nreq) { qp->rq.head += nreq; /* * Make sure that descriptors are written before * doorbell record. */ udma_to_device_barrier(); *qp->rq.db = htobe32(qp->rq.head & 0xffff); } pthread_spin_unlock(&qp->rq.lock); return ret; } int mthca_alloc_qp_buf(struct ibv_pd *pd, struct ibv_qp_cap *cap, enum ibv_qp_type type, struct mthca_qp *qp) { int size; int max_sq_sge; struct mthca_next_seg *next; int i; qp->rq.max_gs = cap->max_recv_sge; qp->sq.max_gs = cap->max_send_sge; max_sq_sge = align(cap->max_inline_data + sizeof (struct mthca_inline_seg), sizeof (struct mthca_data_seg)) / sizeof (struct mthca_data_seg); if (max_sq_sge < cap->max_send_sge) max_sq_sge = cap->max_send_sge; qp->wrid = malloc((qp->rq.max + qp->sq.max) * sizeof (uint64_t)); if (!qp->wrid) return -1; size = sizeof (struct mthca_next_seg) + qp->rq.max_gs * sizeof (struct mthca_data_seg); for (qp->rq.wqe_shift = 6; 1 << qp->rq.wqe_shift < size; qp->rq.wqe_shift++) ; /* nothing */ size = max_sq_sge * sizeof (struct mthca_data_seg); switch (type) { case IBV_QPT_UD: size += mthca_is_memfree(pd->context) ? sizeof (struct mthca_arbel_ud_seg) : sizeof (struct mthca_tavor_ud_seg); break; case IBV_QPT_UC: size += sizeof (struct mthca_raddr_seg); break; case IBV_QPT_RC: size += sizeof (struct mthca_raddr_seg); /* * An atomic op will require an atomic segment, a * remote address segment and one scatter entry. 
*/ if (size < (sizeof (struct mthca_atomic_seg) + sizeof (struct mthca_raddr_seg) + sizeof (struct mthca_data_seg))) size = (sizeof (struct mthca_atomic_seg) + sizeof (struct mthca_raddr_seg) + sizeof (struct mthca_data_seg)); break; default: break; } /* Make sure that we have enough space for a bind request */ if (size < sizeof (struct mthca_bind_seg)) size = sizeof (struct mthca_bind_seg); size += sizeof (struct mthca_next_seg); for (qp->sq.wqe_shift = 6; 1 << qp->sq.wqe_shift < size; qp->sq.wqe_shift++) ; /* nothing */ qp->send_wqe_offset = align(qp->rq.max << qp->rq.wqe_shift, 1 << qp->sq.wqe_shift); qp->buf_size = qp->send_wqe_offset + (qp->sq.max << qp->sq.wqe_shift); if (mthca_alloc_buf(&qp->buf, align(qp->buf_size, to_mdev(pd->context->device)->page_size), to_mdev(pd->context->device)->page_size)) { free(qp->wrid); return -1; } memset(qp->buf.buf, 0, qp->buf_size); if (mthca_is_memfree(pd->context)) { struct mthca_data_seg *scatter; __be32 sz; sz = htobe32((sizeof (struct mthca_next_seg) + qp->rq.max_gs * sizeof (struct mthca_data_seg)) / 16); for (i = 0; i < qp->rq.max; ++i) { next = get_recv_wqe(qp, i); next->nda_op = htobe32(((i + 1) & (qp->rq.max - 1)) << qp->rq.wqe_shift); next->ee_nds = sz; for (scatter = (void *) (next + 1); (void *) scatter < (void *) next + (1 << qp->rq.wqe_shift); ++scatter) scatter->lkey = htobe32(MTHCA_INVAL_LKEY); } for (i = 0; i < qp->sq.max; ++i) { next = get_send_wqe(qp, i); next->nda_op = htobe32((((i + 1) & (qp->sq.max - 1)) << qp->sq.wqe_shift) + qp->send_wqe_offset); } } else { for (i = 0; i < qp->rq.max; ++i) { next = get_recv_wqe(qp, i); next->nda_op = htobe32((((i + 1) % qp->rq.max) << qp->rq.wqe_shift) | 1); } } qp->sq.last = get_send_wqe(qp, qp->sq.max - 1); qp->rq.last = get_recv_wqe(qp, qp->rq.max - 1); return 0; } struct mthca_qp *mthca_find_qp(struct mthca_context *ctx, uint32_t qpn) { int tind = (qpn & (ctx->num_qps - 1)) >> ctx->qp_table_shift; if (ctx->qp_table[tind].refcnt) return ctx->qp_table[tind].table[qpn & ctx->qp_table_mask]; else return NULL; } int mthca_store_qp(struct mthca_context *ctx, uint32_t qpn, struct mthca_qp *qp) { int tind = (qpn & (ctx->num_qps - 1)) >> ctx->qp_table_shift; if (!ctx->qp_table[tind].refcnt) { ctx->qp_table[tind].table = calloc(ctx->qp_table_mask + 1, sizeof (struct mthca_qp *)); if (!ctx->qp_table[tind].table) return -1; } ++ctx->qp_table[tind].refcnt; ctx->qp_table[tind].table[qpn & ctx->qp_table_mask] = qp; return 0; } void mthca_clear_qp(struct mthca_context *ctx, uint32_t qpn) { int tind = (qpn & (ctx->num_qps - 1)) >> ctx->qp_table_shift; if (!--ctx->qp_table[tind].refcnt) free(ctx->qp_table[tind].table); else ctx->qp_table[tind].table[qpn & ctx->qp_table_mask] = NULL; } int mthca_free_err_wqe(struct mthca_qp *qp, int is_send, int index, int *dbd, __be32 *new_wqe) { struct mthca_next_seg *next; /* * For SRQs, all receive WQEs generate a CQE, so we're always * at the end of the doorbell chain. */ if (qp->ibv_qp.srq && !is_send) { *new_wqe = 0; return 0; } if (is_send) next = get_send_wqe(qp, index); else next = get_recv_wqe(qp, index); *dbd = !!(next->ee_nds & htobe32(MTHCA_NEXT_DBD)); if (next->ee_nds & htobe32(0x3f)) *new_wqe = (next->nda_op & htobe32(~0x3f)) | (next->ee_nds & htobe32(0x3f)); else *new_wqe = 0; return 0; } rdma-core-56.1/providers/mthca/srq.c000066400000000000000000000172361477342711600174250ustar00rootroot00000000000000/* * Copyright (c) 2005 Cisco Systems. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include "mthca.h" #include "doorbell.h" #include "wqe.h" static void *get_wqe(struct mthca_srq *srq, int n) { return srq->buf.buf + (n << srq->wqe_shift); } /* * Return a pointer to the location within a WQE that we're using as a * link when the WQE is in the free list. We use the imm field at an * offset of 12 bytes because in the Tavor case, posting a WQE may * overwrite the next segment of the previous WQE, but a receive WQE * will never touch the imm field. This avoids corrupting our free * list if the previous WQE has already completed and been put on the * free list when we post the next WQE. 
*/ static inline int *wqe_to_link(void *wqe) { return (int *) (wqe + 12); } void mthca_free_srq_wqe(struct mthca_srq *srq, int ind) { struct mthca_next_seg *last_free; pthread_spin_lock(&srq->lock); last_free = get_wqe(srq, srq->last_free); *wqe_to_link(last_free) = ind; last_free->nda_op = htobe32((ind << srq->wqe_shift) | 1); *wqe_to_link(get_wqe(srq, ind)) = -1; srq->last_free = ind; pthread_spin_unlock(&srq->lock); } int mthca_tavor_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct mthca_srq *srq = to_msrq(ibsrq); uint32_t doorbell[2]; int err = 0; int first_ind; int ind; int next_ind; int nreq; int i; void *wqe; void *prev_wqe; pthread_spin_lock(&srq->lock); first_ind = srq->first_free; for (nreq = 0; wr; wr = wr->next) { ind = srq->first_free; wqe = get_wqe(srq, ind); next_ind = *wqe_to_link(wqe); if (next_ind < 0) { err = -1; *bad_wr = wr; break; } prev_wqe = srq->last; srq->last = wqe; ((struct mthca_next_seg *) wqe)->ee_nds = 0; /* flags field will always remain 0 */ wqe += sizeof (struct mthca_next_seg); if (wr->num_sge > srq->max_gs) { err = -1; *bad_wr = wr; srq->last = prev_wqe; break; } for (i = 0; i < wr->num_sge; ++i) { ((struct mthca_data_seg *) wqe)->byte_count = htobe32(wr->sg_list[i].length); ((struct mthca_data_seg *) wqe)->lkey = htobe32(wr->sg_list[i].lkey); ((struct mthca_data_seg *) wqe)->addr = htobe64(wr->sg_list[i].addr); wqe += sizeof (struct mthca_data_seg); } if (i < srq->max_gs) { ((struct mthca_data_seg *) wqe)->byte_count = 0; ((struct mthca_data_seg *) wqe)->lkey = htobe32(MTHCA_INVAL_LKEY); ((struct mthca_data_seg *) wqe)->addr = 0; } ((struct mthca_next_seg *) prev_wqe)->ee_nds = htobe32(MTHCA_NEXT_DBD); srq->wrid[ind] = wr->wr_id; srq->first_free = next_ind; if (++nreq == MTHCA_TAVOR_MAX_WQES_PER_RECV_DB) { nreq = 0; doorbell[0] = first_ind << srq->wqe_shift; doorbell[1] = srq->srqn << 8; /* * Make sure that descriptors are written * before doorbell is rung. */ udma_to_device_barrier(); mthca_write64(doorbell, to_mctx(ibsrq->context)->uar + MTHCA_RECV_DOORBELL); first_ind = srq->first_free; } } if (nreq) { doorbell[0] = first_ind << srq->wqe_shift; doorbell[1] = (srq->srqn << 8) | nreq; /* * Make sure that descriptors are written before * doorbell is rung. 
*/ udma_to_device_barrier(); mthca_write64(doorbell, to_mctx(ibsrq->context)->uar + MTHCA_RECV_DOORBELL); } pthread_spin_unlock(&srq->lock); return err; } int mthca_arbel_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct mthca_srq *srq = to_msrq(ibsrq); int err = 0; int ind; int next_ind; int nreq; int i; void *wqe; pthread_spin_lock(&srq->lock); for (nreq = 0; wr; ++nreq, wr = wr->next) { ind = srq->first_free; wqe = get_wqe(srq, ind); next_ind = *wqe_to_link(wqe); if (next_ind < 0) { err = -1; *bad_wr = wr; break; } ((struct mthca_next_seg *) wqe)->ee_nds = 0; /* flags field will always remain 0 */ wqe += sizeof (struct mthca_next_seg); if (wr->num_sge > srq->max_gs) { err = -1; *bad_wr = wr; break; } for (i = 0; i < wr->num_sge; ++i) { ((struct mthca_data_seg *) wqe)->byte_count = htobe32(wr->sg_list[i].length); ((struct mthca_data_seg *) wqe)->lkey = htobe32(wr->sg_list[i].lkey); ((struct mthca_data_seg *) wqe)->addr = htobe64(wr->sg_list[i].addr); wqe += sizeof (struct mthca_data_seg); } if (i < srq->max_gs) { ((struct mthca_data_seg *) wqe)->byte_count = 0; ((struct mthca_data_seg *) wqe)->lkey = htobe32(MTHCA_INVAL_LKEY); ((struct mthca_data_seg *) wqe)->addr = 0; } srq->wrid[ind] = wr->wr_id; srq->first_free = next_ind; } if (nreq) { srq->counter += nreq; /* * Make sure that descriptors are written before * we write doorbell record. */ udma_ordering_write_barrier(); *srq->db = htobe32(srq->counter); } pthread_spin_unlock(&srq->lock); return err; } int mthca_alloc_srq_buf(struct ibv_pd *pd, struct ibv_srq_attr *attr, struct mthca_srq *srq) { struct mthca_data_seg *scatter; void *wqe; int size; int i; srq->wrid = malloc(srq->max * sizeof (uint64_t)); if (!srq->wrid) return -1; size = sizeof (struct mthca_next_seg) + srq->max_gs * sizeof (struct mthca_data_seg); for (srq->wqe_shift = 6; 1 << srq->wqe_shift < size; ++srq->wqe_shift) ; /* nothing */ srq->buf_size = srq->max << srq->wqe_shift; if (mthca_alloc_buf(&srq->buf, align(srq->buf_size, to_mdev(pd->context->device)->page_size), to_mdev(pd->context->device)->page_size)) { free(srq->wrid); return -1; } memset(srq->buf.buf, 0, srq->buf_size); /* * Now initialize the SRQ buffer so that all of the WQEs are * linked into the list of free WQEs. In addition, set the * scatter list L_Keys to the sentry value of 0x100. */ for (i = 0; i < srq->max; ++i) { struct mthca_next_seg *next; next = wqe = get_wqe(srq, i); if (i < srq->max - 1) { *wqe_to_link(wqe) = i + 1; next->nda_op = htobe32(((i + 1) << srq->wqe_shift) | 1); } else { *wqe_to_link(wqe) = -1; next->nda_op = 0; } for (scatter = wqe + sizeof (struct mthca_next_seg); (void *) scatter < wqe + (1 << srq->wqe_shift); ++scatter) scatter->lkey = htobe32(MTHCA_INVAL_LKEY); } srq->first_free = 0; srq->last_free = srq->max - 1; srq->last = get_wqe(srq, srq->max - 1); return 0; } rdma-core-56.1/providers/mthca/verbs.c000066400000000000000000000417571477342711600177460ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005, 2006 Cisco Systems. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include "mthca.h" #include "mthca-abi.h" int mthca_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct ib_uverbs_ex_query_device_resp resp; size_t resp_size = sizeof(resp); uint64_t raw_fw_ver; unsigned major, minor, sub_minor; int ret; ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp, &resp_size); if (ret) return ret; raw_fw_ver = resp.base.fw_ver; major = (raw_fw_ver >> 32) & 0xffff; minor = (raw_fw_ver >> 16) & 0xffff; sub_minor = raw_fw_ver & 0xffff; snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver), "%d.%d.%d", major, minor, sub_minor); return 0; } int mthca_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr) { struct ibv_query_port cmd; return ibv_cmd_query_port(context, port, attr, &cmd, sizeof cmd); } struct ibv_pd *mthca_alloc_pd(struct ibv_context *context) { struct ibv_alloc_pd cmd; struct umthca_alloc_pd_resp resp; struct mthca_pd *pd; pd = malloc(sizeof *pd); if (!pd) return NULL; if (!mthca_is_memfree(context)) { pd->ah_list = NULL; if (pthread_mutex_init(&pd->ah_mutex, NULL)) { free(pd); return NULL; } } if (ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd, sizeof cmd, &resp.ibv_resp, sizeof resp)) { free(pd); return NULL; } pd->pdn = resp.pdn; return &pd->ibv_pd; } int mthca_free_pd(struct ibv_pd *pd) { int ret; ret = ibv_cmd_dealloc_pd(pd); if (ret) return ret; free(to_mpd(pd)); return 0; } static struct ibv_mr *__mthca_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access, int dma_sync) { struct verbs_mr *vmr; struct umthca_reg_mr cmd; struct ib_uverbs_reg_mr_resp resp; int ret; /* * Old kernels just ignore the extra data we pass in with the * reg_mr command structure, so there's no need to add an ABI * version check here (and indeed the kernel ABI was not * incremented due to this change). */ cmd.mr_attrs = dma_sync ? 
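/*
 * dma_sync selects the MTHCA_MR_DMASYNC attribute in the kernel command;
 * everything else about __mthca_reg_mr() is a plain registration. Within
 * this file the helper is invoked both ways; a sketch of the two call
 * patterns, lifted from the CQ and QP paths later in this file (no new
 * API assumed):
 *
 *	mr = __mthca_reg_mr(pd, buf, len, 0, IBV_ACCESS_LOCAL_WRITE, 1);
 *	mr = __mthca_reg_mr(pd, buf, len, 0, 0, 0);
 */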
MTHCA_MR_DMASYNC : 0; cmd.reserved = 0; vmr = malloc(sizeof(*vmr)); if (!vmr) return NULL; ret = ibv_cmd_reg_mr(pd, addr, length, hca_va, access, vmr, &cmd.ibv_cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) { free(vmr); return NULL; } return &vmr->ibv_mr; } struct ibv_mr *mthca_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access) { return __mthca_reg_mr(pd, addr, length, hca_va, access, 0); } int mthca_dereg_mr(struct verbs_mr *vmr) { int ret; ret = ibv_cmd_dereg_mr(vmr); if (ret) return ret; free(vmr); return 0; } static int align_cq_size(int cqe) { int nent; for (nent = 1; nent <= cqe; nent <<= 1) ; /* nothing */ return nent; } struct ibv_cq *mthca_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector) { struct umthca_create_cq cmd; struct umthca_create_cq_resp resp; struct mthca_cq *cq; int ret; /* Sanity check CQ size before proceeding */ if (cqe > 131072) return NULL; cq = malloc(sizeof *cq); if (!cq) return NULL; cq->cons_index = 0; if (pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE)) goto err; cqe = align_cq_size(cqe); if (mthca_alloc_cq_buf(to_mdev(context->device), &cq->buf, cqe)) goto err; cq->mr = __mthca_reg_mr(to_mctx(context)->pd, cq->buf.buf, cqe * MTHCA_CQ_ENTRY_SIZE, 0, IBV_ACCESS_LOCAL_WRITE, 1); if (!cq->mr) goto err_buf; cq->mr->context = context; if (mthca_is_memfree(context)) { cq->arm_sn = 1; cq->set_ci_db_index = mthca_alloc_db(to_mctx(context)->db_tab, MTHCA_DB_TYPE_CQ_SET_CI, &cq->set_ci_db); if (cq->set_ci_db_index < 0) goto err_unreg; cq->arm_db_index = mthca_alloc_db(to_mctx(context)->db_tab, MTHCA_DB_TYPE_CQ_ARM, &cq->arm_db); if (cq->arm_db_index < 0) goto err_set_db; cmd.arm_db_page = db_align(cq->arm_db); cmd.set_db_page = db_align(cq->set_ci_db); cmd.arm_db_index = cq->arm_db_index; cmd.set_db_index = cq->set_ci_db_index; } else { cmd.arm_db_page = cmd.set_db_page = cmd.arm_db_index = cmd.set_db_index = 0; } cmd.lkey = cq->mr->lkey; cmd.pdn = to_mpd(to_mctx(context)->pd)->pdn; ret = ibv_cmd_create_cq(context, cqe - 1, channel, comp_vector, &cq->ibv_cq, &cmd.ibv_cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (ret) goto err_arm_db; cq->cqn = resp.cqn; if (mthca_is_memfree(context)) { mthca_set_db_qn(cq->set_ci_db, MTHCA_DB_TYPE_CQ_SET_CI, cq->cqn); mthca_set_db_qn(cq->arm_db, MTHCA_DB_TYPE_CQ_ARM, cq->cqn); } return &cq->ibv_cq; err_arm_db: if (mthca_is_memfree(context)) mthca_free_db(to_mctx(context)->db_tab, MTHCA_DB_TYPE_CQ_ARM, cq->arm_db_index); err_set_db: if (mthca_is_memfree(context)) mthca_free_db(to_mctx(context)->db_tab, MTHCA_DB_TYPE_CQ_SET_CI, cq->set_ci_db_index); err_unreg: mthca_dereg_mr(verbs_get_mr(cq->mr)); err_buf: mthca_free_buf(&cq->buf); err: free(cq); return NULL; } int mthca_resize_cq(struct ibv_cq *ibcq, int cqe) { struct mthca_cq *cq = to_mcq(ibcq); struct umthca_resize_cq cmd; struct ibv_mr *mr; struct mthca_buf buf; struct ib_uverbs_resize_cq_resp resp; int old_cqe; int ret; /* Sanity check CQ size before proceeding */ if (cqe > 131072) return EINVAL; pthread_spin_lock(&cq->lock); cqe = align_cq_size(cqe); if (cqe == ibcq->cqe + 1) { ret = 0; goto out; } ret = mthca_alloc_cq_buf(to_mdev(ibcq->context->device), &buf, cqe); if (ret) goto out; mr = __mthca_reg_mr(to_mctx(ibcq->context)->pd, buf.buf, cqe * MTHCA_CQ_ENTRY_SIZE, 0, IBV_ACCESS_LOCAL_WRITE, 1); if (!mr) { mthca_free_buf(&buf); ret = ENOMEM; goto out; } mr->context = ibcq->context; old_cqe = ibcq->cqe; cmd.lkey = mr->lkey; ret = ibv_cmd_resize_cq(ibcq, cqe - 1, &cmd.ibv_cmd, sizeof cmd, 
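/*
 * CQ resize follows an allocate-then-swap pattern: the new buffer and
 * MR are created first, the kernel is switched over by the command
 * being issued here, unpolled CQEs are copied across, and only then are
 * the old buffer and MR released. In order (every step appears in this
 * function):
 *
 *	mthca_alloc_cq_buf(...);		// new buffer
 *	__mthca_reg_mr(..., 1);			// new MR, DMA-synced
 *	ibv_cmd_resize_cq(...);			// kernel switch-over
 *	mthca_cq_resize_copy_cqes(...);		// preserve unpolled CQEs
 *	mthca_dereg_mr(...); mthca_free_buf(...); // drop old resources
 */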
&resp, sizeof resp); if (ret) { mthca_dereg_mr(verbs_get_mr(mr)); mthca_free_buf(&buf); goto out; } mthca_cq_resize_copy_cqes(cq, buf.buf, old_cqe); mthca_dereg_mr(verbs_get_mr(cq->mr)); mthca_free_buf(&cq->buf); cq->buf = buf; cq->mr = mr; out: pthread_spin_unlock(&cq->lock); return ret; } int mthca_destroy_cq(struct ibv_cq *cq) { int ret; ret = ibv_cmd_destroy_cq(cq); if (ret) return ret; if (mthca_is_memfree(cq->context)) { mthca_free_db(to_mctx(cq->context)->db_tab, MTHCA_DB_TYPE_CQ_SET_CI, to_mcq(cq)->set_ci_db_index); mthca_free_db(to_mctx(cq->context)->db_tab, MTHCA_DB_TYPE_CQ_ARM, to_mcq(cq)->arm_db_index); } mthca_dereg_mr(verbs_get_mr(to_mcq(cq)->mr)); mthca_free_buf(&to_mcq(cq)->buf); free(to_mcq(cq)); return 0; } static int align_queue_size(struct ibv_context *context, int size, int spare) { int ret; /* * If someone asks for a 0-sized queue, presumably they're not * going to use it. So don't mess with their size. */ if (!size) return 0; if (mthca_is_memfree(context)) { for (ret = 1; ret < size + spare; ret <<= 1) ; /* nothing */ return ret; } else return size + spare; } struct ibv_srq *mthca_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr) { struct umthca_create_srq cmd; struct umthca_create_srq_resp resp; struct mthca_srq *srq; int ret; /* Sanity check SRQ size before proceeding */ if (attr->attr.max_wr > 1 << 16 || attr->attr.max_sge > 64) return NULL; srq = malloc(sizeof *srq); if (!srq) return NULL; if (pthread_spin_init(&srq->lock, PTHREAD_PROCESS_PRIVATE)) goto err; srq->max = align_queue_size(pd->context, attr->attr.max_wr, 1); srq->max_gs = attr->attr.max_sge; srq->counter = 0; if (mthca_alloc_srq_buf(pd, &attr->attr, srq)) goto err; srq->mr = __mthca_reg_mr(pd, srq->buf.buf, srq->buf_size, 0, 0, 0); if (!srq->mr) goto err_free; srq->mr->context = pd->context; if (mthca_is_memfree(pd->context)) { srq->db_index = mthca_alloc_db(to_mctx(pd->context)->db_tab, MTHCA_DB_TYPE_SRQ, &srq->db); if (srq->db_index < 0) goto err_unreg; cmd.db_page = db_align(srq->db); cmd.db_index = srq->db_index; } else { cmd.db_page = cmd.db_index = 0; } cmd.lkey = srq->mr->lkey; ret = ibv_cmd_create_srq(pd, &srq->ibv_srq, attr, &cmd.ibv_cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (ret) goto err_db; srq->srqn = resp.srqn; if (mthca_is_memfree(pd->context)) mthca_set_db_qn(srq->db, MTHCA_DB_TYPE_SRQ, srq->srqn); return &srq->ibv_srq; err_db: if (mthca_is_memfree(pd->context)) mthca_free_db(to_mctx(pd->context)->db_tab, MTHCA_DB_TYPE_SRQ, srq->db_index); err_unreg: mthca_dereg_mr(verbs_get_mr(srq->mr)); err_free: free(srq->wrid); mthca_free_buf(&srq->buf); err: free(srq); return NULL; } int mthca_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, int attr_mask) { struct ibv_modify_srq cmd; return ibv_cmd_modify_srq(srq, attr, attr_mask, &cmd, sizeof cmd); } int mthca_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr) { struct ibv_query_srq cmd; return ibv_cmd_query_srq(srq, attr, &cmd, sizeof cmd); } int mthca_destroy_srq(struct ibv_srq *srq) { int ret; ret = ibv_cmd_destroy_srq(srq); if (ret) return ret; if (mthca_is_memfree(srq->context)) mthca_free_db(to_mctx(srq->context)->db_tab, MTHCA_DB_TYPE_SRQ, to_msrq(srq)->db_index); mthca_dereg_mr(verbs_get_mr(to_msrq(srq)->mr)); mthca_free_buf(&to_msrq(srq)->buf); free(to_msrq(srq)->wrid); free(to_msrq(srq)); return 0; } struct ibv_qp *mthca_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct umthca_create_qp cmd; struct ib_uverbs_create_qp_resp resp; struct mthca_qp *qp; int ret; /* Sanity check QP 
size before proceeding */ if (attr->cap.max_send_wr > 65536 || attr->cap.max_recv_wr > 65536 || attr->cap.max_send_sge > 64 || attr->cap.max_recv_sge > 64 || attr->cap.max_inline_data > 1024) return NULL; qp = malloc(sizeof *qp); if (!qp) return NULL; qp->sq.max = align_queue_size(pd->context, attr->cap.max_send_wr, 0); qp->rq.max = align_queue_size(pd->context, attr->cap.max_recv_wr, 0); if (mthca_alloc_qp_buf(pd, &attr->cap, attr->qp_type, qp)) goto err; mthca_init_qp_indices(qp); if (pthread_spin_init(&qp->sq.lock, PTHREAD_PROCESS_PRIVATE) || pthread_spin_init(&qp->rq.lock, PTHREAD_PROCESS_PRIVATE)) goto err_free; qp->mr = __mthca_reg_mr(pd, qp->buf.buf, qp->buf_size, 0, 0, 0); if (!qp->mr) goto err_free; qp->mr->context = pd->context; cmd.lkey = qp->mr->lkey; cmd.reserved = 0; if (mthca_is_memfree(pd->context)) { qp->sq.db_index = mthca_alloc_db(to_mctx(pd->context)->db_tab, MTHCA_DB_TYPE_SQ, &qp->sq.db); if (qp->sq.db_index < 0) goto err_unreg; qp->rq.db_index = mthca_alloc_db(to_mctx(pd->context)->db_tab, MTHCA_DB_TYPE_RQ, &qp->rq.db); if (qp->rq.db_index < 0) goto err_sq_db; cmd.sq_db_page = db_align(qp->sq.db); cmd.rq_db_page = db_align(qp->rq.db); cmd.sq_db_index = qp->sq.db_index; cmd.rq_db_index = qp->rq.db_index; } else { cmd.sq_db_page = cmd.rq_db_page = cmd.sq_db_index = cmd.rq_db_index = 0; } pthread_mutex_lock(&to_mctx(pd->context)->qp_table_mutex); ret = ibv_cmd_create_qp(pd, &qp->ibv_qp, attr, &cmd.ibv_cmd, sizeof cmd, &resp, sizeof resp); if (ret) goto err_rq_db; if (mthca_is_memfree(pd->context)) { mthca_set_db_qn(qp->sq.db, MTHCA_DB_TYPE_SQ, qp->ibv_qp.qp_num); mthca_set_db_qn(qp->rq.db, MTHCA_DB_TYPE_RQ, qp->ibv_qp.qp_num); } ret = mthca_store_qp(to_mctx(pd->context), qp->ibv_qp.qp_num, qp); if (ret) goto err_destroy; pthread_mutex_unlock(&to_mctx(pd->context)->qp_table_mutex); qp->sq.max = attr->cap.max_send_wr; qp->rq.max = attr->cap.max_recv_wr; qp->sq.max_gs = attr->cap.max_send_sge; qp->rq.max_gs = attr->cap.max_recv_sge; qp->max_inline_data = attr->cap.max_inline_data; return &qp->ibv_qp; err_destroy: ibv_cmd_destroy_qp(&qp->ibv_qp); err_rq_db: pthread_mutex_unlock(&to_mctx(pd->context)->qp_table_mutex); if (mthca_is_memfree(pd->context)) mthca_free_db(to_mctx(pd->context)->db_tab, MTHCA_DB_TYPE_RQ, qp->rq.db_index); err_sq_db: if (mthca_is_memfree(pd->context)) mthca_free_db(to_mctx(pd->context)->db_tab, MTHCA_DB_TYPE_SQ, qp->sq.db_index); err_unreg: mthca_dereg_mr(verbs_get_mr(qp->mr)); err_free: free(qp->wrid); mthca_free_buf(&qp->buf); err: free(qp); return NULL; } int mthca_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd; return ibv_cmd_query_qp(qp, attr, attr_mask, init_attr, &cmd, sizeof cmd); } int mthca_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask) { struct ibv_modify_qp cmd = {}; int ret; ret = ibv_cmd_modify_qp(qp, attr, attr_mask, &cmd, sizeof cmd); if (!ret && (attr_mask & IBV_QP_STATE) && attr->qp_state == IBV_QPS_RESET) { mthca_cq_clean(to_mcq(qp->recv_cq), qp->qp_num, qp->srq ? 
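/*
 * On a transition to IBV_QPS_RESET the driver must forget all in-flight
 * work: completions for this QP are purged from both CQs (the recv-side
 * clean passes the SRQ, if any, so its WQEs can be returned to the free
 * list), the ring indices are re-initialized, and on memfree (Arbel)
 * hardware both doorbell records are zeroed. All of that is the body of
 * the if-block this expression belongs to.
 */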
to_msrq(qp->srq) : NULL); if (qp->send_cq != qp->recv_cq) mthca_cq_clean(to_mcq(qp->send_cq), qp->qp_num, NULL); mthca_init_qp_indices(to_mqp(qp)); if (mthca_is_memfree(qp->context)) { *to_mqp(qp)->sq.db = 0; *to_mqp(qp)->rq.db = 0; } } return ret; } static void mthca_lock_cqs(struct ibv_qp *qp) { struct mthca_cq *send_cq = to_mcq(qp->send_cq); struct mthca_cq *recv_cq = to_mcq(qp->recv_cq); if (send_cq == recv_cq) pthread_spin_lock(&send_cq->lock); else if (send_cq->cqn < recv_cq->cqn) { pthread_spin_lock(&send_cq->lock); pthread_spin_lock(&recv_cq->lock); } else { pthread_spin_lock(&recv_cq->lock); pthread_spin_lock(&send_cq->lock); } } static void mthca_unlock_cqs(struct ibv_qp *qp) { struct mthca_cq *send_cq = to_mcq(qp->send_cq); struct mthca_cq *recv_cq = to_mcq(qp->recv_cq); if (send_cq == recv_cq) pthread_spin_unlock(&send_cq->lock); else if (send_cq->cqn < recv_cq->cqn) { pthread_spin_unlock(&recv_cq->lock); pthread_spin_unlock(&send_cq->lock); } else { pthread_spin_unlock(&send_cq->lock); pthread_spin_unlock(&recv_cq->lock); } } int mthca_destroy_qp(struct ibv_qp *qp) { int ret; pthread_mutex_lock(&to_mctx(qp->context)->qp_table_mutex); ret = ibv_cmd_destroy_qp(qp); if (ret) { pthread_mutex_unlock(&to_mctx(qp->context)->qp_table_mutex); return ret; } mthca_lock_cqs(qp); __mthca_cq_clean(to_mcq(qp->recv_cq), qp->qp_num, qp->srq ? to_msrq(qp->srq) : NULL); if (qp->send_cq != qp->recv_cq) __mthca_cq_clean(to_mcq(qp->send_cq), qp->qp_num, NULL); mthca_clear_qp(to_mctx(qp->context), qp->qp_num); mthca_unlock_cqs(qp); pthread_mutex_unlock(&to_mctx(qp->context)->qp_table_mutex); if (mthca_is_memfree(qp->context)) { mthca_free_db(to_mctx(qp->context)->db_tab, MTHCA_DB_TYPE_RQ, to_mqp(qp)->rq.db_index); mthca_free_db(to_mctx(qp->context)->db_tab, MTHCA_DB_TYPE_SQ, to_mqp(qp)->sq.db_index); } mthca_dereg_mr(verbs_get_mr(to_mqp(qp)->mr)); mthca_free_buf(&to_mqp(qp)->buf); free(to_mqp(qp)->wrid); free(to_mqp(qp)); return 0; } struct ibv_ah *mthca_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr) { struct mthca_ah *ah; ah = malloc(sizeof *ah); if (!ah) return NULL; if (mthca_alloc_av(to_mpd(pd), attr, ah)) { free(ah); return NULL; } return &ah->ibv_ah; } int mthca_destroy_ah(struct ibv_ah *ah) { mthca_free_av(to_mah(ah)); free(to_mah(ah)); return 0; } rdma-core-56.1/providers/mthca/wqe.h000066400000000000000000000055561477342711600174230ustar00rootroot00000000000000/* * Copyright (c) 2004, 2005 Topspin Communications. All rights reserved. * Copyright (c) 2005 Cisco Systems. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef WQE_H #define WQE_H #include enum { MTHCA_SEND_DOORBELL = 0x10, MTHCA_RECV_DOORBELL = 0x18 }; enum { MTHCA_NEXT_DBD = 1 << 7, MTHCA_NEXT_FENCE = 1 << 6, MTHCA_NEXT_CQ_UPDATE = 1 << 3, MTHCA_NEXT_EVENT_GEN = 1 << 2, MTHCA_NEXT_SOLICIT = 1 << 1, }; enum { MTHCA_INLINE_SEG = 1 << 31 }; enum { MTHCA_INVAL_LKEY = 0x100, MTHCA_TAVOR_MAX_WQES_PER_RECV_DB = 256, MTHCA_ARBEL_MAX_WQES_PER_SEND_DB = 255 }; struct mthca_next_seg { __be32 nda_op; /* [31:6] next WQE [4:0] next opcode */ __be32 ee_nds; /* [31:8] next EE [7] DBD [6] F [5:0] next WQE size */ __be32 flags; /* [3] CQ [2] Event [1] Solicit */ __be32 imm; /* immediate data */ }; struct mthca_tavor_ud_seg { __be32 reserved1; __be32 lkey; __be64 av_addr; __be32 reserved2[4]; __be32 dqpn; __be32 qkey; __be32 reserved3[2]; }; struct mthca_arbel_ud_seg { __be32 av[8]; __be32 dqpn; __be32 qkey; __be32 reserved[2]; }; struct mthca_bind_seg { __be32 flags; /* [31] Atomic [30] rem write [29] rem read */ __be32 reserved; __be32 new_rkey; __be32 lkey; __be64 addr; __be64 length; }; struct mthca_raddr_seg { __be64 raddr; __be32 rkey; __be32 reserved; }; struct mthca_atomic_seg { __be64 swap_add; __be64 compare; }; struct mthca_data_seg { __be32 byte_count; __be32 lkey; __be64 addr; }; struct mthca_inline_seg { __be32 byte_count; }; #endif /* WQE_H */ rdma-core-56.1/providers/ocrdma/000077500000000000000000000000001477342711600166145ustar00rootroot00000000000000rdma-core-56.1/providers/ocrdma/CMakeLists.txt000066400000000000000000000000721477342711600213530ustar00rootroot00000000000000rdma_provider(ocrdma ocrdma_main.c ocrdma_verbs.c ) rdma-core-56.1/providers/ocrdma/Changelog000066400000000000000000000000001477342711600204140ustar00rootroot00000000000000rdma-core-56.1/providers/ocrdma/ocrdma_abi.h000066400000000000000000000161551477342711600210550ustar00rootroot00000000000000/* * Copyright (C) 2008-2013 Emulex. All rights reserved. * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGE. */ #ifndef __OCRDMA_ABI_H__ #define __OCRDMA_ABI_H__ #include #include #include #include #define OCRDMA_ABI_VERSION 2 DECLARE_DRV_CMD(uocrdma_get_context, IB_USER_VERBS_CMD_GET_CONTEXT, empty, ocrdma_alloc_ucontext_resp); DECLARE_DRV_CMD(uocrdma_alloc_pd, IB_USER_VERBS_CMD_ALLOC_PD, ocrdma_alloc_pd_ureq, ocrdma_alloc_pd_uresp); DECLARE_DRV_CMD(uocrdma_create_cq, IB_USER_VERBS_CMD_CREATE_CQ, ocrdma_create_cq_ureq, ocrdma_create_cq_uresp); DECLARE_DRV_CMD(uocrdma_reg_mr, IB_USER_VERBS_CMD_REG_MR, empty, empty); DECLARE_DRV_CMD(uocrdma_create_qp, IB_USER_VERBS_CMD_CREATE_QP, ocrdma_create_qp_ureq, ocrdma_create_qp_uresp); DECLARE_DRV_CMD(uocrdma_create_srq, IB_USER_VERBS_CMD_CREATE_SRQ, empty, ocrdma_create_srq_uresp); #define Bit(_b) (1 << (_b)) #define OCRDMA_MAX_QP 2048 enum { OCRDMA_DB_RQ_OFFSET = 0xE0, OCRDMA_DB_SQ_OFFSET = 0x60, OCRDMA_DB_SRQ_OFFSET = OCRDMA_DB_RQ_OFFSET, OCRDMA_DB_CQ_OFFSET = 0x120 }; #define OCRDMA_DB_CQ_RING_ID_MASK 0x3FF /* bits 0 - 9 */ #define OCRDMA_DB_CQ_RING_ID_EXT_MASK 0x0C00 /* bits 10-11 of qid placing at 12-11 */ #define OCRDMA_DB_CQ_RING_ID_EXT_MASK_SHIFT 0x1 /* qid #2 msbits placing at 12-11 */ #define OCRDMA_DB_CQ_NUM_POPPED_SHIFT (16) /* bits 16 - 28 */ /* Rearm bit */ #define OCRDMA_DB_CQ_REARM_SHIFT (29) /* bit 29 */ /* solicited bit */ #define OCRDMA_DB_CQ_SOLICIT_SHIFT (31) /* bit 31 */ enum OCRDMA_CQE_STATUS { OCRDMA_CQE_SUCCESS = 0, OCRDMA_CQE_LOC_LEN_ERR = 1, OCRDMA_CQE_LOC_QP_OP_ERR = 2, OCRDMA_CQE_LOC_EEC_OP_ERR = 3, OCRDMA_CQE_LOC_PROT_ERR = 4, OCRDMA_CQE_WR_FLUSH_ERR = 5, OCRDMA_CQE_MW_BIND_ERR = 6, OCRDMA_CQE_BAD_RESP_ERR = 7, OCRDMA_CQE_LOC_ACCESS_ERR = 8, OCRDMA_CQE_REM_INV_REQ_ERR = 9, OCRDMA_CQE_REM_ACCESS_ERR = 0xa, OCRDMA_CQE_REM_OP_ERR = 0xb, OCRDMA_CQE_RETRY_EXC_ERR = 0xc, OCRDMA_CQE_RNR_RETRY_EXC_ERR = 0xd, OCRDMA_CQE_LOC_RDD_VIOL_ERR = 0xe, OCRDMA_CQE_REM_INV_RD_REQ_ERR = 0xf, OCRDMA_CQE_REM_ABORT_ERR = 0x10, OCRDMA_CQE_INV_EECN_ERR = 0x11, OCRDMA_CQE_INV_EEC_STATE_ERR = 0x12, OCRDMA_CQE_FATAL_ERR = 0x13, OCRDMA_CQE_RESP_TIMEOUT_ERR = 0x14, OCRDMA_CQE_GENERAL_ERR }; enum { /* w0 */ OCRDMA_CQE_WQEIDX_SHIFT = 0, OCRDMA_CQE_WQEIDX_MASK = 0xFFFF, /* w1 */ OCRDMA_CQE_UD_XFER_LEN_SHIFT = 16, OCRDMA_CQE_PKEY_SHIFT = 0, OCRDMA_CQE_PKEY_MASK = 0xFFFF, /* w2 */ OCRDMA_CQE_QPN_SHIFT = 0, OCRDMA_CQE_QPN_MASK = 0x0000FFFF, OCRDMA_CQE_BUFTAG_SHIFT = 16, OCRDMA_CQE_BUFTAG_MASK = 0xFFFF << OCRDMA_CQE_BUFTAG_SHIFT, /* w3 */ OCRDMA_CQE_UD_STATUS_SHIFT = 24, OCRDMA_CQE_UD_STATUS_MASK = 0x7 << OCRDMA_CQE_UD_STATUS_SHIFT, OCRDMA_CQE_STATUS_SHIFT = 16, OCRDMA_CQE_STATUS_MASK = (0xFF << OCRDMA_CQE_STATUS_SHIFT), OCRDMA_CQE_VALID = Bit(31), OCRDMA_CQE_INVALIDATE = Bit(30), OCRDMA_CQE_QTYPE = Bit(29), OCRDMA_CQE_IMM = Bit(28), OCRDMA_CQE_WRITE_IMM = Bit(27), OCRDMA_CQE_QTYPE_SQ = 0, OCRDMA_CQE_QTYPE_RQ = 1, OCRDMA_CQE_SRCQP_MASK = 0xFFFFFF }; struct ocrdma_cqe { union { /* w0 to w2 */ struct { __le32 wqeidx; __le32 bytes_xfered; __le32 qpn; } wq; struct { __le32 lkey_immdt; __le32 rxlen; __le32 buftag_qpn; } rq; struct { __le32 
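/*
 * Each CQE is four little-endian 32-bit words: words 0-2 are a union
 * keyed by queue type (wq/rq/ud/cmn) and word 3 carries the valid,
 * qtype, and status bits. A hedged sketch of decoding word 3, using
 * only the masks defined in the enum above:
 *
 *	uint32_t w3 = le32toh(cqe->flags_status_srcqpn);
 *	int valid  = !!(w3 & OCRDMA_CQE_VALID);
 *	int status = (w3 & OCRDMA_CQE_STATUS_MASK) >> OCRDMA_CQE_STATUS_SHIFT;
 */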
lkey_immdt; __le32 rxlen_pkey; __le32 buftag_qpn; } ud; struct { __le32 word_0; __le32 word_1; __le32 qpn; } cmn; }; __le32 flags_status_srcqpn; /* w3 */ } __attribute__ ((packed)); struct ocrdma_sge { uint32_t addr_hi; uint32_t addr_lo; uint32_t lrkey; uint32_t len; } __attribute__ ((packed)); enum { OCRDMA_WQE_OPCODE_SHIFT = 0, OCRDMA_WQE_OPCODE_MASK = 0x0000001F, OCRDMA_WQE_FLAGS_SHIFT = 5, OCRDMA_WQE_TYPE_SHIFT = 16, OCRDMA_WQE_TYPE_MASK = 0x00030000, OCRDMA_WQE_SIZE_SHIFT = 18, OCRDMA_WQE_SIZE_MASK = 0xFF, OCRDMA_WQE_NXT_WQE_SIZE_SHIFT = 25, OCRDMA_WQE_LKEY_FLAGS_SHIFT = 0, OCRDMA_WQE_LKEY_FLAGS_MASK = 0xF }; enum { OCRDMA_FLAG_SIG = 0x1, OCRDMA_FLAG_INV = 0x2, OCRDMA_FLAG_FENCE_L = 0x4, OCRDMA_FLAG_FENCE_R = 0x8, OCRDMA_FLAG_SOLICIT = 0x10, OCRDMA_FLAG_IMM = 0x20, OCRDMA_FLAG_AH_VLAN_PR = 0x40, /* Stag flags */ OCRDMA_LKEY_FLAG_LOCAL_WR = 0x1, OCRDMA_LKEY_FLAG_REMOTE_RD = 0x2, OCRDMA_LKEY_FLAG_REMOTE_WR = 0x4, OCRDMA_LKEY_FLAG_VATO = 0x8 }; enum { OCRDMA_TYPE_INLINE = 0x0, OCRDMA_TYPE_LKEY = 0x1 }; #define OCRDMA_CQE_QTYPE_RQ 1 #define OCRDMA_CQE_QTYPE_SQ 0 enum OCRDMA_WQE_OPCODE { OCRDMA_WRITE = 0x06, OCRDMA_READ = 0x0C, OCRDMA_RESV0 = 0x02, OCRDMA_SEND = 0x00, OCRDMA_BIND_MW = 0x08, OCRDMA_RESV1 = 0x0A, OCRDMA_LKEY_INV = 0x15, }; #define OCRDMA_WQE_STRIDE 8 #define OCRDMA_WQE_ALIGN_BYTES 16 /* header WQE for all the SQ and RQ operations */ struct ocrdma_hdr_wqe { uint32_t cw; union { uint32_t rsvd_tag; uint32_t rsvd_stag_flags; }; union { uint32_t immdt; uint32_t lkey; }; uint32_t total_len; } __attribute__ ((packed)); struct ocrdma_hdr_wqe_le { __le32 cw; union { __le32 rsvd_tag; __le32 rsvd_stag_flags; }; union { __le32 immdt; __le32 lkey; }; __le32 total_len; } __attribute__ ((packed)); struct ocrdma_ewqe_atomic { uint32_t ra_hi; uint32_t ra_lo; uint32_t rkey; uint32_t rlen; uint32_t swap_add_hi; uint32_t swap_add_lo; uint32_t compare_hi; uint32_t compare_lo; struct ocrdma_sge sge; } __attribute__ ((packed)); struct ocrdma_ewqe_ud_hdr { uint32_t rsvd_dest_qpn; uint32_t qkey; uint32_t rsvd_ahid; uint32_t hdr_type; } __attribute__ ((packed)); #endif /* __OCRDMA_ABI_H__ */ rdma-core-56.1/providers/ocrdma/ocrdma_main.c000066400000000000000000000137521477342711600212410ustar00rootroot00000000000000/* * Copyright (C) 2008-2013 Emulex. All rights reserved. * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include "ocrdma_main.h" #include "ocrdma_abi.h" #include #include #include #include static void ocrdma_free_context(struct ibv_context *ibctx); #define PCI_VENDOR_ID_EMULEX 0x10DF #define PCI_DEVICE_ID_EMULEX_GEN1 0xe220 #define PCI_DEVICE_ID_EMULEX_GEN2 0x720 #define PCI_DEVICE_ID_EMULEX_GEN2_VF 0x728 #define UCNA(v, d) \ VERBS_PCI_MATCH(PCI_VENDOR_ID_##v, PCI_DEVICE_ID_EMULEX_##d, NULL) static const struct verbs_match_ent ucna_table[] = { VERBS_DRIVER_ID(RDMA_DRIVER_OCRDMA), UCNA(EMULEX, GEN1), UCNA(EMULEX, GEN2), UCNA(EMULEX, GEN2_VF), {} }; static const struct verbs_context_ops ocrdma_ctx_ops = { .query_device_ex = ocrdma_query_device, .query_port = ocrdma_query_port, .alloc_pd = ocrdma_alloc_pd, .dealloc_pd = ocrdma_free_pd, .reg_mr = ocrdma_reg_mr, .dereg_mr = ocrdma_dereg_mr, .create_cq = ocrdma_create_cq, .poll_cq = ocrdma_poll_cq, .req_notify_cq = ocrdma_arm_cq, .resize_cq = ocrdma_resize_cq, .destroy_cq = ocrdma_destroy_cq, .create_qp = ocrdma_create_qp, .query_qp = ocrdma_query_qp, .modify_qp = ocrdma_modify_qp, .destroy_qp = ocrdma_destroy_qp, .post_send = ocrdma_post_send, .post_recv = ocrdma_post_recv, .create_ah = ocrdma_create_ah, .destroy_ah = ocrdma_destroy_ah, .create_srq = ocrdma_create_srq, .modify_srq = ocrdma_modify_srq, .query_srq = ocrdma_query_srq, .destroy_srq = ocrdma_destroy_srq, .post_srq_recv = ocrdma_post_srq_recv, .attach_mcast = ocrdma_attach_mcast, .detach_mcast = ocrdma_detach_mcast, .free_context = ocrdma_free_context, }; static void ocrdma_uninit_device(struct verbs_device *verbs_device) { struct ocrdma_device *dev = get_ocrdma_dev(&verbs_device->device); free(dev); } /* * ocrdma_alloc_context */ static struct verbs_context *ocrdma_alloc_context(struct ibv_device *ibdev, int cmd_fd, void *private_data) { struct ocrdma_devctx *ctx; struct uocrdma_get_context cmd; struct uocrdma_get_context_resp resp = {}; ctx = verbs_init_and_alloc_context(ibdev, cmd_fd, ctx, ibv_ctx, RDMA_DRIVER_OCRDMA); if (!ctx) return NULL; if (ibv_cmd_get_context(&ctx->ibv_ctx, (struct ibv_get_context *)&cmd, sizeof cmd, &resp.ibv_resp, sizeof(resp))) goto cmd_err; verbs_set_ops(&ctx->ibv_ctx, &ocrdma_ctx_ops); get_ocrdma_dev(ibdev)->id = resp.dev_id; get_ocrdma_dev(ibdev)->max_inline_data = resp.max_inline_data; get_ocrdma_dev(ibdev)->wqe_size = resp.wqe_size; get_ocrdma_dev(ibdev)->rqe_size = resp.rqe_size; memcpy(get_ocrdma_dev(ibdev)->fw_ver, resp.fw_ver, sizeof(resp.fw_ver)); get_ocrdma_dev(ibdev)->dpp_wqe_size = resp.dpp_wqe_size; ctx->ah_tbl = mmap(NULL, resp.ah_tbl_len, PROT_READ | PROT_WRITE, MAP_SHARED, cmd_fd, resp.ah_tbl_page); if (ctx->ah_tbl == MAP_FAILED) goto cmd_err; ctx->ah_tbl_len = resp.ah_tbl_len; ocrdma_init_ahid_tbl(ctx); return &ctx->ibv_ctx; cmd_err: ocrdma_err("%s: Failed to allocate context for device.\n", __func__); verbs_uninit_context(&ctx->ibv_ctx); free(ctx); return NULL; } /* * ocrdma_free_context */ static void ocrdma_free_context(struct ibv_context 
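/*
 * Teardown mirrors ocrdma_alloc_context() above: besides the verbs
 * context itself, the only resource needing explicit release is the AH
 * table that was mmap()ed from the kernel at context creation, hence
 * the munmap() of ah_tbl/ah_tbl_len in the definition below.
 */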
*ibctx) { struct ocrdma_devctx *ctx = get_ocrdma_ctx(ibctx); if (ctx->ah_tbl) munmap((void *)ctx->ah_tbl, ctx->ah_tbl_len); verbs_uninit_context(&ctx->ibv_ctx); free(ctx); } static struct verbs_device * ocrdma_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct ocrdma_device *dev; dev = calloc(1, sizeof(*dev)); if (!dev) return NULL; dev->qp_tbl = malloc(OCRDMA_MAX_QP * sizeof(struct ocrdma_qp *)); if (!dev->qp_tbl) goto qp_err; bzero(dev->qp_tbl, OCRDMA_MAX_QP * sizeof(struct ocrdma_qp *)); pthread_mutex_init(&dev->dev_lock, NULL); pthread_spin_init(&dev->flush_q_lock, PTHREAD_PROCESS_PRIVATE); return &dev->ibv_dev; qp_err: free(dev); return NULL; } static const struct verbs_device_ops ocrdma_dev_ops = { .name = "ocrdma", .match_min_abi_version = OCRDMA_ABI_VERSION, .match_max_abi_version = OCRDMA_ABI_VERSION, .match_table = ucna_table, .alloc_device = ocrdma_device_alloc, .uninit_device = ocrdma_uninit_device, .alloc_context = ocrdma_alloc_context, }; PROVIDER_DRIVER(ocrdma, ocrdma_dev_ops); rdma-core-56.1/providers/ocrdma/ocrdma_main.h000066400000000000000000000202151477342711600212360ustar00rootroot00000000000000/* * Copyright (C) 2008-2013 Emulex. All rights reserved. * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGE. */ #ifndef __OCRDMA_MAIN_H__ #define __OCRDMA_MAIN_H__ #include #include #include #include #include #include #define ocrdma_err(format, arg...) 
printf(format, ##arg) #define OCRDMA_DPP_PAGE_SIZE (4096) #define ROUND_UP_X(_val, _x) \ (((unsigned long)(_val) + ((_x)-1)) & (long)~((_x)-1)) struct ocrdma_qp; struct ocrdma_device { struct verbs_device ibv_dev; struct ocrdma_qp **qp_tbl; pthread_mutex_t dev_lock; pthread_spinlock_t flush_q_lock; int id; int gen; uint32_t wqe_size; uint32_t rqe_size; uint32_t dpp_wqe_size; uint32_t max_inline_data; uint8_t fw_ver[32]; }; struct ocrdma_devctx { struct verbs_context ibv_ctx; uint32_t *ah_tbl; uint32_t ah_tbl_len; pthread_mutex_t tbl_lock; }; struct ocrdma_pd { struct ibv_pd ibv_pd; struct ocrdma_device *dev; struct ocrdma_devctx *uctx; void *dpp_va; }; struct ocrdma_mr { struct verbs_mr vmr; }; struct ocrdma_cq { struct ibv_cq ibv_cq; struct ocrdma_device *dev; uint16_t cq_id; uint16_t cq_dbid; uint16_t getp; pthread_spinlock_t cq_lock; uint32_t max_hw_cqe; uint32_t cq_mem_size; struct ocrdma_cqe *va; void *db_va; uint32_t db_size; uint32_t phase; int phase_change; uint8_t deferred_arm; uint8_t deferred_sol; uint8_t first_arm; struct list_head sq_head; struct list_head rq_head; }; enum { OCRDMA_DPP_WQE_INDEX_MASK = 0xFFFF, OCRDMA_DPP_CQE_VALID_BIT_SHIFT = 31, OCRDMA_DPP_CQE_VALID_BIT_MASK = 1 << 31 }; struct ocrdma_dpp_cqe { uint32_t wqe_idx_valid; }; enum { OCRDMA_PD_MAX_DPP_ENABLED_QP = 16 }; struct ocrdma_qp_hwq_info { uint8_t *va; /* virtual address */ uint32_t max_sges; uint32_t free_cnt; uint32_t head, tail; uint32_t entry_size; uint32_t max_cnt; uint32_t max_wqe_idx; uint32_t len; uint16_t dbid; /* qid, where to ring the doorbell. */ }; struct ocrdma_srq { struct ibv_srq ibv_srq; struct ocrdma_device *dev; void *db_va; uint32_t db_size; pthread_spinlock_t q_lock; struct ocrdma_qp_hwq_info rq; uint32_t max_rq_sges; uint32_t id; uint64_t *rqe_wr_id_tbl; uint32_t *idx_bit_fields; uint32_t bit_fields_len; uint32_t db_shift; }; enum { OCRDMA_CREATE_QP_REQ_DPP_CREDIT_LIMIT = 1 }; enum ocrdma_qp_state { OCRDMA_QPS_RST = 0, OCRDMA_QPS_INIT = 1, OCRDMA_QPS_RTR = 2, OCRDMA_QPS_RTS = 3, OCRDMA_QPS_SQE = 4, OCRDMA_QPS_SQ_DRAINING = 5, OCRDMA_QPS_ERR = 6, OCRDMA_QPS_SQD = 7 }; struct ocrdma_qp { struct ibv_qp ibv_qp; struct ocrdma_device *dev; pthread_spinlock_t q_lock; struct ocrdma_qp_hwq_info sq; struct ocrdma_cq *sq_cq; struct { uint64_t wrid; uint16_t dpp_wqe_idx; uint16_t dpp_wqe; uint8_t signaled; uint8_t rsvd[3]; } *wqe_wr_id_tbl; struct ocrdma_qp_hwq_info dpp_q; int dpp_enabled; struct ocrdma_qp_hwq_info rq; struct ocrdma_cq *rq_cq; uint64_t *rqe_wr_id_tbl; void *db_va; void *db_sq_va; void *db_rq_va; uint32_t max_inline_data; struct ocrdma_srq *srq; struct ocrdma_cq *dpp_cq; uint32_t db_size; uint32_t max_ord; uint32_t max_ird; uint32_t dpp_prev_indx; enum ibv_qp_type qp_type; enum ocrdma_qp_state state; struct list_node sq_entry; struct list_node rq_entry; uint16_t id; uint16_t rsvd; uint32_t db_shift; int signaled; /* signaled QP */ }; enum { OCRDMA_AH_ID_MASK = 0x3FF, OCRDMA_AH_VLAN_VALID_MASK = 0x01, OCRDMA_AH_VLAN_VALID_SHIFT = 0x1F, OCRDMA_AH_L3_TYPE_MASK = 0x03, OCRDMA_AH_L3_TYPE_SHIFT = 0x1D }; struct ocrdma_ah { struct ibv_ah ibv_ah; struct ocrdma_pd *pd; uint16_t id; uint8_t isvlan; uint8_t hdr_type; }; #define get_ocrdma_xxx(xxx, type) \ container_of(ib##xxx, struct ocrdma_##type, ibv_##xxx) static inline struct ocrdma_devctx *get_ocrdma_ctx(struct ibv_context *ibctx) { return container_of(ibctx, struct ocrdma_devctx, ibv_ctx.context); } static inline struct ocrdma_device *get_ocrdma_dev(struct ibv_device *ibdev) { return container_of(ibdev, struct ocrdma_device, 
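/*
 * All the get_ocrdma_*() helpers are container_of() downcasts: given a
 * pointer to an embedded member, they recover the enclosing provider
 * struct. Roughly (sketch of the conventional definition, not anything
 * ocrdma-specific):
 *
 *	#define container_of(ptr, type, member) \
 *		((type *)((char *)(ptr) - offsetof(type, member)))
 */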
ibv_dev.device); } static inline struct ocrdma_qp *get_ocrdma_qp(struct ibv_qp *ibqp) { return get_ocrdma_xxx(qp, qp); } static inline struct ocrdma_srq *get_ocrdma_srq(struct ibv_srq *ibsrq) { return get_ocrdma_xxx(srq, srq); } static inline struct ocrdma_pd *get_ocrdma_pd(struct ibv_pd *ibpd) { return get_ocrdma_xxx(pd, pd); } static inline struct ocrdma_cq *get_ocrdma_cq(struct ibv_cq *ibcq) { return get_ocrdma_xxx(cq, cq); } static inline struct ocrdma_ah *get_ocrdma_ah(struct ibv_ah *ibah) { return get_ocrdma_xxx(ah, ah); } void ocrdma_init_ahid_tbl(struct ocrdma_devctx *ctx); int ocrdma_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); int ocrdma_query_port(struct ibv_context *, uint8_t, struct ibv_port_attr *); struct ibv_pd *ocrdma_alloc_pd(struct ibv_context *); int ocrdma_free_pd(struct ibv_pd *); struct ibv_mr *ocrdma_reg_mr(struct ibv_pd *pd, void *addr, size_t len, uint64_t hca_va, int access); int ocrdma_dereg_mr(struct verbs_mr *vmr); struct ibv_cq *ocrdma_create_cq(struct ibv_context *, int, struct ibv_comp_channel *, int); int ocrdma_resize_cq(struct ibv_cq *, int); int ocrdma_destroy_cq(struct ibv_cq *); int ocrdma_poll_cq(struct ibv_cq *, int, struct ibv_wc *); int ocrdma_arm_cq(struct ibv_cq *, int); struct ibv_qp *ocrdma_create_qp(struct ibv_pd *, struct ibv_qp_init_attr *); int ocrdma_modify_qp(struct ibv_qp *, struct ibv_qp_attr *, int ibv_qp_attr_mask); int ocrdma_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); int ocrdma_destroy_qp(struct ibv_qp *); int ocrdma_post_send(struct ibv_qp *, struct ibv_send_wr *, struct ibv_send_wr **); int ocrdma_post_recv(struct ibv_qp *, struct ibv_recv_wr *, struct ibv_recv_wr **); struct ibv_srq *ocrdma_create_srq(struct ibv_pd *, struct ibv_srq_init_attr *); int ocrdma_modify_srq(struct ibv_srq *, struct ibv_srq_attr *, int); int ocrdma_destroy_srq(struct ibv_srq *); int ocrdma_query_srq(struct ibv_srq *ibsrq, struct ibv_srq_attr *attr); int ocrdma_post_srq_recv(struct ibv_srq *, struct ibv_recv_wr *, struct ibv_recv_wr **); struct ibv_ah *ocrdma_create_ah(struct ibv_pd *, struct ibv_ah_attr *); int ocrdma_destroy_ah(struct ibv_ah *); int ocrdma_attach_mcast(struct ibv_qp *, const union ibv_gid *, uint16_t); int ocrdma_detach_mcast(struct ibv_qp *, const union ibv_gid *, uint16_t); void ocrdma_async_event(struct ibv_async_event *event); #endif /* __OCRDMA_MAIN_H__ */ rdma-core-56.1/providers/ocrdma/ocrdma_verbs.c000066400000000000000000001502701477342711600214330ustar00rootroot00000000000000/* * Copyright (C) 2008-2013 Emulex. All rights reserved. * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF * THE POSSIBILITY OF SUCH DAMAGE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include "ocrdma_main.h" #include "ocrdma_abi.h" #include #include static void ocrdma_ring_cq_db(struct ocrdma_cq *cq, uint32_t armed, int solicited, uint32_t num_cqe); static inline void ocrdma_swap_cpu_to_le(void *dst, uint32_t len) { int i = 0; __le32 *src_ptr = dst; uint32_t *dst_ptr = dst; for (; i < (len / 4); i++) *dst_ptr++ = le32toh(*src_ptr++); } /* * ocrdma_query_device */ int ocrdma_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct ocrdma_device *dev = get_ocrdma_dev(context->device); struct ib_uverbs_ex_query_device_resp resp; size_t resp_size = sizeof(resp); int ret; ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp, &resp_size); if (ret) return ret; memcpy(attr->orig_attr.fw_ver, dev->fw_ver, sizeof(dev->fw_ver)); return 0; } /* * ocrdma_query_port */ int ocrdma_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr) { struct ibv_query_port cmd; int status; status = ibv_cmd_query_port(context, port, attr, &cmd, sizeof cmd); return status; } #define OCRDMA_INVALID_AH_IDX 0xffffffff void ocrdma_init_ahid_tbl(struct ocrdma_devctx *ctx) { int i; pthread_mutex_init(&ctx->tbl_lock, NULL); for (i = 0; i < (ctx->ah_tbl_len / sizeof(uint32_t)); i++) ctx->ah_tbl[i] = OCRDMA_INVALID_AH_IDX; } static int ocrdma_alloc_ah_tbl_id(struct ocrdma_devctx *ctx) { int i; int status = -EINVAL; pthread_mutex_lock(&ctx->tbl_lock); for (i = 0; i < (ctx->ah_tbl_len / sizeof(uint32_t)); i++) { if (ctx->ah_tbl[i] == OCRDMA_INVALID_AH_IDX) { ctx->ah_tbl[i] = ctx->ah_tbl_len; status = i; break; } } pthread_mutex_unlock(&ctx->tbl_lock); return status; } static void ocrdma_free_ah_tbl_id(struct ocrdma_devctx *ctx, int idx) { pthread_mutex_lock(&ctx->tbl_lock); ctx->ah_tbl[idx] = OCRDMA_INVALID_AH_IDX; pthread_mutex_unlock(&ctx->tbl_lock); } /* * ocrdma_alloc_pd */ struct ibv_pd *ocrdma_alloc_pd(struct ibv_context *context) { struct uocrdma_alloc_pd cmd; struct uocrdma_alloc_pd_resp resp; struct ocrdma_pd *pd; uint64_t map_address = 0; pd = malloc(sizeof *pd); if (!pd) return NULL; bzero(pd, sizeof *pd); memset(&cmd, 0, sizeof(cmd)); if (ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) { free(pd); return NULL; } pd->dev = get_ocrdma_dev(context->device); pd->uctx = get_ocrdma_ctx(context); if (resp.dpp_enabled) { map_address = ((uint64_t) resp.dpp_page_addr_hi << 32) | resp.dpp_page_addr_lo; pd->dpp_va = mmap(NULL, OCRDMA_DPP_PAGE_SIZE, PROT_WRITE, MAP_SHARED, context->cmd_fd, map_address); if (pd->dpp_va == MAP_FAILED) { 
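/*
 * Sketch of the mapping pattern used just above: the kernel returns the
 * DPP page address as two 32-bit halves, userspace reassembles a 64-bit
 * mmap() offset, and the result must be compared against MAP_FAILED
 * (never NULL).  Names here (sketch_map_doorbell, hi/lo) are
 * illustrative, not driver API.
 */
#include <stdint.h>
#include <sys/mman.h>

static void *sketch_map_doorbell(int cmd_fd, uint32_t hi, uint32_t lo,
				 size_t len)
{
	off_t offset = (off_t)(((uint64_t)hi << 32) | lo);
	void *va = mmap(NULL, len, PROT_WRITE, MAP_SHARED, cmd_fd, offset);

	return va == MAP_FAILED ? NULL : va;	/* caller sees NULL on error */
}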
ocrdma_free_pd(&pd->ibv_pd); return NULL; } } return &pd->ibv_pd; } /* * ocrdma_free_pd */ int ocrdma_free_pd(struct ibv_pd *ibpd) { int status; struct ocrdma_pd *pd = get_ocrdma_pd(ibpd); status = ibv_cmd_dealloc_pd(ibpd); if (status) return status; if (pd->dpp_va) munmap((void *)pd->dpp_va, OCRDMA_DPP_PAGE_SIZE); free(pd); return 0; } /* * ocrdma_reg_mr */ struct ibv_mr *ocrdma_reg_mr(struct ibv_pd *pd, void *addr, size_t len, uint64_t hca_va, int access) { struct ocrdma_mr *mr; struct ibv_reg_mr cmd; struct uocrdma_reg_mr_resp resp; mr = malloc(sizeof *mr); if (!mr) return NULL; bzero(mr, sizeof *mr); if (ibv_cmd_reg_mr(pd, addr, len, hca_va, access, &mr->vmr, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) { free(mr); return NULL; } return &mr->vmr.ibv_mr; } /* * ocrdma_dereg_mr */ int ocrdma_dereg_mr(struct verbs_mr *vmr) { int status; status = ibv_cmd_dereg_mr(vmr); if (status) return status; free(vmr); return 0; } /* * ocrdma_create_cq */ static struct ibv_cq *ocrdma_create_cq_common(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector, int dpp_cq) { int status; struct uocrdma_create_cq cmd; struct uocrdma_create_cq_resp resp; struct ocrdma_cq *cq; struct ocrdma_device *dev = get_ocrdma_dev(context->device); void *map_addr; cq = malloc(sizeof *cq); if (!cq) return NULL; bzero(cq, sizeof *cq); cmd.dpp_cq = dpp_cq; status = ibv_cmd_create_cq(context, cqe, channel, comp_vector, &cq->ibv_cq, &cmd.ibv_cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (status) goto cq_err1; pthread_spin_init(&cq->cq_lock, PTHREAD_PROCESS_PRIVATE); cq->dev = dev; cq->cq_id = resp.cq_id; cq->cq_dbid = resp.cq_id; cq->cq_mem_size = resp.page_size; cq->max_hw_cqe = resp.max_hw_cqe; cq->phase_change = resp.phase_change; cq->va = mmap(NULL, resp.page_size, PROT_READ | PROT_WRITE, MAP_SHARED, context->cmd_fd, resp.page_addr[0]); if (cq->va == MAP_FAILED) goto cq_err2; map_addr = mmap(NULL, resp.db_page_size, PROT_WRITE, MAP_SHARED, context->cmd_fd, resp.db_page_addr); if (map_addr == MAP_FAILED) goto cq_err2; cq->db_va = map_addr; cq->db_size = resp.db_page_size; cq->phase = OCRDMA_CQE_VALID; cq->first_arm = 1; if (!dpp_cq) { ocrdma_ring_cq_db(cq, 0, 0, 0); } cq->ibv_cq.cqe = cqe; list_head_init(&cq->sq_head); list_head_init(&cq->rq_head); return &cq->ibv_cq; cq_err2: (void)ibv_cmd_destroy_cq(&cq->ibv_cq); cq_err1: free(cq); return NULL; } struct ibv_cq *ocrdma_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector) { return ocrdma_create_cq_common(context, cqe, channel, comp_vector, 0); } #ifdef DPP_CQ_SUPPORT static struct ocrdma_cq *ocrdma_create_dpp_cq(struct ibv_context *context, int cqe) { struct ibv_cq *ibcq; ibcq = ocrdma_create_cq_common(context, cqe, 0, 0, 1); if (ibcq) return get_ocrdma_cq(ibcq); return NULL; } #endif /* * ocrdma_resize_cq */ int ocrdma_resize_cq(struct ibv_cq *ibcq, int new_entries) { int status; struct ibv_resize_cq cmd; struct ib_uverbs_resize_cq_resp resp; status = ibv_cmd_resize_cq(ibcq, new_entries, &cmd, sizeof cmd, &resp, sizeof resp); if (status == 0) ibcq->cqe = new_entries; return status; } /* * ocrdma_destroy_cq */ int ocrdma_destroy_cq(struct ibv_cq *ibv_cq) { struct ocrdma_cq *cq = get_ocrdma_cq(ibv_cq); int status; status = ibv_cmd_destroy_cq(ibv_cq); if (status) return status; if (cq->db_va) munmap((void *)cq->db_va, cq->db_size); if (cq->va) munmap((void*)cq->va, cq->cq_mem_size); free(cq); return 0; } static void ocrdma_add_qpn_map(struct ocrdma_device *dev, struct ocrdma_qp *qp) { 
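/*
 * dev->qp_tbl maps a hardware QP number straight to the user-space QP
 * object so that the CQ polling path can resolve the qpn carried in
 * each CQE.  A hedged, self-contained sketch of the same id-table idiom
 * (toy names and bound; the driver serializes updates with dev_lock and
 * reads the table under its CQ locks):
 */
#include <pthread.h>
#include <stddef.h>

#define SKETCH_MAX_QP 128

struct sketch_qp { unsigned int id; };

static struct sketch_qp *sketch_tbl[SKETCH_MAX_QP];
static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;

static void sketch_add_qp(struct sketch_qp *qp)
{
	pthread_mutex_lock(&sketch_lock);
	sketch_tbl[qp->id] = qp;
	pthread_mutex_unlock(&sketch_lock);
}

static struct sketch_qp *sketch_lookup_qp(unsigned int qpn)
{
	/* Callers must tolerate NULL: a CQE may name an already-gone QP. */
	return qpn < SKETCH_MAX_QP ? sketch_tbl[qpn] : NULL;
}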
pthread_mutex_lock(&dev->dev_lock); dev->qp_tbl[qp->id] = qp; pthread_mutex_unlock(&dev->dev_lock); } static void _ocrdma_del_qpn_map(struct ocrdma_device *dev, struct ocrdma_qp *qp) { dev->qp_tbl[qp->id] = NULL; } struct ibv_srq *ocrdma_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *init_attr) { int status = 0; struct ocrdma_srq *srq; struct uocrdma_create_srq cmd; struct uocrdma_create_srq_resp resp = {}; void *map_addr; srq = calloc(1, sizeof *srq); if (!srq) return NULL; pthread_spin_init(&srq->q_lock, PTHREAD_PROCESS_PRIVATE); status = ibv_cmd_create_srq(pd, &srq->ibv_srq, init_attr, &cmd.ibv_cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (status) goto cmd_err; srq->dev = get_ocrdma_pd(pd)->dev; srq->rq.dbid = resp.rq_dbid; srq->rq.max_sges = init_attr->attr.max_sge; srq->rq.max_cnt = resp.num_rqe_allocated; srq->rq.max_wqe_idx = resp.num_rqe_allocated - 1; srq->rq.entry_size = srq->dev->rqe_size; srq->rqe_wr_id_tbl = calloc(srq->rq.max_cnt, sizeof(uint64_t)); if (srq->rqe_wr_id_tbl == NULL) goto map_err; srq->bit_fields_len = (srq->rq.max_cnt / 32) + (srq->rq.max_cnt % 32 ? 1 : 0); srq->idx_bit_fields = malloc(srq->bit_fields_len * sizeof(uint32_t)); if (srq->idx_bit_fields == NULL) goto map_err; memset(srq->idx_bit_fields, 0xff, srq->bit_fields_len * sizeof(uint32_t)); if (resp.num_rq_pages > 1) goto map_err; map_addr = mmap(NULL, resp.rq_page_size, PROT_READ | PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.rq_page_addr[0]); if (map_addr == MAP_FAILED) goto map_err; srq->rq.len = resp.rq_page_size; srq->rq.va = map_addr; map_addr = mmap(NULL, resp.db_page_size, PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.db_page_addr); if (map_addr == MAP_FAILED) goto map_err; srq->db_va = (uint8_t *) map_addr + resp.db_rq_offset; srq->db_shift = resp.db_shift; srq->db_size = resp.db_page_size; return &srq->ibv_srq; map_err: ocrdma_destroy_srq(&srq->ibv_srq); return NULL; cmd_err: pthread_spin_destroy(&srq->q_lock); free(srq); return NULL; } int ocrdma_modify_srq(struct ibv_srq *ibsrq, struct ibv_srq_attr *attr, int attr_mask) { struct ibv_modify_srq cmd; return ibv_cmd_modify_srq(ibsrq, attr, attr_mask, &cmd, sizeof cmd); } int ocrdma_query_srq(struct ibv_srq *ibsrq, struct ibv_srq_attr *attr) { struct ibv_query_srq cmd; return ibv_cmd_query_srq(ibsrq, attr, &cmd, sizeof cmd); } int ocrdma_destroy_srq(struct ibv_srq *ibsrq) { int status; struct ocrdma_srq *srq; srq = get_ocrdma_srq(ibsrq); status = ibv_cmd_destroy_srq(ibsrq); if (status) return status; if (srq->idx_bit_fields) free(srq->idx_bit_fields); if (srq->rqe_wr_id_tbl) free(srq->rqe_wr_id_tbl); if (srq->db_va) { munmap((void *)srq->db_va, srq->db_size); srq->db_va = NULL; } if (srq->rq.va) { munmap(srq->rq.va, srq->rq.len); srq->rq.va = NULL; } pthread_spin_destroy(&srq->q_lock); free(srq); return status; } /* * ocrdma_create_qp */ struct ibv_qp *ocrdma_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attrs) { int status = 0; struct uocrdma_create_qp cmd; struct uocrdma_create_qp_resp resp = {}; struct ocrdma_qp *qp; void *map_addr; #ifdef DPP_CQ_SUPPORT struct ocrdma_dpp_cqe *dpp_cqe = NULL; #endif qp = calloc(1, sizeof *qp); if (!qp) return NULL; memset(&cmd, 0, sizeof(cmd)); qp->qp_type = attrs->qp_type; pthread_spin_init(&qp->q_lock, PTHREAD_PROCESS_PRIVATE); #ifdef DPP_CQ_SUPPORT if (attrs->cap.max_inline_data) { qp->dpp_cq = ocrdma_create_dpp_cq(pd->context, OCRDMA_CREATE_QP_REQ_DPP_CREDIT_LIMIT); if (qp->dpp_cq) { cmd.enable_dpp_cq = 1; cmd.dpp_cq_id = qp->dpp_cq->cq_id; /* Write invalid index for the first 
entry */ dpp_cqe = (struct ocrdma_dpp_cqe *)qp->dpp_cq->va; dpp_cqe->wqe_idx_valid = 0xFFFF; qp->dpp_prev_indx = 0xFFFF; } } #endif status = ibv_cmd_create_qp(pd, &qp->ibv_qp, attrs, &cmd.ibv_cmd, sizeof cmd, &resp.ibv_resp, sizeof resp); if (status) goto mbx_err; qp->dev = get_ocrdma_dev(pd->context->device); qp->id = resp.qp_id; ocrdma_add_qpn_map(qp->dev, qp); qp->sq.dbid = resp.sq_dbid; qp->sq.max_sges = attrs->cap.max_send_sge; qp->max_inline_data = attrs->cap.max_inline_data; qp->signaled = attrs->sq_sig_all; qp->sq.max_cnt = resp.num_wqe_allocated; qp->sq.max_wqe_idx = resp.num_wqe_allocated - 1; qp->sq.entry_size = qp->dev->wqe_size; if (attrs->srq) qp->srq = get_ocrdma_srq(attrs->srq); else { qp->rq.dbid = resp.rq_dbid; qp->rq.max_sges = attrs->cap.max_recv_sge; qp->rq.max_cnt = resp.num_rqe_allocated; qp->rq.max_wqe_idx = resp.num_rqe_allocated - 1; qp->rq.entry_size = qp->dev->rqe_size; qp->rqe_wr_id_tbl = calloc(qp->rq.max_cnt, sizeof(uint64_t)); if (qp->rqe_wr_id_tbl == NULL) goto map_err; } qp->sq_cq = get_ocrdma_cq(attrs->send_cq); qp->rq_cq = get_ocrdma_cq(attrs->recv_cq); qp->wqe_wr_id_tbl = calloc(qp->sq.max_cnt, sizeof(*qp->wqe_wr_id_tbl)); if (qp->wqe_wr_id_tbl == NULL) goto map_err; /* currently we support only one virtual page */ if ((resp.num_sq_pages > 1) || (!attrs->srq && resp.num_rq_pages > 1)) goto map_err; map_addr = mmap(NULL, resp.sq_page_size, PROT_READ | PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.sq_page_addr[0]); if (map_addr == MAP_FAILED) goto map_err; qp->sq.va = map_addr; qp->sq.len = resp.sq_page_size; qp->db_shift = resp.db_shift; if (!attrs->srq) { map_addr = mmap(NULL, resp.rq_page_size, PROT_READ | PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.rq_page_addr[0]); if (map_addr == MAP_FAILED) goto map_err; qp->rq.len = resp.rq_page_size; qp->rq.va = map_addr; } map_addr = mmap(NULL, resp.db_page_size, PROT_WRITE, MAP_SHARED, pd->context->cmd_fd, resp.db_page_addr); if (map_addr == MAP_FAILED) goto map_err; qp->db_va = map_addr; qp->db_sq_va = (uint8_t *) map_addr + resp.db_sq_offset; qp->db_rq_va = (uint8_t *) map_addr + resp.db_rq_offset; qp->db_size = resp.db_page_size; if (resp.dpp_credit) { struct ocrdma_pd *opd = get_ocrdma_pd(pd); map_addr = (uint8_t *) opd->dpp_va + (resp.dpp_offset * qp->dev->wqe_size); qp->dpp_q.max_cnt = 1; /* DPP is posted at the same offset */ qp->dpp_q.free_cnt = resp.dpp_credit; qp->dpp_q.va = map_addr; qp->dpp_q.head = qp->dpp_q.tail = 0; qp->dpp_q.entry_size = qp->dev->dpp_wqe_size; qp->dpp_q.len = resp.dpp_credit * qp->dev->dpp_wqe_size; qp->dpp_enabled = 1; } else { if (qp->dpp_cq) { ocrdma_destroy_cq(&qp->dpp_cq->ibv_cq); qp->dpp_cq = NULL; } } qp->state = OCRDMA_QPS_RST; list_node_init(&qp->sq_entry); list_node_init(&qp->rq_entry); return &qp->ibv_qp; map_err: ocrdma_destroy_qp(&qp->ibv_qp); return NULL; mbx_err: pthread_spin_destroy(&qp->q_lock); free(qp); return NULL; } static enum ocrdma_qp_state get_ocrdma_qp_state(enum ibv_qp_state qps) { switch (qps) { case IBV_QPS_RESET: return OCRDMA_QPS_RST; case IBV_QPS_INIT: return OCRDMA_QPS_INIT; case IBV_QPS_RTR: return OCRDMA_QPS_RTR; case IBV_QPS_RTS: return OCRDMA_QPS_RTS; case IBV_QPS_SQD: return OCRDMA_QPS_SQD; case IBV_QPS_SQE: return OCRDMA_QPS_SQE; case IBV_QPS_ERR: return OCRDMA_QPS_ERR; case IBV_QPS_UNKNOWN: break; default: break; }; return OCRDMA_QPS_ERR; } static int ocrdma_is_qp_in_sq_flushlist(struct ocrdma_cq *cq, struct ocrdma_qp *qp) { struct ocrdma_qp *list_qp; struct ocrdma_qp *list_qp_tmp; int found = 0; list_for_each_safe(&cq->sq_head, 
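/*
 * The flush machinery in this area keeps, per CQ, a list of QPs whose
 * outstanding WQEs must be completed with IBV_WC_WR_FLUSH_ERR once the
 * QP enters the error state.  A minimal sketch of the ccan list usage
 * involved (toy types; the real lists live in ocrdma_cq/ocrdma_qp):
 */
#include <ccan/list.h>

struct sketch_flush_qp {
	int id;
	struct list_node entry;
};

static void sketch_flush_demo(void)
{
	LIST_HEAD(flush_head);
	struct sketch_flush_qp qp = { .id = 1 };
	struct sketch_flush_qp *pos, *tmp;

	list_node_init(&qp.entry);
	list_add_tail(&flush_head, &qp.entry);	/* queue QP for flushing */

	list_for_each_safe(&flush_head, pos, tmp, entry)
		list_del(&pos->entry);		/* drain; safe vs. removal */
}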
list_qp, list_qp_tmp, sq_entry) { if (qp == list_qp) { found = 1; break; } } return found; } static int ocrdma_is_qp_in_rq_flushlist(struct ocrdma_cq *cq, struct ocrdma_qp *qp) { struct ocrdma_qp *list_qp; struct ocrdma_qp *list_qp_tmp; int found = 0; list_for_each_safe(&cq->rq_head, list_qp, list_qp_tmp, rq_entry) { if (qp == list_qp) { found = 1; break; } } return found; } static void ocrdma_init_hwq_ptr(struct ocrdma_qp *qp) { qp->sq.head = qp->sq.tail = 0; qp->rq.head = qp->rq.tail = 0; qp->dpp_q.head = qp->dpp_q.tail = 0; qp->dpp_q.free_cnt = qp->dpp_q.max_cnt; } static void ocrdma_del_flush_qp(struct ocrdma_qp *qp) { int found = 0; struct ocrdma_device *dev = qp->dev; /* sync with any active CQ poll */ pthread_spin_lock(&dev->flush_q_lock); found = ocrdma_is_qp_in_sq_flushlist(qp->sq_cq, qp); if (found) list_del(&qp->sq_entry); if (!qp->srq) { found = ocrdma_is_qp_in_rq_flushlist(qp->rq_cq, qp); if (found) list_del(&qp->rq_entry); } pthread_spin_unlock(&dev->flush_q_lock); } static void ocrdma_flush_qp(struct ocrdma_qp *qp) { int found; pthread_spin_lock(&qp->dev->flush_q_lock); found = ocrdma_is_qp_in_sq_flushlist(qp->sq_cq, qp); if (!found) list_add_tail(&qp->sq_cq->sq_head, &qp->sq_entry); if (!qp->srq) { found = ocrdma_is_qp_in_rq_flushlist(qp->rq_cq, qp); if (!found) list_add_tail(&qp->rq_cq->rq_head, &qp->rq_entry); } pthread_spin_unlock(&qp->dev->flush_q_lock); } static int ocrdma_qp_state_machine(struct ocrdma_qp *qp, enum ibv_qp_state new_ib_state) { int status = 0; enum ocrdma_qp_state new_state; new_state = get_ocrdma_qp_state(new_ib_state); pthread_spin_lock(&qp->q_lock); if (new_state == qp->state) { pthread_spin_unlock(&qp->q_lock); return 1; } switch (qp->state) { case OCRDMA_QPS_RST: switch (new_state) { case OCRDMA_QPS_RST: break; case OCRDMA_QPS_INIT: /* init pointers to place wqe/rqe at start of hw q */ ocrdma_init_hwq_ptr(qp); /* detach qp from the CQ flush list */ ocrdma_del_flush_qp(qp); break; default: status = EINVAL; break; }; break; case OCRDMA_QPS_INIT: /* qps: INIT->XXX */ switch (new_state) { case OCRDMA_QPS_INIT: break; case OCRDMA_QPS_RTR: break; case OCRDMA_QPS_ERR: ocrdma_flush_qp(qp); break; default: /* invalid state change. */ status = EINVAL; break; }; break; case OCRDMA_QPS_RTR: /* qps: RTS->XXX */ switch (new_state) { case OCRDMA_QPS_RTS: break; case OCRDMA_QPS_ERR: ocrdma_flush_qp(qp); break; default: /* invalid state change. */ status = EINVAL; break; }; break; case OCRDMA_QPS_RTS: /* qps: RTS->XXX */ switch (new_state) { case OCRDMA_QPS_SQD: case OCRDMA_QPS_SQE: break; case OCRDMA_QPS_ERR: ocrdma_flush_qp(qp); break; default: /* invalid state change. */ status = EINVAL; break; }; break; case OCRDMA_QPS_SQD: /* qps: SQD->XXX */ switch (new_state) { case OCRDMA_QPS_RTS: case OCRDMA_QPS_SQE: case OCRDMA_QPS_ERR: break; default: /* invalid state change. */ status = EINVAL; break; }; break; case OCRDMA_QPS_SQE: switch (new_state) { case OCRDMA_QPS_RTS: case OCRDMA_QPS_ERR: break; default: /* invalid state change. 
*/ status = EINVAL; break; }; break; case OCRDMA_QPS_ERR: /* qps: ERR->XXX */ switch (new_state) { case OCRDMA_QPS_RST: break; default: status = EINVAL; break; }; break; default: status = EINVAL; break; }; if (!status) qp->state = new_state; pthread_spin_unlock(&qp->q_lock); return status; } /* * ocrdma_modify_qp */ int ocrdma_modify_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask) { struct ibv_modify_qp cmd = {}; struct ocrdma_qp *qp = get_ocrdma_qp(ibqp); int status; status = ibv_cmd_modify_qp(ibqp, attr, attr_mask, &cmd, sizeof cmd); if ((!status) && (attr_mask & IBV_QP_STATE)) ocrdma_qp_state_machine(qp, attr->qp_state); return status; } /* * ocrdma_query_qp */ int ocrdma_query_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd; struct ocrdma_qp *qp = get_ocrdma_qp(ibqp); int status; status = ibv_cmd_query_qp(ibqp, attr, attr_mask, init_attr, &cmd, sizeof(cmd)); if (!status) ocrdma_qp_state_machine(qp, attr->qp_state); return status; } static void ocrdma_srq_toggle_bit(struct ocrdma_srq *srq, int idx) { int i = idx / 32; unsigned int mask = (1 << (idx % 32)); if (srq->idx_bit_fields[i] & mask) { srq->idx_bit_fields[i] &= ~mask; } else { srq->idx_bit_fields[i] |= mask; } } static int ocrdma_srq_get_idx(struct ocrdma_srq *srq) { int row = 0; int indx = 0; for (row = 0; row < srq->bit_fields_len; row++) { if (srq->idx_bit_fields[row]) { indx = ffs(srq->idx_bit_fields[row]); indx = (row * 32) + (indx - 1); if (indx >= srq->rq.max_cnt) assert(0); ocrdma_srq_toggle_bit(srq, indx); break; } } if (row == srq->bit_fields_len) assert(0); return indx + 1; /* Use the index from 1 */ } static int ocrdma_dppq_credits(struct ocrdma_qp_hwq_info *q) { return ((q->max_wqe_idx - q->head) + q->tail) % q->free_cnt; } static int ocrdma_hwq_free_cnt(struct ocrdma_qp_hwq_info *q) { return ((q->max_wqe_idx - q->head) + q->tail) % q->max_cnt; } static int is_hw_sq_empty(struct ocrdma_qp *qp) { return ((qp->sq.tail == qp->sq.head) ? 1 : 0); } static inline int is_hw_rq_empty(struct ocrdma_qp *qp) { return ((qp->rq.head == qp->rq.tail) ? 1 : 0); } static inline void *ocrdma_hwq_head(struct ocrdma_qp_hwq_info *q) { return q->va + (q->head * q->entry_size); } /*static inline void *ocrdma_wq_tail(struct ocrdma_qp_hwq_info *q) { return q->va + (q->tail * q->entry_size); } */ static inline void *ocrdma_hwq_head_from_idx(struct ocrdma_qp_hwq_info *q, uint32_t idx) { return q->va + (idx * q->entry_size); } static void ocrdma_hwq_inc_head(struct ocrdma_qp_hwq_info *q) { q->head = (q->head + 1) & q->max_wqe_idx; } static void ocrdma_hwq_inc_tail(struct ocrdma_qp_hwq_info *q) { q->tail = (q->tail + 1) & q->max_wqe_idx; } static inline void ocrdma_hwq_inc_tail_by_idx(struct ocrdma_qp_hwq_info *q, int idx) { q->tail = (idx + 1) & q->max_wqe_idx; } static int is_cqe_valid(struct ocrdma_cq *cq, struct ocrdma_cqe *cqe) { int cqe_valid; cqe_valid = le32toh(cqe->flags_status_srcqpn) & OCRDMA_CQE_VALID; return (cqe_valid == cq->phase); } static int is_cqe_for_sq(struct ocrdma_cqe *cqe) { return (le32toh(cqe->flags_status_srcqpn) & OCRDMA_CQE_QTYPE) ? 0 : 1; } static int is_cqe_imm(struct ocrdma_cqe *cqe) { return (le32toh(cqe->flags_status_srcqpn) & OCRDMA_CQE_IMM) ? 1 : 0; } static int is_cqe_wr_imm(struct ocrdma_cqe *cqe) { return (le32toh(cqe->flags_status_srcqpn) & OCRDMA_CQE_WRITE_IMM) ? 
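/*
 * ocrdma_hwq_free_cnt() below evaluates
 *     ((max_wqe_idx - head) + tail) % max_cnt
 * with max_wqe_idx == max_cnt - 1 and max_cnt a power of two.  This
 * deliberately reports at most max_cnt - 1 free slots: one entry is
 * sacrificed so that head == tail unambiguously means "empty" rather
 * than "full".  Hedged worked example with an 8-entry queue:
 */
#include <assert.h>
#include <stdint.h>

static uint32_t sketch_free_cnt(uint32_t head, uint32_t tail,
				uint32_t max_cnt)
{
	return ((max_cnt - 1 - head) + tail) % max_cnt;
}

static void sketch_free_cnt_demo(void)
{
	assert(sketch_free_cnt(0, 0, 8) == 7);	/* empty: one slot reserved */
	assert(sketch_free_cnt(7, 0, 8) == 0);	/* 7 entries in flight: full */
	assert(sketch_free_cnt(3, 1, 8) == 5);	/* 2 in flight, 5 still free */
}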
1 : 0; } static inline void ocrdma_srq_inc_tail(struct ocrdma_qp *qp, struct ocrdma_cqe *cqe) { int wqe_idx; wqe_idx = (le32toh(cqe->rq.buftag_qpn) >> OCRDMA_CQE_BUFTAG_SHIFT) & qp->srq->rq.max_wqe_idx; if (wqe_idx < 1) assert(0); pthread_spin_lock(&qp->srq->q_lock); ocrdma_hwq_inc_tail(&qp->srq->rq); ocrdma_srq_toggle_bit(qp->srq, wqe_idx - 1); pthread_spin_unlock(&qp->srq->q_lock); } static void ocrdma_discard_cqes(struct ocrdma_qp *qp, struct ocrdma_cq *cq) { uint32_t cur_getp, stop_getp; struct ocrdma_cqe *cqe; uint32_t qpn = 0; int wqe_idx; pthread_spin_lock(&cq->cq_lock); /* traverse through the CQEs in the hw CQ, * find the matching CQE for a given qp, * mark the matching one discarded=1. * discard the cqe. * ring the doorbell in the poll_cq() as * we don't complete out of order cqe. */ cur_getp = cq->getp; /* find up to when do we reap the cq.*/ stop_getp = cur_getp; do { if (is_hw_sq_empty(qp) && (!qp->srq && is_hw_rq_empty(qp))) break; cqe = cq->va + cur_getp; /* if (a) no valid cqe, or (b) done reading full hw cq, or * (c) qp_xq becomes empty. * then exit */ qpn = le32toh(cqe->cmn.qpn) & OCRDMA_CQE_QPN_MASK; /* if previously discarded cqe found, skip that too. * check for matching qp */ if ((qpn == 0) || (qpn != qp->id)) goto skip_cqe; /* mark cqe discarded so that it is not picked up later * in the poll_cq(). */ if (is_cqe_for_sq(cqe)) { wqe_idx = (le32toh(cqe->wq.wqeidx) & OCRDMA_CQE_WQEIDX_MASK) & qp->sq.max_wqe_idx; ocrdma_hwq_inc_tail_by_idx(&qp->sq, wqe_idx); } else { if (qp->srq) ocrdma_srq_inc_tail(qp, cqe); else ocrdma_hwq_inc_tail(&qp->rq); } /* discard by marking qp_id = 0 */ cqe->cmn.qpn = 0; skip_cqe: cur_getp = (cur_getp + 1) % cq->max_hw_cqe; } while (cur_getp != stop_getp); pthread_spin_unlock(&cq->cq_lock); } /* * ocrdma_destroy_qp */ int ocrdma_destroy_qp(struct ibv_qp *ibqp) { int status = 0; struct ocrdma_qp *qp; struct ocrdma_device *dev; qp = get_ocrdma_qp(ibqp); dev = qp->dev; /* * acquire CQ lock while destroy is in progress, in order to * protect against proessing in-flight CQEs for this QP. */ pthread_spin_lock(&qp->sq_cq->cq_lock); if (qp->rq_cq && (qp->rq_cq != qp->sq_cq)) pthread_spin_lock(&qp->rq_cq->cq_lock); _ocrdma_del_qpn_map(qp->dev, qp); if (qp->rq_cq && (qp->rq_cq != qp->sq_cq)) pthread_spin_unlock(&qp->rq_cq->cq_lock); pthread_spin_unlock(&qp->sq_cq->cq_lock); if (qp->db_va) munmap((void *)qp->db_va, qp->db_size); if (qp->rq.va) munmap(qp->rq.va, qp->rq.len); if (qp->sq.va) munmap(qp->sq.va, qp->sq.len); /* ensure that CQEs for newly created QP (whose id may be same with * one which just getting destroyed are same), don't get * discarded until the old CQEs are discarded. 
*/ pthread_mutex_lock(&dev->dev_lock); status = ibv_cmd_destroy_qp(ibqp); ocrdma_discard_cqes(qp, qp->sq_cq); ocrdma_discard_cqes(qp, qp->rq_cq); pthread_mutex_unlock(&dev->dev_lock); ocrdma_del_flush_qp(qp); pthread_spin_destroy(&qp->q_lock); if (qp->rqe_wr_id_tbl) free(qp->rqe_wr_id_tbl); if (qp->wqe_wr_id_tbl) free(qp->wqe_wr_id_tbl); if (qp->dpp_cq) ocrdma_destroy_cq(&qp->dpp_cq->ibv_cq); free(qp); return status; } static void ocrdma_ring_sq_db(struct ocrdma_qp *qp) { __le32 db_val = htole32((qp->sq.dbid | (1 << 16))); udma_to_device_barrier(); *(__le32 *) (((uint8_t *) qp->db_sq_va)) = db_val; } static void ocrdma_ring_rq_db(struct ocrdma_qp *qp) { __le32 db_val = htole32((qp->rq.dbid | (1 << qp->db_shift))); udma_to_device_barrier(); *(__le32 *) ((uint8_t *) qp->db_rq_va) = db_val; } static void ocrdma_ring_srq_db(struct ocrdma_srq *srq) { __le32 db_val = htole32(srq->rq.dbid | (1 << srq->db_shift)); udma_to_device_barrier(); *(__le32 *) (srq->db_va) = db_val; } static void ocrdma_ring_cq_db(struct ocrdma_cq *cq, uint32_t armed, int solicited, uint32_t num_cqe) { uint32_t val; val = cq->cq_dbid & OCRDMA_DB_CQ_RING_ID_MASK; val |= ((cq->cq_dbid & OCRDMA_DB_CQ_RING_ID_EXT_MASK) << OCRDMA_DB_CQ_RING_ID_EXT_MASK_SHIFT); if (armed) val |= (1 << OCRDMA_DB_CQ_REARM_SHIFT); if (solicited) val |= (1 << OCRDMA_DB_CQ_SOLICIT_SHIFT); val |= (num_cqe << OCRDMA_DB_CQ_NUM_POPPED_SHIFT); udma_to_device_barrier(); *(__le32 *) ((uint8_t *) (cq->db_va) + OCRDMA_DB_CQ_OFFSET) = htole32(val); } static void ocrdma_build_ud_hdr(struct ocrdma_qp *qp, struct ocrdma_hdr_wqe *hdr, struct ibv_send_wr *wr) { struct ocrdma_ewqe_ud_hdr *ud_hdr = (struct ocrdma_ewqe_ud_hdr *)(hdr + 1); struct ocrdma_ah *ah = get_ocrdma_ah(wr->wr.ud.ah); ud_hdr->rsvd_dest_qpn = wr->wr.ud.remote_qpn; ud_hdr->qkey = wr->wr.ud.remote_qkey; ud_hdr->rsvd_ahid = ah->id; if (ah->isvlan) hdr->cw |= (OCRDMA_FLAG_AH_VLAN_PR << OCRDMA_WQE_FLAGS_SHIFT); ud_hdr->hdr_type = ah->hdr_type; } static void ocrdma_build_sges(struct ocrdma_hdr_wqe *hdr, struct ocrdma_sge *sge, int num_sge, struct ibv_sge *sg_list) { int i; for (i = 0; i < num_sge; i++) { sge[i].lrkey = sg_list[i].lkey; sge[i].addr_lo = sg_list[i].addr; sge[i].addr_hi = sg_list[i].addr >> 32; sge[i].len = sg_list[i].length; hdr->total_len += sg_list[i].length; } if (num_sge == 0) memset(sge, 0, sizeof(*sge)); } static inline uint32_t ocrdma_sglist_len(struct ibv_sge *sg_list, int num_sge) { uint32_t total_len = 0, i; for (i = 0; i < num_sge; i++) total_len += sg_list[i].length; return total_len; } static inline int ocrdma_build_inline_sges(struct ocrdma_qp *qp, struct ocrdma_hdr_wqe *hdr, struct ocrdma_sge *sge, struct ibv_send_wr *wr, uint32_t wqe_size) { int i; char *dpp_addr; if (wr->send_flags & IBV_SEND_INLINE && qp->qp_type != IBV_QPT_UD) { hdr->total_len = ocrdma_sglist_len(wr->sg_list, wr->num_sge); if (hdr->total_len > qp->max_inline_data) { ocrdma_err ("%s() supported_len=0x%x, unsupported len req=0x%x\n", __func__, qp->max_inline_data, hdr->total_len); return EINVAL; } dpp_addr = (char *)sge; for (i = 0; i < wr->num_sge; i++) { memcpy(dpp_addr, (void *)(unsigned long)wr->sg_list[i].addr, wr->sg_list[i].length); dpp_addr += wr->sg_list[i].length; } wqe_size += ROUND_UP_X(hdr->total_len, OCRDMA_WQE_ALIGN_BYTES); if (0 == hdr->total_len) wqe_size += sizeof(struct ocrdma_sge); hdr->cw |= (OCRDMA_TYPE_INLINE << OCRDMA_WQE_TYPE_SHIFT); } else { ocrdma_build_sges(hdr, sge, wr->num_sge, wr->sg_list); if (wr->num_sge) wqe_size += (wr->num_sge * sizeof(struct ocrdma_sge)); else wqe_size 
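/*
 * Every ocrdma_ring_*_db() helper above follows the same recipe: make
 * the WQE stores visible to the device first (udma_to_device_barrier()),
 * then issue a single 32-bit little-endian store to the mapped doorbell
 * page.  Minimal sketch of that ordering; the doorbell bit layout here
 * (count in bits 16+) is only an assumption for illustration:
 */
#include <endian.h>
#include <linux/types.h>
#include <util/udma_barrier.h>

static void sketch_ring_db(void *db_va, uint32_t dbid, uint32_t num_posted)
{
	__le32 db_val = htole32(dbid | (num_posted << 16));

	/* Order the WQE writes before the doorbell MMIO write. */
	udma_to_device_barrier();
	*(__le32 *)db_va = db_val;
}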
+= sizeof(struct ocrdma_sge); hdr->cw |= (OCRDMA_TYPE_LKEY << OCRDMA_WQE_TYPE_SHIFT); } hdr->cw |= ((wqe_size / OCRDMA_WQE_STRIDE) << OCRDMA_WQE_SIZE_SHIFT); return 0; } static int ocrdma_build_send(struct ocrdma_qp *qp, struct ocrdma_hdr_wqe *hdr, struct ibv_send_wr *wr) { int status; struct ocrdma_sge *sge; uint32_t wqe_size = sizeof(*hdr); if (qp->qp_type == IBV_QPT_UD) { wqe_size += sizeof(struct ocrdma_ewqe_ud_hdr); ocrdma_build_ud_hdr(qp, hdr, wr); sge = (struct ocrdma_sge *)(hdr + 2); } else sge = (struct ocrdma_sge *)(hdr + 1); status = ocrdma_build_inline_sges(qp, hdr, sge, wr, wqe_size); return status; } static int ocrdma_build_write(struct ocrdma_qp *qp, struct ocrdma_hdr_wqe *hdr, struct ibv_send_wr *wr) { int status; struct ocrdma_sge *ext_rw = (struct ocrdma_sge *)(hdr + 1); struct ocrdma_sge *sge = ext_rw + 1; uint32_t wqe_size = sizeof(*hdr) + sizeof(*ext_rw); status = ocrdma_build_inline_sges(qp, hdr, sge, wr, wqe_size); if (status) return status; ext_rw->addr_lo = wr->wr.rdma.remote_addr; ext_rw->addr_hi = (wr->wr.rdma.remote_addr >> 32); ext_rw->lrkey = wr->wr.rdma.rkey; ext_rw->len = hdr->total_len; return 0; } static void ocrdma_build_read(struct ocrdma_qp *qp, struct ocrdma_hdr_wqe *hdr, struct ibv_send_wr *wr) { struct ocrdma_sge *ext_rw = (struct ocrdma_sge *)(hdr + 1); struct ocrdma_sge *sge = ext_rw + 1; uint32_t wqe_size = ((wr->num_sge + 1) * sizeof(*sge)) + sizeof(*hdr); hdr->cw |= (OCRDMA_TYPE_LKEY << OCRDMA_WQE_TYPE_SHIFT); hdr->cw |= ((wqe_size / OCRDMA_WQE_STRIDE) << OCRDMA_WQE_SIZE_SHIFT); hdr->cw |= (OCRDMA_READ << OCRDMA_WQE_OPCODE_SHIFT); ocrdma_build_sges(hdr, sge, wr->num_sge, wr->sg_list); ext_rw->addr_lo = wr->wr.rdma.remote_addr; ext_rw->addr_hi = (wr->wr.rdma.remote_addr >> 32); ext_rw->lrkey = wr->wr.rdma.rkey; ext_rw->len = hdr->total_len; } /* Dpp cq is single entry cq, we just need to read * wqe index from first 16 bits at 0th cqe index. */ static void ocrdma_poll_dpp_cq(struct ocrdma_qp *qp) { struct ocrdma_cq *cq = qp->dpp_cq; struct ocrdma_dpp_cqe *cqe; int idx = 0; cqe = ((struct ocrdma_dpp_cqe *)cq->va); idx = cqe->wqe_idx_valid & OCRDMA_DPP_WQE_INDEX_MASK; if (idx != qp->dpp_prev_indx) { ocrdma_hwq_inc_tail_by_idx(&qp->dpp_q, idx); qp->dpp_prev_indx = idx; } } static uint32_t ocrdma_get_hdr_len(struct ocrdma_qp *qp, struct ocrdma_hdr_wqe *hdr) { uint32_t hdr_sz = sizeof(*hdr); if (qp->qp_type == IBV_QPT_UD) hdr_sz += sizeof(struct ocrdma_ewqe_ud_hdr); if (hdr->cw & (OCRDMA_WRITE << OCRDMA_WQE_OPCODE_SHIFT)) hdr_sz += sizeof(struct ocrdma_sge); return hdr_sz / sizeof(uint32_t); } static void ocrdma_build_dpp_wqe(void *va, struct ocrdma_hdr_wqe *wqe, uint32_t hdr_len) { uint32_t pyld_len = (wqe->cw >> OCRDMA_WQE_SIZE_SHIFT) * 2; uint32_t i = 0; mmio_wc_start(); /* convert WQE header to LE format */ for (; i < hdr_len; i++) *((__le32 *) va + i) = htole32(*((uint32_t *) wqe + i)); /* Convertion of data is done in HW */ for (; i < pyld_len; i++) *((uint32_t *) va + i) = (*((uint32_t *) wqe + i)); mmio_flush_writes(); } static void ocrdma_post_dpp_wqe(struct ocrdma_qp *qp, struct ocrdma_hdr_wqe *hdr) { if (qp->dpp_cq && ocrdma_dppq_credits(&qp->dpp_q) == 0) ocrdma_poll_dpp_cq(qp); if (!qp->dpp_cq || ocrdma_dppq_credits(&qp->dpp_q)) { ocrdma_build_dpp_wqe(qp->dpp_q.va, hdr, ocrdma_get_hdr_len(qp, hdr)); qp->wqe_wr_id_tbl[qp->sq.head].dpp_wqe = 1; qp->wqe_wr_id_tbl[qp->sq.head].dpp_wqe_idx = qp->dpp_q.head; /* if dpp cq is not enabled, we can post * wqe as soon as we receive and adapter * takes care of flow control. 
*/ if (qp->dpp_cq) ocrdma_hwq_inc_head(&qp->dpp_q); } else qp->wqe_wr_id_tbl[qp->sq.head].dpp_wqe = 0; } /* * ocrdma_post_send */ int ocrdma_post_send(struct ibv_qp *ib_qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { int status = 0; struct ocrdma_qp *qp; struct ocrdma_hdr_wqe *hdr; qp = get_ocrdma_qp(ib_qp); pthread_spin_lock(&qp->q_lock); if (qp->state != OCRDMA_QPS_RTS && qp->state != OCRDMA_QPS_SQD) { pthread_spin_unlock(&qp->q_lock); *bad_wr = wr; return EINVAL; } while (wr) { if (qp->qp_type == IBV_QPT_UD && (wr->opcode != IBV_WR_SEND && wr->opcode != IBV_WR_SEND_WITH_IMM)) { *bad_wr = wr; status = EINVAL; break; } if (ocrdma_hwq_free_cnt(&qp->sq) == 0 || wr->num_sge > qp->sq.max_sges) { *bad_wr = wr; status = ENOMEM; break; } hdr = ocrdma_hwq_head(&qp->sq); hdr->cw = 0; hdr->total_len = 0; if (wr->send_flags & IBV_SEND_SIGNALED || qp->signaled) hdr->cw = (OCRDMA_FLAG_SIG << OCRDMA_WQE_FLAGS_SHIFT); if (wr->send_flags & IBV_SEND_FENCE) hdr->cw |= (OCRDMA_FLAG_FENCE_L << OCRDMA_WQE_FLAGS_SHIFT); if (wr->send_flags & IBV_SEND_SOLICITED) hdr->cw |= (OCRDMA_FLAG_SOLICIT << OCRDMA_WQE_FLAGS_SHIFT); qp->wqe_wr_id_tbl[qp->sq.head].wrid = wr->wr_id; switch (wr->opcode) { case IBV_WR_SEND_WITH_IMM: hdr->cw |= (OCRDMA_FLAG_IMM << OCRDMA_WQE_FLAGS_SHIFT); hdr->immdt = be32toh(wr->imm_data); SWITCH_FALLTHROUGH; case IBV_WR_SEND: hdr->cw |= (OCRDMA_SEND << OCRDMA_WQE_OPCODE_SHIFT); status = ocrdma_build_send(qp, hdr, wr); break; case IBV_WR_RDMA_WRITE_WITH_IMM: hdr->cw |= (OCRDMA_FLAG_IMM << OCRDMA_WQE_FLAGS_SHIFT); hdr->immdt = be32toh(wr->imm_data); SWITCH_FALLTHROUGH; case IBV_WR_RDMA_WRITE: hdr->cw |= (OCRDMA_WRITE << OCRDMA_WQE_OPCODE_SHIFT); status = ocrdma_build_write(qp, hdr, wr); break; case IBV_WR_RDMA_READ: ocrdma_build_read(qp, hdr, wr); break; default: status = EINVAL; break; } if (status) { *bad_wr = wr; break; } if (wr->send_flags & IBV_SEND_SIGNALED || qp->signaled) qp->wqe_wr_id_tbl[qp->sq.head].signaled = 1; else qp->wqe_wr_id_tbl[qp->sq.head].signaled = 0; if (qp->dpp_enabled && (wr->send_flags & IBV_SEND_INLINE)) ocrdma_post_dpp_wqe(qp, hdr); ocrdma_swap_cpu_to_le(hdr, ((hdr->cw >> OCRDMA_WQE_SIZE_SHIFT) & OCRDMA_WQE_SIZE_MASK) * OCRDMA_WQE_STRIDE); ocrdma_ring_sq_db(qp); /* update pointer, counter for next wr */ ocrdma_hwq_inc_head(&qp->sq); wr = wr->next; } pthread_spin_unlock(&qp->q_lock); return status; } static void ocrdma_build_rqe(struct ocrdma_hdr_wqe *rqe, struct ibv_recv_wr *wr, uint16_t tag) { struct ocrdma_sge *sge; uint32_t wqe_size; if (wr->num_sge) wqe_size = (wr->num_sge * sizeof(*sge)) + sizeof(*rqe); else wqe_size = sizeof(*sge) + sizeof(*rqe); rqe->cw = ((wqe_size / OCRDMA_WQE_STRIDE) << OCRDMA_WQE_SIZE_SHIFT); rqe->cw |= (OCRDMA_FLAG_SIG << OCRDMA_WQE_FLAGS_SHIFT); rqe->cw |= (OCRDMA_TYPE_LKEY << OCRDMA_WQE_TYPE_SHIFT); rqe->total_len = 0; rqe->rsvd_tag = tag; sge = (struct ocrdma_sge *)(rqe + 1); ocrdma_build_sges(rqe, sge, wr->num_sge, wr->sg_list); ocrdma_swap_cpu_to_le(rqe, wqe_size); } /* * ocrdma_post_recv */ int ocrdma_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { int status = 0; struct ocrdma_qp *qp; struct ocrdma_hdr_wqe *rqe; qp = get_ocrdma_qp(ibqp); pthread_spin_lock(&qp->q_lock); if (qp->state == OCRDMA_QPS_RST || qp->state == OCRDMA_QPS_ERR) { pthread_spin_unlock(&qp->q_lock); *bad_wr = wr; return EINVAL; } while (wr) { if (ocrdma_hwq_free_cnt(&qp->rq) == 0 || wr->num_sge > qp->rq.max_sges) { status = ENOMEM; *bad_wr = wr; break; } rqe = ocrdma_hwq_head(&qp->rq); ocrdma_build_rqe(rqe, wr, 0); 
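/*
 * From an application's point of view the post_recv entry point below
 * is reached through the uverbs dispatch table as plain ibv_post_recv().
 * Hedged usage sketch (buf/mr/qp are assumed to already exist and the
 * buffer to be registered):
 */
#include <infiniband/verbs.h>
#include <stdint.h>

static int sketch_post_one_recv(struct ibv_qp *qp, struct ibv_mr *mr,
				void *buf, uint32_t len, uint64_t wr_id)
{
	struct ibv_sge sge = {
		.addr = (uintptr_t)buf,
		.length = len,
		.lkey = mr->lkey,
	};
	struct ibv_recv_wr wr = {
		.wr_id = wr_id,
		.sg_list = &sge,
		.num_sge = 1,
	}, *bad_wr;

	/* Dispatches to the provider's hook, e.g. ocrdma_post_recv(). */
	return ibv_post_recv(qp, &wr, &bad_wr);
}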
qp->rqe_wr_id_tbl[qp->rq.head] = wr->wr_id; ocrdma_ring_rq_db(qp); /* update pointer, counter for next wr */ ocrdma_hwq_inc_head(&qp->rq); wr = wr->next; } pthread_spin_unlock(&qp->q_lock); return status; } static enum ibv_wc_status ocrdma_to_ibwc_err(uint16_t status) { enum ibv_wc_status ibwc_status = IBV_WC_GENERAL_ERR; switch (status) { case OCRDMA_CQE_GENERAL_ERR: ibwc_status = IBV_WC_GENERAL_ERR; break; case OCRDMA_CQE_LOC_LEN_ERR: ibwc_status = IBV_WC_LOC_LEN_ERR; break; case OCRDMA_CQE_LOC_QP_OP_ERR: ibwc_status = IBV_WC_LOC_QP_OP_ERR; break; case OCRDMA_CQE_LOC_EEC_OP_ERR: ibwc_status = IBV_WC_LOC_EEC_OP_ERR; break; case OCRDMA_CQE_LOC_PROT_ERR: ibwc_status = IBV_WC_LOC_PROT_ERR; break; case OCRDMA_CQE_WR_FLUSH_ERR: ibwc_status = IBV_WC_WR_FLUSH_ERR; break; case OCRDMA_CQE_BAD_RESP_ERR: ibwc_status = IBV_WC_BAD_RESP_ERR; break; case OCRDMA_CQE_LOC_ACCESS_ERR: ibwc_status = IBV_WC_LOC_ACCESS_ERR; break; case OCRDMA_CQE_REM_INV_REQ_ERR: ibwc_status = IBV_WC_REM_INV_REQ_ERR; break; case OCRDMA_CQE_REM_ACCESS_ERR: ibwc_status = IBV_WC_REM_ACCESS_ERR; break; case OCRDMA_CQE_REM_OP_ERR: ibwc_status = IBV_WC_REM_OP_ERR; break; case OCRDMA_CQE_RETRY_EXC_ERR: ibwc_status = IBV_WC_RETRY_EXC_ERR; break; case OCRDMA_CQE_RNR_RETRY_EXC_ERR: ibwc_status = IBV_WC_RNR_RETRY_EXC_ERR; break; case OCRDMA_CQE_LOC_RDD_VIOL_ERR: ibwc_status = IBV_WC_LOC_RDD_VIOL_ERR; break; case OCRDMA_CQE_REM_INV_RD_REQ_ERR: ibwc_status = IBV_WC_REM_INV_RD_REQ_ERR; break; case OCRDMA_CQE_REM_ABORT_ERR: ibwc_status = IBV_WC_REM_ABORT_ERR; break; case OCRDMA_CQE_INV_EECN_ERR: ibwc_status = IBV_WC_INV_EECN_ERR; break; case OCRDMA_CQE_INV_EEC_STATE_ERR: ibwc_status = IBV_WC_INV_EEC_STATE_ERR; break; case OCRDMA_CQE_FATAL_ERR: ibwc_status = IBV_WC_FATAL_ERR; break; case OCRDMA_CQE_RESP_TIMEOUT_ERR: ibwc_status = IBV_WC_RESP_TIMEOUT_ERR; break; default: ibwc_status = IBV_WC_GENERAL_ERR; break; }; return ibwc_status; } static void ocrdma_update_wc(struct ocrdma_qp *qp, struct ibv_wc *ibwc, uint32_t wqe_idx) { struct ocrdma_hdr_wqe_le *hdr; struct ocrdma_sge *rw; int opcode; hdr = ocrdma_hwq_head_from_idx(&qp->sq, wqe_idx); ibwc->wr_id = qp->wqe_wr_id_tbl[wqe_idx].wrid; /* Undo the hdr->cw swap */ opcode = le32toh(hdr->cw) & OCRDMA_WQE_OPCODE_MASK; switch (opcode) { case OCRDMA_WRITE: ibwc->opcode = IBV_WC_RDMA_WRITE; break; case OCRDMA_READ: rw = (struct ocrdma_sge *)(hdr + 1); ibwc->opcode = IBV_WC_RDMA_READ; ibwc->byte_len = rw->len; break; case OCRDMA_SEND: ibwc->opcode = IBV_WC_SEND; break; default: ibwc->status = IBV_WC_GENERAL_ERR; ocrdma_err("%s() invalid opcode received = 0x%x\n", __func__, le32toh(hdr->cw) & OCRDMA_WQE_OPCODE_MASK); break; }; } static void ocrdma_set_cqe_status_flushed(struct ocrdma_qp *qp, struct ocrdma_cqe *cqe) { if (is_cqe_for_sq(cqe)) { cqe->flags_status_srcqpn = htole32(le32toh(cqe->flags_status_srcqpn) & ~OCRDMA_CQE_STATUS_MASK); cqe->flags_status_srcqpn = htole32(le32toh(cqe->flags_status_srcqpn) | (OCRDMA_CQE_WR_FLUSH_ERR << OCRDMA_CQE_STATUS_SHIFT)); } else { if (qp->qp_type == IBV_QPT_UD) { cqe->flags_status_srcqpn = htole32(le32toh (cqe->flags_status_srcqpn) & ~OCRDMA_CQE_UD_STATUS_MASK); cqe->flags_status_srcqpn = htole32(le32toh (cqe->flags_status_srcqpn) | (OCRDMA_CQE_WR_FLUSH_ERR << OCRDMA_CQE_UD_STATUS_SHIFT)); } else { cqe->flags_status_srcqpn = htole32(le32toh (cqe->flags_status_srcqpn) & ~OCRDMA_CQE_STATUS_MASK); cqe->flags_status_srcqpn = htole32(le32toh (cqe->flags_status_srcqpn) | (OCRDMA_CQE_WR_FLUSH_ERR << OCRDMA_CQE_STATUS_SHIFT)); } } } static int 
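/*
 * ocrdma_to_ibwc_err() above is a 1:1 switch from hardware CQE status
 * codes to ibv_wc_status.  An equivalent table-driven form is sketched
 * here as an alternative, not the driver's code; it assumes the
 * OCRDMA_CQE_* codes are contiguous from 0, which is only an
 * illustration-level assumption:
 */
#include <infiniband/verbs.h>
#include <stdint.h>

static enum ibv_wc_status sketch_to_ibwc_err(uint16_t status)
{
	static const enum ibv_wc_status map[] = {
		IBV_WC_GENERAL_ERR,	/* assumed OCRDMA_CQE_GENERAL_ERR */
		IBV_WC_LOC_LEN_ERR,	/* assumed OCRDMA_CQE_LOC_LEN_ERR */
		IBV_WC_LOC_QP_OP_ERR,	/* assumed OCRDMA_CQE_LOC_QP_OP_ERR */
	};

	/* Unknown codes degrade to the generic error, as the switch does. */
	return status < sizeof(map) / sizeof(map[0]) ?
	       map[status] : IBV_WC_GENERAL_ERR;
}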
ocrdma_update_err_cqe(struct ibv_wc *ibwc, struct ocrdma_cqe *cqe, struct ocrdma_qp *qp, int status) { int expand = 0; ibwc->byte_len = 0; ibwc->qp_num = qp->id; ibwc->status = ocrdma_to_ibwc_err(status); ocrdma_flush_qp(qp); ocrdma_qp_state_machine(qp, IBV_QPS_ERR); /* if wqe/rqe pending for which cqe needs to be returned, * trigger inflating it. */ if (!is_hw_rq_empty(qp) || !is_hw_sq_empty(qp)) { expand = 1; ocrdma_set_cqe_status_flushed(qp, cqe); } return expand; } static int ocrdma_update_err_rcqe(struct ibv_wc *ibwc, struct ocrdma_cqe *cqe, struct ocrdma_qp *qp, int status) { ibwc->opcode = IBV_WC_RECV; ibwc->wr_id = qp->rqe_wr_id_tbl[qp->rq.tail]; ocrdma_hwq_inc_tail(&qp->rq); return ocrdma_update_err_cqe(ibwc, cqe, qp, status); } static int ocrdma_update_err_scqe(struct ibv_wc *ibwc, struct ocrdma_cqe *cqe, struct ocrdma_qp *qp, int status) { ocrdma_update_wc(qp, ibwc, qp->sq.tail); ocrdma_hwq_inc_tail(&qp->sq); return ocrdma_update_err_cqe(ibwc, cqe, qp, status); } static int ocrdma_poll_err_scqe(struct ocrdma_qp *qp, struct ocrdma_cqe *cqe, struct ibv_wc *ibwc, int *polled, int *stop) { int expand; int status = (le32toh(cqe->flags_status_srcqpn) & OCRDMA_CQE_STATUS_MASK) >> OCRDMA_CQE_STATUS_SHIFT; /* when hw sq is empty, but rq is not empty, so we continue * to keep the cqe in order to get the cq event again. */ if (is_hw_sq_empty(qp) && !is_hw_rq_empty(qp)) { /* when cq for rq and sq is same, it is safe to return * flush cqe for RQEs. */ if (!qp->srq && (qp->sq_cq == qp->rq_cq)) { *polled = 1; status = OCRDMA_CQE_WR_FLUSH_ERR; expand = ocrdma_update_err_rcqe(ibwc, cqe, qp, status); } else { *polled = 0; *stop = 1; expand = 0; } } else if (is_hw_sq_empty(qp)) { /* Do nothing */ expand = 0; *polled = 0; *stop = 0; } else { *polled = 1; expand = ocrdma_update_err_scqe(ibwc, cqe, qp, status); } return expand; } static int ocrdma_poll_success_scqe(struct ocrdma_qp *qp, struct ocrdma_cqe *cqe, struct ibv_wc *ibwc, int *polled) { int expand = 0; int tail = qp->sq.tail; uint32_t wqe_idx; if (!qp->wqe_wr_id_tbl[tail].signaled) { *polled = 0; /* WC cannot be consumed yet */ } else { ibwc->status = IBV_WC_SUCCESS; ibwc->wc_flags = 0; ibwc->qp_num = qp->id; ocrdma_update_wc(qp, ibwc, tail); *polled = 1; } wqe_idx = (le32toh(cqe->wq.wqeidx) & OCRDMA_CQE_WQEIDX_MASK) & qp->sq.max_wqe_idx; if (tail != wqe_idx) /* CQE cannot be consumed yet */ expand = 1; /* Coallesced CQE */ ocrdma_hwq_inc_tail(&qp->sq); return expand; } static int ocrdma_poll_scqe(struct ocrdma_qp *qp, struct ocrdma_cqe *cqe, struct ibv_wc *ibwc, int *polled, int *stop) { int status, expand; status = (le32toh(cqe->flags_status_srcqpn) & OCRDMA_CQE_STATUS_MASK) >> OCRDMA_CQE_STATUS_SHIFT; if (status == OCRDMA_CQE_SUCCESS) expand = ocrdma_poll_success_scqe(qp, cqe, ibwc, polled); else expand = ocrdma_poll_err_scqe(qp, cqe, ibwc, polled, stop); return expand; } static int ocrdma_update_ud_rcqe(struct ibv_wc *ibwc, struct ocrdma_cqe *cqe) { int status; status = (le32toh(cqe->flags_status_srcqpn) & OCRDMA_CQE_UD_STATUS_MASK) >> OCRDMA_CQE_UD_STATUS_SHIFT; ibwc->src_qp = le32toh(cqe->flags_status_srcqpn) & OCRDMA_CQE_SRCQP_MASK; ibwc->pkey_index = le32toh(cqe->ud.rxlen_pkey) & OCRDMA_CQE_PKEY_MASK; ibwc->wc_flags = IBV_WC_GRH; ibwc->byte_len = (le32toh(cqe->ud.rxlen_pkey) >> OCRDMA_CQE_UD_XFER_LEN_SHIFT); return status; } static void ocrdma_update_free_srq_cqe(struct ibv_wc *ibwc, struct ocrdma_cqe *cqe, struct ocrdma_qp *qp) { struct ocrdma_srq *srq = NULL; uint32_t wqe_idx; srq = get_ocrdma_srq(qp->ibv_qp.srq); #if 
!defined(SKH_A0_WORKAROUND) /* BUG 113416 */ wqe_idx = (le32toh(cqe->rq.buftag_qpn) >> OCRDMA_CQE_BUFTAG_SHIFT) & srq->rq.max_wqe_idx; #else wqe_idx = (le32toh(cqe->flags_status_srcqpn)) & 0xFFFF; #endif if (wqe_idx < 1) assert(0); ibwc->wr_id = srq->rqe_wr_id_tbl[wqe_idx]; pthread_spin_lock(&srq->q_lock); ocrdma_srq_toggle_bit(srq, wqe_idx - 1); pthread_spin_unlock(&srq->q_lock); ocrdma_hwq_inc_tail(&srq->rq); } static int ocrdma_poll_err_rcqe(struct ocrdma_qp *qp, struct ocrdma_cqe *cqe, struct ibv_wc *ibwc, int *polled, int *stop, int status) { int expand; /* when hw_rq is empty, but wq is not empty, so continue * to keep the cqe to get the cq event again. */ if (is_hw_rq_empty(qp) && !is_hw_sq_empty(qp)) { if (!qp->srq && (qp->sq_cq == qp->rq_cq)) { *polled = 1; status = OCRDMA_CQE_WR_FLUSH_ERR; expand = ocrdma_update_err_scqe(ibwc, cqe, qp, status); } else { *polled = 0; *stop = 1; expand = 0; } } else if (is_hw_rq_empty(qp)) { /* Do nothing */ expand = 0; *polled = 0; *stop = 0; } else { *polled = 1; expand = ocrdma_update_err_rcqe(ibwc, cqe, qp, status); } return expand; } static void ocrdma_poll_success_rcqe(struct ocrdma_qp *qp, struct ocrdma_cqe *cqe, struct ibv_wc *ibwc) { ibwc->opcode = IBV_WC_RECV; ibwc->qp_num = qp->id; ibwc->status = IBV_WC_SUCCESS; if (qp->qp_type == IBV_QPT_UD) ocrdma_update_ud_rcqe(ibwc, cqe); else ibwc->byte_len = le32toh(cqe->rq.rxlen); if (is_cqe_imm(cqe)) { ibwc->imm_data = htobe32(le32toh(cqe->rq.lkey_immdt)); ibwc->wc_flags |= IBV_WC_WITH_IMM; } else if (is_cqe_wr_imm(cqe)) { ibwc->opcode = IBV_WC_RECV_RDMA_WITH_IMM; ibwc->imm_data = htobe32(le32toh(cqe->rq.lkey_immdt)); ibwc->wc_flags |= IBV_WC_WITH_IMM; } if (qp->ibv_qp.srq) ocrdma_update_free_srq_cqe(ibwc, cqe, qp); else { ibwc->wr_id = qp->rqe_wr_id_tbl[qp->rq.tail]; ocrdma_hwq_inc_tail(&qp->rq); } } static int ocrdma_poll_rcqe(struct ocrdma_qp *qp, struct ocrdma_cqe *cqe, struct ibv_wc *ibwc, int *polled, int *stop) { int status; int expand = 0; ibwc->wc_flags = 0; if (qp->qp_type == IBV_QPT_UD) status = (le32toh(cqe->flags_status_srcqpn) & OCRDMA_CQE_UD_STATUS_MASK) >> OCRDMA_CQE_UD_STATUS_SHIFT; else status = (le32toh(cqe->flags_status_srcqpn) & OCRDMA_CQE_STATUS_MASK) >> OCRDMA_CQE_STATUS_SHIFT; if (status == OCRDMA_CQE_SUCCESS) { *polled = 1; ocrdma_poll_success_rcqe(qp, cqe, ibwc); } else { expand = ocrdma_poll_err_rcqe(qp, cqe, ibwc, polled, stop, status); } return expand; } static void ocrdma_change_cq_phase(struct ocrdma_cq *cq, struct ocrdma_cqe *cqe, uint16_t cur_getp) { if (cq->phase_change) { if (cur_getp == 0) cq->phase = (~cq->phase & OCRDMA_CQE_VALID); } else cqe->flags_status_srcqpn = 0; /* clear valid bit */ } static int ocrdma_poll_hwcq(struct ocrdma_cq *cq, int num_entries, struct ibv_wc *ibwc) { uint16_t qpn = 0; int i = 0; int expand = 0; int polled_hw_cqes = 0; struct ocrdma_qp *qp = NULL; struct ocrdma_device *dev = cq->dev; struct ocrdma_cqe *cqe; uint16_t cur_getp; int polled = 0; int stop = 0; cur_getp = cq->getp; while (num_entries) { cqe = cq->va + cur_getp; /* check whether valid cqe or not */ if (!is_cqe_valid(cq, cqe)) break; qpn = (le32toh(cqe->cmn.qpn) & OCRDMA_CQE_QPN_MASK); /* ignore discarded cqe */ if (qpn == 0) goto skip_cqe; qp = dev->qp_tbl[qpn]; if (qp == NULL) { ocrdma_err("%s() cqe for invalid qpn= 0x%x received.\n", __func__, qpn); goto skip_cqe; } if (is_cqe_for_sq(cqe)) { expand = ocrdma_poll_scqe(qp, cqe, ibwc, &polled, &stop); } else { expand = ocrdma_poll_rcqe(qp, cqe, ibwc, &polled, &stop); } if (expand) goto expand_cqe; if (stop) goto stop_cqe; 
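/*
 * The polling loop here is a phase-bit consumer: in phase_change mode
 * the hardware flips the CQE valid bit on every lap around the ring, so
 * software compares it against its expected phase (cq->phase) instead
 * of zeroing consumed entries.  Stand-alone sketch of the idiom (toy
 * CQE layout; the real valid bit lives in flags_status_srcqpn):
 */
#include <stdint.h>

struct sketch_cqe { uint32_t flags; };	/* bit 31: valid/phase */

#define SKETCH_CQE_VALID (1u << 31)

static int sketch_poll_one(struct sketch_cqe *ring, uint32_t nent,
			   uint32_t *getp, uint32_t *phase)
{
	struct sketch_cqe *cqe = &ring[*getp];

	if ((cqe->flags & SKETCH_CQE_VALID) != *phase)
		return 0;			/* nothing new to consume */

	*getp = (*getp + 1) % nent;
	if (*getp == 0)				/* wrapped: expect flipped bit */
		*phase ^= SKETCH_CQE_VALID;
	return 1;				/* one CQE consumed */
}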
/* clear qpn to avoid duplicate processing by discard_cqe() */ cqe->cmn.qpn = 0; skip_cqe: polled_hw_cqes += 1; cur_getp = (cur_getp + 1) % cq->max_hw_cqe; ocrdma_change_cq_phase(cq, cqe, cur_getp); expand_cqe: if (polled) { num_entries -= 1; i += 1; ibwc = ibwc + 1; polled = 0; } } stop_cqe: cq->getp = cur_getp; if (cq->deferred_arm || polled_hw_cqes) { ocrdma_ring_cq_db(cq, cq->deferred_arm, cq->deferred_sol, polled_hw_cqes); cq->deferred_arm = 0; cq->deferred_sol = 0; } return i; } static int ocrdma_add_err_cqe(struct ocrdma_cq *cq, int num_entries, struct ocrdma_qp *qp, struct ibv_wc *ibwc) { int err_cqes = 0; while (num_entries) { if (is_hw_sq_empty(qp) && is_hw_rq_empty(qp)) break; if (!is_hw_sq_empty(qp) && qp->sq_cq == cq) { ocrdma_update_wc(qp, ibwc, qp->sq.tail); ocrdma_hwq_inc_tail(&qp->sq); } else if (!is_hw_rq_empty(qp) && qp->rq_cq == cq) { ibwc->wr_id = qp->rqe_wr_id_tbl[qp->rq.tail]; ocrdma_hwq_inc_tail(&qp->rq); } else return err_cqes; ibwc->byte_len = 0; ibwc->status = IBV_WC_WR_FLUSH_ERR; ibwc = ibwc + 1; err_cqes += 1; num_entries -= 1; } return err_cqes; } /* * ocrdma_poll_cq */ int ocrdma_poll_cq(struct ibv_cq *ibcq, int num_entries, struct ibv_wc *wc) { struct ocrdma_cq *cq; int cqes_to_poll = num_entries; int num_os_cqe = 0, err_cqes = 0; struct ocrdma_qp *qp; struct ocrdma_qp *qp_tmp; cq = get_ocrdma_cq(ibcq); pthread_spin_lock(&cq->cq_lock); num_os_cqe = ocrdma_poll_hwcq(cq, num_entries, wc); pthread_spin_unlock(&cq->cq_lock); cqes_to_poll -= num_os_cqe; if (cqes_to_poll) { wc = wc + num_os_cqe; pthread_spin_lock(&cq->dev->flush_q_lock); list_for_each_safe(&cq->sq_head, qp, qp_tmp, sq_entry) { if (cqes_to_poll == 0) break; err_cqes = ocrdma_add_err_cqe(cq, cqes_to_poll, qp, wc); cqes_to_poll -= err_cqes; num_os_cqe += err_cqes; wc = wc + err_cqes; } pthread_spin_unlock(&cq->dev->flush_q_lock); } return num_os_cqe; } /* * ocrdma_arm_cq */ int ocrdma_arm_cq(struct ibv_cq *ibcq, int solicited) { struct ocrdma_cq *cq; cq = get_ocrdma_cq(ibcq); pthread_spin_lock(&cq->cq_lock); if (cq->first_arm) { ocrdma_ring_cq_db(cq, 1, solicited, 0); cq->first_arm = 0; } cq->deferred_arm = 1; cq->deferred_sol = solicited; pthread_spin_unlock(&cq->cq_lock); return 0; } /* * ocrdma_post_srq_recv */ int ocrdma_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { int status = 0; uint16_t tag; struct ocrdma_srq *srq; struct ocrdma_hdr_wqe *rqe; srq = get_ocrdma_srq(ibsrq); pthread_spin_lock(&srq->q_lock); while (wr) { if (ocrdma_hwq_free_cnt(&srq->rq) == 0 || wr->num_sge > srq->rq.max_sges) { status = ENOMEM; *bad_wr = wr; break; } rqe = ocrdma_hwq_head(&srq->rq); tag = ocrdma_srq_get_idx(srq); ocrdma_build_rqe(rqe, wr, tag); srq->rqe_wr_id_tbl[tag] = wr->wr_id; ocrdma_ring_srq_db(srq); /* update pointer, counter for next wr */ ocrdma_hwq_inc_head(&srq->rq); wr = wr->next; } pthread_spin_unlock(&srq->q_lock); return status; } /* * ocrdma_create_ah */ struct ibv_ah *ocrdma_create_ah(struct ibv_pd *ibpd, struct ibv_ah_attr *attr) { int status; int ahtbl_idx; struct ocrdma_pd *pd; struct ocrdma_ah *ah; struct ib_uverbs_create_ah_resp resp; pd = get_ocrdma_pd(ibpd); ah = malloc(sizeof *ah); if (!ah) return NULL; bzero(ah, sizeof *ah); ah->pd = pd; ahtbl_idx = ocrdma_alloc_ah_tbl_id(pd->uctx); if (ahtbl_idx < 0) goto tbl_err; attr->dlid = ahtbl_idx; memset(&resp, 0, sizeof(resp)); status = ibv_cmd_create_ah(ibpd, &ah->ibv_ah, attr, &resp, sizeof(resp)); if (status) goto cmd_err; ah->id = pd->uctx->ah_tbl[ahtbl_idx] & OCRDMA_AH_ID_MASK; ah->isvlan = 
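/*
 * ocrdma_post_srq_recv() above tags each RQE with a slot index drawn
 * from a free bitmap (ocrdma_srq_get_idx/ocrdma_srq_toggle_bit), so the
 * completion path can return the right wr_id even when SRQ entries
 * complete out of order.  Self-contained sketch of that ffs()-based
 * allocator (toy bitmap; as in the driver, a set bit means "free"):
 */
#include <stdint.h>
#include <strings.h>

static int sketch_bitmap_get(uint32_t *bits, int nwords)
{
	int w, b;

	for (w = 0; w < nwords; w++) {
		if (!bits[w])
			continue;		/* word fully allocated */
		b = ffs(bits[w]) - 1;		/* lowest set (free) bit */
		bits[w] &= ~(1u << b);		/* mark slot in use */
		return w * 32 + b;
	}
	return -1;				/* no free slot */
}

static void sketch_bitmap_put(uint32_t *bits, int idx)
{
	bits[idx / 32] |= 1u << (idx % 32);	/* mark slot free again */
}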
(pd->uctx->ah_tbl[ahtbl_idx] >> OCRDMA_AH_VLAN_VALID_SHIFT); ah->hdr_type = ((pd->uctx->ah_tbl[ahtbl_idx] >> OCRDMA_AH_L3_TYPE_SHIFT) & OCRDMA_AH_L3_TYPE_MASK); return &ah->ibv_ah; cmd_err: ocrdma_free_ah_tbl_id(pd->uctx, ahtbl_idx); tbl_err: free(ah); return NULL; } /* * ocrdma_destroy_ah */ int ocrdma_destroy_ah(struct ibv_ah *ibah) { int status; struct ocrdma_ah *ah; ah = get_ocrdma_ah(ibah); status = ibv_cmd_destroy_ah(ibah); ocrdma_free_ah_tbl_id(ah->pd->uctx, ah->id); free(ah); return status; } /* * ocrdma_attach_mcast */ int ocrdma_attach_mcast(struct ibv_qp *ibqp, const union ibv_gid *gid, uint16_t lid) { return ibv_cmd_attach_mcast(ibqp, gid, lid); } /* * ocrdma_detach_mcast */ int ocrdma_detach_mcast(struct ibv_qp *ibqp, const union ibv_gid *gid, uint16_t lid) { return ibv_cmd_detach_mcast(ibqp, gid, lid); } rdma-core-56.1/providers/qedr/000077500000000000000000000000001477342711600163025ustar00rootroot00000000000000rdma-core-56.1/providers/qedr/CMakeLists.txt000066400000000000000000000001031477342711600210340ustar00rootroot00000000000000rdma_provider(qedr qelr_main.c qelr_verbs.c qelr_chain.c ) rdma-core-56.1/providers/qedr/common_hsi.h000066400000000000000000001606121477342711600206140ustar00rootroot00000000000000/* * Copyright (c) 2015-2016 QLogic Corporation * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and /or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __COMMON_HSI__ #define __COMMON_HSI__ #include #include /********************************/ /* PROTOCOL COMMON FW CONSTANTS */ /********************************/ /* Temporarily here should be added to HSI automatically by resource allocation tool.*/ #define T_TEST_AGG_INT_TEMP 6 #define M_TEST_AGG_INT_TEMP 8 #define U_TEST_AGG_INT_TEMP 6 #define X_TEST_AGG_INT_TEMP 14 #define Y_TEST_AGG_INT_TEMP 4 #define P_TEST_AGG_INT_TEMP 4 #define X_FINAL_CLEANUP_AGG_INT 1 #define EVENT_RING_PAGE_SIZE_BYTES 4096 #define NUM_OF_GLOBAL_QUEUES 128 #define COMMON_QUEUE_ENTRY_MAX_BYTE_SIZE 64 #define ISCSI_CDU_TASK_SEG_TYPE 0 #define FCOE_CDU_TASK_SEG_TYPE 0 #define RDMA_CDU_TASK_SEG_TYPE 1 #define FW_ASSERT_GENERAL_ATTN_IDX 32 #define MAX_PINNED_CCFC 32 #define EAGLE_ENG1_WORKAROUND_NIG_FLOWCTRL_MODE 3 /* Queue Zone sizes in bytes */ #define TSTORM_QZONE_SIZE 8 /*tstorm_scsi_queue_zone*/ #define MSTORM_QZONE_SIZE 16 /*mstorm_eth_queue_zone. 
Used only for RX producer of VFs in backward compatibility mode.*/ #define USTORM_QZONE_SIZE 8 /*ustorm_eth_queue_zone*/ #define XSTORM_QZONE_SIZE 8 /*xstorm_eth_queue_zone*/ #define YSTORM_QZONE_SIZE 0 #define PSTORM_QZONE_SIZE 0 #define MSTORM_VF_ZONE_DEFAULT_SIZE_LOG 7 /*Log of mstorm default VF zone size.*/ #define ETH_MAX_NUM_RX_QUEUES_PER_VF_DEFAULT 16 /*Maximum number of RX queues that can be allocated to VF by default*/ #define ETH_MAX_NUM_RX_QUEUES_PER_VF_DOUBLE 48 /*Maximum number of RX queues that can be allocated to VF with doubled VF zone size. Up to 96 VF supported in this mode*/ #define ETH_MAX_NUM_RX_QUEUES_PER_VF_QUAD 112 /*Maximum number of RX queues that can be allocated to VF with 4 VF zone size. Up to 48 VF supported in this mode*/ /********************************/ /* CORE (LIGHT L2) FW CONSTANTS */ /********************************/ #define CORE_LL2_MAX_RAMROD_PER_CON 8 #define CORE_LL2_TX_BD_PAGE_SIZE_BYTES 4096 #define CORE_LL2_RX_BD_PAGE_SIZE_BYTES 4096 #define CORE_LL2_RX_CQE_PAGE_SIZE_BYTES 4096 #define CORE_LL2_RX_NUM_NEXT_PAGE_BDS 1 #define CORE_LL2_TX_MAX_BDS_PER_PACKET 12 #define CORE_SPQE_PAGE_SIZE_BYTES 4096 #define MAX_NUM_LL2_RX_QUEUES 32 #define MAX_NUM_LL2_TX_STATS_COUNTERS 32 /////////////////////////////////////////////////////////////////////////////////////////////////// // Include firmware verison number only- do not add constants here to avoid redundunt compilations /////////////////////////////////////////////////////////////////////////////////////////////////// #define FW_MAJOR_VERSION 8 #define FW_MINOR_VERSION 10 #define FW_REVISION_VERSION 9 #define FW_ENGINEERING_VERSION 0 /***********************/ /* COMMON HW CONSTANTS */ /***********************/ /* PCI functions */ #define MAX_NUM_PORTS_K2 (4) #define MAX_NUM_PORTS_BB (2) #define MAX_NUM_PORTS (MAX_NUM_PORTS_K2) #define MAX_NUM_PFS_K2 (16) #define MAX_NUM_PFS_BB (8) #define MAX_NUM_PFS (MAX_NUM_PFS_K2) #define MAX_NUM_OF_PFS_IN_CHIP (16) /* On both engines */ #define MAX_NUM_VFS_K2 (192) #define MAX_NUM_VFS_BB (120) #define MAX_NUM_VFS (MAX_NUM_VFS_K2) #define MAX_NUM_FUNCTIONS_BB (MAX_NUM_PFS_BB + MAX_NUM_VFS_BB) #define MAX_NUM_FUNCTIONS_K2 (MAX_NUM_PFS_K2 + MAX_NUM_VFS_K2) #define MAX_NUM_FUNCTIONS (MAX_NUM_PFS + MAX_NUM_VFS) /* in both BB and K2, the VF number starts from 16. so for arrays containing all */ /* possible PFs and VFs - we need a constant for this size */ #define MAX_FUNCTION_NUMBER_BB (MAX_NUM_PFS + MAX_NUM_VFS_BB) #define MAX_FUNCTION_NUMBER_K2 (MAX_NUM_PFS + MAX_NUM_VFS_K2) #define MAX_FUNCTION_NUMBER (MAX_NUM_PFS + MAX_NUM_VFS) #define MAX_NUM_VPORTS_K2 (208) #define MAX_NUM_VPORTS_BB (160) #define MAX_NUM_VPORTS (MAX_NUM_VPORTS_K2) #define MAX_NUM_L2_QUEUES_K2 (320) #define MAX_NUM_L2_QUEUES_BB (256) #define MAX_NUM_L2_QUEUES (MAX_NUM_L2_QUEUES_K2) /* Traffic classes in network-facing blocks (PBF, BTB, NIG, BRB, PRS and QM) */ // 4-Port K2. 
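/*
 * The VOQ bounds defined in the following block are pure products of
 * the traffic-class and port counts: each port gets one VOQ per TC
 * (physical TCs plus the one loopback TC).  Hedged arithmetic check
 * mirroring those formulas, with the values restated locally so the
 * sketch stands alone:
 */
enum {
	SKETCH_PHYS_TCS_4PORT_K2 = 4,	/* mirrors NUM_PHYS_TCS_4PORT_K2 */
	SKETCH_PORTS_K2 = 4,		/* mirrors MAX_NUM_PORTS_K2 */
	SKETCH_PHYS_TCS = 8,		/* mirrors NUM_OF_PHYS_TCS */
	SKETCH_PORTS_BB = 2,		/* mirrors MAX_NUM_PORTS_BB */
};

_Static_assert((SKETCH_PHYS_TCS_4PORT_K2 + 1) * SKETCH_PORTS_K2 == 20,
	       "K2: one VOQ per (TC incl. LB) per port");
_Static_assert((SKETCH_PHYS_TCS + 1) * SKETCH_PORTS_BB == 18,
	       "BB: one VOQ per (TC incl. LB) per port");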
#define NUM_PHYS_TCS_4PORT_K2 (4) #define NUM_OF_PHYS_TCS (8) #define NUM_TCS_4PORT_K2 (NUM_PHYS_TCS_4PORT_K2 + 1) #define NUM_OF_TCS (NUM_OF_PHYS_TCS + 1) #define LB_TC (NUM_OF_PHYS_TCS) /* Num of possible traffic priority values */ #define NUM_OF_PRIO (8) #define MAX_NUM_VOQS_K2 (NUM_TCS_4PORT_K2 * MAX_NUM_PORTS_K2) #define MAX_NUM_VOQS_BB (NUM_OF_TCS * MAX_NUM_PORTS_BB) #define MAX_NUM_VOQS (MAX_NUM_VOQS_K2) #define MAX_PHYS_VOQS (NUM_OF_PHYS_TCS * MAX_NUM_PORTS_BB) /* CIDs */ #define NUM_OF_CONNECTION_TYPES (8) #define NUM_OF_LCIDS (320) #define NUM_OF_LTIDS (320) /* Clock values */ #define MASTER_CLK_FREQ_E4 (375e6) #define STORM_CLK_FREQ_E4 (1000e6) #define CLK25M_CLK_FREQ_E4 (25e6) /* Global PXP windows (GTT) */ #define NUM_OF_GTT 19 #define GTT_DWORD_SIZE_BITS 10 #define GTT_BYTE_SIZE_BITS (GTT_DWORD_SIZE_BITS + 2) #define GTT_DWORD_SIZE (1 << GTT_DWORD_SIZE_BITS) /* Tools Version */ #define TOOLS_VERSION 10 /*****************/ /* CDU CONSTANTS */ /*****************/ #define CDU_SEG_TYPE_OFFSET_REG_TYPE_SHIFT (17) #define CDU_SEG_TYPE_OFFSET_REG_OFFSET_MASK (0x1ffff) #define CDU_VF_FL_SEG_TYPE_OFFSET_REG_TYPE_SHIFT (12) #define CDU_VF_FL_SEG_TYPE_OFFSET_REG_OFFSET_MASK (0xfff) /*****************/ /* DQ CONSTANTS */ /*****************/ /* DEMS */ #define DQ_DEMS_LEGACY 0 #define DQ_DEMS_TOE_MORE_TO_SEND 3 #define DQ_DEMS_TOE_LOCAL_ADV_WND 4 #define DQ_DEMS_ROCE_CQ_CONS 7 /* XCM agg val selection (HW) */ #define DQ_XCM_AGG_VAL_SEL_WORD2 0 #define DQ_XCM_AGG_VAL_SEL_WORD3 1 #define DQ_XCM_AGG_VAL_SEL_WORD4 2 #define DQ_XCM_AGG_VAL_SEL_WORD5 3 #define DQ_XCM_AGG_VAL_SEL_REG3 4 #define DQ_XCM_AGG_VAL_SEL_REG4 5 #define DQ_XCM_AGG_VAL_SEL_REG5 6 #define DQ_XCM_AGG_VAL_SEL_REG6 7 /* XCM agg val selection (FW) */ #define DQ_XCM_CORE_TX_BD_CONS_CMD DQ_XCM_AGG_VAL_SEL_WORD3 #define DQ_XCM_CORE_TX_BD_PROD_CMD DQ_XCM_AGG_VAL_SEL_WORD4 #define DQ_XCM_CORE_SPQ_PROD_CMD DQ_XCM_AGG_VAL_SEL_WORD4 #define DQ_XCM_ETH_EDPM_NUM_BDS_CMD DQ_XCM_AGG_VAL_SEL_WORD2 #define DQ_XCM_ETH_TX_BD_CONS_CMD DQ_XCM_AGG_VAL_SEL_WORD3 #define DQ_XCM_ETH_TX_BD_PROD_CMD DQ_XCM_AGG_VAL_SEL_WORD4 #define DQ_XCM_ETH_GO_TO_BD_CONS_CMD DQ_XCM_AGG_VAL_SEL_WORD5 #define DQ_XCM_FCOE_SQ_CONS_CMD DQ_XCM_AGG_VAL_SEL_WORD3 #define DQ_XCM_FCOE_SQ_PROD_CMD DQ_XCM_AGG_VAL_SEL_WORD4 #define DQ_XCM_FCOE_X_FERQ_PROD_CMD DQ_XCM_AGG_VAL_SEL_WORD5 #define DQ_XCM_ISCSI_SQ_CONS_CMD DQ_XCM_AGG_VAL_SEL_WORD3 #define DQ_XCM_ISCSI_SQ_PROD_CMD DQ_XCM_AGG_VAL_SEL_WORD4 #define DQ_XCM_ISCSI_MORE_TO_SEND_SEQ_CMD DQ_XCM_AGG_VAL_SEL_REG3 #define DQ_XCM_ISCSI_EXP_STAT_SN_CMD DQ_XCM_AGG_VAL_SEL_REG6 #define DQ_XCM_ROCE_SQ_PROD_CMD DQ_XCM_AGG_VAL_SEL_WORD4 #define DQ_XCM_TOE_TX_BD_PROD_CMD DQ_XCM_AGG_VAL_SEL_WORD4 #define DQ_XCM_TOE_MORE_TO_SEND_SEQ_CMD DQ_XCM_AGG_VAL_SEL_REG3 #define DQ_XCM_TOE_LOCAL_ADV_WND_SEQ_CMD DQ_XCM_AGG_VAL_SEL_REG4 /* UCM agg val selection (HW) */ #define DQ_UCM_AGG_VAL_SEL_WORD0 0 #define DQ_UCM_AGG_VAL_SEL_WORD1 1 #define DQ_UCM_AGG_VAL_SEL_WORD2 2 #define DQ_UCM_AGG_VAL_SEL_WORD3 3 #define DQ_UCM_AGG_VAL_SEL_REG0 4 #define DQ_UCM_AGG_VAL_SEL_REG1 5 #define DQ_UCM_AGG_VAL_SEL_REG2 6 #define DQ_UCM_AGG_VAL_SEL_REG3 7 /* UCM agg val selection (FW) */ #define DQ_UCM_ETH_PMD_TX_CONS_CMD DQ_UCM_AGG_VAL_SEL_WORD2 #define DQ_UCM_ETH_PMD_RX_CONS_CMD DQ_UCM_AGG_VAL_SEL_WORD3 #define DQ_UCM_ROCE_CQ_CONS_CMD DQ_UCM_AGG_VAL_SEL_REG0 #define DQ_UCM_ROCE_CQ_PROD_CMD DQ_UCM_AGG_VAL_SEL_REG2 /* TCM agg val selection (HW) */ #define DQ_TCM_AGG_VAL_SEL_WORD0 0 #define DQ_TCM_AGG_VAL_SEL_WORD1 1 #define DQ_TCM_AGG_VAL_SEL_WORD2 2 #define 
DQ_TCM_AGG_VAL_SEL_WORD3 3 #define DQ_TCM_AGG_VAL_SEL_REG1 4 #define DQ_TCM_AGG_VAL_SEL_REG2 5 #define DQ_TCM_AGG_VAL_SEL_REG6 6 #define DQ_TCM_AGG_VAL_SEL_REG9 7 /* TCM agg val selection (FW) */ #define DQ_TCM_L2B_BD_PROD_CMD DQ_TCM_AGG_VAL_SEL_WORD1 #define DQ_TCM_ROCE_RQ_PROD_CMD DQ_TCM_AGG_VAL_SEL_WORD0 /* XCM agg counter flag selection (HW) */ #define DQ_XCM_AGG_FLG_SHIFT_BIT14 0 #define DQ_XCM_AGG_FLG_SHIFT_BIT15 1 #define DQ_XCM_AGG_FLG_SHIFT_CF12 2 #define DQ_XCM_AGG_FLG_SHIFT_CF13 3 #define DQ_XCM_AGG_FLG_SHIFT_CF18 4 #define DQ_XCM_AGG_FLG_SHIFT_CF19 5 #define DQ_XCM_AGG_FLG_SHIFT_CF22 6 #define DQ_XCM_AGG_FLG_SHIFT_CF23 7 /* XCM agg counter flag selection (FW) */ #define DQ_XCM_CORE_DQ_CF_CMD (1 << DQ_XCM_AGG_FLG_SHIFT_CF18) #define DQ_XCM_CORE_TERMINATE_CMD (1 << DQ_XCM_AGG_FLG_SHIFT_CF19) #define DQ_XCM_CORE_SLOW_PATH_CMD (1 << DQ_XCM_AGG_FLG_SHIFT_CF22) #define DQ_XCM_ETH_DQ_CF_CMD (1 << DQ_XCM_AGG_FLG_SHIFT_CF18) #define DQ_XCM_ETH_TERMINATE_CMD (1 << DQ_XCM_AGG_FLG_SHIFT_CF19) #define DQ_XCM_ETH_SLOW_PATH_CMD (1 << DQ_XCM_AGG_FLG_SHIFT_CF22) #define DQ_XCM_ETH_TPH_EN_CMD (1 << DQ_XCM_AGG_FLG_SHIFT_CF23) #define DQ_XCM_FCOE_SLOW_PATH_CMD (1 << DQ_XCM_AGG_FLG_SHIFT_CF22) #define DQ_XCM_ISCSI_DQ_FLUSH_CMD (1 << DQ_XCM_AGG_FLG_SHIFT_CF19) #define DQ_XCM_ISCSI_SLOW_PATH_CMD (1 << DQ_XCM_AGG_FLG_SHIFT_CF22) #define DQ_XCM_ISCSI_PROC_ONLY_CLEANUP_CMD (1 << DQ_XCM_AGG_FLG_SHIFT_CF23) #define DQ_XCM_TOE_DQ_FLUSH_CMD (1 << DQ_XCM_AGG_FLG_SHIFT_CF19) #define DQ_XCM_TOE_SLOW_PATH_CMD (1 << DQ_XCM_AGG_FLG_SHIFT_CF22) /* UCM agg counter flag selection (HW) */ #define DQ_UCM_AGG_FLG_SHIFT_CF0 0 #define DQ_UCM_AGG_FLG_SHIFT_CF1 1 #define DQ_UCM_AGG_FLG_SHIFT_CF3 2 #define DQ_UCM_AGG_FLG_SHIFT_CF4 3 #define DQ_UCM_AGG_FLG_SHIFT_CF5 4 #define DQ_UCM_AGG_FLG_SHIFT_CF6 5 #define DQ_UCM_AGG_FLG_SHIFT_RULE0EN 6 #define DQ_UCM_AGG_FLG_SHIFT_RULE1EN 7 /* UCM agg counter flag selection (FW) */ #define DQ_UCM_ETH_PMD_TX_ARM_CMD (1 << DQ_UCM_AGG_FLG_SHIFT_CF4) #define DQ_UCM_ETH_PMD_RX_ARM_CMD (1 << DQ_UCM_AGG_FLG_SHIFT_CF5) #define DQ_UCM_ROCE_CQ_ARM_SE_CF_CMD (1 << DQ_UCM_AGG_FLG_SHIFT_CF4) #define DQ_UCM_ROCE_CQ_ARM_CF_CMD (1 << DQ_UCM_AGG_FLG_SHIFT_CF5) #define DQ_UCM_TOE_TIMER_STOP_ALL_CMD (1 << DQ_UCM_AGG_FLG_SHIFT_CF3) #define DQ_UCM_TOE_SLOW_PATH_CF_CMD (1 << DQ_UCM_AGG_FLG_SHIFT_CF4) #define DQ_UCM_TOE_DQ_CF_CMD (1 << DQ_UCM_AGG_FLG_SHIFT_CF5) /* TCM agg counter flag selection (HW) */ #define DQ_TCM_AGG_FLG_SHIFT_CF0 0 #define DQ_TCM_AGG_FLG_SHIFT_CF1 1 #define DQ_TCM_AGG_FLG_SHIFT_CF2 2 #define DQ_TCM_AGG_FLG_SHIFT_CF3 3 #define DQ_TCM_AGG_FLG_SHIFT_CF4 4 #define DQ_TCM_AGG_FLG_SHIFT_CF5 5 #define DQ_TCM_AGG_FLG_SHIFT_CF6 6 #define DQ_TCM_AGG_FLG_SHIFT_CF7 7 /* TCM agg counter flag selection (FW) */ #define DQ_TCM_FCOE_FLUSH_Q0_CMD (1 << DQ_TCM_AGG_FLG_SHIFT_CF1) #define DQ_TCM_FCOE_DUMMY_TIMER_CMD (1 << DQ_TCM_AGG_FLG_SHIFT_CF2) #define DQ_TCM_FCOE_TIMER_STOP_ALL_CMD (1 << DQ_TCM_AGG_FLG_SHIFT_CF3) #define DQ_TCM_ISCSI_FLUSH_Q0_CMD (1 << DQ_TCM_AGG_FLG_SHIFT_CF1) #define DQ_TCM_ISCSI_TIMER_STOP_ALL_CMD (1 << DQ_TCM_AGG_FLG_SHIFT_CF3) #define DQ_TCM_TOE_FLUSH_Q0_CMD (1 << DQ_TCM_AGG_FLG_SHIFT_CF1) #define DQ_TCM_TOE_TIMER_STOP_ALL_CMD (1 << DQ_TCM_AGG_FLG_SHIFT_CF3) #define DQ_TCM_IWARP_POST_RQ_CF_CMD (1 << DQ_TCM_AGG_FLG_SHIFT_CF1) /* PWM address mapping */ #define DQ_PWM_OFFSET_DPM_BASE 0x0 #define DQ_PWM_OFFSET_DPM_END 0x27 #define DQ_PWM_OFFSET_XCM16_BASE 0x40 #define DQ_PWM_OFFSET_XCM32_BASE 0x44 #define DQ_PWM_OFFSET_UCM16_BASE 0x48 #define DQ_PWM_OFFSET_UCM32_BASE 0x4C #define 
DQ_PWM_OFFSET_UCM16_4 0x50 #define DQ_PWM_OFFSET_TCM16_BASE 0x58 #define DQ_PWM_OFFSET_TCM32_BASE 0x5C #define DQ_PWM_OFFSET_XCM_FLAGS 0x68 #define DQ_PWM_OFFSET_UCM_FLAGS 0x69 #define DQ_PWM_OFFSET_TCM_FLAGS 0x6B #define DQ_PWM_OFFSET_XCM_RDMA_SQ_PROD (DQ_PWM_OFFSET_XCM16_BASE + 2) #define DQ_PWM_OFFSET_UCM_RDMA_CQ_CONS_32BIT (DQ_PWM_OFFSET_UCM32_BASE) #define DQ_PWM_OFFSET_UCM_RDMA_CQ_CONS_16BIT (DQ_PWM_OFFSET_UCM16_4) #define DQ_PWM_OFFSET_UCM_RDMA_INT_TIMEOUT (DQ_PWM_OFFSET_UCM16_BASE + 2) #define DQ_PWM_OFFSET_UCM_RDMA_ARM_FLAGS (DQ_PWM_OFFSET_UCM_FLAGS) #define DQ_PWM_OFFSET_TCM_ROCE_RQ_PROD (DQ_PWM_OFFSET_TCM16_BASE + 1) #define DQ_PWM_OFFSET_TCM_IWARP_RQ_PROD (DQ_PWM_OFFSET_TCM16_BASE + 3) #define DQ_REGION_SHIFT (12) /* DPM */ #define DQ_DPM_WQE_BUFF_SIZE (320) // Conn type ranges #define DQ_CONN_TYPE_RANGE_SHIFT (4) /*****************/ /* QM CONSTANTS */ /*****************/ /* number of TX queues in the QM */ #define MAX_QM_TX_QUEUES_K2 512 #define MAX_QM_TX_QUEUES_BB 448 #define MAX_QM_TX_QUEUES MAX_QM_TX_QUEUES_K2 /* number of Other queues in the QM */ #define MAX_QM_OTHER_QUEUES_BB 64 #define MAX_QM_OTHER_QUEUES_K2 128 #define MAX_QM_OTHER_QUEUES MAX_QM_OTHER_QUEUES_K2 /* number of queues in a PF queue group */ #define QM_PF_QUEUE_GROUP_SIZE 8 /* the size of a single queue element in bytes */ #define QM_PQ_ELEMENT_SIZE 4 /* base number of Tx PQs in the CM PQ representation. should be used when storing PQ IDs in CM PQ registers and context */ #define CM_TX_PQ_BASE 0x200 /* number of global Vport/QCN rate limiters */ #define MAX_QM_GLOBAL_RLS 256 /* QM registers data */ #define QM_LINE_CRD_REG_WIDTH 16 #define QM_LINE_CRD_REG_SIGN_BIT (1 << (QM_LINE_CRD_REG_WIDTH - 1)) #define QM_BYTE_CRD_REG_WIDTH 24 #define QM_BYTE_CRD_REG_SIGN_BIT (1 << (QM_BYTE_CRD_REG_WIDTH - 1)) #define QM_WFQ_CRD_REG_WIDTH 32 #define QM_WFQ_CRD_REG_SIGN_BIT (1 << (QM_WFQ_CRD_REG_WIDTH - 1)) #define QM_RL_CRD_REG_WIDTH 32 #define QM_RL_CRD_REG_SIGN_BIT (1 << (QM_RL_CRD_REG_WIDTH - 1)) /*****************/ /* CAU CONSTANTS */ /*****************/ #define CAU_FSM_ETH_RX 0 #define CAU_FSM_ETH_TX 1 /* Number of Protocol Indices per Status Block */ #define PIS_PER_SB 12 #define CAU_HC_STOPPED_STATE 3 /* fsm is stopped or not valid for this sb */ #define CAU_HC_DISABLE_STATE 4 /* fsm is working without interrupt coalescing for this sb*/ #define CAU_HC_ENABLE_STATE 0 /* fsm is working with interrupt coalescing for this sb*/ /*****************/ /* IGU CONSTANTS */ /*****************/ #define MAX_SB_PER_PATH_K2 (368) #define MAX_SB_PER_PATH_BB (288) #define MAX_TOT_SB_PER_PATH MAX_SB_PER_PATH_K2 #define MAX_SB_PER_PF_MIMD 129 #define MAX_SB_PER_PF_SIMD 64 #define MAX_SB_PER_VF 64 /* Memory addresses on the BAR for the IGU Sub Block */ #define IGU_MEM_BASE 0x0000 #define IGU_MEM_MSIX_BASE 0x0000 #define IGU_MEM_MSIX_UPPER 0x0101 #define IGU_MEM_MSIX_RESERVED_UPPER 0x01ff #define IGU_MEM_PBA_MSIX_BASE 0x0200 #define IGU_MEM_PBA_MSIX_UPPER 0x0202 #define IGU_MEM_PBA_MSIX_RESERVED_UPPER 0x03ff #define IGU_CMD_INT_ACK_BASE 0x0400 #define IGU_CMD_INT_ACK_UPPER (IGU_CMD_INT_ACK_BASE + MAX_TOT_SB_PER_PATH - 1) #define IGU_CMD_INT_ACK_RESERVED_UPPER 0x05ff #define IGU_CMD_ATTN_BIT_UPD_UPPER 0x05f0 #define IGU_CMD_ATTN_BIT_SET_UPPER 0x05f1 #define IGU_CMD_ATTN_BIT_CLR_UPPER 0x05f2 #define IGU_REG_SISR_MDPC_WMASK_UPPER 0x05f3 #define IGU_REG_SISR_MDPC_WMASK_LSB_UPPER 0x05f4 #define IGU_REG_SISR_MDPC_WMASK_MSB_UPPER 0x05f5 #define IGU_REG_SISR_MDPC_WOMASK_UPPER 0x05f6 #define IGU_CMD_PROD_UPD_BASE 0x0600 #define 
IGU_CMD_PROD_UPD_UPPER (IGU_CMD_PROD_UPD_BASE + MAX_TOT_SB_PER_PATH - 1) #define IGU_CMD_PROD_UPD_RESERVED_UPPER 0x07ff /*****************/ /* PXP CONSTANTS */ /*****************/ /* Bars for Blocks */ #define PXP_BAR_GRC 0 #define PXP_BAR_TSDM 0 #define PXP_BAR_USDM 0 #define PXP_BAR_XSDM 0 #define PXP_BAR_MSDM 0 #define PXP_BAR_YSDM 0 #define PXP_BAR_PSDM 0 #define PXP_BAR_IGU 0 #define PXP_BAR_DQ 1 /* PTT and GTT */ #define PXP_NUM_PF_WINDOWS 12 #define PXP_PER_PF_ENTRY_SIZE 8 #define PXP_NUM_GLOBAL_WINDOWS 243 #define PXP_GLOBAL_ENTRY_SIZE 4 #define PXP_ADMIN_WINDOW_ALLOWED_LENGTH 4 #define PXP_PF_WINDOW_ADMIN_START 0 #define PXP_PF_WINDOW_ADMIN_LENGTH 0x1000 #define PXP_PF_WINDOW_ADMIN_END (PXP_PF_WINDOW_ADMIN_START + PXP_PF_WINDOW_ADMIN_LENGTH - 1) #define PXP_PF_WINDOW_ADMIN_PER_PF_START 0 #define PXP_PF_WINDOW_ADMIN_PER_PF_LENGTH (PXP_NUM_PF_WINDOWS * PXP_PER_PF_ENTRY_SIZE) #define PXP_PF_WINDOW_ADMIN_PER_PF_END (PXP_PF_WINDOW_ADMIN_PER_PF_START + PXP_PF_WINDOW_ADMIN_PER_PF_LENGTH - 1) #define PXP_PF_WINDOW_ADMIN_GLOBAL_START 0x200 #define PXP_PF_WINDOW_ADMIN_GLOBAL_LENGTH (PXP_NUM_GLOBAL_WINDOWS * PXP_GLOBAL_ENTRY_SIZE) #define PXP_PF_WINDOW_ADMIN_GLOBAL_END (PXP_PF_WINDOW_ADMIN_GLOBAL_START + PXP_PF_WINDOW_ADMIN_GLOBAL_LENGTH - 1) #define PXP_PF_GLOBAL_PRETEND_ADDR 0x1f0 #define PXP_PF_ME_OPAQUE_MASK_ADDR 0xf4 #define PXP_PF_ME_OPAQUE_ADDR 0x1f8 #define PXP_PF_ME_CONCRETE_ADDR 0x1fc #define PXP_EXTERNAL_BAR_PF_WINDOW_START 0x1000 #define PXP_EXTERNAL_BAR_PF_WINDOW_NUM PXP_NUM_PF_WINDOWS #define PXP_EXTERNAL_BAR_PF_WINDOW_SINGLE_SIZE 0x1000 #define PXP_EXTERNAL_BAR_PF_WINDOW_LENGTH (PXP_EXTERNAL_BAR_PF_WINDOW_NUM * PXP_EXTERNAL_BAR_PF_WINDOW_SINGLE_SIZE) #define PXP_EXTERNAL_BAR_PF_WINDOW_END (PXP_EXTERNAL_BAR_PF_WINDOW_START + PXP_EXTERNAL_BAR_PF_WINDOW_LENGTH - 1) #define PXP_EXTERNAL_BAR_GLOBAL_WINDOW_START (PXP_EXTERNAL_BAR_PF_WINDOW_END + 1) #define PXP_EXTERNAL_BAR_GLOBAL_WINDOW_NUM PXP_NUM_GLOBAL_WINDOWS #define PXP_EXTERNAL_BAR_GLOBAL_WINDOW_SINGLE_SIZE 0x1000 #define PXP_EXTERNAL_BAR_GLOBAL_WINDOW_LENGTH (PXP_EXTERNAL_BAR_GLOBAL_WINDOW_NUM * PXP_EXTERNAL_BAR_GLOBAL_WINDOW_SINGLE_SIZE) #define PXP_EXTERNAL_BAR_GLOBAL_WINDOW_END (PXP_EXTERNAL_BAR_GLOBAL_WINDOW_START + PXP_EXTERNAL_BAR_GLOBAL_WINDOW_LENGTH - 1) /* PF BAR */ //#define PXP_BAR0_START_GRC 0x1000 //#define PXP_BAR0_GRC_LENGTH 0xBFF000 #define PXP_BAR0_START_GRC 0x0000 #define PXP_BAR0_GRC_LENGTH 0x1C00000 #define PXP_BAR0_END_GRC (PXP_BAR0_START_GRC + PXP_BAR0_GRC_LENGTH - 1) #define PXP_BAR0_START_IGU 0x1C00000 #define PXP_BAR0_IGU_LENGTH 0x10000 #define PXP_BAR0_END_IGU (PXP_BAR0_START_IGU + PXP_BAR0_IGU_LENGTH - 1) #define PXP_BAR0_START_TSDM 0x1C80000 #define PXP_BAR0_SDM_LENGTH 0x40000 #define PXP_BAR0_SDM_RESERVED_LENGTH 0x40000 #define PXP_BAR0_END_TSDM (PXP_BAR0_START_TSDM + PXP_BAR0_SDM_LENGTH - 1) #define PXP_BAR0_START_MSDM 0x1D00000 #define PXP_BAR0_END_MSDM (PXP_BAR0_START_MSDM + PXP_BAR0_SDM_LENGTH - 1) #define PXP_BAR0_START_USDM 0x1D80000 #define PXP_BAR0_END_USDM (PXP_BAR0_START_USDM + PXP_BAR0_SDM_LENGTH - 1) #define PXP_BAR0_START_XSDM 0x1E00000 #define PXP_BAR0_END_XSDM (PXP_BAR0_START_XSDM + PXP_BAR0_SDM_LENGTH - 1) #define PXP_BAR0_START_YSDM 0x1E80000 #define PXP_BAR0_END_YSDM (PXP_BAR0_START_YSDM + PXP_BAR0_SDM_LENGTH - 1) #define PXP_BAR0_START_PSDM 0x1F00000 #define PXP_BAR0_END_PSDM (PXP_BAR0_START_PSDM + PXP_BAR0_SDM_LENGTH - 1) #define PXP_BAR0_FIRST_INVALID_ADDRESS (PXP_BAR0_END_PSDM + 1) /* VF BAR */ #define PXP_VF_BAR0 0 #define PXP_VF_BAR0_START_GRC 0x3E00 #define 
PXP_VF_BAR0_GRC_LENGTH 0x200 #define PXP_VF_BAR0_END_GRC (PXP_VF_BAR0_START_GRC + PXP_VF_BAR0_GRC_LENGTH - 1) #define PXP_VF_BAR0_START_IGU 0 #define PXP_VF_BAR0_IGU_LENGTH 0x3000 #define PXP_VF_BAR0_END_IGU (PXP_VF_BAR0_START_IGU + PXP_VF_BAR0_IGU_LENGTH - 1) #define PXP_VF_BAR0_START_DQ 0x3000 #define PXP_VF_BAR0_DQ_LENGTH 0x200 #define PXP_VF_BAR0_DQ_OPAQUE_OFFSET 0 #define PXP_VF_BAR0_ME_OPAQUE_ADDRESS (PXP_VF_BAR0_START_DQ + PXP_VF_BAR0_DQ_OPAQUE_OFFSET) #define PXP_VF_BAR0_ME_CONCRETE_ADDRESS (PXP_VF_BAR0_ME_OPAQUE_ADDRESS + 4) #define PXP_VF_BAR0_END_DQ (PXP_VF_BAR0_START_DQ + PXP_VF_BAR0_DQ_LENGTH - 1) #define PXP_VF_BAR0_START_TSDM_ZONE_B 0x3200 #define PXP_VF_BAR0_SDM_LENGTH_ZONE_B 0x200 #define PXP_VF_BAR0_END_TSDM_ZONE_B (PXP_VF_BAR0_START_TSDM_ZONE_B + PXP_VF_BAR0_SDM_LENGTH_ZONE_B - 1) #define PXP_VF_BAR0_START_MSDM_ZONE_B 0x3400 #define PXP_VF_BAR0_END_MSDM_ZONE_B (PXP_VF_BAR0_START_MSDM_ZONE_B + PXP_VF_BAR0_SDM_LENGTH_ZONE_B - 1) #define PXP_VF_BAR0_START_USDM_ZONE_B 0x3600 #define PXP_VF_BAR0_END_USDM_ZONE_B (PXP_VF_BAR0_START_USDM_ZONE_B + PXP_VF_BAR0_SDM_LENGTH_ZONE_B - 1) #define PXP_VF_BAR0_START_XSDM_ZONE_B 0x3800 #define PXP_VF_BAR0_END_XSDM_ZONE_B (PXP_VF_BAR0_START_XSDM_ZONE_B + PXP_VF_BAR0_SDM_LENGTH_ZONE_B - 1) #define PXP_VF_BAR0_START_YSDM_ZONE_B 0x3a00 #define PXP_VF_BAR0_END_YSDM_ZONE_B (PXP_VF_BAR0_START_YSDM_ZONE_B + PXP_VF_BAR0_SDM_LENGTH_ZONE_B - 1) #define PXP_VF_BAR0_START_PSDM_ZONE_B 0x3c00 #define PXP_VF_BAR0_END_PSDM_ZONE_B (PXP_VF_BAR0_START_PSDM_ZONE_B + PXP_VF_BAR0_SDM_LENGTH_ZONE_B - 1) #define PXP_VF_BAR0_START_SDM_ZONE_A 0x4000 #define PXP_VF_BAR0_END_SDM_ZONE_A 0x10000 #define PXP_VF_BAR0_GRC_WINDOW_LENGTH 32 #define PXP_ILT_PAGE_SIZE_NUM_BITS_MIN 12 #define PXP_ILT_BLOCK_FACTOR_MULTIPLIER 1024 // ILT Records #define PXP_NUM_ILT_RECORDS_BB 7600 #define PXP_NUM_ILT_RECORDS_K2 11000 #define MAX_NUM_ILT_RECORDS MAX(PXP_NUM_ILT_RECORDS_BB,PXP_NUM_ILT_RECORDS_K2) // Host Interface #define PXP_QUEUES_ZONE_MAX_NUM 320 /*****************/ /* PRM CONSTANTS */ /*****************/ #define PRM_DMA_PAD_BYTES_NUM 2 /*****************/ /* SDMs CONSTANTS */ /*****************/ #define SDM_OP_GEN_TRIG_NONE 0 #define SDM_OP_GEN_TRIG_WAKE_THREAD 1 #define SDM_OP_GEN_TRIG_AGG_INT 2 #define SDM_OP_GEN_TRIG_LOADER 4 #define SDM_OP_GEN_TRIG_INDICATE_ERROR 6 #define SDM_OP_GEN_TRIG_RELEASE_THREAD 7 ///////////////////////////////////////////////////////////// // Completion types ///////////////////////////////////////////////////////////// #define SDM_COMP_TYPE_NONE 0 #define SDM_COMP_TYPE_WAKE_THREAD 1 #define SDM_COMP_TYPE_AGG_INT 2 #define SDM_COMP_TYPE_CM 3 // Send direct message to local CM and/or remote CMs. Destinations are defined by vector in CompParams. #define SDM_COMP_TYPE_LOADER 4 #define SDM_COMP_TYPE_PXP 5 // Send direct message to PXP (like "internal write" command) to write to remote Storm RAM via remote SDM #define SDM_COMP_TYPE_INDICATE_ERROR 6 // Indicate error per thread #define SDM_COMP_TYPE_RELEASE_THREAD 7 #define SDM_COMP_TYPE_RAM 8 // Write to local RAM as a completion /******************/ /* PBF CONSTANTS */ /******************/ /* Number of PBF command queue lines. Each line is 32B. */ #define PBF_MAX_CMD_LINES 3328 /* Number of BTB blocks. Each block is 256B. 
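With BTB_MAX_BLOCKS = 1440 below, that amounts to 1440 * 256 B = 360 KiB of buffering.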
*/ #define BTB_MAX_BLOCKS 1440 /*****************/ /* PRS CONSTANTS */ /*****************/ #define PRS_GFT_CAM_LINES_NO_MATCH 31 /* * Async data KCQ CQE */ struct async_data { __le32 cid /* Context ID of the connection */; __le16 itid /* Task Id of the task (for an error that happened on a task) */; uint8_t error_code /* error code - relevant only if the opcode indicates an error */; uint8_t fw_debug_param /* internal fw debug parameter */; }; /* * Interrupt coalescing TimeSet */ struct coalescing_timeset { uint8_t value; #define COALESCING_TIMESET_TIMESET_MASK 0x7F /* Interrupt coalescing TimeSet (timeout_ticks = TimeSet shl (TimerRes+1)) */ #define COALESCING_TIMESET_TIMESET_SHIFT 0 #define COALESCING_TIMESET_VALID_MASK 0x1 /* Only if this flag is set, timeset will take effect */ #define COALESCING_TIMESET_VALID_SHIFT 7 }; struct common_queue_zone { __le16 ring_drv_data_consumer; __le16 reserved; }; /* * ETH Rx producers data */ struct eth_rx_prod_data { __le16 bd_prod /* BD producer. */; __le16 cqe_prod /* CQE producer. */; }; struct regpair { __le32 lo /* low word for reg-pair */; __le32 hi /* high word for reg-pair */; }; /* * Event Ring VF-PF Channel data */ struct vf_pf_channel_eqe_data { struct regpair msg_addr /* VF-PF message address */; }; struct iscsi_eqe_data { __le32 cid /* Context ID of the connection */; __le16 conn_id /* Task Id of the task (for an error that happened on a task) */; uint8_t error_code /* error code - relevant only if the opcode indicates an error */; uint8_t error_pdu_opcode_reserved; #define ISCSI_EQE_DATA_ERROR_PDU_OPCODE_MASK 0x3F /* The opcode of the processed PDU on which the error happened - updated for specific error codes, by default=0xFF */ #define ISCSI_EQE_DATA_ERROR_PDU_OPCODE_SHIFT 0 #define ISCSI_EQE_DATA_ERROR_PDU_OPCODE_VALID_MASK 0x1 /* Indication for the driver that the error_pdu_opcode field has a valid value */ #define ISCSI_EQE_DATA_ERROR_PDU_OPCODE_VALID_SHIFT 6 #define ISCSI_EQE_DATA_RESERVED0_MASK 0x1 #define ISCSI_EQE_DATA_RESERVED0_SHIFT 7 }; /* * Event Ring malicious VF data */ struct malicious_vf_eqe_data { uint8_t vfId /* Malicious VF ID */; uint8_t errId /* Malicious VF error */; __le16 reserved[3]; }; /* * Event Ring initial cleanup data */ struct initial_cleanup_eqe_data { uint8_t vfId /* VF ID */; uint8_t reserved[7]; }; /* * Event Data Union */ union event_ring_data { uint8_t bytes[8] /* Byte Array */; struct vf_pf_channel_eqe_data vf_pf_channel /* VF-PF Channel data */; struct iscsi_eqe_data iscsi_info /* Dedicated fields for iscsi data */; struct regpair roceHandle /* Dedicated field for RoCE affiliated asynchronous error */; struct malicious_vf_eqe_data malicious_vf /* Malicious VF data */; struct initial_cleanup_eqe_data vf_init_cleanup /* VF Initial Cleanup data */; struct regpair iwarp_handle /* Host handle for the Async Completions */; }; /* * Event Ring Entry */ struct event_ring_entry { uint8_t protocol_id /* Event Protocol ID */; uint8_t opcode /* Event Opcode */; __le16 reserved0 /* Reserved */; __le16 echo /* Echo value from ramrod data on the host */; uint8_t fw_return_code /* FW return code for SP ramrods */; uint8_t flags; #define EVENT_RING_ENTRY_ASYNC_MASK 0x1 /* 0: synchronous EQE - a completion of SP message.
1: asynchronous EQE */ #define EVENT_RING_ENTRY_ASYNC_SHIFT 0 #define EVENT_RING_ENTRY_RESERVED1_MASK 0x7F #define EVENT_RING_ENTRY_RESERVED1_SHIFT 1 union event_ring_data data; }; /* * Multi function mode */ enum mf_mode { ERROR_MODE /* Unsupported mode */, MF_OVLAN /* Multi function based on outer VLAN */, MF_NPAR /* Multi function based on MAC address (NIC partitioning) */, MAX_MF_MODE }; /* * Per-protocol connection types */ enum protocol_type { PROTOCOLID_ISCSI /* iSCSI */, PROTOCOLID_FCOE /* FCoE */, PROTOCOLID_ROCE /* RoCE */, PROTOCOLID_CORE /* Core (light L2, slow path core) */, PROTOCOLID_ETH /* Ethernet */, PROTOCOLID_IWARP /* iWARP */, PROTOCOLID_TOE /* TOE */, PROTOCOLID_PREROCE /* Pre (tapeout) RoCE */, PROTOCOLID_COMMON /* ProtocolCommon */, PROTOCOLID_TCP /* TCP */, MAX_PROTOCOL_TYPE }; /* * Ustorm Queue Zone */ struct ustorm_eth_queue_zone { struct coalescing_timeset int_coalescing_timeset /* Rx interrupt coalescing TimeSet */; uint8_t reserved[3]; }; struct ustorm_queue_zone { struct ustorm_eth_queue_zone eth; struct common_queue_zone common; }; /* * status block structure */ struct cau_pi_entry { __le32 prod; #define CAU_PI_ENTRY_PROD_VAL_MASK 0xFFFF /* A per protocol indexPROD value. */ #define CAU_PI_ENTRY_PROD_VAL_SHIFT 0 #define CAU_PI_ENTRY_PI_TIMESET_MASK 0x7F /* This value determines the TimeSet that the PI is associated with */ #define CAU_PI_ENTRY_PI_TIMESET_SHIFT 16 #define CAU_PI_ENTRY_FSM_SEL_MASK 0x1 /* Select the FSM within the SB */ #define CAU_PI_ENTRY_FSM_SEL_SHIFT 23 #define CAU_PI_ENTRY_RESERVED_MASK 0xFF /* Select the FSM within the SB */ #define CAU_PI_ENTRY_RESERVED_SHIFT 24 }; /* * status block structure */ struct cau_sb_entry { __le32 data; #define CAU_SB_ENTRY_SB_PROD_MASK 0xFFFFFF /* The SB PROD index which is sent to the IGU. */ #define CAU_SB_ENTRY_SB_PROD_SHIFT 0 #define CAU_SB_ENTRY_STATE0_MASK 0xF /* RX state */ #define CAU_SB_ENTRY_STATE0_SHIFT 24 #define CAU_SB_ENTRY_STATE1_MASK 0xF /* TX state */ #define CAU_SB_ENTRY_STATE1_SHIFT 28 __le32 params; #define CAU_SB_ENTRY_SB_TIMESET0_MASK 0x7F /* Indicates the RX TimeSet that this SB is associated with. */ #define CAU_SB_ENTRY_SB_TIMESET0_SHIFT 0 #define CAU_SB_ENTRY_SB_TIMESET1_MASK 0x7F /* Indicates the TX TimeSet that this SB is associated with. */ #define CAU_SB_ENTRY_SB_TIMESET1_SHIFT 7 #define CAU_SB_ENTRY_TIMER_RES0_MASK 0x3 /* This value will determine the RX FSM timer resolution in ticks */ #define CAU_SB_ENTRY_TIMER_RES0_SHIFT 14 #define CAU_SB_ENTRY_TIMER_RES1_MASK 0x3 /* This value will determine the TX FSM timer resolution in ticks */ #define CAU_SB_ENTRY_TIMER_RES1_SHIFT 16 #define CAU_SB_ENTRY_VF_NUMBER_MASK 0xFF #define CAU_SB_ENTRY_VF_NUMBER_SHIFT 18 #define CAU_SB_ENTRY_VF_VALID_MASK 0x1 #define CAU_SB_ENTRY_VF_VALID_SHIFT 26 #define CAU_SB_ENTRY_PF_NUMBER_MASK 0xF #define CAU_SB_ENTRY_PF_NUMBER_SHIFT 27 #define CAU_SB_ENTRY_TPH_MASK 0x1 /* If set then indicates that the TPH STAG is equal to the SB number. Otherwise the STAG will be equal to all ones. 
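TPH here refers to the PCIe TLP Processing Hints mechanism; the STAG is the steering tag carried in hinted transactions.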
*/ #define CAU_SB_ENTRY_TPH_SHIFT 31 }; /* * core doorbell data */ struct core_db_data { uint8_t params; #define CORE_DB_DATA_DEST_MASK 0x3 /* destination of doorbell (use enum db_dest) */ #define CORE_DB_DATA_DEST_SHIFT 0 #define CORE_DB_DATA_AGG_CMD_MASK 0x3 /* aggregative command to CM (use enum db_agg_cmd_sel) */ #define CORE_DB_DATA_AGG_CMD_SHIFT 2 #define CORE_DB_DATA_BYPASS_EN_MASK 0x1 /* enable QM bypass */ #define CORE_DB_DATA_BYPASS_EN_SHIFT 4 #define CORE_DB_DATA_RESERVED_MASK 0x1 #define CORE_DB_DATA_RESERVED_SHIFT 5 #define CORE_DB_DATA_AGG_VAL_SEL_MASK 0x3 /* aggregative value selection */ #define CORE_DB_DATA_AGG_VAL_SEL_SHIFT 6 uint8_t agg_flags /* bit for every DQ counter flags in CM context that DQ can increment */; __le16 spq_prod; }; /* * Enum of doorbell aggregative command selection */ enum db_agg_cmd_sel { DB_AGG_CMD_NOP /* No operation */, DB_AGG_CMD_SET /* Set the value */, DB_AGG_CMD_ADD /* Add the value */, DB_AGG_CMD_MAX /* Set max of current and new value */, MAX_DB_AGG_CMD_SEL }; /* * Enum of doorbell destination */ enum db_dest { DB_DEST_XCM /* TX doorbell to XCM */, DB_DEST_UCM /* RX doorbell to UCM */, DB_DEST_TCM /* RX doorbell to TCM */, DB_NUM_DESTINATIONS, MAX_DB_DEST }; /* * Enum of doorbell DPM types */ enum db_dpm_type { DPM_LEGACY /* Legacy DPM- to Xstorm RAM */, DPM_ROCE /* RoCE DPM- to NIG */, DPM_L2_INLINE /* L2 DPM inline- to PBF, with packet data on doorbell */, DPM_L2_BD /* L2 DPM with BD- to PBF, with TX BD data on doorbell */, MAX_DB_DPM_TYPE }; /* * Structure for doorbell data, in L2 DPM mode, for the first doorbell in a DPM burst */ struct db_l2_dpm_data { __le16 icid /* internal CID */; __le16 bd_prod /* bd producer value to update */; __le32 params; #define DB_L2_DPM_DATA_SIZE_MASK 0x3F /* Size in QWORD-s of the DPM burst */ #define DB_L2_DPM_DATA_SIZE_SHIFT 0 #define DB_L2_DPM_DATA_DPM_TYPE_MASK 0x3 /* Type of DPM transaction (DPM_L2_INLINE or DPM_L2_BD) (use enum db_dpm_type) */ #define DB_L2_DPM_DATA_DPM_TYPE_SHIFT 6 #define DB_L2_DPM_DATA_NUM_BDS_MASK 0xFF /* number of BD-s */ #define DB_L2_DPM_DATA_NUM_BDS_SHIFT 8 #define DB_L2_DPM_DATA_PKT_SIZE_MASK 0x7FF /* size of the packet to be transmitted in bytes */ #define DB_L2_DPM_DATA_PKT_SIZE_SHIFT 16 #define DB_L2_DPM_DATA_RESERVED0_MASK 0x1 #define DB_L2_DPM_DATA_RESERVED0_SHIFT 27 #define DB_L2_DPM_DATA_SGE_NUM_MASK 0x7 /* In DPM_L2_BD mode: the number of SGE-s */ #define DB_L2_DPM_DATA_SGE_NUM_SHIFT 28 #define DB_L2_DPM_DATA_RESERVED1_MASK 0x1 #define DB_L2_DPM_DATA_RESERVED1_SHIFT 31 }; /* * Structure for SGE in a DPM doorbell of type DPM_L2_BD */ struct db_l2_dpm_sge { struct regpair addr /* Single continuous buffer */; __le16 nbytes /* Number of bytes in this BD. 
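For a DPM_L2_BD burst one such SGE is supplied per DB_L2_DPM_DATA_SGE_NUM entry; presumably the nbytes of all SGEs add up to the DB_L2_DPM_DATA_PKT_SIZE given in the burst header.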
*/; __le16 bitfields; #define DB_L2_DPM_SGE_TPH_ST_INDEX_MASK 0x1FF /* The TPH STAG index value */ #define DB_L2_DPM_SGE_TPH_ST_INDEX_SHIFT 0 #define DB_L2_DPM_SGE_RESERVED0_MASK 0x3 #define DB_L2_DPM_SGE_RESERVED0_SHIFT 9 #define DB_L2_DPM_SGE_ST_VALID_MASK 0x1 /* Indicate if ST hint is requested or not */ #define DB_L2_DPM_SGE_ST_VALID_SHIFT 11 #define DB_L2_DPM_SGE_RESERVED1_MASK 0xF #define DB_L2_DPM_SGE_RESERVED1_SHIFT 12 __le32 reserved2; }; /* * Structure for doorbell address, in legacy mode */ struct db_legacy_addr { __le32 addr; #define DB_LEGACY_ADDR_RESERVED0_MASK 0x3 #define DB_LEGACY_ADDR_RESERVED0_SHIFT 0 #define DB_LEGACY_ADDR_DEMS_MASK 0x7 /* doorbell extraction mode specifier - 0 if not used */ #define DB_LEGACY_ADDR_DEMS_SHIFT 2 #define DB_LEGACY_ADDR_ICID_MASK 0x7FFFFFF /* internal CID */ #define DB_LEGACY_ADDR_ICID_SHIFT 5 }; /* * Structure for doorbell address, in PWM mode */ struct db_pwm_addr { __le32 addr; #define DB_PWM_ADDR_RESERVED0_MASK 0x7 #define DB_PWM_ADDR_RESERVED0_SHIFT 0 #define DB_PWM_ADDR_OFFSET_MASK 0x7F /* Offset in PWM address space */ #define DB_PWM_ADDR_OFFSET_SHIFT 3 #define DB_PWM_ADDR_WID_MASK 0x3 /* Window ID */ #define DB_PWM_ADDR_WID_SHIFT 10 #define DB_PWM_ADDR_DPI_MASK 0xFFFF /* Doorbell page ID */ #define DB_PWM_ADDR_DPI_SHIFT 12 #define DB_PWM_ADDR_RESERVED1_MASK 0xF #define DB_PWM_ADDR_RESERVED1_SHIFT 28 }; /* * Parameters to RoCE firmware, passed in EDPM doorbell */ struct db_roce_dpm_params { __le32 params; #define DB_ROCE_DPM_PARAMS_SIZE_MASK 0x3F /* Size in QWORD-s of the DPM burst */ #define DB_ROCE_DPM_PARAMS_SIZE_SHIFT 0 #define DB_ROCE_DPM_PARAMS_DPM_TYPE_MASK 0x3 /* Type of DPM transaction (DPM_ROCE) (use enum db_dpm_type) */ #define DB_ROCE_DPM_PARAMS_DPM_TYPE_SHIFT 6 #define DB_ROCE_DPM_PARAMS_OPCODE_MASK 0xFF /* opcode for ROCE operation */ #define DB_ROCE_DPM_PARAMS_OPCODE_SHIFT 8 #define DB_ROCE_DPM_PARAMS_WQE_SIZE_MASK 0x7FF /* the size of the WQE payload in bytes */ #define DB_ROCE_DPM_PARAMS_WQE_SIZE_SHIFT 16 #define DB_ROCE_DPM_PARAMS_RESERVED0_MASK 0x1 #define DB_ROCE_DPM_PARAMS_RESERVED0_SHIFT 27 #define DB_ROCE_DPM_PARAMS_ACK_REQUEST_MASK 0x1 /* RoCE ack request (will be set to 1) */ #define DB_ROCE_DPM_PARAMS_ACK_REQUEST_SHIFT 28 #define DB_ROCE_DPM_PARAMS_S_FLG_MASK 0x1 /* RoCE S flag */ #define DB_ROCE_DPM_PARAMS_S_FLG_SHIFT 29 #define DB_ROCE_DPM_PARAMS_COMPLETION_FLG_MASK 0x1 /* RoCE completion flag for FW use */ #define DB_ROCE_DPM_PARAMS_COMPLETION_FLG_SHIFT 30 #define DB_ROCE_DPM_PARAMS_RESERVED1_MASK 0x1 #define DB_ROCE_DPM_PARAMS_RESERVED1_SHIFT 31 }; /* * Structure for doorbell data, in ROCE DPM mode, for the first doorbell in a DPM burst */ struct db_roce_dpm_data { __le16 icid /* internal CID */; __le16 prod_val /* aggregated value to update */; struct db_roce_dpm_params params /* parameters passed to RoCE firmware */; }; /* * Igu interrupt command */ enum igu_int_cmd { IGU_INT_ENABLE=0, IGU_INT_DISABLE=1, IGU_INT_NOP=2, IGU_INT_NOP2=3, MAX_IGU_INT_CMD }; /* * IGU producer or consumer update command */ struct igu_prod_cons_update { __le32 sb_id_and_flags; #define IGU_PROD_CONS_UPDATE_SB_INDEX_MASK 0xFFFFFF #define IGU_PROD_CONS_UPDATE_SB_INDEX_SHIFT 0 #define IGU_PROD_CONS_UPDATE_UPDATE_FLAG_MASK 0x1 #define IGU_PROD_CONS_UPDATE_UPDATE_FLAG_SHIFT 24 #define IGU_PROD_CONS_UPDATE_ENABLE_INT_MASK 0x3 /* interrupt enable/disable/nop (use enum igu_int_cmd) */ #define IGU_PROD_CONS_UPDATE_ENABLE_INT_SHIFT 25 #define IGU_PROD_CONS_UPDATE_SEGMENT_ACCESS_MASK 0x1 /* (use enum igu_seg_access) */ #define
IGU_PROD_CONS_UPDATE_SEGMENT_ACCESS_SHIFT 27 #define IGU_PROD_CONS_UPDATE_TIMER_MASK_MASK 0x1 #define IGU_PROD_CONS_UPDATE_TIMER_MASK_SHIFT 28 #define IGU_PROD_CONS_UPDATE_RESERVED0_MASK 0x3 #define IGU_PROD_CONS_UPDATE_RESERVED0_SHIFT 29 #define IGU_PROD_CONS_UPDATE_COMMAND_TYPE_MASK 0x1 /* must always be set cleared (use enum command_type_bit) */ #define IGU_PROD_CONS_UPDATE_COMMAND_TYPE_SHIFT 31 __le32 reserved1; }; /* * Igu segments access for default status block only */ enum igu_seg_access { IGU_SEG_ACCESS_REG=0, IGU_SEG_ACCESS_ATTN=1, MAX_IGU_SEG_ACCESS }; /* * Enumeration for L3 type field of parsing_and_err_flags_union. L3Type: 0 - unknown (not ip) ,1 - Ipv4, 2 - Ipv6 (this field can be filled according to the last-ethertype) */ enum l3_type { e_l3Type_unknown, e_l3Type_ipv4, e_l3Type_ipv6, MAX_L3_TYPE }; /* * Enumeration for l4Protocol field of parsing_and_err_flags_union. L4-protocol 0 - none, 1 - TCP, 2- UDP. if the packet is IPv4 fragment, and its not the first fragment, the protocol-type should be set to none. */ enum l4_protocol { e_l4Protocol_none, e_l4Protocol_tcp, e_l4Protocol_udp, MAX_L4_PROTOCOL }; /* * Parsing and error flags field. */ struct parsing_and_err_flags { __le16 flags; #define PARSING_AND_ERR_FLAGS_L3TYPE_MASK 0x3 /* L3Type: 0 - unknown (not ip) ,1 - Ipv4, 2 - Ipv6 (this field can be filled according to the last-ethertype) (use enum l3_type) */ #define PARSING_AND_ERR_FLAGS_L3TYPE_SHIFT 0 #define PARSING_AND_ERR_FLAGS_L4PROTOCOL_MASK 0x3 /* L4-protocol 0 - none, 1 - TCP, 2- UDP. if the packet is IPv4 fragment, and its not the first fragment, the protocol-type should be set to none. (use enum l4_protocol) */ #define PARSING_AND_ERR_FLAGS_L4PROTOCOL_SHIFT 2 #define PARSING_AND_ERR_FLAGS_IPV4FRAG_MASK 0x1 /* Set if the packet is IPv4 fragment. */ #define PARSING_AND_ERR_FLAGS_IPV4FRAG_SHIFT 4 #define PARSING_AND_ERR_FLAGS_TAG8021QEXIST_MASK 0x1 /* Set if VLAN tag exists. Invalid if tunnel type are IP GRE or IP GENEVE. */ #define PARSING_AND_ERR_FLAGS_TAG8021QEXIST_SHIFT 5 #define PARSING_AND_ERR_FLAGS_L4CHKSMWASCALCULATED_MASK 0x1 /* Set if L4 checksum was calculated. */ #define PARSING_AND_ERR_FLAGS_L4CHKSMWASCALCULATED_SHIFT 6 #define PARSING_AND_ERR_FLAGS_TIMESYNCPKT_MASK 0x1 /* Set for PTP packet. */ #define PARSING_AND_ERR_FLAGS_TIMESYNCPKT_SHIFT 7 #define PARSING_AND_ERR_FLAGS_TIMESTAMPRECORDED_MASK 0x1 /* Set if PTP timestamp recorded. */ #define PARSING_AND_ERR_FLAGS_TIMESTAMPRECORDED_SHIFT 8 #define PARSING_AND_ERR_FLAGS_IPHDRERROR_MASK 0x1 /* Set if either version-mismatch or hdr-len-error or ipv4-cksm is set or ipv6 ver mismatch */ #define PARSING_AND_ERR_FLAGS_IPHDRERROR_SHIFT 9 #define PARSING_AND_ERR_FLAGS_L4CHKSMERROR_MASK 0x1 /* Set if L4 checksum validation failed. Valid only if L4 checksum was calculated. */ #define PARSING_AND_ERR_FLAGS_L4CHKSMERROR_SHIFT 10 #define PARSING_AND_ERR_FLAGS_TUNNELEXIST_MASK 0x1 /* Set if GRE/VXLAN/GENEVE tunnel detected. */ #define PARSING_AND_ERR_FLAGS_TUNNELEXIST_SHIFT 11 #define PARSING_AND_ERR_FLAGS_TUNNEL8021QTAGEXIST_MASK 0x1 /* Set if VLAN tag exists in tunnel header. */ #define PARSING_AND_ERR_FLAGS_TUNNEL8021QTAGEXIST_SHIFT 12 #define PARSING_AND_ERR_FLAGS_TUNNELIPHDRERROR_MASK 0x1 /* Set if either tunnel-ipv4-version-mismatch or tunnel-ipv4-hdr-len-error or tunnel-ipv4-cksm is set or tunneling ipv6 ver mismatch */ #define PARSING_AND_ERR_FLAGS_TUNNELIPHDRERROR_SHIFT 13 #define PARSING_AND_ERR_FLAGS_TUNNELL4CHKSMWASCALCULATED_MASK 0x1 /* Set if GRE or VXLAN/GENEVE UDP checksum was calculated. 
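The PARSING_AND_ERR_FLAGS_TUNNELL4CHKSMERROR bit below is only valid when this bit is set.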
*/ #define PARSING_AND_ERR_FLAGS_TUNNELL4CHKSMWASCALCULATED_SHIFT 14 #define PARSING_AND_ERR_FLAGS_TUNNELL4CHKSMERROR_MASK 0x1 /* Set if tunnel L4 checksum validation failed. Valid only if tunnel L4 checksum was calculated. */ #define PARSING_AND_ERR_FLAGS_TUNNELL4CHKSMERROR_SHIFT 15 }; /* * Pb context */ struct pb_context { __le32 crc[4]; }; /* * Concrete Function ID. */ struct pxp_concrete_fid { __le16 fid; #define PXP_CONCRETE_FID_PFID_MASK 0xF /* Parent PFID */ #define PXP_CONCRETE_FID_PFID_SHIFT 0 #define PXP_CONCRETE_FID_PORT_MASK 0x3 /* port number */ #define PXP_CONCRETE_FID_PORT_SHIFT 4 #define PXP_CONCRETE_FID_PATH_MASK 0x1 /* path number */ #define PXP_CONCRETE_FID_PATH_SHIFT 6 #define PXP_CONCRETE_FID_VFVALID_MASK 0x1 #define PXP_CONCRETE_FID_VFVALID_SHIFT 7 #define PXP_CONCRETE_FID_VFID_MASK 0xFF #define PXP_CONCRETE_FID_VFID_SHIFT 8 }; /* * Concrete Function ID. */ struct pxp_pretend_concrete_fid { __le16 fid; #define PXP_PRETEND_CONCRETE_FID_PFID_MASK 0xF /* Parent PFID */ #define PXP_PRETEND_CONCRETE_FID_PFID_SHIFT 0 #define PXP_PRETEND_CONCRETE_FID_RESERVED_MASK 0x7 /* port number. Only when part of ME register. */ #define PXP_PRETEND_CONCRETE_FID_RESERVED_SHIFT 4 #define PXP_PRETEND_CONCRETE_FID_VFVALID_MASK 0x1 #define PXP_PRETEND_CONCRETE_FID_VFVALID_SHIFT 7 #define PXP_PRETEND_CONCRETE_FID_VFID_MASK 0xFF #define PXP_PRETEND_CONCRETE_FID_VFID_SHIFT 8 }; /* * Function ID. */ union pxp_pretend_fid { struct pxp_pretend_concrete_fid concrete_fid; __le16 opaque_fid; }; /* * Pxp Pretend Command Register. */ struct pxp_pretend_cmd { union pxp_pretend_fid fid; __le16 control; #define PXP_PRETEND_CMD_PATH_MASK 0x1 #define PXP_PRETEND_CMD_PATH_SHIFT 0 #define PXP_PRETEND_CMD_USE_PORT_MASK 0x1 #define PXP_PRETEND_CMD_USE_PORT_SHIFT 1 #define PXP_PRETEND_CMD_PORT_MASK 0x3 #define PXP_PRETEND_CMD_PORT_SHIFT 2 #define PXP_PRETEND_CMD_RESERVED0_MASK 0xF #define PXP_PRETEND_CMD_RESERVED0_SHIFT 4 #define PXP_PRETEND_CMD_RESERVED1_MASK 0xF #define PXP_PRETEND_CMD_RESERVED1_SHIFT 8 #define PXP_PRETEND_CMD_PRETEND_PATH_MASK 0x1 /* is pretend mode? */ #define PXP_PRETEND_CMD_PRETEND_PATH_SHIFT 12 #define PXP_PRETEND_CMD_PRETEND_PORT_MASK 0x1 /* is pretend mode? */ #define PXP_PRETEND_CMD_PRETEND_PORT_SHIFT 13 #define PXP_PRETEND_CMD_PRETEND_FUNCTION_MASK 0x1 /* is pretend mode? */ #define PXP_PRETEND_CMD_PRETEND_FUNCTION_SHIFT 14 #define PXP_PRETEND_CMD_IS_CONCRETE_MASK 0x1 /* is fid concrete? */ #define PXP_PRETEND_CMD_IS_CONCRETE_SHIFT 15 }; /* * PTT Record in PXP Admin Window. */ struct pxp_ptt_entry { __le32 offset; #define PXP_PTT_ENTRY_OFFSET_MASK 0x7FFFFF #define PXP_PTT_ENTRY_OFFSET_SHIFT 0 #define PXP_PTT_ENTRY_RESERVED0_MASK 0x1FF #define PXP_PTT_ENTRY_RESERVED0_SHIFT 23 struct pxp_pretend_cmd pretend; }; /* * VF Zone A Permission Register. 
*/ struct pxp_vf_zone_a_permission { __le32 control; #define PXP_VF_ZONE_A_PERMISSION_VFID_MASK 0xFF #define PXP_VF_ZONE_A_PERMISSION_VFID_SHIFT 0 #define PXP_VF_ZONE_A_PERMISSION_VALID_MASK 0x1 #define PXP_VF_ZONE_A_PERMISSION_VALID_SHIFT 8 #define PXP_VF_ZONE_A_PERMISSION_RESERVED0_MASK 0x7F #define PXP_VF_ZONE_A_PERMISSION_RESERVED0_SHIFT 9 #define PXP_VF_ZONE_A_PERMISSION_RESERVED1_MASK 0xFFFF #define PXP_VF_ZONE_A_PERMISSION_RESERVED1_SHIFT 16 }; /* * Rdif context */ struct rdif_task_context { __le32 initialRefTag; __le16 appTagValue; __le16 appTagMask; uint8_t flags0; #define RDIF_TASK_CONTEXT_IGNOREAPPTAG_MASK 0x1 #define RDIF_TASK_CONTEXT_IGNOREAPPTAG_SHIFT 0 #define RDIF_TASK_CONTEXT_INITIALREFTAGVALID_MASK 0x1 #define RDIF_TASK_CONTEXT_INITIALREFTAGVALID_SHIFT 1 #define RDIF_TASK_CONTEXT_HOSTGUARDTYPE_MASK 0x1 /* 0 = IP checksum, 1 = CRC */ #define RDIF_TASK_CONTEXT_HOSTGUARDTYPE_SHIFT 2 #define RDIF_TASK_CONTEXT_SETERRORWITHEOP_MASK 0x1 #define RDIF_TASK_CONTEXT_SETERRORWITHEOP_SHIFT 3 #define RDIF_TASK_CONTEXT_PROTECTIONTYPE_MASK 0x3 /* 1/2/3 - Protection Type */ #define RDIF_TASK_CONTEXT_PROTECTIONTYPE_SHIFT 4 #define RDIF_TASK_CONTEXT_CRC_SEED_MASK 0x1 /* 0=0x0000, 1=0xffff */ #define RDIF_TASK_CONTEXT_CRC_SEED_SHIFT 6 #define RDIF_TASK_CONTEXT_KEEPREFTAGCONST_MASK 0x1 /* Keep reference tag constant */ #define RDIF_TASK_CONTEXT_KEEPREFTAGCONST_SHIFT 7 uint8_t partialDifData[7]; __le16 partialCrcValue; __le16 partialChecksumValue; __le32 offsetInIO; __le16 flags1; #define RDIF_TASK_CONTEXT_VALIDATEGUARD_MASK 0x1 #define RDIF_TASK_CONTEXT_VALIDATEGUARD_SHIFT 0 #define RDIF_TASK_CONTEXT_VALIDATEAPPTAG_MASK 0x1 #define RDIF_TASK_CONTEXT_VALIDATEAPPTAG_SHIFT 1 #define RDIF_TASK_CONTEXT_VALIDATEREFTAG_MASK 0x1 #define RDIF_TASK_CONTEXT_VALIDATEREFTAG_SHIFT 2 #define RDIF_TASK_CONTEXT_FORWARDGUARD_MASK 0x1 #define RDIF_TASK_CONTEXT_FORWARDGUARD_SHIFT 3 #define RDIF_TASK_CONTEXT_FORWARDAPPTAG_MASK 0x1 #define RDIF_TASK_CONTEXT_FORWARDAPPTAG_SHIFT 4 #define RDIF_TASK_CONTEXT_FORWARDREFTAG_MASK 0x1 #define RDIF_TASK_CONTEXT_FORWARDREFTAG_SHIFT 5 #define RDIF_TASK_CONTEXT_INTERVALSIZE_MASK 0x7 /* 0=512B, 1=1KB, 2=2KB, 3=4KB, 4=8KB */ #define RDIF_TASK_CONTEXT_INTERVALSIZE_SHIFT 6 #define RDIF_TASK_CONTEXT_HOSTINTERFACE_MASK 0x3 /* 0=None, 1=DIF, 2=DIX */ #define RDIF_TASK_CONTEXT_HOSTINTERFACE_SHIFT 9 #define RDIF_TASK_CONTEXT_DIFBEFOREDATA_MASK 0x1 /* DIF tag right at the beginning of DIF interval */ #define RDIF_TASK_CONTEXT_DIFBEFOREDATA_SHIFT 11 #define RDIF_TASK_CONTEXT_RESERVED0_MASK 0x1 #define RDIF_TASK_CONTEXT_RESERVED0_SHIFT 12 #define RDIF_TASK_CONTEXT_NETWORKINTERFACE_MASK 0x1 /* 0=None, 1=DIF */ #define RDIF_TASK_CONTEXT_NETWORKINTERFACE_SHIFT 13 #define RDIF_TASK_CONTEXT_FORWARDAPPTAGWITHMASK_MASK 0x1 /* Forward application tag with mask */ #define RDIF_TASK_CONTEXT_FORWARDAPPTAGWITHMASK_SHIFT 14 #define RDIF_TASK_CONTEXT_FORWARDREFTAGWITHMASK_MASK 0x1 /* Forward reference tag with mask */ #define RDIF_TASK_CONTEXT_FORWARDREFTAGWITHMASK_SHIFT 15 __le16 state; #define RDIF_TASK_CONTEXT_RECEIVEDDIFBYTESLEFT_MASK 0xF #define RDIF_TASK_CONTEXT_RECEIVEDDIFBYTESLEFT_SHIFT 0 #define RDIF_TASK_CONTEXT_TRANSMITEDDIFBYTESLEFT_MASK 0xF #define RDIF_TASK_CONTEXT_TRANSMITEDDIFBYTESLEFT_SHIFT 4 #define RDIF_TASK_CONTEXT_ERRORINIO_MASK 0x1 #define RDIF_TASK_CONTEXT_ERRORINIO_SHIFT 8 #define RDIF_TASK_CONTEXT_CHECKSUMOVERFLOW_MASK 0x1 #define RDIF_TASK_CONTEXT_CHECKSUMOVERFLOW_SHIFT 9 #define RDIF_TASK_CONTEXT_REFTAGMASK_MASK 0xF /* mask for refernce tag handling */ #define 
RDIF_TASK_CONTEXT_REFTAGMASK_SHIFT 10 #define RDIF_TASK_CONTEXT_RESERVED1_MASK 0x3 #define RDIF_TASK_CONTEXT_RESERVED1_SHIFT 14 __le32 reserved2; }; /* * RSS hash type */ enum rss_hash_type { RSS_HASH_TYPE_DEFAULT=0, RSS_HASH_TYPE_IPV4=1, RSS_HASH_TYPE_TCP_IPV4=2, RSS_HASH_TYPE_IPV6=3, RSS_HASH_TYPE_TCP_IPV6=4, RSS_HASH_TYPE_UDP_IPV4=5, RSS_HASH_TYPE_UDP_IPV6=6, MAX_RSS_HASH_TYPE }; /* * status block structure */ struct status_block { __le16 pi_array[PIS_PER_SB]; __le32 sb_num; #define STATUS_BLOCK_SB_NUM_MASK 0x1FF #define STATUS_BLOCK_SB_NUM_SHIFT 0 #define STATUS_BLOCK_ZERO_PAD_MASK 0x7F #define STATUS_BLOCK_ZERO_PAD_SHIFT 9 #define STATUS_BLOCK_ZERO_PAD2_MASK 0xFFFF #define STATUS_BLOCK_ZERO_PAD2_SHIFT 16 __le32 prod_index; #define STATUS_BLOCK_PROD_INDEX_MASK 0xFFFFFF #define STATUS_BLOCK_PROD_INDEX_SHIFT 0 #define STATUS_BLOCK_ZERO_PAD3_MASK 0xFF #define STATUS_BLOCK_ZERO_PAD3_SHIFT 24 }; /* * Tdif context */ struct tdif_task_context { __le32 initialRefTag; __le16 appTagValue; __le16 appTagMask; __le16 partialCrcValueB; __le16 partialChecksumValueB; __le16 stateB; #define TDIF_TASK_CONTEXT_RECEIVEDDIFBYTESLEFTB_MASK 0xF #define TDIF_TASK_CONTEXT_RECEIVEDDIFBYTESLEFTB_SHIFT 0 #define TDIF_TASK_CONTEXT_TRANSMITEDDIFBYTESLEFTB_MASK 0xF #define TDIF_TASK_CONTEXT_TRANSMITEDDIFBYTESLEFTB_SHIFT 4 #define TDIF_TASK_CONTEXT_ERRORINIOB_MASK 0x1 #define TDIF_TASK_CONTEXT_ERRORINIOB_SHIFT 8 #define TDIF_TASK_CONTEXT_CHECKSUMOVERFLOW_MASK 0x1 #define TDIF_TASK_CONTEXT_CHECKSUMOVERFLOW_SHIFT 9 #define TDIF_TASK_CONTEXT_RESERVED0_MASK 0x3F #define TDIF_TASK_CONTEXT_RESERVED0_SHIFT 10 uint8_t reserved1; uint8_t flags0; #define TDIF_TASK_CONTEXT_IGNOREAPPTAG_MASK 0x1 #define TDIF_TASK_CONTEXT_IGNOREAPPTAG_SHIFT 0 #define TDIF_TASK_CONTEXT_INITIALREFTAGVALID_MASK 0x1 #define TDIF_TASK_CONTEXT_INITIALREFTAGVALID_SHIFT 1 #define TDIF_TASK_CONTEXT_HOSTGUARDTYPE_MASK 0x1 /* 0 = IP checksum, 1 = CRC */ #define TDIF_TASK_CONTEXT_HOSTGUARDTYPE_SHIFT 2 #define TDIF_TASK_CONTEXT_SETERRORWITHEOP_MASK 0x1 #define TDIF_TASK_CONTEXT_SETERRORWITHEOP_SHIFT 3 #define TDIF_TASK_CONTEXT_PROTECTIONTYPE_MASK 0x3 /* 1/2/3 - Protection Type */ #define TDIF_TASK_CONTEXT_PROTECTIONTYPE_SHIFT 4 #define TDIF_TASK_CONTEXT_CRC_SEED_MASK 0x1 /* 0=0x0000, 1=0xffff */ #define TDIF_TASK_CONTEXT_CRC_SEED_SHIFT 6 #define TDIF_TASK_CONTEXT_RESERVED2_MASK 0x1 #define TDIF_TASK_CONTEXT_RESERVED2_SHIFT 7 __le32 flags1; #define TDIF_TASK_CONTEXT_VALIDATEGUARD_MASK 0x1 #define TDIF_TASK_CONTEXT_VALIDATEGUARD_SHIFT 0 #define TDIF_TASK_CONTEXT_VALIDATEAPPTAG_MASK 0x1 #define TDIF_TASK_CONTEXT_VALIDATEAPPTAG_SHIFT 1 #define TDIF_TASK_CONTEXT_VALIDATEREFTAG_MASK 0x1 #define TDIF_TASK_CONTEXT_VALIDATEREFTAG_SHIFT 2 #define TDIF_TASK_CONTEXT_FORWARDGUARD_MASK 0x1 #define TDIF_TASK_CONTEXT_FORWARDGUARD_SHIFT 3 #define TDIF_TASK_CONTEXT_FORWARDAPPTAG_MASK 0x1 #define TDIF_TASK_CONTEXT_FORWARDAPPTAG_SHIFT 4 #define TDIF_TASK_CONTEXT_FORWARDREFTAG_MASK 0x1 #define TDIF_TASK_CONTEXT_FORWARDREFTAG_SHIFT 5 #define TDIF_TASK_CONTEXT_INTERVALSIZE_MASK 0x7 /* 0=512B, 1=1KB, 2=2KB, 3=4KB, 4=8KB */ #define TDIF_TASK_CONTEXT_INTERVALSIZE_SHIFT 6 #define TDIF_TASK_CONTEXT_HOSTINTERFACE_MASK 0x3 /* 0=None, 1=DIF, 2=DIX */ #define TDIF_TASK_CONTEXT_HOSTINTERFACE_SHIFT 9 #define TDIF_TASK_CONTEXT_DIFBEFOREDATA_MASK 0x1 /* DIF tag right at the beginning of DIF interval */ #define TDIF_TASK_CONTEXT_DIFBEFOREDATA_SHIFT 11 #define TDIF_TASK_CONTEXT_RESERVED3_MASK 0x1 /* reserved */ #define TDIF_TASK_CONTEXT_RESERVED3_SHIFT 12 #define 
TDIF_TASK_CONTEXT_NETWORKINTERFACE_MASK 0x1 /* 0=None, 1=DIF */ #define TDIF_TASK_CONTEXT_NETWORKINTERFACE_SHIFT 13 #define TDIF_TASK_CONTEXT_RECEIVEDDIFBYTESLEFTA_MASK 0xF #define TDIF_TASK_CONTEXT_RECEIVEDDIFBYTESLEFTA_SHIFT 14 #define TDIF_TASK_CONTEXT_TRANSMITEDDIFBYTESLEFTA_MASK 0xF #define TDIF_TASK_CONTEXT_TRANSMITEDDIFBYTESLEFTA_SHIFT 18 #define TDIF_TASK_CONTEXT_ERRORINIOA_MASK 0x1 #define TDIF_TASK_CONTEXT_ERRORINIOA_SHIFT 22 #define TDIF_TASK_CONTEXT_CHECKSUMOVERFLOWA_MASK 0x1 #define TDIF_TASK_CONTEXT_CHECKSUMOVERFLOWA_SHIFT 23 #define TDIF_TASK_CONTEXT_REFTAGMASK_MASK 0xF /* mask for refernce tag handling */ #define TDIF_TASK_CONTEXT_REFTAGMASK_SHIFT 24 #define TDIF_TASK_CONTEXT_FORWARDAPPTAGWITHMASK_MASK 0x1 /* Forward application tag with mask */ #define TDIF_TASK_CONTEXT_FORWARDAPPTAGWITHMASK_SHIFT 28 #define TDIF_TASK_CONTEXT_FORWARDREFTAGWITHMASK_MASK 0x1 /* Forward reference tag with mask */ #define TDIF_TASK_CONTEXT_FORWARDREFTAGWITHMASK_SHIFT 29 #define TDIF_TASK_CONTEXT_KEEPREFTAGCONST_MASK 0x1 /* Keep reference tag constant */ #define TDIF_TASK_CONTEXT_KEEPREFTAGCONST_SHIFT 30 #define TDIF_TASK_CONTEXT_RESERVED4_MASK 0x1 #define TDIF_TASK_CONTEXT_RESERVED4_SHIFT 31 __le32 offsetInIOB; __le16 partialCrcValueA; __le16 partialChecksumValueA; __le32 offsetInIOA; uint8_t partialDifDataA[8]; uint8_t partialDifDataB[8]; }; /* * Timers context */ struct timers_context { __le32 logical_client_0; #define TIMERS_CONTEXT_EXPIRATIONTIMELC0_MASK 0xFFFFFFF /* Expiration time of logical client 0 */ #define TIMERS_CONTEXT_EXPIRATIONTIMELC0_SHIFT 0 #define TIMERS_CONTEXT_VALIDLC0_MASK 0x1 /* Valid bit of logical client 0 */ #define TIMERS_CONTEXT_VALIDLC0_SHIFT 28 #define TIMERS_CONTEXT_ACTIVELC0_MASK 0x1 /* Active bit of logical client 0 */ #define TIMERS_CONTEXT_ACTIVELC0_SHIFT 29 #define TIMERS_CONTEXT_RESERVED0_MASK 0x3 #define TIMERS_CONTEXT_RESERVED0_SHIFT 30 __le32 logical_client_1; #define TIMERS_CONTEXT_EXPIRATIONTIMELC1_MASK 0xFFFFFFF /* Expiration time of logical client 1 */ #define TIMERS_CONTEXT_EXPIRATIONTIMELC1_SHIFT 0 #define TIMERS_CONTEXT_VALIDLC1_MASK 0x1 /* Valid bit of logical client 1 */ #define TIMERS_CONTEXT_VALIDLC1_SHIFT 28 #define TIMERS_CONTEXT_ACTIVELC1_MASK 0x1 /* Active bit of logical client 1 */ #define TIMERS_CONTEXT_ACTIVELC1_SHIFT 29 #define TIMERS_CONTEXT_RESERVED1_MASK 0x3 #define TIMERS_CONTEXT_RESERVED1_SHIFT 30 __le32 logical_client_2; #define TIMERS_CONTEXT_EXPIRATIONTIMELC2_MASK 0xFFFFFFF /* Expiration time of logical client 2 */ #define TIMERS_CONTEXT_EXPIRATIONTIMELC2_SHIFT 0 #define TIMERS_CONTEXT_VALIDLC2_MASK 0x1 /* Valid bit of logical client 2 */ #define TIMERS_CONTEXT_VALIDLC2_SHIFT 28 #define TIMERS_CONTEXT_ACTIVELC2_MASK 0x1 /* Active bit of logical client 2 */ #define TIMERS_CONTEXT_ACTIVELC2_SHIFT 29 #define TIMERS_CONTEXT_RESERVED2_MASK 0x3 #define TIMERS_CONTEXT_RESERVED2_SHIFT 30 __le32 host_expiration_fields; #define TIMERS_CONTEXT_HOSTEXPRIRATIONVALUE_MASK 0xFFFFFFF /* Expiration time on host (closest one) */ #define TIMERS_CONTEXT_HOSTEXPRIRATIONVALUE_SHIFT 0 #define TIMERS_CONTEXT_HOSTEXPRIRATIONVALID_MASK 0x1 /* Valid bit of host expiration */ #define TIMERS_CONTEXT_HOSTEXPRIRATIONVALID_SHIFT 28 #define TIMERS_CONTEXT_RESERVED3_MASK 0x7 #define TIMERS_CONTEXT_RESERVED3_SHIFT 29 }; /* * Enum for next_protocol field of tunnel_parsing_flags */ enum tunnel_next_protocol { e_unknown=0, e_l2=1, e_ipv4=2, e_ipv6=3, MAX_TUNNEL_NEXT_PROTOCOL }; #endif /* __COMMON_HSI__ */ 
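/*
 * Editorial note, not part of the original vendor header: every multi-bit
 * field above follows one MASK/SHIFT convention - a field is written as
 * ((value & MASK) << SHIFT) and read back as ((value >> SHIFT) & MASK).
 * The guarded-out sketch below (never compiled) illustrates the convention
 * on parsing_and_err_flags, and works through the interrupt-coalescing
 * formula documented at coalescing_timeset,
 * timeout_ticks = TimeSet << (TimerRes + 1). The example_* helper names
 * are illustrative only and exist nowhere in the driver.
 */
#if 0 /* illustrative sketch only */
#include <endian.h>

static inline enum l4_protocol
example_l4_proto(const struct parsing_and_err_flags *pf)
{
	/* fields are little-endian on the wire; convert before shifting */
	uint16_t flags = le16toh(pf->flags);

	return (enum l4_protocol)
		((flags >> PARSING_AND_ERR_FLAGS_L4PROTOCOL_SHIFT) &
		 PARSING_AND_ERR_FLAGS_L4PROTOCOL_MASK);
}

static inline uint32_t
example_coal_timeout_ticks(const struct coalescing_timeset *ts,
			   uint8_t timer_res)
{
	uint8_t timeset = (ts->value >> COALESCING_TIMESET_TIMESET_SHIFT) &
			  COALESCING_TIMESET_TIMESET_MASK;

	/* e.g. timeset 0x18 with timer_res 1 gives 0x18 << 2 = 96 ticks */
	return (uint32_t)timeset << (timer_res + 1);
}
#endif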
rdma-core-56.1/providers/qedr/qelr.h000066400000000000000000000211721477342711600174210ustar00rootroot00000000000000/* * Copyright (c) 2015-2016 QLogic Corporation * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and /or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __QELR_H__ #define __QELR_H__ #include #include #include #include #include #include #include #include #define writel(b, p) (*(uint32_t *)(p) = (b)) #define writeq(b, p) (*(uint64_t *)(p) = (b)) #include "qelr_abi.h" #include "qelr_hsi.h" #include "qelr_chain.h" struct qelr_buf { void *addr; size_t len; /* a 64-bit uint is used as a preparation * for double layer pbl.
*/ }; #define IS_IWARP(_dev) (_dev->node_type == IBV_NODE_RNIC) #define IS_ROCE(_dev) (_dev->node_type == IBV_NODE_CA) struct qelr_device { struct verbs_device ibv_dev; }; enum qelr_dpm_flags { QELR_DPM_FLAGS_ENHANCED = (1 << 0), QELR_DPM_FLAGS_LEGACY = (1 << 1), QELR_DPM_FLAGS_EDPM_MODE = (1 << 2), }; #define QELR_MAX_SRQ_ID 4096 struct qelr_devctx { struct verbs_context ibv_ctx; FILE *dbg_fp; void *db_addr; uint64_t db_pa; struct qedr_user_db_rec db_rec_addr_dummy; uint32_t db_size; enum qelr_dpm_flags dpm_flags; uint32_t kernel_page_size; uint16_t ldpm_limit_size; uint16_t edpm_limit_size; uint8_t edpm_trans_size; uint32_t max_send_wr; uint32_t max_recv_wr; uint32_t max_srq_wr; uint32_t sges_per_send_wr; uint32_t sges_per_recv_wr; uint32_t sges_per_srq_wr; struct qelr_srq **srq_table; int max_cqes; }; struct qelr_pd { struct ibv_pd ibv_pd; uint32_t pd_id; }; struct qelr_mr { struct verbs_mr vmr; }; union db_prod64 { struct rdma_pwm_val32_data data; uint64_t raw; }; struct qelr_cq { struct ibv_cq ibv_cq; /* must be first */ struct qelr_chain chain; void *db_addr; union db_prod64 db; /* Doorbell recovery entry address */ void *db_rec_map; struct qedr_user_db_rec *db_rec_addr; uint8_t chain_toggle; union rdma_cqe *latest_cqe; union rdma_cqe *toggle_cqe; uint8_t arm_flags; }; enum qelr_qp_state { QELR_QPS_RST, QELR_QPS_INIT, QELR_QPS_RTR, QELR_QPS_RTS, QELR_QPS_SQD, QELR_QPS_ERR, QELR_QPS_SQE }; union db_prod32 { struct rdma_pwm_val16_data data; uint32_t raw; }; struct qelr_qp_hwq_info { /* WQE */ struct qelr_chain chain; uint8_t max_sges; /* WQ */ uint16_t prod; uint16_t wqe_cons; uint16_t cons; uint16_t max_wr; /* DB */ void *db; /* Doorbell address */ void *edpm_db; union db_prod32 db_data; /* Doorbell data */ /* Doorbell recovery entry address */ void *db_rec_map; struct qedr_user_db_rec *db_rec_addr; void *iwarp_db2; union db_prod32 iwarp_db2_data; uint16_t icid; }; struct qelr_rdma_ext { __be64 remote_va; __be32 remote_key; __be32 dma_length; }; struct qelr_xrceth { __be32 xrc_srq; }; /* rdma extension, invalidate / immediate data + padding, inline data... 
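QELR_MAX_DPM_PAYLOAD below sizes the worst case accordingly: the qelr_rdma_ext header, plus 8 bytes (sizeof(uint64_t)) covering immediate/invalidate data and padding, plus the largest inline payload the firmware accepts (ROCE_REQ_MAX_INLINE_DATA_SIZE).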
*/ #define QELR_MAX_DPM_PAYLOAD (sizeof(struct qelr_rdma_ext) + sizeof(uint64_t) +\ ROCE_REQ_MAX_INLINE_DATA_SIZE) struct qelr_dpm { uint8_t is_edpm; uint8_t is_ldpm; union { struct db_roce_dpm_data data; uint64_t raw; } msg; uint8_t payload[QELR_MAX_DPM_PAYLOAD]; uint32_t payload_size; uint32_t payload_offset; struct qelr_rdma_ext *rdma_ext; }; struct qelr_srq_hwq_info { uint32_t max_sges; uint32_t max_wr; struct qelr_chain chain; uint32_t wqe_prod; /* WQE prod index in HW ring */ uint32_t sge_prod; /* SGE prod index in HW ring */ uint32_t wr_prod_cnt; /* wr producer count */ uint32_t wr_cons_cnt; /* wr consumer count */ uint32_t num_elems; void *virt_prod_pair_addr; /* producer pair virtual address */ }; struct qelr_srq { struct verbs_srq verbs_srq; struct qelr_srq_hwq_info hw_srq; uint16_t srq_id; pthread_spinlock_t lock; bool is_xrc; }; enum qelr_qp_flags { QELR_QP_FLAG_SQ = 1 << 0, QELR_QP_FLAG_RQ = 1 << 1, }; struct qelr_qp { struct verbs_qp verbs_qp; struct ibv_qp *ibv_qp; pthread_spinlock_t q_lock; enum qelr_qp_state state; /* QP state */ uint8_t flags; struct qelr_qp_hwq_info sq; struct qelr_qp_hwq_info rq; struct { uint64_t wr_id; enum ibv_wc_opcode opcode; uint32_t bytes_len; uint8_t wqe_size; uint8_t signaled; } *wqe_wr_id; struct { uint64_t wr_id; uint8_t wqe_size; } *rqe_wr_id; uint8_t prev_wqe_size; uint32_t max_inline_data; uint32_t qp_id; int sq_sig_all; int atomic_supported; uint8_t edpm_disabled; uint8_t edpm_mode; struct qelr_srq *srq; }; static inline struct qelr_devctx *get_qelr_ctx(struct ibv_context *ibctx) { return container_of(ibctx, struct qelr_devctx, ibv_ctx.context); } static inline struct qelr_device *get_qelr_dev(struct ibv_device *ibdev) { return container_of(ibdev, struct qelr_device, ibv_dev.device); } static inline struct ibv_qp *get_ibv_qp(struct qelr_qp *qp) { return &qp->verbs_qp.qp; } static inline struct qelr_qp *get_qelr_qp(struct ibv_qp *ibqp) { struct verbs_qp *vqp = (struct verbs_qp *)ibqp; return container_of(vqp, struct qelr_qp, verbs_qp); } static inline struct qelr_pd *get_qelr_pd(struct ibv_pd *ibpd) { return container_of(ibpd, struct qelr_pd, ibv_pd); } static inline struct qelr_cq *get_qelr_cq(struct ibv_cq *ibcq) { return container_of(ibcq, struct qelr_cq, ibv_cq); } static inline struct qelr_srq *get_qelr_srq(struct ibv_srq *ibsrq) { struct verbs_srq *vsrq = (struct verbs_srq *)ibsrq; return container_of(vsrq, struct qelr_srq, verbs_srq); } static inline struct ibv_srq *get_ibv_srq(struct qelr_srq *srq) { return &srq->verbs_srq.srq; } #define SET_FIELD(value, name, flag) \ do { \ (value) &= ~(name ## _MASK << name ## _SHIFT); \ (value) |= ((flag) << (name ## _SHIFT)); \ } while (0) #define SET_FIELD2(value, name, flag) \ ((value) |= ((flag) << (name ## _SHIFT))) #define GET_FIELD(value, name) \ (((value) >> (name ## _SHIFT)) & name ## _MASK) #define ROCE_WQE_ELEM_SIZE sizeof(struct rdma_sq_sge) #define RDMA_WQE_BYTES (16) #define QELR_RESP_IMM (RDMA_CQE_RESPONDER_IMM_FLG_MASK << \ RDMA_CQE_RESPONDER_IMM_FLG_SHIFT) #define QELR_RESP_INV (RDMA_CQE_RESPONDER_INV_FLG_MASK << \ RDMA_CQE_RESPONDER_INV_FLG_SHIFT) #define QELR_RESP_RDMA (RDMA_CQE_RESPONDER_RDMA_FLG_MASK << \ RDMA_CQE_RESPONDER_RDMA_FLG_SHIFT) #define QELR_RESP_RDMA_IMM (QELR_RESP_IMM | QELR_RESP_RDMA) #define TYPEPTR_ADDR_SET(type_ptr, field, vaddr) \ do { \ (type_ptr)->field.hi = htole32(U64_HI(vaddr)); \ (type_ptr)->field.lo = htole32(U64_LO(vaddr)); \ } while (0) #define RQ_SGE_SET(sge, vaddr, vlength, vflags) \ do { \ TYPEPTR_ADDR_SET(sge, addr, vaddr); \ (sge)->length = 
htole32(vlength); \ (sge)->flags = htole32(vflags); \ } while (0) #define SRQ_HDR_SET(hdr, vwr_id, num_sge) \ do { \ TYPEPTR_ADDR_SET(hdr, wr_id, vwr_id); \ (hdr)->num_sges = num_sge; \ } while (0) #define SRQ_SGE_SET(sge, vaddr, vlength, vlkey) \ do { \ TYPEPTR_ADDR_SET(sge, addr, vaddr); \ (sge)->length = htole32(vlength); \ (sge)->l_key = htole32(vlkey); \ } while (0) #define U64_HI(val) ((uint32_t)(((uint64_t)(uintptr_t)(val)) >> 32)) #define U64_LO(val) ((uint32_t)(((uint64_t)(uintptr_t)(val)) & 0xffffffff)) #define HILO_U64(hi, lo) ((uintptr_t)((((uint64_t)(hi)) << 32) + (lo))) #define QELR_MAX_RQ_WQE_SIZE (RDMA_MAX_SGE_PER_RQ_WQE) #define QELR_MAX_SQ_WQE_SIZE (ROCE_REQ_MAX_SINGLE_SQ_WQE_SIZE / \ ROCE_WQE_ELEM_SIZE) #endif /* __QELR_H__ */ rdma-core-56.1/providers/qedr/qelr_abi.h000066400000000000000000000046261477342711600202410ustar00rootroot00000000000000/* * Copyright (c) 2015-2016 QLogic Corporation * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and /or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __QELR_ABI_H__ #define __QELR_ABI_H__ #include #include #include #define QELR_ABI_VERSION (8) DECLARE_DRV_CMD(qelr_alloc_pd, IB_USER_VERBS_CMD_ALLOC_PD, empty, qedr_alloc_pd_uresp); DECLARE_DRV_CMD(qelr_create_cq, IB_USER_VERBS_CMD_CREATE_CQ, qedr_create_cq_ureq, qedr_create_cq_uresp); DECLARE_DRV_CMD(qelr_create_qp, IB_USER_VERBS_CMD_CREATE_QP, qedr_create_qp_ureq, qedr_create_qp_uresp); DECLARE_DRV_CMD(qelr_alloc_context, IB_USER_VERBS_CMD_GET_CONTEXT, qedr_alloc_ucontext_req, qedr_alloc_ucontext_resp); DECLARE_DRV_CMD(qelr_reg_mr, IB_USER_VERBS_CMD_REG_MR, empty, empty); DECLARE_DRV_CMD(qelr_create_srq, IB_USER_VERBS_CMD_CREATE_SRQ, qedr_create_srq_ureq, qedr_create_srq_uresp); DECLARE_DRV_CMD(qelr_create_srq_ex, IB_USER_VERBS_CMD_CREATE_XSRQ, qedr_create_srq_ureq, qedr_create_srq_uresp); DECLARE_DRV_CMD(qelr_create_qp_ex, IB_USER_VERBS_EX_CMD_CREATE_QP, qedr_create_qp_ureq, qedr_create_qp_uresp); #endif /* __QELR_ABI_H__ */ rdma-core-56.1/providers/qedr/qelr_chain.c000066400000000000000000000063561477342711600205650ustar00rootroot00000000000000/* * Copyright (c) 2015-2016 QLogic Corporation * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and /or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include "qelr.h" void *qelr_chain_get_last_elem(struct qelr_chain *p_chain) { void *p_virt_addr = NULL; uint32_t size; if (!p_chain->first_addr) goto out; size = p_chain->elem_size * (p_chain->n_elems - 1); p_virt_addr = ((uint8_t *)p_chain->first_addr + size); out: return p_virt_addr; } void qelr_chain_reset(struct qelr_chain *p_chain) { p_chain->prod_idx = 0; p_chain->cons_idx = 0; p_chain->p_cons_elem = p_chain->first_addr; p_chain->p_prod_elem = p_chain->first_addr; } #define QELR_ANON_FD (-1) /* MAP_ANONYMOUS => file desc.= -1 */ #define QELR_ANON_OFFSET (0) /* MAP_ANONYMOUS => offset = d/c */ int qelr_chain_alloc(struct qelr_chain *chain, int chain_size, int page_size, uint16_t elem_size) { int ret, a_chain_size; void *addr; /* alloc aligned page aligned chain */ a_chain_size = (chain_size + page_size - 1) & ~(page_size - 1); addr = mmap(NULL, a_chain_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, QELR_ANON_FD, QELR_ANON_OFFSET); if (addr == MAP_FAILED) return errno; ret = ibv_dontfork_range(addr, a_chain_size); if (ret) { munmap(addr, a_chain_size); return ret; } /* init chain */ memset(chain, 0, sizeof(*chain)); chain->first_addr = addr; chain->size = a_chain_size; chain->p_cons_elem = chain->first_addr; chain->p_prod_elem = chain->first_addr; chain->elem_size = elem_size; chain->n_elems = chain->size / elem_size; chain->last_addr = (void *) ((uint8_t *)addr + (elem_size * (chain->n_elems -1))); /* Note: since we are using MAP_ANONYMOUS the chain is zeroed for us */ return 0; } void qelr_chain_free(struct qelr_chain *chain) { if (chain->size) { ibv_dofork_range(chain->first_addr, chain->size); munmap(chain->first_addr, chain->size); } } rdma-core-56.1/providers/qedr/qelr_chain.h000066400000000000000000000107441477342711600205660ustar00rootroot00000000000000/* * Copyright (c) 2015-2016 QLogic Corporation * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and /or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __QELR_CHAIN_H__ #define __QELR_CHAIN_H__ #include #include struct qelr_chain { void *first_addr; /* Address of first element in chain */ void *last_addr; /* Address of last element in chain */ /* Point to next element to produce/consume */ void *p_prod_elem; void *p_cons_elem; uint32_t prod_idx; uint32_t cons_idx; uint32_t n_elems; uint32_t size; uint16_t elem_size; }; /* fast path functions are inline */ static inline uint32_t qelr_chain_get_cons_idx_u32(struct qelr_chain *p_chain) { return p_chain->cons_idx; } static inline void *qelr_chain_produce(struct qelr_chain *p_chain) { void *p_ret = NULL; p_chain->prod_idx++; p_ret = p_chain->p_prod_elem; if (p_chain->p_prod_elem == p_chain->last_addr) p_chain->p_prod_elem = p_chain->first_addr; else p_chain->p_prod_elem = (void *)(((uint8_t *)p_chain->p_prod_elem) + p_chain->elem_size); return p_ret; } static inline void *qelr_chain_produce_n(struct qelr_chain *p_chain, int n) { void *p_ret = NULL; int n_wrap; p_chain->prod_idx += n; p_ret = p_chain->p_prod_elem; n_wrap = p_chain->prod_idx % p_chain->n_elems; if (n_wrap < n) p_chain->p_prod_elem = (void *) (((uint8_t *)p_chain->first_addr) + (p_chain->elem_size * n_wrap)); else p_chain->p_prod_elem = (void *)(((uint8_t *)p_chain->p_prod_elem) + (p_chain->elem_size * n)); return p_ret; } static inline void *qelr_chain_consume(struct qelr_chain *p_chain) { void *p_ret = NULL; p_chain->cons_idx++; p_ret = p_chain->p_cons_elem; if (p_chain->p_cons_elem == p_chain->last_addr) p_chain->p_cons_elem = p_chain->first_addr; else p_chain->p_cons_elem = (void *) (((uint8_t *)p_chain->p_cons_elem) + p_chain->elem_size); return p_ret; } static inline void *qelr_chain_consume_n(struct qelr_chain *p_chain, int n) { void *p_ret = NULL; int n_wrap; p_chain->cons_idx += n; p_ret = p_chain->p_cons_elem; n_wrap = p_chain->cons_idx % p_chain->n_elems; if (n_wrap < n) p_chain->p_cons_elem = (void *) (((uint8_t *)p_chain->first_addr) + (p_chain->elem_size * n_wrap)); else p_chain->p_cons_elem = (void *)(((uint8_t *)p_chain->p_cons_elem) + (p_chain->elem_size * n)); return p_ret; } static inline uint32_t qelr_chain_get_elem_left_u32(struct qelr_chain *p_chain) { uint32_t used; used = (uint32_t)(((uint64_t)((uint64_t) ~0U) + 1 + (uint64_t)(p_chain->prod_idx)) - (uint64_t)p_chain->cons_idx); return p_chain->n_elems - used; } static inline 
uint8_t qelr_chain_is_full(struct qelr_chain *p_chain) { return qelr_chain_get_elem_left_u32(p_chain) == p_chain->n_elems; } static inline void qelr_chain_set_prod( struct qelr_chain *p_chain, uint32_t prod_idx, void *p_prod_elem) { p_chain->prod_idx = prod_idx; p_chain->p_prod_elem = p_prod_elem; } void *qelr_chain_get_last_elem(struct qelr_chain *p_chain); void qelr_chain_reset(struct qelr_chain *p_chain); int qelr_chain_alloc(struct qelr_chain *chain, int chain_size, int page_size, uint16_t elem_size); void qelr_chain_free(struct qelr_chain *buf); #endif /* __QELR_CHAIN_H__ */ rdma-core-56.1/providers/qedr/qelr_hsi.h000066400000000000000000000051361477342711600202660ustar00rootroot00000000000000/* * Copyright (c) 2015-2016 QLogic Corporation * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and /or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __QED_HSI_ROCE__ #define __QED_HSI_ROCE__ /********************************/ /* Add include to common target */ /********************************/ #include "common_hsi.h" /************************************************************************/ /* Add include to common roce target for both eCore and protocol roce driver */ /************************************************************************/ #include "roce_common.h" /************************************************************************/ /* Add include to qed hsi rdma target for both roce and iwarp qed driver */ /************************************************************************/ #include "qelr_hsi_rdma.h" /* Affiliated asynchronous events / errors enumeration */ enum roce_async_events_type { ROCE_ASYNC_EVENT_NONE, ROCE_ASYNC_EVENT_COMM_EST, ROCE_ASYNC_EVENT_SQ_DRAINED, ROCE_ASYNC_EVENT_SRQ_LIMIT, ROCE_ASYNC_EVENT_LAST_WQE_REACHED, ROCE_ASYNC_EVENT_CQ_ERR, ROCE_ASYNC_EVENT_LOCAL_INVALID_REQUEST_ERR, ROCE_ASYNC_EVENT_LOCAL_CATASTROPHIC_ERR, ROCE_ASYNC_EVENT_LOCAL_ACCESS_ERR, ROCE_ASYNC_EVENT_QP_CATASTROPHIC_ERR, ROCE_ASYNC_EVENT_CQ_OVERFLOW_ERR, ROCE_ASYNC_EVENT_SRQ_EMPTY, MAX_ROCE_ASYNC_EVENTS_TYPE }; #endif /* __QED_HSI_ROCE__ */ rdma-core-56.1/providers/qedr/qelr_hsi_rdma.h000066400000000000000000001167331477342711600212770ustar00rootroot00000000000000/* * Copyright (c) 2015-2016 QLogic Corporation * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and /or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __QED_HSI_RDMA__ #define __QED_HSI_RDMA__ #include "common_hsi.h" #include "rdma_common.h" /* * rdma completion notification queue element */ struct rdma_cnqe { struct regpair cq_handle; }; struct rdma_cqe_responder { struct regpair srq_wr_id; struct regpair qp_handle; __le32 imm_data_or_inv_r_Key /* immediate data in case imm_flg is set, or invalidated r_key in case inv_flg is set */; __le32 length; __le32 imm_data_hi /* High bytes of immediate data in case imm_flg is set in iWARP only */; __le16 rq_cons_or_srq_id;/* When type is RDMA_CQE_TYPE_RESPONDER_RQ and status is * WORK_REQUEST_FLUSHED_ERR it indicates an aggregative * flush on all posted RQ WQEs until the reported rq_cons. * When type is RDMA_CQE_TYPE_RESPONDER_XRC_SRQ it is the srq_id */ uint8_t flags; #define RDMA_CQE_RESPONDER_TOGGLE_BIT_MASK 0x1 /* indicates a valid completion written by FW. FW toggle this bit each time it finishes producing all PBL entries */ #define RDMA_CQE_RESPONDER_TOGGLE_BIT_SHIFT 0 #define RDMA_CQE_RESPONDER_TYPE_MASK 0x3 /* (use enum rdma_cqe_type) */ #define RDMA_CQE_RESPONDER_TYPE_SHIFT 1 #define RDMA_CQE_RESPONDER_INV_FLG_MASK 0x1 /* r_key invalidated indicator */ #define RDMA_CQE_RESPONDER_INV_FLG_SHIFT 3 #define RDMA_CQE_RESPONDER_IMM_FLG_MASK 0x1 /* immediate data indicator */ #define RDMA_CQE_RESPONDER_IMM_FLG_SHIFT 4 #define RDMA_CQE_RESPONDER_RDMA_FLG_MASK 0x1 /* 1=this CQE relates to an RDMA Write. 0=Send. */ #define RDMA_CQE_RESPONDER_RDMA_FLG_SHIFT 5 #define RDMA_CQE_RESPONDER_RESERVED2_MASK 0x3 #define RDMA_CQE_RESPONDER_RESERVED2_SHIFT 6 uint8_t status; }; struct rdma_cqe_requester { __le16 sq_cons; __le16 reserved0; __le32 reserved1; struct regpair qp_handle; struct regpair reserved2; __le32 reserved3; __le16 reserved4; uint8_t flags; #define RDMA_CQE_REQUESTER_TOGGLE_BIT_MASK 0x1 /* indicates a valid completion written by FW. 
FW toggle this bit each time it finishes producing all PBL entries */ #define RDMA_CQE_REQUESTER_TOGGLE_BIT_SHIFT 0 #define RDMA_CQE_REQUESTER_TYPE_MASK 0x3 /* (use enum rdma_cqe_type) */ #define RDMA_CQE_REQUESTER_TYPE_SHIFT 1 #define RDMA_CQE_REQUESTER_RESERVED5_MASK 0x1F #define RDMA_CQE_REQUESTER_RESERVED5_SHIFT 3 uint8_t status; }; struct rdma_cqe_common { struct regpair reserved0; struct regpair qp_handle; __le16 reserved1[7]; uint8_t flags; #define RDMA_CQE_COMMON_TOGGLE_BIT_MASK 0x1 /* indicates a valid completion written by FW. FW toggle this bit each time it finishes producing all PBL entries */ #define RDMA_CQE_COMMON_TOGGLE_BIT_SHIFT 0 #define RDMA_CQE_COMMON_TYPE_MASK 0x3 /* (use enum rdma_cqe_type) */ #define RDMA_CQE_COMMON_TYPE_SHIFT 1 #define RDMA_CQE_COMMON_RESERVED2_MASK 0x1F #define RDMA_CQE_COMMON_RESERVED2_SHIFT 3 uint8_t status; }; /* * rdma completion queue element */ union rdma_cqe { struct rdma_cqe_responder resp; struct rdma_cqe_requester req; struct rdma_cqe_common cmn; }; /* * CQE requester status enumeration */ enum rdma_cqe_requester_status_enum { RDMA_CQE_REQ_STS_OK, RDMA_CQE_REQ_STS_BAD_RESPONSE_ERR, RDMA_CQE_REQ_STS_LOCAL_LENGTH_ERR, RDMA_CQE_REQ_STS_LOCAL_QP_OPERATION_ERR, RDMA_CQE_REQ_STS_LOCAL_PROTECTION_ERR, RDMA_CQE_REQ_STS_MEMORY_MGT_OPERATION_ERR, RDMA_CQE_REQ_STS_REMOTE_INVALID_REQUEST_ERR, RDMA_CQE_REQ_STS_REMOTE_ACCESS_ERR, RDMA_CQE_REQ_STS_REMOTE_OPERATION_ERR, RDMA_CQE_REQ_STS_RNR_NAK_RETRY_CNT_ERR, RDMA_CQE_REQ_STS_TRANSPORT_RETRY_CNT_ERR, RDMA_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR, RDMA_CQE_REQ_STS_XRC_VIOLATION_ERR, MAX_RDMA_CQE_REQUESTER_STATUS_ENUM }; /* * CQE responder status enumeration */ enum rdma_cqe_responder_status_enum { RDMA_CQE_RESP_STS_OK, RDMA_CQE_RESP_STS_LOCAL_ACCESS_ERR, RDMA_CQE_RESP_STS_LOCAL_LENGTH_ERR, RDMA_CQE_RESP_STS_LOCAL_QP_OPERATION_ERR, RDMA_CQE_RESP_STS_LOCAL_PROTECTION_ERR, RDMA_CQE_RESP_STS_MEMORY_MGT_OPERATION_ERR, RDMA_CQE_RESP_STS_REMOTE_INVALID_REQUEST_ERR, RDMA_CQE_RESP_STS_WORK_REQUEST_FLUSHED_ERR, MAX_RDMA_CQE_RESPONDER_STATUS_ENUM }; /* * CQE type enumeration */ enum rdma_cqe_type { RDMA_CQE_TYPE_REQUESTER, RDMA_CQE_TYPE_RESPONDER_RQ, RDMA_CQE_TYPE_RESPONDER_SRQ, RDMA_CQE_TYPE_RESPONDER_XRC_SRQ, RDMA_CQE_TYPE_INVALID, MAX_RDMA_CQE_TYPE }; /* * DIF Block size options */ enum rdma_dif_block_size { RDMA_DIF_BLOCK_512=0, RDMA_DIF_BLOCK_4096=1, MAX_RDMA_DIF_BLOCK_SIZE }; /* * DIF CRC initial value */ enum rdma_dif_crc_seed { RDMA_DIF_CRC_SEED_0000=0, RDMA_DIF_CRC_SEED_FFFF=1, MAX_RDMA_DIF_CRC_SEED }; /* * RDMA DIF Error Result Structure */ struct rdma_dif_error_result { __le32 error_intervals /* Total number of error intervals in the IO. */; __le32 dif_error_1st_interval /* Number of the first interval that contained error. Set to 0xFFFFFFFF if error occurred in the Runt Block. */; uint8_t flags; #define RDMA_DIF_ERROR_RESULT_DIF_ERROR_TYPE_CRC_MASK 0x1 /* CRC error occurred. */ #define RDMA_DIF_ERROR_RESULT_DIF_ERROR_TYPE_CRC_SHIFT 0 #define RDMA_DIF_ERROR_RESULT_DIF_ERROR_TYPE_APP_TAG_MASK 0x1 /* App Tag error occurred. */ #define RDMA_DIF_ERROR_RESULT_DIF_ERROR_TYPE_APP_TAG_SHIFT 1 #define RDMA_DIF_ERROR_RESULT_DIF_ERROR_TYPE_REF_TAG_MASK 0x1 /* Ref Tag error occurred. */ #define RDMA_DIF_ERROR_RESULT_DIF_ERROR_TYPE_REF_TAG_SHIFT 2 #define RDMA_DIF_ERROR_RESULT_RESERVED0_MASK 0xF #define RDMA_DIF_ERROR_RESULT_RESERVED0_SHIFT 3 #define RDMA_DIF_ERROR_RESULT_TOGGLE_BIT_MASK 0x1 /* Used to indicate the structure is valid. Toggles each time an invalidate region is performed. 
*/ #define RDMA_DIF_ERROR_RESULT_TOGGLE_BIT_SHIFT 7 uint8_t reserved1[55] /* Pad to 64 bytes to ensure efficient word line writing. */; }; /* * DIF IO direction */ enum rdma_dif_io_direction_flg { RDMA_DIF_DIR_RX=0, RDMA_DIF_DIR_TX=1, MAX_RDMA_DIF_IO_DIRECTION_FLG }; /* * RDMA DIF Runt Result Structure */ struct rdma_dif_runt_result { __le16 guard_tag /* CRC result of received IO. */; __le16 reserved[3]; }; /* * memory window type enumeration */ enum rdma_mw_type { RDMA_MW_TYPE_1, RDMA_MW_TYPE_2A, MAX_RDMA_MW_TYPE }; struct rdma_rq_sge { struct regpair addr; __le32 length; __le32 flags; #define RDMA_RQ_SGE_L_KEY_MASK 0x3FFFFFF /* key of memory relating to this RQ */ #define RDMA_RQ_SGE_L_KEY_SHIFT 0 #define RDMA_RQ_SGE_NUM_SGES_MASK 0x7 /* first SGE - number of SGEs in this RQ WQE. Other SGEs - should be set to 0 */ #define RDMA_RQ_SGE_NUM_SGES_SHIFT 26 #define RDMA_RQ_SGE_RESERVED0_MASK 0x7 #define RDMA_RQ_SGE_RESERVED0_SHIFT 29 }; struct rdma_sq_atomic_wqe { __le32 reserved1; __le32 length /* Total data length (8 bytes for Atomic) */; __le32 xrc_srq /* Valid only when XRC is set for the QP */; uint8_t req_type /* Type of WQE */; uint8_t flags; #define RDMA_SQ_ATOMIC_WQE_COMP_FLG_MASK 0x1 /* If set, completion will be generated when the WQE is completed */ #define RDMA_SQ_ATOMIC_WQE_COMP_FLG_SHIFT 0 #define RDMA_SQ_ATOMIC_WQE_RD_FENCE_FLG_MASK 0x1 /* If set, all pending RDMA read or Atomic operations will be completed before start processing this WQE */ #define RDMA_SQ_ATOMIC_WQE_RD_FENCE_FLG_SHIFT 1 #define RDMA_SQ_ATOMIC_WQE_INV_FENCE_FLG_MASK 0x1 /* If set, all pending operations will be completed before start processing this WQE */ #define RDMA_SQ_ATOMIC_WQE_INV_FENCE_FLG_SHIFT 2 #define RDMA_SQ_ATOMIC_WQE_SE_FLG_MASK 0x1 /* Don't care for atomic wqe */ #define RDMA_SQ_ATOMIC_WQE_SE_FLG_SHIFT 3 #define RDMA_SQ_ATOMIC_WQE_INLINE_FLG_MASK 0x1 /* Should be 0 for atomic wqe */ #define RDMA_SQ_ATOMIC_WQE_INLINE_FLG_SHIFT 4 #define RDMA_SQ_ATOMIC_WQE_DIF_ON_HOST_FLG_MASK 0x1 /* Should be 0 for atomic wqe */ #define RDMA_SQ_ATOMIC_WQE_DIF_ON_HOST_FLG_SHIFT 5 #define RDMA_SQ_ATOMIC_WQE_RESERVED0_MASK 0x3 #define RDMA_SQ_ATOMIC_WQE_RESERVED0_SHIFT 6 uint8_t wqe_size /* Size of WQE in 16B chunks including SGE */; uint8_t prev_wqe_size /* Previous WQE size in 16B chunks */; struct regpair remote_va /* remote virtual address */; __le32 r_key /* Remote key */; __le32 reserved2; struct regpair cmp_data /* Data to compare in case of ATOMIC_CMP_AND_SWAP */; struct regpair swap_data /* Swap or add data */; }; /* * First element (16 bytes) of atomic wqe */ struct rdma_sq_atomic_wqe_1st { __le32 reserved1; __le32 length /* Total data length (8 bytes for Atomic) */; __le32 xrc_srq /* Valid only when XRC is set for the QP */; uint8_t req_type /* Type of WQE */; uint8_t flags; #define RDMA_SQ_ATOMIC_WQE_1ST_COMP_FLG_MASK 0x1 /* If set, completion will be generated when the WQE is completed */ #define RDMA_SQ_ATOMIC_WQE_1ST_COMP_FLG_SHIFT 0 #define RDMA_SQ_ATOMIC_WQE_1ST_RD_FENCE_FLG_MASK 0x1 /* If set, all pending RDMA read or Atomic operations will be completed before start processing this WQE */ #define RDMA_SQ_ATOMIC_WQE_1ST_RD_FENCE_FLG_SHIFT 1 #define RDMA_SQ_ATOMIC_WQE_1ST_INV_FENCE_FLG_MASK 0x1 /* If set, all pending operations will be completed before start processing this WQE */ #define RDMA_SQ_ATOMIC_WQE_1ST_INV_FENCE_FLG_SHIFT 2 #define RDMA_SQ_ATOMIC_WQE_1ST_SE_FLG_MASK 0x1 /* Don't care for atomic wqe */ #define RDMA_SQ_ATOMIC_WQE_1ST_SE_FLG_SHIFT 3 #define 
RDMA_SQ_ATOMIC_WQE_1ST_INLINE_FLG_MASK 0x1 /* Should be 0 for atomic wqe */ #define RDMA_SQ_ATOMIC_WQE_1ST_INLINE_FLG_SHIFT 4 #define RDMA_SQ_ATOMIC_WQE_1ST_RESERVED0_MASK 0x7 #define RDMA_SQ_ATOMIC_WQE_1ST_RESERVED0_SHIFT 5 uint8_t wqe_size /* Size of WQE in 16B chunks including all SGEs. Set to number of SGEs + 1. */; uint8_t prev_wqe_size /* Previous WQE size in 16B chunks */; }; /* * Second element (16 bytes) of atomic wqe */ struct rdma_sq_atomic_wqe_2nd { struct regpair remote_va /* remote virtual address */; __le32 r_key /* Remote key */; __le32 reserved2; }; /* * Third element (16 bytes) of atomic wqe */ struct rdma_sq_atomic_wqe_3rd { struct regpair cmp_data /* Data to compare in case of ATOMIC_CMP_AND_SWAP */; struct regpair swap_data /* Swap or add data */; }; struct rdma_sq_bind_wqe { struct regpair addr; __le32 l_key; uint8_t req_type /* Type of WQE */; uint8_t flags; #define RDMA_SQ_BIND_WQE_COMP_FLG_MASK 0x1 /* If set, completion will be generated when the WQE is completed */ #define RDMA_SQ_BIND_WQE_COMP_FLG_SHIFT 0 #define RDMA_SQ_BIND_WQE_RD_FENCE_FLG_MASK 0x1 /* If set, all pending RDMA read or Atomic operations will be completed before start processing this WQE */ #define RDMA_SQ_BIND_WQE_RD_FENCE_FLG_SHIFT 1 #define RDMA_SQ_BIND_WQE_INV_FENCE_FLG_MASK 0x1 /* If set, all pending operations will be completed before start processing this WQE */ #define RDMA_SQ_BIND_WQE_INV_FENCE_FLG_SHIFT 2 #define RDMA_SQ_BIND_WQE_SE_FLG_MASK 0x1 /* Don't care for bind wqe */ #define RDMA_SQ_BIND_WQE_SE_FLG_SHIFT 3 #define RDMA_SQ_BIND_WQE_INLINE_FLG_MASK 0x1 /* Should be 0 for bind wqe */ #define RDMA_SQ_BIND_WQE_INLINE_FLG_SHIFT 4 #define RDMA_SQ_BIND_WQE_RESERVED0_MASK 0x7 #define RDMA_SQ_BIND_WQE_RESERVED0_SHIFT 5 uint8_t wqe_size /* Size of WQE in 16B chunks */; uint8_t prev_wqe_size /* Previous WQE size in 16B chunks */; uint8_t bind_ctrl; #define RDMA_SQ_BIND_WQE_ZERO_BASED_MASK 0x1 /* zero based indication */ #define RDMA_SQ_BIND_WQE_ZERO_BASED_SHIFT 0 #define RDMA_SQ_BIND_WQE_MW_TYPE_MASK 0x1 /* (use enum rdma_mw_type) */ #define RDMA_SQ_BIND_WQE_MW_TYPE_SHIFT 1 #define RDMA_SQ_BIND_WQE_RESERVED1_MASK 0x3F #define RDMA_SQ_BIND_WQE_RESERVED1_SHIFT 2 uint8_t access_ctrl; #define RDMA_SQ_BIND_WQE_REMOTE_READ_MASK 0x1 #define RDMA_SQ_BIND_WQE_REMOTE_READ_SHIFT 0 #define RDMA_SQ_BIND_WQE_REMOTE_WRITE_MASK 0x1 #define RDMA_SQ_BIND_WQE_REMOTE_WRITE_SHIFT 1 #define RDMA_SQ_BIND_WQE_ENABLE_ATOMIC_MASK 0x1 #define RDMA_SQ_BIND_WQE_ENABLE_ATOMIC_SHIFT 2 #define RDMA_SQ_BIND_WQE_LOCAL_READ_MASK 0x1 #define RDMA_SQ_BIND_WQE_LOCAL_READ_SHIFT 3 #define RDMA_SQ_BIND_WQE_LOCAL_WRITE_MASK 0x1 #define RDMA_SQ_BIND_WQE_LOCAL_WRITE_SHIFT 4 #define RDMA_SQ_BIND_WQE_RESERVED2_MASK 0x7 #define RDMA_SQ_BIND_WQE_RESERVED2_SHIFT 5 uint8_t reserved3; uint8_t length_hi /* upper 8 bits of the registered MW length */; __le32 length_lo /* lower 32 bits of the registered MW length */; __le32 parent_l_key /* l_key of the parent MR */; __le32 reserved4; }; /* * First element (16 bytes) of bind wqe */ struct rdma_sq_bind_wqe_1st { struct regpair addr; __le32 l_key; uint8_t req_type /* Type of WQE */; uint8_t flags; #define RDMA_SQ_BIND_WQE_1ST_COMP_FLG_MASK 0x1 /* If set, completion will be generated when the WQE is completed */ #define RDMA_SQ_BIND_WQE_1ST_COMP_FLG_SHIFT 0 #define RDMA_SQ_BIND_WQE_1ST_RD_FENCE_FLG_MASK 0x1 /* If set, all pending RDMA read or Atomic operations will be completed before start processing this WQE */ #define RDMA_SQ_BIND_WQE_1ST_RD_FENCE_FLG_SHIFT 1 #define 
RDMA_SQ_BIND_WQE_1ST_INV_FENCE_FLG_MASK 0x1 /* If set, all pending operations will be completed before start processing this WQE */ #define RDMA_SQ_BIND_WQE_1ST_INV_FENCE_FLG_SHIFT 2 #define RDMA_SQ_BIND_WQE_1ST_SE_FLG_MASK 0x1 /* Don't care for bind wqe */ #define RDMA_SQ_BIND_WQE_1ST_SE_FLG_SHIFT 3 #define RDMA_SQ_BIND_WQE_1ST_INLINE_FLG_MASK 0x1 /* Should be 0 for bind wqe */ #define RDMA_SQ_BIND_WQE_1ST_INLINE_FLG_SHIFT 4 #define RDMA_SQ_BIND_WQE_1ST_RESERVED0_MASK 0x7 #define RDMA_SQ_BIND_WQE_1ST_RESERVED0_SHIFT 5 uint8_t wqe_size /* Size of WQE in 16B chunks */; uint8_t prev_wqe_size /* Previous WQE size in 16B chunks */; }; /* * Second element (16 bytes) of bind wqe */ struct rdma_sq_bind_wqe_2nd { uint8_t bind_ctrl; #define RDMA_SQ_BIND_WQE_2ND_ZERO_BASED_MASK 0x1 /* zero based indication */ #define RDMA_SQ_BIND_WQE_2ND_ZERO_BASED_SHIFT 0 #define RDMA_SQ_BIND_WQE_2ND_MW_TYPE_MASK 0x1 /* (use enum rdma_mw_type) */ #define RDMA_SQ_BIND_WQE_2ND_MW_TYPE_SHIFT 1 #define RDMA_SQ_BIND_WQE_2ND_RESERVED1_MASK 0x3F #define RDMA_SQ_BIND_WQE_2ND_RESERVED1_SHIFT 2 uint8_t access_ctrl; #define RDMA_SQ_BIND_WQE_2ND_REMOTE_READ_MASK 0x1 #define RDMA_SQ_BIND_WQE_2ND_REMOTE_READ_SHIFT 0 #define RDMA_SQ_BIND_WQE_2ND_REMOTE_WRITE_MASK 0x1 #define RDMA_SQ_BIND_WQE_2ND_REMOTE_WRITE_SHIFT 1 #define RDMA_SQ_BIND_WQE_2ND_ENABLE_ATOMIC_MASK 0x1 #define RDMA_SQ_BIND_WQE_2ND_ENABLE_ATOMIC_SHIFT 2 #define RDMA_SQ_BIND_WQE_2ND_LOCAL_READ_MASK 0x1 #define RDMA_SQ_BIND_WQE_2ND_LOCAL_READ_SHIFT 3 #define RDMA_SQ_BIND_WQE_2ND_LOCAL_WRITE_MASK 0x1 #define RDMA_SQ_BIND_WQE_2ND_LOCAL_WRITE_SHIFT 4 #define RDMA_SQ_BIND_WQE_2ND_RESERVED2_MASK 0x7 #define RDMA_SQ_BIND_WQE_2ND_RESERVED2_SHIFT 5 uint8_t reserved3; uint8_t length_hi /* upper 8 bits of the registered MW length */; __le32 length_lo /* lower 32 bits of the registered MW length */; __le32 parent_l_key /* l_key of the parent MR */; __le32 reserved4; }; /* * Structure with only the SQ WQE common fields. Size is of one SQ element (16B) */ struct rdma_sq_common_wqe { __le32 reserved1[3]; uint8_t req_type /* Type of WQE */; uint8_t flags; #define RDMA_SQ_COMMON_WQE_COMP_FLG_MASK 0x1 /* If set, completion will be generated when the WQE is completed */ #define RDMA_SQ_COMMON_WQE_COMP_FLG_SHIFT 0 #define RDMA_SQ_COMMON_WQE_RD_FENCE_FLG_MASK 0x1 /* If set, all pending RDMA read or Atomic operations will be completed before start processing this WQE */ #define RDMA_SQ_COMMON_WQE_RD_FENCE_FLG_SHIFT 1 #define RDMA_SQ_COMMON_WQE_INV_FENCE_FLG_MASK 0x1 /* If set, all pending operations will be completed before start processing this WQE */ #define RDMA_SQ_COMMON_WQE_INV_FENCE_FLG_SHIFT 2 #define RDMA_SQ_COMMON_WQE_SE_FLG_MASK 0x1 /* If set, signal the responder to generate a solicited event on this WQE (only relevant in SENDs and RDMA write with Imm) */ #define RDMA_SQ_COMMON_WQE_SE_FLG_SHIFT 3 #define RDMA_SQ_COMMON_WQE_INLINE_FLG_MASK 0x1 /* if set, indicates inline data is following this WQE instead of SGEs (only relevant in SENDs and RDMA writes) */ #define RDMA_SQ_COMMON_WQE_INLINE_FLG_SHIFT 4 #define RDMA_SQ_COMMON_WQE_RESERVED0_MASK 0x7 #define RDMA_SQ_COMMON_WQE_RESERVED0_SHIFT 5 uint8_t wqe_size /* Size of WQE in 16B chunks including all SGEs or inline data. In case there are SGEs: set to number of SGEs + 1. In case of inline data: set to the whole number of 16B which contain the inline data + 1. 
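For example, 20 bytes of inline data occupy two 16B chunks, so wqe_size is set to 2 + 1 = 3.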
*/; uint8_t prev_wqe_size /* Previous WQE size in 16B chunks */; }; struct rdma_sq_fmr_wqe { struct regpair addr; __le32 l_key; uint8_t req_type /* Type of WQE */; uint8_t flags; #define RDMA_SQ_FMR_WQE_COMP_FLG_MASK 0x1 /* If set, completion will be generated when the WQE is completed */ #define RDMA_SQ_FMR_WQE_COMP_FLG_SHIFT 0 #define RDMA_SQ_FMR_WQE_RD_FENCE_FLG_MASK 0x1 /* If set, all pending RDMA read or Atomic operations will be completed before start processing this WQE */ #define RDMA_SQ_FMR_WQE_RD_FENCE_FLG_SHIFT 1 #define RDMA_SQ_FMR_WQE_INV_FENCE_FLG_MASK 0x1 /* If set, all pending operations will be completed before start processing this WQE */ #define RDMA_SQ_FMR_WQE_INV_FENCE_FLG_SHIFT 2 #define RDMA_SQ_FMR_WQE_SE_FLG_MASK 0x1 /* Don't care for FMR wqe */ #define RDMA_SQ_FMR_WQE_SE_FLG_SHIFT 3 #define RDMA_SQ_FMR_WQE_INLINE_FLG_MASK 0x1 /* Should be 0 for FMR wqe */ #define RDMA_SQ_FMR_WQE_INLINE_FLG_SHIFT 4 #define RDMA_SQ_FMR_WQE_DIF_ON_HOST_FLG_MASK 0x1 /* If set, indicated host memory of this WQE is DIF protected. */ #define RDMA_SQ_FMR_WQE_DIF_ON_HOST_FLG_SHIFT 5 #define RDMA_SQ_FMR_WQE_RESERVED0_MASK 0x3 #define RDMA_SQ_FMR_WQE_RESERVED0_SHIFT 6 uint8_t wqe_size /* Size of WQE in 16B chunks */; uint8_t prev_wqe_size /* Previous WQE size in 16B chunks */; uint8_t fmr_ctrl; #define RDMA_SQ_FMR_WQE_PAGE_SIZE_LOG_MASK 0x1F /* 0 is 4k, 1 is 8k... */ #define RDMA_SQ_FMR_WQE_PAGE_SIZE_LOG_SHIFT 0 #define RDMA_SQ_FMR_WQE_ZERO_BASED_MASK 0x1 /* zero based indication */ #define RDMA_SQ_FMR_WQE_ZERO_BASED_SHIFT 5 #define RDMA_SQ_FMR_WQE_BIND_EN_MASK 0x1 /* indication whether bind is enabled for this MR */ #define RDMA_SQ_FMR_WQE_BIND_EN_SHIFT 6 #define RDMA_SQ_FMR_WQE_RESERVED1_MASK 0x1 #define RDMA_SQ_FMR_WQE_RESERVED1_SHIFT 7 uint8_t access_ctrl; #define RDMA_SQ_FMR_WQE_REMOTE_READ_MASK 0x1 #define RDMA_SQ_FMR_WQE_REMOTE_READ_SHIFT 0 #define RDMA_SQ_FMR_WQE_REMOTE_WRITE_MASK 0x1 #define RDMA_SQ_FMR_WQE_REMOTE_WRITE_SHIFT 1 #define RDMA_SQ_FMR_WQE_ENABLE_ATOMIC_MASK 0x1 #define RDMA_SQ_FMR_WQE_ENABLE_ATOMIC_SHIFT 2 #define RDMA_SQ_FMR_WQE_LOCAL_READ_MASK 0x1 #define RDMA_SQ_FMR_WQE_LOCAL_READ_SHIFT 3 #define RDMA_SQ_FMR_WQE_LOCAL_WRITE_MASK 0x1 #define RDMA_SQ_FMR_WQE_LOCAL_WRITE_SHIFT 4 #define RDMA_SQ_FMR_WQE_RESERVED2_MASK 0x7 #define RDMA_SQ_FMR_WQE_RESERVED2_SHIFT 5 uint8_t reserved3; uint8_t length_hi /* upper 8 bits of the registered MR length */; __le32 length_lo /* lower 32 bits of the registered MR length. In case of DIF the length is specified including the DIF guards. */; struct regpair pbl_addr /* Address of PBL */; __le32 dif_base_ref_tag /* Ref tag of the first DIF Block. */; __le16 dif_app_tag /* App tag of all DIF Blocks. */; __le16 dif_app_tag_mask /* Bitmask for verifying dif_app_tag. */; __le16 dif_runt_crc_value /* In TX IO, in case the runt_valid_flg is set, this value is used to validate the last Block in the IO. */; __le16 dif_flags; #define RDMA_SQ_FMR_WQE_DIF_IO_DIRECTION_FLG_MASK 0x1 /* 0=RX, 1=TX (use enum rdma_dif_io_direction_flg) */ #define RDMA_SQ_FMR_WQE_DIF_IO_DIRECTION_FLG_SHIFT 0 #define RDMA_SQ_FMR_WQE_DIF_BLOCK_SIZE_MASK 0x1 /* DIF block size. 0=512B 1=4096B (use enum rdma_dif_block_size) */ #define RDMA_SQ_FMR_WQE_DIF_BLOCK_SIZE_SHIFT 1 #define RDMA_SQ_FMR_WQE_DIF_RUNT_VALID_FLG_MASK 0x1 /* In TX IO, indicates the runt_value field is valid. In RX IO, indicates the calculated runt value is to be placed on host buffer. 
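(On RX the placed value presumably corresponds to the guard_tag field of struct rdma_dif_runt_result above.)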
*/ #define RDMA_SQ_FMR_WQE_DIF_RUNT_VALID_FLG_SHIFT 2 #define RDMA_SQ_FMR_WQE_DIF_VALIDATE_CRC_GUARD_MASK 0x1 /* In TX IO, indicates CRC of each DIF guard tag is checked. */ #define RDMA_SQ_FMR_WQE_DIF_VALIDATE_CRC_GUARD_SHIFT 3 #define RDMA_SQ_FMR_WQE_DIF_VALIDATE_REF_TAG_MASK 0x1 /* In TX IO, indicates Ref tag of each DIF guard tag is checked. */ #define RDMA_SQ_FMR_WQE_DIF_VALIDATE_REF_TAG_SHIFT 4 #define RDMA_SQ_FMR_WQE_DIF_VALIDATE_APP_TAG_MASK 0x1 /* In TX IO, indicates App tag of each DIF guard tag is checked. */ #define RDMA_SQ_FMR_WQE_DIF_VALIDATE_APP_TAG_SHIFT 5 #define RDMA_SQ_FMR_WQE_DIF_CRC_SEED_MASK 0x1 /* DIF CRC Seed to use. 0=0x000 1=0xFFFF (use enum rdma_dif_crc_seed) */ #define RDMA_SQ_FMR_WQE_DIF_CRC_SEED_SHIFT 6 #define RDMA_SQ_FMR_WQE_RESERVED4_MASK 0x1FF #define RDMA_SQ_FMR_WQE_RESERVED4_SHIFT 7 __le32 Reserved5; }; /* * First element (16 bytes) of fmr wqe */ struct rdma_sq_fmr_wqe_1st { struct regpair addr; __le32 l_key; uint8_t req_type /* Type of WQE */; uint8_t flags; #define RDMA_SQ_FMR_WQE_1ST_COMP_FLG_MASK 0x1 /* If set, completion will be generated when the WQE is completed */ #define RDMA_SQ_FMR_WQE_1ST_COMP_FLG_SHIFT 0 #define RDMA_SQ_FMR_WQE_1ST_RD_FENCE_FLG_MASK 0x1 /* If set, all pending RDMA read or Atomic operations will be completed before start processing this WQE */ #define RDMA_SQ_FMR_WQE_1ST_RD_FENCE_FLG_SHIFT 1 #define RDMA_SQ_FMR_WQE_1ST_INV_FENCE_FLG_MASK 0x1 /* If set, all pending operations will be completed before start processing this WQE */ #define RDMA_SQ_FMR_WQE_1ST_INV_FENCE_FLG_SHIFT 2 #define RDMA_SQ_FMR_WQE_1ST_SE_FLG_MASK 0x1 /* Don't care for FMR wqe */ #define RDMA_SQ_FMR_WQE_1ST_SE_FLG_SHIFT 3 #define RDMA_SQ_FMR_WQE_1ST_INLINE_FLG_MASK 0x1 /* Should be 0 for FMR wqe */ #define RDMA_SQ_FMR_WQE_1ST_INLINE_FLG_SHIFT 4 #define RDMA_SQ_FMR_WQE_1ST_DIF_ON_HOST_FLG_MASK 0x1 /* If set, indicated host memory of this WQE is DIF protected. */ #define RDMA_SQ_FMR_WQE_1ST_DIF_ON_HOST_FLG_SHIFT 5 #define RDMA_SQ_FMR_WQE_1ST_RESERVED0_MASK 0x3 #define RDMA_SQ_FMR_WQE_1ST_RESERVED0_SHIFT 6 uint8_t wqe_size /* Size of WQE in 16B chunks */; uint8_t prev_wqe_size /* Previous WQE size in 16B chunks */; }; /* * Second element (16 bytes) of fmr wqe */ struct rdma_sq_fmr_wqe_2nd { uint8_t fmr_ctrl; #define RDMA_SQ_FMR_WQE_2ND_PAGE_SIZE_LOG_MASK 0x1F /* 0 is 4k, 1 is 8k... */ #define RDMA_SQ_FMR_WQE_2ND_PAGE_SIZE_LOG_SHIFT 0 #define RDMA_SQ_FMR_WQE_2ND_ZERO_BASED_MASK 0x1 /* zero based indication */ #define RDMA_SQ_FMR_WQE_2ND_ZERO_BASED_SHIFT 5 #define RDMA_SQ_FMR_WQE_2ND_BIND_EN_MASK 0x1 /* indication whether bind is enabled for this MR */ #define RDMA_SQ_FMR_WQE_2ND_BIND_EN_SHIFT 6 #define RDMA_SQ_FMR_WQE_2ND_RESERVED1_MASK 0x1 #define RDMA_SQ_FMR_WQE_2ND_RESERVED1_SHIFT 7 uint8_t access_ctrl; #define RDMA_SQ_FMR_WQE_2ND_REMOTE_READ_MASK 0x1 #define RDMA_SQ_FMR_WQE_2ND_REMOTE_READ_SHIFT 0 #define RDMA_SQ_FMR_WQE_2ND_REMOTE_WRITE_MASK 0x1 #define RDMA_SQ_FMR_WQE_2ND_REMOTE_WRITE_SHIFT 1 #define RDMA_SQ_FMR_WQE_2ND_ENABLE_ATOMIC_MASK 0x1 #define RDMA_SQ_FMR_WQE_2ND_ENABLE_ATOMIC_SHIFT 2 #define RDMA_SQ_FMR_WQE_2ND_LOCAL_READ_MASK 0x1 #define RDMA_SQ_FMR_WQE_2ND_LOCAL_READ_SHIFT 3 #define RDMA_SQ_FMR_WQE_2ND_LOCAL_WRITE_MASK 0x1 #define RDMA_SQ_FMR_WQE_2ND_LOCAL_WRITE_SHIFT 4 #define RDMA_SQ_FMR_WQE_2ND_RESERVED2_MASK 0x7 #define RDMA_SQ_FMR_WQE_2ND_RESERVED2_SHIFT 5 uint8_t reserved3; uint8_t length_hi /* upper 8 bits of the registered MR length */; __le32 length_lo /* lower 32 bits of the registered MR length. 
In case of zero based MR, will hold FBO */; struct regpair pbl_addr /* Address of PBL */; }; /* * Third element (16 bytes) of fmr wqe */ struct rdma_sq_fmr_wqe_3rd { __le32 dif_base_ref_tag /* Ref tag of the first DIF Block. */; __le16 dif_app_tag /* App tag of all DIF Blocks. */; __le16 dif_app_tag_mask /* Bitmask for verifying dif_app_tag. */; __le16 dif_runt_crc_value /* In TX IO, in case the runt_valid_flg is set, this value is used to validate the last Block in the IO. */; __le16 dif_flags; #define RDMA_SQ_FMR_WQE_3RD_DIF_IO_DIRECTION_FLG_MASK 0x1 /* 0=RX, 1=TX (use enum rdma_dif_io_direction_flg) */ #define RDMA_SQ_FMR_WQE_3RD_DIF_IO_DIRECTION_FLG_SHIFT 0 #define RDMA_SQ_FMR_WQE_3RD_DIF_BLOCK_SIZE_MASK 0x1 /* DIF block size. 0=512B 1=4096B (use enum rdma_dif_block_size) */ #define RDMA_SQ_FMR_WQE_3RD_DIF_BLOCK_SIZE_SHIFT 1 #define RDMA_SQ_FMR_WQE_3RD_DIF_RUNT_VALID_FLG_MASK 0x1 /* In TX IO, indicates the runt_value field is valid. In RX IO, indicates the calculated runt value is to be placed on host buffer. */ #define RDMA_SQ_FMR_WQE_3RD_DIF_RUNT_VALID_FLG_SHIFT 2 #define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_CRC_GUARD_MASK 0x1 /* In TX IO, indicates CRC of each DIF guard tag is checked. */ #define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_CRC_GUARD_SHIFT 3 #define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_REF_TAG_MASK 0x1 /* In TX IO, indicates Ref tag of each DIF guard tag is checked. */ #define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_REF_TAG_SHIFT 4 #define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_APP_TAG_MASK 0x1 /* In TX IO, indicates App tag of each DIF guard tag is checked. */ #define RDMA_SQ_FMR_WQE_3RD_DIF_VALIDATE_APP_TAG_SHIFT 5 #define RDMA_SQ_FMR_WQE_3RD_DIF_CRC_SEED_MASK 0x1 /* DIF CRC Seed to use. 0=0x000 1=0xFFFF (use enum rdma_dif_crc_seed) */ #define RDMA_SQ_FMR_WQE_3RD_DIF_CRC_SEED_SHIFT 6 #define RDMA_SQ_FMR_WQE_3RD_RESERVED4_MASK 0x1FF #define RDMA_SQ_FMR_WQE_3RD_RESERVED4_SHIFT 7 __le32 Reserved5; }; struct rdma_sq_local_inv_wqe { struct regpair reserved; __le32 inv_l_key /* The invalidate local key */; uint8_t req_type /* Type of WQE */; uint8_t flags; #define RDMA_SQ_LOCAL_INV_WQE_COMP_FLG_MASK 0x1 /* If set, completion will be generated when the WQE is completed */ #define RDMA_SQ_LOCAL_INV_WQE_COMP_FLG_SHIFT 0 #define RDMA_SQ_LOCAL_INV_WQE_RD_FENCE_FLG_MASK 0x1 /* If set, all pending RDMA read or Atomic operations will be completed before start processing this WQE */ #define RDMA_SQ_LOCAL_INV_WQE_RD_FENCE_FLG_SHIFT 1 #define RDMA_SQ_LOCAL_INV_WQE_INV_FENCE_FLG_MASK 0x1 /* If set, all pending operations will be completed before start processing this WQE */ #define RDMA_SQ_LOCAL_INV_WQE_INV_FENCE_FLG_SHIFT 2 #define RDMA_SQ_LOCAL_INV_WQE_SE_FLG_MASK 0x1 /* Don't care for local invalidate wqe */ #define RDMA_SQ_LOCAL_INV_WQE_SE_FLG_SHIFT 3 #define RDMA_SQ_LOCAL_INV_WQE_INLINE_FLG_MASK 0x1 /* Should be 0 for local invalidate wqe */ #define RDMA_SQ_LOCAL_INV_WQE_INLINE_FLG_SHIFT 4 #define RDMA_SQ_LOCAL_INV_WQE_DIF_ON_HOST_FLG_MASK 0x1 /* If set, indicated host memory of this WQE is DIF protected. */ #define RDMA_SQ_LOCAL_INV_WQE_DIF_ON_HOST_FLG_SHIFT 5 #define RDMA_SQ_LOCAL_INV_WQE_RESERVED0_MASK 0x3 #define RDMA_SQ_LOCAL_INV_WQE_RESERVED0_SHIFT 6 uint8_t wqe_size /* Size of WQE in 16B chunks */; uint8_t prev_wqe_size /* Previous WQE size in 16B chunks */; }; struct rdma_sq_rdma_wqe { __le32 imm_data /* The immediate data in case of RDMA_WITH_IMM */; __le32 length /* Total data length. If DIF on host is enabled, length does NOT include DIF guards. 
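Note the contrast with struct rdma_sq_sge below, whose length DOES include the DIF guards when DIF on host is enabled.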
*/; __le32 xrc_srq /* Valid only when XRC is set for the QP */; uint8_t req_type /* Type of WQE */; uint8_t flags; #define RDMA_SQ_RDMA_WQE_COMP_FLG_MASK 0x1 /* If set, completion will be generated when the WQE is completed */ #define RDMA_SQ_RDMA_WQE_COMP_FLG_SHIFT 0 #define RDMA_SQ_RDMA_WQE_RD_FENCE_FLG_MASK 0x1 /* If set, all pending RDMA read or Atomic operations will be completed before start processing this WQE */ #define RDMA_SQ_RDMA_WQE_RD_FENCE_FLG_SHIFT 1 #define RDMA_SQ_RDMA_WQE_INV_FENCE_FLG_MASK 0x1 /* If set, all pending operations will be completed before start processing this WQE */ #define RDMA_SQ_RDMA_WQE_INV_FENCE_FLG_SHIFT 2 #define RDMA_SQ_RDMA_WQE_SE_FLG_MASK 0x1 /* If set, signal the responder to generate a solicited event on this WQE */ #define RDMA_SQ_RDMA_WQE_SE_FLG_SHIFT 3 #define RDMA_SQ_RDMA_WQE_INLINE_FLG_MASK 0x1 /* if set, indicates inline data is following this WQE instead of SGEs. Applicable for RDMA_WR or RDMA_WR_WITH_IMM. Should be 0 for RDMA_RD */ #define RDMA_SQ_RDMA_WQE_INLINE_FLG_SHIFT 4 #define RDMA_SQ_RDMA_WQE_DIF_ON_HOST_FLG_MASK 0x1 /* If set, indicated host memory of this WQE is DIF protected. */ #define RDMA_SQ_RDMA_WQE_DIF_ON_HOST_FLG_SHIFT 5 #define RDMA_SQ_RDMA_WQE_RESERVED0_MASK 0x3 #define RDMA_SQ_RDMA_WQE_RESERVED0_SHIFT 6 uint8_t wqe_size /* Size of WQE in 16B chunks including all SGEs or inline data. In case there are SGEs: set to number of SGEs + 1. In case of inline data: set to the whole number of 16B which contain the inline data + 1. */; uint8_t prev_wqe_size /* Previous WQE size in 16B chunks */; struct regpair remote_va /* Remote virtual address */; __le32 r_key /* Remote key */; uint8_t dif_flags; #define RDMA_SQ_RDMA_WQE_DIF_BLOCK_SIZE_MASK 0x1 /* if dif_on_host_flg set: DIF block size. 0=512B 1=4096B (use enum rdma_dif_block_size) */ #define RDMA_SQ_RDMA_WQE_DIF_BLOCK_SIZE_SHIFT 0 #define RDMA_SQ_RDMA_WQE_DIF_FIRST_RDMA_IN_IO_FLG_MASK 0x1 /* if dif_on_host_flg set: WQE executes first RDMA on related IO. */ #define RDMA_SQ_RDMA_WQE_DIF_FIRST_RDMA_IN_IO_FLG_SHIFT 1 #define RDMA_SQ_RDMA_WQE_DIF_LAST_RDMA_IN_IO_FLG_MASK 0x1 /* if dif_on_host_flg set: WQE executes last RDMA on related IO. */ #define RDMA_SQ_RDMA_WQE_DIF_LAST_RDMA_IN_IO_FLG_SHIFT 2 #define RDMA_SQ_RDMA_WQE_RESERVED1_MASK 0x1F #define RDMA_SQ_RDMA_WQE_RESERVED1_SHIFT 3 uint8_t reserved2[3]; }; /* * First element (16 bytes) of rdma wqe */ struct rdma_sq_rdma_wqe_1st { __le32 imm_data /* The immediate data in case of RDMA_WITH_IMM */; __le32 length /* Total data length */; __le32 xrc_srq /* Valid only when XRC is set for the QP */; uint8_t req_type /* Type of WQE */; uint8_t flags; #define RDMA_SQ_RDMA_WQE_1ST_COMP_FLG_MASK 0x1 /* If set, completion will be generated when the WQE is completed */ #define RDMA_SQ_RDMA_WQE_1ST_COMP_FLG_SHIFT 0 #define RDMA_SQ_RDMA_WQE_1ST_RD_FENCE_FLG_MASK 0x1 /* If set, all pending RDMA read or Atomic operations will be completed before start processing this WQE */ #define RDMA_SQ_RDMA_WQE_1ST_RD_FENCE_FLG_SHIFT 1 #define RDMA_SQ_RDMA_WQE_1ST_INV_FENCE_FLG_MASK 0x1 /* If set, all pending operations will be completed before start processing this WQE */ #define RDMA_SQ_RDMA_WQE_1ST_INV_FENCE_FLG_SHIFT 2 #define RDMA_SQ_RDMA_WQE_1ST_SE_FLG_MASK 0x1 /* If set, signal the responder to generate a solicited event on this WQE */ #define RDMA_SQ_RDMA_WQE_1ST_SE_FLG_SHIFT 3 #define RDMA_SQ_RDMA_WQE_1ST_INLINE_FLG_MASK 0x1 /* if set, indicates inline data is following this WQE instead of SGEs. Applicable for RDMA_WR or RDMA_WR_WITH_IMM. 
Should be 0 for RDMA_RD */ #define RDMA_SQ_RDMA_WQE_1ST_INLINE_FLG_SHIFT 4 #define RDMA_SQ_RDMA_WQE_1ST_DIF_ON_HOST_FLG_MASK 0x1 /* If set, indicated host memory of this WQE is DIF protected. */ #define RDMA_SQ_RDMA_WQE_1ST_DIF_ON_HOST_FLG_SHIFT 5 #define RDMA_SQ_RDMA_WQE_1ST_RESERVED0_MASK 0x3 #define RDMA_SQ_RDMA_WQE_1ST_RESERVED0_SHIFT 6 uint8_t wqe_size /* Size of WQE in 16B chunks including all SGEs or inline data. In case there are SGEs: set to number of SGEs + 1. In case of inline data: set to the whole number of 16B which contain the inline data + 1. */; uint8_t prev_wqe_size /* Previous WQE size in 16B chunks */; }; /* * Second element (16 bytes) of rdma wqe */ struct rdma_sq_rdma_wqe_2nd { struct regpair remote_va /* Remote virtual address */; __le32 r_key /* Remote key */; uint8_t dif_flags; #define RDMA_SQ_RDMA_WQE_2ND_DIF_BLOCK_SIZE_MASK 0x1 /* if dif_on_host_flg set: DIF block size. 0=512B 1=4096B (use enum rdma_dif_block_size) */ #define RDMA_SQ_RDMA_WQE_2ND_DIF_BLOCK_SIZE_SHIFT 0 #define RDMA_SQ_RDMA_WQE_2ND_DIF_FIRST_SEGMENT_FLG_MASK 0x1 /* if dif_on_host_flg set: WQE executes first DIF on related MR. */ #define RDMA_SQ_RDMA_WQE_2ND_DIF_FIRST_SEGMENT_FLG_SHIFT 1 #define RDMA_SQ_RDMA_WQE_2ND_DIF_LAST_SEGMENT_FLG_MASK 0x1 /* if dif_on_host_flg set: WQE executes last DIF on related MR. */ #define RDMA_SQ_RDMA_WQE_2ND_DIF_LAST_SEGMENT_FLG_SHIFT 2 #define RDMA_SQ_RDMA_WQE_2ND_RESERVED1_MASK 0x1F #define RDMA_SQ_RDMA_WQE_2ND_RESERVED1_SHIFT 3 uint8_t reserved2[3]; }; /* * SQ WQE req type enumeration */ enum rdma_sq_req_type { RDMA_SQ_REQ_TYPE_SEND, RDMA_SQ_REQ_TYPE_SEND_WITH_IMM, RDMA_SQ_REQ_TYPE_SEND_WITH_INVALIDATE, RDMA_SQ_REQ_TYPE_RDMA_WR, RDMA_SQ_REQ_TYPE_RDMA_WR_WITH_IMM, RDMA_SQ_REQ_TYPE_RDMA_RD, RDMA_SQ_REQ_TYPE_ATOMIC_CMP_AND_SWAP, RDMA_SQ_REQ_TYPE_ATOMIC_ADD, RDMA_SQ_REQ_TYPE_LOCAL_INVALIDATE, RDMA_SQ_REQ_TYPE_FAST_MR, RDMA_SQ_REQ_TYPE_BIND, RDMA_SQ_REQ_TYPE_INVALID, MAX_RDMA_SQ_REQ_TYPE }; struct rdma_sq_send_wqe { __le32 inv_key_or_imm_data /* the r_key to invalidate in case of SEND_WITH_INVALIDATE, or the immediate data in case of SEND_WITH_IMM */; __le32 length /* Total data length */; __le32 xrc_srq /* Valid only when XRC is set for the QP */; uint8_t req_type /* Type of WQE */; uint8_t flags; #define RDMA_SQ_SEND_WQE_COMP_FLG_MASK 0x1 /* If set, completion will be generated when the WQE is completed */ #define RDMA_SQ_SEND_WQE_COMP_FLG_SHIFT 0 #define RDMA_SQ_SEND_WQE_RD_FENCE_FLG_MASK 0x1 /* If set, all pending RDMA read or Atomic operations will be completed before start processing this WQE */ #define RDMA_SQ_SEND_WQE_RD_FENCE_FLG_SHIFT 1 #define RDMA_SQ_SEND_WQE_INV_FENCE_FLG_MASK 0x1 /* If set, all pending operations will be completed before start processing this WQE */ #define RDMA_SQ_SEND_WQE_INV_FENCE_FLG_SHIFT 2 #define RDMA_SQ_SEND_WQE_SE_FLG_MASK 0x1 /* If set, signal the responder to generate a solicited event on this WQE */ #define RDMA_SQ_SEND_WQE_SE_FLG_SHIFT 3 #define RDMA_SQ_SEND_WQE_INLINE_FLG_MASK 0x1 /* if set, indicates inline data is following this WQE instead of SGEs */ #define RDMA_SQ_SEND_WQE_INLINE_FLG_SHIFT 4 #define RDMA_SQ_SEND_WQE_DIF_ON_HOST_FLG_MASK 0x1 /* Should be 0 for send wqe */ #define RDMA_SQ_SEND_WQE_DIF_ON_HOST_FLG_SHIFT 5 #define RDMA_SQ_SEND_WQE_RESERVED0_MASK 0x3 #define RDMA_SQ_SEND_WQE_RESERVED0_SHIFT 6 uint8_t wqe_size /* Size of WQE in 16B chunks including all SGEs or inline data. In case there are SGEs: set to number of SGEs + 1. 
In case of inline data: set to the whole number of 16B which contain the inline data + 1. */; uint8_t prev_wqe_size /* Previous WQE size in 16B chunks */; __le32 reserved1[4]; }; struct rdma_sq_send_wqe_1st { __le32 inv_key_or_imm_data /* the r_key to invalidate in case of SEND_WITH_INVALIDATE, or the immediate data in case of SEND_WITH_IMM */; __le32 length /* Total data length */; __le32 xrc_srq /* Valid only when XRC is set for the QP */; uint8_t req_type /* Type of WQE */; uint8_t flags; #define RDMA_SQ_SEND_WQE_1ST_COMP_FLG_MASK 0x1 /* If set, completion will be generated when the WQE is completed */ #define RDMA_SQ_SEND_WQE_1ST_COMP_FLG_SHIFT 0 #define RDMA_SQ_SEND_WQE_1ST_RD_FENCE_FLG_MASK 0x1 /* If set, all pending RDMA read or Atomic operations will be completed before start processing this WQE */ #define RDMA_SQ_SEND_WQE_1ST_RD_FENCE_FLG_SHIFT 1 #define RDMA_SQ_SEND_WQE_1ST_INV_FENCE_FLG_MASK 0x1 /* If set, all pending operations will be completed before start processing this WQE */ #define RDMA_SQ_SEND_WQE_1ST_INV_FENCE_FLG_SHIFT 2 #define RDMA_SQ_SEND_WQE_1ST_SE_FLG_MASK 0x1 /* If set, signal the responder to generate a solicited event on this WQE */ #define RDMA_SQ_SEND_WQE_1ST_SE_FLG_SHIFT 3 #define RDMA_SQ_SEND_WQE_1ST_INLINE_FLG_MASK 0x1 /* if set, indicates inline data is following this WQE instead of SGEs */ #define RDMA_SQ_SEND_WQE_1ST_INLINE_FLG_SHIFT 4 #define RDMA_SQ_SEND_WQE_1ST_RESERVED0_MASK 0x7 #define RDMA_SQ_SEND_WQE_1ST_RESERVED0_SHIFT 5 uint8_t wqe_size /* Size of WQE in 16B chunks including all SGEs or inline data. In case there are SGEs: set to number of SGEs + 1. In case of inline data: set to the whole number of 16B which contain the inline data + 1. */; uint8_t prev_wqe_size /* Previous WQE size in 16B chunks */; }; struct rdma_sq_send_wqe_2st { __le32 reserved1[4]; }; struct rdma_sq_sge { __le32 length /* Total length of the send. If DIF on host is enabled, SGE length includes the DIF guards. 
*/; struct regpair addr; __le32 l_key; }; struct rdma_srq_wqe_header { struct regpair wr_id; uint8_t num_sges /* number of SGEs in WQE */; uint8_t reserved2[7]; }; struct rdma_srq_sge { struct regpair addr; __le32 length; __le32 l_key; }; /* * rdma srq sge */ union rdma_srq_elm { struct rdma_srq_wqe_header header; struct rdma_srq_sge sge; }; /* * Rdma doorbell data for flags update */ struct rdma_pwm_flags_data { __le16 icid /* internal CID */; uint8_t agg_flags /* aggregative flags */; uint8_t reserved; }; /* * Rdma doorbell data for SQ and RQ */ struct rdma_pwm_val16_data { __le16 icid /* internal CID */; __le16 value /* aggregated value to update */; }; union rdma_pwm_val16_data_union { struct rdma_pwm_val16_data as_struct /* Parameters field */; __le32 as_dword; }; /* * Rdma doorbell data for CQ */ struct rdma_pwm_val32_data { __le16 icid /* internal CID */; uint8_t agg_flags /* bit for every DQ counter flags in CM context that DQ can increment */; uint8_t params; #define RDMA_PWM_VAL32_DATA_AGG_CMD_MASK 0x3 /* aggregative command to CM (use enum db_agg_cmd_sel) */ #define RDMA_PWM_VAL32_DATA_AGG_CMD_SHIFT 0 #define RDMA_PWM_VAL32_DATA_BYPASS_EN_MASK 0x1 /* enable QM bypass */ #define RDMA_PWM_VAL32_DATA_BYPASS_EN_SHIFT 2 #define RDMA_PWM_VAL32_DATA_RESERVED_MASK 0x1F #define RDMA_PWM_VAL32_DATA_RESERVED_SHIFT 3 __le32 value /* aggregated value to update */; }; union rdma_pwm_val32_data_union { struct rdma_pwm_val32_data as_struct /* Parameters field */; struct regpair as_repair; }; #endif /* __QED_HSI_RDMA__ */ rdma-core-56.1/providers/qedr/qelr_main.c000066400000000000000000000166511477342711600204260ustar00rootroot00000000000000/* * Copyright (c) 2015-2016 QLogic Corporation * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and /or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include "qelr.h" #include "qelr_verbs.h" #include "qelr_chain.h" #include #include #include static void qelr_free_context(struct ibv_context *ibctx); #define PCI_VENDOR_ID_QLOGIC (0x1077) #define PCI_DEVICE_ID_QLOGIC_57980S (0x1629) #define PCI_DEVICE_ID_QLOGIC_57980S_40 (0x1634) #define PCI_DEVICE_ID_QLOGIC_57980S_10 (0x1666) #define PCI_DEVICE_ID_QLOGIC_57980S_MF (0x1636) #define PCI_DEVICE_ID_QLOGIC_57980S_100 (0x1644) #define PCI_DEVICE_ID_QLOGIC_57980S_50 (0x1654) #define PCI_DEVICE_ID_QLOGIC_57980S_25 (0x1656) #define PCI_DEVICE_ID_QLOGIC_57980S_IOV (0x1664) #define PCI_DEVICE_ID_QLOGIC_AH (0x8070) #define PCI_DEVICE_ID_QLOGIC_AH_IOV (0x8090) #define PCI_DEVICE_ID_QLOGIC_AHP (0x8170) #define PCI_DEVICE_ID_QLOGIC_AHP_IOV (0x8190) #define QHCA(d) \ VERBS_PCI_MATCH(PCI_VENDOR_ID_QLOGIC, PCI_DEVICE_ID_QLOGIC_##d, NULL) static const struct verbs_match_ent hca_table[] = { VERBS_DRIVER_ID(RDMA_DRIVER_QEDR), QHCA(57980S), QHCA(57980S_40), QHCA(57980S_10), QHCA(57980S_MF), QHCA(57980S_100), QHCA(57980S_50), QHCA(57980S_25), QHCA(57980S_IOV), QHCA(AH), QHCA(AH_IOV), QHCA(AHP), QHCA(AHP_IOV), {} }; static const struct verbs_context_ops qelr_ctx_ops = { .query_device_ex = qelr_query_device, .query_port = qelr_query_port, .alloc_pd = qelr_alloc_pd, .dealloc_pd = qelr_dealloc_pd, .reg_mr = qelr_reg_mr, .dereg_mr = qelr_dereg_mr, .create_cq = qelr_create_cq, .poll_cq = qelr_poll_cq, .req_notify_cq = qelr_arm_cq, .cq_event = qelr_cq_event, .destroy_cq = qelr_destroy_cq, .create_qp = qelr_create_qp, .query_qp = qelr_query_qp, .modify_qp = qelr_modify_qp, .destroy_qp = qelr_destroy_qp, .create_srq = qelr_create_srq, .destroy_srq = qelr_destroy_srq, .modify_srq = qelr_modify_srq, .query_srq = qelr_query_srq, .post_srq_recv = qelr_post_srq_recv, .post_send = qelr_post_send, .post_recv = qelr_post_recv, .async_event = qelr_async_event, .free_context = qelr_free_context, }; static const struct verbs_context_ops qelr_ctx_roce_ops = { .close_xrcd = qelr_close_xrcd, .create_qp_ex = qelr_create_qp_ex, .create_srq_ex = qelr_create_srq_ex, .get_srq_num = qelr_get_srq_num, .open_xrcd = qelr_open_xrcd, }; static void qelr_uninit_device(struct verbs_device *verbs_device) { struct qelr_device *dev = get_qelr_dev(&verbs_device->device); free(dev); } static struct verbs_context *qelr_alloc_context(struct ibv_device *ibdev, int cmd_fd, void *private_data) { struct qelr_devctx *ctx; struct qelr_alloc_context cmd = {}; struct qelr_alloc_context_resp resp; ctx = verbs_init_and_alloc_context(ibdev, cmd_fd, ctx, ibv_ctx, RDMA_DRIVER_QEDR); if (!ctx) return NULL; memset(&resp, 0, sizeof(resp)); cmd.context_flags = QEDR_ALLOC_UCTX_DB_REC | QEDR_SUPPORT_DPM_SIZES; cmd.context_flags |= QEDR_ALLOC_UCTX_EDPM_MODE; if (ibv_cmd_get_context(&ctx->ibv_ctx, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) goto cmd_err; verbs_set_ops(&ctx->ibv_ctx, &qelr_ctx_ops); if (IS_ROCE(ibdev)) verbs_set_ops(&ctx->ibv_ctx, &qelr_ctx_roce_ops); ctx->srq_table = calloc(QELR_MAX_SRQ_ID, sizeof(*ctx->srq_table)); if (!ctx->srq_table) { verbs_err(&ctx->ibv_ctx, "failed to allocate srq_table\n"); goto cmd_err; } ctx->kernel_page_size = sysconf(_SC_PAGESIZE); ctx->db_pa = resp.db_pa; ctx->db_size = resp.db_size; /* Set dpm flags according to protocol */ if (IS_ROCE(ibdev)) { if (resp.dpm_flags & QEDR_DPM_TYPE_ROCE_ENHANCED) ctx->dpm_flags = QELR_DPM_FLAGS_ENHANCED; if (resp.dpm_flags & QEDR_DPM_TYPE_ROCE_LEGACY) ctx->dpm_flags |= QELR_DPM_FLAGS_LEGACY; if 
(resp.dpm_flags & QEDR_DPM_TYPE_ROCE_EDPM_MODE) ctx->dpm_flags |= QELR_DPM_FLAGS_EDPM_MODE; } else { if (resp.dpm_flags & QEDR_DPM_TYPE_IWARP_LEGACY) ctx->dpm_flags = QELR_DPM_FLAGS_LEGACY; } /* Defaults set for backward-forward compatibility */ if (resp.dpm_flags & QEDR_DPM_SIZES_SET) { ctx->ldpm_limit_size = resp.ldpm_limit_size; ctx->edpm_trans_size = resp.edpm_trans_size; ctx->edpm_limit_size = resp.edpm_limit_size ? resp.edpm_limit_size : QEDR_EDPM_MAX_SIZE; } else { ctx->ldpm_limit_size = QEDR_LDPM_MAX_SIZE; ctx->edpm_trans_size = QEDR_EDPM_TRANS_SIZE; ctx->edpm_limit_size = QEDR_EDPM_MAX_SIZE; } ctx->max_send_wr = resp.max_send_wr; ctx->max_recv_wr = resp.max_recv_wr; ctx->max_srq_wr = resp.max_srq_wr; ctx->sges_per_send_wr = resp.sges_per_send_wr; ctx->sges_per_recv_wr = resp.sges_per_recv_wr; ctx->sges_per_srq_wr = resp.sges_per_recv_wr; ctx->max_cqes = resp.max_cqes; ctx->db_addr = mmap(NULL, ctx->db_size, PROT_WRITE, MAP_SHARED, cmd_fd, ctx->db_pa); if (ctx->db_addr == MAP_FAILED) { int errsv = errno; verbs_err(&ctx->ibv_ctx, "alloc context: doorbell mapping failed resp.db_pa = %llx resp.db_size=%d context->cmd_fd=%d errno=%d\n", resp.db_pa, resp.db_size, cmd_fd, errsv); goto free_srq_tbl; } return &ctx->ibv_ctx; free_srq_tbl: free(ctx->srq_table); cmd_err: verbs_err(&ctx->ibv_ctx, "Failed to allocate context for device.\n"); verbs_uninit_context(&ctx->ibv_ctx); free(ctx); return NULL; } static void qelr_free_context(struct ibv_context *ibctx) { struct qelr_devctx *ctx = get_qelr_ctx(ibctx); if (ctx->db_addr) munmap(ctx->db_addr, ctx->db_size); free(ctx->srq_table); verbs_uninit_context(&ctx->ibv_ctx); free(ctx); } static struct verbs_device *qelr_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct qelr_device *dev; dev = calloc(1, sizeof(*dev)); if (!dev) return NULL; return &dev->ibv_dev; } static const struct verbs_device_ops qelr_dev_ops = { .name = "qedr", .match_min_abi_version = QELR_ABI_VERSION, .match_max_abi_version = QELR_ABI_VERSION, .match_table = hca_table, .alloc_device = qelr_device_alloc, .uninit_device = qelr_uninit_device, .alloc_context = qelr_alloc_context, }; PROVIDER_DRIVER(qedr, qelr_dev_ops); rdma-core-56.1/providers/qedr/qelr_verbs.c000066400000000000000000002173751477342711600206310ustar00rootroot00000000000000/* * Copyright (c) 2015-2016 QLogic Corporation * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and /or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include "qelr.h" #include "qelr_chain.h" #include "qelr_verbs.h" #include #include #include #include #include #define QELR_SQE_ELEMENT_SIZE (sizeof(struct rdma_sq_sge)) #define QELR_RQE_ELEMENT_SIZE (sizeof(struct rdma_rq_sge)) #define QELR_CQE_SIZE (sizeof(union rdma_cqe)) static void qelr_inc_sw_cons_u16(struct qelr_qp_hwq_info *info) { info->cons = (info->cons + 1) % info->max_wr; info->wqe_cons++; } static void qelr_inc_sw_prod_u16(struct qelr_qp_hwq_info *info) { info->prod = (info->prod + 1) % info->max_wr; } static inline int qelr_wq_is_full(struct qelr_qp_hwq_info *info) { return (((info->prod + 1) % info->max_wr) == info->cons); } int qelr_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct ib_uverbs_ex_query_device_resp resp; size_t resp_size = sizeof(resp); uint64_t fw_ver; unsigned int major, minor, revision, eng; int ret; ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp, &resp_size); if (ret) return ret; fw_ver = resp.base.fw_ver; major = (fw_ver >> 24) & 0xff; minor = (fw_ver >> 16) & 0xff; revision = (fw_ver >> 8) & 0xff; eng = fw_ver & 0xff; snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver), "%d.%d.%d.%d", major, minor, revision, eng); return 0; } int qelr_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr) { struct ibv_query_port cmd; int status; status = ibv_cmd_query_port(context, port, attr, &cmd, sizeof(cmd)); return status; } struct ibv_pd *qelr_alloc_pd(struct ibv_context *context) { struct qelr_alloc_pd cmd; struct qelr_alloc_pd_resp resp; struct qelr_pd *pd; struct qelr_devctx *cxt = get_qelr_ctx(context); pd = malloc(sizeof(*pd)); if (!pd) return NULL; bzero(pd, sizeof(*pd)); memset(&cmd, 0, sizeof(cmd)); if (ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) { free(pd); return NULL; } pd->pd_id = resp.pd_id; verbs_debug(&cxt->ibv_ctx, "Allocated pd: %d\n", pd->pd_id); return &pd->ibv_pd; } int qelr_dealloc_pd(struct ibv_pd *ibpd) { int rc = 0; struct qelr_pd *pd = get_qelr_pd(ibpd); struct qelr_devctx *cxt = get_qelr_ctx(ibpd->context); verbs_debug(&cxt->ibv_ctx, "Deallocated pd: %d\n", pd->pd_id); rc = ibv_cmd_dealloc_pd(ibpd); if (rc) return rc; free(pd); return rc; } struct ibv_mr *qelr_reg_mr(struct ibv_pd *ibpd, void *addr, size_t len, uint64_t hca_va, int access) { struct qelr_mr *mr; struct ibv_reg_mr cmd; struct qelr_reg_mr_resp resp; struct qelr_pd *pd = get_qelr_pd(ibpd); struct qelr_devctx *cxt = get_qelr_ctx(ibpd->context); mr = malloc(sizeof(*mr)); if (!mr) return NULL; bzero(mr, sizeof(*mr)); if (ibv_cmd_reg_mr(ibpd, addr, len, hca_va, access, &mr->vmr, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) { free(mr); return NULL; } verbs_debug(&cxt->ibv_ctx, "MR Register %p completed successfully pd_id=%d addr=%p len=%zu access=%d lkey=%x rkey=%x\n", mr, pd->pd_id, addr, len, access, mr->vmr.ibv_mr.lkey, mr->vmr.ibv_mr.rkey); return &mr->vmr.ibv_mr; } int qelr_dereg_mr(struct verbs_mr *vmr) { struct qelr_devctx *cxt = get_qelr_ctx(vmr->ibv_mr.context); int rc; rc = 
ibv_cmd_dereg_mr(vmr); if (rc) return rc; verbs_debug(&cxt->ibv_ctx, "MR DERegister %p completed successfully\n", vmr); free(vmr); return 0; } static void consume_cqe(struct qelr_cq *cq) { if (cq->latest_cqe == cq->toggle_cqe) cq->chain_toggle ^= RDMA_CQE_REQUESTER_TOGGLE_BIT_MASK; cq->latest_cqe = qelr_chain_consume(&cq->chain); } static inline int qelr_cq_entries(int entries) { /* FW requires an extra entry */ return entries + 1; } struct ibv_cq *qelr_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector) { struct qelr_devctx *cxt = get_qelr_ctx(context); struct qelr_create_cq_resp resp = {}; struct qelr_create_cq cmd; struct qelr_cq *cq; int chain_size; int rc; verbs_debug(&cxt->ibv_ctx, "create cq: context=%p, cqe=%d, channel=%p, comp_vector=%d\n", context, cqe, channel, comp_vector); if (!cqe || cqe > cxt->max_cqes) { verbs_err(&cxt->ibv_ctx, "create cq: failed. attempted to allocate %d cqes but valid range is 1...%d\n", cqe, cxt->max_cqes); errno = EINVAL; return NULL; } /* allocate CQ structure */ cq = calloc(1, sizeof(*cq)); if (!cq) return NULL; /* allocate CQ buffer */ chain_size = qelr_cq_entries(cqe) * QELR_CQE_SIZE; rc = qelr_chain_alloc(&cq->chain, chain_size, cxt->kernel_page_size, QELR_CQE_SIZE); if (rc) goto err_0; cmd.addr = (uintptr_t) cq->chain.first_addr; cmd.len = cq->chain.size; rc = ibv_cmd_create_cq(context, cqe, channel, comp_vector, &cq->ibv_cq, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (rc) { verbs_err(&cxt->ibv_ctx, "create cq: failed with rc = %d\n", rc); goto err_1; } /* map the doorbell and prepare its data */ cq->db.data.icid = htole16(resp.icid); cq->db.data.params = DB_AGG_CMD_SET << RDMA_PWM_VAL32_DATA_AGG_CMD_SHIFT; cq->db_addr = cxt->db_addr + resp.db_offset; if (resp.db_rec_addr) { cq->db_rec_map = mmap(NULL, cxt->kernel_page_size, PROT_WRITE, MAP_SHARED, context->cmd_fd, resp.db_rec_addr); if (cq->db_rec_map == MAP_FAILED) { int errsv = errno; verbs_err(&cxt->ibv_ctx, "alloc context: doorbell rec mapping failed resp.db_rec_addr = %llx size=%d context->cmd_fd=%d errno=%d\n", resp.db_rec_addr, cxt->kernel_page_size, context->cmd_fd, errsv); goto err_1; } cq->db_rec_addr = cq->db_rec_map; } else { /* Kernel doesn't support doorbell recovery. 
Point to dummy * location instead */ cq->db_rec_addr = &cxt->db_rec_addr_dummy; } /* point to the very last element, passing this we will toggle */ cq->toggle_cqe = qelr_chain_get_last_elem(&cq->chain); cq->chain_toggle = RDMA_CQE_REQUESTER_TOGGLE_BIT_MASK; cq->latest_cqe = NULL; /* must be different from chain_toggle */ consume_cqe(cq); verbs_debug(&cxt->ibv_ctx, "create cq: successfully created %p\n", cq); return &cq->ibv_cq; err_1: qelr_chain_free(&cq->chain); err_0: free(cq); return NULL; } int qelr_destroy_cq(struct ibv_cq *ibv_cq) { struct qelr_devctx *cxt = get_qelr_ctx(ibv_cq->context); struct qelr_cq *cq = get_qelr_cq(ibv_cq); int rc; verbs_debug(&cxt->ibv_ctx, "destroy cq: %p\n", cq); rc = ibv_cmd_destroy_cq(ibv_cq); if (rc) { verbs_debug(&cxt->ibv_ctx, "destroy cq: failed to destroy %p, got %d.\n", cq, rc); return rc; } qelr_chain_free(&cq->chain); if (cq->db_rec_map) munmap(cq->db_rec_map, cxt->kernel_page_size); verbs_debug(&cxt->ibv_ctx, "destroy cq: successfully destroyed %p\n", cq); free(cq); return 0; } static struct qelr_srq *qelr_get_srq(struct qelr_devctx *cxt, uint32_t srq_id) { if (unlikely(srq_id >= QELR_MAX_SRQ_ID)) { verbs_err(&cxt->ibv_ctx, "invalid srq_id %u\n", srq_id); return NULL; } return cxt->srq_table[srq_id]; } int qelr_query_srq(struct ibv_srq *ibv_srq, struct ibv_srq_attr *attr) { struct ibv_query_srq cmd; return ibv_cmd_query_srq(ibv_srq, attr, &cmd, sizeof(cmd)); } int qelr_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, int attr_mask) { struct ibv_modify_srq cmd; return ibv_cmd_modify_srq(srq, attr, attr_mask, &cmd, sizeof(cmd)); } static void qelr_destroy_srq_buffers(struct ibv_srq *ibv_srq) { struct qelr_srq *srq = get_qelr_srq(ibv_srq); uint32_t *virt_prod_pair_addr; uint32_t prod_size; qelr_chain_free(&srq->hw_srq.chain); virt_prod_pair_addr = srq->hw_srq.virt_prod_pair_addr; prod_size = sizeof(struct rdma_srq_producers); ibv_dofork_range(virt_prod_pair_addr, prod_size); munmap(virt_prod_pair_addr, prod_size); } int qelr_destroy_srq(struct ibv_srq *ibv_srq) { struct qelr_devctx *cxt = get_qelr_ctx(ibv_srq->context); struct qelr_srq *srq = get_qelr_srq(ibv_srq); int ret; ret = ibv_cmd_destroy_srq(ibv_srq); if (ret) return ret; if (srq->is_xrc) cxt->srq_table[srq->srq_id] = NULL; qelr_destroy_srq_buffers(ibv_srq); free(srq); return 0; } static void qelr_create_srq_configure_req(struct qelr_srq *srq, struct qelr_create_srq *req) { req->srq_addr = (uintptr_t)srq->hw_srq.chain.first_addr; req->srq_len = srq->hw_srq.chain.size; req->prod_pair_addr = (uintptr_t)srq->hw_srq.virt_prod_pair_addr; } static inline void qelr_create_srq_configure_req_ex(struct qelr_srq *srq, struct qelr_create_srq_ex *req) { req->srq_addr = (uintptr_t)srq->hw_srq.chain.first_addr; req->srq_len = srq->hw_srq.chain.size; req->prod_pair_addr = (uintptr_t)srq->hw_srq.virt_prod_pair_addr; } static int qelr_create_srq_buffers(struct qelr_devctx *cxt, struct qelr_srq *srq, uint32_t max_wr) { uint32_t max_sges; int chain_size, prod_size; void *addr; int rc; if (!max_wr) return -EINVAL; max_wr = min_t(uint32_t, max_wr, cxt->max_srq_wr); max_sges = max_wr * (cxt->sges_per_srq_wr + 1); /* +1 for header */ chain_size = max_sges * QELR_RQE_ELEMENT_SIZE; rc = qelr_chain_alloc(&srq->hw_srq.chain, chain_size, cxt->kernel_page_size, QELR_RQE_ELEMENT_SIZE); if (rc) { verbs_err(&cxt->ibv_ctx, "create srq: failed to map srq, got %d", rc); return rc; } prod_size = sizeof(struct rdma_srq_producers); addr = mmap(NULL, prod_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); 
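/*
 * The anonymous page just mapped above holds the struct rdma_srq_producers
 * pair: its address is handed to the kernel through req->prod_pair_addr in
 * qelr_create_srq_configure_req(), and qelr_post_srq_recv() updates
 * wqe_prod/sge_prod in it. The ibv_dontfork_range() call below marks the
 * page MADV_DONTFORK so a fork() cannot trigger copy-on-write on a page
 * the hardware still reads.
 */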
if (addr == MAP_FAILED) { verbs_err(&cxt->ibv_ctx, "create srq: failed to map producer, got %d", errno); qelr_chain_free(&srq->hw_srq.chain); return errno; } rc = ibv_dontfork_range(addr, prod_size); if (rc) { munmap(addr, prod_size); qelr_chain_free(&srq->hw_srq.chain); return rc; } srq->hw_srq.virt_prod_pair_addr = addr; srq->hw_srq.max_sges = cxt->sges_per_srq_wr; srq->hw_srq.max_wr = max_wr; return 0; } struct ibv_srq *qelr_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *init_attr) { struct qelr_devctx *cxt = get_qelr_ctx(pd->context); struct qelr_create_srq req; struct qelr_create_srq_resp resp; struct ibv_srq *ibv_srq; struct qelr_srq *srq; int ret; srq = calloc(1, sizeof(*srq)); if (!srq) return NULL; ibv_srq = &srq->verbs_srq.srq; ret = qelr_create_srq_buffers(cxt, srq, init_attr->attr.max_wr); if (ret) { free(srq); return NULL; } pthread_spin_init(&srq->lock, PTHREAD_PROCESS_PRIVATE); qelr_create_srq_configure_req(srq, &req); ret = ibv_cmd_create_srq(pd, ibv_srq, init_attr, &req.ibv_cmd, sizeof(req), &resp.ibv_resp, sizeof(resp)); if (ret) { qelr_destroy_srq_buffers(ibv_srq); free(srq); return NULL; } return ibv_srq; } static void qelr_free_rq(struct qelr_qp *qp) { free(qp->rqe_wr_id); } static void qelr_free_sq(struct qelr_qp *qp) { free(qp->wqe_wr_id); } static void qelr_chain_free_sq(struct qelr_qp *qp) { qelr_chain_free(&qp->sq.chain); } static void qelr_chain_free_rq(struct qelr_qp *qp) { qelr_chain_free(&qp->rq.chain); } static inline bool qelr_qp_has_rq(struct qelr_qp *qp) { return !!(qp->flags & QELR_QP_FLAG_RQ); } static inline bool qelr_qp_has_sq(struct qelr_qp *qp) { return !!(qp->flags & QELR_QP_FLAG_SQ); } static inline int qelr_create_qp_buffers_sq(struct qelr_devctx *cxt, struct qelr_qp *qp, struct ibv_qp_init_attr_ex *attrx) { uint32_t max_send_wr, max_send_sges, max_send_buf; int chain_size; int rc; /* SQ */ max_send_wr = attrx->cap.max_send_wr; max_send_wr = max_t(uint32_t, max_send_wr, 1); max_send_wr = min_t(uint32_t, max_send_wr, cxt->max_send_wr); max_send_sges = max_send_wr * cxt->sges_per_send_wr; max_send_buf = max_send_sges * QELR_SQE_ELEMENT_SIZE; chain_size = max_send_buf; rc = qelr_chain_alloc(&qp->sq.chain, chain_size, cxt->kernel_page_size, QELR_SQE_ELEMENT_SIZE); if (rc) verbs_err(&cxt->ibv_ctx, "create qp: failed to map SQ chain, got %d", rc); qp->sq.max_wr = max_send_wr; qp->sq.max_sges = cxt->sges_per_send_wr; return rc; } static inline int qelr_create_qp_buffers_rq(struct qelr_devctx *cxt, struct qelr_qp *qp, struct ibv_qp_init_attr_ex *attrx) { uint32_t max_recv_wr, max_recv_sges, max_recv_buf; int chain_size; int rc; /* RQ */ max_recv_wr = attrx->cap.max_recv_wr; max_recv_wr = max_t(uint32_t, max_recv_wr, 1); max_recv_wr = min_t(uint32_t, max_recv_wr, cxt->max_recv_wr); max_recv_sges = max_recv_wr * cxt->sges_per_recv_wr; max_recv_buf = max_recv_sges * QELR_RQE_ELEMENT_SIZE; chain_size = max_recv_buf; rc = qelr_chain_alloc(&qp->rq.chain, chain_size, cxt->kernel_page_size, QELR_RQE_ELEMENT_SIZE); if (rc) verbs_err(&cxt->ibv_ctx, "create qp: failed to map RQ chain, got %d", rc); qp->rq.max_wr = max_recv_wr; qp->rq.max_sges = cxt->sges_per_recv_wr; return rc; } static inline int qelr_create_qp_buffers(struct qelr_devctx *cxt, struct qelr_qp *qp, struct ibv_qp_init_attr_ex *attrx) { int rc; if (qelr_qp_has_sq(qp)) { rc = qelr_create_qp_buffers_sq(cxt, qp, attrx); if (rc) return rc; } if (qelr_qp_has_rq(qp)) { rc = qelr_create_qp_buffers_rq(cxt, qp, attrx); if (rc && qelr_qp_has_sq(qp)) { qelr_chain_free_sq(qp); if (qp->sq.db_rec_map) 
munmap(qp->sq.db_rec_map, cxt->kernel_page_size); return rc; } } return 0; } static inline int qelr_configure_qp_sq(struct qelr_devctx *cxt, struct qelr_qp *qp, struct ibv_qp_init_attr_ex *attrx, struct qelr_create_qp_resp *resp) { qp->sq.icid = resp->sq_icid; qp->sq.db_data.data.icid = htole16(resp->sq_icid); qp->sq.prod = 0; qp->sq.db = cxt->db_addr + resp->sq_db_offset; qp->sq.edpm_db = cxt->db_addr; if (resp->sq_db_rec_addr) { qp->sq.db_rec_map = mmap(NULL, cxt->kernel_page_size, PROT_WRITE, MAP_SHARED, cxt->ibv_ctx.context.cmd_fd, resp->sq_db_rec_addr); if (qp->sq.db_rec_map == MAP_FAILED) { int errsv = errno; verbs_err(&cxt->ibv_ctx, "alloc context: doorbell rec mapping failed resp.db_rec_addr = %llx size=%d context->cmd_fd=%d errno=%d\n", resp->sq_db_rec_addr, cxt->kernel_page_size, cxt->ibv_ctx.context.cmd_fd, errsv); return -ENOMEM; } qp->sq.db_rec_addr = qp->sq.db_rec_map; } else { /* Kernel doesn't support doorbell recovery. Point to dummy * location instead */ qp->sq.db_rec_addr = &cxt->db_rec_addr_dummy; } /* shadow SQ */ qp->sq.max_wr++; /* prod/cons method requires N+1 elements */ qp->wqe_wr_id = calloc(qp->sq.max_wr, sizeof(*qp->wqe_wr_id)); if (!qp->wqe_wr_id) { verbs_err(&cxt->ibv_ctx, "create qp: failed shadow SQ memory allocation\n"); return -ENOMEM; } return 0; } static inline int qelr_configure_qp_rq(struct qelr_devctx *cxt, struct qelr_qp *qp, struct qelr_create_qp_resp *resp) { /* RQ */ qp->rq.icid = resp->rq_icid; qp->rq.db_data.data.icid = htole16(resp->rq_icid); qp->rq.db = cxt->db_addr + resp->rq_db_offset; qp->rq.iwarp_db2 = cxt->db_addr + resp->rq_db2_offset; qp->rq.iwarp_db2_data.data.icid = htole16(qp->rq.icid); qp->rq.iwarp_db2_data.data.value = htole16(DQ_TCM_IWARP_POST_RQ_CF_CMD); qp->rq.prod = 0; if (resp->rq_db_rec_addr) { qp->rq.db_rec_map = mmap(NULL, cxt->kernel_page_size, PROT_WRITE, MAP_SHARED, cxt->ibv_ctx.context.cmd_fd, resp->rq_db_rec_addr); if (qp->rq.db_rec_map == MAP_FAILED) { int errsv = errno; verbs_err(&cxt->ibv_ctx, "alloc context: doorbell rec mapping failed resp.db_rec_addr = %llx size=%d context->cmd_fd=%d errno=%d\n", resp->rq_db_rec_addr, cxt->kernel_page_size, cxt->ibv_ctx.context.cmd_fd, errsv); return -ENOMEM; } qp->rq.db_rec_addr = qp->rq.db_rec_map; } else { /* Kernel doesn't support doorbell recovery. 
Point to dummy * location instead */ qp->rq.db_rec_addr = &cxt->db_rec_addr_dummy; } /* shadow RQ */ qp->rq.max_wr++; /* prod/cons method requires N+1 elements */ qp->rqe_wr_id = calloc(qp->rq.max_wr, sizeof(*qp->rqe_wr_id)); if (!qp->rqe_wr_id) { verbs_err(&cxt->ibv_ctx, "create qp: failed shadow RQ memory allocation\n"); return -ENOMEM; } return 0; } static inline int qelr_configure_qp(struct qelr_devctx *cxt, struct qelr_qp *qp, struct ibv_qp_init_attr_ex *attrx, struct qelr_create_qp_resp *resp) { int rc = 0; /* general */ pthread_spin_init(&qp->q_lock, PTHREAD_PROCESS_PRIVATE); qp->qp_id = resp->qp_id; qp->state = QELR_QPS_RST; qp->sq_sig_all = attrx->sq_sig_all; qp->atomic_supported = resp->atomic_supported; if (cxt->dpm_flags & QELR_DPM_FLAGS_EDPM_MODE) qp->edpm_mode = 1; if (qelr_qp_has_sq(qp)) { rc = qelr_configure_qp_sq(cxt, qp, attrx, resp); if (rc) return rc; } if (qelr_qp_has_rq(qp)) { rc = qelr_configure_qp_rq(cxt, qp, resp); if (rc && qelr_qp_has_sq(qp)) qelr_free_sq(qp); } return rc; } static inline void qelr_print_qp_init_attr(struct qelr_devctx *cxt, struct ibv_qp_init_attr_ex *attrx) { verbs_debug(&cxt->ibv_ctx, "create qp: send_cq=%p, recv_cq=%p, srq=%p, max_inline_data=%d, max_recv_sge=%d, max_recv_wr=%d, max_send_sge=%d, max_send_wr=%d, qp_type=%d, sq_sig_all=%d\n", attrx->send_cq, attrx->recv_cq, attrx->srq, attrx->cap.max_inline_data, attrx->cap.max_recv_sge, attrx->cap.max_recv_wr, attrx->cap.max_send_sge, attrx->cap.max_send_wr, attrx->qp_type, attrx->sq_sig_all); } static inline void qelr_create_qp_configure_sq_req(struct qelr_qp *qp, struct qelr_create_qp *req) { req->sq_addr = (uintptr_t)qp->sq.chain.first_addr; req->sq_len = qp->sq.chain.size; } static inline void qelr_create_qp_configure_rq_req(struct qelr_qp *qp, struct qelr_create_qp *req) { req->rq_addr = (uintptr_t)qp->rq.chain.first_addr; req->rq_len = qp->rq.chain.size; } static inline void qelr_create_qp_configure_req(struct qelr_qp *qp, struct qelr_create_qp *req) { memset(req, 0, sizeof(*req)); req->qp_handle_hi = U64_HI(qp); req->qp_handle_lo = U64_LO(qp); if (qelr_qp_has_sq(qp)) qelr_create_qp_configure_sq_req(qp, req); if (qelr_qp_has_rq(qp)) qelr_create_qp_configure_rq_req(qp, req); } static inline void qelr_basic_qp_config(struct qelr_qp *qp, struct ibv_qp_init_attr_ex *attrx) { if (attrx->srq) qp->srq = get_qelr_srq(attrx->srq); if (attrx->qp_type == IBV_QPT_RC || attrx->qp_type == IBV_QPT_XRC_SEND) qp->flags |= QELR_QP_FLAG_SQ; if (attrx->qp_type == IBV_QPT_RC && !qp->srq) qp->flags |= QELR_QP_FLAG_RQ; } static void qelr_print_ah_attr(struct qelr_devctx *cxt, struct ibv_ah_attr *attr) { verbs_debug(&cxt->ibv_ctx, "grh.dgid=[%#" PRIx64 ":%#" PRIx64 "], grh.flow_label=%d, grh.sgid_index=%d, grh.hop_limit=%d, grh.traffic_class=%d, dlid=%d, sl=%d, src_path_bits=%d, static_rate=%d, port_num=%d\n", be64toh(attr->grh.dgid.global.interface_id), be64toh(attr->grh.dgid.global.subnet_prefix), attr->grh.flow_label, attr->grh.sgid_index, attr->grh.hop_limit, attr->grh.traffic_class, attr->dlid, attr->sl, attr->src_path_bits, attr->static_rate, attr->port_num); } static void qelr_print_qp_attr(struct qelr_devctx *cxt, struct ibv_qp_attr *attr) { verbs_debug(&cxt->ibv_ctx, 
"\tqp_state=%d\tcur_qp_state=%d\tpath_mtu=%d\tpath_mig_state=%d\tqkey=%d\trq_psn=%d\tsq_psn=%d\tdest_qp_num=%d\tqp_access_flags=%d\tmax_inline_data=%d\tmax_recv_sge=%d\tmax_recv_wr=%d\tmax_send_sge=%d\tmax_send_wr=%d\tpkey_index=%d\talt_pkey_index=%d\ten_sqd_async_notify=%d\tsq_draining=%d\tmax_rd_atomic=%d\tmax_dest_rd_atomic=%d\tmin_rnr_timer=%d\tport_num=%d\ttimeout=%d\tretry_cnt=%d\trnr_retry=%d\talt_port_num=%d\talt_timeout=%d\n", attr->qp_state, attr->cur_qp_state, attr->path_mtu, attr->path_mig_state, attr->qkey, attr->rq_psn, attr->sq_psn, attr->dest_qp_num, attr->qp_access_flags, attr->cap.max_inline_data, attr->cap.max_recv_sge, attr->cap.max_recv_wr, attr->cap.max_send_sge, attr->cap.max_send_wr, attr->pkey_index, attr->alt_pkey_index, attr->en_sqd_async_notify, attr->sq_draining, attr->max_rd_atomic, attr->max_dest_rd_atomic, attr->min_rnr_timer, attr->port_num, attr->timeout, attr->retry_cnt, attr->rnr_retry, attr->alt_port_num, attr->alt_timeout); qelr_print_ah_attr(cxt, &attr->ah_attr); qelr_print_ah_attr(cxt, &attr->alt_ah_attr); } int qelr_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd; struct qelr_devctx *cxt = get_qelr_ctx(qp->context); int rc; verbs_debug(&cxt->ibv_ctx, "QP Query %p, attr_mask=0x%x\n", get_qelr_qp(qp), attr_mask); rc = ibv_cmd_query_qp(qp, attr, attr_mask, init_attr, &cmd, sizeof(cmd)); qelr_print_qp_attr(cxt, attr); return rc; } static enum qelr_qp_state get_qelr_qp_state(enum ibv_qp_state qps) { switch (qps) { case IBV_QPS_RESET: return QELR_QPS_RST; case IBV_QPS_INIT: return QELR_QPS_INIT; case IBV_QPS_RTR: return QELR_QPS_RTR; case IBV_QPS_RTS: return QELR_QPS_RTS; case IBV_QPS_SQD: return QELR_QPS_SQD; case IBV_QPS_SQE: return QELR_QPS_SQE; case IBV_QPS_ERR: default: return QELR_QPS_ERR; }; } static void qelr_reset_qp_hwq_info(struct qelr_qp_hwq_info *q) { qelr_chain_reset(&q->chain); q->prod = 0; q->cons = 0; q->wqe_cons = 0; q->db_data.data.value = 0; } static int qelr_update_qp_state(struct qelr_qp *qp, enum ibv_qp_state new_ib_state) { int status = 0; enum qelr_qp_state new_state; /* iWARP states are updated implicitely by driver and don't have a * real purpose in user-lib. */ if (IS_IWARP(qp->ibv_qp->context->device)) return 0; new_state = get_qelr_qp_state(new_ib_state); pthread_spin_lock(&qp->q_lock); if (new_state == qp->state) { pthread_spin_unlock(&qp->q_lock); return 0; } switch (qp->state) { case QELR_QPS_RST: switch (new_state) { case QELR_QPS_INIT: qp->prev_wqe_size = 0; qelr_reset_qp_hwq_info(&qp->sq); qelr_reset_qp_hwq_info(&qp->rq); break; default: status = -EINVAL; break; }; break; case QELR_QPS_INIT: /* INIT->XXX */ switch (new_state) { case QELR_QPS_RTR: /* Update doorbell (in case post_recv was done before * move to RTR) */ if (IS_ROCE(qp->ibv_qp->context->device) && (qelr_qp_has_rq(qp))) { mmio_wc_start(); writel(qp->rq.db_data.raw, qp->rq.db); mmio_flush_writes(); } break; case QELR_QPS_ERR: break; default: /* invalid state change. */ status = -EINVAL; break; }; break; case QELR_QPS_RTR: /* RTR->XXX */ switch (new_state) { case QELR_QPS_RTS: break; case QELR_QPS_ERR: break; default: /* invalid state change. */ status = -EINVAL; break; }; break; case QELR_QPS_RTS: /* RTS->XXX */ switch (new_state) { case QELR_QPS_SQD: case QELR_QPS_SQE: break; case QELR_QPS_ERR: break; default: /* invalid state change. 
*/ status = -EINVAL; break; }; break; case QELR_QPS_SQD: /* SQD->XXX */ switch (new_state) { case QELR_QPS_RTS: case QELR_QPS_SQE: case QELR_QPS_ERR: break; default: /* invalid state change. */ status = -EINVAL; break; }; break; case QELR_QPS_SQE: switch (new_state) { case QELR_QPS_RTS: case QELR_QPS_ERR: break; default: /* invalid state change. */ status = -EINVAL; break; }; break; case QELR_QPS_ERR: /* ERR->XXX */ switch (new_state) { case QELR_QPS_RST: break; default: status = -EINVAL; break; }; break; default: status = -EINVAL; break; }; if (!status) qp->state = new_state; pthread_spin_unlock(&qp->q_lock); return status; } int qelr_modify_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask) { struct ibv_modify_qp cmd = {}; struct qelr_qp *qp = get_qelr_qp(ibqp); struct qelr_devctx *cxt = get_qelr_ctx(ibqp->context); union ibv_gid sgid, *p_dgid; int rc; verbs_debug(&cxt->ibv_ctx, "QP Modify %p, attr_mask=0x%x\n", qp, attr_mask); qelr_print_qp_attr(cxt, attr); rc = ibv_cmd_modify_qp(ibqp, attr, attr_mask, &cmd, sizeof(cmd)); if (rc) { verbs_err(&cxt->ibv_ctx, "QP Modify: Failed command. rc=%d\n", rc); return rc; } if (attr_mask & IBV_QP_STATE) { rc = qelr_update_qp_state(qp, attr->qp_state); verbs_debug(&cxt->ibv_ctx, "QP Modify state %d->%d, rc=%d\n", qp->state, attr->qp_state, rc); if (rc) { verbs_err(&cxt->ibv_ctx, "QP Modify: Failed to update state. rc=%d\n", rc); return rc; } } /* EDPM must be disabled if GIDs match */ if (attr_mask & IBV_QP_AV) { rc = ibv_query_gid(ibqp->context, attr->ah_attr.port_num, attr->ah_attr.grh.sgid_index, &sgid); if (!rc) { p_dgid = &attr->ah_attr.grh.dgid; qp->edpm_disabled = !memcmp(&sgid, p_dgid, sizeof(sgid)); verbs_debug(&cxt->ibv_ctx, "QP Modify: %p, edpm_disabled=%d\n", qp, qp->edpm_disabled); } else { verbs_err(&cxt->ibv_ctx, "QP Modify: Failed querying GID. 
rc=%d\n", rc); } } return 0; } int qelr_destroy_qp(struct ibv_qp *ibqp) { struct qelr_devctx *cxt = get_qelr_ctx(ibqp->context); struct qelr_qp *qp = get_qelr_qp(ibqp); int rc = 0; verbs_debug(&cxt->ibv_ctx, "destroy qp: %p\n", qp); rc = ibv_cmd_destroy_qp(ibqp); if (rc) { verbs_err(&cxt->ibv_ctx, "destroy qp: failed to destroy %p, got %d.\n", qp, rc); return rc; } qelr_free_sq(qp); qelr_free_rq(qp); qelr_chain_free_sq(qp); qelr_chain_free_rq(qp); if (qp->sq.db_rec_map) munmap(qp->sq.db_rec_map, cxt->kernel_page_size); if (qp->rq.db_rec_map) munmap(qp->rq.db_rec_map, cxt->kernel_page_size); verbs_debug(&cxt->ibv_ctx, "destroy cq: successfully destroyed %p\n", qp); free(qp); return 0; } static int sge_data_len(struct ibv_sge *sg_list, int num_sge) { int i, len = 0; for (i = 0; i < num_sge; i++) len += sg_list[i].length; return len; } static void swap_wqe_data64(uint64_t *p) { __be64 *bep=(__be64 *)p; int i; for (i = 0; i < ROCE_WQE_ELEM_SIZE / sizeof(uint64_t); i++, p++, bep++) *bep = htobe64(*p); } static inline void qelr_init_dpm_info(struct qelr_devctx *cxt, struct qelr_qp *qp, struct ibv_send_wr *wr, struct qelr_dpm *dpm, int data_size) { dpm->is_edpm = 0; dpm->is_ldpm = 0; /* DPM only succeeds when transmit queues are empty */ if (!qelr_chain_is_full(&qp->sq.chain)) return; /* Check if edpm can be used */ if (wr->send_flags & IBV_SEND_INLINE && !qp->edpm_disabled && cxt->dpm_flags & QELR_DPM_FLAGS_ENHANCED && data_size <= cxt->edpm_limit_size) { memset(dpm, 0, sizeof(*dpm)); dpm->rdma_ext = (struct qelr_rdma_ext *)&dpm->payload; dpm->is_edpm = 1; return; } /* Check if ldpm can be used - not inline and limited to ldpm_limit */ if (cxt->dpm_flags & QELR_DPM_FLAGS_LEGACY && !(wr->send_flags & IBV_SEND_INLINE) && data_size <= cxt->ldpm_limit_size) { memset(dpm, 0, sizeof(*dpm)); dpm->is_ldpm = 1; } } #define QELR_IB_OPCODE_SEND_ONLY 0x04 #define QELR_IB_OPCODE_SEND_ONLY_WITH_IMMEDIATE 0x05 #define QELR_IB_OPCODE_RDMA_WRITE_ONLY 0x0a #define QELR_IB_OPCODE_RDMA_WRITE_ONLY_WITH_IMMEDIATE 0x0b #define QELR_IB_OPCODE_SEND_WITH_INV 0x17 #define QELR_IS_IMM_OR_INV(opcode) \ (((opcode) == QELR_IB_OPCODE_SEND_ONLY_WITH_IMMEDIATE) || \ ((opcode) == QELR_IB_OPCODE_RDMA_WRITE_ONLY_WITH_IMMEDIATE) || \ ((opcode) == QELR_IB_OPCODE_SEND_WITH_INV)) static inline void qelr_edpm_set_msg_data(struct qelr_qp *qp, struct qelr_dpm *dpm, uint8_t opcode, uint16_t length, uint8_t se, uint8_t comp) { uint32_t wqe_size, dpm_size, params; /* edpm mode - 0 : ack field is treated by old FW as "completion" * edpm mode - 1 : ack field is treated by new FW as ack which is * always required. */ uint8_t ack = (qp->edpm_mode) ? 1 : comp; params = 0; wqe_size = length + (QELR_IS_IMM_OR_INV(opcode) ? sizeof(uint32_t) : 0); dpm_size = wqe_size + sizeof(struct db_roce_dpm_data); SET_FIELD(params, DB_ROCE_DPM_PARAMS_ACK_REQUEST, ack); SET_FIELD(params, DB_ROCE_DPM_PARAMS_DPM_TYPE, DPM_ROCE); SET_FIELD(params, DB_ROCE_DPM_PARAMS_OPCODE, opcode); SET_FIELD(params, DB_ROCE_DPM_PARAMS_WQE_SIZE, wqe_size); SET_FIELD(params, DB_ROCE_DPM_PARAMS_COMPLETION_FLG, comp ? 1 : 0); SET_FIELD(params, DB_ROCE_DPM_PARAMS_S_FLG, se ? 
1 : 0); SET_FIELD(params, DB_ROCE_DPM_PARAMS_SIZE, (dpm_size + sizeof(uint64_t) - 1) / sizeof(uint64_t)); dpm->msg.data.params.params = htole32(params); } static inline void qelr_edpm_set_inv_imm(struct qelr_qp *qp, struct qelr_dpm *dpm, __be32 data) { memcpy(&dpm->payload[dpm->payload_offset], &data, sizeof(data)); dpm->payload_offset += sizeof(data); dpm->payload_size += sizeof(data); } static inline void qelr_edpm_set_rdma_ext(struct qelr_qp *qp, struct qelr_dpm *dpm, uint64_t remote_addr, uint32_t rkey) { dpm->rdma_ext->remote_va = htobe64(remote_addr); dpm->rdma_ext->remote_key = htobe32(rkey); dpm->payload_offset += sizeof(*dpm->rdma_ext); dpm->payload_size += sizeof(*dpm->rdma_ext); } static inline void qelr_edpm_set_payload(struct qelr_qp *qp, struct qelr_dpm *dpm, char *buf, uint32_t length) { memcpy(&dpm->payload[dpm->payload_offset], buf, length); dpm->payload_offset += length; } static void qelr_prepare_sq_inline_data(struct qelr_qp *qp, struct qelr_dpm *dpm, int data_size, uint8_t *wqe_size, struct ibv_send_wr *wr, uint8_t *bits, uint8_t bit) { int i; uint32_t seg_siz; char *seg_prt, *wqe; if (!data_size) return; /* set the bit */ *bits |= bit; seg_prt = NULL; wqe = NULL; seg_siz = 0; /* copy data inline */ for (i = 0; i < wr->num_sge; i++) { uint32_t len = wr->sg_list[i].length; void *src = (void *)(uintptr_t)wr->sg_list[i].addr; if (dpm->is_edpm) qelr_edpm_set_payload(qp, dpm, src, len); while (len > 0) { uint32_t cur; /* new segment required */ if (!seg_siz) { wqe = (char *)qelr_chain_produce(&qp->sq.chain); seg_prt = wqe; seg_siz = sizeof(struct rdma_sq_common_wqe); (*wqe_size)++; } /* calculate currently allowed length */ cur = min(len, seg_siz); memcpy(seg_prt, src, cur); /* update segment variables */ seg_prt += cur; seg_siz -= cur; /* update sge variables */ src += cur; len -= cur; /* swap fully-completed segments */ if (!seg_siz) swap_wqe_data64((uint64_t *)wqe); } } /* swap last not completed segment */ if (seg_siz) swap_wqe_data64((uint64_t *)wqe); if (dpm->is_edpm) { dpm->payload_size += data_size; if (wr->opcode == IBV_WR_RDMA_WRITE || wr->opcode == IBV_WR_RDMA_WRITE_WITH_IMM) dpm->rdma_ext->dma_length = htobe32(data_size); } } static void qelr_prepare_sq_sges(struct qelr_qp *qp, struct qelr_dpm *dpm, uint8_t *wqe_size, struct ibv_send_wr *wr) { int i; for (i = 0; i < wr->num_sge; i++) { struct rdma_sq_sge *sge = qelr_chain_produce(&qp->sq.chain); TYPEPTR_ADDR_SET(sge, addr, wr->sg_list[i].addr); sge->l_key = htole32(wr->sg_list[i].lkey); sge->length = htole32(wr->sg_list[i].length); if (dpm->is_ldpm) { memcpy(&dpm->payload[dpm->payload_size], sge, sizeof(*sge)); dpm->payload_size += sizeof(*sge); } } if (wqe_size) *wqe_size += wr->num_sge; } static uint32_t qelr_prepare_sq_rdma_data(struct qelr_qp *qp, struct qelr_dpm *dpm, int data_size, uint8_t *p_wqe_size, struct rdma_sq_rdma_wqe_1st *rwqe, struct rdma_sq_rdma_wqe_2nd *rwqe2, struct ibv_send_wr *wr, bool is_imm) { memset(rwqe2, 0, sizeof(*rwqe2)); rwqe2->r_key = htole32(wr->wr.rdma.rkey); TYPEPTR_ADDR_SET(rwqe2, remote_va, wr->wr.rdma.remote_addr); rwqe->length = htole32(data_size); if (is_imm) rwqe->imm_data = htole32(be32toh(wr->imm_data)); if (wr->send_flags & IBV_SEND_INLINE && (wr->opcode == IBV_WR_RDMA_WRITE_WITH_IMM || wr->opcode == IBV_WR_RDMA_WRITE)) { uint8_t flags = 0; SET_FIELD2(flags, RDMA_SQ_RDMA_WQE_1ST_INLINE_FLG, 1); qelr_prepare_sq_inline_data(qp, dpm, data_size, p_wqe_size, wr, &rwqe->flags, flags); rwqe->wqe_size = *p_wqe_size; } else { if (dpm->is_ldpm) dpm->payload_size = sizeof(*rwqe) + 
sizeof(*rwqe2); qelr_prepare_sq_sges(qp, dpm, p_wqe_size, wr); rwqe->wqe_size = *p_wqe_size; if (dpm->is_ldpm) { memcpy(dpm->payload, rwqe, sizeof(*rwqe)); memcpy(&dpm->payload[sizeof(*rwqe)], rwqe2, sizeof(*rwqe2)); } } return data_size; } static uint32_t qelr_prepare_sq_send_data(struct qelr_qp *qp, struct qelr_dpm *dpm, int data_size, uint8_t *p_wqe_size, struct rdma_sq_send_wqe_1st *swqe, struct rdma_sq_send_wqe_2st *swqe2, struct ibv_send_wr *wr, bool is_imm) { memset(swqe2, 0, sizeof(*swqe2)); swqe->length = htole32(data_size); if (is_imm) swqe->inv_key_or_imm_data = htole32(be32toh(wr->imm_data)); if (wr->send_flags & IBV_SEND_INLINE) { uint8_t flags = 0; SET_FIELD2(flags, RDMA_SQ_SEND_WQE_INLINE_FLG, 1); qelr_prepare_sq_inline_data(qp, dpm, data_size, p_wqe_size, wr, &swqe->flags, flags); swqe->wqe_size = *p_wqe_size; } else { if (dpm->is_ldpm) dpm->payload_size = sizeof(*swqe) + sizeof(*swqe2); qelr_prepare_sq_sges(qp, dpm, p_wqe_size, wr); swqe->wqe_size = *p_wqe_size; if (dpm->is_ldpm) { memcpy(dpm->payload, swqe, sizeof(*swqe)); memcpy(&dpm->payload[sizeof(*swqe)], swqe2, sizeof(*swqe2)); } } return data_size; } static void qelr_prepare_sq_atom_data(struct qelr_qp *qp, struct qelr_dpm *dpm, struct rdma_sq_atomic_wqe_1st *awqe1, struct rdma_sq_atomic_wqe_2nd *awqe2, struct rdma_sq_atomic_wqe_3rd *awqe3, struct ibv_send_wr *wr) { if (dpm->is_ldpm) { memcpy(&dpm->payload[dpm->payload_size], awqe1, sizeof(*awqe1)); dpm->payload_size += sizeof(*awqe1); memcpy(&dpm->payload[dpm->payload_size], awqe2, sizeof(*awqe2)); dpm->payload_size += sizeof(*awqe2); memcpy(&dpm->payload[dpm->payload_size], awqe3, sizeof(*awqe3)); dpm->payload_size += sizeof(*awqe3); } qelr_prepare_sq_sges(qp, dpm, NULL, wr); } static inline void qelr_ldpm_prepare_data(struct qelr_qp *qp, struct qelr_dpm *dpm) { uint32_t val, params; /* DPM size is given in 8 bytes so we round up */ val = dpm->payload_size + sizeof(struct db_roce_dpm_data); val = DIV_ROUND_UP(val, sizeof(uint64_t)); params = 0; SET_FIELD(params, DB_ROCE_DPM_PARAMS_SIZE, val); SET_FIELD(params, DB_ROCE_DPM_PARAMS_DPM_TYPE, DPM_LEGACY); dpm->msg.data.params.params = htole32(params); } static enum ibv_wc_opcode qelr_ibv_to_wc_opcode(enum ibv_wr_opcode opcode) { switch (opcode) { case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_WRITE_WITH_IMM: return IBV_WC_RDMA_WRITE; case IBV_WR_SEND_WITH_IMM: case IBV_WR_SEND: case IBV_WR_SEND_WITH_INV: return IBV_WC_SEND; case IBV_WR_RDMA_READ: return IBV_WC_RDMA_READ; case IBV_WR_ATOMIC_CMP_AND_SWP: return IBV_WC_COMP_SWAP; case IBV_WR_ATOMIC_FETCH_AND_ADD: return IBV_WC_FETCH_ADD; default: return IBV_WC_SEND; } } static inline void doorbell_qp(struct qelr_qp *qp) { mmio_wc_start(); writel(qp->sq.db_data.raw, qp->sq.db); /* copy value to doorbell recovery mechanism */ qp->sq.db_rec_addr->db_data = qp->sq.db_data.raw; mmio_flush_writes(); } static inline void doorbell_dpm_qp(struct qelr_devctx *cxt, struct qelr_qp *qp, struct qelr_dpm *dpm) { uint32_t offset = 0; uint64_t *payload = (uint64_t *)dpm->payload; uint32_t num_dwords; int bytes = 0; void *db_addr; mmio_wc_start(); /* Write message header */ dpm->msg.data.icid = qp->sq.db_data.data.icid; dpm->msg.data.prod_val = qp->sq.db_data.data.value; db_addr = qp->sq.edpm_db; writeq(dpm->msg.raw, db_addr); /* Write message body */ bytes += sizeof(uint64_t); num_dwords = DIV_ROUND_UP(dpm->payload_size, sizeof(uint64_t)); db_addr += sizeof(dpm->msg.data); if (bytes == cxt->edpm_trans_size) { mmio_flush_writes(); bytes = 0; } while (offset < num_dwords) { /* endianness is 
different between FW and DORQ HW block */ if (dpm->is_ldpm) mmio_write64_be(db_addr, htobe64(payload[offset])); else /* EDPM */ mmio_write64(db_addr, payload[offset]); bytes += sizeof(uint64_t); db_addr += sizeof(uint64_t); /* Writing to a wc bar. We need to flush the writes every * edpm transaction size otherwise the CPU could optimize away * the duplicate stores. */ if (bytes == cxt->edpm_trans_size) { mmio_flush_writes(); bytes = 0; } offset++; } mmio_flush_writes(); } static inline int qelr_can_post_send(struct qelr_devctx *cxt, struct qelr_qp *qp, struct ibv_send_wr *wr, int data_size) { /* Invalid WR */ if (wr->num_sge > qp->sq.max_sges) { verbs_err(&cxt->ibv_ctx, "error: WR is bad. Post send on QP %p failed\n", qp); return -EINVAL; } /* WR overflow */ if (qelr_wq_is_full(&qp->sq)) { verbs_err(&cxt->ibv_ctx, "error: WQ is full. Post send on QP %p failed (this error appears only once)\n", qp); return -ENOMEM; } /* WQE overflow */ if (qelr_chain_get_elem_left_u32(&qp->sq.chain) < QELR_MAX_SQ_WQE_SIZE) { verbs_err(&cxt->ibv_ctx, "error: WQ PBL is full. Post send on QP %p failed (this error appears only once)\n", qp); return -ENOMEM; } if ((wr->opcode == IBV_WR_ATOMIC_CMP_AND_SWP || wr->opcode == IBV_WR_ATOMIC_FETCH_AND_ADD) && !qp->atomic_supported) { verbs_err(&cxt->ibv_ctx, "Atomic not supported on this machine\n"); return -EINVAL; } if ((wr->send_flags & IBV_SEND_INLINE) && (data_size > ROCE_REQ_MAX_INLINE_DATA_SIZE)) { verbs_err(&cxt->ibv_ctx, "Too much inline data in WR: %d\n", data_size); return -EINVAL; } return 0; } static void qelr_configure_xrc_srq(struct ibv_send_wr *wr, struct rdma_sq_common_wqe *wqe, struct qelr_dpm *dpm) { struct rdma_sq_send_wqe_1st *xrc_wqe; /* xrc_srq location is the same for all relevant wqes */ xrc_wqe = (struct rdma_sq_send_wqe_1st *)wqe; xrc_wqe->xrc_srq = htole32(wr->qp_type.xrc.remote_srqn); if (dpm->is_edpm) { struct qelr_xrceth *xrceth; xrceth = (struct qelr_xrceth *) &dpm->payload[dpm->payload_offset]; xrceth->xrc_srq = htobe32(wr->qp_type.xrc.remote_srqn); dpm->payload_offset += sizeof(*xrceth); dpm->payload_size += sizeof(*xrceth); dpm->rdma_ext = (struct qelr_rdma_ext *)&dpm->payload_offset; } } static int __qelr_post_send(struct qelr_devctx *cxt, struct qelr_qp *qp, struct ibv_send_wr *wr, int data_size, int *normal_db_required) { uint8_t se, comp, fence; struct rdma_sq_common_wqe *wqe; struct rdma_sq_send_wqe_1st *swqe; struct rdma_sq_send_wqe_2st *swqe2; struct rdma_sq_rdma_wqe_1st *rwqe; struct rdma_sq_rdma_wqe_2nd *rwqe2; struct rdma_sq_atomic_wqe_1st *awqe1; struct rdma_sq_atomic_wqe_2nd *awqe2; struct rdma_sq_atomic_wqe_3rd *awqe3; struct qelr_dpm dpm; uint32_t wqe_length; uint8_t wqe_size; uint16_t db_val; int rc = 0; qelr_init_dpm_info(cxt, qp, wr, &dpm, data_size); wqe = qelr_chain_produce(&qp->sq.chain); comp = (!!(wr->send_flags & IBV_SEND_SIGNALED)) || (!!qp->sq_sig_all); qp->wqe_wr_id[qp->sq.prod].signaled = comp; /* common fields */ wqe->flags = 0; se = !!(wr->send_flags & IBV_SEND_SOLICITED); fence = !!(wr->send_flags & IBV_SEND_FENCE); SET_FIELD2(wqe->flags, RDMA_SQ_COMMON_WQE_SE_FLG, se); SET_FIELD2(wqe->flags, RDMA_SQ_COMMON_WQE_COMP_FLG, comp); SET_FIELD2(wqe->flags, RDMA_SQ_COMMON_WQE_RD_FENCE_FLG, fence); wqe->prev_wqe_size = qp->prev_wqe_size; qp->wqe_wr_id[qp->sq.prod].opcode = qelr_ibv_to_wc_opcode(wr->opcode); if (get_ibv_qp(qp)->qp_type == IBV_QPT_XRC_SEND) qelr_configure_xrc_srq(wr, wqe, &dpm); switch (wr->opcode) { case IBV_WR_SEND_WITH_IMM: wqe->req_type = RDMA_SQ_REQ_TYPE_SEND_WITH_IMM; swqe = (struct 
rdma_sq_send_wqe_1st *)wqe; wqe_size = sizeof(struct rdma_sq_send_wqe) / RDMA_WQE_BYTES; swqe2 = (struct rdma_sq_send_wqe_2st *)qelr_chain_produce(&qp->sq.chain); if (dpm.is_edpm) qelr_edpm_set_inv_imm(qp, &dpm, wr->imm_data); wqe_length = qelr_prepare_sq_send_data(qp, &dpm, data_size, &wqe_size, swqe, swqe2, wr, 1 /* Imm */); if (dpm.is_edpm) qelr_edpm_set_msg_data(qp, &dpm, QELR_IB_OPCODE_SEND_ONLY_WITH_IMMEDIATE, wqe_length, se, comp); else if (dpm.is_ldpm) qelr_ldpm_prepare_data(qp, &dpm); qp->wqe_wr_id[qp->sq.prod].wqe_size = wqe_size; qp->prev_wqe_size = wqe_size; qp->wqe_wr_id[qp->sq.prod].bytes_len = wqe_length; break; case IBV_WR_SEND: wqe->req_type = RDMA_SQ_REQ_TYPE_SEND; swqe = (struct rdma_sq_send_wqe_1st *)wqe; wqe_size = sizeof(struct rdma_sq_send_wqe) / RDMA_WQE_BYTES; swqe2 = (struct rdma_sq_send_wqe_2st *)qelr_chain_produce(&qp->sq.chain); wqe_length = qelr_prepare_sq_send_data(qp, &dpm, data_size, &wqe_size, swqe, swqe2, wr, 0); if (dpm.is_edpm) qelr_edpm_set_msg_data(qp, &dpm, QELR_IB_OPCODE_SEND_ONLY, wqe_length, se, comp); else if (dpm.is_ldpm) qelr_ldpm_prepare_data(qp, &dpm); qp->wqe_wr_id[qp->sq.prod].wqe_size = wqe_size; qp->prev_wqe_size = wqe_size; qp->wqe_wr_id[qp->sq.prod].bytes_len = wqe_length; break; case IBV_WR_SEND_WITH_INV: wqe->req_type = RDMA_SQ_REQ_TYPE_SEND_WITH_INVALIDATE; swqe = (struct rdma_sq_send_wqe_1st *)wqe; wqe_size = sizeof(struct rdma_sq_send_wqe) / RDMA_WQE_BYTES; swqe2 = qelr_chain_produce(&qp->sq.chain); if (dpm.is_edpm) qelr_edpm_set_inv_imm(qp, &dpm, htobe32(wr->invalidate_rkey)); swqe->inv_key_or_imm_data = htole32(wr->invalidate_rkey); wqe_length = qelr_prepare_sq_send_data(qp, &dpm, data_size, &wqe_size, swqe, swqe2, wr, 0); if (dpm.is_edpm) qelr_edpm_set_msg_data(qp, &dpm, QELR_IB_OPCODE_SEND_WITH_INV, wqe_length, se, comp); else if (dpm.is_ldpm) qelr_ldpm_prepare_data(qp, &dpm); qp->wqe_wr_id[qp->sq.prod].wqe_size = wqe_size; qp->prev_wqe_size = wqe_size; qp->wqe_wr_id[qp->sq.prod].bytes_len = wqe_length; break; case IBV_WR_RDMA_WRITE_WITH_IMM: wqe->req_type = RDMA_SQ_REQ_TYPE_RDMA_WR_WITH_IMM; rwqe = (struct rdma_sq_rdma_wqe_1st *)wqe; wqe_size = sizeof(struct rdma_sq_rdma_wqe) / RDMA_WQE_BYTES; rwqe2 = (struct rdma_sq_rdma_wqe_2nd *)qelr_chain_produce(&qp->sq.chain); if (dpm.is_edpm) { qelr_edpm_set_rdma_ext(qp, &dpm, wr->wr.rdma.remote_addr, wr->wr.rdma.rkey); qelr_edpm_set_inv_imm(qp, &dpm, wr->imm_data); } wqe_length = qelr_prepare_sq_rdma_data(qp, &dpm, data_size, &wqe_size, rwqe, rwqe2, wr, 1 /* Imm */); if (dpm.is_edpm) qelr_edpm_set_msg_data(qp, &dpm, QELR_IB_OPCODE_RDMA_WRITE_ONLY_WITH_IMMEDIATE, wqe_length + sizeof(*dpm.rdma_ext), se, comp); else if (dpm.is_ldpm) qelr_ldpm_prepare_data(qp, &dpm); qp->wqe_wr_id[qp->sq.prod].wqe_size = wqe_size; qp->prev_wqe_size = wqe_size; qp->wqe_wr_id[qp->sq.prod].bytes_len = wqe_length; break; case IBV_WR_RDMA_WRITE: wqe->req_type = RDMA_SQ_REQ_TYPE_RDMA_WR; rwqe = (struct rdma_sq_rdma_wqe_1st *)wqe; wqe_size = sizeof(struct rdma_sq_rdma_wqe) / RDMA_WQE_BYTES; rwqe2 = (struct rdma_sq_rdma_wqe_2nd *)qelr_chain_produce(&qp->sq.chain); if (dpm.is_edpm) qelr_edpm_set_rdma_ext(qp, &dpm, wr->wr.rdma.remote_addr, wr->wr.rdma.rkey); wqe_length = qelr_prepare_sq_rdma_data(qp, &dpm, data_size, &wqe_size, rwqe, rwqe2, wr, 0); if (dpm.is_edpm) qelr_edpm_set_msg_data(qp, &dpm, QELR_IB_OPCODE_RDMA_WRITE_ONLY, wqe_length + sizeof(*dpm.rdma_ext), se, comp); else if (dpm.is_ldpm) qelr_ldpm_prepare_data(qp, &dpm); qp->wqe_wr_id[qp->sq.prod].wqe_size = wqe_size; qp->prev_wqe_size = wqe_size; 
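		/*
		 * Shadow-queue bookkeeping: the wqe_size recorded above and
		 * the bytes_len stored next are read back by process_req()
		 * at poll time, which consumes exactly wqe_size chain
		 * elements per completed WR and reports bytes_len through
		 * ibv_wc.byte_len.
		 */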
qp->wqe_wr_id[qp->sq.prod].bytes_len = wqe_length; break; case IBV_WR_RDMA_READ: wqe->req_type = RDMA_SQ_REQ_TYPE_RDMA_RD; rwqe = (struct rdma_sq_rdma_wqe_1st *)wqe; wqe_size = sizeof(struct rdma_sq_rdma_wqe) / RDMA_WQE_BYTES; rwqe2 = (struct rdma_sq_rdma_wqe_2nd *)qelr_chain_produce(&qp->sq.chain); wqe_length = qelr_prepare_sq_rdma_data(qp, &dpm, data_size, &wqe_size, rwqe, rwqe2, wr, 0); if (dpm.is_ldpm) qelr_ldpm_prepare_data(qp, &dpm); qp->wqe_wr_id[qp->sq.prod].wqe_size = wqe_size; qp->prev_wqe_size = wqe_size; qp->wqe_wr_id[qp->sq.prod].bytes_len = wqe_length; break; case IBV_WR_ATOMIC_CMP_AND_SWP: case IBV_WR_ATOMIC_FETCH_AND_ADD: awqe1 = (struct rdma_sq_atomic_wqe_1st *)wqe; awqe1->wqe_size = 4; awqe2 = (struct rdma_sq_atomic_wqe_2nd *)qelr_chain_produce(&qp->sq.chain); TYPEPTR_ADDR_SET(awqe2, remote_va, wr->wr.atomic.remote_addr); awqe2->r_key = htole32(wr->wr.atomic.rkey); awqe3 = (struct rdma_sq_atomic_wqe_3rd *)qelr_chain_produce(&qp->sq.chain); if (wr->opcode == IBV_WR_ATOMIC_FETCH_AND_ADD) { wqe->req_type = RDMA_SQ_REQ_TYPE_ATOMIC_ADD; TYPEPTR_ADDR_SET(awqe3, swap_data, wr->wr.atomic.compare_add); } else { wqe->req_type = RDMA_SQ_REQ_TYPE_ATOMIC_CMP_AND_SWAP; TYPEPTR_ADDR_SET(awqe3, swap_data, wr->wr.atomic.swap); TYPEPTR_ADDR_SET(awqe3, cmp_data, wr->wr.atomic.compare_add); } qelr_prepare_sq_atom_data(qp, &dpm, awqe1, awqe2, awqe3, wr); if (dpm.is_ldpm) qelr_ldpm_prepare_data(qp, &dpm); qp->wqe_wr_id[qp->sq.prod].wqe_size = awqe1->wqe_size; qp->prev_wqe_size = awqe1->wqe_size; break; default: /* restore prod to its position before this WR was processed */ qelr_chain_set_prod(&qp->sq.chain, le16toh(qp->sq.db_data.data.value), wqe); /* restore prev_wqe_size */ qp->prev_wqe_size = wqe->prev_wqe_size; rc = -EINVAL; verbs_err(&cxt->ibv_ctx, "Invalid opcode %d in work request on QP %p\n", wr->opcode, qp); break; } if (rc) return rc; qp->wqe_wr_id[qp->sq.prod].wr_id = wr->wr_id; qelr_inc_sw_prod_u16(&qp->sq); db_val = le16toh(qp->sq.db_data.data.value) + 1; qp->sq.db_data.data.value = htole16(db_val); if (dpm.is_edpm || dpm.is_ldpm) { doorbell_dpm_qp(cxt, qp, &dpm); *normal_db_required = 0; } else { *normal_db_required = 1; } return 0; } int qelr_post_send(struct ibv_qp *ib_qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { struct qelr_devctx *cxt = get_qelr_ctx(ib_qp->context); struct qelr_qp *qp = get_qelr_qp(ib_qp); int doorbell_required = 0; *bad_wr = NULL; int rc = 0; pthread_spin_lock(&qp->q_lock); if (IS_ROCE(ib_qp->context->device) && (qp->state != QELR_QPS_RTS && qp->state != QELR_QPS_ERR && qp->state != QELR_QPS_SQD)) { pthread_spin_unlock(&qp->q_lock); *bad_wr = wr; return -EINVAL; } while (wr) { int data_size = sge_data_len(wr->sg_list, wr->num_sge); rc = qelr_can_post_send(cxt, qp, wr, data_size); if (rc) { *bad_wr = wr; break; } rc = __qelr_post_send(cxt, qp, wr, data_size, &doorbell_required); if (rc) { *bad_wr = wr; break; } wr = wr->next; } if (doorbell_required) doorbell_qp(qp); pthread_spin_unlock(&qp->q_lock); return rc; } static uint32_t qelr_srq_elem_left(struct qelr_srq_hwq_info *hw_srq) { uint32_t used; /* Calculate number of elements used based on producer * count and consumer count and subtract it from max * work request supported so that we get elements left. 
*/ used = (uint32_t)(((uint64_t)((uint64_t)~0U) + 1 + (uint64_t)(hw_srq->wr_prod_cnt)) - (uint64_t)hw_srq->wr_cons_cnt); return hw_srq->max_wr - used; } int qelr_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct qelr_devctx *cxt = get_qelr_ctx(ibsrq->context); struct qelr_srq *srq = get_qelr_srq(ibsrq); struct qelr_srq_hwq_info *hw_srq = &srq->hw_srq; struct qelr_chain *chain; int status = 0; pthread_spin_lock(&srq->lock); chain = &srq->hw_srq.chain; while (wr) { struct rdma_srq_wqe_header *hdr; int i; if (!qelr_srq_elem_left(hw_srq) || wr->num_sge > srq->hw_srq.max_sges) { verbs_err(&cxt->ibv_ctx, "Can't post WR (%d,%d) || (%d > %d)\n", hw_srq->wr_prod_cnt, hw_srq->wr_cons_cnt, wr->num_sge, srq->hw_srq.max_sges); status = -ENOMEM; *bad_wr = wr; break; } hdr = qelr_chain_produce(chain); SRQ_HDR_SET(hdr, wr->wr_id, wr->num_sge); hw_srq->wr_prod_cnt++; hw_srq->wqe_prod++; hw_srq->sge_prod++; verbs_debug(&cxt->ibv_ctx, "SRQ WR: SGEs: %d with wr_id[%d] = %" PRIx64 "\n", wr->num_sge, hw_srq->wqe_prod, wr->wr_id); for (i = 0; i < wr->num_sge; i++) { struct rdma_srq_sge *srq_sge; srq_sge = qelr_chain_produce(chain); SRQ_SGE_SET(srq_sge, wr->sg_list[i].addr, wr->sg_list[i].length, wr->sg_list[i].lkey); verbs_debug(&cxt->ibv_ctx, "[%d]: len %d key %x addr %x:%x\n", i, srq_sge->length, srq_sge->l_key, srq_sge->addr.hi, srq_sge->addr.lo); hw_srq->sge_prod++; } /* Make sure that descriptors are written before we update * producers. */ udma_ordering_write_barrier(); struct rdma_srq_producers *virt_prod; virt_prod = srq->hw_srq.virt_prod_pair_addr; virt_prod->sge_prod = htole32(hw_srq->sge_prod); virt_prod->wqe_prod = htole32(hw_srq->wqe_prod); wr = wr->next; } verbs_debug(&cxt->ibv_ctx, "POST: Elements in SRQ: %d\n", qelr_chain_get_elem_left_u32(chain)); pthread_spin_unlock(&srq->lock); return status; } int qelr_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { int status = 0; struct qelr_qp *qp = get_qelr_qp(ibqp); struct qelr_devctx *cxt = get_qelr_ctx(ibqp->context); uint16_t db_val; uint8_t iwarp = IS_IWARP(ibqp->context->device); if (unlikely(qp->srq)) { verbs_err(&cxt->ibv_ctx, "QP is associated with SRQ, cannot post RQ buffers\n"); *bad_wr = wr; return -EINVAL; } pthread_spin_lock(&qp->q_lock); if (!iwarp && qp->state == QELR_QPS_RST) { pthread_spin_unlock(&qp->q_lock); *bad_wr = wr; return -EINVAL; } while (wr) { int i; if (qelr_chain_get_elem_left_u32(&qp->rq.chain) < QELR_MAX_RQ_WQE_SIZE || wr->num_sge > qp->rq.max_sges) { verbs_err(&cxt->ibv_ctx, "Can't post WR (%d < %d) || (%d > %d)\n", qelr_chain_get_elem_left_u32(&qp->rq.chain), QELR_MAX_RQ_WQE_SIZE, wr->num_sge, qp->rq.max_sges); status = -ENOMEM; *bad_wr = wr; break; } for (i = 0; i < wr->num_sge; i++) { uint32_t flags = 0; struct rdma_rq_sge *rqe; /* first one must include the number of SGE in the * list */ if (!i) SET_FIELD(flags, RDMA_RQ_SGE_NUM_SGES, wr->num_sge); SET_FIELD(flags, RDMA_RQ_SGE_L_KEY, wr->sg_list[i].lkey); rqe = qelr_chain_produce(&qp->rq.chain); RQ_SGE_SET(rqe, wr->sg_list[i].addr, wr->sg_list[i].length, flags); } /* Special case of no sges. FW requires between 1-4 sges... * in this case we need to post 1 sge with length zero. this is * because rdma write with immediate consumes an RQ. 
*/ if (!wr->num_sge) { uint32_t flags = 0; struct rdma_rq_sge *rqe; /* first one must include the number of SGE in the * list */ SET_FIELD(flags, RDMA_RQ_SGE_L_KEY, 0); SET_FIELD(flags, RDMA_RQ_SGE_NUM_SGES, 1); rqe = qelr_chain_produce(&qp->rq.chain); RQ_SGE_SET(rqe, 0, 0, flags); i = 1; } qp->rqe_wr_id[qp->rq.prod].wr_id = wr->wr_id; qp->rqe_wr_id[qp->rq.prod].wqe_size = i; qelr_inc_sw_prod_u16(&qp->rq); mmio_wc_start(); db_val = le16toh(qp->rq.db_data.data.value) + 1; qp->rq.db_data.data.value = htole16(db_val); writel(qp->rq.db_data.raw, qp->rq.db); /* copy value to doorbell recovery mechanism */ qp->rq.db_rec_addr->db_data = qp->rq.db_data.raw; mmio_flush_writes(); if (iwarp) { writel(qp->rq.iwarp_db2_data.raw, qp->rq.iwarp_db2); mmio_flush_writes(); } wr = wr->next; } pthread_spin_unlock(&qp->q_lock); return status; } static int is_valid_cqe(struct qelr_cq *cq, union rdma_cqe *cqe) { struct rdma_cqe_requester *resp_cqe = &cqe->req; return (resp_cqe->flags & RDMA_CQE_REQUESTER_TOGGLE_BIT_MASK) == cq->chain_toggle; } static enum rdma_cqe_type cqe_get_type(union rdma_cqe *cqe) { struct rdma_cqe_requester *resp_cqe = &cqe->req; return GET_FIELD(resp_cqe->flags, RDMA_CQE_REQUESTER_TYPE); } static struct qelr_qp *cqe_get_qp(union rdma_cqe *cqe) { struct regpair *qph = &cqe->req.qp_handle; return (struct qelr_qp *)HILO_U64(le32toh(qph->hi), le32toh(qph->lo)); } static int process_req(struct qelr_qp *qp, struct qelr_cq *cq, int num_entries, struct ibv_wc *wc, uint16_t hw_cons, enum ibv_wc_status status, int force) { struct qelr_devctx *cxt = get_qelr_ctx(qp->ibv_qp->context); uint16_t cnt = 0; while (num_entries && qp->sq.wqe_cons != hw_cons) { if (!qp->wqe_wr_id[qp->sq.cons].signaled && !force) { /* skip WC */ goto next_cqe; } /* fill WC */ wc->status = status; wc->wc_flags = 0; wc->qp_num = qp->qp_id; /* common section */ wc->wr_id = qp->wqe_wr_id[qp->sq.cons].wr_id; wc->opcode = qp->wqe_wr_id[qp->sq.cons].opcode; switch (wc->opcode) { case IBV_WC_RDMA_WRITE: wc->byte_len = qp->wqe_wr_id[qp->sq.cons].bytes_len; verbs_debug(&cxt->ibv_ctx, "POLL REQ CQ: IBV_WC_RDMA_WRITE byte_len=%d\n", qp->wqe_wr_id[qp->sq.cons].bytes_len); break; case IBV_WC_COMP_SWAP: case IBV_WC_FETCH_ADD: wc->byte_len = 8; break; case IBV_WC_RDMA_READ: case IBV_WC_SEND: case IBV_WC_BIND_MW: wc->byte_len = qp->wqe_wr_id[qp->sq.cons].bytes_len; verbs_debug(&cxt->ibv_ctx, "POLL REQ CQ: IBV_WC_RDMA_READ / IBV_WC_SEND\n"); break; default: break; } num_entries--; wc++; cnt++; next_cqe: while (qp->wqe_wr_id[qp->sq.cons].wqe_size--) qelr_chain_consume(&qp->sq.chain); qelr_inc_sw_cons_u16(&qp->sq); } return cnt; } static int qelr_poll_cq_req(struct qelr_qp *qp, struct qelr_cq *cq, int num_entries, struct ibv_wc *wc, struct rdma_cqe_requester *req) { struct qelr_devctx *cxt = get_qelr_ctx(qp->ibv_qp->context); uint16_t sq_cons = le16toh(req->sq_cons); int cnt = 0; switch (req->status) { case RDMA_CQE_REQ_STS_OK: cnt = process_req(qp, cq, num_entries, wc, sq_cons, IBV_WC_SUCCESS, 0); break; case RDMA_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR: verbs_err(&cxt->ibv_ctx, "Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. 
QP icid=0x%x\n", qp->sq.icid); cnt = process_req(qp, cq, num_entries, wc, sq_cons, IBV_WC_WR_FLUSH_ERR, 1); break; default: /* other errors case */ /* process all WQE before the consumer */ qp->state = QELR_QPS_ERR; cnt = process_req(qp, cq, num_entries, wc, sq_cons - 1, IBV_WC_SUCCESS, 0); wc += cnt; /* if we have extra WC fill it with actual error info */ if (cnt < num_entries) { enum ibv_wc_status wc_status; switch (req->status) { case RDMA_CQE_REQ_STS_BAD_RESPONSE_ERR: verbs_err(&cxt->ibv_ctx, "Error: POLL CQ with RDMA_CQE_REQ_STS_BAD_RESPONSE_ERR. QP icid=0x%x\n", qp->sq.icid); wc_status = IBV_WC_BAD_RESP_ERR; break; case RDMA_CQE_REQ_STS_LOCAL_LENGTH_ERR: verbs_err(&cxt->ibv_ctx, "Error: POLL CQ with RDMA_CQE_REQ_STS_LOCAL_LENGTH_ERR. QP icid=0x%x\n", qp->sq.icid); wc_status = IBV_WC_LOC_LEN_ERR; break; case RDMA_CQE_REQ_STS_LOCAL_QP_OPERATION_ERR: verbs_err(&cxt->ibv_ctx, "Error: POLL CQ with RDMA_CQE_REQ_STS_LOCAL_QP_OPERATION_ERR. QP icid=0x%x\n", qp->sq.icid); wc_status = IBV_WC_LOC_QP_OP_ERR; break; case RDMA_CQE_REQ_STS_LOCAL_PROTECTION_ERR: verbs_err(&cxt->ibv_ctx, "Error: POLL CQ with RDMA_CQE_REQ_STS_LOCAL_PROTECTION_ERR. QP icid=0x%x\n", qp->sq.icid); wc_status = IBV_WC_LOC_PROT_ERR; break; case RDMA_CQE_REQ_STS_MEMORY_MGT_OPERATION_ERR: verbs_err(&cxt->ibv_ctx, "Error: POLL CQ with RDMA_CQE_REQ_STS_MEMORY_MGT_OPERATION_ERR. QP icid=0x%x\n", qp->sq.icid); wc_status = IBV_WC_MW_BIND_ERR; break; case RDMA_CQE_REQ_STS_REMOTE_INVALID_REQUEST_ERR: verbs_err(&cxt->ibv_ctx, "Error: POLL CQ with RDMA_CQE_REQ_STS_REMOTE_INVALID_REQUEST_ERR. QP icid=0x%x\n", qp->sq.icid); wc_status = IBV_WC_REM_INV_REQ_ERR; break; case RDMA_CQE_REQ_STS_REMOTE_ACCESS_ERR: verbs_err(&cxt->ibv_ctx, "Error: POLL CQ with RDMA_CQE_REQ_STS_REMOTE_ACCESS_ERR. QP icid=0x%x\n", qp->sq.icid); wc_status = IBV_WC_REM_ACCESS_ERR; break; case RDMA_CQE_REQ_STS_REMOTE_OPERATION_ERR: verbs_err(&cxt->ibv_ctx, "Error: POLL CQ with RDMA_CQE_REQ_STS_REMOTE_OPERATION_ERR. QP icid=0x%x\n", qp->sq.icid); wc_status = IBV_WC_REM_OP_ERR; break; case RDMA_CQE_REQ_STS_RNR_NAK_RETRY_CNT_ERR: verbs_err(&cxt->ibv_ctx, "Error: POLL CQ with RDMA_CQE_REQ_STS_RNR_NAK_RETRY_CNT_ERR. QP icid=0x%x\n", qp->sq.icid); wc_status = IBV_WC_RNR_RETRY_EXC_ERR; break; case RDMA_CQE_REQ_STS_TRANSPORT_RETRY_CNT_ERR: verbs_err(&cxt->ibv_ctx, "RDMA_CQE_REQ_STS_TRANSPORT_RETRY_CNT_ERR. QP icid=0x%x\n", qp->sq.icid); wc_status = IBV_WC_RETRY_EXC_ERR; break; default: verbs_err(&cxt->ibv_ctx, "IBV_WC_GENERAL_ERR. 
QP icid=0x%x\n", qp->sq.icid); wc_status = IBV_WC_GENERAL_ERR; } cnt += process_req(qp, cq, 1, wc, sq_cons, wc_status, 1 /* force use of WC */); } } return cnt; } static void __process_resp_one(struct qelr_devctx *cxt, struct qelr_cq *cq, struct ibv_wc *wc, struct rdma_cqe_responder *resp, uint64_t wr_id, uint32_t qp_id) { enum ibv_wc_status wc_status = IBV_WC_SUCCESS; uint8_t flags; wc->opcode = IBV_WC_RECV; wc->wr_id = wr_id; wc->wc_flags = 0; switch (resp->status) { case RDMA_CQE_RESP_STS_LOCAL_ACCESS_ERR: wc_status = IBV_WC_LOC_ACCESS_ERR; break; case RDMA_CQE_RESP_STS_LOCAL_LENGTH_ERR: wc_status = IBV_WC_LOC_LEN_ERR; break; case RDMA_CQE_RESP_STS_LOCAL_QP_OPERATION_ERR: wc_status = IBV_WC_LOC_QP_OP_ERR; break; case RDMA_CQE_RESP_STS_LOCAL_PROTECTION_ERR: wc_status = IBV_WC_LOC_PROT_ERR; break; case RDMA_CQE_RESP_STS_MEMORY_MGT_OPERATION_ERR: wc_status = IBV_WC_MW_BIND_ERR; break; case RDMA_CQE_RESP_STS_REMOTE_INVALID_REQUEST_ERR: wc_status = IBV_WC_REM_INV_RD_REQ_ERR; break; case RDMA_CQE_RESP_STS_OK: wc_status = IBV_WC_SUCCESS; wc->byte_len = le32toh(resp->length); if (GET_FIELD(resp->flags, RDMA_CQE_REQUESTER_TYPE) == RDMA_CQE_TYPE_RESPONDER_XRC_SRQ) wc->src_qp = le16toh(resp->rq_cons_or_srq_id); flags = resp->flags & QELR_RESP_RDMA_IMM; switch (flags) { case QELR_RESP_RDMA_IMM: /* update opcode */ wc->opcode = IBV_WC_RECV_RDMA_WITH_IMM; SWITCH_FALLTHROUGH; case QELR_RESP_IMM: wc->imm_data = htobe32(le32toh(resp->imm_data_or_inv_r_Key)); wc->wc_flags |= IBV_WC_WITH_IMM; break; case QELR_RESP_INV: wc->invalidated_rkey = le32toh(resp->imm_data_or_inv_r_Key); wc->wc_flags |= IBV_WC_WITH_INV; break; case QELR_RESP_RDMA: verbs_err(&cxt->ibv_ctx, "Invalid flags detected\n"); break; default: /* valid configuration, but nothing to do here */ break; } break; default: wc->status = IBV_WC_GENERAL_ERR; verbs_err(&cxt->ibv_ctx, "Invalid CQE status detected\n"); } /* fill WC */ wc->status = wc_status; wc->qp_num = qp_id; } static int process_resp_one_srq(struct qelr_srq *srq, struct qelr_cq *cq, struct ibv_wc *wc, struct rdma_cqe_responder *resp, uint32_t qp_id) { struct qelr_srq_hwq_info *hw_srq = &srq->hw_srq; uint64_t wr_id; wr_id = (((uint64_t)(le32toh(resp->srq_wr_id.hi))) << 32) + le32toh(resp->srq_wr_id.lo); if (resp->status == RDMA_CQE_RESP_STS_WORK_REQUEST_FLUSHED_ERR) { wc->byte_len = 0; wc->status = IBV_WC_WR_FLUSH_ERR; wc->qp_num = qp_id; wc->wr_id = wr_id; } else { __process_resp_one(get_qelr_ctx(srq->verbs_srq.srq.context), cq, wc, resp, wr_id, qp_id); } hw_srq->wr_cons_cnt++; return 1; } static int process_resp_one(struct qelr_qp *qp, struct qelr_cq *cq, struct ibv_wc *wc, struct rdma_cqe_responder *resp) { uint64_t wr_id = qp->rqe_wr_id[qp->rq.cons].wr_id; __process_resp_one(get_qelr_ctx(qp->ibv_qp->context), cq, wc, resp, wr_id, qp->qp_id); while (qp->rqe_wr_id[qp->rq.cons].wqe_size--) qelr_chain_consume(&qp->rq.chain); qelr_inc_sw_cons_u16(&qp->rq); return 1; } static int process_resp_flush(struct qelr_qp *qp, struct qelr_cq *cq, int num_entries, struct ibv_wc *wc, uint16_t hw_cons) { uint16_t cnt = 0; while (num_entries && qp->rq.wqe_cons != hw_cons) { /* fill WC */ wc->status = IBV_WC_WR_FLUSH_ERR; wc->qp_num = qp->qp_id; wc->byte_len = 0; wc->wr_id = qp->rqe_wr_id[qp->rq.cons].wr_id; num_entries--; wc++; cnt++; while (qp->rqe_wr_id[qp->rq.cons].wqe_size--) qelr_chain_consume(&qp->rq.chain); qelr_inc_sw_cons_u16(&qp->rq); } return cnt; } /* return latest CQE (needs processing) */ static union rdma_cqe *get_cqe(struct qelr_cq *cq) { return cq->latest_cqe; } static void 
try_consume_req_cqe(struct qelr_cq *cq, struct qelr_qp *qp, struct rdma_cqe_requester *req, int *update) { uint16_t sq_cons = le16toh(req->sq_cons); if (sq_cons == qp->sq.wqe_cons) { consume_cqe(cq); *update |= 1; } } /* used with flush only, when resp->rq_cons is valid */ static void try_consume_resp_cqe(struct qelr_cq *cq, struct qelr_qp *qp, uint16_t rq_cons, int *update) { if (rq_cons == qp->rq.wqe_cons) { consume_cqe(cq); *update |= 1; } } static int qelr_poll_cq_resp_srq(struct qelr_srq *srq, struct qelr_cq *cq, int num_entries, struct ibv_wc *wc, struct rdma_cqe_responder *resp, int *update, uint32_t qp_id) { int cnt; cnt = process_resp_one_srq(srq, cq, wc, resp, qp_id); consume_cqe(cq); *update |= 1; return cnt; } static int qelr_poll_cq_resp(struct qelr_qp *qp, struct qelr_cq *cq, int num_entries, struct ibv_wc *wc, struct rdma_cqe_responder *resp, int *update) { uint16_t rq_cons = le16toh(resp->rq_cons_or_srq_id); int cnt; if (resp->status == RDMA_CQE_RESP_STS_WORK_REQUEST_FLUSHED_ERR) { cnt = process_resp_flush(qp, cq, num_entries, wc, rq_cons); try_consume_resp_cqe(cq, qp, rq_cons, update); } else { cnt = process_resp_one(qp, cq, wc, resp); consume_cqe(cq); *update |= 1; } return cnt; } static void doorbell_cq(struct qelr_cq *cq, uint32_t cons, uint8_t flags) { mmio_wc_start(); cq->db.data.agg_flags = flags; cq->db.data.value = htole32(cons); writeq(cq->db.raw, cq->db_addr); /* copy value to doorbell recovery mechanism */ cq->db_rec_addr->db_data = cq->db.raw; mmio_flush_writes(); } static struct qelr_srq *qelr_get_xrc_srq_from_cqe(struct qelr_cq *cq, union rdma_cqe *cqe, struct qelr_qp *qp) { struct qelr_devctx *cxt; struct qelr_srq *srq; uint16_t srq_id; srq_id = le16toh(cqe->resp.rq_cons_or_srq_id); cxt = get_qelr_ctx(cq->ibv_cq.context); srq = qelr_get_srq(cxt, srq_id); if (unlikely(!srq)) { verbs_err(&cxt->ibv_ctx, "srq handle is null\n"); return NULL; } return srq; } int qelr_poll_cq(struct ibv_cq *ibcq, int num_entries, struct ibv_wc *wc) { struct qelr_cq *cq = get_qelr_cq(ibcq); int done = 0; union rdma_cqe *cqe = get_cqe(cq); struct qelr_srq *srq; struct regpair *qph; int update = 0; uint32_t db_cons; uint32_t qp_id; while (num_entries && is_valid_cqe(cq, cqe)) { int cnt = 0; struct qelr_qp *qp; /* prevent speculative reads of any field of CQE */ udma_from_device_barrier(); qp = cqe_get_qp(cqe); if (!qp && cqe_get_type(cqe) != RDMA_CQE_TYPE_RESPONDER_XRC_SRQ) { /* qp is NULL on this path; log through the CQ's context rather than dereferencing it */ verbs_err(verbs_get_ctx(ibcq->context), "Error: CQE QP pointer is NULL. 
CQE=%p\n", cqe); break; } switch (cqe_get_type(cqe)) { case RDMA_CQE_TYPE_REQUESTER: cnt = qelr_poll_cq_req(qp, cq, num_entries, wc, &cqe->req); try_consume_req_cqe(cq, qp, &cqe->req, &update); break; case RDMA_CQE_TYPE_RESPONDER_RQ: cnt = qelr_poll_cq_resp(qp, cq, num_entries, wc, &cqe->resp, &update); break; case RDMA_CQE_TYPE_RESPONDER_XRC_SRQ: qph = &cqe->req.qp_handle; srq = qelr_get_xrc_srq_from_cqe(cq, cqe, qp); if (unlikely(!srq)) { consume_cqe(cq); cqe = get_cqe(cq); update |= 1; continue; } qp_id = le32toh(qph->lo); cnt = qelr_poll_cq_resp_srq(srq, cq, num_entries, wc, &cqe->resp, &update, qp_id); break; case RDMA_CQE_TYPE_RESPONDER_SRQ: cnt = qelr_poll_cq_resp_srq(qp->srq, cq, num_entries, wc, &cqe->resp, &update, qp->qp_id); break; case RDMA_CQE_TYPE_INVALID: default: printf("Error: invalid CQE type = %d\n", cqe_get_type(cqe)); } num_entries -= cnt; wc += cnt; done += cnt; cqe = get_cqe(cq); } db_cons = qelr_chain_get_cons_idx_u32(&cq->chain) - 1; if (update) { /* doorbell notifies about latest VALID entry, * but chain already point to the next INVALID one */ doorbell_cq(cq, db_cons, cq->arm_flags); } return done; } void qelr_cq_event(struct ibv_cq *ibcq) { /* Trigger received, can reset arm flags */ struct qelr_cq *cq = get_qelr_cq(ibcq); cq->arm_flags = 0; } int qelr_arm_cq(struct ibv_cq *ibcq, int solicited) { struct qelr_cq *cq = get_qelr_cq(ibcq); uint32_t db_cons; db_cons = qelr_chain_get_cons_idx_u32(&cq->chain) - 1; cq->arm_flags = solicited ? DQ_UCM_ROCE_CQ_ARM_SE_CF_CMD : DQ_UCM_ROCE_CQ_ARM_CF_CMD; doorbell_cq(cq, db_cons, cq->arm_flags); return 0; } void qelr_async_event(struct ibv_context *context, struct ibv_async_event *event) { struct qelr_cq *cq = NULL; struct qelr_qp *qp = NULL; switch (event->event_type) { case IBV_EVENT_CQ_ERR: cq = get_qelr_cq(event->element.cq); break; case IBV_EVENT_QP_FATAL: case IBV_EVENT_QP_REQ_ERR: case IBV_EVENT_QP_ACCESS_ERR: case IBV_EVENT_PATH_MIG_ERR:{ qp = get_qelr_qp(event->element.qp); break; } case IBV_EVENT_SQ_DRAINED: case IBV_EVENT_PATH_MIG: case IBV_EVENT_COMM_EST: case IBV_EVENT_QP_LAST_WQE_REACHED: break; case IBV_EVENT_SRQ_LIMIT_REACHED: case IBV_EVENT_SRQ_ERR: return; case IBV_EVENT_PORT_ACTIVE: case IBV_EVENT_PORT_ERR: break; default: break; } fprintf(stderr, "qelr_async_event not implemented yet cq=%p qp=%p\n", cq, qp); } struct ibv_xrcd *qelr_open_xrcd(struct ibv_context *context, struct ibv_xrcd_init_attr *init_attr) { struct qelr_devctx *cxt = get_qelr_ctx(context); struct ib_uverbs_open_xrcd_resp resp; struct ibv_open_xrcd cmd; struct verbs_xrcd *xrcd; int rc; xrcd = calloc(1, sizeof(*xrcd)); if (!xrcd) return NULL; rc = ibv_cmd_open_xrcd(context, xrcd, sizeof(*xrcd), init_attr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (rc) { verbs_err(&cxt->ibv_ctx, "open xrcd: failed with rc=%d.\n", rc); free(xrcd); return NULL; } return &xrcd->xrcd; } int qelr_close_xrcd(struct ibv_xrcd *ibxrcd) { struct verbs_xrcd *xrcd = container_of(ibxrcd, struct verbs_xrcd, xrcd); struct qelr_devctx *cxt = get_qelr_ctx(ibxrcd->context); int rc; rc = ibv_cmd_close_xrcd(xrcd); if (rc) { verbs_err(&cxt->ibv_ctx, "close xrcd: failed with rc=%d.\n", rc); free(xrcd); } return rc; } static struct ibv_srq * qelr_create_xrc_srq(struct ibv_context *context, struct ibv_srq_init_attr_ex *init_attr) { struct qelr_devctx *cxt = get_qelr_ctx(context); struct qelr_create_srq_ex req; struct qelr_create_srq_resp resp; struct ibv_srq *ibv_srq; struct qelr_srq *srq; int rc = 0; srq = calloc(1, sizeof(*srq)); if (!srq) goto err0; ibv_srq = 
&srq->verbs_srq.srq; rc = qelr_create_srq_buffers(cxt, srq, init_attr->attr.max_wr); if (rc) goto err1; pthread_spin_init(&srq->lock, PTHREAD_PROCESS_PRIVATE); qelr_create_srq_configure_req_ex(srq, &req); rc = ibv_cmd_create_srq_ex(context, &srq->verbs_srq, init_attr, &req.ibv_cmd, sizeof(req), &resp.ibv_resp, sizeof(resp)); if (rc) goto err1; if (unlikely(resp.srq_id >= QELR_MAX_SRQ_ID)) { rc = -EINVAL; goto err1; } srq->srq_id = resp.srq_id; srq->is_xrc = 1; cxt->srq_table[resp.srq_id] = srq; verbs_debug(&cxt->ibv_ctx, "create srq_ex: successfully created %p.\n", srq); return ibv_srq; err1: qelr_destroy_srq_buffers(ibv_srq); free(srq); err0: verbs_err(&cxt->ibv_ctx, "create srq: failed to create. rc=%d\n", rc); return NULL; } int qelr_get_srq_num(struct ibv_srq *ibv_srq, uint32_t *srq_num) { struct qelr_srq *srq = get_qelr_srq(ibv_srq); *srq_num = srq->srq_id; return 0; } struct ibv_srq *qelr_create_srq_ex(struct ibv_context *context, struct ibv_srq_init_attr_ex *init_attr) { struct qelr_devctx *cxt = get_qelr_ctx(context); if (init_attr->srq_type == IBV_SRQT_BASIC) return qelr_create_srq(init_attr->pd, (struct ibv_srq_init_attr *)init_attr); if (init_attr->srq_type == IBV_SRQT_XRC) return qelr_create_xrc_srq(context, init_attr); verbs_err(&cxt->ibv_ctx, "failed to create srq type %d\n", init_attr->srq_type); return NULL; } static struct ibv_qp *create_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *attrx) { struct qelr_devctx *cxt = get_qelr_ctx(context); struct qelr_create_qp_resp resp = {}; struct qelr_create_qp req; struct ibv_qp *ibqp; struct qelr_qp *qp; int rc; qelr_print_qp_init_attr(cxt, attrx); #define QELR_CREATE_QP_SUPP_ATTR_MASK \ (IBV_QP_INIT_ATTR_PD | IBV_QP_INIT_ATTR_XRCD) if (!check_comp_mask(attrx->comp_mask, QELR_CREATE_QP_SUPP_ATTR_MASK)) { errno = EOPNOTSUPP; return NULL; } qp = calloc(1, sizeof(*qp)); if (!qp) return NULL; qelr_basic_qp_config(qp, attrx); rc = qelr_create_qp_buffers(cxt, qp, attrx); if (rc) goto err0; qelr_create_qp_configure_req(qp, &req); rc = ibv_cmd_create_qp_ex(context, &qp->verbs_qp, attrx, &req.ibv_cmd, sizeof(req), &resp.ibv_resp, sizeof(resp)); if (rc) { verbs_err(&cxt->ibv_ctx, "create qp: failed on ibv_cmd_create_qp with %d\n", rc); goto err1; } rc = qelr_configure_qp(cxt, qp, attrx, &resp); if (rc) goto err2; verbs_debug(&cxt->ibv_ctx, "create qp: successfully created %p. handle_hi=%x handle_lo=%x\n", qp, req.qp_handle_hi, req.qp_handle_lo); ibqp = (struct ibv_qp *)&qp->verbs_qp; qp->ibv_qp = ibqp; return get_ibv_qp(qp); err2: rc = ibv_cmd_destroy_qp(get_ibv_qp(qp)); if (rc) verbs_err(&cxt->ibv_ctx, "create qp: fatal fault. rc=%d\n", rc); err1: if (qelr_qp_has_sq(qp)) qelr_chain_free(&qp->sq.chain); if (qelr_qp_has_rq(qp)) qelr_chain_free(&qp->rq.chain); err0: free(qp); return NULL; } struct ibv_qp *qelr_create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr) { return create_qp(context, attr); } struct ibv_qp *qelr_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct ibv_qp *qp; struct ibv_qp_init_attr_ex attrx = {}; memcpy(&attrx, attr, sizeof(*attr)); attrx.comp_mask = IBV_QP_INIT_ATTR_PD; attrx.pd = pd; qp = create_qp(pd->context, &attrx); if (qp) memcpy(attr, &attrx, sizeof(*attr)); return qp; } rdma-core-56.1/providers/qedr/qelr_verbs.h000066400000000000000000000075671477342711600206360ustar00rootroot00000000000000/* * Copyright (c) 2015-2016 QLogic Corporation * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and /or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __QELR_VERBS_H__ #define __QELR_VERBS_H__ #include #include #include #include #include int qelr_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); int qelr_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr); struct ibv_pd *qelr_alloc_pd(struct ibv_context *context); int qelr_dealloc_pd(struct ibv_pd *ibpd); struct ibv_mr *qelr_reg_mr(struct ibv_pd *ibpd, void *addr, size_t len, uint64_t hca_va, int access); int qelr_dereg_mr(struct verbs_mr *mr); struct ibv_cq *qelr_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); int qelr_arm_cq(struct ibv_cq *ibcq, int solicited); int qelr_poll_cq(struct ibv_cq *ibcq, int num_entries, struct ibv_wc *wc); void qelr_cq_event(struct ibv_cq *ibcq); int qelr_destroy_cq(struct ibv_cq *); struct ibv_qp *qelr_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attrs); int qelr_modify_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask); int qelr_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); int qelr_destroy_qp(struct ibv_qp *ibqp); int qelr_post_send(struct ibv_qp *ib_qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr); int qelr_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); int qelr_query_srq(struct ibv_srq *ibv_srq, struct ibv_srq_attr *attr); int qelr_modify_srq(struct ibv_srq *ibv_srq, struct ibv_srq_attr *attr, int attr_mask); struct ibv_srq *qelr_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *init_attr); int qelr_destroy_srq(struct ibv_srq *ibv_srq); int qelr_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); void qelr_async_event(struct ibv_context *context, struct ibv_async_event *event); struct ibv_xrcd *qelr_open_xrcd(struct ibv_context *context, struct ibv_xrcd_init_attr *init_attr); int qelr_close_xrcd(struct ibv_xrcd *ibxrcd); struct ibv_srq *qelr_create_srq_ex(struct ibv_context *context, struct ibv_srq_init_attr_ex *init_attr); struct ibv_qp *qelr_create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *attrx); int qelr_get_srq_num(struct ibv_srq *srq, uint32_t *srq_num); #endif /* __QELR_VERBS_H__ */ 
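/*
 * Illustrative usage sketch -- not part of the original tree. The entry
 * points declared above are reached through the generic libibverbs API;
 * a minimal, hedged example of the completion path (serviced for qedr
 * devices by qelr_poll_cq() and the CQ arming hook) follows. The helper
 * name drain_and_rearm() is invented for illustration.
 */
#if 0 /* sketch only, never compiled */
#include <infiniband/verbs.h>

static int drain_and_rearm(struct ibv_cq *cq)
{
	struct ibv_wc wc[16];
	int n, i;

	/* repeatedly drain completions; ibv_poll_cq() returns <0 on error */
	while ((n = ibv_poll_cq(cq, 16, wc)) > 0) {
		for (i = 0; i < n; i++)
			if (wc[i].status != IBV_WC_SUCCESS)
				return -1;
	}
	if (n < 0)
		return n;

	/* request a completion event for the next CQE (arms the CQ) */
	return ibv_req_notify_cq(cq, 0);
}
#endif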
rdma-core-56.1/providers/qedr/rdma_common.h000066400000000000000000000050571477342711600207550ustar00rootroot00000000000000/* * Copyright (c) 2015-2016 QLogic Corporation * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and /or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __RDMA_COMMON__ #define __RDMA_COMMON__ #include /************************/ /* RDMA FW CONSTANTS */ /************************/ #define RDMA_RESERVED_LKEY (0) //Reserved lkey #define RDMA_RING_PAGE_SIZE (0x1000) //4KB pages #define RDMA_MAX_SGE_PER_SQ_WQE (4) //max number of SGEs in a single request #define RDMA_MAX_SGE_PER_RQ_WQE (4) //max number of SGEs in a single request #define RDMA_MAX_DATA_SIZE_IN_WQE (0x7FFFFFFF) //max size of data in single request #define RDMA_REQ_RD_ATOMIC_ELM_SIZE (0x50) #define RDMA_RESP_RD_ATOMIC_ELM_SIZE (0x20) #define RDMA_MAX_CQS (64*1024) #define RDMA_MAX_TIDS (128*1024-1) #define RDMA_MAX_PDS (64*1024) #define RDMA_MAX_SRQS (32*1024) #define RDMA_NUM_STATISTIC_COUNTERS MAX_NUM_VPORTS #define RDMA_NUM_STATISTIC_COUNTERS_K2 MAX_NUM_VPORTS_K2 #define RDMA_NUM_STATISTIC_COUNTERS_BB MAX_NUM_VPORTS_BB #define RDMA_TASK_TYPE (PROTOCOLID_ROCE) struct rdma_srq_id { __le16 srq_idx /* SRQ index */; __le16 opaque_fid; }; struct rdma_srq_producers { __le32 sge_prod /* Current produced sge in SRQ */; __le32 wqe_prod /* Current produced WQE to SRQ */; }; #endif /* __RDMA_COMMON__ */ rdma-core-56.1/providers/qedr/roce_common.h000066400000000000000000000042151477342711600207550ustar00rootroot00000000000000/* * Copyright (c) 2015-2016 QLogic Corporation * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and /or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __ROCE_COMMON__ #define __ROCE_COMMON__ /************************************************************************/ /* Add include to common rdma target for both eCore and protocol rdma driver */ /************************************************************************/ #include "rdma_common.h" /************************/ /* ROCE FW CONSTANTS */ /************************/ #define ROCE_REQ_MAX_INLINE_DATA_SIZE (256) //max size of inline data in single request #define ROCE_REQ_MAX_SINGLE_SQ_WQE_SIZE (288) //Maximum size of single SQ WQE (rdma wqe and inline data) #define ROCE_MAX_QPS (32*1024) #define ROCE_DCQCN_NP_MAX_QPS (64) /* notification point max QPs*/ #define ROCE_DCQCN_RP_MAX_QPS (64) /* reaction point max QPs*/ #endif /* __ROCE_COMMON__ */ rdma-core-56.1/providers/rxe/000077500000000000000000000000001477342711600161455ustar00rootroot00000000000000rdma-core-56.1/providers/rxe/CMakeLists.txt000066400000000000000000000005061477342711600207060ustar00rootroot00000000000000if (ENABLE_LTTNG AND LTTNGUST_FOUND) set(TRACE_FILE rxe_trace.c) endif() rdma_provider(rxe ${TRACE_FILE} rxe.c ) if (ENABLE_LTTNG AND LTTNGUST_FOUND) target_include_directories("rxe-rdmav${IBVERBS_PABI_VERSION}" PUBLIC ".") target_link_libraries("rxe-rdmav${IBVERBS_PABI_VERSION}" LINK_PRIVATE LTTng::UST) endif() rdma-core-56.1/providers/rxe/man/000077500000000000000000000000001477342711600167205ustar00rootroot00000000000000rdma-core-56.1/providers/rxe/man/CMakeLists.txt000066400000000000000000000000321477342711600214530ustar00rootroot00000000000000rdma_man_pages( rxe.7 ) rdma-core-56.1/providers/rxe/man/rxe.7000066400000000000000000000075671477342711600176250ustar00rootroot00000000000000.\" -*- nroff -*- .\" .TH RXE 7 2011-06-29 1.0.0 .SH "NAME" rxe \- Software RDMA over Ethernet .SH "SYNOPSIS" \fBmodprobe rdma_rxe\fR .br This is usually performed by a configuration utility (see \fBrdma link\fR(8).) .SH "DESCRIPTION" The rdma_rxe kernel module provides a software implementation of the RoCEv2 protocol. The RoCEv2 protocol is an RDMA transport protocol that exists on top of UDP/IPv4 or UDP/IPv6. The InfiniBand (IB) Base Transport Header (BTH) is encapsulated in the UDP packet. Once a RXE instance has been created, communicating via RXE is the same as communicating via any OFED compatible Infiniband HCA, albeit in some cases with addressing implications. In particular, while the use of a GRH header is optional within IB subnets, it is mandatory with RoCE. Verbs applications written over IB verbs should work seamlessly, but they require provisioning of GRH information when creating address vectors. The library and driver are modified to provide for mapping from GID to MAC addresses required by the hardware. .SH "FILES" .TP \fB/sys/class/infiniband/rxe[0,1,...]\fR Directory that holds RDMA device information. The format is the same as other RDMA devices. .TP \fB/sys/module/rdma_rxe_net/parameters/mtu\fR Write only file used to configure RoCE and Ethernet MTU values. 
.TP \fB/sys/module/rdma_rxe/parameters/max_ucontext\fR Read/Write file that sets a limit on the number of UCs allowed per RXE device. .TP \fB/sys/module/rdma_rxe/parameters/max_qp\fR Read/Write file that sets a limit on the number of QPs allowed per RXE device. .TP \fB/sys/module/rdma_rxe/parameters/max_qp_wr\fR Read/Write file that sets a limit on the number of WRs per QP allowed per RXE device. .TP \fB/sys/module/rdma_rxe/parameters/max_mr\fR Read/Write file that sets a limit on the number of MRs allowed per RXE device. .TP \fB/sys/module/rdma_rxe/parameters/max_fmr\fR Read/Write file that sets a limit on the number of FMRs allowed per RXE device. .TP \fB/sys/module/rdma_rxe/parameters/max_cq\fR Read/Write file that sets a limit on the number of CQs allowed per RXE device. .TP \fB/sys/module/rdma_rxe/parameters/max_log_cqe\fR Read/Write file that sets a limit on the log base 2 of the number of CQEs per CQ allowed per RXE device. .TP \fB/sys/module/rdma_rxe/parameters/max_inline_data\fR Read/Write file that sets a limit on the maximum amount of inline data per WR allowed per RXE device. The above configuration parameters only affect a new RXE instance when it is created not afterwards. .TP \fB/sys/module/rdma_rxe/parameters/crc_disable\fR Read/Write file that controls the disabling of ICRC computation. Set to a nonzero value for TRUE. Zero for FALSE. .TP \fB/sys/module/rdma_rxe/parameters/fast_comp|req|resp|arb\fR Read/Write file that enables calling kernel tasklets as subroutines to reduce latency. .TP \fB/sys/module/rdma_rxe/parameters/nsec_per_packet|kbyte\fR Read/Write file that controls static rate pacing for output packets. If set to nonzero values the minimum delay to the next packet is set to nsec_per_kbyte * sizeof(current packet in KBytes) or nsec_per_packet which ever is less. .TP \fB/sys/module/rdma_rxe/parameters/max_packet_per_ack\fR Read/Write file that controls the issuing of acks by the responder during a long message. If set additional acks will be generated every max_pkt_per_ack packets. .TP \fB/sys/module/rdma_rxe/parameters/max_skb_per_qp\fR Read/Write file that controls the number of skbs (packets) that a requester can queue for sending internally. .TP \fB/sys/module/rdma_rxe/parameters/max_req_comp_gap\fR Read/Write file that controls the maximum gap between the PSN of request packets send and ack packets received. .TP \fB/sys/module/rdma_rxe/parameters/default_mtu\fR Read/Write file that controls the default mtu used for UD packets. .SH "SEE ALSO" .BR rdma (8), .BR verbs (7), .SH "AUTHORS" Written by John Groves, Frank Zago and Bob Pearson at System Fabric Works. rdma-core-56.1/providers/rxe/rxe-abi.h000066400000000000000000000047261477342711600176560ustar00rootroot00000000000000/* * Copyright (c) 2009 Mellanox Technologies Ltd. All rights reserved. * Copyright (c) 2009 System Fabric Works, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #ifndef RXE_ABI_H #define RXE_ABI_H #include #include #include DECLARE_DRV_CMD(urxe_create_ah, IB_USER_VERBS_CMD_CREATE_AH, empty, rxe_create_ah_resp); DECLARE_DRV_CMD(urxe_create_cq, IB_USER_VERBS_CMD_CREATE_CQ, empty, rxe_create_cq_resp); DECLARE_DRV_CMD(urxe_create_cq_ex, IB_USER_VERBS_EX_CMD_CREATE_CQ, empty, rxe_create_cq_resp); DECLARE_DRV_CMD(urxe_create_qp, IB_USER_VERBS_CMD_CREATE_QP, empty, rxe_create_qp_resp); DECLARE_DRV_CMD(urxe_create_qp_ex, IB_USER_VERBS_EX_CMD_CREATE_QP, empty, rxe_create_qp_resp); DECLARE_DRV_CMD(urxe_create_srq, IB_USER_VERBS_CMD_CREATE_SRQ, empty, rxe_create_srq_resp); DECLARE_DRV_CMD(urxe_create_srq_ex, IB_USER_VERBS_CMD_CREATE_XSRQ, empty, rxe_create_srq_resp); DECLARE_DRV_CMD(urxe_modify_srq, IB_USER_VERBS_CMD_MODIFY_SRQ, rxe_modify_srq_cmd, empty); DECLARE_DRV_CMD(urxe_resize_cq, IB_USER_VERBS_CMD_RESIZE_CQ, empty, rxe_resize_cq_resp); #endif /* RXE_ABI_H */ rdma-core-56.1/providers/rxe/rxe.c000066400000000000000000001307771477342711600171260ustar00rootroot00000000000000/* * Copyright (c) 2009 Mellanox Technologies Ltd. All rights reserved. * Copyright (c) 2009 System Fabric Works, Inc. All rights reserved. * Copyright (C) 2006-2007 QLogic Corporation, All rights reserved. * Copyright (c) 2005. PathScale, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "rxe_queue.h" #include "rxe-abi.h" #include "rxe.h" #include "rxe_trace.h" static void rxe_free_context(struct ibv_context *ibctx); static const struct verbs_match_ent hca_table[] = { VERBS_DRIVER_ID(RDMA_DRIVER_RXE), VERBS_NAME_MATCH("rxe", NULL), {}, }; static int rxe_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct ib_uverbs_ex_query_device_resp resp; size_t resp_size = sizeof(resp); uint64_t raw_fw_ver; unsigned int major, minor, sub_minor; int ret; ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp, &resp_size); if (ret) return ret; raw_fw_ver = resp.base.fw_ver; major = (raw_fw_ver >> 32) & 0xffff; minor = (raw_fw_ver >> 16) & 0xffff; sub_minor = raw_fw_ver & 0xffff; snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver), "%d.%d.%d", major, minor, sub_minor); return 0; } static int rxe_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr) { struct ibv_query_port cmd; return ibv_cmd_query_port(context, port, attr, &cmd, sizeof(cmd)); } static struct ibv_pd *rxe_alloc_pd(struct ibv_context *context) { struct ibv_alloc_pd cmd; struct ib_uverbs_alloc_pd_resp resp; struct ibv_pd *pd; pd = calloc(1, sizeof(*pd)); if (!pd) return NULL; if (ibv_cmd_alloc_pd(context, pd, &cmd, sizeof(cmd), &resp, sizeof(resp))) { free(pd); return NULL; } return pd; } static int rxe_dealloc_pd(struct ibv_pd *pd) { int ret; ret = ibv_cmd_dealloc_pd(pd); if (!ret) free(pd); return ret; } static struct ibv_mw *rxe_alloc_mw(struct ibv_pd *ibpd, enum ibv_mw_type type) { int ret; struct ibv_mw *ibmw; struct ibv_alloc_mw cmd = {}; struct ib_uverbs_alloc_mw_resp resp = {}; ibmw = calloc(1, sizeof(*ibmw)); if (!ibmw) return NULL; ret = ibv_cmd_alloc_mw(ibpd, type, ibmw, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) { free(ibmw); return NULL; } return ibmw; } static int rxe_dealloc_mw(struct ibv_mw *ibmw) { int ret; ret = ibv_cmd_dealloc_mw(ibmw); if (ret) return ret; free(ibmw); return 0; } static int rxe_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr_list, struct ibv_send_wr **bad_wr); static int rxe_bind_mw(struct ibv_qp *ibqp, struct ibv_mw *ibmw, struct ibv_mw_bind *mw_bind) { int ret; struct ibv_mw_bind_info *bind_info = &mw_bind->bind_info; struct ibv_send_wr ibwr; struct ibv_send_wr *bad_wr; if (bind_info->mw_access_flags & IBV_ACCESS_ZERO_BASED) { ret = EINVAL; goto err; } memset(&ibwr, 0, sizeof(ibwr)); ibwr.opcode = IBV_WR_BIND_MW; ibwr.next = NULL; ibwr.wr_id = mw_bind->wr_id; ibwr.send_flags = mw_bind->send_flags; ibwr.bind_mw.bind_info = mw_bind->bind_info; ibwr.bind_mw.mw = ibmw; ibwr.bind_mw.rkey = ibv_inc_rkey(ibmw->rkey); ret = rxe_post_send(ibqp, &ibwr, &bad_wr); if (ret) goto err; /* user has to undo this if he gets an error wc */ ibmw->rkey = ibwr.bind_mw.rkey; return 0; err: errno = ret; return errno; } static struct ibv_mr *rxe_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access) { struct verbs_mr *vmr; struct ibv_reg_mr cmd; struct ib_uverbs_reg_mr_resp resp; int ret; vmr = calloc(1, sizeof(*vmr)); if (!vmr) return NULL; ret = ibv_cmd_reg_mr(pd, addr, length, hca_va, access, vmr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) { free(vmr); return NULL; } return &vmr->ibv_mr; } static int rxe_dereg_mr(struct verbs_mr *vmr) { int ret; ret = 
ibv_cmd_dereg_mr(vmr); if (ret) return ret; free(vmr); return 0; } static int cq_start_poll(struct ibv_cq_ex *current, struct ibv_poll_cq_attr *attr) { struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex); pthread_spin_lock(&cq->lock); cq->cur_index = load_consumer_index(cq->queue); if (check_cq_queue_empty(cq)) { pthread_spin_unlock(&cq->lock); errno = ENOENT; return errno; } cq->wc = addr_from_index(cq->queue, cq->cur_index); cq->vcq.cq_ex.status = cq->wc->status; cq->vcq.cq_ex.wr_id = cq->wc->wr_id; return 0; } static int cq_next_poll(struct ibv_cq_ex *current) { struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex); struct rxe_queue_buf *q = cq->queue; uint32_t next_index = (cq->cur_index + 1) & q->index_mask; if (next_index == load_producer_index(q)) { store_consumer_index(cq->queue, cq->cur_index); pthread_spin_unlock(&cq->lock); errno = ENOENT; return errno; } cq->cur_index = next_index; cq->wc = addr_from_index(cq->queue, cq->cur_index); cq->vcq.cq_ex.status = cq->wc->status; cq->vcq.cq_ex.wr_id = cq->wc->wr_id; return 0; } static void cq_end_poll(struct ibv_cq_ex *current) { struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex); advance_cq_cur_index(cq); store_consumer_index(cq->queue, cq->cur_index); pthread_spin_unlock(&cq->lock); } static enum ibv_wc_opcode cq_read_opcode(struct ibv_cq_ex *current) { struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex); return cq->wc->opcode; } static uint32_t cq_read_vendor_err(struct ibv_cq_ex *current) { struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex); return cq->wc->vendor_err; } static uint32_t cq_read_byte_len(struct ibv_cq_ex *current) { struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex); return cq->wc->byte_len; } static __be32 cq_read_imm_data(struct ibv_cq_ex *current) { struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex); return cq->wc->ex.imm_data; } static uint32_t cq_read_qp_num(struct ibv_cq_ex *current) { struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex); return cq->wc->qp_num; } static uint32_t cq_read_src_qp(struct ibv_cq_ex *current) { struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex); return cq->wc->src_qp; } static unsigned int cq_read_wc_flags(struct ibv_cq_ex *current) { struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex); return cq->wc->wc_flags; } static uint32_t cq_read_slid(struct ibv_cq_ex *current) { struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex); return cq->wc->slid; } static uint8_t cq_read_sl(struct ibv_cq_ex *current) { struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex); return cq->wc->sl; } static uint8_t cq_read_dlid_path_bits(struct ibv_cq_ex *current) { struct rxe_cq *cq = container_of(current, struct rxe_cq, vcq.cq_ex); return cq->wc->dlid_path_bits; } static int rxe_destroy_cq(struct ibv_cq *ibcq); static struct ibv_cq *rxe_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector) { struct rxe_cq *cq; struct urxe_create_cq_resp resp = {}; int ret; cq = calloc(1, sizeof(*cq)); if (!cq) return NULL; ret = ibv_cmd_create_cq(context, cqe, channel, comp_vector, &cq->vcq.cq, NULL, 0, &resp.ibv_resp, sizeof(resp)); if (ret) { free(cq); return NULL; } cq->queue = mmap(NULL, resp.mi.size, PROT_READ | PROT_WRITE, MAP_SHARED, context->cmd_fd, resp.mi.offset); if ((void *)cq->queue == MAP_FAILED) { ibv_cmd_destroy_cq(&cq->vcq.cq); free(cq); return NULL; } cq->wc_size = 1ULL << 
cq->queue->log2_elem_size; if (cq->wc_size < sizeof(struct ib_uverbs_wc)) { rxe_destroy_cq(&cq->vcq.cq); return NULL; } cq->mmap_info = resp.mi; pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE); return &cq->vcq.cq; } enum rxe_sup_wc_flags { RXE_SUP_WC_FLAGS = IBV_WC_EX_WITH_BYTE_LEN | IBV_WC_EX_WITH_IMM | IBV_WC_EX_WITH_QP_NUM | IBV_WC_EX_WITH_SRC_QP | IBV_WC_EX_WITH_SLID | IBV_WC_EX_WITH_SL | IBV_WC_EX_WITH_DLID_PATH_BITS, RXE_SUP_WC_EX_FLAGS = RXE_SUP_WC_FLAGS, // add extended flags here }; static struct ibv_cq_ex *rxe_create_cq_ex(struct ibv_context *context, struct ibv_cq_init_attr_ex *attr) { int ret; struct rxe_cq *cq; struct urxe_create_cq_ex_resp resp = {}; /* user is asking for flags we don't support */ if (attr->wc_flags & ~RXE_SUP_WC_EX_FLAGS) { errno = EOPNOTSUPP; goto err; } cq = calloc(1, sizeof(*cq)); if (!cq) goto err; ret = ibv_cmd_create_cq_ex(context, attr, &cq->vcq, NULL, 0, &resp.ibv_resp, sizeof(resp), 0); if (ret) goto err_free; cq->queue = mmap(NULL, resp.mi.size, PROT_READ | PROT_WRITE, MAP_SHARED, context->cmd_fd, resp.mi.offset); if ((void *)cq->queue == MAP_FAILED) goto err_destroy; cq->wc_size = 1ULL << cq->queue->log2_elem_size; if (cq->wc_size < sizeof(struct ib_uverbs_wc)) goto err_unmap; cq->mmap_info = resp.mi; pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE); cq->vcq.cq_ex.start_poll = cq_start_poll; cq->vcq.cq_ex.next_poll = cq_next_poll; cq->vcq.cq_ex.end_poll = cq_end_poll; cq->vcq.cq_ex.read_opcode = cq_read_opcode; cq->vcq.cq_ex.read_vendor_err = cq_read_vendor_err; cq->vcq.cq_ex.read_wc_flags = cq_read_wc_flags; if (attr->wc_flags & IBV_WC_EX_WITH_BYTE_LEN) cq->vcq.cq_ex.read_byte_len = cq_read_byte_len; if (attr->wc_flags & IBV_WC_EX_WITH_IMM) cq->vcq.cq_ex.read_imm_data = cq_read_imm_data; if (attr->wc_flags & IBV_WC_EX_WITH_QP_NUM) cq->vcq.cq_ex.read_qp_num = cq_read_qp_num; if (attr->wc_flags & IBV_WC_EX_WITH_SRC_QP) cq->vcq.cq_ex.read_src_qp = cq_read_src_qp; if (attr->wc_flags & IBV_WC_EX_WITH_SLID) cq->vcq.cq_ex.read_slid = cq_read_slid; if (attr->wc_flags & IBV_WC_EX_WITH_SL) cq->vcq.cq_ex.read_sl = cq_read_sl; if (attr->wc_flags & IBV_WC_EX_WITH_DLID_PATH_BITS) cq->vcq.cq_ex.read_dlid_path_bits = cq_read_dlid_path_bits; return &cq->vcq.cq_ex; err_unmap: if (cq->mmap_info.size) munmap(cq->queue, cq->mmap_info.size); err_destroy: ibv_cmd_destroy_cq(&cq->vcq.cq); err_free: free(cq); err: return NULL; } static int rxe_resize_cq(struct ibv_cq *ibcq, int cqe) { struct rxe_cq *cq = to_rcq(ibcq); struct ibv_resize_cq cmd; struct urxe_resize_cq_resp resp; int ret; pthread_spin_lock(&cq->lock); ret = ibv_cmd_resize_cq(ibcq, cqe, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (ret) { pthread_spin_unlock(&cq->lock); return ret; } munmap(cq->queue, cq->mmap_info.size); cq->queue = mmap(NULL, resp.mi.size, PROT_READ | PROT_WRITE, MAP_SHARED, ibcq->context->cmd_fd, resp.mi.offset); ret = errno; pthread_spin_unlock(&cq->lock); if ((void *)cq->queue == MAP_FAILED) { cq->queue = NULL; cq->mmap_info.size = 0; return ret; } cq->mmap_info = resp.mi; return 0; } static int rxe_destroy_cq(struct ibv_cq *ibcq) { struct rxe_cq *cq = to_rcq(ibcq); int ret; ret = ibv_cmd_destroy_cq(ibcq); if (ret) return ret; if (cq->mmap_info.size) munmap(cq->queue, cq->mmap_info.size); free(cq); return 0; } static int rxe_poll_cq(struct ibv_cq *ibcq, int ne, struct ibv_wc *wc) { struct rxe_cq *cq = to_rcq(ibcq); struct rxe_queue_buf *q; int npolled; uint8_t *src; pthread_spin_lock(&cq->lock); q = cq->queue; for (npolled = 0; npolled < ne; ++npolled, ++wc) { if 
(queue_empty(q)) break; src = consumer_addr(q); memcpy(wc, src, sizeof(*wc)); advance_consumer(q); } pthread_spin_unlock(&cq->lock); return npolled; } static struct ibv_srq *rxe_create_srq(struct ibv_pd *ibpd, struct ibv_srq_init_attr *attr) { struct rxe_srq *srq; struct ibv_srq *ibsrq; struct ibv_create_srq cmd; struct urxe_create_srq_resp resp = {}; int ret; srq = calloc(1, sizeof(*srq)); if (srq == NULL) return NULL; ibsrq = &srq->vsrq.srq; ret = ibv_cmd_create_srq(ibpd, ibsrq, attr, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (ret) { free(srq); return NULL; } srq->rq.queue = mmap(NULL, resp.mi.size, PROT_READ | PROT_WRITE, MAP_SHARED, ibpd->context->cmd_fd, resp.mi.offset); if ((void *)srq->rq.queue == MAP_FAILED) { ibv_cmd_destroy_srq(ibsrq); free(srq); return NULL; } srq->mmap_info = resp.mi; srq->rq.max_sge = attr->attr.max_sge; pthread_spin_init(&srq->rq.lock, PTHREAD_PROCESS_PRIVATE); return ibsrq; } static struct ibv_srq *rxe_create_srq_ex( struct ibv_context *ibcontext, struct ibv_srq_init_attr_ex *attr_ex) { struct rxe_srq *srq; struct ibv_srq *ibsrq; struct ibv_create_xsrq cmd; struct urxe_create_srq_ex_resp resp = {}; int ret; srq = calloc(1, sizeof(*srq)); if (srq == NULL) return NULL; ibsrq = &srq->vsrq.srq; ret = ibv_cmd_create_srq_ex(ibcontext, &srq->vsrq, attr_ex, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (ret) { free(srq); return NULL; } srq->rq.queue = mmap(NULL, resp.mi.size, PROT_READ | PROT_WRITE, MAP_SHARED, ibcontext->cmd_fd, resp.mi.offset); if ((void *)srq->rq.queue == MAP_FAILED) { ibv_cmd_destroy_srq(ibsrq); free(srq); return NULL; } srq->mmap_info = resp.mi; srq->rq.max_sge = attr_ex->attr.max_sge; pthread_spin_init(&srq->rq.lock, PTHREAD_PROCESS_PRIVATE); return ibsrq; } static int rxe_modify_srq(struct ibv_srq *ibsrq, struct ibv_srq_attr *attr, int attr_mask) { struct rxe_srq *srq = to_rsrq(ibsrq); struct urxe_modify_srq cmd; int rc = 0; struct mminfo mi; mi.offset = 0; mi.size = 0; if (attr_mask & IBV_SRQ_MAX_WR) pthread_spin_lock(&srq->rq.lock); cmd.mmap_info_addr = (__u64)(uintptr_t) &mi; rc = ibv_cmd_modify_srq(ibsrq, attr, attr_mask, &cmd.ibv_cmd, sizeof(cmd)); if (rc) goto out; if (attr_mask & IBV_SRQ_MAX_WR) { munmap(srq->rq.queue, srq->mmap_info.size); srq->rq.queue = mmap(NULL, mi.size, PROT_READ | PROT_WRITE, MAP_SHARED, ibsrq->context->cmd_fd, mi.offset); if ((void *)srq->rq.queue == MAP_FAILED) { rc = errno; srq->rq.queue = NULL; srq->mmap_info.size = 0; goto out; } srq->mmap_info = mi; } out: if (attr_mask & IBV_SRQ_MAX_WR) pthread_spin_unlock(&srq->rq.lock); return rc; } static int rxe_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr) { struct ibv_query_srq cmd; return ibv_cmd_query_srq(srq, attr, &cmd, sizeof(cmd)); } static int rxe_destroy_srq(struct ibv_srq *ibsrq) { int ret; struct rxe_srq *srq = to_rsrq(ibsrq); struct rxe_queue_buf *q = srq->rq.queue; ret = ibv_cmd_destroy_srq(ibsrq); if (!ret) { if (srq->mmap_info.size) munmap(q, srq->mmap_info.size); free(srq); } return ret; } static int rxe_post_one_recv(struct rxe_wq *rq, struct ibv_recv_wr *recv_wr) { int i; struct rxe_recv_wqe *wqe; struct rxe_queue_buf *q = rq->queue; int num_sge = recv_wr->num_sge; int length = 0; int rc = 0; if (queue_full(q)) { rc = ENOMEM; goto out; } if (num_sge > rq->max_sge) { rc = EINVAL; goto out; } wqe = (struct rxe_recv_wqe *)producer_addr(q); wqe->wr_id = recv_wr->wr_id; memcpy(wqe->dma.sge, recv_wr->sg_list, num_sge*sizeof(*wqe->dma.sge)); for (i = 0; i < num_sge; i++) length += wqe->dma.sge[i].length; wqe->dma.length = 
length; wqe->dma.resid = length; wqe->dma.cur_sge = 0; wqe->dma.num_sge = num_sge; wqe->dma.sge_offset = 0; advance_producer(q); out: return rc; } static int rxe_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *recv_wr, struct ibv_recv_wr **bad_recv_wr) { struct rxe_srq *srq = to_rsrq(ibsrq); int rc = 0; pthread_spin_lock(&srq->rq.lock); while (recv_wr) { rc = rxe_post_one_recv(&srq->rq, recv_wr); if (rc) { *bad_recv_wr = recv_wr; break; } recv_wr = recv_wr->next; } pthread_spin_unlock(&srq->rq.lock); return rc; } /* * builders always consume one send queue slot * setters (below) reach back and adjust previous build */ static void wr_atomic_cmp_swp(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr, uint64_t compare, uint64_t swap) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index); if (check_qp_queue_full(qp)) return; memset(wqe, 0, sizeof(*wqe)); wqe->wr.wr_id = ibqp->wr_id; wqe->wr.send_flags = ibqp->wr_flags; wqe->wr.opcode = IBV_WR_ATOMIC_CMP_AND_SWP; wqe->wr.wr.atomic.remote_addr = remote_addr; wqe->wr.wr.atomic.compare_add = compare; wqe->wr.wr.atomic.swap = swap; wqe->wr.wr.atomic.rkey = rkey; wqe->iova = remote_addr; advance_qp_cur_index(qp); } static void wr_atomic_fetch_add(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr, uint64_t add) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index); if (check_qp_queue_full(qp)) return; memset(wqe, 0, sizeof(*wqe)); wqe->wr.wr_id = qp->vqp.qp_ex.wr_id; wqe->wr.opcode = IBV_WR_ATOMIC_FETCH_AND_ADD; wqe->wr.send_flags = qp->vqp.qp_ex.wr_flags; wqe->wr.wr.atomic.remote_addr = remote_addr; wqe->wr.wr.atomic.compare_add = add; wqe->wr.wr.atomic.rkey = rkey; wqe->iova = remote_addr; advance_qp_cur_index(qp); } static void wr_bind_mw(struct ibv_qp_ex *ibqp, struct ibv_mw *ibmw, uint32_t rkey, const struct ibv_mw_bind_info *info) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index); if (check_qp_queue_full(qp)) return; memset(wqe, 0, sizeof(*wqe)); wqe->wr.wr_id = ibqp->wr_id; wqe->wr.opcode = IBV_WR_BIND_MW; wqe->wr.send_flags = qp->vqp.qp_ex.wr_flags; wqe->wr.wr.mw.addr = info->addr; wqe->wr.wr.mw.length = info->length; wqe->wr.wr.mw.mr_lkey = info->mr->lkey; wqe->wr.wr.mw.mw_rkey = ibmw->rkey; wqe->wr.wr.mw.rkey = rkey; wqe->wr.wr.mw.access = info->mw_access_flags; advance_qp_cur_index(qp); } static void wr_local_inv(struct ibv_qp_ex *ibqp, uint32_t invalidate_rkey) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index); if (check_qp_queue_full(qp)) return; memset(wqe, 0, sizeof(*wqe)); wqe->wr.wr_id = qp->vqp.qp_ex.wr_id; wqe->wr.opcode = IBV_WR_LOCAL_INV; wqe->wr.send_flags = qp->vqp.qp_ex.wr_flags; wqe->wr.ex.invalidate_rkey = invalidate_rkey; advance_qp_cur_index(qp); } static void wr_rdma_read(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index); if (check_qp_queue_full(qp)) return; memset(wqe, 0, sizeof(*wqe)); wqe->wr.wr_id = qp->vqp.qp_ex.wr_id; wqe->wr.opcode = IBV_WR_RDMA_READ; wqe->wr.send_flags = qp->vqp.qp_ex.wr_flags; wqe->wr.wr.rdma.remote_addr = remote_addr; wqe->wr.wr.rdma.rkey = rkey; wqe->iova = remote_addr; 
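	/* only the provider-local cur_index advances here; the WQE does not
	 * become visible to the kernel until wr_complete() publishes it via
	 * store_producer_index() and rings the doorbell (post_send_db()).
	 */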
advance_qp_cur_index(qp); } static void wr_rdma_write(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index); if (check_qp_queue_full(qp)) return; memset(wqe, 0, sizeof(*wqe)); wqe->wr.wr_id = qp->vqp.qp_ex.wr_id; wqe->wr.opcode = IBV_WR_RDMA_WRITE; wqe->wr.send_flags = qp->vqp.qp_ex.wr_flags; wqe->wr.wr.rdma.remote_addr = remote_addr; wqe->wr.wr.rdma.rkey = rkey; wqe->iova = remote_addr; advance_qp_cur_index(qp); } static void wr_flush(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr, size_t length, uint8_t type, uint8_t level) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index); if (check_qp_queue_full(qp)) return; memset(wqe, 0, sizeof(*wqe)); wqe->wr.wr_id = qp->vqp.qp_ex.wr_id; wqe->wr.opcode = IBV_WR_FLUSH; wqe->wr.send_flags = qp->vqp.qp_ex.wr_flags; wqe->wr.wr.flush.remote_addr = remote_addr; wqe->wr.wr.flush.rkey = rkey; wqe->wr.wr.flush.type = type; wqe->wr.wr.flush.level = level; wqe->dma.length = length; wqe->dma.resid = length; wqe->iova = remote_addr; advance_qp_cur_index(qp); } static void wr_atomic_write(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr, const void *atomic_wr) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index); if (check_qp_queue_full(qp)) return; memset(wqe, 0, sizeof(*wqe)); wqe->wr.wr_id = qp->vqp.qp_ex.wr_id; wqe->wr.opcode = IBV_WR_ATOMIC_WRITE; wqe->wr.send_flags = qp->vqp.qp_ex.wr_flags; wqe->wr.wr.rdma.remote_addr = remote_addr; wqe->wr.wr.rdma.rkey = rkey; memcpy(wqe->dma.atomic_wr, atomic_wr, 8); wqe->dma.length = 8; wqe->dma.resid = 8; wqe->iova = remote_addr; advance_qp_cur_index(qp); } static void wr_rdma_write_imm(struct ibv_qp_ex *ibqp, uint32_t rkey, uint64_t remote_addr, __be32 imm_data) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index); if (check_qp_queue_full(qp)) return; memset(wqe, 0, sizeof(*wqe)); wqe->wr.wr_id = qp->vqp.qp_ex.wr_id; wqe->wr.opcode = IBV_WR_RDMA_WRITE_WITH_IMM; wqe->wr.send_flags = qp->vqp.qp_ex.wr_flags; wqe->wr.wr.rdma.remote_addr = remote_addr; wqe->wr.wr.rdma.rkey = rkey; wqe->wr.ex.imm_data = imm_data; wqe->iova = remote_addr; advance_qp_cur_index(qp); } static void wr_send(struct ibv_qp_ex *ibqp) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index); if (check_qp_queue_full(qp)) return; memset(wqe, 0, sizeof(*wqe)); wqe->wr.wr_id = qp->vqp.qp_ex.wr_id; wqe->wr.opcode = IBV_WR_SEND; wqe->wr.send_flags = qp->vqp.qp_ex.wr_flags; advance_qp_cur_index(qp); } static void wr_send_imm(struct ibv_qp_ex *ibqp, __be32 imm_data) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index); if (check_qp_queue_full(qp)) return; memset(wqe, 0, sizeof(*wqe)); wqe->wr.wr_id = qp->vqp.qp_ex.wr_id; wqe->wr.opcode = IBV_WR_SEND_WITH_IMM; wqe->wr.send_flags = qp->vqp.qp_ex.wr_flags; wqe->wr.ex.imm_data = imm_data; advance_qp_cur_index(qp); } static void wr_send_inv(struct ibv_qp_ex *ibqp, uint32_t invalidate_rkey) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, 
qp->cur_index); if (check_qp_queue_full(qp)) return; memset(wqe, 0, sizeof(*wqe)); wqe->wr.wr_id = qp->vqp.qp_ex.wr_id; wqe->wr.opcode = IBV_WR_SEND_WITH_INV; wqe->wr.send_flags = qp->vqp.qp_ex.wr_flags; wqe->wr.ex.invalidate_rkey = invalidate_rkey; advance_qp_cur_index(qp); } static void wr_set_ud_addr(struct ibv_qp_ex *ibqp, struct ibv_ah *ibah, uint32_t remote_qpn, uint32_t remote_qkey) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_ah *ah = to_rah(ibah); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index - 1); if (qp->err) return; wqe->wr.wr.ud.remote_qpn = remote_qpn; wqe->wr.wr.ud.remote_qkey = remote_qkey; wqe->wr.wr.ud.ah_num = ah->ah_num; if (!ah->ah_num) /* old kernels only */ memcpy(&wqe->wr.wr.ud.av, &ah->av, sizeof(ah->av)); } static void wr_set_inline_data(struct ibv_qp_ex *ibqp, void *addr, size_t length) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index - 1); if (qp->err) return; if (length > qp->sq.max_inline) { qp->err = ENOSPC; return; } memcpy(wqe->dma.inline_data, addr, length); wqe->dma.length = length; wqe->dma.resid = length; } static void wr_set_inline_data_list(struct ibv_qp_ex *ibqp, size_t num_buf, const struct ibv_data_buf *buf_list) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index - 1); uint8_t *data = wqe->dma.inline_data; size_t length; size_t tot_length = 0; if (qp->err) return; while (num_buf--) { length = buf_list->length; if (tot_length + length > qp->sq.max_inline) { qp->err = ENOSPC; return; } memcpy(data, buf_list->addr, length); buf_list++; data += length; } wqe->dma.length = tot_length; wqe->dma.resid = tot_length; } static void wr_set_sge(struct ibv_qp_ex *ibqp, uint32_t lkey, uint64_t addr, uint32_t length) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index - 1); if (qp->err) return; if (length) { wqe->dma.length = length; wqe->dma.resid = length; wqe->dma.num_sge = 1; wqe->dma.sge[0].addr = addr; wqe->dma.sge[0].length = length; wqe->dma.sge[0].lkey = lkey; } } static void wr_set_sge_list(struct ibv_qp_ex *ibqp, size_t num_sge, const struct ibv_sge *sg_list) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue, qp->cur_index - 1); size_t tot_length = 0; if (qp->err) return; if (num_sge > qp->sq.max_sge) { qp->err = ENOSPC; return; } wqe->dma.num_sge = num_sge; memcpy(wqe->dma.sge, sg_list, num_sge*sizeof(*sg_list)); while (num_sge--) tot_length += sg_list->length; wqe->dma.length = tot_length; wqe->dma.resid = tot_length; } static void wr_start(struct ibv_qp_ex *ibqp) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); pthread_spin_lock(&qp->sq.lock); qp->err = 0; qp->cur_index = load_producer_index(qp->sq.queue); } static int post_send_db(struct ibv_qp *ibqp); static int wr_complete(struct ibv_qp_ex *ibqp) { int ret; struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); if (qp->err) { pthread_spin_unlock(&qp->sq.lock); return qp->err; } store_producer_index(qp->sq.queue, qp->cur_index); ret = post_send_db(&qp->vqp.qp); pthread_spin_unlock(&qp->sq.lock); return ret; } static void wr_abort(struct ibv_qp_ex *ibqp) { struct rxe_qp *qp = container_of(ibqp, struct rxe_qp, vqp.qp_ex); pthread_spin_unlock(&qp->sq.lock); } static int 
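/*
 * Hedged caller-side sketch of how the builders/setters above are driven
 * through the libibverbs extended-QP wrappers (qpx is a struct ibv_qp_ex *):
 *
 *	ibv_wr_start(qpx);                      maps to wr_start()
 *	qpx->wr_id = id;
 *	ibv_wr_rdma_write(qpx, rkey, raddr);    maps to wr_rdma_write()
 *	ibv_wr_set_sge(qpx, lkey, addr, len);   maps to wr_set_sge()
 *	ret = ibv_wr_complete(qpx);             maps to wr_complete()
 *
 * Each builder consumes one SQ slot; each setter patches the slot at
 * cur_index - 1, which is why a setter must directly follow its builder.
 */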
map_queue_pair(int cmd_fd, struct rxe_qp *qp, struct ibv_qp_init_attr *attr, struct rxe_create_qp_resp *resp) { if (attr->srq) { qp->rq.max_sge = 0; qp->rq.queue = NULL; qp->rq_mmap_info.size = 0; } else { qp->rq.max_sge = attr->cap.max_recv_sge; qp->rq.queue = mmap(NULL, resp->rq_mi.size, PROT_READ | PROT_WRITE, MAP_SHARED, cmd_fd, resp->rq_mi.offset); if ((void *)qp->rq.queue == MAP_FAILED) return errno; qp->rq_mmap_info = resp->rq_mi; pthread_spin_init(&qp->rq.lock, PTHREAD_PROCESS_PRIVATE); } qp->sq.max_sge = attr->cap.max_send_sge; qp->sq.max_inline = attr->cap.max_inline_data; qp->sq.queue = mmap(NULL, resp->sq_mi.size, PROT_READ | PROT_WRITE, MAP_SHARED, cmd_fd, resp->sq_mi.offset); if ((void *)qp->sq.queue == MAP_FAILED) { if (qp->rq_mmap_info.size) munmap(qp->rq.queue, qp->rq_mmap_info.size); return errno; } qp->sq_mmap_info = resp->sq_mi; pthread_spin_init(&qp->sq.lock, PTHREAD_PROCESS_PRIVATE); return 0; } static struct ibv_qp *rxe_create_qp(struct ibv_pd *ibpd, struct ibv_qp_init_attr *attr) { struct ibv_create_qp cmd = {}; struct urxe_create_qp_resp resp = {}; struct rxe_qp *qp; int ret; qp = calloc(1, sizeof(*qp)); if (!qp) goto err; ret = ibv_cmd_create_qp(ibpd, &qp->vqp.qp, attr, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (ret) goto err_free; ret = map_queue_pair(ibpd->context->cmd_fd, qp, attr, &resp.drv_payload); if (ret) goto err_destroy; qp->sq_mmap_info = resp.sq_mi; pthread_spin_init(&qp->sq.lock, PTHREAD_PROCESS_PRIVATE); return &qp->vqp.qp; err_destroy: ibv_cmd_destroy_qp(&qp->vqp.qp); err_free: free(qp); err: return NULL; } enum { RXE_QP_CREATE_FLAGS_SUP = 0, RXE_QP_COMP_MASK_SUP = IBV_QP_INIT_ATTR_PD | IBV_QP_INIT_ATTR_CREATE_FLAGS | IBV_QP_INIT_ATTR_SEND_OPS_FLAGS, RXE_SUP_RC_QP_SEND_OPS_FLAGS = IBV_QP_EX_WITH_RDMA_WRITE | IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM | IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_SEND_WITH_IMM | IBV_QP_EX_WITH_RDMA_READ | IBV_QP_EX_WITH_ATOMIC_CMP_AND_SWP | IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD | IBV_QP_EX_WITH_LOCAL_INV | IBV_QP_EX_WITH_BIND_MW | IBV_QP_EX_WITH_SEND_WITH_INV | IBV_QP_EX_WITH_FLUSH | IBV_QP_EX_WITH_ATOMIC_WRITE, RXE_SUP_UC_QP_SEND_OPS_FLAGS = IBV_QP_EX_WITH_RDMA_WRITE | IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM | IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_SEND_WITH_IMM | IBV_QP_EX_WITH_BIND_MW | IBV_QP_EX_WITH_SEND_WITH_INV, RXE_SUP_UD_QP_SEND_OPS_FLAGS = IBV_QP_EX_WITH_SEND | IBV_QP_EX_WITH_SEND_WITH_IMM, }; static int check_qp_init_attr(struct ibv_qp_init_attr_ex *attr) { if (attr->comp_mask & ~RXE_QP_COMP_MASK_SUP) goto err; if ((attr->comp_mask & IBV_QP_INIT_ATTR_CREATE_FLAGS) && (attr->create_flags & ~RXE_QP_CREATE_FLAGS_SUP)) goto err; if (attr->comp_mask & IBV_QP_INIT_ATTR_SEND_OPS_FLAGS) { switch (attr->qp_type) { case IBV_QPT_RC: if (attr->send_ops_flags & ~RXE_SUP_RC_QP_SEND_OPS_FLAGS) goto err; break; case IBV_QPT_UC: if (attr->send_ops_flags & ~RXE_SUP_UC_QP_SEND_OPS_FLAGS) goto err; break; case IBV_QPT_UD: if (attr->send_ops_flags & ~RXE_SUP_UD_QP_SEND_OPS_FLAGS) goto err; break; default: goto err; } } return 0; err: errno = EOPNOTSUPP; return errno; } static void set_qp_send_ops(struct rxe_qp *qp, uint64_t flags) { if (flags & IBV_QP_EX_WITH_ATOMIC_CMP_AND_SWP) qp->vqp.qp_ex.wr_atomic_cmp_swp = wr_atomic_cmp_swp; if (flags & IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD) qp->vqp.qp_ex.wr_atomic_fetch_add = wr_atomic_fetch_add; if (flags & IBV_QP_EX_WITH_BIND_MW) qp->vqp.qp_ex.wr_bind_mw = wr_bind_mw; if (flags & IBV_QP_EX_WITH_LOCAL_INV) qp->vqp.qp_ex.wr_local_inv = wr_local_inv; if (flags & IBV_QP_EX_WITH_ATOMIC_WRITE) 
qp->vqp.qp_ex.wr_atomic_write = wr_atomic_write; if (flags & IBV_QP_EX_WITH_RDMA_READ) qp->vqp.qp_ex.wr_rdma_read = wr_rdma_read; if (flags & IBV_QP_EX_WITH_RDMA_WRITE) qp->vqp.qp_ex.wr_rdma_write = wr_rdma_write; if (flags & IBV_QP_EX_WITH_FLUSH) qp->vqp.qp_ex.wr_flush = wr_flush; if (flags & IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM) qp->vqp.qp_ex.wr_rdma_write_imm = wr_rdma_write_imm; if (flags & IBV_QP_EX_WITH_SEND) qp->vqp.qp_ex.wr_send = wr_send; if (flags & IBV_QP_EX_WITH_SEND_WITH_IMM) qp->vqp.qp_ex.wr_send_imm = wr_send_imm; if (flags & IBV_QP_EX_WITH_SEND_WITH_INV) qp->vqp.qp_ex.wr_send_inv = wr_send_inv; qp->vqp.qp_ex.wr_set_ud_addr = wr_set_ud_addr; qp->vqp.qp_ex.wr_set_inline_data = wr_set_inline_data; qp->vqp.qp_ex.wr_set_inline_data_list = wr_set_inline_data_list; qp->vqp.qp_ex.wr_set_sge = wr_set_sge; qp->vqp.qp_ex.wr_set_sge_list = wr_set_sge_list; qp->vqp.qp_ex.wr_start = wr_start; qp->vqp.qp_ex.wr_complete = wr_complete; qp->vqp.qp_ex.wr_abort = wr_abort; } static struct ibv_qp *rxe_create_qp_ex(struct ibv_context *context, struct ibv_qp_init_attr_ex *attr) { int ret; struct rxe_qp *qp; struct ibv_create_qp_ex cmd = {}; struct urxe_create_qp_ex_resp resp = {}; size_t cmd_size = sizeof(cmd); size_t resp_size = sizeof(resp); ret = check_qp_init_attr(attr); if (ret) goto err; qp = calloc(1, sizeof(*qp)); if (!qp) goto err; if (attr->comp_mask & IBV_QP_INIT_ATTR_SEND_OPS_FLAGS) set_qp_send_ops(qp, attr->send_ops_flags); ret = ibv_cmd_create_qp_ex2(context, &qp->vqp, attr, &cmd, cmd_size, &resp.ibv_resp, resp_size); if (ret) goto err_free; qp->vqp.comp_mask |= VERBS_QP_EX; ret = map_queue_pair(context->cmd_fd, qp, (struct ibv_qp_init_attr *)attr, &resp.drv_payload); if (ret) goto err_destroy; return &qp->vqp.qp; err_destroy: ibv_cmd_destroy_qp(&qp->vqp.qp); err_free: free(qp); err: return NULL; } static int rxe_query_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd = {}; return ibv_cmd_query_qp(ibqp, attr, attr_mask, init_attr, &cmd, sizeof(cmd)); } static int rxe_modify_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask) { struct ibv_modify_qp cmd = {}; return ibv_cmd_modify_qp(ibqp, attr, attr_mask, &cmd, sizeof(cmd)); } static int rxe_destroy_qp(struct ibv_qp *ibqp) { int ret; struct rxe_qp *qp = to_rqp(ibqp); ret = ibv_cmd_destroy_qp(ibqp); if (!ret) { if (qp->rq_mmap_info.size) munmap(qp->rq.queue, qp->rq_mmap_info.size); if (qp->sq_mmap_info.size) munmap(qp->sq.queue, qp->sq_mmap_info.size); free(qp); } return ret; } /* basic sanity checks for send work request */ static int validate_send_wr(struct rxe_qp *qp, struct ibv_send_wr *ibwr, unsigned int length) { struct rxe_wq *sq = &qp->sq; enum ibv_wr_opcode opcode = ibwr->opcode; if (ibwr->num_sge > sq->max_sge) return EINVAL; if ((opcode == IBV_WR_ATOMIC_CMP_AND_SWP) || (opcode == IBV_WR_ATOMIC_FETCH_AND_ADD)) if (length < 8 || ibwr->wr.atomic.remote_addr & 0x7) return EINVAL; if ((ibwr->send_flags & IBV_SEND_INLINE) && (length > sq->max_inline)) return EINVAL; if (ibwr->opcode == IBV_WR_BIND_MW) { if (length) return EINVAL; if (ibwr->num_sge) return EINVAL; if (ibwr->imm_data) return EINVAL; if ((qp_type(qp) != IBV_QPT_RC) && (qp_type(qp) != IBV_QPT_UC)) return EINVAL; } return 0; } static void convert_send_wr(struct rxe_qp *qp, struct rxe_send_wr *kwr, struct ibv_send_wr *uwr) { struct ibv_mw *ibmw; struct ibv_mr *ibmr; memset(kwr, 0, sizeof(*kwr)); kwr->wr_id = uwr->wr_id; kwr->opcode = uwr->opcode; kwr->send_flags = uwr->send_flags; 
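	/* imm_data and invalidate_rkey share an anonymous union in
	 * struct ibv_send_wr, so this one copy also carries the rkey for
	 * IBV_WR_SEND_WITH_INV and IBV_WR_LOCAL_INV requests.
	 */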
kwr->ex.imm_data = uwr->imm_data; switch (uwr->opcode) { case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_WRITE_WITH_IMM: case IBV_WR_RDMA_READ: kwr->wr.rdma.remote_addr = uwr->wr.rdma.remote_addr; kwr->wr.rdma.rkey = uwr->wr.rdma.rkey; break; case IBV_WR_SEND: case IBV_WR_SEND_WITH_IMM: if (qp_type(qp) == IBV_QPT_UD) { struct rxe_ah *ah = to_rah(uwr->wr.ud.ah); kwr->wr.ud.remote_qpn = uwr->wr.ud.remote_qpn; kwr->wr.ud.remote_qkey = uwr->wr.ud.remote_qkey; kwr->wr.ud.ah_num = ah->ah_num; } break; case IBV_WR_ATOMIC_CMP_AND_SWP: case IBV_WR_ATOMIC_FETCH_AND_ADD: kwr->wr.atomic.remote_addr = uwr->wr.atomic.remote_addr; kwr->wr.atomic.compare_add = uwr->wr.atomic.compare_add; kwr->wr.atomic.swap = uwr->wr.atomic.swap; kwr->wr.atomic.rkey = uwr->wr.atomic.rkey; break; case IBV_WR_BIND_MW: ibmr = uwr->bind_mw.bind_info.mr; ibmw = uwr->bind_mw.mw; kwr->wr.mw.addr = uwr->bind_mw.bind_info.addr; kwr->wr.mw.length = uwr->bind_mw.bind_info.length; kwr->wr.mw.mr_lkey = ibmr->lkey; kwr->wr.mw.mw_rkey = ibmw->rkey; kwr->wr.mw.rkey = uwr->bind_mw.rkey; kwr->wr.mw.access = uwr->bind_mw.bind_info.mw_access_flags; break; default: break; } } static int init_send_wqe(struct rxe_qp *qp, struct rxe_wq *sq, struct ibv_send_wr *ibwr, unsigned int length, struct rxe_send_wqe *wqe) { int num_sge = ibwr->num_sge; int i; unsigned int opcode = ibwr->opcode; convert_send_wr(qp, &wqe->wr, ibwr); if (qp_type(qp) == IBV_QPT_UD) { struct rxe_ah *ah = to_rah(ibwr->wr.ud.ah); if (!ah->ah_num) /* old kernels only */ memcpy(&wqe->wr.wr.ud.av, &ah->av, sizeof(struct rxe_av)); } if (ibwr->send_flags & IBV_SEND_INLINE) { uint8_t *inline_data = wqe->dma.inline_data; for (i = 0; i < num_sge; i++) { memcpy(inline_data, (uint8_t *)(long)ibwr->sg_list[i].addr, ibwr->sg_list[i].length); inline_data += ibwr->sg_list[i].length; } } else memcpy(wqe->dma.sge, ibwr->sg_list, num_sge*sizeof(struct ibv_sge)); if ((opcode == IBV_WR_ATOMIC_CMP_AND_SWP) || (opcode == IBV_WR_ATOMIC_FETCH_AND_ADD)) wqe->iova = ibwr->wr.atomic.remote_addr; else wqe->iova = ibwr->wr.rdma.remote_addr; wqe->dma.length = length; wqe->dma.resid = length; wqe->dma.num_sge = num_sge; wqe->dma.cur_sge = 0; wqe->dma.sge_offset = 0; wqe->state = 0; return 0; } static int post_one_send(struct rxe_qp *qp, struct rxe_wq *sq, struct ibv_send_wr *ibwr) { int err; struct rxe_send_wqe *wqe; unsigned int length = 0; int i; for (i = 0; i < ibwr->num_sge; i++) length += ibwr->sg_list[i].length; err = validate_send_wr(qp, ibwr, length); if (err) { verbs_err(verbs_get_ctx(qp->vqp.qp.context), "validate send failed\n"); return err; } wqe = (struct rxe_send_wqe *)producer_addr(sq->queue); err = init_send_wqe(qp, sq, ibwr, length, wqe); if (err) return err; if (queue_full(sq->queue)) return ENOMEM; advance_producer(sq->queue); rdma_tracepoint(rdma_core_rxe, post_send, qp->vqp.qp.context->device->name, qp->vqp.qp.qp_num, (char *)ibv_wr_opcode_str(ibwr->opcode), length); return 0; } /* send a null post send as a doorbell */ static int post_send_db(struct ibv_qp *ibqp) { struct ibv_post_send cmd; struct ib_uverbs_post_send_resp resp; cmd.hdr.command = IB_USER_VERBS_CMD_POST_SEND; cmd.hdr.in_words = sizeof(cmd) / 4; cmd.hdr.out_words = sizeof(resp) / 4; cmd.response = (uintptr_t)&resp; cmd.qp_handle = ibqp->handle; cmd.wr_count = 0; cmd.sge_count = 0; cmd.wqe_size = sizeof(struct ibv_send_wr); if (write(ibqp->context->cmd_fd, &cmd, sizeof(cmd)) != sizeof(cmd)) return errno; return 0; } /* this API does not make a distinction between * restartable and non-restartable errors */ static int 
rxe_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr_list, struct ibv_send_wr **bad_wr) { int rc = 0; int err; struct rxe_qp *qp = to_rqp(ibqp); struct rxe_wq *sq = &qp->sq; if (!bad_wr) return EINVAL; *bad_wr = NULL; if (!sq || !wr_list || !sq->queue) return EINVAL; pthread_spin_lock(&sq->lock); while (wr_list) { rc = post_one_send(qp, sq, wr_list); if (rc) { *bad_wr = wr_list; break; } wr_list = wr_list->next; } pthread_spin_unlock(&sq->lock); err = post_send_db(ibqp); return err ? err : rc; } static int rxe_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *recv_wr, struct ibv_recv_wr **bad_wr) { int rc = 0; struct rxe_qp *qp = to_rqp(ibqp); struct rxe_wq *rq = &qp->rq; if (!bad_wr) return EINVAL; *bad_wr = NULL; if (!rq || !recv_wr || !rq->queue) return EINVAL; /* see C10-97.2.1 */ if (ibqp->state == IBV_QPS_RESET) return EINVAL; pthread_spin_lock(&rq->lock); while (recv_wr) { rc = rxe_post_one_recv(rq, recv_wr); if (rc) { *bad_wr = recv_wr; break; } recv_wr = recv_wr->next; } pthread_spin_unlock(&rq->lock); return rc; } static inline int ipv6_addr_v4mapped(const struct in6_addr *a) { return IN6_IS_ADDR_V4MAPPED(a); } typedef typeof(((struct rxe_av *)0)->sgid_addr) sockaddr_union_t; static inline int rdma_gid2ip(sockaddr_union_t *out, union ibv_gid *gid) { if (ipv6_addr_v4mapped((struct in6_addr *)gid)) { memset(&out->_sockaddr_in, 0, sizeof(out->_sockaddr_in)); memcpy(&out->_sockaddr_in.sin_addr.s_addr, gid->raw + 12, 4); } else { memset(&out->_sockaddr_in6, 0, sizeof(out->_sockaddr_in6)); out->_sockaddr_in6.sin6_family = AF_INET6; memcpy(&out->_sockaddr_in6.sin6_addr.s6_addr, gid->raw, 16); } return 0; } static int rxe_create_av(struct rxe_ah *ah, struct ibv_pd *pd, struct ibv_ah_attr *attr) { struct rxe_av *av = &ah->av; union ibv_gid sgid; int ret; ret = ibv_query_gid(pd->context, attr->port_num, attr->grh.sgid_index, &sgid); if (ret) return ret; av->port_num = attr->port_num; memcpy(&av->grh, &attr->grh, sizeof(attr->grh)); ret = ipv6_addr_v4mapped((struct in6_addr *)attr->grh.dgid.raw); av->network_type = ret ? RXE_NETWORK_TYPE_IPV4 : RXE_NETWORK_TYPE_IPV6; rdma_gid2ip(&av->sgid_addr, &sgid); rdma_gid2ip(&av->dgid_addr, &attr->grh.dgid); ret = ibv_resolve_eth_l2_from_gid(pd->context, attr, av->dmac, NULL); return ret; } /* * Newer kernels will return a non-zero AH index in resp.ah_num * which can be returned in UD send WQEs. * Older kernels will leave ah_num == 0. For these create an AV and use * in UD send WQEs. 
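 * (wr_set_ud_addr() and init_send_wqe() above implement that fallback by
 * copying ah->av into the UD WQE whenever ah_num == 0.)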
*/ static struct ibv_ah *rxe_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr) { struct rxe_ah *ah; struct urxe_create_ah_resp resp = {}; int ret; ah = calloc(1, sizeof(*ah)); if (!ah) return NULL; ret = ibv_cmd_create_ah(pd, &ah->ibv_ah, attr, &resp.ibv_resp, sizeof(resp)); if (ret) goto err_free; ah->ah_num = resp.ah_num; if (!ah->ah_num) { /* old kernels only */ ret = rxe_create_av(ah, pd, attr); if (ret) goto err_free; } return &ah->ibv_ah; err_free: free(ah); return NULL; } static int rxe_destroy_ah(struct ibv_ah *ibah) { struct rxe_ah *ah = to_rah(ibah); int ret; ret = ibv_cmd_destroy_ah(&ah->ibv_ah); if (!ret) free(ah); return ret; } static const struct verbs_context_ops rxe_ctx_ops = { .query_device_ex = rxe_query_device, .query_port = rxe_query_port, .alloc_pd = rxe_alloc_pd, .dealloc_pd = rxe_dealloc_pd, .reg_mr = rxe_reg_mr, .dereg_mr = rxe_dereg_mr, .alloc_mw = rxe_alloc_mw, .dealloc_mw = rxe_dealloc_mw, .bind_mw = rxe_bind_mw, .create_cq = rxe_create_cq, .create_cq_ex = rxe_create_cq_ex, .poll_cq = rxe_poll_cq, .req_notify_cq = ibv_cmd_req_notify_cq, .resize_cq = rxe_resize_cq, .destroy_cq = rxe_destroy_cq, .create_srq = rxe_create_srq, .create_srq_ex = rxe_create_srq_ex, .modify_srq = rxe_modify_srq, .query_srq = rxe_query_srq, .destroy_srq = rxe_destroy_srq, .post_srq_recv = rxe_post_srq_recv, .create_qp = rxe_create_qp, .create_qp_ex = rxe_create_qp_ex, .query_qp = rxe_query_qp, .modify_qp = rxe_modify_qp, .destroy_qp = rxe_destroy_qp, .post_send = rxe_post_send, .post_recv = rxe_post_recv, .create_ah = rxe_create_ah, .destroy_ah = rxe_destroy_ah, .attach_mcast = ibv_cmd_attach_mcast, .detach_mcast = ibv_cmd_detach_mcast, .free_context = rxe_free_context, }; static struct verbs_context *rxe_alloc_context(struct ibv_device *ibdev, int cmd_fd, void *private_data) { struct rxe_context *context; struct ibv_get_context cmd; struct ib_uverbs_get_context_resp resp; context = verbs_init_and_alloc_context(ibdev, cmd_fd, context, ibv_ctx, RDMA_DRIVER_RXE); if (!context) return NULL; if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof(cmd), &resp, sizeof(resp))) goto out; verbs_set_ops(&context->ibv_ctx, &rxe_ctx_ops); return &context->ibv_ctx; out: verbs_uninit_context(&context->ibv_ctx); free(context); return NULL; } static void rxe_free_context(struct ibv_context *ibctx) { struct rxe_context *context = to_rctx(ibctx); verbs_uninit_context(&context->ibv_ctx); free(context); } static void rxe_uninit_device(struct verbs_device *verbs_device) { struct rxe_device *dev = to_rdev(&verbs_device->device); free(dev); } static struct verbs_device *rxe_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct rxe_device *dev; dev = calloc(1, sizeof(*dev)); if (!dev) return NULL; dev->abi_version = sysfs_dev->abi_ver; return &dev->ibv_dev; } static const struct verbs_device_ops rxe_dev_ops = { .name = "rxe", /* * For 64 bit machines ABI version 1 and 2 are the same. Otherwise 32 * bit machines require ABI version 2 which guarantees the user and * kernel use the same ABI. */ .match_min_abi_version = sizeof(void *) == 8 ? 1 : 2, .match_max_abi_version = 2, .match_table = hca_table, .alloc_device = rxe_device_alloc, .uninit_device = rxe_uninit_device, .alloc_context = rxe_alloc_context, }; PROVIDER_DRIVER(rxe, rxe_dev_ops); rdma-core-56.1/providers/rxe/rxe.h000066400000000000000000000067411477342711600171240ustar00rootroot00000000000000/* * Copyright (c) 2009 Mellanox Technologies Ltd. All rights reserved. * Copyright (c) 2009 System Fabric Works, Inc. All rights reserved.
* Copyright (c) 2006-2007 QLogic Corp. All rights reserved. * Copyright (c) 2005. PathScale, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef RXE_H #define RXE_H #include #include #include #include #include "rxe-abi.h" struct rxe_device { struct verbs_device ibv_dev; int abi_version; }; struct rxe_context { struct verbs_context ibv_ctx; }; /* common between cq and cq_ex */ struct rxe_cq { struct verbs_cq vcq; struct mminfo mmap_info; struct rxe_queue_buf *queue; pthread_spinlock_t lock; /* new API support */ struct ib_uverbs_wc *wc; size_t wc_size; uint32_t cur_index; }; struct rxe_ah { struct ibv_ah ibv_ah; struct rxe_av av; int ah_num; }; struct rxe_wq { struct rxe_queue_buf *queue; pthread_spinlock_t lock; unsigned int max_sge; unsigned int max_inline; }; struct rxe_qp { struct verbs_qp vqp; struct mminfo rq_mmap_info; struct rxe_wq rq; struct mminfo sq_mmap_info; struct rxe_wq sq; /* new API support */ uint32_t cur_index; int err; }; struct rxe_srq { struct verbs_srq vsrq; struct mminfo mmap_info; struct rxe_wq rq; uint32_t srq_num; }; #define to_rxxx(xxx, type) container_of(ib##xxx, struct rxe_##type, ibv_##xxx) static inline struct rxe_context *to_rctx(struct ibv_context *ibctx) { return container_of(ibctx, struct rxe_context, ibv_ctx.context); } static inline struct rxe_device *to_rdev(struct ibv_device *ibdev) { return container_of(ibdev, struct rxe_device, ibv_dev.device); } static inline struct rxe_cq *to_rcq(struct ibv_cq *ibcq) { return container_of(ibcq, struct rxe_cq, vcq.cq); } static inline struct rxe_qp *to_rqp(struct ibv_qp *ibqp) { return container_of(ibqp, struct rxe_qp, vqp.qp); } static inline struct rxe_srq *to_rsrq(struct ibv_srq *ibsrq) { return container_of(ibsrq, struct rxe_srq, vsrq.srq); } static inline struct rxe_ah *to_rah(struct ibv_ah *ibah) { return to_rxxx(ah, ah); } static inline enum ibv_qp_type qp_type(struct rxe_qp *qp) { return qp->vqp.qp.qp_type; } #endif /* RXE_H */ rdma-core-56.1/providers/rxe/rxe_queue.h000066400000000000000000000137071477342711600203300ustar00rootroot00000000000000/* * Copyright (c) 2009 Mellanox Technologies Ltd. All rights reserved. * Copyright (c) 2009 System Fabric Works, Inc. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ /* implements a simple circular buffer with sizes a power of 2 */ #ifndef H_RXE_QUEUE #define H_RXE_QUEUE #include #include #include "rxe.h" /* N.B. producer_index and consumer_index always lie in the range * [0, index_mask]; masking is only required when computing a new value. * Below, 'consumer_index lock' is cq->lock * and, 'producer_index lock' is one of rq, sq or srq->lock. * In the code below the only memory ordering required is between the * kernel driver (rdma_rxe) and the user provider library. Ordering between * user space threads is addressed by spinlocks which provide memory * barriers.
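 *
 * Worked example (illustrative only, derived from queue_empty() and
 * queue_full() below): with eight slots, index_mask == 7. Starting from
 * prod == cons == 0, seven WQEs can be posted before queue_full() sees
 * cons == ((prod + 1) & index_mask); one slot is always sacrificed so
 * that a full ring can be told apart from an empty one (prod == cons).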
*/ typedef _Atomic(__u32) _atomic_t; static inline _atomic_t *producer(struct rxe_queue_buf *q) { return (_atomic_t *)&q->producer_index; } static inline _atomic_t *consumer(struct rxe_queue_buf *q) { return (_atomic_t *)&q->consumer_index; } /* Must hold consumer_index lock (used by CQ only) */ static inline int queue_empty(struct rxe_queue_buf *q) { __u32 prod; __u32 cons; prod = atomic_load_explicit(producer(q), memory_order_acquire); cons = atomic_load_explicit(consumer(q), memory_order_relaxed); return (prod == cons); } /* Must hold producer_index lock (used by SQ, RQ, SRQ only) */ static inline int queue_full(struct rxe_queue_buf *q) { __u32 prod; __u32 cons; prod = atomic_load_explicit(producer(q), memory_order_relaxed); cons = atomic_load_explicit(consumer(q), memory_order_acquire); return (cons == ((prod + 1) & q->index_mask)); } /* Must hold producer_index lock */ static inline void advance_producer(struct rxe_queue_buf *q) { __u32 prod; prod = atomic_load_explicit(producer(q), memory_order_relaxed); prod = (prod + 1) & q->index_mask; atomic_store_explicit(producer(q), prod, memory_order_release); } /* Must hold consumer_index lock */ static inline void advance_consumer(struct rxe_queue_buf *q) { __u32 cons; cons = atomic_load_explicit(consumer(q), memory_order_relaxed); cons = (cons + 1) & q->index_mask; atomic_store_explicit(consumer(q), cons, memory_order_release); } /* Must hold producer_index lock */ static inline __u32 load_producer_index(struct rxe_queue_buf *q) { return atomic_load_explicit(producer(q), memory_order_relaxed); } /* Must hold producer_index lock */ static inline void store_producer_index(struct rxe_queue_buf *q, __u32 index) { /* flush writes to work queue before moving index */ atomic_store_explicit(producer(q), index, memory_order_release); } /* Must hold consumer_index lock */ static inline __u32 load_consumer_index(struct rxe_queue_buf *q) { return atomic_load_explicit(consumer(q), memory_order_relaxed); } /* Must hold consumer_index lock */ static inline void store_consumer_index(struct rxe_queue_buf *q, __u32 index) { /* complete reads from completion queue before moving index */ atomic_store_explicit(consumer(q), index, memory_order_release); } /* Must hold producer_index lock */ static inline void *producer_addr(struct rxe_queue_buf *q) { __u32 prod; prod = atomic_load_explicit(producer(q), memory_order_relaxed); return q->data + (prod << q->log2_elem_size); } /* Must hold consumer_index lock */ static inline void *consumer_addr(struct rxe_queue_buf *q) { __u32 cons; cons = atomic_load_explicit(consumer(q), memory_order_relaxed); return q->data + (cons << q->log2_elem_size); } static inline void *addr_from_index(struct rxe_queue_buf *q, unsigned int index) { index &= q->index_mask; return q->data + (index << q->log2_elem_size); } static inline unsigned int index_from_addr(const struct rxe_queue_buf *q, const void *addr) { return (((__u8 *)addr - q->data) >> q->log2_elem_size) & q->index_mask; } static inline void advance_cq_cur_index(struct rxe_cq *cq) { struct rxe_queue_buf *q = cq->queue; cq->cur_index = (cq->cur_index + 1) & q->index_mask; } static inline int check_cq_queue_empty(struct rxe_cq *cq) { struct rxe_queue_buf *q = cq->queue; __u32 prod; prod = atomic_load_explicit(producer(q), memory_order_acquire); return (cq->cur_index == prod); } static inline void advance_qp_cur_index(struct rxe_qp *qp) { struct rxe_queue_buf *q = qp->sq.queue; qp->cur_index = (qp->cur_index + 1) & q->index_mask; } static inline int check_qp_queue_full(struct 
rxe_qp *qp) { struct rxe_queue_buf *q = qp->sq.queue; uint32_t cons; cons = atomic_load_explicit(consumer(q), memory_order_acquire); if (qp->err) goto err; if (cons == ((qp->cur_index + 1) & q->index_mask)) qp->err = ENOSPC; err: return qp->err; } #endif /* H_RXE_QUEUE */ rdma-core-56.1/providers/rxe/rxe_trace.c000066400000000000000000000003641477342711600202700ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ /* * Copyright 2023 Bytedance.com, Inc. or its affiliates. All rights reserved. */ #define LTTNG_UST_TRACEPOINT_CREATE_PROBES #define LTTNG_UST_TRACEPOINT_DEFINE #include "rxe_trace.h" rdma-core-56.1/providers/rxe/rxe_trace.h000066400000000000000000000023661477342711600203010ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause */ /* * Copyright 2023 Bytedance.com, Inc. or its affiliates. All rights reserved. */ #if defined(LTTNG_ENABLED) #undef LTTNG_UST_TRACEPOINT_PROVIDER #define LTTNG_UST_TRACEPOINT_PROVIDER rdma_core_rxe #undef LTTNG_UST_TRACEPOINT_INCLUDE #define LTTNG_UST_TRACEPOINT_INCLUDE "rxe_trace.h" #if !defined(__RXE_TRACE_H__) || defined(LTTNG_UST_TRACEPOINT_HEADER_MULTI_READ) #define __RXE_TRACE_H__ #include #include LTTNG_UST_TRACEPOINT_EVENT( /* Tracepoint provider name */ rdma_core_rxe, /* Tracepoint name */ post_send, /* Input arguments */ LTTNG_UST_TP_ARGS( char *, dev, uint32_t, src_qp_num, char *, opcode, uint32_t, bytes ), /* Output event fields */ LTTNG_UST_TP_FIELDS( lttng_ust_field_string(dev, dev) lttng_ust_field_integer(uint32_t, src_qp_num, src_qp_num) lttng_ust_field_string(opcode, opcode) lttng_ust_field_integer(uint32_t, bytes, bytes) ) ) #define rdma_tracepoint(arg...) lttng_ust_tracepoint(arg) #endif /* __RXE_TRACE_H__*/ #include #else #ifndef __RXE_TRACE_H__ #define __RXE_TRACE_H__ #define rdma_tracepoint(arg...) 
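/*
 * With LTTNG disabled, rdma_tracepoint() above expands to nothing, so a
 * call site such as the one in post_one_send() in rxe.c compiles away
 * entirely:
 *
 *   rdma_tracepoint(rdma_core_rxe, post_send,
 *                   qp->vqp.qp.context->device->name, qp->vqp.qp.qp_num,
 *                   (char *)ibv_wr_opcode_str(ibwr->opcode), length);
 */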
#endif /* __RXE_TRACE_H__*/ #endif /* defined(LTTNG_ENABLED) */ rdma-core-56.1/providers/siw/000077500000000000000000000000001477342711600161515ustar00rootroot00000000000000rdma-core-56.1/providers/siw/CMakeLists.txt000066400000000000000000000000341477342711600207060ustar00rootroot00000000000000rdma_provider(siw siw.c ) rdma-core-56.1/providers/siw/siw.c000066400000000000000000000525641477342711600171330ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 or BSD-3-Clause // Authors: Bernard Metzler // Copyright (c) 2008-2019, IBM Corporation #include #include #include #include #include #include #include #include #include #include #include "siw_abi.h" #include "siw.h" static void siw_free_context(struct ibv_context *ibv_ctx); static int siw_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct ib_uverbs_ex_query_device_resp resp; size_t resp_size = sizeof(resp); uint64_t raw_fw_ver; unsigned int major, minor, sub_minor; int rv; rv = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp, &resp_size); if (rv) return rv; raw_fw_ver = resp.base.fw_ver; major = (raw_fw_ver >> 32) & 0xffff; minor = (raw_fw_ver >> 16) & 0xffff; sub_minor = raw_fw_ver & 0xffff; snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver), "%d.%d.%d", major, minor, sub_minor); return 0; } static int siw_query_port(struct ibv_context *ctx, uint8_t port, struct ibv_port_attr *attr) { struct ibv_query_port cmd; memset(&cmd, 0, sizeof(cmd)); return ibv_cmd_query_port(ctx, port, attr, &cmd, sizeof(cmd)); } static int siw_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd; memset(&cmd, 0, sizeof(cmd)); return ibv_cmd_query_qp(qp, attr, attr_mask, init_attr, &cmd, sizeof(cmd)); } static struct ibv_pd *siw_alloc_pd(struct ibv_context *ctx) { struct ibv_alloc_pd cmd; struct ib_uverbs_alloc_pd_resp resp; struct ibv_pd *pd; memset(&cmd, 0, sizeof(cmd)); pd = calloc(1, sizeof(*pd)); if (!pd) return NULL; if (ibv_cmd_alloc_pd(ctx, pd, &cmd, sizeof(cmd), &resp, sizeof(resp))) { free(pd); return NULL; } return pd; } static int siw_free_pd(struct ibv_pd *pd) { int rv; rv = ibv_cmd_dealloc_pd(pd); if (rv) return rv; free(pd); return 0; } static struct ibv_mr *siw_reg_mr(struct ibv_pd *pd, void *addr, size_t len, uint64_t hca_va, int access) { struct siw_cmd_reg_mr cmd = {}; struct siw_cmd_reg_mr_resp resp = {}; struct siw_mr *mr; int rv; mr = calloc(1, sizeof(*mr)); if (!mr) return NULL; rv = ibv_cmd_reg_mr(pd, addr, len, hca_va, access, &mr->base_mr, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (rv) { free(mr); return NULL; } return &mr->base_mr.ibv_mr; } static int siw_dereg_mr(struct verbs_mr *base_mr) { struct siw_mr *mr = mr_base2siw(base_mr); int rv; rv = ibv_cmd_dereg_mr(base_mr); if (rv) return rv; free(mr); return 0; } static struct ibv_cq *siw_create_cq(struct ibv_context *ctx, int num_cqe, struct ibv_comp_channel *channel, int comp_vector) { struct siw_cmd_create_cq cmd = {}; struct siw_cmd_create_cq_resp resp = {}; struct siw_cq *cq; int cq_size, rv; cq = calloc(1, sizeof(*cq)); if (!cq) return NULL; rv = ibv_cmd_create_cq(ctx, num_cqe, channel, comp_vector, &cq->base_cq, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (rv) { verbs_err(verbs_get_ctx(ctx), "libsiw: CQ creation failed: %d\n", rv); free(cq); return NULL; } if (resp.cq_key == SIW_INVAL_UOBJ_KEY) { verbs_err(verbs_get_ctx(ctx), 
"libsiw: prepare CQ mapping failed\n"); goto fail; } pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE); cq->id = resp.cq_id; cq->num_cqe = resp.num_cqe; cq_size = resp.num_cqe * sizeof(struct siw_cqe) + sizeof(struct siw_cq_ctrl); cq->queue = mmap(NULL, cq_size, PROT_READ | PROT_WRITE, MAP_SHARED, ctx->cmd_fd, resp.cq_key); if (cq->queue == MAP_FAILED) { verbs_err(verbs_get_ctx(ctx), "libsiw: CQ mapping failed: %d", errno); goto fail; } cq->ctrl = (struct siw_cq_ctrl *)&cq->queue[cq->num_cqe]; cq->ctrl->flags = SIW_NOTIFY_NOT; return &cq->base_cq; fail: ibv_cmd_destroy_cq(&cq->base_cq); free(cq); return NULL; } static int siw_destroy_cq(struct ibv_cq *base_cq) { struct siw_cq *cq = cq_base2siw(base_cq); int rv; assert(pthread_spin_trylock(&cq->lock)); if (cq->queue) munmap(cq->queue, cq->num_cqe * sizeof(struct siw_cqe) + sizeof(struct siw_cq_ctrl)); rv = ibv_cmd_destroy_cq(base_cq); if (rv) { pthread_spin_unlock(&cq->lock); return rv; } pthread_spin_destroy(&cq->lock); free(cq); return 0; } static struct ibv_srq *siw_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr) { struct siw_cmd_create_srq cmd = {}; struct siw_cmd_create_srq_resp resp = {}; struct ibv_context *ctx = pd->context; struct siw_srq *srq; int rv, rq_size; srq = calloc(1, sizeof(*srq)); if (!srq) return NULL; rv = ibv_cmd_create_srq(pd, &srq->base_srq, attr, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (rv) { verbs_err(verbs_get_ctx(pd->context), "libsiw: creating SRQ failed\n"); free(srq); return NULL; } if (resp.srq_key == SIW_INVAL_UOBJ_KEY) { verbs_err(verbs_get_ctx(pd->context), "libsiw: prepare SRQ mapping failed\n"); goto fail; } pthread_spin_init(&srq->lock, PTHREAD_PROCESS_PRIVATE); rq_size = resp.num_rqe * sizeof(struct siw_rqe); srq->num_rqe = resp.num_rqe; srq->recvq = mmap(NULL, rq_size, PROT_READ | PROT_WRITE, MAP_SHARED, ctx->cmd_fd, resp.srq_key); if (srq->recvq == MAP_FAILED) { verbs_err(verbs_get_ctx(pd->context), "libsiw: SRQ mapping failed: %d", errno); goto fail; } return &srq->base_srq; fail: ibv_cmd_destroy_srq(&srq->base_srq); free(srq); return NULL; } static int siw_modify_srq(struct ibv_srq *base_srq, struct ibv_srq_attr *attr, int attr_mask) { struct ibv_modify_srq cmd = {}; struct siw_srq *srq = srq_base2siw(base_srq); int rv; pthread_spin_lock(&srq->lock); rv = ibv_cmd_modify_srq(base_srq, attr, attr_mask, &cmd, sizeof(cmd)); pthread_spin_unlock(&srq->lock); return rv; } static int siw_destroy_srq(struct ibv_srq *base_srq) { struct siw_srq *srq = srq_base2siw(base_srq); int rv; assert(pthread_spin_trylock(&srq->lock)); rv = ibv_cmd_destroy_srq(base_srq); if (rv) { pthread_spin_unlock(&srq->lock); return rv; } if (srq->recvq) munmap(srq->recvq, srq->num_rqe * sizeof(struct siw_rqe)); pthread_spin_destroy(&srq->lock); free(srq); return 0; } static struct ibv_qp *siw_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct siw_cmd_create_qp cmd = {}; struct siw_cmd_create_qp_resp resp = {}; struct siw_qp *qp; struct ibv_context *base_ctx = pd->context; int sq_size, rq_size, rv; memset(&cmd, 0, sizeof(cmd)); memset(&resp, 0, sizeof(resp)); qp = calloc(1, sizeof(*qp)); if (!qp) return NULL; rv = ibv_cmd_create_qp(pd, &qp->base_qp, attr, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (rv) { verbs_err(verbs_get_ctx(pd->context), "libsiw: QP creation failed\n"); free(qp); return NULL; } if (resp.sq_key == SIW_INVAL_UOBJ_KEY || resp.rq_key == SIW_INVAL_UOBJ_KEY) { verbs_err(verbs_get_ctx(pd->context), "libsiw: prepare QP mapping failed\n"); goto 
fail; } qp->id = resp.qp_id; qp->num_sqe = resp.num_sqe; qp->num_rqe = resp.num_rqe; qp->sq_sig_all = attr->sq_sig_all; /* Init doorbell request structure */ qp->db_req.hdr.command = IB_USER_VERBS_CMD_POST_SEND; qp->db_req.hdr.in_words = sizeof(qp->db_req) / 4; qp->db_req.hdr.out_words = sizeof(qp->db_resp) / 4; qp->db_req.response = (uintptr_t)&qp->db_resp; qp->db_req.wr_count = 0; qp->db_req.sge_count = 0; qp->db_req.wqe_size = sizeof(struct ibv_send_wr); pthread_spin_init(&qp->sq_lock, PTHREAD_PROCESS_PRIVATE); pthread_spin_init(&qp->rq_lock, PTHREAD_PROCESS_PRIVATE); sq_size = resp.num_sqe * sizeof(struct siw_sqe); qp->sendq = mmap(NULL, sq_size, PROT_READ | PROT_WRITE, MAP_SHARED, base_ctx->cmd_fd, resp.sq_key); if (qp->sendq == MAP_FAILED) { verbs_err(verbs_get_ctx(pd->context), "libsiw: SQ mapping failed: %d", errno); qp->sendq = NULL; goto fail; } if (attr->srq) { qp->srq = srq_base2siw(attr->srq); } else { rq_size = resp.num_rqe * sizeof(struct siw_rqe); qp->recvq = mmap(NULL, rq_size, PROT_READ | PROT_WRITE, MAP_SHARED, base_ctx->cmd_fd, resp.rq_key); if (qp->recvq == MAP_FAILED) { verbs_err(verbs_get_ctx(pd->context), "libsiw: RQ mapping failed: %d\n", errno); qp->recvq = NULL; goto fail; } } qp->db_req.qp_handle = qp->base_qp.handle; return &qp->base_qp; fail: ibv_cmd_destroy_qp(&qp->base_qp); if (qp->sendq) munmap(qp->sendq, qp->num_sqe * sizeof(struct siw_sqe)); if (qp->recvq) munmap(qp->recvq, qp->num_rqe * sizeof(struct siw_rqe)); free(qp); return NULL; } static int siw_modify_qp(struct ibv_qp *base_qp, struct ibv_qp_attr *attr, int attr_mask) { struct ibv_modify_qp cmd; struct siw_qp *qp = qp_base2siw(base_qp); int rv; memset(&cmd, 0, sizeof(cmd)); pthread_spin_lock(&qp->sq_lock); pthread_spin_lock(&qp->rq_lock); rv = ibv_cmd_modify_qp(base_qp, attr, attr_mask, &cmd, sizeof(cmd)); pthread_spin_unlock(&qp->rq_lock); pthread_spin_unlock(&qp->sq_lock); return rv; } static int siw_destroy_qp(struct ibv_qp *base_qp) { struct siw_qp *qp = qp_base2siw(base_qp); int rv; assert(pthread_spin_trylock(&qp->sq_lock)); assert(pthread_spin_trylock(&qp->rq_lock)); if (qp->sendq) munmap(qp->sendq, qp->num_sqe * sizeof(struct siw_sqe)); if (qp->recvq) munmap(qp->recvq, qp->num_rqe * sizeof(struct siw_rqe)); rv = ibv_cmd_destroy_qp(base_qp); if (rv) { pthread_spin_unlock(&qp->rq_lock); pthread_spin_unlock(&qp->sq_lock); return rv; } pthread_spin_destroy(&qp->rq_lock); pthread_spin_destroy(&qp->sq_lock); free(qp); return 0; } static void siw_async_event(struct ibv_context *ctx, struct ibv_async_event *event) { struct ibv_qp *base_qp = event->element.qp; struct ibv_cq *base_cq = event->element.cq; switch (event->event_type) { case IBV_EVENT_CQ_ERR: verbs_err(verbs_get_ctx(ctx), "libsiw: CQ[%d] event: error\n", cq_base2siw(base_cq)->id); break; case IBV_EVENT_QP_FATAL: verbs_err(verbs_get_ctx(ctx), "libsiw: QP[%d] event: fatal error\n", qp_base2siw(base_qp)->id); break; case IBV_EVENT_QP_REQ_ERR: verbs_err(verbs_get_ctx(ctx), "libsiw: QP[%d] event: request error\n", qp_base2siw(base_qp)->id); break; case IBV_EVENT_QP_ACCESS_ERR: verbs_err(verbs_get_ctx(ctx), "libsiw: QP[%d] event: access error\n", qp_base2siw(base_qp)->id); break; case IBV_EVENT_SQ_DRAINED: case IBV_EVENT_COMM_EST: case IBV_EVENT_QP_LAST_WQE_REACHED: break; default: break; } } static int siw_notify_cq(struct ibv_cq *ibcq, int solicited) { struct siw_cq *cq = cq_base2siw(ibcq); int rv = 0; if (solicited) atomic_store((_Atomic(uint32_t) *)&cq->ctrl->flags, SIW_NOTIFY_SOLICITED); else atomic_store((_Atomic(uint32_t) 
*)&cq->ctrl->flags, SIW_NOTIFY_SOLICITED | SIW_NOTIFY_NEXT_COMPLETION); return rv; } static const struct { enum ibv_wr_opcode base; enum siw_opcode siw; } map_send_opcode[IBV_WR_DRIVER1 + 1] = { { IBV_WR_RDMA_WRITE, SIW_OP_WRITE}, { IBV_WR_RDMA_WRITE_WITH_IMM, SIW_NUM_OPCODES + 1 }, { IBV_WR_SEND, SIW_OP_SEND }, { IBV_WR_SEND_WITH_IMM, SIW_NUM_OPCODES + 1 }, { IBV_WR_RDMA_READ, SIW_OP_READ }, { IBV_WR_ATOMIC_CMP_AND_SWP, SIW_NUM_OPCODES + 1 }, { IBV_WR_ATOMIC_FETCH_AND_ADD, SIW_NUM_OPCODES + 1 }, { IBV_WR_LOCAL_INV, SIW_NUM_OPCODES + 1 }, { IBV_WR_BIND_MW, SIW_NUM_OPCODES + 1 }, { IBV_WR_SEND_WITH_INV, SIW_OP_SEND_REMOTE_INV }, { IBV_WR_TSO, SIW_NUM_OPCODES + 1 }, { IBV_WR_DRIVER1, SIW_NUM_OPCODES + 1 } }; static inline uint16_t map_send_flags(int ibv_flags) { uint16_t flags = SIW_WQE_VALID; if (ibv_flags & IBV_SEND_SIGNALED) flags |= SIW_WQE_SIGNALLED; if (ibv_flags & IBV_SEND_SOLICITED) flags |= SIW_WQE_SOLICITED; if (ibv_flags & IBV_SEND_INLINE) flags |= SIW_WQE_INLINE; if (ibv_flags & IBV_SEND_FENCE) flags |= SIW_WQE_READ_FENCE; return flags; } static inline int push_send_wqe(struct ibv_qp *base_qp, struct ibv_send_wr *base_wr, struct siw_sqe *siw_sqe, int sig_all) { uint32_t flags = map_send_flags(base_wr->send_flags); atomic_ushort *fp = (atomic_ushort *)&siw_sqe->flags; siw_sqe->id = base_wr->wr_id; siw_sqe->num_sge = base_wr->num_sge; siw_sqe->raddr = base_wr->wr.rdma.remote_addr; siw_sqe->rkey = base_wr->wr.rdma.rkey; siw_sqe->opcode = map_send_opcode[base_wr->opcode].siw; if (siw_sqe->opcode > SIW_NUM_OPCODES) { verbs_err(verbs_get_ctx(base_qp->context), "libsiw: opcode %d unsupported\n", base_wr->opcode); return -EINVAL; } if (sig_all) flags |= SIW_WQE_SIGNALLED; if (flags & SIW_WQE_INLINE) { char *data = (char *)&siw_sqe->sge[1]; int bytes = 0, i = 0; /* Allow more than SIW_MAX_SGE, since content copied here */ while (i < base_wr->num_sge) { bytes += base_wr->sg_list[i].length; if (bytes > (int)SIW_MAX_INLINE) { verbs_err(verbs_get_ctx(base_qp->context), "libsiw: inline data: %d:%d\n", bytes, (int)SIW_MAX_INLINE); return -EINVAL; } memcpy(data, (void *)(uintptr_t)base_wr->sg_list[i].addr, base_wr->sg_list[i].length); data += base_wr->sg_list[i++].length; } siw_sqe->sge[0].length = bytes; } else { if (siw_sqe->num_sge > SIW_MAX_SGE) return -EINVAL; /* this assumes same layout of siw and base SGE */ memcpy(siw_sqe->sge, base_wr->sg_list, siw_sqe->num_sge * sizeof(struct ibv_sge)); } atomic_store(fp, flags); return 0; } static int siw_post_send(struct ibv_qp *base_qp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { struct siw_qp *qp = qp_base2siw(base_qp); uint32_t sq_put; atomic_ushort *fp; int new_sqe = 0, rv = 0; *bad_wr = NULL; pthread_spin_lock(&qp->sq_lock); sq_put = qp->sq_put; /* * Push all current work requests into mmapped SQ */ while (wr) { uint32_t idx = sq_put % qp->num_sqe; struct siw_sqe *sqe = &qp->sendq[idx]; uint16_t sqe_flags; fp = (atomic_ushort *)&sqe->flags; sqe_flags = atomic_load(fp); if (!(sqe_flags & SIW_WQE_VALID)) { rv = push_send_wqe(base_qp, wr, sqe, qp->sq_sig_all); if (rv) { *bad_wr = wr; break; } new_sqe++; } else { verbs_err(verbs_get_ctx(base_qp->context), "libsiw: QP[%d]: SQ overflow, idx %d\n", qp->id, idx); rv = -ENOMEM; *bad_wr = wr; break; } sq_put++; wr = wr->next; } if (new_sqe) { /* * If last WQE pushed before position where current post_send * started is idle, we assume SQ is not being actively * processed. Only then, the doorbell call will be issued. * This may significantly reduce unnecessary doorbell calls * on a busy SQ. 
We also always ring the doorbell if the * complete SQ was re-written during the current post_send. */ if (new_sqe < qp->num_sqe) { uint32_t old_idx = (qp->sq_put - 1) % qp->num_sqe; struct siw_sqe *old_sqe = &qp->sendq[old_idx]; fp = (atomic_ushort *)&old_sqe->flags; if (!(atomic_load(fp) & SIW_WQE_VALID)) rv = siw_db(qp); } else { rv = siw_db(qp); } if (rv) *bad_wr = wr; qp->sq_put = sq_put; } pthread_spin_unlock(&qp->sq_lock); return rv; } static inline int push_recv_wqe(struct ibv_recv_wr *base_wr, struct siw_rqe *siw_rqe) { atomic_ushort *fp = (atomic_ushort *)&siw_rqe->flags; siw_rqe->id = base_wr->wr_id; siw_rqe->num_sge = base_wr->num_sge; if (base_wr->num_sge == 1) { siw_rqe->sge[0].laddr = base_wr->sg_list[0].addr; siw_rqe->sge[0].length = base_wr->sg_list[0].length; siw_rqe->sge[0].lkey = base_wr->sg_list[0].lkey; } else if (base_wr->num_sge && base_wr->num_sge <= SIW_MAX_SGE) /* this assumes same layout of siw and base SGE */ memcpy(siw_rqe->sge, base_wr->sg_list, sizeof(struct ibv_sge) * base_wr->num_sge); else return -EINVAL; atomic_store(fp, SIW_WQE_VALID); return 0; } static int siw_post_recv(struct ibv_qp *base_qp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct siw_qp *qp = qp_base2siw(base_qp); uint32_t rq_put; int rv = 0; pthread_spin_lock(&qp->rq_lock); rq_put = qp->rq_put; while (wr) { int idx = rq_put % qp->num_rqe; struct siw_rqe *rqe = &qp->recvq[idx]; atomic_ushort *fp = (atomic_ushort *)&rqe->flags; uint16_t rqe_flags = atomic_load(fp); if (!(rqe_flags & SIW_WQE_VALID)) { if (push_recv_wqe(wr, rqe)) { *bad_wr = wr; rv = -EINVAL; break; } } else { verbs_err(verbs_get_ctx(base_qp->context), "libsiw: QP[%d]: RQ overflow, idx %d\n", qp->id, idx); rv = -ENOMEM; *bad_wr = wr; break; } rq_put++; wr = wr->next; } qp->rq_put = rq_put; pthread_spin_unlock(&qp->rq_lock); return rv; } static int siw_post_srq_recv(struct ibv_srq *base_srq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct siw_srq *srq = srq_base2siw(base_srq); uint32_t srq_put; int rv = 0; pthread_spin_lock(&srq->lock); srq_put = srq->rq_put; while (wr) { int idx = srq_put % srq->num_rqe; struct siw_rqe *rqe = &srq->recvq[idx]; atomic_ushort *fp = (atomic_ushort *)&rqe->flags; uint16_t rqe_flags = atomic_load(fp); if (!(rqe_flags & SIW_WQE_VALID)) { if (push_recv_wqe(wr, rqe)) { *bad_wr = wr; rv = -EINVAL; break; } } else { verbs_err(verbs_get_ctx(base_srq->context), "libsiw: SRQ[%p]: SRQ overflow\n", srq); rv = -ENOMEM; *bad_wr = wr; break; } srq_put++; wr = wr->next; } srq->rq_put = srq_put; pthread_spin_unlock(&srq->lock); return rv; } static const struct { enum siw_opcode siw; enum ibv_wc_opcode base; } map_cqe_opcode[SIW_NUM_OPCODES] = { { SIW_OP_WRITE, IBV_WC_RDMA_WRITE }, { SIW_OP_READ, IBV_WC_RDMA_READ }, { SIW_OP_READ_LOCAL_INV, IBV_WC_RDMA_READ }, { SIW_OP_SEND, IBV_WC_SEND }, { SIW_OP_SEND_WITH_IMM, IBV_WC_SEND }, { SIW_OP_SEND_REMOTE_INV, IBV_WC_SEND }, { SIW_OP_FETCH_AND_ADD, IBV_WC_FETCH_ADD }, { SIW_OP_COMP_AND_SWAP, IBV_WC_COMP_SWAP }, { SIW_OP_RECEIVE, IBV_WC_RECV } }; static const struct { enum siw_wc_status siw; enum ibv_wc_status base; } map_cqe_status[SIW_NUM_WC_STATUS] = { { SIW_WC_SUCCESS, IBV_WC_SUCCESS }, { SIW_WC_LOC_LEN_ERR, IBV_WC_LOC_LEN_ERR }, { SIW_WC_LOC_PROT_ERR, IBV_WC_LOC_PROT_ERR }, { SIW_WC_LOC_QP_OP_ERR, IBV_WC_LOC_QP_OP_ERR }, { SIW_WC_WR_FLUSH_ERR, IBV_WC_WR_FLUSH_ERR }, { SIW_WC_BAD_RESP_ERR, IBV_WC_BAD_RESP_ERR }, { SIW_WC_LOC_ACCESS_ERR, IBV_WC_LOC_ACCESS_ERR }, { SIW_WC_REM_ACCESS_ERR, IBV_WC_REM_ACCESS_ERR }, { SIW_WC_REM_INV_REQ_ERR, 
IBV_WC_REM_INV_REQ_ERR }, { SIW_WC_GENERAL_ERR, IBV_WC_GENERAL_ERR } }; static inline void copy_cqe(struct siw_cqe *cqe, struct ibv_wc *wc) { wc->wr_id = cqe->id; wc->byte_len = cqe->bytes; /* No immediate data supported yet */ wc->wc_flags = 0; wc->imm_data = 0; wc->vendor_err = 0; wc->opcode = map_cqe_opcode[cqe->opcode].base; wc->status = map_cqe_status[cqe->status].base; wc->qp_num = (uint32_t)cqe->qp_id; } static int siw_poll_cq(struct ibv_cq *ibcq, int num_entries, struct ibv_wc *wc) { struct siw_cq *cq = cq_base2siw(ibcq); int new = 0; pthread_spin_lock(&cq->lock); for (; num_entries--; wc++) { struct siw_cqe *cqe = &cq->queue[cq->cq_get % cq->num_cqe]; atomic_uchar *fp = (atomic_uchar *)&cqe->flags; if (atomic_load(fp) & SIW_WQE_VALID) { copy_cqe(cqe, wc); atomic_store(fp, 0); cq->cq_get++; new++; } else break; } pthread_spin_unlock(&cq->lock); return new; } static const struct verbs_context_ops siw_context_ops = { .alloc_pd = siw_alloc_pd, .async_event = siw_async_event, .create_cq = siw_create_cq, .create_qp = siw_create_qp, .create_srq = siw_create_srq, .dealloc_pd = siw_free_pd, .dereg_mr = siw_dereg_mr, .destroy_cq = siw_destroy_cq, .destroy_qp = siw_destroy_qp, .destroy_srq = siw_destroy_srq, .free_context = siw_free_context, .modify_qp = siw_modify_qp, .modify_srq = siw_modify_srq, .poll_cq = siw_poll_cq, .post_recv = siw_post_recv, .post_send = siw_post_send, .post_srq_recv = siw_post_srq_recv, .query_device_ex = siw_query_device, .query_port = siw_query_port, .query_qp = siw_query_qp, .reg_mr = siw_reg_mr, .req_notify_cq = siw_notify_cq, }; static struct verbs_context *siw_alloc_context(struct ibv_device *base_dev, int fd, void *pdata) { struct siw_context *ctx; struct ibv_get_context cmd = {}; struct siw_cmd_alloc_context_resp resp = {}; ctx = verbs_init_and_alloc_context(base_dev, fd, ctx, base_ctx, RDMA_DRIVER_SIW); if (!ctx) return NULL; if (ibv_cmd_get_context(&ctx->base_ctx, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) { verbs_uninit_context(&ctx->base_ctx); free(ctx); return NULL; } verbs_set_ops(&ctx->base_ctx, &siw_context_ops); ctx->dev_id = resp.dev_id; return &ctx->base_ctx; } static void siw_free_context(struct ibv_context *ibv_ctx) { struct siw_context *ctx = ctx_ibv2siw(ibv_ctx); verbs_uninit_context(&ctx->base_ctx); free(ctx); } static struct verbs_device *siw_device_alloc(struct verbs_sysfs_dev *unused) { struct siw_device *dev; dev = calloc(1, sizeof(*dev)); if (!dev) return NULL; return &dev->base_dev; } static void siw_device_free(struct verbs_device *vdev) { struct siw_device *dev = container_of(vdev, struct siw_device, base_dev); free(dev); } static const struct verbs_match_ent rnic_table[] = { VERBS_DRIVER_ID(RDMA_DRIVER_SIW), {}, }; static const struct verbs_device_ops siw_dev_ops = { .name = "siw", .match_min_abi_version = SIW_ABI_VERSION, .match_max_abi_version = SIW_ABI_VERSION, .match_table = rnic_table, .alloc_device = siw_device_alloc, .uninit_device = siw_device_free, .alloc_context = siw_alloc_context, }; PROVIDER_DRIVER(siw, siw_dev_ops); rdma-core-56.1/providers/siw/siw.h000066400000000000000000000040761477342711600171330ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or BSD-3-Clause */ /* Authors: Bernard Metzler */ /* Copyright (c) 2008-2019, IBM Corporation */ #ifndef _SIW_H #define _SIW_H #include #include #include #include #include struct siw_device { struct verbs_device base_dev; }; struct siw_srq { struct ibv_srq base_srq; struct siw_rqe *recvq; uint32_t rq_put; uint32_t num_rqe; pthread_spinlock_t lock; }; 
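/*
 * Ownership handshake for the mmapped work queues (a sketch inferred
 * from siw_post_send()/siw_post_recv() in siw.c, not a separately
 * documented contract): sq_put/rq_put are free-running counters and a
 * slot is addressed as put % num_sqe (or num_rqe). The provider may
 * only fill a slot whose SIW_WQE_VALID flag is clear, publishes the
 * WQE by atomically setting that flag, and treats a still-set flag as
 * queue overflow.
 */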
struct siw_mr { struct verbs_mr base_mr; }; struct siw_qp { struct ibv_qp base_qp; struct siw_device *siw_dev; uint32_t id; pthread_spinlock_t sq_lock; pthread_spinlock_t rq_lock; struct ibv_post_send db_req; struct ib_uverbs_post_send_resp db_resp; uint32_t num_sqe; uint32_t sq_put; int sq_sig_all; struct siw_sqe *sendq; uint32_t num_rqe; uint32_t rq_put; struct siw_rqe *recvq; struct siw_srq *srq; }; struct siw_cq { struct ibv_cq base_cq; struct siw_device *siw_dev; uint32_t id; /* Points to kernel shared control * object at the end of CQE array */ struct siw_cq_ctrl *ctrl; int num_cqe; uint32_t cq_get; struct siw_cqe *queue; pthread_spinlock_t lock; }; struct siw_context { struct verbs_context base_ctx; uint32_t dev_id; }; static inline struct siw_context *ctx_ibv2siw(struct ibv_context *base) { return container_of(base, struct siw_context, base_ctx.context); } static inline struct siw_qp *qp_base2siw(struct ibv_qp *base) { return container_of(base, struct siw_qp, base_qp); } static inline struct siw_cq *cq_base2siw(struct ibv_cq *base) { return container_of(base, struct siw_cq, base_cq); } static inline struct siw_mr *mr_base2siw(struct verbs_mr *base) { return container_of(base, struct siw_mr, base_mr); } static inline struct siw_srq *srq_base2siw(struct ibv_srq *base) { return container_of(base, struct siw_srq, base_srq); } static inline int siw_db(struct siw_qp *qp) { int rv = write(qp->base_qp.context->cmd_fd, &qp->db_req, sizeof(qp->db_req)); return rv == sizeof(qp->db_req) ? 0 : rv; } #endif /* _SIW_H */ rdma-core-56.1/providers/siw/siw_abi.h000066400000000000000000000014351477342711600177420ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 or BSD-3-Clause */ /* Authors: Bernard Metzler */ /* Copyright (c) 2008-2019, IBM Corporation */ #ifndef _SIW_ABI_H #define _SIW_ABI_H #include #include #include DECLARE_DRV_CMD(siw_cmd_alloc_context, IB_USER_VERBS_CMD_GET_CONTEXT, empty, siw_uresp_alloc_ctx); DECLARE_DRV_CMD(siw_cmd_create_cq, IB_USER_VERBS_CMD_CREATE_CQ, empty, siw_uresp_create_cq); DECLARE_DRV_CMD(siw_cmd_create_srq, IB_USER_VERBS_CMD_CREATE_SRQ, empty, siw_uresp_create_srq); DECLARE_DRV_CMD(siw_cmd_create_qp, IB_USER_VERBS_CMD_CREATE_QP, empty, siw_uresp_create_qp); DECLARE_DRV_CMD(siw_cmd_reg_mr, IB_USER_VERBS_CMD_REG_MR, siw_ureq_reg_mr, siw_uresp_reg_mr); #endif /* _SIW_ABI_H */ rdma-core-56.1/providers/vmw_pvrdma/000077500000000000000000000000001477342711600175315ustar00rootroot00000000000000rdma-core-56.1/providers/vmw_pvrdma/CMakeLists.txt000066400000000000000000000001031477342711600222630ustar00rootroot00000000000000rdma_provider(vmw_pvrdma cq.c pvrdma_main.c qp.c verbs.c ) rdma-core-56.1/providers/vmw_pvrdma/cq.c000066400000000000000000000162341477342711600203060ustar00rootroot00000000000000/* * Copyright (c) 2012-2016 VMware, Inc. All rights reserved. * * This program is free software; you can redistribute it and/or * modify it under the terms of EITHER the GNU General Public License * version 2 as published by the Free Software Foundation or the BSD * 2-Clause License. This program is distributed in the hope that it * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. * See the GNU General Public License version 2 for more details at * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html. * * You should have received a copy of the GNU General Public License * along with this program available in the file COPYING in the main * directory of this source tree. 
* * The BSD 2-Clause License * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED * OF THE POSSIBILITY OF SUCH DAMAGE. */ #include #include "pvrdma.h" enum { CQ_OK = 0, CQ_EMPTY = -1, CQ_POLL_ERR = -2, }; enum { PVRDMA_CQE_IS_SEND_MASK = 0x40, PVRDMA_CQE_OPCODE_MASK = 0x1f, }; int pvrdma_alloc_cq_buf(struct pvrdma_device *dev, struct pvrdma_cq *cq, struct pvrdma_buf *buf, int entries) { if (pvrdma_alloc_buf(buf, cq->offset + entries * (sizeof(struct pvrdma_cqe)), dev->page_size)) return -1; memset(buf->buf, 0, buf->length); return 0; } static struct pvrdma_cqe *get_cqe(struct pvrdma_cq *cq, int entry) { return cq->buf.buf + cq->offset + entry * (sizeof(struct pvrdma_cqe)); } static int pvrdma_poll_one(struct pvrdma_cq *cq, struct pvrdma_qp **cur_qp, struct ibv_wc *wc) { struct pvrdma_context *ctx = to_vctx(cq->ibv_cq.context); int has_data; unsigned int head; int tried = 0; struct pvrdma_cqe *cqe; retry: has_data = pvrdma_idx_ring_has_data(&cq->ring_state->rx, cq->cqe_cnt, &head); if (has_data == 0) { unsigned int val; if (tried) return CQ_EMPTY; /* Pass down POLL to give physical HCA a chance to poll. */ val = cq->cqn | PVRDMA_UAR_CQ_POLL; pvrdma_write_uar_cq(ctx->uar, val); tried = 1; goto retry; } else if (has_data == -1) { return CQ_POLL_ERR; } cqe = get_cqe(cq, head); if (!cqe) return CQ_EMPTY; udma_from_device_barrier(); if (ctx->qp_tbl[cqe->qp & 0xFFFF]) *cur_qp = (struct pvrdma_qp *)ctx->qp_tbl[cqe->qp & 0xFFFF]; else return CQ_POLL_ERR; wc->opcode = pvrdma_wc_opcode_to_ibv(cqe->opcode); wc->status = pvrdma_wc_status_to_ibv(cqe->status); wc->wr_id = cqe->wr_id; wc->qp_num = (*cur_qp)->ibv_qp.qp_num; wc->byte_len = cqe->byte_len; wc->imm_data = cqe->imm_data; wc->src_qp = cqe->src_qp; wc->wc_flags = cqe->wc_flags; wc->pkey_index = cqe->pkey_index; wc->slid = cqe->slid; wc->sl = cqe->sl; wc->dlid_path_bits = cqe->dlid_path_bits; wc->vendor_err = 0; /* Update shared ring state. 
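 * Only the consumer head (rx.cons_head) is advanced by the provider;
 * the producer tail is written by the device side as it generates
 * CQEs, which is why pvrdma_cq_clean_int() below only ever reads
 * rx.prod_tail.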
*/ pvrdma_idx_ring_inc(&(cq->ring_state->rx.cons_head), cq->cqe_cnt); return CQ_OK; } int pvrdma_poll_cq(struct ibv_cq *ibcq, int num_entries, struct ibv_wc *wc) { struct pvrdma_cq *cq = to_vcq(ibcq); struct pvrdma_qp *qp; int npolled = 0; if (num_entries < 1 || wc == NULL) return 0; pthread_spin_lock(&cq->lock); for (npolled = 0; npolled < num_entries; ++npolled) { if (pvrdma_poll_one(cq, &qp, wc + npolled) != CQ_OK) break; } pthread_spin_unlock(&cq->lock); return npolled; } void pvrdma_cq_clean_int(struct pvrdma_cq *cq, uint32_t qp_handle) { /* Flush CQEs from specified QP */ int has_data; unsigned int head; /* Lock held */ has_data = pvrdma_idx_ring_has_data(&cq->ring_state->rx, cq->cqe_cnt, &head); if (unlikely(has_data > 0)) { int items; int curr; int tail = pvrdma_idx(&cq->ring_state->rx.prod_tail, cq->cqe_cnt); struct pvrdma_cqe *cqe; struct pvrdma_cqe *curr_cqe; items = (tail > head) ? (tail - head) : (cq->cqe_cnt - head + tail); curr = --tail; while (items-- > 0) { if (curr < 0) curr = cq->cqe_cnt - 1; if (tail < 0) tail = cq->cqe_cnt - 1; curr_cqe = get_cqe(cq, curr); udma_from_device_barrier(); if ((curr_cqe->qp & 0xFFFF) != qp_handle) { if (curr != tail) { cqe = get_cqe(cq, tail); udma_from_device_barrier(); *cqe = *curr_cqe; } tail--; } else { pvrdma_idx_ring_inc( &cq->ring_state->rx.cons_head, cq->cqe_cnt); } curr--; } } } void pvrdma_cq_clean(struct pvrdma_cq *cq, uint32_t qp_handle) { pthread_spin_lock(&cq->lock); pvrdma_cq_clean_int(cq, qp_handle); pthread_spin_unlock(&cq->lock); } struct ibv_cq *pvrdma_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector) { struct pvrdma_device *dev = to_vdev(context->device); struct user_pvrdma_create_cq cmd; struct user_pvrdma_create_cq_resp resp; struct pvrdma_cq *cq; int ret; if (cqe < 1) return NULL; cq = malloc(sizeof(*cq)); if (!cq) return NULL; /* Extra page for shared ring state */ cq->offset = dev->page_size; if (pthread_spin_init(&cq->lock, PTHREAD_PROCESS_PRIVATE)) goto err; cqe = align_next_power2(cqe); if (pvrdma_alloc_cq_buf(dev, cq, &cq->buf, cqe)) goto err; cq->ring_state = cq->buf.buf; cmd.buf_addr = (uintptr_t) cq->buf.buf; cmd.buf_size = cq->buf.length; ret = ibv_cmd_create_cq(context, cqe, channel, comp_vector, &cq->ibv_cq, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (ret) goto err_buf; cq->cqn = resp.cqn; cq->cqe_cnt = cq->ibv_cq.cqe; return &cq->ibv_cq; err_buf: pvrdma_free_buf(&cq->buf); err: free(cq); return NULL; } int pvrdma_destroy_cq(struct ibv_cq *cq) { int ret; ret = ibv_cmd_destroy_cq(cq); if (ret) return ret; pvrdma_free_buf(&to_vcq(cq)->buf); free(to_vcq(cq)); return 0; } int pvrdma_req_notify_cq(struct ibv_cq *ibcq, int solicited) { struct pvrdma_context *ctx = to_vctx(ibcq->context); struct pvrdma_cq *cq = to_vcq(ibcq); unsigned int val = cq->cqn; val |= solicited ? PVRDMA_UAR_CQ_ARM_SOL : PVRDMA_UAR_CQ_ARM; pvrdma_write_uar_cq(ctx->uar, val); return 0; } rdma-core-56.1/providers/vmw_pvrdma/pvrdma-abi.h000066400000000000000000000055701477342711600217330ustar00rootroot00000000000000/* * Copyright (c) 2012-2016 VMware, Inc. All rights reserved. * * This program is free software; you can redistribute it and/or * modify it under the terms of EITHER the GNU General Public License * version 2 as published by the Free Software Foundation or the BSD * 2-Clause License. This program is distributed in the hope that it * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
* See the GNU General Public License version 2 for more details at * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html. * * You should have received a copy of the GNU General Public License * along with this program available in the file COPYING in the main * directory of this source tree. * * The BSD 2-Clause License * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED * OF THE POSSIBILITY OF SUCH DAMAGE. */ #ifndef __PVRDMA_ABI_FIX_H__ #define __PVRDMA_ABI_FIX_H__ #include #include #include DECLARE_DRV_CMD(user_pvrdma_alloc_pd, IB_USER_VERBS_CMD_ALLOC_PD, empty, pvrdma_alloc_pd_resp); DECLARE_DRV_CMD(user_pvrdma_create_cq, IB_USER_VERBS_CMD_CREATE_CQ, pvrdma_create_cq, pvrdma_create_cq_resp); DECLARE_DRV_CMD(user_pvrdma_create_qp, IB_USER_VERBS_CMD_CREATE_QP, pvrdma_create_qp, pvrdma_create_qp_resp); DECLARE_DRV_CMD(user_pvrdma_create_srq, IB_USER_VERBS_CMD_CREATE_SRQ, pvrdma_create_srq, pvrdma_create_srq_resp); DECLARE_DRV_CMD(user_pvrdma_alloc_ucontext, IB_USER_VERBS_CMD_GET_CONTEXT, empty, pvrdma_alloc_ucontext_resp); #endif /* __PVRDMA_ABI_FIX_H__ */ rdma-core-56.1/providers/vmw_pvrdma/pvrdma.h000066400000000000000000000242721477342711600212020ustar00rootroot00000000000000/* * Copyright (c) 2012-2016 VMware, Inc. All rights reserved. * * This program is free software; you can redistribute it and/or * modify it under the terms of EITHER the GNU General Public License * version 2 as published by the Free Software Foundation or the BSD * 2-Clause License. This program is distributed in the hope that it * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. * See the GNU General Public License version 2 for more details at * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html. * * You should have received a copy of the GNU General Public License * along with this program available in the file COPYING in the main * directory of this source tree. * * The BSD 2-Clause License * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. 
* * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED * OF THE POSSIBILITY OF SUCH DAMAGE. */ #ifndef __PVRDMA_H__ #define __PVRDMA_H__ #include #include #include #include #include #include #include #include #include #include #include #include "pvrdma-abi.h" #include "pvrdma_ring.h" #define PFX "pvrdma: " enum { PVRDMA_OPCODE_NOP = 0x00, PVRDMA_OPCODE_SEND_INVAL = 0x01, PVRDMA_OPCODE_RDMA_WRITE = 0x08, PVRDMA_OPCODE_RDMA_WRITE_IMM = 0x09, PVRDMA_OPCODE_SEND = 0x0a, PVRDMA_OPCODE_SEND_IMM = 0x0b, PVRDMA_OPCODE_LSO = 0x0e, PVRDMA_OPCODE_RDMA_READ = 0x10, PVRDMA_OPCODE_ATOMIC_CS = 0x11, PVRDMA_OPCODE_ATOMIC_FA = 0x12, PVRDMA_OPCODE_ATOMIC_MASK_CS = 0x14, PVRDMA_OPCODE_ATOMIC_MASK_FA = 0x15, PVRDMA_OPCODE_BIND_MW = 0x18, PVRDMA_OPCODE_FMR = 0x19, PVRDMA_OPCODE_LOCAL_INVAL = 0x1b, PVRDMA_OPCODE_CONFIG_CMD = 0x1f, PVRDMA_RECV_OPCODE_RDMA_WRITE_IMM = 0x00, PVRDMA_RECV_OPCODE_SEND = 0x01, PVRDMA_RECV_OPCODE_SEND_IMM = 0x02, PVRDMA_RECV_OPCODE_SEND_INVAL = 0x03, PVRDMA_CQE_OPCODE_ERROR = 0x1e, PVRDMA_CQE_OPCODE_RESIZE = 0x16, }; enum { PVRDMA_WQE_CTRL_FENCE = 1 << 6, PVRDMA_WQE_CTRL_CQ_UPDATE = 3 << 2, PVRDMA_WQE_CTRL_SOLICIT = 1 << 1, }; struct pvrdma_device { struct verbs_device ibv_dev; int page_size; int abi_version; }; struct pvrdma_context { struct verbs_context ibv_ctx; void *uar; pthread_spinlock_t uar_lock; int max_qp_wr; int max_sge; int max_cqe; struct pvrdma_qp **qp_tbl; }; struct pvrdma_buf { void *buf; size_t length; }; struct pvrdma_pd { struct ibv_pd ibv_pd; uint32_t pdn; }; struct pvrdma_cq { struct ibv_cq ibv_cq; struct pvrdma_buf buf; struct pvrdma_buf resize_buf; pthread_spinlock_t lock; struct pvrdma_ring_state *ring_state; uint32_t cqe_cnt; uint32_t offset; uint32_t cqn; }; struct pvrdma_srq { struct ibv_srq ibv_srq; struct pvrdma_buf buf; pthread_spinlock_t lock; uint64_t *wrid; uint32_t srqn; int wqe_cnt; int wqe_size; int max_gs; int wqe_shift; struct pvrdma_ring_state *ring_state; uint16_t counter; int offset; }; struct pvrdma_wq { uint64_t *wrid; pthread_spinlock_t lock; int wqe_cnt; int wqe_size; struct pvrdma_ring *ring_state; int max_gs; int wqe_shift; int offset; }; struct pvrdma_qp { struct ibv_qp ibv_qp; struct pvrdma_buf rbuf; struct pvrdma_buf sbuf; int max_inline_data; int buf_size; __be32 sq_signal_bits; int sq_spare_wqes; struct pvrdma_wq sq; struct pvrdma_wq rq; int is_srq; uint32_t qp_handle; }; struct pvrdma_ah { struct ibv_ah ibv_ah; struct pvrdma_av av; }; static inline unsigned long align(unsigned long val, unsigned long align) { return (val + align - 1) & ~(align - 1); } static inline int align_next_power2(int size) { int val = 1; while (val < size) val <<= 1; 
return val; } static inline struct pvrdma_device *to_vdev(struct ibv_device *ibdev) { return container_of(ibdev, struct pvrdma_device, ibv_dev.device); } static inline struct pvrdma_context *to_vctx(struct ibv_context *ibctx) { return container_of(ibctx, struct pvrdma_context, ibv_ctx.context); } static inline struct pvrdma_pd *to_vpd(struct ibv_pd *ibpd) { return container_of(ibpd, struct pvrdma_pd, ibv_pd); } static inline struct pvrdma_cq *to_vcq(struct ibv_cq *ibcq) { return container_of(ibcq, struct pvrdma_cq, ibv_cq); } static inline struct pvrdma_srq *to_vsrq(struct ibv_srq *ibsrq) { return container_of(ibsrq, struct pvrdma_srq, ibv_srq); } static inline struct pvrdma_qp *to_vqp(struct ibv_qp *ibqp) { return container_of(ibqp, struct pvrdma_qp, ibv_qp); } static inline struct pvrdma_ah *to_vah(struct ibv_ah *ibah) { return container_of(ibah, struct pvrdma_ah, ibv_ah); } static inline void pvrdma_write_uar_qp(void *uar, unsigned value) { *(__le32 *)(uar + PVRDMA_UAR_QP_OFFSET) = htole32(value); } static inline void pvrdma_write_uar_cq(void *uar, unsigned value) { *(__le32 *)(uar + PVRDMA_UAR_CQ_OFFSET) = htole32(value); } static inline void pvrdma_write_uar_srq(void *uar, unsigned int value) { *(__le32 *)(uar + PVRDMA_UAR_SRQ_OFFSET) = htole32(value); } static inline int ibv_send_flags_to_pvrdma(int flags) { return flags; } static inline enum pvrdma_wr_opcode ibv_wr_opcode_to_pvrdma( enum ibv_wr_opcode op) { return (enum pvrdma_wr_opcode)op; } static inline enum ibv_wc_status pvrdma_wc_status_to_ibv( enum pvrdma_wc_status status) { return (enum ibv_wc_status)status; } static inline enum ibv_wc_opcode pvrdma_wc_opcode_to_ibv( enum pvrdma_wc_opcode op) { return (enum ibv_wc_opcode)op; } static inline int pvrdma_wc_flags_to_ibv(int flags) { return flags; } int pvrdma_alloc_buf(struct pvrdma_buf *buf, size_t size, int page_size); void pvrdma_free_buf(struct pvrdma_buf *buf); int pvrdma_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size); int pvrdma_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr); struct ibv_pd *pvrdma_alloc_pd(struct ibv_context *context); int pvrdma_free_pd(struct ibv_pd *pd); struct ibv_mr *pvrdma_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access); int pvrdma_dereg_mr(struct verbs_mr *mr); struct ibv_cq *pvrdma_create_cq(struct ibv_context *context, int cqe, struct ibv_comp_channel *channel, int comp_vector); int pvrdma_alloc_cq_buf(struct pvrdma_device *dev, struct pvrdma_cq *cq, struct pvrdma_buf *buf, int nent); int pvrdma_destroy_cq(struct ibv_cq *cq); int pvrdma_req_notify_cq(struct ibv_cq *cq, int solicited); int pvrdma_poll_cq(struct ibv_cq *cq, int ne, struct ibv_wc *wc); void pvrdma_cq_event(struct ibv_cq *cq); void pvrdma_cq_clean_int(struct pvrdma_cq *cq, uint32_t qp_handle); void pvrdma_cq_clean(struct pvrdma_cq *cq, uint32_t qp_handle); int pvrdma_get_outstanding_cqes(struct pvrdma_cq *cq); void pvrdma_cq_resize_copy_cqes(struct pvrdma_cq *cq, void *buf, int new_cqe); struct ibv_qp *pvrdma_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr); int pvrdma_query_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr); int pvrdma_modify_qp(struct ibv_qp *qp, struct ibv_qp_attr *attr, int attr_mask); int pvrdma_destroy_qp(struct ibv_qp *qp); void pvrdma_init_qp_indices(struct pvrdma_qp *qp); void pvrdma_qp_init_sq_ownership(struct pvrdma_qp *qp); int 
pvrdma_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr); int pvrdma_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); void pvrdma_calc_sq_wqe_size(struct ibv_qp_cap *cap, enum ibv_qp_type type, struct pvrdma_qp *qp); int pvrdma_alloc_qp_buf(struct pvrdma_device *dev, struct ibv_qp_cap *cap, enum ibv_qp_type type, struct pvrdma_qp *qp); void pvrdma_set_sq_sizes(struct pvrdma_qp *qp, struct ibv_qp_cap *cap, enum ibv_qp_type type); struct pvrdma_qp *pvrdma_find_qp(struct pvrdma_context *ctx, uint32_t qpn); int pvrdma_store_qp(struct pvrdma_context *ctx, uint32_t qpn, struct pvrdma_qp *qp); void pvrdma_clear_qp(struct pvrdma_context *ctx, uint32_t qpn); struct ibv_srq *pvrdma_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr); int pvrdma_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, int attr_mask); int pvrdma_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr); int pvrdma_destroy_srq(struct ibv_srq *srq); int pvrdma_alloc_srq_buf(struct pvrdma_device *dev, struct ibv_srq_attr *attr, struct pvrdma_srq *srq); int pvrdma_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr); void pvrdma_init_srq_queue(struct pvrdma_srq *srq); struct ibv_ah *pvrdma_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr); int pvrdma_destroy_ah(struct ibv_ah *ah); int pvrdma_alloc_av(struct pvrdma_pd *pd, struct ibv_ah_attr *attr, struct pvrdma_ah *ah); void pvrdma_free_av(struct pvrdma_ah *ah); #endif /* __PVRDMA_H__ */ rdma-core-56.1/providers/vmw_pvrdma/pvrdma_main.c000066400000000000000000000144441477342711600222010ustar00rootroot00000000000000/* * Copyright (c) 2012-2016 VMware, Inc. All rights reserved. * * This program is free software; you can redistribute it and/or * modify it under the terms of EITHER the GNU General Public License * version 2 as published by the Free Software Foundation or the BSD * 2-Clause License. This program is distributed in the hope that it * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. * See the GNU General Public License version 2 for more details at * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html. * * You should have received a copy of the GNU General Public License * along with this program available in the file COPYING in the main * directory of this source tree. * * The BSD 2-Clause License * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED * OF THE POSSIBILITY OF SUCH DAMAGE. */ #include "pvrdma.h" static void pvrdma_free_context(struct ibv_context *ibctx); /* * VMware PVRDMA vendor id and PCI device id. */ #define PCI_VENDOR_ID_VMWARE 0x15AD #define PCI_DEVICE_ID_VMWARE_PVRDMA 0x0820 static const struct verbs_context_ops pvrdma_ctx_ops = { .free_context = pvrdma_free_context, .query_device_ex = pvrdma_query_device, .query_port = pvrdma_query_port, .alloc_pd = pvrdma_alloc_pd, .dealloc_pd = pvrdma_free_pd, .reg_mr = pvrdma_reg_mr, .dereg_mr = pvrdma_dereg_mr, .create_cq = pvrdma_create_cq, .poll_cq = pvrdma_poll_cq, .req_notify_cq = pvrdma_req_notify_cq, .destroy_cq = pvrdma_destroy_cq, .create_qp = pvrdma_create_qp, .query_qp = pvrdma_query_qp, .modify_qp = pvrdma_modify_qp, .destroy_qp = pvrdma_destroy_qp, .create_srq = pvrdma_create_srq, .modify_srq = pvrdma_modify_srq, .query_srq = pvrdma_query_srq, .destroy_srq = pvrdma_destroy_srq, .post_srq_recv = pvrdma_post_srq_recv, .post_send = pvrdma_post_send, .post_recv = pvrdma_post_recv, .create_ah = pvrdma_create_ah, .destroy_ah = pvrdma_destroy_ah, }; int pvrdma_alloc_buf(struct pvrdma_buf *buf, size_t size, int page_size) { int ret; buf->length = align(size, page_size); buf->buf = mmap(NULL, buf->length, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); if (buf->buf == MAP_FAILED) return errno; ret = ibv_dontfork_range(buf->buf, size); if (ret) munmap(buf->buf, buf->length); return ret; } void pvrdma_free_buf(struct pvrdma_buf *buf) { ibv_dofork_range(buf->buf, buf->length); munmap(buf->buf, buf->length); } static int pvrdma_init_context_shared(struct pvrdma_context *context, struct ibv_device *ibdev, int cmd_fd) { struct ibv_get_context cmd; struct user_pvrdma_alloc_ucontext_resp resp = {}; context->ibv_ctx.context.cmd_fd = cmd_fd; if (ibv_cmd_get_context(&context->ibv_ctx, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) return errno; context->qp_tbl = calloc(resp.qp_tab_size & 0xFFFF, sizeof(struct pvrdma_qp *)); if (!context->qp_tbl) return -ENOMEM; context->uar = mmap(NULL, to_vdev(ibdev)->page_size, PROT_WRITE, MAP_SHARED, cmd_fd, 0); if (context->uar == MAP_FAILED) { free(context->qp_tbl); return errno; } pthread_spin_init(&context->uar_lock, PTHREAD_PROCESS_PRIVATE); verbs_set_ops(&context->ibv_ctx, &pvrdma_ctx_ops); return 0; } static void pvrdma_free_context_shared(struct pvrdma_context *context, struct pvrdma_device *dev) { munmap(context->uar, dev->page_size); free(context->qp_tbl); } static struct verbs_context *pvrdma_alloc_context(struct ibv_device *ibdev, int cmd_fd, void *private_data) { struct pvrdma_context *context; context = verbs_init_and_alloc_context(ibdev, cmd_fd, context, ibv_ctx, RDMA_DRIVER_VMW_PVRDMA); if (!context) return NULL; if (pvrdma_init_context_shared(context, ibdev, cmd_fd)) { verbs_uninit_context(&context->ibv_ctx); free(context); return NULL; } return &context->ibv_ctx; } static void pvrdma_free_context(struct ibv_context *ibctx) { struct pvrdma_context *context = to_vctx(ibctx); pvrdma_free_context_shared(context, 
to_vdev(ibctx->device)); verbs_uninit_context(&context->ibv_ctx); free(context); } static void pvrdma_uninit_device(struct verbs_device *verbs_device) { struct pvrdma_device *dev = to_vdev(&verbs_device->device); free(dev); } static struct verbs_device * pvrdma_device_alloc(struct verbs_sysfs_dev *sysfs_dev) { struct pvrdma_device *dev; dev = calloc(1, sizeof(*dev)); if (!dev) return NULL; dev->abi_version = sysfs_dev->abi_ver; dev->page_size = sysconf(_SC_PAGESIZE); return &dev->ibv_dev; } static const struct verbs_match_ent hca_table[] = { VERBS_DRIVER_ID(RDMA_DRIVER_VMW_PVRDMA), VERBS_PCI_MATCH(PCI_VENDOR_ID_VMWARE, PCI_DEVICE_ID_VMWARE_PVRDMA, NULL), {} }; static const struct verbs_device_ops pvrdma_dev_ops = { .name = "pvrdma", .match_min_abi_version = PVRDMA_UVERBS_ABI_VERSION, .match_max_abi_version = PVRDMA_UVERBS_ABI_VERSION, .match_table = hca_table, .alloc_device = pvrdma_device_alloc, .uninit_device = pvrdma_uninit_device, .alloc_context = pvrdma_alloc_context, }; PROVIDER_DRIVER(vmw_pvrdma, pvrdma_dev_ops); rdma-core-56.1/providers/vmw_pvrdma/pvrdma_ring.h000066400000000000000000000110751477342711600222160ustar00rootroot00000000000000/* * Copyright (c) 2012-2016 VMware, Inc. All rights reserved. * * This program is free software; you can redistribute it and/or * modify it under the terms of EITHER the GNU General Public License * version 2 as published by the Free Software Foundation or the BSD * 2-Clause License. This program is distributed in the hope that it * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. * See the GNU General Public License version 2 for more details at * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html. * * You should have received a copy of the GNU General Public License * along with this program in the file COPYING. If not, write to the * Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, * Boston, MA 02110-1301, USA. * * The BSD 2-Clause License * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED * OF THE POSSIBILITY OF SUCH DAMAGE. */ #ifndef __PVRDMA_RING_H__ #define __PVRDMA_RING_H__ #include <stdint.h> #define PVRDMA_INVALID_IDX -1 /* Invalid index. */ /* * Rings are shared with the device, so read/write access must be atomic. 
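 * (As an illustration of the index scheme implemented below, assume
 * max_elems = 4: a valid index spans the doubled range 0..7, the slot is
 * idx & 3, and bit 2 acts as a wrap/generation flag -- the ring is empty
 * when prod_tail == cons_head and full when
 * prod_tail == (cons_head ^ max_elems).)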
* PVRDMA is x86 only, and since 32-bit access is atomic on x86, using * regular uint32_t is safe. */ struct pvrdma_ring { uint32_t prod_tail; /* Producer tail. */ uint32_t cons_head; /* Consumer head. */ }; struct pvrdma_ring_state { struct pvrdma_ring tx; /* Tx ring. */ struct pvrdma_ring rx; /* Rx ring. */ }; static inline int pvrdma_idx_valid(uint32_t idx, uint32_t max_elems) { /* Generates fewer instructions than a less-than. */ return (idx & ~((max_elems << 1) - 1)) == 0; } static inline int32_t pvrdma_idx(uint32_t *var, uint32_t max_elems) { const uint32_t idx = *var; if (pvrdma_idx_valid(idx, max_elems)) return idx & (max_elems - 1); return PVRDMA_INVALID_IDX; } static inline void pvrdma_idx_ring_inc(uint32_t *var, uint32_t max_elems) { uint32_t idx = (*var) + 1; /* Increment. */ idx &= (max_elems << 1) - 1; /* Modulo size, flip gen. */ *var = idx; } static inline int32_t pvrdma_idx_ring_has_space(const struct pvrdma_ring *r, uint32_t max_elems, uint32_t *out_tail) { const uint32_t tail = r->prod_tail; const uint32_t head = r->cons_head; if (pvrdma_idx_valid(tail, max_elems) && pvrdma_idx_valid(head, max_elems)) { *out_tail = tail & (max_elems - 1); return tail != (head ^ max_elems); } return PVRDMA_INVALID_IDX; } static inline int32_t pvrdma_idx_ring_has_data(const struct pvrdma_ring *r, uint32_t max_elems, uint32_t *out_head) { const uint32_t tail = r->prod_tail; const uint32_t head = r->cons_head; if (pvrdma_idx_valid(tail, max_elems) && pvrdma_idx_valid(head, max_elems)) { *out_head = head & (max_elems - 1); return tail != head; } return PVRDMA_INVALID_IDX; } static inline int32_t pvrdma_idx_ring_is_valid_idx(const struct pvrdma_ring *r, uint32_t max_elems, uint32_t *idx) { const uint32_t tail = r->prod_tail; const uint32_t head = r->cons_head; if (pvrdma_idx_valid(tail, max_elems) && pvrdma_idx_valid(head, max_elems) && pvrdma_idx_valid(*idx, max_elems)) { if (tail > head && (*idx < tail && *idx >= head)) return 1; else if (head > tail && (*idx >= head || *idx < tail)) return 1; } return 0; } #endif /* __PVRDMA_RING_H__ */ rdma-core-56.1/providers/vmw_pvrdma/qp.c000066400000000000000000000434651477342711600203310ustar00rootroot00000000000000/* * Copyright (c) 2012-2017 VMware, Inc. All rights reserved. * * This program is free software; you can redistribute it and/or * modify it under the terms of EITHER the GNU General Public License * version 2 as published by the Free Software Foundation or the BSD * 2-Clause License. This program is distributed in the hope that it * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. * See the GNU General Public License version 2 for more details at * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html. * * You should have received a copy of the GNU General Public License * along with this program available in the file COPYING in the main * directory of this source tree. * * The BSD 2-Clause License * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED * OF THE POSSIBILITY OF SUCH DAMAGE. */ #include #include "pvrdma.h" int pvrdma_alloc_qp_buf(struct pvrdma_device *dev, struct ibv_qp_cap *cap, enum ibv_qp_type type, struct pvrdma_qp *qp) { qp->sq.wrid = calloc(qp->sq.wqe_cnt, sizeof(uint64_t)); if (!qp->sq.wrid) return -1; /* Align page size for sq */ qp->sbuf.length = align(qp->sq.offset + qp->sq.wqe_cnt * qp->sq.wqe_size, dev->page_size); if (pvrdma_alloc_buf(&qp->sbuf, qp->sbuf.length, dev->page_size)) { free(qp->sq.wrid); return -1; } memset(qp->sbuf.buf, 0, qp->sbuf.length); if (!qp->is_srq) { qp->rq.wrid = calloc(qp->rq.wqe_cnt, sizeof(uint64_t)); if (!qp->rq.wrid) { pvrdma_free_buf(&qp->sbuf); free(qp->sq.wrid); return -1; } /* Align page size for rq */ qp->rbuf.length = align(qp->rq.offset + qp->rq.wqe_cnt * qp->rq.wqe_size, dev->page_size); if (pvrdma_alloc_buf(&qp->rbuf, qp->rbuf.length, dev->page_size)) { free(qp->sq.wrid); free(qp->rq.wrid); pvrdma_free_buf(&qp->sbuf); return -1; } memset(qp->rbuf.buf, 0, qp->rbuf.length); } else { qp->rbuf.buf = NULL; qp->rbuf.length = 0; } qp->buf_size = qp->rbuf.length + qp->sbuf.length; return 0; } void pvrdma_init_srq_queue(struct pvrdma_srq *srq) { srq->ring_state->rx.cons_head = 0; srq->ring_state->rx.prod_tail = 0; } struct ibv_srq *pvrdma_create_srq(struct ibv_pd *pd, struct ibv_srq_init_attr *attr) { struct pvrdma_device *dev = to_vdev(pd->context->device); struct user_pvrdma_create_srq cmd; struct user_pvrdma_create_srq_resp resp = {}; struct pvrdma_srq *srq; int ret; attr->attr.max_wr = align_next_power2(max_t(uint32_t, 1U, attr->attr.max_wr)); attr->attr.max_sge = max_t(uint32_t, 1U, attr->attr.max_sge); srq = malloc(sizeof(*srq)); if (!srq) return NULL; if (pthread_spin_init(&srq->lock, PTHREAD_PROCESS_PRIVATE)) goto err; srq->wqe_cnt = attr->attr.max_wr; srq->max_gs = attr->attr.max_sge; srq->wqe_size = align_next_power2(sizeof(struct pvrdma_rq_wqe_hdr) + sizeof(struct ibv_sge) * srq->max_gs); /* Page reserved for queue metadata */ srq->offset = dev->page_size; if (pvrdma_alloc_srq_buf(dev, &attr->attr, srq)) goto err_spinlock; srq->ring_state = srq->buf.buf; pvrdma_init_srq_queue(srq); memset(&cmd, 0, sizeof(cmd)); cmd.buf_addr = (uintptr_t) srq->buf.buf; cmd.buf_size = srq->buf.length; ret = ibv_cmd_create_srq(pd, &srq->ibv_srq, attr, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp)); if (ret) goto err_free; srq->srqn = resp.srqn; return &srq->ibv_srq; err_free: free(srq->wrid); pvrdma_free_buf(&srq->buf); err_spinlock: pthread_spin_destroy(&srq->lock); err: free(srq); return NULL; } int pvrdma_modify_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr, int attr_mask) { struct ibv_modify_srq cmd; return ibv_cmd_modify_srq(srq, attr, attr_mask, &cmd, sizeof(cmd)); } int pvrdma_query_srq(struct ibv_srq *srq, struct ibv_srq_attr *attr) { 
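/* Plain uverbs round trip: the kernel fills attr (max_wr, max_sge,
	 * srq_limit); no driver-private response data is needed here. */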
struct ibv_query_srq cmd; return ibv_cmd_query_srq(srq, attr, &cmd, sizeof(cmd)); } int pvrdma_destroy_srq(struct ibv_srq *ibsrq) { struct pvrdma_srq *srq = to_vsrq(ibsrq); int ret; ret = ibv_cmd_destroy_srq(ibsrq); if (ret) return ret; pthread_spin_destroy(&srq->lock); pvrdma_free_buf(&srq->buf); free(srq->wrid); free(srq); return 0; } static void pvrdma_init_qp_queue(struct pvrdma_qp *qp) { qp->sq.ring_state->cons_head = 0; qp->sq.ring_state->prod_tail = 0; if (qp->rq.ring_state) { qp->rq.ring_state->cons_head = 0; qp->rq.ring_state->prod_tail = 0; } } struct ibv_qp *pvrdma_create_qp(struct ibv_pd *pd, struct ibv_qp_init_attr *attr) { struct pvrdma_device *dev = to_vdev(pd->context->device); struct user_pvrdma_create_qp cmd; struct user_pvrdma_create_qp_resp resp = {}; struct pvrdma_qp *qp; int is_srq = !!(attr->srq); attr->cap.max_send_sge = max_t(uint32_t, 1U, attr->cap.max_send_sge); attr->cap.max_send_wr = align_next_power2(max_t(uint32_t, 1U, attr->cap.max_send_wr)); if (!is_srq) { attr->cap.max_recv_sge = max_t(uint32_t, 1U, attr->cap.max_recv_sge); attr->cap.max_recv_wr = align_next_power2(max_t(uint32_t, 1U, attr->cap.max_recv_wr)); } else { attr->cap.max_recv_sge = 0; attr->cap.max_recv_wr = 0; } qp = calloc(1, sizeof(*qp)); if (!qp) return NULL; qp->is_srq = is_srq; qp->sq.max_gs = attr->cap.max_send_sge; qp->sq.wqe_cnt = attr->cap.max_send_wr; /* Extra page for shared ring state */ qp->sq.offset = dev->page_size; qp->sq.wqe_size = align_next_power2(sizeof(struct pvrdma_sq_wqe_hdr) + sizeof(struct ibv_sge) * qp->sq.max_gs); if (!is_srq) { qp->rq.max_gs = attr->cap.max_recv_sge; qp->rq.wqe_cnt = attr->cap.max_recv_wr; qp->rq.offset = 0; qp->rq.wqe_size = align_next_power2(sizeof(struct pvrdma_rq_wqe_hdr) + sizeof(struct ibv_sge) * qp->rq.max_gs); } else { qp->rq.max_gs = 0; qp->rq.wqe_cnt = 0; qp->rq.offset = 0; qp->rq.wqe_size = 0; } /* Allocate [rq][sq] memory */ if (pvrdma_alloc_qp_buf(dev, &attr->cap, attr->qp_type, qp)) goto err; qp->sq.ring_state = qp->sbuf.buf; if (pthread_spin_init(&qp->sq.lock, PTHREAD_PROCESS_PRIVATE)) goto err_free; if (!is_srq) { qp->rq.ring_state = (struct pvrdma_ring *)&qp->sq.ring_state[1]; if (pthread_spin_init(&qp->rq.lock, PTHREAD_PROCESS_PRIVATE)) goto err_free; } else { qp->rq.ring_state = NULL; } pvrdma_init_qp_queue(qp); memset(&cmd, 0, sizeof(cmd)); cmd.sbuf_addr = (uintptr_t)qp->sbuf.buf; cmd.sbuf_size = qp->sbuf.length; cmd.rbuf_addr = (uintptr_t)qp->rbuf.buf; cmd.rbuf_size = qp->rbuf.length; cmd.qp_addr = (uintptr_t) qp; if (ibv_cmd_create_qp(pd, &qp->ibv_qp, attr, &cmd.ibv_cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) goto err_free; if (resp.drv_payload.qp_handle != 0) qp->qp_handle = resp.drv_payload.qp_handle; else qp->qp_handle = qp->ibv_qp.qp_num; to_vctx(pd->context)->qp_tbl[qp->qp_handle & 0xFFFF] = qp; /* If set, each WR submitted to the SQ generate a completion entry */ if (attr->sq_sig_all) qp->sq_signal_bits = htobe32(PVRDMA_WQE_CTRL_CQ_UPDATE); else qp->sq_signal_bits = 0; return &qp->ibv_qp; err_free: if (qp->sq.wqe_cnt) free(qp->sq.wrid); if (qp->rq.wqe_cnt) free(qp->rq.wrid); pvrdma_free_buf(&qp->rbuf); pvrdma_free_buf(&qp->sbuf); err: free(qp); return NULL; } int pvrdma_query_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask, struct ibv_qp_init_attr *init_attr) { struct ibv_query_qp cmd; struct pvrdma_qp *qp = to_vqp(ibqp); int ret; ret = ibv_cmd_query_qp(ibqp, attr, attr_mask, init_attr, &cmd, sizeof(cmd)); if (ret) return ret; /* Passing back */ init_attr->cap.max_send_wr = qp->sq.wqe_cnt; 
init_attr->cap.max_send_sge = qp->sq.max_gs; init_attr->cap.max_inline_data = qp->max_inline_data; attr->cap = init_attr->cap; return 0; } int pvrdma_modify_qp(struct ibv_qp *ibqp, struct ibv_qp_attr *attr, int attr_mask) { struct ibv_modify_qp cmd; struct pvrdma_qp *qp = to_vqp(ibqp); int ret; /* Sanity check */ if (!attr_mask) return 0; ret = ibv_cmd_modify_qp(ibqp, attr, attr_mask, &cmd, sizeof(cmd)); if (!ret && (attr_mask & IBV_QP_STATE) && attr->qp_state == IBV_QPS_RESET) { pvrdma_cq_clean(to_vcq(ibqp->recv_cq), qp->qp_handle); if (ibqp->send_cq != ibqp->recv_cq) pvrdma_cq_clean(to_vcq(ibqp->send_cq), qp->qp_handle); pvrdma_init_qp_queue(qp); } return ret; } static void pvrdma_lock_cqs(struct ibv_qp *qp) { struct pvrdma_cq *send_cq = to_vcq(qp->send_cq); struct pvrdma_cq *recv_cq = to_vcq(qp->recv_cq); if (send_cq == recv_cq) { pthread_spin_lock(&send_cq->lock); } else if (send_cq->cqn < recv_cq->cqn) { pthread_spin_lock(&send_cq->lock); pthread_spin_lock(&recv_cq->lock); } else { pthread_spin_lock(&recv_cq->lock); pthread_spin_lock(&send_cq->lock); } } static void pvrdma_unlock_cqs(struct ibv_qp *qp) { struct pvrdma_cq *send_cq = to_vcq(qp->send_cq); struct pvrdma_cq *recv_cq = to_vcq(qp->recv_cq); if (send_cq == recv_cq) { pthread_spin_unlock(&send_cq->lock); } else if (send_cq->cqn < recv_cq->cqn) { pthread_spin_unlock(&recv_cq->lock); pthread_spin_unlock(&send_cq->lock); } else { pthread_spin_unlock(&send_cq->lock); pthread_spin_unlock(&recv_cq->lock); } } int pvrdma_destroy_qp(struct ibv_qp *ibqp) { struct pvrdma_context *ctx = to_vctx(ibqp->context); struct pvrdma_qp *qp = to_vqp(ibqp); int ret; ret = ibv_cmd_destroy_qp(ibqp); if (ret) { return ret; } pvrdma_lock_cqs(ibqp); /* Dump cqs */ pvrdma_cq_clean_int(to_vcq(ibqp->recv_cq), qp->qp_handle); if (ibqp->send_cq != ibqp->recv_cq) pvrdma_cq_clean_int(to_vcq(ibqp->send_cq), qp->qp_handle); pvrdma_unlock_cqs(ibqp); free(qp->sq.wrid); free(qp->rq.wrid); pvrdma_free_buf(&qp->rbuf); pvrdma_free_buf(&qp->sbuf); ctx->qp_tbl[qp->qp_handle & 0xFFFF] = NULL; free(qp); return 0; } static void *get_srq_wqe(struct pvrdma_srq *srq, int n) { return srq->buf.buf + srq->offset + (n * srq->wqe_size); } static void *get_rq_wqe(struct pvrdma_qp *qp, int n) { return qp->rbuf.buf + qp->rq.offset + (n * qp->rq.wqe_size); } static void *get_sq_wqe(struct pvrdma_qp *qp, int n) { return qp->sbuf.buf + qp->sq.offset + (n * qp->sq.wqe_size); } int pvrdma_post_send(struct ibv_qp *ibqp, struct ibv_send_wr *wr, struct ibv_send_wr **bad_wr) { struct pvrdma_context *ctx = to_vctx(ibqp->context); struct pvrdma_qp *qp = to_vqp(ibqp); int ind; int nreq = 0; struct pvrdma_sq_wqe_hdr *wqe_hdr; struct ibv_sge *sge; int ret = 0; int i; /* * In states lower than RTS, we can fail immediately. In other states, * just post and let the device figure it out. 
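	 * (A send WR posted in RESET/INIT/RTR could never be executed, so
	 * rejecting it up front with EINVAL is cheaper than letting it be
	 * flushed with an error completion later.)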
*/ if (ibqp->state < IBV_QPS_RTS) { *bad_wr = wr; return EINVAL; } pthread_spin_lock(&qp->sq.lock); ind = pvrdma_idx(&(qp->sq.ring_state->prod_tail), qp->sq.wqe_cnt); if (ind < 0) { pthread_spin_unlock(&qp->sq.lock); *bad_wr = wr; return EINVAL; } for (nreq = 0; wr; ++nreq, wr = wr->next) { unsigned int tail; if (pvrdma_idx_ring_has_space(qp->sq.ring_state, qp->sq.wqe_cnt, &tail) <= 0) { ret = ENOMEM; *bad_wr = wr; goto out; } if (wr->num_sge > qp->sq.max_gs) { ret = EINVAL; *bad_wr = wr; goto out; } wqe_hdr = (struct pvrdma_sq_wqe_hdr *)get_sq_wqe(qp, ind); wqe_hdr->wr_id = wr->wr_id; wqe_hdr->num_sge = wr->num_sge; wqe_hdr->opcode = ibv_wr_opcode_to_pvrdma(wr->opcode); wqe_hdr->send_flags = ibv_send_flags_to_pvrdma(wr->send_flags); if (wr->opcode == IBV_WR_SEND_WITH_IMM || wr->opcode == IBV_WR_RDMA_WRITE_WITH_IMM) wqe_hdr->ex.imm_data = wr->imm_data; switch (ibqp->qp_type) { case IBV_QPT_UD: wqe_hdr->wr.ud.remote_qpn = wr->wr.ud.remote_qpn; wqe_hdr->wr.ud.remote_qkey = wr->wr.ud.remote_qkey; wqe_hdr->wr.ud.av = to_vah(wr->wr.ud.ah)->av; break; case IBV_QPT_RC: switch (wr->opcode) { case IBV_WR_RDMA_READ: case IBV_WR_RDMA_WRITE: case IBV_WR_RDMA_WRITE_WITH_IMM: wqe_hdr->wr.rdma.remote_addr = wr->wr.rdma.remote_addr; wqe_hdr->wr.rdma.rkey = wr->wr.rdma.rkey; break; case IBV_WR_ATOMIC_CMP_AND_SWP: case IBV_WR_ATOMIC_FETCH_AND_ADD: wqe_hdr->wr.atomic.remote_addr = wr->wr.atomic.remote_addr; wqe_hdr->wr.atomic.rkey = wr->wr.atomic.rkey; wqe_hdr->wr.atomic.compare_add = wr->wr.atomic.compare_add; if (wr->opcode == IBV_WR_ATOMIC_CMP_AND_SWP) wqe_hdr->wr.atomic.swap = wr->wr.atomic.swap; break; default: /* No extra segments required for sends */ break; } break; default: fprintf(stderr, PFX "invalid post send opcode\n"); ret = EINVAL; *bad_wr = wr; goto out; } /* Write each segment */ sge = (struct ibv_sge *)&wqe_hdr[1]; for (i = 0; i < wr->num_sge; i++) { sge->addr = wr->sg_list[i].addr; sge->length = wr->sg_list[i].length; sge->lkey = wr->sg_list[i].lkey; sge++; } udma_to_device_barrier(); pvrdma_idx_ring_inc(&(qp->sq.ring_state->prod_tail), qp->sq.wqe_cnt); qp->sq.wrid[ind] = wr->wr_id; ++ind; if (ind >= qp->sq.wqe_cnt) ind = 0; } out: if (nreq) { udma_to_device_barrier(); pvrdma_write_uar_qp(ctx->uar, PVRDMA_UAR_QP_SEND | qp->qp_handle); } pthread_spin_unlock(&qp->sq.lock); return ret; } int pvrdma_post_recv(struct ibv_qp *ibqp, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct pvrdma_context *ctx = to_vctx(ibqp->context); struct pvrdma_qp *qp = to_vqp(ibqp); struct pvrdma_rq_wqe_hdr *wqe_hdr; struct ibv_sge *sge; int nreq; int ind; int i; int ret = 0; if (qp->is_srq) return EINVAL; if (!wr || !bad_wr) return EINVAL; /* * In the RESET state, we can fail immediately. For other states, * just post and let the device figure it out. 
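	 * (Unlike sends, receive WRs may legally be posted as soon as the QP
	 * leaves RESET, i.e. from INIT onward, so only RESET is rejected.)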
*/ if (ibqp->state == IBV_QPS_RESET) { *bad_wr = wr; return EINVAL; } pthread_spin_lock(&qp->rq.lock); ind = pvrdma_idx(&(qp->rq.ring_state->prod_tail), qp->rq.wqe_cnt); if (ind < 0) { pthread_spin_unlock(&qp->rq.lock); *bad_wr = wr; return EINVAL; } for (nreq = 0; wr; ++nreq, wr = wr->next) { unsigned int tail; if (pvrdma_idx_ring_has_space(qp->rq.ring_state, qp->rq.wqe_cnt, &tail) <= 0) { ret = ENOMEM; *bad_wr = wr; goto out; } if (wr->num_sge > qp->rq.max_gs) { ret = EINVAL; *bad_wr = wr; goto out; } /* Fetch wqe */ wqe_hdr = (struct pvrdma_rq_wqe_hdr *)get_rq_wqe(qp, ind); wqe_hdr->wr_id = wr->wr_id; wqe_hdr->num_sge = wr->num_sge; sge = (struct ibv_sge *)(wqe_hdr + 1); for (i = 0; i < wr->num_sge; ++i) { sge->addr = (uint64_t)wr->sg_list[i].addr; sge->length = wr->sg_list[i].length; sge->lkey = wr->sg_list[i].lkey; sge++; } pvrdma_idx_ring_inc(&qp->rq.ring_state->prod_tail, qp->rq.wqe_cnt); qp->rq.wrid[ind] = wr->wr_id; ind = (ind + 1) & (qp->rq.wqe_cnt - 1); } out: if (nreq) pvrdma_write_uar_qp(ctx->uar, PVRDMA_UAR_QP_RECV | qp->qp_handle); pthread_spin_unlock(&qp->rq.lock); return ret; } int pvrdma_post_srq_recv(struct ibv_srq *ibsrq, struct ibv_recv_wr *wr, struct ibv_recv_wr **bad_wr) { struct pvrdma_context *ctx = to_vctx(ibsrq->context); struct pvrdma_srq *srq = to_vsrq(ibsrq); struct pvrdma_rq_wqe_hdr *wqe_hdr; struct ibv_sge *sge; int nreq; int ind; int i; int ret = 0; if (!wr || !bad_wr) return EINVAL; pthread_spin_lock(&srq->lock); ind = pvrdma_idx(&(srq->ring_state->rx.prod_tail), srq->wqe_cnt); if (ind < 0) { pthread_spin_unlock(&srq->lock); *bad_wr = wr; return EINVAL; } for (nreq = 0; wr; ++nreq, wr = wr->next) { unsigned int tail; if (pvrdma_idx_ring_has_space(&srq->ring_state->rx, srq->wqe_cnt, &tail) <= 0) { ret = ENOMEM; *bad_wr = wr; break; } if (wr->num_sge > srq->max_gs) { ret = EINVAL; *bad_wr = wr; break; } /* Fetch wqe */ wqe_hdr = (struct pvrdma_rq_wqe_hdr *)get_srq_wqe(srq, ind); wqe_hdr->wr_id = wr->wr_id; wqe_hdr->num_sge = wr->num_sge; sge = (struct ibv_sge *)(wqe_hdr + 1); for (i = 0; i < wr->num_sge; ++i) { sge->addr = (uint64_t)wr->sg_list[i].addr; sge->length = wr->sg_list[i].length; sge->lkey = wr->sg_list[i].lkey; sge++; } pvrdma_idx_ring_inc(&srq->ring_state->rx.prod_tail, srq->wqe_cnt); srq->wrid[ind] = wr->wr_id; ind = (ind + 1) & (srq->wqe_cnt - 1); } if (nreq) pvrdma_write_uar_srq(ctx->uar, PVRDMA_UAR_SRQ_RECV | srq->srqn); pthread_spin_unlock(&srq->lock); return ret; } int pvrdma_alloc_srq_buf(struct pvrdma_device *dev, struct ibv_srq_attr *attr, struct pvrdma_srq *srq) { srq->wrid = calloc(srq->wqe_cnt, sizeof(uint64_t)); if (!srq->wrid) return -1; srq->buf.length = align(srq->offset, dev->page_size); srq->buf.length += 2 * align(srq->wqe_cnt * srq->wqe_size, dev->page_size); if (pvrdma_alloc_buf(&srq->buf, srq->buf.length, dev->page_size)) { free(srq->wrid); return -1; } memset(srq->buf.buf, 0, srq->buf.length); return 0; } rdma-core-56.1/providers/vmw_pvrdma/verbs.c000066400000000000000000000144661477342711600210310ustar00rootroot00000000000000/* * Copyright (c) 2012-2016 VMware, Inc. All rights reserved. * * This program is free software; you can redistribute it and/or * modify it under the terms of EITHER the GNU General Public License * version 2 as published by the Free Software Foundation or the BSD * 2-Clause License. This program is distributed in the hope that it * will be useful, but WITHOUT ANY WARRANTY; WITHOUT EVEN THE IMPLIED * WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
* See the GNU General Public License version 2 for more details at * http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html. * * You should have received a copy of the GNU General Public License * along with this program available in the file COPYING in the main * directory of this source tree. * * The BSD 2-Clause License * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE * COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED * OF THE POSSIBILITY OF SUCH DAMAGE. */ #include <netinet/in.h> #include "pvrdma.h" int pvrdma_query_device(struct ibv_context *context, const struct ibv_query_device_ex_input *input, struct ibv_device_attr_ex *attr, size_t attr_size) { struct ib_uverbs_ex_query_device_resp resp; size_t resp_size = sizeof(resp); uint64_t raw_fw_ver; unsigned major, minor, sub_minor; int ret; ret = ibv_cmd_query_device_any(context, input, attr, attr_size, &resp, &resp_size); if (ret) return ret; raw_fw_ver = resp.base.fw_ver; major = (raw_fw_ver >> 32) & 0xffff; minor = (raw_fw_ver >> 16) & 0xffff; sub_minor = raw_fw_ver & 0xffff; snprintf(attr->orig_attr.fw_ver, sizeof(attr->orig_attr.fw_ver), "%d.%d.%03d", major, minor, sub_minor); return 0; } int pvrdma_query_port(struct ibv_context *context, uint8_t port, struct ibv_port_attr *attr) { struct ibv_query_port cmd; return ibv_cmd_query_port(context, port, attr, &cmd, sizeof(cmd)); } struct ibv_pd *pvrdma_alloc_pd(struct ibv_context *context) { struct ibv_alloc_pd cmd; struct user_pvrdma_alloc_pd_resp resp; struct pvrdma_pd *pd; pd = malloc(sizeof(*pd)); if (!pd) return NULL; if (ibv_cmd_alloc_pd(context, &pd->ibv_pd, &cmd, sizeof(cmd), &resp.ibv_resp, sizeof(resp))) { free(pd); return NULL; } pd->pdn = resp.pdn; return &pd->ibv_pd; } int pvrdma_free_pd(struct ibv_pd *pd) { int ret; ret = ibv_cmd_dealloc_pd(pd); if (ret) return ret; free(to_vpd(pd)); return 0; } struct ibv_mr *pvrdma_reg_mr(struct ibv_pd *pd, void *addr, size_t length, uint64_t hca_va, int access) { struct verbs_mr *vmr; struct ibv_reg_mr cmd; struct ib_uverbs_reg_mr_resp resp; int ret; vmr = malloc(sizeof(*vmr)); if (!vmr) return NULL; ret = ibv_cmd_reg_mr(pd, addr, length, hca_va, access, vmr, &cmd, sizeof(cmd), &resp, sizeof(resp)); if (ret) { free(vmr); return NULL; } return &vmr->ibv_mr; } int pvrdma_dereg_mr(struct verbs_mr *vmr) { int ret; ret = ibv_cmd_dereg_mr(vmr); if (ret) return ret; free(vmr); return 0; } static int is_multicast_gid(const union 
ibv_gid *gid) { return gid->raw[0] == 0xff; } static int is_link_local_gid(const union ibv_gid *gid) { return gid->global.subnet_prefix == htobe64(0xfe80000000000000ULL); } static int is_ipv6_addr_v4mapped(const struct in6_addr *a) { return IN6_IS_ADDR_V4MAPPED(&a->s6_addr32) || /* IPv4 encoded multicast addresses */ (a->s6_addr32[0] == htobe32(0xff0e0000) && ((a->s6_addr32[1] | (a->s6_addr32[2] ^ htobe32(0x0000ffff))) == 0UL)); } static int set_mac_from_gid(const union ibv_gid *gid, __u8 mac[6]) { if (is_link_local_gid(gid)) { /* * The MAC is embedded in GID[8-10,13-15] with the * 7th most significant bit inverted. */ memcpy(mac, gid->raw + 8, 3); memcpy(mac + 3, gid->raw + 13, 3); mac[0] ^= 2; return 0; } return 1; } struct ibv_ah *pvrdma_create_ah(struct ibv_pd *pd, struct ibv_ah_attr *attr) { struct pvrdma_ah *ah; struct pvrdma_av *av; struct ibv_port_attr port_attr; if (!attr->is_global) return NULL; if (ibv_query_port(pd->context, attr->port_num, &port_attr)) return NULL; if (port_attr.link_layer == IBV_LINK_LAYER_UNSPECIFIED || port_attr.link_layer == IBV_LINK_LAYER_INFINIBAND) return NULL; if (port_attr.link_layer == IBV_LINK_LAYER_ETHERNET && (!is_link_local_gid(&attr->grh.dgid) && !is_multicast_gid(&attr->grh.dgid) && !is_ipv6_addr_v4mapped((struct in6_addr *)attr->grh.dgid.raw))) return NULL; ah = calloc(1, sizeof(*ah)); if (!ah) return NULL; av = &ah->av; av->port_pd = to_vpd(pd)->pdn | (attr->port_num << 24); av->src_path_bits = attr->src_path_bits; av->src_path_bits |= 0x80; av->gid_index = attr->grh.sgid_index; av->hop_limit = attr->grh.hop_limit; av->sl_tclass_flowlabel = (attr->grh.traffic_class << 20) | attr->grh.flow_label; memcpy(av->dgid, attr->grh.dgid.raw, 16); if (port_attr.port_cap_flags & IBV_PORT_IP_BASED_GIDS) { if (!ibv_resolve_eth_l2_from_gid(pd->context, attr, av->dmac, NULL)) return &ah->ibv_ah; } else { if (!set_mac_from_gid(&attr->grh.dgid, av->dmac)) return &ah->ibv_ah; } free(ah); return NULL; } int pvrdma_destroy_ah(struct ibv_ah *ah) { free(to_vah(ah)); return 0; } rdma-core-56.1/pyverbs/000077500000000000000000000000001477342711600150245ustar00rootroot00000000000000rdma-core-56.1/pyverbs/CMakeLists.txt000066400000000000000000000020541477342711600175650ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file # Copyright (c) 2020, Intel Corporation. All rights reserved. See COPYING file publish_internal_headers("" dmabuf_alloc.h ) if (DRM_INCLUDE_DIRS) set(DMABUF_ALLOC dmabuf_alloc.c) else() set(DMABUF_ALLOC dmabuf_alloc_stub.c) endif() if (HAVE_COHERENT_DMA) set(DMA_UTIL dma_util.pyx) else() set(DMA_UTIL "") endif() rdma_cython_module(pyverbs "" addr.pyx base.pyx cm_enums.pyx cmid.pyx cq.pyx device.pyx ${DMA_UTIL} dmabuf.pyx ${DMABUF_ALLOC} enums.pyx flow.pyx fork.pyx libibverbs.pyx libibverbs_enums.pyx librdmacm.pyx librdmacm_enums.pyx mem_alloc.pyx mr.pyx pd.pyx qp.pyx spec.pyx srq.pyx wq.pyx wr.pyx xrcd.pyx ) rdma_python_module(pyverbs __init__.py pyverbs_error.py utils.py ) # mlx5 and efa providers are not built without coherent DMA, e.g. ARM32 build. 
if (HAVE_COHERENT_DMA) add_subdirectory(providers/mlx5) add_subdirectory(providers/efa) endif() rdma-core-56.1/pyverbs/__init__.pxd000066400000000000000000000000001477342711600172660ustar00rootroot00000000000000rdma-core-56.1/pyverbs/__init__.py000066400000000000000000000000001477342711600171230ustar00rootroot00000000000000rdma-core-56.1/pyverbs/addr.pxd000066400000000000000000000012251477342711600164530ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2018, Mellanox Technologies. All rights reserved. See COPYING file #cython: language_level=3 from .base cimport PyverbsObject, PyverbsCM from pyverbs cimport libibverbs as v from .cmid cimport UDParam cdef class GID(PyverbsObject): cdef v.ibv_gid gid cdef class GRH(PyverbsObject): cdef v.ibv_grh grh cdef class GlobalRoute(PyverbsObject): cdef v.ibv_global_route gr cdef class AHAttr(PyverbsObject): cdef v.ibv_ah_attr ah_attr cdef init_from_ud_param(self, UDParam udparam) cdef class AH(PyverbsCM): cdef v.ibv_ah *ah cdef object pd cpdef close(self) rdma-core-56.1/pyverbs/addr.pyx000066400000000000000000000350011477342711600164770ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2018, Mellanox Technologies. All rights reserved. See COPYING file from libc.stdint cimport uint8_t, uintptr_t from .pyverbs_error import PyverbsUserError, PyverbsRDMAError from pyverbs.utils import gid_str_to_array, gid_str from pyverbs.base import PyverbsRDMAErrno from pyverbs.cmid cimport UDParam cimport pyverbs.libibverbs as v from pyverbs.pd cimport PD from pyverbs.cq cimport WC cdef extern from 'endian.h': unsigned long be64toh(unsigned long host_64bits) cdef class GID(PyverbsObject): """ GID class represents ibv_gid. It enables user to query for GIDs values. """ def __init__(self, val=None): super().__init__() if val is not None: vals = gid_str_to_array(val) for i in range(16): self.gid.raw[i] = int(vals[i],16) @property def gid(self): """ Expose the inner GID :return: A GID string in an 8 words format: 'xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx' """ return self.__str__() @gid.setter def gid(self, val): """ Sets the inner GID :param val: A GID string in an 8 words format: 'xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx' :return: None """ self._set_gid(val) def _set_gid(self, val): vals = gid_str_to_array(val) for i in range(16): self.gid.raw[i] = int(vals[i],16) def __str__(self): return gid_str(self.gid._global.subnet_prefix, self.gid._global.interface_id) cdef class GRH(PyverbsObject): """ Represents ibv_grh struct. Used when creating or initializing an Address Handle from a Work Completion. """ def __init__(self, GID sgid=None, GID dgid=None, version_tclass_flow=0, paylen=0, next_hdr=0, hop_limit=1): """ Initializes a GRH object :param sgid: Source GID :param dgid: Destination GID :param version_tclass_flow: A 32b big endian used to communicate service level e.g. across subnets :param paylen: A 16b big endian that is the packet length in bytes, starting from the first byte after the GRH up to and including the last byte of the ICRC :param next_hdr: An 8b unsigned integer specifying the next header For non-raw packets: 0x1B For raw packets: According to IETF RFC 1700 :param hop_limit: An 8b unsigned integer specifying the number of hops (i.e. 
routers) that the packet is permitted to take prior to being discarded :return: A GRH object """ super().__init__() self.grh.dgid = dgid.gid self.grh.sgid = sgid.gid self.grh.version_tclass_flow = version_tclass_flow self.grh.paylen = paylen self.grh.next_hdr = next_hdr self.grh.hop_limit = hop_limit @property def dgid(self): return gid_str(self.grh.dgid._global.subnet_prefix, self.grh.dgid._global.interface_id) @dgid.setter def dgid(self, val): vals = gid_str_to_array(val) for i in range(16): self.grh.dgid.raw[i] = int(vals[i],16) @property def sgid(self): return gid_str(self.grh.sgid._global.subnet_prefix, self.grh.sgid._global.interface_id) @sgid.setter def sgid(self, val): vals = gid_str_to_array(val) for i in range(16): self.grh.sgid.raw[i] = int(vals[i],16) @property def version_tclass_flow(self): return self.grh.version_tclass_flow @version_tclass_flow.setter def version_tclass_flow(self, val): self.grh.version_tclass_flow = val @property def paylen(self): return self.grh.paylen @paylen.setter def paylen(self, val): self.grh.paylen = val @property def next_hdr(self): return self.grh.next_hdr @next_hdr.setter def next_hdr(self, val): self.grh.next_hdr = val @property def hop_limit(self): return self.grh.hop_limit @hop_limit.setter def hop_limit(self, val): self.grh.hop_limit = val def __str__(self): print_format = '{:22}: {:<20}\n' return print_format.format('DGID', self.dgid) +\ print_format.format('SGID', self.sgid) +\ print_format.format('version tclass flow', self.version_tclass_flow) +\ print_format.format('paylen', self.paylen) +\ print_format.format('next header', self.next_hdr) +\ print_format.format('hop limit', self.hop_limit) cdef class GlobalRoute(PyverbsObject): """ Represents ibv_global_route. Used in Address Handle creation and describes the values to be used in the GRH of the packets that will be sent using this Address Handle. """ def __init__(self, GID dgid=None, flow_label=0, sgid_index=0, hop_limit=1, traffic_class=0): """ Initializes a GlobalRoute object with given parameters. :param dgid: Destination GID :param flow_label: A 20b value. If non-zero, gives a hint to switches and routers that this sequence of packets must be delivered in order :param sgid_index: An index in the port's GID table that identifies the originator of the packet :param hop_limit: An 8b unsigned integer specifying the number of hops (i.e. 
routers) that the packet is permitted to take prior to being discarded :param traffic_class: An 8b unsigned integer specifying the required delivery priority for routers :return: A GlobalRoute object """ super().__init__() self.gr.dgid=dgid.gid self.gr.flow_label = flow_label self.gr.sgid_index = sgid_index self.gr.hop_limit = hop_limit self.gr.traffic_class = traffic_class @property def dgid(self): return gid_str(self.gr.dgid._global.subnet_prefix, self.gr.dgid._global.interface_id) @dgid.setter def dgid(self, val): vals = gid_str_to_array(val) for i in range(16): self.gr.dgid.raw[i] = int(vals[i],16) @property def flow_label(self): return self.gr.flow_label @flow_label.setter def flow_label(self, val): self.gr.flow_label = val @property def sgid_index(self): return self.gr.sgid_index @sgid_index.setter def sgid_index(self, val): self.gr.sgid_index = val @property def hop_limit(self): return self.gr.hop_limit @hop_limit.setter def hop_limit(self, val): self.gr.hop_limit = val @property def traffic_class(self): return self.gr.traffic_class @traffic_class.setter def traffic_class(self, val): self.gr.traffic_class = val def __str__(self): print_format = '{:22}: {:<20}\n' return print_format.format('DGID', self.dgid) +\ print_format.format('flow label', self.flow_label) +\ print_format.format('sgid index', self.sgid_index) +\ print_format.format('hop limit', self.hop_limit) +\ print_format.format('traffic class', self.traffic_class) cdef class AHAttr(PyverbsObject): """ Represents ibv_ah_attr struct """ def __init__(self, dlid=0, sl=0, src_path_bits=0, static_rate=0, is_global=0, port_num=1, GlobalRoute gr=None): """ Initializes an AHAttr object. :param dlid: Destination LID, a 16b unsigned integer :param sl: Service level, an 8b unsigned integer :param src_path_bits: When LMC (LID mask count) is used in the port, packets are being sent with the port's base LID, bitwise ORed with the value of the src_path_bits. An 8b unsigned integer :param static_rate: An 8b unsigned integer limiting the rate of packets that are being sent to the subnet :param is_global: If non-zero, GRH information exists in the Address Handle :param port_num: The local physical port from which the packets will be sent :param grh: Attributes of a global routing header. Will only be used if is_global is non zero. :return: An AHAttr object """ super().__init__() self.ah_attr.port_num = port_num self.ah_attr.sl = sl self.ah_attr.src_path_bits = src_path_bits self.ah_attr.dlid = dlid self.ah_attr.static_rate = static_rate self.ah_attr.is_global = is_global # Do not set GRH fields for a non-global AH if is_global: if gr is None: raise PyverbsUserError('Global AH Attr is created but gr parameter is None') self.ah_attr.grh.dgid = gr.gr.dgid self.ah_attr.grh.flow_label = gr.flow_label self.ah_attr.grh.sgid_index = gr.sgid_index self.ah_attr.grh.hop_limit = gr.hop_limit self.ah_attr.grh.traffic_class = gr.traffic_class cdef init_from_ud_param(self, UDParam udparam): """ Initiate the AHAttr from UDParam's ah_attr. :param udparam: UDParam that contains the AHAttr. 
:return: None """ self.ah_attr = udparam.ud_param.ah_attr @property def port_num(self): return self.ah_attr.port_num @port_num.setter def port_num(self, val): self.ah_attr.port_num = val @property def sl(self): return self.ah_attr.sl @sl.setter def sl(self, val): self.ah_attr.sl = val @property def src_path_bits(self): return self.ah_attr.src_path_bits @src_path_bits.setter def src_path_bits(self, val): self.ah_attr.src_path_bits = val @property def dlid(self): return self.ah_attr.dlid @dlid.setter def dlid(self, val): self.ah_attr.dlid = val @property def static_rate(self): return self.ah_attr.static_rate @static_rate.setter def static_rate(self, val): self.ah_attr.static_rate = val @property def is_global(self): return self.ah_attr.is_global @is_global.setter def is_global(self, val): self.ah_attr.is_global = val @property def dgid(self): if self.ah_attr.is_global: return gid_str(self.ah_attr.grh.dgid._global.subnet_prefix, self.ah_attr.grh.dgid._global.interface_id) @dgid.setter def dgid(self, val): if self.ah_attr.is_global: vals = gid_str_to_array(val) for i in range(16): self.ah_attr.grh.dgid.raw[i] = int(vals[i],16) @property def flow_label(self): if self.ah_attr.is_global: return self.ah_attr.grh.flow_label @flow_label.setter def flow_label(self, val): self.ah_attr.grh.flow_label = val @property def sgid_index(self): if self.ah_attr.is_global: return self.ah_attr.grh.sgid_index @sgid_index.setter def sgid_index(self, val): self.ah_attr.grh.sgid_index = val @property def hop_limit(self): if self.ah_attr.is_global: return self.ah_attr.grh.hop_limit @hop_limit.setter def hop_limit(self, val): self.ah_attr.grh.hop_limit = val @property def traffic_class(self): if self.ah_attr.is_global: return self.ah_attr.grh.traffic_class @traffic_class.setter def traffic_class(self, val): self.ah_attr.grh.traffic_class = val def __str__(self): print_format = ' {:22}: {:<20}\n' if self.is_global: global_format = print_format.format('dgid', self.dgid) +\ print_format.format('flow label', self.flow_label) +\ print_format.format('sgid index', self.sgid_index) +\ print_format.format('hop limit', self.hop_limit) +\ print_format.format('traffic_class', self.traffic_class) else: global_format = '' return print_format.format('port num', self.port_num) +\ print_format.format('sl', self.sl) +\ print_format.format('source path bits', self.src_path_bits) +\ print_format.format('dlid', self.dlid) +\ print_format.format('static rate', self.static_rate) +\ print_format.format('is global', self.is_global) + global_format cdef class AH(PyverbsCM): def __init__(self, PD pd, **kwargs): """ Initializes an AH object with the given values. 
Two creation methods are supported: - Creation via AHAttr object (calls ibv_create_ah) - Creation via a WC object (calls ibv_create_ah_from_wc) :param pd: PD object this AH belongs to :param kwargs: Arguments: * *attr* (AHAttr) An AHAttr object (represents ibv_ah_attr struct) * *wc* A WC object to use for AH initialization * *grh* Pointer to GRH object to use for AH initialization (when using wc) * *port_num* Port number to be used for this AH (when using wc) :return: An AH object on success """ super().__init__() if len(kwargs) == 1: # Create AH via ibv_create_ah ah_attr = kwargs['attr'] self.ah = v.ibv_create_ah(pd.pd, &ah_attr.ah_attr) else: # Create AH from WC wc = kwargs['wc'] grh = kwargs['grh'] port_num = kwargs['port_num'] self.ah = v.ibv_create_ah_from_wc(pd.pd, &wc.wc, grh, port_num) if self.ah == NULL: raise PyverbsRDMAErrno('Failed to create AH') pd.add_ref(self) self.pd = pd def __dealloc__(self): self.close() cpdef close(self): if self.ah != NULL: if self.logger: self.logger.debug('Closing AH') rc = v.ibv_destroy_ah(self.ah) if rc: raise PyverbsRDMAError('Failed to destroy AH', rc) self.ah = NULL self.pd = None rdma-core-56.1/pyverbs/base.pxd000066400000000000000000000005041477342711600164520ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. #cython: language_level=3 cdef class PyverbsObject(object): cdef object __weakref__ cdef object logger cdef class PyverbsCM(PyverbsObject): cpdef close(self) cdef close_weakrefs(iterables) rdma-core-56.1/pyverbs/base.pyx000066400000000000000000000036621477342711600165070ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. from libc.errno cimport errno import logging from pyverbs.pyverbs_error import PyverbsRDMAError cimport pyverbs.libibverbs as v def inc_rkey(rkey): return v.ibv_inc_rkey(rkey) cpdef PyverbsRDMAErrno(str msg): return PyverbsRDMAError(msg, errno) LOG_LEVEL=logging.INFO LOG_FORMAT='[%(levelname)s] %(asctime)s %(filename)s:%(lineno)s: %(message)s' logging.basicConfig(format=LOG_FORMAT, level=LOG_LEVEL, datefmt='%d %b %Y %H:%M:%S') cdef close_weakrefs(iterables): """ For each iterable element of iterables, pop each element and call its close() method. This method is used when an object is being closed while other objects still hold C references to it; the object holds weakrefs to such other objects, and closes them before trying to tear down the C resources. :param iterables: an array of WeakSets :return: None """ # None elements can be present if an object's close() was called more # than once (e.g. GC and by another object) for it in iterables: if it is None: continue while True: try: tmp = it.pop() tmp.close() except KeyError: # popping an empty set break cdef class PyverbsObject(object): def __init__(self): self.logger = logging.getLogger(self.__class__.__name__) def set_log_level(self, val): self.logger.setLevel(val) cdef class PyverbsCM(PyverbsObject): """ This is a base class for pyverbs' context manager objects. It includes __enter__ and __exit__ functions. close() is also declared but it should be overridden by each inheriting class. 
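    A minimal usage sketch (SomeResource stands in for any concrete
    PyverbsCM subclass; the name is illustrative only):

        with SomeResource() as res:
            pass  # use res; close() runs via __exit__ on scope exit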
""" def __enter__(self): return self def __exit__(self, exc_type, exc_value, traceback): return self.close() cpdef close(self): pass rdma-core-56.1/pyverbs/cm_enums.pyx000077700000000000000000000000001477342711600232452librdmacm_enums.pxdustar00rootroot00000000000000rdma-core-56.1/pyverbs/cmid.pxd000066400000000000000000000017601477342711600164610ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file #cython: language_level=3 from pyverbs.base cimport PyverbsObject, PyverbsCM cimport pyverbs.librdmacm as cm cdef class CMID(PyverbsCM): cdef cm.rdma_cm_id *id cdef object event_channel cdef object ctx cdef object pd cdef object mrs cdef add_ref(self, obj) cpdef close(self) cdef class CMEventChannel(PyverbsObject): cdef cm.rdma_event_channel *event_channel cpdef close(self) cdef class CMEvent(PyverbsObject): cdef cm.rdma_cm_event *event cpdef close(self) cdef class AddrInfo(PyverbsObject): cdef cm.rdma_addrinfo *addr_info cpdef close(self) cdef class ConnParam(PyverbsObject): cdef cm.rdma_conn_param conn_param cdef object data cdef class UDParam(PyverbsObject): cdef cm.rdma_ud_param ud_param cdef class JoinMCAttrEx(PyverbsObject): cdef cm.rdma_cm_join_mc_attr_ex join_mc_attr_ex rdma-core-56.1/pyverbs/cmid.pyx000066400000000000000000001007361477342711600165110ustar00rootroot00000000000000from libc.stdint cimport uintptr_t, uint8_t from libc.string cimport memset import weakref import ctypes from pyverbs.pyverbs_error import PyverbsUserError, PyverbsError, PyverbsRDMAError from pyverbs.qp cimport QPInitAttr, QPAttr, ECE from pyverbs.base import PyverbsRDMAErrno from pyverbs.base cimport close_weakrefs cimport pyverbs.libibverbs_enums as e cimport pyverbs.librdmacm_enums as ce from pyverbs.addr cimport AH, AHAttr from pyverbs.device cimport Context cimport pyverbs.libibverbs as v cimport pyverbs.librdmacm as cm from pyverbs.pd cimport PD from pyverbs.mr cimport MR from pyverbs.cq cimport WC cdef class ConnParam(PyverbsObject): def __init__(self, resources=1, depth=1, flow_control=0, retry=5, rnr_retry=5, srq=0, qp_num=0, data_len=0): """ Initialize a ConnParam object over an underlying rdma_conn_param C object which contains connection parameters. There are a few types of port spaces in RDMACM: RDMA_PS_TCP, RDMA_PS_UDP, RDMA_PS_IB and RDMA_PS_IPOIB. RDMA_PS_TCP resembles RC QP connection, which provides reliable, connection-oriented QP communication. This object applies only to RDMA_PS_TCP port space. :param resources: Max outstanding RDMA read and atomic ops that local side will accept from the remote side. :param depth: Max outstanding RDMA read and atomic ops that local side will have to the remote side. :param flow_control: Specifies if hardware flow control is available. :param retry: Max number of times that a send, RDMA or atomic op from the remote peer should be retried. :param rnr_retry: The maximum number of times that a send operation from the remote peer should be retried on a connection after receiving a receiver not ready (RNR) error. :param srq: Specifies if the QP using shared receive queue, ignored if the QP created by CMID. :param qp_num: Specifies the QP number, ignored if the QP created by CMID. :param data_len: Specifies the private data length. 
RDMA_PS_TCP connect: 56 accept: 196 RDMA_PS_UDP connect: 180 accept: 136 :return: ConnParam object """ super().__init__() memset(&self.conn_param, 0, sizeof(cm.rdma_conn_param)) self.conn_param.responder_resources = resources self.conn_param.initiator_depth = depth self.conn_param.flow_control = flow_control self.conn_param.retry_count = retry self.conn_param.rnr_retry_count = rnr_retry self.conn_param.srq = srq self.conn_param.qp_num = qp_num self.data = (ctypes.c_char * data_len)() @property def qpn(self): return self.conn_param.qp_num @qpn.setter def qpn(self, val): self.conn_param.qp_num = val @property def private_data(self): data_array = bytearray(self.conn_param.private_data_len) cdef const unsigned char *p = self.conn_param.private_data for i in range(self.conn_param.private_data_len): data_array[i] = p[i] return bytes(data_array) def set_private_data(self, data): if (min(len(self.data), len(data)) == 0): return for i in range(len(self.data)): self.data[i] = 0 for i in range(min(len(self.data), len(data))): self.data[i] = data[i] cdef size_t ptr = ctypes.addressof(self.data) self.conn_param.private_data = ptr self.conn_param.private_data_len = min(len(self.data), len(data)) def __str__(self): print_format = '{:<4}: {:<4}\n' return '{}: {}\n'.format('Connection parameters', "") +\ print_format.format('responder resources', self.conn_param.responder_resources) +\ print_format.format('initiator depth', self.conn_param.initiator_depth) +\ print_format.format('flow control', self.conn_param.flow_control) +\ print_format.format('retry count', self.conn_param.retry_count) +\ print_format.format('rnr retry count', self.conn_param.rnr_retry_count) +\ print_format.format('srq', self.conn_param.srq) +\ print_format.format('qp number', self.conn_param.qp_num) cdef class JoinMCAttrEx(PyverbsObject): def __init__(self, AddrInfo addr not None, comp_mask=0, join_flags=0): """ Initialize a JoinMCAttrEx object over an underlying rdma_cm_join_mc_attr_ex C object which contains the extended join multicast attributes. :param addr: Multicast address identifying the group to join. :param comp_mask: Bitwise OR between "rdma_cm_join_mc_attr_mask" enum. :param join_flags: Single flag from "rdma_cm_mc_join_flags" enum. Indicates the type of the join requests. """ super().__init__() self.join_mc_attr_ex.addr = addr.addr_info.ai_src_addr self.join_mc_attr_ex.comp_mask = comp_mask self.join_mc_attr_ex.join_flags = join_flags @property def join_flags(self): return self.join_mc_attr_ex.join_flags @join_flags.setter def join_flags(self, val): self.join_mc_attr_ex.join_flags = val @property def comp_mask(self): return self.join_mc_attr_ex.comp_mask @comp_mask.setter def comp_mask(self, val): self.join_mc_attr_ex.comp_mask = val cdef class UDParam(PyverbsObject): def __init__(self, CMEvent cm_event not None): """ Initialize a UDParam object over an underlying rdma_ud_param C object which contains UD connection parameters. :param cm_event: The creator of UDParam. When the active side gets connection establishment event, the event contains UDParam for the passive CMID details. 
:return: UDParam object """ super().__init__() memset(&self.ud_param, 0, sizeof(cm.rdma_ud_param)) self.ud_param = (cm_event).event.param.ud @property def qp_num(self): return self.ud_param.qp_num @property def qkey(self): return self.ud_param.qkey @property def ah_attr(self): ah_attr = AHAttr() ah_attr.init_from_ud_param(self) return ah_attr cdef class AddrInfo(PyverbsObject): def __init__(self, src=None, dst=None, src_service=None, dst_service=None, port_space=0, flags=0): """ Initialize an AddrInfo object over an underlying rdma_addrinfo C object. :param src: Name, dotted-decimal IPv4 or IPv6 hex address to bind to. :param dst: Name, dotted-decimal IPv4 or IPv6 hex address to connect to. :param src_service: The service name or port number of the source address. :param dst_service: The service name or port number of the destination address. :param port_space: RDMA port space used (RDMA_PS_UDP or RDMA_PS_TCP). :param flags: Hint flags which control the operation. :return: An AddrInfo object which contains information needed to establish communication. """ cdef char* src_srvc = NULL cdef char* dst_srvc = NULL cdef char* src_addr = NULL cdef char* dst_addr = NULL cdef cm.rdma_addrinfo hints cdef cm.rdma_addrinfo *hints_ptr = NULL cdef cm.rdma_addrinfo *res = NULL super().__init__() if src is not None: if isinstance(src, str): src = src.encode('utf-8') src_addr = src if dst is not None: if isinstance(dst, str): dst = dst.encode('utf-8') dst_addr = dst if src_service is not None: if isinstance(src_service, str): src_service = src_service.encode('utf-8') src_srvc = src_service if dst_service is not None: if isinstance(dst_service, str): dst_service = dst_service.encode('utf-8') dst_srvc = dst_service hints_ptr = &hints memset(hints_ptr, 0, sizeof(cm.rdma_addrinfo)) hints.ai_port_space = port_space hints.ai_flags = flags if flags & ce.RAI_PASSIVE: ret = cm.rdma_getaddrinfo(src_addr, src_srvc, hints_ptr, &self.addr_info) else: if src: hints.ai_flags |= ce.RAI_PASSIVE ret = cm.rdma_getaddrinfo(src_addr, src_srvc, hints_ptr, &res) if ret != 0: raise PyverbsRDMAErrno('Failed to get Address Info') hints.ai_src_addr = res.ai_src_addr hints.ai_src_len = res.ai_src_len hints.ai_flags &= ~ce.RAI_PASSIVE ret = cm.rdma_getaddrinfo(dst_addr, dst_srvc, hints_ptr, &self.addr_info) if src: cm.rdma_freeaddrinfo(res) if ret != 0: raise PyverbsRDMAErrno('Failed to get Address Info') def __dealloc__(self): self.close() cpdef close(self): if self.addr_info != NULL: if self.logger: self.logger.debug('Closing AddrInfo') cm.rdma_freeaddrinfo(self.addr_info) self.addr_info = NULL cdef class CMEvent(PyverbsObject): def __init__(self, CMEventChannel channel): """ Initialize a CMEvent object over an underlying rdma_cm_event C object :param channel: Event Channel on which this event has been received :return: CMEvent object """ super().__init__() ret = cm.rdma_get_cm_event(channel.event_channel, &self.event) if ret != 0: raise PyverbsRDMAErrno('Failed to create CMEvent') self.logger.debug('Created a CMEvent') def __dealloc__(self): self.close() cpdef close(self): if self.event != NULL: if self.logger: self.logger.debug('Closing CMEvent') self.ack_cm_event() self.event = NULL @property def event_type(self): return self.event.event def ack_cm_event(self): """ Free a communication event. This call frees the event structure and any memory that it references. 
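        Note that every event returned by rdma_get_cm_event must eventually
        be acked; destroying a CMID blocks until all events reported on it
        have been acknowledged.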
:return: None """ ret = cm.rdma_ack_cm_event(self.event) if ret != 0: raise PyverbsRDMAErrno('Failed to Acknowledge Event - {}' .format(self.event_str())) self.event = NULL def event_str(self): if self.event == NULL: return '' return (cm.rdma_event_str(self.event_type)).decode() @property def private_data(self): data_array = bytearray(self.event.param.conn.private_data_len) cdef const unsigned char *p = self.event.param.conn.private_data for i in range(self.event.param.conn.private_data_len): data_array[i] = p[i] return bytes(data_array) cdef class CMEventChannel(PyverbsObject): def __init__(self): """ Initialize a CMEventChannel object over an underlying rdma_event_channel C object. :return: CMEventChannel object """ super().__init__() self.event_channel = cm.rdma_create_event_channel() if self.event_channel == NULL: raise PyverbsRDMAErrno('Failed to create CMEventChannel') self.logger.debug('Created a CMEventChannel') def __dealloc__(self): self.close() cpdef close(self): if self.event_channel != NULL: if self.logger: self.logger.debug('Closing CMEventChannel') cm.rdma_destroy_event_channel(self.event_channel) self.event_channel = NULL cdef class CMID(PyverbsCM): def __init__(self, object creator=None, QPInitAttr qp_init_attr=None, PD pd=None, port_space=ce.RDMA_PS_TCP, CMID listen_id=None): """ Initialize a CMID object over an underlying rdma_cm_id C object. This is the main RDMA CM object which provides most of the rdmacm API. Currently only synchronous RDMA_PS_TCP communication is supported. Note: user-specific context is currently not supported. :param creator: For synchronous communication we need an AddrInfo object in order to establish a connection. creator may be None for internal usage; see the get_request method. :param qp_init_attr: Optional initial QP attributes of the CMID's associated QP. :param pd: Optional parameter, a PD to be associated with this CMID. :param port_space: RDMA port space. :param listen_id: When the passive side establishes a connection, it creates a new CMID. listen_id is used to initialize the new CMID. :return: CMID object for synchronous communication.
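        A hedged sketch of the synchronous active side (the address and port
        are placeholders; QPInitAttr/QPCap are assumed to come from
        pyverbs.qp):
            addr = AddrInfo(dst='1.2.3.4', dst_service='7471',
                            port_space=ce.RDMA_PS_TCP)
            cmid = CMID(creator=addr,
                        qp_init_attr=QPInitAttr(cap=QPCap(max_recv_wr=1)))
            cmid.connect()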
""" cdef v.ibv_qp_init_attr *init cdef v.ibv_pd *in_pd = NULL super().__init__() self.pd = None self.ctx = None self.event_channel = None self.mrs = weakref.WeakSet() if creator is None: return elif isinstance(creator, AddrInfo): init = NULL if qp_init_attr is None else &qp_init_attr.attr if pd is not None: in_pd = pd.pd self.pd = pd ret = cm.rdma_create_ep(&self.id, (creator).addr_info, in_pd, init) if ret != 0: raise PyverbsRDMAErrno('Failed to create CM ID') if not (creator).addr_info.ai_flags & ce.RAI_PASSIVE: self.ctx = Context(cmid=self) if self.pd is None: self.pd = PD(self) elif isinstance(creator, CMEventChannel): self.event_channel = creator ret = cm.rdma_create_id((creator).event_channel, &self.id, NULL, port_space) if ret != 0: raise PyverbsRDMAErrno('Failed to create CM ID') elif isinstance(creator, CMEvent): if listen_id is None: raise PyverbsUserError('listen ID not provided') self.id = (creator).event.id self.event_channel = listen_id.event_channel self.ctx = listen_id.ctx self.pd = listen_id.pd else: raise PyverbsRDMAErrno('Cannot create CM ID from {obj}' .format(obj=type(creator))) cdef add_ref(self, obj): if isinstance(obj, MR): self.mrs.add(obj) else: raise PyverbsError('Unrecognized object type') @property def dev_name(self): return str(v.ibv_get_device_name(self.id.verbs.device).decode('utf-8')) @property def port_num(self): return self.id.port_num @property def event_channel(self): return self.event_channel @property def context(self): return self.ctx @property def pd(self): return self.pd @property def qpn(self): if self.id.qp: return self.id.qp.qp_num return None def __dealloc__(self): self.close() cpdef close(self): if self.id != NULL: if self.logger: self.logger.debug('Closing CMID') if self.event_channel is None: cm.rdma_destroy_ep(self.id) else: if self.id.qp != NULL: cm.rdma_destroy_qp(self.id) ret = cm.rdma_destroy_id(self.id) if ret != 0: raise PyverbsRDMAErrno('Failed to close CMID') if self.ctx: (self.ctx).context = NULL if self.pd: (self.pd).pd = NULL close_weakrefs([self.mrs]) self.id = NULL def get_request(self): """ Retrieves the next pending connection request event. The call may only be used on listening CMIDs operating synchronously. If the call is successful, a new CMID representing the connection request will be returned to the user. The new CMID will reference event information associated with the request until the user calls reject, accept, or close on the newly created identifier. :return: New CMID representing the connection request. """ to_conn = CMID() ret = cm.rdma_get_request(self.id, &to_conn.id) if ret != 0: raise PyverbsRDMAErrno('Failed to get request, no connection established') self.ctx = Context(cmid=to_conn) self.pd = PD(to_conn) return to_conn def bind_addr(self, AddrInfo lai not None): """ Associate a source address with a CMID. If binding to a specific local address, the CMID will also be bound to a local RDMA device. :param lai: Local address information :return: None """ ret = cm.rdma_bind_addr(self.id, lai.addr_info.ai_src_addr) if ret != 0: raise PyverbsRDMAErrno('Failed to Bind ID') # After bind address, cm_id contains ibv_context. # Now we can create Context object. if self.ctx is None: self.ctx = Context(cmid=self) if self.pd is None: self.pd = PD(self) def resolve_addr(self, AddrInfo rai not None, timeout_ms=2000): """ Resolve destination and optional source addresses from IP addresses to an RDMA address. If successful, the specified rdma_cm_id will be bound to a local device. :param rai: Remote address information. 
:param timeout_ms: Time to wait for resolution to complete [msec] :return: None """ ret = cm.rdma_resolve_addr(self.id, rai.addr_info.ai_src_addr, rai.addr_info.ai_dst_addr, timeout_ms) if ret != 0: raise PyverbsRDMAErrno('Failed to Resolve Address') def join_multicast(self, AddrInfo addr=None, JoinMCAttrEx mc_attr=None, context=0): """ Joins a multicast group and attaches an associated QP to the group. :param addr: Multicast address identifying the group to join. :param mc_attr: JoinMCAttrEx object is required to use rdma_join_multicast_ex. This object contains the join flags and the AddrInfo to join. :param context: User-defined context associated with the join request. :return: None """ cdef cm.rdma_cm_join_mc_attr_ex *mc_join_attr = NULL if not addr and not mc_attr: raise PyverbsUserError('Join to multicast must have AddrInfo or JoinMCAttrEx arguments') if not mc_attr: ret = cm.rdma_join_multicast(self.id, addr.addr_info.ai_src_addr, <void *>context) else: ret = cm.rdma_join_multicast_ex(self.id, &mc_attr.join_mc_attr_ex, <void *>context) if ret != 0: raise PyverbsRDMAErrno('Failed to Join multicast') def leave_multicast(self, AddrInfo addr not None): """ Leaves a multicast group and detaches an associated QP from the group. :param addr: AddrInfo object, represents the multicast address that identifies the group to leave. :return: None """ ret = cm.rdma_leave_multicast(self.id, addr.addr_info.ai_src_addr) if ret != 0: raise PyverbsRDMAErrno('Failed to leave multicast') def resolve_route(self, timeout_ms=2000): """ Resolve an RDMA route to the destination address in order to establish a connection. The destination must already have been resolved by calling resolve_addr. Thus this function is called on the client side after resolve_addr but before calling connect. :param timeout_ms: Time to wait for resolution to complete :return: None """ ret = cm.rdma_resolve_route(self.id, timeout_ms) if ret != 0: raise PyverbsRDMAErrno('Failed to Resolve Route') # After resolve_route, cm_id contains an ibv_context. # Now we can create the Context object. if self.ctx is None: self.ctx = Context(cmid=self) if self.pd is None: self.pd = PD(self) def listen(self, backlog=0): """ Listen for incoming connection requests or datagram service lookup. The listen is restricted to the locally bound source address. :param backlog: The backlog of incoming connection requests :return: None """ ret = cm.rdma_listen(self.id, backlog) if ret != 0: raise PyverbsRDMAErrno('Listen Failed') def connect(self, ConnParam param=None): """ Initiates an active connection request to a remote destination. :param param: Optional connection parameters :return: None """ cdef cm.rdma_conn_param *conn = &param.conn_param if param else NULL ret = cm.rdma_connect(self.id, conn) if ret != 0: raise PyverbsRDMAErrno('Failed to Connect') def disconnect(self): """ Disconnects a connection and transitions any associated QP to error state. :return: None """ ret = cm.rdma_disconnect(self.id) if ret != 0: raise PyverbsRDMAErrno('Failed to Disconnect') def accept(self, ConnParam param=None): """ Called from the listening side to accept a connection or datagram service lookup request. :param param: Optional connection parameters :return: None """ cdef cm.rdma_conn_param *conn = &param.conn_param if param else NULL ret = cm.rdma_accept(self.id, conn) if ret != 0: raise PyverbsRDMAErrno('Failed to Accept Connection') def establish(self): """ Complete an active connection request.
If a QP has not been created on the CMID, this method should be called by the active side to complete the connection, after getting a connect response event. This will trigger a connection established event on the passive side. This method should not be used on a CMID on which a QP has been created. """ ret = cm.rdma_establish(self.id) if ret != 0: raise PyverbsRDMAErrno('Failed to Complete an active connection request') def set_local_ece(self, ECE ece): """ Set local ECE parameters to be used for REQ/REP communication. :param ece: ECE object with the requested configuration :return: None """ rc = cm.rdma_set_local_ece(self.id, &ece.ece) if rc != 0: raise PyverbsRDMAErrno('Failed to set local ECE') def get_remote_ece(self): """ Get ECE parameters as they were received from the communication peer. :return: ECE object with the ECE configuration """ ece = ECE() rc = cm.rdma_get_remote_ece(self.id, &ece.ece) if rc != 0: raise PyverbsRDMAErrno('Failed to get remote ECE') return ece def create_qp(self, QPInitAttr qp_init not None): """ Create a QP, which is associated with CMID. If CMID and qp_init don't hold any CQs, new CQs will be created and associated with CMID. If only qp_init provides CQs, they will not be associated with CMID. If both provide CQs, they must be the same CQs. :param qp_init: QP init attributes """ ret = cm.rdma_create_qp(self.id, (<PD>self.pd).pd, &qp_init.attr) if ret != 0: raise PyverbsRDMAErrno('Failed to Create QP') def query_qp(self, attr_mask): """ Query QP using ibv_query_qp. :param attr_mask: Which attributes to query (use enum <ibv_qp_attr_mask>) :return: A (QPAttr, QPInitAttr) tuple, containing the relevant QP info """ attr = QPAttr() init_attr = QPInitAttr() rc = v.ibv_query_qp(self.id.qp, &attr.attr, attr_mask, &init_attr.attr) if rc != 0: raise PyverbsRDMAError('Failed to query QP', rc) return attr, init_attr def init_qp_attr(self, qp_state): """ Initialize a QPAttr object used for state transitions of an external QP (a QP which was not created using CMID). When connecting external QPs using CMIDs both sides must call this method before QP state transition to RTR/RTS in order to obtain relevant QP attributes from CMID. :param qp_state: The QP's destination state :return: A (QPAttr, attr_mask) tuple, where attr_mask defines which attributes of QPAttr are valid """ cdef int attr_mask qp_attr = QPAttr() qp_attr.qp_state = qp_state rc = cm.rdma_init_qp_attr(self.id, &qp_attr.attr, &attr_mask) if rc != 0: raise PyverbsRDMAErrno('Failed to get QP attributes') return qp_attr, attr_mask def reg_msgs(self, size): """ Registers a memory region for sending or receiving messages or for RDMA operations. The registered memory may then be posted to a CMID using post_send or post_recv methods. :param size: The total length of the memory to register :return: registered MR """ return MR(self, size, e.IBV_ACCESS_LOCAL_WRITE) def reg_read(self, size=0): """ Registers a memory region for sending or receiving messages or for remote read operations. :param size: The total length of the memory to register :return: registered MR """ return MR(self, size, e.IBV_ACCESS_REMOTE_READ) def reg_write(self, size=0): """ Registers a memory region for sending or receiving messages or for remote write operations. :param size: The total length of the memory to register :return: registered MR """ return MR(self, size, e.IBV_ACCESS_REMOTE_WRITE) def post_recv(self, MR mr not None, length=None): """ Posts a recv_wr via the QP associated with the CMID. The context param of the rdma_post_recv C function is currently not supported.
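        A hedged sketch of the receive flow (the buffer size is arbitrary):
            mr = cmid.reg_msgs(256)    # register a 256-byte buffer for messaging
            cmid.post_recv(mr)
            wc = cmid.get_recv_comp()  # later, poll the recv CQ for the completion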
:param mr: A valid MR object. :param length: length of buffer to recv (default: mr length). :return: None """ if not length: length = mr.mr.length ret = cm.rdma_post_recv(self.id, NULL, mr.buf, length, mr.mr) if ret != 0: raise PyverbsRDMAErrno('Failed to Post Receive') def post_send(self, MR mr not None, flags=v.IBV_SEND_SIGNALED, length=None): """ Posts a message via the QP associated with the CMID. The context param of the rdma_post_send C function is currently not supported. :param mr: A valid MR object which contains message to send. :param flags: flags for send work request. :param length: length of buffer to send (default: mr length). :return: None """ if not length: length = mr.mr.length ret = cm.rdma_post_send(self.id, NULL, mr.buf, length, mr.mr, flags) if ret != 0: raise PyverbsRDMAErrno('Failed to Post Send') def post_read(self, MR mr not None, length, remote_addr, rkey, flags=0): """ Post read WR using the CMID's internal QP. :param mr: A valid MR object. :param length: length of buffer to send. :param remote_addr: The remote MR address. :param rkey: The remote MR rkey. :param flags: flags for send work request. :return: None """ ret = cm.rdma_post_read(self.id, NULL, mr.buf, length, mr.mr, flags, remote_addr, rkey) if ret != 0: raise PyverbsRDMAErrno('Failed to Post Read') def post_write(self, MR mr not None, length, remote_addr, rkey, flags=0): """ Post write WR using the CMID's internal QP. :param mr: A valid MR object. :param length: length of buffer to send. :param remote_addr: The remote MR address. :param rkey: The remote MR rkey. :param flags: flags for send work request. :return: None """ ret = cm.rdma_post_write(self.id, NULL, mr.buf, length, mr.mr, flags, remote_addr, rkey) if ret != 0: raise PyverbsRDMAErrno('Failed to Post Write') def post_ud_send(self, MR mr not None, AH ah not None, rqpn=0, flags=v.IBV_SEND_SIGNALED, length=None): """ Posts a message via the UD QP associated with the CMID to another UD QP. :param mr: A valid MR object which contains message to send. :param ah: The destination AH. :param rqpn: The remote QP number. :param flags: flags for send work request. :param length: length of buffer to send. :return: None """ if not length: length = mr.mr.length ret = cm.rdma_post_ud_send(self.id, NULL, mr.buf, length, mr.mr, flags, ah.ah, rqpn) if ret != 0: raise PyverbsRDMAErrno('Failed to Post Send') def get_recv_comp(self): """ Polls the receive CQ associated with the CMID for a work completion. :return: The retrieved WC or None if there are no completions """ cdef v.ibv_wc wc ret = cm.rdma_get_recv_comp(self.id, &wc) if ret < 0: raise PyverbsRDMAErrno('Failed to retrieve receive completion') elif ret == 0: return None return WC(wr_id=wc.wr_id, status=wc.status, opcode=wc.opcode, vendor_err=wc.vendor_err, byte_len=wc.byte_len, qp_num=wc.qp_num, src_qp=wc.src_qp, imm_data=wc.imm_data, wc_flags=wc.wc_flags, pkey_index=wc.pkey_index, slid=wc.slid, sl=wc.sl, dlid_path_bits=wc.dlid_path_bits) def get_send_comp(self): """ Polls the send CQ associated with the CMID for a work completion.
:return: The retrieved WC or None if there are no completions """ cdef v.ibv_wc wc ret = cm.rdma_get_send_comp(self.id, &wc) if ret < 0: raise PyverbsRDMAErrno('Failed to retrieve send completion') elif ret == 0: return None return WC(wr_id=wc.wr_id, status=wc.status, opcode=wc.opcode, vendor_err=wc.vendor_err, byte_len=wc.byte_len, qp_num=wc.qp_num, src_qp=wc.src_qp, imm_data=wc.imm_data, wc_flags=wc.wc_flags, pkey_index=wc.pkey_index, slid=wc.slid, sl=wc.sl, dlid_path_bits=wc.dlid_path_bits) def set_option(self, level, optname, optval, optlen): """ Set communication options for a CMID. :param level: The protocol level of the option to set. :param optname: The name of the option to set. :param optval: The option data. :param optlen: The size of the data. """ if optname != ce.RDMA_OPTION_ID_ACK_TIMEOUT: raise PyverbsUserError('Currently only RDMA_OPTION_ID_ACK_TIMEOUT is supported in Pyverbs.') cdef uint8_t value = optval ret = cm.rdma_set_option(self.id, level, optname, &value, optlen) if ret != 0: raise PyverbsRDMAErrno('Failed to set option') def reject(self, private_data=None): """ Reject a connection or datagram service lookup request. :param private_data: Optional private data to send with the reject message. Its length (in bytes) is derived from the data itself. """ data_len = len(private_data) if private_data else 0 buffer = ctypes.create_string_buffer(data_len) if (private_data): buffer.value = private_data cdef size_t data_ptr = ctypes.addressof(buffer) ret = cm.rdma_reject(self.id, <void *>data_ptr, data_len) if ret != 0: raise PyverbsRDMAErrno('Failed to Reject Connection') rdma-core-56.1/pyverbs/cq.pxd000066400000000000000000000022151477342711600161440ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. #cython: language_level=3 from pyverbs.base cimport PyverbsObject, PyverbsCM cimport pyverbs.libibverbs as v cdef class CompChannel(PyverbsCM): cdef v.ibv_comp_channel *cc cpdef close(self) cdef object context cdef add_ref(self, obj) cdef object cqs cdef class CQ(PyverbsCM): cdef v.ibv_cq *cq cpdef close(self) cdef object context cdef add_ref(self, obj) cdef object qps cdef object srqs cdef object channel cdef object num_events cdef object wqs cdef class CqInitAttrEx(PyverbsObject): cdef v.ibv_cq_init_attr_ex attr cdef object channel cdef object parent_domain cdef class CQEX(PyverbsCM): cdef v.ibv_cq_ex *cq cdef v.ibv_cq *ibv_cq cpdef close(self) cdef object context cdef add_ref(self, obj) cdef object qps cdef object srqs cdef object wqs cdef class WC(PyverbsObject): cdef v.ibv_wc wc cdef class PollCqAttr(PyverbsObject): cdef v.ibv_poll_cq_attr attr cdef class WcTmInfo(PyverbsObject): cdef v.ibv_wc_tm_info info rdma-core-56.1/pyverbs/cq.pyx000066400000000000000000000542361477342711600162010ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. import weakref from pyverbs.pyverbs_error import PyverbsError, PyverbsRDMAError from pyverbs.base import PyverbsRDMAErrno from pyverbs.pd cimport PD, ParentDomain from pyverbs.base cimport close_weakrefs cimport pyverbs.libibverbs_enums as e from pyverbs.device cimport Context from pyverbs.srq cimport SRQ from pyverbs.qp cimport QP from pyverbs.wq cimport WQ cdef class CompChannel(PyverbsCM): """ A completion channel is a file descriptor used to deliver completion notifications to a userspace process.
When a completion event is generated for a CQ, the event is delivered via the completion channel attached to the CQ. """ def __init__(self, Context context not None): """ Initializes a completion channel object on the given device. :param context: The device's context to use :return: A CompChannel object on success """ super().__init__() self.cc = v.ibv_create_comp_channel(context.context) if self.cc == NULL: raise PyverbsRDMAErrno('Failed to create a completion channel') self.context = context context.add_ref(self) self.cqs = weakref.WeakSet() self.logger.debug('Created a Completion Channel') def __dealloc__(self): self.close() cpdef close(self): if self.cc != NULL: if self.logger: self.logger.debug('Closing completion channel') close_weakrefs([self.cqs]) rc = v.ibv_destroy_comp_channel(self.cc) if rc != 0: raise PyverbsRDMAError('Failed to destroy a completion channel', rc) self.cc = NULL def get_cq_event(self, CQ expected_cq): """ Waits for the next completion event in the completion event channel. :param expected_cq: The CQ that is expected to get the event :return: None """ cdef v.ibv_cq *cq cdef void *ctx rc = v.ibv_get_cq_event(self.cc, &cq, &ctx) if rc != 0: raise PyverbsRDMAErrno('Failed to get CQ event') if cq != expected_cq.cq: raise PyverbsRDMAErrno('Received event on an unexpected CQ') expected_cq.num_events += 1 cdef add_ref(self, obj): if isinstance(obj, CQ) or isinstance(obj, CQEX): self.cqs.add(obj) cdef class CQ(PyverbsCM): """ A Completion Queue is the notification mechanism for work request completions. A CQ can have 0 or more associated QPs. """ def __init__(self, Context context not None, cqe, cq_context=None, CompChannel channel=None, comp_vector=0): """ Initializes a CQ object with the given parameters. :param context: The device's context on which to open the CQ :param cqe: CQ's capacity :param cq_context: User context's pointer :param channel: If set, will be used to return completion events :param comp_vector: Will be used for signaling completion events. Must be non-negative and smaller than the context's num_comp_vectors :return: The newly created CQ """ super().__init__() if channel is not None: self.cq = v.ibv_create_cq(context.context, cqe, <void *>cq_context, channel.cc, comp_vector) channel.add_ref(self) self.channel = channel else: self.cq = v.ibv_create_cq(context.context, cqe, <void *>cq_context, NULL, comp_vector) self.channel = None if self.cq == NULL: raise PyverbsRDMAErrno('Failed to create a CQ') self.context = context context.add_ref(self) self.qps = weakref.WeakSet() self.srqs = weakref.WeakSet() self.wqs = weakref.WeakSet() self.num_events = 0 self.logger.debug('Created a CQ') cdef add_ref(self, obj): if isinstance(obj, QP): self.qps.add(obj) elif isinstance(obj, SRQ): self.srqs.add(obj) elif isinstance(obj, WQ): self.wqs.add(obj) else: raise PyverbsError('Unrecognized object type') def __dealloc__(self): self.close() cpdef close(self): if self.cq != NULL: if self.logger: self.logger.debug('Closing CQ') close_weakrefs([self.qps, self.srqs, self.wqs]) if self.num_events: self.ack_events(self.num_events) rc = v.ibv_destroy_cq(self.cq) if rc != 0: raise PyverbsRDMAError('Failed to close CQ', rc) self.cq = NULL self.context = None self.channel = None def resize(self, cqe): """ Resizes the completion queue (CQ) to have at least cqe entries. :param cqe: The requested CQ depth. :return: None """ rc = v.ibv_resize_cq(self.cq, cqe) if rc: raise PyverbsRDMAError('Failed to resize CQ', rc) def poll(self, num_entries=1): """ Polls the CQ for completions.
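        A hedged polling sketch (cq is an existing CQ object with outstanding
        work; cqe_status_to_str is defined later in this module):
            npolled, wcs = cq.poll(num_entries=8)
            for wc in wcs:
                assert wc.status == e.IBV_WC_SUCCESS, cqe_status_to_str(wc.status)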
:param num_entries: number of completions to pull :return: (npolled, wcs): The number of polled completions and an array of the polled completions """ cdef v.ibv_wc wc wcs = [] npolled = 0 while npolled < num_entries: rc = v.ibv_poll_cq(self.cq, 1, &wc) if rc < 0: raise PyverbsRDMAError('Failed to poll CQ', -rc) if rc == 0: break npolled += 1 wcs.append(WC(wr_id=wc.wr_id, status=wc.status, opcode=wc.opcode, vendor_err=wc.vendor_err, byte_len=wc.byte_len, qp_num=wc.qp_num, src_qp=wc.src_qp, imm_data=wc.imm_data, wc_flags=wc.wc_flags, pkey_index=wc.pkey_index, slid=wc.slid, sl=wc.sl, dlid_path_bits=wc.dlid_path_bits)) return npolled, wcs def req_notify(self, solicited_only = False): """ Request completion notification on the completion queue. :param solicited_only: If non-zero, notifications will be created only for incoming send / RDMA write WRs with immediate data that have the solicited bit set in their send flags. :return: None """ rc = v.ibv_req_notify_cq(self.cq, solicited_only) if rc != 0: raise PyverbsRDMAError('Request notify CQ returned {rc}'. format(rc=rc), rc) def ack_events(self, num_events): """ Get and acknowledge CQ events :param num_events: Number of events to acknowledge :return: None """ v.ibv_ack_cq_events(self.cq, num_events) self.num_events -= num_events def __str__(self): print_format = '{:22}: {:<20}\n' return 'CQ\n' +\ print_format.format('Handle', self.cq.handle) +\ print_format.format('CQEs', self.cq.cqe) @property def comp_channel(self): return self.channel @property def cqe(self): return self.cq.cqe @property def cq(self): return self.cq cdef class CqInitAttrEx(PyverbsObject): def __init__(self, cqe = 100, CompChannel channel = None, comp_vector = 0, wc_flags = 0, comp_mask = 0, flags = 0, PD parent_domain = None): """ Initializes a CqInitAttrEx object with the given parameters. :param cqe: CQ's capacity :param channel: If set, will be used to return completion events :param comp_vector: Will be used for signaling completion events. Must be non-negative and smaller than the context's num_comp_vectors :param wc_flags: The wc_flags that should be returned in ibv_poll_cq_ex. Or'ed bits of enum ibv_wc_flags_ex. :param comp_mask: compatibility mask (extended verb) :param flags: create cq attr flags - one or more flags from ibv_create_cq_attr_flags enum :param parent_domain: If set, will be used to custom alloc cq buffers.
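        A hedged creation sketch (ctx is an open Context; the wc_flags value
        is illustrative):
            attr = CqInitAttrEx(cqe=128,
                                wc_flags=e.IBV_WC_EX_WITH_COMPLETION_TIMESTAMP)
            cq_ex = CQEX(ctx, attr)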
:return: A CqInitAttrEx object """ super().__init__() self.attr.cqe = cqe self.attr.cq_context = NULL self.attr.channel = NULL if channel is None else channel.cc self.attr.comp_vector = comp_vector self.attr.wc_flags = wc_flags self.attr.comp_mask = comp_mask self.attr.flags = flags self.attr.parent_domain = NULL if parent_domain is None else parent_domain.pd self.channel = channel self.parent_domain = parent_domain @property def cqe(self): return self.attr.cqe @cqe.setter def cqe(self, val): self.attr.cqe = val # Setter-only properties require the older syntax property cq_context: def __set__(self, val): self.attr.cq_context = val @property def parent_domain(self): return self.parent_domain @parent_domain.setter def parent_domain(self, PD val): self.parent_domain = val self.attr.parent_domain = val.pd @property def comp_channel(self): return self.channel @comp_channel.setter def comp_channel(self, CompChannel val): self.channel = val self.attr.channel = val.cc @property def comp_vector(self): return self.attr.comp_vector @comp_vector.setter def comp_vector(self, val): self.attr.comp_vector = val @property def wc_flags(self): return self.attr.wc_flags @wc_flags.setter def wc_flags(self, val): self.attr.wc_flags = val @property def comp_mask(self): return self.attr.comp_mask @comp_mask.setter def comp_mask(self, val): self.attr.comp_mask = val @property def flags(self): return self.attr.flags @flags.setter def flags(self, val): self.attr.flags = val def __str__(self): print_format = '{:22}: {:<20}\n' return print_format.format('Number of CQEs', self.cqe) +\ print_format.format('WC flags', create_wc_flags_to_str(self.wc_flags)) +\ print_format.format('comp mask', self.comp_mask) +\ print_format.format('flags', self.flags) cdef class CQEX(PyverbsCM): def __init__(self, Context context not None, CqInitAttrEx init_attr): """ Initializes a CQEX object on the given device's context with the given attributes. :param context: The device's context on which to open the CQ :param init_attr: Initial attributes that describe the CQ :return: The newly created CQEX on success """ super().__init__() self.qps = weakref.WeakSet() self.srqs = weakref.WeakSet() self.wqs = weakref.WeakSet() if self.cq != NULL: # Leave CQ initialization to the provider return if init_attr is None: init_attr = CqInitAttrEx() self.cq = v.ibv_create_cq_ex(context.context, &init_attr.attr) if init_attr.comp_channel: init_attr.comp_channel.add_ref(self) if init_attr.parent_domain: (<ParentDomain>init_attr.parent_domain).add_ref(self) if self.cq == NULL: raise PyverbsRDMAErrno('Failed to create extended CQ') self.ibv_cq = v.ibv_cq_ex_to_cq(self.cq) self.context = context context.add_ref(self) cdef add_ref(self, obj): if isinstance(obj, QP): self.qps.add(obj) elif isinstance(obj, SRQ): self.srqs.add(obj) elif isinstance(obj, WQ): self.wqs.add(obj) else: raise PyverbsError('Unrecognized object type') def __dealloc__(self): self.close() cpdef close(self): if self.cq != NULL: if self.logger: self.logger.debug('Closing CQEx') close_weakrefs([self.srqs, self.qps, self.wqs]) rc = v.ibv_destroy_cq(<v.ibv_cq *>self.cq) if rc != 0: raise PyverbsRDMAError('Failed to destroy CQEX', rc) self.cq = NULL self.context = None def start_poll(self, PollCqAttr attr): """ Start polling a batch of work completions. :param attr: For easy future extensions :return: 0 on success, ENOENT when no completions are available """ if attr is None: attr = PollCqAttr() return v.ibv_start_poll(self.cq, &attr.attr) def poll_next(self): """ Get the next work completion.
:return: 0 on success, ENOENT when no completions are available """ return v.ibv_next_poll(self.cq) def end_poll(self): """ Indicates the end of polling batch of work completions :return: None """ return v.ibv_end_poll(self.cq) def read_opcode(self): return v.ibv_wc_read_opcode(self.cq) def read_vendor_err(self): return v.ibv_wc_read_vendor_err(self.cq) def read_byte_len(self): return v.ibv_wc_read_byte_len(self.cq) def read_imm_data(self): return v.ibv_wc_read_imm_data(self.cq) def read_qp_num(self): return v.ibv_wc_read_qp_num(self.cq) def read_src_qp(self): return v.ibv_wc_read_src_qp(self.cq) def read_wc_flags(self): return v.ibv_wc_read_wc_flags(self.cq) def read_slid(self): return v.ibv_wc_read_slid(self.cq) def read_sl(self): return v.ibv_wc_read_sl(self.cq) def read_dlid_path_bits(self): return v.ibv_wc_read_dlid_path_bits(self.cq) def read_timestamp(self): return v.ibv_wc_read_completion_ts(self.cq) def read_cvlan(self): return v.ibv_wc_read_cvlan(self.cq) def read_flow_tag(self): return v.ibv_wc_read_flow_tag(self.cq) def read_tm_info(self): info = WcTmInfo() v.ibv_wc_read_tm_info(self.cq, &info.info) return info def read_completion_wallclock_ns(self): return v.ibv_wc_read_completion_wallclock_ns(self.cq) @property def status(self): return self.cq.status @status.setter def status(self, val): self.cq.status = val @property def wr_id(self): return self.cq.wr_id @wr_id.setter def wr_id(self, val): self.cq.wr_id = val def __str__(self): print_format = '{:<22}: {:<20}\n' return 'Extended CQ:\n' +\ print_format.format('Handle', self.cq.handle) +\ print_format.format('CQEs', self.cq.cqe) cdef class WC(PyverbsObject): def __init__(self, wr_id=0, status=0, opcode=0, vendor_err=0, byte_len=0, qp_num=0, src_qp=0, imm_data=0, wc_flags=0, pkey_index=0, slid=0, sl=0, dlid_path_bits=0): super().__init__() self.wc.wr_id = wr_id self.wc.status = status self.wc.opcode = opcode self.wc.vendor_err = vendor_err self.wc.byte_len = byte_len self.wc.qp_num = qp_num self.wc.src_qp = src_qp self.wc.wc_flags = wc_flags self.wc.pkey_index = pkey_index self.wc.slid = slid self.wc.imm_data = imm_data self.wc.sl = sl self.wc.dlid_path_bits = dlid_path_bits @property def wr_id(self): return self.wc.wr_id @wr_id.setter def wr_id(self, val): self.wc.wr_id = val @property def status(self): return self.wc.status @status.setter def status(self, val): self.wc.status = val @property def opcode(self): return self.wc.opcode @opcode.setter def opcode(self, val): self.wc.opcode = val @property def vendor_err(self): return self.wc.vendor_err @vendor_err.setter def vendor_err(self, val): self.wc.vendor_err = val @property def byte_len(self): return self.wc.byte_len @byte_len.setter def byte_len(self, val): self.wc.byte_len = val @property def qp_num(self): return self.wc.qp_num @qp_num.setter def qp_num(self, val): self.wc.qp_num = val @property def src_qp(self): return self.wc.src_qp @src_qp.setter def src_qp(self, val): self.wc.src_qp = val @property def wc_flags(self): return self.wc.wc_flags @wc_flags.setter def wc_flags(self, val): self.wc.wc_flags = val @property def pkey_index(self): return self.wc.pkey_index @pkey_index.setter def pkey_index(self, val): self.wc.pkey_index = val @property def slid(self): return self.wc.slid @slid.setter def slid(self, val): self.wc.slid = val @property def sl(self): return self.wc.sl @sl.setter def sl(self, val): self.wc.sl = val @property def imm_data(self): return self.wc.imm_data @imm_data.setter def imm_data(self, val): self.wc.imm_data = val @property def dlid_path_bits(self): 
return self.wc.dlid_path_bits @dlid_path_bits.setter def dlid_path_bits(self, val): self.wc.dlid_path_bits = val def __str__(self): print_format = '{:22}: {:<20}\n' return print_format.format('WR ID', self.wr_id) +\ print_format.format('status', cqe_status_to_str(self.status)) +\ print_format.format('opcode', cqe_opcode_to_str(self.opcode)) +\ print_format.format('vendor error', self.vendor_err) +\ print_format.format('byte length', self.byte_len) +\ print_format.format('QP num', self.qp_num) +\ print_format.format('source QP', self.src_qp) +\ print_format.format('WC flags', cqe_flags_to_str(self.wc_flags)) +\ print_format.format('pkey index', self.pkey_index) +\ print_format.format('slid', self.slid) +\ print_format.format('sl', self.sl) +\ print_format.format('imm_data', self.imm_data) +\ print_format.format('dlid path bits', self.dlid_path_bits) cdef class PollCqAttr(PyverbsObject): @property def comp_mask(self): return self.attr.comp_mask @comp_mask.setter def comp_mask(self, val): self.attr.comp_mask = val cdef class WcTmInfo(PyverbsObject): @property def tag(self): return self.info.tag @tag.setter def tag(self, val): self.info.tag = val @property def priv(self): return self.info.priv @priv.setter def priv(self, val): self.info.priv = val def cqe_status_to_str(status): try: return {e.IBV_WC_SUCCESS: "success", e.IBV_WC_LOC_LEN_ERR: "local length error", e.IBV_WC_LOC_QP_OP_ERR: "local QP op error", e.IBV_WC_LOC_EEC_OP_ERR: "local EEC op error", e.IBV_WC_LOC_PROT_ERR: "local protection error", e.IBV_WC_WR_FLUSH_ERR: "WR flush error", e.IBV_WC_MW_BIND_ERR: "memory window bind error", e.IBV_WC_BAD_RESP_ERR: "bad response error", e.IBV_WC_LOC_ACCESS_ERR: "local access error", e.IBV_WC_REM_INV_REQ_ERR: "remote invalidate request error", e.IBV_WC_REM_ACCESS_ERR: "remote access error", e.IBV_WC_REM_OP_ERR: "remote op error", e.IBV_WC_RETRY_EXC_ERR: "retry exceeded error", e.IBV_WC_RNR_RETRY_EXC_ERR: "RNR retry exceeded", e.IBV_WC_LOC_RDD_VIOL_ERR: "local RDD violation error", e.IBV_WC_REM_INV_RD_REQ_ERR: "remote invalidate RD request error", e.IBV_WC_REM_ABORT_ERR: "remote abort error", e.IBV_WC_INV_EECN_ERR: "invalidate EECN error", e.IBV_WC_INV_EEC_STATE_ERR: "invalidate EEC state error", e.IBV_WC_FATAL_ERR: "WC fatal error", e.IBV_WC_RESP_TIMEOUT_ERR: "response timeout error", e.IBV_WC_GENERAL_ERR: "general error"}[status] except KeyError: return "Unknown CQE status" def cqe_opcode_to_str(opcode): try: return {0x0: "Send", 0x1:"RDMA write", 0x2: "RDMA read", 0x3: "Compare and swap", 0x4: "Fetch and add", 0x5: "Bind Memory window", 0x6: "Local invalidate", 0x7: "TSO", 0x80: "Receive", 0x81: "Receive RDMA with immediate", 0x82: "Tag matching - add", 0x83: "Tag matching - delete", 0x84: "Tag matching - sync", 0x85: "Tag matching - receive", 0x86: "Tag matching - no tag", 0x87: "Driver WR"}[opcode] except KeyError: return "Unknown CQE opcode {op}".format(op=opcode) def flags_to_str(flags, dictionary): flags_str = "" for f in dictionary: if flags & f: flags_str += dictionary[f] flags_str += " " return flags_str def cqe_flags_to_str(flags): cqe_flags = {1: "GRH", 2: "With immediate", 4: "IP csum OK", 8: "With invalidate", 16: "TM sync request", 32: "TM match", 64: "TM data valid"} return flags_to_str(flags, cqe_flags) def create_wc_flags_to_str(flags): cqe_flags = {e.IBV_WC_EX_WITH_BYTE_LEN: 'IBV_WC_EX_WITH_BYTE_LEN', e.IBV_WC_EX_WITH_IMM: 'IBV_WC_EX_WITH_IMM', e.IBV_WC_EX_WITH_QP_NUM: 'IBV_WC_EX_WITH_QP_NUM', e.IBV_WC_EX_WITH_SRC_QP: 'IBV_WC_EX_WITH_SRC_QP', e.IBV_WC_EX_WITH_SLID: 
'IBV_WC_EX_WITH_SLID', e.IBV_WC_EX_WITH_SL: 'IBV_WC_EX_WITH_SL', e.IBV_WC_EX_WITH_DLID_PATH_BITS: 'IBV_WC_EX_WITH_DLID_PATH_BITS', e.IBV_WC_EX_WITH_COMPLETION_TIMESTAMP: 'IBV_WC_EX_WITH_COMPLETION_TIMESTAMP', e.IBV_WC_EX_WITH_CVLAN: 'IBV_WC_EX_WITH_CVLAN', e.IBV_WC_EX_WITH_FLOW_TAG: 'IBV_WC_EX_WITH_FLOW_TAG', e.IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK: 'IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK'} return flags_to_str(flags, cqe_flags) rdma-core-56.1/pyverbs/device.pxd000066400000000000000000000037031477342711600170030ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2018, Mellanox Technologies. All rights reserved. See COPYING file #cython: language_level=3 from .base cimport PyverbsObject, PyverbsCM cimport pyverbs.libibverbs as v cdef class Context(PyverbsCM): cdef v.ibv_context *context cdef v.ibv_device *device cdef object name cdef add_ref(self, obj) cdef object pds cdef object dms cdef object ccs cdef object cqs cdef object qps cdef object xrcds cdef object vars cdef object uars cdef object pps cdef object sched_nodes cdef object sched_leafs cdef object dr_domains cdef object wqs cdef object rwq_ind_tbls cdef object crypto_logins cdef class DeviceAttr(PyverbsObject): cdef v.ibv_device_attr dev_attr cdef class QueryDeviceExInput(PyverbsObject): cdef v.ibv_query_device_ex_input input cdef class ODPCaps(PyverbsObject): cdef v.ibv_odp_caps odp_caps cdef object xrc_odp_caps cdef class RSSCaps(PyverbsObject): cdef v.ibv_rss_caps rss_caps cdef class PacketPacingCaps(PyverbsObject): cdef v.ibv_packet_pacing_caps packet_pacing_caps cdef class PCIAtomicCaps(PyverbsObject): cdef v.ibv_pci_atomic_caps caps cdef class TMCaps(PyverbsObject): cdef v.ibv_tm_caps tm_caps cdef class CQModerationCaps(PyverbsObject): cdef v.ibv_cq_moderation_caps cq_mod_caps cdef class TSOCaps(PyverbsObject): cdef v.ibv_tso_caps tso_caps cdef class DeviceAttrEx(PyverbsObject): cdef v.ibv_device_attr_ex dev_attr cdef class AllocDmAttr(PyverbsObject): cdef v.ibv_alloc_dm_attr alloc_dm_attr cdef class DM(PyverbsCM): cdef v.ibv_dm *dm cdef object dm_mrs cdef object context cdef object _is_imported cdef add_ref(self, obj) cdef class PortAttr(PyverbsObject): cdef v.ibv_port_attr attr cdef class GIDEntry(PyverbsObject): cdef v.ibv_gid_entry entry cdef class AsyncEvent(PyverbsObject): cdef v.ibv_async_event event rdma-core-56.1/pyverbs/device.pyx000066400000000000000000001315511477342711600170330ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2018, Mellanox Technologies. All rights reserved. See COPYING file """ Device module introduces the Context and DeviceAttr classes. It allows the user to open an IB device (using Context(name=<name>)) and query it, which returns a DeviceAttr object.
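A hedged usage sketch (the device name is a placeholder):
    ctx = Context(name='mlx5_0')
    attr = ctx.query_device()
    print(attr.fw_version)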
""" import weakref from .pyverbs_error import PyverbsRDMAError, PyverbsError from pyverbs.cq cimport CQEX, CQ, CompChannel from .pyverbs_error import PyverbsUserError from pyverbs.base import PyverbsRDMAErrno from pyverbs.base cimport close_weakrefs from pyverbs.wq cimport WQ, RwqIndTable cimport pyverbs.libibverbs_enums as e cimport pyverbs.libibverbs as v cimport pyverbs.librdmacm as cm from pyverbs.cmid cimport CMID from pyverbs.xrcd cimport XRCD from pyverbs.addr cimport GID from pyverbs.mr import DMMR from pyverbs.pd cimport PD from pyverbs.qp cimport QP from libc.stdlib cimport free, malloc from libc.string cimport memset from libc.stdint cimport uint64_t from libc.stdint cimport uint16_t from libc.stdint cimport uint32_t from pyverbs.utils import gid_str cdef extern from 'endian.h': unsigned long be64toh(unsigned long host_64bits); class Device(PyverbsObject): """ Device class represents the C ibv_device. It stores device's properties. It is not a part of objects creation order - there's no need for the user to create it for such purposes. """ def __init__(self, name, guid, node_type, transport_type, index): self._node_type = node_type self._transport_type = transport_type self._name = name self._guid = guid self._index = index @property def name(self): return self._name @property def node_type(self): return self._node_type @property def transport_type(self): return self._transport_type @property def guid(self): return self._guid @property def index(self): return self._index def __str__(self): return 'Device {dev}, node type {ntype}, transport type {ttype},' \ ' guid {guid}, index {index}'.format(dev=self.name.decode(), ntype=translate_node_type(self.node_type), ttype=translate_transport_type(self.transport_type), guid=guid_to_hex(self.guid), index=self._index) cdef class Context(PyverbsCM): """ Context class represents the C ibv_context. """ def __init__(self, **kwargs): """ Initializes a Context object. The function searches the IB devices list for a device with the name provided by the user. If such a device is found, it is opened (unless provider attributes were given). In case of cmid argument, CMID object already holds an ibv_context initiated pointer, hence all we have to do is assign this pointer to Context's object pointer. :param kwargs: Arguments: * *name* The device's name * *attr* Provider-specific attributes. If not None, it means that the device will be opened by the provider and __init__ will return after locating the requested device. * *cmid* A CMID object. If not None, it means that the device was already opened by a CMID class, and only a pointer assignment is missing. * *cmd_fd* A command FD. If passed, the device will be imported from the given cmd_fd using ibv_import_device. 
:return: None """ cdef int count cdef v.ibv_device **dev_list cdef CMID cmid super().__init__() self.pds = weakref.WeakSet() self.dms = weakref.WeakSet() self.ccs = weakref.WeakSet() self.cqs = weakref.WeakSet() self.qps = weakref.WeakSet() self.xrcds = weakref.WeakSet() self.vars = weakref.WeakSet() self.uars = weakref.WeakSet() self.pps = weakref.WeakSet() self.sched_nodes = weakref.WeakSet() self.sched_leafs = weakref.WeakSet() self.dr_domains = weakref.WeakSet() self.wqs = weakref.WeakSet() self.rwq_ind_tbls = weakref.WeakSet() self.crypto_logins = weakref.WeakSet() self.name = kwargs.get('name') provider_attr = kwargs.get('attr') cmid = kwargs.get('cmid') cmd_fd = kwargs.get('cmd_fd') if cmid is not None: self.context = cmid.id.verbs self.name = str(v.ibv_get_device_name(self.context.device).decode('utf-8')) cmid.ctx = self return if cmd_fd is not None: self.context = v.ibv_import_device(cmd_fd) if self.context == NULL: raise PyverbsRDMAErrno('Failed to import device') self.name = str(v.ibv_get_device_name(self.context.device).decode('utf-8')) return if self.name is None: raise PyverbsUserError('Device name must be provided') dev_list = v.ibv_get_device_list(&count) if dev_list == NULL: raise PyverbsRDMAError('Failed to get devices list') try: for i in range(count): if dev_list[i].name.decode() == self.name: if provider_attr is not None: # A provider opens its own context, we're just # setting its IB device self.device = dev_list[i] return self.context = v.ibv_open_device(dev_list[i]) if self.context == NULL: raise PyverbsRDMAErrno('Failed to open device {dev}'. format(dev=self.name)) self.logger.debug('Context: opened device {dev}'. format(dev=self.name)) break else: raise PyverbsRDMAError('Failed to find device {dev}'. format(dev=self.name)) finally: v.ibv_free_device_list(dev_list) def __dealloc__(self): """ Closes the inner IB device. :return: None """ self.close() cpdef close(self): if self.context != NULL: if self.logger: self.logger.debug('Closing Context') close_weakrefs([self.qps, self.crypto_logins, self.rwq_ind_tbls, self.wqs, self.ccs, self.cqs, self.dms, self.pds, self.xrcds, self.vars, self.sched_leafs, self.sched_nodes, self.dr_domains]) rc = v.ibv_close_device(self.context) if rc != 0: raise PyverbsRDMAErrno(f'Failed to close device {self.name}') self.context = NULL @property def context(self): return self.context @property def num_comp_vectors(self): return self.context.num_comp_vectors def query_device(self): """ Queries the device's attributes. :return: A DeviceAttr object which holds the device's attributes as reported by the hardware. """ dev_attr = DeviceAttr() rc = v.ibv_query_device(self.context, &dev_attr.dev_attr) if rc != 0: raise PyverbsRDMAError('Failed to query device {name}'. format(name=self.name), rc) return dev_attr def query_device_ex(self, QueryDeviceExInput ex_input = None): """ Queries the device's extended attributes. :param ex_input: An extensible input struct for possible future extensions :return: DeviceAttrEx object """ dev_attr_ex = DeviceAttrEx() rc = v.ibv_query_device_ex(self.context, &ex_input.input if ex_input is not None else NULL, &dev_attr_ex.dev_attr) if rc != 0: raise PyverbsRDMAError('Failed to query EX device {name}'. 
format(name=self.name), rc) return dev_attr_ex def query_pkey(self, unsigned int port_num, int index): cdef uint16_t pkey rc = v.ibv_query_pkey(self.context, port_num, index, &pkey) if rc != 0: raise PyverbsRDMAError(f'Failed to query pkey {index} of port {port_num}') return pkey def get_pkey_index(self, unsigned int port_num, int pkey): idx = v.ibv_get_pkey_index(self.context, port_num, pkey) if idx == -1: raise PyverbsRDMAError(f'Failed to get pkey index of pkey = {pkey} of port {port_num}') return idx def query_gid(self, unsigned int port_num, int index): gid = GID() rc = v.ibv_query_gid(self.context, port_num, index, &gid.gid) if rc != 0: raise PyverbsRDMAError('Failed to query gid {idx} of port {port}'. format(idx=index, port=port_num)) return gid def query_gid_type(self, unsigned int port_num, unsigned int index): cdef v.ibv_gid_type_sysfs gid_type rc = v.ibv_query_gid_type(self.context, port_num, index, &gid_type) if rc != 0: raise PyverbsRDMAErrno('Failed to query gid type of port {p} and gid index {g}' .format(p=port_num, g=index)) return gid_type def query_port(self, unsigned int port_num): """ Query port <port_num> of the device and return its attributes. :param port_num: Port number to query :return: PortAttr object on success """ port_attrs = PortAttr() rc = v.ibv_query_port(self.context, port_num, &port_attrs.attr) if rc != 0: raise PyverbsRDMAError('Failed to query port {p}'. format(p=port_num), rc) return port_attrs def query_gid_table(self, size_t max_entries, uint32_t flags=0): """ Queries the GID tables of the device for at most <max_entries> entries and returns them. :param max_entries: Maximum number of GID entries to retrieve :param flags: Specifies new extra members of struct ibv_gid_entry to query :return: List of GIDEntry objects on success """ cdef v.ibv_gid_entry *entries cdef v.ibv_gid_entry entry entries = <v.ibv_gid_entry *>malloc(max_entries * sizeof(v.ibv_gid_entry)) rc = v.ibv_query_gid_table(self.context, entries, max_entries, flags) if rc < 0: raise PyverbsRDMAError('Failed to query gid tables of the device', rc) gid_entries = [] for i in range(rc): entry = entries[i] gid_entries.append(GIDEntry(entry.gid._global.subnet_prefix, entry.gid._global.interface_id, entry.gid_index, entry.port_num, entry.gid_type, entry.ndev_ifindex)) free(entries) return gid_entries def query_gid_ex(self, uint32_t port_num, uint32_t gid_index, uint32_t flags=0): """ Queries the GID table of port <port_num> in index <gid_index>, and returns the GID entry. :param port_num: The port number to query :param gid_index: The index in the GID table to query :param flags: Specifies new extra members of struct ibv_gid_entry to query :return: GIDEntry object on success """ entry = GIDEntry() rc = v.ibv_query_gid_ex(self.context, port_num, gid_index, &entry.entry, flags) if rc != 0: raise PyverbsRDMAError(f'Failed to query gid table of port '\ f'{port_num} in index {gid_index}', rc) return entry def query_rt_values_ex(self, comp_mask=v.IBV_VALUES_MASK_RAW_CLOCK): """ Query an RDMA device for some real time values.
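        A hedged sketch (ctx is an open Context; the default mask queries the
        raw HW clock):
            sec, nsec = ctx.query_rt_values_ex()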
:return: A tuple of the real time values according to comp_mask (sec, nsec) """ cdef v.ibv_values_ex *val val = malloc(sizeof(v.ibv_values_ex)) val.comp_mask = comp_mask rc = v.ibv_query_rt_values_ex(self.context, val) if rc != 0: raise PyverbsRDMAError(f'Failed to query real time values', rc) if val.comp_mask != comp_mask: raise PyverbsRDMAError(f'Failed to query real time values with requested comp_mask') nsec = (val).raw_clock.tv_nsec sec = (val).raw_clock.tv_sec free(val) return sec, nsec cdef add_ref(self, obj): if isinstance(obj, PD): self.pds.add(obj) elif isinstance(obj, DM): self.dms.add(obj) elif isinstance(obj, CompChannel): self.ccs.add(obj) elif isinstance(obj, CQ) or isinstance(obj, CQEX): self.cqs.add(obj) elif isinstance(obj, QP): self.qps.add(obj) elif isinstance(obj, XRCD): self.xrcds.add(obj) elif isinstance(obj, WQ): self.wqs.add(obj) elif isinstance(obj, RwqIndTable): self.rwq_ind_tbls.add(obj) else: raise PyverbsError('Unrecognized object type') def get_async_event(self): event = AsyncEvent() rc = v.ibv_get_async_event(self.context, &event.event) if rc != 0: raise PyverbsRDMAError(f'Failed to get async event', rc) return event @property def cmd_fd(self): return self.context.cmd_fd @property def name(self): return self.name cdef class DeviceAttr(PyverbsObject): """ DeviceAttr represents ibv_device_attr C class. It exposes the same properties (read only) and also provides an __str__() function for readability. """ @property def fw_version(self): return self.dev_attr.fw_ver.decode() @property def node_guid(self): return self.dev_attr.node_guid @property def sys_image_guid(self): return self.dev_attr.sys_image_guid @property def max_mr_size(self): return self.dev_attr.max_mr_size @property def page_size_cap(self): return self.dev_attr.page_size_cap @property def vendor_id(self): return self.dev_attr.vendor_id @property def vendor_part_id(self): return self.dev_attr.vendor_part_id @property def hw_ver(self): return self.dev_attr.hw_ver @property def max_qp(self): return self.dev_attr.max_qp @property def max_qp_wr(self): return self.dev_attr.max_qp_wr @property def device_cap_flags(self): return self.dev_attr.device_cap_flags @property def max_sge(self): return self.dev_attr.max_sge @property def max_sge_rd(self): return self.dev_attr.max_sge_rd @property def max_cq(self): return self.dev_attr.max_cq @property def max_cqe(self): return self.dev_attr.max_cqe @property def max_mr(self): return self.dev_attr.max_mr @property def max_pd(self): return self.dev_attr.max_pd @property def max_qp_rd_atom(self): return self.dev_attr.max_qp_rd_atom @property def max_ee_rd_atom(self): return self.dev_attr.max_ee_rd_atom @property def max_res_rd_atom(self): return self.dev_attr.max_res_rd_atom @property def max_qp_init_rd_atom(self): return self.dev_attr.max_qp_init_rd_atom @property def max_ee_init_rd_atom(self): return self.dev_attr.max_ee_init_rd_atom @property def atomic_caps(self): return self.dev_attr.atomic_cap @property def max_ee(self): return self.dev_attr.max_ee @property def max_rdd(self): return self.dev_attr.max_rdd @property def max_mw(self): return self.dev_attr.max_mw @property def max_raw_ipv6_qps(self): return self.dev_attr.max_raw_ipv6_qp @property def max_raw_ethy_qp(self): return self.dev_attr.max_raw_ethy_qp @property def max_mcast_grp(self): return self.dev_attr.max_mcast_grp @property def max_mcast_qp_attach(self): return self.dev_attr.max_mcast_qp_attach @property def max_ah(self): return self.dev_attr.max_ah @property def max_fmr(self): return 
self.dev_attr.max_fmr @property def max_map_per_fmr(self): return self.dev_attr.max_map_per_fmr @property def max_srq(self): return self.dev_attr.max_srq @property def max_srq_wr(self): return self.dev_attr.max_srq_wr @property def max_srq_sge(self): return self.dev_attr.max_srq_sge @property def max_pkeys(self): return self.dev_attr.max_pkeys @property def local_ca_ack_delay(self): return self.dev_attr.local_ca_ack_delay @property def phys_port_cnt(self): return self.dev_attr.phys_port_cnt def __str__(self): print_format = '{:<22}: {:<20}\n' return print_format.format('FW version', self.fw_version) +\ print_format.format('Node guid', guid_format(self.node_guid)) +\ print_format.format('Sys image GUID', guid_format(self.sys_image_guid)) +\ print_format.format('Max MR size', hex(self.max_mr_size).replace('L', '')) +\ print_format.format('Page size cap', hex(self.page_size_cap).replace('L', '')) +\ print_format.format('Vendor ID', hex(self.vendor_id)) +\ print_format.format('Vendor part ID', self.vendor_part_id) +\ print_format.format('HW version', self.hw_ver) +\ print_format.format('Max QP', self.max_qp) +\ print_format.format('Max QP WR', self.max_qp_wr) +\ print_format.format('Device cap flags', translate_device_caps(self.device_cap_flags)) +\ print_format.format('Max SGE', self.max_sge) +\ print_format.format('Max SGE RD', self.max_sge_rd) +\ print_format.format('MAX CQ', self.max_cq) +\ print_format.format('Max CQE', self.max_cqe) +\ print_format.format('Max MR', self.max_mr) +\ print_format.format('Max PD', self.max_pd) +\ print_format.format('Max QP RD atom', self.max_qp_rd_atom) +\ print_format.format('Max EE RD atom', self.max_ee_rd_atom) +\ print_format.format('Max res RD atom', self.max_res_rd_atom) +\ print_format.format('Max QP init RD atom', self.max_qp_init_rd_atom) +\ print_format.format('Max EE init RD atom', self.max_ee_init_rd_atom) +\ print_format.format('Atomic caps', self.atomic_caps) +\ print_format.format('Max EE', self.max_ee) +\ print_format.format('Max RDD', self.max_rdd) +\ print_format.format('Max MW', self.max_mw) +\ print_format.format('Max raw IPv6 QPs', self.max_raw_ipv6_qps) +\ print_format.format('Max raw ethy QP', self.max_raw_ethy_qp) +\ print_format.format('Max mcast group', self.max_mcast_grp) +\ print_format.format('Max mcast QP attach', self.max_mcast_qp_attach) +\ print_format.format('Max AH', self.max_ah) +\ print_format.format('Max FMR', self.max_fmr) +\ print_format.format('Max map per FMR', self.max_map_per_fmr) +\ print_format.format('Max SRQ', self.max_srq) +\ print_format.format('Max SRQ WR', self.max_srq_wr) +\ print_format.format('Max SRQ SGE', self.max_srq_sge) +\ print_format.format('Max PKeys', self.max_pkeys) +\ print_format.format('local CA ack delay', self.local_ca_ack_delay) +\ print_format.format('Phys port count', self.phys_port_cnt) cdef class QueryDeviceExInput(PyverbsObject): def __init__(self, comp_mask): super().__init__() self.ex_input.comp_mask = comp_mask cdef class ODPCaps(PyverbsObject): @property def general_caps(self): return self.odp_caps.general_caps @property def rc_odp_caps(self): return self.odp_caps.per_transport_caps.rc_odp_caps @property def uc_odp_caps(self): return self.odp_caps.per_transport_caps.uc_odp_caps @property def ud_odp_caps(self): return self.odp_caps.per_transport_caps.ud_odp_caps @property def xrc_odp_caps(self): return self.xrc_odp_caps @xrc_odp_caps.setter def xrc_odp_caps(self, val): self.xrc_odp_caps = val def __str__(self): general_caps = {e.IBV_ODP_SUPPORT: 'IBV_ODP_SUPPORT', 
e.IBV_ODP_SUPPORT_IMPLICIT: 'IBV_ODP_SUPPORT_IMPLICIT'} l = {e.IBV_ODP_SUPPORT_SEND: 'IBV_ODP_SUPPORT_SEND', e.IBV_ODP_SUPPORT_RECV: 'IBV_ODP_SUPPORT_RECV', e.IBV_ODP_SUPPORT_WRITE: 'IBV_ODP_SUPPORT_WRITE', e.IBV_ODP_SUPPORT_READ: 'IBV_ODP_SUPPORT_READ', e.IBV_ODP_SUPPORT_ATOMIC: 'IBV_ODP_SUPPORT_ATOMIC', e.IBV_ODP_SUPPORT_SRQ_RECV: 'IBV_ODP_SUPPORT_SRQ_RECV'} print_format = '{}: {}\n' return print_format.format('ODP General caps', str_from_flags(self.general_caps, general_caps)) +\ print_format.format('RC ODP caps', str_from_flags(self.rc_odp_caps, l)) +\ print_format.format('UD ODP caps', str_from_flags(self.ud_odp_caps, l)) +\ print_format.format('UC ODP caps', str_from_flags(self.uc_odp_caps, l)) +\ print_format.format('XRC ODP caps', str_from_flags(self.xrc_odp_caps, l)) cdef class PCIAtomicCaps(PyverbsObject): @property def fetch_add(self): return self.caps.fetch_add @property def swap(self): return self.caps.swap @property def compare_swap(self): return self.caps.compare_swap cdef class TSOCaps(PyverbsObject): @property def max_tso(self): return self.tso_caps.max_tso @property def supported_qpts(self): return self.tso_caps.supported_qpts cdef class RSSCaps(PyverbsObject): @property def supported_qpts(self): return self.rss_caps.supported_qpts @property def max_rwq_indirection_tables(self): return self.rss_caps.max_rwq_indirection_tables @property def rx_hash_fields_mask(self): return self.rss_caps.rx_hash_fields_mask @property def rx_hash_function(self): return self.rss_caps.rx_hash_function @property def max_rwq_indirection_table_size(self): return self.rss_caps.max_rwq_indirection_table_size cdef class PacketPacingCaps(PyverbsObject): @property def qp_rate_limit_min(self): return self.packet_pacing_caps.qp_rate_limit_min @property def qp_rate_limit_max(self): return self.packet_pacing_caps.qp_rate_limit_max @property def supported_qpts(self): return self.packet_pacing_caps.supported_qpts cdef class TMCaps(PyverbsObject): @property def max_rndv_hdr_size(self): return self.tm_caps.max_rndv_hdr_size @property def max_num_tags(self): return self.tm_caps.max_num_tags @property def flags(self): return self.tm_caps.flags @property def max_ops(self): return self.tm_caps.max_ops @property def max_sge(self): return self.tm_caps.max_sge cdef class CQModerationCaps(PyverbsObject): @property def max_cq_count(self): return self.cq_mod_caps.max_cq_count @property def max_cq_period(self): return self.cq_mod_caps.max_cq_period cdef class DeviceAttrEx(PyverbsObject): @property def orig_attr(self): attr = DeviceAttr() attr.dev_attr = self.dev_attr.orig_attr return attr @property def comp_mask(self): return self.dev_attr.comp_mask @comp_mask.setter def comp_mask(self, val): self.dev_attr.comp_mask = val @property def odp_caps(self): caps = ODPCaps() caps.odp_caps = self.dev_attr.odp_caps caps.xrc_odp_caps = self.dev_attr.xrc_odp_caps return caps @property def completion_timestamp_mask(self): return self.dev_attr.completion_timestamp_mask @property def hca_core_clock(self): return self.dev_attr.hca_core_clock @property def device_cap_flags_ex(self): return self.dev_attr.device_cap_flags_ex @property def tso_caps(self): caps = TSOCaps() caps.tso_caps = self.dev_attr.tso_caps return caps @property def pci_atomic_caps(self): caps = PCIAtomicCaps() caps.caps = self.dev_attr.pci_atomic_caps return caps @property def rss_caps(self): caps = RSSCaps() caps.rss_caps = self.dev_attr.rss_caps return caps @property def max_wq_type_rq(self): return self.dev_attr.max_wq_type_rq @property def 
packet_pacing_caps(self):
        caps = PacketPacingCaps()
        caps.packet_pacing_caps = self.dev_attr.packet_pacing_caps
        return caps

    @property
    def raw_packet_caps(self):
        return self.dev_attr.raw_packet_caps

    @property
    def tm_caps(self):
        caps = TMCaps()
        caps.tm_caps = self.dev_attr.tm_caps
        return caps

    @property
    def cq_mod_caps(self):
        caps = CQModerationCaps()
        caps.cq_mod_caps = self.dev_attr.cq_mod_caps
        return caps

    @property
    def max_dm_size(self):
        return self.dev_attr.max_dm_size

    @property
    def phys_port_cnt_ex(self):
        return self.dev_attr.phys_port_cnt_ex


cdef class AllocDmAttr(PyverbsObject):
    def __init__(self, length, log_align_req = 0, comp_mask = 0):
        """
        Creates an AllocDmAttr object with the given parameters. This object
        can then be used to create a DM object.
        :param length: Length of the future device memory
        :param log_align_req: log2 of address alignment requirement
        :param comp_mask: compatibility mask
        :return: An AllocDmAttr object
        """
        super().__init__()
        self.alloc_dm_attr.length = length
        self.alloc_dm_attr.log_align_req = log_align_req
        self.alloc_dm_attr.comp_mask = comp_mask

    @property
    def length(self):
        return self.alloc_dm_attr.length

    @length.setter
    def length(self, val):
        self.alloc_dm_attr.length = val

    @property
    def log_align_req(self):
        return self.alloc_dm_attr.log_align_req

    @log_align_req.setter
    def log_align_req(self, val):
        self.alloc_dm_attr.log_align_req = val

    @property
    def comp_mask(self):
        return self.alloc_dm_attr.comp_mask

    @comp_mask.setter
    def comp_mask(self, val):
        self.alloc_dm_attr.comp_mask = val


cdef class DM(PyverbsCM):
    def __init__(self, Context context, AllocDmAttr dm_attr=None, **kwargs):
        """
        Allocate device (direct) memory.
        :param context: The context of the device on which to allocate memory
        :param dm_attr: Attributes that define the DM
        :param kwargs: Arguments:
            * *handle*
                A valid kernel handle for a DM object in the given context.
                If passed, the DM will be imported and associated with the
                given context using ibv_import_dm.
        :return: A DM object on success
        """
        super().__init__()
        self.dm_mrs = weakref.WeakSet()
        dm_handle = kwargs.get('handle')
        if dm_handle is not None:
            self.dm = v.ibv_import_dm(context.context, dm_handle)
            if self.dm == NULL:
                raise PyverbsRDMAErrno('Failed to import DM')
            self._is_imported = True
        else:
            device_attr = context.query_device_ex()
            if device_attr.max_dm_size <= 0:
                raise PyverbsUserError('Device doesn\'t support dm allocation')
            self.dm = v.ibv_alloc_dm(context.context, &dm_attr.alloc_dm_attr)
            if self.dm == NULL:
                raise PyverbsRDMAErrno('Failed to allocate device memory of size '
                                       '{size}. Max available size {max}.'
                                       .format(size=dm_attr.length,
                                               max=device_attr.max_dm_size))
        self.context = context
        context.add_ref(self)

    def unimport(self):
        v.ibv_unimport_dm(self.dm)
        self.close()

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        """
        Closes the underlying C object of the DM.
        In case of an imported DM, the DM won't be freed, and it's kept for
        the original DM object, in order to prevent double free by Python GC.
        """
        if self.dm != NULL:
            if self.logger:
                self.logger.debug('Closing DM')
            close_weakrefs([self.dm_mrs])
            if not self._is_imported:
                rc = v.ibv_free_dm(self.dm)
                if rc != 0:
                    raise PyverbsRDMAError('Failed to free dm', rc)
            self.dm = NULL
            self.context = None

    cdef add_ref(self, obj):
        if isinstance(obj, DMMR):
            self.dm_mrs.add(obj)

    def copy_to_dm(self, dm_offset, data, length):
        rc = v.ibv_memcpy_to_dm(self.dm, dm_offset, <char *>data, length)
        if rc != 0:
            raise PyverbsRDMAError('Failed to copy to dm', rc)

    def copy_from_dm(self, dm_offset, length):
        cdef char *data = <char *>malloc(length)
        memset(data, 0, length)
        rc = v.ibv_memcpy_from_dm(data, self.dm, dm_offset, length)
        if rc != 0:
            raise PyverbsRDMAError('Failed to copy from dm', rc)
        res = data[:length]
        free(data)
        return res

    @property
    def handle(self):
        return self.dm.handle
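# Illustrative usage sketch (not part of the original file): allocating
# device memory through AllocDmAttr/DM above and round-tripping a buffer.
# Assumes a device that reports max_dm_size > 0; 'mlx5_0' is a placeholder
# device name.
from pyverbs.device import Context, DM, AllocDmAttr

ctx = Context(name='mlx5_0')
dm = DM(ctx, AllocDmAttr(length=16))
dm.copy_to_dm(0, b'0123456789abcdef', 16)
assert dm.copy_from_dm(0, 16) == b'0123456789abcdef'
dm.close()
ctx.close()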
""" if self.dm != NULL: if self.logger: self.logger.debug('Closing DM') close_weakrefs([self.dm_mrs]) if not self._is_imported: rc = v.ibv_free_dm(self.dm) if rc != 0: raise PyverbsRDMAError('Failed to free dm', rc) self.dm = NULL self.context = None cdef add_ref(self, obj): if isinstance(obj, DMMR): self.dm_mrs.add(obj) def copy_to_dm(self, dm_offset, data, length): rc = v.ibv_memcpy_to_dm(self.dm, dm_offset, data, length) if rc != 0: raise PyverbsRDMAError('Failed to copy to dm', rc) def copy_from_dm(self, dm_offset, length): cdef char *data =malloc(length) memset(data, 0, length) rc = v.ibv_memcpy_from_dm(data, self.dm, dm_offset, length) if rc != 0: raise PyverbsRDMAError('Failed to copy from dm', rc) res = data[:length] free(data) return res @property def handle(self): return self.dm.handle cdef class PortAttr(PyverbsObject): @property def state(self): return self.attr.state @property def max_mtu(self): return self.attr.max_mtu @property def active_mtu(self): return self.attr.active_mtu @property def gid_tbl_len(self): return self.attr.gid_tbl_len @property def port_cap_flags(self): return self.attr.port_cap_flags @property def max_msg_sz(self): return self.attr.max_msg_sz @property def bad_pkey_cntr(self): return self.attr.bad_pkey_cntr @property def qkey_viol_cntr(self): return self.attr.qkey_viol_cntr @property def pkey_tbl_len(self): return self.attr.pkey_tbl_len @property def lid(self): return self.attr.lid @property def sm_lid(self): return self.attr.sm_lid @property def lmc(self): return self.attr.lmc @property def max_vl_num(self): return self.attr.max_vl_num @property def sm_sl(self): return self.attr.sm_sl @property def subnet_timeout(self): return self.attr.subnet_timeout @property def init_type_reply(self): return self.attr.init_type_reply @property def active_width(self): return self.attr.active_width @property def active_speed(self): return self.attr.active_speed @property def phys_state(self): return self.attr.phys_state @property def link_layer(self): return self.attr.link_layer @property def flags(self): return self.attr.flags @property def port_cap_flags2(self): return self.attr.port_cap_flags2 @property def active_speed_ex(self): return self.attr.active_speed_ex def __str__(self): print_format = '{:<24}: {:<20}\n' return print_format.format('Port state', port_state_to_str(self.attr.state)) +\ print_format.format('Max MTU', translate_mtu(self.attr.max_mtu)) +\ print_format.format('Active MTU', translate_mtu(self.attr.active_mtu)) +\ print_format.format('SM lid', self.attr.sm_lid) +\ print_format.format('Port lid', self.attr.lid) +\ print_format.format('lmc', hex(self.attr.lmc)) +\ print_format.format('Link layer', translate_link_layer(self.attr.link_layer)) +\ print_format.format('Max message size', hex(self.attr.max_msg_sz)) +\ print_format.format('Port cap flags', translate_port_cap_flags(self.attr.port_cap_flags)) +\ print_format.format('Port cap flags 2', translate_port_cap_flags2(self.attr.port_cap_flags2)) +\ print_format.format('max VL num', self.attr.max_vl_num) +\ print_format.format('Bad Pkey counter', self.attr.bad_pkey_cntr) +\ print_format.format('Qkey violations counter', self.attr.qkey_viol_cntr) +\ print_format.format('GID table len', self.attr.gid_tbl_len) +\ print_format.format('Pkey table len', self.attr.pkey_tbl_len) +\ print_format.format('SM sl', self.attr.sm_sl) +\ print_format.format('Subnet timeout', self.attr.subnet_timeout) +\ print_format.format('Init type reply', self.attr.init_type_reply) +\ print_format.format('Active width', 
width_to_str(self.attr.active_width)) +\ print_format.format('Active speed', speed_to_str(self.attr.active_speed, self.attr.active_speed_ex)) +\ print_format.format('Phys state', phys_state_to_str(self.attr.phys_state)) +\ print_format.format('Flags', self.attr.flags) cdef class GIDEntry(PyverbsObject): def __init__(self, subnet_prefix=0, interface_id=0, gid_index=0, port_num=0, gid_type=0, ndev_ifindex=0): super().__init__() self.entry.gid._global.subnet_prefix = subnet_prefix self.entry.gid._global.interface_id = interface_id self.entry.gid_index = gid_index self.entry.port_num = port_num self.entry.gid_type = gid_type self.entry.ndev_ifindex = ndev_ifindex @property def gid_subnet_prefix(self): return self.entry.gid._global.subnet_prefix @property def gid_interface_id(self): return self.entry.gid._global.interface_id @property def gid_index(self): return self.entry.gid_index @property def port_num(self): return self.entry.port_num @property def gid_type(self): return self.entry.gid_type @property def ndev_ifindex(self): return self.entry.ndev_ifindex def gid_str(self): return gid_str(self.gid_subnet_prefix, self.gid_interface_id) def __str__(self): print_format = '{:<24}: {:<20}\n' return print_format.format('GID', self.gid_str()) +\ print_format.format('GID Index', self.gid_index) +\ print_format.format('Port number', self.port_num) +\ print_format.format('GID type', translate_gid_type( self.gid_type)) +\ print_format.format('Ndev ifindex', self.ndev_ifindex) cdef class AsyncEvent(PyverbsObject): def __init__(self, event_type=0): super().__init__() self.event.event_type = event_type def ack(self): v.ibv_ack_async_event(&self.event) @property def event_type(self): return self.event.event_type def __str__(self): print_format = '{:<24}: {:<20}\n' return print_format.format('Event Type', translate_event_type( self.event.event_type)) def translate_gid_type(gid_type): types = {e.IBV_GID_TYPE_IB: 'IB', e.IBV_GID_TYPE_ROCE_V1: 'RoCEv1', e.IBV_GID_TYPE_ROCE_V2: 'RoCEv2'} try: return types[gid_type] except KeyError: return f'Unknown gid_type ({gid_type})' def guid_format(num): """ Get GUID representation of the given number, including change of endianness. :param num: Number to change to GUID format. :return: GUID-formatted string. 
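# Worked example (not part of the original file) of the grouping performed
# by guid_format(), minus the be64toh() byte swap; the input value is
# arbitrary:
def _guid_format_host_order(num):
    hex_str = '%016x' % num
    pairs = [hex_str[i:i + 2] for i in range(0, 16, 2)]
    return ':'.join(a + b for a, b in zip(pairs[0::2], pairs[1::2]))

assert _guid_format_host_order(0x0002c90300fed677) == '0002:c903:00fe:d677'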
""" num = be64toh(num) hex_str = "%016x" % (num) hex_array = [hex_str[i:i+2] for i in range(0, len(hex_str), 2)] hex_array = [''.join(x) for x in zip(hex_array[0::2], hex_array[1::2])] return ':'.join(hex_array) def translate_transport_type(transport_type): l = {0: 'IB', 1: 'IWARP', 2: 'USNIC', 3: 'USNIC UDP'} try: return l[transport_type] except KeyError: return 'Unknown' def translate_node_type(node_type): l = {1: 'CA', 2: 'Switch', 3: 'Router', 4: 'RNIC', 5: 'USNIC', 6: 'USNIC UDP'} try: return l[node_type] except KeyError: return 'Unknown' def guid_to_hex(node_guid): return hex(node_guid).replace('L', '').replace('0x', '') def port_state_to_str(port_state): l = {0: 'NOP', 1: 'Down', 2: 'Init', 3: 'Armed', 4: 'Active', 5: 'Defer'} try: return '{s} ({n})'.format(s=l[port_state], n=port_state) except KeyError: return 'Invalid state ({s})'.format(s=port_state) def translate_mtu(mtu): l = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096} try: return '{s} ({n})'.format(s=l[mtu], n=mtu) except KeyError: return 'Invalid MTU ({m})'.format(m=mtu) def translate_link_layer(ll): l = {0: 'Unspecified', 1:'InfiniBand', 2:'Ethernet'} try: return l[ll] except KeyError: return 'Invalid link layer ({ll})'.format(ll=ll) def translate_port_cap_flags(flags): l = {e.IBV_PORT_SM: 'IBV_PORT_SM', e.IBV_PORT_NOTICE_SUP: 'IBV_PORT_NOTICE_SUP', e.IBV_PORT_TRAP_SUP: 'IBV_PORT_TRAP_SUP', e.IBV_PORT_OPT_IPD_SUP: 'IBV_PORT_OPT_IPD_SUP', e.IBV_PORT_AUTO_MIGR_SUP: 'IBV_PORT_AUTO_MIGR_SUP', e.IBV_PORT_SL_MAP_SUP: 'IBV_PORT_SL_MAP_SUP', e.IBV_PORT_MKEY_NVRAM: 'IBV_PORT_MKEY_NVRAM', e.IBV_PORT_PKEY_NVRAM: 'IBV_PORT_PKEY_NVRAM', e.IBV_PORT_LED_INFO_SUP: 'IBV_PORT_LED_INFO_SUP', e.IBV_PORT_SYS_IMAGE_GUID_SUP: 'IBV_PORT_SYS_IMAGE_GUID_SUP', e.IBV_PORT_PKEY_SW_EXT_PORT_TRAP_SUP: 'IBV_PORT_PKEY_SW_EXT_PORT_TRAP_SUP', e.IBV_PORT_EXTENDED_SPEEDS_SUP: 'IBV_PORT_EXTENDED_SPEEDS_SUP', e.IBV_PORT_CAP_MASK2_SUP: 'IBV_PORT_CAP_MASK2_SUP', e.IBV_PORT_CM_SUP: 'IBV_PORT_CM_SUP', e.IBV_PORT_SNMP_TUNNEL_SUP: 'IBV_PORT_SNMP_TUNNEL_SUP', e.IBV_PORT_REINIT_SUP: 'IBV_PORT_REINIT_SUP', e.IBV_PORT_DEVICE_MGMT_SUP: 'IBV_PORT_DEVICE_MGMT_SUP', e.IBV_PORT_VENDOR_CLASS_SUP: 'IBV_PORT_VENDOR_CLASS_SUP', e.IBV_PORT_DR_NOTICE_SUP: 'IBV_PORT_DR_NOTICE_SUP', e.IBV_PORT_CAP_MASK_NOTICE_SUP: 'IBV_PORT_CAP_MASK_NOTICE_SUP', e.IBV_PORT_BOOT_MGMT_SUP: 'IBV_PORT_BOOT_MGMT_SUP', e.IBV_PORT_LINK_LATENCY_SUP: 'IBV_PORT_LINK_LATENCY_SUP', e.IBV_PORT_CLIENT_REG_SUP: 'IBV_PORT_CLIENT_REG_SUP', e.IBV_PORT_IP_BASED_GIDS: 'IBV_PORT_IP_BASED_GIDS'} return str_from_flags(flags, l) def translate_port_cap_flags2(flags): l = {e.IBV_PORT_SET_NODE_DESC_SUP: 'IBV_PORT_SET_NODE_DESC_SUP', e.IBV_PORT_INFO_EXT_SUP: 'IBV_PORT_INFO_EXT_SUP', e.IBV_PORT_VIRT_SUP: 'IBV_PORT_VIRT_SUP', e.IBV_PORT_SWITCH_PORT_STATE_TABLE_SUP: 'IBV_PORT_SWITCH_PORT_STATE_TABLE_SUP', e.IBV_PORT_LINK_WIDTH_2X_SUP: 'IBV_PORT_LINK_WIDTH_2X_SUP', e.IBV_PORT_LINK_SPEED_HDR_SUP: 'IBV_PORT_LINK_SPEED_HDR_SUP', e.IBV_PORT_LINK_SPEED_NDR_SUP: 'IBV_PORT_LINK_SPEED_NDR_SUP'} return str_from_flags(flags, l) def translate_device_caps(flags): l = {e.IBV_DEVICE_RESIZE_MAX_WR: 'IBV_DEVICE_RESIZE_MAX_WR', e.IBV_DEVICE_BAD_PKEY_CNTR: 'IBV_DEVICE_BAD_PKEY_CNTR', e.IBV_DEVICE_BAD_QKEY_CNTR: 'IBV_DEVICE_BAD_QKEY_CNTR', e.IBV_DEVICE_RAW_MULTI: 'IBV_DEVICE_RAW_MULTI', e.IBV_DEVICE_AUTO_PATH_MIG: 'IBV_DEVICE_AUTO_PATH_MIG', e.IBV_DEVICE_CHANGE_PHY_PORT: 'IBV_DEVICE_CHANGE_PHY_PORT', e.IBV_DEVICE_UD_AV_PORT_ENFORCE: 'IBV_DEVICE_UD_AV_PORT_ENFORCE', e.IBV_DEVICE_CURR_QP_STATE_MOD: 'IBV_DEVICE_CURR_QP_STATE_MOD', 
e.IBV_DEVICE_SHUTDOWN_PORT: 'IBV_DEVICE_SHUTDOWN_PORT', e.IBV_DEVICE_INIT_TYPE: 'IBV_DEVICE_INIT_TYPE', e.IBV_DEVICE_PORT_ACTIVE_EVENT: 'IBV_DEVICE_PORT_ACTIVE_EVENT', e.IBV_DEVICE_SYS_IMAGE_GUID: 'IBV_DEVICE_SYS_IMAGE_GUID', e.IBV_DEVICE_RC_RNR_NAK_GEN: 'IBV_DEVICE_RC_RNR_NAK_GEN', e.IBV_DEVICE_SRQ_RESIZE: 'IBV_DEVICE_SRQ_RESIZE', e.IBV_DEVICE_N_NOTIFY_CQ: 'IBV_DEVICE_N_NOTIFY_CQ', e.IBV_DEVICE_MEM_WINDOW: 'IBV_DEVICE_MEM_WINDOW', e.IBV_DEVICE_UD_IP_CSUM: 'IBV_DEVICE_UD_IP_CSUM', e.IBV_DEVICE_XRC: 'IBV_DEVICE_XRC', e.IBV_DEVICE_MEM_MGT_EXTENSIONS: 'IBV_DEVICE_MEM_MGT_EXTENSIONS', e.IBV_DEVICE_MEM_WINDOW_TYPE_2A: 'IBV_DEVICE_MEM_WINDOW_TYPE_2A', e.IBV_DEVICE_MEM_WINDOW_TYPE_2B: 'IBV_DEVICE_MEM_WINDOW_TYPE_2B', e.IBV_DEVICE_RC_IP_CSUM: 'IBV_DEVICE_RC_IP_CSUM', e.IBV_DEVICE_RAW_IP_CSUM: 'IBV_DEVICE_RAW_IP_CSUM', e.IBV_DEVICE_MANAGED_FLOW_STEERING: 'IBV_DEVICE_MANAGED_FLOW_STEERING'} return str_from_flags(flags, l) def str_from_flags(flags, dictionary): str_flags = "\n " for bit in dictionary: if flags & bit: str_flags += dictionary[bit] str_flags += '\n ' return str_flags def phys_state_to_str(phys): l = {1: 'Sleep', 2: 'Polling', 3: 'Disabled', 4: 'Port configuration training', 5: 'Link up', 6: 'Link error recovery', 7: 'Phy test'} try: return '{s} ({n})'.format(s=l[phys], n=phys) except KeyError: return 'Invalid physical state' def width_to_str(width): l = {1: '1X', 2: '4X', 4: '8X', 8: '12X', 16: '2X'} try: return '{s} ({n})'.format(s=l[width], n=width) except KeyError: return 'Invalid width' def speed_to_str(active_speed, active_speed_ex): real_speed = active_speed if not active_speed_ex else active_speed_ex l = {0: '0.0 Gbps', 1: '2.5 Gbps', 2: '5.0 Gbps', 4: '5.0 Gbps', 8: '10.0 Gbps', 16: '14.0 Gbps', 32: '25.0 Gbps', 64: '50.0 Gbps', 128: '100.0 Gbps', 256: '200.0 Gbps'} try: return '{s} ({n})'.format(s=l[real_speed], n=real_speed) except KeyError: return 'Invalid speed' def get_device_list(): """ :return: list of IB_devices on current node each list element contains a Device with: device name device node type device transport type device guid device index """ cdef int count = 0; cdef v.ibv_device **dev_list; dev_list = v.ibv_get_device_list(&count) if dev_list == NULL: raise PyverbsRDMAError('Failed to get devices list') devices = [] try: for i in range(count): name = dev_list[i].name node = dev_list[i].node_type transport = dev_list[i].transport_type guid = be64toh(v.ibv_get_device_guid(dev_list[i])) index = v.ibv_get_device_index(dev_list[i]) devices.append(Device(name, guid, node, transport, index)) finally: v.ibv_free_device_list(dev_list) return devices def rdma_get_devices(): """ Get the RDMA devices. :return: list of Device objects. 
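# Illustrative sketch (not part of the original file): how str_from_flags()
# above decodes a bitmask into one name per line; the flag values here are
# made up for the demonstration.
demo_names = {0x1: 'FLAG_A', 0x2: 'FLAG_B', 0x4: 'FLAG_C'}
print(str_from_flags(0x1 | 0x4, demo_names))  # lists FLAG_A and FLAG_C only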
""" cdef int count cdef v.ibv_context **ctx_list ctx_list = cm.rdma_get_devices(&count) if ctx_list == NULL: raise PyverbsRDMAErrno('Failed to get device list') devices = [] for i in range(count): name = ctx_list[i].device.name node = ctx_list[i].device.node_type transport = ctx_list[i].device.transport_type guid = be64toh(v.ibv_get_device_guid(ctx_list[i].device)) index = v.ibv_get_device_index(ctx_list[i].device) devices.append(Device(name, guid, node, transport, index)) cm.rdma_free_devices(ctx_list) return devices def translate_event_type(event_type): types = { e.IBV_EVENT_CQ_ERR: 'IBV_EVENT_CQ_ERR', e.IBV_EVENT_QP_FATAL: 'IBV_EVENT_QP_FATAL', e.IBV_EVENT_QP_REQ_ERR: 'IBV_EVENT_QP_REQ_ERR', e.IBV_EVENT_QP_ACCESS_ERR: 'IBV_EVENT_QP_ACCESS_ERR', e.IBV_EVENT_COMM_EST: 'IBV_EVENT_COMM_EST', e.IBV_EVENT_SQ_DRAINED: 'IBV_EVENT_SQ_DRAINED', e.IBV_EVENT_PATH_MIG: 'IBV_EVENT_PATH_MIG', e.IBV_EVENT_PATH_MIG_ERR: 'IBV_EVENT_PATH_MIG_ERR', e.IBV_EVENT_DEVICE_FATAL: 'IBV_EVENT_DEVICE_FATAL', e.IBV_EVENT_PORT_ACTIVE: 'IBV_EVENT_PORT_ACTIVE', e.IBV_EVENT_PORT_ERR: 'IBV_EVENT_PORT_ERR', e.IBV_EVENT_LID_CHANGE: 'IBV_EVENT_LID_CHANGE', e.IBV_EVENT_PKEY_CHANGE: 'IBV_EVENT_PKEY_CHANGE', e.IBV_EVENT_SM_CHANGE: 'IBV_EVENT_SM_CHANGE', e.IBV_EVENT_SRQ_ERR: 'IBV_EVENT_SRQ_ERR', e.IBV_EVENT_SRQ_LIMIT_REACHED: 'IBV_EVENT_SRQ_LIMIT_REACHED', e.IBV_EVENT_QP_LAST_WQE_REACHED: '.IBV_EVENT_QP_LAST_WQE_REACHED', e.IBV_EVENT_CLIENT_REREGISTER: 'IBV_EVENT_CLIENT_REREGISTER', e.IBV_EVENT_GID_CHANGE: 'IBV_EVENT_GID_CHANGE', e.IBV_EVENT_WQ_FATAL: 'IBV_EVENT_WQ_FATAL' } try: return types[event_type] except KeyError: return f'Unknown event_type ({event_type})' rdma-core-56.1/pyverbs/dma_util.pyx000066400000000000000000000013761477342711600173730ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia, Inc. All rights reserved. See COPYING file #cython: language_level=3 from libc.stdint cimport uintptr_t, uint64_t, uint32_t cdef extern from 'util/udma_barrier.h': cdef void udma_to_device_barrier() cdef void udma_from_device_barrier() cdef extern from 'util/mmio.h': cdef void mmio_write64_be(void *addr, uint64_t val) cdef void mmio_write32_be(void *addr, uint32_t val) def udma_to_dev_barrier(): udma_to_device_barrier() def udma_from_dev_barrier(): udma_from_device_barrier() def mmio_write64_as_be(addr, val): mmio_write64_be( addr, val) def mmio_write32_as_be(addr, val): mmio_write32_be( addr, val) rdma-core-56.1/pyverbs/dmabuf.pxd000066400000000000000000000006141477342711600170000ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020, Intel Corporation. All rights reserved. See COPYING file #cython: language_level=3 cdef class DmaBuf: cdef int drm_fd cdef int handle cdef int fd cdef unsigned long size cdef unsigned long map_offset cdef void *dmabuf cdef object dmabuf_mrs cdef add_ref(self, obj) cpdef close(self) rdma-core-56.1/pyverbs/dmabuf.pyx000066400000000000000000000042551477342711600170320ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020, Intel Corporation. All rights reserved. 
rdma-core-56.1/pyverbs/dmabuf.pxd000066400000000000000000000006141477342711600170000ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2020, Intel Corporation. All rights reserved. See COPYING file
#cython: language_level=3

cdef class DmaBuf:
    cdef int drm_fd
    cdef int handle
    cdef int fd
    cdef unsigned long size
    cdef unsigned long map_offset
    cdef void *dmabuf
    cdef object dmabuf_mrs
    cdef add_ref(self, obj)
    cpdef close(self)
rdma-core-56.1/pyverbs/dmabuf.pyx000066400000000000000000000042551477342711600170300ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2020, Intel Corporation. All rights reserved. See COPYING file
#cython: language_level=3

import weakref

from pyverbs.base cimport close_weakrefs
from pyverbs.base import PyverbsRDMAErrno
from pyverbs.mr cimport DmaBufMR

cdef extern from "dmabuf_alloc.h":
    cdef struct dmabuf:
        pass
    dmabuf *dmabuf_alloc(unsigned long size, int gpu, int gtt)
    void dmabuf_free(dmabuf *dmabuf)
    int dmabuf_get_drm_fd(dmabuf *dmabuf)
    int dmabuf_get_fd(dmabuf *dmabuf)
    unsigned long dmabuf_get_offset(dmabuf *dmabuf)


cdef class DmaBuf:
    def __init__(self, size, gpu=0, gtt=0):
        """
        Allocate DmaBuf object from a GPU device. This is done through the
        DRI device interface. Usually this requires the effective user id
        being a member of the 'render' group.
        :param size: The size (in number of bytes) of the buffer.
        :param gpu: The GPU unit to allocate the buffer from.
        :param gtt: Allocate from GTT (Graphics Translation Table) instead
                    of VRAM.
        :return: The newly created DmaBuf object on success.
        """
        self.dmabuf_mrs = weakref.WeakSet()
        self.dmabuf = dmabuf_alloc(size, gpu, gtt)
        if self.dmabuf == NULL:
            raise PyverbsRDMAErrno(f'Failed to allocate dmabuf of size {size} on gpu {gpu}')
        self.drm_fd = dmabuf_get_drm_fd(self.dmabuf)
        self.fd = dmabuf_get_fd(self.dmabuf)
        self.map_offset = dmabuf_get_offset(self.dmabuf)

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        if self.dmabuf == NULL:
            return None
        close_weakrefs([self.dmabuf_mrs])
        dmabuf_free(self.dmabuf)
        self.dmabuf = NULL

    cdef add_ref(self, obj):
        if isinstance(obj, DmaBufMR):
            self.dmabuf_mrs.add(obj)

    @property
    def drm_fd(self):
        return self.drm_fd

    @property
    def handle(self):
        return self.handle

    @property
    def fd(self):
        return self.fd

    @property
    def size(self):
        return self.size

    @property
    def map_offset(self):
        return self.map_offset
rdma-core-56.1/pyverbs/dmabuf_alloc.c000066400000000000000000000126111477342711600176010ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
/*
 * Copyright 2020 Intel Corporation. All rights reserved. See COPYING file
 */

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <drm/drm.h>
#include <drm/i915_drm.h>
#include <drm/amdgpu_drm.h>
#include "dmabuf_alloc.h"

/*
 * Abstraction of the buffer allocation mechanism using the DRM interface.
 * The interface is accessed by ioctl() calls over the '/dev/dri/renderD*'
 * device. Successful access usually requires the effective user id being
 * in the 'render' group.
*/ struct drm { int fd; int (*alloc)(struct drm *drm, uint64_t size, uint32_t *handle, int gtt); int (*mmap_offset)(struct drm *drm, uint32_t handle, uint64_t *offset); }; static int i915_alloc(struct drm *drm, uint64_t size, uint32_t *handle, int gtt) { struct drm_i915_gem_create gem_create = {}; int err; gem_create.size = size; err = ioctl(drm->fd, DRM_IOCTL_I915_GEM_CREATE, &gem_create); if (err) return err; *handle = gem_create.handle; return 0; } static int amdgpu_alloc(struct drm *drm, uint64_t size, uint32_t *handle, int gtt) { union drm_amdgpu_gem_create gem_create = {{}}; int err; gem_create.in.bo_size = size; if (gtt) { gem_create.in.domains = AMDGPU_GEM_DOMAIN_GTT; gem_create.in.domain_flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC; } else { gem_create.in.domains = AMDGPU_GEM_DOMAIN_VRAM; gem_create.in.domain_flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; } err = ioctl(drm->fd, DRM_IOCTL_AMDGPU_GEM_CREATE, &gem_create); if (err) return err; *handle = gem_create.out.handle; return 0; } static int i915_mmap_offset(struct drm *drm, uint32_t handle, uint64_t *offset) { struct drm_i915_gem_mmap_gtt gem_mmap = {}; int err; gem_mmap.handle = handle; err = ioctl(drm->fd, DRM_IOCTL_I915_GEM_MMAP_GTT, &gem_mmap); if (err) return err; *offset = gem_mmap.offset; return 0; } static int amdgpu_mmap_offset(struct drm *drm, uint32_t handle, uint64_t *offset) { union drm_amdgpu_gem_mmap gem_mmap = {{}}; int err; gem_mmap.in.handle = handle; err = ioctl(drm->fd, DRM_IOCTL_AMDGPU_GEM_MMAP, &gem_mmap); if (err) return err; *offset = gem_mmap.out.addr_ptr; return 0; } static struct drm *drm_open(int gpu) { char path[32]; struct drm_version version = {}; char name[16] = {}; int err; struct drm *drm; drm = malloc(sizeof(*drm)); if (!drm) return NULL; snprintf(path, sizeof(path), "/dev/dri/renderD%d", gpu + 128); drm->fd = open(path, O_RDWR); if (drm->fd < 0) goto out_free; version.name = name; version.name_len = 16; err = ioctl(drm->fd, DRM_IOCTL_VERSION, &version); if (err) goto out_close; if (!strcmp(name, "amdgpu")) { drm->alloc = amdgpu_alloc; drm->mmap_offset = amdgpu_mmap_offset; } else if (!strcmp(name, "i915")) { drm->alloc = i915_alloc; drm->mmap_offset = i915_mmap_offset; } else { errno = EOPNOTSUPP; goto out_close; } return drm; out_close: close(drm->fd); out_free: free(drm); return NULL; } static void drm_close(struct drm *drm) { if (!drm || drm->fd < 0) return; close(drm->fd); free(drm); } static void drm_free_buf(struct drm *drm, uint32_t handle) { struct drm_gem_close close = {}; close.handle = handle; ioctl(drm->fd, DRM_IOCTL_GEM_CLOSE, &close); } static int drm_alloc_buf(struct drm *drm, size_t size, uint32_t *handle, int *fd, int gtt) { struct drm_prime_handle prime_handle = {}; int err; if (!drm || drm->fd < 0) return -EINVAL; err = drm->alloc(drm, size, handle, gtt); if (err) return err; prime_handle.handle = *handle; prime_handle.flags = O_RDWR; err = ioctl(drm->fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, &prime_handle); if (err) { drm_free_buf(drm, *handle); return err; } *fd = prime_handle.fd; return 0; } static int drm_map_buf(struct drm *drm, uint32_t handle, uint64_t *offset) { if (!drm || drm->fd < 0) return -EINVAL; return drm->mmap_offset(drm, handle, offset); } /* * Abstraction of dmabuf object, allocated using the DRI abstraction defined * above. 
*/ struct dmabuf { struct drm *drm; int fd; uint32_t handle; uint64_t map_offset; }; /* * dmabuf_alloc - allocate a dmabuf from GPU * @size - byte size of the buffer to allocate * @gpu - the GPU unit to use * @gtt - if true, allocate from GTT (Graphics Translation Table) instead of VRAM */ struct dmabuf *dmabuf_alloc(uint64_t size, int gpu, int gtt) { struct dmabuf *dmabuf; int err; dmabuf = malloc(sizeof(*dmabuf)); if (!dmabuf) return NULL; dmabuf->drm = drm_open(gpu); if (!dmabuf->drm) goto out_free; err = drm_alloc_buf(dmabuf->drm, size, &dmabuf->handle, &dmabuf->fd, gtt); if (err) goto out_close; err = drm_map_buf(dmabuf->drm, dmabuf->handle, &dmabuf->map_offset); if (err) goto out_free_buf; return dmabuf; out_free_buf: drm_free_buf(dmabuf->drm, dmabuf->handle); out_close: drm_close(dmabuf->drm); out_free: free(dmabuf); return NULL; } void dmabuf_free(struct dmabuf *dmabuf) { if (!dmabuf) return; close(dmabuf->fd); drm_free_buf(dmabuf->drm, dmabuf->handle); drm_close(dmabuf->drm); free(dmabuf); } int dmabuf_get_drm_fd(struct dmabuf *dmabuf) { if (!dmabuf || !dmabuf->drm) return -1; return dmabuf->drm->fd; } int dmabuf_get_fd(struct dmabuf *dmabuf) { if (!dmabuf) return -1; return dmabuf->fd; } uint64_t dmabuf_get_offset(struct dmabuf *dmabuf) { if (!dmabuf) return -1; return dmabuf->map_offset; } rdma-core-56.1/pyverbs/dmabuf_alloc.h000066400000000000000000000007631477342711600176130ustar00rootroot00000000000000/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* * Copyright 2020 Intel Corporation. All rights reserved. See COPYING file */ #ifndef _DMABUF_ALLOC_H_ #define _DMABUF_ALLOC_H_ #include struct dmabuf; struct dmabuf *dmabuf_alloc(uint64_t size, int gpu, int gtt); void dmabuf_free(struct dmabuf *dmabuf); int dmabuf_get_drm_fd(struct dmabuf *dmabuf); int dmabuf_get_fd(struct dmabuf *dmabuf); uint64_t dmabuf_get_offset(struct dmabuf *dmabuf); #endif /* _DMABUF_ALLOC_H_ */ rdma-core-56.1/pyverbs/dmabuf_alloc_stub.c000066400000000000000000000011751477342711600206410ustar00rootroot00000000000000// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* * Copyright 2021 Intel Corporation. All rights reserved. See COPYING file */ #include #include #include #include "dmabuf_alloc.h" struct dmabuf *dmabuf_alloc(uint64_t size, int gpu, int gtt) { errno = EOPNOTSUPP; return NULL; } void dmabuf_free(struct dmabuf *dmabuf) { errno = EOPNOTSUPP; } int dmabuf_get_drm_fd(struct dmabuf *dmabuf) { errno = EOPNOTSUPP; return -1; } int dmabuf_get_fd(struct dmabuf *dmabuf) { errno = EOPNOTSUPP; return -1; } uint64_t dmabuf_get_offset(struct dmabuf *dmabuf) { errno = EOPNOTSUPP; return -1; } rdma-core-56.1/pyverbs/enums.pyx000077700000000000000000000000001477342711600227572libibverbs_enums.pxdustar00rootroot00000000000000rdma-core-56.1/pyverbs/examples/000077500000000000000000000000001477342711600166425ustar00rootroot00000000000000rdma-core-56.1/pyverbs/examples/ib_devices.py000077500000000000000000000012361477342711600213150ustar00rootroot00000000000000#!/usr/bin/env python3 # SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2018, Mellanox Technologies. All rights reserved. 
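# Illustrative sketch (not part of the original file): exercising the DmaBuf
# wrapper from pyverbs/dmabuf.pyx over the allocator above. Requires access
# to /dev/dri/renderD* (typically 'render' group membership); gpu=0 and
# gtt=1 are arbitrary choices here.
from pyverbs.dmabuf import DmaBuf

buf = DmaBuf(4096, gpu=0, gtt=1)
print(buf.fd, buf.map_offset)  # dmabuf fd, e.g. for dma-buf MR registration
buf.close()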
See COPYING file from pyverbs import device as d import sys lst = d.get_device_list() dev = 'Device' node = 'Node Type' trans = 'Transport Type' guid = 'Node GUID' print_format = '{:^20}{:^20}{:^20}{:^20}' print (print_format.format(dev, node, trans, guid)) print (print_format.format('-'*len(dev), '-'*len(node), '-'*len(trans), '-'*len(guid))) for i in lst: print (print_format.format(i.name.decode(), d.translate_node_type(i.node_type), d.translate_transport_type(i.transport_type), d.guid_to_hex(i.guid))) rdma-core-56.1/pyverbs/flow.pxd000066400000000000000000000007301477342711600165100ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia All rights reserved. #cython: language_level=3 from pyverbs.base cimport PyverbsObject, PyverbsCM cimport pyverbs.libibverbs as v cdef class FlowAttr(PyverbsObject): cdef v.ibv_flow_attr attr cdef object specs cdef class Flow(PyverbsCM): cdef v.ibv_flow *flow cdef object qp cpdef close(self) cdef class FlowAction(PyverbsObject): cdef v.ibv_flow_action *action rdma-core-56.1/pyverbs/flow.pyx000066400000000000000000000101221477342711600165310ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia All rights reserved. from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsError from pyverbs.base import PyverbsRDMAErrno from libc.stdlib cimport calloc, free from libc.string cimport memcpy from pyverbs.qp cimport QP cdef class FlowAttr(PyverbsObject): def __init__(self, num_of_specs=0, flow_type=v.IBV_FLOW_ATTR_NORMAL, priority=0, port=1, flags=0): """ Initialize a FlowAttr object over an underlying ibv_flow_attr C object which contains attributes for creating a steering flow. :param num_of_specs: number of specs :param flow_type: flow type :param priority: flow priority :param port: port number :param flags: flow flags """ super().__init__() self.attr.type = flow_type self.attr.size = sizeof(v.ibv_flow_attr) self.attr.priority = priority self.attr.num_of_specs = num_of_specs self.attr.port = port self.attr.flags = flags self.specs = list() @property def type(self): return self.attr.type @type.setter def type(self, val): self.attr.type = val @property def priority(self): return self.attr.priority @priority.setter def priority(self, val): self.attr.priority = val @property def num_of_specs(self): return self.attr.num_of_specs @num_of_specs.setter def num_of_specs(self, val): self.attr.num_of_specs = val @property def port(self): return self.attr.port @port.setter def port(self, val): self.attr.port = val @property def flags(self): return self.attr.flags @flags.setter def flags(self, val): self.attr.flags = val @property def specs(self): return self.specs cdef class Flow(PyverbsCM): def __init__(self, QP qp, FlowAttr flow_attr): """ Initialize a Flow object over an underlying ibv_flow C object which represents a steering flow. 
        :param qp: QP to create flow for
        :param flow_attr: Flow attributes for flow creation
        """
        super().__init__()
        cdef char *flow_addr
        cdef char *dst_addr
        cdef v.ibv_flow_attr attr = flow_attr.attr
        if flow_attr.num_of_specs != len(flow_attr.specs):
            self.logger.warn(f'The number of appended specs '
                             f'({len(flow_attr.specs)}) is not equal to the '
                             f'number of declared specs '
                             f'({flow_attr.num_of_specs})')
        # Calculate total size for allocation
        total_size = sizeof(v.ibv_flow_attr)
        for spec in flow_attr.specs:
            total_size += spec.size
        flow_addr = <char *>calloc(1, total_size)
        if flow_addr == NULL:
            raise PyverbsError(f'Failed to allocate memory of size '
                               f'{total_size}')
        dst_addr = flow_addr
        # Copy flow_attr at the beginning of the allocated memory
        memcpy(dst_addr, &attr, sizeof(v.ibv_flow_attr))
        dst_addr = <char *>(dst_addr + sizeof(v.ibv_flow_attr))
        # Copy specs one after another into the allocated memory after flow_attr
        for spec in flow_attr.specs:
            spec._copy_data(dst_addr)
            dst_addr += spec.size
        self.flow = v.ibv_create_flow(qp.qp, <v.ibv_flow_attr *>flow_addr)
        free(flow_addr)
        if self.flow == NULL:
            raise PyverbsRDMAErrno('Flow creation failed')
        self.qp = qp
        qp.add_ref(self)

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        if self.flow != NULL:
            if self.logger:
                self.logger.debug('Closing Flow')
            rc = v.ibv_destroy_flow(self.flow)
            if rc != 0:
                raise PyverbsRDMAError('Failed to destroy Flow', rc)
            self.flow = NULL
            self.qp = None


cdef class FlowAction(PyverbsObject):
    def __cinit__(self):
        self.action = NULL
rdma-core-56.1/pyverbs/fork.pyx000066400000000000000000000006471477342711600165360ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright 2021 Amazon.com, Inc. or its affiliates. All rights reserved.
#cython: language_level=3

from pyverbs.base import PyverbsRDMAError
cimport pyverbs.libibverbs as v


def fork_init():
    ret = v.ibv_fork_init()
    if ret:
        raise PyverbsRDMAError('Failed to init fork support', ret)


def is_fork_initialized():
    return v.ibv_is_fork_initialized()
rdma-core-56.1/pyverbs/libibverbs.pxd000066400000000000000000000677501477342711600176820ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2018, Mellanox Technologies. All rights reserved.
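# Illustrative sketch (not part of the original file): building a FlowAttr,
# appending a spec and creating a Flow with the classes defined in
# pyverbs/flow.pyx above. Assumes an existing QP 'qp' (e.g. a RAW_PACKET QP)
# and the EthSpec class from pyverbs/spec.pyx (outside this excerpt).
from pyverbs.flow import Flow, FlowAttr
from pyverbs.spec import EthSpec

flow_attr = FlowAttr(num_of_specs=1)
flow_attr.specs.append(EthSpec(ether_type=0x0800))  # match IPv4 frames
flow = Flow(qp, flow_attr)
flow.close()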
See COPYING file #cython: language_level=3 from libc.stdint cimport uint8_t, uint16_t, uint32_t, uint64_t from posix.time cimport timespec from pyverbs.libibverbs_enums cimport * cdef extern from 'infiniband/verbs.h': cdef struct anon: unsigned long subnet_prefix unsigned long interface_id cdef union ibv_gid: anon _global "global" uint8_t raw[16] cdef struct ibv_device: char *name int node_type int transport_type cdef struct ibv_context: ibv_device *device int num_comp_vectors int cmd_fd cdef struct ibv_device_attr: char *fw_ver unsigned long node_guid unsigned long sys_image_guid unsigned long max_mr_size unsigned long page_size_cap unsigned int vendor_id unsigned int vendor_part_id unsigned int hw_ver unsigned int max_qp unsigned int max_qp_wr unsigned int device_cap_flags unsigned int max_sge unsigned int max_sge_rd unsigned int max_cq unsigned int max_cqe unsigned int max_mr unsigned int max_pd unsigned int max_qp_rd_atom unsigned int max_ee_rd_atom unsigned int max_res_rd_atom unsigned int max_qp_init_rd_atom unsigned int max_ee_init_rd_atom ibv_atomic_cap atomic_cap unsigned int max_ee unsigned int max_rdd unsigned int max_mw unsigned int max_raw_ipv6_qp unsigned int max_raw_ethy_qp unsigned int max_mcast_grp unsigned int max_mcast_qp_attach unsigned int max_total_mcast_qp_attach unsigned int max_ah unsigned int max_fmr unsigned int max_map_per_fmr unsigned int max_srq unsigned int max_srq_wr unsigned int max_srq_sge unsigned int max_pkeys unsigned int local_ca_ack_delay unsigned int phys_port_cnt struct ibv_pd: ibv_context *context unsigned int handle cdef struct ibv_mr: ibv_context *context ibv_pd *pd void *addr size_t length unsigned int handle unsigned int lkey unsigned int rkey cdef struct ibv_query_device_ex_input: unsigned int comp_mask cdef struct per_transport_caps: uint32_t rc_odp_caps uint32_t uc_odp_caps uint32_t ud_odp_caps cdef struct ibv_odp_caps: uint64_t general_caps per_transport_caps per_transport_caps cdef struct ibv_tso_caps: unsigned int max_tso unsigned int supported_qpts cdef struct ibv_rss_caps: unsigned int supported_qpts unsigned int max_rwq_indirection_tables unsigned int max_rwq_indirection_table_size unsigned long rx_hash_fields_mask unsigned int rx_hash_function cdef struct ibv_packet_pacing_caps: unsigned int qp_rate_limit_min unsigned int qp_rate_limit_max unsigned int supported_qpts cdef struct ibv_tm_caps: unsigned int max_rndv_hdr_size unsigned int max_num_tags unsigned int flags unsigned int max_ops unsigned int max_sge cdef struct ibv_tm_cap: uint32_t max_num_tags uint32_t max_ops cdef struct ibv_cq_moderation_caps: unsigned int max_cq_count unsigned int max_cq_period cdef struct ibv_pci_atomic_caps: uint16_t fetch_add uint16_t swap uint16_t compare_swap cdef struct ibv_device_attr_ex: ibv_device_attr orig_attr unsigned int comp_mask ibv_odp_caps odp_caps unsigned long completion_timestamp_mask unsigned long hca_core_clock unsigned long device_cap_flags_ex ibv_tso_caps tso_caps ibv_rss_caps rss_caps unsigned int max_wq_type_rq ibv_packet_pacing_caps packet_pacing_caps unsigned int raw_packet_caps ibv_tm_caps tm_caps ibv_cq_moderation_caps cq_mod_caps unsigned long max_dm_size ibv_pci_atomic_caps pci_atomic_caps uint32_t xrc_odp_caps uint32_t phys_port_cnt_ex cdef struct ibv_mw: ibv_context *context ibv_pd *pd unsigned int rkey unsigned int handle ibv_mw_type type cdef struct ibv_alloc_dm_attr: size_t length unsigned int log_align_req unsigned int comp_mask cdef struct ibv_dm: ibv_context *context unsigned int comp_mask uint32_t handle cdef struct 
ibv_port_attr: ibv_port_state state ibv_mtu max_mtu ibv_mtu active_mtu int gid_tbl_len unsigned int port_cap_flags unsigned int max_msg_sz unsigned int bad_pkey_cntr unsigned int qkey_viol_cntr unsigned short pkey_tbl_len unsigned short lid unsigned short sm_lid unsigned char lmc unsigned char max_vl_num unsigned char sm_sl unsigned char subnet_timeout unsigned char init_type_reply unsigned char active_width unsigned char active_speed unsigned char phys_state unsigned char link_layer unsigned char flags unsigned short port_cap_flags2 unsigned int active_speed_ex cdef struct ibv_comp_channel: ibv_context *context unsigned int fd unsigned int refcnt cdef struct ibv_cq: ibv_context *context ibv_comp_channel *channel void *cq_context int handle int cqe cdef struct ibv_wc: unsigned long wr_id ibv_wc_status status ibv_wc_opcode opcode unsigned int vendor_err unsigned int byte_len unsigned int qp_num unsigned int imm_data unsigned int src_qp int wc_flags unsigned int pkey_index unsigned int slid unsigned int sl unsigned int dlid_path_bits cdef struct ibv_cq_init_attr_ex: unsigned int cqe void *cq_context ibv_comp_channel *channel unsigned int comp_vector unsigned long wc_flags unsigned int comp_mask unsigned int flags ibv_pd *parent_domain cdef struct ibv_cq_ex: ibv_context *context ibv_comp_channel *channel void *cq_context unsigned int handle int cqe unsigned int comp_events_completed unsigned int async_events_completed unsigned int comp_mask ibv_wc_status status unsigned long wr_id cdef struct ibv_poll_cq_attr: unsigned int comp_mask cdef struct ibv_wc_tm_info: unsigned long tag unsigned int priv cdef struct ibv_grh: unsigned int version_tclass_flow unsigned short paylen unsigned char next_hdr unsigned char hop_limit ibv_gid sgid ibv_gid dgid cdef struct ibv_global_route: ibv_gid dgid unsigned int flow_label unsigned char sgid_index unsigned char hop_limit unsigned char traffic_class cdef struct ibv_ah_attr: ibv_global_route grh unsigned short dlid unsigned char sl unsigned char src_path_bits unsigned char static_rate unsigned char is_global unsigned char port_num cdef struct ibv_ah: ibv_context *context ibv_pd *pd unsigned int handle cdef struct ibv_sge: unsigned long addr unsigned int length unsigned int lkey cdef struct ibv_recv_wr: unsigned long wr_id ibv_recv_wr *next ibv_sge *sg_list int num_sge cdef struct _add: uint64_t recv_wr_id ibv_sge *sg_list int num_sge uint64_t tag uint64_t mask cdef struct _tm: uint32_t unexpected_cnt uint32_t handle _add add cdef struct ibv_ops_wr: uint64_t wr_id ibv_ops_wr *next ibv_ops_wr_opcode opcode int flags _tm tm cdef struct rdma: unsigned long remote_addr unsigned int rkey cdef struct atomic: unsigned long remote_addr unsigned long compare_add unsigned long swap unsigned int rkey cdef struct ud: ibv_ah *ah unsigned int remote_qpn unsigned int remote_qkey cdef union wr: rdma rdma atomic atomic ud ud cdef struct ibv_mw_bind_info: ibv_mr *mr unsigned long addr unsigned long length unsigned int mw_access_flags cdef struct ibv_mw_bind: uint64_t wr_id unsigned int send_flags ibv_mw_bind_info bind_info cdef struct bind_mw: ibv_mw *mw unsigned int rkey ibv_mw_bind_info bind_info cdef struct tso: void *hdr unsigned short hdr_sz unsigned short mss cdef struct xrc: unsigned int remote_srqn cdef union qp_type: xrc xrc cdef struct ibv_send_wr: unsigned long wr_id ibv_send_wr *next ibv_sge *sg_list int num_sge ibv_wr_opcode opcode uint32_t imm_data unsigned int send_flags wr wr qp_type qp_type bind_mw bind_mw tso tso cdef struct ibv_qp_cap: unsigned int max_send_wr 
unsigned int max_recv_wr unsigned int max_send_sge unsigned int max_recv_sge unsigned int max_inline_data cdef struct ibv_qp_init_attr: void *qp_context ibv_cq *send_cq ibv_cq *recv_cq ibv_srq *srq ibv_qp_cap cap ibv_qp_type qp_type int sq_sig_all cdef struct ibv_xrcd_init_attr: uint32_t comp_mask int fd int oflags cdef struct ibv_xrcd: pass cdef struct ibv_srq_attr: unsigned int max_wr unsigned int max_sge unsigned int srq_limit cdef struct ibv_srq_init_attr: void *srq_context ibv_srq_attr attr cdef struct ibv_srq_init_attr_ex: void *srq_context ibv_srq_attr attr unsigned int comp_mask ibv_srq_type srq_type ibv_pd *pd ibv_xrcd *xrcd ibv_cq *cq ibv_tm_cap tm_cap cdef struct ibv_srq: ibv_context *context void *srq_context ibv_pd *pd unsigned int handle unsigned int events_completed cdef struct ibv_rwq_ind_table: ibv_context *context int ind_tbl_handle int ind_tbl_num uint32_t comp_mask cdef struct ibv_rx_hash_conf: uint8_t rx_hash_function uint8_t rx_hash_key_len uint8_t *rx_hash_key uint64_t rx_hash_fields_mask cdef struct ibv_qp_init_attr_ex: void *qp_context ibv_cq *send_cq ibv_cq *recv_cq ibv_srq *srq ibv_qp_cap cap ibv_qp_type qp_type int sq_sig_all unsigned int comp_mask ibv_pd *pd ibv_xrcd *xrcd unsigned int create_flags unsigned short max_tso_header ibv_rwq_ind_table *rwq_ind_tbl ibv_rx_hash_conf rx_hash_conf unsigned int source_qpn unsigned long send_ops_flags cdef struct ibv_qp_attr: ibv_qp_state qp_state ibv_qp_state cur_qp_state ibv_mtu path_mtu ibv_mig_state path_mig_state unsigned int qkey unsigned int rq_psn unsigned int sq_psn unsigned int dest_qp_num unsigned int qp_access_flags ibv_qp_cap cap ibv_ah_attr ah_attr ibv_ah_attr alt_ah_attr unsigned short pkey_index unsigned short alt_pkey_index unsigned char en_sqd_async_notify unsigned char sq_draining unsigned char max_rd_atomic unsigned char max_dest_rd_atomic unsigned char min_rnr_timer unsigned char port_num unsigned char timeout unsigned char retry_cnt unsigned char rnr_retry unsigned char alt_port_num unsigned char alt_timeout unsigned int rate_limit cdef struct ibv_srq: ibv_context *context void *srq_context ibv_pd *pd unsigned int handle unsigned int events_completed cdef struct ibv_data_buf: void *addr size_t length cdef struct ibv_qp: ibv_context *context; void *qp_context; ibv_pd *pd; ibv_cq *send_cq; ibv_cq *recv_cq; ibv_srq *srq; unsigned int handle; unsigned int qp_num; ibv_qp_state state; ibv_qp_type qp_type; unsigned int events_completed; cdef struct ibv_parent_domain_init_attr: ibv_pd *pd; uint32_t comp_mask; void *(*alloc)(ibv_pd *pd, void *pd_context, size_t size, size_t alignment, uint64_t resource_type); void (*free)(ibv_pd *pd, void *pd_context, void *ptr, uint64_t resource_type); void *pd_context; cdef struct ibv_qp_ex: ibv_qp qp_base uint64_t comp_mask uint64_t wr_id unsigned int wr_flags cdef struct ibv_ece: uint32_t vendor_id uint32_t options uint32_t comp_mask cdef struct ibv_gid_entry: ibv_gid gid uint32_t gid_index uint32_t port_num uint32_t gid_type uint32_t ndev_ifindex cdef struct ibv_flow: uint32_t comp_mask ibv_context *context uint32_t handle cdef struct ibv_flow_attr: uint32_t comp_mask ibv_flow_attr_type type uint16_t size uint16_t priority uint8_t num_of_specs uint8_t port uint32_t flags cdef struct ibv_flow_eth_filter: uint8_t dst_mac[6] uint8_t src_mac[6] uint16_t ether_type uint16_t vlan_tag cdef struct ibv_flow_spec_eth: ibv_flow_spec_type type uint16_t size ibv_flow_eth_filter val ibv_flow_eth_filter mask cdef struct ibv_flow_ipv4_ext_filter: uint32_t src_ip uint32_t dst_ip uint8_t 
proto uint8_t tos uint8_t ttl uint8_t flags cdef struct ibv_flow_spec_ipv4_ext: ibv_flow_spec_type type uint16_t size ibv_flow_ipv4_ext_filter val ibv_flow_ipv4_ext_filter mask cdef struct ibv_flow_tcp_udp_filter: uint16_t dst_port uint16_t src_port cdef struct ibv_flow_spec_tcp_udp: ibv_flow_spec_type type uint16_t size ibv_flow_tcp_udp_filter val ibv_flow_tcp_udp_filter mask cdef struct ibv_flow_ipv6_filter: uint8_t src_ip[16] uint8_t dst_ip[16] uint32_t flow_label uint8_t next_hdr uint8_t traffic_class uint8_t hop_limit cdef struct ibv_flow_spec_ipv6: ibv_flow_spec_type type uint16_t size ibv_flow_ipv6_filter val ibv_flow_ipv6_filter mask cdef struct ibv_flow_action: ibv_context *context cdef struct ibv_values_ex: uint32_t comp_mask timespec raw_clock cdef union ibv_async_event_element: ibv_cq *cq; ibv_qp *qp; ibv_srq *srq; int port_num; cdef struct ibv_async_event: ibv_async_event_element element ibv_event_type event_type cdef struct ibv_wq: ibv_context *context void *wq_context ibv_pd *pd ibv_cq *cq uint32_t wq_num uint32_t handle ibv_wq_state state ibv_wq_type wq_type uint32_t events_completed uint32_t comp_mask cdef struct ibv_wq_init_attr: void *wq_context ibv_wq_type wq_type uint32_t max_wr uint32_t max_sge ibv_pd *pd ibv_cq *cq uint32_t comp_mask uint32_t create_flags cdef struct ibv_wq_attr: uint32_t attr_mask ibv_wq_state wq_state ibv_wq_state curr_wq_state uint32_t flags uint32_t flags_mask cdef struct ibv_rwq_ind_table_init_attr: uint32_t log_ind_tbl_size ibv_wq **ind_tbl uint32_t comp_mask ibv_device **ibv_get_device_list(int *n) int ibv_get_device_index(ibv_device *device); void ibv_free_device_list(ibv_device **list) const char *ibv_get_device_name(ibv_device *device) ibv_context *ibv_open_device(ibv_device *device) int ibv_close_device(ibv_context *context) int ibv_query_device(ibv_context *context, ibv_device_attr *device_attr) int ibv_query_device_ex(ibv_context *context, ibv_query_device_ex_input *input, ibv_device_attr_ex *attr) unsigned long ibv_get_device_guid(ibv_device *device) int ibv_query_gid(ibv_context *context, unsigned int port_num, int index, ibv_gid *gid) int ibv_query_pkey(ibv_context *context, unsigned int port_num, int index, uint16_t *pkey) int ibv_get_pkey_index(ibv_context *context, unsigned int port_num, uint16_t pkey) ibv_pd *ibv_alloc_pd(ibv_context *context) int ibv_dealloc_pd(ibv_pd *pd) ibv_mr *ibv_reg_mr(ibv_pd *pd, void *addr, size_t length, int access) ibv_mr *ibv_reg_dmabuf_mr(ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access) int ibv_rereg_mr(ibv_mr *mr, int flags, ibv_pd *pd, void *addr, size_t length, int access) int ibv_dereg_mr(ibv_mr *mr) int ibv_advise_mr(ibv_pd *pd, uint32_t advice, uint32_t flags, ibv_sge *sg_list, uint32_t num_sge) ibv_mw *ibv_alloc_mw(ibv_pd *pd, ibv_mw_type type) int ibv_dealloc_mw(ibv_mw *mw) ibv_dm *ibv_alloc_dm(ibv_context *context, ibv_alloc_dm_attr *attr) int ibv_free_dm(ibv_dm *dm) ibv_mr *ibv_reg_dm_mr(ibv_pd *pd, ibv_dm *dm, unsigned long dm_offset, size_t length, unsigned int access) int ibv_memcpy_to_dm(ibv_dm *dm, unsigned long dm_offset, void *host_addr, size_t length) int ibv_memcpy_from_dm(void *host_addr, ibv_dm *dm, unsigned long dm_offset, size_t length) int ibv_query_port(ibv_context *context, uint8_t port_num, ibv_port_attr *port_attr) ibv_comp_channel *ibv_create_comp_channel(ibv_context *context) int ibv_destroy_comp_channel(ibv_comp_channel *channel) int ibv_get_cq_event(ibv_comp_channel *channel, ibv_cq **cq, void **cq_context) int ibv_req_notify_cq(ibv_cq *cq, 
int solicited_only) void ibv_ack_cq_events(ibv_cq *cq, int nevents) ibv_cq *ibv_create_cq(ibv_context *context, int cqe, void *cq_context, ibv_comp_channel *channel, int comp_vector) int ibv_resize_cq(ibv_cq *cq, int cqe) int ibv_destroy_cq(ibv_cq *cq) int ibv_poll_cq(ibv_cq *cq, int num_entries, ibv_wc *wc) ibv_cq_ex *ibv_create_cq_ex(ibv_context *context, ibv_cq_init_attr_ex *cq_attr) ibv_cq *ibv_cq_ex_to_cq(ibv_cq_ex *cq) int ibv_start_poll(ibv_cq_ex *cq, ibv_poll_cq_attr *attr) int ibv_next_poll(ibv_cq_ex *cq) void ibv_end_poll(ibv_cq_ex *cq) ibv_wc_opcode ibv_wc_read_opcode(ibv_cq_ex *cq) unsigned int ibv_wc_read_vendor_err(ibv_cq_ex *cq) unsigned int ibv_wc_read_byte_len(ibv_cq_ex *cq) unsigned int ibv_wc_read_imm_data(ibv_cq_ex *cq) unsigned int ibv_wc_read_invalidated_rkey(ibv_cq_ex *cq) unsigned int ibv_wc_read_qp_num(ibv_cq_ex *cq) unsigned int ibv_wc_read_src_qp(ibv_cq_ex *cq) unsigned int ibv_wc_read_wc_flags(ibv_cq_ex *cq) unsigned int ibv_wc_read_slid(ibv_cq_ex *cq) unsigned char ibv_wc_read_sl(ibv_cq_ex *cq) unsigned char ibv_wc_read_dlid_path_bits(ibv_cq_ex *cq) unsigned long ibv_wc_read_completion_ts(ibv_cq_ex *cq) unsigned short ibv_wc_read_cvlan(ibv_cq_ex *cq) unsigned int ibv_wc_read_flow_tag(ibv_cq_ex *cq) void ibv_wc_read_tm_info(ibv_cq_ex *cq, ibv_wc_tm_info *tm_info) unsigned long ibv_wc_read_completion_wallclock_ns(ibv_cq_ex *cq) ibv_ah *ibv_create_ah(ibv_pd *pd, ibv_ah_attr *attr) int ibv_init_ah_from_wc(ibv_context *context, uint8_t port_num, ibv_wc *wc, ibv_grh *grh, ibv_ah_attr *ah_attr) ibv_ah *ibv_create_ah_from_wc(ibv_pd *pd, ibv_wc *wc, ibv_grh *grh, uint8_t port_num) int ibv_destroy_ah(ibv_ah *ah) ibv_qp *ibv_create_qp(ibv_pd *pd, ibv_qp_init_attr *qp_init_attr) ibv_qp *ibv_create_qp_ex(ibv_context *context, ibv_qp_init_attr_ex *qp_init_attr_ex) int ibv_modify_qp(ibv_qp *qp, ibv_qp_attr *qp_attr, int comp_mask) int ibv_query_qp(ibv_qp *qp, ibv_qp_attr *attr, int attr_mask, ibv_qp_init_attr *init_attr) int ibv_destroy_qp(ibv_qp *qp) int ibv_post_recv(ibv_qp *qp, ibv_recv_wr *wr, ibv_recv_wr **bad_wr) int ibv_post_send(ibv_qp *qp, ibv_send_wr *wr, ibv_send_wr **bad_wr) int ibv_bind_mw(ibv_qp *qp, ibv_mw *mw, ibv_mw_bind *mw_bind) ibv_xrcd *ibv_open_xrcd(ibv_context *context, ibv_xrcd_init_attr *xrcd_init_attr) int ibv_close_xrcd(ibv_xrcd *xrcd) ibv_srq *ibv_create_srq(ibv_pd *pd, ibv_srq_init_attr *srq_init_attr) ibv_srq *ibv_create_srq_ex(ibv_context *context, ibv_srq_init_attr_ex *srq_init_attr) int ibv_modify_srq(ibv_srq *srq, ibv_srq_attr *srq_attr, int srq_attr_mask) int ibv_query_srq(ibv_srq *srq, ibv_srq_attr *srq_attr) int ibv_get_srq_num(ibv_srq *srq, unsigned int *srq_num) int ibv_destroy_srq(ibv_srq *srq) int ibv_post_srq_recv(ibv_srq *srq, ibv_recv_wr *recv_wr, ibv_recv_wr **bad_recv_wr) int ibv_post_srq_ops(ibv_srq *srq, ibv_ops_wr *op, ibv_ops_wr **bad_op) ibv_pd *ibv_alloc_parent_domain(ibv_context *context, ibv_parent_domain_init_attr *attr) uint32_t ibv_inc_rkey(uint32_t rkey) ibv_qp_ex *ibv_qp_to_qp_ex(ibv_qp *qp) void ibv_wr_atomic_cmp_swp(ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, uint64_t compare, uint64_t swap) void ibv_wr_atomic_fetch_add(ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, uint64_t add) void ibv_wr_bind_mw(ibv_qp_ex *qp, ibv_mw *mw, uint32_t rkey, ibv_mw_bind_info *bind_info) void ibv_wr_local_inv(ibv_qp_ex *qp, uint32_t invalidate_rkey) void ibv_wr_flush(ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, size_t length, uint8_t ptype, uint8_t level) void ibv_wr_atomic_write(ibv_qp_ex *qp, uint32_t rkey, 
uint64_t remote_addr, const void *atomic_wr) void ibv_wr_rdma_read(ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr) void ibv_wr_rdma_write(ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr) void ibv_wr_rdma_write_imm(ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, uint32_t imm_data) void ibv_wr_send(ibv_qp_ex *qp) void ibv_wr_send_imm(ibv_qp_ex *qp, uint32_t imm_data) void ibv_wr_send_inv(ibv_qp_ex *qp, uint32_t invalidate_rkey) void ibv_wr_send_tso(ibv_qp_ex *qp, void *hdr, uint16_t hdr_sz, uint16_t mss) void ibv_wr_set_ud_addr(ibv_qp_ex *qp, ibv_ah *ah, uint32_t remote_qpn, uint32_t remote_qkey) void ibv_wr_set_xrc_srqn(ibv_qp_ex *qp, uint32_t remote_srqn) void ibv_wr_set_inline_data(ibv_qp_ex *qp, void *addr, size_t length) void ibv_wr_set_inline_data_list(ibv_qp_ex *qp, size_t num_buf, ibv_data_buf *buf_list) void ibv_wr_set_sge(ibv_qp_ex *qp, uint32_t lkey, uint64_t addr, uint32_t length) void ibv_wr_set_sge_list(ibv_qp_ex *qp, size_t num_sge, ibv_sge *sg_list) void ibv_wr_start(ibv_qp_ex *qp) int ibv_wr_complete(ibv_qp_ex *qp) void ibv_wr_abort(ibv_qp_ex *qp) ibv_context *ibv_import_device(int cmd_fd) ibv_mr *ibv_import_mr(ibv_pd *pd, uint32_t handle) void ibv_unimport_mr(ibv_mr *mr) ibv_pd *ibv_import_pd(ibv_context *context, uint32_t handle) void ibv_unimport_pd(ibv_pd *pd) ibv_dm *ibv_import_dm(ibv_context *context, uint32_t dm_handle) void ibv_unimport_dm(ibv_dm *dm) int ibv_query_gid_ex(ibv_context *context, uint32_t port_num, uint32_t gid_index, ibv_gid_entry *entry, uint32_t flags) ssize_t ibv_query_gid_table(ibv_context *context, ibv_gid_entry *entries, size_t max_entries, uint32_t flags) ibv_flow *ibv_create_flow(ibv_qp *qp, ibv_flow_attr *flow) int ibv_destroy_flow(ibv_flow *flow_id) int ibv_query_rt_values_ex(ibv_context *context, ibv_values_ex *values) int ibv_get_async_event(ibv_context *context, ibv_async_event *event) void ibv_ack_async_event(ibv_async_event *event) int ibv_query_qp_data_in_order(ibv_qp *qp, ibv_wr_opcode op, uint32_t flags) int ibv_fork_init() ibv_fork_status ibv_is_fork_initialized() ibv_wq *ibv_create_wq(ibv_context *context, ibv_wq_init_attr *wq_init_attr) int ibv_modify_wq(ibv_wq *wq, ibv_wq_attr *wq_attr) int ibv_destroy_wq(ibv_wq *wq) int ibv_post_wq_recv(ibv_wq *wq, ibv_recv_wr *recv_wr, ibv_recv_wr **bad_recv_wr) ibv_rwq_ind_table *ibv_create_rwq_ind_table(ibv_context *context, ibv_rwq_ind_table_init_attr *init_attr) int ibv_destroy_rwq_ind_table(ibv_rwq_ind_table *rwq_ind_table) cdef extern from 'infiniband/driver.h': int ibv_query_gid_type(ibv_context *context, uint8_t port_num, unsigned int index, ibv_gid_type_sysfs *type) int ibv_set_ece(ibv_qp *qp, ibv_ece *ece) int ibv_query_ece(ibv_qp *qp, ibv_ece *ece) rdma-core-56.1/pyverbs/libibverbs.pyx000066400000000000000000000000001477342711600176770ustar00rootroot00000000000000rdma-core-56.1/pyverbs/libibverbs_enums.pxd000066400000000000000000000340671477342711600211050ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2018, Mellanox Technologies. All rights reserved. 
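# Illustrative sketch (not part of the original file): the pyverbs QPEx
# wrappers over the ibv_wr_* work-request calls declared above. 'qp_ex',
# 'mr', 'rkey' and 'remote_addr' are assumed to exist from prior setup.
from pyverbs.wr import SGE
import pyverbs.enums as e

qp_ex.wr_start()                        # ibv_wr_start()
qp_ex.wr_id = 1
qp_ex.wr_flags = e.IBV_SEND_SIGNALED
qp_ex.wr_rdma_write(rkey, remote_addr)  # ibv_wr_rdma_write()
qp_ex.wr_set_sge(SGE(mr.buf, 8, mr.lkey))
qp_ex.wr_complete()                     # ibv_wr_complete()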
#cython: language_level=3 cdef extern from '': cpdef enum ibv_transport_type: IBV_TRANSPORT_UNKNOWN IBV_TRANSPORT_IB IBV_TRANSPORT_IWARP IBV_TRANSPORT_USNIC IBV_TRANSPORT_USNIC_UDP cpdef enum ibv_node_type: IBV_NODE_UNKNOWN IBV_NODE_CA IBV_NODE_SWITCH IBV_NODE_ROUTER IBV_NODE_RNIC IBV_NODE_USNIC IBV_NODE_USNIC_UDP IBV_NODE_UNSPECIFIED cpdef enum: IBV_LINK_LAYER_UNSPECIFIED IBV_LINK_LAYER_INFINIBAND IBV_LINK_LAYER_ETHERNET cpdef enum ibv_atomic_cap: IBV_ATOMIC_NONE IBV_ATOMIC_HCA IBV_ATOMIC_GLOB cpdef enum ibv_port_state: IBV_PORT_NOP IBV_PORT_DOWN IBV_PORT_INIT IBV_PORT_ARMED IBV_PORT_ACTIVE IBV_PORT_ACTIVE_DEFER cpdef enum ibv_port_cap_flags: IBV_PORT_SM IBV_PORT_NOTICE_SUP IBV_PORT_TRAP_SUP IBV_PORT_OPT_IPD_SUP IBV_PORT_AUTO_MIGR_SUP IBV_PORT_SL_MAP_SUP IBV_PORT_MKEY_NVRAM IBV_PORT_PKEY_NVRAM IBV_PORT_LED_INFO_SUP IBV_PORT_SYS_IMAGE_GUID_SUP IBV_PORT_PKEY_SW_EXT_PORT_TRAP_SUP IBV_PORT_EXTENDED_SPEEDS_SUP IBV_PORT_CAP_MASK2_SUP IBV_PORT_CM_SUP IBV_PORT_SNMP_TUNNEL_SUP IBV_PORT_REINIT_SUP IBV_PORT_DEVICE_MGMT_SUP IBV_PORT_VENDOR_CLASS_SUP IBV_PORT_DR_NOTICE_SUP IBV_PORT_CAP_MASK_NOTICE_SUP IBV_PORT_BOOT_MGMT_SUP IBV_PORT_LINK_LATENCY_SUP IBV_PORT_CLIENT_REG_SUP IBV_PORT_IP_BASED_GIDS cpdef enum ibv_port_cap_flags2: IBV_PORT_SET_NODE_DESC_SUP IBV_PORT_INFO_EXT_SUP IBV_PORT_VIRT_SUP IBV_PORT_SWITCH_PORT_STATE_TABLE_SUP IBV_PORT_LINK_WIDTH_2X_SUP IBV_PORT_LINK_SPEED_HDR_SUP IBV_PORT_LINK_SPEED_NDR_SUP cpdef enum ibv_mtu: IBV_MTU_256 IBV_MTU_512 IBV_MTU_1024 IBV_MTU_2048 IBV_MTU_4096 cpdef enum ibv_event_type: IBV_EVENT_CQ_ERR IBV_EVENT_QP_FATAL IBV_EVENT_QP_REQ_ERR IBV_EVENT_QP_ACCESS_ERR IBV_EVENT_COMM_EST IBV_EVENT_SQ_DRAINED IBV_EVENT_PATH_MIG IBV_EVENT_PATH_MIG_ERR IBV_EVENT_DEVICE_FATAL IBV_EVENT_PORT_ACTIVE IBV_EVENT_PORT_ERR IBV_EVENT_LID_CHANGE IBV_EVENT_PKEY_CHANGE IBV_EVENT_SM_CHANGE IBV_EVENT_SRQ_ERR IBV_EVENT_SRQ_LIMIT_REACHED IBV_EVENT_QP_LAST_WQE_REACHED IBV_EVENT_CLIENT_REREGISTER IBV_EVENT_GID_CHANGE IBV_EVENT_WQ_FATAL cpdef enum ibv_access_flags: IBV_ACCESS_LOCAL_WRITE IBV_ACCESS_REMOTE_WRITE IBV_ACCESS_REMOTE_READ IBV_ACCESS_REMOTE_ATOMIC IBV_ACCESS_MW_BIND IBV_ACCESS_ZERO_BASED IBV_ACCESS_ON_DEMAND IBV_ACCESS_HUGETLB IBV_ACCESS_FLUSH_GLOBAL IBV_ACCESS_FLUSH_PERSISTENT IBV_ACCESS_RELAXED_ORDERING cpdef enum ibv_rereg_mr_flags: IBV_REREG_MR_CHANGE_TRANSLATION IBV_REREG_MR_CHANGE_PD IBV_REREG_MR_CHANGE_ACCESS cpdef enum ibv_rereg_mr_err_code: IBV_REREG_MR_ERR_INPUT IBV_REREG_MR_ERR_DONT_FORK_NEW IBV_REREG_MR_ERR_DO_FORK_OLD IBV_REREG_MR_ERR_CMD IBV_REREG_MR_ERR_CMD_AND_DO_FORK_NEW cpdef enum ibv_wr_opcode: IBV_WR_RDMA_WRITE IBV_WR_RDMA_WRITE_WITH_IMM IBV_WR_SEND IBV_WR_SEND_WITH_IMM IBV_WR_RDMA_READ IBV_WR_ATOMIC_CMP_AND_SWP IBV_WR_ATOMIC_FETCH_AND_ADD IBV_WR_LOCAL_INV IBV_WR_BIND_MW IBV_WR_SEND_WITH_INV IBV_WR_TSO IBV_WR_FLUSH IBV_WR_ATOMIC_WRITE cpdef enum ibv_ops_wr_opcode: IBV_WR_TAG_ADD IBV_WR_TAG_DEL IBV_WR_TAG_SYNC cpdef enum ibv_ops_flags: IBV_OPS_SIGNALED IBV_OPS_TM_SYNC cpdef enum ibv_send_flags: IBV_SEND_FENCE IBV_SEND_SIGNALED IBV_SEND_SOLICITED IBV_SEND_INLINE IBV_SEND_IP_CSUM cpdef enum ibv_tm_cap_flags: IBV_TM_CAP_RC cpdef enum ibv_qp_type: IBV_QPT_RC IBV_QPT_UC IBV_QPT_UD IBV_QPT_RAW_PACKET IBV_QPT_XRC_SEND IBV_QPT_XRC_RECV IBV_QPT_DRIVER cpdef enum ibv_qp_state: IBV_QPS_RESET IBV_QPS_INIT IBV_QPS_RTR IBV_QPS_RTS IBV_QPS_SQD IBV_QPS_SQE IBV_QPS_ERR IBV_QPS_UNKNOWN cpdef enum ibv_mw_type: IBV_MW_TYPE_1 IBV_MW_TYPE_2 cpdef enum ibv_wc_status: IBV_WC_SUCCESS IBV_WC_LOC_LEN_ERR IBV_WC_LOC_QP_OP_ERR IBV_WC_LOC_EEC_OP_ERR IBV_WC_LOC_PROT_ERR IBV_WC_WR_FLUSH_ERR 
IBV_WC_MW_BIND_ERR IBV_WC_BAD_RESP_ERR IBV_WC_LOC_ACCESS_ERR IBV_WC_REM_INV_REQ_ERR IBV_WC_REM_ACCESS_ERR IBV_WC_REM_OP_ERR IBV_WC_RETRY_EXC_ERR IBV_WC_RNR_RETRY_EXC_ERR IBV_WC_LOC_RDD_VIOL_ERR IBV_WC_REM_INV_RD_REQ_ERR IBV_WC_REM_ABORT_ERR IBV_WC_INV_EECN_ERR IBV_WC_INV_EEC_STATE_ERR IBV_WC_FATAL_ERR IBV_WC_RESP_TIMEOUT_ERR IBV_WC_GENERAL_ERR IBV_WC_TM_ERR IBV_WC_TM_RNDV_INCOMPLETE cpdef enum ibv_wc_opcode: IBV_WC_SEND IBV_WC_RDMA_WRITE IBV_WC_RDMA_READ IBV_WC_COMP_SWAP IBV_WC_FETCH_ADD IBV_WC_BIND_MW IBV_WC_LOCAL_INV IBV_WC_TSO IBV_WC_FLUSH IBV_WC_ATOMIC_WRITE IBV_WC_RECV IBV_WC_RECV_RDMA_WITH_IMM IBV_WC_TM_ADD IBV_WC_TM_DEL IBV_WC_TM_SYNC IBV_WC_TM_RECV IBV_WC_TM_NO_TAG IBV_WC_DRIVER2 IBV_WC_DRIVER3 cpdef enum ibv_create_cq_wc_flags: IBV_WC_EX_WITH_BYTE_LEN IBV_WC_EX_WITH_IMM IBV_WC_EX_WITH_QP_NUM IBV_WC_EX_WITH_SRC_QP IBV_WC_EX_WITH_SLID IBV_WC_EX_WITH_SL IBV_WC_EX_WITH_DLID_PATH_BITS IBV_WC_EX_WITH_COMPLETION_TIMESTAMP IBV_WC_EX_WITH_CVLAN IBV_WC_EX_WITH_FLOW_TAG IBV_WC_EX_WITH_TM_INFO IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK cpdef enum ibv_wc_flags: IBV_WC_GRH IBV_WC_WITH_IMM IBV_WC_IP_CSUM_OK IBV_WC_WITH_INV IBV_WC_TM_SYNC_REQ IBV_WC_TM_MATCH IBV_WC_TM_DATA_VALID cpdef enum ibv_srq_attr_mask: IBV_SRQ_MAX_WR IBV_SRQ_LIMIT cpdef enum ibv_srq_type: IBV_SRQT_BASIC IBV_SRQT_XRC IBV_SRQT_TM cpdef enum ibv_srq_init_attr_mask: IBV_SRQ_INIT_ATTR_TYPE IBV_SRQ_INIT_ATTR_PD IBV_SRQ_INIT_ATTR_XRCD IBV_SRQ_INIT_ATTR_CQ IBV_SRQ_INIT_ATTR_TM cpdef enum ibv_mig_state: IBV_MIG_MIGRATED IBV_MIG_REARM IBV_MIG_ARMED cpdef enum ibv_qp_init_attr_mask: IBV_QP_INIT_ATTR_PD IBV_QP_INIT_ATTR_XRCD IBV_QP_INIT_ATTR_CREATE_FLAGS IBV_QP_INIT_ATTR_MAX_TSO_HEADER IBV_QP_INIT_ATTR_IND_TABLE IBV_QP_INIT_ATTR_RX_HASH IBV_QP_INIT_ATTR_SEND_OPS_FLAGS cpdef enum ibv_qp_create_flags: IBV_QP_CREATE_BLOCK_SELF_MCAST_LB IBV_QP_CREATE_SCATTER_FCS IBV_QP_CREATE_CVLAN_STRIPPING IBV_QP_CREATE_SOURCE_QPN IBV_QP_CREATE_PCI_WRITE_END_PADDING cpdef enum ibv_qp_attr_mask: IBV_QP_STATE IBV_QP_CUR_STATE IBV_QP_EN_SQD_ASYNC_NOTIFY IBV_QP_ACCESS_FLAGS IBV_QP_PKEY_INDEX IBV_QP_PORT IBV_QP_QKEY IBV_QP_AV IBV_QP_PATH_MTU IBV_QP_TIMEOUT IBV_QP_RETRY_CNT IBV_QP_RNR_RETRY IBV_QP_RQ_PSN IBV_QP_MAX_QP_RD_ATOMIC IBV_QP_ALT_PATH IBV_QP_MIN_RNR_TIMER IBV_QP_SQ_PSN IBV_QP_MAX_DEST_RD_ATOMIC IBV_QP_PATH_MIG_STATE IBV_QP_CAP IBV_QP_DEST_QPN IBV_QP_RATE_LIMIT cpdef enum ibv_query_qp_data_in_order_flags: IBV_QUERY_QP_DATA_IN_ORDER_RETURN_CAPS cpdef enum ibv_query_qp_data_in_order_caps: IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG IBV_QUERY_QP_DATA_IN_ORDER_ALIGNED_128_BYTES cpdef enum ibv_wq_type: IBV_WQT_RQ cpdef enum ibv_wq_init_attr_mask: IBV_WQ_INIT_ATTR_FLAGS cpdef enum ibv_wq_flags: IBV_WQ_FLAGS_CVLAN_STRIPPING IBV_WQ_FLAGS_SCATTER_FCS IBV_WQ_FLAGS_DELAY_DROP IBV_WQ_FLAGS_PCI_WRITE_END_PADDING cpdef enum ibv_wq_state: IBV_WQS_RESET IBV_WQS_RDY IBV_WQS_ERR IBV_WQS_UNKNOWN cpdef enum ibv_wq_attr_mask: IBV_WQ_ATTR_STATE IBV_WQ_ATTR_CURR_STATE IBV_WQ_ATTR_FLAGS cpdef enum ibv_rx_hash_function_flags: IBV_RX_HASH_FUNC_TOEPLITZ cpdef enum ibv_rx_hash_fields: IBV_RX_HASH_SRC_IPV4 IBV_RX_HASH_DST_IPV4 IBV_RX_HASH_SRC_IPV6 IBV_RX_HASH_DST_IPV6 IBV_RX_HASH_SRC_PORT_TCP IBV_RX_HASH_DST_PORT_TCP IBV_RX_HASH_SRC_PORT_UDP IBV_RX_HASH_DST_PORT_UDP cpdef enum ibv_flow_flags: IBV_FLOW_ATTR_FLAGS_DONT_TRAP IBV_FLOW_ATTR_FLAGS_EGRESS cpdef enum ibv_flow_attr_type: IBV_FLOW_ATTR_NORMAL IBV_FLOW_ATTR_ALL_DEFAULT IBV_FLOW_ATTR_MC_DEFAULT IBV_FLOW_ATTR_SNIFFER cpdef enum ibv_flow_spec_type: IBV_FLOW_SPEC_ETH IBV_FLOW_SPEC_IPV4 IBV_FLOW_SPEC_IPV6 IBV_FLOW_SPEC_IPV4_EXT 
IBV_FLOW_SPEC_ESP IBV_FLOW_SPEC_TCP IBV_FLOW_SPEC_UDP IBV_FLOW_SPEC_VXLAN_TUNNEL IBV_FLOW_SPEC_GRE IBV_FLOW_SPEC_MPLS IBV_FLOW_SPEC_INNER IBV_FLOW_SPEC_ACTION_TAG IBV_FLOW_SPEC_ACTION_DROP IBV_FLOW_SPEC_ACTION_HANDLE IBV_FLOW_SPEC_ACTION_COUNT cpdef enum: IBV_QPF_GRH_REQUIRED cpdef enum ibv_counter_description: IBV_COUNTER_PACKETS IBV_COUNTER_BYTES cpdef enum ibv_read_counters_flags: IBV_READ_COUNTERS_ATTR_PREFER_CACHED cpdef enum ibv_cq_init_attr_mask: IBV_CQ_INIT_ATTR_MASK_FLAGS IBV_CQ_INIT_ATTR_MASK_PD cpdef enum ibv_create_cq_attr_flags: IBV_CREATE_CQ_ATTR_SINGLE_THREADED IBV_CREATE_CQ_ATTR_IGNORE_OVERRUN cpdef enum ibv_odp_general_caps: IBV_ODP_SUPPORT IBV_ODP_SUPPORT_IMPLICIT cpdef enum ibv_odp_transport_cap_bits: IBV_ODP_SUPPORT_SEND IBV_ODP_SUPPORT_RECV IBV_ODP_SUPPORT_WRITE IBV_ODP_SUPPORT_READ IBV_ODP_SUPPORT_ATOMIC IBV_ODP_SUPPORT_SRQ_RECV cpdef enum ibv_device_cap_flags: IBV_DEVICE_RESIZE_MAX_WR IBV_DEVICE_BAD_PKEY_CNTR IBV_DEVICE_BAD_QKEY_CNTR IBV_DEVICE_RAW_MULTI IBV_DEVICE_AUTO_PATH_MIG IBV_DEVICE_CHANGE_PHY_PORT IBV_DEVICE_UD_AV_PORT_ENFORCE IBV_DEVICE_CURR_QP_STATE_MOD IBV_DEVICE_SHUTDOWN_PORT IBV_DEVICE_INIT_TYPE IBV_DEVICE_PORT_ACTIVE_EVENT IBV_DEVICE_SYS_IMAGE_GUID IBV_DEVICE_RC_RNR_NAK_GEN IBV_DEVICE_SRQ_RESIZE IBV_DEVICE_N_NOTIFY_CQ IBV_DEVICE_MEM_WINDOW IBV_DEVICE_UD_IP_CSUM IBV_DEVICE_XRC IBV_DEVICE_MEM_MGT_EXTENSIONS IBV_DEVICE_MEM_WINDOW_TYPE_2A IBV_DEVICE_MEM_WINDOW_TYPE_2B IBV_DEVICE_RC_IP_CSUM IBV_DEVICE_RAW_IP_CSUM IBV_DEVICE_MANAGED_FLOW_STEERING cpdef enum ibv_raw_packet_caps: IBV_RAW_PACKET_CAP_CVLAN_STRIPPING IBV_RAW_PACKET_CAP_SCATTER_FCS IBV_RAW_PACKET_CAP_IP_CSUM IBV_RAW_PACKET_CAP_DELAY_DROP cpdef enum ibv_xrcd_init_attr_mask: IBV_XRCD_INIT_ATTR_FD IBV_XRCD_INIT_ATTR_OFLAGS IBV_XRCD_INIT_ATTR_RESERVED cpdef enum: IBV_WC_STANDARD_FLAGS cpdef enum ibv_values_mask: IBV_VALUES_MASK_RAW_CLOCK cpdef enum ibv_qp_create_send_ops_flags: IBV_QP_EX_WITH_RDMA_WRITE IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM IBV_QP_EX_WITH_SEND IBV_QP_EX_WITH_SEND_WITH_IMM IBV_QP_EX_WITH_RDMA_READ IBV_QP_EX_WITH_ATOMIC_CMP_AND_SWP IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD IBV_QP_EX_WITH_LOCAL_INV IBV_QP_EX_WITH_BIND_MW IBV_QP_EX_WITH_SEND_WITH_INV IBV_QP_EX_WITH_TSO IBV_QP_EX_WITH_FLUSH IBV_QP_EX_WITH_ATOMIC_WRITE cdef unsigned long long IBV_DEVICE_RAW_SCATTER_FCS cdef unsigned long long IBV_DEVICE_PCI_WRITE_END_PADDING cpdef enum ibv_parent_domain_init_attr_mask: IBV_PARENT_DOMAIN_INIT_ATTR_ALLOCATORS IBV_PARENT_DOMAIN_INIT_ATTR_PD_CONTEXT cdef void *IBV_ALLOCATOR_USE_DEFAULT cpdef enum ibv_gid_type: IBV_GID_TYPE_IB IBV_GID_TYPE_ROCE_V1 IBV_GID_TYPE_ROCE_V2 cpdef enum ibv_fork_status: IBV_FORK_DISABLED IBV_FORK_ENABLED IBV_FORK_UNNEEDED cpdef enum ibv_placement_type: IBV_FLUSH_GLOBAL IBV_FLUSH_PERSISTENT cpdef enum ibv_selectivity_level: IBV_FLUSH_MR IBV_FLUSH_RANGE cdef extern from "": cdef unsigned long long IBV_ADVISE_MR_ADVICE_PREFETCH cdef unsigned long long IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE cdef unsigned long long IBV_ADVISE_MR_FLAG_FLUSH cdef unsigned long long IBV_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT _IBV_DEVICE_RAW_SCATTER_FCS = IBV_DEVICE_RAW_SCATTER_FCS _IBV_DEVICE_PCI_WRITE_END_PADDING = IBV_DEVICE_PCI_WRITE_END_PADDING _IBV_ALLOCATOR_USE_DEFAULT = IBV_ALLOCATOR_USE_DEFAULT _IBV_ADVISE_MR_ADVICE_PREFETCH = IBV_ADVISE_MR_ADVICE_PREFETCH _IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE = IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE _IBV_ADVISE_MR_FLAG_FLUSH = IBV_ADVISE_MR_FLAG_FLUSH _IBV_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT = IBV_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT cdef extern from '': cpdef enum 
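# A minimal usage sketch, assuming pyverbs' conventional re-export of these
# compiled cpdef enum constants as the pyverbs.enums module: flag masks
# compose with plain bitwise-or from Python.
import pyverbs.enums as e

access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_READ | \
         e.IBV_ACCESS_REMOTE_WRITE
assert access & e.IBV_ACCESS_REMOTE_READ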
ibv_gid_type_sysfs: IBV_GID_TYPE_SYSFS_IB_ROCE_V1 IBV_GID_TYPE_SYSFS_ROCE_V2 cdef extern from "": cpdef enum ibv_tmh_op: IBV_TMH_NO_TAG IBV_TMH_RNDV IBV_TMH_FIN IBV_TMH_EAGER rdma-core-56.1/pyverbs/libibverbs_enums.pyx000066400000000000000000000000001477342711600211060ustar00rootroot00000000000000rdma-core-56.1/pyverbs/librdmacm.pxd000066400000000000000000000142731477342711600175020ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file #cython: language_level=3 from libc.stdint cimport uint8_t, uint32_t from pyverbs.librdmacm_enums cimport * from pyverbs.libibverbs cimport * cdef extern from '': cdef struct rdma_cm_id: ibv_context *verbs rdma_event_channel *channel void *context ibv_qp *qp rdma_port_space ps uint8_t port_num rdma_cm_event *event ibv_comp_channel *send_cq_channel ibv_cq *send_cq ibv_comp_channel *recv_cq_channel ibv_cq *recv_cq ibv_srq *srq ibv_pd *pd ibv_qp_type qp_type cdef struct rdma_event_channel: int fd cdef struct rdma_conn_param: const void *private_data uint8_t private_data_len uint8_t responder_resources uint8_t initiator_depth uint8_t flow_control uint8_t retry_count uint8_t rnr_retry_count uint8_t srq uint32_t qp_num cdef struct rdma_ud_param: const void *private_data uint8_t private_data_len ibv_ah_attr ah_attr uint32_t qp_num uint32_t qkey cdef union param: rdma_conn_param conn rdma_ud_param ud cdef struct rdma_cm_event: rdma_cm_id *id rdma_cm_id *listen_id rdma_cm_event_type event int status param param cdef struct rdma_addrinfo: int ai_flags int ai_family int ai_qp_type int ai_port_space int ai_src_len int ai_dst_len sockaddr *ai_src_addr sockaddr *ai_dst_addr char *ai_src_canonname char *ai_dst_canonname size_t ai_route_len void *ai_route size_t ai_connect_len void *ai_connect rdma_addrinfo *ai_next cdef struct rdma_cm_join_mc_attr_ex: uint32_t comp_mask uint32_t join_flags sockaddr *addr # These non rdmacm structs defined in one of rdma_cma.h's included header files cdef struct sockaddr: unsigned short sa_family char sa_data[14] cdef struct in_addr: uint32_t s_addr cdef struct sockaddr_in: short sin_family unsigned short sin_port in_addr sin_addr char sin_zero[8] rdma_event_channel *rdma_create_event_channel() void rdma_destroy_event_channel(rdma_event_channel *channel) ibv_context **rdma_get_devices(int *num_devices) void rdma_free_devices (ibv_context **list); int rdma_get_cm_event(rdma_event_channel *channel, rdma_cm_event **event) int rdma_ack_cm_event(rdma_cm_event *event) char *rdma_event_str(rdma_cm_event_type event) int rdma_create_ep(rdma_cm_id **id, rdma_addrinfo *res, ibv_pd *pd, ibv_qp_init_attr *qp_init_attr) void rdma_destroy_ep(rdma_cm_id *id) int rdma_create_id(rdma_event_channel *channel, rdma_cm_id **id, void *context, rdma_port_space ps) int rdma_destroy_id(rdma_cm_id *id) int rdma_get_remote_ece(rdma_cm_id *id, ibv_ece *ece) int rdma_set_local_ece(rdma_cm_id *id, ibv_ece *ece) int rdma_get_request(rdma_cm_id *listen, rdma_cm_id **id) int rdma_bind_addr(rdma_cm_id *id, sockaddr *addr) int rdma_resolve_addr(rdma_cm_id *id, sockaddr *src_addr, sockaddr *dst_addr, int timeout_ms) int rdma_resolve_route(rdma_cm_id *id, int timeout_ms) int rdma_join_multicast(rdma_cm_id *id, sockaddr *addr, void *context) int rdma_join_multicast_ex(rdma_cm_id *id, rdma_cm_join_mc_attr_ex *mc_join_attr, void *context) int rdma_leave_multicast(rdma_cm_id *id, sockaddr *addr) int rdma_connect(rdma_cm_id *id, rdma_conn_param *conn_param) int 
rdma_disconnect(rdma_cm_id *id) int rdma_listen(rdma_cm_id *id, int backlog) int rdma_accept(rdma_cm_id *id, rdma_conn_param *conn_param) int rdma_establish(rdma_cm_id *id) int rdma_getaddrinfo(char *node, char *service, rdma_addrinfo *hints, rdma_addrinfo **res) void rdma_freeaddrinfo(rdma_addrinfo *res) int rdma_init_qp_attr(rdma_cm_id *id, ibv_qp_attr *qp_attr, int *qp_attr_mask) int rdma_create_qp(rdma_cm_id *id, ibv_pd *pd, ibv_qp_init_attr *qp_init_attr) void rdma_destroy_qp(rdma_cm_id *id) int rdma_set_option(rdma_cm_id *id, int level, int optname, void *optval, size_t optlen) int rdma_reject(rdma_cm_id *id, const void *private_data, uint8_t private_data_len) cdef extern from '': int rdma_post_recv(rdma_cm_id *id, void *context, void *addr, size_t length, ibv_mr *mr) int rdma_post_send(rdma_cm_id *id, void *context, void *addr, size_t length, ibv_mr *mr, int flags) int rdma_post_ud_send(rdma_cm_id *id, void *context, void *addr, size_t length, ibv_mr *mr, int flags, ibv_ah *ah, uint32_t remote_qpn) int rdma_post_read(rdma_cm_id *id, void *context, void *addr, size_t length, ibv_mr *mr, int flags, uint64_t remote_addr, uint32_t rkey) int rdma_post_write(rdma_cm_id *id, void *context, void *addr, size_t length, ibv_mr *mr, int flags, uint64_t remote_addr, uint32_t rkey) int rdma_get_send_comp(rdma_cm_id *id, ibv_wc *wc) int rdma_get_recv_comp(rdma_cm_id *id, ibv_wc *wc) ibv_mr *rdma_reg_msgs(rdma_cm_id *id, void *addr, size_t length) ibv_mr *rdma_reg_read(rdma_cm_id *id, void *addr, size_t length) ibv_mr *rdma_reg_write(rdma_cm_id *id, void *addr, size_t length) int rdma_dereg_mr(ibv_mr *mr) rdma-core-56.1/pyverbs/librdmacm.pyx000066400000000000000000000000001477342711600175060ustar00rootroot00000000000000rdma-core-56.1/pyverbs/librdmacm_enums.pxd000066400000000000000000000025511477342711600207050ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. #cython: language_level=3 cdef extern from '': cpdef enum rdma_cm_event_type: RDMA_CM_EVENT_ADDR_RESOLVED RDMA_CM_EVENT_ADDR_ERROR RDMA_CM_EVENT_ROUTE_RESOLVED RDMA_CM_EVENT_ROUTE_ERROR RDMA_CM_EVENT_CONNECT_REQUEST RDMA_CM_EVENT_CONNECT_RESPONSE RDMA_CM_EVENT_CONNECT_ERROR RDMA_CM_EVENT_UNREACHABLE RDMA_CM_EVENT_REJECTED RDMA_CM_EVENT_ESTABLISHED RDMA_CM_EVENT_DISCONNECTED RDMA_CM_EVENT_DEVICE_REMOVAL RDMA_CM_EVENT_MULTICAST_JOIN RDMA_CM_EVENT_MULTICAST_ERROR RDMA_CM_EVENT_ADDR_CHANGE RDMA_CM_EVENT_TIMEWAIT_EXIT cpdef enum rdma_port_space: RDMA_PS_IPOIB RDMA_PS_TCP RDMA_PS_UDP RDMA_PS_IB # Hint flags which control the operation. cpdef enum: RAI_PASSIVE RAI_NUMERICHOST RAI_NOROUTE RAI_FAMILY cpdef enum rdma_cm_join_mc_attr_mask: RDMA_CM_JOIN_MC_ATTR_ADDRESS RDMA_CM_JOIN_MC_ATTR_JOIN_FLAGS cpdef enum rdma_cm_mc_join_flags: RDMA_MC_JOIN_FLAG_FULLMEMBER RDMA_MC_JOIN_FLAG_SENDONLY_FULLMEMBER cpdef enum: RDMA_OPTION_ID cpdef enum: RDMA_OPTION_ID_ACK_TIMEOUT rdma-core-56.1/pyverbs/librdmacm_enums.pyx000066400000000000000000000000001477342711600207150ustar00rootroot00000000000000rdma-core-56.1/pyverbs/mem_alloc.pyx000066400000000000000000000143041477342711600175200ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2018, Mellanox Technologies. All rights reserved. 
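# A minimal passive-side sketch of the rdma_cm flow declared above, using the
# pyverbs CMID/AddrInfo wrappers; the address and port are placeholders:
from pyverbs.cmid import CMID, AddrInfo
from pyverbs.qp import QPInitAttr, QPCap
import pyverbs.cm_enums as ce

addr_info = AddrInfo(src='192.168.1.1', src_service='7471',
                     port_space=ce.RDMA_PS_TCP, flags=ce.RAI_PASSIVE)
server = CMID(creator=addr_info,
              qp_init_attr=QPInitAttr(cap=QPCap(max_recv_wr=1)))
server.listen()    # then accept a request with server.get_request()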
See COPYING file
#cython: language_level=3
from posix.stdlib cimport posix_memalign as c_posix_memalign
from libc.stdlib cimport malloc as c_malloc, free as c_free
from posix.mman cimport mmap as c_mmap, munmap as c_munmap, madvise as c_madvise
from libc.stdint cimport uintptr_t, uint32_t, uint64_t
from pyverbs.base import PyverbsRDMAErrno
from libc.string cimport memcpy
from libc.string cimport memset
cimport posix.mman as mm

cdef extern from 'sys/mman.h':
    cdef void* MAP_FAILED
    cdef int MADV_DONTNEED

cdef extern from 'endian.h':
    unsigned long htobe32(unsigned long host_32bits)
    unsigned long htobe64(unsigned long host_64bits)


def mmap(addr=0, length=100, prot=mm.PROT_READ | mm.PROT_WRITE,
         flags=mm.MAP_PRIVATE | mm.MAP_ANONYMOUS, fd=0, offset=0):
    """
    Python wrapper for the mmap(2) system call
    :param addr: Address to map the memory at
    :param length: The length of the requested memory in bytes
    :param prot: The protection of this memory
    :param flags: Specific flags for this mapping
    :param fd: File descriptor when mapping a specific file
    :param offset: Offset to use when mapping
    :return: The address of the mapped memory
    """
    # uintptr_t is guaranteed to be large enough to hold any pointer.
    # In order to safely cast addr to void*, it is firstly cast to uintptr_t.
    ptr = c_mmap(<void*><uintptr_t>addr, length, prot, flags, fd, offset)
    if ptr == MAP_FAILED:
        raise MemoryError('Failed to mmap memory')
    return <uintptr_t>ptr


def madvise(addr, length, flags=MADV_DONTNEED):
    """
    Python wrapper for the madvise(2) system call
    :param addr: Address of the memory to be advised about
    :param length: The length of the requested memory in bytes
    :param flags: Specific advice flags for this memory
    """
    rc = c_madvise(<void*><uintptr_t>addr, length, flags)
    if rc:
        raise PyverbsRDMAErrno('Failed to madvise memory')


def munmap(addr, length):
    """
    Python wrapper for the munmap(2) system call
    :param addr: The address of the mapped memory to unmap
    :param length: The length of this mapped memory
    """
    ret = c_munmap(<void*><uintptr_t>addr, length)
    if ret:
        raise MemoryError('Failed to munmap requested memory')


def malloc(size):
    """
    Python wrapper for the stdlib malloc function
    :param size: The size of the memory block in bytes
    :return: The address of the allocated memory; raises MemoryError on failure
    """
    ptr = c_malloc(size)
    if not ptr:
        raise MemoryError('Failed to allocate memory')
    return <uintptr_t>ptr


def posix_memalign(size, alignment=8):
    """
    Python wrapper for the stdlib posix_memalign function.
    The function calls posix_memalign and memsets the memory to 0.
    :param size: The size of the memory block in bytes
    :param alignment: Alignment of the allocated memory, must be a power of two
    :return: The address of the allocated memory, which is a multiple of
             alignment
    """
    cdef void* ptr
    ret = c_posix_memalign(&ptr, alignment, size)
    if ret:
        raise MemoryError('Failed to allocate memory ({err})'.format(err=ret))
    memset(ptr, 0, size)
    return <uintptr_t>ptr


def free(ptr):
    """
    Python wrapper for the stdlib free function
    :param ptr: The address of a previously allocated memory block
    """
    c_free(<void*><uintptr_t>ptr)


def writebe32(addr, val, offset=0):
    """
    Write a 32-bit value in big endian to an address with an offset
    :param addr: The start address to write the value to
    :param val: Value to write
    :param offset: Offset from addr to write the value to (in 4-byte units)
    """
    (<uint32_t*><uintptr_t>addr)[offset] = htobe32(val)


def writebe64(addr, val, offset=0):
    """
    Write a 64-bit value in big endian to an address with an offset
    :param addr: The start address to write the value to
    :param val: Value to write
    :param offset: Offset from addr to write the value to (in 8-byte units)
    """
    (<uint64_t*><uintptr_t>addr)[offset] = htobe64(val)


def write(addr, data, length, offset=0):
    """
    Write user data to a given address
    :param addr: The start address to write to
    :param data: User data to write (string or bytes)
    :param length: Length of the data to write (in bytes)
    :param offset: Writing offset (in bytes)
    """
    cdef int off = offset
    cdef void* buf = <void*><uintptr_t>addr
    # If data is a string, cast it to bytes as Python3 doesn't
    # automatically convert it.
    if isinstance(data, str):
        data = data.encode()
    memcpy(<void*>(buf + off), <char*>data, length)


def read32(addr, offset=0):
    """
    Read a 32-bit value from an address with an offset
    :param addr: The start address to read from
    :param offset: Offset from addr to read from (in 4-byte units)
    :return: The read value
    """
    return (<uint32_t*><uintptr_t>addr)[offset]


def read64(addr, offset=0):
    """
    Read a 64-bit value from an address with an offset
    :param addr: The start address to read from
    :param offset: Offset from addr to read from (in 8-byte units)
    :return: The read value
    """
    return (<uint64_t*><uintptr_t>addr)[offset]


def read(addr, length, offset=0):
    """
    Reads data from a given address
    :param addr: The start address to read from
    :param length: Length of the data to read (in bytes)
    :param offset: Reading offset (in bytes)
    :return: The data in the buffer at the requested offset (bytes)
    """
    cdef char *data
    data = <char*><uintptr_t>(addr + offset)
    return data[:length]


# protection bits for mmap/mprotect
PROT_EXEC_ = mm.PROT_EXEC
PROT_READ_ = mm.PROT_READ
PROT_WRITE_ = mm.PROT_WRITE
PROT_NONE_ = mm.PROT_NONE

# flag bits for mmap
MAP_PRIVATE_ = mm.MAP_PRIVATE
MAP_SHARED_ = mm.MAP_SHARED
MAP_FIXED_ = mm.MAP_FIXED
MAP_ANONYMOUS_ = mm.MAP_ANONYMOUS
MAP_STACK_ = mm.MAP_STACK
MAP_LOCKED_ = mm.MAP_LOCKED
MAP_HUGETLB_ = mm.MAP_HUGETLB
MAP_POPULATE_ = mm.MAP_POPULATE
MAP_NORESERVE_ = mm.MAP_NORESERVE
MAP_GROWSDOWN_ = mm.MAP_GROWSDOWN
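# A minimal usage sketch of the wrappers above, assuming default alignment and
# a small payload:
from pyverbs.mem_alloc import posix_memalign, write, read, writebe32, free

buf = posix_memalign(4096)               # zeroed allocation
write(buf, b'abcd', 4)
assert read(buf, 4) == b'abcd'
writebe32(buf, 0x11223344, offset=2)     # big-endian store into 32-bit slot 2
free(buf)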
""" cdef void* ptr ret = c_posix_memalign(&ptr, alignment, size) if ret: raise MemoryError('Failed to allocate memory ({err}'.format(ret)) memset(ptr, 0, size) return ptr def free(ptr): """ Python wrapper for stdlib free function :param ptr: The address of a previously allocated memory block """ c_free(ptr) def writebe32(addr, val, offset=0): """ Write 32-bit value as Big Endian to address and offset :param addr: The start of the address to write the value to :param val: Value to write :param offset: Offset of the address to write the value to (in 4-bytes) """ (addr)[offset] = htobe32(val) def writebe64(addr, val, offset=0): """ Write 64-bit value as Big Endian to address and offset :param addr: The start of the address to write the value to :param val: Value to write :param offset: Offset of the address to write the value to (in 8-bytes) """ (addr)[offset] = htobe64(val) def write(addr, data, length, offset=0): """ Write user data to a given address :param addr: The start of the address to write to :param data: User data to write (string or bytes) :param length: Length of the data to write (in bytes) :param offset: Writing offset (in bytes) """ cdef int off = offset cdef void* buf = addr # If data is a string, cast it to bytes as Python3 doesn't # automatically convert it. if isinstance(data, str): data = data.encode() memcpy((buf + off), data, length) def read32(addr, offset=0): """ Read 32-bit value from address and offset :param addr: The start of the address to read from :param offset: Offset of the address to read from (in 4-bytes) :return: The read value """ return (addr)[offset] def read64(addr, offset=0): """ Read 64-bit value from address and offset :param addr: The start of the address to read from :param offset: Offset of the address to read from (in 8-bytes) :return: The read value """ return (addr)[offset] def read(addr, length, offset=0): """ Reads data from a given address :param addr: The start of the address to read from :param length: Length of data to read (in bytes) :param offset: Reading offset (in bytes) :return: The data on the buffer in the requested offset (bytes) """ cdef char *data data = (addr + offset) return data[:length] # protection bits for mmap/mprotect PROT_EXEC_ = mm.PROT_EXEC PROT_READ_ = mm.PROT_READ PROT_WRITE_ = mm.PROT_WRITE PROT_NONE_ = mm.PROT_NONE # flag bits for mmap MAP_PRIVATE_ = mm.MAP_PRIVATE MAP_SHARED_ = mm.MAP_SHARED MAP_FIXED_ = mm.MAP_FIXED MAP_ANONYMOUS_ = mm.MAP_ANONYMOUS MAP_STACK_ = mm.MAP_STACK MAP_LOCKED_ = mm.MAP_LOCKED MAP_HUGETLB_ = mm.MAP_HUGETLB MAP_POPULATE_ = mm.MAP_POPULATE MAP_NORESERVE_ = mm.MAP_NORESERVE MAP_GROWSDOWN_ = mm.MAP_GROWSDOWN rdma-core-56.1/pyverbs/mr.pxd000066400000000000000000000017321477342711600161620ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file # Copyright (c) 2020, Intel Corporation. All rights reserved. See COPYING file #cython: language_level=3 from pyverbs.base cimport PyverbsCM cimport pyverbs.librdmacm as cm from . 
cimport libibverbs as v cdef class MR(PyverbsCM): cdef object pd cdef object cmid cdef v.ibv_mr *mr cdef int mmap_length cdef object is_huge cdef object is_user_addr cdef void *buf cdef object _is_imported cpdef read(self, length, offset) cdef class MWBindInfo(PyverbsCM): cdef v.ibv_mw_bind_info info cdef object mr cdef class MWBind(PyverbsCM): cdef v.ibv_mw_bind mw_bind cdef object mr cdef class MW(PyverbsCM): cdef object pd cdef v.ibv_mw *mw cdef class DMMR(MR): cdef object dm cdef class DmaBufMR(MR): cdef object dmabuf cdef unsigned long offset cdef object is_dmabuf_internal rdma-core-56.1/pyverbs/mr.pyx000066400000000000000000000450571477342711600162170ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file # Copyright (c) 2020, Intel Corporation. All rights reserved. See COPYING file import resource import logging from posix.mman cimport mmap, munmap, MAP_PRIVATE, PROT_READ, PROT_WRITE, \ MAP_ANONYMOUS, MAP_HUGETLB, MAP_SHARED from pyverbs.pyverbs_error import PyverbsError, PyverbsRDMAError, \ PyverbsUserError from libc.stdint cimport uintptr_t, SIZE_MAX from pyverbs.utils import rereg_error_to_str from pyverbs.base import PyverbsRDMAErrno from posix.stdlib cimport posix_memalign from libc.string cimport memcpy, memset cimport pyverbs.libibverbs_enums as e from pyverbs.device cimport DM from libc.stdlib cimport free, malloc from .cmid cimport CMID from .pd cimport PD from .dmabuf cimport DmaBuf cdef extern from 'sys/mman.h': cdef void* MAP_FAILED HUGE_PAGE_SIZE = 0x200000 cdef class MR(PyverbsCM): """ MR class represents ibv_mr. Buffer allocation in done in the c'tor. Freeing it is done in close(). """ def __init__(self, creator not None, length=0, access=0, address=None, implicit=False, **kwargs): """ Allocate a user-level buffer of length and register a Memory Region of the given length and access flags. :param creator: A PD/CMID object. In case of CMID is passed the MR will be registered using rdma_reg_msgs/write/read according to the passed access flag of local_write/remote_write or remote_read respectively. :param length: Length (in bytes) of MR's buffer. :param access: Access flags, see ibv_access_flags enum :param address: Memory address to register (Optional). If it's not provided, a memory will be allocated in the class initialization. :param implicit: Implicit the MR address. :param kwargs: Arguments: * *handle* A valid kernel handle for a MR object in the given PD (creator). If passed, the MR will be imported and associated with the context that is associated with the given PD using ibv_import_mr. :return: The newly created MR on success """ super().__init__() if self.mr != NULL: return self.is_huge = True if access & e.IBV_ACCESS_HUGETLB else False if address: self.is_user_addr = True # uintptr_t is guaranteed to be large enough to hold any pointer. # In order to safely cast addr to void*, it is firstly cast to uintptr_t. 
self.buf = address mr_handle = kwargs.get('handle') # If a MR handle is passed import MR and finish if mr_handle is not None: pd = creator self.mr = v.ibv_import_mr(pd.pd, mr_handle) if self.mr == NULL: raise PyverbsRDMAErrno('Failed to import MR') self._is_imported = True self.pd = pd pd.add_ref(self) return # Allocate a buffer if not address and length > 0: if self.is_huge: # Rounding up to multiple of HUGE_PAGE_SIZE self.mmap_length = length + (HUGE_PAGE_SIZE - length % HUGE_PAGE_SIZE) \ if length % HUGE_PAGE_SIZE else length self.buf = mmap(NULL, self.mmap_length, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0) if self.buf == MAP_FAILED: raise PyverbsError('Failed to allocate MR buffer of size {l}'. format(l=length)) else: rc = posix_memalign(&self.buf, resource.getpagesize(), length) if rc: raise PyverbsError('Failed to allocate MR buffer of size {l}'. format(l=length)) memset(self.buf, 0, length) if isinstance(creator, PD): pd = creator if implicit: self.mr = v.ibv_reg_mr(pd.pd, NULL, SIZE_MAX, access) else: self.mr = v.ibv_reg_mr(pd.pd, self.buf, length, access) self.pd = pd pd.add_ref(self) elif isinstance(creator, CMID): cmid = creator if access == e.IBV_ACCESS_LOCAL_WRITE: self.mr = cm.rdma_reg_msgs(cmid.id, self.buf, length) elif access == e.IBV_ACCESS_REMOTE_WRITE: self.mr = cm.rdma_reg_write(cmid.id, self.buf, length) elif access == e.IBV_ACCESS_REMOTE_READ: self.mr = cm.rdma_reg_read(cmid.id, self.buf, length) self.cmid = cmid cmid.add_ref(self) if self.mr == NULL: raise PyverbsRDMAErrno('Failed to register a MR. length: {l}, access flags: {a}'. format(l=length, a=access)) self.logger.debug('Registered ibv_mr. Length: {l}, access flags {a}'. format(l=length, a=access)) def unimport(self): v.ibv_unimport_mr(self.mr) self.close() def __dealloc__(self): self.close() cpdef close(self): """ Closes the underlying C object of the MR and frees the memory allocated. MR may be deleted directly or indirectly by closing its context, which leaves the Python PD object without the underlying C object, so during destruction, need to check whether or not the C object exists. In case of an imported MR no deregistration will be done, it's left for the original MR, in order to prevent double dereg by the GC. :return: None """ if self.mr != NULL: if self.logger: self.logger.debug('Closing MR') if not self._is_imported: rc = v.ibv_dereg_mr(self.mr) if rc != 0: raise PyverbsRDMAError('Failed to dereg MR', rc) if not self.is_user_addr: if self.is_huge: munmap(self.buf, self.mmap_length) else: free(self.buf) self.mr = NULL self.pd = None self.buf = NULL self.cmid = None def write(self, data, length, offset=0): """ Write user data to the MR's buffer using memcpy :param data: User data to write :param length: Length of the data to write :param offset: Writing offset :return: None """ if not self.buf or length < 0: raise PyverbsUserError('The MR buffer isn\'t allocated or length' f' {length} is invalid') # If data is a string, cast it to bytes as Python3 doesn't # automatically convert it. 
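# A short end-to-end sketch of the MR class defined here, assuming a local
# RDMA device named 'mlx5_0' (the device name is a placeholder):
from pyverbs.device import Context
from pyverbs.pd import PD
from pyverbs.mr import MR
import pyverbs.enums as e

ctx = Context(name='mlx5_0')
pd = PD(ctx)
mr = MR(pd, 4096, e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_READ)
mr.write(b'ping', 4)
assert mr.read(4, 0) == b'ping'
mr.close()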
cdef int off = offset if isinstance(data, str): data = data.encode() memcpy((self.buf + off), data, length) cpdef read(self, length, offset): """ Reads data from the MR's buffer :param length: Length of data to read :param offset: Reading offset :return: The data on the buffer in the requested offset """ cdef char *data cdef int off = offset # we can't use offset in the next line, as it is # a Python object and not C if offset < 0: raise PyverbsUserError(f'Invalid offset {offset}') if not self.buf or length < 0: raise PyverbsUserError('The MR buffer isn\'t allocated or length' f' {length} is invalid') data = (self.buf + off) return data[:length] def rereg(self, flags, PD pd=None, addr=0, length=0, access=0): """ Modifies the attributes of an existing memory region. :param flags: Bit-mask used to indicate which of the properties of the MR are being modified :param pd: New PD :param addr: New addr to reg the MR on :param length: New length of memory to reg :param access: New MR access :return: None """ ret = v.ibv_rereg_mr(self.mr, flags, pd.pd, addr, length, access) if ret != 0: err_msg = rereg_error_to_str(ret) raise PyverbsRDMAErrno(f'Failed to rereg MR: {err_msg}') if flags & e.IBV_REREG_MR_CHANGE_TRANSLATION: if not self.is_user_addr: if self.is_huge: munmap(self.buf, self.mmap_length) else: free(self.buf) self.buf = addr self.is_user_addr = True if flags & e.IBV_REREG_MR_CHANGE_PD: (self.pd).remove_ref(self) self.pd = pd pd.add_ref(self) @property def buf(self): return self.buf @property def lkey(self): return self.mr.lkey @property def rkey(self): return self.mr.rkey @property def length(self): return self.mr.length @property def handle(self): return self.mr.handle def __str__(self): print_format = '{:22}: {:<20}\n' return 'MR:\n' + \ print_format.format('lkey', self.lkey) + \ print_format.format('rkey', self.rkey) + \ print_format.format('length', self.length) + \ print_format.format('buf', self.buf) + \ print_format.format('handle', self.handle) cdef class MWBindInfo(PyverbsCM): def __init__(self, MR mr not None, addr, length, mw_access_flags): super().__init__() self.mr = mr self.info.mr = mr.mr self.info.addr = addr self.info.length = length self.info.mw_access_flags = mw_access_flags @property def mw_access_flags(self): return self.info.mw_access_flags @property def length(self): return self.info.length @property def addr(self): return self.info.addr def __str__(self): print_format = '{:22}: {:<20}\n' return 'MWBindInfo:\n' +\ print_format.format('Addr', self.info.addr) +\ print_format.format('Length', self.info.length) +\ print_format.format('MW access flags', self.info.mw_access_flags) cdef class MWBind(PyverbsCM): def __init__(self, MWBindInfo info not None,send_flags, wr_id=0): super().__init__() self.mw_bind.wr_id = wr_id self.mw_bind.send_flags = send_flags self.mw_bind.bind_info = info.info def __str__(self): print_format = '{:22}: {:<20}\n' return 'MWBind:\n' +\ print_format.format('WR id', self.mw_bind.wr_id) +\ print_format.format('Send flags', self.mw_bind.send_flags) cdef class MW(PyverbsCM): def __init__(self, PD pd not None, v.ibv_mw_type mw_type): """ Initializes a memory window object of the given type :param pd: A PD object :param mw_type: Type of of the memory window, see ibv_mw_type enum :return: """ super().__init__() self.mw = NULL self.mw = v.ibv_alloc_mw(pd.pd, mw_type) if self.mw == NULL: raise PyverbsRDMAErrno('Failed to allocate MW') self.pd = pd pd.add_ref(self) self.logger.debug('Allocated memory window of type {t}'. 
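# A short sketch of allocating a type-1 memory window with the MW class
# defined above; pd is assumed to be an open PD (type-2 windows additionally
# need a QP bind, not shown):
from pyverbs.mr import MW
import pyverbs.enums as e

mw = MW(pd, e.IBV_MW_TYPE_1)
print(mw.rkey)
mw.close()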
format(t=mwtype2str(mw_type))) def __dealloc__(self): self.close() cpdef close(self): """ Closes the underlying C MW object. MW may be deleted directly or by deleting its PD, which leaves the Python object without the underlying MW. Need to check that the underlying MW wasn't dealloced before. :return: None """ if self.mw is not NULL: if self.logger: self.logger.debug('Closing MW') rc = v.ibv_dealloc_mw(self.mw) if rc != 0: raise PyverbsRDMAError('Failed to dealloc MW', rc) self.mw = NULL self.pd = None @property def handle(self): return self.mw.handle @property def rkey(self): return self.mw.rkey @property def type(self): return self.mw.type def __str__(self): print_format = '{:22}: {:<20}\n' return 'MW:\n' +\ print_format.format('Rkey', self.mw.rkey) +\ print_format.format('Handle', self.mw.handle) +\ print_format.format('MW Type', mwtype2str(self.mw.type)) cdef class DMMR(MR): def __init__(self, PD pd not None, length, access, DM dm, offset): """ Initializes a DMMR (Device Memory Memory Region) of the given length and access flags using the given PD and DM objects. :param pd: A PD object :param length: Length in bytes :param access: Access flags, see ibv_access_flags enum :param dm: A DM (device memory) object to be used for this DMMR :param offset: Byte offset from the beginning of the allocated device memory buffer :return: The newly create DMMR """ # Initialize the logger here as the parent's __init__ is called after # the DMMR is allocated. Allocation can fail, which will lead to # exceptions thrown during object's teardown. self.logger = logging.getLogger(self.__class__.__name__) self.mr = v.ibv_reg_dm_mr(pd.pd, dm.dm, offset, length, access) if self.mr == NULL: raise PyverbsRDMAErrno('Failed to register a device MR. length: {len}, access flags: {flags}'. format(len=length, flags=access,)) super().__init__(pd, length, access) self.pd = pd self.dm = dm pd.add_ref(self) dm.add_ref(self) self.logger.debug('Registered device ibv_mr. Length: {len}, access flags {flags}'. format(len=length, flags=access)) def write(self, data, length, offset=0): if isinstance(data, str): data = data.encode() return self.dm.copy_to_dm(offset, data, length) cpdef read(self, length, offset): return self.dm.copy_from_dm(offset, length) cdef class DmaBufMR(MR): def __init__(self, PD pd not None, length, access, dmabuf=None, offset=0, gpu=0, gtt=0): """ Initializes a DmaBufMR (DMA-BUF Memory Region) of the given length and access flags using the given PD and DmaBuf objects. :param pd: A PD object :param length: Length in bytes :param access: Access flags, see ibv_access_flags enum :param dmabuf: A DmaBuf object or a FD representing a dmabuf. DmaBuf object will be allocated if None is passed. :param offset: Byte offset from the beginning of the dma-buf :param gpu: GPU unit for internal dmabuf allocation :param gtt: If true allocate internal dmabuf from GTT instead of VRAM :return: The newly created DMABUFMR """ self.logger = logging.getLogger(self.__class__.__name__) if dmabuf is None: self.is_dmabuf_internal = True dmabuf = DmaBuf(length + offset, gpu, gtt) fd = dmabuf.fd if isinstance(dmabuf, DmaBuf) else dmabuf self.mr = v.ibv_reg_dmabuf_mr(pd.pd, offset, length, offset, fd, access) if self.mr == NULL: raise PyverbsRDMAErrno(f'Failed to register a dma-buf MR. length: {length}, access flags: {access}') super().__init__(pd, length, access) self.pd = pd self.dmabuf = dmabuf self.offset = offset pd.add_ref(self) if isinstance(dmabuf, DmaBuf): dmabuf.add_ref(self) self.logger.debug(f'Registered dma-buf ibv_mr. 
Length: {length}, access flags {access}') def __dealloc__(self): self.close() cpdef close(self): """ Closes the underlying C object of the MR and frees the memory allocated. :return: None """ if self.mr != NULL: if self.logger: self.logger.debug('Closing dma-buf MR') rc = v.ibv_dereg_mr(self.mr) if rc != 0: raise PyverbsRDMAError('Failed to dereg dma-buf MR', rc) self.pd = None self.mr = NULL # Set self.mr to NULL before closing dmabuf because this method is # re-entered when close_weakrefs() is called inside dmabuf.close(). if self.is_dmabuf_internal: self.dmabuf.close() self.dmabuf = None @property def offset(self): return self.offset @property def dmabuf(self): return self.dmabuf def write(self, data, length, offset=0): """ Write user data to the dma-buf backing the MR :param data: User data to write :param length: Length of the data to write :param offset: Writing offset :return: None """ if isinstance(data, str): data = data.encode() cdef int off = offset + self.offset cdef void *buf = mmap(NULL, length + off, PROT_READ | PROT_WRITE, MAP_SHARED, self.dmabuf.drm_fd, self.dmabuf.map_offset) if buf == MAP_FAILED: raise PyverbsError(f'Failed to map dma-buf of size {length}') memcpy((buf + off), data, length) munmap(buf, length + off) cpdef read(self, length, offset): """ Reads data from the dma-buf backing the MR :param length: Length of data to read :param offset: Reading offset :return: The data on the buffer in the requested offset """ cdef int off = offset + self.offset cdef void *buf = mmap(NULL, length + off, PROT_READ | PROT_WRITE, MAP_SHARED, self.dmabuf.drm_fd, self.dmabuf.map_offset) if buf == MAP_FAILED: raise PyverbsError(f'Failed to map dma-buf of size {length}') cdef char *data =malloc(length) memset(data, 0, length) memcpy(data, (buf + off), length) munmap(buf, length + off) res = data[:length] free(data) return res def mwtype2str(mw_type): mw_types = {1:'IBV_MW_TYPE_1', 2:'IBV_MW_TYPE_2'} try: return mw_types[mw_type] except KeyError: return 'Unknown MW type ({t})'.format(t=mw_type) rdma-core-56.1/pyverbs/pd.pxd000066400000000000000000000020351477342711600161440ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. #cython: language_level=3 from pyverbs.base cimport PyverbsObject from pyverbs.device cimport Context cimport pyverbs.libibverbs as v from .base cimport PyverbsCM cdef class PD(PyverbsCM): cdef v.ibv_pd *pd cdef Context ctx cdef add_ref(self, obj) cdef remove_ref(self, obj) cdef object srqs cdef object mrs cdef object mws cdef object ahs cdef object qps cdef object parent_domains cdef object wqs cdef object mkeys cdef object deks cdef object _is_imported cdef class ParentDomainInitAttr(PyverbsObject): cdef v.ibv_parent_domain_init_attr init_attr cdef object pd cdef object alloc cdef object dealloc cdef class ParentDomain(PD): cdef add_ref(self, obj) cdef object protection_domain cdef object cqs cdef class ParentDomainContext(PyverbsObject): cdef object p_alloc cdef object p_free cdef object pd cdef object user_data rdma-core-56.1/pyverbs/pd.pyx000066400000000000000000000247521477342711600162030ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. 
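# A sketch of a device-memory MR using the DMMR class defined above, gated on
# the device actually exposing device memory ('mlx5_0' is a placeholder name):
from pyverbs.device import Context, DM, AllocDmAttr
from pyverbs.pd import PD
from pyverbs.mr import DMMR
import pyverbs.enums as e

ctx = Context(name='mlx5_0')
if ctx.query_device_ex().max_dm_size >= 4096:
    dm = DM(ctx, AllocDmAttr(length=4096))
    pd = PD(ctx)
    dmmr = DMMR(pd, 4096, e.IBV_ACCESS_ZERO_BASED, dm, 0)
    dmmr.write(b'data', 4)
    assert dmmr.read(4, 0) == b'data'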
#cython: legacy_implicit_noexcept=True from libc.stdint cimport uintptr_t, uint32_t from libc.stdlib cimport malloc import weakref import logging from pyverbs.pyverbs_error import PyverbsUserError, PyverbsError, \ PyverbsRDMAError from pyverbs.base import PyverbsRDMAErrno from pyverbs.base cimport close_weakrefs from pyverbs.wr cimport copy_sg_array from pyverbs.device cimport Context from pyverbs.cmid cimport CMID from .mr cimport MR, MW, DMMR from pyverbs.srq cimport SRQ from pyverbs.addr cimport AH from pyverbs.cq cimport CQEX from pyverbs.qp cimport QP from pyverbs.wq cimport WQ cdef class PD(PyverbsCM): def __init__(self, object creator not None, **kwargs): """ Initializes a PD object. A reference for the creating Context is kept so that Python's GC will destroy the objects in the right order. :param creator: The Context/CMID object creating the PD :param kwargs: Arguments: * *handle* A valid kernel handle for a PD object in the given creator (Context). If passed, the PD will be imported and associated with the given handle in the given context using ibv_import_pd. """ super().__init__() pd_handle = kwargs.get('handle') if issubclass(type(creator), Context): # Check if the ibv_pd* was initialized by an inheriting class if self.pd == NULL: if pd_handle is not None: self.pd = v.ibv_import_pd((creator).context, pd_handle) self._is_imported = True err_str = 'Failed to import PD' else: self.pd = v.ibv_alloc_pd((creator).context) err_str = 'Failed to allocate PD' if self.pd == NULL: raise PyverbsRDMAErrno(err_str) self.ctx = creator elif issubclass(type(creator), CMID): cmid = creator self.pd = cmid.id.pd self.ctx = cmid.ctx cmid.pd = self else: raise PyverbsUserError('Cannot create PD from {type}' .format(type=type(creator))) self.ctx.add_ref(self) if self.logger: self.logger.debug('Created PD') self.srqs = weakref.WeakSet() self.mrs = weakref.WeakSet() self.mws = weakref.WeakSet() self.ahs = weakref.WeakSet() self.qps = weakref.WeakSet() self.parent_domains = weakref.WeakSet() self.mkeys = weakref.WeakSet() self.deks = weakref.WeakSet() self.wqs = weakref.WeakSet() def advise_mr(self, advise, uint32_t flags, sg_list not None): """ Give advice or directions to the kernel about an address range belonging to a MR. :param advise: The requested advise value :param flags: Describes the properties of the advise operation :param sg_list: The scatter gather list :return: 0 on success, otherwise PyverbsRDMAError will be raised """ num_sges = len(sg_list) dst_sg_list = malloc(num_sges * sizeof(v.ibv_sge)) copy_sg_array(dst_sg_list, sg_list, num_sges) rc = v.ibv_advise_mr(self.pd, advise, flags, dst_sg_list, num_sges) if rc: raise PyverbsRDMAError('Failed to advise MR', rc) return rc def unimport(self): v.ibv_unimport_pd(self.pd) self.close() def __dealloc__(self): """ Closes the inner PD. :return: None """ self.close() cpdef close(self): """ Closes the underlying C object of the PD. PD may be deleted directly or indirectly by closing its context, which leaves the Python PD object without the underlying C object, so during destruction, need to check whether or not the C object exists. In case of an imported PD no deallocation will be done, it's left for the original PD, in order to prevent double dealloc by the GC. 
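# A sketch of PD.advise_mr() defined above, prefetching an ODP-registered
# range. The underscore-prefixed constants are the module-level names set in
# libibverbs_enums; their exact exposure path is an assumption:
from pyverbs.libibverbs_enums import _IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE, \
    _IBV_ADVISE_MR_FLAG_FLUSH
from pyverbs.wr import SGE

sge = SGE(mr.buf, 4096, mr.lkey)   # mr is assumed to be an ODP-enabled MR
pd.advise_mr(_IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE, _IBV_ADVISE_MR_FLAG_FLUSH,
             [sge])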
:return: None """ if self.pd != NULL: if self.logger: self.logger.debug('Closing PD') close_weakrefs([self.deks, self.mkeys, self.parent_domains, self.qps, self.wqs, self.ahs, self.mws, self.mrs, self.srqs]) if not self._is_imported: rc = v.ibv_dealloc_pd(self.pd) if rc != 0: raise PyverbsRDMAError('Failed to dealloc PD', rc) self.pd = NULL self.ctx = None cdef add_ref(self, obj): if isinstance(obj, MR) or isinstance(obj, DMMR): self.mrs.add(obj) elif isinstance(obj, MW): self.mws.add(obj) elif isinstance(obj, AH): self.ahs.add(obj) elif isinstance(obj, QP): self.qps.add(obj) elif isinstance(obj, SRQ): self.srqs.add(obj) elif isinstance(obj, ParentDomain): self.parent_domains.add(obj) elif isinstance(obj, WQ): self.wqs.add(obj) else: raise PyverbsError('Unrecognized object type') cdef remove_ref(self, obj): if isinstance(obj, MR): self.mrs.remove(obj) else: raise PyverbsError('Unrecognized object type') @property def handle(self): return self.pd.handle @property def pd(self): return self.pd cdef void *pd_alloc(v.ibv_pd *pd, void *pd_context, size_t size, size_t alignment, v.uint64_t resource_type): """ Parent Domain allocator wrapper. This function is used to wrap a user-defined Python alloc function which should be a part of pd_context. :param pd: Parent domain :param pd_context: User-specific context of type ParentDomainContext :param size: Size of the requested buffer :param alignment: Alignment of the requested buffer :param resource_type: Vendor-specific resource type :return: Pointer to the allocated buffer, or NULL to designate an error. It may also return IBV_ALLOCATOR_USE_DEFAULT asking the callee to allocate the buffer using the default allocator. """ cdef ParentDomainContext pd_ctx pd_ctx = pd_context ptr = pd_ctx.p_alloc(pd_ctx.pd, pd_ctx, size, alignment, resource_type) return ptr cdef void pd_free(v.ibv_pd *pd, void *pd_context, void *ptr, v.uint64_t resource_type): """ Parent Domain deallocator wrapper. This function is used to wrap a user-defined Python free function which should be part of pd_context. :param pd: Parent domain :param pd_context: User-specific context of type ParentDomainContext :param ptr: Pointer to the buffer to be freed :param resource_type: Vendor-specific resource type """ cdef ParentDomainContext pd_ctx pd_ctx = pd_context pd_ctx.p_free(pd_ctx.pd, pd_ctx, ptr, resource_type) cdef class ParentDomainContext(PyverbsObject): def __init__(self, PD pd, alloc_func, free_func, user_data=None): """ Initializes ParentDomainContext object which is used as a pd_context. 
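# A sketch of the Python allocator callbacks consumed by the pd_alloc/pd_free
# wrappers above, wired into a Parent Domain (ctx is an assumed open Context):
from pyverbs.pd import PD, ParentDomain, ParentDomainInitAttr, \
    ParentDomainContext
from pyverbs.mem_alloc import posix_memalign, free

def alloc_func(pd, pd_ctx, size, alignment, resource_type):
    return posix_memalign(size, alignment)

def free_func(pd, pd_ctx, ptr, resource_type):
    free(ptr)

pd = PD(ctx)
pd_ctx = ParentDomainContext(pd, alloc_func, free_func)
parent_dom = ParentDomain(ctx, ParentDomainInitAttr(pd, pd_ctx))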
It contains the relevant fields in order to allow the user to write alloc and free functions in Python :param pd: PD object that represents the ibv_pd which is passed to the creation of the Parent Domain :param alloc_func: Python alloc function :param free_func: Python free function :param user_data: Additional user-specific data """ super().__init__() self.pd = pd self.p_alloc = alloc_func self.p_free = free_func self.user_data = user_data @property def user_data(self): return self.user_data @user_data.setter def user_data(self, val): self.user_data = val cdef class ParentDomainInitAttr(PyverbsObject): def __init__(self, PD pd not None, ParentDomainContext pd_context=None): """ Represents ibv_parent_domain_init_attr C struct :param pd: PD to initialize the ParentDomain with :param pd_context: ParentDomainContext object including the alloc and free Python callbacks """ super().__init__() self.pd = pd self.init_attr.pd = pd.pd if pd_context: self.init_attr.alloc = pd_alloc self.init_attr.free = pd_free self.init_attr.pd_context = pd_context # The only way to use Python callbacks is to pass the (Python) # functions through pd_context. Hence, we must set PD_CONTEXT # in the comp mask. self.init_attr.comp_mask = v.IBV_PARENT_DOMAIN_INIT_ATTR_PD_CONTEXT | \ v.IBV_PARENT_DOMAIN_INIT_ATTR_ALLOCATORS @property def comp_mask(self): return self.init_attr.comp_mask cdef class ParentDomain(PD): def __init__(self, Context context not None, ParentDomainInitAttr attr not None): """ Initializes ParentDomain object which represents a parent domain of ibv_pd C struct type :param context: Device context :param attr: Attribute of type ParentDomainInitAttr to initialize the ParentDomain with """ # Initialize the logger here as the parent's __init__ is called after # the PD is allocated. Allocation can fail, which will lead to exceptions # thrown during object's teardown. self.logger = logging.getLogger(self.__class__.__name__) (attr.pd).add_ref(self) self.protection_domain = attr.pd self.pd = v.ibv_alloc_parent_domain(context.context, &attr.init_attr) if self.pd == NULL: raise PyverbsRDMAErrno('Failed to allocate Parent Domain') super().__init__(context) self.cqs = weakref.WeakSet() self.logger.debug('Allocated ParentDomain') def __dealloc__(self): self.close() cpdef close(self): if self.pd != NULL: if self.logger: self.logger.debug('Closing ParentDomain') close_weakrefs([self.cqs]) super(ParentDomain, self).close() cdef add_ref(self, obj): if isinstance(obj, CQEX): self.cqs.add(obj) else: PD.add_ref(self, obj) rdma-core-56.1/pyverbs/providers/000077500000000000000000000000001477342711600170415ustar00rootroot00000000000000rdma-core-56.1/pyverbs/providers/__init__.pxd000066400000000000000000000000001477342711600213030ustar00rootroot00000000000000rdma-core-56.1/pyverbs/providers/__init__.py000066400000000000000000000000001477342711600211400ustar00rootroot00000000000000rdma-core-56.1/pyverbs/providers/efa/000077500000000000000000000000001477342711600175745ustar00rootroot00000000000000rdma-core-56.1/pyverbs/providers/efa/CMakeLists.txt000066400000000000000000000003301477342711600223300ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright 2020 Amazon.com, Inc. or its affiliates. All rights reserved. 
rdma_cython_module(pyverbs/providers/efa efa efa_enums.pyx efadv.pyx libefa.pyx ) rdma-core-56.1/pyverbs/providers/efa/__init__.pxd000066400000000000000000000000001477342711600220360ustar00rootroot00000000000000rdma-core-56.1/pyverbs/providers/efa/__init__.py000066400000000000000000000000001477342711600216730ustar00rootroot00000000000000rdma-core-56.1/pyverbs/providers/efa/efa_enums.pxd000066400000000000000000000013111477342711600222470ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) #cython: language_level=3 cdef extern from 'infiniband/efadv.h': cpdef enum: EFADV_DEVICE_ATTR_CAPS_RDMA_READ EFADV_DEVICE_ATTR_CAPS_RNR_RETRY EFADV_DEVICE_ATTR_CAPS_CQ_WITH_SGID EFADV_DEVICE_ATTR_CAPS_RDMA_WRITE EFADV_DEVICE_ATTR_CAPS_UNSOLICITED_WRITE_RECV cpdef enum: EFADV_QP_DRIVER_TYPE_SRD cpdef enum: EFADV_QP_FLAGS_UNSOLICITED_WRITE_RECV cpdef enum: EFADV_WC_EX_WITH_SGID EFADV_WC_EX_WITH_IS_UNSOLICITED cpdef enum: EFADV_MR_ATTR_VALIDITY_RECV_IC_ID EFADV_MR_ATTR_VALIDITY_RDMA_READ_IC_ID EFADV_MR_ATTR_VALIDITY_RDMA_RECV_IC_ID rdma-core-56.1/pyverbs/providers/efa/efa_enums.pyx000066400000000000000000000000001477342711600222660ustar00rootroot00000000000000rdma-core-56.1/pyverbs/providers/efa/efadv.pxd000066400000000000000000000017301477342711600213770ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright 2020-2024 Amazon.com, Inc. or its affiliates. All rights reserved. #cython: language_level=3 cimport pyverbs.providers.efa.libefa as dv from pyverbs.addr cimport AH from pyverbs.base cimport PyverbsObject from pyverbs.cq cimport CQEX from pyverbs.device cimport Context from pyverbs.qp cimport QP, QPEx cdef class EfaContext(Context): pass cdef class EfaDVDeviceAttr(PyverbsObject): cdef dv.efadv_device_attr device_attr cdef class EfaAH(AH): pass cdef class EfaDVAHAttr(PyverbsObject): cdef dv.efadv_ah_attr ah_attr cdef class SRDQP(QP): pass cdef class SRDQPEx(QPEx): pass cdef class EfaQPInitAttr(PyverbsObject): cdef dv.efadv_qp_init_attr qp_init_attr cdef class EfaCQ(CQEX): cdef dv.efadv_cq *dv_cq cdef class EfaDVCQInitAttr(PyverbsObject): cdef dv.efadv_cq_init_attr cq_init_attr cdef class EfaDVMRAttr(PyverbsObject): cdef dv.efadv_mr_attr mr_attr rdma-core-56.1/pyverbs/providers/efa/efadv.pyx000066400000000000000000000235531477342711600214330ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright 2020-2024 Amazon.com, Inc. or its affiliates. All rights reserved. 
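# A sketch of querying EFA device-specific attributes through the EfaContext
# wrapper defined below; 'rdmap0s6' is a placeholder EFA device name:
from pyverbs.providers.efa.efadv import EfaContext

ctx = EfaContext(name='rdmap0s6')
print(ctx.query_efa_device())    # max WRs/SGEs, inline size, device caps, ...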
cimport pyverbs.providers.efa.efa_enums as dve cimport pyverbs.providers.efa.libefa as dv from pyverbs.addr cimport GID from pyverbs.base import PyverbsRDMAErrno, PyverbsRDMAError from pyverbs.cq cimport CQEX, CqInitAttrEx import pyverbs.enums as e cimport pyverbs.libibverbs as v from pyverbs.pd cimport PD from pyverbs.qp cimport QP, QPEx, QPInitAttr, QPInitAttrEx from pyverbs.mr cimport MR def dev_cap_to_str(flags): l = { dve.EFADV_DEVICE_ATTR_CAPS_RDMA_READ: 'RDMA Read', dve.EFADV_DEVICE_ATTR_CAPS_RNR_RETRY: 'RNR Retry', dve.EFADV_DEVICE_ATTR_CAPS_CQ_WITH_SGID: 'CQ entries with source GID', dve.EFADV_DEVICE_ATTR_CAPS_RDMA_WRITE: 'RDMA Write', dve.EFADV_DEVICE_ATTR_CAPS_UNSOLICITED_WRITE_RECV: 'Unsolicited RDMA Write receive', } return bitmask_to_str(flags, l) def bitmask_to_str(bits, values): numeric_bits = bits flags = [] for k, v in sorted(values.items()): if bits & k: flags.append(v) bits -= k if bits: flags.append(f'??({bits:x})') if not flags: flags.append('None') return ', '.join(flags) + f' ({numeric_bits:x})' cdef class EfaContext(Context): """ Represent efa context, which extends Context. """ def __init__(self, name=''): """ Open an efa device :param name: The RDMA device's name (used by parent class) :return: None """ super().__init__(name=name) def query_efa_device(self): """ Queries the provider for device-specific attributes. :return: An EfaDVDeviceAttr containing the attributes. """ dv_attr = EfaDVDeviceAttr() rc = dv.efadv_query_device(self.context, &dv_attr.device_attr, sizeof(dv_attr.device_attr)) if rc: raise PyverbsRDMAError(f'Failed to query efa device {self.name}', rc) return dv_attr cdef class EfaDVDeviceAttr(PyverbsObject): """ Represents efadv_context struct, which exposes efa-specific capabilities, reported by efadv_query_device. """ @property def comp_mask(self): return self.device_attr.comp_mask @property def max_sq_wr(self): return self.device_attr.max_sq_wr @property def max_rq_wr(self): return self.device_attr.max_rq_wr @property def max_sq_sge(self): return self.device_attr.max_sq_sge @property def max_rq_sge(self): return self.device_attr.max_rq_sge @property def inline_buf_size(self): return self.device_attr.inline_buf_size @property def device_caps(self): return self.device_attr.device_caps @property def max_rdma_size(self): return self.device_attr.max_rdma_size def __str__(self): print_format = '{:20}: {:<20}\n' return print_format.format('comp_mask', self.device_attr.comp_mask) + \ print_format.format('Max SQ WR', self.device_attr.max_sq_wr) + \ print_format.format('Max RQ WR', self.device_attr.max_rq_wr) + \ print_format.format('Max SQ SQE', self.device_attr.max_sq_sge) + \ print_format.format('Max RQ SQE', self.device_attr.max_rq_sge) + \ print_format.format('Inline buffer size', self.device_attr.inline_buf_size) + \ print_format.format('Device Capabilities', dev_cap_to_str(self.device_attr.device_caps)) + \ print_format.format('Max RDMA Size', self.device_attr.max_rdma_size) cdef class EfaDVAHAttr(PyverbsObject): """ Represents efadv_ah_attr struct """ @property def comp_mask(self): return self.ah_attr.comp_mask @property def ahn(self): return self.ah_attr.ahn def __str__(self): print_format = '{:20}: {:<20}\n' return print_format.format('comp_mask', self.ah_attr.comp_mask) + \ print_format.format('ahn', self.ah_attr.ahn) cdef class EfaAH(AH): def query_efa_ah(self): """ Queries the provider for EFA specific AH attributes. :return: An EfaDVAHAttr containing the attributes. 
""" ah_attr = EfaDVAHAttr() err = dv.efadv_query_ah(self.ah, &ah_attr.ah_attr, sizeof(ah_attr.ah_attr)) if err: raise PyverbsRDMAError('Failed to query efa ah', err) return ah_attr cdef class SRDQP(QP): """ Initializes an SRD QP according to the user-provided data. :param pd: PD object :param init_attr: QPInitAttr object :return: An initialized SRDQP """ def __init__(self, PD pd not None, QPInitAttr init_attr not None): pd.add_ref(self) self.qp = dv.efadv_create_driver_qp(pd.pd, &init_attr.attr, dve.EFADV_QP_DRIVER_TYPE_SRD) if self.qp == NULL: raise PyverbsRDMAErrno('Failed to create SRD QP') super().__init__(pd, init_attr) cdef class EfaQPInitAttr(PyverbsObject): """ Represents efadv_qp_init_attr struct. """ @property def comp_mask(self): return self.qp_init_attr.comp_mask @property def driver_qp_type(self): return self.qp_init_attr.driver_qp_type @driver_qp_type.setter def driver_qp_type(self, val): self.qp_init_attr.driver_qp_type = val @property def flags(self): return self.qp_init_attr.flags @flags.setter def flags(self, val): self.qp_init_attr.flags = val @property def sl(self): return self.qp_init_attr.sl @sl.setter def sl(self,val): self.qp_init_attr.sl = val cdef class SRDQPEx(QPEx): """ Initializes an SRD QPEx according to the user-provided data. :param ctx: Context object :param init_attr: QPInitAttrEx object :param dv_init_attr: EFAQPInitAttr object :return: An initialized SRDQPEx """ def __init__(self, Context ctx not None, QPInitAttrEx attr_ex not None, EfaQPInitAttr efa_init_attr not None): cdef PD pd self.qp = dv.efadv_create_qp_ex(ctx.context, &attr_ex.attr, &efa_init_attr.qp_init_attr, sizeof(efa_init_attr.qp_init_attr)) if self.qp == NULL: raise PyverbsRDMAErrno('Failed to create SRD QPEx') self.context = ctx ctx.add_ref(self) if attr_ex.pd is not None: pd=attr_ex.pd pd.add_ref(self) super().__init__(ctx, attr_ex) def _get_comp_mask(self, dst): srd_mask = {'INIT': e.IBV_QP_PKEY_INDEX | e.IBV_QP_PORT | e.IBV_QP_QKEY, 'RTR': 0, 'RTS': e.IBV_QP_SQ_PSN} return srd_mask [dst] | e.IBV_QP_STATE cdef class EfaDVCQInitAttr(PyverbsObject): """ Represents efadv_cq_init_attr struct. """ def __init__(self, wc_flags=0): super().__init__() self.cq_init_attr.wc_flags = wc_flags @property def comp_mask(self): return self.cq_init_attr.comp_mask @property def wc_flags(self): return self.cq_init_attr.wc_flags @wc_flags.setter def wc_flags(self, val): self.cq_init_attr.wc_flags = val cdef class EfaCQ(CQEX): """ Initializes an Efa CQ according to the user-provided data. :param ctx: Context object :param attr_ex: CQInitAttrEx object :param efa_init_attr: EfaDVCQInitAttr object :return: An initialized EfaCQ """ def __init__(self, Context ctx not None, CqInitAttrEx attr_ex not None, EfaDVCQInitAttr efa_init_attr): if efa_init_attr is None: efa_init_attr = EfaDVCQInitAttr() self.cq = dv.efadv_create_cq(ctx.context, &attr_ex.attr, &efa_init_attr.cq_init_attr, sizeof(efa_init_attr.cq_init_attr)) if self.cq == NULL: raise PyverbsRDMAErrno('Failed to create EFA CQ') self.ibv_cq = v.ibv_cq_ex_to_cq(self.cq) self.dv_cq = dv.efadv_cq_from_ibv_cq_ex(self.cq) self.context = ctx ctx.add_ref(self) super().__init__(ctx, attr_ex) def read_sgid(self): """ Read SGID from last work completion, if AH is unknown. """ sgid = GID() err = dv.efadv_wc_read_sgid(self.dv_cq, &sgid.gid) if err: return None return sgid def is_unsolicited(self): """ Check if current work completion is unsolicited. 
""" return dv.efadv_wc_is_unsolicited(self.dv_cq) cdef class EfaDVMRAttr(PyverbsObject): """ Represents efadv_mr_attr struct, which exposes efa-specific MR attributes, reported by efadv_query_mr. """ @property def comp_mask(self): return self.mr_attr.comp_mask @property def ic_id_validity(self): return self.mr_attr.ic_id_validity @property def recv_ic_id(self): return self.mr_attr.recv_ic_id @property def rdma_read_ic_id(self): return self.mr_attr.rdma_read_ic_id @property def rdma_recv_ic_id(self): return self.mr_attr.rdma_recv_ic_id def __str__(self): print_format = '{:28}: {:<20}\n' return print_format.format('comp_mask', self.mr_attr.comp_mask) + \ print_format.format('Interconnect id validity', self.mr_attr.ic_id_validity) + \ print_format.format('Receive interconnect id', self.mr_attr.recv_ic_id) + \ print_format.format('RDMA read interconnect id', self.mr_attr.rdma_read_ic_id) + \ print_format.format('RDMA receive interconnect id', self.mr_attr.rdma_recv_ic_id) cdef class EfaMR(MR): """ Represents an MR with EFA specific properties """ def query(self): """ Queries the MR for device-specific attributes. :return: An EfaDVMRAttr containing the attributes. """ mr_attr = EfaDVMRAttr() rc = dv.efadv_query_mr(self.mr, &mr_attr.mr_attr, sizeof(mr_attr.mr_attr)) if rc: raise PyverbsRDMAError(f'Failed to query EFA MR', rc) return mr_attr rdma-core-56.1/pyverbs/providers/efa/libefa.pxd000066400000000000000000000044241477342711600215370ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright 2020-2024 Amazon.com, Inc. or its affiliates. All rights reserved. #cython: language_level=3 from libc.stdint cimport uint8_t, uint16_t, uint32_t, uint64_t from libcpp cimport bool cimport pyverbs.libibverbs as v cdef extern from 'infiniband/efadv.h': cdef struct efadv_device_attr: uint64_t comp_mask; uint32_t max_sq_wr; uint32_t max_rq_wr; uint16_t max_sq_sge; uint16_t max_rq_sge; uint16_t inline_buf_size; uint8_t reserved[2]; uint32_t device_caps; uint32_t max_rdma_size; cdef struct efadv_ah_attr: uint64_t comp_mask; uint16_t ahn; uint8_t reserved[6]; cdef struct efadv_qp_init_attr: uint64_t comp_mask; uint32_t driver_qp_type; uint16_t flags; uint8_t sl; uint8_t reserved[1]; cdef struct efadv_cq_init_attr: uint64_t comp_mask; uint64_t wc_flags; cdef struct efadv_cq: uint64_t comp_mask; cdef struct efadv_mr_attr: uint64_t comp_mask; uint16_t ic_id_validity; uint16_t recv_ic_id; uint16_t rdma_read_ic_id; uint16_t rdma_recv_ic_id; int efadv_query_device(v.ibv_context *ibvctx, efadv_device_attr *attrs, uint32_t inlen) int efadv_query_ah(v.ibv_ah *ibvah, efadv_ah_attr *attr, uint32_t inlen) v.ibv_qp *efadv_create_driver_qp(v.ibv_pd *ibvpd, v.ibv_qp_init_attr *attr, uint32_t driver_qp_type) v.ibv_qp *efadv_create_qp_ex(v.ibv_context *ibvctx, v.ibv_qp_init_attr_ex *attr_ex, efadv_qp_init_attr *efa_attr, uint32_t inlen) v.ibv_cq_ex *efadv_create_cq(v.ibv_context *ibvctx, v.ibv_cq_init_attr_ex *attr_ex, efadv_cq_init_attr *efa_attr, uint32_t inlen) efadv_cq *efadv_cq_from_ibv_cq_ex(v.ibv_cq_ex *ibvcqx) int efadv_wc_read_sgid(efadv_cq *efadv_cq, v.ibv_gid *sgid) bool efadv_wc_is_unsolicited(efadv_cq *efadv_cq) int efadv_query_mr(v.ibv_mr *ibvmr, efadv_mr_attr *attr, uint32_t inlen) 
rdma-core-56.1/pyverbs/providers/efa/libefa.pyx000066400000000000000000000000001477342711600215460ustar00rootroot00000000000000rdma-core-56.1/pyverbs/providers/mlx5/000077500000000000000000000000001477342711600177265ustar00rootroot00000000000000rdma-core-56.1/pyverbs/providers/mlx5/CMakeLists.txt000066400000000000000000000006701477342711600224710ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file rdma_cython_module(pyverbs/providers/mlx5 mlx5 dr_action.pyx dr_domain.pyx dr_matcher.pyx dr_rule.pyx dr_table.pyx libmlx5.pyx mlx5_enums.pyx mlx5_vfio.pyx mlx5dv.pyx mlx5dv_crypto.pyx mlx5dv_dmabuf.pyx mlx5dv_flow.pyx mlx5dv_mkey.pyx mlx5dv_objects.pyx mlx5dv_sched.pyx ) rdma-core-56.1/pyverbs/providers/mlx5/__init__.pxd000066400000000000000000000000001477342711600221700ustar00rootroot00000000000000rdma-core-56.1/pyverbs/providers/mlx5/__init__.py000066400000000000000000000000001477342711600220250ustar00rootroot00000000000000rdma-core-56.1/pyverbs/providers/mlx5/dr_action.pxd000066400000000000000000000041131477342711600224040ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. See COPYING file #cython: language_level=3 from pyverbs.providers.mlx5.dr_domain cimport DrDomain from pyverbs.providers.mlx5.mlx5dv cimport Mlx5DevxObj from pyverbs.providers.mlx5.dr_table cimport DrTable cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.base cimport PyverbsCM from pyverbs.qp cimport QP cdef class DrAction(PyverbsCM): cdef dv.mlx5dv_dr_action *action cdef object dr_rules cdef object dr_used_actions cdef add_ref(self, obj) cdef class DrActionQp(DrAction): cdef QP qp cdef class DrActionModify(DrAction): cdef DrDomain domain cdef class DrActionFlowCounter(DrAction): cdef Mlx5DevxObj devx_obj cdef class DrActionDrop(DrAction): pass cdef class DrActionTag(DrAction): pass cdef class DrActionDestTable(DrAction): cdef DrTable table cdef class DrActionPushVLan(DrAction): cdef DrDomain domain cdef class DrActionPopVLan(DrAction): pass cdef class DrActionDestAttr(PyverbsCM): cdef DrAction dest cdef dv.mlx5dv_dr_action_dest_attr *action_dest_attr cdef dv.mlx5dv_dr_action_dest_reformat *dest_reformat cdef class DrActionDestArray(DrAction): cdef DrDomain domain cdef object dest_actions cdef class DrActionDefMiss(DrAction): pass cdef class DrActionVPort(DrAction): cdef DrDomain domain cdef int vport cdef class DrActionIBPort(DrAction): cdef DrDomain domain cdef int ib_port cdef class DrActionDestTir(DrAction): cdef Mlx5DevxObj devx_obj cdef class DrActionPacketReformat(DrAction): cdef DrDomain domain cdef class DrFlowSamplerAttr(PyverbsCM): cdef dv.mlx5dv_dr_flow_sampler_attr *attr cdef object actions cdef DrTable table cdef class DrActionFlowSample(DrAction): cdef DrFlowSamplerAttr attr cdef object dr_actions cdef object dr_table cdef class DrFlowMeterAttr(PyverbsCM): cdef dv.mlx5dv_dr_flow_meter_attr *attr cdef DrTable table cdef class DrActionFlowMeter(DrAction): cdef DrFlowMeterAttr attr cdef DrTable dr_table rdma-core-56.1/pyverbs/providers/mlx5/dr_action.pyx000066400000000000000000000503061477342711600224360ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. 
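# A sketch of creating a few of the mlx5 DR actions implemented below,
# assuming an mlx5 Context and an existing QP:
from pyverbs.providers.mlx5.dr_domain import DrDomain
from pyverbs.providers.mlx5.dr_action import DrActionDrop, DrActionTag, \
    DrActionQp
import pyverbs.providers.mlx5.mlx5_enums as dve

domain = DrDomain(ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX)
drop = DrActionDrop()
tag = DrActionTag(0x1234)
to_qp = DrActionQp(qp)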
See COPYING file from pyverbs.base import PyverbsRDMAErrno, PyverbsRDMAError from pyverbs.providers.mlx5.dr_domain cimport DrDomain from pyverbs.providers.mlx5.mlx5dv cimport Mlx5DevxObj from pyverbs.providers.mlx5.dr_rule cimport DrRule import pyverbs.providers.mlx5.mlx5_enums as dve from pyverbs.pyverbs_error import PyverbsError from pyverbs.base cimport close_weakrefs from libc.stdlib cimport calloc, free from libc.stdint cimport uint32_t, uint8_t from libc.string cimport memcpy import weakref import struct import errno be64toh = lambda num: struct.unpack('Q'.encode(), struct.pack('!8s'.encode(), num))[0] ACTION_SIZE = 8 cdef class DrAction(PyverbsCM): def __init__(self): super().__init__() self.dr_rules = weakref.WeakSet() self.dr_used_actions = weakref.WeakSet() cdef add_ref(self, obj): if isinstance(obj, DrRule): self.dr_rules.add(obj) elif isinstance(obj, DrAction): self.dr_used_actions.add(obj) else: raise PyverbsError('Unrecognized object type') def __dealloc__(self): self.close() cpdef close(self): if self.action != NULL: if self.logger: self.logger.debug('Closing DrAction.') close_weakrefs([self.dr_rules, self.dr_used_actions]) rc = dv.mlx5dv_dr_action_destroy(self.action) if rc: raise PyverbsRDMAError('Failed to destroy DrAction.', rc) self.action = NULL cdef class DrActionQp(DrAction): def __init__(self, QP qp): super().__init__() self.action = dv.mlx5dv_dr_action_create_dest_ibv_qp((qp).qp) if self.action == NULL: raise PyverbsRDMAErrno('DrActionQp creation failed.') self.qp = qp qp.dr_actions.add(self) def __dealloc__(self): self.close() cpdef close(self): if self.action != NULL: super(DrActionQp, self).close() self.qp = None cdef class DrActionFlowCounter(DrAction): def __init__(self, Mlx5DevxObj devx_obj, offset=0): """ Create DR flow counter action. :param devx_obj: Mlx5DevxObj object which is the flow counter object. :param offset: Offset of the specific counter in the counter object. """ super().__init__() self.action = dv.mlx5dv_dr_action_create_flow_counter(devx_obj.obj, offset) if self.action == NULL: raise PyverbsRDMAErrno('DrActionFlowCounter creation failed.') self.devx_obj = devx_obj devx_obj.add_ref(self) def __dealloc__(self): self.close() cpdef close(self): if self.action != NULL: super(DrActionFlowCounter, self).close() self.devx_obj = None cdef class DrActionDrop(DrAction): def __init__(self): """ Create DR flow drop action. """ super().__init__() self.action = dv.mlx5dv_dr_action_create_drop() if self.action == NULL: raise PyverbsRDMAErrno('DrActionDrop creation failed.') cdef class DrActionModify(DrAction): def __init__(self, DrDomain domain, flags=0, actions=list()): """ Create DR modify header actions. :param domain: DrDomain object where the action should be located. :param flags: Modify action flags. :param actions: List of Bytes of the actions command input data provided in a device specification format (Stream of bytes or __bytes__ is implemented). 
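        Example (illustrative sketch): a single 8-byte modify-header command
        is passed as bytes; the value below is a placeholder rather than a
        real device command, and domain is assumed to be an existing
        DrDomain.
            action_bytes = b'\x40\x00\x00\x00\x00\x00\x00\x01'
            modify = DrActionModify(domain,
                                    dve.MLX5DV_DR_ACTION_FLAGS_ROOT_LEVEL,
                                    [action_bytes])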
""" super().__init__() action_buf_size = len(actions) * ACTION_SIZE cdef unsigned long long *buf = calloc(1, action_buf_size) if buf == NULL: raise MemoryError('Failed to allocate memory', errno) for i in range(len(actions)): buf[i] = be64toh(bytes(actions[i])) self.action = dv.mlx5dv_dr_action_create_modify_header(domain.domain, flags, action_buf_size, buf) free(buf) if self.action == NULL: raise PyverbsRDMAErrno('Failed to create dr action modify header') self.domain = domain domain.dr_actions.add(self) def __dealloc__(self): self.close() cpdef close(self): if self.action != NULL: super(DrActionModify, self).close() self.domain = None cdef class DrActionTag(DrAction): def __init__(self, tag): """ Create DR tag action. :param tag: Tag value """ super().__init__() self.action = dv.mlx5dv_dr_action_create_tag(tag) if self.action == NULL: raise PyverbsRDMAErrno('DrActionTag creation failed.') cdef class DrActionDestTable(DrAction): def __init__(self, DrTable table): """ Create DR destination table action. :param table: Destination table """ super().__init__() self.action = dv.mlx5dv_dr_action_create_dest_table(table.table) if self.action == NULL: raise PyverbsRDMAErrno('DrActionDestTable creation failed.') self.table = table table.dr_actions.add(self) def __dealloc__(self): self.close() cpdef close(self): if self.action != NULL: super(DrActionDestTable, self).close() self.table = None cdef class DrActionPopVLan(DrAction): def __init__(self): """ Create DR Pop VLAN action. """ super().__init__() self.action = dv.mlx5dv_dr_action_create_pop_vlan() if self.action == NULL: raise PyverbsRDMAErrno('DrActionPopVLan creation failed.') cdef class DrActionPushVLan(DrAction): def __init__(self, DrDomain domain, vlan_hdr): """ Create DR Push VLAN action. :param domain: DrDomain object where the action should be located. :param vlan_hdr: VLAN header. """ super().__init__() self.domain = domain self.action = dv.mlx5dv_dr_action_create_push_vlan(domain.domain, vlan_hdr) if self.action == NULL: raise PyverbsRDMAErrno('DrActionPushVLan creation failed.') domain.dr_actions.add(self) def __dealloc__(self): self.close() cpdef close(self): if self.action != NULL: super(DrActionPushVLan, self).close() self.domain = None cdef class DrActionDestAttr(PyverbsCM): def __init__(self, action_type, DrAction dest, DrAction reformat=None): """ Multi destination attributes class used in order to create multi destination array action. 
        :param action_type: Type of action DEST or DEST_REFORMAT
        :param dest: Destination action to use
        :param reformat: Reformat action to use before destination action
        """
        super().__init__()
        self.dest_reformat = NULL
        self.action_dest_attr = NULL
        if action_type == dve.MLX5DV_DR_ACTION_DEST:
            self.action_dest_attr = <dv.mlx5dv_dr_action_dest_attr *>calloc(
                1, sizeof(dv.mlx5dv_dr_action_dest_attr))
            if self.action_dest_attr == NULL:
                raise PyverbsRDMAErrno('Memory allocation for DrActionDestAttr failed.')
            self.action_dest_attr.type = action_type
            self.action_dest_attr.dest = dest.action
            self.dest = dest
        elif action_type == dve.MLX5DV_DR_ACTION_DEST_REFORMAT:
            # The wrapping dest attr struct must be allocated before its
            # dest_reformat member can be filled in.
            self.action_dest_attr = <dv.mlx5dv_dr_action_dest_attr *>calloc(
                1, sizeof(dv.mlx5dv_dr_action_dest_attr))
            if self.action_dest_attr == NULL:
                raise PyverbsRDMAErrno('Memory allocation for DrActionDestAttr failed.')
            self.action_dest_attr.type = action_type
            self.dest_reformat = <dv.mlx5dv_dr_action_dest_reformat *>calloc(
                1, sizeof(dv.mlx5dv_dr_action_dest_reformat))
            if self.dest_reformat == NULL:
                raise PyverbsRDMAErrno('Memory allocation for DrActionDestAttr failed.')
            self.action_dest_attr.dest_reformat = self.dest_reformat
            self.action_dest_attr.dest_reformat.reformat = reformat.action
            self.action_dest_attr.dest_reformat.dest = dest.action
        else:
            raise PyverbsError('Unsupported action type is provided.')

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        super(DrActionDestAttr, self).close()
        if self.logger:
            self.logger.debug('Closing DrActionDestAttr')
        if self.action_dest_attr != NULL:
            free(self.action_dest_attr)
            self.action_dest_attr = NULL
        if self.dest_reformat != NULL:
            free(self.dest_reformat)
            self.dest_reformat = NULL


cdef class DrActionDestArray(DrAction):
    def __init__(self, DrDomain domain, actions_num, dest_actions):
        """
        Create Dest Array Action.
        :param domain: DrDomain object where the action should be located.
        :param actions_num: Number of actions.
        :param dest_actions: Destination actions to use for dest array action.
        """
        cdef dv.mlx5dv_dr_action_dest_attr **ptr_list
        cdef DrActionDestAttr temp_attr
        super().__init__()
        if not actions_num or not dest_actions or not domain:
            raise PyverbsError('Domain, number of actions and '
                               'dest_actions list must be provided '
                               'for creating dest array action.')
        self.domain = domain
        self.dest_actions = dest_actions
        ptr_list = <dv.mlx5dv_dr_action_dest_attr **>calloc(
            actions_num, sizeof(dv.mlx5dv_dr_action_dest_attr *))
        if ptr_list == NULL:
            raise PyverbsError('Failed to allocate memory.')
        for j in range(actions_num):
            temp_attr = <DrActionDestAttr>(dest_actions[j])
            ptr_list[j] = temp_attr.action_dest_attr
        self.action = dv.mlx5dv_dr_action_create_dest_array(
            domain.domain, actions_num, ptr_list)
        # The pointer array is only needed during creation; free it before
        # checking the result so the error path does not leak it.
        free(ptr_list)
        if self.action == NULL:
            raise PyverbsRDMAErrno('DrActionDestArray creation failed.')
        domain.dr_actions.add(self)

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        if self.action != NULL:
            super(DrActionDestArray, self).close()
            self.domain = None
            self.dest_actions = None


cdef class DrActionDefMiss(DrAction):
    def __init__(self):
        """
        Create DR default miss action.
        """
        super().__init__()
        self.action = dv.mlx5dv_dr_action_create_default_miss()
        if self.action == NULL:
            raise PyverbsRDMAErrno('DrActionDefMiss creation failed.')


cdef class DrActionVPort(DrAction):
    def __init__(self, DrDomain domain, vport):
        """
        Create DR vport action.
        :param domain: DrDomain object where the action should be placed.
        :param vport: VPort number.
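        Example (illustrative sketch, assuming an FDB-type DrDomain named
        domain; the vport number is a placeholder):
            vport_action = DrActionVPort(domain, vport=1)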
""" super().__init__() self.action = dv.mlx5dv_dr_action_create_dest_vport(domain.domain, vport) if self.action == NULL: raise PyverbsRDMAErrno('Failed to create dr VPort action') self.domain = domain self.vport = vport domain.dr_actions.add(self) def __dealloc__(self): self.close() cpdef close(self): if self.action != NULL: super(DrActionVPort, self).close() self.domain = None cdef class DrActionIBPort(DrAction): def __init__(self, DrDomain domain, ib_port): """ Create DR IB port action. :param domain: DrDomain object where the action should be placed. :param ib_port: IB port number. """ super().__init__() self.action = dv.mlx5dv_dr_action_create_dest_ib_port(domain.domain, ib_port) if self.action == NULL: raise PyverbsRDMAErrno('Failed to create dr IB port action') self.domain = domain self.ib_port = ib_port domain.dr_actions.add(self) def __dealloc__(self): self.close() cpdef close(self): if self.action != NULL: super(DrActionIBPort, self).close() self.domain = None cdef class DrActionDestTir(DrAction): def __init__(self, Mlx5DevxObj devx_tir): """ Create DR dest devx tir action. :param devx_tir: Destination Mlx5DevxObj tir. """ super().__init__() self.action = dv.mlx5dv_dr_action_create_dest_devx_tir(devx_tir.obj) if self.action == NULL: raise PyverbsRDMAErrno('Failed to create TIR action') self.devx_obj = devx_tir devx_tir.add_ref(self) cdef class DrActionPacketReformat(DrAction): def __init__(self, DrDomain domain, flags=0, reformat_type=dv.MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2, data=None): """ Create DR Packet Reformat action. :param domain: DrDomain object where the action should be placed. :param flags: Packet reformat action flags. :param reformat_type: L2 or L3 encap or decap. :param data: Encap headers (optional). """ super().__init__() cdef char *reformat_data = NULL data_len = 0 if data is None else len(data) if data: arr = bytearray(data) reformat_data = calloc(1, data_len) for i in range(data_len): reformat_data[i] = arr[i] self.action = dv.mlx5dv_dr_action_create_packet_reformat( domain.domain, flags, reformat_type, data_len, reformat_data) if data: free(reformat_data) if self.action == NULL: raise PyverbsRDMAErrno('Failed to create dr action packet reformat') self.domain = domain domain.dr_actions.add(self) def __dealloc__(self): self.close() cpdef close(self): if self.action != NULL: super(DrActionPacketReformat, self).close() self.domain = None cdef class DrFlowSamplerAttr(PyverbsCM): def __init__(self, sample_ratio, DrTable default_next_table, sample_actions, action=None): """ Create DrFlowSamplerAttr. 
        :param sample_ratio: The probability for a packet to be sampled by
                             the sampler is 1/sample_ratio
        :param default_next_table: All packets continue to the default table
                                   id destination
        :param sample_actions: The actions that are performed on the
                               replicated sampled packets at sample_table_id
                               destination
        :param action: If sample table ID is FDB table type, the object will
                       pass RegC0 from input to both sampler and default
                       output destination as SetActionIn
        """
        cdef dv.mlx5dv_dr_action **actions_ptr_list = NULL
        self.attr = <dv.mlx5dv_dr_flow_sampler_attr *>calloc(
            1, sizeof(dv.mlx5dv_dr_flow_sampler_attr))
        if self.attr == NULL:
            raise MemoryError('Failed to allocate memory.')
        size = len(sample_actions) * sizeof(dv.mlx5dv_dr_action *)
        actions_ptr_list = <dv.mlx5dv_dr_action **>calloc(1, size)
        if actions_ptr_list == NULL:
            raise MemoryError(f'Failed to allocate memory of size {size}.')
        self.attr.sample_ratio = sample_ratio
        self.attr.default_next_table = default_next_table.table
        self.attr.num_sample_actions = len(sample_actions)
        for i in range(self.attr.num_sample_actions):
            actions_ptr_list[i] = (<DrAction>sample_actions[i]).action
        self.attr.sample_actions = actions_ptr_list
        if action is not None:
            self.attr.action = be64toh(bytes(action))
        self.actions = sample_actions[:]
        self.table = default_next_table

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        if self.attr != NULL:
            if self.attr.sample_actions:
                free(self.attr.sample_actions)
                self.attr.sample_actions = NULL
            self.table = None
            free(self.attr)
            self.attr = NULL


cdef class DrActionFlowSample(DrAction):
    def __init__(self, DrFlowSamplerAttr attr):
        """
        Create DR Flow Sample action.
        :param attr: DrFlowSamplerAttr attr
        """
        super().__init__()
        self.action = dv.mlx5dv_dr_action_create_flow_sampler(attr.attr)
        if self.action == NULL:
            raise PyverbsRDMAErrno('DrActionFlowSample creation failed.')
        self.attr = attr
        self.dr_table = self.attr.table
        self.attr.table.add_ref(self)
        self.dr_actions = self.attr.actions
        for action in self.dr_actions:
            (<DrAction>action).add_ref(self)

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        if self.action != NULL:
            super(DrActionFlowSample, self).close()
            self.dr_table = None
            self.dr_actions = None
            self.attr = None
            self.action = NULL


cdef class DrFlowMeterAttr(PyverbsCM):
    def __init__(self, DrTable next_table, active=1, reg_c_index=0,
                 flow_meter_parameter=None):
        """
        Create DrFlowMeterAttr.
        :param next_table: Destination Table to which packet would be
                           redirected after passing through the Meter.
        :param active: When set, the Monitor is considered connected to at
                       least one Flow and should be monitored.
        :param reg_c_index: Index of Register C, where the packet color will
                            be set after passing through the Meter.
                            Valid values are according to
                            QUERY_HCA_CAP.flow_meter_reg_id.
                            The result will be set in the 8 LSB of the
                            register.
        :param flow_meter_parameter: PRM data that defines the meter
                                     behavior: rates, colors, etc.
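        Example (illustrative sketch): meter_param stands for opaque,
        device-specific PRM bytes prepared by the caller; no real values are
        implied here.
            attr = DrFlowMeterAttr(next_table, active=1, reg_c_index=0,
                                   flow_meter_parameter=meter_param)
            meter = DrActionFlowMeter(attr)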
""" cdef bytes py_bytes = bytes(flow_meter_parameter) self.attr = calloc(1, sizeof(dv.mlx5dv_dr_flow_meter_attr)) if self.attr == NULL: raise MemoryError('Failed to allocate memory.') self.attr.next_table = next_table.table self.attr.active = active self.attr.reg_c_index = reg_c_index param_size = len(py_bytes) self.attr.flow_meter_parameter_sz = param_size self.attr.flow_meter_parameter = calloc(1, param_size) if self.attr.flow_meter_parameter == NULL: free(self.attr) raise MemoryError('Failed to allocate memory.') memcpy( self.attr.flow_meter_parameter, py_bytes, param_size) self.table = next_table def __dealloc__(self): self.close() cpdef close(self): if self.attr != NULL: if self.attr.flow_meter_parameter != NULL: free(self.attr.flow_meter_parameter) self.attr.flow_meter_parameter = NULL self.table = None free(self.attr) self.attr = NULL cdef class DrActionFlowMeter(DrAction): def __init__(self, DrFlowMeterAttr attr): """ Create DR Flow Meter action. :param attr: DrFlowMeterAttr attr """ super().__init__() self.action = dv.mlx5dv_dr_action_create_flow_meter(attr.attr) if self.action == NULL: raise PyverbsRDMAErrno('DrActionFlowMeter creation failed.') self.attr = attr self.dr_table = self.attr.table self.attr.table.add_ref(self) def __dealloc__(self): self.close() cpdef close(self): if self.action != NULL: super(DrActionFlowMeter, self).close() self.dr_table = None self.attr = None def modify(self, DrFlowMeterAttr attr, modify_field_select): """ Modify flow meter action by selected field. :param attr: DrFlowMeterAttr attr :param modify_field_select: which fields to modify: Bit 0: Active Bit 1: CBS - affects cbs_exponent and cbs_mantissa Bit 2: CIR - affects cir_exponent and cir_mantissa Bit 3: EBS - affects ebs_exponent and ebs_mantissa Bit 4: EIR - affects eir_exponent and eir_mantissa """ ret = dv.mlx5dv_dr_action_modify_flow_meter(self.action, attr.attr, modify_field_select) if ret: raise PyverbsRDMAErrno('Modify DrActionFlowMeter failed.') rdma-core-56.1/pyverbs/providers/mlx5/dr_domain.pxd000066400000000000000000000006311477342711600223770ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. See COPYING file #cython: language_level=3 cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.base cimport PyverbsCM cdef class DrDomain(PyverbsCM): cdef dv.mlx5dv_dr_domain *domain cdef object dr_tables cdef object context cdef object dr_actions cdef add_ref(self, obj) rdma-core-56.1/pyverbs/providers/mlx5/dr_domain.pyx000066400000000000000000000066371477342711600224400ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. See COPYING file from pyverbs.base import PyverbsRDMAErrno, PyverbsRDMAError from pyverbs.providers.mlx5.dr_action cimport DrAction from pyverbs.providers.mlx5.dr_table cimport DrTable from pyverbs.pyverbs_error import PyverbsError from pyverbs.base cimport close_weakrefs from pyverbs.device cimport Context cimport pyverbs.libibverbs as v cimport libc.stdio as s import weakref cdef class DrDomain(PyverbsCM): def __init__(self, Context context, domain_type): """ Initialize DrDomain object over underlying mlx5dv_dr_domain C object. 
:param context: Context object :param domain_type: Type of the domain """ super().__init__() self.domain = dv.mlx5dv_dr_domain_create(context.context, domain_type) if self.domain == NULL: raise PyverbsRDMAErrno('DrDomain creation failed.') self.dr_tables = weakref.WeakSet() self.context = context context.dr_domains.add(self) self.dr_actions = weakref.WeakSet() def allow_duplicate_rules(self, allow): """ Allows or prevents duplicate rules insertion, by default this feature is allowed. :param allow: Boolean to allow or prevent """ dv.mlx5dv_dr_domain_allow_duplicate_rules(self.domain, allow) cdef add_ref(self, obj): if isinstance(obj, DrTable): self.dr_tables.add(obj) elif isinstance(obj, DrAction): self.dr_actions.add(obj) else: raise PyverbsError('Unrecognized object type') def sync(self, flags=dv.MLX5DV_DR_DOMAIN_SYNC_FLAGS_SW | dv.MLX5DV_DR_DOMAIN_SYNC_FLAGS_HW): """ Sync is used in order to flush the rule submission queue. :param flags: MLX5DV_DR_DOMAIN_SYNC_FLAGS_SW - block until completion of all software queued tasks MLX5DV_DR_DOMAIN_SYNC_FLAGS_HW - clear the steering HW cache to enforce next packet hits the latest rules. MLX5DV_DR_DOMAIN_SYNC_FLAGS_MEM - sync device memory to free cached memory. """ if dv.mlx5dv_dr_domain_sync(self.domain, flags): raise PyverbsRDMAErrno('DrDomain sync failed.') def dump(self, filepath): """ Dumps the debug info of the domain into a file. :param filepath: Path to the file """ cdef s.FILE *fp fp = s.fopen(filepath.encode('utf-8'), 'w+') if fp == NULL: raise PyverbsError('Opening dump file failed.') rc = dv.mlx5dv_dump_dr_domain(fp, self.domain) if rc != 0: raise PyverbsRDMAError('Domain dump failed.', rc) if s.fclose(fp) != 0: raise PyverbsError('Closing dump file failed.') def __dealloc__(self): self.close() cpdef close(self): if self.domain != NULL: if self.logger: self.logger.debug('Closing DrDomain.') close_weakrefs([self.dr_actions, self.dr_tables]) rc = dv.mlx5dv_dr_domain_destroy(self.domain) if rc: raise PyverbsRDMAError('Failed to destroy DrDomain.', rc) self.domain = NULL rdma-core-56.1/pyverbs/providers/mlx5/dr_matcher.pxd000066400000000000000000000006011477342711600225500ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. See COPYING file #cython: language_level=3 cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.base cimport PyverbsCM cdef class DrMatcher(PyverbsCM): cdef dv.mlx5dv_dr_matcher *matcher cdef object dr_table cdef object dr_rules cdef add_ref(self, obj) rdma-core-56.1/pyverbs/providers/mlx5/dr_matcher.pyx000066400000000000000000000060701477342711600226030ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. See COPYING file from pyverbs.providers.mlx5.mlx5dv_flow cimport Mlx5FlowMatchParameters from pyverbs.pyverbs_error import PyverbsError, PyverbsRDMAError from pyverbs.providers.mlx5.dr_table cimport DrTable from pyverbs.providers.mlx5.dr_rule cimport DrRule from pyverbs.base import PyverbsRDMAErrno from pyverbs.base cimport close_weakrefs import weakref cdef class DrMatcher(PyverbsCM): def __init__(self, DrTable table, priority, match_criteria_enable, Mlx5FlowMatchParameters mask): """ Initialize DrMatcher object over underlying mlx5dv_dr_matcher C object. 
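        Example (illustrative sketch, assuming a DrTable table and a
        pre-built Mlx5FlowMatchParameters mask for the outer headers):
            matcher = DrMatcher(table, priority=1,
                                match_criteria_enable=0x1, mask=mask)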
:param table: Table to create the matcher on :param priority: Matcher priority :param match_criteria_enable: Bitmask representing which of the headers and parameters in match_criteria are used in defining the Flow. Bit 0: outer_headers Bit 1: misc_parameters Bit 2: inner_headers Bit 3: misc_parameters_2 Bit 4: misc_parameters_3 Bit 5: misc_parameters_4 :param mask: Match parameters to match on """ super().__init__() self.matcher = dv.mlx5dv_dr_matcher_create(table.table, priority, match_criteria_enable, mask.params) if self.matcher == NULL: raise PyverbsRDMAErrno('DrMatcher creation failed.') table.add_ref(self) self.dr_table = table self.dr_rules = weakref.WeakSet() cdef add_ref(self, obj): if isinstance(obj, DrRule): self.dr_rules.add(obj) else: raise PyverbsError('Unrecognized object type') def set_layout(self, log_num_of_rules=0, flags=dv.MLX5DV_DR_MATCHER_LAYOUT_NUM_RULE): """ Set the size of the table for the matcher :param log_num_of_rules: Log of the table size (relevant for MLX5DV_DR_MATCHER_LAYOUT_NUM_RULE) :param flags: Matcher layout flags """ cdef dv.mlx5dv_dr_matcher_layout matcher_layout matcher_layout.log_num_of_rules_hint = log_num_of_rules matcher_layout.flags = flags rc = dv.mlx5dv_dr_matcher_set_layout(self.matcher, &matcher_layout) if rc: raise PyverbsRDMAError('Setting matcher layout failed.', rc) def __dealloc__(self): self.close() cpdef close(self): if self.matcher != NULL: if self.logger: self.logger.debug('Closing Matcher.') close_weakrefs([self.dr_rules]) if dv.mlx5dv_dr_matcher_destroy(self.matcher): raise PyverbsRDMAErrno('Failed to destroy DrMatcher.') self.matcher = NULL self.dr_table = None rdma-core-56.1/pyverbs/providers/mlx5/dr_rule.pxd000066400000000000000000000005051477342711600220770ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. See COPYING file #cython: language_level=3 cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.base cimport PyverbsCM cdef class DrRule(PyverbsCM): cdef dv.mlx5dv_dr_rule *rule cdef object dr_matcher rdma-core-56.1/pyverbs/providers/mlx5/dr_rule.pyx000066400000000000000000000041711477342711600221270ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. See COPYING file from libc.stdlib cimport calloc, free from pyverbs.providers.mlx5.mlx5dv_flow cimport Mlx5FlowMatchParameters from pyverbs.base import PyverbsRDMAErrno, PyverbsRDMAError from pyverbs.providers.mlx5.dr_matcher cimport DrMatcher from pyverbs.providers.mlx5.dr_action cimport DrAction from pyverbs.pyverbs_error import PyverbsError cdef class DrRule(PyverbsCM): def __init__(self, DrMatcher matcher, Mlx5FlowMatchParameters value, actions=None): """ Initialize DrRule object over underlying mlx5dv_dr_rule C object. 
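        Example (illustrative sketch, assuming matcher, a match value of
        type Mlx5FlowMatchParameters and a destination action dest_act):
            rule = DrRule(matcher, value, [dest_act])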
        :param matcher: A matcher with the fields to match on
        :param value: Match parameters with values to match on
        :param actions: List of actions to perform
        """
        super().__init__()
        cdef dv.mlx5dv_dr_action **actions_arr
        actions = [] if actions is None else actions
        actions_arr = <dv.mlx5dv_dr_action **>calloc(
            len(actions), sizeof(dv.mlx5dv_dr_action *))
        if actions_arr == NULL:
            raise PyverbsError('Failed to allocate memory.')
        for i in range(0, len(actions)):
            actions_arr[i] = (<DrAction>actions[i]).action
        self.rule = dv.mlx5dv_dr_rule_create(matcher.matcher, value.params,
                                             len(actions), actions_arr)
        free(actions_arr)
        if self.rule == NULL:
            raise PyverbsRDMAErrno('DrRule creation failed.')
        for i in range(0, len(actions)):
            (<DrAction>actions[i]).add_ref(self)
        matcher.add_ref(self)
        self.dr_matcher = matcher

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        if self.rule != NULL:
            if self.logger:
                self.logger.debug('Closing DrRule.')
            rc = dv.mlx5dv_dr_rule_destroy(self.rule)
            if rc:
                raise PyverbsRDMAError('Failed to destroy DrRule.', rc)
            self.rule = NULL
            self.dr_matcher = None
rdma-core-56.1/pyverbs/providers/mlx5/dr_table.pxd000066400000000000000000000006321477342711600222200ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2020 Nvidia, Inc. All rights reserved. See COPYING file
#cython: language_level=3

cimport pyverbs.providers.mlx5.libmlx5 as dv
from pyverbs.base cimport PyverbsCM


cdef class DrTable(PyverbsCM):
    cdef dv.mlx5dv_dr_table *table
    cdef object dr_domain
    cdef object dr_matchers
    cdef object dr_actions
    cdef add_ref(self, obj)
rdma-core-56.1/pyverbs/providers/mlx5/dr_table.pyx000066400000000000000000000034111477342711600222430ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2020 Nvidia, Inc. All rights reserved. See COPYING file

from pyverbs.base import PyverbsRDMAErrno, PyverbsRDMAError
from pyverbs.providers.mlx5.dr_matcher import DrMatcher
from pyverbs.providers.mlx5.dr_domain cimport DrDomain
from pyverbs.providers.mlx5.dr_action cimport DrAction
from pyverbs.pyverbs_error import PyverbsError
from pyverbs.base cimport close_weakrefs
import weakref


cdef class DrTable(PyverbsCM):
    def __init__(self, DrDomain domain, level):
        """
        Initialize DrTable object over underlying mlx5dv_dr_table C object.
        :param domain: Domain object
        :param level: Table level
        """
        super().__init__()
        self.table = dv.mlx5dv_dr_table_create(domain.domain, level)
        if self.table == NULL:
            raise PyverbsRDMAErrno('DrTable creation failed.')
        domain.add_ref(self)
        self.dr_domain = domain
        self.dr_matchers = weakref.WeakSet()
        self.dr_actions = weakref.WeakSet()

    cdef add_ref(self, obj):
        if isinstance(obj, DrMatcher):
            self.dr_matchers.add(obj)
        elif isinstance(obj, DrAction):
            self.dr_actions.add(obj)
        else:
            raise PyverbsError('Unrecognized object type')

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        if self.table != NULL:
            if self.logger:
                self.logger.debug('Closing DrTable.')
            close_weakrefs([self.dr_matchers, self.dr_actions])
            rc = dv.mlx5dv_dr_table_destroy(self.table)
            if rc:
                raise PyverbsRDMAError('Failed to destroy DrTable.', rc)
            self.table = NULL
            self.dr_domain = None
            self.dr_actions = None
rdma-core-56.1/pyverbs/providers/mlx5/libmlx5.pxd000066400000000000000000000562011477342711600220230ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved.
See COPYING file #cython: language_level=3 from libc.stdint cimport uint8_t, uint16_t, uint32_t, uint64_t, uintptr_t from posix.types cimport off_t from libcpp cimport bool cimport libc.stdio as s from pyverbs.providers.mlx5.mlx5_enums cimport * cimport pyverbs.libibverbs as v cdef extern from 'infiniband/mlx5dv.h': cdef struct mlx5dv_context_attr: unsigned int flags unsigned long comp_mask cdef struct mlx5dv_cqe_comp_caps: unsigned int max_num unsigned int supported_format cdef struct mlx5dv_sw_parsing_caps: unsigned int sw_parsing_offloads unsigned int supported_qpts cdef struct mlx5dv_striding_rq_caps: unsigned int min_single_stride_log_num_of_bytes unsigned int max_single_stride_log_num_of_bytes unsigned int min_single_wqe_log_num_of_strides unsigned int max_single_wqe_log_num_of_strides unsigned int supported_qpts cdef struct mlx5dv_dci_streams_caps: uint8_t max_log_num_concurent uint8_t max_log_num_errored cdef struct mlx5dv_crypto_caps: uint16_t failed_selftests uint8_t crypto_engines uint8_t wrapped_import_method uint8_t log_max_num_deks uint32_t flags cdef struct mlx5dv_ooo_recv_wrs_caps: uint32_t max_rc uint32_t max_xrc uint32_t max_dct uint32_t max_ud uint32_t max_uc cdef struct mlx5dv_context: unsigned char version unsigned long flags unsigned long comp_mask mlx5dv_cqe_comp_caps cqe_comp_caps mlx5dv_sw_parsing_caps sw_parsing_caps mlx5dv_striding_rq_caps striding_rq_caps mlx5dv_dci_streams_caps dci_streams_caps unsigned int tunnel_offloads_caps unsigned int max_dynamic_bfregs unsigned long max_clock_info_update_nsec unsigned int flow_action_flags unsigned int dc_odp_caps uint8_t num_lag_ports mlx5dv_crypto_caps crypto_caps size_t max_wr_memcpy_length uint64_t max_dc_rd_atom uint64_t max_dc_init_rd_atom mlx5dv_ooo_recv_wrs_caps ooo_recv_wrs_caps cdef struct mlx5dv_dci_streams: uint8_t log_num_concurent uint8_t log_num_errored cdef struct mlx5dv_dc_init_attr: mlx5dv_dc_type dc_type unsigned long dct_access_key mlx5dv_dci_streams dci_streams cdef struct mlx5dv_qp_init_attr: unsigned long comp_mask unsigned int create_flags mlx5dv_dc_init_attr dc_init_attr unsigned long send_ops_flags cdef struct mlx5dv_cq_init_attr: unsigned long comp_mask unsigned char cqe_comp_res_format unsigned int flags unsigned short cqe_size cdef struct mlx5dv_var: uint32_t page_id uint32_t length long mmap_off uint64_t comp_mask cdef struct mlx5dv_pp: uint16_t index cdef struct mlx5dv_devx_uar: void *reg_addr; void *base_addr; uint32_t page_id; long mmap_off; uint64_t comp_mask; cdef struct mlx5dv_qp_ex: uint64_t comp_mask cdef struct mlx5dv_sched_node cdef struct mlx5dv_sched_leaf cdef struct mlx5dv_sched_attr: mlx5dv_sched_node *parent; uint32_t flags; uint32_t bw_share; uint32_t max_avg_bw; uint64_t comp_mask; cdef struct mlx5dv_reg: uint32_t value; uint32_t mask; cdef struct mlx5dv_port: uint64_t flags uint16_t vport uint16_t vport_vhca_id uint16_t esw_owner_vhca_id uint64_t vport_steering_icm_rx uint64_t vport_steering_icm_tx mlx5dv_reg reg_c0 cdef struct mlx5dv_flow_match_parameters: size_t match_sz; uint64_t *match_buf; cdef struct mlx5dv_flow_matcher_attr: v.ibv_flow_attr_type type; uint32_t flags; uint16_t priority; uint8_t match_criteria_enable; mlx5dv_flow_match_parameters *match_mask; uint64_t comp_mask; mlx5_ib_uapi_flow_table_type ft_type; cdef struct mlx5dv_flow_matcher cdef struct mlx5dv_devx_obj cdef struct mlx5dv_flow_action_attr: mlx5dv_flow_action_type type v.ibv_qp *qp v.ibv_flow_action *action unsigned int tag_value mlx5dv_devx_obj *obj cdef struct mlx5dv_dr_domain cdef struct 
mlx5dv_dr_table cdef struct mlx5dv_dr_matcher cdef struct mlx5dv_dr_matcher_layout: uint32_t flags uint32_t log_num_of_rules_hint cdef struct mlx5dv_dr_action cdef struct mlx5dv_dr_rule cdef struct mlx5dv_dr_action_dest_reformat: mlx5dv_dr_action *reformat mlx5dv_dr_action *dest cdef struct mlx5dv_dr_action_dest_attr: mlx5dv_dr_action_dest_type type mlx5dv_dr_action *dest mlx5dv_dr_action_dest_reformat *dest_reformat cdef struct mlx5dv_dr_flow_sampler_attr: uint32_t sample_ratio mlx5dv_dr_table *default_next_table uint32_t num_sample_actions mlx5dv_dr_action **sample_actions uint64_t action cdef struct mlx5dv_dr_flow_meter_attr: mlx5dv_dr_table *next_table uint8_t active uint8_t reg_c_index size_t flow_meter_parameter_sz void *flow_meter_parameter cdef struct mlx5dv_clock_info: pass cdef struct mlx5dv_mkey_init_attr: v.ibv_pd *pd uint32_t create_flags uint16_t max_entries cdef struct mlx5dv_mkey: uint32_t lkey uint32_t rkey cdef struct mlx5dv_mr_interleaved: uint64_t addr uint32_t bytes_count uint32_t bytes_skip uint32_t lkey cdef struct mlx5dv_mkey_conf_attr: uint32_t conf_flags uint64_t comp_mask cdef struct mlx5dv_sig_crc: mlx5dv_sig_crc_type type uint64_t seed cdef struct mlx5dv_sig_t10dif: mlx5dv_sig_t10dif_bg_type bg_type uint16_t bg uint16_t app_tag uint32_t ref_tag uint16_t flags cdef union sig: mlx5dv_sig_t10dif *dif mlx5dv_sig_crc *crc cdef struct mlx5dv_sig_block_domain: mlx5dv_sig_type sig_type sig sig mlx5dv_block_size block_size uint64_t comp_mask cdef struct mlx5dv_sig_block_attr: mlx5dv_sig_block_domain *mem mlx5dv_sig_block_domain *wire uint32_t flags uint8_t check_mask uint8_t copy_mask uint64_t comp_mask cdef struct mlx5dv_sig_err: uint64_t actual_value uint64_t expected_value uint64_t offset cdef union err: mlx5dv_sig_err sig cdef struct mlx5dv_mkey_err: mlx5dv_mkey_err_type err_type err err cdef struct mlx5_wqe_data_seg: uint32_t byte_count uint32_t lkey uint64_t addr cdef struct mlx5_wqe_ctrl_seg: uint32_t opmod_idx_opcode uint32_t qpn_ds uint8_t signature uint8_t fm_ce_se uint32_t imm cdef struct mlx5dv_devx_umem: uint32_t umem_id; cdef struct mlx5dv_devx_umem_in: void *addr size_t size uint32_t access uint64_t pgsz_bitmap uint64_t comp_mask int dmabuf_fd cdef struct mlx5dv_vfio_context_attr: const char *pci_name uint32_t flags uint64_t comp_mask cdef struct mlx5dv_devx_msi_vector: int vector int fd cdef struct mlx5dv_devx_eq: void *vaddr cdef struct mlx5dv_pd: uint32_t pdn uint64_t comp_mask cdef struct mlx5dv_cq: void *buf uint32_t *dbrec uint32_t cqe_cnt uint32_t cqe_size void *cq_uar uint32_t cqn uint64_t comp_mask cdef struct mlx5dv_qp: uint64_t comp_mask off_t uar_mmap_offset uint32_t tirn uint32_t tisn uint32_t rqn uint32_t sqn cdef struct mlx5dv_srq: uint32_t stride uint32_t head uint32_t tail uint64_t comp_mask uint32_t srqn cdef struct pd: v.ibv_pd *in_ "in" mlx5dv_pd *out cdef struct cq: v.ibv_cq *in_ "in" mlx5dv_cq *out cdef struct qp: v.ibv_qp *in_ "in" mlx5dv_qp *out cdef struct srq: v.ibv_srq *in_ "in" mlx5dv_srq *out cdef struct mlx5dv_obj: pd pd cq cq qp qp srq srq cdef struct mlx5_cqe64: uint16_t wqe_id uint32_t imm_inval_pkey uint32_t byte_cnt uint64_t timestamp uint16_t wqe_counter uint8_t signature uint8_t op_own void mlx5dv_set_ctrl_seg(mlx5_wqe_ctrl_seg *seg, uint16_t pi, uint8_t opcode, uint8_t opmod, uint32_t qp_num, uint8_t fm_ce_se, uint8_t ds, uint8_t signature, uint32_t imm) void mlx5dv_set_data_seg(mlx5_wqe_data_seg *seg, uint32_t length, uint32_t lkey, uintptr_t address) uint8_t mlx5dv_get_cqe_owner(mlx5_cqe64 *cqe) void 
mlx5dv_set_cqe_owner(mlx5_cqe64 *cqe, uint8_t val) uint8_t mlx5dv_get_cqe_se(mlx5_cqe64 *cqe) uint8_t mlx5dv_get_cqe_format(mlx5_cqe64 *cqe) uint8_t mlx5dv_get_cqe_opcode(mlx5_cqe64 *cqe) cdef struct mlx5dv_dek: pass cdef struct mlx5dv_crypto_login_obj: pass cdef struct mlx5dv_crypto_login_attr: uint32_t credential_id uint32_t import_kek_id char *credential uint64_t comp_mask cdef struct mlx5dv_crypto_login_attr_ex: uint32_t credential_id uint32_t import_kek_id const void *credential size_t credential_len uint64_t comp_mask cdef struct mlx5dv_crypto_login_query_attr: mlx5dv_crypto_login_state state uint64_t comp_mask cdef struct mlx5dv_crypto_attr: mlx5dv_crypto_standard crypto_standard bool encrypt_on_tx mlx5dv_signature_crypto_order signature_crypto_order mlx5dv_block_size data_unit_size char *initial_tweak mlx5dv_dek *dek char *keytag uint64_t comp_mask cdef struct mlx5dv_dek_init_attr: mlx5dv_crypto_key_size key_size bool has_keytag mlx5dv_crypto_key_purpose key_purpose v.ibv_pd *pd char *opaque char *key uint64_t comp_mask mlx5dv_crypto_login_obj *crypto_login cdef struct mlx5dv_dek_attr: mlx5dv_dek_state state char *opaque uint64_t comp_mask bool mlx5dv_is_supported(v.ibv_device *device) v.ibv_context* mlx5dv_open_device(v.ibv_device *device, mlx5dv_context_attr *attr) int mlx5dv_query_device(v.ibv_context *ctx, mlx5dv_context *attrs_out) v.ibv_qp *mlx5dv_create_qp(v.ibv_context *context, v.ibv_qp_init_attr_ex *qp_attr, mlx5dv_qp_init_attr *mlx5_qp_attr) int mlx5dv_query_qp_lag_port(v.ibv_qp *qp, uint8_t *port_num, uint8_t *active_port_num) int mlx5dv_modify_qp_lag_port(v.ibv_qp *qp, uint8_t port_num) int mlx5dv_modify_qp_udp_sport(v.ibv_qp *qp, uint16_t udp_sport) int mlx5dv_dci_stream_id_reset(v.ibv_qp *qp, uint16_t stream_id) v.ibv_cq_ex *mlx5dv_create_cq(v.ibv_context *context, v.ibv_cq_init_attr_ex *cq_attr, mlx5dv_cq_init_attr *mlx5_cq_attr) void mlx5dv_wr_raw_wqe(mlx5dv_qp_ex *mqp_ex, const void *wqe) mlx5dv_var *mlx5dv_alloc_var(v.ibv_context *context, uint32_t flags) void mlx5dv_free_var(mlx5dv_var *dv_var) mlx5dv_pp *mlx5dv_pp_alloc(v.ibv_context *context, size_t pp_context_sz, const void *pp_context, uint32_t flags) void mlx5dv_pp_free(mlx5dv_pp *pp) void mlx5dv_wr_set_dc_addr(mlx5dv_qp_ex *mqp, v.ibv_ah *ah, uint32_t remote_dctn, uint64_t remote_dc_key) void mlx5dv_wr_set_dc_addr_stream(mlx5dv_qp_ex *mqp, v.ibv_ah *ah, uint32_t remote_dctn, uint64_t remote_dc_key, uint16_t stream_id) void mlx5dv_wr_mr_interleaved(mlx5dv_qp_ex *mqp, mlx5dv_mkey *mkey, uint32_t access_flags, uint32_t repeat_count, uint16_t num_interleaved, mlx5dv_mr_interleaved *data) void mlx5dv_wr_mr_list(mlx5dv_qp_ex *mqp, mlx5dv_mkey *mkey, uint32_t access_flags, uint16_t num_sge, v.ibv_sge *sge) void mlx5dv_wr_memcpy(mlx5dv_qp_ex *mqp, uint32_t dest_lkey, uint64_t dest_addr, uint32_t src_lkey, uint64_t src_addr, uint64_t length) mlx5dv_mkey *mlx5dv_create_mkey(mlx5dv_mkey_init_attr *mkey_init_attr) int mlx5dv_destroy_mkey(mlx5dv_mkey *mkey) mlx5dv_qp_ex *mlx5dv_qp_ex_from_ibv_qp_ex(v.ibv_qp_ex *qp_ex) mlx5dv_sched_node *mlx5dv_sched_node_create(v.ibv_context *context, mlx5dv_sched_attr *sched_attr) mlx5dv_sched_leaf *mlx5dv_sched_leaf_create(v.ibv_context *context, mlx5dv_sched_attr *sched_attr) int mlx5dv_sched_node_modify(mlx5dv_sched_node *node, mlx5dv_sched_attr *sched_attr) int mlx5dv_sched_leaf_modify(mlx5dv_sched_leaf *leaf, mlx5dv_sched_attr *sched_attr) int mlx5dv_sched_node_destroy(mlx5dv_sched_node *node) int mlx5dv_sched_leaf_destroy(mlx5dv_sched_leaf *leaf) int 
mlx5dv_modify_qp_sched_elem(v.ibv_qp *qp, mlx5dv_sched_leaf *requestor, mlx5dv_sched_leaf *responder) int mlx5dv_reserved_qpn_alloc(v.ibv_context *context, uint32_t *qpn) int mlx5dv_reserved_qpn_dealloc(v.ibv_context *context, uint32_t qpn) void *mlx5dv_dm_map_op_addr(v.ibv_dm *dm, uint8_t op) int mlx5dv_query_port(v.ibv_context *context, uint32_t port_num, mlx5dv_port *port) mlx5dv_flow_matcher *mlx5dv_create_flow_matcher(v.ibv_context *context, mlx5dv_flow_matcher_attr *matcher_attr) int mlx5dv_destroy_flow_matcher(mlx5dv_flow_matcher *matcher) v.ibv_flow *mlx5dv_create_flow(mlx5dv_flow_matcher *matcher, mlx5dv_flow_match_parameters *match_value, size_t num_actions, mlx5dv_flow_action_attr actions_attr[]) v.ibv_flow_action *mlx5dv_create_flow_action_packet_reformat(v.ibv_context *context, size_t data_sz, void *data, unsigned char reformat_type, unsigned char ft_type) v.ibv_mr *mlx5dv_reg_dmabuf_mr(v.ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access, int mlx5_access) int mlx5dv_get_data_direct_sysfs_path(v.ibv_context *context, char *buf, size_t buf_len) # Direct rules verbs mlx5dv_dr_domain *mlx5dv_dr_domain_create(v.ibv_context *ctx, mlx5dv_dr_domain_type type) int mlx5dv_dr_domain_sync(mlx5dv_dr_domain *domain, uint32_t flags) int mlx5dv_dump_dr_domain(s.FILE *fout, mlx5dv_dr_domain *domain) int mlx5dv_dr_domain_destroy(mlx5dv_dr_domain *dmn) mlx5dv_dr_table *mlx5dv_dr_table_create(mlx5dv_dr_domain *dmn, uint32_t level) int mlx5dv_dr_table_destroy(mlx5dv_dr_table *tbl) mlx5dv_dr_matcher *mlx5dv_dr_matcher_create(mlx5dv_dr_table *table, uint16_t priority, uint8_t match_criteria_enable, mlx5dv_flow_match_parameters *mask) int mlx5dv_dr_matcher_set_layout(mlx5dv_dr_matcher *matcher, mlx5dv_dr_matcher_layout *layout) int mlx5dv_dr_matcher_destroy(mlx5dv_dr_matcher *matcher) mlx5dv_dr_action *mlx5dv_dr_action_create_dest_ibv_qp(v.ibv_qp *ibqp) mlx5dv_dr_action *mlx5dv_dr_action_create_tag(uint32_t tag_value) mlx5dv_dr_action *mlx5dv_dr_action_create_dest_table(mlx5dv_dr_table *tbl) mlx5dv_dr_action *mlx5dv_dr_action_create_pop_vlan() mlx5dv_dr_action *mlx5dv_dr_action_create_push_vlan(mlx5dv_dr_domain *dmn, uint32_t vlan_hdr) mlx5dv_dr_action *mlx5dv_dr_action_create_dest_array( mlx5dv_dr_domain *domain, size_t num_dest, mlx5dv_dr_action_dest_attr *dests[]) int mlx5dv_dr_action_destroy(mlx5dv_dr_action *action) mlx5dv_dr_rule *mlx5dv_dr_rule_create(mlx5dv_dr_matcher *matcher, mlx5dv_flow_match_parameters *value, size_t num_actions, mlx5dv_dr_action *actions[]) mlx5dv_dr_action *mlx5dv_dr_action_create_modify_header(mlx5dv_dr_domain *dmn, uint32_t flags, size_t actions_sz, uint64_t actions[]) mlx5dv_dr_action *mlx5dv_dr_action_create_flow_counter(mlx5dv_devx_obj *devx_obj, uint32_t offset) mlx5dv_dr_action *mlx5dv_dr_action_create_drop() mlx5dv_dr_action *mlx5dv_dr_action_create_default_miss() mlx5dv_dr_action *mlx5dv_dr_action_create_dest_vport(mlx5dv_dr_domain *dmn, uint32_t vport) mlx5dv_dr_action *mlx5dv_dr_action_create_dest_ib_port(mlx5dv_dr_domain *dmn, uint32_t ib_port) mlx5dv_dr_action *mlx5dv_dr_action_create_packet_reformat(mlx5dv_dr_domain *domain, uint32_t flags, unsigned char reformat_type, size_t data_sz, void *data) int mlx5dv_dr_rule_destroy(mlx5dv_dr_rule *rule) void mlx5dv_dr_domain_allow_duplicate_rules(mlx5dv_dr_domain *dmn, bool allow) uint64_t mlx5dv_ts_to_ns(mlx5dv_clock_info *clock_info, uint64_t device_timestamp) int mlx5dv_get_clock_info(v.ibv_context *ctx_in, mlx5dv_clock_info *clock_info) int mlx5dv_map_ah_to_qp(v.ibv_ah *ah, uint32_t 
qp_num) v.ibv_device **mlx5dv_get_vfio_device_list(mlx5dv_vfio_context_attr *attr) int mlx5dv_vfio_get_events_fd(v.ibv_context *ibctx) int mlx5dv_vfio_process_events(v.ibv_context *context) mlx5dv_dr_action *mlx5dv_dr_action_create_dest_devx_tir(mlx5dv_devx_obj *devx_obj) mlx5dv_dr_action *mlx5dv_dr_action_create_flow_sampler(mlx5dv_dr_flow_sampler_attr *attr) mlx5dv_dr_action *mlx5dv_dr_action_create_flow_meter(mlx5dv_dr_flow_meter_attr *attr) int mlx5dv_dr_action_modify_flow_meter(mlx5dv_dr_action *action, mlx5dv_dr_flow_meter_attr *attr, uint64_t modify_field_select) # DevX APIs mlx5dv_devx_uar *mlx5dv_devx_alloc_uar(v.ibv_context *context, uint32_t flags) void mlx5dv_devx_free_uar(mlx5dv_devx_uar *devx_uar) int mlx5dv_devx_general_cmd(v.ibv_context *context, const void *in_, size_t inlen, void *out, size_t outlen) mlx5dv_devx_umem *mlx5dv_devx_umem_reg(v.ibv_context *ctx, void *addr, size_t size, unsigned long access) mlx5dv_devx_umem *mlx5dv_devx_umem_reg_ex(v.ibv_context *ctx, mlx5dv_devx_umem_in *umem_in) int mlx5dv_devx_umem_dereg(mlx5dv_devx_umem *umem) int mlx5dv_devx_query_eqn(v.ibv_context *context, uint32_t vector, uint32_t *eqn) mlx5dv_devx_obj *mlx5dv_devx_obj_create(v.ibv_context *context, const void *_in, size_t inlen, void *out, size_t outlen) int mlx5dv_devx_obj_query(mlx5dv_devx_obj *obj, const void *in_, size_t inlen, void *out, size_t outlen) int mlx5dv_devx_obj_modify(mlx5dv_devx_obj *obj, const void *in_, size_t inlen, void *out, size_t outlen) int mlx5dv_devx_obj_destroy(mlx5dv_devx_obj *obj) int mlx5dv_init_obj(mlx5dv_obj *obj, uint64_t obj_type) mlx5dv_devx_msi_vector *mlx5dv_devx_alloc_msi_vector(v.ibv_context *ibctx) int mlx5dv_devx_free_msi_vector(mlx5dv_devx_msi_vector *msi) mlx5dv_devx_eq *mlx5dv_devx_create_eq(v.ibv_context *context, const void *_in, size_t inlen, void *out, size_t outlen) int mlx5dv_devx_destroy_eq(mlx5dv_devx_eq *eq) # Mkey setters void mlx5dv_wr_mkey_configure(mlx5dv_qp_ex *mqp, mlx5dv_mkey *mkey, int num_setters, mlx5dv_mkey_conf_attr *attr) void mlx5dv_wr_set_mkey_access_flags(mlx5dv_qp_ex *mqp, uint32_t access_flags) void mlx5dv_wr_set_mkey_layout_list(mlx5dv_qp_ex *mqp, uint16_t num_sges, v.ibv_sge *sge) void mlx5dv_wr_set_mkey_layout_interleaved(mlx5dv_qp_ex *mqp, uint32_t repeat_count, uint16_t num_interleaved, mlx5dv_mr_interleaved *data) void mlx5dv_wr_set_mkey_sig_block(mlx5dv_qp_ex *mqp, mlx5dv_sig_block_attr *attr) int mlx5dv_mkey_check(mlx5dv_mkey *mkey, mlx5dv_mkey_err *err_info) int mlx5dv_qp_cancel_posted_send_wrs(mlx5dv_qp_ex *mqp, uint64_t wr_id) void mlx5dv_wr_set_mkey_crypto(mlx5dv_qp_ex *mqp, mlx5dv_crypto_attr *attr) # Crypto APIs int mlx5dv_crypto_login(v.ibv_context *context, mlx5dv_crypto_login_attr *login_attr) int mlx5dv_crypto_login_query_state(v.ibv_context *context, mlx5dv_crypto_login_state *state) int mlx5dv_crypto_logout(v.ibv_context *context) mlx5dv_dek *mlx5dv_dek_create(v.ibv_context *context, mlx5dv_dek_init_attr *init_attr) int mlx5dv_dek_query(mlx5dv_dek *dek, mlx5dv_dek_attr *attr) int mlx5dv_dek_destroy(mlx5dv_dek *dek) mlx5dv_crypto_login_obj *mlx5dv_crypto_login_create(v.ibv_context *context, mlx5dv_crypto_login_attr_ex *login_attr) int mlx5dv_crypto_login_query(mlx5dv_crypto_login_obj *crypto_login, mlx5dv_crypto_login_query_attr *query_attr) int mlx5dv_crypto_login_destroy(mlx5dv_crypto_login_obj *crypto_login) 
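# Illustrative usage sketch (comment only): a minimal DV open flow using the
# declarations above; device is assumed to come from ibv_get_device_list().
#     cdef mlx5dv_context_attr attr
#     attr.flags = MLX5DV_CONTEXT_FLAGS_DEVX
#     attr.comp_mask = 0
#     if mlx5dv_is_supported(device):
#         ctx = mlx5dv_open_device(device, &attr)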
rdma-core-56.1/pyverbs/providers/mlx5/libmlx5.pyx000066400000000000000000000000001477342711600220320ustar00rootroot00000000000000rdma-core-56.1/pyverbs/providers/mlx5/mlx5_enums.pxd000066400000000000000000000265221477342711600225460ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file #cython: language_level=3 cdef extern from 'infiniband/mlx5dv.h': cpdef enum: MLX5_OPCODE_NOP MLX5_OPCODE_SEND_INVAL MLX5_OPCODE_RDMA_WRITE MLX5_OPCODE_RDMA_WRITE_IMM MLX5_OPCODE_SEND MLX5_OPCODE_SEND_IMM MLX5_OPCODE_TSO MLX5_OPCODE_RDMA_READ MLX5_OPCODE_ATOMIC_CS MLX5_OPCODE_ATOMIC_FA MLX5_OPCODE_ATOMIC_MASKED_CS MLX5_OPCODE_ATOMIC_MASKED_FA MLX5_OPCODE_FMR MLX5_OPCODE_LOCAL_INVAL MLX5_OPCODE_CONFIG_CMD MLX5_OPCODE_UMR MLX5_OPCODE_TAG_MATCHING MLX5_OPCODE_MMO cpdef enum: MLX5_WQE_CTRL_CQ_UPDATE MLX5_WQE_CTRL_SOLICITED MLX5_WQE_CTRL_FENCE MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE cpdef enum mlx5dv_context_attr_flags: MLX5DV_CONTEXT_FLAGS_DEVX cpdef enum mlx5dv_context_comp_mask: MLX5DV_CONTEXT_MASK_CQE_COMPRESION = 1 << 0 MLX5DV_CONTEXT_MASK_SWP = 1 << 1 MLX5DV_CONTEXT_MASK_STRIDING_RQ = 1 << 2 MLX5DV_CONTEXT_MASK_TUNNEL_OFFLOADS = 1 << 3 MLX5DV_CONTEXT_MASK_DYN_BFREGS = 1 << 4 MLX5DV_CONTEXT_MASK_CLOCK_INFO_UPDATE = 1 << 5 MLX5DV_CONTEXT_MASK_FLOW_ACTION_FLAGS = 1 << 6 MLX5DV_CONTEXT_MASK_DC_ODP_CAPS = 1 << 7 MLX5DV_CONTEXT_MASK_NUM_LAG_PORTS = 1 << 9 MLX5DV_CONTEXT_MASK_SIGNATURE_OFFLOAD = 1 << 10 MLX5DV_CONTEXT_MASK_DCI_STREAMS = 1 << 11 MLX5DV_CONTEXT_MASK_WR_MEMCPY_LENGTH = 1 << 12 MLX5DV_CONTEXT_MASK_CRYPTO_OFFLOAD = 1 << 13 MLX5DV_CONTEXT_MASK_MAX_DC_RD_ATOM = 1 << 14 MLX5DV_CONTEXT_MASK_OOO_RECV_WRS = 1 << 16 cpdef enum mlx5dv_context_flags: MLX5DV_CONTEXT_FLAGS_CQE_V1 = 1 << 0 MLX5DV_CONTEXT_FLAGS_MPW_ALLOWED = 1 << 2 MLX5DV_CONTEXT_FLAGS_ENHANCED_MPW = 1 << 3 MLX5DV_CONTEXT_FLAGS_CQE_128B_COMP = 1 << 4 MLX5DV_CONTEXT_FLAGS_CQE_128B_PAD = 1 << 5 MLX5DV_CONTEXT_FLAGS_PACKET_BASED_CREDIT_MODE = 1 << 6 MLX5DV_CONTEXT_FLAGS_REAL_TIME_TS = 1 << 7 cpdef enum mlx5dv_sw_parsing_offloads: MLX5DV_SW_PARSING = 1 << 0 MLX5DV_SW_PARSING_CSUM = 1 << 1 MLX5DV_SW_PARSING_LSO = 1 << 2 cpdef enum mlx5dv_cqe_comp_res_format: MLX5DV_CQE_RES_FORMAT_HASH = 1 << 0 MLX5DV_CQE_RES_FORMAT_CSUM = 1 << 1 MLX5DV_CQE_RES_FORMAT_CSUM_STRIDX = 1 << 2 cpdef enum mlx5dv_sched_elem_attr_flags: MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE = 1 << 0 MLX5DV_SCHED_ELEM_ATTR_FLAGS_MAX_AVG_BW = 1 << 1 cpdef enum mlx5dv_tunnel_offloads: MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_VXLAN = 1 << 0 MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GRE = 1 << 1 MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GENEVE = 1 << 2 MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_CW_MPLS_OVER_GRE = 1 << 3 MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_CW_MPLS_OVER_UDP = 1 << 4 cpdef enum mlx5dv_flow_action_cap_flags: MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM = 1 << 0 MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM_REQ_METADATA = 1 << 1 MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM_SPI_STEERING = 1 << 2 MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM_FULL_OFFLOAD = 1 << 3 MLX5DV_FLOW_ACTION_FLAGS_ESP_AES_GCM_TX_IV_IS_ESN = 1 << 4 cpdef enum mlx5dv_qp_init_attr_mask: MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS = 1 << 0 MLX5DV_QP_INIT_ATTR_MASK_DC = 1 << 1 MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS = 1 << 2 MLX5DV_QP_INIT_ATTR_MASK_DCI_STREAMS = 1 << 3 cpdef enum mlx5dv_qp_create_flags: MLX5DV_QP_CREATE_TUNNEL_OFFLOADS = 1 << 0 MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_UC = 1 << 1 MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_MC = 1 << 2 
MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE = 1 << 3 MLX5DV_QP_CREATE_ALLOW_SCATTER_TO_CQE = 1 << 4 MLX5DV_QP_CREATE_PACKET_BASED_CREDIT_MODE = 1 << 5 MLX5DV_QP_CREATE_SIG_PIPELINING = 1 << 6 MLX5DV_QP_CREATE_OOO_DP = 1 << 7 cpdef enum mlx5dv_dc_type: MLX5DV_DCTYPE_DCT = 1 MLX5DV_DCTYPE_DCI = 2 cpdef enum mlx5dv_mkey_init_attr_flags: MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO MLX5DV_MKEY_INIT_ATTR_FLAGS_REMOTE_INVALIDATE cpdef enum mlx5dv_mkey_err_type: MLX5DV_MKEY_NO_ERR MLX5DV_MKEY_SIG_BLOCK_BAD_GUARD MLX5DV_MKEY_SIG_BLOCK_BAD_REFTAG MLX5DV_MKEY_SIG_BLOCK_BAD_APPTAG cpdef enum mlx5dv_sig_type: MLX5DV_SIG_TYPE_T10DIF MLX5DV_SIG_TYPE_CRC cpdef enum mlx5dv_sig_t10dif_bg_type: MLX5DV_SIG_T10DIF_CRC MLX5DV_SIG_T10DIF_CSUM cpdef enum mlx5dv_sig_t10dif_flags: MLX5DV_SIG_T10DIF_FLAG_REF_REMAP MLX5DV_SIG_T10DIF_FLAG_APP_ESCAPE MLX5DV_SIG_T10DIF_FLAG_APP_REF_ESCAPE cpdef enum mlx5dv_sig_crc_type: MLX5DV_SIG_CRC_TYPE_CRC32 MLX5DV_SIG_CRC_TYPE_CRC32C MLX5DV_SIG_CRC_TYPE_CRC64_XP10 cpdef enum mlx5dv_block_size: MLX5DV_BLOCK_SIZE_512 MLX5DV_BLOCK_SIZE_520 MLX5DV_BLOCK_SIZE_4048 MLX5DV_BLOCK_SIZE_4096 MLX5DV_BLOCK_SIZE_4160 cpdef enum mlx5dv_sig_mask: MLX5DV_SIG_MASK_T10DIF_GUARD MLX5DV_SIG_MASK_T10DIF_APPTAG MLX5DV_SIG_MASK_T10DIF_REFTAG MLX5DV_SIG_MASK_CRC32 MLX5DV_SIG_MASK_CRC32C MLX5DV_SIG_MASK_CRC64_XP10 cpdef enum mlx5dv_sig_block_attr_flags: MLX5DV_SIG_BLOCK_ATTR_FLAG_COPY_MASK cpdef enum mlx5dv_qp_create_send_ops_flags: MLX5DV_QP_EX_WITH_MR_INTERLEAVED = 1 << 0 MLX5DV_QP_EX_WITH_MR_LIST = 1 << 1 MLX5DV_QP_EX_WITH_MKEY_CONFIGURE = 1 << 2 MLX5DV_QP_EX_WITH_RAW_WQE = 1 << 3 MLX5DV_QP_EX_WITH_MEMCPY = 1 << 4 cpdef enum mlx5dv_cq_init_attr_mask: MLX5DV_CQ_INIT_ATTR_MASK_COMPRESSED_CQE = 1 << 0 MLX5DV_CQ_INIT_ATTR_MASK_FLAGS = 1 << 1 MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE = 1 << 2 cpdef enum mlx5dv_cq_init_attr_flags: MLX5DV_CQ_INIT_ATTR_FLAGS_CQE_PAD = 1 << 0 MLX5DV_CQ_INIT_ATTR_FLAGS_RESERVED = 1 << 1 cpdef enum mlx5dv_flow_action_type: MLX5DV_FLOW_ACTION_DEST_IBV_QP MLX5DV_FLOW_ACTION_DROP MLX5DV_FLOW_ACTION_IBV_COUNTER MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION MLX5DV_FLOW_ACTION_TAG MLX5DV_FLOW_ACTION_DEST_DEVX MLX5DV_FLOW_ACTION_COUNTERS_DEVX MLX5DV_FLOW_ACTION_DEFAULT_MISS cpdef enum mlx5dv_dr_domain_type: MLX5DV_DR_DOMAIN_TYPE_NIC_RX MLX5DV_DR_DOMAIN_TYPE_NIC_TX MLX5DV_DR_DOMAIN_TYPE_FDB cpdef enum mlx5dv_qp_comp_mask: MLX5DV_QP_MASK_UAR_MMAP_OFFSET MLX5DV_QP_MASK_RAW_QP_HANDLES MLX5DV_QP_MASK_RAW_QP_TIR_ADDR cpdef enum mlx5dv_srq_comp_mask: MLX5DV_SRQ_MASK_SRQN cpdef enum mlx5dv_obj_type: MLX5DV_OBJ_QP MLX5DV_OBJ_CQ MLX5DV_OBJ_SRQ MLX5DV_OBJ_RWQ MLX5DV_OBJ_DM MLX5DV_OBJ_AH MLX5DV_OBJ_PD cpdef enum: MLX5_RCV_DBR MLX5_SND_DBR cpdef enum: MLX5_CQE_OWNER_MASK MLX5_CQE_REQ MLX5_CQE_RESP_WR_IMM MLX5_CQE_RESP_SEND MLX5_CQE_RESP_SEND_IMM MLX5_CQE_RESP_SEND_INV MLX5_CQE_RESIZE_CQ MLX5_CQE_NO_PACKET MLX5_CQE_SIG_ERR MLX5_CQE_REQ_ERR MLX5_CQE_RESP_ERR MLX5_CQE_INVALID cpdef enum: MLX5_SEND_WQE_BB MLX5_SEND_WQE_SHIFT cpdef enum mlx5dv_vfio_context_attr_flags: MLX5DV_VFIO_CTX_FLAGS_INIT_LINK_DOWN cpdef enum mlx5dv_wc_opcode: MLX5DV_WC_UMR MLX5DV_WC_RAW_WQE MLX5DV_WC_MEMCPY cpdef enum mlx5dv_crypto_standard: MLX5DV_CRYPTO_STANDARD_AES_XTS cpdef enum mlx5dv_signature_crypto_order: MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_AFTER_CRYPTO_ON_TX MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_BEFORE_CRYPTO_ON_TX cpdef enum mlx5dv_crypto_login_state: MLX5DV_CRYPTO_LOGIN_STATE_VALID MLX5DV_CRYPTO_LOGIN_STATE_NO_LOGIN MLX5DV_CRYPTO_LOGIN_STATE_INVALID cpdef enum 
mlx5dv_crypto_key_size: MLX5DV_CRYPTO_KEY_SIZE_128 MLX5DV_CRYPTO_KEY_SIZE_256 cpdef enum mlx5dv_crypto_key_purpose: MLX5DV_CRYPTO_KEY_PURPOSE_AES_XTS cpdef enum mlx5dv_dek_state: MLX5DV_DEK_STATE_READY MLX5DV_DEK_STATE_ERROR cpdef enum mlx5dv_dek_init_attr_mask: MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN cpdef enum mlx5dv_crypto_engines_caps: MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_SINGLE_BLOCK MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_MULTI_BLOCK cpdef enum mlx5dv_crypto_wrapped_import_method_caps: MLX5DV_CRYPTO_WRAPPED_IMPORT_METHOD_CAP_AES_XTS cpdef enum mlx5dv_crypto_caps_flags: MLX5DV_CRYPTO_CAPS_CRYPTO MLX5DV_CRYPTO_CAPS_WRAPPED_CRYPTO_OPERATIONAL MLX5DV_CRYPTO_CAPS_WRAPPED_CRYPTO_GOING_TO_COMMISSIONING cpdef enum mlx5dv_dr_action_flags: MLX5DV_DR_ACTION_FLAGS_ROOT_LEVEL cpdef enum mlx5dv_dr_domain_sync_flags: MLX5DV_DR_DOMAIN_SYNC_FLAGS_SW MLX5DV_DR_DOMAIN_SYNC_FLAGS_HW MLX5DV_DR_DOMAIN_SYNC_FLAGS_MEM cpdef enum mlx5dv_dr_matcher_layout_flags: MLX5DV_DR_MATCHER_LAYOUT_RESIZABLE MLX5DV_DR_MATCHER_LAYOUT_NUM_RULE cpdef enum mlx5dv_dr_action_dest_type: MLX5DV_DR_ACTION_DEST MLX5DV_DR_ACTION_DEST_REFORMAT cpdef enum: MLX5DV_UMEM_MASK_DMABUF cdef unsigned long long MLX5DV_RES_TYPE_QP cdef unsigned long long MLX5DV_RES_TYPE_RWQ cdef unsigned long long MLX5DV_RES_TYPE_DBR cdef unsigned long long MLX5DV_RES_TYPE_SRQ cdef unsigned long long MLX5DV_PP_ALLOC_FLAGS_DEDICATED_INDEX cdef unsigned long long MLX5DV_UAR_ALLOC_TYPE_BF cdef unsigned long long MLX5DV_UAR_ALLOC_TYPE_NC cdef unsigned long long MLX5DV_QUERY_PORT_VPORT cdef unsigned long long MLX5DV_QUERY_PORT_VPORT_VHCA_ID cdef unsigned long long MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_RX cdef unsigned long long MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_TX cdef unsigned long long MLX5DV_QUERY_PORT_VPORT_REG_C0 cdef unsigned long long MLX5DV_QUERY_PORT_ESW_OWNER_VHCA_ID cdef extern from 'infiniband/mlx5_user_ioctl_verbs.h': cdef enum mlx5_ib_uapi_flow_table_type: pass cdef extern from 'infiniband/mlx5_api.h': cdef int MLX5DV_FLOW_TABLE_TYPE_RDMA_RX cdef int MLX5DV_FLOW_TABLE_TYPE_RDMA_TX cdef int MLX5DV_FLOW_TABLE_TYPE_NIC_RX cdef int MLX5DV_FLOW_TABLE_TYPE_NIC_TX cdef int MLX5DV_FLOW_TABLE_TYPE_FDB cdef int MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2 cdef int MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL cdef int MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L3_TUNNEL_TO_L2 cdef int MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL cdef int MLX5DV_REG_DMABUF_ACCESS_DATA_DIRECT rdma-core-56.1/pyverbs/providers/mlx5/mlx5_enums.pyx000066400000000000000000000033731477342711600225720ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2024 Nvidia All rights reserved. 
#cython: language_level=3 _MLX5DV_RES_TYPE_QP = MLX5DV_RES_TYPE_QP _MLX5DV_RES_TYPE_RWQ = MLX5DV_RES_TYPE_RWQ _MLX5DV_RES_TYPE_DBR = MLX5DV_RES_TYPE_DBR _MLX5DV_RES_TYPE_SRQ = MLX5DV_RES_TYPE_SRQ _MLX5DV_PP_ALLOC_FLAGS_DEDICATED_INDEX = MLX5DV_PP_ALLOC_FLAGS_DEDICATED_INDEX _MLX5DV_UAR_ALLOC_TYPE_BF = MLX5DV_UAR_ALLOC_TYPE_BF _MLX5DV_UAR_ALLOC_TYPE_NC = MLX5DV_UAR_ALLOC_TYPE_NC MLX5DV_QUERY_PORT_VPORT_ = MLX5DV_QUERY_PORT_VPORT MLX5DV_QUERY_PORT_VPORT_VHCA_ID_ = MLX5DV_QUERY_PORT_VPORT_VHCA_ID MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_RX_ = MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_RX MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_TX_ = MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_TX MLX5DV_QUERY_PORT_VPORT_REG_C0_ = MLX5DV_QUERY_PORT_VPORT_REG_C0 MLX5DV_QUERY_PORT_ESW_OWNER_VHCA_ID_ = MLX5DV_QUERY_PORT_ESW_OWNER_VHCA_ID MLX5DV_FLOW_TABLE_TYPE_RDMA_RX_ = MLX5DV_FLOW_TABLE_TYPE_RDMA_RX MLX5DV_FLOW_TABLE_TYPE_RDMA_TX_ = MLX5DV_FLOW_TABLE_TYPE_RDMA_TX MLX5DV_FLOW_TABLE_TYPE_NIC_RX_ = MLX5DV_FLOW_TABLE_TYPE_NIC_RX MLX5DV_FLOW_TABLE_TYPE_NIC_TX_ = MLX5DV_FLOW_TABLE_TYPE_NIC_TX MLX5DV_FLOW_TABLE_TYPE_FDB_ = MLX5DV_FLOW_TABLE_TYPE_FDB MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2_ = \ MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2 MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL_ = \ MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L3_TUNNEL_TO_L2_ = \ MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L3_TUNNEL_TO_L2 MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL_ = \ MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL MLX5DV_REG_DMABUF_ACCESS_DATA_DIRECT_ = MLX5DV_REG_DMABUF_ACCESS_DATA_DIRECT rdma-core-56.1/pyverbs/providers/mlx5/mlx5_vfio.pxd000066400000000000000000000006541477342711600223600ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia, Inc. All rights reserved. See COPYING file #cython: language_level=3 from pyverbs.providers.mlx5.mlx5dv cimport Mlx5Context cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.base cimport PyverbsObject cdef class Mlx5VfioContext(Mlx5Context): pass cdef class Mlx5VfioAttr(PyverbsObject): cdef dv.mlx5dv_vfio_context_attr attr rdma-core-56.1/pyverbs/providers/mlx5/mlx5_vfio.pyx000066400000000000000000000077021477342711600224060ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia, Inc. All rights reserved. See COPYING file #cython: language_level=3 from cpython.mem cimport PyMem_Malloc, PyMem_Free from libc.string cimport strcpy import weakref from pyverbs.pyverbs_error import PyverbsRDMAError cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.base import PyverbsRDMAErrno from pyverbs.base cimport close_weakrefs from pyverbs.device cimport Context cimport pyverbs.libibverbs as v cdef class Mlx5VfioAttr(PyverbsObject): """ Mlx5VfioAttr class, represents mlx5dv_vfio_context_attr C struct. 
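    Example (illustrative sketch; the PCI address is a placeholder):
        attr = Mlx5VfioAttr(pci_name='0000:3b:00.0')
        ctx = Mlx5VfioContext(attr)
        events_fd = ctx.get_events_fd()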
""" def __init__(self, pci_name, flags=0, comp_mask=0): self.pci_name = pci_name self.attr.flags = flags self.attr.comp_mask = comp_mask def __dealloc__(self): if self.attr.pci_name != NULL: PyMem_Free(self.attr.pci_name) self.attr.pci_name = NULL @property def flags(self): return self.attr.flags @flags.setter def flags(self, val): self.attr.flags = val @property def comp_mask(self): return self.attr.comp_mask @comp_mask.setter def comp_mask(self, val): self.attr.comp_mask = val @property def pci_name(self): return self.attr.pci_name[:] @pci_name.setter def pci_name(self, val): if self.attr.pci_name != NULL: PyMem_Free(self.attr.pci_name) pci_name_bytes = val.encode() self.attr.pci_name = PyMem_Malloc(len(pci_name_bytes)) strcpy(self.attr.pci_name, pci_name_bytes) cdef class Mlx5VfioContext(Mlx5Context): """ Mlx5VfioContext class is used to easily initialize and open a context over a mlx5 vfio device. It is initialized based on the passed mlx5 vfio attributes (Mlx5VfioAttr), by getting the relevant vfio device and opening it (creating a context). """ def __init__(self, Mlx5VfioAttr attr): super(Context, self).__init__() cdef v.ibv_device **dev_list self.name = attr.pci_name self.pds = weakref.WeakSet() self.devx_umems = weakref.WeakSet() self.devx_objs = weakref.WeakSet() self.uars = weakref.WeakSet() self.devx_eqs = weakref.WeakSet() dev_list = dv.mlx5dv_get_vfio_device_list(&attr.attr) if dev_list == NULL: raise PyverbsRDMAErrno('Failed to get VFIO device list') self.device = dev_list[0] if self.device == NULL: raise PyverbsRDMAError('Failed to get VFIO device') try: self.context = v.ibv_open_device(self.device) if self.context == NULL: raise PyverbsRDMAErrno('Failed to open mlx5 VFIO device ' f'({self.device.name.decode()})') finally: v.ibv_free_device_list(dev_list) def get_events_fd(self): """ Gets the file descriptor to manage driver events. :return: The file descriptor to be used for managing driver events. """ fd = dv.mlx5dv_vfio_get_events_fd(self.context) if fd < 0: raise PyverbsRDMAError('Failed to get VFIO events FD', -fd) return fd def process_events(self): """ Process events on the vfio device. This method should run from application thread to maintain device events. :return: None """ rc = dv.mlx5dv_vfio_process_events(self.context) if rc: raise PyverbsRDMAError('VFIO process events failed', rc) cpdef close(self): if self.context != NULL: if self.logger: self.logger.debug('Closing Mlx5VfioContext') close_weakrefs([self.pds, self.devx_objs, self.devx_umems, self.uars, self.devx_eqs]) rc = v.ibv_close_device(self.context) if rc != 0: raise PyverbsRDMAErrno(f'Failed to close device {self.name}') self.context = NULL rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv.pxd000066400000000000000000000050761477342711600216720ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. 
See COPYING file #cython: language_level=3 from pyverbs.base cimport PyverbsObject, PyverbsCM cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.device cimport Context from pyverbs.qp cimport QP, QPEx from pyverbs.cq cimport CQEX cdef class Mlx5Context(Context): cdef object devx_umems cdef object devx_objs cdef object devx_eqs cdef add_ref(self, obj) cpdef close(self) cdef class Mlx5DVContextAttr(PyverbsObject): cdef dv.mlx5dv_context_attr attr cdef class Mlx5DVContext(PyverbsObject): cdef dv.mlx5dv_context dv cdef class Mlx5DVPortAttr(PyverbsObject): cdef dv.mlx5dv_port attr cdef class Mlx5DCIStreamInitAttr(PyverbsObject): cdef dv.mlx5dv_dci_streams dci_streams cdef class Mlx5DVDCInitAttr(PyverbsObject): cdef dv.mlx5dv_dc_init_attr attr cdef class Mlx5DVQPInitAttr(PyverbsObject): cdef dv.mlx5dv_qp_init_attr attr cdef class Mlx5QP(QPEx): cdef object dc_type cdef class Mlx5DVCQInitAttr(PyverbsObject): cdef dv.mlx5dv_cq_init_attr attr cdef class Mlx5CQ(CQEX): pass cdef class Mlx5VAR(PyverbsObject): cdef dv.mlx5dv_var *var cdef object context cpdef close(self) cdef class Mlx5PP(PyverbsObject): cdef dv.mlx5dv_pp *pp cdef object context cpdef close(self) cdef class Mlx5UAR(PyverbsObject): cdef dv.mlx5dv_devx_uar *uar cdef object context cpdef close(self) cdef class Mlx5DmOpAddr(PyverbsCM): cdef void *addr @staticmethod cdef void _cpy(void *dst, void *src, int length) cdef class WqeSeg(PyverbsCM): cdef void *segment cpdef _copy_to_buffer(self, addr) cdef class WqeCtrlSeg(WqeSeg): pass cdef class WqeDataSeg(WqeSeg): pass cdef class Wqe(PyverbsCM): cdef void *addr cdef int is_user_addr cdef object segments cdef class Mlx5UMEM(PyverbsCM): cdef dv.mlx5dv_devx_umem *umem cdef Context context cdef void *addr cdef object is_user_addr cdef class Mlx5DevxObj(PyverbsCM): cdef dv.mlx5dv_devx_obj *obj cdef Context context cdef object out_view cdef object flow_counter_actions cdef object dest_tir_actions cdef add_ref(self, obj) cdef class Mlx5Cqe64(PyverbsObject): cdef dv.mlx5_cqe64 *cqe cdef class Mlx5VfioAttr(PyverbsObject): cdef dv.mlx5dv_vfio_context_attr attr cdef class Mlx5DevxMsiVector(PyverbsCM): cdef dv.mlx5dv_devx_msi_vector *msi_vector cdef class Mlx5DevxEq(PyverbsCM): cdef dv.mlx5dv_devx_eq *eq cdef Context context cdef object out_view rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv.pyx000066400000000000000000002157761477342711600217310ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. 
See COPYING file from libc.stdint cimport uintptr_t, uint8_t, uint16_t, uint32_t, uint64_t from libc.string cimport memcpy, memset from libc.stdlib cimport calloc, free from posix.mman cimport munmap import logging import weakref from pyverbs.providers.mlx5.mlx5dv_mkey cimport Mlx5MrInterleaved, Mlx5Mkey, \ Mlx5MkeyConfAttr, Mlx5SigBlockAttr from pyverbs.providers.mlx5.mlx5dv_crypto cimport Mlx5CryptoLoginAttr, Mlx5CryptoAttr from pyverbs.pyverbs_error import PyverbsUserError, PyverbsRDMAError, PyverbsError from pyverbs.providers.mlx5.dr_action cimport DrActionFlowCounter, DrActionDestTir from pyverbs.providers.mlx5.mlx5dv_sched cimport Mlx5dvSchedLeaf cimport pyverbs.providers.mlx5.mlx5_enums as dve cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.mem_alloc import posix_memalign from pyverbs.qp cimport QPInitAttrEx, QPEx from pyverbs.base import PyverbsRDMAErrno from pyverbs.base cimport close_weakrefs from pyverbs.wr cimport copy_sg_array cimport pyverbs.libibverbs_enums as e from pyverbs.cq cimport CqInitAttrEx cimport pyverbs.libibverbs as v from pyverbs.device cimport DM from pyverbs.addr cimport AH from pyverbs.pd cimport PD cdef extern from 'endian.h': unsigned long htobe16(unsigned long host_16bits) unsigned long be16toh(unsigned long network_16bits) unsigned long htobe32(unsigned long host_32bits) unsigned long be32toh(unsigned long network_32bits) unsigned long htobe64(unsigned long host_64bits) unsigned long be64toh(unsigned long network_64bits) cdef char* _prepare_devx_inbox(in_bytes): """ Auxiliary function that allocates inboxes for DevX commands, and fills them the bytes input. The allocated box must be freed when it's no longer needed. :param in_bytes: Stream of bytes of the command's input :return: The C allocated inbox """ cdef char *in_bytes_c = in_bytes cdef char* in_mailbox = calloc(1, len(in_bytes)) if in_mailbox == NULL: raise MemoryError('Failed to allocate memory') memcpy(in_mailbox, in_bytes_c, len(in_bytes)) return in_mailbox cdef char* _prepare_devx_outbox(outlen): """ Auxiliary function that allocates the outboxes for DevX commands. The allocated box must be freed when it's no longer needed. :param outlen: Output command's length in bytes :return: The C allocated outbox """ cdef char* out_mailbox = calloc(1, outlen) if out_mailbox == NULL: raise MemoryError('Failed to allocate memory') return out_mailbox cdef uintptr_t copy_data_to_addr(uintptr_t addr, data): """ Auxiliary function that copies data to memory at provided address. 
:param addr: Address to copy the data to :param data: Data to copy :return: The incremented address to the end of the written data """ cdef bytes py_bytes = bytes(data) cdef char *tmp = py_bytes memcpy(addr, tmp, len(data)) return addr + len(data) cdef class Mlx5DVPortAttr(PyverbsObject): """ Represents mlx5dv_port struct, which exposes mlx5-specific capabilities, reported by mlx5dv_query_port() """ def __init__(self): super().__init__() def __str__(self): print_format = '{:20}: {:<20}\n' return print_format.format('flags', hex(self.attr.flags)) @property def flags(self): return self.attr.flags @property def vport(self): return self.attr.vport @property def vport_vhca_id(self): return self.attr.vport_vhca_id @property def esw_owner_vhca_id(self): return self.attr.esw_owner_vhca_id @property def vport_steering_icm_rx(self): return self.attr.vport_steering_icm_rx @property def vport_steering_icm_tx(self): return self.attr.vport_steering_icm_tx @property def reg_c0_value(self): return self.attr.reg_c0.value @property def reg_c0_mask(self): return self.attr.reg_c0.mask cdef class Mlx5DVContextAttr(PyverbsObject): """ Represent mlx5dv_context_attr struct. This class is used to open an mlx5 device. """ def __init__(self, flags=0, comp_mask=0): super().__init__() self.attr.flags = flags self.attr.comp_mask = comp_mask def __str__(self): print_format = '{:20}: {:<20}\n' return print_format.format('flags', self.attr.flags) +\ print_format.format('comp_mask', self.attr.comp_mask) @property def flags(self): return self.attr.flags @flags.setter def flags(self, val): self.attr.flags = val @property def comp_mask(self): return self.attr.comp_mask @comp_mask.setter def comp_mask(self, val): self.attr.comp_mask = val cdef class Mlx5DevxObj(PyverbsCM): """ Represents mlx5dv_devx_obj C struct. """ def __init__(self, Context context, in_, outlen): """ Creates a DevX object. If the object was successfully created, the command's output would be stored as a memoryview in self.out_view. :param in_: Bytes of the obj_create command's input data provided in a device specification format. (Stream of bytes or __bytes__ is implemented) :param outlen: Expected output length in bytes """ super().__init__() in_bytes = bytes(in_) cdef char *in_mailbox = _prepare_devx_inbox(in_bytes) cdef char *out_mailbox = _prepare_devx_outbox(outlen) self.obj = dv.mlx5dv_devx_obj_create(context.context, in_mailbox, len(in_bytes), out_mailbox, outlen) try: if self.obj == NULL: raise PyverbsRDMAErrno('Failed to create DevX object') self.out_view = memoryview(out_mailbox[:outlen]) status = hex(self.out_view[0]) syndrome = self.out_view[4:8].hex() if status != hex(0): raise PyverbsRDMAError('Failed to create DevX object with status' f'({status}) and syndrome (0x{syndrome})') finally: free(in_mailbox) free(out_mailbox) self.context = context self.context.add_ref(self) self.flow_counter_actions = weakref.WeakSet() self.dest_tir_actions = weakref.WeakSet() def query(self, in_, outlen): """ Queries the DevX object. :param in_: Bytes of the obj_query command's input data provided in a device specification format. 
                   (Stream of bytes or __bytes__ is implemented)
        :param outlen: Expected output length in bytes
        :return: Bytes of the command's output
        """
        in_bytes = bytes(in_)
        cdef char *in_mailbox = _prepare_devx_inbox(in_bytes)
        cdef char *out_mailbox = _prepare_devx_outbox(outlen)
        rc = dv.mlx5dv_devx_obj_query(self.obj, in_mailbox, len(in_bytes),
                                      out_mailbox, outlen)
        try:
            if rc:
                raise PyverbsRDMAError('Failed to query DevX object', rc)
            out = out_mailbox[:outlen]
        finally:
            free(in_mailbox)
            free(out_mailbox)
        return out

    def modify(self, in_, outlen):
        """
        Modifies the DevX object.
        :param in_: Bytes of the obj_modify command's input data provided in
                    a device specification format.
                    (Stream of bytes or __bytes__ is implemented)
        :param outlen: Expected output length in bytes
        :return: Bytes of the command's output
        """
        in_bytes = bytes(in_)
        cdef char *in_mailbox = _prepare_devx_inbox(in_bytes)
        cdef char *out_mailbox = _prepare_devx_outbox(outlen)
        rc = dv.mlx5dv_devx_obj_modify(self.obj, in_mailbox, len(in_bytes),
                                       out_mailbox, outlen)
        try:
            if rc:
                raise PyverbsRDMAError('Failed to modify DevX object', rc)
            out = out_mailbox[:outlen]
        finally:
            free(in_mailbox)
            free(out_mailbox)
        return out

    cdef add_ref(self, obj):
        if isinstance(obj, DrActionFlowCounter):
            self.flow_counter_actions.add(obj)
        elif isinstance(obj, DrActionDestTir):
            self.dest_tir_actions.add(obj)
        else:
            raise PyverbsError('Unrecognized object type')

    @property
    def out_view(self):
        return self.out_view

    @property
    def obj(self):
        return <uintptr_t>self.obj

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        if self.obj != NULL:
            if self.logger:
                self.logger.debug('Closing Mlx5DevxObj')
            close_weakrefs([self.flow_counter_actions, self.dest_tir_actions])
            rc = dv.mlx5dv_devx_obj_destroy(self.obj)
            if rc:
                raise PyverbsRDMAError('Failed to destroy a DevX object', rc)
            self.obj = NULL
            self.context = None


cdef class Mlx5Context(Context):
    """
    Represents an mlx5 context, which extends Context.
    """
    def __init__(self, Mlx5DVContextAttr attr not None, name=''):
        """
        Open an mlx5 device using the given attributes.
        :param name: The RDMA device's name (used by parent class)
        :param attr: mlx5-specific device attributes
        :return: None
        """
        super().__init__(name=name, attr=attr)
        if not dv.mlx5dv_is_supported(self.device):
            raise PyverbsUserError('This is not an MLX5 device')
        self.context = dv.mlx5dv_open_device(self.device, &attr.attr)
        if self.context == NULL:
            raise PyverbsRDMAErrno('Failed to open mlx5 context on {dev}'
                                   .format(dev=self.name))
        self.devx_umems = weakref.WeakSet()
        self.devx_objs = weakref.WeakSet()
        self.devx_eqs = weakref.WeakSet()

    def query_mlx5_device(self, comp_mask=-1):
        """
        Queries the provider for device-specific attributes.
        :param comp_mask: Which attributes to query. Default value is -1. If
                          not changed by user, pyverbs will pass a bitwise OR
                          of all available enum entries.
        :return: A Mlx5DVContext containing the attributes.
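        Example (a minimal sketch, assuming 'ctx' is an opened Mlx5Context):
            dv_attr = ctx.query_mlx5_device()
            print(dv_attr)  # dumps the version, flags and queried caps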
""" dv_attr = Mlx5DVContext() if comp_mask == -1: dv_attr.comp_mask = \ dve.MLX5DV_CONTEXT_MASK_CQE_COMPRESION |\ dve.MLX5DV_CONTEXT_MASK_SWP |\ dve.MLX5DV_CONTEXT_MASK_STRIDING_RQ |\ dve.MLX5DV_CONTEXT_MASK_TUNNEL_OFFLOADS |\ dve.MLX5DV_CONTEXT_MASK_DYN_BFREGS |\ dve.MLX5DV_CONTEXT_MASK_CLOCK_INFO_UPDATE |\ dve.MLX5DV_CONTEXT_MASK_DC_ODP_CAPS |\ dve.MLX5DV_CONTEXT_MASK_FLOW_ACTION_FLAGS |\ dve.MLX5DV_CONTEXT_MASK_DCI_STREAMS |\ dve.MLX5DV_CONTEXT_MASK_WR_MEMCPY_LENGTH |\ dve.MLX5DV_CONTEXT_MASK_CRYPTO_OFFLOAD |\ dve.MLX5DV_CONTEXT_MASK_MAX_DC_RD_ATOM |\ dve.MLX5DV_CONTEXT_MASK_OOO_RECV_WRS else: dv_attr.comp_mask = comp_mask rc = dv.mlx5dv_query_device(self.context, &dv_attr.dv) if rc != 0: raise PyverbsRDMAError(f'Failed to query mlx5 device {self.name}.', rc) return dv_attr @staticmethod def query_mlx5_port(Context ctx, port_num): dv_attr = Mlx5DVPortAttr() rc = dv.mlx5dv_query_port(ctx.context, port_num, &dv_attr.attr) if rc != 0: raise PyverbsRDMAError(f'Failed to query dv port mlx5 {ctx.name} port {port_num}.', rc) return dv_attr @staticmethod def reserved_qpn_alloc(Context ctx): """ Allocate a reserved QP number from firmware. :param ctx: The device context to issue the action on. :return: The reserved QP number. """ cdef uint32_t qpn rc = dv.mlx5dv_reserved_qpn_alloc(ctx.context, &qpn) if rc != 0: raise PyverbsRDMAError('Failed to alloc reserved QP number.', rc) return qpn @staticmethod def reserved_qpn_dealloc(Context ctx, qpn): """ Release the reserved QP number to firmware. :param ctx: The device context to issue the action on. :param qpn: The QP number to be deallocated. """ rc = dv.mlx5dv_reserved_qpn_dealloc(ctx.context, qpn) if rc != 0: raise PyverbsRDMAError(f'Failed to dealloc QP number {qpn}.', rc) @staticmethod def crypto_login(Context ctx, Mlx5CryptoLoginAttr login_attr): """ Creates a crypto login session :param ctx: The device context to issue the action on. :param login_attr: Mlx5CryptoLoginAttr object which contains the credential to login with and the import KEK to be used for secured communications. """ rc = dv.mlx5dv_crypto_login(ctx.context, &login_attr.mlx5dv_crypto_login_attr) if rc != 0: raise PyverbsRDMAError(f'Failed to create crypto login session.', rc) @staticmethod def query_login_state(Context ctx): """ Queries the state of the current crypto login session. :param ctx: The device context to issue the action on. :return: The login state. """ cdef dv.mlx5dv_crypto_login_state state rc = dv.mlx5dv_crypto_login_query_state(ctx.context, &state) if rc != 0: raise PyverbsRDMAError(f'Failed to query the crypto login session state.', rc) return state @staticmethod def crypto_logout(Context ctx): """ Logs out from the current crypto login session. :param ctx: The device context to issue the action on. """ rc = dv.mlx5dv_crypto_logout(ctx.context) if rc != 0: raise PyverbsRDMAError(f'Failed to logout from crypto login session.', rc) def devx_general_cmd(self, in_, outlen): """ Executes a DevX general command according to the input mailbox. :param in_: Bytes of the general command's input data provided in a device specification format. 
(Stream of bytes or __bytes__ is implemented) :param outlen: Expected output length in bytes :return out: Bytes of the general command's output data provided in a device specification format """ in_bytes = bytes(in_) cdef char *in_mailbox = _prepare_devx_inbox(in_bytes) cdef char *out_mailbox = _prepare_devx_outbox(outlen) rc = dv.mlx5dv_devx_general_cmd(self.context, in_mailbox, len(in_bytes), out_mailbox, outlen) try: if rc: raise PyverbsRDMAError("DevX general command failed", rc) out = out_mailbox[:outlen] finally: free(in_mailbox) free(out_mailbox) return out @staticmethod def device_timestamp_to_ns(Context ctx, device_timestamp): """ Convert device timestamp from HCA core clock units to the corresponding nanosecond units. The function uses mlx5dv_get_clock_info to get the device clock information. :param ctx: The device context to issue the action on. :param device_timestamp: The device timestamp to convert. :return: Timestamp in nanoseconds """ cdef dv.mlx5dv_clock_info *clock_info clock_info = calloc(1, sizeof(dv.mlx5dv_clock_info)) rc = dv.mlx5dv_get_clock_info(ctx.context, clock_info) if rc != 0: raise PyverbsRDMAError(f'Failed to get the clock info', rc) ns_time = dv.mlx5dv_ts_to_ns(clock_info, device_timestamp) free(clock_info) return ns_time def devx_query_eqn(self, vector): """ Query EQN for a given vector id. :param vector: Completion vector number :return: The device EQ number which relates to the given input vector """ cdef uint32_t eqn rc = dv.mlx5dv_devx_query_eqn(self.context, vector, &eqn) if rc: raise PyverbsRDMAError('Failed to query EQN', rc) return eqn def get_data_direct_sysfs_path(self, length=512): cdef char *buffer = calloc(1, length) if buffer == NULL: raise MemoryError('Failed to allocate memory') rc = dv.mlx5dv_get_data_direct_sysfs_path(self.context, buffer, length) if rc: free(buffer) raise PyverbsRDMAError('Get data direct sysfs path failed.', rc) buffer_str = str(buffer.decode()) free(buffer) return buffer_str cdef add_ref(self, obj): try: Context.add_ref(self, obj) except PyverbsError: if isinstance(obj, Mlx5UMEM): self.devx_umems.add(obj) elif isinstance(obj, Mlx5DevxObj): self.devx_objs.add(obj) elif isinstance(obj, Mlx5DevxEq): self.devx_eqs.add(obj) else: raise PyverbsError('Unrecognized object type') def __dealloc__(self): self.close() cpdef close(self): if self.context != NULL: close_weakrefs([self.pps, self.devx_objs, self.devx_umems, self.devx_eqs]) super(Mlx5Context, self).close() cdef class Mlx5DVContext(PyverbsObject): """ Represents mlx5dv_context struct, which exposes mlx5-specific capabilities, reported by mlx5dv_query_device. 
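    Instances are typically obtained from Mlx5Context.query_mlx5_device()
    rather than constructed directly. A minimal sketch, assuming 'ctx' is an
    opened Mlx5Context:
        caps = ctx.query_mlx5_device()
        if caps.flags & dve.MLX5DV_CONTEXT_FLAGS_CQE_V1:
            pass  # the device reports CQE version 1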
""" @property def version(self): return self.dv.version @property def flags(self): return self.dv.flags @property def comp_mask(self): return self.dv.comp_mask @comp_mask.setter def comp_mask(self, val): self.dv.comp_mask = val @property def cqe_comp_caps(self): return self.dv.cqe_comp_caps @property def sw_parsing_caps(self): return self.dv.sw_parsing_caps @property def striding_rq_caps(self): return self.dv.striding_rq_caps @property def tunnel_offload_caps(self): return self.dv.tunnel_offloads_caps @property def max_dynamic_bfregs(self): return self.dv.max_dynamic_bfregs @property def max_clock_info_update_nsec(self): return self.dv.max_clock_info_update_nsec @property def flow_action_flags(self): return self.dv.flow_action_flags @property def dc_odp_caps(self): return self.dv.dc_odp_caps @property def crypto_caps(self): return self.dv.crypto_caps @property def num_lag_ports(self): return self.dv.num_lag_ports @property def dci_streams_caps(self): return self.dv.dci_streams_caps @property def max_wr_memcpy_length(self): return self.dv.max_wr_memcpy_length @property def max_dc_rd_atom(self): return self.dv.max_dc_rd_atom @property def max_dc_init_rd_atom(self): return self.dv.max_dc_init_rd_atom @property def ooo_recv_wrs_caps(self): return self.dv.ooo_recv_wrs_caps def __str__(self): print_format = '{:20}: {:<20}\n' ident_format = ' {:20}: {:<20}\n' cqe = 'CQE compression caps:\n' +\ ident_format.format('max num', self.dv.cqe_comp_caps.max_num) +\ ident_format.format('supported formats', cqe_comp_to_str(self.dv.cqe_comp_caps.supported_format)) swp = 'SW parsing caps:\n' +\ ident_format.format('SW parsing offloads', swp_to_str(self.dv.sw_parsing_caps.sw_parsing_offloads)) +\ ident_format.format('supported QP types', qpts_to_str(self.dv.sw_parsing_caps.supported_qpts)) strd = 'Striding RQ caps:\n' +\ ident_format.format('min single stride log num of bytes', self.dv.striding_rq_caps.min_single_stride_log_num_of_bytes) +\ ident_format.format('max single stride log num of bytes', self.dv.striding_rq_caps.max_single_stride_log_num_of_bytes) +\ ident_format.format('min single wqe log num of strides', self.dv.striding_rq_caps.min_single_wqe_log_num_of_strides) +\ ident_format.format('max single wqe log num of strides', self.dv.striding_rq_caps.max_single_wqe_log_num_of_strides) +\ ident_format.format('supported QP types', qpts_to_str(self.dv.striding_rq_caps.supported_qpts)) stream = 'DCI stream caps:\n' +\ ident_format.format('max log num concurrent streams', self.dv.dci_streams_caps.max_log_num_concurent) +\ ident_format.format('max log num errored streams', self.dv.dci_streams_caps.max_log_num_errored) return print_format.format('Version', self.dv.version) +\ print_format.format('Flags', context_flags_to_str(self.dv.flags)) +\ print_format.format('comp mask', context_comp_mask_to_str(self.dv.comp_mask)) +\ cqe + swp + strd + stream +\ print_format.format('Tunnel offloads caps', tunnel_offloads_to_str(self.dv.tunnel_offloads_caps)) +\ print_format.format('Max dynamic BF registers', self.dv.max_dynamic_bfregs) +\ print_format.format('Max clock info update [nsec]', self.dv.max_clock_info_update_nsec) +\ print_format.format('Flow action flags', self.dv.flow_action_flags) +\ print_format.format('DC ODP caps', self.dv.dc_odp_caps) +\ print_format.format('Num LAG ports', self.dv.num_lag_ports) +\ print_format.format('Max WR memcpy length', self.dv.max_wr_memcpy_length) +\ print_format.format('Max DC Read Atomic', self.dv.max_dc_rd_atomic) +\ print_format.format('Max DC Init Read Atomic', 
                                   self.dv.max_dc_init_rd_atom)


cdef class Mlx5DCIStreamInitAttr(PyverbsObject):
    """
    Represents the mlx5dv_dci_streams struct, which defines initial
    attributes for DC QP creation.
    """
    def __init__(self, log_num_concurent=0, log_num_errored=0):
        """
        Initializes an Mlx5DCIStreamInitAttr object with the given DC
        log_num_concurent and log_num_errored.
        :param log_num_concurent: Log (base 2) of the number of concurrent
                                  DCI stream channels.
        :param log_num_errored: Log (base 2) of the number of errored DCI
                                stream channels before the DCI moves to an
                                error state.
        :return: An initialized object
        """
        super().__init__()
        self.dci_streams.log_num_concurent = log_num_concurent
        self.dci_streams.log_num_errored = log_num_errored

    def __str__(self):
        print_format = '{:20}: {:<20}\n'
        return print_format.format('DCI Stream log_num_concurent',
                                   self.dci_streams.log_num_concurent) +\
               print_format.format('DCI Stream log_num_errored',
                                   self.dci_streams.log_num_errored)

    @property
    def log_num_concurent(self):
        return self.dci_streams.log_num_concurent

    @log_num_concurent.setter
    def log_num_concurent(self, val):
        self.dci_streams.log_num_concurent = val

    @property
    def log_num_errored(self):
        return self.dci_streams.log_num_errored

    @log_num_errored.setter
    def log_num_errored(self, val):
        self.dci_streams.log_num_errored = val


cdef class Mlx5DVDCInitAttr(PyverbsObject):
    """
    Represents mlx5dv_dc_init_attr struct, which defines initial attributes
    for DC QP creation.
    """
    def __init__(self, dc_type=dve.MLX5DV_DCTYPE_DCI, dct_access_key=0,
                 dci_streams=None):
        """
        Initializes an Mlx5DVDCInitAttr object with the given DC type and DCT
        access key.
        :param dc_type: Which DC QP to create (DCI/DCT).
        :param dct_access_key: Access key to be used by the DCT
        :param dci_streams: Mlx5DCIStreamInitAttr
        :return: An initialized object
        """
        super().__init__()
        self.attr.dc_type = dc_type
        self.attr.dct_access_key = dct_access_key
        if dci_streams is not None:
            self.attr.dci_streams.log_num_concurent = dci_streams.log_num_concurent
            self.attr.dci_streams.log_num_errored = dci_streams.log_num_errored

    def __str__(self):
        print_format = '{:20}: {:<20}\n'
        return print_format.format('DC type', dc_type_to_str(self.attr.dc_type)) +\
               print_format.format('DCT access key', self.attr.dct_access_key) +\
               print_format.format('DCI Stream log_num_concurent',
                                   self.attr.dci_streams.log_num_concurent) +\
               print_format.format('DCI Stream log_num_errored',
                                   self.attr.dci_streams.log_num_errored)

    @property
    def dc_type(self):
        return self.attr.dc_type

    @dc_type.setter
    def dc_type(self, val):
        self.attr.dc_type = val

    @property
    def dct_access_key(self):
        return self.attr.dct_access_key

    @dct_access_key.setter
    def dct_access_key(self, val):
        self.attr.dct_access_key = val

    @property
    def dci_streams(self):
        return self.attr.dci_streams

    @dci_streams.setter
    def dci_streams(self, val):
        self.attr.dci_streams = val


cdef class Mlx5DVQPInitAttr(PyverbsObject):
    """
    Represents mlx5dv_qp_init_attr struct, initial attributes used for mlx5
    QP creation.
    """
    def __init__(self, comp_mask=0, create_flags=0,
                 Mlx5DVDCInitAttr dc_init_attr=None, send_ops_flags=0):
        """
        Initializes an Mlx5DVQPInitAttr object with the given user data.
:param comp_mask: A bitmask specifying which fields are valid :param create_flags: A bitwise OR of mlx5dv_qp_create_flags :param dc_init_attr: Mlx5DVDCInitAttr object :param send_ops_flags: A bitwise OR of mlx5dv_qp_create_send_ops_flags :return: An initialized Mlx5DVQPInitAttr object """ super().__init__() self.attr.comp_mask = comp_mask self.attr.create_flags = create_flags self.attr.send_ops_flags = send_ops_flags if dc_init_attr is not None: self.attr.dc_init_attr.dc_type = dc_init_attr.dc_type if comp_mask & dve.MLX5DV_QP_INIT_ATTR_MASK_DCI_STREAMS: self.attr.dc_init_attr.dci_streams = dc_init_attr.dci_streams else: self.attr.dc_init_attr.dct_access_key = dc_init_attr.dct_access_key def __str__(self): print_format = '{:20}: {:<20}\n' return print_format.format('Comp mask', qp_comp_mask_to_str(self.attr.comp_mask)) +\ print_format.format('Create flags', qp_create_flags_to_str(self.attr.create_flags)) +\ 'DC init attr:\n' +\ print_format.format(' DC type', dc_type_to_str(self.attr.dc_init_attr.dc_type)) +\ print_format.format(' DCI Stream log_num_concurent', self.attr.dc_init_attr.dci_streams.log_num_concurent) +\ print_format.format(' DCI Stream log_num_errored', self.attr.dc_init_attr.dci_streams.log_num_errored) +\ print_format.format(' DCT access key', self.attr.dc_init_attr.dct_access_key) +\ print_format.format('Send ops flags', send_ops_flags_to_str(self.attr.send_ops_flags)) @property def comp_mask(self): return self.attr.comp_mask @comp_mask.setter def comp_mask(self, val): self.attr.comp_mask = val @property def create_flags(self): return self.attr.create_flags @create_flags.setter def create_flags(self, val): self.attr.create_flags = val @property def send_ops_flags(self): return self.attr.send_ops_flags @send_ops_flags.setter def send_ops_flags(self, val): self.attr.send_ops_flags = val @property def dc_type(self): return self.attr.dc_init_attr.dc_type @dc_type.setter def dc_type(self, val): self.attr.dc_init_attr.dc_type = val @property def dct_access_key(self): return self.attr.dc_init_attr.dct_access_key @dct_access_key.setter def dct_access_key(self, val): self.attr.dc_init_attr.dct_access_key = val @property def dci_streams(self): return self.attr.dc_init_attr.dci_streams @dci_streams.setter def dci_streams(self, val): self.attr.dc_init_attr.dci_streams=val cdef copy_mr_interleaved_array(dv.mlx5dv_mr_interleaved *mr_interleaved_p, mr_interleaved_lst): """ Build C array from the C objects of Mlx5MrInterleaved list and set the mr_interleaved_p to this array address. The mr_interleaved_p should be allocated with enough size for those objects. :param mr_interleaved_p: Pointer to array of mlx5dv_mr_interleaved. :param mr_interleaved_lst: List of Mlx5MrInterleaved. """ num_interleaved = len(mr_interleaved_lst) cdef dv.mlx5dv_mr_interleaved *tmp for i in range(num_interleaved): tmp = &(mr_interleaved_lst[i]).mlx5dv_mr_interleaved memcpy(mr_interleaved_p, tmp, sizeof(dv.mlx5dv_mr_interleaved)) mr_interleaved_p += 1 cdef class Mlx5QP(QPEx): def __init__(self, Context context, QPInitAttrEx init_attr, Mlx5DVQPInitAttr dv_init_attr): """ Initializes an mlx5 QP according to the user-provided data. :param context: Context object :param init_attr: QPInitAttrEx object :param dv_init_attr: Mlx5DVQPInitAttr object :return: An initialized Mlx5QP """ cdef PD pd # Initialize the logger here as the parent's __init__ is called after # the QP is allocated. Allocation can fail, which will lead to exceptions # thrown during object's teardown. 
self.logger = logging.getLogger(self.__class__.__name__) self.dc_type = dv_init_attr.dc_type if dv_init_attr else 0 if init_attr.pd is not None: pd = init_attr.pd pd.add_ref(self) self.qp = \ dv.mlx5dv_create_qp(context.context, &init_attr.attr, &dv_init_attr.attr if dv_init_attr is not None else NULL) if self.qp == NULL: raise PyverbsRDMAErrno('Failed to create MLX5 QP.\nQPInitAttrEx ' 'attributes:\n{}\nMLX5DVQPInitAttr:\n{}'. format(init_attr, dv_init_attr)) super().__init__(context, init_attr) def _get_comp_mask(self, dst): masks = {dve.MLX5DV_DCTYPE_DCT: {'INIT': e.IBV_QP_PKEY_INDEX | e.IBV_QP_PORT | e.IBV_QP_ACCESS_FLAGS, 'RTR': e.IBV_QP_AV |\ e.IBV_QP_PATH_MTU |\ e.IBV_QP_MIN_RNR_TIMER}, dve.MLX5DV_DCTYPE_DCI: {'INIT': e.IBV_QP_PKEY_INDEX |\ e.IBV_QP_PORT, 'RTR': e.IBV_QP_PATH_MTU, 'RTS': e.IBV_QP_TIMEOUT |\ e.IBV_QP_RETRY_CNT |\ e.IBV_QP_RNR_RETRY | e.IBV_QP_SQ_PSN |\ e.IBV_QP_MAX_QP_RD_ATOMIC}} if self.dc_type == 0: return super()._get_comp_mask(dst) return masks[self.dc_type][dst] | e.IBV_QP_STATE def wr_set_dc_addr(self, AH ah, remote_dctn, remote_dc_key): """ Attach a DC info to the last work request. :param ah: Address Handle to the requested DCT. :param remote_dctn: The remote DCT number. :param remote_dc_key: The remote DC key. """ dv.mlx5dv_wr_set_dc_addr(dv.mlx5dv_qp_ex_from_ibv_qp_ex(self.qp_ex), ah.ah, remote_dctn, remote_dc_key) def wr_raw_wqe(self, wqe): """ Build a raw work request :param wqe: A Wqe object """ cdef void *wqe_ptr = wqe.address dv.mlx5dv_wr_raw_wqe(dv.mlx5dv_qp_ex_from_ibv_qp_ex(self.qp_ex), wqe_ptr) def wr_mr_interleaved(self, Mlx5Mkey mkey, access_flags, repeat_count, mr_interleaved_lst): """ Registers an interleaved memory layout by using an indirect mkey and some interleaved data. :param mkey: A Mlx5Mkey instance to reg this memory. :param access_flags: The mkey access flags. :param repeat_count: Number of times to repeat the interleaved layout. :param mr_interleaved_lst: List of Mlx5MrInterleaved. """ num_interleaved = len(mr_interleaved_lst) cdef dv.mlx5dv_mr_interleaved *mr_interleaved_p = \ calloc(1, num_interleaved * sizeof(dv.mlx5dv_mr_interleaved)) if mr_interleaved_p == NULL: raise MemoryError('Failed to calloc mr interleaved buffers') copy_mr_interleaved_array(mr_interleaved_p, mr_interleaved_lst) dv.mlx5dv_wr_mr_interleaved(dv.mlx5dv_qp_ex_from_ibv_qp_ex(self.qp_ex), mkey.mlx5dv_mkey, access_flags, repeat_count, num_interleaved, mr_interleaved_p) free(mr_interleaved_p) def wr_mr_list(self, Mlx5Mkey mkey, access_flags, sge_list): """ Registers a memory layout based on list of SGE. :param mkey: A Mlx5Mkey instance to reg this memory. :param access_flags: The mkey access flags. :param sge_list: List of SGE. """ num_sges = len(sge_list) cdef v.ibv_sge *sge_p = calloc(1, num_sges * sizeof(v.ibv_sge)) if sge_p == NULL: raise MemoryError('Failed to calloc sge buffers') copy_sg_array(sge_p, sge_list, num_sges) dv.mlx5dv_wr_mr_list(dv.mlx5dv_qp_ex_from_ibv_qp_ex(self.qp_ex), mkey.mlx5dv_mkey, access_flags, num_sges, sge_p) free(sge_p) def wr_mkey_configure(self, Mlx5Mkey mkey, num_setters, Mlx5MkeyConfAttr mkey_config): """ Create a work request to configure an Mkey :param mkey: A Mlx5Mkey instance to configure. :param num_setters: The number of setters that must be called after this function. :param attr: The Mkey configuration attributes. 
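        Example of a full configuration session (an illustrative sketch;
        'qp', 'mkey' and 'sge' are hypothetical objects - an Mlx5QP created
        with the MLX5DV_QP_EX_WITH_MKEY_CONFIGURE send op flag, an Mlx5Mkey
        and a pyverbs SGE, respectively). num_setters is 2 because two
        setters follow the configure call:
            qp.wr_start()
            qp.wr_mkey_configure(mkey, 2, Mlx5MkeyConfAttr())
            qp.wr_set_mkey_access_flags(e.IBV_ACCESS_LOCAL_WRITE)
            qp.wr_set_mkey_layout_list([sge])
            qp.wr_complete()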
""" dv.mlx5dv_wr_mkey_configure(dv.mlx5dv_qp_ex_from_ibv_qp_ex(self.qp_ex), mkey.mlx5dv_mkey, num_setters, &mkey_config.mlx5dv_mkey_conf_attr) def wr_set_mkey_access_flags(self, access_flags): """ Set the memory protection attributes for an Mkey :param access_flags: The mkey access flags. """ dv.mlx5dv_wr_set_mkey_access_flags(dv.mlx5dv_qp_ex_from_ibv_qp_ex(self.qp_ex), access_flags) def wr_set_mkey_layout_list(self, sge_list): """ Set a memory layout for an Mkey based on SGE list. :param sge_list: List of SGE. """ num_sges = len(sge_list) cdef v.ibv_sge *sge_p = calloc(1, num_sges * sizeof(v.ibv_sge)) if sge_p == NULL: raise MemoryError('Failed to calloc sge buffers') copy_sg_array(sge_p, sge_list, num_sges) dv.mlx5dv_wr_set_mkey_layout_list(dv.mlx5dv_qp_ex_from_ibv_qp_ex(self.qp_ex), num_sges, sge_p) free(sge_p) def wr_set_mkey_layout_interleaved(self, repeat_count, mr_interleaved_lst): """ Set an interleaved memory layout for an Mkey :param repeat_count: Number of times to repeat the interleaved layout. :param mr_interleaved_lst: List of Mlx5MrInterleaved. """ num_interleaved = len(mr_interleaved_lst) cdef dv.mlx5dv_mr_interleaved *mr_interleaved_p = \ calloc(1, num_interleaved * sizeof(dv.mlx5dv_mr_interleaved)) if mr_interleaved_p == NULL: raise MemoryError('Failed to calloc mr interleaved buffers') copy_mr_interleaved_array(mr_interleaved_p, mr_interleaved_lst) dv.mlx5dv_wr_set_mkey_layout_interleaved(dv.mlx5dv_qp_ex_from_ibv_qp_ex(self.qp_ex), repeat_count, num_interleaved, mr_interleaved_p) free(mr_interleaved_p) def wr_set_mkey_crypto(self, Mlx5CryptoAttr attr): """ Configure a MKey for crypto operation. :param attr: crypto attributes to set for the mkey. """ dv.mlx5dv_wr_set_mkey_crypto(dv.mlx5dv_qp_ex_from_ibv_qp_ex(self.qp_ex), &attr.mlx5dv_crypto_attr) def wr_set_mkey_sig_block(self, Mlx5SigBlockAttr block_attr): """ Configure a MKEY for block signature (data integrity) operation. :param block_attr: Block signature attributes to set for the mkey. """ dv.mlx5dv_wr_set_mkey_sig_block(dv.mlx5dv_qp_ex_from_ibv_qp_ex(self.qp_ex), &block_attr.mlx5dv_sig_block_attr) def wr_memcpy(self, dest_lkey, dest_addr, src_lkey, src_addr, length): """ Copies memory data on PCI bus using DMA functionality of the device. :param dest_lkey: Local key of the mkey to copy data to :param dest_addr: Memory address to copy data to :param src_lkey: Local key of the mkey to copy data from :param src_addr: Memory address to copy data from :param length: Length of data to be copied """ dv.mlx5dv_wr_memcpy(dv.mlx5dv_qp_ex_from_ibv_qp_ex(self.qp_ex), dest_lkey, dest_addr, src_lkey, src_addr, length) def cancel_posted_send_wrs(self, wr_id): """ Cancel all pending send work requests with supplied wr_id in a QP in SQD state. :param wr_id: The WRID to cancel. :return: Number of work requests that were canceled. """ rc = dv.mlx5dv_qp_cancel_posted_send_wrs(dv.mlx5dv_qp_ex_from_ibv_qp_ex(self.qp_ex), wr_id) if rc < 0: raise PyverbsRDMAError(f'Failed to cancel send WRs', -rc) return rc def wr_set_dc_addr_stream(self, AH ah, remote_dctn, remote_dc_key, stream_id): """ Attach a DC info to the last work request. :param ah: Address Handle to the requested DCT. :param remote_dctn: The remote DCT number. :param remote_dc_key: The remote DC key. 
:param stream_id: DCI stream channel_id """ dv.mlx5dv_wr_set_dc_addr_stream(dv.mlx5dv_qp_ex_from_ibv_qp_ex(self.qp_ex), ah.ah, remote_dctn, remote_dc_key, stream_id) @staticmethod def query_lag_port(QP qp): """ Queries for port num that the QP desired to use, and the port that is currently used by the bond for this QP. :param qp: Queries the port for this QP. :return: Tuple of the desired port and actual port which used by the HW. """ cdef uint8_t port_num cdef uint8_t active_port_num rc = dv.mlx5dv_query_qp_lag_port(qp.qp, &port_num, &active_port_num) if rc != 0: raise PyverbsRDMAError(f'Failed to query QP #{qp.qp.qp_num}', rc) return port_num, active_port_num @staticmethod def modify_lag_port(QP qp, uint8_t port_num): """ Modifies the lag port num that the QP desires to use. :param qp: Modifies the port for this QP. :param port_num: The desired port to be used by the QP to send traffic in a LAG configuration. """ rc = dv.mlx5dv_modify_qp_lag_port(qp.qp, port_num) if rc != 0: raise PyverbsRDMAError(f'Failed to modify lag of QP #{qp.qp.qp_num}', rc) @staticmethod def modify_qp_sched_elem(QP qp, Mlx5dvSchedLeaf req_sched_leaf=None, Mlx5dvSchedLeaf resp_sched_leaf=None): """ Connect a QP with a requestor and/or a responder scheduling element. :param qp: connect this QP to schedule elements. :param req_sched_leaf: Mlx5dvSchedLeaf for the send queue. :param resp_sched_leaf: Mlx5dvSchedLeaf for the recv queue. """ req_se = req_sched_leaf.sched_leaf if req_sched_leaf else NULL resp_se = resp_sched_leaf.sched_leaf if resp_sched_leaf else NULL rc = dv.mlx5dv_modify_qp_sched_elem(qp.qp, req_se, resp_se) if rc != 0: raise PyverbsRDMAError(f'Failed to modify QP #{qp.qp.qp_num} sched element', rc) @staticmethod def modify_udp_sport(QP qp, uint16_t udp_sport): """ Modifies the UDP source port of a given QP. :param qp: A QP in RTS state to modify its UDP sport. :param udp_sport: The desired UDP sport to be used by the QP. """ rc = dv.mlx5dv_modify_qp_udp_sport(qp.qp, udp_sport) if rc != 0: raise PyverbsRDMAError(f'Failed to modify UDP source port of QP ' f'#{qp.qp.qp_num}', rc) @staticmethod def map_ah_to_qp(AH ah, qp_num): """ Map the destination path information in ah to the information extracted from the qp. :param ah: The target’s address handle. :param qp_num: The traffic initiator QP number. """ rc = dv.mlx5dv_map_ah_to_qp(ah.ah, qp_num) if rc != 0: raise PyverbsRDMAError(f'Failed to map AH to QP #{qp_num}', rc) @staticmethod def modify_dci_stream_channel_id(QP qp, uint16_t stream_id): """ Reset an errored stream_id in the HW DCI context. :param qp: A DCI QP in RTS state. :param stream_id: The desired stream_id that need to be reset. """ rc = dv.mlx5dv_dci_stream_id_reset(qp.qp, stream_id) if rc != 0: raise PyverbsRDMAError(f'Failed to reset stream_id #{stream_id} for DCI QP' f'#{qp.qp.qp_num}', rc) cdef class Mlx5DVCQInitAttr(PyverbsObject): """ Represents mlx5dv_cq_init_attr struct, initial attributes used for mlx5 CQ creation. """ def __init__(self, comp_mask=0, cqe_comp_res_format=0, flags=0, cqe_size=0): """ Initializes an Mlx5CQInitAttr object with zeroes as default values. :param comp_mask: Marks which of the following fields should be considered. Use mlx5dv_cq_init_attr_mask enum. :param cqe_comp_res_format: The various CQE response formats of the responder side. Use mlx5dv_cqe_comp_res_format enum. :param flags: A bitwise OR of the various values described in mlx5dv_cq_init_attr_flags. 
:param cqe_size: Configure the CQE size to be 64 or 128 bytes, other values will cause the CQ creation process to fail. Valid when MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE is set. :return: None """ super().__init__() self.attr.comp_mask = comp_mask self.attr.cqe_comp_res_format = cqe_comp_res_format self.attr.flags = flags self.attr.cqe_size = cqe_size @property def comp_mask(self): return self.attr.comp_mask @comp_mask.setter def comp_mask(self, val): self.attr.comp_mask = val @property def cqe_comp_res_format(self): return self.attr.cqe_comp_res_format @cqe_comp_res_format.setter def cqe_comp_res_format(self, val): self.attr.cqe_comp_res_format = val @property def flags(self): return self.attr.flags @flags.setter def flags(self, val): self.attr.flags = val @property def cqe_size(self): return self.attr.cqe_size @cqe_size.setter def cqe_size(self, val): self.attr.cqe_size = val def __str__(self): print_format = '{:22}: {:<20}\n' flags = {dve.MLX5DV_CQ_INIT_ATTR_FLAGS_CQE_PAD: "MLX5DV_CQ_INIT_ATTR_FLAGS_CQE_PAD}"} mask = {dve.MLX5DV_CQ_INIT_ATTR_MASK_COMPRESSED_CQE: "MLX5DV_CQ_INIT_ATTR_MASK_COMPRESSED_CQE", dve.MLX5DV_CQ_INIT_ATTR_MASK_FLAGS: "MLX5DV_CQ_INIT_ATTR_MASK_FLAGS", dve.MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE: "MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE"} fmt = {dve.MLX5DV_CQE_RES_FORMAT_HASH: "MLX5DV_CQE_RES_FORMAT_HASH", dve.MLX5DV_CQE_RES_FORMAT_CSUM: "MLX5DV_CQE_RES_FORMAT_CSUM", dve.MLX5DV_CQE_RES_FORMAT_CSUM_STRIDX: "MLX5DV_CQE_RES_FORMAT_CSUM_STRIDX"} return 'Mlx5DVCQInitAttr:\n' +\ print_format.format('comp_mask', bitmask_to_str(self.comp_mask, mask)) +\ print_format.format('CQE compression format', bitmask_to_str(self.cqe_comp_res_format, fmt)) +\ print_format.format('flags', bitmask_to_str(self.flags, flags)) + \ print_format.format('CQE size', self.cqe_size) cdef class Mlx5CQ(CQEX): def __init__(self, Mlx5Context context, CqInitAttrEx init_attr, Mlx5DVCQInitAttr dv_init_attr): # Initialize the logger here as the parent's __init__ is called after # the CQ is allocated. Allocation can fail, which will lead to exceptions # thrown during object's teardown. self.logger = logging.getLogger(self.__class__.__name__) self.cq = \ dv.mlx5dv_create_cq(context.context, &init_attr.attr, &dv_init_attr.attr if dv_init_attr is not None else NULL) if self.cq == NULL: raise PyverbsRDMAErrno('Failed to create MLX5 CQ.\nCQInitAttrEx:\n' '{}\nMLX5DVCQInitAttr:\n{}'. 
format(init_attr, dv_init_attr)) self.ibv_cq = v.ibv_cq_ex_to_cq(self.cq) self.context = context context.add_ref(self) super().__init__(context, init_attr) def __str__(self): print_format = '{:<22}: {:<20}\n' return 'Mlx5 CQ:\n' +\ print_format.format('Handle', self.cq.handle) +\ print_format.format('CQEs', self.cq.cqe) def qpts_to_str(qp_types): numeric_types = qp_types qpts_str = '' qpts = {e.IBV_QPT_RC: 'RC', e.IBV_QPT_UC: 'UC', e.IBV_QPT_UD: 'UD', e.IBV_QPT_RAW_PACKET: 'Raw Packet', e.IBV_QPT_XRC_SEND: 'XRC Send', e.IBV_QPT_XRC_RECV: 'XRC Recv', e.IBV_QPT_DRIVER: 'Driver QPT'} for t in qpts.keys(): if (1 << t) & qp_types: qpts_str += qpts[t] + ', ' qp_types -= t if qp_types == 0: break return qpts_str[:-2] + ' ({})'.format(numeric_types) def bitmask_to_str(bits, values): numeric_bits = bits res = '' for t in values.keys(): if t & bits: res += values[t] + ', ' bits -= t if bits == 0: break return res[:-2] + ' ({})'.format(numeric_bits) # Remove last comma and space def context_comp_mask_to_str(mask): l = {dve.MLX5DV_CONTEXT_MASK_CQE_COMPRESION: 'CQE compression', dve.MLX5DV_CONTEXT_MASK_SWP: 'SW parsing', dve.MLX5DV_CONTEXT_MASK_STRIDING_RQ: 'Striding RQ', dve.MLX5DV_CONTEXT_MASK_TUNNEL_OFFLOADS: 'Tunnel offloads', dve.MLX5DV_CONTEXT_MASK_DYN_BFREGS: 'Dynamic BF regs', dve.MLX5DV_CONTEXT_MASK_CLOCK_INFO_UPDATE: 'Clock info update', dve.MLX5DV_CONTEXT_MASK_FLOW_ACTION_FLAGS: 'Flow action flags'} return bitmask_to_str(mask, l) def context_flags_to_str(flags): l = {dve.MLX5DV_CONTEXT_FLAGS_CQE_V1: 'CQE v1', dve.MLX5DV_CONTEXT_FLAGS_MPW_ALLOWED: 'Multi packet WQE allowed', dve.MLX5DV_CONTEXT_FLAGS_ENHANCED_MPW: 'Enhanced multi packet WQE', dve.MLX5DV_CONTEXT_FLAGS_CQE_128B_COMP: 'Support CQE 128B compression', dve.MLX5DV_CONTEXT_FLAGS_CQE_128B_PAD: 'Support CQE 128B padding', dve.MLX5DV_CONTEXT_FLAGS_PACKET_BASED_CREDIT_MODE: 'Support packet based credit mode (in RC QP)'} return bitmask_to_str(flags, l) def swp_to_str(swps): l = {dve.MLX5DV_SW_PARSING: 'SW Parsing', dve.MLX5DV_SW_PARSING_CSUM: 'SW Parsing CSUM', dve.MLX5DV_SW_PARSING_LSO: 'SW Parsing LSO'} return bitmask_to_str(swps, l) def cqe_comp_to_str(cqe): l = {dve.MLX5DV_CQE_RES_FORMAT_HASH: 'with hash', dve.MLX5DV_CQE_RES_FORMAT_CSUM: 'with RX checksum CSUM', dve.MLX5DV_CQE_RES_FORMAT_CSUM_STRIDX: 'with stride index'} return bitmask_to_str(cqe, l) def tunnel_offloads_to_str(tun): l = {dve.MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_VXLAN: 'VXLAN', dve.MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GRE: 'GRE', dve.MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GENEVE: 'Geneve', dve.MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_CW_MPLS_OVER_GRE:\ 'Ctrl word + MPLS over GRE', dve.MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_CW_MPLS_OVER_UDP:\ 'Ctrl word + MPLS over UDP'} return bitmask_to_str(tun, l) def dc_type_to_str(dctype): l = {dve.MLX5DV_DCTYPE_DCT: 'DCT', dve.MLX5DV_DCTYPE_DCI: 'DCI'} try: return l[dctype] except KeyError: return 'Unknown DC type ({dc})'.format(dc=dctype) def qp_comp_mask_to_str(flags): l = {dve.MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS: 'Create flags', dve.MLX5DV_QP_INIT_ATTR_MASK_DC: 'DC', dve.MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS: 'Send ops flags', dve.MLX5DV_QP_INIT_ATTR_MASK_DCI_STREAMS: 'DCI Stream'} return bitmask_to_str(flags, l) def qp_create_flags_to_str(flags): l = {dve.MLX5DV_QP_CREATE_TUNNEL_OFFLOADS: 'Tunnel offloads', dve.MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_UC: 'Allow UC self loopback', dve.MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_MC: 'Allow MC self loopback', dve.MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE: 'Disable scatter to 
CQE', dve.MLX5DV_QP_CREATE_ALLOW_SCATTER_TO_CQE: 'Allow scatter to CQE', dve.MLX5DV_QP_CREATE_PACKET_BASED_CREDIT_MODE: 'Packet based credit mode', dve.MLX5DV_QP_CREATE_SIG_PIPELINING: 'Support signature pipeline support'} return bitmask_to_str(flags, l) def send_ops_flags_to_str(flags): l = {dve.MLX5DV_QP_EX_WITH_MR_INTERLEAVED: 'With MR interleaved', dve.MLX5DV_QP_EX_WITH_MR_LIST: 'With MR list', dve.MLX5DV_QP_EX_WITH_MKEY_CONFIGURE: 'With Mkey configure'} return bitmask_to_str(flags, l) cdef class Mlx5VAR(PyverbsObject): def __init__(self, Context context not None, flags=0): self.context = context self.var = dv.mlx5dv_alloc_var(context.context, flags) if self.var == NULL: raise PyverbsRDMAErrno('Failed to allocate VAR') context.vars.add(self) def __dealloc__(self): self.close() cpdef close(self): if self.var != NULL: dv.mlx5dv_free_var(self.var) self.var = NULL def __str__(self): print_format = '{:20}: {:<20}\n' return print_format.format('page id', self.var.page_id) +\ print_format.format('length', self.var.length) +\ print_format.format('mmap offset', self.var.mmap_off) +\ print_format.format('compatibility mask', self.var.comp_mask) @property def page_id(self): return self.var.page_id @property def length(self): return self.var.length @property def mmap_off(self): return self.var.mmap_off @property def comp_mask(self): return self.var.comp_mask cdef class Mlx5PP(PyverbsObject): """ Represents mlx5dv_pp, packet pacing struct. """ def __init__(self, Context context not None, pp_context, flags=0): """ Initializes a Mlx5PP object. :param context: DevX context :param pp_context: Bytes of packet pacing context according to the device specs. Must be bytes type or implements __bytes__ method :param flags: Packet pacing allocation flags """ self.context = context pp_ctx_bytes = bytes(pp_context) self.pp = dv.mlx5dv_pp_alloc(context.context, len(pp_ctx_bytes), pp_ctx_bytes, flags) if self.pp == NULL: raise PyverbsRDMAErrno('Failed to allocate packet pacing entry') context.pps.add(self) def __dealloc__(self): self.close() cpdef close(self): if self.pp != NULL: dv.mlx5dv_pp_free(self.pp) self.pp = NULL @property def index(self): return self.pp.index cdef class Mlx5UAR(PyverbsObject): def __init__(self, Context context not None, flags=0): self.uar = dv.mlx5dv_devx_alloc_uar(context.context, flags) if self.uar == NULL: raise PyverbsRDMAErrno('Failed to allocate UAR') context.uars.add(self) def __dealloc__(self): self.close() cpdef close(self): if self.uar != NULL: dv.mlx5dv_devx_free_uar(self.uar) self.uar = NULL def __str__(self): print_format = '{:20}: {:<20}\n' return print_format.format('reg addr', self.uar.reg_addr) +\ print_format.format('base addr', self.uar.base_addr) +\ print_format.format('page id', self.uar.page_id) +\ print_format.format('mmap off', self.uar.mmap_off) +\ print_format.format('comp mask', self.uar.comp_mask) @property def reg_addr(self): return self.uar.reg_addr @property def base_addr(self): return self.uar.base_addr @property def page_id(self): return self.uar.page_id @property def mmap_off(self): return self.uar.mmap_off @property def comp_mask(self): return self.uar.comp_mask @property def uar(self): return self.uar cdef class Mlx5DmOpAddr(PyverbsCM): def __init__(self, DM dm not None, op=0): """ Wraps mlx5dv_dm_map_op_addr. Gets operation address of a device memory (DM), which must be munmapped by the user when it's no longer needed. :param dm: Device Memory instance :param op: DM operation type :return: An mmaped address to the DM for the requested operation (op). 
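        Example (an illustrative sketch; 'dm' is assumed to be an allocated
        device memory object and 'op' a MEMIC operation type supported by
        the device, both hypothetical here):
            op_addr = Mlx5DmOpAddr(dm, op)
            op_addr.write(b'\x01')   # a single atomic 1-byte write
            op_addr.unmap(dm_size)  # 'dm_size' - the mapped length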
""" self.addr = dv.mlx5dv_dm_map_op_addr(dm.dm, op) if self.addr == NULL: raise PyverbsRDMAErrno('Failed to get DM operation address') def unmap(self, length): munmap(self.addr, length) @staticmethod cdef void _cpy(void *dst, void *src, int length): """ Copy data (bytes) from src to dst. To ensure atomicity, copy in a single write operation. :param dst: The address to copy from. :param src: The address to copy to. :param length: Length in bytes. (supports: power of two. up to 8 bytes) """ if length == 1: ( dst)[0] = ( src)[0] elif length == 2: ( dst)[0] = ( src)[0] elif length == 4: ( dst)[0] = ( src)[0] elif length == 8: ( dst)[0] = ( src)[0] elif length == 16: raise PyverbsUserError('Currently PyVerbs does not support 16 bytes Memic Atomic operations') else: raise PyverbsUserError(f'Memic Atomic operations do not support with length: {length}') def write(self, data): """ Writes data (bytes) to the DM operation address. :param data: Bytes of data """ length = len(data) Mlx5DmOpAddr._cpy( self.addr, data, length) def read(self, length): """ Reads 'length' bytes from the DM operation address. :param length: Data length to read (in bytes) :return: Read data in bytes """ cdef void *data = calloc(length, sizeof(char)) Mlx5DmOpAddr._cpy(data, self.addr, length) res = ( data)[:length] free(data) return res cpdef close(self): self.addr = NULL @property def addr(self): return self.addr cdef class WqeSeg(PyverbsCM): """ An abstract class for WQE segments. Each WQE segment (such as control segment, data segment, etc.) should inherit from this class. """ @staticmethod def sizeof(): return 0 cpdef _copy_to_buffer(self, addr): memcpy(addr, self.segment, self.sizeof()) def __dealloc__(self): self.close() cpdef close(self): if self.segment != NULL: free(self.segment) self.segment = NULL cdef class WqeCtrlSeg(WqeSeg): """ Wrapper class for dv.mlx5_wqe_ctrl_seg """ def __init__(self, pi=0, opcode=0, opmod=0, qp_num=0, fm_ce_se=0, ds=0, signature=0, imm=0): """ Create a WqeCtrlSeg by creating a mlx5_wqe_ctrl_seg and using mlx5dv_set_ctrl_seg, segment values are accessed through the getters/setters. 
""" self.segment = calloc(1, sizeof(dv.mlx5_wqe_ctrl_seg)) self.set_ctrl_seg(pi, opcode, opmod, qp_num, fm_ce_se, ds, signature, imm) def __str__(self): print_format = '{:20}: {:<20}\n' return print_format.format('opcode', (self.segment).opmod_idx_opcode) + \ print_format.format('qpn_ds', (self.segment).qpn_ds) + \ print_format.format('signature', (self.segment).signature) + \ print_format.format('fm_ce_se', (self.segment).fm_ce_se) + \ print_format.format('imm', (self.segment).imm) def set_ctrl_seg(self, pi, opcode, opmod, qp_num, fm_ce_se, ds, signature, imm): dv.mlx5dv_set_ctrl_seg(self.segment, pi, opcode, opmod, qp_num, fm_ce_se, ds, signature, imm) @staticmethod def sizeof(): return sizeof(dv.mlx5_wqe_ctrl_seg) @property def addr(self): return self.segment @property def opmod_idx_opcode(self): return be32toh((self.segment).opmod_idx_opcode) @opmod_idx_opcode.setter def opmod_idx_opcode(self, val): (self.segment).opmod_idx_opcode = htobe32(val) @property def qpn_ds(self): return be32toh((self.segment).qpn_ds) @qpn_ds.setter def qpn_ds(self, val): (self.segment).qpn_ds = htobe32(val) @property def signature(self): return (self.segment).signature @signature.setter def signature(self, val): (self.segment).signature = val @property def fm_ce_se(self): return (self.segment).fm_ce_se @fm_ce_se.setter def fm_ce_se(self, val): (self.segment).fm_ce_se = val @property def imm(self): return be32toh((self.segment).imm) @imm.setter def imm(self, val): (self.segment).imm = htobe32(val) cdef class WqeDataSeg(WqeSeg): def __init__(self, length=0, lkey=0, addr=0): """ Create a dv.mlx5_wqe_data_seg by allocating it and using dv.mlx5dv_set_data_seg with the values received in init """ self.segment = calloc(1, sizeof(dv.mlx5_wqe_data_seg)) self.set_data_seg(length, lkey, addr) @staticmethod def sizeof(): return sizeof(dv.mlx5_wqe_data_seg) def __str__(self): print_format = '{:20}: {:<20}\n' return print_format.format('byte_count', (self.segment).byte_count) + \ print_format.format('lkey', (self.segment).lkey) + \ print_format.format('addr', (self.segment).addr) def set_data_seg(self, length, lkey, addr): dv.mlx5dv_set_data_seg(self.segment, length, lkey, addr) @property def byte_count(self): return be32toh((self.segment).byte_count) @byte_count.setter def byte_count(self, val): (self.segment).byte_count = htobe32(val) @property def lkey(self): return be32toh((self.segment).lkey) @lkey.setter def lkey(self, val): (self.segment).lkey = htobe32(val) @property def addr(self): return be64toh((self.segment).addr) @addr.setter def addr(self, val): (self.segment).addr = htobe64(val) cdef class Wqe(PyverbsCM): """ The Wqe class represents a WQE, which is one or more chained WQE segments. """ def __init__(self, segments, addr=0): """ Create a Wqe with , in case an address was not passed by the user, memory would be allocated according to the size needed and the segments are copied over to the buffer. :param segments: The segments (ctrl, data) of the Wqe as PRM format or WqeSeg instance. 
        :param addr: User address to write the WQE on (Optional)
        """
        self.segments = segments
        if addr:
            self.is_user_addr = True
            self.addr = <void *><uintptr_t>addr
        else:
            self.is_user_addr = False
            allocation_size = sum(map(lambda x: x.sizeof()
                                      if isinstance(x, WqeSeg) else len(x),
                                      self.segments))
            self.addr = calloc(1, allocation_size)
            addr = <uintptr_t>self.addr
            for seg in self.segments:
                if isinstance(seg, WqeSeg):
                    seg._copy_to_buffer(addr)
                    addr += seg.sizeof()
                else:  # PRM format
                    addr = copy_data_to_addr(addr, seg)

    @property
    def address(self):
        return <uintptr_t>self.addr

    def __str__(self):
        ret_str = ''
        i = 0
        for segment in self.segments:
            ret_str += f'Segment type {type(segment)} #{i}:\n' + str(segment)
            i += 1
        return ret_str

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        if self.addr != NULL:
            if not self.is_user_addr:
                free(self.addr)
            self.addr = NULL


cdef class Mlx5UMEM(PyverbsCM):
    def __init__(self, Context context not None, size, addr=None,
                 alignment=64, access=0, pgsz_bitmap=0, comp_mask=0,
                 dmabuf_fd=0):
        """
        User memory object to be used by the DevX interface. If pgsz_bitmap
        or comp_mask were passed, the extended umem registration will be
        used.
        :param context: RDMA device context to create the action on
        :param size: The size of the addr buffer (or the internal buffer to
                     be allocated if addr is None)
        :param alignment: The alignment of the internally allocated buffer
                          (Valid if addr is None)
        :param addr: The memory start address to register (if None, the
                     address will be allocated internally)
        :param access: The desired memory protection attributes (default: 0)
        :param pgsz_bitmap: Represents the required page sizes
        :param comp_mask: Compatibility mask
        :param dmabuf_fd: FD of a dmabuf
        """
        super().__init__()
        cdef dv.mlx5dv_devx_umem_in umem_in
        if addr is not None:
            self.addr = <void *><uintptr_t>addr
            self.is_user_addr = True
        else:
            self.addr = <void *><uintptr_t>posix_memalign(size, alignment)
            memset(self.addr, 0, size)
            self.is_user_addr = False
        if pgsz_bitmap or comp_mask:
            umem_in.addr = self.addr
            umem_in.size = size
            umem_in.access = access
            umem_in.pgsz_bitmap = pgsz_bitmap
            umem_in.comp_mask = comp_mask
            umem_in.dmabuf_fd = dmabuf_fd
            self.umem = dv.mlx5dv_devx_umem_reg_ex(context.context, &umem_in)
        else:
            self.umem = dv.mlx5dv_devx_umem_reg(context.context, self.addr,
                                                size, access)
        if self.umem == NULL:
            raise PyverbsRDMAErrno("Failed to register a UMEM.")
        self.context = context
        self.context.add_ref(self)

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        if self.umem != NULL:
            if self.logger:
                self.logger.debug('Closing Mlx5UMEM')
            rc = dv.mlx5dv_devx_umem_dereg(self.umem)
            try:
                if rc:
                    raise PyverbsError("Failed to dereg UMEM.", rc)
            finally:
                if not self.is_user_addr:
                    free(self.addr)
            self.umem = NULL
            self.context = None

    def __str__(self):
        print_format = '{:20}: {:<20}\n'
        return print_format.format('umem id', self.umem_id) + \
               print_format.format('reg addr', self.umem_addr)

    @property
    def umem_id(self):
        return self.umem.umem_id

    @property
    def umem_addr(self):
        if self.addr:
            return <uintptr_t>self.addr


cdef class Mlx5Cqe64(PyverbsObject):
    def __init__(self, addr):
        self.cqe = <dv.mlx5_cqe64 *><uintptr_t>addr

    def dump(self):
        dump_format = '{:08x} {:08x} {:08x} {:08x}\n'
        str = ''
        for i in range(0, 16, 4):
            str += dump_format.format(be32toh((<uint32_t *>self.cqe)[i]),
                                      be32toh((<uint32_t *>self.cqe)[i + 1]),
                                      be32toh((<uint32_t *>self.cqe)[i + 2]),
                                      be32toh((<uint32_t *>self.cqe)[i + 3]))
        return str

    def is_empty(self):
        for i in range(16):
            if be32toh((<uint32_t *>self.cqe)[i]) != 0:
                return False
        return True

    @property
    def owner(self):
        return dv.mlx5dv_get_cqe_owner(self.cqe)

    @owner.setter
    def owner(self, val):
        dv.mlx5dv_set_cqe_owner(self.cqe, val)

    @property
    def se(self):
        return
dv.mlx5dv_get_cqe_se(self.cqe) @property def format(self): return dv.mlx5dv_get_cqe_format(self.cqe) @property def opcode(self): return dv.mlx5dv_get_cqe_opcode(self.cqe) @property def imm_inval_pkey(self): return be32toh(self.cqe.imm_inval_pkey) @property def wqe_id(self): return be16toh(self.cqe.wqe_id) @property def byte_cnt(self): return be32toh(self.cqe.byte_cnt) @property def timestamp(self): return be64toh(self.cqe.timestamp) @property def wqe_counter(self): return be16toh(self.cqe.wqe_counter) @property def signature(self): return self.cqe.signature @property def op_own(self): return self.cqe.op_own def __str__(self): return (((self.cqe)[0])).__str__() cdef class Mlx5DevxMsiVector(PyverbsCM): """ Represents mlx5dv_devx_msi_vector C struct. """ def __init__(self, Context context): super().__init__() self.msi_vector = dv.mlx5dv_devx_alloc_msi_vector(context.context) if self.msi_vector == NULL: raise PyverbsRDMAErrno('Failed to allocate an msi_vector') @property def vector(self): return self.msi_vector.vector @property def fd(self): return self.msi_vector.fd cpdef close(self): if self.msi_vector != NULL: rc = dv.mlx5dv_devx_free_msi_vector(self.msi_vector) if rc: raise PyverbsRDMAError('Failed to free the msi_vector', rc) self.msi_vector = NULL cdef class Mlx5DevxEq(PyverbsCM): """ Represents mlx5dv_devx_eq C struct. """ def __init__(self, Context context, in_, outlen): """ Creates a DevX EQ object. If the object was successfully created, the command's output would be stored as a memoryview in self.out_view. :param in_: Bytes of the obj_create command's input data provided in a device specification format. (Stream of bytes or __bytes__ is implemented) :param outlen: Expected output length in bytes """ super().__init__() in_bytes = bytes(in_) cdef char *in_mailbox = _prepare_devx_inbox(in_bytes) cdef char *out_mailbox = _prepare_devx_outbox(outlen) self.eq = dv.mlx5dv_devx_create_eq(context.context, in_mailbox, len(in_bytes), out_mailbox, outlen) try: if self.eq == NULL: raise PyverbsRDMAErrno('Failed to create async EQ object') self.out_view = memoryview(out_mailbox[:outlen]) status = hex(self.out_view[0]) syndrome = self.out_view[4:8].hex() if status != hex(0): raise PyverbsRDMAError('Failed to create async EQ object with status' f'({status}) and syndrome (0x{syndrome})') finally: free(in_mailbox) free(out_mailbox) self.context = context self.context.add_ref(self) @property def out_view(self): return self.out_view @property def vaddr(self): return self.eq.vaddr def __dealloc__(self): self.close() cpdef close(self): if self.eq != NULL: self.logger.debug('Closing Mlx5DevxEq') rc = dv.mlx5dv_devx_destroy_eq(self.eq) if rc: raise PyverbsRDMAError('Failed to destroy a DevX EQ object', rc) self.eq = NULL self.context = None rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv_crypto.pxd000066400000000000000000000020441477342711600232620ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia, Inc. All rights reserved. 
See COPYING file #cython: language_level=3 from pyverbs.base cimport PyverbsObject, PyverbsCM cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.device cimport Context from pyverbs.pd cimport PD cdef class Mlx5CryptoLoginAttr(PyverbsObject): cdef dv.mlx5dv_crypto_login_attr mlx5dv_crypto_login_attr cdef class Mlx5CryptoExtLoginAttr(PyverbsObject): cdef dv.mlx5dv_crypto_login_attr_ex mlx5dv_crypto_login_attr_ex cdef object credential cdef class Mlx5DEKInitAttr(PyverbsObject): cdef dv.mlx5dv_dek_init_attr mlx5dv_dek_init_attr cdef PD pd cdef class Mlx5DEKAttr(PyverbsObject): cdef dv.mlx5dv_dek_attr mlx5dv_dek_attr cdef class Mlx5CryptoAttr(PyverbsObject): cdef dv.mlx5dv_crypto_attr mlx5dv_crypto_attr cdef class Mlx5DEK(PyverbsCM): cdef dv.mlx5dv_dek *mlx5dv_dek cdef PD pd cdef class Mlx5CryptoLogin(PyverbsCM): cdef dv.mlx5dv_crypto_login_obj *crypto_login_obj cdef Context context rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv_crypto.pyx000066400000000000000000000310221477342711600233050ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia, Inc. All rights reserved. See COPYING file from libc.string cimport memcpy from pyverbs.pyverbs_error import PyverbsRDMAError cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.base import PyverbsRDMAErrno from pyverbs.pd cimport PD cdef class Mlx5CryptoLoginAttr(PyverbsObject): def __init__(self, credential, credential_id=0, import_kek_id=0): """ Initializes a Mlx5CryptoLoginAttr object representing mlx5dv_crypto_login_attr C struct. :param credential: The credential to login with. Must be provided wrapped by the AES key wrap algorithm using the import KEK indicated by *import_kek_id*. :param credential_id: The index of credential that stored on the device. :param import_kek_id: The index of import_kek that stored on the device. """ cdef char *credential_c = credential self.mlx5dv_crypto_login_attr.credential_id = credential_id self.mlx5dv_crypto_login_attr.import_kek_id = import_kek_id memcpy(self.mlx5dv_crypto_login_attr.credential, credential_c, 48) @property def credential_id(self): return self.mlx5dv_crypto_login_attr.credential_id @property def import_kek_id(self): return self.mlx5dv_crypto_login_attr.import_kek_id @property def credential(self): return self.mlx5dv_crypto_login_attr.credential @property def comp_mask(self): return self.mlx5dv_crypto_login_attr.comp_mask def __str__(self): print_format = '{:20}: {:<20}\n' return 'Mlx5CryptoLoginAttr:\n' +\ print_format.format('Credential id', self.credential_id) +\ print_format.format('Import KEK id', self.import_kek_id) +\ print_format.format('Credential', str(self.credential)) +\ print_format.format('Comp mask', self.comp_mask) cdef class Mlx5CryptoExtLoginAttr(PyverbsObject): def __init__(self, credential, credential_len, credential_id=0, import_kek_id=0): """ Initializes a Mlx5CryptoExtLoginAttr object representing mlx5dv_crypto_login_attr_ex C struct. :param credential: The credential to login with. :param credential_len: The credential length. Must be provided. :param credential_id: The index of credential that stored on the device. :param import_kek_id: The index of import_kek that stored on the device. 
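        Example (illustrative only; assumes `wrapped_cred` already holds a
        credential wrapped with the import KEK stored at index 0x2 on the
        device):
            attr = Mlx5CryptoExtLoginAttr(wrapped_cred, len(wrapped_cred),
                                          credential_id=0x1, import_kek_id=0x2)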
""" cdef char *credential_c = credential self.mlx5dv_crypto_login_attr_ex.credential_id = credential_id self.mlx5dv_crypto_login_attr_ex.import_kek_id = import_kek_id self.mlx5dv_crypto_login_attr_ex.credential = credential_c self.mlx5dv_crypto_login_attr_ex.credential_len = credential_len self.credential = credential @property def credential_id(self): return self.mlx5dv_crypto_login_attr_ex.credential_id @property def import_kek_id(self): return self.mlx5dv_crypto_login_attr_ex.import_kek_id @property def credential(self): return self.credential @property def credential_len(self): return self.mlx5dv_crypto_login_attr_ex.credential_len @property def comp_mask(self): return self.mlx5dv_crypto_login_attr_ex.comp_mask def __str__(self): print_format = '{:20}: {:<20}\n' return 'Mlx5CryptoExtLoginAttr:\n' +\ print_format.format('Credential id', self.credential_id) +\ print_format.format('Import KEK id', self.import_kek_id) +\ print_format.format('Credential', str(self.credential)) +\ print_format.format('credential_len', self.credential_len) +\ print_format.format('Comp mask', self.comp_mask) cdef class Mlx5DEKInitAttr(PyverbsObject): def __init__(self, PD pd, key_size, has_keytag=False, key_purpose=0, opaque=bytes(), key=bytes(), comp_mask=0, Mlx5CryptoLogin crypto_login=None): """ Initializes a Mlx5DEKInitAttr object representing mlx5dv_dek_init_attr C struct. :param pd: The protection domain to be associated with the DEK. :param credential_id: The size of the key, can be MLX5DV_CRYPTO_KEY_SIZE_128/256 :param has_keytag: Whether the DEK has a keytag or not. If set, the key should include a 8 Bytes keytag. :param key_purpose: The crypto purpose of the key. :param opaque: Plaintext metadata to describe the key. :param key: The key itself, wrapped by the crypto login session's import KEK. :param comp_mask: Reserved for future extension. :param crypto_login: Crypto login object. """ cdef char *opaque_c = opaque cdef char *key_c = key self.pd = pd self.mlx5dv_dek_init_attr.pd = pd.pd self.mlx5dv_dek_init_attr.key_size = key_size self.mlx5dv_dek_init_attr.has_keytag = has_keytag self.mlx5dv_dek_init_attr.key_purpose = key_purpose memcpy(self.mlx5dv_dek_init_attr.opaque, opaque_c, 8) memcpy(self.mlx5dv_dek_init_attr.key, key_c, 128) self.mlx5dv_dek_init_attr.comp_mask = comp_mask self.mlx5dv_dek_init_attr.crypto_login = crypto_login.crypto_login_obj if crypto_login else NULL @property def key_size(self): return self.mlx5dv_dek_init_attr.key_size @property def has_keytag(self): return self.mlx5dv_dek_init_attr.has_keytag @property def key_purpose(self): return self.mlx5dv_dek_init_attr.key_purpose @property def opaque(self): return self.mlx5dv_dek_init_attr.opaque.decode() @property def key(self): return self.mlx5dv_dek_init_attr.key.hex() @property def comp_mask(self): return self.mlx5dv_dek_init_attr.comp_mask def __str__(self): print_format = '{:20}: {:<20}\n' return 'Mlx5DEKInitAttr:\n' +\ print_format.format('key_size', self.key_size) +\ print_format.format('Has keytag', self.has_keytag) +\ print_format.format('Key purpose', self.key_purpose) +\ print_format.format('Opaque', self.opaque) +\ print_format.format('Key (in hex format)', self.key) +\ print_format.format('Comp mask', self.comp_mask) cdef class Mlx5DEKAttr(PyverbsObject): """ Initializes a Mlx5DEKAttr object representing mlx5dv_dek_attr C struct. 
""" @property def state(self): return self.mlx5dv_dek_attr.state @property def opaque(self): return self.mlx5dv_dek_attr.opaque @property def comp_mask(self): return self.mlx5dv_dek_attr.comp_mask cdef class Mlx5CryptoAttr(PyverbsObject): def __init__(self, crypto_standard=0, encrypt_on_tx=False, signature_crypto_order=0, data_unit_size=0, initial_tweak=bytes(), Mlx5DEK dek=None, keytag=bytes(), comp_mask=0): """ Initializes a Mlx5CryptoAttr object representing mlx5dv_crypto_attr C struct. :param crypto_standard: The encryption standard that should be used. :param encrypt_on_tx: If set, memory data will be encrypted during TX and wire data will be decrypted during RX. :param signature_crypto_order: Controls the order between crypto and signature operations. Relevant only if signature is configured. :param data_unit_size: The tweak is incremented after each *data_unit_size* during the encryption. :param initial_tweak: A value to be used during encryption of each data unit. This value is incremented by the device for every data unit in the message :param dek: The DEK to be used for the crypto operations. :param keytag: A tag that verifies that the correct DEK is being used. :param comp_mask: Reserved for future extension. """ cdef char *initial_tweak_c = initial_tweak cdef char *keytag_c = keytag self.mlx5dv_crypto_attr.crypto_standard = crypto_standard self.mlx5dv_crypto_attr.encrypt_on_tx = encrypt_on_tx self.mlx5dv_crypto_attr.signature_crypto_order = signature_crypto_order self.mlx5dv_crypto_attr.data_unit_size = data_unit_size memcpy(self.mlx5dv_crypto_attr.initial_tweak, initial_tweak_c, 16) self.mlx5dv_crypto_attr.dek = dek.mlx5dv_dek memcpy(self.mlx5dv_crypto_attr.keytag, keytag_c, 8) self.mlx5dv_crypto_attr.comp_mask = comp_mask @property def crypto_standard(self): return self.mlx5dv_crypto_attr.crypto_standard @property def encrypt_on_tx(self): return self.mlx5dv_crypto_attr.encrypt_on_tx @property def signature_crypto_order(self): return self.mlx5dv_crypto_attr.signature_crypto_order @property def data_unit_size(self): return self.mlx5dv_crypto_attr.data_unit_size @property def initial_tweak(self): return self.mlx5dv_crypto_attr.initial_tweak.hex() @property def keytag(self): print('@keytag') return self.mlx5dv_crypto_attr.keytag.hex() @property def comp_mask(self): return self.mlx5dv_crypto_attr.comp_mask def __str__(self): print_format = '{:30}: {:<20}\n' return 'Mlx5CryptoAttr:\n' +\ print_format.format('Crypto standard', self.crypto_standard) +\ print_format.format('Encrypt on TX', self.encrypt_on_tx) +\ print_format.format('Signature crypto order', self.signature_crypto_order) +\ print_format.format('Data unit size', self.data_unit_size) +\ print_format.format('Initial tweak (in hex format)', self.initial_tweak) +\ print_format.format('keytag (in hex format)', self.keytag) +\ print_format.format('Comp mask', self.comp_mask) cdef class Mlx5DEK(PyverbsCM): def __init__(self, Context ctx, Mlx5DEKInitAttr dek_init_attr): """ Create a Mlx5DEK object. :param context: Context to create the schedule resources on. :param dek_init_attr: Mlx5DEKInitAttr, containing the DEK attributes. """ self.mlx5dv_dek = dv.mlx5dv_dek_create(ctx.context, &dek_init_attr.mlx5dv_dek_init_attr) if self.mlx5dv_dek == NULL: raise PyverbsRDMAErrno('Failed to create DEK') self.pd = dek_init_attr.pd self.pd.deks.add(self) def query(self): """ Query the dek state. :return: Mlx5DEKAttr which contains the dek state and opaque. 
""" dek_attr = Mlx5DEKAttr() rc = dv.mlx5dv_dek_query(self.mlx5dv_dek, &dek_attr.mlx5dv_dek_attr) if rc: raise PyverbsRDMAError('Failed to query the dek', rc) return dek_attr def __dealloc__(self): self.close() cpdef close(self): if self.mlx5dv_dek != NULL: rc = dv.mlx5dv_dek_destroy(self.mlx5dv_dek) if rc: raise PyverbsRDMAError('Failed to destroy a DEK', rc) self.mlx5dv_dek = NULL self.pd = None cdef class Mlx5CryptoLogin(PyverbsCM): def __init__(self, Context context, Mlx5CryptoExtLoginAttr crypto_attr): """ Create a Mlx5CryptoLogin object. :param context: Context to create the schedule resources on. :param crypto_ext_login_attr: Mlx5CryptoExtLoginAttr. """ self.crypto_login_obj = dv.mlx5dv_crypto_login_create(context.context, &crypto_attr.mlx5dv_crypto_login_attr_ex) if self.crypto_login_obj == NULL: raise PyverbsRDMAErrno('Failed to create CryptoLoginObj') self.context = context context.crypto_logins.add(self) def query(self): """ Queries the state of the current crypto login session. :return: The login state. """ cdef dv.mlx5dv_crypto_login_query_attr query_attr return dv.mlx5dv_crypto_login_query(self.crypto_login_obj, &query_attr) def __dealloc__(self): self.close() cpdef close(self): if self.crypto_login_obj != NULL: rc = dv.mlx5dv_crypto_login_destroy(self.crypto_login_obj) if rc: raise PyverbsRDMAError('Failed to destroy a CryptoLoginObj', rc) self.crypto_login_obj = NULL rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv_dmabuf.pxd000066400000000000000000000003471477342711600232040ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2024 Nvidia, Inc. All rights reserved. See COPYING file #cython: language_level=3 from pyverbs.mr cimport DmaBufMR cdef class Mlx5DmaBufMR(DmaBufMR): pass rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv_dmabuf.pyx000066400000000000000000000026401477342711600232270ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2024 Nvidia, Inc. All rights reserved. See COPYING file cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.base import PyverbsRDMAErrno from pyverbs.mr cimport DmaBufMR from pyverbs.pd cimport PD cdef class Mlx5DmaBufMR(DmaBufMR): def __init__(self, PD pd not None, offset, length, iova=0, fd=None, access=0, mlx5_access=0): """ Initializes a DmaBufMR (DMA-BUF Memory Region) of the given length and access flags using the given PD and DmaBuf objects. :param pd: A PD object :param offset: Byte offset from the beginning of the dma-buf :param length: Length in bytes :param iova: The virtual base address of the MR when accessed through a lkey or rkey. :param fd: FD representing a dmabuf. :param access: Access flags, see ibv_access_flags enum. :param mlx5_access: A specific device access flags. :return: The newly created DMABUFMR """ self.mr = dv.mlx5dv_reg_dmabuf_mr(pd.pd, offset, length, iova, fd, access, mlx5_access) if self.mr == NULL: raise PyverbsRDMAErrno( f'Failed to register a mlx5 dma-buf MR. length: {length}, access flags: {access} ' f'mlx5_access: {mlx5_access}') self.pd = pd self.dmabuf = fd self.offset = offset pd.add_ref(self) rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv_flow.pxd000066400000000000000000000015671477342711600227220ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. 
See COPYING file #cython: language_level=3 cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.flow cimport Flow, FlowAction from pyverbs.base cimport PyverbsObject cdef class Mlx5FlowMatchParameters(PyverbsObject): cdef dv.mlx5dv_flow_match_parameters *params cpdef close(self) cdef class Mlx5FlowMatcherAttr(PyverbsObject): cdef dv.mlx5dv_flow_matcher_attr attr cdef class Mlx5FlowMatcher(PyverbsObject): cdef dv.mlx5dv_flow_matcher *flow_matcher cdef object flows cdef add_ref(self, obj) cpdef close(self) cdef class Mlx5FlowActionAttr(PyverbsObject): cdef dv.mlx5dv_flow_action_attr attr cdef object qp cdef object action cdef class Mlx5Flow(Flow): pass cdef class Mlx5PacketReformatFlowAction(FlowAction): pass rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv_flow.pyx000066400000000000000000000265061477342711600227470ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. See COPYING file from libc.stdlib cimport calloc, free from libc.string cimport memcpy from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsError, \ PyverbsUserError from pyverbs.device cimport Context from pyverbs.base import PyverbsRDMAErrno from pyverbs.base cimport close_weakrefs from pyverbs.device cimport Context cimport pyverbs.libibverbs as v from pyverbs.qp cimport QP import weakref cdef class Mlx5FlowMatchParameters(PyverbsObject): def __init__(self, size, values): """ Initialize a Mlx5FlowMatchParameters object over an underlying mlx5dv_flow_match_parameters C object that defines match parameters for steering flow. :param size: Length of the mask/value in bytes :param values: Bytes with mask/value to use in format of Flow Table Entry Match Parameters Format table in PRM or instance of FlowTableEntryMatchParam class. """ cdef char *py_bytes_c super().__init__() struct_size = sizeof(size_t) + size self.params = calloc(1, struct_size) if self.params == NULL: raise PyverbsError(f'Failed to allocate buffer of size {struct_size}') self.params.match_sz = size if size: py_bytes = bytes(values) py_bytes_c = py_bytes memcpy(self.params.match_buf, py_bytes_c, len(values)) def __dealloc__(self): self.close() cpdef close(self): if self.params != NULL: if self.logger: self.logger.debug('Closing Mlx5FlowMatchParameters') free(self.params) self.params = NULL cdef class Mlx5FlowMatcherAttr(PyverbsObject): def __init__(self, Mlx5FlowMatchParameters match_mask, attr_type=v.IBV_FLOW_ATTR_NORMAL, flags=0, priority=0, match_criteria_enable=0, comp_mask=0, ft_type=0): """ Initialize a Mlx5FlowMatcherAttr object over an underlying mlx5dv_flow_matcher_attr C object that defines matcher's attributes. :param match_mask: Match parameters to match on :param attr_type: Type of matcher to be created :param flags: Special flags to control rule: Nothing or zero value means matcher will store ingress flow rules. IBV_FLOW_ATTR_FLAGS_EGRESS: Specified this matcher will store egress flow rules. :param priority: Matcher priority :param match_criteria_enable: Bitmask representing which of the headers and parameters in match_criteria are used in defining the Flow. 
Bit 0: outer_headers Bit 1: misc_parameters Bit 2: inner_headers Bit 3: misc_parameters_2 Bit 4: misc_parameters_3 Bit 5: misc_parameters_4 :param comp_mask: MLX5DV_FLOW_MATCHER_MASK_FT_TYPE for ft_type (the only option that is currently supported) :param ft_type: Specified in which flow table type, the matcher will store the flow rules: MLX5DV_FLOW_TABLE_TYPE_NIC_RX: Specified this matcher will store ingress flow rules. MLX5DV_FLOW_TABLE_TYPE_NIC_TX - matcher will store egress flow rules. MLX5DV_FLOW_TABLE_TYPE_FDB - matcher will store FDB rules. MLX5DV_FLOW_TABLE_TYPE_RDMA_RX - matcher will store ingress RDMA flow rules. MLX5DV_FLOW_TABLE_TYPE_RDMA_TX - matcher will store egress RDMA flow rules. """ super().__init__() self.attr.type = attr_type self.attr.flags = flags self.attr.priority = priority self.attr.match_criteria_enable = match_criteria_enable self.attr.match_mask = match_mask.params self.attr.comp_mask = comp_mask self.attr.ft_type = ft_type cdef class Mlx5FlowMatcher(PyverbsObject): def __init__(self, Context context, Mlx5FlowMatcherAttr attr): """ Initialize a Mlx5FlowMatcher object over an underlying mlx5dv_flow_matcher C object that defines a matcher for steering flow. :param context: Context object :param attr: Flow matcher attributes """ super().__init__() self.flow_matcher = dv.mlx5dv_create_flow_matcher(context.context, &attr.attr) if self.flow_matcher == NULL: raise PyverbsRDMAErrno('Flow matcher creation failed.') self.flows = weakref.WeakSet() cdef add_ref(self, obj): if isinstance(obj, Flow): self.flows.add(obj) else: raise PyverbsError('Unrecognized object type') def __dealloc__(self): self.close() cpdef close(self): if self.flow_matcher != NULL: if self.logger: self.logger.debug('Closing Mlx5FlowMatcher') close_weakrefs([self.flows]) rc = dv.mlx5dv_destroy_flow_matcher(self.flow_matcher) if rc: raise PyverbsRDMAError('Destroy matcher failed.', rc) self.flow_matcher = NULL cdef class Mlx5PacketReformatFlowAction(FlowAction): def __init__(self, Context context, data=None, reformat_type=dv.MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2, ft_type=dv.MLX5DV_FLOW_TABLE_TYPE_NIC_RX): """ Initialize a Mlx5PacketReformatFlowAction object derived from FlowAction class and represents reformat flow steering action that allows adding/removing packet headers. :param context: Context object :param data: Encap headers (if needed) :param reformat_type: L2 or L3 encap or decap :param ft_type: dv.MLX5DV_FLOW_TABLE_TYPE_NIC_RX for ingress or dv.MLX5DV_FLOW_TABLE_TYPE_NIC_TX for egress """ super().__init__() cdef char *buf = NULL data_len = 0 if data is None else len(data) if data: arr = bytearray(data) buf = calloc(1, data_len) for i in range(data_len): buf[i] = arr[i] reformat_data = NULL if data is None else buf self.action = dv.mlx5dv_create_flow_action_packet_reformat( context.context, data_len, reformat_data, reformat_type, ft_type) if data: free(buf) if self.action == NULL: raise PyverbsRDMAErrno('Failed to create flow action packet reformat') cdef class Mlx5FlowActionAttr(PyverbsObject): def __init__(self, action_type=None, QP qp=None, FlowAction flow_action=None): """ Initialize a Mlx5FlowActionAttr object over an underlying mlx5dv_flow_action_attr C object that defines actions attributes for the flow matcher. 
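        For example, steering matched packets to an existing QP can be
        described as follows (an illustrative sketch; assumes `qp` is an
        existing QP and the mlx5 enums module is imported as `dve`):
            attr = Mlx5FlowActionAttr(action_type=dve.MLX5DV_FLOW_ACTION_DEST_IBV_QP,
                                      qp=qp)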
        :param action_type: Type of the action
        :param qp: A QP target for go to QP action
        :param flow_action: An action to perform for the flow
        """
        super().__init__()
        if action_type:
            self.attr.type = action_type
        if action_type == dv.MLX5DV_FLOW_ACTION_DEST_IBV_QP:
            self.attr.qp = qp.qp
            self.qp = qp
        elif action_type == dv.MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION:
            self.attr.action = flow_action.action
            self.action = flow_action
        elif action_type:
            raise PyverbsUserError(f'Unsupported action type: {action_type}.')

    @property
    def type(self):
        return self.attr.type

    @type.setter
    def type(self, action_type):
        if self.attr.type != dv.MLX5DV_FLOW_ACTION_DEST_IBV_QP:
            raise PyverbsUserError(f'Unsupported action type: {action_type}.')
        self.attr.type = action_type

    @property
    def qp(self):
        if self.attr.type != dv.MLX5DV_FLOW_ACTION_DEST_IBV_QP:
            raise PyverbsUserError(f'Action attr of type {self.attr.type} doesn\'t have a qp')
        return self.qp

    @qp.setter
    def qp(self, QP qp):
        if self.attr.type != dv.MLX5DV_FLOW_ACTION_DEST_IBV_QP:
            raise PyverbsUserError(f'Action attr of type {self.attr.type} doesn\'t have a qp')
        self.qp = qp

    @property
    def action(self):
        if self.attr.type != dv.MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION:
            raise PyverbsUserError(f'Action attr of type {self.attr.type} doesn\'t have an action')
        return self.action

    @action.setter
    def action(self, FlowAction action):
        if self.attr.type != dv.MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION:
            raise PyverbsUserError(f'Action attr of type {self.attr.type} doesn\'t have an action')
        self.action = action
        self.attr.action = action.action


cdef class Mlx5Flow(Flow):
    def __init__(self, Mlx5FlowMatcher matcher,
                 Mlx5FlowMatchParameters match_value, action_attrs=None,
                 num_actions=0):
        """
        Initialize a Mlx5Flow object derived from the Flow class.
        :param matcher: A matcher with the fields to match on
        :param match_value: Match parameters with values to match on
        :param action_attrs: List of actions to perform
        :param num_actions: Number of actions
        """
        cdef void *tmp_addr
        cdef void *attr_addr
        super(Flow, self).__init__()
        action_attrs = [] if action_attrs is None else action_attrs
        if len(action_attrs) != num_actions:
            self.logger.warn('num_actions is different from actions array length.')
        total_size = num_actions * sizeof(dv.mlx5dv_flow_action_attr)
        attr_addr = calloc(1, total_size)
        if attr_addr == NULL:
            raise PyverbsError(f'Failed to allocate memory of size {total_size}')
        tmp_addr = attr_addr
        for attr in action_attrs:
            if (<Mlx5FlowActionAttr>attr).attr.type == dv.MLX5DV_FLOW_ACTION_DEST_IBV_QP:
                (<QP>(attr.qp)).add_ref(self)
                self.qp = (<Mlx5FlowActionAttr>attr).qp
            elif (<Mlx5FlowActionAttr>attr).attr.type not in [dv.MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION]:
                raise PyverbsUserError(f'Unsupported action type: '
                                       f'{(<Mlx5FlowActionAttr>attr).attr.type}.')
            memcpy(tmp_addr, &(<Mlx5FlowActionAttr>attr).attr,
                   sizeof(dv.mlx5dv_flow_action_attr))
            tmp_addr += sizeof(dv.mlx5dv_flow_action_attr)
        self.flow = dv.mlx5dv_create_flow(matcher.flow_matcher,
                                          match_value.params, num_actions,
                                          attr_addr)
        free(attr_addr)
        if self.flow == NULL:
            raise PyverbsRDMAErrno('Flow creation failed.')
        matcher.add_ref(self)
rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv_mkey.pxd000066400000000000000000000021601477342711600227060ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2020 Nvidia, Inc. All rights reserved.
See COPYING file #cython: language_level=3 from pyverbs.base cimport PyverbsObject, PyverbsCM cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.pd cimport PD cdef class Mlx5MkeyConfAttr(PyverbsObject): cdef dv.mlx5dv_mkey_conf_attr mlx5dv_mkey_conf_attr cdef class Mlx5MrInterleaved(PyverbsObject): cdef dv.mlx5dv_mr_interleaved mlx5dv_mr_interleaved cdef class Mlx5Mkey(PyverbsCM): cdef dv.mlx5dv_mkey *mlx5dv_mkey cdef PD pd cdef object max_entries cdef class Mlx5SigCrc(PyverbsObject): cdef dv.mlx5dv_sig_crc mlx5dv_sig_crc cdef class Mlx5SigT10Dif(PyverbsObject): cdef dv.mlx5dv_sig_t10dif mlx5dv_sig_t10dif cdef class Mlx5SigBlockDomain(PyverbsObject): cdef dv.mlx5dv_sig_block_domain mlx5dv_sig_block_domain cdef class Mlx5SigBlockAttr(PyverbsObject): cdef dv.mlx5dv_sig_block_attr mlx5dv_sig_block_attr cdef class Mlx5SigErr(PyverbsObject): cdef dv.mlx5dv_sig_err mlx5dv_sig_err cdef class Mlx5MkeyErr(PyverbsObject): cdef dv.mlx5dv_mkey_err mlx5dv_mkey_err rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv_mkey.pyx000066400000000000000000000207721477342711600227440ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. See COPYING file from pyverbs.pyverbs_error import PyverbsUserError, PyverbsRDMAError cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.base import PyverbsRDMAErrno from pyverbs.pd cimport PD cdef class Mlx5SigCrc(PyverbsObject): def __init__(self, crc_type=0, seed=0): """ Initializes a Mlx5SigCrc object representing mlx5dv_sig_crc C struct. :param crc_type: The specific CRC type. :param seed: A seed for the CRC calculation per block. """ self.mlx5dv_sig_crc.type = crc_type self.mlx5dv_sig_crc.seed = seed cdef class Mlx5SigT10Dif(PyverbsObject): def __init__(self, bg_type=0, bg=0, app_tag=0, ref_tag=0, flags=0): """ Initializes a Mlx5SigT10Dif object representing mlx5dv_sig_t10dif C struct. :param bg_type: The block guard type to be used. :param bg: A seed for the block guard calculation per block. :param app_tag: An application tag to generate or validate. :param ref_tag: A reference tag to generate or validate. :param flags: Flags for the T10DIF attributes. """ self.mlx5dv_sig_t10dif.bg_type = bg_type self.mlx5dv_sig_t10dif.bg = bg self.mlx5dv_sig_t10dif.app_tag = app_tag self.mlx5dv_sig_t10dif.ref_tag = ref_tag self.mlx5dv_sig_t10dif.flags = flags cdef class Mlx5SigBlockDomain(PyverbsObject): def __init__(self, sig_type=0, Mlx5SigT10Dif dif=None, Mlx5SigCrc crc=None, block_size=0, comp_mask=0): """ Initializes a Mlx5SigBlockDomain object representing mlx5dv_sig_block_domain C struct. :param sig_type: The signature type for this domain. :param dif: Mlx5SigT10Dif object. :param crc: Mlx5SigCrc object. :param block_size: The block size for this domain. :param comp_mask: Compatibility mask. """ self.mlx5dv_sig_block_domain.sig_type = sig_type self.mlx5dv_sig_block_domain.block_size = block_size self.mlx5dv_sig_block_domain.comp_mask = comp_mask if dif: self.mlx5dv_sig_block_domain.sig.dif = &dif.mlx5dv_sig_t10dif if crc: self.mlx5dv_sig_block_domain.sig.crc = &crc.mlx5dv_sig_crc cdef class Mlx5SigBlockAttr(PyverbsObject): def __init__(self, Mlx5SigBlockDomain mem=None, Mlx5SigBlockDomain wire=None, flags=0, check_mask=0, copy_mask=0, comp_mask=0): """ Initializes a Mlx5SigBlockAttr object representing mlx5dv_sig_block_attr C struct. :param mem: Mlx5SigBlockDomain of the signature configuration for the memory domain or None if the domain does not have a signature. 
        :param wire: Mlx5SigBlockDomain of the signature configuration for the
                     wire domain or None if the domain does not have a
                     signature.
        :param flags: Flags for the block signature attributes.
        :param check_mask: Byte of the input signature is checked if the
                           corresponding bit in check_mask is set.
        :param copy_mask: Byte of the signature is copied from the source
                          domain to the destination domain if the
                          corresponding bit in copy_mask is set.
        :param comp_mask: Compatibility mask.
        """
        self.mlx5dv_sig_block_attr.flags = flags
        self.mlx5dv_sig_block_attr.check_mask = check_mask
        self.mlx5dv_sig_block_attr.copy_mask = copy_mask
        self.mlx5dv_sig_block_attr.comp_mask = comp_mask
        if mem:
            self.mlx5dv_sig_block_attr.mem = &mem.mlx5dv_sig_block_domain
        if wire:
            self.mlx5dv_sig_block_attr.wire = &wire.mlx5dv_sig_block_domain


cdef class Mlx5SigErr(PyverbsObject):
    def __init__(self, actual_value=0, expected_value=0, offset=0):
        """
        Initializes a Mlx5SigErr object representing mlx5dv_sig_err C struct.
        :param actual_value: The actual value that was calculated from the
                             transferred data.
        :param expected_value: The expected value based on what appears in the
                               respective signature field.
        :param offset: The offset within the transfer where the error happened.
        """
        self.mlx5dv_sig_err.actual_value = actual_value
        self.mlx5dv_sig_err.expected_value = expected_value
        self.mlx5dv_sig_err.offset = offset

    @property
    def actual_value(self):
        return self.mlx5dv_sig_err.actual_value

    @property
    def expected_value(self):
        return self.mlx5dv_sig_err.expected_value

    @property
    def offset(self):
        return self.mlx5dv_sig_err.offset


cdef class Mlx5MkeyErr(PyverbsObject):
    def __init__(self, Mlx5SigErr sig_err=Mlx5SigErr(),
                 err_type=dv.MLX5DV_MKEY_NO_ERR):
        """
        Initializes a Mlx5MkeyErr object representing mlx5dv_mkey_err C struct.
        :param sig_err: Mlx5SigErr object that handles the sig error.
        :param err_type: Indicates what kind of error happened.
        """
        self.mlx5dv_mkey_err.err_type = err_type
        self.mlx5dv_mkey_err.err.sig = sig_err.mlx5dv_sig_err

    @property
    def err_type(self):
        return self.mlx5dv_mkey_err.err_type

    @property
    def sig_err(self):
        return Mlx5SigErr(self.mlx5dv_mkey_err.err.sig.actual_value,
                          self.mlx5dv_mkey_err.err.sig.expected_value,
                          self.mlx5dv_mkey_err.err.sig.offset)


cdef class Mlx5MkeyConfAttr(PyverbsObject):
    def __init__(self, conf_flags=0, comp_mask=0):
        """
        Initializes a Mlx5MkeyConfAttr object representing
        mlx5dv_mkey_conf_attr C struct.
        :param conf_flags: Mkey configuration flags.
        :param comp_mask: Compatibility mask.
        """
        self.mlx5dv_mkey_conf_attr.conf_flags = conf_flags
        self.mlx5dv_mkey_conf_attr.comp_mask = comp_mask


cdef class Mlx5MrInterleaved(PyverbsObject):
    def __init__(self, addr, bytes_count, bytes_skip, lkey):
        """
        Initializes a Mlx5MrInterleaved object representing
        mlx5dv_mr_interleaved C struct.
        :param addr: The start address.
        :param bytes_count: Count of bytes from the address that will hold the
                            real data.
        :param bytes_skip: Count of bytes to skip after the bytes_count.
        :param lkey: The lkey of this memory.
        """
        self.mlx5dv_mr_interleaved.addr = addr
        self.mlx5dv_mr_interleaved.bytes_count = bytes_count
        self.mlx5dv_mr_interleaved.bytes_skip = bytes_skip
        self.mlx5dv_mr_interleaved.lkey = lkey


cdef class Mlx5Mkey(PyverbsCM):
    def __init__(self, PD pd not None, create_flags, max_entries):
        """
        Creates an indirect mkey and stores the actual mkey max_entries after
        the mkey creation.
        :param pd: PD instance.
        :param create_flags: Mkey creation flags.
        :param max_entries: Requested max number of pointed entries by this
                            indirect mkey.
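        Example (a minimal sketch; assumes `pd` is an existing PD on a device
        that supports indirect mkeys, with the mlx5 enums module imported as
        `dve`):
            mkey = Mlx5Mkey(pd, dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT, 3)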
""" cdef dv.mlx5dv_mkey_init_attr mkey_init mkey_init.pd = pd.pd mkey_init.create_flags = create_flags mkey_init.max_entries = max_entries self.mlx5dv_mkey = dv.mlx5dv_create_mkey(&mkey_init) if self.mlx5dv_mkey == NULL: raise PyverbsRDMAErrno('Failed to create mkey') self.max_entries = mkey_init.max_entries self.pd = pd self.pd.mkeys.add(self) def mkey_check(self): """ Checks the mkey for errors and provides the result in err_info on success. :return: Mlx5MkeyErr object, the result of the Mkey check. """ mkey_err = Mlx5MkeyErr() rc = dv.mlx5dv_mkey_check(self.mlx5dv_mkey, &mkey_err.mlx5dv_mkey_err) if rc: raise PyverbsRDMAError('Failed to check the mkey', rc) return mkey_err @property def lkey(self): return self.mlx5dv_mkey.lkey @property def rkey(self): return self.mlx5dv_mkey.rkey @property def max_entries(self): return self.max_entries def __dealloc__(self): self.close() cpdef close(self): if self.mlx5dv_mkey != NULL: rc = dv.mlx5dv_destroy_mkey(self.mlx5dv_mkey) if rc: raise PyverbsRDMAError('Failed to destroy a mkey', rc) self.mlx5dv_mkey = NULL self.pd = None rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv_objects.pxd000066400000000000000000000012321477342711600233710ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia, Inc. All rights reserved. See COPYING file #cython: language_level=3 cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.base cimport PyverbsObject cdef class Mlx5DvPD(PyverbsObject): cdef dv.mlx5dv_pd dv_pd cdef class Mlx5DvCQ(PyverbsObject): cdef dv.mlx5dv_cq dv_cq cdef class Mlx5DvQP(PyverbsObject): cdef dv.mlx5dv_qp dv_qp cdef class Mlx5DvSRQ(PyverbsObject): cdef dv.mlx5dv_srq dv_srq cdef class Mlx5DvObj(PyverbsObject): cdef dv.mlx5dv_obj obj cdef Mlx5DvCQ dv_cq cdef Mlx5DvQP dv_qp cdef Mlx5DvPD dv_pd cdef Mlx5DvSRQ dv_srq rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv_objects.pyx000066400000000000000000000142541477342711600234260ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia, Inc. All rights reserved. See COPYING file """ This module wraps mlx5dv_ C structs, such as mlx5dv_cq, mlx5dv_qp etc. It exposes to the users the mlx5 driver-specific attributes for ibv objects by extracting them via mlx5dv_init_obj() API by using Mlx5DvObj class, which holds all the (currently) supported Mlx5Dv objects. Note: This is not be confused with Mlx5 which holds the ibv__ex that was created using mlx5dv_create_(). 
""" from libc.stdint cimport uintptr_t, uint32_t from pyverbs.pyverbs_error import PyverbsUserError, PyverbsRDMAError cimport pyverbs.providers.mlx5.mlx5_enums as dve cimport pyverbs.libibverbs as v cdef class Mlx5DvPD(PyverbsObject): @property def pdn(self): """ The protection domain object number """ return self.dv_pd.pdn @property def comp_mask(self): return self.dv_pd.comp_mask cdef class Mlx5DvCQ(PyverbsObject): @property def cqe_size(self): return self.dv_cq.cqe_size @property def comp_mask(self): return self.dv_cq.comp_mask @property def cqn(self): return self.dv_cq.cqn @property def buf(self): return self.dv_cq.buf @property def cq_uar(self): return self.dv_cq.cq_uar @property def dbrec(self): return self.dv_cq.dbrec @property def cqe_cnt(self): return self.dv_cq.cqe_cnt cdef class Mlx5DvQP(PyverbsObject): @property def rqn(self): """ The receive queue number of the QP""" return self.dv_qp.rqn @property def sqn(self): """ The send queue number of the QP""" return self.dv_qp.sqn @property def tirn(self): """ The number of the transport interface receive object that attached to the RQ of the QP """ return self.dv_qp.tirn @property def tisn(self): """ The number of the transport interface send object that attached to the SQ of the QP """ return self.dv_qp.tisn @property def comp_mask(self): return self.dv_qp.comp_mask @comp_mask.setter def comp_mask(self, val): self.dv_qp.comp_mask = val @property def uar_mmap_offset(self): return self.uar_mmap_offset cdef class Mlx5DvSRQ(PyverbsObject): @property def stride(self): return self.dv_srq.stride @property def head(self): return self.dv_srq.stride @property def tail(self): return self.dv_srq.stride @property def comp_mask(self): return self.dv_srq.comp_mask @comp_mask.setter def comp_mask(self, val): self.dv_srq.comp_mask = val @property def srqn(self): """ The shared receive queue object number """ return self.dv_srq.srqn cdef class Mlx5DvObj(PyverbsObject): """ Mlx5DvObj represents mlx5dv_obj C struct. """ def __init__(self, obj_type=None, **kwargs): """ Retrieves DV objects from ibv object to be able to extract attributes (such as cqe_size of a CQ). Currently supports CQ, QP, PD and SRQ objects. The initialized objects can be accessed using self.dvobj (e.g. self.dvcq). :param obj_type: Bitmask which defines what objects was provided. Currently it supports: MLX5DV_OBJ_CQ, MLX5DV_OBJ_QP, MLX5DV_OBJ_SRQ and MLX5DV_OBJ_PD. :param kwargs: List of objects (cq, qp, pd, srq) from which to extract data and their comp_masks if applicable. If comp_mask is not provided by user, mask all by default. """ self.dv_pd = self.dv_cq = self.dv_qp = self.dv_srq = None if obj_type is None: return self.init_obj(obj_type, **kwargs) def init_obj(self, obj_type, **kwargs): """ Initialize DV objects. The objects are re-initialized if they're already extracted. 
""" supported_obj_types = dve.MLX5DV_OBJ_CQ | dve.MLX5DV_OBJ_QP | \ dve.MLX5DV_OBJ_PD | dve.MLX5DV_OBJ_SRQ if obj_type & supported_obj_types is False: raise PyverbsUserError('Invalid obj_type was provided') cq = kwargs.get('cq') if obj_type | dve.MLX5DV_OBJ_CQ else None qp = kwargs.get('qp') if obj_type | dve.MLX5DV_OBJ_QP else None pd = kwargs.get('pd') if obj_type | dve.MLX5DV_OBJ_PD else None srq = kwargs.get('srq') if obj_type | dve.MLX5DV_OBJ_SRQ else None if cq is qp is pd is srq is None: raise PyverbsUserError("No supported object was provided.") if cq: dv_cq = Mlx5DvCQ() self.obj.cq.in_ = cq.cq self.obj.cq.out = &(dv_cq.dv_cq) self.dv_cq = dv_cq if qp: dv_qp = Mlx5DvQP() comp_mask = kwargs.get('qp_comp_mask') dv_qp.comp_mask = comp_mask if comp_mask else \ dv.MLX5DV_QP_MASK_UAR_MMAP_OFFSET | \ dv.MLX5DV_QP_MASK_RAW_QP_HANDLES | \ dv.MLX5DV_QP_MASK_RAW_QP_TIR_ADDR self.obj.qp.in_ = qp.qp self.obj.qp.out = &(dv_qp.dv_qp) self.dv_qp = dv_qp if pd: dv_pd = Mlx5DvPD() self.obj.pd.in_ = pd.pd self.obj.pd.out = &(dv_pd.dv_pd) self.dv_pd = dv_pd if srq: dv_srq = Mlx5DvSRQ() comp_mask = kwargs.get('srq_comp_mask') dv_srq.comp_mask = comp_mask if comp_mask else dv.MLX5DV_SRQ_MASK_SRQN self.obj.srq.in_ = srq.srq self.obj.srq.out = &(dv_srq.dv_srq) self.dv_srq = dv_srq rc = dv.mlx5dv_init_obj(&self.obj, obj_type) if rc != 0: raise PyverbsRDMAError("Failed to initialize Mlx5DvObj", rc) @property def dvcq(self): return self.dv_cq @property def dvqp(self): return self.dv_qp @property def dvpd(self): return self.dv_pd @property def dvsrq(self): return self.dv_srq rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv_sched.pxd000066400000000000000000000012441477342711600230310ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. See COPYING file #cython: language_level=3 cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.base cimport PyverbsObject cdef class Mlx5dvSchedAttr(PyverbsObject): cdef dv.mlx5dv_sched_attr sched_attr cdef object parent_sched_node cdef class Mlx5dvSchedNode(PyverbsObject): cdef dv.mlx5dv_sched_node *sched_node cdef object context cdef object sched_attr cpdef close(self) cdef class Mlx5dvSchedLeaf(PyverbsObject): cdef dv.mlx5dv_sched_leaf *sched_leaf cdef object context cdef object sched_attr cpdef close(self) rdma-core-56.1/pyverbs/providers/mlx5/mlx5dv_sched.pyx000066400000000000000000000130021477342711600230510ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. See COPYING file from pyverbs.pyverbs_error import PyverbsRDMAError cimport pyverbs.providers.mlx5.libmlx5 as dv from pyverbs.base import PyverbsRDMAErrno from pyverbs.device cimport Context cdef class Mlx5dvSchedAttr(PyverbsObject): def __init__(self, Mlx5dvSchedNode parent_sched_node=None, bw_share=0, max_avg_bw=0, flags=0, comp_mask=0): """ Create a Schedule attr. :param parent_sched_node: The parent Mlx5dvSchedNode. None if this Attr is for the root node. :param flags: Bitmask specifying what attributes in the structure are valid. :param bw_share: The relative bandwidth share allocated for this element. :param max_avg_bw: The maximal transmission rate allowed for the element, averaged over time. :param comp_mask: Reserved for future extension. 
""" self.parent_sched_node = parent_sched_node parent_node = parent_sched_node.sched_node if parent_sched_node \ else NULL self.sched_attr.parent = parent_node self.sched_attr.flags = flags self.sched_attr.bw_share = bw_share self.sched_attr.max_avg_bw = max_avg_bw self.sched_attr.comp_mask = comp_mask @property def bw_share(self): return self.sched_attr.bw_share @property def max_avg_bw(self): return self.sched_attr.max_avg_bw @property def flags(self): return self.sched_attr.flags @property def comp_mask(self): return self.sched_attr.comp_mask def __str__(self): print_format = '{:20}: {:<20}\n' return 'Mlx5dvSchedAttr:\n' +\ print_format.format('BW share', self.bw_share) +\ print_format.format('Max avgerage BW', self.max_avg_bw) +\ print_format.format('Flags', self.flags) +\ print_format.format('Comp mask', self.comp_mask) cdef class Mlx5dvSchedNode(PyverbsObject): def __init__(self, Context context not None, Mlx5dvSchedAttr sched_attr): """ Create a Schedule node. :param context: Context to create the schedule resources on. :param sched_attr: Mlx5dvSchedAttr, containing the sched attributes. """ self.sched_attr = sched_attr self.sched_node = dv.mlx5dv_sched_node_create(context.context, &sched_attr.sched_attr) if self.sched_node == NULL: raise PyverbsRDMAErrno('Failed to create sched node') self.context = context context.sched_nodes.add(self) def modify(self, Mlx5dvSchedAttr sched_attr): rc = dv.mlx5dv_sched_node_modify(self.sched_node, &sched_attr.sched_attr) def __str__(self): print_format = '{:20}: {:<20}\n' return 'Mlx5dvSchedNode:\n' +\ print_format.format('sched attr', str(self.sched_attr)) @property def sched_attr(self): return self.sched_attr @property def bw_share(self): return self.sched_attr.bw_share @property def max_avg_bw(self): return self.sched_attr.max_avg_bw @property def flags(self): return self.sched_attr.flags @property def comp_mask(self): return self.sched_attr.comp_mask def __dealloc__(self): self.close() cpdef close(self): if self.sched_node != NULL: rc = dv.mlx5dv_sched_node_destroy(self.sched_node) if rc != 0: raise PyverbsRDMAError('Failed to destroy a sched node', rc) self.sched_node = NULL self.context = None cdef class Mlx5dvSchedLeaf(PyverbsObject): def __init__(self, Context context not None, Mlx5dvSchedAttr sched_attr): """ Create a Schedule leaf. :param context: Context to create the schedule resources on. :param sched_attr: Mlx5dvSchedAttr, containing the sched attributes. 
""" self.sched_attr = sched_attr self.sched_leaf = dv.mlx5dv_sched_leaf_create(context.context, &sched_attr.sched_attr) if self.sched_leaf == NULL: raise PyverbsRDMAErrno('Failed to create sched leaf') self.context = context context.sched_leafs.add(self) def modify(self, Mlx5dvSchedAttr sched_attr): rc = dv.mlx5dv_sched_leaf_modify(self.sched_leaf, &sched_attr.sched_attr) def __str__(self): print_format = '{:20}: {:<20}\n' return 'Mlx5dvSchedLeaf:\n' +\ print_format.format('sched attr', str(self.sched_attr)) @property def sched_attr(self): return self.sched_attr @property def bw_share(self): return self.sched_attr.bw_share @property def max_avg_bw(self): return self.sched_attr.max_avg_bw @property def flags(self): return self.sched_attr.flags @property def comp_mask(self): return self.sched_attr.comp_mask def __dealloc__(self): self.close() cpdef close(self): if self.sched_leaf != NULL: rc = dv.mlx5dv_sched_leaf_destroy(self.sched_leaf) if rc != 0: raise PyverbsRDMAError('Failed to destroy a sched leaf', rc) self.sched_leaf = NULL self.context = None rdma-core-56.1/pyverbs/pyverbs_error.py000066400000000000000000000027371477342711600203120ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2018, Mellanox Technologies. All rights reserved. import os class PyverbsError(Exception): """ Base exception class for Pyverbs. Inherited by PyverbsRDMAError (for errors returned by rdma-core) and PyverbsUserError (for user-related errors found by Pyverbs, e.g. non-existing device name). """ def __init__(self, msg, error_code = -1): """ Initializes a PyverbsError instance :param msg: The exception's message :param error_code: errno value """ if error_code != -1: msg = '{msg}. Errno: {err}, {err_str}'.\ format(msg=msg, err=error_code, err_str=os.strerror(error_code)) super(PyverbsError, self).__init__(msg) class PyverbsRDMAError(PyverbsError): """ This exception is raised when an rdma-core function returns an error. """ def __init__(self, msg, error_code = -1): super(PyverbsRDMAError, self).__init__(msg, error_code) self._error_code = error_code @property def error_code(self): return self._error_code class PyverbsUserError(PyverbsError): """ This exception is raised when Pyverbs encounters an error resulting from user's action or input. """ def __init__(self, msg): """ Initializes a PyverbsUserError instance :param msg: The exception's message """ super(PyverbsUserError, self).__init__(msg) rdma-core-56.1/pyverbs/qp.pxd000066400000000000000000000023421477342711600161620ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. 
#cython: language_level=3 from pyverbs.base cimport PyverbsObject, PyverbsCM cimport pyverbs.libibverbs as v cdef class QPCap(PyverbsObject): cdef v.ibv_qp_cap cap cdef class QPInitAttr(PyverbsObject): cdef v.ibv_qp_init_attr attr cdef object scq cdef object rcq cdef object srq cdef class QPInitAttrEx(PyverbsObject): cdef v.ibv_qp_init_attr_ex attr cdef object scq cdef object rcq cdef object _pd cdef object xrcd cdef object srq cdef object ind_table cdef class QPAttr(PyverbsObject): cdef v.ibv_qp_attr attr cdef class QP(PyverbsCM): cdef v.ibv_qp *qp cdef int type cdef int state cdef object pd cdef object context cdef object xrcd cpdef close(self) cdef update_cqs(self, init_attr) cdef object scq cdef object rcq cdef object mws cdef object srq cdef object flows cdef object dr_actions cdef add_ref(self, obj) cdef class DataBuffer(PyverbsCM): cdef v.ibv_data_buf data cdef class QPEx(QP): cdef v.ibv_qp_ex *qp_ex cdef object ind_table cdef class ECE(PyverbsCM): cdef v.ibv_ece ece rdma-core-56.1/pyverbs/qp.pyx000066400000000000000000001551651477342711600162230ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. from libc.stdlib cimport malloc, free from libc.string cimport memcpy import weakref from pyverbs.pyverbs_error import PyverbsUserError, PyverbsError, PyverbsRDMAError from pyverbs.utils import gid_str, qp_type_to_str, qp_state_to_str, mtu_to_str from pyverbs.utils import access_flags_to_str, mig_state_to_str from pyverbs.wq cimport RwqIndTable, RxHashConf from pyverbs.mr cimport MW, MWBindInfo, MWBind from pyverbs.wr cimport RecvWR, SendWR, SGE from pyverbs.base import PyverbsRDMAErrno from pyverbs.addr cimport AHAttr, GID, AH from pyverbs.flow cimport FlowAttr, Flow from pyverbs.base cimport close_weakrefs cimport pyverbs.libibverbs_enums as e from pyverbs.addr cimport GlobalRoute from pyverbs.device cimport Context from cpython.ref cimport PyObject from pyverbs.cq cimport CQ, CQEX cimport pyverbs.libibverbs as v from pyverbs.xrcd cimport XRCD from pyverbs.srq cimport SRQ from pyverbs.pd cimport PD cdef extern from 'Python.h': void* PyLong_AsVoidPtr(object) cdef extern from 'endian.h': unsigned long htobe32(unsigned long host_32bits) cdef class QPCap(PyverbsObject): def __init__(self, max_send_wr=1, max_recv_wr=10, max_send_sge=1, max_recv_sge=1, max_inline_data=0): """ Initializes a QPCap object with user-provided or default values. 
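        For example, QPCap(max_send_wr=32, max_recv_wr=32, max_send_sge=2,
        max_recv_sge=2) requests queues of 32 WRs, each scattering/gathering
        up to two SGEs (an illustrative choice, still subject to the device's
        reported limits).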
:param max_send_wr: max number of outstanding WRs in the SQ :param max_recv_wr: max number of outstanding WRs in the RQ :param max_send_sge: Requested max number of scatter-gather elements in a WR in the SQ :param max_recv_sge: Requested max number of scatter-gather elements in a WR in the RQ :param max_inline_data: max number of data (bytes) that can be posted inline to the SQ, otherwise 0 :return: """ super().__init__() self.cap.max_send_wr = max_send_wr self.cap.max_recv_wr = max_recv_wr self.cap.max_send_sge = max_send_sge self.cap.max_recv_sge = max_recv_sge self.cap.max_inline_data = max_inline_data @property def max_send_wr(self): return self.cap.max_send_wr @max_send_wr.setter def max_send_wr(self, val): self.cap.max_send_wr = val @property def max_recv_wr(self): return self.cap.max_recv_wr @max_recv_wr.setter def max_recv_wr(self, val): self.cap.max_recv_wr = val @property def max_send_sge(self): return self.cap.max_send_sge @max_send_sge.setter def max_send_sge(self, val): self.cap.max_send_sge = val @property def max_recv_sge(self): return self.cap.max_recv_sge @max_recv_sge.setter def max_recv_sge(self, val): self.cap.max_recv_sge = val @property def max_inline_data(self): return self.cap.max_inline_data @max_inline_data.setter def max_inline_data(self, val): self.cap.max_inline_data = val def __str__(self): print_format = '{:20}: {:<20}\n' return print_format.format('max send wrs', self.cap.max_send_wr) +\ print_format.format('max recv wrs', self.cap.max_recv_wr) +\ print_format.format('max send sges', self.cap.max_send_sge) +\ print_format.format('max recv sges', self.cap.max_recv_sge) +\ print_format.format('max inline data', self.cap.max_inline_data) cdef class QPInitAttr(PyverbsObject): def __init__(self, qp_type=e.IBV_QPT_UD, qp_context=None, PyverbsObject scq=None, PyverbsObject rcq=None, SRQ srq=None, QPCap cap=None, sq_sig_all=1): """ Initializes a QpInitAttr object representing ibv_qp_init_attr struct. 
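        For example (illustrative; assumes `cq` is an existing CQ and pyverbs
        enums are imported as `e`), an RC QP that uses one CQ for both
        directions can be described with:
            init_attr = QPInitAttr(qp_type=e.IBV_QPT_RC, scq=cq, rcq=cq,
                                   cap=QPCap(max_send_wr=16, max_recv_wr=16))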
:param qp_type: The desired QP type (see enum ibv_qp_type) :param qp_context: Associated QP context :param scq: Send CQ to be used for this QP :param rcq: Receive CQ to be used for this QP :param srq: Shared receive queue to be used as RQ in QP :param cap: A QPCap object :param sq_sig_all: If set, each send WR will generate a completion entry :return: A QpInitAttr object """ super().__init__() _copy_caps(cap, self) self.attr.qp_context = qp_context if scq is not None: if type(scq) is CQ: self.attr.send_cq = (scq).cq elif isinstance(scq, CQEX): self.attr.send_cq = (scq).ibv_cq else: raise PyverbsUserError('Expected CQ/CQEX, got {t}'.\ format(t=type(scq))) self.scq = scq if rcq is not None: if type(rcq) is CQ: self.attr.recv_cq = (rcq).cq elif isinstance(rcq, CQEX): self.attr.recv_cq = (rcq).ibv_cq else: raise PyverbsUserError('Expected CQ/CQEX, got {t}'.\ format(t=type(rcq))) self.rcq = rcq self.attr.qp_type = qp_type self.attr.sq_sig_all = sq_sig_all self.srq = srq self.attr.srq = srq.srq if srq else NULL @property def send_cq(self): return self.scq @send_cq.setter def send_cq(self, val): if type(val) is CQ: self.attr.send_cq = (val).cq elif type(val) is CQEX: self.attr.send_cq = (val).ibv_cq self.scq = val @property def srq(self): return self.srq @srq.setter def srq(self, SRQ val): self.attr.srq = val.srq self.srq = val @property def recv_cq(self): return self.rcq @recv_cq.setter def recv_cq(self, val): if type(val) is CQ: self.attr.recv_cq = (val).cq elif type(val) is CQEX: self.attr.recv_cq = (val).ibv_cq self.rcq = val @property def cap(self): return QPCap(max_send_wr=self.attr.cap.max_send_wr, max_recv_wr=self.attr.cap.max_recv_wr, max_send_sge=self.attr.cap.max_send_sge, max_recv_sge=self.attr.cap.max_recv_sge, max_inline_data=self.attr.cap.max_inline_data) @cap.setter def cap(self, val): _copy_caps(val, self) @property def qp_type(self): return self.attr.qp_type @qp_type.setter def qp_type(self, val): self.attr.qp_type = val @property def sq_sig_all(self): return self.attr.sq_sig_all @sq_sig_all.setter def sq_sig_all(self, val): self.attr.sq_sig_all = val @property def max_send_wr(self): return self.attr.cap.max_send_wr @max_send_wr.setter def max_send_wr(self, val): self.attr.cap.max_send_wr = val @property def max_recv_wr(self): return self.attr.cap.max_recv_wr @max_recv_wr.setter def max_recv_wr(self, val): self.attr.cap.max_recv_wr = val @property def max_send_sge(self): return self.attr.cap.max_send_sge @max_send_sge.setter def max_send_sge(self, val): self.attr.cap.max_send_sge = val @property def max_recv_sge(self): return self.attr.cap.max_recv_sge @max_recv_sge.setter def max_recv_sge(self, val): self.attr.cap.max_recv_sge = val @property def max_inline_data(self): return self.attr.cap.max_inline_data @max_inline_data.setter def max_inline_data(self, val): self.attr.cap.max_inline_data = val def __str__(self): print_format = '{:20}: {:<20}\n' ident_format = ' {:20}: {:<20}\n' return print_format.format('QP type', qp_type_to_str(self.qp_type)) +\ print_format.format('SQ sig. 
all', self.sq_sig_all) +\ 'QP caps:\n' +\ ident_format.format('max send WR', self.attr.cap.max_send_wr) +\ ident_format.format('max recv WR', self.attr.cap.max_recv_wr) +\ ident_format.format('max send SGE', self.attr.cap.max_send_sge) +\ ident_format.format('max recv SGE', self.attr.cap.max_recv_sge) +\ ident_format.format('max inline data', self.attr.cap.max_inline_data) cdef class QPInitAttrEx(PyverbsObject): def __init__(self, qp_type=e.IBV_QPT_UD, qp_context=None, PyverbsObject scq=None, PyverbsObject rcq=None, SRQ srq=None, QPCap cap=None, sq_sig_all=0, comp_mask=0, PD pd=None, XRCD xrcd=None, create_flags=0, max_tso_header=0, source_qpn=0, RxHashConf hash_conf=None, RwqIndTable ind_table=None, send_ops_flags=0): """ Initialize a QPInitAttrEx object with user-defined or default values. :param qp_type: QP type to be created :param qp_context: Associated user context :param scq: Send CQ to be used for this QP :param rcq: Recv CQ to be used for this QP :param srq: Shared receive queue to be used as RQ in QP :param cap: A QPCap object :param sq_sig_all: If set, each send WR will generate a completion entry :param comp_mask: bit mask to determine which of the following fields are valid :param pd: A PD object to be associated with this QP :param xrcd: XRC domain to be used for XRC QPs :param create_flags: Creation flags for this QP :param max_tso_header: Maximum TSO header size :param source_qpn: Source QP number (requires IBV_QP_CREATE_SOURCE_QPN set in create_flags) :param hash_conf: A RxHashConf object, config of RX hash key. :param ind_table: A RwqIndTable object, indirection table of RWQs. :param send_ops_flags: Send opcodes to be supported by the extended QP. Use ibv_qp_create_send_ops_flags enum :return: An initialized QPInitAttrEx object """ super().__init__() _copy_caps(cap, self) if scq is not None: if type(scq) is CQ: self.attr.send_cq = (scq).cq elif isinstance(scq, CQEX): self.attr.send_cq = (scq).ibv_cq else: raise PyverbsUserError('Expected CQ/CQEX, got {t}'.\ format(t=type(scq))) self.scq = scq if rcq is not None: if type(rcq) is CQ: self.attr.recv_cq = (rcq).cq elif isinstance(rcq, CQEX): self.attr.recv_cq = (rcq).ibv_cq else: raise PyverbsUserError('Expected CQ/CQEX, got {t}'.\ format(t=type(rcq))) self.rcq = rcq self.srq = srq self.attr.srq = srq.srq if srq else NULL self.xrcd = xrcd self.attr.xrcd = xrcd.xrcd if xrcd else NULL if hash_conf: self.attr.rx_hash_conf = hash_conf.rx_hash_conf self.ind_table = ind_table self.attr.rwq_ind_tbl = ind_table.rwq_ind_table if ind_table else NULL self.attr.qp_type = qp_type self.attr.sq_sig_all = sq_sig_all self.attr.comp_mask = comp_mask if pd is not None: self._pd = pd self.attr.pd = pd.pd self.attr.create_flags = create_flags self.attr.max_tso_header = max_tso_header self.attr.source_qpn = source_qpn self.attr.send_ops_flags = send_ops_flags @property def send_cq(self): return self.scq @send_cq.setter def send_cq(self, val): if type(val) is CQ: self.attr.send_cq = (val).cq elif type(val) is CQEX: self.attr.send_cq = (val).ibv_cq self.scq = val @property def recv_cq(self): return self.rcq @recv_cq.setter def recv_cq(self, val): if type(val) is CQ: self.attr.recv_cq = (val).cq elif type(val) is CQEX: self.attr.recv_cq = (val).ibv_cq self.rcq = val @property def cap(self): return QPCap(max_send_wr=self.attr.cap.max_send_wr, max_recv_wr=self.attr.cap.max_recv_wr, max_send_sge=self.attr.cap.max_send_sge, max_recv_sge=self.attr.cap.max_recv_sge, max_inline_data=self.attr.cap.max_inline_data) @cap.setter def cap(self, val): 
_copy_caps(val, self) @property def qp_type(self): return self.attr.qp_type @qp_type.setter def qp_type(self, val): self.attr.qp_type = val @property def sq_sig_all(self): return self.attr.sq_sig_all @sq_sig_all.setter def sq_sig_all(self, val): self.attr.sq_sig_all = val @property def comp_mask(self): return self.attr.comp_mask @comp_mask.setter def comp_mask(self, val): self.attr.comp_mask = val @property def pd(self): return self._pd @pd.setter def pd(self, PD val): self.attr.pd = val.pd self._pd = val @property def xrcd(self): return self.xrcd @xrcd.setter def xrcd(self, XRCD val): self.attr.xrcd = val.xrcd self.xrcd = val @property def srq(self): return self.srq @srq.setter def srq(self, SRQ val): self.attr.srq = val.srq self.srq = val @property def create_flags(self): return self.attr.create_flags @create_flags.setter def create_flags(self, val): self.attr.create_flags = val @property def max_tso_header(self): return self.attr.max_tso_header @max_tso_header.setter def max_tso_header(self, val): self.attr.max_tso_header = val @property def source_qpn(self): return self.attr.source_qpn @source_qpn.setter def source_qpn(self, val): self.attr.source_qpn = val @property def max_send_wr(self): return self.attr.cap.max_send_wr @max_send_wr.setter def max_send_wr(self, val): self.attr.cap.max_send_wr = val @property def max_recv_wr(self): return self.attr.cap.max_recv_wr @max_recv_wr.setter def max_recv_wr(self, val): self.attr.cap.max_recv_wr = val @property def max_send_sge(self): return self.attr.cap.max_send_sge @max_send_sge.setter def max_send_sge(self, val): self.attr.cap.max_send_sge = val @property def max_recv_sge(self): return self.attr.cap.max_recv_sge @max_recv_sge.setter def max_recv_sge(self, val): self.attr.cap.max_recv_sge = val @property def max_inline_data(self): return self.attr.cap.max_inline_data @max_inline_data.setter def max_inline_data(self, val): self.attr.cap.max_inline_data = val @property def ind_table(self): return self.ind_table @ind_table.setter def ind_table(self, RwqIndTable val): self.attr.rwq_ind_tbl = val.rwq_ind_table self.ind_table = val def mask_to_str(self, mask): comp_masks = {1: 'PD', 2: 'XRCD', 4: 'Create Flags', 8: 'Max TSO header', 16: 'Indirection Table', 32: 'RX hash'} mask_str = '' for f in comp_masks: if mask & f: mask_str += comp_masks[f] mask_str += ' ' return mask_str def flags_to_str(self, flags): create_flags = {1: 'Block self mcast loopback', 2: 'Scatter FCS', 4: 'CVLAN stripping', 8: 'Source QPN', 16: 'PCI write end padding'} create_str = '' for f in create_flags: if flags & f: create_str += create_flags[f] create_str += ' ' return create_str def __str__(self): print_format = '{:20}: {:<20}\n' return print_format.format('QP type', qp_type_to_str(self.qp_type)) +\ print_format.format('SQ sig. 
all', self.sq_sig_all) +\ 'QP caps:\n' +\ print_format.format(' max send WR', self.attr.cap.max_send_wr) +\ print_format.format(' max recv WR', self.attr.cap.max_recv_wr) +\ print_format.format(' max send SGE', self.attr.cap.max_send_sge) +\ print_format.format(' max recv SGE', self.attr.cap.max_recv_sge) +\ print_format.format(' max inline data', self.attr.cap.max_inline_data) +\ print_format.format('comp mask', self.mask_to_str(self.attr.comp_mask)) +\ print_format.format('create flags', self.flags_to_str(self.attr.create_flags)) +\ print_format.format('max TSO header', self.attr.max_tso_header) +\ print_format.format('Source QPN', self.attr.source_qpn) cdef class QPAttr(PyverbsObject): def __init__(self, qp_state=e.IBV_QPS_INIT, cur_qp_state=e.IBV_QPS_RESET, port_num=1, path_mtu=e.IBV_MTU_1024): """ Initializes a QPAttr object which represents an ibv_qp_attr struct. It can be used to modify a QP. This function initializes default values for a reset-to-init transition. :param qp_state: Desired QP state :param cur_qp_state: Current QP state :param port_num: Port number to use :param path_mtu: Path MTU to use :return: An initialized QPAttr object """ super().__init__() self.attr.qp_state = qp_state self.attr.cur_qp_state = cur_qp_state self.attr.port_num = port_num self.attr.path_mtu = path_mtu @property def qp_state(self): return self.attr.qp_state @qp_state.setter def qp_state(self, val): self.attr.qp_state = val @property def cur_qp_state(self): return self.attr.cur_qp_state @cur_qp_state.setter def cur_qp_state(self, val): self.attr.cur_qp_state = val @property def path_mtu(self): return self.attr.path_mtu @path_mtu.setter def path_mtu(self, val): self.attr.path_mtu = val @property def path_mig_state(self): return self.attr.path_mig_state @path_mig_state.setter def path_mig_state(self, val): self.attr.path_mig_state = val @property def qkey(self): return self.attr.qkey @qkey.setter def qkey(self, val): self.attr.qkey = val @property def rq_psn(self): return self.attr.rq_psn @rq_psn.setter def rq_psn(self, val): self.attr.rq_psn = val @property def sq_psn(self): return self.attr.sq_psn @sq_psn.setter def sq_psn(self, val): self.attr.sq_psn = val @property def dest_qp_num(self): return self.attr.dest_qp_num @dest_qp_num.setter def dest_qp_num(self, val): self.attr.dest_qp_num = val @property def qp_access_flags(self): return self.attr.qp_access_flags @qp_access_flags.setter def qp_access_flags(self, val): self.attr.qp_access_flags = val @property def cap(self): return QPCap(max_send_wr=self.attr.cap.max_send_wr, max_recv_wr=self.attr.cap.max_recv_wr, max_send_sge=self.attr.cap.max_send_sge, max_recv_sge=self.attr.cap.max_recv_sge, max_inline_data=self.attr.cap.max_inline_data) @cap.setter def cap(self, val): _copy_caps(val, self) @property def ah_attr(self): if self.attr.ah_attr.is_global: gid = gid_str(self.attr.ah_attr.grh.dgid._global.subnet_prefix, self.attr.ah_attr.grh.dgid._global.interface_id) g = GID(gid) gr = GlobalRoute(flow_label=self.attr.ah_attr.grh.flow_label, sgid_index=self.attr.ah_attr.grh.sgid_index, hop_limit=self.attr.ah_attr.grh.hop_limit, dgid=g, traffic_class=self.attr.ah_attr.grh.traffic_class) else: gr = None ah = AHAttr(dlid=self.attr.ah_attr.dlid, sl=self.attr.ah_attr.sl, port_num=self.attr.ah_attr.port_num, src_path_bits=self.attr.ah_attr.src_path_bits, static_rate=self.attr.ah_attr.static_rate, is_global=self.attr.ah_attr.is_global, gr=gr) return ah @ah_attr.setter def ah_attr(self, val): self._copy_ah(val) @property def alt_ah_attr(self): if self.attr.alt_ah_attr.is_global: gid =
gid_str(self.attr.alt_ah_attr.grh.dgid._global.subnet_prefix, self.attr.alt_ah_attr.grh.dgid._global.interface_id) g = GID(gid) gr = GlobalRoute(flow_label=self.attr.alt_ah_attr.grh.flow_label, sgid_index=self.attr.alt_ah_attr.grh.sgid_index, hop_limit=self.attr.alt_ah_attr.grh.hop_limit, dgid=g, traffic_class=self.attr.alt_ah_attr.grh.traffic_class) else: gr = None ah = AHAttr(dlid=self.attr.alt_ah_attr.dlid, port_num=self.attr.ah_attr.port_num, sl=self.attr.alt_ah_attr.sl, src_path_bits=self.attr.alt_ah_attr.src_path_bits, static_rate=self.attr.alt_ah_attr.static_rate, is_global=self.attr.alt_ah_attr.is_global, gr=gr) return ah @alt_ah_attr.setter def alt_ah_attr(self, val): self._copy_ah(val, True) def _copy_ah(self, AHAttr ah_attr, is_alt=False): if ah_attr is None: return if not is_alt: for i in range(16): self.attr.ah_attr.grh.dgid.raw[i] = \ ah_attr.ah_attr.grh.dgid.raw[i] self.attr.ah_attr.grh.flow_label = ah_attr.ah_attr.grh.flow_label self.attr.ah_attr.grh.sgid_index = ah_attr.ah_attr.grh.sgid_index self.attr.ah_attr.grh.hop_limit = ah_attr.ah_attr.grh.hop_limit self.attr.ah_attr.grh.traffic_class = \ ah_attr.ah_attr.grh.traffic_class self.attr.ah_attr.dlid = ah_attr.ah_attr.dlid self.attr.ah_attr.sl = ah_attr.ah_attr.sl self.attr.ah_attr.src_path_bits = ah_attr.ah_attr.src_path_bits self.attr.ah_attr.static_rate = ah_attr.ah_attr.static_rate self.attr.ah_attr.is_global = ah_attr.ah_attr.is_global self.attr.ah_attr.port_num = ah_attr.ah_attr.port_num else: for i in range(16): self.attr.alt_ah_attr.grh.dgid.raw[i] = \ ah_attr.ah_attr.grh.dgid.raw[i] self.attr.alt_ah_attr.grh.flow_label = \ ah_attr.ah_attr.grh.flow_label self.attr.alt_ah_attr.grh.sgid_index = \ ah_attr.ah_attr.grh.sgid_index self.attr.alt_ah_attr.grh.hop_limit = ah_attr.ah_attr.grh.hop_limit self.attr.alt_ah_attr.grh.traffic_class = \ ah_attr.ah_attr.grh.traffic_class self.attr.alt_ah_attr.dlid = ah_attr.ah_attr.dlid self.attr.alt_ah_attr.sl = ah_attr.ah_attr.sl self.attr.alt_ah_attr.src_path_bits = ah_attr.ah_attr.src_path_bits self.attr.alt_ah_attr.static_rate = ah_attr.ah_attr.static_rate self.attr.alt_ah_attr.is_global = ah_attr.ah_attr.is_global self.attr.alt_ah_attr.port_num = ah_attr.ah_attr.port_num @property def pkey_index(self): return self.attr.pkey_index @pkey_index.setter def pkey_index(self, val): self.attr.pkey_index = val @property def alt_pkey_index(self): return self.attr.alt_pkey_index @alt_pkey_index.setter def alt_pkey_index(self, val): self.attr.alt_pkey_index = val @property def en_sqd_async_notify(self): return self.attr.en_sqd_async_notify @en_sqd_async_notify.setter def en_sqd_async_notify(self, val): self.attr.en_sqd_async_notify = val @property def sq_draining(self): return self.attr.sq_draining @sq_draining.setter def sq_draining(self, val): self.attr.sq_draining = val @property def max_rd_atomic(self): return self.attr.max_rd_atomic @max_rd_atomic.setter def max_rd_atomic(self, val): self.attr.max_rd_atomic = val @property def max_dest_rd_atomic(self): return self.attr.max_dest_rd_atomic @max_dest_rd_atomic.setter def max_dest_rd_atomic(self, val): self.attr.max_dest_rd_atomic = val @property def min_rnr_timer(self): return self.attr.min_rnr_timer @min_rnr_timer.setter def min_rnr_timer(self, val): self.attr.min_rnr_timer = val @property def port_num(self): return self.attr.port_num @port_num.setter def port_num(self, val): self.attr.port_num = val @property def timeout(self): return self.attr.timeout @timeout.setter def timeout(self, val): self.attr.timeout = val @property def 
retry_cnt(self): return self.attr.retry_cnt @retry_cnt.setter def retry_cnt(self, val): self.attr.retry_cnt = val @property def rnr_retry(self): return self.attr.rnr_retry @rnr_retry.setter def rnr_retry(self, val): self.attr.rnr_retry = val @property def alt_port_num(self): return self.attr.alt_port_num @alt_port_num.setter def alt_port_num(self, val): self.attr.alt_port_num = val @property def alt_timeout(self): return self.attr.alt_timeout @alt_timeout.setter def alt_timeout(self, val): self.attr.alt_timeout = val @property def rate_limit(self): return self.attr.rate_limit @rate_limit.setter def rate_limit(self, val): self.attr.rate_limit = val def __str__(self): print_format = '{:22}: {:<20}\n' ah_format = ' {:22}: {:<20}\n' ident_format = ' {:22}: {:<20}\n' if self.attr.ah_attr.is_global: global_ah = ah_format.format('dgid', gid_str(self.attr.ah_attr.grh.dgid._global.subnet_prefix, self.attr.ah_attr.grh.dgid._global.interface_id)) +\ ah_format.format('flow label', self.attr.ah_attr.grh.flow_label) +\ ah_format.format('sgid index', self.attr.ah_attr.grh.sgid_index) +\ ah_format.format('hop limit', self.attr.ah_attr.grh.hop_limit) +\ ah_format.format('traffic_class', self.attr.ah_attr.grh.traffic_class) else: global_ah = '' if self.attr.alt_ah_attr.is_global: alt_global_ah = ah_format.format('dgid', gid_str(self.attr.alt_ah_attr.grh.dgid._global.subnet_prefix, self.attr.alt_ah_attr.grh.dgid._global.interface_id)) +\ ah_format.format('flow label', self.attr.alt_ah_attr.grh.flow_label) +\ ah_format.format('sgid index', self.attr.alt_ah_attr.grh.sgid_index) +\ ah_format.format('hop limit', self.attr.alt_ah_attr.grh.hop_limit) +\ ah_format.format('traffic_class', self.attr.alt_ah_attr.grh.traffic_class) else: alt_global_ah = '' return print_format.format('QP state', qp_state_to_str(self.attr.qp_state)) +\ print_format.format('QP current state', qp_state_to_str(self.attr.cur_qp_state)) +\ print_format.format('Path MTU', mtu_to_str(self.attr.path_mtu)) +\ print_format.format('Path mig. state', mig_state_to_str(self.attr.path_mig_state)) +\ print_format.format('QKey', self.attr.qkey) +\ print_format.format('RQ PSN', self.attr.rq_psn) +\ print_format.format('SQ PSN', self.attr.sq_psn) +\ print_format.format('Dest QP number', self.attr.dest_qp_num) +\ print_format.format('QP access flags', access_flags_to_str(self.attr.qp_access_flags)) +\ 'QP caps:\n' +\ ident_format.format('max send WR', self.attr.cap.max_send_wr) +\ ident_format.format('max recv WR', self.attr.cap.max_recv_wr) +\ ident_format.format('max send SGE', self.attr.cap.max_send_sge) +\ ident_format.format('max recv SGE', self.attr.cap.max_recv_sge) +\ ident_format.format('max inline data', self.attr.cap.max_inline_data) +\ 'AH Attr:\n' +\ ident_format.format('port num', self.attr.ah_attr.port_num) +\ ident_format.format('sl', self.attr.ah_attr.sl) +\ ident_format.format('source path bits', self.attr.ah_attr.src_path_bits) +\ ident_format.format('dlid', self.attr.ah_attr.dlid) +\ ident_format.format('port num', self.attr.ah_attr.port_num) +\ ident_format.format('static rate', self.attr.ah_attr.static_rate) +\ ident_format.format('is global', self.attr.ah_attr.is_global) +\ global_ah +\ 'Alt. 
AH Attr:\n' +\ ident_format.format('port num', self.attr.alt_ah_attr.port_num) +\ ident_format.format('sl', self.attr.alt_ah_attr.sl) +\ ident_format.format('source path bits', self.attr.alt_ah_attr.src_path_bits) +\ ident_format.format('dlid', self.attr.alt_ah_attr.dlid) +\ ident_format.format('port num', self.attr.alt_ah_attr.port_num) +\ ident_format.format('static rate', self.attr.alt_ah_attr.static_rate) +\ ident_format.format('is global', self.attr.alt_ah_attr.is_global) +\ alt_global_ah +\ print_format.format('PKey index', self.attr.pkey_index) +\ print_format.format('Alt. PKey index', self.attr.alt_pkey_index) +\ print_format.format('En. SQD async notify', self.attr.en_sqd_async_notify) +\ print_format.format('SQ draining', self.attr.sq_draining) +\ print_format.format('Max RD atomic', self.attr.max_rd_atomic) +\ print_format.format('Max dest. RD atomic', self.attr.max_dest_rd_atomic) +\ print_format.format('Min RNR timer', self.attr.min_rnr_timer) +\ print_format.format('Port number', self.attr.port_num) +\ print_format.format('Timeout', self.attr.timeout) +\ print_format.format('Retry counter', self.attr.retry_cnt) +\ print_format.format('RNR retry', self.attr.rnr_retry) +\ print_format.format('Alt. port number', self.attr.alt_port_num) +\ print_format.format('Alt. timeout', self.attr.alt_timeout) +\ print_format.format('Rate limit', self.attr.rate_limit) cdef class ECE(PyverbsCM): def __init__(self, vendor_id=0, options=0, comp_mask=0): """ :param vendor_id: Unique identifier of the provider vendor. :param options: Provider specific attributes which are supported or needed to be enabled by ECE users. :param comp_mask: A bitmask specifying which ECE options should be valid. """ super().__init__() self.ece.vendor_id = vendor_id self.ece.options = options self.ece.comp_mask = comp_mask @property def vendor_id(self): return self.ece.vendor_id @vendor_id.setter def vendor_id(self, val): self.ece.vendor_id = val @property def options(self): return self.ece.options @options.setter def options(self, val): self.ece.options = val @property def comp_mask(self): return self.ece.comp_mask @comp_mask.setter def comp_mask(self, val): self.ece.comp_mask = val def __str__(self): print_format = '{:22}: 0x{:<20x}\n' return 'ECE:\n' +\ print_format.format('Vendor ID', self.ece.vendor_id) +\ print_format.format('Options', self.ece.options) +\ print_format.format('Comp Mask', self.ece.comp_mask) cdef class QP(PyverbsCM): def __init__(self, object creator not None, object init_attr not None, QPAttr qp_attr=None): """ Initializes a QP object and performs state transitions according to user request. A C ibv_qp object will be created using the provided init_attr. If a qp_attr object is provided, pyverbs will consider this a hint to transit the QP's state as far as possible towards RTS: - In case of UD and Raw Packet QP types, if a qp_attr is provided the QP will be returned in RTS state. - In case of connected QPs (RC, UC), remote QPN is needed for INIT2RTR transition, so if a qp_attr is provided, the QP will be returned in INIT state. :param creator: The object creating the QP. Can be of type PD so ibv_create_qp will be used or of type Context, so ibv_create_qp_ex will be used. :param init_attr: QP initial attributes of type QPInitAttr (when created using PD) or QPInitAttrEx (when created using Context). :param qp_attr: Optional QPAttr object. Will be used for QP state transitions after creation. 
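        A minimal creation sketch (a hedged example, not part of the verbs
        API; the device name and queue sizes are hypothetical, and error
        handling/cleanup are omitted):
            ctx = Context(name='mlx5_0')
            pd = PD(ctx)
            cq = CQ(ctx, 16)
            init = QPInitAttr(qp_type=e.IBV_QPT_UD, scq=cq, rcq=cq,
                              cap=QPCap(max_send_wr=4, max_recv_wr=4))
            qp = QP(pd, init, QPAttr())  # UD + qp_attr -> QP reaches RTS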
:return: An initialized QP object """ cdef PD pd cdef Context ctx super().__init__() self.mws = weakref.WeakSet() self.flows = weakref.WeakSet() self.dr_actions = weakref.WeakSet() self.update_cqs(init_attr) # QP initialization was not done by the provider, we should do it here if self.qp == NULL: # In order to use cdef'd methods, a proper casting must be done, # let's infer the type. if issubclass(type(creator), Context): self._create_qp_ex(creator, init_attr) if self.qp == NULL: raise PyverbsRDMAErrno('Failed to create QP') ctx = creator self.context = ctx ctx.add_ref(self) if init_attr.pd is not None: pd = init_attr.pd pd.add_ref(self) self.pd = pd if init_attr.xrcd is not None: xrcd = init_attr.xrcd xrcd.add_ref(self) self.xrcd = xrcd else: self._create_qp(creator, init_attr) if self.qp == NULL: raise PyverbsRDMAErrno('Failed to create QP') pd = creator self.pd = pd pd.add_ref(self) self.context = None if init_attr.srq is not None: srq = init_attr.srq srq.add_ref(self) self.srq = srq if qp_attr is not None: funcs = {e.IBV_QPT_RC: self.to_init, e.IBV_QPT_UC: self.to_init, e.IBV_QPT_UD: self.to_rts, e.IBV_QPT_XRC_RECV: self.to_init, e.IBV_QPT_XRC_SEND: self.to_init, e.IBV_QPT_RAW_PACKET: self.to_rts} funcs[self.qp.qp_type](qp_attr) cdef update_cqs(self, init_attr): cdef CQ cq cdef CQEX cqex if init_attr.send_cq is not None: if type(init_attr.send_cq) == CQ: cq = init_attr.send_cq cq.add_ref(self) self.scq = cq else: cqex = init_attr.send_cq cqex.add_ref(self) self.scq = cqex if init_attr.send_cq != init_attr.recv_cq and init_attr.recv_cq is not None: if type(init_attr.recv_cq) == CQ: cq = init_attr.recv_cq cq.add_ref(self) self.rcq = cq else: cqex = init_attr.recv_cq cqex.add_ref(self) self.rcq = cqex def _create_qp(self, PD pd, QPInitAttr attr): self.qp = v.ibv_create_qp(pd.pd, &attr.attr) def _create_qp_ex(self, Context ctx, QPInitAttrEx attr): self.qp = v.ibv_create_qp_ex(ctx.context, &attr.attr) cdef add_ref(self, obj): if isinstance(obj, MW): self.mws.add(obj) elif isinstance(obj, Flow): self.flows.add(obj) else: raise PyverbsError('Unrecognized object type') def __dealloc__(self): self.close() cpdef close(self): if self.qp != NULL: if self.logger: self.logger.debug('Closing QP') close_weakrefs([self.mws, self.flows, self.dr_actions]) rc = v.ibv_destroy_qp(self.qp) if rc: raise PyverbsRDMAError('Failed to destroy QP', rc) self.qp = NULL self.pd = None self.context = None self.scq = None self.rcq = None def _get_comp_mask(self, dst): masks = {e.IBV_QPT_RC: {'INIT': e.IBV_QP_PKEY_INDEX | e.IBV_QP_PORT |\ e.IBV_QP_ACCESS_FLAGS, 'RTR': e.IBV_QP_AV |\ e.IBV_QP_PATH_MTU | e.IBV_QP_DEST_QPN |\ e.IBV_QP_RQ_PSN |\ e.IBV_QP_MAX_DEST_RD_ATOMIC |\ e.IBV_QP_MIN_RNR_TIMER, 'RTS': e.IBV_QP_TIMEOUT |\ e.IBV_QP_RETRY_CNT | e.IBV_QP_RNR_RETRY |\ e.IBV_QP_SQ_PSN | e.IBV_QP_MAX_QP_RD_ATOMIC}, e.IBV_QPT_UC: {'INIT': e.IBV_QP_PKEY_INDEX | e.IBV_QP_PORT |\ e.IBV_QP_ACCESS_FLAGS, 'RTR': e.IBV_QP_AV |\ e.IBV_QP_PATH_MTU | e.IBV_QP_DEST_QPN |\ e.IBV_QP_RQ_PSN, 'RTS': e.IBV_QP_SQ_PSN}, e.IBV_QPT_UD: {'INIT': e.IBV_QP_PKEY_INDEX | e.IBV_QP_PORT |\ e.IBV_QP_QKEY, 'RTR': 0, 'RTS': e.IBV_QP_SQ_PSN}, e.IBV_QPT_RAW_PACKET: {'INIT': e.IBV_QP_PORT, 'RTR': 0, 'RTS': 0}, e.IBV_QPT_XRC_RECV: {'INIT': e.IBV_QP_PKEY_INDEX |\ e.IBV_QP_PORT | e.IBV_QP_ACCESS_FLAGS, 'RTR': e.IBV_QP_AV | e.IBV_QP_PATH_MTU |\ e.IBV_QP_DEST_QPN | e.IBV_QP_RQ_PSN | \ e.IBV_QP_MAX_DEST_RD_ATOMIC |\ e.IBV_QP_MIN_RNR_TIMER, 'RTS': e.IBV_QP_TIMEOUT | e.IBV_QP_SQ_PSN }, e.IBV_QPT_XRC_SEND: {'INIT': e.IBV_QP_PKEY_INDEX |\ e.IBV_QP_PORT | 
e.IBV_QP_ACCESS_FLAGS, 'RTR': e.IBV_QP_AV | e.IBV_QP_PATH_MTU |\ e.IBV_QP_DEST_QPN | e.IBV_QP_RQ_PSN, 'RTS': e.IBV_QP_TIMEOUT |\ e.IBV_QP_RETRY_CNT | e.IBV_QP_RNR_RETRY |\ e.IBV_QP_SQ_PSN | e.IBV_QP_MAX_QP_RD_ATOMIC}} return masks[self.qp.qp_type][dst] | e.IBV_QP_STATE def to_init(self, QPAttr qp_attr): """ Modify the current QP's state to INIT. If the current state doesn't support transition to INIT, an exception will be raised. The comp mask provided to the kernel includes the needed bits for 2INIT transition for this QP type. :param qp_attr: QPAttr object containing the needed attributes for 2INIT transition :return: None """ mask = self._get_comp_mask('INIT') qp_attr.qp_state = e.IBV_QPS_INIT rc = v.ibv_modify_qp(self.qp, &qp_attr.attr, mask) if rc != 0: raise PyverbsRDMAError('Failed to modify QP state to init', rc) def to_rtr(self, QPAttr qp_attr): """ Modify the current QP's state to RTR. It assumes that its current state is INIT or RESET, in which case it will attempt a transition to INIT prior to transition to RTR. As a result, if current state doesn't support transition to INIT, an exception will be raised. The comp mask provided to the kernel includes the needed bits for 2RTR transition for this QP type. :param qp_attr: QPAttr object containing the needed attributes for 2RTR transition. :return: None """ if self.qp_state != e.IBV_QPS_INIT: #assume reset self.to_init(qp_attr) mask = self._get_comp_mask('RTR') qp_attr.qp_state = e.IBV_QPS_RTR rc = v.ibv_modify_qp(self.qp, &qp_attr.attr, mask) if rc != 0: raise PyverbsRDMAError('Failed to modify QP state to RTR', rc) def to_rts(self, QPAttr qp_attr): """ Modify the current QP's state to RTS. It assumes that its current state is either RTR, INIT or RESET. If current state is not RTR, to_rtr() will be called. The comp mask provided to the kernel includes the needed bits for 2RTS transition for this QP type. :param qp_attr: QPAttr object containing the needed attributes for 2RTS transition. :return: None """ if self.qp_state != e.IBV_QPS_RTR: #assume reset/init self.to_rtr(qp_attr) mask = self._get_comp_mask('RTS') qp_attr.qp_state = e.IBV_QPS_RTS rc = v.ibv_modify_qp(self.qp, &qp_attr.attr, mask) if rc != 0: raise PyverbsRDMAError('Failed to modify QP state to RTS', rc) def query(self, attr_mask): """ Query the QP :param attr_mask: The minimum list of attributes to retrieve. Some devices may return additional attributes as well (see enum ibv_qp_attr_mask) :return: (QPAttr, QPInitAttr) tuple containing the QP requested attributes """ attr = QPAttr() init_attr = QPInitAttr() rc = v.ibv_query_qp(self.qp, &attr.attr, attr_mask, &init_attr.attr) if rc != 0: raise PyverbsRDMAError('Failed to query QP', rc) return attr, init_attr def modify(self, QPAttr qp_attr not None, comp_mask): """ Modify the QP :param qp_attr: A QPAttr object with updated values to be applied to the QP :param comp_mask: A bitmask specifying which QP attributes should be modified (see enum ibv_qp_attr_mask) :return: None """ rc = v.ibv_modify_qp(self.qp, &qp_attr.attr, comp_mask) if rc != 0: raise PyverbsRDMAError('Failed to modify QP', rc) def post_recv(self, RecvWR wr not None, RecvWR bad_wr=None): """ Post a receive WR on the QP. :param wr: The work request to post :param bad_wr: A RecvWR object to hold the bad WR if it is available in case of a failure :return: None """ cdef v.ibv_recv_wr *my_bad_wr # In order to provide a pointer to a pointer, use a temporary cdef'ed # variable. 
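        # On failure, ibv_post_recv() leaves my_bad_wr pointing at the first
        # work request that could not be posted; it is copied into the
        # optional user-supplied bad_wr below so the caller can tell where
        # posting stopped.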
rc = v.ibv_post_recv(self.qp, &wr.recv_wr, &my_bad_wr) if rc != 0: if (bad_wr): memcpy(&bad_wr.recv_wr, my_bad_wr, sizeof(bad_wr.recv_wr)) raise PyverbsRDMAError('Failed to post recv', rc) def post_send(self, SendWR wr not None, SendWR bad_wr=None): """ Post a send WR on the QP. :param wr: The work request to post :param bad_wr: A SendWR object to hold the bad WR if it is available in case of a failure :return: None """ # In order to provide a pointer to a pointer, use a temporary cdef'ed # variable. cdef v.ibv_send_wr *my_bad_wr rc = v.ibv_post_send(self.qp, &wr.send_wr, &my_bad_wr) if rc != 0: if (bad_wr): memcpy(&bad_wr.send_wr, my_bad_wr, sizeof(bad_wr.send_wr)) raise PyverbsRDMAError('Failed to post send', rc) def set_ece(self, ECE ece): """ Set ECE options and use them for QP configuration stage :param ece: The requested ECE values. :return: None """ if ece.ece.vendor_id == 0: return rc = v.ibv_set_ece(self.qp, &ece.ece) if rc != 0: raise PyverbsRDMAError('Failed to set ECE', rc) def query_ece(self): """ Query QPs ECE options :return: ECE object with this QP ece configuration. """ ece = ECE() rc = v.ibv_query_ece(self.qp, &ece.ece) if rc != 0: raise PyverbsRDMAError('Failed to query ECE', rc) return ece def bind_mw(self, MW mw not None, MWBind mw_bind): """ Bind Memory window type 1. :param mw: The memory window to bind. :param mw_bind: MWBind object, includes the bind attributes. :return: None """ rc = v.ibv_bind_mw(self.qp, mw.mw, &mw_bind.mw_bind) if rc != 0: raise PyverbsRDMAError('Failed to Bind MW', rc) def query_data_in_order(self, op, flags=0): """ Query if QP data is guaranteed to be in order. :param op: Operation type. :param flags: Flags are used to select a query type. For IBV_QUERY_QP_DATA_IN_ORDER_RETURN_CAPS, the function will return a capabilities vector. If 0, will query for IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG support and return 0/1 result. (see enum ibv_query_qp_data_in_order_flags) :return: Return value is determined by flags. For each capability bit, 1 is returned if the data is guaranteed to be written in-order for selected operation and type, 0 otherwise. (see enum ibv_query_qp_data_in_order_caps) """ return v.ibv_query_qp_data_in_order(self.qp, op, flags) @property def qp_type(self): return self.qp.qp_type @property def qp_state(self): return self.qp.state @property def qp_num(self): return self.qp.qp_num def __str__(self): print_format = '{:22}: {:<20}\n' return print_format.format('QP type', qp_type_to_str(self.qp_type)) +\ print_format.format(' number', self.qp_num) +\ print_format.format(' state', qp_state_to_str(self.qp_state)) cdef class DataBuffer(PyverbsCM): def __init__(self, addr, length): super().__init__() self.data.addr = PyLong_AsVoidPtr(addr) self.data.length = length cdef class QPEx(QP): def __init__(self, object creator not None, object init_attr not None, QPAttr qp_attr=None): """ Initializes a QPEx object. Since this is an extension of a QP, QP creation is done in the parent class. The extended QP is retrieved by casting the ibv_qp to ibv_qp_ex. 
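        Note that the extended WR interface (qp_ex) is only usable when the
        QP is created with IBV_QP_INIT_ATTR_SEND_OPS_FLAGS set in comp_mask,
        e.g. (a hedged sketch; ctx, pd and cq are assumed to exist and the
        flag values are illustrative):
            init_ex = QPInitAttrEx(qp_type=e.IBV_QPT_RC, scq=cq, rcq=cq,
                                   pd=pd,
                                   comp_mask=e.IBV_QP_INIT_ATTR_PD |
                                             e.IBV_QP_INIT_ATTR_SEND_OPS_FLAGS,
                                   send_ops_flags=e.IBV_QP_EX_WITH_SEND)
            qp_ex = QPEx(ctx, init_ex)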
:return: An initialized QPEx object """ super().__init__(creator, init_attr, qp_attr) if init_attr.ind_table is not None: ind_table = init_attr.ind_table ind_table.add_ref(self) self.ind_table = ind_table if init_attr.comp_mask & v.IBV_QP_INIT_ATTR_SEND_OPS_FLAGS: self.qp_ex = v.ibv_qp_to_qp_ex(self.qp) if self.qp_ex == NULL: raise PyverbsRDMAErrno('Failed to create extended QP') else: self.logger.debug('qp_ex is not accessible since IBV_QP_INIT_ATTR_SEND_OPS_FLAGS was not passed.') @property def comp_mask(self): return self.qp_ex.comp_mask @comp_mask.setter def comp_mask(self, val): self.qp_ex.comp_mask = val @property def wr_id(self): return self.qp_ex.wr_id @wr_id.setter def wr_id(self, val): self.qp_ex.wr_id = val @property def wr_flags(self): return self.qp_ex.wr_flags @wr_flags.setter def wr_flags(self, val): self.qp_ex.wr_flags = val @property def ind_table(self): return self.ind_table def wr_atomic_cmp_swp(self, rkey, remote_addr, compare, swap): v.ibv_wr_atomic_cmp_swp(self.qp_ex, rkey, remote_addr, compare, swap) def wr_atomic_fetch_add(self, rkey, remote_addr, add): v.ibv_wr_atomic_fetch_add(self.qp_ex, rkey, remote_addr, add) def wr_bind_mw(self, MW mw, rkey, MWBindInfo bind_info): cdef v.ibv_mw_bind_info *info info = &bind_info.info v.ibv_wr_bind_mw(self.qp_ex, mw.mw, rkey, info) self.add_ref(mw) def wr_local_inv(self, invalidate_rkey): v.ibv_wr_local_inv(self.qp_ex, invalidate_rkey) def wr_atomic_write(self, rkey, remote_addr, atomic_wr): cdef char *atomic_wr_c = atomic_wr v.ibv_wr_atomic_write(self.qp_ex, rkey, remote_addr, atomic_wr_c) def wr_rdma_read(self, rkey, remote_addr): v.ibv_wr_rdma_read(self.qp_ex, rkey, remote_addr) def wr_rdma_write(self, rkey, remote_addr): v.ibv_wr_rdma_write(self.qp_ex, rkey, remote_addr) def wr_rdma_write_imm(self, rkey, remote_addr, data): cdef unsigned int imm_data = htobe32(data) v.ibv_wr_rdma_write_imm(self.qp_ex, rkey, remote_addr, imm_data) def wr_flush(self, rkey, remote_addr, length, ptype, level): v.ibv_wr_flush(self.qp_ex, rkey, remote_addr, length, ptype, level) def wr_send(self): v.ibv_wr_send(self.qp_ex) def wr_send_imm(self, data): cdef unsigned int imm_data = htobe32(data) return v.ibv_wr_send_imm(self.qp_ex, imm_data) def wr_send_inv(self, invalidate_rkey): v.ibv_wr_send_inv(self.qp_ex, invalidate_rkey) def wr_send_tso(self, hdr, hdr_sz, mss): ptr = PyLong_AsVoidPtr(hdr) v.ibv_wr_send_tso(self.qp_ex, ptr, hdr_sz, mss) def wr_set_ud_addr(self, AH ah, remote_qpn, remote_qkey): v.ibv_wr_set_ud_addr(self.qp_ex, ah.ah, remote_qpn, remote_qkey) def wr_set_xrc_srqn(self, remote_srqn): v.ibv_wr_set_xrc_srqn(self.qp_ex, remote_srqn) def wr_set_inline_data(self, addr, length): ptr = PyLong_AsVoidPtr(addr) v.ibv_wr_set_inline_data(self.qp_ex, ptr, length) def wr_set_inline_data_list(self, num_buf, buf_list): cdef v.ibv_data_buf *data = NULL data = <v.ibv_data_buf *>malloc(num_buf * sizeof(v.ibv_data_buf)) if data == NULL: raise PyverbsError('Failed to allocate data buffer') for i in range(num_buf): data_buf = buf_list[i] data[i].addr = data_buf.data.addr data[i].length = data_buf.data.length v.ibv_wr_set_inline_data_list(self.qp_ex, num_buf, data) free(data) def wr_set_sge(self, SGE sge not None): v.ibv_wr_set_sge(self.qp_ex, sge.lkey, sge.addr, sge.length) def wr_set_sge_list(self, num_sge, sg_list): cdef v.ibv_sge *sge = NULL sge = <v.ibv_sge *>malloc(num_sge * sizeof(v.ibv_sge)) if sge == NULL: raise PyverbsError('Failed to allocate SGE buffer') for i in range(num_sge): sge[i].addr = sg_list[i].addr sge[i].length = sg_list[i].length sge[i].lkey = sg_list[i].lkey
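        # ibv_wr_set_sge_list() copies the scatter/gather entries into the
        # WQE during the call, so the temporary C array can be freed right
        # after it returns.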
v.ibv_wr_set_sge_list(self.qp_ex, num_sge, sge) free(sge) def wr_start(self): v.ibv_wr_start(self.qp_ex) def wr_complete(self): rc = v.ibv_wr_complete(self.qp_ex) if rc != 0: raise PyverbsRDMAError('ibv_wr_complete failed.', rc) def wr_abort(self): v.ibv_wr_abort(self.qp_ex) def _copy_caps(QPCap src, dst): """ Copy the QPCaps values of src into the inner ibv_qp_cap struct of dst. Since both ibv_qp_init_attr and ibv_qp_attr have an inner ibv_qp_cap inner struct, they can both be used. :param src: A QPCap object :param dst: A QPInitAttr / QPInitAttrEx / QPAttr object :return: None """ # we're assigning to C structs here, we must have type-specific objects in # order to do that. Instead of having this function smaller but in 3 # classes, it appears here once. cdef QPInitAttr qia cdef QPInitAttrEx qiae cdef QPAttr qa if src is None: return if type(dst) == QPInitAttr: qia = dst qia.attr.cap.max_send_wr = src.cap.max_send_wr qia.attr.cap.max_recv_wr = src.cap.max_recv_wr qia.attr.cap.max_send_sge = src.cap.max_send_sge qia.attr.cap.max_recv_sge = src.cap.max_recv_sge qia.attr.cap.max_inline_data = src.cap.max_inline_data elif type(dst) == QPInitAttrEx: qiae = dst qiae.attr.cap.max_send_wr = src.cap.max_send_wr qiae.attr.cap.max_recv_wr = src.cap.max_recv_wr qiae.attr.cap.max_send_sge = src.cap.max_send_sge qiae.attr.cap.max_recv_sge = src.cap.max_recv_sge qiae.attr.cap.max_inline_data = src.cap.max_inline_data else: qa = dst qa.attr.cap.max_send_wr = src.cap.max_send_wr qa.attr.cap.max_recv_wr = src.cap.max_recv_wr qa.attr.cap.max_send_sge = src.cap.max_send_sge qa.attr.cap.max_recv_sge = src.cap.max_recv_sge qa.attr.cap.max_inline_data = src.cap.max_inline_data rdma-core-56.1/pyverbs/spec.pxd000066400000000000000000000014451477342711600164770ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia All rights reserved. #cython: language_level=3 from pyverbs.base cimport PyverbsObject cimport pyverbs.libibverbs as v cdef class Spec(PyverbsObject): cdef object spec_type cdef unsigned short size cpdef _copy_data(self, unsigned long ptr) cdef class EthSpec(Spec): cdef v.ibv_flow_eth_filter val cdef v.ibv_flow_eth_filter mask cdef _mac_to_str(self, unsigned char mac[6]) cdef class Ipv4ExtSpec(Spec): cdef v.ibv_flow_ipv4_ext_filter val cdef v.ibv_flow_ipv4_ext_filter mask cdef class TcpUdpSpec(Spec): cdef v.ibv_flow_tcp_udp_filter val cdef v.ibv_flow_tcp_udp_filter mask cdef class Ipv6Spec(Spec): cdef v.ibv_flow_ipv6_filter val cdef v.ibv_flow_ipv6_filter mask rdma-core-56.1/pyverbs/spec.pyx000066400000000000000000000500361477342711600165240ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia. All rights reserved. from pyverbs.pyverbs_error import PyverbsError from libc.string cimport memcpy import socket, struct U32_MASK = 0xffffffff cdef class Spec(PyverbsObject): """ Abstract class for all the specs to derive from. """ def __init__(self): raise NotImplementedError('This class is abstract.') @property def size(self): return self.size cpdef _copy_data(self, unsigned long ptr): """ memcpy the spec to the provided address in proper order. This function must be implemented in each subclass. 
:param ptr: address to copy spec to """ raise NotImplementedError('Must be implemented in subclass.') def __str__(self): return f"{'Spec type':<16}: {self.type_to_str(self.spec_type):<20}\n" \ f"{'Size':<16}: {self.size:<20}\n" @staticmethod def type_to_str(spec_type): types = {v.IBV_FLOW_SPEC_ETH : 'IBV_FLOW_SPEC_ETH', v.IBV_FLOW_SPEC_IPV4_EXT : "IBV_FLOW_SPEC_IPV4_EXT", v.IBV_FLOW_SPEC_IPV6 : "IBV_FLOW_SPEC_IPV6", v.IBV_FLOW_SPEC_TCP : "IBV_FLOW_SPEC_TCP", v.IBV_FLOW_SPEC_UDP : "IBV_FLOW_SPEC_UDP"} res_str = "" if spec_type & v.IBV_FLOW_SPEC_INNER: res_str += 'IBV_FLOW_SPEC_INNER ' try: s_type = spec_type & ~v.IBV_FLOW_SPEC_INNER res_str += types[s_type] except KeyError: raise PyverbsError(f'This type {s_type} is not implemented yet') return res_str @staticmethod def _set_val_mask(default_mask, val=None, val_mask=None): """ If value is given without val_mask, default_mask will be returned. :param default_mask: default mask to set if not provided :param val: user provided value :param val_mask: user provided mask :return: resulting value and mask """ res_val = 0 res_mask = 0 if val is not None: res_val = val res_mask = default_mask if val_mask is None else val_mask return res_val, res_mask cdef class EthSpec(Spec): MAC_LEN = 6 MAC_MASK = ('ff:' * MAC_LEN)[:-1] ZERO_MAC = [0] * MAC_LEN def __init__(self, dst_mac=None, dst_mac_mask=None, src_mac=None, src_mac_mask=None, ether_type=None, ether_type_mask=None, vlan_tag=None, vlan_tag_mask=None, is_inner=0): """ Initialize an EthSpec object over an underlying ibv_flow_spec_eth C object that defines Ethernet header specifications for steering flow to match on. :param dst_mac: destination mac to match on (e.g. 'aa:bb:12:13:14:fe') :param dst_mac_mask: destination mac mask (e.g. 'ff:ff:ff:ff:ff:ff') :param src_mac: source mac to match on :param src_mac_mask: source mac mask :param ether_type: ethertype to match on :param ether_type_mask: ethertype mask :param vlan_tag: VLAN tag to match on :param vlan_tag_mask: VLAN tag mask :param is_inner: is inner spec """ self.spec_type = v.IBV_FLOW_SPEC_ETH if is_inner: self.spec_type |= v.IBV_FLOW_SPEC_INNER self.size = sizeof(v.ibv_flow_spec_eth) self.dst_mac, self.dst_mac_mask = self._set_val_mask(self.MAC_MASK, dst_mac, dst_mac_mask) self.src_mac, self.src_mac_mask = self._set_val_mask(self.MAC_MASK, src_mac, src_mac_mask) self.val.ether_type, self.mask.ether_type = \ map(socket.htons, self._set_val_mask(0xffff, ether_type, ether_type_mask)) self.val.vlan_tag, self.mask.vlan_tag = \ map(socket.htons, self._set_val_mask(0xffff, vlan_tag, vlan_tag_mask)) cdef _mac_to_str(self, unsigned char mac[6]): s = '' if len(mac) == 0: return s # Building string from array # [0xa, 0x1b, 0x2c, 0x3c, 0x4d, 0x5e] -> "0a:1b:2c:3c:4d:5e" for i in range(self.MAC_LEN): s += hex(mac[i])[2:].zfill(2) + ':' return s[:-1] def _set_mac(self, val): mac = EthSpec.ZERO_MAC[:] if val: s = val.split(':') for i in range(self.MAC_LEN): mac[i] = int(s[i], 16) return mac @property def dst_mac(self): return self._mac_to_str(self.val.dst_mac) @dst_mac.setter def dst_mac(self, val): self.val.dst_mac = self._set_mac(val) @property def dst_mac_mask(self): return self._mac_to_str(self.mask.dst_mac) @dst_mac_mask.setter def dst_mac_mask(self, val): self.mask.dst_mac = self._set_mac(val) @property def src_mac(self): return self._mac_to_str(self.val.src_mac) @src_mac.setter def src_mac(self, val): self.val.src_mac = self._set_mac(val) @property def src_mac_mask(self): return self._mac_to_str(self.mask.src_mac) @src_mac_mask.setter def
src_mac_mask(self, val): self.mask.src_mac = self._set_mac(val) @property def ether_type(self): return socket.ntohs(self.val.ether_type) @ether_type.setter def ether_type(self, val): self.val.ether_type = socket.htons(val) @property def ether_type_mask(self): return socket.ntohs(self.mask.ether_type) @ether_type_mask.setter def ether_type_mask(self, val): self.mask.ether_type = socket.htons(val) @property def vlan_tag(self): return socket.ntohs(self.val.vlan_tag) @vlan_tag.setter def vlan_tag(self, val): self.val.vlan_tag = socket.htons(val) @property def vlan_tag_mask(self): return socket.ntohs(self.mask.vlan_tag) @vlan_tag_mask.setter def vlan_tag_mask(self, val): self.mask.vlan_tag = socket.htons(val) def __str__(self): return super().__str__() + \ f"{'Src mac':<16}: {self.src_mac:<20} {self.src_mac_mask:<20}\n" \ f"{'Dst mac':<16}: {self.dst_mac:<20} {self.dst_mac_mask:<20}\n" \ f"{'Ether type':<16}: {self.val.ether_type:<20} " \ f"{self.mask.ether_type:<20}\n" \ f"{'Vlan tag':<16}: {self.val.vlan_tag:<20} " \ f"{self.mask.vlan_tag:<20}\n" cpdef _copy_data(self, unsigned long ptr): cdef v.ibv_flow_spec_eth eth eth.size = self.size eth.type = self.spec_type eth.val = self.val eth.mask = self.mask memcpy(ptr, ð, self.size) cdef class Ipv4ExtSpec(Spec): def __init__(self, dst_ip=None, dst_ip_mask=None, src_ip=None, src_ip_mask=None, proto=None, proto_mask=None, tos=None, tos_mask=None, ttl=None, ttl_mask=None, flags=None, flags_mask=None, is_inner=False): """ Initialize an Ipv4ExtSpec object over an underlying ibv_flow_ipv4_ext C object that defines IPv4 header specifications for steering flow to match on. :param dst_ip: Destination IP to match on (e.g. '1.2.3.4') :param dst_ip_mask: Destination IP mask (e.g. '255.255.255.255') :param src_ip: source IP to match on :param src_ip_mask: Source IP mask :param proto: Protocol to match on :param proto_mask: Protocol mask :param tos: Type of service to match on :param tos_mask: Type of service mask :param ttl: Time to live to match on :param ttl_mask: Time to live mask :param flags: Flags to match on :param flags_mask: Flags mask :param is_inner: Is inner spec """ self.spec_type = v.IBV_FLOW_SPEC_IPV4_EXT if is_inner: self.spec_type |= v.IBV_FLOW_SPEC_INNER self.size = sizeof(v.ibv_flow_spec_ipv4_ext) self.val.dst_ip, self.mask.dst_ip = \ map(socket.htonl, self._set_val_mask(U32_MASK, self._str_to_ip(dst_ip), self._str_to_ip(dst_ip_mask))) self.val.src_ip, self.mask.src_ip = \ map(socket.htonl, self._set_val_mask(U32_MASK, self._str_to_ip(src_ip), self._str_to_ip(src_ip_mask))) self.val.proto, self.mask.proto = self._set_val_mask(0xff, proto, proto_mask) self.val.tos, self.mask.tos = self._set_val_mask(0xff, tos, tos_mask) self.val.ttl, self.mask.ttl = self._set_val_mask(0xff, ttl, ttl_mask) self.val.flags, self.mask.flags = self._set_val_mask(0xff, flags, flags_mask) @staticmethod def _str_to_ip(ip_str): return None if ip_str is None else \ struct.unpack('!L', socket.inet_aton(ip_str))[0] @staticmethod def _ip_to_str(ip): return socket.inet_ntoa(struct.pack('!L', ip)) @property def dst_ip(self): return self._ip_to_str(socket.ntohl(self.val.dst_ip)) @dst_ip.setter def dst_ip(self, val): self.val.dst_ip = socket.htonl(self._str_to_ip(val)) @property def dst_ip_mask(self): return self._ip_to_str(socket.ntohl(self.mask.dst_ip)) @dst_ip_mask.setter def dst_ip_mask(self, val): self.mask.dst_ip = socket.htonl(self._str_to_ip(val)) @property def src_ip(self): return self._ip_to_str(socket.ntohl(self.val.src_ip)) @src_ip.setter def src_ip(self, val): 
self.val.src_ip = socket.htonl(self._str_to_ip(val)) @property def src_ip_mask(self): return self._ip_to_str(socket.ntohl(self.mask.src_ip)) @src_ip_mask.setter def src_ip_mask(self, val): self.mask.src_ip = socket.htonl(self._str_to_ip(val)) @property def proto(self): return self.val.proto @proto.setter def proto(self, val): self.val.proto = val @property def proto_mask(self): return self.mask.proto @proto_mask.setter def proto_mask(self, val): self.mask.proto = val @property def tos(self): return self.val.tos @tos.setter def tos(self, val): self.val.tos = val @property def tos_mask(self): return self.mask.tos @tos_mask.setter def tos_mask(self, val): self.mask.tos = val @property def ttl(self): return self.val.ttl @ttl.setter def ttl(self, val): self.val.ttl = val @property def ttl_mask(self): return self.mask.ttl @ttl_mask.setter def ttl_mask(self, val): self.mask.ttl = val @property def flags(self): return self.val.flags @flags.setter def flags(self, val): self.val.flags = val @property def flags_mask(self): return self.mask.flags @flags_mask.setter def flags_mask(self, val): self.mask.flags = val def __str__(self): return super().__str__() + \ f"{'Src IP':<16}: {self.src_ip:<20} {self.src_ip_mask:<20}\n" \ f"{'Dst IP':<16}: {self.dst_ip:<20} {self.dst_ip_mask:<20}\n" \ f"{'Proto':<16}: {self.val.proto:<20} {self.mask.proto:<20}\n" \ f"{'ToS':<16}: {self.val.tos:<20} {self.mask.tos:<20}\n" \ f"{'TTL':<16}: {self.val.ttl:<20} {self.mask.ttl:<20}\n" \ f"{'Flags':<16}: {self.val.flags:<20} {self.mask.flags:<20}\n" cpdef _copy_data(self, unsigned long ptr): cdef v.ibv_flow_spec_ipv4_ext ipv4 ipv4.size = self.size ipv4.type = self.spec_type ipv4.val = self.val ipv4.mask = self.mask memcpy(ptr, &ipv4, self.size) cdef class TcpUdpSpec(Spec): def __init__(self, v.ibv_flow_spec_type spec_type, dst_port=None, dst_port_mask=None, src_port=None, src_port_mask=None, is_inner=False): """ Initialize a TcpUdpSpec object over an underlying ibv_flow_tcp_udp C object that defines TCP or UDP header specifications for steering flow to match on. 
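        For example (a hedged sketch; the port value is hypothetical), a spec
        matching UDP packets with destination port 4791 and the default
        0xffff port mask:
            udp_spec = TcpUdpSpec(e.IBV_FLOW_SPEC_UDP, dst_port=4791)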
:param spec_type: IBV_FLOW_SPEC_TCP or IBV_FLOW_SPEC_UDP :param dst_port: Destination port to match on :param dst_port_mask: Destination port mask :param src_port: Source port to match on :param src_port_mask: Source port mask :param is_inner: Is inner spec """ if spec_type is not v.IBV_FLOW_SPEC_TCP and spec_type is not\ v.IBV_FLOW_SPEC_UDP: raise PyverbsError('Spec type must be IBV_FLOW_SPEC_TCP or' ' IBV_FLOW_SPEC_UDP') self.spec_type = spec_type if is_inner: self.spec_type |= v.IBV_FLOW_SPEC_INNER self.size = sizeof(v.ibv_flow_spec_tcp_udp) self.val.dst_port, self.mask.dst_port = \ map(socket.htons, self._set_val_mask(0xffff, dst_port, dst_port_mask)) self.val.src_port, self.mask.src_port = \ map(socket.htons, self._set_val_mask(0xffff, src_port, src_port_mask)) @property def dst_port(self): return socket.ntohs(self.val.dst_port) @dst_port.setter def dst_port(self, val): self.val.dst_port = socket.htons(val) @property def dst_port_mask(self): return socket.ntohs(self.mask.dst_port) @dst_port_mask.setter def dst_port_mask(self, val): self.mask.dst_port = socket.htons(val) @property def src_port(self): return socket.ntohs(self.val.src_port) @src_port.setter def src_port(self, val): self.val.src_port = socket.htons(val) @property def src_port_mask(self): return socket.ntohs(self.mask.src_port) @src_port_mask.setter def src_port_mask(self, val): self.mask.src_port = socket.htons(val) def __str__(self): return super().__str__() + \ f"{'Src port':<16}: {self.src_port:<20} {self.src_port_mask:<20}\n" \ f"{'Dst port':<16}: {self.dst_port:<20} {self.dst_port_mask:<20}\n" cpdef _copy_data(self, unsigned long ptr): cdef v.ibv_flow_spec_tcp_udp tcp_udp tcp_udp.size = self.size tcp_udp.type = self.spec_type tcp_udp.val = self.val tcp_udp.mask = self.mask memcpy(ptr, &tcp_udp, self.size) cdef class Ipv6Spec(Spec): EMPTY_IPV6 = [0] * 16 IPV6_MASK = ("ffff:" * 8)[:-1] FLOW_LABEL_MASK = 0xfffff def __init__(self, dst_ip=None, dst_ip_mask=None, src_ip=None, src_ip_mask=None, flow_label=None, flow_label_mask=None, next_hdr=None, next_hdr_mask=None, traffic_class=None, traffic_class_mask=None, hop_limit=None, hop_limit_mask=None, is_inner=False): """ Initialize an Ipv6Spec object over an underlying ibv_flow_ipv6 C object that defines IPv6 header specifications for steering flow to match on. :param dst_ip: Destination IPv6 to match on (e.g. 'a0a1::a2a3:a4a5:a6a7:a8a9') :param dst_ip_mask: Destination IPv6 mask (e.g. 
'ffff::ffff:ffff:ffff:ffff') :param src_ip: Source IPv6 to match on :param src_ip_mask: Source IPv6 mask :param flow_label: Flow label to match on :param flow_label_mask: Flow label mask :param next_hdr: Next header to match on :param next_hdr_mask: Next header mask :param traffic_class: Traffic class to match on :param traffic_class_mask: Traffic class mask :param hop_limit: Hop limit to match on :param hop_limit_mask: Hop limit mask :param is_inner: Is inner spec """ self.spec_type = v.IBV_FLOW_SPEC_IPV6 if is_inner: self.spec_type |= v.IBV_FLOW_SPEC_INNER self.size = sizeof(v.ibv_flow_spec_ipv6) self.dst_ip, self.dst_ip_mask = self._set_val_mask(self.IPV6_MASK, dst_ip, dst_ip_mask) self.src_ip, self.src_ip_mask = self._set_val_mask(self.IPV6_MASK, src_ip, src_ip_mask) self.val.flow_label, self.mask.flow_label = \ map(socket.htonl, self._set_val_mask(self.FLOW_LABEL_MASK, flow_label, flow_label_mask)) self.val.next_hdr, self.mask.next_hdr = \ self._set_val_mask(0xff, next_hdr, next_hdr_mask) self.val.traffic_class, self.mask.traffic_class = \ self._set_val_mask(0xff, traffic_class, traffic_class_mask) self.val.hop_limit, self.mask.hop_limit = \ self._set_val_mask(0xff, hop_limit, hop_limit_mask) @property def dst_ip(self): return socket.inet_ntop(socket.AF_INET6, self.val.dst_ip) @dst_ip.setter def dst_ip(self, val): self.val.dst_ip = socket.inet_pton(socket.AF_INET6, val) @property def dst_ip_mask(self): return socket.inet_ntop(socket.AF_INET6, self.mask.dst_ip) @dst_ip_mask.setter def dst_ip_mask(self, val): self.mask.dst_ip = socket.inet_pton(socket.AF_INET6, val) @property def src_ip(self): return socket.inet_ntop(socket.AF_INET6, self.val.src_ip) @src_ip.setter def src_ip(self, val): self.val.src_ip = socket.inet_pton(socket.AF_INET6, val) @property def src_ip_mask(self): return socket.inet_ntop(socket.AF_INET6, self.mask.src_ip) @src_ip_mask.setter def src_ip_mask(self, val): self.mask.src_ip = socket.inet_pton(socket.AF_INET6, val) @property def flow_label(self): return socket.ntohl(self.val.flow_label) @flow_label.setter def flow_label(self, val): self.val.flow_label = socket.htonl(val) @property def flow_label_mask(self): return socket.ntohl(self.mask.flow_label) @flow_label_mask.setter def flow_label_mask(self, val): self.mask.flow_label = socket.htonl(val) @property def next_hdr(self): return self.val.next_hdr @next_hdr.setter def next_hdr(self, val): self.val.next_hdr = val @property def next_hdr_mask(self): return self.mask.next_hdr @next_hdr_mask.setter def next_hdr_mask(self, val): self.mask.next_hdr = val @property def traffic_class(self): return self.val.traffic_class @traffic_class.setter def traffic_class(self, val): self.val.traffic_class = val @property def traffic_class_mask(self): return self.mask.traffic_class @traffic_class_mask.setter def traffic_class_mask(self, val): self.mask.traffic_class = val @property def hop_limit(self): return self.val.hop_limit @hop_limit.setter def hop_limit(self, val): self.val.hop_limit = val @property def hop_limit_mask(self): return self.mask.hop_limit @hop_limit_mask.setter def hop_limit_mask(self, val): self.mask.hop_limit = val def __str__(self): return super().__str__() + \ f"{'Src IP':<16}: {self.src_ip:<20} {self.src_ip_mask:<20}\n" \ f"{'Dst IP':<16}: {self.dst_ip:<20} {self.dst_ip_mask:<20}\n" \ f"{'Flow label':<16}: {self.flow_label:<20} {self.flow_label_mask:<20}\n" \ f"{'Next header':<16}: {self.next_hdr:<20} {self.next_hdr_mask:<20}\n" \ f"{'Traffic class':<16}: {self.traffic_class:<20} {self.traffic_class_mask:<20}\n" 
\ f"{'Hop limit':<16}: {self.hop_limit:<20} {self.hop_limit_mask:<20}\n" cpdef _copy_data(self, unsigned long ptr): cdef v.ibv_flow_spec_ipv6 ipv6 ipv6.size = self.size ipv6.type = self.spec_type ipv6.val = self.val ipv6.mask = self.mask memcpy(ptr, &ipv6, self.size) rdma-core-56.1/pyverbs/srq.pxd000066400000000000000000000013141477342711600163450ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. #cython: language_level=3 from pyverbs.base cimport PyverbsObject, PyverbsCM from . cimport libibverbs as v cdef class SrqAttr(PyverbsObject): cdef v.ibv_srq_attr attr cdef class SrqInitAttr(PyverbsObject): cdef v.ibv_srq_init_attr attr cdef class SrqInitAttrEx(PyverbsObject): cdef v.ibv_srq_init_attr_ex attr cdef object _cq cdef object _pd cdef object _xrcd cdef class OpsWr(PyverbsCM): cdef v.ibv_ops_wr ops_wr cdef class SRQ(PyverbsCM): cdef v.ibv_srq *srq cdef object cq cdef object qps cdef add_ref(self, obj) cpdef close(self) rdma-core-56.1/pyverbs/srq.pyx000066400000000000000000000232671477342711600164050ustar00rootroot00000000000000import weakref from libc.errno cimport errno from libc.string cimport memcpy from libc.stdlib cimport malloc, free from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsError from pyverbs.wr cimport RecvWR, SGE, copy_sg_array from pyverbs.base import PyverbsRDMAErrno from pyverbs.base cimport close_weakrefs cimport pyverbs.libibverbs_enums as e from pyverbs.device cimport Context from pyverbs.cq cimport CQEX, CQ cimport pyverbs.libibverbs as v from pyverbs.xrcd cimport XRCD from pyverbs.qp cimport QP from pyverbs.pd cimport PD cdef class SrqAttr(PyverbsObject): def __init__(self, max_wr=100, max_sge=1, srq_limit=0): super().__init__() self.attr.max_wr = max_wr self.attr.max_sge = max_sge self.attr.srq_limit = srq_limit @property def max_wr(self): return self.attr.max_wr @max_wr.setter def max_wr(self, val): self.attr.max_wr = val @property def max_sge(self): return self.attr.max_sge @max_sge.setter def max_sge(self, val): self.attr.max_sge = val @property def srq_limit(self): return self.attr.srq_limit @srq_limit.setter def srq_limit(self, val): self.attr.srq_limit = val cdef class SrqInitAttr(PyverbsObject): def __init__(self, SrqAttr attr = None): super().__init__() if attr is not None: self.attr.attr.max_wr = attr.max_wr self.attr.attr.max_sge = attr.max_sge self.attr.attr.srq_limit = attr.srq_limit @property def max_wr(self): return self.attr.attr.max_wr @property def max_sge(self): return self.attr.attr.max_sge @property def srq_limit(self): return self.attr.attr.srq_limit cdef class SrqInitAttrEx(PyverbsObject): def __init__(self, max_wr=100, max_sge=1, srq_limit=0): super().__init__() self.attr.attr.max_wr = max_wr self.attr.attr.max_sge = max_sge self.attr.attr.srq_limit = srq_limit self._cq = None self._pd = None self._xrcd = None @property def max_wr(self): return self.attr.attr.max_wr @property def max_sge(self): return self.attr.attr.max_sge @property def srq_limit(self): return self.attr.attr.srq_limit @property def comp_mask(self): return self.attr.comp_mask @comp_mask.setter def comp_mask(self, val): self.attr.comp_mask = val @property def srq_type(self): return self.attr.srq_type @srq_type.setter def srq_type(self, val): self.attr.srq_type = val @property def pd(self): return self._pd @pd.setter def pd(self, PD val): self._pd = val self.attr.pd = val.pd @property def xrcd(self): return self._xrcd @xrcd.setter def xrcd(self, XRCD val): self._xrcd 
= val self.attr.xrcd = val.xrcd @property def max_num_tags(self): return self.attr.tm_cap.max_num_tags @max_num_tags.setter def max_num_tags(self, val): self.attr.tm_cap.max_num_tags = val @property def max_ops(self): return self.attr.tm_cap.max_ops @max_ops.setter def max_ops(self, val): self.attr.tm_cap.max_ops = val @property def cq(self): return self._cq @cq.setter def cq(self, val): if type(val) == CQ: self.attr.cq = (val).cq self._cq = val else: self.attr.cq = (val).ibv_cq self._cq = val cdef class OpsWr(PyverbsCM): def __init__(self, wr_id=0, opcode=e.IBV_WR_TAG_ADD, flags=e.IBV_OPS_SIGNALED, OpsWr next_wr=None, unexpected_cnt=0, recv_wr_id=0, num_sge=None, tag=0, mask=0, sg_list=None): self.ops_wr.wr_id = wr_id self.ops_wr.opcode = opcode self.ops_wr.flags = flags self.ops_wr.tm.unexpected_cnt = unexpected_cnt self.ops_wr.tm.add.recv_wr_id = recv_wr_id self.ops_wr.tm.add.tag = tag self.ops_wr.tm.add.mask = mask if next_wr is not None: self.ops_wr.next = &next_wr.ops_wr if num_sge is not None: self.ops_wr.tm.add.num_sge = num_sge cdef v.ibv_sge *dst if sg_list is not None: self.ops_wr.tm.add.sg_list = malloc(num_sge * sizeof(v.ibv_sge)) if self.ops_wr.tm.add.sg_list == NULL: raise MemoryError('Failed to malloc SG buffer') dst = self.ops_wr.tm.add.sg_list copy_sg_array(dst, sg_list, num_sge) def __dealloc__(self): self.close() cpdef close(self): if self.ops_wr.tm.add.sg_list != NULL: free(self.ops_wr.tm.add.sg_list) self.ops_wr.tm.add.sg_list = NULL @property def wr_id(self): return self.ops_wr.wr_id @wr_id.setter def wr_id(self, val): self.ops_wr.wr_id = val @property def next_wr(self): if self.ops_wr.next == NULL: return None val = OpsWr() val.ops_wr = self.ops_wr.next[0] return val @next_wr.setter def next_wr(self, OpsWr val not None): self.ops_wr.next = &val.ops_wr @property def opcode(self): return self.ops_wr.opcode @opcode.setter def opcode(self, val): self.ops_wr.opcode = val @property def flags(self): return self.ops_wr.flags @flags.setter def flags(self, val): self.ops_wr.flags = val @property def unexpected_cnt(self): return self.ops_wr.tm.unexpected_cnt @unexpected_cnt.setter def unexpected_cnt(self, val): self.ops_wr.tm.unexpected_cnt = val @property def recv_wr_id(self): return self.ops_wr.tm.add.recv_wr_id @recv_wr_id.setter def recv_wr_id(self, val): self.ops_wr.tm.add.recv_wr_id = val @property def tag(self): return self.ops_wr.tm.add.tag @tag.setter def tag(self, val): self.ops_wr.tm.add.tag = val @property def handle(self): return self.ops_wr.tm.handle @handle.setter def handle(self, val): self.ops_wr.tm.handle = val @property def mask(self): return self.ops_wr.tm.add.mask @mask.setter def mask(self, val): self.ops_wr.tm.add.mask = val cdef class SRQ(PyverbsCM): def __init__(self, object creator not None, object attr not None): super().__init__() self.srq = NULL self.cq = None self.qps = weakref.WeakSet() if isinstance(creator, PD): self._create_srq(creator, attr) elif isinstance(creator, Context): self._create_srq_ex(creator, attr) else: raise PyverbsRDMAError('Srq needs either Context or PD for creation') if self.srq == NULL: raise PyverbsRDMAErrno('Failed to create SRQ (errno is {err})'. 
format(err=errno)) self.logger.debug('SRQ Created') def __dealloc__(self): self.close() cpdef close(self): if self.srq != NULL: if self.logger: self.logger.debug('Closing SRQ') close_weakrefs([self.qps]) rc = v.ibv_destroy_srq(self.srq) if rc != 0: raise PyverbsRDMAError('Failed to destroy SRQ', rc) self.srq = NULL self.cq = None cdef add_ref(self, obj): if isinstance(obj, QP): self.qps.add(obj) else: raise PyverbsError('Unrecognized object type') def _create_srq(self, PD pd, SrqInitAttr init_attr): self.srq = v.ibv_create_srq(pd.pd, &init_attr.attr) pd.add_ref(self) def _create_srq_ex(self, Context context, SrqInitAttrEx init_attr_ex): self.srq = v.ibv_create_srq_ex(context.context, &init_attr_ex.attr) if init_attr_ex.cq: cq = init_attr_ex.cq cq.add_ref(self) self.cq = cq if init_attr_ex.xrcd: xrcd = init_attr_ex.xrcd xrcd.add_ref(self) if init_attr_ex.pd: pd = init_attr_ex.pd pd.add_ref(self) def get_srq_num(self): cdef unsigned int srqn rc = v.ibv_get_srq_num(self.srq, &srqn) if rc != 0: raise PyverbsRDMAError('Failed to retrieve SRQ number', rc) return srqn def modify(self, SrqAttr attr, comp_mask): rc = v.ibv_modify_srq(self.srq, &attr.attr, comp_mask) if rc != 0: raise PyverbsRDMAError('Failed to modify SRQ', rc) def query(self): attr = SrqAttr() rc = v.ibv_query_srq(self.srq, &attr.attr) if rc != 0: raise PyverbsRDMAError('Failed to query SRQ', rc) return attr def post_srq_ops(self, OpsWr wr not None, OpsWr bad_wr=None): """ Perform configuration manipulations on a special shared receive queue (SRQ). :param wr: Ops Work Requests to be posted to the TM-Shared Receive Queue :param bad_wr: An OpsWr object that will be filled with the first Ops Work Request whose processing failed """ cdef v.ibv_ops_wr *my_bad_wr rc = v.ibv_post_srq_ops(self.srq, &wr.ops_wr, &my_bad_wr) if rc != 0: if bad_wr: memcpy(&bad_wr.ops_wr, my_bad_wr, sizeof(bad_wr.ops_wr)) raise PyverbsRDMAError('Failed to post SRQ ops', rc) def post_recv(self, RecvWR wr not None, RecvWR bad_wr=None): cdef v.ibv_recv_wr *my_bad_wr rc = v.ibv_post_srq_recv(self.srq, &wr.recv_wr, &my_bad_wr) if rc != 0: if bad_wr: memcpy(&bad_wr.recv_wr, my_bad_wr, sizeof(bad_wr.recv_wr)) raise PyverbsRDMAError('Failed to post receive to SRQ.', rc) rdma-core-56.1/pyverbs/utils.py000066400000000000000000000061021477342711600165350ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file import struct from pyverbs.pyverbs_error import PyverbsUserError import pyverbs.enums as e be64toh = lambda num: struct.unpack('Q', struct.pack('!Q', num))[0] def gid_str(subnet_prefix, interface_id): hex_values = '%016x%016x' % (be64toh(subnet_prefix), be64toh(interface_id)) return ':'.join([hex_values[0:4], hex_values[4:8], hex_values[8:12], hex_values[12:16], hex_values[16:20], hex_values[20:24], hex_values[24:28], hex_values[28:32]]) def gid_str_to_array(val): """ Splits a GID to an array of u8 that can be easily assigned to a GID's raw array. :param val: GID value in 8 words format 'xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx' :return: A list of 16 two-character hex strings, one per GID byte
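    For example: gid_str_to_array('fe80:0000:0000:0000:0202:c9ff:fe00:0001')
    returns ['fe', '80', '00', '00', '00', '00', '00', '00', '02', '02',
             'c9', 'ff', 'fe', '00', '00', '01']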
""" val = val.split(':') if len(val) != 8: raise PyverbsUserError('Invalid GID value ({val})'.format(val=val)) if any([len(v) != 4 for v in val]): raise PyverbsUserError('Invalid GID value ({val})'.format(val=val)) val_int = int(''.join(val), 16) vals = [] for i in range(8): vals.append(val[i][0:2]) vals.append(val[i][2:4]) return vals def qp_type_to_str(qp_type): types = {2: 'RC', 3: 'UC', 4: 'UD', 8: 'Raw Packet', 9: 'XRCD_SEND', 10: 'XRCD_RECV', 0xff:'Driver QP'} try: return types[qp_type] except KeyError: return 'Unknown ({qpt})'.format(qpt=qp_type) def qp_state_to_str(qp_state): states = {0: 'Reset', 1: 'Init', 2: 'RTR', 3: 'RTS', 4: 'SQD', 5: 'SQE', 6: 'Error', 7: 'Unknown'} try: return states[qp_state] except KeyError: return 'Unknown ({qps})'.format(qps=qp_state_to_str) def mtu_to_str(mtu): mtus = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096} try: return mtus[mtu] except KeyError: return 0 def access_flags_to_str(flags): access_flags = {1: 'Local write', 2: 'Remote write', 4: 'Remote read', 8: 'Remote atomic', 16: 'MW bind', 32: 'Zero based', 64: 'On demand'} access_str = '' for f in access_flags: if flags & f: access_str += access_flags[f] access_str += ' ' return access_str def mig_state_to_str(mig): mig_states = {0: 'Migrated', 1: 'Re-arm', 2: 'Armed'} try: return mig_states[mig] except KeyError: return 'Unknown ({m})'.format(m=mig) def rereg_error_to_str(error): error_map = {e.IBV_REREG_MR_ERR_INPUT: 'IBV_REREG_MR_ERR_INPUT', e.IBV_REREG_MR_ERR_DONT_FORK_NEW: \ 'IBV_REREG_MR_ERR_DONT_FORK_NEW', e.IBV_REREG_MR_ERR_DO_FORK_OLD: 'IBV_REREG_MR_ERR_DO_FORK_OLD', e.IBV_REREG_MR_ERR_CMD: 'IBV_REREG_MR_ERR_CMD', e.IBV_REREG_MR_ERR_CMD_AND_DO_FORK_NEW: \ 'IBV_REREG_MR_ERR_CMD_AND_DO_FORK_NEW'} try: return error_map[error] except KeyError: return f'Unknown error ({error})' rdma-core-56.1/pyverbs/wq.pxd000066400000000000000000000020121477342711600161630ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia Inc. All rights reserved. See COPYING file #cython: language_level=3 from pyverbs.base cimport PyverbsObject, PyverbsCM from pyverbs.device cimport Context cimport pyverbs.libibverbs as v from pyverbs.cq cimport CQ from pyverbs.pd cimport PD cdef class WQInitAttr(PyverbsObject): cdef v.ibv_wq_init_attr attr cdef PD pd cdef object cq cdef class WQAttr(PyverbsObject): cdef v.ibv_wq_attr attr cdef class WQ(PyverbsCM): cdef v.ibv_wq *wq cdef Context context cdef PD pd cdef object cq cdef object rwq_ind_tables cpdef add_ref(self, obj) cdef class RwqIndTableInitAttr(PyverbsObject): cdef v.ibv_rwq_ind_table_init_attr attr cdef object wqs_list cdef class RwqIndTable(PyverbsCM): cdef v.ibv_rwq_ind_table *rwq_ind_table cdef Context context cdef object wqs cdef object qps cpdef add_ref(self, obj) cdef class RxHashConf(PyverbsObject): cdef v.ibv_rx_hash_conf rx_hash_conf rdma-core-56.1/pyverbs/wq.pyx000066400000000000000000000304051477342711600162170ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia Inc. All rights reserved. 
rdma-core-56.1/pyverbs/wq.pxd000066400000000000000000000020121477342711600161630ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2021 Nvidia Inc. All rights reserved. See COPYING file
#cython: language_level=3

from pyverbs.base cimport PyverbsObject, PyverbsCM
from pyverbs.device cimport Context
cimport pyverbs.libibverbs as v
from pyverbs.cq cimport CQ
from pyverbs.pd cimport PD


cdef class WQInitAttr(PyverbsObject):
    cdef v.ibv_wq_init_attr attr
    cdef PD pd
    cdef object cq

cdef class WQAttr(PyverbsObject):
    cdef v.ibv_wq_attr attr

cdef class WQ(PyverbsCM):
    cdef v.ibv_wq *wq
    cdef Context context
    cdef PD pd
    cdef object cq
    cdef object rwq_ind_tables
    cpdef add_ref(self, obj)

cdef class RwqIndTableInitAttr(PyverbsObject):
    cdef v.ibv_rwq_ind_table_init_attr attr
    cdef object wqs_list

cdef class RwqIndTable(PyverbsCM):
    cdef v.ibv_rwq_ind_table *rwq_ind_table
    cdef Context context
    cdef object wqs
    cdef object qps
    cpdef add_ref(self, obj)

cdef class RxHashConf(PyverbsObject):
    cdef v.ibv_rx_hash_conf rx_hash_conf
rdma-core-56.1/pyverbs/wq.pyx000066400000000000000000000304051477342711600162170ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2021 Nvidia Inc. All rights reserved. See COPYING file

from libc.stdlib cimport calloc, free
from libc.stdint cimport uint8_t
from libc.string cimport memcpy
import weakref

from .pyverbs_error import PyverbsRDMAError, PyverbsError, PyverbsUserError
from pyverbs.base import PyverbsRDMAErrno
from pyverbs.base cimport close_weakrefs
cimport pyverbs.libibverbs_enums as e
from pyverbs.device cimport Context
from pyverbs.wr cimport RecvWR
from pyverbs.cq cimport CQ, CQEX
from pyverbs.pd cimport PD
from pyverbs.qp cimport QP


cdef class WQInitAttr(PyverbsObject):
    def __init__(self, wq_context=None, PD wq_pd=None, wq_cq=None,
                 wq_type=e.IBV_WQT_RQ, max_wr=100, max_sge=1, comp_mask=0,
                 create_flags=0):
        """
        Initializes a WqInitAttr object representing ibv_wq_init_attr struct.
        :param wq_context: Associated WQ context
        :param wq_pd: PD to be associated with the WQ
        :param wq_cq: CQ or CQEX to be associated with the WQ
        :param wq_type: The desired WQ type
        :param max_wr: Requested max number of outstanding WRs in the WQ
        :param max_sge: Requested max number of scatter/gather (s/g) elements
                        per WR in the WQ
        :param comp_mask: Identifies valid fields
        :param create_flags: Creation flags for the WQ
        :return: A WqInitAttr object
        """
        super().__init__()
        self.attr.wq_context = <void*>wq_context if wq_context else NULL
        self.attr.wq_type = wq_type
        self.attr.max_wr = max_wr
        self.attr.max_sge = max_sge
        self.pd = wq_pd
        self.attr.pd = wq_pd.pd if wq_pd else NULL
        self.cq = wq_cq
        if wq_cq:
            if isinstance(wq_cq, CQ):
                self.attr.cq = (<CQ>wq_cq).cq
            else:
                self.attr.cq = (<CQEX>wq_cq).ibv_cq
        else:
            self.attr.cq = NULL
        self.attr.comp_mask = comp_mask
        self.attr.create_flags = create_flags

    @property
    def wq_type(self):
        return self.attr.wq_type

    @wq_type.setter
    def wq_type(self, val):
        self.attr.wq_type = val

    @property
    def pd(self):
        return self.pd

    @pd.setter
    def pd(self, PD val):
        self.pd = val
        self.attr.pd = val.pd

    @property
    def cq(self):
        return self.cq

    @cq.setter
    def cq(self, val):
        self.cq = val
        if isinstance(val, CQ):
            self.attr.cq = (<CQ>val).cq
        else:
            self.attr.cq = (<CQEX>val).ibv_cq


cdef class WQAttr(PyverbsObject):
    def __init__(self, attr_mask=0, wq_state=0, curr_wq_state=0, flags=0,
                 flags_mask=0):
        """
        Initializes a WQAttr object which represents ibv_wq_attr struct. It
        can be used to modify a WQ.
        :param attr_mask: Identifies valid fields
        :param wq_state: Desired WQ state
        :param curr_wq_state: Current WQ state
        :param flags: Flags values to modify
        :param flags_mask: Which flags to modify
        :return: An initialized WQAttr object
        """
        super().__init__()
        self.attr.attr_mask = attr_mask
        self.attr.wq_state = wq_state
        self.attr.curr_wq_state = curr_wq_state
        self.attr.flags = flags
        self.attr.flags_mask = flags_mask

    @property
    def wq_state(self):
        return self.attr.wq_state

    @wq_state.setter
    def wq_state(self, val):
        self.attr.wq_state = val

    @property
    def attr_mask(self):
        return self.attr.attr_mask

    @attr_mask.setter
    def attr_mask(self, val):
        self.attr.attr_mask = val

    @property
    def curr_wq_state(self):
        return self.attr.curr_wq_state

    @curr_wq_state.setter
    def curr_wq_state(self, val):
        self.attr.curr_wq_state = val

    @property
    def flags(self):
        return self.attr.flags

    @flags.setter
    def flags(self, val):
        self.attr.flags = val

    @property
    def flags_mask(self):
        return self.attr.flags_mask

    @flags_mask.setter
    def flags_mask(self, val):
        self.attr.flags_mask = val
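
# Illustrative attribute setup (enum names are standard ibverbs values,
# assumed to be exposed via pyverbs.libibverbs_enums):
#
#   init_attr = WQInitAttr(wq_pd=pd, wq_cq=cq, wq_type=e.IBV_WQT_RQ,
#                          max_wr=64, max_sge=1)
#   rdy_attr = WQAttr(attr_mask=e.IBV_WQ_ATTR_STATE, wq_state=e.IBV_WQS_RDY)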
cdef class WQ(PyverbsCM):
    def __init__(self, Context ctx, WQInitAttr attr):
        """
        Creates a WQ object.
        :param ctx: The context the wq will be associated with.
        :param attr: WQ initial attributes of type WQInitAttr.
        :return: A WQ object
        """
        super().__init__()
        self.wq = v.ibv_create_wq(ctx.context, &attr.attr)
        if self.wq == NULL:
            raise PyverbsRDMAErrno('Failed to create WQ')
        self.context = ctx
        ctx.add_ref(self)
        pd = attr.pd
        pd.add_ref(self)
        self.pd = pd
        if isinstance(attr.cq, CQ):
            (<CQ>attr.cq).add_ref(self)
        elif isinstance(attr.cq, CQEX):
            (<CQEX>attr.cq).add_ref(self)
        self.cq = attr.cq
        self.rwq_ind_tables = weakref.WeakSet()

    cpdef add_ref(self, obj):
        if isinstance(obj, RwqIndTable):
            self.rwq_ind_tables.add(obj)
        else:
            raise PyverbsError('Unrecognized object type')

    def modify(self, WQAttr wq_attr not None):
        """
        Modify the WQ
        :param wq_attr: A WQAttr object with updated values to be applied to
                        the WQ
        :return: None
        """
        rc = v.ibv_modify_wq(self.wq, &wq_attr.attr)
        if rc != 0:
            raise PyverbsRDMAError('Failed to modify WQ', rc)

    def post_recv(self, RecvWR wr not None, RecvWR bad_wr=None):
        """
        Post a receive WR on the WQ.
        :param wr: The work request to post
        :param bad_wr: A RecvWR object to hold the bad WR if it is available
                       in case of a failure
        :return: None
        """
        cdef v.ibv_recv_wr *my_bad_wr
        # In order to provide a pointer to a pointer, use a temporary cdef'ed
        # variable.
        rc = v.ibv_post_wq_recv(self.wq, &wr.recv_wr, &my_bad_wr)
        if rc != 0:
            if bad_wr:
                memcpy(&bad_wr.recv_wr, my_bad_wr, sizeof(bad_wr.recv_wr))
            raise PyverbsRDMAError('Failed to post recv', rc)

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        """
        Closes the underlying C object of the WQ.
        :return: None
        """
        if self.wq != NULL:
            if self.logger:
                self.logger.debug('Closing WQ')
            close_weakrefs([self.rwq_ind_tables])
            rc = v.ibv_destroy_wq(self.wq)
            if rc != 0:
                raise PyverbsRDMAError('Failed to dealloc WQ', rc)
            self.wq = NULL
            self.context = None
            self.pd = None
            self.cq = None

    @property
    def wqn(self):
        return self.wq.wq_num
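
# Illustrative WQ lifecycle sketch (device name and sizes are assumptions):
#
#   from pyverbs.device import Context
#   from pyverbs.pd import PD
#   from pyverbs.cq import CQ
#
#   ctx = Context(name='mlx5_0')
#   pd = PD(ctx)
#   cq = CQ(ctx, 64)
#   wq = WQ(ctx, WQInitAttr(wq_pd=pd, wq_cq=cq, max_wr=64))
#   wq.modify(WQAttr(attr_mask=e.IBV_WQ_ATTR_STATE, wq_state=e.IBV_WQS_RDY))
#   # wq.post_recv() then takes a pyverbs.wr.RecvWR backed by an MR, as with
#   # QPs and SRQs.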
cdef class RwqIndTableInitAttr(PyverbsObject):
    def __init__(self, log_ind_tbl_size=5, wqs_list=None, comp_mask=0):
        """
        Initializes a RwqIndTableInitAttr object representing
        ibv_rwq_ind_table_init_attr struct.
        :param log_ind_tbl_size: Log, base 2, of indirection table size
        :param wqs_list: List of WQs
        :param comp_mask: Identifies valid fields
        :return: A RwqIndTableInitAttr object
        """
        super().__init__()
        if log_ind_tbl_size <= 0:
            raise PyverbsUserError('Invalid indirection table size. Log size must be > 0')
        if (1 << log_ind_tbl_size) < len(wqs_list):
            raise PyverbsUserError(f'Requested table size ({1 << log_ind_tbl_size}) is smaller '
                                   f'than the number of wqs ({len(wqs_list)})')
        self.attr.log_ind_tbl_size = log_ind_tbl_size
        cdef v.ibv_wq **rwq_ind_table = <v.ibv_wq**>calloc(len(wqs_list), sizeof(v.ibv_wq*))
        if rwq_ind_table == NULL:
            raise MemoryError('Failed to allocate memory for Indirection Table')
        for i in range(len(wqs_list)):
            rwq_ind_table[i] = (<WQ>wqs_list[i]).wq
        self.attr.ind_tbl = rwq_ind_table
        self.wqs_list = wqs_list
        self.attr.comp_mask = comp_mask

    def __dealloc__(self):
        """
        Frees the indirection table array of the init attr.
        :return: None
        """
        free(self.attr.ind_tbl)


cdef class RwqIndTable(PyverbsCM):
    def __init__(self, Context ctx, RwqIndTableInitAttr attr):
        """
        Initializes a RwqIndTable object.
        :param ctx: The context the RWQ IND TBL will be associated with.
        :param attr: RWQ IND TBL initial attributes of type
                     RwqIndTableInitAttr.
        :return: A RwqIndTable object
        """
        super().__init__()
        self.rwq_ind_table = v.ibv_create_rwq_ind_table(ctx.context, &attr.attr)
        if self.rwq_ind_table == NULL:
            raise PyverbsRDMAErrno('Failed to create RwqIndTable')
        self.context = ctx
        ctx.add_ref(self)
        self.wqs = attr.wqs_list
        for wq in self.wqs:
            wq.add_ref(self)
        self.qps = weakref.WeakSet()

    cpdef add_ref(self, obj):
        if isinstance(obj, QP):
            self.qps.add(obj)
        else:
            raise PyverbsError('Unrecognized object type')

    @property
    def wqs(self):
        return self.wqs

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        """
        Closes the underlying C object of the RWQ IND TBL.
        :return: None
        """
        if self.rwq_ind_table != NULL:
            if self.logger:
                self.logger.debug('Closing RWQ IND TBL')
            close_weakrefs([self.qps])
            rc = v.ibv_destroy_rwq_ind_table(self.rwq_ind_table)
            if rc != 0:
                raise PyverbsRDMAError('Failed to dealloc RWQ IND TBL', rc)
            self.rwq_ind_table = NULL
            self.context = None


cdef class RxHashConf(PyverbsObject):
    def __init__(self, rx_hash_function=0, rx_hash_key_len=0, rx_hash_key=None,
                 rx_hash_fields_mask=0):
        """
        Initializes a RxHashConf object representing ibv_rx_hash_conf struct.
        :param rx_hash_function: RX hash function, use enum
                                 ibv_rx_hash_function_flags
        :param rx_hash_key_len: RX hash key length
        :param rx_hash_key: RX hash key data
        :param rx_hash_fields_mask: RX fields that should participate in the
                                    hashing
        :return: A RxHashConf object
        """
        super().__init__()
        if rx_hash_key:
            if rx_hash_key_len != len(rx_hash_key):
                raise PyverbsUserError('Length of rx_hash_key not equal to rx_hash_key_len')
            self.rx_hash_key = rx_hash_key
        self.rx_hash_conf.rx_hash_function = rx_hash_function
        self.rx_hash_conf.rx_hash_key_len = rx_hash_key_len
        self.rx_hash_conf.rx_hash_fields_mask = rx_hash_fields_mask

    @property
    def rx_hash_function(self):
        return self.rx_hash_conf.rx_hash_function

    @rx_hash_function.setter
    def rx_hash_function(self, val):
        self.rx_hash_conf.rx_hash_function = val

    @property
    def rx_hash_key_len(self):
        return self.rx_hash_conf.rx_hash_key_len

    @rx_hash_key_len.setter
    def rx_hash_key_len(self, val):
        if val <= 0:
            raise PyverbsUserError('Invalid rx_hash_key_len. Must be greater than 0')
        self.rx_hash_conf.rx_hash_key_len = val

    @property
    def rx_hash_fields_mask(self):
        return self.rx_hash_conf.rx_hash_fields_mask

    @rx_hash_fields_mask.setter
    def rx_hash_fields_mask(self, val):
        self.rx_hash_conf.rx_hash_fields_mask = val

    @property
    def rx_hash_key(self):
        if self.rx_hash_conf.rx_hash_key == NULL:
            return None
        return [self.rx_hash_conf.rx_hash_key[i]
                for i in range(self.rx_hash_conf.rx_hash_key_len)]

    @rx_hash_key.setter
    def rx_hash_key(self, vals_list):
        if self.rx_hash_conf.rx_hash_key != NULL:
            free(self.rx_hash_conf.rx_hash_key)
            self.rx_hash_conf.rx_hash_key = NULL
        cdef uint8_t *rx_hash_key_c = <uint8_t*>calloc(len(vals_list), sizeof(uint8_t))
        if rx_hash_key_c == NULL:
            raise MemoryError('Failed to allocate memory for RX hash key')
        for i in range(len(vals_list)):
            rx_hash_key_c[i] = vals_list[i]
        self.rx_hash_conf.rx_hash_key = rx_hash_key_c
        self.rx_hash_conf.rx_hash_key_len = len(vals_list)

    def __dealloc__(self):
        """
        Frees rx hash key allocated memory.
        :return: None
        """
        free(self.rx_hash_conf.rx_hash_key)
        self.rx_hash_conf.rx_hash_key = NULL
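
# Illustrative RSS setup combining the classes above (hash key value and
# table size are assumptions; enum names are standard ibverbs values):
#
#   wqs = [WQ(ctx, WQInitAttr(wq_pd=pd, wq_cq=cq)) for _ in range(4)]
#   tbl = RwqIndTable(ctx, RwqIndTableInitAttr(log_ind_tbl_size=2,
#                                              wqs_list=wqs))
#   key = [0x2c] * 40    # 40-byte Toeplitz key, arbitrary example value
#   conf = RxHashConf(rx_hash_function=e.IBV_RX_HASH_FUNC_TOEPLITZ,
#                     rx_hash_key_len=len(key), rx_hash_key=key,
#                     rx_hash_fields_mask=e.IBV_RX_HASH_SRC_IPV4 |
#                                         e.IBV_RX_HASH_DST_IPV4)
#   # tbl and conf are then referenced from an extended QP's init attributes.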
rdma-core-56.1/pyverbs/wr.pxd000066400000000000000000000010041477342711600161640ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file
#cython: language_level=3

from .base cimport PyverbsCM
from pyverbs cimport libibverbs as v


cdef class SGE(PyverbsCM):
    cdef v.ibv_sge *sge
    cpdef read(self, length, offset)

cdef class RecvWR(PyverbsCM):
    cdef v.ibv_recv_wr recv_wr

cdef class SendWR(PyverbsCM):
    cdef v.ibv_send_wr send_wr
    cdef object ah

cdef copy_sg_array(v.ibv_sge *dst, sg, num_sge)
rdma-core-56.1/pyverbs/wr.pyx000066400000000000000000000262311477342711600162210ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2019 Mellanox Technologies Inc. All rights reserved. See COPYING file

from pyverbs.pyverbs_error import PyverbsUserError, PyverbsError
from pyverbs.base import PyverbsRDMAErrno, inc_rkey
from pyverbs.mr cimport MW, MR, MWBindInfo
cimport pyverbs.libibverbs_enums as e
cimport pyverbs.libibverbs as v
from pyverbs.addr cimport AH
from libc.stdlib cimport free, malloc
from libc.string cimport memcpy
from libc.stdint cimport uintptr_t


cdef class SGE(PyverbsCM):
    """
    Represents ibv_sge struct. It has a read function to allow users to keep
    track of data. A write function is not provided, since a scatter-gather
    element may be using either an MR or a DMMR. In case direct (device's)
    memory is used, write can't be done using memcpy, which relies on
    CPU-specific optimizations, and an SGE has no way to tell which kind of
    memory it is using.
    """
    def __init__(self, addr, length, lkey):
        """
        Initializes a SGE object.
        :param addr: The address to be used for read/write
        :param length: Available buffer size
        :param lkey: Local key of the used MR/DMMR
        :return: A SGE object
        """
        super().__init__()
        self.sge = <v.ibv_sge*>malloc(sizeof(v.ibv_sge))
        if self.sge == NULL:
            raise PyverbsError('Failed to allocate an SGE')
        self.sge.addr = addr
        self.sge.length = length
        self.sge.lkey = lkey

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        free(self.sge)

    cpdef read(self, length, offset):
        """
        Reads <length> bytes of data starting at <offset> bytes from the SGE's
        address.
        :param length: How many bytes to read
        :param offset: Offset from the SGE's address in bytes
        :return: The data written at the SGE's address + offset
        """
        cdef char *sg_data
        cdef int off = offset
        sg_data = <char*>(self.sge.addr + off)
        return sg_data[:length]

    def __str__(self):
        print_format = '{:22}: {:<20}\n'
        return print_format.format('Address', hex(self.sge.addr)) +\
               print_format.format('Length', self.sge.length) +\
               print_format.format('Key', hex(self.sge.lkey))

    @property
    def addr(self):
        return self.sge.addr

    @addr.setter
    def addr(self, val):
        self.sge.addr = val

    @property
    def length(self):
        return self.sge.length

    @length.setter
    def length(self, val):
        self.sge.length = val

    @property
    def lkey(self):
        return self.sge.lkey

    @lkey.setter
    def lkey(self, val):
        self.sge.lkey = val
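
# Illustrative round trip through an SGE (sizes and access flags are
# assumptions for the example; MR comes from pyverbs.mr):
#
#   from pyverbs.mr import MR
#   mr = MR(pd, 64, e.IBV_ACCESS_LOCAL_WRITE)
#   mr.write(b'hello', 5)
#   sge = SGE(mr.buf, 64, mr.lkey)
#   assert sge.read(5, 0) == b'hello'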
cdef class RecvWR(PyverbsCM):
    def __init__(self, wr_id=0, num_sge=0, sg=None, RecvWR next_wr=None):
        """
        Initializes a RecvWR object.
        :param wr_id: A user-defined WR ID
        :param num_sge: Size of the scatter-gather array
        :param sg: A scatter-gather array
        :param next_wr: The next WR in the list
        :return: A RecvWR object
        """
        super().__init__()
        cdef v.ibv_sge *dst
        if num_sge < 1 or sg is None:
            raise PyverbsUserError('A WR needs at least one SGE')
        self.recv_wr.sg_list = <v.ibv_sge*>malloc(num_sge * sizeof(v.ibv_sge))
        if self.recv_wr.sg_list == NULL:
            raise PyverbsRDMAErrno('Failed to malloc SG buffer')
        dst = self.recv_wr.sg_list
        copy_sg_array(dst, sg, num_sge)
        self.recv_wr.num_sge = num_sge
        self.recv_wr.wr_id = wr_id
        if next_wr is not None:
            self.recv_wr.next = &next_wr.recv_wr

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        free(self.recv_wr.sg_list)

    def __str__(self):
        print_format = '{:22}: {:<20}\n'
        return print_format.format('WR ID', self.recv_wr.wr_id) +\
               print_format.format('Num SGE', self.recv_wr.num_sge)

    @property
    def next_wr(self):
        if self.recv_wr.next == NULL:
            return None
        val = RecvWR()
        val.recv_wr = self.recv_wr.next[0]
        return val

    @next_wr.setter
    def next_wr(self, RecvWR val not None):
        self.recv_wr.next = &val.recv_wr

    @property
    def wr_id(self):
        return self.recv_wr.wr_id

    @wr_id.setter
    def wr_id(self, val):
        self.recv_wr.wr_id = val

    @property
    def num_sge(self):
        return self.recv_wr.num_sge

    @num_sge.setter
    def num_sge(self, val):
        self.recv_wr.num_sge = val


cdef class SendWR(PyverbsCM):
    def __init__(self, wr_id=0, opcode=e.IBV_WR_SEND, num_sge=0, imm_data=0,
                 sg=None, send_flags=e.IBV_SEND_SIGNALED, SendWR next_wr=None):
        """
        Initialize a SendWR object with user-provided or default values.
        :param wr_id: A user-defined WR ID
        :param opcode: The WR's opcode
        :param num_sge: Number of scatter-gather elements in the WR
        :param imm_data: Immediate data
        :param sg: A SGE element, head of the scatter-gather list
        :param send_flags: Send flags as define in ibv_send_flags enum
        :param next_wr: The next WR in the list
        :return: An initialized SendWR object
        """
        cdef v.ibv_sge *dst
        super().__init__()
        mw_opcodes = [e.IBV_WR_LOCAL_INV, e.IBV_WR_BIND_MW,
                      e.IBV_WR_SEND_WITH_INV]
        if opcode not in mw_opcodes and (num_sge < 1 or sg is None):
            raise PyverbsUserError('A WR needs at least one SGE')
        self.send_wr.sg_list = <v.ibv_sge*>malloc(num_sge * sizeof(v.ibv_sge))
        if self.send_wr.sg_list == NULL:
            raise PyverbsRDMAErrno('Failed to malloc SG buffer')
        dst = self.send_wr.sg_list
        copy_sg_array(dst, sg, num_sge)
        self.send_wr.num_sge = num_sge
        self.send_wr.wr_id = wr_id
        if next_wr is not None:
            self.send_wr.next = &next_wr.send_wr
        self.send_wr.opcode = opcode
        self.send_wr.send_flags = send_flags
        self.send_wr.imm_data = imm_data
        self.ah = None

    def __dealloc__(self):
        self.close()

    cpdef close(self):
        free(self.send_wr.sg_list)

    def __str__(self):
        print_format = '{:22}: {:<20}\n'
        return print_format.format('WR ID', self.send_wr.wr_id) +\
               print_format.format('Num SGE', self.send_wr.num_sge) +\
               print_format.format('Opcode', self.send_wr.opcode) +\
               print_format.format('Send flags',
                                   send_flags_to_str(self.send_wr.send_flags)) +\
               print_format.format('Imm Data', self.send_wr.imm_data)

    @property
    def next_wr(self):
        if self.send_wr.next == NULL:
            return None
        val = SendWR()
        val.send_wr = self.send_wr.next[0]
        return val

    @next_wr.setter
    def next_wr(self, SendWR val not None):
        self.send_wr.next = &val.send_wr

    @property
    def wr_id(self):
        return self.send_wr.wr_id

    @wr_id.setter
    def wr_id(self, val):
        self.send_wr.wr_id = val

    @property
    def num_sge(self):
        return self.send_wr.num_sge

    @num_sge.setter
    def num_sge(self, val):
        self.send_wr.num_sge = val

    @property
    def imm_data(self):
        return self.send_wr.imm_data

    @imm_data.setter
    def imm_data(self, val):
        self.send_wr.imm_data = val

    @property
    def opcode(self):
        return self.send_wr.opcode

    @opcode.setter
    def opcode(self, val):
        self.send_wr.opcode = val

    @property
    def send_flags(self):
        return self.send_wr.send_flags

    @send_flags.setter
    def send_flags(self, val):
        self.send_wr.send_flags = val

    property sg_list:
        def __set__(self, SGE val not None):
            self.send_wr.sg_list = val.sge

    def set_wr_ud(self, AH ah not None, rqpn, rqkey):
        """
        Set the members of the ud struct in the send_wr's wr union.
        :param ah: An address handle object
        :param rqpn: The remote QP number
        :param rqkey: The remote QKey, authorizing access to the destination
                      QP
        :return: None
        """
        self.ah = ah
        self.send_wr.wr.ud.ah = ah.ah
        self.send_wr.wr.ud.remote_qpn = rqpn
        self.send_wr.wr.ud.remote_qkey = rqkey

    def set_wr_rdma(self, rkey, addr):
        """
        Set the members of the rdma struct in the send_wr's wr union, used
        for RDMA extended transport header creation.
        :param rkey: Key to access the specified memory address.
        :param addr: Start address of the buffer
        :return: None
        """
        self.send_wr.wr.rdma.remote_addr = addr
        self.send_wr.wr.rdma.rkey = rkey

    def set_wr_atomic(self, rkey, addr, compare_add, swap=0):
        """
        Set the members of the atomic struct in the send_wr's wr union, used
        for the atomic extended transport header.
        :param rkey: Key to access the specified memory address.
        :param addr: Start address of the buffer
        :param compare_add: The data operand used in the compare portion of
                            the compare and swap operation
        :param swap: The data operand used in atomic operations:
                     - In compare and swap this field is swapped into the
                       addressed buffer
                     - In fetch and add this field is added to the contents
                       of the addressed buffer
        :return: None
        """
        self.send_wr.wr.atomic.remote_addr = addr
        self.send_wr.wr.atomic.rkey = rkey
        self.send_wr.wr.atomic.compare_add = compare_add
        self.send_wr.wr.atomic.swap = swap

    def set_bind_wr(self, MW mw, MWBindInfo bind_info):
        """
        Set the members of the bind_mw struct in the send_wr.
        :param mw: The MW to bind.
        :param bind_info: MWBindInfo object, includes the bind attributes.
        :return: None
        """
        self.send_wr.bind_mw.mw = mw.mw
        # Create the new key from the MW rkey.
        rkey = inc_rkey(mw.rkey)
        self.send_wr.bind_mw.rkey = rkey
        self.send_wr.bind_mw.bind_info = bind_info.info

    @property
    def rkey(self):
        return self.send_wr.bind_mw.rkey

    def set_qp_type_xrc(self, remote_srqn):
        """
        Set the members of the xrc struct in the send_wr's qp_type union,
        used for the XRC extended transport header.
        :param remote_srqn: The XRC SRQ number to be used by the responder
                            for this packet
        :return: None
        """
        self.send_wr.qp_type.xrc.remote_srqn = remote_srqn


def send_flags_to_str(flags):
    send_flags = {e.IBV_SEND_FENCE: 'IBV_SEND_FENCE',
                  e.IBV_SEND_SIGNALED: 'IBV_SEND_SIGNALED',
                  e.IBV_SEND_SOLICITED: 'IBV_SEND_SOLICITED',
                  e.IBV_SEND_INLINE: 'IBV_SEND_INLINE',
                  e.IBV_SEND_IP_CSUM: 'IBV_SEND_IP_CSUM'}
    flags_str = ''
    for f in send_flags:
        if flags & f:
            flags_str += send_flags[f]
            flags_str += ' '
    return flags_str


cdef copy_sg_array(v.ibv_sge *dst, sg, num_sge):
    cdef v.ibv_sge *src
    for i in range(num_sge):
        src = (<SGE>sg[i]).sge
        memcpy(dst, src, sizeof(v.ibv_sge))
        dst += 1
rdma-core-56.1/pyverbs/xrcd.pxd000066400000000000000000000007511477342711600165040ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2019, Mellanox Technologies. All rights reserved.
#cython: language_level=3 from pyverbs.base cimport PyverbsCM, PyverbsObject from pyverbs.device cimport Context cimport pyverbs.libibverbs as v cdef class XRCDInitAttr(PyverbsObject): cdef v.ibv_xrcd_init_attr attr cdef class XRCD(PyverbsCM): cdef v.ibv_xrcd *xrcd cdef Context ctx cdef add_ref(self, obj) cdef object srqs cdef object qps rdma-core-56.1/pyverbs/xrcd.pyx000066400000000000000000000054011477342711600165260ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. import weakref from pyverbs.pyverbs_error import PyverbsError, PyverbsRDMAError from pyverbs.base import PyverbsRDMAErrno from pyverbs.base cimport close_weakrefs from pyverbs.device cimport Context from pyverbs.srq cimport SRQ from pyverbs.qp cimport QP cdef class XRCDInitAttr(PyverbsObject): def __init__(self, comp_mask, oflags, fd): super().__init__() self.attr.fd = fd self.attr.comp_mask = comp_mask self.attr.oflags = oflags @property def fd(self): return self.attr.fd @fd.setter def fd(self, val): self.attr.fd = val @property def comp_mask(self): return self.attr.comp_mask @comp_mask.setter def comp_mask(self, val): self.attr.comp_mask = val @property def oflags(self): return self.attr.oflags @oflags.setter def oflags(self, val): self.attr.oflags = val cdef class XRCD(PyverbsCM): def __init__(self, Context context not None, XRCDInitAttr init_attr not None): """ Initializes a XRCD object. :param context: The Context object creating the XRCD :return: The newly created XRCD on success """ super().__init__() self.xrcd = v.ibv_open_xrcd( context.context, &init_attr.attr) if self.xrcd == NULL: raise PyverbsRDMAErrno('Failed to allocate XRCD') self.ctx = context context.add_ref(self) self.logger.debug('XRCD: Allocated ibv_xrcd') self.srqs = weakref.WeakSet() self.qps = weakref.WeakSet() def __dealloc__(self): """ Closes the inner XRCD. :return: None """ self.close() cpdef close(self): """ Closes the underlying C object of the XRCD. :return: None """ # XRCD may be deleted directly or indirectly by closing its context, # which leaves the Python XRCD object without the underlying C object, # so during destruction, need to check whether or not the C object # exists. if self.xrcd != NULL: if self.logger: self.logger.debug('Closing XRCD') close_weakrefs([self.qps, self.srqs]) rc = v.ibv_close_xrcd(self.xrcd) if rc != 0: raise PyverbsRDMAError('Failed to dealloc XRCD', rc) self.xrcd = NULL self.ctx = None cdef add_ref(self, obj): if isinstance(obj, QP): self.qps.add(obj) elif isinstance(obj, SRQ): self.srqs.add(obj) else: raise PyverbsError('Unrecognized object type') rdma-core-56.1/rdma-ndd/000077500000000000000000000000001477342711600150205ustar00rootroot00000000000000rdma-core-56.1/rdma-ndd/CMakeLists.txt000066400000000000000000000011141477342711600175550ustar00rootroot00000000000000# COPYRIGHT (c) 2016 Intel Corporation. # Licensed under BSD (MIT variant) or GPLv2. See COPYING. 
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS}") rdma_sbin_executable(rdma-ndd rdma-ndd.c ) target_link_libraries(rdma-ndd LINK_PRIVATE ${SYSTEMD_LIBRARIES} ${UDEV_LIBRARIES} ) # FIXME Autogenerate from the .rst rdma_man_pages( rdma-ndd.8.in ) install(FILES "rdma-ndd.rules" RENAME "60-rdma-ndd.rules" DESTINATION "${CMAKE_INSTALL_UDEV_RULESDIR}") rdma_subst_install(FILES "rdma-ndd.service.in" DESTINATION "${CMAKE_INSTALL_SYSTEMD_SERVICEDIR}" RENAME "rdma-ndd.service") rdma-core-56.1/rdma-ndd/rdma-ndd.8.in000066400000000000000000000050711477342711600172070ustar00rootroot00000000000000.\" Man page generated from reStructuredText. . .TH RDMA-NDD 8 "@BUILD_DATE@" "" "OpenIB Diagnostics" .SH NAME RDMA-NDD \- RDMA device Node Description update daemon . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .SH SYNOPSIS .sp rdma\-ndd .SH DESCRIPTION .sp rdma\-ndd is a system daemon which watches for rdma device changes and/or hostname changes and updates the Node Description of the rdma devices based on those changes. .SH DETAILS .sp Full operation of this daemon requires kernels which support polling of the procfs hostname file as well as libudev. .sp If your system does not support either of these features, the daemon will set the Node Descriptions at start up and then sleep forever. .SS Node Description configuration .sp The daemon uses the environment variable RDMA_NDD_ND_FORMAT to set the node description. The following wild cards can be specified for more dynamic control. .sp %h \-\- replace with the current hostname (not including domain) .sp %d \-\- replace with the device name (for example mlx4_0, qib0, etc.) .sp If not specified the default is "%h %d". .sp NOTE: At startup, and on new device detection, the Node Description is always written to ensure the SM and rdma\-ndd are in sync. Subsequent events will only write the Node Description on a device if it has changed. .SS Using systemd .sp Setting the environment variable for the daemon is normally be done via a systemd drop in unit. For example the following could be added to a file named /etc/systemd/system/rdma\-ndd.service.d/nd\-format.conf to use only the hostname as your node description. .sp [Service] Environment="RDMA_NDD_ND_FORMAT=%%h" .sp NOTE: Systemd requires an extra \(aq%\(aq. .SH OPTIONS .sp \fB\-f, \-\-foreground\fP Run in the foreground instead of as a daemon .sp \fB\-d, \-\-debug\fP Log additional debugging information to syslog .sp \fB\-\-systemd\fP Enable systemd integration. .SH AUTHOR .INDENT 0.0 .TP .B Ira Weiny < \fI\%ira.weiny@intel.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . 
rdma-core-56.1/rdma-ndd/rdma-ndd.8.in.rst000066400000000000000000000036721477342711600200230ustar00rootroot00000000000000======== RDMA-NDD ======== ------------------------------------------ RDMA device Node Description update daemon ------------------------------------------ :Date: @BUILD_DATE@ :Manual section: 8 :Manual group: OpenIB Diagnostics SYNOPSIS ======== rdma-ndd DESCRIPTION =========== rdma-ndd is a system daemon which watches for rdma device changes and/or hostname changes and updates the Node Description of the rdma devices based on those changes. DETAILS ======= Full operation of this daemon requires kernels which support polling of the procfs hostname file as well as libudev. If your system does not support either of these features, the daemon will set the Node Descriptions at start up and then sleep forever. Node Description configuration ------------------------------ The daemon uses the environment variable RDMA_NDD_ND_FORMAT to set the node description. The following wild cards can be specified for more dynamic control. %h -- replace with the current hostname (not including domain) %d -- replace with the device name (for example mlx4_0, qib0, etc.) If not specified the default is "%h %d". NOTE: At startup, and on new device detection, the Node Description is always written to ensure the SM and rdma-ndd are in sync. Subsequent events will only write the Node Description on a device if it has changed. Using systemd ------------- Setting the environment variable for the daemon is normally be done via a systemd drop in unit. For example the following could be added to a file named /etc/systemd/system/rdma-ndd.service.d/nd-format.conf to use only the hostname as your node description. [Service] Environment="RDMA_NDD_ND_FORMAT=%%h" NOTE: Systemd requires an extra '%'. OPTIONS ======= **-f, --foreground** Run in the foreground instead of as a daemon **-d, --debug** Log additional debugging information to syslog **--systemd** Enable systemd integration. AUTHOR ====== Ira Weiny < ira.weiny@intel.com > rdma-core-56.1/rdma-ndd/rdma-ndd.c000066400000000000000000000171201477342711600166530ustar00rootroot00000000000000/* * Copyright (c) 2014,2016 Intel Corporation. All Rights Reserved * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
* */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include static struct udev *g_udev; static struct udev_monitor *g_mon; #define SYS_HOSTNAME "/proc/sys/kernel/hostname" #define SYS_INFINIBAND "/sys/class/infiniband" #define DEFAULT_ND_FORMAT "%h %d" static char *g_nd_format = NULL; static bool debugging; static void newline_to_null(char *str) { char *term = index(str, '\n'); if (term) *term = '\0'; } static void strip_domain(char *str) { char *term = index(str, '.'); if (term) *term = '\0'; } static __attribute__((format(printf, 1, 2))) void dbg_log(const char *fmt, ...) { va_list ap; if (!debugging) return; va_start(ap, fmt); vsyslog(LOG_DEBUG, fmt, ap); va_end(ap); } static void build_node_desc(char *dest, size_t len, const char *device, const char *hostname) { char *end = dest + len-1; const char *field; char *src = g_nd_format; while (*src && (dest < end)) { if (*src != '%') { *dest++ = *src++; } else { src++; switch (*src) { case 'h': field = hostname; while (*field && (*field != '.') && (dest < end)) *dest++ = *field++; break; case 'd': field = device; while (*field && (dest < end)) *dest++ = *field++; break; } src++; } } *dest = 0; } static int update_node_desc(const char *device, const char *hostname, int force) { int rc; char nd[128]; char new_nd[64]; char nd_file[PATH_MAX]; FILE *f; snprintf(nd_file, sizeof(nd_file), SYS_INFINIBAND "/%s/node_desc", device); nd_file[sizeof(nd_file)-1] = '\0'; f = fopen(nd_file, "r+"); if (!f) return -EIO; if (!fgets(nd, sizeof(nd), f)) { syslog(LOG_ERR, "Failed to read %s\n", nd_file); rc = -EIO; goto error; } newline_to_null(nd); build_node_desc(new_nd, sizeof(new_nd), device, hostname); if (!force && strncmp(new_nd, nd, sizeof(new_nd)) == 0) { dbg_log("%s: no change (%s)\n", device, new_nd); } else { dbg_log("%s: change (%s) -> (%s)\n", device, nd, new_nd); rewind(f); fprintf(f, "%s", new_nd); } rc = 0; error: fclose(f); return rc; } static void set_rdma_node_desc(const char *hostname, int force) { DIR *class_dir; struct dirent *dent; class_dir = opendir(SYS_INFINIBAND); if (!class_dir) { syslog(LOG_ERR, "Failed to open " SYS_INFINIBAND); return; } while ((dent = readdir(class_dir))) { if (dent->d_name[0] == '.') continue; if (update_node_desc(dent->d_name, hostname, force)) syslog(LOG_DEBUG, "set Node Description failed on %s\n", dent->d_name); } closedir(class_dir); } static void read_hostname(int fd, char *name, size_t len) { memset(name, 0, len); if (read(fd, name, len-1) >= 0) { newline_to_null(name); strip_domain(name); } else { syslog(LOG_ERR, "Read %s Failed\n", SYS_HOSTNAME); } lseek(fd, 0, SEEK_SET); } static void setup_udev(void) { g_udev = udev_new(); if (!g_udev) { syslog(LOG_ERR, "udev_new failed\n"); return; } } static int get_udev_fd(void) { g_mon = udev_monitor_new_from_netlink(g_udev, "udev"); if (!g_mon) { syslog(LOG_ERR, "udev monitoring failed\n"); return -1; } udev_monitor_filter_add_match_subsystem_devtype(g_mon, "infiniband", NULL); udev_monitor_enable_receiving(g_mon); return udev_monitor_get_fd(g_mon); } static void process_udev_event(int ud_fd, const char *hostname) { struct udev_device *dev; dev = udev_monitor_receive_device(g_mon); if (dev) { const char *device = udev_device_get_sysname(dev); const char *action = udev_device_get_action(dev); dbg_log("Device event: %s, %s, %s\n", udev_device_get_subsystem(dev), device, action); if (device && action && (!strncmp(action, "add", sizeof("add")) || !strncmp(action, 
"move", sizeof("add")))) if (update_node_desc(device, hostname, 1)) syslog(LOG_DEBUG, "set Node Description failed on %s\n", device); udev_device_unref(dev); } } static void monitor(bool systemd) { char hostname[128]; int hn_fd; struct pollfd fds[2]; int numfds = 1; int ud_fd; hn_fd = open(SYS_HOSTNAME, O_RDONLY); if (hn_fd < 0) { syslog(LOG_ERR, "Open %s Failed exiting\n", SYS_HOSTNAME); exit(EXIT_FAILURE); } read_hostname(hn_fd, hostname, sizeof(hostname)); fds[0].fd = hn_fd; fds[0].events = 0; ud_fd = get_udev_fd(); if (ud_fd >= 0) numfds = 2; fds[1].fd = ud_fd; fds[1].events = POLLIN; if (systemd) sd_notify(0, "READY=1"); set_rdma_node_desc((const char *)hostname, 1); while (1) { if (poll(fds, numfds, -1) <= 0) { syslog(LOG_ERR, "Poll %s failed; exiting\n", SYS_HOSTNAME); exit(EXIT_FAILURE); } if (fds[0].revents != 0) { read_hostname(hn_fd, hostname, sizeof(hostname)); dbg_log("Hostname event: %s\n", hostname); set_rdma_node_desc((const char *)hostname, 0); } if (fds[1].revents != 0) process_udev_event(ud_fd, hostname); } } int main(int argc, char *argv[]) { bool foreground = false; bool systemd = false; openlog(NULL, LOG_NDELAY | LOG_CONS | LOG_PID, LOG_DAEMON); while (1) { static const struct option long_opts[] = { { "foreground", 0, NULL, 'f' }, { "systemd", 0, NULL, 's' }, { "help", 0, NULL, 'h' }, { "debug", 0, NULL, 'd' }, { } }; int c = getopt_long(argc, argv, "fh", long_opts, NULL); if (c == -1) break; switch (c) { case 'f': foreground = true; break; case 's': systemd = true; break; case 'd': debugging = true; break; case 'h': printf("rdma-ndd [options]\n"); printf(" See 'man rdma-ndd' for details\n"); return 0; default: break; } } if (!foreground && !systemd) { if (daemon(0, 0) != 0) { syslog(LOG_ERR, "Failed to daemonize\n"); return EXIT_FAILURE; } } setup_udev(); g_nd_format = getenv("RDMA_NDD_ND_FORMAT"); if (g_nd_format && strncmp("", g_nd_format, strlen(g_nd_format)) != 0) g_nd_format = strdup(g_nd_format); else g_nd_format = strdup(DEFAULT_ND_FORMAT); dbg_log("Node Descriptor format (%s)\n", g_nd_format); monitor(systemd); return 0; } rdma-core-56.1/rdma-ndd/rdma-ndd.rules000066400000000000000000000003461477342711600175650ustar00rootroot00000000000000# If an InfiniBand/RDMA device is installed with a writable node_description # sysfs then start rdma-ndd to keep it up to date SUBSYSTEM=="infiniband", TAG+="systemd", ATTRS{node_desc}=="*", ENV{SYSTEMD_WANTS}+="rdma-ndd.service" rdma-core-56.1/rdma-ndd/rdma-ndd.service.in000066400000000000000000000016311477342711600204760ustar00rootroot00000000000000[Unit] Description=RDMA Node Description Daemon Documentation=man:rdma-ndd StopWhenUnneeded=yes # rdma-ndd is a kernel support program and needs to run as early as possible, # before the network link is brought up, and before an external manager tries # to read the local node description. 
DefaultDependencies=no Before=sysinit.target # Do not execute concurrently with an ongoing shutdown (required for DefaultDependencies=no) Conflicts=shutdown.target Before=shutdown.target # Networking, particularly link up, should not happen until ndd is ready Wants=network-pre.target Before=network-pre.target # rdma-hw is not ready until ndd is running Before=rdma-hw.target [Service] Type=notify Restart=always ExecStart=@CMAKE_INSTALL_FULL_SBINDIR@/rdma-ndd --systemd ProtectSystem=full ProtectHome=true ProtectKernelLogs=true # rdma-ndd is automatically wanted by udev when an RDMA device with a node description is present rdma-core-56.1/redhat/000077500000000000000000000000001477342711600146015ustar00rootroot00000000000000rdma-core-56.1/redhat/rdma-core.spec000066400000000000000000000522031477342711600173300ustar00rootroot00000000000000Name: rdma-core Version: 56.1 Release: 1%{?dist} Summary: RDMA core userspace libraries and daemons # Almost everything is licensed under the OFA dual GPLv2, 2 Clause BSD license # providers/ipathverbs/ Dual licensed using a BSD license with an extra patent clause # providers/rxe/ Incorporates code from ipathverbs and contains the patent clause # providers/hfi1verbs Uses the 3 Clause BSD license License: GPLv2 or BSD Url: https://github.com/linux-rdma/rdma-core Source: rdma-core-%{version}.tar.gz # Do not build static libs by default. %define with_static %{?_with_static: 1} %{?!_with_static: 0} # 32-bit arm is missing required arch-specific memory barriers, ExcludeArch: %{arm} BuildRequires: binutils BuildRequires: cmake >= 2.8.11 BuildRequires: gcc BuildRequires: libudev-devel BuildRequires: pkgconfig BuildRequires: pkgconfig(libnl-3.0) BuildRequires: pkgconfig(libnl-route-3.0) BuildRequires: /usr/bin/rst2man BuildRequires: valgrind-devel %if 0%{?fedora} < 37 BuildRequires: systemd %endif BuildRequires: systemd-devel %if 0%{?fedora} >= 32 || 0%{?rhel} >= 8 %define with_pyverbs %{?_with_pyverbs: 1} %{?!_with_pyverbs: %{?!_without_pyverbs: 1} %{?_without_pyverbs: 0}} %else %define with_pyverbs %{?_with_pyverbs: 1} %{?!_with_pyverbs: 0} %endif %if %{with_pyverbs} BuildRequires: python3-devel BuildRequires: python3-Cython %else %if 0%{?rhel} >= 8 || 0%{?fedora} >= 30 BuildRequires: python3 %else BuildRequires: python %endif %endif %if 0%{?rhel} >= 8 || 0%{?fedora} >= 30 || %{with_pyverbs} BuildRequires: python3-docutils %else BuildRequires: python-docutils %endif %if 0%{?fedora} >= 21 || 0%{?rhel} >= 8 BuildRequires: perl-generators %endif Requires: pciutils # Red Hat/Fedora previously shipped redhat/ as a stand-alone # package called 'rdma', which we're supplanting here. Provides: rdma = %{version}-%{release} Obsoletes: rdma < %{version}-%{release} Conflicts: infiniband-diags <= 1.6.7 # Since we recommend developers use Ninja, so should packagers, for consistency. 
%define CMAKE_FLAGS %{nil} %if 0%{?fedora} >= 23 || 0%{?rhel} >= 8 # Ninja was introduced in FC23 BuildRequires: ninja-build %define CMAKE_FLAGS -GNinja %if 0%{?fedora} >= 33 || 0%{?rhel} >= 9 %define make_jobs ninja-build -C %{_vpath_builddir} -v %{?_smp_mflags} %define cmake_install DESTDIR=%{buildroot} ninja-build -C %{_vpath_builddir} install %else %define make_jobs ninja-build -v %{?_smp_mflags} %define cmake_install DESTDIR=%{buildroot} ninja-build install %endif %else # Fallback to make otherwise BuildRequires: make %define make_jobs make VERBOSE=1 %{?_smp_mflags} %define cmake_install DESTDIR=%{buildroot} make install %endif %if 0%{?fedora} >= 25 || 0%{?rhel} == 8 # pandoc was introduced in FC25, Centos8 BuildRequires: pandoc %endif %description RDMA core userspace infrastructure and documentation, including initialization scripts, kernel driver-specific modprobe override configs, IPoIB network scripts, dracut rules, and the rdma-ndd utility. %package devel Summary: RDMA core development libraries and headers Requires: libibverbs%{?_isa} = %{version}-%{release} Provides: libibverbs-devel = %{version}-%{release} Obsoletes: libibverbs-devel < %{version}-%{release} Requires: libibumad%{?_isa} = %{version}-%{release} Provides: libibumad-devel = %{version}-%{release} Obsoletes: libibumad-devel < %{version}-%{release} Requires: librdmacm%{?_isa} = %{version}-%{release} Provides: librdmacm-devel = %{version}-%{release} Obsoletes: librdmacm-devel < %{version}-%{release} Provides: ibacm-devel = %{version}-%{release} Obsoletes: ibacm-devel < %{version}-%{release} Requires: infiniband-diags%{?_isa} = %{version}-%{release} Provides: infiniband-diags-devel = %{version}-%{release} Obsoletes: infiniband-diags-devel < %{version}-%{release} Provides: libibmad-devel = %{version}-%{release} Obsoletes: libibmad-devel < %{version}-%{release} %if %{with_static} # Since our pkg-config files include private references to these packages they # need to have their .pc files installed too, even for dynamic linking, or # pkg-config breaks. BuildRequires: pkgconfig(libnl-3.0) BuildRequires: pkgconfig(libnl-route-3.0) %endif %description devel RDMA core development libraries and headers. %changelog * Tue May 9 2023 Leon Romanovsky - 44 - Fix epoch warning %package -n infiniband-diags Summary: InfiniBand Diagnostic Tools Provides: perl(IBswcountlimits) Provides: libibmad = %{version}-%{release} Obsoletes: libibmad < %{version}-%{release} Obsoletes: openib-diags < 1.3 %description -n infiniband-diags This package provides IB diagnostic programs and scripts needed to diagnose an IB subnet. infiniband-diags now also provides libibmad. libibmad provides low layer IB functions for use by the IB diagnostic and management programs. These include MAD, SA, SMP, and other basic IB functions. %package -n infiniband-diags-compat Summary: OpenFabrics Alliance InfiniBand Diagnostic Tools %description -n infiniband-diags-compat Deprecated scripts and utilities which provide duplicated functionality, most often at a reduced performance. These are maintained for the time being for compatibility reasons. 
%package -n libibverbs Summary: A library and drivers for direct userspace use of RDMA (InfiniBand/iWARP/RoCE) hardware Requires(post): /sbin/ldconfig Requires(postun): /sbin/ldconfig Provides: libcxgb4 = %{version}-%{release} Obsoletes: libcxgb4 < %{version}-%{release} Provides: libefa = %{version}-%{release} Obsoletes: libefa < %{version}-%{release} Provides: liberdma = %{version}-%{release} Obsoletes: liberdma < %{version}-%{release} Provides: libhfi1 = %{version}-%{release} Obsoletes: libhfi1 < %{version}-%{release} Provides: libhns = %{version}-%{release} Obsoletes: libhns < %{version}-%{release} Provides: libipathverbs = %{version}-%{release} Obsoletes: libipathverbs < %{version}-%{release} Provides: libirdma = %{version}-%{release} Obsoletes: libirdma < %{version}-%{release} Provides: libmana = %{version}-%{release} Obsoletes: libmana < %{version}-%{release} Provides: libmlx4 = %{version}-%{release} Obsoletes: libmlx4 < %{version}-%{release} Provides: libmlx5 = %{version}-%{release} Obsoletes: libmlx5 < %{version}-%{release} Provides: libmthca = %{version}-%{release} Obsoletes: libmthca < %{version}-%{release} Provides: libocrdma = %{version}-%{release} Obsoletes: libocrdma < %{version}-%{release} Provides: librxe = %{version}-%{release} Obsoletes: librxe < %{version}-%{release} %description -n libibverbs libibverbs is a library that allows userspace processes to use RDMA "verbs" as described in the InfiniBand Architecture Specification and the RDMA Protocol Verbs Specification. This includes direct hardware access from userspace to InfiniBand/iWARP adapters (kernel bypass) for fast path operations. Device-specific plug-in ibverbs userspace drivers are included: - libcxgb4: Chelsio T4 iWARP HCA - libefa: Amazon Elastic Fabric Adapter - liberdma: Alibaba Elastic RDMA (iWarp) Adapter - libhfi1: Intel Omni-Path HFI - libhns: HiSilicon Hip08+ SoC - libipathverbs: QLogic InfiniPath HCA - libirdma: Intel Ethernet Connection RDMA - libmana: Microsoft Azure Network Adapter - libmlx4: Mellanox ConnectX-3 InfiniBand HCA - libmlx5: Mellanox Connect-IB/X-4+ InfiniBand HCA - libmthca: Mellanox InfiniBand HCA - libocrdma: Emulex OneConnect RDMA/RoCE Device - libqedr: QLogic QL4xxx RoCE HCA - librxe: A software implementation of the RoCE protocol - libsiw: A software implementation of the iWarp protocol - libvmw_pvrdma: VMware paravirtual RDMA device %package -n libibverbs-utils Summary: Examples for the libibverbs library Requires: libibverbs%{?_isa} = %{version}-%{release} %description -n libibverbs-utils Useful libibverbs example programs such as ibv_devinfo, which displays information about RDMA devices. %package -n ibacm Summary: InfiniBand Communication Manager Assistant Requires(post): systemd-units Requires(preun): systemd-units Requires(postun): systemd-units %description -n ibacm The ibacm daemon helps reduce the load of managing path record lookups on large InfiniBand fabrics by providing a user space implementation of what is functionally similar to an ARP cache. The use of ibacm, when properly configured, can reduce the SA packet load of a large IB cluster from O(n^2) to O(n). The ibacm daemon is started and normally runs in the background, user applications need not know about this daemon as long as their app uses librdmacm to handle connection bring up/tear down. The librdmacm library knows how to talk directly to the ibacm daemon to retrieve data. 
%package -n iwpmd Summary: iWarp Port Mapper userspace daemon Requires(post): systemd-units Requires(preun): systemd-units Requires(postun): systemd-units %description -n iwpmd iwpmd provides a userspace service for iWarp drivers to claim tcp ports through the standard socket interface. %package -n libibumad Summary: OpenFabrics Alliance InfiniBand umad (userspace management datagram) library %description -n libibumad libibumad provides the userspace management datagram (umad) library functions, which sit on top of the umad modules in the kernel. These are used by the IB diagnostic and management tools, including OpenSM. %package -n librdmacm Summary: Userspace RDMA Connection Manager %description -n librdmacm librdmacm provides a userspace RDMA Communication Management API. %package -n librdmacm-utils Summary: Examples for the librdmacm library Requires: librdmacm%{?_isa} = %{version}-%{release} %description -n librdmacm-utils Example test programs for the librdmacm library. %package -n srp_daemon Summary: Tools for using the InfiniBand SRP protocol devices Obsoletes: srptools <= 1.0.3 Provides: srptools = %{version}-%{release} Obsoletes: openib-srptools <= 0.0.6 Requires(post): systemd-units Requires(preun): systemd-units Requires(postun): systemd-units %description -n srp_daemon In conjunction with the kernel ib_srp driver, srp_daemon allows you to discover and use SCSI devices via the SCSI RDMA Protocol over InfiniBand. %if %{with_pyverbs} %package -n python3-pyverbs Summary: Python3 API over IB verbs %{?python_provide:%python_provide python3-pyverbs} %description -n python3-pyverbs Pyverbs is a Cython-based Python API over libibverbs, providing an easy, object-oriented access to IB verbs. %endif %prep %setup %build # New RPM defines _rundir, usually as /run %if 0%{?_rundir:1} %else %define _rundir /var/run %endif %{!?EXTRA_CMAKE_FLAGS: %define EXTRA_CMAKE_FLAGS %{nil}} # Pass all of the rpm paths directly to GNUInstallDirs and our other defines. 
%cmake %{CMAKE_FLAGS} \ -DCMAKE_BUILD_TYPE=Release \ -DCMAKE_INSTALL_BINDIR:PATH=%{_bindir} \ -DCMAKE_INSTALL_SBINDIR:PATH=%{_sbindir} \ -DCMAKE_INSTALL_LIBDIR:PATH=%{_lib} \ -DCMAKE_INSTALL_LIBEXECDIR:PATH=%{_libexecdir} \ -DCMAKE_INSTALL_LOCALSTATEDIR:PATH=%{_localstatedir} \ -DCMAKE_INSTALL_SHAREDSTATEDIR:PATH=%{_sharedstatedir} \ -DCMAKE_INSTALL_INCLUDEDIR:PATH=include \ -DCMAKE_INSTALL_INFODIR:PATH=%{_infodir} \ -DCMAKE_INSTALL_MANDIR:PATH=%{_mandir} \ -DCMAKE_INSTALL_SYSCONFDIR:PATH=%{_sysconfdir} \ -DCMAKE_INSTALL_SYSTEMD_SERVICEDIR:PATH=%{_unitdir} \ -DCMAKE_INSTALL_INITDDIR:PATH=%{_initrddir} \ -DCMAKE_INSTALL_RUNDIR:PATH=%{_rundir} \ -DCMAKE_INSTALL_DOCDIR:PATH=%{_docdir}/%{name} \ -DCMAKE_INSTALL_UDEV_RULESDIR:PATH=%{_udevrulesdir} \ -DCMAKE_INSTALL_PERLDIR:PATH=%{perl_vendorlib} \ -DENABLE_IBDIAGS_COMPAT:BOOL=True \ %if %{with_static} -DENABLE_STATIC=1 \ %endif %{EXTRA_CMAKE_FLAGS} \ %if %{defined __python3} -DPYTHON_EXECUTABLE:PATH=%{__python3} \ -DCMAKE_INSTALL_PYTHON_ARCH_LIB:PATH=%{python3_sitearch} \ %endif %if %{with_pyverbs} -DNO_PYVERBS=0 %else -DNO_PYVERBS=1 %endif %make_jobs %install %cmake_install mkdir -p %{buildroot}/%{_sysconfdir}/rdma # Red Hat specific glue %global dracutlibdir %{_prefix}/lib/dracut %global sysmodprobedir %{_prefix}/lib/modprobe.d mkdir -p %{buildroot}%{_libexecdir} mkdir -p %{buildroot}%{_udevrulesdir} mkdir -p %{buildroot}%{dracutlibdir}/modules.d/05rdma mkdir -p %{buildroot}%{sysmodprobedir} install -D -m0644 redhat/rdma.mlx4.conf %{buildroot}/%{_sysconfdir}/rdma/mlx4.conf install -D -m0755 redhat/rdma.modules-setup.sh %{buildroot}%{dracutlibdir}/modules.d/05rdma/module-setup.sh install -D -m0644 redhat/rdma.mlx4.sys.modprobe %{buildroot}%{sysmodprobedir}/libmlx4.conf install -D -m0755 redhat/rdma.mlx4-setup.sh %{buildroot}%{_libexecdir}/mlx4-setup.sh rm -f %{buildroot}%{_sysconfdir}/rdma/modules/rdma.conf install -D -m0644 redhat/rdma.conf %{buildroot}%{_sysconfdir}/rdma/modules/rdma.conf # ibacm (if [ -d %{__cmake_builddir} ]; then cd %{__cmake_builddir}; fi ./bin/ib_acme -D . 
-O && install -D -m0644 ibacm_opts.cfg %{buildroot}%{_sysconfdir}/rdma/) # Delete the package's init.d scripts rm -rf %{buildroot}/%{_initrddir}/ rm -f %{buildroot}/%{_sbindir}/srp_daemon.sh %post -n rdma-core if [ -x /sbin/udevadm ]; then /sbin/udevadm trigger --subsystem-match=infiniband --action=change || true /sbin/udevadm trigger --subsystem-match=net --action=change || true /sbin/udevadm trigger --subsystem-match=infiniband_mad --action=change || true fi %post -n infiniband-diags -p /sbin/ldconfig %postun -n infiniband-diags -p /sbin/ldconfig %post -n libibverbs -p /sbin/ldconfig %postun -n libibverbs -p /sbin/ldconfig %post -n libibumad -p /sbin/ldconfig %postun -n libibumad -p /sbin/ldconfig %post -n librdmacm -p /sbin/ldconfig %postun -n librdmacm -p /sbin/ldconfig %post -n ibacm %systemd_post ibacm.service %preun -n ibacm %systemd_preun ibacm.service %postun -n ibacm %systemd_postun_with_restart ibacm.service %post -n srp_daemon %systemd_post srp_daemon.service %preun -n srp_daemon %systemd_preun srp_daemon.service %postun -n srp_daemon %systemd_postun_with_restart srp_daemon.service %post -n iwpmd %systemd_post iwpmd.service %preun -n iwpmd %systemd_preun iwpmd.service %postun -n iwpmd %systemd_postun_with_restart iwpmd.service %files %dir %{_sysconfdir}/rdma %dir %{_docdir}/%{name} %doc %{_docdir}/%{name}/70-persistent-ipoib.rules %doc %{_docdir}/%{name}/README.md %doc %{_docdir}/%{name}/rxe.md %doc %{_docdir}/%{name}/udev.md %doc %{_docdir}/%{name}/tag_matching.md %config(noreplace) %{_sysconfdir}/rdma/mlx4.conf %config(noreplace) %{_sysconfdir}/rdma/modules/infiniband.conf %config(noreplace) %{_sysconfdir}/rdma/modules/iwarp.conf %config(noreplace) %{_sysconfdir}/rdma/modules/opa.conf %config(noreplace) %{_sysconfdir}/rdma/modules/rdma.conf %config(noreplace) %{_sysconfdir}/rdma/modules/roce.conf %dir %{_sysconfdir}/modprobe.d %config(noreplace) %{_sysconfdir}/modprobe.d/mlx4.conf %config(noreplace) %{_sysconfdir}/modprobe.d/truescale.conf %{_unitdir}/rdma-hw.target %{_unitdir}/rdma-load-modules@.service %dir %{dracutlibdir} %dir %{dracutlibdir}/modules.d %dir %{dracutlibdir}/modules.d/05rdma %{dracutlibdir}/modules.d/05rdma/module-setup.sh %dir %{_udevrulesdir} %{_udevrulesdir}/../rdma_rename %{_udevrulesdir}/60-rdma-ndd.rules %{_udevrulesdir}/60-rdma-persistent-naming.rules %{_udevrulesdir}/75-rdma-description.rules %{_udevrulesdir}/90-rdma-hw-modules.rules %{_udevrulesdir}/90-rdma-ulp-modules.rules %{_udevrulesdir}/90-rdma-umad.rules %dir %{sysmodprobedir} %{sysmodprobedir}/libmlx4.conf %{_libexecdir}/mlx4-setup.sh %{_libexecdir}/truescale-serdes.cmds %{_sbindir}/rdma-ndd %{_unitdir}/rdma-ndd.service %{_mandir}/man7/rxe* %{_mandir}/man8/rdma-ndd.* %license COPYING.* %files devel %doc %{_docdir}/%{name}/MAINTAINERS %dir %{_includedir}/infiniband %dir %{_includedir}/rdma %{_includedir}/infiniband/* %{_includedir}/rdma/* %if %{with_static} %{_libdir}/lib*.a %endif %{_libdir}/lib*.so %{_libdir}/pkgconfig/*.pc %{_mandir}/man3/efadv* %{_mandir}/man3/hnsdv* %{_mandir}/man3/ibv_* %{_mandir}/man3/rdma* %{_mandir}/man3/umad* %{_mandir}/man3/*_to_ibv_rate.* %{_mandir}/man7/rdma_cm.* %{_mandir}/man3/manadv* %{_mandir}/man3/mlx5dv* %{_mandir}/man3/mlx4dv* %{_mandir}/man7/efadv* %{_mandir}/man7/hnsdv* %{_mandir}/man7/manadv* %{_mandir}/man7/mlx5dv* %{_mandir}/man7/mlx4dv* %{_mandir}/man3/ibnd_* %files -n infiniband-diags-compat %{_sbindir}/ibcheckerrs %{_mandir}/man8/ibcheckerrs* %{_sbindir}/ibchecknet %{_mandir}/man8/ibchecknet* %{_sbindir}/ibchecknode %{_mandir}/man8/ibchecknode* 
%{_sbindir}/ibcheckport %{_mandir}/man8/ibcheckport.* %{_sbindir}/ibcheckportwidth %{_mandir}/man8/ibcheckportwidth* %{_sbindir}/ibcheckportstate %{_mandir}/man8/ibcheckportstate* %{_sbindir}/ibcheckwidth %{_mandir}/man8/ibcheckwidth* %{_sbindir}/ibcheckstate %{_mandir}/man8/ibcheckstate* %{_sbindir}/ibcheckerrors %{_mandir}/man8/ibcheckerrors* %{_sbindir}/ibdatacounts %{_mandir}/man8/ibdatacounts* %{_sbindir}/ibdatacounters %{_mandir}/man8/ibdatacounters* %{_sbindir}/ibdiscover.pl %{_mandir}/man8/ibdiscover* %{_sbindir}/ibswportwatch.pl %{_mandir}/man8/ibswportwatch* %{_sbindir}/ibqueryerrors.pl %{_sbindir}/iblinkinfo.pl %{_sbindir}/ibprintca.pl %{_mandir}/man8/ibprintca* %{_sbindir}/ibprintswitch.pl %{_mandir}/man8/ibprintswitch* %{_sbindir}/ibprintrt.pl %{_mandir}/man8/ibprintrt* %{_sbindir}/set_nodedesc.sh %{_sbindir}/ibclearerrors %{_mandir}/man8/ibclearerrors* %{_sbindir}/ibclearcounters %{_mandir}/man8/ibclearcounters* %files -n infiniband-diags %{_sbindir}/ibaddr %{_mandir}/man8/ibaddr* %{_sbindir}/ibnetdiscover %{_mandir}/man8/ibnetdiscover* %{_sbindir}/ibping %{_mandir}/man8/ibping* %{_sbindir}/ibportstate %{_mandir}/man8/ibportstate* %{_sbindir}/ibroute %{_mandir}/man8/ibroute.* %{_sbindir}/ibstat %{_mandir}/man8/ibstat.* %{_sbindir}/ibsysstat %{_mandir}/man8/ibsysstat* %{_sbindir}/ibtracert %{_mandir}/man8/ibtracert* %{_sbindir}/perfquery %{_mandir}/man8/perfquery* %{_sbindir}/sminfo %{_mandir}/man8/sminfo* %{_sbindir}/smpdump %{_mandir}/man8/smpdump* %{_sbindir}/smpquery %{_mandir}/man8/smpquery* %{_sbindir}/saquery %{_mandir}/man8/saquery* %{_sbindir}/vendstat %{_mandir}/man8/vendstat* %{_sbindir}/iblinkinfo %{_mandir}/man8/iblinkinfo* %{_sbindir}/ibqueryerrors %{_mandir}/man8/ibqueryerrors* %{_sbindir}/ibcacheedit %{_mandir}/man8/ibcacheedit* %{_sbindir}/ibccquery %{_mandir}/man8/ibccquery* %{_sbindir}/ibccconfig %{_mandir}/man8/ibccconfig* %{_sbindir}/dump_fts %{_mandir}/man8/dump_fts* %{_sbindir}/ibhosts %{_mandir}/man8/ibhosts* %{_sbindir}/ibswitches %{_mandir}/man8/ibswitches* %{_sbindir}/ibnodes %{_mandir}/man8/ibnodes* %{_sbindir}/ibrouters %{_mandir}/man8/ibrouters* %{_sbindir}/ibfindnodesusing.pl %{_mandir}/man8/ibfindnodesusing* %{_sbindir}/ibidsverify.pl %{_mandir}/man8/ibidsverify* %{_sbindir}/check_lft_balance.pl %{_mandir}/man8/check_lft_balance* %{_sbindir}/dump_lfts.sh %{_mandir}/man8/dump_lfts* %{_sbindir}/dump_mfts.sh %{_mandir}/man8/dump_mfts* %{_sbindir}/ibstatus %{_mandir}/man8/ibstatus* %{_mandir}/man8/infiniband-diags* %{_libdir}/libibmad*.so.* %{_libdir}/libibnetdisc*.so.* %{perl_vendorlib}/IBswcountlimits.pm %config(noreplace) %{_sysconfdir}/infiniband-diags/error_thresholds %config(noreplace) %{_sysconfdir}/infiniband-diags/ibdiag.conf %files -n libibverbs %dir %{_sysconfdir}/libibverbs.d %dir %{_libdir}/libibverbs %{_libdir}/libefa.so.* %{_libdir}/libhns.so.* %{_libdir}/libibverbs*.so.* %{_libdir}/libibverbs/*.so %{_libdir}/libmana.so.* %{_libdir}/libmlx5.so.* %{_libdir}/libmlx4.so.* %config(noreplace) %{_sysconfdir}/libibverbs.d/*.driver %doc %{_docdir}/%{name}/libibverbs.md %files -n libibverbs-utils %{_bindir}/ibv_* %{_mandir}/man1/ibv_* %files -n ibacm %config(noreplace) %{_sysconfdir}/rdma/ibacm_opts.cfg %{_bindir}/ib_acme %{_sbindir}/ibacm %{_mandir}/man1/ib_acme.* %{_mandir}/man7/ibacm.* %{_mandir}/man7/ibacm_prov.* %{_mandir}/man8/ibacm.* %{_unitdir}/ibacm.service %{_unitdir}/ibacm.socket %dir %{_libdir}/ibacm %{_libdir}/ibacm/* %doc %{_docdir}/%{name}/ibacm.md %files -n iwpmd %{_sbindir}/iwpmd %{_unitdir}/iwpmd.service %config(noreplace) 
%{_sysconfdir}/rdma/modules/iwpmd.conf %config(noreplace) %{_sysconfdir}/iwpmd.conf %{_udevrulesdir}/90-iwpmd.rules %{_mandir}/man8/iwpmd.* %{_mandir}/man5/iwpmd.* %files -n libibumad %{_libdir}/libibumad*.so.* %files -n librdmacm %{_libdir}/librdmacm*.so.* %dir %{_libdir}/rsocket %{_libdir}/rsocket/*.so* %doc %{_docdir}/%{name}/librdmacm.md %{_mandir}/man7/rsocket.* %files -n librdmacm-utils %{_bindir}/cmtime %{_bindir}/mckey %{_bindir}/rcopy %{_bindir}/rdma_client %{_bindir}/rdma_server %{_bindir}/rdma_xclient %{_bindir}/rdma_xserver %{_bindir}/riostream %{_bindir}/rping %{_bindir}/rstream %{_bindir}/ucmatose %{_bindir}/udaddy %{_bindir}/udpong %{_mandir}/man1/cmtime.* %{_mandir}/man1/mckey.* %{_mandir}/man1/rcopy.* %{_mandir}/man1/rdma_client.* %{_mandir}/man1/rdma_server.* %{_mandir}/man1/rdma_xclient.* %{_mandir}/man1/rdma_xserver.* %{_mandir}/man1/riostream.* %{_mandir}/man1/rping.* %{_mandir}/man1/rstream.* %{_mandir}/man1/ucmatose.* %{_mandir}/man1/udaddy.* %{_mandir}/man1/udpong.* %files -n srp_daemon %config(noreplace) %{_sysconfdir}/srp_daemon.conf %config(noreplace) %{_sysconfdir}/rdma/modules/srp_daemon.conf %{_libexecdir}/srp_daemon/start_on_all_ports %{_unitdir}/srp_daemon.service %{_unitdir}/srp_daemon_port@.service %{_sbindir}/ibsrpdm %{_sbindir}/srp_daemon %{_sbindir}/run_srp_daemon %{_udevrulesdir}/60-srp_daemon.rules %{_mandir}/man5/srp_daemon.service.5* %{_mandir}/man5/srp_daemon_port@.service.5* %{_mandir}/man8/ibsrpdm.8* %{_mandir}/man8/srp_daemon.8* %doc %{_docdir}/%{name}/ibsrpdm.md %if %{with_pyverbs} %files -n python3-pyverbs %{python3_sitearch}/pyverbs %{_docdir}/%{name}/tests/*.py %endif rdma-core-56.1/redhat/rdma.conf000066400000000000000000000007071477342711600163770ustar00rootroot00000000000000# These modules are loaded by the system if any RDMA devices is installed # iSCSI over RDMA client support ib_iser # iSCSI over RDMA target support ib_isert # SCSI RDMA Protocol target driver ib_srpt # User access to RDMA verbs (supports libibverbs) ib_uverbs # User access to RDMA connection management (supports librdmacm) rdma_ucm # RDS over RDMA support # rds_rdma # NFS over RDMA client support xprtrdma # NFS over RDMA server support svcrdma rdma-core-56.1/redhat/rdma.mlx4-setup.sh000066400000000000000000000047501477342711600201070ustar00rootroot00000000000000#!/bin/bash dir="/sys/bus/pci/drivers/mlx4_core" [ ! 
-d $dir ] && exit 1 pushd $dir >/dev/null function set_dual_port() { device=$1 port1=$2 port2=$3 pushd $device >/dev/null cur_p1=`cat mlx4_port1` cur_p2=`cat mlx4_port2` # special case the "eth eth" mode as we need port2 to # actually switch to eth before the driver will let us # switch port1 to eth as well if [ "$port1" == "eth" ]; then if [ "$port2" != "eth" ]; then echo "In order for port1 to be eth, port2 must also be eth" popd >/dev/null return fi if [ "$cur_p2" != "eth" -a "$cur_p2" != "auto (eth)" ]; then tries=0 echo "$port2" > mlx4_port2 2>/dev/null sleep .25 cur_p2=`cat mlx4_port2` while [ "$cur_p2" != "eth" -a "$cur_p2" != "auto (eth)" -a $tries -lt 10 ]; do sleep .25 let tries++ cur_p2=`cat mlx4_port2` done if [ "$cur_p2" != "eth" -a "$cur_p2" != "auto (eth)" ]; then echo "Failed to set port2 to eth mode" popd >/dev/null return fi fi if [ "$cur_p1" != "eth" -a "$cur_p1" != "auto (eth)" ]; then tries=0 echo "$port1" > mlx4_port1 2>/dev/null sleep .25 cur_p1=`cat mlx4_port1` while [ "$cur_p1" != "eth" -a "$cur_p1" != "auto (eth)" -a $tries -lt 10 ]; do sleep .25 let tries++ cur_p1=`cat mlx4_port1` done if [ "$cur_p1" != "eth" -a "$cur_p1" != "auto (eth)" ]; then echo "Failed to set port1 to eth mode" fi fi popd >/dev/null return fi # our mode is not eth as that is covered above # so we should be able to successfully set the ports in # port1 then port2 order if [ "$cur_p1" != "$port1" -o "$cur_p2" != "$port2" ]; then # Try setting the ports in order first echo "$port1" > mlx4_port1 2>/dev/null ; sleep .1 echo "$port2" > mlx4_port2 2>/dev/null ; sleep .1 cur_p1=`cat mlx4_port1` cur_p2=`cat mlx4_port2` fi if [ "$cur_p1" != "$port1" -o "$cur_p2" != "$port2" ]; then # Try reverse order this time echo "$port2" > mlx4_port2 2>/dev/null ; sleep .1 echo "$port1" > mlx4_port1 2>/dev/null ; sleep .1 cur_p1=`cat mlx4_port1` cur_p2=`cat mlx4_port2` fi if [ "$cur_p1" != "$port1" -o "$cur_p2" != "$port2" ]; then echo "Error setting port type on mlx4 device $device" fi popd >/dev/null return } while read device port1 port2 ; do [ -d "$device" ] || continue [ -z "$port1" ] && continue [ -f "$device/mlx4_port2" -a -z "$port2" ] && continue [ -f "$device/mlx4_port2" ] && set_dual_port $device $port1 $port2 || echo "$port1" > "$device/mlx4_port1" done popd >/dev/null 2>&1 rdma-core-56.1/redhat/rdma.mlx4.conf000066400000000000000000000023351477342711600172610ustar00rootroot00000000000000# Config file for mlx4 hardware port settings # This file is read when the mlx4_core module is loaded and used to # set the port types for any hardware found. If a card is not listed # in this file, then its port types are left alone. # # Format: # <pci_device_of_card> <port1_type> [port2_type] # # @port1 and @port2: # One of auto, ib, or eth. No checking is performed to make sure that # combinations are valid. Invalid inputs will result in the driver # not setting the port to the type requested. port1 is required at # all times, port2 is required for dual port cards. # # Example: # 0000:0b:00.0 eth eth # # You can find the right pci device to use for any given card by loading # the mlx4_core module, then going to /sys/bus/pci/drivers/mlx4_core and # seeing what possible PCI devices are listed there. The possible values # for ports are: ib, eth, and auto. However, not all cards support all # types, so if you get messages from the kernel that your selected port # type isn't supported, there's nothing this script can do about it. Also, # some cards don't support using different types on the two ports (aka, # both ports must be either eth or ib).
Again, we can't set what the kernel # or hardware won't support. # rdma-core-56.1/redhat/rdma.mlx4.sys.modprobe000066400000000000000000000010571477342711600207600ustar00rootroot00000000000000# WARNING! - This file is overwritten any time the rdma rpm package is # updated. Please do not make any changes to this file. Instead, make # changes to the mlx4.conf file. It's contents are preserved if they # have been changed from the default values. install mlx4_core /sbin/modprobe --ignore-install mlx4_core $CMDLINE_OPTS && (if [ -f /usr/libexec/mlx4-setup.sh -a -f /etc/rdma/mlx4.conf ]; then /usr/libexec/mlx4-setup.sh < /etc/rdma/mlx4.conf; fi; /sbin/modprobe mlx4_en; if /sbin/modinfo mlx4_ib > /dev/null 2>&1; then /sbin/modprobe mlx4_ib; fi) rdma-core-56.1/redhat/rdma.modules-setup.sh000066400000000000000000000024341477342711600206700ustar00rootroot00000000000000#!/bin/bash check() { [ -n "$hostonly" -a -d /sys/class/infiniband_verbs/uverbs0 ] && return 0 [ -n "$hostonly" ] && return 255 return 0 } depends() { return 0 } install() { inst /etc/rdma/mlx4.conf inst /etc/rdma/modules/infiniband.conf inst /etc/rdma/modules/iwarp.conf inst /etc/rdma/modules/opa.conf inst /etc/rdma/modules/rdma.conf inst /etc/rdma/modules/roce.conf inst /usr/libexec/mlx4-setup.sh inst /usr/lib/modprobe.d/libmlx4.conf inst_multiple lspci setpci awk sleep inst_multiple -o /etc/modprobe.d/mlx4.conf inst_rules 60-rdma-persistent-naming.rules 70-persistent-ipoib.rules 75-rdma-description.rules 90-rdma-hw-modules.rules 90-rdma-ulp-modules.rules 90-rdma-umad.rules inst_multiple -o \ $systemdsystemunitdir/rdma-hw.target \ $systemdsystemunitdir/rdma-load-modules@.service for i in \ rdma-load-modules@rdma.service \ rdma-load-modules@roce.service \ rdma-load-modules@infiniband.service; do $SYSTEMCTL -q --root "$initdir" add-wants initrd.target "$i" done } installkernel() { hostonly='' instmods =drivers/infiniband =drivers/net/ethernet/mellanox =drivers/net/ethernet/chelsio =drivers/net/ethernet/cisco =drivers/net/ethernet/emulex =drivers/target hostonly='' instmods crc-t10dif crct10dif_common xprtrdma svcrdma } rdma-core-56.1/srp_daemon/000077500000000000000000000000001477342711600154615ustar00rootroot00000000000000rdma-core-56.1/srp_daemon/CMakeLists.txt000066400000000000000000000044351477342711600202270ustar00rootroot00000000000000set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${NO_STRICT_ALIASING_FLAGS}") rdma_man_pages( ibsrpdm.8 srp_daemon.8.in srp_daemon.service.5 srp_daemon_port@.service.5 ) rdma_sbin_executable(srp_daemon srp_daemon.c srp_handle_traps.c srp_sync.c ) target_link_libraries(srp_daemon LINK_PRIVATE ibverbs ibumad ${RT_LIBRARIES} ${CMAKE_THREAD_LIBS_INIT} ) rdma_install_symlink(srp_daemon "${CMAKE_INSTALL_SBINDIR}/ibsrpdm") # FIXME: Why? 
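# ibsrpdm (above) and run_srp_daemon (below) are symlinks to the srp_daemon
# binary rather than separate programs; judging by the sources in this
# directory, srp_daemon keys its behaviour off the name it was invoked under,
# so "ibsrpdm" gives the one-shot, print-only mode documented in ibsrpdm(8),
# while run_srp_daemon is the per-port invocation used by srp_daemon.sh.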
rdma_install_symlink(srp_daemon "${CMAKE_INSTALL_SBINDIR}/run_srp_daemon") rdma_subst_install(FILES "srp_daemon.sh.in" DESTINATION "${CMAKE_INSTALL_SBINDIR}" RENAME "srp_daemon.sh" PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ OWNER_EXECUTE GROUP_EXECUTE WORLD_EXECUTE) rdma_subst_install(FILES start_on_all_ports.in DESTINATION "${CMAKE_INSTALL_LIBEXECDIR}/srp_daemon" RENAME start_on_all_ports PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ OWNER_EXECUTE GROUP_EXECUTE WORLD_EXECUTE) rdma_subst_install(FILES srp_daemon.service.in DESTINATION "${CMAKE_INSTALL_SYSTEMD_SERVICEDIR}" RENAME srp_daemon.service PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ) rdma_subst_install(FILES srp_daemon_port@.service.in DESTINATION "${CMAKE_INSTALL_SYSTEMD_SERVICEDIR}" RENAME srp_daemon_port@.service PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ) install(FILES srp_daemon.conf DESTINATION "${CMAKE_INSTALL_SYSCONFDIR}") rdma_subst_install(FILES "srp_daemon.rules.in" RENAME "60-srp_daemon.rules" DESTINATION "${CMAKE_INSTALL_UDEV_RULESDIR}") install(FILES modules-srp_daemon.conf RENAME "srp_daemon.conf" DESTINATION "${CMAKE_INSTALL_SYSCONFDIR}/rdma/modules") # FIXME: The ib init.d file should really be included in rdma-core as well. set(RDMA_SERVICE "openibd" CACHE STRING "init.d file service name to order srpd after") # NOTE: These defaults are for CentOS, packagers should override. set(SRP_DEFAULT_START "2 3 4 5" CACHE STRING "Default-Start service data for srpd") set(SRP_DEFAULT_STOP "0 1 6" CACHE STRING "Default-Stop service data for srpd") configure_file(srpd.in "${CMAKE_CURRENT_BINARY_DIR}/srpd") install(FILES "${CMAKE_CURRENT_BINARY_DIR}/srpd" DESTINATION "${CMAKE_INSTALL_INITDDIR}" PERMISSIONS OWNER_WRITE OWNER_READ GROUP_READ WORLD_READ OWNER_EXECUTE GROUP_EXECUTE WORLD_EXECUTE) rdma-core-56.1/srp_daemon/ibsrpdm.8000066400000000000000000000014221477342711600172110ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH IBSRPDM 8 "August 30, 2005" "OpenFabrics" "USER COMMANDS" .SH NAME ibsrpdm \- Discover SRP targets on an InfiniBand Fabric .SH SYNOPSIS .B ibsrpdm [\fIOPTIONS\fB] .SH DESCRIPTION .PP List InfiniBand SCSI RDMA Protocol (SRP) targets on an IB fabric. 
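.PP
For example, a one-shot discovery that prints kernel-ready target strings
(the values shown here are hypothetical; real output depends on the fabric):
.nf
# ibsrpdm -c
id_ext=200400a0b81146a1,ioc_guid=00a0b80200402bd7,dgid=fe800000000000000005ad00000013e7,pkey=ffff,service_id=200400a0b81146a1
.fi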
.SH OPTIONS .PP .TP \fB\-c\fR Generate output suitable for piping directly to a /sys/class/infiniband_srp/srp\-<device>\-<port>/add_target file .TP \fB\-d\fR \fIDEVICE\fR Use device file \fIDEVICE\fR (default /dev/infiniband/umad0) .TP \fB\-k\fR \fIP_KEY\fR Use InfiniBand partition key \fIP_KEY\fR (default 0xffff) .TP \fB\-v\fR Print more verbose output .SH SEE ALSO .BR srp_daemon (1) .SH AUTHORS .TP Roland Dreier .RI < roland@kernel.org > rdma-core-56.1/srp_daemon/modules-srp_daemon.conf000066400000000000000000000001131477342711600221200ustar00rootroot00000000000000# These modules are loaded by the system if srp_daemon is to be run ib_srp rdma-core-56.1/srp_daemon/srp_daemon.8.in000066400000000000000000000142641477342711600203150ustar00rootroot00000000000000.\" Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md .TH SRP_DAEMON 8 "September 5, 2006" "OpenFabrics" "USER COMMANDS" .SH NAME srp_daemon \- Discovers SRP targets in an InfiniBand Fabric .SH SYNOPSIS .B srp_daemon\fR [\fB-vVcaeon\fR] [\fB-d \fIumad-device\fR | \fB-i \fIinfiniband-device\fR [\fB-p \fIport-num\fR] | \fB-j \fIdev:port\fR] [\fB-t \fItimeout(ms)\fR] [\fB-r \fIretries\fR] [\fB-R \fIrescan-time\fR] [\fB-f \fIrules-file\fR] .SH DESCRIPTION .PP Discovers and connects to InfiniBand SCSI RDMA Protocol (SRP) targets in an IB fabric. Each srp_daemon instance operates on one local port. Upon boot it performs a full rescan of the fabric and then waits for an srp_daemon event. An srp_daemon event can be a join of a new machine to the fabric, a change in the capabilities of a machine, an SA change, or an expiration of a predefined timeout. When a new machine joins the fabric, srp_daemon checks if it is an SRP target. When there is a change of capabilities, srp_daemon checks if the machine has turned into an SRP target. When there is an SA change or a timeout expiration, srp_daemon performs a full rescan of the fabric. For each target srp_daemon finds, it checks if it should connect to this target according to its rules (the default rules file is @CMAKE_INSTALL_FULL_SYSCONFDIR@/srp_daemon.conf) and if it is already connected to the local port. If it should connect to this target and if it is not connected yet, srp_daemon can either print the target details or connect to it. .SH OPTIONS .PP .TP \fB\-v\fR Print more verbose output .TP \fB\-V\fR Print even more verbose output (debug mode) .TP \fB\-i\fR \fIinfiniband-device\fR Work on \fIinfiniband-device\fR. This option should not be used with -d nor with -j. .TP \fB\-p\fR \fIport-num\fR Work on port \fIport-num\fR (default 1). This option must be used with -i and should not be used with -d nor with -j. .TP \fB\-j\fR \fIdev:port\fR Work on port number \fIport\fR of InfiniBand device \fIdev\fR. This option should not be used with -d, -i nor with -p. .TP \fB\-d\fR \fIumad-device\fR Use device file \fIumad-device\fR (default /dev/infiniband/umad0). This option should not be used with -i, -p nor with -j. .TP \fB\-c\fR Generate output suitable for piping directly to a /sys/class/infiniband_srp/srp\-<device>\-<port>/add_target file. .TP \fB\-a\fR Prints all the targets in the fabric, not only targets that are not connected through the local port. This is the same behavior as that of ibsrpdm. .TP \fB\-e\fR Execute the connection command, i.e., make the connection to the target. .TP \fB\-o\fR Perform only one rescan and exit just like ibsrpdm. .TP \fB\-R\fR \fIrescan-time\fR Force a complete rescan every \fIrescan-time\fR seconds. If -R is not specified, no timeout rescans will be performed.
.TP \fB\-T\fR \fIretry-timeout\fR Retries to connect to an existing target after \fIretry-timeout\fR seconds. If -T is not specified, a timeout of 5 seconds is used. If \fIretry-timeout\fR is 0, srp_daemon will not try to reconnect. srp_daemon retries because of a rare scenario in which it tries to add a target while that target is about to be removed, but has not been removed yet. .TP \fB\-f\fR \fIrules-file\fR Decides to which targets to connect according to the rules in \fIrules-file\fR. If \fB\-f\fR is not specified, uses the default rules file @CMAKE_INSTALL_FULL_SYSCONFDIR@/srp_daemon.conf. Each line in the \fIrules-file\fR is a rule which can be either an allow connection or a disallow connection according to the first character in the line (a or d accordingly). The rest of the line is values for id_ext, ioc_guid, dgid, service_id. Please take a look at the example section for an example of the file. srp_daemon decides whether to allow or disallow each target according to the first rule that matches the target. If no rule matches the target, the target is allowed and will be connected. In an allow rule it is possible to set attributes for the connection to the target. Supported attributes are max_cmd_per_lun and max_sect. .TP \fB\-t\fR \fItimeout\fR Use timeout of \fItimeout\fR msec for MAD responses (default: 5 sec). .TP \fB\-r\fR \fIretries\fR Perform \fIretries\fR retries on each send to MAD (default: 3 retries). .TP \fB\-n\fR New format - use also initiator_ext in the connection command. .TP \fB\--systemd\fR Enable systemd integration. .SH FILES @CMAKE_INSTALL_FULL_SYSCONFDIR@/srp_daemon.conf - Default rules configuration file that indicates to which targets to connect. Can be overridden using the \fB\-f\fR \fIrules-file\fR option. Each line in this file is a rule which can be either an allow connection or a disallow connection according to the first character in the line (a or d accordingly). The rest of the line is values for id_ext, ioc_guid, dgid, service_id. Please take a look at the example section for an example of the file. srp_daemon decides whether to allow or disallow each target according to the first rule that matches the target. If no rule matches the target, the target is allowed and will be connected. In an allow rule it is possible to set attributes for the connection to the target. Supported attributes are max_cmd_per_lun and max_sect. .SH EXAMPLES srp_daemon -e -i mthca0 -p 1 -R 60 (Connects to the targets accessible through port 1 of mthca0. Performs a complete rescan every minute) srp_daemon -o -c -a (Prints the connection commands for the targets in the fabric and exits - similar to ibsrpdm) srp_daemon -e -f rules.txt (Connects to the targets allowed in the rules file rules.txt) .nf An example of a rules configuration file (such as @CMAKE_INSTALL_FULL_SYSCONFDIR@/srp_daemon.conf) ------------------------------------------------------------------------ # Rules file example # This is a comment # disallow the following dgid d dgid=fe800000000000000002c90200402bd5 # allow target with the following ioc_guid a ioc_guid=00a0b80200402bd7 # allow target with the following id_ext and ioc_guid, and set max_cmd_per_lun to 31.
a id_ext=200500A0B81146A1,ioc_guid=00a0b80200402bef,max_cmd_per_lun=31 # disallow all the rest d .fi .SH SEE ALSO .BR ibsrpdm (8) .SH AUTHORS .TP Roland Dreier .RI < rolandd@cisco.com > .TP Ishai Rabinovitz .RI < ishai@mellanox.co.il > rdma-core-56.1/srp_daemon/srp_daemon.c000066400000000000000000001741601477342711600177650ustar00rootroot00000000000000/* * srp_daemon - discover SRP targets over IB * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2006 Cisco Systems, Inc. All rights reserved. * Copyright (c) 2006 Mellanox Technologies Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * * $Author: ishai Rabinovitz [ishai@mellanox.co.il]$ * Based on Roland Dreier's initial code [rdreier@cisco.com] * */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "srp_ib_types.h" #include "srp_daemon.h" #define IBDEV_STR_SIZE 16 #define IBPORT_STR_SIZE 16 #define IGNORE(value) do { if (value) { } } while (0) #define max_t(type, x, y) ({ \ type __max1 = (x); \ type __max2 = (y); \ __max1 > __max2 ? __max1: __max2; }) #define get_data_ptr(mad) ((void *) ((mad).hdr.data)) enum log_dest { log_to_syslog, log_to_stderr }; static int get_lid(struct umad_resources *umad_res, union umad_gid *gid, uint16_t *lid); static const int node_table_response_size = 1 << 18; static const char *sysfs_path = "/sys"; static enum log_dest s_log_dest = log_to_syslog; static int wakeup_pipe[2] = { -1, -1 }; void wake_up_main_loop(char ch) { int res; assert(wakeup_pipe[1] >= 0); res = write(wakeup_pipe[1], &ch, 1); IGNORE(res); } static void signal_handler(int signo) { wake_up_main_loop(signo); } /* * Return either the received signal (SIGINT, SIGTERM, ...) or 0 if no signal * has been received before the timeout has expired. 
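 *
 * This is the classic self-pipe pattern: signal_handler() forwards the signal
 * number to wakeup_pipe[1] via wake_up_main_loop(), and this function waits
 * in select() on wakeup_pipe[0] for at most the given timeout, draining the
 * pipe and returning the last byte that was written to it.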
*/ static int get_received_signal(time_t tv_sec, suseconds_t tv_usec) { int fd, ret, received_signal = 0; fd_set rset; struct timeval timeout; char buf[16]; fd = wakeup_pipe[0]; FD_ZERO(&rset); FD_SET(fd, &rset); timeout.tv_sec = tv_sec; timeout.tv_usec = tv_usec; ret = select(fd + 1, &rset, NULL, NULL, &timeout); if (ret < 0) assert(errno == EINTR); while ((ret = read(fd, buf, sizeof(buf))) > 0) received_signal = buf[ret - 1]; return received_signal; } static int check_process_uniqueness(struct config_t *conf) { char path[256]; int fd; snprintf(path, sizeof(path), SRP_DAEMON_LOCK_PREFIX "_%s_%d", conf->dev_name, conf->port_num); if ((fd = open(path, O_CREAT|O_RDWR, S_IRUSR|S_IRGRP|S_IROTH|S_IWUSR)) < 0) { pr_err("cannot open file \"%s\" (errno: %d).\n", path, errno); return -1; } if (0 != lockf(fd, F_TLOCK, 0)) { pr_err("failed to lock %s (errno: %d). possibly another " "srp_daemon is locking it\n", path, errno); close(fd); fd = -1; } return fd; } static int srpd_sys_read_string(const char *dir_name, const char *file_name, char *str, int max_len) { char path[256], *s; int fd, r; snprintf(path, sizeof(path), "%s/%s", dir_name, file_name); if ((fd = open(path, O_RDONLY)) < 0) return (errno > 0) ? -errno : errno; if ((r = read(fd, str, max_len)) < 0) { int e = errno; close(fd); return (e > 0) ? -e : e; } str[(r < max_len) ? r : max_len - 1] = 0; if ((s = strrchr(str, '\n'))) *s = 0; close(fd); return 0; } static int srpd_sys_read_gid(const char *dir_name, const char *file_name, uint8_t *gid) { char buf[64], *str, *s; __be16 *ugid = (__be16 *)gid; int r, i; if ((r = srpd_sys_read_string(dir_name, file_name, buf, sizeof(buf))) < 0) return r; for (s = buf, i = 0 ; i < 8; i++) { if (!(str = strsep(&s, ": \t\n"))) return -EINVAL; ugid[i] = htobe16(strtoul(str, NULL, 16) & 0xffff); } return 0; } static int srpd_sys_read_uint64(const char *dir_name, const char *file_name, uint64_t *u) { char buf[32]; int r; if ((r = srpd_sys_read_string(dir_name, file_name, buf, sizeof(buf))) < 0) return r; *u = strtoull(buf, NULL, 0); return 0; } static void usage(const char *argv0) { fprintf(stderr, "Usage: %s [-vVcaeon] [-d | -i [-p ]] [-t ] [-r ] [-R ] [-f \n", argv0); fprintf(stderr, "-v Verbose\n"); fprintf(stderr, "-V debug Verbose\n"); fprintf(stderr, "-c prints connection Commands\n"); fprintf(stderr, "-a show All - prints also targets that are already connected\n"); fprintf(stderr, "-e Executes connection commands\n"); fprintf(stderr, "-o runs only Once and stop\n"); fprintf(stderr, "-d use umad Device \n"); fprintf(stderr, "-i use InfiniBand device \n"); fprintf(stderr, "-p use Port num \n"); fprintf(stderr, "-j : use the IB dev / port_num combination \n"); fprintf(stderr, "-R perform complete Rescan every seconds\n"); fprintf(stderr, "-T Retries to connect to existing target after Timeout of seconds\n"); fprintf(stderr, "-l Transport retry count before failing IO. 
should be in range [2..7], (default 2)\n"); fprintf(stderr, "-f use rules File to set to which target(s) to connect (default: " SRP_DAEMON_CONFIG_FILE ")\n"); fprintf(stderr, "-t Timeout for mad response in milliseconds\n"); fprintf(stderr, "-r number of send Retries for each mad\n"); fprintf(stderr, "-n New connection command format - use also initiator extension\n"); fprintf(stderr, "--systemd Enable systemd integration.\n"); fprintf(stderr, "\nExample: srp_daemon -e -n -i mthca0 -p 1 -R 60\n"); } static int check_equal_uint64(char *dir_name, const char *attr, uint64_t val) { uint64_t attr_value; if (srpd_sys_read_uint64(dir_name, attr, &attr_value)) return 0; return attr_value == val; } static int check_equal_uint16(char *dir_name, const char *attr, uint16_t val) { uint64_t attr_value; if (srpd_sys_read_uint64(dir_name, attr, &attr_value)) return 0; return val == (attr_value & 0xffff); } static int recalc(struct resources *res); static void pr_cmd(char *target_str, int not_connected) { int ret; if (config->cmd) printf("%s\n", target_str); if (config->execute && not_connected) { int fd = open(config->add_target_file, O_WRONLY); if (fd < 0) { pr_err("unable to open %s, maybe ib_srp is not loaded\n", config->add_target_file); return; } ret = write(fd, target_str, strlen(target_str)); pr_debug("Adding target returned %d\n", ret); close(fd); } } void pr_debug(const char *fmt, ...) { va_list args; if (!config->debug_verbose) return; va_start(args, fmt); vprintf(fmt, args); va_end(args); } void pr_err(const char *fmt, ...) { va_list args; va_start(args, fmt); switch (s_log_dest) { case log_to_syslog: vsyslog(LOG_DAEMON | LOG_ERR, fmt, args); break; case log_to_stderr: vfprintf(stderr, fmt, args); break; } va_end(args); } static int check_not_equal_str(const char *dir_name, const char *attr, const char *value) { char attr_value[64]; int len = strlen(value); if (len > sizeof(attr_value)) { pr_err("string %s is too long\n", value); return 1; } if (srpd_sys_read_string(dir_name, attr, attr_value, sizeof(attr_value))) return 0; if (strncmp(attr_value, value, len)) return 1; return 0; } static int check_not_equal_int(const char *dir_name, const char *attr, int value) { char attr_value[64]; if (srpd_sys_read_string(dir_name, attr, attr_value, sizeof(attr_value))) return 0; if (value != atoi(attr_value)) return 1; return 0; } static int is_enabled_by_rules_file(struct target_details *target) { int rule; struct config_t *conf = config; if (NULL == conf->rules) { pr_debug("Allowing SRP target with id_ext %s because not using a rules file\n", target->id_ext); return 1; } rule = -1; do { rule++; if (conf->rules[rule].id_ext[0] != '\0' && strtoull(target->id_ext, NULL, 16) != strtoull(conf->rules[rule].id_ext, NULL, 16)) continue; if (conf->rules[rule].ioc_guid[0] != '\0' && be64toh(target->ioc_prof.guid) != strtoull(conf->rules[rule].ioc_guid, NULL, 16)) continue; if (conf->rules[rule].dgid[0] != '\0') { char tmp = conf->rules[rule].dgid[16]; conf->rules[rule].dgid[16] = '\0'; if (strtoull(conf->rules[rule].dgid, NULL, 16) != target->subnet_prefix) { conf->rules[rule].dgid[16] = tmp; continue; } conf->rules[rule].dgid[16] = tmp; if (strtoull(&conf->rules[rule].dgid[16], NULL, 16) != target->h_guid) continue; } if (conf->rules[rule].service_id[0] != '\0' && strtoull(conf->rules[rule].service_id, NULL, 16) != target->h_service_id) continue; if (conf->rules[rule].pkey[0] != '\0' && (uint16_t)strtoul(conf->rules[rule].pkey, NULL, 16) != target->pkey) continue; target->options = conf->rules[rule].options; 
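		/*
		 * First match wins: every field that is non-empty in this rule
		 * equals the corresponding target attribute (the continue
		 * statements above skip non-matching rules), and
		 * get_rules_file() appends a catch-all "allow" rule with all
		 * fields empty, so this loop always terminates.
		 */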
pr_debug("SRP target with id_ext %s %s by rules file\n", target->id_ext, conf->rules[rule].allow ? "allowed" : "disallowed"); return conf->rules[rule].allow; } while (1); } static bool use_imm_data(void) { bool ret = false; char flag = 0; int cnt; int fd = open("/sys/module/ib_srp/parameters/use_imm_data", O_RDONLY); if (fd < 0) return false; cnt = read(fd, &flag, 1); if (cnt != 1) { close(fd); return false; } if (!strncmp(&flag, "Y", 1)) ret = true; close(fd); return ret; } static bool imm_data_size_gt_send_size(unsigned int send_size) { bool ret = false; unsigned int srp_max_imm_data = 0; FILE *fp = fopen("/sys/module/ib_srp/parameters/max_imm_data", "r"); int cnt; if (fp == NULL) return ret; cnt = fscanf(fp, "%d", &srp_max_imm_data); if (cnt <= 0) { fclose(fp); return ret; } if (srp_max_imm_data > send_size) ret = true; fclose(fp); return ret; } static int add_non_exist_target(struct target_details *target) { char scsi_host_dir[256]; DIR *dir; struct dirent *subdir; char *subdir_name_ptr; int prefix_len; union umad_gid dgid_val; char target_config_str[255]; int len; int not_connected = 1; unsigned int send_size; pr_debug("Found an SRP target with id_ext %s - check if it is already connected\n", target->id_ext); strcpy(scsi_host_dir, "/sys/class/scsi_host/"); dir=opendir(scsi_host_dir); if (!dir) { perror("opendir - /sys/class/scsi_host/"); return -1; } prefix_len = strlen(scsi_host_dir); subdir_name_ptr = scsi_host_dir + prefix_len; subdir = (void *) 1; /* Dummy value to enter the loop */ while (subdir) { subdir = readdir(dir); if (!subdir) continue; if (subdir->d_name[0] == '.') continue; strncpy(subdir_name_ptr, subdir->d_name, sizeof(scsi_host_dir) - prefix_len); if (!check_equal_uint64(scsi_host_dir, "id_ext", strtoull(target->id_ext, NULL, 16))) continue; if (!check_equal_uint16(scsi_host_dir, "pkey", target->pkey) && !config->execute) continue; if (!check_equal_uint64(scsi_host_dir, "service_id", target->h_service_id)) continue; if (!check_equal_uint64(scsi_host_dir, "ioc_guid", be64toh(target->ioc_prof.guid))) continue; if (srpd_sys_read_gid(scsi_host_dir, "orig_dgid", dgid_val.raw)) { /* * In case this is an old kernel that does not have * orig_dgid in sysfs, use dgid instead (this is * problematic when there is a dgid redirection * by the CM) */ if (srpd_sys_read_gid(scsi_host_dir, "dgid", dgid_val.raw)) continue; } if (htobe64(target->subnet_prefix) != dgid_val.global.subnet_prefix) continue; if (htobe64(target->h_guid) != dgid_val.global.interface_id) continue; /* If there is no local_ib_device in the scsi host dir (old kernel module), assumes it is equal */ if (check_not_equal_str(scsi_host_dir, "local_ib_device", config->dev_name)) continue; /* If there is no local_ib_port in the scsi host dir (old kernel module), assumes it is equal */ if (check_not_equal_int(scsi_host_dir, "local_ib_port", config->port_num)) continue; /* there is a match - this target is already connected */ /* There is a rare possibility of a race in the following scenario: a. A link goes down, b. ib_srp decide to remove the corresponding scsi_host. c. Before removing it, the link returns d. srp_daemon gets trap 64. e. srp_daemon thinks that this target is still connected (ib_srp has not removed it yet) so it does not connect to it. f. ib_srp continue to remove the scsi_host. As a result there is no connection to a target in the fabric and there will not be a new trap. To solve this race we schedule here another call to check if this target exist in the near future. 
*/ /* If there is a need to print all we will continue to pr_cmd. not_connected is set to zero to make sure that this target will be printed but not connected. */ if (config->all) { not_connected = 0; break; } pr_debug("This target is already connected - skip\n"); closedir(dir); return 0; } len = snprintf(target_config_str, sizeof(target_config_str), "id_ext=%s," "ioc_guid=%016llx," "dgid=%016llx%016llx," "pkey=%04x," "service_id=%016llx", target->id_ext, (unsigned long long) be64toh(target->ioc_prof.guid), (unsigned long long) target->subnet_prefix, (unsigned long long) target->h_guid, target->pkey, (unsigned long long) target->h_service_id); if (len >= sizeof(target_config_str)) { pr_err("Target config string is too long, ignoring target\n"); closedir(dir); return -1; } if (target->ioc_prof.io_class != htobe16(SRP_REV16A_IB_IO_CLASS)) { len += snprintf(target_config_str+len, sizeof(target_config_str) - len, ",io_class=%04hx", be16toh(target->ioc_prof.io_class)); if (len >= sizeof(target_config_str)) { pr_err("Target config string is too long, ignoring target\n"); closedir(dir); return -1; } } if (config->print_initiator_ext) { len += snprintf(target_config_str+len, sizeof(target_config_str) - len, ",initiator_ext=%016llx", (unsigned long long) target->h_guid); if (len >= sizeof(target_config_str)) { pr_err("Target config string is too long, ignoring target\n"); closedir(dir); return -1; } } if (config->execute && config->tl_retry_count) { len += snprintf(target_config_str + len, sizeof(target_config_str) - len, ",tl_retry_count=%d", config->tl_retry_count); if (len >= sizeof(target_config_str)) { pr_err("Target config string is too long, ignoring target\n"); closedir(dir); return -1; } } if (target->options) { len += snprintf(target_config_str+len, sizeof(target_config_str) - len, "%s", target->options); if (len >= sizeof(target_config_str)) { pr_err("Target config string is too long, ignoring target\n"); closedir(dir); return -1; } } /* * The SRP initiator stops parsing parameters if it encounters * an unrecognized parameter. Rest parameters will be ignored. * Append 'max_it_iu_size' in the very end of login string to * avoid breaking SRP login. */ send_size = be32toh(target->ioc_prof.send_size); if (use_imm_data() && imm_data_size_gt_send_size(send_size)) { len += snprintf(target_config_str+len, sizeof(target_config_str) - len, ",max_it_iu_size=%d", send_size); if (len >= sizeof(target_config_str)) { pr_err("Target config string is too long, ignoring target\n"); closedir(dir); return -1; } } target_config_str[len] = '\0'; pr_cmd(target_config_str, not_connected); closedir(dir); return 1; } static int send_and_get(int portid, int agent, struct srp_ib_user_mad *out_mad, struct srp_ib_user_mad *in_mad, int in_mad_size) { struct umad_dm_packet *out_dm_mad = (void *) out_mad->hdr.data; struct umad_dm_packet *in_dm_mad = (void *) in_mad->hdr.data; int i, len; int in_agent; int ret; static uint32_t tid; uint32_t received_tid; for (i = 0; i < config->mad_retries; ++i) { /* Skip tid 0 because OpenSM ignores it. */ if (++tid == 0) ++tid; out_dm_mad->mad_hdr.tid = htobe64(tid); ret = umad_send(portid, agent, out_mad, MAD_BLOCK_SIZE, config->timeout, 0); if (ret < 0) { pr_err("umad_send to %u failed\n", (uint16_t) be16toh(out_mad->hdr.addr.lid)); return ret; } do { recv: len = in_mad_size ? 
in_mad_size : MAD_BLOCK_SIZE; in_agent = umad_recv(portid, (struct ib_user_mad *) in_mad, &len, config->timeout); if (in_agent < 0) { pr_err("umad_recv from %u failed - %d\n", (uint16_t) be16toh(out_mad->hdr.addr.lid), in_agent); return in_agent; } if (in_agent != agent) { pr_debug("umad_recv returned different agent\n"); goto recv; } ret = umad_status(in_mad); if (ret) { pr_err( "bad MAD status (%u) from lid %#x\n", ret, be16toh(out_mad->hdr.addr.lid)); return -ret; } received_tid = be64toh(in_dm_mad->mad_hdr.tid); if (tid != received_tid) pr_debug("umad_recv returned different transaction id sent %d got %d\n", tid, received_tid); } while ((int32_t)(tid - received_tid) > 0); if (len > 0) return len; } return -1; } static void initialize_sysfs(void) { char *env; env = getenv("SYSFS_PATH"); if (env) { int len; char *dup; sysfs_path = dup = strndup(env, 256); len = strlen(dup); while (len > 0 && dup[len - 1] == '/') { --len; dup[len] = '\0'; } } } static int translate_umad_to_ibdev_and_port(char *umad_dev, char **ibdev, char **ibport) { char *class_dev_path; char *umad_dev_name; int ret; *ibdev = NULL; *ibport = NULL; umad_dev_name = rindex(umad_dev, '/'); if (!umad_dev_name) { pr_err("Couldn't find device name in '%s'\n", umad_dev); return -1; } ret = asprintf(&class_dev_path, "%s/class/infiniband_mad/%s", sysfs_path, umad_dev_name); if (ret < 0) { pr_err("out of memory\n"); return -ENOMEM; } *ibdev = malloc(IBDEV_STR_SIZE); if (!*ibdev) { pr_err("out of memory\n"); ret = -ENOMEM; goto end; } if (srpd_sys_read_string(class_dev_path, "ibdev", *ibdev, IBDEV_STR_SIZE) < 0) { pr_err("Couldn't read ibdev attribute\n"); ret = -1; goto end; } *ibport = malloc(IBPORT_STR_SIZE); if (!*ibport) { pr_err("out of memory\n"); ret = -ENOMEM; goto end; } if (srpd_sys_read_string(class_dev_path, "port", *ibport, IBPORT_STR_SIZE) < 0) { pr_err("Couldn't read port attribute\n"); ret = -1; goto end; } ret = 0; end: if (ret) { free(*ibport); free(*ibdev); *ibdev = NULL; } free(class_dev_path); return ret; } static void init_srp_mad(struct srp_ib_user_mad *out_umad, int agent, uint16_t h_dlid, uint16_t h_attr_id, uint32_t h_attr_mod) { struct umad_dm_packet *out_mad; memset(out_umad, 0, sizeof *out_umad); out_umad->hdr.agent_id = agent; out_umad->hdr.addr.qpn = htobe32(1); out_umad->hdr.addr.qkey = htobe32(UMAD_QKEY); out_umad->hdr.addr.lid = htobe16(h_dlid); out_mad = (void *) out_umad->hdr.data; out_mad->mad_hdr.base_version = UMAD_BASE_VERSION; out_mad->mad_hdr.method = UMAD_METHOD_GET; out_mad->mad_hdr.attr_id = htobe16(h_attr_id); out_mad->mad_hdr.attr_mod = htobe32(h_attr_mod); } static void init_srp_dm_mad(struct srp_ib_user_mad *out_mad, int agent, uint16_t h_dlid, uint16_t h_attr_id, uint32_t h_attr_mod) { struct umad_sa_packet *out_dm_mad = get_data_ptr(*out_mad); init_srp_mad(out_mad, agent, h_dlid, h_attr_id, h_attr_mod); out_dm_mad->mad_hdr.mgmt_class = UMAD_CLASS_DEVICE_MGMT; out_dm_mad->mad_hdr.class_version = 1; } static void init_srp_sa_mad(struct srp_ib_user_mad *out_mad, int agent, uint16_t h_dlid, uint16_t h_attr_id, uint32_t h_attr_mod) { struct umad_sa_packet *out_sa_mad = get_data_ptr(*out_mad); init_srp_mad(out_mad, agent, h_dlid, h_attr_id, h_attr_mod); out_sa_mad->mad_hdr.mgmt_class = UMAD_CLASS_SUBN_ADM; out_sa_mad->mad_hdr.class_version = UMAD_SA_CLASS_VERSION; } static int check_sm_cap(struct umad_resources *umad_res, int *mask_match) { struct srp_ib_user_mad out_mad, in_mad; struct umad_sa_packet *in_sa_mad; struct umad_class_port_info *cpi; int ret; in_sa_mad = 
get_data_ptr(in_mad); init_srp_sa_mad(&out_mad, umad_res->agent, umad_res->sm_lid, UMAD_ATTR_CLASS_PORT_INFO, 0); ret = send_and_get(umad_res->portid, umad_res->agent, &out_mad, &in_mad, 0); if (ret < 0) return ret; cpi = (void *) in_sa_mad->data; *mask_match = !!(be16toh(cpi->cap_mask) & SRP_SM_SUPPORTS_MASK_MATCH); return 0; } int pkey_index_to_pkey(struct umad_resources *umad_res, int pkey_index, __be16 *pkey) { if (ibv_query_pkey(umad_res->ib_ctx, config->port_num, pkey_index, pkey) < 0) return -1; if (*pkey) pr_debug("discover Targets for P_key %04x (index %d)\n", *pkey, pkey_index); return 0; } static int pkey_to_pkey_index(struct umad_resources *umad_res, uint16_t h_pkey, uint16_t *pkey_index) { int res = ibv_get_pkey_index(umad_res->ib_ctx, config->port_num, htobe16(h_pkey)); if (res >= 0) *pkey_index = res; return res; } static int set_class_port_info(struct umad_resources *umad_res, uint16_t dlid, uint16_t h_pkey) { struct srp_ib_user_mad in_mad, out_mad; struct umad_dm_packet *out_dm_mad, *in_dm_mad; struct umad_class_port_info *cpi; char val[64]; int i; init_srp_dm_mad(&out_mad, umad_res->agent, dlid, UMAD_ATTR_CLASS_PORT_INFO, 0); if (pkey_to_pkey_index(umad_res, h_pkey, &out_mad.hdr.addr.pkey_index) < 0) { pr_err("set_class_port_info: Unable to find pkey_index for pkey %#x\n", h_pkey); return -1; } out_dm_mad = get_data_ptr(out_mad); out_dm_mad->mad_hdr.method = UMAD_METHOD_SET; cpi = (void *) out_dm_mad->data; if (srpd_sys_read_string(umad_res->port_sysfs_path, "lid", val, sizeof val) < 0) { pr_err("Couldn't read LID\n"); return -1; } cpi->trap_lid = htobe16(strtol(val, NULL, 0)); if (srpd_sys_read_string(umad_res->port_sysfs_path, "gids/0", val, sizeof val) < 0) { pr_err("Couldn't read GID[0]\n"); return -1; } for (i = 0; i < 8; ++i) cpi->trapgid.raw_be16[i] = htobe16(strtol(val + i * 5, NULL, 16)); if (send_and_get(umad_res->portid, umad_res->agent, &out_mad, &in_mad, 0) < 0) return -1; in_dm_mad = get_data_ptr(in_mad); if (in_dm_mad->mad_hdr.status) { pr_err("Class Port Info set returned status 0x%04x\n", be16toh(in_dm_mad->mad_hdr.status)); return -1; } return 0; } static int get_iou_info(struct umad_resources *umad_res, uint16_t dlid, uint16_t h_pkey, struct srp_dm_iou_info *iou_info) { struct srp_ib_user_mad in_mad, out_mad; struct umad_dm_packet *in_dm_mad; init_srp_dm_mad(&out_mad, umad_res->agent, dlid, SRP_DM_ATTR_IO_UNIT_INFO, 0); if (pkey_to_pkey_index(umad_res, h_pkey, &out_mad.hdr.addr.pkey_index) < 0) { pr_err("get_iou_info: Unable to find pkey_index for pkey %#x\n", h_pkey); return -1; } if (send_and_get(umad_res->portid, umad_res->agent, &out_mad, &in_mad, 0) < 0) return -1; in_dm_mad = get_data_ptr(in_mad); if (in_dm_mad->mad_hdr.status) { pr_err("IO Unit Info query returned status 0x%04x\n", be16toh(in_dm_mad->mad_hdr.status)); return -1; } memcpy(iou_info, in_dm_mad->data, sizeof *iou_info); /* pr_debug("iou_info->max_controllers is %d\n", iou_info->max_controllers); */ return 0; } static int get_ioc_prof(struct umad_resources *umad_res, uint16_t h_dlid, uint16_t h_pkey, int ioc, struct srp_dm_ioc_prof *ioc_prof) { struct srp_ib_user_mad in_mad, out_mad; struct umad_dm_packet *in_dm_mad; init_srp_dm_mad(&out_mad, umad_res->agent, h_dlid, SRP_DM_ATTR_IO_CONTROLLER_PROFILE, ioc); if (pkey_to_pkey_index(umad_res, h_pkey, &out_mad.hdr.addr.pkey_index) < 0) { pr_err("get_ioc_prof: Unable to find pkey_index for pkey %#x\n", h_pkey); return -1; } if (send_and_get(umad_res->portid, umad_res->agent, &out_mad, &in_mad, 0) < 0) return -1; in_dm_mad = 
get_data_ptr(in_mad); if (in_dm_mad->mad_hdr.status) { pr_err("IO Controller Profile query returned status 0x%04x for %d\n", be16toh(in_dm_mad->mad_hdr.status), ioc); return -1; } memcpy(ioc_prof, in_dm_mad->data, sizeof *ioc_prof); return 0; } static int get_svc_entries(struct umad_resources *umad_res, uint16_t dlid, uint16_t h_pkey, int ioc, int start, int end, struct srp_dm_svc_entries *svc_entries) { struct srp_ib_user_mad in_mad, out_mad; struct umad_dm_packet *in_dm_mad; init_srp_dm_mad(&out_mad, umad_res->agent, dlid, SRP_DM_ATTR_SERVICE_ENTRIES, (ioc << 16) | (end << 8) | start); if (pkey_to_pkey_index(umad_res, h_pkey, &out_mad.hdr.addr.pkey_index) < 0) { pr_err("get_svc_entries: Unable to find pkey_index for pkey %#x\n", h_pkey); return -1; } if (send_and_get(umad_res->portid, umad_res->agent, &out_mad, &in_mad, 0) < 0) return -1; in_dm_mad = get_data_ptr(in_mad); if (in_dm_mad->mad_hdr.status) { pr_err("Service Entries query returned status 0x%04x\n", be16toh(in_dm_mad->mad_hdr.status)); return -1; } memcpy(svc_entries, in_dm_mad->data, sizeof *svc_entries); return 0; } static int do_port(struct resources *res, uint16_t pkey, uint16_t dlid, uint64_t subnet_prefix, uint64_t h_guid) { struct umad_resources *umad_res = res->umad_res; struct srp_dm_iou_info iou_info; struct srp_dm_svc_entries svc_entries; int i, j, k, ret; static const uint64_t topspin_oui = 0x0005ad0000000000ull; static const uint64_t oui_mask = 0xffffff0000000000ull; struct target_details *target = (struct target_details *) malloc(sizeof(struct target_details)); target->subnet_prefix = subnet_prefix; target->h_guid = h_guid; target->options = NULL; pr_debug("enter do_port\n"); if ((target->h_guid & oui_mask) == topspin_oui && set_class_port_info(umad_res, dlid, pkey)) pr_err("Warning: set of ClassPortInfo failed\n"); ret = get_iou_info(umad_res, dlid, pkey, &iou_info); if (ret < 0) { pr_err("failed to get iou info for dlid %#x\n", dlid); goto out; } pr_human("IO Unit Info:\n"); pr_human(" port LID: %04x\n", dlid); pr_human(" port GID: %016llx%016llx\n", (unsigned long long) target->subnet_prefix, (unsigned long long) target->h_guid); pr_human(" change ID: %04x\n", be16toh(iou_info.change_id)); pr_human(" max controllers: 0x%02x\n", iou_info.max_controllers); if (config->verbose > 0) for (i = 0; i < iou_info.max_controllers; ++i) { pr_human(" controller[%3d]: ", i + 1); switch ((iou_info.controller_list[i / 2] >> (4 * (1 - i % 2))) & 0xf) { case SRP_DM_NO_IOC: pr_human("not installed\n"); break; case SRP_DM_IOC_PRESENT: pr_human("present\n"); break; case SRP_DM_NO_SLOT: pr_human("no slot\n"); break; default: pr_human("\n"); break; } } for (i = 0; i < iou_info.max_controllers; ++i) { if (((iou_info.controller_list[i / 2] >> (4 * (1 - i % 2))) & 0xf) == SRP_DM_IOC_PRESENT) { pr_human("\n"); if (get_ioc_prof(umad_res, dlid, pkey, i + 1, &target->ioc_prof)) continue; pr_human(" controller[%3d]\n", i + 1); pr_human(" GUID: %016llx\n", (unsigned long long) be64toh(target->ioc_prof.guid)); pr_human(" vendor ID: %06x\n", be32toh(target->ioc_prof.vendor_id) >> 8); pr_human(" device ID: %06x\n", be32toh(target->ioc_prof.device_id)); pr_human(" IO class : %04hx\n", be16toh(target->ioc_prof.io_class)); pr_human(" Maximum size of Send Messages in bytes: %d\n", be32toh(target->ioc_prof.send_size)); pr_human(" ID: %s\n", target->ioc_prof.id); pr_human(" service entries: %d\n", target->ioc_prof.service_entries); for (j = 0; j < target->ioc_prof.service_entries; j += 4) { int n; n = j + 3; if (n >= target->ioc_prof.service_entries) 
n = target->ioc_prof.service_entries - 1; if (get_svc_entries(umad_res, dlid, pkey, i + 1, j, n, &svc_entries)) continue; for (k = 0; k <= n - j; ++k) { if (sscanf(svc_entries.service[k].name, "SRP.T10:%16s", target->id_ext) != 1) continue; pr_human(" service[%3d]: %016llx / %s\n", j + k, (unsigned long long) be64toh(svc_entries.service[k].id), svc_entries.service[k].name); target->h_service_id = be64toh(svc_entries.service[k].id); target->pkey = pkey; if (is_enabled_by_rules_file(target)) { if (!add_non_exist_target(target) && !config->once) { target->retry_time = time(NULL) + config->retry_timeout; push_to_retry_list(res->sync_res, target); } } } } } } pr_human("\n"); out: free(target); return ret; } int get_node(struct umad_resources *umad_res, uint16_t dlid, uint64_t *guid) { struct srp_ib_user_mad out_mad, in_mad; struct umad_sa_packet *out_sa_mad, *in_sa_mad; struct srp_sa_node_rec *node; in_sa_mad = get_data_ptr(in_mad); out_sa_mad = get_data_ptr(out_mad); init_srp_sa_mad(&out_mad, umad_res->agent, umad_res->sm_lid, UMAD_SA_ATTR_NODE_REC, 0); out_sa_mad->comp_mask = htobe64(1); /* LID */ node = (void *) out_sa_mad->data; node->lid = htobe16(dlid); if (send_and_get(umad_res->portid, umad_res->agent, &out_mad, &in_mad, 0) < 0) return -1; node = (void *) in_sa_mad->data; *guid = be64toh(node->port_guid); return 0; } static int get_port_info(struct umad_resources *umad_res, uint16_t dlid, uint64_t *subnet_prefix, int *isdm) { struct srp_ib_user_mad out_mad, in_mad; struct umad_sa_packet *out_sa_mad, *in_sa_mad; struct srp_sa_port_info_rec *port_info; in_sa_mad = get_data_ptr(in_mad); out_sa_mad = get_data_ptr(out_mad); init_srp_sa_mad(&out_mad, umad_res->agent, umad_res->sm_lid, UMAD_SA_ATTR_PORT_INFO_REC, 0); out_sa_mad->comp_mask = htobe64(1); /* LID */ port_info = (void *) out_sa_mad->data; port_info->endport_lid = htobe16(dlid); if (send_and_get(umad_res->portid, umad_res->agent, &out_mad, &in_mad, 0) < 0) return -1; port_info = (void *) in_sa_mad->data; *subnet_prefix = be64toh(port_info->subnet_prefix); *isdm = !!(be32toh(port_info->capability_mask) & SRP_IS_DM); return 0; } static int get_shared_pkeys(struct resources *res, uint16_t dest_port_lid, uint16_t *pkeys) { struct umad_resources *umad_res = res->umad_res; uint8_t *in_mad_buf; struct srp_ib_user_mad out_mad; struct ib_user_mad *in_mad; struct umad_sa_packet *out_sa_mad, *in_sa_mad; struct ib_path_rec *path_rec; ssize_t len; int i, num_pkeys = 0; __be16 pkey; uint16_t local_port_lid = get_port_lid(res->ud_res->ib_ctx, config->port_num, NULL); in_mad_buf = malloc(sizeof(struct ib_user_mad) + node_table_response_size); if (!in_mad_buf) return -ENOMEM; in_mad = (void *)in_mad_buf; in_sa_mad = (void *)in_mad->data; out_sa_mad = get_data_ptr(out_mad); init_srp_sa_mad(&out_mad, umad_res->agent, umad_res->sm_lid, UMAD_SA_ATTR_PATH_REC, 0); /** * Due to OpenSM bug (issue #335016) SM won't return * table of all shared P_Keys, it will return only the first * shared P_Key, So we send path_rec over each P_Key in the P_Key * table. SM will return path record if P_Key is shared or else None. * Once SM bug will be fixed, this loop should be removed. 
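 *
 * A P_Key is shared when it appears in the P_Key tables of both the local
 * port and the destination port; only such partitions can carry an SRP
 * connection, hence the PathRecord probe below for every entry in the local
 * P_Key table.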
**/ for (i = 0; ; i++) { if (pkey_index_to_pkey(umad_res, i, &pkey)) break; if (!pkey) continue; /* Mark components: DLID, SLID, PKEY */ out_sa_mad->comp_mask = htobe64(1 << 4 | 1 << 5 | 1 << 13); path_rec = (struct ib_path_rec *)out_sa_mad->data; path_rec->slid = htobe16(local_port_lid); path_rec->dlid = htobe16(dest_port_lid); path_rec->pkey = pkey; len = send_and_get(umad_res->portid, umad_res->agent, &out_mad, (struct srp_ib_user_mad *)in_mad, node_table_response_size); if (len < 0) goto err; path_rec = (struct ib_path_rec *)in_sa_mad->data; pkeys[num_pkeys++] = be16toh(path_rec->pkey); } free(in_mad_buf); return num_pkeys; err: free(in_mad_buf); return -1; } static int do_dm_port_list(struct resources *res) { struct umad_resources *umad_res = res->umad_res; uint8_t *in_mad_buf; struct srp_ib_user_mad out_mad; struct ib_user_mad *in_mad; struct umad_sa_packet *out_sa_mad, *in_sa_mad; struct srp_sa_port_info_rec *port_info; ssize_t len; int size; int i, j,num_pkeys; uint16_t pkeys[SRP_MAX_SHARED_PKEYS]; uint64_t guid; in_mad_buf = malloc(sizeof(struct ib_user_mad) + node_table_response_size); if (!in_mad_buf) return -ENOMEM; in_mad = (void *) in_mad_buf; in_sa_mad = (void *) in_mad->data; out_sa_mad = get_data_ptr(out_mad); init_srp_sa_mad(&out_mad, umad_res->agent, umad_res->sm_lid, UMAD_SA_ATTR_PORT_INFO_REC, SRP_SM_CAP_MASK_MATCH_ATTR_MOD); out_sa_mad->mad_hdr.method = UMAD_SA_METHOD_GET_TABLE; out_sa_mad->comp_mask = htobe64(1 << 7); /* Capability mask */ out_sa_mad->rmpp_hdr.rmpp_version = UMAD_RMPP_VERSION; out_sa_mad->rmpp_hdr.rmpp_type = 1; port_info = (void *) out_sa_mad->data; port_info->capability_mask = htobe32(SRP_IS_DM); /* IsDM */ len = send_and_get(umad_res->portid, umad_res->agent, &out_mad, (struct srp_ib_user_mad *) in_mad, node_table_response_size); if (len < 0) { free(in_mad_buf); return len; } size = ib_get_attr_size(in_sa_mad->attr_offset); if (!size) { if (config->verbose) { printf("Query did not find any targets\n"); } free(in_mad_buf); return 0; } for (i = 0; (i + 1) * size <= len - MAD_RMPP_HDR_SIZE; ++i) { port_info = (void *) in_sa_mad->data + i * size; if (get_node(umad_res, be16toh(port_info->endport_lid), &guid)) continue; num_pkeys = get_shared_pkeys(res, be16toh(port_info->endport_lid), pkeys); if (num_pkeys < 0) { pr_err("failed to get shared P_Keys with LID %#x\n", be16toh(port_info->endport_lid)); free(in_mad_buf); return num_pkeys; } for (j = 0; j < num_pkeys; ++j) do_port(res, pkeys[j], be16toh(port_info->endport_lid), be64toh(port_info->subnet_prefix), guid); } free(in_mad_buf); return 0; } void handle_port(struct resources *res, uint16_t pkey, uint16_t lid, uint64_t h_guid) { struct umad_resources *umad_res = res->umad_res; uint64_t subnet_prefix; int isdm; pr_debug("enter handle_port for lid %#x\n", lid); if (get_port_info(umad_res, lid, &subnet_prefix, &isdm)) return; if (!isdm) return; do_port(res, pkey, lid, subnet_prefix, h_guid); } static int do_full_port_list(struct resources *res) { struct umad_resources *umad_res = res->umad_res; uint8_t *in_mad_buf; struct srp_ib_user_mad out_mad; struct ib_user_mad *in_mad; struct umad_sa_packet *out_sa_mad, *in_sa_mad; struct srp_sa_node_rec *node; ssize_t len; int size; int i, j, num_pkeys; uint16_t pkeys[SRP_MAX_SHARED_PKEYS]; in_mad_buf = malloc(sizeof(struct ib_user_mad) + node_table_response_size); if (!in_mad_buf) return -ENOMEM; in_mad = (void *) in_mad_buf; in_sa_mad = (void *) in_mad->data; out_sa_mad = get_data_ptr(out_mad); init_srp_sa_mad(&out_mad, umad_res->agent, umad_res->sm_lid, 
UMAD_SA_ATTR_NODE_REC, 0); out_sa_mad->mad_hdr.method = UMAD_SA_METHOD_GET_TABLE; out_sa_mad->comp_mask = 0; /* Get all end ports */ out_sa_mad->rmpp_hdr.rmpp_version = UMAD_RMPP_VERSION; out_sa_mad->rmpp_hdr.rmpp_type = 1; len = send_and_get(umad_res->portid, umad_res->agent, &out_mad, (struct srp_ib_user_mad *) in_mad, node_table_response_size); if (len < 0) { free(in_mad_buf); return len; } size = be16toh(in_sa_mad->attr_offset) * 8; for (i = 0; (i + 1) * size <= len - MAD_RMPP_HDR_SIZE; ++i) { node = (void *) in_sa_mad->data + i * size; num_pkeys = get_shared_pkeys(res, be16toh(node->lid), pkeys); if (num_pkeys < 0) { pr_err("failed to get shared P_Keys with LID %#x\n", be16toh(node->lid)); free(in_mad_buf); return num_pkeys; } for (j = 0; j < num_pkeys; ++j) (void) handle_port(res, pkeys[j], be16toh(node->lid), be64toh(node->port_guid)); } free(in_mad_buf); return 0; } struct config_t *config; static void print_config(struct config_t *conf) { printf(" configuration report\n"); printf(" ------------------------------------------------\n"); printf(" Current pid : %u\n", getpid()); printf(" Device name : \"%s\"\n", conf->dev_name); printf(" IB port : %u\n", conf->port_num); printf(" Mad Retries : %d\n", conf->mad_retries); printf(" Number of outstanding WR : %u\n", conf->num_of_oust); printf(" Mad timeout (msec) : %u\n", conf->timeout); printf(" Prints add target command : %d\n", conf->cmd); printf(" Executes add target command : %d\n", conf->execute); printf(" Print also connected targets : %d\n", conf->all); printf(" Report current targets and stop : %d\n", conf->once); if (conf->rules_file) printf(" Reads rules from : %s\n", conf->rules_file); if (conf->print_initiator_ext) printf(" Print initiator_ext\n"); else printf(" Do not print initiator_ext\n"); if (conf->recalc_time) printf(" Performs full target rescan every %d seconds\n", conf->recalc_time); else printf(" No full target rescan\n"); if (conf->retry_timeout) printf(" Retries to connect to existing target after %d seconds\n", conf->retry_timeout); else printf(" Do not retry to connect to existing targets\n"); printf(" ------------------------------------------------\n"); } static char *copy_till_comma(char *d, char *s, int len, int base) { int i=0; while (strchr(", \t\n", *s) == NULL) { if (i == len) return NULL; if ((base == 16 && isxdigit(*s)) || (base == 10 && isdigit(*s))) { *d=*s; ++d; ++s; ++i; } else return NULL; } *d='\0'; if (*s == '\n') return s; ++s; return s; } static char *parse_main_option(struct rule *rule, char *ptr) { struct option_info { const char *name; size_t offset; size_t len; int base; }; #define OPTION_INFO(n, base) { #n "=", offsetof(struct rule, n), \ sizeof(((struct rule *)NULL)->n), base} static const struct option_info opt_info[] = { OPTION_INFO(id_ext, 16), OPTION_INFO(ioc_guid, 16), OPTION_INFO(dgid, 16), OPTION_INFO(service_id, 16), OPTION_INFO(pkey, 16), }; int i, optnamelen; char *ptr2 = NULL; for (i = 0; i < sizeof(opt_info) / sizeof(opt_info[0]); i++) { optnamelen = strlen(opt_info[i].name); if (strncmp(ptr, opt_info[i].name, optnamelen) == 0) { ptr2 = copy_till_comma((char *)rule + opt_info[i].offset, ptr + optnamelen, opt_info[i].len - 1, opt_info[i].base); break; } } return ptr2; } /* * Return values: * -1 if the output buffer is not large enough. * 0 if an unsupported option has been encountered. * > 0 if parsing succeeded. 
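 *
 * For example (hypothetical input): with ptr pointing at "max_sect=2048,..."
 * this appends ",max_sect=2048" to rule->options and returns the number of
 * input characters consumed, including the comma separator when one is
 * present.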
*/ static int parse_other_option(struct rule *rule, char *ptr) { static const char *const opt[] = { "allow_ext_sg=", "ch_count=", "cmd_sg_entries=", "comp_vector=", "max_cmd_per_lun=", "max_sect=", "queue_size=", "sg_tablesize=", "tl_retry_count=", }; char *ptr2 = NULL, *optr, option[17]; int i, optnamelen, len, left; optr = rule->options; left = sizeof(rule->options); len = strlen(optr); optr += len; left -= len; for (i = 0; i < sizeof(opt)/sizeof(opt[0]); ++i) { optnamelen = strlen(opt[i]); if (strncmp(ptr, opt[i], optnamelen) != 0) continue; ptr2 = copy_till_comma(option, ptr + optnamelen, sizeof(option) - 1, 10); if (!ptr2) return -1; len = snprintf(optr, left, ",%s%s", opt[i], option); optr += len; left -= len; if (left <= 0) return -1; break; } return ptr2 ? ptr2 - ptr : 0; } static int get_rules_file(struct config_t *conf) { int line_number = 1, len, line_number_for_output, ret = -1; char line[255]; char *ptr, *ptr2; struct rule *rule; FILE *infile = fopen(conf->rules_file, "r"); if (infile == NULL) { pr_debug("Could not find rules file %s, going with default\n", conf->rules_file); return 0; } while (fgets(line, sizeof(line), infile) != NULL) { if (line[0] != '#' && line[0] != '\n') line_number++; } if (fseek(infile, 0L, SEEK_SET) != 0) { pr_err("internal error while seeking %s\n", conf->rules_file); goto out; } conf->rules = malloc(sizeof(struct rule) * line_number); rule = &conf->rules[0] - 1; line_number_for_output = 0; while (fgets(line, sizeof(line), infile) != NULL) { line_number_for_output++; if (line[0] == '#' || line[0] == '\n') continue; rule++; switch (line[0]) { case 'a': case 'A': rule->allow = 1; break; case 'd': case 'D': rule->allow = 0; break; default: pr_err("Bad syntax in rules file %s line %d:" " line should start with 'a' or 'd'\n", conf->rules_file, line_number_for_output); goto out; } rule->id_ext[0] = '\0'; rule->ioc_guid[0] = '\0'; rule->dgid[0] = '\0'; rule->service_id[0] = '\0'; rule->pkey[0] = '\0'; rule->options[0] = '\0'; ptr = &line[1]; while (*ptr == ' ' || *ptr == '\t') ptr++; while (*ptr != '\n') { ptr2 = parse_main_option(rule, ptr); if (!ptr2 && rule->allow) { len = parse_other_option(rule, ptr); if (len < 0) { pr_err("Buffer overflow triggered by" " rules file %s line %d\n", conf->rules_file, line_number_for_output); goto out; } ptr2 = len ? 
ptr + len : NULL; } if (ptr2 == NULL) { pr_err("Bad syntax in rules file %s line %d\n", conf->rules_file, line_number_for_output); goto out; } ptr = ptr2; while (*ptr == ' ' || *ptr == '\t') ptr++; } } rule++; rule->id_ext[0] = '\0'; rule->ioc_guid[0] = '\0'; rule->dgid[0] = '\0'; rule->service_id[0] = '\0'; rule->pkey[0] = '\0'; rule->options[0] = '\0'; rule->allow = 1; ret = 0; out: fclose(infile); return ret; } static int set_conf_dev_and_port(char *umad_dev, struct config_t *conf) { int ret; if (umad_dev) { char *ibport; ret = translate_umad_to_ibdev_and_port(umad_dev, &conf->dev_name, &ibport); if (ret) { pr_err("Fail to translate umad to ibdev and port\n"); goto out; } conf->port_num = atoi(ibport); if (conf->port_num == 0) { pr_err("Bad port number %s\n", ibport); ret = -1; } free(ibport); } else { umad_ca_t ca; umad_port_t port; ret = umad_get_ca(NULL, &ca); if (ret) { pr_err("Failed to get default CA\n"); goto out; } ret = umad_get_port(ca.ca_name, 0, &port); if (ret) { pr_err("Failed to get default port for CA %s\n", ca.ca_name); umad_release_ca(&ca); goto out; } conf->dev_name = strdup(ca.ca_name); conf->port_num = port.portnum; umad_release_port(&port); umad_release_ca(&ca); pr_debug("Using device %s port %d\n", conf->dev_name, conf->port_num); } out: return ret; } static const struct option long_opts[] = { { "systemd", 0, NULL, 'S' }, {} }; static const char short_opts[] = "caveod:i:j:p:t:r:R:T:l:Vhnf:"; /* Check if the --systemd options was passed in very early so we can setup * logging properly. */ static bool is_systemd(int argc, char *argv[]) { while (1) { int c; c = getopt_long(argc, argv, short_opts, long_opts, NULL); if (c == -1) break; if (c == 'S') return true; } return false; } static int get_config(struct config_t *conf, int argc, char *argv[]) { /* set defaults */ char* umad_dev = NULL; int ret; conf->port_num = 1; conf->num_of_oust = 10; conf->dev_name = NULL; conf->cmd = 0; conf->once = 0; conf->execute = 0; conf->all = 0; conf->verbose = 0; conf->debug_verbose = 0; conf->timeout = 5000; conf->mad_retries = 3; conf->recalc_time = 0; conf->retry_timeout = 20; conf->add_target_file = NULL; conf->print_initiator_ext = 0; conf->rules_file = SRP_DAEMON_CONFIG_FILE; conf->rules = NULL; conf->tl_retry_count = 0; optind = 1; while (1) { int c; c = getopt_long(argc, argv, short_opts, long_opts, NULL); if (c == -1) break; switch (c) { case 'd': umad_dev = optarg; break; case 'i': conf->dev_name = strdup(optarg); if (!conf->dev_name) { pr_err("Fail to alloc space for dev_name\n"); return -ENOMEM; } break; case 'p': conf->port_num = atoi(optarg); if (conf->port_num == 0) { pr_err("Bad port number %s\n", optarg); return -1; } break; case 'j': { char dev[32]; int port_num; if (sscanf(optarg, "%31[^:]:%d", dev, &port_num) != 2) { pr_err("Bad dev:port specification %s\n", optarg); return -1; } conf->dev_name = strdup(dev); conf->port_num = port_num; } break; case 'c': ++conf->cmd; break; case 'o': ++conf->once; break; case 'a': ++conf->all; break; case 'e': ++conf->execute; break; case 'v': ++conf->verbose; break; case 'V': ++conf->debug_verbose; break; case 'n': ++conf->print_initiator_ext; break; case 't': conf->timeout = atoi(optarg); if (conf->timeout == 0) { pr_err("Bad timeout - %s\n", optarg); return -1; } break; case 'r': conf->mad_retries = atoi(optarg); if (conf->mad_retries == 0) { pr_err("Bad number of retries - %s\n", optarg); return -1; } break; case 'R': conf->recalc_time = atoi(optarg); if (conf->recalc_time == 0) { pr_err("Bad Rescan time window - %s\n", 
optarg); return -1; } break; case 'T': conf->retry_timeout = atoi(optarg); if (conf->retry_timeout == 0 && strcmp(optarg, "0")) { pr_err("Bad retry Timeout value- %s.\n", optarg); return -1; } break; case 'f': conf->rules_file = optarg; break; case 'l': conf->tl_retry_count = atoi(optarg); if (conf->tl_retry_count < 2 || conf->tl_retry_count > 7) { pr_err("Bad tl_retry_count argument (%d), " "must be 2 <= tl_retry_count <= 7\n", conf->tl_retry_count); return -1; } break; case 'S': break; case 'h': default: usage(argv[0]); return -1; } } initialize_sysfs(); if (conf->dev_name == NULL) { ret = set_conf_dev_and_port(umad_dev, conf); if (ret) { pr_err("Failed to build config\n"); return ret; } } ret = asprintf(&conf->add_target_file, "%s/class/infiniband_srp/srp-%s-%d/add_target", sysfs_path, conf->dev_name, conf->port_num); if (ret < 0) { pr_err("error while allocating add_target\n"); return ret; } if (get_rules_file(conf)) return -1; return 0; } static void free_config(struct config_t *conf) { free(conf->dev_name); free(conf->add_target_file); free(conf->rules); free(conf); } static void umad_resources_init(struct umad_resources *umad_res) { umad_res->portid = -1; umad_res->agent = -1; umad_res->port_sysfs_path = NULL; } static void umad_resources_destroy(struct umad_resources *umad_res) { if (umad_res->port_sysfs_path) free(umad_res->port_sysfs_path); if (umad_res->portid >= 0) { if (umad_res->agent >= 0) umad_unregister(umad_res->portid, umad_res->agent); umad_close_port(umad_res->portid); } umad_done(); } static int check_link_layer(const char *port_sysfs_path) { const char expected_link_layer[] = "InfiniBand"; char link_layer[sizeof(expected_link_layer)]; int ret; ret = srpd_sys_read_string(port_sysfs_path, "link_layer", link_layer, sizeof(link_layer)); if (ret < 0) { pr_err("Couldn't read link layer\n"); return ret; } if (strcmp(link_layer, expected_link_layer)) { pr_err("Unsupported link layer %s\n", link_layer); return -EINVAL; } return 0; } static int umad_resources_create(struct umad_resources *umad_res) { int ret; ret = asprintf(&umad_res->port_sysfs_path, "%s/class/infiniband/%s/ports/%d", sysfs_path, config->dev_name, config->port_num); if (ret < 0) { umad_res->port_sysfs_path = NULL; return -ENOMEM; } ret = check_link_layer(umad_res->port_sysfs_path); if (ret) return ret; umad_res->portid = umad_open_port(config->dev_name, config->port_num); if (umad_res->portid < 0) { pr_err("umad_open_port failed for device %s port %d\n", config->dev_name, config->port_num); return -ENXIO; } umad_res->agent = umad_register(umad_res->portid, UMAD_CLASS_SUBN_ADM, UMAD_SA_CLASS_VERSION, UMAD_RMPP_VERSION, NULL); if (umad_res->agent < 0) { pr_err("umad_register failed\n"); return umad_res->agent; } return 0; } static void *run_thread_retry_to_connect(void *res_in) { struct resources *res = (struct resources *)res_in; struct target_details *target; time_t sleep_time; pthread_mutex_lock(&res->sync_res->retry_mutex); while (!res->sync_res->stop_threads) { if (retry_list_is_empty(res->sync_res)) pthread_cond_wait(&res->sync_res->retry_cond, &res->sync_res->retry_mutex); while (!res->sync_res->stop_threads && (target = pop_from_retry_list(res->sync_res)) != NULL) { pthread_mutex_unlock(&res->sync_res->retry_mutex); sleep_time = target->retry_time - time(NULL); if (sleep_time > 0) srp_sleep(sleep_time, 0); add_non_exist_target(target); free(target); pthread_mutex_lock(&res->sync_res->retry_mutex); } } /* empty retry_list */ while ((target = pop_from_retry_list(res->sync_res))) free(target); 
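	/*
	 * Shutdown path: stop_threads was set by free_res(), so any targets
	 * still queued for a reconnect retry were freed above while
	 * retry_mutex is held; release the lock before exiting the thread.
	 */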
pthread_mutex_unlock(&res->sync_res->retry_mutex); pr_debug("retry_to_connect thread ended\n"); pthread_exit(NULL); } static void free_res(struct resources *res) { void *status; if (!res) return; if (res->sync_res) { pthread_mutex_lock(&res->sync_res->retry_mutex); res->sync_res->stop_threads = 1; pthread_cond_signal(&res->sync_res->retry_cond); pthread_mutex_unlock(&res->sync_res->retry_mutex); } if (res->ud_res) modify_qp_to_err(res->ud_res->qp); if (res->reconnect_thread) { pthread_kill(res->reconnect_thread, SIGINT); pthread_join(res->reconnect_thread, &status); } if (res->async_ev_thread) { pthread_kill(res->async_ev_thread, SIGINT); pthread_join(res->async_ev_thread, &status); } if (res->trap_thread) { pthread_kill(res->trap_thread, SIGINT); pthread_join(res->trap_thread, &status); } if (res->sync_res) sync_resources_cleanup(res->sync_res); if (res->ud_res) ud_resources_destroy(res->ud_res); if (res->umad_res) umad_resources_destroy(res->umad_res); free(res); } static struct resources *alloc_res(void) { struct all_resources { struct resources res; struct ud_resources ud_res; struct umad_resources umad_res; struct sync_resources sync_res; }; struct all_resources *res; int ret; res = calloc(1, sizeof(*res)); if (!res) goto err; umad_resources_init(&res->umad_res); ret = umad_resources_create(&res->umad_res); if (ret) goto err; res->res.umad_res = &res->umad_res; ud_resources_init(&res->ud_res); ret = ud_resources_create(&res->ud_res); if (ret) goto err; res->res.ud_res = &res->ud_res; res->umad_res.ib_ctx = res->ud_res.ib_ctx; ret = sync_resources_init(&res->sync_res); if (ret) goto err; res->res.sync_res = &res->sync_res; if (!config->once) { ret = pthread_create(&res->res.trap_thread, NULL, run_thread_get_trap_notices, &res->res); if (ret) goto err; ret = pthread_create(&res->res.async_ev_thread, NULL, run_thread_listen_to_events, &res->res); if (ret) goto err; } if (config->retry_timeout && !config->once) { ret = pthread_create(&res->res.reconnect_thread, NULL, run_thread_retry_to_connect, &res->res); if (ret) goto err; } return &res->res; err: if (res) free_res(&res->res); return NULL; } /* *c = *a - *b. See also the BSD macro timersub(). 
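 *
 * Worked example: a = {2, 100000000} and b = {1, 900000000} yields
 * {0, 200000000}, i.e. 2.1s - 1.9s = 0.2s; the negative tv_nsec that the
 * plain subtraction produces is repaired by borrowing one second.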
*/ static void ts_sub(const struct timespec *a, const struct timespec *b, struct timespec *res) { res->tv_sec = a->tv_sec - b->tv_sec; res->tv_nsec = a->tv_nsec - b->tv_nsec; if (res->tv_nsec < 0) { res->tv_sec--; res->tv_nsec += 1000 * 1000 * 1000; } } static void cleanup_wakeup_fd(void) { struct sigaction sa = {}; sigemptyset(&sa.sa_mask); sa.sa_handler = SIG_DFL; sigaction(SIGINT, &sa, NULL); sigaction(SIGTERM, &sa, NULL); sigaction(SRP_CATAS_ERR, &sa, NULL); close(wakeup_pipe[1]); close(wakeup_pipe[0]); wakeup_pipe[0] = -1; wakeup_pipe[1] = -1; } static int setup_wakeup_fd(void) { struct sigaction sa = {}; int ret; ret = pipe2(wakeup_pipe, O_NONBLOCK | O_CLOEXEC); if (ret < 0) { pr_err("could not create pipe\n"); return -1; } sigemptyset(&sa.sa_mask); sa.sa_handler = signal_handler; sigaction(SIGINT, &sa, NULL); sigaction(SIGTERM, &sa, NULL); sigaction(SRP_CATAS_ERR, &sa, NULL); return 0; } static int ibsrpdm(int argc, char *argv[]) { char* umad_dev = NULL; struct resources *res; int ret; s_log_dest = log_to_stderr; config = calloc(1, sizeof(*config)); config->num_of_oust = 10; config->timeout = 5000; config->mad_retries = 3; config->all = 1; config->once = 1; while (1) { int c; c = getopt(argc, argv, "cd:h:v"); if (c == -1) break; switch (c) { case 'c': ++config->cmd; break; case 'd': umad_dev = optarg; break; case 'v': ++config->debug_verbose; break; case 'h': default: fprintf(stderr, "Usage: %s [-vc] [-d ]\n", argv[0]); return 1; } } initialize_sysfs(); ret = set_conf_dev_and_port(umad_dev, config); if (ret) { pr_err("Failed to build config\n"); goto out; } ret = umad_init(); if (ret != 0) goto out; res = alloc_res(); if (!res) { ret = 1; pr_err("Resource allocation failed\n"); goto umad_done; } ret = recalc(res); if (ret) pr_err("Querying SRP targets failed\n"); free_res(res); umad_done: umad_done(); out: free_config(config); return ret; } int main(int argc, char *argv[]) { int ret; struct resources *res; uint16_t lid, sm_lid; uint16_t pkey; union umad_gid gid; struct target_details *target; int subscribed; int lockfd = -1; int received_signal = 0; bool systemd; #ifndef __CHECKER__ /* * Hide these checks for sparse because these checks fail with * older versions of sparse. 
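 * The BUILD_ASSERT()s below pin the on-wire layout of the SA and device
 * management structures (struct ib_path_rec, for instance, must be exactly
 * 64 bytes); a size mismatch would make the daemon misparse every MAD it
 * exchanges with the SM.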
*/ BUILD_ASSERT(sizeof(struct ib_path_rec) == 64); BUILD_ASSERT(sizeof(struct ib_inform_info) == 36); BUILD_ASSERT(sizeof(struct ib_mad_notice_attr) == 80); BUILD_ASSERT(offsetof(struct ib_mad_notice_attr, generic.trap_num) == 4); BUILD_ASSERT(offsetof(struct ib_mad_notice_attr, vend.dev_id) == 4); BUILD_ASSERT(offsetof(struct ib_mad_notice_attr, ntc_64_67.gid) == 16); BUILD_ASSERT(offsetof(struct ib_mad_notice_attr, ntc_144.new_cap_mask) == 16); #endif BUILD_ASSERT(sizeof(struct srp_sa_node_rec) == 108); BUILD_ASSERT(sizeof(struct srp_sa_port_info_rec) == 58); BUILD_ASSERT(sizeof(struct srp_dm_iou_info) == 132); BUILD_ASSERT(sizeof(struct srp_dm_ioc_prof) == 128); if (strcmp(argv[0] + max_t(int, 0, strlen(argv[0]) - strlen("ibsrpdm")), "ibsrpdm") == 0) { ret = ibsrpdm(argc, argv); goto out; } systemd = is_systemd(argc, argv); if (systemd) openlog(NULL, LOG_NDELAY | LOG_CONS | LOG_PID, LOG_DAEMON); else openlog("srp_daemon", LOG_PID, LOG_DAEMON); config = calloc(1, sizeof(*config)); if (!config) { pr_err("out of memory\n"); ret = ENOMEM; goto close_log; } if (get_config(config, argc, argv)) { ret = EINVAL; goto free_config; } if (config->verbose) print_config(config); if (!config->once) { lockfd = check_process_uniqueness(config); if (lockfd < 0) { ret = EPERM; goto free_config; } } ret = setup_wakeup_fd(); if (ret) goto cleanup_wakeup; catas_start: subscribed = 0; ret = umad_init(); if (ret < 0) { pr_err("umad_init failed\n"); goto close_lockfd; } res = alloc_res(); if (!res && received_signal == SRP_CATAS_ERR) pr_err("Device has not yet recovered from catas error\n"); if (!res) goto clean_umad; /* * alloc_res() fails while the HCA is recovering from a catastrophic * error. Clear 'received_signal' after alloc_res() has succeeded to * finish the alloc_res() retry loop. */ if (received_signal == SRP_CATAS_ERR) { pr_err("Device recovered from catastrophic error\n"); received_signal = 0; } if (config->once) { ret = recalc(res); goto free_res; } while (received_signal == 0) { pthread_mutex_lock(&res->sync_res->mutex); if (__rescan_scheduled(res->sync_res)) { uint16_t port_lid; pthread_mutex_unlock(&res->sync_res->mutex); pr_debug("Starting a recalculation\n"); port_lid = get_port_lid(res->ud_res->ib_ctx, config->port_num, &sm_lid); if (port_lid > 0 && port_lid < 0xc000 && (port_lid != res->ud_res->port_attr.lid || sm_lid != res->ud_res->port_attr.sm_lid)) { if (res->ud_res->ah) { ibv_destroy_ah(res->ud_res->ah); res->ud_res->ah = NULL; } ret = create_ah(res->ud_res); if (ret) { received_signal = get_received_signal(10, 0); goto kill_threads; } } if (res->ud_res->ah) { if (register_to_traps(res, 1)) pr_err("Fail to register to traps, maybe there " "is no SM running on fabric or IB port is down\n"); else subscribed = 1; } clear_traps_list(res->sync_res); schedule_rescan(res->sync_res, config->recalc_time ? 
config->recalc_time : -1); /* empty retry_list */ pthread_mutex_lock(&res->sync_res->retry_mutex); while ((target = pop_from_retry_list(res->sync_res))) free(target); pthread_mutex_unlock(&res->sync_res->retry_mutex); recalc(res); } else if (pop_from_list(res->sync_res, &lid, &gid, &pkey)) { pthread_mutex_unlock(&res->sync_res->mutex); if (lid) { uint64_t guid; ret = get_node(res->umad_res, lid, &guid); if (ret) /* unexpected error - do a full rescan */ schedule_rescan(res->sync_res, 0); else handle_port(res, pkey, lid, guid); } else { ret = get_lid(res->umad_res, &gid, &lid); if (ret < 0) /* unexpected error - do a full rescan */ schedule_rescan(res->sync_res, 0); else { pr_debug("lid is %#x\n", lid); srp_sleep(0, 100); handle_port(res, pkey, lid, be64toh(ib_gid_get_guid(&gid))); } } } else { static const struct timespec zero; struct timespec now, delta; struct timespec recalc = { .tv_sec = config->recalc_time }; struct timeval timeout; clock_gettime(CLOCK_MONOTONIC, &now); ts_sub(&res->sync_res->next_recalc_time, &now, &delta); pthread_mutex_unlock(&res->sync_res->mutex); if (ts_cmp(&zero, &delta, <=) && ts_cmp(&delta, &recalc, <)) recalc = delta; timeout.tv_sec = recalc.tv_sec; timeout.tv_usec = recalc.tv_nsec / 1000 + 1; received_signal = get_received_signal(timeout.tv_sec, timeout.tv_usec) ? : received_signal; } } ret = 0; kill_threads: switch (received_signal) { case SIGINT: pr_err("Got SIGINT\n"); break; case SIGTERM: pr_err("Got SIGTERM\n"); break; case SRP_CATAS_ERR: pr_err("Got SIG SRP_CATAS_ERR\n"); break; case 0: break; default: pr_err("Got SIG???\n"); break; } if (subscribed && received_signal != SRP_CATAS_ERR) { pr_err("Deregistering traps ...\n"); register_to_traps(res, 0); pr_err("Finished trap deregistration.\n"); } free_res: free_res(res); /* Discard the SIGINT triggered by the free_res() implementation. */ get_received_signal(0, 0); clean_umad: umad_done(); if (received_signal == SRP_CATAS_ERR) { /* * Device got a catastrophic error. Let's wait a grace * period and try to probe the device by attempting to * allocate IB resources. Once it recovers, we will * start all over again. */ received_signal = get_received_signal(10, 0) ? : received_signal; if (received_signal == SRP_CATAS_ERR) goto catas_start; } close_lockfd: if (lockfd >= 0) close(lockfd); cleanup_wakeup: cleanup_wakeup_fd(); free_config: free_config(config); close_log: closelog(); out: exit(ret ? 
1 : 0); } static int recalc(struct resources *res) { struct umad_resources *umad_res = res->umad_res; int mask_match; char val[7]; int ret; ret = srpd_sys_read_string(umad_res->port_sysfs_path, "sm_lid", val, sizeof val); if (ret < 0) { pr_err("Couldn't read SM LID\n"); return ret; } umad_res->sm_lid = strtol(val, NULL, 0); if (umad_res->sm_lid == 0) { pr_err("SM LID is 0, maybe no SM is running\n"); return -1; } ret = check_sm_cap(umad_res, &mask_match); if (ret < 0) return ret; if (mask_match) { pr_debug("Advanced SM, performing a capability query\n"); ret = do_dm_port_list(res); } else { pr_debug("Old SM, performing a full node query\n"); ret = do_full_port_list(res); } return ret; } static int get_lid(struct umad_resources *umad_res, union umad_gid *gid, uint16_t *lid) { struct srp_ib_user_mad out_mad, in_mad; struct umad_sa_packet *in_sa_mad = get_data_ptr(in_mad); struct umad_sa_packet *out_sa_mad = get_data_ptr(out_mad); struct ib_path_rec *path_rec = (struct ib_path_rec *) out_sa_mad->data; memset(&in_mad, 0, sizeof(in_mad)); init_srp_sa_mad(&out_mad, umad_res->agent, umad_res->sm_lid, UMAD_SA_ATTR_PATH_REC, 0); out_sa_mad->comp_mask = htobe64( 4 | 8 | 64 | 512 | 4096 ); path_rec->sgid = *gid; path_rec->dgid = *gid; path_rec->reversible_numpath = 1; path_rec->hop_flow_raw = htobe32(1 << 31); /* rawtraffic=1 hoplimit = 0 */ if (send_and_get(umad_res->portid, umad_res->agent, &out_mad, &in_mad, 0) < 0) return -1; path_rec = (struct ib_path_rec *) in_sa_mad->data; *lid = be16toh(path_rec->dlid); return 0; } rdma-core-56.1/srp_daemon/srp_daemon.conf000066400000000000000000000010721477342711600204570ustar00rootroot00000000000000## This is an example rules configuration file for srp_daemon. ## #This is a comment ## disallow the following dgid #d dgid=fe800000000000000002c90200402bd5 ## allow target with the following ioc_guid #a ioc_guid=00a0b80200402bd7 ## allow target with the following pkey #a pkey=ffff ## allow target with the following id_ext and ioc_guid #a id_ext=200500A0B81146A1,ioc_guid=00a0b80200402bef ## disallow all the rest #d ## ## Here is another example: ## ## Allow all targets and set queue size to 128. # a queue_size=128,max_cmd_per_lun=128 rdma-core-56.1/srp_daemon/srp_daemon.h000066400000000000000000000215271477342711600177700ustar00rootroot00000000000000/* * srp_daemon - discover SRP targets over IB * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2006 Cisco Systems, Inc. All rights reserved. * Copyright (c) 2006 Mellanox Technologies Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
* * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef SRP_DM_H #define SRP_DM_H #include #include #include #include #include #include #include #include /* __be16, __be32 and __be64 */ #include #include "config.h" #include "srp_ib_types.h" #define SRP_CATAS_ERR SIGUSR1 enum { SRP_DM_ATTR_IO_UNIT_INFO = 0x0010, SRP_DM_ATTR_IO_CONTROLLER_PROFILE = 0x0011, SRP_DM_ATTR_SERVICE_ENTRIES = 0x0012 }; enum { SRP_DM_NO_IOC = 0x0, SRP_DM_IOC_PRESENT = 0x1, SRP_DM_NO_SLOT = 0xf }; enum { SRP_SM_SUPPORTS_MASK_MATCH = 1 << 13, SRP_IS_DM = 1 << 19, SRP_SM_CAP_MASK_MATCH_ATTR_MOD = 1 << 31, }; enum { SRP_REV10_IB_IO_CLASS = 0xff00, SRP_REV16A_IB_IO_CLASS = 0x0100 }; struct srp_sa_node_rec { __be16 lid; __be16 reserved; uint8_t base_version; uint8_t class_version; uint8_t type; uint8_t num_ports; __be64 sys_guid __attribute__((packed)); __be64 node_guid __attribute__((packed)); __be64 port_guid __attribute__((packed)); __be16 partition_cap; __be16 device_id; __be32 revision; __be32 port_num_vendor_id; uint8_t desc[64]; }; struct srp_sa_port_info_rec { __be16 endport_lid; uint8_t port_num; uint8_t reserved; __be64 m_key __attribute__((packed)); __be64 subnet_prefix __attribute__((packed)); __be16 base_lid; __be16 master_sm_base_lid; __be32 capability_mask __attribute__((packed)); __be16 diag_code; __be16 m_key_lease_period; uint8_t local_port_num; uint8_t link_width_enabled; uint8_t link_width_supported; uint8_t link_width_active; uint8_t state_info1; uint8_t state_info2; uint8_t mkey_lmc; uint8_t link_speed; uint8_t mtu_smsl; uint8_t vl_cap; uint8_t vl_high_limit; uint8_t vl_arb_high_cap; uint8_t vl_arb_low_cap; uint8_t mtu_cap; uint8_t vl_stall_life; uint8_t vl_enforce; __be16 m_key_violations; __be16 p_key_violations; __be16 q_key_violations; uint8_t guid_cap; uint8_t subnet_timeout; uint8_t resp_time_value; uint8_t error_threshold; }; struct srp_dm_iou_info { __be16 change_id; uint8_t max_controllers; uint8_t diagid_optionrom; uint8_t controller_list[128]; }; struct srp_dm_ioc_prof { __be64 guid; __be32 vendor_id; __be32 device_id; __be16 device_version; __be16 reserved1; __be32 subsys_vendor_id; __be32 subsys_device_id; __be16 io_class; __be16 io_subclass; __be16 protocol; __be16 protocol_version; __be32 reserved2; __be16 send_queue_depth; uint8_t reserved3; uint8_t rdma_read_depth; __be32 send_size; __be32 rdma_size; uint8_t cap_mask; uint8_t reserved4; uint8_t service_entries; uint8_t reserved5[9]; char id[64]; }; struct srp_dm_svc_entries { struct { char name[40]; __be64 id; } service[4]; }; enum { SEND_SIZE = 256, GRH_SIZE = 40, RECV_BUF_SIZE = SEND_SIZE + GRH_SIZE, }; struct rule { int allow; char id_ext[17], ioc_guid[17], dgid[33], service_id[17], pkey[10], options[128]; }; #define SRP_MAX_SHARED_PKEYS 127 #define MAX_ID_EXT_STRING_LENGTH 17 struct target_details { uint16_t pkey; char id_ext[MAX_ID_EXT_STRING_LENGTH]; struct srp_dm_ioc_prof ioc_prof; uint64_t subnet_prefix; uint64_t h_guid; uint64_t h_service_id; time_t retry_time; char *options; struct target_details *next; }; struct config_t { char *dev_name; int port_num; char *add_target_file; int mad_retries; int 
num_of_oust; int cmd; int once; int execute; int all; int verbose; int debug_verbose; int timeout; int recalc_time; int print_initiator_ext; const char *rules_file; struct rule *rules; int retry_timeout; int tl_retry_count; }; extern struct config_t *config; struct ud_resources { struct ibv_device **dev_list; struct ibv_context *ib_ctx; struct ibv_pd *pd; struct ibv_cq *send_cq; struct ibv_cq *recv_cq; struct ibv_qp *qp; struct ibv_mr *mr; struct ibv_ah *ah; char *recv_buf; char *send_buf; struct ibv_device_attr device_attr; struct ibv_port_attr port_attr; int cq_size; struct ibv_comp_channel *channel; pthread_mutex_t *mad_buffer_mutex; struct umad_sa_packet *mad_buffer; }; struct umad_resources { struct ibv_context *ib_ctx; int portid; int agent; char *port_sysfs_path; uint16_t sm_lid; }; enum { SIZE_OF_TASKS_LIST = 5, }; struct sync_resources { int stop_threads; bool error; int next_task; struct timespec next_recalc_time; struct { uint16_t lid; uint16_t pkey; union umad_gid gid; } tasks[SIZE_OF_TASKS_LIST]; pthread_mutex_t mutex; struct target_details *retry_tasks_head; struct target_details *retry_tasks_tail; pthread_mutex_t retry_mutex; pthread_cond_t retry_cond; }; struct resources { struct ud_resources *ud_res; struct umad_resources *umad_res; struct sync_resources *sync_res; pthread_t trap_thread; pthread_t async_ev_thread; pthread_t reconnect_thread; }; struct srp_ib_user_mad { struct ib_user_mad hdr; char filler[MAD_BLOCK_SIZE]; }; #include #define pr_human(arg...) \ do { \ if (!config->cmd && !config->execute) \ printf(arg); \ } while (0) void pr_debug(const char *fmt, ...) __attribute__((format(printf, 1, 2))); void pr_err(const char *fmt, ...) __attribute__((format(printf, 1, 2))); int pkey_index_to_pkey(struct umad_resources *umad_res, int pkey_index, __be16 *pkey); void handle_port(struct resources *res, uint16_t pkey, uint16_t lid, uint64_t h_guid); void ud_resources_init(struct ud_resources *res); int ud_resources_create(struct ud_resources *res); int ud_resources_destroy(struct ud_resources *res); int wait_for_recalc(struct resources *res_in); int trap_main(struct resources *res); void *run_thread_get_trap_notices(void *res_in); void *run_thread_listen_to_events(void *res_in); int get_node(struct umad_resources *umad_res, uint16_t dlid, uint64_t *guid); int create_trap_resources(struct ud_resources *ud_res); int register_to_traps(struct resources *res, int subscribe); uint16_t get_port_lid(struct ibv_context *ib_ctx, int port_num, uint16_t *sm_lid); int create_ah(struct ud_resources *ud_res); void push_gid_to_list(struct sync_resources *res, union umad_gid *gid, uint16_t pkey); void push_lid_to_list(struct sync_resources *res, uint16_t lid, uint16_t pkey); struct target_details *pop_from_retry_list(struct sync_resources *res); void push_to_retry_list(struct sync_resources *res, struct target_details *target); int retry_list_is_empty(struct sync_resources *res); void clear_traps_list(struct sync_resources *res); int pop_from_list(struct sync_resources *res, uint16_t *lid, union umad_gid *gid, uint16_t *pkey); int sync_resources_init(struct sync_resources *res); void sync_resources_cleanup(struct sync_resources *res); bool sync_resources_error(struct sync_resources *res); int modify_qp_to_err(struct ibv_qp *qp); void srp_sleep(time_t sec, time_t usec); void wake_up_main_loop(char ch); void __schedule_rescan(struct sync_resources *res, int when); void schedule_rescan(struct sync_resources *res, int when); int __rescan_scheduled(struct sync_resources *res); int 
rescan_scheduled(struct sync_resources *res); void raise_catastrophic_error(struct sync_resources *res); #endif /* SRP_DM_H */ rdma-core-56.1/srp_daemon/srp_daemon.rules.in000066400000000000000000000003171477342711600212720ustar00rootroot00000000000000SUBSYSTEM=="infiniband_mad", KERNEL=="*umad*", PROGRAM=="@SYSTEMCTL_BIN@ show srp_daemon -p ActiveState", RESULT=="ActiveState=active", ENV{SYSTEMD_WANTS}+="srp_daemon_port@$attr{ibdev}:$attr{port}.service" rdma-core-56.1/srp_daemon/srp_daemon.service.5000066400000000000000000000020731477342711600213370ustar00rootroot00000000000000'\" t .TH "SRP_DAEMON\&.SERVICE" "5" "" "srp_daemon" "srp_daemon.service" .\" ----------------------------------------------------------------- .\" * set default formatting .\" ----------------------------------------------------------------- .\" disable hyphenation .nh .\" disable justification (adjust text to left margin only) .ad l .\" ----------------------------------------------------------------- .\" * MAIN CONTENT STARTS HERE * .\" ----------------------------------------------------------------- .SH "NAME" srp_daemon.service \- srp_daemon systemd service that controls all ports .SH "SYNOPSIS" .PP srp_daemon\&.service .SH "DESCRIPTION" .PP The srp_daemon\&.service controls whether or not any srp_daemon processes are running. Although no srp_daemon processes are controlled directly by the srp_daemon\&.service, this service controls whether or not any srp_daemon_port@\&.service are allowed to be active. Each srp_daemon_port@\&.service controls one srp_daemon process. .SH "SEE ALSO" .PP \fBsrp_daemon\fR(1), \fBsrp_daemon_port@.service\fR(5), \fBsystemctl\fR(1) rdma-core-56.1/srp_daemon/srp_daemon.service.in000066400000000000000000000007471477342711600216070ustar00rootroot00000000000000[Unit] Description=Daemon that discovers and logs in to SRP target systems Documentation=man:srp_daemon file:/etc/srp_daemon.conf DefaultDependencies=false Conflicts=emergency.target emergency.service Before=remote-fs-pre.target [Service] Type=oneshot RemainAfterExit=yes ExecStart=@CMAKE_INSTALL_FULL_LIBEXECDIR@/srp_daemon/start_on_all_ports MemoryDenyWriteExecute=yes PrivateTmp=yes ProtectHome=yes ProtectKernelModules=yes RestrictRealtime=yes [Install] WantedBy=multi-user.target rdma-core-56.1/srp_daemon/srp_daemon.sh.in000077500000000000000000000043511477342711600205570ustar00rootroot00000000000000#!/bin/bash # # Copyright (c) 2006 Mellanox Technologies. All rights reserved. # # This Software is licensed under one of the following licenses: # # 1) under the terms of the "Common Public License 1.0" a copy of which is # available from the Open Source Initiative, see # http://www.opensource.org/licenses/cpl.php. # # 2) under the terms of the "The BSD License" a copy of which is # available from the Open Source Initiative, see # http://www.opensource.org/licenses/bsd-license.php. # # 3) under the terms of the "GNU General Public License (GPL) Version 2" a # copy of which is available from the Open Source Initiative, see # http://www.opensource.org/licenses/gpl-license.php. # # Licensee has the right to choose one of the above licenses. # # Redistributions of source code must retain the above copyright # notice and one of the license notices. # # Redistributions in binary form must reproduce both the above copyright # notice, one of the license notices in the documentation # and/or other materials provided with the distribution. 
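#
# Overview (a sketch of what this wrapper does, not additional behavior):
# for every /sys/class/infiniband_mad/umad* device that also exposes an
# add_target file, the loop below spawns one srp_daemon in execute mode,
# e.g.
#
#   @CMAKE_INSTALL_FULL_SBINDIR@/srp_daemon -e -c -n -i mlx4_0 -p 1 -R 60
#
# where mlx4_0 and port 1 stand in for the values read from $d/ibdev and
# $d/port.
#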
# # $Id$ # shopt -s nullglob prog=@CMAKE_INSTALL_FULL_SBINDIR@/srp_daemon params=("$@") ibdir="/sys/class/infiniband" rescan_interval=60 pids=() pidfile="@CMAKE_INSTALL_FULL_RUNDIR@/srp_daemon.sh.pid" mypid=$$ trap_handler() { if [ "${#pids[@]}" ]; then kill -15 "${pids[@]}" > /dev/null 2>&1 wait "${pids[@]}" fi logger -i -t "$(basename "$0")" "killing $prog." /bin/rm -f "$pidfile" exit 0 } # Check if there is another copy running of srp_daemon.sh if [ -f "$pidfile" ]; then if [ -e "/proc/$(cat "$pidfile" 2>/dev/null)/status" ]; then echo "$(basename "$0") is already running. Exiting." exit 1 else /bin/rm -f "$pidfile" fi fi if ! echo $mypid > "$pidfile"; then echo "Creating $pidfile for pid $mypid failed" exit 1 fi trap 'trap_handler' 2 15 while [ ! -d ${ibdir} ] do sleep 30 done for d in ${ibdir}_mad/umad*; do hca_id="$(<"$d/ibdev")" port="$(<"$d/port")" add_target="${ibdir}_srp/srp-${hca_id}-${port}/add_target" if [ -e "${add_target}" ]; then ${prog} -e -c -n -i "${hca_id}" -p "${port}" -R "${rescan_interval}" "${params[@]}" >/dev/null 2>&1 & pids+=($!) fi done wait rdma-core-56.1/srp_daemon/srp_daemon_port@.service.5000066400000000000000000000031701477342711600225020ustar00rootroot00000000000000'\" t .TH "SRP_DAEMON_PORT@\&.SERVICE" "5" "" "srp_daemon" "srp_daemon_port@.service" .\" ----------------------------------------------------------------- .\" * set default formatting .\" ----------------------------------------------------------------- .\" disable hyphenation .nh .\" disable justification (adjust text to left margin only) .ad l .\" ----------------------------------------------------------------- .\" * MAIN CONTENT STARTS HERE * .\" ----------------------------------------------------------------- .SH "NAME" srp_daemon_port@.service \- srp_daemon_port@ systemd service that controls a single port .SH "SYNOPSIS" .PP srp_daemon_port@\&.service .SH "DESCRIPTION" .PP The srp_daemon_port@\&.service controls whether or not an srp_daemon process is monitoring the RDMA port specified as template argument. The format for the RDMA port name is \fIdev:port\fR where \fIdev\fR is the name of an RDMA device and \fIport\fR is an port number starting from one. Starting an instance of this template will start an srp_daemon process. Stopping an instance of this template will stop the srp_daemon process for the specified port. It can be prevented that srp_daemon is started for a certain port by masking the corresponding systemd service, e.g. \fBsystemctl mask srp_daemon_port@mlx4_0:1\fR. A list of all RDMA device and port number pairs can be obtained e.g. 
as follows: .PP .nf .RS $ (cd /sys/class/infiniband >&/dev/null && for p in */ports/*; do [ -e "$p" ] && echo "${p/\\/ports\\//:}"; done) mlx4_0:1 mlx4_0:2 mlx4_1:1 mlx4_1:2 .RE .fi .PP .SH "SEE ALSO" .PP \fBsrp_daemon\fR(1), \fBsrp_daemon.service\fR(5), \fBsystemctl\fR(1) rdma-core-56.1/srp_daemon/srp_daemon_port@.service.in000066400000000000000000000032171477342711600227460ustar00rootroot00000000000000[Unit] Description=SRP daemon that monitors port %i Documentation=man:srp_daemon file:/etc/rdma/rdma.conf file:/etc/srp_daemon.conf # srp_daemon is required to mount filesystems, and could run before sysinit.target DefaultDependencies=false Before=remote-fs-pre.target # Do not execute concurrently with an ongoing shutdown (required for DefaultDependencies=no) Conflicts=shutdown.target Before=shutdown.target # Ensure required kernel modules are loaded before starting Requires=rdma-load-modules@srp_daemon.service After=rdma-load-modules@srp_daemon.service # Complete setting up low level RDMA hardware After=rdma-hw.target # Only run while the RDMA udev device is in an active state, and shutdown if # it becomes unplugged. After=sys-subsystem-rdma-devices-%i-umad.device BindsTo=sys-subsystem-rdma-devices-%i-umad.device # Allow srp_daemon to act as a leader for all of the port services for # stop/start/reset After=srp_daemon.service BindsTo=srp_daemon.service [Service] Type=simple ExecStart=@CMAKE_INSTALL_FULL_SBINDIR@/srp_daemon --systemd -e -c -n -j %I -R 60 MemoryDenyWriteExecute=yes PrivateNetwork=yes PrivateTmp=yes ProtectControlGroups=yes ProtectHome=yes ProtectKernelModules=yes ProtectSystem=full RestrictRealtime=yes SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io [Install] # Instances of this template unit file is started automatically by udev or by # srp_daemon.service as devices are discovered. However, if the user manually # enables a template unit then it will be installed with remote-fs-pre. Note # that systemd will defer starting the unit until the rdma .device appears. WantedBy=remote-fs-pre.target rdma-core-56.1/srp_daemon/srp_handle_traps.c000066400000000000000000000544711477342711600211700ustar00rootroot00000000000000/* * Copyright (c) 2006 Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
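 *
 * srp_handle_traps.c: UD QP plumbing that lets srp_daemon subscribe to
 * SM traps 64 (GID in service) and 144 (local changes / capability mask)
 * and feed the resulting notices back to the main loop.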
* * $Author: ishai Rabinovitz [ishai@mellanox.co.il]$ * */ #include #include #include #include #include #include #include #include #include #include #include #include #include "srp_ib_types.h" #include "srp_daemon.h" void srp_sleep(time_t sec, time_t usec) { struct timespec req, rem; if (usec > 1000) { sec += usec / 1000; usec = usec % 1000; } req.tv_sec = sec; req.tv_nsec = usec * 1000000; nanosleep(&req, &rem); } /***************************************************************************** * Function: ud_resources_init *****************************************************************************/ void ud_resources_init(struct ud_resources *res) { res->dev_list = NULL; res->ib_ctx = NULL; res->send_cq = NULL; res->recv_cq = NULL; res->channel = NULL; res->qp = NULL; res->pd = NULL; res->mr = NULL; res->ah = NULL; res->send_buf = NULL; res->recv_buf = NULL; } /***************************************************************************** * Function: modify_qp_to_rts *****************************************************************************/ static int modify_qp_to_rts(struct ibv_qp *qp) { struct ibv_qp_attr attr; int flags; int rc; /* RESET -> INIT */ memset(&attr, 0, sizeof(struct ibv_qp_attr)); attr.qp_state = IBV_QPS_INIT; attr.port_num = config->port_num; attr.pkey_index = 0; attr.qkey = UMAD_QKEY; flags = IBV_QP_STATE | IBV_QP_PKEY_INDEX | IBV_QP_PORT | IBV_QP_QKEY; rc = ibv_modify_qp(qp, &attr, flags); if (rc) { pr_err("failed to modify QP state to INIT\n"); return rc; } /* INIT -> RTR */ memset(&attr, 0, sizeof(attr)); attr.qp_state = IBV_QPS_RTR; flags = IBV_QP_STATE; rc = ibv_modify_qp(qp, &attr, flags); if (rc) { pr_err("failed to modify QP state to RTR\n"); return rc; } /* RTR -> RTS */ /* memset(&attr, 0, sizeof(attr)); */ attr.qp_state = IBV_QPS_RTS; attr.sq_psn = 0; flags = IBV_QP_STATE | IBV_QP_SQ_PSN; rc = ibv_modify_qp(qp, &attr, flags); if (rc) { pr_err("failed to modify QP state to RTS\n"); return rc; } return 0; } int modify_qp_to_err(struct ibv_qp *qp) { static struct ibv_qp_attr attr = { .qp_state = IBV_QPS_ERR, }; return ibv_modify_qp(qp, &attr, IBV_QP_STATE); } /***************************************************************************** * Function: fill_rq_entry *****************************************************************************/ static int fill_rq_entry(struct ud_resources *res, int cur_receive) { struct ibv_recv_wr rr; struct ibv_sge sg; struct ibv_recv_wr *_bad_wr = NULL; struct ibv_recv_wr **bad_wr = &_bad_wr; int ret; memset(&rr, 0, sizeof(rr)); sg.length = RECV_BUF_SIZE; sg.lkey = res->mr->lkey; rr.next = NULL; rr.sg_list = &sg; rr.num_sge = 1; sg.addr = (((unsigned long)res->recv_buf) + RECV_BUF_SIZE * cur_receive); rr.wr_id = cur_receive; ret = ibv_post_recv(res->qp, &rr, bad_wr); if (ret < 0) { pr_err("failed to post RR\n"); return ret; } return 0; } /***************************************************************************** * Function: fill_rq *****************************************************************************/ static int fill_rq(struct ud_resources *res) { int cur_receive; int ret; for (cur_receive=0; cur_receivenum_of_oust; ++cur_receive) { ret = fill_rq_entry(res, cur_receive); if (ret < 0) { pr_err("failed to fill_rq_entry\n"); return ret; } } return 0; } /***************************************************************************** * Function: ud_resources_create *****************************************************************************/ int ud_resources_create(struct ud_resources *res) { struct ibv_device *ib_dev = 
NULL; size_t size; int i; int cq_size; int num_devices; /* get device names in the system */ res->dev_list = ibv_get_device_list(&num_devices); if (!res->dev_list) { pr_err("failed to get IB devices list\n"); return -1; } for (i = 0; i < num_devices; i ++) { if (!strcmp(ibv_get_device_name(res->dev_list[i]), config->dev_name)) { ib_dev = res->dev_list[i]; break; } } if (!ib_dev) { pr_err("IB device %s wasn't found\n", config->dev_name); return -ENXIO; } pr_debug("Device %s was found\n", config->dev_name); /* get device handle */ res->ib_ctx = ibv_open_device(ib_dev); if (!res->ib_ctx) { pr_err("failed to open device %s\n", config->dev_name); return -ENXIO; } res->channel = ibv_create_comp_channel(res->ib_ctx); if (!res->channel) { pr_err("failed to create completion channel \n"); return -ENXIO; } res->pd = ibv_alloc_pd(res->ib_ctx); if (!res->pd) { pr_err("ibv_alloc_pd failed\n"); return -1; } cq_size = config->num_of_oust; res->recv_cq = ibv_create_cq(res->ib_ctx, cq_size, NULL, res->channel, 0); if (!res->recv_cq) { pr_err("failed to create CQ with %u entries\n", cq_size); return -1; } pr_debug("CQ was created with %u CQEs\n", cq_size); if (ibv_req_notify_cq(res->recv_cq, 0)) { pr_err("Couldn't request CQ notification\n"); return -1; } res->send_cq = ibv_create_cq(res->ib_ctx, 1, NULL, NULL, 0); if (!res->send_cq) { pr_err("failed to create CQ with %u entries\n", 1); return -1; } pr_debug("CQ was created with %u CQEs\n", 1); size = cq_size * RECV_BUF_SIZE + SEND_SIZE; res->recv_buf = malloc(size); if (!res->recv_buf) { pr_err("failed to malloc %zu bytes to memory buffer\n", size); return -ENOMEM; } memset(res->recv_buf, 0, size); res->send_buf = res->recv_buf + cq_size * RECV_BUF_SIZE; res->mr = ibv_reg_mr(res->pd, res->recv_buf, size, IBV_ACCESS_LOCAL_WRITE); if (!res->mr) { pr_err("ibv_reg_mr failed\n"); return -1; } pr_debug("MR was created with addr=%p, lkey=0x%x,\n", res->recv_buf, res->mr->lkey); { struct ibv_qp_init_attr attr = { .send_cq = res->send_cq, .recv_cq = res->recv_cq, .cap = { .max_send_wr = 1, .max_recv_wr = config->num_of_oust, .max_send_sge = 1, .max_recv_sge = 1 }, .qp_type = IBV_QPT_UD, .sq_sig_all = 1, }; res->qp = ibv_create_qp(res->pd, &attr); if (!res->qp) { pr_err("failed to create QP\n"); return -1; } pr_debug("QP was created, QP number=0x%x\n", res->qp->qp_num); } /* modify the QP to RTS (connect the QPs) */ if (modify_qp_to_rts(res->qp)) { pr_err("failed to modify QP state from RESET to RTS\n"); return -1; } pr_debug("QPs were modified to RTS\n"); if (fill_rq(res)) return -1; res->mad_buffer = malloc(sizeof(struct umad_sa_packet)); if (!res->mad_buffer) { pr_err("Could not alloc mad_buffer, abort\n"); return -1; } res->mad_buffer_mutex = malloc(sizeof(pthread_mutex_t)); if (!res->mad_buffer_mutex) { pr_err("Could not alloc mad_buffer_mutex, abort\n"); return -1; } if (pthread_mutex_init(res->mad_buffer_mutex, NULL)) { pr_err("Could not init mad_buffer_mutex, abort\n"); return -1; } return 0; } uint16_t get_port_lid(struct ibv_context *ib_ctx, int port_num, uint16_t *sm_lid) { struct ibv_port_attr port_attr; int ret; ret = ibv_query_port(ib_ctx, port_num, &port_attr); if (!ret) { if (sm_lid) *sm_lid = port_attr.sm_lid; return port_attr.lid; } return 0; } int create_ah(struct ud_resources *ud_res) { struct ibv_ah_attr ah_attr; assert(!ud_res->ah); /* create the UD AV */ memset(&ah_attr, 0, sizeof(ah_attr)); if (ibv_query_port(ud_res->ib_ctx, config->port_num, &ud_res->port_attr)) { pr_err("ibv_query_port on port %u failed\n", config->port_num); return -1; } 
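	/*
	 * The daemon sends MADs only to the SM/SA, so its single address
	 * handle points at the SM LID that ibv_query_port() just returned;
	 * the main loop destroys and recreates this AH whenever it observes
	 * that the SM LID has moved.
	 */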
ah_attr.dlid = ud_res->port_attr.sm_lid; ah_attr.port_num = config->port_num; ud_res->ah = ibv_create_ah(ud_res->pd, &ah_attr); if (!ud_res->ah) { pr_err("failed to create UD AV\n"); return -1; } return 0; } /***************************************************************************** * Function: ud_resources_destroy *****************************************************************************/ int ud_resources_destroy(struct ud_resources *res) { int test_result = 0; if (res->qp) { if (ibv_destroy_qp(res->qp)) { pr_err("failed to destroy QP\n"); test_result = 1; } } if (res->mr) { if (ibv_dereg_mr(res->mr)) { pr_err("ibv_dereg_mr failed\n"); test_result = 1; } } if (res->send_cq) { if (ibv_destroy_cq(res->send_cq)) { pr_err("ibv_destroy_cq of CQ failed\n"); test_result = 1; } } if (res->recv_cq) { if (ibv_destroy_cq(res->recv_cq)) { pr_err("ibv_destroy_cq of CQ failed\n"); test_result = 1; } } if (res->channel) { if (ibv_destroy_comp_channel(res->channel)) { pr_err("ibv_destroy_comp_channel failed\n"); test_result = 1; } } if (res->ah) { if (ibv_destroy_ah(res->ah)) { pr_err("ibv_destroy_ah failed\n"); test_result = 1; } } if (res->pd) { if (ibv_dealloc_pd(res->pd)) { pr_err("ibv_dealloc_pd failed\n"); test_result = 1; } } if (res->ib_ctx) { if (ibv_close_device(res->ib_ctx)) { pr_err("ibv_close_device failed\n"); test_result = 1; } } if (res->dev_list) ibv_free_device_list(res->dev_list); if (res->recv_buf) free(res->recv_buf); if (res->mad_buffer) free(res->mad_buffer); if (res->mad_buffer_mutex) free(res->mad_buffer_mutex); return test_result; } static void fill_send_request(struct ud_resources *res, struct ibv_send_wr *psr, struct ibv_sge *psg, struct umad_hdr *mad_hdr) { static int wr_id=0; assert(res->ah); memset(psr, 0, sizeof(*psr)); psr->next = NULL; psr->wr_id = wr_id++; psr->sg_list = psg; psr->num_sge = 1; psr->opcode = IBV_WR_SEND; // psr->send_flags = IBV_SEND_SIGNALED | IBV_SEND_INLINE; psr->send_flags = IBV_SEND_SIGNALED; psr->wr.ud.ah = res->ah; psr->wr.ud.remote_qpn = 1; psr->wr.ud.remote_qkey = UMAD_QKEY; psg->addr = (uintptr_t) mad_hdr; psg->length = SEND_SIZE; psg->lkey = res->mr->lkey; } static int stop_threads(struct sync_resources *sync_res) { int result; pthread_mutex_lock(&sync_res->retry_mutex); result = sync_res->stop_threads; pthread_mutex_unlock(&sync_res->retry_mutex); return result; } /***************************************************************************** * Function: poll_cq_once * Poll a CQ once. * Returns the number of completion polled (0 or 1). * Returns a negative value on error. *****************************************************************************/ static int poll_cq_once(struct sync_resources *sync_res, struct ibv_cq *cq, struct ibv_wc *wc) { int ret; ret = ibv_poll_cq(cq, 1, wc); if (ret < 0) { pr_err("poll CQ failed\n"); return ret; } if (ret > 0 && wc->status != IBV_WC_SUCCESS) { if (!stop_threads(sync_res)) pr_err("got bad completion with status: 0x%x\n", wc->status); return -ret; } return ret; } static int poll_cq(struct sync_resources *sync_res, struct ibv_cq *cq, struct ibv_wc *wc, struct ibv_comp_channel *channel) { int ret; struct ibv_cq *ev_cq; void *ev_ctx; if (channel) { /* There may be extra completions that * were associated to the previous event. * Only poll for the first one. 
If there are more than one, * they will be handled by later call to poll_cq */ ret = poll_cq_once(sync_res, cq, wc); /* return directly if there was an error or * 1 completion polled */ if (ret) return ret; if (ibv_get_cq_event(channel, &ev_cq, &ev_ctx)) { pr_err("Failed to get cq_event\n"); return -1; } ibv_ack_cq_events(ev_cq, 1); if (ev_cq != cq) { pr_debug("CQ event for unknown CQ %p\n", ev_cq); return -1; } if (ibv_req_notify_cq(cq, 0)) { pr_err("Couldn't request CQ notification\n"); return -1; } } do { ret = poll_cq_once(sync_res, cq, wc); if (ret < 0) return ret; if (ret == 0) { if (channel) { pr_err("Weird poll returned no cqe after CQ event\n"); return -1; } if (sync_resources_error(sync_res)) return -1; } } while (ret == 0); return 0; } /***************************************************************************** * Function: register_to_trap *****************************************************************************/ static int register_to_trap(struct sync_resources *sync_res, struct ud_resources *res, int dest_lid, int trap_num, int subscribe) { struct ibv_send_wr sr; struct ibv_wc wc; struct ibv_sge sg; struct ibv_send_wr *_bad_wr = NULL; struct ibv_send_wr **bad_wr = &_bad_wr; int counter; int rc; int ret; long long unsigned comp_mask = 0; struct umad_hdr *mad_hdr = (struct umad_hdr *) (res->send_buf); struct umad_sa_packet *p_sa_mad = (struct umad_sa_packet *) (res->send_buf); struct ib_inform_info *data = (struct ib_inform_info *) (p_sa_mad->data); static uint64_t trans_id = 0x0000FFFF; if (subscribe) pr_debug("Registering to trap:%d (sm in %#x)\n", trap_num, dest_lid); else pr_debug("Deregistering from trap:%d (sm in %#x)\n", trap_num, dest_lid); memset(res->send_buf, 0, SEND_SIZE); fill_send_request(res, &sr, &sg, mad_hdr); umad_init_new(mad_hdr, /* Mad Header */ UMAD_CLASS_SUBN_ADM, /* Management Class */ UMAD_SA_CLASS_VERSION, /* Class Version */ UMAD_METHOD_SET, /* Method */ 0, /* Transaction ID - will be set before the send in the loop*/ htobe16(UMAD_ATTR_INFORM_INFO), /* Attribute ID */ 0 ); /* Attribute Modifier */ data->lid_range_begin = htobe16(0xFFFF); data->is_generic = 1; data->subscribe = subscribe; if (trap_num == UMAD_SM_GID_IN_SERVICE_TRAP) data->trap_type = htobe16(3); /* SM */ else if (trap_num == UMAD_SM_LOCAL_CHANGES_TRAP) data->trap_type = htobe16(4); /* Informational */ data->g_or_v.generic.trap_num = htobe16(trap_num); data->g_or_v.generic.node_type_msb = 0; if (trap_num == UMAD_SM_GID_IN_SERVICE_TRAP) /* Class Manager */ data->g_or_v.generic.node_type_lsb = htobe16(4); else if (trap_num == UMAD_SM_LOCAL_CHANGES_TRAP) /* Channel Adapter */ data->g_or_v.generic.node_type_lsb = htobe16(1); comp_mask |= SRP_INFORMINFO_LID_COMP | SRP_INFORMINFO_ISGENERIC_COMP | SRP_INFORMINFO_SUBSCRIBE_COMP | SRP_INFORMINFO_TRAPTYPE_COMP | SRP_INFORMINFO_TRAPNUM_COMP | SRP_INFORMINFO_PRODUCER_COMP; if (!data->subscribe) { data->g_or_v.generic.qpn_resp_time_val = htobe32(res->qp->qp_num << 8); comp_mask |= SRP_INFORMINFO_QPN_COMP; } p_sa_mad->comp_mask = htobe64(comp_mask); pr_debug("comp_mask: %llx\n", comp_mask); for (counter = 3, rc = 0; counter > 0 && rc == 0; counter--) { pthread_mutex_lock(res->mad_buffer_mutex); res->mad_buffer->mad_hdr.base_version = 0; // flag that the buffer is empty pthread_mutex_unlock(res->mad_buffer_mutex); mad_hdr->tid = htobe64(trans_id); trans_id++; ret = ibv_post_send(res->qp, &sr, bad_wr); if (ret) { pr_err("failed to post SR\n"); return ret; } ret = poll_cq(sync_res, res->send_cq, &wc, NULL); if (ret < 0) return ret; /* sleep and 
check for response from SA */ do { srp_sleep(1, 0); pthread_mutex_lock(res->mad_buffer_mutex); if (res->mad_buffer->mad_hdr.base_version == 0) rc = 0; else if (res->mad_buffer->mad_hdr.tid == mad_hdr->tid) rc = 1; else { res->mad_buffer->mad_hdr.base_version = 0; rc = 2; } pthread_mutex_unlock(res->mad_buffer_mutex); } while (rc == 2); // while old response. } if (counter == 0) { pr_err("No response to inform info registration\n"); return -EAGAIN; } return 0; } /***************************************************************************** * Function: response_to_trap *****************************************************************************/ static int response_to_trap(struct sync_resources *sync_res, struct ud_resources *res, struct umad_sa_packet *mad_buffer) { struct ibv_send_wr sr; struct ibv_sge sg; struct ibv_send_wr *_bad_wr = NULL; struct ibv_send_wr **bad_wr = &_bad_wr; int ret; struct ibv_wc wc; struct umad_sa_packet *response_buffer = (struct umad_sa_packet *) (res->send_buf); memcpy(response_buffer, mad_buffer, sizeof(struct umad_sa_packet)); response_buffer->mad_hdr.method = UMAD_METHOD_REPORT_RESP; fill_send_request(res, &sr, &sg, (struct umad_hdr *) response_buffer); ret = ibv_post_send(res->qp, &sr, bad_wr); if (ret < 0) { pr_err("failed to post response\n"); return ret; } ret = poll_cq(sync_res, res->send_cq, &wc, NULL); return ret; } /***************************************************************************** * Function: get_trap_notices *****************************************************************************/ static int get_trap_notices(struct resources *res) { struct ibv_wc wc; int cur_receive = 0; int ret = 0; int pkey_index; __be16 pkey; char *buffer; struct umad_sa_packet *mad_buffer; struct ib_mad_notice_attr *notice_buffer; int trap_num; while (!stop_threads(res->sync_res)) { ret = poll_cq(res->sync_res, res->ud_res->recv_cq, &wc, res->ud_res->channel); if (ret < 0) { srp_sleep(0, 1); continue; } pr_debug("get_trap_notices: Got CQE wc.wr_id=%lld\n", (long long int) wc.wr_id); cur_receive = wc.wr_id; buffer = res->ud_res->recv_buf + RECV_BUF_SIZE * cur_receive; mad_buffer = (struct umad_sa_packet *) (buffer + GRH_SIZE); if ((mad_buffer->mad_hdr.mgmt_class == UMAD_CLASS_SUBN_ADM) && (mad_buffer->mad_hdr.method == UMAD_METHOD_GET_RESP) && (be16toh(mad_buffer->mad_hdr.attr_id) == UMAD_ATTR_INFORM_INFO)) { /* this is probably a response to register to trap */ pthread_mutex_lock(res->ud_res->mad_buffer_mutex); *res->ud_res->mad_buffer = *mad_buffer; pthread_mutex_unlock(res->ud_res->mad_buffer_mutex); } else if ((mad_buffer->mad_hdr.mgmt_class == UMAD_CLASS_SUBN_ADM) && (mad_buffer->mad_hdr.method == UMAD_METHOD_REPORT) && (be16toh(mad_buffer->mad_hdr.attr_id) == UMAD_ATTR_NOTICE)) { /* this is a trap notice */ pkey_index = wc.pkey_index; ret = pkey_index_to_pkey(res->umad_res, pkey_index, &pkey); if (ret) { pr_err("get_trap_notices: Got Bad pkey_index (%d)\n", pkey_index); wake_up_main_loop(0); break; } notice_buffer = (struct ib_mad_notice_attr *) (mad_buffer->data); trap_num = be16toh(notice_buffer->generic.trap_num); response_to_trap(res->sync_res, res->ud_res, mad_buffer); if (trap_num == UMAD_SM_GID_IN_SERVICE_TRAP) push_gid_to_list(res->sync_res, ¬ice_buffer->ntc_64_67.gid, be16toh(pkey)); else if (trap_num == UMAD_SM_LOCAL_CHANGES_TRAP) { if (be32toh(notice_buffer->ntc_144.new_cap_mask) & SRP_IS_DM) push_lid_to_list(res->sync_res, be16toh(notice_buffer->ntc_144.lid), be16toh(pkey)); } else { pr_err("Unhandled trap_num %d\n", trap_num); } } ret = 
fill_rq_entry(res->ud_res, cur_receive); if (ret < 0) { wake_up_main_loop(0); break; } } return ret; } void *run_thread_get_trap_notices(void *res_in) { int ret; ret = get_trap_notices((struct resources *)res_in); pr_debug("get_trap_notices thread ended\n"); pthread_exit((void *)(long)ret); } /***************************************************************************** * Function: register_to_traps *****************************************************************************/ int register_to_traps(struct resources *res, int subscribe) { int rc; int trap_numbers[] = {UMAD_SM_GID_IN_SERVICE_TRAP, UMAD_SM_LOCAL_CHANGES_TRAP}; int i; for (i=0; i < sizeof(trap_numbers) / sizeof(*trap_numbers); ++i) { rc = register_to_trap(res->sync_res, res->ud_res, res->ud_res->port_attr.sm_lid, trap_numbers[i], subscribe); if (rc != 0) return rc; } return 0; } void *run_thread_listen_to_events(void *res_in) { struct resources *res = (struct resources *)res_in; struct ibv_async_event event; while (!stop_threads(res->sync_res)) { if (ibv_get_async_event(res->ud_res->ib_ctx, &event)) { if (errno != EINTR) pr_err("ibv_get_async_event failed (errno = %d)\n", errno); break; } pr_debug("event_type %d, port %d\n", event.event_type, event.element.port_num); switch (event.event_type) { case IBV_EVENT_PORT_ACTIVE: case IBV_EVENT_SM_CHANGE: case IBV_EVENT_LID_CHANGE: case IBV_EVENT_CLIENT_REREGISTER: case IBV_EVENT_PKEY_CHANGE: if (event.element.port_num == config->port_num) { pthread_mutex_lock(&res->sync_res->mutex); __schedule_rescan(res->sync_res, 0); wake_up_main_loop(0); pthread_mutex_unlock(&res->sync_res->mutex); } break; case IBV_EVENT_DEVICE_FATAL: case IBV_EVENT_CQ_ERR: case IBV_EVENT_QP_FATAL: /* clean and restart */ pr_err("Critical event %d, raising catastrophic " "error signal\n", event.event_type); raise_catastrophic_error(res->sync_res); break; /* case IBV_EVENT_PORT_ERR: case IBV_EVENT_QP_REQ_ERR: case IBV_EVENT_QP_ACCESS_ERR: case IBV_EVENT_COMM_EST: case IBV_EVENT_SQ_DRAINED: case IBV_EVENT_PATH_MIG: case IBV_EVENT_PATH_MIG_ERR: case IBV_EVENT_SRQ_ERR: case IBV_EVENT_SRQ_LIMIT_REACHED: case IBV_EVENT_QP_LAST_WQE_REACHED: */ default: break; } ibv_ack_async_event(&event); } return NULL; } rdma-core-56.1/srp_daemon/srp_ib_types.h000066400000000000000000000134051477342711600203370ustar00rootroot00000000000000/* * srp-ib_types - discover SRP targets over IB * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2006 Cisco Systems, Inc. All rights reserved. * Copyright (c) 2006 Mellanox Technologies Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. 
*
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

#ifndef SRP_IB_TYPES_H
#define SRP_IB_TYPES_H

#include <stdint.h>
#include <endian.h>
#include <linux/types.h> /* __be16, __be32 and __be64 */
#include <infiniband/umad_types.h> /* union umad_gid */
#include <infiniband/umad.h>

#define SRP_INFORMINFO_LID_COMP		(1 << 1)
#define SRP_INFORMINFO_ISGENERIC_COMP	(1 << 4)
#define SRP_INFORMINFO_SUBSCRIBE_COMP	(1 << 5)
#define SRP_INFORMINFO_TRAPTYPE_COMP	(1 << 6)
#define SRP_INFORMINFO_TRAPNUM_COMP	(1 << 7)
#define SRP_INFORMINFO_QPN_COMP		(1 << 8)
#define SRP_INFORMINFO_PRODUCER_COMP	(1 << 12)

#define PACK_SUFFIX4 __attribute__((aligned(4))) __attribute__((packed))
#define PACK_SUFFIX __attribute__((packed))

/****d* IBA Base: Constants/MAD_BLOCK_SIZE
* NAME
*	MAD_BLOCK_SIZE
*
* DESCRIPTION
*	Size of a non-RMPP MAD datagram.
*
* SOURCE
*/
#define MAD_BLOCK_SIZE 256

static inline uint32_t ib_get_attr_size(const __be16 attr_offset)
{
	return( ((uint32_t)be16toh( attr_offset )) << 3 );
}

/************************************************************
* NAME
*	MAD_RMPP_HDR_SIZE
*
* DESCRIPTION
*	Size of an RMPP header, including the common MAD header.
*
* SOURCE
*/
enum {
	MAD_RMPP_HDR_SIZE = 36,
};

/****s* IBA Base: Types/struct ib_path_rec
* NAME
*	struct ib_path_rec
*
* DESCRIPTION
*	Path records encapsulate the properties of a given
*	route between two end-points on a subnet.
*
* SYNOPSIS
*
* NOTES
*	The role of this data structure is identical to the role of struct
*	ibv_path_record in libibverbs/sa.h.
*/
struct ib_path_rec {
	uint8_t resv0[8];
	union umad_gid dgid;
	union umad_gid sgid;
	__be16 dlid;
	__be16 slid;
	__be32 hop_flow_raw;
	uint8_t tclass;
	uint8_t reversible_numpath; /* reversible-7:7 num path-6:0 */
	__be16 pkey;
	__be16 sl;
	uint8_t mtu;
	uint8_t rate;
	uint8_t pkt_life;
	uint8_t preference;
	uint8_t resv2[6];
};

/****f* IBA Base: Types/umad_init_new
* NAME
*	umad_init_new
*
* DESCRIPTION
*	Initialize UMAD common header.
* * SYNOPSIS */ static inline void umad_init_new(struct umad_hdr* const p_mad, const uint8_t mgmt_class, const uint8_t class_ver, const uint8_t method, const __be64 trans_id, const __be16 attr_id, const __be32 attr_mod) { p_mad->base_version = 1; p_mad->mgmt_class = mgmt_class; p_mad->class_version = class_ver; p_mad->method = method; p_mad->status = 0; p_mad->class_specific = 0; p_mad->tid = trans_id; p_mad->attr_id = attr_id; p_mad->resv = 0; p_mad->attr_mod = attr_mod; } struct ib_inform_info { union umad_gid gid; __be16 lid_range_begin; __be16 lid_range_end; __be16 reserved1; uint8_t is_generic; uint8_t subscribe; __be16 trap_type; union _inform_g_or_v { struct _inform_generic { __be16 trap_num; __be32 qpn_resp_time_val; uint8_t reserved2; uint8_t node_type_msb; __be16 node_type_lsb; } PACK_SUFFIX generic; struct _inform_vend { __be16 dev_id; __be32 qpn_resp_time_val; uint8_t reserved2; uint8_t vendor_id_msb; __be16 vendor_id_lsb; } PACK_SUFFIX vend; } PACK_SUFFIX g_or_v; } PACK_SUFFIX4; struct ib_mad_notice_attr // Total Size calc Accumulated { union { uint8_t generic_type; // 1 1 struct _notice_generic { uint8_t generic_type; uint8_t prod_type_msb; __be16 prod_type_lsb; __be16 trap_num; } generic; struct _notice_vend { uint8_t generic_type; uint8_t vend_id_msb; __be16 vend_id_lsb; __be16 dev_id; } vend; }; __be16 issuer_lid; // 2 8 union // 54 64 { __be16 toggle_count; // 2 10 struct _raw_data { __be16 toggle_count; uint8_t details[54]; } raw_data; struct _ntc_64_67 { __be16 toggle_count; uint8_t res[6]; union umad_gid gid; // the Node or Multicast Group that came in/out } ntc_64_67; struct _ntc_144 { __be16 toggle_count; __be16 pad1; __be16 lid; // lid where capability mask changed __be16 pad2; __be32 new_cap_mask; // new capability mask } ntc_144; }; union umad_gid issuer_gid; // 16 80 }; /****f* IBA Base: Types/ib_gid_get_guid * NAME * ib_gid_get_guid * * DESCRIPTION * Gets the guid from a GID. * * SYNOPSIS */ static inline __be64 ib_gid_get_guid(const union umad_gid *const p_gid) { return p_gid->global.interface_id; } #endif rdma-core-56.1/srp_daemon/srp_sync.c000066400000000000000000000160441477342711600174720ustar00rootroot00000000000000/* * srp_sync - discover SRP targets over IB * Copyright (c) 2005 Topspin Communications. All rights reserved. * Copyright (c) 2006 Cisco Systems, Inc. All rights reserved. * Copyright (c) 2006 Mellanox Technologies Ltd. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 * $Author: ishai Rabinovitz [ishai@mellanox.co.il]$
 */

#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <limits.h>
#include <stdbool.h>
#include <pthread.h>
#include <time.h>
#include "srp_daemon.h"

/*
 * Schedule a rescan at now + when if when >= 0 or disable rescanning if
 * when < 0.
 */
void __schedule_rescan(struct sync_resources *res, int when)
{
	struct timespec *ts = &res->next_recalc_time;

	clock_gettime(CLOCK_MONOTONIC, ts);
	ts->tv_sec = when >= 0 ? ts->tv_sec + when : LONG_MAX;
}

void schedule_rescan(struct sync_resources *res, int when)
{
	pthread_mutex_lock(&res->mutex);
	__schedule_rescan(res, when);
	pthread_mutex_unlock(&res->mutex);
}

int __rescan_scheduled(struct sync_resources *res)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return ts_cmp(&res->next_recalc_time, &now, <=);
}

int rescan_scheduled(struct sync_resources *res)
{
	int ret;

	pthread_mutex_lock(&res->mutex);
	ret = __rescan_scheduled(res);
	pthread_mutex_unlock(&res->mutex);
	return ret;
}

void raise_catastrophic_error(struct sync_resources *res)
{
	pthread_mutex_lock(&res->mutex);
	res->error = true;
	pthread_mutex_unlock(&res->mutex);
	raise(SRP_CATAS_ERR);
}

bool sync_resources_error(struct sync_resources *res)
{
	bool ret;

	pthread_mutex_lock(&res->mutex);
	ret = res->error;
	pthread_mutex_unlock(&res->mutex);
	return ret;
}

int sync_resources_init(struct sync_resources *res)
{
	int ret;

	res->stop_threads = 0;
	res->error = false;
	__schedule_rescan(res, 0);
	res->next_task = 0;
	/* pthread_*_init() return 0 or a positive error code, never a
	 * negative value, so test for any non-zero return. */
	ret = pthread_mutex_init(&res->mutex, NULL);
	if (ret) {
		pr_err("could not initialize mutex\n");
		return ret;
	}

	res->retry_tasks_head = NULL;
	ret = pthread_mutex_init(&res->retry_mutex, NULL);
	if (ret) {
		pr_err("could not initialize mutex\n");
		return ret;
	}
	ret = pthread_cond_init(&res->retry_cond, NULL);
	if (ret)
		pr_err("could not initialize cond\n");

	return ret;
}

void sync_resources_cleanup(struct sync_resources *res)
{
	pthread_cond_destroy(&res->retry_cond);
	pthread_mutex_destroy(&res->retry_mutex);
	pthread_mutex_destroy(&res->mutex);
}

void push_gid_to_list(struct sync_resources *res, union umad_gid *gid,
		      uint16_t pkey)
{
	int i;

	/* If there is going to be a recalc soon - do nothing */
	if (rescan_scheduled(res))
		return;

	pthread_mutex_lock(&res->mutex);

	/* check if the gid is already in the list */
	for (i=0; i < res->next_task; ++i)
		if (!memcmp(&res->tasks[i].gid, gid, 16) &&
		    res->tasks[i].pkey == pkey) {
			pr_debug("gid is already in task list\n");
			pthread_mutex_unlock(&res->mutex);
			return;
		}

	if (res->next_task == SIZE_OF_TASKS_LIST) {
		/* if the list is full, lets do a full rescan */
		__schedule_rescan(res, 0);
		res->next_task = 0;
	} else {
		/* otherwise enter to the next entry */
		res->tasks[res->next_task].gid = *gid;
		res->tasks[res->next_task].lid = 0;
		res->tasks[res->next_task].pkey = pkey;
		++res->next_task;
	}

	wake_up_main_loop(0);
	pthread_mutex_unlock(&res->mutex);
}

void push_lid_to_list(struct sync_resources *res, uint16_t lid, uint16_t pkey)
{
	int i;

	/* If there is going to be a recalc soon - do nothing */
	if (rescan_scheduled(res))
		return;

	pthread_mutex_lock(&res->mutex);

	/* check if the lid is already in the list */
	for (i=0; i < res->next_task; ++i)
		if (res->tasks[i].lid == lid && res->tasks[i].pkey == pkey) {
			pr_debug("lid %#x is already in task list\n", lid);
			pthread_mutex_unlock(&res->mutex);
			return;
		}

	if
(res->next_task == SIZE_OF_TASKS_LIST) { /* if the list is full, lets do a full rescan */ __schedule_rescan(res, 0); res->next_task = 0; } else { /* otherwise enter to the next entry */ res->tasks[res->next_task].lid = lid; res->tasks[res->next_task].pkey = pkey; memset(&res->tasks[res->next_task].gid, 0, 16); ++res->next_task; } wake_up_main_loop(0); pthread_mutex_unlock(&res->mutex); } void clear_traps_list(struct sync_resources *res) { pthread_mutex_lock(&res->mutex); res->next_task = 0; pthread_mutex_unlock(&res->mutex); } /* assumes that res->mutex is locked !!! */ int pop_from_list(struct sync_resources *res, uint16_t *lid, union umad_gid *gid, uint16_t *pkey) { int ret=0; int i; if (res->next_task) { *lid = res->tasks[0].lid; *pkey = res->tasks[0].pkey; *gid = res->tasks[0].gid; /* push the rest down */ for (i=1; i < res->next_task; ++i) res->tasks[i-1] = res->tasks[i]; ret = 1; --res->next_task; } return ret; } /* assumes that res->retry_mutex is locked !!! */ struct target_details *pop_from_retry_list(struct sync_resources *res) { struct target_details *ret = res->retry_tasks_head; if (ret) res->retry_tasks_head = ret->next; else res->retry_tasks_tail = NULL; return ret; } void push_to_retry_list(struct sync_resources *res, struct target_details *orig_target) { struct target_details *target; /* If there is going to be a recalc soon - do nothing */ if (rescan_scheduled(res)) return; target = malloc(sizeof(struct target_details)); memcpy(target, orig_target, sizeof(struct target_details)); pthread_mutex_lock(&res->retry_mutex); if (!res->retry_tasks_head) res->retry_tasks_head = target; if (res->retry_tasks_tail) res->retry_tasks_tail->next = target; res->retry_tasks_tail = target; target->next = NULL; pthread_cond_signal(&res->retry_cond); pthread_mutex_unlock(&res->retry_mutex); } /* assumes that res->retry_mutex is locked !!! */ int retry_list_is_empty(struct sync_resources *res) { return res->retry_tasks_head == NULL; } rdma-core-56.1/srp_daemon/srpd.in000077500000000000000000000067141477342711600167740ustar00rootroot00000000000000#!/bin/bash # Licensed under the OpenIB.org BSD license (FreeBSD Variant) - See COPYING.md # # Manage the SRP client daemon (srp_daemon) # # chkconfig: - 25 75 # description: Starts/Stops InfiniBand SRP client service # config: @CMAKE_INSTALL_FULL_SYSCONFDIR@/srp_daemon.conf # ### BEGIN INIT INFO # Provides: srpd # Required-Start: $syslog @RDMA_SERVICE@ # Required-Stop: $syslog @RDMA_SERVICE@ # Default-Start: @SRP_DEFAULT_START@ # Default-Stop: @SRP_DEFAULT_STOP@ # Should-Start: # Should-Stop: # Short-Description: Starts and stops the InfiniBand SRP client service # Description: The InfiniBand SRP client service attaches to SRP devices # on the InfiniBand fabric and makes them appear as local disks to # to the system. This service starts the client daemon that's # responsible for initiating and maintaining the connections to # remote devices. ### END INIT INFO if [ -e /etc/rdma/rdma.conf ]; then # RHEL / Fedora. RDMA_CONFIG=/etc/rdma/rdma.conf else # OFED RDMA_CONFIG=/etc/infiniband/openib.conf fi if [ -f $RDMA_CONFIG ]; then . $RDMA_CONFIG fi pidfile=@CMAKE_INSTALL_FULL_RUNDIR@/srp_daemon.sh.pid prog=@CMAKE_INSTALL_FULL_SBINDIR@/srp_daemon.sh checkpid() { [ -e "/proc/$1" ] } stop_srp_daemon() { if ! running; then return 1 fi local pid=`cat $pidfile` kill $pid # timeout 30 seconds for termination for i in `seq 300`; do if ! 
checkpid $pid; then
			return 0
		fi
		sleep 0.1
	done
	kill -9 $pid
	# If srp_daemon executables didn't finish by now
	# force kill
	pkill -9 srp_daemon

	return 0
}

# if the ib_srp module is loaded or built into the kernel return 0 otherwise
# return 1.
is_srp_mod_loaded()
{
	[ -e /sys/module/ib_srp ]
}

running()
{
	[ -f $pidfile ] && checkpid "$(cat $pidfile)"
}

start()
{
	if ! is_srp_mod_loaded; then
		echo "SRP kernel module is not loaded, unable to start SRP daemon"
		return 6
	fi
	if running; then
		echo "Already started"
		return 0
	fi

	echo -n "Starting SRP daemon service"

	if [ "$SRP_DEFAULT_TL_RETRY_COUNT" ]; then
		params=$params"-l $SRP_DEFAULT_TL_RETRY_COUNT "
	fi
	setsid $prog $params &>/dev/null &
	RC=$?
	[ $RC -eq 0 ] && echo || echo " ...failed"
	return $RC
}

stop()
{
	echo -n "Stopping SRP daemon service"

	stop_srp_daemon
	RC=$?

	for ((i=0;i<5;i++)); do
		if ! running; then
			rm -f $pidfile
			break
		fi
		sleep 1
	done

	[ $RC -eq 0 ] && echo || echo " ...failed"
	return $RC
}

status()
{
	local ret

	if [ ! -f $pidfile ]; then
		ret=3 # program not running
	else
		checkpid "$(cat $pidfile)"
		ret=$? # 1: pid file exists and not running / 0: running
	fi
	if [ $ret -eq 0 ] ; then
		echo "$prog is running... pid=$(cat $pidfile)"
	else
		echo "$prog is not running."
	fi
	return $ret
}

restart()
{
	stop
	start
}

condrestart()
{
	[ -f $pidfile ] && restart || return 0
}

usage()
{
	echo
	echo "Usage: `basename $0` {start|stop|restart|condrestart|try-restart|force-reload|status}"
	echo
	return 2
}

case $1 in
	start|stop|restart|condrestart|try-restart|force-reload)
		[ `id -u` != "0" ] && exit 4
		;;
esac

case $1 in
	start) start; RC=$? ;;
	stop) stop; RC=$? ;;
	restart) restart; RC=$? ;;
	reload) RC=3 ;;
	condrestart) condrestart; RC=$? ;;
	try-restart) condrestart; RC=$? ;;
	force-reload) condrestart; RC=$? ;;
	status) status; RC=$? ;;
	*) usage; RC=$?
;;
esac

exit $RC
rdma-core-56.1/srp_daemon/start_on_all_ports.in000066400000000000000000000004261477342711600217230ustar00rootroot00000000000000#!/bin/bash
for p in /sys/class/infiniband/*/ports/*; do
	[ -e "$p" ] || continue
	[ "$(cat ${p}/link_layer)" == "InfiniBand" ] || continue
	p=${p#/sys/class/infiniband/}
	nohup @SYSTEMCTL_BIN@ start "srp_daemon_port@${p/\/ports\//:}" &>/dev/null &
done
rdma-core-56.1/suse/000077500000000000000000000000001477342711600143115ustar00rootroot00000000000000rdma-core-56.1/suse/module-setup.sh000066400000000000000000000022661477342711600172760ustar00rootroot00000000000000#!/bin/bash
check() {
	[ -n "$hostonly" -a -c /sys/class/infiniband_verbs/uverbs0 ] && return 0
	[ -n "$hostonly" ] && return 255
	return 0
}

depends() {
	return 0
}

install() {
	inst /etc/rdma/mlx4.conf
	inst /etc/rdma/modules/infiniband.conf
	inst /etc/rdma/modules/iwarp.conf
	inst /etc/rdma/modules/opa.conf
	inst /etc/rdma/modules/rdma.conf
	inst /etc/rdma/modules/roce.conf
	inst /usr/libexec/mlx4-setup.sh
	inst_multiple lspci setpci awk sleep
	inst_rules 60-rdma-persistent-naming.rules 70-persistent-ipoib.rules 75-rdma-description.rules 90-rdma-hw-modules.rules 90-rdma-ulp-modules.rules
	inst_multiple -o \
		$systemdsystemunitdir/rdma-hw.target \
		$systemdsystemunitdir/rdma-load-modules@.service
	for i in \
		rdma-load-modules@rdma.service \
		rdma-load-modules@roce.service \
		rdma-load-modules@infiniband.service; do
		$SYSTEMCTL -q --root "$initdir" add-wants initrd.target "$i"
	done
}

installkernel() {
	hostonly='' instmods =drivers/infiniband =drivers/net/ethernet/mellanox =drivers/net/ethernet/chelsio =drivers/net/ethernet/cisco =drivers/net/ethernet/emulex =drivers/target
	hostonly='' instmods crc-t10dif crct10dif_common xprtrdma svcrdma
}
rdma-core-56.1/suse/rdma-core.spec000066400000000000000000000675531477342711600170550ustar00rootroot00000000000000#
# spec file for package rdma-core
#
# Copyright (c) 2025 SUSE LLC
#
# All modifications and additions to the file contributed by third parties
# remain the property of their copyright owners, unless otherwise agreed
# upon. The license for this file, and modifications and additions to the
# file, is the same license as for the pristine package itself (unless the
# license for the pristine package is not an Open Source License, in which
# case the license is the MIT License). An "Open Source License" is a
# license that conforms to the Open Source Definition (Version 1.9)
# published by the Open Source Initiative.

# Please submit bugfixes or comments via https://bugs.opensuse.org/
#

%bcond_without systemd
# Do not build static libs by default.
%define with_static %{?_with_static: 1} %{?!_with_static: 0}
%define with_pyverbs %{?_with_pyverbs: 1} %{?!_with_pyverbs: 0}

%if 0%{?suse_version} < 1550 && 0%{?sle_version} <= 150300
# systemd-rpm-macros is wrong in 15.3 and below
%define _modprobedir /lib/modprobe.d
%endif
%define git_ver %{nil}
Name: rdma-core
Version: 56.1
Release: 0
Summary: RDMA core userspace libraries and daemons
License: BSD-2-Clause OR GPL-2.0-only
Group: Productivity/Networking/Other

%define efa_so_major 1
%define hns_so_major 1
%define verbs_so_major 1
%define rdmacm_so_major 1
%define umad_so_major 3
%define mana_so_major 1
%define mlx4_so_major 1
%define mlx5_so_major 1
%define ibnetdisc_major 5
%define mad_major 5

%define efa_lname libefa%{efa_so_major}
%define hns_lname libhns%{hns_so_major}
%define verbs_lname libibverbs%{verbs_so_major}
%define rdmacm_lname librdmacm%{rdmacm_so_major}
%define umad_lname libibumad%{umad_so_major}
%define mana_lname libmana%{mana_so_major}
%define mlx4_lname libmlx4-%{mlx4_so_major}
%define mlx5_lname libmlx5-%{mlx5_so_major}

%ifnarch s390 %arm
%define dma_coherent 1
%endif

%global modprobe_d_files 50-libmlx4.conf truescale.conf %{?dma_coherent:mlx4.conf}

# Almost everything is licensed under the OFA dual GPLv2, 2 Clause BSD license
#  providers/ipathverbs/ Dual licensed using a BSD license with an extra patent clause
#  providers/rxe/ Incorporates code from ipathverbs and contains the patent clause
#  providers/hfi1verbs Uses the 3 Clause BSD license
URL: https://github.com/linux-rdma/rdma-core
Source: rdma-core-%{version}%{git_ver}.tar.gz
Source1: baselibs.conf
BuildRequires: binutils
BuildRequires: cmake >= 2.8.11
BuildRequires: gcc
BuildRequires: pandoc
# perl is needed for the proper rpm macros
%if %{?suse_version} > 1550
BuildRequires: perl
%endif
BuildRequires: pkgconfig
BuildRequires: python3-base
BuildRequires: python3-docutils
BuildRequires: pkgconfig(libsystemd)
BuildRequires: pkgconfig(libudev)
BuildRequires: pkgconfig(systemd)
BuildRequires: pkgconfig(udev)
%if %{with_pyverbs}
BuildRequires: python3-Cython
BuildRequires: python3-devel
%endif
%ifnarch s390 s390x
%if 0%{?suse_version} >= 1550
BuildRequires: valgrind-client-headers
%else
BuildRequires: valgrind-devel
%endif
%endif
BuildRequires: systemd-rpm-macros
BuildRequires: pkgconfig(libnl-3.0)
BuildRequires: pkgconfig(libnl-route-3.0)
BuildRequires: pkgconfig(systemd)
Requires: kmod
Requires: systemd
Requires: udev
Recommends: rdma-ndd
# SUSE previously shipped rdma as a stand-alone
# package which we're supplanting here.
Provides: rdma = %{version}
Obsoletes: rdma < %{version}
Provides: ofed = %{version}
Obsoletes: ofed < %{version}

# Trickery to handle both SUSE OpenBuild System and Manual build
# In OBS, rdma-core must use curl-mini instead of curl to avoid
# a build dependency loop:
# rdma-core -> cmake -> curl -> ... -> boost -> rdma-core
# Thus we force a BuildRequires to curl-mini which has no impact
# as it is not used during the build.
# However curl-mini is not a published RPM. This would prevent any build
# outside of OBS. Thus we add a bcond to allow manual build.
# To force build without the use of curl-mini, --without=curlmini
# should be passed to rpmbuild
%bcond_without curlmini
%if 0%{?suse_version} >= 1330 && 0%{?suse_version} < 1550
%if %{with curlmini}
BuildRequires: curl-mini
%endif
%endif

# Tumbleweed's cmake RPM macro adds -Wl,--no-undefined to the module flags
# which is totally inappropriate and breaks building 'ENABLE_EXPORTS' style
# module libraries (eg ibacmp).
#%%define CMAKE_FLAGS -DCMAKE_MODULE_LINKER_FLAGS="" # Since we recommend developers use Ninja, so should packagers, for consistency. %define CMAKE_FLAGS %{nil} %if 0%{?suse_version} >= 1300 BuildRequires: ninja %define CMAKE_FLAGS -GNinja %define make_jobs ninja -v %{?_smp_mflags} %define cmake_install DESTDIR=%{buildroot} ninja install %else # Fallback to make otherwise BuildRequires: make %define make_jobs make VERBOSE=1 %{?_smp_mflags} %define cmake_install DESTDIR=%{buildroot} make install %endif %description RDMA core userspace infrastructure and documentation, including initialization scripts, kernel driver-specific modprobe override configs, IPoIB network scripts, dracut rules, and the rdma-ndd utility. %package devel Summary: RDMA core development libraries and headers Group: Development/Libraries/C and C++ Requires: %{name}%{?_isa} = %{version}-%{release} Requires: %{rdmacm_lname} = %{version}-%{release} Requires: %{umad_lname} = %{version}-%{release} Requires: %{verbs_lname} = %{version}-%{release} %if 0%{?dma_coherent} Requires: %{efa_lname} = %{version}-%{release} Requires: %{hns_lname} = %{version}-%{release} Requires: %{mana_lname} = %{version}-%{release} Requires: %{mlx4_lname} = %{version}-%{release} Requires: %{mlx5_lname} = %{version}-%{release} %endif Requires: rsocket = %{version}-%{release} Provides: libibverbs-devel = %{version}-%{release} Obsoletes: libibverbs-devel < %{version}-%{release} Provides: libibumad-devel = %{version}-%{release} Obsoletes: libibumad-devel < %{version}-%{release} Provides: librdmacm-devel = %{version}-%{release} Obsoletes: librdmacm-devel < %{version}-%{release} #Requires: ibacm = %%{version}-%%{release} Provides: ibacm-devel = %{version}-%{release} Obsoletes: ibacm-devel < %{version}-%{release} %if %{with_static} # Since our pkg-config files include private references to these packages they # need to have their .pc files installed too, even for dynamic linking, or # pkg-config breaks. BuildRequires: pkgconfig(libnl-3.0) BuildRequires: pkgconfig(libnl-route-3.0) %endif Requires: infiniband-diags = %{version}-%{release} Provides: infiniband-diags-devel = %{version}-%{release} Obsoletes: infiniband-diags-devel < %{version}-%{release} Provides: libibmad-devel = %{version}-%{release} Obsoletes: libibmad-devel < %{version} %description devel RDMA core development libraries and headers. 
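# Illustrative note only (not used by the build): a developer consuming the
# devel package above would typically locate headers and libraries through
# the pkg-config files it installs, for example:
#
#   cc $(pkg-config --cflags libibverbs) -o app app.c $(pkg-config --libs libibverbs)
#
# The pkg-config names here (libibverbs, librdmacm, libibumad) correspond to
# the .pc files packaged in the devel file list; "app.c" is a hypothetical
# application source used purely for the example.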
%package -n libibverbs Summary: Library & drivers for direct userspace use of InfiniBand/iWARP/RoCE hardware Group: System/Libraries Requires: %{name}%{?_isa} = %{version}-%{release} Obsoletes: libcxgb4-rdmav2 < %{version}-%{release} Obsoletes: libefa-rdmav2 < %{version}-%{release} Obsoletes: libhfi1verbs-rdmav2 < %{version}-%{release} Obsoletes: libhns-rdmav2 < %{version}-%{release} Obsoletes: libipathverbs-rdmav2 < %{version}-%{release} Obsoletes: libmana-rdmav2 < %{version}-%{release} Obsoletes: libmlx4-rdmav2 < %{version}-%{release} Obsoletes: libmlx5-rdmav2 < %{version}-%{release} Obsoletes: libmthca-rdmav2 < %{version}-%{release} Obsoletes: libocrdma-rdmav2 < %{version}-%{release} Obsoletes: librxe-rdmav2 < %{version}-%{release} %if 0%{?dma_coherent} Requires: %{efa_lname} = %{version}-%{release} Requires: %{hns_lname} = %{version}-%{release} Requires: %{mana_lname} = %{version}-%{release} Requires: %{mlx4_lname} = %{version}-%{release} Requires: %{mlx5_lname} = %{version}-%{release} %endif # Recommended packages for rxe Recommends: iproute2 %description -n libibverbs libibverbs is a library that allows userspace processes to use RDMA "verbs" as described in the InfiniBand Architecture Specification and the RDMA Protocol Verbs Specification. This includes direct hardware access from userspace to InfiniBand/iWARP adapters (kernel bypass) for fast path operations. Device-specific plug-in ibverbs userspace drivers are included: - libcxgb4: Chelsio T4 iWARP HCA - libefa: Amazon Elastic Fabric Adapter - libhfi1: Intel Omni-Path HFI - libhns: HiSilicon Hip08+ SoC - libipathverbs: QLogic InfiniPath HCA - libirdma: Intel Ethernet Connection RDMA - libmana: Microsoft Azure Network Adapter - libmlx4: Mellanox ConnectX-3 InfiniBand HCA - libmlx5: Mellanox Connect-IB/X-4+ InfiniBand HCA - libmthca: Mellanox InfiniBand HCA - libocrdma: Emulex OneConnect RDMA/RoCE Device - libqedr: QLogic QL4xxx RoCE HCA - librxe: A software implementation of the RoCE protocol - libsiw: A software implementation of the iWarp protocol - libvmw_pvrdma: VMware paravirtual RDMA device %package -n %verbs_lname Summary: Ibverbs runtime library Group: System/Libraries Requires: libibverbs = %{version} %description -n %verbs_lname This package contains the ibverbs runtime library. %package -n %efa_lname Summary: EFA runtime library Group: System/Libraries %description -n %efa_lname This package contains the efa runtime library. %package -n %hns_lname Summary: HNS runtime library Group: System/Libraries %description -n %hns_lname This package contains the hns runtime library. %package -n %mana_lname Summary: MANA runtime library Group: System/Libraries %description -n %mana_lname This package contains the mana runtime library. %package -n %mlx4_lname Summary: MLX4 runtime library Group: System/Libraries %description -n %mlx4_lname This package contains the mlx4 runtime library. %package -n %mlx5_lname Summary: MLX5 runtime library Group: System/Libraries %description -n %mlx5_lname This package contains the mlx5 runtime library. %package -n libibnetdisc%{ibnetdisc_major} Summary: Infiniband Net Discovery runtime library Group: System/Libraries %description -n libibnetdisc%{ibnetdisc_major} This package contains the Infiniband Net Discovery runtime library needed mainly by infiniband-diags. 
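# Illustrative note only (the device name mlx5_0 below is an assumption, not
# something this spec guarantees): once libibverbs and a matching provider
# from the list above are installed, RDMA devices can be enumerated from
# userspace with the tools shipped in the libibverbs-utils package defined
# next, e.g.:
#
#   ibv_devices             # list RDMA devices and their node GUIDs
#   ibv_devinfo -d mlx5_0   # query attributes of one specific device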
%package -n libibverbs-utils
Summary: Examples for the libibverbs library
Group: Productivity/Networking/Other
Requires: libibverbs%{?_isa} = %{version}

%description -n libibverbs-utils
Useful libibverbs example programs such as ibv_devinfo, which displays
information about RDMA devices.

%package -n ibacm
Summary: InfiniBand Communication Manager Assistant
Group: Productivity/Networking/Other
%{?systemd_requires}
Requires: %{name}%{?_isa} = %{version}
Obsoletes: libibacmp1 < %{version}
Provides: libibacmp1 = %{version}

%description -n ibacm
The ibacm daemon helps reduce the load of managing path record lookups on
large InfiniBand fabrics by providing a user space implementation of what
is functionally similar to an ARP cache. The use of ibacm, when properly
configured, can reduce the SA packet load of a large IB cluster from O(n^2)
to O(n). The ibacm daemon is started and normally runs in the background;
user applications need not know about this daemon as long as their app uses
librdmacm to handle connection bring up/tear down. The librdmacm library
knows how to talk directly to the ibacm daemon to retrieve data.

%package -n infiniband-diags
Summary: InfiniBand Diagnostic Tools
Group: Productivity/Networking/Diagnostic
Requires: perl = %{perl_version}

%description -n infiniband-diags
diags provides IB diagnostic programs and scripts needed to diagnose an
IB subnet.

%package -n libibmad%{mad_major}
Summary: Libibmad runtime library
Group: System/Libraries

%description -n libibmad%{mad_major}
Libibmad provides low layer IB functions for use by the IB diagnostic and
management programs. These include MAD, SA, SMP, and other basic IB
functions. This package contains the runtime library.

%package -n iwpmd
Summary: Userspace iWarp Port Mapper daemon
Group: Development/Libraries/C and C++
Requires: %{name}%{?_isa} = %{version}
%{?systemd_requires}

%description -n iwpmd
iwpmd provides a userspace service for iWarp drivers to claim
tcp ports through the standard socket interface.

%package -n %umad_lname
Summary: OpenFabrics Alliance InfiniBand Userspace Management Datagram library
Group: System/Libraries

%description -n %umad_lname
libibumad provides the userspace management datagram (umad) library
functions, which sit on top of the umad modules in the kernel. These
are used by the IB diagnostic and management tools, including OpenSM.

%package -n %rdmacm_lname
Summary: Userspace RDMA Connection Manager
Group: System/Libraries
Requires: %{name} = %{version}
Provides: librdmacm = %{version}
Obsoletes: librdmacm < %{version}

%description -n %rdmacm_lname
librdmacm provides a userspace RDMA Communication Management API.

%package -n rsocket
Summary: Preloadable library to turn the socket API RDMA-aware
# Older librdmacm-tools used to provide rsocket
Group: System/Libraries
Conflicts: librdmacm-tools < 2

%description -n rsocket
Existing applications can make use of rsockets through the use of this
preloadable library. See the documentation in the packaged rsocket(7)
manpage for details.

%package -n librdmacm-utils
Summary: Examples for the librdmacm library
Group: Productivity/Networking/Other
Obsoletes: librdmacm-tools < %{version}
Provides: librdmacm-tools = %{version}

%description -n librdmacm-utils
Example test programs for the librdmacm library.
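# Illustrative note for the rsocket package above (the preload library path
# /usr/lib64/rsocket/librspreload.so, the port, and the "tcp_server" binary
# are assumptions for the example, based on a typical rdma-core install):
# an unmodified sockets application can be switched to rsockets by
# preloading the library packaged in the rsocket directory, e.g.:
#
#   LD_PRELOAD=/usr/lib64/rsocket/librspreload.so ./tcp_server -p 7471
#
# See the rsocket(7) man page packaged above for details and limitations.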
%package -n srp_daemon Summary: Tools for using the InfiniBand SRP protocol devices Group: Development/Libraries/C and C++ Requires: %{name} = %{version} Obsoletes: srptools <= 1.0.3 Provides: srptools = %{version} %{?systemd_requires} %description -n srp_daemon In conjunction with the kernel ib_srp driver, srp_daemon allows you to discover and use SCSI devices via the SCSI RDMA Protocol over InfiniBand. %package -n rdma-ndd Summary: Daemon to manage RDMA Node Description Group: System/Daemons Requires: %{name} = %{version} # The udev rules in rdma need to be aware of rdma-ndd: Conflicts: rdma < 2.1 %{?systemd_requires} %description -n rdma-ndd rdma-ndd is a system daemon which watches for rdma device changes and/or hostname changes and updates the Node Description of the rdma devices based on those changes. %package -n python3-pyverbs Summary: Python3 API over IB verbs Group: Development/Languages/Python %description -n python3-pyverbs Pyverbs is a Cython-based Python API over libibverbs, providing an easy, object-oriented access to IB verbs. %prep %setup -q -n %{name}-%{version}%{git_ver} %build # New RPM defines _rundir, usually as /run %if 0%{?_rundir:1} %else %define _rundir /var/run %endif %{!?EXTRA_CMAKE_FLAGS: %define EXTRA_CMAKE_FLAGS %{nil}} # Pass all of the rpm paths directly to GNUInstallDirs and our other defines. %cmake %{CMAKE_FLAGS} \ -DCMAKE_MODULE_LINKER_FLAGS="-Wl,--as-needed -Wl,-z,now" \ -DCMAKE_BUILD_TYPE=Release \ -DCMAKE_INSTALL_BINDIR:PATH=%{_bindir} \ -DCMAKE_INSTALL_SBINDIR:PATH=%{_sbindir} \ -DCMAKE_INSTALL_LIBDIR:PATH=%{_lib} \ -DCMAKE_INSTALL_LIBEXECDIR:PATH=%{_libexecdir} \ -DCMAKE_INSTALL_LOCALSTATEDIR:PATH=%{_localstatedir} \ -DCMAKE_INSTALL_SHAREDSTATEDIR:PATH=%{_sharedstatedir} \ -DCMAKE_INSTALL_INCLUDEDIR:PATH=include \ -DCMAKE_INSTALL_INFODIR:PATH=%{_infodir} \ -DCMAKE_INSTALL_MANDIR:PATH=%{_mandir} \ -DCMAKE_INSTALL_MODPROBEDIR:PATH=%{_modprobedir} \ -DCMAKE_INSTALL_SYSCONFDIR:PATH=%{_sysconfdir} \ -DCMAKE_INSTALL_SYSTEMD_SERVICEDIR:PATH=%{_unitdir} \ -DCMAKE_INSTALL_SYSTEMD_BINDIR:PATH=%{_prefix}/lib/systemd \ -DCMAKE_INSTALL_INITDDIR:PATH=%{_initddir} \ -DCMAKE_INSTALL_RUNDIR:PATH=%{_rundir} \ -DCMAKE_INSTALL_DOCDIR:PATH=%{_docdir}/%{name}-%{version} \ -DCMAKE_INSTALL_UDEV_RULESDIR:PATH=%{_udevrulesdir} \ -DCMAKE_INSTALL_PERLDIR:PATH=%{perl_vendorlib} \ %if %{with_static} -DENABLE_STATIC=1 \ %endif %{EXTRA_CMAKE_FLAGS} \ %if %{defined __python3} -DPYTHON_EXECUTABLE:PATH=%{__python3} \ -DCMAKE_INSTALL_PYTHON_ARCH_LIB:PATH=%{python3_sitearch} \ %endif %if %{with_pyverbs} -DNO_PYVERBS=0 %else -DNO_PYVERBS=1 %endif %make_jobs %install cd build %cmake_install cd .. 
mkdir -p %{buildroot}/%{_sysconfdir}/rdma %global dracutlibdir %%{_prefix}/lib/dracut/ mkdir -p %{buildroot}%{_udevrulesdir} mkdir -p %{buildroot}%{dracutlibdir}/modules.d/05rdma mkdir -p %{buildroot}%{_modprobedir} mkdir -p %{buildroot}%{_unitdir} # Port type setup for mlx4 dual port cards install -D -m0644 redhat/rdma.mlx4.sys.modprobe %{buildroot}%{_modprobedir}/50-libmlx4.conf install -D -m0644 redhat/rdma.mlx4.conf %{buildroot}/%{_sysconfdir}/rdma/mlx4.conf %if 0%{?dma_coherent} chmod 0644 %{buildroot}%{_modprobedir}/mlx4.conf %endif install -D -m0755 redhat/rdma.mlx4-setup.sh %{buildroot}%{_libexecdir}/mlx4-setup.sh # Dracut file for IB support during boot install -D -m0644 suse/module-setup.sh %{buildroot}%{dracutlibdir}/modules.d/05rdma/module-setup.sh %if "%{_libexecdir}" != "/usr/libexec" sed 's-/usr/libexec-%{_libexecdir}-g' -i %{buildroot}%{_modprobedir}/50-libmlx4.conf sed 's-/usr/libexec-%{_libexecdir}-g' -i %{buildroot}%{dracutlibdir}/modules.d/05rdma/module-setup.sh %endif # ibacm cd build LD_LIBRARY_PATH=./lib bin/ib_acme -D . -O install -D -m0644 ibacm_opts.cfg %{buildroot}%{_sysconfdir}/rdma/ %if 0%{?suse_version} < 1600 for service in rdma rdma-ndd ibacm iwpmd srp_daemon; do ln -sf %{_sbindir}/service %{buildroot}%{_sbindir}/rc${service}; done %endif # Delete the package's init.d scripts rm -rf %{buildroot}/%{_initddir}/ rm -rf %{buildroot}/%{_sbindir}/srp_daemon.sh %post -n %verbs_lname -p /sbin/ldconfig %postun -n %verbs_lname -p /sbin/ldconfig %post -n %efa_lname -p /sbin/ldconfig %postun -n %efa_lname -p /sbin/ldconfig %post -n %hns_lname -p /sbin/ldconfig %postun -n %hns_lname -p /sbin/ldconfig %post -n %mana_lname -p /sbin/ldconfig %postun -n %mana_lname -p /sbin/ldconfig %post -n %mlx4_lname -p /sbin/ldconfig %postun -n %mlx4_lname -p /sbin/ldconfig %post -n %mlx5_lname -p /sbin/ldconfig %postun -n %mlx5_lname -p /sbin/ldconfig %post -n %umad_lname -p /sbin/ldconfig %postun -n %umad_lname -p /sbin/ldconfig %post -n %rdmacm_lname -p /sbin/ldconfig %postun -n %rdmacm_lname -p /sbin/ldconfig %post -n libibnetdisc%{ibnetdisc_major} -p /sbin/ldconfig %postun -n libibnetdisc%{ibnetdisc_major} -p /sbin/ldconfig %post -n libibmad%{mad_major} -p /sbin/ldconfig %postun -n libibmad%{mad_major} -p /sbin/ldconfig %pre # Avoid restoring outdated stuff in posttrans for _f in %{?modprobe_d_files}; do [ ! -f "/etc/modprobe.d/${_f}.rpmsave" ] || \ mv -f "/etc/modprobe.d/${_f}.rpmsave" "/etc/modprobe.d/${_f}.rpmsave.old" || : done %post # we ship udev rules, so trigger an update. %{_bindir}/udevadm trigger --subsystem-match=infiniband --action=change || true %{_bindir}/udevadm trigger --subsystem-match=infiniband_mad --action=change || true %posttrans # Migration of modprobe.conf files to _modprobedir for _f in %{?modprobe_d_files}; do [ ! -f "/etc/modprobe.d/${_f}.rpmsave" ] || \ mv -fv "/etc/modprobe.d/${_f}.rpmsave" "/etc/modprobe.d/${_f}" || : done # # ibacm # %pre -n ibacm %service_add_pre ibacm.service ibacm.socket %post -n ibacm %service_add_post ibacm.service ibacm.socket %preun -n ibacm %service_del_preun ibacm.service ibacm.socket %postun -n ibacm %service_del_postun ibacm.service ibacm.socket # # srp daemon # %pre -n srp_daemon %service_add_pre srp_daemon.service %post -n srp_daemon %service_add_post srp_daemon.service # we ship udev rules, so trigger an update. 
%{_bindir}/udevadm trigger --subsystem-match=infiniband_mad --action=change

%preun -n srp_daemon
%service_del_preun srp_daemon.service

%postun -n srp_daemon
%service_del_postun srp_daemon.service

#
# iwpmd
#
%pre -n iwpmd
%service_add_pre iwpmd.service

%post -n iwpmd
%service_add_post iwpmd.service

%preun -n iwpmd
%service_del_preun iwpmd.service

%postun -n iwpmd
%service_del_postun iwpmd.service

#
# rdma-ndd
#
%pre -n rdma-ndd
%service_add_pre rdma-ndd.service

%preun -n rdma-ndd
%service_del_preun rdma-ndd.service

%post -n rdma-ndd
%service_add_post rdma-ndd.service
# we ship udev rules, so trigger an update.
%{_bindir}/udevadm trigger --subsystem-match=infiniband --action=change || true

%postun -n rdma-ndd
%service_del_postun rdma-ndd.service

%files
%dir %{_sysconfdir}/rdma
%dir %{_sysconfdir}/rdma/modules
%dir %{_docdir}/%{name}-%{version}
%dir %{_udevrulesdir}
%dir %{_modprobedir}
%doc %{_docdir}/%{name}-%{version}/70-persistent-ipoib.rules
%doc %{_docdir}/%{name}-%{version}/README.md
%doc %{_docdir}/%{name}-%{version}/udev.md
%config(noreplace) %{_sysconfdir}/rdma/mlx4.conf
%config(noreplace) %{_sysconfdir}/rdma/modules/infiniband.conf
%config(noreplace) %{_sysconfdir}/rdma/modules/iwarp.conf
%config(noreplace) %{_sysconfdir}/rdma/modules/opa.conf
%config(noreplace) %{_sysconfdir}/rdma/modules/rdma.conf
%config(noreplace) %{_sysconfdir}/rdma/modules/roce.conf
%if 0%{?dma_coherent}
%{_modprobedir}/mlx4.conf
%endif
%{_modprobedir}/truescale.conf
%{_unitdir}/rdma-hw.target
%{_unitdir}/rdma-load-modules@.service
%dir %{dracutlibdir}
%dir %{dracutlibdir}/modules.d
%dir %{dracutlibdir}/modules.d/05rdma
%{dracutlibdir}/modules.d/05rdma/module-setup.sh
%{_udevrulesdir}/../rdma_rename
%{_udevrulesdir}/60-rdma-persistent-naming.rules
%{_udevrulesdir}/75-rdma-description.rules
%{_udevrulesdir}/90-rdma-hw-modules.rules
%{_udevrulesdir}/90-rdma-ulp-modules.rules
%{_udevrulesdir}/90-rdma-umad.rules
%{_modprobedir}/50-libmlx4.conf
%{_libexecdir}/mlx4-setup.sh
%{_libexecdir}/truescale-serdes.cmds
%license COPYING.*
%if 0%{?suse_version} < 1600
%{_sbindir}/rcrdma
%endif

%files devel
%doc %{_docdir}/%{name}-%{version}/MAINTAINERS
%dir %{_includedir}/infiniband
%dir %{_includedir}/rdma
%{_includedir}/infiniband/*
%{_includedir}/rdma/*
%if %{with_static}
%{_libdir}/lib*.a
%endif
%{_libdir}/lib*.so
%{_libdir}/pkgconfig/*.pc
%{_mandir}/man3/ibnd_*
%{_mandir}/man3/ibv_*
%{_mandir}/man3/rdma*
%{_mandir}/man3/umad*
%{_mandir}/man3/*_to_ibv_rate.*
%{_mandir}/man7/rdma_cm.*
%if 0%{?dma_coherent}
%{_mandir}/man3/efadv*
%{_mandir}/man3/hnsdv*
%{_mandir}/man3/manadv*
%{_mandir}/man3/mlx5dv*
%{_mandir}/man3/mlx4dv*
%{_mandir}/man7/efadv*
%{_mandir}/man7/hnsdv*
%{_mandir}/man7/manadv*
%{_mandir}/man7/mlx5dv*
%{_mandir}/man7/mlx4dv*
%endif

%files -n libibverbs
%dir %{_sysconfdir}/libibverbs.d
%dir %{_libdir}/libibverbs
%{_libdir}/libibverbs/*.so
%config(noreplace) %{_sysconfdir}/libibverbs.d/*.driver
%doc %{_docdir}/%{name}-%{version}/libibverbs.md
%doc %{_docdir}/%{name}-%{version}/rxe.md
%doc %{_docdir}/%{name}-%{version}/tag_matching.md
%{_mandir}/man7/rxe*

%files -n libibnetdisc%{ibnetdisc_major}
%{_libdir}/libibnetdisc.so.*

%files -n libibmad%{mad_major}
%{_libdir}/libibmad.so.*

%files -n %verbs_lname
%{_libdir}/libibverbs*.so.*

%if 0%{?dma_coherent}
%files -n %efa_lname
%{_libdir}/libefa*.so.*

%files -n %hns_lname
%defattr(-,root,root)
%{_libdir}/libhns*.so.*

%files -n %mana_lname
%{_libdir}/libmana*.so.*

%files -n %mlx4_lname
%{_libdir}/libmlx4*.so.*

%files -n %mlx5_lname
%{_libdir}/libmlx5*.so.*
%endif
%files -n libibverbs-utils %{_bindir}/ibv_* %{_mandir}/man1/ibv_* %files -n ibacm %config(noreplace) %{_sysconfdir}/rdma/ibacm_opts.cfg %{_bindir}/ib_acme %{_sbindir}/ibacm %{_mandir}/man1/ib_acme.* %{_mandir}/man7/ibacm.* %{_mandir}/man7/ibacm_prov.* %{_mandir}/man8/ibacm.* %{_unitdir}/ibacm.service %{_unitdir}/ibacm.socket %dir %{_libdir}/ibacm %{_libdir}/ibacm/* %if 0%{?suse_version} < 1600 %{_sbindir}/rcibacm %endif %doc %{_docdir}/%{name}-%{version}/ibacm.md %files -n infiniband-diags %dir %{_sysconfdir}/infiniband-diags %config(noreplace) %{_sysconfdir}/infiniband-diags/* %{_sbindir}/ibaddr %{_mandir}/man8/ibaddr* %{_sbindir}/ibnetdiscover %{_mandir}/man8/ibnetdiscover* %{_sbindir}/ibping %{_mandir}/man8/ibping* %{_sbindir}/ibportstate %{_mandir}/man8/ibportstate* %{_sbindir}/ibroute %{_mandir}/man8/ibroute.* %{_sbindir}/ibstat %{_mandir}/man8/ibstat.* %{_sbindir}/ibsysstat %{_mandir}/man8/ibsysstat* %{_sbindir}/ibtracert %{_mandir}/man8/ibtracert* %{_sbindir}/perfquery %{_mandir}/man8/perfquery* %{_sbindir}/sminfo %{_mandir}/man8/sminfo* %{_sbindir}/smpdump %{_mandir}/man8/smpdump* %{_sbindir}/smpquery %{_mandir}/man8/smpquery* %{_sbindir}/saquery %{_mandir}/man8/saquery* %{_sbindir}/vendstat %{_mandir}/man8/vendstat* %{_sbindir}/iblinkinfo %{_mandir}/man8/iblinkinfo* %{_sbindir}/ibqueryerrors %{_mandir}/man8/ibqueryerrors* %{_sbindir}/ibcacheedit %{_mandir}/man8/ibcacheedit* %{_sbindir}/ibccquery %{_mandir}/man8/ibccquery* %{_sbindir}/ibccconfig %{_mandir}/man8/ibccconfig* %{_sbindir}/dump_fts %{_mandir}/man8/dump_fts* %{_sbindir}/ibhosts %{_mandir}/man8/ibhosts* %{_sbindir}/ibswitches %{_mandir}/man8/ibswitches* %{_sbindir}/ibnodes %{_mandir}/man8/ibnodes* %{_sbindir}/ibrouters %{_mandir}/man8/ibrouters* %{_sbindir}/ibfindnodesusing.pl %{_mandir}/man8/ibfindnodesusing* %{_sbindir}/ibidsverify.pl %{_mandir}/man8/ibidsverify* %{_sbindir}/check_lft_balance.pl %{_mandir}/man8/check_lft_balance* %{_sbindir}/dump_lfts.sh %{_mandir}/man8/dump_lfts* %{_sbindir}/dump_mfts.sh %{_mandir}/man8/dump_mfts* %{_sbindir}/ibstatus %{_mandir}/man8/ibstatus* %{_mandir}/man8/infiniband-diags* %{perl_vendorlib}/IBswcountlimits.pm %files -n iwpmd %dir %{_sysconfdir}/rdma %dir %{_sysconfdir}/rdma/modules %{_sbindir}/iwpmd %if 0%{?suse_version} < 1600 %{_sbindir}/rciwpmd %endif %{_unitdir}/iwpmd.service %config(noreplace) %{_sysconfdir}/rdma/modules/iwpmd.conf %config(noreplace) %{_sysconfdir}/iwpmd.conf %{_udevrulesdir}/90-iwpmd.rules %{_mandir}/man8/iwpmd.* %{_mandir}/man5/iwpmd.* %files -n %umad_lname %{_libdir}/libibumad*.so.* %files -n %rdmacm_lname %{_libdir}/librdmacm*.so.* %doc %{_docdir}/%{name}-%{version}/librdmacm.md %files -n rsocket %dir %{_libdir}/rsocket %{_libdir}/rsocket/*.so* %{_mandir}/man7/rsocket.* %files -n librdmacm-utils %{_bindir}/cmtime %{_bindir}/mckey %{_bindir}/rcopy %{_bindir}/rdma_client %{_bindir}/rdma_server %{_bindir}/rdma_xclient %{_bindir}/rdma_xserver %{_bindir}/riostream %{_bindir}/rping %{_bindir}/rstream %{_bindir}/ucmatose %{_bindir}/udaddy %{_bindir}/udpong %{_mandir}/man1/cmtime.* %{_mandir}/man1/mckey.* %{_mandir}/man1/rcopy.* %{_mandir}/man1/rdma_client.* %{_mandir}/man1/rdma_server.* %{_mandir}/man1/rdma_xclient.* %{_mandir}/man1/rdma_xserver.* %{_mandir}/man1/riostream.* %{_mandir}/man1/rping.* %{_mandir}/man1/rstream.* %{_mandir}/man1/ucmatose.* %{_mandir}/man1/udaddy.* %{_mandir}/man1/udpong.* %files -n srp_daemon %dir %{_libexecdir}/srp_daemon %dir %{_sysconfdir}/rdma %dir %{_sysconfdir}/rdma/modules %config(noreplace) %{_sysconfdir}/srp_daemon.conf 
%config(noreplace) %{_sysconfdir}/rdma/modules/srp_daemon.conf %{_udevrulesdir}/60-srp_daemon.rules %{_libexecdir}/srp_daemon/start_on_all_ports %{_unitdir}/srp_daemon.service %{_unitdir}/srp_daemon_port@.service %{_sbindir}/ibsrpdm %{_sbindir}/srp_daemon %{_sbindir}/run_srp_daemon %if 0%{?suse_version} < 1600 %{_sbindir}/rcsrp_daemon %endif %{_mandir}/man5/srp_daemon.service.5* %{_mandir}/man5/srp_daemon_port@.service.5* %{_mandir}/man8/ibsrpdm.8* %{_mandir}/man8/srp_daemon.8* %doc %{_docdir}/%{name}-%{version}/ibsrpdm.md %files -n rdma-ndd %{_sbindir}/rdma-ndd %if 0%{?suse_version} < 1600 %{_sbindir}/rcrdma-ndd %endif %{_unitdir}/rdma-ndd.service %{_mandir}/man8/rdma-ndd.8* %{_udevrulesdir}/60-rdma-ndd.rules %if %{with_pyverbs} %files -n python3-pyverbs %{python3_sitearch}/pyverbs %dir %{_docdir}/%{name}-%{version}/tests/ %{_docdir}/%{name}-%{version}/tests/*.py %endif %changelog rdma-core-56.1/tests/000077500000000000000000000000001477342711600144745ustar00rootroot00000000000000rdma-core-56.1/tests/CMakeLists.txt000066400000000000000000000024751477342711600172440ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019, Mellanox Technologies. All rights reserved. See COPYING file rdma_python_test(tests __init__.py args_parser.py base.py base_rdmacm.py cuda_utils.py efa_base.py irdma_base.py mlx5_base.py mlx5_prm_structs.py rdmacm_utils.py test_addr.py test_atomic.py test_cq.py test_cq_events.py test_cqex.py test_cuda_dmabuf.py test_device.py test_efa_srd.py test_efadv.py test_flow.py test_fork.py test_mlx5_cq.py test_mlx5_crypto.py test_mlx5_cuda_umem.py test_mlx5_dc.py test_mlx5_devx.py test_mlx5_dm_ops.py test_mlx5_dma_memcpy.py test_mlx5_dmabuf.py test_mlx5_dr.py test_mlx5_flow.py test_mlx5_huge_page.py test_mlx5_lag_affinity.py test_mlx5_mkey.py test_mlx5_ooo_qp.py test_mlx5_pp.py test_mlx5_query_port.py test_mlx5_raw_wqe.py test_mlx5_rdmacm.py test_mlx5_sched.py test_mlx5_timestamp.py test_mlx5_uar.py test_mlx5_udp_sport.py test_mlx5_var.py test_mlx5_vfio.py test_mr.py test_odp.py test_pd.py test_parent_domain.py test_qp.py test_qpex.py test_rdmacm.py test_relaxed_ordering.py test_rss_traffic.py test_shared_pd.py test_srq.py test_tag_matching.py utils.py ) rdma_python_test(tests run_tests.py ) rdma_internal_binary( run_tests.py ) rdma-core-56.1/tests/__init__.py000066400000000000000000000031711477342711600166070ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc . All rights reserved. See COPYING file import importlib import os from args_parser import parser # Load every test as a module in the system so that unittest's loader can find it def _load_tests(): res = [] for fn in sorted(os.listdir(os.path.dirname(__file__))): if fn.endswith(".py") and fn.startswith("test_"): m = importlib.import_module("." + os.path.basename(fn)[:-3], __name__) res.append(m) return res __test_modules__ = _load_tests() # unittest -v prints names like 'tests.test_foo', but it always starts # searching from the tests module, adding the name 'tests.test' lets the user # specify the same test name from logging on the command line to trivially run # a single test. tests = importlib.import_module(".", __name__) def _show_tests_and_exit(loader, standard_tests, pattern): """ Prints the full test names that are loaded with the current modules via loadTestsFromModule protocol, without modifying standard_tests. 
""" for mod in __test_modules__: for test in loader.loadTestsFromModule(mod, pattern=pattern): for test_case in test: print(test_case.id()) return standard_tests def load_tests(loader, standard_tests, pattern): """Implement the loadTestsFromModule protocol""" if parser.args['list_tests']: return _show_tests_and_exit(loader, standard_tests, pattern) for mod in __test_modules__: standard_tests.addTests(loader.loadTestsFromModule(mod, pattern=pattern)) return standard_tests rdma-core-56.1/tests/args_parser.py000066400000000000000000000037631477342711600173670ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Kamal Heib , All rights reserved. See COPYING file # Copyright (c) 2021 Nvidia, Inc. All rights reserved. See COPYING file import argparse import sys class ArgsParser(object): def __init__(self): self.args = None def get_config(self): return self.args def parse_args(self): parser = argparse.ArgumentParser() parser.add_argument('--dev', help='RDMA device to run the tests on') parser.add_argument('--pci-dev', help='PCI device to run the tests on, which is ' 'needed by some tests where the RDMA device is ' 'not available (e.g. VFIO)') parser.add_argument('--port', help='Use port of RDMA device', type=int, default=1) parser.add_argument('--gid', help='Use gid index of RDMA device', type=int) parser.add_argument('--gpu', nargs='?', type=int, const=0, default=0, help='GPU unit to allocate dmabuf from') parser.add_argument('--gtt', action='store_true', default=False, help='Allocate dmabuf from GTT instead of VRAM') parser.add_argument('-v', '--verbose', dest='verbosity', action='store_const', const=2, help='Verbose output') parser.add_argument('--list-tests', action='store_true', default=False, help='Print a list of the full test names that are ' 'loaded by default and exit without running ' 'them.') ns, args = parser.parse_known_args() self.args = vars(ns) if self.args['verbosity']: args += ['--verbose'] sys.argv[1:] = args parser = ArgsParser() rdma-core-56.1/tests/base.py000066400000000000000000001016111477342711600157600ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc . All rights reserved. 
See COPYING file import multiprocessing as mp import subprocess import unittest import tempfile import random import errno import stat import json import sys import os from pyverbs.qp import QPCap, QPInitAttrEx, QPInitAttr, QPAttr, QP from pyverbs.srq import SRQ, SrqInitAttrEx, SrqInitAttr, SrqAttr from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsError from pyverbs.addr import AHAttr, GlobalRoute from pyverbs.xrcd import XRCD, XRCDInitAttr from pyverbs.device import Context from args_parser import parser import pyverbs.cm_enums as ce import pyverbs.device as d import pyverbs.enums as e from pyverbs.pd import PD from pyverbs.cq import CQ from pyverbs.mr import MR PATH_MTU = e.IBV_MTU_1024 MAX_DEST_RD_ATOMIC = 1 NUM_OF_PROCESSES = 2 MC_IP_PREFIX = '230' MAX_RDMA_ATOMIC = 20 MAX_RD_ATOMIC = 1 MIN_RNR_TIMER =12 RETRY_CNT = 7 RNR_RETRY = 7 TIMEOUT = 14 # Devices that don't support RoCEv2 should be added here MLNX_VENDOR_ID = 0x02c9 CX3_MLNX_PART_ID = 4099 CX3Pro_MLNX_PART_ID = 4103 DCT_KEY = 0xbadc0de # Dictionary: vendor_id -> array of part_ids of devices that lack RoCEv2 support ROCEV2_UNSUPPORTED_DEVS = {MLNX_VENDOR_ID: [CX3Pro_MLNX_PART_ID, CX3_MLNX_PART_ID]} def has_roce_hw_bug(vendor_id, vendor_part_id): return vendor_part_id in ROCEV2_UNSUPPORTED_DEVS.get(vendor_id, []) def set_rnr_attributes(qp_attr): """ Set default QP RNR attributes. :param qp_attr: The QPAttr to set its attributes :return: None """ qp_attr.min_rnr_timer = MIN_RNR_TIMER qp_attr.timeout = TIMEOUT qp_attr.retry_cnt = RETRY_CNT qp_attr.rnr_retry = RNR_RETRY def is_gid_available(gid_index): if gid_index is None: raise unittest.SkipTest(f'No relevant GID found') class PyverbsAPITestCase(unittest.TestCase): def __init__(self, methodName='runTest'): super().__init__(methodName) # Hold the command line arguments self.config = parser.get_config() self.dev_name = None self.ctx = None self.attr = None self.attr_ex = None self.gid_index = 0 self.pre_environment = {} def setUp(self): """ Opens the device and queries it. The results of the query and query_ex are stored in attr and attr_ex instance attributes respectively. If the user didn't pass a device name, the first device is chosen by default. """ self.ib_port = self.config['port'] self.dev_name = self.config['dev'] if not self.dev_name: dev_list = d.get_device_list() if not dev_list: raise unittest.SkipTest('No IB devices found') self.dev_name = dev_list[0].name.decode() if self.config['gid']: self.gid_index = self.config['gid'] self.create_context() self.attr = self.ctx.query_device() self.attr_ex = self.ctx.query_device_ex() def create_context(self): self.ctx = d.Context(name=self.dev_name) def set_env_variable(self, var, value): """ Set environment variable. The current value for each variable is stored and is set back at the end of the test. :param var: The name of the environment variable :param value: The requested new value of this environment variable """ if var not in self.pre_environment.keys(): self.pre_environment[var] = os.environ.get(var) os.environ[var] = value def tearDown(self): for k, v in self.pre_environment.items(): if v is None: os.environ.pop(k) else: os.environ[k] = v self.ctx.close() class RDMATestCase(unittest.TestCase): ZERO_GID = '0000:0000:0000:0000' def __init__(self, methodName='runTest', dev_name=None, ib_port=None, gid_index=None, pkey_index=None, gid_type=None): """ Initialize a RDMA test unit based on unittest.TestCase. 
If no device was provided, it iterates over the existing devices, for each port of each device, it checks which GID indexes are valid (in RoCE, only IPv4 and IPv6 based GIDs are used). Each is added to an array and one entry is selected. If a device was provided, the same process is done for all ports of this device (in case they're not provided), and so on. If gid_type is provided by the user, only GIDs of that type would be be chosen (valid only if gid_index was not provided). :param methodName: The base method to be used by the unittest :param dev_name: Device name to use :param ib_port: IB port of the device to use :param gid_index: GID index to use :param pkey_index: PKEY index to use :param gid_type: If provided, only GIDs of gid_type will be chosen (ignored if gid_index is provided by the user) """ super(RDMATestCase, self).__init__(methodName) # Hold the command line arguments self.config = parser.get_config() dev = self.config['dev'] self.dev_name = dev_name if dev_name else dev self.ib_port = ib_port if ib_port else self.config['port'] self.gid_index = gid_index if gid_index else self.config['gid'] self.pkey_index = pkey_index self.gid_type = gid_type if gid_index is None else None self.ip_addr = None self.mac_addr = None self.pre_environment = {} self.server = None self.client = None self.iters = 10 def is_eth_and_has_roce_hw_bug(self): """ Check if the link layer is Ethernet and the device lacks RoCEv2 support with a known HW bug. return: True if the link layer is Ethernet and device is not supported """ ctx = d.Context(name=self.dev_name) port_attrs = ctx.query_port(self.ib_port) dev_attrs = ctx.query_device() vendor_id = dev_attrs.vendor_id vendor_pid = dev_attrs.vendor_part_id return port_attrs.link_layer == e.IBV_LINK_LAYER_ETHERNET and \ has_roce_hw_bug(vendor_id, vendor_pid) @staticmethod def get_net_name(dev, port=None): if port is not None: out = subprocess.check_output(['rdma', 'link', 'show', '-j']) loaded_json = json.loads(out.decode()) for row in loaded_json: try: if row['ifname'] == dev and row['port'] == port: return row['netdev'] except KeyError: pass if not os.path.exists(f'/sys/class/infiniband/{dev}/device/net/'): return None out = subprocess.check_output(['ls', f'/sys/class/infiniband/{dev}/device/net/']) return out.decode().split('\n')[0] @staticmethod def get_ip_mac_address(ifname): out = subprocess.check_output(['ip', '-j', 'addr', 'show', ifname]) loaded_json = json.loads(out.decode()) interface = loaded_json[0]['addr_info'][0]['local'] mac = loaded_json[0]['address'] if 'fe80::' in interface: interface = interface + '%' + ifname return interface, mac def setUp(self): """ Verify that the test case has dev_name, ib_port, gid_index and pkey index. If not provided by the user, the first valid combination will be used. 
""" if self.pkey_index is None: # To avoid iterating the entire pkeys table, if a pkey index wasn't # provided, use index 0 which is always valid self.pkey_index = 0 self.args = [] if self.dev_name is not None: ctx = d.Context(name=self.dev_name) if self.ib_port is not None: if self.gid_index is not None: self._get_ip_mac(self.dev_name, self.ib_port, self.gid_index) else: # Add avaiable GIDs of the given dev_name + port self._add_gids_per_port(ctx, self.dev_name, self.ib_port) else: # Add available GIDs for each port of the given dev_name self._add_gids_per_device(ctx, self.dev_name) else: # Iterate available devices, add available GIDs for each of # their ports lst = d.get_device_list() for dev in lst: dev_name = dev.name.decode() ctx = d.Context(name=dev_name) self._add_gids_per_device(ctx, dev_name) if not self.args: raise unittest.SkipTest('No supported port is up, can\'t run traffic') # Choose one combination and use it self._select_config() self.dev_info = {'dev_name': self.dev_name, 'ib_port': self.ib_port, 'gid_index': self.gid_index} def _add_gids_per_port(self, ctx, dev, port): # Don't add ports which are not active port_attrs = ctx.query_port(port) if port_attrs.state != e.IBV_PORT_ACTIVE: return if not port_attrs.gid_tbl_len: self._get_ip_mac(dev, port, None) return dev_attrs = ctx.query_device() vendor_id = dev_attrs.vendor_id vendor_pid = dev_attrs.vendor_part_id for idx in range(port_attrs.gid_tbl_len): gid = ctx.query_gid(port, idx) # Avoid adding ZERO GIDs if gid.gid[-19:] == self.ZERO_GID: continue # Avoid RoCEv2 GIDs on unsupported devices if port_attrs.link_layer == e.IBV_LINK_LAYER_ETHERNET and \ ctx.query_gid_type(port, idx) == \ e.IBV_GID_TYPE_SYSFS_ROCE_V2 and \ has_roce_hw_bug(vendor_id, vendor_pid): continue if self.gid_type is not None and ctx.query_gid_type(port, idx) != \ self.gid_type: continue self._get_ip_mac(dev, port, idx) def _add_gids_per_device(self, ctx, dev): self._add_gids_per_port(ctx, dev, self.ib_port) def _get_ip_mac(self, dev, port, idx): net_name = self.get_net_name(dev, port) if net_name is None: self.args.append([dev, port, idx, None, None]) return try: ip_addr, mac_addr = self.get_ip_mac_address(net_name) except (KeyError, IndexError): self.args.append([dev, port, idx, None, None]) else: self.args.append([dev, port, idx, ip_addr, mac_addr]) def _select_config(self): args_with_inet_ip = [] for arg in self.args: if arg[3]: args_with_inet_ip.append(arg) if args_with_inet_ip: args = args_with_inet_ip[0] else: args = self.args[0] self.dev_name = args[0] self.ib_port = args[1] self.gid_index = args[2] self.ip_addr = args[3] self.mac_addr = args[4] def set_env_variable(self, var, value): """ Set environment variable. The current value for each variable is stored and is set back at the end of the test. :param var: The name of the environment variable :param value: The requested new value of this environment variable """ if var not in self.pre_environment.keys(): self.pre_environment[var] = os.environ.get(var) os.environ[var] = value def sync_remote_attr(self): """ Sync the MR remote attributes between the server and the client. """ self.server.rkey = self.client.mr.rkey self.server.raddr = self.client.mr.buf self.client.rkey = self.server.mr.rkey self.client.raddr = self.server.mr.buf def pre_run(self): """ Configure Resources before running traffic. pre_run() must be implemented by the client and server. 
""" self.client.pre_run(self.server.psns, self.server.qps_num) self.server.pre_run(self.client.psns, self.client.qps_num) def create_players(self, resource, sync_attrs=True, **resource_arg): """ Init test resources. :param resource: The RDMA resources to use. :param sync_attrs: If True, sync remote attrs such as rkey and raddr :param resource_arg: Dict of args that specify the resource specific attributes. """ try: self.client = resource(**self.dev_info, **resource_arg) self.server = resource(**self.dev_info, **resource_arg) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest(f'Create player of {resource.__name__} is not supported') raise ex self.pre_run() if sync_attrs: self.sync_remote_attr() self.traffic_args = {'client': self.client, 'server': self.server, 'iters': self.iters, 'gid_idx': self.gid_index, 'port': self.ib_port} def tearDown(self): """ Restore the previous environment variables values before ending the test. """ for k, v in self.pre_environment.items(): if v is None: os.environ.pop(k) else: os.environ[k] = v if self.server: self.server.ctx.close() if self.client: self.client.ctx.close() super().tearDown() class RDMACMBaseTest(RDMATestCase): """ Base RDMACM test class. This class does not include any test, but rather implements generic connection and traffic methods that are needed by RDMACM tests in general. Each RDMACM test should have a class that inherits this class and extends its functionalities if needed. """ def setUp(self): super().setUp() if not self.ip_addr: raise unittest.SkipTest('Device {} doesn\'t have net interface' .format(self.dev_name)) is_gid_available(self.gid_index) def two_nodes_rdmacm_traffic(self, connection_resources, test_flow, bad_flow=False, **resource_kwargs): """ Init and manage the rdmacm test processes. The exit code of the test processes indicates if exception was thrown. {0: pass, 2: exception was thrown, 5: skip test} If needed, terminate those processes and raise an exception. :param connection_resources: The CMConnection resources to use. :param test_flow: The target RDMACM flow method to run. :param bad_flow: If true, traffic is expected to fail. :param resource_kwargs: Dict of args that specify the CMResources specific attributes. Each test case can pass here as key words the specific CMResources attributes that are requested. :return: None """ if resource_kwargs.get('port_space', None) == ce.RDMA_PS_UDP and \ self.is_eth_and_has_roce_hw_bug(): raise unittest.SkipTest('Device {} doesn\'t support UDP with RoCEv2' .format(self.dev_name)) ctx = mp.get_context('fork') self.syncer = ctx.Barrier(NUM_OF_PROCESSES, timeout=15) self.notifier = ctx.Queue() passive = ctx.Process(target=test_flow, kwargs={'connection_resources': connection_resources, 'passive':True, **resource_kwargs}) active = ctx.Process(target=test_flow, kwargs={'connection_resources': connection_resources, 'passive':False, **resource_kwargs}) passive.start() active.start() repeat_times=150 if not bad_flow else 3 proc_res = {} for _ in range(repeat_times): for proc in [passive, active]: proc.join(0.1) # Write the exit code of the proc. if not proc.is_alive(): side = 'passive' if proc == passive else 'active' if side not in proc_res.keys(): proc_res[side] = proc.exitcode # If the processes is still alive kill them and fail the test. 
        proc_killed = False
        for proc in [passive, active]:
            if proc.is_alive():
                proc.terminate()
                proc_killed = True
        # Check whether this test needs to be skipped
        for side in proc_res.keys():
            if proc_res[side] == 5:
                raise unittest.SkipTest(f'SkipTest occurred on {side} side')
        # Check if the test processes raised exceptions.
        res_exception = False
        for side in proc_res:
            if 0 < proc_res[side] < 5:
                res_exception = True
        if res_exception:
            raise Exception('Exception in active/passive side occurred')
        # Raise an exception if the test processes were terminated.
        if bad_flow and not proc_killed:
            raise Exception('Bad flow: traffic passed which is not expected')
        if not bad_flow and proc_killed:
            raise Exception('RDMA CM test process is stuck, kill the test')

    def rdmacm_traffic(self, connection_resources=None, passive=None,
                       **kwargs):
        """
        Run RDMACM traffic between two CMIDs.
        :param connection_resources: The connection resources to use.
        :param passive: Indicates whether this CMID is the passive side.
        :param kwargs: Arguments to be passed to the connection_resources.
        :return: None
        """
        try:
            player = connection_resources(ip_addr=self.ip_addr,
                                          syncer=self.syncer,
                                          notifier=self.notifier,
                                          passive=passive, **kwargs)
            player.establish_connection()
            if kwargs.get('reject_conn'):
                return
            player.rdmacm_traffic()
            player.disconnect()
        except Exception as ex:
            self._rdmacm_exception_handler(passive, ex)

    def rdmacm_multicast_traffic(self, connection_resources=None,
                                 passive=None, extended=False,
                                 leave_test=False, **kwargs):
        """
        Run RDMACM multicast traffic between two CMIDs.
        :param connection_resources: The connection resources to use.
        :param passive: Indicates whether this CMID is the passive side.
        :param extended: Use an extended multicast join request. This request
                         allows the CMID to join with specific join flags.
        :param leave_test: Perform traffic after leaving the multicast group
                           to ensure that leaving works.
        :param kwargs: Arguments to be passed to the connection_resources.
        :return: None
        """
        try:
            player = connection_resources(ip_addr=self.ip_addr,
                                          syncer=self.syncer,
                                          notifier=self.notifier,
                                          passive=False, **kwargs)
            mc_addr = MC_IP_PREFIX + self.ip_addr[self.ip_addr.find('.'):]
            player.join_to_multicast(src_addr=self.ip_addr, mc_addr=mc_addr,
                                     extended=extended)
            player.rdmacm_traffic(server=passive, multicast=True)
            player.leave_multicast(mc_addr=mc_addr)
            if leave_test:
                player.rdmacm_traffic(server=passive, multicast=True)
        except Exception as ex:
            self._rdmacm_exception_handler(passive, ex)

    def rdmacm_remote_traffic(self, connection_resources=None, passive=None,
                              remote_op='write', **kwargs):
        """
        Run RDMACM remote traffic between two CMIDs.
        :param connection_resources: The connection resources to use.
        :param passive: Indicates whether this CMID is the passive side.
        :param remote_op: The remote operation in the traffic.
        :param kwargs: Arguments to be passed to the connection_resources.
:return: None """ try: player = connection_resources(ip_addr=self.ip_addr, syncer=self.syncer, notifier=self.notifier, passive=passive, remote_op=remote_op, **kwargs) player.establish_connection() player.remote_traffic(passive=passive, remote_op=remote_op) player.disconnect() except Exception as ex: self._rdmacm_exception_handler(passive, ex) @staticmethod def _rdmacm_exception_handler(passive, exception): if isinstance(exception, PyverbsRDMAError): if exception.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]: sys.exit(5) if isinstance(exception, unittest.case.SkipTest): sys.exit(5) side = 'passive' if passive else 'active' print(f'Player {side} got: {exception}') sys.exit(2) class BaseResources(object): """ BaseResources class is a base aggregator object which contains basic resources like Context and PD. It opens a context over the given device and port and allocates a PD. """ def __init__(self, dev_name, ib_port, gid_index): """ Initializes a BaseResources object. :param dev_name: Device name to be used (default: 'ibp0s8f0') :param ib_port: IB port of the device to use (default: 1) :param gid_index: Which GID index to use (default: 0) """ self.dev_name = dev_name self.gid_index = gid_index self.ib_port = ib_port self.create_context() self.create_pd() def create_context(self): self.ctx = Context(name=self.dev_name) def create_pd(self): self.pd = PD(self.ctx) def mem_write(self, data, size, offset=0): self.mr.write(data, size, offset) def mem_read(self, size=None, offset=0): size_ = self.msg_size if size is None else size return self.mr.read(size_, offset) class TrafficResources(BaseResources): """ Basic traffic class. It provides the basic RDMA resources and operations needed for traffic. """ def __init__(self, dev_name, ib_port, gid_index, with_srq=False, qp_count=1, msg_size=1024): """ Initializes a TrafficResources object with the given values and creates basic RDMA resources. :param dev_name: Device name to be used :param ib_port: IB port of the device to use :param gid_index: Which GID index to use :param with_srq: If True, create SRQ and attach to QPs :param qp_count: Number of QPs to create :param msg_size: Size of resource msg. If None, use 1024 as default. """ super(TrafficResources, self).__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index) self.msg_size = msg_size self.num_msgs = 1000 self.port_attr = None self.mr = None self.use_mr_prefetch = None self.srq = None self.cq = None self.qps = [] self.qps_num = [] self.psns = [] self.rqps_num = None self.rpsns = None self.with_srq = with_srq self.qp_count = qp_count self.init_resources() @property def qp(self): return self.qps[0] @property def mr_lkey(self): if self.mr: return self.mr.lkey def init_resources(self): """ Initializes a CQ, MR and an RC QP. :return: None """ self.port_attr = self.ctx.query_port(self.ib_port) self.create_cq() if self.with_srq: self.create_srq() self.create_mr() self.create_qps() def create_cq(self): """ Initializes self.cq with a CQ of depth - defined by each test. :return: None """ self.cq = CQ(self.ctx, self.num_msgs, None, None, 0) def create_mr(self): """ Initializes self.mr with an MR of length - defined by each test. 
        :return: None
        """
        self.mr = MR(self.pd, self.msg_size, e.IBV_ACCESS_LOCAL_WRITE)

    def create_qp_cap(self):
        return QPCap(max_recv_wr=self.num_msgs)

    def create_qp_init_attr(self):
        return QPInitAttr(qp_type=e.IBV_QPT_RC, scq=self.cq, rcq=self.cq,
                          srq=self.srq, cap=self.create_qp_cap())

    def create_qp_attr(self):
        return QPAttr(port_num=self.ib_port)

    def create_qps(self):
        """
        Initializes self.qps with RC QPs.
        :return: None
        """
        qp_init_attr = self.create_qp_init_attr()
        qp_attr = self.create_qp_attr()
        for _ in range(self.qp_count):
            try:
                qp = QP(self.pd, qp_init_attr, qp_attr)
                self.qps.append(qp)
                self.qps_num.append(qp.qp_num)
                self.psns.append(random.getrandbits(24))
            except PyverbsRDMAError as ex:
                if ex.error_code == errno.EOPNOTSUPP:
                    raise unittest.SkipTest(f'Create QP type {qp_init_attr.qp_type} is not supported')
                raise ex

    def create_srq_attr(self):
        return SrqAttr(max_wr=self.num_msgs * self.qp_count)

    def create_srq_init_attr(self):
        return SrqInitAttr(self.create_srq_attr())

    def create_srq(self):
        srq_init_attr = self.create_srq_init_attr()
        try:
            self.srq = SRQ(self.pd, srq_init_attr)
        except PyverbsRDMAError as ex:
            if ex.error_code == errno.EOPNOTSUPP:
                raise unittest.SkipTest('Create SRQ is not supported')
            raise ex

    def pre_run(self, rpsns, rqps_num):
        """
        Configures resources before running traffic and modifies the QPs to
        RTS if required.
        :param rpsns: Remote PSNs (packet serial numbers)
        :param rqps_num: Remote QP numbers
        """
        self.rpsns = rpsns
        self.rqps_num = rqps_num
        self.to_rts()

    def to_rts(self):
        """
        Modifies the QPs' states to RTS and initializes them to be ready for
        traffic. If not required, the implementation can simply "pass", but
        the method must be implemented.
        """
        raise NotImplementedError()


class RoCETrafficResources(TrafficResources):
    def __init__(self, dev_name, ib_port, gid_index, **kwargs):
        is_gid_available(gid_index)
        super(RoCETrafficResources, self).__init__(dev_name, ib_port,
                                                   gid_index, **kwargs)


class RCResources(RoCETrafficResources):
    def to_rts(self):
        """
        Set the QP attributes' values to arbitrary values (same values used
        in ibv_rc_pingpong).
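        A hypothetical two-sided setup (sketch; device name, port and GID
        index are placeholders) that makes to_rts() usable:

            client = RCResources('mlx5_0', 1, 3)
            server = RCResources('mlx5_0', 1, 3)
            # pre_run() stores the remote PSNs/QP numbers and calls to_rts()
            client.pre_run(server.psns, server.qps_num)
            server.pre_run(client.psns, client.qps_num)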
        :return: None
        """
        attr = self.create_qp_attr()
        attr.path_mtu = PATH_MTU
        attr.max_dest_rd_atomic = MAX_DEST_RD_ATOMIC
        set_rnr_attributes(attr)
        attr.max_rd_atomic = MAX_RD_ATOMIC
        gr = GlobalRoute(dgid=self.ctx.query_gid(self.ib_port, self.gid_index),
                         sgid_index=self.gid_index)
        ah_attr = AHAttr(port_num=self.ib_port, is_global=1, gr=gr,
                         dlid=self.port_attr.lid)
        attr.ah_attr = ah_attr
        for i in range(self.qp_count):
            attr.dest_qp_num = self.rqps_num[i]
            attr.rq_psn = self.psns[i]
            attr.sq_psn = self.rpsns[i]
            self.qps[i].to_rts(attr)


class UDResources(RoCETrafficResources):
    UD_QKEY = 0x11111111
    UD_PKEY_INDEX = 0
    GRH_SIZE = 40

    def create_mr(self):
        self.mr = MR(self.pd, self.msg_size + self.GRH_SIZE,
                     e.IBV_ACCESS_LOCAL_WRITE)

    def create_qp_init_attr(self):
        return QPInitAttr(qp_type=e.IBV_QPT_UD, scq=self.cq, rcq=self.cq,
                          srq=self.srq, cap=self.create_qp_cap())

    def create_qps(self):
        qp_init_attr = self.create_qp_init_attr()
        qp_attr = self.create_qp_attr()
        qp_attr.qkey = self.UD_QKEY
        qp_attr.pkey_index = self.UD_PKEY_INDEX
        for _ in range(self.qp_count):
            try:
                qp = QP(self.pd, qp_init_attr, qp_attr)
                self.qps.append(qp)
                self.qps_num.append(qp.qp_num)
                self.psns.append(random.getrandbits(24))
            except PyverbsRDMAError as ex:
                if ex.error_code == errno.EOPNOTSUPP:
                    raise unittest.SkipTest(f'Create QP type {qp_init_attr.qp_type} is not supported')
                raise ex

    def to_rts(self):
        pass


class RawResources(TrafficResources):
    def create_qp_init_attr(self):
        return QPInitAttr(qp_type=e.IBV_QPT_RAW_PACKET, scq=self.cq,
                          rcq=self.cq, srq=self.srq, cap=self.create_qp_cap())

    def pre_run(self, rpsns=None, rqps_num=None):
        pass


class XRCResources(RoCETrafficResources):
    def __init__(self, dev_name, ib_port, gid_index, qp_count=2):
        self.temp_file = None
        self.xrcd_fd = -1
        self.xrcd = None
        self.sqp_lst = []
        self.rqp_lst = []
        super(XRCResources, self).__init__(dev_name, ib_port, gid_index,
                                           qp_count=qp_count)

    def close(self):
        os.close(self.xrcd_fd)
        self.temp_file.close()

    @property
    def qp(self):
        return self.sqp_lst[0]

    def create_qps(self):
        """
        Initializes self.qps with XRC SEND/RECV QPs.
        :return: None
        """
        qp_attr = QPAttr(port_num=self.ib_port)
        qp_attr.pkey_index = 0
        for _ in range(self.qp_count):
            attr_ex = QPInitAttrEx(qp_type=e.IBV_QPT_XRC_RECV,
                                   comp_mask=e.IBV_QP_INIT_ATTR_XRCD,
                                   xrcd=self.xrcd)
            qp_attr.qp_access_flags = e.IBV_ACCESS_LOCAL_WRITE | \
                e.IBV_ACCESS_REMOTE_READ | \
                e.IBV_ACCESS_REMOTE_WRITE | \
                e.IBV_ACCESS_REMOTE_ATOMIC
            recv_qp = QP(self.ctx, attr_ex, qp_attr)
            self.rqp_lst.append(recv_qp)
            qp_caps = QPCap(max_send_wr=self.num_msgs, max_recv_sge=0,
                            max_recv_wr=0)
            attr_ex = QPInitAttrEx(qp_type=e.IBV_QPT_XRC_SEND, sq_sig_all=1,
                                   comp_mask=e.IBV_QP_INIT_ATTR_PD,
                                   pd=self.pd, scq=self.cq, cap=qp_caps)
            qp_attr.qp_access_flags = 0
            send_qp = QP(self.ctx, attr_ex, qp_attr)
            self.sqp_lst.append(send_qp)
            self.qps_num.append((recv_qp.qp_num, send_qp.qp_num))
            self.psns.append(random.getrandbits(24))

    def create_xrcd(self):
        """
        Initializes self.xrcd with an XRC Domain object.
        :return: None
        """
        self.temp_file = tempfile.NamedTemporaryFile()
        self.xrcd_fd = os.open(self.temp_file.name,
                               os.O_RDONLY | os.O_CREAT,
                               stat.S_IRUSR | stat.S_IRGRP)
        init = XRCDInitAttr(
            e.IBV_XRCD_INIT_ATTR_FD | e.IBV_XRCD_INIT_ATTR_OFLAGS,
            os.O_CREAT, self.xrcd_fd)
        try:
            self.xrcd = XRCD(self.ctx, init)
        except PyverbsRDMAError as ex:
            if ex.error_code == errno.EOPNOTSUPP:
                raise unittest.SkipTest('Create XRCD is not supported')
            raise ex

    def create_srq(self):
        """
        Initializes self.srq with a Shared Receive Queue (SRQ) object.
:return: None """ srq_attr = SrqInitAttrEx(max_wr=self.qp_count*self.num_msgs) srq_attr.srq_type = e.IBV_SRQT_XRC srq_attr.pd = self.pd srq_attr.xrcd = self.xrcd srq_attr.cq = self.cq srq_attr.comp_mask = e.IBV_SRQ_INIT_ATTR_TYPE | e.IBV_SRQ_INIT_ATTR_PD | \ e.IBV_SRQ_INIT_ATTR_CQ | e.IBV_SRQ_INIT_ATTR_XRCD self.srq = SRQ(self.ctx, srq_attr) def to_rts(self): gid = self.ctx.query_gid(self.ib_port, self.gid_index) gr = GlobalRoute(dgid=gid, sgid_index=self.gid_index) ah_attr = AHAttr(port_num=self.ib_port, is_global=True, gr=gr, dlid=self.port_attr.lid) qp_attr = QPAttr() qp_attr.max_rd_atomic = MAX_RD_ATOMIC qp_attr.max_dest_rd_atomic = MAX_DEST_RD_ATOMIC qp_attr.path_mtu = PATH_MTU set_rnr_attributes(qp_attr) qp_attr.ah_attr = ah_attr for i in range(self.qp_count): qp_attr.dest_qp_num = self.rqps_num[i][1] qp_attr.rq_psn = self.psns[i] qp_attr.sq_psn = self.rpsns[i] self.rqp_lst[i].to_rts(qp_attr) qp_attr.dest_qp_num = self.rqps_num[i][0] self.sqp_lst[i].to_rts(qp_attr) def init_resources(self): self.create_xrcd() super(XRCResources, self).init_resources() self.create_srq() rdma-core-56.1/tests/base_rdmacm.py000066400000000000000000000157311477342711600173120ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc . All rights reserved. See COPYING file import abc from pyverbs.cmid import CMID, AddrInfo, CMEventChannel, ConnParam, UDParam from pyverbs.qp import QPCap, QPInitAttr, QPAttr, QP from pyverbs.pyverbs_error import PyverbsUserError import pyverbs.cm_enums as ce import pyverbs.enums as e from pyverbs.cq import CQ GRH_SIZE = 40 qp_type_per_ps = {ce.RDMA_PS_TCP: e.IBV_QPT_RC, ce.RDMA_PS_UDP: e.IBV_QPT_UD, ce.RDMA_PS_IPOIB : e.IBV_QPT_UD} class CMResources(abc.ABC): """ CMResources class is an abstract base class which contains basic resources for RDMA CM communication. """ def __init__(self, addr=None, passive=None, **kwargs): """ :param addr: Local address to bind to. :param passive: Indicate if this CM is the passive CM. 
:param kwargs: Arguments: * *port* (str) Port number of the address * *with_ext_qp* (bool) If set, an external RC QP will be created and used by RDMACM * *port_space* (str) If set, indicates the CMIDs port space """ self.qp_init_attr = None self.passive = passive self.with_ext_qp = kwargs.get('with_ext_qp', False) self.port = kwargs.get('port') if kwargs.get('port') else '7471' self.ib_port = int(kwargs.get('ib_port', '1')) self.port_space = kwargs.get('port_space', ce.RDMA_PS_TCP) self.remote_operation = kwargs.get('remote_op') self.qp_type = qp_type_per_ps[self.port_space] self.qp_init_attr = QPInitAttr(qp_type=self.qp_type, cap=QPCap()) self.connected = False # When passive side (server) listens to incoming connection requests, # for each new request it creates a new cmid which is used to establish # the connection with the remote side self.msg_size = 1024 self.num_msgs = 10 self.channel = None self.cq = None self.qps = {} self.mr = None self.remote_qpn = None self.ud_params = None self.child_ids = {} self.cmids = {} if self.passive: self.ai = AddrInfo(src=addr, src_service=self.port, port_space=self.port_space, flags=ce.RAI_PASSIVE) else: self.ai = AddrInfo(src=addr, dst=addr, dst_service=self.port, port_space=self.port_space) @property def child_id(self): if self.child_ids: return self.child_ids[0] @property def cmid(self): if self.cmids: return self.cmids[0] @property def qp(self): if self.qps: return self.qps[0] def create_mr(self): cmid = self.child_id if self.passive else self.cmid mr_remote_function = {None: cmid.reg_msgs, 'read': cmid.reg_read, 'write': cmid.reg_write} self.mr = mr_remote_function[self.remote_operation](self.msg_size + GRH_SIZE) def create_event_channel(self): self.channel = CMEventChannel() def create_qp_init_attr(self, rcq=None, scq=None): return QPInitAttr(qp_type=self.qp_type, rcq=rcq, scq=scq, cap=QPCap(max_recv_wr=1)) def create_conn_param(self, qp_num=0, conn_idx=0): if self.with_ext_qp: qp_num = self.qp.qp_num return ConnParam(qp_num=qp_num) def set_ud_params(self, cm_event): if self.port_space in [ce.RDMA_PS_UDP, ce.RDMA_PS_IPOIB]: self.ud_params = UDParam(cm_event) def my_qp_number(self): if self.with_ext_qp: return self.qp.qp_num else: cm = self.child_id if self.passive else self.cmid return cm.qpn def create_qp(self, conn_idx=0): """ Create an rdmacm QP. If self.with_ext_qp is set, then an external CQ and QP will be created. In case that CQ is already created, it is used for the newly created QP. :param conn_idx: The connection index. 
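        Example (sketch; the address is a placeholder and connection
        establishment is omitted):

            res = SyncCMResources(addr='192.168.1.1', passive=False,
                                  with_ext_qp=True)
            res.create_cmid()
            ...  # resolve address / establish the connection
            res.create_qp()               # creates the external QP and CQ
            res.modify_ext_qp_to_rts()    # external QP must be moved to RTS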
""" cmid = self.child_id if self.passive else self.cmid if not self.with_ext_qp: cmid.create_qp(self.create_qp_init_attr()) else: self.create_cq(cmid) init_attr = self.create_qp_init_attr(rcq=self.cq, scq=self.cq) self.qps[conn_idx] = QP(cmid.pd, init_attr, QPAttr()) def create_cq(self, cmid): if not self.cq: self.cq = CQ(cmid.context, self.num_msgs, None, None, 0) def modify_ext_qp_to_rts(self, conn_idx=0): cmid = self.child_id if self.passive else self.cmid attr, mask = cmid.init_qp_attr(e.IBV_QPS_INIT) self.qps[conn_idx].modify(attr, mask) attr, mask = cmid.init_qp_attr(e.IBV_QPS_RTR) self.qps[conn_idx].modify(attr, mask) attr, mask = cmid.init_qp_attr(e.IBV_QPS_RTS) self.qps[conn_idx].modify(attr, mask) def mem_write(self, data, size, offset=0): self.mr.write(data, size, offset) def mem_read(self, size=None, offset=0): size_ = self.msg_size if size is None else size return self.mr.read(size_, offset) @abc.abstractmethod def create_child_id(self, cm_event=None): pass @property def mr_lkey(self): return self.mr.lkey class AsyncCMResources(CMResources): """ AsyncCMResources class contains resources for RDMA CM asynchronous communication. :param addr: Local address to bind to. :param passive: Indicate if this CM is the passive CM. """ def __init__(self, addr=None, passive=None, **kwargs): super(AsyncCMResources, self).__init__(addr=addr, passive=passive, **kwargs) self.create_event_channel() def create_cmid(self, idx=0): self.cmids[idx] = CMID(creator=self.channel, port_space=self.port_space) def create_child_id(self, cm_event=None): if not self.passive: raise PyverbsUserError('create_child_id can be used only in passive side') new_child_idx = len(self.child_ids) self.child_ids[new_child_idx] = CMID(creator=cm_event, listen_id=self.cmid) class SyncCMResources(CMResources): """ SyncCMResources class contains resources for RDMA CM synchronous communication. :param addr: Local address to bind to. :param passive: Indicate if this CM is the passive CM. """ def __init__(self, addr=None, passive=None, **kwargs): super(SyncCMResources, self).__init__(addr=addr, passive=passive, **kwargs) def create_cmid(self, idx=0): self.cmids[idx] = CMID(creator=self.ai, qp_init_attr=self.qp_init_attr) def create_child_id(self, cm_event=None): if not self.passive: raise PyverbsUserError('create_child_id can be used only in passive side') new_child_idx = len(self.child_ids) self.child_ids[new_child_idx] = self.cmid.get_request() rdma-core-56.1/tests/cuda_utils.py000066400000000000000000000101711477342711600172020ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2022 Nvidia Inc. All rights reserved. See COPYING file """ This module provides utilities and auxiliary functions for CUDA related tests including the initialization/tear down needed and some error handlers. 
""" import unittest try: from cuda import cuda, cudart, nvrtc CUDA_FOUND = True except ImportError: CUDA_FOUND = False def requires_cuda(func): def inner(instance): if not CUDA_FOUND: raise unittest.SkipTest( 'cuda-python 12.0+ must be installed to run CUDA tests') res = cudart.cudaGetDeviceCount() if res[0].value == cuda.CUresult.CUDA_ERROR_NO_DEVICE or res[1] == 0: raise unittest.SkipTest('No CUDA-capable devices were detected') return func(instance) return inner def _cuda_get_error_enum(error): if isinstance(error, cuda.CUresult): err, name = cuda.cuGetErrorName(error) return name if err == cuda.CUresult.CUDA_SUCCESS else "" elif isinstance(error, cudart.cudaError_t): return cudart.cudaGetErrorName(error)[1] elif isinstance(error, nvrtc.nvrtcResult): return nvrtc.nvrtcGetErrorString(error)[1] else: raise RuntimeError(f'Unknown error type: {error}') def check_cuda_errors(result): """ CUDA error handler. If the CUDA result is success it returns the remaining objects that originally returned by the CUDA function call (if any). Otherwise and exception is raised. :param result: CUDA function result (CUresult) :return: The CUDA function results in case of success """ if result[0].value: raise RuntimeError( f'CUDA error code = {result[0].value} ({_cuda_get_error_enum(result[0])})') if len(result) == 1: return None elif len(result) == 2: return result[1] else: return result[1:] # The following functions should not be used directly, instead they should # replace/extend unittest.TestCase's derived methods in CUDA tests. @requires_cuda def setUp(obj): super(obj.__class__, obj).setUp() obj.iters = 10 obj.traffic_args = None obj.cuda_ctx = None obj.init_cuda() def tearDown(obj): if obj.server and obj.server.cuda_addr: check_cuda_errors(cuda.cuMemFree(obj.server.cuda_addr)) if obj.client and obj.client.cuda_addr: check_cuda_errors(cuda.cuMemFree(obj.client.cuda_addr)) if obj.cuda_ctx: check_cuda_errors(cuda.cuCtxDestroy(obj.cuda_ctx)) super(obj.__class__, obj).tearDown() def init_cuda(obj): cuda_dev_id = obj.config['gpu'] if cuda_dev_id is None: raise unittest.SkipTest('GPU device ID must be passed') check_cuda_errors(cuda.cuInit(0)) cuda_device = check_cuda_errors(cuda.cuDeviceGet(cuda_dev_id)) obj.cuda_ctx = check_cuda_errors( cuda.cuCtxCreate(cuda.CUctx_flags.CU_CTX_MAP_HOST, cuda_device)) check_cuda_errors(cuda.cuCtxSetCurrent(obj.cuda_ctx)) def set_init_cuda_methods(cls): """ Replaces the setUp and tearDown methods of any unittest.TestCase derived class. Can be useful as a decorator for CUDA related tests. :param cls: Test class of unittest.TestCase """ cls.setUp = setUp cls.tearDown = tearDown cls.init_cuda = init_cuda cls.mem_write = mem_write cls.mem_read = mem_read return cls # The following functions should be used by CUDA resources objects def mem_write(obj, data, size, offset=0): cuda_addr = cuda.CUdeviceptr(init_value=int(obj.cuda_addr) + offset) check_cuda_errors(cuda.cuMemcpyHtoD(cuda_addr, data.encode(), size)) def mem_read(obj, size=None, offset=0): size_ = obj.msg_size if size is None else size data_read = bytearray(size_) cuda_addr = cuda.CUdeviceptr(init_value=int(obj.cuda_addr) + offset) check_cuda_errors(cuda.cuMemcpyDtoH(data_read, cuda_addr, size_)) return data_read def set_mem_io_cuda_methods(cls): """ Replaces the mem_write/mem_read methods of any class derived from BaseResources. 
:param cls: Test class of BaseResources """ cls.mem_write = mem_write cls.mem_read = mem_read return cls rdma-core-56.1/tests/efa_base.py000066400000000000000000000113751477342711600166020ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright 2020-2023 Amazon.com, Inc. or its affiliates. All rights reserved. import unittest import random import errno from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.cq import CqInitAttrEx from pyverbs.qp import QPAttr, QPCap, QPInitAttrEx import pyverbs.providers.efa.efa_enums as efa_e import pyverbs.providers.efa.efadv as efa import pyverbs.device as d import pyverbs.enums as e from tests.base import PyverbsAPITestCase from tests.base import TrafficResources from tests.base import RDMATestCase import tests.utils AMAZON_VENDOR_ID = 0x1d0f def is_efa_dev(ctx): dev_attrs = ctx.query_device() return dev_attrs.vendor_id == AMAZON_VENDOR_ID def skip_if_not_efa_dev(ctx): if not is_efa_dev(ctx): raise unittest.SkipTest('Can not run the test over non EFA device') class EfaAPITestCase(PyverbsAPITestCase): def setUp(self): super().setUp() skip_if_not_efa_dev(self.ctx) class EfaRDMATestCase(RDMATestCase): def setUp(self): super().setUp() skip_if_not_efa_dev(d.Context(name=self.dev_name)) class SRDResources(TrafficResources): SRD_QKEY = 0x11111111 SRD_PKEY_INDEX = 0 def __init__(self, dev_name, ib_port, gid_index, send_ops_flags, qp_count=1): self.send_ops_flags = send_ops_flags super().__init__(dev_name, ib_port, gid_index, qp_count=qp_count) def create_qp_attr(self): attr = QPAttr(port_num=self.ib_port) attr.qkey = self.SRD_QKEY attr.pkey_index = self.SRD_PKEY_INDEX return attr def to_rts(self): attr = self.create_qp_attr() for i in range(self.qp_count): attr.dest_qp_num = self.rqps_num[i] attr.sq_psn = self.rpsns[i] self.qps[i].to_rts(attr) def create_qps(self): qp_cap = QPCap(max_recv_wr=self.num_msgs, max_send_wr=self.num_msgs, max_recv_sge=1, max_send_sge=1) comp_mask = e.IBV_QP_INIT_ATTR_PD if self.send_ops_flags: comp_mask |= e.IBV_QP_INIT_ATTR_SEND_OPS_FLAGS qp_init_attr_ex = QPInitAttrEx(cap=qp_cap, qp_type=e.IBV_QPT_DRIVER, scq=self.cq, rcq=self.cq, pd=self.pd, send_ops_flags=self.send_ops_flags, comp_mask=comp_mask) efa_init_attr_ex = efa.EfaQPInitAttr() efa_init_attr_ex.driver_qp_type = efa_e.EFADV_QP_DRIVER_TYPE_SRD try: for _ in range(self.qp_count): qp = efa.SRDQPEx(self.ctx, qp_init_attr_ex, efa_init_attr_ex) self.qps.append(qp) self.qps_num.append(qp.qp_num) self.psns.append(random.getrandbits(24)) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Extended SRD QP is not supported on this device') raise ex def create_mr(self): additional_access_flags = 0 if self.send_ops_flags == e.IBV_QP_EX_WITH_RDMA_READ: additional_access_flags = e.IBV_ACCESS_REMOTE_READ elif self.send_ops_flags in [e.IBV_QP_EX_WITH_RDMA_WRITE, e.IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM]: additional_access_flags = e.IBV_ACCESS_REMOTE_WRITE self.mr = tests.utils.create_custom_mr(self, additional_access_flags) class EfaCQRes(SRDResources): def __init__(self, dev_name, ib_port, gid_index, send_ops_flags, qp_count=1, requested_dev_cap=None, wc_flags=None): """ Initialize EFA DV CQ based on SRD resources. :param requested_dev_cap: A necessary device cap. If it's not supported by the device, the test will be skipped. :param wc_flags: WC flags for EFA DV CQ. 
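        Example (sketch; the device name, GID index and the cap/flag
        constants are placeholders taken from pyverbs' EFA enums):

            res = EfaCQRes('efa_0', 1, 0, e.IBV_QP_EX_WITH_SEND,
                           requested_dev_cap=efa_e.EFADV_DEVICE_ATTR_CAPS_CQ_WITH_SGID,
                           wc_flags=efa_e.EFADV_WC_EX_WITH_SGID)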
""" self.requested_dev_cap = requested_dev_cap self.efa_wc_flags = wc_flags super().__init__(dev_name, ib_port, gid_index, send_ops_flags, qp_count=qp_count) def create_context(self): super().create_context() if self.requested_dev_cap: with efa.EfaContext(name=self.ctx.name) as efa_ctx: if not efa_ctx.query_efa_device().device_caps & self.requested_dev_cap: miss_caps = efa.dev_cap_to_str(self.requested_dev_cap) raise unittest.SkipTest(f'Device caps doesn\'t support {miss_caps}') def create_cq(self): cia = CqInitAttrEx(wc_flags=e.IBV_WC_STANDARD_FLAGS) efa_cia = efa.EfaDVCQInitAttr(self.efa_wc_flags) try: self.cq = efa.EfaCQ(self.ctx, cia, efa_cia) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create EFA DV CQ is not supported') raise ex rdma-core-56.1/tests/irdma_base.py000066400000000000000000000060041477342711600171340ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2023 Red Hat, Inc, All rights reserved. See COPYING file import unittest INTEL_VENDOR_ID = 0x8086 IRDMA_DEVS = { 0x1572, # I40E_DEV_ID_SFP_XL710 0x1574, # I40E_DEV_ID_QEMU 0x1580, # I40E_DEV_ID_KX_B 0x1581, # I40E_DEV_ID_KX_C 0x1583, # I40E_DEV_ID_QSFP_A 0x1584, # I40E_DEV_ID_QSFP_B 0x1585, # I40E_DEV_ID_QSFP_C 0x1586, # I40E_DEV_ID_10G_BASE_T 0x1587, # I40E_DEV_ID_20G_KR2 0x1588, # I40E_DEV_ID_20G_KR2_A 0x1589, # I40E_DEV_ID_10G_BASE_T4 0x158A, # I40E_DEV_ID_25G_B 0x158B, # I40E_DEV_ID_25G_SFP28 0x154C, # I40E_DEV_ID_VF 0x1571, # I40E_DEV_ID_VF_HV 0x374C, # I40E_DEV_ID_X722_A0 0x374D, # I40E_DEV_ID_X722_A0_VF 0x37CE, # I40E_DEV_ID_KX_X722 0x37CF, # I40E_DEV_ID_QSFP_X722 0x37D0, # I40E_DEV_ID_SFP_X722 0x37D1, # I40E_DEV_ID_1G_BASE_T_X722 0x37D2, # I40E_DEV_ID_10G_BASE_T_X722 0x37D3, # I40E_DEV_ID_SFP_I_X722 0x37CD, # I40E_DEV_ID_X722_VF 0x37D9, # I40E_DEV_ID_X722_VF_HV 0x124C, # Intel(R) Ethernet Connection E823-L for backplane 0x124D, # Intel(R) Ethernet Connection E823-L for SFP 0x124E, # Intel(R) Ethernet Connection E823-L/X557-AT 10GBASE-T 0x124F, # Intel(R) Ethernet Connection E823-L 1GbE 0x151D, # Intel(R) Ethernet Connection E823-L for QSFP 0x1591, # Intel(R) Ethernet Controller E810-C for backplane 0x1592, # Intel(R) Ethernet Controller E810-C for QSFP 0x1593, # Intel(R) Ethernet Controller E810-C for SFP 0x1599, # Intel(R) Ethernet Controller E810-XXV for backplane 0x159A, # Intel(R) Ethernet Controller E810-XXV for QSFP 0x159B, # Intel(R) Ethernet Controller E810-XXV for SFP 0x188A, # Intel(R) Ethernet Connection E823-C for backplane 0x188B, # Intel(R) Ethernet Connection E823-C for QSFP 0x188C, # Intel(R) Ethernet Connection E823-C for SFP 0x188D, # Intel(R) Ethernet Connection E823-C/X557-AT 10GBASE-T 0x188E, # Intel(R) Ethernet Connection E823-C 1GbE 0x1890, # Intel(R) Ethernet Connection C822N for backplane 0x1891, # Intel(R) Ethernet Connection C822N for QSFP 0x1892, # Intel(R) Ethernet Connection C822N for SFP 0x1893, # Intel(R) Ethernet Connection E822-C/X557-AT 10GBASE-T 0x1894, # Intel(R) Ethernet Connection E822-C 1GbE 0x1897, # Intel(R) Ethernet Connection E822-L for backplane 0x1898, # Intel(R) Ethernet Connection E822-L for SFP 0x1899, # Intel(R) Ethernet Connection E822-L/X557-AT 10GBASE-T 0x189A, # Intel(R) Ethernet Connection E822-L 1GbE } def is_irdma_dev(ctx): dev_attrs = ctx.query_device() return dev_attrs.vendor_id == INTEL_VENDOR_ID and \ dev_attrs.vendor_part_id in IRDMA_DEVS def skip_if_irdma_dev(ctx): if is_irdma_dev(ctx): raise unittest.SkipTest('Can not run the test over irdma device') 
rdma-core-56.1/tests/mlx5_base.py000066400000000000000000001303051477342711600167270ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 NVIDIA Corporation . All rights reserved. See COPYING file import unittest import resource import random import struct import errno import math import time import sys from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr, \ Mlx5DVQPInitAttr, Mlx5QP, Mlx5DVDCInitAttr, Mlx5DCIStreamInitAttr, \ Mlx5DevxObj, Mlx5UMEM, Mlx5UAR, WqeDataSeg, WqeCtrlSeg, Wqe, Mlx5Cqe64, \ Mlx5DVCQInitAttr, Mlx5CQ from tests.base import RoCETrafficResources, set_rnr_attributes, DCT_KEY, \ RDMATestCase, PyverbsAPITestCase, RDMACMBaseTest, BaseResources, PATH_MTU, \ RNR_RETRY, RETRY_CNT, MIN_RNR_TIMER, TIMEOUT, MAX_RDMA_ATOMIC, RCResources, \ is_gid_available from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsUserError, \ PyverbsError from pyverbs.providers.mlx5.mlx5dv_objects import Mlx5DvObj from pyverbs.qp import QPCap, QPInitAttrEx, QPAttr import pyverbs.providers.mlx5.mlx5_enums as dve from pyverbs.addr import AHAttr, GlobalRoute from pyverbs.cq import CqInitAttrEx import pyverbs.mem_alloc as mem import pyverbs.dma_util as dma import pyverbs.device as d from pyverbs.pd import PD import pyverbs.enums as e from pyverbs.mr import MR import tests.utils MLX5_CQ_SET_CI = 0 POLL_CQ_TIMEOUT = 5 # In seconds PORT_STATE_TIMEOUT = 20 # In seconds MELLANOX_VENDOR_ID = 0x02c9 MLX5_DEVS = { 0x1011, # MT4113 Connect-IB 0x1012, # Connect-IB Virtual Function 0x1013, # ConnectX-4 0x1014, # ConnectX-4 Virtual Function 0x1015, # ConnectX-4LX 0x1016, # ConnectX-4LX Virtual Function 0x1017, # ConnectX-5, PCIe 3.0 0x1018, # ConnectX-5 Virtual Function 0x1019, # ConnectX-5 Ex 0x101a, # ConnectX-5 Ex VF 0x101b, # ConnectX-6 0x101c, # ConnectX-6 VF 0x101d, # ConnectX-6 DX 0x101e, # ConnectX family mlx5Gen Virtual Function 0x101f, # ConnectX-6 LX 0x1021, # ConnectX-7 0x1023, # ConnectX-8 0xa2d2, # BlueField integrated ConnectX-5 network controller 0xa2d3, # BlueField integrated ConnectX-5 network controller VF 0xa2d6, # BlueField-2 integrated ConnectX-6 Dx network controller 0xa2dc, # BlueField-3 integrated ConnectX-7 network controller 0xa2df, # BlueField-4 integrated ConnectX-8 network controller } DCI_TEST_GOOD_FLOW = 0 DCI_TEST_BAD_FLOW_WITH_RESET = 1 DCI_TEST_BAD_FLOW_WITHOUT_RESET = 2 IB_SMP_ATTR_PORT_INFO = 0x0015 IB_MGMT_CLASS_SUBN_LID_ROUTED = 0x01 IB_MGMT_METHOD_GET = 0x01 DB_BF_DBR_LESS_BUF_OFFSET = 0x600 class PortStatus: MLX5_PORT_UP = 1 MLX5_PORT_DOWN = 2 class PortState: NO_STATE_CHANGE = 0 DOWN = 1 INIT = 2 ARMED = 3 ACTIVE = 4 def is_mlx5_dev(ctx): dev_attrs = ctx.query_device() return dev_attrs.vendor_id == MELLANOX_VENDOR_ID and \ dev_attrs.vendor_part_id in MLX5_DEVS def skip_if_not_mlx5_dev(ctx): if not is_mlx5_dev(ctx): raise unittest.SkipTest('Can not run the test over non MLX5 device') class Mlx5PyverbsAPITestCase(PyverbsAPITestCase): def setUp(self): super().setUp() skip_if_not_mlx5_dev(self.ctx) class Mlx5RDMATestCase(RDMATestCase): def setUp(self): super().setUp() skip_if_not_mlx5_dev(d.Context(name=self.dev_name)) class Mlx5RDMACMBaseTest(RDMACMBaseTest): def setUp(self): super().setUp() skip_if_not_mlx5_dev(d.Context(name=self.dev_name)) class Mlx5DcResources(RoCETrafficResources): def __init__(self, dev_name, ib_port, gid_index, send_ops_flags, qp_count=1, create_flags=0): self.send_ops_flags = send_ops_flags self.create_flags = create_flags super().__init__(dev_name, ib_port, gid_index, 
with_srq=True, qp_count=qp_count) def to_rts(self): attr = self.create_qp_attr() for i in range(self.qp_count): self.qps[i].to_rts(attr) self.dct_qp.to_rtr(attr) def create_context(self): mlx5dv_attr = Mlx5DVContextAttr() try: self.ctx = Mlx5Context(mlx5dv_attr, name=self.dev_name) except PyverbsUserError as ex: raise unittest.SkipTest(f'Could not open mlx5 context ({ex})') except PyverbsRDMAError: raise unittest.SkipTest('Opening mlx5 context is not supported') def create_mr(self): access = e.IBV_ACCESS_REMOTE_WRITE | e.IBV_ACCESS_LOCAL_WRITE | \ e.IBV_ACCESS_REMOTE_ATOMIC | e.IBV_ACCESS_REMOTE_READ self.mr = MR(self.pd, self.msg_size, access) def create_qp_cap(self): return QPCap(100, 0, 1, 0) def create_qp_attr(self): qp_attr = QPAttr(port_num=self.ib_port) set_rnr_attributes(qp_attr) qp_access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE | \ e.IBV_ACCESS_REMOTE_ATOMIC | e.IBV_ACCESS_REMOTE_READ qp_attr.qp_access_flags = qp_access gr = GlobalRoute(dgid=self.ctx.query_gid(self.ib_port, self.gid_index), sgid_index=self.gid_index) ah_attr = AHAttr(port_num=self.ib_port, is_global=1, gr=gr, dlid=self.port_attr.lid) qp_attr.ah_attr = ah_attr return qp_attr def create_qp_init_attr(self, send_ops_flags=0): comp_mask = e.IBV_QP_INIT_ATTR_PD if send_ops_flags: comp_mask |= e.IBV_QP_INIT_ATTR_SEND_OPS_FLAGS return QPInitAttrEx(cap=self.create_qp_cap(), pd=self.pd, scq=self.cq, rcq=self.cq, srq=self.srq, qp_type=e.IBV_QPT_DRIVER, send_ops_flags=send_ops_flags, comp_mask=comp_mask, sq_sig_all=1) def create_qps(self): # Create the DCI QPs. qp_init_attr = self.create_qp_init_attr(self.send_ops_flags) try: for _ in range(self.qp_count): comp_mask = dve.MLX5DV_QP_INIT_ATTR_MASK_DC if self.create_flags: comp_mask |= dve.MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS attr = Mlx5DVQPInitAttr(comp_mask=comp_mask, create_flags=self.create_flags, dc_init_attr=Mlx5DVDCInitAttr()) qp = Mlx5QP(self.ctx, qp_init_attr, attr) self.qps.append(qp) self.qps_num.append(qp.qp_num) self.psns.append(random.getrandbits(24)) # Create the DCT QP. 
qp_init_attr = self.create_qp_init_attr() dc_attr = Mlx5DVDCInitAttr(dc_type=dve.MLX5DV_DCTYPE_DCT, dct_access_key=DCT_KEY) attr = Mlx5DVQPInitAttr(comp_mask=dve.MLX5DV_QP_INIT_ATTR_MASK_DC, dc_init_attr=dc_attr) self.dct_qp = Mlx5QP(self.ctx, qp_init_attr, attr) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest(f'Create DC QP is not supported') raise ex class Mlx5DcStreamsRes(Mlx5DcResources): def __init__(self, dev_name, ib_port, gid_index, send_ops_flags, qp_count=1, create_flags=0): self.bad_flow = 0 self.mr_bad_flow = False self.stream_check = False super().__init__(dev_name, ib_port, gid_index, send_ops_flags, qp_count, create_flags) def reset_qp(self, qp_idx): qp_attr = QPAttr(qp_state=e.IBV_QPS_RESET) self.qps[qp_idx].modify(qp_attr, e.IBV_QP_STATE) self.qps[qp_idx].to_rts(qp_attr) self.qp_stream_errors[qp_idx][0] = 0 def get_stream_id(self, qp_idx): return self.current_qp_stream_id[qp_idx] def generate_stream_id(self, qp_idx): self.current_qp_stream_id[qp_idx] += 1 # Reset stream id to check double-usage if self.current_qp_stream_id[qp_idx] > self.dcis[qp_idx]['stream']+2: self.current_qp_stream_id[qp_idx] = 1 return self.current_qp_stream_id[qp_idx] def dci_reset_stream_id(self, qp_idx): stream_id = self.get_stream_id(qp_idx) Mlx5QP.modify_dci_stream_channel_id(self.qps[qp_idx], stream_id) # Check once if error raised when reset wrong stream id if self.stream_check: try: Mlx5QP.modify_dci_stream_channel_id(self.qps[qp_idx], stream_id+1) except PyverbsRDMAError as ex: self.stream_check = False def bad_flow_handler_qp(self, qp_idx, status, reset=False): str_id = self.get_stream_id(qp_idx) bt_stream = (1 << str_id) if status == e.IBV_WC_LOC_PROT_ERR: self.qp_stream_errors[qp_idx][1] += 1 if (self.qp_stream_errors[qp_idx][0] & bt_stream) != 0: raise PyverbsError(f'Dublicate error from stream id {str_id}') self.qp_stream_errors[qp_idx][0] |= bt_stream if status == e.IBV_WC_WR_FLUSH_ERR: qp_attr, _ = self.qps[qp_idx].query(e.IBV_QP_STATE) if qp_attr.cur_qp_state == e.IBV_QPS_ERR and reset: if self.qp_stream_errors[qp_idx][1] != self.dcis[qp_idx]['errored']: msg = f'QP {qp_idx} in ERR state with wrong number of counter' raise PyverbsError(msg) self.reset_qp(qp_idx) self.qp_stream_errors[qp_idx][2] = True return True def bad_flow_handling(self, qp_idx, status, reset=False): if self.bad_flow == DCI_TEST_GOOD_FLOW: return False if self.bad_flow == DCI_TEST_BAD_FLOW_WITH_RESET: self.qp_stream_errors[qp_idx][1] += 1 if reset: self.dci_reset_stream_id(qp_idx) return True if self.bad_flow == DCI_TEST_BAD_FLOW_WITHOUT_RESET: return self.bad_flow_handler_qp(qp_idx, status, reset) return False def set_bad_flow(self, bad_flow): self.bad_flow = bad_flow if self.bad_flow: if bad_flow == DCI_TEST_BAD_FLOW_WITH_RESET and self.log_dci_errored == 0: raise unittest.SkipTest('DCS test of bad flow with reset is not ' 'supported when HCA_CAP.log_dci_errored is 0') self.pd_bad = PD(self.ctx) self.mr_bad_flow = False if bad_flow == DCI_TEST_BAD_FLOW_WITH_RESET: self.stream_check = True def is_bad_flow(self, qp_idx): cnt = self.get_stream_id(qp_idx) if self.bad_flow == DCI_TEST_GOOD_FLOW: return False if self.bad_flow == DCI_TEST_BAD_FLOW_WITH_RESET: if (cnt % 3) != 0: return False self.qp_stream_errors[qp_idx][0] += 1 if self.bad_flow == DCI_TEST_BAD_FLOW_WITHOUT_RESET: if self.qp_stream_errors[qp_idx][2]: return False return True def check_bad_flow(self, qp_idx): change_mr = False if self.is_bad_flow(qp_idx): if not self.mr_bad_flow: self.mr_bad_flow = True pd = 
self.pd_bad change_mr = True else: if self.mr_bad_flow: self.mr_bad_flow = False pd = self.pd change_mr = True if change_mr: self.mr.rereg(flags=e.IBV_REREG_MR_CHANGE_PD, pd=pd, addr=0, length=0, access=0) def check_after_traffic(self): if self.bad_flow == DCI_TEST_BAD_FLOW_WITH_RESET: for errs in self.qp_stream_errors: if errs[0] != errs[1]: msg = f'Number of qp_stream_errors {errs[0]} not same '\ f'as number of catches {errs[1]}' raise PyverbsError(msg) if self.stream_check: msg = 'Reset of good stream id does not create exception' raise PyverbsError(msg) def generate_dci_attr(self, qpn): # This array contains current number of log_dci_streams # and log_dci_errored values per qp. For 1-st qp number # of streams greater than number of errored and vice-versa # for the 2nd qp. qp_arr = {0: [3, 2], 1: [2, 3]} try: dci_caps = self.ctx.query_mlx5_device().dci_streams_caps except PyverbsRDMAError as ex: if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]: raise unittest.SkipTest('Get DCI caps is not supported') raise ex if not dci_caps or dci_caps['max_log_num_concurent'] == 0: raise unittest.SkipTest('DCI caps is not supported by HW') self.log_dci_streams = min(qp_arr.get(qpn, [1,1])[0], dci_caps['max_log_num_concurent']) self.log_dci_errored = min(qp_arr.get(qpn, [1,1])[1], dci_caps['max_log_num_errored']) def create_qps(self): # Create the DCI QPs. qp_init_attr = self.create_qp_init_attr(self.send_ops_flags) self.dcis = {} # This array contains current stream id self.current_qp_stream_id = {} # This array counts different errors in bad_flow self.qp_stream_errors = [] comp_mask = dve.MLX5DV_QP_INIT_ATTR_MASK_DC | \ dve.MLX5DV_QP_INIT_ATTR_MASK_DCI_STREAMS try: for qpn in range(self.qp_count): if self.create_flags: comp_mask |= dve.MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS self.generate_dci_attr(qpn) stream_ctx = Mlx5DCIStreamInitAttr(self.log_dci_streams, self.log_dci_errored) self.dcis[qpn] = {'stream': 1 << self.log_dci_streams, 'errored': 1 << self.log_dci_errored} attr = Mlx5DVQPInitAttr(comp_mask=comp_mask, create_flags=self.create_flags, dc_init_attr=Mlx5DVDCInitAttr(dci_streams=stream_ctx)) qp = Mlx5QP(self.ctx, qp_init_attr, attr) self.qps.append(qp) # Different values for start point of stream id per qp self.current_qp_stream_id[qpn] = qpn # Array of errors for bad_flow # For DCI_TEST_BAD_FLOW_WITH_RESET # First element - number of injected bad flows # Second element - number of exceptions from bad flows # For DCI_TEST_BAD_FLOW_WITHOUT_RESET # First element - bitmap of bad flow streams # Second element - number of exceptions from bad flows # Third element - flag if reset of qp been executed self.qp_stream_errors.append([0, 0, False]) self.qps_num.append(qp.qp_num) self.psns.append(random.getrandbits(24)) # Create the DCT QP. 
qp_init_attr = self.create_qp_init_attr() dc_attr = Mlx5DVDCInitAttr(dc_type=dve.MLX5DV_DCTYPE_DCT, dct_access_key=DCT_KEY) attr = Mlx5DVQPInitAttr(comp_mask=dve.MLX5DV_QP_INIT_ATTR_MASK_DC, dc_init_attr=dc_attr) self.dct_qp = Mlx5QP(self.ctx, qp_init_attr, attr) except PyverbsRDMAError as ex: if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]: raise unittest.SkipTest('Create DC QP is not supported') raise ex @staticmethod def traffic_with_bad_flow(client, server, iters, gid_idx, port): """ Runs basic traffic with bad flow between two sides :param client: client side, clients base class is BaseTraffic :param server: server side, servers base class is BaseTraffic :param iters: number of traffic iterations :param gid_idx: local gid index :param port: IB port :return: None """ import tests.utils as u send_op = e.IBV_WR_SEND ah_client = u.get_global_ah(client, gid_idx, port) s_recv_wr = u.get_recv_wr(server) c_recv_wr = u.get_recv_wr(client) for qp_idx in range(server.qp_count): # Prepare the receive queue with RecvWR u.post_recv(client, c_recv_wr, qp_idx=qp_idx) u.post_recv(server, s_recv_wr, qp_idx=qp_idx) read_offset = 0 for _ in range(iters): for qp_idx in range(server.qp_count): _, c_send_object = u.get_send_elements(client, False) u.send(client, c_send_object, send_op, True, qp_idx, ah_client, False) try: wcs = u._poll_cq(client.cq) except PyverbsError as ex: if client.bad_flow_handling(qp_idx, e.IBV_WC_SUCCESS, True): continue raise ex else: if wcs[0].status != e.IBV_WC_SUCCESS and \ client.bad_flow_handling(qp_idx, wcs[0].status, True): continue u.poll_cq(server.cq) u.post_recv(server, s_recv_wr, qp_idx=qp_idx) msg_received = server.mr.read(server.msg_size, read_offset) u.validate(msg_received, True, server.msg_size) client.check_after_traffic() class WqAttrs: def __init__(self): super().__init__() self.wqe_num = 0 self.wqe_size = 0 self.wq_size = 0 self.head = 0 self.post_idx = 0 self.wqe_shift = 0 self.offset = 0 def __str__(self): return str(vars(self)) def __format__(self, format_spec): return str(self).__format__(format_spec) class CqAttrs: def __init__(self): super().__init__() self.cons_idx = 0 self.cqe_size = 64 self.ncqes = 256 def __str__(self): return str(vars(self)) def __format__(self, format_spec): return str(self).__format__(format_spec) class QueueAttrs: def __init__(self): self.rq = WqAttrs() self.sq = WqAttrs() self.cq = CqAttrs() def __str__(self): print_format = '{}:\n\t{}\n' return print_format.format('RQ Attributes', self.rq) + \ print_format.format('SQ Attributes', self.sq) + \ print_format.format('CQ Attributes', self.cq) class Mlx5DevxRcResources(BaseResources): """ Creates all the DevX resources needed for a traffic-ready RC DevX QP, including methods to transit the WQs into RTS state. It also includes traffic methods for post send/receive and poll. The class currently supports post send with immediate, but can be easily extended to support other opcodes in the future. 
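    A sketch of a two-sided flow (device parameters are placeholders; on
    RoCE the peer MAC must also be passed to pre_run()):

        server = Mlx5DevxRcResources('mlx5_0', 1, 3)
        client = Mlx5DevxRcResources('mlx5_0', 1, 3)
        server.pre_run(client.psn, client.qpn, client.gid, client.lid)
        client.pre_run(server.psn, server.qpn, server.gid, server.lid)
        server.post_recv()
        client.post_send()
        client.poll_cq()
        server.poll_cq()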
""" def __init__(self, dev_name, ib_port, gid_index, msg_size=1024, activate_port_state=False, send_dbr_mode=0): from tests.mlx5_prm_structs import SendDbrMode super().__init__(dev_name, ib_port, gid_index) self.umems = {} self.send_dbr_mode = send_dbr_mode self.msg_size = msg_size self.num_msgs = 1000 self.imm = 0x03020100 self.uar = {} self.max_recv_sge = 1 self.eqn = None self.pd = None self.dv_pd = None self.mr = None self.msi_vector = None self.eq = None self.cq = None self.qp = None self.qpn = None self.psn = None self.lid = None self.gid = [0, 0, 0, 0] # Remote attrs self.rqpn = None self.rpsn = None self.rlid = None self.rgid = [0, 0, 0, 0] self.rmac = None self.devx_objs = [] self.qattr = QueueAttrs() self.with_odp = False self.user_addr = None if activate_port_state: start_state_t = time.perf_counter() self.change_port_state_with_registers(PortStatus.MLX5_PORT_UP) admin_status, oper_status = self.query_port_state_with_registers() while admin_status != PortStatus.MLX5_PORT_UP or oper_status != PortStatus.MLX5_PORT_UP: if time.perf_counter() - start_state_t >= PORT_STATE_TIMEOUT: raise PyverbsRDMAError('Could not change the port state to UP') self.change_port_state_with_registers(PortStatus.MLX5_PORT_UP) admin_status, oper_status = self.query_port_state_with_registers() time.sleep(1) mad_port_state = self.query_port_state_with_mads(ib_port) while mad_port_state < PortState.ACTIVE: if time.perf_counter() - start_state_t >= PORT_STATE_TIMEOUT: raise PyverbsRDMAError('Could not change the port state to UP') time.sleep(1) mad_port_state = self.query_port_state_with_mads(ib_port) if self.send_dbr_mode != SendDbrMode.DBR_VALID: self.check_cap_send_dbr_mode() self.init_resources() def get_wqe_data_segment(self): return WqeDataSeg(self.mr.length, self.mr.lkey, self.mr.buf) def change_port_state_with_registers(self, state): from tests.mlx5_prm_structs import PaosReg paos_in = PaosReg(local_port=self.ib_port, admin_status=state, ase=1) self.access_paos_register(paos_in) def query_port_state_with_registers(self): from tests.mlx5_prm_structs import PaosReg paos_in = PaosReg(local_port=self.ib_port) paos_out = self.access_paos_register(paos_in) return paos_out.admin_status, paos_out.oper_status def access_paos_register(self, paos_in, op_mod=0): # op_mod: 0 - write / 1 - read from tests.mlx5_prm_structs import AccessPaosRegisterIn, \ AccessPaosRegisterOut, DevxOps paos_reg_in = AccessPaosRegisterIn(op_mod=op_mod, register_id=DevxOps.MLX5_CMD_OP_ACCESS_REGISTER_PAOS, data=paos_in) cmd_out = self.ctx.devx_general_cmd(paos_reg_in, len(AccessPaosRegisterOut())) paos_reg_out = AccessPaosRegisterOut(cmd_out) if paos_reg_out.status: raise PyverbsRDMAError(f'Failed to access PAOS register ({paos_reg_out.syndrome})') return paos_reg_out.data def query_port_state_with_mads(self, ib_port): from tests.mlx5_prm_structs import IbSmp in_mad = IbSmp(base_version=1, mgmt_class=IB_MGMT_CLASS_SUBN_LID_ROUTED, class_version=1, method=IB_MGMT_METHOD_GET, attr_id=IB_SMP_ATTR_PORT_INFO, attr_mod=ib_port) ib_smp_out = IbSmp(self._send_mad_cmd(ib_port, in_mad, 0x3)) return ib_smp_out.data[32] & 0xf def _send_mad_cmd(self, ib_port, in_mad, op_mod): from tests.mlx5_prm_structs import MadIfcIn, MadIfcOut mad_ifc_in = MadIfcIn(op_mod=op_mod, port=ib_port, mad=in_mad) cmd_out = self.ctx.devx_general_cmd(mad_ifc_in, len(MadIfcOut())) mad_ifc_out = MadIfcOut(cmd_out) if mad_ifc_out.status: raise PyverbsRDMAError(f'Failed to send MAD with syndrome ({mad_ifc_out.syndrome})') return mad_ifc_out.mad def 
check_cap_send_dbr_mode(self): """ Check the capability of the dbr less. If the HCA cap have HCA cap 2, check if in HCA cap2 0x20(HCA CAP 2) + 0x1(current) have the send_dbr_mode_no_dbr_ext. """ from tests.mlx5_prm_structs import QueryCmdHcaCap2Out, \ QueryHcaCapIn, QueryCmdHcaCapOut, QueryHcaCapOp, QueryHcaCapMod, SendDbrMode self.create_context() query_cap_in = QueryHcaCapIn(op_mod=0x1) query_cap_out = QueryCmdHcaCapOut(self.ctx.devx_general_cmd( query_cap_in, len(QueryCmdHcaCapOut()))) if query_cap_out.status: raise PyverbsRDMAError('Failed to query general HCA CAPs with syndrome ' f'({query_cap_out.syndrome}') if not query_cap_out.capability.hca_cap_2: raise unittest.SkipTest("The device doesn't support general HCA CAPs 2") query_cap2_in = QueryHcaCapIn(op_mod=(QueryHcaCapOp.HCA_CAP_2 << 0x1) | \ QueryHcaCapMod.CURRENT) query_cap2_out = QueryCmdHcaCap2Out(self.ctx.devx_general_cmd( query_cap2_in, len(QueryCmdHcaCap2Out()))) if self.send_dbr_mode == SendDbrMode.NO_DBR_EXT and \ not query_cap2_out.capability.send_dbr_mode_no_dbr_ext: raise unittest.SkipTest("The device doesn't support send_dbr_mode_no_dbr_ext cap") if self.send_dbr_mode == SendDbrMode.NO_DBR_INT and \ not query_cap2_out.capability.send_dbr_mode_no_dbr_int: raise unittest.SkipTest("The device doesn't support send_dbr_mode_no_dbr_int cap") def init_resources(self): if not self.is_eth(): self.query_lid() else: is_gid_available(self.gid_index) self.query_gid() self.create_pd() self.create_mr() self.query_eqn() self.create_uar() self.create_queue_attrs() self.create_eq() self.create_cq() self.create_qp() # Objects closure order is important, and must be done manually in DevX self.devx_objs = [self.qp, self.cq] + list(self.uar.values()) + list(self.umems.values()) + [self.msi_vector, self.eq] def query_lid(self): from tests.mlx5_prm_structs import QueryHcaVportContextIn, \ QueryHcaVportContextOut, QueryHcaCapIn, QueryCmdHcaCapOut query_cap_in = QueryHcaCapIn(op_mod=0x1) query_cap_out = QueryCmdHcaCapOut(self.ctx.devx_general_cmd( query_cap_in, len(QueryCmdHcaCapOut()))) if query_cap_out.status: raise PyverbsRDMAError('Failed to query general HCA CAPs with syndrome ' f'({query_cap_out.syndrome}') port_num = self.ib_port if query_cap_out.capability.num_ports >= 2 else 0 query_port_in = QueryHcaVportContextIn(port_num=port_num) query_port_out = QueryHcaVportContextOut(self.ctx.devx_general_cmd( query_port_in, len(QueryHcaVportContextOut()))) if query_port_out.status: raise PyverbsRDMAError('Failed to query vport with syndrome ' f'({query_port_out.syndrome})') self.lid = query_port_out.hca_vport_context.lid def query_gid(self): gid = self.ctx.query_gid(self.ib_port, self.gid_index).gid.split(':') for i in range(0, len(gid), 2): self.gid[int(i/2)] = int(gid[i] + gid[i+1], 16) def is_eth(self): from tests.mlx5_prm_structs import QueryHcaCapIn, \ QueryCmdHcaCapOut query_cap_in = QueryHcaCapIn(op_mod=0x1) query_cap_out = QueryCmdHcaCapOut(self.ctx.devx_general_cmd( query_cap_in, len(QueryCmdHcaCapOut()))) if query_cap_out.status: raise PyverbsRDMAError('Failed to query general HCA CAPs with syndrome ' f'({query_cap_out.syndrome})') return query_cap_out.capability.port_type # 0:IB, 1:ETH @staticmethod def roundup_pow_of_two(val): return pow(2, math.ceil(math.log2(val))) def create_queue_attrs(self): # RQ calculations wqe_size = WqeDataSeg.sizeof() * self.max_recv_sge self.qattr.rq.wqe_size = self.roundup_pow_of_two(wqe_size) max_recv_wr = self.roundup_pow_of_two(self.num_msgs) self.qattr.rq.wq_size = max(self.qattr.rq.wqe_size * 
max_recv_wr, dve.MLX5_SEND_WQE_BB) self.qattr.rq.wqe_num = math.ceil(self.qattr.rq.wq_size / self.qattr.rq.wqe_size) self.qattr.rq.wqe_shift = int(math.log2(self.qattr.rq.wqe_size - 1)) + 1 # SQ calculations self.qattr.sq.offset = self.qattr.rq.wq_size # 192 = max overhead size of all structs needed for all operations in RC wqe_size = 192 + WqeDataSeg.sizeof() # Align wqe size to MLX5_SEND_WQE_BB self.qattr.sq.wqe_size = (wqe_size + dve.MLX5_SEND_WQE_BB - 1) & ~(dve.MLX5_SEND_WQE_BB - 1) self.qattr.sq.wq_size = self.roundup_pow_of_two(self.qattr.sq.wqe_size * self.num_msgs) self.qattr.sq.wqe_num = math.ceil(self.qattr.sq.wq_size / dve.MLX5_SEND_WQE_BB) self.qattr.sq.wqe_shift = int(math.log2(dve.MLX5_SEND_WQE_BB)) def create_context(self): try: attr = Mlx5DVContextAttr(dve.MLX5DV_CONTEXT_FLAGS_DEVX) self.ctx = Mlx5Context(attr, self.dev_name) except PyverbsUserError as ex: raise unittest.SkipTest(f'Could not open mlx5 context ({ex})') except PyverbsRDMAError: raise unittest.SkipTest('Opening mlx5 DevX context is not supported') def create_pd(self): self.pd = PD(self.ctx) self.dv_pd = Mlx5DvObj(dve.MLX5DV_OBJ_PD, pd=self.pd).dvpd def create_mr(self): access = e.IBV_ACCESS_REMOTE_WRITE | e.IBV_ACCESS_LOCAL_WRITE | \ e.IBV_ACCESS_REMOTE_READ self.mr = MR(self.pd, self.msg_size, access) def create_umem(self, size, access=e.IBV_ACCESS_LOCAL_WRITE, alignment=resource.getpagesize()): return Mlx5UMEM(self.ctx, size=size, alignment=alignment, access=access) def create_uar(self): self.uar['qp'] = Mlx5UAR(self.ctx, dve._MLX5DV_UAR_ALLOC_TYPE_NC) self.uar['cq'] = Mlx5UAR(self.ctx, dve._MLX5DV_UAR_ALLOC_TYPE_NC) if not self.uar['cq'].page_id or not self.uar['qp'].page_id: raise PyverbsRDMAError('Failed to allocate UAR') def query_eqn(self): self.eqn = self.ctx.devx_query_eqn(0) def create_cq(self): from tests.mlx5_prm_structs import CreateCqIn, SwCqc, CreateCqOut cq_size = self.roundup_pow_of_two(self.qattr.cq.cqe_size * self.qattr.cq.ncqes) # Align to page size pg_size = resource.getpagesize() cq_size = (cq_size + pg_size - 1) & ~(pg_size - 1) self.umems['cq'] = self.create_umem(size=cq_size) self.umems['cq_dbr'] = self.create_umem(size=8, alignment=8) log_cq_size = math.ceil(math.log2(self.qattr.cq.ncqes)) cmd_in = CreateCqIn(cq_umem_valid=1, cq_umem_id=self.umems['cq'].umem_id, sw_cqc=SwCqc(c_eqn=self.eqn, uar_page=self.uar['cq'].page_id, log_cq_size=log_cq_size, dbr_umem_valid=1, dbr_umem_id=self.umems['cq_dbr'].umem_id)) self.cq = Mlx5DevxObj(self.ctx, cmd_in, len(CreateCqOut())) def create_qp(self): self.psn = random.getrandbits(24) from tests.mlx5_prm_structs import SwQpc, CreateQpIn, DevxOps,\ CreateQpOut, CreateCqOut self.psn = random.getrandbits(24) qp_size = self.roundup_pow_of_two(self.qattr.rq.wq_size + self.qattr.sq.wq_size) # Align to page size pg_size = resource.getpagesize() qp_size = (qp_size + pg_size - 1) & ~(pg_size - 1) self.umems['qp'] = self.create_umem(size=qp_size) self.umems['qp_dbr'] = self.create_umem(size=8, alignment=8) log_rq_size = int(math.log2(self.qattr.rq.wqe_num - 1)) + 1 # Size of a receive WQE is 16*pow(2, log_rq_stride) log_rq_stride = self.qattr.rq.wqe_shift - 4 log_sq_size = int(math.log2(self.qattr.sq.wqe_num - 1)) + 1 cqn = CreateCqOut(self.cq.out_view).cqn qpc = SwQpc(st=DevxOps.MLX5_QPC_ST_RC, pd=self.dv_pd.pdn, pm_state=DevxOps.MLX5_QPC_PM_STATE_MIGRATED, log_rq_size=log_rq_size, log_sq_size=log_sq_size, ts_format=0x1, log_rq_stride=log_rq_stride, uar_page=self.uar['qp'].page_id, cqn_snd=cqn, cqn_rcv=cqn, dbr_umem_id=self.umems['qp_dbr'].umem_id, 
dbr_umem_valid=1, send_dbr_mode=self.send_dbr_mode) cmd_in = CreateQpIn(sw_qpc=qpc, wq_umem_id=self.umems['qp'].umem_id, wq_umem_valid=1) self.qp = Mlx5DevxObj(self.ctx, cmd_in, len(CreateQpOut())) self.qpn = CreateQpOut(self.qp.out_view).qpn def create_eq(self): pass def to_rts(self): """ Moves the created QP to the RTS state using DevX MODIFY_QP commands, walking it through all the needed states (RST2INIT, INIT2RTR, RTR2RTS) with the required attributes. rlid, rpsn, rqpn and rgid (when valid) must already be updated before calling this method. """ from tests.mlx5_prm_structs import DevxOps, ModifyQpIn, ModifyQpOut,\ CreateQpOut, SwQpc cmd_out_len = len(ModifyQpOut()) # RST2INIT qpn = CreateQpOut(self.qp.out_view).qpn swqpc = SwQpc(rre=1, rwe=1) swqpc.primary_address_path.vhca_port_num = self.ib_port cmd_in = ModifyQpIn(opcode=DevxOps.MLX5_CMD_OP_RST2INIT_QP, qpn=qpn, sw_qpc=swqpc) self.qp.modify(cmd_in, cmd_out_len) # INIT2RTR swqpc = SwQpc(mtu=PATH_MTU, log_msg_max=20, remote_qpn=self.rqpn, min_rnr_nak=MIN_RNR_TIMER, next_rcv_psn=self.rpsn) swqpc.primary_address_path.vhca_port_num = self.ib_port swqpc.primary_address_path.rlid = self.rlid if self.is_eth(): # GID field is a must for Eth (or if GRH is set in IB) swqpc.primary_address_path.rgid_rip = self.rgid swqpc.primary_address_path.rmac = self.rmac swqpc.primary_address_path.src_addr_index = self.gid_index swqpc.primary_address_path.hop_limit = tests.utils.PacketConsts.TTL_HOP_LIMIT # The UDP sport field must stay reserved (0) for RoCE v1/v1.5, so set it only for RoCE v2 if self.ctx.query_gid_type(self.ib_port, self.gid_index) == e.IBV_GID_TYPE_SYSFS_ROCE_V2: swqpc.primary_address_path.udp_sport = 0xdcba else: swqpc.primary_address_path.rlid = self.rlid cmd_in = ModifyQpIn(opcode=DevxOps.MLX5_CMD_OP_INIT2RTR_QP, qpn=qpn, sw_qpc=swqpc) self.qp.modify(cmd_in, cmd_out_len) # RTR2RTS swqpc = SwQpc(retry_count=RETRY_CNT, rnr_retry=RNR_RETRY, next_send_psn=self.psn, log_sra_max=MAX_RDMA_ATOMIC) swqpc.primary_address_path.vhca_port_num = self.ib_port swqpc.primary_address_path.ack_timeout = TIMEOUT cmd_in = ModifyQpIn(opcode=DevxOps.MLX5_CMD_OP_RTR2RTS_QP, qpn=qpn, sw_qpc=swqpc) self.qp.modify(cmd_in, cmd_out_len) def pre_run(self, rpsn, rqpn, rgid=0, rlid=0, rmac=0): """ Configure resources before running traffic. :param rpsn: Remote PSN (packet sequence number) :param rqpn: Remote QP number :param rgid: Remote GID :param rlid: Remote LID :param rmac: Remote MAC (valid for RoCE) :return: None """ self.rpsn = rpsn self.rqpn = rqpn self.rgid = rgid self.rlid = rlid self.rmac = rmac self.to_rts() def post_send(self): """ Posts one send WQE to the SQ by doing all the required work such as building the control/data segments, updating and ringing the dbr, updating the producer indexes, etc.
""" from tests.mlx5_prm_structs import SendDbrMode buffer_address = self.uar['qp'].reg_addr if self.send_dbr_mode == SendDbrMode.NO_DBR_EXT: # Address of DB blueflame register buffer_address = self.uar['qp'].base_addr + DB_BF_DBR_LESS_BUF_OFFSET idx = self.qattr.sq.post_idx if self.qattr.sq.post_idx < self.qattr.sq.wqe_num else 0 buf_offset = self.qattr.sq.offset + (idx << dve.MLX5_SEND_WQE_SHIFT) # Prepare WQE imm_be32 = struct.unpack("I", self.imm + self.qattr.sq.post_idx))[0] ctrl_seg = WqeCtrlSeg(imm=imm_be32, fm_ce_se=dve.MLX5_WQE_CTRL_CQ_UPDATE) data_seg = self.get_wqe_data_segment() ctrl_seg.opmod_idx_opcode = (self.qattr.sq.post_idx & 0xffff) << 8 | dve.MLX5_OPCODE_SEND_IMM size_in_octowords = int((ctrl_seg.sizeof() + data_seg.sizeof()) / 16) ctrl_seg.qpn_ds = self.qpn << 8 | size_in_octowords Wqe([ctrl_seg, data_seg], self.umems['qp'].umem_addr + buf_offset) self.qattr.sq.post_idx += int((size_in_octowords * 16 + dve.MLX5_SEND_WQE_BB - 1) / dve.MLX5_SEND_WQE_BB) # Make sure descriptors are written dma.udma_to_dev_barrier() if not self.send_dbr_mode: # Update the doorbell record mem.writebe32(self.umems['qp_dbr'].umem_addr, self.qattr.sq.post_idx & 0xffff, dve.MLX5_SND_DBR) dma.udma_to_dev_barrier() # Ring the doorbell and post the WQE dma.mmio_write64_as_be(buffer_address, mem.read64(ctrl_seg.addr)) def post_recv(self): """ Posts one receive WQE to the RQ by doing all the required work such as building the control/data segments, updating the dbr and the producer indexes. """ buf_offset = self.qattr.rq.offset + self.qattr.rq.wqe_size * self.qattr.rq.head # Prepare WQE data_seg = self.get_wqe_data_segment() Wqe([data_seg], self.umems['qp'].umem_addr + buf_offset) # Update indexes self.qattr.rq.post_idx += 1 self.qattr.rq.head = self.qattr.rq.head + 1 if self.qattr.rq.head + 1 < self.qattr.rq.wqe_num else 0 # Update the doorbell record dma.udma_to_dev_barrier() mem.writebe32(self.umems['qp_dbr'].umem_addr, self.qattr.rq.post_idx & 0xffff, dve.MLX5_RCV_DBR) def poll_cq(self): """ Polls the CQ once and updates the consumer index upon success. The CQE opcode and owner bit are checked and verified. This method does busy-waiting as long as it gets an empty CQE, until a timeout of POLL_CQ_TIMEOUT seconds. """ idx = self.qattr.cq.cons_idx % self.qattr.cq.ncqes cq_owner_flip = not(not(self.qattr.cq.cons_idx & self.qattr.cq.ncqes)) cqe_start_addr = self.umems['cq'].umem_addr + (idx * self.qattr.cq.cqe_size) cqe = None start_poll_t = time.perf_counter() while cqe is None: cqe = Mlx5Cqe64(cqe_start_addr) if (cqe.opcode == dve.MLX5_CQE_INVALID) or \ (cqe.owner ^ cq_owner_flip) or cqe.is_empty(): if time.perf_counter() - start_poll_t >= POLL_CQ_TIMEOUT: raise PyverbsRDMAError(f'CQE #{self.qattr.cq.cons_idx} ' f'is empty or invalid:\n{cqe.dump()}') cqe = None # After CQE ownership check, must do memory barrier and re-read the CQE. dma.udma_from_dev_barrier() cqe = Mlx5Cqe64(cqe_start_addr) if cqe.opcode == dve.MLX5_CQE_RESP_ERR: raise PyverbsRDMAError(f'Got a CQE #{self.qattr.cq.cons_idx} ' f'with responder error:\n{cqe.dump()}') elif cqe.opcode == dve.MLX5_CQE_REQ_ERR: raise PyverbsRDMAError(f'Got a CQE #{self.qattr.cq.cons_idx} ' f'with requester error:\n{cqe.dump()}') self.qattr.cq.cons_idx += 1 mem.writebe32(self.umems['cq_dbr'].umem_addr, self.qattr.cq.cons_idx & 0xffffff, MLX5_CQ_SET_CI) return cqe def close_resources(self): for obj in self.devx_objs: if obj: obj.close() class Mlx5DevxTrafficBase(Mlx5RDMATestCase): """ A base class for mlx5 DevX traffic tests. 
This class does not include any tests; it provides quick creation of the players (client, server) and a traffic method. """ def tearDown(self): if self.server: self.server.close_resources() if self.client: self.client.close_resources() super().tearDown() def create_players(self, resources, **resource_arg): """ Initialize the test resources. :param resources: The RDMA resources to use. :param resource_arg: Dictionary of args that specify the resources' specific attributes. :return: None """ self.server = resources(**self.dev_info, **resource_arg) self.client = resources(**self.dev_info, **resource_arg) self.pre_run() def pre_run(self): self.server.pre_run(self.client.psn, self.client.qpn, self.client.gid, self.client.lid, self.mac_addr) self.client.pre_run(self.server.psn, self.server.qpn, self.server.gid, self.server.lid, self.mac_addr) def invalidate_mr_pages(self): if self.client.with_odp: mem.madvise(self.client.mr.buf, self.client.msg_size) self.client.mem_write('c' * self.client.msg_size, self.client.msg_size) if self.server.with_odp: mem.madvise(self.server.mr.buf, self.server.msg_size) def send_imm_traffic(self): self.client.mem_write('c' * self.client.msg_size, self.client.msg_size) for _ in range(self.client.num_msgs): cons_idx = self.client.qattr.cq.cons_idx self.invalidate_mr_pages() self.server.post_recv() self.client.post_send() # Poll client and verify received cqe opcode send_cqe = self.client.poll_cq() self.assertEqual(send_cqe.opcode, dve.MLX5_CQE_REQ, 'Unexpected CQE opcode') # Poll server and verify received cqe opcode recv_cqe = self.server.poll_cq() self.assertEqual(recv_cqe.opcode, dve.MLX5_CQE_RESP_SEND_IMM, 'Unexpected CQE opcode') msg_received = self.server.mem_read() # Validate data (of received message and immediate value) tests.utils.validate(msg_received, True, self.server.msg_size) imm_inval_pkey = recv_cqe.imm_inval_pkey if sys.byteorder == 'big': imm_inval_pkey = int.from_bytes( imm_inval_pkey.to_bytes(4, byteorder='big'), 'little') self.assertEqual(imm_inval_pkey, self.client.imm + cons_idx) self.server.mem_write('s' * self.server.msg_size, self.server.msg_size) class Mlx5RcResources(RCResources): def __init__(self, dev_name, ib_port, gid_index, **kwargs): self.dv_send_ops_flags = 0 self.send_ops_flags = 0 self.create_send_ops_flags() super().__init__(dev_name, ib_port, gid_index, **kwargs) def create_send_ops_flags(self): self.dv_send_ops_flags = 0 self.send_ops_flags = e.IBV_QP_EX_WITH_SEND def create_context(self): mlx5dv_attr = Mlx5DVContextAttr() try: self.ctx = Mlx5Context(mlx5dv_attr, name=self.dev_name) except PyverbsUserError as ex: raise unittest.SkipTest(f'Could not open mlx5 context ({ex})') except PyverbsRDMAError: raise unittest.SkipTest('Opening mlx5 context is not supported') def create_qp_init_attr(self): comp_mask = e.IBV_QP_INIT_ATTR_PD | e.IBV_QP_INIT_ATTR_SEND_OPS_FLAGS return QPInitAttrEx(cap=self.create_qp_cap(), pd=self.pd, scq=self.cq, rcq=self.cq, qp_type=e.IBV_QPT_RC, send_ops_flags=self.send_ops_flags, comp_mask=comp_mask) def create_qps(self): try: qp_init_attr = self.create_qp_init_attr() comp_mask = dve.MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS if self.dv_send_ops_flags: comp_mask |= dve.MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS attr = Mlx5DVQPInitAttr(comp_mask=comp_mask, send_ops_flags=self.dv_send_ops_flags) qp = Mlx5QP(self.ctx, qp_init_attr, attr) self.qps.append(qp) self.qps_num.append(qp.qp_num) self.psns.append(random.getrandbits(24)) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise
unittest.SkipTest('Create Mlx5DV QP is not supported') raise ex def create_cq(self): """ Initializes self.cq with a dv_cq :return: None """ dvcq_init_attr = Mlx5DVCQInitAttr() try: self.cq = Mlx5CQ(self.ctx, CqInitAttrEx(), dvcq_init_attr) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create Mlx5DV CQ is not supported') raise ex rdma-core-56.1/tests/mlx5_prm_structs.py000066400000000000000000002612551477342711600204130ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia Inc. All rights reserved. See COPYING file """ This module provides scapy based classes that represent the mlx5 PRM structs. """ import unittest try: import logging logging.getLogger("scapy.runtime").setLevel(logging.ERROR) from scapy.packet import Packet from scapy.fields import BitField, ByteField, IntField, IPField, \ ShortField, LongField, StrFixedLenField, PacketField, \ PacketListField, ConditionalField, PadField, FieldListField, MACField, \ MultipleTypeField from scapy.layers.inet6 import IP6Field except ImportError: raise unittest.SkipTest('scapy package is needed in order to run DevX tests') class DevxOps: MLX5_CMD_OP_ALLOC_PD = 0x800 MLX5_CMD_OP_CREATE_CQ = 0x400 MLX5_CMD_OP_QUERY_CQ = 0x402 MLX5_CMD_OP_MODIFY_CQ = 0x403 MLX5_CMD_OP_CREATE_QP = 0x500 MLX5_CMD_OP_QUERY_QP = 0x50b MLX5_CMD_OP_RST2INIT_QP = 0x502 MLX5_CMD_OP_INIT2RTR_QP = 0x503 MLX5_CMD_OP_RTR2RTS_QP = 0x504 MLX5_CMD_OP_RTS2RTS_QP = 0x505 MLX5_CMD_OP_QUERY_HCA_VPORT_CONTEXT = 0x762 MLX5_CMD_OP_QUERY_HCA_VPORT_GID = 0x764 MLX5_QPC_ST_RC = 0X0 MLX5_QPC_PM_STATE_MIGRATED = 0x3 MLX5_CMD_OP_QUERY_HCA_CAP = 0x100 MLX5_CMD_OP_QUERY_QOS_CAP = 0xc MLX5_CMD_OP_QUERY_ODP_CAP = 0x2 MLX5_CMD_OP_ALLOC_FLOW_COUNTER = 0x939 MLX5_CMD_OP_DEALLOC_FLOW_COUNTER = 0x93a MLX5_CMD_OP_QUERY_FLOW_COUNTER = 0x93b MLX5_CMD_OP_CREATE_TIR = 0x900 MLX5_CMD_OP_CREATE_EQ = 0x301 MLX5_CMD_OP_MAD_IFC = 0x50d MLX5_CMD_OP_ACCESS_REGISTER_PAOS = 0x5006 MLX5_CMD_OP_ACCESS_REG = 0x805 MLX5_CMD_OP_CREATE_MKEY = 0x200 class ActionType: SET_ACTION = 0x1 ADD_ACTION = 0x2 COPY_ACTION = 0x3 class PRMPacket(Packet): def extract_padding(self, p): return "", p # Common class SwPas(PRMPacket): fields_desc = [ IntField('pa_h', 0), BitField('pa_l', 0, 20), BitField('reserved1', 0, 12), ] # PD class AllocPdIn(PRMPacket): fields_desc = [ ShortField('opcode', DevxOps.MLX5_CMD_OP_ALLOC_PD), ShortField('uid', 0), ShortField('reserved1', 0), ShortField('op_mod', 0), StrFixedLenField('reserved2', None, length=8), ] class AllocPdOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), ByteField('reserved2', 0), BitField('pd', 0, 24), StrFixedLenField('reserved3', None, length=4), ] # CQ class CmdInputFieldSelectResizeCq(PRMPacket): fields_desc = [ BitField('reserved1', 0, 28), BitField('umem', 0, 1), BitField('log_page_size', 0, 1), BitField('page_offset', 0, 1), BitField('log_cq_size', 0, 1), ] class CmdInputFieldSelectModifyCqFields(PRMPacket): fields_desc = [ BitField('reserved_0', 0, 26), BitField('status', 0, 1), BitField('cq_period_mode', 0, 1), BitField('c_eqn', 0, 1), BitField('oi', 0, 1), BitField('cq_max_count', 0, 1), BitField('cq_period', 0, 1), ] class SwCqc(PRMPacket): fields_desc = [ BitField('status', 0, 4), BitField('as_notify', 0, 1), BitField('initiator_src_dct', 0, 1), BitField('dbr_umem_valid', 0, 1), BitField('reserved1', 0, 1), BitField('cqe_sz', 0, 3), BitField('cc', 0, 1), BitField('reserved2', 0, 1), 
BitField('scqe_break_moderation_en', 0, 1), BitField('oi', 0, 1), BitField('cq_period_mode', 0, 2), BitField('cqe_compression_en', 0, 1), BitField('mini_cqe_res_format', 0, 2), BitField('st', 0, 4), ByteField('reserved3', 0), IntField('dbr_umem_id', 0), BitField('reserved4', 0, 20), BitField('page_offset', 0, 6), BitField('reserved5', 0, 6), BitField('reserved6', 0, 3), BitField('log_cq_size', 0, 5), BitField('uar_page', 0, 24), BitField('reserved7', 0, 4), BitField('cq_period', 0, 12), ShortField('cq_max_count', 0), BitField('reserved8', 0, 24), ByteField('c_eqn', 0), BitField('reserved9', 0, 3), BitField('log_page_size', 0, 5), BitField('reserved10', 0, 24), StrFixedLenField('reserved11', None, length=4), ByteField('reserved12', 0), BitField('last_notified_index', 0, 24), ByteField('reserved13', 0), BitField('last_solicit_index', 0, 24), ByteField('reserved14', 0), BitField('consumer_counter', 0, 24), ByteField('reserved15', 0), BitField('producer_counter', 0, 24), BitField('local_partition_id', 0, 12), BitField('process_id', 0, 20), ShortField('reserved16', 0), ShortField('thread_id', 0), IntField('db_record_addr_63_32', 0), BitField('db_record_addr_31_3', 0, 29), BitField('reserved17', 0, 3), ] class CreateCqIn(PRMPacket): fields_desc = [ ShortField('opcode', DevxOps.MLX5_CMD_OP_CREATE_CQ), ShortField('uid', 0), ShortField('reserved1', 0), ShortField('op_mod', 0), ByteField('reserved2', 0), BitField('cqn', 0, 24), StrFixedLenField('reserved3', None, length=4), PacketField('sw_cqc', SwCqc(), SwCqc), LongField('e_mtt_pointer_or_cq_umem_offset', 0), IntField('cq_umem_id', 0), BitField('cq_umem_valid', 0, 1), BitField('reserved4', 0, 31), StrFixedLenField('reserved5', None, length=176), PacketListField('pas', [SwPas() for x in range(0)], SwPas, count_from=lambda pkt: 0), ] class CreateCqOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), ByteField('reserved2', 0), BitField('cqn', 0, 24), StrFixedLenField('reserved3', None, length=4), ] # QP class SwAds(PRMPacket): fields_desc = [ BitField('fl', 0, 1), BitField('free_ar', 0, 1), BitField('reserved1', 0, 14), ShortField('pkey_index', 0), ByteField('reserved2', 0), BitField('grh', 0, 1), BitField('mlid', 0, 7), ShortField('rlid', 0), BitField('ack_timeout', 0, 5), BitField('reserved3', 0, 3), ByteField('src_addr_index', 0), BitField('log_rtm', 0, 4), BitField('stat_rate', 0, 4), ByteField('hop_limit', 0), BitField('reserved4', 0, 4), BitField('tclass', 0, 8), BitField('flow_label', 0, 20), FieldListField('rgid_rip', [0 for x in range(4)], IntField('', 0), count_from=lambda pkt: 4), BitField('reserved5', 0, 4), BitField('f_dscp', 0, 1), BitField('f_ecn', 0, 1), BitField('reserved6', 0, 1), BitField('f_eth_prio', 0, 1), BitField('ecn', 0, 2), BitField('dscp', 0, 6), ShortField('udp_sport', 0), BitField('dei_cfi_reserved_from_prm_041', 0, 1), BitField('eth_prio', 0, 3), BitField('sl', 0, 4), ByteField('vhca_port_num', 0), MACField('rmac', '00:00:00:00:00:00'), ] class SwQpc(PRMPacket): fields_desc = [ BitField('state', 0, 4), BitField('lag_tx_port_affinity', 0, 4), ByteField('st', 0), BitField('reserved1', 0, 3), BitField('pm_state', 0, 2), BitField('reserved2', 0, 1), BitField('req_e2e_credit_mode', 0, 2), BitField('offload_type', 0, 4), BitField('end_padding_mode', 0, 2), BitField('reserved3', 0, 2), BitField('wq_signature', 0, 1), BitField('block_lb_mc', 0, 1), BitField('atomic_like_write_en', 0, 1), BitField('latency_sensitive', 0, 1), BitField('dual_write', 0, 1), 
BitField('drain_sigerr', 0, 1), BitField('multi_path', 0, 1), BitField('reserved4', 0, 1), BitField('pd', 0, 24), BitField('mtu', 0, 3), BitField('log_msg_max', 0, 5), BitField('reserved5', 0, 1), BitField('log_rq_size', 0, 4), BitField('log_rq_stride', 0, 3), BitField('no_sq', 0, 1), BitField('log_sq_size', 0, 4), BitField('reserved6', 0, 1), BitField('retry_mode', 0, 2), BitField('ts_format', 0, 2), BitField('data_in_order', 0, 1), BitField('rlkey', 0, 1), BitField('ulp_stateless_offload_mode', 0, 4), ByteField('counter_set_id', 0), BitField('uar_page', 0, 24), BitField('send_dbr_mode', 0, 2), BitField('reserved7', 0, 1), BitField('full_handshake', 0, 1), BitField('cnak_reverse_sl', 0, 4), BitField('user_index', 0, 24), BitField('reserved8', 0, 3), BitField('log_page_size', 0, 5), BitField('remote_qpn', 0, 24), PacketField('primary_address_path', SwAds(), SwAds), PacketField('secondary_address_path', SwAds(), SwAds), BitField('log_ack_req_freq', 0, 4), BitField('reserved9', 0, 4), BitField('log_sra_max', 0, 3), BitField('extended_rnr_retry_valid', 0, 1), BitField('reserved10', 0, 1), BitField('retry_count', 0, 3), BitField('rnr_retry', 0, 3), BitField('extended_retry_count_valid', 0, 1), BitField('fre', 0, 1), BitField('cur_rnr_retry', 0, 3), BitField('cur_retry_count', 0, 3), BitField('extended_log_rnr_retry', 0, 5), ShortField('extended_cur_rnr_retry', 0), ShortField('packet_pacing_rate_limit_index', 0), ByteField('reserved11', 0), BitField('next_send_psn', 0, 24), ByteField('reserved12', 0), BitField('cqn_snd', 0, 24), ByteField('reserved13', 0), BitField('deth_sqpn', 0, 24), ByteField('reserved14', 0), ByteField('extended_retry_count', 0), ByteField('reserved15', 0), ByteField('extended_cur_retry_count', 0), ByteField('reserved16', 0), BitField('last_acked_psn', 0, 24), ByteField('reserved17', 0), BitField('ssn', 0, 24), ByteField('reserved18', 0), BitField('log_rra_max', 0, 3), BitField('reserved19', 0, 1), BitField('atomic_mode', 0, 4), BitField('rre', 0, 1), BitField('rwe', 0, 1), BitField('rae', 0, 1), BitField('reserved20', 0, 1), BitField('page_offset', 0, 6), BitField('reserved21', 0, 3), BitField('cd_slave_receive', 0, 1), BitField('cd_slave_send', 0, 1), BitField('cd_master', 0, 1), BitField('reserved22', 0, 3), BitField('min_rnr_nak', 0, 5), BitField('next_rcv_psn', 0, 24), ByteField('reserved23', 0), BitField('xrcd', 0, 24), ByteField('reserved24', 0), BitField('cqn_rcv', 0, 24), LongField('dbr_addr', 0), IntField('q_key', 0), BitField('reserved25', 0, 5), BitField('rq_type', 0, 3), BitField('srqn_rmpn_xrqn', 0, 24), ByteField('reserved26', 0), BitField('rmsn', 0, 24), ShortField('hw_sq_wqebb_counter', 0), ShortField('sw_sq_wqebb_counter', 0), IntField('hw_rq_counter', 0), IntField('sw_rq_counter', 0), ByteField('reserved27', 0), BitField('roce_adp_retrans_rtt', 0, 24), BitField('reserved28', 0, 15), BitField('cgs', 0, 1), ByteField('cs_req', 0), ByteField('cs_res', 0), LongField('dc_access_key', 0), BitField('rdma_active', 0, 1), BitField('comm_est', 0, 1), BitField('suspended', 0, 1), BitField('dbr_umem_valid', 0, 1), BitField('reserved29', 0, 4), BitField('send_msg_psn', 0, 24), ByteField('reserved30', 0), BitField('rcv_msg_psn', 0, 24), LongField('rdma_va', 0), IntField('rdma_key', 0), IntField('dbr_umem_id', 0), ] class CreateQpIn(PRMPacket): fields_desc = [ ShortField('opcode', DevxOps.MLX5_CMD_OP_CREATE_QP), ShortField('uid', 0), ShortField('reserved1', 0), ShortField('op_mod', 0), ByteField('reserved2', 0), BitField('input_qpn', 0, 24), BitField('reserved3', 0, 1), 
BitField('cmd_on_behalf', 0, 1), BitField('reserved4', 0, 14), ShortField('vhca_id', 0), IntField('opt_param_mask', 0), StrFixedLenField('reserved5', None, length=4), PacketField('sw_qpc', SwQpc(), SwQpc), LongField('e_mtt_pointer_or_wq_umem_offset', 0), IntField('wq_umem_id', 0), BitField('wq_umem_valid', 0, 1), BitField('reserved6', 0, 31), PacketListField('pas', [SwPas() for x in range(0)], SwPas, count_from=lambda pkt: 0), ] class CreateQpOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), ByteField('reserved2', 0), BitField('qpn', 0, 24), StrFixedLenField('reserved3', None, length=4), ] class ModifyQpIn(PRMPacket): fields_desc = [ ShortField('opcode', 0), ShortField('uid', 0), ShortField('vhca_tunnel_id', 0), ShortField('op_mod', 0), ByteField('reserved2', 0), BitField('qpn', 0, 24), IntField('reserved3', 0), IntField('opt_param_mask', 0), IntField('ece', 0), PacketField('sw_qpc', SwQpc(), SwQpc), StrFixedLenField('reserved4', None, length=16), ] class ModifyQpOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), StrFixedLenField('reserved2', None, length=8), ] class QueryQpIn(PRMPacket): fields_desc = [ ShortField('opcode', DevxOps.MLX5_CMD_OP_QUERY_QP), ShortField('uid', 0), ShortField('reserved1', 0), ShortField('op_mod', 0), ByteField('reserved2', 0), BitField('qpn', 0, 24), StrFixedLenField('reserved3', None, length=4), ] class QueryQpOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), StrFixedLenField('reserved2', None, length=8), IntField('opt_param_mask', 0), StrFixedLenField('reserved3', None, length=4), PacketField('sw_qpc', SwQpc(), SwQpc), LongField('e_mtt_pointer', 0), StrFixedLenField('reserved4', None, length=8), PacketListField('pas', [SwPas() for x in range(0)], SwPas, count_from=lambda pkt: 0), ] # EQ class SwEqc(PRMPacket): fields_desc = [ BitField('status', 0, 4), BitField('reserved1', 0, 9), BitField('ec', 0, 1), BitField('oi', 0, 1), BitField('reserved2', 0, 5), BitField('st', 0, 4), ByteField('reserved3', 0), StrFixedLenField('reserved4', None, length=4), BitField('reserved5', 0, 20), BitField('page_offset', 0, 6), BitField('reserved6', 0, 6), BitField('reserved7', 0, 3), BitField('log_eq_size', 0, 5), BitField('uar_page', 0, 24), StrFixedLenField('reserved8', None, length=4), BitField('reserved9', 0, 20), BitField('intr', 0, 12), BitField('reserved10', 0, 3), BitField('log_page_size', 0, 5), BitField('reserved11', 0, 24), StrFixedLenField('reserved12', None, length=12), ByteField('reserved13', 0), BitField('consumer_counter', 0, 24), ByteField('reserved14', 0), BitField('producer_counter', 0, 24), StrFixedLenField('reserved15', None, length=16), ] class CreateEqIn(PRMPacket): fields_desc = [ ShortField('opcode', DevxOps.MLX5_CMD_OP_CREATE_EQ), ShortField('uid', 0), ShortField('reserved1', 0), ShortField('op_mod', 0), BitField('reserved2', 0, 24), ByteField('eqn', 0), StrFixedLenField('reserved3', None, length=4), PacketField('sw_eqc', SwEqc(), SwEqc), LongField('e_mtt_pointer', 0), LongField('event_bitmask_63_0', 0), LongField('event_bitmask_127_64', 0), LongField('event_bitmask_191_128', 0), LongField('event_bitmask_255_192', 0), StrFixedLenField('reserved4', None, length=152), ] class CreateEqOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), BitField('reserved2', 0, 24), ByteField('eqn', 0), StrFixedLenField('reserved3', None,
length=4), ] class IbSmp(PRMPacket): fields_desc = [ ByteField('base_version', 0), ByteField('mgmt_class', 0), ByteField('class_version', 0), ByteField('method', 0), ShortField('status', 0), ByteField('hop_ptr', 0), ByteField('hop_cnt', 0), LongField('tid', 0), ShortField('attr_id', 0), ShortField('resv', 0), IntField('attr_mod', 0), LongField('mkey', 0), ShortField('dr_slid', 0), ShortField('dr_dlid', 0), FieldListField('reserved', [0 for x in range(7)], IntField('', 0), count_from=lambda pkt: 7), FieldListField('data', [0 for x in range(64)], ByteField('', 0), count_from=lambda pkt: 64), FieldListField('initial_path', [0 for x in range(64)], ByteField('', 0), count_from=lambda pkt: 64), FieldListField('return_path', [0 for x in range(64)], ByteField('', 0), count_from=lambda pkt: 64), ] class MadIfcIn(PRMPacket): fields_desc = [ ShortField('opcode', DevxOps.MLX5_CMD_OP_MAD_IFC), ShortField('uid', 0), ShortField('reserved1', 0), ShortField('op_mod', 0), ShortField('remote_lid', 0), ByteField('reserved2', 0), ByteField('port', 0), StrFixedLenField('reserved3', None, length=4), StrFixedLenField('mad', None, length=256), ] class MadIfcOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), StrFixedLenField('reserved2', None, length=8), StrFixedLenField('mad', None, length=256), ] class PaosReg(PRMPacket): fields_desc = [ ByteField('swid', 0), ByteField('local_port', 0), BitField('reserved1', 0, 4), BitField('admin_status', 0, 4), BitField('reserved2', 0, 4), BitField('oper_status', 0, 4), BitField('ase', 0, 1), BitField('ee', 0, 1), BitField('reserved3', 0, 21), BitField('fd', 0, 1), BitField('reserved4', 0, 6), BitField('e', 0, 2), StrFixedLenField('reserved5', None, length=8), ] class AccessPaosRegisterIn(PRMPacket): fields_desc = [ ShortField('opcode', DevxOps.MLX5_CMD_OP_ACCESS_REG), ShortField('uid', 0), ShortField('reserved1', 0), ShortField('op_mod', 0), ShortField('reserved2', 0), ShortField('register_id', 0), IntField('argument', 0), PacketField('data', PaosReg(), PaosReg), ] class AccessPaosRegisterOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), StrFixedLenField('reserved2', None, length=8), PacketField('data', PaosReg(), PaosReg), ] # EQE class EventType: COMPLETION_EVENTS = 0X0 CQ_ERROR = 0X4 PORT_STATE_CHANGE = 0X9 class AffiliatedEventHeader(PRMPacket): fields_desc = [ ShortField('reserved1', 0), ShortField('obj_type', 0), IntField('obj_id', 0), ] class CompEvent(PRMPacket): fields_desc = [ StrFixedLenField('reserved1', None, length=24), ByteField('reserved2', 0), BitField('cqn', 0, 24), ] class CqError(PRMPacket): fields_desc = [ ByteField('reserved1', 0), BitField('cqn', 0, 24), StrFixedLenField('reserved2', None, length=4), BitField('reserved3', 0, 24), ByteField('syndrome', 0), StrFixedLenField('reserved4', None, length=16), ] class PortStateChangeEvent(PRMPacket): fields_desc = [ StrFixedLenField('reserved1', None, length=8), BitField('port_num', 0, 4), BitField('reserved2', 0, 28), StrFixedLenField('reserved3', None, length=16), ] class SwEqe(PRMPacket): fields_desc = [ ByteField('reserved1', 0), ByteField('event_type', 0), ByteField('reserved2', 0), ByteField('event_sub_type', 0), StrFixedLenField('reserved3', None, length=28), MultipleTypeField( [ (PadField(PacketField('event_data', CompEvent(), CompEvent), 28, padwith=b"\x00"), lambda pkt: pkt.event_type == EventType.COMPLETION_EVENTS), (PadField(PacketField('event_data', CqError(), CqError), 28, 
padwith=b"\x00"), lambda pkt: pkt.event_type == EventType.CQ_ERROR), (PadField(PacketField('event_data', PortStateChangeEvent(), PortStateChangeEvent), 28, padwith=b"\x00"), lambda pkt: pkt.event_type == EventType.PORT_STATE_CHANGE), ], StrFixedLenField('event_data', None, length=28) # By default ), ShortField('reserved4', 0), ByteField('signature', 0), BitField('reserved5', 0, 7), BitField('owner', 0, 1), ] # Query HCA VPORT Context class QueryHcaVportContextIn(PRMPacket): fields_desc = [ ShortField('opcode', DevxOps.MLX5_CMD_OP_QUERY_HCA_VPORT_CONTEXT), ShortField('uid', 0), ShortField('reserved1', 0), ShortField('op_mod', 0), BitField('other_vport', 0, 1), BitField('reserved2', 0, 11), BitField('port_num', 0, 4), ShortField('vport_number', 0), StrFixedLenField('reserved3', None, length=4), ] class HcaVportContext(PRMPacket): fields_desc = [ IntField('field_select', 0), StrFixedLenField('reserved1', None, length=28), BitField('sm_virt_aware', 0, 1), BitField('has_smi', 0, 1), BitField('has_raw', 0, 1), BitField('grh_required', 0, 1), BitField('reserved2', 0, 1), BitField('min_wqe_inline_mode', 0, 3), ByteField('reserved3', 0), BitField('port_physical_state', 0, 4), BitField('vport_state_policy', 0, 4), BitField('port_state', 0, 4), BitField('vport_state', 0, 4), StrFixedLenField('reserved4', None, length=4), LongField('system_image_guid', 0), LongField('port_guid', 0), LongField('node_guid', 0), IntField('cap_mask1', 0), IntField('cap_mask1_field_select', 0), IntField('cap_mask2', 0), IntField('cap_mask2_field_select', 0), ShortField('reserved5', 0), ShortField('ooo_sl_mask', 0), StrFixedLenField('reserved6', None, length=12), ShortField('lid', 0), BitField('reserved7', 0, 4), BitField('init_type_reply', 0, 4), BitField('lmc', 0, 3), BitField('subnet_timeout', 0, 5), ShortField('sm_lid', 0), BitField('sm_sl', 0, 4), BitField('reserved8', 0, 12), ShortField('qkey_violation_counter', 0), ShortField('pkey_violation_counter', 0), StrFixedLenField('reserved9', None, length=404), ] class QueryHcaVportContextOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), StrFixedLenField('reserved2', None, length=8), PacketField('hca_vport_context', HcaVportContext(), HcaVportContext), ] # Query HCA VPORT GID class QueryHcaVportGidIn(PRMPacket): fields_desc = [ ShortField('opcode', DevxOps.MLX5_CMD_OP_QUERY_HCA_VPORT_GID), ShortField('uid', 0), ShortField('reserved1', 0), ShortField('op_mod', 0), BitField('other_vport', 0, 1), BitField('reserved2', 0, 11), BitField('port_num', 0, 4), ShortField('vport_number', 0), ShortField('reserved3', 0), ShortField('gid_index', 0), ] class IbGidCmd(PRMPacket): fields_desc = [ LongField('prefix', 0), LongField('guid', 0), ] class QueryHcaVportGidOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), StrFixedLenField('reserved2', None, length=4), ShortField('gids_num', 0), ShortField('reserved3', 0), PacketField('gid0', IbGidCmd(), IbGidCmd), ] class QueryHcaCapOp: HCA_CAP_2 = 0X20 HCA_NIC_FLOW_TABLE_CAP = 0x7 class QueryHcaCapMod: MAX = 0x0 CURRENT = 0x1 class SendDbrMode: DBR_VALID = 0x0 NO_DBR_EXT = 0x1 NO_DBR_INT = 0x2 # Query HCA CAP class QueryHcaCapIn(PRMPacket): fields_desc = [ ShortField('opcode', DevxOps.MLX5_CMD_OP_QUERY_HCA_CAP), ShortField('uid', 0), ShortField('reserved1', 0), ShortField('op_mod', 0), BitField('other_function', 0, 1), BitField('reserved2', 0, 15), ShortField('function_id', 0), StrFixedLenField('reserved3', None, length=4), ] 
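# Hedged usage sketch (illustrative, not part of the PRM definitions): every
# command in this module follows the same pattern the DevX tests use -- fill a
# <Cmd>In packet, execute it with Mlx5Context.devx_general_cmd() and parse the
# returned bytes with the matching <Cmd>Out class. The helper below is a
# hypothetical example; `ctx` is assumed to be an already-opened, DevX-enabled
# Mlx5Context. For QUERY_HCA_CAP, op_mod encodes
# (capability_type << 1) | (max=0x0 / current=0x1), so the general caps use
# op_mod=0x1 and HCA CAP 2 uses (HCA_CAP_2 << 0x1) | CURRENT.
def example_query_hca_caps(ctx):
    """Return the parsed current general HCA caps (and HCA caps 2 if supported)."""
    cmd_in = QueryHcaCapIn(op_mod=QueryHcaCapMod.CURRENT)
    cmd_out = QueryCmdHcaCapOut(ctx.devx_general_cmd(cmd_in,
                                                     len(QueryCmdHcaCapOut())))
    if cmd_out.status:
        raise RuntimeError(f'QUERY_HCA_CAP failed with syndrome ({cmd_out.syndrome})')
    caps2 = None
    if cmd_out.capability.hca_cap_2:
        cmd2_in = QueryHcaCapIn(op_mod=(QueryHcaCapOp.HCA_CAP_2 << 0x1) |
                                QueryHcaCapMod.CURRENT)
        cmd2_out = QueryCmdHcaCap2Out(ctx.devx_general_cmd(
            cmd2_in, len(QueryCmdHcaCap2Out())))
        caps2 = cmd2_out.capability
    return cmd_out.capability, caps2
# Example use: caps, caps2 = example_query_hca_caps(ctx); print(caps.log_max_qp)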
class CmdHcaCap(PRMPacket): fields_desc = [ BitField('access_other_hca_roce', 0, 1), BitField('reserved1', 0, 30), BitField('vhca_resource_manager', 0, 1), BitField('hca_cap_2', 0, 1), BitField('reserved2', 0, 2), BitField('event_on_vhca_state_teardown_request', 0, 1), BitField('event_on_vhca_state_in_use', 0, 1), BitField('event_on_vhca_state_active', 0, 1), BitField('event_on_vhca_state_allocated', 0, 1), BitField('event_on_vhca_state_invalid', 0, 1), ByteField('transpose_max_element_size', 0), ShortField('vhca_id', 0), ByteField('transpose_max_cols', 0), ByteField('transpose_max_rows', 0), ShortField('transpose_max_size', 0), BitField('reserved3', 0, 1), BitField('sw_steering_icm_large_scale_steering', 0, 1), BitField('qp_data_in_order', 0, 1), BitField('log_regexp_scatter_gather_size', 0, 5), BitField('reserved4', 0, 3), BitField('log_dma_mmo_max_size', 0, 5), BitField('relaxed_ordering_write_pci_enabled', 0, 1), BitField('reserved5', 0, 2), BitField('log_compress_max_size', 0, 5), BitField('reserved6', 0, 3), BitField('log_decompress_max_size', 0, 5), ByteField('log_max_srq_sz', 0), ByteField('log_max_qp_sz', 0), BitField('event_cap', 0, 1), BitField('reserved7', 0, 2), BitField('isolate_vl_tc_new', 0, 1), BitField('reserved8', 0, 2), BitField('nvmeotcp', 0, 1), BitField('pcie_hanged', 0, 1), BitField('prio_tag_required', 0, 1), BitField('wqe_index_ignore_cap', 0, 1), BitField('reserved9', 0, 1), BitField('log_max_qp', 0, 5), BitField('regexp', 0, 1), BitField('regexp_params', 0, 1), BitField('regexp_alloc_onbehalf_umem', 0, 1), BitField('ece', 0, 1), BitField('regexp_num_of_engines', 0, 4), BitField('allow_pause_tx', 0, 1), BitField('reg_c_preserve', 0, 1), BitField('isolate_vl_tc', 0, 1), BitField('log_max_srqs', 0, 5), BitField('psp', 0, 1), BitField('reserved10', 0, 1), BitField('ts_cqe_to_dest_cqn', 0, 1), BitField('regexp_log_crspace_size', 0, 5), BitField('selective_repeat', 0, 1), BitField('go_back_n', 0, 1), BitField('reserved11', 0, 1), BitField('scatter_fcs_w_decap_disable', 0, 1), BitField('reserved12', 0, 4), ByteField('max_sgl_for_optimized_performance', 0), ByteField('log_max_cq_sz', 0), BitField('relaxed_ordering_write_umr', 0, 1), BitField('relaxed_ordering_read_umr', 0, 1), BitField('access_register_user', 0, 1), BitField('reserved13', 0, 5), BitField('upt_device_emulation_manager', 0, 1), BitField('virtio_net_device_emulation_manager', 0, 1), BitField('virtio_blk_device_emulation_manager', 0, 1), BitField('log_max_cq', 0, 5), ByteField('log_max_eq_sz', 0), BitField('relaxed_ordering_write', 0, 1), BitField('relaxed_ordering_read', 0, 1), BitField('log_max_mkey', 0, 6), BitField('tunneled_atomic', 0, 1), BitField('as_notify', 0, 1), BitField('m_pci_port', 0, 1), BitField('m_vhca_mk', 0, 1), BitField('hotplug_manager', 0, 1), BitField('nvme_device_emulation_manager', 0, 1), BitField('terminate_scatter_list_mkey', 0, 1), BitField('repeated_mkey', 0, 1), BitField('dump_fill_mkey', 0, 1), BitField('dpp', 0, 1), BitField('resources_on_nvme_emulation_manager', 0, 1), BitField('fast_teardown', 0, 1), BitField('log_max_eq', 0, 4), ByteField('max_indirection', 0), BitField('fixed_buffer_size', 0, 1), BitField('log_max_mrw_sz', 0, 7), BitField('force_teardown', 0, 1), BitField('prepare_fast_teardown_allways_1', 0, 1), BitField('log_max_bsf_list_size', 0, 6), BitField('umr_extended_translation_offset', 0, 1), BitField('null_mkey', 0, 1), BitField('log_max_klm_list_size', 0, 6), BitField('non_wire_sq', 0, 1), BitField('ats_ro_dependence', 0, 1), BitField('qp_context_extension', 
0, 1), BitField('log_max_static_sq_wq_size', 0, 5), BitField('resources_on_virtio_net_emulation_manager', 0, 1), BitField('resources_on_virtio_blk_emulation_manager', 0, 1), BitField('log_max_ra_req_dc', 0, 6), BitField('vhca_trust_level_reg', 0, 1), BitField('eth_wqe_too_small_mode', 0, 1), BitField('vnic_env_eth_wqe_too_small', 0, 1), BitField('log_max_static_sq_wq', 0, 5), BitField('ooo_sl_mask', 0, 1), BitField('vnic_env_cq_overrun', 0, 1), BitField('log_max_ra_res_dc', 0, 6), BitField('cc_roce_ecn_rp_classify_mode', 0, 1), BitField('cc_roce_ecn_rp_dynamic_rtt', 0, 1), BitField('cc_roce_ecn_rp_dynamic_ai', 0, 1), BitField('cc_roce_ecn_rp_dynamic_g', 0, 1), BitField('cc_roce_ecn_rp_burst_decouple', 0, 1), BitField('release_all_pages', 0, 1), BitField('depracated_do_not_use', 0, 1), BitField('sig_crc64_xp10', 0, 1), BitField('sig_crc32c', 0, 1), BitField('roce_accl', 0, 1), BitField('log_max_ra_req_qp', 0, 6), BitField('reserved14', 0, 1), BitField('rts2rts_udp_sport', 0, 1), BitField('rts2rts_lag_tx_port_affinity', 0, 1), BitField('dma_mmo', 0, 1), BitField('compress_min_block_size', 0, 4), BitField('compress', 0, 1), BitField('decompress', 0, 1), BitField('log_max_ra_res_qp', 0, 6), BitField('end_pad', 0, 1), BitField('cc_query_allowed', 0, 1), BitField('cc_modify_allowed', 0, 1), BitField('start_pad', 0, 1), BitField('cache_line_128byte', 0, 1), BitField('gid_table_size_ro', 0, 1), BitField('pkey_table_size_ro', 0, 1), BitField('rts2rts_qp_rmp', 0, 1), BitField('rnr_nak_q_counters', 0, 1), BitField('rts2rts_qp_counters_set_id', 0, 1), BitField('rts2rts_qp_dscp', 0, 1), BitField('gen3_cc_negotiation', 0, 1), BitField('vnic_env_int_rq_oob', 0, 1), BitField('sbcam_reg', 0, 1), BitField('cwcam_reg', 0, 1), BitField('qcam_reg', 0, 1), ShortField('gid_table_size', 0), BitField('out_of_seq_cnt', 0, 1), BitField('vport_counters', 0, 1), BitField('retransmission_q_counters', 0, 1), BitField('debug', 0, 1), BitField('modify_rq_counters_set_id', 0, 1), BitField('rq_delay_drop', 0, 1), BitField('max_qp_cnt', 0, 10), ShortField('pkey_table_size', 0), BitField('vport_group_manager', 0, 1), BitField('vhca_group_manager', 0, 1), BitField('ib_virt', 0, 1), BitField('eth_virt', 0, 1), BitField('vnic_env_queue_counters', 0, 1), BitField('ets', 0, 1), BitField('nic_flow_table', 0, 1), BitField('eswitch_manager', 0, 1), BitField('device_memory', 0, 1), BitField('mcam_reg', 0, 1), BitField('pcam_reg', 0, 1), BitField('local_ca_ack_delay', 0, 5), BitField('port_module_event', 0, 1), BitField('enhanced_retransmission_q_counters', 0, 1), BitField('port_checks', 0, 1), BitField('pulse_gen_control', 0, 1), BitField('disable_link_up_by_init_hca', 0, 1), BitField('beacon_led', 0, 1), BitField('port_type', 0, 2), ByteField('num_ports', 0), BitField('snapshot', 0, 1), BitField('pps', 0, 1), BitField('pps_modify', 0, 1), BitField('log_max_msg', 0, 5), BitField('multi_path_xrc_rdma', 0, 1), BitField('multi_path_dc_rdma', 0, 1), BitField('multi_path_rc_rdma', 0, 1), BitField('traffic_fast_control', 0, 1), BitField('max_tc', 0, 4), BitField('temp_warn_event', 0, 1), BitField('dcbx', 0, 1), BitField('general_notification_event', 0, 1), BitField('multi_prio_sq', 0, 1), BitField('afu_owner', 0, 1), BitField('fpga', 0, 1), BitField('rol_s', 0, 1), BitField('rol_g', 0, 1), BitField('ib_port_sniffer', 0, 1), BitField('wol_s', 0, 1), BitField('wol_g', 0, 1), BitField('wol_a', 0, 1), BitField('wol_b', 0, 1), BitField('wol_m', 0, 1), BitField('wol_u', 0, 1), BitField('wol_p', 0, 1), ShortField('stat_rate_support', 0), 
BitField('sig_block_4048', 0, 1), BitField('pci_sync_for_fw_update_event', 0, 1), BitField('init2rtr_drain_sigerr', 0, 1), BitField('log_max_extended_rnr_retry', 0, 5), BitField('init2_lag_tx_port_affinity', 0, 1), BitField('flow_group_type_hash_split', 0, 1), BitField('reserved15', 0, 1), BitField('wqe_based_flow_table_update', 0, 1), BitField('cqe_version', 0, 4), BitField('compact_address_vector', 0, 1), BitField('eth_striding_wq', 0, 1), BitField('reserved16', 0, 1), BitField('ipoib_enhanced_offloads', 0, 1), BitField('ipoib_basic_offloads', 0, 1), BitField('ib_link_list_striding_wq', 0, 1), BitField('repeated_block_disabled', 0, 1), BitField('umr_modify_entity_size_disabled', 0, 1), BitField('umr_modify_atomic_disabled', 0, 1), BitField('umr_indirect_mkey_disabled', 0, 1), BitField('umr_fence', 0, 2), BitField('dc_req_sctr_data_cqe', 0, 1), BitField('dc_connect_qp', 0, 1), BitField('dc_cnak_trace', 0, 1), BitField('drain_sigerr', 0, 1), BitField('cmdif_checksum', 0, 2), BitField('sigerr_cqe', 0, 1), BitField('e_psv', 0, 1), BitField('wq_signature', 0, 1), BitField('sctr_data_cqe', 0, 1), BitField('bsf_in_create_mkey', 0, 1), BitField('sho', 0, 1), BitField('tph', 0, 1), BitField('rf', 0, 1), BitField('dct', 0, 1), BitField('qos', 0, 1), BitField('eth_net_offloads', 0, 1), BitField('roce', 0, 1), BitField('atomic', 0, 1), BitField('extended_retry_count', 0, 1), BitField('cq_oi', 0, 1), BitField('cq_resize', 0, 1), BitField('cq_moderation', 0, 1), BitField('cq_period_mode_modify', 0, 1), BitField('cq_invalidate', 0, 1), BitField('reserved17', 0, 1), BitField('cq_eq_remap', 0, 1), BitField('pg', 0, 1), BitField('block_lb_mc', 0, 1), BitField('exponential_backoff', 0, 1), BitField('scqe_break_moderation', 0, 1), BitField('cq_period_start_from_cqe', 0, 1), BitField('cd', 0, 1), BitField('atm', 0, 1), BitField('apm', 0, 1), BitField('vector_calc', 0, 1), BitField('umr_ptr_rlkey', 0, 1), BitField('imaicl', 0, 1), BitField('qp_packet_based', 0, 1), BitField('ib_cyclic_striding_wq', 0, 1), BitField('ipoib_enhanced_pkey_change', 0, 1), BitField('initiator_src_dct_in_cqe', 0, 1), BitField('qkv', 0, 1), BitField('pkv', 0, 1), BitField('set_deth_sqpn', 0, 1), BitField('rts2rts_primary_sl', 0, 1), BitField('initiator_src_dct', 0, 1), BitField('dc_v2', 0, 1), BitField('xrc', 0, 1), BitField('ud', 0, 1), BitField('uc', 0, 1), BitField('rc', 0, 1), BitField('uar_4k', 0, 1), BitField('reserved18', 0, 7), BitField('fl_rc_qp_when_roce_disabled', 0, 1), BitField('reserved19', 0, 1), BitField('uar_sz', 0, 6), BitField('reserved20', 0, 3), BitField('log_max_dc_cnak_qps', 0, 5), ByteField('log_pg_sz', 0), BitField('bf', 0, 1), BitField('driver_version', 0, 1), BitField('pad_tx_eth_packet', 0, 1), BitField('query_driver_version', 0, 1), BitField('max_qp_retry_freq', 0, 1), BitField('qp_by_name', 0, 1), BitField('mkey_by_name', 0, 1), BitField('reserved21', 0, 4), BitField('log_bf_reg_size', 0, 5), BitField('reserved22', 0, 6), BitField('lag_dct', 0, 2), BitField('lag_tx_port_affinity', 0, 1), BitField('lag_native_fdb_selection', 0, 1), BitField('must_be_0', 0, 1), BitField('lag_master', 0, 1), BitField('num_lag_ports', 0, 4), ShortField('num_of_diagnostic_counters', 0), ShortField('max_wqe_sz_sq', 0), ShortField('reserved23', 0), ShortField('max_wqe_sz_rq', 0), ShortField('max_flow_counter_31_16', 0), ShortField('max_wqe_sz_sq_dc', 0), BitField('reserved24', 0, 7), BitField('max_qp_mcg', 0, 25), ShortField('mlnx_tag_ethertype', 0), ByteField('flow_counter_bulk_alloc', 0), ByteField('log_max_mcg', 0), 
BitField('reserved25', 0, 3), BitField('log_max_transport_domain', 0, 5), BitField('reserved26', 0, 3), BitField('log_max_pd', 0, 5), BitField('reserved27', 0, 11), BitField('log_max_xrcd', 0, 5), BitField('nic_receive_steering_discard', 0, 1), BitField('receive_discard_vport_down', 0, 1), BitField('transmit_discard_vport_down', 0, 1), BitField('eq_overrun_count', 0, 1), BitField('nic_receive_steering_depth', 0, 1), BitField('invalid_command_count', 0, 1), BitField('quota_exceeded_count', 0, 1), BitField('flow_counter_by_name', 0, 1), ByteField('log_max_flow_counter_bulk', 0), ShortField('max_flow_counter_15_0', 0), BitField('modify_tis', 0, 1), BitField('flow_counters_dump', 0, 1), BitField('reserved28', 0, 1), BitField('log_max_rq', 0, 5), BitField('reserved29', 0, 3), BitField('log_max_sq', 0, 5), BitField('reserved30', 0, 3), BitField('log_max_tir', 0, 5), BitField('reserved31', 0, 3), BitField('log_max_tis', 0, 5), BitField('basic_cyclic_rcv_wqe', 0, 1), BitField('reserved32', 0, 2), BitField('log_max_rmp', 0, 5), BitField('reserved33', 0, 3), BitField('log_max_rqt', 0, 5), BitField('reserved34', 0, 3), BitField('log_max_rqt_size', 0, 5), BitField('reserved35', 0, 3), BitField('log_max_tis_per_sq', 0, 5), BitField('ext_stride_num_range', 0, 1), BitField('reserved36', 0, 2), BitField('log_max_stride_sz_rq', 0, 5), BitField('reserved37', 0, 3), BitField('log_min_stride_sz_rq', 0, 5), BitField('reserved38', 0, 3), BitField('log_max_stride_sz_sq', 0, 5), BitField('reserved39', 0, 3), BitField('log_min_stride_sz_sq', 0, 5), BitField('hairpin_eth_raw', 0, 1), BitField('reserved40', 0, 2), BitField('log_max_hairpin_queues', 0, 5), BitField('hairpin_ib_raw', 0, 1), BitField('hairpin_eth2ipoib', 0, 1), BitField('hairpin_ipoib2eth', 0, 1), BitField('log_max_hairpin_wq_data_sz', 0, 5), BitField('reserved41', 0, 3), BitField('log_max_hairpin_num_packets', 0, 5), BitField('reserved42', 0, 3), BitField('log_max_wq_sz', 0, 5), BitField('nic_vport_change_event', 0, 1), BitField('disable_local_lb_uc', 0, 1), BitField('disable_local_lb_mc', 0, 1), BitField('log_min_hairpin_wq_data_sz', 0, 5), BitField('system_image_guid_modifiable', 0, 1), BitField('reserved43', 0, 1), BitField('vhca_state', 0, 1), BitField('log_max_vlan_list', 0, 5), BitField('reserved44', 0, 3), BitField('log_max_current_mc_list', 0, 5), BitField('reserved45', 0, 3), BitField('log_max_current_uc_list', 0, 5), LongField('general_obj_types', 0), BitField('sq_ts_format', 0, 2), BitField('rq_ts_format', 0, 2), BitField('steering_format_version', 0, 4), BitField('create_qp_start_hint', 0, 24), BitField('tls', 0, 1), BitField('ats', 0, 1), BitField('reserved46', 0, 1), BitField('log_max_uctx', 0, 5), BitField('aes_xts', 0, 1), BitField('crypto', 0, 1), BitField('ipsec_offload', 0, 1), BitField('log_max_umem', 0, 5), ShortField('max_num_eqs', 0), BitField('reserved47', 0, 1), BitField('tls_tx', 0, 1), BitField('tls_rx', 0, 1), BitField('log_max_l2_table', 0, 5), ByteField('reserved48', 0), ShortField('log_uar_page_sz', 0), BitField('e', 0, 1), BitField('reserved49', 0, 31), IntField('device_frequency_mhz', 0), IntField('device_frequency_khz', 0), BitField('capi', 0, 1), BitField('create_pec', 0, 1), BitField('nvmf_target_offload', 0, 1), BitField('capi_invalidate', 0, 1), BitField('reserved50', 0, 23), BitField('log_max_pasid', 0, 5), IntField('num_of_uars_per_page', 0), IntField('flex_parser_protocols', 0), ByteField('max_geneve_tlv_options', 0), BitField('reserved51', 0, 3), BitField('max_geneve_tlv_option_data_len', 0, 5), 
BitField('flex_parser_header_modify', 0, 1), BitField('reserved52', 0, 2), BitField('log_max_guaranteed_connections', 0, 5), BitField('reserved53', 0, 3), BitField('log_max_dct_connections', 0, 5), ByteField('log_max_atomic_size_qp', 0), BitField('reserved54', 0, 3), BitField('log_max_dci_stream_channels', 0, 5), BitField('reserved55', 0, 3), BitField('log_max_dci_errored_streams', 0, 5), ByteField('log_max_atomic_size_dc', 0), ShortField('max_multi_user_group_size', 0), BitField('reserved56', 0, 2), BitField('crossing_vhca_mkey', 0, 1), BitField('log_max_dek', 0, 5), BitField('reserved57', 0, 1), BitField('mini_cqe_resp_l3l4header', 0, 1), BitField('mini_cqe_resp_flow_tag', 0, 1), BitField('enhanced_cqe_compression', 0, 1), BitField('mini_cqe_resp_stride_index', 0, 1), BitField('cqe_128_always', 0, 1), BitField('cqe_compression_128b', 0, 1), BitField('cqe_compression', 0, 1), ShortField('cqe_compression_timeout', 0), ShortField('cqe_compression_max_num', 0), BitField('reserved58', 0, 3), BitField('wqe_based_flow_table_update_dest_type_offset', 0, 5), BitField('flex_parser_id_gtpu_dw_0', 0, 4), BitField('log_max_tm_offloaded_op_size', 0, 4), BitField('tag_matching', 0, 1), BitField('rndv_offload_rc', 0, 1), BitField('rndv_offload_dc', 0, 1), BitField('log_tag_matching_list_sz', 0, 5), BitField('reserved59', 0, 3), BitField('log_max_xrq', 0, 5), ByteField('affiliate_nic_vport_criteria', 0), ByteField('native_port_num', 0), ByteField('num_vhca_ports', 0), BitField('flex_parser_id_gtpu_teid', 0, 4), BitField('reserved60', 0, 1), BitField('trusted_vnic_vhca', 0, 1), BitField('sw_owner_id', 0, 1), BitField('reserve_not_to_use', 0, 1), ShortField('max_num_of_monitor_counters', 0), ShortField('num_ppcnt_monitor_counters', 0), ShortField('max_num_sf', 0), ShortField('num_q_monitor_counters', 0), StrFixedLenField('reserved61', None, length=4), BitField('sf', 0, 1), BitField('sf_set_partition', 0, 1), BitField('reserved62', 0, 1), BitField('log_max_sf', 0, 5), ByteField('reserved63', 0), ByteField('log_min_sf_size', 0), ByteField('max_num_sf_partitions', 0), IntField('uctx_permission', 0), BitField('flex_parser_id_mpls_over_x_cw', 0, 4), BitField('flex_parser_id_geneve_tlv_option_0', 0, 4), BitField('flex_parser_id_icmp_dw1', 0, 4), BitField('flex_parser_id_icmp_dw0', 0, 4), BitField('flex_parser_id_icmpv6_dw1', 0, 4), BitField('flex_parser_id_icmpv6_dw0', 0, 4), BitField('flex_parser_id_outer_first_mpls_over_gre', 0, 4), BitField('flex_parser_id_outer_first_mpls_over_udp_label', 0, 4), ShortField('max_num_match_definer', 0), ShortField('sf_base_id', 0), BitField('flex_parser_id_gtpu_dw_2', 0, 4), BitField('flex_parser_id_gtpu_first_ext_dw_0', 0, 4), BitField('num_total_dynamic_vf_msix', 0, 24), BitField('reserved64', 0, 3), BitField('log_flow_hit_aso_granularity', 0, 5), BitField('reserved65', 0, 3), BitField('log_flow_hit_aso_max_alloc', 0, 5), BitField('reserved66', 0, 4), BitField('dynamic_msix_table_size', 0, 12), BitField('reserved67', 0, 3), BitField('log_max_num_flow_hit_aso', 0, 5), BitField('reserved68', 0, 4), BitField('min_dynamic_vf_msix_table_size', 0, 4), BitField('reserved69', 0, 4), BitField('max_dynamic_vf_msix_table_size', 0, 12), BitField('reserved70', 0, 3), BitField('log_max_num_header_modify_argument', 0, 5), BitField('reserved71', 0, 4), BitField('log_header_modify_argument_granularity', 0, 4), BitField('reserved72', 0, 3), BitField('log_header_modify_argument_max_alloc', 0, 5), BitField('reserved73', 0, 3), BitField('max_flow_execute_aso', 0, 5), 
LongField('vhca_tunnel_commands', 0), LongField('match_definer_format_supported', 0), ] class QueryCmdHcaCapOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), StrFixedLenField('reserved2', None, length=8), PadField(PacketField('capability', CmdHcaCap(), CmdHcaCap), 2048, padwith=b"\x00"), ] class FlowTableEntryMatchSetMisc(PRMPacket): fields_desc = [ BitField('gre_c_present', 0, 1), BitField('bth_a', 0, 1), BitField('gre_k_present', 0, 1), BitField('gre_s_present', 0, 1), BitField('source_vhca_port', 0, 4), BitField('source_sqn', 0, 24), ShortField('src_esw_owner_vhca_id', 0), ShortField('source_port', 0), BitField('outer_second_prio', 0, 3), BitField('outer_second_cfi', 0, 1), BitField('outer_second_vid', 0, 12), BitField('inner_second_prio', 0, 3), BitField('inner_second_cfi', 0, 1), BitField('inner_second_vid', 0, 12), BitField('outer_second_cvlan_tag', 0, 1), BitField('inner_second_cvlan_tag', 0, 1), BitField('outer_second_svlan_tag', 0, 1), BitField('inner_second_svlan_tag', 0, 1), BitField('outer_emd_tag', 0, 1), BitField('reserved2', 0, 11), ShortField('gre_protocol', 0), BitField('gre_key_h', 0, 24), ByteField('gre_key_l', 0), BitField('vxlan_vni', 0, 24), ByteField('bth_opcode', 0), BitField('geneve_vni', 0, 24), BitField('reserved4', 0, 7), BitField('geneve_oam', 0, 1), BitField('reserved5', 0, 12), BitField('outer_ipv6_flow_label', 0, 20), BitField('reserved6', 0, 12), BitField('inner_ipv6_flow_label', 0, 20), BitField('reserved7', 0, 10), BitField('geneve_opt_len', 0, 6), ShortField('geneve_protocol_type', 0), ByteField('reserved8', 0), BitField('bth_dst_qp', 0, 24), IntField('inner_esp_spi', 0), IntField('outer_esp_spi', 0), StrFixedLenField('reserved9', None, length=4), IntField('outer_emd_tag_data_47_16', 0), ShortField('outer_emd_tag_data_15_0', 0), ShortField('reserved10', 0), ] class FlowTableEntryMatchSetMisc2(PRMPacket): fields_desc = [ BitField('outer_first_mpls_label', 0, 20), BitField('outer_first_mpls_exp', 0, 3), BitField('outer_first_mpls_s_bos', 0, 1), ByteField('outer_first_mpls_ttl', 0), BitField('inner_first_mpls_label', 0, 20), BitField('inner_first_mpls_exp', 0, 3), BitField('inner_first_mpls_s_bos', 0, 1), ByteField('inner_first_mpls_ttl', 0), BitField('outer_last_mpls_over_gre_label', 0, 20), BitField('outer_last_mpls_over_gre_exp', 0, 3), BitField('outer_last_mpls_over_gre_s_bos', 0, 1), ByteField('outer_last_mpls_over_gre_ttl', 0), BitField('outer_last_mpls_over_udp_label', 0, 20), BitField('outer_last_mpls_over_udp_exp', 0, 3), BitField('outer_last_mpls_over_udp_s_bos', 0, 1), ByteField('outer_last_mpls_over_udp_ttl', 0), IntField('metadata_reg_c_7', 0), IntField('metadata_reg_c_6', 0), IntField('metadata_reg_c_5', 0), IntField('metadata_reg_c_4', 0), IntField('metadata_reg_c_3', 0), IntField('metadata_reg_c_2', 0), IntField('metadata_reg_c_1', 0), IntField('metadata_reg_c_0', 0), IntField('metadata_reg_a', 0), IntField('metadata_reg_b', 0), StrFixedLenField('reserved1', None, length=8), ] class FlowTableEntryMatchSetMisc3(PRMPacket): fields_desc = [ IntField('inner_tcp_seq_num', 0), IntField('outer_tcp_seq_num', 0), IntField('inner_tcp_ack_num', 0), IntField('outer_tcp_ack_num', 0), ByteField('reserved1', 0), BitField('outer_vxlan_gpe_vni', 0, 24), ByteField('outer_vxlan_gpe_next_protocol', 0), ByteField('outer_vxlan_gpe_flags', 0), ShortField('reserved2', 0), IntField('icmp_header_data', 0), IntField('icmpv6_header_data', 0), ByteField('icmp_type', 0), ByteField('icmp_code', 0), 
ByteField('icmpv6_type', 0), ByteField('icmpv6_code', 0), IntField('geneve_tlv_option_0_data', 0), IntField('gtpu_teid', 0), ByteField('gtpu_msg_type', 0), BitField('reserved3', 0, 5), BitField('gtpu_flags', 0, 3), ShortField('reserved4', 0), IntField('gtpu_dw_2', 0), IntField('gtpu_first_ext_dw_0', 0), IntField('gtpu_dw_0', 0), StrFixedLenField('reserved5', None, length=4), ] class FlowTableEntryMatchSetLyr24(PRMPacket): fields_desc = [ MACField('smac', '00:00:00:00:00:00'), ShortField('ethertype', 0), MACField('dmac', '00:00:00:00:00:00'), BitField('first_prio', 0, 3), BitField('first_cfi', 0, 1), BitField('first_vid', 0, 12), ByteField('ip_protocol', 0), BitField('ip_dscp', 0, 6), BitField('ip_ecn', 0, 2), BitField('cvlan_tag', 0, 1), BitField('svlan_tag', 0, 1), BitField('frag', 0, 1), BitField('ip_version', 0, 4), BitField('tcp_flags', 0, 9), ShortField('tcp_sport', 0), ShortField('tcp_dport', 0), BitField('reserved1', 0, 16), BitField('ipv4_ihl', 0, 4), BitField('l3_ok', 0, 1), BitField('l4_ok', 0, 1), BitField('ipv4_checksum_ok', 0, 1), BitField('l4_checksum_ok', 0, 1), ByteField('ip_ttl_hoplimit', 0), ShortField('udp_sport', 0), ShortField('udp_dport', 0), # Ipv4 and IPv6 fields are edited manually: # Added lambda conditioning # Field names must be different ConditionalField(BitField('src_ip_mask', 0, 128), lambda pkt: pkt.ip_version != 4 and pkt.ip_version != 6), ConditionalField(BitField('reserved2', 0, 96), lambda pkt: pkt.ip_version == 4), ConditionalField(IPField("src_ip4", "0.0.0.0"), lambda pkt: pkt.ip_version == 4), ConditionalField(IP6Field("src_ip6", "::"), lambda pkt: pkt.ip_version == 6), ConditionalField(BitField('dst_ip_mask', 0, 128), lambda pkt: pkt.ip_version != 4 and pkt.ip_version != 6), ConditionalField(BitField('reserved3', 0, 96), lambda pkt: pkt.ip_version == 4), ConditionalField(IPField("dst_ip4", "0.0.0.0"), lambda pkt: pkt.ip_version == 4), ConditionalField(IP6Field("dst_ip6", "::"), lambda pkt: pkt.ip_version == 6), ] class ProgSampleField(PRMPacket): fields_desc = [ IntField('prog_sample_field_value', 0), IntField('prog_sample_field_id', 0), ] class FlowTableEntryMatchSetMisc4(PRMPacket): fields_desc = [ PacketListField('prog_sample_field', [ProgSampleField() for x in range(4)], ProgSampleField, count_from=lambda pkt:4), StrFixedLenField('reserved1', None, length=32), ] class FlowTableEntryMatchSetMisc5(PRMPacket): fields_desc = [ IntField('macsec_tag_0', 0), IntField('macsec_tag_1', 0), IntField('macsec_tag_2', 0), IntField('macsec_tag_3', 0), IntField('tunnel_header_0', 0), IntField('tunnel_header_1', 0), IntField('tunnel_header_2', 0), IntField('tunnel_header_3', 0), StrFixedLenField('reserved1', None, length=32), ] class FlowTableEntryMatchParam(PRMPacket): fields_desc = [ PacketField('outer_headers', FlowTableEntryMatchSetLyr24(), FlowTableEntryMatchSetLyr24), PacketField('misc_parameters', FlowTableEntryMatchSetMisc(), FlowTableEntryMatchSetMisc), PacketField('inner_headers', FlowTableEntryMatchSetLyr24(), FlowTableEntryMatchSetLyr24), PacketField('misc_parameters_2', FlowTableEntryMatchSetMisc2(), FlowTableEntryMatchSetMisc2), PacketField('misc_parameters_3', FlowTableEntryMatchSetMisc3(), FlowTableEntryMatchSetMisc3), PacketField('misc_parameters_4', FlowTableEntryMatchSetMisc4(), FlowTableEntryMatchSetMisc4), PacketField('misc_parameters_5', FlowTableEntryMatchSetMisc5(), FlowTableEntryMatchSetMisc5), # Keep reserved commented out since SW steering checks the size with # supported fields only. 
# StrFixedLenField('reserved1', None, length=128), ] class SetActionIn(PRMPacket): fields_desc = [ BitField('action_type', ActionType.SET_ACTION, 4), BitField('field', 0, 12), BitField('reserved1', 0, 3), BitField('offset', 0, 5), BitField('reserved2', 0, 3), BitField('length', 0, 5), IntField('data', 0), ] class CopyActionIn(PRMPacket): fields_desc = [ BitField('action_type', ActionType.COPY_ACTION, 4), BitField('src_field', 0, 12), BitField('reserved1', 0, 3), BitField('src_offset', 0, 5), BitField('reserved2', 0, 3), BitField('length', 0, 5), BitField('reserved3', 0, 4), BitField('dst_field', 0, 12), BitField('reserved4', 0, 3), BitField('dst_offest', 0, 5), ByteField('reserved5', 0), ] class AllocFlowCounterIn(PRMPacket): fields_desc = [ ShortField('opcode', DevxOps.MLX5_CMD_OP_ALLOC_FLOW_COUNTER), ShortField('uid', 0), ShortField('reserved1', 0), ShortField('op_mod', 0), IntField('flow_counter_id', 0), BitField('reserved2', 0, 24), ByteField('flow_counter_bulk', 0), ] class AllocFlowCounterOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), IntField('flow_counter_id', 0), StrFixedLenField('reserved2', None, length=4), ] class DeallocFlowCounterIn(PRMPacket): fields_desc = [ ShortField('opcode', DevxOps.MLX5_CMD_OP_DEALLOC_FLOW_COUNTER), ShortField('uid', 0), ShortField('reserved1', 0), ShortField('op_mod', 0), IntField('flow_counter_id', 0), StrFixedLenField('reserved2', None, length=4), ] class DeallocFlowCounterOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), StrFixedLenField('reserved2', None, length=8), ] class QueryFlowCounterIn(PRMPacket): fields_desc = [ ShortField('opcode', DevxOps.MLX5_CMD_OP_QUERY_FLOW_COUNTER), ShortField('uid', 0), ShortField('reserved1', 0), ShortField('op_mod', 0), StrFixedLenField('reserved2', None, length=4), IntField('mkey', 0), LongField('address', 0), BitField('clear', 0, 1), BitField('dump_to_memory', 0, 1), BitField('num_of_counters', 0, 30), IntField('flow_counter_id', 0), ] class TrafficCounter(PRMPacket): fields_desc = [ LongField('packets', 0), LongField('octets', 0), ] class QueryFlowCounterOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), StrFixedLenField('reserved2', None, length=8), PacketField('flow_statistics', TrafficCounter(), TrafficCounter), ] class RxHashFieldSelect(PRMPacket): fields_desc = [ BitField('l3_prot_type', 0, 1), BitField('l4_prot_type', 0, 1), BitField('selected_fields', 0, 30), ] class Tirc(PRMPacket): fields_desc = [ StrFixedLenField('reserved1', None, length=4), BitField('disp_type', 0, 4), BitField('tls_en', 0, 1), BitField('nvmeotcp_zerocopy_en', 0, 1), BitField('nvmeotcp_crc_en', 0, 1), BitField('reserved2', 0, 25), StrFixedLenField('reserved3', None, length=8), BitField('reserved4', 0, 4), BitField('lro_timeout_period_usecs', 0, 16), BitField('lro_enable_mask', 0, 4), ByteField('lro_max_msg_sz', 0), ByteField('reserved5', 0), BitField('afu_id', 0, 24), BitField('inline_rqn_vhca_id_valid', 0, 1), BitField('reserved6', 0, 15), ShortField('inline_rqn_vhca_id', 0), BitField('reserved7', 0, 5), BitField('inline_q_type', 0, 3), BitField('inline_rqn', 0, 24), BitField('rx_hash_symmetric', 0, 1), BitField('reserved8', 0, 1), BitField('tunneled_offload_en', 0, 1), BitField('reserved9', 0, 5), BitField('indirect_table', 0, 24), BitField('rx_hash_fn', 0, 4), BitField('reserved10', 0, 2), BitField('self_lb_en', 0, 2), BitField('transport_domain', 
class CreateTirIn(PRMPacket):
    fields_desc = [
        ShortField('opcode', DevxOps.MLX5_CMD_OP_CREATE_TIR),
        ShortField('uid', 0),
        ShortField('reserved1', 0),
        ShortField('op_mod', 0),
        StrFixedLenField('reserved2', None, length=24),
        PacketField('tir_context', Tirc(), Tirc),
    ]


class CreateTirOut(PRMPacket):
    fields_desc = [
        ByteField('status', 0),
        BitField('icm_address_63_40', 0, 24),
        IntField('syndrome', 0),
        ByteField('icm_address_39_32', 0),
        BitField('tirn', 0, 24),
        IntField('icm_address_31_0', 0),
    ]


class SwMkc(PRMPacket):
    fields_desc = [
        BitField('reserved1', 0, 1),
        BitField('free', 0, 1),
        BitField('reserved2', 0, 1),
        BitField('access_mode_4_2', 0, 3),
        BitField('alter_pd_to_vhca_id', 0, 1),
        BitField('crossed_side_mkey', 0, 1),
        BitField('reserved3', 0, 5),
        BitField('relaxed_ordering_write', 0, 1),
        BitField('reserved4', 0, 1),
        BitField('small_fence_on_rdma_read_response', 0, 1),
        BitField('umr_en', 0, 1),
        BitField('a', 0, 1),
        BitField('rw', 0, 1),
        BitField('rr', 0, 1),
        BitField('lw', 0, 1),
        BitField('lr', 0, 1),
        BitField('access_mode_1_0', 0, 2),
        BitField('reserved5', 0, 1),
        BitField('tunneled_atomic', 0, 1),
        BitField('ma_translation_mode', 0, 2),
        BitField('reserved6', 0, 4),
        BitField('qpn', 0, 24),
        ByteField('mkey_7_0', 0),
        ByteField('reserved7', 0),
        BitField('pasid', 0, 24),
        BitField('length64', 0, 1),
        BitField('bsf_en', 0, 1),
        BitField('sync_umr', 0, 1),
        BitField('reserved8', 0, 2),
        BitField('expected_sigerr_count', 0, 1),
        BitField('reserved9', 0, 1),
        BitField('en_rinval', 0, 1),
        BitField('pd', 0, 24),
        LongField('start_addr', 0),
        LongField('len', 0),
        IntField('bsf_octword_size', 0),
        StrFixedLenField('reserved10', None, length=12),
        ShortField('crossing_target_vhca_id', 0),
        ShortField('reserved11', 0),
        IntField('translations_octword_size', 0),
        BitField('reserved12', 0, 25),
        BitField('relaxed_ordering_read', 0, 1),
        BitField('reserved13', 0, 1),
        BitField('log_entity_size', 0, 5),
        BitField('reserved14', 0, 3),
        BitField('crypto_en', 0, 2),
        BitField('reserved15', 0, 27),
    ]


class CreateMkeyIn(PRMPacket):
    fields_desc = [
        ShortField('opcode', DevxOps.MLX5_CMD_OP_CREATE_MKEY),
        ShortField('uid', 0),
        ShortField('reserved1', 0),
        ShortField('op_mod', 0),
        ByteField('reserved2', 0),
        BitField('input_mkey_index', 0, 24),
        BitField('pg_access', 0, 1),
        BitField('mkey_umem_valid', 0, 1),
        BitField('reserved3', 0, 30),
        PacketField('sw_mkc', SwMkc(), SwMkc),
        LongField('e_mtt_pointer', 0),
        LongField('e_bsf_pointer', 0),
        IntField('translations_octword_actual_size', 0),
        IntField('mkey_umem_id', 0),
        LongField('mkey_umem_offset', 0),
        IntField('bsf_octword_actual_size', 0),
        StrFixedLenField('reserved4', None, length=156),
        FieldListField('klm_pas_mtt', [0 for x in range(0)], IntField('', 0),
                       count_from=lambda pkt: 0),
    ]


class CreateMkeyOut(PRMPacket):
    fields_desc = [
        ByteField('status', 0),
        BitField('reserved1', 0, 24),
        IntField('syndrome', 0),
        ByteField('reserved2', 0),
        BitField('mkey_index', 0, 24),
        StrFixedLenField('reserved3', None, length=4),
    ]


class MigrationTagVersion0(PRMPacket):
    fields_desc = [
        ShortField('reserved1', 0),
        ShortField('device_id', 0),
        ShortField('fw_version_minor', 0),
        ShortField('icm_version', 0),
        StrFixedLenField('reserved2',
None, length=4), IntField('crc', 0), ] class CmdHcaCap2(PRMPacket): fields_desc = [ StrFixedLenField('reserved1', None, length=16), BitField('migratable', 0, 1), BitField('force_multi_prio_sq', 0, 1), BitField('cq_with_emulated_dev_eq', 0, 1), BitField('max_num_prog_sample_field', 0, 5), BitField('multi_path_force', 0, 1), BitField('fw_cpu_monitoring', 0, 1), BitField('enh_eth_striding_wq', 0, 1), BitField('log_max_num_reserved_qpn', 0, 5), BitField('reserved2', 0, 1), BitField('introspection_mkey_access_allowed', 0, 1), BitField('query_vuid', 0, 1), BitField('log_reserved_qpn_granularity', 0, 5), BitField('reserved3', 0, 3), BitField('log_reserved_qpn_max_alloc', 0, 5), ByteField('max_reformat_insert_size', 0), ByteField('max_reformat_insert_offset', 0), ByteField('max_reformat_remove_size', 0), ByteField('max_reformat_remove_offset', 0), BitField('multi_sl_qp', 0, 1), BitField('non_tunnel_reformat', 0, 1), BitField('reserved4', 0, 2), BitField('log_min_stride_wqe_sz', 0, 4), BitField('migration_multi_load', 0, 1), BitField('migration_tracking_state', 0, 1), BitField('reserved5', 0, 1), BitField('log_conn_track_granularity', 0, 5), BitField('reserved6', 0, 3), BitField('log_conn_track_max_alloc', 0, 5), BitField('reserved7', 0, 3), BitField('log_max_conn_track_offload', 0, 5), IntField('cross_vhca_object_to_object_supported', 0), LongField('allowed_object_for_other_vhca_access', 0), IntField('introspection_mkey', 0), BitField('ec_mmo_qp', 0, 1), BitField('sync_driver_version', 0, 1), BitField('driver_version_change_event', 0, 1), BitField('hairpin_sq_wqe_bb_size', 0, 5), BitField('hairpin_sq_wq_in_host_mem', 0, 1), BitField('hairpin_data_buffer_locked', 0, 1), BitField('reserved8', 0, 1), BitField('log_ec_mmo_max_size', 0, 5), BitField('reserved9', 0, 3), BitField('log_ec_mmo_max_src', 0, 5), BitField('reserved10', 0, 3), BitField('log_ec_mmo_max_dst', 0, 5), IntField('sync_driver_actions', 0), ByteField('flow_table_type_2_type', 0), BitField('reserved11', 0, 2), BitField('format_select_dw_8_6_ext', 0, 1), BitField('reserved12', 0, 1), BitField('log_min_mkey_entity_size', 0, 4), ShortField('execute_aso_type', 0), LongField('general_obj_types_127_64', 0), IntField('repeated_mkey_v2', 0), BitField('reserved_gid_index_valid', 0, 1), BitField('sw_vhca_id_valid', 0, 1), BitField('sw_vhca_id', 0, 14), ShortField('reserved_gid_index', 0), BitField('reserved13', 0, 3), BitField('log_max_channel_service_connection', 0, 5), BitField('reserved14', 0, 3), BitField('ts_cqe_metadata_size2wqe_counter', 0, 5), BitField('reserved15', 0, 3), BitField('flow_counter_bulk_log_max_alloc', 0, 5), BitField('reserved16', 0, 3), BitField('flow_counter_bulk_log_granularity', 0, 5), ByteField('format_select_dw_mpls_over_x_cw', 0), ByteField('format_select_dw_geneve_tlv_option_0', 0), ByteField('format_select_dw_outer_first_mpls_over_gre', 0), ByteField('format_select_dw_outer_first_mpls_over_udp', 0), ByteField('format_select_dw_gtpu_dw_0', 0), ByteField('format_select_dw_gtpu_dw_1', 0), ByteField('format_select_dw_gtpu_dw_2', 0), ByteField('format_select_dw_gtpu_first_ext_dw_0', 0), IntField('generate_wqe_type', 0), ShortField('max_enh_strwq_supported_profile', 0), BitField('reserved17', 0, 3), BitField('log_max_total_hairpin_data_buffer_locked_size', 0, 5), BitField('reserved18', 0, 3), BitField('log_max_rq_hairpin_data_buffer_locked_size', 0, 5), BitField('send_dbr_mode_no_dbr_int', 0, 1), BitField('send_dbr_mode_no_dbr_ext', 0, 1), BitField('reserved19', 0, 1), BitField('log_max_send_dbr_less_qp_sq', 0, 5), 
BitField('reserved20', 0, 3), BitField('enh_strwq_max_log_page_size', 0, 5), ByteField('enh_strwq_max_headroom', 0), ByteField('enh_strwq_max_tailroom', 0), PacketField('migration_tag_version_0', MigrationTagVersion0(), MigrationTagVersion0), BitField('reserved21', 0, 3), BitField('log_max_hairpin_wqe_num', 0, 5), BitField('reserved22', 0, 24), StrFixedLenField('reserved23', None, length=140), ] class QueryCmdHcaCap2Out(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), StrFixedLenField('reserved2', None, length=8), PadField(PacketField('capability', CmdHcaCap2(), CmdHcaCap2), 2048, padwith=b"\x00"), ] class FlowMeterParams(PRMPacket): fields_desc = [ BitField('valid', 0, 1), BitField('bucket_overflow', 0, 1), BitField('start_color', 0, 2), BitField('both_buckets_on_green', 0, 1), BitField('reserved1', 0, 1), BitField('meter_mode', 0, 2), BitField('reserved2', 0, 24), StrFixedLenField('reserved3', None, length=4), ByteField('cbs_exponent', 0), ByteField('cbs_mantissa', 0), BitField('reserved4', 0, 3), BitField('cir_exponent', 0, 5), ByteField('cir_mantissa', 0), StrFixedLenField('reserved5', None, length=4), ByteField('ebs_exponent', 0), ByteField('ebs_mantissa', 0), BitField('reserved6', 0, 3), BitField('eir_exponent', 0, 5), ByteField('eir_mantissa', 0), StrFixedLenField('reserved7', None, length=12), ] class QosCaps(PRMPacket): fields_desc = [ BitField('packet_pacing', 0, 1), BitField('esw_scheduling', 0, 1), BitField('esw_bw_share', 0, 1), BitField('esw_rate_limit', 0, 1), BitField('hll', 0, 1), BitField('packet_pacing_burst_bound', 0, 1), BitField('packet_pacing_typical_size', 0, 1), BitField('flow_meter_old', 0, 1), BitField('nic_sq_scheduling', 0, 1), BitField('nic_bw_share', 0, 1), BitField('nic_rate_limit', 0, 1), BitField('packet_pacing_uid', 0, 1), BitField('log_esw_max_sched_depth', 0, 4), ByteField('log_max_flow_meter', 0), ByteField('flow_meter_reg_id', 0), BitField('wqe_rate_pp', 0, 1), BitField('nic_qp_scheduling', 0, 1), BitField('reserved1', 0, 2), BitField('log_nic_max_sched_depth', 0, 4), BitField('flow_meter', 0, 1), BitField('reserved2', 0, 1), BitField('qos_remap_pp', 0, 1), BitField('log_max_qos_nic_queue_group', 0, 5), ShortField('reserved3', 0), IntField('packet_pacing_max_rate', 0), IntField('packet_pacing_min_rate', 0), BitField('reserved4', 0, 11), BitField('log_esw_max_rate_limit', 0, 5), ShortField('packet_pacing_rate_table_size', 0), ShortField('esw_element_type', 0), ShortField('esw_tsar_type', 0), ShortField('max_qos_para_vport', 0), ShortField('max_qos_para_vport_old', 0), IntField('max_tsar_bw_share', 0), ShortField('nic_element_type', 0), ShortField('nic_tsar_type', 0), BitField('reserved5', 0, 3), BitField('log_meter_aso_granularity', 0, 5), BitField('reserved6', 0, 3), BitField('log_meter_aso_max_alloc', 0, 5), BitField('reserved7', 0, 3), BitField('log_max_num_meter_aso', 0, 5), ByteField('reserved8', 0), BitField('reserved9', 0, 3), BitField('log_max_qos_nic_scheduling_element', 0, 5), BitField('reserved10', 0, 3), BitField('log_max_qos_esw_scheduling_element', 0, 5), ShortField('reserved11', 0), StrFixedLenField('reserved12', None, length=212), ] class QueryQosCapOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), StrFixedLenField('reserved2', None, length=8), PadField(PacketField('capability', QosCaps(), QosCaps), 4096, padwith=b"\x00"), ] class OdpPerTransportServiceCap(PRMPacket): fields_desc = [ BitField('send', 0, 1), 
BitField('receive', 0, 1), BitField('write', 0, 1), BitField('read', 0, 1), BitField('atomic', 0, 1), BitField('rmp', 0, 1), BitField('tag_matching', 0, 1), BitField('reserved1', 0, 25), ] class OdpSchemeCap(PRMPacket): fields_desc = [ StrFixedLenField('reserved1', None, length=8), BitField('sig', 0, 1), BitField('cross_vhca_mkey', 0, 1), BitField('klm_null_mkey', 0, 1), BitField('dpa_process_win', 0, 1), BitField('reserved2', 0, 3), BitField('mmo_wqe', 0, 1), BitField('local_mmo_wqe', 0, 1), BitField('aso_wqe', 0, 1), BitField('umr_wqe', 0, 1), BitField('get_psv_wqe', 0, 1), BitField('rget_psv_wqe', 0, 1), BitField('reserved3', 0, 19), StrFixedLenField('reserved4', None, length=4), PacketField('rc_odp_caps', OdpPerTransportServiceCap(), OdpPerTransportServiceCap), PacketField('uc_odp_caps', OdpPerTransportServiceCap(), OdpPerTransportServiceCap), PacketField('ud_odp_caps', OdpPerTransportServiceCap(), OdpPerTransportServiceCap), PacketField('xrc_odp_caps', OdpPerTransportServiceCap(), OdpPerTransportServiceCap), PacketField('dc_odp_caps', OdpPerTransportServiceCap(), OdpPerTransportServiceCap), StrFixedLenField('reserved5', None, length=28), ] class OdpCap(PRMPacket): fields_desc = [ PacketField('transport_page_fault_scheme_cap', OdpSchemeCap(), OdpSchemeCap), PacketField('memory_page_fault_scheme_cap', OdpSchemeCap(), OdpSchemeCap), StrFixedLenField('reserved1', None, length=64), BitField('mem_page_fault', 0, 1), BitField('reserved2', 0, 31), StrFixedLenField('reserved3', None, length=60), ] class QueryOdpCapOut(PRMPacket): fields_desc = [ ByteField('status', 0), BitField('reserved1', 0, 24), IntField('syndrome', 0), StrFixedLenField('reserved2', None, length=8), PadField(PacketField('capability', OdpCap(), OdpCap), 4096, padwith=b"\x00"), ] class FlowTableFieldsSupported2(PRMPacket): fields_desc = [ BitField('reserved1', 0, 10), BitField('lag_rx_port_affinity', 0, 1), BitField('inner_esp_seq_num', 0, 1), BitField('outer_esp_seq_num', 0, 1), BitField('hash_result', 0, 1), BitField('bth_opcode', 0, 1), BitField('tunnel_header_2_3', 0, 1), BitField('tunnel_header_0_1', 0, 1), BitField('macsec_syndrome', 0, 1), BitField('macsec_tag', 0, 1), BitField('outer_lrh_sl', 0, 1), BitField('inner_ipv4_ihl', 0, 1), BitField('outer_ipv4_ihl', 0, 1), BitField('nisp_syndrome', 0, 1), BitField('inner_l3_ok', 0, 1), BitField('inner_l4_ok', 0, 1), BitField('outer_l3_ok', 0, 1), BitField('outer_l4_ok', 0, 1), BitField('nisp_header', 0, 1), BitField('inner_ipv4_checksum_ok', 0, 1), BitField('inner_l4_checksum_ok', 0, 1), BitField('outer_ipv4_checksum_ok', 0, 1), BitField('outer_l4_checksum_ok', 0, 1), StrFixedLenField('reserved2', None, length=12), ] class FlowTableFieldsSupported(PRMPacket): fields_desc = [ BitField('outer_dmac', 0, 1), BitField('outer_smac', 0, 1), BitField('outer_ether_type', 0, 1), BitField('outer_ip_version', 0, 1), BitField('outer_first_prio', 0, 1), BitField('outer_first_cfi', 0, 1), BitField('outer_first_vid', 0, 1), BitField('outer_ipv4_ttl', 0, 1), BitField('outer_second_prio', 0, 1), BitField('outer_second_cfi', 0, 1), BitField('outer_second_vid', 0, 1), BitField('outer_ipv6_flow_label', 0, 1), BitField('outer_sip', 0, 1), BitField('outer_dip', 0, 1), BitField('outer_frag', 0, 1), BitField('outer_ip_protocol', 0, 1), BitField('outer_ip_ecn', 0, 1), BitField('outer_ip_dscp', 0, 1), BitField('outer_udp_sport', 0, 1), BitField('outer_udp_dport', 0, 1), BitField('outer_tcp_sport', 0, 1), BitField('outer_tcp_dport', 0, 1), BitField('outer_tcp_flags', 0, 1), 
BitField('outer_gre_protocol', 0, 1), BitField('outer_gre_key', 0, 1), BitField('outer_vxlan_vni', 0, 1), BitField('outer_geneve_vni', 0, 1), BitField('outer_geneve_oam', 0, 1), BitField('outer_geneve_protocol_type', 0, 1), BitField('outer_geneve_opt_len', 0, 1), BitField('source_vhca_port', 0, 1), BitField('source_eswitch_port', 0, 1), BitField('inner_dmac', 0, 1), BitField('inner_smac', 0, 1), BitField('inner_ether_type', 0, 1), BitField('inner_ip_version', 0, 1), BitField('inner_first_prio', 0, 1), BitField('inner_first_cfi', 0, 1), BitField('inner_first_vid', 0, 1), BitField('inner_ipv4_ttl', 0, 1), BitField('inner_second_prio', 0, 1), BitField('inner_second_cfi', 0, 1), BitField('inner_second_vid', 0, 1), BitField('inner_ipv6_flow_label', 0, 1), BitField('inner_sip', 0, 1), BitField('inner_dip', 0, 1), BitField('inner_frag', 0, 1), BitField('inner_ip_protocol', 0, 1), BitField('inner_ip_ecn', 0, 1), BitField('inner_ip_dscp', 0, 1), BitField('inner_udp_sport', 0, 1), BitField('inner_udp_dport', 0, 1), BitField('inner_tcp_sport', 0, 1), BitField('inner_tcp_dport', 0, 1), BitField('inner_tcp_flags', 0, 1), BitField('outer_tcp_seq_num', 0, 1), BitField('inner_tcp_seq_num', 0, 1), BitField('prog_sample_field', 0, 1), BitField('outer_first_mpls_over_udp_cw', 0, 1), BitField('outer_tcp_ack_num', 0, 1), BitField('inner_tcp_ack_num', 0, 1), BitField('outer_first_mpls_over_gre_cw', 0, 1), BitField('metadata_reg_b', 0, 1), BitField('metadata_reg_a', 0, 1), BitField('geneve_tlv_option_0_data', 0, 1), BitField('geneve_tlv_option_0_exist', 0, 1), BitField('outer_vxlan_gpe_vni', 0, 1), BitField('outer_vxlan_gpe_flags', 0, 1), BitField('outer_vxlan_gpe_next_protocol', 0, 1), BitField('outer_first_mpls_over_gre_ttl', 0, 1), BitField('outer_first_mpls_over_gre_s_bos', 0, 1), BitField('outer_first_mpls_over_gre_exp', 0, 1), BitField('outer_first_mpls_over_gre_label', 0, 1), BitField('outer_first_mpls_over_udp_ttl', 0, 1), BitField('outer_first_mpls_over_udp_s_bos', 0, 1), BitField('outer_first_mpls_over_udp_exp', 0, 1), BitField('outer_first_mpls_over_udp_label', 0, 1), BitField('inner_first_mpls_ttl', 0, 1), BitField('inner_first_mpls_s_bos', 0, 1), BitField('inner_first_mpls_exp', 0, 1), BitField('inner_first_mpls_label', 0, 1), BitField('outer_first_mpls_ttl', 0, 1), BitField('outer_first_mpls_s_bos', 0, 1), BitField('outer_first_mpls_exp', 0, 1), BitField('outer_first_mpls_label', 0, 1), BitField('outer_emd_tag', 0, 1), BitField('inner_esp_spi', 0, 1), BitField('outer_esp_spi', 0, 1), BitField('inner_ipv6_hop_limit', 0, 1), BitField('outer_ipv6_hop_limit', 0, 1), BitField('bth_dst_qp', 0, 1), BitField('inner_first_svlan', 0, 1), BitField('inner_second_svlan', 0, 1), BitField('outer_first_svlan', 0, 1), BitField('outer_second_svlan', 0, 1), BitField('source_sqn', 0, 1), BitField('outer_gre_c_present', 0, 1), BitField('outer_gre_k_present', 0, 1), BitField('outer_gre_s_present', 0, 1), BitField('ipsec_syndrome', 0, 1), BitField('ipsec_next_header', 0, 1), BitField('gtpu_first_ext_dw_0', 0, 1), BitField('gtpu_dw_0', 0, 1), BitField('gtpu_teid', 0, 1), BitField('gtpu_msg_type', 0, 1), BitField('gtpu_flags', 0, 1), BitField('outer_lrh_lid', 0, 1), BitField('outer_grh_flow_label', 0, 1), BitField('outer_grh_tclass', 0, 1), BitField('outer_grh_gid', 0, 1), BitField('outer_bth_pkey', 0, 1), BitField('gtpu_dw_2', 0, 1), BitField('reserved1', 0, 2), BitField('icmpv6_code', 0, 1), BitField('icmp_code', 0, 1), BitField('icmpv6_type', 0, 1), BitField('icmp_type', 0, 1), BitField('icmpv6_header_data', 0, 1), 
BitField('icmp_header_data', 0, 1), BitField('metadata_reg_c_7', 0, 1), BitField('metadata_reg_c_6', 0, 1), BitField('metadata_reg_c_5', 0, 1), BitField('metadata_reg_c_4', 0, 1), BitField('metadata_reg_c_3', 0, 1), BitField('metadata_reg_c_2', 0, 1), BitField('metadata_reg_c_1', 0, 1), BitField('metadata_reg_c_0', 0, 1), ] class HeaderModifyCapProperties(PRMPacket): fields_desc = [ PacketField('set_action_field_support', FlowTableFieldsSupported(), FlowTableFieldsSupported), PacketField('set_action_field_support_2', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), PacketField('add_action_field_support', FlowTableFieldsSupported(), FlowTableFieldsSupported), PacketField('add_action_field_support_2', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), PacketField('copy_action_field_support', FlowTableFieldsSupported(), FlowTableFieldsSupported), PacketField('copy_action_field_support_2', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), StrFixedLenField('reserved1', None, length=32), ] class FlowTablePropLayout(PRMPacket): fields_desc = [ BitField('ft_support', 0, 1), BitField('flow_tag', 0, 1), BitField('flow_counter', 0, 1), BitField('flow_modify_en', 0, 1), BitField('modify_root', 0, 1), BitField('identified_miss_table', 0, 1), BitField('flow_table_modify', 0, 1), BitField('reformat', 0, 1), BitField('decap', 0, 1), BitField('reset_root_to_default', 0, 1), BitField('pop_vlan', 0, 1), BitField('push_vlan', 0, 1), BitField('fpga_vendor_acceleration', 0, 1), BitField('pop_vlan_2', 0, 1), BitField('push_vlan_2', 0, 1), BitField('reformat_and_vlan_action', 0, 1), BitField('modify_and_vlan_action', 0, 1), BitField('sw_owner', 0, 1), BitField('reformat_l3_tunnel_to_l2', 0, 1), BitField('reformat_l2_to_l3_tunnel', 0, 1), BitField('reformat_and_modify_action', 0, 1), BitField('ignore_flow_level', 0, 1), BitField('reserved1', 0, 1), BitField('table_miss_action_domain', 0, 1), BitField('termination_table', 0, 1), BitField('reformat_and_fwd_to_table', 0, 1), BitField('forward_vhca_rx_root', 0, 1), BitField('forward_vhca_tx_root', 0, 1), BitField('ipsec_encrypt', 0, 1), BitField('ipsec_decrypt', 0, 1), BitField('sw_owner_v2', 0, 1), BitField('wqe_based_flow_update', 0, 1), BitField('termination_table_raw_traffic', 0, 1), BitField('vlan_and_fwd_to_table', 0, 1), BitField('log_max_ft_size', 0, 6), ByteField('log_max_modify_header_context', 0), ByteField('max_modify_header_actions', 0), ByteField('max_ft_level', 0), BitField('reformat_add_esp_transport', 0, 1), BitField('reformat_l2_to_l3_esp_tunnel', 0, 1), BitField('reformat_add_esp_transport_over_udp', 0, 1), BitField('reformat_del_esp_transport', 0, 1), BitField('reformat_l3_esp_tunnel_to_l2', 0, 1), BitField('reformat_del_esp_transport_over_udp', 0, 1), BitField('execute_aso', 0, 1), BitField('forward_flow_meter', 0, 1), ByteField('log_max_flow_sampler_num', 0), ByteField('metadata_reg_b_width', 0), ByteField('metadata_reg_a_width', 0), BitField('reformat_l2_to_l3_nisp_tunnel', 0, 1), BitField('reformat_l3_nisp_tunnel_to_l2', 0, 1), BitField('reformat_insert', 0, 1), BitField('reformat_remove', 0, 1), BitField('macsec_encrypt', 0, 1), BitField('macsec_decrypt', 0, 1), BitField('nisp_encrypt', 0, 1), BitField('nisp_decrypt', 0, 1), BitField('reformat_add_macsec', 0, 1), BitField('reformat_remove_macsec', 0, 1), BitField('reparse', 0, 1), BitField('reserved2', 0, 1), BitField('cross_vhca_object', 0, 1), BitField('reserved3', 0, 11), ByteField('log_max_ft_num', 0), ShortField('reserved4', 0), ByteField('log_max_flow_counter', 0), 
ByteField('log_max_destination', 0), BitField('reserved5', 0, 24), ByteField('log_max_flow', 0), StrFixedLenField('reserved6', None, length=8), PacketField('ft_field_support', FlowTableFieldsSupported(), FlowTableFieldsSupported), PacketField('ft_field_bitmask_support', FlowTableFieldsSupported(), FlowTableFieldsSupported), ] class FlowTableNicCap(PRMPacket): fields_desc = [ BitField('nic_rx_multi_path_tirs', 0, 1), BitField('nic_rx_multi_path_tirs_fts', 0, 1), BitField('allow_sniffer_and_nic_rx_shared_tir', 0, 1), BitField('reserved1', 0, 1), BitField('nic_rx_flow_tag_multipath_en', 0, 1), BitField('ttl_checksum_correction', 0, 1), BitField('nic_rx_rdma_fwd_tir', 0, 1), BitField('sw_owner_reformat_supported', 0, 1), ShortField('reserved2', 0), ByteField('nic_receive_max_steering_depth', 0), BitField('encap_general_header', 0, 1), BitField('reserved3', 0, 10), BitField('log_max_packet_reformat_context', 0, 5), BitField('reserved4', 0, 6), BitField('max_encap_header_size', 0, 10), StrFixedLenField('reserved5', None, length=56), PacketField('flow_table_properties_nic_receive', FlowTablePropLayout(), FlowTablePropLayout), PacketField('flow_table_properties_nic_receive_rdma', FlowTablePropLayout(), FlowTablePropLayout), PacketField('flow_table_properties_nic_receive_sniffer', FlowTablePropLayout(), FlowTablePropLayout), PacketField('flow_table_properties_nic_transmit', FlowTablePropLayout(), FlowTablePropLayout), PacketField('flow_table_properties_nic_transmit_rdma', FlowTablePropLayout(), FlowTablePropLayout), PacketField('flow_table_properties_nic_transmit_sniffer', FlowTablePropLayout(), FlowTablePropLayout), StrFixedLenField('reserved6', None, length=64), PacketField('header_modify_nic_receive', HeaderModifyCapProperties(), HeaderModifyCapProperties), PacketField('ft_field_support_2_nic_receive', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), PacketField('ft_field_bitmask_support_2_nic_receive', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), PacketField('ft_field_support_2_nic_receive_rdma', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), PacketField('ft_field_bitmask_support_2_nic_receive_rdma', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), PacketField('ft_field_support_2_nic_receive_sniffer', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), PacketField('ft_field_bitmask_support_2_nic_receive_sniffer', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), PacketField('ft_field_support_2_nic_transmit', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), PacketField('ft_field_bitmask_support_2_nic_transmit', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), PacketField('ft_field_support_2_nic_transmit_rdma', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), PacketField('ft_field_bitmask_support_2_nic_transmit_rdma', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), PacketField('ft_field_support_2_nic_transmit_sniffer', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), PacketField('ft_field_bitmask_support_2_nic_transmit_sniffer', FlowTableFieldsSupported2(), FlowTableFieldsSupported2), StrFixedLenField('reserved7', None, length=64), PacketField('header_modify_nic_transmit', HeaderModifyCapProperties(), HeaderModifyCapProperties), LongField('sw_steering_nic_rx_action_drop_icm_address', 0), LongField('sw_steering_nic_tx_action_drop_icm_address', 0), LongField('sw_steering_nic_tx_action_allow_icm_address', 0), StrFixedLenField('reserved8', None, length=40), ] class QueryCmdHcaNicFlowTableCapOut(PRMPacket): 
    fields_desc = [
        ByteField('status', 0),
        BitField('reserved1', 0, 24),
        IntField('syndrome', 0),
        StrFixedLenField('reserved2', None, length=8),
        PadField(PacketField('capability', FlowTableNicCap(),
                             FlowTableNicCap), 2048, padwith=b"\x00"),
    ]
rdma-core-56.1/tests/rdmacm_utils.py000066400000000000000000000503771477342711600175430ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file
"""
Provide some useful helper functions for pyverbs' rdmacm tests.
"""
import sys

from tests.utils import validate, poll_cq, get_send_elements, get_recv_wr
from tests.base_rdmacm import AsyncCMResources, SyncCMResources
from pyverbs.cmid import CMEvent, AddrInfo, JoinMCAttrEx
from pyverbs.pyverbs_error import PyverbsError, PyverbsRDMAError
import pyverbs.cm_enums as ce
from pyverbs.addr import AH
import pyverbs.enums as e
import abc
import errno

GRH_SIZE = 40
MULTICAST_QPN = 0xffffff
REJECT_MSG = 'connection rejected'


class CMConnection(abc.ABC):
    """
    RDMA CM base abstract connection class. The class contains the rdmacm
    resources and methods to easily establish a connection and run traffic
    over it. Each type of connection or traffic should inherit from this
    class and implement the necessary methods, such as connection
    establishment and traffic.
    """
    def __init__(self, syncer=None, notifier=None):
        """
        Initializes a connection object.
        :param syncer: Barrier object to sync between all the test processes.
        :param notifier: Queue object to pass objects between the connection
                         sides.
        """
        self.syncer = syncer
        self.notifier = notifier
        self.cm_res = None

    def rdmacm_traffic(self, server=None, multicast=False):
        """
        Run rdmacm traffic. This method runs the traffic flow that matches
        the CM resources. If self.cm_res.with_ext_qp is set, the traffic
        goes through the external QP.
        :param server: Run as server.
        :param multicast: Run multicast traffic.
        """
        server = server if server is not None else self.cm_res.passive
        if self.cm_res.with_ext_qp:
            if server:
                self._ext_qp_server_traffic()
            else:
                self._ext_qp_client_traffic()
        elif multicast:
            if server:
                self._cmid_server_multicast_traffic()
            else:
                self._cmid_client_multicast_traffic()
        else:
            if server:
                self._cmid_server_traffic()
            else:
                self._cmid_client_traffic()

    def remote_traffic(self, passive, remote_op='write'):
        """
        Run rdmacm remote traffic. This method runs RDMA remote traffic from
        the active side to the passive side.
        :param passive: If True, run as server.
        :param remote_op: 'write'/'read', the type of the RDMA remote operation.
""" msg_size = self.cm_res.msg_size if passive: self.cm_res.mr.write((msg_size) * 's', msg_size) mr_details = (self.cm_res.mr.rkey, self.cm_res.mr.buf) self.notifier.put(mr_details) self.syncer.wait() self.syncer.wait() if remote_op == 'write': msg_received = self.cm_res.mr.read(msg_size, 0) validate(msg_received, True, msg_size) else: self.cm_res.mr.write((msg_size) * 'c', msg_size) self.syncer.wait() rkey, remote_addr = self.notifier.get() cmid = self.cm_res.cmid post_func = cmid.post_write if remote_op == 'write' else \ cmid.post_read for _ in range(self.cm_res.num_msgs): post_func(self.cm_res.mr, msg_size, remote_addr, rkey, flags=e.IBV_SEND_SIGNALED) cmid.get_send_comp() self.syncer.wait() if remote_op == 'read': msg_received = self.cm_res.mr.read(msg_size, 0) validate(msg_received, False, msg_size) def _ext_qp_server_traffic(self): """ RDMACM server side traffic function which sends and receives a message, and then validates the received message. This traffic method uses the CM external QP and CQ for send, recv and get_completion. :return: None """ recv_wr = get_recv_wr(self.cm_res) self.cm_res.qp.post_recv(recv_wr) self.syncer.wait() for _ in range(self.cm_res.num_msgs): poll_cq(self.cm_res.cq) self.cm_res.qp.post_recv(recv_wr) msg_received = self.cm_res.mr.read(self.cm_res.msg_size, 0) validate(msg_received, self.cm_res.passive, self.cm_res.msg_size) send_wr = get_send_elements(self.cm_res, self.cm_res.passive)[0] self.cm_res.qp.post_send(send_wr) poll_cq(self.cm_res.cq) def _ext_qp_client_traffic(self): """ RDMACM client side traffic function which sends and receives a message, and then validates the received message. This traffic method uses the CM external QP and CQ for send, recv and get_completion. :return: None """ recv_wr = get_recv_wr(self.cm_res) self.syncer.wait() for _ in range(self.cm_res.num_msgs): send_wr = get_send_elements(self.cm_res, self.cm_res.passive)[0] self.cm_res.qp.post_send(send_wr) poll_cq(self.cm_res.cq) self.cm_res.qp.post_recv(recv_wr) poll_cq(self.cm_res.cq) msg_received = self.cm_res.mr.read(self.cm_res.msg_size, 0) validate(msg_received, self.cm_res.passive, self.cm_res.msg_size) def _cmid_server_traffic(self): """ RDMACM server side traffic function which sends and receives a message, and then validates the received message. This traffic method uses the RDMACM API for send, recv and get_completion. :return: None """ grh_offset = GRH_SIZE if self.cm_res.qp_type == e.IBV_QPT_UD else 0 send_msg = (self.cm_res.msg_size + grh_offset) * 's' cmid = self.cm_res.child_id for _ in range(self.cm_res.num_msgs): cmid.post_recv(self.cm_res.mr) self.syncer.wait() self.syncer.wait() wc = cmid.get_recv_comp() msg_received = self.cm_res.mr.read(self.cm_res.msg_size, grh_offset) validate(msg_received, True, self.cm_res.msg_size) if self.cm_res.port_space == ce.RDMA_PS_TCP: self.cm_res.mr.write(send_msg, self.cm_res.msg_size) cmid.post_send(self.cm_res.mr) else: ah = AH(cmid.pd, wc=wc, port_num=self.cm_res.ib_port, grh=self.cm_res.mr.buf) rqpn = self.cm_res.remote_qpn self.cm_res.mr.write(send_msg, self.cm_res.msg_size + GRH_SIZE) cmid.post_ud_send(self.cm_res.mr, ah, rqpn=rqpn, length=self.cm_res.msg_size) cmid.get_send_comp() self.syncer.wait() def _cmid_client_traffic(self): """ RDMACM client side traffic function which sends and receives a message, and then validates the received message. This traffic method uses the RDMACM API for send, recv and get_completion. 
:return: None """ grh_offset = GRH_SIZE if self.cm_res.qp_type == e.IBV_QPT_UD else 0 send_msg = (self.cm_res.msg_size + grh_offset) * 'c' cmid = self.cm_res.cmid for _ in range(self.cm_res.num_msgs): self.cm_res.mr.write(send_msg, self.cm_res.msg_size + grh_offset) self.syncer.wait() if self.cm_res.port_space == ce.RDMA_PS_TCP: cmid.post_send(self.cm_res.mr) else: ah = AH(cmid.pd, attr=self.cm_res.ud_params.ah_attr) cmid.post_ud_send(self.cm_res.mr, ah, rqpn=self.cm_res.ud_params.qp_num, length=self.cm_res.msg_size) cmid.get_send_comp() cmid.post_recv(self.cm_res.mr) self.syncer.wait() self.syncer.wait() cmid.get_recv_comp() msg_received = self.cm_res.mr.read(self.cm_res.msg_size, grh_offset) validate(msg_received, False, self.cm_res.msg_size) def _cmid_server_multicast_traffic(self): """ RDMACM server side multicast traffic function which receives a message, and then validates its data. """ for _ in range(self.cm_res.num_msgs): self.cm_res.cmid.post_recv(self.cm_res.mr) self.syncer.wait() self.syncer.wait() self.cm_res.cmid.get_recv_comp() msg_received = self.cm_res.mr.read(self.cm_res.msg_size, GRH_SIZE) validate(msg_received, True, self.cm_res.msg_size) def _cmid_client_multicast_traffic(self): """ RDMACM client side multicast traffic function which sends a message to the multicast group. """ send_msg = (self.cm_res.msg_size + GRH_SIZE) * 'c' for _ in range(self.cm_res.num_msgs): self.cm_res.mr.write(send_msg, self.cm_res.msg_size + GRH_SIZE) self.syncer.wait() ah = AH(self.cm_res.cmid.pd, attr=self.cm_res.ud_params.ah_attr) self.cm_res.cmid.post_ud_send(self.cm_res.mr, ah, rqpn=MULTICAST_QPN, length=self.cm_res.msg_size) self.cm_res.cmid.get_send_comp() self.syncer.wait() def event_handler(self, expected_event=None): """ Handle and execute corresponding API for RDMACM events of asynchronous communication. :param expected_event: The user expected event. :return: None """ cm_event = CMEvent(self.cm_res.cmid.event_channel) if cm_event.event_type == ce.RDMA_CM_EVENT_CONNECT_REQUEST: self.cm_res.create_child_id(cm_event) elif cm_event.event_type in [ce.RDMA_CM_EVENT_ESTABLISHED, ce.RDMA_CM_EVENT_MULTICAST_JOIN]: self.cm_res.set_ud_params(cm_event) if expected_event and expected_event != cm_event.event_type: raise PyverbsError('Expected this event: {}, got this event: {}'. format(expected_event, cm_event.event_str())) if expected_event == ce.RDMA_CM_EVENT_REJECTED: assert cm_event.private_data[:len(REJECT_MSG)].decode() == REJECT_MSG, \ f'CM event data ({cm_event.private_data}) is different than the expected ({REJECT_MSG})' cm_event.ack_cm_event() @abc.abstractmethod def establish_connection(self): pass @abc.abstractmethod def disconnect(self): pass class CMAsyncConnection(CMConnection): """ Implement RDMACM connection management for asynchronous CMIDs. It includes connection establishment, disconnection and other methods such as traffic. """ def __init__(self, ip_addr, syncer=None, notifier=None, passive=False, num_conns=1, qp_timeout=-1, reject_conn=False, **kwargs): """ Init the CMConnection and then init the AsyncCMResources. :param ip_addr: IP address to use. :param syncer: Barrier object to sync between all the test processes. :param notifier: Queue object to pass objects between the connection sides. :param passive: Indicate if it's a passive side. :param num_conns: Number of connections. :param qp_timeout: Value of the QP timeout. :param reject_conn: True if the server will reject the connection. :param kwargs: Arguments used to initialize the CM resources. 
For more info please check CMResources. """ super(CMAsyncConnection, self).__init__(syncer=syncer, notifier=notifier) self.num_conns = num_conns self.create_cm_res(ip_addr, passive=passive, **kwargs) self.qp_timeout = qp_timeout self.reject_conn = reject_conn def create_cm_res(self, ip_addr, passive, **kwargs): self.cm_res = AsyncCMResources(addr=ip_addr, passive=passive, **kwargs) if passive: self.cm_res.create_cmid() else: for i in range(self.num_conns): self.cm_res.create_cmid(i) def join_to_multicast(self, mc_addr=None, src_addr=None, extended=False): """ Join the CMID to multicast group. :param mc_addr: The multicast IP address. :param src_addr: The CMIDs source address. :param extended: Use the join_multicast_ex API. """ self.cm_res.cmid.bind_addr(self.cm_res.ai) resolve_addr_info = AddrInfo(src=src_addr, dst=mc_addr) self.cm_res.cmid.resolve_addr(resolve_addr_info) self.event_handler(expected_event=ce.RDMA_CM_EVENT_ADDR_RESOLVED) self.cm_res.create_qp() mc_addr_info = AddrInfo(src=mc_addr) if not extended: self.cm_res.cmid.join_multicast(addr=mc_addr_info) else: flags = ce.RDMA_MC_JOIN_FLAG_FULLMEMBER comp_mask = ce.RDMA_CM_JOIN_MC_ATTR_ADDRESS | \ ce.RDMA_CM_JOIN_MC_ATTR_JOIN_FLAGS mcattr = JoinMCAttrEx(addr=mc_addr_info, comp_mask=comp_mask, join_flags=flags) self.cm_res.cmid.join_multicast(mc_attr=mcattr) self.event_handler(expected_event=ce.RDMA_CM_EVENT_MULTICAST_JOIN) self.cm_res.create_mr() def leave_multicast(self, mc_addr=None): """ Leave multicast group. :param mc_addr: The multicast IP address. """ mc_addr_info = AddrInfo(src=mc_addr) self.cm_res.cmid.leave_multicast(mc_addr_info) def establish_connection(self): """ Establish RDMACM connection between two Async CMIDs. """ if self.cm_res.passive: self.cm_res.cmid.bind_addr(self.cm_res.ai) self.cm_res.cmid.listen() for conn_idx in range(self.num_conns): if self.cm_res.passive: self.syncer.wait() self.event_handler(expected_event=ce.RDMA_CM_EVENT_CONNECT_REQUEST) self.cm_res.create_qp(conn_idx=conn_idx) if self.qp_timeout >= 0: self.set_qp_timeout(self.cm_res.child_ids[conn_idx], self.qp_timeout) if self.cm_res.with_ext_qp: self.set_cmids_qp_ece(self.cm_res.passive) self.cm_res.modify_ext_qp_to_rts(conn_idx=conn_idx) self.set_cmid_ece(self.cm_res.passive) child_id = self.cm_res.child_ids[conn_idx] if self.reject_conn: child_id.reject(REJECT_MSG.encode()) return child_id.accept(self.cm_res.create_conn_param(conn_idx=conn_idx)) if self.qp_timeout >= 0: attr, _ = child_id.query_qp(e.IBV_QP_TIMEOUT) assert self.qp_timeout == attr.timeout if self.cm_res.port_space == ce.RDMA_PS_TCP: self.event_handler(expected_event=ce.RDMA_CM_EVENT_ESTABLISHED) else: cmid = self.cm_res.cmids[conn_idx] cmid.resolve_addr(self.cm_res.ai) self.event_handler(expected_event=ce.RDMA_CM_EVENT_ADDR_RESOLVED) self.syncer.wait() cmid.resolve_route() self.event_handler(expected_event=ce.RDMA_CM_EVENT_ROUTE_RESOLVED) self.cm_res.create_qp(conn_idx=conn_idx) if self.qp_timeout >= 0: self.set_qp_timeout(self.cm_res.cmid, self.qp_timeout) if self.cm_res.with_ext_qp: self.set_cmid_ece(self.cm_res.passive) cmid.connect(self.cm_res.create_conn_param(conn_idx=conn_idx)) if self.cm_res.with_ext_qp: self.event_handler(expected_event=\ ce.RDMA_CM_EVENT_CONNECT_RESPONSE) self.set_cmids_qp_ece(self.cm_res.passive) self.cm_res.modify_ext_qp_to_rts(conn_idx=conn_idx) cmid.establish() else: if self.reject_conn: self.event_handler(expected_event=ce.RDMA_CM_EVENT_REJECTED) return self.event_handler(expected_event=ce.RDMA_CM_EVENT_ESTABLISHED) if self.qp_timeout >= 0: attr, _ 
= self.cm_res.cmid.query_qp(e.IBV_QP_TIMEOUT)
                    assert self.qp_timeout == attr.timeout
        self.cm_res.create_mr()
        self.sync_qp_numbers()

    def set_qp_timeout(self, cm_id, ack_timeout):
        cm_id.set_option(ce.RDMA_OPTION_ID, ce.RDMA_OPTION_ID_ACK_TIMEOUT,
                         ack_timeout, 1)

    def sync_qp_numbers(self):
        """
        Sync the QP numbers of the connection sides.
        """
        if self.cm_res.passive:
            self.syncer.wait()
            self.notifier.put(self.cm_res.my_qp_number())
            self.syncer.wait()
            self.cm_res.remote_qpn = self.notifier.get()
        else:
            self.syncer.wait()
            self.cm_res.remote_qpn = self.notifier.get()
            self.notifier.put(self.cm_res.my_qp_number())
            self.syncer.wait()

    def disconnect(self):
        """
        Disconnect the connection.
        """
        if self.cm_res.port_space == ce.RDMA_PS_TCP:
            if self.cm_res.passive:
                for child_id in self.cm_res.child_ids.values():
                    child_id.disconnect()
            else:
                self.event_handler(expected_event=ce.RDMA_CM_EVENT_DISCONNECTED)
                for cmid in self.cm_res.cmids.values():
                    cmid.disconnect()

    def set_cmid_ece(self, passive):
        """
        Set the local CMID's ECE. The ECE is taken from the CMID's QP ECE.
        :param passive: Indicates whether this CMID participates as the
                        passive side of this connection.
        """
        cmid = self.cm_res.child_id if passive else self.cm_res.cmid
        try:
            ece = self.cm_res.qp.query_ece()
            cmid.set_local_ece(ece)
        except PyverbsRDMAError as ex:
            if ex.error_code != errno.EOPNOTSUPP:
                raise ex

    def set_cmids_qp_ece(self, passive):
        """
        Set the CMID's QP ECE.
        :param passive: Indicates whether this CMID participates as the
                        passive side of this connection.
        """
        cmid = self.cm_res.child_id if passive else self.cm_res.cmid
        try:
            ece = cmid.get_remote_ece()
            self.cm_res.qp.set_ece(ece)
        except PyverbsRDMAError as ex:
            if ex.error_code != errno.EOPNOTSUPP:
                raise ex


class CMSyncConnection(CMConnection):
    """
    Implement RDMACM connection management for synchronous CMIDs. It includes
    connection establishment, disconnection and other methods such as traffic.
    """
    def __init__(self, ip_addr, syncer=None, notifier=None, passive=False,
                 **kwargs):
        """
        Init the CMConnection and then init the SyncCMResources.
        :param ip_addr: IP address to use.
        :param syncer: Barrier object to sync between all the test processes.
        :param notifier: Queue object to pass objects between the connection
                         sides.
        :param passive: Indicate if it's a passive side.
        :param kwargs: Arguments used to initialize the CM resources. For
                       more info please check CMResources.
        """
        super(CMSyncConnection, self).__init__(syncer=syncer,
                                               notifier=notifier)
        self.create_cm_res(ip_addr, passive=passive, **kwargs)

    def create_cm_res(self, ip_addr, passive, **kwargs):
        self.cm_res = SyncCMResources(addr=ip_addr, passive=passive, **kwargs)
        self.cm_res.create_cmid()

    def establish_connection(self):
        """
        Establish RDMACM connection between two Sync CMIDs.
        """
        if self.cm_res.passive:
            self.cm_res.cmid.listen()
            self.syncer.wait()
            self.cm_res.create_child_id()
            self.cm_res.child_id.accept()
            self.cm_res.create_mr()
        else:
            self.syncer.wait()
            self.cm_res.cmid.connect()
            self.cm_res.create_mr()

    def disconnect(self):
        """
        Disconnect the connection.
        """
        if self.cm_res.port_space == ce.RDMA_PS_TCP:
            if self.cm_res.passive:
                self.cm_res.child_id.disconnect()
            else:
                self.cm_res.cmid.disconnect()
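
# A minimal usage sketch (illustrative only, not used by the tests
# themselves): each side of a connection typically runs in its own process,
# sharing a multiprocessing Barrier and Queue that are created once in the
# parent. The IP address below is a placeholder:
#
#     >>> from multiprocessing import Barrier, Queue
#     >>> syncer, notifier = Barrier(2), Queue()
#     >>> # In the passive process (the active side mirrors these calls
#     >>> # with passive=False):
#     >>> server = CMSyncConnection('192.0.2.1', syncer=syncer,
#     ...                           notifier=notifier, passive=True)
#     >>> server.establish_connection()
#     >>> server.rdmacm_traffic()
#     >>> server.disconnect()
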
rdma-core-56.1/tests/run_tests.py000077500000000000000000000006771477342711600171050ustar00rootroot00000000000000#!/usr/bin/env python3
# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB)
# Copyright (c) 2018, Mellanox Technologies. All rights reserved. See COPYING file

from args_parser import parser
import unittest
import os

from importlib.machinery import SourceFileLoader

module_path = os.path.join(os.path.dirname(__file__), '__init__.py')
tests = SourceFileLoader('tests', module_path).load_module()

parser.parse_args()
unittest.main(module=tests)
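
# Typical invocation (illustrative; the exact options are defined by the
# tests' args_parser module, e.g. a --dev option for selecting the device,
# which the test cases read back through self.config['dev']; remaining
# arguments are passed through to unittest):
#
#     $ ./run_tests.py --dev mlx5_0 -v tests.test_addr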
""" self.verify_state(self.ctx) gr = u.get_global_route(self.ctx, gid_index=self.gid_index, port_num=self.ib_port) port_attrs = self.ctx.query_port(self.ib_port) dlid = port_attrs.lid if port_attrs.link_layer == e.IBV_LINK_LAYER_INFINIBAND else 0 ah_attr = AHAttr(dlid=dlid, gr=gr, is_global=1, port_num=self.ib_port) pd = PD(self.ctx) try: with AH(pd, attr=ah_attr) as ah: ah.close() except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create AH is not supported') raise ex rdma-core-56.1/tests/test_atomic.py000066400000000000000000000124501477342711600173630ustar00rootroot00000000000000import unittest import errno from tests.base import RCResources, RDMATestCase, XRCResources from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.qp import QPAttr, QPInitAttr import pyverbs.device as d import pyverbs.enums as e from pyverbs.mr import MR import tests.utils as u class RCAtomic(RCResources): def __init__(self, dev_name, ib_port, gid_index, msg_size=8, qp_access=None, mr_access=None): """ Initialize an RCAtomic Resource object. :param dev_name: Device name to be used :param ib_port: IB port of the device to use :param gid_index: Which GID index to use :param msg_size: Message size for all resources memory actions :param qp_access: The QP access to use when modifying the resource's QP :param mr_access: The MR access to use when registering the resource's MR """ atomic_access = e.IBV_ACCESS_LOCAL_WRITE | \ e.IBV_ACCESS_REMOTE_ATOMIC self.qp_access = qp_access if qp_access else atomic_access self.mr_access = mr_access if mr_access else atomic_access super().__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index) self.msg_size = msg_size self.new_mr_lkey = None def create_mr(self): try: self.mr = MR(self.pd, self.msg_size, self.mr_access) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest(f'Reg MR with access ({self.mr_access}) is not supported') raise ex def create_qp_init_attr(self): return QPInitAttr(qp_type=e.IBV_QPT_RC, scq=self.cq, sq_sig_all=0, rcq=self.cq, srq=self.srq, cap=self.create_qp_cap()) def create_qp_attr(self): qp_attr = QPAttr(port_num=self.ib_port) qp_attr.qp_access_flags = self.qp_access return qp_attr @property def mr_lkey(self): return self.new_mr_lkey if self.new_mr_lkey is not None else self.mr.lkey class XRCAtomic(XRCResources): def create_mr(self): try: atomic_access = e.IBV_ACCESS_LOCAL_WRITE | \ e.IBV_ACCESS_REMOTE_ATOMIC self.mr = MR(self.pd, self.msg_size, atomic_access) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest(f'Reg MR with access ({atomic_access}) is not supported') raise ex class AtomicTest(RDMATestCase): """ Test various functionalities of the DM class. 
""" def setUp(self): super().setUp() self.iters = 10 self.server = None self.client = None self.traffic_args = None ctx = d.Context(name=self.dev_name) if ctx.query_device().atomic_caps == e.IBV_ATOMIC_NONE: raise unittest.SkipTest('Atomic operations are not supported') def test_atomic_cmp_and_swap(self): self.create_players(RCAtomic) u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_CMP_AND_SWP) u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_CMP_AND_SWP, receiver_val=1, sender_val=1) def test_atomic_fetch_and_add(self): self.create_players(RCAtomic) u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_FETCH_AND_ADD) def test_xrc_atomic_fetch_and_add(self): self.create_players(XRCAtomic) u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_FETCH_AND_ADD) def test_xrc_atomic_cmp_and_swap(self): self.create_players(XRCAtomic) u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_CMP_AND_SWP) u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_CMP_AND_SWP, receiver_val=1, sender_val=1) def test_atomic_invalid_qp_access(self): self.create_players(RCAtomic, qp_access=e.IBV_ACCESS_LOCAL_WRITE) with self.assertRaises(PyverbsRDMAError) as ex: u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_FETCH_AND_ADD) def test_atomic_invalid_mr_access(self): self.create_players(RCAtomic, mr_access=e.IBV_ACCESS_LOCAL_WRITE) with self.assertRaises(PyverbsRDMAError) as ex: u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_FETCH_AND_ADD) def test_atomic_non_aligned_addr(self): self.create_players(RCAtomic, msg_size=9) self.client.raddr += 1 with self.assertRaises(PyverbsRDMAError) as ex: u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_FETCH_AND_ADD) def test_atomic_invalid_lkey(self): self.create_players(RCAtomic) self.client.new_mr_lkey = self.client.mr.lkey + 1 with self.assertRaises(PyverbsRDMAError) as ex: u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_FETCH_AND_ADD) def test_atomic_invalid_rkey(self): self.create_players(RCAtomic) self.client.rkey += 1 with self.assertRaises(PyverbsRDMAError) as ex: u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_FETCH_AND_ADD) rdma-core-56.1/tests/test_cq.py000066400000000000000000000116361477342711600165170ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file # Copyright 2020 Amazon.com, Inc. or its affiliates. All rights reserved. """ Test module for pyverbs' cq module. """ import unittest import errno from tests.base import PyverbsAPITestCase, RDMATestCase, UDResources from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.base import PyverbsRDMAErrno from pyverbs.cq import CompChannel, CQ import tests.irdma_base as irdma from pyverbs.qp import QPCap import pyverbs.device as d import tests.utils as u class CQUDResources(UDResources): def __init__(self, dev_name, ib_port, gid_index, cq_depth=None): self.cq_depth = cq_depth super().__init__(dev_name, ib_port, gid_index) def create_cq(self): """ Initializes self.cq with a CQ of depth - defined by each test. :return: None """ cq_depth = self.cq_depth if self.cq_depth is not None else self.num_msgs self.cq = CQ(self.ctx, cq_depth, None, None, 0) def create_qp_cap(self): return QPCap(max_recv_wr=self.num_msgs, max_send_wr=10) class CQAPITest(PyverbsAPITestCase): """ Test the API of the CQ class. 
""" def setUp(self): super().setUp() def test_create_cq(self): for cq_size in [1, self.attr.max_cqe/2, self.attr.max_cqe]: for comp_vector in range(0, min(2, self.ctx.num_comp_vectors)): try: cq = CQ(self.ctx, cq_size, None, None, comp_vector) cq.close() except PyverbsRDMAError as ex: cq_attr = f'cq_size={cq_size}, comp_vector={comp_vector}' raise PyverbsRDMAErrno(f'Failed to create a CQ with {cq_attr}') # Create CQ with Max value of comp_vector. max_cqs_comp_vector = self.ctx.num_comp_vectors - 1 cq = CQ(self.ctx, self.ctx.num_comp_vectors, None, None, max_cqs_comp_vector) def test_create_cq_with_comp_channel(self): for cq_size in [1, self.attr.max_cqe/2, self.attr.max_cqe]: try: cc = CompChannel(self.ctx) CQ(self.ctx, cq_size, None, cc, 0) cc.close() except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest(f'CQ with completion channel is not supported') def test_create_cq_bad_flow(self): """ Test ibv_create_cq() with wrong comp_vector / number of cqes """ with self.assertRaises(PyverbsRDMAError) as ex: CQ(self.ctx, self.attr.max_cqe + 1, None, None, 0) self.assertEqual(ex.exception.error_code, errno.EINVAL) with self.assertRaises(PyverbsRDMAError) as ex: CQ(self.ctx, 100, None, None, self.ctx.num_comp_vectors + 1) self.assertEqual(ex.exception.error_code, errno.EINVAL) class CQTest(RDMATestCase): """ Test various functionalities of the CQ class. """ def setUp(self): super().setUp() self.iters = 10 self.server = None self.client = None def test_resize_cq(self): """ Test resize CQ, start with specific value and then increase and decrease the CQ size. The test also check bad flow of decrease the CQ size when there are more completions on it than the new value. """ self.create_players(CQUDResources, cq_depth=3) # Decrease the CQ size. new_cq_size = 1 try: self.client.cq.resize(new_cq_size) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Resize CQ is not supported') raise ex self.assertTrue(self.client.cq.cqe >= new_cq_size, f'The actual CQ size ({self.client.cq.cqe}) is less ' 'than guaranteed ({new_cq_size})') # Increase the CQ size. new_cq_size = 7 post_send_num = new_cq_size - 1 self.client.cq.resize(new_cq_size) self.assertTrue(self.client.cq.cqe >= new_cq_size, f'The actual CQ size ({self.client.cq.cqe}) is less ' 'than guaranteed ({new_cq_size})') # Fill the CQ entries except one for avoid cq_overrun warnings. send_wr, _ = u.get_send_elements(self.client, False) ah_client = u.get_global_ah(self.client, self.gid_index, self.ib_port) for i in range(post_send_num): u.send(self.client, send_wr, ah=ah_client) # Decrease the CQ size to less than the CQ unpolled entries. 
new_cq_size = 1 try: self.client.cq.resize(new_cq_size) except PyverbsRDMAError as ex: self.assertEqual(ex.error_code, errno.EINVAL) finally: for i in range(post_send_num): u.poll_cq(self.client.cq) rdma-core-56.1/tests/test_cq_events.py000066400000000000000000000017141477342711600200770ustar00rootroot00000000000000import errno import unittest from pyverbs.pyverbs_error import PyverbsRDMAError from tests.base import RCResources, UDResources from tests.base import RDMATestCase from tests.utils import traffic from pyverbs.cq import CQ, CompChannel def create_cq_with_comp_channel(agr_obj): agr_obj.comp_channel = CompChannel(agr_obj.ctx) agr_obj.cq = CQ(agr_obj.ctx, agr_obj.num_msgs, None, agr_obj.comp_channel) agr_obj.cq.req_notify() class CqEventsUD(UDResources): def create_cq(self): create_cq_with_comp_channel(self) class CqEventsRC(RCResources): def create_cq(self): create_cq_with_comp_channel(self) class CqEventsTestCase(RDMATestCase): def setUp(self): super().setUp() self.iters = 100 def test_cq_events_ud(self): self.create_players(CqEventsUD) traffic(**self.traffic_args) def test_cq_events_rc(self): self.create_players(CqEventsRC) traffic(**self.traffic_args) rdma-core-56.1/tests/test_cqex.py000066400000000000000000000077001477342711600170510ustar00rootroot00000000000000from tests.base import RCResources, UDResources, XRCResources, RDMATestCase, \ PyverbsAPITestCase from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.cq import CqInitAttrEx, CQEX import pyverbs.enums as e from pyverbs.mr import MR import tests.utils as u import unittest import errno def create_ex_cq(res): """ Create an Extended CQ using res's context and assign it to res's cq member. IBV_WC_STANDARD_FLAGS is used for WC flags to avoid support differences between devices. :param res: An instance of TrafficResources """ wc_flags = e.IBV_WC_STANDARD_FLAGS cia = CqInitAttrEx(cqe=2000, wc_flags=wc_flags) try: res.cq = CQEX(res.ctx, cia) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create Extended CQ is not supported') raise ex class CqExUD(UDResources): def create_cq(self): create_ex_cq(self) def create_mr(self): self.mr = MR(self.pd, self.msg_size + self.GRH_SIZE, e.IBV_ACCESS_LOCAL_WRITE) class CqExRC(RCResources): def create_cq(self): create_ex_cq(self) class CqExXRC(XRCResources): def create_cq(self): create_ex_cq(self) class CqExTestCase(RDMATestCase): """ Run traffic over the existing UD, RC and XRC infrastructure, but use ibv_cq_ex instead of legacy ibv_cq """ def setUp(self): super().setUp() self.iters = 100 def test_ud_traffic_cq_ex(self): self.create_players(CqExUD) u.traffic(**self.traffic_args, is_cq_ex=True) def test_rc_traffic_cq_ex(self): self.create_players(CqExRC) u.traffic(**self.traffic_args, is_cq_ex=True) def test_xrc_traffic_cq_ex(self): self.create_players(CqExXRC) u.xrc_traffic(self.client, self.server, is_cq_ex=True) class CQEXAPITest(PyverbsAPITestCase): """ Test the API of the CQEX class. 
""" def setUp(self): super().setUp() self.max_cqe = self.attr.max_cqe def test_create_cq_ex(self): """ Test ibv_create_cq_ex() """ cq_init_attrs_ex = CqInitAttrEx(cqe=10, wc_flags=0, comp_mask=0, flags=0) if self.attr_ex.raw_packet_caps & e.IBV_RAW_PACKET_CAP_CVLAN_STRIPPING: cq_init_attrs_ex.wc_flags = e.IBV_WC_EX_WITH_CVLAN CQEX(self.ctx, cq_init_attrs_ex) for flag in list(e.ibv_create_cq_wc_flags): cq_init_attrs_ex.wc_flags = flag try: cq_ex = CQEX(self.ctx, cq_init_attrs_ex) cq_ex.close() except PyverbsRDMAError as ex: if ex.error_code != errno.EOPNOTSUPP: raise ex cq_init_attrs_ex.wc_flags = 0 cq_init_attrs_ex.comp_mask = e.IBV_CQ_INIT_ATTR_MASK_FLAGS attr_flags = list(e.ibv_create_cq_attr_flags) for flag in attr_flags: cq_init_attrs_ex.flags = flag try: cq_ex = CQEX(self.ctx, cq_init_attrs_ex) cq_ex.close() except PyverbsRDMAError as ex: if ex.error_code != errno.EOPNOTSUPP: raise ex def test_create_cq_ex_bad_flow(self): """ Test ibv_create_cq_ex() with wrong comp_vector / number of cqes """ cq_attrs_ex = CqInitAttrEx(cqe=self.max_cqe + 1, wc_flags=0, comp_mask=0, flags=0) with self.assertRaises(PyverbsRDMAError) as ex: CQEX(self.ctx, cq_attrs_ex) if ex.exception.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create Extended CQ is not supported') self.assertEqual(ex.exception.error_code, errno.EINVAL) cq_attrs_ex = CqInitAttrEx(10, wc_flags=0, comp_mask=0, flags=0) cq_attrs_ex.comp_vector = self.ctx.num_comp_vectors + 1 with self.assertRaises(PyverbsRDMAError) as ex: CQEX(self.ctx, cq_attrs_ex) self.assertEqual(ex.exception.error_code, errno.EINVAL) rdma-core-56.1/tests/test_cuda_dmabuf.py000066400000000000000000000054601477342711600203440ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2022 Nvidia Inc. All rights reserved. See COPYING file import unittest import errno from pyverbs.pyverbs_error import PyverbsRDMAError from tests.base import RCResources, RDMATestCase from pyverbs.mr import DmaBufMR from pyverbs.qp import QPAttr import tests.cuda_utils as cu import pyverbs.enums as e import tests.utils as u try: from cuda import cuda, cudart, nvrtc cu.CUDA_FOUND = True except ImportError: cu.CUDA_FOUND = False GPU_PAGE_SIZE = 1 << 16 @cu.set_mem_io_cuda_methods class DmabufCudaRes(RCResources): def __init__(self, dev_name, ib_port, gid_index, mr_access=e.IBV_ACCESS_LOCAL_WRITE): """ Initializes MR and DMA BUF resources on top of a CUDA memory. Uses RC QPs for traffic. 
:param dev_name: Device name to be used :param ib_port: IB port of the device to use :param gid_index: Which GID index to use :param mr_access: The MR access """ self.mr_access = mr_access self.cuda_addr = None super().__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index) def create_mr(self): self.cuda_addr = cu.check_cuda_errors(cuda.cuMemAlloc(GPU_PAGE_SIZE)) attr_flag = 1 cu.check_cuda_errors(cuda.cuPointerSetAttribute( attr_flag, cuda.CUpointer_attribute.CU_POINTER_ATTRIBUTE_SYNC_MEMOPS, int(self.cuda_addr))) dmabuf_fd = cu.check_cuda_errors( cuda.cuMemGetHandleForAddressRange(self.cuda_addr, GPU_PAGE_SIZE, cuda.CUmemRangeHandleType.CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0)) try: self.mr = DmaBufMR(self.pd, self.msg_size, self.mr_access, dmabuf_fd) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest(f'Registering DMABUF MR is not supported') raise ex def create_qp_attr(self): qp_attr = QPAttr(port_num=self.ib_port) qp_access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE | \ e.IBV_ACCESS_REMOTE_READ qp_attr.qp_access_flags = qp_access return qp_attr @cu.set_init_cuda_methods class DmabufCudaTest(RDMATestCase): """ Test RDMA traffic over CUDA memory """ def test_cuda_dmabuf_rdma_write_traffic(self): """ Runs RDMA Write traffic over CUDA allocated memory using DMA BUF and RC QPs. """ access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE self.create_players(DmabufCudaRes, mr_access=access) u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE) rdma-core-56.1/tests/test_device.py000066400000000000000000000442441477342711600173540ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2018 Mellanox Technologies, Inc. All rights reserved. See COPYING file # Copyright 2020 Amazon.com, Inc. or its affiliates. All rights reserved. """ Test module for pyverbs' device module. """ from multiprocessing import Process, Queue import unittest import resource import random import errno import os from pyverbs.pyverbs_error import PyverbsError, PyverbsRDMAError from tests.base import PyverbsAPITestCase from pyverbs.device import Context, DM import tests.utils as u import pyverbs.device as d import pyverbs.enums as e PAGE_SIZE = resource.getpagesize() class DeviceTest(PyverbsAPITestCase): """ Test various functionalities of the Device class. """ def get_device_list(self): lst = d.get_device_list() if len(lst) == 0: raise unittest.SkipTest('No IB device found') dev_name = self.config['dev'] if dev_name: for dev in lst: if dev.name.decode() == dev_name: lst = [dev] break if len(lst) == 0: raise PyverbsRDMAError(f'No IB device with name {dev_name} found') return lst def test_dev_list(self): """ Verify that it's possible to get IB devices list. 
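The call under test is simply:

    import pyverbs.device as d
    lst = d.get_device_list()  # an empty list means no IB devices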
""" self.get_device_list() def test_open_dev(self): """ Test ibv_open_device() """ for dev in self.get_device_list(): d.Context(name=dev.name.decode()) def test_query_device(self): """ Test ibv_query_device() """ for dev in self.get_device_list(): with d.Context(name=dev.name.decode()) as ctx: attr = ctx.query_device() self.verify_device_attr(attr, dev) def test_query_pkey(self): """ Test ibv_query_pkey() """ for dev in self.get_device_list(): with d.Context(name=dev.name.decode()) as ctx: if dev.node_type == e.IBV_NODE_CA: ctx.query_pkey(port_num=self.ib_port, index=0) def test_get_pkey_index(self): """ Test ibv_get_pkey_index() """ source_pkey_index = 0 for dev in self.get_device_list(): with d.Context(name=dev.name.decode()) as ctx: if dev.node_type == e.IBV_NODE_CA: pkey = u.get_pkey_from_kernel(device=dev.name.decode(), port=self.ib_port, index=source_pkey_index) queried_pkey_idx = ctx.get_pkey_index(port_num=self.ib_port, pkey=pkey) self.assertEqual(queried_pkey_idx, source_pkey_index, f'Got index={queried_pkey_idx}\nExpected index={source_pkey_index}') def test_query_gid(self): """ Test ibv_query_gid() """ for dev in self.get_device_list(): with d.Context(name=dev.name.decode()) as ctx: gid_tbl_len = ctx.query_port(self.ib_port).gid_tbl_len if gid_tbl_len > 0: ctx.query_gid(port_num=self.ib_port, index=0) def test_query_gid_table(self): """ Test ibv_query_gid_table() """ for dev in self.get_device_list(): with d.Context(name=dev.name.decode()) as ctx: device_attr = ctx.query_device() max_entries = 0 for port_num in range(1, device_attr.phys_port_cnt + 1): port_attr = ctx.query_port(port_num) max_entries += port_attr.gid_tbl_len try: if max_entries > 0: ctx.query_gid_table(max_entries) except PyverbsRDMAError as ex: if ex.error_code in [-errno.EOPNOTSUPP, -errno.EPROTONOSUPPORT]: raise unittest.SkipTest('ibv_query_gid_table is not'\ ' supported on this device') raise ex def test_query_gid_table_bad_flow(self): """ Test ibv_query_gid_table() with too small a buffer """ try: self.ctx.query_gid_table(0) except PyverbsRDMAError as ex: if ex.error_code in [-errno.EOPNOTSUPP, -errno.EPROTONOSUPPORT]: raise unittest.SkipTest('ibv_query_gid_table is not' ' supported on this device') self.assertEqual(ex.error_code, -errno.EINVAL, f'Got -{os.strerror(-ex.error_code)} but ' f'Expected -{os.strerror(errno.EINVAL)} ') else: raise PyverbsRDMAError('Successfully queried ' 'gid_table with an insufficient buffer') def test_query_gid_ex(self): """ Test ibv_query_gid_ex() """ for dev in self.get_device_list(): with d.Context(name=dev.name.decode()) as ctx: try: gid_tbl_len = ctx.query_port(self.ib_port).gid_tbl_len if gid_tbl_len > 0: ctx.query_gid_ex(port_num=self.ib_port, gid_index=0) except PyverbsRDMAError as ex: if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]: raise unittest.SkipTest('ibv_query_gid_ex is not'\ ' supported on this device') raise ex def test_query_gid_ex_bad_flow(self): """ Test ibv_query_gid_ex() with an empty index """ try: port_attr = self.ctx.query_port(self.ib_port) max_entries = 0 for port_num in range(1, self.attr.phys_port_cnt + 1): attr = self.ctx.query_port(port_num) max_entries += attr.gid_tbl_len if max_entries > 0: gid_indices = {gid_entry.gid_index for gid_entry in self.ctx.query_gid_table(max_entries) if gid_entry.port_num == self.ib_port} else: gid_indices = {} possible_indices = set(range(port_attr.gid_tbl_len)) if port_attr.gid_tbl_len > 1 else set() try: no_gid_index = possible_indices.difference(gid_indices).pop() except KeyError: # all indices are 
populated by GIDs raise unittest.SkipTest('All gid indices populated,' ' cannot check bad flow') self.ctx.query_gid_ex(port_num=self.ib_port, gid_index=no_gid_index) except PyverbsRDMAError as ex: if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]: raise unittest.SkipTest('ibv_query_gid_ex is not' ' supported on this device') self.assertEqual(ex.error_code, errno.ENODATA, f'Got {os.strerror(ex.error_code)} but ' f'Expected {os.strerror(errno.ENODATA)}') else: raise PyverbsRDMAError('Successfully queried ' f'non-existent gid index {no_gid_index}') @staticmethod def verify_device_attr(attr, device): """ Helper method that verifies correctness of some members of DeviceAttr object. :param attr: A DeviceAttr object :param device: A Device object :return: None """ if device.node_type != e.IBV_NODE_UNSPECIFIED and device.node_type != e.IBV_NODE_UNKNOWN: assert attr.node_guid != 0 assert attr.sys_image_guid != 0 assert attr.max_mr_size > PAGE_SIZE assert attr.page_size_cap >= PAGE_SIZE assert attr.vendor_id != 0 assert attr.max_qp > 0 assert attr.max_qp_wr > 0 assert attr.max_sge > 0 assert attr.max_sge_rd >= 0 assert attr.max_cq > 0 assert attr.max_cqe > 0 assert attr.max_mr > 0 assert attr.max_pd > 0 if device.node_type == e.IBV_NODE_CA: assert attr.max_pkeys > 0 def test_query_device_ex(self): """ Test ibv_query_device_ex() """ for dev in self.get_device_list(): with d.Context(name=dev.name.decode()) as ctx: attr_ex = ctx.query_device_ex() self.verify_device_attr(attr_ex.orig_attr, dev) def test_phys_port_cnt_ex(self): """ Test phys_port_cnt_ex """ for dev in self.get_device_list(): with d.Context(name=dev.name.decode()) as ctx: attr_ex = ctx.query_device_ex() phys_port_cnt = attr_ex.orig_attr.phys_port_cnt phys_port_cnt_ex = attr_ex.phys_port_cnt_ex if phys_port_cnt_ex > 255: self.assertEqual(phys_port_cnt, 255, f'phys_port_cnt should be 255 if ' + f'phys_port_cnt_ex is bigger than 255') else: self.assertEqual(phys_port_cnt, phys_port_cnt_ex, f'phys_port_cnt_ex and phys_port_cnt ' + f'should be equal if number of ports is ' + f'less than 256') @staticmethod def verify_port_attr(attr): """ Helper method that verifies correctness of some members of PortAttr object. :param attr: A PortAttr object :return: None """ assert 'Invalid' not in d.phys_state_to_str(attr.state) assert 'Invalid' not in d.translate_mtu(attr.max_mtu) assert 'Invalid' not in d.translate_mtu(attr.active_mtu) assert 'Invalid' not in d.width_to_str(attr.active_width) assert 'Invalid' not in d.speed_to_str(attr.active_speed, attr.active_speed_ex) assert 'Invalid' not in d.translate_link_layer(attr.link_layer) assert attr.max_msg_sz > 0x1000 def test_query_port(self): """ Test ibv_query_port """ for dev in self.get_device_list(): with d.Context(name=dev.name.decode()) as ctx: port_attr = ctx.query_port(self.ib_port) self.verify_port_attr(port_attr) def test_query_port_bad_flow(self): """ Verify that querying non-existing ports fails as expected """ for dev in self.get_device_list(): with d.Context(name=dev.name.decode()) as ctx: num_ports = ctx.query_device().phys_port_cnt try: port = num_ports + random.randint(1, 10) ctx.query_port(port) except PyverbsRDMAError as e: assert 'Failed to query port' in e.args[0] assert 'Invalid argument' in e.args[0] else: raise PyverbsRDMAError( 'Successfully queried non-existing port {p}'. \ format(p=port)) class DMTest(PyverbsAPITestCase): """ Test various functionalities of the DM class. 
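The allocation pattern shared by the tests below (a sketch, assuming an open `ctx` and a length within the device's max_dm_size):

    dm_attrs = u.get_dm_attrs(dm_len)
    with d.DM(ctx, dm_attrs) as dm:
        dm.copy_to_dm(0, b'data', 4)  # offset/length aligned to DM_ALIGNMENT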
""" def setUp(self): super().setUp() if self.attr_ex.max_dm_size == 0: raise unittest.SkipTest('Device memory is not supported') def test_create_dm(self): """ test ibv_alloc_dm() """ dm_len = random.randrange(u.MIN_DM_SIZE, int(self.attr_ex.max_dm_size/2), u.DM_ALIGNMENT) dm_attrs = u.get_dm_attrs(dm_len) with d.DM(self.ctx, dm_attrs): pass def test_destroy_dm(self): """ test ibv_free_dm() """ dm_len = random.randrange(u.MIN_DM_SIZE, int(self.attr_ex.max_dm_size/2), u.DM_ALIGNMENT) dm_attrs = u.get_dm_attrs(dm_len) dm = d.DM(self.ctx, dm_attrs) dm.close() def test_create_dm_bad_flow(self): """ test ibv_alloc_dm() with an illegal size and comp mask """ dm_len = self.attr_ex.max_dm_size + 1 dm_attrs = u.get_dm_attrs(dm_len) try: d.DM(self.ctx, dm_attrs) except PyverbsRDMAError as e: assert 'Failed to allocate device memory of size' in \ e.args[0] assert 'Max available size' in e.args[0] else: raise PyverbsError( 'Created a DM with size larger than max reported') dm_attrs.comp_mask = random.randint(1, 100) try: d.DM(self.ctx, dm_attrs) except PyverbsRDMAError as e: assert 'Failed to allocate device memory of size' in \ e.args[0] else: raise PyverbsError( 'Created a DM with illegal comp mask {c}'. \ format(c=dm_attrs.comp_mask)) def test_destroy_dm_bad_flow(self): """ Test calling ibv_free_dm() twice """ dm_len = random.randrange(u.MIN_DM_SIZE, int(self.attr_ex.max_dm_size/2), u.DM_ALIGNMENT) dm_attrs = u.get_dm_attrs(dm_len) dm = d.DM(self.ctx, dm_attrs) dm.close() dm.close() def test_dm_write(self): """ Test writing to the device memory """ dm_len = random.randrange(u.MIN_DM_SIZE, int(self.attr_ex.max_dm_size/2), u.DM_ALIGNMENT) dm_attrs = u.get_dm_attrs(dm_len) with d.DM(self.ctx, dm_attrs) as dm: data_length = random.randrange(4, dm_len, u.DM_ALIGNMENT) data_offset = random.randrange(0, dm_len - data_length, u.DM_ALIGNMENT) data = 'a' * data_length dm.copy_to_dm(data_offset, data.encode(), data_length) def test_dm_write_bad_flow(self): """ Test writing to the device memory with bad offset and length """ dm_len = random.randrange(u.MIN_DM_SIZE, int(self.attr_ex.max_dm_size/2), u.DM_ALIGNMENT) dm_attrs = u.get_dm_attrs(dm_len) with d.DM(self.ctx, dm_attrs) as dm: data_length = random.randrange(4, dm_len, u.DM_ALIGNMENT) data_offset = random.randrange(0, dm_len - data_length, u.DM_ALIGNMENT) data_offset += 1 # offset needs to be a multiple of 4 data = 'a' * data_length try: dm.copy_to_dm(data_offset, data.encode(), data_length) except PyverbsRDMAError as e: assert 'Failed to copy to dm' in e.args[0] else: raise PyverbsError( 'Wrote to device memory with a bad offset') def test_dm_read(self): """ Test reading from the device memory """ dm_len = random.randrange(u.MIN_DM_SIZE, int(self.attr_ex.max_dm_size/2), u.DM_ALIGNMENT) dm_attrs = u.get_dm_attrs(dm_len) with d.DM(self.ctx, dm_attrs) as dm: data_length = random.randrange(4, dm_len, u.DM_ALIGNMENT) data_offset = random.randrange(0, dm_len - data_length, u.DM_ALIGNMENT) data = 'a' * data_length dm.copy_to_dm(data_offset, data.encode(), data_length) read_str = dm.copy_from_dm(data_offset, data_length) assert read_str.decode() == data def alloc_dm(self, res_queue, size): """ Alloc device memory. Used by multiple processes that allocate DMs in parallel. :param res_queue: Result Queue to return the result to the parent process. :param size: The DM allocation size. 
:return: None """ try: d.DM(self.ctx, d.AllocDmAttr(length=size)) except PyverbsError as err: res_queue.put(err.error_code) res_queue.put(0) def test_multi_process_alloc_dm(self): """ Several processes try to allocate device memory simultaneously. """ res_queue = Queue() processes = [] processes_num = 5 # Dividing the max dm size by 2 since we're not # guaranteed to have the max size free for us. total_size = self.attr_ex.max_dm_size / 2 / processes_num for i in range(processes_num): processes.append(Process(target=self.alloc_dm, args=(res_queue, total_size))) for i in range(processes_num): processes[i].start() for i in range(processes_num): processes[i].join() rc = res_queue.get() self.assertEqual(rc, 0, f'Parallel device memory allocation failed with errno: {rc}') class SharedDMTest(PyverbsAPITestCase): """ Tests shared device memory by importing DMs """ def setUp(self): super().setUp() if self.attr_ex.max_dm_size == 0: raise unittest.SkipTest('Device memory is not supported') self.dm_size = int(self.attr_ex.max_dm_size / 2) def test_import_dm(self): """ Creates a DM and imports it from a different (duplicated) Context. Then writes some data to the original DM, reads it from the imported DM and verifies that the read data is as expected. """ with d.DM(self.ctx, d.AllocDmAttr(length=self.dm_size)) as dm: cmd_fd_dup = os.dup(self.ctx.cmd_fd) try: imported_ctx = Context(cmd_fd=cmd_fd_dup) imported_dm = DM(imported_ctx, handle=dm.handle) except PyverbsRDMAError as ex: if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]: raise unittest.SkipTest('Some object imports are not supported') raise ex original_data = b'\xab' * self.dm_size dm.copy_to_dm(0, original_data, self.dm_size) read_data = imported_dm.copy_from_dm(0, self.dm_size) self.assertEqual(original_data, read_data) imported_dm.unimport() rdma-core-56.1/tests/test_efa_srd.py000066400000000000000000000104701477342711600175120ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright 2020-2023 Amazon.com, Inc. or its affiliates. All rights reserved. import unittest import errno from pyverbs.cq import CQ, CompChannel from pyverbs.pyverbs_error import PyverbsRDMAError import pyverbs.enums as e from tests.efa_base import EfaRDMATestCase from tests.efa_base import SRDResources import tests.utils as u class CqEventsSRD(SRDResources): def __init__(self, dev_name, ib_port, gid_index): super().__init__(dev_name, ib_port, gid_index, e.IBV_QP_EX_WITH_SEND) def create_cq(self): self.comp_channel = CompChannel(self.ctx) self.cq = CQ(self.ctx, self.num_msgs, None, self.comp_channel) self.cq.req_notify() class CqEventsSRDTestCase(EfaRDMATestCase): def setUp(self): super().setUp() self.iters = 100 def test_cq_events_srd(self): for use_new_send in [False, True]: with self.subTest(): super().create_players(CqEventsSRD) u.traffic(**self.traffic_args, new_send=use_new_send) class QPSRDTestCase(EfaRDMATestCase): def setUp(self): super().setUp() self.iters = 100 self.server = None self.client = None def create_players(self, send_ops_flags=0, qp_count=8): super().create_players(SRDResources, send_ops_flags=send_ops_flags, qp_count=qp_count) def full_sq_bad_flow(self): """ Check post_send while qp's sq is full. 
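(The SQ depth is read back from the QP itself, e.g.:
    qp_attr, _ = self.client.qps[0].query(e.IBV_QP_CAP)
    max_send_wr = qp_attr.cap.max_send_wr
and posting max_send_wr + 1 WRs is expected to fail with ENOMEM.) The flow is: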
- Find qp's sq length - Fill the qp with work requests until overflow """ qp_idx = 0 send_op = e.IBV_WR_SEND ah = u.get_global_ah(self.client, self.gid_index, self.ib_port) qp_attr, _ = self.client.qps[qp_idx].query(e.IBV_QP_CAP) max_send_wr = qp_attr.cap.max_send_wr with self.assertRaises(PyverbsRDMAError) as ex: for _ in range (max_send_wr + 1): _, c_sg = u.get_send_elements(self.client, False) u.send(self.client, c_sg, send_op, new_send=True, qp_idx=qp_idx, ah=ah) self.assertEqual(ex.exception.error_code, errno.ENOMEM) def test_qp_ex_srd_send(self): self.create_players(e.IBV_QP_EX_WITH_SEND) u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_SEND) def test_qp_ex_srd_send_imm(self): self.create_players(e.IBV_QP_EX_WITH_SEND_WITH_IMM) u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_SEND_WITH_IMM) def test_qp_ex_srd_rdma_read(self): self.create_players(e.IBV_QP_EX_WITH_RDMA_READ) self.server.mr.write('s' * self.server.msg_size, self.server.msg_size) u.rdma_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_RDMA_READ) def test_qp_ex_srd_rdma_write(self): self.create_players(e.IBV_QP_EX_WITH_RDMA_WRITE) u.rdma_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_RDMA_WRITE) def test_qp_ex_srd_rdma_write_with_imm(self): self.create_players(e.IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM) u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_RDMA_WRITE_WITH_IMM) def test_qp_ex_srd_old_send(self): self.create_players() u.traffic(**self.traffic_args, new_send=False) def test_qp_ex_srd_old_send_imm(self): self.create_players() u.traffic(**self.traffic_args, new_send=False, send_op=e.IBV_WR_SEND_WITH_IMM) def test_qp_ex_srd_zero_size(self): self.create_players(e.IBV_QP_EX_WITH_SEND) self.client.msg_size = 0 self.server.msg_size = 0 u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_SEND) def test_post_receive_qp_state_bad_flow(self): self.create_players(e.IBV_QP_EX_WITH_SEND, qp_count=1) u.post_rq_state_bad_flow(self) def test_post_send_qp_state_bad_flow(self): self.create_players(e.IBV_QP_EX_WITH_SEND, qp_count=1) u.post_sq_state_bad_flow(self) def test_full_rq_bad_flow(self): self.create_players(e.IBV_QP_EX_WITH_SEND, qp_count=1) u.full_rq_bad_flow(self) def test_full_sq_bad_flow(self): self.create_players(e.IBV_QP_EX_WITH_SEND, qp_count=1) self.full_sq_bad_flow() def test_rq_with_larger_sgl_bad_flow(self): self.create_players(e.IBV_QP_EX_WITH_SEND, qp_count=1) u.create_rq_with_larger_sgl_bad_flow(self) rdma-core-56.1/tests/test_efadv.py000066400000000000000000000143201477342711600171720ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright 2020-2024 Amazon.com, Inc. or its affiliates. All rights reserved. """ Test module for efa direct-verbs. """ import unittest import random import errno import pyverbs.providers.efa.efa_enums as efa_e from pyverbs.base import PyverbsRDMAError import pyverbs.providers.efa.efadv as efa from pyverbs.qp import QPInitAttrEx from pyverbs.addr import AHAttr from pyverbs.cq import CQ import pyverbs.enums as e from pyverbs.pd import PD from tests.efa_base import EfaAPITestCase, EfaRDMATestCase, EfaCQRes import tests.utils as u class EfaQueryDeviceTest(EfaAPITestCase): """ Test various functionalities of the direct verbs class. """ def test_efadv_query(self): """ Verify that it's possible to read EFA direct-verbs. 
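In short (mirrors the body of this test; requires an EFA device, otherwise it is skipped):

    with efa.EfaContext(name=ctx.name) as efa_ctx:
        efa_attrs = efa_ctx.query_efa_device()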
""" with efa.EfaContext(name=self.ctx.name) as efa_ctx: try: efa_attrs = efa_ctx.query_efa_device() if self.config['verbosity']: print(f'\n{efa_attrs}') except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Not supported on non EFA devices') raise ex class EfaAHTest(EfaAPITestCase): """ Test functionality of the EfaAH class """ def test_efadv_query_ah(self): """ Test efadv_query_ah() """ pd = PD(self.ctx) try: gr = u.get_global_route(self.ctx, port_num=self.ib_port) ah_attr = AHAttr(gr=gr, is_global=1, port_num=self.ib_port) ah = efa.EfaAH(pd, attr=ah_attr) query_ah_attr = ah.query_efa_ah() if self.config['verbosity']: print(f'\n{query_ah_attr}') except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Not supported on non EFA devices') raise ex class EfaQPTest(EfaAPITestCase): """ Test SRD QP class """ def test_efadv_create_driver_qp(self): """ Test efadv_create_driver_qp() """ with PD(self.ctx) as pd: with CQ(self.ctx, 100) as cq: qia = u.get_qp_init_attr(cq, self.attr) qia.qp_type = e.IBV_QPT_DRIVER try: qp = efa.SRDQP(pd, qia) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest("Create SRD QP is not supported") raise ex class EfaQPExTest(EfaAPITestCase): """ Test SRD QPEx class """ def test_efadv_create_qp_ex(self): """ Test efadv_create_qp_ex() """ with PD(self.ctx) as pd: with CQ(self.ctx, 100) as cq: qiaEx = get_qp_init_attr_ex(cq, pd, self.attr) efaqia = efa.EfaQPInitAttr() efaqia.driver_qp_type = efa_e.EFADV_QP_DRIVER_TYPE_SRD try: qp = efa.SRDQPEx(self.ctx, qiaEx, efaqia) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest("Create SRD QPEx is not supported") raise ex def get_random_send_op_flags(): send_ops_flags = [e.IBV_QP_EX_WITH_SEND, e.IBV_QP_EX_WITH_SEND_WITH_IMM, e.IBV_QP_EX_WITH_RDMA_READ, e.IBV_QP_EX_WITH_RDMA_WRITE, e.IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM] selected = u.sample(send_ops_flags) selected_ops_flags = 0 for s in selected: selected_ops_flags += s.value return selected_ops_flags def get_qp_init_attr_ex(cq, pd, attr): qp_cap = u.random_qp_cap(attr) sig = random.randint(0, 1) mask = e.IBV_QP_INIT_ATTR_PD | e.IBV_QP_INIT_ATTR_SEND_OPS_FLAGS send_ops_flags = get_random_send_op_flags() qia = QPInitAttrEx(qp_type=e.IBV_QPT_DRIVER, cap=qp_cap, sq_sig_all=sig, comp_mask=mask, create_flags=0, max_tso_header=0, send_ops_flags=send_ops_flags) qia.send_cq = cq qia.recv_cq = cq qia.pd = pd return qia class EfaCqTest(EfaRDMATestCase): def setUp(self): super().setUp() self.iters = 100 self.server = None self.client = None def create_players(self, dev_cap, wc_flags, send_ops_flags, qp_count=8): super().create_players(EfaCQRes, send_ops_flags=send_ops_flags, qp_count=qp_count, requested_dev_cap=dev_cap, wc_flags=wc_flags) self.server.remote_gid = self.client.ctx.query_gid(self.client.ib_port, self.client.gid_index) def test_dv_cq_ex_with_sgid(self): wc_flag = efa_e.EFADV_WC_EX_WITH_SGID dev_cap = efa_e.EFADV_DEVICE_ATTR_CAPS_CQ_WITH_SGID self.create_players(dev_cap, wc_flag, e.IBV_QP_EX_WITH_SEND, qp_count=1) recv_wr = u.get_recv_wr(self.server) self.server.qps[0].post_recv(recv_wr) ah_client = u.get_global_ah(self.client, self.gid_index, self.ib_port) _ , sg = u.get_send_elements(self.client, False) u.send(self.client, sg, e.IBV_WR_SEND, new_send=True, qp_idx=0, ah=ah_client) u.poll_cq_ex(self.client.cq) u.poll_cq_ex(self.server.cq, sgid=self.server.remote_gid) class EfaMRTest(EfaAPITestCase): """ Test various 
functionalities of the EfaMR class. """ def test_efadv_query_mr(self): with PD(self.ctx) as pd: try: mr = efa.EfaMR(pd, 16, e.IBV_ACCESS_LOCAL_WRITE) mr_attrs = mr.query() if self.config['verbosity']: print(f'\n{mr_attrs}') assert(mr_attrs.ic_id_validity & ~(efa_e.EFADV_MR_ATTR_VALIDITY_RECV_IC_ID | efa_e.EFADV_MR_ATTR_VALIDITY_RDMA_READ_IC_ID | efa_e.EFADV_MR_ATTR_VALIDITY_RDMA_RECV_IC_ID) == 0) except PyverbsRDMAError as ex: if ex.error_code in [errno.EOPNOTSUPP, errno.ENOTTY, errno.EPROTONOSUPPORT]: raise unittest.SkipTest(f'Query MR not supported, errno={ex.error_code}') raise ex rdma-core-56.1/tests/test_flow.py000066400000000000000000000135571477342711600170660ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia All rights reserved. See COPYING file """ Test module for pyverbs' flow module. """ from tests.base import RDMATestCase, RawResources, PyverbsRDMAError from pyverbs.spec import EthSpec, Ipv4ExtSpec, Ipv6Spec, TcpUdpSpec from tests.utils import requires_root_on_eth, PacketConsts from pyverbs.flow import FlowAttr, Flow import pyverbs.enums as e import tests.utils as u import unittest import socket import errno class FlowRes(RawResources): def __init__(self, dev_name, ib_port, gid_index): """ Initialize Flow resources based on Raw resources that include Raw QP. :param dev_name: Device name to be used :param ib_port: IB port of the device to use :param gid_index: Which GID index to use """ super().__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index) @requires_root_on_eth() def create_qps(self): super().create_qps() @staticmethod def create_eth_spec(ether_type=PacketConsts.ETHER_TYPE_IPV4): """ Creates an Ethernet spec that matches on ethertype, source and destination MACs. :param ether_type: IPv4 or IPv6 :return: created ethernet spec """ eth_spec = EthSpec(ether_type=ether_type, dst_mac=PacketConsts.DST_MAC) eth_spec.src_mac = PacketConsts.SRC_MAC eth_spec.src_mac_mask = PacketConsts.SRC_MAC return eth_spec def create_ip_spec(self, ver=PacketConsts.IP_V4, next_hdr=socket.IPPROTO_UDP): """ Creates IPv4 or IPv6 spec that matches on source and destination IPs. :param ver: IP version :param next_hdr: Next header type :return: created IPv4 or IPv6 spec """ if ver == PacketConsts.IP_V4: ip_spec = Ipv4ExtSpec(src_ip=PacketConsts.SRC_IP, dst_ip=PacketConsts.DST_IP, proto=next_hdr) else: ip_spec = Ipv6Spec(src_ip=PacketConsts.SRC_IP6, dst_ip=PacketConsts.DST_IP6, next_hdr=next_hdr) return ip_spec @staticmethod def create_tcp_udp_spec(spec_type): """ Creates a TCP/UDP spec that matches on source and destination ports. :param spec_type: Spec type TCP or UDP :return: TCP or UDP spec """ spec = TcpUdpSpec(spec_type, src_port=PacketConsts.SRC_PORT, dst_port=PacketConsts.DST_PORT) return spec def create_flow(self, specs=None): """ Creates flow to match on provided specs. :param specs: list of specs to match on :return: created flow """ specs = [] if specs is None else specs flow_attr = FlowAttr(num_of_specs=len(specs), port=self.ib_port) for spec in specs: flow_attr.specs.append(spec) try: flow = self._create_flow(flow_attr) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Flow creation is not supported') raise ex return flow def _create_flow(self, flow_attr): return Flow(self.qp, flow_attr) class FlowTest(RDMATestCase): """ Test various functionalities of the Flow class.
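Typical construction used by these tests (via the FlowRes helpers above; the steering rule is attached to the server's Raw QP):

    eth_spec = self.server.create_eth_spec()
    flow = self.server.create_flow([eth_spec])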
""" def setUp(self): super().setUp() self.iters = 10 self.server = None self.client = None def flow_traffic(self, specs, l3=PacketConsts.IP_V4, l4=PacketConsts.UDP_PROTO): """ Execute raw ethernet traffic with given specs flow. :param specs: list of specs :param l3: Packet layer 3 type: 4 for IPv4 or 6 for IPv6 :param l4: Packet layer 4 type: 'tcp' or 'udp' :return: None """ self.flow = self.server.create_flow(specs) u.raw_traffic(self.client, self.server, self.iters, l3, l4) def test_eth_spec_flow_traffic(self): self.create_players(FlowRes) self.flow_traffic([self.server.create_eth_spec()]) def test_ipv4_spec_flow_traffic(self): self.create_players(FlowRes) if self.is_eth_and_has_roce_hw_bug(): raise unittest.SkipTest(f'Device {self.dev_name} doesn\'t support Ipv4ExtSpec') self.flow_traffic([self.server.create_ip_spec()]) def test_ipv6_spec_flow_traffic(self): self.create_players(FlowRes) eth_spec = self.server.create_eth_spec(PacketConsts.ETHER_TYPE_IPV6) if self.is_eth_and_has_roce_hw_bug(): raise unittest.SkipTest(f'Device {self.dev_name} doesn\'t support Ipv6Spec') ip_spec = self.server.create_ip_spec(PacketConsts.IP_V6) self.flow_traffic([eth_spec, ip_spec], PacketConsts.IP_V6) def test_udp_spec_flow_traffic(self): self.create_players(FlowRes) eth_spec = self.server.create_eth_spec() if self.is_eth_and_has_roce_hw_bug(): raise unittest.SkipTest(f'Device {self.dev_name} doesn\'t support Ipv4ExtSpec') ip_spec = self.server.create_ip_spec() udp_spec = self.server.create_tcp_udp_spec(e.IBV_FLOW_SPEC_UDP) self.flow_traffic([eth_spec, ip_spec, udp_spec], PacketConsts.IP_V4, PacketConsts.UDP_PROTO) def test_tcp_spec_flow_traffic(self): self.create_players(FlowRes) eth_spec = self.server.create_eth_spec(PacketConsts.ETHER_TYPE_IPV6) if self.is_eth_and_has_roce_hw_bug(): raise unittest.SkipTest(f'Device {self.dev_name} doesn\'t support Ipv6Spec') ip_spec = self.server.create_ip_spec(PacketConsts.IP_V6, socket.IPPROTO_TCP) tcp_spec = self.server.create_tcp_udp_spec(e.IBV_FLOW_SPEC_TCP) self.flow_traffic([eth_spec, ip_spec, tcp_spec], PacketConsts.IP_V6, PacketConsts.TCP_PROTO) rdma-core-56.1/tests/test_fork.py000066400000000000000000000020301477342711600170410ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright 2021 Amazon.com, Inc. or its affiliates. All rights reserved. import errno import pyverbs.enums as e from pyverbs.fork import fork_init, is_fork_initialized from pyverbs.pyverbs_error import PyverbsRDMAError from tests.base import PyverbsAPITestCase class ForkAPITest(PyverbsAPITestCase): """ Test the API of the fork functions. """ def test_is_fork_initialized(self): try: fork_init() expected_ret = [e.IBV_FORK_ENABLED, e.IBV_FORK_UNNEEDED] except PyverbsRDMAError as ex: # Depends on the order of the tests EINVAL could be returned if # fork_init() is called after an MR has already been registered. 
self.assertEqual(ex.error_code, errno.EINVAL) expected_ret = [e.IBV_FORK_DISABLED, e.IBV_FORK_UNNEEDED] ret = is_fork_initialized() if self.config['verbosity']: print(f'is_fork_initialized() = {ret}') self.assertIn(ret, expected_ret) rdma-core-56.1/tests/test_mlx5_cq.py000066400000000000000000000235451477342711600174660ustar00rootroot00000000000000import unittest import errno from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr, \ Mlx5DVCQInitAttr, Mlx5CQ, context_flags_to_str, cqe_comp_to_str from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsUserError import pyverbs.providers.mlx5.mlx5_enums as dve from tests.mlx5_base import Mlx5RDMATestCase from tests.mlx5_base import Mlx5DcResources from pyverbs.cq import CqInitAttrEx from tests.base import RCResources import pyverbs.enums as e import tests.utils as u def create_dv_cq(res): """ Create Mlx5 DV CQ. :param res: An instance of BaseResources. :return: None """ dvcq_init_attr = Mlx5DVCQInitAttr() if res.cqe_comp_res_format: dvcq_init_attr.cqe_comp_res_format = res.cqe_comp_res_format dvcq_init_attr.comp_mask |= dve.MLX5DV_CQ_INIT_ATTR_MASK_COMPRESSED_CQE # Check CQE compression capability cqe_comp_caps = res.ctx.query_mlx5_device().cqe_comp_caps if not (cqe_comp_caps['supported_format'] & res.cqe_comp_res_format) or \ not cqe_comp_caps['max_num']: cqe_comp_str = cqe_comp_to_str(res.cqe_comp_res_format) raise unittest.SkipTest(f'CQE compression {cqe_comp_str} is not supported') if res.flags: dvcq_init_attr.flags = res.flags dvcq_init_attr.comp_mask |= dve.MLX5DV_CQ_INIT_ATTR_MASK_FLAGS if res.cqe_size: dvcq_init_attr.cqe_size = res.cqe_size dvcq_init_attr.comp_mask |= dve.MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE try: res.cq = Mlx5CQ(res.ctx, CqInitAttrEx(), dvcq_init_attr) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create Mlx5DV CQ is not supported') raise ex class Mlx5CQRes(RCResources): def __init__(self, dev_name, ib_port, gid_index, cqe_comp_res_format=None, flags=None, cqe_size=None, msg_size=1024, requested_dev_cap=None): """ Initialize Mlx5 DV CQ resources based on RC resources that include RC QP. :param dev_name: Device name to be used :param ib_port: IB port of the device to use :param gid_index: Which GID index to use :param cqe_comp_res_format: Type of compression to use :param flags: DV CQ specific flags :param cqe_size: The CQE size :param msg_size: The resource msg size :param requested_dev_cap: A necessary device cap. If it's not supported by the device, the test will be skipped. 
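The DV CQ itself is created by create_dv_cq() above; in essence (CQE size branch shown):

    dv_attr = Mlx5DVCQInitAttr()
    dv_attr.cqe_size = cqe_size
    dv_attr.comp_mask |= dve.MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE
    cq = Mlx5CQ(ctx, CqInitAttrEx(), dv_attr)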
""" self.cqe_comp_res_format = cqe_comp_res_format self.flags = flags self.cqe_size = cqe_size self.requested_dev_cap = requested_dev_cap super().__init__(dev_name, ib_port, gid_index, msg_size=msg_size) def create_context(self): mlx5dv_attr = Mlx5DVContextAttr() try: self.ctx = Mlx5Context(mlx5dv_attr, name=self.dev_name) except PyverbsUserError as ex: raise unittest.SkipTest(f'Could not open mlx5 context ({ex})') except PyverbsRDMAError: raise unittest.SkipTest('Opening mlx5 context is not supported') if self.requested_dev_cap: if not self.ctx.query_mlx5_device().flags & self.requested_dev_cap: miss_caps = context_flags_to_str(self.requested_dev_cap) raise unittest.SkipTest(f'Device caps doesn\'t support {miss_caps}') def create_cq(self): create_dv_cq(self) class Mlx5DvCqDcRes(Mlx5DcResources): def __init__(self, dev_name, ib_port, gid_index, cqe_comp_res_format=None, flags=None, cqe_size=None, create_flags=None): """ Initialize Mlx5 DV CQ resources based on RC resources that include RC QP. :param dev_name: Device name to be used :param ib_port: IB port of the device to use :param gid_index: Which GID index to use :param cqe_comp_res_format: Type of compression to use :param flags: DV CQ specific flags :param cqe_size: The CQ's CQe size :param create_flags: DV QP specific flags """ self.cqe_comp_res_format = cqe_comp_res_format self.flags = flags self.cqe_size = cqe_size super().__init__(dev_name, ib_port, gid_index, send_ops_flags=e.IBV_QP_EX_WITH_SEND, create_flags=create_flags) def create_cq(self): create_dv_cq(self) class DvCqTest(Mlx5RDMATestCase): def setUp(self): super().setUp() self.iters = 10 self.server = None self.client = None self.traffic_args = None def create_players(self, resource, **resource_arg): """ Init DV CQ tests resources. :param resource: The RDMA resources to use. :param resource_arg: Dict of args that specify the resource specific attributes. :return: None """ super().create_players(resource, **resource_arg) if resource == Mlx5DvCqDcRes: self.client.remote_dct_num = self.server.dct_qp.qp_num self.server.remote_dct_num = self.client.dct_qp.qp_num def test_dv_cq_traffic(self): """ Run SEND traffic using DC CQ. """ self.create_players(Mlx5CQRes) u.traffic(**self.traffic_args, is_cq_ex=True) def test_dv_cq_compression_flags(self): """ Create DV CQ with different types of CQE compression formats. The test also does bad flow and try to use more than one compression formats. """ # Create DV CQ with all legal compression flags. for comp_type in [dve.MLX5DV_CQE_RES_FORMAT_CSUM_STRIDX, dve.MLX5DV_CQE_RES_FORMAT_CSUM, dve.MLX5DV_CQE_RES_FORMAT_HASH]: self.create_players(Mlx5CQRes, cqe_comp_res_format=comp_type, requested_dev_cap=dve.MLX5DV_CONTEXT_FLAGS_CQE_128B_COMP) u.traffic(**self.traffic_args, is_cq_ex=True) # Try to create DV CQ with more than one compression flags. cqe_multi_format = dve.MLX5DV_CQE_RES_FORMAT_HASH | \ dve.MLX5DV_CQE_RES_FORMAT_CSUM with self.assertRaises(PyverbsRDMAError) as ex: self.create_players(Mlx5CQRes, cqe_comp_res_format=cqe_multi_format) self.assertEqual(ex.exception.error_code, errno.EINVAL) def test_dv_cq_padding(self): """ Create DV CQ with padding flag. """ self.create_players(Mlx5CQRes, cqe_size=128, flags=dve.MLX5DV_CQ_INIT_ATTR_FLAGS_CQE_PAD, requested_dev_cap=dve.MLX5DV_CONTEXT_FLAGS_CQE_128B_PAD) u.traffic(**self.traffic_args, is_cq_ex=True) def test_dv_cq_padding_not_aligned_cqe_size(self): """ Create DV CQ with padding flag when CQE size is not 128B. 
The creation should fail because padding is supported only with CQE size of 128B. """ # Padding flag works only when the cqe size is 128. with self.assertRaises(PyverbsRDMAError) as ex: self.create_players(Mlx5CQRes, cqe_size=64, flags=dve.MLX5DV_CQ_INIT_ATTR_FLAGS_CQE_PAD, requested_dev_cap=dve.MLX5DV_CONTEXT_FLAGS_CQE_128B_PAD) self.assertEqual(ex.exception.error_code, errno.EINVAL) def test_dv_cq_cqe_size_128(self): """ Test multiple msg sizes using a CQE size of 128B. """ msg_sizes = [60, # Lower than 64B 70, # In range of 64B - 128B 140] # Bigger than 128B for size in msg_sizes: self.create_players(Mlx5CQRes, cqe_size=128, msg_size=size) u.traffic(**self.traffic_args, is_cq_ex=True) def test_dv_cq_cqe_size_64(self): """ Test multiple msg sizes using a CQE size of 64B. """ msg_sizes = [16, # Lower than 32B 60, # In range of 32B - 64B 70] # Bigger than 64B for size in msg_sizes: self.create_players(Mlx5CQRes, cqe_size=64, msg_size=size) u.traffic(**self.traffic_args, is_cq_ex=True) def test_dv_cq_cqe_size_with_bad_size(self): """ Create CQ with an illegal cqe_size value. """ # Set the CQE size in the CQ creation. with self.assertRaises(PyverbsRDMAError) as ex: self.create_players(Mlx5CQRes, cqe_size=100) self.assertEqual(ex.exception.error_code, errno.EINVAL) # Set the CQE size using the environment variable. self.set_env_variable('MLX5_CQE_SIZE', '100') with self.assertRaises(PyverbsRDMAError) as ex: self.create_players(Mlx5CQRes) self.assertEqual(ex.exception.error_code, errno.EINVAL) def test_dv_cq_cqe_size_environment_var(self): """ Create DV CQs with all the legal cqe_size values using the environment variable mechanism. """ for cqe_size in ['64', '128']: self.set_env_variable('MLX5_CQE_SIZE', cqe_size) self.create_players(Mlx5CQRes) def test_scatter_to_cqe_control_by_qp(self): """ Create QP with specific SCATTER_TO_CQE flags. The test sets different values in the scatter2cqe environment variable and creates the QP with enable/disable flags. The QP should ignore the environment variable value and behave according to the specific creation flag. """ for s2c_env_val in ['0', '1']: for qp_s2c_value in [dve.MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE, dve.MLX5DV_QP_CREATE_ALLOW_SCATTER_TO_CQE]: self.set_env_variable('MLX5_SCATTER_TO_CQE', s2c_env_val) self.create_players(Mlx5DvCqDcRes, create_flags=qp_s2c_value) u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_SEND, is_cq_ex=True) rdma-core-56.1/tests/test_mlx5_crypto.py000066400000000000000000000523771477342711600204020ustar00rootroot00000000000000import unittest import struct import errno import json import os from pyverbs.providers.mlx5.mlx5dv_mkey import Mlx5Mkey, Mlx5MrInterleaved, \ Mlx5MkeyConfAttr, Mlx5SigCrc, Mlx5SigBlockDomain, Mlx5SigBlockAttr from pyverbs.providers.mlx5.mlx5dv_crypto import Mlx5CryptoLoginAttr, Mlx5DEK, \ Mlx5DEKInitAttr, Mlx5CryptoAttr, Mlx5CryptoLogin, Mlx5CryptoExtLoginAttr from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr, \ Mlx5DVQPInitAttr, Mlx5QP from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsUserError from tests.mlx5_base import Mlx5RDMATestCase, Mlx5PyverbsAPITestCase import pyverbs.providers.mlx5.mlx5_enums as dve from pyverbs.wr import SGE, SendWR, RecvWR from pyverbs.qp import QPInitAttrEx, QPCap from tests.base import RCResources from pyverbs.pd import PD import pyverbs.enums as e import tests.utils as u DEK_OPAQUE = b'dek' """ Crypto operation requires specific input from the user, e.g. the wrapped credential that the device is configured with.
The input should be provided in JSON format in a file in this path: "/tmp/mlx5_crypto_test.txt". A user can also set this environment variable with their file path: MLX5_CRYPTO_TEST_INFO This doc describes the input options with examples of the format: Mandatory: Wrapped credential: The wrapped credential that was configured in the device. Wrapped key: Wrapped key for the DEK creation. The key should be encrypted using the KEK (Key Encrypted Key). Optional: Wrapped 256 bits key: Wrapped key for the DEK creation when the key size of 256 bits is required. If not provided, the tests will only use the 128-bit key. Encrypted data for 512 of 'c': If a user wants to have data validation, they need to provide the expected encrypted data for a plaintext of 512 bytes of the character 'c'. If not provided, data validation will be skipped. Example of content of such file: [{"credential": [8704278040424473809, 4403447855848063568, 13892768337045135232, 5942481448427925932, 171338997253969038, 5703425261028721211], "wrapped_key": [contains 5 integers of 64bits each], "plaintext_key": [contains 8 integers of 32bits each], "plaintext_256_bits_key": [contains 16 integers of 32bits each], "encrypted_data_for_512_c": [contains 64 integers of 64bits each], "wrapped_256_bits_key": [contains 9 integers of 64bits each]}] """ def check_crypto_caps(dev_name, is_wrapped_dek_mode, multi_block_support=False): """ Check that this device supports crypto actions. :param dev_name: The device name. :param is_wrapped_dek_mode: True when wrapped_dek and False when plaintext. :param multi_block_support: If True, check for multi-block support. """ mlx5dv_attr = Mlx5DVContextAttr() ctx = Mlx5Context(mlx5dv_attr, name=dev_name) crypto_caps = ctx.query_mlx5_device().crypto_caps failed_selftests = crypto_caps['failed_selftests'] if failed_selftests: raise unittest.SkipTest(f'The device crypto selftest failed ({failed_selftests})') single_block_cap = dve.MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_SINGLE_BLOCK \ & crypto_caps['crypto_engines'] multi_block_cap = dve.MLX5DV_CRYPTO_ENGINES_CAP_AES_XTS_MULTI_BLOCK \ & crypto_caps['crypto_engines'] if not single_block_cap and not multi_block_cap: raise unittest.SkipTest('The device crypto engines do not support AES') elif multi_block_support and not multi_block_cap: raise unittest.SkipTest('The device crypto engines do not support multi blocks') elif multi_block_cap: assert single_block_cap, \ 'The device crypto engines do not support single block but support multi blocks' dev_wrapped_import_method = crypto_caps['wrapped_import_method'] if is_wrapped_dek_mode: if not dve.MLX5DV_CRYPTO_WRAPPED_IMPORT_METHOD_CAP_AES_XTS & dev_wrapped_import_method or \ not dve.MLX5DV_CRYPTO_CAPS_WRAPPED_CRYPTO_OPERATIONAL & crypto_caps['flags']: raise unittest.SkipTest('The device does not support wrapped DEK') elif dve.MLX5DV_CRYPTO_WRAPPED_IMPORT_METHOD_CAP_AES_XTS & dev_wrapped_import_method and \ dve.MLX5DV_CRYPTO_CAPS_WRAPPED_CRYPTO_OPERATIONAL & crypto_caps['flags']: raise unittest.SkipTest('The device does not support plaintext DEK') def require_crypto_login_details(instance): """ Parse the crypto login session details from this file: '/tmp/mlx5_crypto_test.txt' If the file doesn't exist or the content is not in JSON format, skip the test. :param instance: The test instance.
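This helper is not called directly by tests; it is invoked through the requires_crypto_support() decorator below, e.g. (the test name is illustrative):

    @requires_crypto_support(is_wrapped_dek_mode=True)
    def test_something(self):
        ...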
""" crypto_file = '/tmp/mlx5_crypto_test.txt' if 'MLX5_CRYPTO_TEST_INFO' in os.environ: crypto_file = os.environ['MLX5_CRYPTO_TEST_INFO'] try: with open(crypto_file, 'r') as f: test_details = f.read() setattr(instance, 'crypto_details', test_details) json.loads(test_details) instance.crypto_details = json.loads(test_details)[0] except json.JSONDecodeError: raise unittest.SkipTest(f'The crypto data in {crypto_file} must be in Json format') except FileNotFoundError: raise unittest.SkipTest(f'Crypto login details must be supplied in {crypto_file}') def requires_crypto_support(is_wrapped_dek_mode, multi_block_support=False): """ :param is_wrapped_dek_mode: True when wapped_dek and False when plaintext. :param multi_block_support: If True, check for multi-block support. """ def outer(func): def inner(instance): require_crypto_login_details(instance) check_crypto_caps(instance.dev_name, is_wrapped_dek_mode, multi_block_support) return func(instance) return inner return outer class Mlx5CryptoResources(RCResources): def __init__(self, dev_name, ib_port, gid_index, dv_send_ops_flags=0, mkey_create_flags=dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT, msg_size=1024): self.dv_send_ops_flags = dv_send_ops_flags self.mkey_create_flags = mkey_create_flags self.max_inline_data = 512 self.send_ops_flags = e.IBV_QP_EX_WITH_SEND super().__init__(dev_name, ib_port, gid_index, msg_size=msg_size) self.create_mkeys() def create_mkeys(self): try: self.wire_enc_mkey = Mlx5Mkey(self.pd, self.mkey_create_flags, max_entries=1) self.mem_enc_mkey = Mlx5Mkey(self.pd, self.mkey_create_flags, max_entries=1) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create Mkey is not supported') raise ex def create_qp_cap(self): return QPCap(max_send_wr=self.num_msgs, max_recv_wr=self.num_msgs, max_inline_data=self.max_inline_data) def create_qp_init_attr(self): comp_mask = e.IBV_QP_INIT_ATTR_PD | e.IBV_QP_INIT_ATTR_SEND_OPS_FLAGS return QPInitAttrEx(cap=self.create_qp_cap(), pd=self.pd, scq=self.cq, rcq=self.cq, qp_type=e.IBV_QPT_RC, send_ops_flags=self.send_ops_flags, comp_mask=comp_mask) def create_qps(self): try: qp_init_attr = self.create_qp_init_attr() comp_mask = dve.MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS attr = Mlx5DVQPInitAttr(comp_mask=comp_mask, send_ops_flags=self.dv_send_ops_flags) qp = Mlx5QP(self.ctx, qp_init_attr, attr) self.qps.append(qp) self.qps_num.append(qp.qp_num) self.psns.append(0) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create Mlx5DV QP is not supported') raise ex class Mlx5CryptoAPITest(Mlx5PyverbsAPITestCase): def __init__(self, methodName='runTest'): super().__init__(methodName) self.crypto_details = None def verify_create_dek_out_of_login_session(self): """ Verify that create DEK out of crypto login session is not permited. """ with self.assertRaises(PyverbsRDMAError) as ex: Mlx5DEK(self.ctx, self.dek_init_attr) self.assertEqual(ex.exception.error_code, errno.EINVAL) def verify_login_state(self, expected_state): """ Query the session login state and verify that it's as expected. """ state = Mlx5Context.query_login_state(self.ctx) self.assertEqual(state, expected_state) def verify_login_twice(self): """ Verify that when there is already a login session alive the second login fails. 
""" with self.assertRaises(PyverbsRDMAError) as ex: Mlx5Context.crypto_login(self.ctx, self.login_attr) self.assertEqual(ex.exception.error_code, errno.EEXIST) def verify_dek_opaque(self): """ Query the DEK and verify that its opaque is as expected. """ dek_attr = self.dek.query() self.assertEqual(dek_attr.opaque, DEK_OPAQUE) @requires_crypto_support(is_wrapped_dek_mode=True) def test_mlx5_dek_management(self): """ Test crypto login and DEK management APIs. The test checks also that invalid actions are not permited, e.g, create DEK not in login session on wrapped DEK mode. """ try: self.pd = PD(self.ctx) cred_bytes = struct.pack('!6Q', *self.crypto_details['credential']) key = struct.pack('!5Q', *self.crypto_details['wrapped_key']) self.dek_init_attr = \ Mlx5DEKInitAttr(self.pd, key=key, key_size=dve.MLX5DV_CRYPTO_KEY_SIZE_128, key_purpose=dve.MLX5DV_CRYPTO_KEY_PURPOSE_AES_XTS, opaque=DEK_OPAQUE) self.verify_create_dek_out_of_login_session() self.verify_login_state(dve.MLX5DV_CRYPTO_LOGIN_STATE_NO_LOGIN) # Login to crypto session self.login_attr = Mlx5CryptoLoginAttr(cred_bytes) Mlx5Context.crypto_login(self.ctx, self.login_attr) self.verify_login_state(dve.MLX5DV_CRYPTO_LOGIN_STATE_VALID) self.verify_login_twice() self.dek = Mlx5DEK(self.ctx, self.dek_init_attr) self.verify_dek_opaque() self.dek.close() # Logout from crypto session Mlx5Context.crypto_logout(self.ctx) self.verify_login_state(dve.MLX5DV_CRYPTO_LOGIN_STATE_NO_LOGIN) except PyverbsRDMAError as ex: print(ex) if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create crypto elements is not supported') raise ex class Mlx5CryptoTrafficTest(Mlx5RDMATestCase): """ Test the mlx5 cryto APIs. """ def setUp(self): super().setUp() self.iters = 10 self.crypto_details = None self.validate_data = False self.is_multi_block = False self.msg_size = 1024 self.key_size = dve.MLX5DV_CRYPTO_KEY_SIZE_128 def create_client_dek(self): """ Create DEK using the client resources. """ cred_bytes = struct.pack('!6Q', *self.crypto_details['credential']) log_attr = Mlx5CryptoLoginAttr(cred_bytes) Mlx5Context.crypto_login(self.client.ctx, log_attr) self._create_wrapped_dek() def create_client_wrapped_dek_login_obj(self): """ Create CryptoExtLoginAttr and crypto login object """ cred_bytes = struct.pack('!6Q', *self.crypto_details['credential']) log_attr = Mlx5CryptoExtLoginAttr(cred_bytes, 48) crypto_login_obj = Mlx5CryptoLogin(self.client.ctx, log_attr) comp_mask=dve.MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN self._create_wrapped_dek(comp_mask, crypto_login_obj) def _create_wrapped_dek(self, comp_mask=0, crypto_login_obj=None): """ Create wrapped DEK using the client resources. """ key = struct.pack('!5Q', *self.crypto_details['wrapped_key']) if self.key_size == dve.MLX5DV_CRYPTO_KEY_SIZE_256: key = struct.pack('!9Q', *self.crypto_details['wrapped_256_bits_key']) if comp_mask == dve.MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN: self.dek_attr = Mlx5DEKInitAttr(self.client.pd, key=key, key_size=self.key_size, key_purpose=dve.MLX5DV_CRYPTO_KEY_PURPOSE_AES_XTS, comp_mask=comp_mask, crypto_login=crypto_login_obj) else: self.dek_attr = Mlx5DEKInitAttr(self.client.pd, key=key, key_size=self.key_size, key_purpose=dve.MLX5DV_CRYPTO_KEY_PURPOSE_AES_XTS) self.dek = Mlx5DEK(self.client.ctx, self.dek_attr) def create_client_plaintext_dek(self): """ Create DEK using the client resources. 
""" key = struct.pack('8I',*self.crypto_details['plaintext_key']) if self.key_size == dve.MLX5DV_CRYPTO_KEY_SIZE_256: key = struct.pack('16I', *self.crypto_details['plaintext_256_bits_key']) self.dek_attr = Mlx5DEKInitAttr(self.client.pd,key=key, key_size=self.key_size, comp_mask=dve.MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN, crypto_login=None) self.dek = Mlx5DEK(self.client.ctx, self.dek_attr) def reg_client_mkey(self, signature=False): """ Configure an mkey with crypto attributes. :param signature: True if signature configuration requested. """ num_of_configuration = 4 if signature else 3 for mkey in [self.client.wire_enc_mkey, self.client.mem_enc_mkey]: self.client.qp.wr_start() self.client.qp.wr_flags = e.IBV_SEND_SIGNALED | e.IBV_SEND_INLINE offset = 0 if mkey == self.client.wire_enc_mkey else self.client.msg_size/2 sge = SGE(self.client.mr.buf + offset, self.client.msg_size/2, self.client.mr.lkey) self.client.qp.wr_mkey_configure(mkey, num_of_configuration, Mlx5MkeyConfAttr()) self.client.qp.wr_set_mkey_access_flags(e.IBV_ACCESS_LOCAL_WRITE) self.client.qp.wr_set_mkey_layout_list([sge]) if signature: self.configure_mkey_signature() initial_tweak = struct.pack('!2Q', int(0), int(0)) encrypt_on_tx = mkey == self.client.wire_enc_mkey sign_crypto_order = dve.MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_BEFORE_CRYPTO_ON_TX crypto_attr = Mlx5CryptoAttr(crypto_standard=dve.MLX5DV_CRYPTO_STANDARD_AES_XTS, encrypt_on_tx=encrypt_on_tx, signature_crypto_order=sign_crypto_order, data_unit_size=dve.MLX5DV_BLOCK_SIZE_512, dek=self.dek, initial_tweak=initial_tweak) self.client.qp.wr_set_mkey_crypto(crypto_attr) self.client.qp.wr_complete() u.poll_cq(self.client.cq) def configure_mkey_signature(self): """ Configure an mkey with signature attributes. """ sig_crc = Mlx5SigCrc(crc_type=dve.MLX5DV_SIG_CRC_TYPE_CRC32, seed=0xFFFFFFFF) sig_block_domain = Mlx5SigBlockDomain(sig_type=dve.MLX5DV_SIG_TYPE_CRC, crc=sig_crc, block_size=dve.MLX5DV_BLOCK_SIZE_512) sig_attr = Mlx5SigBlockAttr(wire=sig_block_domain, check_mask=dve.MLX5DV_SIG_MASK_CRC32) self.client.qp.wr_set_mkey_sig_block(sig_attr) def get_send_wr(self, player, wire_encryption): mkey = player.wire_enc_mkey if wire_encryption else player.mem_enc_mkey sge = SGE(0, player.msg_size/2, mkey.lkey) return SendWR(opcode=e.IBV_WR_SEND, num_sge=1, sg=[sge]) def get_recv_wr(self, player, wire_encryption): offset = 0 if wire_encryption else player.msg_size/2 sge = SGE(player.mr.buf + offset, player.msg_size/2, player.mr.lkey) return RecvWR(sg=[sge], num_sge=1) def prepare_validate_data(self): data_size = int(self.msg_size / 2) self.client.mr.write('c' * data_size, data_size) if self.is_multi_block: encrypted_data = struct.pack('!128Q', *self.crypto_details['encrypted_data_for_1024_c']) else: encrypted_data = struct.pack('!64Q', *self.crypto_details['encrypted_data_for_512_c']) self.client.mr.write(encrypted_data, data_size, offset=data_size) def validate_crypto_data(self): """ Validate the server MR data. Verify that the encryption/decryption works well. 
""" data_size= int(self.msg_size / 2) send_msg = self.client.mr.read(self.msg_size, 0) recv_msg = self.server.mr.read(self.msg_size, 0) self.assertEqual(send_msg[0:data_size], recv_msg[data_size:self.msg_size]) self.assertEqual(send_msg[data_size:self.msg_size], recv_msg[0:data_size]) def init_data_validation(self): if 'encrypted_data_for_512_c' in self.crypto_details and not self.is_multi_block: self.validate_data = True elif 'encrypted_data_for_1024_c' in self.crypto_details and self.is_multi_block: self.validate_data = True def traffic(self): """ Perform RC traffic using the configured mkeys. """ if self.validate_data: self.prepare_validate_data() for _ in range(self.iters): self.server.qp.post_recv(self.get_recv_wr(self.server, wire_encryption=True)) self.server.qp.post_recv(self.get_recv_wr(self.server, wire_encryption=False)) self.client.qp.post_send(self.get_send_wr(self.client, wire_encryption=True)) self.client.qp.post_send(self.get_send_wr(self.client, wire_encryption=False)) u.poll_cq(self.client.cq, count=2) u.poll_cq(self.server.cq, count=2) if self.validate_data: self.validate_crypto_data() def run_crypto_dek_test(self, create_dek_func): """ Creates player and DEK, using the give function, for different test cases, and runs traffic. :param create_dek_func: Function that creates a DEK. """ self.init_data_validation() mkey_flags = dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO | \ dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT self.create_players(Mlx5CryptoResources, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MKEY_CONFIGURE, mkey_create_flags=mkey_flags, msg_size=self.msg_size) create_dek_func() self.reg_client_mkey() self.traffic() @requires_crypto_support(is_wrapped_dek_mode=True) def test_mlx5_crypto_mkey_old_api(self): """ Create Mkeys, register a memory layout using the mkeys, configure crypto attributes on it and then run traffic. """ self.run_crypto_dek_test(self.create_client_dek) @requires_crypto_support(is_wrapped_dek_mode=True) def test_mlx5_crypto_wrapped_dek(self): """ Create Mkeys, register a memory layout using the mkeys, configure crypto attributes on it using login object API and then run traffic. Use wrapped DEK with new API. """ self.run_crypto_dek_test(self.create_client_wrapped_dek_login_obj) @requires_crypto_support(is_wrapped_dek_mode=False) def test_mlx5_crypto_plaintext_dek(self): """ Create Mkeys, register a memory layout using the mkeys, configure crypto attributes on it using login object API and then run traffic. Use plaintext DEK. """ self.run_crypto_dek_test(self.create_client_plaintext_dek) @requires_crypto_support(is_wrapped_dek_mode=True) def test_mlx5_crypto_signature_mkey(self): """ Create Mkeys, register a memory layout using this mkey, configure crypto and signature attributes on it and then perform traffic using this mkey. 
""" if 'wrapped_256_bits_key' in self.crypto_details: self.key_size = dve.MLX5DV_CRYPTO_KEY_SIZE_256 mkey_flags = dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO | \ dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT | \ dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE self.create_players(Mlx5CryptoResources, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MKEY_CONFIGURE, mkey_create_flags=mkey_flags) self.create_client_dek() self.reg_client_mkey(signature=True) self.traffic() @requires_crypto_support(is_wrapped_dek_mode=False, multi_block_support=True) def test_mlx5_plaintext_dek_multi_block(self): self.is_multi_block = True self.msg_size = 2048 self.run_crypto_dek_test(self.create_client_plaintext_dek) @requires_crypto_support(is_wrapped_dek_mode=True, multi_block_support=True) def test_mlx5_wrapped_dek_multi_block(self): self.is_multi_block = True self.msg_size = 2048 self.run_crypto_dek_test(self.create_client_wrapped_dek_login_obj) rdma-core-56.1/tests/test_mlx5_cuda_umem.py000066400000000000000000000101361477342711600210120ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2022 Nvidia Inc. All rights reserved. See COPYING file import resource from pyverbs.providers.mlx5.mlx5dv import Mlx5DevxObj, WqeDataSeg, Mlx5UMEM from tests.mlx5_base import Mlx5DevxRcResources, Mlx5DevxTrafficBase import pyverbs.providers.mlx5.mlx5_enums as dve import tests.cuda_utils as cu import pyverbs.enums as e try: from cuda import cuda, cudart, nvrtc cu.CUDA_FOUND = True except ImportError: cu.CUDA_FOUND = False GPU_PAGE_SIZE = 1 << 16 @cu.set_mem_io_cuda_methods class CudaDevxRes(Mlx5DevxRcResources): def __init__(self, dev_name, ib_port, gid_index, mr_access=e.IBV_ACCESS_LOCAL_WRITE): """ Initialize DevX resources with CUDA memory allocations. :param dev_name: Device name to be used :param ib_port: IB port of the device to use :param gid_index: Which GID index to use :param mr_access: The MR access """ self.mr_access = mr_access self.cuda_addr = None self.dmabuf_fd = None self.umem = None self.mkey = None self.lkey = None self.lkey = None super().__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index) def init_resources(self): self.alloc_cuda_mem() super().init_resources() self.create_dmabuf_umem() self.create_mkey() def get_wqe_data_segment(self): return WqeDataSeg(self.msg_size, self.lkey, int(self.cuda_addr)) def alloc_cuda_mem(self): """ Allocates CUDA memory and a DMABUF FD on that memory. 
""" self.cuda_addr = cu.check_cuda_errors(cuda.cuMemAlloc(GPU_PAGE_SIZE)) # Sync between memory operations attr_value = 1 cu.check_cuda_errors(cuda.cuPointerSetAttribute( attr_value, cuda.CUpointer_attribute.CU_POINTER_ATTRIBUTE_SYNC_MEMOPS, int(self.cuda_addr) )) # Memory address and size must be aligned to page size to get a handle assert (GPU_PAGE_SIZE % resource.getpagesize() == 0 and int(self.cuda_addr) % resource.getpagesize() == 0) self.dmabuf_fd = cu.check_cuda_errors( cuda.cuMemGetHandleForAddressRange(self.cuda_addr, GPU_PAGE_SIZE, cuda.CUmemRangeHandleType.CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD, 0)) def create_mr(self): pass def create_dmabuf_umem(self): umem_aligment = resource.getpagesize() self.umem = Mlx5UMEM(self.ctx, GPU_PAGE_SIZE, 0, umem_aligment, self.mr_access, umem_aligment, dve.MLX5DV_UMEM_MASK_DMABUF, self.dmabuf_fd) def create_mkey(self): from tests.mlx5_prm_structs import SwMkc, CreateMkeyIn, CreateMkeyOut accesses = [e.IBV_ACCESS_LOCAL_WRITE, e.IBV_ACCESS_REMOTE_READ, e.IBV_ACCESS_REMOTE_WRITE] lw, rr, rw = (list(map(lambda access: int(self.mr_access & access != 0), accesses))) mkey_ctx = SwMkc(lr=1, lw=lw, rr=rr, rw=rw, access_mode_1_0=0x1, start_addr=int(self.cuda_addr), len=GPU_PAGE_SIZE, pd=self.dv_pd.pdn, qpn=0xffffff) self.mkey = Mlx5DevxObj(self.ctx, CreateMkeyIn(sw_mkc=mkey_ctx, mkey_umem_id=self.umem.umem_id, mkey_umem_valid=1), len(CreateMkeyOut())) self.lkey = CreateMkeyOut(self.mkey.out_view).mkey_index << 8 @cu.set_init_cuda_methods class Mlx5GpuDevxRcTrafficTest(Mlx5DevxTrafficBase): """ Test DevX traffic over CUDA memory using DMA BUF and UMEM """ @cu.requires_cuda def test_mlx_devx_cuda_send_imm_traffic(self): """ Creates two DevX RC QPs and runs SEND_IMM traffic over CUDA allocated memory using UMEM and DMA BUF. """ self.create_players(CudaDevxRes) # Send traffic self.send_imm_traffic() rdma-core-56.1/tests/test_mlx5_dc.py000066400000000000000000000132101477342711600174350ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 NVIDIA Corporation . All rights reserved. See COPYING file import unittest import errno import pyverbs.providers.mlx5.mlx5_enums as me from tests.mlx5_base import Mlx5DcResources, Mlx5RDMATestCase, Mlx5DcStreamsRes,\ DCI_TEST_GOOD_FLOW, DCI_TEST_BAD_FLOW_WITH_RESET,\ DCI_TEST_BAD_FLOW_WITHOUT_RESET from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.providers.mlx5.mlx5dv import Mlx5QP import pyverbs.enums as e import tests.utils as u class OdpDc(Mlx5DcResources): def create_mr(self): try: self.mr = u.create_custom_mr(self, e.IBV_ACCESS_ON_DEMAND) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Reg ODP MR is not supported') raise ex class DCTest(Mlx5RDMATestCase): def setUp(self): super().setUp() self.iters = 10 self.server = None self.client = None self.traffic_args = None def sync_remote_attr(self): """ Exchange the remote attributes between the server and the client. 
""" super().sync_remote_attr() self.client.remote_dct_num = self.server.dct_qp.qp_num self.server.remote_dct_num = self.client.dct_qp.qp_num def test_dc_rdma_write(self): self.create_players(Mlx5DcResources, qp_count=2, send_ops_flags=e.IBV_QP_EX_WITH_RDMA_WRITE) u.rdma_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_RDMA_WRITE) def test_dc_send(self): self.create_players(Mlx5DcResources, qp_count=2, send_ops_flags=e.IBV_QP_EX_WITH_SEND) u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_SEND) def test_dc_atomic(self): self.create_players(Mlx5DcResources, qp_count=2, send_ops_flags=e.IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD) client_max_log = self.client.ctx.query_mlx5_device().max_dc_rd_atom server_max_log = self.server.ctx.query_mlx5_device().max_dc_rd_atom u.atomic_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_ATOMIC_FETCH_AND_ADD, client_wr=client_max_log, server_wr=server_max_log) def test_dc_ah_to_qp_mapping(self): self.create_players(Mlx5DcResources, qp_count=2, send_ops_flags=e.IBV_QP_EX_WITH_SEND) client_ah = u.get_global_ah(self.client, self.gid_index, self.ib_port) try: Mlx5QP.map_ah_to_qp(client_ah, self.server.qps[0].qp_num) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Mapping AH to QP is not supported') raise ex u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_SEND) def check_odp_dc_support(self): """ Check if the device supports ODP with DC. :raises SkipTest: In case ODP is not supported with DC """ dc_odp_caps = self.server.ctx.query_mlx5_device().dc_odp_caps required_odp_caps = e.IBV_ODP_SUPPORT_SEND | e.IBV_ODP_SUPPORT_SRQ_RECV if required_odp_caps & dc_odp_caps != required_odp_caps: raise unittest.SkipTest('ODP is not supported using DC') def test_odp_dc_traffic(self): send_ops_flag = e.IBV_QP_EX_WITH_SEND self.create_players(OdpDc, qp_count=2, send_ops_flags=send_ops_flag) self.check_odp_dc_support() u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_SEND) def test_dc_rdma_write_stream(self): """ Check good flow of DCS. Calculate stream_id for DCS test by setting same stream id twice for WR and after increase it. Setting goes by loop and after stream_id is more than number of concurrent streams + 1 then stream_id returns to 1. :raises SkipTest: In case DCI is not supported with HW """ self.create_players(Mlx5DcStreamsRes, qp_count=2, send_ops_flags=e.IBV_QP_EX_WITH_RDMA_WRITE) u.rdma_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_RDMA_WRITE) def test_dc_send_stream_bad_flow(self): """ Check bad flow of DCS with reset stream id. Create error in dci stream by setting invalid PD so dci stream goes to error. In the end, the test verifies that the number of errors is as expected. :raises SkipTest: In case DCI is not supported with HW """ self.create_players(Mlx5DcStreamsRes, qp_count=1, send_ops_flags=e.IBV_QP_EX_WITH_SEND) self.client.set_bad_flow(DCI_TEST_BAD_FLOW_WITH_RESET) self.client.traffic_with_bad_flow(**self.traffic_args) def test_dc_send_stream_bad_flow_qp(self): """ Check bad flow of DCS with reset qp. Checked if resetting of wrong dci stream id produces an exception. This bad flow creates enough errors without resetting the streams, enforcing the QP to get into ERR state. Then the checking is stopped. Also has feature that after QP goes in ERR state test will reset QP to RTS state. 
:raises SkipTest: In case DCI is not supported with HW """ self.iters = 20 self.create_players(Mlx5DcStreamsRes, qp_count=1, send_ops_flags=e.IBV_QP_EX_WITH_SEND) self.client.set_bad_flow(DCI_TEST_BAD_FLOW_WITHOUT_RESET) self.client.traffic_with_bad_flow(**self.traffic_args) rdma-core-56.1/tests/test_mlx5_devx.py000066400000000000000000000035731477342711600200300ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia Inc. All rights reserved. See COPYING file """ Test module for mlx5 DevX. """ from tests.mlx5_base import Mlx5DevxRcResources, Mlx5DevxTrafficBase import pyverbs.mem_alloc as mem from pyverbs.mr import MR import pyverbs.enums as e import tests.utils as u class Mlx5DevxRcOdpRes(Mlx5DevxRcResources): @u.requires_odpv2 def create_mr(self): self.with_odp = True self.user_addr = mem.mmap(length=self.msg_size, flags=mem.MAP_ANONYMOUS_ | mem.MAP_PRIVATE_) access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_READ | \ e.IBV_ACCESS_ON_DEMAND self.mr = MR(self.pd, self.msg_size, access, self.user_addr) class Mlx5DevxRcTrafficTest(Mlx5DevxTrafficBase): """ Test various functionality of mlx5 DevX objects """ def test_devx_rc_qp_send_imm_traffic(self): """ Creates two DevX RC QPs and modifies them to RTS state. Then does SEND_IMM traffic. """ self.create_players(Mlx5DevxRcResources) # Send traffic self.send_imm_traffic() def test_devx_rc_qp_send_imm_doorbell_less_traffic(self): """ Creates two DevX RC QPs with dbr less ext and modifies them to RTS state. Then does SEND_IMM traffic. """ from tests.mlx5_prm_structs import SendDbrMode self.create_players(Mlx5DevxRcResources, send_dbr_mode=SendDbrMode.NO_DBR_EXT) # Send traffic self.send_imm_traffic() @u.requires_odp('rc', e.IBV_ODP_SUPPORT_SEND | e.IBV_ODP_SUPPORT_RECV) def test_devx_rc_qp_odp_traffic(self): """ Creates two DevX RC QPs using ODP enabled MKeys. Then does SEND_IMM traffic. """ self.create_players(Mlx5DevxRcOdpRes) # Send traffic self.send_imm_traffic() rdma-core-56.1/tests/test_mlx5_dm_ops.py000066400000000000000000000140131477342711600203320ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 NVIDIA Corporation . All rights reserved. See COPYING file from threading import Thread from queue import Queue import unittest import struct import errno from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr, \ Mlx5DmOpAddr from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsUserError from tests.mlx5_base import Mlx5PyverbsAPITestCase import pyverbs.providers.mlx5.mlx5_enums as dve import pyverbs.device as d MEMIC_ATOMIC_INCREMENT = 0x0 MEMIC_ATOMIC_TEST_AND_SET = 0x1 MLX5_CMD_OP_QUERY_HCA_CAP = 0x100 MLX5_CMD_MOD_DEVICE_MEMORY_CAP = 0xF MLX5_CMD_OP_QUERY_HCA_CAP_OUT_LEN = 0x1010 def requires_memic_atomic_support(func): def wrapper(instance): cmd_in = struct.pack('!HIH8s', MLX5_CMD_OP_QUERY_HCA_CAP, 0, MLX5_CMD_MOD_DEVICE_MEMORY_CAP << 1 | 0x1, bytes(8)) cmd_out = Mlx5Context.devx_general_cmd(instance.ctx, cmd_in, MLX5_CMD_OP_QUERY_HCA_CAP_OUT_LEN) cmd_view = memoryview(cmd_out) status = cmd_view[0] if status: raise PyverbsRDMAError('Query Device Memory CAPs failed with status' f' ({status})') memic_op_support = int.from_bytes(cmd_view[80:84], 'big') increment_size_sup = cmd_view[20] test_and_set_size_sup = cmd_view[22] # Verify that MEMIC atomic operations (both increment and test_and_set) # are supported with write/read size of 1 Byte. 
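        # Assumed cap layout, inferred from the checks below: bits 0 and 1 of
        # memic_op_support indicate support for the increment and test_and_set
        # operations, and bit 0 of each per-operation size-support mask
        # corresponds to 1-byte accesses.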
if memic_op_support & 0x3 != 0x3: raise unittest.SkipTest('MEMIC atomic operations are not supported') if not increment_size_sup & test_and_set_size_sup & 0x1: raise unittest.SkipTest( 'MEMIC atomic operations are not supported with 1 Bytes read/write sizes') return func(instance) return wrapper class Mlx5DmOpAddresses(Mlx5PyverbsAPITestCase): def setUp(self): super().setUp() self.dm_size = int(self.attr_ex.max_dm_size / 2) def create_context(self): try: attr = Mlx5DVContextAttr(dve.MLX5DV_CONTEXT_FLAGS_DEVX) self.ctx = Mlx5Context(attr, self.dev_name) except PyverbsUserError as ex: raise unittest.SkipTest(f'Could not open mlx5 context ({ex})') except PyverbsRDMAError: raise unittest.SkipTest('Opening mlx5 DevX context is not supported') def _write_to_op_addr(self): try: inc_addr = Mlx5DmOpAddr(self.dm, MEMIC_ATOMIC_INCREMENT) except PyverbsRDMAError as ex: if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]: self.skip_queue.put(unittest.SkipTest( 'MEMIC_ATOMIC_INCREMENT op is not supported')) return raise ex inc_addr.write(b'\x01') inc_addr.unmap(self.dm_size) def _read_from_op_addr(self): try: test_and_set_addr = Mlx5DmOpAddr(self.dm, MEMIC_ATOMIC_TEST_AND_SET) except PyverbsRDMAError as ex: if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]: self.skip_queue.put(unittest.SkipTest( 'MEMIC_ATOMIC_TEST_AND_SET op is not supported')) return raise ex val = test_and_set_addr.read(1) test_and_set_addr.unmap(self.dm_size) return val @requires_memic_atomic_support def test_dm_atomic_ops(self): """ Tests "increment" and "test_and_set" MEMIC atomic operations. The test does two increments to the same buffer data and verifies the values using test_and_set. Then verifies that the latter op sets the buffer as expected. """ with d.DM(self.ctx, d.AllocDmAttr(length=self.dm_size)) as dm: # Set DM buffer to 0 dm.copy_to_dm(0, bytes(self.dm_size), self.dm_size) try: inc_addr = Mlx5DmOpAddr(dm, MEMIC_ATOMIC_INCREMENT) test_and_set_addr = Mlx5DmOpAddr(dm, MEMIC_ATOMIC_TEST_AND_SET) except PyverbsRDMAError as ex: if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]: raise unittest.SkipTest('MEMIC atomic operations are not supported') raise ex inc_addr.write(b'\x01') inc_addr.write(b'\x01') # Now we should read 0x02 and the memory set to 0x1 val = int.from_bytes(test_and_set_addr.read(1), 'big') self.assertEqual(val, 2) # Verify that TEST_AND_SET set the memory to 0x1 val = int.from_bytes(test_and_set_addr.read(1), 'big') self.assertEqual(val, 1) inc_addr.unmap(self.dm_size) test_and_set_addr.unmap(self.dm_size) @requires_memic_atomic_support def test_parallel_dm_atomic_ops(self): """ Runs multiple threads that do test_and_set operation, followed by multiple threads that do increments of +1, to the same DM buffer. Then verifies that the buffer data was incremented as expected. 
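        SkipTest exceptions raised inside the worker threads are forwarded to
        the main thread through self.skip_queue, since an exception raised in
        a Thread would otherwise not fail or skip the test.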
""" threads = [] num_threads = 10 self.skip_queue = Queue() with d.DM(self.ctx, d.AllocDmAttr(length=self.dm_size)) as self.dm: for _ in range(num_threads): threads.append(Thread(target=self._read_from_op_addr)) threads[-1].start() for thread in threads: thread.join() threads = [] for _ in range(num_threads): threads.append(Thread(target=self._write_to_op_addr)) threads[-1].start() for thread in threads: thread.join() if not self.skip_queue.empty(): raise self.skip_queue.get() val = int.from_bytes(self._read_from_op_addr(), 'big') self.assertEqual(val, num_threads + 1, f'Read value is ({val}) is different than expected ({num_threads+1})' ) rdma-core-56.1/tests/test_mlx5_dma_memcpy.py000066400000000000000000000123261477342711600211710ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia, Inc. All rights reserved. See COPYING file from enum import Enum import unittest from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsUserError from tests.mlx5_base import Mlx5RDMATestCase, Mlx5RcResources import pyverbs.providers.mlx5.mlx5_enums as dve from pyverbs.pd import PD from pyverbs.mr import MR import pyverbs.enums as e import tests.utils as u class BadFlowType(Enum): DIFFERENT_PD = 1 MR_ILLEGAL_ACCESS = 2 class Mlx5DmaResources(Mlx5RcResources): def create_send_ops_flags(self): self.dv_send_ops_flags = dve.MLX5DV_QP_EX_WITH_MEMCPY self.send_ops_flags = e.IBV_QP_EX_WITH_SEND class DmaGgaMemcpy(Mlx5RDMATestCase): def create_resources(self, bad_flow_type=0, **resource_arg): """ Creates DmaGga test resources that include a "server" resource that can be used to send the MEMCPY WR, and a destination MR to copy data to. The destination MR can be created on a different PD or with insufficient access permissions, according to the bad_flow_type. :param bad_flow_type: (Optional) An enum of BadFlowType that indicates the bad flow type (default: 0 - good flow) :param resource_arg: Dict of args that specify the resource specific attributes. :return: None """ self.server = Mlx5DmaResources(**self.dev_info, **resource_arg) self.dest_pd = self.server.pd dest_mr_access = e.IBV_ACCESS_LOCAL_WRITE if bad_flow_type == BadFlowType.DIFFERENT_PD: self.dest_pd = PD(self.server.ctx) elif bad_flow_type == BadFlowType.MR_ILLEGAL_ACCESS: dest_mr_access = e.IBV_ACCESS_REMOTE_READ self.dest_mr = MR(self.dest_pd, self.server.msg_size, dest_mr_access) # No need to connect the QPs self.server.pre_run([0], [0]) def dma_memcpy(self, msg_size=1024, bad_flow=False): """ Creates resources and posts a memcpy WR. After posting the WR, the WC opcode and the data are verified. 
:param msg_size: Size of the data to be copied (in Bytes) :param bad_flow: If True, do not fill data in the MRs (default: False) :return: None """ self.create_resources(msg_size=msg_size) if not bad_flow: self.dest_mr.write('0' * msg_size, msg_size) self.server.mr.write('s' * msg_size, msg_size) self.server.qp.wr_start() self.server.qp.wr_flags = e.IBV_SEND_SIGNALED self.server.qp.wr_memcpy(self.dest_mr.lkey, self.dest_mr.buf, self.server.mr.lkey, self.server.mr.buf, msg_size) self.server.qp.wr_complete() u.poll_cq_ex(self.server.cq) wc_opcode = self.server.cq.read_opcode() self.assertEqual(wc_opcode, dve.MLX5DV_WC_MEMCPY, 'WC opcode validation failed') self.assertEqual(self.dest_mr.read(msg_size, 0), self.server.mr.read(msg_size, 0)) def dma_memcpy_bad_protection_flow(self, bad_flow_type): """ Creates resources with bad protection and posts a memcpy WR. The bad protection is either a destination MR created on a different PD or a destination MR created with insufficient access permissions. :param bad_flow_type: An enum of BadFlowType that indicates the bad flow type :return: None """ self.create_resources(bad_flow_type) self.server.qp.wr_start() self.server.qp.wr_flags = e.IBV_SEND_SIGNALED self.server.qp.wr_memcpy(self.dest_mr.lkey, self.dest_mr.buf, self.server.mr.lkey, self.server.mr.buf, self.server.msg_size) self.server.qp.wr_complete() with self.assertRaises(PyverbsRDMAError): u.poll_cq_ex(self.server.cq) self.assertEqual(self.server.cq.status, e.IBV_WC_LOC_PROT_ERR, 'Expected CQE with Local Protection Error') def test_dma_memcpy_data(self): self.dma_memcpy() def test_dma_memcpy_different_pd_bad_flow(self): self.dma_memcpy_bad_protection_flow(BadFlowType.DIFFERENT_PD) def test_dma_memcpy_protection_bad_flow(self): self.dma_memcpy_bad_protection_flow(BadFlowType.MR_ILLEGAL_ACCESS) def test_dma_memcpy_large_data_bad_flow(self): """ Bad flow test, testing DMA memcpy with data larger than the maximum allowed size, according to the HCA capabilities. :return: None """ try: ctx = Mlx5Context(Mlx5DVContextAttr(), name=self.dev_name) except PyverbsUserError as ex: raise unittest.SkipTest(f'Could not open mlx5 context ({ex})') except PyverbsRDMAError: raise unittest.SkipTest('Opening mlx5 context is not supported') max_size = ctx.query_mlx5_device( dve.MLX5DV_CONTEXT_MASK_WR_MEMCPY_LENGTH).max_wr_memcpy_length max_size = max_size if max_size else 1024 with self.assertRaises(PyverbsRDMAError): self.dma_memcpy(max_size + 1, bad_flow=True) rdma-core-56.1/tests/test_mlx5_dmabuf.py000066400000000000000000000121071477342711600203110ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2024 Nvidia Inc. All rights reserved. 
See COPYING file from os import strerror import unittest import errno from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr from pyverbs.providers.mlx5.mlx5dv_dmabuf import Mlx5DmaBufMR from pyverbs.pyverbs_error import PyverbsRDMAError import pyverbs.providers.mlx5.mlx5_enums as dve from tests.mlx5_base import Mlx5RDMATestCase from tests.base import RCResources from pyverbs.qp import QPAttr import tests.cuda_utils as cu import pyverbs.enums as e import tests.utils as u try: from cuda import cuda cu.CUDA_FOUND = True except ImportError: cu.CUDA_FOUND = False GPU_PAGE_SIZE = 1 << 16 def requires_data_direct_support(): """ Check if the device support data-direct """ def outer(func): def inner(instance): with Mlx5Context(Mlx5DVContextAttr(), name=instance.dev_name) as ctx: try: ctx.get_data_direct_sysfs_path() except PyverbsRDMAError as ex: if ex.error_code == errno.ENODEV: raise unittest.SkipTest('There is no data direct device in the system') raise ex return func(instance) return inner return outer @cu.set_mem_io_cuda_methods class Mlx5DmabufCudaRes(RCResources): def __init__(self, dev_name, ib_port, gid_index, mr_access=e.IBV_ACCESS_LOCAL_WRITE, mlx5_access=0): """ Initializes data-direct MR and DMA BUF resources on top of a CUDA memory. Uses RC QPs for traffic. :param dev_name: Device name to be used :param ib_port: IB port of the device to use :param gid_index: Which GID index to use :param mr_access: The MR access :param mlx5_access: The data-direct access """ self.mr_access = mr_access self.mlx5_access = mlx5_access self.cuda_addr = None super().__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index) def create_mr(self): self.cuda_addr = cu.check_cuda_errors(cuda.cuMemAlloc(GPU_PAGE_SIZE)) attr_flag = 1 cu.check_cuda_errors(cuda.cuPointerSetAttribute( attr_flag, cuda.CUpointer_attribute.CU_POINTER_ATTRIBUTE_SYNC_MEMOPS, int(self.cuda_addr))) cuda_flag = cuda.CUmemRangeHandleType.CU_MEM_RANGE_HANDLE_TYPE_DMA_BUF_FD dmabuf_fd = cu.check_cuda_errors( cuda.cuMemGetHandleForAddressRange(self.cuda_addr, GPU_PAGE_SIZE, cuda_flag, 0)) try: self.mr = Mlx5DmaBufMR(self.pd, offset=0, length=self.msg_size, access=self.mr_access, fd=dmabuf_fd, mlx5_access=self.mlx5_access) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Registering DV DMABUF MR is not supported') raise ex def create_qp_attr(self): qp_attr = QPAttr(port_num=self.ib_port) qp_access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE | \ e.IBV_ACCESS_REMOTE_READ qp_attr.qp_access_flags = qp_access return qp_attr @cu.set_init_cuda_methods class Mlx5DmabufCudaTest(Mlx5RDMATestCase): """ Test data-direct DV verbs """ @requires_data_direct_support() def test_data_direct_sysfs_path_bad_length(self): """ Query data direct sysfs path with buffer of 5 bytes. This is bad flow since 5 bytes aren't enough for any sysfs path, so ENOSPC should be raised. """ ctx = Mlx5Context(Mlx5DVContextAttr(), name=self.dev_name) try: path = ctx.get_data_direct_sysfs_path(5) except PyverbsRDMAError as ex: self.assertEqual(ex.error_code, errno.ENOSPC, f'Got {strerror(ex.error_code)} but Expected {strerror(errno.ENOSPC)}') else: raise PyverbsRDMAError('Successfully queried data direct sysfs path with 5 bytes: ' f'{path}') def test_dv_dmabuf_mr(self): """ Creates dmabuf MR with DV API. mlx5_access is 0, so the MR is regular dmabuf MR. Run RDMA write traffic. 
""" access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE self.create_players(Mlx5DmabufCudaRes, mr_access=access) u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE) @requires_data_direct_support() def test_dv_dmabuf_mr_data_direct(self): """ Runs RDMA Write traffic over CUDA allocated memory using Data Direct DMA BUF and RC QPs. """ mr_access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE self.create_players(Mlx5DmabufCudaRes, mr_access=mr_access, mlx5_access=dve.MLX5DV_REG_DMABUF_ACCESS_DATA_DIRECT_) u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE) rdma-core-56.1/tests/test_mlx5_dr.py000066400000000000000000002153751477342711600174740ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia All rights reserved. See COPYING file """ Test module for pyverbs' mlx5 dr module. """ from os import path, system import unittest import struct import socket import errno import math from pyverbs.providers.mlx5.dr_action import DrActionQp, DrActionModify, \ DrActionFlowCounter, DrActionDrop, DrActionTag, DrActionDestTable, \ DrActionPopVLan, DrActionPushVLan, DrActionDestAttr, DrActionDestArray, \ DrActionDefMiss, DrActionVPort, DrActionIBPort, DrActionDestTir, DrActionPacketReformat,\ DrFlowSamplerAttr, DrActionFlowSample, DrFlowMeterAttr, DrActionFlowMeter from pyverbs.providers.mlx5.mlx5dv import Mlx5DevxObj, Mlx5Context, Mlx5DVContextAttr from tests.utils import skip_unsupported, requires_root_on_eth, requires_eswitch_on, \ PacketConsts from tests.mlx5_base import Mlx5RDMATestCase, PyverbsAPITestCase, MELLANOX_VENDOR_ID from pyverbs.providers.mlx5.mlx5dv_flow import Mlx5FlowMatchParameters from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsUserError from pyverbs.providers.mlx5.dr_matcher import DrMatcher from pyverbs.providers.mlx5.dr_domain import DrDomain from pyverbs.providers.mlx5.dr_table import DrTable from pyverbs.providers.mlx5.dr_rule import DrRule import pyverbs.providers.mlx5.mlx5_enums as dve from tests.test_mlx5_flow import requires_reformat_support from pyverbs.cq import CqInitAttrEx, CQEX, CQ from pyverbs.wq import WQInitAttr, WQ, WQAttr from tests.base import RawResources import pyverbs.enums as e import tests.utils as u SET_ACTION = 0x1 MAX_MATCH_PARAM_SIZE = 0x180 PF_VPORT = 0x0 GENEVE_PACKET_OUTER_LENGTH = 50 ROCE_PACKET_OUTER_LENGTH = 58 SAMPLER_ERROR_MARGIN = 0.2 SAMPLE_RATIO = 4 METADATA_C_FIELDS = ['metadata_reg_c_0', 'metadata_reg_c_1', 'metadata_reg_c_2', 'metadata_reg_c_3', 'metadata_reg_c_4', 'metadata_reg_c_5'] FLOW_METER_GREEN = 2 FLOW_METER_RED = 0 REG_C_DATA = 0x1234 class ModifyFields: """ Supported SW steering modify fields. """ OUT_SMAC_47_16 = 0x1 OUT_SMAC_15_0 = 0x2 META_DATA_REG_C_0 = 0x51 META_DATA_REG_C_1 = 0x52 class ModifyFieldsLen: """ Supported SW steering modify fields length. """ MAC_47_16 = 32 MAC_15_0 = 16 META_DATA_REG_C = 32 def skip_if_has_geneve_tx_bug(ctx): """ Some mlx5 devices such as CX5 and CX6 has a bug matching on Geneve fields on TX side. Raises unittest.SkipTest if that's the case. 
:param ctx: Mlx5 Context """ dev_attrs = ctx.query_device() mlx5_cx5_cx6 = [0x1017, 0x1018, 0x1019, 0x101a, 0x101b] if dev_attrs.vendor_id == MELLANOX_VENDOR_ID and \ dev_attrs.vendor_part_id in mlx5_cx5_cx6: raise unittest.SkipTest('This test is not supported on cx5/6') def requires_geneve_fields_rx_support(func): def func_wrapper(instance): nic_tbl_caps = u.query_nic_flow_table_caps(instance) field_support = nic_tbl_caps.flow_table_properties_nic_receive.ft_field_support if not (field_support.outer_geneve_vni and field_support.outer_geneve_oam and field_support.outer_geneve_protocol_type and field_support.outer_geneve_opt_len): raise unittest.SkipTest('NIC flow table does not support geneve fields') return func(instance) return func_wrapper class Mlx5DrResources(RawResources): """ Test various functionalities of the mlx5 direct rules class. """ def create_context(self): mlx5dv_attr = Mlx5DVContextAttr() try: self.ctx = Mlx5Context(mlx5dv_attr, name=self.dev_name) except PyverbsUserError as ex: raise unittest.SkipTest(f'Could not open mlx5 context ({ex})') except PyverbsRDMAError: raise unittest.SkipTest('Opening mlx5 context is not supported') def __init__(self, dev_name, ib_port, gid_index=0, wc_flags=0, msg_size=1024, qp_count=1): self.wc_flags = wc_flags super().__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index, msg_size=msg_size, qp_count=qp_count) @requires_root_on_eth() def create_qps(self): super().create_qps() def create_cq(self): """ Create an Extended CQ. """ wc_flags = e.IBV_WC_STANDARD_FLAGS | self.wc_flags cia = CqInitAttrEx(cqe=self.num_msgs, wc_flags=wc_flags) try: self.cq = CQEX(self.ctx, cia) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create Extended CQ is not supported') raise ex def get_first_flow_meter_reg_id(self): """ Queries hca caps for supported reg C indexes for flow meter. 
        :return: First reg C index that is supported
        """
        from tests.mlx5_prm_structs import QueryHcaCapIn, QueryQosCapOut, DevxOps
        query_cap_in = QueryHcaCapIn(op_mod=DevxOps.MLX5_CMD_OP_QUERY_QOS_CAP << 1)
        cmd_res = self.ctx.devx_general_cmd(query_cap_in, len(QueryQosCapOut()))
        query_cap_out = QueryQosCapOut(cmd_res)
        if query_cap_out.status:
            raise PyverbsRDMAError(f'QUERY_HCA_CAP has failed with status ({query_cap_out.status}) '
                                   f'and syndrome ({query_cap_out.syndrome})')
        bit_regs = query_cap_out.capability.flow_meter_reg_id
        if bit_regs == 0:
            raise PyverbsRDMAError('Reg C is not supported')
        # bit_regs & -bit_regs isolates the lowest set bit; log2 converts it
        # to its index
        return int(math.log2(bit_regs & -bit_regs))


class Mlx5DrTirResources(Mlx5DrResources):
    def __init__(self, dev_name, ib_port, gid_index=0, wc_flags=0, msg_size=1024,
                 qp_count=1, server=False):
        self.server = server
        super().__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index,
                         wc_flags=wc_flags, msg_size=msg_size, qp_count=qp_count)

    def create_cq(self):
        self.cq = CQ(self.ctx, cqe=self.num_msgs)

    @requires_root_on_eth()
    def create_qps(self):
        if not self.server:
            super().create_qps()
        else:
            from tests.mlx5_prm_structs import Tirc, CreateTirIn, CreateTirOut
            self.qps = [WQ(self.ctx, WQInitAttr(wq_pd=self.pd, wq_cq=self.cq))]
            self.qps[0].modify(WQAttr(attr_mask=e.IBV_WQ_ATTR_STATE,
                                      wq_state=e.IBV_WQS_RDY))
            tir_ctx = Tirc(inline_rqn=self.qps[0].wqn)
            self.tir = Mlx5DevxObj(self.ctx, CreateTirIn(tir_context=tir_ctx),
                                   len(CreateTirOut()))


class Mlx5DrTest(Mlx5RDMATestCase):
    def setUp(self):
        super().setUp()
        self.iters = 10
        self.server = None
        self.client = None
        self.rules = []

    def tearDown(self):
        if self.server:
            self.server.ctx.close()
        if self.client:
            self.client.ctx.close()

    @skip_unsupported
    def create_rx_recv_rules_based_on_match_params(self, mask_param, val_param, actions,
                                                   match_criteria=u.MatchCriteriaEnable.OUTER,
                                                   domain=None, log_matcher_size=None,
                                                   root_only=False):
        """
        Creates a rule on RX domain that forwards packets that match on the
        provided parameters to the SW steering flow table and another rule on
        that table with provided actions.
        :param mask_param: The FlowTableEntryMatchParam mask matcher value.
        :param val_param: The FlowTableEntryMatchParam value matcher value.
        :param actions: List of actions to attach to the recv rule.
        :param match_criteria: the match criteria enable flag to match on
        :param domain: RX DR domain to use if provided, otherwise create
                       default RX domain.
:param log_matcher_size: Size of the matcher table :param root_only : If True, rules are created only on root table :return: Non-root table and dest table action to it if root=false else root_table """ self.domain_rx = domain if domain else DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) root_table = DrTable(self.domain_rx, 0) if not root_only: non_root_table = DrTable(self.domain_rx, 1) table = root_table if root_only else non_root_table self.matcher = DrMatcher(table, 1, match_criteria, mask_param) if log_matcher_size: self.matcher.set_layout(log_matcher_size) self.rules.append(DrRule(self.matcher, val_param, actions)) if not root_only: self.root_matcher = DrMatcher(root_table, 0, match_criteria, mask_param) self.dest_table_action = DrActionDestTable(table) self.rules.append(DrRule(self.root_matcher, val_param, [self.dest_table_action])) return table, self.dest_table_action return table @skip_unsupported def create_rx_recv_rules(self, smac_value, actions, log_matcher_size=None, domain=None, root_only=False): """ Creates a rule on RX domain that forwards packets that match the smac in the matcher to the SW steering flow table and another rule on that table with provided actions. :param smac_value: The smac matcher value. :param actions: List of actions to attach to the recv rule. :param log_matcher_size: Size of the matcher table :param domain: RX DR domain to use if provided, otherwise create default RX domain. :param root_only : If True, rules are created only on root table :return: Non-root table and dest table action to it if root_only=false else root_table """ smac_mask = bytes([0xff] * 6) + bytes(2) mask_param = Mlx5FlowMatchParameters(len(smac_mask), smac_mask) # Size of the matcher value should be modulo 4 smac_value = smac_value if root_only else smac_value + bytes(2) value_param = Mlx5FlowMatchParameters(len(smac_value), smac_value) return self.create_rx_recv_rules_based_on_match_params(mask_param, value_param, actions, u.MatchCriteriaEnable.OUTER, domain, log_matcher_size, root_only=root_only) def send_client_raw_packets(self, iters, src_mac=None): """ Send raw packets. :param iters: Number of packets to send. :param src_mac: If set, src mac to set in the packets. """ c_send_wr, _, _ = u.get_send_elements_raw_qp(self.client, src_mac=src_mac) poll_cq = u.poll_cq_ex if isinstance(self.client.cq, CQEX) else u.poll_cq for _ in range(iters): u.send(self.client, c_send_wr, e.IBV_WR_SEND) poll_cq(self.client.cq) def send_server_fdb_to_nic_packets(self, iters): """ Server sends and receives raw packets. :param iters: Number of packets to send. """ s_recv_wr = u.get_recv_wr(self.server) u.post_recv(self.server, s_recv_wr, qp_idx=0) c_send_wr, _, msg = u.get_send_elements_raw_qp(self.server) for _ in range(iters): u.send(self.server, c_send_wr, e.IBV_WR_SEND) u.poll_cq_ex(self.server.cq) u.poll_cq_ex(self.server.cq) u.post_recv(self.server, s_recv_wr, qp_idx=0) msg_received = self.server.mr.read(self.server.msg_size, 0) u.validate_raw(msg_received, msg, []) def dest_port(self, is_vport=True): """ Creates FDB domain, root table with matcher on source mac on the server side. Create a rule to forward all traffic to the non-root table. On this table apply VPort/IBPort action goto PF. On the server open another RX domain on PF with QP action and validate packets by sending traffic from client, catch all traffic with VPort/IBPort action goto PF, open another RX domain on PF with QP action and validate packets. :param is_vport: A flag to indicate if to use VPort or IBPort action. 
""" self.client = Mlx5DrResources(**self.dev_info) self.server = Mlx5DrResources(**self.dev_info) self.domain_fdb = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_FDB) port_action = DrActionVPort(self.domain_fdb, PF_VPORT) if is_vport \ else DrActionIBPort(self.domain_fdb, self.ib_port) smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) self.fdb_table, self.fdb_dest_act = self.create_rx_recv_rules(smac_value, [port_action], domain=self.domain_fdb) self.domain_rx = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) rx_table = DrTable(self.domain_rx, 0) qp_action = DrActionQp(self.server.qp) smac_mask = bytes([0xff] * 6) mask_param = Mlx5FlowMatchParameters(len(smac_mask), smac_mask) rx_matcher = DrMatcher(rx_table, 0, u.MatchCriteriaEnable.OUTER, mask_param) value_param = Mlx5FlowMatchParameters(len(smac_value), smac_value) self.rules.append(DrRule(rx_matcher, value_param, [qp_action])) # Validate traffic on RX u.raw_traffic(self.client, self.server, self.iters) @staticmethod def create_dest_mac_params(): from tests.mlx5_prm_structs import FlowTableEntryMatchParam eth_match_mask = FlowTableEntryMatchParam() eth_match_mask.outer_headers.dmac = PacketConsts.MAC_MASK eth_match_value = FlowTableEntryMatchParam() eth_match_value.outer_headers.dmac = PacketConsts.DST_MAC mask_param = Mlx5FlowMatchParameters(len(eth_match_mask), eth_match_mask) value_param = Mlx5FlowMatchParameters(len(eth_match_value), eth_match_value) return mask_param, value_param @staticmethod def create_counter(ctx): """ Create flow counter. :param ctx: The player context to create the counter on. :return: The counter object and the flow counter ID . """ from tests.mlx5_prm_structs import AllocFlowCounterIn, AllocFlowCounterOut counter = Mlx5DevxObj(ctx, AllocFlowCounterIn(), len(AllocFlowCounterOut())) flow_counter_id = AllocFlowCounterOut(counter.out_view).flow_counter_id return counter, flow_counter_id @staticmethod def query_counter_packets(counter, flow_counter_id): """ Query flow counter packets count. :param counter: The counter for the query. :param flow_counter_id: The flow counter ID for the query. :return: Number of packets on this counter. 
""" from tests.mlx5_prm_structs import QueryFlowCounterIn, QueryFlowCounterOut query_in = QueryFlowCounterIn(flow_counter_id=flow_counter_id) counter_out = QueryFlowCounterOut(counter.query(query_in, len(QueryFlowCounterOut()))) return counter_out.flow_statistics.packets @staticmethod def gen_gre_tunnel_encap_header(msg_size, is_l2_tunnel=True): gre_ether_type = PacketConsts.ETHER_TYPE_ETH if is_l2_tunnel else \ PacketConsts.ETHER_TYPE_IPV4 gre_header = u.gen_gre_header(ether_type=gre_ether_type) ip_header = u.gen_ipv4_header(packet_len=msg_size + len(gre_header), next_proto=socket.IPPROTO_GRE) mac_header = u.gen_ethernet_header() return mac_header + ip_header + gre_header @staticmethod def gen_geneve_tunnel_encap_header(msg_size, is_l2_tunnel=True): proto = PacketConsts.ETHER_TYPE_ETH if is_l2_tunnel else PacketConsts.ETHER_TYPE_IPV4 geneve_header = u.gen_geneve_header(proto=proto) udp_header = u.gen_udp_header(packet_len=msg_size + len(geneve_header), dst_port=PacketConsts.GENEVE_PORT) ip_header = u.gen_ipv4_header(packet_len=msg_size + len(udp_header) + len(geneve_header)) mac_header = u.gen_ethernet_header() return mac_header + ip_header + udp_header + geneve_header @staticmethod def create_geneve_params(): from tests.mlx5_prm_structs import FlowTableEntryMatchParam geneve_mask = FlowTableEntryMatchParam() geneve_mask.misc_parameters.geneve_vni = 0xffffff geneve_mask.misc_parameters.geneve_oam = 1 geneve_value = FlowTableEntryMatchParam() geneve_value.misc_parameters.geneve_vni = PacketConsts.GENEVE_VNI geneve_value.misc_parameters.geneve_oam = PacketConsts.GENEVE_OAM mask_param = Mlx5FlowMatchParameters(len(geneve_mask), geneve_mask) value_param = Mlx5FlowMatchParameters(len(geneve_value), geneve_value) return mask_param, value_param @staticmethod def gen_roce_bth_header(msg_size): mac_header = u.gen_ethernet_header() ip_header = u.gen_ipv4_header(packet_len=msg_size + PacketConsts.UDP_HEADER_SIZE + PacketConsts.BTH_HEADER_SIZE) udp_header = u.gen_udp_header(packet_len=msg_size + PacketConsts.BTH_HEADER_SIZE, dst_port=PacketConsts.ROCE_PORT) bth_header = u.gen_bth_header() return mac_header + ip_header + udp_header + bth_header @staticmethod def create_roce_bth_params(): from tests.mlx5_prm_structs import FlowTableEntryMatchParam roce_mask = FlowTableEntryMatchParam() roce_mask.misc_parameters.bth_opcode = 0xff roce_mask.misc_parameters.bth_dst_qp = 0xffffff roce_mask.misc_parameters.bth_a = 0x1 roce_value = FlowTableEntryMatchParam() roce_value.misc_parameters.bth_opcode = PacketConsts.BTH_OPCODE roce_value.misc_parameters.bth_dst_qp = PacketConsts.BTH_DST_QP roce_value.misc_parameters.bth_a = PacketConsts.BTH_A mask_param = Mlx5FlowMatchParameters(len(roce_mask), roce_mask) value_param = Mlx5FlowMatchParameters(len(roce_value), roce_value) return mask_param, value_param def create_empty_matcher_go_to_tbl(self, src_tbl, dst_tbl): """ Create rule that forward all packets (by empty matcher) from src_tbl to dst_tbl. 
""" from tests.mlx5_prm_structs import FlowTableEntryMatchParam empty_param = Mlx5FlowMatchParameters(len(FlowTableEntryMatchParam()), FlowTableEntryMatchParam()) matcher = DrMatcher(src_tbl, 0, u.MatchCriteriaEnable.NONE, empty_param) go_to_tbl_action = DrActionDestTable(dst_tbl) self.rules.append(DrRule(matcher, empty_param, [go_to_tbl_action])) return go_to_tbl_action @requires_eswitch_on @skip_unsupported def test_dest_vport(self): self.dest_port() @requires_eswitch_on @skip_unsupported def test_dest_ib_port(self): self.dest_port(False) @skip_unsupported def add_qp_rule_and_send_pkts(self, root_only=False): """ :param root_only : If True, rules are created only on root table """ self.create_players(Mlx5DrResources) self.qp_action = DrActionQp(self.server.qp) smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) self.create_rx_recv_rules(smac_value, [self.qp_action], root_only=root_only) u.raw_traffic(self.client, self.server, self.iters) def test_tbl_qp_rule(self): """ Creates RX domain, SW table with matcher on source mac. Creates QP action and a rule with this action on the matcher. """ self.add_qp_rule_and_send_pkts() def test_root_tbl_qp_rule(self): """ Creates RX domain, SW table with matcher on source mac. Creates QP action and a rule with this action on the matcher. """ self.add_qp_rule_and_send_pkts(root_only=True) @skip_unsupported def modify_tx_smac_and_send_pkts(self, root_only=False): """ Create a rule on TX domain that modifies smac of matched packet and sends it to the wire. :param root_only : If True, rules are created only on root table """ from tests.mlx5_prm_structs import SetActionIn self.create_players(Mlx5DrResources) self.domain_tx = DrDomain(self.client.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_TX) root_table_tx = DrTable(self.domain_tx, 0) if not root_only: non_root_table_tx = DrTable(self.domain_tx, 1) self.move_action = self.create_empty_matcher_go_to_tbl(root_table_tx, non_root_table_tx) table = root_table_tx if root_only else non_root_table_tx smac_mask = bytes([0xff] * 6) mask_param = Mlx5FlowMatchParameters(len(smac_mask), smac_mask) matcher_tx = DrMatcher(table, 0, u.MatchCriteriaEnable.OUTER, mask_param) smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) smac_value += bytes(2) value_param = Mlx5FlowMatchParameters(len(smac_value), smac_value) action1 = SetActionIn(action_type=SET_ACTION, field=ModifyFields.OUT_SMAC_47_16, data=0x88888888, length=ModifyFieldsLen.MAC_47_16) action2 = SetActionIn(action_type=SET_ACTION, field=ModifyFields.OUT_SMAC_15_0, data=0x8888, length=ModifyFieldsLen.MAC_15_0) flags = dve.MLX5DV_DR_ACTION_FLAGS_ROOT_LEVEL if root_only else 0 self.modify_action_tx = DrActionModify(self.domain_tx, flags, [action1, action2]) self.rules.append(DrRule(matcher_tx, value_param, [self.modify_action_tx])) src_mac = struct.pack('!6s', bytes.fromhex("88:88:88:88:88:88".replace(':', ''))) self.qp_action = DrActionQp(self.server.qp) self.create_rx_recv_rules(src_mac, [self.qp_action], root_only=root_only) exp_packet = u.gen_packet(self.client.msg_size, src_mac=src_mac) u.raw_traffic(self.client, self.server, self.iters, expected_packet=exp_packet) @skip_unsupported def test_tbl_modify_header_rule(self): """ Creates TX domain, SW table with matcher on source mac and modify the smac. Then creates RX domain and rule that forwards packets with the new smac to server QP. Perform traffic that do this flow. 
""" self.modify_tx_smac_and_send_pkts() @skip_unsupported def test_root_tbl_modify_header_rule(self): """ Creates TX domain, root table with matcher on source mac and modify the smac. Then creates RX domain and rule that forwards packets with the new smac to server QP. Perform traffic that do this flow. """ self.modify_tx_smac_and_send_pkts(root_only=True) @skip_unsupported def test_metadata_modify_action_set_copy_match(self): """ Verify modify header with set and copy actions. TX and RX: - Root table: Match empty (hit all): Rule: prio 0 - val empty. Action: Go TO Table 1 - Table 1: Match empty (hit all): Rule: prio 0 - val empty. Action: Modify Header (set reg_c_0 to REG_C_DATA) + Go TO Table 2 - Table 2: Match empty (hit all): Rule: prio 0 - val empty. Action: Modify Header (copy reg_c_0 to reg_c_1) + Go To Table 3 TX: - Table 3: Match reg_c_0 and reg_c_1: Rule: prio 0 - val REG_C_DATA. Action: Counter RX: - Table 3: Match reg_c_0 and reg_c_1: Rule: prio 0 - val REG_C_DATA. Action: Go To QP """ from tests.mlx5_prm_structs import FlowTableEntryMatchParam, FlowTableEntryMatchSetMisc2, \ SetActionIn, CopyActionIn self.create_players(Mlx5DrResources) match_param = FlowTableEntryMatchParam() empty_param = Mlx5FlowMatchParameters(len(match_param), match_param) mask_metadata = FlowTableEntryMatchParam(misc_parameters_2= FlowTableEntryMatchSetMisc2(metadata_reg_c_0=0xffff, metadata_reg_c_1=0xffff)) mask_param = Mlx5FlowMatchParameters(len(match_param), mask_metadata) value_metadata = FlowTableEntryMatchParam(misc_parameters_2= FlowTableEntryMatchSetMisc2(metadata_reg_c_0=REG_C_DATA, metadata_reg_c_1=REG_C_DATA)) value_param = Mlx5FlowMatchParameters(len(match_param), value_metadata) self.client.domain = DrDomain(self.client.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_TX) self.server.domain = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) for player in [self.client, self.server]: player.tables = [] player.matchers = [] for i in range(4): player.tables.append(DrTable(player.domain, i)) for i in range(2): player.matchers.append(DrMatcher(player.tables[i + 1], 0, u.MatchCriteriaEnable.NONE, empty_param)) player.matchers.append(DrMatcher(player.tables[3], 0, u.MatchCriteriaEnable.MISC_2, mask_param)) player.go_to_tbl1_action = self.create_empty_matcher_go_to_tbl(player.tables[0], player.tables[1]) set_reg = SetActionIn(field=ModifyFields.META_DATA_REG_C_0, length=ModifyFieldsLen.META_DATA_REG_C, data=REG_C_DATA) player.modify_action_set = DrActionModify(player.domain, 0, [set_reg]) player.go_to_tbl2_action = DrActionDestTable(player.tables[2]) self.rules.append(DrRule(player.matchers[0], empty_param, [player.modify_action_set, player.go_to_tbl2_action])) copy_reg = CopyActionIn(src_field=ModifyFields.META_DATA_REG_C_0, length=ModifyFields.META_DATA_REG_C_0, dst_field=ModifyFields.META_DATA_REG_C_1) player.modify_action_copy = DrActionModify(player.domain, 0, [copy_reg]) player.go_to_tbl3_action = DrActionDestTable(player.tables[3]) self.rules.append(DrRule(player.matchers[1], empty_param, [player.modify_action_copy, player.go_to_tbl3_action])) counter, flow_counter_id = self.create_counter(self.client.ctx) counter_action = DrActionFlowCounter(counter) self.rules.append(DrRule(self.client.matchers[2], value_param, [counter_action])) qp_action = DrActionQp(self.server.qp) self.rules.append(DrRule(self.server.matchers[2], value_param, [qp_action])) u.raw_traffic(self.client, self.server, self.iters) sent_packets = self.query_counter_packets(counter, flow_counter_id) self.assertEqual(sent_packets, 
self.iters, 'Counter of metadata missed some sent packets') @skip_unsupported def add_counter_action_and_send_pkts(self, root_only=False): """ :param root_only : If True, rules are created only on root table """ self.create_players(Mlx5DrResources) counter, flow_counter_id = self.create_counter(self.server.ctx) self.server_counter_action = DrActionFlowCounter(counter) smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) self.qp_action = DrActionQp(self.server.qp) self.create_rx_recv_rules(smac_value, [self.qp_action, self.server_counter_action], root_only=root_only) u.raw_traffic(self.client, self.server, self.iters) recv_packets = self.query_counter_packets(counter, flow_counter_id) self.assertEqual(recv_packets, self.iters, 'Counter missed some recv packets') @skip_unsupported def test_root_tbl_counter_action(self): """ Create flow counter object, on root table attach it to a rule using counter action and perform traffic that hit this rule. Verify that the packets counter increased. """ self.add_counter_action_and_send_pkts(root_only=True) @skip_unsupported def test_tbl_counter_action(self): """ Create flow counter object, on non-root table attach it to a rule using counter action and perform traffic that hit this rule. Verify that the packets counter increased. """ self.add_counter_action_and_send_pkts() @skip_unsupported def test_prevent_duplicate_rule(self): """ Creates RX domain, sets duplicate rule to be not allowed on that domain, try creating duplicate rule. Fail if creation succeeded. """ from tests.mlx5_prm_structs import FlowTableEntryMatchParam self.server = Mlx5DrResources(**self.dev_info) domain_rx = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) domain_rx.allow_duplicate_rules(False) table = DrTable(domain_rx, 1) empty_param = Mlx5FlowMatchParameters(len(FlowTableEntryMatchParam()), FlowTableEntryMatchParam()) matcher = DrMatcher(table, 0, u.MatchCriteriaEnable.NONE, empty_param) self.qp_action = DrActionQp(self.server.qp) self.drop_action = DrActionDrop() self.rules.append(DrRule(matcher, empty_param, [self.qp_action])) with self.assertRaises(PyverbsRDMAError) as ex: self.rules.append(DrRule(matcher, empty_param, [self.drop_action])) self.assertEqual(ex.exception.error_code, errno.EEXIST) def _drop_action(self, root_only=False): self.create_players(Mlx5DrResources) # Initiate the sender side domain_tx = DrDomain(self.client.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_TX) tx_root_table = DrTable(domain_tx, 0) smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) if not root_only: tx_non_root_table = DrTable(domain_tx, 1) tx_dest_table_action = self.fwd_packets_to_table(tx_root_table, tx_non_root_table) smac_value += bytes(2) tx_test_table = tx_root_table if root_only else tx_non_root_table mask_param = Mlx5FlowMatchParameters(len(bytes([0xff] * 6)), bytes([0xff] * 6)) matcher = DrMatcher(tx_test_table, 0, u.MatchCriteriaEnable.OUTER, mask_param) value_param = Mlx5FlowMatchParameters(len(smac_value), smac_value) self.tx_drop_action = DrActionDrop() self.rules.append(DrRule(matcher, value_param, [self.tx_drop_action])) # Initiate the receiver side domain_rx = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) rx_root_table = DrTable(domain_rx, 0) if not root_only: rx_non_root_table = DrTable(domain_rx, 1) rx_dest_table_action = self.fwd_packets_to_table(rx_root_table, rx_non_root_table) rx_test_table = rx_root_table if root_only else rx_non_root_table # Create server counter. 
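        # The counter is attached to the same RX rule as the drop action, so
        # it counts every matching packet that made it past TX; the asserts
        # below use it to distinguish TX-dropped from TX-forwarded traffic.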
        counter, flow_counter_id = self.create_counter(self.server.ctx)
        self.server_counter_action = DrActionFlowCounter(counter)
        mask_param, value_param = self.create_dest_mac_params()
        matcher = DrMatcher(rx_test_table, 0, u.MatchCriteriaEnable.OUTER, mask_param)
        self.rx_drop_action = DrActionDrop()
        self.rules.append(DrRule(matcher, value_param,
                                 [self.server_counter_action, self.rx_drop_action]))
        # Send packets with two different smacs and expect half to be dropped.
        src_mac_drop = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', '')))
        src_mac_non_drop = struct.pack('!6s', bytes.fromhex("88:88:88:88:88:88".replace(':', '')))
        self.send_client_raw_packets(int(self.iters / 2), src_mac=src_mac_drop)
        recv_packets = self.query_counter_packets(counter, flow_counter_id)
        self.assertEqual(recv_packets, 0, 'Drop action did not drop the TX packets')
        self.send_client_raw_packets(int(self.iters / 2), src_mac=src_mac_non_drop)
        recv_packets = self.query_counter_packets(counter, flow_counter_id)
        self.assertEqual(recv_packets, int(self.iters / 2),
                         'Drop action dropped TX packets that did not match the rule')

    @skip_unsupported
    def test_root_tbl_drop_action(self):
        """
        Create root drop actions on TX and RX. Verify using a counter on the
        server RX that only packets which miss the drop rule arrived at the
        server RX.
        """
        self._drop_action(root_only=True)

    @skip_unsupported
    def test_tbl_drop_action(self):
        """
        Create non-root drop actions on TX and RX. Verify using a counter on
        the server RX that only packets which miss the drop rule arrived at
        the server RX.
        """
        self._drop_action()

    @skip_unsupported
    def add_qp_tag_rule_and_send_pkts(self, root_only=False):
        """
        Creates RX domain, table with matcher on source mac. Creates QP action
        and tag action. Creates a rule with those actions on the matcher.
        Verifies traffic and tag.
        :param root_only : If True, rules are created only on root table
        """
        self.wc_flags = e.IBV_WC_EX_WITH_FLOW_TAG
        self.create_players(Mlx5DrResources, wc_flags=e.IBV_WC_EX_WITH_FLOW_TAG)
        qp_action = DrActionQp(self.server.qp)
        tag = 0x123
        tag_action = DrActionTag(tag)
        smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', '')))
        self.create_rx_recv_rules(smac_value, [tag_action, qp_action], root_only=root_only)
        self.domain_rx.sync()
        u.raw_traffic(self.client, self.server, self.iters)
        # Verify tag
        self.assertEqual(self.server.cq.read_flow_tag(), tag, 'Wrong tag value')

    @skip_unsupported
    def test_tbl_qp_tag_rule(self):
        """
        Creates RX domain, non-root table with matcher on source mac. Creates
        QP action and tag action. Creates a rule with those actions on the
        matcher. Verifies traffic and tag.
        """
        self.add_qp_tag_rule_and_send_pkts()

    @skip_unsupported
    def test_root_tbl_qp_tag_rule(self):
        """
        Creates RX domain, root table with matcher on source mac. Creates QP
        action and tag action. Creates a rule with those actions on the
        matcher. Verifies traffic and tag.
        """
        self.add_qp_tag_rule_and_send_pkts(root_only=True)

    @skip_unsupported
    def test_set_matcher_layout(self):
        """
        Creates a non root matcher and sets its size. Creates a rule on that
        matcher and increases the matcher size. Verifies the rule.
""" log_matcher_size = 5 self.create_players(Mlx5DrResources) self.qp_action = DrActionQp(self.server.qp) smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) self.create_rx_recv_rules(smac_value, [self.qp_action], log_matcher_size) self.matcher.set_layout(log_matcher_size + 1) u.raw_traffic(self.client, self.server, self.iters) self.matcher.set_layout(flags=dve.MLX5DV_DR_MATCHER_LAYOUT_RESIZABLE) u.raw_traffic(self.client, self.server, self.iters) @skip_unsupported def test_push_vlan(self): """ Creates RX domain, root table with matcher on source mac. Create a rule to forward all traffic to the non-root table. Creates QP action and push VLAN action. Creates a rule with those actions on the matcher. Verifies traffic and packet with specified VLAN. """ self.client = Mlx5DrResources(**self.dev_info) vlan_hdr = struct.pack('!HH', PacketConsts.VLAN_TPID, (PacketConsts.VLAN_PRIO << 13) + (PacketConsts.VLAN_CFI << 12) + PacketConsts.VLAN_ID) self.server = Mlx5DrResources(msg_size=self.client.msg_size + PacketConsts.VLAN_HEADER_SIZE, **self.dev_info) self.domain_tx = DrDomain(self.client.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_TX) smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) push_action = DrActionPushVLan(self.domain_tx, struct.unpack('I', vlan_hdr)[0]) self.tx_table, self.tx_dest_act = self.create_rx_recv_rules(smac_value, [push_action], domain=self.domain_tx) self.domain_rx = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) qp_action = DrActionQp(self.server.qp) self.create_rx_recv_rules(smac_value, [qp_action], domain=self.domain_rx) exp_packet = u.gen_packet(self.client.msg_size + PacketConsts.VLAN_HEADER_SIZE, with_vlan=True) u.raw_traffic(self.client, self.server, self.iters, expected_packet=exp_packet) @skip_unsupported def test_pop_vlan(self): """ Creates RX domain, root table with matcher on source mac. Create a rule to forward all traffic to the non-root table. Creates QP action and pop VLAN action. Creates a rule with those actions on the matcher. Verifies packets received without VLAN header. """ self.server = Mlx5DrResources(**self.dev_info) self.client = Mlx5DrResources(**self.dev_info) exp_packet = u.gen_packet(self.server.msg_size - PacketConsts.VLAN_HEADER_SIZE) qp_action = DrActionQp(self.server.qp) pop_action = DrActionPopVLan() smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) self.create_rx_recv_rules(smac_value, [pop_action, qp_action]) u.raw_traffic(self.client, self.server, self.iters, with_vlan=True, expected_packet=exp_packet) @skip_unsupported def dest_array(self, root_only=False): """ Creates RX domain, root table with matcher on source mac. Create a rule to forward all traffic to the non-root table. On this table add a rule with multi dest array action which include destination QP actions and next FT (also with QP action). Validate on all QPs the received packets. 
:param root_only : If True, rules are created only on root table """ max_actions = 8 self.client = Mlx5DrResources(qp_count=max_actions, **self.dev_info) self.server = Mlx5DrResources(qp_count=max_actions, **self.dev_info) self.domain_rx = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) actions = [] dest_attrs = [] for qp in self.server.qps[:-1]: qp_action = DrActionQp(qp) actions.append(qp_action) dest_attrs.append(DrActionDestAttr(dve.MLX5DV_DR_ACTION_DEST, qp_action)) ft_action = DrTable(self.domain_rx, 0xff) last_table_action = DrActionDestTable(ft_action) smac_mask = bytes([0xff] * 6) + bytes(2) mask_param = Mlx5FlowMatchParameters(len(smac_mask), smac_mask) last_matcher = DrMatcher(ft_action, 1, u.MatchCriteriaEnable.OUTER, mask_param) dest_attrs.append(DrActionDestAttr(dve.MLX5DV_DR_ACTION_DEST, last_table_action)) last_qp_action = DrActionQp(self.server.qps[max_actions - 1]) smac_value = struct.pack('!6s2s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', '')), bytes(2)) value_param = Mlx5FlowMatchParameters(len(smac_value), smac_value) self.rules.append(DrRule(last_matcher, value_param, [last_qp_action])) multi_dest_a = DrActionDestArray(self.domain_rx, len(dest_attrs), dest_attrs) smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) self.create_rx_recv_rules(smac_value, [multi_dest_a], domain=self.domain_rx, root_only=root_only) u.raw_traffic(self.client, self.server, self.iters) @skip_unsupported def test_root_dest_array(self): """ Creates RX domain, root table with matcher on source mac.on root table add a rule with multi dest array action which include destination QP actions and next FT (also with QP action). Validate on all QPs the received packets. """ self.dest_array(root_only=True) @skip_unsupported def test_dest_array(self): """ Creates RX domain, non-root table with matcher on source mac. Create a rule to forward all traffic to the non-root table. On this table add a rule with multi dest array action which include destination QP actions and next FT (also with QP action). Validate on all QPs the received packets. """ self.dest_array() @skip_unsupported def test_tx_def_miss_action(self): """ Create TX root table and forward all traffic to next SW steering table, create two matchers with different priorities, one with default miss action (on TX it's go to wire action) and one with drop action, default miss action should occur before the drop action hence packets should reach server side which has RX rule with QP action. 
""" self.create_players(Mlx5DrResources) self.domain_tx = DrDomain(self.client.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_TX) tx_def_miss = DrActionDefMiss() tx_drop_action = DrActionDrop() smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) self.tx_table, self.tx_dest_act = self.create_rx_recv_rules(smac_value, [tx_def_miss], domain=self.domain_tx) qp_action = DrActionQp(self.server.qp) self.create_rx_recv_rules(smac_value, [qp_action]) smac_mask = bytes([0xff] * 6) + bytes(2) mask_param = Mlx5FlowMatchParameters(len(smac_mask), smac_mask) matcher_tx2 = DrMatcher(self.tx_table, 2, u.MatchCriteriaEnable.OUTER, mask_param) smac_value += bytes(2) value_param = Mlx5FlowMatchParameters(len(smac_value), smac_value) self.rules.append(DrRule(matcher_tx2, value_param, [tx_drop_action])) u.raw_traffic(self.client, self.server, self.iters) @skip_unsupported def add_dest_tir_action_send_pkts(self, root_only=False): """ :param root_only: If True, rules are created only on root table """ self.client = Mlx5DrTirResources(**self.dev_info) self.server = Mlx5DrTirResources(**self.dev_info, server=True) tir_action = DrActionDestTir(self.server.tir) smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) self.create_rx_recv_rules(smac_value, [tir_action], root_only=root_only) u.raw_traffic(self.client, self.server, self.iters) @skip_unsupported def test_dest_tir(self): self.add_dest_tir_action_send_pkts() @skip_unsupported def test_root_dest_tir(self): self.add_dest_tir_action_send_pkts(root_only=True) def packet_reformat_actions(self, outer, root_only=False, l2_ref_type=True): """ Creates packet reformat actions on TX (encap) and on RX (decap). :param outer: The outer header to encap. :param root_only: If True create actions only on root tables :param l2_ref_type: If False use L2 to L3 tunneling reformat """ smac_mask = bytes([0xff] * 6) + bytes(2) mask_param = Mlx5FlowMatchParameters(len(smac_mask), smac_mask) smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) value_param = Mlx5FlowMatchParameters(len(smac_value), smac_value) reformat_flag = dve.MLX5DV_DR_ACTION_FLAGS_ROOT_LEVEL if root_only else 0 # TX domain_tx = DrDomain(self.client.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_TX) tx_root_table = DrTable(domain_tx, 0) tx_root_matcher = DrMatcher(tx_root_table, 0, u.MatchCriteriaEnable.OUTER, mask_param) if not root_only: tx_table = DrTable(domain_tx, 1) tx_matcher = DrMatcher(tx_table, 1, u.MatchCriteriaEnable.OUTER, mask_param) dest_table_action_tx = DrActionDestTable(tx_table) self.rules.append(DrRule(tx_root_matcher, value_param, [dest_table_action_tx])) reformat_matcher = tx_root_matcher if root_only else tx_matcher # Create encap action tx_reformat_type = dve.MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL_ if \ l2_ref_type else dve.MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL_ reformat_action_tx = DrActionPacketReformat(domain=domain_tx, flags=reformat_flag, reformat_type=tx_reformat_type, data=outer) smac_value_tx = smac_value + bytes(2) value_param = Mlx5FlowMatchParameters(len(smac_value_tx), smac_value_tx) self.rules.append(DrRule(reformat_matcher, value_param, [reformat_action_tx])) # RX domain_rx = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) # Create decap action data = struct.pack('!6s6s', bytes.fromhex(PacketConsts.DST_MAC.replace(':', '')), bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) data += PacketConsts.ETHER_TYPE_IPV4.to_bytes(2, 'big') rx_reformat_type = 
dve.MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2_ if \ l2_ref_type else dve.MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L3_TUNNEL_TO_L2_ reformat_action_rx = DrActionPacketReformat(domain=domain_rx, flags=reformat_flag, reformat_type=rx_reformat_type, data=None if l2_ref_type else data) qp_action = DrActionQp(self.server.qp) if root_only: rx_root_table = DrTable(domain_rx, 0) rx_root_matcher = DrMatcher(rx_root_table, 0, u.MatchCriteriaEnable.OUTER, mask_param) self.rules.append(DrRule(rx_root_matcher, value_param, [reformat_action_rx, qp_action])) else: self.create_rx_recv_rules(smac_value, [reformat_action_rx, qp_action], domain=domain_rx) # Send traffic and validate packet u.raw_traffic(self.client, self.server, self.iters) @skip_unsupported def test_flow_sampler(self): """ Flow sampler has a default table (all the packets are forwarded to it) and sampler actions (applied to the sampled packets). The default table has a counter action. For the NIC RX table the sampler actions are a counter and a TIR. Verify that the default counter counts all the packets. Verify the sampled-packets counter and that the sampled packets are received on the QP (from the TIR). """ self.client = Mlx5DrTirResources(**self.dev_info) self.server = Mlx5DrTirResources(**self.dev_info, server=True) self.iters = 1000 # Create tir & counter actions for sampler attr tir_action = DrActionDestTir(self.server.tir) counter_1, flow_counter_id_1 = self.create_counter(self.server.ctx) self.server_counter_action = DrActionFlowCounter(counter_1) # Create resources smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) rx_domain = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) default_tbl = DrTable(rx_domain, 2) # Create sampler action on NIC RX table sample_actions = [self.server_counter_action, tir_action] sampler_attr = DrFlowSamplerAttr(sample_ratio=SAMPLE_RATIO, default_next_table=default_tbl, sample_actions=sample_actions) sampler_action = DrActionFlowSample(sampler_attr) tbl, _ = self.create_rx_recv_rules(smac_value, [sampler_action], domain=rx_domain) smac_mask = bytes([0xff] * 6) + bytes(2) mask_param = Mlx5FlowMatchParameters(len(smac_mask), smac_mask) self.default_matcher = DrMatcher(default_tbl, 1, u.MatchCriteriaEnable.OUTER, mask_param) # Size of the matcher value should be a multiple of 4 smac_value += bytes(2) value_param = Mlx5FlowMatchParameters(len(smac_value), smac_value) # Create Counter action on default table counter_2, flow_counter_id_2 = self.create_counter(self.server.ctx) self.server_counter_action_2 = DrActionFlowCounter(counter_2) self.rules.append(DrRule(self.default_matcher, value_param, [self.server_counter_action_2])) # Send traffic and validate packet u.sampler_traffic(self.client, self.server, self.iters) recv_packets = self.query_counter_packets(counter=counter_1, flow_counter_id=flow_counter_id_1) exp_packets = math.ceil(self.iters / SAMPLE_RATIO) max_exp_packets = int(exp_packets * (1 + SAMPLER_ERROR_MARGIN)) min_exp_packets = int(exp_packets * (1 - SAMPLER_ERROR_MARGIN)) is_sampled_packets_in_error_margin = min_exp_packets <= recv_packets <= max_exp_packets self.assertTrue(is_sampled_packets_in_error_margin, f'Number of sampled packets {recv_packets} differs from the expected ' f'{exp_packets} by more than {SAMPLER_ERROR_MARGIN * 100}%') recv_packets_from_default_tbl = \ self.query_counter_packets(counter=counter_2, flow_counter_id=flow_counter_id_2) self.assertEqual(recv_packets_from_default_tbl, self.iters, 'Counter on default table missed some recv packets') @skip_unsupported def geneve_match_rx(self,
root_only=False): """ Creates matcher on RX to match on Geneve related fields with counter and qp action, sends packets and verifies the matcher. :param root_only: If True, rules are created only on root table """ self.create_players(Mlx5DrResources) geneve_mask, geneve_val = self.create_geneve_params() domain_rx = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) counter, flow_counter_id = self.create_counter(self.server.ctx) self.server_counter_action = DrActionFlowCounter(counter) self.qp_action = DrActionQp(self.server.qp) self.create_rx_recv_rules_based_on_match_params(geneve_mask, geneve_val, [self.qp_action, self.server_counter_action], match_criteria=u.MatchCriteriaEnable.MISC, domain=domain_rx, root_only=root_only) inner_msg_size = self.client.msg_size - GENEVE_PACKET_OUTER_LENGTH outer = self.gen_geneve_tunnel_encap_header(inner_msg_size) packet_to_send = outer + u.gen_packet(msg_size=inner_msg_size) # Send traffic and validate packet u.raw_traffic(self.client, self.server, self.iters, packet_to_send=packet_to_send) recv_packets_rx = self.query_counter_packets(counter, flow_counter_id) self.assertEqual(recv_packets_rx, self.iters, 'Counter rx missed some recv packets') src_mac = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) self.send_client_raw_packets(self.iters, src_mac=src_mac) recv_packets_rx = self.query_counter_packets(counter, flow_counter_id) self.assertEqual(recv_packets_rx, self.iters, 'Counter rx counts more than expected recv packets') @requires_geneve_fields_rx_support def test_root_geneve_match_rx(self): """ Creates matcher on RX root table to match on Geneve related fields with counter and qp action, sends packets and verifies the matcher. """ self.geneve_match_rx(root_only=True) @requires_geneve_fields_rx_support def test_geneve_match_rx(self): """ Creates matcher on RX non-root table to match on Geneve related fields with counter and qp action, sends packets and verifies the matcher. """ self.geneve_match_rx() @skip_unsupported def test_geneve_match_tx(self): """ Creates matcher on TX to match on Geneve related fields with counter action, sends packets and verifies the matcher. 
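On TX the match is validated with a counter action rather than a QP action; a catch-all RX rule with a QP action is added so the server can still receive the packets.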
""" from tests.mlx5_prm_structs import FlowTableEntryMatchParam self.create_players(Mlx5DrResources) skip_if_has_geneve_tx_bug(self.client.ctx) geneve_mask, geneve_val = self.create_geneve_params() # TX self.domain_tx = DrDomain(self.client.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_TX) tx_root_table = DrTable(self.domain_tx, 0) tx_root_matcher = DrMatcher(tx_root_table, 0, u.MatchCriteriaEnable.MISC, geneve_mask) tx_table = DrTable(self.domain_tx, 1) self.tx_matcher = DrMatcher(tx_table, 1, u.MatchCriteriaEnable.MISC, geneve_mask) counter, flow_counter_id = self.create_counter(self.client.ctx) self.client_counter_action = DrActionFlowCounter(counter) self.dest_table_action_tx = DrActionDestTable(tx_table) self.rules.append(DrRule(tx_root_matcher, geneve_val, [self.dest_table_action_tx])) self.rules.append(DrRule(self.tx_matcher, geneve_val, [self.client_counter_action])) # RX domain_rx = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) self.qp_action = DrActionQp(self.server.qp) empty_param = Mlx5FlowMatchParameters(len(FlowTableEntryMatchParam()), FlowTableEntryMatchParam()) self.create_rx_recv_rules_based_on_match_params\ (empty_param, empty_param, [self.qp_action], match_criteria=u.MatchCriteriaEnable.NONE, domain=domain_rx) inner_msg_size = self.client.msg_size - GENEVE_PACKET_OUTER_LENGTH outer = self.gen_geneve_tunnel_encap_header(inner_msg_size) packet_to_send = outer + u.gen_packet(msg_size=inner_msg_size) # Send traffic and validate packet u.raw_traffic(self.client, self.server, self.iters, packet_to_send=packet_to_send) recv_packets_tx = self.query_counter_packets(counter, flow_counter_id) self.assertEqual(recv_packets_tx, self.iters, 'Counter tx missed some recv packets') src_mac = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) self.send_client_raw_packets(self.iters, src_mac=src_mac) recv_packets_tx = self.query_counter_packets(counter, flow_counter_id) self.assertEqual(recv_packets_tx, self.iters, 'Counter tx counts more than expected recv packets') def roce_bth_match(self, domain_flag=dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX): """ Creates RoCE BTH rule on RX/TX domain. For RX domain, will match on BTH related fields with counter and qp action. For TX domain, will match on BTH relate fields with counter action. And then generate and send RoCE BTH hit and miss traffic according to the matcher and validate the result. :param domain_flag: RX/TX Domain for the test. 
""" from tests.mlx5_prm_structs import FlowTableEntryMatchParam self.create_players(Mlx5DrResources) roce_bth_mask, roce_bth_val = self.create_roce_bth_params() empty_param = Mlx5FlowMatchParameters(len(FlowTableEntryMatchParam()), FlowTableEntryMatchParam()) self.domain = DrDomain(self.server.ctx, domain_flag) root_table = DrTable(self.domain, 0) root_matcher = DrMatcher(root_table, 0, u.MatchCriteriaEnable.NONE, empty_param) table = DrTable(self.domain, 1) self.matcher = DrMatcher(table, 1, u.MatchCriteriaEnable.MISC, roce_bth_mask) if domain_flag == dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX: counter, flow_counter_id = self.create_counter(self.server.ctx) else: counter, flow_counter_id = self.create_counter(self.client.ctx) self.dest_tbl_action = DrActionDestTable(table) self.qp_action = DrActionQp(self.server.qp) self.counter_action = DrActionFlowCounter(counter) self.rules.append(DrRule(root_matcher, empty_param, [self.dest_tbl_action])) if domain_flag == dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX: self.rules.append(DrRule(self.matcher, roce_bth_val, [self.qp_action, self.counter_action])) else: self.rules.append(DrRule(self.matcher, roce_bth_val, [self.counter_action])) if domain_flag == dve.MLX5DV_DR_DOMAIN_TYPE_NIC_TX: domain_rx = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) self.create_rx_recv_rules_based_on_match_params\ (empty_param, empty_param, [self.qp_action], match_criteria=u.MatchCriteriaEnable.NONE, domain=domain_rx) inner_msg_size = self.client.msg_size - ROCE_PACKET_OUTER_LENGTH outer = self.gen_roce_bth_header(inner_msg_size) packet_to_send = outer + u.gen_packet(msg_size=inner_msg_size) # Send traffic hit the rule and validate by the counter action u.raw_traffic(self.client, self.server, self.iters, packet_to_send=packet_to_send) recv_packets = self.query_counter_packets(counter, flow_counter_id) self.assertEqual(recv_packets, self.iters, 'Counter missed some recv packets') # Send traffic miss the rule and validate by the counter action src_mac = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) self.send_client_raw_packets(self.iters, src_mac=src_mac) recv_packets = self.query_counter_packets(counter, flow_counter_id) self.assertEqual(recv_packets, self.iters, 'Counter counts more than expected recv packets') @u.requires_roce_disabled @skip_unsupported def test_roce_bth_match_rx(self): """ Verify RX matching on RoCE BTH. """ self.roce_bth_match() @u.requires_roce_disabled @skip_unsupported def test_roce_bth_match_tx(self): """ Verify TX matching on RoCE BTH. """ self.roce_bth_match(domain_flag=dve.MLX5DV_DR_DOMAIN_TYPE_NIC_TX) @skip_unsupported def test_packet_reformat_l2_gre(self): """ Creates GRE packet with non-root l2 to l2 reformat actions on TX (encap) and on RX (decap). """ self.create_players(Mlx5DrResources) encap_header = self.gen_gre_tunnel_encap_header(self.client.msg_size, is_l2_tunnel=True) self.packet_reformat_actions(outer=encap_header) @requires_reformat_support @u.requires_encap_disabled_if_eswitch_on @skip_unsupported def test_packet_reformat_root_l2_gre(self): """ Creates GRE packet with root l2 to l2 reformat actions on TX (encap) and on RX (decap). """ self.create_players(Mlx5DrResources) encap_header = self.gen_gre_tunnel_encap_header(self.client.msg_size, is_l2_tunnel=True) self.packet_reformat_actions(outer=encap_header, root_only=True) @skip_unsupported def test_packet_reformat_l3_gre(self): """ Creates GRE packet with non-root l2 to l3 reformat actions on TX (encap) and on RX (decap). 
""" self.create_players(Mlx5DrResources) encap_header = self.gen_gre_tunnel_encap_header(self.client.msg_size, is_l2_tunnel=False) self.packet_reformat_actions(outer=encap_header, l2_ref_type=False) @requires_reformat_support @u.requires_encap_disabled_if_eswitch_on @skip_unsupported def test_packet_reformat_root_l3_gre(self): """ Creates GRE packet with root l2 to l3 reformat actions on TX (encap) and on RX (decap). """ self.create_players(Mlx5DrResources) encap_header = self.gen_gre_tunnel_encap_header(self.client.msg_size, is_l2_tunnel=False) self.packet_reformat_actions(outer=encap_header, root_only=True, l2_ref_type=False) @skip_unsupported def test_packet_reformat_l2_geneve(self): """ Creates Geneve packet with non-root l2 to l2 reformat actions on TX (encap) and on RX (decap). """ self.create_players(Mlx5DrResources) encap_header = self.gen_geneve_tunnel_encap_header(self.client.msg_size, is_l2_tunnel=True) self.packet_reformat_actions(outer=encap_header) @requires_reformat_support @u.requires_encap_disabled_if_eswitch_on @skip_unsupported def test_packet_reformat_root_l2_geneve(self): """ Creates Geneve packet with root l2 to l2 reformat actions on TX (encap) and on RX (decap). """ self.create_players(Mlx5DrResources) encap_header = self.gen_geneve_tunnel_encap_header(self.client.msg_size, is_l2_tunnel=True) self.packet_reformat_actions(outer=encap_header, root_only=True) @skip_unsupported def test_packet_reformat_l3_geneve(self): """ Creates Geneve packet with non-root l2 to l3 tunnel reformat actions on TX (encap) and on RX (decap). """ self.create_players(Mlx5DrResources) encap_header = self.gen_geneve_tunnel_encap_header(self.client.msg_size, is_l2_tunnel=False) self.packet_reformat_actions(outer=encap_header, l2_ref_type=False) @requires_reformat_support @u.requires_encap_disabled_if_eswitch_on @skip_unsupported def test_packet_reformat_root_l3_geneve(self): """ Creates Geneve packet with root l2 to l3 reformat actions on TX (encap) and on RX (decap). """ self.create_players(Mlx5DrResources) encap_header = self.gen_geneve_tunnel_encap_header(self.client.msg_size, is_l2_tunnel=False) self.packet_reformat_actions(outer=encap_header, root_only=True, l2_ref_type=False) @skip_unsupported def test_flow_meter(self): """ Create flow meter actions on TX and RX non-root tables. Add green and red counters to the meter rules to verify the packets split to different colors. Send minimal traffic to see that both counters increased. 
""" from tests.mlx5_prm_structs import FlowTableEntryMatchParam, FlowTableEntryMatchSetMisc2,\ FlowMeterParams self.create_players(Mlx5DrResources) # Common resources matcher_len = len(FlowTableEntryMatchParam()) empty_param = Mlx5FlowMatchParameters(matcher_len, FlowTableEntryMatchParam()) reg_c_idx = self.client.get_first_flow_meter_reg_id() reg_c_field = METADATA_C_FIELDS[reg_c_idx] meter_param = FlowMeterParams(valid=0x1, bucket_overflow=0x1, start_color=0x2, cir_mantissa=1, cir_exponent=6) # 15.625MBps reg_c_mask = Mlx5FlowMatchParameters(matcher_len, FlowTableEntryMatchParam( misc_parameters_2=FlowTableEntryMatchSetMisc2(**{reg_c_field: 0xffffffff}))) reg_c_green = Mlx5FlowMatchParameters(matcher_len, FlowTableEntryMatchParam( misc_parameters_2=FlowTableEntryMatchSetMisc2(**{reg_c_field: FLOW_METER_GREEN}))) reg_c_red = Mlx5FlowMatchParameters(matcher_len, FlowTableEntryMatchParam( misc_parameters_2=FlowTableEntryMatchSetMisc2(**{reg_c_field: FLOW_METER_RED}))) self.client.domain = DrDomain(self.client.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_TX) self.server.domain = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) for player in [self.client, self.server]: player.root_table = DrTable(player.domain, 0) player.table = DrTable(player.domain, 1) player.next_table = DrTable(player.domain, 2) player.root_matcher = DrMatcher(player.root_table, 0, u.MatchCriteriaEnable.NONE, empty_param) player.matcher = DrMatcher(player.table, 0, u.MatchCriteriaEnable.NONE, empty_param) player.reg_c_matcher = DrMatcher(player.next_table, 2, u.MatchCriteriaEnable.MISC_2, reg_c_mask) meter_attr = DrFlowMeterAttr(player.next_table, 1, reg_c_idx, meter_param) player.meter_action = DrActionFlowMeter(meter_attr) player.dest_action = DrActionDestTable(player.table) self.rules.append(DrRule(player.root_matcher, empty_param, [player.dest_action])) self.rules.append(DrRule(player.matcher, empty_param, [player.meter_action])) player.counter_green, player.flow_counter_id_green = self.create_counter(player.ctx) player.counter_action_green = DrActionFlowCounter(player.counter_green) player.counter_red, player.flow_counter_id_red = self.create_counter(player.ctx) player.counter_action_red = DrActionFlowCounter(player.counter_red) self.rules.append(DrRule(player.reg_c_matcher, reg_c_green, [player.counter_action_green])) self.rules.append(DrRule(player.reg_c_matcher, reg_c_red, [player.counter_action_red])) packet = u.gen_packet(self.client.msg_size) # We want to send at least at 30MBps speed rate_limit = 30 u.high_rate_send(self.client, packet, rate_limit) for name, player in {'client': self.client, 'server': self.server}.items(): green_packets = self.query_counter_packets(player.counter_green, player.flow_counter_id_green) red_packets = self.query_counter_packets(player.counter_red, player.flow_counter_id_red) self.assertTrue(green_packets > 0, f'No packet of {name} got green color') self.assertTrue(red_packets > 0, f'No packet of {name} got red color') def fwd_packets_to_table(self, src_table, dst_table): """ Forward all traffic from one table to another using empty matcher :param src_table: Source table :param dst_table: Destination table :return: DrActionDestTable used to move the packets from src_table to dst_table """ from tests.mlx5_prm_structs import FlowTableEntryMatchParam empty_param = Mlx5FlowMatchParameters(len(FlowTableEntryMatchParam()), FlowTableEntryMatchParam()) matcher = DrMatcher(src_table, 0, u.MatchCriteriaEnable.NONE, empty_param) dest_table_action = DrActionDestTable(dst_table) 
self.rules.append(DrRule(matcher, empty_param, [dest_table_action])) return dest_table_action def gen_two_smac_rules(self, table, actions): """ Generate two rules that match on different smac values. The rules use the same actions and matchers. :param table: The table the rules are applied on :param actions: SMAC rule actions :return: The two generated smacs """ smac_mask = bytes([0xff] * 6) + bytes(2) mask_param = Mlx5FlowMatchParameters(len(smac_mask), smac_mask) matcher = DrMatcher(table, 0, u.MatchCriteriaEnable.OUTER, mask_param) src_mac_1 = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) src_mac_2 = struct.pack('!6s', bytes.fromhex("88:88:88:88:88:88".replace(':', ''))) src_mac_1_for_matcher = src_mac_1 + bytes(2) src_mac_2_for_matcher = src_mac_2 + bytes(2) value_param_1 = Mlx5FlowMatchParameters(len(src_mac_1_for_matcher), src_mac_1_for_matcher) value_param_2 = Mlx5FlowMatchParameters(len(src_mac_2_for_matcher), src_mac_2_for_matcher) self.rules.append(DrRule(matcher, value_param_1, actions)) self.rules.append(DrRule(matcher, value_param_2, actions)) return src_mac_1, src_mac_2 def reuse_action_and_matcher(self, root_only=False): """ Creates rules with the same matcher and actions; the rules match on different smacs. On the TX side, creates a rule with a counter action; on the RX side, creates a rule with counter and drop actions. Send traffic to match the rules and verify them by querying the counters. :param root_only: If True, rules are created only on root table. """ self.create_players(Mlx5DrResources) # Create TX resources self.domain_tx = DrDomain(self.client.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_TX) tx_root_table = DrTable(self.domain_tx, 0) if not root_only: tx_non_root_table = DrTable(self.domain_tx, 1) tx_dest_table_action = self.fwd_packets_to_table(tx_root_table, tx_non_root_table) tx_table = tx_root_table if root_only else tx_non_root_table # Create client counter. client_counter, tx_flow_counter_id = self.create_counter(self.client.ctx) self.client_counter_action = DrActionFlowCounter(client_counter) tx_actions = [self.client_counter_action] self.gen_two_smac_rules(tx_table, tx_actions) # Create RX resources self.domain_rx = DrDomain(self.server.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) rx_root_table = DrTable(self.domain_rx, 0) if not root_only: rx_non_root_table = DrTable(self.domain_rx, 1) rx_dest_table_action = self.fwd_packets_to_table(rx_root_table, rx_non_root_table) rx_table = rx_root_table if root_only else rx_non_root_table # Create server counter.
server_counter, rx_flow_counter_id = self.create_counter(self.server.ctx) self.server_counter_action = DrActionFlowCounter(server_counter) self.rx_drop_action = DrActionDrop() actions = [self.server_counter_action, self.rx_drop_action] src_mac_1, src_mac_2 = self.gen_two_smac_rules(rx_table, actions) # Send packets with two different smacs which are used and reused in action and matcher self.send_client_raw_packets(int(self.iters / 2), src_mac=src_mac_1) self.send_client_raw_packets(int(self.iters / 2), src_mac=src_mac_2) matched_packets_tx = self.query_counter_packets(client_counter, tx_flow_counter_id) self.assertEqual(matched_packets_tx, self.iters, 'Reuse action or matcher failed on TX') matched_packets_rx = self.query_counter_packets(server_counter, rx_flow_counter_id) self.assertEqual(matched_packets_rx, self.iters, 'Reuse action or matcher failed on RX') @skip_unsupported def test_root_reuse_action_and_matcher(self): """ Create root rules on TX and RX that use the same matcher and actions """ self.reuse_action_and_matcher(root_only=True) @skip_unsupported def test_reuse_action_and_matcher(self): """ Create non-root rules on TX and RX that use the same matcher and actions """ self.reuse_action_and_matcher() class Mlx5DrDumpTest(PyverbsAPITestCase): def setUp(self): super().setUp() self.res = None def tearDown(self): super().tearDown() if self.res: self.res.ctx.close() @skip_unsupported def test_domain_dump(self): dump_file = '/tmp/dump.txt' self.res = Mlx5DrResources(self.dev_name, self.ib_port) self.domain_rx = DrDomain(self.res.ctx, dve.MLX5DV_DR_DOMAIN_TYPE_NIC_RX) self.domain_rx.dump(dump_file) self.assertTrue(path.isfile(dump_file), 'Dump file does not exist.') self.assertGreater(path.getsize(dump_file), 0, 'Dump file is empty') rdma-core-56.1/tests/test_mlx5_flow.py000066400000000000000000000170001477342711600200170ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia All rights reserved. See COPYING file """ Test module for pyverbs' mlx5 flow module. """ import unittest import errno from pyverbs.providers.mlx5.mlx5dv_flow import Mlx5FlowMatcher, \ Mlx5FlowMatcherAttr, Mlx5FlowMatchParameters, Mlx5FlowActionAttr, Mlx5Flow,\ Mlx5PacketReformatFlowAction from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsUserError from tests.utils import requires_root_on_eth, PacketConsts import pyverbs.providers.mlx5.mlx5_enums as dve from tests.mlx5_base import Mlx5RDMATestCase from tests.base import RawResources import pyverbs.enums as e import tests.utils as u import struct MAX_MATCH_PARAM_SIZE = 0x180 @u.skip_unsupported def requires_reformat_support(func): def func_wrapper(instance): nic_tbl_caps = u.query_nic_flow_table_caps(instance) # Verify that both NIC RX and TX support reformat actions by checking # the following PRM fields: encap_general_header, # log_max_packet_reformat, and reformat (for both RX and TX). 
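# Encap is executed on TX and decap on RX, so the reformat capability is required # on both the transmit and receive flow tables.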
if not (nic_tbl_caps.encap_general_header and nic_tbl_caps.log_max_packet_reformat_context and nic_tbl_caps.flow_table_properties_nic_receive.reformat and nic_tbl_caps.flow_table_properties_nic_transmit.reformat): raise unittest.SkipTest('NIC flow table does not support reformat') return func(instance) return func_wrapper def gen_vxlan_l2_tunnel_encap_header(msg_size): vxlan_header = u.gen_vxlan_header() udp_header = u.gen_udp_header(packet_len=msg_size + len(vxlan_header), dst_port=PacketConsts.VXLAN_PORT) ip_header = u.gen_ipv4_header(packet_len=msg_size + len(vxlan_header) + len(udp_header)) mac_header = u.gen_ethernet_header() return mac_header + ip_header + udp_header + vxlan_header class Mlx5FlowResources(RawResources): def create_matcher(self, mask, match_criteria_enable, flags=0, ft_type=dve.MLX5DV_FLOW_TABLE_TYPE_NIC_RX_): """ Creates a matcher from a provided mask. :param mask: The mask to match on (in bytes) :param match_criteria_enable: Bitmask representing which of the headers and parameters in match_criteria are used :param flags: Flow matcher flags :param ft_type: Flow table type :return: Resulting matcher """ try: flow_match_param = Mlx5FlowMatchParameters(len(mask), mask) attr = Mlx5FlowMatcherAttr(match_mask=flow_match_param, match_criteria_enable=match_criteria_enable, flags=flags, ft_type=ft_type) matcher = Mlx5FlowMatcher(self.ctx, attr) except PyverbsRDMAError as ex: if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]: raise unittest.SkipTest('Matcher creation is not supported') raise ex return matcher @requires_root_on_eth() def create_qps(self): super().create_qps() class Mlx5MatcherTest(Mlx5RDMATestCase): def setUp(self): super().setUp() self.iters = 10 self.server = None self.client = None @u.skip_unsupported def test_create_empty_matcher(self): """ Creates an empty matcher """ self.res = Mlx5FlowResources(**self.dev_info) empty_mask = bytes(MAX_MATCH_PARAM_SIZE) self.res.create_matcher(empty_mask, u.MatchCriteriaEnable.NONE) @u.skip_unsupported def test_create_smac_matcher(self): """ Creates a matcher to match on outer source mac """ self.res = Mlx5FlowResources(**self.dev_info) smac_mask = bytes([0xff, 0xff, 0xff, 0xff, 0xff, 0xff]) self.res.create_matcher(smac_mask, u.MatchCriteriaEnable.OUTER) @u.skip_unsupported def test_smac_matcher_to_qp_flow(self): """ Creates a matcher to match on outer source mac and a flow that forwards packets to QP when matching on source mac. """ self.create_players(Mlx5FlowResources) smac_mask = bytes([0xff] * 6) matcher = self.server.create_matcher(smac_mask, u.MatchCriteriaEnable.OUTER) smac_value = struct.pack('!6s', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) value_param = Mlx5FlowMatchParameters(len(smac_value), smac_value) action_qp = Mlx5FlowActionAttr(action_type=dve.MLX5DV_FLOW_ACTION_DEST_IBV_QP, qp=self.server.qp) self.server.flow = Mlx5Flow(matcher, value_param, [action_qp], 1) u.raw_traffic(self.client, self.server, self.iters) @requires_reformat_support @u.requires_encap_disabled_if_eswitch_on def test_tx_packet_reformat(self): """ Creates a packet reformat (encap) action on TX and, with a QP action on RX, verifies that the packet was encapsulated as expected. """ self.client = Mlx5FlowResources(**self.dev_info) outer = gen_vxlan_l2_tunnel_encap_header(self.client.msg_size) # Due to the encapsulation action, the IPv4 and UDP checksums of the outer header # will be recalculated, so we need to skip them during packet validation.
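# The offsets below assume a 14-byte outer Ethernet header followed by a 20-byte # outer IPv4 header: the IPv4 Identification field then sits at bytes 18-19 and # the IPv4 checksum at bytes 24-25 of the received packet.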
ipv4_id_idx = [18, 19] ipv4_chksum_idx = [24, 25] udp_chksum_idx = [34, 35] # The server will receive the encapsulated packet, so the message size must # include the length of the outer part. self.server = Mlx5FlowResources(msg_size=self.client.msg_size + len(outer), **self.dev_info) empty_bytes_arr = bytes(MAX_MATCH_PARAM_SIZE) empty_value_param = Mlx5FlowMatchParameters(len(empty_bytes_arr), empty_bytes_arr) # TX steering tx_matcher = self.client.create_matcher(empty_bytes_arr, u.MatchCriteriaEnable.NONE, e.IBV_FLOW_ATTR_FLAGS_EGRESS, dve.MLX5DV_FLOW_TABLE_TYPE_NIC_TX_) # Create encap action reformat_action = Mlx5PacketReformatFlowAction( self.client.ctx, data=outer, reformat_type=dve.MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL_, ft_type=dve.MLX5DV_FLOW_TABLE_TYPE_NIC_TX_) action_reformat_attr = Mlx5FlowActionAttr(flow_action=reformat_action, action_type=dve.MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION) self.client.flow = Mlx5Flow(tx_matcher, empty_value_param, [action_reformat_attr], 1) # RX steering rx_matcher = self.server.create_matcher(empty_bytes_arr, u.MatchCriteriaEnable.NONE) action_qp_attr = Mlx5FlowActionAttr(action_type=dve.MLX5DV_FLOW_ACTION_DEST_IBV_QP, qp=self.server.qp) self.server.flow = Mlx5Flow(rx_matcher, empty_value_param, [action_qp_attr], 1) # Send traffic and validate packet packet = u.gen_packet(self.client.msg_size) u.raw_traffic(self.client, self.server, self.iters, expected_packet=outer + packet, skip_idxs=ipv4_id_idx + ipv4_chksum_idx + udp_chksum_idx) rdma-core-56.1/tests/test_mlx5_huge_page.py000066400000000000000000000035651477342711600210060ustar00rootroot00000000000000import unittest from pyverbs.pyverbs_error import PyverbsRDMAError from tests.mlx5_base import Mlx5PyverbsAPITestCase from pyverbs.qp import QP, QPInitAttr, QPCap from pyverbs.srq import SRQ, SrqInitAttr from pyverbs.pd import PD from pyverbs.cq import CQ import tests.utils as u def huge_pages_supported(): try: u.huge_pages_supported() except unittest.SkipTest: return False return True class ResourcesOnHugePageTest(Mlx5PyverbsAPITestCase): def create_cq(self): return CQ(self.ctx, 100, None, None, 0) def create_qp(self): with PD(self.ctx) as pd: with self.create_cq() as cq: attr = QPInitAttr(scq=cq, rcq=cq, cap=QPCap(max_recv_wr=100, max_send_wr=100)) QP(pd, attr) def create_srq(self): with PD(self.ctx) as pd: SRQ(pd, SrqInitAttr()) def set_env_alloc_type(self, alloc_type): self.set_env_variable('MLX_CQ_ALLOC_TYPE', alloc_type) self.set_env_variable('MLX_QP_ALLOC_TYPE', alloc_type) self.set_env_variable('MLX_SRQ_ALLOC_TYPE', alloc_type) def create_objects(self): self.create_cq() self.create_qp() self.create_srq() def test_prefer_obj_on_huge(self): """ Test PREFER_HUGE allocation type for SRQ, CQ and QP. """ self.set_env_alloc_type('PREFER_HUGE') self.create_objects() def test_obj_on_huge(self): """ Test HUGE allocation type for SRQ, CQ and QP. If there are huge pages in the system, expect success; otherwise expect failure.
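Unlike PREFER_HUGE, which falls back to regular pages, HUGE is strict, so allocation is also expected to fail when not running as root.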
""" self.set_env_alloc_type('HUGE') if huge_pages_supported() and u.is_root(): self.create_objects() else: with self.assertRaises(PyverbsRDMAError): self.create_objects() rdma-core-56.1/tests/test_mlx5_lag_affinity.py000066400000000000000000000052511477342711600215110ustar00rootroot00000000000000import unittest import errno from tests.base import BaseResources, RCResources, UDResources from pyverbs.qp import QP, QPAttr, QPInitAttr, QPCap from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.providers.mlx5.mlx5dv import Mlx5QP from tests.mlx5_base import Mlx5RDMATestCase import tests.utils as u import pyverbs.enums as e from pyverbs.cq import CQ class LagRawQP(BaseResources): def __init__(self, dev_name, ib_port): super().__init__(dev_name, ib_port, None) self.cq = self.create_cq() self.qp = self.create_qp() def create_cq(self): return CQ(self.ctx, 100) @u.requires_root_on_eth() def create_qp(self): qia = QPInitAttr(e.IBV_QPT_RAW_PACKET, rcq=self.cq, scq=self.cq, cap=QPCap()) try: qp = QP(self.pd, qia) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest("Create Raw Packet QP is not supported") raise ex qp.to_init(QPAttr(port_num=self.ib_port)) return qp class LagPortTestCase(Mlx5RDMATestCase): def setUp(self): super().setUp() self.iters = 10 self.server = None self.client = None def modify_lag(self, resources): try: port_num, active_port_num = Mlx5QP.query_lag_port(resources.qp) # if port_num is 1 - modify to 2, else modify to 1 new_port_num = (2 - port_num) + 1 Mlx5QP.modify_lag_port(resources.qp, new_port_num) port_num, active_port_num = Mlx5QP.query_lag_port(resources.qp) self.assertEqual(port_num, new_port_num, 'Port num is not as expected') except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Set LAG affinity is not supported on this device') raise ex def test_raw_modify_lag_port(self): qp = LagRawQP(self.dev_name, self.ib_port) self.modify_lag(qp) def create_players(self, resource, **resource_arg): """ Initialize tests resources. :param resource: The RDMA resources to use. :param resource_arg: Dictionary of args that specify the resource specific attributes. :return: None """ super().create_players(resource, **resource_arg) self.modify_lag(self.client) self.modify_lag(self.server) def test_rc_modify_lag_port(self): self.create_players(RCResources) u.traffic(**self.traffic_args) def test_ud_modify_lag_port(self): self.create_players(UDResources) u.traffic(**self.traffic_args) rdma-core-56.1/tests/test_mlx5_mkey.py000066400000000000000000000641771477342711600200360ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia, Inc. All rights reserved. 
See COPYING file import unittest import random import errno from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr, \ Mlx5DVQPInitAttr, Mlx5QP from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsUserError, \ PyverbsError from pyverbs.providers.mlx5.mlx5dv_mkey import Mlx5Mkey, Mlx5MrInterleaved, \ Mlx5MkeyConfAttr, Mlx5SigT10Dif, Mlx5SigCrc, Mlx5SigBlockDomain, \ Mlx5SigBlockAttr from tests.base import RCResources, RDMATestCase import pyverbs.providers.mlx5.mlx5_enums as dve from pyverbs.wr import SGE, SendWR, RecvWR from pyverbs.qp import QPInitAttrEx, QPCap, QPAttr from pyverbs.mr import MR import pyverbs.enums as e import tests.utils as u class Mlx5MkeyResources(RCResources): def __init__(self, dev_name, ib_port, gid_index, dv_send_ops_flags=0, mkey_create_flags=dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT, dv_qp_create_flags=dve.MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE): self.dv_send_ops_flags = dv_send_ops_flags self.mkey_create_flags = mkey_create_flags self.dv_qp_create_flags = dv_qp_create_flags if dv_send_ops_flags & dve.MLX5DV_QP_EX_WITH_MKEY_CONFIGURE: self.max_inline_data = 512 else: self.max_inline_data = 0 self.qp_access_flags = e.IBV_ACCESS_LOCAL_WRITE self.send_ops_flags = e.IBV_QP_EX_WITH_SEND # The signature pipelining tests use RDMA_WRITE. Allow RDMA_WRITE # if the pipelining flag is enabled for the QP. if self.dv_qp_create_flags & dve.MLX5DV_QP_CREATE_SIG_PIPELINING: self.qp_access_flags |= e.IBV_ACCESS_REMOTE_WRITE self.send_ops_flags |= e.IBV_QP_EX_WITH_RDMA_WRITE super().__init__(dev_name, ib_port, gid_index) self.create_mkey() def create_context(self): mlx5dv_attr = Mlx5DVContextAttr() try: self.ctx = Mlx5Context(mlx5dv_attr, name=self.dev_name) except PyverbsUserError as ex: raise unittest.SkipTest(f'Could not open mlx5 context ({ex})') except PyverbsRDMAError: raise unittest.SkipTest('Opening mlx5 context is not supported') def create_mkey(self): try: self.mkey = Mlx5Mkey(self.pd, self.mkey_create_flags, 3) except PyverbsRDMAError as ex: if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]: raise unittest.SkipTest('Create Mkey is not supported') raise ex def create_qp_cap(self): return QPCap(max_send_wr=self.num_msgs, max_recv_wr=self.num_msgs, max_inline_data=self.max_inline_data) def create_qp_init_attr(self): comp_mask = e.IBV_QP_INIT_ATTR_PD | e.IBV_QP_INIT_ATTR_SEND_OPS_FLAGS return QPInitAttrEx(cap=self.create_qp_cap(), pd=self.pd, scq=self.cq, rcq=self.cq, qp_type=e.IBV_QPT_RC, send_ops_flags=self.send_ops_flags, comp_mask=comp_mask) def create_qp_attr(self): attr = super().create_qp_attr() attr.qp_access_flags = self.qp_access_flags return attr def create_qps(self): try: qp_init_attr = self.create_qp_init_attr() comp_mask = dve.MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS |\ dve.MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS attr = Mlx5DVQPInitAttr(comp_mask=comp_mask, create_flags=self.dv_qp_create_flags, send_ops_flags=self.dv_send_ops_flags) qp = Mlx5QP(self.ctx, qp_init_attr, attr) self.qps.append(qp) self.qps_num.append(qp.qp_num) self.psns.append(random.getrandbits(24)) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create Mlx5DV QP is not supported') raise ex class Mlx5MkeyOdpRes(Mlx5MkeyResources): @u.requires_odp('rc', e.IBV_ODP_SUPPORT_SEND | e.IBV_ODP_SUPPORT_RECV) def create_mr(self): self.mr = MR(self.pd, self.msg_size, e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_ON_DEMAND) class Mlx5MkeyTest(RDMATestCase): """ Test various functionalities of the mlx5 mkeys. 
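Each test creates indirect mkeys, configures them with a memory layout (an SGE list, an interleaved layout or a signature block), runs RC traffic through them and finally invalidates them.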
""" def setUp(self): super().setUp() self.iters = 10 self.server = None self.client = None def reg_mr_list(self, configure_mkey=False): """ Register a list of SGEs using the player's mkeys. :param configure_mkey: If True, use the mkey configuration API. """ for player in [self.server, self.client]: player.qp.wr_start() player.qp.wr_flags = e.IBV_SEND_SIGNALED | e.IBV_SEND_INLINE sge_1 = SGE(player.mr.buf, 8, player.mr.lkey) sge_2 = SGE(player.mr.buf + 64, 8, player.mr.lkey) if configure_mkey: player.qp.wr_mkey_configure(player.mkey, 2, Mlx5MkeyConfAttr()) player.qp.wr_set_mkey_access_flags(e.IBV_ACCESS_LOCAL_WRITE) player.qp.wr_set_mkey_layout_list([sge_1, sge_2]) else: player.qp.wr_mr_list(player.mkey, e.IBV_ACCESS_LOCAL_WRITE, sge_list=[sge_1, sge_2]) player.qp.wr_complete() u.poll_cq(player.cq) def reg_mr_interleaved(self, configure_mkey=False): """ Register an interleaved memory layout using the player's mkeys. :param configure_mkey: Use the mkey configuration API. """ for player in [self.server, self.client]: player.qp.wr_start() player.qp.wr_flags = e.IBV_SEND_SIGNALED | e.IBV_SEND_INLINE mr_interleaved_1 = Mlx5MrInterleaved(addr=player.mr.buf, bytes_count=8, bytes_skip=2, lkey=player.mr.lkey) mr_interleaved_2 = Mlx5MrInterleaved(addr=player.mr.buf + 64, bytes_count=8, bytes_skip=2, lkey=player.mr.lkey) mr_interleaved_lst = [mr_interleaved_1, mr_interleaved_2] mkey_access = e.IBV_ACCESS_LOCAL_WRITE if configure_mkey: player.qp.wr_mkey_configure(player.mkey, 2, Mlx5MkeyConfAttr()) player.qp.wr_set_mkey_access_flags(mkey_access) player.qp.wr_set_mkey_layout_interleaved(3, mr_interleaved_lst) else: player.qp.wr_mr_interleaved(player.mkey, e.IBV_ACCESS_LOCAL_WRITE, repeat_count=3, mr_interleaved_lst=mr_interleaved_lst) player.qp.wr_complete() u.poll_cq(player.cq) def reg_mr_sig_t10dif(self): """ Register the player's mkeys with T10DIF signature on the wire domain. """ for player in [self.server, self.client]: player.qp.wr_start() player.qp.wr_flags = e.IBV_SEND_SIGNALED | e.IBV_SEND_INLINE sge = SGE(player.mr.buf, 512, player.mr.lkey) player.qp.wr_mkey_configure(player.mkey, 3, Mlx5MkeyConfAttr()) player.qp.wr_set_mkey_access_flags(e.IBV_ACCESS_LOCAL_WRITE) player.qp.wr_set_mkey_layout_list([sge]) t10dif_flags = dve.MLX5DV_SIG_T10DIF_FLAG_REF_REMAP sig_t10dif = Mlx5SigT10Dif(bg_type=dve.MLX5DV_SIG_T10DIF_CRC, bg=0xFFFF, app_tag=0xABCD, ref_tag=0x01234567, flags=t10dif_flags) sig_type = dve.MLX5DV_SIG_TYPE_T10DIF block_size = dve.MLX5DV_BLOCK_SIZE_512 sig_block_domain = Mlx5SigBlockDomain(sig_type=sig_type, dif=sig_t10dif, block_size=block_size) check_mask = (dve.MLX5DV_SIG_MASK_T10DIF_GUARD | dve.MLX5DV_SIG_MASK_T10DIF_APPTAG | dve.MLX5DV_SIG_MASK_T10DIF_REFTAG) sig_attr = Mlx5SigBlockAttr(wire=sig_block_domain, check_mask=check_mask) player.qp.wr_set_mkey_sig_block(sig_attr) player.qp.wr_complete() u.poll_cq(player.cq) def reg_mr_sig_crc(self): """ Register the player's mkeys with CRC32 signature on the wire domain. 
""" for player in [self.server, self.client]: player.qp.wr_start() player.qp.wr_flags = e.IBV_SEND_SIGNALED | e.IBV_SEND_INLINE sge = SGE(player.mr.buf, 512, player.mr.lkey) player.qp.wr_mkey_configure(player.mkey, 3, Mlx5MkeyConfAttr()) player.qp.wr_set_mkey_access_flags(e.IBV_ACCESS_LOCAL_WRITE) player.qp.wr_set_mkey_layout_list([sge]) sig_crc = Mlx5SigCrc(crc_type=dve.MLX5DV_SIG_CRC_TYPE_CRC32, seed=0xFFFFFFFF) sig_block_domain = Mlx5SigBlockDomain(sig_type=dve.MLX5DV_SIG_TYPE_CRC, crc=sig_crc, block_size=dve.MLX5DV_BLOCK_SIZE_512) sig_attr = Mlx5SigBlockAttr(wire=sig_block_domain, check_mask=dve.MLX5DV_SIG_MASK_CRC32) player.qp.wr_set_mkey_sig_block(sig_attr) player.qp.wr_complete() u.poll_cq(player.cq) def reg_mr_sig_err(self): """ Register the player's mkeys with an SGE and CRC32 signature on the memory domain. Data transport operation with these MKEYs will cause a signature error because the test does not fill out the signature in the memory buffer. """ sig_crc = Mlx5SigCrc(crc_type=dve.MLX5DV_SIG_CRC_TYPE_CRC32, seed=0xFFFFFFFF) block_size = dve.MLX5DV_BLOCK_SIZE_512 sig_block_domain = Mlx5SigBlockDomain(sig_type=dve.MLX5DV_SIG_TYPE_CRC, crc=sig_crc, block_size=block_size) sig_attr = Mlx5SigBlockAttr(mem=sig_block_domain, check_mask=dve.MLX5DV_SIG_MASK_CRC32) # Configure the mkey on the server side self.server.qp.wr_start() self.server.qp.wr_flags = e.IBV_SEND_SIGNALED | e.IBV_SEND_INLINE sge = SGE(self.server.mr.buf, 512, self.server.mr.lkey) self.server.qp.wr_mkey_configure(self.server.mkey, 2, Mlx5MkeyConfAttr()) self.server.qp.wr_set_mkey_access_flags(e.IBV_ACCESS_LOCAL_WRITE) self.server.qp.wr_set_mkey_layout_list([sge]) self.server.qp.wr_complete() u.poll_cq(self.server.cq) # Configure the mkey on the client side self.client.qp.wr_start() self.client.qp.wr_flags = e.IBV_SEND_SIGNALED | e.IBV_SEND_INLINE sge = SGE(self.client.mr.buf, 512 + 4, self.client.mr.lkey) self.client.qp.wr_mkey_configure(self.client.mkey, 3, Mlx5MkeyConfAttr()) self.client.qp.wr_set_mkey_access_flags(e.IBV_ACCESS_LOCAL_WRITE) self.client.qp.wr_set_mkey_layout_list([sge]) self.client.qp.wr_set_mkey_sig_block(sig_attr) self.client.qp.wr_complete() u.poll_cq(self.client.cq) def reg_mr_sig_pipelining_server(self): """ Register mkey without signature. """ self.server.qp.wr_start() self.server.qp.wr_flags = e.IBV_SEND_SIGNALED | e.IBV_SEND_INLINE sge = SGE(self.server.mr.buf, 512, self.server.mr.lkey) self.server.qp.wr_mkey_configure(self.server.mkey, 2, Mlx5MkeyConfAttr()) self.server.qp.wr_set_mkey_access_flags(e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE) self.server.qp.wr_set_mkey_layout_list([sge]) self.server.qp.wr_complete() u.poll_cq(self.server.cq) def reg_mr_sig_pipelining_client(self, check_mask=0): """ Register mkey with CRC32 signature in memory domain and no signature in wire domain. :param check_mask: The mask for the signature checking. 
""" self.client.qp.wr_start() self.client.qp.wr_flags = e.IBV_SEND_SIGNALED | e.IBV_SEND_INLINE # Add 4 bytes for CRC32 signature sge = SGE(self.client.mr.buf, 512 + 4, self.client.mr.lkey) self.client.qp.wr_mkey_configure(self.client.mkey, 3, Mlx5MkeyConfAttr()) self.client.qp.wr_set_mkey_access_flags(e.IBV_ACCESS_LOCAL_WRITE) self.client.qp.wr_set_mkey_layout_list([sge]) sig = Mlx5SigCrc(crc_type = dve.MLX5DV_SIG_CRC_TYPE_CRC32) sig_domain = Mlx5SigBlockDomain(sig_type=dve.MLX5DV_SIG_TYPE_CRC, crc=sig, block_size=dve.MLX5DV_BLOCK_SIZE_512) sig_attr = Mlx5SigBlockAttr(mem=sig_domain, check_mask=check_mask) self.client.qp.wr_set_mkey_sig_block(sig_attr) self.client.qp.wr_complete() u.poll_cq(self.client.cq) def build_traffic_elements(self, sge_size): """ Build the server and client send/recv work requests. :param sge_size: The sge send size using the mkey. """ opcode = e.IBV_WR_SEND server_sge = SGE(0, sge_size, self.server.mkey.lkey) self.server_recv_wr = RecvWR(sg=[server_sge], num_sge=1) client_sge = SGE(0, sge_size, self.client.mkey.lkey) self.client_send_wr = SendWR(opcode=opcode, num_sge=1, sg=[client_sge]) def build_traffic_elements_sig_pipelining(self): """ Build two WRs for data and response on client side and one WR for response on server side. Transaction consists of two operations: RDMA write of data and send/recv of response. Data size is 512 bytes, response size is 16 bytes. For simplicity the same memory is used for data and response. Data is transferred using the player signature mkey. Response is transferred using the plain MR. """ server_sge_resp = SGE(self.server.mr.buf, 16, self.server.mr.lkey) self.server_resp_wr = RecvWR(sg=[server_sge_resp], num_sge=1) client_sge_data = SGE(0, 512, self.client.mkey.lkey) self.client_data_wr = SendWR(wr_id=1, opcode=e.IBV_WR_RDMA_WRITE, num_sge=1, sg=[client_sge_data], send_flags=0) self.client_data_wr.set_wr_rdma(self.server.mkey.rkey, 0) client_sge_resp = SGE(self.client.mr.buf, 16, self.client.mr.lkey) client_send_flags = e.IBV_SEND_SIGNALED | e.IBV_SEND_FENCE self.client_resp_wr = SendWR(wr_id=1, opcode=e.IBV_WR_SEND, num_sge=1, sg=[client_sge_resp], send_flags=client_send_flags) def traffic(self, sge_size, exp_buffer): """ Perform RC traffic using the mkey. :param sge_size: The sge size using the mkey. :param exp_buffer: The expected result of the receive buffer after the traffic operation. """ self.build_traffic_elements(sge_size) self.server.qp.post_recv(self.server_recv_wr) for _ in range(self.iters): self.server.mr.write('s' * self.server.msg_size, self.server.msg_size) self.client.mr.write('c' * self.client.msg_size, self.client.msg_size) self.client.qp.post_send(self.client_send_wr) u.poll_cq(self.client.cq) u.poll_cq(self.server.cq) self.server.qp.post_recv(self.server_recv_wr) act_buffer = self.server.mr.read(len(exp_buffer), 0).decode() if act_buffer != exp_buffer: raise PyverbsError('Data validation failed: expected ' f'{exp_buffer}, received {act_buffer}') def traffic_scattered_data(self, sge_size=16): exp_buffer=((('c' * 8 + 's' * 56) *2)[:100]) self.traffic(sge_size=sge_size, exp_buffer=exp_buffer) def traffic_sig(self): exp_buffer=('c' * 512 + 's' * (self.server.msg_size - 512)) self.traffic(sge_size=512, exp_buffer=exp_buffer) def invalidate_mkeys(self): """ Invalidate the players mkey. 
""" for player in [self.server, self.client]: inv_send_wr = SendWR(opcode=e.IBV_WR_LOCAL_INV) inv_send_wr.imm_data = player.mkey.lkey player.qp.post_send(inv_send_wr) u.poll_cq(player.cq) def invalidate_mkeys_remotely(self): """ Client remotely invalidates the server's rkey """ sge = SGE(0,0, self.server.mkey.lkey) self.server.qp.post_recv(RecvWR(sg=[sge], num_sge=1)) self.client.qp.wr_start() self.client.qp.wr_flags = e.IBV_SEND_SIGNALED self.client.qp.wr_send_inv(self.server.mkey.rkey) sge = SGE(0, 0, self.client.mkey.lkey) self.client.qp.wr_set_sge(sge) self.client.qp.wr_complete() u.poll_cq(self.client.cq) u.poll_cq(self.server.cq) def test_mkey_remote_invalidate(self): """ Verify remote Mkey invalidation. Create Mkey, traffic using this mkey then the client invalidates the server's mkey remotly. """ self.create_players(Mlx5MkeyResources, mkey_create_flags=dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT | dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_REMOTE_INVALIDATE, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MKEY_CONFIGURE) self.reg_mr_list(configure_mkey=True) self.traffic_scattered_data() self.invalidate_mkeys_remotely() with self.assertRaises(PyverbsRDMAError): self.traffic_scattered_data() def check_mkey(self, player, expected=dve.MLX5DV_MKEY_NO_ERR): """ Check the player's mkey for a signature error. param player: Player to check. param expected: The expected result of the checking. """ mkey_err = player.mkey.mkey_check() if mkey_err.err_type != expected: raise PyverbsRDMAError('MKEY check failed: ' f'expected err_type: {expected}, ' f'actual err_type: {mkey_err.err_type}') def test_mkey_interleaved(self): """ Create Mkeys, register an interleaved memory layout using this mkey and then perform traffic using it. """ self.create_players(Mlx5MkeyResources, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MR_INTERLEAVED) self.reg_mr_interleaved() self.traffic_scattered_data() self.invalidate_mkeys() def test_mkey_list(self): """ Create Mkeys, register a memory layout using this mkey and then perform traffic using this mkey. """ self.create_players(Mlx5MkeyResources, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MR_LIST) self.reg_mr_list() self.traffic_scattered_data() self.invalidate_mkeys() def test_mkey_list_new_api(self): """ Create Mkeys, configure it with memory layout using the new API and traffic using this mkey. """ self.create_players(Mlx5MkeyResources, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MKEY_CONFIGURE) self.reg_mr_list(configure_mkey=True) self.traffic_scattered_data() self.invalidate_mkeys() def test_odp_mkey_list_new_api(self): """ Create Mkeys above ODP MR, configure it with memory layout using the new API and traffic using this mkey. """ self.create_players(Mlx5MkeyOdpRes, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MKEY_CONFIGURE) self.reg_mr_list(configure_mkey=True) self.traffic_scattered_data() self.invalidate_mkeys() def test_mkey_interleaved_new_api(self): """ Create Mkeys, configure it with interleaved memory layout using the new API and then perform traffic using it. """ self.create_players(Mlx5MkeyResources, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MKEY_CONFIGURE) self.reg_mr_interleaved(configure_mkey=True) self.traffic_scattered_data() self.invalidate_mkeys() def test_mkey_list_bad_flow(self): """ Create Mkeys, register a memory layout using this mkey and then try to access the memory out of the mkey defined region. Expect this case to fail. 
""" self.create_players(Mlx5MkeyResources, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MR_LIST) self.reg_mr_list() with self.assertRaises(PyverbsRDMAError) as ex: self.traffic_scattered_data(sge_size=100) def test_mkey_sig_t10dif(self): """ Create Mkeys, configure it with T10DIF signature and traffic using this mkey. """ self.create_players(Mlx5MkeyResources, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MKEY_CONFIGURE, mkey_create_flags=dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT | dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE) self.reg_mr_sig_t10dif() self.traffic_sig() self.check_mkey(self.server) self.check_mkey(self.client) self.invalidate_mkeys() def test_mkey_sig_crc(self): """ Create Mkeys, configure it with CRC32 signature and traffic using this mkey. """ self.create_players(Mlx5MkeyResources, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MKEY_CONFIGURE, mkey_create_flags=dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT | dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE) self.reg_mr_sig_crc() self.traffic_sig() self.check_mkey(self.server) self.check_mkey(self.client) self.invalidate_mkeys() def test_mkey_sig_err(self): """ Test the signature error handling flow. Create Mkeys, configure it CRC32 signature on the memory domain but do not set a valid signature in the memory buffer. Run traffic using this mkey, ensure that the signature error is detected. """ self.create_players(Mlx5MkeyResources, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MKEY_CONFIGURE, mkey_create_flags=dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT | dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE) self.reg_mr_sig_err() # The test supports only one iteration because mkey re-registration # is required after each signature error. self.iters = 1 self.traffic_sig() self.check_mkey(self.client, dve.MLX5DV_MKEY_SIG_BLOCK_BAD_GUARD) self.check_mkey(self.server) self.invalidate_mkeys() def test_mkey_sig_pipelining_good(self): """ Test the good signature pipelining scenario. """ self.create_players(Mlx5MkeyResources, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MKEY_CONFIGURE, mkey_create_flags=dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT | dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE, dv_qp_create_flags=dve.MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE | dve.MLX5DV_QP_CREATE_SIG_PIPELINING) self.reg_mr_sig_pipelining_client() self.reg_mr_sig_pipelining_server() self.build_traffic_elements_sig_pipelining() self.server.qp.post_recv(self.server_resp_wr) self.client.qp.post_send(self.client_data_wr) self.client.qp.post_send(self.client_resp_wr) u.poll_cq(self.client.cq) u.poll_cq(self.server.cq) def test_mkey_sig_pipelining_bad(self): """ Test the bad signature pipelining scenario. 
""" self.create_players(Mlx5MkeyResources, dv_send_ops_flags=dve.MLX5DV_QP_EX_WITH_MKEY_CONFIGURE, mkey_create_flags=dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT | dve.MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE, dv_qp_create_flags=dve.MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE | dve.MLX5DV_QP_CREATE_SIG_PIPELINING) self.reg_mr_sig_pipelining_client(check_mask=dve.MLX5DV_SIG_MASK_CRC32) self.reg_mr_sig_pipelining_server() self.build_traffic_elements_sig_pipelining() self.server.qp.post_recv(self.server_resp_wr) self.client.qp.post_send(self.client_data_wr) self.client.qp.post_send(self.client_resp_wr) # Expect SQ_DRAINED event event = self.client.ctx.get_async_event() event.ack() self.assertEqual(event.event_type, e.IBV_EVENT_SQ_DRAINED) # No completion is expected on the client side nc, _ = self.client.cq.poll(1) self.assertEqual(nc, 0) # No completion is expected on the server side nc, _ = self.server.cq.poll(1) self.assertEqual(nc, 0) self.check_mkey(self.client, dve.MLX5DV_MKEY_SIG_BLOCK_BAD_GUARD) self.check_mkey(self.server) # Cancel and repost response WR canceled_count = self.client.qp.cancel_posted_send_wrs(1) self.assertEqual(canceled_count, 1) self.client.qp.post_send(self.client_resp_wr) # Move QP back to RTS and receive completions self.client.qp.modify(QPAttr(qp_state=e.IBV_QPS_RTS, cur_qp_state=e.IBV_QPS_SQD), e.IBV_QP_STATE | e.IBV_QP_CUR_STATE) u.poll_cq(self.client.cq) u.poll_cq(self.server.cq) rdma-core-56.1/tests/test_mlx5_ooo_qp.py000066400000000000000000000131331477342711600203470ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2024 NVIDIA Corporation . All rights reserved. See COPYING file import unittest import random import errno from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr, \ Mlx5DVQPInitAttr, Mlx5QP, Mlx5DVCQInitAttr, Mlx5CQ from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsUserError, \ PyverbsError from pyverbs.cq import CQ, CQEX, PollCqAttr, CqInitAttrEx from pyverbs.qp import QPInitAttrEx, QPCap, QPAttr import pyverbs.providers.mlx5.mlx5_enums as dve from pyverbs.wr import SGE, SendWR, RecvWR from pyverbs.mr import MR import pyverbs.enums as e from tests.base import RCResources, RDMATestCase from tests.mlx5_base import Mlx5RcResources import tests.utils as u def create_ooo_dv_qp(res, max_recv_wr=1000, qp_type=e.IBV_QPT_RC): dv_ctx = res.ctx.query_mlx5_device() if not dv_ctx.comp_mask & dve.MLX5DV_CONTEXT_MASK_OOO_RECV_WRS: raise unittest.SkipTest('DV QP OOO feature is not supported') send_ops_flags = e.IBV_QP_EX_WITH_SEND | e.IBV_QP_EX_WITH_SEND_WITH_IMM | \ e.IBV_QP_EX_WITH_RDMA_WRITE | e.IBV_QP_EX_WITH_RDMA_READ |\ e.IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM qp_cap = QPCap(max_recv_wr=max_recv_wr, max_send_wr=max_recv_wr) comp_mask = e.IBV_QP_INIT_ATTR_PD | e.IBV_QP_INIT_ATTR_SEND_OPS_FLAGS qp_init_attr = QPInitAttrEx(cap=qp_cap, pd=res.pd, scq=res.cq, rcq=res.cq, qp_type=qp_type, send_ops_flags=send_ops_flags, comp_mask=comp_mask) dv_comp_mask = dve.MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS attr = Mlx5DVQPInitAttr(comp_mask=dv_comp_mask, create_flags=res.dvqp_create_flags) try: qp = Mlx5QP(res.ctx, qp_init_attr, attr) res.qps.append(qp) res.qps_num.append(qp.qp_num) res.psns.append(random.getrandbits(24)) except PyverbsRDMAError as ex: raise ex class Mlx5OOORcRes(Mlx5RcResources): def __init__(self, dev_name, ib_port, gid_index, msg_size=1024, dvqp_create_flags=0, **kwargs): """ Initialize mlx5 DV QP resources based on RCResources. 
:param dev_name: Device name to be used :param ib_port: IB port of the device to use :param gid_index: Which GID index to use :param msg_size: The resource msg size :param dvqp_create_flags: DV QP create flags :param kwargs: General arguments """ self.qp_access_flags = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE | \ e.IBV_ACCESS_REMOTE_READ self.dvqp_create_flags = dvqp_create_flags super().__init__(dev_name, ib_port, gid_index, msg_size=msg_size, **kwargs) def create_qp_attr(self): attr = super().create_qp_attr() attr.qp_access_flags = self.qp_access_flags return attr def create_qps(self): for _ in range(self.qp_count): create_ooo_dv_qp(self) def create_mr(self): self.mr = MR(self.pd, self.msg_size, self.qp_access_flags) def create_cq(self): wc_flags = e.IBV_WC_STANDARD_FLAGS cia = CqInitAttrEx(cqe=2000, wc_flags=wc_flags) dvcq_init_attr = Mlx5DVCQInitAttr() dvcq_init_attr.comp_mask |= dve.MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE dvcq_init_attr.cqe_size = 64 try: self.cq = Mlx5CQ(self.ctx, cia, dvcq_init_attr) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create Mlx5DV CQ is not supported') raise ex class DvOOOQPTest(RDMATestCase): def test_ooo_qp_bad_flow(self): """ DDP - OOO Recv WRs bad flow test 1. Create QP with the max recv WRs possible and validate it by querying the QP 2. Try to create QP with more than the max recv WRs supported 3. Try to create QP with an unsupported QP type """ self.create_players(Mlx5OOORcRes, dvqp_create_flags=dve.MLX5DV_QP_CREATE_OOO_DP) dv_ctx = self.server.ctx.query_mlx5_device() max_rc_rwrs = dv_ctx.ooo_recv_wrs_caps['max_rc'] create_ooo_dv_qp(self.server, max_recv_wr=max_rc_rwrs) attr, init_attr = self.server.qps[-1].query(0x1ffffff) self.assertEqual(max_rc_rwrs, init_attr.cap.max_recv_wr) # Try to create QP with more than the max recv WRs supported with self.assertRaises(PyverbsRDMAError) as ex: create_ooo_dv_qp(self.server, max_rc_rwrs + 1) self.assertEqual(ex.exception.error_code, errno.EINVAL) # Try to create QP with an unsupported QP type with self.assertRaises(PyverbsRDMAError) as ex: create_ooo_dv_qp(self.server, qp_type=e.IBV_QPT_RAW_PACKET) self.assertEqual(ex.exception.error_code, errno.EOPNOTSUPP) def test_ooo_qp_send_traffic(self): """ DV QP OOO traffic opcode SEND """ self.create_players(Mlx5OOORcRes, dvqp_create_flags=dve.MLX5DV_QP_CREATE_OOO_DP) u.traffic_poll_at_once(self, msg_size=int(self.server.msg_size / self.iters), iterations=self.iters) def test_ooo_qp_rdma_write_imm_traffic(self): """ DV QP OOO traffic opcode RDMA_WRITE_WITH_IMM """ self.create_players(Mlx5OOORcRes, dvqp_create_flags=dve.MLX5DV_QP_CREATE_OOO_DP) u.traffic_poll_at_once(self, msg_size=int(self.server.msg_size / self.iters), iterations=self.iters, opcode=e.IBV_WR_RDMA_WRITE_WITH_IMM) rdma-core-56.1/tests/test_mlx5_pp.py000066400000000000000000000045041477342711600174740ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file """ Test module for mlx5 packet pacing entry allocation.
""" from pyverbs.providers.mlx5.mlx5dv import Mlx5PP, Mlx5Context, Mlx5DVContextAttr from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsUserError import pyverbs.providers.mlx5.mlx5_enums as e from tests.mlx5_base import Mlx5RDMATestCase import unittest import struct import errno class Mlx5PPRes: def __init__(self, dev_name): try: mlx5dv_attr = Mlx5DVContextAttr(e.MLX5DV_CONTEXT_FLAGS_DEVX) self.ctx = Mlx5Context(mlx5dv_attr, dev_name) except PyverbsUserError as ex: raise unittest.SkipTest('Could not open mlx5 context ({})' .format(str(ex))) except PyverbsRDMAError: raise unittest.SkipTest('Opening mlx5 DevX context is not supported') self.pps = [] class Mlx5PPTestCase(Mlx5RDMATestCase): def setUp(self): super().setUp() self.pp_res = Mlx5PPRes(self.dev_name) def test_pp_alloc(self): """ Allocate two packet pacing entries with the same configuration. One of the entries is allocated with a dedicated index. Then verify that the indexes are different and free the entries. """ # An arbitrary valid rate limit value (in kbps) rate_limit = struct.pack('>I', 100) try: self.pp_res.pps.append(Mlx5PP(self.pp_res.ctx, rate_limit)) # Create a dedicated entry of the same previous configuration # and verify that it has a different index self.pp_res.pps.append(Mlx5PP(self.pp_res.ctx, rate_limit, flags=e._MLX5DV_PP_ALLOC_FLAGS_DEDICATED_INDEX)) self.assertNotEqual(self.pp_res.pps[0].index, self.pp_res.pps[1].index, 'Dedicated PP index is not unique') for pp in self.pp_res.pps: pp.close() except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP or ex.error_code == errno.EPROTONOSUPPORT: raise unittest.SkipTest('Packet pacing entry allocation is not supported') raise ex finally: self.pp_res.ctx.close() rdma-core-56.1/tests/test_mlx5_query_port.py000066400000000000000000000031421477342711600212630ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 NVIDIA Corporation . All rights reserved. See COPYING file """ Test module for Mlx5 DV query port. """ import unittest import errno from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.providers.mlx5.mlx5dv import Mlx5Context from tests.mlx5_base import Mlx5PyverbsAPITestCase import pyverbs.providers.mlx5.mlx5_enums as e class Mlx5DVQueryPortTestCase(Mlx5PyverbsAPITestCase): def test_dv_query_port(self): """ Test the DV query port and that no error is returned. """ for port in range (1, self.attr_ex.phys_port_cnt_ex + 1): try: port_attr = Mlx5Context.query_mlx5_port(self.ctx, port) except PyverbsRDMAError as ex: if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]: raise unittest.SkipTest(f'mlx5dv_query_port() isn\'t supported') raise ex if (port_attr.flags & e.MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_RX_): self.assertNotEqual(port_attr.vport_steering_icm_rx, 0, f'Vport steering icm rx address is zero') if (port_attr.flags & e.MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_TX_): self.assertNotEqual(port_attr.vport_steering_icm_tx, 0, f'Vport steering icm tx address is zero') if (port_attr.flags & e.MLX5DV_QUERY_PORT_VPORT_REG_C0_): self.assertNotEqual(port_attr.reg_c0_mask, 0, f'Vport reg c0 mask is zero') rdma-core-56.1/tests/test_mlx5_raw_wqe.py000066400000000000000000000063621477342711600205260ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia, Inc. All rights reserved. 
See COPYING file from pyverbs.providers.mlx5.mlx5dv import Wqe, WqeDataSeg, WqeCtrlSeg from pyverbs.pyverbs_error import PyverbsError import pyverbs.providers.mlx5.mlx5_enums as dve from tests.mlx5_base import Mlx5RDMATestCase, Mlx5RcResources from pyverbs.qp import QPCap from pyverbs.wr import SGE import pyverbs.enums as e import tests.utils as u class Mlx5RawWqeResources(Mlx5RcResources): def create_send_ops_flags(self): self.dv_send_ops_flags = dve.MLX5DV_QP_EX_WITH_RAW_WQE self.send_ops_flags = e.IBV_QP_EX_WITH_SEND def create_qp_cap(self): """ Create QPCap such that work queue elements will wrap around the send work queue, this happens due to the iteration count being higher than the max_send_wr. :return: """ return QPCap(max_send_wr=1, max_recv_wr=4, max_recv_sge=2, max_send_sge=2) class RawWqeTest(Mlx5RDMATestCase): def setUp(self): super().setUp() self.iters = 10 self.server = None self.client = None def prepare_send_elements(self): mr = self.client.mr sge_count = 2 unit_size = mr.length / 2 data_segs = [WqeDataSeg(unit_size, mr.lkey, mr.buf + i * unit_size) for i in range(sge_count)] ctrl_seg = WqeCtrlSeg() ctrl_seg.fm_ce_se = dve.MLX5_WQE_CTRL_CQ_UPDATE segment_num = 1 + len(data_segs) ctrl_seg.opmod_idx_opcode = dve.MLX5_OPCODE_SEND ctrl_seg.qpn_ds = segment_num | int(self.client.qp.qp_num) << 8 self.raw_send_wqe = Wqe([ctrl_seg] + data_segs) self.regular_send_sge = SGE(mr.buf, mr.length, mr.lkey) def mixed_traffic(self): s_recv_wr = u.get_recv_wr(self.server) u.post_recv(self.server, s_recv_wr) self.prepare_send_elements() for i in range(self.iters): self.client.qp.wr_start() if i % 2: self.client.mr.write('c' * self.client.mr.length, self.client.mr.length) self.client.qp.wr_flags = e.IBV_SEND_SIGNALED self.client.qp.wr_send() self.client.qp.wr_set_sge(self.regular_send_sge) else: self.client.mr.write('s' * self.client.mr.length, self.client.mr.length) self.client.qp.wr_raw_wqe(self.raw_send_wqe) self.client.qp.wr_complete() u.poll_cq_ex(self.client.cq) u.poll_cq_ex(self.server.cq) u.post_recv(self.server, s_recv_wr) expected_opcode = e.IBV_WC_SEND if i % 2 else e.IBV_WC_DRIVER2 if self.client.cq.read_opcode() != expected_opcode: raise PyverbsError('Opcode validation failed: expected ' f'{expected_opcode}, received {self.client.cq.read_opcode()}') act_buffer = self.server.mr.read(self.server.mr.length, 0) u.validate(act_buffer, i % 2, self.server.mr.length) def test_mixed_raw_wqe_traffic(self): """ Runs traffic with a mix of SEND opcode regular WQEs and SEND opcode RAW WQEs. """ self.create_players(Mlx5RawWqeResources) self.mixed_traffic() rdma-core-56.1/tests/test_mlx5_rdmacm.py000066400000000000000000000204301477342711600203140ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Nvidia Corporation. All rights reserved. 
See COPYING file import unittest import errno from pyverbs.providers.mlx5.mlx5dv import Mlx5DVQPInitAttr, Mlx5QP, \ Mlx5DVDCInitAttr, Mlx5Context from tests.test_rdmacm import CMAsyncConnection from tests.mlx5_base import Mlx5PyverbsAPITestCase, Mlx5RDMACMBaseTest from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.srq import SRQ, SrqInitAttr, SrqAttr import pyverbs.providers.mlx5.mlx5_enums as dve from tests.base_rdmacm import AsyncCMResources from pyverbs.qp import QPCap, QPInitAttrEx from pyverbs.cmid import ConnParam from tests.base import DCT_KEY from pyverbs.addr import AH import pyverbs.enums as e from pyverbs.cq import CQ import tests.utils as u class DcCMConnection(CMAsyncConnection): """ Implement RDMACM connection management for asynchronous CMIDs using DC as an external QP. """ def create_cm_res(self, ip_addr, passive, **kwargs): self.cm_res = DcCMResources(addr=ip_addr, passive=passive, **kwargs) if passive: self.cm_res.create_cmid() else: for conn_idx in range(self.num_conns): self.cm_res.create_cmid(conn_idx) def _ext_qp_server_traffic(self): recv_wr = u.get_recv_wr(self.cm_res) for _ in range(self.cm_res.num_msgs): u.post_recv(self.cm_res, recv_wr) self.syncer.wait() for _ in range(self.cm_res.num_msgs): u.poll_cq(self.cm_res.cq) def _ext_qp_client_traffic(self): self.cm_res.remote_dct_num = self.cm_res.remote_qpn _, send_wr = u.get_send_elements(self.cm_res, self.cm_res.passive) ah = AH(self.cm_res.cmid.pd, attr=self.cm_res.remote_ah) self.syncer.wait() for send_idx in range(self.cm_res.num_msgs): dci_idx = send_idx % len(self.cm_res.qps) u.post_send_ex(self.cm_res, send_wr, e.IBV_WR_SEND, ah=ah, qp_idx=dci_idx) u.poll_cq(self.cm_res.cq) def disconnect(self): if self.cm_res.reserved_qp_num and self.cm_res.passive: Mlx5Context.reserved_qpn_dealloc(self.cm_res.child_id.context, self.cm_res.reserved_qp_num) self.cm_res.reserved_qp_num = 0 super().disconnect() class DcCMResources(AsyncCMResources): """ DcCMResources class contains resources for RDMA CM asynchronous communication using DC as an external QP. """ def __init__(self, addr=None, passive=None, **kwargs): """ Init DcCMResources instance. :param addr: Local address to bind to. :param passive: Indicate if this CM is the passive CM. """ super().__init__(addr=addr, passive=passive, **kwargs) self.srq = None self.remote_dct_num = None self.reserved_qp_num = 0 def create_qp(self, conn_idx=0): """ Create an RDMACM QP. If self.with_ext_qp is set, then an external CQ and DC QP will be created. In case that CQ is already created, it is used for the newly created QP. """ try: if not self.passive: # Create the DCI QPs. cmid = self.cmids[conn_idx] self.create_cq(cmid) qp_init_attr = self.create_qp_init_attr(cmid, e.IBV_QP_EX_WITH_SEND) attr = Mlx5DVQPInitAttr(comp_mask=dve.MLX5DV_QP_INIT_ATTR_MASK_DC, dc_init_attr=Mlx5DVDCInitAttr()) self.qps[conn_idx] = Mlx5QP(cmid.context, qp_init_attr, attr) if self.passive and conn_idx == 0: # Create the DCT QP only for the first connection. 
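                # A single DCT on the passive side is sufficient here: a DC
                # target is a shared receive-side object, so the DCIs of all
                # active-side connections can address the same DCT number.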
cmid = self.child_id self.create_cq(cmid) self.create_srq(cmid) qp_init_attr = self.create_qp_init_attr(cmid) dc_attr = Mlx5DVDCInitAttr(dc_type=dve.MLX5DV_DCTYPE_DCT, dct_access_key=DCT_KEY) attr = Mlx5DVQPInitAttr(comp_mask=dve.MLX5DV_QP_INIT_ATTR_MASK_DC, dc_init_attr=dc_attr) self.qps[conn_idx] = Mlx5QP(cmid.context, qp_init_attr, attr) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create DC QP is not supported') raise ex def create_qp_cap(self): return QPCap(self.num_msgs, 0, 1, 0) def create_qp_init_attr(self, cmid, send_ops_flags=0): comp_mask = e.IBV_QP_INIT_ATTR_PD if send_ops_flags: comp_mask |= e.IBV_QP_INIT_ATTR_SEND_OPS_FLAGS return QPInitAttrEx(cap=self.create_qp_cap(), pd=cmid.pd, scq=self.cq, rcq=self.cq, srq=self.srq, qp_type=e.IBV_QPT_DRIVER, send_ops_flags=send_ops_flags, comp_mask=comp_mask, sq_sig_all=1) def create_srq(self, cmid): srq_init_attr = SrqInitAttr(SrqAttr(max_wr=self.num_msgs)) try: self.srq = SRQ(cmid.pd, srq_init_attr) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create SRQ is not supported') raise ex def modify_ext_qp_to_rts(self, conn_idx=0): cmids = self.child_ids if self.passive else self.cmids if not self.passive or not conn_idx: qp = self.qps[conn_idx] attr, _ = cmids[conn_idx].init_qp_attr(e.IBV_QPS_INIT) qp.to_init(attr) attr, _ = cmids[conn_idx].init_qp_attr(e.IBV_QPS_RTR) qp.to_rtr(attr) if not self.passive: # The passive QP is DCT which should stay in RTR state. self.remote_ah = attr.ah_attr attr, _ = cmids[conn_idx].init_qp_attr(e.IBV_QPS_RTS) qp.to_rts(attr) def create_conn_param(self, qp_num=0, conn_idx=0): if conn_idx and self.passive: try: ctx = self.child_id.context self.reserved_qp_num = Mlx5Context.reserved_qpn_alloc(ctx) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Alloc reserved QP number is not supported') raise ex qp_num = self.reserved_qp_num else: qp_num = self.qps[conn_idx].qp_num return ConnParam(qp_num=qp_num) class Mlx5CMTestCase(Mlx5RDMACMBaseTest): """ Mlx5 RDMACM test class. """ def test_rdmacm_async_traffic_dc_external_qp(self): """ Connect multiple RDMACM connections using DC as an external QP for traffic. """ self.two_nodes_rdmacm_traffic(DcCMConnection, self.rdmacm_traffic, with_ext_qp=True, num_conns=2) class ReservedQPTest(Mlx5PyverbsAPITestCase): def test_reservered_qpn(self): """ Alloc reserved qpn multiple times and then dealloc the qpns. In addition, the test includes bad flows where a fake qpn gets deallocated, and a real qpn gets deallocated twice. """ try: # Alloc qp number multiple times. qpns = [] for i in range(1000): qpns.append(Mlx5Context.reserved_qpn_alloc(self.ctx)) for i in range(1000): Mlx5Context.reserved_qpn_dealloc(self.ctx, qpns[i]) # Dealloc qp number that was not allocated. qpn = Mlx5Context.reserved_qpn_alloc(self.ctx) with self.assertRaises(PyverbsRDMAError) as ex: fake_qpn = qpn - 1 Mlx5Context.reserved_qpn_dealloc(self.ctx, fake_qpn) self.assertEqual(ex.exception.error_code, errno.EINVAL) # Try to dealloc same qp number twice. 
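            # The second dealloc below is expected to fail with EINVAL, since
            # the QPN was already returned to the pool by the first call (see
            # the assertion that follows).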
Mlx5Context.reserved_qpn_dealloc(self.ctx, qpn) with self.assertRaises(PyverbsRDMAError) as ex: Mlx5Context.reserved_qpn_dealloc(self.ctx, qpn) self.assertEqual(ex.exception.error_code, errno.EINVAL) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Alloc reserved QP number is not supported') raise ex rdma-core-56.1/tests/test_mlx5_sched.py000066400000000000000000000100261477342711600201370ustar00rootroot00000000000000import unittest import errno from pyverbs.providers.mlx5.mlx5dv_sched import Mlx5dvSchedAttr, \ Mlx5dvSchedNode, Mlx5dvSchedLeaf from tests.mlx5_base import Mlx5RDMATestCase, Mlx5PyverbsAPITestCase from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.providers.mlx5.mlx5dv import Mlx5QP import pyverbs.providers.mlx5.mlx5_enums as dve from tests.base import RCResources import tests.utils as u class Mlx5SchedTest(Mlx5PyverbsAPITestCase): def test_create_sched_tree(self): """ Create schedule elements tree. Test the schedule elements API, this includes creating schedule nodes with different flags and connecting them with schedule leaves. In addition, modify some nodes with different BW share and max BW. """ try: root_node = Mlx5dvSchedNode(self.ctx, Mlx5dvSchedAttr()) # Create a node with only max_avg_bw argument. max_sched_attr = Mlx5dvSchedAttr(root_node, max_avg_bw=10, flags=dve.MLX5DV_SCHED_ELEM_ATTR_FLAGS_MAX_AVG_BW) max_bw_node = Mlx5dvSchedNode(self.ctx, max_sched_attr) # Create a node with only bw_share argument. weighed_sched_attr = Mlx5dvSchedAttr(root_node, bw_share=10, flags=dve.MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE) max_bw_node = Mlx5dvSchedNode(self.ctx, weighed_sched_attr) # Create a node with max_avg_bw and bw_share arguments. mixed_flags = dve.MLX5DV_SCHED_ELEM_ATTR_FLAGS_MAX_AVG_BW | \ dve.MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE mixed_sched_attr = Mlx5dvSchedAttr(root_node, max_avg_bw=10, bw_share=2, flags=mixed_flags) mixed_bw_node = Mlx5dvSchedNode(self.ctx, mixed_sched_attr) # Modify a node. modify_sched_attr = Mlx5dvSchedAttr(root_node, max_avg_bw=4, bw_share=1, flags=mixed_flags) mixed_bw_node.modify(modify_sched_attr) # Attach sched leaf to mixed_bw_node max_sched_attr = Mlx5dvSchedAttr(mixed_bw_node) sched_leaf = Mlx5dvSchedLeaf(self.ctx, max_sched_attr) # Modify a leaf. modify_sched_attr = Mlx5dvSchedAttr(mixed_bw_node, max_avg_bw=3, bw_share=3, flags=mixed_flags) sched_leaf.modify(modify_sched_attr) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create schedule elements is not supported') raise ex class Mlx5SchedTrafficTest(Mlx5RDMATestCase): def setUp(self): super().setUp() self.iters = 10 self.server = None self.client = None self.traffic_args = None def test_sched_per_qp_traffic(self): """ Tests attaching a QP to a sched leaf. The test creates a sched tree consisting of a root node and a leaf with max BW and share BW, modifies two RC QPs to be attached to the sched leaf and then run traffic using those QPs. 
""" self.create_players(RCResources) try: root_node = Mlx5dvSchedNode(self.server.ctx, Mlx5dvSchedAttr()) mixed_flags = dve.MLX5DV_SCHED_ELEM_ATTR_FLAGS_MAX_AVG_BW | \ dve.MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE mixed_sched_attr = Mlx5dvSchedAttr(root_node, max_avg_bw=10, bw_share=2, flags=mixed_flags) leaf = Mlx5dvSchedLeaf(self.server.ctx, mixed_sched_attr) Mlx5QP.modify_qp_sched_elem(self.server.qp, req_sched_leaf=leaf, resp_sched_leaf=leaf) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Creation or usage of schedule elements is not supported') raise ex u.traffic(**self.traffic_args) rdma-core-56.1/tests/test_mlx5_timestamp.py000077500000000000000000000176741477342711600210770ustar00rootroot00000000000000import unittest import datetime import errno from pyverbs.enums import IBV_WC_EX_WITH_COMPLETION_TIMESTAMP as FREE_RUNNING, \ IBV_WC_EX_WITH_COMPLETION_TIMESTAMP_WALLCLOCK as REAL_TIME from tests.base import RCResources, RDMATestCase, PyverbsAPITestCase from pyverbs.providers.mlx5.mlx5dv import Mlx5Context from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.cq import CqInitAttrEx, CQEX from tests.test_flow import FlowRes from pyverbs.qp import QPInitAttr from pyverbs.cq import PollCqAttr import pyverbs.enums as e import tests.utils as u GIGA = 1000000000 def convert_ts_to_ns(ctx, device_ts): """ Convert device timestamp from HCA core clock units to corresponding nanosecond counts. :param ctx: The context that gets this timestamp. :param device_ts: The device timestamp to translate. :return: Timestamp in nanoseconds """ try: timestamp_in_ns = Mlx5Context.device_timestamp_to_ns(ctx, device_ts) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Converting timestamp to nanoseconds is not supported') raise ex return timestamp_in_ns def timestamp_res_cls(base_class): """ This is a factory function which creates a class that inherits base_class of any BaseResources type. :param base_class: The base resources class to inherit from. :return: TimeStampRes class. """ class TimeStampRes(base_class): def __init__(self, dev_name, ib_port, gid_index, qp_type, send_ts=None, recv_ts=None): self.qp_type = qp_type self.send_ts = send_ts self.recv_ts = recv_ts self.timestamp = None self.scq = None self.rcq = None super().__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index) def create_cq(self): self.scq = self._create_ex_cq(self.send_ts) self.rcq = self._create_ex_cq(self.recv_ts) def _create_ex_cq(self, timestamp=None): """ Create an Extended CQ. :param timestamp: If set, the timestamp type to use. """ wc_flags = e.IBV_WC_STANDARD_FLAGS if timestamp: wc_flags |= timestamp cia = CqInitAttrEx(cqe=self.num_msgs, wc_flags=wc_flags) try: cq = CQEX(self.ctx, cia) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Create Extended CQ is not supported') raise ex return cq def create_qp_init_attr(self): return QPInitAttr(qp_type=self.qp_type, scq=self.scq, rcq=self.rcq, srq=self.srq, cap=self.create_qp_cap()) return TimeStampRes class TimeStampTest(RDMATestCase): """ Test various types of timestamping formats. """ def setUp(self): super().setUp() self.send_ts = None self.recv_ts = None self.qp_type = None @property def resource_arg(self): return {'send_ts': self.send_ts, 'recv_ts': self.recv_ts, 'qp_type': self.qp_type} def test_timestamp_free_running_rc_traffic(self): """ Test free running timestamp on RC traffic. 
""" self.qp_type = e.IBV_QPT_RC self.send_ts = self.recv_ts = FREE_RUNNING self.create_players(timestamp_res_cls(RCResources), **self.resource_arg) self.ts_traffic() timestamp = convert_ts_to_ns(self.client.ctx, self.client.timestamp) self.verify_ts(timestamp) def test_timestamp_real_time_rc_traffic(self): """ Test real time timestamp on RC traffic. """ self.qp_type = e.IBV_QPT_RC self.send_ts = self.recv_ts = REAL_TIME self.create_players(timestamp_res_cls(RCResources), **self.resource_arg) self.ts_traffic() self.verify_ts(self.client.timestamp) def test_timestamp_free_running_send_raw_traffic(self): """ Test timestamping on RAW traffic only on the send completions. """ self.qp_type = e.IBV_QPT_RAW_PACKET self.send_ts = FREE_RUNNING self.create_players(timestamp_res_cls(FlowRes), **self.resource_arg) self.flow = self.server.create_flow([self.server.create_eth_spec()]) self.ts_traffic() timestamp = convert_ts_to_ns(self.client.ctx, self.client.timestamp) self.verify_ts(timestamp) def test_timestamp_free_running_recv_raw_traffic(self): """ Test timestamping on RAW traffic only on the recv completions. """ self.qp_type = e.IBV_QPT_RAW_PACKET self.recv_ts = FREE_RUNNING self.create_players(timestamp_res_cls(FlowRes), **self.resource_arg) self.flow = self.server.create_flow([self.server.create_eth_spec()]) self.ts_traffic() timestamp = convert_ts_to_ns(self.server.ctx, self.server.timestamp) self.verify_ts(timestamp) def test_timestamp_real_time_raw_traffic(self): """ Test real time timestamp on RAW traffic. """ self.qp_type = e.IBV_QPT_RAW_PACKET self.send_ts = self.recv_ts = REAL_TIME self.create_players(timestamp_res_cls(FlowRes), **self.resource_arg) self.flow = self.server.create_flow([self.server.create_eth_spec()]) self.ts_traffic() self.verify_ts(self.client.timestamp) @staticmethod def verify_ts(timestamp): """ Verify that the timestamp is a valid value of time. """ datetime.datetime.fromtimestamp(timestamp/GIGA) @staticmethod def poll_cq_ex_ts(cqex, ts_type=None): """ Poll completion from the extended CQ. :param cqex: CQEX to poll from :param ts_type: If set, read the CQE timestamp in this format :return: The CQE timestamp if it requested. """ polling_timeout = 10 start = datetime.datetime.now() ts = 0 poll_attr = PollCqAttr() ret = cqex.start_poll(poll_attr) while ret == 2 and (datetime.datetime.now() - start).seconds < polling_timeout: ret = cqex.start_poll(poll_attr) if ret == 2: raise PyverbsRDMAError('Failed to poll CQEX - Got timeout') if ret != 0: raise PyverbsRDMAError('Failed to poll CQEX') if cqex.status != e.IBV_WC_SUCCESS: raise PyverbsRDMAError('Completion status is {cqex.status}') if ts_type == FREE_RUNNING: ts = cqex.read_timestamp() if ts_type == REAL_TIME: ts = cqex.read_completion_wallclock_ns() cqex.end_poll() return ts def ts_traffic(self): """ Run RDMA traffic and read the completions timestamps. """ s_recv_wr = u.get_recv_wr(self.server) u.post_recv(self.server, s_recv_wr) if self.qp_type == e.IBV_QPT_RAW_PACKET: c_send_wr, _, _ = u.get_send_elements_raw_qp(self.client) else: c_send_wr, _ = u.get_send_elements(self.client, False) u.send(self.client, c_send_wr, e.IBV_WR_SEND, False, 0) self.client.timestamp = self.poll_cq_ex_ts(self.client.scq, ts_type=self.send_ts) self.server.timestamp = self.poll_cq_ex_ts(self.server.rcq, ts_type=self.recv_ts) class TimeAPITest(PyverbsAPITestCase): def test_query_rt_values(self): """ Test the ibv_query_rt_values_ex API. 
Query the device real-time values, convert them to ns and verify that the timestamp is a valid value of time.. """ try: _, hw_time = self.ctx.query_rt_values_ex() time_in_ns = convert_ts_to_ns(self.ctx, hw_time) datetime.datetime.fromtimestamp(time_in_ns/GIGA) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Query device real time is not supported') raise ex rdma-core-56.1/tests/test_mlx5_uar.py000066400000000000000000000024061477342711600176430ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file """ Test module for Mlx5 UAR allocation. """ import unittest import errno from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.providers.mlx5.mlx5dv import Mlx5UAR import pyverbs.providers.mlx5.mlx5_enums as e from tests.mlx5_base import Mlx5RDMATestCase from tests.base import BaseResources class Mlx5UarRes(BaseResources): def __init__(self, dev_name, ib_port=None, gid_index=None): super().__init__(dev_name, ib_port, gid_index) self.uars = [] class Mlx5UarTestCase(Mlx5RDMATestCase): def setUp(self): super().setUp() self.uar_res = Mlx5UarRes(self.dev_name) def test_alloc_uar(self): try: for f in [e._MLX5DV_UAR_ALLOC_TYPE_BF, e._MLX5DV_UAR_ALLOC_TYPE_NC]: self.uar_res.uars.append(Mlx5UAR(self.uar_res.ctx, f)) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP or ex.error_code == errno.EPROTONOSUPPORT: raise unittest.SkipTest(f'UAR allocation (with flag={f}) is not supported') raise ex finally: for uar in self.uar_res.uars: uar.close() rdma-core-56.1/tests/test_mlx5_udp_sport.py000066400000000000000000000024541477342711600210760ustar00rootroot00000000000000import unittest import errno from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.providers.mlx5.mlx5dv import Mlx5QP from tests.mlx5_base import Mlx5RDMATestCase from tests.base import RCResources import pyverbs.enums as e import tests.utils as u class UdpSportTestCase(Mlx5RDMATestCase): def __init__(self, methodName='runTest', dev_name=None, ib_port=None, gid_index=None, pkey_index=None, gid_type=e.IBV_GID_TYPE_SYSFS_ROCE_V2): # Modify UDP source port is not supported on RoCEv1 super().__init__(methodName, dev_name, ib_port, gid_index, pkey_index, gid_type) def setUp(self): super().setUp() self.iters = 10 self.server = None self.client = None def test_rc_modify_udp_sport(self): """ Create RC resources and change the server QP's UDP source port to an arbitrary legal value (55555). Then run SEND traffic. :return: None """ self.create_players(RCResources) try: Mlx5QP.modify_udp_sport(self.server.qp, udp_sport=55555) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Modifying a QP UDP sport is not supported') raise ex u.traffic(**self.traffic_args) rdma-core-56.1/tests/test_mlx5_var.py000066400000000000000000000025221477342711600176430ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file """ Test module for Mlx5 VAR allocation. 
""" from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.providers.mlx5.mlx5dv import Mlx5VAR from tests.mlx5_base import Mlx5RDMATestCase from tests.base import BaseResources import unittest import errno import mmap class Mlx5VarRes(BaseResources): def __init__(self, dev_name, ib_port=None, gid_index=None): super().__init__(dev_name, ib_port, gid_index) try: self.var = Mlx5VAR(self.ctx) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP or ex.error_code == errno.EPROTONOSUPPORT: raise unittest.SkipTest('VAR allocation is not supported') class Mlx5VarTestCase(Mlx5RDMATestCase): def setUp(self): super().setUp() self.var_res = Mlx5VarRes(self.dev_name) def test_var_map_unmap(self): var_map = mmap.mmap(fileno=self.var_res.ctx.cmd_fd, length=self.var_res.var.length, offset=self.var_res.var.mmap_off) # There is no munmap method in mmap Python module, but by closing the # mmap instance the memory is unmapped. var_map.close() self.var_res.var.close() rdma-core-56.1/tests/test_mlx5_vfio.py000066400000000000000000000255111477342711600200210ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2021 Nvidia, Inc. All rights reserved. See COPYING file """ Test module for pyverbs' mlx5_vfio module. """ from threading import Thread import unittest import logging import struct import select import errno import time import math import os from tests.mlx5_base import Mlx5DevxRcResources, Mlx5DevxTrafficBase, PortState, \ PortStatus, PORT_STATE_TIMEOUT from pyverbs.providers.mlx5.mlx5dv import Mlx5DevxMsiVector, Mlx5DevxEq, Mlx5UAR from pyverbs.providers.mlx5.mlx5_vfio import Mlx5VfioAttr, Mlx5VfioContext from pyverbs.pyverbs_error import PyverbsRDMAError import pyverbs.providers.mlx5.mlx5_enums as dve from pyverbs.base import PyverbsRDMAErrno import pyverbs.mem_alloc as mem import pyverbs.dma_util as dma class Mlx5VfioResources(Mlx5DevxRcResources): def __init__(self, ib_port, pci_name, gid_index=None, ctx=None, activate_port_state=False): self.pci_name = pci_name self.ctx = ctx super().__init__(None, ib_port, gid_index, activate_port_state=activate_port_state) def create_context(self): """ Opens an mlx5 VFIO context. Since only one context is allowed to be opened on a VFIO, the user must pass that context for the remaining resources, which in that case, the same context would be used. :return: None """ if self.ctx: return try: vfio_attr = Mlx5VfioAttr(pci_name=self.pci_name) vfio_attr.pci_name = self.pci_name self.ctx = Mlx5VfioContext(attr=vfio_attr) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest(f'Mlx5 VFIO is not supported ({ex})') raise ex def query_gid(self): """ Currently Mlx5VfioResources does not support Eth port type. Query GID would just be skipped. 
""" pass class Mlx5VfioEqResources(Mlx5VfioResources): def __init__(self, ib_port, pci_name, gid_index=None, ctx=None): self.cons_index = 0 super().__init__(ib_port, pci_name, gid_index, ctx) self.logger = logging.getLogger(self.__class__.__name__) def create_uar(self): super().create_uar() self.uar['eq'] = Mlx5UAR(self.ctx, dve._MLX5DV_UAR_ALLOC_TYPE_NC) if not self.uar['eq'].page_id: raise PyverbsRDMAError('Failed to allocate UAR') def get_eqe(self, cc): from tests.mlx5_prm_structs import SwEqe ci = self.cons_index + cc entry = ci & (self.nent - 1) eqe_bytes = mem.read64(self.eq.vaddr + entry * len(SwEqe())) eqe_bytes = eqe_bytes.to_bytes(length=8, byteorder='little') eqe = SwEqe(eqe_bytes) if (eqe.owner & 1) ^ (not(not(ci & self.nent))): eqe = None elif eqe: dma.udma_from_dev_barrier() return eqe def update_ci(self, cc, arm=0): addr = self.doorbell if arm: addr += 8 # Adding 2 bytes according to PRM self.cons_index += cc val = (self.cons_index & 0xffffff) | (self.eqn << 24) val_be = struct.unpack("I", val))[0] dma.mmio_write32_as_be(addr, val_be) dma.udma_to_dev_barrier() def init_dveq_buff(self): from tests.mlx5_prm_structs import SwEqe for i in range(self.nent): eqe_bytes = mem.read64(self.eq.vaddr + i * len(SwEqe())) eqe_bytes = eqe_bytes.to_bytes(length=8, byteorder='little') eqe = SwEqe(eqe_bytes) eqe.owner = 0x1 self.update_ci(0) def update_cc(self, cc): if cc >= self.num_spare_eqe: self.update_ci(cc) cc = 0 return cc def create_eq(self): from tests.mlx5_prm_structs import CreateEqIn, SwEqc, CreateEqOut,\ EventType # Using num_spare_eqe to guarantee that we update # the ci before we polled all the entries in the EQ self.num_spare_eqe = 0x80 self.nent = 0x80 + self.num_spare_eqe self.msi_vector = Mlx5DevxMsiVector(self.ctx) vector = self.msi_vector.vector log_eq_size = math.ceil(math.log2(self.nent)) mask = 1 << EventType.PORT_STATE_CHANGE cmd_in = CreateEqIn(sw_eqc=SwEqc(uar_page=self.uar['eq'].page_id, log_eq_size=log_eq_size, intr=vector), event_bitmask_63_0=mask) self.eq = Mlx5DevxEq(self.ctx, cmd_in, len(CreateEqOut())) self.eqn = CreateEqOut(self.eq.out_view).eqn self.doorbell = self.uar['eq'].base_addr + 0x40 self.init_dveq_buff() self.update_ci(0, 1) def query_eqn(self): pass def process_async_events(self, fd): from tests.mlx5_prm_structs import EventType cc = 0 ret = os.read(fd, 8) if not ret: raise PyverbsRDMAErrno('Failed to read FD') eqe = self.get_eqe(cc) while eqe: if eqe.event_type == EventType.PORT_STATE_CHANGE: self.logger.debug('Caught port state change event') return eqe.event_type elif eqe.event_type == EventType.CQ_ERROR: raise PyverbsRDMAError('Event type Error') cc = self.update_cc(cc + 1) eqe = self.get_eqe(cc) self.update_ci(cc, 1) class Mlx5VfioTrafficTest(Mlx5DevxTrafficBase): """ Test various functionality of an mlx5-vfio device. """ def setUp(self): """ Verifies that the user has passed a PCI device name to work with. """ self.pci_dev = self.config['pci_dev'] if not self.pci_dev: raise unittest.SkipTest('PCI device must be passed by the user') def create_players(self): self.server = Mlx5VfioResources(ib_port=self.ib_port, pci_name=self.pci_dev, activate_port_state=True) self.client = Mlx5VfioResources(ib_port=self.ib_port, pci_name=self.pci_dev, ctx=self.server.ctx) def create_async_players(self): self.server = Mlx5VfioEqResources(ib_port=self.ib_port, pci_name=self.pci_dev) self.client = Mlx5VfioResources(ib_port=self.ib_port, pci_name=self.pci_dev, ctx=self.server.ctx) def vfio_process_events(self): """ Processes mlx5 vfio device events. 
This method should run from application thread to maintain the events. """ # Server and client use the same context events_fd = self.server.ctx.get_events_fd() with select.epoll() as epoll_events: epoll_events.register(events_fd, select.EPOLLIN) while self.proc_events: for fd, event in epoll_events.poll(timeout=0.1): if fd == events_fd: if not (event & select.EPOLLIN): self.event_ex.append(PyverbsRDMAError(f'Unexpected vfio event: {event}')) self.server.ctx.process_events() def vfio_process_async_events(self): """ Processes mlx5 vfio device async events. This method should run from application thread to maintain the events. """ from tests.mlx5_prm_structs import EventType # Server and client use the same context events_fd = self.server.msi_vector.fd with select.epoll() as epoll_events: epoll_events.register(events_fd, select.EPOLLIN) while self.proc_events: for fd, event in epoll_events.poll(timeout=0.1): if fd == events_fd: if not (event & select.EPOLLIN): self.event_ex.append(PyverbsRDMAError(f'Unexpected vfio event: {event}')) if self.server.process_async_events(events_fd) == EventType.PORT_STATE_CHANGE: self.caught_event = True def test_mlx5vfio_rc_qp_send_imm_traffic(self): """ Opens one mlx5 vfio context, creates two DevX RC QPs on it, and modifies them to RTS state. Then does SEND_IMM traffic. """ self.create_players() if self.server.is_eth(): raise unittest.SkipTest(f'{self.__class__.__name__} is currently supported over IB only') self.event_ex = [] self.proc_events = True proc_events = Thread(target=self.vfio_process_events) proc_events.start() # Move the DevX QPs to RTS state self.pre_run() try: # Send traffic self.send_imm_traffic() finally: # Stop listening to events self.proc_events = False proc_events.join() if self.event_ex: raise PyverbsRDMAError(f'Received unexpected vfio events: {self.event_ex}') def test_mlx5vfio_async_event(self): """ Opens one mlx5 vfio context, creates DevX EQ on it. Then activates the port and catches the port state change event. 
""" self.create_async_players() if self.server.is_eth(): raise unittest.SkipTest(f'{self.__class__.__name__} is currently supported over IB only') self.event_ex = [] self.proc_events = True self.caught_event = False proc_events = Thread(target=self.vfio_process_events) proc_async_events = Thread(target=self.vfio_process_async_events) proc_events.start() proc_async_events.start() # Move the DevX QPs to RTS state self.pre_run() try: # Change port state self.server.change_port_state_with_registers(PortStatus.MLX5_PORT_UP) admin_status, oper_status = self.server.query_port_state_with_registers() start_state_t = time.perf_counter() while admin_status != PortStatus.MLX5_PORT_UP or oper_status != PortStatus.MLX5_PORT_UP: if time.perf_counter() - start_state_t >= PORT_STATE_TIMEOUT: raise PyverbsRDMAError('Could not change the port state to UP') admin_status, oper_status = self.server.query_port_state_with_registers() start_state_t = time.perf_counter() while self.server.query_port_state_with_mads(self.ib_port) < PortState.ACTIVE: if time.perf_counter() - start_state_t >= PORT_STATE_TIMEOUT: raise PyverbsRDMAError('Could not change the port state to ACTIVE') time.sleep(1) finally: # Stop listening to events self.proc_events = False proc_events.join() proc_async_events.join() if self.event_ex: raise PyverbsRDMAError(f'Received unexpected vfio events: {self.event_ex}') if not self.caught_event: raise PyverbsRDMAError('Failed to catch an async event') rdma-core-56.1/tests/test_mr.py000066400000000000000000000726371477342711600165420ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file # Copyright (c) 2020 Intel Corporation. All rights reserved. See COPYING file """ Test module for pyverbs' mr module. """ import unittest import random import errno from tests.base import PyverbsAPITestCase, RCResources, RDMATestCase from pyverbs.pyverbs_error import PyverbsRDMAError, PyverbsError from pyverbs.mr import MR, MW, DMMR, DmaBufMR, MWBindInfo, MWBind from pyverbs.mem_alloc import posix_memalign, free from pyverbs.dmabuf import DmaBuf from pyverbs.qp import QPAttr from pyverbs.wr import SendWR import pyverbs.device as d from pyverbs.pd import PD import pyverbs.enums as e import tests.utils as u MAX_IO_LEN = 1048576 DM_INVALID_ALIGNMENT = 3 class MRRes(RCResources): def __init__(self, dev_name, ib_port, gid_index, mr_access=e.IBV_ACCESS_LOCAL_WRITE): """ Initialize MR resources based on RC resources that include RC QP. 
        :param dev_name: Device name to be used
        :param ib_port: IB port of the device to use
        :param gid_index: Which GID index to use
        :param mr_access: The MR access
        """
        self.mr_access = mr_access
        super().__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index)

    def create_mr(self):
        try:
            self.mr = MR(self.pd, self.msg_size, self.mr_access)
        except PyverbsRDMAError as ex:
            if ex.error_code == errno.EOPNOTSUPP:
                raise unittest.SkipTest(f'Reg MR with access ({self.mr_access}) is not supported')
            raise ex

    def create_qp_attr(self):
        qp_attr = QPAttr(port_num=self.ib_port)
        qp_access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE | \
                    e.IBV_ACCESS_REMOTE_ATOMIC
        qp_attr.qp_access_flags = qp_access
        return qp_attr

    def rereg_mr(self, flags, pd=None, addr=0, length=0, access=0):
        try:
            self.mr.rereg(flags, pd, addr, length, access)
        except PyverbsRDMAError as ex:
            if ex.error_code == errno.EOPNOTSUPP:
                raise unittest.SkipTest(f'Rereg MR is not supported ({str(ex)})')
            raise ex

class MRTest(RDMATestCase):
    """
    Test various functionalities of the MR class.
    """
    def setUp(self):
        super().setUp()
        self.iters = 10
        self.server = None
        self.client = None
        self.server_qp_attr = None
        self.client_qp_attr = None
        self.traffic_args = None

    def restate_qps(self):
        """
        Restate the resources QPs from ERR back to RTS state.
        """
        self.server.qp.modify(QPAttr(qp_state=e.IBV_QPS_RESET), e.IBV_QP_STATE)
        self.server.qp.to_rts(self.server_qp_attr)
        self.client.qp.modify(QPAttr(qp_state=e.IBV_QPS_RESET), e.IBV_QP_STATE)
        self.client.qp.to_rts(self.client_qp_attr)

    def test_mr_rereg_atomic(self):
        """
        Test the rereg of MR's atomic access with the following flow:
        Create MRs with atomic access, then rereg the MRs without atomic
        access and verify that traffic fails with the relevant error.
        Rereg the MRs back to atomic access and verify that traffic now
        succeeds.
        """
        atomic_mr_access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_ATOMIC
        self.create_players(MRRes, mr_access=atomic_mr_access)
        self.server_qp_attr, _ = self.server.qp.query(0x1ffffff)
        self.client_qp_attr, _ = self.client.qp.query(0x1ffffff)
        access = e.IBV_ACCESS_LOCAL_WRITE
        self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_ACCESS, access=access)
        self.client.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_ACCESS, access=access)
        with self.assertRaisesRegex(PyverbsRDMAError, 'Completion status is Remote access error'):
            u.atomic_traffic(**self.traffic_args,
                             send_op=e.IBV_WR_ATOMIC_FETCH_AND_ADD)
        self.restate_qps()
        self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_ACCESS, access=atomic_mr_access)
        self.client.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_ACCESS, access=atomic_mr_access)
        u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_FETCH_AND_ADD)

    def test_mr_rereg_access(self):
        self.create_players(MRRes)
        access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE
        self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_ACCESS, access=access)
        self.client.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_ACCESS, access=access)
        u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE)

    def test_mr_rereg_access_bad_flow(self):
        """
        Test that covers rereg of the MR's access with the following flow:
        Run remote traffic on an MR with compatible access, then rereg the MR
        without remote access and verify that traffic fails with the relevant
        error.
        """
        remote_access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE
        self.create_players(MRRes, mr_access=remote_access)
        u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE)
        access = e.IBV_ACCESS_LOCAL_WRITE
        self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_ACCESS, access=access)
        with self.assertRaisesRegex(PyverbsRDMAError, 'Remote access error'):
            u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE)

    def test_mr_rereg_pd(self):
        """
        Test that covers rereg of the MR's PD with the following flow:
        Use an MR with a QP that was created with the same PD. Then rereg the
        MR's PD and use the MR with the same QP, expecting the traffic to fail
        with "remote operation error". Restate the QP from ERR state, rereg
        the MR back to its previous PD and use it again with the QP, verifying
        that it now succeeds.
        """
        self.create_players(MRRes)
        self.server_qp_attr, _ = self.server.qp.query(0x1ffffff)
        self.client_qp_attr, _ = self.client.qp.query(0x1ffffff)
        u.traffic(**self.traffic_args)
        server_new_pd = PD(self.server.ctx)
        self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_PD, pd=server_new_pd)
        with self.assertRaisesRegex(PyverbsRDMAError, 'Remote operation error'):
            u.traffic(**self.traffic_args)
        self.restate_qps()
        self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_PD, pd=self.server.pd)
        u.traffic(**self.traffic_args)
        # Rereg the MR again with the new PD to cover
        # destroying a PD with a re-registered MR.
        self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_PD, pd=server_new_pd)

    def test_mr_rereg_addr(self):
        self.create_players(MRRes)
        self.server_qp_attr, _ = self.server.qp.query(0x1ffffff)
        self.client_qp_attr, _ = self.client.qp.query(0x1ffffff)
        s_recv_wr = u.get_recv_wr(self.server)
        self.server.qp.post_recv(s_recv_wr)
        server_addr = posix_memalign(self.server.msg_size)
        self.server.rereg_mr(flags=e.IBV_REREG_MR_CHANGE_TRANSLATION,
                             addr=server_addr, length=self.server.msg_size)
        with self.assertRaisesRegex(PyverbsRDMAError, 'Remote operation error'):
            # The server QP receive queue has WR with the old MR address,
            # therefore traffic should fail.
            u.traffic(**self.traffic_args)
        self.restate_qps()
        u.traffic(**self.traffic_args)
        free(server_addr)

    def test_reg_mr_bad_flags(self):
        """
        Verify that illegal flags combination fails as expected
        """
        with d.Context(name=self.dev_name) as ctx:
            with PD(ctx) as pd:
                with self.assertRaisesRegex(PyverbsRDMAError, 'Failed to register a MR'):
                    MR(pd, u.get_mr_length(), e.IBV_ACCESS_REMOTE_WRITE)
                with self.assertRaisesRegex(PyverbsRDMAError, 'Failed to register a MR'):
                    MR(pd, u.get_mr_length(), e.IBV_ACCESS_REMOTE_ATOMIC)

class MWRC(RCResources):
    def __init__(self, dev_name, ib_port, gid_index, mw_type):
        """
        Initialize Memory Window resources based on RC resources that include
        RC QP.
        :param dev_name: Device name to be used
        :param ib_port: IB port of the device to use
        :param gid_index: Which GID index to use
        :param mw_type: The MW type to use
        """
        super().__init__(dev_name=dev_name, ib_port=ib_port,
                         gid_index=gid_index)
        self.mw_type = mw_type
        access = e.IBV_ACCESS_REMOTE_WRITE | e.IBV_ACCESS_LOCAL_WRITE
        self.mw_bind_info = MWBindInfo(self.mr, self.mr.buf, self.msg_size,
                                       access)
        self.mw_bind = MWBind(self.mw_bind_info, e.IBV_SEND_SIGNALED)
        try:
            self.mw = MW(self.pd, self.mw_type)
        except PyverbsRDMAError as ex:
            if ex.error_code == errno.EOPNOTSUPP:
                raise unittest.SkipTest('Create MW is not supported')
            raise ex

    def create_mr(self):
        access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_MW_BIND
        try:
            self.mr = MR(self.pd, self.msg_size, access)
        except PyverbsRDMAError as ex:
            if ex.error_code == errno.EOPNOTSUPP:
                raise unittest.SkipTest('Reg MR with MW access is not supported')
            raise ex

    def create_qp_attr(self):
        qp_attr = QPAttr(port_num=self.ib_port)
        qp_access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE
        qp_attr.qp_access_flags = qp_access
        return qp_attr

class MWTest(RDMATestCase):
    """
    Test various functionalities of the MW class.
    """
    def setUp(self):
        super().setUp()
        self.iters = 10
        self.server = None
        self.client = None

    def tearDown(self):
        if self.server:
            self.server.mw.close()
        if self.client:
            self.client.mw.close()
        return super().tearDown()

    def bind_mw_type_1(self):
        self.server.qp.bind_mw(self.server.mw, self.server.mw_bind)
        self.client.qp.bind_mw(self.client.mw, self.client.mw_bind)
        # Poll the bind MW action completion.
        u.poll_cq(self.server.cq)
        u.poll_cq(self.client.cq)
        self.server.rkey = self.client.mw.rkey
        self.server.raddr = self.client.mr.buf
        self.client.rkey = self.server.mw.rkey
        self.client.raddr = self.server.mr.buf

    def bind_mw_type_2(self):
        client_send_wr = SendWR(opcode=e.IBV_WR_BIND_MW)
        client_send_wr.set_bind_wr(self.client.mw, self.client.mw_bind_info)
        server_send_wr = SendWR(opcode=e.IBV_WR_BIND_MW)
        server_send_wr.set_bind_wr(self.server.mw, self.server.mw_bind_info)
        self.server.qp.post_send(server_send_wr)
        self.client.qp.post_send(client_send_wr)
        # Poll the bind MW WR.
        u.poll_cq(self.server.cq)
        u.poll_cq(self.client.cq)
        self.server.rkey = client_send_wr.rkey
        self.server.raddr = self.client.mr.buf
        self.client.rkey = server_send_wr.rkey
        self.client.raddr = self.server.mr.buf

    def invalidate_mw_type1(self):
        """
        Invalidate the MWs by rebinding each MW with zero length.
        :return: None
        """
        for player in [self.server, self.client]:
            mw_bind_info = MWBindInfo(player.mr, player.mr.buf, 0, 0)
            mw_bind = MWBind(mw_bind_info, e.IBV_SEND_SIGNALED)
            player.qp.bind_mw(player.mw, mw_bind)
            # Poll the bind MW action request completion.
            u.poll_cq(player.cq)

    def invalidate_mw_type2_local(self):
        """
        Invalidate the MWs by posting an invalidation send WR from the local QP.
        :return: None
        """
        inv_send_wr = SendWR(opcode=e.IBV_WR_LOCAL_INV)
        inv_send_wr.imm_data = self.server.rkey
        self.client.qp.post_send(inv_send_wr)
        inv_send_wr = SendWR(opcode=e.IBV_WR_LOCAL_INV)
        inv_send_wr.imm_data = self.client.rkey
        self.server.qp.post_send(inv_send_wr)
        # Poll the invalidate MW WR.
        u.poll_cq(self.server.cq)
        u.poll_cq(self.client.cq)

    def invalidate_mw_type2_remote(self):
        """
        Invalidate the MWs by sending an invalidation send WR from the remote QP.
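        Unlike IBV_WR_LOCAL_INV, which invalidates an rkey on the sender's
        own QP, IBV_WR_SEND_WITH_INV carries the peer's rkey so that the
        invalidation is executed on the receiving side when the send arrives.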
        :return: None
        """
        server_recv_wr = u.get_recv_wr(self.server)
        client_recv_wr = u.get_recv_wr(self.client)
        self.server.qp.post_recv(server_recv_wr)
        self.client.qp.post_recv(client_recv_wr)
        inv_send_wr = SendWR(opcode=e.IBV_WR_SEND_WITH_INV)
        inv_send_wr.imm_data = self.client.rkey
        self.client.qp.post_send(inv_send_wr)
        inv_send_wr = SendWR(opcode=e.IBV_WR_SEND_WITH_INV)
        inv_send_wr.imm_data = self.server.rkey
        self.server.qp.post_send(inv_send_wr)
        # Poll the invalidate MW send WR.
        u.poll_cq(self.server.cq)
        u.poll_cq(self.client.cq)
        # Poll the invalidate MW recv WR.
        u.poll_cq(self.server.cq)
        u.poll_cq(self.client.cq)

    def test_mw_type1(self):
        self.create_players(MWRC, mw_type=e.IBV_MW_TYPE_1)
        self.bind_mw_type_1()
        u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE)

    def test_invalidate_mw_type1(self):
        self.test_mw_type1()
        self.invalidate_mw_type1()
        with self.assertRaisesRegex(PyverbsRDMAError, 'Remote access error'):
            u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE)

    def test_mw_type2(self):
        self.create_players(MWRC, mw_type=e.IBV_MW_TYPE_2)
        self.bind_mw_type_2()
        u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE)

    def test_mw_type2_invalidate_local(self):
        self.test_mw_type2()
        self.invalidate_mw_type2_local()
        with self.assertRaisesRegex(PyverbsRDMAError, 'Remote access error'):
            u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE)

    def test_mw_type2_invalidate_remote(self):
        self.test_mw_type2()
        self.invalidate_mw_type2_remote()
        with self.assertRaisesRegex(PyverbsRDMAError, 'Remote access error'):
            u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE)

    def test_mw_type2_invalidate_dealloc(self):
        self.test_mw_type2()
        # Dealloc the MW by closing the pyverbs objects.
        self.server.mw.close()
        self.client.mw.close()
        with self.assertRaisesRegex(PyverbsRDMAError, 'Remote access error'):
            u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE)

    def test_reg_mw_wrong_type(self):
        """
        Verify that trying to create a MW of a wrong type fails
        """
        with d.Context(name=self.dev_name) as ctx:
            with PD(ctx) as pd:
                try:
                    mw_type = 3
                    MW(pd, mw_type)
                except PyverbsRDMAError as ex:
                    if ex.error_code == errno.EOPNOTSUPP:
                        raise unittest.SkipTest('Create memory window of type {} is not supported'.format(mw_type))
                else:
                    raise PyverbsError('Created a MW with type {t}'.format(t=mw_type))

class DeviceMemoryAPITest(PyverbsAPITestCase):
    """
    Test various API usages of the DMMR class.
    """
    def setUp(self):
        super().setUp()
        if self.attr_ex.max_dm_size == 0:
            raise unittest.SkipTest('Device memory is not supported')

    def test_create_dm_mr(self):
        max_dm_size = self.attr_ex.max_dm_size
        dm_access = e.IBV_ACCESS_ZERO_BASED | e.IBV_ACCESS_LOCAL_WRITE
        for dm_size in [4, max_dm_size/4, max_dm_size/2]:
            dm_size = dm_size - (dm_size % u.DM_ALIGNMENT)
            for dmmr_factor_size in [0.1, 0.5, 1]:
                dmmr_size = dm_size * dmmr_factor_size
                dmmr_size = dmmr_size - (dmmr_size % u.DM_ALIGNMENT)
                with d.DM(self.ctx, d.AllocDmAttr(length=dm_size)) as dm:
                    DMMR(PD(self.ctx), dmmr_size, dm_access, dm, 0)

    def test_dm_bad_access(self):
        """
        Test multiple types of bad access to the Device Memory.
        Device memory access requires 4B alignment. The test tries to access
        the DM with bad alignment or outside of the allocated memory.
""" dm_size = 100 with d.DM(self.ctx, d.AllocDmAttr(length=dm_size)) as dm: dm_access = e.IBV_ACCESS_ZERO_BASED | e.IBV_ACCESS_LOCAL_WRITE dmmr = DMMR(PD(self.ctx), dm_size, dm_access, dm, 0) access_cases = [(DM_INVALID_ALIGNMENT, 4), # Valid length with unaligned offset (4, DM_INVALID_ALIGNMENT), # Valid offset with unaligned length (dm_size + 4, 4), # Offset out of allocated memory (0, dm_size + 4)] # Length out of allocated memory for case in access_cases: offset, length = case with self.assertRaisesRegex(PyverbsRDMAError, 'Failed to copy from dm'): dmmr.read(offset=offset, length=length) with self.assertRaisesRegex(PyverbsRDMAError, 'Failed to copy to dm'): dmmr.write(data='s'*length, offset=offset, length=length) def test_dm_bad_registration(self): """ Test bad Device Memory registration when trying to register bigger DMMR than the allocated DM. """ dm_size = 100 with d.DM(self.ctx, d.AllocDmAttr(length=dm_size)) as dm: dm_access = e.IBV_ACCESS_ZERO_BASED | e.IBV_ACCESS_LOCAL_WRITE with self.assertRaisesRegex(PyverbsRDMAError, 'Failed to register a device MR'): DMMR(PD(self.ctx), dm_size + 4, dm_access, dm, 0) def check_dmabuf_support(gpu=0): """ Check if dma-buf allocation is supported by the system. Skip the test on failure. """ device_num = 128 + gpu try: DmaBuf(1, gpu=gpu) except PyverbsRDMAError as ex: if ex.error_code == errno.ENOENT: raise unittest.SkipTest(f'Device /dev/dri/renderD{device_num} is not present') if ex.error_code == errno.EACCES: raise unittest.SkipTest(f'Lack of permission to access /dev/dri/renderD{device_num}') if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest(f'Allocating dmabuf is not supported by /dev/dri/renderD{device_num}') def check_dmabuf_mr_support(pd, gpu=0): """ Check if dma-buf MR registration is supported by the driver. Skip the test on failure """ try: DmaBufMR(pd, 1, 0, gpu=gpu) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Reg dma-buf MR is not supported by the RDMA driver') class DmaBufMRTest(PyverbsAPITestCase): """ Test various functionalities of the DmaBufMR class. """ def setUp(self): super().setUp() self.gpu = self.config['gpu'] self.gtt = self.config['gtt'] def test_dmabuf_reg_mr(self): """ Test ibv_reg_dmabuf_mr() """ check_dmabuf_support(self.gpu) with PD(self.ctx) as pd: check_dmabuf_mr_support(pd, self.gpu) flags = u.get_dmabuf_access_flags(self.ctx) for f in flags: len = u.get_mr_length() for off in [0, len//2]: with DmaBufMR(pd, len, f, offset=off, gpu=self.gpu, gtt=self.gtt) as mr: pass def test_dmabuf_dereg_mr(self): """ Test ibv_dereg_mr() with DmaBufMR """ check_dmabuf_support(self.gpu) with PD(self.ctx) as pd: check_dmabuf_mr_support(pd, self.gpu) flags = u.get_dmabuf_access_flags(self.ctx) for f in flags: len = u.get_mr_length() for off in [0, len//2]: with DmaBufMR(pd, len, f, offset=off, gpu=self.gpu, gtt=self.gtt) as mr: mr.close() def test_dmabuf_dereg_mr_twice(self): """ Verify that explicit call to DmaBufMR's close() doesn't fail """ check_dmabuf_support(self.gpu) with PD(self.ctx) as pd: check_dmabuf_mr_support(pd, self.gpu) flags = u.get_dmabuf_access_flags(self.ctx) for f in flags: len = u.get_mr_length() for off in [0, len//2]: with DmaBufMR(pd, len, f, offset=off, gpu=self.gpu, gtt=self.gtt) as mr: # Pyverbs supports multiple destruction of objects, # we are not expecting an exception here. 
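                            # close() is therefore safe to call explicitly
                            # even though the context manager closes the MR
                            # again on __exit__.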
                            mr.close()
                            mr.close()

    def test_dmabuf_reg_mr_bad_flags(self):
        """
        Verify that DmaBufMR with illegal flags combination fails as expected
        """
        check_dmabuf_support(self.gpu)
        with PD(self.ctx) as pd:
            check_dmabuf_mr_support(pd, self.gpu)
            for i in range(5):
                flags = random.sample([e.IBV_ACCESS_REMOTE_WRITE,
                                       e.IBV_ACCESS_REMOTE_ATOMIC],
                                      random.randint(1, 2))
                mr_flags = 0
                for flag in flags:
                    mr_flags += flag.value
                try:
                    DmaBufMR(pd, u.get_mr_length(), mr_flags, gpu=self.gpu,
                             gtt=self.gtt)
                except PyverbsRDMAError as err:
                    assert 'Failed to register a dma-buf MR' in err.args[0]
                else:
                    raise PyverbsRDMAError('Registered a dma-buf MR with illegal flags')

    def test_dmabuf_write(self):
        """
        Test writing to DmaBufMR's buffer
        """
        check_dmabuf_support(self.gpu)
        with PD(self.ctx) as pd:
            check_dmabuf_mr_support(pd, self.gpu)
            for i in range(10):
                mr_len = u.get_mr_length()
                flags = u.get_dmabuf_access_flags(self.ctx)
                for f in flags:
                    for mr_off in [0, mr_len//2]:
                        with DmaBufMR(pd, mr_len, f, offset=mr_off,
                                      gpu=self.gpu, gtt=self.gtt) as mr:
                            write_len = min(random.randint(1, MAX_IO_LEN),
                                            mr_len)
                            mr.write('a' * write_len, write_len)

    def test_dmabuf_read(self):
        """
        Test reading from DmaBufMR's buffer
        """
        check_dmabuf_support(self.gpu)
        with PD(self.ctx) as pd:
            check_dmabuf_mr_support(pd, self.gpu)
            for i in range(10):
                mr_len = u.get_mr_length()
                flags = u.get_dmabuf_access_flags(self.ctx)
                for f in flags:
                    for mr_off in [0, mr_len//2]:
                        with DmaBufMR(pd, mr_len, f, offset=mr_off,
                                      gpu=self.gpu, gtt=self.gtt) as mr:
                            write_len = min(random.randint(1, MAX_IO_LEN),
                                            mr_len)
                            write_str = 'a' * write_len
                            mr.write(write_str, write_len)
                            read_len = random.randint(1, write_len)
                            offset = random.randint(0, write_len-read_len)
                            read_str = mr.read(read_len, offset).decode()
                            assert read_str in write_str

    def test_dmabuf_lkey(self):
        """
        Test reading lkey property
        """
        check_dmabuf_support(self.gpu)
        with PD(self.ctx) as pd:
            check_dmabuf_mr_support(pd, self.gpu)
            length = u.get_mr_length()
            flags = u.get_dmabuf_access_flags(self.ctx)
            for f in flags:
                with DmaBufMR(pd, length, f, gpu=self.gpu, gtt=self.gtt) as mr:
                    mr.lkey

    def test_dmabuf_rkey(self):
        """
        Test reading rkey property
        """
        check_dmabuf_support(self.gpu)
        with PD(self.ctx) as pd:
            check_dmabuf_mr_support(pd, self.gpu)
            length = u.get_mr_length()
            flags = u.get_dmabuf_access_flags(self.ctx)
            for f in flags:
                with DmaBufMR(pd, length, f, gpu=self.gpu, gtt=self.gtt) as mr:
                    mr.rkey

class DmaBufRC(RCResources):
    def __init__(self, dev_name, ib_port, gid_index, gpu, gtt):
        """
        Initialize a DmaBufRC object.
        :param dev_name: Device name to be used
        :param ib_port: IB port of the device to use
        :param gid_index: Which GID index to use
        :param gpu: GPU unit to allocate dmabuf from
        :param gtt: Allocate dmabuf from GTT instead of VRAM
        """
        self.gpu = gpu
        self.gtt = gtt
        super(DmaBufRC, self).__init__(dev_name=dev_name, ib_port=ib_port,
                                       gid_index=gid_index)

    def create_mr(self):
        check_dmabuf_support(self.gpu)
        check_dmabuf_mr_support(self.pd, self.gpu)
        access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE
        mr = DmaBufMR(self.pd, self.msg_size, access, gpu=self.gpu,
                      gtt=self.gtt)
        self.mr = mr

    def create_qp_attr(self):
        qp_attr = QPAttr(port_num=self.ib_port)
        qp_access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_WRITE
        qp_attr.qp_access_flags = qp_access
        return qp_attr

class DmaBufTestCase(RDMATestCase):
    def setUp(self):
        super(DmaBufTestCase, self).setUp()
        self.iters = 100
        self.gpu = self.config['gpu']
        self.gtt = self.config['gtt']

    def test_dmabuf_rc_traffic(self):
        """
        Test send/recv using dma-buf MR over RC
        """
        self.create_players(DmaBufRC, gpu=self.gpu, gtt=self.gtt)
        u.traffic(**self.traffic_args)

    def test_dmabuf_rdma_traffic(self):
        """
        Test rdma write using dma-buf MR
        """
        self.create_players(DmaBufRC, gpu=self.gpu, gtt=self.gtt)
        u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE)

class DeviceMemoryRes(RCResources):
    def __init__(self, dev_name, ib_port, gid_index, remote_access=False,
                 msg_size=1024):
        """
        Initialize DM resources based on RC resources that include RC QP.
        :param dev_name: Device name to be used.
        :param ib_port: IB port of the device to use.
        :param gid_index: Which GID index to use.
        :param remote_access: If True, enable remote access.
        :param msg_size: Message size (default: 1024).
        """
        self.remote_access = remote_access
        super().__init__(dev_name=dev_name, ib_port=ib_port,
                         gid_index=gid_index, msg_size=msg_size)

    def create_mr(self):
        try:
            self.dm = d.DM(self.ctx, d.AllocDmAttr(length=self.msg_size))
            access = e.IBV_ACCESS_ZERO_BASED | e.IBV_ACCESS_LOCAL_WRITE
            if self.remote_access:
                access |= e.IBV_ACCESS_REMOTE_WRITE | e.IBV_ACCESS_REMOTE_READ | \
                          e.IBV_ACCESS_REMOTE_ATOMIC
            self.mr = DMMR(self.pd, self.msg_size, access, self.dm, 0)
        except PyverbsRDMAError as ex:
            if ex.error_code == errno.EOPNOTSUPP:
                raise unittest.SkipTest(f'Reg DMMR with access={access} is not supported')
            raise ex

    def create_qp_attr(self):
        qp_attr = QPAttr(port_num=self.ib_port)
        qp_attr.qp_access_flags = e.IBV_ACCESS_LOCAL_WRITE
        if self.remote_access:
            qp_attr.qp_access_flags |= e.IBV_ACCESS_REMOTE_WRITE | e.IBV_ACCESS_REMOTE_READ | \
                                       e.IBV_ACCESS_REMOTE_ATOMIC
        return qp_attr

class DeviceMemoryTest(RDMATestCase):
    """
    Test various functionalities of the DM class.
    """
    def setUp(self):
        super().setUp()
        self.iters = 10
        self.server = None
        self.client = None
        self.traffic_args = None
        ctx = d.Context(name=self.dev_name)
        if ctx.query_device_ex().max_dm_size == 0:
            raise unittest.SkipTest('Device memory is not supported')
        # Device memory cannot work in scatter to cqe mode in MLX5 devices,
        # therefore disable it and restore the default value at the end of the
        # test.
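        # (Scatter to CQE inlines small receive payloads directly into the
        # completion entry; that data path cannot land in device memory,
        # which is why MLX5_SCATTER_TO_CQE is forced off here.)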
self.set_env_variable('MLX5_SCATTER_TO_CQE', '0') def test_dm_traffic(self): self.create_players(DeviceMemoryRes) u.traffic(**self.traffic_args) def test_dm_remote_traffic(self): self.create_players(DeviceMemoryRes, remote_access=True) u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE) def test_dm_remote_write_traffic_imm(self): self.create_players(DeviceMemoryRes, remote_access=True) u.traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE_WITH_IMM) def test_dm_remote_read_traffic(self): self.create_players(DeviceMemoryRes, remote_access=True) u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_READ) def test_dm_atomic_fetch_add(self): self.create_players(DeviceMemoryRes, remote_access=True, msg_size=8) u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_FETCH_AND_ADD) def test_dm_atomic_cmp_swp(self): self.create_players(DeviceMemoryRes, remote_access=True, msg_size=8) u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_CMP_AND_SWP) rdma-core-56.1/tests/test_odp.py000066400000000000000000000267411477342711600167010ustar00rootroot00000000000000from pyverbs.mem_alloc import mmap, munmap, madvise, MAP_ANONYMOUS_, MAP_PRIVATE_, \ MAP_HUGETLB_ from tests.base import RCResources, UDResources, XRCResources from pyverbs.qp import QPCap, QPAttr, QPInitAttr from pyverbs.wr import SGE, SendWR, RecvWR from tests.base import RDMATestCase from pyverbs.mr import MR import pyverbs.enums as e import tests.utils as u HUGE_PAGE_SIZE = 0x200000 class OdpUD(UDResources): def __init__(self, request_user_addr=False, **kwargs): self.request_user_addr = request_user_addr self.user_addr = None super(OdpUD, self).__init__(**kwargs) @u.requires_odp('ud', e.IBV_ODP_SUPPORT_SEND) def create_mr(self): if self.request_user_addr: self.user_addr = mmap(length=self.msg_size, flags=MAP_ANONYMOUS_ | MAP_PRIVATE_) self.send_mr = MR(self.pd, self.msg_size + u.GRH_SIZE, e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_ON_DEMAND, address=self.user_addr) self.recv_mr = MR(self.pd, self.msg_size + u.GRH_SIZE, e.IBV_ACCESS_LOCAL_WRITE) class OdpRC(RCResources): def __init__(self, dev_name, ib_port, gid_index, is_huge=False, request_user_addr=False, use_mr_prefetch=None, is_implicit=False, prefetch_advice=e._IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE, msg_size=1024, odp_caps=e.IBV_ODP_SUPPORT_SEND | e.IBV_ODP_SUPPORT_RECV, use_mixed_mr=False): """ Initialize an OdpRC object. :param dev_name: Device name to be used :param ib_port: IB port of the device to use :param gid_index: Which GID index to use :param is_huge: If True, use huge pages for MR registration :param request_user_addr: Request to provide the MR's buffer address. If False, the buffer will be allocated by pyverbs. :param use_mr_prefetch: Describes the properties of the prefetch operation. The options are 'sync', 'async' and None to skip the prefetch operation. :param is_implicit: If True, register implicit MR. :param prefetch_advice: The advice of the prefetch request (ignored if use_mr_prefetch is None). :param use_mixed_mr: If True, create a non-ODP MR in addition to the ODP MR. 
""" self.is_huge = is_huge self.request_user_addr = request_user_addr self.is_implicit = is_implicit self.odp_caps = odp_caps self.access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_ON_DEMAND | \ e.IBV_ACCESS_REMOTE_ATOMIC | e.IBV_ACCESS_REMOTE_READ | \ e.IBV_ACCESS_REMOTE_WRITE self.user_addr = None self.use_mixed_mr = use_mixed_mr self.non_odp_mr = None super(OdpRC, self).__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index) self.use_mr_prefetch = use_mr_prefetch self.prefetch_advice = prefetch_advice self.msg_size = msg_size @u.requires_odp('rc', e.IBV_ODP_SUPPORT_SEND | e.IBV_ODP_SUPPORT_RECV) def create_mr(self): u.odp_supported(self.ctx, 'rc', self.odp_caps) if self.request_user_addr: mmap_flags = MAP_ANONYMOUS_| MAP_PRIVATE_ length = self.msg_size if self.is_huge: mmap_flags |= MAP_HUGETLB_ length = HUGE_PAGE_SIZE self.user_addr = mmap(length=length, flags=mmap_flags) access = self.access if self.is_huge: access |= e.IBV_ACCESS_HUGETLB self.mr = MR(self.pd, self.msg_size, access, address=self.user_addr, implicit=self.is_implicit) if self.use_mixed_mr: self.non_odp_mr = MR(self.pd, self.msg_size, e.IBV_ACCESS_LOCAL_WRITE) def create_qp_init_attr(self): return QPInitAttr(qp_type=e.IBV_QPT_RC, scq=self.cq, sq_sig_all=0, rcq=self.cq, srq=self.srq, cap=self.create_qp_cap()) def create_qp_attr(self): qp_attr = QPAttr(port_num=self.ib_port) qp_attr.qp_access_flags = self.access return qp_attr def create_qp_cap(self): if self.use_mixed_mr: return QPCap(max_recv_wr=self.num_msgs, max_send_sge=2, max_recv_sge=2) return super().create_qp_cap() class OdpXRC(XRCResources): def __init__(self, request_user_addr=False, **kwargs): self.request_user_addr = request_user_addr self.user_addr = None super(OdpXRC, self).__init__(**kwargs) @u.requires_odp('xrc', e.IBV_ODP_SUPPORT_SEND | e.IBV_ODP_SUPPORT_SRQ_RECV) def create_mr(self): if self.request_user_addr: self.user_addr = mmap(length=self.msg_size, flags=MAP_ANONYMOUS_| MAP_PRIVATE_) self.mr = u.create_custom_mr(self, e.IBV_ACCESS_ON_DEMAND, user_addr=self.user_addr) class OdpTestCase(RDMATestCase): def setUp(self): super(OdpTestCase, self).setUp() self.iters = 100 self.force_page_faults = True self.is_huge = False def create_players(self, resource, **resource_arg): """ Init odp tests resources. :param resource: The RDMA resources to use. A class of type BaseResources. :param resource_arg: Dict of args that specify the resource specific attributes. 
""" sync_attrs = False if resource == OdpUD else True super().create_players(resource, sync_attrs, **resource_arg) self.traffic_args['force_page_faults'] = self.force_page_faults def tearDown(self): if self.server and self.server.user_addr: length = HUGE_PAGE_SIZE if self.is_huge else self.server.msg_size munmap(self.server.user_addr, length) if self.client and self.client.user_addr: length = HUGE_PAGE_SIZE if self.is_huge else self.client.msg_size munmap(self.client.user_addr, length) super(OdpTestCase, self).tearDown() def test_odp_rc_traffic(self): self.create_players(OdpRC, request_user_addr=self.force_page_faults) u.traffic(**self.traffic_args) def test_odp_rc_mixed_mr(self): self.create_players(OdpRC, request_user_addr=self.force_page_faults, use_mixed_mr=True) u.traffic(**self.traffic_args) def test_odp_rc_atomic_cmp_and_swp(self): self.force_page_faults = False self.create_players(OdpRC, request_user_addr=self.force_page_faults, msg_size=8, odp_caps=e.IBV_ODP_SUPPORT_ATOMIC) u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_CMP_AND_SWP) u.atomic_traffic(**self.traffic_args, receiver_val=1, sender_val=1, send_op=e.IBV_WR_ATOMIC_CMP_AND_SWP) def test_odp_rc_atomic_fetch_and_add(self): self.force_page_faults = False self.create_players(OdpRC, request_user_addr=self.force_page_faults, msg_size=8, odp_caps=e.IBV_ODP_SUPPORT_ATOMIC) u.atomic_traffic(**self.traffic_args, send_op=e.IBV_WR_ATOMIC_FETCH_AND_ADD) def test_odp_rc_rdma_read(self): self.create_players(OdpRC, request_user_addr=self.force_page_faults, odp_caps=e.IBV_ODP_SUPPORT_READ) self.server.mr.write('s' * self.server.msg_size, self.server.msg_size) u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_READ) def test_odp_rc_rdma_write(self): self.create_players(OdpRC, request_user_addr=self.force_page_faults, odp_caps=e.IBV_ODP_SUPPORT_WRITE) u.rdma_traffic(**self.traffic_args, send_op=e.IBV_WR_RDMA_WRITE) def test_odp_implicit_rc_traffic(self): self.create_players(OdpRC, request_user_addr=self.force_page_faults, is_implicit=True) u.traffic(**self.traffic_args) def test_odp_ud_traffic(self): self.create_players(OdpUD, request_user_addr=self.force_page_faults) # Implement the traffic here because OdpUD uses two different MRs for # send and recv. 
ah_client = u.get_global_ah(self.client, self.gid_index, self.ib_port) recv_sge = SGE(self.server.recv_mr.buf, self.server.msg_size + u.GRH_SIZE, self.server.recv_mr.lkey) server_recv_wr = RecvWR(sg=[recv_sge], num_sge=1) send_sge = SGE(self.client.send_mr.buf + u.GRH_SIZE, self.client.msg_size, self.client.send_mr.lkey) client_send_wr = SendWR(num_sge=1, sg=[send_sge]) for i in range(self.iters): madvise(self.client.send_mr.buf, self.client.msg_size) self.server.qp.post_recv(server_recv_wr) u.post_send(self.client, client_send_wr, ah=ah_client) u.poll_cq(self.client.cq) u.poll_cq(self.server.cq) def test_odp_xrc_traffic(self): self.create_players(OdpXRC, request_user_addr=self.force_page_faults) u.xrc_traffic(self.client, self.server) @u.requires_huge_pages() def test_odp_rc_huge_traffic(self): self.force_page_faults = False self.create_players(OdpRC, request_user_addr=self.force_page_faults, is_huge=True) u.traffic(**self.traffic_args) @u.requires_huge_pages() def test_odp_rc_huge_user_addr_traffic(self): self.is_huge = True self.create_players(OdpRC, request_user_addr=self.force_page_faults, is_huge=True) u.traffic(**self.traffic_args) def test_odp_sync_prefetch_rc_traffic(self): for advice in [e._IBV_ADVISE_MR_ADVICE_PREFETCH, e._IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE]: self.create_players(OdpRC, request_user_addr=self.force_page_faults, use_mr_prefetch='sync', prefetch_advice=advice) u.traffic(**self.traffic_args) def test_odp_async_prefetch_rc_traffic(self): for advice in [e._IBV_ADVISE_MR_ADVICE_PREFETCH, e._IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE]: self.create_players(OdpRC, request_user_addr=self.force_page_faults, use_mr_prefetch='async', prefetch_advice=advice) u.traffic(**self.traffic_args) def test_odp_implicit_sync_prefetch_rc_traffic(self): self.create_players(OdpRC, request_user_addr=self.force_page_faults, use_mr_prefetch='sync', is_implicit=True) u.traffic(**self.traffic_args) def test_odp_implicit_async_prefetch_rc_traffic(self): self.create_players(OdpRC, request_user_addr=self.force_page_faults, use_mr_prefetch='async', is_implicit=True) u.traffic(**self.traffic_args) def test_odp_prefetch_sync_no_page_fault_rc_traffic(self): prefetch_advice = e._IBV_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT self.create_players(OdpRC, request_user_addr=self.force_page_faults, use_mr_prefetch='sync', prefetch_advice=prefetch_advice) u.traffic(**self.traffic_args) def test_odp_prefetch_async_no_page_fault_rc_traffic(self): prefetch_advice = e._IBV_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT self.create_players(OdpRC, request_user_addr=self.force_page_faults, use_mr_prefetch='async', prefetch_advice=prefetch_advice) u.traffic(**self.traffic_args) rdma-core-56.1/tests/test_parent_domain.py000066400000000000000000000153101477342711600207250ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file """ Test module for Pyverbs' ParentDomain. 
""" from pyverbs.pd import ParentDomainInitAttr, ParentDomain, ParentDomainContext from tests.base import RCResources, UDResources, RDMATestCase from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.cq import CqInitAttrEx, CQEX import pyverbs.mem_alloc as mem import pyverbs.enums as e import tests.utils as u import unittest import errno HUGE_PAGE_SIZE = 0x200000 def default_allocator(pd, context, size, alignment, resource_type): return e._IBV_ALLOCATOR_USE_DEFAULT def default_free(pd, context, ptr, resource_type): return e._IBV_ALLOCATOR_USE_DEFAULT def mem_align_allocator(pd, context, size, alignment, resource_type): p = mem.posix_memalign(size, alignment) return p def free_func(pd, context, ptr, resource_type): mem.free(ptr) def huge_page_alloc(pd, context, size, alignment, resource_type): ptr = context.user_data remainder = ptr % alignment ptr += 0 if remainder == 0 else (alignment - remainder) context.user_data += size return ptr def huge_page_free(pd, context, ptr, resource_type): """ No need to free memory, since this allocator assumes the huge page was externally mapped (and will be externally un-mapped). """ pass def create_parent_domain_with_allocators(res): """ Creates parent domain for res instance. The allocators themselves are taken from res.allocator_func and res.free_func. :param res: The resources instance to work on (an instance of BaseResources) """ if res.allocator_func and res.free_func: res.pd_ctx = ParentDomainContext(res.pd, res.allocator_func, res.free_func, res.user_data) pd_attr = ParentDomainInitAttr(pd=res.pd, pd_context=res.pd_ctx) try: res.pd = ParentDomain(res.ctx, attr=pd_attr) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Parent Domain is not supported on this device') raise ex def parent_domain_res_cls(base_class): """ This is a factory function which creates a class that inherits base_class of any BaseResources type. Its purpose is to behave exactly as base_class does, except for creating a parent domain with custom allocators. Hence the returned class must be initialized with (alloc_func, free_func, user_data, **kwargs), while kwargs are the arguments needed (if any) for base_class. :param base_class: The base resources class to inherit from :return: ParentDomainRes(alloc_func=None, free_func=None, **kwargs) class """ class ParentDomainRes(base_class): def __init__(self, alloc_func=None, free_func=None, user_data=None, **kwargs): self.pd_ctx = None self.protection_domain = None self.allocator_func = alloc_func self.free_func = free_func self.user_data = user_data super().__init__(**kwargs) def create_pd(self): super().create_pd() self.protection_domain = self.pd create_parent_domain_with_allocators(self) return ParentDomainRes class ParentDomainHugePageRcRes(parent_domain_res_cls(RCResources)): def __init__(self, alloc_func=None, free_func=None, **kwargs): user_data = mem.mmap(length=HUGE_PAGE_SIZE, flags=mem.MAP_ANONYMOUS_ | mem.MAP_PRIVATE_ | mem.MAP_HUGETLB_) super().__init__(alloc_func=alloc_func, free_func=free_func, user_data=user_data, **kwargs) def __del__(self): mem.munmap(self.user_data, HUGE_PAGE_SIZE) class ParentDomainCqExSrqRes(parent_domain_res_cls(RCResources)): """ Parent domain resources. Based on RCResources. This includes a parent domain created with the given allocators, in addition it creates an extended CQ and a SRQ for RC traffic. 
:param dev_name: Device name to be used :param ib_port: IB port of the device to use :param gid_index: Which GID index to use :param alloc_func: Custom allocator function :param free_func: Custom free function """ def __init__(self, dev_name, ib_port=None, gid_index=None, alloc_func=None, free_func=None): super().__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index, alloc_func=alloc_func, free_func=free_func, with_srq=True) def create_cq(self): wc_flags = e.IBV_WC_STANDARD_FLAGS cia = CqInitAttrEx(cqe=2000, wc_flags=wc_flags, parent_domain=self.pd, comp_mask=e.IBV_CQ_INIT_ATTR_MASK_FLAGS | e.IBV_CQ_INIT_ATTR_MASK_PD) try: self.cq = CQEX(self.ctx, cia) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Extended CQ with Parent Domain is not supported') raise ex class ParentDomainTrafficTest(RDMATestCase): def setUp(self): super().setUp() self.iters = 10 self.server = None self.client = None def test_without_allocators_rc_traffic(self): parent_domain_rc_res = parent_domain_res_cls(RCResources) self.create_players(parent_domain_rc_res) u.traffic(**self.traffic_args) def test_default_allocators_rc_traffic(self): parent_domain_rc_res = parent_domain_res_cls(RCResources) self.create_players(parent_domain_rc_res, alloc_func=default_allocator, free_func=default_free) u.traffic(**self.traffic_args) def test_mem_align_rc_traffic(self): parent_domain_rc_res = parent_domain_res_cls(RCResources) self.create_players(parent_domain_rc_res, alloc_func=mem_align_allocator, free_func=free_func) u.traffic(**self.traffic_args) def test_mem_align_ud_traffic(self): parent_domain_ud_res = parent_domain_res_cls(UDResources) self.create_players(parent_domain_ud_res, alloc_func=mem_align_allocator, free_func=free_func) u.traffic(**self.traffic_args) def test_mem_align_srq_excq_rc_traffic(self): self.create_players(ParentDomainCqExSrqRes, alloc_func=mem_align_allocator, free_func=free_func) u.traffic(**self.traffic_args, is_cq_ex=True) @u.requires_huge_pages() def test_huge_page_traffic(self): self.create_players(ParentDomainHugePageRcRes, alloc_func=huge_page_alloc, free_func=huge_page_free) u.traffic(**self.traffic_args) rdma-core-56.1/tests/test_pd.py000066400000000000000000000022531477342711600165120ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file """ Test module for pyverbs' pd module. """ import random from tests.base import PyverbsAPITestCase from pyverbs.pd import PD class PDTest(PyverbsAPITestCase): """ Test various functionalities of the PD class. """ def test_alloc_pd(self): """ Test ibv_alloc_pd() """ with PD(self.ctx): pass def test_dealloc_pd(self): """ Test ibv_dealloc_pd() """ with PD(self.ctx) as pd: pd.close() def test_multiple_pd_creation(self): """ Test multiple creations and destructions of a PD object """ for i in range(random.randint(1, 200)): with PD(self.ctx) as pd: pd.close() def test_destroy_pd_twice(self): """ Test bad flow cases in destruction of a PD object """ with PD(self.ctx) as pd: # Pyverbs supports multiple destruction of objects, so we are # not expecting an exception here. pd.close() pd.close() rdma-core-56.1/tests/test_qp.py000066400000000000000000000401661477342711600165340ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file # Copyright (c) 2020 Kamal Heib , All rights reserved.
See COPYING file # Copyright 2020-2023 Amazon.com, Inc. or its affiliates. All rights reserved. """ Test module for pyverbs' qp module. """ import unittest import random import errno import os from pyverbs.pyverbs_error import PyverbsRDMAError from pyverbs.qp import QPInitAttr, QPAttr, QP from tests.base import PyverbsAPITestCase import pyverbs.utils as pu import pyverbs.device as d import pyverbs.enums as e from pyverbs.pd import PD from pyverbs.cq import CQ import tests.utils as u class QPTest(PyverbsAPITestCase): """ Test various functionalities of the QP class. """ def create_qp(self, creator, qp_init_attr, is_ex, with_attr, port_num): """ Auxiliary function to create a QP object. """ try: qp_attr = (None, QPAttr(port_num=port_num))[with_attr] return QP(creator, qp_init_attr, qp_attr) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: with_str = ('without', 'with')[with_attr] + ('', ' extended')[is_ex] qp_type_str = pu.qp_type_to_str(qp_init_attr.qp_type) raise unittest.SkipTest(f'Create {qp_type_str} QP {with_str} attrs is not supported') raise ex def create_qp_common_test(self, qp_type, qp_state, is_ex, with_attr, qp_attr_edit_callback=None): """ Common function used by create QP tests. """ with PD(self.ctx) as pd: with CQ(self.ctx, 100, None, None, 0) as cq: if qp_type == e.IBV_QPT_RAW_PACKET: if not (u.is_eth(self.ctx, self.ib_port) and u.is_root()): raise unittest.SkipTest('Creating a RAW QP must be done by root on an Ethernet link layer') if is_ex: qia = get_qp_init_attr_ex(cq, pd, self.attr, self.attr_ex, qp_type) creator = self.ctx else: qia = u.get_qp_init_attr(cq, self.attr) qia.qp_type = qp_type creator = pd if qp_attr_edit_callback: qia = qp_attr_edit_callback(qia) qp = self.create_qp(creator, qia, is_ex, with_attr, self.ib_port) qp_type_str = pu.qp_type_to_str(qp_type) qp_state_str = pu.qp_state_to_str(qp_state) assert qp.qp_state == qp_state, f'{qp_type_str} QP should have been in {qp_state_str}' def test_create_rc_qp_no_attr(self): """ Test RC QP creation via ibv_create_qp without a QPAttr object provided. """ self.create_qp_common_test(e.IBV_QPT_RC, e.IBV_QPS_RESET, False, False) def test_create_uc_qp_no_attr(self): """ Test UC QP creation via ibv_create_qp without a QPAttr object provided. """ self.create_qp_common_test(e.IBV_QPT_UC, e.IBV_QPS_RESET, False, False) def test_create_ud_qp_no_attr(self): """ Test UD QP creation via ibv_create_qp without a QPAttr object provided. """ self.create_qp_common_test(e.IBV_QPT_UD, e.IBV_QPS_RESET, False, False) def test_create_raw_qp_no_attr(self): """ Test RAW Packet QP creation via ibv_create_qp without a QPAttr object provided. Raw Packet is skipped for non-root users / Infiniband link layer. """ self.create_qp_common_test(e.IBV_QPT_RAW_PACKET, e.IBV_QPS_RESET, False, False) def test_create_rc_qp_with_attr(self): """ Test RC QP creation via ibv_create_qp with a QPAttr object provided. """ self.create_qp_common_test(e.IBV_QPT_RC, e.IBV_QPS_INIT, False, True) def test_create_uc_qp_with_attr(self): """ Test UC QP creation via ibv_create_qp with a QPAttr object provided. """ self.create_qp_common_test(e.IBV_QPT_UC, e.IBV_QPS_INIT, False, True) def test_create_ud_qp_with_attr(self): """ Test UD QP creation via ibv_create_qp with a QPAttr object provided. """ self.create_qp_common_test(e.IBV_QPT_UD, e.IBV_QPS_RTS, False, True) def test_create_raw_qp_with_attr(self): """ Test RAW Packet QP creation via ibv_create_qp with a QPAttr object provided. Raw Packet is skipped for non-root users / Infiniband link layer.
""" self.create_qp_common_test(e.IBV_QPT_RAW_PACKET, e.IBV_QPS_RTS, False, True) def test_create_rc_qp_ex_no_attr(self): """ Test RC QP creation via ibv_create_qp_ex without a QPAttr object provided. """ self.create_qp_common_test(e.IBV_QPT_RC, e.IBV_QPS_RESET, True, False) def test_create_uc_qp_ex_no_attr(self): """ Test UC QP creation via ibv_create_qp_ex without a QPAttr object provided. """ self.create_qp_common_test(e.IBV_QPT_UC, e.IBV_QPS_RESET, True, False) def test_create_ud_qp_ex_no_attr(self): """ Test UD QP creation via ibv_create_qp_ex without a QPAttr object provided. """ self.create_qp_common_test(e.IBV_QPT_UD, e.IBV_QPS_RESET, True, False) def test_create_raw_qp_ex_no_attr(self): """ Test Raw Packet QP creation via ibv_create_qp_ex without a QPAttr object provided. Raw Packet is skipped for non-root users / Infiniband link layer. """ self.create_qp_common_test(e.IBV_QPT_RAW_PACKET, e.IBV_QPS_RESET, True, False) def test_create_rc_qp_ex_with_attr(self): """ Test RC QP creation via ibv_create_qp_ex with a QPAttr object provided. """ self.create_qp_common_test(e.IBV_QPT_RC, e.IBV_QPS_INIT, True, True) def test_create_uc_qp_ex_with_attr(self): """ Test UC QP creation via ibv_create_qp_ex with a QPAttr object provided. """ self.create_qp_common_test(e.IBV_QPT_UC, e.IBV_QPS_INIT, True, True) def test_create_ud_qp_ex_with_attr(self): """ Test UD QP creation via ibv_create_qp_ex with a QPAttr object provided. """ self.create_qp_common_test(e.IBV_QPT_UD, e.IBV_QPS_RTS, True, True) def test_create_raw_qp_ex_with_attr(self): """ Test Raw Packet QP creation via ibv_create_qp_ex with a QPAttr object provided. Raw Packet is skipped for non-root users / Infiniband link layer. """ self.create_qp_common_test(e.IBV_QPT_RAW_PACKET, e.IBV_QPS_RTS, True, True) def qp_attr_edit_max_send_wr_callback(self, qp_init_attr): qp_init_attr.max_send_wr = 0xffffffff # max_uint32 return qp_init_attr def qp_attr_edit_max_send_sge_callback(self, qp_init_attr): qp_init_attr.max_send_sge = 0xffff # max_uint16 return qp_init_attr def qp_attr_edit_max_recv_sge_callback(self, qp_init_attr): qp_init_attr.max_recv_sge = 0xffff # max_uint16 return qp_init_attr def qp_attr_edit_max_recv_wr_callback(self, qp_init_attr): qp_init_attr.max_recv_wr = 0xffffffff # max_uint32 return qp_init_attr def test_create_raw_qp_ex_with_illegal_caps_max_send_wr(self): """ Test Raw Packet QP creation via ibv_create_qp_ex with a QPAttr object with illegal max_send_wr. """ dev_attr = self.ctx.query_device() if dev_attr.max_qp_wr < 0xffffffff: with self.assertRaises(PyverbsRDMAError) as ex: self.create_qp_common_test(e.IBV_QPT_UD, e.IBV_QPS_RTS, False, True, qp_attr_edit_callback=self.qp_attr_edit_max_send_wr_callback) self.assertNotEqual(ex.exception.error_code, 0) def test_create_raw_qp_ex_with_illegal_caps_max_send_sge(self): """ Test Raw Packet QP creation via ibv_create_qp_ex with a QPAttr object with illegal max_send_sge. """ dev_attr = self.ctx.query_device() if dev_attr.max_sge < 0xffff: with self.assertRaises(PyverbsRDMAError) as ex: self.create_qp_common_test(e.IBV_QPT_UD, e.IBV_QPS_RTS, False, True, qp_attr_edit_callback=self.qp_attr_edit_max_send_sge_callback) self.assertNotEqual(ex.exception.error_code, 0) def test_create_raw_qp_ex_with_illegal_caps_max_recv_sge(self): """ Test Raw Packet QP creation via ibv_create_qp_ex with a QPAttr object with illegal max_recv_sge. 
""" dev_attr = self.ctx.query_device() if dev_attr.max_sge < 0xffff: with self.assertRaises(PyverbsRDMAError) as ex: self.create_qp_common_test(e.IBV_QPT_UD, e.IBV_QPS_RTS, False, True, qp_attr_edit_callback=self.qp_attr_edit_max_recv_sge_callback) self.assertNotEqual(ex.exception.error_code, 0) def test_create_raw_qp_ex_with_illegal_caps_max_recv_wr(self): """ Test Raw Packet QP creation via ibv_create_qp_ex with a QPAttr object with illegal max_recv_wr. """ dev_attr = self.ctx.query_device() if dev_attr.max_qp_wr < 0xffffffff: with self.assertRaises(PyverbsRDMAError) as ex: self.create_qp_common_test(e.IBV_QPT_UD, e.IBV_QPS_RTS, False, True, qp_attr_edit_callback=self.qp_attr_edit_max_recv_wr_callback) self.assertNotEqual(ex.exception.error_code, 0) def verify_qp_attrs(self, orig_cap, state, init_attr, attr): self.assertEqual(state, attr.qp_state) self.assertLessEqual(orig_cap.max_send_wr, init_attr.cap.max_send_wr) self.assertLessEqual(orig_cap.max_recv_wr, init_attr.cap.max_recv_wr) self.assertLessEqual(orig_cap.max_send_sge, init_attr.cap.max_send_sge) self.assertLessEqual(orig_cap.max_recv_sge, init_attr.cap.max_recv_sge) self.assertLessEqual(orig_cap.max_inline_data, init_attr.cap.max_inline_data) def get_node_type(self): for dev in d.get_device_list(): if dev.name.decode() == self.ctx.name: return dev.node_type def query_qp_common_test(self, qp_type): with PD(self.ctx) as pd: with CQ(self.ctx, 100, None, None, 0) as cq: if qp_type == e.IBV_QPT_RAW_PACKET: if not (u.is_eth(self.ctx, self.ib_port) and u.is_root()): raise unittest.SkipTest('To Create RAW QP must be done by root on Ethernet link layer') # Legacy QP qia = u.get_qp_init_attr(cq, self.attr) qia.qp_type = qp_type caps = qia.cap qp = self.create_qp(pd, qia, False, False, self.ib_port) qp_attr, qp_init_attr = qp.query(e.IBV_QP_STATE | e.IBV_QP_CAP) if self.get_node_type() == e.IBV_NODE_RNIC: self.verify_qp_attrs(caps, e.IBV_QPS_INIT, qp_init_attr, qp_attr) else: self.verify_qp_attrs(caps, e.IBV_QPS_RESET, qp_init_attr, qp_attr) # Extended QP qia = get_qp_init_attr_ex(cq, pd, self.attr, self.attr_ex, qp_type) caps = qia.cap # Save them to verify values later qp = self.create_qp(self.ctx, qia, True, False, self.ib_port) qp_attr, qp_init_attr = qp.query(e.IBV_QP_STATE | e.IBV_QP_CAP) if self.get_node_type() == e.IBV_NODE_RNIC: self.verify_qp_attrs(caps, e.IBV_QPS_INIT, qp_init_attr, qp_attr) else: self.verify_qp_attrs(caps, e.IBV_QPS_RESET, qp_init_attr, qp_attr) def test_query_rc_qp(self): """ Queries an RC QP after creation. Verifies that its properties are as expected. """ self.query_qp_common_test(e.IBV_QPT_RC) def test_query_uc_qp(self): """ Queries an UC QP after creation. Verifies that its properties are as expected. """ self.query_qp_common_test(e.IBV_QPT_UC) def test_query_ud_qp(self): """ Queries an UD QP after creation. Verifies that its properties are as expected. """ self.query_qp_common_test(e.IBV_QPT_UD) def test_query_raw_qp(self): """ Queries an RAW Packet QP after creation. Verifies that its properties are as expected. Raw Packet is skipped for non-root users / Infiniband link layer. """ self.query_qp_common_test(e.IBV_QPT_RAW_PACKET) def test_query_data_in_order(self): """ Queries an UD QP data in order after moving it to RTS state. Verifies that the result from the query is valid. 
""" with PD(self.ctx) as pd: with CQ(self.ctx, 100, None, None, 0) as cq: qia = u.get_qp_init_attr(cq, self.attr) qia.qp_type = e.IBV_QPT_UD qp = self.create_qp(pd, qia, False, True, self.ib_port) is_data_in_order = qp.query_data_in_order(e.IBV_WR_SEND) self.assertIn(is_data_in_order, [0, 1], 'Data in order result with flags=0 is not valid') is_data_in_order = qp.query_data_in_order(e.IBV_WR_SEND,e.IBV_QUERY_QP_DATA_IN_ORDER_RETURN_CAPS) valid_results = [0, e.IBV_QUERY_QP_DATA_IN_ORDER_ALIGNED_128_BYTES, e.IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG | e.IBV_QUERY_QP_DATA_IN_ORDER_ALIGNED_128_BYTES] self.assertIn(is_data_in_order, valid_results, 'Data in order result with flags=1 is not valid') @u.skip_unsupported def test_modify_ud_qp(self): """ Queries a UD QP after calling modify(). Verifies that its properties are as expected. """ with PD(self.ctx) as pd: with CQ(self.ctx, 100, None, None, 0) as cq: # Legacy QP qia = u.get_qp_init_attr(cq, self.attr) qia.qp_type = e.IBV_QPT_UD qp = self.create_qp(pd, qia, False, False, self.ib_port) qa = QPAttr() qa.qkey = 0x123 qp.to_init(qa) qp_attr, _ = qp.query(e.IBV_QP_QKEY) assert qp_attr.qkey == qa.qkey, 'Legacy QP, QKey is not as expected' qp.to_rtr(qa) qa.sq_psn = 0x45 qp.to_rts(qa) qp_attr, _ = qp.query(e.IBV_QP_SQ_PSN) assert qp_attr.sq_psn == qa.sq_psn, 'Legacy QP, SQ PSN is not as expected' qa.qp_state = e.IBV_QPS_RESET qp.modify(qa, e.IBV_QP_STATE) assert qp.qp_state == e.IBV_QPS_RESET, 'Legacy QP, QP state is not as expected' # Extended QP qia = get_qp_init_attr_ex(cq, pd, self.attr, self.attr_ex, e.IBV_QPT_UD) qp = self.create_qp(self.ctx, qia, True, False, self.ib_port) qa = QPAttr() qa.qkey = 0x123 qp.to_init(qa) qp_attr, _ = qp.query(e.IBV_QP_QKEY) assert qp_attr.qkey == qa.qkey, 'Extended QP, QKey is not as expected' qp.to_rtr(qa) qa.sq_psn = 0x45 qp.to_rts(qa) qp_attr, _ = qp.query(e.IBV_QP_SQ_PSN) assert qp_attr.sq_psn == qa.sq_psn, 'Extended QP, SQ PSN is not as expected' qa.qp_state = e.IBV_QPS_RESET qp.modify(qa, e.IBV_QP_STATE) assert qp.qp_state == e.IBV_QPS_RESET, 'Extended QP, QP state is not as expected' def get_qp_init_attr_ex(cq, pd, attr, attr_ex, qpt): """ Creates a QPInitAttrEx object with a QP type of the provided array and other random values. 
:param cq: CQ to be used as send and receive CQ :param pd: A PD object to use :param attr: Device attributes for capability checks :param attr_ex: Extended device attributes for capability checks :param qpt: QP type :return: An initialized QPInitAttrEx object """ qia = u.random_qp_init_attr_ex(attr_ex, attr, qpt) qia.send_cq = cq qia.recv_cq = cq qia.pd = pd # Only XRCD can be created without a PD return qia rdma-core-56.1/tests/test_qpex.py000066400000000000000000000325431477342711600170710ustar00rootroot00000000000000import unittest import random import errno from pyverbs.qp import QPCap, QPInitAttrEx, QPAttr, QPEx, QP from pyverbs.pyverbs_error import PyverbsError, PyverbsRDMAError from pyverbs.mr import MW, MWBindInfo from pyverbs.base import inc_rkey from tests.utils import wc_status_to_str import pyverbs.enums as e from tests.base import UDResources, RCResources, RDMATestCase, XRCResources import tests.utils as u def create_qp_ex(agr_obj, qp_type, send_flags): if qp_type == e.IBV_QPT_XRC_SEND: cap = QPCap(max_send_wr=agr_obj.num_msgs, max_recv_wr=0, max_recv_sge=0, max_send_sge=1) else: cap = QPCap(max_send_wr=agr_obj.num_msgs, max_recv_wr=agr_obj.num_msgs, max_recv_sge=1, max_send_sge=1) qia = QPInitAttrEx(cap=cap, qp_type=qp_type, scq=agr_obj.cq, rcq=agr_obj.cq, pd=agr_obj.pd, send_ops_flags=send_flags, comp_mask=e.IBV_QP_INIT_ATTR_PD | e.IBV_QP_INIT_ATTR_SEND_OPS_FLAGS) qp_attr = QPAttr(port_num=agr_obj.ib_port) if qp_type == e.IBV_QPT_UD: qp_attr.qkey = agr_obj.UD_QKEY qp_attr.pkey_index = agr_obj.UD_PKEY_INDEX if qp_type == e.IBV_QPT_RC: qp_attr.qp_access_flags = e.IBV_ACCESS_REMOTE_WRITE | \ e.IBV_ACCESS_REMOTE_READ | \ e.IBV_ACCESS_REMOTE_ATOMIC | \ e.IBV_ACCESS_FLUSH_GLOBAL | \ e.IBV_ACCESS_FLUSH_PERSISTENT try: # We don't have capability bits for this qp = QPEx(agr_obj.ctx, qia, qp_attr) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Extended QP is not supported on this device') raise ex if qp_type != e.IBV_QPT_XRC_SEND: agr_obj.qps.append(qp) agr_obj.qps_num.append(qp.qp_num) agr_obj.psns.append(random.getrandbits(24)) else: return qp class QpExUDSend(UDResources): def create_qps(self): create_qp_ex(self, e.IBV_QPT_UD, e.IBV_QP_EX_WITH_SEND) class QpExRCSend(RCResources): def create_qps(self): create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_SEND) class QpExXRCSend(XRCResources): def create_qps(self): qp_attr = QPAttr(port_num=self.ib_port) qp_attr.pkey_index = 0 for _ in range(self.qp_count): attr_ex = QPInitAttrEx(qp_type=e.IBV_QPT_XRC_RECV, comp_mask=e.IBV_QP_INIT_ATTR_XRCD, xrcd=self.xrcd) qp_attr.qp_access_flags = e.IBV_ACCESS_REMOTE_WRITE | \ e.IBV_ACCESS_REMOTE_READ recv_qp = QP(self.ctx, attr_ex, qp_attr) self.rqp_lst.append(recv_qp) send_qp = create_qp_ex(self, e.IBV_QPT_XRC_SEND, e.IBV_QP_EX_WITH_SEND) self.sqp_lst.append(send_qp) self.qps_num.append((recv_qp.qp_num, send_qp.qp_num)) self.psns.append(random.getrandbits(24)) class QpExUDSendImm(UDResources): def create_qps(self): create_qp_ex(self, e.IBV_QPT_UD, e.IBV_QP_EX_WITH_SEND_WITH_IMM) class QpExRCSendImm(RCResources): def create_qps(self): create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_SEND_WITH_IMM) class QpExXRCSendImm(XRCResources): def create_qps(self): qp_attr = QPAttr(port_num=self.ib_port) qp_attr.pkey_index = 0 for _ in range(self.qp_count): attr_ex = QPInitAttrEx(qp_type=e.IBV_QPT_XRC_RECV, comp_mask=e.IBV_QP_INIT_ATTR_XRCD, xrcd=self.xrcd) qp_attr.qp_access_flags = e.IBV_ACCESS_REMOTE_WRITE | \ e.IBV_ACCESS_REMOTE_READ recv_qp = QP(self.ctx, 
attr_ex, qp_attr) self.rqp_lst.append(recv_qp) send_qp = create_qp_ex(self, e.IBV_QPT_XRC_SEND, e.IBV_QP_EX_WITH_SEND_WITH_IMM) self.sqp_lst.append(send_qp) self.qps_num.append((recv_qp.qp_num, send_qp.qp_num)) self.psns.append(random.getrandbits(24)) class QpExRCFlush(RCResources): ptype = e.IBV_FLUSH_GLOBAL level = e.IBV_FLUSH_RANGE def create_qps(self): create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_FLUSH | e.IBV_QP_EX_WITH_RDMA_WRITE) def create_mr(self): try: self.mr = u.create_custom_mr(self, e.IBV_ACCESS_FLUSH_GLOBAL | e.IBV_ACCESS_REMOTE_WRITE) except PyverbsRDMAError as ex: if ex.error_code == errno.EINVAL: raise unittest.SkipTest('Creating an MR with the IBV_ACCESS_FLUSH_GLOBAL access flag is not supported by the kernel') raise ex class QpExRCAtomicWrite(RCResources): def create_qps(self): create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_ATOMIC_WRITE) def create_mr(self): self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_WRITE) class QpExRCRDMAWrite(RCResources): def create_qps(self): create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_RDMA_WRITE) def create_mr(self): self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_WRITE) class QpExRCRDMAWriteImm(RCResources): def create_qps(self): create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM) def create_mr(self): self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_WRITE) class QpExRCRDMARead(RCResources): def create_qps(self): create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_RDMA_READ) def create_mr(self): self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_READ) class QpExRCAtomicCmpSwp(RCResources): def create_qps(self): create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_ATOMIC_CMP_AND_SWP) self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_ATOMIC) class QpExRCAtomicFetchAdd(RCResources): def create_qps(self): create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD) self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_ATOMIC) class QpExRCBindMw(RCResources): def create_qps(self): create_qp_ex(self, e.IBV_QPT_RC, e.IBV_QP_EX_WITH_RDMA_WRITE | e.IBV_QP_EX_WITH_BIND_MW) def create_mr(self): self.mr = u.create_custom_mr(self, e.IBV_ACCESS_REMOTE_WRITE | e.IBV_ACCESS_MW_BIND) class QpExTestCase(RDMATestCase): """ Run traffic using the new post send API.
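Unlike the legacy flow that posts SendWR objects via ibv_post_send(), the extended QP builds its work requests in place with wr_start()/wr_*()/wr_complete() calls; see the bind-MW test below for an explicit example of that sequence.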
""" def setUp(self): super().setUp() self.iters = 100 def test_qp_ex_ud_send(self): self.create_players(QpExUDSend) u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_SEND) def test_qp_ex_ud_zero_size(self): self.create_players(QpExUDSend) self.client.msg_size = 0 self.server.msg_size = 0 u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_SEND) def test_qp_ex_rc_send(self): self.create_players(QpExRCSend) u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_SEND) def test_qp_ex_xrc_send(self): self.create_players(QpExXRCSend) u.xrc_traffic(self.client, self.server, send_op=e.IBV_WR_SEND) def test_qp_ex_ud_send_imm(self): self.create_players(QpExUDSendImm) u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_SEND_WITH_IMM) def test_qp_ex_rc_send_imm(self): self.create_players(QpExRCSendImm) u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_SEND_WITH_IMM) def test_qp_ex_xrc_send_imm(self): self.create_players(QpExXRCSendImm) u.xrc_traffic(self.client, self.server, send_op=e.IBV_WR_SEND_WITH_IMM) def test_qp_ex_rc_flush(self): self.create_players(QpExRCFlush) wcs = u.flush_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_FLUSH) if wcs[0].status != e.IBV_WC_SUCCESS: raise PyverbsError(f'Unexpected {wc_status_to_str(wcs[0].status)}') self.client.level = e.IBV_FLUSH_MR wcs = u.flush_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_FLUSH) if wcs[0].status != e.IBV_WC_SUCCESS: raise PyverbsError(f'Unexpected {wc_status_to_str(wcs[0].status)}') def test_qp_ex_rc_flush_type_violate(self): self.create_players(QpExRCFlush) self.client.ptype = e.IBV_FLUSH_PERSISTENT wcs = u.flush_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_FLUSH) if wcs[0].status != e.IBV_WC_REM_ACCESS_ERR: raise PyverbsError(f'Expected errors {wc_status_to_str(e.IBV_WC_REM_ACCESS_ERR)} - got {wc_status_to_str(wcs[0].status)}') def test_qp_ex_rc_atomic_write(self): self.create_players(QpExRCAtomicWrite) self.client.msg_size = 8 self.server.msg_size = 8 u.rdma_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_ATOMIC_WRITE) def test_qp_ex_rc_rdma_write(self): self.create_players(QpExRCRDMAWrite) u.rdma_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_RDMA_WRITE) def test_qp_ex_rc_rdma_write_imm(self): self.create_players(QpExRCRDMAWriteImm) u.traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_RDMA_WRITE_WITH_IMM) def test_qp_ex_rc_rdma_write_zero_length(self): self.create_players(QpExRCRDMAWrite) self.client.msg_size = 0 self.server.msg_size = 0 u.rdma_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_RDMA_WRITE) def test_qp_ex_rc_rdma_read(self): self.create_players(QpExRCRDMARead) self.server.mr.write('s' * self.server.msg_size, self.server.msg_size) u.rdma_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_RDMA_READ) def test_qp_ex_rc_rdma_read_zero_size(self): self.create_players(QpExRCRDMARead) self.client.msg_size = 0 self.server.msg_size = 0 self.server.mr.write('s' * self.server.msg_size, self.server.msg_size) u.rdma_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_RDMA_READ) def test_qp_ex_rc_atomic_cmp_swp(self): self.create_players(QpExRCAtomicCmpSwp) self.client.msg_size = 8 # Atomic work on 64b operators self.server.msg_size = 8 self.server.mr.write('s' * 8, 8) u.atomic_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_ATOMIC_CMP_AND_SWP) def test_qp_ex_rc_atomic_fetch_add(self): self.create_players(QpExRCAtomicFetchAdd) self.client.msg_size = 8 # 
Atomic operations work on 64-bit operands self.server.msg_size = 8 self.server.mr.write('s' * 8, 8) u.atomic_traffic(**self.traffic_args, new_send=True, send_op=e.IBV_WR_ATOMIC_FETCH_AND_ADD) def test_qp_ex_rc_bind_mw(self): """ Verify bind memory window operation using the new post_send API. Instead of checking through regular pingpong style traffic, we'll do as follows: - Register an MR with remote write access - Bind an MW without remote write permission to the MR - Verify that remote write fails Since it's a unique flow, it's an integral part of that test rather than a utility method. """ self.create_players(QpExRCBindMw) client_sge = u.get_send_elements(self.client, False)[1] # Create an MW and bind it self.server.qp.wr_start() self.server.qp.wr_id = 0x123 self.server.qp.wr_flags = e.IBV_SEND_SIGNALED bind_info = MWBindInfo(self.server.mr, self.server.mr.buf, self.server.mr.length, e.IBV_ACCESS_LOCAL_WRITE) try: mw = MW(self.server.pd, mw_type=e.IBV_MW_TYPE_2) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Memory Window allocation is not supported') raise ex new_key = inc_rkey(mw.rkey) self.server.qp.wr_bind_mw(mw, new_key, bind_info) self.server.qp.wr_complete() u.poll_cq(self.server.cq) # Verify that remote write fails self.client.qp.wr_start() self.client.qp.wr_id = 0x124 self.client.qp.wr_flags = e.IBV_SEND_SIGNALED self.client.qp.wr_rdma_write(new_key, self.server.mr.buf) self.client.qp.wr_set_sge(client_sge) self.client.qp.wr_complete() wcs = u._poll_cq(self.client.cq) if wcs[0].status != e.IBV_WC_REM_ACCESS_ERR: raise PyverbsRDMAError(f'Completion status is {wc_status_to_str(wcs[0].status)}') def test_post_receive_qp_state_bad_flow(self): self.create_players(QpExUDSend) u.post_rq_state_bad_flow(self) def test_post_send_qp_state_bad_flow(self): self.create_players(QpExUDSend) u.post_sq_state_bad_flow(self) def test_full_rq_bad_flow(self): self.create_players(QpExUDSend) u.full_rq_bad_flow(self) def test_rq_with_larger_sgl_bad_flow(self): self.create_players(QpExUDSend) u.create_rq_with_larger_sgl_bad_flow(self) rdma-core-56.1/tests/test_rdmacm.py000066400000000000000000000060211477342711600173470ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file import unittest import os from tests.rdmacm_utils import CMSyncConnection, CMAsyncConnection from tests.base import RDMATestCase, RDMACMBaseTest from tests.utils import requires_mcast_support import tests.irdma_base as irdma import pyverbs.cm_enums as ce import pyverbs.device as d import pyverbs.enums as e class CMTestCase(RDMACMBaseTest): """ RDMACM Test class. Includes all the native RDMACM functionalities.
""" @staticmethod def get_port_space(): # IPoIB currently is not supported return ce.RDMA_PS_UDP def test_rdmacm_sync_traffic(self): self.two_nodes_rdmacm_traffic(CMSyncConnection, self.rdmacm_traffic) def test_rdmacm_async_traffic(self): # QP ack timeout formula: 4.096 * 2^(ack_timeout) [usec] irdma.skip_if_irdma_dev(d.Context(name=self.dev_name)) self.two_nodes_rdmacm_traffic(CMAsyncConnection, self.rdmacm_traffic, qp_timeout=21) def test_rdmacm_async_reject_traffic(self): self.two_nodes_rdmacm_traffic(CMAsyncConnection, self.rdmacm_traffic, reject_conn=True) @requires_mcast_support() def test_rdmacm_async_multicast_traffic(self): self.two_nodes_rdmacm_traffic(CMAsyncConnection, self.rdmacm_multicast_traffic, port_space=self.get_port_space()) @requires_mcast_support() def test_rdmacm_async_ex_multicast_traffic(self): self.two_nodes_rdmacm_traffic(CMAsyncConnection, self.rdmacm_multicast_traffic, port_space=self.get_port_space(), extended=True) @requires_mcast_support() def test_rdmacm_async_ex_leave_multicast_traffic(self): self.two_nodes_rdmacm_traffic(CMAsyncConnection, self.rdmacm_multicast_traffic, port_space=self.get_port_space(), extended=True, leave_test=True, bad_flow=True) def test_rdmacm_async_traffic_external_qp(self): self.two_nodes_rdmacm_traffic(CMAsyncConnection, self.rdmacm_traffic, with_ext_qp=True) def test_rdmacm_async_udp_traffic(self): self.two_nodes_rdmacm_traffic(CMAsyncConnection, self.rdmacm_traffic, port_space=self.get_port_space(), ib_port=self.ib_port) def test_rdmacm_async_read(self): self.two_nodes_rdmacm_traffic(CMAsyncConnection, self.rdmacm_remote_traffic, remote_op='read') def test_rdmacm_async_write(self): self.two_nodes_rdmacm_traffic(CMAsyncConnection, self.rdmacm_remote_traffic, remote_op='write') rdma-core-56.1/tests/test_relaxed_ordering.py000066400000000000000000000022461477342711600214260ustar00rootroot00000000000000from tests.base import RCResources, UDResources, XRCResources from tests.utils import traffic, xrc_traffic from tests.base import RDMATestCase from pyverbs.mr import MR import pyverbs.enums as e class RoUD(UDResources): def create_mr(self): self.mr = MR(self.pd, self.msg_size + self.GRH_SIZE, e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_RELAXED_ORDERING) class RoRC(RCResources): def create_mr(self): self.mr = MR(self.pd, self.msg_size, e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_RELAXED_ORDERING) class RoXRC(XRCResources): def create_mr(self): self.mr = MR(self.pd, self.msg_size, e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_RELAXED_ORDERING) class RoTestCase(RDMATestCase): def setUp(self): super(RoTestCase, self).setUp() self.iters = 100 def test_ro_rc_traffic(self): self.create_players(RoRC) traffic(**self.traffic_args) def test_ro_ud_traffic(self): self.create_players(RoUD) traffic(**self.traffic_args) def test_ro_xrc_traffic(self): self.create_players(RoXRC) xrc_traffic(self.client, self.server) rdma-core-56.1/tests/test_rss_traffic.py000066400000000000000000000130321477342711600204110ustar00rootroot00000000000000import unittest import random import errno from pyverbs.wq import WQInitAttr, WQAttr, WQ, RwqIndTableInitAttr, RwqIndTable, RxHashConf from tests.utils import requires_root_on_eth, PacketConsts from tests.base import RDMATestCase, PyverbsRDMAError, MLNX_VENDOR_ID, \ CX3_MLNX_PART_ID, CX3Pro_MLNX_PART_ID from pyverbs.qp import QPInitAttrEx, QPEx from tests.test_flow import FlowRes from pyverbs.flow import Flow from pyverbs.cq import CQ import pyverbs.enums as e import tests.utils as u WRS_PER_ROUND = 512 CQS_NUM = 2 TOEPLITZ_KEY_LEN = 40 
HASH_KEY = [0x2c, 0xc6, 0x81, 0xd1, 0x5b, 0xdb, 0xf4, 0xf7, 0xfc, 0xa2, 0x83, 0x19, 0xdb, 0x1a, 0x3e, 0x94, 0x6b, 0x9e, 0x38, 0xd9, 0x2c, 0x9c, 0x03, 0xd1, 0xad, 0x99, 0x44, 0xa7, 0xd9, 0x56, 0x3d, 0x59, 0x06, 0x3c, 0x25, 0xf3, 0xfc, 0x1f, 0xdc, 0x2a] def requires_indirection_table_support(func): def wrapper(instance): dev_attrs = instance.ctx.query_device() vendor_id = dev_attrs.vendor_id vendor_pid = dev_attrs.vendor_part_id if vendor_id == MLNX_VENDOR_ID and vendor_pid in [CX3_MLNX_PART_ID, CX3Pro_MLNX_PART_ID]: raise unittest.SkipTest('WQN must be aligned with the Indirection Table size in CX3') return func(instance) return wrapper class RssRes(FlowRes): def __init__(self, dev_name, ib_port, gid_index, log_ind_tbl_size=3): """ Initialize RSS resources based on Flow resources that include an RSS Raw QP. :param dev_name: Device name to be used :param ib_port: IB port of the device to use :param gid_index: Which GID index to use """ self.log_ind_tbl_size = log_ind_tbl_size self.wqs = [] self.cqs = [] self.ind_table = None super().__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index) def create_cq(self): self.cqs = [CQ(self.ctx, WRS_PER_ROUND) for _ in range(CQS_NUM)] @requires_root_on_eth() def create_qps(self): """ Initializes self.qps with RSS QPs. :return: None """ qp_init_attr = self.create_qp_init_attr() for _ in range(self.qp_count): try: qp = QPEx(self.ctx, qp_init_attr) self.qps.append(qp) self.qps_num.append(qp.qp_num) self.psns.append(random.getrandbits(24)) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest(f'Creating a QPEx of type {qp_init_attr.qp_type} is not supported') raise ex def create_qp_init_attr(self): self.create_ind_table() mask = e.IBV_QP_INIT_ATTR_CREATE_FLAGS | e.IBV_QP_INIT_ATTR_PD | \ e.IBV_QP_INIT_ATTR_RX_HASH | e.IBV_QP_INIT_ATTR_IND_TABLE return QPInitAttrEx(qp_type=e.IBV_QPT_RAW_PACKET, comp_mask=mask, pd=self.pd, hash_conf=self.hash_conf, ind_table=self.ind_tbl) @requires_indirection_table_support def create_ind_table(self): self.ind_tbl = RwqIndTable(self.ctx, self.initiate_table_attr()) self.hash_conf = self.init_rx_hash_config() def initiate_table_attr(self): self.create_wqs() return RwqIndTableInitAttr(self.log_ind_tbl_size, self.wqs) def create_wqs(self): wqias = [self.initiate_wq_attr(cq) for cq in self.cqs] for i in range(1 << self.log_ind_tbl_size): wq = WQ(self.ctx, wqias[i % CQS_NUM]) wq.modify(WQAttr(attr_mask=e.IBV_WQ_ATTR_STATE, wq_state=e.IBV_WQS_RDY)) self.wqs.append(wq) return self.wqs def initiate_wq_attr(self, cq): return WQInitAttr(wq_context=None, wq_pd=self.pd, wq_cq=cq, wq_type=e.IBV_WQT_RQ, max_wr=WRS_PER_ROUND, max_sge=self.ctx.query_device().max_sge, comp_mask=0, create_flags=0) def init_rx_hash_config(self): return RxHashConf(rx_hash_function=e.IBV_RX_HASH_FUNC_TOEPLITZ, rx_hash_key_len=len(HASH_KEY), rx_hash_key=HASH_KEY, rx_hash_fields_mask=e.IBV_RX_HASH_DST_IPV4 | e.IBV_RX_HASH_SRC_IPV4) def _create_flow(self, flow_attr): return [Flow(qp, flow_attr) for qp in self.qps] class RSSTrafficTest(RDMATestCase): """ Test various functionalities of the RSS QPs. """ def setUp(self): super().setUp() self.iters = 1 self.server = None self.client = None def create_players(self): """ Init RSS tests resources. An RSS QP can receive traffic only, so the client will be based on Flow test resources.
""" self.client = FlowRes(**self.dev_info) self.server = RssRes(**self.dev_info) def flow_traffic(self, specs, l3=PacketConsts.IP_V4, l4=PacketConsts.UDP_PROTO): """ Execute raw ethernet traffic with given specs flow. :param specs: List of flow specs to match on the QP :param l3: Packet layer 3 type: 4 for IPv4 or 6 for IPv6 :param l4: Packet layer 4 type: 'tcp' or 'udp' :return: None """ self.flows = self.server.create_flow(specs) u.raw_rss_traffic(self.client, self.server, self.iters, l3, l4, num_packets=32) def test_rss_traffic(self): self.create_players() self.flow_traffic([self.server.create_eth_spec()]) rdma-core-56.1/tests/test_shared_pd.py000066400000000000000000000072201477342711600200370ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2020 Mellanox Technologies, Inc. All rights reserved. See COPYING file """ Test module for Shared PD. """ import unittest import errno import os from tests.test_qpex import QpExRCRDMAWrite from tests.base import RDMATestCase from pyverbs.device import Context from pyverbs.pd import PD from pyverbs.mr import MR import pyverbs.enums as e import tests.utils as u def get_import_res_class(base_class): """ This function creates a class that inherits base_class of any BaseResources type. Its purpose is to behave exactly as base_class does, except for the objects creation, which instead of creating context, PD and MR, it imports them. Hence the returned class must be initialized with (cmd_fd, pd_handle, mr_handle, mr_addr, **kwargs), while kwargs are the arguments needed (if any) for base_class. In addition it has unimport_resources() method which unimprot all the resources and closes the imported PD object. :param base_class: The base resources class to inherit from :return: ImportResources(cmd_fd, pd_handle, mr_handle, mr_addr, **kwargs) class """ class ImportResources(base_class): def __init__(self, cmd_fd, pd_handle, mr_handle, mr_addr=None, **kwargs): self.cmd_fd = cmd_fd self.pd_handle = pd_handle self.mr_handle = mr_handle self.mr_addr = mr_addr super(ImportResources, self).__init__(**kwargs) def create_context(self): try: self.ctx = Context(cmd_fd=self.cmd_fd) except u.PyverbsRDMAError as ex: if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]: raise unittest.SkipTest('Importing a device is not supported') raise ex def create_pd(self): self.pd = PD(self.ctx, handle=self.pd_handle) def create_mr(self): self.mr = MR(self.pd, handle=self.mr_handle, address=self.mr_addr) def unimport_resources(self): self.mr.unimport() self.pd.unimport() self.pd.close() return ImportResources class SharedPDTestCase(RDMATestCase): def setUp(self): super().setUp() self.iters = 10 self.server_res = None self.imported_res = [] def tearDown(self): for res in self.imported_res: res.unimport_resources() super().tearDown() def test_imported_rc_ex_rdma_write(self): setup_params = {'dev_name': self.dev_name, 'ib_port': self.ib_port, 'gid_index': self.gid_index} self.server_res = QpExRCRDMAWrite(**setup_params) cmd_fd_dup = os.dup(self.server_res.ctx.cmd_fd) import_cls = get_import_res_class(QpExRCRDMAWrite) server_import = import_cls( cmd_fd_dup, self.server_res.pd.handle, self.server_res.mr.handle, # The imported MR's address is NULL, so using the address of the # "main" MR object to be able to validate the message self.server_res.mr.buf, **setup_params) self.imported_res.append(server_import) client = QpExRCRDMAWrite(**setup_params) client.pre_run(server_import.psns, server_import.qps_num) 
server_import.pre_run(client.psns, client.qps_num) client.rkey = server_import.mr.rkey server_import.rkey = client.mr.rkey client.raddr = server_import.mr.buf server_import.raddr = client.mr.buf u.rdma_traffic(client, server_import, self.iters, self.gid_index, self.ib_port, send_op=e.IBV_WR_RDMA_WRITE, new_send=True) rdma-core-56.1/tests/test_srq.py000066400000000000000000000050021477342711600167110ustar00rootroot00000000000000import unittest from tests.base import RCResources, RDMATestCase from pyverbs.srq import SrqAttr import pyverbs.enums as e import tests.utils as u class SrqTestCase(RDMATestCase): def setUp(self): super().setUp() self.iters = 100 self.create_players(RCResources, qp_count=2, with_srq=True) def test_rc_srq_traffic(self): """ Test RC traffic with SRQ. """ u.traffic(**self.traffic_args) def test_resize_srq(self): """ Test modify_srq with IBV_SRQ_MAX_WR, which allows modifying max_wr. Once modified, query the SRQ and verify that the new value is greater than or equal to the requested max_wr. """ device_attr = self.server.ctx.query_device() if not device_attr.device_cap_flags & e.IBV_DEVICE_SRQ_RESIZE: raise unittest.SkipTest('SRQ resize is not supported') srq_query_attr = self.server.srq.query() srq_query_max_wr = srq_query_attr.max_wr srq_max_wr = min(device_attr.max_srq_wr, srq_query_max_wr*2) srq_attr = SrqAttr(max_wr=srq_max_wr) self.server.srq.modify(srq_attr, e.IBV_SRQ_MAX_WR) srq_attr_modified = self.server.srq.query() self.assertGreaterEqual(srq_attr_modified.max_wr, srq_attr.max_wr, 'Resize SRQ failed') def test_modify_srq_limit(self): """ Test IBV_SRQ_LIMIT modification. Add 10 WRs to the SRQ and set the limit to 7, then query and verify that the SRQ limit changed to the expected value. Send 4 packets from the client to the server; only 6 WRs then remain in the server's SRQ, so IBV_EVENT_SRQ_LIMIT_REACHED should be generated. Listen for the incoming event and, if one is received, check that it equals IBV_EVENT_SRQ_LIMIT_REACHED, else fail. """ for _ in range(10): self.server.srq.post_recv(u.get_recv_wr(self.server)) srq_modify_attr = SrqAttr(srq_limit=7) self.server.srq.modify(srq_modify_attr, e.IBV_SRQ_LIMIT) server_query = self.server.srq.query() self.assertEqual(srq_modify_attr.srq_limit, server_query.srq_limit, 'Modify SRQ failed') for _ in range(4): c_send_wr, c_sg = u.get_send_elements(self.client, False) u.send(self.client, c_send_wr) u.poll_cq(self.client.cq) u.poll_cq(self.server.cq) event = self.server.ctx.get_async_event() event.ack() self.assertEqual(event.event_type, e.IBV_EVENT_SRQ_LIMIT_REACHED) rdma-core-56.1/tests/test_tag_matching.py000066400000000000000000000350151477342711600205360ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2022 Nvidia, Inc. All rights reserved.
See COPYING file import unittest import errno import time from pyverbs.pyverbs_error import PyverbsError, PyverbsRDMAError from pyverbs.cq import CqInitAttrEx, PollCqAttr, CQEX from pyverbs.srq import SrqInitAttrEx, OpsWr, SRQ from tests.base import RDMATestCase, RCResources from pyverbs.wr import SGE, RecvWR, SendWR from pyverbs.base import PyverbsRDMAErrno from pyverbs.qp import QPAttr, QPCap from pyverbs.mr import MR import pyverbs.enums as e import tests.utils as u TAG_MASK = 0xffff TMH_SIZE = 16 SYNC_WRID = 27 HW_LIMITAION = 33 FIXED_SEND_TAG = 0x1234 # Tag matching header lengths and offsets TM_OPCODE_OFFSET = 0 TM_OPCODE_LENGTH = 1 TM_TAG_OFFSET = 8 TM_TAG_LENGTH = 8 RNDV_VA_OFFSET = 0x10 RNDV_VA_LENGTH = 8 RNDV_RKEY_OFFSET = 0x18 RNDV_RKEY_LENGTH = 4 RNDV_LEN_OFFSET = 0x1c RNDV_LEN_LENGTH = 4 def write_tm_header(mr, tag, tm_opcode): """ Build a tag matching header; the header is written at the base address of the given mr. """ mr.write(int(tm_opcode).to_bytes(1, byteorder='big'), TM_OPCODE_LENGTH, TM_OPCODE_OFFSET) mr.write(int(tag).to_bytes(8, byteorder='big'), TM_TAG_LENGTH, TM_TAG_OFFSET) def write_rndvu_header(player, mr, tag, tm_opcode): """ Build a tag matching header + rendezvous header """ write_tm_header(mr=mr, tag=tag, tm_opcode=tm_opcode) mr.write(int(player.mr.buf).to_bytes(8, byteorder='big'), RNDV_VA_LENGTH, RNDV_VA_OFFSET) mr.write(int(player.mr.rkey).to_bytes(4, byteorder='big'), RNDV_RKEY_LENGTH, RNDV_RKEY_OFFSET) mr.write(int(player.msg_size).to_bytes(4, byteorder='big'), RNDV_LEN_LENGTH, RNDV_LEN_OFFSET) class TMResources(RCResources): def __init__(self, dev_name, ib_port, gid_index, qp_count=1, with_srq=True): self.unexp_cnt = 0 super().__init__(dev_name=dev_name, ib_port=ib_port, gid_index=gid_index, with_srq=with_srq, qp_count=qp_count) if not self.ctx.query_device_ex().tm_caps.flags & e.IBV_TM_CAP_RC: raise unittest.SkipTest("Tag matching is not supported") def create_srq(self): srq_attr = SrqInitAttrEx() srq_attr.comp_mask = e.IBV_SRQ_INIT_ATTR_TYPE | e.IBV_SRQ_INIT_ATTR_PD | \ e.IBV_SRQ_INIT_ATTR_CQ | e.IBV_SRQ_INIT_ATTR_TM srq_attr.srq_type = e.IBV_SRQT_TM srq_attr.pd = self.pd srq_attr.cq = self.cq srq_attr.max_num_tags = self.ctx.query_device_ex().tm_caps.max_num_tags srq_attr.max_ops = 10 self.srq = SRQ(self.ctx, srq_attr) def create_cq(self): cq_init_attr = CqInitAttrEx(wc_flags=e.IBV_WC_EX_WITH_TM_INFO | e.IBV_WC_STANDARD_FLAGS) try: self.cq = CQEX(self.ctx, cq_init_attr) except PyverbsRDMAError as ex: if ex.error_code == errno.EOPNOTSUPP: raise unittest.SkipTest('Extended CQ is not supported') raise ex def create_qp_cap(self): return QPCap(max_send_wr=0, max_send_sge=0, max_recv_wr=0, max_recv_sge=0) if self.with_srq \ else QPCap(max_send_wr=4, max_send_sge=1, max_recv_wr=self.num_msgs, max_recv_sge=4) def create_qp_attr(self): qp_attr = QPAttr(port_num=self.ib_port) qp_attr.qp_access_flags = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_READ | \ e.IBV_ACCESS_REMOTE_WRITE return qp_attr def create_mr(self): access = e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_READ | e.IBV_ACCESS_REMOTE_WRITE self.mr = MR(self.pd, self.msg_size, access=access) class TMTest(RDMATestCase): """ Test various functionalities of tag matching.
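Covers both eager sends, where the payload travels together with the TM header, and rendezvous sends, where the responder RDMA-reads the payload from the initiator, including the SW/HW synchronization required after unexpected messages.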
""" def setUp(self): super().setUp() self.server = None self.client = None self.iters = 10 self.curr_unexpected_cnt = 1 self.create_players(TMResources) self.prepare_to_traffic() def create_players(self, resource): self.client = resource(**self.dev_info, with_srq=False) self.server = resource(**self.dev_info) self.client.pre_run(self.server.psns, self.server.qps_num) self.server.pre_run(self.client.psns, self.client.qps_num) def prepare_to_traffic(self): """ Prepare the TM SRQ for tag matching traffic by posing 33 (hardware limitation) recv WR for fill his queue """ for _ in range(self.server.qp_count): u.post_recv(self.client, u.get_recv_wr(self.client), num_wqes=HW_LIMITAION) u.post_recv(self.server, u.get_recv_wr(self.server), num_wqes=HW_LIMITAION) def get_send_elements(self, tag=0, tm_opcode=e.IBV_TMH_EAGER, tm=True): """ Creates a single SGE and a single Send WR for client QP. The content of the message is 'c' for client side. The function also generates TMH and RVH to the msg :return: Send wr and expected msg that is read from mr """ sge = SGE(self.client.mr.buf, self.client.msg_size, self.client.mr_lkey) if tm_opcode == e.IBV_TMH_RNDV: max_rndv_hdr_size = self.server.ctx.query_device_ex().tm_caps.max_rndv_hdr_size sge.length = max_rndv_hdr_size if max_rndv_hdr_size <= self.server.mr.length else \ self.server.mr.length write_rndvu_header(player=self.client, mr=self.client.mr, tag=tag, tm_opcode=tm_opcode) c_recv_wr = RecvWR(wr_id=tag, sg=[sge], num_sge=1) # Need to post_recv client because the server sends rdma-read request to client u.post_recv(self.client, c_recv_wr) else: msg = self.client.msg_size * 'c' self.client.mr.write(msg, self.client.msg_size) if tm: write_tm_header(mr=self.client.mr, tag=tag, tm_opcode=tm_opcode) send_wr = SendWR(opcode=e.IBV_WR_SEND, num_sge=1, sg=[sge]) exp_msg = self.client.mr.read(self.client.msg_size, 0) return send_wr, exp_msg def get_exp_wc_flags(self, tm_opcode=e.IBV_TMH_EAGER, fixed_send_tag=None): if tm_opcode == e.IBV_TMH_RNDV: return e.IBV_WC_TM_MATCH return 0 if fixed_send_tag else e.IBV_WC_TM_MATCH | e.IBV_WC_TM_DATA_VALID def get_exp_params(self, fixed_send_tag=None, send_tag=0, tm_opcode=e.IBV_TMH_EAGER): wc_flags = self.get_exp_wc_flags(tm_opcode=tm_opcode, fixed_send_tag=fixed_send_tag) return (fixed_send_tag, 0, 0, wc_flags) if fixed_send_tag else \ (send_tag, send_tag, send_tag, wc_flags) def validate_msg(self, actual_msg, expected_msg, msg_size): if actual_msg[0:msg_size] != expected_msg[0:msg_size]: raise PyverbsError(f'Data validation failure: expected {expected_msg}, ' f'received {actual_msg}') def verify_cqe(self, actual_cqe, wr_id=0, opcode=None, wc_flags=0, tag=0, is_server=True): expected_cqe = {'wr_id': wr_id, 'opcode': opcode, 'wc_flags': wc_flags} if is_server: expected_cqe['tag'] = tag for key in expected_cqe: if expected_cqe[key] != actual_cqe[key]: raise PyverbsError(f'CQE validation failure: {key} expected value: ' f'{expected_cqe[key]}, received {actual_cqe[key]}') def validate_exp_recv_params(self, exp_parm, recv_parm, descriptor): if exp_parm != recv_parm: raise PyverbsError(f'{descriptor} validation failure: expected value {exp_parm}, ' f'received {recv_parm}') def poll_cq_ex(self, cqex, is_server=True, to_valid=True): start = time.perf_counter() poll_attr = PollCqAttr() ret = cqex.start_poll(poll_attr) while ret == 2 and (time.perf_counter() - start < u.POLL_CQ_TIMEOUT): ret = cqex.start_poll(poll_attr) if ret != 0: raise PyverbsRDMAErrno('Failed to poll CQ - got a timeout') if cqex.status != e.IBV_WC_SUCCESS: 
            raise PyverbsError(f'Completion status is {cqex.status}')
        actual_cqe_dict = {}
        if to_valid:
            recv_flags = cqex.read_wc_flags()
            recv_opcode = cqex.read_opcode()
            actual_cqe_dict = {'wr_id': cqex.wr_id,
                               'opcode': cqex.read_opcode(),
                               'wc_flags': cqex.read_wc_flags()}
            if is_server:
                actual_cqe_dict['tag'] = cqex.read_tm_info().tag
            if recv_opcode == e.IBV_WC_TM_RECV and not \
                    (recv_flags & (e.IBV_WC_TM_MATCH | e.IBV_WC_TM_DATA_VALID)):
                # When an unexpected tag is received, HW doesn't set these
                # wc_flags. Update the unexpected count; a sync is required.
                self.server.unexp_cnt += 1
                cqex.end_poll()
                self.post_sync()
                return actual_cqe_dict
            if recv_opcode == e.IBV_WC_TM_ADD and (recv_flags & e.IBV_WC_TM_SYNC_REQ):
                # This completion is complemented by the IBV_WC_TM_SYNC_REQ
                # flag, which indicates whether further HW synchronization is
                # needed.
                cqex.end_poll()
                self.post_sync()
                return actual_cqe_dict
        cqex.end_poll()
        return actual_cqe_dict

    def post_sync(self, wr_id=SYNC_WRID):
        """
        Whenever HW deems a message unexpected, tag matching must be disabled
        for new tags until SW and HW synchronize. This synchronization is
        achieved by reporting to HW the number of unexpected messages handled
        by SW (with respect to the current posted tags). When the SW and HW
        are in sync, tag matching resumes normally.
        """
        wr = OpsWr(wr_id=wr_id, opcode=e.IBV_WR_TAG_SYNC,
                   unexpected_cnt=self.server.unexp_cnt, recv_wr_id=wr_id,
                   flags=e.IBV_OPS_SIGNALED | e.IBV_OPS_TM_SYNC)
        self.server.srq.post_srq_ops(wr)
        actual_cqe = self.poll_cq_ex(cqex=self.server.cq)
        self.verify_cqe(actual_cqe=actual_cqe, wr_id=SYNC_WRID, opcode=e.IBV_WC_TM_SYNC)

    def post_recv_tm(self, tag, wrid):
        """
        Create an OpsWr with the requested wr_id and tag and post it to the
        TM SRQ using post_srq_ops(), which posts OpsWr WQEs. The tag is
        registered with mask=TAG_MASK.
        :return: The posted OpsWr
        """
        recv_sge = SGE(self.server.mr.buf, self.server.msg_size, self.server.mr.lkey)
        wr = OpsWr(wr_id=wrid, unexpected_cnt=self.server.unexp_cnt,
                   recv_wr_id=wrid, num_sge=1, tag=tag, mask=TAG_MASK,
                   sg_list=[recv_sge])
        self.server.srq.post_srq_ops(wr)
        return wr

    def build_expected_and_recv_msgs(self, exp_msg, tm_opcode=e.IBV_TMH_EAGER,
                                     fixed_send_tag=None):
        no_tag = tm_opcode == e.IBV_TMH_RNDV or fixed_send_tag
        actual_msg = self.server.mr.read(self.server.msg_size, 0)
        return (actual_msg, exp_msg, self.client.msg_size) if no_tag else \
            (actual_msg.decode(), (self.client.msg_size - TMH_SIZE) * 'c',
             self.client.msg_size - TMH_SIZE)

    def tm_traffic(self, tm_opcode=e.IBV_TMH_EAGER, fixed_send_tag=None):
        """
        Runs tag matching traffic between two sides (server and client).
        :param tm_opcode: The TM opcode in the send WR
        :param fixed_send_tag: If not None, completions are expected to carry
                               no tag
        """
        tags_list = list(range(1, self.iters))
        for recv_tag in tags_list:
            self.post_recv_tm(tag=recv_tag, wrid=recv_tag)
            actual_cqe = self.poll_cq_ex(cqex=self.server.cq)
            self.verify_cqe(actual_cqe=actual_cqe, wr_id=recv_tag, opcode=e.IBV_WC_TM_ADD)
        tags_list.reverse()
        for send_tag in tags_list:
            send_tag, tag_exp, wrid_exp, wc_flags = self.get_exp_params(
                fixed_send_tag=fixed_send_tag, send_tag=send_tag, tm_opcode=tm_opcode)
            send_wr, exp_msg = self.get_send_elements(tag=send_tag, tm_opcode=tm_opcode)
            u.send(self.client, send_wr)
            self.poll_cq_ex(cqex=self.client.cq, to_valid=False)
            actual_cqe = self.poll_cq_ex(cqex=self.server.cq)
            exp_recv_tm_opcode = e.IBV_WC_TM_NO_TAG if tm_opcode == e.IBV_TMH_NO_TAG else \
                e.IBV_WC_TM_RECV
            self.verify_cqe(actual_cqe=actual_cqe, wr_id=wrid_exp, opcode=exp_recv_tm_opcode,
                            wc_flags=wc_flags, tag=tag_exp)
            if tm_opcode == e.IBV_TMH_RNDV:
                actual_cqe = self.poll_cq_ex(cqex=self.client.cq)
                self.verify_cqe(actual_cqe=actual_cqe, opcode=e.IBV_WC_RECV, is_server=False)
                actual_cqe = self.poll_cq_ex(cqex=self.server.cq)
                self.verify_cqe(actual_cqe=actual_cqe, wr_id=wrid_exp, opcode=e.IBV_WC_TM_RECV,
                                wc_flags=e.IBV_WC_TM_DATA_VALID)
            actual_msg, exp_msg, msg_size = self.build_expected_and_recv_msgs(
                exp_msg=exp_msg, tm_opcode=tm_opcode, fixed_send_tag=fixed_send_tag)
            self.validate_msg(actual_msg, exp_msg, msg_size)
            if fixed_send_tag and tm_opcode != e.IBV_TMH_NO_TAG:
                self.validate_exp_recv_params(exp_parm=self.curr_unexpected_cnt,
                                              recv_parm=self.server.unexp_cnt,
                                              descriptor='unexpected_count')
                self.curr_unexpected_cnt += 1
            u.post_recv(self.server, u.get_recv_wr(self.server))

    def test_tm_traffic(self):
        """
        Test basic tag matching traffic: the client sends tagged WRs; the
        server receives and validates them.
        """
        self.tm_traffic()

    def test_tm_unexpected_tag(self):
        """
        Test unexpected tag matching traffic: the client sends unexpectedly
        tagged WRs; the server receives and validates them. Completions are
        expected to carry no tag, and the unexpected_count field of the
        server's TM-SRQ is expected to increase.
        """
        self.tm_traffic(fixed_send_tag=FIXED_SEND_TAG)

    def test_tm_no_tag(self):
        """
        Test NO_TAG tag matching traffic: the client sends WRs with a tag and
        the NO_TAG opcode; the server receives and validates them.
        Completions are expected to carry no tag.
        """
        self.tm_traffic(tm_opcode=e.IBV_TMH_NO_TAG, fixed_send_tag=FIXED_SEND_TAG)

    def test_tm_rndv(self):
        """
        Test rendezvous tag matching traffic: the client sends WRs with a tag
        and the RNDV opcode; the server receives and validates them. Two
        completions are expected for every WR: IBV_WC_TM_MATCH when the tag
        is matched, then IBV_WC_TM_DATA_VALID once the data has been
        transferred.
""" self.tm_traffic(tm_opcode=e.IBV_TMH_RNDV) rdma-core-56.1/tests/utils.py000066400000000000000000002201561477342711600162140ustar00rootroot00000000000000# SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) # Copyright (c) 2019 Mellanox Technologies, Inc. All rights reserved. See COPYING file # Copyright (c) 2020 Intel Corporation. All rights reserved. See COPYING file """ Provide some useful helper function for pyverbs' tests. """ from itertools import combinations as com import errno import subprocess import unittest import random import socket import struct import string import glob import time import os from pyverbs.pyverbs_error import PyverbsError, PyverbsRDMAError, PyverbsUserError from pyverbs.providers.mlx5.mlx5dv import Mlx5Context, Mlx5DVContextAttr from pyverbs.qp import QPCap, QPInitAttr, QPInitAttrEx, QPAttr, QPEx, QP from tests.mlx5_base import Mlx5DcResources, Mlx5DcStreamsRes from tests.base import XRCResources, DCT_KEY, MLNX_VENDOR_ID from pyverbs.addr import AHAttr, AH, GlobalRoute from pyverbs.providers.efa.efadv import EfaCQ from pyverbs.wr import SGE, SendWR, RecvWR from pyverbs.base import PyverbsRDMAErrno from tests.efa_base import SRDResources from pyverbs.cq import PollCqAttr, CQEX from pyverbs.mr import MW, MWBindInfo from pyverbs.mem_alloc import madvise import pyverbs.device as d import pyverbs.enums as e from pyverbs.mr import MR MAX_MR_SIZE = 4194304 # Some HWs limit DM address and length alignment to 4 for read and write # operations. Use a minimal length and alignment that respect that. # For creation purposes use random alignments. As this is log2 of address # alignment, no need for large numbers. MIN_DM_SIZE = 4 DM_ALIGNMENT = 4 MIN_DM_LOG_ALIGN = 0 MAX_DM_LOG_ALIGN = 6 # Raw Packet QP supports TSO header, which creates a larger send WQE. MAX_RAW_PACKET_SEND_WR = 2500 GRH_SIZE = 40 IMM_DATA = 1234 POLL_CQ_TIMEOUT = 10 # In seconds class MatchCriteriaEnable: NONE = 0 OUTER = 1 MISC = 1 << 1 INNER = 1 << 2 MISC_2 = 1 << 3 MISC_3 = 1 << 4 class PacketConsts: """ Class to hold constant packets' values. """ ETHER_HEADER_SIZE = 14 IPV4_HEADER_SIZE = 20 IPV6_HEADER_SIZE = 40 UDP_HEADER_SIZE = 8 TCP_HEADER_SIZE = 20 VLAN_HEADER_SIZE = 4 TCP_HEADER_SIZE_WORDS = 5 IP_V4 = 4 IP_V6 = 6 TCP_PROTO = 'tcp' UDP_PROTO = 'udp' IP_V4_FLAGS = 2 # Don't fragment is set TTL_HOP_LIMIT = 64 IHL = 5 # Hardcoded values for flow matchers ETHER_TYPE_ETH = 0x6558 ETHER_TYPE_IPV4 = 0x800 MAC_MASK = "ff:ff:ff:ff:ff:ff" ETHER_TYPE_IPV6 = 0x86DD SRC_MAC = "24:8a:07:a5:28:c8" # DST mac must be multicast DST_MAC = "01:50:56:19:20:a7" SRC_IP = "1.1.1.1" DST_IP = "2.2.2.2" SRC_PORT = 1234 DST_PORT = 5678 SRC_IP6 = "a0a1::a2a3:a4a5:a6a7:a8a9" DST_IP6 = "b0b1::b2b3:b4b5:b6b7:b8b9" SEQ_NUM = 1 WINDOW_SIZE = 65535 VXLAN_PORT = 4789 VXLAN_VNI = 7777777 VXLAN_FLAGS = 0x8 VXLAN_HEADER_SIZE = 8 VLAN_TPID = 0x8100 VLAN_PRIO = 5 VLAN_CFI = 1 VLAN_ID = 0xc0c GRE_VER = 1 GRE_FLAGS = 2 GRE_KEY = 0x12345678 GENEVE_VNI = 2 GENEVE_OAM = 0 GENEVE_PORT = 6081 BTH_HEADER_SIZE = 16 BTH_OPCODE = 0x81 BTH_DST_QP = 0xd2 BTH_A = 0x1 BTH_PARTITION_KEY = 0xffff BTH_BECN = 1 ROCE_PORT = 4791 def get_mr_length(): """ Provide a random value for MR length. We avoid large buffers as these allocations typically fails. We use random.random() instead of randrange() or randint() due to performance issues when generating very large pseudo random numbers. 
    :return: A random MR length
    """
    return int(MAX_MR_SIZE * random.random())


def filter_illegal_access_flags(element):
    """
    Helper function to filter illegal access flags combinations
    :param element: A list of access flags to check
    :return: True if this list is legal, else False
    """
    if e.IBV_ACCESS_REMOTE_ATOMIC in element or e.IBV_ACCESS_REMOTE_WRITE in element:
        if not e.IBV_ACCESS_LOCAL_WRITE in element:
            return False
    return True


def get_access_flags(ctx):
    """
    Provide an array of random legal access flags for an MR.
    Since remote write and remote atomic require local write permission, if
    one of them is randomly selected without local write, local write will be
    added as well.
    After verifying that the flags selection is legal, it is appended to an
    array, assuming it wasn't previously appended.
    :param ctx: Device Context to check capabilities
    :return: An array of legal values for MR flags
    """
    attr = ctx.query_device()
    attr_ex = ctx.query_device_ex()
    vals = list(e.ibv_access_flags)
    if not attr_ex.odp_caps.general_caps & e.IBV_ODP_SUPPORT:
        vals.remove(e.IBV_ACCESS_ON_DEMAND)
    if not attr.device_cap_flags & e.IBV_DEVICE_MEM_WINDOW:
        vals.remove(e.IBV_ACCESS_MW_BIND)
    if not attr.atomic_caps & e.IBV_ATOMIC_HCA:
        vals.remove(e.IBV_ACCESS_REMOTE_ATOMIC)
    arr = []
    for i in range(1, len(vals)):
        tmp = list(com(vals, i))
        tmp = filter(filter_illegal_access_flags, tmp)
        for t in tmp:  # Iterate legal combinations and bitwise OR them
            val = 0
            for flag in t:
                val += flag.value
            arr.append(val)
    return arr


def get_dmabuf_access_flags(ctx):
    """
    Similar to get_access_flags, except that dma-buf MRs only support a
    subset of the flags.
    :param ctx: Device Context to check capabilities
    :return: An array of legal values for MR flags
    """
    attr = ctx.query_device()
    vals = [e.IBV_ACCESS_LOCAL_WRITE, e.IBV_ACCESS_REMOTE_WRITE,
            e.IBV_ACCESS_REMOTE_READ, e.IBV_ACCESS_REMOTE_ATOMIC,
            e.IBV_ACCESS_RELAXED_ORDERING]
    if not attr.atomic_caps & e.IBV_ATOMIC_HCA:
        vals.remove(e.IBV_ACCESS_REMOTE_ATOMIC)
    arr = []
    for i in range(1, len(vals)):
        tmp = list(com(vals, i))
        tmp = filter(filter_illegal_access_flags, tmp)
        for t in tmp:  # Iterate legal combinations and bitwise OR them
            val = 0
            for flag in t:
                val += flag.value
            arr.append(val)
    return arr


def get_dm_attrs(dm_len):
    """
    Initializes an AllocDmAttr member with the given length and random
    alignment. It currently sets comp_mask = 0 since other comp_mask values
    are not supported.
    :param dm_len: Length of the device memory to allocate
    :return: An initialized AllocDmAttr object
    """
    align = random.randint(MIN_DM_LOG_ALIGN, MAX_DM_LOG_ALIGN)
    return d.AllocDmAttr(dm_len, align, 0)


def sample(coll):
    """
    Returns a random-length subset of the given collection.
    :param coll: The collection to sample
    :return: A subset of <coll>
    """
    return random.sample(coll, int((len(coll) + 1) * random.random()))


def random_qp_cap(attr):
    """
    Initializes a QPCap object with valid values based on the device's
    attributes. It doesn't check the max WR limits since they're reported for
    smaller WR sizes.
    :return: A QPCap object
    """
    # We use significantly smaller values than those in device attributes.
    # The attributes reported by the device don't take into account possible
    # larger WQEs that include e.g. memory window.
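    # e.g. with max_qp_wr=16384 and max_sge=32, the WR depths below are
    # sampled from [1, 2048] and the SGE counts from [1, 16] (illustrative
    # values only).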
    send_wr = random.randint(1, int(attr.max_qp_wr / 8))
    recv_wr = random.randint(1, int(attr.max_qp_wr / 8))
    send_sge = random.randint(1, int(attr.max_sge / 2))
    recv_sge = random.randint(1, int(attr.max_sge / 2))
    return QPCap(send_wr, recv_wr, send_sge, recv_sge)


def random_qp_create_mask(qpt, attr_ex):
    """
    Select a random sublist of ibv_qp_init_attr_mask. Some of the options are
    not yet supported by pyverbs and will not be returned. TSO support is
    checked for the device and the QP type. If it doesn't exist, TSO will not
    be set.
    :param qpt: Current QP type
    :param attr_ex: Extended device attributes for capability checks
    :return: A sublist of ibv_qp_init_attr_mask
    """
    has_tso = attr_ex.tso_caps.max_tso > 0 and \
        attr_ex.tso_caps.supported_qpts & 1 << qpt
    supp_flags = [e.IBV_QP_INIT_ATTR_CREATE_FLAGS,
                  e.IBV_QP_INIT_ATTR_MAX_TSO_HEADER]
    # Either PD or XRCD flag is needed, XRCD is not supported yet
    selected = sample(supp_flags)
    selected.append(e.IBV_QP_INIT_ATTR_PD)
    if e.IBV_QP_INIT_ATTR_MAX_TSO_HEADER in selected and not has_tso:
        selected.remove(e.IBV_QP_INIT_ATTR_MAX_TSO_HEADER)
    mask = 0
    for s in selected:
        mask += s.value
    return mask


def get_create_qp_flags_raw_packet(attr_ex):
    """
    Select random QP creation flags for Raw Packet QP. Filter out unsupported
    flags prior to selection.
    :param attr_ex: Device extended attributes to check capabilities
    :return: A random combination of QP creation flags
    """
    has_fcs = attr_ex.device_cap_flags_ex & e._IBV_DEVICE_RAW_SCATTER_FCS
    has_cvlan = attr_ex.raw_packet_caps & e.IBV_RAW_PACKET_CAP_CVLAN_STRIPPING
    has_padding = attr_ex.device_cap_flags_ex & \
        e._IBV_DEVICE_PCI_WRITE_END_PADDING
    l = list(e.ibv_qp_create_flags)
    l.remove(e.IBV_QP_CREATE_SOURCE_QPN)  # UD only
    if not has_fcs:
        l.remove(e.IBV_QP_CREATE_SCATTER_FCS)
    if not has_cvlan:
        l.remove(e.IBV_QP_CREATE_CVLAN_STRIPPING)
    if not has_padding:
        l.remove(e.IBV_QP_CREATE_PCI_WRITE_END_PADDING)
    flags = sample(l)
    val = 0
    for i in flags:
        val |= i.value
    return val


def random_valid_qp_create_flags(qpt, attr, attr_ex):
    """
    Select a random sublist of ibv_qp_create_flags according to the QP type.
    :param qpt: Current QP type
    :param attr_ex: Used for Raw Packet QP to check device capabilities
    :return: A sublist of ibv_qp_create_flags
    """
    # Most HCAs don't support any create_flags so far, except mlx4/mlx5
    if attr.vendor_id != MLNX_VENDOR_ID:
        return 0
    if qpt == e.IBV_QPT_RAW_PACKET:
        return get_create_qp_flags_raw_packet(attr_ex)
    elif qpt == e.IBV_QPT_UD:
        # IBV_QP_CREATE_SOURCE_QPN is only supported by the mlx5 driver and
        # is not checked in unittests.
        return random.choice([0, 2])  # IBV_QP_CREATE_BLOCK_SELF_MCAST_LB
    else:
        return 0


def random_qp_init_attr_ex(attr_ex, attr, qpt=None):
    """
    Create a random-valued QPInitAttrEx object with the given QP type.
    QP type affects QP capabilities, so allow users to set it and still get
    valid attributes.
    :param attr_ex: Extended device attributes for capability checks
    :param attr: Device attributes for capability checks
    :param qpt: Requested QP type
    :return: A valid initialized QPInitAttrEx object
    """
    max_tso = 0
    if qpt is None:
        qpt = random.choice([e.IBV_QPT_RC, e.IBV_QPT_UC, e.IBV_QPT_UD,
                             e.IBV_QPT_RAW_PACKET])
    qp_cap = random_qp_cap(attr)
    if qpt == e.IBV_QPT_RAW_PACKET and \
            qp_cap.max_send_wr > MAX_RAW_PACKET_SEND_WR:
        qp_cap.max_send_wr = MAX_RAW_PACKET_SEND_WR
    sig = random.randint(0, 1)
    mask = random_qp_create_mask(qpt, attr_ex)
    if mask & e.IBV_QP_INIT_ATTR_CREATE_FLAGS:
        cflags = random_valid_qp_create_flags(qpt, attr, attr_ex)
    else:
        cflags = 0
    if mask & e.IBV_QP_INIT_ATTR_MAX_TSO_HEADER:
        if qpt != e.IBV_QPT_RAW_PACKET:
            mask -= e.IBV_QP_INIT_ATTR_MAX_TSO_HEADER
        else:
            max_tso = \
                random.randint(16, int(attr_ex.tso_caps.max_tso / 800))
    qia = QPInitAttrEx(qp_type=qpt, cap=qp_cap, sq_sig_all=sig,
                       comp_mask=mask, create_flags=cflags,
                       max_tso_header=max_tso)
    if mask & e.IBV_QP_INIT_ATTR_MAX_TSO_HEADER:
        # TSO increases send WQE size, let's be on the safe side
        qia.cap.max_send_sge = 2
    return qia


def get_qp_init_attr(cq, attr):
    """
    Creates a QPInitAttr object with random capability and signaling values.
    :param cq: CQ to be used as send and receive CQ
    :param attr: Device attributes for capability checks
    :return: An initialized QPInitAttr object
    """
    qp_cap = random_qp_cap(attr)
    sig = random.randint(0, 1)
    return QPInitAttr(scq=cq, rcq=cq, cap=qp_cap, sq_sig_all=sig)


def wc_status_to_str(status):
    try:
        return \
            {0: 'Success', 1: 'Local length error',
             2: 'local QP operation error', 3: 'Local EEC operation error',
             4: 'Local protection error', 5: 'WR flush error',
             6: 'Memory window bind error', 7: 'Bad response error',
             8: 'Local access error', 9: 'Remote invalidate request error',
             10: 'Remote access error', 11: 'Remote operation error',
             12: 'Retry exceeded', 13: 'RNR retry exceeded',
             14: 'Local RDD violation error',
             15: 'Remote invalidate RD request error',
             16: 'Remote abort error', 17: 'Invalidate EECN error',
             18: 'Invalidate EEC state error', 19: 'Fatal error',
             20: 'Response timeout error', 21: 'General error'}[status]
    except KeyError:
        return 'Unknown WC status ({s})'.format(s=status)


def create_custom_mr(agr_obj, additional_access_flags=0, size=None, user_addr=None):
    """
    Creates a memory region using the aggregation object's PD.
    If size is None, the agr_obj's message size is used to set the MR's size.
    The access flags are local write and the additional_access_flags.
    :param agr_obj: The aggregation object that creates the MR
    :param additional_access_flags: Additional access flags to set in the MR
    :param size: MR's length. If None, agr_obj.msg_size is used.
    :param user_addr: The MR's buffer address. If None, the buffer will be
                      allocated by pyverbs.
    """
    mr_length = size if size else agr_obj.msg_size
    try:
        return MR(agr_obj.pd, mr_length,
                  e.IBV_ACCESS_LOCAL_WRITE | additional_access_flags,
                  address=user_addr)
    except PyverbsRDMAError as ex:
        if ex.error_code == errno.EOPNOTSUPP:
            raise unittest.SkipTest(f'Create custom mr with additional access flags {additional_access_flags} is not supported')
        raise ex


# Traffic helpers

def get_send_elements(agr_obj, is_server, opcode=e.IBV_WR_SEND):
    """
    Creates a single SGE and a single Send WR for agr_obj's QP type. The
    content of the message is either 's' for server side or 'c' for client
    side.
:param agr_obj: Aggregation object which contains all resources necessary :param is_server: Indicates whether this is server or client side :return: send wr and its SGE """ if hasattr(agr_obj, 'use_mixed_mr') and agr_obj.use_mixed_mr: return get_send_elements_mixed_mr(agr_obj, is_server, opcode) if opcode == e.IBV_WR_ATOMIC_WRITE: atomic_wr = agr_obj.msg_size * (b's' if is_server else b'c') return None, atomic_wr qp_type = agr_obj.sqp_lst[0].qp_type if isinstance(agr_obj, XRCResources) \ else agr_obj.qp.qp_type offset = GRH_SIZE if qp_type == e.IBV_QPT_UD else 0 msg = (agr_obj.msg_size + offset) * ('s' if is_server else 'c') agr_obj.mem_write(msg, agr_obj.msg_size + offset) sge = SGE(agr_obj.mr.buf + offset, agr_obj.msg_size, agr_obj.mr_lkey) send_wr = SendWR(opcode=opcode, num_sge=1, sg=[sge]) if opcode in [e.IBV_WR_RDMA_WRITE, e.IBV_WR_RDMA_WRITE_WITH_IMM, e.IBV_WR_RDMA_READ]: send_wr.set_wr_rdma(int(agr_obj.rkey), int(agr_obj.raddr)) return send_wr, sge def get_send_elements_mixed_mr(agr_obj, is_server, opcode=e.IBV_WR_SEND): """ Creates 2 SGEs and a single Send WR for agr_obj's QP type. There are 2 messages, one for each MR. The content of the message is either 's' for server side or 'c' for client side. :param agr_obj: Aggregation object which contains all resources necessary :param is_server: Indicates whether this is server or client side :param opcode: send WR opcode :return: send wr and its SG list """ msg = (agr_obj.msg_size) * ('s' if is_server else 'c') agr_obj.mr.write(msg, agr_obj.msg_size) agr_obj.non_odp_mr.write(msg, agr_obj.msg_size) sge1 = SGE(agr_obj.mr.buf, agr_obj.msg_size, agr_obj.mr.lkey) sge2 = SGE(agr_obj.non_odp_mr.buf, agr_obj.msg_size, agr_obj.non_odp_mr.lkey) send_wr = SendWR(opcode=opcode, num_sge=2, sg=[sge1, sge2]) return send_wr, [sge1, sge2] def get_recv_wr(agr_obj): """ Creates a single SGE Recv WR for agr_obj's QP type. In case of mixed MRs, creates 2 SGEs accordingly. :param agr_obj: Aggregation object which contains all resources necessary :return: recv wr """ qp_type = agr_obj.rqp_lst[0].qp_type if isinstance(agr_obj, XRCResources) \ else agr_obj.qp.qp_type if isinstance(agr_obj.qp, QP) else None mr = agr_obj.mr length = agr_obj.msg_size + GRH_SIZE if qp_type == e.IBV_QPT_UD \ else agr_obj.msg_size recv_sgl = [SGE(mr.buf, length, mr.lkey)] if hasattr(agr_obj, 'use_mixed_mr') and agr_obj.use_mixed_mr: sec_mr = agr_obj.non_odp_mr recv_sgl.append(SGE(sec_mr.buf,length,sec_mr.lkey)) return RecvWR(sg=recv_sgl, num_sge=len(recv_sgl)) def get_global_ah(agr_obj, gid_index, port): gr = GlobalRoute(dgid=agr_obj.ctx.query_gid(port, gid_index), sgid_index=gid_index) ah_attr = AHAttr(port_num=port, is_global=1, gr=gr, dlid=agr_obj.port_attr.lid) return AH(agr_obj.pd, attr=ah_attr) def get_global_route(ctx, gid_index=0, port_num=1): """ Queries the provided Context's gid and creates a GlobalRoute object with sgid_index and the queried GID as dgid. :param ctx: Context object to query :param gid_index: GID index to query and use. Default: 0, as it's always valid :param port_num: Number of the port to query. 
                     Default: 1
    :return: GlobalRoute object
    """
    if ctx.query_port(port_num).gid_tbl_len == 0:
        raise unittest.SkipTest(f'Not supported without GID table')
    gid = ctx.query_gid(port_num, gid_index)
    gr = GlobalRoute(dgid=gid, sgid_index=gid_index)
    return gr


def xrc_post_send(agr_obj, qp_num, send_object, send_op=None):
    agr_obj.qps = agr_obj.sqp_lst
    if send_op:
        post_send_ex(agr_obj, send_object, send_op)
    else:
        post_send(agr_obj, send_object)


def post_send_ex(agr_obj, send_object, send_op=None, qp_idx=0, ah=None, **kwargs):
    qp = agr_obj.qps[qp_idx]
    qp_type = qp.qp_type
    qp.wr_start()
    qp.wr_id = 0x123
    qp.wr_flags = e.IBV_SEND_SIGNALED
    if send_op == e.IBV_WR_SEND:
        qp.wr_send()
    elif send_op == e.IBV_WR_RDMA_WRITE:
        qp.wr_rdma_write(agr_obj.rkey, agr_obj.raddr)
    elif send_op == e.IBV_WR_SEND_WITH_IMM:
        qp.wr_send_imm(IMM_DATA)
    elif send_op == e.IBV_WR_RDMA_WRITE_WITH_IMM:
        qp.wr_rdma_write_imm(agr_obj.rkey, agr_obj.raddr, IMM_DATA)
    elif send_op == e.IBV_WR_ATOMIC_WRITE:
        qp.wr_atomic_write(agr_obj.rkey, agr_obj.raddr, send_object)
    elif send_op == e.IBV_WR_FLUSH:
        qp.wr_flush(agr_obj.rkey, agr_obj.raddr, agr_obj.msg_size,
                    agr_obj.ptype, agr_obj.level)
    elif send_op == e.IBV_WR_RDMA_READ:
        qp.wr_rdma_read(agr_obj.rkey, agr_obj.raddr)
    elif send_op == e.IBV_WR_ATOMIC_CMP_AND_SWP:
        cmp_add = kwargs.get('cmp_add')
        swp = kwargs.get('swap')
        qp.wr_atomic_cmp_swp(agr_obj.rkey, agr_obj.raddr,
                             int8b_from_int(cmp_add), int8b_from_int(swp))
    elif send_op == e.IBV_WR_ATOMIC_FETCH_AND_ADD:
        cmp_add = kwargs.get('cmp_add')
        qp.wr_atomic_fetch_add(agr_obj.rkey, agr_obj.raddr,
                               int8b_from_int(cmp_add))
    elif send_op == e.IBV_WR_BIND_MW:
        bind_info = MWBindInfo(agr_obj.mr, agr_obj.mr.buf, agr_obj.mr.rkey,
                               e.IBV_ACCESS_REMOTE_WRITE)
        mw = MW(agr_obj.pd, mw_type=e.IBV_MW_TYPE_2)
        # A new rkey needs to be set into bind_info, so modify the rkey
        qp.wr_bind_mw(mw, agr_obj.mr.rkey + 12, bind_info)
        qp.wr_send()
    if qp_type == e.IBV_QPT_UD:
        qp.wr_set_ud_addr(ah, agr_obj.rqps_num[qp_idx], agr_obj.UD_QKEY)
    if isinstance(agr_obj, SRDResources):
        qp.wr_set_ud_addr(ah, agr_obj.rqps_num[qp_idx], agr_obj.SRD_QKEY)
    if qp_type == e.IBV_QPT_XRC_SEND:
        qp.wr_set_xrc_srqn(agr_obj.remote_srqn)
    if hasattr(agr_obj, 'remote_dct_num'):
        if isinstance(agr_obj, Mlx5DcStreamsRes):
            stream_id = agr_obj.generate_stream_id(qp_idx)
            agr_obj.check_bad_flow(qp_idx)
            qp.wr_set_dc_addr_stream(ah, agr_obj.remote_dct_num, DCT_KEY,
                                     stream_id)
        else:
            qp.wr_set_dc_addr(ah, agr_obj.remote_dct_num, DCT_KEY)
    if send_op != e.IBV_WR_ATOMIC_WRITE and \
            send_op != e.IBV_WR_FLUSH:
        qp.wr_set_sge(send_object)
    qp.wr_complete()


def post_send(agr_obj, send_wr, qp_idx=0, ah=None, is_imm=False):
    """
    Post a single send WR to the QP. Post_send's second parameter (send bad
    wr) is ignored for simplicity. For UD traffic an address vector is added
    as well.
    :param agr_obj: aggregation object which contains all resources necessary
    :param send_wr: Send work request to post send
    :param qp_idx: QP index to use
    :param ah: The destination address handle
    :param is_imm: If True, send with imm_data, relevant for old post send API
    :return: None
    """
    qp_type = agr_obj.qp.qp_type
    if is_imm:
        send_wr.imm_data = socket.htonl(IMM_DATA)
    if qp_type == e.IBV_QPT_UD:
        send_wr.set_wr_ud(ah, agr_obj.rqps_num[qp_idx], agr_obj.UD_QKEY)
    if isinstance(agr_obj, SRDResources):
        send_wr.set_wr_ud(ah, agr_obj.rqps_num[qp_idx], agr_obj.SRD_QKEY)
    agr_obj.qps[qp_idx].post_send(send_wr, None)


def post_recv(agr_obj, recv_wr, qp_idx=0, num_wqes=1):
    """
    Call the QP's post_recv() method <num_wqes> times.
    Post_recv's second parameter (recv bad wr) is ignored for simplicity.
    :param recv_wr: Receive work request to post
    :param qp_idx: QP index which posts receive work request
    :param num_wqes: Number of WQEs to post
    :return: None
    """
    receive_queue = agr_obj.srq if agr_obj.srq else agr_obj.qps[qp_idx]
    for _ in range(num_wqes):
        if isinstance(receive_queue, QPEx) and receive_queue.ind_table:
            for wq in receive_queue.ind_table.wqs:
                wq.post_recv(recv_wr, None)
        else:
            receive_queue.post_recv(recv_wr, None)


def _poll_cq(cq, count=1, data=None):
    """
    Poll completions from the CQ.
    Note: This function calls the blocking poll() method of the CQ
    until <count> completions were received. Alternatively, gets a single CQ
    event when events are used.
    :param cq: CQ to poll from
    :param count: How many completions to poll
    :param data: In case of a work request with immediate, the immediate data
                 to be compared after poll
    :return: An array of work completions of length <count>, None when events
             are used
    """
    wcs = []
    channel = cq.comp_channel
    start_poll_t = time.perf_counter()
    while count > 0 and (time.perf_counter() - start_poll_t < POLL_CQ_TIMEOUT):
        if channel:
            channel.get_cq_event(cq)
            cq.req_notify()
        nc, tmp_wcs = cq.poll(count)
        for wc in tmp_wcs:
            if wc.status != e.IBV_WC_SUCCESS:
                wcs.append(wc)
                return wcs
            if data:
                if wc.wc_flags & e.IBV_WC_WITH_IMM == 0:
                    raise PyverbsRDMAError('Completion without immediate')
                assert socket.ntohl(wc.imm_data) == data
        count -= nc
        wcs.extend(tmp_wcs)
    if count > 0:
        raise PyverbsError(f'Got timeout on polling ({count} CQEs remaining)')
    return wcs


def poll_cq(cq, count=1, data=None):
    """
    Poll completions from the CQ.
    Note: This function calls the blocking poll() method of the CQ
    until <count> completions were received. Alternatively, gets a single CQ
    event when events are used.
    :param cq: CQ to poll from
    :param count: How many completions to poll
    :param data: In case of a work request with immediate, the immediate data
                 to be compared after poll
    :return: An array of work completions of length <count>, None when events
             are used
    """
    wcs = _poll_cq(cq, count, data)
    if wcs[0].status != e.IBV_WC_SUCCESS:
        raise PyverbsRDMAError(f'Completion status is {wc_status_to_str(wcs[0].status)}')
    return wcs


def poll_cq_ex(cqex, count=1, data=None, sgid=None):
    """
    Poll completions from the extended CQ.
    :param cqex: CQEX to poll from
    :param count: How many completions to poll
    :param data: In case of a work request with immediate, the immediate data
                 to be compared after poll, either a list of immediate data
                 or one value
    :param sgid: In case of EFA receive completion, the sgid to be compared
                 after poll
    :return: WR id order received
    """
    wr_id_order = []
    iters = count
    try:
        start_poll_t = time.perf_counter()
        poll_attr = PollCqAttr()
        ret = cqex.start_poll(poll_attr)
        while ret == 2 and (time.perf_counter() - start_poll_t < POLL_CQ_TIMEOUT):
            ret = cqex.start_poll(poll_attr)
        if ret != 0:
            raise PyverbsRDMAErrno('Failed to poll CQ')
        wr_id_order.append(cqex.wr_id)
        count -= 1
        if cqex.status != e.IBV_WC_SUCCESS:
            raise PyverbsRDMAErrno('Completion status is {s}'.
                                   format(s=cqex.status))
        if data:
            imm_data = data if not isinstance(data, list) else data[iters - count - 1]
            assert imm_data == socket.ntohl(cqex.read_imm_data())
        if isinstance(cqex, EfaCQ):
            if sgid is not None and cqex.read_opcode() == e.IBV_WC_RECV:
                assert sgid.gid == cqex.read_sgid().gid
        # Now poll the rest of the packets
        while count > 0 and (time.perf_counter() - start_poll_t < POLL_CQ_TIMEOUT):
            ret = cqex.poll_next()
            while ret == 2:
                ret = cqex.poll_next()
            if ret != 0:
                raise PyverbsRDMAErrno('Failed to poll CQ')
            if cqex.status != e.IBV_WC_SUCCESS:
                raise PyverbsRDMAErrno('Completion status is {s}'.
                                       format(s=cqex.status))
            count -= 1
            wr_id_order.append(cqex.wr_id)
            if data:
                imm_data = data if not isinstance(data, list) else data[iters - count - 1]
                assert imm_data == socket.ntohl(cqex.read_imm_data())
            if isinstance(cqex, EfaCQ):
                if sgid is not None and cqex.read_opcode() == e.IBV_WC_RECV:
                    assert sgid.gid == cqex.read_sgid().gid
        if count > 0:
            raise PyverbsError(f'Got timeout on polling ({count} CQEs remaining)')
    finally:
        cqex.end_poll()
    return wr_id_order


def validate(received_str, is_server, msg_size):
    """
    Validates the received buffer against the expected result.
    The application should set client's send buffer to 'c's and the server's
    send buffer to 's's.
    If the expected buffer is different than the actual, an exception will
    be raised.
    :param received_str: The received buffer to check
    :param is_server: Indicates whether this is the server (receiver) or
                      client side
    :param msg_size: the message size of the received packet
    :return: None
    """
    expected_str = msg_size * ('c' if is_server else 's')
    received_str = received_str.decode()
    if received_str[0:msg_size] == expected_str[0:msg_size]:
        return
    else:
        raise PyverbsError(
            'Data validation failure: expected {exp}, received {rcv}'.
            format(exp=expected_str, rcv=received_str))


def send(agr_obj, send_object, send_op=None, new_send=False, qp_idx=0,
         ah=None, is_imm=False, **kwargs):
    if isinstance(agr_obj, XRCResources):
        agr_obj.qps = agr_obj.sqp_lst
    if new_send:
        return post_send_ex(agr_obj, send_object, send_op, qp_idx, ah, **kwargs)
    return post_send(agr_obj, send_object, qp_idx, ah, is_imm)


def traffic_poll_at_once(test, msg_size, iterations=10, opcode=e.IBV_WR_SEND):
    """
    Execute traffic between two peers <iterations> times:
    - Create receive resources and then post recv
    - Create send resources and then post send
    - Poll all client and server CQEs at once
    - Compare the received buffer with the expected one and validate
    :param test: Test object with client and server.
    :param msg_size: Size of a packet.
    :param iterations: Number of packets to send/recv.
    :param opcode: Send opcode. Currently it supports SEND and
                   RDMA_WRITE_WITH_IMM opcodes only.
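    A minimal usage sketch (assuming test holds connected client/server
    resources, e.g. as created by RDMATestCase.create_players()) would be:

        traffic_poll_at_once(test, msg_size=1024, iterations=10,
                             opcode=e.IBV_WR_RDMA_WRITE_WITH_IMM)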
""" imm_data_exp = [] for i in range(iterations): recv_sge = SGE(test.server.mr.buf + (msg_size * i), msg_size, test.server.mr.lkey) recv_wr = RecvWR(sg=[recv_sge], num_sge=1, wr_id=i) test.server.mr.write(str(iterations - i - 1) * msg_size, msg_size, i * msg_size) test.server.qp.post_recv(recv_wr) for i in range(iterations): send_sge = SGE(test.client.mr.buf + (msg_size * i), msg_size, test.client.mr.lkey) send_wr = SendWR(opcode=opcode, num_sge=1, sg=[send_sge], wr_id=i) if opcode == e.IBV_WR_RDMA_WRITE_WITH_IMM: send_wr.imm_data = socket.htonl(i + iterations) imm_data_exp.append(i + iterations) send_wr.set_wr_rdma(int(test.client.rkey), int(test.client.raddr + i * msg_size)) test.client.mr.write(str(i % 10) * msg_size, msg_size, i * msg_size) test.client.qp.post_send(send_wr) send_id_order = poll_cq_ex(test.client.cq, iterations) recv_id_order = poll_cq_ex(test.server.cq, iterations, data=imm_data_exp) # With opcodes that don't consume Recv WQEs (e.g. WRITE and WRITE_WITH_IMM) the buffer # scattering offsets are defined by the sender. recv_id_order = send_id_order if opcode == e.IBV_WR_RDMA_WRITE_WITH_IMM else recv_id_order for i, j in zip(recv_id_order, send_id_order): exp_buff = bytearray((str(j % 10) * msg_size), 'utf-8') recv_buff = test.server.mr.read(msg_size, i * msg_size) if recv_buff != exp_buff: raise PyverbsRDMAError(f'Data validation failed: expected {exp_buff},' f' received {recv_buff}') def traffic(client, server, iters, gid_idx, port, is_cq_ex=False, send_op=e.IBV_WR_SEND, new_send=False, force_page_faults=False): """ Runs basic traffic between two sides :param client: client side, clients base class is BaseTraffic :param server: server side, servers base class is BaseTraffic :param iters: number of traffic iterations :param gid_idx: local gid index :param port: IB port :param is_cq_ex: If True, use poll_cq_ex() rather than poll_cq() :param send_op: The send_wr opcode. :param new_send: If True use new post send API. :param force_page_faults: If True, use madvise to hint that we don't need the MR's buffer to force page faults (useful for ODP testing). 
:return: """ if is_datagram_qp(client): ah_client = get_global_ah(client, gid_idx, port) ah_server = get_global_ah(server, gid_idx, port) else: ah_client = None ah_server = None poll = poll_cq_ex if is_cq_ex else poll_cq imm_data = None if send_op in [e.IBV_WR_SEND_WITH_IMM, e.IBV_WR_RDMA_WRITE_WITH_IMM]: imm_data = IMM_DATA s_recv_wr = get_recv_wr(server) c_recv_wr = get_recv_wr(client) for qp_idx in range(server.qp_count): # prepare the receive queue with RecvWR post_recv(client, c_recv_wr, qp_idx=qp_idx) post_recv(server, s_recv_wr, qp_idx=qp_idx) read_offset = GRH_SIZE if client.qp.qp_type == e.IBV_QPT_UD else 0 for _ in range(iters): for qp_idx in range(server.qp_count): if force_page_faults: madvise(client.mr.buf, client.msg_size) madvise(server.mr.buf, server.msg_size) c_send_wr, c_sg = get_send_elements(client, False, send_op) if client.use_mr_prefetch: flags = e._IBV_ADVISE_MR_FLAG_FLUSH if client.use_mr_prefetch == 'async': flags = 0 prefetch_mrs(client, [c_sg], advice=client.prefetch_advice, flags=flags) c_send_object = c_sg if new_send else c_send_wr send(client, c_send_object, send_op, new_send, qp_idx, ah_client, is_imm=(imm_data != None)) poll(client.cq) poll(server.cq, data=imm_data) post_recv(server, s_recv_wr, qp_idx=qp_idx) msg_received_list = get_msg_received(server, read_offset) for msg in msg_received_list: validate(msg, True, server.msg_size) s_send_wr, s_sg = get_send_elements(server, True, send_op) if server.use_mr_prefetch: flags = e._IBV_ADVISE_MR_FLAG_FLUSH if server.use_mr_prefetch == 'async': flags = 0 prefetch_mrs(server, [s_sg], advice=server.prefetch_advice, flags=flags) s_send_object = s_sg if new_send else s_send_wr send(server, s_send_object, send_op, new_send, qp_idx, ah_server, is_imm=(imm_data != None)) poll(server.cq) poll(client.cq, data=imm_data) post_recv(client, c_recv_wr, qp_idx=qp_idx) msg_received_list = get_msg_received(client,read_offset) for msg in msg_received_list: validate(msg, False, server.msg_size) def get_msg_received(agr_obj, read_offset): msg_received_list = [agr_obj.mr.read(agr_obj.msg_size, read_offset)] if hasattr(agr_obj, 'use_mixed_mr') and agr_obj.use_mixed_mr: msg_received_list.append(agr_obj.non_odp_mr.read(agr_obj.msg_size, read_offset)) return msg_received_list def gen_ethernet_header(dst_mac=PacketConsts.DST_MAC, src_mac=PacketConsts.SRC_MAC, ether_type=PacketConsts.ETHER_TYPE_IPV4): """ Generates Ethernet header using the values from the PacketConst class by default. :param dst_mac: Destination mac address :param src_mac: Source mac address :param ether_type: Ether type of next header :return: Ethernet header """ header = struct.pack('!6s6s', bytes.fromhex(dst_mac.replace(':', '')), bytes.fromhex(src_mac.replace(':', ''))) header += ether_type.to_bytes(2, 'big') return header def gen_ipv4_header(packet_len, next_proto=socket.IPPROTO_UDP, src_ip=PacketConsts.SRC_IP, dst_ip=PacketConsts.DST_IP): """ Generates IPv4 header using the values from the PacketConst class by default. 
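    For example, gen_ipv4_header(packet_len=28) would return a 20-byte
    header whose total-length field is 48 (the 28 following bytes plus
    IPV4_HEADER_SIZE).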
    :param packet_len: Length of all fields following the IP header
    :param next_proto: Protocol type of next header
    :param src_ip: Source IP address
    :param dst_ip: Destination IP address
    :return: IPv4 header
    """
    ip_total_len = packet_len + PacketConsts.IPV4_HEADER_SIZE
    return struct.pack('!2B3H2BH4s4s', (PacketConsts.IP_V4 << 4) +
                       PacketConsts.IHL, 0, ip_total_len, 0,
                       PacketConsts.IP_V4_FLAGS << 13,
                       PacketConsts.TTL_HOP_LIMIT, next_proto, 0,
                       socket.inet_aton(src_ip), socket.inet_aton(dst_ip))


def gen_udp_header(packet_len, src_port=PacketConsts.SRC_PORT,
                   dst_port=PacketConsts.DST_PORT):
    """
    Generates UDP header using the values from the PacketConst class by
    default.
    :param packet_len: Length of all fields following the UDP header
    :param src_port: Source port
    :param dst_port: Destination port
    :return: UDP header
    """
    udp_total_len = packet_len + PacketConsts.UDP_HEADER_SIZE
    return struct.pack('!4H', src_port, dst_port, udp_total_len, 0)


def gen_gre_header(ether_type=PacketConsts.ETHER_TYPE_IPV4):
    """
    Generates GRE header using the values from the PacketConst class by
    default.
    :param ether_type: Ether type of tunneled next header
    :return: GRE header
    """
    return struct.pack('!2BHI', PacketConsts.GRE_FLAGS << 4,
                       PacketConsts.GRE_VER, ether_type, PacketConsts.GRE_KEY)


def gen_vxlan_header():
    """
    Generates VXLAN header using the values from the PacketConst class by
    default.
    :return: VXLAN header
    """
    return struct.pack('!II', PacketConsts.VXLAN_FLAGS << 24,
                       PacketConsts.VXLAN_VNI << 8)


def gen_geneve_header(vni=PacketConsts.GENEVE_VNI, oam=PacketConsts.GENEVE_OAM,
                      proto=PacketConsts.ETHER_TYPE_ETH):
    """
    Generates Geneve header using the values from the PacketConst class by
    default.
    :param vni: Geneve VNI
    :param oam: Geneve OAM
    :param proto: Ether type of next header inside the tunnel
    :return: Geneve header
    """
    return struct.pack('!BBHL', (0 << 6) + 0, (oam << 7) + (0 << 6) + 0,
                       proto, (vni << 8) + 0)


def gen_bth_header(opcode=PacketConsts.BTH_OPCODE,
                   dst_qp=PacketConsts.BTH_DST_QP, a=PacketConsts.BTH_A):
    """
    Generates RoCE BTH header using the values from the PacketConst class by
    default.
    :param opcode: BTH opcode
    :param dst_qp: BTH destination QP
    :param a: BTH acknowledgment bit
    :return: RoCE BTH header
    """
    return struct.pack('!2BH2BH2L', opcode, 0, PacketConsts.BTH_PARTITION_KEY,
                       PacketConsts.BTH_BECN << 6, dst_qp >> 16,
                       dst_qp & 0xffff, a << 31, 0)


def gen_packet(msg_size, l3=PacketConsts.IP_V4, l4=PacketConsts.UDP_PROTO,
               with_vlan=False, **kwargs):
    """
    Generates an Eth | IPv4 or IPv6 | UDP or TCP packet with hardcoded values
    in the headers and a constant payload.
    :param msg_size: total packet size
    :param l3: Packet layer 3 type: 4 for IPv4 or 6 for IPv6
    :param l4: Packet layer 4 type: 'tcp' or 'udp'
    :param with_vlan: if True add VLAN header to the packet
    :param kwargs: Arguments:
        * *src_mac*
            Source MAC address to use in the packet.
        * *src_ipv4*
            Source IPv4 address to use in the packet.
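    For example, gen_packet(64) would build a 64-byte Eth | IPv4 | UDP frame
    with the hardcoded PacketConsts addresses and a 22-byte payload
    (64 - 14 - 20 - 8).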
:return: packet """ l3_header_size = getattr(PacketConsts, f'IPV{str(l3)}_HEADER_SIZE') l4_header_size = getattr(PacketConsts, f'{l4.upper()}_HEADER_SIZE') payload_size = max(0, msg_size - l3_header_size - l4_header_size - PacketConsts.ETHER_HEADER_SIZE) next_hdr = getattr(socket, f'IPPROTO_{l4.upper()}') ip_total_len = msg_size - PacketConsts.ETHER_HEADER_SIZE # Ethernet header src_mac = kwargs.get('src_mac', bytes.fromhex(PacketConsts.SRC_MAC.replace(':', ''))) packet = struct.pack('!6s6s', bytes.fromhex(PacketConsts.DST_MAC.replace(':', '')), src_mac) if with_vlan: packet += struct.pack('!HH', PacketConsts.VLAN_TPID, (PacketConsts.VLAN_PRIO << 13) + (PacketConsts.VLAN_CFI << 12) + PacketConsts.VLAN_ID) payload_size -= PacketConsts.VLAN_HEADER_SIZE ip_total_len -= PacketConsts.VLAN_HEADER_SIZE if l3 == PacketConsts.IP_V4: packet += PacketConsts.ETHER_TYPE_IPV4.to_bytes(2, 'big') else: packet += PacketConsts.ETHER_TYPE_IPV6.to_bytes(2, 'big') if l3 == PacketConsts.IP_V4: # IPv4 header src_ipv4 = kwargs.get('src_ipv4', PacketConsts.SRC_IP) packet += struct.pack('!2B3H2BH4s4s', (PacketConsts.IP_V4 << 4) + PacketConsts.IHL, 0, ip_total_len, 0, PacketConsts.IP_V4_FLAGS << 13, PacketConsts.TTL_HOP_LIMIT, next_hdr, 0, socket.inet_aton(src_ipv4), socket.inet_aton(PacketConsts.DST_IP)) else: # IPv6 header packet += struct.pack('!IH2B16s16s', (PacketConsts.IP_V6 << 28), ip_total_len, next_hdr, PacketConsts.TTL_HOP_LIMIT, socket.inet_pton(socket.AF_INET6, PacketConsts.SRC_IP6), socket.inet_pton(socket.AF_INET6, PacketConsts.DST_IP6)) if l4 == PacketConsts.UDP_PROTO: # UDP header packet += struct.pack('!4H', PacketConsts.SRC_PORT, PacketConsts.DST_PORT, payload_size + PacketConsts.UDP_HEADER_SIZE, 0) else: # TCP header packet += struct.pack('!2H2I4H', PacketConsts.SRC_PORT, PacketConsts.DST_PORT, 0, 0, PacketConsts.TCP_HEADER_SIZE_WORDS << 12, PacketConsts.WINDOW_SIZE, 0, 0) # Payload packet += str.encode('a' * payload_size) return packet def get_send_elements_raw_qp(agr_obj, l3=PacketConsts.IP_V4, l4=PacketConsts.UDP_PROTO, with_vlan=False, packet_to_send=None, **packet_args): """ Creates a single SGE and a single Send WR for agr_obj's RAW QP type. The content of the message is Eth | Ipv4 | UDP packet. :param agr_obj: Aggregation object which contains all resources necessary :param l3: Packet layer 3 type: 4 for IPv4 or 6 for IPv6 :param l4: Packet layer 4 type: 'tcp' or 'udp' :param with_vlan: if True add VLAN header to the packet :param packet_to_send: If passed, the other packet related parameters would be ignored, and this will be the packet to send. :param packet_args: Pass packet_args to gen_packets method. 
:return: send wr, its SGE, and message """ mr = agr_obj.mr msg = packet_to_send if packet_to_send is not None else \ gen_packet(agr_obj.msg_size, l3, l4, with_vlan, **packet_args) mr.write(msg, agr_obj.msg_size) sge = SGE(mr.buf, agr_obj.msg_size, mr.lkey) send_wr = SendWR(opcode=e.IBV_WR_SEND, num_sge=1, sg=[sge]) return send_wr, sge, msg def validate_raw(msg_received, msg_expected, skip_idxs): size = len(msg_expected) for i in range(size): if (msg_received[i] != msg_expected[i]) and i not in skip_idxs: err_msg = f'Data validation failure:\nexpected {msg_expected}\n\nreceived {msg_received}' raise PyverbsError(err_msg) def sampler_traffic(client, server, iters, l3=PacketConsts.IP_V4, l4=PacketConsts.UDP_PROTO): """ Send raw ethernet traffic :param client: client side, clients base class is BaseTraffic :param server: server side, servers base class is BaseTraffic :param iters: number of traffic iterations :param l3: Packet layer 3 type: 4 for IPv4 or 6 for IPv6 :param l4: Packet layer 4 type: 'tcp' or 'udp' """ s_recv_wr = get_recv_wr(server) c_recv_wr = get_recv_wr(client) for qp_idx in range(server.qp_count): # Prepare the receive queue with RecvWR post_recv(client, c_recv_wr, qp_idx=qp_idx) post_recv(server, s_recv_wr, qp_idx=qp_idx) poll = poll_cq_ex if isinstance(client.cq, CQEX) else poll_cq for _ in range(iters): for qp_idx in range(server.qp_count): c_send_wr, c_sg, msg = get_send_elements_raw_qp(client, l3, l4, False) send(client, c_send_wr, e.IBV_WR_SEND, False, qp_idx) poll(client.cq) def raw_traffic(client, server, iters, l3=PacketConsts.IP_V4, l4=PacketConsts.UDP_PROTO, with_vlan=False, expected_packet=None, skip_idxs=None, packet_to_send=None): """ Runs raw ethernet traffic between two sides :param client: client side, clients base class is BaseTraffic :param server: server side, servers base class is BaseTraffic :param iters: number of traffic iterations :param l3: Packet layer 3 type: 4 for IPv4 or 6 for IPv6 :param l4: Packet layer 4 type: 'tcp' or 'udp' :param with_vlan: if True add VLAN header to the packet :param expected_packet: Expected packet for validation (when different from the originally sent). :param skip_idxs: indexes to skip during packet validation :param packet_to_send: If passed, the other packet related parameters would be ignored, and this will be the packet to send. """ skip_idxs = [] if skip_idxs is None else skip_idxs s_recv_wr = get_recv_wr(server) c_recv_wr = get_recv_wr(client) for qp_idx in range(server.qp_count): # prepare the receive queue with RecvWR post_recv(client, c_recv_wr, qp_idx=qp_idx) post_recv(server, s_recv_wr, qp_idx=qp_idx) read_offset = 0 poll = poll_cq_ex if isinstance(client.cq, CQEX) else poll_cq for _ in range(iters): for qp_idx in range(server.qp_count): c_send_wr, c_sg, msg = get_send_elements_raw_qp(client, l3, l4, with_vlan, packet_to_send=packet_to_send) send(client, c_send_wr, e.IBV_WR_SEND, False, qp_idx) poll(client.cq) poll(server.cq) post_recv(server, s_recv_wr, qp_idx=qp_idx) msg_received = server.mr.read(server.msg_size, read_offset) # Validate received packet validate_raw(msg_received, expected_packet if expected_packet else msg, skip_idxs) def raw_rss_traffic(client, server, iters, l3=PacketConsts.IP_V4, l4=PacketConsts.UDP_PROTO, with_vlan=False, num_packets=1): """ Runs raw ethernet rss traffic between two sides. 
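    Each iteration sends num_packets packets that differ only in their IPv4
    source address (0.1.2.3, 1.2.3.4, and so on), so the completions are
    expected to be spread across the server's multiple CQs by RSS.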
:param client: client side, clients base class is BaseTraffic :param server: server side, servers base class is BaseTraffic :param iters: number of traffic iterations :param l3: Packet layer 3 type: 4 for IPv4 or 6 for IPv6 :param l4: Packet layer 4 type: 'tcp' or 'udp' :param with_vlan: if True add VLAN header to the packet :param num_packets: Number of packets to send with different ipv4 src address in each iteration. :return: None """ s_recv_wr = get_recv_wr(server) for qp_idx in range(server.qp_count): # prepare the receive queue with RecvWR post_recv(server, s_recv_wr, qp_idx=qp_idx, num_wqes=num_packets) for _ in range(iters): for qp_idx in range(server.qp_count): for i in range(num_packets): c_send_wr, c_sg, msg = get_send_elements_raw_qp( client, l3, l4, with_vlan, src_ipv4='.'.join([str(num) for num in range(i, i + 4)])) send(client, c_send_wr, e.IBV_WR_SEND, False, qp_idx) poll_cq(client.cq) completions = 0 start_poll_t = time.perf_counter() while completions < num_packets and \ (time.perf_counter() - start_poll_t < POLL_CQ_TIMEOUT): for cq in server.cqs: n, wcs = cq.poll() if n > 0: if wcs[0].status != e.IBV_WC_SUCCESS: raise PyverbsRDMAError( f'Completion status is {wc_status_to_str(wcs[0].status)}', wcs[0].status) completions += 1 if completions >= num_packets: break if completions < num_packets: raise PyverbsError(f'Expected {num_packets} completions - got {completions}') post_recv(server, s_recv_wr, qp_idx=qp_idx, num_wqes=num_packets) def flush_traffic(client, server, iters, gid_idx, port, new_send=False, send_op=None): """ Runs basic RDMA FLUSH traffic that client requests a FLUSH to server. Simply, run RDMA WRITE and then follow up by a RDMA FLUSH. No receive WQEs are posted. :param client: client side, clients base class is BaseTraffic :param server: server side, servers base class is BaseTraffic :param iters: number of traffic iterations :param gid_idx: local gid index :param port: IB port :param new_send: If True use new post send API. :param send_op: The send_wr opcode. :return: """ rdma_traffic(client, server, iters, gid_idx, port, new_send, e.IBV_WR_RDMA_WRITE) for i in range(iters): if client.level == e.IBV_FLUSH_MR: client.msg_size = 0 if i == 0 else random.randint(0, 12345678) send(client, None, send_op, new_send) wcs = _poll_cq(client.cq) if (wcs[0].status != e.IBV_WC_SUCCESS): break return wcs def prepare_validate_data(client=None, server=None): if server: server.mem_write('s' * server.msg_size, server.msg_size) if client: client.mem_write('c' * client.msg_size, client.msg_size) def rdma_traffic(client, server, iters, gid_idx, port, new_send=False, send_op=None, force_page_faults=False): """ Runs basic RDMA traffic between two sides. No receive WQEs are posted. For RDMA send with immediate, use traffic(). :param client: client side, clients base class is BaseTraffic :param server: server side, servers base class is BaseTraffic :param iters: number of traffic iterations :param gid_idx: local gid index :param port: IB port :param new_send: If True use new post send API. :param send_op: The send_wr opcode. :param force_page_faults: If True, use madvise to hint that we don't need the MR's buffer to force page faults (useful for ODP testing). 
    :return:
    """
    # Using the new post send API, we need the SGE, not the SendWR
    if isinstance(client, Mlx5DcResources) or \
            isinstance(client, SRDResources):
        ah_client = get_global_ah(client, gid_idx, port)
        ah_server = get_global_ah(server, gid_idx, port)
    else:
        ah_client = None
        ah_server = None
    send_element_idx = 1 if new_send else 0
    same_side_check = send_op in [e.IBV_WR_RDMA_READ,
                                  e.IBV_WR_ATOMIC_CMP_AND_SWP,
                                  e.IBV_WR_ATOMIC_FETCH_AND_ADD]
    for _ in range(iters):
        if force_page_faults:
            madvise(client.mr.buf, client.msg_size)
            madvise(server.mr.buf, server.msg_size)
        prepare_validate_data(client=client, server=server)
        c_send_wr = get_send_elements(client, False, send_op)[send_element_idx]
        send(client, c_send_wr, send_op, new_send, ah=ah_client)
        poll_cq(client.cq)
        if same_side_check:
            msg_received = client.mem_read(client.msg_size)
        else:
            msg_received = server.mem_read(server.msg_size)
        validate(msg_received, False if same_side_check else True,
                 server.msg_size)
        s_send_wr = get_send_elements(server, True, send_op)[send_element_idx]
        prepare_validate_data(client=client, server=server)
        send(server, s_send_wr, send_op, new_send, ah=ah_server)
        poll_cq(server.cq)
        if same_side_check:
            msg_received = server.mem_read(client.msg_size)
        else:
            msg_received = client.mem_read(server.msg_size)
        validate(msg_received, True if same_side_check else False,
                 client.msg_size)


def atomic_traffic(client, server, iters, gid_idx, port, new_send=False,
                   send_op=None, receiver_val=1, sender_val=2, swap=0,
                   client_wr=1, server_wr=1, **kwargs):
    """
    Runs atomic traffic between two sides.
    :param client: Client side, clients base class is BaseTraffic
    :param server: Server side, servers base class is BaseTraffic
    :param iters: Number of traffic iterations
    :param gid_idx: Local gid index
    :param port: IB port
    :param new_send: If True use new post send API.
    :param send_op: The send_wr opcode.
    :param receiver_val: The requested value on the receiver MR.
    :param sender_val: The requested value on the sender SendWR.
    :param swap: The swap value, used only in atomic compare and swap.
    :param client_wr: Number of WRs the client will post before polling all
                      of them
    :param server_wr: Number of WRs the server will post before polling all
                      of them
    :param kwargs: General arguments (shared with other traffic functions).
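    For example, with the defaults (receiver_val=1, sender_val=2,
    client_wr=1), a fetch-and-add iteration is expected to leave
    int8b_from_int(3) in the responder's MR and int8b_from_int(1), the
    fetched value, in the requester's MR (see validate_atomic() below).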
""" send_element_idx = 1 if new_send else 0 if is_datagram_qp(client): ah_client = get_global_ah(client, gid_idx, port) ah_server = get_global_ah(server, gid_idx, port) else: ah_client = None ah_server = None for _ in range(iters): client.mr.write(int.to_bytes(sender_val, 1, byteorder='big') * 8, 8) server.mr.write(int.to_bytes(receiver_val, 1, byteorder='big') * 8, 8) for _ in range(client_wr): c_send_wr = get_atomic_send_elements(client, send_op, cmp_add=sender_val, swap=swap)[send_element_idx] if isinstance(server, XRCResources): c_send_wr.set_qp_type_xrc(server.srq.get_srq_num()) send(client, c_send_wr, send_op, new_send, ah=ah_client, cmp_add=sender_val, swap=swap) poll_cq(client.cq, count=client_wr) validate_atomic(send_op, server, client, receiver_val=receiver_val + sender_val * (client_wr - 1), send_cmp_add=sender_val, send_swp=swap) server.mr.write(int.to_bytes(sender_val, 1, byteorder='big') * 8, 8) client.mr.write(int.to_bytes(receiver_val, 1, byteorder='big') * 8, 8) for _ in range(server_wr): s_send_wr = get_atomic_send_elements(server, send_op, cmp_add=sender_val, swap=swap)[send_element_idx] if isinstance(client, XRCResources): s_send_wr.set_qp_type_xrc(client.srq.get_srq_num()) send(server, s_send_wr, send_op, new_send, ah=ah_server, cmp_add=sender_val, swap=swap) poll_cq(server.cq, count=server_wr) validate_atomic(send_op, client, server, receiver_val=receiver_val + sender_val * (server_wr - 1), send_cmp_add=sender_val, send_swp=swap) def validate_atomic(opcode, recv_player, send_player, receiver_val, send_cmp_add, send_swp): """ Validate the data after atomic operations. The expected data in each side of traffic depends on the atomic type and the sender SendWR values. :param opcode: The atomic opcode. :param recv_player: The receiver player. :param send_player: The sender player. :param receiver_val: The value on the receiver MR before the atomic action. :param send_cmp_add: The send WR compare/add value depende on the atomic type. :param send_swp: The send WR swap value, used only in atomic compare and swap. """ send_expected = receiver_val if opcode in [e.IBV_WR_ATOMIC_CMP_AND_SWP, e.IBV_QP_EX_WITH_ATOMIC_CMP_AND_SWP]: recv_expected = send_swp if receiver_val == send_cmp_add \ else receiver_val if opcode in [e.IBV_WR_ATOMIC_FETCH_AND_ADD, e.IBV_QP_EX_WITH_ATOMIC_FETCH_AND_ADD]: recv_expected = receiver_val + send_cmp_add send_actual = int.from_bytes(send_player.mr.read(length=8, offset=0), byteorder='big') recv_actual = int.from_bytes(recv_player.mr.read(length=8, offset=0), byteorder='big') if send_actual != int8b_from_int(send_expected): raise PyverbsError( 'Atomic sender data validation failed: expected {exp}, received {rcv}'. format(exp=int8b_from_int(send_expected), rcv=send_actual)) if recv_actual != int8b_from_int(recv_expected): raise PyverbsError( 'Atomic reciver data validation failed: expected {exp}, received {rcv}'. format(exp=int8b_from_int(recv_expected), rcv=recv_actual)) def int8b_from_int(num): """ Duplicate one-byte value int to 8 bytes. e.g. 1 => b'\x01\x01\x01\x01\x01\x01\x01\x01' == 72340172838076673 :param num: One byte int number (0 <= num < 256). :return: The new number in int format. """ num_multi_8_str = int.to_bytes(num, 1, byteorder='big') * 8 return int.from_bytes(num_multi_8_str, byteorder='big') def get_atomic_send_elements(agr_obj, opcode, cmp_add=0, swap=0): """ Creates a single SGE and a single Send WR for atomic operations. 

def get_atomic_send_elements(agr_obj, opcode, cmp_add=0, swap=0):
    """
    Creates a single SGE and a single Send WR for atomic operations.
    :param agr_obj: Aggregation object which contains all resources necessary
    :param opcode: The send opcode
    :param cmp_add: The compare or add value (depends on the opcode).
    :param swap: The swap value.
    :return: Send WR and its SGE
    """
    sge = SGE(agr_obj.mr.buf, 8, agr_obj.mr_lkey)
    send_wr = SendWR(opcode=opcode, num_sge=1, sg=[sge])
    send_wr.set_wr_atomic(rkey=int(agr_obj.rkey), addr=int(agr_obj.raddr),
                          compare_add=int8b_from_int(cmp_add),
                          swap=int8b_from_int(swap))
    return send_wr, sge


def xrc_traffic(client, server, is_cq_ex=False, send_op=None,
                force_page_faults=False):
    """
    Runs basic XRC traffic. This function assumes that the server and the
    client have an equal number of QPs and that server.send_qp[i] is connected
    to client.recv_qp[i]. Each time server.send_qp[i] sends a message, it is
    redirected to client.srq because client.recv_qp[i] and client.srq are
    under the same xrcd. The traffic flow in the opposite direction is the
    same.
    :param client: Aggregation object of the active side, should be an
                   instance of XRCResources class
    :param server: Aggregation object of the passive side, should be an
                   instance of XRCResources class
    :param is_cq_ex: If True, use poll_cq_ex() rather than poll_cq()
    :param send_op: If not None, new post send API is assumed.
    :param force_page_faults: If True, use madvise to hint that we don't need
                              the MR's buffer to force page faults (useful for
                              ODP testing).
    :return: None
    """
    poll = poll_cq_ex if is_cq_ex else poll_cq
    server.remote_srqn = client.srq.get_srq_num()
    client.remote_srqn = server.srq.get_srq_num()
    s_recv_wr = get_recv_wr(server)
    c_recv_wr = get_recv_wr(client)
    post_recv(client, c_recv_wr, num_wqes=client.qp_count*client.num_msgs)
    post_recv(server, s_recv_wr, num_wqes=server.qp_count*server.num_msgs)
    # Using the new post send API, we need the SGE, not the SendWR
    send_element_idx = 1 if send_op else 0
    for _ in range(client.num_msgs):
        for i in range(server.qp_count):
            if force_page_faults:
                madvise(client.mr.buf, client.msg_size)
                madvise(server.mr.buf, server.msg_size)
            c_send_wr = get_send_elements(client, False)[send_element_idx]
            if send_op is None:
                c_send_wr.set_qp_type_xrc(client.remote_srqn)
            xrc_post_send(client, i, c_send_wr, send_op)
            poll(client.cq)
            poll(server.cq)
            msg_received = server.mr.read(server.msg_size, 0)
            validate(msg_received, True, server.msg_size)
            s_send_wr = get_send_elements(server, True)[send_element_idx]
            if send_op is None:
                s_send_wr.set_qp_type_xrc(server.remote_srqn)
            xrc_post_send(server, i, s_send_wr, send_op)
            poll(server.cq)
            poll(client.cq)
            msg_received = client.mr.read(client.msg_size, 0)
            validate(msg_received, False, client.msg_size)


# Decorators
def requires_odp(qp_type, required_odp_caps):
    def outer(func):
        def inner(instance):
            ctx = getattr(instance, 'ctx', d.Context(name=instance.dev_name))
            odp_supported(ctx, qp_type, required_odp_caps)
            if getattr(instance, 'is_implicit', False):
                odp_implicit_supported(instance.ctx)
            return func(instance)
        return inner
    return outer


def requires_root_on_eth(port_num=1):
    def outer(func):
        def inner(instance):
            if not (is_eth(instance.ctx, port_num) and is_root()):
                raise unittest.SkipTest('Must be run by root on Ethernet link layer')
            return func(instance)
        return inner
    return outer


def requires_mcast_support():
    """
    Skip the test if the device does not support multicast.
    """
    def outer(func):
        def inner(instance):
            ctx = d.Context(name=instance.dev_name)
            if ctx.query_device().max_mcast_grp == 0:
                raise unittest.SkipTest('Multicast is not supported on this device')
            return func(instance)
        return inner
return outer def odp_supported(ctx, qp_type, required_odp_caps): """ Check device ODP capabilities :param ctx: Device Context :param qp_type: QP type ('rc', 'ud' or 'uc') :param required_odp_caps: ODP Capability mask of specified device :return: None """ odp_caps = ctx.query_device_ex().odp_caps if odp_caps.general_caps == 0: raise unittest.SkipTest('ODP is not supported - No ODP caps') qp_odp_caps = getattr(odp_caps, '{}_odp_caps'.format(qp_type)) if required_odp_caps & qp_odp_caps != required_odp_caps: raise unittest.SkipTest('ODP is unavailable - Operation not supported on this device') def odp_implicit_supported(ctx): """ Check device ODP implicit capability. :param ctx: Device Context :return: None """ odp_caps = ctx.query_device_ex().odp_caps has_odp_implicit = odp_caps.general_caps & e.IBV_ODP_SUPPORT_IMPLICIT if has_odp_implicit == 0: raise unittest.SkipTest('ODP implicit is not supported') def odp_v2_supported(ctx): """ ODPv2 check :return: True/False if ODPv2 supported """ from tests.mlx5_prm_structs import QueryHcaCapIn, QueryOdpCapOut, DevxOps, QueryHcaCapMod query_cap_in = QueryHcaCapIn(op_mod=DevxOps.MLX5_CMD_OP_QUERY_ODP_CAP << 1 | \ QueryHcaCapMod.CURRENT) cmd_res = ctx.devx_general_cmd(query_cap_in, len(QueryOdpCapOut())) query_cap_out = QueryOdpCapOut(cmd_res) if query_cap_out.status: raise PyverbsRDMAError(f'QUERY_HCA_CAP has failed with status ({query_cap_out.status}) ' f'and syndrome ({query_cap_out.syndrome})') return query_cap_out.capability.mem_page_fault == 1 def requires_odpv2(func): def inner(instance): if not odp_v2_supported(instance.ctx): raise unittest.SkipTest('ODPv2 is not supported') return func(instance) return inner def get_pci_name(dev_name): pci_name = glob.glob(f'/sys/bus/pci/devices/*/infiniband/{dev_name}') if not pci_name: raise unittest.SkipTest(f'Could not find the PCI device of {dev_name}') return pci_name[0].split('/')[5] def requires_eswitch_on(func): def inner(instance): if not (is_eth(d.Context(name=instance.dev_name), instance.ib_port) and eswitch_mode_check(instance.dev_name)): raise unittest.SkipTest('Must be run on Ethernet link layer with Eswitch on') return func(instance) return inner def eswitch_mode_check(dev_name): pci_name = get_pci_name(dev_name) eswicth_off_msg = f'Device {dev_name} must be in switchdev mode' try: cmd_out = subprocess.check_output(['devlink', 'dev', 'eswitch', 'show', f'pci/{pci_name}'], stderr=subprocess.DEVNULL) if 'switchdev' not in str(cmd_out): raise unittest.SkipTest(eswicth_off_msg) except subprocess.CalledProcessError: raise unittest.SkipTest(eswicth_off_msg) return True def requires_roce_disabled(func): def inner(instance): if is_roce_enabled(instance.dev_name): raise unittest.SkipTest('ROCE must be disabled') return func(instance) return inner def is_roce_enabled(dev_name): pci_name = get_pci_name(dev_name) cmd_out = subprocess.check_output(['devlink', 'dev', 'param', 'show', f'pci/{pci_name}', 'name', 'enable_roce'], stderr=subprocess.DEVNULL) if 'value true' in str(cmd_out): return True return False def requires_encap_disabled_if_eswitch_on(func): def inner(instance): if not (is_eth(d.Context(name=instance.dev_name), instance.ib_port) and encap_mode_check(instance.dev_name)): raise unittest.SkipTest('Encap must be disabled when Eswitch on') return func(instance) return inner def encap_mode_check(dev_name): pci_name = get_pci_name(dev_name) encap_enable_msg = f'Device {dev_name}: Encap must be disabled over switchdev mode' try: cmd_out = subprocess.check_output(['devlink', 'dev', 'eswitch', 'show', 
                                           f'pci/{pci_name}'],
                                          stderr=subprocess.DEVNULL)
        if 'switchdev' in str(cmd_out):
            if any([i for i in ['encap enable', 'encap-mode basic'] if i in str(cmd_out)]):
                raise unittest.SkipTest(encap_enable_msg)
    except subprocess.CalledProcessError:
        raise unittest.SkipTest(encap_enable_msg)
    return True


def requires_huge_pages():
    def outer(func):
        def inner(instance):
            huge_pages_supported()
            return func(instance)
        return inner
    return outer


def skip_unsupported(func):
    def func_wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except PyverbsRDMAError as ex:
            if ex.error_code in [errno.EOPNOTSUPP, errno.EPROTONOSUPPORT]:
                raise unittest.SkipTest(f'Operation not supported ({str(ex)})')
            raise ex
    return func_wrapper


def huge_pages_supported():
    """
    Check if huge pages are supported by the kernel.
    :return: None
    """
    huge_path = '/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages'
    if not os.path.isfile(huge_path):
        raise unittest.SkipTest('Huge pages of size 2M are not supported on this platform')
    with open(huge_path, 'r') as f:
        if not int(f.read()):
            raise unittest.SkipTest('There are no huge pages of size 2M allocated')


@skip_unsupported
def query_nic_flow_table_caps(instance):
    from tests.mlx5_prm_structs import QueryHcaCapIn, QueryQosCapOut, QueryHcaCapOp, \
        QueryHcaCapMod, QueryCmdHcaNicFlowTableCapOut
    try:
        ctx = Mlx5Context(Mlx5DVContextAttr(), instance.dev_name)
    except PyverbsUserError as ex:
        raise unittest.SkipTest(f'Could not open mlx5 context ({ex})')
    except PyverbsRDMAError:
        raise unittest.SkipTest('Opening mlx5 context is not supported')
    # Query NIC Flow Table capabilities
    query_cap_in = QueryHcaCapIn(op_mod=(QueryHcaCapOp.HCA_NIC_FLOW_TABLE_CAP << 0x1) | \
                                 QueryHcaCapMod.CURRENT)
    cmd_res = ctx.devx_general_cmd(query_cap_in, len(QueryQosCapOut()))
    query_cap_out = QueryCmdHcaNicFlowTableCapOut(cmd_res)
    if query_cap_out.status:
        raise PyverbsRDMAError(f'QUERY_HCA_CAP has failed with status ({query_cap_out.status}) '
                               f'and syndrome ({query_cap_out.syndrome})')
    return query_cap_out.capability


def prefetch_mrs(agr_obj, sg_list, advice=e._IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE,
                 flags=e._IBV_ADVISE_MR_FLAG_FLUSH):
    """
    Pre-fetch a range of an on-demand paging MR.
    :param agr_obj: Aggregation object which contains all resources necessary
    :param sg_list: SGE list
    :param advice: The requested advice value
    :param flags: Describes the properties of the advice operation
    :return: None
    """
    try:
        agr_obj.pd.advise_mr(advice, flags, sg_list)
    except PyverbsRDMAError as ex:
        if ex.error_code == errno.EOPNOTSUPP:
            raise unittest.SkipTest(f'Advise MR with flags ({flags}) and advice ({advice}) is not supported')
        raise ex


def is_eth(ctx, port_num):
    """
    Queries the device context's port for its link layer.
    :param ctx: The Context to query
    :param port_num: Which Context's port to query
    :return: True if the port's link layer is Ethernet, else False
    """
    return ctx.query_port(port_num).link_layer == e.IBV_LINK_LAYER_ETHERNET


def is_datagram_qp(agr_obj):
    if agr_obj.qp.qp_type == e.IBV_QPT_UD or \
       isinstance(agr_obj, SRDResources) or \
       isinstance(agr_obj, Mlx5DcResources):
        return True
    return False


def is_root():
    return os.geteuid() == 0


def post_rq_state_bad_flow(test_obj):
    """
    Check post_receive on the RQ while the QP is in an invalid state.
    - Change the QP's state to IBV_QPS_RESET
    - Verify that post receive on the QP fails
    :param test_obj: An instance of RDMATestCase
    :return: None.
    """
    qp_attr = QPAttr(qp_state=e.IBV_QPS_RESET, cur_qp_state=e.IBV_QPS_RTS)
    test_obj.server.qps[0].modify(qp_attr, e.IBV_QP_STATE)
    recv_wr = get_recv_wr(test_obj.server)
    with test_obj.assertRaises(PyverbsRDMAError) as ex:
        post_recv(test_obj.server, recv_wr, qp_idx=0)
    test_obj.assertEqual(ex.exception.error_code, errno.EINVAL)


def post_sq_state_bad_flow(test_obj):
    """
    Check post_send on the SQ while the QP is in an invalid state.
    - Change the QP's state to IBV_QPS_RESET
    - Verify that post send on the QP fails
    :param test_obj: An instance of RDMATestCase
    :return: None.
    """
    qp_idx = 0
    qp_attr = QPAttr(qp_state=e.IBV_QPS_RESET, cur_qp_state=e.IBV_QPS_RTS)
    test_obj.client.qps[qp_idx].modify(qp_attr, e.IBV_QP_STATE)
    ah = get_global_ah(test_obj.client, test_obj.gid_index, test_obj.ib_port)
    _, sg = get_send_elements(test_obj.client, False)
    with test_obj.assertRaises(PyverbsRDMAError) as ex:
        send(test_obj.client, sg, e.IBV_WR_SEND, new_send=True,
             qp_idx=qp_idx, ah=ah)
    test_obj.assertEqual(ex.exception.error_code, errno.EINVAL)


def full_rq_bad_flow(test_obj):
    """
    Check post_receive while the QP's RQ is full.
    - Find the QP's RQ length.
    - Fill the QP with work requests until overflow.
    :param test_obj: An instance of RDMATestCase
    :return: None.
    """
    qp_attr, _ = test_obj.server.qps[0].query(e.IBV_QP_CAP)
    max_recv_wr = qp_attr.cap.max_recv_wr
    with test_obj.assertRaises(PyverbsRDMAError) as ex:
        for _ in range(max_recv_wr + 1):
            s_recv_wr = get_recv_wr(test_obj.server)
            post_recv(test_obj.server, s_recv_wr, qp_idx=0)
    test_obj.assertEqual(ex.exception.error_code, errno.ENOMEM)


def create_rq_with_larger_sgl_bad_flow(test_obj):
    """
    Check post_receive on a QP when the WR's SGL is bigger than the max
    number of SGEs allowed for the QP
    - Find the max number of SGEs allowed for the QP
    - Create a WR with an SGL bigger than the max
    - Verify that post receive on the QP fails
    :param test_obj: An instance of RDMATestCase
    :return: None.
    """
    qp_idx = 0
    server_mr = test_obj.server.mr
    server_mr_buf = server_mr.buf
    qp_attr, _ = test_obj.server.qps[qp_idx].query(e.IBV_QP_CAP)
    max_recv_sge = qp_attr.cap.max_recv_sge
    length = test_obj.server.msg_size // (max_recv_sge + 1)
    sgl = []
    offset = 0
    for _ in range(max_recv_sge + 1):
        sgl.append(SGE(server_mr_buf + offset, length, server_mr.lkey))
        offset = offset + length
    s_recv_wr = RecvWR(sg=sgl, num_sge=max_recv_sge + 1)
    with test_obj.assertRaises(PyverbsRDMAError) as ex:
        post_recv(test_obj.server, s_recv_wr, qp_idx=qp_idx)
    test_obj.assertEqual(ex.exception.error_code, errno.EINVAL)
def high_rate_send(agr_obj, packet, rate_limit, timeout=2):
    """
    Sends packets at a high rate for 'timeout' seconds.
    :param agr_obj: Aggregation object which contains all resources necessary
    :param packet: Packet to send
    :param rate_limit: Minimal rate limit in MBps
    :param timeout: Seconds to send the packets
    """
    send_sg = SGE(agr_obj.mr.buf, len(packet), agr_obj.mr.lkey)
    agr_obj.mr.write(packet, len(packet))
    send_wr = SendWR(num_sge=1, sg=[send_sg])
    poll = poll_cq_ex if isinstance(agr_obj.cq, CQEX) else poll_cq
    iterations = 0
    start_send_t = time.perf_counter()
    while (time.perf_counter() - start_send_t) < timeout:
        agr_obj.qp.post_send(send_wr)
        poll(agr_obj.cq)
        iterations += 1
    # Calculate the rate
    rate = agr_obj.msg_size * iterations / timeout / 1000000
    assert rate > rate_limit, 'Traffic rate is smaller than minimal rate for the test'


def get_pkey_from_kernel(device, port=1, index=0):
    path = f'/sys/class/infiniband/{device}/ports/{port}/pkeys/{index}'
    output = subprocess.check_output(['cat', path], universal_newlines=True)
    pkey_hex = output.strip()
    pkey_decimal = int(pkey_hex, 16)
    return pkey_decimal
rdma-core-56.1/util/000077500000000000000000000000001477342711600143075ustar00rootroot00000000000000rdma-core-56.1/util/CMakeLists.txt000066400000000000000000000012121477342711600170450ustar00rootroot00000000000000publish_internal_headers(util
  bitmap.h
  cl_qmap.h
  compiler.h
  interval_set.h
  node_name_map.h
  rdma_nl.h
  symver.h
  util.h
  )

set(C_FILES
  bitmap.c
  cl_map.c
  interval_set.c
  node_name_map.c
  open_cdev.c
  rdma_nl.c
  util.c
  )

if (HAVE_COHERENT_DMA)
publish_internal_headers(util
  mmio.h
  s390_mmio_insn.h
  udma_barrier.h
  )
set(C_FILES ${C_FILES}
  mmio.c
  )
set_source_files_properties(mmio.c PROPERTIES COMPILE_FLAGS "${SSE_FLAGS}")
endif()

add_library(rdma_util STATIC ${C_FILES})
add_library(rdma_util_pic STATIC ${C_FILES})
set_property(TARGET rdma_util_pic PROPERTY POSITION_INDEPENDENT_CODE TRUE)
rdma-core-56.1/util/bitmap.c000066400000000000000000000073261477342711600157370ustar00rootroot00000000000000/* GPLv2 or OpenIB.org BSD (MIT) See COPYING file */

#define _GNU_SOURCE
#include "bitmap.h"
#include <assert.h>
#include <limits.h>
#include <strings.h>

#define BMP_WORD_INDEX(n) ((n) / BITS_PER_LONG)
#define BMP_WORD_OFFSET(n) ((n) % BITS_PER_LONG)

#define BMP_FIRST_WORD_MASK(start) (~0UL << BMP_WORD_OFFSET(start))
#define BMP_LAST_WORD_MASK(end) (BMP_WORD_OFFSET(end) == 0 ? ~0UL : \
				 ~BMP_FIRST_WORD_MASK(end))

/*
 * Finds the first set bit in the bitmap starting from
 * 'start' bit until ('end'-1) bit.
 *
 * Returns the set bit index if found, otherwise returns 'end'.
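 *
 * Example (illustrative, assuming 64-bit words): for a one-word bitmap
 * holding 0b1010, bitmap_find_first_bit(bmp, 2, 8) returns 3, while
 * bitmap_find_first_bit(bmp, 4, 8) finds no set bit in range and
 * returns 'end' (8).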
*/ unsigned long bitmap_find_first_bit(const unsigned long *bmp, unsigned long start, unsigned long end) { unsigned long curr_offset = BMP_WORD_OFFSET(start); unsigned long curr_idx = BMP_WORD_INDEX(start); assert(start <= end); for (; start < end; curr_idx++) { unsigned long bit = ffsl(bmp[curr_idx] >> curr_offset); if (bit) return min(end, start + bit - 1); start += BITS_PER_LONG - curr_offset; curr_offset = 0; } return end; } /* * Zeroes bitmap bits in the following range: [start,end-1] */ void bitmap_zero_region(unsigned long *bmp, unsigned long start, unsigned long end) { unsigned long start_mask; unsigned long last_mask; unsigned long curr_idx = BMP_WORD_INDEX(start); unsigned long last_idx = BMP_WORD_INDEX(end - 1); assert(start <= end); if (start >= end) return; start_mask = BMP_FIRST_WORD_MASK(start); last_mask = BMP_LAST_WORD_MASK(end); if (curr_idx == last_idx) { bmp[curr_idx] &= ~(start_mask & last_mask); return; } bmp[curr_idx] &= ~start_mask; for (curr_idx++; curr_idx < last_idx; curr_idx++) bmp[curr_idx] = 0; bmp[curr_idx] &= ~last_mask; } /* * Sets bitmap bits in the following range: [start,end-1] */ void bitmap_fill_region(unsigned long *bmp, unsigned long start, unsigned long end) { unsigned long start_mask; unsigned long last_mask; unsigned long curr_idx = BMP_WORD_INDEX(start); unsigned long last_idx = BMP_WORD_INDEX(end - 1); assert(start <= end); if (start >= end) return; start_mask = BMP_FIRST_WORD_MASK(start); last_mask = BMP_LAST_WORD_MASK(end); if (curr_idx == last_idx) { bmp[curr_idx] |= (start_mask & last_mask); return; } bmp[curr_idx] |= start_mask; for (curr_idx++; curr_idx < last_idx; curr_idx++) bmp[curr_idx] = ULONG_MAX; bmp[curr_idx] |= last_mask; } /* * Checks whether the contiguous region of region_size bits starting from * start is free. * * Returns true if the said region is free, otherwise returns false. */ static bool bitmap_is_free_region(unsigned long *bmp, unsigned long start, unsigned long region_size) { unsigned long curr_idx; unsigned long last_idx; unsigned long last_mask; unsigned long start_mask; curr_idx = BMP_WORD_INDEX(start); start_mask = BMP_FIRST_WORD_MASK(start); last_idx = BMP_WORD_INDEX(start + region_size - 1); last_mask = BMP_LAST_WORD_MASK(start + region_size); if (curr_idx == last_idx) return !(bmp[curr_idx] & start_mask & last_mask); if (bmp[curr_idx] & start_mask) return false; for (curr_idx++; curr_idx < last_idx; curr_idx++) { if (bmp[curr_idx]) return false; } return !(bmp[curr_idx] & last_mask); } /* * Finds a contiguous region with the size of region_size * in the bitmap that is not set. * * Returns first index of such region if found, * otherwise returns nbits. 
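 *
 * Example (illustrative): on an all-zero 64-bit bitmap,
 * bitmap_find_free_region(bmp, 64, 8) returns 0; if bits 0..59 are already
 * set, no free 8-bit region exists and the call returns nbits (64).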
 */
unsigned long bitmap_find_free_region(unsigned long *bmp,
				      unsigned long nbits,
				      unsigned long region_size)
{
	unsigned long start;

	if (!region_size)
		return 0;

	for (start = 0; start + region_size <= nbits; start++) {
		if (bitmap_test_bit(bmp, start))
			continue;

		if (bitmap_is_free_region(bmp, start, region_size))
			return start;
	}

	return nbits;
}
rdma-core-56.1/util/bitmap.h000066400000000000000000000050611477342711600157360ustar00rootroot00000000000000/* GPLv2 or OpenIB.org BSD (MIT) See COPYING file */

#ifndef UTIL_BITMAP_H
#define UTIL_BITMAP_H

#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#include "util.h"

#define BMP_DECLARE(name, nbits) \
	unsigned long (name)[BITS_TO_LONGS((nbits))]

unsigned long bitmap_find_first_bit(const unsigned long *bmp,
				    unsigned long start, unsigned long end);

void bitmap_zero_region(unsigned long *bmp, unsigned long start,
			unsigned long end);

void bitmap_fill_region(unsigned long *bmp, unsigned long start,
			unsigned long end);

unsigned long bitmap_find_free_region(unsigned long *bmp,
				      unsigned long nbits,
				      unsigned long region_size);

static inline void bitmap_fill(unsigned long *bmp, unsigned long nbits)
{
	unsigned long size = BITS_TO_LONGS(nbits) * sizeof(unsigned long);

	memset(bmp, 0xff, size);
}

static inline void bitmap_zero(unsigned long *bmp, unsigned long nbits)
{
	unsigned long size = BITS_TO_LONGS(nbits) * sizeof(unsigned long);

	memset(bmp, 0, size);
}

static inline bool bitmap_empty(const unsigned long *bmp, unsigned long nbits)
{
	unsigned long i;
	unsigned long mask = ULONG_MAX;

	assert(nbits);

	for (i = 0; i < BITS_TO_LONGS(nbits) - 1; i++) {
		if (bmp[i] != 0)
			return false;
	}

	if (nbits % BITS_PER_LONG)
		mask = (1UL << (nbits % BITS_PER_LONG)) - 1;

	return (bmp[i] & mask) ? false : true;
}

static inline bool bitmap_full(const unsigned long *bmp, unsigned long nbits)
{
	unsigned long i;
	unsigned long mask = ULONG_MAX;

	assert(nbits);

	for (i = 0; i < BITS_TO_LONGS(nbits) - 1; i++) {
		if (bmp[i] != -1UL)
			return false;
	}

	if (nbits % BITS_PER_LONG)
		mask = (1UL << (nbits % BITS_PER_LONG)) - 1;

	return ((bmp[i] & mask) ^ (mask)) ? false : true;
}

static inline void bitmap_set_bit(unsigned long *bmp, unsigned long idx)
{
	bmp[(idx / BITS_PER_LONG)] |= (1UL << (idx % BITS_PER_LONG));
}

static inline void bitmap_clear_bit(unsigned long *bmp, unsigned long idx)
{
	bmp[(idx / BITS_PER_LONG)] &= ~(1UL << (idx % BITS_PER_LONG));
}

static inline bool bitmap_test_bit(const unsigned long *bmp, unsigned long idx)
{
	return !!(bmp[(idx / BITS_PER_LONG)] & (1UL << (idx % BITS_PER_LONG)));
}

static inline unsigned long *bitmap_alloc0(unsigned long size)
{
	unsigned long *bmp;

	bmp = calloc(BITS_TO_LONGS(size), sizeof(long));
	if (!bmp)
		return NULL;

	return bmp;
}

static inline unsigned long *bitmap_alloc1(unsigned long size)
{
	unsigned long *bmp;

	bmp = bitmap_alloc0(size);
	if (!bmp)
		return NULL;

	bitmap_fill(bmp, size);

	return bmp;
}
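/*
 * Illustrative sketch only (not part of the original header; the function
 * name is hypothetical): the typical find-then-fill call pattern for the
 * region helpers above.
 */
static inline int bitmap_example_reserve(unsigned long *bmp,
					 unsigned long nbits,
					 unsigned long region_size)
{
	unsigned long start = bitmap_find_free_region(bmp, nbits, region_size);

	if (start == nbits)
		return -1;	/* no free region of the requested size */

	bitmap_fill_region(bmp, start, start + region_size);
	return 0;
}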
#endif
rdma-core-56.1/util/cl_map.c000066400000000000000000000461041477342711600157130ustar00rootroot00000000000000/*
 * Copyright (c) 2004-2009 Voltaire, Inc. All rights reserved.
 * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
 * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses.  You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

/*
 * Abstract:
 *	Implementation of quick map, a binary tree where the caller always
 *	provides all necessary storage.
 *
 */

/*****************************************************************************
*
* Map
*
* Map is an associative array.  By providing a key, the caller can retrieve
* an object from the map.  All objects in the map have an associated key,
* as specified by the caller when the object was inserted into the map.
* In addition to random access, the caller can traverse the map much like
* a linked list, either forwards from the first object or backwards from
* the last object.  The objects in the map are always traversed in
* order since the nodes are stored sorted.
*
* This implementation of Map uses a red black tree verified against
* Cormen-Leiserson-Rivest text, McGraw-Hill Edition, fourteenth
* printing, 1994.
*
*****************************************************************************/

#include <string.h>
#include <util/cl_qmap.h>

static inline void __cl_primitive_insert(cl_list_item_t * const p_list_item,
					 cl_list_item_t * const p_new_item)
{
	/* CL_ASSERT that a non-null pointer is provided. */
	assert(p_list_item);
	/* CL_ASSERT that a non-null pointer is provided. */
	assert(p_new_item);

	p_new_item->p_next = p_list_item;
	p_new_item->p_prev = p_list_item->p_prev;
	p_list_item->p_prev = p_new_item;
	p_new_item->p_prev->p_next = p_new_item;
}

static inline void __cl_primitive_remove(cl_list_item_t * const p_list_item)
{
	/* CL_ASSERT that a non-null pointer is provided. */
	assert(p_list_item);

	/* set the back pointer */
	p_list_item->p_next->p_prev = p_list_item->p_prev;
	/* set the next pointer */
	p_list_item->p_prev->p_next = p_list_item->p_next;

	/* if we're debugging, spruce up the pointers to help find bugs */
#if defined( _DEBUG_ )
	if (p_list_item != p_list_item->p_next) {
		p_list_item->p_next = NULL;
		p_list_item->p_prev = NULL;
	}
#endif				/* defined( _DEBUG_ ) */
}

/******************************************************************************
 IMPLEMENTATION OF QUICK MAP
******************************************************************************/

/*
 * Get the root.
 */
static inline cl_map_item_t *__cl_map_root(const cl_qmap_t * const p_map)
{
	assert(p_map);
	return (p_map->root.p_left);
}

/*
 * Returns whether a given item is on the left of its parent.
*/ static bool __cl_map_is_left_child(const cl_map_item_t * const p_item) { assert(p_item); assert(p_item->p_up); assert(p_item->p_up != p_item); return (p_item->p_up->p_left == p_item); } /* * Retrieve the pointer to the parent's pointer to an item. */ static cl_map_item_t **__cl_map_get_parent_ptr_to_item(cl_map_item_t * const p_item) { assert(p_item); assert(p_item->p_up); assert(p_item->p_up != p_item); if (__cl_map_is_left_child(p_item)) return (&p_item->p_up->p_left); assert(p_item->p_up->p_right == p_item); return (&p_item->p_up->p_right); } /* * Rotate a node to the left. This rotation affects the least number of links * between nodes and brings the level of C up by one while increasing the depth * of A one. Note that the links to/from W, X, Y, and Z are not affected. * * R R * | | * A C * / \ / \ * W C A Z * / \ / \ * B Z W B * / \ / \ * X Y X Y */ static void __cl_map_rot_left(cl_qmap_t * const p_map, cl_map_item_t * const p_item) { cl_map_item_t **pp_root; assert(p_map); assert(p_item); assert(p_item->p_right != &p_map->nil); pp_root = __cl_map_get_parent_ptr_to_item(p_item); /* Point R to C instead of A. */ *pp_root = p_item->p_right; /* Set C's parent to R. */ (*pp_root)->p_up = p_item->p_up; /* Set A's right to B */ p_item->p_right = (*pp_root)->p_left; /* * Set B's parent to A. We trap for B being NIL since the * caller may depend on NIL not changing. */ if ((*pp_root)->p_left != &p_map->nil) (*pp_root)->p_left->p_up = p_item; /* Set C's left to A. */ (*pp_root)->p_left = p_item; /* Set A's parent to C. */ p_item->p_up = *pp_root; } /* * Rotate a node to the right. This rotation affects the least number of links * between nodes and brings the level of A up by one while increasing the depth * of C one. Note that the links to/from W, X, Y, and Z are not affected. * * R R * | | * C A * / \ / \ * A Z W C * / \ / \ * W B B Z * / \ / \ * X Y X Y */ static void __cl_map_rot_right(cl_qmap_t * const p_map, cl_map_item_t * const p_item) { cl_map_item_t **pp_root; assert(p_map); assert(p_item); assert(p_item->p_left != &p_map->nil); /* Point R to A instead of C. */ pp_root = __cl_map_get_parent_ptr_to_item(p_item); (*pp_root) = p_item->p_left; /* Set A's parent to R. */ (*pp_root)->p_up = p_item->p_up; /* Set C's left to B */ p_item->p_left = (*pp_root)->p_right; /* * Set B's parent to C. We trap for B being NIL since the * caller may depend on NIL not changing. */ if ((*pp_root)->p_right != &p_map->nil) (*pp_root)->p_right->p_up = p_item; /* Set A's right to C. */ (*pp_root)->p_right = p_item; /* Set C's parent to A. */ p_item->p_up = *pp_root; } void cl_qmap_init(cl_qmap_t * const p_map) { assert(p_map); memset(p_map, 0, sizeof(cl_qmap_t)); /* special setup for the root node */ p_map->root.p_up = &p_map->root; p_map->root.p_left = &p_map->nil; p_map->root.p_right = &p_map->nil; p_map->root.color = CL_MAP_BLACK; /* Setup the node used as terminator for all leaves. 
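 * Like the root, nil is its own parent and is always black.  Using a real
 * sentinel node instead of NULL pointers lets the rebalancing code read
 * and write sentinel fields without special-casing empty subtrees.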
*/ p_map->nil.p_up = &p_map->nil; p_map->nil.p_left = &p_map->nil; p_map->nil.p_right = &p_map->nil; p_map->nil.color = CL_MAP_BLACK; cl_qmap_remove_all(p_map); } cl_map_item_t *cl_qmap_get(const cl_qmap_t * const p_map, const uint64_t key) { cl_map_item_t *p_item; assert(p_map); p_item = __cl_map_root(p_map); while (p_item != &p_map->nil) { if (key == p_item->key) break; /* just right */ if (key < p_item->key) p_item = p_item->p_left; /* too small */ else p_item = p_item->p_right; /* too big */ } return (p_item); } cl_map_item_t *cl_qmap_get_next(const cl_qmap_t * const p_map, const uint64_t key) { cl_map_item_t *p_item; cl_map_item_t *p_item_found; assert(p_map); p_item = __cl_map_root(p_map); p_item_found = (cl_map_item_t *) & p_map->nil; while (p_item != &p_map->nil) { if (key < p_item->key) { p_item_found = p_item; p_item = p_item->p_left; } else { p_item = p_item->p_right; } } return (p_item_found); } void cl_qmap_apply_func(const cl_qmap_t * const p_map, cl_pfn_qmap_apply_t pfn_func, const void *const context) { cl_map_item_t *p_map_item; /* Note that context can have any arbitrary value. */ assert(p_map); assert(pfn_func); p_map_item = cl_qmap_head(p_map); while (p_map_item != cl_qmap_end(p_map)) { pfn_func(p_map_item, (void *)context); p_map_item = cl_qmap_next(p_map_item); } } /* * Balance a tree starting at a given item back to the root. */ static void __cl_map_ins_bal(cl_qmap_t * const p_map, cl_map_item_t * p_item) { cl_map_item_t *p_grand_uncle; assert(p_map); assert(p_item); assert(p_item != &p_map->root); while (p_item->p_up->color == CL_MAP_RED) { if (__cl_map_is_left_child(p_item->p_up)) { p_grand_uncle = p_item->p_up->p_up->p_right; assert(p_grand_uncle); if (p_grand_uncle->color == CL_MAP_RED) { p_grand_uncle->color = CL_MAP_BLACK; p_item->p_up->color = CL_MAP_BLACK; p_item->p_up->p_up->color = CL_MAP_RED; p_item = p_item->p_up->p_up; continue; } if (!__cl_map_is_left_child(p_item)) { p_item = p_item->p_up; __cl_map_rot_left(p_map, p_item); } p_item->p_up->color = CL_MAP_BLACK; p_item->p_up->p_up->color = CL_MAP_RED; __cl_map_rot_right(p_map, p_item->p_up->p_up); } else { p_grand_uncle = p_item->p_up->p_up->p_left; assert(p_grand_uncle); if (p_grand_uncle->color == CL_MAP_RED) { p_grand_uncle->color = CL_MAP_BLACK; p_item->p_up->color = CL_MAP_BLACK; p_item->p_up->p_up->color = CL_MAP_RED; p_item = p_item->p_up->p_up; continue; } if (__cl_map_is_left_child(p_item)) { p_item = p_item->p_up; __cl_map_rot_right(p_map, p_item); } p_item->p_up->color = CL_MAP_BLACK; p_item->p_up->p_up->color = CL_MAP_RED; __cl_map_rot_left(p_map, p_item->p_up->p_up); } } } cl_map_item_t *cl_qmap_insert(cl_qmap_t * const p_map, const uint64_t key, cl_map_item_t * const p_item) { cl_map_item_t *p_insert_at, *p_comp_item; assert(p_map); assert(p_item); assert(p_map->root.p_up == &p_map->root); assert(p_map->root.color != CL_MAP_RED); assert(p_map->nil.color != CL_MAP_RED); p_item->p_left = &p_map->nil; p_item->p_right = &p_map->nil; p_item->key = key; p_item->color = CL_MAP_RED; /* Find the insertion location. */ p_insert_at = &p_map->root; p_comp_item = __cl_map_root(p_map); while (p_comp_item != &p_map->nil) { p_insert_at = p_comp_item; if (key == p_insert_at->key) return (p_insert_at); /* Traverse the tree until the correct insertion point is found. */ if (key < p_insert_at->key) p_comp_item = p_insert_at->p_left; else p_comp_item = p_insert_at->p_right; } assert(p_insert_at != &p_map->nil); assert(p_comp_item == &p_map->nil); /* Insert the item. 
*/ if (p_insert_at == &p_map->root) { p_insert_at->p_left = p_item; /* * Primitive insert places the new item in front of * the existing item. */ __cl_primitive_insert(&p_map->nil.pool_item.list_item, &p_item->pool_item.list_item); } else if (key < p_insert_at->key) { p_insert_at->p_left = p_item; /* * Primitive insert places the new item in front of * the existing item. */ __cl_primitive_insert(&p_insert_at->pool_item.list_item, &p_item->pool_item.list_item); } else { p_insert_at->p_right = p_item; /* * Primitive insert places the new item in front of * the existing item. */ __cl_primitive_insert(p_insert_at->pool_item.list_item.p_next, &p_item->pool_item.list_item); } /* Increase the count. */ p_map->count++; p_item->p_up = p_insert_at; /* * We have added depth to this section of the tree. * Rebalance as necessary as we retrace our path through the tree * and update colors. */ __cl_map_ins_bal(p_map, p_item); __cl_map_root(p_map)->color = CL_MAP_BLACK; /* * Note that it is not necessary to re-color the nil node black because all * red color assignments are made via the p_up pointer, and nil is never * set as the value of a p_up pointer. */ #ifdef _DEBUG_ /* Set the pointer to the map in the map item for consistency checking. */ p_item->p_map = p_map; #endif return (p_item); } static void __cl_map_del_bal(cl_qmap_t * const p_map, cl_map_item_t * p_item) { cl_map_item_t *p_uncle; while ((p_item->color != CL_MAP_RED) && (p_item->p_up != &p_map->root)) { if (__cl_map_is_left_child(p_item)) { p_uncle = p_item->p_up->p_right; if (p_uncle->color == CL_MAP_RED) { p_uncle->color = CL_MAP_BLACK; p_item->p_up->color = CL_MAP_RED; __cl_map_rot_left(p_map, p_item->p_up); p_uncle = p_item->p_up->p_right; } if (p_uncle->p_right->color != CL_MAP_RED) { if (p_uncle->p_left->color != CL_MAP_RED) { p_uncle->color = CL_MAP_RED; p_item = p_item->p_up; continue; } p_uncle->p_left->color = CL_MAP_BLACK; p_uncle->color = CL_MAP_RED; __cl_map_rot_right(p_map, p_uncle); p_uncle = p_item->p_up->p_right; } p_uncle->color = p_item->p_up->color; p_item->p_up->color = CL_MAP_BLACK; p_uncle->p_right->color = CL_MAP_BLACK; __cl_map_rot_left(p_map, p_item->p_up); break; } else { p_uncle = p_item->p_up->p_left; if (p_uncle->color == CL_MAP_RED) { p_uncle->color = CL_MAP_BLACK; p_item->p_up->color = CL_MAP_RED; __cl_map_rot_right(p_map, p_item->p_up); p_uncle = p_item->p_up->p_left; } if (p_uncle->p_left->color != CL_MAP_RED) { if (p_uncle->p_right->color != CL_MAP_RED) { p_uncle->color = CL_MAP_RED; p_item = p_item->p_up; continue; } p_uncle->p_right->color = CL_MAP_BLACK; p_uncle->color = CL_MAP_RED; __cl_map_rot_left(p_map, p_uncle); p_uncle = p_item->p_up->p_left; } p_uncle->color = p_item->p_up->color; p_item->p_up->color = CL_MAP_BLACK; p_uncle->p_left->color = CL_MAP_BLACK; __cl_map_rot_right(p_map, p_item->p_up); break; } } p_item->color = CL_MAP_BLACK; } void cl_qmap_remove_item(cl_qmap_t * const p_map, cl_map_item_t * const p_item) { cl_map_item_t *p_child, *p_del_item; assert(p_map); assert(p_item); if (p_item == cl_qmap_end(p_map)) return; if ((p_item->p_right == &p_map->nil) || (p_item->p_left == &p_map->nil)) { /* The item being removed has children on at most on side. */ p_del_item = p_item; } else { /* * The item being removed has children on both side. * We select the item that will replace it. After removing * the substitute item and rebalancing, the tree will have the * correct topology. Exchanging the substitute for the item * will finalize the removal. 
*/ p_del_item = cl_qmap_next(p_item); assert(p_del_item != &p_map->nil); } /* Remove the item from the list. */ __cl_primitive_remove(&p_item->pool_item.list_item); /* Decrement the item count. */ p_map->count--; /* Get the pointer to the new root's child, if any. */ if (p_del_item->p_left != &p_map->nil) p_child = p_del_item->p_left; else p_child = p_del_item->p_right; /* * This assignment may modify the parent pointer of the nil node. * This is inconsequential. */ p_child->p_up = p_del_item->p_up; (*__cl_map_get_parent_ptr_to_item(p_del_item)) = p_child; if (p_del_item->color != CL_MAP_RED) __cl_map_del_bal(p_map, p_child); /* * Note that the splicing done below does not need to occur before * the tree is balanced, since the actual topology changes are made by the * preceding code. The topology is preserved by the color assignment made * below (reader should be reminded that p_del_item == p_item in some cases). */ if (p_del_item != p_item) { /* * Finalize the removal of the specified item by exchanging it with * the substitute which we removed above. */ p_del_item->p_up = p_item->p_up; p_del_item->p_left = p_item->p_left; p_del_item->p_right = p_item->p_right; (*__cl_map_get_parent_ptr_to_item(p_item)) = p_del_item; p_item->p_right->p_up = p_del_item; p_item->p_left->p_up = p_del_item; p_del_item->color = p_item->color; } assert(p_map->nil.color != CL_MAP_RED); #ifdef _DEBUG_ /* Clear the pointer to the map since the item has been removed. */ p_item->p_map = NULL; #endif } cl_map_item_t *cl_qmap_remove(cl_qmap_t * const p_map, const uint64_t key) { cl_map_item_t *p_item; assert(p_map); /* Seek the node with the specified key */ p_item = cl_qmap_get(p_map, key); cl_qmap_remove_item(p_map, p_item); return (p_item); } void cl_qmap_merge(cl_qmap_t * const p_dest_map, cl_qmap_t * const p_src_map) { cl_map_item_t *p_item, *p_item2, *p_next; assert(p_dest_map); assert(p_src_map); p_item = cl_qmap_head(p_src_map); while (p_item != cl_qmap_end(p_src_map)) { p_next = cl_qmap_next(p_item); /* Remove the item from its current map. */ cl_qmap_remove_item(p_src_map, p_item); /* Insert the item into the destination map. */ p_item2 = cl_qmap_insert(p_dest_map, cl_qmap_key(p_item), p_item); /* Check that the item was successfully inserted. */ if (p_item2 != p_item) { /* Put the item in back in the source map. */ p_item2 = cl_qmap_insert(p_src_map, cl_qmap_key(p_item), p_item); assert(p_item2 == p_item); } p_item = p_next; } } static void __cl_qmap_delta_move(cl_qmap_t * const p_dest, cl_qmap_t * const p_src, cl_map_item_t ** const pp_item) { cl_map_item_t __attribute__((__unused__)) *p_temp; cl_map_item_t *p_next; /* * Get the next item so that we can ensure that pp_item points to * a valid item upon return from the function. */ p_next = cl_qmap_next(*pp_item); /* Move the old item from its current map the the old map. */ cl_qmap_remove_item(p_src, *pp_item); p_temp = cl_qmap_insert(p_dest, cl_qmap_key(*pp_item), *pp_item); /* We should never have duplicates. */ assert(p_temp == *pp_item); /* Point pp_item to a valid item in the source map. 
 */
	(*pp_item) = p_next;
}

void cl_qmap_delta(cl_qmap_t * const p_map1,
		   cl_qmap_t * const p_map2,
		   cl_qmap_t * const p_new, cl_qmap_t * const p_old)
{
	cl_map_item_t *p_item1, *p_item2;
	uint64_t key1, key2;

	assert(p_map1);
	assert(p_map2);
	assert(p_new);
	assert(p_old);
	assert(cl_is_qmap_empty(p_new));
	assert(cl_is_qmap_empty(p_old));

	p_item1 = cl_qmap_head(p_map1);
	p_item2 = cl_qmap_head(p_map2);

	while (p_item1 != cl_qmap_end(p_map1) && p_item2 != cl_qmap_end(p_map2)) {
		key1 = cl_qmap_key(p_item1);
		key2 = cl_qmap_key(p_item2);
		if (key1 < key2) {
			/* We found an old item. */
			__cl_qmap_delta_move(p_old, p_map1, &p_item1);
		} else if (key1 > key2) {
			/* We found a new item. */
			__cl_qmap_delta_move(p_new, p_map2, &p_item2);
		} else {
			/* Move both forward since they have the same key. */
			p_item1 = cl_qmap_next(p_item1);
			p_item2 = cl_qmap_next(p_item2);
		}
	}

	/* Process the remainder if the end of either source map was reached. */
	while (p_item2 != cl_qmap_end(p_map2))
		__cl_qmap_delta_move(p_new, p_map2, &p_item2);

	while (p_item1 != cl_qmap_end(p_map1))
		__cl_qmap_delta_move(p_old, p_map1, &p_item1);
}
rdma-core-56.1/util/cl_qmap.h000066400000000000000000000554321477342711600161050ustar00rootroot00000000000000/*
 * Copyright (c) 2004, 2005 Voltaire, Inc. All rights reserved.
 * Copyright (c) 2002-2005 Mellanox Technologies LTD. All rights reserved.
 * Copyright (c) 1996-2003 Intel Corporation. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses.  You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

/*
 * Abstract:
 *	Declaration of quick map, a binary tree where the caller always provides
 *	all necessary storage.
 */

#ifndef _CL_QMAP_H_
#define _CL_QMAP_H_

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct _cl_list_item {
	struct _cl_list_item *p_next;
	struct _cl_list_item *p_prev;
} cl_list_item_t;

typedef struct _cl_pool_item {
	cl_list_item_t list_item;
} cl_pool_item_t;

/****h* Component Library/Quick Map
 * NAME
 *	Quick Map
 *
 * DESCRIPTION
 *	Quick map implements a binary tree that stores user provided
 *	cl_map_item_t structures.  Each item stored in a quick map has a unique
 *	64-bit key (duplicates are not allowed).  Quick map provides the
 *	ability to efficiently search for an item given a key.
 *
 *	Quick map does not allocate any memory, and can therefore not fail
 *	any operations due to insufficient memory.
Quick map can thus be useful * in minimizing the error paths in code. * * Quick map is not thread safe, and users must provide serialization when * adding and removing items from the map. * * The quick map functions operate on a cl_qmap_t structure which should be * treated as opaque and should be manipulated only through the provided * functions. * * SEE ALSO * Structures: * cl_qmap_t, cl_map_item_t, cl_map_obj_t * * Callbacks: * cl_pfn_qmap_apply_t * * Item Manipulation: * cl_qmap_set_obj, cl_qmap_obj, cl_qmap_key * * Initialization: * cl_qmap_init * * Iteration: * cl_qmap_end, cl_qmap_head, cl_qmap_tail, cl_qmap_next, cl_qmap_prev * * Manipulation: * cl_qmap_insert, cl_qmap_get, cl_qmap_remove_item, cl_qmap_remove, * cl_qmap_remove_all, cl_qmap_merge, cl_qmap_delta, cl_qmap_get_next * * Search: * cl_qmap_apply_func * * Attributes: * cl_qmap_count, cl_is_qmap_empty, *********/ /****i* Component Library: Quick Map/cl_map_color_t * NAME * cl_map_color_t * * DESCRIPTION * The cl_map_color_t enumerated type is used to note the color of * nodes in a map. * * SYNOPSIS */ typedef enum _cl_map_color { CL_MAP_RED, CL_MAP_BLACK } cl_map_color_t; /* * VALUES * CL_MAP_RED * The node in the map is red. * * CL_MAP_BLACK * The node in the map is black. * * SEE ALSO * Quick Map, cl_map_item_t *********/ /****s* Component Library: Quick Map/cl_map_item_t * NAME * cl_map_item_t * * DESCRIPTION * The cl_map_item_t structure is used by maps to store objects. * * The cl_map_item_t structure should be treated as opaque and should * be manipulated only through the provided functions. * * SYNOPSIS */ typedef struct _cl_map_item { /* Must be first to allow casting. */ cl_pool_item_t pool_item; struct _cl_map_item *p_left; struct _cl_map_item *p_right; struct _cl_map_item *p_up; cl_map_color_t color; uint64_t key; #ifdef _DEBUG_ struct _cl_qmap *p_map; #endif } cl_map_item_t; /* * FIELDS * pool_item * Used to store the item in a doubly linked list, allowing more * efficient map traversal. * * p_left * Pointer to the map item that is a child to the left of the node. * * p_right * Pointer to the map item that is a child to the right of the node. * * p_up * Pointer to the map item that is the parent of the node. * * color * Indicates whether a node is red or black in the map. * * key * Value that uniquely represents a node in a map. This value is * set by calling cl_qmap_insert and can be retrieved by calling * cl_qmap_key. * * NOTES * None of the fields of this structure should be manipulated by users, as * they are crititcal to the proper operation of the map in which they * are stored. * * To allow storing items in either a quick list, a quick pool, or a quick * map, the map implementation guarantees that the map item can be safely * cast to a pool item used for storing an object in a quick pool, or cast * to a list item used for storing an object in a quick list. This removes * the need to embed a map item, a list item, and a pool item in objects * that need to be stored in a quick list, a quick pool, and a quick map. * * SEE ALSO * Quick Map, cl_qmap_insert, cl_qmap_key, cl_pool_item_t, cl_list_item_t *********/ /****s* Component Library: Quick Map/cl_map_obj_t * NAME * cl_map_obj_t * * DESCRIPTION * The cl_map_obj_t structure is used to store objects in maps. * * The cl_map_obj_t structure should be treated as opaque and should * be manipulated only through the provided functions. 
* * SYNOPSIS */ typedef struct _cl_map_obj { cl_map_item_t item; const void *p_object; } cl_map_obj_t; /* * FIELDS * item * Map item used by internally by the map to store an object. * * p_object * User defined context. Users should not access this field directly. * Use cl_qmap_set_obj and cl_qmap_obj to set and retrieve the value * of this field. * * NOTES * None of the fields of this structure should be manipulated by users, as * they are crititcal to the proper operation of the map in which they * are stored. * * Use cl_qmap_set_obj and cl_qmap_obj to set and retrieve the object * stored in a map item, respectively. * * SEE ALSO * Quick Map, cl_qmap_set_obj, cl_qmap_obj, cl_map_item_t *********/ /****s* Component Library: Quick Map/cl_qmap_t * NAME * cl_qmap_t * * DESCRIPTION * Quick map structure. * * The cl_qmap_t structure should be treated as opaque and should * be manipulated only through the provided functions. * * SYNOPSIS */ typedef struct _cl_qmap { cl_map_item_t root; cl_map_item_t nil; size_t count; } cl_qmap_t; /* * PARAMETERS * root * Map item that serves as root of the map. The root is set up to * always have itself as parent. The left pointer is set to point * to the item at the root. * * nil * Map item that serves as terminator for all leaves, as well as * providing the list item used as quick list for storing map items * in a list for faster traversal. * * state * State of the map, used to verify that operations are permitted. * * count * Number of items in the map. * * SEE ALSO * Quick Map *********/ /****d* Component Library: Quick Map/cl_pfn_qmap_apply_t * NAME * cl_pfn_qmap_apply_t * * DESCRIPTION * The cl_pfn_qmap_apply_t function type defines the prototype for * functions used to iterate items in a quick map. * * SYNOPSIS */ typedef void (*cl_pfn_qmap_apply_t) (cl_map_item_t * const p_map_item, void *context); /* * PARAMETERS * p_map_item * [in] Pointer to a cl_map_item_t structure. * * context * [in] Value passed to the callback function. * * RETURN VALUE * This function does not return a value. * * NOTES * This function type is provided as function prototype reference for the * function provided by users as a parameter to the cl_qmap_apply_func * function. * * SEE ALSO * Quick Map, cl_qmap_apply_func *********/ /****f* Component Library: Quick Map/cl_qmap_count * NAME * cl_qmap_count * * DESCRIPTION * The cl_qmap_count function returns the number of items stored * in a quick map. * * SYNOPSIS */ static inline uint32_t cl_qmap_count(const cl_qmap_t * const p_map) { assert(p_map); return ((uint32_t) p_map->count); } /* * PARAMETERS * p_map * [in] Pointer to a cl_qmap_t structure whose item count to return. * * RETURN VALUE * Returns the number of items stored in the map. * * SEE ALSO * Quick Map, cl_is_qmap_empty *********/ /****f* Component Library: Quick Map/cl_is_qmap_empty * NAME * cl_is_qmap_empty * * DESCRIPTION * The cl_is_qmap_empty function returns whether a quick map is empty. * * SYNOPSIS */ static inline bool cl_is_qmap_empty(const cl_qmap_t * const p_map) { assert(p_map); return (p_map->count == 0); } /* * PARAMETERS * p_map * [in] Pointer to a cl_qmap_t structure to test for emptiness. * * RETURN VALUES * TRUE if the quick map is empty. * * FALSE otherwise. * * SEE ALSO * Quick Map, cl_qmap_count, cl_qmap_remove_all *********/ /****f* Component Library: Quick Map/cl_qmap_set_obj * NAME * cl_qmap_set_obj * * DESCRIPTION * The cl_qmap_set_obj function sets the object stored in a map object. 
* * SYNOPSIS */ static inline void cl_qmap_set_obj(cl_map_obj_t * const p_map_obj, const void *const p_object) { assert(p_map_obj); p_map_obj->p_object = p_object; } /* * PARAMETERS * p_map_obj * [in] Pointer to a map object stucture whose object pointer * is to be set. * * p_object * [in] User defined context. * * RETURN VALUE * This function does not return a value. * * SEE ALSO * Quick Map, cl_qmap_obj *********/ /****f* Component Library: Quick Map/cl_qmap_obj * NAME * cl_qmap_obj * * DESCRIPTION * The cl_qmap_obj function returns the object stored in a map object. * * SYNOPSIS */ static inline void *cl_qmap_obj(const cl_map_obj_t * const p_map_obj) { assert(p_map_obj); return ((void *)p_map_obj->p_object); } /* * PARAMETERS * p_map_obj * [in] Pointer to a map object stucture whose object pointer to return. * * RETURN VALUE * Returns the value of the object pointer stored in the map object. * * SEE ALSO * Quick Map, cl_qmap_set_obj *********/ /****f* Component Library: Quick Map/cl_qmap_key * NAME * cl_qmap_key * * DESCRIPTION * The cl_qmap_key function retrieves the key value of a map item. * * SYNOPSIS */ static inline uint64_t cl_qmap_key(const cl_map_item_t * const p_item) { assert(p_item); return (p_item->key); } /* * PARAMETERS * p_item * [in] Pointer to a map item whose key value to return. * * RETURN VALUE * Returns the 64-bit key value for the specified map item. * * NOTES * The key value is set in a call to cl_qmap_insert. * * SEE ALSO * Quick Map, cl_qmap_insert *********/ /****f* Component Library: Quick Map/cl_qmap_init * NAME * cl_qmap_init * * DESCRIPTION * The cl_qmap_init function initialized a quick map for use. * * SYNOPSIS */ void cl_qmap_init(cl_qmap_t * const p_map); /* * PARAMETERS * p_map * [in] Pointer to a cl_qmap_t structure to initialize. * * RETURN VALUES * This function does not return a value. * * NOTES * Allows calling quick map manipulation functions. * * SEE ALSO * Quick Map, cl_qmap_insert, cl_qmap_remove *********/ /****f* Component Library: Quick Map/cl_qmap_end * NAME * cl_qmap_end * * DESCRIPTION * The cl_qmap_end function returns the end of a quick map. * * SYNOPSIS */ static inline const cl_map_item_t *cl_qmap_end(const cl_qmap_t * const p_map) { assert(p_map); /* Nil is the end of the map. */ return (&p_map->nil); } /* * PARAMETERS * p_map * [in] Pointer to a cl_qmap_t structure whose end to return. * * RETURN VALUE * Pointer to the end of the map. * * NOTES * cl_qmap_end is useful for determining the validity of map items returned * by cl_qmap_head, cl_qmap_tail, cl_qmap_next, or cl_qmap_prev. If the * map item pointer returned by any of these functions compares to the end, * the end of the map was encoutered. * When using cl_qmap_head or cl_qmap_tail, this condition indicates that * the map is empty. * * SEE ALSO * Quick Map, cl_qmap_head, cl_qmap_tail, cl_qmap_next, cl_qmap_prev *********/ /****f* Component Library: Quick Map/cl_qmap_head * NAME * cl_qmap_head * * DESCRIPTION * The cl_qmap_head function returns the map item with the lowest key * value stored in a quick map. * * SYNOPSIS */ static inline cl_map_item_t *cl_qmap_head(const cl_qmap_t * const p_map) { assert(p_map); return ((cl_map_item_t *) p_map->nil.pool_item.list_item.p_next); } /* * PARAMETERS * p_map * [in] Pointer to a cl_qmap_t structure whose item with the lowest * key is returned. * * RETURN VALUES * Pointer to the map item with the lowest key in the quick map. * * Pointer to the map end if the quick map was empty. 
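 *
 * EXAMPLE
 *	Illustrative forward traversal over a map (not part of this header;
 *	'p_map', 'p_item' and 'do_something' are hypothetical):
 *
 *	for (p_item = cl_qmap_head(p_map); p_item != cl_qmap_end(p_map);
 *	     p_item = cl_qmap_next(p_item))
 *		do_something(p_item);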
 *
 * NOTES
 *	cl_qmap_head does not remove the item from the map.
 *
 * SEE ALSO
 *	Quick Map, cl_qmap_tail, cl_qmap_next, cl_qmap_prev, cl_qmap_end,
 *	cl_qmap_item_t
 *********/

/****f* Component Library: Quick Map/cl_qmap_tail
 * NAME
 *	cl_qmap_tail
 *
 * DESCRIPTION
 *	The cl_qmap_tail function returns the map item with the highest key
 *	value stored in a quick map.
 *
 * SYNOPSIS
 */
static inline cl_map_item_t *cl_qmap_tail(const cl_qmap_t * const p_map)
{
	assert(p_map);
	return ((cl_map_item_t *) p_map->nil.pool_item.list_item.p_prev);
}
/*
 * PARAMETERS
 *	p_map
 *		[in] Pointer to a cl_qmap_t structure whose item with the
 *		highest key is returned.
 *
 * RETURN VALUES
 *	Pointer to the map item with the highest key in the quick map.
 *
 *	Pointer to the map end if the quick map was empty.
 *
 * NOTES
 *	cl_qmap_tail does not remove the item from the map.
 *
 * SEE ALSO
 *	Quick Map, cl_qmap_head, cl_qmap_next, cl_qmap_prev, cl_qmap_end,
 *	cl_qmap_item_t
 *********/

/****f* Component Library: Quick Map/cl_qmap_next
 * NAME
 *	cl_qmap_next
 *
 * DESCRIPTION
 *	The cl_qmap_next function returns the map item with the next higher
 *	key value than a specified map item.
 *
 * SYNOPSIS
 */
static inline cl_map_item_t *cl_qmap_next(const cl_map_item_t * const p_item)
{
	assert(p_item);
	return ((cl_map_item_t *) p_item->pool_item.list_item.p_next);
}
/*
 * PARAMETERS
 *	p_item
 *		[in] Pointer to a map item whose successor to return.
 *
 * RETURN VALUES
 *	Pointer to the map item with the next higher key value in a quick map.
 *
 *	Pointer to the map end if the specified item was the last item in
 *	the quick map.
 *
 * SEE ALSO
 *	Quick Map, cl_qmap_head, cl_qmap_tail, cl_qmap_prev, cl_qmap_end,
 *	cl_map_item_t
 *********/

/****f* Component Library: Quick Map/cl_qmap_prev
 * NAME
 *	cl_qmap_prev
 *
 * DESCRIPTION
 *	The cl_qmap_prev function returns the map item with the next lower
 *	key value than a specified map item.
 *
 * SYNOPSIS
 */
static inline cl_map_item_t *cl_qmap_prev(const cl_map_item_t * const p_item)
{
	assert(p_item);
	return ((cl_map_item_t *) p_item->pool_item.list_item.p_prev);
}
/*
 * PARAMETERS
 *	p_item
 *		[in] Pointer to a map item whose predecessor to return.
 *
 * RETURN VALUES
 *	Pointer to the map item with the next lower key value in a quick map.
 *
 *	Pointer to the map end if the specified item was the first item in
 *	the quick map.
 *
 * SEE ALSO
 *	Quick Map, cl_qmap_head, cl_qmap_tail, cl_qmap_next, cl_qmap_end,
 *	cl_map_item_t
 *********/

/****f* Component Library: Quick Map/cl_qmap_insert
 * NAME
 *	cl_qmap_insert
 *
 * DESCRIPTION
 *	The cl_qmap_insert function inserts a map item into a quick map.
 *	NOTE: Only if such a key does not already exist in the map !!!!
 *
 * SYNOPSIS
 */
cl_map_item_t *cl_qmap_insert(cl_qmap_t * const p_map, const uint64_t key,
			      cl_map_item_t * const p_item);
/*
 * PARAMETERS
 *	p_map
 *		[in] Pointer to a cl_qmap_t structure into which to add the item.
 *
 *	key
 *		[in] Value to assign to the item.
 *
 *	p_item
 *		[in] Pointer to a cl_map_item_t structure to insert into the quick map.
 *
 * RETURN VALUE
 *	Pointer to the item in the map with the specified key.  If insertion
 *	was successful, this is the pointer to the item.  If an item with the
 *	specified key already exists in the map, the pointer to that item is
 *	returned - but the new key is NOT inserted...
 *
 * NOTES
 *	Insertion operations may cause the quick map to rebalance.
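 *
 * EXAMPLE
 *	Illustrative insert-with-duplicate-check ('my_obj' is a hypothetical
 *	struct embedding a cl_map_item_t named 'item'; not part of this
 *	header):
 *
 *	p_item = cl_qmap_insert(&map, key, &my_obj.item);
 *	if (p_item != &my_obj.item)
 *		printf("key already present; nothing inserted\n");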
* * SEE ALSO * Quick Map, cl_qmap_remove, cl_map_item_t *********/ /****f* Component Library: Quick Map/cl_qmap_get * NAME * cl_qmap_get * * DESCRIPTION * The cl_qmap_get function returns the map item associated with a key. * * SYNOPSIS */ cl_map_item_t *cl_qmap_get(const cl_qmap_t * const p_map, const uint64_t key); /* * PARAMETERS * p_map * [in] Pointer to a cl_qmap_t structure from which to retrieve the * item with the specified key. * * key * [in] Key value used to search for the desired map item. * * RETURN VALUES * Pointer to the map item with the desired key value. * * Pointer to the map end if there was no item with the desired key value * stored in the quick map. * * NOTES * cl_qmap_get does not remove the item from the quick map. * * SEE ALSO * Quick Map, cl_qmap_get_next, cl_qmap_remove *********/ /****f* Component Library: Quick Map/cl_qmap_get_next * NAME * cl_qmap_get_next * * DESCRIPTION * The cl_qmap_get_next function returns the first map item associated with a * key > the key specified. * * SYNOPSIS */ cl_map_item_t *cl_qmap_get_next(const cl_qmap_t * const p_map, const uint64_t key); /* * PARAMETERS * p_map * [in] Pointer to a cl_qmap_t structure from which to retrieve the * first item with a key > the specified key. * * key * [in] Key value used to search for the desired map item. * * RETURN VALUES * Pointer to the first map item with a key > the desired key value. * * Pointer to the map end if there was no item with a key > the desired key * value stored in the quick map. * * NOTES * cl_qmap_get_next does not remove the item from the quick map. * * SEE ALSO * Quick Map, cl_qmap_get, cl_qmap_remove *********/ /****f* Component Library: Quick Map/cl_qmap_remove_item * NAME * cl_qmap_remove_item * * DESCRIPTION * The cl_qmap_remove_item function removes the specified map item * from a quick map. * * SYNOPSIS */ void cl_qmap_remove_item(cl_qmap_t * const p_map, cl_map_item_t * const p_item); /* * PARAMETERS * p_map * [in] Pointer to a cl_qmap_t structure from which to * remove item. * * p_item * [in] Pointer to a map item to remove from its quick map. * * RETURN VALUES * This function does not return a value. * * In a debug build, cl_qmap_remove_item asserts that the item being removed * is in the specified map. * * NOTES * Removes the map item pointed to by p_item from its quick map. * * SEE ALSO * Quick Map, cl_qmap_remove, cl_qmap_remove_all, cl_qmap_insert *********/ /****f* Component Library: Quick Map/cl_qmap_remove * NAME * cl_qmap_remove * * DESCRIPTION * The cl_qmap_remove function removes the map item with the specified key * from a quick map. * * SYNOPSIS */ cl_map_item_t *cl_qmap_remove(cl_qmap_t * const p_map, const uint64_t key); /* * PARAMETERS * p_map * [in] Pointer to a cl_qmap_t structure from which to remove the item * with the specified key. * * key * [in] Key value used to search for the map item to remove. * * RETURN VALUES * Pointer to the removed map item if it was found. * * Pointer to the map end if no item with the specified key exists in the * quick map. * * SEE ALSO * Quick Map, cl_qmap_remove_item, cl_qmap_remove_all, cl_qmap_insert *********/ /****f* Component Library: Quick Map/cl_qmap_remove_all * NAME * cl_qmap_remove_all * * DESCRIPTION * The cl_qmap_remove_all function removes all items in a quick map, * leaving it empty. 
* * SYNOPSIS */ static inline void cl_qmap_remove_all(cl_qmap_t * const p_map) { assert(p_map); p_map->root.p_left = &p_map->nil; p_map->nil.pool_item.list_item.p_next = &p_map->nil.pool_item.list_item; p_map->nil.pool_item.list_item.p_prev = &p_map->nil.pool_item.list_item; p_map->count = 0; } /* * PARAMETERS * p_map * [in] Pointer to a cl_qmap_t structure to empty. * * RETURN VALUES * This function does not return a value. * * SEE ALSO * Quick Map, cl_qmap_remove, cl_qmap_remove_item *********/ /****f* Component Library: Quick Map/cl_qmap_merge * NAME * cl_qmap_merge * * DESCRIPTION * The cl_qmap_merge function moves all items from one map to another, * excluding duplicates. * * SYNOPSIS */ void cl_qmap_merge(cl_qmap_t * const p_dest_map, cl_qmap_t * const p_src_map); /* * PARAMETERS * p_dest_map * [out] Pointer to a cl_qmap_t structure to which items should be added. * * p_src_map * [in/out] Pointer to a cl_qmap_t structure whose items to add * to p_dest_map. * * RETURN VALUES * This function does not return a value. * * NOTES * Items are evaluated based on their keys only. * * Upon return from cl_qmap_merge, the quick map referenced by p_src_map * contains all duplicate items. * * SEE ALSO * Quick Map, cl_qmap_delta *********/ /****f* Component Library: Quick Map/cl_qmap_delta * NAME * cl_qmap_delta * * DESCRIPTION * The cl_qmap_delta function computes the differences between two maps. * * SYNOPSIS */ void cl_qmap_delta(cl_qmap_t * const p_map1, cl_qmap_t * const p_map2, cl_qmap_t * const p_new, cl_qmap_t * const p_old); /* * PARAMETERS * p_map1 * [in/out] Pointer to the first of two cl_qmap_t structures whose * differences to compute. * * p_map2 * [in/out] Pointer to the second of two cl_qmap_t structures whose * differences to compute. * * p_new * [out] Pointer to an empty cl_qmap_t structure that contains the * items unique to p_map2 upon return from the function. * * p_old * [out] Pointer to an empty cl_qmap_t structure that contains the * items unique to p_map1 upon return from the function. * * RETURN VALUES * This function does not return a value. * * NOTES * Items are evaluated based on their keys. Items that exist in both * p_map1 and p_map2 remain in their respective maps. Items that * exist only in p_map1 are moved to p_old. Likewise, items that exist only * in p_map2 are moved to p_new. This function can be useful in evaluating * changes between two maps. * * Both maps pointed to by p_new and p_old must be empty on input. This * requirement removes the possibility of failures. * * SEE ALSO * Quick Map, cl_qmap_merge *********/ /****f* Component Library: Quick Map/cl_qmap_apply_func * NAME * cl_qmap_apply_func * * DESCRIPTION * The cl_qmap_apply_func function executes a specified function * for every item stored in a quick map. * * SYNOPSIS */ void cl_qmap_apply_func(const cl_qmap_t * const p_map, cl_pfn_qmap_apply_t pfn_func, const void *const context); /* * PARAMETERS * p_map * [in] Pointer to a cl_qmap_t structure. * * pfn_func * [in] Function invoked for every item in the quick map. * See the cl_pfn_qmap_apply_t function type declaration for * details about the callback function. * * context * [in] Value to pass to the callback functions to provide context. * * RETURN VALUE * This function does not return a value. * * NOTES * The function provided must not perform any map operations, as these * would corrupt the quick map.
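 *
 * A minimal sketch of a conforming read-only callback (hypothetical names;
 * the cl_pfn_qmap_apply_t signature is assumed here to be
 * void (*)(cl_map_item_t * const, void *) as in the complib headers):
 *
 *	static void count_items(cl_map_item_t * const p_item, void *context)
 *	{
 *		(*(size_t *)context)++;
 *	}
 *
 *	size_t n = 0;
 *	cl_qmap_apply_func(p_map, count_items, &n);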
* * SEE ALSO * Quick Map, cl_pfn_qmap_apply_t *********/ #endif /* _CL_QMAP_H_ */ rdma-core-56.1/util/compiler.h000066400000000000000000000025621477342711600162770ustar00rootroot00000000000000/* GPLv2 or OpenIB.org BSD (MIT) See COPYING file */ #ifndef UTIL_COMPILER_H #define UTIL_COMPILER_H /* Use to tag a variable that causes compiler warnings. Use as: int uninitialized_var(sz) This is only enabled for old compilers. gcc 6.x and beyond have excellent static flow analysis. If code solicits a warning from 6.x it is almost certainly too complex for a human to understand. For some reason powerpc uses a different scheme than gcc for flow analysis. gcc 12 seems to have regressed badly here and now acts like PPC gcc does. */ #if (__GNUC__ >= 6 && __GNUC__ < 12 && !defined(__powerpc__)) || defined(__clang__) #define uninitialized_var(x) x #else #define uninitialized_var(x) x = x #endif #ifndef likely #ifdef __GNUC__ #define likely(x) __builtin_expect(!!(x), 1) #else #define likely(x) (x) #endif #endif #ifndef unlikely #ifdef __GNUC__ #define unlikely(x) __builtin_expect(!!(x), 0) #else #define unlikely(x) (x) #endif #endif #ifdef HAVE_FUNC_ATTRIBUTE_ALWAYS_INLINE #define ALWAYS_INLINE __attribute__((always_inline)) #else #define ALWAYS_INLINE #endif /* Use to mark fall through on switch statements as desired. */ #if __GNUC__ >= 7 #define SWITCH_FALLTHROUGH __attribute__ ((fallthrough)) #else #define SWITCH_FALLTHROUGH #endif #ifdef __CHECKER__ # define __force __attribute__((force)) #else # define __force #endif #endif rdma-core-56.1/util/interval_set.c000066400000000000000000000075401477342711600171600ustar00rootroot00000000000000/* GPLv2 or OpenIB.org BSD (MIT) See COPYING file */ #include #include #include #include #include #include struct iset { struct list_head head; pthread_mutex_t lock; }; struct iset_range { struct list_node entry; uint64_t start; uint64_t length; }; struct iset *iset_create(void) { struct iset *iset; iset = calloc(1, sizeof(*iset)); if (!iset) { errno = ENOMEM; return NULL; } pthread_mutex_init(&iset->lock, NULL); list_head_init(&iset->head); return iset; } void iset_destroy(struct iset *iset) { struct iset_range *range, *tmp; list_for_each_safe(&iset->head, range, tmp, entry) free(range); free(iset); } static int range_overlap(uint64_t s1, uint64_t len1, uint64_t s2, uint64_t len2) { if (((s1 < s2) && (s1 + len1 - 1 < s2)) || ((s1 > s2) && (s1 > s2 + len2 - 1))) return 0; return 1; } static struct iset_range *create_range(uint64_t start, uint64_t length) { struct iset_range *range; range = calloc(1, sizeof(*range)); if (!range) { errno = ENOMEM; return NULL; } range->start = start; range->length = length; return range; } static void delete_range(struct iset_range *r) { list_del(&r->entry); free(r); } static bool check_do_combine(struct iset *iset, struct iset_range *p, struct iset_range *n, uint64_t start, uint64_t length) { bool combined2prev = false, combined2next = false; if (p && (p->start + p->length == start)) { p->length += length; combined2prev = true; } if (n && (start + length == n->start)) { if (combined2prev) { p->length += n->length; delete_range(n); } else { n->start = start; n->length += length; } combined2next = true; } return combined2prev || combined2next; } int iset_insert_range(struct iset *iset, uint64_t start, uint64_t length) { struct iset_range *prev = NULL, *r, *rnew; bool found = false, combined; int ret = 0; if (!length || (start + length - 1 < start)) { errno = EINVAL; return errno; } pthread_mutex_lock(&iset->lock); 
list_for_each(&iset->head, r, entry) { if (range_overlap(r->start, r->length, start, length)) { errno = EINVAL; ret = errno; goto out; } if (r->start > start) { found = true; break; } prev = r; } combined = check_do_combine(iset, prev, found ? r : NULL, start, length); if (!combined) { rnew = create_range(start, length); if (!rnew) { ret = errno; goto out; } if (!found) list_add_tail(&iset->head, &rnew->entry); else list_add_before(&iset->head, &r->entry, &rnew->entry); } out: pthread_mutex_unlock(&iset->lock); return ret; } static int power_of_two(uint64_t x) { return ((x != 0) && !(x & (x - 1))); } int iset_alloc_range(struct iset *iset, uint64_t length, uint64_t *start, uint64_t alignment) { struct iset_range *r, *rnew; uint64_t astart, rend; bool found = false; int ret = 0; if (!power_of_two(alignment)) { errno = EINVAL; return errno; } pthread_mutex_lock(&iset->lock); list_for_each(&iset->head, r, entry) { astart = align(r->start, alignment); /* Check for wrap around */ if ((astart + length - 1 >= astart) && (astart + length - 1 <= r->start + r->length - 1)) { found = true; break; } } if (!found) { errno = ENOSPC; ret = errno; goto out; } if (r->start == astart) { if (r->length == length) { /* Case #1 */ delete_range(r); } else { /* Case #2 */ r->start += length; r->length -= length; } } else { rend = r->start + r->length; if (astart + length != rend) { /* Case #4 */ rnew = create_range(astart + length, rend - astart - length); if (!rnew) { ret = errno; goto out; } list_add_after(&iset->head, &r->entry, &rnew->entry); } r->length = astart - r->start; /* Case #3 & #4 */ } *start = astart; out: pthread_mutex_unlock(&iset->lock); return ret; } rdma-core-56.1/util/interval_set.h000066400000000000000000000034421477342711600171620ustar00rootroot00000000000000/* GPLv2 or OpenIB.org BSD (MIT) See COPYING file */ #include struct iset; /** * iset_create - Create an interval set * * Return the created iset if succeeded, NULL otherwise, with errno set */ struct iset *iset_create(void); /** * iset_destroy - Destroy an interval set * @iset: The set to be destroyed */ void iset_destroy(struct iset *iset); /** * iset_insert_range - Insert a range into the set * @iset: The set to be operated * @start: The start address of the range * @length: The length of the range * * If this range is contiguous with the adjacent ranges (before and/or after), * then they will be combined into a larger one.
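 *
 * For example (illustrative values only), two back-to-back insertions of
 * adjoining ranges collapse into one range covering [0x1000, 0x3000):
 *
 *	iset_insert_range(iset, 0x1000, 0x1000);
 *	iset_insert_range(iset, 0x2000, 0x1000);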
* * Return 0 if succeeded, errno otherwise */ int iset_insert_range(struct iset *iset, uint64_t start, uint64_t length); /** * iset_alloc_range - Allocate a range from the set * * @iset: The set to be operated * @length: The length of the range * @start: The start address of the allocated range * @alignment: The alignment that @start must be aligned with, must be power * of two * * Return 0 if succeeded, errno otherwise * * Note: There are these cases: * Case 1: Original range is fully taken +------------------+ |XXXXXXXXXXXXXXXXXX| +------------------+ => (NULL) Case 2: Original range shrunk +------------------+ |XXXXX | +------------------+ => +------------+ | | +------------+ Case 3: Original range shrunk +------------------+ | XXXXX| +------------------+ => +------------+ | | +------------+ Case 4: Original range split +------------------+ | XXXXX | +------------------+ => +-----+ +------+ | | | | +-----+ +------+ */ int iset_alloc_range(struct iset *iset, uint64_t length, uint64_t *start, uint64_t alignment); rdma-core-56.1/util/mmio.c000066400000000000000000000077511477342711600154260ustar00rootroot00000000000000/* GPLv2 or OpenIB.org BSD (MIT) See COPYING file */ #include #include #include #include #include #ifdef __s390x__ #include #include bool s390_is_mio_supported; static __attribute__((constructor)) void check_mio_supported(void) { s390_is_mio_supported = !!(getauxval(AT_HWCAP) & HWCAP_S390_PCI_MIO); } typedef void (*mmio_memcpy_x64_fn_t)(void *, const void *, size_t); /* This uses the STT_GNU_IFUNC extension to have the dynamic linker select the best above implementations at runtime. */ #if HAVE_FUNC_ATTRIBUTE_IFUNC void mmio_memcpy_x64(void *, const void *, size_t) __attribute__((ifunc("resolve_mmio_memcpy_x64"))); static mmio_memcpy_x64_fn_t resolve_mmio_memcpy_x64(uint64_t); #else __asm__(".type mmio_memcpy_x64, %gnu_indirect_function"); mmio_memcpy_x64_fn_t resolve_mmio_memcpy_x64(uint64_t) __asm__("mmio_memcpy_x64"); #endif #define S390_MAX_WRITE_SIZE 128 #define S390_BOUNDARY_SIZE (1 << 12) #define S390_BOUNDARY_MASK (S390_BOUNDARY_SIZE - 1) static uint8_t get_max_write_size(void *dst, size_t len) { size_t offset = ((uint64_t __force)dst) & S390_BOUNDARY_MASK; size_t size = min_t(int, len, S390_MAX_WRITE_SIZE); if (likely(offset + size <= S390_BOUNDARY_SIZE)) return size; return S390_BOUNDARY_SIZE - offset; } static void mmio_memcpy_x64_mio(void *dst, const void *src, size_t bytecnt) { size_t size; /* Input is 8 byte aligned 64 byte chunks. The alignment matches the * requirements of pcistbi but we must not cross a 4K byte boundary. */ while (bytecnt > 0) { size = get_max_write_size(dst, bytecnt); if (size > 8) s390_pcistbi(dst, src, size); else s390_pcistgi(dst, *(uint64_t *)src, 8); src += size; dst += size; bytecnt -= size; } } mmio_memcpy_x64_fn_t resolve_mmio_memcpy_x64(uint64_t hwcap) { if (hwcap & HWCAP_S390_PCI_MIO) return &mmio_memcpy_x64_mio; else return &s390_mmio_write_syscall; } #endif /* __s390x__ */ #if SIZEOF_LONG != 8 static pthread_spinlock_t mmio_spinlock; static __attribute__((constructor)) void lock_constructor(void) { pthread_spin_init(&mmio_spinlock, PTHREAD_PROCESS_PRIVATE); } /* When the arch does not have a 64 bit store we provide an emulation that does two stores in address ascending order while holding a global spinlock.
*/ static void pthread_mmio_write64_be(void *addr, __be64 val) { __be32 first_dword = htobe32(be64toh(val) >> 32); __be32 second_dword = htobe32(be64toh(val)); /* The WC spinlock, by definition, provides global ordering for all UC and WC stores within the critical region. */ mmio_wc_spinlock(&mmio_spinlock); mmio_write32_be(addr, first_dword); mmio_write32_be(addr + 4, second_dword); mmio_wc_spinunlock(&mmio_spinlock); } #if defined(__i386__) #include #include /* For ia32 we have historically emitted movlps SSE instructions to do the 64 bit operations. */ static void __attribute__((target("sse"))) sse_mmio_write64_be(void *addr, __be64 val) { __m128 tmp = {}; tmp = _mm_loadl_pi(tmp, (__force __m64 *)&val); _mm_storel_pi((__m64 *)addr,tmp); } static bool have_sse(void) { unsigned int ax,bx,cx,dx; if (!__get_cpuid(1,&ax,&bx,&cx,&dx)) return false; return dx & bit_SSE; } typedef void (*write64_fn_t)(void *, __be64); /* This uses the STT_GNU_IFUNC extension to have the dynamic linker select the best above implementations at runtime. */ #if HAVE_FUNC_ATTRIBUTE_IFUNC void mmio_write64_be(void *addr, __be64 val) __attribute__((ifunc("resolve_mmio_write64_be"))); static write64_fn_t resolve_mmio_write64_be(void); #else __asm__(".type mmio_write64_be, %gnu_indirect_function"); write64_fn_t resolve_mmio_write64_be(void) __asm__("mmio_write64_be"); #endif write64_fn_t resolve_mmio_write64_be(void) { if (have_sse()) return &sse_mmio_write64_be; return &pthread_mmio_write64_be; } #else void mmio_write64_be(void *addr, __be64 val) { return pthread_mmio_write64_be(addr, val); } #endif /* defined(__i386__) */ #endif /* SIZEOF_LONG != 8 */ rdma-core-56.1/util/mmio.h000066400000000000000000000241261477342711600154260ustar00rootroot00000000000000/* GPLv2 or OpenIB.org BSD (MIT) See COPYING file These accessors always map to PCI-E TLPs in predictable ways. Translation to other buses should follow similar definitions. write32(mem, 1) Produce a 4 byte MemWr TLP with bit 0 of DW byte offset 0 set write32_be(mem, htobe32(1)) Produce a 4 byte MemWr TLP with bit 0 of DW byte offset 3 set write32_le(mem, htole32(1)) Produce a 4 byte MemWr TLP with bit 0 of DW byte offset 0 set For ordering these accessors are similar to the Kernel's concept of writel_relaxed(). When working with UC memory the following hold: 1) Strong ordering is required when talking to the same device (eg BAR), and combining is not permitted: write32(mem, 1); write32(mem + 4, 1); write32(mem, 1); Must produce three TLPs, in order. 2) Ordering ignores all pthread locking: pthread_spin_lock(&lock); write32(mem, global++); pthread_spin_unlock(&lock); When run concurrently on all CPUs the device must observe all stores, but the data value will not be strictly increasing. 3) Interaction with DMA is not ordered. Explicit use of a barrier from udma_barriers is required: *dma_mem = 1; udma_to_device_barrier(); write32(mem, GO_DMA); 4) Access out of program order (eg speculation), either by the CPU or compiler is not permitted: if (cond) read32(); Must not issue a read TLP if cond is false. If these are used with WC memory then #1 and #4 do not apply, and all WC accesses must be bracketed with mmio_wc_start() // mmio_flush_writes() */ #ifndef __UTIL_MMIO_H #define __UTIL_MMIO_H #include #include #include #include #include #include #include /* The first step is to define the 'raw' accessors. To make this very safe with sparse we define two versions of each, a le and a be - however the code is always identical. 
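 For example, MAKE_WRITE(mmio_write32, 32) below emits both
 mmio_write32_be() and mmio_write32_le(); the two bodies are the same and
 only the sparse annotation of the value argument (__be32 vs __le32)
 differs.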
*/ #ifdef __s390x__ #include #define MAKE_WRITE(_NAME_, _SZ_) \ static inline void _NAME_##_be(void *addr, __be##_SZ_ value) \ { \ if (s390_is_mio_supported) \ s390_pcistgi(addr, value, sizeof(value)); \ else \ s390_mmio_write_syscall(addr, &value, sizeof(value)); \ } \ static inline void _NAME_##_le(void *addr, __le##_SZ_ value) \ { \ if (s390_is_mio_supported) \ s390_pcistgi(addr, value, sizeof(value)); \ else \ s390_mmio_write_syscall(addr, &value, sizeof(value)); \ } #define MAKE_READ(_NAME_, _SZ_) \ static inline __be##_SZ_ _NAME_##_be(const void *addr) \ { \ __be##_SZ_ res; \ if (s390_is_mio_supported) \ res = s390_pcilgi(addr, sizeof(res)); \ else \ s390_mmio_read_syscall(addr, &res, sizeof(res)); \ return res; \ } \ static inline __le##_SZ_ _NAME_##_le(const void *addr) \ { \ __le##_SZ_ res; \ if (s390_is_mio_supported) \ res = s390_pcilgi(addr, sizeof(res)); \ else \ s390_mmio_read_syscall(addr, &res, sizeof(res)); \ return res; \ } static inline void mmio_write8(void *addr, uint8_t value) { if (s390_is_mio_supported) s390_pcistgi(addr, value, sizeof(value)); else s390_mmio_write_syscall(addr, &value, sizeof(value)); } static inline uint8_t mmio_read8(const void *addr) { uint8_t res; if (s390_is_mio_supported) res = s390_pcilgi(addr, sizeof(res)); else s390_mmio_read_syscall(addr, &res, sizeof(res)); return res; } #else /* __s390x__ */ #define MAKE_WRITE(_NAME_, _SZ_) \ static inline void _NAME_##_be(void *addr, __be##_SZ_ value) \ { \ atomic_store_explicit((_Atomic(uint##_SZ_##_t) *)addr, \ (__force uint##_SZ_##_t)value, \ memory_order_relaxed); \ } \ static inline void _NAME_##_le(void *addr, __le##_SZ_ value) \ { \ atomic_store_explicit((_Atomic(uint##_SZ_##_t) *)addr, \ (__force uint##_SZ_##_t)value, \ memory_order_relaxed); \ } #define MAKE_READ(_NAME_, _SZ_) \ static inline __be##_SZ_ _NAME_##_be(const void *addr) \ { \ return (__force __be##_SZ_)atomic_load_explicit( \ (_Atomic(uint##_SZ_##_t) *)addr, memory_order_relaxed); \ } \ static inline __le##_SZ_ _NAME_##_le(const void *addr) \ { \ return (__force __le##_SZ_)atomic_load_explicit( \ (_Atomic(uint##_SZ_##_t) *)addr, memory_order_relaxed); \ } static inline void mmio_write8(void *addr, uint8_t value) { atomic_store_explicit((_Atomic(uint8_t) *)addr, value, memory_order_relaxed); } static inline uint8_t mmio_read8(const void *addr) { return atomic_load_explicit((_Atomic(uint8_t) *)addr, memory_order_relaxed); } #endif /* __s390x__ */ MAKE_WRITE(mmio_write16, 16) MAKE_WRITE(mmio_write32, 32) MAKE_READ(mmio_read16, 16) MAKE_READ(mmio_read32, 32) #if SIZEOF_LONG == 8 MAKE_WRITE(mmio_write64, 64) MAKE_READ(mmio_read64, 64) #else void mmio_write64_be(void *addr, __be64 val); static inline void mmio_write64_le(void *addr, __le64 val) { mmio_write64_be(addr, (__be64 __force)val); } /* There is no way to do read64 atomically; rather than provide some sketchy implementation we leave these functions undefined. Users should not call them if SIZEOF_LONG != 8, but instead implement an appropriate version. */ __be64 mmio_read64_be(const void *addr); __le64 mmio_read64_le(const void *addr); #endif /* SIZEOF_LONG == 8 */ #undef MAKE_WRITE #undef MAKE_READ /* Now we can define the host endian versions of the operator, this just includes a call to htole.
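 For example, the generated mmio_write32(addr, 1) is simply
 mmio_write32_le(addr, htole32(1)), which produces the TLP described at
 the top of this header.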
*/ #define MAKE_WRITE(_NAME_, _SZ_) \ static inline void _NAME_(void *addr, uint##_SZ_##_t value) \ { \ _NAME_##_le(addr, htole##_SZ_(value)); \ } #define MAKE_READ(_NAME_, _SZ_) \ static inline uint##_SZ_##_t _NAME_(const void *addr) \ { \ return le##_SZ_##toh(_NAME_##_le(addr)); \ } /* This strictly guarantees the order of TLP generation for the memory copy to be in ascending address order. */ #if defined(__aarch64__) || defined(__arm__) #include static inline void _mmio_memcpy_x64_64b(void *dest, const void *src) { vst4q_u64(dest, vld4q_u64(src)); } static inline void _mmio_memcpy_x64(void *dest, const void *src, size_t bytecnt) { do { _mmio_memcpy_x64_64b(dest, src); bytecnt -= sizeof(uint64x2x4_t); src += sizeof(uint64x2x4_t); dest += sizeof(uint64x2x4_t); } while (bytecnt > 0); } #define mmio_memcpy_x64(dest, src, bytecount) \ ({ \ if (__builtin_constant_p((bytecount) == 64)) \ _mmio_memcpy_x64_64b((dest), (src)); \ else \ _mmio_memcpy_x64((dest), (src), (bytecount)); \ }) #elif defined(__s390x__) void mmio_memcpy_x64(void *dst, const void *src, size_t bytecnt); #else /* Transfer is some multiple of 64 bytes */ static inline void mmio_memcpy_x64(void *dest, const void *src, size_t bytecnt) { uintptr_t *dst_p = dest; /* Caller must guarantee: assert(bytecnt != 0); assert((bytecnt % 64) == 0); assert(((uintptr_t)dest) % __alignof__(*dst) == 0); assert(((uintptr_t)src) % __alignof__(*dst) == 0); */ /* Use the native word size for the copy */ if (sizeof(*dst_p) == 8) { const __be64 *src_p = src; do { /* Do 64 bytes at a time */ mmio_write64_be(dst_p++, *src_p++); mmio_write64_be(dst_p++, *src_p++); mmio_write64_be(dst_p++, *src_p++); mmio_write64_be(dst_p++, *src_p++); mmio_write64_be(dst_p++, *src_p++); mmio_write64_be(dst_p++, *src_p++); mmio_write64_be(dst_p++, *src_p++); mmio_write64_be(dst_p++, *src_p++); bytecnt -= 8 * sizeof(*dst_p); } while (bytecnt > 0); } else if (sizeof(*dst_p) == 4) { const __be32 *src_p = src; do { mmio_write32_be(dst_p++, *src_p++); mmio_write32_be(dst_p++, *src_p++); bytecnt -= 2 * sizeof(*dst_p); } while (bytecnt > 0); } } #endif MAKE_WRITE(mmio_write16, 16) MAKE_WRITE(mmio_write32, 32) MAKE_WRITE(mmio_write64, 64) MAKE_READ(mmio_read16, 16) MAKE_READ(mmio_read32, 32) MAKE_READ(mmio_read64, 64) #undef MAKE_WRITE #undef MAKE_READ #endif rdma-core-56.1/util/node_name_map.c000066400000000000000000000120631477342711600172370ustar00rootroot00000000000000/* * Copyright (c) 2008 Voltaire, Inc. All rights reserved. * Copyright (c) 2007 Lawrence Livermore National Lab * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. * */ #include #include #include #include #include #include #include #include #include #include #include #define PARSE_NODE_MAP_BUFLEN 256 typedef struct _name_map_item { cl_map_item_t item; uint64_t guid; char *name; } name_map_item_t; struct nn_map { cl_qmap_t map; }; static int map_name(void *cxt, uint64_t guid, char *p) { cl_qmap_t *map = cxt; name_map_item_t *item; p = strtok(p, "\"#"); if (!p) return 0; item = malloc(sizeof(*item)); if (!item) return -1; item->guid = guid; item->name = strdup(p); cl_qmap_insert(map, item->guid, (cl_map_item_t *) item); return 0; } void close_node_name_map(nn_map_t * map) { name_map_item_t *item = NULL; if (!map) return; item = (name_map_item_t *) cl_qmap_head(&map->map); while (item != (name_map_item_t *) cl_qmap_end(&map->map)) { item = (name_map_item_t *) cl_qmap_remove(&map->map, item->guid); free(item->name); free(item); item = (name_map_item_t *) cl_qmap_head(&map->map); } free(map); } char *remap_node_name(nn_map_t * map, uint64_t target_guid, const char *nodedesc) { char *rc = NULL; name_map_item_t *item = NULL; if (!map) goto done; item = (name_map_item_t *) cl_qmap_get(&map->map, target_guid); if (item != (name_map_item_t *) cl_qmap_end(&map->map)) rc = strdup(item->name); done: if (rc == NULL) { rc = malloc(IB_SMP_DATA_SIZE + 1); if (rc) { strncpy(rc, nodedesc, IB_SMP_DATA_SIZE); rc[IB_SMP_DATA_SIZE] = '\0'; clean_nodedesc(rc); } } return (rc); } char *clean_nodedesc(char *nodedesc) { int i = 0; nodedesc[IB_SMP_DATA_SIZE] = '\0'; while (nodedesc[i]) { if (!isprint(nodedesc[i])) nodedesc[i] = ' '; i++; } return (nodedesc); } static int parse_node_map_wrap(const char *file_name, int (*create) (void *, uint64_t, char *), void *cxt, char *linebuf, unsigned int linebuflen) { char line[PARSE_NODE_MAP_BUFLEN]; FILE *f; if (!(f = fopen(file_name, "r"))) return -1; while (fgets(line, sizeof(line), f)) { uint64_t guid; char *p, *e; p = line; while (isspace(*p)) p++; if (*p == '\0' || *p == '\n' || *p == '#') continue; guid = strtoull(p, &e, 0); if (e == p || (!isspace(*e) && *e != '#' && *e != '\0')) { fclose(f); errno = EIO; if (linebuf) { memcpy(linebuf, line, min_t(size_t, PARSE_NODE_MAP_BUFLEN, linebuflen)); e = strpbrk(linebuf, "\n"); if (e) *e = '\0'; } return -1; } p = e; while (isspace(*p)) p++; e = strpbrk(p, "\n"); if (e) *e = '\0'; if (create(cxt, guid, p)) { fclose(f); return -1; } } fclose(f); return 0; } nn_map_t *open_node_name_map(const char *node_name_map) { nn_map_t *map; char linebuf[PARSE_NODE_MAP_BUFLEN + 1]; if (!node_name_map) { struct stat buf; node_name_map = IBDIAG_NODENAME_MAP_PATH; if (stat(node_name_map, &buf)) return NULL; } map = malloc(sizeof(*map)); if (!map) return NULL; cl_qmap_init(&map->map); memset(linebuf, '\0', PARSE_NODE_MAP_BUFLEN + 1); if (parse_node_map_wrap(node_name_map, map_name, map, linebuf, PARSE_NODE_MAP_BUFLEN)) { if (errno == EIO) { fprintf(stderr, "WARNING failed to parse node name map " "\"%s\"\n", node_name_map); fprintf(stderr, "WARNING failed line: \"%s\"\n", linebuf); } else fprintf(stderr, "WARNING failed to open node name map " "\"%s\" (%s)\n", node_name_map, strerror(errno)); close_node_name_map(map); return NULL; } return map; } 
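/* Example input accepted by parse_node_map_wrap() above (hypothetical
 * contents; the format is a GUID followed by a name, with '#' starting a
 * comment and quotes delimiting the name):
 *
 *	# node-name-map
 *	0x0008f10400411a08 "spine switch 1"
 */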
rdma-core-56.1/util/node_name_map.h000066400000000000000000000007651477342711600172500ustar00rootroot00000000000000/* Copyright (c) 2019 Mellanox Technologies. All rights reserved. * * Connect to opensm's cl_nodenamemap.h if it is available. */ #ifndef __LIBUTIL_NODE_NAME_MAP_H__ #define __LIBUTIL_NODE_NAME_MAP_H__ #include struct nn_map; typedef struct nn_map nn_map_t; nn_map_t *open_node_name_map(const char *node_name_map); void close_node_name_map(nn_map_t *map); char *remap_node_name(nn_map_t *map, uint64_t target_guid, const char *nodedesc); char *clean_nodedesc(char *nodedesc); #endif rdma-core-56.1/util/open_cdev.c000066400000000000000000000100611477342711600164130ustar00rootroot00000000000000/* * Copyright (c) 2019, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include static int open_cdev_internal(const char *path, dev_t cdev) { struct stat st; int fd; fd = open(path, O_RDWR | O_CLOEXEC); if (fd == -1) return -1; if (fstat(fd, &st) || !S_ISCHR(st.st_mode) || (cdev != 0 && st.st_rdev != cdev)) { close(fd); return -1; } return fd; } /* * In case the cdev is not where we expect it to be, use this more * elaborate approach to find it. This is designed to resolve a race with * module autoloading where udev is concurrently creating the cdev as we are * looking for it. udev has 5 seconds to create the link or we fail. * * Modern userspace and kernels create the /dev/infiniband/X synchronously via * devtmpfs before returning from the netlink query, so they should never use * this path. */ static int open_cdev_robust(const char *devname_hint, dev_t cdev) { struct itimerspec ts = { .it_value = { .tv_sec = 5 } }; uint64_t buf[sizeof(struct inotify_event) * 16 / sizeof(uint64_t)]; struct pollfd fds[2]; char *devpath; int res = -1; int ifd; int tfd; /* * This assumes that udev is being used and is creating the /dev/char/ * symlinks. */ if (asprintf(&devpath, "/dev/char/%u:%u", major(cdev), minor(cdev)) < 0) return -1; /* Use inotify to speed up the resolution time.
*/ ifd = inotify_init1(IN_CLOEXEC | IN_NONBLOCK); if (ifd == -1) goto err_mem; if (inotify_add_watch(ifd, "/dev/char/", IN_CREATE) == -1) goto err_inotify; /* Timerfd is simpler than working with relative time outs */ tfd = timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC); if (tfd == -1) goto err_inotify; if (timerfd_settime(tfd, 0, &ts, NULL) == -1) goto out_timer; res = open_cdev_internal(devpath, cdev); if (res != -1) goto out_timer; fds[0].fd = ifd; fds[0].events = POLLIN; fds[1].fd = tfd; fds[1].events = POLLIN; while (poll(fds, 2, -1) > 0) { res = open_cdev_internal(devpath, cdev); if (res != -1) goto out_timer; if (fds[0].revents) { if (read(ifd, buf, sizeof(buf)) == -1) goto out_timer; } if (fds[1].revents) goto out_timer; } out_timer: close(tfd); err_inotify: close(ifd); err_mem: free(devpath); return res; } int open_cdev(const char *devname_hint, dev_t cdev) { char *devpath; int fd; if (asprintf(&devpath, RDMA_CDEV_DIR "/%s", devname_hint) < 0) return -1; fd = open_cdev_internal(devpath, cdev); free(devpath); if (fd == -1 && cdev != 0) return open_cdev_robust(devname_hint, cdev); return fd; } rdma-core-56.1/util/rdma_nl.c000066400000000000000000000112451477342711600160720ustar00rootroot00000000000000/* * Copyright (c) 2019, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. 
*/ #include #include #include struct nla_policy rdmanl_policy[RDMA_NLDEV_ATTR_MAX] = { [RDMA_NLDEV_ATTR_CHARDEV] = { .type = NLA_U64 }, [RDMA_NLDEV_ATTR_CHARDEV_ABI] = { .type = NLA_U64 }, [RDMA_NLDEV_ATTR_DEV_INDEX] = { .type = NLA_U32 }, [RDMA_NLDEV_ATTR_DEV_NODE_TYPE] = { .type = NLA_U8 }, [RDMA_NLDEV_ATTR_NODE_GUID] = { .type = NLA_U64 }, [RDMA_NLDEV_ATTR_UVERBS_DRIVER_ID] = { .type = NLA_U32 }, [RDMA_NLDEV_ATTR_PORT_INDEX] = { .type = NLA_U32 }, #ifdef NLA_NUL_STRING [RDMA_NLDEV_ATTR_CHARDEV_NAME] = { .type = NLA_NUL_STRING }, [RDMA_NLDEV_ATTR_DEV_NAME] = { .type = NLA_NUL_STRING }, [RDMA_NLDEV_ATTR_DEV_PROTOCOL] = { .type = NLA_NUL_STRING }, #endif /* NLA_NUL_STRING */ [RDMA_NLDEV_SYS_ATTR_COPY_ON_FORK] = { .type = NLA_U8 }, }; static int rdmanl_saw_err_cb(struct sockaddr_nl *nla, struct nlmsgerr *nlerr, void *arg) { bool *failed = arg; *failed = true; return 0; } struct nl_sock *rdmanl_socket_alloc(void) { struct nl_sock *nl; nl = nl_socket_alloc(); if (!nl) return NULL; nl_socket_disable_auto_ack(nl); nl_socket_disable_msg_peek(nl); if (nl_connect(nl, NETLINK_RDMA)) { nl_socket_free(nl); return NULL; } return nl; } int rdmanl_get_copy_on_fork(struct nl_sock *nl, nl_recvmsg_msg_cb_t cb_func, void *data) { bool failed = false; int ret; if (nl_send_simple(nl, RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_SYS_GET), 0, NULL, 0) < 0) return -1; if (nl_socket_modify_err_cb(nl, NL_CB_CUSTOM, rdmanl_saw_err_cb, &failed)) return -1; if (nl_socket_modify_cb(nl, NL_CB_VALID, NL_CB_CUSTOM, cb_func, data)) return -1; do { ret = nl_recvmsgs_default(nl); } while (ret > 0); nl_socket_modify_err_cb(nl, NL_CB_CUSTOM, NULL, NULL); if (ret || failed) return -1; return 0; } int rdmanl_get_devices(struct nl_sock *nl, nl_recvmsg_msg_cb_t cb_func, void *data) { bool failed = false; int ret; if (nl_send_simple(nl, RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET), NLM_F_DUMP, NULL, 0) < 0) return -1; if (nl_socket_modify_err_cb(nl, NL_CB_CUSTOM, rdmanl_saw_err_cb, &failed)) return -1; if (nl_socket_modify_cb(nl, NL_CB_VALID, NL_CB_CUSTOM, cb_func, data)) return -1; do { ret = nl_recvmsgs_default(nl); } while (ret > 0); nl_socket_modify_err_cb(nl, NL_CB_CUSTOM, NULL, NULL); if (ret || failed) return -1; return 0; } int rdmanl_get_chardev(struct nl_sock *nl, int ibidx, const char *name, nl_recvmsg_msg_cb_t cb_func, void *data) { bool failed = false; struct nl_msg *msg; int ret; msg = nlmsg_alloc_simple( RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET_CHARDEV), 0); if (!msg) return -1; if (ibidx != -1) NLA_PUT_U32(msg, RDMA_NLDEV_ATTR_DEV_INDEX, ibidx); NLA_PUT_STRING(msg, RDMA_NLDEV_ATTR_CHARDEV_TYPE, name); ret = nl_send_auto(nl, msg); nlmsg_free(msg); if (ret < 0) return -1; if (nl_socket_modify_err_cb(nl, NL_CB_CUSTOM, rdmanl_saw_err_cb, &failed)) return -1; if (nl_socket_modify_cb(nl, NL_CB_VALID, NL_CB_CUSTOM, cb_func, data)) return -1; do { ret = nl_recvmsgs_default(nl); } while (ret > 0); nl_socket_modify_err_cb(nl, NL_CB_CUSTOM, NULL, NULL); if (ret || failed) return -1; return 0; nla_put_failure: nlmsg_free(msg); return -1; } rdma-core-56.1/util/rdma_nl.h000066400000000000000000000040311477342711600160720ustar00rootroot00000000000000/* * Copyright (c) 2019, Mellanox Technologies. All rights reserved. * * This software is available to you under a choice of one of two * licenses. 
You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef UTIL_RDMA_NL_H #define UTIL_RDMA_NL_H #include #include #include #include #include extern struct nla_policy rdmanl_policy[RDMA_NLDEV_ATTR_MAX]; struct nl_sock *rdmanl_socket_alloc(void); int rdmanl_get_devices(struct nl_sock *nl, nl_recvmsg_msg_cb_t cb_func, void *data); int rdmanl_get_chardev(struct nl_sock *nl, int ibidx, const char *name, nl_recvmsg_msg_cb_t cb_func, void *data); bool get_copy_on_fork(void); int rdmanl_get_copy_on_fork(struct nl_sock *nl, nl_recvmsg_msg_cb_t cb_func, void *data); #endif rdma-core-56.1/util/s390_mmio_insn.h000066400000000000000000000051161477342711600172310ustar00rootroot00000000000000/* GPLv2 or OpenIB.org BSD (MIT) See COPYING file */ #ifndef __S390_UTIL_MMIO_H #define __S390_UTIL_MMIO_H #ifdef __s390x__ #include #include #include #include #include #include #include /* s390 requires special instructions to access IO memory. Originally there were only privileged IO instructions that are exposed via special syscalls. Starting with z15 there are also non-privileged memory IO (MIO) instructions we can execute in user-space. Despite the hardware support this requires support in the kernel. Whether MIO instructions are available is indicated by an ELF hardware capability. */ extern bool s390_is_mio_supported; union register_pair { unsigned __int128 pair; struct { uint64_t even; uint64_t odd; }; }; /* The following pcilgi and pcistgi instructions allow IO memory access from user-space but are only available on z15 and newer.
*/ static inline uint64_t s390_pcilgi(const void *ioaddr, size_t len) { union register_pair ioaddr_len = {.even = (uint64_t)ioaddr, .odd = len}; uint64_t val; int cc; asm volatile ( /* pcilgi */ ".insn rre,0xb9d60000,%[val],%[ioaddr_len]\n" "ipm %[cc]\n" "srl %[cc],28\n" : [cc] "=d" (cc), [val] "=d" (val), [ioaddr_len] "+&d" (ioaddr_len.pair) :: "cc"); if (unlikely(cc)) val = -1ULL; return val; } static inline void s390_pcistgi(void *ioaddr, uint64_t val, size_t len) { union register_pair ioaddr_len = {.even = (uint64_t)ioaddr, .odd = len}; asm volatile ( /* pcistgi */ ".insn rre,0xb9d40000,%[val],%[ioaddr_len]\n" : [ioaddr_len] "+&d" (ioaddr_len.pair) : [val] "d" (val) : "cc", "memory"); } /* This is the block store variant of unprivileged IO access instructions */ static inline void s390_pcistbi(void *ioaddr, const void *data, size_t len) { const uint8_t *src = data; asm volatile ( /* pcistbi */ ".insn rsy,0xeb00000000d4,%[len],%[ioaddr],%[src]\n" : [len] "+d" (len) : [ioaddr] "d" ((uint64_t *)ioaddr), [src] "Q" (*src) : "cc"); } static inline void s390_pciwb(void) { if (s390_is_mio_supported) asm volatile (".insn rre,0xb9d50000,0,0\n"); /* pciwb */ else asm volatile("" ::: "memory"); } static inline void s390_mmio_write_syscall(void *mmio_addr, const void *val, size_t length) { syscall(__NR_s390_pci_mmio_write, mmio_addr, val, length); } static inline void s390_mmio_read_syscall(const void *mmio_addr, void *val, size_t length) { syscall(__NR_s390_pci_mmio_read, mmio_addr, val, length); } #endif /* __s390x__ */ #endif /* __S390_UTIL_MMIO_H */ rdma-core-56.1/util/symver.h000066400000000000000000000100561477342711600160070ustar00rootroot00000000000000/* GPLv2 or OpenIB.org BSD (MIT) See COPYING file These definitions help using the ELF symbol version feature, and must be used in conjunction with the library's map file. */ #ifndef __UTIL_SYMVER_H #define __UTIL_SYMVER_H #include #include /* These macros should only be used if the library is defining compatibility symbols, eg: 213: 000000000000a650 315 FUNC GLOBAL DEFAULT 13 ibv_get_device_list@IBVERBS_1.0 214: 000000000000b020 304 FUNC GLOBAL DEFAULT 13 ibv_get_device_list@@IBVERBS_1.1 Symbols which have only a single implementation should use a normal extern function and be placed in the correct stanza in the linker map file. Follow this pattern to use this feature: public.h: struct ibv_device **ibv_get_device_list(int *num_devices); foo.c: // Implement the latest version LATEST_SYMVER_FUNC(ibv_get_device_list, 1_1, "IBVERBS_1.1", struct ibv_device **, int *num_devices) { ... } // Implement the compat version COMPAT_SYMVER_FUNC(ibv_get_device_list, 1_0, "IBVERBS_1.0", struct ibv_device_1_0 **, int *num_devices) { ... } As well as matching information in the map file. These macros deal with the various ugliness in gcc surrounding symbol versions - The internal name __public_1_x is synthesized by the macro - A prototype for the internal name is created by the macro - If statically linking, the latest symbol expands into a normal function definition - If statically linking, the compat symbols expand into unused static functions that are discarded by the compiler. - The prototype of the latest symbol is checked against the public prototype (only when compiling statically) The extra prototypes are included only to avoid -Wmissing-prototypes warnings.
See also Documentation/versioning.md */ #if HAVE_FUNC_ATTRIBUTE_SYMVER #define _MAKE_SYMVER(_local_sym, _public_sym, _ver_str) \ __attribute__((__symver__(#_public_sym "@" _ver_str))) #else #define _MAKE_SYMVER(_local_sym, _public_sym, _ver_str) \ asm(".symver " #_local_sym "," #_public_sym "@" _ver_str); #endif #define _MAKE_SYMVER_FUNC(_public_sym, _uniq, _ver_str, _ret, ...) \ _ret __##_public_sym##_##_uniq(__VA_ARGS__); \ _MAKE_SYMVER(__##_public_sym##_##_uniq, _public_sym, _ver_str) \ _ret __##_public_sym##_##_uniq(__VA_ARGS__) #if defined(HAVE_FULL_SYMBOL_VERSIONS) && !defined(_STATIC_LIBRARY_BUILD_) // Produce all symbol versions for dynamic linking # define COMPAT_SYMVER_FUNC(_public_sym, _uniq, _ver_str, _ret, ...) \ _MAKE_SYMVER_FUNC(_public_sym, _uniq, _ver_str, _ret, __VA_ARGS__) # define LATEST_SYMVER_FUNC(_public_sym, _uniq, _ver_str, _ret, ...) \ _MAKE_SYMVER_FUNC(_public_sym, _uniq, "@" _ver_str, _ret, __VA_ARGS__) #elif defined(HAVE_LIMITED_SYMBOL_VERSIONS) && !defined(_STATIC_LIBRARY_BUILD_) /* Produce only an implementation for the latest symbol and tag it with the * correct symbol version. This supports dynamic linkers that do not * understand symbol versions */ # define COMPAT_SYMVER_FUNC(_public_sym, _uniq, _ver_str, _ret, ...) \ static inline _ret __##_public_sym##_##_uniq(__VA_ARGS__) # define LATEST_SYMVER_FUNC(_public_sym, _uniq, _ver_str, _ret, ...) \ _MAKE_SYMVER_FUNC(_public_sym, _uniq, "@" _ver_str, _ret, __VA_ARGS__) #else // Static linking, or linker does not support symbol versions #define COMPAT_SYMVER_FUNC(_public_sym, _uniq, _ver_str, _ret, ...) \ static inline __attribute__((unused)) \ _ret __##_public_sym##_##_uniq(__VA_ARGS__) #define LATEST_SYMVER_FUNC(_public_sym, _uniq, _ver_str, _ret, ...) \ static __attribute__((unused)) \ _ret __##_public_sym##_##_uniq(__VA_ARGS__) \ __attribute__((alias(stringify(_public_sym)))); \ extern _ret _public_sym(__VA_ARGS__) #endif #endif rdma-core-56.1/util/tests/000077500000000000000000000000001477342711600154515ustar00rootroot00000000000000rdma-core-56.1/util/tests/CMakeLists.txt000066400000000000000000000001521477342711600202070ustar00rootroot00000000000000rdma_test_executable(bitmap_test bitmap_test.c) target_link_libraries(bitmap_test LINK_PRIVATE rdma_util) rdma-core-56.1/util/tests/bitmap_test.c000066400000000000000000000070051477342711600201320ustar00rootroot00000000000000// SPDX-License-Identifier: (GPL-2.0 OR Linux-OpenIB) #include #include #include #include #include static int failed_tests; #define EXPECT_EQ(expected, actual) \ ({ \ typeof(expected) _expected = (expected); \ typeof(actual) _actual = (actual); \ if (_expected != _actual) { \ printf(" FAIL at line %d: %s not %s\n", __LINE__, \ #expected, #actual); \ printf("\tExpected: %ld\n", (long) _expected); \ printf("\t Actual: %ld\n", (long) _actual); \ failed_tests++; \ } \ }) #define EXPECT_TRUE(actual) EXPECT_EQ(true, actual) #define EXPECT_FALSE(actual) EXPECT_EQ(false, actual) static void test_bitmap_empty(unsigned long *bmp, const int nbits) { for (int i = 1; i < nbits; i++) { bitmap_zero(bmp, nbits); EXPECT_TRUE(bitmap_empty(bmp, nbits)); bitmap_set_bit(bmp, i); EXPECT_TRUE(bitmap_empty(bmp, i)); EXPECT_FALSE(bitmap_empty(bmp, i + 1)); } } static void test_bitmap_find_first_bit(unsigned long *bmp, const int nbits) { bitmap_zero(bmp, nbits); for (int i = 0; i < nbits; i++) { EXPECT_EQ(nbits, bitmap_find_first_bit(bmp, 0, nbits)); bitmap_set_bit(bmp, i); EXPECT_EQ(i, bitmap_find_first_bit(bmp, 0, nbits)); EXPECT_EQ(i, bitmap_find_first_bit(bmp, i,
nbits)); EXPECT_EQ(i, bitmap_find_first_bit(bmp, 0, i + 1)); EXPECT_EQ(i, bitmap_find_first_bit(bmp, 0, i)); EXPECT_EQ(nbits, bitmap_find_first_bit(bmp, i + 1, nbits)); bitmap_clear_bit(bmp, i); } } static void test_bitmap_zero_region(unsigned long *bmp, const int nbits) { for (int end = 0; end <= nbits; end++) { int bit = end / 2; bitmap_fill(bmp, nbits); bitmap_zero_region(bmp, bit, end); for (int i = 0; i < nbits; i++) { bool expected = i < bit || i >= end; EXPECT_EQ(expected, bitmap_test_bit(bmp, i)); } } } static void test_bitmap_fill_region(unsigned long *bmp, const int nbits) { for (int end = 0; end <= nbits; end++) { int bit = end / 2; bitmap_zero(bmp, nbits); bitmap_fill_region(bmp, bit, end); for (int i = 0; i < nbits; i++) { bool expected = i >= bit && i < end; EXPECT_EQ(expected, bitmap_test_bit(bmp, i)); } } } static void test_bitmap_find_free_region(unsigned long *bmp, const int nbits) { for (int region_size = 1; region_size <= nbits; region_size++) { int start = nbits - region_size; bitmap_zero(bmp, nbits); EXPECT_EQ(0, bitmap_find_free_region(bmp, nbits, region_size)); if (start > region_size) bitmap_fill_region(bmp, region_size - 1, start); else bitmap_fill_region(bmp, 0, start); EXPECT_EQ(start, bitmap_find_free_region(bmp, nbits, region_size)); } } int main(int argc, char **argv) { int all_failed_tests = 0; int nbitsv[] = { BITS_PER_LONG, BITS_PER_LONG - 1, BITS_PER_LONG + 1, BITS_PER_LONG / 2, BITS_PER_LONG * 2, }; for (int i = 0; i < ARRAY_SIZE(nbitsv); i++) { int nbits = nbitsv[i]; unsigned long *bmp = bitmap_alloc0(nbits); #define TEST(func_name) do { \ VALGRIND_MAKE_MEM_UNDEFINED(bmp, BITS_TO_LONGS(nbits) * sizeof(long)); \ failed_tests = 0; \ (func_name)(bmp, nbits); \ printf("%6s %s(nbits=%d)\n", failed_tests ? "FAILED" : "OK", #func_name, \ nbits); \ all_failed_tests += failed_tests; \ } while (0) TEST(test_bitmap_empty); TEST(test_bitmap_find_first_bit); TEST(test_bitmap_zero_region); TEST(test_bitmap_fill_region); TEST(test_bitmap_find_free_region); #undef TEST printf("\n"); free(bmp); } if (all_failed_tests) { printf("%d tests failed\n", all_failed_tests); return 1; } return 0; } rdma-core-56.1/util/udma_barrier.h000066400000000000000000000266621477342711600171300ustar00rootroot00000000000000/* * Copyright (c) 2005 Topspin Communications. All rights reserved. * * This software is available to you under a choice of one of two * licenses. You may choose to be licensed under the terms of the GNU * General Public License (GPL) Version 2, available from the file * COPYING in the main directory of this source tree, or the * OpenIB.org BSD license below: * * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following * conditions are met: * * - Redistributions of source code must retain the above * copyright notice, this list of conditions and the following * disclaimer. * * - Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimer in the documentation and/or other materials * provided with the distribution. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE * SOFTWARE. */ #ifndef __UTIL_UDMA_BARRIER_H #define __UTIL_UDMA_BARRIER_H #include /* Barriers for DMA. These barriers are explicitly only for use with user DMA operations. If you are looking for barriers to use with cache-coherent multi-threaded consistency then look in stdatomic.h. If you need both kinds of synchronicity for the same address then use an atomic operation followed by one of these barriers. When reasoning about these barriers there are two objects: - CPU attached address space (the CPU memory could be a range of things: cached/uncached/non-temporal CPU DRAM, uncached MMIO space in another device, pMEM). Generally speaking the ordering is only relative to the local CPU's view of the system. Eg if the local CPU is not guaranteed to see a write from another CPU then it is also OK for the DMA device to also not see the write after the barrier. - A DMA initiator on a bus. For instance a PCI-E device issuing MemRd/MemWr TLPs. The ordering guarantee is always stated between those two streams. Eg what happens if a MemRd TLP is sent in via PCI-E relative to a CPU WRITE to the same memory location. The providers have a very regular and predictable use of these barriers, to make things very clear each narrow use is given a name and the proper name should be used in the provider as a form of documentation. */ /* Ensure that the device's view of memory matches the CPU's view of memory. This should be placed before any MMIO store that could trigger the device to begin doing DMA, such as a device doorbell ring. eg *dma_buf = 1; udma_to_device_barrier(); mmio_write(DO_DMA_REG, dma_buf); Must ensure that the device sees the '1'. This is required to fence writes created by the libibverbs user. Those writes could be to any CPU mapped memory object with any cachability mode. NOTE: x86 has historically used a weaker semantic for this barrier, and only fenced normal stores to normal memory. libibverbs users using other memory types or non-temporal stores are required to use SFENCE in their own code prior to calling verbs to start a DMA. */ #if defined(__i386__) #define udma_to_device_barrier() asm volatile("" ::: "memory") #elif defined(__x86_64__) #define udma_to_device_barrier() asm volatile("" ::: "memory") #elif defined(__PPC64__) #define udma_to_device_barrier() asm volatile("sync" ::: "memory") #elif defined(__PPC__) #define udma_to_device_barrier() asm volatile("sync" ::: "memory") #elif defined(__ia64__) #define udma_to_device_barrier() asm volatile("mf" ::: "memory") #elif defined(__sparc_v9__) #define udma_to_device_barrier() asm volatile("membar #StoreStore" ::: "memory") #elif defined(__aarch64__) #define udma_to_device_barrier() asm volatile("dmb oshst" ::: "memory") #elif defined(__sparc__) || defined(__s390x__) #define udma_to_device_barrier() asm volatile("" ::: "memory") #elif defined(__loongarch__) #define udma_to_device_barrier() asm volatile("dbar 0" ::: "memory") #elif defined(__riscv) #define udma_to_device_barrier() asm volatile("fence ow,ow" ::: "memory") #elif defined(__mips__) #define udma_to_device_barrier() asm volatile("sync" ::: "memory") #else #error No architecture specific memory barrier defines found! #endif /* Ensure that all ordered stores from the device are observable from the CPU.
This only makes sense after something that observes an ordered store from the device - eg by reading an MMIO register or seeing that CPU memory is updated. This guarantees that all reads that follow the barrier see the ordered stores that preceded the observation. For instance, this would be used after testing a valid bit in memory that is a DMA target, to ensure that the following reads see the data written before the MemWr TLP that set the valid bit. */ #if defined(__i386__) #define udma_from_device_barrier() asm volatile("lock; addl $0,0(%%esp) " ::: "memory") #elif defined(__x86_64__) #define udma_from_device_barrier() asm volatile("lfence" ::: "memory") #elif defined(__PPC64__) #define udma_from_device_barrier() asm volatile("lwsync" ::: "memory") #elif defined(__PPC__) #define udma_from_device_barrier() asm volatile("sync" ::: "memory") #elif defined(__ia64__) #define udma_from_device_barrier() asm volatile("mf" ::: "memory") #elif defined(__sparc_v9__) #define udma_from_device_barrier() asm volatile("membar #LoadLoad" ::: "memory") #elif defined(__aarch64__) #define udma_from_device_barrier() asm volatile("dmb oshld" ::: "memory") #elif defined(__sparc__) || defined(__s390x__) #define udma_from_device_barrier() asm volatile("" ::: "memory") #elif defined(__loongarch__) #define udma_from_device_barrier() asm volatile("dbar 0" ::: "memory") #elif defined(__riscv) #define udma_from_device_barrier() asm volatile("fence ir,ir" ::: "memory") #elif defined(__mips__) #define udma_from_device_barrier() asm volatile("sync" ::: "memory") #else #error No architecture specific memory barrier defines found! #endif /* Order writes to CPU memory so that a DMA device cannot view writes after the barrier without also seeing all writes before the barrier. This does not guarantee any writes are visible to DMA. This would be used in cases where a DMA buffer might have a valid bit and data; this barrier is placed after writing the data but before writing the valid bit to ensure the DMA device cannot observe a set valid bit with unwritten data. Compared to udma_to_device_barrier() this barrier is not required to fence anything but normal stores to normal malloc memory. Usage should be: write_wqe udma_to_device_barrier(); // Get user memory ready for DMA wqe->addr = ...; wqe->flags = ...; udma_ordering_write_barrier(); // Guarantee WQE written in order wqe->valid = 1; */ #define udma_ordering_write_barrier() udma_to_device_barrier() /* Promptly flush writes to MMIO Write Combining memory. This should be used after a write to WC memory. This is both a barrier and a hint to the CPU to flush any buffers to reduce latency to TLP generation. This is not required to have any effect on CPU memory. If done while holding a lock then the ordering of MMIO writes across CPUs must be guaranteed to follow the natural ordering implied by the lock. This must also act as a barrier that prevents write combining, eg *wc_mem = 1; mmio_flush_writes(); *wc_mem = 2; Must always produce two MemWr TLPs, '1' and '2'. Without the barrier the CPU is allowed to produce a single TLP '2'. Note that there is no order guarantee for writes to WC memory without barriers. This is intended to be used in conjunction with WC memory to generate large PCI-E MemWr TLPs from the CPU.
*/ #if defined(__i386__) #define mmio_flush_writes() asm volatile("lock; addl $0,0(%%esp) " ::: "memory") #elif defined(__x86_64__) #define mmio_flush_writes() asm volatile("sfence" ::: "memory") #elif defined(__PPC64__) #define mmio_flush_writes() asm volatile("sync" ::: "memory") #elif defined(__PPC__) #define mmio_flush_writes() asm volatile("sync" ::: "memory") #elif defined(__ia64__) #define mmio_flush_writes() asm volatile("fwb" ::: "memory") #elif defined(__sparc_v9__) #define mmio_flush_writes() asm volatile("membar #StoreStore" ::: "memory") #elif defined(__aarch64__) #define mmio_flush_writes() asm volatile("dsb st" ::: "memory"); #elif defined(__sparc__) #define mmio_flush_writes() asm volatile("" ::: "memory") #elif defined(__loongarch__) #define mmio_flush_writes() asm volatile("dbar 0" ::: "memory") #elif defined(__riscv) #define mmio_flush_writes() asm volatile("fence ow,ow" ::: "memory") #elif defined(__s390x__) #include "s390_mmio_insn.h" #define mmio_flush_writes() s390_pciwb() #elif defined(__mips__) #define mmio_flush_writes() asm volatile("sync" ::: "memory") #else #error No architecture specific memory barrier defines found! #endif /* Prevent WC writes from being re-ordered relative to other MMIO writes. This should be used before a write to WC memory. This must act as a barrier to prevent write re-ordering from different memory types: *mmio_mem = 1; mmio_flush_writes(); *wc_mem = 2; Must always produce a TLP '1' followed by '2'. This barrier implies udma_to_device_barrier() This is intended to be used in conjunction with WC memory to generate large PCI-E MemWr TLPs from the CPU. */ #define mmio_wc_start() mmio_flush_writes() /* Keep MMIO writes in order. Currently we lack writel macros that universally guarantee MMIO writes happen in order, like the kernel does. Even worse many providers haphazardly open code writes to MMIO memory omitting even volatile. Until this can be fixed with a proper writel macro, this barrier is a stand in to indicate places where MMIO writes should be switched to some future writel. */ #define mmio_ordered_writes_hack() mmio_flush_writes() /* Write Combining Spinlock primitive Any access to a multi-value WC region must ensure that multiple cpus do not write to the same values concurrently; these macros make that straightforward and efficient if the chosen exclusion is a spinlock. The spinlock guarantees that the WC writes issued within the critical section are made visible as TLP to the device. The TLP must be seen by the device strictly in the order that the spinlocks are acquired, and combining WC writes between different sections is not permitted. Use of these macros allows the fencing inside the spinlock to be combined with the fencing required for DMA. */ static inline void mmio_wc_spinlock(pthread_spinlock_t *lock) { pthread_spin_lock(lock); #if !defined(__i386__) && !defined(__x86_64__) /* For x86 the serialization within the spin lock is enough to * strongly order WC and other memory types. */ mmio_wc_start(); #endif } static inline void mmio_wc_spinunlock(pthread_spinlock_t *lock) { /* It is possible that on x86 the atomic in the lock is strong enough * to force-flush the WC buffers quickly, and this SFENCE can be * omitted too.
	 */
	mmio_flush_writes();
	pthread_spin_unlock(lock);
}

#endif
rdma-core-56.1/util/util.c000066400000000000000000000023231477342711600154300ustar00rootroot00000000000000/* GPLv2 or OpenIB.org BSD (MIT) See COPYING file */
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/random.h>
#include <time.h>
#include <util/util.h>

int set_fd_nonblock(int fd, bool nonblock)
{
	int val;

	val = fcntl(fd, F_GETFL);
	if (val == -1)
		return -1;

	if (nonblock)
		val |= O_NONBLOCK;
	else
		val &= ~(unsigned int)(O_NONBLOCK);

	if (fcntl(fd, F_SETFL, val) == -1)
		return -1;
	return 0;
}

#ifndef GRND_INSECURE
#define GRND_INSECURE 0x0004
#endif

unsigned int get_random(void)
{
	static unsigned int seed;
	ssize_t sz;

	if (!seed) {
		sz = getrandom(&seed, sizeof(seed),
			       GRND_NONBLOCK | GRND_INSECURE);
		if (sz < 0)
			sz = getrandom(&seed, sizeof(seed), GRND_NONBLOCK);

		if (sz != sizeof(seed))
			seed = time(NULL);
	}

	return rand_r(&seed);
}

bool check_env(const char *var)
{
	const char *env_value = getenv(var);

	return env_value && (strcmp(env_value, "0") != 0);
}

/* Xorshift random number generator */
uint32_t xorshift32(struct xorshift32_state *state)
{
	/* Algorithm "xor" from p. 4 of Marsaglia, "Xorshift RNGs" */
	uint32_t x = state->seed;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	return state->seed = x;
}
rdma-core-56.1/util/util.h000066400000000000000000000060511477342711600154370ustar00rootroot00000000000000/* GPLv2 or OpenIB.org BSD (MIT) See COPYING file */
#ifndef UTIL_UTIL_H
#define UTIL_UTIL_H

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Return true if the snprintf succeeded, false if there was truncation or
 * error */
static inline bool __good_snprintf(size_t len, int rc)
{
	return (rc < len && rc >= 0);
}

#define check_snprintf(buf, len, fmt, ...)                                     \
	__good_snprintf(len, snprintf(buf, len, fmt, ##__VA_ARGS__))

/* a CMP b. See also the BSD macro timercmp(). */
#define ts_cmp(a, b, CMP)                                                      \
	(((a)->tv_sec == (b)->tv_sec) ?                                        \
		 ((a)->tv_nsec CMP (b)->tv_nsec) :                             \
		 ((a)->tv_sec CMP (b)->tv_sec))

#define offsetofend(_type, _member)                                            \
	(offsetof(_type, _member) + sizeof(((_type *)0)->_member))

#define BITS_PER_LONG	   (8 * sizeof(long))
#define BITS_PER_LONG_LONG (8 * sizeof(long long))

#define BITS_TO_LONGS(nr) (((nr) + BITS_PER_LONG - 1) / BITS_PER_LONG)

#define GENMASK(h, l)                                                          \
	(((~0UL) - (1UL << (l)) + 1) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
#define GENMASK_ULL(h, l)                                                      \
	(((~0ULL) << (l)) & (~0ULL >> (BITS_PER_LONG_LONG - 1 - (h))))

#define BIT(nr) (1UL << (nr))
#define BIT_ULL(nr) (1ULL << (nr))

#define __bf_shf(x) (__builtin_ffsll(x) - 1)

/**
 * FIELD_PREP() - prepare a bitfield element
 * @_mask: shifted mask defining the field's length and position
 * @_val: value to put in the field
 *
 * FIELD_PREP() masks and shifts up the value. The result should
 * be combined with other fields of the bitfield using logical OR.
 */
#define FIELD_PREP(_mask, _val)                                                \
	({                                                                     \
		((typeof(_mask))(_val) << __bf_shf(_mask)) & (_mask);          \
	})

/**
 * FIELD_GET() - extract a bitfield element
 * @_mask: shifted mask defining the field's length and position
 * @_reg: value of entire bitfield
 *
 * FIELD_GET() extracts the field specified by @_mask from the
 * bitfield passed in as @_reg by masking and shifting it down.
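 *
 * A small illustrative sketch (the mask here is arbitrary, not taken
 * from any real register layout):
 *
 *   reg = FIELD_PREP(GENMASK(7, 4), 0x5);   [reg == 0x50]
 *   val = FIELD_GET(GENMASK(7, 4), reg);    [val == 0x5]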
 */
#define FIELD_GET(_mask, _reg)                                                 \
	({                                                                     \
		(typeof(_mask))(((_reg) & (_mask)) >> __bf_shf(_mask));        \
	})

static inline unsigned long align(unsigned long val, unsigned long align)
{
	return (val + align - 1) & ~(align - 1);
}

static inline unsigned long align_down(unsigned long val, unsigned long _align)
{
	return align(val - (_align - 1), _align);
}

static inline uint64_t roundup_pow_of_two(uint64_t n)
{
	return n == 1 ? 1 : 1ULL << ilog64(n - 1);
}

static inline unsigned long DIV_ROUND_UP(unsigned long n, unsigned long d)
{
	return (n + d - 1) / d;
}

struct xorshift32_state {
	/* The state word must be initialized to non-zero */
	uint32_t seed;
};

uint32_t xorshift32(struct xorshift32_state *state);

int set_fd_nonblock(int fd, bool nonblock);

int open_cdev(const char *devname_hint, dev_t cdev);

unsigned int get_random(void);
bool check_env(const char *var);

#endif
rdma-core-56.1/buildlib/pandoc-prebuilt/0000755000175100002000000000000014773456421026061 5ustar00vsts_azpcontainerdocker_azpcontainer00000000000000rdma-core-56.1/buildlib/pandoc-prebuilt/a91b9346f932b9f38e4ce3ec5ee815fd39fa0a910000644000175100002000000000306014773456415033631 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_devx_create_event_channel, mlx5dv_devx_destroy_event_channel" "3" "" "" ""
.hy
.SH NAME
.PP
mlx5dv_devx_create_event_channel - Create an event channel to be used
for DEVX asynchronous events.
.PP
mlx5dv_devx_destroy_event_channel - Destroy a DEVX event channel.
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

struct mlx5dv_devx_event_channel {
    int fd;
};

struct mlx5dv_devx_event_channel *
mlx5dv_devx_create_event_channel(struct ibv_context *context,
                                 enum mlx5dv_devx_create_event_channel_flags flags)

void mlx5dv_devx_destroy_event_channel(struct mlx5dv_devx_event_channel *event_channel)
\f[R]
.fi
.SH DESCRIPTION
.PP
Create or destroy a channel to be used for DEVX asynchronous events.
.PP
The create verb exposes an mlx5dv_devx_event_channel object that can be
used to read asynchronous DEVX events.
This lets an application subscribe to device events and, once an event
occurs, read it from this object.
.SH ARGUMENTS
.TP
\f[I]context\f[R]
RDMA device context to create the channel on.
.TP
\f[I]flags\f[R]
MLX5DV_DEVX_CREATE_EVENT_CHANNEL_FLAGS_OMIT_EV_DATA: omit the event
data on this channel.
.SH RETURN VALUE
.PP
Upon success \f[I]mlx5dv_devx_create_event_channel\f[R] will return a
new \f[I]struct mlx5dv_devx_event_channel\f[R] object, on error NULL
will be returned and errno will be set.
.SH SEE ALSO
.PP
\f[I]mlx5dv_open_device(3)\f[R], \f[I]mlx5dv_devx_obj_create(3)\f[R]
.SH AUTHOR
.PP
Yishai Hadas
rdma-core-56.1/buildlib/pandoc-prebuilt/162c504f31acfe07dc7c4eed5abd1c873ac2f7340000644000175100002000000000542114773456414033733 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_create_flow" "3" "2018-9-19" "mlx5" "mlx5 Programmer\[cq]s Manual"
.hy
.SH NAME
.PP
mlx5dv_create_flow - creates a steering flow rule
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

struct ibv_flow *
mlx5dv_create_flow(struct mlx5dv_flow_matcher *flow_matcher,
                   struct mlx5dv_flow_match_parameters *match_value,
                   size_t num_actions,
                   struct mlx5dv_flow_action_attr actions_attr[])
\f[R]
.fi
.SH DESCRIPTION
.PP
\f[B]mlx5dv_create_flow()\f[R] creates a steering flow rule with the
ability to specify specific driver properties.
.SH ARGUMENTS .PP Please see \f[I]mlx5dv_create_flow_matcher(3)\f[R] for \f[I]flow_matcher\f[R] and \f[I]match_value\f[R]. .TP \f[I]num_actions\f[R] Specifies how many actions are passed in \f[I]actions_attr\f[R] .SS \f[I]actions_attr\f[R] .IP .nf \f[C] struct mlx5dv_flow_action_attr { enum mlx5dv_flow_action_type type; union { struct ibv_qp *qp; struct ibv_counters *counter; struct ibv_flow_action *action; uint32_t tag_value; struct mlx5dv_devx_obj *obj; }; }; \f[R] .fi .TP \f[I]type\f[R] MLX5DV_FLOW_ACTION_DEST_IBV_QP The QP passed will receive the matched packets. MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION The flow action to be applied. MLX5DV_FLOW_ACTION_TAG Flow tag to be provided in work completion. MLX5DV_FLOW_ACTION_DEST_DEVX The DEVX destination object for the matched packets. MLX5DV_FLOW_ACTION_COUNTERS_DEVX The DEVX counter object for the matched packets. MLX5DV_FLOW_ACTION_DEFAULT_MISS Steer the packet to the default miss destination. MLX5DV_FLOW_ACTION_DROP Action is dropping the matched packet. .TP \f[I]qp\f[R] QP passed, to be used with \f[I]type\f[R] \f[I]MLX5DV_FLOW_ACTION_DEST_IBV_QP\f[R]. .TP \f[I]action\f[R] Flow action, to be used with \f[I]type\f[R] \f[I]MLX5DV_FLOW_ACTION_IBV_FLOW_ACTION\f[R] see \f[I]mlx5dv_create_flow_action_modify_header(3)\f[R] and \f[I]mlx5dv_create_flow_action_packet_reformat(3)\f[R]. .TP \f[I]tag_value\f[R] tag value to be passed in the work completion, to be used with \f[I]type\f[R] \f[I]MLX5DV_FLOW_ACTION_TAG\f[R] see \f[I]ibv_create_cq_ex(3)\f[R]. .TP \f[I]obj\f[R] DEVX object, to be used with \f[I]type\f[R] \f[I]MLX5DV_FLOW_ACTION_DEST_DEVX\f[R] or by \f[I]MLX5DV_FLOW_ACTION_COUNTERS_DEVX\f[R]. .SH RETURN VALUE .PP \f[B]mlx5dv_create_flow\f[R] returns a pointer to the created flow rule, on error NULL will be returned and errno will be set. .SH SEE ALSO .PP \f[I]mlx5dv_create_flow_action_modify_header(3)\f[R], \f[I]mlx5dv_create_flow_action_packet_reformat(3)\f[R], \f[I]mlx5dv_create_flow_matcher(3)\f[R], \f[I]mlx5dv_create_qp(3)\f[R], \f[I]ibv_create_qp_ex(3)\f[R] \f[I]ibv_create_cq_ex(3)\f[R] \f[I]ibv_create_counters(3)\f[R] .SH AUTHOR .PP Mark Bloch rdma-core-56.1/buildlib/pandoc-prebuilt/20bb05893074311222ee111bdbb5f232376d1fdd0000644000175100002000000000346014773456415033252 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_devx_alloc_uar / mlx5dv_devx_free_uar" "3" "" "" "" .hy .SH NAME .PP mlx5dv_devx_alloc_uar - Allocates a DEVX UAR .PP mlx5dv_devx_free_uar - Frees a DEVX UAR .SH SYNOPSIS .IP .nf \f[C] #include struct mlx5dv_devx_uar *mlx5dv_devx_alloc_uar(struct ibv_context *context, uint32_t flags); void mlx5dv_devx_free_uar(struct mlx5dv_devx_uar *devx_uar); \f[R] .fi .SH DESCRIPTION .PP Create / free a DEVX UAR which is needed for other device commands over the DEVX interface. .PP The DEVX API enables direct access from the user space area to the mlx5 device driver, the UAR information is needed for few commands as of QP creation. .SH ARGUMENTS .TP \f[I]context\f[R] RDMA device context to work on. .TP \f[I]flags\f[R] Allocation flags for the UAR. MLX5DV_UAR_ALLOC_TYPE_BF: Allocate UAR with Blueflame properties. MLX5DV_UAR_ALLOC_TYPE_NC: Allocate UAR with non-cache properties. MLX5DV_UAR_ALLOC_TYPE_NC_DEDICATED: Allocate a dedicated UAR with non-cache properties. .SS devx_uar .IP .nf \f[C] struct mlx5dv_devx_uar { void *reg_addr; void *base_addr; uint32_t page_id; off_t mmap_off; uint64_t comp_mask; }; \f[R] .fi .TP \f[I]reg_addr\f[R] The write address of DB/BF. 
.TP \f[I]base_addr\f[R] The base address of the UAR. .TP \f[I]page_id\f[R] The device page id to be used. .TP \f[I]mmap_off\f[R] The mmap offset parameter to be used for re-mapping, to be used by a secondary process. .SH RETURN VALUE .PP Upon success \f[I]mlx5dv_devx_alloc_uar\f[R] will return a new \f[I]struct mlx5dv_devx_uar\f[R], on error NULL will be returned and errno will be set. .SH SEE ALSO .PP \f[B]mlx5dv_open_device\f[R], \f[B]mlx5dv_devx_obj_create\f[R] .PP #AUTHOR .PP Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/532c7a2d93d5555e2b0b1403669a2be45ec851d40000644000175100002000000000535514773456412033303 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_GET_DEVICE_LIST" "3" "2006-10-31" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_get_device_list, ibv_free_device_list - get and release list of available RDMA devices .SH SYNOPSIS .IP .nf \f[C] #include struct ibv_device **ibv_get_device_list(int *num_devices); void ibv_free_device_list(struct ibv_device **list); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_get_device_list()\f[R] returns a NULL-terminated array of RDMA devices currently available. The argument \f[I]num_devices\f[R] is optional; if not NULL, it is set to the number of devices returned in the array. .PP \f[B]ibv_free_device_list()\f[R] frees the array of devices \f[I]list\f[R] returned by \f[B]ibv_get_device_list()\f[R]. .SH RETURN VALUE .PP \f[B]ibv_get_device_list()\f[R] returns the array of available RDMA devices, or sets \f[I]errno\f[R] and returns NULL if the request fails. If no devices are found then \f[I]num_devices\f[R] is set to 0, and non-NULL is returned. .PP \f[B]ibv_free_device_list()\f[R] returns no value. .SH ERRORS .TP \f[B]EPERM\f[R] Permission denied. .TP \f[B]ENOSYS\f[R] No kernel support for RDMA. .TP \f[B]ENOMEM\f[R] Insufficient memory to complete the operation. .SH NOTES .PP Client code should open all the devices it intends to use with \f[B]ibv_open_device()\f[R] before calling \f[B]ibv_free_device_list()\f[R]. Once it frees the array with \f[B]ibv_free_device_list()\f[R], it will be able to use only the open devices; pointers to unopened devices will no longer be valid. .PP Setting the environment variable \f[B]IBV_SHOW_WARNINGS\f[R] will cause warnings to be emitted to stderr if a kernel verbs device is discovered, but no corresponding userspace driver can be found for it. .SH STATIC LINKING .PP If \f[B]libibverbs\f[R] is statically linked to the application then all provider drivers must also be statically linked. The library will not load dynamic providers when static linking is used. .PP To link the providers set the \f[B]RDMA_STATIC_PROVIDERS\f[R] define to the comma separated list of desired providers when compiling the application. The special keyword `all' will statically link all supported \f[B]libibverbs\f[R] providers. .PP This is intended to be used along with \f[B]pkg-config(1)\f[R] to setup the proper flags for \f[B]libibverbs\f[R] linking. .PP If this is not done then \f[B]ibv_get_device_list\f[R] will always return an empty list. .PP Using only dynamic linking for \f[B]libibverbs\f[R] applications is strongly recommended. 
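.SH EXAMPLES
.PP
The following minimal sketch enumerates the available devices and
prints their names; error handling beyond the NULL check is trimmed
for brevity.
.IP
.nf
\f[C]
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices;
    struct ibv_device **list = ibv_get_device_list(&num_devices);

    if (!list)
        return 1;
    for (int i = 0; i < num_devices; i++)
        printf("%s\[rs]n", ibv_get_device_name(list[i]));
    ibv_free_device_list(list);
    return 0;
}
\f[R]
.fi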
.SH SEE ALSO .PP \f[B]ibv_fork_init\f[R](3), \f[B]ibv_get_device_guid\f[R](3), \f[B]ibv_get_device_name\f[R](3), \f[B]ibv_get_device_index\f[R](3), \f[B]ibv_open_device\f[R](3) .SH AUTHOR .PP Dotan Barak rdma-core-56.1/buildlib/pandoc-prebuilt/dde42016fc7808c9fd87d819ed0f34905949155b0000644000175100002000000000304314773456415033342 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_get_vfio_device_list" "3" "" "" "" .hy .SH NAME .PP mlx5dv_get_vfio_device_list - Get list of available devices to be used over VFIO .SH SYNOPSIS .IP .nf \f[C] #include struct ibv_device ** mlx5dv_get_vfio_device_list(struct mlx5dv_vfio_context_attr *attr); \f[R] .fi .SH DESCRIPTION .PP Returns a NULL-terminated array of devices based on input \f[I]attr\f[R]. .SH ARGUMENTS .TP \f[I]attr\f[R] Describe the VFIO devices to return in list. .SS \f[I]attr\f[R] argument .IP .nf \f[C] struct mlx5dv_vfio_context_attr { const char *pci_name; uint32_t flags; uint64_t comp_mask; }; \f[R] .fi .TP \f[I]pci_name\f[R] The PCI name of the required device. .TP \f[I]flags\f[R] .IP .nf \f[C] A bitwise OR of the various values described below. *MLX5DV_VFIO_CTX_FLAGS_INIT_LINK_DOWN*: Upon device initialization link should stay down. \f[R] .fi .TP \f[I]comp_mask\f[R] .IP .nf \f[C] Bitmask specifying what fields in the structure are valid. \f[R] .fi .SH RETURN VALUE .PP Returns the array of the matching devices, or sets errno and returns NULL if the request fails. .SH NOTES .PP Client code should open all the devices it intends to use with ibv_open_device() before calling ibv_free_device_list(). Once it frees the array with ibv_free_device_list(), it will be able to use only the open devices; pointers to unopened devices will no longer be valid. .SH SEE ALSO .PP \f[I]ibv_open_device(3)\f[R] \f[I]ibv_free_device_list(3)\f[R] .SH AUTHOR .PP Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/62e2ea4dd0d8c3cd1746e2913630817883715f600000644000175100002000000001205414773456417033157 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBADDR" 8 "2013-10-11" "" "OpenIB Diagnostics" .SH NAME IBADDR \- query InfiniBand address(es) .SH SYNOPSIS .sp ibaddr [options] .SH DESCRIPTION .sp Display the lid (and range) as well as the GID address of the port specified (by DR path, lid, or GUID) or the local port by default. .sp Note: this utility can be used as simple address resolver. .SH OPTIONS .sp \fB\-\-gid_show, \-g\fP show gid address only .sp \fB\-\-lid_show, \-l\fP show lid range only .sp \fB\-\-Lid_show, \-L\fP show lid range (in decimal) only .SS Addressing Flags .\" Define the common option -D for Directed routes . 
.sp \fB\-D, \-\-Direct\fP The address specified is a directed route .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C Examples: [options] \-D [options] "0" # self port [options] \-D [options] "0,1,2,1,4" # out via port 1, then 2, ... (Note the second number in the path specified must match the port being used. This can be specified using the port selection flag \(aq\-P\(aq or the port found through the automatic selection process.) .ft P .fi .UNINDENT .UNINDENT .\" Define the common option -G . .sp \fB\-G, \-\-Guid\fP The address specified is a Port GUID .\" Define the common option -s . .sp \fB\-s, \-\-sm_port \fP use \(aqsmlid\(aq as the target lid for SA queries. .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Configuration flags .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. 
.SH EXAMPLES .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # local port\e\(aqs address ibaddr 32 # show lid range and gid of lid 32 ibaddr \-G 0x8f1040023 # same but using guid address ibaddr \-l 32 # show lid range only ibaddr \-L 32 # show decimal lid range only ibaddr \-g 32 # show gid address only .ft P .fi .UNINDENT .UNINDENT .SH SEE ALSO .sp \fBibroute (8), ibtracert (8)\fP .SH AUTHOR .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%halr@voltaire.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/4ec307c3a681a39f63a082f76df1f238be73f2d90000644000175100002000000002727314773456416033474 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\"t .\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_wr_set_mkey_sig_block" "3" "" "" "" .hy .SH NAME .PP mlx5dv_wr_set_mkey_sig_block - Configure a MKEY for block signature (data integrity) operation. .SH SYNOPSIS .IP .nf \f[C] #include static inline void mlx5dv_wr_set_mkey_sig_block(struct mlx5dv_qp_ex *mqp, const struct mlx5dv_sig_block_attr *attr) \f[R] .fi .SH DESCRIPTION .PP Configure a MKEY with block-level data protection properties. With this, the device can add/modify/strip/validate integrity fields per block when transmitting data from memory to network and when receiving data from network to memory. .PP This setter can be optionally called after a MKEY configuration work request posting has started using \f[B]mlx5dv_wr_mkey_configure\f[R](3). Configuring block signature properties to a MKEY is done by describing what kind of signature is required (or expected) in two domains: the wire domain and the memory domain. .PP The MKEY represents a virtually contiguous memory, by configuring a layout to it. The memory signature domain describes whether data in this virtually contiguous memory includes integrity fields, and if so, what kind(\f[B]enum mlx5dv_sig_type\f[R]) and what block size(\f[B]enum mlx5dv_block_size\f[R]). .PP The wire signature domain describes the same kind of properties for the data as it is seen on the wire. Now, depending on the actual operation that happens (TX or RX), the device will do the \[lq]right thing\[rq] based on the signature configurations of the two domains. .SS Example 1: .PP Memory signature domain is configured for CRC32 every 512B block. .PP Wire signature domain is configured for no signature. .PP A SEND is issued using the MKEY as a local key. .PP Result: device will gather the data with the CRC32 fields from the MKEY (using whatever layout configured to the MKEY to locate the actual memory), validate each CRC32 against the previous 512 bytes of data, strip the CRC32 field, and transmit only 512 bytes of data to the wire. .SS Example 1.1: .PP Same as above, but a RECV is issued with the same key, and RX happens. .PP Result: device will receive the data from the wire, scatter it to the MKEY (using whatever layout configured to the MKEY to locate the actual memory), generating and scattering additional CRC32 field after every 512 bytes that are scattered. .SS Example 2: .PP Memory signature domain is configured for no signature. .PP Wire signature domain is configured for T10DIF every 4K block. .PP The MKEY is sent to a remote node that issues a RDMA_READ to this MKEY. .PP Result: device will gather the data from the MKEY (using whatever layout configured to the MKEY to locate the actual memory), transmit it to the wire while generating an additional T10DIF field every 4K of data. 
.SS Example 2.1:
.PP
Same as above, but the remote node issues an RDMA_WRITE to this MKEY.
.PP
Result: Device will receive the data from the wire, validate each
T10DIF field against the previous 4K of data, strip the T10DIF field,
and scatter the data alone to the MKEY (using whatever layout
configured to the MKEY to locate the actual memory).
.SH ARGUMENTS
.TP
\f[I]mqp\f[R]
The QP where an MKEY configuration work request was created by
\f[B]mlx5dv_wr_mkey_configure()\f[R].
.TP
\f[I]attr\f[R]
Block signature attributes to set for the MKEY.
.SS Block signature attributes
.PP
Block signature attributes describe the input and output data
structures in memory and wire domains.
.IP
.nf
\f[C]
struct mlx5dv_sig_block_attr {
    const struct mlx5dv_sig_block_domain *mem;
    const struct mlx5dv_sig_block_domain *wire;
    uint32_t flags;
    uint8_t check_mask;
    uint8_t copy_mask;
    uint64_t comp_mask;
};
\f[R]
.fi
.TP
\f[I]mem\f[R]
A pointer to the signature configuration for the memory domain or NULL
if the domain does not have a signature.
.TP
\f[I]wire\f[R]
A pointer to the signature configuration for the wire domain or NULL
if the domain does not have a signature.
.TP
\f[I]flags\f[R]
A bitwise OR of the various values described below.
.RS
.TP
\f[B]MLX5DV_SIG_BLOCK_ATTR_FLAG_COPY_MASK\f[R]
If the bit is not set, then \f[I]copy_mask\f[R] is ignored.
See details in the \f[I]copy_mask\f[R] description.
.RE
.TP
\f[I]check_mask\f[R]
Each bit of \f[I]check_mask\f[R] corresponds to a byte of the signature
field in the input domain.
A byte of the input signature is checked if the corresponding bit in
\f[I]check_mask\f[R] is set.
Bits not relevant to the signature type are ignored.
.RS
.PP
Layout of \f[I]check_mask\f[R].
.TS
tab(@);
l l l l l l l l l.
T{
check_mask (bits)
T}@T{
7
T}@T{
6
T}@T{
5
T}@T{
4
T}@T{
3
T}@T{
2
T}@T{
1
T}@T{
0
T}
_
T{
T10-DIF (bytes)
T}@T{
GUARD[1]
T}@T{
GUARD[0]
T}@T{
APP[1]
T}@T{
APP[0]
T}@T{
REF[3]
T}@T{
REF[2]
T}@T{
REF[1]
T}@T{
REF[0]
T}
T{
CRC32C/CRC32 (bytes)
T}@T{
3
T}@T{
2
T}@T{
1
T}@T{
0
T}@T{
T}@T{
T}@T{
T}@T{
T}
T{
CRC64_XP10 (bytes)
T}@T{
7
T}@T{
6
T}@T{
5
T}@T{
4
T}@T{
3
T}@T{
2
T}@T{
1
T}@T{
0
T}
.TE
.PP
Commonly used masks are defined in \f[B]enum mlx5dv_sig_mask\f[R].
Other masks are also supported.
Follow the above table to define a custom mask.
For example, this can be useful for the application tag field of the
T10DIF signature.
Using the application tag is out of the scope of the T10DIF
specification and depends on the implementation.
\f[I]check_mask\f[R] allows validating a part of the application tag
if needed.
.RE
.TP
\f[I]copy_mask\f[R]
A mask to specify what part of the signature is copied from the source
domain to the destination domain.
The copy mask is usually calculated automatically.
The signature is copied if the same signature type is configured on
both domains.
The parts of the T10-DIF are compared and handled independently.
.RS
.PP
If \f[B]MLX5DV_SIG_BLOCK_ATTR_FLAG_COPY_MASK\f[R] is set, the
\f[I]copy_mask\f[R] attribute overrides the calculated value of the
copy mask.
Otherwise, \f[I]copy_mask\f[R] is ignored.
.PP
Each bit of \f[I]copy_mask\f[R] corresponds to a byte of the signature
field.
If the corresponding bit in \f[I]copy_mask\f[R] is set, the byte of
the signature field is copied from the input domain to the output
domain.
Calculation according to the output domain configuration is not
performed in this case.
Bits not relevant to the signature type are ignored.
\f[I]copy_mask\f[R] may be used only if input and output domains have the same structure, i.e.\ same block size and signature type. The MKEY configuration will fail if \f[B]MLX5DV_SIG_BLOCK_ATTR_FLAG_COPY_MASK\f[R] is set but the domains have different signature structures. .PP The predefined masks are available in \f[B]enum mlx5dv_sig_mask\f[R]. It is also supported to specify a user-defined mask. Follow the table in \f[I]check_mask\f[R] description to define a custom mask. .PP \f[I]copy_mask\f[R] can be useful when some bytes of the signature are not known in advance, hence can\[cq]t be checked, but shall be preserved. In this case corresponding bits should be cleared in \f[I]check_mask\f[R] and set in \f[I]copy_mask\f[R]. .RE .TP \f[I]comp_mask\f[R] Reserved for future extension, must be 0 now. .SS Block signature domain .IP .nf \f[C] struct mlx5dv_sig_block_domain { enum mlx5dv_sig_type sig_type; union { const struct mlx5dv_sig_t10dif *dif; const struct mlx5dv_sig_crc *crc; } sig; enum mlx5dv_block_size block_size; uint64_t comp_mask; }; \f[R] .fi .TP \f[I]sig_type\f[R] The signature type for this domain, one of the following .RS .TP \f[B]MLX5DV_SIG_TYPE_T10DIF\f[R] The block-level data protection defined in the T10 specifications (T10 SBC-3). .TP \f[B]MLX5DV_SIG_TYPE_CRC\f[R] The block-level data protection based on cyclic redundancy check (CRC). The specific type of CRC is defined in \f[I]sig\f[R]. .RE .TP \f[I]sig\f[R] Depending on \f[I]sig_type\f[R], this is the per signature type specific configuration. .TP \f[I]block_size\f[R] The block size for this domain, one of \f[B]enum mlx5dv_block_size\f[R]. .TP \f[I]comp_mask\f[R] Reserved for future extension, must be 0 now. .SS CRC signature .IP .nf \f[C] struct mlx5dv_sig_crc { enum mlx5dv_sig_crc_type type; uint64_t seed; }; \f[R] .fi .TP \f[I]type\f[R] The specific CRC type, one of the following. .RS .TP \f[B]MLX5DV_SIG_CRC_TYPE_CRC32\f[R] CRC32 signature is created by calculating a 32-bit CRC defined in Fibre Channel Physical and Signaling Interface (FC-PH), ANSI X3.230:1994. .TP \f[B]MLX5DV_SIG_CRC_TYPE_CRC32C\f[R] CRC32C signature is created by calculating a 32-bit CRC called the Castagnoli CRC, defined in the Internet Small Computer Systems Interface (iSCSI) rfc3720. .TP \f[B]MLX5DV_SIG_CRC_TYPE_CRC64_XP10\f[R] CRC64_XP10 signature is created by calculating a 64-bit CRC defined in Microsoft XP10 compression standard. .RE .TP \f[I]seed\f[R] A seed for the CRC calculation per block. Bits not relevant to the CRC type are ignored. For example, all bits are used for CRC64_XP10, but only the 32 least significant bits are used for CRC32/CRC32C. .RS .PP Only the following values are supported as a seed: CRC32/CRC32C - 0, 0xFFFFFFFF(UINT32_MAX); CRC64_XP10 - 0, 0xFFFFFFFFFFFFFFFF(UINT64_MAX). .RE .SS T10DIF signature .PP T10DIF signature is defined in the T10 specifications (T10 SBC-3) for block-level data protection. The size of data block protected by T10DIF must be modulo 8bytes as required in the T10DIF specifications. Note that when setting the initial LBA value to \f[I]ref_tag\f[R], it should be the value of the first block to be transmitted. .IP .nf \f[C] struct mlx5dv_sig_t10dif { enum mlx5dv_sig_t10dif_bg_type bg_type; uint16_t bg; uint16_t app_tag; uint32_t ref_tag; uint16_t flags; }; \f[R] .fi .TP \f[I]bg_type\f[R] The block guard type to be used, one of the following. .RS .TP \f[B]MLX5DV_SIG_T10DIF_CRC\f[R] Use CRC in the block guard field as required in the T10DIF specifications. 
.TP \f[B]MLX5DV_SIG_T10DIF_CSUM\f[R] Use IP checksum instead of CRC in the block guard field. .RE .TP \f[I]bg\f[R] A seed for the block guard calculation per block. .RS .PP The following values are supported as a seed: 0, 0xFFFF(UINT16_MAX). .RE .TP \f[I]app_tag\f[R] An application tag to generate or validate. .TP \f[I]ref_tag\f[R] A reference tag to generate or validate. .TP \f[I]flags\f[R] Flags for the T10DIF attributes, one of the following. .RS .TP \f[B]MLX5DV_SIG_T10DIF_FLAG_REF_REMAP\f[R] Increment reference tag per block. .TP \f[B]MLX5DV_SIG_T10DIF_FLAG_APP_ESCAPE\f[R] Do not check block guard if application tag is 0xFFFF. .TP \f[B]MLX5DV_SIG_T10DIF_FLAG_APP_REF_ESCAPE\f[R] Do not check block guard if application tag is 0xFFFF and reference tag is 0xFFFFFFFF. .RE .SH RETURN VALUE .PP This function does not return a value. .PP In case of error, user will be notified later when completing the DV WRs chain. .SH Notes .PP A DEVX context should be opened by using \f[B]mlx5dv_open_device\f[R](3). .PP MKEY must be created with \f[B]MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE\f[R] flag. .PP The last operation posted on the supplied QP should be \f[B]mlx5dv_wr_mkey_configure\f[R](3), or one of its related setters, and the operation must still be open (no doorbell issued). .PP In case of \f[B]ibv_wr_complete()\f[R] failure or calling to \f[B]ibv_wr_abort()\f[R], the MKey may be left in an unknown state. The next configuration of it should not assume any previous state of the MKey, i.e.\ signature/crypto should be re-configured or reset, as required. For example, assuming \f[B]mlx5dv_wr_set_mkey_sig_block()\f[R] and then \f[B]ibv_wr_abort()\f[R] were called, then on the next configuration of the MKey, if signature is not needed, it should be reset using \f[B]MLX5DV_MKEY_CONF_FLAG_RESET_SIG_ATTR\f[R]. .SH SEE ALSO .PP \f[B]mlx5dv_wr_mkey_configure\f[R](3), \f[B]mlx5dv_create_mkey\f[R](3), \f[B]mlx5dv_destroy_mkey\f[R](3) .SH AUTHORS .PP Oren Duer .PP Sergey Gorenko rdma-core-56.1/buildlib/pandoc-prebuilt/48b5e97fbd80118046be969d71f060dcd9a4ceb20000644000175100002000000002255614773456421033551 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "SAQUERY" 8 "2017-08-21" "" "Open IB Diagnostics" .SH NAME saquery \- query InfiniBand subnet administration attributes .SH SYNOPSIS .sp saquery [options] [ | | ] .SH DESCRIPTION .sp saquery issues the selected SA query. Node records are queried by default. 
.SH OPTIONS .INDENT 0.0 .TP .B \fB\-p\fP get PathRecord info .TP .B \fB\-N\fP get NodeRecord info .TP .B \fB\-D, \-\-list\fP get NodeDescriptions of CAs only .TP .B \fB\-S\fP get ServiceRecord info .TP .B \fB\-I\fP get InformInfoRecord (subscription) info .TP .B \fB\-L\fP return the Lids of the name specified .TP .B \fB\-l\fP return the unique Lid of the name specified .TP .B \fB\-G\fP return the Guids of the name specified .TP .B \fB\-O\fP return the name for the Lid specified .TP .B \fB\-U\fP return the name for the Guid specified .TP .B \fB\-c\fP get the SA\(aqs class port info .TP .B \fB\-s\fP return the PortInfoRecords with isSM or isSMdisabled capability mask bit on .TP .B \fB\-g\fP get multicast group info .TP .B \fB\-m\fP get multicast member info. If a group is specified, limit the output to the group specified and print one line containing only the GUID and node description for each entry. Example: saquery \-m 0xc000 .TP .B \fB\-x\fP get LinkRecord info .TP .B \fB\-\-src\-to\-dst \fP get a PathRecord for where src and dst are either node names or LIDs .TP .B \fB\-\-sgid\-to\-dgid \fP get a PathRecord for \fBsgid\fP to \fBdgid\fP where both GIDs are in an IPv6 format acceptable to \fBinet_pton (3)\fP .TP .B \fB\-\-smkey \fP use SM_Key value for the query. Will be used only with "trusted" queries. If non\-numeric value (like \(aqx\(aq) is specified then saquery will prompt for a value. Default (when not specified here or in /usr/local/etc/infiniband\-diags/ibdiag.conf) is to use SM_Key == 0 (or "untrusted") .UNINDENT .\" Define the common option -K . .INDENT 0.0 .TP .B \fB\-K, \-\-show_keys\fP show security keys (mkey, smkey, etc.) associated with the request. .UNINDENT .sp \fB\-\-slid \fP Source LID (PathRecord) .sp \fB\-\-dlid \fP Destination LID (PathRecord) .sp \fB\-\-mlid \fP Multicast LID (MCMemberRecord) .sp \fB\-\-sgid \fP Source GID (IPv6 format) (PathRecord) .sp \fB\-\-dgid \fP Destination GID (IPv6 format) (PathRecord) .sp \fB\-\-gid \fP Port GID (MCMemberRecord) .sp \fB\-\-mgid \fP Multicast GID (MCMemberRecord) .sp \fB\-\-reversible\fP Reversible path (PathRecord) .sp \fB\-\-numb_path\fP Number of paths (PathRecord) .INDENT 0.0 .TP .B \fB\-\-pkey\fP P_Key (PathRecord, MCMemberRecord). If non\-numeric value (like \(aqx\(aq) is specified then saquery will prompt for a value .UNINDENT .sp \fB\-\-qos_class\fP QoS Class (PathRecord) .sp \fB\-\-sl\fP Service level (PathRecord, MCMemberRecord) .sp \fB\-\-mtu\fP MTU and selector (PathRecord, MCMemberRecord) .sp \fB\-\-rate\fP Rate and selector (PathRecord, MCMemberRecord) .sp \fB\-\-pkt_lifetime\fP Packet lifetime and selector (PathRecord, MCMemberRecord) .INDENT 0.0 .TP .B \fB\-\-qkey\fP Q_Key (MCMemberRecord). 
If non\-numeric value (like \(aqx\(aq) is specified then saquery will prompt for a value .UNINDENT .sp \fB\-\-tclass\fP Traffic Class (PathRecord, MCMemberRecord) .sp \fB\-\-flow_label\fP Flow Label (PathRecord, MCMemberRecord) .sp \fB\-\-hop_limit\fP Hop limit (PathRecord, MCMemberRecord) .sp \fB\-\-scope\fP Scope (MCMemberRecord) .sp \fB\-\-join_state\fP Join state (MCMemberRecord) .sp \fB\-\-proxy_join\fP Proxy join (MCMemberRecord) .sp \fB\-\-service_id\fP ServiceID (PathRecord) .sp Supported query names (and aliases): .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ClassPortInfo (CPI) NodeRecord (NR) [lid] PortInfoRecord (PIR) [[lid]/[port]/[options]] SL2VLTableRecord (SL2VL) [[lid]/[in_port]/[out_port]] PKeyTableRecord (PKTR) [[lid]/[port]/[block]] VLArbitrationTableRecord (VLAR) [[lid]/[port]/[block]] InformInfoRecord (IIR) LinkRecord (LR) [[from_lid]/[from_port]] [[to_lid]/[to_port]] ServiceRecord (SR) PathRecord (PR) MCMemberRecord (MCMR) LFTRecord (LFTR) [[lid]/[block]] MFTRecord (MFTR) [[mlid]/[position]/[block]] GUIDInfoRecord (GIR) [[lid]/[block]] SwitchInfoRecord (SWIR) [lid] SMInfoRecord (SMIR) [lid] .ft P .fi .UNINDENT .UNINDENT .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Configuration flags .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -z . .INDENT 0.0 .TP .B \fB\-\-outstanding_smps, \-o \fP Specify the number of outstanding SMP\(aqs which should be issued during the scan .sp Default: 2 .UNINDENT .\" Define the common option --node-name-map . .sp \fB\-\-node\-name\-map \fP Specify a node name map. .INDENT 0.0 .INDENT 3.5 This file maps GUIDs to more user friendly names. See FILES section. .UNINDENT .UNINDENT .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. 
.INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH COMMON FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .\" Common text to describe the node name map file. . .SS NODE NAME MAP FILE FORMAT .sp The node name map is used to specify user friendly names for nodes in the output. GUIDs are used to perform the lookup. .sp This functionality is provided by the opensm\-libs package. See \fBopensm(8)\fP for the file location for your installation. .sp \fBGenerically:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # comment "" .ft P .fi .UNINDENT .UNINDENT .sp \fBExample:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # IB1 # Line cards 0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D" # Spines 0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" # GUID Node Name 0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D" .ft P .fi .UNINDENT .UNINDENT .SH DEPENDENCIES .sp OpenSM (or other running SM/SA), libosmcomp, libibumad, libibmad .SH AUTHORS .INDENT 0.0 .TP .B Ira Weiny < \fI\%ira.weiny@intel.com\fP > .TP .B Hal Rosenstock < \fI\%halr@mellanox.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/bf558bfc93b590797fd71edcf80cae00686228050000644000175100002000000000414114773456415033470 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_devx_create_eq" "3" "2022-01-12" "mlx5" "mlx5 Programmer\[cq]s Manual" .hy .SH NAME .PP mlx5dv_devx_create_eq - Create an EQ object .PP mlx5dv_devx_destroy_eq - Destroy an EQ object .SH SYNOPSIS .IP .nf \f[C] #include struct mlx5dv_devx_eq * mlx5dv_devx_create_eq(struct ibv_context *ibctx, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_destroy_eq(struct mlx5dv_devx_eq *eq); \f[R] .fi .SH DESCRIPTION .PP Create / Destroy an EQ object. Upon creation, the caller prepares the in/out mail boxes based on the device specification format; For the input mailbox, caller needs to prepare all fields except \[lq]eqc.log_page_size\[rq] and the pas list, which will be set by the driver. The \[lq]eqc.intr\[rq] field should be used from the output of mlx5dv_devx_alloc_msi_vector(). .SH ARGUMENTS .TP \f[I]ibctx\f[R] RDMA device context to create the action on. .TP \f[I]in\f[R] A buffer which contains the command\[cq]s input data provided in a device specification format. .TP \f[I]inlen\f[R] The size of \f[I]in\f[R] buffer in bytes. 
.TP \f[I]out\f[R] A buffer which contains the command\[cq]s output data according to the device specification format. .TP \f[I]outlen\f[R] The size of \f[I]out\f[R] buffer in bytes. .TP \f[I]eq\f[R] The EQ object to work on. .IP .nf \f[C] struct mlx5dv_devx_eq { void *vaddr; }; \f[R] .fi .TP \f[I]vaddr\f[R] EQ VA that was allocated in the driver for. .SH NOTES .PP mlx5dv_devx_query_eqn() will not support vectors which are used by mlx5dv_devx_create_eq(). .SH RETURN VALUE .PP Upon success \f[I]mlx5dv_devx_create_eq\f[R] will return a new \f[I]struct mlx5dv_devx_eq\f[R]; On error NULL will be returned and errno will be set. .PP Upon success \f[I]mlx5dv_devx_destroy_eq\f[R] will return 0, on error errno will be returned. .PP If the error value is EREMOTEIO, outbox.status and outbox.syndrome will contain the command failure details. .SH SEE ALSO .PP \f[I]mlx5dv_devx_alloc_msi_vector(3)\f[R], \f[I]mlx5dv_devx_query_eqn(3)\f[R] .SH AUTHOR .PP Mark Zhang rdma-core-56.1/buildlib/pandoc-prebuilt/7ec1f1647577aaaeeb4f2ad1b5f9267cff61fe1a0000644000175100002000000001035114773456420034026 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBSYSSTAT" 8 "2017-08-21" "" "Open IB Diagnostics" .SH NAME ibsysstat \- system status on an InfiniBand address .SH SYNOPSIS .sp ibsysstat [options] [] .SH DESCRIPTION .sp ibsysstat uses vendor mads to validate connectivity between IB nodes and obtain other information about the IB node. ibsysstat is run as client/server. Default is to run as client. .SH OPTIONS .sp Current supported operations: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ping \e\- verify connectivity to server (default) host \e\- obtain host information from server cpu \e\- obtain cpu information from server .ft P .fi .UNINDENT .UNINDENT .INDENT 0.0 .TP .B \fB\-o, \-\-oui\fP use specified OUI number to multiplex vendor mads .TP .B \fB\-S, \-\-Server\fP start in server mode (do not return) .UNINDENT .SS Addressing Flags .\" Define the common option -G . .sp \fB\-G, \-\-Guid\fP The address specified is a Port GUID .\" Define the common option -L . .sp \fB\-L, \-\-Lid\fP The address specified is a LID .\" Define the common option -s . .sp \fB\-s, \-\-sm_port \fP use \(aqsmlid\(aq as the target lid for SA queries. .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). 
.UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Configuration flags .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .SH AUTHOR .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%halr@voltaire.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/48ca0bd526e135788ce126d7a5ba285a6e1444070000644000175100002000000001105414773456412033300 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_ADVISE_MR" "3" "2018-10-19" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_advise_mr - Gives advice or directions to the kernel about an address range belongs to a memory region (MR). .SH SYNOPSIS .IP .nf \f[C] #include int ibv_advise_mr(struct ibv_pd *pd, enum ibv_advise_mr_advice advice, uint32_t flags, struct ibv_sge *sg_list, uint32_t num_sge) \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_advise_mr()\f[R] Give advice or directions to the kernel about an address range belonging to a memory region (MR). Applications that are aware of future access patterns can use this verb in order to leverage this knowledge to improve system or application performance. .PP \f[B]Conventional advice values\f[R] .TP \f[I]IBV_ADVISE_MR_ADVICE_PREFETCH\f[R] Pre-fetch a range of an on-demand paging MR. Make pages present with read-only permission before the actual IO is conducted. This would provide a way to reduce latency by overlapping paging-in and either compute time or IO to other ranges. .TP \f[I]IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE\f[R] Like IBV_ADVISE_MR_ADVICE_PREFETCH but with read-access and write-access permission to the fetched memory. .TP \f[I]IBV_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT\f[R] Pre-fetch a range of an on-demand paging MR without faulting. This allows presented pages in the CPU to become presented to the device. .SH ARGUMENTS .TP \f[I]pd\f[R] The protection domain (PD) associated with the MR. 
.TP \f[I]advice\f[R] The requested advise value (as listed above). .TP \f[I]flags\f[R] Describes the properties of the advise operation \f[B]Conventional advice values\f[R] \f[I]IBV_ADVISE_MR_FLAG_FLUSH\f[R] : Request to be a synchronized operation. Return to the caller after the operation is completed. .TP \f[I]sg_list\f[R] Pointer to the s/g array When using IBV_ADVISE_OP_PREFETCH advise value, all the lkeys of all the scatter gather elements (SGEs) must be associated with ODP MRs (MRs that were registered with IBV_ACCESS_ON_DEMAND). .TP \f[I]num_sge\f[R] Number of elements in the s/g array .SH RETURN VALUE .PP \f[B]ibv_advise_mr()\f[R] returns 0 when the call was successful, or the value of errno on failure (which indicates the failure reason). .TP \f[I]EOPNOTSUPP\f[R] libibverbs or provider driver doesn\[cq]t support the ibv_advise_mr() verb (ENOSYS may sometimes be returned by old versions of libibverbs). .TP \f[I]ENOTSUP\f[R] The advise operation isn\[cq]t supported. .TP \f[I]EFAULT\f[R] In one of the following: o When the range requested is out of the MR bounds, or when parts of it are not part of the process address space. o One of the lkeys provided in the scatter gather list is invalid or with wrong write access. .TP \f[I]EINVAL\f[R] In one of the following: o The PD is invalid. o The flags are invalid. o The requested address doesn\[cq]t belong to a MR, but a MW or something. .TP \f[I]EPERM\f[R] In one of the following: o Referencing a valid lkey outside the caller\[cq]s security scope. o The advice is IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE but the specified MR in the scatter gather list is not registered as writable access. .TP \f[I]ENOENT\f[R] The providing lkeys aren\[cq]t consistent with the MR\[cq]s. .TP \f[I]ENOMEM\f[R] Not enough memory. # NOTES .PP An application may pre-fetch any address range within an ODP MR when using the \f[B]IBV_ADVISE_MR_ADVICE_PREFETCH\f[R] or \f[B]IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE\f[R] advice. Semantically, this operation is best-effort. That means the kernel does not guarantee that underlying pages are updated in the HCA or the pre-fetched pages would remain resident. .PP When using \f[B]IBV_ADVISE_MR_ADVICE_PREFETCH\f[R] or \f[B]IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE\f[R] advice, the operation will be done in the following stages: o Page in the user pages to memory (pages aren\[cq]t pinned). o Get the dma mapping of these user pages. o Post the underlying page translations to the HCA. .PP If \f[B]IBV_ADVISE_MR_FLAG_FLUSH\f[R] is specified then the underlying pages are guaranteed to be updated in the HCA before returning SUCCESS. Otherwise the driver can choose to postpone the posting of the new translations to the HCA. When performing a local RDMA access operation it is recommended to use IBV_ADVISE_MR_FLAG_FLUSH flag with one of the pre-fetch advices to increase probability that the pages translations are valid in the HCA and avoid future page faults. 
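.SH EXAMPLES
.PP
A minimal sketch that synchronously pre-fetches a whole ODP MR for
write access; \f[I]pd\f[R] and \f[I]mr\f[R] are assumed to have been
created elsewhere, with \f[I]mr\f[R] registered with
IBV_ACCESS_ON_DEMAND.
.IP
.nf
\f[C]
struct ibv_sge sge = {
    .addr = (uintptr_t)mr->addr,
    .length = mr->length,
    .lkey = mr->lkey,
};
int ret = ibv_advise_mr(pd, IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE,
                        IBV_ADVISE_MR_FLAG_FLUSH, &sge, 1);
if (ret)
    return ret; /* errno value describing the failure */
\f[R]
.fi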
.SH SEE ALSO
.PP
\f[B]ibv_reg_mr\f[R](3), \f[B]ibv_rereg_mr\f[R](3),
\f[B]ibv_dereg_mr\f[R](3)
.SH AUTHOR
.PP
Aviad Yehezkel
rdma-core-56.1/buildlib/pandoc-prebuilt/b3d9d854e425936e8c2dafb5f2689701bb71462a0000644000175100002000000000212114773456416033400 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_vfio_process_events" "3" "" "" ""
.hy
.SH NAME
.PP
mlx5dv_vfio_process_events - process vfio driver events
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

int mlx5dv_vfio_process_events(struct ibv_context *ctx);
\f[R]
.fi
.SH DESCRIPTION
.PP
This API should run from an application thread to maintain device
events.
The application is responsible for getting the events FD by calling
\f[I]mlx5dv_vfio_get_events_fd()\f[R] and, once the FD is pollable,
calling this API to let the driver process its internal events.
.SH ARGUMENTS
.TP
\f[I]ctx\f[R]
device context that was opened for VFIO by calling
mlx5dv_get_vfio_device_list().
.SH RETURN VALUE
.PP
Returns 0 upon success or errno value in case a failure has occurred.
.SH NOTES
.PP
Application can use this API also to periodically check the device
health state even if no events exist.
.SH SEE ALSO
.PP
\f[I]ibv_open_device(3)\f[R] \f[I]ibv_free_device_list(3)\f[R]
\f[I]mlx5dv_get_vfio_device_list(3)\f[R]
\f[I]mlx5dv_vfio_get_events_fd(3)\f[R]
.SH AUTHOR
.PP
Yishai Hadas
rdma-core-56.1/buildlib/pandoc-prebuilt/61284d8d064b74fa3d2b4f791a7498fed5d4fbb00000644000175100002000000000206414773456415033544 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_modify_qp_sched_elem" "3" "2020-9-22" "mlx5" "mlx5 Programmer\[cq]s Manual"
.hy
.SH NAME
.PP
mlx5dv_modify_qp_sched_elem - Connect a QP with a requestor and/or a
responder scheduling element
.SH SYNOPSIS
.IP
.nf
\f[C]
int mlx5dv_modify_qp_sched_elem(struct ibv_qp *qp,
                                struct mlx5dv_sched_leaf *requestor,
                                struct mlx5dv_sched_leaf *responder);
\f[R]
.fi
.SH DESCRIPTION
.PP
The QP scheduling element (SE) allows the association of a QP to a SE
tree.
The SE is described in the \f[I]mlx5dv_sched_node_create(3)\f[R] man
page.
.PP
By default, a QP is not associated with any SE.
The default setting ensures fair bandwidth allocation with no maximum
bandwidth limiting.
.PP
A QP can be associated with a requestor and/or a responder SE,
following the IB spec definition.
.SH RETURN VALUE
.PP
Upon success 0 is returned, or the value of errno on a failure.
.SH SEE ALSO
.PP
\f[B]mlx5dv_sched_node_create\f[R](3)
.SH AUTHOR
.PP
Mark Zhang
Ariel Almog
rdma-core-56.1/buildlib/pandoc-prebuilt/b4a6bc6bbb2f05ddc2593766851a6aaf9fd4d3060000644000175100002000000000310214773456416033660 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_pp_alloc / mlx5dv_pp_free" "3" "" "" ""
.hy
.SH NAME
.PP
mlx5dv_pp_alloc - Allocates a packet pacing entry
.PP
mlx5dv_pp_free - Frees a packet pacing entry
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

struct mlx5dv_pp *
mlx5dv_pp_alloc(struct ibv_context *context,
                size_t pp_context_sz,
                const void *pp_context,
                uint32_t flags);

void mlx5dv_pp_free(struct mlx5dv_pp *dv_pp);
\f[R]
.fi
.SH DESCRIPTION
.PP
Create / free a packet pacing entry which can be used for some device
commands over the DEVX interface.
.PP
The DEVX API enables direct access from the user space area to the
mlx5 device driver; the packet pacing information is required by the
few commands that take a packet pacing index.
.SH ARGUMENTS .TP \f[I]context\f[R] RDMA device context to work on, need to be opened with DEVX support by using mlx5dv_open_device(). .TP \f[I]pp_context_sz\f[R] Length of \f[I]pp_context\f[R] input buffer. .TP \f[I]pp_context\f[R] Packet pacing context according to the device specification. .TP \f[I]flags\f[R] MLX5DV_PP_ALLOC_FLAGS_DEDICATED_INDEX: allocate a dedicated index. .SS dv_pp .IP .nf \f[C] struct mlx5dv_pp { uint16_t index; }; \f[R] .fi .TP \f[I]index\f[R] The device index to be used. .SH RETURN VALUE .PP Upon success \f[I]mlx5dv_pp_alloc\f[R] returns a pointer to the created packet pacing object, on error NULL will be returned and errno will be set. .SH SEE ALSO .PP \f[B]mlx5dv_open_device\f[R], \f[B]mlx5dv_devx_obj_create\f[R] .SH AUTHOR .PP Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/f48a8d31ddfa68fad6c3badbc768ac703976c43f0000644000175100002000000000306214773456416034044 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "CHECK_LFT_BALANCE" 8 "2017-08-21" "" "Open IB Diagnostics" .SH NAME check_lft_balance \- check InfiniBand unicast forwarding tables balance .SH SYNOPSIS .sp check_lft_balance.sh [\-hRv] .SH DESCRIPTION .sp check_lft_balance.sh is a script which checks for balancing in Infiniband unicast forwarding tables. It analyzes the output of \fBdump_lfts(8)\fP and \fBiblinkinfo(8)\fP .SH OPTIONS .INDENT 0.0 .TP .B \fB\-h\fP show help .TP .B \fB\-R\fP Recalculate dump_lfts information, ie do not use the cached information. This option is slower but should be used if the diag tools have not been used for some time or if there are other reasons to believe that the fabric has changed. .TP .B \fB\-v\fP verbose output .UNINDENT .SH SEE ALSO .sp \fBdump_lfts(8)\fP \fBiblinkinfo(8)\fP .SH AUTHORS .INDENT 0.0 .TP .B Albert Chu < \fI\%chu11@llnl.gov\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/6962baf519ab44a4635fd03f70c3033b30b7467e0000644000175100002000000000315314773456414033276 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx4dv_set_context_attr" "3" "" "" "" .hy .SH NAME .PP mlx4dv_set_context_attr - Set context attributes .SH SYNOPSIS .IP .nf \f[C] #include int mlx4dv_set_context_attr(struct ibv_context *context, enum mlx4dv_set_ctx_attr_type attr_type, void *attr); \f[R] .fi .SH DESCRIPTION .PP mlx4dv_set_context_attr gives the ability to set vendor specific attributes on the RDMA context. .SH ARGUMENTS .TP \f[I]context\f[R] RDMA device context to work on. .TP \f[I]attr_type\f[R] The type of the provided attribute. .TP \f[I]attr\f[R] Pointer to the attribute to be set. 
.SS attr_type .IP .nf \f[C] enum mlx4dv_set_ctx_attr_type { /* Attribute type uint8_t */ MLX4DV_SET_CTX_ATTR_LOG_WQS_RANGE_SZ = 0, MLX4DV_SET_CTX_ATTR_BUF_ALLOCATORS = 1, }; \f[R] .fi .TP \f[I]MLX4DV_SET_CTX_ATTR_LOG_WQS_RANGE_SZ\f[R] Change the LOG WQs Range size for RSS .TP \f[I]MLX4DV_SET_CTX_ATTR_BUF_ALLOCATORS\f[R] Provide an external buffer allocator .IP .nf \f[C] struct mlx4dv_ctx_allocators { void *(*alloc)(size_t size, void *priv_data); void (*free)(void *ptr, void *priv_data); void *data; }; \f[R] .fi .TP \f[I]alloc\f[R] Function used for buffer allocation instead of libmlx4 internal method .TP \f[I]free\f[R] Function used to free buffers allocated by alloc function .TP \f[I]data\f[R] Metadata that can be used by alloc and free functions .SH RETURN VALUE .PP Returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH AUTHOR .PP Majd Dibbiny rdma-core-56.1/buildlib/pandoc-prebuilt/1879d859228a7d13b32ba7dbd189b58be1bf011f0000644000175100002000000000335714773456414033456 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "HNSDV_CREATE_QP" "3" "2024-02-06" "hns" "HNS Programmer\[cq]s Manual" .hy .SH NAME .PP hnsdv_create_qp - creates a HNS specific queue pair (QP) .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/hnsdv.h> struct ibv_qp *hnsdv_create_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_attr, struct hnsdv_qp_init_attr *hns_attr); \f[R] .fi .SH DESCRIPTION .PP \f[B]hnsdv_create_qp()\f[R] creates a HNS specific queue pair (QP) with specific driver properties. .SH ARGUMENTS .PP Please see the \f[I]ibv_create_qp_ex(3)\f[R] man page for \f[I]context\f[R] and \f[I]qp_attr\f[R]. .SS hns_attr .IP .nf \f[C] struct hnsdv_qp_init_attr { uint64_t comp_mask; uint32_t create_flags; uint8_t congest_type; uint8_t reserved[3]; }; \f[R] .fi .TP \f[I]comp_mask\f[R] Bitmask specifying what fields in the structure are valid: .IP .nf \f[C] HNSDV_QP_INIT_ATTR_MASK_QP_CONGEST_TYPE: Valid values in congest_type. Allow setting a congestion control algorithm for QP. \f[R] .fi .TP \f[I]create_flags\f[R] Features to enable for the QP. .TP \f[I]congest_type\f[R] Type of congestion control algorithm: .RS .PP HNSDV_QP_CREATE_ENABLE_DCQCN: Data Center Quantized Congestion Notification HNSDV_QP_CREATE_ENABLE_LDCP: Low Delay Control Protocol HNSDV_QP_CREATE_ENABLE_HC3: Huawei Converged Congestion Control HNSDV_QP_CREATE_ENABLE_DIP: Destination IP based Quantized Congestion Notification .RE .SH RETURN VALUE .PP \f[B]hnsdv_create_qp()\f[R] returns a pointer to the created QP; on error NULL will be returned and errno will be set. .SH SEE ALSO .PP \f[B]ibv_create_qp_ex\f[R](3) .SH AUTHOR .PP Junxian Huang rdma-core-56.1/buildlib/pandoc-prebuilt/50f6e71e397cf5410c315ea80d7bd3a9670806470000644000175100002000000000141014773456412033226 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_GET_DEVICE_INDEX" "3" "2020-04-22" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_get_device_index - get an RDMA device index .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/verbs.h> int ibv_get_device_index(struct ibv_device *device); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_get_device_index()\f[R] returns the stable IB device index as it is assigned by the kernel. .SH RETURN VALUE .PP \f[B]ibv_get_device_index()\f[R] returns an index, or -1 if the kernel doesn\[cq]t support device indexes.
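.SH EXAMPLE
.PP
A minimal sketch (illustrative only, with assumed error handling) that prints the kernel-assigned index of every local RDMA device:
.IP
.nf
\f[C]
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
	int num, i;
	struct ibv_device **list = ibv_get_device_list(&num);

	if (!list)
		return 1;
	for (i = 0; i < num; i++) {
		/* -1 means the kernel does not expose device indexes */
		int idx = ibv_get_device_index(list[i]);

		printf(\[dq]%s: index %d\[rs]n\[dq],
		       ibv_get_device_name(list[i]), idx);
	}
	ibv_free_device_list(list);
	return 0;
}
\f[R]
.fi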
.SH SEE ALSO .PP \f[B]ibv_get_device_name\f[R](3), \f[B]ibv_get_device_guid\f[R](3), \f[B]ibv_get_device_list\f[R](3), \f[B]ibv_open_device\f[R](3) .SH AUTHOR .PP Leon Romanovsky rdma-core-56.1/buildlib/pandoc-prebuilt/71b9f30576194f743a340f6eaef13c674b4019d50000644000175100002000000000163214773456413033234 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_QUERY_PKEY" "3" "2006-10-31" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_query_pkey - query an InfiniBand port\[cq]s P_Key table .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/verbs.h> int ibv_query_pkey(struct ibv_context *context, uint8_t port_num, int index, uint16_t *pkey); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_query_pkey()\f[R] returns the P_Key value (in network byte order) in entry \f[I]index\f[R] of port \f[I]port_num\f[R] for device context \f[I]context\f[R] through the pointer \f[I]pkey\f[R]. .SH RETURN VALUE .PP \f[B]ibv_query_pkey()\f[R] returns 0 on success, and -1 on error. .SH SEE ALSO .PP \f[B]ibv_open_device\f[R](3), \f[B]ibv_query_device\f[R](3), \f[B]ibv_query_gid\f[R](3), \f[B]ibv_query_port\f[R](3) .SH AUTHOR .PP Dotan Barak rdma-core-56.1/buildlib/pandoc-prebuilt/b774fc986511ba9f337151177587179e2df9588d0000644000175100002000000000421414773456412033134 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_QUERY_GID_TABLE" "3" "2020-04-24" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_query_gid_table - query an InfiniBand device\[cq]s GID table .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/verbs.h> ssize_t ibv_query_gid_table(struct ibv_context *context, struct ibv_gid_entry *entries, size_t max_entries, uint32_t flags); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_query_gid_table()\f[R] returns the valid GID table entries of the RDMA device context \f[I]context\f[R] at the pointer \f[I]entries\f[R]. .PP A caller must allocate the \f[I]entries\f[R] array for the GID table entries it desires to query. This API returns only valid GID table entries. .PP A caller must pass a non-zero number of entries at \f[I]max_entries\f[R] that corresponds to the size of the \f[I]entries\f[R] array. .PP The \f[I]entries\f[R] array must be allocated such that it can contain all the valid GID table entries of the device. If there are more valid GID entries than the provided value of \f[I]max_entries\f[R] and the \f[I]entries\f[R] array can hold, the call will fail. For example, if an RDMA device \f[I]context\f[R] has a total of 10 valid GID entries, \f[I]entries\f[R] should be allocated for at least 10 entries, and \f[I]max_entries\f[R] should be set appropriately. .SH ARGUMENTS .TP \f[I]context\f[R] The context of the device to query. .TP \f[I]entries\f[R] Array of ibv_gid_entry structs where the GID entries are returned. Please see the \f[B]ibv_query_gid_ex\f[R](3) man page for \f[I]ibv_gid_entry\f[R]. .TP \f[I]max_entries\f[R] Maximum number of entries that can be returned. .TP \f[I]flags\f[R] Extra fields to query post \f[I]entries->ndev_ifindex\f[R], for now must be 0. .SH RETURN VALUE .PP \f[B]ibv_query_gid_table()\f[R] returns the number of entries that were read on success or negative errno value on error. Number of entries returned is <= max_entries.
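.SH EXAMPLE
.PP
A minimal sketch of querying the GID table; the array bound of 32 is an assumption for illustration and must be large enough to hold all valid entries of the device:
.IP
.nf
\f[C]
#include <stdio.h>
#include <infiniband/verbs.h>

/* ctx is a device context opened with ibv_open_device() */
void print_gid_table(struct ibv_context *ctx)
{
	struct ibv_gid_entry entries[32]; /* assumed upper bound */
	ssize_t n, i;

	n = ibv_query_gid_table(ctx, entries, 32, 0);
	if (n < 0)
		return; /* negative errno on failure */
	for (i = 0; i < n; i++)
		printf(\[dq]port %u, gid index %u\[rs]n\[dq],
		       entries[i].port_num, entries[i].gid_entry_index);
}
\f[R]
.fi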
.SH SEE ALSO .PP \f[B]ibv_open_device\f[R](3), \f[B]ibv_query_device\f[R](3), \f[B]ibv_query_port\f[R](3), \f[B]ibv_query_gid_ex\f[R](3) .SH AUTHOR .PP Parav Pandit rdma-core-56.1/buildlib/pandoc-prebuilt/d3b0093ed7a124b560e5c23c4b6adc7732f49e300000644000175100002000000000640214773456416033432 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_query_port" "3" "" "" "" .hy .SH NAME .PP mlx5dv_query_port - Query non standard attributes of IB device port. .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/mlx5dv.h> int mlx5dv_query_port(struct ibv_context *context, uint32_t port_num, struct mlx5dv_port *info); \f[R] .fi .SH DESCRIPTION .PP Query port info which can be used for some device commands over the DEVX interface and when directly accessing the hardware resources. .PP A function that lets a user query hardware and configuration attributes associated with the port. .SH USAGE .PP A user should provide the port number to query. On successful query \f[I]flags\f[R] will store a subset of the requested attributes which are supported/relevant for that port. .SH ARGUMENTS .TP \f[I]context\f[R] RDMA device context to work on. .TP \f[I]port_num\f[R] Port number to query. .SS \f[I]info\f[R] .PP Stores the returned attributes from the kernel. .IP .nf \f[C] struct mlx5dv_port { uint64_t flags; uint16_t vport; uint16_t vport_vhca_id; uint16_t esw_owner_vhca_id; uint16_t rsvd0; uint64_t vport_steering_icm_rx; uint64_t vport_steering_icm_tx; struct mlx5dv_reg reg_c0; }; \f[R] .fi .TP \f[I]flags\f[R] Bit field of attributes, on successful query \f[I]flags\f[R] stores the valid filled attributes. .RS .PP MLX5DV_QUERY_PORT_VPORT: The vport number of that port. .PP MLX5DV_QUERY_PORT_VPORT_VHCA_ID: The VHCA ID of \f[I]vport_num\f[R]. .PP MLX5DV_QUERY_PORT_ESW_OWNER_VHCA_ID: The E-Switch owner of \f[I]vport_num\f[R]. .PP MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_RX: The ICM RX address when directing traffic. .PP MLX5DV_QUERY_PORT_VPORT_STEERING_ICM_TX: The ICM TX address when directing traffic. .PP MLX5DV_QUERY_PORT_VPORT_REG_C0: Register C0 value used to identify egress of \f[I]vport_num\f[R]. .RE .TP \f[I]vport\f[R] The VPORT number of that port. .TP \f[I]vport_vhca_id\f[R] The VHCA ID of \f[I]vport_num\f[R]. .TP \f[I]rsvd0\f[R] A reserved field. Not to be used. .TP \f[I]esw_owner_vhca_id\f[R] The E-Switch owner of \f[I]vport_num\f[R]. .TP \f[I]vport_steering_icm_rx\f[R] The ICM RX address when directing traffic. .TP \f[I]vport_steering_icm_tx\f[R] The ICM TX address when directing traffic. .SS reg_c0 .PP Register C0 value used to identify traffic of \f[I]vport_num\f[R]. .IP .nf \f[C] struct mlx5dv_reg { uint32_t value; uint32_t mask; }; \f[R] .fi .TP \f[I]value\f[R] The value that should be used as match. .TP \f[I]mask\f[R] The mask that should be used when matching. .SH RETURN VALUE .PP Returns 0 on success, or the value of errno on failure (which indicates the failure reason).
.SH EXAMPLE .IP .nf \f[C] for (i = 1; i <= ports; i++) { ret = mlx5dv_query_port(context, i, &port_info); if (ret) { printf(\[dq]Error querying port %d\[rs]n\[dq], i); break; } printf(\[dq]Port: %d:\[rs]n\[dq], i); if (port_info.flags & MLX5DV_QUERY_PORT_VPORT) printf(\[dq]\[rs]tvport: 0x%x\[rs]n\[dq], port_info.vport); if (port_info.flags & MLX5DV_QUERY_PORT_VPORT_REG_C0) printf(\[dq]\[rs]treg_c0: val: 0x%x mask: 0x%x\[rs]n\[dq], port_info.reg_c0.value, port_info.reg_c0.mask); } \f[R] .fi .SH AUTHOR .PP Mark Bloch rdma-core-56.1/buildlib/pandoc-prebuilt/2b17e4fb06589e6a6911da4c72f7903f110168e80000644000175100002000000001133014773456420033230 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBROUTERS" 8 "2016-12-20" "" "OpenIB Diagnostics" .SH NAME IBROUTERS \- show InfiniBand router nodes in topology .SH SYNOPSIS .sp ibrouters [options] [] .SH DESCRIPTION .sp ibrouters is a script which either walks the IB subnet topology or uses an already saved topology file and extracts the router nodes. .SH OPTIONS .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file .
.SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .\" Common text to describe the node name map file. . .SS NODE NAME MAP FILE FORMAT .sp The node name map is used to specify user friendly names for nodes in the output. GUIDs are used to perform the lookup. .sp This functionality is provided by the opensm\-libs package. See \fBopensm(8)\fP for the file location for your installation. .sp \fBGenerically:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # comment "" .ft P .fi .UNINDENT .UNINDENT .sp \fBExample:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # IB1 # Line cards 0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D" # Spines 0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" # GUID Node Name 0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D" .ft P .fi .UNINDENT .UNINDENT .SH SEE ALSO .sp ibnetdiscover(8) .SH DEPENDENCIES .sp ibnetdiscover, ibnetdiscover format .SH AUTHOR .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%halr@voltaire.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/a97e5ee1ed0082ac88e5b6e27e0bdcda9af2405a0000644000175100002000000001017614773456420034023 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBPING" 8 "2012-05-14" "" "Open IB Diagnostics" .SH NAME IBPING \- ping an InfiniBand address .SH SYNOPSIS .sp ibping [options] .SH DESCRIPTION .sp ibping uses vendor mads to validate connectivity between IB nodes. On exit, (IP) ping-like output is shown. ibping is run as client/server. Default is to run as client. Note also that a default ping server is implemented within the kernel. .SH OPTIONS .sp \fB\-c, \-\-count\fP stop after count packets .sp \fB\-f, \-\-flood\fP flood destination: send packets back to back without delay .sp \fB\-o, \-\-oui\fP use specified OUI number to multiplex vendor mads .sp \fB\-S, \-\-Server\fP start in server mode (do not return) .SS Addressing Flags .\" Define the common option -L .
.sp \fB\-L, \-\-Lid\fP The address specified is a LID .\" Define the common option -G . .sp \fB\-G, \-\-Guid\fP The address specified is a Port GUID .\" Define the common option -s . .sp \fB\-s, \-\-sm_port \fP use \(aqsmlid\(aq as the target lid for SA queries. .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Configuration flags .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .SS Debugging flags .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .SH AUTHOR .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%halr@voltaire.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/eedfb16aa70a7b6e5718ceefdb1e88c7a95a64a70000644000175100002000000001265314773456417034132 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. 
.TH "IBCCCONFIG" 8 "2012-05-31" "" "OpenIB Diagnostics" .SH NAME IBCCCONFIG \- configure congestion control settings .SH SYNOPSIS .sp ibccconfig [common_options] [\-c cckey] [port] .SH DESCRIPTION .sp \fBibccconfig\fP supports the configuration of congestion control settings on switches and HCAs. .sp \fBWARNING \-\- You should understand what you are doing before using this tool. Misuse of this tool could result in a broken fabric.\fP .SH OPTIONS .INDENT 0.0 .TP .B Current supported operations and their parameters: CongestionKeyInfo (CK) SwitchCongestionSetting (SS) SwitchPortCongestionSetting (SP) CACongestionSetting (CS) CongestionControlTable (CT) ... .UNINDENT .sp \fB\-\-cckey, \-c, \fP Specify a congestion control (CC) key. If none is specified, a key of 0 is used. .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Addressing Flags .\" Define the common option -G . .sp \fB\-G, \-\-Guid\fP The address specified is a Port GUID .\" Define the common option -L . .sp \fB\-L, \-\-Lid\fP The address specified is a LID .\" Define the common option -s . .sp \fB\-s, \-\-sm_port \fP use \(aqsmlid\(aq as the target lid for SA queries. .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Configuration flags .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. 
.INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH EXAMPLES .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibccconfig SwitchCongestionSetting 2 0x1F 0x1FFFFFFFFF 0x0 0xF 8 0 0:0 1 # Configure Switch Congestion Settings ibccconfig CACongestionSetting 1 0 0x3 150 1 0 0 # Configure CA Congestion Settings to SL 0 and SL 1 ibccconfig CACongestionSetting 1 0 0x4 200 1 0 0 # Configure CA Congestion Settings to SL 2 ibccconfig CongestionControlTable 1 63 0 0:0 0:1 ... # Configure first block of Congestion Control Table ibccconfig CongestionControlTable 1 127 0 0:64 0:65 ... # Configure second block of Congestion Control Table .ft P .fi .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .SH AUTHOR .INDENT 0.0 .TP .B Albert Chu < \fI\%chu11@llnl.gov\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/597d978170941460b60150581fe39cedbf5016770000644000175100002000000001114714773456417033036 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBCCQUERY" 8 "2012-05-31" "" "OpenIB Diagnostics" .SH NAME IBCCQUERY \- query congestion control settings/info .SH SYNOPSIS .sp ibccquery [common_options] [\-c cckey] [port] .SH DESCRIPTION .sp ibccquery supports the querying of settings and other information related to congestion control. .SH OPTIONS .INDENT 0.0 .TP .B Current supported operations and their parameters: CongestionInfo (CI) CongestionKeyInfo (CK) CongestionLog (CL) SwitchCongestionSetting (SS) SwitchPortCongestionSetting (SP) [] CACongestionSetting (CS) CongestionControlTable (CT) Timestamp (TI) .UNINDENT .sp \fB\-\-cckey, \-c \fP Specify a congestion control (CC) key. If none is specified, a key of 0 is used.
.SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Addressing Flags .\" Define the common option -G . .sp \fB\-G, \-\-Guid\fP The address specified is a Port GUID .\" Define the common option -L . .sp \fB\-L, \-\-Lid\fP The address specified is a LID .\" Define the common option -s . .sp \fB\-s, \-\-sm_port \fP use \(aqsmlid\(aq as the target lid for SA queries. .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Configuration flags .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .SH EXAMPLES .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibccquery CongestionInfo 3 # Congestion Info by lid ibccquery SwitchPortCongestionSetting 3 # Query all Switch Port Congestion Settings ibccquery SwitchPortCongestionSetting 3 1 # Query Switch Port Congestion Setting for port 1 .ft P .fi .UNINDENT .UNINDENT .SH AUTHOR .INDENT 0.0 .TP .B Albert Chu < \fI\%chu11@llnl.gov\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/c2bfa26f54efd008a0a64b4de38897576b9458030000644000175100002000000001212514773456421033377 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . 
.de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "VENDSTAT" 8 "2017-08-21" "" "Open IB Diagnostics" .SH NAME vendstat \- query InfiniBand vendor specific functions .SH SYNOPSIS .sp vendstat [options] .SH DESCRIPTION .sp vendstat uses vendor specific MADs to access beyond the IB spec vendor specific functionality. Currently, there is support for Mellanox InfiniSwitch\-III (IS3) and InfiniSwitch\-IV (IS4). .SH OPTIONS .INDENT 0.0 .TP .B \fB\-N\fP show IS3 or IS4 general information. .TP .B \fB\-w\fP show IS3 port xmit wait counters. .TP .B \fB\-i\fP show IS4 counter group info. .TP .B \fB\-c \fP configure IS4 counter groups. .sp Configure IS4 counter groups 0 and 1. Such configuration is not persistent across IS4 reboot. First number is for counter group 0 and second is for counter group 1. .sp Group 0 counter config values: .UNINDENT .INDENT 0.0 .TP .B :: .INDENT 7.0 .INDENT 3.5 0 \- PortXmitDataSL0\-7 1 \- PortXmitDataSL8\-15 2 \- PortRcvDataSL0\-7 .UNINDENT .UNINDENT .sp Group 1 counter config values: .UNINDENT .INDENT 0.0 .TP .B :: 1 \- PortXmitDataSL8\-15 2 \- PortRcvDataSL0\-7 8 \- PortRcvDataSL8\-15 .TP .B \fB\-R, \-\-Read \fP Read configuration space record at addr .TP .B \fB\-W, \-\-Write \fP Write configuration space record at addr .UNINDENT .SS Addressing Flags .\" Define the common option -G . .sp \fB\-G, \-\-Guid\fP The address specified is a Port GUID .\" Define the common option -L . .sp \fB\-L, \-\-Lid\fP The address specified is a LID .\" Define the common option -s . .sp \fB\-s, \-\-sm_port \fP use \(aqsmlid\(aq as the target lid for SA queries. .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . 
.INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Configuration flags .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .SH EXAMPLES .INDENT 0.0 .TP .B :: vendstat \-N 6 # read IS3 or IS4 general information vendstat \-w 6 # read IS3 port xmit wait counters vendstat \-i 6 12 # read IS4 port 12 counter group info vendstat \-c 0,1 6 12 # configure IS4 port 12 counter groups for PortXmitDataSL vendstat \-c 2,8 6 12 # configure IS4 port 12 counter groups for PortRcvDataSL .UNINDENT .SH AUTHOR .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%hal.rosenstock@gmail.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/e2cfc53feeefa2927ad8741ae5964165b27d6aee0000644000175100002000000000240014773456412033755 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_IS_FORK_INITIALIZED" "3" "2020-10-09" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_is_fork_initialized - check if fork support (ibv_fork_init) is enabled .SH SYNOPSIS .IP .nf \f[C] #include enum ibv_fork_status { IBV_FORK_DISABLED, IBV_FORK_ENABLED, IBV_FORK_UNNEEDED, }; enum ibv_fork_status ibv_is_fork_initialized(void); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_is_fork_initialized()\f[R] checks whether libibverbs \f[B]fork()\f[R] support was enabled through the \f[B]ibv_fork_init()\f[R] verb. .SH RETURN VALUE .PP \f[B]ibv_is_fork_initialized()\f[R] returns IBV_FORK_DISABLED if fork support is disabled, or IBV_FORK_ENABLED if enabled. IBV_FORK_UNNEEDED return value indicates that the kernel copies DMA pages on fork, hence a call to \f[B]ibv_fork_init()\f[R] is unneeded. .SH NOTES .PP The IBV_FORK_UNNEEDED return value takes precedence over IBV_FORK_DISABLED and IBV_FORK_ENABLED. If the kernel supports copy-on-fork for DMA pages then IBV_FORK_UNNEEDED will be returned regardless of whether \f[B]ibv_fork_init()\f[R] was called or not. 
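.SH EXAMPLE
.PP
A short sketch showing how an application can decide at runtime whether \f[B]ibv_fork_init\f[R](3) still needs to be called before forking; the messages are illustrative only:
.IP
.nf
\f[C]
#include <stdio.h>
#include <infiniband/verbs.h>

void check_fork_support(void)
{
	switch (ibv_is_fork_initialized()) {
	case IBV_FORK_UNNEEDED:
		printf(\[dq]kernel copies DMA pages on fork\[rs]n\[dq]);
		break;
	case IBV_FORK_ENABLED:
		printf(\[dq]fork support already enabled\[rs]n\[dq]);
		break;
	default: /* IBV_FORK_DISABLED */
		printf(\[dq]call ibv_fork_init() before fork()\[rs]n\[dq]);
		break;
	}
}
\f[R]
.fi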
.SH SEE ALSO .PP \f[B]fork\f[R](2), \f[B]ibv_fork_init\f[R](3) .SH AUTHOR .PP Gal Pressman rdma-core-56.1/buildlib/pandoc-prebuilt/d5c7e7b0425b7c207ee41b58a93e749b88d7afee0000644000175100002000000000525414773456414033636 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_create_flow_matcher" "3" "2018-9-19" "mlx5" "mlx5 Programmer\[cq]s Manual" .hy .SH NAME .PP mlx5dv_create_flow_matcher - creates a matcher to be used with \f[I]mlx5dv_create_flow(3)\f[R] .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/mlx5dv.h> struct mlx5dv_flow_matcher * mlx5dv_create_flow_matcher(struct ibv_context *context, struct mlx5dv_flow_matcher_attr *attr) \f[R] .fi .SH DESCRIPTION .PP \f[B]mlx5dv_create_flow_matcher()\f[R] creates a flow matcher (mask) to be used with \f[I]mlx5dv_create_flow(3)\f[R]. .SH ARGUMENTS .PP Please see \f[I]ibv_open_device(3)\f[R] for \f[I]context\f[R]. .SS \f[I]attr\f[R] .IP .nf \f[C] struct mlx5dv_flow_matcher_attr { enum ibv_flow_attr_type type; uint32_t flags; /* From enum ibv_flow_flags */ uint16_t priority; uint8_t match_criteria_enable; /* Device spec format */ struct mlx5dv_flow_match_parameters *match_mask; uint64_t comp_mask; enum mlx5dv_flow_table_type ft_type; }; \f[R] .fi .TP \f[I]type\f[R] Type of matcher to be created: IBV_FLOW_ATTR_NORMAL: Normal rule according to specification. .TP \f[I]flags\f[R] Special flags to control the rule: 0: A zero value means the matcher will store ingress flow rules. IBV_FLOW_ATTR_FLAGS_EGRESS: Specifies that this matcher will store egress flow rules. .TP \f[I]priority\f[R] See \f[I]ibv_create_flow(3)\f[R]. .TP \f[I]match_criteria_enable\f[R] What match criteria is configured in \f[I]match_mask\f[R], passed in device spec format. .SS \f[I]match_mask\f[R] .IP .nf \f[C] struct mlx5dv_flow_match_parameters { size_t match_sz; uint64_t match_buf[]; /* Device spec format */ }; \f[R] .fi .TP \f[I]match_sz\f[R] Size in bytes of \f[I]match_buf\f[R]. .TP \f[I]match_buf\f[R] Set which mask to be used, passed in device spec format. .TP \f[I]comp_mask\f[R] MLX5DV_FLOW_MATCHER_MASK_FT_TYPE for \f[I]ft_type\f[R] .SS \f[I]ft_type\f[R] .PP Specifies the flow table type in which the matcher will store the flow rules: MLX5DV_FLOW_TABLE_TYPE_NIC_RX: Specifies that this matcher will store ingress flow rules. MLX5DV_FLOW_TABLE_TYPE_NIC_TX: Specifies that this matcher will store egress flow rules. MLX5DV_FLOW_TABLE_TYPE_FDB: Specifies that this matcher will store FDB rules. MLX5DV_FLOW_TABLE_TYPE_RDMA_RX: Specifies that this matcher will store ingress RDMA flow rules. MLX5DV_FLOW_TABLE_TYPE_RDMA_TX: Specifies that this matcher will store egress RDMA flow rules. .SH RETURN VALUE .PP \f[B]mlx5dv_create_flow_matcher\f[R] returns a pointer to \f[I]mlx5dv_flow_matcher\f[R]; on error NULL will be returned and errno will be set.
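.SH EXAMPLE
.PP
A skeleton (not from the original page) of allocating the variable-length mask and creating an ingress matcher; the buffer size and the zeroed match criteria are placeholders that depend on the device specification:
.IP
.nf
\f[C]
#include <stdlib.h>
#include <infiniband/mlx5dv.h>

struct mlx5dv_flow_matcher *create_matcher(struct ibv_context *ctx)
{
	size_t sz = 64; /* placeholder size, device spec format */
	struct mlx5dv_flow_match_parameters *mask;
	struct mlx5dv_flow_matcher_attr attr = {
		.type = IBV_FLOW_ATTR_NORMAL,
		.match_criteria_enable = 0, /* placeholder criteria */
	};
	struct mlx5dv_flow_matcher *matcher;

	mask = calloc(1, sizeof(*mask) + sz);
	if (!mask)
		return NULL;
	mask->match_sz = sz;
	attr.match_mask = mask;

	matcher = mlx5dv_create_flow_matcher(ctx, &attr);
	/* the mask buffer is assumed to be copied at create time */
	free(mask);
	return matcher;
}
\f[R]
.fi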
.SH SEE ALSO .PP \f[I]ibv_open_device(3)\f[R], \f[I]ibv_create_flow(3)\f[R] .SH AUTHOR .PP Mark Bloch rdma-core-56.1/buildlib/pandoc-prebuilt/c6cf51c33703f96d23549f640ab1e80205143daf0000644000175100002000000000764514773456412033275 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "ibv_attach_counters_point_flow" "3" "2018-04-02" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP \f[B]ibv_attach_counters_point_flow\f[R] - attach individual counter definition to a flow object .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/verbs.h> int ibv_attach_counters_point_flow(struct ibv_counters *counters, struct ibv_counter_attach_attr *counter_attach_attr, struct ibv_flow *flow); \f[R] .fi .SH DESCRIPTION .PP Attach counters point APIs are a family of APIs used to attach an individual counter description definition to a verbs object at a specific index location. .PP The counters object will start collecting values after it is bound to the verbs object resource. .PP A static attach can be created when NULL is provided instead of the reference to the verbs object (e.g., in the case of a flow, providing NULL instead of \f[I]flow\f[R]). In this case, the counters object will only start collecting values after it is bound to the verbs resource; for a flow this is when the counters handle is referenced while creating a flow with \f[B]ibv_create_flow\f[R](). .PP Once an ibv_counters is bound statically to a verbs resource, no additional attach is allowed until the counters object is no longer bound to any verbs object. .PP The argument counter_desc specifies which counter value should be collected. It is defined in verbs.h as one of the enum ibv_counter_description options. .PP Supported capabilities of specific counter_desc values per verbs object can be tested by checking the return value for success or ENOTSUP errno. .PP Attaching a counters handle to multiple objects of the same type will accumulate the values into a single index. For example, creating several ibv_flow(s) with the same ibv_counters handle will collect the values from all relevant flows into the relevant index location when reading the values from \f[B]ibv_read_counters\f[R](); setting the index more than once with different or same counter_desc will aggregate the values from all relevant counters into the relevant index location. .PP The runtime values of counters can be read from the hardware by calling \f[B]ibv_read_counters\f[R](). .SH ARGUMENTS .TP \f[I]counters\f[R] Existing counters to attach new counter point on. .TP \f[I]counter_attach_attr\f[R] An ibv_counter_attach_attr struct, as defined in verbs.h. .TP \f[I]flow\f[R] Existing flow to attach a new counters point on (in static mode it must be NULL). .SS \f[I]counter_attach_attr\f[R] Argument .IP .nf \f[C] struct ibv_counter_attach_attr { enum ibv_counter_description counter_desc; uint32_t index; uint32_t comp_mask; }; \f[R] .fi .SS \f[I]counter_desc\f[R] Argument .IP .nf \f[C] enum ibv_counter_description { IBV_COUNTER_PACKETS, IBV_COUNTER_BYTES, }; \f[R] .fi .TP \f[I]index\f[R] Desired location of the specific counter at the counters object. .TP \f[I]comp_mask\f[R] Bitmask specifying what fields in the structure are valid.
.SH RETURN VALUE .PP \f[B]ibv_attach_counters_point_flow\f[R]() returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH ERRORS .TP EINVAL invalid argument(s) passed .TP ENOTSUP \f[I]counter_desc\f[R] is not supported on the requested object .TP EBUSY the counter object is already bound to a flow; additional attach calls are not allowed (valid for static attach only) .TP ENOMEM not enough memory .SH NOTES .PP Counter values in each index location are cleared upon creation when calling \f[B]ibv_create_counters\f[R](). Attaching counters points will only increase these values accordingly. .SH EXAMPLE .PP An example of use of \f[B]ibv_attach_counters_point_flow\f[R]() is shown in \f[B]ibv_read_counters\f[R] .SH SEE ALSO .PP \f[B]ibv_create_counters\f[R], \f[B]ibv_destroy_counters\f[R], \f[B]ibv_read_counters\f[R], \f[B]ibv_create_flow\f[R] .SH AUTHORS .PP Raed Salem .PP Alex Rosenbaum rdma-core-56.1/buildlib/pandoc-prebuilt/a6005575c35a13bb5c24946d5327aeea5feb206f0000644000175100002000000000410514773456413033427 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "EFADV_CREATE_QP_EX" "3" "2019-08-06" "efa" "EFA Direct Verbs Manual" .hy .SH NAME .PP efadv_create_qp_ex - Create EFA specific extended Queue Pair .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/efadv.h> struct ibv_qp *efadv_create_qp_ex(struct ibv_context *ibvctx, struct ibv_qp_init_attr_ex *attr_ex, struct efadv_qp_init_attr *efa_attr, uint32_t inlen); \f[R] .fi .SH DESCRIPTION .PP \f[B]efadv_create_qp_ex()\f[R] creates a device-specific extended Queue Pair. .PP The argument attr_ex is an ibv_qp_init_attr_ex struct, as defined in <infiniband/verbs.h>. .PP Use ibv_qp_to_qp_ex() to get the ibv_qp_ex for accessing the send ops iterator interface, when QP create attr IBV_QP_INIT_ATTR_SEND_OPS_FLAGS is used. .PP Scalable Reliable Datagram (SRD) transport provides reliable out-of-order delivery, transparently utilizing multiple network paths to reduce network tail latency. Its interface is similar to UD; in particular, it supports message sizes up to the MTU, with error handling extended to support reliable communication. .PP Compatibility is handled using the comp_mask and inlen fields. .IP .nf \f[C] struct efadv_qp_init_attr { uint64_t comp_mask; uint32_t driver_qp_type; uint16_t flags; uint8_t sl; uint8_t reserved[1]; }; \f[R] .fi .TP \f[I]inlen\f[R] In: Size of struct efadv_qp_init_attr. .TP \f[I]comp_mask\f[R] Compatibility mask. .TP \f[I]driver_qp_type\f[R] The type of QP to be created: .RS .PP EFADV_QP_DRIVER_TYPE_SRD: Create an SRD QP. .RE .TP \f[I]flags\f[R] .IP .nf \f[C] A bitwise OR of the values described below. \f[R] .fi .RS .PP EFADV_QP_FLAGS_UNSOLICITED_WRITE_RECV: Receive WRs will not be consumed for RDMA write with imm. .RE .TP \f[I]sl\f[R] Service Level - 0 value implies default level. .SH RETURN VALUE .PP efadv_create_qp_ex() returns a pointer to the created QP, or NULL if the request fails. .SH SEE ALSO .PP \f[B]efadv\f[R](7), \f[B]ibv_create_qp_ex\f[R](3) .SH AUTHORS .PP Gal Pressman Daniel Kranzdorf rdma-core-56.1/buildlib/pandoc-prebuilt/2b0acd4321378a4260fb5b442a2d4e8b4834c12d0000644000175100002000000000436414773456417033340 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] ..
.de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBCACHEEDIT" 8 "2017-08-21" "" "Open IB Diagnostics" .SH NAME ibcacheedit \- edit an ibnetdiscover cache .SH SYNOPSIS .sp ibcacheedit [options] .SH DESCRIPTION .sp ibcacheedit allows users to edit an ibnetdiscover cache created through the \fB\-\-cache\fP option in \fBibnetdiscover(8)\fP. .SH OPTIONS .INDENT 0.0 .TP .B \fB\-\-switchguid BEFOREGUID:AFTERGUID\fP Specify a switchguid that should be changed. The before and after guid should be separated by a colon. On switches, port guids are identical to the switch guid, so port guids will be adjusted as well on switches. .TP .B \fB\-\-caguid BEFOREGUID:AFTERGUID\fP Specify a caguid that should be changed. The before and after guid should be separated by a colon. .TP .B \fB\-\-sysimgguid BEFOREGUID:AFTERGUID\fP Specify a sysimgguid that should be changed. The before and after guid should be separated by a colon. .TP .B \fB\-\-portguid NODEGUID:BEFOREGUID:AFTERGUID\fP Specify a portguid that should be changed. The nodeguid of the port (e.g. switchguid or caguid) should be specified first, followed by a colon, the before port guid, another colon, then the after port guid. On switches, port guids are identical to the switch guid, so the switch guid will be adjusted as well on switches. .UNINDENT .SS Debugging flags .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SH AUTHORS .INDENT 0.0 .TP .B Albert Chu < \fI\%chu11@llnl.gov\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/57c450015ca13612701f4c70e1a3770ed09899170000644000175100002000000003103514773456417033001 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBNETDISCOVER" 8 "2013-06-22" "" "Open IB Diagnostics" .SH NAME IBNETDISCOVER \- discover InfiniBand topology .SH SYNOPSIS .sp ibnetdiscover [options] [] .SH DESCRIPTION .sp ibnetdiscover performs IB subnet discovery and outputs a human readable topology file. GUIDs, node types, and port numbers are displayed as well as port LIDs and NodeDescriptions. All nodes (and links) are displayed (full topology). Optionally, this utility can be used to list the current connected nodes by nodetype. The output is printed to standard output unless a topology file is specified. .SH OPTIONS .sp \fB\-l, \-\-list\fP List of connected nodes .sp \fB\-g, \-\-grouping\fP Show grouping.
Grouping correlates IB nodes by different vendor specific schemes. It may also show the switch external ports correspondence. .sp \fB\-H, \-\-Hca_list\fP List of connected CAs .sp \fB\-S, \-\-Switch_list\fP List of connected switches .sp \fB\-R, \-\-Router_list\fP List of connected routers .sp \fB\-s, \-\-show\fP Show progress information during discovery. .sp \fB\-f, \-\-full\fP Show full information (ports\(aq speed and width, vlcap) .sp \fB\-p, \-\-ports\fP Obtain a ports report which is a list of connected ports with relevant information (like LID, portnum, GUID, width, speed, and NodeDescription). .sp \fB\-m, \-\-max_hops\fP Report max hops discovered. .\" Define the common option -z . .INDENT 0.0 .TP .B \fB\-\-outstanding_smps, \-o \fP Specify the number of outstanding SMP\(aqs which should be issued during the scan .sp Default: 2 .UNINDENT .SS Cache File flags .\" Define the common option cache . .sp \fB\-\-cache \fP Cache the ibnetdiscover network data in the specified filename. This cache may be used by other tools for later analysis. .\" Define the common option load-cache . .sp \fB\-\-load\-cache \fP Load and use the cached ibnetdiscover data stored in the specified filename. May be useful for outputting and learning about other fabrics or a previous state of a fabric. .\" Define the common option diff . .sp \fB\-\-diff \fP Load cached ibnetdiscover data and do a diff comparison to the current network or another cache. A special diff output for ibnetdiscover output will be displayed showing differences between the old and current fabric. By default, the following are compared for differences: switches, channel adapters, routers, and port connections. .\" Define the common option diffcheck . .sp \fB\-\-diffcheck \fP Specify what diff checks should be done in the \fB\-\-diff\fP option above. Comma separate multiple diff check key(s). The available diff checks are: \fBsw = switches\fP, \fBca = channel adapters\fP, \fBrouter\fP = routers, \fBport\fP = port connections, \fBlid\fP = lids, \fBnodedesc\fP = node descriptions. Note that \fBport\fP, \fBlid\fP, and \fBnodedesc\fP are checked only for the node types that are specified (e.g. \fBsw\fP, \fBca\fP, \fBrouter\fP). If \fBport\fP is specified alongside \fBlid\fP or \fBnodedesc\fP, remote port lids and node descriptions will also be compared. .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Configuration flags .\" Define the common option -z . 
.sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .\" Define the common option -z . .INDENT 0.0 .TP .B \fB\-\-outstanding_smps, \-o \fP Specify the number of outstanding SMP\(aqs which should be issued during the scan .sp Default: 2 .UNINDENT .\" Define the common option --node-name-map . .sp \fB\-\-node\-name\-map \fP Specify a node name map. .INDENT 0.0 .INDENT 3.5 This file maps GUIDs to more user friendly names. See FILES section. .UNINDENT .UNINDENT .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .\" Common text to describe the node name map file. . .SS NODE NAME MAP FILE FORMAT .sp The node name map is used to specify user friendly names for nodes in the output. GUIDs are used to perform the lookup. .sp This functionality is provided by the opensm\-libs package. See \fBopensm(8)\fP for the file location for your installation. .sp \fBGenerically:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # comment "" .ft P .fi .UNINDENT .UNINDENT .sp \fBExample:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # IB1 # Line cards 0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D" # Spines 0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" # GUID Node Name 0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D" .ft P .fi .UNINDENT .UNINDENT .\" Common text to describe the Topology file. . .SS TOPOLOGY FILE FORMAT .sp The topology file format is human readable and largely intuitive. 
Most identifiers are given textual names like vendor ID (vendid), device ID (devid), GUIDs of various types (sysimgguid, caguid, switchguid, etc.). PortGUIDs are shown in parentheses (). For switches, this is shown on the switchguid line. For CA and router ports, it is shown on the connectivity lines. The IB node is identified followed by the number of ports and a quoted node GUID. On the right of this line is a comment (#) followed by the NodeDescription in quotes. If the node is a switch, this line also contains whether switch port 0 is base or enhanced, and the LID and LMC of port 0. Subsequent lines pertaining to this node show the connectivity. On the left is the port number of the current node. On the right is the peer node (node at other end of link). It is identified in quotes with nodetype followed by \- followed by NodeGUID with the port number in square brackets. Further on the right is a comment (#). What follows the comment is dependent on the node type. If it is a switch node, it is followed by the NodeDescription in quotes and the LID of the peer node. If it is a CA or router node, it is followed by the local LID and LMC and then followed by the NodeDescription in quotes and the LID of the peer node. The active link width and speed are then appended to the end of this output line. .sp An example of this is: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # # Topology file: generated on Tue Jun 5 14:15:10 2007 # # Max of 3 hops discovered # Initiated from node 0008f10403960558 port 0008f10403960559 Non\-Chassis Nodes vendid=0x8f1 devid=0x5a06 sysimgguid=0x5442ba00003000 switchguid=0x5442ba00003080(5442ba00003080) Switch 24 "S\-005442ba00003080" # "ISR9024 Voltaire" base port 0 lid 6 lmc 0 [22] "H\-0008f10403961354"[1](8f10403961355) # "MT23108 InfiniHost Mellanox Technologies" lid 4 4xSDR [10] "S\-0008f10400410015"[1] # "SW\-6IB4 Voltaire" lid 3 4xSDR [8] "H\-0008f10403960558"[2](8f1040396055a) # "MT23108 InfiniHost Mellanox Technologies" lid 14 4xSDR [6] "S\-0008f10400410015"[3] # "SW\-6IB4 Voltaire" lid 3 4xSDR [12] "H\-0008f10403960558"[1](8f10403960559) # "MT23108 InfiniHost Mellanox Technologies" lid 10 4xSDR vendid=0x8f1 devid=0x5a05 switchguid=0x8f10400410015(8f10400410015) Switch 8 "S\-0008f10400410015" # "SW\-6IB4 Voltaire" base port 0 lid 3 lmc 0 [6] "H\-0008f10403960984"[1](8f10403960985) # "MT23108 InfiniHost Mellanox Technologies" lid 16 4xSDR [4] "H\-005442b100004900"[1](5442b100004901) # "MT23108 InfiniHost Mellanox Technologies" lid 12 4xSDR [1] "S\-005442ba00003080"[10] # "ISR9024 Voltaire" lid 6 1xSDR [3] "S\-005442ba00003080"[6] # "ISR9024 Voltaire" lid 6 4xSDR vendid=0x2c9 devid=0x5a44 caguid=0x8f10403960984 Ca 2 "H\-0008f10403960984" # "MT23108 InfiniHost Mellanox Technologies" [1](8f10403960985) "S\-0008f10400410015"[6] # lid 16 lmc 1 "SW\-6IB4 Voltaire" lid 3 4xSDR vendid=0x2c9 devid=0x5a44 caguid=0x5442b100004900 Ca 2 "H\-005442b100004900" # "MT23108 InfiniHost Mellanox Technologies" [1](5442b100004901) "S\-0008f10400410015"[4] # lid 12 lmc 1 "SW\-6IB4 Voltaire" lid 3 4xSDR vendid=0x2c9 devid=0x5a44 caguid=0x8f10403961354 Ca 2 "H\-0008f10403961354" # "MT23108 InfiniHost Mellanox Technologies" [1](8f10403961355) "S\-005442ba00003080"[22] # lid 4 lmc 1 "ISR9024 Voltaire" lid 6 4xSDR vendid=0x2c9 devid=0x5a44 caguid=0x8f10403960558 Ca 2 "H\-0008f10403960558" # "MT23108 InfiniHost Mellanox Technologies" [2](8f1040396055a) "S\-005442ba00003080"[8] # lid 14 lmc 1 "ISR9024 Voltaire" lid 6 4xSDR [1](8f10403960559) "S\-005442ba00003080"[12] # lid 10 lmc 1
"ISR9024 Voltaire" lid 6 1xSDR .ft P .fi .UNINDENT .UNINDENT .sp When grouping is used, IB nodes are organized into chassis which are numbered. Nodes which cannot be determined to be in a chassis are displayed as "Non\-Chassis Nodes". External ports are also shown on the connectivity lines. .SH AUTHORS .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%halr@voltaire.com\fP > .TP .B Ira Weiny < \fI\%ira.weiny@intel.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/23046225aae54879fdd2d044ba307096e412d64c0000644000175100002000000000267114773456414033220 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_alloc_var / mlx5dv_free_var" "3" "" "" "" .hy .SH NAME .PP mlx5dv_alloc_var - Allocates a VAR .PP mlx5dv_free_var - Frees a VAR .SH SYNOPSIS .IP .nf \f[C] #include struct mlx5dv_var * mlx5dv_alloc_var(struct ibv_context *context, uint32_t flags); void mlx5dv_free_var(struct mlx5dv_var *dv_var); \f[R] .fi .SH DESCRIPTION .PP Create / free a VAR which can be used for some device commands over the DEVX interface. .PP The DEVX API enables direct access from the user space area to the mlx5 device driver, the VAR information is needed for few commands related to Virtio. .SH ARGUMENTS .TP \f[I]context\f[R] RDMA device context to work on. .TP \f[I]flags\f[R] Allocation flags for the UAR. .SS dv_var .IP .nf \f[C] struct mlx5dv_var { uint32_t page_id; uint32_t length; off_t mmap_off; uint64_t comp_mask; }; \f[R] .fi .TP \f[I]page_id\f[R] The device page id to be used. .TP \f[I]length\f[R] The mmap length parameter to be used for mapping a VA to the allocated VAR entry. .TP \f[I]mmap_off\f[R] The mmap offset parameter to be used for mapping a VA to the allocated VAR entry. .SH RETURN VALUE .PP Upon success \f[I]mlx5dv_alloc_var\f[R] returns a pointer to the created VAR ,on error NULL will be returned and errno will be set. .SH SEE ALSO .PP \f[B]mlx5dv_open_device\f[R], \f[B]mlx5dv_devx_obj_create\f[R] .SH AUTHOR .PP Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/fde27be097c2a08bd5bfc0ada95b2446558486b40000644000175100002000000002211314773456420033604 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBQUERYERRORS" 8 "2016-09-26" "" "OpenIB Diagnostics" .SH NAME IBQUERYERRORS \- query and report IB port counters .SH SYNOPSIS .sp ibqueryerrors [options] .SH DESCRIPTION .sp The default behavior is to report the port error counters which exceed a threshold for each port in the fabric. The default threshold is zero (0). Error fields can also be suppressed entirely. .sp In addition to reporting errors on every port. ibqueryerrors can report the port transmit and receive data as well as report full link information to the remote port if available. 
.SH OPTIONS .sp \fB\-s, \-\-suppress \fP Suppress the errors listed in the comma\-separated list provided. .sp \fB\-c, \-\-suppress\-common\fP Suppress some of the common "side effect" counters. These counters usually do not indicate an error condition and can usually be safely ignored. .sp \fB\-r, \-\-report\-port\fP Report the port information. This includes LID, port, external port (if applicable), link speed setting, remote GUID, remote port, remote external port (if applicable), and remote node description information. .sp \fB\-\-data\fP Include the optional transmit and receive data counters. .sp \fB\-\-threshold\-file \fP Specify an alternate threshold file. The default is /usr/local/etc/infiniband\-diags/error_thresholds .sp \fB\-\-switch\fP print data for switches only .sp \fB\-\-ca\fP print data for CAs only .sp \fB\-\-skip\-sl\fP Use the default sl for queries. This is not recommended when using a QoS aware routing engine as it can cause a credit deadlock. .sp \fB\-\-router\fP print data for routers only .sp \fB\-\-clear\-errors \-k\fP Clear error counters after read. .sp \fB\-\-clear\-counts \-K\fP Clear data counters after read. .sp \fBCAUTION\fP clearing data or error counters will occur regardless of whether they are printed. See \fB\-\-counters\fP and \fB\-\-data\fP for details on controlling which counters are printed. .sp \fB\-\-details\fP include receive error and transmit discard details .sp \fB\-\-counters\fP print data counters only .SS Partial Scan flags .sp The node to start a partial scan can be specified with the following addresses. .\" Define the common option -G . .sp \fB\-\-port\-guid, \-G \fP Specify a port_guid .\" Define the common option -D for Directed routes . .sp \fB\-D, \-\-Direct \fP The address specified is a directed route .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C Examples: \-D "0" # self port \-D "0,1,2,1,4" # out via port 1, then 2, ... (Note the second number in the path specified must match the port being used. This can be specified using the port selection flag \(aq\-P\(aq or the port found through the automatic selection process.) .ft P .fi .UNINDENT .UNINDENT .sp \fBNote:\fP For switches, results are printed for all ports, not just switch port 0. .sp \fB\-S \fP same as "\-G". (provided only for backward compatibility) .SS Cache File flags .\" Define the common option load-cache . .sp \fB\-\-load\-cache \fP Load and use the cached ibnetdiscover data stored in the specified filename. May be useful for outputting and learning about other fabrics or a previous state of a fabric. .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only.
ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Configuration flags .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .\" Define the common option -z . .INDENT 0.0 .TP .B \fB\-\-outstanding_smps, \-o \fP Specify the number of outstanding SMP\(aqs which should be issued during the scan .sp Default: 2 .UNINDENT .\" Define the common option --node-name-map . .sp \fB\-\-node\-name\-map \fP Specify a node name map. .INDENT 0.0 .INDENT 3.5 This file maps GUIDs to more user friendly names. See FILES section. .UNINDENT .UNINDENT .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .sp \fB\-R\fP (This option is obsolete and does nothing) .SH EXIT STATUS .sp \fB\-1\fP if scan fails. .sp \fB0\fP if scan succeeds without errors beyond thresholds .sp \fB1\fP if errors are found beyond thresholds or inconsistencies are found in check mode. .SH FILES .SS ERROR THRESHOLD .sp /usr/local/etc/infiniband\-diags/error_thresholds .sp Define threshold values for errors. File format is simple "name=val". Comments begin with \(aq#\(aq .sp \fBExample:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # Define thresholds for error counters SymbolErrorCounter=10 LinkErrorRecoveryCounter=10 VL15Dropped=100 .ft P .fi .UNINDENT .UNINDENT .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .\" Common text to describe the node name map file. . .SS NODE NAME MAP FILE FORMAT .sp The node name map is used to specify user friendly names for nodes in the output. GUIDs are used to perform the lookup. .sp This functionality is provided by the opensm\-libs package. See \fBopensm(8)\fP for the file location for your installation. 
.sp \fBGenerically:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # comment "" .ft P .fi .UNINDENT .UNINDENT .sp \fBExample:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # IB1 # Line cards 0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D" # Spines 0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" # GUID Node Name 0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D" .ft P .fi .UNINDENT .UNINDENT .SH AUTHOR .INDENT 0.0 .TP .B Ira Weiny < \fI\%ira.weiny@intel.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/c0c239f1fb706358d4ee439f21164f7fb0662c860000644000175100002000000000225514773456413033322 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_RATE_TO_MBPS" "3" "2012-03-31" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_rate_to_mbps - convert IB rate enumeration to Mbit/sec .PP mbps_to_ibv_rate - convert Mbit/sec to an IB rate enumeration .SH SYNOPSIS .IP .nf \f[C] #include int ibv_rate_to_mbps(enum ibv_rate rate); enum ibv_rate mbps_to_ibv_rate(int mbps); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_rate_to_mbps()\f[R] converts the IB transmission rate enumeration \f[I]rate\f[R] to a number of Mbit/sec.\ For example, if \f[I]rate\f[R] is \f[B]IBV_RATE_5_GBPS\f[R], the value 5000 will be returned (5 Gbit/sec = 5000 Mbit/sec). .PP \f[B]mbps_to_ibv_rate()\f[R] converts the number of Mbit/sec \f[I]mult\f[R] to an IB transmission rate enumeration. For example, if \f[I]mult\f[R] is 5000, the rate enumeration \f[B]IBV_RATE_5_GBPS\f[R] will be returned. .SH RETURN VALUE .PP \f[B]ibv_rate_to_mbps()\f[R] returns the number of Mbit/sec. .PP \f[B]mbps_to_ibv_rate()\f[R] returns the enumeration representing the IB transmission rate. .SH SEE ALSO .PP \f[B]ibv_query_port\f[R](3) .SH AUTHOR .PP Dotan Barak rdma-core-56.1/buildlib/pandoc-prebuilt/624de381c4dd90a5061dfb899e33d1aff4f8af1c0000644000175100002000000000120514773456415033672 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_is_supported" "3" "" "" "" .hy .SH NAME .PP mlx5dv_is_supported - Check whether an RDMA device implemented by the mlx5 provider .SH SYNOPSIS .IP .nf \f[C] #include bool mlx5dv_is_supported(struct ibv_device *device); \f[R] .fi .SH DESCRIPTION .PP mlx5dv functions may be called only if this function returns true for the RDMA device. .SH ARGUMENTS .TP \f[I]device\f[R] RDMA device to check. .SH RETURN VALUE .PP Returns true if device is implemented by mlx5 provider. 
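.SH EXAMPLE
.PP
A minimal sketch of probing for mlx5 devices (illustrative only; error
handling is omitted and at least one device is assumed to be present):
.IP
.nf
\f[C]
#include <stdio.h>
#include <infiniband/mlx5dv.h>

int main(void)
{
    int num;
    struct ibv_device **list = ibv_get_device_list(&num);

    /* Print the name of every device implemented by mlx5 */
    for (int i = 0; i < num; i++)
        if (mlx5dv_is_supported(list[i]))
            puts(ibv_get_device_name(list[i]));

    ibv_free_device_list(list);
    return 0;
}
\f[R]
.fi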
.SH SEE ALSO .PP \f[I]mlx5dv(7)\f[R] .SH AUTHOR .PP Artemy Kovalyov rdma-core-56.1/buildlib/pandoc-prebuilt/ed34436575fb0561941fec3de0462234e81f9cec0000644000175100002000000001361614773456417033406 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "DUMP_FTS" 8 "2013-03-26" "" "OpenIB Diagnostics" .SH NAME DUMP_FTS \- dump InfiniBand forwarding tables .SH SYNOPSIS .sp dump_fts [options] [ []] .SH DESCRIPTION .sp dump_fts is similar to ibroute but dumps tables for every switch found in an ibnetdiscover scan of the subnet. .sp The dump file format is compatible with loading into OpenSM using the \-R file \-U /path/to/dump\-file syntax. .SH OPTIONS .INDENT 0.0 .TP .B \fB\-a, \-\-all\fP show all lids in range, even invalid entries .TP .B \fB\-n, \-\-no_dests\fP do not try to resolve destinations .TP .B \fB\-M, \-\-Multicast\fP show multicast forwarding tables In this case, the range parameters are specifying the mlid range. .UNINDENT .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Configuration flags .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -y . 
.INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .\" Define the common option --node-name-map . .sp \fB\-\-node\-name\-map \fP Specify a node name map. .INDENT 0.0 .INDENT 3.5 This file maps GUIDs to more user friendly names. See FILES section. .UNINDENT .UNINDENT .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .\" Common text to describe the node name map file. . .SS NODE NAME MAP FILE FORMAT .sp The node name map is used to specify user friendly names for nodes in the output. GUIDs are used to perform the lookup. .sp This functionality is provided by the opensm\-libs package. See \fBopensm(8)\fP for the file location for your installation. .sp \fBGenerically:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # comment "" .ft P .fi .UNINDENT .UNINDENT .sp \fBExample:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # IB1 # Line cards 0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D" # Spines 0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" # GUID Node Name 0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D" .ft P .fi .UNINDENT .UNINDENT .SH SEE ALSO .sp \fBdump_lfts(8), dump_mfts(8), ibroute(8), ibswitches(8), opensm(8)\fP .SH AUTHORS .INDENT 0.0 .TP .B Ira Weiny < \fI\%ira.weiny@intel.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/bc330f50986a4c202ab66bc12d82b6904bff909f0000644000175100002000000000171314773456413033437 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "EFADV" "7" "2019-01-19" "efa" "EFA Direct Verbs Manual" .hy .SH NAME .PP efadv - Direct verbs for efa devices .PP This provides low level access to efa devices to perform direct operations, without general branching performed by libibverbs. .SH DESCRIPTION .PP The libibverbs API is an abstract one. It is agnostic to any underlying provider specific implementation. While this abstraction has the advantage of user applications portability, it has a performance penalty. For some applications optimizing performance is more important than portability. .PP The efa direct verbs API is intended for such applications. It exposes efa specific low level operations, allowing the application to bypass the libibverbs API. 
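.PP
For instance, a minimal sketch using one such entry point,
\f[B]efadv_query_device\f[R](3) (shown here purely as an illustration;
error handling is omitted and \f[I]ctx\f[R] is assumed to be an
ibv_context opened on an EFA device):
.IP
.nf
\f[C]
#include <infiniband/efadv.h>

struct efadv_device_attr attr = {0};

/* Query EFA-specific device attributes, bypassing libibverbs */
if (efadv_query_device(ctx, &attr, sizeof(attr)) == 0) {
    /* attr now holds EFA-specific capabilities */
}
\f[R]
.fi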
.PP The direct include of efadv.h together with linkage to the efa library will allow usage of this new interface. .SH SEE ALSO .PP \f[B]verbs\f[R](7) .SH AUTHORS .PP Gal Pressman rdma-core-56.1/buildlib/pandoc-prebuilt/c20e5e8509a68bd427c4300f25e71112e7c0d7cd0000644000175100002000000001203314773456415033352 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_devx_obj_create / destroy / modify / query / general" "3" "" "" "" .hy .SH NAME .PP mlx5dv_devx_obj_create - Creates a devx object .PP mlx5dv_devx_obj_destroy - Destroys a devx object .PP mlx5dv_devx_obj_modify - Modifies a devx object .PP mlx5dv_devx_obj_query - Queries a devx object .PP mlx5dv_devx_obj_query_async - Queries a devx object in an asynchronous mode .PP mlx5dv_devx_general_cmd - Issues a general command over the devx interface .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/mlx5dv.h> struct mlx5dv_devx_obj * mlx5dv_devx_obj_create(struct ibv_context *context, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_obj_query(struct mlx5dv_devx_obj *obj, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_obj_query_async(struct mlx5dv_devx_obj *obj, const void *in, size_t inlen, size_t outlen, uint64_t wr_id, struct mlx5dv_devx_cmd_comp *cmd_comp); int mlx5dv_devx_obj_modify(struct mlx5dv_devx_obj *obj, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_obj_destroy(struct mlx5dv_devx_obj *obj); int mlx5dv_devx_general_cmd(struct ibv_context *context, const void *in, size_t inlen, void *out, size_t outlen); \f[R] .fi .SH DESCRIPTION .PP Create / destroy / modify / query a devx object, issue a general command over the devx interface. .PP The DEVX API enables direct access from the user space area to the mlx5 device driver by using the KABI mechanism. The main purpose is to make the user space driver as independent as possible from the kernel so that future device functionality and commands can be activated with minimal to no kernel changes. .PP A DEVX object represents some underlay firmware object; the input command to create it is some raw data given by the user application which should match the device specification. Upon successful creation the output buffer includes the raw data from the device according to its specification; this data can be used as part of related firmware commands to this object. .PP Once the DEVX object is created it can be queried/modified/destroyed by the matching mlx5dv_devx_obj_xxx() API. Both the input and the output for those APIs need to match the device specification as well. .PP The mlx5dv_devx_general_cmd() API enables issuing a general command which is not related to an object, such as querying device capabilities. .PP The mlx5dv_devx_obj_query_async() API is similar to the query object API, however, it runs asynchronously without blocking. The input includes an mlx5dv_devx_cmd_comp object and an identifier named `wr_id' for this command. The response should be read upon success with the mlx5dv_devx_get_async_cmd_comp() API. The `wr_id' that was supplied as an input is returned as part of the response to let the application know which command the response relates to. .PP An application can gradually migrate to use DEVX according to its needs; it is not all or nothing. For example, it can create an ibv_cq via the ibv_create_cq() verb and then use the returned cqn to create a DEVX QP object by the mlx5dv_devx_obj_create() API which needs that cqn.
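.PP
A rough sketch of that flow (illustrative only; \f[I]device\f[R], the
buffer sizes and the command layouts in \f[I]in\f[R]/\f[I]out\f[R] are
placeholders that must be built per the device specification, and error
handling is omitted):
.IP
.nf
\f[C]
struct mlx5dv_context_attr dv_attr = {
    .flags = MLX5DV_CONTEXT_FLAGS_DEVX,
};
struct ibv_context *ctx = mlx5dv_open_device(device, &dv_attr);

/* Regular verb object; its cqn feeds the DEVX command below */
struct ibv_cq *cq = ibv_create_cq(ctx, 128, NULL, NULL, 0);

/* Raw command buffers laid out per the device specification;
 * the cqn of cq (e.g. obtained via mlx5dv_init_obj()) would be
 * written into the create_qp command input. */
uint8_t in[512] = {0};
uint8_t out[512] = {0};
struct mlx5dv_devx_obj *qp_obj =
    mlx5dv_devx_obj_create(ctx, in, sizeof(in), out, sizeof(out));
\f[R]
.fi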
.PP The above example can enable an application to create a QP with some driver specific attributes that are not exposed in the ibv_create_qp() API; in that case no user or kernel change may be needed at all, as the command input reaches the firmware directly. .PP The expected users for the DEVX APIs are applications that use the mlx5 DV APIs and are familiar with the device specification in both control and data path. .PP To successfully create a DEVX object and work on it, a DEVX context must be created; this is done by the mlx5dv_open_device() API with the \f[I]MLX5DV_CONTEXT_FLAGS_DEVX\f[R] flag. .SH ARGUMENTS .TP \f[I]context\f[R] RDMA device context to create the action on. .TP \f[I]in\f[R] A buffer which contains the command\[cq]s input data provided in a device specification format. .TP \f[I]inlen\f[R] The size of \f[I]in\f[R] buffer in bytes. .TP \f[I]out\f[R] A buffer which contains the command\[cq]s output data according to the device specification format. .TP \f[I]outlen\f[R] The size of \f[I]out\f[R] buffer in bytes. .TP \f[I]obj\f[R] For query, modify, destroy: the devx object to work on. .TP \f[I]wr_id\f[R] The command identifier when working in asynchronous mode. .TP \f[I]cmd_comp\f[R] The command completion object to read the response from in asynchronous mode. .SH RETURN VALUE .PP Upon success \f[I]mlx5dv_devx_obj_create\f[R] will return a new \f[I]struct mlx5dv_devx_obj\f[R]; on error NULL will be returned and errno will be set. .PP Upon success of the query, modify, destroy and general commands, 0 is returned, or the value of errno on a failure. .PP If the error value is EREMOTEIO, outbox.status and outbox.syndrome will contain the command failure details. .SH SEE ALSO .PP \f[B]mlx5dv_open_device\f[R], \f[B]mlx5dv_devx_create_cmd_comp\f[R], \f[B]mlx5dv_devx_get_async_cmd_comp\f[R] .SH AUTHOR .PP Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/72e3fc8ebb7e504f7d36ce7478cd7cff2fa1ee750000644000175100002000000000772314773456415034060 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_mkey_check" "3" "" "" "" .hy .SH NAME .PP mlx5dv_mkey_check - Check an MKEY for errors .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/mlx5dv.h> int mlx5dv_mkey_check(struct mlx5dv_mkey *mkey, struct mlx5dv_mkey_err *err_info); \f[R] .fi .SH DESCRIPTION .PP Checks \f[I]mkey\f[R] for errors and provides the result in \f[I]err_info\f[R] on success. .PP This should be called after using an MKEY configured with signature validation in a transfer operation. While the transfer operation itself may be completed successfully (i.e.\ no transport related errors occurred), there still may be errors related to the integrity of the data. The first of these errors is reported to the MKEY and kept there until application software queries it by calling this API. .PP The type of error indicates which part of the signature was bad (guard, reftag or apptag). Also provided is the actual calculated value based on the transferred data, and the expected value based on the signature fields. The last part provided is the offset in the transfer that caused the error. .SH ARGUMENTS .TP \f[I]mkey\f[R] The MKEY to check for errors. .TP \f[I]err_info\f[R] The result of the MKEY check: information about the errors detected, if any. .RS .IP .nf \f[C] struct mlx5dv_mkey_err { enum mlx5dv_mkey_err_type err_type; union { struct mlx5dv_sig_err sig; } err; }; \f[R] .fi .TP \f[I]err_type\f[R] What kind of error happened.
If several errors exist in one block verified by the device, only the first of them is reported, according to the order specified in T10DIF specification, which is: \f[B]MLX5DV_MKEY_SIG_BLOCK_BAD_GUARD\f[R], \f[B]MLX5DV_MKEY_SIG_BLOCK_BAD_APPTAG\f[R], \f[B]MLX5DV_MKEY_SIG_BLOCK_BAD_REFTAG\f[R]. .RS .TP \f[B]MLX5DV_MKEY_NO_ERR\f[R] No error is detected for the MKEY. .TP \f[B]MLX5DV_MKEY_SIG_BLOCK_BAD_GUARD\f[R] A signature error was detected in CRC/CHECKSUM for T10-DIF or CRC32/CRC32C/CRC64_XP10 (depends on the configured signature type). Additional information about the error is provided in \f[B]struct mlx5dv_sig_err\f[R] of \f[I]err\f[R]. .TP \f[B]MLX5DV_MKEY_SIG_BLOCK_BAD_REFTAG\f[R] A signature error was detected in the reference tag. This kind of signature error is relevant for T10-DIF only. Additional information about the error is provided in \f[B]struct mlx5dv_sig_err\f[R] of \f[I]err\f[R]. .TP \f[B]MLX5DV_MKEY_SIG_BLOCK_BAD_APPTAG\f[R] A signature error was detected in the application tag. This kind of signature error is relevant for T10-DIF only. Additional information about the error is provided in \f[B]struct mlx5dv_sig_err\f[R] of \f[I]err\f[R]. .RE .TP \f[I]err\f[R] Information about the detected error if \f[I]err_type\f[R] is not \f[B]MLX5DV_MKEY_NO_ERR\f[R]. Otherwise, its value is not defined. .RE .SS Signature error .IP .nf \f[C] struct mlx5dv_sig_err { uint64_t actual_value; uint64_t expected_value; uint64_t offset; }; \f[R] .fi .TP \f[I]actual_value\f[R] The actual value that was calculated from the transferred data. .TP \f[I]expected_value\f[R] The expected value based on what appears in the signature respected field. .TP \f[I]offset\f[R] The offset within the transfer where the error happened. In block signature, this is guaranteed to be a block boundary offset. .SH RETURN VALUE .PP 0 on success or the value of errno on failure (which indicates the failure reason). .SH NOTES .PP A DEVX context should be opened by using \f[B]mlx5dv_open_device\f[R](3). .PP Checking the MKEY for errors should be done after the application knows the data transfer that was using the MKEY has finished. Application should wait for the respected completion (if this was a local MKEY) or wait for a received message from a peer (if this was a remote MKEY). .SH SEE ALSO .PP \f[B]mlx5dv_wr_mkey_configure\f[R](3), \f[B]mlx5dv_wr_set_mkey_sig_block\f[R](3), \f[B]mlx5dv_create_mkey\f[R](3), \f[B]mlx5dv_destroy_mkey\f[R](3) .SH AUTHORS .PP Oren Duer .PP Sergey Gorenko rdma-core-56.1/buildlib/pandoc-prebuilt/f1ca8423b4c349b90448b57350568f1c9c3fdd6b0000644000175100002000000000341414773456416033401 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_reg_dmabuf_mr" "3" "" "" "" .hy .SH NAME .PP mlx5dv_reg_dmabuf_mr - Register a dma-buf based memory region (MR) .SH SYNOPSIS .IP .nf \f[C] #include struct ibv_mr *mlx5dv_reg_dmabuf_mr(struct ibv_pd *pd, uint64_t offset, size_t length, uint64_t iova, int fd, int access, int mlx5_access) \f[R] .fi .SH DESCRIPTION .PP Register a dma-buf based memory region (MR), it follows the functionality of \f[I]ibv_reg_dmabuf_mr()\f[R] with the ability to supply specific mlx5 access flags. .SH ARGUMENTS .TP \f[I]pd\f[R] The associated protection domain. .TP \f[I]offset\f[R] The offset of the dma-buf where the MR starts. .TP \f[I]length\f[R] .IP .nf \f[C] The length of the MR. \f[R] .fi .TP \f[I]iova\f[R] Specifies the virtual base address of the MR when accessed through a lkey or rkey. 
It must have the same page offset as \f[I]offset\f[R] and be aligned with the system page size. .TP \f[I]fd\f[R] The file descriptor that the dma-buf is identified by. .TP \f[I]access\f[R] The desired memory protection attributes; it is either 0 or the bitwise OR of one or more of \f[I]enum ibv_access_flags\f[R]. .TP \f[I]mlx5_access\f[R] Specific device access flags; it is either 0 or the flag below. .RS .PP \f[I]MLX5DV_REG_DMABUF_ACCESS_DATA_DIRECT\f[R] if set, this MR will be accessed through the Data Direct engine bonded with that RDMA device. .RE .SH RETURN VALUE .PP Upon success returns a pointer to the registered MR, or NULL if the request fails; in that case the value of errno indicates the failure reason. .SH SEE ALSO .PP \f[I]ibv_reg_dmabuf_mr(3)\f[R], \f[I]mlx5dv_get_data_direct_sysfs_path(3)\f[R] .SH AUTHOR .PP Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/77e091fce9252614b7c6136f15917606746eac440000644000175100002000000000206114773456415033104 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_devx_query_eqn" "3" "" "" "" .hy .SH NAME .PP mlx5dv_devx_query_eqn - Query EQN for a given vector id. .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/mlx5dv.h> int mlx5dv_devx_query_eqn(struct ibv_context *context, uint32_t vector, uint32_t *eqn); \f[R] .fi .SH DESCRIPTION .PP Query the EQN for a given input vector; the EQN is needed for other device commands over the DEVX interface. .PP The DEVX API enables direct access from the user space area to the mlx5 device driver; the EQN information is needed for a few commands, such as CQ creation. .SH ARGUMENTS .TP \f[I]context\f[R] RDMA device context to work on. .TP \f[I]vector\f[R] Completion vector number. .TP \f[I]eqn\f[R] The device EQ number which relates to the given input vector. .SH RETURN VALUE .PP Returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH SEE ALSO .PP \f[B]mlx5dv_open_device\f[R], \f[B]mlx5dv_devx_obj_create\f[R] .SH AUTHOR .PP Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/fe1de88695b9f8551b1f861987b4188fdd5920020000644000175100002000000000216514773456412033273 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_EVENT_TYPE_STR" "3" "2006-10-31" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_event_type_str - Return string describing event_type enum value .PP ibv_node_type_str - Return string describing node_type enum value .PP ibv_port_state_str - Return string describing port_state enum value .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/verbs.h> const char *ibv_event_type_str(enum ibv_event_type event_type); const char *ibv_node_type_str(enum ibv_node_type node_type); const char *ibv_port_state_str(enum ibv_port_state port_state); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_node_type_str()\f[R] returns a string describing the node type enum value \f[I]node_type\f[R]. .PP \f[B]ibv_port_state_str()\f[R] returns a string describing the port state enum value \f[I]port_state\f[R]. .PP \f[B]ibv_event_type_str()\f[R] returns a string describing the event type enum value \f[I]event_type\f[R]. .SH RETURN VALUE .PP These functions return a constant string that describes the enum value passed as their argument.
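.SH EXAMPLE
.PP
An illustrative sketch printing the type of an asynchronous event
(\f[I]ctx\f[R] is assumed to be an open ibv_context; error handling is
omitted):
.IP
.nf
\f[C]
struct ibv_async_event event;

if (ibv_get_async_event(ctx, &event) == 0) {
    /* Describe the event type in human-readable form */
    puts(ibv_event_type_str(event.event_type));
    ibv_ack_async_event(&event);
}
\f[R]
.fi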
.SH AUTHOR .PP Roland Dreier rdma-core-56.1/buildlib/pandoc-prebuilt/c6c59b5def9ab3d0083324e4053a36a8633658650000644000175100002000000000322714773456412033234 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "ibv_import_pd, ibv_unimport_pd" "3" "2020-5-3" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_import_pd - import a PD from a given ibv_context .PP ibv_unimport_pd - unimport a PD .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/verbs.h> struct ibv_pd *ibv_import_pd(struct ibv_context *context, uint32_t pd_handle); void ibv_unimport_pd(struct ibv_pd *pd); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_import_pd()\f[R] returns a protection domain (PD) that is associated with the given \f[I]pd_handle\f[R] in the given \f[I]context\f[R]. .PP The input \f[I]pd_handle\f[R] value must be a valid kernel handle for a PD object in the given \f[I]context\f[R]. It can be obtained from the original PD by getting its ibv_pd->handle member value. .PP The returned \f[I]ibv_pd\f[R] can be used in all verbs that get a protection domain. .PP \f[B]ibv_unimport_pd()\f[R] unimports the PD. Once use of the PD has ended, either ibv_dealloc_pd() or ibv_unimport_pd() should be called. The first goes to the kernel to destroy the object, while the second only cleans up the local side of the import, without calling the kernel. .PP It is the responsibility of the application to coordinate between all ibv_context(s) that use this PD. Once destroy is done no other process can touch the object except for unimport. All users of the context must collaborate to ensure this. .SH RETURN VALUE .PP \f[B]ibv_import_pd()\f[R] returns a pointer to the allocated PD, or NULL if the request fails. .SH SEE ALSO .PP \f[B]ibv_alloc_pd\f[R](3), \f[B]ibv_dealloc_pd\f[R](3) .SH AUTHOR .PP Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/3385187a51fce9f9219cfcb7c541af106b8beb930000644000175100002000000003225514773456416033545 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_wr_mkey_configure" "3" "" "" "" .hy .SH NAME .PP mlx5dv_wr_mkey_configure - Create a work request to configure an MKEY .PP mlx5dv_wr_set_mkey_access_flags - Set the memory protection attributes for an MKEY .PP mlx5dv_wr_set_mkey_layout_list - Set a memory layout for an MKEY based on SGE list .PP mlx5dv_wr_set_mkey_layout_interleaved - Set an interleaved memory layout for an MKEY .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/mlx5dv.h> static inline void mlx5dv_wr_mkey_configure(struct mlx5dv_qp_ex *mqp, struct mlx5dv_mkey *mkey, uint8_t num_setters, struct mlx5dv_mkey_conf_attr *attr); static inline void mlx5dv_wr_set_mkey_access_flags(struct mlx5dv_qp_ex *mqp, uint32_t access_flags); static inline void mlx5dv_wr_set_mkey_layout_list(struct mlx5dv_qp_ex *mqp, uint16_t num_sges, const struct ibv_sge *sge); static inline void mlx5dv_wr_set_mkey_layout_interleaved(struct mlx5dv_qp_ex *mqp, uint32_t repeat_count, uint16_t num_interleaved, const struct mlx5dv_mr_interleaved *data); \f[R] .fi .SH DESCRIPTION .PP The MLX5DV MKEY configure API and the related setters (mlx5dv_wr_set_mkey*) are an extension of the IBV work request API (ibv_wr*) with specific features for MLX5DV MKEY. .PP MKEYs allow creation of virtually-contiguous address spaces out of non-contiguous chunks of memory regions already registered with the hardware. Additionally, they provide access to some advanced hardware offload features, e.g. signature offload.
.PP These APIs are intended to be used to access additional functionality beyond what is provided by \f[B]mlx5dv_wr_mr_list\f[R]() and \f[B]mlx5dv_wr_mr_interleaved\f[R](). The MKEY features can be optionally enabled using the mkey configure setters. It allows using different features in the same MKEY. .SH USAGE .PP To use these APIs a QP must be created using \f[B]mlx5dv_create_qp\f[R](3) which allows setting the \f[B]MLX5DV_QP_EX_WITH_MKEY_CONFIGURE\f[R] in \f[B]send_ops_flags\f[R]. .PP The MKEY configuration work request is created by calling \f[B]mlx5dv_wr_mkey_configure\f[R](), a WR builder function, followed by required setter functions. \f[I]num_setters\f[R] is a number of required setters for the WR. All setters are optional. \f[I]num_setters\f[R] can be zero to apply \f[I]attr\f[R] only. Each setter can be called only once per the WR builder. .PP The WR configures \f[I]mkey\f[R] and applies \f[I]attr\f[R] of the builder function and setter functions\[cq] arguments for it. If \f[I]mkey\f[R] is already configured, the WR overrides some \f[I]mkey\f[R] properties depends on builder and setter functions\[cq] arguments (see details in setters\[cq] description). To clear configuration of \f[I]mkey\f[R], use \f[B]ibv_post_send\f[R]() with \f[B]IBV_WR_LOCAL_INV\f[R] opcode or \f[B]ibv_wr_local_inv\f[R](). .PP Current implementation requires the \f[B]IBV_SEND_INLINE\f[R] option to be set in \f[B]wr_flags\f[R] field of \f[B]ibv_qp_ex\f[R] structure prior to builder function call. Non-inline payload is currently not supported by this API. Please note that inlining here is done for MKEY configuration data, not for user data referenced by data layouts. .PP Once MKEY is configured, it may be used in subsequent work requests (SEND, RDMA_READ, RDMA_WRITE, etc). If these work requests are posted on the same QP, there is no need to wait for completion of MKEY configuration work request. They can be posted immediately after the last setter (or builder if no setters). Usually there is no need to even request a completion for MKEY configuration work request. .PP If completion is requested for MKEY configuration work request it will be delivered with the \f[B]IBV_WC_DRIVER1\f[R] opcode. .SS Builder function .TP \f[B]mlx5dv_wr_mkey_configure()\f[R] Post a work request to configure an existing MKEY. With this call alone, it is possible to configure the MKEY and keep or reset signature attributes. This call may be followed by zero or more optional setters. .RS .TP \f[I]mqp\f[R] The QP to post the work request on. .TP \f[I]mkey\f[R] The MKEY to configure. .TP \f[I]num_setters\f[R] The number of setters that must be called after this function. .TP \f[I]attr\f[R] The MKEY configuration attributes .RE .SS MKEY configuration attributes .PP MKEY configuration attributes are provided in \f[B]mlx5dv_mkey_conf_attr\f[R] structure. .IP .nf \f[C] struct mlx5dv_mkey_conf_attr { uint32_t conf_flags; uint64_t comp_mask; }; \f[R] .fi .TP \f[I]conf_flags\f[R] Bitwise OR of the following flags: .RS .TP \f[B]MLX5DV_MKEY_CONF_FLAG_RESET_SIG_ATTR\f[R] Reset the signature attributes of the MKEY. If not set, previously configured signature attributes will be kept. .RE .TP \f[I]comp_mask\f[R] Reserved for future extension, must be 0 now. .SS Generic setters .TP \f[B]mlx5dv_wr_set_mkey_access_flags()\f[R] Set the memory protection attributes for the MKEY. If the MKEY is configured, the setter overrides the previous value. For example, two MKEY configuration WRs are posted. The first one sets \f[B]IBV_ACCESS_REMOTE_READ\f[R]. 
The second one sets \f[B]IBV_ACCESS_REMOTE_WRITE\f[R]. In this case, the second WR overrides the memory protection attributes, and only \f[B]IBV_ACCESS_REMOTE_WRITE\f[R] is allowed for the MKEY when the WR is completed. .RS .TP \f[I]mqp\f[R] The QP where an MKEY configuration work request was created by \f[B]mlx5dv_wr_mkey_configure()\f[R]. .TP \f[I]access_flags\f[R] The desired memory protection attributes; it is either 0 or the bitwise OR of one or more of flags in \f[B]enum ibv_access_flags\f[R]. .RE .SS Data layout setters .PP Data layout setters define how data referenced by the MKEY will be scattered/gathered in the memory. In order to use MKEY with RDMA operations, it must be configured with a layout. .PP Not more than one data layout setter may follow builder function. Layout can be updated in the next calls to builder function. .PP When MKEY is used in RDMA operations, it should be used in a zero-based mode, i.e.\ the \f[B]addr\f[R] field in \f[B]ibv_sge\f[R] structure is an offset in the total data. .TP \f[B]mlx5dv_wr_set_mkey_layout_list()\f[R] Set a memory layout for an MKEY based on SGE list. If the MKEY is configured and the data layout was defined by some data layout setter (not necessary this one), the setter overrides the previous value. .RS .PP Default WQE size can fit only 4 SGE entries. To allow more, the QP should be created with a larger WQE size that may fit it. This should be done using the \f[B]max_inline_data\f[R] attribute of \f[B]struct ibv_qp_cap\f[R] upon QP creation. .TP \f[I]mqp\f[R] The QP where an MKEY configuration work request was created by \f[B]mlx5dv_wr_mkey_configure()\f[R]. .TP \f[I]num_sges\f[R] Number of SGEs in the list. .TP \f[I]sge\f[R] Pointer to the list of \f[B]ibv_sge\f[R] structures. .RE .TP \f[B]mlx5dv_wr_set_mkey_layout_interleaved()\f[R] Set an interleaved memory layout for an MKEY. If the MKEY is configured and the data layout was defined by some data layout setter (not necessary this one), the setter overrides the previous value. .RS .PP Default WQE size can fit only 3 interleaved entries. To allow more, the QP should be created with a larger WQE size that may fit it. This should be done using the \f[B]max_inline_data\f[R] attribute of \f[B]struct ibv_qp_cap\f[R] upon QP creation. .PP As one entry will be consumed for strided header, the MKEY should be created with one more entry than the required \f[I]num_interleaved\f[R]. .TP \f[I]mqp\f[R] The QP where an MKEY configuration work request was created by \f[B]mlx5dv_wr_mkey_configure()\f[R]. .TP \f[I]repeat_count\f[R] The \f[I]data\f[R] layout representation is repeated \f[I]repeat_count\f[R] times. .TP \f[I]num_interleaved\f[R] Number of entries in the \f[I]data\f[R] representation. .TP \f[I]data\f[R] Pointer to the list of interleaved data layout descriptions. .PP Interleaved data layout is described by \f[B]mlx5dv_mr_interleaved\f[R] structure. .IP .nf \f[C] struct mlx5dv_mr_interleaved { uint64_t addr; uint32_t bytes_count; uint32_t bytes_skip; uint32_t lkey; }; \f[R] .fi .TP \f[I]addr\f[R] Start address of the local memory buffer. .TP \f[I]bytes_count\f[R] Number of data bytes to put into the buffer. .TP \f[I]bytes_skip\f[R] Number of bytes to skip in the buffer before the next data block. .TP \f[I]lkey\f[R] Key of the local Memory Region .RE .SS Signature setters .PP The signature attributes of the MKEY allow adding/modifying/stripping/validating integrity fields when transmitting data from memory to network and when receiving data from network to memory. 
.PP Use the signature setters to set/update the signature attributes of the MKEY. To reset the signature attributes without invalidating the MKEY, use the \f[B]MLX5DV_MKEY_CONF_FLAG_RESET_SIG_ATTR\f[R] flag. .TP \f[B]mlx5dv_wr_set_mkey_sig_block\f[R]() Set MKEY block signature attributes. If the MKEY is already configured with the signature attributes, the setter overrides the previous value. See dedicated man page for \f[B]mlx5dv_wr_set_mkey_sig_block\f[R](3). .SS Crypto setter .PP The crypto attributes of the MKey allow encryption and decryption of transmitted data from memory to network and when receiving data from network to memory. .PP Use the crypto setter to set/update the crypto attributes of the MKey. When the MKey is created with \f[B]MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO\f[R] it must be configured with crypto attributes before the MKey can be used. .TP \f[B]mlx5dv_wr_set_mkey_crypto()\f[R] Set MKey crypto attributes. If the MKey is already configured with crypto attributes, the setter overrides the previous value. see dedicated man page for \f[B]mlx5dv_wr_set_mkey_crypto\f[R](3). .SH EXAMPLES .SS Create QP and MKEY .PP Code below creates a QP with MKEY configure operation support and an indirect mkey. .IP .nf \f[C] /* Create QP with MKEY configure support */ struct ibv_qp_init_attr_ex attr_ex = {}; attr_ex.comp_mask |= IBV_QP_INIT_ATTR_SEND_OPS_FLAGS; attr_ex.send_ops_flags |= IBV_QP_EX_WITH_RDMA_WRITE; struct mlx5dv_qp_init_attr attr_dv = {}; attr_dv.comp_mask |= MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS; attr_dv.send_ops_flags = MLX5DV_QP_EX_WITH_MKEY_CONFIGURE; ibv_qp *qp = mlx5dv_create_qp(ctx, attr_ex, attr_dv); ibv_qp_ex *qpx = ibv_qp_to_qp_ex(qp); mlx5dv_qp_ex *mqpx = mlx5dv_qp_ex_from_ibv_qp_ex(qpx); mkey_attr.create_flags = MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT; struct mlx5dv_mkey *mkey = mlx5dv_create_mkey(&mkey_attr); \f[R] .fi .SS List data layout configuration .PP Code below configures an MKEY which allows remote access for read and write and is based on SGE list layout with two entries. When this MKEY is used in RDMA write operation, data will be scattered between two memory regions. The first 64 bytes will go to memory referenced by \f[B]mr1\f[R]. The next 4096 bytes will go to memory referenced by \f[B]mr2\f[R]. .IP .nf \f[C] ibv_wr_start(qpx); qpx->wr_id = my_wr_id_1; qpx->wr_flags = IBV_SEND_INLINE; struct mlx5dv_mkey_conf_attr mkey_attr = {}; mlx5dv_wr_mkey_configure(mqpx, mkey, 2, &mkey_attr); mlx5dv_wr_set_mkey_access_flags(mqpx, IBV_ACCESS_REMOTE_READ | IBV_ACCESS_REMOTE_WRITE); struct ibv_sge sgl[2]; sgl[0].addr = mr1->addr; sgl[0].length = 64; sgl[0].lkey = mr1->lkey; sgl[1].addr = mr2->addr; sgl[1].length = 4096; sgl[1].lkey = mr2->lkey; mlx5dv_wr_set_mkey_layout_list(mqpx, 2, sgl); ret = ibv_wr_complete(qpx); \f[R] .fi .SS Interleaved data layout configuration .PP Code below configures an MKEY which allows remote access for read and write and is based on interleaved data layout with two entries and repeat count of two. When this MKEY is used in RDMA write operation, data will be scattered between two memory regions. The first 512 bytes will go to memory referenced by \f[B]mr1\f[R] at offset 0. The next 8 bytes will go to memory referenced by \f[B]mr2\f[R] at offset 0. The next 512 bytes will go to memory referenced by \f[B]mr1\f[R] at offset 516. The next 8 bytes will go to memory referenced by \f[B]mr2\f[R] at offset 8. 
.IP .nf \f[C] ibv_wr_start(qpx); qpx->wr_id = my_wr_id_1; qpx->wr_flags = IBV_SEND_INLINE; struct mlx5dv_mkey_conf_attr mkey_attr = {}; mlx5dv_wr_mkey_configure(mqpx, mkey, 2, &mkey_attr); mlx5dv_wr_set_mkey_access_flags(mqpx, IBV_ACCESS_REMOTE_READ | IBV_ACCESS_REMOTE_WRITE); struct mlx5dv_mr_interleaved data[2]; data[0].addr = mr1->addr; data[0].bytes_count = 512; data[0].bytes_skip = 4; data[0].lkey = mr1->lkey; data[1].addr = mr2->addr; data[1].bytes_count = 8; data[1].bytes_skip = 0; data[1].lkey = mr2->lkey; mlx5dv_wr_set_mkey_layout_interleaved(mqpx, 2, 2, &data); ret = ibv_wr_complete(qpx); \f[R] .fi .SH NOTES .PP A DEVX context should be opened by using \f[B]mlx5dv_open_device\f[R](3). .SH SEE ALSO .PP \f[B]mlx5dv_create_mkey\f[R](3), \f[B]mlx5dv_create_qp\f[R](3), \f[B]mlx5dv_wr_set_mkey_sig_block\f[R](3), \f[B]mlx5dv_wr_set_mkey_crypto\f[R](3) .SH AUTHORS .PP Oren Duer .PP Sergey Gorenko .PP Evgenii Kochetov rdma-core-56.1/buildlib/pandoc-prebuilt/846b9c279c61cc170cbf05bd5d68af74085667fb0000644000175100002000000002207714773456417033501 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBLINKINFO" 8 "2018-07-09" "" "OpenIB Diagnostics" .SH NAME IBLINKINFO \- report link info for all links in the fabric .SH SYNOPSIS .sp iblinkinfo .SH DESCRIPTION .sp iblinkinfo reports link info for each port in an IB fabric, node by node. Optionally, iblinkinfo can do partial scans and limit its output to parts of a fabric. .SH OPTIONS .sp \fB\-\-down, \-d\fP Print only nodes which have a port in the "Down" state. .sp \fB\-\-line, \-l\fP Print all information for each link on one line. Default is to print a header with the node information and then a list for each port (useful for grep\(aqing output). .sp \fB\-\-additional, \-p\fP Print additional port settings (,,) .sp \fB\-\-switches\-only\fP Show only switches in output. .sp \fB\-\-cas\-only\fP Show only CAs in output. .SS Partial Scan flags .sp The node to start a partial scan can be specified with the following addresses. .\" Define the common option -G . .sp \fB\-\-port\-guid, \-G \fP Specify a port_guid .\" Define the common option -D for Directed routes . .sp \fB\-D, \-\-Direct \fP The address specified is a directed route .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C Examples: \-D "0" # self port \-D "0,1,2,1,4" # out via port 1, then 2, ... (Note the second number in the path specified must match the port being used. This can be specified using the port selection flag \(aq\-P\(aq or the port found through the automatic selection process.) .ft P .fi .UNINDENT .UNINDENT .sp \fBNote:\fP For switches results are printed for all ports not just switch port 0. .sp \fB\-\-switch, \-S \fP same as "\-G". (provided only for backward compatibility) .sp How much of the scan to be printed can be controlled with the following. 
.sp \fB\-\-all, \-a\fP Print all nodes found in a partial fabric scan. Normally a partial fabric scan will return only the node specified. This option will print the other nodes found as well. .sp \fB\-\-hops, \-n \fP Specify the number of hops away from a specified node to scan. This is useful to expand a partial fabric scan beyond the node specified. .SS Cache File flags .\" Define the common option load-cache . .sp \fB\-\-load\-cache \fP Load and use the cached ibnetdiscover data stored in the specified filename. May be useful for outputting and learning about other fabrics or a previous state of a fabric. .\" Define the common option diff . .sp \fB\-\-diff \fP Load cached ibnetdiscover data and do a diff comparison to the current network or another cache. A special diff output for ibnetdiscover output will be displayed showing differences between the old and current fabric. By default, the following are compared for differences: switches, channel adapters, routers, and port connections. .sp \fB\-\-diffcheck \fP Specify what diff checks should be done in the \fB\-\-diff\fP option above. Comma separate multiple diff check key(s). The available diff checks are: \fBport\fP = port connections, \fBstate\fP = port state, \fBlid\fP = lids, \fBnodedesc\fP = node descriptions. Note that \fBport\fP, \fBlid\fP, and \fBnodedesc\fP are checked only for the node types that are specified (e.g. \fBswitches\-only\fP, \fBcas\-only\fP). If \fBport\fP is specified alongside \fBlid\fP or \fBnodedesc\fP, remote port lids and node descriptions will also be compared. .sp \fB\-\-filterdownports \fP Filter downports indicated in a ibnetdiscover cache. If a port was previously indicated as down in the specified cache, and is still down, do not output it in the resulting output. This option may be particularly useful for environments where switches are not fully populated, thus much of the default iblinkinfo info is considered useless. See \fBibnetdiscover\fP for information on caching ibnetdiscover output. .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Configuration flags .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .\" Define the common option -z . 
.INDENT 0.0 .TP .B \fB\-\-outstanding_smps, \-o \fP Specify the number of outstanding SMP\(aqs which should be issued during the scan .sp Default: 2 .UNINDENT .\" Define the common option --node-name-map . .sp \fB\-\-node\-name\-map \fP Specify a node name map. .INDENT 0.0 .INDENT 3.5 This file maps GUIDs to more user friendly names. See FILES section. .UNINDENT .UNINDENT .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .SS Debugging flags .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SH EXIT STATUS .sp 0 on success, \-1 on failure to scan the fabric, 1 if check mode is used and inconsistencies are found. .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .\" Common text to describe the node name map file. . .SS NODE NAME MAP FILE FORMAT .sp The node name map is used to specify user friendly names for nodes in the output. GUIDs are used to perform the lookup. .sp This functionality is provided by the opensm\-libs package. See \fBopensm(8)\fP for the file location for your installation. .sp \fBGenerically:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # comment "" .ft P .fi .UNINDENT .UNINDENT .sp \fBExample:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # IB1 # Line cards 0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D" # Spines 0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" # GUID Node Name 0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D" .ft P .fi .UNINDENT .UNINDENT .SH AUTHOR .INDENT 0.0 .TP .B Ira Weiny < \fI\%ira.weiny@intel.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . 
rdma-core-56.1/buildlib/pandoc-prebuilt/87bcbf2b86c31bd9ac6bf9e31aeb0c2c46f31d9c0000644000175100002000000000310614773456412034077 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "ibv_import_dm ibv_unimport_dm" "3" "2021-1-17" "libibverbs" "Libibverbs Programmer\[cq]s Manual"
.hy
.SH NAME
.PP
ibv_import_dm - import a DM from a given ibv_context
.PP
ibv_unimport_dm - unimport a DM
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/verbs.h>

struct ibv_dm *ibv_import_dm(struct ibv_context *context, uint32_t dm_handle);
void ibv_unimport_dm(struct ibv_dm *dm);
\f[R]
.fi
.SH DESCRIPTION
.PP
\f[B]ibv_import_dm()\f[R] returns a device memory (DM) object that is
associated with the given \f[I]dm_handle\f[R] in the RDMA context.
.PP
The input \f[I]dm_handle\f[R] value must be a valid kernel handle for a
DM object in the associated RDMA context.
It can be obtained from the original DM by reading its ibv_dm->handle
member value.
.PP
\f[B]ibv_unimport_dm()\f[R] unimports the DM.
Once use of the DM has ended, either ibv_free_dm() or ibv_unimport_dm()
should be called.
The former goes to the kernel to destroy the object, while the latter
only cleans up the local state created by the import (the opposite of
the import) without calling the kernel.
.PP
It is the responsibility of the application to coordinate between all
ibv_context(s) that use this DM.
Once the object has been destroyed, no other process can touch it
except to unimport it.
All users of the context must collaborate to ensure this.
.SH RETURN VALUE
.PP
\f[B]ibv_import_dm()\f[R] returns a pointer to the allocated DM, or
NULL if the request fails and errno is set.
.SH NOTES
.SH SEE ALSO
.PP
\f[B]ibv_alloc_dm\f[R](3), \f[B]ibv_free_dm\f[R](3),
.SH AUTHOR
.PP
Maor Gottlieb
rdma-core-56.1/buildlib/pandoc-prebuilt/33b3f34c341a1accced887a78f6d5c4cfdd7de050000644000175100002000000001127414773456421034031 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText.
.
.
.nr rst2man-indent-level 0
.
.de1 rstReportMargin
\\$1 \\n[an-margin]
level \\n[rst2man-indent-level]
level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
-
\\n[rst2man-indent0]
\\n[rst2man-indent1]
\\n[rst2man-indent2]
..
.de1 INDENT
.\" .rstReportMargin pre:
. RS \\$1
. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin]
. nr rst2man-indent-level +1
.\" .rstReportMargin post:
..
.de UNINDENT
. RE
.\" indent \\n[an-margin]
.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
.nr rst2man-indent-level -1
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.TH "SMPDUMP" 8 "2017-08-21" "" "Open IB Diagnostics"
.SH NAME
smpdump \- dump InfiniBand subnet management attributes
.SH SYNOPSIS
.sp
smpdump [options] <dlid|dr_path> <attribute> [attribute_modifier]
.SH DESCRIPTION
.sp
smpdump is a general purpose SMP utility which gets SM attributes from a
specified SMA.
The result is dumped in hex by default.
.SH OPTIONS
.INDENT 0.0
.TP
.B \fBdlid|drpath\fP
LID or DR path to SMA
.TP
.B \fBattribute\fP
IBA attribute ID for SM attribute
.TP
.B \fBattribute_modifier\fP
IBA modifier for SM attribute
.TP
.B \fB\-s, \-\-string\fP
Print strings in packet if possible
.UNINDENT
.SS Addressing Flags
.\" Define the common option -D for Directed routes
.
.sp
\fB\-D, \-\-Direct\fP
The address specified is a directed route
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
Examples:
[options] \-D [options] "0"          # self port
[options] \-D [options] "0,1,2,1,4"  # out via port 1, then 2, ...
(Note the second number in the path specified must match the port being used. This can be specified using the port selection flag \(aq\-P\(aq or the port found through the automatic selection process.) .ft P .fi .UNINDENT .UNINDENT .\" Define the common option -L . .sp \fB\-L, \-\-Lid\fP The address specified is a LID .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Configuration flags .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .SH EXAMPLES .sp Direct Routed Examples .INDENT 0.0 .TP .B :: smpdump \-D 0,1,2,3,5 16 # NODE DESC smpdump \-D 0,1,2 0x15 2 # PORT INFO, port 2 .UNINDENT .sp LID Routed Examples .INDENT 0.0 .TP .B :: smpdump 3 0x15 2 # PORT INFO, lid 3 port 2 smpdump 0xa0 0x11 # NODE INFO, lid 0xa0 .UNINDENT .SH SEE ALSO .sp smpquery (8) .SH AUTHOR .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%halr@voltaire.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . 
rdma-core-56.1/buildlib/pandoc-prebuilt/1f21cdca514c6383ef7e95bac3153183b43b43f90000644000175100002000000000413714773456412033442 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_FORK_INIT" "3" "2006-10-31" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_fork_init - initialize libibverbs to support fork() .SH SYNOPSIS .IP .nf \f[C] #include int ibv_fork_init(void); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_fork_init()\f[R] initializes libibverbs\[cq]s data structures to handle \f[B]fork()\f[R] function calls correctly and avoid data corruption, whether \f[B]fork()\f[R] is called explicitly or implicitly (such as in \f[B]system()\f[R]). .PP It is not necessary to use this function if all parent process threads are always blocked until all child processes end or change address spaces via an \f[B]exec()\f[R] operation. .SH RETURN VALUE .PP \f[B]ibv_fork_init()\f[R] returns 0 on success, or the value of errno on failure (which indicates the failure reason). An error value of EINVAL indicates that there had been RDMA memory registration already and it is therefore not safe anymore to fork. .SH NOTES .PP \f[B]ibv_fork_init()\f[R] works on Linux kernels supporting the \f[B]MADV_DONTFORK\f[R] flag for \f[B]madvise()\f[R] (2.6.17 and higher). .PP Setting the environment variable \f[B]RDMAV_FORK_SAFE\f[R] or \f[B]IBV_FORK_SAFE\f[R] has the same effect as calling \f[B]ibv_fork_init()\f[R]. .PP Setting the environment variable \f[B]RDMAV_HUGEPAGES_SAFE\f[R] tells the library to check the underlying page size used by the kernel for memory regions. This is required if an application uses huge pages either directly or indirectly via a library such as libhugetlbfs. .PP Calling \f[B]ibv_fork_init()\f[R] will reduce performance due to an extra system call for every memory registration, and the additional memory allocated to track memory regions. The precise performance impact depends on the workload and usually will not be significant. .PP Setting \f[B]RDMAV_HUGEPAGES_SAFE\f[R] adds further overhead to all memory registrations. .SH SEE ALSO .PP \f[B]exec\f[R](3), \f[B]fork\f[R](2), \f[B]ibv_get_device_list\f[R](3), \f[B]system\f[R](3), \f[B]wait\f[R](2) .SH AUTHOR .PP Dotan Barak rdma-core-56.1/buildlib/pandoc-prebuilt/ba541f84e3e4aee84dcf1896a54e1da9b8aa83920000644000175100002000000000632314773456416033707 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_qp_cancel_posted_send_wrs" "3" "" "" "" .hy .SH NAME .PP mlx5dv_qp_cancel_posted_send_wrs - Cancel all pending send work requests with supplied WRID in a QP in SQD state .SH SYNOPSIS .IP .nf \f[C] #include int mlx5dv_qp_cancel_posted_send_wrs(struct mlx5dv_qp_ex *mqp, uint64_t wr_id); \f[R] .fi .SH DESCRIPTION .PP The canceled work requests are replaced with NOPs (no operation), and will generate good completions according to the signaling originally requested in the send flags, or \[lq]flushed\[rq] completions in case the QP goes to error. A work request can only be canceled when the QP is in SQD state. .PP The cancel function is a part of the signature pipelining feature. The feature allows posting a signature related transfer operation together with a SEND with a good response to the client. Normally, the application must wait for the transfer to end, check the MKEY for errors, and only then send a good or bad response. However this increases the latency of the good flow of a transaction. 
.PP To enable this feature, a QP must be created with the \f[B]MLX5DV_QP_CREATE_SIG_PIPELINING\f[R] creation flag. Such QP will stop after a transfer operation that failed signature validation in SQD state. \f[B]IBV_EVENT_SQ_DRAINED\f[R] is generated to inform about the new state. .PP The SEND operation that might need to be canceled due to a bad signature of a previous operation must be posted with the \f[B]IBV_SEND_FENCE\f[R] option in \f[B]ibv_qp_ex->wr_flags\f[R] field. .PP When QP stopped at SQD, it means that at least one WR caused signature error. It may not be the last WR. It may be that more than one WRs cause signature errors by the time the QP finally stopped. It is guaranteed that the QP has stopped somewhere between the WQE that generated the signature error, and the next WQE that has \f[B]IBV_SEND_FENCE\f[R] on it. .PP Software must handle the SQD event as described below: .IP "1." 3 Poll everything (polling until 0 once) on the respective CQ, allowing the discovery of all possible signature errors. .IP "2." 3 Look through all \[lq]open\[rq] transactions, check related signature MKEYs using \f[B]mlx5dv_mkey_check\f[R](3), find the one with the signature error, get a \f[B]WRID\f[R] from the operation software context and handle the failed operation. .IP "3." 3 Cancel the SEND WR by the WRID using \f[B]mlx5dv_qp_cancel_posted_send_wrs\f[R](). .IP "4." 3 Modify the QP back to RTS state. .SH ARGUMENTS .TP \f[I]mqp\f[R] The QP to investigate, which must be in SQD state. .TP \f[I]wr_id\f[R] The WRID to cancel. .SH RETURN VALUE .PP Number of work requests that were canceled, or -errno on error. .SH NOTES .PP A DEVX context should be opened by using \f[B]mlx5dv_open_device\f[R](3). .PP Must be called with a QP in SQD state. .PP QP should be created with \f[B]MLX5DV_QP_CREATE_SIG_PIPELINING\f[R] creation flag. Application must listen on QP events, and expect a SQD event. .SH SEE ALSO .PP \f[B]mlx5dv_mkey_check\f[R](3), \f[B]mlx5dv_wr_mkey_configure\f[R](3), \f[B]mlx5dv_wr_set_mkey_sig_block\f[R](3), \f[B]mlx5dv_create_mkey\f[R](3), \f[B]mlx5dv_destroy_mkey\f[R](3) .SH AUTHORS .PP Oren Duer .PP Sergey Gorenko rdma-core-56.1/buildlib/pandoc-prebuilt/e321cdac4bc27b15425028ce3a7c8f96cf78ae1b0000644000175100002000000000323514773456414033661 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_dci_stream_id_reset" "3" "" "" "" .hy .SH NAME .PP mlx5dv_dci_stream_id_reset - Reset stream_id of a given DCI QP .SH SYNOPSIS .IP .nf \f[C] #include int mlx5dv_dci_stream_id_reset(struct ibv_qp *qp, uint16_t stream_id); \f[R] .fi .SH DESCRIPTION .PP Used by SW to reset an errored \f[I]stream_id\f[R] in the HW DCI context. .PP On work completion with error, the application should call ibv_query_qp() to check if the QP was moved to an error state, or it\[cq]s still operational (in RTS state), which means that the specific \f[I]stream_id\f[R] that caused the completion with error is in error state. .PP Errors which are stream related will cause only that \f[I]stream_id\[cq]s\f[R] work request to be flushed as they are handled in order in the send queue. Once all \f[I]stream_id\f[R] WR\[cq]s are flushed, application should reset the errored \f[I]stream_id\f[R] by calling mlx5dv_dci_stream_id_reset(). Work requested for other \f[I]stream_id\[cq]s\f[R] will continue to be processed by the QP. 
The DCI QP will move to an error state and stop operating once the number of unique \f[I]stream_id\f[R] in error reaches the DCI QP\[cq]s `log_num_errored' streams defined by SW. .PP Application should use the `wr_id' in the ibv_wc to find the \f[I]stream_id\f[R] from it\[cq]s private context. .SH ARGUMENTS .TP \f[I]qp\f[R] The ibv_qp object to issue the action on. .TP \f[I]stream_id\f[R] The DCI stream channel id that need to be reset. .SH RETURN VALUE .PP Returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH AUTHOR .PP Lior Nahmanson rdma-core-56.1/buildlib/pandoc-prebuilt/6de8298d2452a2503f893112a0955baf560008c10000644000175100002000000000240114773456413033051 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_RATE_TO_MULT" "3" "2006-10-31" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_rate_to_mult - convert IB rate enumeration to multiplier of 2.5 Gbit/sec .PP mult_to_ibv_rate - convert multiplier of 2.5 Gbit/sec to an IB rate enumeration .SH SYNOPSIS .IP .nf \f[C] #include int ibv_rate_to_mult(enum ibv_rate rate); enum ibv_rate mult_to_ibv_rate(int mult); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_rate_to_mult()\f[R] converts the IB transmission rate enumeration \f[I]rate\f[R] to a multiple of 2.5 Gbit/sec (the base rate). For example, if \f[I]rate\f[R] is \f[B]IBV_RATE_5_GBPS\f[R], the value 2 will be returned (5 Gbit/sec = 2 * 2.5 Gbit/sec). .PP \f[B]mult_to_ibv_rate()\f[R] converts the multiplier value (of 2.5 Gbit/sec) \f[I]mult\f[R] to an IB transmission rate enumeration. For example, if \f[I]mult\f[R] is 2, the rate enumeration \f[B]IBV_RATE_5_GBPS\f[R] will be returned. .SH RETURN VALUE .PP \f[B]ibv_rate_to_mult()\f[R] returns the multiplier of the base rate 2.5 Gbit/sec. .PP \f[B]mult_to_ibv_rate()\f[R] returns the enumeration representing the IB transmission rate. .SH SEE ALSO .PP \f[B]ibv_query_port\f[R](3) .SH AUTHOR .PP Dotan Barak rdma-core-56.1/buildlib/pandoc-prebuilt/7d3edfef629d9dea0b4104ab062e4db1ce3aa45f0000644000175100002000000000437314773456413034075 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_QUERY_GID_EX" "3" "2020-04-24" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_query_gid_ex - Query an InfiniBand port\[cq]s GID table entry .SH SYNOPSIS .IP .nf \f[C] #include int ibv_query_gid_ex(struct ibv_context *context, uint32_t port_num, uint32_t gid_index, struct ibv_gid_entry *entry, uint32_t flags); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_query_gid_ex()\f[R] returns the GID entry at \f[I]entry\f[R] for \f[I]gid_index\f[R] of port \f[I]port_num\f[R] for device context \f[I]context\f[R]. .SH ARGUMENTS .TP \f[I]context\f[R] The context of the device to query. .TP \f[I]port_num\f[R] The number of port to query its GID table. .TP \f[I]gid_index\f[R] The index of the GID table entry to query. .TP ## \f[I]entry\f[R] Argument An ibv_gid_entry struct, as defined in . .RS .IP .nf \f[C] struct ibv_gid_entry { union ibv_gid gid; uint32_t gid_index; uint32_t port_num; uint32_t gid_type; uint32_t ndev_ifindex; }; \f[R] .fi .PP \f[I]gid\f[R] .RE .IP .nf \f[C] The GID entry. \f[R] .fi .RS .PP \f[I]gid_index\f[R] .RE .IP .nf \f[C] The GID table index of this entry. \f[R] .fi .RS .PP \f[I]port_num\f[R] .RE .IP .nf \f[C] The port number that this GID belongs to. 
\f[R]
.fi
.RS
.PP
\f[I]gid_type\f[R]
.RE
.IP
.nf
\f[C]
enum ibv_gid_type, can be one of IBV_GID_TYPE_IB, IBV_GID_TYPE_ROCE_V1
or IBV_GID_TYPE_ROCE_V2.
\f[R]
.fi
.RS
.PP
\f[I]ndev_ifindex\f[R]
.RE
.IP
.nf
\f[C]
The interface index of the net device associated with this GID.
It is 0 if there is no net device associated with it.
\f[R]
.fi
.TP
\f[I]flags\f[R]
Extra fields to query post \f[I]ndev_ifindex\f[R], for now must be 0.
.SH RETURN VALUE
.PP
\f[B]ibv_query_gid_ex()\f[R] returns 0 on success or errno value on
error.
.SH ERRORS
.TP
ENODATA
\f[I]gid_index\f[R] is within the GID table size of port
\f[I]port_num\f[R] but there is no data in this index.
.SH SEE ALSO
.PP
\f[B]ibv_open_device\f[R](3), \f[B]ibv_query_device\f[R](3),
\f[B]ibv_query_pkey\f[R](3), \f[B]ibv_query_port\f[R](3),
\f[B]ibv_query_gid_table\f[R](3)
.SH AUTHOR
.PP
Parav Pandit
rdma-core-56.1/buildlib/pandoc-prebuilt/683da434f646ce7467d4ceb9b7aa2ee55f1a25ee0000644000175100002000000000262214773456415033704 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_get_data_direct_sysfs_path" "3" "" "" ""
.hy
.SH NAME
.PP
mlx5dv_get_data_direct_sysfs_path - Get the sysfs path of a data direct
device
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

int mlx5dv_get_data_direct_sysfs_path(struct ibv_context *context,
                                      char *buf, size_t buf_len)
\f[R]
.fi
.SH DESCRIPTION
.PP
Get the sysfs path of the data direct device that is associated with
the given \f[I]context\f[R].
.PP
This lets an application discover whether, and which, data direct
device is associated with the given \f[I]context\f[R].
.SH ARGUMENTS
.TP
\f[I]context\f[R]
RDMA device context to work on.
.TP
\f[I]buf\f[R]
The buffer in which to place the sysfs path of the associated data
direct device.
.TP
\f[I]buf_len\f[R]
The length of the buffer.
.SH RETURN VALUE
.PP
Upon success 0 is returned, or the value of errno on a failure.
.SH ERRORS
.PP
The specific error values below should be considered.
.TP
ENODEV
There is no associated data direct device for the given
\f[I]context\f[R].
.TP
ENOSPC
The input buffer size is too small to hold the full sysfs path.
.SH NOTES
.PP
Upon success, the caller should add the /sys/ prefix to get the full
sysfs path.
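.PP
A minimal usage sketch, assuming \f[I]ctx\f[R] is an open mlx5 device
context (illustration only; error handling trimmed to the essentials):
.IP
.nf
\f[C]
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <infiniband/mlx5dv.h>

static void print_data_direct_path(struct ibv_context *ctx)
{
    char path[PATH_MAX];
    int ret;

    /* On success the returned path lacks the /sys/ prefix. */
    ret = mlx5dv_get_data_direct_sysfs_path(ctx, path, sizeof(path));
    if (!ret)
        printf("data direct device: /sys/%s\[rs]n", path);
    else if (ret == ENODEV)
        printf("no data direct device for this context\[rs]n");
    else
        fprintf(stderr, "query failed: %d\[rs]n", ret);
}
\f[R]
.fi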
.SH SEE ALSO .PP \f[I]mlx5dv_reg_dmabuf_mr(3)\f[R] .SH AUTHOR .PP Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/a2e90b674819f826146fc161bc8b59535fa9bb500000644000175100002000000002151614773456416033327 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "MLX5DV_WR" "3" "2019-02-24" "mlx5" "mlx5 Programmer\[cq]s Manual" .hy .SH NAME .PP mlx5dv_wr_set_dc_addr - Attach a DC info to the last work request .PP mlx5dv_wr_raw_wqe - Build a raw work request .PP mlx5dv_wr_memcpy - Build a DMA memcpy work request .SH SYNOPSIS .IP .nf \f[C] #include static inline void mlx5dv_wr_set_dc_addr(struct mlx5dv_qp_ex *mqp, struct ibv_ah *ah, uint32_t remote_dctn, uint64_t remote_dc_key); static inline void mlx5dv_wr_set_dc_addr_stream(struct mlx5dv_qp_ex *mqp, struct ibv_ah *ah, uint32_t remote_dctn, uint64_t remote_dc_key, uint16_t stream_id); struct mlx5dv_mr_interleaved { uint64_t addr; uint32_t bytes_count; uint32_t bytes_skip; uint32_t lkey; }; static inline void mlx5dv_wr_mr_interleaved(struct mlx5dv_qp_ex *mqp, struct mlx5dv_mkey *mkey, uint32_t access_flags, /* use enum ibv_access_flags */ uint32_t repeat_count, uint16_t num_interleaved, struct mlx5dv_mr_interleaved *data); static inline void mlx5dv_wr_mr_list(struct mlx5dv_qp_ex *mqp, struct mlx5dv_mkey *mkey, uint32_t access_flags, /* use enum ibv_access_flags */ uint16_t num_sges, struct ibv_sge *sge); static inline int mlx5dv_wr_raw_wqe(struct mlx5dv_qp_ex *mqp, const void *wqe); static inline void mlx5dv_wr_memcpy(struct mlx5dv_qp_ex *mqp_ex, uint32_t dest_lkey, uint64_t dest_addr, uint32_t src_lkey, uint64_t src_addr, size_t length) \f[R] .fi .SH DESCRIPTION .PP The MLX5DV work request APIs (mlx5dv_wr_*) is an extension for IBV work request API (ibv_wr_*) with mlx5 specific features for send work request. This may be used together with or without ibv_wr_* calls. .SH USAGE .PP To use these APIs a QP must be created using mlx5dv_create_qp() with \f[I]send_ops_flags\f[R] of struct ibv_qp_init_attr_ex set. .PP If the QP does not support all the requested work request types then QP creation will fail. .PP The mlx5dv_qp_ex is extracted from the IBV_QP by ibv_qp_to_qp_ex() and mlx5dv_qp_ex_from_ibv_qp_ex(). This should be used to apply the mlx5 specific features on the posted WR. .PP A work request creation requires to use the ibv_qp_ex as described in the man for ibv_wr_post and mlx5dv_qp with its available builders and setters. .SS QP Specific builders .TP \f[I]RC\f[R] QPs \f[I]mlx5dv_wr_mr_interleaved()\f[R] .RS .PP registers an interleaved memory layout by using an indirect mkey and some interleaved data. The layout of the memory pointed by the mkey after its registration will be the \f[I]data\f[R] representation for the \f[I]num_interleaved\f[R] entries. This single layout representation is repeated by \f[I]repeat_count\f[R]. .PP The \f[I]data\f[R] as described by struct mlx5dv_mr_interleaved will hold real data defined by \f[I]bytes_count\f[R] and then a padding of \f[I]bytes_skip\f[R]. Post a successful registration, RDMA operations can use this \f[I]mkey\f[R]. The hardware will scatter the data according to the pattern. The \f[I]mkey\f[R] should be used in a zero-based mode. The \f[I]addr\f[R] field in its \f[I]ibv_sge\f[R] is an offset in the total data. To create this \f[I]mkey\f[R] mlx5dv_create_mkey() should be used. .PP Current implementation requires the IBV_SEND_INLINE option to be on in \f[I]ibv_qp_ex->wr_flags\f[R] field. 
To be able to have more than 3 \f[I]num_interleaved\f[R] entries, the QP should be created with a larger WQE size that may fit it. This should be done using the \f[I]max_inline_data\f[R] attribute of \f[I]struct ibv_qp_cap\f[R] upon its creation. .PP As one entry will be consumed for strided header, the \f[I]mkey\f[R] should be created with one more entry than the required \f[I]num_interleaved\f[R]. .PP In case \f[I]ibv_qp_ex->wr_flags\f[R] turns on IBV_SEND_SIGNALED, the reported WC opcode will be MLX5DV_WC_UMR. Unregister the \f[I]mkey\f[R] to enable another pattern registration should be done via ibv_post_send with IBV_WR_LOCAL_INV opcode. .RE \f[I]mlx5dv_wr_mr_list()\f[R] .RS .PP registers a memory layout based on list of ibv_sge. The layout of the memory pointed by the \f[I]mkey\f[R] after its registration will be based on the list of \f[I]sge\f[R] counted by \f[I]num_sges\f[R]. Post a successful registration RDMA operations can use this \f[I]mkey\f[R], the hardware will scatter the data according to the pattern. The \f[I]mkey\f[R] should be used in a zero-based mode, the \f[I]addr\f[R] field in its \f[I]ibv_sge\f[R] is an offset in the total data. .PP Current implementation requires the IBV_SEND_INLINE option to be on in \f[I]ibv_qp_ex->wr_flags\f[R] field. To be able to have more than 4 \f[I]num_sge\f[R] entries, the QP should be created with a larger WQE size that may fit it. This should be done using the \f[I]max_inline_data\f[R] attribute of \f[I]struct ibv_qp_cap\f[R] upon its creation. .PP In case \f[I]ibv_qp_ex->wr_flags\f[R] turns on IBV_SEND_SIGNALED, the reported WC opcode will be MLX5DV_WC_UMR. Unregister the \f[I]mkey\f[R] to enable other pattern registration should be done via ibv_post_send with IBV_WR_LOCAL_INV opcode. .RE .TP \f[I]RC\f[R] or \f[I]DCI\f[R] QPs \f[I]mlx5dv_wr_memcpy()\f[R] .RS .PP Builds a DMA memcpy work request to copy data of length \f[I]length\f[R] from \f[I]src_addr\f[R] to \f[I]dest_addr\f[R]. The copy operation will be done using the DMA MMO functionality of the device to copy data on PCI bus. .PP The MLX5DV_QP_EX_WITH_MEMCPY flag in \f[I]mlx5dv_qp_init_attr.send_ops_flags\f[R] needs to be set during QP creation. If the device or QP doesn\[cq]t support it then QP creation will fail. The maximum memcpy length that is supported by the device is reported in \f[I]mlx5dv_context->max_wr_memcpy_length\f[R]. A zero value in \f[I]mlx5dv_context->max_wr_memcpy_length\f[R] means the device doesn\[cq]t support memcpy operations. .PP IBV_SEND_FENCE indicator should be used on a following send request which is dependent on \f[I]dest_addr\f[R] of the memcpy operation. .PP In case \f[I]ibv_qp_ex->wr_flags\f[R] turns on IBV_SEND_SIGNALED, the reported WC opcode will be MLX5DV_WC_MEMCPY. .RE .SS Raw WQE builders .TP \f[I]mlx5dv_wr_raw_wqe()\f[R] It is used to build a custom work request (WQE) and post it on a normal QP. The caller needs to set all details of the WQE (except the \[lq]ctrl.wqe_index\[rq] and \[lq]ctrl.signature\[rq] fields, which is the driver\[cq]s responsibility to set). The MLX5DV_QP_EX_WITH_RAW_WQE flag in mlx5_qp_attr.send_ops_flags needs to be set. .RS .PP The wr_flags are ignored as it\[cq]s the caller\[cq]s responsibility to set flags in WQE. .PP No matter what the send opcode is, the work completion opcode for a raw WQE is IBV_WC_DRIVER2. .RE .SS QP Specific setters .TP \f[I]DCI\f[R] QPs \f[I]mlx5dv_wr_set_dc_addr()\f[R] must be called to set the DCI WR properties. 
The destination address of the work is specified by \f[I]ah\f[R], the remote DCT number is specified by \f[I]remote_dctn\f[R] and the DC key is specified by \f[I]remote_dc_key\f[R]. This setter is available when the QP transport is DCI and send_ops_flags in struct ibv_qp_init_attr_ex is set. The available builders and setters for DCI QP are the same as RC QP. DCI QP created with MLX5DV_QP_INIT_ATTR_MASK_DCI_STREAMS can call \f[I]mlx5dv_wr_set_dc_addr_stream()\f[R] to define the \f[I]stream_id\f[R] of the operation to allow HW to choose one of the multiple concurrent DCI resources. Calls to \f[I]mlx5dv_wr_set_dc_addr()\f[R] are equivalent to using \f[I]stream_id\f[R]=0 .SH EXAMPLE .IP .nf \f[C] /* create DC QP type and specify the required send opcodes */ attr_ex.qp_type = IBV_QPT_DRIVER; attr_ex.comp_mask |= IBV_QP_INIT_ATTR_SEND_OPS_FLAGS; attr_ex.send_ops_flags |= IBV_QP_EX_WITH_RDMA_WRITE; attr_dv.comp_mask |= MLX5DV_QP_INIT_ATTR_MASK_DC; attr_dv.dc_init_attr.dc_type = MLX5DV_DCTYPE_DCI; ibv_qp *qp = mlx5dv_create_qp(ctx, attr_ex, attr_dv); ibv_qp_ex *qpx = ibv_qp_to_qp_ex(qp); mlx5dv_qp_ex *mqpx = mlx5dv_qp_ex_from_ibv_qp_ex(qpx); ibv_wr_start(qpx); /* Use ibv_qp_ex object to set WR generic attributes */ qpx->wr_id = my_wr_id_1; qpx->wr_flags = IBV_SEND_SIGNALED; ibv_wr_rdma_write(qpx, rkey, remote_addr_1); ibv_wr_set_sge(qpx, lkey, local_addr_1, length_1); /* Use mlx5 DC setter using mlx5dv_qp_ex object */ mlx5dv_wr_set_wr_dc_addr(mqpx, ah, remote_dctn, remote_dc_key); ret = ibv_wr_complete(qpx); \f[R] .fi .SH SEE ALSO .PP \f[B]ibv_post_send\f[R](3), \f[B]ibv_create_qp_ex(3)\f[R], \f[B]ibv_wr_post(3)\f[R], \f[B]mlx5dv_create_mkey(3)\f[R]. .SH AUTHOR .PP Guy Levi .PP Mark Zhang rdma-core-56.1/buildlib/pandoc-prebuilt/e17edb66e91620850eb7da65f8e01f7fd1d1ddfd0000644000175100002000000000163614773456415033765 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_modify_qp_lag_port" "3" "" "" "" .hy .SH NAME .PP mlx5dv_modify_qp_lag_port - Modify the lag port information of a given QP .SH SYNOPSIS .IP .nf \f[C] #include int mlx5dv_modify_qp_lag_port(struct ibv_qp *qp, uint8_t port_num); \f[R] .fi .SH DESCRIPTION .PP This API enables modifying the configured port num of a given QP. .PP If the QP state is modified later, the port num may be implicitly re-configured. .PP Use query mlx5dv_query_qp_lag_port to check the configured and active port num values. .SH ARGUMENTS .TP \f[I]qp\f[R] The ibv_qp object to issue the action on. .TP \f[I]port_num\f[R] The port_num to set for the QP. .SH RETURN VALUE .PP 0 on success; EOPNOTSUPP if not in LAG mode, or other errno value on other failures. 
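.PP
A minimal sketch, assuming \f[I]qp\f[R] is an existing QP on a LAG
device and using the query counterpart described in
\f[I]mlx5dv_query_qp_lag_port(3)\f[R] (illustration only):
.IP
.nf
\f[C]
#include <stdio.h>
#include <infiniband/mlx5dv.h>

/* Pin the QP to physical port 2, then re-read the configured
 * and currently active port numbers. */
static int set_lag_port(struct ibv_qp *qp)
{
    uint8_t port_num, active_port_num;
    int ret;

    ret = mlx5dv_modify_qp_lag_port(qp, 2);
    if (ret)
        return ret; /* e.g. EOPNOTSUPP: device is not in LAG mode */

    ret = mlx5dv_query_qp_lag_port(qp, &port_num, &active_port_num);
    if (!ret)
        printf("configured %u, active %u\[rs]n",
               port_num, active_port_num);
    return ret;
}
\f[R]
.fi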
.SH SEE ALSO
.PP
\f[I]mlx5dv_query_qp_lag_port(3)\f[R]
.SH AUTHOR
.PP
Aharon Landau
rdma-core-56.1/buildlib/pandoc-prebuilt/771e81c03946e49b29d803afc6498a1c2c346ce80000644000175100002000000000277414773456415033330 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "MLX5DV_DUMP API" "3" "2019-11-18" "mlx5" "mlx5 Programmer\[cq]s Manual"
.hy
.SH NAME
.PP
mlx5dv_dump_dr_domain - Dump DR Domain
.PP
mlx5dv_dump_dr_table - Dump DR Table
.PP
mlx5dv_dump_dr_matcher - Dump DR Matcher
.PP
mlx5dv_dump_dr_rule - Dump DR Rule
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

int mlx5dv_dump_dr_domain(FILE *fout, struct mlx5dv_dr_domain *domain);
int mlx5dv_dump_dr_table(FILE *fout, struct mlx5dv_dr_table *table);
int mlx5dv_dump_dr_matcher(FILE *fout, struct mlx5dv_dr_matcher *matcher);
int mlx5dv_dump_dr_rule(FILE *fout, struct mlx5dv_dr_rule *rule);
\f[R]
.fi
.SH DESCRIPTION
.PP
The Dump API (mlx5dv_dump_*) allows dumping existing rdma-core
resources to the provided file.
The output file format is vendor specific.
.PP
\f[I]mlx5dv_dump_dr_domain()\f[R] dumps a DR Domain object properties
to a specified file.
.PP
\f[I]mlx5dv_dump_dr_table()\f[R] dumps a DR Table object properties to
a specified file.
.PP
\f[I]mlx5dv_dump_dr_matcher()\f[R] dumps a DR Matcher object properties
to a specified file.
.PP
\f[I]mlx5dv_dump_dr_rule()\f[R] dumps a DR Rule object properties to a
specified file.
.SH RETURN VALUE
.PP
The API calls return 0 on success, or the value of errno on failure
(which indicates the failure reason).
The calls are blocking \- each function returns only when all related
resource info has been written to the file.
.SH AUTHOR
.PP
Yevgeny Kliteynik
Muhammad Sammar
rdma-core-56.1/buildlib/pandoc-prebuilt/ca22a60969c4c2b09f35bd74358cc9247766569b0000644000175100002000000000263314773456412033251 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "IBV_QUERY_ECE" "3" "2020-01-22" "libibverbs" "Libibverbs Programmer\[cq]s Manual"
.hy
.SH NAME
.PP
ibv_query_ece - query ECE options.
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/verbs.h>

int ibv_query_ece(struct ibv_qp *qp, struct ibv_ece *ece);
\f[R]
.fi
.SH DESCRIPTION
.PP
\f[B]ibv_query_ece()\f[R] queries ECE options.
It returns the current ECE state for the QP to the user.
.SH ARGUMENTS
.TP
\f[I]qp\f[R]
The queue pair (QP) associated with the ECE options.
.TP
\f[I]ece\f[R]
The ECE values.
.IP
.nf
\f[C]
struct ibv_ece {
    uint32_t vendor_id;
    uint32_t options;
    uint32_t comp_mask;
};
\f[R]
.fi
.TP
\f[I]vendor_id\f[R]
Unique identifier of the provider vendor on the network.
The providers will set IEEE OUI here to distinguish themselves in a
non-homogeneous network.
.TP
\f[I]options\f[R]
Provider specific attributes which are supported.
.TP
\f[I]comp_mask\f[R]
Bitmask specifying what fields in the structure are valid.
.SH RETURN VALUE
.PP
\f[B]ibv_query_ece()\f[R] returns 0 when the call was successful, or
the errno value which indicates the failure reason.
.TP
\f[I]EOPNOTSUPP\f[R]
libibverbs or provider driver doesn\[cq]t support the ibv_set_ece()
verb.
.TP
\f[I]EINVAL\f[R]
In one of the following:
.RS
.IP \[bu] 2
The QP is invalid.
.IP \[bu] 2
The ECE options are invalid.
.RE
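.PP
A minimal usage sketch, assuming \f[I]qp\f[R] is an existing QP
(illustration only):
.IP
.nf
\f[C]
#include <errno.h>
#include <stdio.h>
#include <infiniband/verbs.h>

static void print_ece(struct ibv_qp *qp)
{
    struct ibv_ece ece = {0};
    int ret;

    ret = ibv_query_ece(qp, &ece);
    if (!ret)
        printf("vendor 0x%x options 0x%x\[rs]n",
               ece.vendor_id, ece.options);
    else if (ret == EOPNOTSUPP)
        printf("ECE is not supported\[rs]n");
}
\f[R]
.fi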
.SH SEE ALSO .PP \f[B]ibv_set_ece\f[R](3), .SH AUTHOR .PP Leon Romanovsky rdma-core-56.1/buildlib/pandoc-prebuilt/7dd9f0d248b5c12ac0798c7c4ddf8579cb36d2fa0000644000175100002000000000322714773456414033711 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "manadv_init_obj" "3" "" "" "" .hy .SH NAME .PP manadv_init_obj - Initialize mana direct verbs object from ibv_xxx structures .SH SYNOPSIS\[dq] .IP .nf \f[C] #include int manadv_init_obj(struct manadv_obj *obj, uint64_t obj_type); \f[R] .fi .SH DESCRIPTION .PP manadv_init_obj() This function will initialize manadv_xxx structs based on supplied type. The information for initialization is taken from ibv_xx structs supplied as part of input. .SH ARGUMENTS .TP \f[I]obj\f[R] The manadv_xxx structs be to returned. .IP .nf \f[C] struct manadv_qp { void *sq_buf; uint32_t sq_count; uint32_t sq_size; uint32_t sq_id; uint32_t tx_vp_offset; void *db_page; }; struct manadv_cq { void *buf; uint32_t count; uint32_t cq_id; }; struct manadv_rwq { void *buf; uint32_t count; uint32_t size; uint32_t wq_id; void *db_page; }; struct manadv_obj { struct { struct ibv_qp *in; struct manadv_qp *out; } qp; struct { struct ibv_cq *in; struct manadv_cq *out; } cq; struct { struct ibv_wq *in; struct manadv_rwq *out; } rwq; }; \f[R] .fi .TP \f[I]obj_type\f[R] The types of the manadv_xxx structs to be returned. .IP .nf \f[C] enum manadv_obj_type { MANADV_OBJ_QP = 1 << 0, MANADV_OBJ_CQ = 1 << 1, MANADV_OBJ_RWQ = 1 << 2, }; \f[R] .fi .SH RETURN VALUE .PP 0 on success or the value of errno on failure (which indicates the failure reason). .SH AUTHORS .PP Long Li rdma-core-56.1/buildlib/pandoc-prebuilt/fc30617d889e83a4c77a329249b2ecc3ce5b227f0000644000175100002000000000315414773456420033462 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBSTATUS" 8 "2017-08-21" "" "Open IB Diagnostics" .SH NAME ibstatus \- query basic status of InfiniBand device(s) .SH SYNOPSIS .sp ibstatus [\-h] [devname[:port]]... .SH DESCRIPTION .sp ibstatus is a script which displays basic information obtained from the local IB driver. Output includes LID, SMLID, port state, link width active, and port physical state. .SH OPTIONS .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .INDENT 0.0 .TP .B \fBdevname\fP InfiniBand device name .TP .B \fBportnum\fP port number of InfiniBand device .UNINDENT .SH EXAMPLES .INDENT 0.0 .TP .B :: ibstatus # display status of all IB ports ibstatus mthca1 # status of mthca1 ports ibstatus mthca1:1 mthca0:2 # show status of specified ports .UNINDENT .SH SEE ALSO .sp \fBibstat (8)\fP .SH AUTHOR .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%halr@voltaire.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . 
rdma-core-56.1/buildlib/pandoc-prebuilt/7acddf75dff9731ca765052c431e5fde2039ed350000644000175100002000000003266114773456416033630 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\"t .\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_wr_set_mkey_crypto" "3" "" "" "" .hy .SH NAME .PP mlx5dv_wr_set_mkey_crypto - Configure a MKey for crypto operation. .SH SYNOPSIS .IP .nf \f[C] #include static inline void mlx5dv_wr_set_mkey_crypto(struct mlx5dv_qp_ex *mqp, const struct mlx5dv_crypto_attr *attr); \f[R] .fi .SH DESCRIPTION .PP Configure a MKey with crypto properties. With this, the device will encrypt/decrypt data when transmitting data from memory to network and when receiving data from network to memory. .PP In order to configure MKey with crypto properties, the MKey should be created with \f[B]MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO\f[R]. MKey that was created with \f[B]MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO\f[R] must have crypto properties configured to it before it can be used, i.e.\ this setter must be called before the MKey can be used or else traffic will fail, generating a CQE with error. A call to this setter on a MKey that already has crypto properties configured to it will override existing crypto properties. .PP Configuring crypto properties to a MKey is done by specifying the crypto standard that should be used and its attributes, and also by providing the Data Encryption Key (DEK) to be used for the encryption/decryption itself. .PP The MKey represents a virtually contiguous memory, by configuring a layout to it. The crypto properties of the MKey describe whether data in this virtually contiguous memory is encrypted or in plaintext, and whether it should be encrypted/decrypted before transmitting it or after receiving it. Depending on the actual operation that happens (TX or RX), the device will do the \[lq]right thing\[rq] based on the crypto properties configured in the MKey. .PP MKeys can be configured with both crypto and signature properties at the same time by calling both \f[B]mlx5dv_wr_set_mkey_crypto()\f[R](3) and \f[B]mlx5dv_wr_set_mkey_sig_block()\f[R](3). In this case, both crypto and signature operations will be performed according to the crypto and signature properties configured in the MKey, and the order of operations will be determined by the \f[I]signature_crypto_order\f[R] property. .SS Example 1 (corresponds to row F in the table below): .PP Memory signature domain is not configured, and memory data is encrypted. .PP Wire signature domain is not configured, and wire data is in plaintext. .PP \f[I]encrypt_on_tx\f[R] is set to false, and because signature is not configured, \f[I]signature_crypto_order\f[R] value doesn\[cq]t matter. .PP A SEND is issued using the MKey as a local key. .PP Result: device will gather the encrypted data from the MKey (using whatever layout configured to the MKey to locate the actual memory), decrypt it using the supplied DEK and transmit the decrypted data to the wire. .SS Example 1.1: .PP Same as above, but a RECV is issued with the same MKey, and RX happens. .PP Result: device will receive the data from the wire, encrypt it using the supplied DEK and scatter it to the MKey (using whatever layout configured to the MKey to locate the actual memory). .SS Example 2 (corresponds to row C in the table below): .PP Memory signature domain is configured for no signature, and memory data is in plaintext. .PP Wire signature domain is configured for T10DIF every 512 Bytes block, and wire data (including the T10DIF) is encrypted. 
.PP \f[I]encrypt_on_tx\f[R] is set to true and \f[I]signature_crypto_order\f[R] is set to be \f[B]MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_BEFORE_CRYPTO_ON_TX\f[R]. \f[I]data_unit_size\f[R] is set to \f[B]MLX5DV_BLOCK_SIZE_520\f[R]. .PP The MKey is sent to a remote node that issues a RDMA_READ to this MKey. .PP Result: device will gather the data from the MKey (using whatever layout configured to the MKey to locate the actual memory), generate an additional T10DIF field every 512B of data, encrypt the data and the newly generated T10DIF field using the supplied DEK, and transmit it to the wire. .SS Example 2.1: .PP Same as above, but remote node issues a RDMA_WRITE to this MKey. .PP Result: device will receive the data from the wire, decrypt the data using the supplied DEK, validate each T10DIF field against the previous 512B of data, strip the T10DIF field, and scatter the data alone to the MKey (using whatever layout configured to the MKey to locate the actual memory). .SH ARGUMENTS .TP \f[I]mqp\f[R] The QP where an MKey configuration work request was created by \f[B]mlx5dv_wr_mkey_configure()\f[R]. .TP \f[I]attr\f[R] Crypto attributes to set for the MKey. .SS Crypto Attributes .PP Crypto attributes describe the format (encrypted or plaintext) and layout of the input and output data in memory and wire domains, the crypto standard that should be used and its attributes. .IP .nf \f[C] struct mlx5dv_crypto_attr { enum mlx5dv_crypto_standard crypto_standard; bool encrypt_on_tx; enum mlx5dv_signature_crypto_order signature_crypto_order; enum mlx5dv_block_size data_unit_size; char initial_tweak[16]; struct mlx5dv_dek *dek; char keytag[8]; uint64_t comp_mask; }; \f[R] .fi .TP \f[I]crypto_standard\f[R] The encryption standard that should be used, currently can only be the following value .RS .TP \f[B]MLX5DV_CRYPTO_STANDARD_AES_XTS\f[R] The AES-XTS encryption standard defined in IEEE Std 1619-2007. .RE .TP \f[I]encrypt_on_tx\f[R] If set, memory data will be encrypted during TX and wire data will be decrypted during RX. If not set, memory data will be decrypted during TX and wire data will be encrypted during RX. .TP \f[I]signature_crypto_order\f[R] Controls the order between crypto and signature operations (Please see detailed table below). Relevant only if signature is configured. Can be one of the following values .RS .TP \f[B]MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_AFTER_CRYPTO_ON_TX\f[R] During TX, first perform crypto operation (encrypt/decrypt based on \f[I]encrypt_on_tx\f[R]) and then signature operation on memory data. During RX, first perform signature operation and then crypto operation (encrypt/decrypt based on \f[I]encrypt_on_tx\f[R]) on wire data. .TP \f[B]MLX5DV_SIGNATURE_CRYPTO_ORDER_SIGNATURE_BEFORE_CRYPTO_ON_TX\f[R] During TX, first perform signature operation and then crypto operation (encrypt/decrypt based on \f[I]encrypt_on_tx\f[R]) on memory data. During RX, first perform crypto operation (encrypt/decrypt based on \f[I]encrypt_on_tx\f[R]) and then signature operation on wire data. .PP Table: \f[I]signature_crypto_order\f[R] and \f[I]encrypt_on_tx\f[R] Meaning. .PP The table describes the possible data layouts in memory and wire domains, and the order in which crypto and signature operations are performed according to \f[I]signature_crypto_order\f[R], \f[I]encrypt_on_tx\f[R] and signature configuration. .PP Memory column represents the data layout in the memory domain. .PP Wire column represents the data layout in the wire domain. 
.PP There are three possible operations that can be performed by the device on the data when processing it from memory to wire and from wire to memory: .IP "1." 3 Crypto operation. .IP "2." 3 Signature operation in memory domain. .IP "3." 3 Signature operation in wire domain. .PP Op1, Op2 and Op3 columns represent these operations. On TX, Op1, Op2 and Op3 are performed on memory data to produce the data layout that is specified in Wire column. On RX, Op3, Op2 and Op1 are performed on wire data to produce the data layout specified in Memory column. \[lq]SIG.mem\[rq] and \[lq]SIG.wire\[rq] represent the signature operation that is performed in memory and wire domains respectively. None means no operation is performed. The exact signature operations are determined by the signature attributes configured by \f[B]mlx5dv_wr_set_mkey_sig_block()\f[R]. .PP encrypt_on_tx and signature_crypto_order columns represent the values that \f[I]encrypt_on_tx\f[R] and \f[I]signature_crypto_order\f[R] should have in order to achieve such behavior. .PP .TS tab(@); lw(2.7n) lw(8.5n) lw(8.5n) lw(8.5n) lw(8.5n) lw(8.5n) lw(8.0n) lw(17.0n). T{ T}@T{ Memory T}@T{ Op1 T}@T{ Op2 T}@T{ Op3 T}@T{ Wire T}@T{ encrypt_on_tx T}@T{ signature_crypto_order T} _ T{ A T}@T{ data T}@T{ Encrypt on TX T}@T{ SIG.mem = none T}@T{ SIG.wire = none T}@T{ enc(data) T}@T{ True T}@T{ Doesn\[cq]t matter T} T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T} T{ B T}@T{ data T}@T{ Encrypt On TX T}@T{ SIG.mem = none T}@T{ SIG.wire = SIG T}@T{ enc(data)+SIG T}@T{ True T}@T{ SIGNATURE_AFTER_CRYPTO_ON_TX T} T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T} T{ C T}@T{ data T}@T{ SIG.mem = none T}@T{ SIG.wire = SIG T}@T{ Encrypt on TX T}@T{ enc(data+SIG) T}@T{ True T}@T{ SIGNATURE_BEFORE_CRYPTO_ON_TX T} T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T} T{ D T}@T{ data+SIG T}@T{ SIG.mem = SIG T}@T{ SIG.wire = none T}@T{ Encrypt on TX T}@T{ enc(data) T}@T{ True T}@T{ SIGNATURE_BEFORE_CRYPTO_ON_TX T} T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T} T{ E T}@T{ data+SIG1 T}@T{ SIG.mem = SIG1 T}@T{ SIG.wire = SIG2 T}@T{ Encrypt on TX T}@T{ enc(data+SIG2) T}@T{ True T}@T{ SIGNATURE_BEFORE_CRYPTO_ON_TX T} T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T} T{ F T}@T{ enc(data) T}@T{ Decrypt on TX T}@T{ SIG.mem = none T}@T{ SIG.wire = none T}@T{ data T}@T{ False T}@T{ Doesn\[cq]t matter T} T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T} T{ G T}@T{ enc(data) T}@T{ Decrypt on TX T}@T{ SIG.mem = none T}@T{ SIG.wire = SIG T}@T{ data+SIG T}@T{ False T}@T{ SIGNATURE_AFTER_CRYPTO_ON_TX T} T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T} T{ H T}@T{ enc(data+SIG) T}@T{ Decrypt on TX T}@T{ SIG.mem = SIG T}@T{ SIG.wire = none T}@T{ data T}@T{ False T}@T{ SIGNATURE_AFTER_CRYPTO_ON_TX T} T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T} T{ I T}@T{ enc(data+SIG1) T}@T{ Decrypt on TX T}@T{ SIG.mem = SIG1 T}@T{ SIG.wire = SIG2 T}@T{ data+SIG2 T}@T{ False T}@T{ SIGNATURE_AFTER_CRYPTO_ON_TX T} T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T}@T{ T} T{ J T}@T{ enc(data)+SIG T}@T{ SIG.mem = SIG T}@T{ SIG.wire = none T}@T{ Decrypt on TX T}@T{ data T}@T{ False T}@T{ SIGNATURE_BEFORE_CRYPTO_ON_TX T} .TE .PP Notes: .IP \[bu] 2 \[lq]Encrypt on TX\[rq] also means \[lq]Decrypt on RX\[rq], and \[lq]Decrypt on TX\[rq] also means \[lq]Encrypt on RX\[rq]. .IP \[bu] 2 When signature properties are not configured in the MKey, only crypto operations will be performed. Thus, \f[I]signature_crypto_order\f[R] has no meaning in this case (rows A and F), and it can be set to either one of its values. 
.RE
.TP
\f[I]data_unit_size\f[R]
For storage, this will normally be the storage block size.
The tweak is incremented after each \f[I]data_unit_size\f[R] during the
encryption.
Can be one of \f[B]enum mlx5dv_block_size\f[R].
.TP
\f[I]initial_tweak\f[R]
A value to be used during encryption of each data unit.
Must be supplied in little endian.
This value is incremented by the device for every data unit in the
message.
For storage encryption, this will normally be the LBA of the first
block in the message, so that the increments represent the LBAs of the
rest of the blocks in the message.
.TP
\f[I]dek\f[R]
The DEK to be used for the crypto operations.
This DEK must be pre-loaded to the device using
\f[B]mlx5dv_dek_create()\f[R].
.TP
\f[I]key_tag\f[R]
A tag that verifies that the correct DEK is being used.
\f[I]key_tag\f[R] is optional and is valid only if the DEK was created
with \f[B]has_keytag\f[R] set to true.
If so, it must match the key tag that was provided when the DEK was
created.
Supplied in plaintext.
.TP
\f[I]comp_mask\f[R]
Reserved for future extension, must be 0 now.
.SH RETURN VALUE
.PP
This function does not return a value.
.PP
In case of error, the user will be notified later when completing the
DV WRs chain.
.SH NOTES
.PP
The MKey must be created with the
\f[B]MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO\f[R] flag.
.PP
The last operation posted on the supplied QP should be
\f[B]mlx5dv_wr_mkey_configure\f[R](3), or one of its related setters,
and the operation must still be open (no doorbell issued).
.PP
In case of \f[B]ibv_wr_complete()\f[R] failure or a call to
\f[B]ibv_wr_abort()\f[R], the MKey may be left in an unknown state.
The next configuration of it should not assume any previous state of
the MKey, i.e.\ signature/crypto should be re-configured or reset, as
required.
For example, assuming \f[B]mlx5dv_wr_set_mkey_sig_block()\f[R] and then
\f[B]ibv_wr_abort()\f[R] were called, then on the next configuration of
the MKey, if signature is not needed, it should be reset using
\f[B]MLX5DV_MKEY_CONF_FLAG_RESET_SIG_ATTR\f[R].
.PP
When configuring a MKey with AES-XTS crypto offload, and using it for
traffic (send/receive), the amount of data to send/receive must meet
one of the following conditions for a successful encryption/decryption
process (per the AES-XTS spec).
Let\[cq]s refer to the amount of data to send/receive as
\[lq]job_size\[rq]:
.IP "1." 3
job_size % \f[I]data_unit_size\f[R] == 0
.IP "2." 3
(job_size % 16 == 0) && (job_size % \f[I]data_unit_size\f[R] <=
\f[I]data_unit_size\f[R] - 16)
.PP
For example, when \f[I]data_unit_size\f[R] = 512B:
.IP "1." 3
job_size = 512B is valid (1 holds).
.IP "2." 3
job_size = 128B is valid (2 holds).
.IP "3." 3
job_size = 47B is invalid (neither 1 nor 2 holds).
.PP
When \f[I]data_unit_size\f[R] = 520B:
.IP "1." 3
job_size = 520B is valid (1 holds).
.IP "2." 3
job_size = 496B is valid (2 holds).
.IP "3." 3
job_size = 512B is invalid (neither 1 nor 2 holds).
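.PP
The two conditions translate directly into a small validity check.
The helper below is an illustrative sketch only, not part of the
mlx5dv API; sizes are in bytes and \f[I]data_unit_size\f[R] is assumed
to be at least 16:
.IP
.nf
\f[C]
#include <stdbool.h>
#include <stdint.h>

/* Sketch: returns true if job_size satisfies one of the two
 * AES-XTS conditions listed above. */
static bool crypto_job_size_ok(uint64_t job_size,
                               uint64_t data_unit_size)
{
    uint64_t rem = job_size % data_unit_size;

    if (rem == 0)                         /* condition 1 */
        return true;

    return (job_size % 16 == 0) &&        /* condition 2 */
           (rem <= data_unit_size - 16);
}
\f[R]
.fi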
.SH SEE ALSO .PP \f[B]mlx5dv_wr_mkey_configure\f[R](3), \f[B]mlx5dv_wr_set_mkey_sig_block\f[R](3), \f[B]mlx5dv_create_mkey\f[R](3), \f[B]mlx5dv_destroy_mkey\f[R](3), \f[B]mlx5dv_crypto_login\f[R](3), \f[B]mlx5dv_crypto_login_create\f[R](3), \f[B]mlx5dv_dek_create\f[R](3) .SH AUTHORS .PP Oren Duer .PP Avihai Horon .PP Maher Sanalla rdma-core-56.1/buildlib/pandoc-prebuilt/4a2fda3e7e3b15e84396f81e6aae0bde38dcfb980000644000175100002000000000157214773456412034043 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_QUERY_GID" "3" "2006-10-31" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_query_gid - query an InfiniBand port\[cq]s GID table .SH SYNOPSIS .IP .nf \f[C] #include int ibv_query_gid(struct ibv_context *context, uint8_t port_num, int index, union ibv_gid *gid); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_query_gid()\f[R] returns the GID value in entry \f[I]index\f[R] of port \f[I]port_num\f[R] for device context \f[I]context\f[R] through the pointer \f[I]gid\f[R]. .SH RETURN VALUE .PP \f[B]ibv_query_gid()\f[R] returns 0 on success, and -1 on error. .SH SEE ALSO .PP \f[B]ibv_open_device\f[R](3), \f[B]ibv_query_device\f[R](3), \f[B]ibv_query_pkey\f[R](3), \f[B]ibv_query_port\f[R](3) .SH AUTHOR .PP Dotan Barak rdma-core-56.1/buildlib/pandoc-prebuilt/2cd4402b920e0a57d92dcf281f2091ee6e4ac1410000644000175100002000000000224114773456413033421 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_RESIZE_CQ" "3" "2006-10-31" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_resize_cq - resize a completion queue (CQ) .SH SYNOPSIS .IP .nf \f[C] #include int ibv_resize_cq(struct ibv_cq *cq, int cqe); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_resize_cq()\f[R] resizes the completion queue (CQ) \f[I]cq\f[R] to have at least \f[I]cqe\f[R] entries. \f[I]cqe\f[R] must be at least the number of unpolled entries in the CQ \f[I]cq\f[R]. If \f[I]cqe\f[R] is a valid value less than the current CQ size, \f[B]ibv_resize_cq()\f[R] may not do anything, since this function is only guaranteed to resize the CQ to a size at least as big as the requested size. .SH RETURN VALUE .PP \f[B]ibv_resize_cq()\f[R] returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH NOTES .PP \f[B]ibv_resize_cq()\f[R] may assign a CQ size greater than or equal to the requested size. The cqe member of \f[I]cq\f[R] will be updated to the actual size. .SH SEE ALSO .PP \f[B]ibv_create_cq\f[R](3), \f[B]ibv_destroy_cq\f[R](3) .SH AUTHOR .PP Dotan Barak rdma-core-56.1/buildlib/pandoc-prebuilt/caaca7667f40fff2095c23c0f40c925f1ff3edea0000644000175100002000000000214414773456413034021 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "EFADV_QUERY_AH" "3" "2019-05-19" "efa" "EFA Direct Verbs Manual" .hy .SH NAME .PP efadv_query_ah - Query EFA specific Address Handle attributes .SH SYNOPSIS .IP .nf \f[C] #include int efadv_query_ah(struct ibv_ah *ibvah, struct efadv_ah_attr *attr, uint32_t inlen); \f[R] .fi .SH DESCRIPTION .PP \f[B]efadv_query_ah()\f[R] queries device-specific Address Handle attributes. .PP Compatibility is handled using the comp_mask and inlen fields. .IP .nf \f[C] struct efadv_ah_attr { uint64_t comp_mask; uint16_t ahn; uint8_t reserved[6]; }; \f[R] .fi .TP \f[I]inlen\f[R] In: Size of struct efadv_ah_attr. .TP \f[I]comp_mask\f[R] Compatibility mask. 
.TP \f[I]ahn\f[R] Device\[cq]s Address Handle number. .SH RETURN VALUE .PP \f[B]efadv_query_ah()\f[R] returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH SEE ALSO .PP \f[B]efadv\f[R](7) .SH NOTES .IP \[bu] 2 Compatibility mask (comp_mask) is an out field and currently has no values. .SH AUTHORS .PP Gal Pressman rdma-core-56.1/buildlib/pandoc-prebuilt/c41c950912e4d1f32e01fbf73716d1122a2e66a60000644000175100002000000000424314773456414033267 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_create_steering_anchor / mlx5dv_destroy_steering_anchor" "3" "" "" "" .hy .SH NAME .PP mlx5dv_create_steering_anchor - Creates a steering anchor .PP mlx5dv_destroy_steering_anchor - Destroys a steering anchor .SH SYNOPSIS .IP .nf \f[C] #include struct mlx5dv_steering_anchor * mlx5dv_create_steering_anchor(struct ibv_context *context, struct mlx5dv_steering_anchor_attr *attr); int mlx5dv_destroy_steering_anchor(struct mlx5dv_steering_anchor *sa); \f[R] .fi .SH DESCRIPTION .PP A user can take packets into a user-configured sandbox and do packet processing at the end of which a steering pipeline decision is made on what to do with the packet. .PP A steering anchor allows the user to reinject the packet back into the kernel for additional processing. .PP \f[B]mlx5dv_create_steering_anchor()\f[R] Creates an anchor which will allow injecting the packet back into the kernel steering pipeline. .PP \f[B]mlx5dv_destroy_steering_anchor()\f[R] Destroys a steering anchor. .SH ARGUMENTS .SS context .PP The device context to associate the steering anchor with. .SS attr .PP Anchor attributes specify the priority and flow table type to which the anchor will point. .IP .nf \f[C] struct mlx5dv_steering_anchor_attr { enum mlx5dv_flow_table_type ft_type; uint16_t priority; uint64_t comp_mask; }; \f[R] .fi .TP \f[I]ft_type\f[R] The flow table type to which the anchor will point. .TP \f[I]priority\f[R] The priority inside \f[I]ft_type\f[R] to which the created anchor will point. .TP \f[I]comp_mask\f[R] Reserved for future extension, must be 0 now. .SS mlx5dv_steering_anchor .IP .nf \f[C] struct mlx5dv_steering_anchor { uint32_t id; }; \f[R] .fi .TP \f[I]id\f[R] The flow table ID to use as the destination when creating the flow table entry. .SH RETURN VALUE .PP \f[B]mlx5dv_create_steering_anchor()\f[R] returns a pointer to a new \f[I]mlx5dv_steering_anchor\f[R] on success. On error NULL is returned and errno is set. .PP \f[B]mlx5dv_destroy_steering_anchor()\f[R] returns 0 on success and errno value on error. .SH AUTHORS .PP Mark Bloch rdma-core-56.1/buildlib/pandoc-prebuilt/561c21785df0cfbff916d5860a43a2e301875e900000644000175100002000000000673514773456417033327 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. 
.TH "IBFINDNODESUSING" 8 "2017-08-21" "" "Open IB Diagnostics" .SH NAME ibfindnodesusing \- find a list of end nodes which are routed through the specified switch and port .SH SYNOPSIS .sp ibfindnodesusing.pl [options] .SH DESCRIPTION .sp ibfindnodesusing.pl uses ibroute and detects the current nodes which are routed through both directions of the link specified. The link is specified by one switch port end; the script finds the remote end automatically. .SH OPTIONS .INDENT 0.0 .TP .B \fB\-h\fP show help .TP .B \fB\-R\fP Recalculate the ibnetdiscover information, ie do not use the cached information. This option is slower but should be used if the diag tools have not been used for some time or if there are other reasons to believe that the fabric has changed. .UNINDENT .sp \fB\-C \fP use the specified ca_name. .sp \fB\-P \fP use the specified ca_port. .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .\" Common text to describe the node name map file. . .SS NODE NAME MAP FILE FORMAT .sp The node name map is used to specify user friendly names for nodes in the output. GUIDs are used to perform the lookup. .sp This functionality is provided by the opensm\-libs package. See \fBopensm(8)\fP for the file location for your installation. .sp \fBGenerically:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # comment "" .ft P .fi .UNINDENT .UNINDENT .sp \fBExample:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # IB1 # Line cards 0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D" # Spines 0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" # GUID Node Name 0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D" .ft P .fi .UNINDENT .UNINDENT .SH AUTHOR .INDENT 0.0 .TP .B Ira Weiny < \fI\%ira.weiny@intel.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . 
rdma-core-56.1/buildlib/pandoc-prebuilt/8b66ab581ae97d421270efd7663ff0e978bf4d0f0000644000175100002000000000514114773456414033553 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_create_mkey / mlx5dv_destroy_mkey" "3" "" "" "" .hy .SH NAME .PP mlx5dv_create_mkey - Creates an indirect mkey .PP mlx5dv_destroy_mkey - Destroys an indirect mkey .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/mlx5dv.h> struct mlx5dv_mkey_init_attr { struct ibv_pd *pd; uint32_t create_flags; uint16_t max_entries; }; struct mlx5dv_mkey { uint32_t lkey; uint32_t rkey; }; struct mlx5dv_mkey * mlx5dv_create_mkey(struct mlx5dv_mkey_init_attr *mkey_init_attr); int mlx5dv_destroy_mkey(struct mlx5dv_mkey *mkey); \f[R] .fi .SH DESCRIPTION .PP Create / destroy an indirect mkey. .PP Create an indirect mkey to enable an application to use device-specific functionality. .SH ARGUMENTS .SS mkey_init_attr .TP \f[I]pd\f[R] ibv protection domain. .TP \f[I]create_flags\f[R] MLX5DV_MKEY_INIT_ATTR_FLAGS_INDIRECT: Indirect mkey is being created. MLX5DV_MKEY_INIT_ATTR_FLAGS_BLOCK_SIGNATURE: Enable block signature offload support for mkey. MLX5DV_MKEY_INIT_ATTR_FLAGS_CRYPTO: Enable crypto offload support for mkey. Setting this flag means that crypto operations will be done and hence, must be configured. I.e. if this flag is set and the MKey was not configured for crypto properties using \f[B]mlx5dv_wr_set_mkey_crypto()\f[R], then running traffic with the MKey will fail, generating a CQE with error. MLX5DV_MKEY_INIT_ATTR_FLAGS_UPDATE_TAG: Enable update tag support for mkey. Setting this flag allows an application to set the mkey tag after the mkey has been created. If the kernel does not support updating the mkey tag, mkey creation will fail. MLX5DV_MKEY_INIT_ATTR_FLAGS_REMOTE_INVALIDATE: Enable remote invalidation support for mkey. .TP \f[I]max_entries\f[R] Requested maximum number of entries pointed to by this indirect mkey. The function will update \f[I]mkey_init_attr->max_entries\f[R] with the actual value supported by the created mkey; it will be greater than or equal to the value requested. .SH RETURN VALUE .PP Upon success \f[I]mlx5dv_create_mkey\f[R] will return a new \f[I]struct mlx5dv_mkey\f[R]; on error NULL will be returned and errno will be set. .PP Upon success \f[I]mlx5dv_destroy_mkey\f[R] returns 0; on failure the value of errno is returned. .SH NOTES .PP For this functionality to work, a DEVX context should be opened by using \f[I]mlx5dv_open_device\f[R]. .PP The created indirect mkey can\[cq]t work with the scatter-to-CQE feature; consider \f[I]mlx5dv_create_qp()\f[R] with MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE for small messages. .SH SEE ALSO .PP \f[B]mlx5dv_open_device\f[R](3), \f[B]mlx5dv_create_qp\f[R](3) .SH AUTHOR .PP Yishai Hadas
rdma-core-56.1/buildlib/pandoc-prebuilt/515c4ddf52e644f7f347c40deac43d5ddb5bb19d0000644000175100002000000000223314773456413033744 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "EFADV_CREATE_DRIVER_QP" "3" "2019-01-23" "efa" "EFA Direct Verbs Manual" .hy .SH NAME .PP efadv_create_driver_qp - Create EFA specific Queue Pair .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/efadv.h> struct ibv_qp *efadv_create_driver_qp(struct ibv_pd *ibvpd, struct ibv_qp_init_attr *attr, uint32_t driver_qp_type); \f[R] .fi .SH DESCRIPTION .PP \f[B]efadv_create_driver_qp()\f[R] creates device-specific Queue Pairs.
.PP Scalable Reliable Datagram (SRD) transport provides reliable out-of-order delivery, transparently utilizing multiple network paths to reduce network tail latency. Its interface is similar to UD, in particular it supports message size up to MTU, with error handling extended to support reliable communication. .TP \f[I]driver_qp_type\f[R] The type of QP to be created: .RS .PP EFADV_QP_DRIVER_TYPE_SRD: Create an SRD QP. .RE .SH RETURN VALUE .PP efadv_create_driver_qp() returns a pointer to the created QP, or NULL if the request fails. .SH SEE ALSO .PP \f[B]efadv\f[R](7) .SH AUTHORS .PP Gal Pressman rdma-core-56.1/buildlib/pandoc-prebuilt/740ff287af59dd58577c3c7910e7778286773efb0000644000175100002000000000625414773456415033314 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_devx_umem_reg, mlx5dv_devx_umem_dereg" "3" "" "" "" .hy .SH NAME .PP mlx5dv_devx_umem_reg - Register a user memory to be used by the devx interface .PP mlx5dv_devx_umem_reg_ex - Register a user memory to be used by the devx interface .PP mlx5dv_devx_umem_dereg - Deregister a devx umem object .SH SYNOPSIS .IP .nf \f[C] #include struct mlx5dv_devx_umem { uint32_t umem_id; }; struct mlx5dv_devx_umem * mlx5dv_devx_umem_reg(struct ibv_context *context, void *addr, size_t size, uint32_t access) struct mlx5dv_devx_umem_in { void *addr; size_t size; uint32_t access; uint64_t pgsz_bitmap; uint64_t comp_mask; int dmabuf_fd; }; struct mlx5dv_devx_umem * mlx5dv_devx_umem_reg_ex(struct ibv_context *ctx, struct mlx5dv_devx_umem_in *umem_in); int mlx5dv_devx_umem_dereg(struct mlx5dv_devx_umem *dv_devx_umem) \f[R] .fi .SH DESCRIPTION .PP Register or deregister a user memory to be used by the devx interface. .PP The register verb exposes a UMEM DEVX object for user memory registration for DMA. The API to register the user memory gets as input the user address, length and access flags, and provides to the user as output an object which holds the UMEM ID returned by the firmware to this registered memory. .PP The user can ask for specific page sizes for the given address and length, in that case \f[I]mlx5dv_devx_umem_reg_ex()\f[R] should be used. In case the kernel couldn\[cq]t find a matching page size from the given \f[I]umem_in->pgsz_bitmap\f[R] bitmap the API will fail. .PP The user will use that UMEM ID in device direct commands that use this memory instead of the physical addresses list, for example upon \f[I]mlx5dv_devx_obj_create\f[R] to create a QP. .SH ARGUMENTS .TP \f[I]context\f[R] .IP .nf \f[C] RDMA device context to create the action on. \f[R] .fi .TP \f[I]addr\f[R] The memory start address to register. .TP \f[I]size\f[R] .IP .nf \f[C] The size of *addr* buffer. \f[R] .fi .TP \f[I]access\f[R] The desired memory protection attributes; it is either 0 or the bitwise OR of one or more of \f[I]enum ibv_access_flags\f[R]. .TP \f[I]umem_in\f[R] A structure holds the argument bundle. .TP \f[I]pgsz_bitmap\f[R] Represents the required page sizes. umem creation will fail if it cannot be created with these page sizes. .TP \f[I]comp_mask\f[R] Flags indicating the additional fields. .TP \f[I]dmabuf_fd\f[R] If MLX5DV_UMEM_MASK_DMABUF is set in \f[I]comp_mask\f[R] then this value must be a FD of a dmabuf. In this mode the dmabuf is used as the backing memory to create the umem out of. The dmabuf must be pinnable. \f[I]addr\f[R] is interpreted as the starting offset of the dmabuf. 
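.PP
As a usage illustration, the following minimal sketch registers a heap
buffer while restricting the umem to 4K pages; the helper name, buffer
size and page-size bitmap are arbitrary choices, and ctx is assumed to
be a DEVX-enabled context obtained earlier via
\f[I]mlx5dv_open_device\f[R]:
.IP
.nf
\f[C]
#include <stdlib.h>
#include <infiniband/mlx5dv.h>

static struct mlx5dv_devx_umem *umem_reg_example(struct ibv_context *ctx)
{
    struct mlx5dv_devx_umem_in in = {};
    void *buf;

    if (posix_memalign(&buf, 4096, 1 << 20))
        return NULL;

    in.addr = buf;
    in.size = 1 << 20;
    in.access = IBV_ACCESS_LOCAL_WRITE;
    in.pgsz_bitmap = 1ULL << 12; /* accept 4K pages only */
    in.comp_mask = 0;            /* no dmabuf */

    return mlx5dv_devx_umem_reg_ex(ctx, &in);
}
\f[R]
.fi
.PP
The returned object\[cq]s \f[I]umem_id\f[R] can then be referenced in
DEVX commands such as \f[I]mlx5dv_devx_obj_create\f[R].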
.SH RETURN VALUE .PP Upon success \f[I]mlx5dv_devx_umem_reg\f[R] / \f[I]mlx5dv_devx_umem_reg_ex\f[R] will return a new \f[I]struct mlx5dv_devx_umem\f[R] object; on error NULL will be returned and errno will be set. .PP \f[I]mlx5dv_devx_umem_dereg\f[R] returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH SEE ALSO .PP \f[I]mlx5dv_open_device(3)\f[R], \f[I]ibv_reg_mr(3)\f[R], \f[I]mlx5dv_devx_obj_create(3)\f[R] .SH AUTHOR .PP Yishai Hadas
rdma-core-56.1/buildlib/pandoc-prebuilt/a6e351c703e8e37dd34435cd99d002090744d0d00000644000175100002000000000227214773456413033217 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "RDMA_INIT_QP_ATTR" "3" "2018-12-31" "librdmacm" "Librdmacm Programmer\[cq]s Manual" .hy .SH NAME .PP rdma_init_qp_attr - Returns qp attributes of an rdma_cm_id. .SH SYNOPSIS .IP .nf \f[C] #include <rdma/rdma_cma.h> int rdma_init_qp_attr(struct rdma_cm_id *id, struct ibv_qp_attr *qp_attr, int *qp_attr_mask); \f[R] .fi .SH DESCRIPTION .PP \f[B]rdma_init_qp_attr()\f[R] returns qp attributes of an rdma_cm_id. .PP Information about qp attributes and qp attributes mask is returned through the \f[I]qp_attr\f[R] and \f[I]qp_attr_mask\f[R] parameters. .PP For details on the qp_attr structure, see ibv_modify_qp. .SH ARGUMENTS .TP \f[I]id\f[R] RDMA identifier. .TP \f[I]qp_attr\f[R] A reference to a qp attributes struct containing response information. .TP \f[I]qp_attr_mask\f[R] A reference to a qp attributes mask containing response information. .SH RETURN VALUE .PP \f[B]rdma_init_qp_attr()\f[R] returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH SEE ALSO .PP \f[B]rdma_cm\f[R](7), \f[B]ibv_modify_qp\f[R](3) .SH AUTHOR .PP Danit Goldberg
rdma-core-56.1/buildlib/pandoc-prebuilt/edcb345e0afc5fdd0f2beadfd7bbbb5ec6c130430000644000175100002000000001131214773456417034360 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBHOSTS" 8 "2016-12-20" "" "OpenIB Diagnostics" .SH NAME IBHOSTS \- show InfiniBand host nodes in topology .SH SYNOPSIS .sp ibhosts [options] [<topology\-file>] .SH DESCRIPTION .sp ibhosts is a script which either walks the IB subnet topology or uses an already saved topology file and extracts the CA nodes. .SH OPTIONS .\" Define the common option -C . .sp \fB\-C, \-\-Ca <ca_name>\fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port <ca_port>\fP use the specified ca_port. .\" Define the common option -t . .sp \fB\-t, \-\-timeout <timeout_ms>\fP override the default timeout for the solicited mads. .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key <key>\fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .\" Define the common option -h .
.sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .\" Common text to describe the node name map file. . .SS NODE NAME MAP FILE FORMAT .sp The node name map is used to specify user friendly names for nodes in the output. GUIDs are used to perform the lookup. .sp This functionality is provided by the opensm\-libs package. See \fBopensm(8)\fP for the file location for your installation. .sp \fBGenerically:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # comment "" .ft P .fi .UNINDENT .UNINDENT .sp \fBExample:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # IB1 # Line cards 0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D" # Spines 0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" # GUID Node Name 0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D" .ft P .fi .UNINDENT .UNINDENT .SH SEE ALSO .sp ibnetdiscover(8) .SH DEPENDENCIES .sp ibnetdiscover, ibnetdiscover format .SH AUTHOR .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%halr@voltaire.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . 
rdma-core-56.1/buildlib/pandoc-prebuilt/488157a3da9c72a87455826d599ca51a8d4a37160000644000175100002000000000311014773456414033165 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "HNSDV_QUERY_DEVICE" "3" "2024-02-06" "hns" "HNS Direct Verbs Manual" .hy .SH NAME .PP hnsdv_query_device - Query hns device specific attributes .SH SYNOPSIS .IP .nf \f[C] #include int hnsdv_query_device(struct ibv_context *context, struct hnsdv_context *attrs_out); \f[R] .fi .SH DESCRIPTION .PP \f[B]hnsdv_query_device()\f[R] Queries hns device specific attributes. .SH ARGUMENTS .PP Please see \f[I]ibv_query_device(3)\f[R] man page for \f[I]context\f[R]. .SS attrs_out .IP .nf \f[C] struct hnsdv_context { uint64_t comp_mask; uint64_t flags; uint8_t congest_type; uint8_t reserved[7]; }; \f[R] .fi .TP \f[I]comp_mask\f[R] Bitmask specifying what fields in the structure are valid: .RS .PP HNSDV_CONTEXT_MASK_CONGEST_TYPE: Congestion control algorithm is supported. .RE .TP \f[I]congest_type\f[R] Bitmask of supported congestion control algorithms. .RS .PP HNSDV_QP_CREATE_ENABLE_DCQCN: Data Center Quantized Congestion Notification HNSDV_QP_CREATE_ENABLE_LDCP: Low Delay Control Protocol HNSDV_QP_CREATE_ENABLE_HC3: Huawei Converged Congestion Control HNSDV_QP_CREATE_ENABLE_DIP: Destination IP based Quantized Congestion Notification .RE .SH RETURN VALUE .PP \f[B]hnsdv_query_device()\f[R] returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH SEE ALSO .PP \f[B]ibv_query_device\f[R](3) .SH NOTES .IP \[bu] 2 \f[I]flags\f[R] is an out field and currently has no values. .SH AUTHORS .PP Junxian Huang rdma-core-56.1/buildlib/pandoc-prebuilt/b374d10efc75dbb83fd6b6f4c9ea706162ce974f0000644000175100002000000000310314773456415033701 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_devx_alloc_msi_vector" "3" "2022-01-12" "mlx5" "mlx5 Programmer\[cq]s Manual" .hy .SH NAME .PP mlx5dv_devx_alloc_msi_vector - Allocate an msi vector to be used for creating an EQ. .PP mlx5dv_devx_free_msi_vector - Release an msi vector. .SH SYNOPSIS .IP .nf \f[C] #include struct mlx5dv_devx_msi_vector * mlx5dv_devx_alloc_msi_vector(struct ibv_context *ibctx); int mlx5dv_devx_free_msi_vector(struct mlx5dv_devx_msi_vector *msi); \f[R] .fi .SH DESCRIPTION .PP Allocate or free an msi vector to be used for creating an EQ. .PP The allocate API exposes a mlx5dv_devx_msi_vector object, which includes an msi vector and a fd. The vector can be used as the \[lq]eqc.intr\[rq] field when creating an EQ, while the fd (created as non-blocking) can be polled to see once there is some data on that EQ. .SH ARGUMENTS .TP \f[I]ibctx\f[R] RDMA device context to create the action on. .TP \f[I]msi\f[R] The msi vector object to work on. .SS msi_vector .IP .nf \f[C] struct mlx5dv_devx_msi_vector { int vector; int fd; }; \f[R] .fi .TP \f[I]vector\f[R] The vector to be used when creating the EQ over the device specification. .TP \f[I]fd\f[R] The FD that will be used for polling. .SH RETURN VALUE .PP Upon success \f[I]mlx5dv_devx_alloc_msi_vector\f[R] will return a new \f[I]struct mlx5dv_devx_msi_vector\f[R]; On error NULL will be returned and errno will be set. .PP Upon success \f[I]mlx5dv_devx_free_msi_vector\f[R] will return 0, on error errno will be returned. 
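.PP
For example, a minimal sketch (the helper name is illustrative and
error handling is abbreviated) that allocates a vector and waits for
data on its file descriptor:
.IP
.nf
\f[C]
#include <poll.h>
#include <infiniband/mlx5dv.h>

static int wait_for_eq_data(struct ibv_context *ctx)
{
    struct mlx5dv_devx_msi_vector *msi;
    struct pollfd pfd;
    int ret;

    msi = mlx5dv_devx_alloc_msi_vector(ctx);
    if (!msi)
        return -1;

    /* msi->vector would be placed in the EQ context (eqc.intr) when
     * creating the EQ through DEVX; the EQ creation itself is not
     * shown here. */
    pfd.fd = msi->fd;
    pfd.events = POLLIN;
    ret = poll(&pfd, 1, -1);

    mlx5dv_devx_free_msi_vector(msi);
    return ret > 0 ? 0 : -1;
}
\f[R]
.fi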
.SH AUTHOR .PP Mark Zhang rdma-core-56.1/buildlib/pandoc-prebuilt/4479a779f8a764c7a4c8ca11bda05bee2aeb86290000644000175100002000000000512414773456420033617 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBSTAT" 8 "2017-08-21" "" "Open IB Diagnostics" .SH NAME ibstat \- query basic status of InfiniBand device(s) .SH SYNOPSIS .sp ibstat [options] [portnum] .SH DESCRIPTION .sp ibstat is a binary which displays basic information obtained from the local IB driver. Output includes LID, SMLID, port state, link width active, and port physical state. .sp It is similar to the ibstatus utility but implemented as a binary rather than a script. It has options to list CAs and/or ports and displays more information than ibstatus. .SH OPTIONS .INDENT 0.0 .TP .B \fB\-l, \-\-list_of_cas\fP list all IB devices .TP .B \fB\-s, \-\-short\fP short output .TP .B \fB\-p, \-\-port_list\fP show port list .TP .B \fBca_name\fP InfiniBand device name .TP .B \fBportnum\fP port number of InfiniBand device .UNINDENT .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Configuration flags .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH EXAMPLES .INDENT 0.0 .TP .B :: ibstat # display status of all ports on all IB devices ibstat \-l # list all IB devices ibstat \-p # show port guids ibstat mthca0 2 # show status of port 2 of \(aqmthca0\(aq .UNINDENT .SH SEE ALSO .sp ibstatus (8) .SH AUTHOR .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%halr@voltaire.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . 
rdma-core-56.1/buildlib/pandoc-prebuilt/4616c6e33f45add7c8e0b9e86d978fb01250c8c10000644000175100002000000002102014773456414033455 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\"t .\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_dek_create / mlx5dv_dek_query / mlx5dv_dek_destroy" "3" "" "" "" .hy .SH NAME .PP mlx5dv_dek_create - Creates a DEK .PP mlx5dv_dek_query - Queries a DEK\[cq]s attributes .PP mlx5dv_dek_destroy - Destroys a DEK .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/mlx5dv.h> struct mlx5dv_dek *mlx5dv_dek_create(struct ibv_context *context, struct mlx5dv_dek_init_attr *init_attr); int mlx5dv_dek_query(struct mlx5dv_dek *dek, struct mlx5dv_dek_attr *attr); int mlx5dv_dek_destroy(struct mlx5dv_dek *dek); \f[R] .fi .SH DESCRIPTION .PP Data Encryption Keys (DEKs) are used to encrypt and decrypt transmitted data. After a DEK is created, it can be configured in MKeys for crypto offload operations. DEKs are not persistent and are destroyed upon process exit. Therefore, the software process needs to re-create all needed DEKs on startup. .PP \f[B]mlx5dv_dek_create()\f[R] creates a new DEK with the attributes specified in \f[I]init_attr\f[R]. A pointer to the newly created DEK is returned, which can be used for DEK query, DEK destruction and when configuring a MKey for crypto offload operations. .PP The DEK can be either wrapped or in plaintext and the format that should be used is determined by the specified crypto_login object. .PP To create a wrapped DEK, the application must have a valid crypto login object prior to creating the DEK. Creating a wrapped DEK can be performed in two ways: .IP "1." 3 Call \f[B]mlx5dv_crypto_login_create()\f[R] to obtain a crypto login object. Indicate that the DEK is wrapped by setting \f[B]MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN\f[R] value in \f[I]comp_mask\f[R] and passing the crypto login object in \f[I]crypto_login\f[R] field of \f[I]init_attr\f[R]. Fill the other DEK attributes and create the DEK. .IP "2." 3 Call \f[B]mlx5dv_crypto_login()\f[R] (i.e., the old API), supplying the credential and import_kek_id. .PP To create a plaintext DEK, the application must indicate that the DEK is in plaintext by setting \f[B]MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN\f[R] value in \f[I]comp_mask\f[R] and passing NULL value in \f[I]crypto_login\f[R] field of \f[I]init_attr\f[R], fill the other DEK attributes and create the DEK. .PP To use the created DEK (either wrapped or plaintext) in a MKey, a valid crypto login object or session is not needed. Revoking the import KEK or credential that were used for the crypto login object or session (and therefore rendering the crypto login invalid) does not prevent using a created DEK. .PP \f[B]mlx5dv_dek_query()\f[R] queries the DEK specified by \f[I]dek\f[R] and returns the queried attributes in \f[I]attr\f[R]. A valid crypto login object or session is not required to query a plaintext DEK. On the other hand, to query a wrapped DEK a valid crypto login object or session must be present. .PP \f[B]mlx5dv_dek_destroy()\f[R] destroys the DEK specified by \f[I]dek\f[R]. .SH ARGUMENTS .SS context .PP The device context to create the DEK with.
.SS init_attr .IP .nf \f[C] enum mlx5dv_dek_init_attr_mask { MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN = 1 << 0, }; struct mlx5dv_dek_init_attr { enum mlx5dv_crypto_key_size key_size; bool has_keytag; enum mlx5dv_crypto_key_purpose key_purpose; struct ibv_pd *pd; char opaque[8]; char key[128]; uint64_t comp_mask; /* Use enum mlx5dv_dek_init_attr_mask */ struct mlx5dv_crypto_login_obj *crypto_login; }; \f[R] .fi .TP \f[I]key_size\f[R] The size of the key, can be one of the following .RS .TP \f[B]MLX5DV_CRYPTO_KEY_SIZE_128\f[R] Key size is 128 bit. .TP \f[B]MLX5DV_CRYPTO_KEY_SIZE_256\f[R] Key size is 256 bit. .RE .TP \f[I]has_keytag\f[R] Whether the DEK has a keytag or not. If set, the key should include a 8 Bytes keytag. Keytag is used to verify that the DEK being used by a MKey is the expected DEK. This is done by comparing the keytag that was defined during DEK creation with the keytag provided in the MKey crypto configuration, and failing the operation if they are different. .TP \f[I]key_purpose\f[R] The purpose of the key, currently can only be the following value .RS .TP \f[B]MLX5DV_CRYPTO_KEY_PURPOSE_AES_XTS\f[R] The key will be used for AES-XTS crypto engine. .RE .TP \f[I]pd\f[R] The protection domain to be associated with the DEK. .TP \f[I]opaque\f[R] Plaintext metadata to describe the key. .TP \f[I]key\f[R] The key that will be used for encryption and decryption of transmitted data. For plaintext DEK \f[I]key\f[R] must be provided in plaintext. For wrapped DEK \f[I]key\f[R] must be provided wrapped by the import KEK that was specified in the crypto login. Actual size and layout of this field depend on the provided \f[I]key_size\f[R] and \f[I]has_keytag\f[R] fields, as well as on the format of the key (plaintext or wrapped). \f[I]key\f[R] should be constructed according to the following table. .RS .PP DEK \f[I]key\f[R] Field Construction. .TS tab(@); l l l l. T{ Import Method T}@T{ Has Keytag T}@T{ Key size T}@T{ Key Layout T} _ T{ Plaintext T}@T{ No T}@T{ 128 Bit T}@T{ key1_128b + key2_128b T} T{ T}@T{ T}@T{ T}@T{ T} T{ Plaintext T}@T{ No T}@T{ 256 Bit T}@T{ key1_256b + key2_256b T} T{ T}@T{ T}@T{ T}@T{ T} T{ Plaintext T}@T{ Yes T}@T{ 128 Bit T}@T{ key1_128b + key2_128b + keytag_64b T} T{ T}@T{ T}@T{ T}@T{ T} T{ Plaintext T}@T{ Yes T}@T{ 256 Bit T}@T{ key1_256b + key2_256b + keytag_64b T} T{ T}@T{ T}@T{ T}@T{ T} T{ Wrapped T}@T{ No T}@T{ 128 Bit T}@T{ ENC(iv_64b + key1_128b + key2_128b) T} T{ T}@T{ T}@T{ T}@T{ T} T{ Wrapped T}@T{ No T}@T{ 256 Bit T}@T{ ENC(iv_64b + key1_256b + key2_256b) T} T{ T}@T{ T}@T{ T}@T{ T} T{ Wrapped T}@T{ Yes T}@T{ 128 Bit T}@T{ ENC(iv_64b + key1_128b + key2_128b + keytag_64b) T} T{ T}@T{ T}@T{ T}@T{ T} T{ Wrapped T}@T{ Yes T}@T{ 256 Bit T}@T{ ENC(iv_64b + key1_256b + key2_256b + keytag_64b) T} .TE .PP Where ENC() is AES key wrap algorithm and iv_64b is 0xA6A6A6A6A6A6A6A6 as per the NIST SP 800-38F AES key wrap spec. .PP The following example shows how to wrap a 128 bit key that has keytag using a 128 bit import KEK in OpenSSL: .IP .nf \f[C] #include unsigned char import_kek[16]; /* 128 bit import KEK in plaintext for wrapping */ unsigned char iv[8] = {0xA6, 0xA6, 0xA6, 0xA6, 0xA6, 0xA6, 0xA6, 0xA6}; /* * Indexes 0-15 are key1 in plaintext, indexes 16-31 are key2 in plaintext, * and indexes 32-39 are key_tag in plaintext. 
*/ unsigned char key[40]; unsigned char wrapped_key[48]; EVP_CIPHER_CTX *ctx; int len; ctx = EVP_CIPHER_CTX_new(); EVP_CIPHER_CTX_set_flags(ctx, EVP_CIPHER_CTX_FLAG_WRAP_ALLOW); EVP_EncryptInit_ex(ctx, EVP_aes_128_wrap(), NULL, import_kek, iv); EVP_EncryptUpdate(ctx, wrapped_key, &len, key, sizeof(key)); EVP_EncryptFinal_ex(ctx, wrapped_key + len, &len); EVP_CIPHER_CTX_free(ctx); \f[R] .fi .RE .TP \f[I]comp_mask\f[R] Currently can be the following value: .RS .PP \f[B]MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN\f[R], which indicates that \f[I]crypto_login\f[R] field is applicable. .RE .TP \f[I]crypto_login\f[R] Pointer to a crypto login object. If set to a valid crypto login object, indicates that this is a wrapped DEK that will be created using the given crypto login object. If set to NULL, indicates that this is a plaintext DEK. Must be NULL if \f[B]MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN\f[R] is not set. Only relevant when comp_mask is set with \f[I]MLX5DV_DEK_INIT_ATTR_CRYPTO_LOGIN\f[R] .SS dek .IP .nf \f[C] Pointer to an existing DEK to query or to destroy. \f[R] .fi .SS attr .IP .nf \f[C] DEK attributes to be populated when querying a DEK. \f[R] .fi .IP .nf \f[C] struct mlx5dv_dek_attr { enum mlx5dv_dek_state state; char opaque[8]; uint64_t comp_mask; }; \f[R] .fi .TP \f[I]state\f[R] The state of the DEK, can be one of the following .RS .TP \f[B]MLX5DV_DEK_STATE_READY\f[R] The key is ready for use. This is the state of the key when it is first created. .TP \f[B]MLX5DV_DEK_STATE_ERROR\f[R] The key is unusable. The key needs to be destroyed and re-created in order to be used. This can happen, for example, due to DEK memory corruption. .RE .TP \f[I]opaque\f[R] Plaintext metadata to describe the key. .TP \f[I]comp_mask\f[R] Reserved for future extension, must be 0 now. .SH RETURN VALUE .PP \f[B]mlx5dv_dek_create()\f[R] returns a pointer to a new \f[I]struct mlx5dv_dek\f[R] on success. On error NULL is returned and errno is set. .PP \f[B]mlx5dv_dek_query()\f[R] returns 0 on success and updates \f[I]attr\f[R] with the queried DEK attributes. On error errno value is returned. .PP \f[B]mlx5dv_dek_destroy()\f[R] returns 0 on success and errno value on error. .SH SEE ALSO .PP \f[B]mlx5dv_crypto_login\f[R](3), \f[B]mlx5dv_crypto_login_create\f[R](3), \f[B]mlx5dv_query_device\f[R](3) .SH AUTHORS .PP Avihai Horon rdma-core-56.1/buildlib/pandoc-prebuilt/697d7ae1cfe1af4b9264377df95979884266183b0000644000175100002000000000212614773456412033277 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "ibv_import_device" "3" "2020-5-3" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_import_device - import a device from a given command FD .SH SYNOPSIS .IP .nf \f[C] #include struct ibv_context *ibv_import_device(int cmd_fd); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_import_device()\f[R] returns an \f[I]ibv_context\f[R] pointer that is associated with the given \f[I]cmd_fd\f[R]. .PP The \f[I]cmd_fd\f[R] is obtained from the ibv_context cmd_fd member, which must be dup\[cq]d (eg by dup(), SCM_RIGHTS, etc) before being passed to ibv_import_device(). .PP Once the \f[I]ibv_context\f[R] usage has been ended \f[I]ibv_close_device()\f[R] should be called. This call may cleanup whatever is needed/opposite of the import including closing the command FD. .SH RETURN VALUE .PP \f[B]ibv_import_device()\f[R] returns a pointer to the allocated RDMA context, or NULL if the request fails. 
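.PP
For example, a minimal sketch (the helper name is illustrative) that
imports a context from an existing one inside the same process; in
practice the FD is typically passed between processes, e.g.\ via
SCM_RIGHTS:
.IP
.nf
\f[C]
#include <unistd.h>
#include <infiniband/verbs.h>

static struct ibv_context *import_example(struct ibv_context *orig)
{
    struct ibv_context *imported;
    int fd = dup(orig->cmd_fd); /* the FD must be dup'd first */

    if (fd < 0)
        return NULL;

    imported = ibv_import_device(fd);
    if (!imported)
        close(fd); /* import failed, the dup'd FD is still ours */

    return imported; /* released later with ibv_close_device() */
}
\f[R]
.fi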
.SH SEE ALSO .PP \f[B]ibv_open_device\f[R](3), \f[B]ibv_close_device\f[R](3), .SH AUTHOR .PP Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/4d2afe628b50572105e0483d1f946091a6a6388f0000644000175100002000000001223614773456414033155 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_crypto_login_create / mlx5dv_crypto_login_query / mlx5dv_crypto_login_destroy" "3" "" "" "" .hy .SH NAME .PP mlx5dv_crypto_login_create - Creates a crypto login object .PP mlx5dv_crypto_login_query - Queries the given crypto login object .PP mlx5dv_crypto_login_destroy - Destroys the given crypto login object .SH SYNOPSIS .IP .nf \f[C] #include struct mlx5dv_crypto_login_obj * mlx5dv_crypto_login_create(struct ibv_context *context, struct mlx5dv_crypto_login_attr_ex *login_attr); int mlx5dv_crypto_login_query(struct mlx5dv_crypto_login_obj *crypto_login, struct mlx5dv_crypto_login_query_attr *query_attr); int mlx5dv_crypto_login_destroy(struct mlx5dv_crypto_login_obj *crypto_login); \f[R] .fi .SH DESCRIPTION .PP When using a crypto engine that is in wrapped import method, a valid crypto login object must be provided in order to create and query wrapped Data Encryption Keys (DEKs). .PP A valid crypto login object is necessary only to create and query wrapped DEKs. Existing DEKs that were previously created don\[cq]t need a valid crypto login object in order to be used (in MKey or during traffic). .PP \f[B]mlx5dv_crypto_login_create()\f[R] creates and returns a crypto login object with the credential given in \f[I]login_attr\f[R]. Only one crypto login object can be created per device context. The created crypto login object must be provided to \f[B]mlx5dv_dek_create()\f[R] in order to create wrapped DEKs. .PP \f[B]mlx5dv_crypto_login_query()\f[R] queries the crypto login object \f[I]crypto_login\f[R] and returns the queried attributes in \f[I]query_attr\f[R]. .PP \f[B]mlx5dv_crypto_login_destroy()\f[R] destroys the given crypto login object. .SH ARGUMENTS .SS context .PP The device context that will be associated with the crypto login object. .SS login_attr .PP Crypto extended login attributes specify the credential to login with and the import KEK to be used for secured communications done with the crypto login object. .IP .nf \f[C] struct mlx5dv_crypto_login_attr_ex { uint32_t credential_id; uint32_t import_kek_id; const void *credential; size_t credential_len; uint64_t comp_mask; }; \f[R] .fi .TP \f[I]credential_id\f[R] An ID of a credential, from the credentials stored on the device, that indicates the credential that should be validated against the credential provided in \f[I]credential\f[R]. .TP \f[I]import_kek_id\f[R] An ID of an import KEK, from the import KEKs stored on the device, that indicates the import KEK that will be used for unwrapping the credential provided in \f[I]credential\f[R] and also for all other secured communications done with the crypto login object. .TP \f[I]credential\f[R] The credential to login with. Credential is a piece of data used to authenticate the user for crypto login. The credential in \f[I]credential\f[R] is validated against the credential indicated by \f[I]credential_id\f[R], which is stored on the device. The credentials must match in order for the crypto login to succeed. \f[I]credential\f[R] must be provided wrapped by the AES key wrap algorithm using the import KEK indicated by \f[I]import_kek_id\f[R]. 
\f[I]credential\f[R] format is ENC(iv_64b + plaintext_credential) where ENC() is AES key wrap algorithm and iv_64b is 0xA6A6A6A6A6A6A6A6 as per the NIST SP 800-38F AES key wrap spec, and plaintext_credential is the credential value stored on the device. .TP \f[I]credential_len\f[R] The length of the provided \f[I]credential\f[R] value in bytes. .TP \f[I]comp_mask\f[R] Reserved for future extension, must be 0 now. .SS query_attr .IP .nf \f[C] Crypto login attributes to be populated when querying a crypto login object. \f[R] .fi .IP .nf \f[C] struct mlx5dv_crypto_login_query_attr { enum mlx5dv_crypto_login_state state; uint64_t comp_mask; }; \f[R] .fi .TP \f[I]state\f[R] The state of the crypto login object, can be one of the following .RS .TP \f[B]MLX5DV_CRYPTO_LOGIN_STATE_VALID\f[R] The crypto login object is valid and can be used. .TP \f[B]MLX5DV_CRYPTO_LOGIN_STATE_INVALID\f[R] The crypto login object is invalid and cannot be used. A valid crypto login object can become invalid if the credential or the import KEK used in the crypto login object were deleted while in use (for example by a crypto officer). In this case, \f[B]mlx5dv_crypto_login_destroy()\f[R] should be called to destroy the invalid crypto login object and if still necessary, \f[B]mlx5dv_crypto_login_create()\f[R] should be called to create a new crypto login object with valid credential and import KEK. .RE .TP \f[I]comp_mask\f[R] Reserved for future extension, must be 0 now. .SH RETURN VALUE .PP \f[B]mlx5dv_crypto_login_create()\f[R] returns a pointer to a new valid \f[I]struct mlx5dv_crypto_login_obj\f[R] on success. On error NULL is returned and errno is set. .PP \f[B]mlx5dv_crypto_login_query()\f[R] returns 0 on success and fills \f[I]query_attr\f[R] with the queried attributes. On error, errno is returned. .PP \f[B]mlx5dv_crypto_login_destroy()\f[R] returns 0 on success and errno on error. .SH SEE ALSO .PP \f[B]mlx5dv_dek_create\f[R](3), \f[B]mlx5dv_query_device\f[R](3) .SH AUTHORS .PP Avihai Horon rdma-core-56.1/buildlib/pandoc-prebuilt/0ba59aab27e17cce9897cc6bb8b15983f98d220c0000644000175100002000000000162214773456416033625 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_modify_qp_udp_sport" "3" "" "" "" .hy .SH NAME .PP mlx5dv_modify_qp_udp_sport - Modify the UDP source port of a given QP .SH SYNOPSIS .IP .nf \f[C] #include int mlx5dv_modify_qp_udp_sport(struct ibv_qp *qp, uint16_t udp_sport) \f[R] .fi .SH DESCRIPTION .PP The UDP source port is used to create entropy for network routers (ECMP), load balancers and 802.3ad link aggregation switching that are not aware of RoCE IB headers. .PP This API enables modifying the configured UDP source port of a given RC/UC QP when QP is in RTS state. .SH ARGUMENTS .TP \f[I]qp\f[R] The ibv_qp object to issue the action on. .TP \f[I]udp_sport\f[R] The UDP source port to set for the QP. .SH RETURN VALUE .PP Returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH AUTHOR .PP Maor Gottlieb rdma-core-56.1/buildlib/pandoc-prebuilt/c10b498742b7bd02b349331d8ab6ed7a2951bbc50000644000175100002000000000226014773456413033431 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "RDMA_ESTABLISH" "3" "2019-01-16" "librdmacm" "Librdmacm Programmer\[cq]s Manual" .hy .SH NAME .PP rdma_establish - Complete an active connection request. 
.SH SYNOPSIS .IP .nf \f[C] #include <rdma/rdma_cma.h> int rdma_establish(struct rdma_cm_id *id); \f[R] .fi .SH DESCRIPTION .PP \f[B]rdma_establish()\f[R] acknowledges an incoming connection response event and completes the connection establishment. .PP Notes: .PP If a QP has not been created on the rdma_cm_id, this function should be called by the active side to complete the connection after receiving the connect response event. .PP This will trigger a connection established event on the passive side. .PP This function should not be used on an rdma_cm_id on which a QP has been created. .SH ARGUMENTS .TP \f[I]id\f[R] RDMA identifier. .SH RETURN VALUE .PP \f[B]rdma_establish()\f[R] returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH SEE ALSO .PP \f[B]rdma_connect\f[R](3), \f[B]rdma_disconnect\f[R](3), \f[B]rdma_get_cm_event\f[R](3) .SH AUTHORS .PP Danit Goldberg .PP Yossi Itigin
rdma-core-56.1/buildlib/pandoc-prebuilt/23660644c7d16519530ca5d9fe12f0f800e1f1c00000644000175100002000000000207214773456416033213 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_open_device" "3" "" "" "" .hy .SH NAME .PP mlx5dv_open_device - Open an RDMA device context for the mlx5 provider .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/mlx5dv.h> struct ibv_context * mlx5dv_open_device(struct ibv_device *device, struct mlx5dv_context_attr *attr); \f[R] .fi .SH DESCRIPTION .PP Open an RDMA device context with specific mlx5 provider attributes. .SH ARGUMENTS .TP \f[I]device\f[R] RDMA device to open. .SS \f[I]attr\f[R] argument .IP .nf \f[C] struct mlx5dv_context_attr { uint32_t flags; uint64_t comp_mask; }; \f[R] .fi .TP \f[I]flags\f[R] A bitwise OR of the various values described below. .RS .PP \f[I]MLX5DV_CONTEXT_FLAGS_DEVX\f[R]: Allocate a DEVX context .RE .TP \f[I]comp_mask\f[R] Bitmask specifying what fields in the structure are valid. .SH RETURN VALUE .PP Returns a pointer to the allocated device context, or NULL if the request fails. .SH SEE ALSO .PP \f[I]ibv_open_device(3)\f[R] .SH AUTHOR .PP Yishai Hadas
rdma-core-56.1/buildlib/pandoc-prebuilt/6a82b0bc695f8fd980a86aefaf5890804a0107610000644000175100002000000000270714773456412033402 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "ibv_alloc_null_mr" "3" "2018-6-1" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_alloc_null_mr - allocate a null memory region (MR) .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/verbs.h> struct ibv_mr *ibv_alloc_null_mr(struct ibv_pd *pd); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_alloc_null_mr()\f[R] allocates a null memory region (MR) that is associated with the protection domain \f[I]pd\f[R]. .PP A null MR discards all data written to it, and always returns 0 on read. It has the maximum length; only the lkey is valid, and the MR is not exposed as an rkey. .PP A device should implement the null MR in a way that bypasses PCI transfers, internally discarding or sourcing 0 data. This provides a way to avoid PCI bus transfers by using a scatter/gather list in commands if applications do not intend to access the data, or need data to be 0 filled. .PP Specifically, upon \f[B]ibv_post_send()\f[R] the device skips PCI read cycles and upon \f[B]ibv_post_recv()\f[R] the device skips PCI write cycles, which improves performance. .PP \f[B]ibv_dereg_mr()\f[R] deregisters the MR. The use of ibv_rereg_mr() or ibv_bind_mw() with this MR is invalid.
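.PP
For example, a minimal sketch (the helper name and length are
illustrative) that posts a receive whose payload is discarded through a
null MR:
.IP
.nf
\f[C]
#include <stdint.h>
#include <infiniband/verbs.h>

/* Assumes qp and pd were created earlier; the null MR is leaked here
 * for brevity and would normally be reused and finally released with
 * ibv_dereg_mr(). */
static int post_discarding_recv(struct ibv_qp *qp, struct ibv_pd *pd)
{
    struct ibv_mr *null_mr = ibv_alloc_null_mr(pd);
    struct ibv_sge sge;
    struct ibv_recv_wr wr = {}, *bad;

    if (!null_mr)
        return -1;

    sge.addr = 0;             /* data sink; address is not meaningful */
    sge.length = 4096;        /* arbitrary length */
    sge.lkey = null_mr->lkey; /* only the lkey is valid */

    wr.wr_id = 1;
    wr.sg_list = &sge;
    wr.num_sge = 1;

    return ibv_post_recv(qp, &wr, &bad);
}
\f[R]
.fi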
.SH RETURN VALUE .PP \f[B]ibv_alloc_null_mr()\f[R] returns a pointer to the allocated MR, or NULL if the request fails. .SH SEE ALSO .PP \f[B]ibv_reg_mr\f[R](3), \f[B]ibv_dereg_mr\f[R](3), .SH AUTHOR .PP Yonatan Cohen rdma-core-56.1/buildlib/pandoc-prebuilt/77100c5a6ee765b51de802a3368e8e9fdea4914c0000644000175100002000000000140514773456412033455 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_GET_DEVICE_GUID" "3" "2006-10-31" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_get_device_guid - get an RDMA device\[cq]s GUID .SH SYNOPSIS .IP .nf \f[C] #include uint64_t ibv_get_device_guid(struct ibv_device *device); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_get_device_guid()\f[R] returns the Global Unique IDentifier (GUID) of the RDMA device \f[I]device\f[R]. .SH RETURN VALUE .PP \f[B]ibv_get_device_guid()\f[R] returns the GUID of the device in network byte order. .SH SEE ALSO .PP \f[B]ibv_get_device_index\f[R](3), \f[B]ibv_get_device_list\f[R](3), \f[B]ibv_get_device_name\f[R](3), \f[B]ibv_open_device\f[R](3) .SH AUTHOR .PP Dotan Barak rdma-core-56.1/buildlib/pandoc-prebuilt/10cc2151fa6d9310030f90db8112aa7b9773508e0000644000175100002000000000162714773456416033206 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_vfio_get_events_fd" "3" "" "" "" .hy .SH NAME .PP mlx5dv_vfio_get_events_fd - Get the file descriptor to manage driver events. .SH SYNOPSIS .IP .nf \f[C] #include int mlx5dv_vfio_get_events_fd(struct ibv_context *ctx); \f[R] .fi .SH DESCRIPTION .PP Returns the file descriptor to be used for managing driver events. .SH ARGUMENTS .TP \f[I]ctx\f[R] device context that was opened for VFIO by calling mlx5dv_get_vfio_device_list(). .SH RETURN VALUE .PP Returns the internal matching file descriptor. .SH NOTES .PP Client code should poll the returned file descriptor and once there is some data to be managed immediately call \f[I]mlx5dv_vfio_process_events()\f[R]. .SH SEE ALSO .PP \f[I]ibv_open_device(3)\f[R] \f[I]ibv_free_device_list(3)\f[R] \f[I]mlx5dv_get_vfio_device_list(3)\f[R] .SH AUTHOR .PP Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/971674ea9c99ebc02210ea2412f59a09a24327840000644000175100002000000000415314773456417033154 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBIDSVERIFY" 8 "2017-08-21" "" "Open IB Diagnostics" .SH NAME ibidsverify \- validate IB identifiers in subnet and report errors .SH SYNOPSIS .sp ibidsverify.pl [\-h] [\-R] .SH DESCRIPTION .sp ibidsverify.pl is a perl script which uses a full topology file that was created by ibnetdiscover, scans the network to validate the LIDs and GUIDs in the subnet. The validation consists of checking that there are no zero or duplicate identifiers. 
.sp Finally, ibidsverify.pl will also reuse the cached ibnetdiscover output from some of the other diag tools which makes it a bit faster than running ibnetdiscover from scratch. .SH OPTIONS .sp \fB\-R\fP Recalculate the ibnetdiscover information, ie do not use the cached information. This option is slower but should be used if the diag tools have not been used for some time or if there are other reasons to believe the fabric has changed. .sp \fB\-C \fP use the specified ca_name. .sp \fB\-P \fP use the specified ca_port. .SH EXIT STATUS .sp Exit status is 1 if errors are found, 0 otherwise. .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .SH SEE ALSO .sp \fBibnetdiscover(8)\fP .SH AUTHOR .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%halr@voltaire.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/7fbac8884b21e9bc3bbb20607d56da0b48f2a1560000644000175100002000000001133414773456420033572 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "IBSWITCHES" 8 "2016-12-20" "" "OpenIB Diagnostics" .SH NAME IBSWITCHES \- show InfiniBand switch nodes in topology .SH SYNOPSIS .sp ibswitches [options] [] .SH DESCRIPTION .sp ibswitches is a script which either walks the IB subnet topology or uses an already saved topology file and extracts the switch nodes. .SH OPTIONS .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. 
.sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .\" Common text to describe the node name map file. . .SS NODE NAME MAP FILE FORMAT .sp The node name map is used to specify user friendly names for nodes in the output. GUIDs are used to perform the lookup. .sp This functionality is provided by the opensm\-libs package. See \fBopensm(8)\fP for the file location for your installation. .sp \fBGenerically:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # comment "" .ft P .fi .UNINDENT .UNINDENT .sp \fBExample:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # IB1 # Line cards 0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D" # Spines 0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" # GUID Node Name 0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D" .ft P .fi .UNINDENT .UNINDENT .SH SEE ALSO .sp ibnetdiscover(8) .SH DEPENDENCIES .sp ibnetdiscover, ibnetdiscover format .SH AUTHOR .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%halr@voltaire.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/bd6ec68a7f862bcc7b9254518ba938f0cacab1020000644000175100002000000000213714773456413033664 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "HNSDV" "7" "2024-02-06" "hns" "HNS Direct Verbs Manual" .hy .SH NAME .PP hnsdv - Direct verbs for hns devices .PP This provides low level access to hns devices to perform direct operations, without general branching performed by libibverbs. .SH DESCRIPTION .PP The libibverbs API is an abstract one. It is agnostic to any underlying provider specific implementation. While this abstraction has the advantage of user applications portability it has a performance penalty. Besides, some provider specific features that are directly facing users are not available through libibverbs. For some applications these demands are more important than portability. .PP The hns direct verbs API is intended for such applications. It exposes hns specific low level operations, allowing the application to bypass the libibverbs API and enable some hns specific features. .PP The direct include of hnsdv.h together with linkage to hns library will allow usage of this new interface. 
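.PP
For example, a minimal sketch (the function name is illustrative) that
gates hns-specific calls on the provider check described in
\f[B]hnsdv_is_supported\f[R](3):
.IP
.nf
\f[C]
#include <stdio.h>
#include <infiniband/hnsdv.h>

static void use_hns_device(struct ibv_device *dev)
{
    if (!hnsdv_is_supported(dev)) {
        fprintf(stderr, "%s: not an hns device\[rs]n",
                ibv_get_device_name(dev));
        return;
    }
    /* hnsdv_* calls are valid for this device from here on. */
}
\f[R]
.fi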
.SH SEE ALSO .PP \f[B]verbs\f[R](7) .SH AUTHORS .PP Junxian Huang
rdma-core-56.1/buildlib/pandoc-prebuilt/bde0f0fb11d80958e182842cb166935bb5be33470000644000175100002000000000225114773456412033362 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_GET_PKEY_INDEX" "3" "2018-07-16" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_get_pkey_index - obtain the index in the P_Key table of a P_Key .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/verbs.h> int ibv_get_pkey_index(struct ibv_context *context, uint8_t port_num, __be16 pkey); \f[R] .fi .SH DESCRIPTION .PP Every InfiniBand HCA maintains a P_Key table for each of its ports that is indexed by an integer and with a P_Key in each element. Certain InfiniBand data structures that work with P_Keys expect a P_Key index, e.g.\ \f[B]struct ibv_qp_attr\f[R] and \f[B]struct ib_mad_addr\f[R]. Hence the function \f[B]ibv_get_pkey_index()\f[R], which accepts a P_Key in network byte order and returns an index into the P_Key table. .SH RETURN VALUE .PP \f[B]ibv_get_pkey_index()\f[R] returns the P_Key index on success, and -1 on error. .SH SEE ALSO .PP \f[B]ibv_open_device\f[R](3), \f[B]ibv_query_device\f[R](3), \f[B]ibv_query_gid\f[R](3), \f[B]ibv_query_pkey\f[R](3), \f[B]ibv_query_port\f[R](3) .SH AUTHOR .PP Bart Van Assche
rdma-core-56.1/buildlib/pandoc-prebuilt/2aa4f0cffa6107c201fba341669d2e08cfc9ac090000644000175100002000000000125114773456414033641 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "HNSDV_IS_SUPPORTED" "3" "2024-02-06" "hns" "HNS Programmer\[cq]s Manual" .hy .SH NAME .PP hnsdv_is_supported - Check whether an RDMA device is implemented by the hns provider .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/hnsdv.h> bool hnsdv_is_supported(struct ibv_device *device); \f[R] .fi .SH DESCRIPTION .PP hnsdv functions may be called only if this function returns true for the RDMA device. .SH ARGUMENTS .TP \f[I]device\f[R] RDMA device to check. .SH RETURN VALUE .PP Returns true if the device is implemented by the hns provider. .SH SEE ALSO .PP \f[I]hnsdv(7)\f[R] .SH AUTHOR .PP Junxian Huang
rdma-core-56.1/buildlib/pandoc-prebuilt/983dc82fa7ae24ca010e5d6e9d76e867250411500000644000175100002000000000303114773456413033304 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_REQ_NOTIFY_CQ" "3" "2006-10-31" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_req_notify_cq - request completion notification on a completion queue (CQ) .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/verbs.h> int ibv_req_notify_cq(struct ibv_cq *cq, int solicited_only); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_req_notify_cq()\f[R] requests a completion notification on the completion queue (CQ) \f[I]cq\f[R]. .PP Upon the addition of a new CQ entry (CQE) to \f[I]cq\f[R], a completion event will be added to the completion channel associated with the CQ. If the argument \f[I]solicited_only\f[R] is zero, a completion event is generated for any new CQE. If \f[I]solicited_only\f[R] is non-zero, an event is generated only for a new CQE that is considered \[lq]solicited.\[rq] A CQE is solicited if it is a receive completion for a message with the Solicited Event header bit set, or if the status is not successful. All other successful receive completions, and any successful send completion, are unsolicited.
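.PP
For example, the canonical arm/wait/poll loop (the helper name is
illustrative; \f[I]cq\f[R] is assumed to have been created on the
completion channel \f[I]ch\f[R], and error handling is abbreviated):
.IP
.nf
\f[C]
#include <infiniband/verbs.h>

static int wait_one_completion(struct ibv_comp_channel *ch,
                               struct ibv_cq *cq, struct ibv_wc *wc)
{
    struct ibv_cq *ev_cq;
    void *ev_ctx;

    if (ibv_req_notify_cq(cq, 0)) /* arm for any new CQE */
        return -1;
    if (ibv_get_cq_event(ch, &ev_cq, &ev_ctx))
        return -1;
    ibv_ack_cq_events(ev_cq, 1);
    if (ibv_req_notify_cq(ev_cq, 0)) /* re-arm before polling */
        return -1;
    return ibv_poll_cq(ev_cq, 1, wc); /* reap the completion */
}
\f[R]
.fi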
.SH RETURN VALUE .PP \f[B]ibv_req_notify_cq()\f[R] returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH NOTES .PP The request for notification is \[lq]one shot.\[rq] Only one completion event will be generated for each call to \f[B]ibv_req_notify_cq()\f[R]. .SH SEE ALSO .PP \f[B]ibv_create_comp_channel\f[R](3), \f[B]ibv_create_cq\f[R](3), \f[B]ibv_get_cq_event\f[R](3) .SH AUTHOR .PP Dotan Barak rdma-core-56.1/buildlib/pandoc-prebuilt/41bbb0bed7a781be59e8c0dcd8b7278af2ce68820000644000175100002000000000177014773456412033761 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "UMAD_INIT" "3" "May 21, 2007" "OpenIB" "OpenIB Programmer\[cq]s Manual" .hy .SH NAME .PP umad_init, umad_done - perform library initialization and finalization .SH SYNOPSIS .IP .nf \f[C] #include int umad_init(void); int umad_done(void); \f[R] .fi .SH DESCRIPTION .PP \f[B]umad_init()\f[R] and \f[B]umad_done()\f[R] do nothing. .SH RETURN VALUE .PP Always 0. .SH COMPATIBILITY .PP Versions prior to release 18 of the library require \f[B]umad_init()\f[R] to be called prior to using any other library functions. Old versions could return a failure code of -1 from \f[B]umad_init()\f[R]. .PP For compatibility, applications should continue to call \f[B]umad_init()\f[R], and check the return code, prior to calling other \f[B]umad_\f[R] functions. If \f[B]umad_init()\f[R] returns an error, then no further use of the umad library should be attempted. .SH AUTHORS .PP Dotan Barak , Hal Rosenstock rdma-core-56.1/buildlib/pandoc-prebuilt/f31915665317aa73d3c9d93494ac44fd7d5388660000644000175100002000000000163014773456415033200 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_dm_map_op_addr" "3" "2021-1-21" "mlx5" "mlx5 Programmer\[cq]s Manual" .hy .SH NAME .PP mlx5dv_dm_map_op_addr - Get operation address of a device memory (DM) .SH SYNOPSIS .IP .nf \f[C] #include void *mlx5dv_dm_map_op_addr(struct ibv_dm *dm, uint8_t op); \f[R] .fi .SH DESCRIPTION .PP \f[B]mlx5dv_dm_map_op_addr()\f[R] returns a mmaped address to the device memory for the requested \f[B]op\f[R]. .SH ARGUMENTS .TP \f[I]dm\f[R] .IP .nf \f[C] The associated ibv_dm for this operation. \f[R] .fi .TP \f[I]op\f[R] .IP .nf \f[C] Indicates the DM operation type, based on device specification. \f[R] .fi .SH RETURN VALUE .PP Returns a pointer to the mmaped address, on error NULL will be returned and errno will be set. .SH SEE ALSO .PP \f[B]ibv_alloc_dm\f[R](3), \f[B]mlx5dv_alloc_dm\f[R](3), .SH AUTHOR .PP Maor Gottlieb rdma-core-56.1/buildlib/pandoc-prebuilt/cf4e4cd11a7895e2b33c4b3e1625393ebf1054520000644000175100002000000000400014773456414033347 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_create_cq" "3" "2018-9-1" "mlx5" "mlx5 Programmer\[cq]s Manual" .hy .SH NAME .PP mlx5dv_create_cq - creates a completion queue (CQ) .SH SYNOPSIS .IP .nf \f[C] #include struct ibv_cq_ex *mlx5dv_create_cq(struct ibv_context *context, struct ibv_cq_init_attr_ex *cq_attr, struct mlx5dv_cq_init_attr *mlx5_cq_attr); \f[R] .fi .SH DESCRIPTION .PP \f[B]mlx5dv_create_cq()\f[R] creates a completion queue (CQ) with specific driver properties. 
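.PP
For example, a minimal sketch (the helper name and CQ depth are
illustrative) that creates a CQ with 128-byte CQEs; the attribute
fields used here are described below:
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

static struct ibv_cq_ex *create_cq_128(struct ibv_context *ctx,
                                       struct ibv_comp_channel *ch)
{
    struct ibv_cq_init_attr_ex cq_attr = {
        .cqe = 256, /* arbitrary depth */
        .channel = ch,
        .comp_vector = 0,
    };
    struct mlx5dv_cq_init_attr dv_attr = {
        .comp_mask = MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE,
        .cqe_size = 128,
    };

    return mlx5dv_create_cq(ctx, &cq_attr, &dv_attr);
}
\f[R]
.fi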
.SH ARGUMENTS .PP Please see \f[B]ibv_create_cq_ex(3)\f[R] man page for \f[B]context\f[R] and \f[B]cq_attr\f[R] .SS mlx5_cq_attr .IP .nf \f[C] struct mlx5dv_cq_init_attr { uint64_t comp_mask; uint8_t cqe_comp_res_format; uint32_t flags; uint16_t cqe_size; }; \f[R] .fi .TP \f[I]comp_mask\f[R] Bitmask specifying what fields in the structure are valid: .RS .PP MLX5DV_CQ_INIT_ATTR_MASK_COMPRESSED_CQE enables creating a CQ in a mode that few CQEs may be compressed into a single CQE, valid values in \f[I]cqe_comp_res_format\f[R] .PP MLX5DV_CQ_INIT_ATTR_MASK_FLAGS valid values in \f[I]flags\f[R] .PP MLX5DV_CQ_INIT_ATTR_MASK_CQE_SIZE valid values in \f[I]cqe_size\f[R] .RE .TP \f[I]cqe_comp_res_format\f[R] A bitwise OR of the various CQE response formats of the responder side: .RS .PP MLX5DV_CQE_RES_FORMAT_HASH CQE compression with hash .PP MLX5DV_CQE_RES_FORMAT_CSUM CQE compression with RX checksum .PP MLX5DV_CQE_RES_FORMAT_CSUM_STRIDX CQE compression with stride index .RE .TP \f[I]flags\f[R] A bitwise OR of the various values described below: .RS .PP MLX5DV_CQ_INIT_ATTR_FLAGS_CQE_PAD create a padded 128B CQE .RE .TP \f[I]cqe_size\f[R] configure the CQE size to be 64 or 128 bytes other values will fail mlx5dv_create_cq. .SH RETURN VALUE .PP \f[B]mlx5dv_create_cq()\f[R] returns a pointer to the created CQ, or NULL if the request fails and errno will be set. .SH SEE ALSO .PP \f[B]ibv_create_cq_ex\f[R](3), .SH AUTHOR .PP Yonatan Cohen rdma-core-56.1/buildlib/pandoc-prebuilt/e44e94a238c3c63d976a79adb52e34fb24140a850000644000175100002000000000477414773456413033405 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_REREG_MR" "3" "2016-03-13" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_rereg_mr - re-register a memory region (MR) .SH SYNOPSIS .IP .nf \f[C] #include int ibv_rereg_mr(struct ibv_mr *mr, int flags, struct ibv_pd *pd, void *addr, size_t length, int access); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_rereg_mr()\f[R] Modifies the attributes of an existing memory region (MR) \f[I]mr\f[R]. Conceptually, this call performs the functions deregister memory region followed by register memory region. Where possible, resources are reused instead of deallocated and reallocated. .PP \f[I]flags\f[R] is a bit-mask used to indicate which of the following properties of the memory region are being modified. Flags should be a combination (bit field) of: .TP \f[B]IBV_REREG_MR_CHANGE_TRANSLATION \f[R] Change translation (location and length) .TP \f[B]IBV_REREG_MR_CHANGE_PD \f[R] Change protection domain .TP \f[B]IBV_REREG_MR_CHANGE_ACCESS \f[R] Change access flags .PP When \f[B]IBV_REREG_MR_CHANGE_PD\f[R] is used, \f[I]pd\f[R] represents the new PD this MR should be registered to. .PP When \f[B]IBV_REREG_MR_CHANGE_TRANSLATION\f[R] is used, \f[I]addr\f[R]. represents the virtual address (user-space pointer) of the new MR, while \f[I]length\f[R] represents its length. .PP The access and other flags are represented in the field \f[I]access\f[R]. This field describes the desired memory protection attributes; it is either 0 or the bitwise OR of one or more of ibv_access_flags. .SH RETURN VALUE .PP \f[B]ibv_rereg_mr()\f[R] returns 0 on success, otherwise an error has occurred, \f[I]enum ibv_rereg_mr_err_code\f[R] represents the error as of below. .PP IBV_REREG_MR_ERR_INPUT - Old MR is valid, an input error was detected by libibverbs. 
.PP IBV_REREG_MR_ERR_DONT_FORK_NEW - Old MR is valid, failed via don\[cq]t fork on new address range. .PP IBV_REREG_MR_ERR_DO_FORK_OLD - New MR is valid, failed via do fork on old address range. .PP IBV_REREG_MR_ERR_CMD - MR shouldn\[cq]t be used, command error. .PP IBV_REREG_MR_ERR_CMD_AND_DO_FORK_NEW - MR shouldn\[cq]t be used, command error, invalid fork state on new address range. .SH NOTES .PP Even on a failure, the user still needs to call ibv_dereg_mr on this MR. .SH SEE ALSO .PP \f[B]ibv_dereg_mr\f[R](3), \f[B]ibv_reg_mr\f[R](3) .SH AUTHORS .PP Matan Barak , Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/5149808f7f084016a7e6ee8afa1b217bcd78f25d0000644000175100002000000000166614773456412033470 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_GET_SRQ_NUM" "3" "2013-06-26" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_get_srq_num - return srq number associated with the given shared receive queue (SRQ) .SH SYNOPSIS .IP .nf \f[C] #include int ibv_get_srq_num(struct ibv_srq *srq, uint32_t *srq_num); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_get_srq_num()\f[R] return srq number associated with the given XRC shared receive queue The argument \f[I]srq\f[R] is an ibv_srq struct, as defined in . \f[I]srq_num\f[R] is an output parameter that holds the returned srq number. .SH RETURN VALUE .PP \f[B]ibv_get_srq_num()\f[R] returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH SEE ALSO .PP \f[B]ibv_alloc_pd\f[R](3), \f[B]ibv_create_srq_ex\f[R](3), \f[B]ibv_modify_srq\f[R](3) .SH AUTHOR .PP Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/cbf7103380d562ab4c0ca6b71d0c6e06c1729ecf0000644000175100002000000001007214773456414033561 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_crypto_login / mlx5dv_crypto_login_query_state / mlx5dv_crypto_logout" "3" "" "" "" .hy .SH NAME .PP mlx5dv_crypto_login - Creates a crypto login session .PP mlx5dv_crypto_login_query_state - Queries the state of the current crypto login session .PP mlx5dv_crypto_logout - Logs out from the current crypto login session .SH SYNOPSIS .IP .nf \f[C] #include int mlx5dv_crypto_login(struct ibv_context *context, struct mlx5dv_crypto_login_attr *login_attr); int mlx5dv_crypto_login_query_state(struct ibv_context *context, enum mlx5dv_crypto_login_state *state); int mlx5dv_crypto_logout(struct ibv_context *context); \f[R] .fi .SH DESCRIPTION .PP When using a crypto engine that is in wrapped import method, an active crypto login session must be present in order to create and query Data Encryption Keys (DEKs). .PP \f[B]mlx5dv_crypto_login()\f[R] Creates a crypto login session with the credential given in \f[I]login_attr\f[R] and associates it with \f[I]context\f[R]. Only one active crypto login session can be associated per device context. .PP \f[B]mlx5dv_crypto_login_query_state()\f[R] queries the state of the crypto login session associated with \f[I]context\f[R] and returns the state in \f[I]state\f[R], which indicates whether it is valid, invalid or doesn\[cq]t exist. A valid crypto login session can become invalid if the credential or the import KEK used in the crypto login session were deleted during the login session (for example by a crypto officer). 
In this case, \f[B]mlx5dv_crypto_logout()\f[R] should be called to destroy the current invalid crypto login session and if still necessary, \f[B]mlx5dv_crypto_login()\f[R] should be called to create a new crypto login session with valid credential and import KEK. .PP \f[B]mlx5dv_crypto_logout()\f[R] logs out from the current crypto login session associated with \f[I]context\f[R]. .PP Existing DEKs that were previously loaded to the device during a crypto login session don\[cq]t need an active crypto login session in order to be used (in MKey or during traffic). .SH ARGUMENTS .SS context .PP The device context to associate the crypto login session with. .SS login_attr .PP Crypto login attributes specify the credential to login with and the import KEK to be used for secured communications during the crypto login session. .IP .nf \f[C] struct mlx5dv_crypto_login_attr { uint32_t credential_id; uint32_t import_kek_id; char credential[48]; uint64_t comp_mask; }; \f[R] .fi .TP \f[I]credential_id\f[R] An ID of a credential, from the credentials stored on the device, that indicates the credential that should be validated against the credential provided in \f[I]credential\f[R]. .TP \f[I]import_kek_id\f[R] An ID of an import KEK, from the import KEKs stored on the device, that indicates the import KEK that will be used for unwrapping the credential provided in \f[I]credential\f[R] and also for all other secured communications during the crypto login session. .TP \f[I]credential\f[R] The credential to login with. Must be provided wrapped by the AES key wrap algorithm using the import KEK indicated by \f[I]import_kek_id\f[R]. .TP \f[I]comp_mask\f[R] Reserved For future extension, must be 0 now. .SS state .PP Indicates the state of the current crypto login session. can be one of MLX5DV_CRYPTO_LOGIN_STATE_VALID, MLX5DV_CRYPTO_LOGIN_STATE_NO_LOGIN and MLX5DV_CRYPTO_LOGIN_STATE_INVALID. .SH RETURN VALUE .PP \f[B]mlx5dv_crypto_login()\f[R] returns 0 on success and errno value on error. .PP \f[B]mlx5dv_crypto_login_query_state()\f[R] returns 0 on success and updates \f[I]state\f[R] with the queried state. On error, errno value is returned. .PP \f[B]mlx5dv_crypto_logout()\f[R] returns 0 on success and errno value on error. .SH ERRORS .TP EEXIST A crypto login session already exists. .TP EINVAL Invalid attributes were provided, or one or more of \f[I]credential\f[R], \f[I]credential_id\f[R] and \f[I]import_kek_id\f[R] are invalid. .TP ENOENT No crypto login session exists. .SH AUTHORS .PP Avihai Horon rdma-core-56.1/buildlib/pandoc-prebuilt/1c3f51131206bb1a7ed34fbdc897910d313df6870000644000175100002000000000337714773456415033370 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_flow_action_esp" "3" "" "" "" .hy .SH NAME .PP mlx5dv_flow_action_esp - Flow action esp for mlx5 provider .SH SYNOPSIS .IP .nf \f[C] #include struct ibv_flow_action * mlx5dv_create_flow_action_esp(struct ibv_context *ctx, struct ibv_flow_action_esp_attr *esp, struct mlx5dv_flow_action_esp *mlx5_attr); \f[R] .fi .SH DESCRIPTION .PP Create an IPSEC ESP flow steering action. .PD 0 .P .PD This verb is identical to \f[I]ibv_create_flow_action_esp\f[R] verb, but allows mlx5 specific flags. .SH ARGUMENTS .PP Please see \f[I]ibv_flow_action_esp(3)\f[R] man page for \f[I]ctx\f[R] and \f[I]esp\f[R]. 
.SS \f[I]mlx5_attr\f[R] argument .IP .nf \f[C] struct mlx5dv_flow_action_esp { uint64_t comp_mask; /* Use enum mlx5dv_flow_action_esp_mask */ uint32_t action_flags; /* Use enum mlx5dv_flow_action_flags */ }; \f[R] .fi .TP \f[I]comp_mask\f[R] Bitmask specifying what fields in the structure are valid (\f[I]enum mlx5dv_flow_action_esp_mask\f[R]). .TP \f[I]action_flags\f[R] A bitwise OR of the various values described below. .RS .PP \f[I]MLX5DV_FLOW_ACTION_FLAGS_REQUIRE_METADATA\f[R]: .PD 0 .P .PD Each received and transmitted packet using offload is expected to carry metadata in the form of a L2 header .PD 0 .P .PD with ethernet type 0x8CE4, followed by 6 bytes of data and the original packet ethertype. .RE .SH NOTE .PP The ESN is expected to be placed in the IV field for egress packets. .PD 0 .P .PD The 64 bit sequence number is written in big-endian over the 64 bit IV field. .PD 0 .P .PD There is no need to call modify to update the ESN window on egress when this DV is used. .SH SEE ALSO .PP \f[I]ibv_flow_action_esp(3)\f[R], \f[I]RFC 4106\f[R] rdma-core-56.1/buildlib/pandoc-prebuilt/2082c9e75706a10a0c0c9925f5108736249d83680000644000175100002000000000455614773456412032745 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "ibv_create_counters" "3" "2018-04-02" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP \f[B]ibv_create_counters\f[R], \f[B]ibv_destroy_counters\f[R] - Create or destroy a counters handle .SH SYNOPSIS .IP .nf \f[C] #include struct ibv_counters * ibv_create_counters(struct ibv_context *context, struct ibv_counters_init_attr *init_attr); int ibv_destroy_counters(struct ibv_counters *counters); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_create_counters\f[R]() creates a new counters handle for the RDMA device context. .PP An ibv_counters handle can be attached to a verbs resource (e.g.: QP, WQ, Flow) statically when these are created. .PP For example attach an ibv_counters statically to a Flow (struct ibv_flow) during creation of a new Flow by calling \f[B]ibv_create_flow()\f[R]. .PP Counters are cleared upon creation and values will be monotonically increasing. .PP \f[B]ibv_destroy_counters\f[R]() releases the counters handle, user should detach the counters object before destroying it. .SH ARGUMENTS .TP \f[I]context\f[R] RDMA device context to create the counters on. .TP \f[I]init_attr\f[R] Is an ibv_counters_init_attr struct, as defined in verbs.h. .SS \f[I]init_attr\f[R] Argument .IP .nf \f[C] struct ibv_counters_init_attr { int comp_mask; }; \f[R] .fi .TP \f[I]comp_mask\f[R] Bitmask specifying what fields in the structure are valid. .SH RETURN VALUE .PP \f[B]ibv_create_counters\f[R]() returns a pointer to the allocated ibv_counters object, or NULL if the request fails (and sets errno to indicate the failure reason) .PP \f[B]ibv_destroy_counters\f[R]() returns 0 on success, or the value of errno on failure (which indicates the failure reason) .SH ERRORS .TP EOPNOTSUPP \f[B]ibv_create_counters\f[R]() is not currently supported on this device (ENOSYS may sometimes be returned by old versions of libibverbs). 
.TP ENOMEM \f[B]ibv_create_counters\f[R]() could not create ibv_counters object, not enough memory .TP EINVAL invalid parameter supplied \f[B]ibv_destroy_counters\f[R]() .SH EXAMPLE .PP An example of use of ibv_counters is shown in \f[B]ibv_read_counters\f[R] .SH SEE ALSO .PP \f[B]ibv_attach_counters_point_flow\f[R], \f[B]ibv_read_counters\f[R], \f[B]ibv_create_flow\f[R] .SH AUTHORS .PP Raed Salem .PP Alex Rosenbaum rdma-core-56.1/buildlib/pandoc-prebuilt/5d67297aeec84eb513d984263139efe7f71750d50000644000175100002000000000511714773456413033345 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "ibv_query_qp_data_in_order" "3" "2020-3-3" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_query_qp_data_in_order - check if qp data is guaranteed to be in order. .SH SYNOPSIS .IP .nf \f[C] #include int ibv_query_qp_data_in_order(struct ibv_qp *qp, enum ibv_wr_opcode op, uint32_t flags); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_query_qp_data_in_order()\f[R] Checks whether WQE data is guaranteed to be written in-order, and thus reader may poll for data instead of poll for completion. This function indicates data is written in-order within each WQE, but cannot be used to determine ordering between separate WQEs. This function describes ordering at the receiving side of the QP, not the sending side. .SH ARGUMENTS .TP \f[I]qp\f[R] .IP .nf \f[C] The local queue pair (QP) to query. \f[R] .fi .TP \f[I]op\f[R] .IP .nf \f[C] The operation type to query about. Different operation types may write data in a different order. \f[R] .fi .RS For RDMA read operations: describes ordering of RDMA reads posted on this local QP. For RDMA write operations: describes ordering of remote RDMA writes being done into this local QP. For RDMA send operations: describes ordering of remote RDMA sends being done into this local QP. This function should not be used to determine ordering of other operation types. .RE .TP \f[I]flags\f[R] Flags are used to select a query type. Supported values: .PP IBV_QUERY_QP_DATA_IN_ORDER_RETURN_CAPS - Query for supported capabilities and return a capabilities vector. .PP Passing 0 is equivalent to using IBV_QUERY_QP_DATA_IN_ORDER_RETURN_CAPS and checking for IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG support. .SH RETURN VALUE .PP \f[B]ibv_query_qp_data_in_order()\f[R] Return value is determined by flags. For each capability bit, 1 is returned if the data is guaranteed to be written in-order for selected operation and type, 0 otherwise. If IBV_QUERY_QP_DATA_IN_ORDER_RETURN_CAPS flag is used, return value can consist of following capabilities: .PP IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG - All data is being written in order. .PP IBV_QUERY_QP_DATA_IN_ORDER_ALIGNED_128_BYTES - Each 128 bytes aligned block is being written in order. .PP If flags is 0, the function will return 1 if IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG is supported and 0 otherwise. .SH NOTES .PP Return value is valid only when the data is read by the CPU and relaxed ordering MR is not the target of the transfer. 
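.PP
A short usage sketch (assuming an established QP \f[I]qp\f[R]), querying the capability vector for incoming RDMA writes:
.IP
.nf
\f[C]
int caps = ibv_query_qp_data_in_order(qp, IBV_WR_RDMA_WRITE,
                                      IBV_QUERY_QP_DATA_IN_ORDER_RETURN_CAPS);
if (caps & IBV_QUERY_QP_DATA_IN_ORDER_WHOLE_MSG) {
    /* The whole message is written in order; polling on the last
     * byte of the payload is sufficient. */
} else if (caps & IBV_QUERY_QP_DATA_IN_ORDER_ALIGNED_128_BYTES) {
    /* Only each aligned 128-byte block is written in order. */
}
\f[R]
.fi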
.SH SEE ALSO .PP \f[B]ibv_query_qp\f[R](3) .SH AUTHOR .PP Patrisious Haddad .PP Yochai Cohen rdma-core-56.1/buildlib/pandoc-prebuilt/387bcd897d46908fd6f21b68160ba3e0439581310000644000175100002000000000352114773456414033165 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "MANADV" "7" "2022-05-16" "mana" "MANA Direct Verbs Manual" .hy .SH NAME .PP manadv - Direct verbs for mana devices .PP This provides low level access to mana devices to perform direct operations, without general branching performed by libibverbs. .SH DESCRIPTION .PP The libibverbs API is an abstract one. It is agnostic to any underlying provider specific implementation. While this abstraction has the advantage of user application portability, it has a performance penalty. For some applications optimizing performance is more important than portability. .PP The mana direct verbs API is intended for such applications. It exposes mana specific low level operations, allowing the application to bypass the libibverbs API. .PP This version of the driver supports one QP type: IBV_QPT_RAW_PACKET. To use this QP type, the application is required to use manadv_set_context_attr() to set external buffer allocators for allocating queues, and use manadv_init_obj() to obtain all the queue information. The application implements its own queue operations, bypassing the libibverbs API for sending/receiving traffic over the queues. At the hardware layer, an IBV_QPT_RAW_PACKET QP shares the same hardware resource as the Ethernet port used in the kernel. The software checks for exclusive use of the hardware Ethernet port, and will fail the QP creation if the port is already in use. To create an IBV_QPT_RAW_PACKET QP on a specified port, the user needs to configure the system in such a way that this port is not used by any other software (including the Kernel). If the port is used, ibv_create_qp() will fail with errno set to EBUSY. .PP The direct include of manadv.h together with linkage to the mana library will allow usage of this new interface. .SH SEE ALSO .PP \f[B]verbs\f[R](7) .SH AUTHORS .PP Long Li rdma-core-56.1/buildlib/pandoc-prebuilt/27eaa419d1bb46824097bffdf0ae970f24c1f0eb0000644000175100002000000000220414773456416033657 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_query_qp_lag_port" "3" "" "" "" .hy .SH NAME .PP mlx5dv_query_qp_lag_port - Query the lag port information of a given QP .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/mlx5dv.h> int mlx5dv_query_qp_lag_port(struct ibv_qp *qp, uint8_t *port_num, uint8_t *active_port_num); \f[R] .fi .SH DESCRIPTION .PP This API returns the configured and active port num of a given QP in mlx5 devices. .PP The active port num indicates which port the QP sends traffic out of in a LAG configuration. .PP A num_lag_ports field of struct mlx5dv_context greater than 1 means LAG is supported on this device. .SH ARGUMENTS .TP \f[I]qp\f[R] The ibv_qp object to issue the action on. .TP \f[I]port_num\f[R] The configured port num of the QP. .TP \f[I]active_port_num\f[R] The current port num of the QP, which may differ from the configured value because of the bonding status. .SH RETURN VALUE .PP 0 on success; EOPNOTSUPP if not in LAG mode, or other errno value on other failures.
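.PP
A minimal usage sketch (assuming \f[I]qp\f[R] was created on a LAG-enabled mlx5 device and the usual headers are included):
.IP
.nf
\f[C]
uint8_t port_num, active_port_num;

int ret = mlx5dv_query_qp_lag_port(qp, &port_num, &active_port_num);
if (ret == EOPNOTSUPP) {
    /* the device/QP is not in LAG mode */
} else if (!ret && port_num != active_port_num) {
    /* bonding currently sends this QP's traffic on a
     * different port than the configured one */
}
\f[R]
.fi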
.SH SEE ALSO .PP \f[I]mlx5dv_modify_qp_lag_port(3)\f[R] .SH AUTHOR .PP Aharon Landau rdma-core-56.1/buildlib/pandoc-prebuilt/00b1d0691cdea71ca370160f85854622bfef1e920000644000175100002000000003061114773456412033347 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "ibv_flow_action_esp" "3" "" "" "" .hy .SH NAME .PP ibv_flow_action_esp - Flow action esp for verbs .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/verbs.h> struct ibv_flow_action * ibv_create_flow_action_esp(struct ibv_context *ctx, struct ibv_flow_action_esp *esp); int ibv_modify_flow_action_esp(struct ibv_flow_action *action, struct ibv_flow_action_esp *esp); int ibv_destroy_flow_action(struct ibv_flow_action *action); \f[R] .fi .SH DESCRIPTION .PP An IPSEC ESP flow steering action allows a flow steering rule to decrypt or encrypt a packet after matching. Each action contains the necessary information for this operation in the \f[I]params\f[R] argument. .PP After the crypto operation the packet will continue to be processed by flow steering rules until it reaches a final action of discard or delivery. .PP After the action is created, it should be associated with a \f[I]struct ibv_flow_attr\f[R] using \f[I]struct ibv_flow_spec_action_handle\f[R] flow specification. Each action can be associated with multiple flows, and \f[I]ibv_modify_flow_action_esp\f[R] will alter all associated flows simultaneously. .SH ARGUMENTS .TP \f[I]ctx\f[R] RDMA device context to create the action on. .TP \f[I]esp\f[R] ESP parameters and key material for the action. .TP \f[I]action\f[R] Existing action to modify ESP parameters. .SS \f[I]action\f[R] Argument .IP .nf \f[C] struct ibv_flow_action_esp { struct ibv_flow_action_esp_attr *esp_attr; /* See Key Material */ uint16_t keymat_proto; uint16_t keymat_len; void *keymat_ptr; /* See Replay Protection */ uint16_t replay_proto; uint16_t replay_len; void *replay_ptr; struct ibv_flow_action_esp_encap *esp_encap; uint32_t comp_mask; uint32_t esn; }; \f[R] .fi .TP \f[I]comp_mask\f[R] Bitmask specifying what fields in the structure are valid. .TP \f[I]esn\f[R] The starting value of the ESP extended sequence number. Valid only if \f[I]IBV_FLOW_ACTION_ESP_MASK_ESN\f[R] is set in \f[I]comp_mask\f[R]. .RS .PP The 32 bits of \f[I]esn\f[R] will be used to compute the full 64 bit ESN required for the AAD construction. .PP When in \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_INLINE_CRYPTO\f[R] mode, the implementation will automatically track rollover of the lower 32 bits of the ESN. However, an update of the window is required once every 2\[ha]31 sequences. .PP When in \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD\f[R] mode this value is automatically incremented and it is also used for anti-replay checks. .RE .TP \f[I]esp_attr\f[R] See \f[I]ESP Attributes\f[R]. May be NULL on modify. .TP \f[I]keymat_proto\f[R], \f[I]keymat_len\f[R], \f[I]keymat_ptr\f[R] Describe the key material and encryption standard to use. May be NULL on modify. .TP \f[I]replay_proto\f[R], \f[I]replay_len\f[R], \f[I]replay_ptr\f[R] Describe the replay protection scheme used to manage sequence numbers and prevent replay attacks. This field is only valid in full offload mode. May be NULL on modify. .TP \f[I]esp_encap\f[R] Describe the encapsulation of ESP packets such as the IP tunnel and/or UDP encapsulation. This field is only valid in full offload mode. May be NULL on modify.
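.PP
The sketch below shows how these fields fit together when creating an action. The type and enum names are those given in this page; the \f[I]esp_attr\f[R] and key-material contents are described in the following sections, so only the wiring is shown here:
.IP
.nf
\f[C]
struct ibv_flow_action_esp_attr esp_attr = { 0 };          /* see ESP attributes */
struct ibv_flow_action_esp_aes_keymat_aes_gcm gcm = { 0 }; /* see Key Material */

struct ibv_flow_action_esp esp = {
    .esp_attr = &esp_attr,
    .keymat_proto = IBV_ACTION_ESP_KEYMAT_AES_GCM,
    .keymat_len = sizeof(gcm),
    .keymat_ptr = &gcm,
};
struct ibv_flow_action *action = ibv_create_flow_action_esp(ctx, &esp);
if (!action)
    return -1; /* errno holds the failure reason */
\f[R]
.fi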
.SS ESP attributes .IP .nf \f[C] struct ibv_flow_action_esp_attr { uint32_t spi; uint32_t seq; uint32_t tfc_pad; uint32_t flags; uint64_t hard_limit_pkts; }; \f[R] .fi .TP \f[I]flags\f[R] A bitwise OR of the various \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS\f[R] described below. .RS .TP \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_DECRYPT\f[R], \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_ENCRYPT\f[R] The action will decrypt or encrypt a packet using the provided keying material. .RS .PP The implementation may require that encrypt is only used with an egress flow steering rule, and that decrypt is only used with an ingress flow steering rule. .RE .RE .SS Full Offload Mode .PP When the \f[I]esp_attr\f[R] flag \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD\f[R] is set, the ESP header and trailer are added and removed automatically during the cipher operation. In this case the \f[I]esn\f[R] and \f[I]spi\f[R] are used to populate and check the ESP header, and any information from the \f[I]keymat\f[R] (e.g.\ an IV) is placed in the headers and otherwise handled automatically. .PP For decrypt the hardware will perform anti-replay. .PP Decryption failure will cause the packet to be dropped. .PP This action must be combined with the flow steering that identifies the packets protected by the SA defined in this action. .PP The following members of the esp_attr are used only in full offload mode: .TP \f[I]spi\f[R] The value for the ESP Security Parameters Index. It is only used for \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD\f[R]. .TP \f[I]seq\f[R] The initial lower 32 bits of the sequence number. This is the value of the ESP sequence number. It is only used for \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD\f[R]. .TP \f[I]tfc_pad\f[R] The length of Traffic Flow Confidentiality Padding as specified by RFC4303. If it is set to zero no additional padding is added. It is only used for \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD\f[R]. .TP \f[I]hard_limit_pkts\f[R] The hard lifetime of the SA measured in number of packets, as specified by RFC4301. After this limit is reached the action will drop future packets to prevent breaking the crypto. It is only used for \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD\f[R]. .SS Inline Crypto Mode .PP When the \f[I]esp_attr\f[R] flag \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_INLINE_CRYPTO\f[R] is set, the user must provide packets with additional headers. .PP For encrypt the packet must contain a fully populated IPSEC packet, except that the data payload is left unencrypted and there is no IPsec trailer. If the IV must be unpredictable, then a flag should indicate the transformation, such as \f[I]IB_UVERBS_FLOW_ACTION_IV_ALGO_SEQ\f[R]. .PP \f[I]IB_UVERBS_FLOW_ACTION_IV_ALGO_SEQ\f[R] means that the IV is incremented sequentially. If the IV algorithm is supported by HW, then it could provide support for LSO offload with ESP inline crypto. .PP Finally, the IV used to encrypt the packet replaces the IV field provided, the payload is encrypted and authenticated, a trailer with padding is added, and the ICV is added as well. .PP For decrypt the packet is authenticated and decrypted in-place, resulting in a decrypted IPSEC packet with no trailer. The result of decryption and authentication can be retrieved from an extended CQ via the \f[I]ibv_wc_read_XXX(3)\f[R] function.
.PP This mode must be combined with the flow steering including \f[I]IBV_FLOW_SPEC_IPV4\f[R] and \f[I]IBV_FLOW_SPEC_ESP\f[R] to match the outer packet headers to ensure that the action is only applied to IPSEC packets with the correct identifiers. .PP For inline crypto, we have some special requirements to maintain a stateless ESN while maintaining the same parameters as software. The system supports offloading a portion of the IPSEC flow, enabling a single flow to be split between multiple NICs. .SS Determining the ESN for Ingress Packets .PP We require a \[lq]modify\[rq] command once every 2\[ha]31 packets. This modify command allows the implementation in HW to be stateless, as follows: .IP .nf \f[C] ESN 1 ESN 2 ESN 3 |-------------*-------------|-------------*-------------|-------------* \[ha] \[ha] \[ha] \[ha] \[ha] \[ha] \f[R] .fi .PP \[ha] - marks where a command is invoked to update the SA ESN state machine. .PD 0 .P .PD | - marks the start of the ESN scope (0-2\[ha]32-1). At this point move SA ESN \[lq]new_window\[rq] bit to zero and increment ESN. .PD 0 .P .PD * - marks the middle of the ESN scope (2\[ha]31). At this point move SA ESN \[lq]new_window\[rq] bit to one. .PP For decryption the implementation uses the following state machine to determine ESN: .IP .nf \f[C] if (!overlap) { use esn // regardless of packet.seq } else { // new_window if (packet.seq >= 2\[ha]31) use esn else // packet.seq < 2\[ha]31 use esn+1 } \f[R] .fi .PP This mechanism is controlled by the \f[I]esp_attr\f[R] flag: .TP \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_ESN_NEW_WINDOW\f[R] This flag is only used to provide stateless ESN support for inline crypto. It is used only for \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_INLINE_CRYPTO\f[R] and \f[I]IBV_FLOW_ACTION_ESP_MASK_ESN\f[R]. .RS .PP Setting this flag indicates that the bottom of the replay window is between 2\[ha]31 and 2\[ha]32. .RE .SS Key Material for AES GCM (\f[I]IBV_ACTION_ESP_KEYMAT_AES_GCM\f[R]) .PP The AES GCM crypto algorithm as defined by RFC4106. This struct is to be provided in \f[I]keymat_ptr\f[R] when \f[I]keymat_proto\f[R] is set to \f[I]IBV_ACTION_ESP_KEYMAT_AES_GCM\f[R]. .IP .nf \f[C] struct ibv_flow_action_esp_aes_keymat_aes_gcm { uint64_t iv; uint32_t iv_algo; /* Use enum ib_uverbs_flow_action_esp_aes_gcm_keymat_iv_algo */ uint32_t salt; uint32_t icv_len; uint32_t key_len; uint32_t aes_key[256 / 32]; }; \f[R] .fi .TP \f[I]iv\f[R] The starting value for the initialization vector used only with \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD\f[R] encryption as defined in RFC4106. This field is ignored for \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_INLINE_CRYPTO\f[R]. .RS .PP For a given key, the IV MUST NOT be reused. .RE .TP \f[I]iv_algo\f[R] The algorithm used to transform/generate new IVs with \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD\f[R] encryption. .RS .PP The only supported value is \f[I]IB_UVERBS_FLOW_ACTION_IV_ALGO_SEQ\f[R] to generate sequential IVs. .RE .TP \f[I]salt\f[R] The salt as defined by RFC4106. .TP \f[I]icv_len\f[R] The length of the Integrity Check Value in bytes as defined by RFC4106. .TP \f[I]aes_key\f[R], \f[I]key_len\f[R] The cipher key data. It must be either 16, 24 or 32 bytes as defined by RFC4106. .SS Bitmap Replay Protection (\f[I]IBV_FLOW_ACTION_ESP_REPLAY_BMP\f[R]) .PP A shifting bitmap is used to identify which packets have already been received. Each bit in the bitmap represents a packet; it is set if a packet with this ESP sequence number has been received and it passed authentication.
If a packet with the same sequence is received, then the bit is already set, causing replay protection to drop the packet. The bitmap represents a window of \f[I]size\f[R] sequence numbers. If a newer sequence number is received, then the bitmap will shift to represent this as in RFC6479. The replay window cannot shift more than 2\[ha]31 sequence numbers forward. .PP This struct is to be provided in \f[I]replay_ptr\f[R] when \f[I]replay_proto\f[R] is set to \f[I]IBV_FLOW_ACTION_ESP_REPLAY_BMP\f[R]. In this mode \f[I]replay_ptr\f[R] and \f[I]replay_len\f[R] should point to a struct ibv_flow_action_esp_replay_bmp containing: \f[I]size\f[R]: The size of the bitmap. .SS ESP Encapsulation .PP An \f[I]esp_encap\f[R] specification is required when the \f[I]esp_attr\f[R] flag \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_TUNNEL\f[R] is set. It is used to provide the fields for the encapsulation header that is added/removed to/from packets. Tunnel and Transport mode are defined as in RFC4301. UDP encapsulation of ESP can be specified by providing the appropriate UDP header. .PP This setting is only used in \f[I]IB_UVERBS_FLOW_ACTION_ESP_FLAGS_FULL_OFFLOAD\f[R] mode. .IP .nf \f[C] struct ibv_flow_action_esp_encap { void *val; /* pointer to struct ibv_flow_xxxx_filter */ struct ibv_flow_action_esp_encap *next_ptr; uint16_t len; /* Len of mask and pointer (separately) */ uint16_t type; /* Use flow_spec enum */ }; \f[R] .fi .PP Each link in the list specifies a network header in the same manner as the flow steering API. The header should be selected from a supported header in `enum ibv_flow_spec_type'. .SH RETURN VALUE .PP Upon success \f[I]ibv_create_flow_action_esp\f[R] will return a new \f[I]struct ibv_flow_action\f[R] object; on error NULL will be returned and errno will be set. .PP Upon success \f[I]ibv_modify_flow_action_esp\f[R] will return 0. On error the value of errno will be returned. If \f[I]ibv_modify_flow_action_esp\f[R] fails, it is guaranteed that the previous action configuration still holds. If it succeeds, there is a point in time at which the old action is applied to all packets before it and the new one is applied to all packets from that point on. .SH SEE ALSO .PP \f[I]ibv_create_flow(3)\f[R], \f[I]ibv_destroy_action(3)\f[R], \f[I]RFC 4106\f[R] rdma-core-56.1/buildlib/pandoc-prebuilt/5358b48bb3cfd5bdddbb449240ddc673311cfbdd0000644000175100002000000001207514773456421034020 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "SMINFO" 8 "2017-08-21" "" "Open IB Diagnostics" .SH NAME sminfo \- query InfiniBand SMInfo attribute .SH SYNOPSIS .sp sminfo [options] sm_lid | sm_dr_path [modifier] .SH DESCRIPTION .sp Optionally set and display the output of an sminfo query in human-readable format. The target SM is the one listed in the local port info, or the SM specified by the optional SM lid or by the SM direct routed path.
.sp Note: using sminfo for any purposes other then simple query may be very dangerous, and may result in a malfunction of the target SM. .SH OPTIONS .INDENT 0.0 .TP .B \fB\-s, \-\-state \fP set SM state 0 not active .sp 1 discovering .sp 2 standby .sp 3 master .UNINDENT .sp \fB\-p, \-\-priority \fP set priority (0\-15) .sp \fB\-a, \-\-activity \fP set activity count .SS Addressing Flags .\" Define the common option -D for Directed routes . .sp \fB\-D, \-\-Direct\fP The address specified is a directed route .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C Examples: [options] \-D [options] "0" # self port [options] \-D [options] "0,1,2,1,4" # out via port 1, then 2, ... (Note the second number in the path specified must match the port being used. This can be specified using the port selection flag \(aq\-P\(aq or the port found through the automatic selection process.) .ft P .fi .UNINDENT .UNINDENT .\" Define the common option -G . .sp \fB\-G, \-\-Guid\fP The address specified is a Port GUID .\" Define the common option -L . .sp \fB\-L, \-\-Lid\fP The address specified is a LID .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Configuration flags .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. 
See supplied config file for details. .SH EXAMPLES .INDENT 0.0 .TP .B :: sminfo # local port\(aqs sminfo sminfo 32 # show sminfo of lid 32 sminfo \-G 0x8f1040023 # same but using guid address .UNINDENT .SH SEE ALSO .sp smpdump (8) .SH AUTHOR .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%halr@voltaire.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/8105741e009ee9c0458701d0170810358fd598870000644000175100002000000000711214773456415032662 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_devx_qp[/cq/srq/wq/ind_tbl]_modify / query" "3" "" "" "" .hy .SH NAME .PP mlx5dv_devx_qp_modify - Modifies a verbs QP via DEVX .PP mlx5dv_devx_qp_query - Queries a verbs QP via DEVX .PP mlx5dv_devx_cq_modify - Modifies a verbs CQ via DEVX .PP mlx5dv_devx_cq_query - Queries a verbs CQ via DEVX .PP mlx5dv_devx_srq_modify - Modifies a verbs SRQ via DEVX .PP mlx5dv_devx_srq_query - Queries a verbs SRQ via DEVX .PP mlx5dv_devx_wq_modify - Modifies a verbs WQ via DEVX .PP mlx5dv_devx_wq_query - Queries a verbs WQ via DEVX .PP mlx5dv_devx_ind_tbl_modify - Modifies a verbs indirection table via DEVX .PP mlx5dv_devx_ind_tbl_query - Queries a verbs indirection table via DEVX .SH SYNOPSIS .IP .nf \f[C] #include int mlx5dv_devx_qp_modify(struct ibv_qp *qp, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_qp_query(struct ibv_qp *qp, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_cq_modify(struct ibv_cq *cq, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_cq_query(struct ibv_cq *cq, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_srq_modify(struct ibv_srq *srq, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_srq_query(struct ibv_srq *srq, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_wq_modify(struct ibv_wq *wq, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_wq_query(struct ibv_wq *wq, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_ind_tbl_modify(struct ibv_rwq_ind_table *ind_tbl, const void *in, size_t inlen, void *out, size_t outlen); int mlx5dv_devx_ind_tbl_query(struct ibv_rwq_ind_table *ind_tbl, const void *in, size_t inlen, void *out, size_t outlen); \f[R] .fi .SH DESCRIPTION .PP Modify / query a verb object over the DEVX interface. .PP The DEVX API enables direct access from the user space area to the mlx5 device driver by using the KABI mechanism. The main purpose is to make the user space driver as independent as possible from the kernel so that future device functionality and commands can be activated with minimal to none kernel changes. .PP The above APIs enables modifying/querying a verb object via the DEVX interface. This enables interoperability between verbs and DEVX. As such an application can use the create method from verbs (e.g.\ ibv_create_qp) and modify and query the created object via DEVX (e.g.\ mlx5dv_devx_qp_modify). .SH ARGUMENTS .TP \f[I]qp/cq/wq/srq/ind_tbl\f[R] The ibv_xxx object to issue the action on. .TP \f[I]in\f[R] A buffer which contains the command\[cq]s input data provided in a device specification format. .TP \f[I]inlen\f[R] The size of \f[I]in\f[R] buffer in bytes. .TP \f[I]out\f[R] A buffer which contains the command\[cq]s output data according to the device specification format. .TP \f[I]outlen\f[R] The size of \f[I]out\f[R] buffer in bytes. 
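.PP
A schematic sketch of a query call (the command layout itself comes from the device specification; the buffer sizes and the opcode/field comments below are placeholders, not part of this API):
.IP
.nf
\f[C]
uint32_t in[4] = {};    /* query command input block, per the device spec */
uint32_t out[256] = {}; /* must be large enough for the queried context */

/* ... fill the opcode and object fields of `in' per the device spec ... */
int ret = mlx5dv_devx_qp_query(qp, in, sizeof(in), out, sizeof(out));
if (ret)
    /* on EREMOTEIO the out box status/syndrome hold the details */
    return ret;
\f[R]
.fi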
.SH RETURN VALUE .PP Upon success 0 is returned, or the value of errno on a failure. .PP If the error value is EREMOTEIO, outbox.status and outbox.syndrome will contain the command failure details. .SH SEE ALSO .PP \f[B]mlx5dv_open_device\f[R], \f[B]mlx5dv_devx_obj_create\f[R] .SH AUTHOR .PP Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/b6088f7c78e9b3da664dc6c5eb0c290e3e75613e0000644000175100002000000005151514773456415033556 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "MLX5DV_DR API" "3" "2019-03-28" "mlx5" "mlx5 Programmer\[cq]s Manual" .hy .SH NAME .PP mlx5dv_dr_domain_create, mlx5dv_dr_domain_sync, mlx5dv_dr_domain_destroy, mlx5dv_dr_domain_set_reclaim_device_memory, mlx5dv_dr_domain_allow_duplicate_rules - Manage flow domains .PP mlx5dv_dr_table_create, mlx5dv_dr_table_destroy - Manage flow tables .PP mlx5dv_dr_matcher_create, mlx5dv_dr_matcher_destroy, mlx5dv_dr_matcher_set_layout - Manage flow matchers .PP mlx5dv_dr_rule_create, mlx5dv_dr_rule_destroy - Manage flow rules .PP mlx5dv_dr_action_create_drop - Create drop action .PP mlx5dv_dr_action_create_default_miss - Create default miss action .PP mlx5dv_dr_action_create_tag - Create tag actions .PP mlx5dv_dr_action_create_dest_ibv_qp - Create packet destination QP action .PP mlx5dv_dr_action_create_dest_table - Create packet destination dr table action .PP mlx5dv_dr_action_create_dest_root_table - Create packet destination root table action .PP mlx5dv_dr_action_create_dest_vport - Create packet destination vport action .PP mlx5dv_dr_action_create_dest_ib_port - Create packet destination IB port action .PP mlx5dv_dr_action_create_dest_devx_tir - Create packet destination TIR action .PP mlx5dv_dr_action_create_dest_array - Create destination array action .PP mlx5dv_dr_action_create_packet_reformat - Create packet reformat actions .PP mlx5dv_dr_action_create_modify_header - Create modify header actions .PP mlx5dv_dr_action_create_flow_counter - Create devx flow counter actions .PP mlx5dv_dr_action_create_aso, mlx5dv_dr_action_modify_aso - Create and modify ASO actions .PP mlx5dv_dr_action_create_flow_meter, mlx5dv_dr_action_modify_flow_meter - Create and modify meter action .PP mlx5dv_dr_action_create_flow_sampler - Create flow sampler action .PP mlx5dv_dr_action_create_pop_vlan - Create pop vlan action .PP mlx5dv_dr_action_create_push_vlan - Create push vlan action .PP mlx5dv_dr_action_destroy - Destroy actions .PP mlx5dv_dr_aso_other_domain_link, mlx5dv_dr_aso_other_domain_unlink - link/unlink ASO devx object to work with different domains .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/mlx5dv.h> struct mlx5dv_dr_domain *mlx5dv_dr_domain_create( struct ibv_context *ctx, enum mlx5dv_dr_domain_type type); int mlx5dv_dr_domain_sync( struct mlx5dv_dr_domain *domain, uint32_t flags); int mlx5dv_dr_domain_destroy(struct mlx5dv_dr_domain *domain); void mlx5dv_dr_domain_set_reclaim_device_memory( struct mlx5dv_dr_domain *dmn, bool enable); void mlx5dv_dr_domain_allow_duplicate_rules(struct mlx5dv_dr_domain *dmn, bool allow); struct mlx5dv_dr_table *mlx5dv_dr_table_create( struct mlx5dv_dr_domain *domain, uint32_t level); int mlx5dv_dr_table_destroy(struct mlx5dv_dr_table *table); struct mlx5dv_dr_matcher *mlx5dv_dr_matcher_create( struct mlx5dv_dr_table *table, uint16_t priority, uint8_t match_criteria_enable, struct mlx5dv_flow_match_parameters *mask); int mlx5dv_dr_matcher_destroy(struct mlx5dv_dr_matcher *matcher); int mlx5dv_dr_matcher_set_layout(struct mlx5dv_dr_matcher *matcher, struct
mlx5dv_dr_matcher_layout *matcher_layout); struct mlx5dv_dr_rule *mlx5dv_dr_rule_create( struct mlx5dv_dr_matcher *matcher, struct mlx5dv_flow_match_parameters *value, size_t num_actions, struct mlx5dv_dr_action *actions[]); void mlx5dv_dr_rule_destroy(struct mlx5dv_dr_rule *rule); struct mlx5dv_dr_action *mlx5dv_dr_action_create_drop(void); struct mlx5dv_dr_action *mlx5dv_dr_action_create_default_miss(void); struct mlx5dv_dr_action *mlx5dv_dr_action_create_tag( uint32_t tag_value); struct mlx5dv_dr_action *mlx5dv_dr_action_create_dest_ibv_qp( struct ibv_qp *ibqp); struct mlx5dv_dr_action *mlx5dv_dr_action_create_dest_table( struct mlx5dv_dr_table *table); struct mlx5dv_dr_action *mlx5dv_dr_action_create_dest_root_table( struct mlx5dv_dr_table *table, uint16_t priority); struct mlx5dv_dr_action *mlx5dv_dr_action_create_dest_vport( struct mlx5dv_dr_domain *domain, uint32_t vport); struct mlx5dv_dr_action *mlx5dv_dr_action_create_dest_ib_port( struct mlx5dv_dr_domain *domain, uint32_t ib_port); struct mlx5dv_dr_action *mlx5dv_dr_action_create_dest_devx_tir( struct mlx5dv_devx_obj *devx_obj); struct mlx5dv_dr_action *mlx5dv_dr_action_create_packet_reformat( struct mlx5dv_dr_domain *domain, uint32_t flags, enum mlx5dv_flow_action_packet_reformat_type reformat_type, size_t data_sz, void *data); struct mlx5dv_dr_action *mlx5dv_dr_action_create_modify_header( struct mlx5dv_dr_domain *domain, uint32_t flags, size_t actions_sz, __be64 actions[]); struct mlx5dv_dr_action *mlx5dv_dr_action_create_flow_counter( struct mlx5dv_devx_obj *devx_obj, uint32_t offset); struct mlx5dv_dr_action * mlx5dv_dr_action_create_aso(struct mlx5dv_dr_domain *domain, struct mlx5dv_devx_obj *devx_obj, uint32_t offset, uint32_t flags, uint8_t return_reg_c); int mlx5dv_dr_action_modify_aso(struct mlx5dv_dr_action *action, uint32_t offset, uint32_t flags, uint8_t return_reg_c); struct mlx5dv_dr_action * mlx5dv_dr_action_create_flow_meter(struct mlx5dv_dr_flow_meter_attr *attr); int mlx5dv_dr_action_modify_flow_meter(struct mlx5dv_dr_action *action, struct mlx5dv_dr_flow_meter_attr *attr, __be64 modify_field_select); struct mlx5dv_dr_action * mlx5dv_dr_action_create_flow_sampler(struct mlx5dv_dr_flow_sampler_attr *attr); struct mlx5dv_dr_action * mlx5dv_dr_action_create_dest_array(struct mlx5dv_dr_domain *domain, size_t num_dest, struct mlx5dv_dr_action_dest_attr *dests[]); struct mlx5dv_dr_action *mlx5dv_dr_action_create_pop_vlan(void); struct mlx5dv_dr_action *mlx5dv_dr_action_create_push_vlan( struct mlx5dv_dr_domain *dmn, __be32 vlan_hdr) int mlx5dv_dr_action_destroy(struct mlx5dv_dr_action *action); int mlx5dv_dr_aso_other_domain_link(struct mlx5dv_devx_obj *devx_obj, struct mlx5dv_dr_domain *peer_dmn, struct mlx5dv_dr_domain *dmn, uint32_t flags, uint8_t return_reg_c); int mlx5dv_dr_aso_other_domain_unlink(struct mlx5dv_devx_obj *devx_obj, struct mlx5dv_dr_domain *dmn); \f[R] .fi .SH DESCRIPTION .PP The Direct Rule API (mlx5dv_dr_*) allows complete access by verbs application to the device\[ga]s packet steering functionality. .PP Steering flow rules are the combination of attributes with a match pattern and a list of actions. Rules can have several distinct actions (such as counting, encapsulating, decapsulating before redirecting packets to a particular queue or port, etc.). In order to manage the rule execution order for the packet processing matching by HW, multiple flow tables in an ordered chain and multiple flow matchers sorted by priorities are defined. 
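.PP
An abbreviated sketch of how the objects described below relate (error handling omitted; \f[I]match_criteria\f[R], \f[I]match_par\f[R] and \f[I]rule_value\f[R] are assumed inputs whose match-buffer contents follow the device specification):
.IP
.nf
\f[C]
struct mlx5dv_dr_domain *dmn =
    mlx5dv_dr_domain_create(ctx, MLX5DV_DR_DOMAIN_TYPE_NIC_RX);
struct mlx5dv_dr_table *tbl = mlx5dv_dr_table_create(dmn, 1);

/* the mask in match_par is given in device spec format */
struct mlx5dv_dr_matcher *matcher =
    mlx5dv_dr_matcher_create(tbl, 0, match_criteria, match_par);

struct mlx5dv_dr_action *drop = mlx5dv_dr_action_create_drop();
struct mlx5dv_dr_action *actions[] = { drop };

/* only the fields masked in the matcher are filled in rule_value */
struct mlx5dv_dr_rule *rule =
    mlx5dv_dr_rule_create(matcher, rule_value, 1, actions);
\f[R]
.fi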
.SS Domain .PP \f[I]mlx5dv_dr_domain_create()\f[R] creates a DR domain object to be used with \f[I]mlx5dv_dr_table_create()\f[R] and \f[I]mlx5dv_dr_action_create_*()\f[R]. .PP A domain should be destroyed by calling \f[I]mlx5dv_dr_domain_destroy()\f[R] once all dependent resources are released. .PP The device supports the following domain types: .PP \f[B]MLX5DV_DR_DOMAIN_TYPE_NIC_RX\f[R] Manage ethernet packets received on the NIC. Packets in this domain can be dropped, dispatched to QPs, modified or redirected to additional tables inside the domain. Default behavior: Drop packet. .PP \f[B]MLX5DV_DR_DOMAIN_TYPE_NIC_TX\f[R] Manage ethernet packets transmitted on the NIC. Packets in this domain can be dropped, modified or redirected to additional tables inside the domain. Default behavior: Forward packet to NIC vport (to eSwitch or wire). .PP \f[B]MLX5DV_DR_DOMAIN_TYPE_FDB\f[R] Manage ethernet packets in the eSwitch Forwarding Data Base for packets received from the wire or from any other vport. Packets in this domain can be dropped, dispatched to a vport, modified or redirected to additional tables inside the domain. Default behavior: Forward packet to eSwitch manager vport. .PP \f[I]mlx5dv_dr_domain_sync()\f[R] is used in order to flush the rule submission queue. By default, rules in a domain are updated in HW asynchronously. \f[B]flags\f[R] should be a set of type \f[I]enum mlx5dv_dr_domain_sync_flags\f[R]: .PP \f[B]MLX5DV_DR_DOMAIN_SYNC_FLAGS_SW\f[R]: block until completion of all software queued tasks. .PP \f[B]MLX5DV_DR_DOMAIN_SYNC_FLAGS_HW\f[R]: clear the steering HW cache to ensure the next packet hits the latest rules, in addition to the SW SYNC handling. .PP \f[B]MLX5DV_DR_DOMAIN_SYNC_FLAGS_MEM\f[R]: sync device memory to free cached memory. .PP \f[I]mlx5dv_dr_domain_set_reclaim_device_memory()\f[R] is used to enable the reclaiming of device memory back to the system when not in use; by default this feature is disabled. .PP \f[I]mlx5dv_dr_domain_allow_duplicate_rules()\f[R] is used to allow or prevent insertion of rules matching on the same fields (duplicates) on non-root tables; by default this feature is allowed. .SS Table .PP \f[I]mlx5dv_dr_table_create()\f[R] creates a DR table in the \f[B]domain\f[R], at the appropriate \f[B]level\f[R], and can be used with \f[I]mlx5dv_dr_matcher_create()\f[R], \f[I]mlx5dv_dr_action_create_dest_table()\f[R] and \f[I]mlx5dv_dr_action_create_dest_root_table\f[R]. All packets start traversing the steering domain tree at table \f[B]level\f[R] zero (0). Using rules and actions, packets can be redirected to other tables in the domain. .PP A table should be destroyed by calling \f[I]mlx5dv_dr_table_destroy()\f[R] once all dependent resources are released. .SS Matcher .PP \f[I]mlx5dv_dr_matcher_create()\f[R] creates a matcher object in \f[B]table\f[R], at sorted \f[B]priority\f[R] (a lower value is checked first). A matcher can hold multiple rules, all with an identical \f[B]mask\f[R] of type \f[I]struct mlx5dv_flow_match_parameters\f[R] which represents the exact attributes to be compared by HW steering. The \f[B]match_criteria_enable\f[R] and \f[B]mask\f[R] are defined in a device spec format. Only the fields that were masked in the \f[I]matcher\f[R] should be filled by the rule in \f[I]mlx5dv_dr_rule_create()\f[R]. .PP A matcher should be destroyed by calling \f[I]mlx5dv_dr_matcher_destroy()\f[R] once all dependent resources are released.
.PP \f[I]mlx5dv_dr_matcher_set_layout()\f[R] is used to set specific layout parameters of a matcher; under some conditions setting some attributes might not be supported, and in such cases ENOTSUP will be returned. \f[B]flags\f[R] should be a set of type \f[I]enum mlx5dv_dr_matcher_layout_flags\f[R]: .PP \f[B]MLX5DV_DR_MATCHER_LAYOUT_RESIZABLE\f[R]: The matcher can resize its scale and resources according to the rules that are inserted or removed. .PP \f[B]MLX5DV_DR_MATCHER_LAYOUT_NUM_RULE\f[R]: Indicates a hint from the application about the number of rules the matcher is expected to handle. This allows preallocation of matcher resources for faster rule updates when used with the non-resizable layout mode. .SS Actions .PP A set of action creation APIs is defined by \f[I]mlx5dv_dr_action_create_*()\f[R]. All actions are created as \f[I]struct mlx5dv_dr_action\f[R]. An action should be destroyed by calling \f[I]mlx5dv_dr_action_destroy()\f[R] once all dependent rules are destroyed. .PP When an action handle is reused for multiple rules, the same action will be executed. e.g.: action `count' will count multiple flow rules on the same HW flow counter context; action `drop' will drop packets of different rules from any matcher. .PP Action: Drop \f[I]mlx5dv_dr_action_create_drop\f[R] creates a terminating action which drops packets. Cannot be mixed with Destination actions. .PP Action: Default miss \f[I]mlx5dv_dr_action_create_default_miss\f[R] creates a terminating action which will execute the default behavior based on the domain type. .PP Action: Tag \f[I]mlx5dv_dr_action_create_tag\f[R] creates a non-terminating action which tags packets with \f[B]tag_value\f[R]. The \f[B]tag_value\f[R] is available in the CQE of the packet received. Valid only on domain type NIC_RX. .PP Action: Destination \f[I]mlx5dv_dr_action_create_dest_ibv_qp\f[R] creates a terminating action delivering the packet to a QP, defined by \f[B]ibqp\f[R]. Valid only on domain type NIC_RX. \f[I]mlx5dv_dr_action_create_dest_table\f[R] creates a forwarding action to another flow table, defined by \f[B]table\f[R]. The destination \f[B]table\f[R] must be from the same domain with a level higher than zero. \f[I]mlx5dv_dr_action_create_dest_root_table\f[R] creates a forwarding action to another priority inside a root flow table, defined by \f[B]table\f[R] and \f[B]priority\f[R]. \f[I]mlx5dv_dr_action_create_dest_vport\f[R] creates a forwarding action to a \f[B]vport\f[R] on the same \f[B]domain\f[R]. Valid only on domain type FDB. \f[I]mlx5dv_dr_action_create_dest_ib_port\f[R] creates a forwarding action to an \f[B]ib_port\f[R] on the same \f[B]domain\f[R]. The valid range of ports is based on the capability phys_port_cnt_ex provided by ibv_query_device_ex, and it is possible to query the port details using mlx5dv_query_port. Action is supported only on domain type FDB. \f[I]mlx5dv_dr_action_create_dest_devx_tir\f[R] creates a terminating action delivering the packet to a TIR, defined by \f[B]devx_obj\f[R]. Valid only on domain type NIC_RX. .PP Action: Array \f[I]mlx5dv_dr_action_create_dest_array\f[R] creates an action which replicates a packet to multiple destinations. \f[B]num_dest\f[R] defines the number of replication destinations. Each \f[B]dests\f[R] destination array entry can be of a different \f[B]type\f[R]. Use type MLX5DV_DR_ACTION_DEST for direct forwarding to an action destination. Use type MLX5DV_DR_ACTION_DEST_REFORMAT when a reformat action should be performed on the packet before it is forwarded to the destination action.
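.PP
A sketch of building a two-destination array from existing destination actions (the \f[I]mlx5dv_dr_action_dest_attr\f[R] field names below are assumptions based on mlx5dv.h, and \f[I]dest_table_action\f[R]/\f[I]dest_vport_action\f[R] are hypothetical, previously created destination actions):
.IP
.nf
\f[C]
struct mlx5dv_dr_action_dest_attr dest0 = {
    .type = MLX5DV_DR_ACTION_DEST,
    .dest = dest_table_action, /* e.g. from ..._create_dest_table() */
};
struct mlx5dv_dr_action_dest_attr dest1 = {
    .type = MLX5DV_DR_ACTION_DEST,
    .dest = dest_vport_action, /* e.g. from ..._create_dest_vport() */
};
struct mlx5dv_dr_action_dest_attr *dests[] = { &dest0, &dest1 };

struct mlx5dv_dr_action *arr =
    mlx5dv_dr_action_create_dest_array(dmn, 2, dests);
\f[R]
.fi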
.PP Action: Packet Reformat \f[I]mlx5dv_dr_action_create_packet_reformat\f[R] creates a packet reformat context and action in the \f[B]domain\f[R]. The \f[B]reformat_type\f[R], \f[B]data_sz\f[R] and \f[B]data\f[R] are defined in \f[I]man mlx5dv_create_flow_action_packet_reformat\f[R]. .PP Action: Modify Header \f[I]mlx5dv_dr_action_create_modify_header\f[R] creates a modify header context and action in the \f[B]domain\f[R]. The \f[B]actions_sz\f[R] and \f[B]actions\f[R] are defined in \f[I]man mlx5dv_create_flow_action_modify_header\f[R]. .PP Action: Flow Count \f[I]mlx5dv_dr_action_create_flow_counter\f[R] creates a flow counter action from a DEVX flow counter object, based on \f[B]devx_obj\f[R] and a specific counter index from \f[B]offset\f[R] in the counter bulk. .PP Action: ASO \f[I]mlx5dv_dr_action_create_aso\f[R] receives a \f[B]domain\f[R] pointer and creates an ASO action from the DEVX ASO object, based on \f[B]devx_obj\f[R]. Use \f[B]offset\f[R] to select the specific ASO object in the \f[B]devx_obj\f[R] bulk. DR rules using this action can optionally update the ASO object value according to \f[B]flags\f[R] to choose the specific wanted behavior of this object. After a packet hits the rule with the ASO object the value of the ASO object will be copied into the chosen \f[B]return_reg_c\f[R] which can be used for matching in following DR rules. .PP \f[I]mlx5dv_dr_action_modify_aso\f[R] modifies ASO action \f[B]action\f[R] with new values for \f[B]offset\f[R], \f[B]return_reg_c\f[R] and \f[B]flags\f[R]. Only new DR rules using this \f[B]action\f[R] will use the modified values. Existing DR rules do not change the HW action values stored. .PP \f[B]flags\f[R] can be set to one of the types of \f[I]mlx5dv_dr_action_aso_first_hit_flags\f[R] or \f[I]mlx5dv_dr_action_aso_flow_meter_flags\f[R] or \f[I]mlx5dv_dr_action_aso_ct_flags\f[R]: \f[B]MLX5DV_DR_ACTION_ASO_FIRST_HIT_FLAGS_SET\f[R]: is used to set the ASO first hit object context, else the context is only copied to the return_reg_c. \f[B]MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_RED\f[R]: is used to indicate to update the initial color in ASO flow meter object value to red. \f[B]MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_YELLOW\f[R]: is used to indicate to update the initial color in ASO flow meter object value to yellow. \f[B]MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_GREEN\f[R]: is used to indicate to update the initial color in ASO flow meter object value to green. \f[B]MLX5DV_DR_ACTION_FLAGS_ASO_FLOW_METER_UNDEFINED\f[R]: is used to indicate to update the initial color in ASO flow meter object value to undefined. \f[B]MLX5DV_DR_ACTION_FLAGS_ASO_CT_DIRECTION_INITIATOR\f[R]: is used to indicate the TCP connection direction the SYN packet was sent on. \f[B]MLX5DV_DR_ACTION_FLAGS_ASO_CT_DIRECTION_RESPONDER\f[R]: is used to indicate the TCP connection direction the SYN-ACK packet was sent on. .PP Action: Meter \f[I]mlx5dv_dr_action_create_flow_meter\f[R] creates a meter action based on the flow meter parameters. The parameters are according to the device specification. \f[I]mlx5dv_dr_action_modify_flow_meter\f[R] modifies an existing flow meter \f[B]action\f[R] based on \f[B]modify_field_select\f[R]. \f[B]modify_field_select\f[R] is according to the device specification. .PP Action: Sampler \f[I]mlx5dv_dr_action_create_flow_sampler\f[R] creates a sampler action, allowing us to duplicate and sample a portion of traffic.
Packets steered to the sampler action will be sampled with an approximate probability of 1/sample_ratio provided in \f[B]attr\f[R], and the sample_actions provided in \f[B]attr\f[R] will be executed over them. All original packets will be steered to default_next_table in \f[B]attr\f[R]. A modify header format SET_ACTION data can be provided in action of \f[B]attr\f[R], which can be executed on packets before going to the default flow table. On some devices, this is required in order to set a register value. .PP Action Flags: action \f[B]flags\f[R] can be set to one of the types of \f[I]enum mlx5dv_dr_action_flags\f[R]: .PP Action: Pop Vlan \f[I]mlx5dv_dr_action_create_pop_vlan\f[R] creates a pop vlan action which removes VLAN tags from the packet's layer 2. .PP Action: Push Vlan \f[I]mlx5dv_dr_action_create_push_vlan\f[R] creates a push vlan action which adds VLAN tags to the packet's layer 2. .PP \f[B]MLX5DV_DR_ACTION_FLAGS_ROOT_LEVEL\f[R]: is used to indicate the action is targeted for a flow table in level=0 (ROOT) of the specific domain. .SS Rule .PP \f[I]mlx5dv_dr_rule_create()\f[R] creates a HW steering rule entry in \f[B]matcher\f[R]. The \f[B]value\f[R] of type \f[I]struct mlx5dv_flow_match_parameters\f[R] holds the exact attribute values of the steering rule to be matched, in a device spec format. Only the fields that were masked in the \f[I]matcher\f[R] should be filled. HW will perform the set of \f[B]num_actions\f[R] from the \f[B]action\f[R] array of type \f[I]struct mlx5dv_dr_action\f[R], once a packet matches the exact \f[B]value\f[R] of the rule (referred to as a `hit'). .PP \f[I]mlx5dv_dr_rule_destroy()\f[R] destroys the rule. .SS Other .PP \f[I]mlx5dv_dr_aso_other_domain_link()\f[R] links the ASO devx object, \f[B]devx_obj\f[R], to a domain \f[B]dmn\f[R]; this allows creating a rule with an ASO action using the given object on the linked domain \f[B]dmn\f[R]. \f[B]peer_dmn\f[R] is the domain that the ASO devx object was created on. \f[B]dmn\f[R] is the domain that the ASO devx object will be linked to. \f[B]flags\f[R] chooses the specific wanted behavior of this object, same as for the ASO action creation flags. \f[B]regc_index\f[R]: after a packet hits the rule with the ASO object, the value of the ASO object will be copied into the regc register indicated by this parameter, and the value can then be used for matching in the following DR rules. .PP \f[I]mlx5dv_dr_aso_other_domain_unlink()\f[R] will unlink the \f[B]devx_obj\f[R] from the linked \f[B]dmn\f[R]. \f[B]dmn\f[R] is the domain that the ASO devx object is linked to. .SH RETURN VALUE .PP The create API calls will return a pointer to the relevant object: table, matcher, action, rule. On failure, NULL will be returned and errno will be set. .PP The destroy API calls will return 0 on success, or the value of errno on failure (which indicates the failure reason). .SH LIMITATIONS .PP An application can verify whether a feature is supported by \f[I]trial and error\f[R]. No capabilities are exposed, as the combination of all the exposed options is far too large to define. .PP Tables are sizeless by definition. They are expected to grow and shrink to accommodate all rules, according to driver capabilities. Once a limit is reached, an error is returned. .PP Matchers of the same priority, in the same table, have an undefined order. .PP A rule with a value pattern identical to another rule on a given matcher is rejected. .PP The IP version in the matcher mask and rule should be equal and set to 4, 6 or 0.
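.PP For illustration only, a minimal sketch of creating and destroying a rule with a single drop action; \f[B]matcher\f[R] and \f[B]value\f[R] are hypothetical variables assumed to have been prepared as described in the Matcher and Rule sections, and error handling is shortened: .IP .nf \f[C] struct mlx5dv_dr_action *actions[1]; struct mlx5dv_dr_rule *rule; actions[0] = mlx5dv_dr_action_create_drop(); rule = mlx5dv_dr_rule_create(matcher, value, 1, actions); if (!rule) { /* NULL: errno indicates the failure reason */ } /* ... packets hitting the matched value are dropped ... */ mlx5dv_dr_rule_destroy(rule); mlx5dv_dr_action_destroy(actions[0]); /* only after dependent rules */ \f[R] .fi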
.SH SEE ALSO .PP \f[B]mlx5dv_open_device(3)\f[R], \f[B]mlx5dv_create_flow_action_packet_reformat(3)\f[R], \f[B]mlx5dv_create_flow_action_modify_header(3)\f[R]. .SH AUTHOR .PP Alex Rosenbaum Alex Vesker rdma-core-56.1/buildlib/pandoc-prebuilt/8088d28d309bb0af8cb14f7e3953d54c8e4392600000644000175100002000000002100514773456421033317 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "PERFQUERY" 8 "2017-08-21" "" "Open IB Diagnostics" .SH NAME perfquery \- query InfiniBand port counters on a single port .SH SYNOPSIS .sp perfquery [options] [ [[port(s)] [reset_mask]]] .SH DESCRIPTION .sp perfquery uses PerfMgt GMPs to obtain the PortCounters (basic performance and error counters), PortExtendedCounters, PortXmitDataSL, PortRcvDataSL, PortRcvErrorDetails, PortXmitDiscardDetails, PortExtendedSpeedsCounters, or PortSamplesControl from the PMA at the node/port specified. Optionally it shows aggregated counters for all ports of a node. Finally, it can reset the counters after reading them, or just reset them. .sp Note: In PortCounters, PortCountersExtended, PortXmitDataSL, and PortRcvDataSL, components that represent Data (e.g. PortXmitData and PortRcvData) indicate octets divided by 4 rather than just octets. .sp Note: Inputting a port of 255 indicates that the operation is to be performed on all ports. .sp Note: For PortCounters, ExtendedCounters, and resets, multiple ports can be specified by either a comma separated list or a port range. See examples below. .SH OPTIONS .INDENT 0.0 .TP .B \fB\-x, \-\-extended\fP show extended port counters rather than (basic) port counters. Note that the extended port counters attribute is optional. .TP .B \fB\-X, \-\-xmtsl\fP show transmit data SL counter. This is an optional counter for QoS. .TP .B \fB\-S, \-\-rcvsl\fP show receive data SL counter. This is an optional counter for QoS. .TP .B \fB\-D, \-\-xmtdisc\fP show transmit discard details. This is an optional counter. .TP .B \fB\-E, \-\-rcverr\fP show receive error details. This is an optional counter. .TP .B \fB\-T, \-\-extended_speeds\fP show extended speeds port counters. This is an optional counter. .TP .B \fB\-\-oprcvcounters\fP show Rcv Counters per Op code. This is an optional counter. .TP .B \fB\-\-flowctlcounters\fP show flow control counters. This is an optional counter. .TP .B \fB\-\-vloppackets\fP show packets received per Op code per VL. This is an optional counter. .TP .B \fB\-\-vlopdata\fP show data received per Op code per VL. This is an optional counter. .TP .B \fB\-\-vlxmitflowctlerrors\fP show flow control update errors per VL. This is an optional counter. .TP .B \fB\-\-vlxmitcounters\fP show ticks waiting to transmit counters per VL. This is an optional counter. .TP .B \fB\-\-swportvlcong\fP show sw port VL congestion. This is an optional counter.
.TP .B \fB\-\-rcvcc\fP show Rcv congestion control counters. This is an optional counter. .TP .B \fB\-\-slrcvfecn\fP show SL Rcv FECN counters. This is an optional counter. .TP .B \fB\-\-slrcvbecn\fP show SL Rcv BECN counters. This is an optional counter. .TP .B \fB\-\-xmitcc\fP show Xmit congestion control counters. This is an optional counter. .TP .B \fB\-\-vlxmittimecc\fP show VL Xmit Time congestion control counters. This is an optional counter. .TP .B \fB\-c, \-\-smplctl\fP show port samples control. .TP .B \fB\-a, \-\-all_ports\fP show aggregated counters for all ports of the destination lid, reset all counters for all ports, or if multiple ports are specified, aggregate the counters of the specified ports. If the destination lid does not support the AllPortSelect flag, all ports will be iterated through to emulate AllPortSelect behavior. .TP .B \fB\-l, \-\-loop_ports\fP If all ports are selected by the user (either through the \fB\-a\fP option or port 255) or multiple ports are specified, iterate through each port rather than performing an aggregate operation. .TP .B \fB\-r, \-\-reset_after_read\fP reset counters after read .TP .B \fB\-R, \-\-Reset_only\fP only reset counters .UNINDENT .SS Addressing Flags .\" Define the common option -G . .sp \fB\-G, \-\-Guid\fP The address specified is a Port GUID .\" Define the common option -L . .sp \fB\-L, \-\-Lid\fP The address specified is a LID .\" Define the common option -s . .sp \fB\-s, \-\-sm_port \fP use \(aqsmlid\(aq as the target lid for SA queries. .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Configuration flags .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests.
If a non\-numeric value (like \(aqx\(aq) is specified, then a value will be prompted for. .UNINDENT .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .SH EXAMPLES .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C perfquery # read local port performance counters perfquery 32 1 # read performance counters from lid 32, port 1 perfquery \-x 32 1 # read extended performance counters from lid 32, port 1 perfquery \-a 32 # read perf counters from lid 32, all ports perfquery \-r 32 1 # read performance counters and reset perfquery \-x \-r 32 1 # read extended performance counters and reset perfquery \-R 0x20 1 # reset performance counters of port 1 only perfquery \-x \-R 0x20 1 # reset extended performance counters of port 1 only perfquery \-R \-a 32 # reset performance counters of all ports perfquery \-R 32 2 0x0fff # reset only error counters of port 2 perfquery \-R 32 2 0xf000 # reset only non\-error counters of port 2 perfquery \-a 32 1\-10 # read performance counters from lid 32, port 1\-10, aggregate output perfquery \-l 32 1\-10 # read performance counters from lid 32, port 1\-10, output each port perfquery \-a 32 1,4,8 # read performance counters from lid 32, port 1, 4, and 8, aggregate output perfquery \-l 32 1,4,8 # read performance counters from lid 32, port 1, 4, and 8, output each port .ft P .fi .UNINDENT .UNINDENT .SH AUTHOR .INDENT 0.0 .TP .B Hal Rosenstock < \fI\%hal.rosenstock@gmail.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/03680fe180ea50ca7a257bae4e9229a77c5bee390000644000175100002000000000320114773456413033520 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "RDMA_GET_REMOTE_ECE" "3" "2020-02-02" "librdmacm" "Librdmacm Programmer\[cq]s Manual" .hy .SH NAME .PP rdma_get_remote_ece - Get remote ECE parameters as received from the peer. .SH SYNOPSIS .IP .nf \f[C] #include <rdma/rdma_cma.h> int rdma_get_remote_ece(struct rdma_cm_id *id, struct ibv_ece *ece); \f[R] .fi .SH DESCRIPTION .PP \f[B]rdma_get_remote_ece()\f[R] gets the ECE parameters as they were received from the communication peer. .PP This function is supposed to be used by the users of external QPs. The call needs to be performed before replying to the peer, and is needed to allow the passive side to know the ECE options of the other side. .PP Since the QP is an external QP and RDMA_CM doesn\[cq]t manage that QP, the peer needs to call the libibverbs API by itself. .PP The usual flow for the passive side will be: .IP \[bu] 2 ibv_create_qp() <- create data QP. .IP \[bu] 2 ece = rdma_get_remote_ece() <- get ECE options from remote peer .IP \[bu] 2 ibv_set_ece(ece) <- set local ECE options with data received from the peer. .IP \[bu] 2 ibv_modify_qp() <- enable data QP. .IP \[bu] 2 rdma_set_local_ece(ece) <- set desired ECE options after the respective libibverbs provider masked unsupported options. .IP \[bu] 2 rdma_accept()/rdma_establish()/rdma_reject_ece() .SH ARGUMENTS .TP *id RDMA communication identifier. .TP *ece ECE struct to be filled. .SH RETURN VALUE .PP \f[B]rdma_get_remote_ece()\f[R] returns 0 on success, or -1 on error.
If an error occurs, errno will be set to indicate the failure reason. .SH SEE ALSO .PP \f[B]rdma_cm\f[R](7), rdma_set_local_ece(3) .SH AUTHOR .PP Leon Romanovsky rdma-core-56.1/buildlib/pandoc-prebuilt/34cf0e59f60dd9af279902148ab5180325339afc0000644000175100002000000000150714773456412033313 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_INC_RKEY" "3" "2015-01-29" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_inc_rkey - creates a new rkey from the given one .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/verbs.h> uint32_t ibv_inc_rkey(uint32_t rkey); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_inc_rkey()\f[R] increases the 8 LSBs of \f[I]rkey\f[R] and returns the new value. .SH RETURN VALUE .PP \f[B]ibv_inc_rkey()\f[R] returns the new rkey. .SH NOTES .PP The verb generates a new rkey that is different from the previous one on its tag part but has the same index (bits 0xffffff00). A use case for this verb can be to create a new rkey from a Memory window\[cq]s rkey when binding it to a Memory region. .SH AUTHORS .PP Majd Dibbiny , Yishai Hadas rdma-core-56.1/buildlib/pandoc-prebuilt/a77221aad56d48bfd807994efb178f5c74e8586c0000644000175100002000000000321214773456413033501 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "RDMA_SET_LOCAL_ECE" "3" "2020-02-02" "librdmacm" "Librdmacm Programmer\[cq]s Manual" .hy .SH NAME .PP rdma_set_local_ece - Set local ECE parameters to be used for REQ/REP communication. .SH SYNOPSIS .IP .nf \f[C] #include <rdma/rdma_cma.h> int rdma_set_local_ece(struct rdma_cm_id *id, struct ibv_ece *ece); \f[R] .fi .SH DESCRIPTION .PP \f[B]rdma_set_local_ece()\f[R] sets the local ECE parameters. .PP This function is supposed to be used by the users of external QPs. The call needs to be performed before replying to the peer, and is needed to configure RDMA_CM with the desired ECE options. .PP Since the QP is an external QP and RDMA_CM doesn\[cq]t manage that QP, the peer needs to call the libibverbs API by itself. .PP The usual flow for the passive side will be: .IP \[bu] 2 ibv_create_qp() <- create data QP. .IP \[bu] 2 ece = ibv_query_ece() <- get ECE from the libibverbs provider. .IP \[bu] 2 rdma_set_local_ece(ece) <- set desired ECE options. .IP \[bu] 2 rdma_connect() <- send connection request .IP \[bu] 2 ece = rdma_get_remote_ece() <- get ECE options from remote peer .IP \[bu] 2 ibv_set_ece(ece) <- set local ECE options with data received from the peer. .IP \[bu] 2 ibv_modify_qp() <- enable data QP. .IP \[bu] 2 rdma_accept()/rdma_establish()/rdma_reject_ece() .SH ARGUMENTS .TP \f[I]id\f[R] RDMA communication identifier. .TP *ece ECE parameters. .SH RETURN VALUE .PP \f[B]rdma_set_local_ece()\f[R] returns 0 on success, or -1 on error. If an error occurs, errno will be set to indicate the failure reason. .SH SEE ALSO .PP \f[B]rdma_cm\f[R](7), rdma_get_remote_ece(3) .SH AUTHOR .PP Leon Romanovsky rdma-core-56.1/buildlib/pandoc-prebuilt/f1232dd8be9303cd060d99e7633aa1f42301f3570000644000175100002000000003273314773456421033276 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] .
nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "INFINIBAND-DIAGS" 8 "2017-08-21" "" "Open IB Diagnostics" .SH NAME infiniband-diags \- Diagnostics for InfiniBand Fabrics .SH DESCRIPTION .sp infiniband\-diags is a set of utilities designed to help configure, debug, and maintain infiniband fabrics. Many tools and utilities are provided, some with similar functionality. .sp The base utilities use directed route MAD\(aqs to perform their operations. They may therefore work even in unconfigured subnets. Other, higher level utilities require LID routed MAD\(aqs and, to some extent, SA/SM access. .SH THE USE OF SMPS (QP0) .sp Many of the tools in this package rely on the use of SMPs via QP0 to acquire data directly from the SMA. While this mode of operation is not technically in compliance with the InfiniBand specification, practical experience has found that this level of diagnostics is valuable when working with a fabric which is broken or only partially configured. For this reason, many of these tools may require the use of an MKey, or operation from Virtual Machines may be restricted for security reasons. .SH COMMON OPTIONS .sp Most OpenIB diagnostics take some of the following common flags. The exact list of supported flags per utility can be found in the documentation for those commands. .SS Addressing Flags .sp The \-D and \-G options have two forms: .\" Define the common option -D for Directed routes . .sp \fB\-D, \-\-Direct\fP The address specified is a directed route .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C Examples: [options] \-D [options] "0" # self port [options] \-D [options] "0,1,2,1,4" # out via port 1, then 2, ... (Note the second number in the path specified must match the port being used. This can be specified using the port selection flag \(aq\-P\(aq or the port found through the automatic selection process.) .ft P .fi .UNINDENT .UNINDENT .\" Define the common option -D for Directed routes . .sp \fB\-D, \-\-Direct \fP The address specified is a directed route .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C Examples: \-D "0" # self port \-D "0,1,2,1,4" # out via port 1, then 2, ... (Note the second number in the path specified must match the port being used. This can be specified using the port selection flag \(aq\-P\(aq or the port found through the automatic selection process.) .ft P .fi .UNINDENT .UNINDENT .\" Define the common option -G . .sp \fB\-G, \-\-Guid\fP The address specified is a Port GUID .\" Define the common option -G . .sp \fB\-\-port\-guid, \-G \fP Specify a port_guid .\" Define the common option -L . .sp \fB\-L, \-\-Lid\fP The address specified is a LID .\" Define the common option -s . .sp \fB\-s, \-\-sm_port \fP use \(aqsmlid\(aq as the target lid for SA queries. .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2.
3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Configuration flags .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -z . .INDENT 0.0 .TP .B \fB\-\-outstanding_smps, \-o \fP Specify the number of outstanding SMP\(aqs which should be issued during the scan .sp Default: 2 .UNINDENT .\" Define the common option --node-name-map . .sp \fB\-\-node\-name\-map \fP Specify a node name map. .INDENT 0.0 .INDENT 3.5 This file maps GUIDs to more user friendly names. See FILES section. .UNINDENT .UNINDENT .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH COMMON FILES .sp The following config files are common amongst many of the utilities. .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .\" Common text to describe the node name map file. . .SS NODE NAME MAP FILE FORMAT .sp The node name map is used to specify user friendly names for nodes in the output. GUIDs are used to perform the lookup. .sp This functionality is provided by the opensm\-libs package. See \fBopensm(8)\fP for the file location for your installation. 
.sp \fBGenerically:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # comment "" .ft P .fi .UNINDENT .UNINDENT .sp \fBExample:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # IB1 # Line cards 0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D" # Spines 0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" # GUID Node Name 0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D" .ft P .fi .UNINDENT .UNINDENT .\" Common text to describe the Topology file. . .SS TOPOLOGY FILE FORMAT .sp The topology file format is human readable and largely intuitive. Most identifiers are given textual names like vendor ID (vendid), device ID (devid), GUIDs of various types (sysimgguid, caguid, switchguid, etc.). PortGUIDs are shown in parentheses (). For switches, this is shown on the switchguid line. For CA and router ports, it is shown on the connectivity lines. The IB node is identified, followed by the number of ports and a quoted node GUID. On the right of this line is a comment (#) followed by the NodeDescription in quotes. If the node is a switch, this line also contains whether switch port 0 is base or enhanced, and the LID and LMC of port 0. Subsequent lines pertaining to this node show the connectivity. On the left is the port number of the current node. On the right is the peer node (node at other end of link). It is identified in quotes with nodetype followed by \- followed by NodeGUID with the port number in square brackets. Further on the right is a comment (#). What follows the comment is dependent on the node type. If it is a switch node, it is followed by the NodeDescription in quotes and the LID of the peer node. If it is a CA or router node, it is followed by the local LID and LMC and then followed by the NodeDescription in quotes and the LID of the peer node. The active link width and speed are then appended to the end of this output line.
.sp An example of this is: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # # Topology file: generated on Tue Jun 5 14:15:10 2007 # # Max of 3 hops discovered # Initiated from node 0008f10403960558 port 0008f10403960559 Non\-Chassis Nodes vendid=0x8f1 devid=0x5a06 sysimgguid=0x5442ba00003000 switchguid=0x5442ba00003080(5442ba00003080) Switch 24 "S\-005442ba00003080" # "ISR9024 Voltaire" base port 0 lid 6 lmc 0 [22] "H\-0008f10403961354"[1](8f10403961355) # "MT23108 InfiniHost Mellanox Technologies" lid 4 4xSDR [10] "S\-0008f10400410015"[1] # "SW\-6IB4 Voltaire" lid 3 4xSDR [8] "H\-0008f10403960558"[2](8f1040396055a) # "MT23108 InfiniHost Mellanox Technologies" lid 14 4xSDR [6] "S\-0008f10400410015"[3] # "SW\-6IB4 Voltaire" lid 3 4xSDR [12] "H\-0008f10403960558"[1](8f10403960559) # "MT23108 InfiniHost Mellanox Technologies" lid 10 4xSDR vendid=0x8f1 devid=0x5a05 switchguid=0x8f10400410015(8f10400410015) Switch 8 "S\-0008f10400410015" # "SW\-6IB4 Voltaire" base port 0 lid 3 lmc 0 [6] "H\-0008f10403960984"[1](8f10403960985) # "MT23108 InfiniHost Mellanox Technologies" lid 16 4xSDR [4] "H\-005442b100004900"[1](5442b100004901) # "MT23108 InfiniHost Mellanox Technologies" lid 12 4xSDR [1] "S\-005442ba00003080"[10] # "ISR9024 Voltaire" lid 6 1xSDR [3] "S\-005442ba00003080"[6] # "ISR9024 Voltaire" lid 6 4xSDR vendid=0x2c9 devid=0x5a44 caguid=0x8f10403960984 Ca 2 "H\-0008f10403960984" # "MT23108 InfiniHost Mellanox Technologies" [1](8f10403960985) "S\-0008f10400410015"[6] # lid 16 lmc 1 "SW\-6IB4 Voltaire" lid 3 4xSDR vendid=0x2c9 devid=0x5a44 caguid=0x5442b100004900 Ca 2 "H\-005442b100004900" # "MT23108 InfiniHost Mellanox Technologies" [1](5442b100004901) "S\-0008f10400410015"[4] # lid 12 lmc 1 "SW\-6IB4 Voltaire" lid 3 4xSDR vendid=0x2c9 devid=0x5a44 caguid=0x8f10403961354 Ca 2 "H\-0008f10403961354" # "MT23108 InfiniHost Mellanox Technologies" [1](8f10403961355) "S\-005442ba00003080"[22] # lid 4 lmc 1 "ISR9024 Voltaire" lid 6 4xSDR vendid=0x2c9 devid=0x5a44 caguid=0x8f10403960558 Ca 2 "H\-0008f10403960558" # "MT23108 InfiniHost Mellanox Technologies" [2](8f1040396055a) "S\-005442ba00003080"[8] # lid 14 lmc 1 "ISR9024 Voltaire" lid 6 4xSDR [1](8f10403960559) "S\-005442ba00003080"[12] # lid 10 lmc 1 "ISR9024 Voltaire" lid 6 1xSDR .ft P .fi .UNINDENT .UNINDENT .sp When grouping is used, IB nodes are organized into chassis which are numbered. Nodes which cannot be determined to be in a chassis are displayed as "Non\-Chassis Nodes". External ports are also shown on the connectivity lines. 
.SH UTILITIES LIST .SS Basic fabric connectivity .INDENT 0.0 .INDENT 3.5 See: ibnetdiscover, iblinkinfo .UNINDENT .UNINDENT .SS Node information .INDENT 0.0 .INDENT 3.5 See: ibnodes, ibswitches, ibhosts, ibrouters .UNINDENT .UNINDENT .SS Port information .INDENT 0.0 .INDENT 3.5 See: ibportstate, ibaddr .UNINDENT .UNINDENT .SS Switch Forwarding Table info .INDENT 0.0 .INDENT 3.5 See: ibtracert, ibroute, dump_lfts, dump_mfts, check_lft_balance, ibfindnodesusing .UNINDENT .UNINDENT .SS Performance counters .INDENT 0.0 .INDENT 3.5 See: ibqueryerrors, perfquery .UNINDENT .UNINDENT .SS Local HCA info .INDENT 0.0 .INDENT 3.5 See: ibstat, ibstatus .UNINDENT .UNINDENT .SS Connectivity check .INDENT 0.0 .INDENT 3.5 See: ibping, ibsysstat .UNINDENT .UNINDENT .SS Low level query tools .INDENT 0.0 .INDENT 3.5 See: smpquery, smpdump, saquery, sminfo .UNINDENT .UNINDENT .SS Fabric verification tools .INDENT 0.0 .INDENT 3.5 See: ibidsverify .UNINDENT .UNINDENT .SH BACKWARDS COMPATIBILITY SCRIPTS .sp The following scripts have been identified as redundant and/or lower performing as compared to the above scripts. They are provided as legacy scripts when \-\-enable\-compat\-utils is specified at build time. .sp ibcheckerrors, ibclearcounters, ibclearerrors, ibdatacounters ibchecknet, ibchecknode, ibcheckport, ibcheckportstate, ibcheckportwidth, ibcheckstate, ibcheckwidth, ibswportwatch, ibprintca, ibprintrt, ibprintswitch, set_nodedesc.sh .SH AUTHORS .INDENT 0.0 .TP .B Ira Weiny < \fI\%ira.weiny@intel.com\fP > .UNINDENT .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/35de5f25ab929eed324046bc74a2a953f3b8a47b0000644000175100002000000000274214773456412033531 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_ATTACH_MCAST" "3" "2006-10-31" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_attach_mcast, ibv_detach_mcast - attach and detach a queue pair (QPs) to/from a multicast group .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/verbs.h> int ibv_attach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); int ibv_detach_mcast(struct ibv_qp *qp, const union ibv_gid *gid, uint16_t lid); \f[R] .fi .SH DESCRIPTION .PP \f[B]ibv_attach_mcast()\f[R] attaches the QP \f[I]qp\f[R] to the multicast group having MGID \f[I]gid\f[R] and MLID \f[I]lid\f[R]. .PP \f[B]ibv_detach_mcast()\f[R] detaches the QP \f[I]qp\f[R] from the multicast group having MGID \f[I]gid\f[R] and MLID \f[I]lid\f[R]. .SH RETURN VALUE .PP \f[B]ibv_attach_mcast()\f[R] and \f[B]ibv_detach_mcast()\f[R] return 0 on success, or the value of errno on failure (which indicates the failure reason). .SH NOTES .PP Only QPs of Transport Service Type \f[B]IBV_QPT_UD\f[R] may be attached to multicast groups. .PP If a QP is attached to the same multicast group multiple times, the QP will still receive a single copy of a multicast message. .PP In order to receive multicast messages, a join request for the multicast group must be sent to the subnet administrator (SA), so that the fabric\[cq]s multicast routing is configured to deliver messages to the local port. .SH SEE ALSO .PP \f[B]ibv_create_qp\f[R](3) .SH AUTHOR .PP Dotan Barak rdma-core-56.1/buildlib/pandoc-prebuilt/8411e6130dd9d3fa1bb1fd321deea8883b016f5b0000644000175100002000000000242514773456413033571 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 .
.de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "RDMA_FREEADDRINFO" 3 "2025-02-03" "" "Librdmacm Programmer's Manual" .SH NAME RDMA_FREEADDRINFO \- Frees the list of rdma_addrinfo structures returned by rdma_getaddrinfo .SH SYNOPSIS .sp #include <rdma/rdma_cma.h> .sp void rdma_freeaddrinfo (struct rdma_addrinfo *res); .SH ARGUMENTS .sp res List of rdma_addrinfo structures returned by rdma_getaddrinfo. .SH DESCRIPTION .sp Frees the list of rdma_addrinfo structures returned by rdma_getaddrinfo. .SH RETURN VALUE .sp None .SH SEE ALSO .sp rdma_getaddrinfo(3) .SH AUTHOR .sp Mark Zhang <\fI\%markzhang@nvidia.com\fP> .\" Generated by docutils manpage writer. . rdma-core-56.1/buildlib/pandoc-prebuilt/6d34a4e8e6b675858f0805d291edb400791955430000644000175100002000000001707014773456414033116 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_create_qp" "3" "2018-9-1" "mlx5" "mlx5 Programmer\[cq]s Manual" .hy .SH NAME .PP mlx5dv_create_qp - creates a queue pair (QP) .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/mlx5dv.h> struct ibv_qp *mlx5dv_create_qp(struct ibv_context *context, struct ibv_qp_init_attr_ex *qp_attr, struct mlx5dv_qp_init_attr *mlx5_qp_attr) \f[R] .fi .SH DESCRIPTION .PP \f[B]mlx5dv_create_qp()\f[R] creates a queue pair (QP) with specific driver properties. .SH ARGUMENTS .PP Please see \f[I]ibv_create_qp_ex(3)\f[R] man page for \f[I]context\f[R] and \f[I]qp_attr\f[R]. .SS mlx5_qp_attr .IP .nf \f[C] struct mlx5dv_qp_init_attr { uint64_t comp_mask; uint32_t create_flags; struct mlx5dv_dc_init_attr dc_init_attr; uint64_t send_ops_flags; }; \f[R] .fi .TP \f[I]comp_mask\f[R] Bitmask specifying what fields in the structure are valid: MLX5DV_QP_INIT_ATTR_MASK_QP_CREATE_FLAGS: valid values in \f[I]create_flags\f[R] MLX5DV_QP_INIT_ATTR_MASK_DC: valid values in \f[I]dc_init_attr\f[R] MLX5DV_QP_INIT_ATTR_MASK_SEND_OPS_FLAGS: valid values in \f[I]send_ops_flags\f[R] .TP \f[I]create_flags\f[R] A bitwise OR of the various values described below. .RS .PP MLX5DV_QP_CREATE_TUNNEL_OFFLOADS: Enable offloading such as checksum and LRO for incoming tunneling traffic. .PP MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_UC: Allow receiving loopback unicast traffic. .PP MLX5DV_QP_CREATE_TIR_ALLOW_SELF_LOOPBACK_MC: Allow receiving loopback multicast traffic. .PP MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE: Disable scatter to CQE feature which is enabled by default. .PP MLX5DV_QP_CREATE_ALLOW_SCATTER_TO_CQE: Allow scatter to CQE for requester even if the qp was not configured to signal all WRs. .PP MLX5DV_QP_CREATE_PACKET_BASED_CREDIT_MODE: Set QP to work in end-to-end packet-based credit, instead of the default message-based credits (IB spec. section 9.7.7.2). .PD 0 .P .PD It is the application\[cq]s responsibility to make sure that the peer QP is configured with the same mode.
.PP MLX5DV_QP_CREATE_SIG_PIPELINING: If the flag is set, the QP is moved to SQD state upon encountering a signature error, and IBV_EVENT_SQ_DRAINED is generated to inform about the new state. The signature pipelining feature is a performance optimization, which reduces latency for read operations in the storage protocols. The feature is optional. Creating the QP fails if the kernel or device does not support the feature. In this case, an application should fallback to backward compatibility mode and handle read operations without the pipelining. See details about the signature pipelining in \f[B]mlx5dv_qp_cancel_posted_send_wrs\f[R](3). .PP MLX5DV_QP_CREATE_OOO_DP: If the flag is set, Receive WRs on the receiver side of the QP are allowed to be consumed out-of-order and the sender side of the QP is allowed to transmit messages without guaranteeing any arrival ordering on the receiver side. .IP .nf \f[C] The flag, when set, must be set both on the sender and receiver side of a QP (e.g., DCT and DCI). Setting the flag is optional and the availability of this feature should be queried by the application (See details in **mlx5dv_query_device**(3)) and there is no automatic fallback: If the flag is set while kernel or device does not support the feature, then creating the QP fails. Thus, before creating a QP with this flag set, application must query the maximal outstanding Receive WRs possible on a QP with this flag set, according to the QP type (see details in **mlx5dv_query_device**(3)) and make sure the capability is supported. > **Note** > > All the following describe the behavior and semantics of a QP > with this flag set. Completions\[aq] delivery ordering: A Receive WR posted on this QP may be consumed by any arriving message to this QP that requires Receive WR consumption. Nonetheless, the ordering in which work completions are delivered for the posted WRs, both on sender side and receiver side, remains unchanged when this flag is set (and is independent of the ordering in which the Receive WRs are consumed). The ID delivered in every work completion (wr_id) will specify which WR was completed by the delivered work completion. Data placing and operations\[aq] execution ordering: RDMA Read and RDMA Atomic operations are executed on the responder side in order, i.e., these operations are executed after all previous messages are done executing. However, the ordering of RDMA Read response packets being scattered to memory on the requestor side is not guaranteed. This means that, although the data is read after executing all previous messages, it may be scattered out-of-order on the requestor side. Ordering of write requests towards the memory on the responder side, initiated by RDMA Send, RDMA Send with Immediate, RDMA Write or RDMA Write with Immediate is not guaranteed. Good and bad practice: Since it cannot be guaranteed which RDMA Send (and/or RDMA Send with Immediate) will consume a Receive WR (and will scatter its data to the memory buffers specified in the WR) it\[aq]s not recommended to post different sizes of Receive WRs. Polling on any memory that is used by the device to scatter data, is not recommended since ordering of data placement of RDMA Send, RDMA Write and RDMA Write with Immediate is not guaranteed. Receiver, upon getting a completion for an RDMA Write with Immediate, should not rely on wr_id alone to determine to which memory data was scattered by the operation. \f[R] .fi .RE .TP \f[I]dc_init_attr\f[R] DC init attributes.
.SS \f[I]dc_init_attr\f[R] .IP .nf \f[C] struct mlx5dv_dci_streams { uint8_t log_num_concurent; uint8_t log_num_errored; }; struct mlx5dv_dc_init_attr { enum mlx5dv_dc_type dc_type; union { uint64_t dct_access_key; struct mlx5dv_dci_streams dci_streams; }; }; \f[R] .fi .TP \f[I]dc_type\f[R] MLX5DV_DCTYPE_DCT QP type: Target DC. MLX5DV_DCTYPE_DCI QP type: Initiator DC. .TP \f[I]dct_access_key\f[R] used to create a DCT QP. .TP \f[I]dci_streams\f[R] dci_streams is used to define a DCI QP with multiple concurrent streams. Valid when comp_mask includes MLX5DV_QP_INIT_ATTR_MASK_DCI_STREAMS. .RS .PP log_num_concurent Defines the number of parallel different streams that could be handled by HW. All work requests of a specific stream_id are handled in order. .PP log_num_errored Defines the number of dci error stream channels before moving the DCI to an error state. .RE .TP \f[I]send_ops_flags\f[R] A bitwise OR of the various values described below. .RS .PP MLX5DV_QP_EX_WITH_MR_INTERLEAVED: Enables the mlx5dv_wr_mr_interleaved() work request on this QP. .PP MLX5DV_QP_EX_WITH_MR_LIST: Enables the mlx5dv_wr_mr_list() work request on this QP. .PP MLX5DV_QP_EX_WITH_MKEY_CONFIGURE: Enables the mlx5dv_wr_mkey_configure() work request and the related setters on this QP. .RE .SH NOTES .PP \f[B]mlx5dv_qp_ex_from_ibv_qp_ex()\f[R] is used to get \f[I]struct mlx5dv_qp_ex\f[R] for accessing the send ops interfaces when IBV_QP_INIT_ATTR_SEND_OPS_FLAGS is used. .PP The MLX5DV_QP_CREATE_DISABLE_SCATTER_TO_CQE flag should be set in cases that IOVA doesn\[cq]t match the process\[cq] VA and the message payload size is small enough to trigger the scatter to CQE feature. .PP When device memory is used, IBV_SEND_INLINE and scatter to CQE should not be used, as the memcpy is not possible. .SH RETURN VALUE .PP \f[B]mlx5dv_create_qp()\f[R] returns a pointer to the created QP; on error NULL will be returned and errno will be set. .SH SEE ALSO .PP \f[B]ibv_query_device_ex\f[R](3), \f[B]ibv_create_qp_ex\f[R](3), \f[B]mlx5dv_query_device\f[R](3) .SH AUTHOR .PP Yonatan Cohen rdma-core-56.1/buildlib/pandoc-prebuilt/51fc73ea390e64beef0e9ea1a0243c2de4b463120000644000175100002000000000465214773456414033567 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "mlx5dv_alloc_dm" "3" "2018-9-1" "mlx5" "mlx5 Programmer\[cq]s Manual" .hy .SH NAME .PP mlx5dv_alloc_dm - allocates device memory (DM) .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/mlx5dv.h> struct ibv_dm *mlx5dv_alloc_dm(struct ibv_context *context, struct ibv_alloc_dm_attr *dm_attr, struct mlx5dv_alloc_dm_attr *mlx5_dm_attr) \f[R] .fi .SH DESCRIPTION .PP \f[B]mlx5dv_alloc_dm()\f[R] allocates device memory (DM) with specific driver properties. .SH ARGUMENTS .PP Please see \f[I]ibv_alloc_dm(3)\f[R] man page for \f[I]context\f[R] and \f[I]dm_attr\f[R]. .SS mlx5_dm_attr .IP .nf \f[C] struct mlx5dv_alloc_dm_attr { enum mlx5dv_alloc_dm_type type; uint64_t comp_mask; }; \f[R] .fi .TP \f[I]type\f[R] The device memory type the user wishes to allocate: .RS .PP MLX5DV_DM_TYPE_MEMIC Device memory of type MEMIC - On-Chip memory that can be allocated and used as a memory region for transmitting/receiving packets directly from/to the memory on the chip. .PP MLX5DV_DM_TYPE_STEERING_SW_ICM Device memory of type STEERING SW ICM - This memory is used by the device to store the packet steering tables and rules. Can be used for direct table and steering rules creation when allocated by a privileged user.
.PP MLX5DV_DM_TYPE_HEADER_MODIFY_SW_ICM Device memory of type HEADER MODIFY SW ICM - This memory is used by the device to store the packet header modification tables and rules. Can be used for direct table and header modification rules creation when allocated by a privileged user. .PP MLX5DV_DM_TYPE_HEADER_MODIFY_PATTERN_SW_ICM Device memory of type HEADER MODIFY PATTERN SW ICM - This memory is used by the device to store packet header modification patterns/templates. Can be used for direct table and header modification rules creation when allocated by a privileged user. .PP MLX5DV_DM_TYPE_ENCAP_SW_ICM Device memory of type PACKET ENCAP SW ICM - This memory is used by the device to store packet encap data. Can be used for packet encap reformat rules creation when allocated by a privileged user. .RE .TP \f[I]comp_mask\f[R] Bitmask specifying what fields in the structure are valid: Currently reserved and should be set to 0. .SH RETURN VALUE .PP \f[B]mlx5dv_alloc_dm()\f[R] returns a pointer to the created DM; on error NULL will be returned and errno will be set. .SH SEE ALSO .PP \f[B]ibv_alloc_dm\f[R](3). .SH AUTHOR .PP Ariel Levkovich rdma-core-56.1/buildlib/pandoc-prebuilt/f343716d5e324a4b411a5857268bb68666db217b0000644000175100002000000003223214773456413033150 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\"t .\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "IBV_WR API" "3" "2018-11-27" "libibverbs" "Libibverbs Programmer\[cq]s Manual" .hy .SH NAME .PP ibv_wr_abort, ibv_wr_complete, ibv_wr_start - Manage regions allowed to post work .PP ibv_wr_atomic_cmp_swp, ibv_wr_atomic_fetch_add - Post remote atomic operation work requests .PP ibv_wr_bind_mw, ibv_wr_local_inv - Post work requests for memory windows .PP ibv_wr_rdma_read, ibv_wr_rdma_write, ibv_wr_rdma_write_imm, ibv_wr_flush - Post RDMA work requests .PP ibv_wr_send, ibv_wr_send_imm, ibv_wr_send_inv - Post send work requests .PP ibv_wr_send_tso - Post segmentation offload work requests .PP ibv_wr_set_inline_data, ibv_wr_set_inline_data_list - Attach inline data to the last work request .PP ibv_wr_set_sge, ibv_wr_set_sge_list - Attach data to the last work request .PP ibv_wr_set_ud_addr - Attach UD addressing info to the last work request .PP ibv_wr_set_xrc_srqn - Attach an XRC SRQN to the last work request .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/verbs.h> void ibv_wr_abort(struct ibv_qp_ex *qp); int ibv_wr_complete(struct ibv_qp_ex *qp); void ibv_wr_start(struct ibv_qp_ex *qp); void ibv_wr_atomic_cmp_swp(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, uint64_t compare, uint64_t swap); void ibv_wr_atomic_fetch_add(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, uint64_t add); void ibv_wr_bind_mw(struct ibv_qp_ex *qp, struct ibv_mw *mw, uint32_t rkey, const struct ibv_mw_bind_info *bind_info); void ibv_wr_local_inv(struct ibv_qp_ex *qp, uint32_t invalidate_rkey); void ibv_wr_rdma_read(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr); void ibv_wr_rdma_write(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr); void ibv_wr_rdma_write_imm(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, __be32 imm_data); void ibv_wr_send(struct ibv_qp_ex *qp); void ibv_wr_send_imm(struct ibv_qp_ex *qp, __be32 imm_data); void ibv_wr_send_inv(struct ibv_qp_ex *qp, uint32_t invalidate_rkey); void ibv_wr_send_tso(struct ibv_qp_ex *qp, void *hdr, uint16_t hdr_sz, uint16_t mss); void ibv_wr_set_inline_data(struct ibv_qp_ex *qp, void *addr, size_t length); void ibv_wr_set_inline_data_list(struct ibv_qp_ex *qp,
size_t num_buf, const struct ibv_data_buf *buf_list); void ibv_wr_set_sge(struct ibv_qp_ex *qp, uint32_t lkey, uint64_t addr, uint32_t length); void ibv_wr_set_sge_list(struct ibv_qp_ex *qp, size_t num_sge, const struct ibv_sge *sg_list); void ibv_wr_set_ud_addr(struct ibv_qp_ex *qp, struct ibv_ah *ah, uint32_t remote_qpn, uint32_t remote_qkey); void ibv_wr_set_xrc_srqn(struct ibv_qp_ex *qp, uint32_t remote_srqn); void ibv_wr_flush(struct ibv_qp_ex *qp, uint32_t rkey, uint64_t remote_addr, size_t len, uint8_t type, uint8_t level); \f[R] .fi .SH DESCRIPTION .PP The verbs work request API (ibv_wr_*) allows efficient posting of work to a send queue using function calls instead of the struct based \f[I]ibv_post_send()\f[R] scheme. This approach is designed to minimize CPU branching and locking during the posting process. .PP This API is intended to be used to access additional functionality beyond what is provided by \f[I]ibv_post_send()\f[R]. .PP Batches of WRs posted with \f[I]ibv_post_send()\f[R] and batches posted with this API can interleave only if they are not posted within each other\[cq]s critical regions. (A critical region in this API is formed by \f[I]ibv_wr_start()\f[R] and \f[I]ibv_wr_complete()\f[R]/\f[I]ibv_wr_abort()\f[R].) .SH USAGE .PP To use these APIs the QP must be created using ibv_create_qp_ex() which allows setting the \f[B]IBV_QP_INIT_ATTR_SEND_OPS_FLAGS\f[R] in \f[I]comp_mask\f[R]. The \f[I]send_ops_flags\f[R] should be set to the OR of the work request types that will be posted to the QP. .PP If the QP does not support all the requested work request types then QP creation will fail. .PP Posting work requests to the QP is done within the critical region formed by \f[I]ibv_wr_start()\f[R] and \f[I]ibv_wr_complete()\f[R]/\f[I]ibv_wr_abort()\f[R] (see CONCURRENCY below). .PP Each work request is created by calling a WR builder function (see the table column WR builder below) to start creating the work request, followed by allowed/required setter functions described below. .PP The WR builder and setter combination can be called multiple times to efficiently post multiple work requests within a single critical region. .PP Each WR builder will use the \f[I]wr_id\f[R] member of \f[I]struct ibv_qp_ex\f[R] to set the value to be returned in the completion. Some operations will also use the \f[I]wr_flags\f[R] member to influence operation (see Flags below). These values should be set before invoking the WR builder function. .PP For example a simple send could be formed as follows: .IP .nf \f[C] qpx->wr_id = 1; ibv_wr_send(qpx); ibv_wr_set_sge(qpx, lkey, &data, sizeof(data)); \f[R] .fi .PP The section WORK REQUESTS describes the various WR builders and setters in detail. .PP Posting work is completed by calling \f[I]ibv_wr_complete()\f[R] or \f[I]ibv_wr_abort()\f[R]. No work is submitted to the queue until \f[I]ibv_wr_complete()\f[R] returns success. \f[I]ibv_wr_abort()\f[R] will discard all work prepared since \f[I]ibv_wr_start()\f[R]. .SH WORK REQUESTS .PP Many of the operations match the opcodes available for \f[I]ibv_post_send()\f[R]. Each operation has a WR builder function, a list of allowed setters, and a flag bit to request the operation with \f[I]send_ops_flags\f[R] in \f[I]struct ibv_qp_init_attr_ex\f[R] (see the EXAMPLE below). .PP .TS tab(@); l l l l.
T{ Operation T}@T{ WR builder T}@T{ QP Type Supported T}@T{ setters T} _ T{ ATOMIC_CMP_AND_SWP T}@T{ ibv_wr_atomic_cmp_swp() T}@T{ RC, XRC_SEND T}@T{ DATA, QP T} T{ ATOMIC_FETCH_AND_ADD T}@T{ ibv_wr_atomic_fetch_add() T}@T{ RC, XRC_SEND T}@T{ DATA, QP T} T{ BIND_MW T}@T{ ibv_wr_bind_mw() T}@T{ UC, RC, XRC_SEND T}@T{ NONE T} T{ LOCAL_INV T}@T{ ibv_wr_local_inv() T}@T{ UC, RC, XRC_SEND T}@T{ NONE T} T{ RDMA_READ T}@T{ ibv_wr_rdma_read() T}@T{ RC, XRC_SEND T}@T{ DATA, QP T} T{ RDMA_WRITE T}@T{ ibv_wr_rdma_write() T}@T{ UC, RC, XRC_SEND T}@T{ DATA, QP T} T{ FLUSH T}@T{ ibv_wr_flush() T}@T{ RC, RD, XRC_SEND T}@T{ DATA, QP T} T{ RDMA_WRITE_WITH_IMM T}@T{ ibv_wr_rdma_write_imm() T}@T{ UC, RC, XRC_SEND T}@T{ DATA, QP T} T{ SEND T}@T{ ibv_wr_send() T}@T{ UD, UC, RC, XRC_SEND, RAW_PACKET T}@T{ DATA, QP T} T{ SEND_WITH_IMM T}@T{ ibv_wr_send_imm() T}@T{ UD, UC, RC, XRC_SEND T}@T{ DATA, QP T} T{ SEND_WITH_INV T}@T{ ibv_wr_send_inv() T}@T{ UC, RC, XRC_SEND T}@T{ DATA, QP T} T{ TSO T}@T{ ibv_wr_send_tso() T}@T{ UD, RAW_PACKET T}@T{ DATA, QP T} .TE .SS Atomic operations .PP Atomic operations are only atomic so long as all writes to memory go only through the same RDMA hardware. It is not atomic with writes performed by the CPU, or by other RDMA hardware in the system. .TP \f[I]ibv_wr_atomic_cmp_swp()\f[R] If the remote 64 bit memory location specified by \f[I]rkey\f[R] and \f[I]remote_addr\f[R] equals \f[I]compare\f[R] then set it to \f[I]swap\f[R]. .TP \f[I]ibv_wr_atomic_fetch_add()\f[R] Add \f[I]add\f[R] to the 64 bit memory location specified by \f[I]rkey\f[R] and \f[I]remote_addr\f[R]. .SS Memory Windows .PP Memory window type 2 operations (See man page for ibv_alloc_mw). .TP \f[I]ibv_wr_bind_mw()\f[R] Bind a MW type 2 specified by \f[B]mw\f[R], set a new \f[B]rkey\f[R] and set its properties by \f[B]bind_info\f[R]. .TP \f[I]ibv_wr_local_inv()\f[R] Invalidate a MW type 2 which is associated with \f[B]rkey\f[R]. .SS RDMA .TP \f[I]ibv_wr_rdma_read()\f[R] Read from the remote memory location specified by \f[I]rkey\f[R] and \f[I]remote_addr\f[R]. The number of bytes to read, and the local location to store the data, is determined by the DATA buffers set after this call. .TP \f[I]ibv_wr_rdma_write()\f[R], \f[I]ibv_wr_rdma_write_imm()\f[R] Write to the remote memory location specified by \f[I]rkey\f[R] and \f[I]remote_addr\f[R]. The number of bytes to write, and the local location to get the data, is determined by the DATA buffers set after this call. .RS .PP The _imm version causes the remote side to get an IBV_WC_RECV_RDMA_WITH_IMM completion containing the 32 bits of immediate data. .RE .SS Message Send .TP \f[I]ibv_wr_send()\f[R], \f[I]ibv_wr_send_imm()\f[R] Send a message. The number of bytes to send, and the local location to get the data, is determined by the DATA buffers set after this call. .RS .PP The _imm version causes the remote side to get an IBV_WC_RECV completion containing the 32 bits of immediate data. .RE .TP \f[I]ibv_wr_send_inv()\f[R] The data transfer is the same as for \f[I]ibv_wr_send()\f[R], however the remote side will invalidate the MR specified by \f[I]invalidate_rkey\f[R] before delivering a completion. .TP \f[I]ibv_wr_send_tso()\f[R] Produce multiple SEND messages using TCP Segmentation Offload. The SGE points to a TCP Stream buffer which will be segmented into MSS size SENDs. The hdr includes the entire network headers up to and including the TCP header and is prefixed before each segment.
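.PP As an illustrative sketch only (not part of the interface definition above), a single small inline SEND could be posted with these calls as follows; \f[I]qpx\f[R] is assumed to have been obtained from \f[I]ibv_qp_to_qp_ex()\f[R] on a QP created with \f[B]IBV_QP_EX_WITH_SEND\f[R] set in \f[I]send_ops_flags\f[R], and \f[I]buf\f[R], \f[I]len\f[R] and \f[I]my_wr_id\f[R] are hypothetical: .IP .nf \f[C] ibv_wr_start(qpx); /* open the critical region */ qpx->wr_id = my_wr_id; /* returned in the completion */ qpx->wr_flags = IBV_SEND_SIGNALED; /* request a completion */ ibv_wr_send(qpx); /* WR builder */ ibv_wr_set_inline_data(qpx, buf, len); /* data is copied; buf may be reused */ if (ibv_wr_complete(qpx)) { /* non-zero: the whole batch was discarded */ } \f[R] .fi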
.SS QP Specific setters .PP Certain QP types require each post to be accompanied by additional setters; these setters are mandatory for any operation listing a QP setter in the above table. .TP \f[I]UD\f[R] QPs \f[I]ibv_wr_set_ud_addr()\f[R] must be called to set the destination address of the work. .TP \f[I]XRC_SEND\f[R] QPs \f[I]ibv_wr_set_xrc_srqn()\f[R] must be called to set the destination SRQN field. .SS DATA transfer setters .PP For work that requires transferring data, one of the following setters should be called once after the WR builder: .TP \f[I]ibv_wr_set_sge()\f[R] Transfer data to/from a single buffer given by the lkey, addr and length. This is equivalent to \f[I]ibv_wr_set_sge_list()\f[R] with a single element. .TP \f[I]ibv_wr_set_sge_list()\f[R] Transfer data to/from a list of buffers, logically concatenated together. Each buffer is specified by an element in an array of \f[I]struct ibv_sge\f[R]. .PP Inline setters will copy the send data during the setter and allow the caller to immediately re-use the buffer. This behavior is identical to the IBV_SEND_INLINE flag. Generally this copy is done in a way that optimizes SEND latency and is suitable for small messages. The provider will limit the amount of data it can support in a single operation. This limit is requested in the \f[I]max_inline_data\f[R] member of \f[I]struct ibv_qp_init_attr\f[R]. Valid only for SEND and RDMA_WRITE. .TP \f[I]ibv_wr_set_inline_data()\f[R] Copy send data from a single buffer given by the addr and length. This is equivalent to \f[I]ibv_wr_set_inline_data_list()\f[R] with a single element. .TP \f[I]ibv_wr_set_inline_data_list()\f[R] Copy send data from a list of buffers, logically concatenated together. Each buffer is specified by an element in an array of \f[I]struct ibv_data_buf\f[R]. .SS Flags .PP A bit mask of flags may be specified in \f[I]wr_flags\f[R] to control the behavior of the work request. .TP \f[B]IBV_SEND_FENCE\f[R] Do not start this work request until prior work has completed. .TP \f[B]IBV_SEND_IP_CSUM\f[R] Offload the IPv4 and TCP/UDP checksum calculation. .TP \f[B]IBV_SEND_SIGNALED\f[R] A completion will be generated in the completion queue for the operation. .TP \f[B]IBV_SEND_SOLICITED\f[R] Set the solicited bit in the RDMA packet. This informs the other side to generate a completion event upon receiving the RDMA operation. .SH CONCURRENCY .PP The provider will provide locking to ensure that \f[I]ibv_wr_start()\f[R] and \f[I]ibv_wr_complete()/abort()\f[R] form a per-QP critical section where no other threads can enter. .PP If an \f[I]ibv_td\f[R] is provided during QP creation then no locking will be performed and it is up to the caller to ensure that only one thread can be within the critical region at a time. .SH RETURN VALUE .PP Applications should use this API in a way that does not create failures. The individual APIs do not return a failure indication to avoid branching. .PP If a failure is detected during operation, for instance due to an invalid argument, then \f[I]ibv_wr_complete()\f[R] will return failure and the entire posting will be aborted.
.SH EXAMPLE .IP .nf \f[C] /* create RC QP type and specify the required send opcodes */ qp_init_attr_ex.qp_type = IBV_QPT_RC; qp_init_attr_ex.comp_mask |= IBV_QP_INIT_ATTR_SEND_OPS_FLAGS; qp_init_attr_ex.send_ops_flags |= IBV_QP_EX_WITH_RDMA_WRITE; qp_init_attr_ex.send_ops_flags |= IBV_QP_EX_WITH_RDMA_WRITE_WITH_IMM; struct ibv_qp *qp = ibv_create_qp_ex(ctx, &qp_init_attr_ex); struct ibv_qp_ex *qpx = ibv_qp_to_qp_ex(qp); ibv_wr_start(qpx); /* create 1st WRITE WR entry */ qpx->wr_id = my_wr_id_1; ibv_wr_rdma_write(qpx, rkey, remote_addr_1); ibv_wr_set_sge(qpx, lkey, local_addr_1, length_1); /* create 2nd WRITE_WITH_IMM WR entry */ qpx->wr_id = my_wr_id_2; qpx->wr_flags = IBV_SEND_SIGNALED; ibv_wr_rdma_write_imm(qpx, rkey, remote_addr_2, htonl(0x1234)); ibv_wr_set_sge(qpx, lkey, local_addr_2, length_2); /* Begin processing WRs */ ret = ibv_wr_complete(qpx); \f[R] .fi .SH SEE ALSO .PP \f[B]ibv_post_send\f[R](3), \f[B]ibv_create_qp_ex(3)\f[R]. .SH AUTHOR .PP Jason Gunthorpe Guy Levi rdma-core-56.1/buildlib/pandoc-prebuilt/c6e1c734678cc9de4f4d742d7c6471c8fa6c72e70000644000175100002000000000460514773456413033565 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1 .\" .TH "EFADV_QUERY_DEVICE" "3" "2019-04-22" "efa" "EFA Direct Verbs Manual" .hy .SH NAME .PP efadv_query_device - Query device capabilities .SH SYNOPSIS .IP .nf \f[C] #include <infiniband/efadv.h> int efadv_query_device(struct ibv_context *ibvctx, struct efadv_device_attr *attr, uint32_t inlen); \f[R] .fi .SH DESCRIPTION .PP \f[B]efadv_query_device()\f[R] queries EFA device specific attributes. .PP Compatibility is handled using the comp_mask and inlen fields. .IP .nf \f[C] struct efadv_device_attr { uint64_t comp_mask; uint32_t max_sq_wr; uint32_t max_rq_wr; uint16_t max_sq_sge; uint16_t max_rq_sge; uint16_t inline_buf_size; uint8_t reserved[2]; uint32_t device_caps; uint32_t max_rdma_size; }; \f[R] .fi .TP \f[I]inlen\f[R] In: Size of struct efadv_device_attr. .TP \f[I]comp_mask\f[R] Compatibility mask. .TP \f[I]max_sq_wr\f[R] Maximum Send Queue (SQ) Work Requests (WRs). .TP \f[I]max_rq_wr\f[R] Maximum Receive Queue (RQ) Work Requests (WRs). .TP \f[I]max_sq_sge\f[R] Maximum Send Queue (SQ) Scatter Gather Elements (SGEs). .TP \f[I]max_rq_sge\f[R] Maximum Receive Queue (RQ) Scatter Gather Elements (SGEs). .TP \f[I]inline_buf_size\f[R] Maximum inline buffer size. .TP \f[I]device_caps\f[R] Bitmask of device capabilities: .RS .PP EFADV_DEVICE_ATTR_CAPS_RDMA_READ: RDMA read is supported. .PP EFADV_DEVICE_ATTR_CAPS_RNR_RETRY: RNR retry is supported for SRD QPs. .PP EFADV_DEVICE_ATTR_CAPS_CQ_WITH_SGID: Reading the source address (SGID) from receive completion descriptors is supported. Valid only for unknown AH. .PP EFADV_DEVICE_ATTR_CAPS_RDMA_WRITE: RDMA write is supported. .PP EFADV_DEVICE_ATTR_CAPS_UNSOLICITED_WRITE_RECV: Indicates the device has support for creating QPs that can receive unsolicited RDMA write with immediate. An RQ with this feature enabled will not consume any work requests in order to receive RDMA write with immediate, and a WC generated for such a receive will be marked as unsolicited. .RE .TP \f[I]max_rdma_size\f[R] Maximum RDMA transfer size in bytes. .SH RETURN VALUE .PP \f[B]efadv_query_device()\f[R] returns 0 on success, or the value of errno on failure (which indicates the failure reason). .SH SEE ALSO .PP \f[B]efadv\f[R](7) .SH NOTES .IP \[bu] 2 Compatibility mask (comp_mask) is an out field and currently has no values.
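.PP A minimal usage sketch (illustrative only; \f[I]ctx\f[R] is assumed to be an \f[I]ibv_context\f[R] opened on an EFA device): .IP .nf \f[C] struct efadv_device_attr attr = {}; int err = efadv_query_device(ctx, &attr, sizeof(attr)); if (!err && (attr.device_caps & EFADV_DEVICE_ATTR_CAPS_RDMA_READ)) { /* RDMA read is supported, up to attr.max_rdma_size bytes */ } \f[R] .fi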
.SH AUTHORS .PP Gal Pressman rdma-core-56.1/buildlib/pandoc-prebuilt/ed4694ee7bd02a221fdec2f2234c202c4313f5520000644000175100002000000001714414773456421033425 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText. . . .nr rst2man-indent-level 0 . .de1 rstReportMargin \\$1 \\n[an-margin] level \\n[rst2man-indent-level] level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] - \\n[rst2man-indent0] \\n[rst2man-indent1] \\n[rst2man-indent2] .. .de1 INDENT .\" .rstReportMargin pre: . RS \\$1 . nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] . nr rst2man-indent-level +1 .\" .rstReportMargin post: .. .de UNINDENT . RE .\" indent \\n[an-margin] .\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] .nr rst2man-indent-level -1 .\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] .in \\n[rst2man-indent\\n[rst2man-indent-level]]u .. .TH "SMPQUERY" 8 "2017-08-21" "" "Open IB Diagnostics" .SH NAME smpquery \- query InfiniBand subnet management attributes .SH SYNOPSIS .sp smpquery [options] [op params] .SH DESCRIPTION .sp smpquery allows a basic subset of standard SMP queries including the following: node info, node description, switch info, port info. Fields are displayed in human readable format. .SH OPTIONS .sp Current supported operations (case insensitive) and their parameters: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C Nodeinfo (NI) Nodedesc (ND) Portinfo (PI) [] # default port is zero PortInfoExtended (PIE) [] Switchinfo (SI) PKeyTable (PKeys) [] SL2VLTable (SL2VL) [] VLArbitration (VLArb) [] GUIDInfo (GI) MlnxExtPortInfo (MEPI) [] # default port is zero .ft P .fi .UNINDENT .UNINDENT .INDENT 0.0 .TP .B \fB\-c, \-\-combined\fP Use Combined route address argument \fB \fP .TP .B \fB\-x, \-\-extended\fP Set SMSupportsExtendedSpeeds bit 31 in AttributeModifier (only impacts PortInfo queries). .UNINDENT .\" Define the common option -K . .INDENT 0.0 .TP .B \fB\-K, \-\-show_keys\fP show security keys (mkey, smkey, etc.) associated with the request. .UNINDENT .SS Addressing Flags .\" Define the common option -D for Directed routes . .sp \fB\-D, \-\-Direct\fP The address specified is a directed route .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C Examples: [options] \-D [options] "0" # self port [options] \-D [options] "0,1,2,1,4" # out via port 1, then 2, ... (Note the second number in the path specified must match the port being used. This can be specified using the port selection flag \(aq\-P\(aq or the port found through the automatic selection process.) .ft P .fi .UNINDENT .UNINDENT .\" Define the common option -G . .sp \fB\-G, \-\-Guid\fP The address specified is a Port GUID .\" Define the common option -L . .sp \fB\-L, \-\-Lid\fP The address specified is a LID .\" Define the common option -s . .sp \fB\-s, \-\-sm_port \fP use \(aqsmlid\(aq as the target lid for SA queries. .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). 
.UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Configuration flags .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option --node-name-map . .sp \fB\-\-node\-name\-map \fP Specify a node name map. .INDENT 0.0 .INDENT 3.5 This file maps GUIDs to more user friendly names. See FILES section. .UNINDENT .UNINDENT .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .\" Common text to describe the node name map file. . .SS NODE NAME MAP FILE FORMAT .sp The node name map is used to specify user friendly names for nodes in the output. GUIDs are used to perform the lookup. .sp This functionality is provided by the opensm\-libs package. See \fBopensm(8)\fP for the file location for your installation. 
.sp
\fBGenerically:\fP
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
# comment
<guid> "<name>"
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
\fBExample:\fP
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
# IB1
# Line cards
0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D"
0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D"
0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D"
0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D"
0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D"
# Spines
0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D"
0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D"
0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D"
0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D"
0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D"
# GUID Node Name
0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D"
0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D"
0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D"
0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D"
.ft P
.fi
.UNINDENT
.UNINDENT
.SH EXAMPLES
.INDENT 0.0
.TP
.B ::
smpquery portinfo 3 1                     # portinfo by lid, with port modifier
smpquery \-G switchinfo 0x2C9000100D051 1  # switchinfo by guid
smpquery \-D nodeinfo 0                    # nodeinfo by direct route
smpquery \-c nodeinfo 6 0,12               # nodeinfo by combined route
.UNINDENT
.SH SEE ALSO
.sp
smpdump (8)
.SH AUTHOR
.INDENT 0.0
.TP
.B Hal Rosenstock < \fI\%hal@mellanox.com\fP >
.UNINDENT
.\" Generated by docutils manpage writer.
.
rdma-core-56.1/buildlib/pandoc-prebuilt/f951af66c47de282e7c5fede594de5d30db0292a0000644000175100002000000000341114773456412033700 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "ibv_import_mr ibv_unimport_mr" "3" "2020-5-3" "libibverbs" "Libibverbs Programmer\[cq]s Manual"
.hy
.SH NAME
.PP
ibv_import_mr - import an MR from a given ibv_pd
.PP
ibv_unimport_mr - unimport an MR
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/verbs.h>

struct ibv_mr *ibv_import_mr(struct ibv_pd *pd, uint32_t mr_handle);
void ibv_unimport_mr(struct ibv_mr *mr)
\f[R]
.fi
.SH DESCRIPTION
.PP
\f[B]ibv_import_mr()\f[R] returns a Memory region (MR) that is associated with the given \f[I]mr_handle\f[R] in the RDMA context associated with the given \f[I]pd\f[R].
.PP
The input \f[I]mr_handle\f[R] value must be a valid kernel handle for an MR object in the associated RDMA context.
It can be obtained from the original MR by reading its ibv_mr->handle member value.
.PP
\f[B]ibv_unimport_mr()\f[R] unimports the MR.
Once use of the MR has ended, either ibv_dereg_mr() or ibv_unimport_mr() should be called.
The former goes to the kernel to destroy the object, while the latter only performs the local cleanup needed (the opposite of the import) without calling the kernel.
.PP
It is the responsibility of the application to coordinate between all ibv_context(s) that use this MR.
Once the object has been destroyed, no other process can touch it except to unimport it.
All users of the context must collaborate to ensure this.
.SH RETURN VALUE
.PP
\f[B]ibv_import_mr()\f[R] returns a pointer to the allocated MR, or NULL if the request fails.
.SH NOTES
.PP
The \f[I]addr\f[R] field in the imported MR is not applicable; a NULL value is expected.
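.SH EXAMPLE
.PP
A minimal sketch of the import flow, for illustration only; it assumes \f[I]pd\f[R] was imported earlier with \f[B]ibv_import_pd\f[R](3) and that \f[I]mr_handle\f[R] was received from the exporting process, e.g. over a Unix socket.
.IP
.nf
\f[C]
struct ibv_mr *mr = ibv_import_mr(pd, mr_handle);
if (!mr)
        return errno;

/* ... use mr->lkey / mr->rkey for data transfers ... */

/* local cleanup only, the exporting process still owns the MR */
ibv_unimport_mr(mr);
\f[R]
.fi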
.SH SEE ALSO
.PP
\f[B]ibv_reg_mr\f[R](3), \f[B]ibv_reg_dm_mr\f[R](3), \f[B]ibv_reg_mr_iova\f[R](3), \f[B]ibv_reg_mr_iova2\f[R](3), \f[B]ibv_dereg_mr\f[R](3).
.SH AUTHOR
.PP
Yishai Hadas
rdma-core-56.1/buildlib/pandoc-prebuilt/b19940b1d5493c3710fdcd421a36776aaafd3bc60000644000175100002000000000334414773456413033514 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "EFADV_QUERY_MR" "3" "2023-11-13" "efa" "EFA Direct Verbs Manual"
.hy
.SH NAME
.PP
efadv_query_mr - Query EFA specific Memory Region attributes
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/efadv.h>

int efadv_query_mr(struct ibv_mr *ibvmr,
                   struct efadv_mr_attr *attr, uint32_t inlen);
\f[R]
.fi
.SH DESCRIPTION
.PP
\f[B]efadv_query_mr()\f[R] queries device-specific Memory Region attributes.
.PP
Compatibility is handled using the comp_mask and inlen fields.
.IP
.nf
\f[C]
struct efadv_mr_attr {
        uint64_t comp_mask;
        uint16_t ic_id_validity;
        uint16_t recv_ic_id;
        uint16_t rdma_read_ic_id;
        uint16_t rdma_recv_ic_id;
};
\f[R]
.fi
.TP
\f[I]inlen\f[R]
In: Size of struct efadv_mr_attr.
.TP
\f[I]comp_mask\f[R]
Compatibility mask.
.TP
\f[I]ic_id_validity\f[R]
Validity mask of the interconnect id fields:
.RS
.PP
EFADV_MR_ATTR_VALIDITY_RECV_IC_ID: recv_ic_id has a valid value.
.PP
EFADV_MR_ATTR_VALIDITY_RDMA_READ_IC_ID: rdma_read_ic_id has a valid value.
.PP
EFADV_MR_ATTR_VALIDITY_RDMA_RECV_IC_ID: rdma_recv_ic_id has a valid value.
.RE
.TP
\f[I]recv_ic_id\f[R]
Physical interconnect used by the device to reach the MR for receive operation.
.TP
\f[I]rdma_read_ic_id\f[R]
Physical interconnect used by the device to reach the MR for RDMA read operation.
.TP
\f[I]rdma_recv_ic_id\f[R]
Physical interconnect used by the device to reach the MR for RDMA write receive.
.SH RETURN VALUE
.PP
\f[B]efadv_query_mr()\f[R] returns 0 on success, or the value of errno on failure (which indicates the failure reason).
.SH SEE ALSO
.PP
\f[B]efadv\f[R](7)
.SH NOTES
.IP \[bu] 2
Compatibility mask (comp_mask) is an out field and currently has no values.
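.SH EXAMPLE
.PP
A minimal usage sketch, for illustration only (\f[I]mr\f[R] is assumed to be a previously registered memory region; printf assumes <stdio.h>):
.IP
.nf
\f[C]
struct efadv_mr_attr attr = {};

if (!efadv_query_mr(mr, &attr, sizeof(attr)) &&
    (attr.ic_id_validity & EFADV_MR_ATTR_VALIDITY_RECV_IC_ID))
        printf(\[dq]recv interconnect id: %u\[rs]n\[dq], attr.recv_ic_id);
\f[R]
.fi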
.SH AUTHORS
.PP
Michael Margolin
rdma-core-56.1/buildlib/pandoc-prebuilt/b8fed3ca75d836aab593a3062616fb4ca61058fc0000644000175100002000000001242014773456416033602 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_sched_node[/leaf]_create / modify / destroy" "3" "2020-9-3" "mlx5" "mlx5 Programmer\[cq]s Manual"
.hy
.SH NAME
.PP
mlx5dv_sched_node_create - Creates a scheduling node element
.PP
mlx5dv_sched_leaf_create - Creates a scheduling leaf element
.PP
mlx5dv_sched_node_modify - Modifies a node scheduling element
.PP
mlx5dv_sched_leaf_modify - Modifies a leaf scheduling element
.PP
mlx5dv_sched_node_destroy - Destroys a node scheduling element
.PP
mlx5dv_sched_leaf_destroy - Destroys a leaf scheduling element
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

struct mlx5dv_sched_node *mlx5dv_sched_node_create(struct ibv_context *context,
                                                   struct mlx5dv_sched_attr *sched_attr);
struct mlx5dv_sched_leaf *mlx5dv_sched_leaf_create(struct ibv_context *context,
                                                   struct mlx5dv_sched_attr *sched_attr);
int mlx5dv_sched_node_modify(struct mlx5dv_sched_node *node,
                             struct mlx5dv_sched_attr *sched_attr);
int mlx5dv_sched_leaf_modify(struct mlx5dv_sched_leaf *leaf,
                             struct mlx5dv_sched_attr *sched_attr);
int mlx5dv_sched_node_destroy(struct mlx5dv_sched_node *node);
int mlx5dv_sched_leaf_destroy(struct mlx5dv_sched_leaf *leaf);
\f[R]
.fi
.SH DESCRIPTION
.PP
The transmit scheduling element (SE) schedules the transmission of all nodes connected to it.
By configuring the SE, QoS policies may be enforced between the competing entities (e.g. SQ, QP).
.PP
In each scheduling cycle, the SE schedules all ready-to-transmit entities.
The SE assures that the weight for each entity is met.
If an entity has reached its maximum allowed bandwidth within the scheduling cycle, it will not be scheduled until the end of the scheduling cycle.
The unused transmission bandwidth will be distributed among the remaining entities while preserving the weight setting.
.PP
The SEs are connected in a tree structure.
An entity is connected to a leaf.
One or more leaves can be connected to an SE node.
One or more SE nodes can be connected to an SE node, until reaching the SE root.
For each input on each node, the user can assign the maximum bandwidth and the scheduling weight.
.PP
The SE APIs (mlx5dv_sched_*) allow a verbs application to program the hierarchical SE tree into the device.
The ibv_qp shall be connected to a leaf.
.SH ARGUMENTS
.PP
Please see the \f[I]ibv_create_qp_ex(3)\f[R] man page for \f[I]context\f[R].
.SS mlx5dv_sched_attr
.IP
.nf
\f[C]
struct mlx5dv_sched_attr {
        struct mlx5dv_sched_node *parent;
        uint32_t flags;
        uint32_t bw_share;
        uint32_t max_avg_bw;
        uint64_t comp_mask;
};
\f[R]
.fi
.TP
\f[I]parent\f[R]
A node handle to the parent scheduling element to which this scheduling element will be connected.
The root scheduling element doesn\[cq]t have a parent.
.TP
\f[I]flags\f[R]
Specifying what attributes in the structure are valid:
.RS
.PP
MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE for \f[I]bw_share\f[R]
.PP
MLX5DV_SCHED_ELEM_ATTR_FLAGS_MAX_AVG_BW for \f[I]max_avg_bw\f[R]
.RE
.TP
\f[I]bw_share\f[R]
The relative bandwidth share allocated for this element.
This field has no units.
The bandwidth is shared between all elements connected to the same parent element, relative to their bw_share.
A value of 0 indicates the device default weight.
This field must be 0 for the root TSAR.
.TP
\f[I]max_avg_bw\f[R]
The maximal transmission rate allowed for the element, averaged over time.
The value is given in units of 1 Mbit/sec.
A value of 0x0 indicates the rate is unlimited.
This field must be 0 for the root TSAR.
.TP
\f[I]comp_mask\f[R]
Reserved for future extension, must be 0 now.
.TP
\f[I]node/leaf\f[R]
For modify, destroy: the scheduling element to work on.
.TP
\f[I]sched_attr\f[R]
For create, modify: the attribute of the scheduling element to work on.
.SH NOTES
.PP
For example, if an application wants to create 2 QoS QP groups:
.IP
.nf
\f[C]
g1: 70% bandwidth share of this application
g2: 30% bandwidth share of this application, with maximum average bandwidth limited to 4Gbps
\f[R]
.fi
.PP
Pseudo code:
.IP
.nf
\f[C]
struct mlx5dv_sched_node *root;
struct mlx5dv_sched_leaf *leaf_g1, *leaf_g2;
struct mlx5dv_sched_attr attr;
struct ibv_qp *qp1, *qp2;

/* Create root node */
attr.comp_mask = 0;
attr.parent = NULL;
attr.flags = 0;
root = mlx5dv_sched_node_create(context, &attr);

/* Create group1 */
attr.comp_mask = 0;
attr.parent = root;
attr.bw_share = 7;
attr.flags = MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE;
leaf_g1 = mlx5dv_sched_leaf_create(context, &attr);

/* Create group2 */
attr.comp_mask = 0;
attr.parent = root;
attr.bw_share = 3;
attr.max_avg_bw = 4096;
attr.flags = MLX5DV_SCHED_ELEM_ATTR_FLAGS_BW_SHARE |
             MLX5DV_SCHED_ELEM_ATTR_FLAGS_MAX_AVG_BW;
leaf_g2 = mlx5dv_sched_leaf_create(context, &attr);

foreach (qp1 in group1)
        mlx5dv_modify_qp_sched_elem(qp1, leaf_g1, NULL);
foreach (qp2 in group2)
        mlx5dv_modify_qp_sched_elem(qp2, leaf_g2, NULL);
\f[R]
.fi
.SH RETURN VALUE
.PP
Upon success \f[B]mlx5dv_sched_node[/leaf]_create()\f[R] will return a new \f[I]struct mlx5dv_sched_node[/leaf]\f[R]; on error NULL will be returned and errno will be set.
.PP
Upon success of modify and destroy, 0 is returned, or the value of errno on a failure.
.SH SEE ALSO
.PP
\f[B]ibv_create_qp_ex\f[R](3), \f[B]mlx5dv_modify_qp_sched_elem\f[R](3)
.SH AUTHOR
.PP
Mark Zhang
.PP
Ariel Almog
rdma-core-56.1/buildlib/pandoc-prebuilt/972a32a8debfec8e394c32769fd0d69e06a946ef0000644000175100002000000000323614773456413033644 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "IBV_SET_ECE" "3" "2020-01-22" "libibverbs" "Libibverbs Programmer\[cq]s Manual"
.hy
.SH NAME
.PP
ibv_set_ece - set ECE options and use them for the QP configuration stage.
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/verbs.h>

int ibv_set_ece(struct ibv_qp *qp, struct ibv_ece *ece);
\f[R]
.fi
.SH DESCRIPTION
.PP
\f[B]ibv_set_ece()\f[R] sets ECE options and uses them for the QP configuration stage.
.PP
The desired ECE options will be used during the various modify QP stages, based on the supported options in the relevant QP state.
.SH ARGUMENTS
.TP
\f[I]qp\f[R]
The queue pair (QP) associated with the ECE options.
.TP
\f[I]ece\f[R]
The requested ECE values.
This is an IN/OUT field; the accepted options will be returned in it.
.IP
.nf
\f[C]
struct ibv_ece {
        uint32_t vendor_id;
        uint32_t options;
        uint32_t comp_mask;
};
\f[R]
.fi
.TP
\f[I]vendor_id\f[R]
Unique identifier of the provider vendor on the network.
Providers set their IEEE OUI here to distinguish themselves in a non-homogeneous network.
.TP
\f[I]options\f[R]
Provider-specific attributes which are supported or need to be enabled by ECE users.
.TP
\f[I]comp_mask\f[R]
Bitmask specifying what fields in the structure are valid.
.SH RETURN VALUE
.PP
\f[B]ibv_set_ece()\f[R] returns 0 when the call was successful, or the errno value which indicates the failure reason.
.TP
\f[I]EOPNOTSUPP\f[R]
libibverbs or the provider driver doesn\[cq]t support the ibv_set_ece() verb.
.TP
\f[I]EINVAL\f[R]
In one of the following cases: the QP is invalid, or the ECE options are invalid.
.SH SEE ALSO
.PP
\f[B]ibv_query_ece\f[R](3).
.SH AUTHOR
.PP
Leon Romanovsky
rdma-core-56.1/buildlib/pandoc-prebuilt/9d130ceb33244e4b1cf9296db5629c9998caaf4a0000644000175100002000000001671014773456421033544 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText.
.
.
.nr rst2man-indent-level 0
.
.de1 rstReportMargin
\\$1 \\n[an-margin]
level \\n[rst2man-indent-level]
level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
-
\\n[rst2man-indent0]
\\n[rst2man-indent1]
\\n[rst2man-indent2]
..
.de1 INDENT
.\" .rstReportMargin pre:
. RS \\$1
. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin]
. nr rst2man-indent-level +1
.\" .rstReportMargin post:
..
.de UNINDENT
. RE
.\" indent \\n[an-margin]
.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
.nr rst2man-indent-level -1
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.TH "IBTRACERT" 8 "2018-04-02" "" "Open IB Diagnostics"
.SH NAME
ibtracert \- trace InfiniBand path
.SH SYNOPSIS
.sp
ibtracert [options] [ [ []]]
.SH DESCRIPTION
.sp
ibtracert uses SMPs to trace the path from a source GID/LID to a destination GID/LID.
Each hop along the path is displayed until the destination is reached or a hop does not respond.
By using the \-m option, multicast path tracing can be performed between source and destination nodes.
.SH OPTIONS
.INDENT 0.0
.TP
.B \fB\-n, \-\-no_info\fP
simple format; don\(aqt show additional information
.TP
.B \fB\-m\fP
show the multicast trace of the specified mlid
.TP
.B \fB\-f, \-\-force\fP
force route to destination port
.UNINDENT
.SS Addressing Flags
.\" Define the common option -G
.
.sp
\fB\-G, \-\-Guid\fP The address specified is a Port GUID
.\" Define the common option -L
.
.sp
\fB\-L, \-\-Lid\fP The address specified is a LID
.\" Define the common option -s
.
.sp
\fB\-s, \-\-sm_port \fP use \(aqsmlid\(aq as the target lid for SA queries.
.\" Define the common option --ports-file
.
.sp
\fB\-\-ports\-file \fP Specify a ports file.
.INDENT 0.0
.INDENT 3.5
This file contains multiple source and destination lid or guid pairs.
See FILES section.
.UNINDENT
.UNINDENT
.SS Port Selection flags
.\" Define the common option -C
.
.sp
\fB\-C, \-\-Ca \fP use the specified ca_name.
.\" Define the common option -P
.
.sp
\fB\-P, \-\-Port \fP use the specified ca_port.
.\" Explanation of local port selection
.
.SS Local port Selection
.sp
Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria:
.INDENT 0.0
.INDENT 3.5
.INDENT 0.0
.IP 1. 3
the first port that is ACTIVE.
.IP 2. 3
if not found, the first port that is UP (physical link up).
.UNINDENT
.sp
If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible.
.sp
For example:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
ibaddr                # use the first port (criteria #1 above)
ibaddr \-C mthca1      # pick the best port from "mthca1" only.
ibaddr \-P 2           # use the second (active/up) port from the first available IB device.
ibaddr \-C mthca0 \-P 2 # use the specified port only.
.ft P
.fi
.UNINDENT
.UNINDENT
.UNINDENT
.UNINDENT
.SS Debugging flags
.\" Define the common option -d
.
.INDENT 0.0
.TP
.B \-d
raise the IB debugging level.
May be used several times (\-ddd or \-d \-d \-d).
.UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Configuration flags .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option --node-name-map . .sp \fB\-\-node\-name\-map \fP Specify a node name map. .INDENT 0.0 .INDENT 3.5 This file maps GUIDs to more user friendly names. See FILES section. .UNINDENT .UNINDENT .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .\" Common text to describe the node name map file. . .SS NODE NAME MAP FILE FORMAT .sp The node name map is used to specify user friendly names for nodes in the output. GUIDs are used to perform the lookup. .sp This functionality is provided by the opensm\-libs package. See \fBopensm(8)\fP for the file location for your installation. .sp \fBGenerically:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # comment "" .ft P .fi .UNINDENT .UNINDENT .sp \fBExample:\fP .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C # IB1 # Line cards 0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D" 0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D" # Spines 0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" 0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D" # GUID Node Name 0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D" 0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D" .ft P .fi .UNINDENT .UNINDENT .\" Common text to describe the port file. . .SS PORTS FILE FORMAT .sp The ports file can be used to specify multiple source and destination pairs. They can be lids or guids. If guids, use the \-G option to indicate that. 
.sp
\fBGenerically:\fP
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
# comment
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
\fBExample:\fP
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
73 207
203 657
531 101

> OR <

0x0008f104003f125c 0x0008f104003f133d
0x0008f1040011ab07 0x0008f104004265c0
0x0008f104007c5510 0x0008f1040099bb08
.ft P
.fi
.UNINDENT
.UNINDENT
.SH EXAMPLES
.sp
Unicast examples
.INDENT 0.0
.TP
.B ::
ibtracert 4 16                                  # show path between lids 4 and 16
ibtracert \-n 4 16                                # same, but using simple output format
ibtracert \-G 0x8f1040396522d 0x002c9000100d051   # use guid addresses
.UNINDENT
.sp
Multicast example
.INDENT 0.0
.TP
.B ::
ibtracert \-m 0xc000 4 16   # show multicast path of mlid 0xc000 between lids 4 and 16
.UNINDENT
.SH SEE ALSO
.sp
ibroute (8)
.SH AUTHOR
.INDENT 0.0
.TP
.B Hal Rosenstock <\fI\%hal.rosenstock@gmail.com\fP>
.TP
.B Ira Weiny < \fI\%ira.weiny@intel.com\fP >
.UNINDENT
.\" Generated by docutils manpage writer.
.
rdma-core-56.1/buildlib/pandoc-prebuilt/aaadc388b36ef43bafc2846dde6cc6f8777cca480000644000175100002000000000273614773456414034134 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "manadv_set_context_attr" "3" "" "" ""
.hy
.SH NAME
.PP
manadv_set_context_attr - Set context attributes
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/manadv.h>

int manadv_set_context_attr(struct ibv_context *context,
                            enum manadv_set_ctx_attr_type attr_type,
                            void *attr);
\f[R]
.fi
.SH DESCRIPTION
.PP
manadv_set_context_attr gives the ability to set vendor-specific attributes on the RDMA context.
.SH ARGUMENTS
.TP
\f[I]context\f[R]
RDMA device context to work on.
.TP
\f[I]attr_type\f[R]
The type of the provided attribute.
.TP
\f[I]attr\f[R]
Pointer to the attribute to be set.
.SS attr_type
.IP
.nf
\f[C]
enum manadv_set_ctx_attr_type {
        /* Attribute type uint8_t */
        MANADV_SET_CTX_ATTR_BUF_ALLOCATORS = 0,
};
\f[R]
.fi
.TP
\f[I]MANADV_SET_CTX_ATTR_BUF_ALLOCATORS\f[R]
Provide an external buffer allocator.
.IP
.nf
\f[C]
struct manadv_ctx_allocators {
        void *(*alloc)(size_t size, void *priv_data);
        void (*free)(void *ptr, void *priv_data);
        void *data;
};
\f[R]
.fi
.TP
\f[I]alloc\f[R]
Function used for buffer allocation instead of libmana\[cq]s internal method.
.TP
\f[I]free\f[R]
Function used to free buffers allocated by the alloc function.
.TP
\f[I]data\f[R]
Metadata that can be used by the alloc and free functions.
.SH RETURN VALUE
.PP
Returns 0 on success, or the value of errno on failure (which indicates the failure reason).
.SH AUTHOR
.PP
Long Li
rdma-core-56.1/buildlib/pandoc-prebuilt/58748d44e47709c08982c6349ef8fc8891398ef30000644000175100002000000000456014773456415033166 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_devx_create_cmd_comp, mlx5dv_devx_destroy_cmd_comp, get_async" "3" "" "" ""
.hy
.SH NAME
.PP
mlx5dv_devx_create_cmd_comp - Create a command completion to be used for DEVX asynchronous commands.
.PP
mlx5dv_devx_destroy_cmd_comp - Destroy a devx command completion.
.PP
mlx5dv_devx_get_async_cmd_comp - Get an asynchronous command completion.
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

struct mlx5dv_devx_cmd_comp {
        int fd;
};

struct mlx5dv_devx_cmd_comp *
mlx5dv_devx_create_cmd_comp(struct ibv_context *context)

void mlx5dv_devx_destroy_cmd_comp(struct mlx5dv_devx_cmd_comp *cmd_comp)

struct mlx5dv_devx_async_cmd_hdr {
        uint64_t wr_id;
        uint8_t out_data[];
};

int mlx5dv_devx_get_async_cmd_comp(struct mlx5dv_devx_cmd_comp *cmd_comp,
                                   struct mlx5dv_devx_async_cmd_hdr *cmd_resp,
                                   size_t cmd_resp_len)
\f[R]
.fi
.SH DESCRIPTION
.PP
Create or destroy a command completion to be used for DEVX asynchronous commands.
.PP
The create verb exposes an mlx5dv_devx_cmd_comp object that can be used as part of asynchronous DEVX commands.
This lets an application issue commands without blocking, and read the response from this object once it is ready.
.PP
The response can be read by the mlx5dv_devx_get_async_cmd_comp() API; upon response, the \f[I]wr_id\f[R] that was supplied with the asynchronous command is returned and the \f[I]out_data\f[R] includes the data itself.
The application must supply a large enough buffer to match any command that was issued on the \f[I]cmd_comp\f[R]; its size is given by the input \f[I]cmd_resp_len\f[R] parameter.
.SH ARGUMENTS
.TP
\f[I]context\f[R]
RDMA device context to create the command completion on.
.TP
\f[I]cmd_comp\f[R]
The command completion object.
.TP
\f[I]cmd_resp\f[R]
The output data from the asynchronous command.
.TP
\f[I]cmd_resp_len\f[R]
The output buffer size to hold the response.
.SH RETURN VALUE
.PP
Upon success \f[I]mlx5dv_devx_create_cmd_comp\f[R] will return a new \f[I]struct mlx5dv_devx_cmd_comp\f[R] object; on error NULL will be returned and errno will be set.
.PP
Upon success \f[I]mlx5dv_devx_get_async_cmd_comp\f[R] will return 0, otherwise errno will be returned.
.SH SEE ALSO
.PP
\f[I]mlx5dv_open_device(3)\f[R], \f[I]mlx5dv_devx_obj_create(3)\f[R]
.SH AUTHOR
.PP
Yishai Hadas
rdma-core-56.1/buildlib/pandoc-prebuilt/4aefe6ec699efe9cbeab3f78569c8ed5da970a2e0000644000175100002000000000423314773456414034232 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_create_flow_action_packet_reformat" "3" "" "" ""
.hy
.SH NAME
.PP
mlx5dv_create_flow_action_packet_reformat - Flow action reformat packet for mlx5 provider
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

struct ibv_flow_action *
mlx5dv_create_flow_action_packet_reformat(struct ibv_context *ctx,
                                          size_t data_sz,
                                          void *data,
                                          enum mlx5dv_flow_action_packet_reformat_type reformat_type,
                                          enum mlx5dv_flow_table_type ft_type)
\f[R]
.fi
.SH DESCRIPTION
.PP
Create a packet reformat flow steering action.
It allows adding/removing packet headers.
.SH ARGUMENTS
.TP
\f[I]ctx\f[R]
RDMA device context to create the action on.
.TP
\f[I]data_sz\f[R]
The size of the \f[I]data\f[R] buffer.
.TP
\f[I]data\f[R]
A buffer which contains headers in case the action requires them.
.TP
\f[I]reformat_type\f[R]
The reformat type to be created.
Use enum mlx5dv_flow_action_packet_reformat_type.
.RS
.PP
MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TUNNEL_TO_L2: Decap a generic L2 tunneled packet up to inner L2.
.PP
MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L2_TUNNEL: Generic encap, \f[I]data\f[R] should contain the encapsulating headers.
.PP
MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L3_TUNNEL_TO_L2: Will do decap where the inner packet starts from L3.
\f[I]data\f[R] should be MAC or MAC + vlan (14 or 18 bytes) to be appended to the packet after the decap action.
.PP
MLX5DV_FLOW_ACTION_PACKET_REFORMAT_TYPE_L2_TO_L3_TUNNEL: Will do encap where the L2 of the original packet will not be included.
\f[I]data\f[R] should be the encapsulating header.
.RE
.TP
\f[I]ft_type\f[R]
It defines the flow table type to which the packet reformat action will be attached.
.SH RETURN VALUE
.PP
Upon success \f[I]mlx5dv_create_flow_action_packet_reformat\f[R] will return a new \f[I]struct ibv_flow_action\f[R] object; on error NULL will be returned and errno will be set.
.SH SEE ALSO
.PP
\f[I]ibv_create_flow(3)\f[R], \f[I]ibv_create_flow_action(3)\f[R]
rdma-core-56.1/buildlib/pandoc-prebuilt/2d8bf0753443ec6498bc7a90d728d901107075330000644000175100002000000000356214773456415033102 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_devx_subscribe_devx_event, mlx5dv_devx_subscribe_devx_event_fd" "3" "" "" ""
.hy
.SH NAME
.PP
mlx5dv_devx_subscribe_devx_event - Subscribe over an event channel for device events.
.PP
mlx5dv_devx_subscribe_devx_event_fd - Subscribe over an event channel for device events to signal an eventfd.
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

int mlx5dv_devx_subscribe_devx_event(struct mlx5dv_devx_event_channel *dv_event_channel,
                                     struct mlx5dv_devx_obj *obj,
                                     uint16_t events_sz,
                                     uint16_t events_num[],
                                     uint64_t cookie)

int mlx5dv_devx_subscribe_devx_event_fd(struct mlx5dv_devx_event_channel *dv_event_channel,
                                        int fd,
                                        struct mlx5dv_devx_obj *obj,
                                        uint16_t event_num)
\f[R]
.fi
.SH DESCRIPTION
.PP
Subscribe over a DEVX event channel for device events.
.SH ARGUMENTS
.TP
\f[I]dv_event_channel\f[R]
Event channel to subscribe over.
.TP
\f[I]fd\f[R]
A file descriptor that was previously opened by the eventfd() system call.
.TP
\f[I]obj\f[R]
DEVX object that \f[I]events_num\f[R] relates to; can be NULL for unaffiliated events.
.TP
\f[I]events_sz\f[R]
Size of the \f[I]events_num\f[R] buffer that holds the events to subscribe for.
.TP
\f[I]events_num\f[R]
Holds the required event numbers to subscribe for; the numbers are according to the device specification.
.TP
\f[I]cookie\f[R]
The value to be returned back when reading the event; can be used as an ID for application use.
.SH NOTES
.PP
When mlx5dv_devx_subscribe_devx_event_fd() is used, the \f[I]fd\f[R] will be signaled once an event has occurred.
.SH SEE ALSO
.PP
\f[I]mlx5dv_open_device(3)\f[R], \f[I]mlx5dv_devx_create_event_channel(3)\f[R], \f[I]mlx5dv_devx_get_event(3)\f[R]
.SH AUTHOR
.PP
Yishai Hadas
rdma-core-56.1/buildlib/pandoc-prebuilt/fed8d43c3078914b2141c0cf655d840638a9e9210000644000175100002000000000305014773456416033234 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_reserved_qpn_alloc / dealloc" "3" "2020-12-29" "mlx5" "mlx5 Programmer\[cq]s Manual"
.hy
.SH NAME
.PP
mlx5dv_reserved_qpn_alloc - Allocate a reserved QP number from the device
.PP
mlx5dv_reserved_qpn_dealloc - Release the reserved QP number
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

int mlx5dv_reserved_qpn_alloc(struct ibv_context *ctx, uint32_t *qpn);
int mlx5dv_reserved_qpn_dealloc(struct ibv_context *ctx, uint32_t qpn);
\f[R]
.fi
.SH DESCRIPTION
.PP
When working with RDMA_CM RDMA_TCP_PS + external QP support, a client node needs GUID-level unique QP numbers to comply with the CM\[cq]s timewait logic.
.PP
If a real unique QP is not allocated, a device-global QPN value is required and can be allocated via this interface.
.PP
The mlx5 DCI QP is one such example; it can connect to remote DCTs multiple times, as long as the application provides a unique QPN for each new RDMA_CM connection.
.PP
These 2 APIs provide the allocation/deallocation of a unique QP number from/to the device.
This QPN can be used as the DC QPN in RDMA_CM connection establishment, which will comply with the CM timewait kernel logic.
.SH ARGUMENTS
.TP
\f[I]ctx\f[R]
The device context to issue the action on.
.TP
\f[I]qpn\f[R]
The allocated QP number (for the alloc API), or the QP number to be deallocated (for the dealloc API).
.SH RETURN VALUE
.PP
0 on success; EOPNOTSUPP if not supported, or another errno value on other failures.
.SH AUTHOR
.PP
Mark Zhang
.PP
Alex Rosenbaum
rdma-core-56.1/buildlib/pandoc-prebuilt/561ea56de897d453a681e70cd2be7d0c0335e7840000644000175100002000000000250714773456414033404 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_create_flow_action_modify_header" "3" "" "" ""
.hy
.SH NAME
.PP
mlx5dv_create_flow_action_modify_header - Flow action modify header for mlx5 provider
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

struct ibv_flow_action *
mlx5dv_create_flow_action_modify_header(struct ibv_context *ctx,
                                        size_t actions_sz,
                                        uint64_t actions[],
                                        enum mlx5dv_flow_table_type ft_type)
\f[R]
.fi
.SH DESCRIPTION
.PP
Create a modify header flow steering action; it allows mutating a packet header.
.SH ARGUMENTS
.TP
\f[I]ctx\f[R]
RDMA device context to create the action on.
.TP
\f[I]actions_sz\f[R]
The size of the \f[I]actions\f[R] buffer in bytes.
.TP
\f[I]actions\f[R]
A buffer which contains modify actions provided in device spec format (i.e. be64).
.TP
\f[I]ft_type\f[R]
Defines the flow table type to which the modify header action will be attached.
.RS
.PP
MLX5DV_FLOW_TABLE_TYPE_NIC_RX: RX FLOW TABLE
.PP
MLX5DV_FLOW_TABLE_TYPE_NIC_TX: TX FLOW TABLE
.RE
.SH RETURN VALUE
.PP
Upon success \f[I]mlx5dv_create_flow_action_modify_header\f[R] will return a new \f[I]struct ibv_flow_action\f[R] object; on error NULL will be returned and errno will be set.
.SH SEE ALSO
.PP
\f[I]ibv_create_flow(3)\f[R], \f[I]ibv_create_flow_action(3)\f[R]
rdma-core-56.1/buildlib/pandoc-prebuilt/e4d776d0b6f839435f0db61df3122af0280416e60000644000175100002000000001265414773456413033313 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "ibv_read_counters" "3" "2018-04-02" "libibverbs" "Libibverbs Programmer\[cq]s Manual"
.hy
.SH NAME
.PP
\f[B]ibv_read_counters\f[R] - Read counter values
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/verbs.h>

int ibv_read_counters(struct ibv_counters *counters,
                      uint64_t *counters_value,
                      uint32_t ncounters,
                      uint32_t flags);
\f[R]
.fi
.SH DESCRIPTION
.PP
\f[B]ibv_read_counters\f[R]() returns the values of the chosen counters into the \f[I]counters_value\f[R] array, which can hold up to \f[I]ncounters\f[R] values.
The values are filled according to the configuration defined by the user in the \f[B]ibv_attach_counters_point_xxx\f[R] functions.
.SH ARGUMENTS
.TP
\f[I]counters\f[R]
Counters object to read.
.TP
\f[I]counters_value\f[R]
Buffer supplied by the caller to hold the read results.
.TP
\f[I]ncounters\f[R]
Number of counters to fill.
.TP
\f[I]flags\f[R]
Use enum ibv_read_counters_flags.
.SS \f[I]flags\f[R] Argument
.TP
IBV_READ_COUNTERS_ATTR_PREFER_CACHED
Prefer reading the values from the driver cache; otherwise a volatile hardware access is done, which is the default.
.SH RETURN VALUE
.PP
\f[B]ibv_read_counters\f[R]() returns 0 on success, or the value of errno on failure (which indicates the failure reason).
.SH EXAMPLE
.PP
Example: Statically attach counters to a new flow
.PP
This example demonstrates the use of counters which are attached statically with the creation of a new flow.
The counters are read from hardware periodically, and finally all resources are released.
.IP
.nf
\f[C]
/* create counters object and define its counters points */
/* create simple L2 flow with hardcoded MAC, and a count action */
/* read counters periodically, every 1sec, until loop ends */
/* assumes user prepared a RAW_PACKET QP as input */
/* only limited error checking in run time for code simplicity */

#include <errno.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <infiniband/verbs.h>

/* the below MAC should be replaced by user */
#define FLOW_SPEC_ETH_MAC_VAL { \[rs]
        .dst_mac = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05}, \[rs]
        .src_mac = { 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}, \[rs]
        .ether_type = 0, .vlan_tag = 0, }
#define FLOW_SPEC_ETH_MAC_MASK { \[rs]
        .dst_mac = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}, \[rs]
        .src_mac = { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF}, \[rs]
        .ether_type = 0, .vlan_tag = 0, }

void example_create_flow_with_counters_on_raw_qp(struct ibv_qp *qp)
{
        int idx = 0;
        int loop = 10;
        int ret;
        struct ibv_flow *flow = NULL;
        struct ibv_counters *counters = NULL;
        struct ibv_counters_init_attr init_attr = {0};
        struct ibv_counter_attach_attr attach_attr = {0};

        /* create single counters handle */
        counters = ibv_create_counters(qp->context, &init_attr);

        /* define counters points */
        attach_attr.counter_desc = IBV_COUNTER_PACKETS;
        attach_attr.index = idx++;
        ret = ibv_attach_counters_point_flow(counters, &attach_attr, NULL);
        if (ret == ENOTSUP) {
                fprintf(stderr, \[dq]Attaching IBV_COUNTER_PACKETS to flow is not \[rs]
                        supported\[dq]);
                exit(1);
        }
        attach_attr.counter_desc = IBV_COUNTER_BYTES;
        attach_attr.index = idx++;
        ret = ibv_attach_counters_point_flow(counters, &attach_attr, NULL);
        if (ret == ENOTSUP) {
                fprintf(stderr, \[dq]Attaching IBV_COUNTER_BYTES to flow is not \[rs]
                        supported\[dq]);
                exit(1);
        }

        /* define a new flow attr that includes the counters handle */
        struct raw_eth_flow_attr {
                struct ibv_flow_attr                 attr;
                struct ibv_flow_spec_eth             spec_eth;
                struct ibv_flow_spec_counter_action  spec_count;
        } flow_attr = {
                .attr = {
                        .comp_mask = 0,
                        .type = IBV_FLOW_ATTR_NORMAL,
                        .size = sizeof(flow_attr),
                        .priority = 0,
                        .num_of_specs = 2, /* ETH + COUNT */
                        .port = 1,
                        .flags = 0,
                },
                .spec_eth = {
                        .type = IBV_FLOW_SPEC_ETH,
                        .size = sizeof(struct ibv_flow_spec_eth),
                        .val = FLOW_SPEC_ETH_MAC_VAL,
                        .mask = FLOW_SPEC_ETH_MAC_MASK,
                },
                .spec_count = {
                        .type = IBV_FLOW_SPEC_ACTION_COUNT,
                        .size = sizeof(struct ibv_flow_spec_counter_action),
                        /* attach this counters handle to the newly created flow */
                        .counters = counters,
                }
        };

        /* create the flow */
        flow = ibv_create_flow(qp, &flow_attr.attr);

        /* allocate array for counters value reading */
        uint64_t *counters_value = malloc(sizeof(uint64_t) * idx);

        /* periodical read and print of flow counters */
        while (--loop) {
                sleep(1);

                /* read hardware counters values */
                ibv_read_counters(counters, counters_value, idx,
                                  IBV_READ_COUNTERS_ATTR_PREFER_CACHED);

                printf(\[dq]PACKETS = %\[dq]PRIu64\[dq], BYTES = %\[dq]PRIu64 \[dq]\[rs]n\[dq],
                       counters_value[0], counters_value[1]);
        }

        /* all done, release all */
        free(counters_value);

        /* destroy flow and detach counters */
        ibv_destroy_flow(flow);

        /* destroy counters handle */
        ibv_destroy_counters(counters);

        return;
}
\f[R]
.fi
.SH SEE ALSO
.PP
\f[B]ibv_create_counters\f[R], \f[B]ibv_destroy_counters\f[R], \f[B]ibv_attach_counters_point_flow\f[R], \f[B]ibv_create_flow\f[R]
.SH AUTHORS
.PP
Raed Salem
.PP
Alex Rosenbaum
rdma-core-56.1/buildlib/pandoc-prebuilt/2572b872850e46e1909ad82fa0ade2511de868ba0000644000175100002000000001677414773456420033373 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText.
.
.
.nr rst2man-indent-level 0
.
.de1 rstReportMargin
\\$1 \\n[an-margin]
level \\n[rst2man-indent-level]
level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
-
\\n[rst2man-indent0]
\\n[rst2man-indent1]
\\n[rst2man-indent2]
..
.de1 INDENT
.\" .rstReportMargin pre:
. RS \\$1
. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin]
. nr rst2man-indent-level +1
.\" .rstReportMargin post:
..
.de UNINDENT
. RE
.\" indent \\n[an-margin]
.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
.nr rst2man-indent-level -1
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.TH "IBROUTE" 8 "2017-08-21" "" "Open IB Diagnostics"
.SH NAME
ibroute \- query InfiniBand switch forwarding tables
.SH SYNOPSIS
.sp
ibroute [options] [ [ []]]
.SH DESCRIPTION
.sp
ibroute uses SMPs to display the forwarding tables (unicast (LinearForwardingTable or LFT) or multicast (MulticastForwardingTable or MFT)) for the specified switch LID and the optional lid (mlid) range.
The default range is all valid entries in the range 1...FDBTop.
.SH OPTIONS
.INDENT 0.0
.TP
.B \fB\-a, \-\-all\fP
show all lids in range, even invalid entries
.TP
.B \fB\-n, \-\-no_dests\fP
do not try to resolve destinations
.TP
.B \fB\-M, \-\-Multicast\fP
show multicast forwarding tables.
In this case, the range parameters specify the mlid range.
.sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Debugging flags .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -v . .INDENT 0.0 .TP .B \fB\-v, \-\-verbose\fP increase the application verbosity level. May be used several times (\-vv or \-v \-v \-v) .UNINDENT .\" Define the common option -V . .sp \fB\-V, \-\-version\fP show the version info. .SS Configuration flags .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .\" Define the common option --node-name-map . .sp \fB\-\-node\-name\-map \fP Specify a node name map. .INDENT 0.0 .INDENT 3.5 This file maps GUIDs to more user friendly names. See FILES section. .UNINDENT .UNINDENT .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .SH FILES .\" Common text for the config file . .SS CONFIG FILE .sp /usr/local/etc/infiniband\-diags/ibdiag.conf .sp A global config file is provided to set some of the common options for all tools. See supplied config file for details. .\" Common text to describe the node name map file. . .SS NODE NAME MAP FILE FORMAT .sp The node name map is used to specify user friendly names for nodes in the output. GUIDs are used to perform the lookup. .sp This functionality is provided by the opensm\-libs package. See \fBopensm(8)\fP for the file location for your installation. 
.sp
\fBGenerically:\fP
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
# comment
<guid> "<name>"
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
\fBExample:\fP
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
# IB1
# Line cards
0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D"
0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D"
0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D"
0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D"
0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D"
# Spines
0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D"
0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D"
0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D"
0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D"
0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D"
# GUID Node Name
0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D"
0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D"
0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D"
0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D"
.ft P
.fi
.UNINDENT
.UNINDENT
.SH EXAMPLES
.sp
Unicast examples
.INDENT 0.0
.TP
.B ::
ibroute 4                 # dump all lids with valid out ports of switch with lid 4
ibroute \-a 4              # same, but dump all lids, even with invalid out ports
ibroute \-n 4              # simple dump format \- no destination resolution
ibroute 4 10              # dump lids starting from 10 (up to FDBTop)
ibroute 4 0x10 0x20       # dump lid range
ibroute \-G 0x08f1040023   # resolve switch by GUID
ibroute \-D 0,1            # resolve switch by direct path
.UNINDENT
.sp
Multicast examples
.INDENT 0.0
.TP
.B ::
ibroute \-M 4                # dump all non empty mlids of switch with lid 4
ibroute \-M 4 0xc010 0xc020  # same, but with range
ibroute \-M \-n 4             # simple dump format
.UNINDENT
.SH SEE ALSO
.sp
ibtracert (8)
.SH AUTHOR
.INDENT 0.0
.TP
.B Hal Rosenstock < \fI\%halr@voltaire.com\fP >
.UNINDENT
.\" Generated by docutils manpage writer.
.
rdma-core-56.1/buildlib/pandoc-prebuilt/9f5424fb7c398a6f74bb3dd4acc250ca6e0040bc0000644000175100002000000000323214773456415033653 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_map_ah_to_qp" "3" "" "" ""
.hy
.SH NAME
.PP
mlx5dv_map_ah_to_qp - Map the destination path information in address handle (AH) to the information extracted from the qp.
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

int mlx5dv_map_ah_to_qp(struct ibv_ah *ah, uint32_t qp_num);
\f[R]
.fi
.SH DESCRIPTION
.PP
This API maps the destination path information in the address handle (\f[I]ah\f[R]) to the information extracted from the qp (e.g.\ congestion control from ECE).
.PP
This API serves as an enhancement to DC and UD QPs to achieve better performance by using per-address congestion control (CC) algorithms, enabling DC/UD QPs to use multiple CC algorithms in the same datacenter.
.PP
The mapping created by this API is implicitly destroyed when the address handle is destroyed.
It is not affected by the destruction of QP \f[I]qp_num\f[R].
.PP
A duplicate mapping to the same address handle is ignored; as this API is just a hint for the hardware, in this case it does nothing and returns success regardless of the new qp_num\[cq]s ECE.
.PP
The function must be called after ECE negotiation/preconfiguration was done by some external means.
.SH ARGUMENTS
.TP
\f[I]ah\f[R]
The target\[cq]s address handle.
.TP
\f[I]qp_num\f[R]
The initiator QP whose ECE provides the congestion control information.
.SH RETURN VALUE
.PP
Upon success, returns 0; upon failure, the value of errno is returned.
.SH SEE ALSO
.PP
\f[I]rdma_cm(7)\f[R], \f[I]rdma_get_remote_ece(3)\f[R], \f[I]ibv_query_ece(3)\f[R], \f[I]ibv_set_ece(3)\f[R]
.SH AUTHOR
.PP
Yochai Cohen
.PP
Patrisious Haddad
rdma-core-56.1/buildlib/pandoc-prebuilt/bacbdaf8fa21ec967a707b8ea7981f40298805a70000644000175100002000000001706314773456420033626 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText.
.
.
.nr rst2man-indent-level 0
.
.de1 rstReportMargin
\\$1 \\n[an-margin]
level \\n[rst2man-indent-level]
level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
-
\\n[rst2man-indent0]
\\n[rst2man-indent1]
\\n[rst2man-indent2]
..
.de1 INDENT
.\" .rstReportMargin pre:
. RS \\$1
. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin]
. nr rst2man-indent-level +1
.\" .rstReportMargin post:
..
.de UNINDENT
. RE
.\" indent \\n[an-margin]
.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
.nr rst2man-indent-level -1
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.TH "IBPORTSTATE" 8 "2013-03-26" "" "Open IB Diagnostics"
.SH NAME
IBPORTSTATE \- handle port (physical) state and link speed of an InfiniBand port
.SH SYNOPSIS
.sp
ibportstate [options] []
.SH DESCRIPTION
.sp
ibportstate allows the port state and port physical state of an IB port to be queried (in addition to link width and speed being validated relative to the peer port when the port queried is a switch port), or a switch port to be disabled, enabled, or reset.
InfiniBand HCA port state may be changed locally without the knowledge of the Subnet Manager.
It also allows the link speed/width enabled on any IB port to be adjusted.
.SH OPTIONS
.INDENT 0.0
.TP
.B \fB\fP
.INDENT 7.0
.TP
.B Supported ops: enable, disable, reset, speed, espeed, fdr10, width, query, on, off, down, arm, active, vls, mtu, lid, smlid, lmc, mkey, mkeylease, mkeyprot
(Default is query)
.UNINDENT
.sp
\fBenable, disable, and reset\fP change or reset a switch or HCA port state (You must specify the CA name and Port number when locally changing CA port state.)
.sp
\fBoff\fP change the port state to disable.
.sp
\fBon\fP change the port state to enable (only when the current state is disable).
.sp
\fBspeed and width\fP are allowed on any port
.sp
\fBspeed\fP values are the legal values for PortInfo:LinkSpeedEnabled (An error is indicated if PortInfo:LinkSpeedSupported does not support this setting)
.sp
\fBespeed\fP is allowed on any port supporting extended link speeds
.sp
\fBfdr10\fP is allowed on any port supporting fdr10 (An error is indicated if port\(aqs capability mask indicates extended link speeds are not supported or if PortInfo:LinkSpeedExtSupported does not support this setting)
.sp
\fBwidth\fP values are legal values for PortInfo:LinkWidthEnabled (An error is indicated if PortInfo:LinkWidthSupported does not support this setting) (NOTE: Speed and width changes do not take effect until the port goes through link renegotiation)
.sp
\fBquery\fP also validates port characteristics (link width, speed, espeed, and fdr10) based on the peer port.
This checking is done when the port queried is a switch port as it relies on combined routing (an initial LID route with directed routing to the peer) which can only be done on a switch.
This peer port validation feature of query op requires LID routing to be functioning in the subnet. .sp \fBmkey, mkeylease, and mkeyprot\fP are only allowed on CAs, routers, or switch port 0 (An error is generated if attempted on external switch ports). Hexadecimal and octal mkeys may be specified by prepending the key with \(aq0x\(aq or \(aq0\(aq, respectively. If a non\-numeric value (like \(aqx\(aq) is specified for the mkey, then ibportstate will prompt for a value. .UNINDENT .SS Addressing Flags .\" Define the common option -L . .sp \fB\-L, \-\-Lid\fP The address specified is a LID .\" Define the common option -G . .sp \fB\-G, \-\-Guid\fP The address specified is a Port GUID .\" Define the common option -D for Directed routes . .sp \fB\-D, \-\-Direct\fP The address specified is a directed route .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C Examples: [options] \-D [options] "0" # self port [options] \-D [options] "0,1,2,1,4" # out via port 1, then 2, ... (Note the second number in the path specified must match the port being used. This can be specified using the port selection flag \(aq\-P\(aq or the port found through the automatic selection process.) .ft P .fi .UNINDENT .UNINDENT .\" Define the common option -s . .sp \fB\-s, \-\-sm_port \fP use \(aqsmlid\(aq as the target lid for SA queries. .SS Port Selection flags .\" Define the common option -C . .sp \fB\-C, \-\-Ca \fP use the specified ca_name. .\" Define the common option -P . .sp \fB\-P, \-\-Port \fP use the specified ca_port. .\" Explanation of local port selection . .SS Local port Selection .sp Multiple port/Multiple CA support: when no IB device or port is specified (see the "local umad parameters" below), the libibumad library selects the port to use by the following criteria: .INDENT 0.0 .INDENT 3.5 .INDENT 0.0 .IP 1. 3 the first port that is ACTIVE. .IP 2. 3 if not found, the first port that is UP (physical link up). .UNINDENT .sp If a port and/or CA name is specified, the libibumad library attempts to fulfill the user request, and will fail if it is not possible. .sp For example: .INDENT 0.0 .INDENT 3.5 .sp .nf .ft C ibaddr # use the first port (criteria #1 above) ibaddr \-C mthca1 # pick the best port from "mthca1" only. ibaddr \-P 2 # use the second (active/up) port from the first available IB device. ibaddr \-C mthca0 \-P 2 # use the specified port only. .ft P .fi .UNINDENT .UNINDENT .UNINDENT .UNINDENT .SS Configuration flags .\" Define the common option -z . .sp \fB\-\-config, \-z \fP Specify alternate config file. .INDENT 0.0 .INDENT 3.5 Default: /usr/local/etc/infiniband\-diags/ibdiag.conf .UNINDENT .UNINDENT .\" Define the common option -t . .sp \fB\-t, \-\-timeout \fP override the default timeout for the solicited mads. .\" Define the common option -y . .INDENT 0.0 .TP .B \fB\-y, \-\-m_key \fP use the specified M_key for requests. If non\-numeric value (like \(aqx\(aq) is specified then a value will be prompted for. .UNINDENT .SS Debugging flags .\" Define the common option -h . .sp \fB\-h, \-\-help\fP show the usage message .\" Define the common option -d . .INDENT 0.0 .TP .B \-d raise the IB debugging level. May be used several times (\-ddd or \-d \-d \-d). .UNINDENT .\" Define the common option -e . .INDENT 0.0 .TP .B \-e show send and receive errors (timeouts and others) .UNINDENT .\" Define the common option -K . .INDENT 0.0 .TP .B \fB\-K, \-\-show_keys\fP show security keys (mkey, smkey, etc.) associated with the request. .UNINDENT .\" Define the common option -v . 
.INDENT 0.0
.TP
.B \fB\-v, \-\-verbose\fP
increase the application verbosity level. May be used several times
(\-vv or \-v \-v \-v).
.UNINDENT
.\" Define the common option -V
.
.sp
\fB\-V, \-\-version\fP show the version info.
.SH FILES
.\" Common text for the config file
.
.SS CONFIG FILE
.sp
/usr/local/etc/infiniband\-diags/ibdiag.conf
.sp
A global config file is provided to set some of the common options for
all tools. See supplied config file for details.
.SH EXAMPLES
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
ibportstate \-C qib0 \-P 1 3 1 disable  # by CA name, CA Port Number, lid, physical port number
ibportstate \-C qib0 \-P 1 3 1 enable   # by CA name, CA Port Number, lid, physical port number
ibportstate \-D 0 1                     # (query) by direct route
ibportstate 3 1 reset                   # by lid
ibportstate 3 1 speed 1                 # by lid
ibportstate 3 1 width 1                 # by lid
ibportstate \-D 0 1 lid 0x1234 arm      # by direct route
.ft P
.fi
.UNINDENT
.UNINDENT
.SH AUTHOR
.INDENT 0.0
.TP
.B Hal Rosenstock
< \fI\%hal.rosenstock@gmail.com\fP >
.UNINDENT
.\" Generated by docutils manpage writer.
.
rdma-core-56.1/buildlib/pandoc-prebuilt/59234b57ac865b4965d1158c7bbcad075f57cb700000644000175100002000000001100114773456417033372 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Man page generated from reStructuredText.
.
.
.nr rst2man-indent-level 0
.
.de1 rstReportMargin
\\$1 \\n[an-margin]
level \\n[rst2man-indent-level]
level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
-
\\n[rst2man-indent0]
\\n[rst2man-indent1]
\\n[rst2man-indent2]
..
.de1 INDENT
.\" .rstReportMargin pre:
. RS \\$1
. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin]
. nr rst2man-indent-level +1
.\" .rstReportMargin post:
..
.de UNINDENT
. RE
.\" indent \\n[an-margin]
.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
.nr rst2man-indent-level -1
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.TH "IBNODES" 8 "2012-05-14" "" "OpenIB Diagnostics"
.SH NAME
IBNODES \- show InfiniBand nodes in topology
.SH SYNOPSIS
.sp
ibnodes [options] [<topology\-file>]
.SH DESCRIPTION
.sp
ibnodes is a script which either walks the IB subnet topology or uses an
already saved topology file, and extracts the IB nodes (CAs and
switches).
.SH OPTIONS
.\" Define the common option -C
.
.sp
\fB\-C, \-\-Ca <ca_name>\fP use the specified ca_name.
.\" Define the common option -P
.
.sp
\fB\-P, \-\-Port <ca_port>\fP use the specified ca_port.
.\" Define the common option -t
.
.sp
\fB\-t, \-\-timeout <timeout_ms>\fP override the default timeout for the
solicited mads.
.\" Define the common option -h
.
.sp
\fB\-h, \-\-help\fP show the usage message
.\" Define the common option -z
.
.sp
\fB\-\-config, \-z <config>\fP Specify alternate config file.
.INDENT 0.0
.INDENT 3.5
Default: /usr/local/etc/infiniband\-diags/ibdiag.conf
.UNINDENT
.UNINDENT
.\" Explanation of local port selection
.
.SS Local port Selection
.sp
Multiple port/Multiple CA support: when no IB device or port is specified
(see the "local umad parameters" below), the libibumad library selects
the port to use by the following criteria:
.INDENT 0.0
.INDENT 3.5
.INDENT 0.0
.IP 1. 3
the first port that is ACTIVE.
.IP 2. 3
if not found, the first port that is UP (physical link up).
.UNINDENT
.sp
If a port and/or CA name is specified, the libibumad library attempts to
fulfill the user request, and will fail if it is not possible.
.sp
For example:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
ibaddr             # use the first port (criteria #1 above)
ibaddr \-C mthca1  # pick the best port from "mthca1" only.
ibaddr \-P 2       # use the second (active/up) port from the first available IB device.
ibaddr \-C mthca0 \-P 2 # use the specified port only.
.ft P
.fi
.UNINDENT
.UNINDENT
.UNINDENT
.UNINDENT
.SH FILES
.\" Common text for the config file
.
.SS CONFIG FILE
.sp
/usr/local/etc/infiniband\-diags/ibdiag.conf
.sp
A global config file is provided to set some of the common options for
all tools. See supplied config file for details.
.\" Common text to describe the node name map file.
.
.SS NODE NAME MAP FILE FORMAT
.sp
The node name map is used to specify user\-friendly names for nodes in
the output. GUIDs are used to perform the lookup.
.sp
This functionality is provided by the opensm\-libs package. See
\fBopensm(8)\fP for the file location for your installation.
.sp
\fBGenerically:\fP
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
# comment
<guid> "<name>"
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
\fBExample:\fP
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
# IB1
# Line cards
0x0008f104003f125c "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D"
0x0008f104003f125d "IB1 (Rack 11 slot 1 ) ISR9288/ISR9096 Voltaire sLB\-24D"
0x0008f104003f10d2 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D"
0x0008f104003f10d3 "IB1 (Rack 11 slot 2 ) ISR9288/ISR9096 Voltaire sLB\-24D"
0x0008f104003f10bf "IB1 (Rack 11 slot 12 ) ISR9288/ISR9096 Voltaire sLB\-24D"
# Spines
0x0008f10400400e2d "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D"
0x0008f10400400e2e "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D"
0x0008f10400400e2f "IB1 (Rack 11 spine 1 ) ISR9288 Voltaire sFB\-12D"
0x0008f10400400e31 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D"
0x0008f10400400e32 "IB1 (Rack 11 spine 2 ) ISR9288 Voltaire sFB\-12D"
# GUID   Node Name
0x0008f10400411a08 "SW1 (Rack 3) ISR9024 Voltaire 9024D"
0x0008f10400411a28 "SW2 (Rack 3) ISR9024 Voltaire 9024D"
0x0008f10400411a34 "SW3 (Rack 3) ISR9024 Voltaire 9024D"
0x0008f104004119d0 "SW4 (Rack 3) ISR9024 Voltaire 9024D"
.ft P
.fi
.UNINDENT
.UNINDENT
.SH SEE ALSO
.sp
ibnetdiscover(8)
.SH DEPENDENCIES
.sp
ibnetdiscover, ibnetdiscover format
.SH AUTHOR
.INDENT 0.0
.TP
.B Hal Rosenstock
< \fI\%halr@voltaire.com\fP >
.UNINDENT
.\" Generated by docutils manpage writer.
.
rdma-core-56.1/buildlib/pandoc-prebuilt/2a9899c3a62b0c9164f7f76f08930a7e80ea9e510000644000175100002000000000432014773456415033334 0ustar00vsts_azpcontainerdocker_azpcontainer00000000000000.\" Automatically generated by Pandoc 2.9.2.1
.\"
.TH "mlx5dv_devx_get_event" "3" "" "" ""
.hy
.SH NAME
.PP
mlx5dv_devx_get_event - Get an asynchronous event.
.SH SYNOPSIS
.IP
.nf
\f[C]
#include <infiniband/mlx5dv.h>

struct mlx5dv_devx_async_event_hdr {
        uint64_t cookie;
        uint8_t  out_data[];
};

ssize_t mlx5dv_devx_get_event(struct mlx5dv_devx_event_channel *event_channel,
                              struct mlx5dv_devx_async_event_hdr *event_data,
                              size_t event_resp_len)
\f[R]
.fi
.SH DESCRIPTION
.PP
Get a device event on the given \f[I]event_channel\f[R]. After a
successful subscription over the event channel, made by calling
mlx5dv_devx_subscribe_devx_event(), the application should use this API
to retrieve the response once an event has occurred.
.PP
Upon response, the \f[I]cookie\f[R] that was supplied at subscription
time is returned, and \f[I]out_data\f[R] holds the event data itself.
The \f[I]out_data\f[R] may be omitted if the channel was created with
the omit-data flag.
.PP
The application must supply a buffer large enough to hold the event
according to the device specification; the buffer size is given by the
input \f[I]event_resp_len\f[R] parameter.
.SH ARGUMENTS
.TP
\f[I]event_channel\f[R]
The channel to get the event over.
.TP
\f[I]event_data\f[R]
The output data from the asynchronous event.
.TP
\f[I]event_resp_len\f[R]
The output buffer size to hold the response.
.SH RETURN VALUE
.PP
Upon success, \f[I]mlx5dv_devx_get_event\f[R] will return the number of
bytes read; otherwise -1 will be returned and errno will be set.
.SH NOTES
.PP
In case the \f[I]event_channel\f[R] was created with the omit-data
flag, events having the same type may be combined per subscription and
be reported once with the matching \f[I]cookie\f[R]. In that mode of
operation, ordering is not preserved between those events and other
events on this channel.
.PP
On the other hand, when each event holds the device data, ordering is
preserved; however, events might be lost due to a lack of kernel
memory, in which case EOVERFLOW will be reported.
.SH SEE ALSO
.PP
\f[I]mlx5dv_open_device(3)\f[R],
\f[I]mlx5dv_devx_subscribe_devx_event(3)\f[R]
.SH AUTHOR
.PP
Yishai Hadas
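.SH EXAMPLE
.PP
The following is a minimal sketch, not an excerpt from rdma-core: it
assumes a channel obtained from mlx5dv_devx_create_event_channel() that
already carries an active subscription made via
mlx5dv_devx_subscribe_devx_event(). The helper name wait_and_get_event()
and the 512-byte buffer are illustrative choices only; size the buffer
according to the device specification for the subscribed event types.
.IP
.nf
\f[C]
#include <inttypes.h>
#include <poll.h>
#include <stdio.h>
#include <sys/types.h>
#include <infiniband/mlx5dv.h>

static int wait_and_get_event(struct mlx5dv_devx_event_channel *channel)
{
        /* Overlay the event header on the buffer so it is suitably
         * aligned; 512 bytes is an arbitrary illustrative size. */
        union {
                struct mlx5dv_devx_async_event_hdr hdr;
                uint8_t buf[512];
        } event;
        struct pollfd pfd = { .fd = channel->fd, .events = POLLIN };
        ssize_t bytes;

        /* Block until the kernel queues an event on the channel fd. */
        if (poll(&pfd, 1, -1) < 0)
                return -1;

        bytes = mlx5dv_devx_get_event(channel, &event.hdr, sizeof(event));
        if (bytes < 0)
                return -1; /* errno is set; EOVERFLOW means events were lost */

        /* The cookie identifies which subscription fired. */
        printf("cookie 0x%" PRIx64 ", %zd bytes\\n", event.hdr.cookie, bytes);
        return 0;
}
\f[R]
.fi
.PP
Waiting on the channel fd before reading also lets a single poll() set
multiplex several event channels.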